Linköping Studies in Science and Technology

Dissertations. No. 1702

Thermal Issues in Testing of

Advanced Systems on Chip

By

Nima Aghaee

Department of Computer and Information Science

Linköping University

SE-581 83 Linköping, Sweden

Linköping 2015

Copyright © 2015 Aghaee Ghaleshahi, Nima

ISBN 978-91-7685-949-0

ISSN 0345-7524

Printed by LiU-Tryck 2015

URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-120798

Abstract

Many cutting-edge computer and electronic products are powered by

advanced Systems-on-Chip (SoCs). Advanced SoCs combine superb

performance with a large number of functions. This is achieved by

efficient integration of a huge number of transistors. Such very large scale

integration is enabled by a core-based design paradigm as well as

deep-submicron and 3D-stacked-IC technologies. These technologies are

susceptible to reliability and testing complications caused by thermal

issues. Three crucial thermal issues related to temperature variations,

temperature gradients, and temperature cycling are addressed in this thesis.

Existing test scheduling techniques rely on temperature simulations to

generate schedules that meet thermal constraints such as overheating

prevention. The difference between the simulated temperatures and the

actual temperatures is called temperature error. This error, for past

technologies, is negligible. However, advanced SoCs experience large

errors due to large process variations. Such large errors have costly

consequences, such as overheating, and must be taken care of. This thesis

presents an adaptive approach to generate test schedules that handle such

temperature errors.

Advanced SoCs manufactured as 3D-stacked-ICs experience large

temperature gradients. Temperature gradients accelerate certain early-life

defect mechanisms. These mechanisms can be accelerated using

gradient-based, burn-in-like operations so that the defects are detected before

shipping. Moreover, temperature gradients exacerbate some delay-related

defects. In order to detect such defects, testing must be performed while

appropriate temperature gradients are enforced. Schedule-based

techniques that enforce the temperature gradients for burn-in-like

operations and delay testing are proposed in this thesis.

The last thermal issue addressed by this thesis is related to temperature

cycling. Temperature cycling test procedures are usually applied to safety-

critical systems to detect cycling-related early-life failures. Such failures

affect advanced SoCs, particularly through-silicon-via structures in 3D-

stacked-ICs. An efficient schedule-based cycling-test technique that

combines cycling acceleration with testing is proposed in this thesis. The

proposed technique fits into existing 3D testing procedures and does not

require temperature chambers. Therefore, the overall cycling acceleration

and testing cost can be drastically reduced.

All the proposed techniques have been implemented and evaluated with

extensive experiments based on ITC’02 benchmarks as well as a number

of 3D stacked ICs. Experiments show that the proposed techniques work

effectively and reduce the test costs. We have also developed a fast

temperature simulation technique based on a closed-form solution for the

temperature equations. Experiments demonstrate that the proposed

simulation technique reduces the test schedule generation time by more

than half.

Popular Science Summary

Many groundbreaking computer and electronics products are powered by advanced Systems-on-Chip (SoCs). Advanced SoCs offer outstanding performance together with a large number of functions. This is achieved through efficient integration of a large number of transistors. Such large-scale integration is made possible by a core-based design paradigm together with deep-submicron and 3D-stacked-IC technologies. These technologies are susceptible to reliability and testing complications caused by thermal problems. Three important thermal issues, concerning temperature variations, temperature gradients, and temperature cycling, are addressed in this thesis.

Existing test scheduling techniques rely on temperature simulations to generate schedules that satisfy thermal constraints. The difference between the simulated temperatures and the actual temperatures is an error. For earlier technologies, this error is negligible. Advanced SoCs, however, experience large errors due to large process variations. Such large errors have costly consequences, such as overheating, and must be dealt with.

Advanced SoCs manufactured as 3D-stacked-ICs experience large temperature gradients. Temperature gradients accelerate certain defect mechanisms while the product is new. These mechanisms can be artificially accelerated by applying gradients so that the corresponding failures are detected in time. In addition, temperature gradients aggravate certain delay-related defects. To detect such defects, the tests must be performed while suitable temperature gradients are applied.

The last thermal issue addressed in this thesis is related to temperature cycling. Temperature-cycling tests are used to detect cycling-related failures early. Such failures affect advanced SoCs, in particular through-silicon-via structures in 3D-stacked-ICs. Existing temperature-cycling test methods are too expensive for 3D-stacked-ICs, and new, cheaper techniques must therefore be developed.

This thesis proposes efficient schedule-based solutions for the thermal problems discussed above. These include thermal test and reliability problems related to process variation, temperature gradients, and temperature variations. A fast temperature simulation technique is also proposed in this thesis. Extensive experiments have demonstrated the effectiveness of the proposed techniques.

Acknowledgements

I would like to express my sincere gratitude and appreciation to my

advisors Professor Zebo Peng and Professor Petru Eles. I am thankful for

the opportunity, support, education, and training that they have provided

throughout my doctoral studies.

I would like to thank and express my appreciation to the Swedish National

Graduate School in Computer Science, CUGS, for funding and supporting

my research and studies.

I cannot forget the impact that the high-quality doctoral courses had on my

professional life by offering new perspectives. Even though I cannot name

them one by one, I must thank the professors and teachers who offered them.

Many thanks go to my friends and other employees at the embedded

systems laboratory, ESLAB, and at the department of computer and

information science, IDA, (including my colleagues, administration,

technical sections, etc.) for the pleasant and supportive work place that they

have created.

All this support would not have worked without the extraordinary support

and motivation from my parents and siblings. Thank you all!

Nima Aghaee Ghaleshahi

Linköping, September 2015

Please note that those copies of this thesis that are printed by LiU-Tryck

are in grayscale except for pages 8, 161, and 171. All the full color figures

can be found in the electronic copy.

Contents

Abstract
Popular Science Summary (Populärvetenskaplig sammanfattning)
Acknowledgements

Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Publications
1.4 Thesis Organization

Chapter 2 Preliminaries
2.1 Temperature Related Defects
2.1.1 Temperature Dependent Defects
2.1.2 Early Life Failures
2.1.3 Delay Faults
2.2 Core-Based SoC Testing
2.3 3D Stacked IC Testing
2.4 Test Scheduling
2.5 Test Power and Temperature
2.6 Temperature Simulation
2.7 Meta-Heuristic
2.7.1 Motivational Example
2.7.2 Particle Swarm Optimization

Chapter 3 Related Work
3.1 SoC Test Scheduling
3.2 3D Stacked IC Testing
3.3 Temperature-Aware Test Scheduling
3.4 Process Variation Effects on Power and Temperature
3.5 Multi-Temperature Testing
3.6 Temperature Gradients and Burn-In
3.7 Testing for Delay-Related Defects
3.8 Temperature Cycling
3.9 Test Reordering

Chapter 4 Process-Variation Aware SoC Test Scheduling Techniques
4.1 Introduction
4.2 Motivational Example
4.3 Problem Formulation
4.4 Temperature Error Model
4.5 Adaptive Test Scheduling
4.5.1 Tree Construction
4.5.2 Linear Schedule Tables
4.5.3 Sub-Tree Evaluation
4.5.4 Sub-Tree Scheduling
4.5.5 Remarks
4.6 A Fast Temperature Simulation Approach
4.7 Experimental Results
4.7.1 Fast Temperature Simulation Approach
4.7.2 Adaptive Test Scheduling Technique
4.8 Adaptive Multi-Temperature Testing
4.9 Remarks
4.10 Conclusions
4.11 Notations and Abbreviations

Chapter 5 Temperature-Gradient Based Burn-In and Test Scheduling
5.1 Introduction
5.1.1 Test for Early-Life Failures
5.1.2 Test for Delay Faults
5.2 Temperature-Gradient Based Burn-In
5.2.1 Motivation and Problem Description
5.2.2 Steady State Solution
5.2.3 Transient Solution
5.2.4 Transient-Based Heuristic
5.2.5 Remarks
5.2.6 Experimental Results
5.3 Temperature-Gradient Based Test
5.3.1 Straightforward Algorithm
5.3.2 Fast Heuristic
5.4 Temperature-Map Ordering
5.4.1 Map Ordering Technique
5.5 Conclusions

Chapter 6 Integrated Temperature-Cycling Acceleration and Test
6.1 Preliminaries
6.1.1 Circuit under Test and Test Access Mechanism
6.1.2 Thermal Model
6.1.3 Temperature Cycling Model
6.2 Motivational Examples
6.2.1 ATC Rate for a Simple Scenario
6.2.2 Optimal Cycling in a Simplified Scenario
6.2.3 Effect of the Test Application Order
6.3 Problem Formulation
6.4 Three-Phase Approach
6.5 Integrated Approach
6.5.1 Path-Graph Scheduling Algorithm
6.5.2 Length of the Power Averaging Window
6.5.3 Priorities for TAM Access
6.5.4 Node Ordering in the Test Graph
6.5.5 Remarks
6.6 Experimental Results
6.6.1 Cycling Acceleration
6.6.2 Performance of the Integrated Approach
6.7 Conclusions

Chapter 7 Conclusions and Future Work
7.1 Conclusions
7.2 Future Work

References

Chapter 1 Introduction

This thesis deals with temperature-related test issues. We focus on

manufacturing test of digital electronics that are produced by Very Large

Scale Integration (VLSI) techniques. The thermal test issues that are dealt

with in this thesis result in two categories of imperfect products being sent

to market: (1) products that are defective and (2) products that, even though

they are fully functional at the beginning, will fail during field operation

shortly after being deployed.

The test issues are considered for System-on-Chip (SoC) designs where

usually a core-based test architecture is in place. In such cases, the Test

Access Mechanism (TAM) is most often scan-based. We focus mainly on

advanced SoCs, where a fabrication technique with very small feature size

is used, usually referred to as deep submicron technology.

Reducing the feature size has been a means to integrate more functionality

within an Integrated Circuit (IC) with good operational speed, manageable

power consumption, and acceptable production cost. This trend cannot be

endlessly continued, as the feature size is getting close to the size of a

single atom. An alternative for integrating more functionality into a single

package is 3D Stacked IC (3D-SIC) technology. 3D-SIC technology can

efficiently bond multiple dies into a single package. In this thesis,

we sometimes refer to this package as an IC. This thesis focuses on

advanced SoCs that have very small feature size or are manufactured by

3D-SIC technology. These technologies are affected by temperature-

related testing and reliability issues.

This chapter continues with the motivations for this thesis. Then a

summary of contributions is given, followed by a list of the author’s

publications that contain parts of these contributions. Finally, the

organization of the thesis is explained.

1.1 Motivation

As the feature size gets smaller, some parts of a modern IC must

include a precise, small number of certain atoms¹. Having a few atoms more

or fewer than the planned number will therefore result in a significant change

in the characteristics of the circuit. The manufacturing Process Variation

(PV) for older technologies that have a relatively large feature size is

negligible. However, for an advanced SoC, new techniques are required to

address the effects of the PV that is no longer negligible. PV includes

variations in the geometry of the chips’ components and variation in the

properties of the chips’ materials. For example, the effective channel

length may vary and result in variation of the threshold voltage and sub-

threshold leakage. These variations will result in differences in several

aspects of the circuit’s performance including its leakage current which is

an important contributor to the overall power consumption. Consequently,

the chips will experience power and temperature variations [Choi07,

Nebel97].

This means that the thermal aspects of hardware testing must be revised to

prevent potential damage. An important thermal issue in the testing of

advanced SoCs has been thermal safety. Advanced SoCs suffer from

exceedingly large power densities under test, so much so that the testing

must be slowed down to allow for cooling; otherwise the IC under test will

overheat. In general, a fast testing procedure is desirable to reduce the

testing costs. But in this case, a bit of testing speed is traded off to avoid

overheating. Overheating may result in good dies failing the test, since the

die’s temperature is higher than the intended operational temperatures.

Worse still, dies may be damaged if their

temperatures exceed the safe temperature limit.

The overheating problem can be efficiently addressed by carefully

scheduling the tests. This includes leaving the appropriate amount of

cooling intervals in the schedule, just as required. This can be achieved

with the help of temperature simulation. An important assumption for

existing simulation-based techniques is that all the dies have similar

thermal behavior. Therefore, the result of temperature simulations and thus

the generated test schedules are valid for all of the dies.

¹ For example, see the number of dopant atoms: http://www.itrs.net/itwg/beyond_cmos/2008ERD_December/02_4_Architecture_SuhwanKim.pdf

Process variation renders the above assumption untrue for advanced SoCs.

The thermal behavior differs from one die to another. One die may run

warmer than another and therefore need more cooling; otherwise, it

overheats. On the other hand, a die that runs cooler can be tested faster,

saving valuable testing time thus reducing the costs. This means that

statistical approaches for temperature and PV-aware test scheduling are

required, as introduced in this thesis.
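
The effect of die-to-die differences on a schedule with cooling intervals can be illustrated with a minimal sketch. It is not the scheduling technique of this thesis: it assumes a single-node (lumped RC) thermal model, a greedy rule that idles whenever the next test slot would exceed the limit, and a hypothetical `power_scale` factor that stands in for process variation. All numeric values are made up for illustration.

```python
import math


def step(temp, power, dt, r_th, tau, t_amb):
    """Advance a first-order (lumped RC) thermal model by dt at constant power."""
    t_steady = t_amb + r_th * power  # temperature reached if power is held forever
    return t_steady + (temp - t_steady) * math.exp(-dt / tau)


def schedule_with_cooling(test_powers, power_scale, t_limit=60.0,
                          dt=1.0, r_th=2.0, tau=5.0, t_amb=25.0):
    """Greedily apply tests in order, inserting idle slots to avoid overheating.

    power_scale is a hypothetical per-die factor standing in for process
    variation (for example, higher leakage gives a value above 1).
    Returns a list of "test" and "cool" slots, each dt long.
    """
    temp, slots = t_amb, []
    for nominal_power in test_powers:
        power = nominal_power * power_scale
        if step(t_amb, power, dt, r_th, tau, t_amb) >= t_limit:
            raise ValueError("test overheats the die even when started at ambient")
        # Idle until running the test for one slot keeps the die below the limit.
        while step(temp, power, dt, r_th, tau, t_amb) > t_limit:
            temp = step(temp, 0.0, dt, r_th, tau, t_amb)
            slots.append("cool")
        temp = step(temp, power, dt, r_th, tau, t_amb)
        slots.append("test")
    return slots


tests = [30.0] * 6  # six test slots of 30 W each (made-up numbers)
cool_die = schedule_with_cooling(tests, power_scale=0.90)
warm_die = schedule_with_cooling(tests, power_scale=1.15)
print(cool_die.count("cool"), warm_die.count("cool"))
```

With these assumed parameters, the warmer die needs more idle slots than the cooler one for the same test set, so a single schedule derived from one nominal simulation is either unsafe for the warm die or unnecessarily slow for the cool one.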

Temperature also plays an important role in testing. For example, some

defects are activated only at high temperatures. This means that the

device works perfectly at low temperatures, but fails when it is too hot.

High-temperature defects are very common; therefore, many existing

techniques stress the die with high temperature while testing. These defects are

common because resistive opens in metals are common. Some resistive

open defects only manifest themselves at high temperatures since the

resistivity temperature-coefficient of the involved metals is positive. A

large number of interconnects including the crucial clock network are

made of metals.

Besides these temperature-dependent defects, there are other defects that

depend on temperature. For example, the signal delay depends on the

temperature. In an advanced SoC, an extensive clock network runs all over

the IC to assure the correct timing of the operations. Some areas in the IC

might be hot, while other areas are cold. Exacerbated by negative effects

of process variation or otherwise minor defects, this may result in some

signal paths being much slower than intended. This can result in timing

errors that occur only when certain sites have certain temperatures (usually

very different temperatures). This type of defect can only be detected

when certain temperatures are enforced on certain sites in the IC. These

temperature arrangements can be captured by a temperature map that

shows the temperatures for different sites in the IC. Some defects may need

their corresponding temperature map to be enforced while testing for them.

A temperature map also implies certain temperature gradients, that is,

temperature differences among different sites. Temperature gradients have

an effect on the detection of early-life failures. So far, we have focused on defects

that exist immediately after manufacturing. However, there are defects

that, even though they do not exist just after manufacturing, will occur

shortly after the device is put into use. Burn-in techniques that speed up the

device’s early life before testing, in order to detect certain early-life failures,

already exist. One burn-in technique is to operate the ICs in a hot environment,

usually with increased voltage. This speeds up a number of aging

mechanisms, including electromigration. Recent research has shown

that some early-life failures develop in sites that experience large

temperature gradients [Smorodin08]. The defect-related gradients can be

captured with a temperature map that is enforced on the IC using the

techniques proposed in this thesis.
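
As a small illustration of the terms used above, a temperature map can be represented as a mapping from sites to temperatures, and the gradients it implies as differences between adjacent sites. The site names, the adjacency list, and the tolerance check below are hypothetical and serve only to make the definitions concrete.

```python
def gradients(temp_map, adjacent_sites):
    """Temperature difference (first site minus second) for each adjacent pair."""
    return {(a, b): temp_map[a] - temp_map[b] for a, b in adjacent_sites}


def is_enforced(target_map, measured_map, tolerance):
    """True if every site is within `tolerance` degrees of its target."""
    return all(abs(measured_map[s] - t) <= tolerance for s, t in target_map.items())


target = {"core0": 85.0, "core1": 45.0, "core2": 60.0}  # degrees Celsius
adjacent = [("core0", "core1"), ("core1", "core2")]
measured = {"core0": 83.5, "core1": 46.0, "core2": 61.0}

print(gradients(target, adjacent))
print(is_enforced(target, measured, tolerance=2.0))
```

In this sketch, a map counts as enforced when every site is close enough to its target temperature; the gradients then follow from the map and need no separate specification.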

Another phenomenon that is related to early-life failures is temperature

cycling. Exposing the IC to a number of large-scale temperature changes

before testing it makes some early-life failures detectable. A simple

burn-in will not help to detect these early-life defects, and the affected devices

will fail shortly after being employed in the field. The existing temperature-

cycling tests use temperature chambers [Mil04] and, therefore, the

temperature-cycling test is costly. A low-cost temperature-cycling test is

proposed in this thesis that uses high-power tests, among other stimuli, to

enforce the required amount of cycling on the IC.

1.2 Contributions

The first contribution of this thesis is the development of stochastic

approaches for thermally-safe and multi-temperature testing under large

process variation. The usual cost function for test scheduling is the

deterministic test application time which is not appropriate for the

situations in which some dies will be overheated due to the negative

consequences of process variation. A probabilistic cost function is

introduced to include the cost of the overheated ICs. Later on, for multi-

temperature testing, this cost function is extended to take the cost of the

test escapes (due to temperature-dependent defects) into account. Adaptive

approaches, which utilize these cost functions, are proposed to deal with

intra-die variations and temperature fluctuations over time [Aghaee11a,

Aghaee14b]. Test scheduling techniques that take the temperature into

account use a thermal simulator in order to estimate the temperatures

before the actual testing. A fast temperature simulation technique is

introduced to facilitate faster process-variation aware schedule generation

[Aghaee13a].
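
The idea of a probabilistic cost function can be sketched as follows. This is only an illustration and not the formulation used in this thesis, which is given in Chapter 4. The sketch assumes that the actual peak temperature equals the simulated one plus a zero-mean Gaussian error, and it uses hypothetical cost coefficients (`cost_per_second`, `cost_per_chip`).

```python
import math


def overheat_probability(t_sim_peak, t_limit, sigma):
    """P(actual peak > limit) when actual = simulated + N(0, sigma^2) error."""
    z = (t_limit - t_sim_peak) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def expected_cost(test_time, t_sim_peak, t_limit, sigma,
                  cost_per_second, cost_per_chip):
    """Test-time cost plus the expected loss from overheated chips."""
    p = overheat_probability(t_sim_peak, t_limit, sigma)
    return test_time * cost_per_second + p * cost_per_chip


# Two hypothetical schedules: a fast one that runs close to the limit,
# and a slower one with more cooling and therefore more thermal margin.
fast = expected_cost(100.0, 58.0, 60.0, 2.0, 0.01, 10.0)
slow = expected_cost(120.0, 52.0, 60.0, 2.0, 0.01, 10.0)
print(fast, slow)
```

Under these assumed numbers, the slower schedule has the lower expected cost even though its deterministic test application time is longer, which is the kind of trade-off that a purely time-based cost function cannot express.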

The second contribution of this thesis is a collection of techniques for

enforcing the given temperature maps on the ICs. Enforcing certain

temperature gradients on an IC for a given time makes the related gradient-

dependent early-life failures detectable by a targeted test performed later

[Aghaee14a]. Enforcing certain temperature maps while testing for

gradient-dependent defects (including some hard-to-detect delay faults)

helps to detect them [Aghaee13b]. Ordering these temperature maps and

consequently their related tests in an effective manner can reduce the test

application time, as proposed in [Aghaee15b].

The third and last contribution of this thesis targets cycling-dependent

early-life failures. The proposed algorithm utilizes the normal tests (tests

not related to cycling) and other stimuli in order to enforce a high level of

temperature-cycling activity. This is performed in a controlled manner, so

that no overheating or excessive cycling threatens the IC or test

performance [Aghaee15a]. The order of the tests affects the dissipated

power in the circuit under test. This fact is utilized by the proposed

algorithm to achieve a short test application time (including the

temperature-cycling time).

1.3 Publications

The contributions of this thesis are reported in the following articles:

N Aghaee, Z He, Z Peng, P Eles. Temperature-aware SoC test scheduling

considering inter-chip process variation. 19th IEEE Asian Test

Symposium (ATS), pp 395–398. Shanghai, China, Dec 2010.

N Aghaee, Z Peng, P Eles. Adaptive temperature-aware SoC test

scheduling considering process variation. 14th Euromicro Conference on

Digital System Design (DSD), pp 197–204. Oulu, Finland, Aug 2011.

N Aghaee, Z Peng, P Eles. Process-variation and temperature aware SoC

test scheduling using particle swarm optimization. 6th IEEE International

Design and Test Workshop (IDT), pp 1–6. Beirut, Lebanon, Dec 2011.

N Aghaee, Z Peng, P Eles. Process-variation and temperature aware SoC

test scheduling technique. Journal of Electronic Testing: Theory and

Applications, vol 29, no 4, pp 499–520. Aug 2013.

N Aghaee, Z Peng, P Eles. Temperature-gradient based test scheduling for

3D stacked ICs. 20th IEEE International Conference on Electronics,

Circuits, and Systems (ICECS), pp 405–408. Abu Dhabi, UAE, Dec 2013.

N Aghaee, Z Peng, P Eles. Process-variation aware multi-temperature test

scheduling. 27th International Conference on VLSI Design (VLSID), pp

32–37. Mumbai, India, Jan 2014.

N Aghaee, Z Peng, P Eles. An efficient temperature-gradient based burn-

in technique for 3D stacked ICs. Design, Automation and Test in Europe

Conference (DATE). Dresden, Germany, Mar 2014.

N Aghaee, Z Peng, P Eles. An integrated temperature-cycling acceleration

and test technique for 3D stacked ICs. 20th Asia and South Pacific Design

Automation Conference (ASP-DAC), pp 526–531. Chiba, Japan, Jan 2015.

N Aghaee, Z Peng, P Eles. Temperature-gradient-based burn-in and test

scheduling for 3-D stacked ICs. IEEE Transactions on Very Large Scale

Integration (VLSI) Systems, Accepted.

N Aghaee, Z Peng, P Eles. Efficient test application for rapid multi-

temperature testing. 25th Great Lakes Symposium on VLSI (GLSVLSI),

pp 3–8. Pittsburgh, PA, USA, May 2015.

N Aghaee, Z Peng, P Eles. A test-ordering based temperature-cycling

acceleration technique for 3D stacked ICs. Journal of Electronic

Testing: Theory and Applications, Accepted.

1.4 Thesis Organization

This thesis is organized in 7 chapters. The current chapter, chapter 1, is the

introduction. The next chapter, chapter 2, explains the preliminaries.

Related work is reviewed in chapter 3. Chapter 4 presents the proposed

process-variation aware SoC test scheduling techniques. Chapter 5 focuses

on temperature-gradient-based burn-in and test scheduling for 3D-stacked-

ICs. Chapter 6 presents our integrated temperature-cycling acceleration

and test techniques. Chapter 7 concludes the thesis and discusses future

work.

Chapter 2 Preliminaries

This chapter introduces preliminaries that are helpful for understanding the

rest of this thesis. Temperature-related defects and the tests to detect them

are discussed in section 2.1. The testing procedure for core-based

system-on-chip designs is explained in section 2.2. Through-silicon vias and the

3D stacked IC technology that is based on them are briefly introduced in

section 2.3. Test scheduling approaches are reviewed in section 2.4. Power

and temperature issues are discussed in section 2.5. A temperature

simulation technique is introduced in section 2.6. A meta-heuristic

approach is introduced in section 2.7.

2.1 Temperature Related Defects

A well-known category of manufacturing defects affects the correct

operation of the IC just after manufacturing. Therefore, these defects can be

tested for immediately after the manufacturing process, without any

particular environment- or temperature-related requirement. We refer to this

type of defect as normal defects. Normal defects are relatively easy to

detect since they show up just after the manufacturing and can be detected

independent of the environmental conditions. An example of such defects

is a normal stuck-at fault.

2.1.1 Temperature Dependent Defects

Another category of defects is environment-sensitive and shows up only

under certain environmental conditions. An important sub-category of

these defects is temperature-sensitive defects [Needham98]. For example,

some defects show up only when the IC follows a certain temperature

pattern [Hagihara97].

An example of such a temperature-sensitive defect is a resistive open, which

is a major cause of test escapes [Needham98]. It occurs when a connection

between two circuit nodes has a conductance high enough to be considered

connected at normal temperatures, but at high temperatures the

conductance decreases so much that the connection is considered

disconnected. This may occur because most of the interconnects on the

chip are made of metals, and the conductance of those metals has a

negative temperature coefficient. Therefore, it is expected that a large

number of such defects appear at high temperatures. On the other hand, we

have other defects that manifest themselves differently with respect to

temperature. For example, in [Needham98] a defect (“Dark Via”) is

reported that “had previously passed all production tests, but then failed a

monitor test at cold temperature”. Several other defects are also identified

in [Needham98] that similarly appear only at low temperatures.
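
The high-temperature behavior of a resistive open can be illustrated with the usual linear approximation R(T) = R_ref (1 + α (T − T_ref)). The sketch below is an assumption-laden example, not a model taken from the cited works: the coefficient is roughly that of copper near room temperature, while the defect resistance and the failure threshold are made-up values.

```python
def resistance(r_ref, alpha, temp, temp_ref=25.0):
    """Linear model: R(T) = R_ref * (1 + alpha * (T - T_ref))."""
    return r_ref * (1.0 + alpha * (temp - temp_ref))


def behaves_as_open(r_ref, alpha, temp, r_fail):
    """True if the connection resistance exceeds the (hypothetical) failure threshold."""
    return resistance(r_ref, alpha, temp) > r_fail


ALPHA_METAL = 0.0039  # per degree Celsius, roughly that of copper near room temperature
R_DEFECT = 900.0      # ohms at 25 C for a partially voided connection (made up)
R_FAIL = 1000.0       # ohms above which the path is assumed to fail (made up)

for temp in (25.0, 125.0):
    print(temp, resistance(R_DEFECT, ALPHA_METAL, temp),
          behaves_as_open(R_DEFECT, ALPHA_METAL, temp, R_FAIL))
```

Under these assumptions, the defective connection passes at room temperature and crosses the failure threshold only when heated, which is why such a defect can escape a test performed at normal temperature. Defects such as the Dark Via, which appear only at low temperature, are not captured by this simple model.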

Besides the temperature coefficient for conductivity of the material,

thermal expansion may also contribute to temperature-dependent defects.

The Dark Via defect, which appears at low temperature, could be seen as

voids between interconnect and via [Needham98, Segura04]. This

observation could be explained with thermal expansion in metals that fills

up the voids and increases the conductivity. This effect is illustrated in

Figure 2.1.1, where large voids at low temperature shrink at high

temperature because of thermal expansion. Therefore, the conductance of

the via may increase despite the reduced conductivity of the via's

constituent material.

Other similar defects also exist. For example, some defects for a different

technology (i.e., copper-based interconnects) are studied in [Zschech02]

and interface voids are mentioned along with sidewall voids and bulk voids

(shown also in Figure 2.1.1) as temperature-dependent defects. Moreover,

similar to possible temperature-dependent mechanisms for open defects,

one may think of temperature dependent mechanisms for short or bridging

defects.

Figure 2.1.1 Voids in a via create a resistive open. (a) Large voids at low temperature. (b) At high temperature, materials expand and voids shrink. Void types: (i) bulk void, (ii) sidewall void, (iii) interface void.

Another type of temperature-dependent defect that is hard to detect is the silicide open [Tseng00]. Silicide is used to make local interconnects. In its perfect condition, such a local interconnect has a positive temperature coefficient of resistance, whereas a defective one has a negative coefficient. Detecting such defects at normal temperature is difficult since the difference between the two is not recognizable. Testing at low temperatures is a good solution since there will be a recognizable difference between the perfect and the defective interconnects [Tseng00].

Resistive-open and stuck-open defects are experimentally studied in

[Li01]. The resistive-opens occur more frequently (39 samples) compared

to stuck-open defects (11 samples) [Li01]. By knowing the location of the

resistive defects and the materials involved in those defects, the proper test

temperatures can be found and the appropriate tests can be developed

[Li01].

Interconnect malfunctions (e.g., opens and shorts) are not the only sources

of temperature-dependent defects; transistor malfunctions are also a source

of concern. This issue is studied in [Long04] and the impact of temperature

is demonstrated. The thermal behavior of a transistor depends on its

quiescent point and therefore higher or lower temperatures, per se, do not

imply better or worse results. Usually, in order to minimize the effect of the

temperature, transistors are biased at the Zero-Temperature-Coefficient

(ZTC) point. ZTC is a point where the temperature will not affect the

transistor behavior. The problem is that there will be variations in the actual

quiescent points of the manufactured transistors and therefore temperature

will affect them. This will lead to defects that are hard to detect. Multi-

temperature testing can help to detect such defects [Long04].

2.1.2 Early Life Failures

Another category of defects consists of early-life failures. These can be

seen as manufacturing imperfections that are not manifesting as a defect

just after the manufacturing and therefore cannot be detected by the

manufacturing test that is performed immediately after the fabrication. A

burn-in process is usually used to push the IC through its early-life in an

accelerated manner. The existing techniques operate the device under high

temperature and perhaps with increased voltage and/or frequency. These

techniques handle the normal early-life failures that can be efficiently

accelerated this way. Two subcategories of early-life failures that are

different from the usual ones are explained below.


There are early-life failures that show up at certain sites in the IC where

large temperature gradients are in place for relatively long periods of time

[Smorodin08]. In order to efficiently detect these defects, corresponding

temperature gradients must be enforced for a certain duration of time

before testing. The second type of defects are those that are made

detectable by temperature cycling. This means that the device goes through

an aggressive temperature cycling before being tested for the related

defects [Mil04]. This way some other imperfections that are not detectable

immediately after the manufacturing can be detected.

2.1.3 Delay Faults

Another category of defects, which shares features with some of the

temperature-related defects mentioned above, consists of delay-related

faults. These happen when a signal propagates slower (or, in relative terms,

faster in some cases) than expected (e.g., a clock signal affected by skew). This

may happen due to temperature gradients and usually results in wrong data

being latched in memory elements. This can be due to data and clock

timings not being correct with respect to each other (e.g., due to different

temperatures at different sites). It can also be that the IC under test cannot

work at the intended frequency, although it works correctly at a slower

clock. At-speed and delay tests are usually used to detect these defects

[Ahmed05, Higami13, Ko08].

2.2 Core-Based SoC Testing

A simple explanation for testing is that certain stimuli are applied to the

site of the targeted defect to activate it and then the circuit outputs are

compared against the correct outputs to detect the defect. In order to

generate such a test, the circuit model and the possible defect models must

be analyzed. This is a tedious task best done with the help of a computer

algorithm. Therefore, an Automated Test Pattern Generation (ATPG) tool

is used to generate the tests that cover a large number of defects while the

tests are kept acceptably short [Abramovici94].

The decision about which defects to target and which tests to include in the

test procedure of a certain product has a number of aspects. Incorporating

tests for all of the defects, in a modern system-on-chip, will make the test

application time very long. Testing costs are considerable, especially if

costly test equipment is involved. But shipping defective devices also has a

cost, since they are usually covered by the manufacturer’s guarantee. The


failures that show up after the device has left the fabrication and test facility

will cost much more than the defective device’s own cost [Davis94]. The

testing process is therefore designed to minimize the overall cost. The other

aspect to be considered is reliability for safety-critical applications. The

devices manufactured for safety-critical applications usually go through

much more elaborate tests to comply with the high reliability requirements.

A modern system-on-chip includes a large number of memory elements

(e.g., flip flops and registers) and therefore the number of states that such

digital designs include is huge. Moreover, taking the circuit from one state

to another state that is needed for some other tests can be very time

consuming. This is one of the motivations for Design for Testability (DfT)

techniques that include a Test Access Mechanism (TAM) on the core-

based system-on-chips.

A test access mechanism is used to provide test access to all the cores.

There might be some other testable modules in a system-on-chip that are

not conventional cores. These modules are also accessible using the TAM.

There is always a trade-off between the test acceleration gained by

inclusion of a TAM and the cost of the TAM itself that includes its area on

the die, the delays that it adds to the signal paths, and its static power

consumption. The TAM design is usually kept small to avoid these

overheads. Therefore, it is extremely unlikely to be able to provide

simultaneous access to all modules. Consequently, during the test some of

the modules must wait while other modules are being tested.

The tests are usually performed using Automated Test Equipment (ATE),

which puts the device in the test mode, feeds it with stimuli, and checks the

outputs of the circuit under test for defects.

2.3 3D Stacked IC Testing

Existing systems-on-chip like Apple A8X and Xbox One have 3 and 5

billion (i.e., $10^9$) transistors, respectively. Larger numbers of transistors

have already been integrated. For example, Intel Xeon E5-2600 v3 has 5.6

billion transistors [Intel13], Nvidia Kepler GK110 has 7.1 billion

transistors [Nvidia12] and Xilinx Virtex UltraScale XCVU440 has 20

billion [Santarini14]. These indicate the extremely large number of

transistors that will be integrated into advanced system-on-chips in order

to provide a wider range of functionalities as well as higher computational

power.


More functions as well as higher computational power are traditionally

achieved by shrinking the feature size as well as some other minor

improvements so that a large number of possibly faster transistors fit on a

single die. For more than that, a number of dies must be connected. These

inter-die interconnects are usually long and thick. Moreover, a relatively

small number of interconnects can be made per die area (i.e., low

interconnect density). These lead to high power consumption as well as

low data transmission rate.

A promising technology for efficiently connecting different dies is based

on Through Silicon Vias (TSV). A through silicon via is a via that runs

throughout the bulk silicon and allows the dies to be stacked on top of each

other while making electrical connections. The ICs fabricated this way are

called 3D Stacked ICs (3D-SIC). This technology supports high density

signal connections with a short wire length that translates into high

bandwidth communication (both number of lines and the frequency that

they support) with a small power consumption.

TSVs are manufactured in the individual dies. They are initially contained

within the die, since their length is smaller than the die’s thickness.

Therefore, a thinning step follows in order to carefully remove extra

thickness of the die. After the thinning process, the TSVs reach the surface

of the die.

On the surface of the die the so called micro-bumps are placed. The micro-

bumps are places where electrical connections, for example by soldering,

are made. The dies must be carefully aligned and then correct bonding can

take place.

The steps in the manufacturing process may involve multiple bonding

stages. A testing procedure at each of these stages may help to reduce the

overall costs. These tests are referred to as pre-bond, mid-bond, and post-

bond test stages. The pre-bond test is performed before bonding when the

die is separate. If a defect goes undetected to the next steps, some other

potentially perfect dies as well as the bonding efforts are wasted because

of the defective die. Similarly, a mid-bond test may be helpful especially

if an expensive die is going to be bonded to a low-cost partial stack. In this

case, it might be a good idea to test the partial stack before bonding. At the

end of the bonding process, a post-bond test can be performed.


The bonding can also be done with wafers instead of the individual dies.

In this case, the wafers are aligned and bonded and then diced. Since the

dies are still not diced during the bonding, it is not possible to choose the

non-defective dies to be bonded together. In such a scenario, the wafers

can be matched, positioned, and aligned so that the low defect-rate areas

of the two wafers meet each other. The probability of ending up with

defective stacks is reduced this way, although it is not possible to fully

prevent good dies being wasted.

So far we explained die to die bonding and then wafer to wafer bonding.

Another alternative for bonding is die to wafer bonding. In this case, a

particular layer in the 3D-SIC structure is diced into dies while the other

wafer is not diced. This way bonding known bad dies can be avoided.

The TSV manufacturing process and bonding process are new sources of

defects that do not exist for normal 2D ICs. Therefore, a more elaborate

testing process may be required, especially for defects that are related to

the TSV fabrication or the bonding process.

For 3D stacked IC testing, the TAM is designed so that the test access is

possible at different test stages [Ieee14a]. 3D-SICs experience more

thermal issues than the conventional 2D ICs. These include the issues that

affect the conventional 2D ICs as well as thermo-mechanical issues related

to TSV technology. Moreover, the dies cannot cool as efficiently as 2D ICs

that usually have many low-resistance thermal paths for cooling. The

situation is particularly difficult for dies located in the middle of the stack.

2.4 Test Scheduling

As mentioned before, the test access mechanism, in either 2D or 3D SoCs,

is a resource bottleneck for testing. Therefore, tests must be scheduled in

order to minimize the test application time. A test schedule determines at

each time-point which modules must run their tests. Moreover, it

determines which test must be performed for the module.

Test scheduling can be done with or without partitioning and interleaving.

Schedules without partitioning [Chou97, Zorian93] are simpler but in

general result in large test application times. In this case, when a module

starts a certain test it runs to the test’s completion and the schedule cannot

make changes when a test is being applied. Nowadays, partitioning and

interleaving of tests is common [Marinissen00]. In this case, a test can be


halted for a while and other modules may use the released TAM resources.

This thesis uses test partitioning and interleaving for all the proposed

scheduling approaches.

The authors in [Iyengar02] have formulated the test scheduling problem as

a rectangle packing problem. The problem is proven to be NP-complete

and is solved using a Mixed-Integer Linear Programming (MILP) approach

in [Chakrabarty00]. The test scheduling problem becomes even more

complicated when, for instance, the thermal issues must be taken into

account.

Here we briefly explain the main ideas related to test scheduling using an example. Assume that the SoC under test consists of three modules $m_0$, $m_1$, and $m_2$ as shown in Figure 2.4.1a. Assume that the test access mechanism can accommodate only two of these modules at a time (i.e., the TAM width is two).

There are two Built-In Self-Test (BIST) modules $b_0$ and $b_1$ as shown in Figure 2.4.1a. Each of them performs only a part of the tests for the corresponding module. $b_0$ uses the TAM to test $m_0$ but $b_1$ is directly connected to $m_1$ and can test it without occupying the TAM. Assume that each module has four tests and that each one of them is a node in a directed path-graph (i.e., there is only one path in the test graph). The $k$-th test for module $m_j$ is denoted by $n_{j,k}$ as shown in Figure 2.4.1b. The fourth test for module $m_0$ (i.e., $n_{0,3}$) is performed by the BIST $b_0$ while $n_{1,3}$ is performed by $b_1$. The rest of the tests (marked as normal in Figure 2.4.1b) are performed using an ATE through the TAM.

Since the TAM cannot support simultaneous testing of all modules, the tests must be scheduled. A shorter test application time is desirable and therefore the test schedule must be optimized for a minimal test application time. In general, there could be other constraints, in addition to TAM, including power, temperature, and tester memory constraints. The scheduling objective may include other factors, in addition to test application time, including test throughput and perhaps test coverage considering defect probabilities.

Figure 2.4.1 Examples for (a) a SoC, and (b) tests. Panel (a) shows the modules $m_0$, $m_1$, $m_2$, the BIST modules $b_0$, $b_1$, and the TAM; panel (b) shows the test nodes $n_{0,0}$ to $n_{2,3}$, marked as BIST or normal.

Let us focus only on test application time reduction under TAM limitation. A module can be only in one of these two states: active (i.e., testing) or inactive. A schedule indicates the test cycles (time) at which a change in one or more of the modules’ states must happen and what that change is. The schedule indicates that at cycle $i_0$ modules $m_0$ and $m_1$ start testing, as indicated in Figure 2.4.2a. Since the tests’ path-graphs are given, there is no need to include the test nodes in the schedule; however, the tests being applied are shown in Figure 2.4.2d. The active modules go through their 1st, 2nd, and 3rd test without any new entry in the schedule.

At test cycle $i_1$, the BIST tests ($n_{0,3}$ and $n_{1,3}$) start as indicated in Figure 2.4.2c. Since $b_1$ has dedicated access to module $m_1$, it does not occupy the TAM and, therefore, module $m_2$ can gain access to the TAM, as shown in Figure 2.4.2b. Consequently, all three modules are active simultaneously. At test cycle $i_2$, testing of $m_0$ and $m_1$ is complete. Testing of $m_2$ continues to completion at cycle $i_3$.

In the above example, we assumed that the order of the tests is fixed, but in reality it might be possible to reorder tests to achieve better results. In that case, the nodes (e.g., Figure 2.4.2d) must be included in the schedule. This means that at least two additional entries in the schedule table (Figure 2.4.2a) between cycles $i_0$ and $i_1$ as well as two more additional entries between cycles $i_2$ and $i_3$ must be added to indicate transition to new test nodes. (In fact one entry is sufficient since the last node is trivial.)
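The example above can be reproduced with a small greedy scheduler. The following is only a sketch under assumptions that are not part of the example itself: every test is taken to last 10 cycles, tests are not partitioned, and modules are served in a fixed priority order. The function name `greedy_schedule` and the data layout are hypothetical.

```python
def greedy_schedule(tests, tam_width):
    """Greedy, non-preemptive scheduling sketch.

    tests[m] is the ordered list of (length_in_cycles, uses_tam) for module m.
    Lengths are assumed to be positive and tam_width at least 1.
    Returns (starts, finish): starts[m][k] is the start cycle of test k of m.
    """
    next_test = {m: 0 for m in tests}    # index of the next test to start
    busy_until = {m: 0 for m in tests}   # cycle at which the module is idle again
    tam_release = {}                     # module -> cycle at which it frees its TAM lane
    starts = {m: [] for m in tests}
    now = 0
    while any(next_test[m] < len(tests[m]) for m in tests):
        # Free the TAM lanes of tests that have finished by now.
        tam_release = {m: t for m, t in tam_release.items() if t > now}
        for m in tests:                  # fixed priority: insertion order
            k = next_test[m]
            if k == len(tests[m]) or busy_until[m] > now:
                continue
            length, uses_tam = tests[m][k]
            if uses_tam and len(tam_release) >= tam_width:
                continue                 # TAM is full, the module must wait
            starts[m].append(now)
            busy_until[m] = now + length
            if uses_tam:
                tam_release[m] = now + length
            next_test[m] = k + 1
        # Jump to the next cycle at which some running test completes.
        now = min(t for t in busy_until.values() if t > now)
    return starts, max(busy_until.values())


# Hypothetical data: every test takes 10 cycles; only the fourth test of m1
# (run by the directly connected BIST) does not occupy the TAM.
tests = {
    "m0": [(10, True)] * 4,
    "m1": [(10, True)] * 3 + [(10, False)],
    "m2": [(10, True)] * 4,
}
starts, finish = greedy_schedule(tests, tam_width=2)
print(starts["m2"][0], finish)  # prints "30 70"
```

With these assumed lengths, the third module gains TAM access at cycle 30, when the BIST test of the second module stops occupying the TAM, and the whole test ends at cycle 70. This mirrors the four schedule points of the example.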

Figure 2.4.2 Example for a test schedule: (a) the test schedule; (b) TAM occupation; (c) BIST activity; (d) test nodes

Moreover, we assumed that testing is always done for all of the specified

tests, but in reality testing may be terminated as soon as a defect is found.

In this case the optimization objective (e.g., test application time) is a

stochastic quantity (e.g., expected test application time) that is evaluated

based on the defect probabilities (or statistics).
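As a minimal sketch of such a stochastic objective, assume that the tests are applied one after another, that each test fails independently with a known probability, and that testing stops at the end of the first failing test. These assumptions and the function name `expected_tat` are ours and serve only as an illustration.

```python
def expected_tat(lengths, fail_probs):
    """Expected test time when testing stops after the first failing test.

    lengths[k] is the length of test k; fail_probs[k] is the (assumed
    independent) probability that test k fails given that it is applied.
    """
    expected = 0.0
    reach = 1.0                      # probability that test k is applied at all
    for length, p_fail in zip(lengths, fail_probs):
        expected += reach * length   # test k costs its full length if reached
        reach *= 1.0 - p_fail        # later tests run only if test k passes
    return expected


# Hypothetical numbers: two tests of 10 cycles each, one of which fails often.
likely_fail_last = expected_tat([10, 10], [0.0, 0.5])    # 20.0 cycles
likely_fail_first = expected_tat([10, 10], [0.5, 0.0])   # 15.0 cycles
```

Under these assumptions, applying the test that is more likely to fail first reduces the expected test application time, which is why the test order becomes part of the optimization.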

A test schedule can be adaptive, depending on certain run-time parameters.

An adaptive schedule acts based on the actual value of an otherwise

stochastic quantity during the test. An example is sensing the actual

temperature and changing the schedule accordingly. In this case, a number

of schedule pieces are generated and during the test, the temperature is

sensed when required and the schedule-piece that fits the situation is

selected.
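The run-time selection step can be sketched as a lookup over temperature ranges. The thresholds, the piece names, and the function `select_piece` below are hypothetical; how the pieces themselves are generated is outside the scope of this sketch.

```python
import bisect


def select_piece(sensed_temp, thresholds, pieces):
    """Pick the schedule piece whose temperature range contains sensed_temp.

    thresholds is an ascending list of boundaries; pieces has one more entry
    than thresholds (piece i covers thresholds[i-1] <= T < thresholds[i]).
    """
    if len(pieces) != len(thresholds) + 1:
        raise ValueError("need exactly one piece per temperature range")
    return pieces[bisect.bisect_right(thresholds, sensed_temp)]


# Hypothetical pieces, prepared offline for three temperature ranges (degrees C).
thresholds = [70.0, 85.0]
pieces = ["continue testing", "test with short pauses", "cool down first"]
chosen = select_piece(78.5, thresholds, pieces)  # "test with short pauses"
```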

2.5 Test Power and Temperature

The circuit under test consumes power as a result of switching activity

during the test process, similar to when the circuit is in operation. In

general, power density for digital circuits is increasing with the advancement

of technology and increased integration. One of the problems is that this

dense power dissipation leads to very high temperatures and can affect the

correct system behavior. The situation is worse during testing. In

particular, scan-chain-based DfT features result in even higher power

densities. It is reported that the test power dissipation can be as large as

twice the normal power [Bonhomme02, Zorian93].

In order to prevent incorrect device behavior or damage to the device

because of high temperature (overheating) something must be done. A

category of efficient approaches that do not make the testing unnecessarily

long are based on changing the test schedules [Rosinger06]. In order to

prevent overheating during the test, temperature simulations are performed

before the actual test during the scheduling process. The simulated

temperature shows the time intervals in the schedule where overheating

may occur. One of the options is to halt the test to allow for cooling at such

time intervals. This way, cooling, which slows down the testing process, is

added to the schedule only when it is needed.

Process variation results in large variations in the dissipated power in

advanced SoC designs [Cheng00]. This results in considerable variations

in the temperature of the device and poses difficulties for the offline

temperature-aware test scheduling techniques that are deterministic (e.g.,


[Rosinger06]). To handle this situation, stochastic approaches are proposed

in this thesis in chapter 4.

The dissipated power in a circuit depends on the current input values and

the circuit’s state. The state depends on the previous inputs. Therefore, the

dissipated power during the test depends on the test order [Girard97]. This

phenomenon is used in chapter 6 to harvest different power values from

the same set of tests.

The power dissipations are calculated based on the given switching

activities and the IC power-related characteristics. The actual dissipated

power also depends on the leakage current (i.e., static power). The leakage

current, itself, depends on the temperature. As mentioned before, a

temperature simulation is always performed in this thesis. The simulated

temperatures are used to guide the schedule generation. Also, they are used

to approximate the static power, the component that depends on the

temperature.

Leakage current plays an essential role in thermal runaway. Thermal runaway is a situation in which the static power, per se, can keep increasing the temperature, even beyond the safe limit. This means that introducing a halt that takes away the dynamic power will not stop the temperature from increasing. Consequently, the temperature further increases, which increases the static power, and the increased static power in turn increases the temperature [Vassighi06].

This positive feedback loop goes on and on until the circuit is disconnected

from the power source or until the circuit is damaged. Once started, this

usually goes fast. However, it only starts at high temperatures. In the usual

DfT architectures only the dynamic power can be controlled by the

schedule. Therefore, in schedule-based solutions, such high temperatures

must be avoided.
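The feedback loop can be illustrated with a single thermal node whose leakage power grows exponentially with temperature. Both the exponential leakage model and all parameter values below are assumptions made for this sketch; they are not taken from a particular technology. The dynamic power is set to zero, as it would be during a halt.

```python
import math


def simulate_halt(theta_start, steps=2000, dt=0.01, r=2.0, c=1.0,
                  theta_amb=300.0, p0=1.0, k=30.0, theta_limit=600.0):
    """Forward-Euler simulation of a halted module (no dynamic power).

    Assumed model: C * dtheta/dt = P_leak(theta) - (theta - theta_amb) / R,
    with P_leak(theta) = p0 * exp((theta - theta_amb) / k).
    Returns (final_temperature, ran_away).
    """
    theta = theta_start
    for _ in range(steps):
        p_leak = p0 * math.exp((theta - theta_amb) / k)
        theta += dt * (p_leak - (theta - theta_amb) / r) / c
        if theta > theta_limit:
            return theta, True   # leakage alone keeps heating the module
    return theta, False


cool_end, cool_ran_away = simulate_halt(400.0)  # halt issued early enough
hot_end, hot_ran_away = simulate_halt(440.0)    # halt issued too late
```

With these assumed values, a halt that starts at 400 K lets the module cool toward a stable temperature slightly above ambient, while a halt that starts at 440 K cannot stop the temperature from rising. The boundary between the two cases is the temperature at which the leakage power equals the heat removed through the thermal resistance.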

2.6 Temperature Simulation

As mentioned above, in order to estimate the actual temperatures during

the test, temperature simulations are performed during the scheduling

process. This paradigm has been used in all chapters of this thesis. A

temperature simulator consists of a thermal model and an algorithm to

analyze it. The thermal model describes the mathematical relation between

the IC characteristics, the dissipated power, and the temperatures.


There exists a range of thermal models. Some of them may focus on the

steady state temperatures which means that the dynamic response cannot

be obtained. Some other thermal models focus only on each individual

module and ignore the heat transfer among modules. In this thesis we use

a thermal model that supports dynamic response analysis and takes the heat

transfer among modules into account, similar to the widely used thermal

simulator, HotSpot [Huang07, Huang06, Stan03].

This model is a lumped element model meaning that the chip is modeled

as a combination of thermal resistances and thermal capacitances. An

example for such a thermal model is given in Figure 2.6.1. A typical

thermal model consists of a number of lumped elements connected to each

other. A connection point of thermal elements is called a node.

An equivalent view is that an IC is divided into small elements each of

which is characterized by a single temperature. Each of these small

elements is represented as an individual node in the model. In Figure 2.6.1,

two cores are modeled as two nodes (i.e., elements) which are connected

to two exclusive power sources. Power sources represent the power

dissipated by the cores.

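Assume that the thermal model consists of $n$ nodes and $m$ is the number of cores. In a high quality thermal model, usually the number of nodes is larger than the number of cores, $n > m$ (e.g., six thermal nodes for two cores), as shown in Figure 2.6.1. Assume that $P$ is the power vector and $\Theta$ is the temperature vector. The mathematical representation of the thermal model is a system of ordinary differential equations:

$$A \frac{d\Theta}{dt} + B\,\Theta = P \qquad (2.6.1)$$

Figure 2.6.1 An example of a lumped element thermal model (legend: Core 1, Core 2, resistance, capacitance, ambient, power source)

The properties of the thermal model are encapsulated into two $n \times n$ matrices $A$ and $B$. $\Theta$ and $P$ are temperature and power vectors. The mathematical representation of this commonly used model (equation 2.6.1) is a system of linear constant-coefficient differential equations. As an example, assume that a SoC has two cores ($m = 2$) and assume that the model has four nodes ($n = 4$). The expanded characteristic equation of the model is

$$
\begin{bmatrix}
a_{11} & a_{12} & a_{13} & a_{14} \\
a_{21} & a_{22} & a_{23} & a_{24} \\
a_{31} & a_{32} & a_{33} & a_{34} \\
a_{41} & a_{42} & a_{43} & a_{44}
\end{bmatrix}
\frac{d}{dt}
\begin{bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \\ \theta_4 \end{bmatrix}
+
\begin{bmatrix}
b_{11} & b_{12} & b_{13} & b_{14} \\
b_{21} & b_{22} & b_{23} & b_{24} \\
b_{31} & b_{32} & b_{33} & b_{34} \\
b_{41} & b_{42} & b_{43} & b_{44}
\end{bmatrix}
\begin{bmatrix} \theta_1 \\ \theta_2 \\ \theta_3 \\ \theta_4 \end{bmatrix}
=
\begin{bmatrix} p_1 \\ p_2 \\ 0 \\ 0 \end{bmatrix}
$$

$\theta_1$ and $\theta_2$ are core temperatures which should be taken care of. $p_1$ and $p_2$ are the power values applied to the cores.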

For architectural design purposes, usually the dissipated power is assumed

to correspond to a fixed scenario. The inputs are the IC characteristics that

are varied to find a good design. The outputs are the temperatures that

somehow affect the cost function for the architectural design. This

viewpoint is useful, for example, for designing the TAM¹. For this

viewpoint, numerical approximation is a good choice to solve equation 2.6.1.

In order to numerically analyze and solve the combination of the thermal

model and the dissipated power values, a time interval which is called a

simulation cycle is defined. The length of simulation cycle is determined

based on a number of factors including the required accuracy. The

computed temperatures are recorded and reported for each simulation

cycle. It is common to assume that the power (the power vector in equation 2.6.1) is

constant during a simulation cycle.

The numerical approximations are usually done with very small

intermediate steps, and as a result, the complete temperature curve for the

interval is meticulously constructed. HotSpot uses the Runge-Kutta

method for the numerical approximation [Huang06]. Though only the

temperature at the end of the simulation cycle is registered, many points of

the temperature curve are calculated.
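The procedure can be sketched with a classical fourth-order Runge-Kutta integrator applied to a lumped model with a diagonal capacitance matrix. This is an illustration only, not the HotSpot implementation: the two-node model, its parameter values, the function names, and the fixed number of intermediate steps are all assumptions. Temperatures are expressed relative to ambient, and the power is held constant within each simulation cycle.

```python
def derivative(theta, power, caps, cond):
    """d(theta)/dt for caps[i] * dtheta_i/dt + sum_j cond[i][j] * theta_j = power[i]."""
    n = len(theta)
    return [(power[i] - sum(cond[i][j] * theta[j] for j in range(n))) / caps[i]
            for i in range(n)]


def rk4_step(theta, power, caps, cond, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, scale):
        return [t + scale * ki for t, ki in zip(theta, k)]
    k1 = derivative(theta, power, caps, cond)
    k2 = derivative(shifted(k1, h / 2), power, caps, cond)
    k3 = derivative(shifted(k2, h / 2), power, caps, cond)
    k4 = derivative(shifted(k3, h), power, caps, cond)
    return [t + h / 6 * (a + 2 * b + 2 * c + d)
            for t, a, b, c, d in zip(theta, k1, k2, k3, k4)]


def simulate_cycle(theta, power, caps, cond, cycle_length, substeps=20):
    """Advance one simulation cycle; only the end-of-cycle temperature is returned."""
    h = cycle_length / substeps
    for _ in range(substeps):
        theta = rk4_step(theta, power, caps, cond, h)
    return theta


# Hypothetical two-node model: node 0 is a core, node 1 couples it to ambient.
caps = [1.0, 5.0]                      # thermal capacitances (diagonal matrix)
cond = [[1.0, -1.0], [-1.0, 1.5]]      # thermal conductance matrix
theta = [0.0, 0.0]                     # temperatures relative to ambient
trace = []
for cycle in range(200):
    power = [10.0, 0.0]                # constant during the simulation cycle
    theta = simulate_cycle(theta, power, caps, cond, cycle_length=1.0)
    trace.append(theta[0])             # one recorded value per simulation cycle
```

For this assumed model, the steady-state solution is 30 degrees above ambient for the core node and 20 degrees for the second node, which the simulated temperatures approach after many cycles.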

¹ Not the viewpoint of this thesis. In this thesis we assume that the TAM is

already designed and given along with the other IC specifications.


This thesis’ viewpoint is that the IC characteristics are fixed. The inputs

that are varied are the power values. They vary because they depend on the

tests and the schedules. A range of different schedules are explored to find

a near optimal schedule. The outputs are temperatures. The thermal models

work equally well for both of the above viewpoints, whether the IC

characteristics are fixed or not. However, the difference in these

viewpoints means that different approaches may be appropriate for solving

the thermal model.

Since the physical design of the devices is assumed to be fixed, a

superposition-based approach as the one suggested in [Yao09] can be used.

This superposition-based approach is particularly helpful if the tests are

partitioned in advance (before the scheduling process) and if large errors

in static power (due to temperature-dependent leakage) are acceptable. In

this thesis a third approach different from the Runge-Kutta and the

superposition-based approach is used. A fast temperature simulation

scheme is proposed in section 4.6.

2.7 Meta-Heuristic

The test scheduling process is usually based on a number of decision

variables. These decision variables go through an optimization process in

order to generate a near optimal test schedule. A cost function is defined to

evaluate the quality of alternative schedules which are themselves based

on the combinations of the decision variable values. A motivational

example explains these concepts. Then, particle swarm optimization,

which is a meta-heuristic frequently used in this thesis, is introduced.

2.7.1 Motivational Example

A thermal-safe scheduling paradigm is discussed here to explain basic ideas of thermal-aware test scheduling and optimization. The objective is to generate a test schedule with the minimal Test Application Time (TAT). The constraint is that the temperature must not exceed the overheating level denoted by $\theta_H$ (this includes a safety margin).

We consider an IC made of only one module. Therefore, there are no constraints for access to modules using the test access mechanism. Assume that the tests dissipate a constant power (including both dynamic and static power) denoted by $P_T$. It is assumed that $P_T$ is so large that it results in overheating. Usually leakage and clock network power result in a non-zero power dissipation during cooling. This cooling power, which is denoted by $P_C$ ($P_C < P_T$), results in a rest temperature (denoted by $\theta_R$) that is higher than ambient ($\theta_R > \theta_A$).

The module temperature is initially equal to the ambient temperature denoted by $\theta_A$. As discussed above, the test is paused as soon as the temperature reaches $\theta_H$. Testing is resumed after sufficient cooling. The question is how much cooling is sufficient. A certain temperature level can be considered as sufficient. Let us denote this sufficient temperature level by $\theta_S$ ($\theta_R < \theta_S < \theta_H$). Thus, the sufficient-cooling temperature, $\theta_S$, is the decision variable in this problem formulation. The temperature curve is plotted in Figure 2.7.1.

Since the power values (i.e., $P_T$ and $P_C$) are constants, the testing and cooling patterns are periodic, as can be seen in Figure 2.7.1. In each of these periods, the testing time is denoted by $t_T$ and the cooling time by $t_C$. There is also a delay associated with starting or resumption of the testing process, denoted by $t_D$. This delay is associated with testing equipment and architecture and cannot be changed. A part of this delay, denoted by $t_{D1}$, causes the temperature to further reduce to a low temperature level, denoted by $\theta_L$.

The other part of the switching delay, denoted by $t_{D2}$, results in a shorter effective test time than the testing time, $t_T$. Therefore, the actual time when testing takes place is equal to $t_T - t_{D2}$. Assuming that one test unit (e.g., a thousand test bits) is applied per second, and assuming that the test length is $L$ test units, the total number of testing/cooling periods, $N$, is approximately:

$$N \approx \frac{L}{t_T - t_{D2}}.$$

Therefore,

$$TAT \approx N \left( t_T + t_C + t_{D1} \right) = \frac{L \left( t_T + t_C + t_{D1} \right)}{t_T - t_{D2}} \qquad (2.7.1)$$

Figure 2.7.1 Temperature curve for a simple thermal-aware testing scenario (temperature versus time)

Assume that the module under test is thermally modeled by a single thermal element using equation 2.6.1. The module’s heat capacitance is denoted by $C$ (analogous to the first matrix in equation 2.6.1). The heat resistance between the module and the ambient is equal to $R$ (analogous to the inverse of the second matrix in equation 2.6.1). In this case, equation 2.6.1 can be described for the testing part of the period as:

$$C \frac{d\theta}{dt} + \frac{\theta - \theta_A}{R} = P_T$$

For the cooling part of the period, the thermal equation can be written as:

$$C \frac{d\theta}{dt} + \frac{\theta - \theta_A}{R} = P_C$$

These equations can be used to compute the values of $t_T$ and $t_C$ as:

$$t_T = R C \ln \frac{\theta_A + R P_T - \theta_L}{\theta_A + R P_T - \theta_H} \qquad (2.7.2a)$$

and

$$t_C = R C \ln \frac{\theta_H - \theta_A - R P_C}{\theta_S - \theta_A - R P_C} \qquad (2.7.2b)$$

Using equations 2.7.1–2, TAT values are plotted for a range of sufficient-cooling temperature values

in Figure 2.7.2. It is assumed that ,

, and (this is the rest temperature, ).

The TAT is minimal when .

In the above example, there was only one decision variable, no TAM

congestion, constant testing and cooling power values, and a simple

thermal model. Therefore, the optimization problem was solvable by

plotting TAT versus the sufficient-cooling temperature. The problem is that none of the above assumptions

are realistic.
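The one-variable optimization of this example can be sketched as a simple sweep. The sketch assumes a single thermal node with heat capacitance `c` and heat resistance `r`, constant testing and cooling powers, and a switching delay split into a part `t_d1` during which the module keeps cooling and a part `t_d2` during which it heats without applying useful tests. All names and numerical values are hypothetical and are not the values used for Figure 2.7.2.

```python
import math


def tat(theta_s, theta_h=390.0, theta_a=300.0, r=1.0, c=10.0,
        p_test=150.0, p_cool=30.0, t_d1=0.5, t_d2=0.5, length=300.0):
    """Approximate test application time for a sufficient-cooling temperature theta_s."""
    tau = r * c
    theta_rest = theta_a + r * p_cool   # steady-state temperature while cooling
    theta_hot = theta_a + r * p_test    # steady-state temperature while testing
    # Cooling from the overheating level down to theta_s.
    t_cool = tau * math.log((theta_h - theta_rest) / (theta_s - theta_rest))
    # During the first part of the delay the module keeps cooling.
    theta_low = theta_rest + (theta_s - theta_rest) * math.exp(-t_d1 / tau)
    # Heating from theta_low back up to the overheating level.
    t_test = tau * math.log((theta_hot - theta_low) / (theta_hot - theta_h))
    if t_test <= t_d2:
        return math.inf                 # no useful testing fits into a period
    periods = length / (t_test - t_d2)
    return periods * (t_test + t_cool + t_d1)


# Sweep the decision variable between the rest and the overheating temperature.
candidates = [331.0 + k for k in range(59)]   # 331 K ... 389 K
best_theta_s = min(candidates, key=tat)
```

For these assumed values the minimum lies strictly inside the sweep range. Resuming too early wastes most of each period on the switching delay, while resuming too late wastes time on slow cooling near the rest temperature.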

In reality there are a number of decision variables (e.g., one for each

module). Because of TAM congestion, a module cannot start/resume

testing regardless of the other modules. Testing and cooling powers can be

different for different test stimuli and they, also, depend on the

temperature. A module’s temperature may need to be modeled with several

thermal elements. A thermal element’s temperature depends on the test


stimuli power and the temperature of the adjacent thermal elements. This

situation is much more complex than the above example and it will be

extremely time-consuming to find the exact optimal schedule. Therefore,

a near-optimal solution that can be found in an affordably short time is

preferred. For this purpose, particle swarm optimization, which is a
population-based meta-heuristic, is used in this thesis.

2.7.2 Particle Swarm Optimization

Let us review a more realistic version of the thermal-safe scheduling

discussed in the previous section. For this purpose the IC’s temperature

must be simulated offline during the schedule generation, as shown in

Figure 2.7.3. As soon as the temperature reaches the overheating level

denoted by the test is halted to allow for cooling. For example

at test cycle testing is paused (module is inactive) to allow for cooling.

This is registered in the schedule table as shown in Figure 2.7.3b–c.

Temperature simulation continues and when the temperature reduces to

(sufficient-cooling temperature), the module activity (i.e., testing) may

resume. The actual resumption may be delayed due to testing equipment

and architecture characteristics. Moreover, the delay may be due to TAM

congestion which forces the module to wait for test access. In this example,

testing resumes at test cycle , as registered in the schedule table in Figure

2.7.3b–c. Since the power values are not constant, the heating time between

and is shorter than the heating time between and .
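The constructive procedure described above can be sketched for a single module as follows. This is a minimal sketch, assuming a single RC thermal element integrated with a forward Euler step and ignoring TAM congestion and resumption delays. The function name `simulate_schedule`, its parameters, and all numeric values are illustrative assumptions, not the scheduler of [He08a].

```python
def simulate_schedule(test_powers, t_max, t_resume, t_amb, r_th, c_th,
                      idle_power=0.0, dt=1.0, max_steps=1_000_000):
    """Build a schedule table for one module from a simulated temperature.

    test_powers holds the power of each test cycle. Testing is halted when
    the temperature reaches t_max and resumes once it has dropped to t_resume.
    Returns the state changes as (time step, state) pairs and the
    temperature trace.
    """
    temp = t_amb
    active = True
    next_cycle = 0
    schedule = [(0, "active")]
    temps = []
    step = 0
    while next_cycle < len(test_powers):
        if step >= max_steps:
            # Happens, e.g., when t_resume is below the rest temperature.
            raise RuntimeError("schedule did not complete")
        if active and temp >= t_max:
            active = False
            schedule.append((step, "inactive"))
        elif not active and temp <= t_resume:
            active = True
            schedule.append((step, "active"))
        if active:
            power = test_powers[next_cycle]
            next_cycle += 1
        else:
            power = idle_power
        # Forward Euler step of C dT/dt = P - (T - T_amb) / R.
        temp += dt * (power - (temp - t_amb) / r_th) / c_th
        temps.append(temp)
        step += 1
    return schedule, temps

# Illustrative values only.
schedule, temps = simulate_schedule([20.0] * 100, t_max=360.0, t_resume=330.0,
                                    t_amb=300.0, r_th=10.0, c_th=5.0)
test_application_time_steps = len(temps)
```

The length of the temperature trace is the test application time in time steps. Because the temperature is checked only once per step, it can overshoot `t_max` by at most one step of heating, so a practical implementation would choose the step size or the limit with a margin.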

Figure 2.7.2 Test application time versus sufficient-cooling temperature (plot of TAT against the sufficient-cooling temperature)


This constructive, on-the-fly, and temperature-simulation-based

scheduling continues until all the tests are scheduled. This point marks the

test application time that must be minimized using a meta-heuristic².
There are a number of meta-heuristics that can be used for optimization. A
population-based meta-heuristic is usually used in such situations. A
well-known example of this category of algorithms is the genetic algorithm
[Falkenauer98, Maulik00]. In this thesis we often use a Particle Swarm
Optimization (PSO) technique, which is briefly explained here.

Particle swarm optimization mimics the social behavior of a swarm

searching for food [Poli07]. Each individual member of the swarm is called

a particle. A particle is represented by two attributes, its location and its

velocity. The location in fact is a solution which, usually, is represented by

a coordinate in a Cartesian system. The velocity keeps the particles moving

in the search space.

Each particle remembers its previous best location, and in addition to this

individual memory, the swarm remembers the best location any of its

particles have visited before, the global best. The previous bests and the

global best are then used to give a hint to the random velocities.

² The technique used in this example is from [He08a]. The actual optimization
problems in this thesis are more sophisticated than this example.

Figure 2.7.3 Test scheduling based on temperature simulation
(a) Temperature curve; (b) test cycles registered in the schedule table; (c) module states in the
schedule table. (Curves are only illustrative. Horizontal axis: test cycles, with marked cycles i0 to i5; module states: active, inactive.)

A canonical form of the particle swarm optimization is expressed by the
following equations [Poli07]:

(2.7.3)

(2.7.4)

This canonical form of the particle swarm optimization uses equation 2.7.3

to update the velocity. The coefficients in equation 2.7.3 ( , ,

and ) are given as a part of the chosen canonical form. The

and are two distinct random numbers between 0 and 1 which are

renewed iteratively. The location and velocity on the right hand side of

equation 2.7.3 are the previous values and the left hand side velocity is the

new value. The new location is the sum of the previous location and the

new velocity as expressed in equation 2.7.4. Sometimes an action is needed
to prevent the new location from going outside the valid search space. This
can be done by clamping an out-of-range value to the nearest valid
extreme.
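Written out with illustrative symbols, the update rules described above take the following form. The symbol names are assumptions chosen for this sketch and may differ from the notation of [Poli07] and of this thesis: $x_i$ and $v_i$ are the location and velocity of particle $i$, $p_i$ is its previous best location, $g$ is the global best, $w$, $c_1$, and $c_2$ are the three coefficients, and $r_1$, $r_2$ are the two random numbers drawn anew in every iteration.

```latex
\begin{align}
  v_i^{\text{new}} &= w\, v_i + c_1 r_1 \,(p_i - x_i) + c_2 r_2 \,(g - x_i),
      \qquad r_1, r_2 \sim U(0, 1) \\
  x_i^{\text{new}} &= x_i + v_i^{\text{new}}
\end{align}
```

The first line corresponds to the velocity update of equation 2.7.3 and the second to the location update of equation 2.7.4. The two difference terms pull the particle towards its own previous best and towards the global best.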

For instance, in the above example the decision variable (i.e., the sufficient-
cooling temperature) must be larger than the rest temperature and smaller

than the overheating temperature ( ). Smaller

values will result in an infinite loop in the scheduling algorithm since the

temperature will never become smaller than . Larger values have a

similar effect, since when cooling the temperature only decreases and

cannot increase beyond . In these cases the scheduling

algorithm will wait forever for a temperature that cannot be reached.

A simple form of the particle swarm optimization is presented below:

1. Generate the initial locations (in the valid search space).
2. Generate random initial velocities (in a reasonable range).
3. Evaluate the solutions.
4. Find the best solutions as follows:
   a. Loop for all particles:
      i. If the current location is better than the previous best location, replace it and check if it is better than the global best; if so, replace the global best. (For the first iteration, copy the current solution as previous best, and find the global best among the previous best solutions.)
5. If the termination condition is met, exit with the global best as the final solution.
6. Update the swarm as follows:
   a. Loop for all particles:
      i. Update the velocity according to equation 2.7.3.
      ii. Update the particle’s location according to equation 2.7.4.
      iii. Limit the location to the valid extent of the search space.
7. Go to step 3.
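The steps above can be sketched for a single decision variable as follows. This is a minimal sketch and not the implementation used in this thesis: the function name `pso_minimize`, the fixed iteration budget used as the termination condition, and the coefficient values (commonly used constriction-style values) are assumptions made for illustration.

```python
import math
import random

def pso_minimize(cost, lower, upper, n_particles=20, n_iterations=100,
                 inertia=0.7298, c_personal=1.4962, c_global=1.4962, seed=0):
    """Minimize cost(x) for a scalar x in [lower, upper] with a simple PSO."""
    rng = random.Random(seed)
    span = upper - lower
    # Steps 1 and 2: initial locations and random initial velocities.
    x = [rng.uniform(lower, upper) for _ in range(n_particles)]
    v = [rng.uniform(-0.1 * span, 0.1 * span) for _ in range(n_particles)]
    personal_x = list(x)
    personal_cost = [math.inf] * n_particles
    global_x, global_cost = x[0], math.inf
    for iteration in range(n_iterations):
        # Step 3: evaluate the solutions.
        costs = [cost(xi) for xi in x]
        # Step 4: update the previous bests and the global best.
        for i in range(n_particles):
            if costs[i] < personal_cost[i]:
                personal_x[i], personal_cost[i] = x[i], costs[i]
                if costs[i] < global_cost:
                    global_x, global_cost = x[i], costs[i]
        # Step 5: termination condition (here a fixed iteration budget).
        if iteration == n_iterations - 1:
            break
        # Step 6: update the swarm.
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            v[i] = (inertia * v[i]
                    + c_personal * r1 * (personal_x[i] - x[i])
                    + c_global * r2 * (global_x - x[i]))
            # Move, then limit the location to the valid search space.
            x[i] = min(max(x[i] + v[i], lower), upper)
    return global_x, global_cost

# Toy cost function standing in for a schedule-length evaluation.
x_opt, cost_opt = pso_minimize(lambda t: (t - 3.0) ** 2, 0.0, 10.0)
```

In the scheduling context the cost function would be the test application time obtained from a temperature-simulation-based schedule, and the bounds would be the rest temperature and the overheating temperature.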

In order to see how PSO works, assume that the lower location value

corresponds to more cooling (e.g., as the decision variable in Figure

2.7.3a). Therefore, a negative velocity guides the particle towards more

cooling and if more cooling in this iteration helps to reduce the cost (TAT

for the above example), it is reasonable to keep a negative velocity for the

next iteration as well.

If such a move makes the particle the best in the swarm, that will affect the
velocities of other particles as well. If a particle is at a great distance from the
promising search region, its velocity will generally be larger due to large
difference values in equation 2.7.3. This allows a fast move towards a better
area. The particle slows down when it approaches the promising area due
to small difference values in equation 2.7.3. This enables a detailed search
in the promising areas.

The evaluation of the cost function (e.g., schedule length) for different

particles can be performed in parallel (e.g., using multiple threads). This

might be very helpful, especially if temperature simulations are involved.

In many cases (e.g., scheduling), the cost (e.g., test application time) grows
as the evaluation proceeds. Therefore, as soon as it becomes certain that a
particle (parallel thread) has no chance of improving its local best (and,
therefore, the global best), the thread can be stopped. There
is no need to further evaluate the particle, since it is not going to be used
in equation 2.7.3.

This is important since the CPU time is usually proportional to the schedule

length and, hence, to the test application time. Bad schedules that do not

contribute to equation 2.7.3 usually correspond to a long test application

time and take a long CPU time to complete. Therefore, aborting their

corresponding threads drastically speeds up the search. As soon as a good

particle is found, the bad ones can be stopped.
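The early-abort idea can be sketched as a cost evaluation with a cutoff. This is a minimal sequential sketch, assuming that the cost is accumulated from non-negative increments (so it can only grow) and that the cutoff is the particle's current local best. The function name `evaluate_with_cutoff` is a hypothetical helper; a threaded version would apply the same check inside each worker.

```python
def evaluate_with_cutoff(partial_costs, cutoff):
    """Accumulate a growing cost and abort once it can no longer beat cutoff.

    partial_costs is any iterable of non-negative cost increments, e.g., the
    time added by each scheduling step. Returns the total cost, or None if
    the evaluation was aborted because the cost reached the cutoff.
    """
    total = 0.0
    for increment in partial_costs:
        total += increment
        if total >= cutoff:
            return None  # cannot improve on the local best, stop evaluating
    return total

# A schedule that finishes below the cutoff is fully evaluated.
finished = evaluate_with_cutoff([1.0, 2.0, 3.0], cutoff=10.0)
# A bad schedule is abandoned as soon as its cost reaches the cutoff.
aborted = evaluate_with_cutoff([5.0, 5.0, 5.0], cutoff=8.0)
```

Passing a generator as `partial_costs` means the remaining scheduling steps of an aborted particle are never computed, which is where the CPU time is saved.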


Chapter 3 Related Work

Thermal issues affecting testing procedures have recently been studied
intensively [Tadayon00]. A promising class of solutions is scheduling-based
[He06b, Rosinger06]. Some of these issues, like thermal-safe testing, have
been previously addressed, while others, such as issues related to
process variation, temperature gradients, and temperature cycling, have not
been sufficiently studied. This chapter provides an overview of the related
work.

3.1 SoC Test Scheduling

An optimal solution for the test scheduling problem for core-based systems

is presented in [Chakrabarty00, Chakrabarty02]. Test data and the test

access mechanism are assumed to be given. The decision variables are the

start times for the tests. The optimization objective is to minimize the total

test application time. It is shown that this problem is NP-complete. A

solution based on a mixed-integer linear programming (MILP) formulation

is suggested. It is shown that the MILP solution is too slow for large designs.

Consequently, an efficient heuristic algorithm for dealing with such large

designs is presented in [Chakrabarty00, Chakrabarty02].

A method to address both the scheduling and the design of DfT features,

together, is proposed in [Huang01, Huang02]. The objective is to reduce

the test application time and the constraints include the power budget. The

problem is formulated as a two-dimensional bin-packing problem which is

solved using a best-fit heuristic algorithm [Huang01, Huang02].

A test scheduling approach that supports test preemption is introduced in

[Iyengar01]. The constraints include the power budget and precedence

constraints. Precedence constraints impose that the generated schedules

preserve desirable orderings among tests. Allowing test preemptions
results in shorter schedules [Iyengar01].


A test design approach that optimizes some of the DfT features along with

the schedule generation is proposed in [Zou03]. A simulated annealing

based heuristic is used to solve the test scheduling problem that is

formulated as a two-dimensional bin packing problem. The width of the

core wrapper is one of the decision variables optimized by the simulated

annealing algorithm [Zou03].

An abort on first fail approach with test power constraint is introduced in

[He06a]. Abort on first fail means that the test is terminated as soon as a

defect is detected. Defect probabilities for cores and power constraints are

assumed as given. Test partitioning is performed by the scheduling

algorithm. A heuristic generates the test schedules with partitioning,

aiming at a minimal test application time [He06a].

A test scheduling approach for 3D-SIC is proposed in [SenGupta12]. For

normal 2D ICs, the same test schedule is used both at wafer sort and at

package test. In a 3D-SIC a number of dies are integrated into a single

package. Therefore, the package will have a collection of the tests for

individual dies in addition to tests for the TSV interconnects. A technique

for co-optimization of the wafer sort and the package test is proposed for

3D-SIC. The proposed approach utilizes an on-chip JTAG infrastructure

and efficiently re-uses JTAG lines to perform testing of different cores

[SenGupta12].

3.2 3D Stacked IC Testing

Miniaturization and performance requirements result in the usage of new

technologies, such as 3D-SICs based on TSVs. Their advanced fabrication

processes as well as physical access limitations result in major testing

challenges. The manufacturing steps of TSV-based ICs and their testing

challenges are introduced in [Marinissen09]. The necessary steps for

wafer-level and package-level testing in addition to the required test data,

wafer-level probe access, and DfT features are discussed [Marinissen09].

A technique for clock network synthesis that supports pre-bond testability

for 3D-SICs is proposed in [Kim10]. This prevents bonding of a bad die to

good dies by testing the dies before stacking. The pre-bond clock network

testing requires a complete 2D clock tree on each die. The proposed tree

topology generation algorithm uses a minimal number of TSV-related

buffer resources. Moreover, self-controlled clock transmission gates are

proposed in order to eliminate transmission gate control lines.


Consequently, the number of TSVs and clock-network power consumption

are reduced [Kim10].

A DfT architecture for 3D-SICs allowing pre-bond die testing as well as

mid- and post-bond stack testing is proposed in [Marinissen10a]. This

architecture facilitates modular testing, in which the various dies, their

cores, the inter-die TSV-based interconnects, and the external I/Os can be

tested as separate modules. This helps to achieve an optimal test flow for

various 3D-SIC designs. The proposed architecture is based on the existing

DfT features at the core, die, and product level. A die-level wrapper which

can be based on either IEEE Std 1500 or IEEE Std 1149.1 is proposed

[Marinissen10a].

A DfT architecture based on a modular test paradigm for 3D-SIC is

proposed in [Marinissen10b]. Different dies, their cores, the TSVs, and the

external I/Os can be tested individually. The proposed architecture is based

on existing DfT hardware at the core, die, and product level. Die-level
wrappers compatible with IEEE Std 1500 are supported. The proposed DfT

includes dedicated probe pads on the non-bottom dies to facilitate pre-bond

testing. Moreover, TSVs working as “test escalators” for routing test

control and data signals up and down during mid- and post-bond testing

are supported. Furthermore, a hierarchical “wrapper instruction register”

chain can be included in the design [Marinissen10b]. Some of the test

techniques and DfT features are further discussed in [Plas11].

Challenges in testing of 3D-SIC for manufacturing defects and their

potential solutions are discussed in [Marinissen10c]. These are divided into

the following categories: test flow, test data, and test access. Examples for

interconnect defects, including voids in TSVs and misaligned micro-

bumps are discussed [Marinissen10c].

The die-stacking steps that include thinning, alignment, and bonding may

introduce defects. Therefore, the partial stacks (mid-bond) and the

complete stacks (post-bond) may need testing. A test architecture

optimization technique for 3D-SIC is proposed in [Noia10a] to minimize

the test application time for both mid- and post-bond test stages. It is
demonstrated that the optimal DfT architecture obtained when these test stages
are considered differs from the one obtained when only the final test stage is
considered [Noia10a].


A DfT architecture optimization technique for 3D-SIC is proposed in

[Noia10b]. It is shown that 3D-SICs with large and complex dies placed at
the lower layers require less test time than stacks with complex dies at
higher layers [Noia10b].

Test challenges for 3D-SICs are discussed in [Marinissen12a, Noia11].

The need for standards like IEEE P1838 is discussed. P1838 consists of a

test wrapper hardware and a description language for a test standard. This

includes a generic die wrapper, which will create a standardized interface

for each die in a stack. The wrapper design must enable pre-, mid-, and

post-bond test access. A standard interface for each die in the stack is

suggested. All of this enables partial and complete stack tests, including
die-external tests [Noia11].

Another DfT architecture for 3D-SIC is proposed in [Marinissen12b]. It

supports modular testing, meaning different dies, cores, TSV-based

interconnects, and external I/Os can be tested during the relevant pre-, mid-

or post-bond stages. The proposed architecture makes it possible to

optimize the test flow under various conditions. It also provides yield

monitoring and first-order fault diagnosis [Marinissen12b].

A DfT architectural optimization for 3D-SIC is proposed in [Noia12] to

minimize the test application time for mid- or post-bond testing. Optimal

architecture and the corresponding test schedule are obtained for a scenario

where only the post-bond testing is performed. It is demonstrated that the

optimal architecture and schedule are different for the scenario where a

mid-bond testing is added to the existing post-bond tests [Noia12].

The optimal test flow for 3D-SIC is studied in [Taouil12]. A framework

that embodies different test flows for die to wafer bonding paradigms is

introduced. The cost associated with a range of test flows is assessed for

several die yield and stack size alternatives. It is shown that the inclusion

of pre- and mid-bond testing potentially reduces the overall cost

[Taouil12].

3.3 Temperature-Aware Test Scheduling

Without simulating the temperatures, the thermally safe schedules for

advanced SoCs will be unnecessarily long. This is due to the large safety

margins which are necessary when the temperature values are unknown.

Prior to the actual test (during the schedule generation) our knowledge of


the actual temperatures during testing, without simulating the

temperatures, will be severely limited. Therefore, temperature-aware test

scheduling techniques that use a kind of temperature simulation are

introduced.

A temperature-aware scheduling technique is proposed in [He06b]. The

objective is to minimize the test application time and the constraints

include keeping the temperature under a safe limit. Test partitioning is

supported and the time between two consecutive partitions can be utilized

for cooling. Interleaving allows tests for other cores to be performed while

a hot core is interrupted for cooling. This allows for efficient TAM

utilization and a short schedule is achieved. The problem is formulated as

a combinatorial optimization and a Constraint Logic Programming (CLP)

formulation is used to solve it [He06b]. A faster heuristic-based approach
for this purpose was later proposed in [He07].

The power impact of scan chain testing is studied in [Bild08]. It is shown

that the scan-chain power consumption is considerably higher for at-speed

testing compared to the operational mode. An exact test schedule

optimization for minimization of test application time under temperature

constraints is introduced. This exact approach could be slow for practical

purposes and therefore a fast heuristic-based approach is proposed

[Bild08].

A temperature-simulation driven test scheduling algorithm is proposed in

[He08a]. Instantaneous simulated temperatures are used to guide the

partitioning of the tests and lengths of the cooling intervals. Interleaving of

tests for different cores is supported to achieve a high TAM utilization

[He08a].

A partitioning and interleaving approach is introduced in [He08b]. The
suggested method formulates the number of test partitions and the length
of cooling intervals as an optimization problem and then uses constraint
logic programming to solve it. The temperature is simulated using HotSpot
[Huang06] and is constrained to avoid overheating. Since constraint logic
programming is too slow to handle long tests, a heuristic is proposed for
the test scheduling [He08b].

A partition-based temperature-aware test scheduling algorithm is proposed

in [Yao09, Yao11a]. Tests are partitioned and the proper start time for each

partition is defined as a decision variable. The optimization goal is to


achieve a short test application time under the temperature constraints. A

superposition-based temperature simulation scheme is proposed. The

actual temperature simulation is performed for each partition only once at

the very beginning using HotSpot [Huang06]. Later on these simulated

temperatures are combined based on the superposition principle in order to

obtain the temperature for different situations [Yao09, Yao11a].

A temperature-aware combined TAM design and test scheduling technique

is proposed in [Yu09]. The proposed approach supports cycle-accurate

temperature simulation as well as test partitioning and interleaving.

Maximal TAM size and maximal safe temperature are given as constraints.

A heuristic-based approach is used to minimize the test application time.

In addition to the temperature simulations, the heuristic is guided by the

power density and test application time of individual partitions [Yu09].

A temperature-aware test scheduling technique supporting the abort on

first fail testing approach is proposed in [He09]. In such an approach

testing is terminated as soon as a defect is detected. Therefore, the defect

probabilities must be known before the scheduling. The proposed test

scheduling technique supports partitioning and interleaving of tests. The

proposed algorithm uses the simulated temperatures to guide the

partitioning of tests and to determine the duration of the cooling intervals.

The objective is to minimize the expected test application time while the

temperatures of the cores are kept below the thermal safety limit [He09].

A temperature-driven test access routing and test scheduling for three-

dimensional SoC is introduced in [Chandran09]. Three dimensional design

of DfT features combined with partition-based test scheduling is studied.

The proposed temperature-aware technique minimizes the test application

time while constraints on the available hardware resources are taken into

account [Chandran09].

In [Vinay10], it is shown that 3D-SICs may rapidly become too hot since

the thermal resistance between dies located at the middle of the stack and

the heat sink is large. A temperature-aware test scheduling technique is
then proposed [Vinay10]. The proposed approach focuses on vertical

temperature distribution in the 3D IC to avoid overheating. Moreover, a

new test partitioning scheme is proposed based on the power variations.

The proposed techniques consist of heuristics aiming at minimizing the test

application time. The proposed thermal model is a linear RC-model that

focuses on vertical temperature effects. It is demonstrated that the proposed


technique can achieve a uniform vertical temperature distribution

[Vinay10].

A test partitioning method for temperature-aware testing of 3D-SIC is

proposed in [Millican14]. The objective is to generate a test schedule with

a minimal test application time under the thermal-safety constraints. The

partitions are determined based on a partitioning temperature so that the

temperature within a partition does not vary too much [Millican14].

A test scheduling technique for 3D-SICs based on a session-less approach

is proposed in [Flottes15]. Testing start times are formulated as the

decision variables and test application time is minimized. A set of

constraints including TAM availability, power budget, and thermal limits

must be respected. A greedy heuristic is proposed and experimentally

evaluated in [Flottes15]. The session-less approach generates shorter
schedules than the session-based ones. Session-based
approaches make it affordable to find the optimal schedule, whereas session-less
approaches usually cannot find the exact optimum. The proposed heuristic
can find a near-optimal solution for the large problem sizes that result from
session-less approaches [Flottes15].

3.4 Process Variation Effects on Power and Temperature

Process variation causes uncertainty in circuit parameters including the

electric currents and therefore the dissipated power. Variations in the

dissipated power values will result in temperature variations. This means

that the temperature for two different fabricated instances of the same

entity (e.g., an identical core design) will be different.

Consider a homogeneous multi-core SoC design. Assume that all the cores

are executing exactly the same tasks with identical memory and resource

access patterns (also identical state and input data). Assume that all the

cores started from the ambient temperature (i.e., identical initial

conditions) and the voltages are precisely equal. Assume that there is no
heat transfer among the cores and that the cores’ cooling capabilities are
designed to be identical. The difference in their working temperatures is
due to the so-called intra-die variations¹.

¹ Intra-die and inter-die variations are formally defined based on the concept of
temperature error that will be introduced in chapter 4.


Now consider two single-core SoCs with the same design. Assume that

they are executing exactly the same tasks with identical memory and

resource access patterns (also identical state and input data). Assume that

both ICs started from the ambient temperature (i.e., identical initial

conditions) and the voltages are precisely equal. The difference in their

working temperatures is due to the so-called inter-die variations.

The impact of process variation on leakage power for a 0.18μm

Complementary Metal Oxide Semiconductor (CMOS) technology is

studied in [Srivastava02]. It is shown that the process variation can

drastically affect the leakage current. Based on Monte Carlo simulations

an analytical model for estimating the average and the standard deviation

of the leakage current is developed. It is then demonstrated that the average

leakage obtained by taking the PV into account is significantly different

from the leakage predicted by the deterministic models [Srivastava02].
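One way to see why the average leakage under process variation differs from the deterministic prediction is a simplified textbook model, which is an assumption made here for illustration and is not the analytical model of [Srivastava02]. Suppose the subthreshold leakage depends exponentially on the threshold voltage, $I = I_0 \exp(-V_{th}/(n v_T))$, and that $V_{th}$ is normally distributed with mean $\mu$ and standard deviation $\sigma$. The leakage is then lognormally distributed and its mean is:

```latex
\begin{equation}
  \mathrm{E}[I]
  = I_0 \exp\!\left(-\frac{\mu}{n v_T}\right)
        \exp\!\left(\frac{\sigma^2}{2\, n^2 v_T^2}\right)
  \;>\; I_0 \exp\!\left(-\frac{\mu}{n v_T}\right)
  \qquad \text{for } \sigma > 0 .
\end{equation}
```

The right-hand side of the inequality is what a deterministic analysis at the nominal threshold voltage would predict. In this simplified model the average leakage therefore always exceeds the deterministic value, and the gap grows with the spread $\sigma$.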

Process variation is a major challenge for designing with technology nodes

smaller than 90nm [Borkar03]. Large variations in voltage, current, power,

temperature, and delay are expected. PV causes serious difficulties in

designing advanced electronics and to address these difficulties a shift in

the design paradigm, from existing deterministic approaches to adaptive or

stochastic approaches (either probabilistic or statistical) is necessary

[Borkar03].

A method for estimating the leakage current variations due to PV is

proposed in [Rao03]. The problem is analyzed for both inter- and intra-die

variations and a closed form Probability Density Function (PDF) for

calculating the leakage current is developed. Distributions of individual
gates’ leakage currents are then combined to calculate the average and

variance for a whole design. The closed form results are then validated

against a set of Monte Carlo simulations [Rao03].

A stochastic approach for leakage power minimization based on dual Vth²
technologies considering PV is proposed in [Liu04]. A statistical model of
PV is used in this stochastic optimization. Probabilistic analytical models
are then developed to predict the impact of PV on the leakage power and
delays. This model indicates that the existing non-probabilistic analysis
significantly (around ) underestimates the leakage power. Based on the
proposed model the value of the second Vth is optimized considering the
PV [Liu04].

² Vth is the threshold voltage in CMOS-based technologies. One Vth value is
sufficient to fabricate working ICs. However some manufacturers offer the
possibility of using two different Vth values in a single die in order to achieve
better performance/power trade-offs.

A different stochastic Dual-Vth optimization technique considering PV is

proposed in [Srivastava04]. It is shown that the deterministic methods are

not appropriate in the presence of large process variations. The proposed

stochastic approach can improve the leakage by 15–35% compared with

the traditional deterministic approaches [Srivastava04].

A method for analyzing the leakage power under intra-die process

variations is proposed in [Chang05]. A lognormal distribution is used to

approximate the leakage current of individual gates and the overall leakage

of a die is determined by combining these lognormal distributions. Both

subthreshold leakage and gate tunneling leakage are considered

[Chang05].

The leakage power is very sensitive to process variations and therefore PV

results in large temperature variations [Choi07]. The temperature
variations in FinFET circuits are affected by both the channel length
variations and the body thickness variations. The temperature
variations caused by PV are assessed using Monte Carlo simulations

combined with temperature simulations. The simulations show that circuits

with large switching activity suffer from larger temperature variations.

This is due to larger static power as a result of the higher temperature

caused by large switching activity. It is shown that under a moderate

process variation (e.g., for channel length and body thickness)

thermal runaway can occur in more than 15% of chips in a 28nm FinFET

technology [Choi07].

Process variation is caused by various reasons including [Nowka08]:

· imprecisions in alignment, rotation and magnification (lithography);

· interference effects from neighboring shapes (lithography);

· fluctuations in the photon absorption positions (lithography);

· fluctuations in the dosage of chemicals used for etching and treatment;

· random dopant fluctuation;

· gate oxide thickness fluctuation;

· Chemical Mechanical Polishing (CMP) unevenness.


Gate oxide thickness variation results in variation in the threshold voltage

and consequently in variations in the static power dissipation [Nowka08].

Some of the stochastic approaches for dealing with the PV (e.g., statistical

static timing analysis) cannot handle the dynamic changes during operation

[Ganapathy10]. A new method based on multivariate regression is

proposed to model the temporal delay variations under PV. Such variations

are related to temperature variations [Ganapathy10].

On-chip temperature sensors are used to achieve a temperature-aware test

scheduling and reduce the test application time compared to a static

schedule [Yao11c]. Due to large PV the estimated test power values can

be very different from the actual ones. Consequently, the estimated

temperatures during the offline scheduling phase (prior to the actual test)

can be inaccurate. A test architecture that supports dynamic test scheduling

is assumed. A heuristic is suggested to generate the static schedule that the

method is based on. Then a dynamic test scheduling method using on-chip

temperature sensors is proposed [Yao11c].

Dynamic reliability management techniques dynamically tune a system’s

operation based on the tradeoff between performance and reliability. The

proposed method in [Zhuo10] takes the spatial and temporal variations

(including PV) into account. Moreover, the proposed technique is

workload-aware meaning that it reacts to sudden workload variations

[Zhuo10].

A flexible probabilistic framework for evaluation of the transient power

and temperature variations under large PV is introduced in [Ukhov14a].

This models the probability functions of the fluctuating parameters. The

proposed technique captures the power and temperature variations in a

closed-form analytical model [Ukhov14a].

A system-level framework for the analysis of temperature-related failures

affected by PV is proposed in [Ukhov14b]. This includes a probabilistic

technique for dynamic steady-state temperature modeling and a closed-

form stochastic modeling of the system. Temperature cycling induced

aging is analyzed in presence of large PV. The proposed technique

minimizes the expected energy consumption under performance,

temperature, and reliability constraints [Ukhov14b].


3.5 Multi-Temperature Testing

A detailed study of defects found in a commercial microprocessor is

performed in [Needham98]. For this high-production-volume microprocessor,
the manufacturing tests are designed so that very small test-escape
statistics are achieved. Some of the escaped defective devices are
rigorously analyzed to find out the defect’s type, its electrical effect, and
the possible methods to detect such defects easily. Lessons learned from
these defects, in combination with the technology trends, enable the authors

to determine what should be done to achieve and maintain high-quality

manufacturing and test. This includes defects that can be detected by multi-

temperature testing but are otherwise hard to detect [Needham98].

The conclusions from a failure analysis study at SEMATECH3 are reported

in [Nigh98]. The testing procedures, IC stressing to achieve high

reliability, characterization of the defects, fault diagnosis, and physical

analysis are presented for a number of devices. Testing at different

temperatures is discussed in [Nigh98].

Delay-defect test-escapes are examined in [Tseng00]. Among these

defects, those caused by high-resistance

interconnects are very challenging to detect. A cold testing technique that performs

the test at low temperature can help. Cold testing is particularly effective

for detecting silicide open defects [Tseng00].

The behavior of resistive open defects is studied in [Li01].

Temperature-dependent defects that motivate multi-temperature testing are discussed.

The effects of temperature on testing are investigated and an effective

testing method for resistive opens is presented [Li01]. It is suggested that

by knowing the location of such defects and the materials involved in those

defects, the proper testing temperatures can be found and the appropriate

test patterns can be generated [Li01]. Such testing temperatures and test

patterns are used to perform multi-temperature testing.

Parametric failures are more frequent in advanced electronics, where the

feature size is very small [Segura02]. These hard-to-detect failures are

experimentally studied and classified. Multi-parameter test strategies are

3 SEMiconductor MAnufacturing TECHnology (SEMATECH) is a research

consortium for IC manufacturing. http://public.sematech.org/ (May 2015)

suggested to address this complex test problem. These issues are also

discussed in [Segura04].

Due to very small copper interconnect dimensions in advanced electronics,

physical failure analysis is needed to address the potential defects. Failure

localization and defect analysis are challenges for copper inlaid

technologies [Zschech02]. Failure localization and analysis using

FIB/SEM4 and TEM5 are described. The voids in copper interconnects and

buried residuals in vias are studied in [Zschech02]. These defects are

temperature sensitive and therefore necessitate multi-temperature testing.

Multi-temperature testing is analyzed based on experimental data from

0.25μm and 0.18μm technologies, in [Long04]. Then, based on these data,

a model is developed. This model is used to design new temperature-based

tests to improve test quality. Temperature-based test data are

presented for a range of measurements including transistor characteristics

needed to parameterize the model [Long04].

Some imperfections in the chip (e.g., some resistive opens or shorts) will

not hinder the normal operation of the chip just after the fabrication, at the

time that the manufacturing test is performed. But these imperfections are

reliability threats because they are weak points in the circuit that wear out

quickly and will lead to failures during the expected lifetime of the chip

[Long04, Needham98]. Some of these imperfections can be identified by

multi-temperature testing.

Performance outliers and defects are examined across the expected

operating temperature range [Schuermyer04]. Minimum testing

requirements to detect temperature dependent outliers at wafer sort and

final test are investigated. This is based on data from a 0.18μm technology

obtained at 30°C and 85°C. It is argued that temperature-sensitive defects

are expected to become more frequent in advanced technologies and,

therefore, it is important to develop effective test methodologies for them

[Schuermyer04].

4 Focused Ion Beam (FIB) is a visualization technique used for site-specific

analysis of materials. It is similar to a Scanning Electron Microscope (SEM).

5 Transmission Electron Microscopy is a visualization technique based on electron

beams transmitted through the object.

Resistive defects are important in advanced electronics but they need

special conditions in order to be detected. The detection approaches for

resistive bridging (short) defects are studied in [Engelke08, Kundu05].

Testing at low temperatures may help. Resistive bridge defects are studied

under multiple environmental conditions. Moreover, imperfections that are

not defects at nominal conditions but could deteriorate and become early-

life failures are studied. It is suggested that there exist appropriate

combinations of these tests that provide satisfactory test coverage for

different types of defects [Engelke08, Kundu05].

The performance of advanced electronics that are made by deep submicron

technologies can be affected by phenomena that were previously

considered not to be important [Wu10]. One such phenomenon is

Inversed Temperature Dependence (ITD). ITD means that the delay of

electronics may decrease as the temperature rises, contrary to the traditional

understanding that the delay increases with temperature.

The reason for this phenomenon is that the lower threshold voltage (implying

faster operation) at high temperatures dominates the lower carrier

mobility (implying slower operation) at high temperatures. Traditionally,

delays are checked at two temperature corners, one representing the best

case (which used to occur at low temperatures) and the other representing the

worst case (which used to occur at high temperatures). For advanced electronics

that experience ITD, the high temperature may not correspond to the

worst-case delays [Wu10].

Advanced electronics require new types of testing, like temperature-

testing, in order to maintain high product quality. The effect of test

temperature on the quality of the tests is studied in [Jagan10]. A low-cost

alternative to temperature testing is proposed. Moreover, the proposed

technique determines the appropriate test conditions for the best test

quality and lowest cost. The proposed test flow is experimentally evaluated

on an industrial-standard die. A defect’s behavior at low temperature is

studied using Shmoo plots6 [Jagan10].

The need for testing advanced core-based SoCs at different temperatures

is discussed in [He10]. Then a multi-temperature test scheduling for SoCs

is introduced which assumes that tests should be applied inside predefined

6 Shmoo plot is a graphical representation of a device’s response to a range of

conditions and inputs (e.g., temperature and voltage).

temperature ranges. The proposed scheduling approach minimizes the test

application time and ensures that tests are only applied within the valid

temperature ranges [He10]. For this purpose the temperatures of the cores

are simulated. Based on the simulated temperatures, heating or cooling

intervals are introduced into the schedule [He10]. The proposed method is

based on partitioning and interleaving; therefore, while a core is in

its cooling interval, other cores may utilize the test access mechanism

capacity that has just been made available [He10].

Another multi-temperature test scheduling scheme for SoCs is introduced

in [Yao11b]. It assumes that tests should be applied within their specified

temperature ranges (which can differ from test to test). Cooling intervals

are inserted if the core temperature is too high and heating stimuli are

applied when the temperature must be increased in order to meet the

required temperature conditions for correct testing [Yao11b]. The

proposed scheduling approach in [Yao11b] is based on list scheduling and

assumes that tests always run to completion without interruption. The

initial list order is determined based on the lowest valid temperatures for

the tests. The list schedule determines the earliest start times for tests. The

test application time is minimized and it is ensured that tests are applied

within correct temperature ranges [Yao11b].

3.6 Temperature Gradients and Burn-In

The presence of voids in Cu structures results in important reliability issues

for advanced electronics. The mechanical stress in the interface between

the Cu and capping layers7 is experimentally investigated in [Murray12].

In technologies that deposit the cap at lower temperatures, the Cu does not

show considerable depth-dependent stress. Even though an annealing

technique can decrease the stress gradient, the gradient reappears when the

temperature returns to room temperature after being close to the deposition

temperature [Murray12].

A mechanism that causes defect formation in metallization (e.g.,

interconnects) under fast temperature cycle stress is studied in

[Smorodin08]. The lateral temperature distribution (i.e. temperature

gradient) causes an accumulating plastic deformation of the metal layer.

7 Capping layer is the electrical insulation used to insulate different interconnects

and wire lines in a die.

Large deformations occur in sites which experience large temperature

gradients [Smorodin08].

Burn-in is used to accelerate various aging and failure mechanisms so that

the imperfections that may cause infant mortality are detected before the

product is shipped [Miller01]. Burn-in acceleration is achieved by

imposing high temperatures, high voltages, high toggle rate, and/or high

current density on the circuit under test. One traditional test flow is

an ATE test, followed by burn-in, followed by a second ATE test. It is

suggested that this first ATE testing and the burn-in can be combined into

a hybrid burn-in, improving the overall test process [Miller01].

Reliability predictions are based on a number of sources of information

including in-service field return data and physics of failure [Bayle10].

Previously, predictions were mainly based on empirical data but recently

physics of failure is being incorporated into the lifetime models. The

existing reliability models usually are based on steady-state temperature.

A new methodology that combines several recent works that address new

mechanisms of failure (e.g., hot carrier and delamination) is proposed for

aeronautic applications [Bayle10].

3.7 Testing for Delay-Related Defects

Gradients and early life failures are discussed above. Gradients have some

other negative consequences. One of them is discussed in the following.

Different temperatures at different sites mean that signal delays (which

depend on the sites that a signal’s route passes through) will differ from

path to path. This can cause delay-related faults that must be detected using

at-speed and delay tests, as discussed below.

Some floating-point data-paths are developed for graphics and simulation

applications in [Hagihara97] using a 0.35-micron technology. They are

designed to be embedded in a vector pipelined processor for use in

supercomputers. An online test technique is introduced to improve the

reliability under actual operating conditions, which include temperature

gradients. The technique makes it possible to detect delay faults as well as

the static faults (i.e., normal defects) [Hagihara97].

For advanced SoCs containing millions of gates and operating at

frequencies in the gigahertz range, at-speed testing is crucial [Ahmed05]. The

launch-off-shift method has some advantages over the launch-off-capture

technique but requires perfect transition fault testing with regard to at-

speed scan enable signal. A scan-based at-speed test is introduced in

[Ahmed05] that is based on multiple local fast scan enable signals. The

scan enable control information is sent as test data through the scan.

Moreover, an innovative scan cell is introduced to generate the fast local

scan enable signal [Ahmed05].

Keeping the power consumption in check during at-speed testing is

investigated in [Ko08]. A common practice is to divide the scan chain to

control shift power by activating mutually exclusive flip-flops at different

times during the scan cycle. However, the existing automatic test pattern

generation techniques do not provide means to control the capture power.

Therefore, a new scan chain division algorithm is introduced in [Ko08]. It

takes into account the signal dependencies and partitions the circuit such

that both shift and capture power can be reduced. Moreover, a technique

for utilizing partial scan combined with the scan chain divisions is

proposed [Ko08].

Test power constraints are usually due to the power delivery limitations.

These limitations could be due to the limited capability of the power

network in the device or of the test equipment [Zhao10].

Excessive switching activity during the launch-to-capture cycle in delay testing

causes many problems. These include overkill of dies and damage to the

ATE probes [Zhao10]. A fast technique for finding the high-power

patterns and replacing them with power-safe ones is introduced. A pattern is

classified as high-power relative to the ATE’s power limit. The proposed

technique takes the spatial and electrical properties of the power

distribution network into account [Zhao10].

At-speed scan-based testing may be affected by launch safety issues

[Wen11]. This means that the test results are incorrect because of excessive

launch switching activity which is related to the test stimulus launch in the

at-speed test cycle. A power-aware test generation flow is proposed in

[Wen11] to guarantee a safe launch. The proposed rescue and mask scheme

targets the excessive switching activity around the long path that the test

vector targets. The rescue phase reduces the power. If the new power value

is still too large the test responses are masked. The proposed approach

guarantees launch safety with a negligible impact on test quality and costs

[Wen11].

Launch-off-capture and launch-off-shift are the two major at-speed scan-based

delay testing techniques [Bosio11]. Usually, launch-off-shift offers

higher fault coverage and faster testing than the launch-off-capture technique.

However, it suffers from higher peak power consumption in the

launch-to-capture cycle. A don’t-care filling technique is proposed to adjust

the peak power consumption relative to the power consumption in functional

mode. The objective is to generate a test set with peak power values similar

to the functional power [Bosio11].

3D-SICs are built using micro-bumps that connect pairs of dies in the

stack together [Shibin15]. Moreover, TSVs provide electrical

connections between the front- and back-side of a die. It is reported that

imec8 and Cadence9 have developed a 3D-DfT architecture based on DfT

die wrappers [Shibin15]. The TSVs and micro-bumps can be tested for

static defects (e.g., hard opens and shorts) by existing techniques. Such

interconnects might also be affected by resistive opens and shorts, which

usually manifest themselves as delay faults. The reported 3D-DfT has

recently been enhanced to support at-speed transition-based delay-fault testing.

The reported framework operates at mission-mode speed and employs the

already existing clock distribution network [Shibin15].

A delay fault simulator for combinational circuits is developed in

[Manikandan11]. It helps to develop the delay tests faster. The experiments

consider K-longest path sets of ISCAS'85 benchmarks. A large number of

single input test patterns are repeated a number of times to achieve

statistically valid data. The proposed technique is reported to provide good

fault coverage and 20% speed-up [Manikandan11].

A transient fault injection technique for simulation-based fault-injection in

advanced SoCs is proposed in [Rohani13]. The proposed technique can

inject a wide range of faults without modifying the top-level design.

Moreover, the proposed technique is fast. Two experimental case studies

show that the proposed technique reduces the CPU time by 10% compared

with other similar techniques [Rohani13].

8 Interuniversity MicroElectronics Centre (IMEC) is an electronics research

center. http://www2.imec.be/be_en/home.html (May 2015)

9 Cadence Design Systems Inc is an electronic design automation company.

http://www.cadence.com/cadence/Pages/default.aspx (May 2015)

3.8 Temperature Cycling

It has long been known that varying mechanical stress in metals will

result in metal fatigue and consequently lead to metal structure failure. The

varying stress has various causes, including mechanical load variations and

temperature fluctuations (i.e., cycling). Accurate estimates for the effect of

fluctuations help to know the lifetime of a part. This enables a better

(simulation-based) design of the parts. Besides, it enables the timely

replacement of the parts which translates into a safe and cost-efficient

maintenance of the structure (e.g., a ship or a plane). A well-known

approach for estimating this aging effect is the Rainflow counting

algorithm proposed in [Matsuishi68].

Cycle counting methods (e.g., Rainflow algorithm) identify equivalent full

and half cycles within the irregular load profile [Musallam12]. Then the

cycle-based lifetime models can be used. The original Rainflow algorithm

is applied offline, meaning that the whole temperature or load profile over

the desired operational time period must exist before it can start. Therefore,

it cannot be used for applications that need it in real time. An online

counting algorithm which uses a stack-based implementation is proposed

in [Musallam12] and used in this thesis.
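
The cycle identification step described above can be illustrated with a minimal Python sketch of the widely used three-point Rainflow counting rule (the form standardized in ASTM E1049). This is an assumption made for illustration: it is not the online algorithm of [Musallam12], and the function names turning_points and rainflow are hypothetical. The sketch consumes the profile point by point with a stack, but it reports the leftover residue as half cycles only at the end of the profile.

```python
def turning_points(series):
    """Keep only the local extrema (peaks and valleys) of a profile."""
    pts = []
    for x in series:
        if pts and x == pts[-1]:
            continue  # skip repeated samples
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (x - pts[-1]) > 0:
            pts[-1] = x  # same direction: extend the current excursion
        else:
            pts.append(x)
    return pts


def rainflow(series):
    """Return (range, count) pairs; count is 1.0 (full) or 0.5 (half)."""
    cycles = []
    stack = []
    for point in turning_points(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break  # previous range is not closed yet
            if len(stack) == 3:
                # The previous range contains the starting point: half cycle.
                cycles.append((y, 0.5))
                stack.pop(0)
            else:
                # The previous range is a closed loop: full cycle.
                cycles.append((y, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # Whatever remains on the stack is counted as half cycles.
    for a, b in zip(stack, stack[1:]):
        cycles.append((abs(b - a), 0.5))
    return cycles
```

For the profile [-2, 1, -3, 5, -1, 3, -4, 4, -2], the sketch reports one full cycle of range 4 and half cycles of ranges 3, 4, 8, 9, 8, and 6. The resulting ranges and counts are the inputs that a cycle-based lifetime model would consume.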

The time-dependent average temperature effect is combined with the results

from the Rainflow algorithm in a single lifetime model in [GopiReddy14].

A month-long load profile is used as a test profile to estimate temperatures

in a power system for reactive compensation of load [GopiReddy14].

The Insulated Gate Bipolar Transistor (IGBT) is a power-electronic device with

a relatively wide range of applications, including automotive traction

[Held97]. Such applications require high reliability, in particular under

power cycling. Power cycling causes temperature changes which lead to

mechanical stress. This can lead to defects such as lifting of bond wires. A

fast cycling test that activates the failure mechanism is suggested to enable

reproduction of millions of cycles in a short time. The effectiveness of the

proposed approach is verified by a mechanical analysis. A model is

developed to relate the number of cycles-to-failure to the magnitude of

temperature changes [Held97].

Another mechanism affecting the lifetime of electronic devices can be

modeled by the Arrhenius equation. An important parameter is the

activation energy10 that relates to the working temperature [Groebel01].

Accelerated-test data are experimentally obtained and then used to

accurately estimate the activation energy. A software package dedicated to

this experimental approach is used to speed up the process. Accelerated-

life test data are acquired for a thermally stressed hard-drive system and

analyzed using the Arrhenius-Weibull model. The Arrhenius model

parameters are estimated using a maximum likelihood algorithm. Then, the

activation energy is estimated [Groebel01].
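
To make the role of the activation energy concrete, the following minimal Python sketch computes the standard Arrhenius acceleration factor between a use temperature and a stress temperature. It is only an illustration under stated assumptions: the function name is hypothetical, and the sketch does not reproduce the Arrhenius-Weibull fitting or the maximum likelihood estimation of [Groebel01].

```python
import math

# Boltzmann constant in eV/K.
BOLTZMANN_EV_PER_K = 8.617333262e-5


def arrhenius_acceleration(ea_ev, t_use_c, t_stress_c):
    """Acceleration factor exp(Ea/k * (1/T_use - 1/T_stress)).

    ea_ev      -- activation energy in eV (an assumed input here)
    t_use_c    -- use temperature in degrees Celsius
    t_stress_c -- stress (accelerated test) temperature in degrees Celsius
    """
    t_use_k = t_use_c + 273.15
    t_stress_k = t_stress_c + 273.15
    exponent = ea_ev / BOLTZMANN_EV_PER_K * (1.0 / t_use_k - 1.0 / t_stress_k)
    return math.exp(exponent)
```

As a usage example with purely hypothetical numbers, arrhenius_acceleration(0.7, 55.0, 125.0) evaluates to a factor between 70 and 85, meaning that one hour at the stress temperature would correspond to several tens of hours at the use temperature under this model. A larger activation energy makes the factor more sensitive to the temperature difference, which is why an accurate estimate of this parameter matters.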

A few procedures for extracting the statistical parameters of temperature

cycling experienced by power devices for different mission profiles (e.g.,

how an electric vehicle is driven) are investigated [Ciappa03a, Ciappa03b].

These statistical models help to design efficient accelerated tests and to

fine-tune the lifetime models. A precise lifetime model that takes into

account the creep11 experienced by compliant materials under thermal

cycles is developed in [Ciappa03a, Ciappa03b].

Electronics reliability is affected by the average working temperature as

well as the temperature cycling effect [Hirschmann06, Hirschmann07].

Temperature simulation is used to estimate the dynamic temperature

values. Temperature cycling plays an important role in lifetime prediction

models. A technique for detecting all relevant temperature cycles is

developed in [Hirschmann06, Hirschmann07].

A lifetime model for solder joints under cyclic thermal-mechanical loading

is developed in [Lu07]. The model combines a linear damage accumulation

10 Activation energy is a term primarily used in chemistry to approximately

describe the minimum energy required to start (activate) a reaction. The reaction

is modeled by the Arrhenius equation that has the activation energy as a main

parameter. In other situations that are not exactly a chemical reaction, but the

Arrhenius equation is used for pure modelling purposes, the term “activation

energy” is nevertheless used for the main parameter in the model disregarding

its original namesake.

11 Creep or cold flow is when a solid material moves slowly or deforms

(permanently) under mechanical stresses. Exposure to stress during a relatively

long period of time can do this. Heat exacerbates creep. The amount of stress that

can cause this is less than the value needed to literally bend the material

instantaneously.

with the effect of accumulated plastic strain12. The model is then used to

predict the lifetime for a power module that operates under mixed cyclic

loading conditions (e.g., a train’s traction system) [Lu07].

A solder fatigue model based on a modified Coffin-Manson approach is

proposed in [Vasudevan08]. The proposed model is evaluated using

temperature cycling experiments. The experimental data for various types

of packages and sockets have been used. The proposed model’s error is

reported to be less than 6% [Vasudevan08].
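
For orientation, the basic (unmodified) Coffin-Manson relation can be sketched in Python as shown below. This is an assumption made for illustration: the modifications introduced in [Vasudevan08] are not reproduced, and the function names, the coefficient, and the exponent are hypothetical placeholders that would have to be fitted to temperature cycling data.

```python
def cycles_to_failure(delta_t, coeff, exponent):
    """Basic Coffin-Manson form: N_f = coeff * delta_t ** (-exponent).

    delta_t  -- magnitude of the temperature swing (must be positive)
    coeff    -- fitted, material-dependent coefficient (assumed input)
    exponent -- fitted fatigue exponent (assumed input)
    """
    if delta_t <= 0:
        raise ValueError("temperature swing must be positive")
    return coeff * delta_t ** (-exponent)


def cycling_acceleration(delta_t_test, delta_t_field, exponent):
    """Ratio of field life to test life in cycles; the coefficient cancels."""
    return (delta_t_test / delta_t_field) ** exponent
```

With a hypothetical exponent of 2, doubling the temperature swing in an accelerated test (for example from 50 to 100 degrees) shortens the number of cycles to failure by a factor of 4.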

Reliability issues of through-silicon vias (TSVs) are investigated in [Kamto09]. The

experiments performed using a technology based on deep reactive ion

etching show that TSVs with tapered sidewalls can be formed. The TSVs

experience temperature cycling. Considerable increase in the electrical

resistance of the paths going through TSVs is observed after temperature

cycling. Perfect TSVs only show small increases in resistance for 200

cycles. Moreover, small changes in resistance are observed when TSVs

experience high temperatures for extended periods of time [Kamto09].

The acceleration factor for solder depends on the magnitude of temperature

changes, dwell times, ramp rates, actual values of temperature extremes,

and the type of package [Syed10]. A lifetime model that relates the actual

real-life lifetime with accelerated lifetime based on a number of factors

including temperature cycling is proposed [Syed10].

The relation between the initial electrical resistance of TSVs and failures

due to temperature cycling as well as electromigration is studied in

[Frank10]. Physical analysis shows that a carbon impurity layer at one end

of the problematic TSVs is developed. This impurity results in failure

under temperature cycling while it has no correlation with defects caused

by electromigration [Frank10].

The thermal stress distribution for a TSV array is studied in [Kuo11,

Kuo12]. In TSV-based structures, there are large coefficient of thermal

expansion (CTE) mismatches between the silicon substrate, the dielectric

material, and the fill metal. Therefore, the thermal stress at the interface of

materials is large and results in material failure or delamination. The

12 Strain within a material is either elastic or plastic. While elastic strain can

only cause a reversible distortion, plastic strain can result in non-reversible

deformation including cracking of the material.

thermal-mechanical stress distribution of a TSV array model under the

accelerated temperature cycling is investigated by a finite element

approach. The surface area between TSVs is squeezed at high temperature

and this results in compressive stress at the surface area. The analysis

shows that large stress occurs around pads. This may result in failure or

delamination of TSV pads. The simulations indicate that for larger pads

that result in smaller space between TSVs the stress is larger. Smaller pads

experience higher stress close to the pad corners but the stress is smaller at

the middle of bottom pad. The proposed analysis technique helps to

identify possible failure regions in the TSV structure [Kuo11, Kuo12].

Large shear stress13 develops at the interfaces between different materials

during temperature cycling, especially if the difference between their CTEs

is large [Kumar12]. The shear stress may cause interfaces to slide by a

diffusional process. This results in relative dimensional changes in the

materials. This is a reliability risk for TSV based structures which not only

suffer from temperature cycling but also convey large current densities.

Experimental results demonstrate interfacial sliding caused by temperature

cycling in presence of electric current. The presence of current moved the

affected area in the direction of electron flow. This leads to exacerbated

protrusion (or intrusion) of TSV relative to the temperature cycling only

situation (when the electric current is negligible) [Kumar12].

The effect of temperature cycling as well as some other thermal

phenomena on the performance of TSV based electronics is studied in

[Cherman12]. The transistor performance is affected by the stress induced

by the TSV. It is reported that high working temperature increases the

TSV-induced stress while temperature cycling decreases this stress. These

stress variations may be due to the TSV creep [Cherman12].

A study for understanding the effect of temperature cycling on the signal

integrity for TSV based electronics is conducted in [Okoro12]. Radio

frequency signals are used to detect discontinuities in the isolation liner

around the TSV metal body. Signal degradation increases with temperature

cycling. Atomic Force Microscopy (AFM) showed that void formation and

growth in the isolation liner is the root cause [Okoro12].

13 Shear stress is the stress force parallel to the surface of a material. It is different

from the normal stress which acts perpendicular to the surface.

System reliability is affected by a number of factors including the amount

of temperature cycling [Chantem13]. Task assignment and scheduling may

help to even out the wear of the cores in an advanced multi-core system. A

dynamically-activated task assignment and scheduling algorithm that

prolongs system lifetime is proposed in [Chantem13].

Thermal-mechanical failures of TSVs including the TSV protrusions from

the die surface are studied in [Zhang13]. The TSV protrusions are observed

after wafer bonding, thinning, and TSV revealing. TSV protrusion on the

backside is affected by temperature cycling. Protrusion magnitude can be

fitted to an exponential model which suggests a grain boundary diffusion

mechanism might be behind it [Zhang13].

Temperature-related mechanical stress in TSV structures is studied in

[Jiang14]. An X-ray micro-beam diffraction visualization technique is used

to observe the stress and deformation in TSV with submicron resolution.

Local plasticity in TSV and the deformation induced by thermal stresses

are investigated using this technique. Grain growth in TSV metal body

affects the stress relaxation during temperature cycling and consequently

the residual stress and plasticity in the TSV structure [Jiang14].

3.9 Test Reordering

During the test, the power consumption of the circuit under test may exceed

its power rating, as discussed before. A test power reduction technique

based on test vector ordering is proposed in [Chakravarty94,

Dabholkar98]. The objective is to minimize the tests’ average switching

activity. It is demonstrated that the test ordering problem is NP-hard.

Consequently, a greedy approach for finding a low-power test order is

proposed. An elaborate power model based on the transition count in the

scan chain is used [Chakravarty94, Dabholkar98].

Another test planning technique is proposed in [Girard97] to reorder the

test vectors to minimize the switching activity of the circuit under test. A

close connection between the actual number of transitions and the

Hamming distance between tests is confirmed. Consequently, a fast

algorithm to calculate Hamming distances is used instead of the actual

transition count which is excessively time-consuming to calculate. A

greedy heuristic is then used to find a low power test order [Girard97].
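
The idea of ordering tests by Hamming distance can be sketched in Python as follows. This is a minimal nearest-neighbor illustration under stated assumptions, not the heuristic of [Girard97]: test vectors are given as equal-length bit strings, the first vector is arbitrarily chosen as the starting point, and all function names are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions in which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_order(vectors):
    """Nearest-neighbor ordering: always append the closest unused vector."""
    if not vectors:
        return []
    remaining = list(range(1, len(vectors)))
    order = [0]  # assumption: start from the first vector
    while remaining:
        last = vectors[order[-1]]
        nxt = min(remaining, key=lambda i: hamming(last, vectors[i]))
        remaining.remove(nxt)
        order.append(nxt)
    return order


def total_distance(vectors, order):
    """Sum of Hamming distances between consecutive vectors in an order."""
    return sum(hamming(vectors[a], vectors[b])
               for a, b in zip(order, order[1:]))
```

For the four hypothetical vectors "0000", "1111", "0001", and "1110", the original order has a total distance of 11, while the greedy order [0, 2, 1, 3] reduces it to 5. The distance serves only as a cheap proxy for the actual transition count in the circuit.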

In safety-critical applications, the electronics are frequently tested (possibly

even in the field and online) by Built-in Self-Test (BIST)

modules [Flores99]. Online testing, in particular, can consume a large part

of the overall power budget. A test ordering technique for power reduction

is proposed in [Flores99]. The switching activities of the circuit under test are

approximated by Hamming distances between consecutive tests. The

problem is equivalent to a travelling salesman problem. The problem is

simplified so that an ILP technique can be used. Moreover, the Christofides

algorithm is employed to find a low-power test order [Flores99].

A method for reducing the test application time while respecting a power

budget is proposed in [Rosinger02]. The method focuses on the test power

peaks. These peak values depend on the order of the tests. The tests are

reordered so that the power peaks of different cores do not overlap.

This leads to a minimized TAT under power constraints [Rosinger02]. The

technique works as follows: First, power dissipation is minimized. Then,

the current results are further improved by test application time

minimization. When minimizing the test application time, the power is

considered as a constraint [Rosinger02].

Testing during the burn-in process is a common practice since it reduces

test and burn-in costs [Bahukud08a, Bahukud08b, Bahukud09]. However,

power variations caused by scan-based testing may lead to large

temperature fluctuations. This affects the accuracy of the burn-in process

since the actual temperatures are not exactly known. Reducing power

variations in order to reduce the temperature variations during burn-in is

investigated in [Bahukud08a, Bahukud08b, Bahukud09]. The variation is

reduced through test reordering. An ILP approach as well as a greedy

algorithm are used to properly reorder the tests. An efficient transition

counting method is proposed to rapidly estimate the test power values.

Then, a heuristic-based test-pattern ordering technique is proposed to

minimize the fluctuations in the power dissipation during test

[Bahukud08a, Bahukud08b, Bahukud09].

Scan-based testing usually causes much more switching activity than

normal circuit operation [Tudu09]. This results in large power

consumption which in turn leads to supply droop and yield loss. An

efficient technique for test vector reordering to achieve an acceptably low

peak power is proposed in [Tudu09]. The peak power values are

represented by a complete directed graph. Consequently, a number of

graph based techniques are employed to reduce the peak power. Removing

the edges with peak power larger than a certain threshold is one of the pre-

processing techniques. After that, the remaining graph is searched for a

Hamiltonian path. Other techniques, such as repeating a test, adding an all-

zero test, and adding an all-one test are also studied. The average power is

also minimized under the peak power constraint [Tudu09].
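
The edge-removal and Hamiltonian-path steps can be illustrated with the small Python sketch below. It is a hedged illustration only: the search is a plain exhaustive backtracking (exponential in the worst case, so suitable only for small examples), not the algorithms of [Tudu09], and the matrix peak, where peak[i][j] is the assumed peak power when test j directly follows test i, as well as all function names, are hypothetical.

```python
def hamiltonian_path(n, allowed):
    """Return an ordering of all n tests that uses only allowed edges,
    or None if no such ordering exists (exhaustive backtracking)."""
    def extend(path, visited):
        if len(path) == n:
            return path
        for j in range(n):
            if j not in visited and allowed[path[-1]][j]:
                result = extend(path + [j], visited | {j})
                if result:
                    return result
        return None

    for start in range(n):
        result = extend([start], {start})
        if result:
            return result
    return None


def order_under_peak_limit(peak, limit):
    """Drop edges whose peak power exceeds the limit, then search for a
    test order that visits every test once over the remaining edges."""
    n = len(peak)
    allowed = [[i != j and peak[i][j] <= limit for j in range(n)]
               for i in range(n)]
    return hamiltonian_path(n, allowed)
```

For the hypothetical matrix [[0, 5, 9], [5, 0, 3], [9, 3, 0]] and a limit of 5, the sketch returns the order [0, 1, 2]; with a limit of 2 no edge survives and the result is None, which is the situation in which additional measures such as inserting extra tests would be considered.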

Chapter 4 Process-Variation Aware SoC Test Scheduling Techniques

This chapter presents techniques to address the negative effects of process

variation on the thermal issues during test. In advanced SoCs manufactured

by deep submicron technologies, the portion and the absolute value of the

temperature error induced by process variation (PV) are considerable. The

PV will cause large errors mainly due to power variations [Choi07]. Large

error magnitudes directly translate into the need for larger safety margins

and consequently excessively long test application times.

The usual offline test scheduling techniques are vulnerable to temperature

errors since the error values are not known a priori. Therefore, the

temperatures that are simulated offline could be very different compared

with the actual temperatures. Since the actual temperatures are accessible

during test through temperature sensing, an online scheduling alternative

seems promising. However, online test scheduling has its own drawbacks,

such as additional delays due to temperature readout times and run time

overhead. As a compromise, an adaptive approach is proposed in this

chapter to take advantage of both offline and online scheduling paradigms.

4.1 Introduction

Two process variation aware methods are proposed in [Aghaee10] in order

to maximize the test throughput. One of these techniques is offline and the

other is hybrid (quasi-static). The optimization objective (i.e., testing

throughput) is defined to take the cost of the overheated chips into account

in addition to the test application time. However, these techniques handle

neither intra-die process variation nor temperature error fluctuations. In

this section an adaptive test scheduling method is introduced which

navigates the tests according to the intra-die process variation thermal

effects and temporal deviations in thermal behavior of the chip. It makes

use of multiple on-chip temperature sensors to provide intra-die

temperature information.

Integration of such sensors is already practical. For example, Power5 was
reported in 2004 to have 24 sensors [Clabes04]. A variety of

mechanisms to access the sensors during test are proposed in [Ieee14b,

Yao11c]. The proposed approach in this thesis assumes an overhead for

sensor access and tries to reduce the number of sensor accesses.

As mentioned in section 3.4, there are related sensor-based works in this

area that support neither partitioning nor temperature-dependent leakage.

The method proposed in this chapter is based on partitioning and

interleaving, which reduces and/or utilizes the cooling times in order to

decrease the overall test application time. It also handles the long and

power intensive tests which are not thermally-safe. Moreover,

temperature-dependent leakage is taken into account.

The proposed method generates a near optimal schedule tree at design time

(offline-phase). During testing (online-phase), each chip traverses the

schedule tree, starting from the tree’s root and ending at one of the tree’s

leaves, depending on the actual temperatures. The schedule indicates when

a core is testing and when it is in the cooling state. The order of the test

sequences is untouched and the schedule tree’s size (i.e., storage footprint)

is small.

Traversing the schedule tree requires a very small delay overhead for

jumping from one point in the schedule tables to another point. This way,

the complexity is moved into the offline-phase and the memory/delay

overhead of the online-phase is minor. To our knowledge, this is the first

work to present an approach which incorporates the on-chip temperature

sensors data, repetitively during test, in order to adapt to the temperature

deviations caused by process variation and to achieve a superior test

performance.

4.2 Motivational Example

Assume that there are two chip instances from a set of chips
manufactured for a given design. When the temperature error between the
actual temperature and the expected one is negligible, the temperatures of
the two chips during a test process are equal and the same offline test
schedule is used for both of them. As illustrated in Figure 4.2.1a, both chips

are tested without overheating, since the test schedule includes cooling

periods whenever the thermal simulator indicates that the chip temperature

will exceed the limit.

Due to process variation, however, the thermal responses of the different

chips to the same test sequence will be different. Now, assume that one chip
is warmer than expected, while the other chip behaves normally. As
illustrated in Figure 4.2.1b, the warmer chip will overheat if the original
schedule, S1, is used. To prevent this, a more conservative offline schedule,
S2, has to be designed based on the thermal profile of the warmer chip, for
both chips, as illustrated in Figure 4.2.1c.

This new schedule will avoid overheating, but will lead to longer test
application time compared with S1. For the chip that behaves normally, this test
application time is unnecessarily long, since the original schedule, S1, in

Figure 4.2.1a is a safe schedule for this particular chip. For a set of

manufactured chips with large temperature variations, in order to generate
a globally conservative offline schedule, the hottest chip will be used to
determine the test schedule. This test schedule will introduce excessively long
cooling periods for most of the chips, leading to an inefficient test process.

Figure 4.2.1 Test schedule examples
Temperature curves (a) when there is no temperature error; (b) when there is time-invariant
temperature error and schedule S1 is used; (c) when there is time-invariant temperature error and
schedule S2 is used; (d) when there is time-variant temperature error. (Curves are only illustrative.)
Each panel plots the temperatures of the two chips against time, with the temperature limit and the
testing and cooling intervals marked. The schedule tables S1 and S2 have the columns Time and State.

The hybrid technique presented in [Aghaee10] addresses the above

problem with the help of a chip classification scheme. This scheme consists

of several test schedules for different temperature error ranges. After

applying a short test sequence, the actual temperature of the chip under test

is measured using a sensor and depending on its value, the proper test

schedule is selected. Therefore, the hotter chips will use a test schedule

with more cooling, while the colder chips will have less cooling. The

overheating issue is solved and the test application time will not be made

unnecessarily long. This approach works fine under the assumption that

the thermal behavior of the chips is time invariant (e.g., Figure 4.2.1a–c).

However, in the case of large process variation, the thermal behavior is

time variant and the technique presented in [Aghaee10] will not be able to

achieve high quality schedules. The variation of thermal response with

time is illustrated in Figure 4.2.1d. In this case, the temperature of the
warmer chip gradually rises, as compared to the normal chip, and it eventually
overheats. A scheduling method capable of capturing temporal deviations is
therefore required to deal with this new situation.

The temperature behavior given in Figure 4.2.1d is captured in Figure
4.2.2a in more detail. The temperature of the warmer chip starts to rise
above that of the normal chip partway through the test, as shown in Figure
4.2.2a. Since the warmer chip will only overheat some time after this rise
begins, both chips can be safely tested with schedule S1 up to an
intermediate adaptation moment. At this moment, the actual temperature of
the chip under test can be obtained via sensors. The actual temperature can
then be compared to a Threshold and the following two different situations
can be identified: either the temperature is at most the Threshold and the
chip behaves as expected, or the temperature exceeds the Threshold and the
chip is warmer than expected.

For the rest of the test, after the adaptation moment, two dedicated
schedules, S2 and S3, are generated in the offline phase for the normal chip
and the warmer chip, respectively. Therefore, in the online phase the test
of the normal chip continues with schedule S2, as in Figure 4.2.2a, and the
test of the warmer chip continues with schedule S3, as in Figure 4.2.2b.

In this illustrative example, at the end of S1, the schedule branches to
either S2 or S3 based on the actual temperature. This information and the
branching condition can be captured in a branching table, B1, in Figure
4.2.2. As shown in Figure 4.2.2a, the normal chip is tested initially with
S1 and then with S2, while, as shown in Figure 4.2.2b, the warmer chip is
initially tested with S1 and then with a more conservative schedule, S3.

The segments of the schedule which are executed sequentially without

branching are called linear schedules. An adaptive test schedule consists

therefore of a number of branching tables in addition to multiple linear

schedule tables. Note that the original test sequences are saved elsewhere

in an intact order without being duplicated.
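The interplay of linear schedule tables and branching tables can be sketched as follows. This is only an illustration of the idea and not the thesis implementation. The class and function names, the times, the threshold of 70 degrees, and the assumption that the entry carrying a branching table ID is the last entry executed in its linear table are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

@dataclass
class Entry:
    time: int                               # moment at which the state applies
    state: str                              # "testing" or "cooling"
    branching_table: Optional[str] = None   # branching table ID, if any

# One row per condition: (largest temperature selecting the row, linear schedule ID).
BranchingTable = List[Tuple[float, str]]

def run_schedule(linear: Dict[str, List[Entry]],
                 branching: Dict[str, BranchingTable],
                 read_sensor: Callable[[int], float],
                 start: str) -> List[Tuple[str, int, str]]:
    """Walk from the root table to a leaf table, branching on sensor readings."""
    trace = []
    current: Optional[str] = start
    while current is not None:
        following = None
        for entry in linear[current]:
            trace.append((current, entry.time, entry.state))
            if entry.branching_table is not None:
                temperature = read_sensor(entry.time)   # one sensor access
                for upper_bound, target in branching[entry.branching_table]:
                    if temperature <= upper_bound:
                        following = target
                        break
                break                                    # jump to the selected table
        current = following
    return trace

# Hypothetical tables mirroring the example: S1 branches to S2 or S3 through B1.
LINEAR = {
    "S1": [Entry(0, "testing"), Entry(10, "cooling"), Entry(15, "testing", "B1")],
    "S2": [Entry(20, "testing"), Entry(30, "cooling")],
    "S3": [Entry(20, "cooling"), Entry(28, "testing")],
}
BRANCHING = {"B1": [(70.0, "S2"), (float("inf"), "S3")]}

normal = run_schedule(LINEAR, BRANCHING, lambda t: 65.0, "S1")
warm = run_schedule(LINEAR, BRANCHING, lambda t: 75.0, "S1")
print([s for s, _, _ in normal], [s for s, _, _ in warm])
```

The online work per branching point is one sensor read and one table lookup, which matches the intent of moving the complexity into the offline phase.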

Although the above illustrative example was about a single-core design,

the focus of this thesis is on multi-core SoCs. It is assumed that, due to the

intra-die process variation, each core has its own thermal behavior similar

to what is described above for a chip. Moreover, multi-core designs usually
are affected by lateral heat dissipation among the cores and also by the
limited test bus width which is shared by different cores.

Figure 4.2.2 Schedule and branching tables
Temperature curves when there is time-variant temperature error (a) when both chips are tested with
linear schedules S1 and S2; (b) when, by referring to the branching table, B1, the test of the warmer
chip continues with linear schedule S3 after the adaptation moment. (Curves are only illustrative.)
The linear schedule tables have the columns Time, State, and Branching Table ID; the Branching Table ID
column is empty except for one entry that refers to B1.

Branching Table B1
Condition                  Linear Schedule Table ID
Temperature ≤ Threshold    S2
Temperature > Threshold    S3

Temperature curves for a SoC with four cores, as an example, are given in

Figure 4.2.3 [He08a]. It shows how the temperatures of the different cores

change over time. For a given core, when it is tested, its temperature

increases. When a core is not tested, there are no switching activities, and

it starts to cool down, as shown by the temperature curve going down.

To guarantee thermal safety, testing is interrupted when a core reaches the

high temperature threshold. As shown in Figure 4.2.3, more than one core
may be tested at the same time (e.g., the temperatures of both core 1 and
core 3 rise simultaneously at one point because of testing). Cores will utilize the
available TAM which is freed during the cooling intervals of other cores
(e.g., core 1 utilizes a cooling interval of core 3) [He08a].

4.3 Problem Formulation

The goal is to generate an efficient adaptive test schedule, offline. This is

formulated as an optimization problem. The input consists of a SoC design

with its set of cores and their corresponding test sequences and their

switching activities. The floor plan, the thermal parameters, the static

power parameters, and the dynamic power parameters for the chip are

given as inputs. The statistical data that models the temperature deviations

are also given as input. The adaptive test schedule should be generated to

minimize the test application time and the probability of overheating.

These objectives are captured by a cost function which expresses the
overall efficiency of the generated test schedule, as discussed in the
following.

Figure 4.2.3 Temperature curves for a four core chip under test

The test schedule should be generated under two constraints. The first

constraint is the available test bus width. The test bus width limits the

number of cores that can be tested in parallel. The second constraint is the

available Automatic Test Equipment (ATE) memory which limits the size

and the number of the linear schedule tables and branching tables. It is

assumed that the available memory after loading the test patterns will be

utilized for storing the schedule and, therefore, the amount of memory

dedicated to the schedule will not introduce new costs.

In this thesis a comprehensive cost function is defined by combining the

cost of the overheated chips and the cost of the test facility operation, as

follows:

(4.3.1)

The first term in the cost function is related to the test facility operation

cost, which is defined as the operational Cost of the Test Facility per time

unit divided by the Test Throughput. The cost of the test

facility operation per time unit depends on the cost of the ATE machines,

their maintenance costs, and other operational costs. The test throughput

captures the applied test size per time unit and is explained later.

The second term of the cost function is related to the cost of the overheated
chips, which is the product of the Price of One Chip and the
expected number of overheated chips. The expected number of overheated
chips is calculated based on the Test Overheating Probability, which
represents the number of overheated chips per number of chips entering
the test facility.

In equation 4.3.1 the test overheating probability is divided by its
complement (one minus the test overheating probability) in order to give the
expected number of overheated chips per number of non-overheated chips. The
cost of the test facility per time unit and the price of one chip depend on
the particular manufacturing and test facility and on the particular SoC. To
have a simple model for the test throughput, assume that the given test
facility is characterized by its overall Effective Test Time per Second and
Test Handling Time.

1 A list of notations and abbreviations is provided in section 4.11.

The effective test time per second is the total test time that the test facility
provides per second. For example, if there are two ATE machines working in parallel,
the effective test time per second could be as high as two. Therefore, it depends on the
number and specification of the ATE machines and possibly other test
facility specifications. The test handling time represents the wasted time
during which chips are not actually under test (e.g., placing, connecting, and
detaching the chips) and therefore, it depends on the test facility
specifications. The test throughput, which depends on the Applied Test
Size and Test Application Time, is calculated as:

(4.3.2)

In order to gain a better understanding of the test throughput, the
Normalized Test Throughput is defined by normalizing the test
throughput to the effective test time per second and assuming
that the test handling time is negligible, as follows:

(4.3.3)

The normalized test throughput is proportional to the applied test
size divided by the test application time. It is also proportional to the
percentage of the chips that have completed the test without overheating.
Therefore, a large test application time and a large test overheating probability

will result in small test throughput and consequently the cost component

related to the test facility operation will be higher.
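The verbal description of the cost function, the test throughput, and the normalized test throughput can be written out in one form that is consistent with the text. This is a reading of the description and not a verbatim copy of equations 4.3.1 to 4.3.3. The abbreviations (CF for the cost function, CTF for the cost of the test facility per time unit, PC for the price of one chip, TOP, TT, ETTPS, THT, ATS, TAT, NTT) are assumed names derived from the capitalized terms, and the way the test handling time enters the throughput is also an assumption.

```latex
\begin{align*}
% cost = facility cost per unit of throughput + cost of overheated chips
CF  &= \frac{CTF}{TT} + PC \cdot \frac{TOP}{1 - TOP} \\
% throughput: applied test size of non-overheated chips per unit of time
TT  &= ETTPS \cdot \frac{ATS \cdot (1 - TOP)}{TAT + THT} \\
% normalized throughput: divide by ETTPS and neglect the handling time
NTT &= \left. \frac{TT}{ETTPS} \right|_{THT \to 0}
     = \frac{ATS \cdot (1 - TOP)}{TAT}
\end{align*}
```

Under this reading, a longer test application time or a higher overheating probability lowers NTT, which raises the first cost term, in agreement with the paragraph above.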

In this thesis, the cost of the test facility per time unit, the price of one
chip, and the effective test time per second do not depend on the test schedule
and therefore they are considered to be constants. The cost function is then
normalized so that all constants are lumped into one new constant, the
Balancing Coefficient. The result is the Normalized Cost Function,
which is expressed as:

(4.3.4)


The balancing coefficient is in direct proportion to the price of one
chip and in inverse proportion to the cost of the test facility per time
unit. The first term in the above equation captures the test facility
operation cost.

The second term captures the balanced cost of the overheated chips and is
proportional to the test overheating probability and the balancing
coefficient. The balancing coefficient balances the cost of the overheated
chips against the cost of the test facility operation. Expensive chips will
result in a larger balancing coefficient and an expensive test facility will
result in a smaller balancing coefficient.
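The balancing effect can be made concrete with a small numerical sketch. It assumes that the normalized cost is the reciprocal of the normalized test throughput plus the balancing coefficient times the ratio of overheated to non-overheated chips, which is one form consistent with the description above. The function name, the parameter names, and all numbers are hypothetical.

```python
def normalized_cost(tat: float, ats: float, top: float, bc: float) -> float:
    """Assumed form: 1 / NTT + BC * TOP / (1 - TOP).

    tat: test application time, ats: applied test size,
    top: test overheating probability, bc: balancing coefficient.
    """
    if not 0.0 <= top < 1.0:
        raise ValueError("overheating probability must be in [0, 1)")
    if tat <= 0.0 or ats <= 0.0:
        raise ValueError("test time and test size must be positive")
    ntt = ats * (1.0 - top) / tat           # normalized test throughput
    return 1.0 / ntt + bc * top / (1.0 - top)

# Two hypothetical schedules: a fast one that overheats often, a slow safe one.
fast = dict(tat=100.0, ats=1.0, top=0.20)
safe = dict(tat=150.0, ats=1.0, top=0.01)

for bc in (10.0, 1000.0):                   # cheap chips versus expensive chips
    print(bc, normalized_cost(bc=bc, **fast), normalized_cost(bc=bc, **safe))
```

With the small coefficient the fast schedule has the lower cost, while with the large coefficient the safe schedule wins, which is the trade-off the balancing coefficient is meant to express.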

4.4 Temperature Error Model

As previously defined, temperature error is the difference between the

expected temperature (can be estimated by simulation) and the actual

temperature (can be measured by sensors). This error can be categorized

into spatial temperature error and temporal temperature error. Spatial

temperature error shows that different cores have different temperature

errors while the temporal temperature error shows that the same core has

different errors at different times.

A temperature error model gives the probabilities of the temperature errors

for every core in every test cycle. The spatial error model gives the initial

error distribution and then the temporal error model is used to recursively

estimate the error distribution for the next cycle.

For example, a spatial temperature error model which consists of a discrete
distribution shows that at the very beginning of the test the probability of
a particular error value in core 1 is 0.001 while the probability for the
same error in core 2 is 0.02. The spatial error model is specified using a

look up table which is assumed to be given as one of the inputs. Assuming

that the error for a SoC design may range from to by a

resolution of , the number of the look up table entries ( ) would be

80 for a core and for a SoC with cores.

The temporal temperature error model is assumed to be a discrete-time

model which means that the temperature error is fixed during a period and

then it changes discretely from one period to the next. Therefore, the

temporal temperature error model specification has two parts, the period

which is called temporal error period and a table of error change

probabilities. The temporal temperature error table gives the probability of

a particular change in error.

For example, a temporal temperature error model shows that the
probability that the error increases by a given step is 0.015. Assume that
the error is measured at time 0, as shown in Figure 4.4.1. The error will
remain at the measured value for one temporal error period. After that, the
exact error is not known any more. However, the probability of a certain
error can be estimated using the temporal error model. In this example, the
probability that the temperature error equals the measured value plus one
step during the second period is 0.015. Without a new measurement, the only
available information for the third period is that the probability of a
temperature error equal to the measured value plus two steps is
0.015 × 0.015. In Figure 4.4.1, a new measurement is then done, which makes
the actual error known again.
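The recursive use of the temporal error model can be sketched as a repeated convolution of the current error distribution with the table of error change probabilities. The change table below is an assumption made for illustration (only the value 0.015 for an increase by one step comes from the example), and errors are expressed in resolution steps relative to the measured value.

```python
from collections import defaultdict
from typing import Dict

def propagate(dist: Dict[int, float], change: Dict[int, float]) -> Dict[int, float]:
    """Advance an error distribution by one temporal error period."""
    nxt: Dict[int, float] = defaultdict(float)
    for error, p in dist.items():
        for delta, q in change.items():
            nxt[error + delta] += p * q      # error moves by delta with probability q
    return dict(nxt)

# Assumed change table: one step down, no change, one step up.
change = {-1: 0.015, 0: 0.97, +1: 0.015}

measured = {0: 1.0}                          # right after a sensor reading the error is known
second_period = propagate(measured, change)
third_period = propagate(second_period, change)

print(second_period[1], third_period[2])
```

In this sketch the only way to reach two steps above the measured value after two periods is two consecutive increases, so the probability is the product 0.015 × 0.015, as in the example.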

The size of the temperature error data set, given as input, might be quite

large. In such a case it is necessary to extract a smaller set of data which is

representative of the original data in accordance with the accuracy and

speed requirements. This is done by clustering the errors into error clusters.

The error clusters are characterized by temperature Error-cluster Borders.
The temperature error range, resolution, and error clusters are
assumed to be identical for all cores in this thesis.

The Temperature Error Values and the Spatial Temperature Error Probabilities
are original inputs which are given for every core of the SoC and for every
temperature error sample. The Temporal Temperature Error Probability is the
other input and it gives the probability for a certain change in the error
value. The probability that the temperature error value changes from one
value to another is

(4.4.1)

Figure 4.4.1 An example for temporal temperature error probabilities

The error clustering is assumed to be uniform and the error-clusters
borders are identical for all cores. Grouping the error values of each core
into a small number of error clusters reduces the size of the original data
set accordingly. Error clustering divides the error space, which has one
dimension per core, into error cells indexed using a Cartesian system (i.e.,
one cluster index per core). For example, assume that for a
SoC with two cores, each core has two error clusters. The 2-dimensional
error space is divided into four error cells, one for each pair of cluster
indices. While the original size of the error space is the number of error
values per core raised to the power of the number of cores, the number of
error cells is the number of error clusters per core raised to the same
power.
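The mapping from per-core errors to an error cell can be sketched with a sorted list of borders and a binary search. The function names, the convention that only interior borders are listed, and the rule that an error lying exactly on a border belongs to the higher cluster are assumptions of this sketch.

```python
from bisect import bisect_right
from typing import Sequence, Tuple

def error_cell(errors: Sequence[float], borders: Sequence[float]) -> Tuple[int, ...]:
    """Return the Cartesian index of the error cell: one cluster index per core.

    borders holds the ascending interior borders shared by all cores, so
    len(borders) + 1 clusters exist per core.
    """
    return tuple(bisect_right(borders, e) for e in errors)

def number_of_cells(n_cores: int, borders: Sequence[float]) -> int:
    """Clusters per core raised to the power of the number of cores."""
    return (len(borders) + 1) ** n_cores

borders = [0.0]                               # two clusters per core (hypothetical border)
print(error_cell([-3.0, 2.5], borders))       # core 1 in cluster 0, core 2 in cluster 1
print(number_of_cells(2, borders))            # two cores with two clusters each
```

For the two-core, two-cluster example this gives the four error cells mentioned in the text.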

4.5 Adaptive Test Scheduling

The proposed adaptive method is based on the on-chip temperature sensors

implemented on each core. During test, the actual temperatures of selected

cores are read at certain selected moments. A group of chips with similar

thermal behavior which are tested with the same schedule is called a chip

cluster. During the test, chips are dynamically classified into one of the

chip clusters and are tested using its corresponding schedule. The chip

clusters vary during the test, and at every adaptation moment (time moment

corresponding to a certain branching table) the chip clusters change into a

new scheme which is suitable for the new situation.

The parameters that affect the efficiency of the adaptive method are the

moments when branching/adaptation happens, the number of edges (i.e.,

linear schedule tables) and the branching conditions (i.e., chip clustering).

For the example in Figure 4.2.2, the adaptation happens at the end of S1, the
number of edges is two (two linear schedule tables, S2 and S3), and the
branching condition is a comparison of the temperature with the Threshold.

Since the possible branching moments are multiples of the temporal error

period, the first design decision is whether to branch or not at a possible

node in a schedule tree. This design decision will be merged with the

second design decision which is the number of edges (i.e., the number of

chip clusters). The third design decision is the chip clustering for nodes.

These problems are summarized into the following two sub-problems.

1. How many chip clusters, at each possible node in the schedule tree, are

suitable? The special case of one edge implies no branching, no sensor

reading, and no extra effort.

2. What is the proper chip clustering into the given number of chip

clusters? The number of chip clusters is known from the previous

question. Depending on the chip clustering some cores may not need

sensor readout.

The second question is only relevant when the answer to the first question

is larger than one. The above questions are then formulated in two different

forms, the first question is described as a tree topology and the second

question is the chip clustering for the nodes of that tree topology.

A candidate schedule tree is generated by combining a candidate tree

topology with a candidate chip clustering. The number of candidate tree

topologies and the number of alternative chip clusterings grow very fast

with parameters like temporal error resolution and the number of cores.

Since the number of candidate trees is the product of the tree topology

alternatives and the chip clustering alternatives, the search space is so huge

that ordinary search approaches would not work fast enough. Therefore a

constructive method is suggested to deal with this high complexity.

The schedule tree is constructed by adding small partial trees to its leaves.

These small partial trees which are the building blocks of the schedule tree

are called sub-trees. A sub-tree consists of a small number of linear

schedules and branching tables which makes it possible to be clustered and

optimized (scheduled) at once. The tree that is under construction with

unfinished tests is called an unfinished tree.

For example, assume that there is an unfinished tree, Tree 1, as shown in

Figure 4.5.1a. The linear schedule tables of Figure 4.2.2 correspond to the

edges of Tree 1 while the branching table corresponds to node 1, as shown

in Figure 4.5.1a. Two sub-trees with one and with two edges are shown in

Figure 4.5.1b. Tree 1 has two leaves and combinations of the sub-trees are

added to them in order to generate the offspring as shown in Figure 4.5.1c.

Offspring 2, for example, is generated by attaching the Sub-tree 1 to node

2 of Tree 1 and attaching the Sub-tree 2 to node 3 of Tree 1.
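The generation of offspring can be sketched as a Cartesian product: every active leaf of the unfinished tree receives one of the available sub-tree topologies, and each combination yields one offspring. The function name and the representation of an offspring as a mapping from leaf node to the number of edges of the attached sub-tree are assumptions of this sketch.

```python
from itertools import product
from typing import Dict, List, Sequence

def generate_offspring(active_leaves: Sequence[int],
                       subtree_edges: Sequence[int]) -> List[Dict[int, int]]:
    """Return one mapping (leaf -> edges of the attached sub-tree) per offspring."""
    return [dict(zip(active_leaves, combination))
            for combination in product(subtree_edges, repeat=len(active_leaves))]

# Tree 1 has the leaves 2 and 3; the sub-trees have one and two edges.
offspring = generate_offspring(active_leaves=[2, 3], subtree_edges=[1, 2])
for tree in offspring:
    print(tree)
```

With two leaves and two sub-tree topologies there are four combinations. The number of combinations grows exponentially with the number of active leaves, which is why only a few promising trees are kept in each iteration.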


The proposed constructive algorithm is shown in Figure 4.5.2. The inputs

to the algorithm include the switching activities of the tests in order to

compute the dynamic power, the thermal error model in order to estimate

the temperature errors, and the thermal model of the chip in order to predict

the temperatures.

Furthermore, the algorithm requires the electrical model of the chip in

order to compute the static power and the dynamic power and in order to

be informed about the test bus width limit. The test facility specifications
are also inputs to the algorithm; they provide the knowledge of the
available ATE memory, delay overheads, and the balancing coefficient
(see equation 4.3.4).

The algorithm starts with an initialization phase, as shown in Figure 4.5.2.

Here, the unfinished tree, sub-tree topologies, temperature error model, and

thermal simulator are initialized. Then it proceeds with constructing the

schedule tree out of the sub-trees as will be explained in section 4.5.1. The

linear schedule tables are discussed in section 4.5.2. The sub-tree

evaluation is explained in section 4.5.3. The sub-tree scheduling which is

based on an optimization heuristic is explained in section 4.5.4.

4.5.1 Tree Construction

The schedule tree construction starts with a root node and in each iteration

an unfinished tree extends and multiplies by adding alternative

combinations of sub-trees to its active leaf nodes, as shown in Figure 4.5.1.

Then, a small number of promising under-construction trees are selected

as unfinished trees from the offspring list to be used in the next iteration.

Figure 4.5.1 Constructive method
Main components are (a) unfinished tree, (b) sub-tree topologies, and (c) offspring trees. For S1, S2,
S3, and B1 in (a) refer to Figure 4.2.2.
Panels: (a) Unfinished Tree (Tree 1); (b) Sub-trees (Sub-tree 1 and Sub-tree 2); (c) Offspring Trees
(Offspring 1, Offspring 2, and Offspring 3).

For example, a list of unfinished trees will be selected from the offspring list
(partially shown in Figure 4.5.1c) for the next iteration. The algorithm, as shown
in Figure 4.5.2, ends when all the unfinished trees have completed the test.

The selection process keeps the ATE memory constraint satisfied by not

selecting the candidates that will exceed the memory limit. A naïve

algorithm will have a tendency to create many edges in all iterations at the

beginning since it reduces the cost. As a result of this naïve approach the

algorithm will put many edges near the root of the tree and later on as the

memory fills up there will not be any possibility to add a new edge. In order
to provide the algorithm with the freedom to put more edges in the more
beneficial regions, the selection in our proposed algorithm is done based
on the Scaled Cost Function, as defined in the following.

Figure 4.5.2 The proposed constructive method
Inputs: thermal model, test switching activities, thermal error model, electrical model, and test
facility specification.
1. Initialize the unfinished trees (shown in Figure 4.5.1a), the sub-tree topologies (shown in Figure
4.5.1b), the thermal simulator (discussed in section 4.6), and the temperature error model (discussed
in section 4.4).
2. Generate offspring trees (shown in Figure 4.5.1): schedule the sub-trees AL1.StT1, AL1.StT2,
AL2.StT1, and so on up to ALlast.StTlast, including core clustering for errors (discussed in section
4.5.4), where ALi.StTj is the j-th Sub-tree Topology to be connected to the i-th Active Leaf node of
the unfinished tree. Then connect the scheduled sub-trees (ALi.StTj) to the corresponding leaf node
(ALi) in order to generate all possible combinations (discussed in section 4.5.1).
3. Select the unfinished trees from the offspring trees list using equation 4.5.1 (discussed in
section 4.5.1).
4. Is there any active leaf in the unfinished trees? If yes, return to step 2. If no, select the final
schedule tree from the offspring trees list using equation 4.3.4 (discussed in section 4.5).
Output: final schedule tree.

(4.5.1)

The normalized cost function (equation 4.3.4) is scaled by the tree’s
number of nodes plus an adjusting offset. Now, adding nodes to the tree is
only beneficial if it gives a reasonable cost reduction; otherwise a smaller
tree may get a lower scaled cost and manage to survive to the next iteration,
while bigger trees are discarded. In general, bigger trees will have a smaller
normalized cost but not necessarily a smaller scaled cost. The effect of the
number of nodes is adjusted by the adjusting offset. A small adjusting offset
promotes having fewer edges compared to a large adjusting offset which promotes
having more edges. In other words, a larger adjusting offset reduces the
sensitivity to the number of nodes. An extremely large adjusting offset
means that the number of nodes has almost no effect on decision making
while the normalized cost dominates the decision making process.

To satisfy the memory constraint, when an unfinished tree is selected based
on its scaled cost function, it is scheduled for the rest of the test by just using
linear schedule tables, which means no further branching. During this

scheduling, the linear scheduling aborts as soon as the memory limit is

violated. If the linear scheduling succeeds in respecting the memory limit,

the candidate survives to the next iteration. Otherwise, the currently chosen

unfinished tree is discarded and the next candidate with larger or equal

scaled cost is tested for its compliance with the memory constraint. The

scheduling will fail if no candidate could meet the memory constraint,

meaning that the limit is too tight even for a linear schedule.
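The selection step can be sketched as follows, assuming that the scaled cost is the normalized cost multiplied by the number of nodes plus the adjusting offset, which is one reading of the description above. The names Candidate, scaled_cost, select_unfinished, and the fits_memory callback (standing in for the linear completion and memory check) are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence

@dataclass
class Candidate:
    name: str
    normalized_cost: float    # value of the normalized cost function
    nodes: int                # number of nodes in the tree

def scaled_cost(c: Candidate, offset: float) -> float:
    """Assumed form: normalized cost scaled by (number of nodes + adjusting offset)."""
    return c.normalized_cost * (c.nodes + offset)

def select_unfinished(candidates: Sequence[Candidate], offset: float, keep: int,
                      fits_memory: Callable[[Candidate], bool]) -> List[Candidate]:
    """Keep the 'keep' cheapest candidates that still respect the memory limit."""
    survivors: List[Candidate] = []
    for c in sorted(candidates, key=lambda c: scaled_cost(c, offset)):
        if fits_memory(c):                    # linear completion stays within the limit
            survivors.append(c)
            if len(survivors) == keep:
                break
    if not survivors:
        raise RuntimeError("memory limit too tight even for a linear schedule")
    return survivors

small = Candidate("small", normalized_cost=10.0, nodes=4)
big = Candidate("big", normalized_cost=9.0, nodes=7)

print(select_unfinished([small, big], offset=1.0, keep=1, fits_memory=lambda c: True)[0].name)
print(select_unfinished([small, big], offset=100.0, keep=1, fits_memory=lambda c: True)[0].name)
```

With a small offset the smaller tree survives although its normalized cost is higher. With a large offset the number of nodes hardly matters and the tree with the lower normalized cost survives.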

4.5.2 Linear Schedule Tables

A linear schedule table captures a schedule without branching. The linear

schedule table entries (start/stop times for each and all cores) are optimized

in the offline phase to reduce the probability of overheating. The

temperatures are checked frequently in order to keep the overheating

probability small.

The start/stop states in the linear schedule tables are generated using the

heuristic proposed in [He08a]. According to this heuristic, the test of the

cores with lower temperature and higher remaining test size will be started

or resumed earlier. Activating the cores with lower temperatures is

desirable because it provides longer testing intervals and therefore reduces

the number of test partitions and their corresponding overheads.

Moreover, by choosing the colder cores while the effect of adjacent cores
is taken into account by temperature simulation, the algorithm in effect
activates the cores which are far from the currently active cores. This will
save the newly activated cores from the accumulated heat in their possible
neighbors and furthermore, by not activating the adjacent cores, the newly
deactivated cores will experience faster cooling. The heuristic also gives an
advantage to the cores with longer remaining tests, thus maximizing the

interleaving opportunities. Besides, the situation in which a long test

sequence leads to a long total test application time is avoided.
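The priority rule can be sketched as a simple selection under the test bus width constraint. This is not the heuristic of [He08a] itself: how temperature and remaining test size are combined is not specified here, so the lexicographic ordering below (colder first, then longer remaining test) is an assumption, as are all names and numbers.

```python
from dataclasses import dataclass
from typing import List, Sequence

@dataclass
class Core:
    name: str
    temperature: float    # current estimated temperature
    remaining: int        # remaining test size
    bus_width: int        # test bus lines needed while testing

def select_cores(cores: Sequence[Core], bus_limit: int, temp_limit: float) -> List[str]:
    """Pick cores to test next: colder cores first, ties broken by longer remaining test."""
    ready = [c for c in cores if c.remaining > 0 and c.temperature < temp_limit]
    ready.sort(key=lambda c: (c.temperature, -c.remaining))
    chosen, used = [], 0
    for c in ready:
        if used + c.bus_width <= bus_limit:   # respect the shared test bus width
            chosen.append(c.name)
            used += c.bus_width
    return chosen

cores = [Core("A", 60.0, 100, 8), Core("B", 45.0, 300, 8),
         Core("C", 45.0, 500, 8), Core("D", 90.0, 400, 8)]
print(select_cores(cores, bus_limit=16, temp_limit=85.0))
```

Core D is left cooling because it is at or above the limit, and core A has to wait because the two colder cores already occupy the bus.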

As mentioned before, each chip cluster is tested with a dedicated linear

schedule. Every chip cluster is represented by a single error value which

will be used to estimate the actual temperature based on the simulated

temperature; this error value is called representative temperature error. The

estimated temperature is updated periodically by correcting the cores’

simulated temperatures with the representative temperature error. The

estimated temperature is then used to compute the static power and to

determine the ‘state’ of the cores (i.e., testing or cooling).

For example, assume that there are two chips in a certain chip
cluster and the chips consist of only one core. Therefore, at a certain
moment in time, there are two error values, one for each of the
two chips. But the linear scheduling heuristic works with one error value
for one chip cluster. Therefore, the representative temperature error,
which is a real number, is defined as a single value which represents the
error values of all chips in the cluster.

The representative temperature error is updated periodically with the

temporal error period (see section 4.4) while the estimated temperature,

static power, and state of the cores are updated more frequently. After

updating the state of the cores, the dynamic power sequence is computed.

The initial temperatures are available as the results of the previous

temperature simulation. Having dynamic and static power sequences in

addition to the initial temperatures, the next temperature simulation is

performed.

The representative temperature error for a chip cluster is viewed as a safety

margin in [Aghaee10] and its optimal value is experimentally computed

for a number of examples. Those experiments suggest that the optimal

value for a representative temperature error is equal to the border between

the chip cluster and the adjacent chip cluster that has larger error (i.e.,



hottest possible chip in the chip cluster). This is true for all chip clusters

except the last one that has the largest error. For example, for a chip cluster

stretching from to , would be a good

choice to be the representative temperature error for this chip cluster. The

representative temperature errors are assigned in a similar way in this

thesis.
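The assignment rule described above can be sketched as follows. The function name, the list-based representation of the borders, and the numeric values are illustrative assumptions; the representative error of the last cluster is passed in separately because the text treats it as an optimization variable.

```python
def representative_errors(borders, last_cluster_error):
    """Return one representative temperature error per chip cluster.

    borders: ascending error values separating adjacent chip clusters
             (k - 1 borders for k clusters).
    last_cluster_error: value chosen for the cluster with the largest errors.
    """
    if sorted(borders) != list(borders):
        raise ValueError("borders must be in ascending order")
    # Every cluster except the last is represented by its upper border,
    # i.e., the error of the hottest chip that can fall in that cluster.
    return list(borders) + [last_cluster_error]


# Hypothetical example: three clusters separated at errors of 2 and 5 degrees.
reps = representative_errors([2.0, 5.0], last_cluster_error=9.0)
```

Using the upper border makes the schedule of each cluster safe for its hottest member, which matches the safety-margin interpretation mentioned above.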

To have an example from a different point of view, assume that in total

there are four chips and chips consist of only one core.

Therefore, at a certain moment in time, there are four error values

corresponding to the four chips. Assume that

. Assume that the chip-clustering algorithm (will be explained in

section 4.5.4) has generated two chip clusters and . The

representative temperature error for the chip cluster that has smaller errors

(i.e., ) is and the representative temperature error for the

last chip cluster, is formulated as an optimization variable along with

the chip-clusters borders in the chip-clustering algorithm. This is

particularly important when the number of chip clusters is small. This is

usually the case² and therefore the cluster at the high-temperature extreme

will contain a non-negligible number of chips.

The sub-tree optimization method encodes the problem based on chip-

clusters borders. The representative temperature errors are defined as chip-

clusters borders for all chip clusters but the last one. For the last error

cluster (one with the largest errors), the representative temperature errors

are encoded along with the chip-clusters borders as the sub-tree

optimization variables. This will be explained in more detail in section

4.5.4.

The optimization problem for a linear schedule table is to minimize the

partial normalized cost function by finding the proper start/stop times. This

is done based on the heuristic proposed in [He08a]. The utilized test bus

width is the sum of the TAM widths of the active cores for tests which

utilize the TAM. The schedule size is the product of the number of the

linear schedule table entries and the record size. The schedule tree is

equivalent to a number of linear schedule tables (edges) in addition to a

number of branching tables (nodes), as shown in Figure 4.5.1a. The linear

schedule table is explained above and the rest of the construction process

will be explained in the following sections.

2 Refer to [Aghaee10].

4.5.3 Sub-Tree Evaluation

The schedule tree is constructed by attaching sub-trees to the leaves of the

unfinished trees (See Figure 4.5.1). For this purpose, the proper schedule

for a sub-tree topology should be found. In a sense, a sub-tree is a tree and

the cost function introduced in section 4.3 should be usable. However,

there is a subtle difference between their objectives. For the schedule tree

the objective is its very own cost. For a sub-tree the objective is, on the

contrary, the cost of the schedule tree that is to be constructed. Therefore,

the cost of the final schedule tree should be estimated assuming that this

particular sub-tree is used in its construction. This makes the cost

evaluation different for the sub-trees.

To find the near optimal schedule for a sub-tree topology, a partial cost

function must be used for different sub-tree clustering alternatives. For the

evaluation of the cost function (i.e., in equation 4.3.4), the expected

values of the test application time, , the applied test size, , and the

test overheating probability, , (denoted by , , and ,

respectively) should be computed by utilizing the temperature error

statistics.

The expected values are computed while each edge is being scheduled. In

the formulation of the schedule tree, an edge is represented by its

destination node. Assuming that the number of nodes is , the Nodes’

Probabilities , the Nodes’ Applied Test Sizes

, and the Nodes’ Test Application Times

are used to compute the expected applied test size and the expected

test application time as follows:

(4.5.2)

(4.5.3)
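Equations 4.5.2 and 4.5.3 are not legible in this copy. The sketch below assumes that they are probability-weighted sums over the nodes, which is what the surrounding description suggests; the function name and all numbers are hypothetical.

```python
def expected_value(node_probabilities, node_values):
    """Probability-weighted sum over the nodes (edges) of a sub-tree."""
    if len(node_probabilities) != len(node_values):
        raise ValueError("one value is needed per node")
    return sum(p * v for p, v in zip(node_probabilities, node_values))


# Hypothetical sub-tree: a root edge followed by two alternative branches.
node_prob = [1.0, 0.7, 0.3]              # probability that a chip takes each edge
node_applied_size = [100.0, 80.0, 40.0]  # test size applied on each edge
node_test_time = [10.0, 9.0, 12.0]       # test application time of each edge

expected_size = expected_value(node_prob, node_applied_size)
expected_time = expected_value(node_prob, node_test_time)
```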

In order to explain the expected test overheating probability, , and

understand how node probabilities are computed, the notions of node

clustering and error cells are introduced here. Temperature errors of cores

constitute a -dimensional error space ( is the number of cores). For

example in Figure 4.5.3, there are two cores and therefore the error space

is two dimensional. The horizontal axis represents the error values of the



first core and the vertical axis represents the error values of the second

core. There are four error clusters for each core and therefore there are

sixteen error-cells in Figure 4.5.3.

This is specifically important for the nodes at which branching takes place.

Branching at a node is, in fact, a clustering of the chips into a number of groups, so

that each chip cluster corresponds to an exclusive edge that branches out

of that node. Chips are identified by their cores’ errors and therefore a chip

clustering is a partitioning of the -dimensional error space into a number

of chip clusters.

This means that a chip cluster is a combination of specific error intervals

of the cores. A candidate ‘sub-tree clustering’ is a set of chip clustering

alternatives for nodes. Furthermore, a candidate ‘sub-tree clustering’ could

be viewed as a set of nodes’ clustering alternatives for a sub-tree topology.

An error cell is a cell in -dimensional error space separated by cores’

error-clusters borders and therefore its projection on a core error axis is an

error cluster for that core. Therefore, a node clustering could be seen as

assigning error cells to chip clusters or equivalently labeling error cells

with chip clusters. An example for labeling of the error cells is shown in

Figure 4.5.3. There are two cores in the figure, and the numbers

(“0” or “1” in this case) inside the rectangular error cells are the labels.
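The error-cell view described above can be sketched as follows. The labeling rule is a hypothetical one, chosen only so that the counts agree with the caption of Figure 4.5.3 (four cells labeled 0 and twelve labeled 1); it is not claimed to be the labeling drawn in the figure.

```python
from itertools import product


def label_error_cells(clusters_per_core, n_cores, label_of):
    """Map every error cell (one cluster index per core) to a chip-cluster ID."""
    cells = product(range(clusters_per_core), repeat=n_cores)
    return {cell: label_of(cell) for cell in cells}


# Hypothetical rule: chips whose two cores both fall in their two lowest
# error clusters go to chip cluster 0, all other chips to chip cluster 1.
labels = label_error_cells(4, 2, lambda cell: 0 if max(cell) <= 1 else 1)

n_cells = len(labels)                                  # 4 ** 2 cells
n_label0 = sum(1 for v in labels.values() if v == 0)   # cells of chip cluster 0
```

The number of cells is the number of error clusters per core raised to the number of cores, which is why a cell-by-cell labeling becomes expensive as the number of cores grows.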

A candidate sub-tree topology will have a number of candidate clustering

alternatives which label the nodes’ error cells with the relevant chip

clusters. Each chip cluster for a node corresponds to an edge branching out

of that node and corresponds to a linear schedule table. Each node has its

own dedicated Error-Cell Labeling

. Looking from a branching node, a succeeding node corresponds to a

chip cluster and therefore it receives a Node’s Cluster Label

to represent that chip cluster (or equivalently the preceding edge

and corresponding linear schedule). This label indicates which of the

branching node’s chip clusters will lead to a certain succeeding node.

[Figure 4.5.3: An example for error-cells labeling. The horizontal axis shows the core 1 error clusters and the vertical axis shows the core 2 error clusters, both indexed 0 to 3. Four error-cells are labeled with 0, the ID of chip cluster number 0, and the remaining twelve error-cells are labeled with 1, the ID of chip cluster number 1.]

The probabilities of error cells for different nodes and consequently the

probabilities of those nodes are computed based on the temperature error

model and based on the chip clusterings of the preceding nodes. In order

to speed up the computation of the Error-Cells Probabilities ( ) the

Error Cell Change Probabilities ( ) are pre-computed as shown below,

in equation 4.5.5. The error cell change probabilities are, in fact, the

concentrated effect of the temporal error model which is repeatedly used

to compute the error-cells and nodes probabilities.

It is assumed that the variation in the probabilities inside an error cluster is

negligible. Furthermore, it is assumed that the error change probabilities

for different cores are independent. The error-cell probabilities change

from node to node and therefore most of the time the equations are about

two nodes, the origin and the destination. The error cells for the origin

node are superscripted with and for the destination node with .

is computed as follows:

(4.5.4)

is the temporal temperature error probability, is temperature

error value, and is error cluster border. is computed as follows:

(4.5.5)

The error-cell probabilities for the root node (i.e., ) are computed

based on the spatial temperature error probabilities ( ) as follows:

(4.5.6)

The error-cell probabilities for non-root nodes (i.e., ) are computed

based on the predecessor node which is denoted by . First, error-cell

probabilities just after the branching are extracted from the predecessor

node as follows:



(4.5.7)

While scheduling an edge, overheating may occur to some of the cells

(ranges of chips) which have larger temperature errors. Consequently, the

probability of these cells at the end of the edge (after the corresponding

chunk of the test is applied) is considered to be zero. The error-cell

probabilities, , after overheating are computed based on

Representative Temperature Error ( ) ( is introduced in section

4.5.2) as represented below, in equation 4.5.8. Overheating of a core occurs

when the core’s actual temperature which is estimated by adding to

the Simulated Temperature ( ) exceeds the High Temperature Threshold

( ). A chip is considered as being overheated if at least one of its cores

overheats.
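The overheating rule stated above can be written as a short check. The function and parameter names, and the temperatures in the example, are illustrative assumptions.

```python
def chip_overheats(simulated_temps, representative_error, high_threshold):
    """True if the estimated temperature of any core exceeds the threshold."""
    # Estimated temperature = simulated temperature + representative error.
    estimated = [t + representative_error for t in simulated_temps]
    # A chip counts as overheated if at least one of its cores overheats.
    return any(t > high_threshold for t in estimated)


# Hypothetical values in degrees Celsius for a chip with three cores.
simulated = [82.0, 88.5, 79.0]
safe = chip_overheats(simulated, representative_error=1.0, high_threshold=90.0)
hot = chip_overheats(simulated, representative_error=2.0, high_threshold=90.0)
```

With a representative error of 1.0 the hottest core is estimated at 89.5 and the chip passes; with 2.0 the same core is estimated at 90.5 and the chip is treated as overheated.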

after overheatingafter branching

(4.5.8)

According to the temperature error models (introduced in section 4.4) the

error-cell probabilities, , after temporal changes are computed as:

(4.5.9)

The node’s probability, , is computed as follows:

(4.5.10)

Node’s not Overheating Probability ( ) is the probability that a chip

which corresponds to this edge according to the chip clustering scheme, is

not overheated after traversing this edge. for a node, , is computed

as follows:

(4.5.11)

Finally, error-cell probabilities, , are computed as:

(4.5.12)


Edges are scheduled by determining the linear schedule tables as explained

in section 4.5.2. Then the candidate sub-tree clustering is evaluated using

the partial cost function which is based on the expected applied test size,

the expected test application time, and the predicted test overheating

probability. The first two were already introduced in equations 4.5.2 and 4.5.3, and

the last one is explained below.

Evaluation of a partial tree is in fact an attempt to predict the cost of the

completed schedule tree, based on the current situation of that partial tree.

For this purpose, it is assumed that the final schedule tree will be composed

of a number of similar partial trees (building blocks for the final schedule

tree). These partial trees are assumed to have similar expected applied test

size, expected test application time, and expected test overheating

probability. These expected values are assumed to be similar to those of

the partial tree that we are evaluating.

Therefore, a good prediction approach for the test application time and the

applied test size would be their current expected values multiplied by the

predicted total number of partial trees. Since only the ratio of the predicted

test application time to the predicted applied test size matters in the cost

function (the first term in equation 4.3.4), a good choice for predicted

values of these variables is their expected values. But the situation for

Predicted Test Overheating Probability ( ) is different since its value

does not change linearly when a number of similar partial trees (building

blocks) are put one after the other (unlike and ).

Assuming that there are leaves in the tree, the Leaf’s Overheating

Probability is the overheating probability for the path

from the root node to the specified leaf node. Its computation includes

multiplication over nodes that belong to the specified root-to-leaf path. The

overheating probability for leaf is computed as:

is a leaf node for all nodes, , belonging to the root–to– path

(4.5.13)

The expected test overheating, assuming a total of leaf nodes, is

computed as:

(4.5.14)
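Equation 4.5.13 is not legible in this copy. The sketch below covers only the part that the text states, a multiplication over the nodes of a root-to-leaf path, and assumes that the multiplied quantities are the nodes' not-overheating probabilities, so that the leaf's overheating probability is one minus their product. The tree, its node names, and the probabilities are hypothetical.

```python
from math import prod


def leaf_overheating_probability(path_not_overheating):
    """Overheating probability along one root-to-leaf path.

    path_not_overheating: not-overheating probability of every node
    on the path, from the root to the leaf.
    """
    # A chip survives the path only if it survives every edge on it.
    return 1.0 - prod(path_not_overheating)


# Hypothetical tree with two leaves that share the root node.
parent = {"root": None, "leaf_a": "root", "leaf_b": "root"}
not_overheating = {"root": 0.99, "leaf_a": 0.98, "leaf_b": 0.95}


def path_to(node):
    """Nodes from the root down to the given node."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]


lop = {
    leaf: leaf_overheating_probability([not_overheating[n] for n in path_to(leaf)])
    for leaf in ("leaf_a", "leaf_b")
}
```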



can be used in equation 4.3.4 to evaluate a fully constructed

schedule tree, but for the partial cost function, when the tree is not yet fully

constructed, the predicted test overheating probability, , is used to

evaluate the partial cost function (which will replace in equation

4.5.1). is computed as:

(4.5.15)

Where λ is the total number of partial trees (building blocks) that are

assumed to be similar to the current partial tree and will construct the final

schedule tree. is computed for the partial tree as expressed in

equation 4.5.14 and then the predicted test overheating probability is

computed by assuming that these λ partial trees have overheating rates

equal to the current partial tree’s overheating rate. λ is computed based on

the expected Number of Partial Trees ( ) which is defined as the total

test size divided by the expected applied test size, , for the current

partial tree.
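Equation 4.5.15 is not legible in this copy. As a rough illustration of why the overheating probability does not scale linearly with the number of building blocks, the sketch below assumes that a chip must survive every block independently, so the predicted probability is one minus the survival probability raised to the number of blocks. The function names and numbers are hypothetical, and the unlimited expected number of partial trees is used here in place of λ.

```python
def expected_number_of_partial_trees(total_test_size, expected_applied_size):
    """Total test size divided by the expected applied test size."""
    return total_test_size / expected_applied_size


def predicted_overheating(expected_overheating, n_blocks):
    """Overheating probability after n_blocks similar partial trees in series."""
    return 1.0 - (1.0 - expected_overheating) ** n_blocks


n_blocks = expected_number_of_partial_trees(1000.0, 250.0)
pto = predicted_overheating(0.01, n_blocks)
```

For four blocks with a one percent overheating probability each, the prediction is slightly below four percent, which shows the nonlinearity mentioned in section 4.5.3.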

A naïve algorithm will use the instead of λ in equation 4.5.15.

However, because of the localities in the schedule tree, partial trees

(building blocks) with a lot of cooling may exist. For these partial trees,

the expected applied test size is small and consequently the expected

number of partial trees, , will be estimated pessimistically. This

unrealistic estimation may result in exceedingly large predicted test

overheating probabilities, , and consequently a long schedule tree

with too much cooling may receive a low cost and be selected.

Therefore, limiting the expected number of partial trees, , would be

helpful for good schedules to receive a more realistic cost. A reciprocal

limiter is used here which amplifies small inputs and attenuates large

inputs. In the proposed reciprocal limiter, the output is always one when

the input is one, and the output is equal to the input at a point that is called

. The output will always be smaller than the limit, which is (

). The limited output, λ, is calculated based on the input, , as follows:

(4.5.16)

A larger promotes lower overheating since the maximal value of λ

increases and also because of the increased limiter’s amplification for the

values which are less than . On the other hand, a smaller

will result in schedules with shorter test application time.
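Equation 4.5.16 is not legible in this copy. The function below is only one reciprocal form that satisfies the three properties stated above (output one for input one, output equal to input at a chosen point, output bounded by a limit); whether it matches the thesis's exact formula is an assumption, as is the parameter name `crossover`.

```python
def reciprocal_limiter(n, crossover):
    """Limit the expected number of partial trees n (n >= 1).

    crossover: the input value at which the output equals the input
               (hypothetical parameter name).
    """
    if n < 1 or crossover <= 1:
        raise ValueError("expects n >= 1 and crossover > 1")
    # The output approaches crossover + 1 as n grows without bound.
    return (crossover + 1.0) - crossover / n


at_one = reciprocal_limiter(1.0, 10.0)          # equals the input 1
at_crossover = reciprocal_limiter(10.0, 10.0)   # equals the input 10
amplified = reciprocal_limiter(2.0, 10.0)       # larger than the input 2
attenuated = reciprocal_limiter(100.0, 10.0)    # smaller than the input 100
```

In this form, inputs between one and the crossover point are amplified and inputs beyond it are attenuated, and a larger crossover value raises the limit, in line with the behavior described above.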


At this point, the introduction to computation of the expected test

application time, expected applied test size, expected test overheating

probability, and predicted test overheating probability is completed and the

expected cost for a sub-tree could be computed using them. Therefore, the

clustering alternatives for a sub-tree topology could be evaluated using the

scaled cost (equation 4.5.1). The clustering alternatives are explored by

PSO and the best scheduled sub-tree is selected at the end. This

optimization is further discussed in the next section.

4.5.4 Sub-Tree Scheduling

As mentioned before, the schedule tree is constructed by attaching sub-

trees to unfinished trees’ leaves (See Figure 4.5.1). For this purpose, the

proper schedule for a sub-tree topology should be found. In order to

schedule a sub-tree topology which is going to be connected to the

specified leaf node of the unfinished tree ( in Figure 4.5.2) a

heuristic, as shown in Figure 4.5.4, iteratively generates alternative chip

clustering schemes and evaluates them. The evaluation is explained in

section 4.5.3 and requires the sub-trees’ edges (i.e., linear schedule tables)

to be scheduled as explained in section 4.5.2.

A chip clustering scheme for a sub-tree specifies which chips will take

which edges. The chips are specified by their cores’ errors and therefore

the problem could be seen as assigning chip clusters to the error cells

located in the -dimensional error space. The search space could be seen

as the collection of different alternatives for . For example for

a chip with two cores, the general form is and therefore, for a

sub-tree with two nodes, the solutions will be similar to the two alternatives

given in Figure 4.5.5.

A solution encoding scheme is suggested in [Aghaee11b] which labels the

error cells with chip clusters. The number of the decision variables grows

exponentially with the number of cores and therefore the computational

complexity is very high. In this thesis, we suggest a solution encoding

scheme which encodes the chip-cluster borders instead of the error cells.

For a node with succeeding chip clusters the number of decision variables

is . For chip clusters, there are chip-cluster borders and the

- value is the representative temperature error for the last cluster. Here,

the number of the decision variables grows in proportion to the number of



cores and therefore the computational complexity is much smaller

compared with the scheme suggested in [Aghaee11b].

Two examples for the suggested solution encoding for a sub-tree with only

one node, similar to sub-tree 2 in Figure 4.5.1b, are shown in Figure 4.5.6.

The solutions correspond to a SoC which has only two cores. There are

three temperature error clusters per core and the number of edges (i.e.,

number of chip clusters) in the corresponding sub-tree is two. and are

representative temperature errors for the last chip cluster. The 0-th chip

cluster in Figure 4.5.6a is larger than the 0-th chip cluster in Figure 4.5.6b.

An equivalent view-point is to compare the number of error cells which

are indexed by 0. Another equivalent viewpoint is to compare the chip-

clusters borders on the vertical axes (i.e., the third element in the solution

encodings).
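One way to read the border-based encoding is sketched below. It assumes that an error cell belongs to chip cluster k when every core's error-cluster index is at or below that core's k-th border, which is consistent with the two encodings quoted for Figure 4.5.6 but is not confirmed by the text. The function name and the placeholder strings "r1" and "r2" are illustrative.

```python
from itertools import product


def decode(encoding, n_cores, n_chip_clusters, clusters_per_core):
    """Turn a border-based encoding into error-cell labels.

    encoding: for each core, n_chip_clusters - 1 border indices followed by
              the representative error of the last chip cluster.
    Returns (labels, last_cluster_errors).
    """
    per_core = n_chip_clusters
    chunks = [encoding[c * per_core:(c + 1) * per_core] for c in range(n_cores)]
    borders = [chunk[:-1] for chunk in chunks]
    last_errors = [chunk[-1] for chunk in chunks]

    labels = {}
    for cell in product(range(clusters_per_core), repeat=n_cores):
        label = n_chip_clusters - 1          # default: the last chip cluster
        for k in range(n_chip_clusters - 1):
            # Cluster k holds the cells whose every core is at or below border k.
            if all(cell[c] <= borders[c][k] for c in range(n_cores)):
                label = k
                break
        labels[cell] = label
    return labels, last_errors


labels_a, _ = decode([1, "r1", 1, "r2"], 2, 2, 3)
labels_b, _ = decode([1, "r1", 0, "r2"], 2, 2, 3)
size_a = sum(1 for v in labels_a.values() if v == 0)
size_b = sum(1 for v in labels_b.values() if v == 0)
```

Under this reading each encoding has four decision variables (two cores times two chip clusters), whereas a cell-by-cell labeling of the same error space would need nine.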

[Figure 4.5.4: Sub-tree optimization algorithm (flowchart). Input: ALi.StTj, the j-th Sub-tree Topology to be connected to the i-th Active Leaf node of the unfinished tree. Steps: (1) initialize the swarm by generating the particles’ locations and velocities; (2) check whether the particles are in the valid range and the required chip clusters exist, and if not, fix for admissibility (explained in section 4.5.4); (3) schedule the edges of the sub-tree ALi.StTj for the different chip clustering alternatives, one per particle (explained in section 4.5.2); (4) select local and global bests (evaluated according to section 4.5.3); (5) if the convergence condition is met, report the global best as the final chip clustering and output the scheduled ALi.StTj; otherwise update the particles’ velocities using equation 4.5.17 and their locations using equation 4.5.18, and repeat the admissibility check.]

The possible solutions are then explored using particle swarm

optimization. Although PSO is introduced in section 2.7, let us briefly

review it here. A candidate solution is called a particle and is represented

by its location and its velocity. The locations are the encoded solutions and

the velocities are used to determine the next candidate solutions. Each

particle remembers its previous best location, and the swarm remembers

the global best solution, that is, the best location any of its particles has

ever visited. The previous bests and the global best are then used to give a

hint to the random velocities.

[Figure 4.5.5: Error-cell labeling alternatives. Two alternatives are for a chip with two cores and a sub-tree with two nodes. The general form is , being the node index (row). Cells (columns) are indexed by and . The labeling plan, one row per node: Alternative 2 is (0 0 0 1) and (0 1 1 1); Alternative 1 is (0 0 1 1) and (1 1 1 0).]

[Figure 4.5.6: Two examples for error-cells labeling. Error cells are labeled with chip clusters’ IDs (numbers inside the small rectangles). The solution encodings are given below the error spaces. In both panels the vertical axis shows the core 2 error clusters, indexed 0 to 2. (a) Solution encoding is [ 1, r1, 1, r2 ]. (b) Solution encoding is [ 1, r1, 0, r2 ].]

A canonical form of PSO uses equation 4.5.17, below, to update the

velocities. The coefficients in equation 4.5.17 are given as a part of the

chosen canonical form [Poli07]. and are two distinct

randomly generated numbers between 0 and 1. The location and the

velocity on the right hand side of equation 4.5.17 are the current values,

and the left hand side velocity is the next value.

(4.5.17)

Since the location, in this sub-tree scheduling problem, is a natural number,

the next location is the rounded sum of the current location and the next

velocity, as expressed in equation 4.5.18, below:

(4.5.18)
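The two update rules can be sketched as follows. Equation 4.5.17 is not legible in this copy, so the sketch assumes the widely used constriction form of PSO with the commonly cited coefficients 0.7298 and 2.05; these values, the function name, and the example particle are assumptions and may differ from those used in the thesis.

```python
import random

CHI = 0.7298   # constriction coefficient (commonly cited value, assumed)
PHI = 2.05     # acceleration coefficient for both attraction terms (assumed)


def pso_step(location, velocity, personal_best, global_best, rng=random):
    """One velocity and location update for a particle with integer locations."""
    new_velocity, new_location = [], []
    for x, v, p, g in zip(location, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()   # two random numbers in [0, 1)
        v_next = CHI * (v + PHI * r1 * (p - x) + PHI * r2 * (g - x))
        new_velocity.append(v_next)
        # The next location is the rounded sum of location and next velocity.
        new_location.append(round(x + v_next))
    return new_location, new_velocity


rng = random.Random(0)
loc, vel = pso_step([0, 3], [0.0, 0.0], [2, 3], [2, 1], rng)
```

A particle that already sits on both its previous best and the global best with zero velocity does not move, while a particle far from them receives a large velocity, which is the behavior discussed later in this section.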

There are two admissibility conditions to ensure that the particles are valid

solutions. The first condition is the valid range and the other is the presence

of required chip clusters. For example assume that the errors range from

to and therefore smaller or larger errors will never happen in

practice. If it happens that one element in the next particle’s location is

, then this particle is out of range.

An example for a required chip cluster not being present is as follows.

Assume that there are three edges in a certain node and, therefore, three

chip clusters are necessary. It may happen that in the next particle’s

location, the first and the second chip-cluster borders are assigned

identical values and therefore the second chip cluster is missing.
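The two admissibility conditions can be checked as sketched below for the borders of a single core at a single node. The helper name, the argument layout, and the numeric range are illustrative assumptions; the repair step ("fix for admissibility") is not specified in this section and is therefore not shown.

```python
def is_admissible(borders, low, high):
    """Check one node's chip-cluster borders (illustrative helper).

    borders: border values of the chip clusters, in cluster order.
    low, high: smallest and largest error that can occur in practice.
    """
    in_range = all(low <= b <= high for b in borders)
    # Equal neighbouring borders leave no room for the cluster between them.
    clusters_exist = all(a < b for a, b in zip(borders, borders[1:]))
    return in_range and clusters_exist


ok = is_admissible([2, 5], low=0, high=9)
out_of_range = is_admissible([2, 12], low=0, high=9)
missing_cluster = is_admissible([4, 4], low=0, high=9)
```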

The proposed solution encoding which is based on chip-clusters borders

works well with particle swarm optimization, since the location and

velocity in PSO’s terminology correspond to the location and velocity for

chip-clusters borders. A typical particle in the beginning is far from being

good and experiences a high velocity towards the better location since

typically the difference between the best location and the current location

is large at the beginning. Therefore a rapid convergence towards the

preferred value for the chip cluster border will take place.

Later on, a typical particle will be close to the optimal location and

according to equation 4.5.17 it will move slower, thus pinpointing the


preferred value for the chip cluster border. Some experiments, for chip

clustering optimization for sub-trees using PSO, are reported in

[Aghaee11b]. The experiments showed that the PSO performs well for this

purpose. Therefore, it is used here as a part of the proposed SoC test

scheduling technique.

4.5.5 Remarks

The proposed optimization technique is structured so that it enables

parallel implementations with different granularities. The alternative sub-

tree topologies ( in Figure 4.5.2) could be optimized in parallel.

For example, assuming one unfinished tree with two leaf nodes and three

sub-tree topologies, there will be 2 × 3 = 6 combinations to optimize in

parallel.

Furthermore, at the lower level of sub-tree scheduling, each alternative

chip clustering in PSO ( in Figure 4.5.4) could be generated

(corresponding edges being scheduled) in parallel with other alternative

schedules. The scheduling of the edges (i.e., optimizing the linear schedule

tables) is the part that requires temperature simulation (dashed-line blocks

in Figure 4.5.4). Therefore, these computationally expensive parts could

be implemented in parallel in two different nested levels.
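The outer level of parallelism can be sketched with Python's standard executor interface. The cost function is a stand-in for the actual sub-tree scheduling, and the job layout (one job per pair of leaf node and sub-tree topology) is an assumption based on the example above.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import product


def optimize_subtree(job):
    """Placeholder for scheduling one sub-tree topology at one leaf node."""
    leaf, topology = job
    cost = 10 * leaf + topology      # stand-in for the cost found by PSO
    return leaf, topology, cost


leaves = [0, 1]            # active leaf nodes of the unfinished tree
topologies = [0, 1, 2]     # candidate sub-tree topologies
jobs = list(product(leaves, topologies))

with ThreadPoolExecutor() as pool:
    results = list(pool.map(optimize_subtree, jobs))

best = min(results, key=lambda r: r[2])
```

For CPU-bound temperature simulations in CPython, a process pool (`ProcessPoolExecutor`) would typically replace the thread pool; the structure of the sketch stays the same. The inner level, one job per PSO particle, could be nested in the same way.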

The proposed adaptive approach in this thesis combines the benefits of an

online scheduling technique with the benefits of an offline scheduling

approach and avoids their shortcomings. An online schedule will introduce

very large overheads that are associated with sensor readouts, decision

making process, and pausing/resuming the tests. An offline schedule, on

the other hand, is not capable of reacting to variations but has no run-time

overheads. In a fully online approach, reading the temperature sensors for

all cores as often as it is necessary and making the corresponding decisions

based on the acquired data will cause a very large load on the test access

mechanism and will introduce large delays to the schedule. Our proposed

approach uses temperature simulations as much as possible offline and

accesses carefully-selected cores’ sensors at carefully-selected times

during the test.

There is one schedule tree for a chip that addresses all cores individually.

For example, in a linear schedule table that corresponds to an edge, it is

stated that at time cores and are being tested, while cores and

are cooling. It might be that at another time, , cores and are



being tested, while cores and are cooling. The linear schedule table

is similar to in Figure 4.2.2, but instead of the second column that shows

only one column for state (in Figure 4.2.2), there are as many state columns

as there are cores. There is only one branching table for one node in the

schedule tree (similar to in Figure 4.2.2) but it contains, in every row,

conditions that include at least one core and at most all the cores.

4.6 A Fast Temperature Simulation Approach

In order to evaluate the candidates, the test application time and the test

overheating probability should be computed as previously explained. In

order to calculate the test application time and the test overheating

probability, the temperatures of the cores are required. Therefore, for every

candidate schedule which is examined by the meta-heuristic ( in

Figure 4.5.4), temperature simulation should be performed. Temperature

simulation is in the main loop in Figure 4.5.4 which itself is in the main

loop in Figure 4.5.2. This means that the temperature simulation, which is

performed inside the optimization loop, is repeated a very large number of times.

On the other hand, the temperature simulation is the slowest step in the

iterative part of the algorithm. Therefore, the temperature simulation is the

bottleneck. It limits the number of the cores which can be handled by the

proposed method. It is also a limiting factor for the quality of

the schedules.

Since the optimization heuristic will have a time consuming process inside

its main loop, the time required to achieve a high quality schedule will be

excessively large and impractical, thus the quality might be sacrificed by

ending the optimization process prematurely. It is, therefore, important to

use a fast temperature simulation approach.

As previously discussed, the temperature simulation is based on a thermal

model and a technique to solve the model response to the given input power

profile. The input power consists of the static power and the dynamic

power. The static power depends on the chip and on the temperature, while

the dynamic power depends on the chip and on the input test sequence.

Both the static power and the dynamic power are time-variant, but for

practicality reasons, it is commonly assumed that the power is constant


during a simulation cycle³ (a discrete-time model is assumed). Therefore,

in the following we focus on a single simulation cycle in which the input

power is constant. The input power is updated with new static and dynamic

power values, based on the results of the previous simulation cycle and

then the simulation for the next cycle is performed.

The thermal model was previously discussed in section 2.6. Equation 2.6.1

is repeated below for convenience.

(4.6.1)

Assume that the thermal model consists of nodes and is the number

of cores ( ). The properties of the thermal model are encapsulated

into two matrices and . and are temperature and

power vectors. The mathematical representation of this model (equation

4.6.1) is a system of linear constant-coefficient differential equations and

therefore it is a linear time-invariant (LTI) system [Oppenheim97]. In fact

the thermal model is a linear time-invariant lumped element model and

both the heat capacities (captured in matrix ) and thermal conductivities

(captured in matrix ) are linear and time invariant.

The other part of the temperature simulation is to solve the model in order

to find its response to the input power. Usually, the simulation time is

divided into smaller intervals in which the power could be assumed to be

fixed. Then equation 4.6.1 is solved iteratively for each interval.

In order to solve equation 4.6.1 there are two distinct approaches, the

numerical approximation and the closed form solution. The numerical

approximations are usually done with very small intermediate steps, and as

a result, the complete temperature curve for the interval is constructed.

HotSpot uses the Runge-Kutta method for numerical approximation

[Huang06]. Though only the temperature at the end of the interval is

registered, many points of the temperature curve are calculated. Since we

do not need such a detailed temperature curve and we only need the

temperature at the end of the intervals, the equation is solved analytically

in order to give the temperature at the end of the intervals in a closed form.

3 Simulation cycle is explained in section 2.6.



In addition to the granularity of the temperature curve, another important

factor, which affects the simulation speed, is how frequently equation 4.6.1

is required to be solved. The scheduling technique presented in this thesis

requires a large number of simulations. Note that the system is LTI.

Moreover, the only change in the inputs (within the simulation intervals)

is a scaling of the previous inputs. Therefore, the differential equation needs

to be solved only once at the very beginning [Oppenheim97].

The responses to the scaled versions of the previous inputs are obtained by

scaling (a matrix multiplication) the previous outputs. Since the

computational cost of the scaling is less than the computational cost of

solving the equation from scratch, a method which utilizes the LTI

properties (i.e., scaling and superposition [Oppenheim97]) is faster than

the Runge-Kutta method when numerous simulations are required.

In situations where the thermal simulator is invoked quite frequently, the

input power is just being scaled from cycle to cycle, and the thermal model

is kept unchanged, the closed form solution is faster than the numerical

techniques. Therefore, we continue with the simulation approach which is

based on the closed form solution. By using Laplace transform

[Oppenheim97] and assuming that is the initial temperature vector and

is the temperature at the end of an interval, the closed form solution is

(4.6.2)

is the identity matrix of size and is the length of the interval.

Now, and matrices are defined as follows.

(4.6.3)

(4.6.4)

With the help of and , equation 4.6.2 could be written as

(4.6.5)

Equation 4.6.5 could be understood intuitively by thinking about the

system being LTI. According to the superposition principle, the effects of

the initial value and the input power will add up, thus the plus sign between

the two terms. The scaling property of the system could also be verified


rapidly, as the scaling of an input, or with a certain factor, will scale

its own effect by the same factor.

The temperature simulation is done in two phases, an initialization phase

and then the operational phase. In the initialization phase the model is

invoked and based on it and are computed (this is shown in Figure

4.5.2 in regard to the overall scheduling method). The operational phase is

the iterative computation of the temperatures for different times using

equation 4.6.5. Since the thermal model is time invariant, the initialization

is done only once at the very beginning of the design process. Throughout

the offline scheduling phase, only the iterative computations are

performed.
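The two phases can be illustrated with a minimal Python sketch. It assumes that the thermal model can be written as dT/dt = A·T + B·P with the power P held constant during each interval; the names A, B, phi, and gamma are hypothetical and are not the notation of this chapter. The matrix exponential is computed with a simple scaled Taylor series, which merely stands in for a Padé-based routine.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def expm(M, terms=18):
    """Matrix exponential via scaling and squaring with a truncated Taylor series."""
    n = len(M)
    norm = max(sum(abs(v) for v in row) for row in M)
    squarings = 0
    while norm > 0.5:            # scale M down until the series converges quickly
        norm /= 2.0
        squarings += 1
    scaled = [[v / 2 ** squarings for v in row] for row in M]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[v / k for v in row] for row in matmul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):   # undo the scaling: exp(M) = exp(M / 2^s) ^ (2^s)
        result = matmul(result, result)
    return result

def initialize(A, B, dt):
    """Initialization phase (assumed model dT/dt = A*T + B*P).
    One exponential of the augmented matrix [[A, B], [0, 0]] * dt yields
    both transfer matrices."""
    n, m = len(A), len(B[0])
    aug = [[a * dt for a in A[i]] + [b * dt for b in B[i]] for i in range(n)]
    aug += [[0.0] * (n + m) for _ in range(m)]
    E = expm(aug)
    phi = [row[:n] for row in E[:n]]     # acts on the initial temperatures
    gamma = [row[n:] for row in E[:n]]   # acts on the (constant) input power
    return phi, gamma

def step(phi, gamma, T, P):
    """Operational phase: one interval costs two matrix-vector products."""
    return [sum(f * t for f, t in zip(fr, T)) + sum(g * p for g, p in zip(gr, P))
            for fr, gr in zip(phi, gamma)]

# Hypothetical two-node model; the values are chosen only for illustration.
A = [[-2.0, 1.0], [1.0, -2.0]]
B = [[1.0, 0.0], [0.0, 1.0]]
phi, gamma = initialize(A, B, dt=0.1)            # performed once
T = [0.0, 0.0]
for P in ([1.0, 0.0], [2.0, 0.0], [0.5, 0.5]):   # power changes from cycle to cycle
    T = step(phi, gamma, T, P)                   # cheap per-cycle update
```

In this sketch the expensive exponential is evaluated once in `initialize`, while every call to `step` only multiplies and adds, which mirrors the split between the initialization and operational phases.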

In the closed form solution, the most computationally expensive part is the

matrix exponential for which is a part of equation 4.6.3. The matrix

exponential could be computed using numerical methods such as Padé

approximation [Higham05]. In fact the initialization phase for the closed

form solution includes calculating equation 4.6.3 and therefore it is very

time consuming. However, the operational phase only includes computing

equation 4.6.5 and therefore it is fast.

On the other hand, for the Runge-Kutta approach [Press07], the

initialization is fast since there is no need for computations which are as

heavy as equation 4.6.3. However, the operational phase is slow since the

equation has to be solved in many fine steps through a large number

of intermediate time instants. The conclusion is that the Runge-Kutta

method is faster for a limited number of simulations and the closed form

method is faster for a large number of simulations. The experiments in

section 4.7.1 support this statement.
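The trade-off can be made concrete with a small sketch for a single thermal node. It assumes the scalar model dT/dt = a·T + b·P with constant power over the interval; the function names and the parameter values are hypothetical. The fourth-order Runge-Kutta routine needs many sub-steps per interval to approach the value that the closed form delivers with a single evaluation.

```python
import math

def rk4_interval(a, b, T0, P, dt, substeps):
    """Integrate dT/dt = a*T + b*P over one interval with classical RK4."""
    h = dt / substeps
    T = T0
    f = lambda temp: a * temp + b * P
    for _ in range(substeps):
        k1 = f(T)
        k2 = f(T + 0.5 * h * k1)
        k3 = f(T + 0.5 * h * k2)
        k4 = f(T + h * k3)
        T += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return T

def closed_form_interval(a, b, T0, P, dt):
    """Closed form for the same interval (requires a != 0)."""
    phi = math.exp(a * dt)        # effect of the initial temperature
    gamma = (phi - 1.0) / a * b   # effect of the constant input power
    return phi * T0 + gamma * P

exact = closed_form_interval(-1.0, 1.0, 0.0, 1.0, 1.0)
coarse = rk4_interval(-1.0, 1.0, 0.0, 1.0, 1.0, substeps=1)
fine = rk4_interval(-1.0, 1.0, 0.0, 1.0, 1.0, substeps=100)
```

Here `fine` costs one hundred times as many function evaluations as `coarse`, whereas `exact` costs one exponential that could be reused for every later interval of the same length.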

4.7 Experimental Results

Two distinct contributions in chapter 4 are the temperature simulation

approach and the adaptive scheduling technique. These are experimentally

evaluated in this section. All experiments are performed on a desktop

computer with an Intel® Xeon® W3520 processor and 8 GB of memory. The

experiments for temperature simulation are presented first.

4.7.1 Fast Temperature Simulation Approach

A temperature simulation approach based on the closed form solution is

suggested in section 4.6 in order to increase the simulation speed. The


problem with numerical approximation approaches for temperature

simulation is that they are very slow for a large number of simulation cycles,

especially when there is a large number of cores. Temperature

simulations for a SoC with 100 cores and for different numbers of

simulation cycles are performed using the proposed approach and using

HotSpot [Huang07], and the CPU times are plotted in Figure 4.7.1a.

The numerical approximation approaches, such as the one used by

HotSpot, perform faster than the suggested approach for a small number of

simulation cycles. But for simulations longer than 1700 cycles, the

proposed approach is faster than HotSpot, as shown in Figure 4.7.1a. In

general, this difference increases at a rate close to 0.011 seconds per

simulation cycle and it reaches a CPU time difference of 100 seconds for

10000 simulation cycles. This is important since, for every edge in every

candidate schedule tree, temperature simulation is performed for the

number of test cycles plus cooling cycles.
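As a rough consistency check of these figures, the CPU time gap can be approximated by a straight line that starts at the crossover point. The sketch below is only a linear extrapolation under that assumption; the function name and the default arguments are hypothetical and simply restate the numbers quoted above.

```python
def cpu_time_gap(cycles, crossover=1700, rate=0.011):
    """Approximate CPU time saved (in seconds) by the closed form approach,
    assuming the gap grows linearly beyond the crossover point."""
    return (cycles - crossover) * rate

gap_at_10000 = cpu_time_gap(10000)  # (10000 - 1700) * 0.011 seconds
```

The linear estimate gives about 91 seconds at 10000 cycles, which is of the same order as the roughly 100 seconds read from the measurements.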

Temperature simulations are performed using the proposed approach and

using HotSpot [Huang07] for 10000 simulation cycles for different

numbers of cores, and the CPU times are plotted in Figure 4.7.1b. In

general, the CPU time difference increases rapidly with the number of

cores and the difference reaches 100 seconds for 100 cores. This is also

important, since achieving a good schedule in reasonable time becomes

infeasible with a small increase in the number of cores, when the slower

approach is in use.

Figure 4.7.1 CPU times for temperature simulation using HotSpot and the suggested approach. The simulations are performed (a) for 100 cores for different

numbers of simulation cycles and (b) for 10000 simulation cycles for different numbers of cores.

[Plot omitted: CPU time [sec] (0 to 180) versus (a) simulation cycles (0 to 10000) and (b) number of cores (0 to 100), for HotSpot and the suggested approach.]

4.7.2 Adaptive Test Scheduling Technique

The proposed adaptive SoC test scheduling technique is experimentally

evaluated in this section. The first set of experiments is performed on SoCs

with different numbers of cores and the CPU times are reported. Then,

experiments are done for ITC’02 [Marinissen02] benchmark chips with

random test switching activities generated using a Markov chain similar to

[Yao11c]. Finally, an experiment is performed for the d695 benchmark

chip from ITC’02 with real switching activities based on real test data from

[Samii06]. The costs of the test schedules and the test schedule sizes are

reported for the last two sets of experiments. The experimental setup is

briefly introduced at the beginning and then the results are presented.

The static power is computed using the temperature dependent model

given in [Liao05]. The temperature simulations are performed using the

approach proposed in section 4.6. The spatial temperature error is assumed

to have normal distribution ranging from to with

a resolution of . The temporal temperature error is also

assumed to have a normal distribution ranging from to

with a resolution of . It is assumed that there are twenty

temperature error clusters .

The balancing coefficient is assumed to be equal to ten . It is

assumed that each entry in a linear schedule table occupies 64 bits and each

entry in a branching table per core per edge occupies 32 bits. For example,

a node with two succeeding edges for a SoC with two cores occupies

2 × 2 × 32 = 128 bits.
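The memory accounting described above can be written down as a short sketch. The function names are hypothetical, and the sketch assumes that a branching table holds exactly one 32-bit entry per core per succeeding edge and that a linear schedule table holds only 64-bit entries.

```python
LINEAR_ENTRY_BITS = 64   # one entry in a linear schedule table
BRANCH_ENTRY_BITS = 32   # one branching-table entry, per core and per edge

def branching_table_bits(edges, cores):
    """Bits occupied by the branching table of one node."""
    return edges * cores * BRANCH_ENTRY_BITS

def linear_table_bits(entries):
    """Bits occupied by a linear schedule table with the given number of entries."""
    return entries * LINEAR_ENTRY_BITS

node_bits = branching_table_bits(edges=2, cores=2)
```

Under these assumptions, a node with two succeeding edges in a two-core SoC needs 2 × 2 × 32 bits for its branching table.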

The first set of experiments is performed on a number of SoCs with

different numbers of cores, ranging from five to 50 cores. Markov chains are

used to generate random test switching activity sequences having random

averages and random lengths. The experiments are performed for at least

five randomly generated sets of tests for each chip and the average CPU

times are reported in Table 4.7.1. Note that even for a 50-core SoC, the

CPU time remains in an affordable range.

Table 4.7.1 CPU times for SoCs with different number of cores

Number of Cores 5 10 15 20 25 30 35 40 45 50

CPU time [Sec] 9 46 52 132 208 308 590 762 1141 1367


The second set of experiments is performed on ITC’02 SoCs with

randomly generated test switching activities similar to the first set of

experiments but this time tests for a chip have constant power averages and

length. The proposed technique is compared with the two methods

proposed in [Aghaee10]. The first one is an Offline method which uses

only one linear schedule and the other is a Hybrid method which selects a

linear schedule (out of a set of pre-generated schedules) only once during

the test process. The test costs offered by the Offline and Hybrid methods

proposed in [Aghaee10] and by the proposed technique in this chapter are

computed using the metric given in equation 4.3.4 and are reported in Table

4.7.2.

Column 1 is the name of the ITC’02 circuits. Columns 2 and 3 are the costs

(based on equation 4.3.4) for schedules generated by Offline and Hybrid

approaches proposed in [Aghaee10], respectively. The costs of the

schedules generated by the proposed adaptive approach are reported in

column 4 in Table 4.7.2. The percentage reductions in cost achieved by the

Hybrid and adaptive approaches compared with the Offline approach are

reported in columns 5 and 6, respectively. Column 7 is the percentage

reduction in cost achieved by the proposed adaptive approach compared

with the Hybrid approach. On average, the adaptive method proposed here reduces the

cost by 76% over the Offline method and by 43% over the Hybrid method.

This demonstrates the advantage of the proposed adaptive method.

Table 4.7.2 Test cost for test scheduling techniques

ITC’02     Costs                         Percentage reduction      Percentage reduction
chips                                    relative to the Offline   relative to the Hybrid
           Offline   Hybrid   Proposed   Hybrid   Proposed         Proposed
a586710    1.44      0.56     0.54       61       62               4
d281       0.69      0.45     0.03       35       96               93
d695       0.50      0.12     0.06       76       88               50
f2126      2.71      1.39     0.51       49       81               63
g1023      5.09      4.27     1.99       16       61               53
h953       0.46      0.14     0.11       70       76               21
p22810     1.22      0.70     0.69       43       43               1
p34392     0.75      0.72     0.06       4        92               92
p93791     1.02      0.13     0.08       87       92               38
q12710     1.32      0.40     0.23       70       83               42
t512505    0.48      0.23     0.13       52       73               43
u226       1.05      0.43     0.37       59       65               14
Average                                  52       76               43

The ATE memory occupied to store the schedules (i.e., the schedule size)

is reported in Table 4.7.3. The cost reduction comes with an increase in the

schedule size because of the increased number of linear schedules and

branching tables, which consume ATE memory space. The average

increase in schedule size compared to Offline is 87% for Hybrid and 308%

for the proposed adaptive method. When compared to Hybrid, the average

schedule size increase for the proposed method is 117%. The increase in

the usage of ATE memory (as given in Table 4.7.3) refers only to the

memory space used to store the schedule. This is usually small, compared

with the memory space used to store the test patterns. Therefore a large

increase in the schedule size is very likely to be translated into a small

increase in the usage of the ATE memory as a whole.

The proposed scheduling method will utilize the available ATE memory

even if a very small reduction in cost (e.g., from 0.70 to 0.69 for p22810 in

Table 4.7.2) is achieved. Since the number of nodes contributes to the

scaled cost function (equation 4.5.1), a larger schedule will not be

generated (e.g., 195% larger for p22810 in Table 4.7.3 compared with

the Hybrid solution) if it does not reduce the cost compared with a smaller

schedule.

Table 4.7.3 ATE memory utilized only for schedule

ITC’02     Utilized memory for           Percentage increase       Percentage increase
chips      schedule [bit]                relative to the Offline   relative to the Hybrid
           Offline   Hybrid   Proposed   Hybrid   Proposed         Proposed
a586710    1216      1888     4768       55       292              152
d281       1088      1280     2624       18       141              105
d695       1280      2176     3392       70       165              54
f2126      704       960      2368       36       236              147
g1023      576       1088     4480       89       678              312
h953       576       1088     1472       89       155              35
p22810     704       1888     5568       168      691              195
p34392     832       1472     2688       77       223              83
p93791     704       1920     3136       173      345              63
q12710     640       1024     1664       60       160              62
t512505    1152      2336     3712       103      222              59
u226       320       672      1568       110      390              133
Average                                  87       308              117

The ATE memory constraint will affect the quality of the adaptive test

schedules. The proposed algorithm will not generate even an offline

schedule when the available memory is too small to accommodate it. By

increasing the available ATE memory, first an offline schedule and then a

hybrid schedule will be generated. With the further increase of the

available memory, better schedules with lower costs will be generated.

This trend continues until the cost reaches a minimum beyond which

further cost reduction is impossible. The minimum cost is usually dictated

by the branching overheads (time to read sensors and react accordingly).

The reduction of the cost with the increase of the memory limit is shown

in Table 4.7.4. The memory limit is increased in eight steps. It is expected

that the increase in the memory limit improves the cost before it reaches

the saturation limit. The saturation limit for this set of experiments is equal

to 1320. Memory sizes and limits for the schedule are given in bytes. The

CPU time increases in general with the increase of memory limit. This

trend continues even if the cost is not improved (after the saturation) since

the algorithm has more space to search and thus it takes more time. The

costs and sizes are normalized to the first working schedule (row 4) and

reported in columns 4 and 5 of Table 4.7.4.
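The saturation behaviour can be illustrated with a small sketch that picks the cheapest schedule fitting a given memory limit. This is not the tree-construction algorithm of section 4.5; it only replays the three distinct (cost, size) pairs that appear in Table 4.7.4, and the function name is hypothetical.

```python
def best_schedule_within(limit, candidates):
    """Return the lowest-cost (cost, size) pair whose size fits the limit,
    or None when even the smallest schedule does not fit."""
    feasible = [c for c in candidates if c[1] <= limit]
    return min(feasible, key=lambda c: c[0]) if feasible else None

# (cost, size in bytes) pairs taken from Table 4.7.4.
candidates = [(3.3875, 460), (2.9389, 920), (2.7170, 1320)]

choices = {limit: best_schedule_within(limit, candidates)
           for limit in (300, 500, 750, 1000, 1250, 1500, 1750, 2000)}
```

For limits of 1500 and above the selection no longer changes, since no candidate larger than 1320 bytes exists in this set of results.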

The last experiment is performed on d695 (one of the ITC’02 chips) using

the real test switching activities. The costs and schedule sizes are reported

in Table 4.7.5. The Hybrid method improves the cost compared to the Offline

method by 59% while the proposed adaptive technique achieves a

reduction of 71%. The proposed technique improves the cost by 30% over

the Hybrid method. The schedule size for the proposed method is 169%

and 49% larger than Offline and Hybrid, respectively. As we expected, the

improvement in cost and the increase in the schedule size are in the ranges

suggested before by the second set of experiments.

Table 4.7.4 Costs and utilized memory volumes for different memory limits

Memory   Results
limit    Cost     Size   Cost (%)   Size (%)   CPU time (H:M:S)
300      Aborted, memory limit is too tight    1:03:42
500      3.3875   460    100.00     100.00     3:15:21
750      3.3875   460    100.00     100.00     3:34:20
1000     2.9389   920    86.76      200.00     3:41:03
1250     2.9389   920    86.76      200.00     3:48:47
1500     2.7170   1320   80.21      286.96     3:53:52
1750     2.7170   1320   80.21      286.96     3:59:12
2000     2.7170   1320   80.21      286.96     4:04:16

As previously mentioned, the effect of increased schedule size on the total

consumed ATE memory is small. For example consider the experiments

with the d695 chip with real switching activities. The size of the schedule

for the adaptive solution is approximately 7 Kbit while the test size is

approximately 1324 Kbit. Therefore the percentage increase in total

utilized ATE memory from the offline solution to the adaptive solution is

0.34%. This means that the adaptive method achieves 71% reduction in

cost relative to the offline method, with a small expense of 0.34% increase

in the occupied ATE memory.
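The 0.34% figure can be reproduced from the schedule sizes in Table 4.7.5 with the following arithmetic sketch. It assumes that 1 Kbit equals 1000 bits and that the total ATE memory is the test data plus the schedule; both are assumptions made here for illustration.

```python
offline_schedule_bits = 2688    # Table 4.7.5, Offline
adaptive_schedule_bits = 7232   # Table 4.7.5, Proposed
test_data_bits = 1324 * 1000    # approximately 1324 Kbit (1 Kbit = 1000 bits assumed)

total_offline = test_data_bits + offline_schedule_bits
extra_bits = adaptive_schedule_bits - offline_schedule_bits
increase_percent = 100.0 * extra_bits / total_offline
```

With these numbers the additional 4544 bits of schedule correspond to roughly a third of a percent of the total occupied memory.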

4.8 Adaptive Multi-Temperature Testing

As previously discussed in section 3.5, temperature-dependent defects are

a challenge for achieving high test quality for advanced SoCs. The existing

multi-temperature test scheduling methods optimize the test schedule for

the shortest test application time while making sure that the tests are

applied inside the specified temperature ranges [He10, Yao11b]. These

methods neglect the temperature deviations that are mainly caused by

process variation. Therefore, a large process variation implies a decreased

number of chips that are tested within the specified temperature ranges,

which will reduce the effectiveness of the tests and, in the worst case, may

lead to damage of the chips due to overheating.

In order to maximize the chances that the tests are applied within the

intended temperature ranges, static schedules should be designed

pessimistically. In this case, a large process variation implies a very long

test application time due to the intensive use of the heating and cooling

intervals. This means that the chips under test are heated up/cooled down

more than actually needed in order to make sure that the majority of the

chips are warm/cold enough. This is similar to the discussions about

the safety margins in section 3.3. A detailed discussion and analysis of

safety margins can be found in [Aghaee10].

Table 4.7.5 Cost and ATE memory utilized for schedule for d695

                                     Offline   Hybrid   Proposed   Percentage change         Percentage change
                                                                   relative to the Offline   relative to the Hybrid
                                                                   Hybrid   Proposed         Proposed
Cost                                 20.84     8.53     5.93       -59      -71              -30
Utilized memory for schedule [bit]   2688      4992     7232       +86      +169             +49

The test application time for multi-temperature testing is much longer than

that of normal testing and therefore the test cost is higher [He10, Yao11b].

This becomes a serious cost issue, in particular in situations where the normal

test application time is already very long, as it is for advanced SoCs. The

proposed methods in [He10, Yao11b] provide satisfactory results when the

temperature at a certain test cycle could be assumed to be identical for all

chips of the same design.

However, advanced SoCs manufactured with deep submicron technologies

are likely to have different temperatures at the same test cycle because of

process variation. The negative effect of temperature variations on the

thermal safety of the SoCs during test is addressed by the scheduling

methods proposed in the previous sections. These methods try to limit the

cores’ maximum temperatures so that the test damages caused by

overheating during the test process are minimized. Similar techniques can

be applied in the context of multi-temperature testing. We have proposed

a technique to generate test schedules so that the tests have a large

likelihood of being applied at the correct temperatures [Aghaee14b]. Here,

we briefly explain the methodology.

As mentioned before, the adaptive multi-temperature testing is similar to

the thermal-safe approach introduced in this chapter. The key difference is

that heating stimuli and cooling intervals are used to bring the temperature

inside the required range. Only then can the tests be applied. Due to testing,

the temperature may exceed the high limit. In this case, the testing is

paused at an appropriate moment and then cooling intervals are introduced.

On the other hand, if the temperature falls below the low limit, testing is

paused and a heating sequence is introduced, instead.
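The decision described in this paragraph can be summarized in a minimal sketch. The function name, the action labels, and the temperature values are hypothetical; the actual technique in [Aghaee14b] works with pre-computed schedules rather than a per-reading rule like this one.

```python
def next_action(temperature, low_limit, high_limit):
    """Choose what to do for a core given its current temperature and the
    temperature range required by the test."""
    if temperature > high_limit:
        return "cool"   # pause testing and insert a cooling interval
    if temperature < low_limit:
        return "heat"   # pause testing and apply a heating sequence
    return "test"       # inside the required range: apply the test

# Hypothetical readings for a test that must run between 60 and 90 degrees.
actions = [next_action(t, 60.0, 90.0) for t in (45.0, 75.0, 95.0)]
```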

The thermal-aware techniques only support one temperature limit, which is

the overheating limit; exceeding it adds to the overall cost of testing by

increasing the number of overheated chips. On the other hand,

multi-temperature techniques have to consider the upper limit and the lower limit

of the temperature interval characteristic to each test. Exceeding these

limits results in test escapes which means that some defective chips may

not be detected. This new contributor to the testing costs is defined in

[Aghaee14b] and is added to the costs already defined in section 4.3.

Having to handle a low temperature limit adds to the complexity of the

techniques presented in section 4.5. Representative temperature error is

introduced in section 4.5.2. Every chip cluster is represented by a dedicated

representative temperature error. In fact this representative value is defined

and optimized with regard to the high temperature limit. Therefore, having

an additional low limit means that another representative is required with

regard to this low limit.

The proposed adaptive multi-temperature technique is explained in detail

in [Aghaee14b] and is supported by experiments. The overall cost that

captures costs related to the test application time, overheating, and

out-of-required-range testing is minimized. Required ATE memory, CPU times,

and cost dependency on the amount of process variation are also reported

in [Aghaee14b].

4.9 Remarks

Although the proposed adaptive techniques are developed to handle PV-

related temperature errors, they can be adopted to handle other non-ideal

situations. Such situations may happen during in-field testing, where the

initial and ambient temperatures (among other parameters like voltage)

may vary. Temperature data acquired using on-chip (or on-board) sensors

helps to select the most appropriate linear schedule and minimize the costs.

The temperature error that is explained in section 4.4 is discussed in more

detail here. In order to distinguish between the effects of the process

variation and other undesirable thermal effects, four different temperatures

can be defined. The first one is the expected temperature, that is, the

temperature of a normal chip which is not affected by undesirable thermal

effects (including process variation). The expected temperature is an

abstract concept and its exact value cannot be acquired. The second one

is the simulated temperature, that is, the temperature computed by simulation.

The aim of simulation is to compute the expected temperature and

therefore, ideally, the simulated temperature is equal to the expected

temperature. The third one is the actual temperature, that is, the real-world

temperature of the chip. Its exact value is usually impossible to acquire due to

measurement errors. The fourth and last one is the measured temperature, that

is, the temperature measured using temperature sensors.

Based on the above definitions, three different temperature errors can be

identified. The first one is the simulator error, that is, the difference between

the expected temperature and the simulated temperature. The inaccuracies

in the thermal model and in the algorithms on which the simulator is based

contribute to this error. The second one is the measurement error, that is, the

difference between the actual temperature and the measured temperature.

The inaccuracies in the sensor technologies contribute to it. The third and

last one is the variation error, which is defined as the difference between the

actual temperature and the expected temperature. This error has various

sources including process variation, ambient temperature fluctuations, and

voltage variations.

Even though the temperature simulator errors and sensor measurement

errors are not addressed explicitly in this thesis, in practice when the

temperature error model is being tuned empirically, a considerable amount

of these errors will also be covered. There still might be small residual

errors which are not captured by the temperature error model. These small

residual errors are addressed by introducing a small safety margin (e.g., a

slightly lower overheating limit than the actual overheating limit is used in

practice). The effect of this small safety margin on cost is negligible as

discussed in [Aghaee10].

The focus of this chapter is process variation, which mainly contributes to

the variation error, and therefore in this thesis we focus on this category of

errors. To avoid these complications, we usually consider the temperature

error as the difference between the expected temperature, which is

estimated by simulation, and the actual temperature, which is measured by

sensors. That is, we assume that the actual temperature and the measured

temperature are equivalent. Moreover, we assume that the simulated and

expected temperatures are equivalent. Nevertheless, our proposed

approaches can be used to address other errors like the simulator errors and

the measurement errors, although they are not explicitly designed for

addressing these types of errors.
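The relation between the four temperatures and the three errors can be sketched as follows. The sign conventions (for example, simulator error as simulated minus expected) and all numeric values are assumptions made only for this illustration; in practice the expected and actual temperatures cannot be observed directly.

```python
from dataclasses import dataclass

@dataclass
class Temperatures:
    expected: float    # abstract, not observable in practice
    simulated: float   # output of the thermal simulator
    actual: float      # real-world value, not observable exactly
    measured: float    # sensor reading

    @property
    def simulator_error(self):
        return self.simulated - self.expected

    @property
    def measurement_error(self):
        return self.measured - self.actual

    @property
    def variation_error(self):
        return self.actual - self.expected

    @property
    def observed_error(self):
        """The simplified temperature error: sensor reading minus simulation."""
        return self.measured - self.simulated

t = Temperatures(expected=70.0, simulated=71.0, actual=75.0, measured=74.5)
recombined = t.variation_error + t.measurement_error - t.simulator_error
```

Under these conventions the observed error equals the variation error plus the measurement error minus the simulator error, which is one way to see why an empirically tuned error model also absorbs part of the simulator and sensor inaccuracies.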

4.10 Conclusions

This chapter mainly presents an adaptive SoC test scheduling technique to

deal with spatial and temporal temperature deviations, caused by process

variations in deep submicron technologies. Mitigating the negative

variation effects on the multi-temperature testing, reported in

[Aghaee14b], is similar to the thermal-safe testing and therefore is just

briefly discussed above.

The key contribution of this chapter is an algorithm to generate a set of

efficient test schedules, each corresponding to a different thermal behavior

of different cores during test. The on-chip temperature sensors are used to

monitor the actual temperatures of the different cores and to guide the

selection of the corresponding test schedules accordingly, during the test.

This way, the overall test efficiency will be improved considerably.

The proposed technique consists of two distinct algorithms, the test

scheduler and the thermal simulator. The temperature-aware test scheduler

is a constructive algorithm which generates tree-based test schedules by

putting the optimized sub-trees together. Sub-tree optimization is basically

a chip-clustering algorithm which involves a linear test scheduling

algorithm. A new sub-tree scheduling algorithm is proposed here. The

linear scheduling algorithm requires a thermal simulator in its main loop.

A fast temperature simulation approach is proposed in order to speed up

the temperature-aware test scheduling algorithm.

The proposed adaptive test scheduling technique generates process-

variation and temperature aware test schedules for SoCs with a large

number of cores. The algorithm has a relatively short run-time and

generates high quality test schedules. The proposed technique has been

experimentally evaluated using a number of experiments including ITC’02

benchmark SoCs.


4.11 Notations and Abbreviations

Notation Description

Capacitances vector in the thermal model

Applied Test Size

Resistances vector in the thermal model

Balancing Coefficient

The table that determines with which linear schedule table a

specific chip should be tested. (See the example in section 4.2)

Number of cores

Chip cluster A group of chips with similar thermal behavior that are tested

with the same Linear schedule table. A chip cluster corresponds

to an edge in the schedule tree.

Chip-cluster border The border line between two Chip clusters. For two adjacent

Chip clusters the border is a set of natural numbers, each

corresponding to an individual core. A border represents a

particular error value. (See section 4.5.4)

Chip clustering Finding the optimal partitioning of the -dimensional error

space into an already known number of Chip clusters for the

nodes of a tree. (See full explanation in section 4.5.4)

Cost of the Test Facility per time unit

Expected Applied Test Size

(temperature) Error-clusters Borders

Error Cell Change Probabilities

ECCP before being normalized

Error-Cell Labeling

Error-Cells Probabilities

ECP just after branching

ECP just after overheating


ECP after temporal changes (according to temperature error

model)

Error cluster A range of error values which are to be treated as one single error

value. Error clusters are separated by Error-clusters Borders, EB.

Expected Test Application Time

Expected Test Overheating Probability

Effective Test Time per Second

High Temperature Threshold

Identity matrix

The point that the output is equal to the input and not equal to one,

in the proposed reciprocal limiter.

Number of temperature error clusters

Linear schedule table A schedule that specifies stop/start times for the test of each and

every core, individually. This will correspond to an edge or to a

single Chip cluster. (See the example in section 4.2)

Leaves’ Overheating Probabilities

Number of temperature error values

Number of nodes in a tree

Nodes’ Applied Test Sizes

Normalized Cost Function

Node’s Cluster Label

Node’s not Overheating Probability

Node A node in the schedule tree that corresponds to the ending of a

Linear schedule table (i.e., a place that branching is possible).

Nodes’ Probabilities

expected Number of Partial Trees, similar to the current partial

tree, that are required to construct the complete schedule tree



Nodes’ Test Application Times

Normalized Test Throughput

Power vector

Partial cost function NCF evaluated for a part of the schedule tree (e.g., a sub-tree).

Price of One Chip

Particle Swarm Optimization

Predicted Test Overheating Probability

Number of leaf nodes

Number of succeeding edges for a node

Scaled Cost Function that is used to select the unfinished trees

out of a group of offspring trees.

Simulated Temperature

Spatial Temperature Error Probabilities

Test Access Mechanism

Test Application Time

The period for the discrete-time temperature error model. The

error values are updated regularly with a frequency equal

to . (See section 4.4)

Temperature Error Values

Test Handling Time

Test Overheating Probability

Test Throughput

Temporal Temperature Error Probability

Number of nodes in the thermal model


Transfer matrix for initial temperatures

Transfer matrix for power values

Temperatures vector in thermal model

Initial temperatures

Temperatures at the end of the interval of size t

Temperatures at t-th time sample (in section 4.6)

Temperature of w-th thermal node

The output of the proposed limiter, applied on the expected

number of partial trees, .

Chapter 5 Temperature-Gradient Based

Burn-In and Test Scheduling

Large temperature gradients (e.g., temperature difference between two

adjacent cores) exacerbate various types of defects including early-life

failures and delay faults. The capability to detect these temperature-

gradient induced defects is crucial for advanced SoCs. In particular, 3D-

SICs exhibit considerably larger temperature gradients compared with

normal ICs (for example, three times is reported in [Plas10]) and therefore

temperature-gradient based test is crucial for them.

The gradients are captured and represented by temperature maps. This

chapter presents schedule based techniques to enforce temperature maps

on the IC. A temperature map specifies the temperatures for different sites

(e.g., cores) in the IC at a given time-point. It usually specifies the high and

the low temperature limits for each site. Alternatively, the intermediate

temperatures (half-way from low limit to high limit) can be used to

represent a temperature map, in particular if the difference between the high

and low limits is similar for all sites.
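A temperature map and its intermediate-temperature representation can be sketched with a small data structure. The site names, the limit values, and the function names are hypothetical and serve only to illustrate the two representations.

```python
def midpoint_map(limit_map):
    """Collapse a map of (low, high) limits per site into intermediate
    temperatures, half-way between the low and the high limit."""
    return {site: (low + high) / 2.0 for site, (low, high) in limit_map.items()}

def gradient(temperature_map, site_a, site_b):
    """Temperature difference between two sites of a map."""
    return abs(temperature_map[site_a] - temperature_map[site_b])

# Hypothetical map for two adjacent cores, limits given as (low, high).
limits = {"core1": (80.0, 90.0), "core2": (40.0, 50.0)}
intermediate = midpoint_map(limits)
core_gradient = gradient(intermediate, "core1", "core2")
```

Because both sites in this example have the same distance between their low and high limits, the intermediate temperatures carry the same gradient information as the full map.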

5.1 Introduction

5.1.1 Test for Early-Life Failures

Burn-in is a common way of accelerating and detecting early-life failures

and it should be done with low cost in a reasonably short time. For this

purpose, usually, the dies are operated at elevated temperature and voltage.

The elevated temperature and voltage speed up the aging and wear

mechanisms so that the dies experience their early life before testing. The

wear mechanisms that are speeded up include metal stress voiding and

electromigration, metal slivers bridging shorts, as well as gate-oxide wear-

out and breakdown [Semenov03].

Recently, several studies have, however, shown that some wear

mechanisms are speeded up more efficiently by large temperature

gradients rather than the high temperature itself. A temperature-gradient

induced wear mechanism is identified in [Smorodin08] which shows that

a metal layer elevation develops rapidly on the sites that experience large

temperature gradients. Moreover, in the atomic flux equation that models

the electromigration, temperature gradient is present directly and also

indirectly through its effect on the mechanical-stress gradient [Pak11].

Therefore, a burn-in process that has not created the appropriate thermal

scenarios will not sufficiently speed up the formation of the defects and,

consequently, such early-life defects will go undetected. In order to prevent

these test escapes, it is necessary to introduce a burn-in process that

enforces appropriate temperature scenarios on the IC. This necessity is

more urgent for the ICs that suffer from large temperature gradients, such

as 3D-SIC.

3D-SIC technology, similar to other deep submicron technologies, suffers

from high power densities. Additionally, power densities are considerably

higher in the test mode compared to the functional mode, in particular for

core-based designs. Consequently overheating may damage the ICs under

test. This means that the application of test stimuli to ICs can raise their

temperatures beyond their tolerable limits. This often undesirable effect is,

however, utilized in this thesis to heat up the IC for burn-in.

In our case the stimuli are not necessarily actual test patterns. Instead, they

could be specially generated sequences which cause large switching

activities. Such stimuli are called heating sequences. The use of the heating

sequences to heat up the IC from inside means that special equipment for

heating the IC from outside is not necessary. This will lead to a large

reduction in cost, and also allow for the generation of the necessary

temperature gradients.

Some temperature gradients might be enforced on an IC by applying

appropriate inputs to the IC’s input ports in the functional mode. This

might work, to some extent, for 2D ICs, since from the functional point of

view all the required circuitry, including the input ports, is fabricated and

available when the IC enters the test process. For 2D ICs, there are usually

two possible stages for burn-in: Wafer-Level Burn-In (WLBI) which is

performed before packaging and Die-Level Burn-In (DLBI) performed

after packaging [Semenov03]. For 3D-SIC, however, there are more

stages, including pre-bond, mid-bond, post-bond, and final stages

[Taouil12].

The existence of test stages before the IC is fully assembled is a key

difference between the 2D and 3D-SIC burn-in processes. In the case of

3D-SIC, using input ports in the functional mode may benefit burn-in for the

post-bond and the final stages similar to 2D ICs. But for the pre-bond or

mid-bond stages, the inputs to the die or partially stacked dies are not

necessarily the inputs to the IC. The input ports to the unit under test for

3D-SICs, before the final bonding, are likely to include a number of TSVs.

The TSVs and test equipment are not designed to support simultaneous

application of functional signals, particularly to a large number of TSVs

(even though they might be designed to allow simple electrical tests for the

TSV itself). Therefore, the use of the IC’s ports for enforcing the

temperature gradients is not possible for the pre-bond and mid-bond stages.

Despite this lack of access in the functional mode, the TAM provides access to

the cores in the test mode [Ieee14a]. Therefore, the heating sequences

could be applied using the TAM in order to enforce the desired gradients.

The necessity to utilize the TAM has yet another reason that is not specific

to 3D-SICs. The thermal gradients in some maps might be placed in

locations that cannot be properly stimulated through functional input ports.

Such thermal maps can often be enforced if the TAM is used. The reason

is that the TAM, in the test mode, provides direct access to cores; while in

the normal operational mode, a core might be limited to receive inputs only

from an adjacent core. Therefore, heating could be targeted toward a

specific core using the TAM.

5.1.2 Test for Delay Faults

Advanced SoCs manufactured by 3D-SIC technology suffer from a

considerably larger number of delay faults as compared with previous

technologies [Deutsch12]. The causes for these delay faults include

resistive bridges and vias, power droops, and cross-talk noise effects.

Therefore, delay-fault testing is necessary to provide sufficient fault

coverage [Patil07, Raina07]. A large number of pre-bond TSV defects are

resistive in nature and, moreover, the mechanical stress caused by TSVs

contributes also to delay faults [Chakrabarty12, Deutsch12]. Therefore, the

expected number of delay faults for 3D-SIC is much larger than that of 2D

ICs.

Since temperature has a significant effect on delay, its impact should be

taken into account for delay-fault test. A very important effect of

temperature on signal integrity is its effect on the clock network [Bota04].

Delay faults usually occur because of increased clock skew and a major

contributor to skew in 3D-SICs is temperature gradient [Mondal07]. Since

propagation delays depend on temperature, different temperatures on

different sites (i.e., temperature gradients) result in different clock skews.

Temperature gradients may reach up to 50 °C in adjacent cores for normal

operation and even higher during test [Borkar03, Bota04, Mondal07]. Such

large temperature gradients may lead to considerable clock skew and thus

many delay faults.

Moreover, the difference between the temperature maps during the normal

functional operation and temperature maps during test will result in non-

realistic delay faults [Bota04]. These delay faults usually happen because

of increased clock skew. Therefore, in order to detect the realistic delay

faults during the test, the test should be performed when the die has a

temperature map which corresponds to a normal functional situation.

In order to test a die under the thermal conditions that correspond to reality,

a simple technique is to operate the chip with realistic inputs so that the

temperature map is created disregarding the test. The test is then started and

continues as long as the actual thermal map stays within an acceptable difference

from the specified thermal map. When the difference grows larger than

accepted, the test is halted and the specified thermal map is re-created

disregarding the test. Apart from being slow, this scenario has another

problem in case of 3D-SIC.

As discussed in the previous section, usually a die in a 3D stack has a large

number of TSVs as its input ports. The TSVs and test equipment are not

expected to be designed to support simultaneous application of realistic

signals, particularly to a large number of TSVs. Therefore, it is not possible

to use the IC’s real functional inputs to create the specified thermal map

for pre-bond and mid-bond tests. However, the test access mechanism can

be utilized for this purpose. This will be further discussed in section 5.2.1.


Besides creating realistic gradient scenarios to avoid test overkills, certain

unreal¹ scenarios may help to detect certain early-life defects before

causing further costs. As previously discussed, in the normal operational

mode such unreal scenarios may not be achievable since not all involved

cores are accessible. However, in the test mode, TAM may provide access

to these involved cores.

As mentioned before, the temperature gradients in 3D-SICs are much

larger than in 2D ICs [Plas10]. This will exacerbate temperature-gradient

related issues including delay faults, in particular, for 3D-SIC. Therefore,

the associated tests should be performed when the proper temperature

maps are enforced. A temperature map specifies the appropriate

temperatures for different sites (e.g., cores) in the IC. These temperatures

are to be realized simultaneously in order to enforce the proper temperature

gradients. The temperature maps are given along with their corresponding

tests. Besides the gradient-based burn-in, the other objective of this chapter

is to introduce a technique to apply the tests while the corresponding maps

are enforced on the IC.

5.2 Temperature-Gradient Based Burn-In

5.2.1 Motivation and Problem Description

As discussed earlier, a temperature map specifies the desired temperature

values for different sites (e.g., cores). The temperature maps are to be given

by the user, who studies the typical temperature-gradient induced failure

mechanisms in an IC analytically or experimentally [Pak11, Smorodin08].

Each map corresponds to a particular temperature condition of an IC, such

as large temperature differences between adjacent cores (i.e., large

temperature gradients), that can accelerate aging for early-life failures or

enlarge the delay-fault effects so that they can easily be tested for. There

might also be some locations in the ICs such that their temperatures are not

important regarding the targeted defects. Such locations are indicated as

don’t-cares. Even though they are marked as don’t-cares, their temperature

should, however, be kept below the overheating limit (denoted by

) in order to prevent damage.

1 Unreal gradients are scenarios that are not expected to happen during field

operations. The opposite is “realistic gradients” that happen during normal

operations.

When the expected locations in the IC simultaneously have the temperature

values that are specified by a map, that temperature map is said to be

enforced. The specified temperature maps should be enforced quickly. In

case of burn-in, the temperatures should then be maintained for a given

period of time to achieve the intended effect; for test, they should be

maintained as long as the corresponding tests are being performed.

Usually, there are many temperature maps that one would like to achieve

and maintain. Therefore, it is important to achieve them rapidly whether

the ICs start from the ambient temperature or from another map. The order

of the maps has a considerable impact on the overall burn-in/test time and

will be discussed in-depth later on in this chapter. For the time being, we

assume that the order of the maps is given and focus on other aspects of the

problem. In our work, a temperature map will be achieved by using heating

sequences sent through the TAM. Moreover, it is assumed that no test is

applied when an IC is kept under a temperature map for burn-in. This

assumption will be relaxed in section 5.3 so that the tests can be applied

when an IC is kept under a temperature map.

Assume that there are M modules in an IC (on one or multiple dies) and

their tests can be started and stopped independently (e.g., the modules are

cores with core wrappers in a core-based design). In order to enforce the

specified temperature maps, heating sequences are used to heat up some of

the modules. The average power of the heating sequence is given by a real

number, denoted by for module . It is assumed that

the TAM only affords W (a positive integer number) modules to be tested

simultaneously.

Assume that the desired temperature map is specified by a low temperature

limit and a high temperature limit for each module and the don’t-care

modules are declared separately. For example, a temperature map specifies

that module has a low temperature limit equal to and a high

temperature limit equal to .

The inputs to the proposed method include temperature maps, the IC’s

temperature model, the IC’s electrical model (e.g., specification of the

TAM and power-related specifications), switching activities of the heating

sequences, ambient temperature ( ), and overheating limit



( ). The output is a schedule that guides the application of the

heating sequences to the modules so that their temperatures move into the

specified ranges and stay there.

As an example, consider an IC with 3 modules, m0, m1, and m2. Assume

that a temperature map is specified as , ,

, , , and , and no module is specified

as don’t-care. These temperature limits are shown in Figure 5.2.1a with

dashed/dotted lines.

A temperature simulation is performed based on a proper periodic schedule

and the simulated temperatures are shown in Figure 5.2.1a. Starting from

the ambient temperature ( ), the modules’ temperatures

steadily rise until they are inside the specified ranges. As shown in this

example, applying heating sequences can drive the modules of an IC into

a high temperature situation. For example, the temperature of module

has reached at around Time Units (TU). A TU consists of

test cycles in this example.

Figure 5.2.1 Temperature curves for an example: (a) temperatures [°C] of modules m0, m1, and m2 versus time [×10^4 TU]; (b) magnified view of three periods, showing the temperature and the heating on/off signal, with the intervals 0, 1, and 2 of one period marked; (c) one period, from t0 to t3

The temperatures around the TU point, are amplified and shown

in Figure 5.2.1b. The time interval shown in Figure 5.2.1b corresponds to

three periods of the schedule. Since the schedule is periodic, one period

captures the entire schedule which is repeated in a cyclic manner. Figure

5.2.1c further magnifies and shows one period of the schedule that starts at

t0 and ends at t3. The length of the period for this schedule is denoted by T

( ). One period is divided into three intervals, specified by

numbers 0, 1, and 2 in Figure 5.2.1b.

They correspond to the time intervals³ [t0 t1], [t1 t2], and [t2 t3] in Figure

5.2.1c, respectively. The schedule specifies that the heating sequence for

module is applied only in the [ ] interval, the [ ]

interval, and in general in [ ] intervals (

), assuming that the process starts at time . The application of

the heating sequences for module and module are specified in a

similar manner by the schedule.

For the [t0 t3] period, the time intervals in which the heating sequences are

applied are depicted by gray areas in Figure 5.2.1c. In this example, the

TAM provides access to one module at a time (W = 1). Therefore, in

interval [ ] only module receives a heating sequence. Similarly, in

[ ] only is heated and the same goes for interval [ ] for . We

need an efficient algorithm to generate such schedules.

5.2.2 Steady State Solution

Let us first analyze a simplified situation, where we assume that a steady

state power could be provided for the modules. In this case, there exists a

steady state solution that could generate and maintain the specified

temperature map.

Providing continuous steady state powers simultaneously for all modules

is, however, very likely to be impossible mainly due to TAM limitations.

One solution is to use the maximal practical power for each core in

combination with a Pulse Width Modulation (PWM) technique. Therefore,

the best that can be achieved is a discrete stimulus sequence that has

constant long-term average power with small ripples. This way, the

modules have a time-divided multiple access to the TAM.

In order to reduce the risk of out-of-range temperatures due to ripples in the

input power, the desired steady state temperatures are defined at the middle

of the specified ranges . Such ripples could be seen

3 The notation [a b] is used to represent an interval that ranges from a to b.


in the temperature curves given in Figure 5.2.1. In order to find the power

values that result in the specified temperatures, the IC’s temperature model

should be analyzed.

As previously discussed, the temperature model works by dividing an IC

into elements represented by nodes. Each node has a heat capacitance

modelling its thermal capacity. Adjacent nodes are connected through a

heat resistance that models the thermal conductivity between them. They

are connected together in a network configuration, similar to an electric

circuit. The temperatures correspond to voltages and the heat dissipation

corresponds to a current source. A node is called active if it directly

receives electrical power caused by switching activities.

A 3D-SIC is usually laid out so that the main blocks (e.g., logic and

memory) are placed at a certain distance from TSVs to avoid

undesirable effects induced by TSVs such as high mechanical stress. Such

forbidden areas are called Keep-Out-Zones (KOZ) [Chakrabarty12,

Deutsch12]. A collection of the TSVs placed next to each other (perhaps

to overlap the KOZ of different TSVs and save area on the die) is called a

TSV block. A TSV block may consist of only one TSV if the TSVs are

placed far apart.

In this section (section 5.2.2) it is assumed that a module is a single active

thermal node. Furthermore, it is assumed that TSV blocks are always

thermally don’t-care and do not dissipate heat (are passive thermal nodes)

since their drivers are placed together with the corresponding modules.

These assumptions will be relaxed in section 5.2.4. The temperature

equation (equation 2.6.1) is repeated below for convenience:

(5.2.1)

Like before, is the temperature vector and is the power vector. Heat

transfer among nodes is included in the temperature model and it means

that a node can be heated up by its neighboring nodes even if it has no

switching activities.
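As a rough illustration of this heat-transfer effect, the sketch below integrates a tiny two-node network in which only node 0 receives power. It assumes the common RC form C·dθ/dt = P − G·θ with temperatures measured as rises above the ambient, which may differ in notation and detail from equation 5.2.1; the function name, the forward-Euler integration, and all numeric values are hypothetical and chosen only for illustration.

```python
def simulate(cap, cond, power, theta0, dt, steps):
    """Forward-Euler integration of C * dtheta/dt = P - G * theta.

    Temperatures are rises above ambient (an assumption of this sketch).
    cap:  heat capacitance of each node
    cond: conductance matrix G, given as a list of rows
    """
    theta = list(theta0)
    n = len(theta)
    for _ in range(steps):
        # Net heat flow into each node: injected power minus conducted heat.
        net = [power[i] - sum(cond[i][j] * theta[j] for j in range(n))
               for i in range(n)]
        theta = [theta[i] + dt * net[i] / cap[i] for i in range(n)]
    return theta


# Hypothetical network: node 0 is active, node 1 is passive (no power).
CAP = [1.0, 1.0]
COND = [[0.75, -0.25],
        [-0.25, 0.75]]
final = simulate(CAP, COND, power=[10.0, 0.0], theta0=[0.0, 0.0],
                 dt=0.01, steps=5000)
print(final)  # node 1 ends up above ambient although it dissipates nothing
```

In this toy network the passive node settles at one third of the active node's temperature rise, purely because of the coupling conductance between the two nodes.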

The specified temperature map consists, in fact, of the steady state

temperatures that must be enforced on the IC for a while. A temperature

map could be thought as the targeted steady state temperatures, , which

are composed of the desired steady state temperatures for each module

(e.g., for module ). Since is, in this case, equivalent to the steady

state temperatures, which are considered constant (for a certain amount of

time), its derivatives are zero (no variation in time). Therefore, equation

5.2.1 (similar to equation 3 in [Aghaee14b]) may be written as

(5.2.2)

This means that it is possible to calculate the required powers that lead to

the specified temperature map. In order for the specified temperature map

to be achievable, the computed steady state power values must satisfy a

feasibility and a schedulability condition. The first part of the feasibility

condition is that the computed steady state power for module ( )

should be larger than or equal to the stray power dissipated by the module.

The stray power is an unintended part of the power that could not be

independently controlled. Its value for module is denoted by . It

consists of the leakage power in addition to the clock networks’ power. As

previously discussed, the clock networks’ power can be large [Oberg03].

Therefore, it is important to take it into account.
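The steady state computation and the first part of the feasibility condition can be sketched as follows. The sketch assumes that, with the time derivative set to zero, the thermal model reduces to P = G·θ for a conductance matrix G and ambient-relative temperatures θ; this form, the helper names, and the numbers are assumptions made for illustration rather than the thesis's own formulation.

```python
def steady_state_powers(cond, theta_ss):
    """Return P = G * theta for ambient-relative target temperatures."""
    n = len(theta_ss)
    return [sum(cond[i][j] * theta_ss[j] for j in range(n)) for i in range(n)]


def meets_stray_floor(p_ss, p_stray):
    """First part of the feasibility condition: P_ss >= stray power."""
    return all(p >= s for p, s in zip(p_ss, p_stray))


# Hypothetical two-module example (conductances and targets are made up).
COND = [[0.75, -0.25],
        [-0.25, 0.75]]
powers = steady_state_powers(COND, theta_ss=[60.0, 30.0])
print(powers)
print(meets_stray_floor(powers, p_stray=[5.0, 5.0]))
```

A module whose required power falls below its stray power cannot be cooled any further by the schedule, which is why that map would be rejected as infeasible.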

The second part of the feasibility condition is that should be less than

or equal to the average power of the corresponding heating sequence, ,

plus . Therefore, the feasibility condition is:

(5.2.3)

Usually the feasibility condition is easily met if the specified temperature

map is realistic (e.g., the specified temperature is neither lower than the

ambient nor larger than the achievable temperature). Assuming that

equation 5.2.3 is satisfied, the schedulability condition which is related to

the limited TAM bandwidth should be verified. The challenging problem

here is to create the required average power values, , using the available

TAM bandwidth. This is done by selectively applying the heating

sequences to the modules.

The continuous application of the heating sequence generates an average

dynamic power equal to . The desired power values, , which are

smaller than , are created by applying the heating sequence, ,

for a fraction of a time period. The average power in a period should be

made equal to the required steady state power. As mentioned before, this

is done using a technique similar to PWM. The ratio of the duration of

heating sequence application to the overall time period is therefore called

Duty-cycle ( ) and its value is calculated using the following equation.


(5.2.4)

The duty-cycles might not be achievable if their values are relatively large

and if the TAM does not provide sufficient bandwidth. For example,

assume a design with two modules, with the duty-cycles D0 = 0.6 and

D1 = 0.8. This means that in a period of time equal to 1, we need access to

module 0 for 60% of the time and access to module 1 for 80% of the time.

Therefore, simultaneous access to more than one module (0.6 + 0.8 = 1.4

modules) is needed. This means that the TAM must provide simultaneous

access to these two modules; otherwise, these duty-cycles are not

schedulable and the specified temperature map cannot be enforced.

Note that can be divided into pieces; for example could be

implemented by first applying the heating sequence for a duration equal to

at the beginning of the period and then for a duration of

at the end of the same period. The feasibility and schedulability

conditions could be written together using the duty cycle concept as

follows:

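$$
\begin{aligned}
& 0 \le D_m \le 1, \quad \forall m \in \{0, 1, \dots, M-1\} \\
& \sum_{m=0}^{M-1} D_m \le W
\end{aligned}
\qquad (5.2.5)
$$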

In fact, the first line in equation 5.2.5 is identical to the feasibility condition

in equation 5.2.3, which is written here in terms of the duty cycles. The

second line in equation 5.2.5 is the schedulability condition, where W is

the number of modules that can access the TAM simultaneously. Given a

temperature map that satisfies both feasibility and schedulability

conditions, it is relatively simple to develop a schedule to deliver the

required duty cycles.
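The duty cycle computation and the two conditions can be sketched as below. The relation used, duty cycle = (steady state power − stray power) / heating sequence power, is inferred from the description of the feasibility condition above; the function and parameter names are hypothetical, and the power values are chosen only so that the duty cycles match the 0.6 and 0.8 of the two-module example.

```python
def duty_cycles(p_ss, p_stray, p_heat):
    """Duty cycle per module: share of the period the heating sequence runs."""
    return [(p - s) / h for p, s, h in zip(p_ss, p_stray, p_heat)]


def is_feasible(duty, tol=1e-9):
    """Every duty cycle must lie between zero and one."""
    return all(-tol <= d <= 1.0 + tol for d in duty)


def is_schedulable(duty, tam_width, tol=1e-9):
    """The duty cycles must fit into W simultaneous TAM accesses."""
    return sum(duty) <= tam_width + tol


# Hypothetical powers giving duty cycles of 0.6 and 0.8.
duty = duty_cycles(p_ss=[4.0, 5.0], p_stray=[1.0, 1.0], p_heat=[5.0, 5.0])
print(duty, is_feasible(duty))
print(is_schedulable(duty, tam_width=1))  # 1.4 modules needed, only 1 available
print(is_schedulable(duty, tam_width=2))
```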

Figure 5.2.2a gives an illustrative example, where the available

parallelism, W, provided by the TAM is represented by the number of rows

that could be filled with duty-cycles, s ( ). The scheduling

algorithm starts by sorting the duty-cycles and then allocating them from

the largest one to the smallest ones by filling the rows from the lowest one

upwards. Note that if a duty-cycle starts in the lower row and continues to

an upper row, it will not reach the end of the upper row, since D_m ≤ 1.

Therefore, a duty cycle will, at most, be assigned to two TAM rows and a

module needs to switch at most twice during a period. The overheads

associated with switching are thus negligible. The fractions of the time

period that the modules receive heating sequences are illustrated in

Figure 5.2.2b. At every moment in time only three modules are receiving

their heating sequences (the TAM limitation is not exceeded), and the

average power of the applied heating sequence for a module in a period is equal to

the specified steady state power.
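The row-filling procedure described above can be sketched as a wrap-around allocation. This is a minimal illustration under the assumption that time is normalized to one period and that rows are indexed from the lowest one upwards; the function name and the output layout are hypothetical. The input values are those of the example in Figure 5.2.2 (W = 3, M = 4).

```python
def wrap_around_schedule(duty, rows):
    """Allocate duty cycles to TAM rows, largest first, lowest row first.

    Returns {module: [(row, start, end), ...]} with times in units of one period.
    """
    eps = 1e-12
    if any(d < 0 or d > 1 for d in duty) or sum(duty) > rows + eps:
        raise ValueError("duty cycles are not feasible or not schedulable")
    # Sort modules by decreasing duty cycle (ties keep their original order).
    order = sorted(range(len(duty)), key=lambda m: duty[m], reverse=True)
    schedule = {m: [] for m in order}
    row, pos = 0, 0.0
    for m in order:
        remaining = duty[m]
        while remaining > eps:
            chunk = min(remaining, 1.0 - pos)  # fill what is left of this row
            schedule[m].append((row, pos, pos + chunk))
            pos += chunk
            remaining -= chunk
            if pos >= 1.0 - eps:               # row is full, move one row up
                row, pos = row + 1, 0.0
    return schedule


plan = wrap_around_schedule([1.00, 0.75, 0.75, 0.50], rows=3)
for module, pieces in sorted(plan.items()):
    print(module, pieces)
```

Because no duty cycle exceeds one, the two pieces of a module that wraps to the next row never overlap in time, so each module is switched on at most twice per period.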

As mentioned before, a thermal map may leave the temperatures for some

nodes unspecified (don’t-care nodes). Besides, the temperatures for

inactive thermal nodes (e.g., TSV blocks) are also left unspecified. On the

other hand, in order to compute the steady state powers, these temperatures

should also be known. The proper choice of temperatures for the don’t-

care nodes may help a thermal map that is otherwise not schedulable

become schedulable. The problem of finding proper temperature values for

the don’t-care nodes could be formulated as a Linear Programming (LP)

problem. Since we are more interested in knowing the duty cycles than the

temperatures, the problem formulation is, then, written with the duty cycles

as decision variables, as shown in Figure 5.2.3. The main objective is to

find a feasible solution.

In Figure 5.2.3, the temperatures, , should have the values specified by

the thermal map, (line 4). If not specified by the temperature map (e.g.,

don’t care modules or inactive nodes) the temperatures should be between

the ambient and the overheating temperature (line 5). For an inactive

module, the power value should be equal to the stray power, , and

therefore the duty cycles should be zero (line 6). For an active node, the

duty cycles are between zero and one (line 7) according to equation 5.2.5.

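Figure 5.2.2 An example for scheduled duty-cycles (W = 3, three rows; M = 4, four modules; m: module; T: the period)

(a) Duty-cycles in sorted order, D0 = 1.00, D1 = 0.75, D2 = 0.75, and D3 = 0.50, allocated to the TAM rows over one normalized period:

Top row: D2 from 0.00 to 0.50, D3 from 0.50 to 1.00

Middle row: D1 from 0.00 to 0.75, D2 from 0.75 to 1.00

Bottom row: D0 from 0.00 to 1.00

(b) Heating intervals of modules m=0 to m=3 over one period, from t0 to t0+T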

Besides, the duty cycles should satisfy the schedulability condition

according to equation 5.2.5 (line 8 in Figure 5.2.3). The relation between

the power values, , and the duty cycles is defined by equation 5.2.4.

The temperatures, , are computed based on power values, , using

equation 5.2.2 (by replacing and with vectors composed of

and , respectively). If the LP solver finds a feasible solution, then the

thermal map is achievable and the duty cycles are returned by the LP

solver. We also have the temperature values of the don’t-care modules.

Knowing the duty cycles, a proper period for the PWM-like method has to

be found.
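Since the content of Figure 5.2.3 is described only in words here, the following block restates the formulation as a sketch. The symbols $\Theta_m$ (temperature of node $m$), $\Theta^{SS}_m$ (value specified by the map), $\Theta^{A}$ and $\Theta^{OH}$ (ambient and overheating temperatures), $P_m$, $P^{H}_m$, and $P^{S}_m$ (steady state, heating sequence, and stray powers), and $\mathbf{G}$ (conductance matrix of the thermal model) are notation assumed for this illustration and may differ from the original figure. The last constraint additionally assumes a steady state model of the form shown, with temperatures taken relative to the ambient.

```latex
\begin{aligned}
\text{find}\quad & D_m,\ \Theta_m \\
\text{subject to}\quad
 & \Theta_m = \Theta^{SS}_m && \text{temperatures specified by the map (line 4)} \\
 & \Theta^{A} \le \Theta_m \le \Theta^{OH} && \text{don't-care and inactive nodes (line 5)} \\
 & D_m = 0 && \text{inactive nodes (line 6)} \\
 & 0 \le D_m \le 1 && \text{active nodes (line 7)} \\
 & \textstyle\sum_{m} D_m \le W && \text{schedulability (line 8)} \\
 & P_m = D_m P^{H}_m + P^{S}_m && \text{duty cycle to power (line 9)} \\
 & \mathbf{G}\,(\boldsymbol{\Theta} - \Theta^{A}\mathbf{1}) = \mathbf{P} && \text{steady state thermal model}
\end{aligned}
```

All constraints are linear in $D_m$ and $\Theta_m$, which is why a plain LP solver is sufficient; any feasible point yields usable duty cycles together with temperatures for the don't-care nodes.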

Finding an Appropriate Period for PWM-Based Schedule

The duty cycles and the scheduling approach, discussed so far, are

independent of the schedule’s period, T. They generate the modules’

temperatures such that their average equals the specified steady state

temperatures. The period, T, should be short enough so that the fluctuations

in the temperatures do not violate the specified limits ( and ). On the

other hand, a longer period is desirable in order to minimize the switching

actions in the schedule. An example for the results obtained by the

proposed algorithm could be seen in Figure 5.2.1a. After the temperatures

have completed their transitions to their new values (after TU),

the proper choice of the period keeps them inside the specified ranges, with

a relatively low number of switching actions in the schedule.

Figure 5.2.3 Linear programming formulation (lines 1 to 9: decision variables, objective, and constraints)

In order to find a period, T, that is relatively long yet keeps the

temperature fluctuations inside the specified ranges, two different

situations should be considered: (H) the heating sequence is applied (e.g., the

second half of the period for module in Figure 5.2.2b); and (L) no stimuli

are applied (e.g., the first half of the period for module in Figure

5.2.2b). In order to estimate the proper period for situation (H), equation

5.2.1 is re-written around the steady state temperature for the heating

sequence power, as shown in equation 5.2.6a. For situation (L), equation

5.2.6b is used, instead.

(5.2.6a)

(5.2.6b)

An illustrative example for the above equations is given in Figure 5.2.4.

Equation 5.2.6a describes the tangent line that touches the temperature

curve at point A, around the steady state temperature. A similar example

for equation 5.2.6b is the tangent line, CD, in Figure 5.2.4. Equation 5.2.6a

is then used to estimate the desired value for the period focusing only on

the high temperature limit. Assume that the proper , only focusing on

situation (H) and ignoring situation (L), is denoted by . Similarly, the

proper , only focusing on situation (L), is denoted by .

It is safe to assume that ( is the duty cycle) is the amount of

time that will result in a near violation situation for module in situation (H). In order to estimate , first the derivative on the left side of equation

5.2.6a is linearly approximated as follows:

(5.2.7)

Now, is computed for module as

(5.2.8)

Figure 5.2.4 An example for the computation of a safe period. A long but safe period is computed so that the temperature limits will not be violated any more. (Plot of the temperature of module m versus time, with the heating sequence on/off signal, the tangent lines AB and CD, and the time spans D_m × T_m^H and (1 − D_m) × T_m^L)

The values for are obtained from the right side of equation 5.2.6a

and, consequently, the values for are computed using equation 5.2.8.

For example, in Figure 5.2.4, when the module is receiving active power,

the derivative that is represented by a straight line is tangential to the

temperature curve at its intersection point with the steady state temperature

at point A and later on intersects with the high temperature limit at point

B. The period, , is then calculated based on the time difference between

A and B. The other part of the line that stands between A and the low

temperature limit is deliberately left out in order to achieve a shorter period

that is safe in most of the situations (e.g., variation in the input power).

In a similar manner, values for situation (L), T_m^L, are calculated based on

equation 5.2.6b focusing only on the low temperature limit. Since the

temperatures should not violate any of the specified limits, the shorter of the

two, T_m = min(T_m^H, T_m^L), is selected as the acceptable period for module m.

The actual period, T, should be the smallest among the acceptable periods

for all modules (T = min over m of T_m) so that none of the temperature limits for

the modules is violated. For example, after the temperatures have

completed their transitions to their new values in Figure 5.2.1 (after

time units), the proper choice of the period keeps them inside the

specified ranges, despite relatively large fluctuations caused by the relatively

low number of switching actions in the schedule.
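The period selection described above can be sketched as follows. The sketch takes the tangent slopes at the steady state temperature as given inputs (in the text they come from the right sides of equations 5.2.6a and 5.2.6b, which are not reproduced here) and applies the linear approximation: the tangent may run for D_m × T_m^H before touching the high limit, and for (1 − D_m) × T_m^L before touching the low limit. The function name, argument layout, and numbers are hypothetical.

```python
def safe_period(duty, slope_on, slope_off, theta_ss, theta_low, theta_high):
    """Return (T, per-module periods) from the tangent-line approximation.

    slope_on:  temperature slope while the heating sequence is applied (> 0)
    slope_off: temperature slope while no stimuli are applied (< 0)
    """
    inf = float("inf")
    per_module = []
    for d, s_on, s_off, ss, lo, hi in zip(duty, slope_on, slope_off,
                                          theta_ss, theta_low, theta_high):
        # Situation (H): heating for d * T must not cross the high limit.
        t_h = (hi - ss) / (d * s_on) if d > 0 and s_on > 0 else inf
        # Situation (L): cooling for (1 - d) * T must not cross the low limit.
        t_l = (ss - lo) / ((1 - d) * -s_off) if d < 1 and s_off < 0 else inf
        per_module.append(min(t_h, t_l))
    return min(per_module), per_module


period, per_module = safe_period(
    duty=[0.5, 0.25],
    slope_on=[2.0, 4.0], slope_off=[-1.0, -2.0],
    theta_ss=[90.0, 60.0], theta_low=[80.0, 55.0], theta_high=[100.0, 65.0])
print(period, per_module)
```

Taking the minimum over both situations and over all modules makes the most restrictive limit decide the period, matching the selection rule in the text.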

Moreover, the average of the applied heating sequences for each module is

equal to the specified steady state power for it. For example in Figure

5.2.1c, modules , , and receive 50, 35, and 15 percent of ,

, and plus , , and , respectively. This is indicated by the

width of the gray areas as compared with the schedule’s period, (

).

5.2.3 Transient Solution

Until now, it was assumed that the power values applied to an IC during

transition to a new map are the same steady state powers that are used to

maintain the new map afterwards. This implies that the transition to a new

map is very slow and the transition time may be excessively long. For

example, as shown in Figure 5.2.1, it takes about time units for

the IC to reach the specified thermal map from the ambient temperature.

Here, in this section, burn-in time is the time required for bringing the IC

into a thermal situation that complies with the first thermal map and then

to the next map, until all maps are applied. It is likely that a large number

of thermal maps are specified and therefore the transition to a new map

should happen very fast. After transition, the map is maintained using the

steady state powers, , as calculated in the previous section. In order to

reduce the burn-in time, a new solution that takes the transient response

into account and uses larger or smaller power values (compared to the

steady state solution) is presented here.

In this section, the transient response is taken into account while

minimizing the overall transition time. We start by looking into the analytic

solution for equation 5.2.1. This was previously discussed in section 4.6.

The closed-form solution for a duration of time equal to , is copied below

from equation 4.6.5:

(5.2.9)

In the above equation, and are matrices that are computed based

on and , and for a duration of time equal to , as follows (similar to

equation 4.6.3–4):

(5.2.10a)

(5.2.10b)

In the rest of this chapter and are represented as and ,

respectively. The initial temperatures are expressed by and the

temperatures at time is denoted by . is the power vector that is

assumed to be constant for the time interval . An intuitive explanation of

equation 5.2.9 is that determines how fast the initial temperatures fade

away and determines how fast the input power affects the temperatures.
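The intuition behind the closed-form solution can be seen in the one-node special case, where the two matrices collapse to scalars. The sketch below assumes a single thermal node with thermal resistance R and heat capacitance C, ambient-relative temperatures, and constant power over the interval; it is not the matrix form of equation 5.2.9, and the names and values are hypothetical.

```python
import math


def transient_temperature(theta0, power, t, r_th, c_th):
    """Temperature rise after time t for one RC node under constant power.

    theta(t) = decay * theta0 + r_th * (1 - decay) * power,
    where decay = exp(-t / (r_th * c_th)).
    """
    decay = math.exp(-t / (r_th * c_th))
    # The first term is the fading initial temperature,
    # the second term is the growing contribution of the input power.
    return decay * theta0 + r_th * (1.0 - decay) * power


# Hypothetical node: R = 2, C = 3, starting 40 degrees above ambient.
for t in (0.0, 6.0, 60.0):
    print(t, transient_temperature(40.0, 5.0, t, r_th=2.0, c_th=3.0))
```

With a larger power the same target temperature is crossed earlier, which is the effect the transient solution exploits to shorten the transition time.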

As mentioned before, achieving a new temperature map in a short time is

crucial and, therefore, this transition should happen as fast as possible.

Once the IC’s temperatures have converged to the specified temperature

map, they can be maintained using the steady state powers, , found by

the steady state solution as presented in the previous section.

We would like to extend the steady state solution approach to equation

5.2.9, which includes the transient response, in order to find the


schedulable power values that result in the shortest transition time. The

new problem can be formulated as:

Find the shortest transition time, , and the corresponding power

values, , such that the specified map is achievable.

The transition time from map to map is defined as the time required

to construct the temperatures specified by map starting from

temperatures specified by map .

This problem can be solved using an iterative approach that tries different

alternatives for . The main part of the proposed algorithm is illustrated in

Figure 5.2.5. The algorithm uses the latest information regarding the

interval that contains the optimal transition time. This interval is denoted

by [λ σ]. At any step, it is known from the previous steps that the specified

map is not achievable for transition times shorter than λ.

It is also known that, since the temperature map is achievable for a

transition time equal to σ, longer transition times are not optimal. Initially,

λ is set to zero and σ to the transition time for the steady state approach (1st

step in Figure 5.2.5). This steady state transition time is obtained by

simulating the temperatures when the steady state schedule is used. A

number of candidate transition times with uniform distances are selected

between λ and σ (2nd step in Figure 5.2.5) according to:

(5.2.11)

The -th candidate transition time is denoted by . is the number of

parallel LP solvers and its value is selected based on the degree of

parallelism offered by the platform that runs the algorithm. For example,

for a machine that supports eight threads, eight is a reasonable choice for

. For each candidate , solving the LP formulation determines whether

the temperature map is achievable or not (3rd step in Figure 5.2.5). This is

represented by the Boolean variable, , for the -th candidate transition

time.

The value of σ is updated to be equal to the smallest candidate that leads to

schedulable power values. The value of λ is updated to be equal to the

largest candidate that leads to power values that are not schedulable (4th step in

Figure 5.2.5). Note that if for all the candidate transition times, denoted by

in Figure 5.2.5 ( ) the map is achievable, then

λ remains unchanged. On the other extreme, if none of the candidates are

schedulable, then σ remains unchanged. The algorithm stops when the

smallest transition time is found with acceptably low error (i.e., as shown

in the conditional step in Figure 5.2.5). The error is bounded by (σ – λ) and

therefore if this difference is smaller than the specified limit, , then the

actual error, too, will be smaller than .
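
The interval-narrowing loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in the thesis: `is_achievable` is a hypothetical stand-in for one LP feasibility check, the candidates are assumed to be spaced evenly and strictly between the two bounds (the exact form of equation 5.2.11 may differ), and the checks run sequentially here, whereas the algorithm would dispatch them to parallel LP solvers.

```python
def minimal_transition_time(is_achievable, steady_state_time,
                            num_solvers=8, limit=1e-3):
    """Narrow the interval [lower, upper] until it is shorter than `limit`.

    is_achievable(t) is a hypothetical stand-in for one LP feasibility check:
    it returns True if the temperature map can be reached within time t.
    """
    # 1st step: lower bound is zero, upper bound is the steady state time.
    lower, upper = 0.0, steady_state_time
    while upper - lower >= limit:
        # 2nd step: evenly spaced candidates inside the interval
        # (assumed spacing; the thesis defines it in equation 5.2.11).
        step = (upper - lower) / (num_solvers + 1)
        candidates = [lower + (i + 1) * step for i in range(num_solvers)]
        # 3rd step: one feasibility check per candidate (sequential here).
        results = [is_achievable(t) for t in candidates]
        # 4th step: tighten the bounds using the outcomes.
        feasible = [t for t, ok in zip(candidates, results) if ok]
        infeasible = [t for t, ok in zip(candidates, results) if not ok]
        if feasible:
            upper = min(feasible)
        if infeasible:
            lower = max(infeasible)
    return upper


# Toy oracle: the map is achievable for any transition time of at least 3.7.
best = minimal_transition_time(lambda t: t >= 3.7, steady_state_time=10.0)
```

With a monotone oracle, each round shrinks the interval by a factor of `num_solvers + 1`, which is why the number of candidates is matched to the available degree of parallelism.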

The problem formulation for the LP solver that is used in the 3rd step in

Figure 5.2.5, is similar to the LP formulation in the previous section

(Figure 5.2.3) with the following differences: (1) Instead of s, the

temperatures at the end of the transition time, s, are used. (2) Instead of

equation 5.2.2, equation 5.2.9 is used to calculate the temperatures based

on the power values. The relation between the power values and the duty

cycles defined by equation 5.2.4 is modified by replacing with and

used as indicated in line 9 in Figure 5.2.3. If the LP solver finds a feasible

solution, the temperature map is achievable. This information is then used

to update the λ and σ values.

Since during the transition the temperatures will not be in the specified

ranges, the period for the PWM-like method is not crucial, unlike in the

steady state solution. Therefore, it is sufficient that the period is much

smaller than the transition time, , so that the average power is a

meaningful quantity for this span of time. For the experiments, the steady-

state-solutions’ periods are also used for the transient solution (they are

much smaller than ).

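[Figure 5.2.5: Main algorithm for the transient solution. Flowchart: (1) set λ to zero and σ to the transition time for the steady state solution; (2) select the candidate transition times; (3) run the parallel LP solvers; (4) set σ to the smallest candidate whose result is TRUE and λ to the largest candidate whose result is FALSE. If (σ − λ) is below the specified limit, σ is the minimal transition time; otherwise the steps are repeated.]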

The matrix exponent computation for , in equation 5.2.10, is performed

using techniques proposed in [Ukhov12]. These techniques are used in

order to speed up the repeated recalculations of and for alternative

transition times. They are based on eigenvalue decomposition, utilizing the

inherent properties of matrices and and replace the excessively time

consuming matrix exponent calculations in equation 5.2.10 with simpler

operations. Although these techniques speed up the calculations, the

required time is still very large, as experimentally shown in section 5.2.6.
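
The idea behind such a speed-up can be illustrated with a generic identity. The block below is a sketch only: the symbols M, V, Λ, and μ are illustrative assumptions and do not correspond to the notation of equation 5.2.10 or to the exact procedure of [Ukhov12]. For a diagonalizable matrix, the decomposition is computed once, and only scalar exponentials need to be re-evaluated for each alternative transition time t:

```latex
M = V \Lambda V^{-1}, \quad \Lambda = \mathrm{diag}(\mu_1, \dots, \mu_n)
\;\Longrightarrow\;
e^{M t} = V \, \mathrm{diag}\!\left(e^{\mu_1 t}, \dots, e^{\mu_n t}\right) V^{-1}
```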

Even though the transient solution is an intuitive extension of the steady

state solution and greatly outperforms it, it is slow in generating the

schedules. Therefore, a new approach that avoids the time-consuming

successive calculations of and is necessary. Such an approach is

proposed in the next section, based on a fast heuristic. Moreover, this new

approach is capable of handling a more realistic problem formulation

compared with the steady state and transient solutions.

5.2.4 Transient-Based Heuristic

So far, it has been assumed that it is possible to apply a heating sequence to

an arbitrarily selected active thermal node and, simultaneously, avoid

application of heating sequences to all other nodes. This implies that the

smallest element in the temperature model should not be smaller than the

corresponding module on the TAM, in order to be able to control the

heating sequence application to it independently from all other elements.

On the other hand, a temperature model with finer granularities might be

preferable in order to achieve a better spatial precision in the simulated

temperatures and perhaps the gradients. This way, the temperature maps

can be planned with a higher resolution. Therefore, a technique that allows

the modules to be further divided into a number of sub-modules is

advantageous. These sub-modules correspond to a higher number of nodes

in the temperature model.

Let us assume that the overall number of thermal nodes, denoted by , is

larger than or equal to the number of modules ( ). In the rest of this

chapter, the desired temperature maps are specified for the thermal nodes

instead of the modules. Consequently, the temperature map specifies that

node has low temperature limit equal to and high temperature limit

equal to ( ).


In this new context, the switching activities for heating sequences are more

specific and provide information concerning the power breakdown among

active thermal nodes. For example, assuming that module is divided into

two active thermal nodes and , instead of only one heating sequence for

module , there will be two heating sequences corresponding to these two

nodes.

The average power of a heating sequence for active node is represented

by . The other active node of that module (i.e., node ) may also receive

power, denoted by . Therefore, when trying to heat up node with

, node is also heated by . Similarly, when trying to heat up node

with , node is also heated by . Such a situation cannot be

handled by the techniques previously proposed.

Furthermore, power dissipation for TSV blocks is now supported, and the

TSV drivers/buffers may be placed in TSV blocks and their desired

temperatures might also be specified in the temperature maps (not always

don’t-care, as assumed in the previous sections).

The proposed technique allows longer heating intervals during transition

time as opposed to relatively shorter heating intervals during steady state

(assuming that the new map’s temperature is higher). This relatively long

application of the heating sequence is called boosting. Boosting of an

active node stops when the node reaches the Stop Boosting temperature,

. The stop boosting temperature may be higher than the high

temperature limit, , but it is always lower than .

Boosting is helpful in different ways. One way is to achieve the following

desirable scenario. Assume that the node is initially heated beyond

( ). Then the node does not need to receive a heating sequence for

a while and this leaves the TAM available for other nodes. Meanwhile, the

temperature keeps decreasing naturally and just before the end of the

transition time (the moment that all other nodes are in their specified

temperature ranges), the temperature drops below the high temperature

limit.

This simplifies and shortens the schedule for the transition period and,

therefore, is desirable. An example for the temperature curves when the

transient-based heuristic is used is given in Figure 5.2.6 for thermal node

. The overall transition time is indicated by the gray area. The temperature



of node passes through the valid temperature range already in the interval

(a) in Figure 5.2.6. But the termination of the transition interval is deferred

since at least one of the other nodes, when is in the valid temperature

range, is outside its valid range.

A node’s temperature will naturally decrease if no power or little power is

applied to it, but it should not fall below the low temperature limit.

Therefore, a heating sequence should be applied at some point, before the

temperature falls out of range. This point is marked with a temperature

level named Heating Trigger and denoted by for active thermal node

( ). The heating sequence should be applied when the

temperature of node falls below .

The difference between and provides sufficient time for the node

to wait for gaining access to the TAM without its temperature falling below

. In Figure 5.2.6, the heating is required at the beginning of the interval

(c), but since the TAM is not available, the node waits. At the beginning of

the interval (d) the node has finally gained access to the TAM and the

heating begins.

Heating should stop when the temperature reaches the high temperature

limit. The time it takes to get back to the low temperature limit could be

utilized to heat up other nodes that need heating. In a situation that a

module consists of multiple active thermal nodes, the heating sequence

could only be applied if all of these thermal nodes have temperatures lower

than their high temperature limit.
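
The heating rules above, after the transition, can be summarized in a small decision function. This is a simplified sketch, assuming that a node which has started heating keeps its TAM access until it reaches the high limit; the function and parameter names are hypothetical, and the "pause" state shown in Figure 5.2.6 is folded into "cool" here.

```python
def heating_decision(temp, heating_trigger, high_limit,
                     currently_heating, tam_available):
    """Return 'heat', 'wait', or 'cool' for one active thermal node."""
    if temp >= high_limit:
        return "cool"  # heating stops at the high temperature limit
    if currently_heating:
        return "heat"  # assumed: heating continues until the high limit
    if temp < heating_trigger:
        # The node needs heating; it waits if the TAM is occupied.
        return "heat" if tam_available else "wait"
    return "cool"      # no heating applied; temperature decreases naturally


def module_may_heat(node_temps, high_limits):
    """A heating sequence may be applied to a module only if all of its
    active thermal nodes are below their high temperature limits."""
    return all(t < h for t, h in zip(node_temps, high_limits))
```

The margin between the heating trigger and the low temperature limit is what makes the "wait" outcome safe: the node can tolerate some delay before its temperature leaves the valid range.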

The nodes that simultaneously require heating should be accommodated

within the available bandwidth of the TAM. This bandwidth might not be

sufficient for all of them and, therefore, the nodes that need heating more

than others should be prioritized. The priorities for using the TAM are
determined based on the regional need for heating (denoted by around
a node ).

[Figure 5.2.6: An example for transient-based heuristic. Temperature of one thermal node versus time over intervals (a) to (h); the states shown are Boosting, Pause, Cooling, Wait, and Heating, and the transition period is marked in gray.]
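
The prioritization can be sketched as a greedy allocation. This is only an illustration under stated assumptions: each request is assumed to carry a precomputed regional need for heating and a required TAM width, and requests that do not fit in the remaining width are skipped. The names `grant_tam_access`, `regional_need`, and `required_width` are hypothetical and do not come from the thesis.

```python
def grant_tam_access(requests, tam_width):
    """Grant TAM access to heating requests in order of decreasing need.

    requests: list of (node_id, regional_need, required_width) tuples.
    Returns the node ids that receive their heating sequence now.
    """
    granted = []
    remaining = tam_width
    by_need = sorted(requests, key=lambda r: r[1], reverse=True)
    for node_id, _regional_need, required_width in by_need:
        if required_width <= remaining:
            granted.append(node_id)
            remaining -= required_width
    return granted


# Node "b" has the highest need; "c" no longer fits after "b" is served.
order = grant_tam_access([("a", 0.5, 8), ("b", 2.0, 16), ("c", 1.0, 16)], 24)
```

Nodes that are not granted access remain in the waiting state until bandwidth is released.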

The value of is recomputed whenever node needs heating. A node

requires heating in the following two situations: (1) When , after

the transition, for example the interval (d) in Figure 5.2.6. (2) When

, during the transition, for example the interval (a) in Figure 5.2.6. In

the following, we explain how to calculate for situation (1). Regional

need for heating for situation (2) is obtained in a similar manner by

replacing with .

Equation 5.2.1 is re-written here with the approximate derivatives:

(5.2.12)

The input power, , in equation 5.2.1 is substituted with the stray power,

, plus the PWM power of heating sequences, . Vector is the

vector form of the regional need for heating and consists of s. Equation

5.2.12 is written for one test cycle with period which is a very short time.

The equation is then solved for the nodes that need heating as follows.

(5.2.13)

The regional need for heating, , depends on the required heating for node

(consider the summations when is equal to ), on the required heating

that is related to the adjacent nodes (consider the summations when

denotes an adjacent node to ), and on the average power of the

corresponding heating sequence, .

The regional need for heating for a node has the highest dependency on the

node itself, and then a relatively high dependency on the adjacent nodes

(this characteristic is captured by the temperature model). The influence of

other nodes located far away from the targeted node is small. The heat

transfer between nodes is taken into account automatically, since equation

5.2.13 is derived from the temperature equation (equation 5.2.1) and

includes the thermal conductances from matrix . This is reflected by

in equation 5.2.13.

Equation 5.2.13 ensures that the priority for using the TAM is given to the

regions that need longer heating times, for example because of large



and small . Furthermore, the locality of this heuristic is

helpful because adjacent nodes are likely to be in the same module and

therefore these nodes will receive some desirable active heating power

( ) or heat transferred from module .

The problem with heat transfer exists also in the previous sections, but it

was taken care of automatically by the LP solver. An effect of the interplay

between priorities could be seen in Figure 5.2.6. The waiting period in the

interval (f) is much shorter than the waiting period in the interval (c). The

length of a waiting period depends on the other nodes’ priorities in addition

to the node ’s priority. The priorities in thermal boost mode are computed

in a similar manner by replacing with (e.g., in equation 5.2.12–

13).

As discussed before, the performance of the transient-based heuristic

strongly depends on the stop boosting, , and heating trigger, ,

temperatures. One example is the priorities calculated using equation

5.2.13, since they depend on after the transition and on during the

transition. Efficient values for these temperature levels for each

temperature map and each thermal node are found using a PSO technique,

as introduced in section 2.7.

5.2.5 Remarks

The output for the steady state and transient solutions is a periodic offline

schedule and therefore producing a small periodic schedule is one of their

advantages. The transient solution, on the other hand, also returns the

transition time as an output. The periodic schedule generated by the

transient solution is applied just during the transition time and then the

steady state schedule must be used. A periodic schedule means that there

is a constant average power for each module during the transition, despite

the fact that a higher or lower average power might be suitable for different

periods. The transient-based heuristic addresses this issue by generating a

non-periodic offline schedule that facilitates the heating for the nodes that

need it the most. Furthermore, the introduction of the boost mode helps to

reduce the switching overheads in the schedule. For these reasons, the

transient-based heuristic offers a reduced transition time.

The proposed approaches also support heating sequences generated by a

Built-In Self-Test (BIST) engine. An example for the use of BIST engines

during burn-in in order to achieve high toggle coverage is reported in


[Carbine97]. Such BIST engines that stimulate high switching activities in

a certain area of the IC under burn-in can be used to produce heating

sequences online. The only difference, in our context, is that if the BIST

engine does not occupy TAM, then it can be scheduled at any time as

needed.

For instance, assuming that module can receive its heating sequence

from an adjacent BIST engine that is not occupying TAM, the 8th line for

the LP formulation in Figure 5.2.3 should be changed to:

. The situation for the transient-based heuristic is even

simpler, since the algorithm only needs to know that module can

receive its heating sequence at any time. Then, does not need to

compete with other modules for access to TAM. Consequently, there is no

need to evaluate the regional need for heating for .

The techniques proposed above make it possible to perform burn-in based

on heating sequences without requiring a heat chamber. One of the

situations when a heat chamber might be required is for the ICs that are

designed to work in an extremely high temperature environment. For

example, a microcontroller for a car engine is designed with low power so
that its temperature does not rise too far above the very high ambient
temperature in the engine area. When such a chip is tested or operated with

regular low ambient temperature, it is impossible to have enough power

density to boost its temperature to its usual high level in normal working

condition.

Another such situation is when some parts of an IC (e.g., package pins, die

to pin connections, and the interposer) cannot be heated up sufficiently by

input stimuli. In such cases, an extremely hot burn-in condition might be

required that is not achievable by exclusive use of heating sequences. Even

in such cases the use of the methods proposed in this thesis for enforcing

the temperature gradients will still be useful. The proposed algorithms do

not need any modifications to work under such situations, except for

setting a large ambient temperature corresponding to the heat chamber

temperature. Note that as discussed previously in section 4.6 the thermal

behavior is modeled as a Linear Time Invariant (LTI) system. Therefore, a

larger ambient temperature will directly add up to the temperatures created

by the application of the heating sequences.
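
The additive effect of the ambient temperature can be illustrated with a single-node RC thermal model. This is a toy sketch, not the multi-node temperature model used in the thesis; the function name and the parameters `r_th`, `c_th`, and all numeric values are illustrative assumptions.

```python
import math


def node_temperature(t, power, r_th, c_th, ambient):
    """Temperature of one RC thermal node that starts at ambient
    temperature and receives constant power from time zero."""
    rise = power * r_th * (1.0 - math.exp(-t / (r_th * c_th)))
    return ambient + rise


# Same heating power, two ambient temperatures (room vs. heat chamber).
# The curves differ by exactly the ambient offset at every time instant.
offsets = [
    node_temperature(t, 2.0, 10.0, 0.5, 125.0)
    - node_temperature(t, 2.0, 10.0, 0.5, 25.0)
    for t in (0.0, 1.0, 5.0, 50.0)
]
```

Because the temperature rise caused by the power does not depend on the ambient term, a schedule planned for one ambient temperature produces the same temperature differences at another.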



The focus of this chapter is not on the issues related to process variations.

Small temperature variations can be tolerated by introducing a safety

margin for the specified temperature limits, in particular the overheating

temperature. Large temperature variations need a variation-aware

technique, for example, by combining the method proposed in this chapter

with the techniques proposed in the previous chapter. This is, however,

outside the scope of this thesis.

We use the term “temperature gradients” to precisely refer to the spatial

temperature differences. But we also use it in a relaxed manner to refer to

different sites’ temperature values. For example, the temperature difference

between two adjacent modules and that is , is exactly a

temperature gradient and speeds up the early life-time of the affected area.

However, the fact that module ’s temperature is equal to and ’s is

is not directly a temperature gradient. These facts are captured by a

temperature map and affect the signal delays (for signals that are routed

through or close to these modules).

5.2.6 Experimental Results

The proposed techniques are evaluated for twelve experimental ICs with

one to three layers as detailed in Table 5.2.1, columns 2, 3, and 4. The one-

layer experimental ICs (row 1 to 4) are bare dies and could represent the

pre-bond test stage. The ICs that have two layers (row 5 to 8) could

represent mid-bond test stage. The ICs with three layers (row 9 to 12) could

represent post-bond test stage.

There are two, four, eight, and 16 physical modules per layer for different

dies, resulting in the total number of modules ranging from two to 48, as

given in column 3. There are one, two, and three TSV blocks per layer on

the dies, resulting in the total number of TSV blocks given in column 4,

ranging from one to nine. Each TSV block hosts a relatively large number

of TSVs. The dies are assumed to be stacked in a face to back

configuration.

The temperature models are extracted using an approach similar to the

method proposed in [Coskun09] for 3D-SIC. This is an extended form of

the technique used by HotSpot [Huang07] for normal 2D ICs. The heating

patterns’ switching activities are generated using Markov chains, similarly

as in [Yao11c]. The temperature maps specify the valid temperature ranges

for nodes in the temperature model. The valid ranges are randomly selected


between , and some modules/nodes are randomly selected to be

don’t-care.

Only temperature maps that can be achieved in practice are considered. An

example for a temperature map that cannot be achieved is one that requires

a central node with very low temperature and its adjacent nodes with very

high temperature. In this case the temperature gradient is huge and it

probably will require negative power (active cooling) for the central node.

The transient solution (section 5.2.3) and the transient-based heuristic

(section 5.2.4) are evaluated and compared with the steady state solution

(section 5.2.2). The transient-based method is capable of handling

temperature models having multiple nodes per module, while the steady

state and transient solutions only support one thermal node per module. In

order to have comparable experiments, the temperature model that is

supported by the steady state method is used for the other techniques.

The CPU time to generate the schedules for the transient-based method for

all of the twelve experimental ICs together is about 12 minutes while the

transient solution takes 17 minutes and steady state method completes in 2

seconds. As discussed earlier, the time required to bring the IC into a

thermal situation that complies with the first temperature map and then to

the next map until all maps are applied is defined as the overall transition

time in this work.

Table 5.2.1 Percentage changes achieved by proposed techniques

         IC Specifications                     Percentage change in
                                               overall transition time
IC       Number of  Number of  Number of      Transient   Transient-based
Number   layers     modules    TSV blocks     solution    heuristic
1        1          2          1              -83.88      -97.82
2        1          4          1              -68.35      -73.05
3        1          8          2              -64.97      -69.95
4        1          16         3              -56.93      -62.63
5        2          4          2              -64.37      -68.37
6        2          8          2              -58.32      -65.94
7        2          16         4              -57.19      -63.82
8        2          32         6              -43.99      -55.14
9        3          6          3              -70.44      -97.18
10       3          12         3              -57.15      -93.17
11       3          24         6              -84.56      -95.87
12       3          48         9              -92.06      -94.52
Average                                       -66.85      -78.12

The percentage change in overall transition time offered by the transient

solution and the transient-based heuristic, compared with the steady state

solution, are given in columns 5 and 6 of Table 5.2.1, respectively.

A considerable reduction in overall transition time (78% on average) is
achieved by the transient-based heuristic, which also outperforms the
transient solution.
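
As a quick check of the averages reported in Table 5.2.1, the short sketch below recomputes them from the per-IC values and converts a percentage change into the fraction of the steady state transition time that remains. The helper names are illustrative only.

```python
# Percentage changes in overall transition time, copied from Table 5.2.1.
transient = [-83.88, -68.35, -64.97, -56.93, -64.37, -58.32,
             -57.19, -43.99, -70.44, -57.15, -84.56, -92.06]
heuristic = [-97.82, -73.05, -69.95, -62.63, -68.37, -65.94,
             -63.82, -55.14, -97.18, -93.17, -95.87, -94.52]


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


def remaining_fraction(pct_change):
    """Fraction of the baseline time left after a percentage change."""
    return 1.0 + pct_change / 100.0


avg_transient = round(mean(transient), 2)
avg_heuristic = round(mean(heuristic), 2)
left = remaining_fraction(avg_heuristic)
```

A change of about -78% means that the overall transition time of the heuristic is, on average, a little over one fifth of the steady state solution's transition time.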

The CPU times for the transient-based heuristic for different numbers of
modules are given in Figure 5.2.7. Even though they grow rapidly with the
increase in the number of modules, the CPU time for an IC with 48 modules
is still relatively short (480 sec).

5.3 Temperature-Gradient Based Test

For the temperature-gradient based test, the goal is to make sure that the

tests are performed when the temperature gradients are correctly captured

on the IC. This means that the specified temperature maps should be

reached and maintained during the corresponding test periods. In the
following, a straightforward algorithm and then a fast heuristic are
proposed.

5.3.1 Straightforward Algorithm

This algorithm works by changing between two modes, the temperature

construction mode and the test mode. Initially the temperature construction

mode is activated and it creates the specified temperature map using a

method similar to the transient-based heuristic proposed in section 5.2.4.

Then the test mode is activated and the tests that are scheduled with a third

party algorithm (e.g., scheduling method proposed in [SenGupta12]) are

applied. The test temperatures are simulated at design time and as soon as

at least one of the thermal nodes is out of its specified range, the test mode

is paused and the temperature construction mode takes over again. When
all thermal nodes are brought back into the specified temperature ranges,
the temperature construction mode is paused and testing resumes.

[Figure 5.2.7: CPU time versus number of modules. Horizontal axis: Number of Modules; vertical axis: CPU time [sec].]

Similar to the transient-based heuristic, if the temperature of a node is

lower than the heating trigger temperature, it should be heated by applying

the heating sequence to it. If there are many nodes that need heating (more

than what the TAM can support), priority is given to those with higher

regional need for heating as defined in section 5.2.4. The construction

mode, unlike the transient-based heuristic, should not heat the nodes up to

their high temperature limit since the power of the tests that are applied

immediately after the construction mode may rapidly heat up the node

beyond high temperature limit. Therefore, Testing Trigger temperatures

which are denoted by for node ( ) are introduced

here. During the temperature construction mode, the heating for node

stops as soon as the temperature reaches .

In the test mode, as soon as the temperature of a node reaches the high

temperature limit, the test mode is immediately paused, the temperature

construction mode is activated and, consequently, a cooling interval is

applied. The cooling continues until the node is cooled down to the testing

trigger temperature, , and then the node is ready for testing again. The

actual activation of the test mode will also depend on the temperatures of

the other nodes. Efficient values for testing trigger temperatures, , for

each map are found using a particle swarm optimization technique along

with and .
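
The basic alternation between the two modes can be sketched as below. This is a deliberately simplified illustration that only uses the specified temperature ranges; the refinements based on the heating trigger and testing trigger temperatures are omitted, and the mode names and function signature are assumptions.

```python
def next_mode(mode, temps, low_limits, high_limits):
    """Decide the next mode of the straightforward algorithm.

    mode is either 'construct' (temperature construction) or 'test'.
    """
    in_range = all(lo <= t <= hi
                   for t, lo, hi in zip(temps, low_limits, high_limits))
    if mode == "test" and not in_range:
        return "construct"  # at least one node left its specified range
    if mode == "construct" and in_range:
        return "test"       # all nodes are back in range; testing resumes
    return mode


# One node exceeds its high limit during testing, so testing is paused.
mode = next_mode("test", temps=[75.0, 91.0],
                 low_limits=[70.0, 80.0], high_limits=[80.0, 90.0])
```

Since a single out-of-range node pauses all tests, the schedule produced this way tends to be long, which motivates the fast heuristic.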

The inputs to the methods proposed here in section 5.3 include the inputs

to the methods proposed in section 5.2 in addition to the test specifications

(e.g., test switching activities). The output is a set of offline schedules.

Moreover, the proper values for the heating trigger, , stop boosting

temperatures, , and testing trigger temperatures, , which result in a

rapid test could also be considered as the outputs that provide a basis for

an online scheduling scenario.

The straightforward algorithm is simple, and allows the choice of a desired

arbitrary test schedule that is used in the test mode. But the overall test

application time offered by this method is very long. Note that the total test

application time also includes time intervals spent for temperature

construction.

5.3.2 Fast Heuristic

The fast heuristic schedules the tests together with the heating sequences

such that the specified temperature map is maintained. This way, a shorter

test application time can be achieved. An illustrative example for the

proposed method is given in Figure 5.3.1 for a single thermal node. The

proposed technique has similarities to the temperature construction

algorithm in section 5.2.4. For example, stop boosting temperature, ,

indicates that the boosting should stop, as illustrated at the end of interval

(a) in Figure 5.3.1. After being too warm, the node should cool until its

temperature gets below the testing trigger temperature, , as shown in

interval (b).

When the temperatures for all of the other thermal nodes covered by

module are between their high temperature limit, , and their heating

trigger, ( ), testing may start, as in interval (c) in Figure 5.3.1.

All other nodes should be within their temperature limits . Testing

continues until the temperature of at least one of the nodes goes beyond the

high temperature limit or falls below the heating trigger . For

example at the end of interval (c), the node is too cold for testing and a

heating interval should be introduced. Note that the TAM may no longer

be available and, therefore, the node is waiting for access to the TAM in

interval (d).

Finally, when access to the TAM is obtained, the heating sequence is

applied in interval (e). In order to start heating, all nodes covered by a

module should be colder than the high temperature limit since the heating

sequence for one node is very likely to inject power to other nodes in the

same module (as explained in section 5.2.4). Heating continues until the

temperature goes beyond the testing trigger temperature and, then, the test

resumes as in interval (f) in Figure 5.3.1. When the temperature reaches the

high temperature limit, a cooling interval is introduced as in interval (g).
This procedure continues until all tests corresponding to the current
temperature map are completed.

[Figure 5.3.1: An example for the fast heuristic. Temperature of a single thermal node versus time over intervals (a) to (h); the states shown are Boost, Cool, Test, Wait, and Heat.]

As mentioned before, nodes will compete for access to the TAM and,

therefore, some of them should be prioritized. First the nodes that require

heating (not the tests) are granted access to TAM. This helps to keep the

temperatures most of the time within the specified limits and, thus, keep

the flow of the tests uninterrupted. Note that if only one node falls out of

its specified range, all tests must be interrupted until the map is achieved

again. This will waste a lot of time, since the tests for the modules that are

in their specified range should also be interrupted. The priorities for the

nodes that require heating are determined based on the regional need for

heating as proposed in section 5.2.4 (equation 5.2.13).

If the TAM is left with some available bandwidth after the heating

sequences are scheduled, the modules that are thermally qualified may

resume their tests. A module is thermally qualified if none of the nodes that

correspond to that module are demanded by the previously discussed rules

to receive heating, wait for heating, or receive cooling. The priority is given

to the modules that are expected to offer long test endurance. The test

endurance is denoted by for module , and is defined as:

(5.3.1)

The test endurance is directly proportional to the remaining test size

denoted by for module . The larger the remaining test size, the longer

the test endurance. The thermal tolerance, denoted by for module ,

is the other contributor to the test endurance. High thermal tolerance, ,

indicates that the module is capable of receiving tests for a relatively long

time without exceeding the specified thermal limits. Therefore, a module

with large thermal tolerance may remain under test for a relatively long

time. The thermal tolerance is defined as:

(5.3.2)

In equation 5.3.2, it is assumed that module covers active thermal

nodes. ( ) denotes the expected thermal distance to a

temperature limit for node and is defined as:

(5.3.3)



As mentioned in section 5.2.2, the desired steady state power is the

power that results in a temperature equal to .

Equation 5.3.3 indicates that if the upcoming tests have relatively high

average power, then it is likely that the thermal node exceeds the high

temperature limit and, therefore, the difference between the current

temperature, , and the high temperature limit, , is a good measure for

thermal tolerance.

Similarly, for a relatively low power test, it is more likely that the

temperature falls below the heating trigger in the future. Therefore, the

difference between the current temperature, , and the heating trigger

temperature, , is a good measure for thermal tolerance. Thermal

tolerance, , is defined as the smallest ( ) since as

soon as a single node is out of the specified range , regardless of

the temperatures of the other nodes, the test should be interrupted. Note that if

the temperature falls below , only for a node in module , then the test

is interrupted only for module .

A proper value for the testing trigger temperature, is selected so that

the temperature variation during test (caused by the variations in the test

power) rarely results in the temperatures below or above . Every

time that or are violated, the test must be interrupted and a heating

or cooling interval must be introduced, respectively. Since these are time

consuming, a proper value helps to obtain a short test application time

by reducing the number of interruptions. Besides the testing trigger

temperature, stop boosting and heating trigger temperatures ( and

respectively) have a considerable effect on the test application time and

therefore proper values for them should be found. A particle swarm

optimization technique, as discussed in section 2.7, is used to find the

proper values for , , and for each map.


The fast heuristic (section 5.3.2) is evaluated and compared with the

straightforward method (section 5.3.1). An experimental setup similar to

section 5.2.6 is used here. This includes experimental ICs described in

Table 5.2.1. For convenience, columns 1–4 from this table are repeated in

Table 5.3.1 that reports the experimental results. The temperature model

used for these experiments has multiple nodes per module, as opposed to

experiments presented in section 5.2.6.


The total time required to enforce a temperature map and maintain it while

the tests are being applied, in addition to the time spent applying the

corresponding tests, is defined as the test time in this section. The

percentage change in test time offered by the fast heuristic compared with

the straightforward method is given in column 5 of Table 5.3.1, which

shows that a considerable speed-up (67% on average) is achieved.

The percentage change in CPU time required by the fast heuristic
compared with the straightforward method is -36%. The overall CPU time
depends on the interaction between the computational complexity of a
single decision point (see footnote 4) in the schedule and the schedule
length. The experimental results indicate that since the fast heuristic
method makes better decisions, compared with the straightforward method,
the overall length of the schedule is reduced considerably and therefore the
overall CPU time is also reduced. This happens despite the fast heuristic’s
higher computational complexity for individual decision points. In fact, the
schedule length is an important contributor to the CPU time, since longer
schedules require longer temperature simulations and temperature
simulation is, per se, very time consuming.

Footnote 4: A decision point is a point in the schedule where the scheduling
algorithm must decide about the upcoming states (e.g., whether to cool,
wait, heat, or test).

Table 5.3.1 Percentage changes achieved by fast heuristic

        IC specifications
IC      Number of  Number of  Number of   Percentage change in test time
number  layers     modules    TSV blocks  achieved by fast heuristic
1       1          2          1           -16.97
2       1          4          1           -39.69
3       1          8          2           -63.35
4       1          16         3           -94.77
5       2          4          2           -8.70
6       2          8          2           -60.80
7       2          16         4           -78.17
8       2          32         6           -95.04
9       3          6          3           -75.90
10      3          12         3           -84.81
11      3          24         6           -87.08
12      3          48         9           -94.72
Average                                   -66.67



The CPU times for the fast heuristic for different numbers of modules are

given in Figure 5.3.2. Even though they grow rapidly with the increase in

the number of modules, for an IC with 48 modules it is still acceptably

short. The CPU times for the burn-in (section 5.2) will be relatively shorter

since here the tests are also scheduled along with the heating sequences.

The growth rate of the CPU times, as shown in Figure 5.3.2, is tolerable

and similar to that of the transient-based heuristic (section 5.2.4). This was expected

since these algorithms are very similar.

5.4 Temperature-Map Ordering

The order in which the maps are enforced has a considerable impact on the

overall burn-in and test time. Since there are usually a number of

temperature maps to be applied, their ordering is important. In this section

we present methods to rapidly obtain a proper order for temperature maps

that results in a short burn-in and test time.

5.4.1 Map Ordering Technique

To simplify the discussions, let us assume that the temperature map for a

thermal node is represented by the middle value of the specified

temperature range . As an example, assume that an

IC has two thermal nodes and the initial temperature is . The specified

temperatures, by temperature map , are denoted by . This

means that temperatures and are specified by map for nodes

and , respectively. Assume that there are three temperature maps

5 The notation { , , …, } is used to represent an ordered sequence of

elements ( ).

Figure 5.3.2 CPU time versus number of modules
(horizontal axis: number of modules; vertical axis: CPU time [min])


denoted by μ0, μ1, and μ2. These maps specify the following temperatures:

= { , }, = { , }, and = { , },

respectively.

These temperature maps are represented in Figure 5.4.1a–b by three points
in a Cartesian space. The temperature of the first node is represented by the
horizontal axis and that of the second node by the vertical axis. The initial
order of temperature maps {μ0, μ1, μ2} requires a long time to increase the
temperature of the first node from 30 to 110 (a0 in Figure 5.4.1a), then decrease
it to 40 (a1 in Figure 5.4.1a), and then again increase it from 40 to 110 (a2
in Figure 5.4.1a). This process takes a long time because of the large
temperature changes required. In contrast, it is much faster to work with
the maps ordered as {μ1, μ2, μ0}, since in this case the required
temperature changes are smaller, as shown in Figure 5.4.1b.

As discussed earlier, in order to minimize the overall transition time for
burn-in, a particle swarm optimization (PSO) technique finds proper values
for the stop-boosting and heating-trigger temperatures. The map orders
should be optimized along with these temperatures, since all of these
factors have a crucial effect on the overall transition time for a given set of
temperature maps. The naïve approach to finding proper map orders is to
introduce them as decision variables into the PSO along with the
stop-boosting and heating-trigger temperatures. Experiments showed that
this naïve approach takes a very long CPU time to complete. Since the
optimized values for these temperatures depend on the map order, different
map orders result in different optimized values.

The initial PSO population in the naïve approach consists only of random
solutions (random stop-boosting and heating-trigger temperatures and
random map orders). Introducing a relatively good map order into the initial
population of PSO (among other initial solutions that are random) helps to
speed up the search. This approach is denoted by A1. The idea of approach
A1 is to rapidly find a potentially good map order using an initialization
heuristic and introduce it into the initial PSO population. By doing this, the
search should speed up while the quality of the final values for the
stop-boosting and heating-trigger temperatures is kept reasonably high.
Experiments suggest that in the majority of cases, PSO finds a better map
order than the one produced by the initialization heuristic.



It is, in fact, possible to find a potentially good map order without having
to go through the time-consuming optimization of the stop-boosting and
heating-trigger temperatures. Furthermore, it is possible to do so without
the relatively time-consuming scheduling procedures for the heating
sequences. A temperature map can be considered as a point in a Euclidean
space whose dimension equals the number of thermal nodes. The thermal
distance between two maps is defined as the Euclidean distance between
them (e.g., b1 between maps μ1 and μ2 in Figure 5.4.1b). For a sequence of
maps, the total thermal distance (TTD) is defined as the sum of the thermal
distances between successive maps. For example, the TTD for Figure 5.4.1a
is approximately 257, while for Figure 5.4.1b it is 108, which is much
smaller. In general, a sequence of maps with a smaller TTD is expected to
have a shorter transition time than a sequence with a larger TTD.
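The TTD of a candidate order can be evaluated in a few lines. The following is a minimal sketch: the function name `total_thermal_distance` and the two-node map values are hypothetical and chosen only for illustration (they are not the values of Figure 5.4.1). As in the figure, the first segment is assumed to run from the initial temperature to the first map.

```python
import math

def total_thermal_distance(initial, maps, order):
    """Sum of Euclidean distances along: initial point -> maps taken in `order`."""
    ttd = 0.0
    current = initial
    for name in order:
        ttd += math.dist(current, maps[name])  # thermal distance of one step
        current = maps[name]
    return ttd

# Hypothetical two-node example (temperatures in degrees Celsius).
initial = (30.0, 30.0)
maps = {"mu0": (110.0, 80.0), "mu1": (40.0, 40.0), "mu2": (110.0, 60.0)}

bad = total_thermal_distance(initial, maps, ["mu0", "mu1", "mu2"])
good = total_thermal_distance(initial, maps, ["mu1", "mu2", "mu0"])
print(f"TTD of first order: {bad:.1f}, TTD of second order: {good:.1f}")
```

With these illustrative values the second order has a much smaller TTD than the first, which mirrors the qualitative difference between the two panels of the figure.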

Note also that the time required to change the temperature differs from

node to node depending on the node’s location on the IC, the adjacent

nodes’ temperatures, the heating sequence powers, and so on. Moreover,

depending on these factors, the rise time and the fall time for the

temperature of a certain node are also different (e.g., in many cases heating

up is faster than cooling down, with the same temperature gap). The TTD

does not take these differences into account in favor of a simple but

meaningful metric that is fast to evaluate. However, when the map order is

optimized using PSO, all these previously ignored factors are automatically

taken into account.

Figure 5.4.1 The total thermal distance (TTD)
(a) A bad map order: {μ0, μ1, μ2}, TTD = │a0│+│a1│+│a2│.
(b) A good map order: {μ1, μ2, μ0}, TTD = │b0│+│b1│+│b2│.
(Both axes show node temperatures in °C.)


This problem is similar to finding the shortest Hamiltonian path in a

complete graph whose vertices are temperature maps and the distance

between two vertices is their Euclidean distance. Therefore, the initial

heuristic-based map order that is added to the PSO’s initial population in

approach A1 is called shortest Hamiltonian path. Due to the reasons

discussed previously, this shortest path does not necessarily correspond to

the optimal map order.
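For a handful of maps, the shortest Hamiltonian path can be found by simply trying every order. The sketch below is one possible way to do this and is not necessarily the initialization heuristic used in the experiments; the function name `shortest_hamiltonian_order` is hypothetical, and the path is assumed to start at the initial temperature.

```python
import itertools
import math

def shortest_hamiltonian_order(initial, maps):
    """Exhaustively find the map order with the smallest total thermal distance.

    Tries all n! orders, so it is only practical for a small number of maps.
    """
    best_order, best_ttd = None, math.inf
    for order in itertools.permutations(maps):
        ttd, current = 0.0, initial
        for name in order:
            ttd += math.dist(current, maps[name])
            current = maps[name]
        if ttd < best_ttd:
            best_order, best_ttd = list(order), ttd
    return best_order, best_ttd
```

For larger sets of maps the factorial growth makes exhaustive search impractical, and an approximate path-construction heuristic would be needed instead.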

If A1 is allowed to run for a long time, it will produce very high quality

solutions. However, for larger designs, this is unaffordable. We have

therefore proposed the A2 approach, which consists of a short run of A1

followed by a post-PSO optimization of map orders. The motivation for

this is that PSO optimization in A1 can rapidly identify possible solutions

in the near optimal area of the search space but it then becomes very slow.

Knowing the near optimal area, other optimization techniques can be

deployed to rapidly improve the results. In the following, the post-PSO

optimization for the map orders is discussed.

In the general case, the post-PSO optimization could be excessively time
consuming. A greedy heuristic is therefore used to rapidly find a
near-optimal solution. The greedy approach is characterized by its size,
which is the number of alternative partial solutions that are kept at each step
(i.e., among the vertices with equal depth in the search tree). The heuristic
works as follows. Starting from the root vertex (initial temperature) in the
search tree, the vertices (i.e., temperature maps) that have the shortest
partial transition times are selected, as many as the size allows. These are
the candidates for the first map in the final map order. Here the scheduling
is performed to calculate the actual transition times.

Then, the same number of new vertices with the shortest partial transition
times are selected out of the set of vertices that succeed the previously
selected vertices. At this point, two maps (in the final map order) have been
scheduled. This procedure repeats until all maps are scheduled. For a size
of one, at each step the map that is the fastest to achieve is selected. A large
size slows down the search but may provide better results. Our experiments
showed that 10 is a good choice for the size.
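The greedy procedure described above can be sketched as a beam-style search. This is a minimal illustration under stated assumptions: `transition_time(prefix, nxt)` is a hypothetical callback standing in for the actual scheduling step, and the name `greedy_map_order` is not from the thesis.

```python
def greedy_map_order(maps, transition_time, size=10):
    """Keep the `size` best partial map orders at each depth of the search tree.

    `maps` is a list of map identifiers. `transition_time(prefix, nxt)` returns
    the time needed to reach map `nxt` after the maps in `prefix` have been
    applied (an empty prefix means the initial temperature).
    """
    beam = [(0.0, [])]  # (partial transition time, partial order)
    for _ in range(len(maps)):
        candidates = []
        for time_so_far, prefix in beam:
            for nxt in maps:
                if nxt not in prefix:
                    cost = time_so_far + transition_time(prefix, nxt)
                    candidates.append((cost, prefix + [nxt]))
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:size]  # discard all but the best partial orders
    return beam[0]  # (total transition time, complete map order)
```

With `size=1` the search degenerates to always picking the map that is fastest to achieve next; larger sizes explore more alternatives at the cost of more calls to the scheduling callback.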

Apart from this general case, which addresses large and time-consuming ICs,
for smaller ICs it is possible to find the optimal map order (i.e., the exact
solution) using an exact algorithm (e.g., branch and bound). Since a
relatively good solution is already found by PSO in approach A1, we can
skip many paths in the search tree that result in a larger transition time,
without wasting time to fully schedule them. For example, assuming that the
map order in Figure 5.4.1b is already found by A1, there is no need to
schedule a2 (in Figure 5.4.1a) at all. Scheduling a1 may also be aborted
before completion, since the overall transition time of this path in the search
tree exceeds the overall transition time of the path corresponding to Figure
5.4.1b before it even gets to vertex μ1. Note that in this algorithm, the edges
are actual transition times and not Euclidean distances. Despite the
significant acceleration achieved by using the near-optimal result from the
A1 approach, larger examples are excessively time consuming and therefore
finding their optimal solution is not practical.
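The pruning idea can be illustrated with a small branch-and-bound sketch. It assumes non-negative transition times and again uses a hypothetical `transition_time(prefix, nxt)` callback in place of the real scheduler; `incumbent_time` plays the role of the transition time already achieved by A1. Unlike the procedure described above, this sketch prunes only between transitions and does not abort a transition that is partly scheduled.

```python
import math

def branch_and_bound_order(maps, transition_time, incumbent_time=math.inf):
    """Exact search over map orders, pruned against the best known total time.

    Returns (best time, best order). The order is None when no order strictly
    better than `incumbent_time` exists.
    """
    best = {"time": incumbent_time, "order": None}

    def extend(prefix, time_so_far):
        if time_so_far >= best["time"]:
            return  # prune: this path cannot beat the best known order
        if len(prefix) == len(maps):
            best["time"], best["order"] = time_so_far, prefix
            return
        for nxt in maps:
            if nxt not in prefix:
                extend(prefix + [nxt], time_so_far + transition_time(prefix, nxt))

    extend([], 0.0)
    return best["time"], best["order"]
```

A tighter incumbent prunes more of the search tree, which is why starting from the A1 result shortens the exact search.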

Although this section has focused on map ordering for the temperature-

gradient based burn-in, the map ordering for the delay test is very similar

and the same technique can be used. Moreover, there might be a map

dependency graph (e.g., because of corresponding tests’ dependencies)

which dictates that certain maps must be applied in certain order. Although

not discussed in this section, the proposed approach can accommodate such

scenarios.


The experimental setup is similar to that of section 5.2.6. All experiments
are performed on a desktop computer with an Intel® Xeon® W3520
processor and 8 GB of memory. The percentage change in CPU time for the
A1 approach compared with the naïve approach is -266% on average.
Furthermore, the overall transition time achieved by A1 is 18% smaller than
the overall transition time achieved by the naïve approach.

Optimal map orders are found for some of the small experimental ICs to

be used for comparison purposes. It is not practical to find optimal map

orders for all the experimental ICs because of the excessive search time

that relatively large ICs require. The overall transition times achieved by

A1 are around 23% larger than the overall transition times offered by the

optimal map orders. As mentioned before, this shows that the map orders

found by A1 are close to optimal, but A2 can do better. In the following,

A2, which includes post-PSO optimization, is compared with A1, which

terminates after the PSO optimization.

The greedy approach with a population size of one is used to find map
orders for all of the experimental ICs. The results show a 16% improvement
over the A1 results, but they are 13% worse than the optimal. Increasing the
population size to ten further improves the results, so that there is a 21%
improvement over A1 and the results are only 7% worse than the optimal.
However, it almost doubles the search time. In short, A1 finds map orders
that result in overall transition times around 23% worse than optimal. The
post-PSO optimization in A2 improves the map orders by 21%, which brings
them very close to the optimum.

5.5 Conclusions

Early-life failures and delay faults that are dependent on temperature-

gradients introduce additional challenges to achieve efficient burn-in and

delay-fault test. The negative effects of temperature gradients are more

pronounced for 3D-SIC technology, since their magnitude is much larger.

The challenge for burn-in is that some defects develop and cause early-life

failures very rapidly when the IC is working with certain temperature maps

that include large temperature gradients. These are difficult to enforce by

traditional burn-in methods. The challenge for delay-fault test is that some

defects can be detected only when a certain temperature map is enforced

on the IC.

In order to effectively detect these defects, it is necessary to construct and

maintain the specified temperature maps during burn-in and delay-fault

test. The methods proposed in this thesis utilize the available test access

mechanisms in order to do so. The specified temperature maps are

constructed and maintained by selectively applying high-power stimuli to

the IC. Therefore, there is no need for expensive equipment to heat up the

chip externally. To our knowledge, this is the first technique to achieve

temperature maps for burn-in and test without any external heating

mechanism.

For burn-in, a steady state solution is introduced that is fast to generate the

schedules, but the schedules are slow to achieve the specified temperatures.

A schedule in this case consists of a single periodic schedule for each map.

The steady state solution has been extended to the transient solution which

is slow in generating the schedules, but constructs the maps faster. Finally,

the transient-based heuristic is proposed to support a more precise

temperature model, and offer a shorter overall transition time by generating

schedules that rapidly bring the IC to the specified temperature conditions.

The experiments indicate that this method outperforms the transient



solution. Moreover, this method is 78% faster than the steady state solution

in realizing the specified temperature maps.

For delay-fault test, a straightforward method is proposed that is based on

two working modes, the temperature construction mode and the test mode.

The temperature construction mode works similarly to the transient-based

method for burn-in and brings the IC to the specified temperature

conditions. Then, the test mode applies the tests according to a given test

schedule until the IC’s temperatures exit the specified range, at which point the

temperature construction mode is activated again. This continues until all

tests are performed. Furthermore, another method (fast heuristic) has been

developed to schedule the heating and cooling intervals mixed with the

tests. Therefore, the test time offered by this method is reduced. The

experiments indicate that the fast heuristic is 67% faster in performing the

tests compared with the straightforward method.

The order of the temperature maps has a considerable effect on the overall

burn-in and test time. Therefore, map orders need to be optimized, since

they affect the optimal values for other decision variables. Experiments for

map ordering show that the introduction of an initialization heuristic that

adds an initial map order to the PSO’s initial population speeds up the

search by 266% on average. Furthermore, the overall transition time

improves by 18% on average for burn-in. The overall transition times are

further improved by 21% through introduction of a post-PSO optimization

stage that consists of a greedy approach.




Represents heat capacitances in the thermal model. is the

matrix element at -th row and -th column.

Represents thermal conductance (related to heat transfer) in the

thermal model. is the matrix element at -th row and -th

column.

Need for heating in a general case. is -th thermal-element’s

need for heating.

Duty cycle for module in PWM method

Testing endurance for module .

Identity matrix

Number of modules ( ).

is the -th module.

Total number of thermal elements in the thermal model (

)

Power value(s) in a general case. is power for module .

Power values in transient solution

Heating sequences’ powers

Heating sequence power received by node when heating is

intended for node .

Steady state power values in transient solution

Stray power

PSO Particle Swarm Optimization [Poli07]

Number of parallel LP solvers in transient solution

Remaining tests’ size

Proper schedule period in PWM method, calculated solely for

heating interval of module




Proper schedule period in PWM method, calculated solely for

cooling interval of module

TAM Test Access Mechanism

TAT Test Application Time

Thermal tolerance for module .

TTD Total Thermal Distance

TAM width: number of modules that can be accessed at the same

time

Transfer matrix for initial temperatures considering a time interval

equal to

Transfer matrix for power values considering a time interval equal

to

Boolean variable indicating that the -th LP solver has found a valid

solution

Thermal distance for -th active thermal element.

Acceptable error in the minimal transition time in transient solution

Temperatures vector in a general case. is the temperature for

module . is the temperature for -th thermal element.

Ambient temperature

Overheating temperature limit

Initial temperatures

Final temperatures after seconds

Stop-boosting temperature limit in a general case. is stop

boosting limit for -th thermal element.

Steady state temperatures

is high temperature limit for module . is high temperature

limit for -th thermal node.



is low temperature limit for module . is low temperature

limit for -th thermal node.

Testing-trigger temperature threshold in a general case. is

testing trigger threshold for -th thermal element.

Lower bound for optimal transition time in transient solution. The

upcoming temperature map cannot be achieved if transition time

is smaller than . See .

-th temperature map.


Chapter 6 Integrated Temperature-Cycling Acceleration and Test

Large and frequent temperature changes (i.e., temperature cycling) create
fatigue and wearout in integrated circuits (ICs), as pointed out earlier in
section 3.8. Temperature cycling damages ICs in various ways, including
solder joint fatigue, fracture in bond wires, and die deformation
[Jedec10]. In addition to these undesirable effects, 3D stacked ICs suffer
from defects related to through-silicon vias (TSVs). TSV protrusion and void
formation in TSVs are two such defects. These effects are worsened by
temperature cycling. Furthermore, some other defects, including resistive
opens and stress-induced carrier mobility reduction, can also be worsened
by temperature cycling [Kumar12, Okoro14, Zhang13].

This chapter presents a schedule-based technique that integrates

temperature cycling acceleration with the testing procedure. The cycling

acceleration is achieved by mixing heating sequences and cooling intervals

with test sequences in an efficient order. Furthermore, tests and heating

sequences are reordered so that a rapid testing and acceleration process is

achieved. The proposed technique is in contrast with the existing

approaches that are based on temperature chambers and can be impractical

for 3D-SICs due to their unaffordable costs and limitations.

6.1 Preliminaries

Temperature-cycling exacerbates a number of defect mechanisms, as

pointed out before. Therefore, operating the dies under intensive

temperature cycling can effectively accelerate such failures so that they can

be detected by the subsequent test, before the 3D-SIC is shipped out. This

procedure is called temperature-cycling acceleration [Jedec09, Mil04].


Note that even though both conventional burn-in test and temperature-

cycling test are designed to detect early-life failures, temperature-cycling

is different from the conventional burn-in. These two aim at accelerating

different aging mechanisms. Cycling acceleration does not accelerate the

same aging mechanisms as burn-in does, and vice versa. To briefly

explain this difference, let us focus only on two distinct aging mechanisms.

During burn-in, the device is operated in a very hot environment with

increased voltage to accelerate electromigration. This must continue for a

relatively long time to allow for sufficient migration (detectable atomic

build-up or depletion). In contrast, simply operating the device at a

single temperature does not create cycling-related material fatigue. It is the

variation of the mechanical stress (as a result of varying temperature) that

does it. The required amounts of burn-in and cycling are decided based on

analytical, experimental, and empirical studies that are outside the scope

of this thesis. In this thesis we solely focus on temperature-cycling and

assume that the required amount of cycling is given by the user.

Let us have a closer look at protrusion of TSVs out of the die surface

caused by temperature cycling. Right after TSV fabrication, there is

normally no protrusion and the TSVs have about the same length as the

die’s thickness. However, after a few temperature-cycles an increase in the

TSV length may be observed. The TSV length will continue to increase

with the number of cycles [Kumar12, Zhang13]. After a certain amount of

temperature cycling, the TSV length approaches a maximum level. Further

temperature cycling will have almost no effect on the TSV length,

afterwards. The TSV protrusion can be further exacerbated by the electrical

current it carries [Kumar12, Zhang13]. Therefore, operating the IC during

this procedure (letting the current flow) speeds up the cycling

acceleration.

The existing procedure for temperature-cycling acceleration is based on

one or multiple temperature chambers [Jedec09]. Although this procedure

is usually affordable for 2D ICs, it is likely to be too expensive for 3D-

SICs. Due to TSV-related defects, a larger number of dies manufactured to

be a part of a 3D-SIC may require cycling acceleration compared with 2D

ICs. The shortcomings of the traditional approach include costs for running

the temperature chambers as well as the time and equipment required for

handling the dies/stacks between test equipment and chambers. Besides,

chambers are slow, meaning that only very low frequency cycling is

possible.


Moreover, the 3D-SIC manufacturing process includes multiple bonding

stages. Corresponding to these bonding stages, pre-, mid-, or post-bond

tests are introduced in order to avoid: (1) wasting a good die bonded to a

bad die or stack, (2) wasting bonding effort for bonding bad dies or stacks,

and (3) wasting packaging effort spent on a bad stack. Based on the cost

breakdown, temperature-cycling acceleration could be beneficial at one or

multiple test stages. In order to avoid costs associated with the traditional

techniques, in current practice, some or even all of the temperature-cycling

acceleration operations are avoided. Therefore, the temperature-cycling

related early-life failure rates in the final products will be unnecessarily

high. Integrating the temperature-cycling acceleration with the tests that

are performed at different stages and eliminating the need for temperature

chambers will reduce the overall manufacturing costs.

As previously mentioned, advanced SoCs, especially those manufactured

as 3D-SICs, experience excessively large power densities during test.

High power densities lead to excessively high temperatures, in particular

for the middle dies in a 3D stack. This otherwise undesirable thermal effect

is, however, utilized here to generate large amounts of temperature-

cycling. Temperature-cycling acceleration is achieved by frequent

switching between high power tests that heat up the IC and pauses that

allow for cooling.

A deliberate pause for cooling is called a cooling interval. A cooling

interval is the time interval during which no stimuli are applied to a core and,

therefore, the core’s temperature decreases, as already discussed in earlier

chapters. Some cooling intervals are usually present in the original test

schedule for thermal-safety reasons, as discussed in chapter 4. More

intensive temperature-cycling acceleration can be achieved by introducing

additional cooling intervals and stronger heating sequences into the

process. A stronger heating sequence consists of stimuli that generate

larger switching activities in a core and, therefore, increases the core’s

temperature faster than usual (as discussed in chapter 4 and chapter 5). The

mixture of cooling intervals and heating sequences can generate the

required temperature-cycling acceleration effect.

A test sequence’s bit streams define the circuit-under-test’s power

dissipation in combination with the previously applied test sequence

(circuit’s state) as well as the core’s power-related properties.

Consequently, the power dissipation generated by a series of tests depends


on the order in which they are applied [Chakravarty94]. This phenomenon

is employed in this thesis in order to produce extreme power values for

tests as well as heating sequences and, consequently, achieve a high speed

temperature-cycling process.

The existing methods for managing ICs’ temperatures (in relation with the

testing processes) focus on two issues:

1. Keeping the temperatures under a global upper temperature limit to

prevent overheating (e.g., sections 4.1–4.7), or

2. Respecting upper and lower temperature bounds for cores in order to target

temperature-dependent defects (e.g., section 4.8) or gradient-

dependent defects (chapter 5).

In all the above cases, the cores’ temperatures are considered independent

of their cycling effects. Integrating temperature cycling acceleration with

the test procedure was previously studied in [Aghaee15a]. This chapter

develops an integrated temperature cycling technique based on this study.

Moreover, an efficient technique to order the tests and heating sequences

to achieve a high-speed temperature-cycling process is proposed.

6.1.1 Circuit under Test and Test Access Mechanism

It is assumed that the 3D-SIC under test contains a number of modules
(cores). These modules are located on different levels of stacked dies. The
modules that are on different layers are connected using TSVs. Tests for
each module can be started and stopped independently of other modules. The
modules could be cores with core wrappers in a core-based design. The
extension of this scenario to 3D-SICs is proposed as the IEEE P1838
standard [Ieee14a]. Test stimuli are, therefore, transferred through a test
access mechanism (TAM) to the relevant module. It is assumed that the TAM
only allows a limited number of modules (the TAM width, a positive integer)
to be tested at the same time. Other modules, therefore, have to queue up
and wait for TAM access.

6.1.2 Thermal Model

In order to obtain the temperature values from power values, a thermal

model that describes the thermal behavior of the IC must be used. The

temperature equation (introduced in section 2.6, equation 2.6.1) is repeated

here for convenience:

(6.1.1)



All the thermal characteristics of the IC are captured in two matrices

and , obtained in a manner similar to [Coskun09, Huang06]. is the

temperature vector and is the power. and consist of s and s,

respectively, put together in a vector format. Index indicates the relevant

module. There are a total of modules ( ). As

discussed in section 4.6, equation 6.1.1 can be solved for the time-domain

assuming that the power values are constant during a period of time equal

to . The result from equation 4.6.5 is repeated here for convenience:

(6.1.2)

The initial temperature is expressed by and the temperature after a

period of seconds (note that a fraction of a second is used in practice) is

represented by . Matrices and are copied below from equations

4.6.3–4:

(6.1.3a)

(6.1.3b)

The identity matrix is denoted by . The above equations are explained in

the following case study, assuming that there is only one module ( )

with its heat capacitance denoted by (analogous to ). The heat

resistance between the module and the ambient is equal to (analogous to

). In this case, equation 6.1.2 can be re-written as:

(6.1.4)

Since there is only one module, the vectors and matrices are reduced to

scalar values. A larger initial temperature ( ), power ( ), or resistance

( ) results in higher final temperature ( ), if other factors are kept

unchanged. A larger period ( ) means that the contribution of the initial

temperature is smaller while the effect of power on the final temperature is

larger. In the vector form, increasing the period translates into a decreased

and an increased . A large time-constant ( ) means that the initial

temperature takes longer to lose its effect while power takes longer to

noticeably affect the final temperature. In the vector form, increasing the

time-constant translates into an increased and a decreased .
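The qualitative behavior described above can be checked numerically. The following is a minimal sketch that assumes the standard first-order RC response of a single module, with temperatures measured relative to the ambient; the function and parameter names are illustrative, and the exact form of equation 6.1.4 may differ.

```python
import math

def final_temperature(theta_init, power, resistance, capacitance, period):
    """Temperature of a single RC node after `period` seconds of constant power.

    Assumed model: the initial temperature decays with the time constant
    resistance * capacitance while the node approaches resistance * power.
    """
    decay = math.exp(-period / (resistance * capacitance))
    return theta_init * decay + resistance * power * (1.0 - decay)

# Illustrative values only: a longer period weakens the effect of the initial
# temperature and strengthens the effect of the power.
for period in (0.0, 1.0, 10.0, 100.0):
    print(period, round(final_temperature(50.0, 2.0, 10.0, 1.0, period), 3))
```

In this sketch, a larger time constant makes the decay factor approach one more slowly, so the initial temperature keeps its influence longer and the power takes longer to show its effect, as stated in the text.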



6.1.3 Temperature Cycling Model

The effect of temperature cycling can be described based on the Amount

of Temperature Cycling induced fatigue (denoted by for module ).

Based on the Arrhenius-Coffin-Manson model [Held97, Jedec10], ATC is

estimated as:

(6.1.5)

Considering module , is the number of temperature cycles and

is the amplitude of temperature changes during cycling. In the above

equation, a regular cycling pattern is assumed. It means that the

temperature monotonically increases from an arbitrary temperature, , to

and then monotonically decreases back to .

Usually, when the actual temperature curve is only slightly different from

a regular pattern, the average amplitude is used for . must be

larger than (a very small threshold value) in order to be considered in

the temperature cycling calculations. However, it is not unusual to

completely ignore since the typical temperature changes are much

larger than .

The effect of the average temperature is captured in the exponential term.

The average temperature is expressed by . , , , , and are

constants that are obtained analytically or empirically by reliability

analysts. A comprehensive explanation and details of equation 6.1.5 can

be found in [Jedec10, Held97]. As equation 6.1.5 suggests, a large number

of cycles, , or a large temperature swing, , will result in a large

cycling effect.

6.2 Motivational Examples

6.2.1 ATC Rate for a Simple Scenario

As an example, consider an IC with two modules ( ). Assume that

the TAM can only support one module to be tested at a time ( =1).

Assume that and . The required

amounts of temperature cycling are and for modules and

, respectively. In this chapter, tests that target cycling-dependent defects

are called cycling tests and the other tests are called normal tests. Cycling



tests can only be applied after the required amount of temperature cycling,

, is achieved.

A three-phase approach is introduced here: In phase 1, normal tests are

scheduled. A thermal aware scheduling of tests based on the proposed

approach in [He08a] is used. The corresponding temperature curves are

shown in Figure 6.2.1 (green² for and blue for ). The normal tests

for module end at . Phase 1 starts at time 0 and ends at that is defined

as .

Phase 2 starts by evaluating the ATC generated in phase 1. This value is

less than the required in this example. Therefore, phase 2 will

generate additional temperature cycling. This is done by applying the

heating sequences and cooling intervals. Corresponding temperature

cycles can be seen in Figure 6.2.1 from to . Time-point marks the

point when the required is achieved for module . Phase 2 ends

when all required ATCs for all modules are met. This point is marked with

that is defined as . After this, phase 3 starts by applying the

cycling tests. Phase 3 ends when all the cycling tests are complete. This

point is marked with .

A small TAT is always desirable. Test application time from 0 to and

from to is already minimized by the given third-party test scheduling

algorithm. The only time reduction opportunity is to speed up phase 2. This

means that a large ATC should be achieved in a short time. Therefore,

should be maximized. Here we assume a uniform periodic

temperature profile, which means that all cycles have the same amplitude.

2 Figure 6.2.1 is printed in grayscale in copies printed by LiU-Tryck.

Figure 6.2.1 Temperature curves for the three-phase approach

(Curves are illustrative.)

[Plot: temperature in °C (axis ticks 0 to 150) versus time, divided into phase 1, phase 2, and phase 3.]


Moreover, for this motivational example we assume that in equation 6.1.5:

, , , and .

Since it is assumed that , the exponential term can be ignored for

the moment. Furthermore, since it is assumed that , could

also be ignored. The ATC rate (denoted by for module ) can,

therefore, be defined as:

(6.2.1)

The frequency of temperature changes (i.e., the number of cycles per time

unit) depends on the physical properties of the system and the amplitude

of temperature changes, . It is possible to achieve a high frequency

(i.e., a large ) if is small. A large amplitude on the other hand,

may increase the ATC only if it dominates the resulting reduction in the

frequency.

6.2.2 Optimal Cycling in a Simplified Scenario

In order to clarify the tradeoff between the frequency and the amplitude of

the temperature cycling, the physical properties of the system should be

captured in the ATC rate equation (equation 6.2.1). In the following this is

done for a simple IC with only one module. The thermal model for such a

case was discussed in section 6.1.2, equation 6.1.4. Remember that is the

heat capacitance and is the thermal resistance between the module and

the ambient. Assume that the heating sequence generates a power equal to

and the power during a cooling interval is zero. Assume that the

temperature varies between and . Both and are positive

real numbers.

The period of a temperature cycle is denoted by . This period consists of

a rise time denoted by plus a fall time denoted by . is the time the

temperature takes to increase from to . is the time taken to

decrease from to . These values are calculated as follows. First,

the system’s differential equation is solved in the time domain similar to

equation 6.1.4 for a period of (i.e., ):

(6.2.2)

Let us denote by and by . For the heating situation:



(6.2.3)

Then

(6.2.4a)

Similarly for cooling ( ), can be calculated:

(6.2.4b)

The period, , is calculated as follows:

(6.2.5)
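The rise time, fall time, and period discussed above can be illustrated with a short sketch. It assumes the standard first-order RC response, T(t) = R·P + (T(0) − R·P)·exp(−t/(R·C)), with temperatures measured relative to ambient and zero power during cooling. The function names, parameter names (`r`, `c`, `p`, `t_low`, `t_high`), and any numeric values used with them are assumptions made for this example; they are not the notation or data of the thesis.

```python
import math


def temperature(t, t0, r, c, p):
    """Temperature after time t, starting from t0, under constant power p."""
    steady = r * p  # steady-state temperature rise for power p
    return steady + (t0 - steady) * math.exp(-t / (r * c))


def rise_time(t_low, t_high, r, c, p):
    """Time to heat from t_low to t_high while the heating power p is applied."""
    steady = r * p
    if not 0 <= t_low < t_high < steady:
        raise ValueError("need 0 <= t_low < t_high < r * p")
    return r * c * math.log((steady - t_low) / (steady - t_high))


def fall_time(t_low, t_high, r, c):
    """Time to cool from t_high to t_low with zero power."""
    if not 0 < t_low < t_high:
        raise ValueError("need 0 < t_low < t_high")
    return r * c * math.log(t_high / t_low)


def period(t_low, t_high, r, c, p):
    """Duration of one full temperature cycle (rise plus fall)."""
    return rise_time(t_low, t_high, r, c, p) + fall_time(t_low, t_high, r, c)
```

Under these assumptions, evaluating `temperature` at the computed rise time returns the high level again, which is a quick consistency check on the logarithmic expressions.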

Now, the ATC rate (equation 6.2.1) could be re-written incorporating the

physical properties of the system:

(6.2.6)

Let us first focus on the optimal value for , assuming that is constant.

In this case optimality happens when the denominator in equation 6.2.6 is

minimized. Considering a realistic situation, this is equivalent to finding

the minimum for

(6.2.7)

Following a closed-form approach:

(6.2.8)

The valid solution is . Here for the sake of simplicity, the

ambient temperature was not included in the equations. Since the

temperature model is a linear time-invariant (LTI) system as discussed in

section 4.6 the ambient temperature can be added later on. Assume that

power and resistance values are so that . This means that

considering the ambient temperature ( ), the IC’s temperature will

increase to if no control is applied. Thus, the optimal value for is

.
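As a numerical cross-check of the fixed-amplitude optimum, the sketch below scans the low temperature level and picks the value that minimizes the cycle period. It assumes the same first-order RC model with zero cooling power and temperatures relative to ambient; all names and values are illustrative assumptions. Under this assumed model, setting the derivative of the period to zero gives t_low + t_high = R·P, so the interval is centred on half of the steady-state temperature, and the grid search is expected to agree.

```python
import math


def cycle_period(t_low, delta, r, c, p):
    """Period of one cycle between t_low and t_low + delta (assumed RC model)."""
    steady = r * p
    t_high = t_low + delta
    rise = r * c * math.log((steady - t_low) / (steady - t_high))
    fall = r * c * math.log(t_high / t_low)
    return rise + fall


def best_low_level(delta, r, c, p, steps=10000):
    """Grid search for the low level that minimizes the period at fixed amplitude."""
    steady = r * p
    span = steady - delta  # t_low must stay strictly inside (0, span)
    candidates = [span * k / steps for k in range(1, steps)]
    return min(candidates, key=lambda t_low: cycle_period(t_low, delta, r, c, p))


# Illustrative values only: R*P = 100 and an amplitude of 40.
low = best_low_level(delta=40.0, r=2.0, c=0.5, p=50.0)
closed_form = (2.0 * 50.0 - 40.0) / 2.0  # (R*P - delta) / 2 under this model
```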


The resulting equations for finding the optimal value for do not have a

simple closed form. Therefore, a numerical method is employed. The ATC

rate versus for is plotted in Figure 6.2.2. If and

, then the ATC rate is maximal at . For values of

less than the ATC rate increases with increasing . This is due

to the increase in amplitude, , dominating the decrease in

frequency, , in equation 6.2.1. For larger values the ATC rate

decreases with increasing . This is due to the increase in amplitude,

, being dominated by the decrease in frequency, . In other

words, a very large temperature cycle takes too much time to complete.
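The tradeoff between amplitude and frequency can be explored numerically with a sketch like the following. It is only an illustration: it assumes a hypothetical rate of the form amplitude**q divided by the cycle period, a first-order RC model with zero cooling power, and an interval centred on half of the steady-state temperature. The exponent `q`, the function names, and the numeric values are assumptions and do not reproduce equation 6.2.1 or the data behind Figure 6.2.2.

```python
import math


def atc_rate(delta, r, c, p, q):
    """Hypothetical cycling rate: amplitude**q per unit time (assumed form)."""
    steady = r * p
    t_low = (steady - delta) / 2.0
    t_high = (steady + delta) / 2.0
    # With a centred interval, rise and fall times are equal in this model.
    cycle_time = 2.0 * r * c * math.log(t_high / t_low)
    return delta ** q / cycle_time


def best_amplitude(r, c, p, q, steps=1000):
    """Grid search over amplitudes in (0, r*p) for the largest assumed rate."""
    steady = r * p
    candidates = [steady * k / steps for k in range(1, steps)]
    return max(candidates, key=lambda delta: atc_rate(delta, r, c, p, q))


# Illustrative values only.
best = best_amplitude(r=2.0, c=0.5, p=50.0, q=2.0)
```

With q greater than one, the assumed rate vanishes for both very small and very large amplitudes, so the maximum lies in the interior of the range, which mirrors the qualitative behaviour described above.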

If the assumption that does not hold, the temperature cycling rate

equation, equation 6.2.6, will be as follows:

(6.2.9)

The inclusion of the exponential (Arrhenius) term results in a larger (or

equal) optimal value. Since both the exponential term and

equation 6.2.6 are increasing when is smaller than , the optimal

value cannot happen for a smaller than . After this point, the value

of equation 6.2.6 decreases while the exponential term is increasing. The

optimal can be in this region ( ). Besides, the introduction of

the exponential term leads to dependency of the optimal on the value of

.

In the general case (without assumptions made solely for the motivational

examples), the optimal value for could be very different compared with

the obtained here. Moreover, the assumptions made for obtaining

equation 6.2.1 will not be valid and therefore the situation will be more

complicated than discussed in the above paragraph. In such situations a

numerical approach is best suited to find the optimal values for and .

Figure 6.2.2 ATC rate, , versus for three-phase approach

[Plot: vertical axis ticks 0 to 5000; horizontal axis ticks 0 to 60.]



Moreover, in the general case, there are multiple modules competing for

access to TAM and their interference makes the problem even more

complicated, so complex that a heuristic is the only practical technique to

deal with the problem.

6.2.3 Effect of the Test Application Order

In general, the circuit under test’s consumed power depends on the order

in which the tests are performed. Let us consider the scan chain itself.

Different orders of the tests will result in different transition counts and

thus different power values.

Consider a 4-bit scan chain as shown in Figure 6.2.3. Assume that 0101,

1111, and 1010 are the test stimuli. The order 1010-1111-0101, as shown

in Figure 6.2.3a, results in 12 transitions in the scan chain during shift-in.

Another test order, 1111-1010-0101, as shown in Figure 6.2.3b, results in

22 transitions and thus higher power dissipation. Assuming that the

temperature of the core should be reduced, arranging the tests in their low

power order may avoid an additional cooling interval. Alternatively, if the

core is in its heating interval of the cycling process, the high power

arrangement may replace an unnecessary heating sequence application.

This will ensure that TAM is not unnecessarily occupied by dummy

heating sequences. Both situations help to shorten the test application time.
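The transition counts quoted for the two test orders can be reproduced with a small simulation. The sketch assumes that the first stimulus is already in the chain, that bits enter at the left end, and that the last bit of each stimulus is shifted in first; these conventions are inferred from the states listed in Figure 6.2.3, and the function names are illustrative.

```python
def shift_in_states(initial, patterns):
    """Return the successive scan-chain states while the patterns are shifted in.

    Bits enter at the left end of the chain. The last bit of each pattern is
    shifted in first, so the chain holds the pattern after len(pattern) shifts.
    """
    state = initial
    states = [state]
    for pattern in patterns:
        for bit in reversed(pattern):
            state = bit + state[:-1]
            states.append(state)
    return states


def count_transitions(states):
    """Count bit flips over all scan cells between consecutive states."""
    return sum(
        a != b
        for previous, current in zip(states, states[1:])
        for a, b in zip(previous, current)
    )


low_power = shift_in_states("1010", ["1111", "0101"])   # order 1010-1111-0101
high_power = shift_in_states("1111", ["1010", "0101"])  # order 1111-1010-0101
print(count_transitions(low_power), count_transitions(high_power))  # prints "12 22"
```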

6.3 Problem Formulation

As discussed before, along with pre-, mid-, or post-bond tests,

temperature-cycling acceleration might be beneficial. In this case, there

will be tests that target cycling-dependent defects (i.e., cycling tests) in

addition to other tests (i.e., normal tests). Normal tests are scheduled along

with heating and cooling intervals in order to generate the required amount

of temperature cycling. The cycling tests can be performed afterward.

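Figure 6.2.3 Test orders

(a) A low power order. (b) A high power order.

(a) Scan-chain states during shift-in: 1010, 1101, 1110, 1111, 1111, 1111, 0111, 1011, 0101. Transitions per scan cell: 3, 3, 3, 3. Total transitions = 12.

(b) Scan-chain states during shift-in: 1111, 0111, 1011, 0101, 1010, 1101, 0110, 1011, 0101. Transitions per scan cell: 7, 6, 5, 4. Total transitions = 22.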


The amount of temperature cycling can be easily calculated using equation

6.1.5 if the temperature swings in a uniform periodic manner similar to

Figure 6.3.1a. In Figure 6.3.1a five cycles with amplitudes equal to can

be identified. In the general case, for example when the IC is under test,

the temperature fluctuations are irregular, as shown in Figure 6.3.1b. In

this case, identifying cycles and their amplitudes is not straightforward. For

such irregular patterns, the number and amplitudes of the cycles are

calculated using the widely used Rainflow-counting algorithm

[Matsuishi68].
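For readers unfamiliar with Rainflow counting, the following sketch shows a textbook three-point variant of the algorithm (in the style of ASTM E1049). It is not the fast version from [Musallam12], and the function names are illustrative. Each result is a pair of a temperature range and a count, where 1.0 is a full cycle and 0.5 is a half cycle.

```python
def turning_points(series):
    """Reduce a series to its local extrema (the first and last samples are kept)."""
    points = [series[0]]
    for value in series[1:]:
        if value == points[-1]:
            continue  # skip plateaus
        if len(points) >= 2 and (points[-1] - points[-2]) * (value - points[-1]) > 0:
            points[-1] = value  # still moving in the same direction
        else:
            points.append(value)
    return points


def rainflow(series):
    """Return a list of (range, count) pairs using three-point Rainflow counting."""
    stack, cycles = [], []
    for point in turning_points(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])  # most recent range
            y = abs(stack[-2] - stack[-3])  # previous range
            if x < y:
                break
            if len(stack) == 3:
                # The previous range contains the starting point: half cycle.
                cycles.append((y, 0.5))
                stack.pop(0)
            else:
                # Closed loop: count a full cycle and drop its two points.
                cycles.append((y, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # Whatever remains is counted as half cycles.
    for a, b in zip(stack, stack[1:]):
        cycles.append((abs(b - a), 0.5))
    return cycles


# A uniform periodic profile with five swings of amplitude 10.
uniform = [0, 10, 0, 10, 0, 10, 0, 10, 0, 10, 0]
total_cycles = sum(count for _, count in rainflow(uniform))
```

For the uniform profile the counts add up to five cycles of equal range, matching what can be read off directly; the value of the algorithm lies in handling irregular profiles where this is not possible.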

As mentioned previously, the required amount of temperature cycling is

denoted by . The current amount of temperature cycling generated

by normal tests or heating sequences (e.g., phase 1 and phase 2 in Figure

6.2.1), up to a given time, , is denoted by . For a certain test

schedule, the temperature curves are obtained using temperature

simulations. Then a fast version of the Rainflow-counting algorithm,

introduced in [Musallam12], calculates .

Assuming that for , , only normal tests can be

performed before time . The cycling tests can only be performed after

the required amount of cycling ( ) has been applied. Therefore, after

time , cycling tests can be performed too. The test application time,

, marks the point that testing module is complete. consists

of the time spent before and after time . The goal is to generate a

schedule with a minimal overall TAT. The overall test application time is

defined as .

As previously discussed, the power dissipation during a test depends on

the previous test, among other factors. Assuming that test for module

immediately follows test , the dynamic power is expressed by

. The overall power dissipation (in the circuit under test), denoted

by , consists of the dynamic power, , plus the stray power,

Figure 6.3.1 Temperature patterns

(a) Uniform periodic, with cycles numbered 1 to 5. (b) Irregular.



denoted by ( ). The dynamic power is caused by

the circuit under test’s switching activities. As introduced in section 5.2.2,

the stray power is defined, in this thesis, as the sum of all those power

values whose dissipations cannot be independently controlled with existing

test controls. This includes the leakage power as well as the clock

networks’ power. Stray power’s exact value depends on the module’s

current temperature since the leakage power depends on the temperature.

In this chapter, the stray power (including temperature dependent leakage)

is taken into account.

It is assumed that module has tests including both normal and

cycling tests. Relevant test properties can be captured in a test graph.

Consider an IC that consists of two modules ( ). Assume that module

has two tests ( ) as shown in Figure 6.3.2a. Module has three

tests ( ) as shown in Figure 6.3.2b. Assume that one of the tests for

module is a normal test (the node is marked with N) and the other is a

cycling test (marked with C). A node that corresponds to a heating

sequence (marked with H) is also included in the test graph. Tests and the

heating sequence for module are marked in a similar manner. Total test

powers are shown on the edges in Figure 6.3.2. Usually, in the general case,

there are a number of normal and cycling tests in addition to a number of

heating sequences.

At each time point, during the test, there could be some tests that cannot be

performed. This is due to a number of reasons, including the limited

Figure 6.3.2 Test graphs: (a) module m0 (b) module m1.

Test graphs consist of normal (N), cycling (C), and heating (H) nodes.

(a) Nodes: s0,0 (N), s0,1 (C), s0,2 (H). (b) Nodes: s1,0 (N), s1,1 (C), s1,2 (C), s1,3 (H).

Edges are labeled with the total test powers (p0,0-1, p0,1-0, p1,0-1, p1,1-0, and so on).


capacity of the TAM as well as the cycling tests that cannot be performed

before the required ATC is applied. A validity checker is used to make sure

that the scheduling algorithm takes these limitations into account. The

validity checker updates the set of Valid Tests (VaT) if a new test can be

performed in parallel with the tests that are already selected for the current

time point. It also makes sure that any test that cannot be applied in parallel

with the currently selected tests does not remain in VaT. This is based on

the knowledge of previously applied tests as well as the partial set of tests

selected to be applied next.

Moreover, the current amount of the ATC is also taken into account. For

example, assume that in Figure 6.3.2 normal tests ( and ) have been

performed previously. Assume that is already selected to be applied

next and the required ATC for is already achieved. In this case VaT is

. Meaning that , , or can be applied in parallel

with without violating TAM limit or ATC requirement. Although

using (i.e., the heating sequence) does not make sense since the

required ATC is already achieved, it would be a valid choice from the

VaT’s point of view. Note that the heating sequences can be applied

repeatedly, as needed, while repeating the tests is usually unnecessary.
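The role of the validity checker can be sketched as a simple filter. The data structure, the function name, and the rules below (one test per module at a time, a TAM capacity counted in concurrently tested modules, heating sequences always repeatable) are assumptions made for this illustration, and the scenario only loosely follows the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    module: str
    kind: str  # "N" normal test, "C" cycling test, "H" heating sequence


def valid_tests(nodes, done, selected, atc_reached, tam_capacity):
    """Return the nodes that may run in parallel with the already selected ones."""
    if len(selected) >= tam_capacity:
        return []  # the TAM is fully occupied
    busy_modules = {node.module for node in selected}
    valid = []
    for node in nodes:
        if node.module in busy_modules:
            continue  # a module runs one node at a time (assumption)
        if node.kind != "H" and node.name in done:
            continue  # tests are not repeated; heating sequences may be
        if node.kind == "C" and not atc_reached[node.module]:
            continue  # cycling tests wait for the required ATC
        valid.append(node)
    return valid


nodes = [
    Node("s0,0", "m0", "N"), Node("s0,1", "m0", "C"), Node("s0,2", "m0", "H"),
    Node("s1,0", "m1", "N"), Node("s1,1", "m1", "C"), Node("s1,2", "m1", "C"),
    Node("s1,3", "m1", "H"),
]
vat = valid_tests(
    nodes,
    done={"s0,0", "s1,0"},             # both normal tests already performed
    selected=[nodes[2]],               # a node of m0 is already selected (assumed)
    atc_reached={"m0": False, "m1": True},
    tam_capacity=2,
)
vat_names = [node.name for node in vat]
```

In this illustrative scenario the heating sequence of the second module stays in the valid set even though its ATC requirement is met, in line with the remark that validity and usefulness are separate questions.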

The goal is to schedule the tests so that all the cycling tests are performed

after the required amount of ATC is achieved and the overall test

application time (including the cycling process) is minimized. This is

achieved by scheduling and reordering the tests and the heating sequences.

High power test stimuli and heating sequences can increase the modules’

temperatures. A module may become so hot that unrealistic failures show

up and the device may even be damaged. In order to avoid these undesirable

overheating situations, the modules’ temperatures must be kept below the

overheating temperature ( ) at any time. The overheating

temperature is equal to the temperature limit minus a safety margin to

ensure thermal safety. The power dissipation during a pause is equal to the

stray power, (which includes leakage).

The problem can be formally stated as follows. The inputs to the suggested

technique include the IC’s thermal model, the IC’s electrical model (e.g.,

specification of the TAM and power-related specifications), the test graph

(i.e., the cycling tests, normal tests, and the switching activities of the tests

and heating sequences), the ambient temperature ( ), and the

required amount of temperature cycling, . The objective is to



minimize the test application time. The output is the corresponding

schedule that guides the application of the tests and heating sequences in

proper order so that all the tests are performed rapidly and correctly.

The generated schedule will imply, for each of the modules, a certain

ordering of the test graph’s nodes. The ordering can be represented by a

directed path in each of the original test graphs (e.g., graphs in Figure

6.3.2). This directed path must visit each test node at least once and may

visit heating nodes as many times as needed. Applying a test or a heating

sequence is equivalent to visiting the corresponding test node or heating

node.

The test ordering and scheduling can also be viewed as converting the

original test graph into a final path-graph. A path-graph is defined as a

graph with only one directed path that connects all the nodes. There is no

other edge in a path-graph except those on this unique path. The final path-

graph must include all of the test nodes, while the heating nodes are

included as needed. The complete test scheduler that includes the ordering

algorithm decides at which point to insert a node taken from the original

test graph into the final path-graph.

6.4 Three-Phase Approach

The basics of the three-phase approach are briefly explained in section

6.2.1. Section 6.2.2 presented a technique to find the best temperature

interval ( to ) for a simplified scenario. As discussed before, if

the coefficient (in equation 6.1.5) is much larger than the average

temperature ( ) and the high temperature level ( ) is smaller

than the overheating temperature, , the analysis in section 6.2

would hold.

However, these assumptions are often not valid; for example, the

overheating temperature may be relatively low compared with . For

the example in section 6.2.2, is equal to while the

overheating temperature might be . There are some other

complications, as well. In practice there are a number of modules, instead

of one, and their temperatures depend on each other due to heat transfer.

Moreover, the power values fluctuate with time. Besides, power values

include the stray powers that depend on the temperature due to the

temperature dependent leakage currents. Additionally, the modules may


not be able to receive their heating sequences at desired times, due to the

TAM limitation. New approaches capable of taking all these situations into

account are, therefore, proposed in this section.

As discussed in section 6.2.1, in phases 1 and 3 the tests are scheduled using

a thermally safe third-party algorithm. It is assumed that these algorithms

perform optimization to reduce the test application time. Our focus will

therefore be on phase 2 where new algorithms can be designed to minimize

the test application time. This was demonstrated using a small example in

section 6.2.2. Assume that in phase 2 the temperature of module is

intended to swing between a low temperature level and a high

temperature level ( ). In comparison with the example in

section 6.2.2, and have roles similar to that of and ,

respectively.

The heating sequences are assumed to be powerful enough to raise the

module’s temperature to . The high temperature level should always be

lower than the overheating temperature ( ) to avoid any

kind of damage. Since all the normal tests and all the cycling tests are to

be separately scheduled using third party algorithms and then performed in

two isolated phases (phase 1 and 3), there is no need to represent them in

the test graph. Consequently the test graph reduces to only include the

heating nodes (nodes marked with H in Figure 6.3.2). This simplifies the

problem of finding proper paths in these reduced graphs. For each module,

a greedy approach is used here and the heating node that offers the highest

heating power is selected to follow the current node.
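The greedy choice of the next heating node can be sketched in a few lines. The dictionary `edge_power`, keyed by (current node, next node), and the node names and power values are hypothetical and serve only to illustrate the selection rule.

```python
def next_heating_node(current, heating_nodes, edge_power):
    """Pick the heating node that offers the highest power right after `current`."""
    return max(heating_nodes, key=lambda node: edge_power[(current, node)])


# Hypothetical powers for a module with two heating sequences.
edge_power = {
    ("h0", "h0"): 3.0, ("h0", "h1"): 4.5,
    ("h1", "h0"): 4.0, ("h1", "h1"): 2.5,
}
path = ["h0"]
for _ in range(3):
    path.append(next_heating_node(path[-1], ["h0", "h1"], edge_power))
```

Because the power depends on the preceding node, the greedy path in this toy example alternates between the two heating sequences rather than repeating one of them.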

An on-the-fly approach is used to schedule the heating sequences for phase

2 based on the simulated temperatures. The temperatures that are obtained

by simulation are then compared with and in order to generate the

schedule. High power heating nodes are used to rapidly increase the

temperature. Immediately after the temperature reaches its peak at , a

cooling interval is introduced to reduce the temperature back to . Then,

for the sake of a fast cycling, the heating sequence must be immediately

applied again. However, the TAM might not be available at this moment.

Consequently, the temperature may fall below from time to time.
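The heat-then-cool behaviour described above can be illustrated for a single module. The sketch assumes a first-order RC model, zero power during cooling, temperatures relative to ambient, and a TAM that is always available; it therefore leaves out the competition between modules and the stray power. All names and parameters are assumptions for this example.

```python
import math


def simulate_cycling(r, c, p_heat, t_low, t_high, dt, steps, t_start=0.0):
    """Alternate heating and cooling so the temperature swings between two levels.

    Returns the temperature trace and the number of completed heat-cool cycles.
    """
    decay = math.exp(-dt / (r * c))  # exact update factor for one time step
    temp = t_start
    heating = True
    cycles = 0
    trace = [temp]
    for _ in range(steps):
        steady = r * p_heat if heating else 0.0
        temp = steady + (temp - steady) * decay
        trace.append(temp)
        if heating and temp >= t_high:
            heating = False  # peak reached: start a cooling interval
        elif not heating and temp <= t_low:
            heating = True   # low level reached: reapply the heating sequence
            cycles += 1
    return trace, cycles


# Illustrative values only.
trace, cycles = simulate_cycling(
    r=2.0, c=0.5, p_heat=50.0, t_low=30.0, t_high=70.0, dt=0.01, steps=1000
)
```

Because decisions are taken at discrete time steps, the simulated temperature slightly overshoots the high level and undershoots the low level, a small-scale analogue of the drift below the low level mentioned above.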

Heating sequences for different modules will compete for access to TAM.

The priority is decided based on the following equation:



(6.4.1)

Both and depend on time and are shortened forms of and

, respectively, at time . The priority is higher if the module’s

current temperature is much below . Note that the priorities are

calculated only for modules that need heating, therefore . The

reason for the inclusion of this difference term (i.e., ) in the

priority assessment is that if a module gets really cold, it takes too much

time to warm it up again. Therefore, it is a good idea to give a higher

priority to the colder modules.

A module that has a large amount of temperature cycling left to fill also

has a higher priority. This is indicated by . Such a module is likely to

need a relatively long time to achieve its required ATC. Consequently, it

is likely that at the later stages of phase 2 this module remains alone. This

implies that the interleaving opportunities for TAM access will be reduced.

Consequently TAM utilization may decrease and test application time may

increase. A small value, , is added to the denominator in order to prevent

numerical problems when ATC is zero (e.g., at the beginning of phase 2,

if there has not been any normal test).

The test application time for the schedules generated by this on-the-fly

approach depends on and . These temperature levels could assume

a range of values provided that . The

temperature that corresponds to the stray power is called stray temperature

and is denoted by (always ).

The temperature of a module cannot be lower than this because of the stray

power dissipation.

The combination of these temperature levels ( and ) among different

modules affects the test application time. The proper values for these

decision variables will be found in an external optimization loop, as shown

in Figure 6.4.1. In the inner scheduling loop, the temperature levels (i.e.,

decision variables) defined by the outer optimization loop are used to

generate the schedule. In Figure 6.4.1, the scheduler boxes inside the

dashed box represent multiple copies of the inner scheduling algorithm.

Although a single scheduler is sufficient to perform the

optimization, several of them are run in parallel to speed up the

procedure.


The outer optimization loop in Figure 6.4.1 makes use of a particle swarm

optimization algorithm. PSO is an iterative population-based optimization

metaheuristic, as discussed in section 2.7. For each alternative solution in

the PSO’s population, an on-the-fly scheduling is performed (inside the

dashed box in Figure 6.4.1) to compute the cost function (i.e., TAT).

The operation of the PSO algorithm is summarized here for convenience. The

algorithm starts from a random initial population, similar to other

population based metaheuristics (e.g., evolutionary methods). The

population is referred to as a swarm in PSO terms. An individual in the

population is referred to as a particle. Each particle goes through a number

of alternative solutions, one at a time, as the algorithm iterates.

Each particle has a location in the search space (i.e., the current alternative

solution). A particle records the best solution it has ever encountered, the

local best. The swarm records the best solution its particles have ever

encountered, the global best. Based on these best solutions and the

previous alternative solution a velocity is determined which also

incorporates some randomization. Velocity is the vector that determines

the next location for a particle. The particles move throughout the search

space in a guided random manner until they gather around a near optimal

solution.
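The PSO steps described above can be sketched as follows. This is a generic, minimal PSO with a common velocity update (inertia plus random pulls toward the local and global bests), not the exact variant or parameter setting of the thesis. The toy cost function merely stands in for the test application time that a scheduler would return, and the particles are evaluated serially rather than in parallel.

```python
import random


def pso_minimize(cost, bounds, particles=20, iterations=100, seed=0,
                 inertia=0.7, c_local=1.5, c_global=1.5):
    """Minimize `cost` over the box `bounds` with a basic particle swarm."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    local_pos = [p[:] for p in pos]           # best location seen by each particle
    local_cost = [cost(p) for p in pos]
    g = min(range(particles), key=local_cost.__getitem__)
    global_pos, global_cost = local_pos[g][:], local_cost[g]
    for _ in range(iterations):
        for i in range(particles):
            for d in range(dim):
                lo, hi = bounds[d]
                vel[i][d] = (inertia * vel[i][d]
                             + c_local * rng.random() * (local_pos[i][d] - pos[i][d])
                             + c_global * rng.random() * (global_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            value = cost(pos[i])
            if value < local_cost[i]:          # update the local best
                local_pos[i], local_cost[i] = pos[i][:], value
                if value < global_cost:        # update the global best
                    global_pos, global_cost = pos[i][:], value
    return global_pos, global_cost


def toy_cost(levels):
    """Stand-in for the TAT; minimal at the hypothetical levels (40, 80)."""
    return (levels[0] - 40.0) ** 2 + (levels[1] - 80.0) ** 2


best_levels, best_cost = pso_minimize(toy_cost, [(0.0, 100.0), (0.0, 100.0)])
```

In the setting of this chapter, each call to the cost function would correspond to one run of the scheduling heuristic for the decision variables held by a particle.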

Figure 6.4.1 External optimization loop based on particle swarm optimization

The algorithm is used to minimize the test application time. Inside the dashed box, copies of the

scheduling heuristics are performed in parallel for a number of particles.

[Flowchart: initialize the swarm → alternative decision variables → schedule the tests for each particle (1st scheduler, 2nd scheduler, ..., last scheduler) → schedules and test application times → update the local bests and the global best → Finished? If no, update the swarm (velocities and locations) and repeat; if yes, output the final schedule and test application time.]



6.5 Integrated Approach

Let us assume, now, that the orders in which normal test nodes (e.g., nodes

marked with N in Figure 6.3.2) must be visited are given. Furthermore,

assume that the order for heating sequence nodes (e.g., nodes marked with

H in Figure 6.3.2) is also given. This means that the original test graph is

broken down into a number of sub-graphs. This includes two separate

directed path-graphs, one for normal tests and the other for the heating

sequences among other sub-graphs.

This simplified scenario which involves two separate path-graphs will be

discussed first and a path-graph scheduling algorithm will be introduced in

sections 6.5.1–3. Afterwards, section 6.5.4 explains how to employ this

path-graph scheduling algorithm to solve the original problem that

involves the original test graph (i.e., the problem described in section 6.3).

An example in the following paragraphs, using Figure 6.5.1, explains how

the proposed schedule generation works. Figure 6.5.2 shows how different

blocks of the algorithm are put together. The example in Figure 6.5.1

explains how all these blocks work together to generate a schedule. Let us

assume that path-graph scheduling (i.e., Path-graph scheduling block in

Figure 6.5.2) determines that the module must receive heating at test

cycle. Test cycles are shown in Figure 6.5.1f. It asks test graph node

ordering (i.e., the node ordering block in Figure 6.5.2) for options.

Test graph ordering replies with two options (as shown in Figure 6.5.1d):

The first option is [ , ] that is a path-graph consisting of high power

normal test nodes. The second option is [ , ] that consists of heating

nodes. This interaction is depicted in Figure 6.5.2 as the loop between the

path-graph scheduling block and the node ordering block. The output of

the node ordering block is monitored to determine if all tests are completed.

The path-graph scheduling decides to go on with [ , ]. Now, the

power values are known and temperature simulation is performed to obtain

the temperatures. This interaction is depicted in Figure 6.5.2 as the loop

between the path-graph scheduling block and the temperature simulator

block.

The simulated temperatures are plotted in Figure 6.5.1a–b. As module

heats up, module is slightly warmed up by the transferred heat from

. It is assumed that the die in this example consists of only two modules.


Moreover, it is assumed that the test access mechanism provides access to

only one of the modules at a time. The module that occupies the TAM is

depicted in Figure 6.5.1c.

Every decision (i.e., change in the schedule) is recorded in the schedule as

a new entry. Each entry consists of the corresponding cycle in addition to

the node and state for each and every module. For example, a decision was

made at cycle to start . This is registered in the schedule as shown in

Figure 6.5.1f–j. Applying continues smoothly to the end and then

starts (at ), as previously suggested by the node ordering block.

At cycle the temperature of reaches the high level and cooling is

required. The node ordering block is consulted and it returns [ , ]

that consists of low power normal tests. The other alternative is a pause

Figure 6.5.1 An example for schedule generation

(Curves are illustrative.)

[Panels: (a)–(b) temperature curves and node transitions for modules m0 and m1; (c) module occupying the TAM; (d)–(e) test graph ordering for m0 and m1; (f) schedule cycles; (g)–(j) node and state (Pause or Start/Resume) for each module.]

Schedule entries (cycle: nodes): i0: s0,0; i1: s0,2; i2: s0,2, s1,4; i3: s0,2, s1,2; i4: s0,2, s1,2; i5: s0,3, s1,2; i6: s0,1, s1,2; i7: s0,1, s1,1.



(cooling interval). Since the application of is not complete, the

application of low power normal tests is not possible. Therefore, a cooling

interval is introduced. This frees the TAM, which the other module can then utilize.

The node ordering block suggests either [ , ] or [ , ]. The

scheduler decides to go with , a new entry for the cycle is added to

the schedule and then the simulations and scheduling continue. Note that

if the temperature reaches the overheating limit (that is higher than the high

level discussed here and, therefore, is not shown in Figure 6.5.1) only a

pause can be selected (definitely not a low power test).



that consists of low power normal tests. The other alternative, as always

for cooling, is a pause. Since the application of is not complete, the

application of low power normal tests is not possible. Therefore, a cooling

interval is introduced. This frees the TAM, which the other module can then utilize.

Figure 6.5.2 Integrated scheduling approach

The decision variables are highlighted with gray.

[Block diagram. Blocks: node ordering, priority calculation, temperature simulator, path-graph scheduling, and the check "All tests scheduled?" (No/Yes). Inputs: test graph and alternative decision variables. Output: schedule and test application time.]

[Labels in the diagram: threshold on power difference to decide between cooling interval or low-power test application; threshold on power difference to decide between heating sequence or high-power test application; length of the power assessment window for node ordering in the cooling, heating, ordinary, and thermal emergency situations; stop cooling, low cycling, high cycling, and emergency temperature limits; temperatures; current amounts of temperature cycling; remaining test sizes; priorities; power values; alternative path-graphs.]


Since was pending, it is resumed and there is no need to consult the

node ordering block at the moment. At cycle , the node ordering block is

consulted and is selected for application.



that consists of low power normal tests. Obviously, the other alternative is

a pause. This time the application of is complete and, therefore,

can actually be selected. However, the path-graph scheduler decides that,

in any case, a pause is better. Note that before a node is started or resumed,

its validity (VaT as discussed in section 6.3) is checked. If not in the VaT

list, either another alternative must be selected or the module must wait

until incompatible tests are complete. The above process, as explained in

Figure 6.5.1, continues until all tests are performed.

6.5.1 Path-Graph Scheduling Algorithm

The test application time could be reduced if normal tests (phase 1) are

integrated into the temperature-cycling acceleration process (phase 2). For

example, a test can be employed to heat a module and avoid an unnecessary

inclusion of a heating node. It may happen that a test is not powerful

enough to increase the module’s temperature to and yet it is beneficial

to include it to partially heat the module. A heating node is introduced

afterwards to rapidly increase the temperature up to .

Similar to this heating scenario, a mixed cooling scenario is also possible.

The benefit of these mixing scenarios is that although the temperature will

change slowly (increasing the test application time), a part of the tests is

being applied (decreasing the TAT). In a mixed cooling scenario, a low

power test is introduced when the temperature must decrease to create a

cycle. Despite the decrease in the module’s temperature, the temperature

may not decrease to . A cooling interval is then introduced to complete

the cycle.

Assume that a high power test is being applied in a heating scenario as

shown in Figure 6.5.3a. Assume that the high-power test’s power for the

current time interval is denoted by . This power rapidly increases the

temperature at the beginning. Assume that this level of power is applied

for a long time. In this case a steady state temperature equal to will

eventually be reached. As the current temperature approaches , the

heating rate decreases. The derivative of the temperature (i.e., heating rate)



is shown in the lower part of Figure 6.5.3a. When the difference between

the heating-sequence’s heating rate and the test’s heating rate increases

beyond a certain threshold ( in Figure 6.5.3a), it is time to

switch to the heating sequence.

This will rapidly increase the temperature to . The temperature caused

by heating sequence (shown as the red curve in Figure 6.5.3a) introduces a

heating rate much larger than that of the test. Therefore, it is better to save

the rest of the tests for a time when the initial temperature is lower and the

tests can offer a large heating rate. The rate of temperature change (heating

rate in this case) is . Therefore the condition on heating rate is:

(6.5.1)

The temperature when the heating sequence is applied is denoted by .

When the high-power test is applied, the temperature is denoted by .

The heating rate can be calculated based on the current temperature and

upcoming power values using equation 6.1.1:

(6.5.2)

Combining equation 6.5.1, equation 6.5.2, and the equivalent of equation

6.5.2 written for the heating sequences (instead of high power tests in

equation 6.5.2) results in:

(6.5.3)

Figure 6.5.3 Thresholds in the integrated approach

(a) Heating and (b) Cooling.

[Plots: each panel shows the temperatures (upper part) and their derivatives (lower part) for the heating sequence, testing, and cooling interval.]


Considering the fact that at the moment of decision making, there is only

one actual temperature, ( ), the condition can be further

simplified to:

(6.5.4)

This could be re-written to have the condition expressed for the power

values:

(6.5.5)

Renaming to results in:

(6.5.6a)

Similarly, for the situation in which the temperature must decrease (as shown in

Figure 6.5.3b), the proper condition for switching from a test to a cooling

interval is:

(6.5.6b)

The power of the low-power test is denoted by and the power of the

cooling interval (i.e., the stray power) is denoted by . Switching to the

cooling interval when indicated by the above equation speeds up the

cooling. This way, the normal tests are employed in an efficient way during

temperature-cycling process so that the overall test application time is

further reduced.

According to equations 6.5.6a–b, the scheduling heuristic does not need to

compute the derivatives of the upcoming tests’ temperatures. Instead, it is

sufficient to compare the upcoming power values. Whenever the inequality

in equation 6.5.6a is satisfied, test nodes are followed by heating nodes and

whenever the inequality in equation 6.5.6b is satisfied, the testing is paused

for cooling purpose. The variables and (elements that construct

and vectors), are to be optimized along with and , in the outer

optimization loop (e.g., Figure 6.4.1), to achieve a short test application

time.
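
The switching rule can be sketched in code. The sketch below assumes a single-module RC model of the form C·dT/dt = P − (T − T_amb)/R and assumes that the conditions in equations 6.5.6a–b compare a difference of average upcoming powers against a threshold. All function and parameter names are hypothetical and chosen only for illustration.

```python
def heating_rate(temp, power, r_th, c_th, t_amb=0.0):
    """Rate of temperature change for an assumed single-module RC model:
    C * dT/dt = P - (T - T_amb) / R."""
    return (power - (temp - t_amb) / r_th) / c_th


def switch_to_heating(p_heat_avg, p_test_avg, threshold):
    """Assumed heating condition: leave the high-power test for the heating
    sequence when the heating sequence's average upcoming power exceeds the
    test's average upcoming power by more than the threshold."""
    return p_heat_avg - p_test_avg > threshold


def switch_to_cooling(p_test_avg, p_stray, threshold):
    """Assumed cooling condition: pause the low-power test for a cooling
    interval when the test's average upcoming power exceeds the stray power
    by more than the threshold."""
    return p_test_avg - p_stray > threshold
```

In this simplified model, the difference between two heating rates evaluated at the same temperature is (P1 − P2)/C, independent of the temperature itself. This is why a comparison of power values can stand in for a comparison of derivatives.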

These variables are optimized using PSO similar to the one explained in

section 6.4. As mentioned before, Figure 6.5.2 shows how the components

of the integrated approach (excluding the outer PSO loop that is already

explained in Figure 6.4.1) are put together. The path-graph scheduling is

shown in Figure 6.5.2 as a part of the scheduling algorithm. Other



components of Figure 6.5.2 will be explained in the upcoming sections.

Since the optimization process is similar to the PSO discussed in section

6.4, Figure 6.5.2, as a whole, can be viewed as one of the scheduler boxes

shown inside the dashed box in Figure 6.4.1. The alternative decision

variables shown in gray above Figure 6.5.2 come from Figure 6.4.1.

6.5.2 Length of the Power Averaging Window

The average upcoming powers (i.e., , , and ) can be calculated

for a short segment of the tests or heating sequences that immediately

follows. The shortest length of this segment is denoted by for module

. Having a much shorter segment than results in higher computational

effort without a significant improvement in the accuracy. Taking multiple

s into account helps to obtain a long-term estimate of the power values.

A much longer minimal segment length than is not desirable since a

more accurate estimate becomes unlikely to achieve.

The proper value of depends on the dynamics of the system. Consider

a that corresponds to percent ( ) of the final response

to a step input. Here, the final response is the steady state temperature and

the step input is when zero input power is followed by a constant power.

Assuming a constant power, the temperature equation in the time-domain

can be written according to equations 6.1.2–3. We assume that the step

response starts from the initial temperature equal to zero ( ).

Replacing with the percent of the final temperature results in:

(6.5.7)

Since the steady state situation means negligible variations in the

temperature, the temperature derivative can be assumed zero ( ).

By combining this observation with equation 6.1.1, the steady state

temperature can be described as:

(6.5.8)

Replacing from the above equation and from equation 6.1.3b in

equation 6.5.7 results in:

(6.5.9)

Here we are going to replace a scalar time, , with a matrix of time, .

Besides, we assume that the equivalence of the sides in the above equation


is achieved by satisfying the following equation (equation 6.5.10). These

assumptions work for estimating the values of ’s [Lin84].

(6.5.10)

Replacing from equation 6.1.3a results in

(6.5.11)

And finally

(6.5.12)

( ) is the time constants matrix [Lin84] (analogous to in

equations 6.1.4, 6.2.2–6, and 6.2.9 for a single-element case) and is the

matrix that contains the values of s. A diagonal element in (i.e.,

that is denoted by ) represents the proper minimal length for averaging

the upcoming test powers for module .

A ’s value obtained this way is not too short and will contain the

required information. On the other hand, the use of such values

prevents the temperature changes that are larger than from going

unnoticed. This percentage, , is only used for estimating the upcoming

tests’ average powers. The temperature simulations are always performed

based on the original power sequence. Therefore, the value of will not

affect them.

A set of experiments reported in [Aghaee15a] evaluates the accuracy of

values estimated using equation 6.5.12. The accurate value for is

obtained based on high quality temperature simulations. The average error

is found to be around 5%. This confirms that the above estimates have

sufficient accuracy, in practice.
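
For intuition, the single-module analogue of this derivation can be written out directly. The sketch below assumes a first-order thermal model with thermal resistance R and heat capacitance C, zero initial temperature, and a constant power step. It is a scalar illustration only, not the matrix estimate of equation 6.5.12, and the function names are hypothetical.

```python
import math


def averaging_window_length(r_th, c_th, percent):
    """Time at which a first-order step response reaches `percent` of its
    steady-state value, assuming T(t) = P * R * (1 - exp(-t / (R * C)))."""
    if not 0.0 < percent < 100.0:
        raise ValueError("percent must lie strictly between 0 and 100")
    tau = r_th * c_th  # time constant of the single-module model
    return -tau * math.log(1.0 - percent / 100.0)


def step_response(t, r_th, c_th, power):
    """Temperature rise at time t for a constant power step from zero."""
    return power * r_th * (1.0 - math.exp(-t / (r_th * c_th)))
```

With this model, a larger percentage gives a longer window, and the window scales linearly with the time constant R·C.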

6.5.3 Priorities for TAM Access

Normal tests, heating sequences, and cycling tests may compete for access

to TAM. The priority for letting module to access TAM is assigned

based on the following criterion.

(6.5.13)

Similar to equation 6.4.1, the priority is higher for the colder modules and

for the modules with larger remaining ATC. Moreover, a module’s priority



is higher if its current amount of remaining tests (denoted by ) is larger.

Both normal and cycling tests are taken into account for calculation.

The motivation for inclusion of , similar to that of , is to avoid a

small number of modules running long after all other modules have

completed their tests. Such a scenario implies inefficient use of TAM due

to lack of interleaving opportunities.

In the above equation, is used to calculate the priority for a module

running a heating sequence. For normal or cycling tests, instead of ,

that indicates sufficient cooling, as introduced in section 2.7, is used in

equation 6.5.13. In case of the cycling tests, is replaced with one

(removed from equation 6.5.13) since the value of ATC is not relevant

anymore (after the required ATC is achieved). The priorities are calculated

based on frequently updated values for amount of temperature cycling,

temperatures, and the size of the remaining tests. These values are sent out

from the path-graph scheduling box in Figure 6.5.2 and the resulting

priorities are sent back to it.
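
The qualitative behavior of the priority criterion can be illustrated with a small sketch. The product form below is purely an assumption made for illustration: it only reproduces the stated trends (colder modules, larger remaining ATC, and larger remaining tests receive higher priority) and is not the actual expression of equation 6.5.13. All names are hypothetical.

```python
def tam_priority(temp, temp_ref, tests_remaining, atc_remaining=1.0):
    """Illustrative priority: grows with the temperature headroom
    (temp_ref - temp), the remaining ATC, and the remaining test size.
    For cycling tests the ATC factor is left at its default of one."""
    return (temp_ref - temp) * atc_remaining * tests_remaining


def grant_tam(priorities, width):
    """Return the `width` modules with the highest priorities.
    `priorities` maps a module name to its priority value."""
    return sorted(priorities, key=priorities.get, reverse=True)[:width]
```

A scheduler built on this sketch would recompute the priorities whenever temperatures, accumulated cycling, or remaining test sizes are updated, and then grant TAM access to the top-ranked modules.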

6.5.4 Node Ordering in the Test Graph

The path-graph scheduling algorithm cannot be directly employed to solve

the problem that involves the original test graph (e.g., Figure 6.3.2). The

path-graph scheduling needs to know, at certain time points, the order of

the test nodes that will follow and sometimes also the order of the heating

nodes. A path-graph format is usually used to represent the node order for

different sub-graphs. These orders may change during the scheduling, as

different nodes are being included in the schedule’s final path-graph.

A node ordering technique is introduced in this section to determine the

proper node orders, over and over again, during the scheduling process.

This node ordering technique, put together with the path-graph scheduling

algorithm (section 6.5.1), solves the problem that involves the original test

graph, as shown in Figure 6.5.2.

For example, consider a test sub-graph with three normal and three heating

nodes, as shown in Figure 6.5.4a. The graph is simplified for the situation

in which the required ATC is not achieved yet. Therefore, all cycling tests

can be safely removed from the original test graph, for the moment.

Assume that the node (a normal test) is already included in the


schedule. Assume that after completion, the temperature must increase

in order to create a cycle.

The path-graph scheduling only needs to know what the sequence of the

normal test nodes (e.g., [ , ] in Figure 6.5.4b) would be if it decides

to continue the schedule with the high power tests. Furthermore, it needs

to know what the sequence of the heating nodes (e.g., [ , , ] in

Figure 6.5.4b) would be if it decides to continue with the heating

sequences. Based on the average power of these upcoming tests and

heating sequences, the path-graph scheduling algorithm decides which

node to include in the schedule, next.

For the above heating case, the high power orders for tests and heating

nodes are desirable. Similar to the above heating scenario, a node ordering

is performed also for the cooling scenario. In this case, there are no heating

nodes, and a low power order for the test nodes is the only thing to be

determined.

Let us continue with the test ordering for a heating scenario. Assume that

just one node can be considered at a time to determine the high power

order. Continuing with the previous example where is already selected,

if is larger than , then is selected to immediately follow

. Since only one node is left, the node ordering for the test nodes must

be [ , ].

Instead of only one node at a time, two nodes at a time, also, can be

considered to determine the high power order. In this case, the decision is

made based on a normalized power value for two nodes. For example if

Figure 6.5.4 Node ordering: (a) Original graph; (b) Ordered during the scheduling process for the time point that comes just after  has been scheduled.

[Figure: normal test nodes (N) s0,0, s0,1, s0,2 and heating nodes (H) s0,3, s0,4, s0,5. Panel (b) lists the normal test nodes as s0,0, s0,2, s0,1 and the heating nodes as s0,4, s0,3, s0,5, with edge labels p0,0-2 and p0,0-4.]



>

Then, the node ordering for the test nodes must be [ , ]. The

normalized power value, if node follows node , is denoted by .

Therefore, in the above example, . The number of nodes taken

into account at a time could be larger. Moreover, it might be helpful to

consider only a part of the test sequence at the beginning of a node.

Therefore, the ordering criterion can be generalized to consider the power

values inside a power assessment window. The length of the power

assessment window is defined as multiples of . The multiplier is denoted

by and it is sufficient to capture the window length. Thus, the length

of the power assessment window is cycles (of high-power test or

heating sequence). Assume that a node consists of samples and

cycles is equal to nodes plus cycles ( ). This

means that nodes ( ) will be involved. Assuming [ , , …,

, ] as the supposed node order, its normalized power is:

(6.5.14)

It is assumed that node is visited immediately after node ( for

). Note that unlike the test nodes that are visited only once, the

heating nodes may be repeated as needed. For example the heating nodes

could have [ ] as the order, although this has not happened in

Figure 6.5.4b. Similar to , which is for heating situations, is used

for the cooling situations.
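
The ordering criterion can be sketched as follows, assuming that the normalized power of a candidate order is the average of the power samples that fall inside the power assessment window. This is one plausible reading of equation 6.5.14, not a reproduction of it. The sketch enumerates all permutations for clarity, which is a simplification that only suits very small graphs; node names and sample values are hypothetical.

```python
from itertools import permutations


def normalized_power(order, nodes, window):
    """Average power over the first `window` cycles of the concatenated
    node sequence. `nodes` maps a node name to its per-cycle power samples."""
    samples = [p for name in order for p in nodes[name]][:window]
    if not samples:
        raise ValueError("the window must cover at least one sample")
    return sum(samples) / len(samples)


def best_order(nodes, window, heating=True):
    """Pick the high-power order (heating) or the low-power order (cooling)
    by exhaustive enumeration of all node orders."""
    pick = max if heating else min
    return pick(permutations(nodes),
                key=lambda order: normalized_power(order, nodes, window))
```

For example, with nodes `{"a": [1, 1, 5, 5], "b": [4, 4, 1, 1]}`, a window of 2 cycles favors starting with `b` for heating, whereas a window of 4 cycles favors starting with `a`. The window length therefore directly affects the chosen order.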

A small multiplier results in fast schedule generation, but the generated

schedules might not be as short as they would have been with a large

multiplier. A large multiplier, on the other hand, results in a slow schedule

generation. Moreover, too large a multiplier may delay the use of some of the

best heating sequences so much so that they are left unused at the end. The

proper and values are obtained in the external optimization loop.

In the inner ordering/scheduling loop, the multiplier values defined by the

outer optimization loop are used to generate the orders and the schedule as

shown in Figure 6.5.2. The outer optimization loop consists of a particle

swarm optimization algorithm as previously described in Figure 6.4.1.


After the required amount of cycling is achieved, the remaining normal

tests and cycling tests must be performed. In this case, heating nodes as

well as the already applied normal tests can be safely removed from the

original test graph. The newly created test sub-graph must be converted to

a path-graph whenever the path-graph scheduling algorithm (detailed in

section 6.5.1) demands a new node.

The module temperature may be high due to its previous activities or

because of a high temperature in adjacent modules (heat transfer among

modules). When the module’s temperature is too high and close to the

overheating limit ( ), it might be helpful to find a node ordering

that swiftly reduces the power. This is important in a short-time window

and moreover usually a rather low-power test sequence may be found if the

power-assessment window is rather short.

It might not matter if this node order results in a higher test power some

time later, since then the module might be cold. In such an emergency

situation a short power-assessment window, denoted by , is used. There

is an emergency situation if the current temperature is larger than the

emergency temperature limit, denoted by . If the temperature is less than

then the situation is ordinary. In any case, the nodes must be ordered in

a low power configuration as detailed above.

The length of power-assessment window in this ordinary situation is

denoted by . It might be helpful to have a long ordinary power-

assessment window ( ) to avoid large switching activities in a

long-term sense (as opposed to short-term low-power in emergency

situations). The value of is optimized in the outer optimization loop

along with , , , , , , , , and . This outer

optimization loop is similar to Figure 6.4.1 and sends the alternative

decision variables to the integrated scheduling algorithm, as shown in

Figure 6.5.2. The alternative decision variables shown coming to Figure

6.5.2 are from Figure 6.4.1. The generated schedule and its corresponding

TAT shown going out of Figure 6.5.2 are used in Figure 6.4.1.
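
The choice between the emergency and the ordinary power-assessment window can be expressed as a simple rule. The sketch below assumes that the window length in cycles is the product of a multiplier and the minimal segment length, rounded to a whole number of cycles. Parameter names are hypothetical.

```python
def assessment_window(temp, t_emergency, mult_emergency, mult_ordinary, segment):
    """Return the power-assessment window length in cycles.
    The short emergency multiplier is used when the current temperature
    exceeds the emergency temperature limit; otherwise the ordinary
    multiplier is used. `segment` is the minimal segment length in cycles."""
    mult = mult_emergency if temp > t_emergency else mult_ordinary
    return max(1, round(mult * segment))
```

In both situations the nodes are then ordered for low power. Only the horizon over which the power is assessed differs.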

The search to find the best order (e.g., a path in a graph similar to Figure

6.5.4) is performed using a branch and bound approach which searches the

graph down to a depth equal to cycles (also , , or ,

correspondingly). The cost function (normalized power similar to equation

6.5.14) can be replaced with the accumulated power since all the



alternatives have the same ( , , or , correspondingly). When a

low power order is required (corresponding to the situation in which ,

, or are used), the search can be very fast, since after a relatively

good path is found, the bad candidates’ accumulated power values rapidly

exceed the already found relatively small power value. Consequently, the

inferior candidate paths are rapidly discarded. The search for heating

situations may take longer but, nevertheless, the overall schedule

generation procedure is adequately fast.
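
A minimal branch-and-bound sketch for the low-power case is given below. It assumes non-negative per-cycle power samples, test nodes that are each visited once, and the accumulated power inside the window as the cost. The data layout and names are hypothetical, and the sketch is not the implementation used in the experiments.

```python
def lowest_power_order(nodes, window):
    """Search for the node order with the smallest accumulated power over
    the first `window` cycles. `nodes` maps a node name to a list of
    non-negative per-cycle power samples."""
    best = {"cost": float("inf"), "order": None}

    def search(order, used_cycles, cost, remaining):
        # Bound: power is non-negative, so the cost of a partial path can
        # only grow; discard it once it cannot beat the best path so far.
        if cost >= best["cost"]:
            return
        if used_cycles >= window or not remaining:
            best["cost"], best["order"] = cost, tuple(order)
            return
        for name in sorted(remaining):
            # Only the part of the node that fits in the window counts.
            samples = nodes[name][: window - used_cycles]
            search(order + [name], used_cycles + len(samples),
                   cost + sum(samples), remaining - {name})

    search([], 0, 0.0, frozenset(nodes))
    return best["order"], best["cost"]
```

For nodes `{"a": [5, 5], "b": [1, 1], "c": [2, 9]}` and a window of 3 cycles, the search returns the order `("b", "c")` with an accumulated power of 4. Every path that starts with `c` is discarded immediately, because the cost of `c` alone already exceeds the best value found.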

6.5.5 Remarks

As mentioned before, the proposed techniques are designed so that a test

dependency graph can be accommodated. In many other cases, the test

dependencies rule out some of the combinations in the schedule and

therefore reduce the search space, which helps to achieve a faster schedule

generation (shorter CPU time). In the case of test ordering, especially for

the test graph, test dependencies will remove some of the edges in the test

graph. Therefore techniques that utilize the fact that the test graph is a

complete graph are not helpful in this case.

The proposed technique uses temperature simulations in order to generate

a test schedule that has certain temperature characteristics. We are using a

good simulator and, therefore, there is no large temperature error. Since the

error is minor, a safety margin is sufficient to prevent overheating, as

discussed in chapter 4.

To ensure that a sufficient amount of cycling has been applied before the

related tests, a slightly larger amount of required cycling can be assumed.

It is assumed that a node in the test graph can be paused and resumed. This

is required for on-demand cooling as well as partitioning and interleaving.

In other words a session-less testing scheme is used. A certain module can

pause and resume its test but it cannot change to a different node before it

completes the node that it has already started.

A node in a test graph may consist of a single test vector or a number of

vectors that are applied one after the other. In general, a node in the test

graph consists of a single test vector and, therefore, the test graph is large.

Despite the large sizes of the test graphs, the scheduling heuristic is capable of

handling them, since it is very fast. However, if the number of test vectors

is excessively large, then the schedule generation may become slow. In

such a situation, multiple test vectors can be grouped into a single node in


the test graph. Ideally, test vectors whose relative order has little effect on

power dissipation should be grouped together. This

reduces both the computational effort and the loss of ordering

effectiveness. The test-vector clustering problem (i.e., how to group the

test vectors into the nodes of a test graph) is, however, outside the scope of

this thesis.

It should also be mentioned that there are scenarios in which a

chamber-based technique is required. For example, after the packaging, to

perform cycling tests targeting the IC features that are external to dies, a

chamber-based technique is required.

A chamber-based approach enforces, however, the maximal cycling

acceleration on all modules. This may lead to longer overall test time and

unnecessary aging of modules that require less cycling acceleration. The

integrated approach, on the other hand, can be faster and cheaper than the

chamber-based approach. Moreover, it supports different amounts of

temperature cycling for different modules. For example, one module can

receive very little cycling acceleration, while another module receives a

very large cycling acceleration, as needed.

6.6 Experimental Results

Experiments have been performed to demonstrate that the proposed

technique can efficiently achieve desired temperature-cycling

accelerations. Moreover, it is demonstrated that the proposed integrated

approach offers a smaller test application time and, therefore, outperforms

the three-phase approach. However, if the normal or cycling test schedules

provided by a third party have to be used, the three-phase approach must

be chosen. In the following, first the cycling acceleration effect is

demonstrated in section 6.6.1 and then, in section 6.6.2, the performance

of the proposed approach is discussed.

6.6.1 Cycling Acceleration

The proposed integrated approach is used to perform tests and cycling

acceleration for an IC with two modules, as a demonstrator example. It is

assumed that the TAM in this example can only support one module to be

tested at a time ( =1). The corresponding temperature curves are plotted

in Figure 6.6.1a. At the beginning (before ) there are many normal



tests that are properly mixed with heating sequences and cooling intervals

in order to create a high cycling rate. As time goes on, the number of

normal tests that can be effectively used reduces and, therefore, the

majority of cycling is generated by a mix of heating sequences and cooling

intervals (which is, in general, faster and more effective). Around ,

the required amounts of temperature cycling for the two modules are met

and the cycling tests as well as the remaining normal tests can be applied

until all tests are performed (around ).

As more and more temperature cycles are performed (as in Figure 6.6.1a),

the amount of temperature cycling accumulates as suggested by the

increasing accelerated time in Figure 6.6.1b. The vertical axis in Figure

6.6.1b is the accelerated cycling time and the horizontal axis is the actual

time. Moreover, the temperature curves in Figure 6.6.1a are used to

Figure 6.6.1 Cycling acceleration

[Figure: (a) temperature [°C] of modules m0 and m1 versus time [ms]; (b) accelerated time [hours] of modules m0 and m1 versus time [ms]; (c) magnified section of the temperature [°C] curve of module m0, with the label ×4000 cycles.]


compare the proposed integrated approach (Alternative 1, below) with a

chamber-based technique (Alternative 2):

Alternative 1

Let us evaluate our proposed integrated approach here. A middle section

of the temperature curve from Figure 6.6.1a is magnified in Figure 6.6.1c

for module . Temperature swings between and , resulting in

a temperature-cycle amplitude equal to ( ). The average

temperature is approximately ( ). A temperature cycle

happens in test cycles. The test is performed at

. Therefore:

.

Equation 6.1.5 should be used to calculate the ATC value. Here we assume

that , , , , and . Therefore,

the ATC value achieved in a second is:

Alternative 2

Assume a chamber-based approach that uses Thermotron® test chamber

number SE-400-15-15. According to its specifications, this chamber can

create a temperature-cycle, similar to that of Alternative 1, in

approximately 380 seconds. Therefore:

and , .

The ATC value achieved in a second is:

The amount of temperature cycling per second achieved by the chamber-

based technique is around while the integrated approach

achieves around . This means that our approach outperforms the

chamber-based technique by a huge margin (almost 180000 times)³.

³ Although other chamber setups may perform better, their corresponding

margins will still be very large.



6.6.2 Performance of the Integrated Approach

The proposed techniques are evaluated on a set of 24 experimental ICs as

detailed in Table 6.6.1. Column 1 indicates the IC’s serial numbers. These

ICs have one to four stacked dies (column 2). The ICs with one layer

(number 1 to 6) correspond to dies at the pre-bond test stage. The ICs with

more than one layer represent a mid-bond or a post-bond test stage. Each

die accommodates 2, 12, 20, 30, 42, or 49 modules, resulting in 2 to 196

Table 6.6.1 Percentage changes achieved by integrated approach

IC specifications         Percentage change in TATs       Percentage change in CPU times
                          Integrated approach compared    Integrated approach compared
                          with:                           with:
IC  Number of  Number of     Three-phase     Three-phase     Three-phase     Three-phase
       layers    modules   w/o ordering¹  with ordering²   w/o ordering³  with ordering⁴
 1          1          2          –22.99          –21.75            0.00            0.00
 2          1         12          –17.02          –14.81            0.00            0.00
 3          1         20          –29.73           –2.12            0.00            0.00
 4          1         30          –22.24           –5.51          300.00          300.00
 5          1         42          –12.65          –10.08          200.00          200.00
 6          1         49          –33.33          –32.19          500.00          500.00
 7          2          4          –19.08          –19.08            0.00            0.00
 8          2         24           –2.13           –2.13          200.00          200.00
 9          2         40           –4.58           –3.74          500.00          500.00
10          2         60           –8.09           –7.34          350.00          350.00
11          2         84           –9.81           –9.52          325.00          325.00
12          2         98           –7.11           –2.94          300.00          300.00
13          3          6          –38.16          –35.60            0.00            0.00
14          3         36          –32.47          –20.71          400.00          400.00
15          3         60          –15.05          –13.85          333.33          333.33
16          3         90          –9.034           –8.92          283.33          283.33
17          3        126          –10.88           –7.45          169.23          191.67
18          3        147          –22.32          –12.98           83.33           83.33
19          4          8          –52.83          –44.92            0.00            0.00
20          4         48          –22.12          –21.95           25.00           25.00
21          4         80          –25.04          –24.86           13.04           13.04
22          4        120          –16.22          –16.03           18.48           15.96
23          4        168          –16.97          –16.96           14.43           18.08
24          4        196          –22.99          –13.99           20.50           26.87
Average                           –19.70          –15.39          168.15          169.40

¹ Percentage change in the integrated approach’s test application time ( ) compared with the 3-phase approach without test ordering ( )

² Percentage change in the integrated approach’s test application time ( ) compared with the 3-phase approach with test ordering ( )

³ Percentage change in the integrated approach’s CPU time ( ) compared with the 3-phase approach without test ordering ( )

⁴ Percentage change in the integrated approach’s CPU time ( ) compared with the 3-phase approach with test ordering ( )


modules per IC, as shown in column 3. The number of the corresponding

nodes in the test graphs is between 60 and 5880 and test sizes are between

234 kB and 22 MB. The thermal models are extracted using an approach

similar to [Coskun09]. The switching activities for tests and heating

sequences are generated using Markov chains, similar to [Yao11c]. All

experiments are performed on a desktop computer with Intel® Xeon®

W3520 processor and 8 GB of memory.

The required amount of cycling is separately defined for each module in

the experimental ICs. It does not depend on the scheduling method and,

therefore, both the three-phase and integrated approaches must enforce

identical amounts of temperature cycling. Since before enforcing ,

cycling tests cannot be performed, the cycling operations continue until

is enforced. This implies that it is sufficient to compare the test

application times.

The integrated approach achieves shorter TAT compared with the three-

phase approach for all of the experimental ICs. Two types of the three-

phase approach are considered. Type 1 does not use any ordering

technique. Type 2 uses a test ordering technique. The percentage changes

compared with the three-phase approach without test ordering (type 1) and

with test ordering (type 2) are reported in columns 4 and 5, respectively.

On average, the proposed technique outperforms the three-phase technique

with ordering and the three-phase technique without ordering by about 15

percent and 20 percent, respectively. Since the test ordering offers a

reduced TAT, the improvements achieved by the integrated approach are

larger when compared with the three-phase without test-ordering approach.

Compared with the three-phase approach, the integrated approach is more

complicated and, therefore, it takes more time to run. The CPU times are

rounded to seconds and then percentage changes are calculated. The

percentage changes in the CPU times are reported in columns 6 and 7 for

the integrated approach compared with the three-phase approach without

and with test ordering, respectively. On average, the proposed technique is

slower than the three-phase technique with ordering and the three-phase

technique without ordering by about 169 percent and 168 percent,

respectively.
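
The percentage changes can be computed as sketched below, assuming the usual definition of a relative change with respect to the three-phase result. The CPU times in the usage note are hypothetical values chosen only to illustrate the effect of rounding to whole seconds.

```python
def percentage_change(new, old):
    """Relative change of `new` with respect to `old`, in percent.
    Negative values mean that `new` is smaller (for example a shorter TAT)."""
    if old == 0:
        raise ValueError("the reference value must be non-zero")
    return 100.0 * (new - old) / old
```

Because the CPU times are rounded to seconds first, short runs produce coarse values. For instance, hypothetical times of 4.4 s and 1.2 s round to 4 s and 1 s and give a change of 300 percent, while times of 1.4 s and 0.6 s both round to 1 s and give a change of 0 percent.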

The CPU times for the three-phase approaches with and without test

ordering are comparable. Compared with the three-phase without ordering

approach, the three-phase with ordering approach is more complicated for



the decision-points⁴ in the schedule. This is due to the time taken to search

for a good node order. On the other hand, the three-phase with node

ordering approach offers slightly shorter schedules, which means fewer

decision-points. Consequently, sometimes shorter schedules compensate

for the time-consuming node ordering operations, but not always.

Therefore, sometimes the number in column 6 is smaller than the number

in column 7, for example for IC number 17, and sometimes larger, for

example for IC number 22.

Note that in these experiments for CPU times we included the scheduling

times for the normal and cycling tests. If the schedules had been

provided by a third party, the actual CPU times for the three-phase

approaches would be smaller. Consequently, the numbers reported in

columns 6 and 7 could be larger than the current values. Shorter CPU time

can be considered as an advantage for the three-phase approach.

CPU times, in general, grow with the test size as shown in Figure 6.6.2a.

Moreover, CPU times grow also with the number of modules and layers as

shown in Figure 6.6.2b. The data points in Figure 6.6.2a represent

multiples of test sizes used in Figure 6.6.2b. The growth rates are however

⁴ A decision-point is a point at which a module’s state (testing/heating/cooling) or

test/heating node may change.

Figure 6.6.2 CPU time growth (a) with test size, (b) with IC complexity

[Figure: (a) CPU time [min] versus test size [kB]; (b) CPU time [sec] versus number of modules × number of layers.]


acceptably low and the scheduling process for the largest IC (number 24)

takes less than 5 minutes to complete.

6.7 Conclusions

Temperature-cycling acceleration is a useful technique to help the

detection of cycling-dependent early-life failures. These failures are

usually not considered as a major issue for conventional 2D ICs. Therefore,

cycling acceleration is usually recommended when a high degree of

reliability is crucial. Recent studies have shown that the cycling-dependent

early-life failures can be a major issue for 3D stacked ICs. The existing

cycling acceleration procedures are very costly since they are usually

performed using temperature chambers. In this thesis we propose an

inexpensive technique to order the tests and heating sequences so that

required temperature cycling effects can be achieved in a short time,

without the use of temperature chambers.

For this purpose tests are ordered differently based on the required power

for the related situation. When a module’s temperature must increase to

generate a temperature cycle, a high-power ordering of the tests and

heating sequences is considered. When the temperature must

decrease, a low-power ordering of the tests is used instead. During the

tests, after the required cycling is achieved, depending on the current

module’s temperature, a long term or a short term low-power ordering of

the tests is selected. All these help to achieve a short test application time,

as demonstrated by the experiments. Consequently, this integrated

approach is well suited for use in pre-, mid-, and post-bond test

stages for 3D stacked ICs.




Represents heat capacitances in the thermal model

Amount of temperature cycling

Required amount of temperature cycling for module

Represents thermal conductance (related to heat transfer) in the

thermal model

Heat capacitance in a simple single module case (analogous to )

Identity matrix

Constants in ATC equation

Number of temperature cycles for module

Power value(s) in a general case

Heating sequences’ powers

High-power tests’ powers

Low-power tests’ powers

Stray power

PSO Particle Swarm Optimization [Poli07]

Thermal resistance in a simple single module case (analogous to

)

Remaining tests’ size

Node in module ’s test graph (same as th test for module

)

TAM Test Access Mechanism

TAT Test Application Time

Transfer matrix for initial temperatures

Transfer matrix for power values

Constant in ATC equation

Number of samples/cycles in one test/heating node.



Temperatures vector in a general case

Ambient temperature

Overheating temperature limit

Stray temperature (caused by stray power, )

Initial temperature(s)

Final temperature(s) after seconds

Steady state temperatures

High cycling temperature limit for module

Low cycling temperature limit for module

Emergency temperature limit for module

Stop cooling temperature limit for module

Threshold for temperature cycle amplitudes

Average cycling temperature for module

The amplitude of temperature cycles for module

Heating rate of the heating sequences

Heating rate of the high-power tests

Remaining number of cycles when full nodes are subtracted from

the length of power assessment window ( ). covers

nodes plus cycles.

Minimal segment lengths for power averaging/assessment (Vector

and element formats)

Average cycling temperature for single module example

Number of full nodes in the power assessment window ( ).

covers nodes plus cycles.

TAM access priority for module




ATC rate

Half of temperature cycle amplitude for single module example

Time period between the initial temperatures and the final

temperature

Threshold on power difference to decide between cooling interval

or low-power test application

Number of nodes involved in the power assessment window

( ).

Threshold on power difference to decide between heating

sequence or high-power test application

Length of the power assessment window. cycles are

considered. Refers indifferently to , , , or .

for node ordering in Cooling situation.

for node ordering in thermal Emergency situation.

for node ordering in Heating situation.

for node ordering in Ordinary situation.


Chapter 7 Conclusions and Future Work

7.1 Conclusions

Many cutting-edge computer and electronic products are based on

advanced Systems-on-Chip (SoC). Advanced SoCs are manufactured with

deep submicron and 3D-stacked-IC technologies. These advanced

manufacturing technologies enable the integration of a large number of

high performance functions. Such advanced manufacturing technologies

face a number of thermal challenges with regard to their reliability and

testing procedures. These challenges, related to temperature uncertainty,

temperature gradients, and temperature cycling have been addressed in this

thesis.

Temperature Uncertainty

Advanced SoCs manufactured with deep submicron technologies suffer

from process variation and its thermal consequences. Existing testing

techniques rely on temperature simulations to predict the circuit-under-

test’s temperatures and design the test so that overheating is prevented. The

difference between the expected temperatures and the actual temperatures

is called temperature error. This error, for past technologies, is negligible.

However, advanced SoCs experience large error magnitudes due to large

process variations. Such large error magnitudes have costly consequences

(e.g., test overkill and overheating) and must be taken care of.

This thesis presents several scheduling-based approaches to take care of

the temperature errors induced by process variation. An adaptive technique

for addressing the intra-die and time-variant errors is introduced. This

technique is designed to support a thermal-safe test which means that a

high-temperature limit must be respected. A slightly different scenario is

multi-temperature testing which also requires considering a low-


temperature limit. An adaptive technique to deal with temperature errors

that affect multi-temperature testing is therefore proposed.

Temperature Gradients

Temperature gradients in a chip accelerate certain defect mechanisms

including some types of early-life failures. Therefore, performing a burn-

in like operation that enforces appropriate gradients helps to accelerate and

detect these early-life failures. A test-scheduling based approach for

performing burn-in like operations is proposed in this thesis. The proposed

approach enforces the required temperature gradients by selectively

applying high power test stimuli to the circuit-under-test. This way, the

required life-time acceleration is achieved without requiring temperature

chambers.

Temperature gradients also affect some delay-related defects. The delay

experienced by a signal depends on its path temperature. Moreover, some

defects (e.g., resistive opens) can also affect the delay. Different signals

travelling through different paths may therefore experience different

delays because of a subtle defect in one of the paths as well as the path’s

temperature. This means that the circuit may operate correctly when the

gradients are negligible even though a subtle defect exists. However, this

negligible defect may cause a fault when a certain gradient occurs on the

chip. In order to detect such subtle defects, the related tests must be applied

when appropriate temperature-gradients are enforced. 3D stacked ICs

experience large gradients and, therefore, the proposed techniques are

developed so that they can be efficiently applied to 3D stacked ICs.

Temperature Cycling

Temperature-cycling test procedures are usually applied to safety-critical

systems to detect cycling-related early-life failures. Such failures affect

advanced SoCs, particularly through-silicon-via structures in 3D-stacked-

ICs. An efficient schedule-based cycling-test technique that combines

cycling acceleration with testing is proposed in this thesis.

The circuit-under-test’s dissipated power depends on the order in which

the tests are applied. Therefore, the tests are reordered by the proposed

technique to adjust the power dissipation levels as needed. This helps to

achieve a short test application time. Moreover, the proposed technique fits

into existing 3D testing procedures and does not require temperature


chambers. Therefore, the overall cycling acceleration and testing cost can

be drastically reduced.
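
The idea of steering temperature through test order can be illustrated with a deliberately simplified sketch. It is not the scheduling algorithm of this thesis: it assumes a single module with a first-order thermal model, tests of equal duration, and a greedy rule that applies the highest-power remaining test while heating and the lowest-power remaining test while cooling. The function names, the thermal parameters, and the temperature limits are all hypothetical.

```python
import math

def step(temp, power, r_th, c_th, t_amb, dt):
    """Closed-form update of a first-order thermal model over dt seconds."""
    t_ss = t_amb + r_th * power          # steady-state temperature for this power
    return t_ss + (temp - t_ss) * math.exp(-dt / (r_th * c_th))

def schedule_with_cycling(test_powers, t_low, t_high, r_th=2.0, c_th=0.05,
                          t_amb=45.0, dt=0.02):
    """Greedy ordering: hot tests while heating, cool tests while cooling.

    Returns the order in which the tests were applied and the number of
    completed heat-up/cool-down temperature cycles.
    """
    remaining = sorted(test_powers)      # ascending power (hypothetical values, W)
    order, cycles = [], 0
    temp, heating = t_amb, True
    while remaining:
        # Highest-power test when heating, lowest-power test otherwise.
        power = remaining.pop() if heating else remaining.pop(0)
        order.append(power)
        temp = step(temp, power, r_th, c_th, t_amb, dt)
        if heating and temp >= t_high:
            heating = False              # upper limit reached, start cooling
        elif not heating and temp <= t_low:
            heating = True               # lower limit reached, one cycle done
            cycles += 1
    return order, cycles

order, cycles = schedule_with_cycling(
    [1, 2, 12, 14, 1, 13, 2, 15, 1, 12], t_low=56.0, t_high=60.0)
```

Every test is still applied exactly once; only the order changes, which is why such cycling can be obtained without extending the test set in this toy setting.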

Temperature Simulation and Experiments

A fast temperature simulation technique based on a closed-form solution

for the temperature equations is introduced in this thesis. Dedicated

experiments show that the proposed simulation technique reduces the

schedule generation time by more than half. This technique is used in the

majority of the experiments reported in this thesis.
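
To illustrate why a closed-form solution speeds up simulation, consider the simple single-module case, with one thermal resistance and one heat capacitance. The sketch below is an illustration under stated assumptions, not the implementation used in this thesis: the function names and all numeric values are hypothetical. It compares a single closed-form evaluation with many small explicit-Euler steps of the same first-order model.

```python
import math

def closed_form_temperature(t0, power, r_th, c_th, t_amb, dt):
    """Temperature after dt seconds for C*dT/dt = P - (T - T_amb)/R."""
    t_ss = t_amb + r_th * power              # steady-state temperature
    decay = math.exp(-dt / (r_th * c_th))    # exponential approach to t_ss
    return t_ss + (t0 - t_ss) * decay

def euler_temperature(t0, power, r_th, c_th, t_amb, dt, steps):
    """Same model, integrated numerically with `steps` explicit-Euler steps."""
    temp = t0
    h = dt / steps
    for _ in range(steps):
        temp += h * (power - (temp - t_amb) / r_th) / c_th
    return temp

# Hypothetical values: 2 K/W, 0.05 J/K, 45 °C ambient, 10 W test power, 0.1 s.
exact = closed_form_temperature(45.0, 10.0, 2.0, 0.05, 45.0, 0.1)
approx = euler_temperature(45.0, 10.0, 2.0, 0.05, 45.0, 0.1, 10_000)
```

The closed form needs one exponential per interval, whereas the stepped integration needs thousands of updates to reach comparable accuracy. In the multi-module case the scalar exponential becomes a matrix exponential, which can be precomputed once per interval length.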

All the proposed techniques in this thesis have been implemented and

evaluated with extensive experiments based on the ITC’02 benchmarks as well

as a number of experimental 3D-stacked-ICs. Experiments show that the

proposed techniques work effectively and reduce the costs, in particular

the costs related to addressing thermal issues and early-life failures.

7.2 Future Work

In this thesis we focused on the manufacturing test process. However, in-

field and online testing are required, for example, for safety-critical

systems. Similar issues to those considered in this thesis, for manufacturing

testing, can cause problems during in-field and online testing as well.

Temperature issues caused by process variation and temperature gradients

are among these issues.

Temperature cycling for applications that require frequent in-field or

online testing is another direction for future research. Designing

testing procedures that minimize temperature cycling can be of interest,

in order to slow down the aging process. This is also true for minimizing

the gradients. Utilizing the already existing gradients (during normal

operation) for online gradient-based testing can be efficient and therefore

interesting to study.

Adaptive online testing is another related topic. Temperature cycling and

gradients can be monitored (during the normal operation) and online tests

targeting the weakened areas (likely defects) can be applied. The

temperature errors (caused by process variation) can also be estimated

during the normal operation and then a decision between using a slow (low-

power) and a fast (high-power) online test scheme can be made

accordingly. For example, for modules that run warmer than usual, a

longer low-power online test might be a good choice. On the other hand,


for modules that run colder than usual, a faster high-power online test

might be a good choice. This may change over time, partly in relation to

gradients and cycling.
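
Such a decision could be as simple as a threshold rule on the estimated temperature error. The following sketch is only one possible formulation: the function name, the scheme labels, and the 2 °C margin are hypothetical and are not taken from the work reported in this thesis.

```python
def choose_online_test(measured_temp, expected_temp, margin=2.0):
    """Pick an online test scheme from the estimated temperature error.

    All names and the margin are hypothetical; temperatures are in °C.
    """
    error = measured_temp - expected_temp
    if error > margin:
        return "low-power"    # module runs warmer than expected: slow scheme
    if error < -margin:
        return "high-power"   # module runs colder than expected: fast scheme
    return "nominal"          # within the margin: keep the default scheme

# Three modules, all expected at 70 °C, measured at 75, 64, and 71 °C.
choices = [choose_online_test(t, 70.0) for t in (75.0, 64.0, 71.0)]
```

Because the temperature error may drift with gradients and cycling, such a rule would have to be re-evaluated periodically rather than fixed once.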

In a manufacturing test setup, testing frequency can be used to alter the

power dissipation, if the DfT circuitry and the ATE support it. For

example, when cooler testing is preferred, the frequency can be reduced.

If heating is required, then the frequency could be increased to generate

more heat. Although not used in this thesis, testing frequency can be added

as another decision variable to the problem formulation.
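
As a rough illustration of how frequency could enter the formulation, the sketch below assumes that test power is purely dynamic and therefore proportional to the test clock frequency at a fixed supply voltage, and that the module follows a single-resistance steady-state model. Leakage is ignored. The function names and all numeric values are hypothetical.

```python
def steady_state_temperature(freq_hz, energy_per_cycle_j, r_th, t_amb):
    """Steady-state temperature when power scales linearly with frequency."""
    power = energy_per_cycle_j * freq_hz      # P = E_cycle * f (dynamic only)
    return t_amb + r_th * power

def max_safe_frequency(t_limit, energy_per_cycle_j, r_th, t_amb):
    """Highest frequency whose steady-state temperature equals t_limit."""
    return (t_limit - t_amb) / (r_th * energy_per_cycle_j)

# Hypothetical values: 100 nJ per cycle, 2 K/W, 45 °C ambient.
temp_at_100mhz = steady_state_temperature(100e6, 1e-7, 2.0, 45.0)   # about 65 °C
f_max = max_safe_frequency(85.0, 1e-7, 2.0, 45.0)                   # about 200 MHz
```

Under these assumptions, lowering the frequency lowers the steady-state temperature proportionally, at the cost of a proportionally longer test application time, which is the trade-off a scheduler would have to weigh.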

Defect explorations and reliability studies can identify new challenging

issues that need to be addressed, especially for new technologies. Many

potential issues regarding through silicon vias are already identified, some

of which are discussed in this thesis. As 3D stacked IC technology matures,

more issues may be identified. This is, in particular, important for

applications that require high reliability.

The focus of this thesis is mainly on logic, even though some of the

proposed techniques can be applied to any entity that has the

properties of a module (as discussed earlier). There exist several non-logic

components that are usually integrated into advanced SoCs and similar

devices. Memory modules are very important among these devices and are

widely studied in connection with normal 2D and 3D-stack technologies.

Process variation, temperature gradients, and temperature cycling affect

the memories, too.

There are other components that are similarly affected by these negative

effects. Image sensors that are widely used today are among them. CMOS

image sensors can be manufactured using through silicon vias.

Consequently, defects that relate to through silicon vias affect such image

sensors, among other sources of defects. Developing new testing techniques

as well as extending and specializing the methods proposed in this thesis

can be of interest for all these non-logic components.


References

[Abramovici94] Miron Abramovici, Melvin A Breuer, and Arthur D Friedman. DIGITAL SYSTEMS

TESTING AND TESTABLE DESIGN, 1994. [Online]. Available:

http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0780310624.html.

[Accessed: 29-May-2015].

[Aghaee10] Nima Aghaee, Zhiyuan He, Zebo Peng, and Petru Eles. TEMPERATURE-AWARE

SOC TEST SCHEDULING CONSIDERING INTER-CHIP PROCESS VARIATION, in 19th

Asian Test Symposium (ATS), pages 395–398, 2010.

[Aghaee11a] Nima Aghaee, Zebo Peng, and Petru Eles. ADAPTIVE TEMPERATURE-AWARE SOC

TEST SCHEDULING CONSIDERING PROCESS VARIATION, in 14th Euromicro

Conference on Digital System Design (DSD), Oulu, Finland, pages 197–204, 2011.

[Aghaee11b] Nima Aghaee, Zebo Peng, and Petru Eles. PROCESS-VARIATION AND

TEMPERATURE AWARE SOC TEST SCHEDULING USING PARTICLE SWARM

OPTIMIZATION, in 6th International Design and Test Workshop (IDT), Beirut,

Lebanon, pages 1–6, 2011.

[Aghaee13a] Nima Aghaee, Zebo Peng, and Petru Eles. PROCESS-VARIATION AND

TEMPERATURE AWARE SOC TEST SCHEDULING TECHNIQUE, Journal of Electronic

Testing, vol. 29, no. 4, pages 499–520, Aug. 2013.

[Aghaee13b] Nima Aghaee, Zebo Peng, and Petru Eles. TEMPERATURE-GRADIENT BASED TEST

SCHEDULING FOR 3D STACKED ICS, in 20th International Conference on

Electronics, Circuits, and Systems (ICECS), Abu Dhabi, UAE, pages 405–408,

2013.

[Aghaee14a] Nima Aghaee, Zebo Peng, and Petru Eles. AN EFFICIENT TEMPERATURE-GRADIENT

BASED BURN-IN TECHNIQUE FOR 3D STACKED ICS, in Design, Automation and Test

in Europe Conference and Exhibition (DATE), Dresden, Germany, pages 1–4,

2014.

[Aghaee14b] Nima Aghaee, Zebo Peng, and Petru Eles. PROCESS-VARIATION AWARE MULTI-

TEMPERATURE TEST SCHEDULING, in 27th International Conference on VLSI

Design, Mumbai, India, pages 32–37, 2014.

[Aghaee15a] Nima Aghaee, Zebo Peng, and Petru Eles. AN INTEGRATED TEMPERATURE-

CYCLING ACCELERATION AND TEST TECHNIQUE FOR 3D STACKED ICS, in 20th Asia

and South Pacific Design Automation Conference (ASP-DAC), Chiba, Japan, pages

526–531, 2015.


[Aghaee15b] Nima Aghaee, Zebo Peng, and Petru Eles. TEMPERATURE-GRADIENT-BASED BURN-

IN AND TEST SCHEDULING FOR 3-D STACKED ICS, IEEE Transactions on Very

Large Scale Integration (VLSI) Systems, vol. PP, no. 99, pages 1–1, 2015.

[Ahmed05] Nisar Ahmed, Mohammad Tehranipoor, and CP Ravikumar. ENHANCED LAUNCH-

OFF-CAPTURE TRANSITION FAULT TESTING, in International Test Conference, pages

255–264, 2005.

[Ayala09] José L Ayala, Arvind Sridhar, Vinod Pangracious, David Atienza, and Yusuf

Leblebici. THROUGH SILICON VIA-BASED GRID FOR THERMAL CONTROL IN 3D

CHIPS, in Nano-Net, Springer, pages 90–98, 2009.

[Bahukud08a] Sudarshan Bahukudumbi and Krishnendu Chakrabarty. POWER MANAGEMENT FOR

WAFER-LEVEL TEST DURING BURN-IN, in 17th Asian Test Symposium, pages 231–

236, 2008.

[Bahukud08b] Sudarshan Bahukudumbi and Krishnendu Chakrabarty. TEST-PATTERN ORDERING

FOR WAFER-LEVEL TEST DURING BURN-IN, in 26th VLSI Test Symposium, pages

193–198, 2008.

[Bahukud09] Sudarshan Bahukudumbi and Krishnendu Chakrabarty. POWER MANAGEMENT

USING TEST-PATTERN ORDERING FOR WAFER-LEVEL TEST DURING BURN-IN, IEEE

Transactions on Very Large Scale Integration (VLSI) Systems, vol. 17, no. 12,

pages 1730–1741, Dec. 2009.

[Bayle10] F Bayle and A Mettas. TEMPERATURE ACCELERATION MODELS IN RELIABILITY

PREDICTIONS: JUSTIFICATION AND IMPROVEMENTS, in annual Reliability and

Maintainability Symposium (RAMS), pages 1–6, 2010.

[Bild08] David R Bild, Sanchit Misra, Thidapat Chantem, Prabhat Kumar, Robert P Dick,

X Sharon Hu, Li Shang, and Alok Choudhary. TEMPERATURE-AWARE TEST

SCHEDULING FOR MULTIPROCESSOR SYSTEMS-ON-CHIP, in IEEE/ACM

International Conference on Computer-Aided Design, pages 59–66, 2008.

[Bonhomme02] Y Bonhomme, P Girard, C Landrault, and S Pravossoudovitch. TEST POWER: A BIG

ISSUE IN LARGE SOC DESIGNS, in 1st International Workshop on Electronic Design,

Test and Applications, pages 447–449, 2002.

[Borkar03] Shekhar Borkar, Tanay Karnik, Siva Narendra, Jim Tschanz, Ali Keshavarzi, and

Vivek De. PARAMETER VARIATIONS AND IMPACT ON CIRCUITS AND

MICROARCHITECTURE, in 40th annual Design Automation Conference, pages 338–

342, 2003.

[Bosio11] A Bosio, L Dilillo, P Girard, A Todri, A Virazel, K Miyase, and X Wen. POWER-

AWARE TEST PATTERN GENERATION FOR AT-SPEED LOS TESTING, in 20th Asian Test

Symposium, pages 506–510, 2011.

[Bota04] Sebastiàn A Bota, M Rosales, JL Rosello, A Keshavarzi, and J Segura. WITHIN DIE

THERMAL GRADIENT IMPACT ON CLOCK-SKEW: A NEW TYPE OF DELAY-FAULT

MECHANISM, in International Test Conference, pages 1276–1283, 2004.

[Carbine97] Adrian Carbine and Derek Feltham. PENTIUM® PRO PROCESSOR DESIGN FOR

TEST AND DEBUG, in International Test Conference, pages 294–303, 1997.

[Chakrabarty00] Krishnendu Chakrabarty. TEST SCHEDULING FOR CORE-BASED SYSTEMS USING

MIXED-INTEGER LINEAR PROGRAMMING, IEEE Transactions on Computer-Aided

Design of Integrated Circuits and Systems, vol. 19, no. 10, pages 1163–1174, Oct.

2000.


[Chakrabarty02] Krishnendu Chakrabarty, Vikram Iyengar, and Anshuman Chandra. TEST

SCHEDULING USING MIXED-INTEGER LINEAR PROGRAMMING, Frontiers in

Electronic Testing: Test Resource Partitioning for System-on-a-Chip, Springer US,

pages 97–118, 2002.

[Chakrabarty12] Krishnendu Chakrabarty, Sergej Deutsch, Himanshu Thapliyal, and Fangming Ye.

TSV DEFECTS AND TSV-INDUCED CIRCUIT FAILURES: THE THIRD DIMENSION IN

TEST AND DESIGN-FOR-TEST, in International Reliability Physics Symposium, page

5F–1, 2012.

[Chakravarty94] S Chakravarty and VP Dabholkar. TWO TECHNIQUES FOR MINIMIZING POWER

DISSIPATION IN SCAN CIRCUITS DURING TEST APPLICATION, in 3rd Asian Test


[Chandran09] Unni Chandran and Dan Zhao. THERMAL DRIVEN TEST ACCESS ROUTING IN HYPER-

INTERCONNECTED THREE-DIMENSIONAL SYSTEM-ON-CHIP, in 24th IEEE

International Symposium on Defect and Fault Tolerance in VLSI Systems, pages

410–418, 2009.

[Chang05] Hongliang Chang and Sachin S Sapatnekar. FULL-CHIP ANALYSIS OF LEAKAGE

POWER UNDER PROCESS VARIATIONS, INCLUDING SPATIAL CORRELATIONS, in 42nd

annual Design Automation Conference, pages 523–528, 2005.

[Chantem13] Thidapat Chantem, Yun Xiang, X Sharon Hu, and Robert P Dick. ENHANCING

MULTICORE RELIABILITY THROUGH WEAR COMPENSATION IN ONLINE ASSIGNMENT

AND SCHEDULING, in Design, Automation Test in Europe, pages 1373–1378, 2013.

[Cheng00] Kwang Ting Cheng, S Dey, M Rodgers, and K Roy. TEST CHALLENGES FOR DEEP

SUB-MICRON TECHNOLOGIES, in Design Automation Conference, pages 142–149,

2000.

[Cherman12] VO Cherman, J De Messemaeker, K Croes, B Dimcic, G Van der Plas, I De Wolf,

G Beyer, B Swinnen, and E Beyne. IMPACT OF THROUGH SILICON VIAS ON FRONT-

END-OF-LINE PERFORMANCE AFTER THERMAL CYCLING AND THERMAL STORAGE,

in International Reliability Physics Symposium, pages 2B.3.1–2B.3.5, 2012.

[Choi07] Jung Hwan Choi, Jayathi Murthy, and Kaushik Roy. THE EFFECT OF PROCESS

VARIATION ON DEVICE TEMPERATURE IN FINFET CIRCUITS, in IEEE/ACM

International Conference on Computer-Aided Design, pages 747–751, 2007.

[Chou97] RM Chou, KK Saluja, and VD Agrawal. SCHEDULING TESTS FOR VLSI SYSTEMS

UNDER POWER CONSTRAINTS, IEEE Transactions on Very Large Scale Integration

(VLSI) Systems, vol. 5, no. 2, pages 175–185, Jun. 1997.

[Ciappa03a] M Ciappa, F Carbognani, P Cova, and W Fichtner. LIFETIME PREDICTION AND

DESIGN OF RELIABILITY TESTS FOR HIGH-POWER DEVICES IN AUTOMOTIVE

APPLICATIONS, in 41st annual International Reliability Physics Symposium, pages

523–528, 2003.

[Ciappa03b] M Ciappa, F Carbognani, and Wolfgang Fichtner. LIFETIME PREDICTION AND

DESIGN OF RELIABILITY TESTS FOR HIGH-POWER DEVICES IN AUTOMOTIVE

APPLICATIONS, IEEE Transactions on Device and Materials Reliability, vol. 3, no.

4, pages 191–196, Dec. 2003.

[Clabes04] Joachim Clabes, Joshua Friedrich, Mark Sweet, Jack DiLullo, Sam Chu, Donald

Plass, James Dawson, Paul Muench, Larry Powell, and Michael Floyd. DESIGN

AND IMPLEMENTATION OF THE POWER5™ MICROPROCESSOR, in 41st annual

Design Automation Conference, pages 670–672, 2004.


[Coskun09] AK Coskun, JL Ayala, D Atienza, TS Rosing, and Y Leblebici. DYNAMIC

THERMAL MANAGEMENT IN 3D MULTICORE ARCHITECTURES, in Design,

Automation Test in Europe, pages 1410–1415, 2009.

[Dabholkar98] V Dabholkar, S Chakravarty, I Pomeranz, and S Reddy. TECHNIQUES FOR

MINIMIZING POWER DISSIPATION IN SCAN AND COMBINATIONAL CIRCUITS DURING

TEST APPLICATION, IEEE Transactions on Computer-Aided Design of Integrated

Circuits and Systems, vol. 17, no. 12, pages 1325–1333, Dec. 1998.

[Davis94] Brendan Davis. THE ECONOMICS OF AUTOMATIC TESTING. McGraw-Hill, 1994.

[Deutsch11] Sergej Deutsch, Vivek Chickermane, Brion Keller, Subhasish Mukherjee, Mario

Konijnenburg, Erik Jan Marinissen, and Sandeep K Goel. AUTOMATION OF 3D-

DFT INSERTION, in 20th Asian Test Symposium, pages 395–400, 2011.

[Deutsch12] Sergej Deutsch, Krishnendu Chakrabarty, Shreepad Panth, and Sung Kyu Lim.

TSV STRESS-AWARE ATPG FOR 3D STACKED ICS, in 21st Asian Test Symposium,

pages 31–36, 2012.

[Engelke08] P Engelke, I Polian, M Renovell, S Kundu, B Seshadri, and B Becker. ON

DETECTION OF RESISTIVE BRIDGING DEFECTS BY LOW-TEMPERATURE AND LOW-

VOLTAGE TESTING, IEEE Transactions on Computer-Aided Design of Integrated

Circuits and Systems, vol. 27, no. 2, pages 327–338, Feb. 2008.

[Falkenauer98] Emanuel Falkenauer. GENETIC ALGORITHMS AND GROUPING PROBLEMS.

Chichester; New York: Wiley, 1998.

[Flores99] P Flores, J Costa, H Neto, J Monteiro, and J Marques-Silva. ASSIGNMENT AND

REORDERING OF INCOMPLETELY SPECIFIED PATTERN SEQUENCES TARGETING

MINIMUM POWER DISSIPATION, in 12th International Conference On VLSI Design,

pages 37–41, 1999.

[Flottes15] Marie-Lise Flottes, Joao Azevedo, Giorgio Di Natale, and Bruno Rouzeyre.

SESSION-LESS BASED THERMAL-AWARE 3D-SIC TEST SCHEDULING, in 20th

European Test Symposium, Cluj-Napoca, Romania, 2015.

[Frank10] T Frank, Cedrick Chappaz, P Leduc, L Arnaud, S Moreau, A Thuaire, R El-

Farhane, and L Anghel. RELIABILITY APPROACH OF HIGH DENSITY THROUGH

SILICON VIA (TSV), in 12th Electronics Packaging Technology Conference, pages

321–324, 2010.

[Ganapathy10] Shrikanth Ganapathy, Ramon Canal, Antonio Gonzalez, and Antonio Rubio.

CIRCUIT PROPAGATION DELAY ESTIMATION THROUGH MULTIVARIATE

REGRESSION-BASED MODELING UNDER SPATIO-TEMPORAL VARIABILITY, in

Design, Automation & Test in Europe, pages 417–422, 2010.

[Girard97] P Girard, C Landrault, S Pravossoudovitch, and D Severac. REDUCTION OF POWER

CONSUMPTION DURING TEST APPLICATION BY TEST VECTOR ORDERING [VLSI

CIRCUITS], Electronics Letters, vol. 33, no. 21, pages 1752–1754, Oct. 1997.

[GopiReddy14] L GopiReddy, LM Tolbert, B Ozpineci, and JOP Pinto. RAINFLOW ALGORITHM

BASED LIFETIME ESTIMATION OF POWER SEMICONDUCTORS IN UTILITY

APPLICATIONS, in 29th annual Applied Power Electronics Conference and

Exposition, pages 2293–2299, 2014.

[Gorev13] M Gorev, R Ubar, P Ellervee, S Devadze, J Raik, and M Min. AT-SPEED SELF-

TESTING OF HIGH-PERFORMANCE PIPE-LINED PROCESSING ARCHITECTURES, in

NORCHIP Conference, pages 1–6, 2013.


[Groebel01] DJ Groebel, A Mettas, and Feng-Bin Sun. DETERMINATION AND INTERPRETATION

OF ACTIVATION ENERGY USING ACCELERATED-TEST DATA, in annual Reliability

and Maintainability Symposium, pages 58–63, 2001.

[Hagihara97] Y Hagihara, S Inui, F Okamoto, M Nishida, T Nakamura, and H Yamada.

FLOATING-POINT DATAPATHS WITH ONLINE BUILT-IN SELF SPEED TEST, IEEE

Journal of Solid-State Circuits, vol. 32, no. 3, pages 444–449, Mar. 1997.

[Held97] M Held, P Jacob, G Nicoletti, P Scacco, and MH Poech. FAST POWER CYCLING

TEST OF IGBT MODULES IN TRACTION APPLICATION, in International Conference

on Power Electronics and Drive Systems, vol. 1, pages 425–430, 1997.

[He06a] Zhiyuan He, Zebo Peng, and P Eles. POWER CONSTRAINED AND DEFECT-

PROBABILITY DRIVEN SOC TEST SCHEDULING WITH TEST SET PARTITIONING, in

Design, Automation and Test in Europe, vol. 1, pages 1–6, 2006.

[He06b] Zhiyuan He, Zebo Peng, P Eles, P Rosinger, and BM Al-Hashimi. THERMAL-

AWARE SOC TEST SCHEDULING WITH TEST SET PARTITIONING AND INTERLEAVING,

in 21st International Symposium on Defect and Fault Tolerance in VLSI Systems,

pages 477–485, 2006.

[He07] Zhiyuan He, Zebo Peng, and P Eles. A HEURISTIC FOR THERMAL-SAFE SOC TEST

SCHEDULING, in International Test Conference, pages 1–10, 2007.

[He08a] Zhiyuan He, Zebo Peng, and Petru Eles. SIMULATION-DRIVEN THERMAL-SAFE TEST

TIME MINIMIZATION FOR SYSTEM-ON-CHIP, in 17th Asian Test Symposium, pages

283–288, 2008.

[He08b] Zhiyuan He, Zebo Peng, Petru Eles, Paul Rosinger, and Bashir M Al-Hashimi.

THERMAL-AWARE SOC TEST SCHEDULING WITH TEST SET PARTITIONING AND

INTERLEAVING, Journal of Electronic Testing, vol. 24, no. 1–3, pages 247–257,

Jan. 2008.

[He09] Zhiyuan He, Zebo Peng, and P Eles. THERMAL-AWARE TEST SCHEDULING FOR

CORE-BASED SOC IN AN ABORT-ON-FIRST-FAIL TEST ENVIRONMENT, in 12th

Euromicro Conference on Digital System Design, Architectures, Methods and

Tools, pages 239–246, 2009.

[He10] Zhiyuan He, Zebo Peng, and P Eles. MULTI-TEMPERATURE TESTING FOR CORE-

BASED SYSTEM-ON-CHIP, in Design, Automation Test in Europe, pages 208–213,

2010.

[Higami13] Yoshinobu Higami, Hiroshi Takahashi, Shin-ya Kobayashi, and Kewal K Saluja.

TEST GENERATION FOR DELAY FAULTS ON CLOCK LINES UNDER LAUNCH-ON-

CAPTURE TEST ENVIRONMENT, IEICE Transactions on Information and Systems,

vol. E96-D, no. 6, pages 1323–1331, Jun. 2013.

[Higham05] N Higham. THE SCALING AND SQUARING METHOD FOR THE MATRIX EXPONENTIAL

REVISITED, SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 4,

pages 1179–1193, Jan. 2005.

[Hirschmann06] D Hirschmann, D Tissen, S Schroder, and RW De Doncker. RELIABILITY

PREDICTION FOR INVERTERS IN HYBRID ELECTRICAL VEHICLES, in 37th Power

Electronics Specialists Conference, pages 1–6, 2006.

[Hirschmann07] D Hirschmann, D Tissen, S Schroder, and RW De Doncker. RELIABILITY

PREDICTION FOR INVERTERS IN HYBRID ELECTRICAL VEHICLES, IEEE Transactions

on Power Electronics, vol. 22, no. 6, pages 2511–2517, Nov. 2007.


[Huang01] Yu Huang, Wu-Tung Cheng, Chien-Chung Tsai, N Mukherjee, O Samman, Y

Zaidan, and SM Reddy. RESOURCE ALLOCATION AND TEST SCHEDULING FOR

CONCURRENT TEST OF CORE-BASED SOC DESIGN, in 10th Asian Test Symposium,

pages 265–270, 2001.

[Huang02] Yu Huang, SM Reddy, Wu-Tung Cheng, P Reuter, N Mukherjee, Chien-Chung

Tsai, O Samman, and Y Zaidan. OPTIMAL CORE WRAPPER WIDTH SELECTION AND

SOC TEST SCHEDULING BASED ON 3-D BIN PACKING ALGORITHM, in International

Test Conference, pages 74–82, 2002.

[Huang06] Wei Huang, S Ghosh, S Velusamy, K Sankaranarayanan, K Skadron, and MR Stan.

HOTSPOT: A COMPACT THERMAL MODELING METHODOLOGY FOR EARLY-STAGE

VLSI DESIGN, IEEE Transactions on Very Large Scale Integration (VLSI) Systems,

vol. 14, no. 5, pages 501–513, May 2006.

[Huang07] Wei Huang. HOTSPOT—A CHIP AND PACKAGE COMPACT THERMAL MODELING

METHODOLOGY FOR VLSI DESIGN, Dissertation, University of Virginia, 2007.

[Ieee14a] IEEE P1838 3D-TEST WORKING GROUP, 2014. [Online]. Available:

http://grouper.ieee.org/groups/3Dtest/. [Accessed: 29-May-2015].

[Ieee14b] IEEE STANDARD FOR ACCESS AND CONTROL OF INSTRUMENTATION EMBEDDED

WITHIN A SEMICONDUCTOR DEVICE, IEEE Std 1687-2014, pages 1–283, Dec. 2014.

[Intel13] INTEL XEON E5-2600 V3 PROCESSOR OVERVIEW: HASWELL-EP UP TO 18 CORES,

PC PERSPECTIVE, 2013. [Online]. Available:

http://www.pcper.com/reviews/Processors/Intel-Xeon-E5-2600-v3-Processor-

Overview-Haswell-EP-18-Cores. [Accessed: 28-May-2015].

[Iyengar01] Vikram Iyengar and Krishnendu Chakrabarty. PRECEDENCE-BASED, PREEMPTIVE,

AND POWER-CONSTRAINED TEST SCHEDULING FOR SYSTEM-ON-A-CHIP, in 19th

VLSI Test Symposium, pages 368–374, 2001.

[Iyengar02] Vikram Iyengar, Krishnendu Chakrabarty, and Erik Jan Marinissen. ON USING

RECTANGLE PACKING FOR SOC WRAPPER/TAM CO-OPTIMIZATION, in 20th VLSI

Test Symposium, pages 253–258, 2002.

[Jagan10] L Jagan, C Hora, B Kruseman, S Eichenberger, AK Majhi, and V Kamakoti.

IMPACT OF TEMPERATURE ON TEST QUALITY, in 23rd International Conference on

VLSI Design, pages 276–281, 2010.

[Jedec09] TEMPERATURE CYCLING. Jedec solid state technology association, 2009.

[Jedec10] FAILURE MECHANISMS AND MODELS FOR SEMICONDUCTOR DEVICES, 2010.

[Online]. Available: http://www.jedec.org/standards-documents/docs/jep-122e.

[Accessed: 23-May-2014].

[Jiang14] T Jiang, C Wu, N Tamura, M Kunz, B Kim, H Son, M Suh, J Im, R Huang, and P

Ho. STUDY OF STRESSES AND PLASTICITY IN THROUGH-SILICON VIA STRUCTURES

FOR 3D INTERCONNECTS BY X-RAY MICRO-BEAM DIFFRACTION, IEEE

Transactions on Device and Materials Reliability, vol. 14, no. 2, pages 698–703,

June 2014.

[Kamto09] A Kamto, Y Liu, L Schaper, and SL Burkett. RELIABILITY STUDY OF THROUGH-

SILICON VIA (TSV) COPPER FILLED INTERCONNECTS, Thin Solid Films, vol. 518, no.

5, pages 1614–1619, Dec. 2009.


[Kim10] Tak-Yung Kim and Taewhan Kim. CLOCK TREE SYNTHESIS WITH PRE-BOND

TESTABILITY FOR 3D STACKED IC DESIGNS, in 47th Design Automation

Conference, pages 723–728, 2010.

[Ko08] HF Ko and N Nicolici. AUTOMATED SCAN CHAIN DIVISION FOR REDUCING SHIFT

AND CAPTURE POWER DURING BROADSIDE AT-SPEED TEST, IEEE Transactions on

Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 11, pages

2092–2097, Nov. 2008.

[Kumar12] P Kumar, I Dutta, and MS Bakir. INTERFACIAL EFFECTS DURING THERMAL

CYCLING OF CU-FILLED THROUGH-SILICON VIAS (TSV), Journal of Electronic

Materials, vol. 41, no. 2, pages 322–335, Feb. 2012.

[Kundu05] S Kundu, P Engelke, I Polian, and B Becker. ON DETECTION OF RESISTIVE

BRIDGING DEFECTS BY LOW-TEMPERATURE AND LOW-VOLTAGE TESTING, in 14th

Asian Test Symposium, pages 266–271, 2005.

[Kuo11] Chi-Wei Kuo and Hung-Yin Tsai. THERMAL STRESS ANALYSIS AND FAILURE

MECHANISMS FOR THROUGH SILICON VIA ARRAY, in 6th International

Microsystems, Packaging, Assembly and Circuits Technology Conference, pages

169–172, 2011.

[Kuo12] Chi-Wei Kuo and Hung-Yin Tsai. THERMAL STRESS ANALYSIS AND FAILURE

MECHANISMS FOR THROUGH SILICON VIA ARRAY, in 13th Intersociety Conference

on Thermal and Thermomechanical Phenomena in Electronic Systems, pages 202–

206, 2012.

[Liao05] Weiping Liao, Lei He, and KM Lepak. TEMPERATURE AND SUPPLY VOLTAGE

AWARE PERFORMANCE AND POWER MODELING AT MICROARCHITECTURE LEVEL,

IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,

vol. 24, no. 7, pages 1042–1053, Jul. 2005.

[Lin84] Tzu-Mu Lin and CA Mead. SIGNAL DELAY IN GENERAL RC NETWORKS, IEEE

Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.

3, no. 4, pages 331–349, Oct. 1984.

[Li01] JC Li, Chao-Wen Tseng, and EJ McCluskey. TESTING FOR RESISTIVE OPENS AND

STUCK OPENS, in International Test Conference, pages 1049–1058, 2001.

[Liu04] Michael Liu, Wei-Shen Wang, and Michael Orshansky. LEAKAGE POWER

REDUCTION BY DUAL-VTH DESIGNS UNDER PROBABILISTIC ANALYSIS OF VTH

VARIATION, in International Symposium on Low Power Electronics and Design,

pages 2–7, 2004.

[Loi08] Igor Loi, Subhasish Mitra, Thomas H Lee, Shinobu Fujita, and Luca Benini. A

LOW-OVERHEAD FAULT TOLERANCE SCHEME FOR TSV-BASED 3D NETWORK ON

CHIP LINKS, in IEEE/ACM International Conference on Computer-Aided Design,

pages 598–602, 2008.

[Long04] E Long, WR Daasch, R Madge, and B Benware. DETECTION OF TEMPERATURE

SENSITIVE DEFECTS USING ZTC, in 22nd VLSI Test Symposium,

pages 185–190, 2004.

[Lu07] Hua Lu, T Tilford, and DR Newcombe. LIFETIME PREDICTION FOR POWER

ELECTRONICS MODULE SUBSTRATE MOUNT-DOWN SOLDER INTERCONNECT, in

International Symposium on High Density packaging and Microsystem Integration,

pages 1–10, 2007.


[Manikandan11] P Manikandan, BB Larsen, and EJ Aas. AN ENHANCED PATH DELAY FAULT

SIMULATOR FOR COMBINATIONAL CIRCUITS, in 14th Euromicro Conference on

Digital System Design, pages 375–381, 2011.

[Marinissen00] EJ Marinissen, SK Goel, and M Lousberg. WRAPPER DESIGN FOR EMBEDDED CORE

TEST, in International Test Conference, pages 911–920, 2000.

[Marinissen02] EJ Marinissen, V Iyengar, and K Chakrabarty. A SET OF BENCHMARKS FOR

MODULAR TESTING OF SOCS, in International Test Conference, pages 519–528,

2002.

[Marinissen09] Erik Jan Marinissen and Yervant Zorian. TESTING 3D CHIPS CONTAINING

THROUGH-SILICON VIAS, in International Test Conference, pages 1–11, 2009.

[Marinissen10a] Erik Jan Marinissen, Chun-Chuan Chi, Jouke Verbree, and Mario Konijnenburg.

3D DFT ARCHITECTURE FOR PRE-BOND AND POST-BOND TESTING, in International

3D Systems Integration Conference, pages 1–8, 2010.

[Marinissen10b] Erik Jan Marinissen, Jouke Verbree, and Mario Konijnenburg. A STRUCTURED AND

SCALABLE TEST ACCESS ARCHITECTURE FOR TSV-BASED 3D STACKED ICS, in 28th

VLSI Test Symposium, pages 269–274, 2010.

[Marinissen10c] Erik Jan Marinissen. CHALLENGES IN TESTING TSV-BASED 3D STACKED ICS: TEST

FLOWS, TEST CONTENTS, AND TEST ACCESS, in Asia Pacific Conference on Circuits

and Systems, pages 544–547, 2010.

[Marinissen12a] Erik Jan Marinissen. CHALLENGES AND EMERGING SOLUTIONS IN TESTING TSV-

BASED 2½D- AND 3D-STACKED ICS, in Design, Automation and Test in Europe,

pages 1277–1282, 2012.

[Marinissen12b] Erik Jan Marinissen, Chun-Chuan Chi, Mario Konijnenburg, and Jouke Verbree. A

DFT ARCHITECTURE FOR 3D-SICS BASED ON A STANDARDIZABLE DIE WRAPPER,

Journal of Electronic Testing, vol. 28, no. 1, pages 73–92, Feb. 2012.

[Matsuishi68] M Matsuishi and T Endo. FATIGUE OF METALS SUBJECTED TO VARYING STRESS,

Japan Society of Mechanical Engineers, Fukuoka, Japan, pages 37–40, 1968.

[Maulik00] Ujjwal Maulik and Sanghamitra Bandyopadhyay. GENETIC ALGORITHM-BASED

CLUSTERING TECHNIQUE, Pattern Recognition, vol. 33, no. 9, pages 1455–1465,

2000.

[Mil04] TEMPERATURE CYCLING (MIL-STD-883; METHOD 1010), DLA Land and Maritime

Mil. Specs & Drawings, Jun-2004. [Online]. Available:

http://www.landandmaritime.dla.mil/programs/milspec/ListDocs.aspx?BasicDoc=

MIL-STD-883. [Accessed: 28-May-2014].

[Miller01] Mark Miller. NEXT GENERATION BURN-IN AND TEST SYSTEMS FOR ATHLON

MICROPROCESSORS: HYBRID BURN-IN, in BiTS Workshop, 2001.

[Millican14] SK Millican and KK Saluja. A TEST PARTITIONING TECHNIQUE FOR SCHEDULING

TESTS FOR THERMALLY CONSTRAINED 3D INTEGRATED CIRCUITS, in 27th

International Conference on VLSI Design, pages 20–25, 2014.

[Mohapatra07] Debabrata Mohapatra, Georgios Karakonstantis, and Kaushik Roy. LOW-POWER

PROCESS-VARIATION TOLERANT ARITHMETIC UNITS USING INPUT-BASED ELASTIC

CLOCKING, in International Symposium on Low Power Electronics and Design,

pages 74–79, 2007.

References

[Mondal07] Mosin Mondal, Andrew J Ricketts, Sami Kirolos, Tamer Ragheb, Greg Link,

Narayanan Vijaykrishnan, and Yehia Massoud. THERMALLY ROBUST CLOCKING

SCHEMES FOR 3D INTEGRATED CIRCUITS, in Design, Automation & Test in Europe,

pages 1–6, 2007.

[Murray12] Conal E Murray, ET Ryan, Paul R Besser, C Witt, Jean L Jordan-Sweet, and MF

Toney. EVOLUTION OF STRESS GRADIENTS IN CU FILMS AND FEATURES INDUCED

BY CAPPING LAYERS, Microelectronic Engineering, vol. 92, pages 95–100, Apr.

2012.

[Musallam12] M Musallam and CM Johnson. AN EFFICIENT IMPLEMENTATION OF THE RAINFLOW

COUNTING ALGORITHM FOR LIFE CONSUMPTION ESTIMATION, IEEE Transactions

on Reliability, vol. 61, no. 4, pages 978–986, Dec. 2012.

[Nebel97] Wolfgang Nebel and Jean P Mermet. LOW POWER DESIGN IN DEEP SUBMICRON

ELECTRONICS. Norwell, MA, USA: Kluwer Academic Publishers, 1997.

[Needham98] Wayne Needham, Cheryl Prunty, and Eng Hong Yeoh. HIGH VOLUME

MICROPROCESSOR TEST ESCAPES, AN ANALYSIS OF DEFECTS OUR TESTS ARE

MISSING, in International Test Conference, pages 25–34, 1998.

[Nigh98] P Nigh, D Vallett, P Patel, J Wright, F Motika, D Forlenza, R Kurtulik, and W

Chong. FAILURE ANALYSIS OF TIMING AND IDDQ-ONLY FAILURES FROM THE

SEMATECH TEST METHODS EXPERIMENT, in International Test Conference, pages

43–52, 1998.

[Noia10a] Brandon Noia, Krishnendu Chakrabarty, and Erik Jan Marinissen. OPTIMIZATION

METHODS FOR POST-BOND DIE-INTERNAL/EXTERNAL TESTING IN 3D STACKED ICS,

in International Test Conference, pages 1–9, 2010.

[Noia10b] Brandon Noia, Sandeep Kumar Goel, Krishnendu Chakrabarty, Erik Jan

Marinissen, and Jouke Verbree. TEST-ARCHITECTURE OPTIMIZATION FOR TSV-

BASED 3D STACKED ICS, in 15th European Test Symposium, pages 24–29, 2010.

[Noia11] Brandon Noia and Krishnendu Chakrabarty. TESTING AND DESIGN-FOR-

TESTABILITY TECHNIQUES FOR 3D INTEGRATED CIRCUITS, 20th Asian Test


[Noia12] Brandon Noia, Krishnendu Chakrabarty, and Erik Jan Marinissen. OPTIMIZATION

METHODS FOR POST-BOND TESTING OF 3D STACKED ICS, Journal of Electronic

Testing, vol. 28, no. 1, pages 103–120, Feb. 2012.

[Nowka08] Kevin Nowka. SURVIVAL OF VLSI DESIGN - COPING WITH DEVICE VARIABILITY

AND UNCERTAINTY, in Circuits and Systems Workshop: System-on-Chip - Design,

Applications, Integration, and Software, Dallas, pages 1–6, 2008.

[Nvidia12] NVIDIA’S NEXT GENERATION CUDA COMPUTE ARCHITECTURE: KEPLER GK110.

2012.

[Oberg03] Johnny Oberg. NETWORKS ON CHIP, A. Jantsch and H. Tenhunen, Eds. Hingham,

MA, USA: Kluwer Academic Publishers, pages 153–172, 2003.

[Okoro12] C Okoro and YS Obeng. EFFECT OF THERMAL CYCLING ON THE SIGNAL INTEGRITY

AND MORPHOLOGY OF TSV ISOLATION LINER-SIO2, in International Interconnect

Technology Conference, pages 1–3, 2012.

[Okoro14] Chukwudi Okoro, June W Lau, Fardad Golshany, Klaus Hummler, and Yaw S

Obeng. A DETAILED FAILURE ANALYSIS EXAMINATION OF THE EFFECT OF THERMAL

CYCLING ON CU TSV RELIABILITY, IEEE Transactions on Electron Devices, vol.

61, no. 1, pages 15–22, Jan. 2014.

[Oppenheim97] Alan V Oppenheim, Alan S Willsky, and Syed Hamid Nawab. SIGNALS AND

SYSTEMS, 2nd ed. Upper Saddle River, N.J: Prentice Hall, 1997.

[Pak11] JS Pak, Mohit Pathak, Sung Kyu Lim, and David Z Pan. MODELING OF

ELECTROMIGRATION IN THROUGH-SILICON-VIA BASED 3D IC, in 61st Electronic

Components and Technology Conference, pages 1420–1427, 2011.

[Patil07] Srinivas Patil. AT-SPEED SCAN TESTS: REALITY OR FANTASY? PANEL 1.4, in

International Test Conference, pages 1–1, 2007.

[Plas10] G Van der Plas, S Thijs, D Linten, G Katti, P Limaye, A Mercha, M Stucchi, H

Oprins, B Vandevelde, N Minas, M Cupac, M Dehan, M Nelis, R Agarwal, W

Dehaene, Y Travaly, E Beyne, and P Marchal. VERIFYING

ELECTRICAL/THERMAL/THERMO-MECHANICAL BEHAVIOR OF A 3D STACK -

CHALLENGES AND SOLUTIONS, in Custom Integrated Circuits Conference, pages

1–4, 2010.

[Plas11] Geert Van Der Plas, Erik-Jan Marinissen, Nikolaos Minas, and Paul Marchal.

METHOD AND DEVICE FOR TESTING TSVS IN A 3D CHIP STACK, U.S. Patent

US20110102011 A1, 05-May-2011.

[Poli07] Riccardo Poli, James Kennedy, and Tim Blackwell. PARTICLE SWARM

OPTIMIZATION, Swarm Intelligence, vol. 1, no. 1, pages 33–57, Jun. 2007.

[Press07] William H Press. NUMERICAL RECIPES: THE ART OF SCIENTIFIC COMPUTING, 3rd

ed. Cambridge, UK; New York: Cambridge University Press, 2007.

[Raina07] Rajesh Raina. AT-SPEED SCAN TESTS: REALITY OR FANTASY? PANEL 1.5, in


[Rao03] Rajeev Rao, Ashish Srivastava, David Blaauw, and Dennis Sylvester. STATISTICAL

ESTIMATION OF LEAKAGE CURRENT CONSIDERING INTER- AND INTRA-DIE PROCESS

VARIATION, in International Symposium on Low Power Electronics and Design,

pages 84–89, 2003.

[Rohani13] Alireza Rohani and Hans G Kerkhoff. RAPID TRANSIENT FAULT INSERTION IN

LARGE DIGITAL SYSTEMS, Microprocessors and Microsystems, vol. 37, no. 2, pages

147–154, Mar. 2013.

[Rosinger02] PM Rosinger, BM Al-Hashimi, and N Nicolici. POWER PROFILE MANIPULATION: A

NEW APPROACH FOR REDUCING TEST APPLICATION TIME UNDER POWER

CONSTRAINTS, IEEE Transactions on Computer-Aided Design of Integrated

Circuits and Systems, vol. 21, no. 10, pages 1217–1225, Oct. 2002.

[Rosinger06] Paul Rosinger, Bashir M Al-Hashimi, and Krishnendu Chakrabarty. THERMAL-

SAFE TEST SCHEDULING FOR CORE-BASED SYSTEM-ON-CHIP INTEGRATED CIRCUITS,


vol. 25, no. 11, pages 2502–2512, 2006.

[Samii06] Soheil Samii, Erik Larsson, Krishnendu Chakrabarty, and Zebo Peng. CYCLE-

ACCURATE TEST POWER MODELING AND ITS APPLICATION TO SOC TEST

SCHEDULING, in International Test Conference, pages 1–10, 2006.

[Santarini14] Mike Santarini. XILINX SHIPS INDUSTRY’S FIRST 20-NM ALL PROGRAMMABLE

DEVICES, Xcell, vol. 1, no. 86, page 14, 2014.

[Sarangi08] SR Sarangi, B Greskamp, R Teodorescu, J Nakano, A Tiwari, and J Torrellas.

VARIUS: A MODEL OF PROCESS VARIATION AND RESULTING TIMING ERRORS FOR

MICROARCHITECTS, IEEE Transactions on Semiconductor Manufacturing, vol. 21,

no. 1, pages 3–13, Feb. 2008.

[Schuermyer04] C Schuermyer, J Ruffler, R Daasch, and R Madge. MINIMUM TESTING

REQUIREMENTS TO SCREEN TEMPERATURE DEPENDENT DEFECTS, in International

Test Conference, pages 300–308, 2004.

[Segura02] J Segura, A Keshavarzi, J Soden, and C Hawkins. PARAMETRIC FAILURES IN

CMOS ICS - A DEFECT-BASED ANALYSIS, in International Test Conference, pages

90–99, 2002.

[Segura04] Jaume Segura and Charles F Hawkins. CMOS ELECTRONICS: HOW IT WORKS, HOW

IT FAILS. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2004.

[Semenov03] O Semenov, A Vassighi, M Sachdev, A Keshavarzi, and CF Hawkins. EFFECT OF

CMOS TECHNOLOGY SCALING ON THERMAL MANAGEMENT DURING BURN-IN, IEEE

Transactions on Semiconductor Manufacturing, vol. 16, no. 4, pages 686–695,

Nov. 2003.

[SenGupta12] Breeta SenGupta, Urban Ingelsson, and Erik Larsson. SCHEDULING TESTS FOR 3D

STACKED CHIPS UNDER POWER CONSTRAINTS, Journal of Electronic Testing, vol.

28, no. 1, pages 121–135, Feb. 2012.

[Shibin15] Konstantin Shibin, Vivek Chickermane, Brion Keller, Christos Papameletis, and

Erik Jan Marinissen. AT-SPEED DELAY TESTING OF INTER-DIE CONNECTIONS OF

2.5D- AND 3D-SICS, in 20th European Test Symposium, 2015.

[Smorodin08] T Smorodin, J Wilde, P Alpern, and M Stecher. A TEMPERATURE-GRADIENT-

INDUCED FAILURE MECHANISM IN METALLIZATION UNDER FAST THERMAL

CYCLING, IEEE Transactions on Device and Materials Reliability, vol. 8, no. 3,

pages 590–599, Sep. 2008.

[Srivastava02] Ashish Srivastava, Robert Bai, David Blaauw, and Dennis Sylvester. MODELING

AND ANALYSIS OF LEAKAGE POWER CONSIDERING WITHIN-DIE PROCESS

VARIATIONS, in International Symposium on Low Power Electronics and Design,

pages 64–67, 2002.

[Srivastava04] Ashish Srivastava, Dennis Sylvester, and David Blaauw. STATISTICAL

OPTIMIZATION OF LEAKAGE POWER CONSIDERING PROCESS VARIATIONS USING

DUAL-VTH AND SIZING, in 41st annual Design Automation Conference, pages 773–

778, 2004.

[Stan03] Mircea R Stan, Kevin Skadron, Marco Barcella, Wei Huang, Karthik

Sankaranarayanan, and Sivakumar Velusamy. HOTSPOT: A DYNAMIC COMPACT

THERMAL MODEL AT THE PROCESSOR-ARCHITECTURE LEVEL, Microelectronics

Journal, vol. 34, no. 12, pages 1153–1165, 2003.

[Syed10] A Syed. LIMITATIONS OF NORRIS-LANDZBERG EQUATION AND APPLICATION OF

DAMAGE ACCUMULATION BASED METHODOLOGY FOR ESTIMATING ACCELERATION

FACTORS FOR PB FREE SOLDERS, in 11th International Conference on Thermal,

Mechanical Multi-Physics Simulation, and Experiments in Microelectronics and

Microsystems, pages 1–11, 2010.

[Tadayon00] Pooya Tadayon. THERMAL CHALLENGES DURING MICROPROCESSOR TESTING, Intel

Technology Journal, vol. 4, no. 3, pages 1–8, 2000.

[Taouil10a] Mottaqiallah Taouil, Said Hamdioui, Kees Beenakker, and Erik Jan Marinissen.

TEST COST ANALYSIS FOR 3D DIE-TO-WAFER STACKING, in 19th Asian Test


[Taouil10b] Mottaqiallah Taouil, Said Hamdioui, Jouke Verbree, and Erik Jan Marinissen. ON

MAXIMIZING THE COMPOUND YIELD FOR 3D WAFER-TO-WAFER STACKED ICS, in


[Taouil11] Mottaqiallah Taouil, Said Hamdioui, and Erik Jan Marinissen. HOW SIGNIFICANT

WILL BE THE TEST COST SHARE FOR 3D DIE-TO-WAFER STACKED-ICS?, in 6th

International Conference on Design & Technology of Integrated Systems in

Nanoscale Era, pages 1–6, 2011.

[Taouil12] Mottaqiallah Taouil, Said Hamdioui, Kees Beenakker, and Erik Jan Marinissen.

TEST IMPACT ON THE OVERALL DIE-TO-WAFER 3D STACKED IC COST, Journal of

Electronic Testing, vol. 28, no. 1, pages 15–25, Feb. 2012.

[Tseng00] Chao-Wen Tseng, EJ McCluskey, Xiaoping Shao, and DM Wu. COLD DELAY

DEFECT SCREENING, in 18th VLSI Test Symposium, pages 183–188, 2000.

[Tudu09] JT Tudu, E Larsson, V Singh, and VD Agrawal. ON MINIMIZATION OF PEAK POWER

FOR SCAN CIRCUIT DURING TEST, in 14th European Test Symposium, pages 25–30,

2009.

[Ukhov12] Ivan Ukhov, Min Bao, Petru Eles, and Zebo Peng. STEADY-STATE DYNAMIC

TEMPERATURE ANALYSIS AND RELIABILITY OPTIMIZATION FOR EMBEDDED

MULTIPROCESSOR SYSTEMS, in 49th annual Design Automation Conference, New

York, NY, USA, pages 197–204, 2012.

[Ukhov14a] I Ukhov, P Eles, and Z Peng. PROBABILISTIC ANALYSIS OF POWER AND

TEMPERATURE UNDER PROCESS VARIATION FOR ELECTRONIC SYSTEM DESIGN,


vol. 33, no. 6, pages 931–944, Jun. 2014.

[Ukhov14b] I Ukhov, P Eles, and Z Peng. TEMPERATURE-CENTRIC RELIABILITY ANALYSIS AND

OPTIMIZATION OF ELECTRONIC SYSTEMS UNDER PROCESS VARIATION, IEEE

Transactions on Very Large Scale Integration (VLSI) Systems, vol. PP, no. 99,

pages 1–1, 2014.

[Vassighi06] Arman Vassighi and Manoj Sachdev. THERMAL AND POWER MANAGEMENT OF

INTEGRATED CIRCUITS. Boston: Kluwer Academic Publishers, 2006.

[Vasudevan08] V Vasudevan and Xuejun Fan. AN ACCELERATION MODEL FOR LEAD-FREE (SAC)

SOLDER JOINT RELIABILITY UNDER THERMAL CYCLING, in 58th Electronic


[Velenis10] Dimitrios Velenis, Erik Jan Marinissen, and Eric Beyne. COST EFFECTIVENESS OF

3D INTEGRATION OPTIONS, in International 3D Systems Integration Conference,

pages 1–6, 2010.

[Verbree10] Jouke Verbree, Erik Jan Marinissen, Philippe Roussel, and Dimitrios Velenis. ON

THE COST-EFFECTIVENESS OF MATCHING REPOSITORIES OF PRE-TESTED WAFERS

FOR WAFER-TO-WAFER 3D CHIP STACKING, in 15th European Test Symposium,

pages 36–41, 2010.

[Vinay10] NS Vinay, Indira Rawat, Erik Larsson, MS Gaur, and Virendra Singh. THERMAL

AWARE TEST SCHEDULING FOR STACKED MULTI-CHIP-MODULES, in East-West

Design & Test Symposium, pages 343–349, 2010.

[Wen11] Xiaoqing Wen, Kazunari Enokimoto, Kohei Miyase, Yuta Yamato, Michael A

Kochte, Seiji Kajihara, Patrick Girard, and Mohammad Tehranipoor. POWER-

AWARE TEST GENERATION WITH GUARANTEED LAUNCH SAFETY FOR AT-SPEED

SCAN TESTING, in 29th VLSI Test Symposium, pages 166–171, 2011.

[Wu10] Sean H Wu, Alexander Tetelbaum, and Li-C Wang. HOW DOES INVERSE

TEMPERATURE DEPENDENCE AFFECT TIMING SIGN-OFF, in Emerging Technologies

and Circuits, A. Amara, T. Ea, and M. Belleville, Eds. Springer Netherlands, pages

179–189, 2010.

[Yao09] Chunhua Yao, Kewal K Saluja, and Parameswaran Ramanathan. PARTITION BASED

SOC TEST SCHEDULING WITH THERMAL AND POWER CONSTRAINTS UNDER DEEP

SUBMICRON TECHNOLOGIES, in Asian Test Symposium, pages 281–286, 2009.

[Yao11a] Chunhua Yao, Kewal K Saluja, and Parameswaran Ramanathan. POWER AND

THERMAL CONSTRAINED TEST SCHEDULING UNDER DEEP SUBMICRON

TECHNOLOGIES, IEEE Transactions on Computer-Aided Design of Integrated

Circuits and Systems, vol. 30, no. 2, pages 317–322, Feb. 2011.

[Yao11b] C Yao, KK Saluja, and P Ramanathan. TEMPERATURE DEPENDENT TEST

SCHEDULING FOR MULTI-CORE SYSTEM-ON-CHIP, in 20th Asian Test Symposium,

pages 27–32, 2011.

[Yao11c] Chunhua Yao, Kewal K Saluja, and Parameswaran Ramanathan. THERMAL-

AWARE TEST SCHEDULING USING ON-CHIP TEMPERATURE SENSORS, in 24th

International Conference on VLSI Design, pages 376–381, 2011.

[Yu09] TE Yu, T Yoneda, K Chakrabarty, and H Fujiwara. TEST INFRASTRUCTURE DESIGN

FOR CORE-BASED SYSTEM-ON-CHIP UNDER CYCLE-ACCURATE THERMAL

CONSTRAINTS, in Asia and South Pacific Design Automation Conference, pages

793–798, 2009.

[Zhang13] Dingyou Zhang, K Hummler, L Smith, and JJQ Lu. BACKSIDE TSV PROTRUSION

INDUCED BY THERMAL SHOCK AND THERMAL CYCLING, in 63rd Electronic


[Zhao10] Wei Zhao, Junxia Ma, M Tehranipoor, and S Chakravarty. POWER-SAFE

APPLICATION OF TRANSITION DELAY FAULT PATTERNS CONSIDERING CURRENT

LIMIT DURING WAFER TEST, in 19th Asian Test Symposium, pages 301–306, 2010.

[Zhuo10] Cheng Zhuo, Dennis Sylvester, and David Blaauw. PROCESS VARIATION AND

TEMPERATURE-AWARE RELIABILITY MANAGEMENT, in Design, Automation and

Test in Europe, pages 580–585, 2010.

[Zorian93] Y Zorian. A DISTRIBUTED BIST CONTROL SCHEME FOR COMPLEX VLSI DEVICES,

in 11th annual VLSI Test Symposium, pages 4–9, 1993.

[Zou03] Wei Zou, SM Reddy, I Pomeranz, and Yu Huang. SOC TEST SCHEDULING USING

SIMULATED ANNEALING, in 21st VLSI Test Symposium, pages 325–330, 2003.

[Zschech02] Ehrenfried Zschech, Eckhard Langer, Hans-Juergen Engelmann, and Kornelia

Dittmar. PHYSICAL FAILURE ANALYSIS IN SEMICONDUCTOR INDUSTRY—

CHALLENGES OF THE COPPER INTERCONNECT PROCESS, Materials Science in

Semiconductor Processing, vol. 5, no. 4–5, pages 457–464, Aug. 2002.

Department of Computer and Information Science

Linköpings universitet

Dissertations


Linköping Studies in Arts and Science

Linköping Studies in Statistics

Linköping Studies in Information Science


No 14 Anders Haraldsson: A Program Manipulation

System Based on Partial Evaluation, 1977, ISBN 91-

7372-144-1.

No 17 Bengt Magnhagen: Probability Based Verification of

Time Margins in Digital Designs, 1977, ISBN 91-7372-

157-3.

No 18 Mats Cedwall: Semantisk analys av processbeskrivningar

i naturligt språk, 1977, ISBN 91-7372-168-9.

No 22 Jaak Urmi: A Machine Independent LISP Compiler

and its Implications for Ideal Hardware, 1978, ISBN

91-7372-188-3.

No 33 Tore Risch: Compilation of Multiple File Queries in

a Meta-Database System, 1978, ISBN 91-7372-232-4.

No 51 Erland Jungert: Synthesizing Database Structures

from a User Oriented Data Model, 1980, ISBN 91-

7372-387-8.

No 54 Sture Hägglund: Contributions to the Development

of Methods and Tools for Interactive Design of

Applications Software, 1980, ISBN 91-7372-404-1.

No 55 Pär Emanuelson: Performance Enhancement in a

Well-Structured Pattern Matcher through Partial

Evaluation, 1980, ISBN 91-7372-403-3.

No 58 Bengt Johnsson, Bertil Andersson: The Human-

Computer Interface in Commercial Systems, 1981,

ISBN 91-7372-414-9.

No 69 H. Jan Komorowski: A Specification of an Abstract

Prolog Machine and its Application to Partial

Evaluation, 1981, ISBN 91-7372-479-3.

No 71 René Reboh: Knowledge Engineering Techniques

and Tools for Expert Systems, 1981, ISBN 91-7372-

489-0.

No 77 Östen Oskarsson: Mechanisms of Modifiability in

large Software Systems, 1982, ISBN 91-7372-527-7.

No 94 Hans Lunell: Code Generator Writing Systems, 1983,

ISBN 91-7372-652-4.

No 97 Andrzej Lingas: Advances in Minimum Weight

Triangulation, 1983, ISBN 91-7372-660-5.

No 109 Peter Fritzson: Towards a Distributed Programming

Environment based on Incremental Compilation,

1984, ISBN 91-7372-801-2.

No 111 Erik Tengvald: The Design of Expert Planning

Systems. An Experimental Operations Planning

System for Turning, 1984, ISBN 91-7372-805-5.

No 155 Christos Levcopoulos: Heuristics for Minimum

Decompositions of Polygons, 1987, ISBN 91-7870-

133-3.

No 165 James W. Goodwin: A Theory and System for Non-

Monotonic Reasoning, 1987, ISBN 91-7870-183-X.

No 170 Zebo Peng: A Formal Methodology for Automated

Synthesis of VLSI Systems, 1987, ISBN 91-7870-225-9.

No 174 Johan Fagerström: A Paradigm and System for

Design of Distributed Systems, 1988, ISBN 91-7870-

301-8.

No 192 Dimiter Driankov: Towards a Many Valued Logic of

Quantified Belief, 1988, ISBN 91-7870-374-3.

No 213 Lin Padgham: Non-Monotonic Inheritance for an

Object Oriented Knowledge Base, 1989, ISBN 91-

7870-485-5.

No 214 Tony Larsson: A Formal Hardware Description and

Verification Method, 1989, ISBN 91-7870-517-7.

No 221 Michael Reinfrank: Fundamentals and Logical

Foundations of Truth Maintenance, 1989, ISBN 91-

7870-546-0.

No 239 Jonas Löwgren: Knowledge-Based Design Support

and Discourse Management in User Interface

Management Systems, 1991, ISBN 91-7870-720-X.

No 244 Henrik Eriksson: Meta-Tool Support for Knowledge

Acquisition, 1991, ISBN 91-7870-746-3.

No 252 Peter Eklund: An Epistemic Approach to Interactive

Design in Multiple Inheritance Hierarchies, 1991,

ISBN 91-7870-784-6.

No 258 Patrick Doherty: NML3 - A Non-Monotonic

Formalism with Explicit Defaults, 1991, ISBN 91-

7870-816-8.

No 260 Nahid Shahmehri: Generalized Algorithmic

Debugging, 1991, ISBN 91-7870-828-1.

No 264 Nils Dahlbäck: Representation of Discourse-

Cognitive and Computational Aspects, 1992, ISBN

91-7870-850-8.

No 265 Ulf Nilsson: Abstract Interpretations and Abstract

Machines: Contributions to a Methodology for the

Implementation of Logic Programs, 1992, ISBN 91-

7870-858-3.

No 270 Ralph Rönnquist: Theory and Practice of Tense-

bound Object References, 1992, ISBN 91-7870-873-7.

No 273 Björn Fjellborg: Pipeline Extraction for VLSI Data

Path Synthesis, 1992, ISBN 91-7870-880-X.

No 276 Staffan Bonnier: A Formal Basis for Horn Clause

Logic with External Polymorphic Functions, 1992,

ISBN 91-7870-896-6.

No 277 Kristian Sandahl: Developing Knowledge Management

Systems with an Active Expert Methodology,

1992, ISBN 91-7870-897-4.

No 281 Christer Bäckström: Computational Complexity of

Reasoning about Plans, 1992, ISBN 91-7870-979-2.

No 292 Mats Wirén: Studies in Incremental Natural

Language Analysis, 1992, ISBN 91-7871-027-8.

No 297 Mariam Kamkar: Interprocedural Dynamic Slicing

with Applications to Debugging and Testing, 1993,

ISBN 91-7871-065-0.

No 302 Tingting Zhang: A Study in Diagnosis Using

Classification and Defaults, 1993, ISBN 91-7871-078-2.

No 312 Arne Jönsson: Dialogue Management for Natural

Language Interfaces - An Empirical Approach, 1993,

ISBN 91-7871-110-X.

No 338 Simin Nadjm-Tehrani: Reactive Systems in Physical

Environments: Compositional Modelling and Framework

for Verification, 1994, ISBN 91-7871-237-8.

No 371 Bengt Savén: Business Models for Decision Support

and Learning. A Study of Discrete-Event

Manufacturing Simulation at Asea/ABB 1968-1993,

1995, ISBN 91-7871-494-X.

No 375 Ulf Söderman: Conceptual Modelling of Mode

Switching Physical Systems, 1995, ISBN 91-7871-516-

4.

No 383 Andreas Kågedal: Exploiting Groundness in Logic

Programs, 1995, ISBN 91-7871-538-5.

No 396 George Fodor: Ontological Control, Description,

Identification and Recovery from Problematic

Control Situations, 1995, ISBN 91-7871-603-9.

No 413 Mikael Pettersson: Compiling Natural Semantics,

1995, ISBN 91-7871-641-1.

No 414 Xinli Gu: RT Level Testability Improvement by

Testability Analysis and Transformations, 1996, ISBN

91-7871-654-3.

No 416 Hua Shu: Distributed Default Reasoning, 1996, ISBN

91-7871-665-9.

No 429 Jaime Villegas: Simulation Supported Industrial

Training from an Organisational Learning

Perspective - Development and Evaluation of the

SSIT Method, 1996, ISBN 91-7871-700-0.

No 431 Peter Jonsson: Studies in Action Planning:

Algorithms and Complexity, 1996, ISBN 91-7871-704-

3.

No 437 Johan Boye: Directional Types in Logic

Programming, 1996, ISBN 91-7871-725-6.

No 439 Cecilia Sjöberg: Activities, Voices and Arenas:

Participatory Design in Practice, 1996, ISBN 91-7871-

728-0.

No 448 Patrick Lambrix: Part-Whole Reasoning in

Description Logics, 1996, ISBN 91-7871-820-1.

No 452 Kjell Orsborn: On Extensible and Object-Relational

Database Technology for Finite Element Analysis

Applications, 1996, ISBN 91-7871-827-9.

No 459 Olof Johansson: Development Environments for

Complex Product Models, 1996, ISBN 91-7871-855-4.

No 461 Lena Strömbäck: User-Defined Constructions in

Unification-Based Formalisms, 1997, ISBN 91-7871-

857-0.

No 462 Lars Degerstedt: Tabulation-based Logic Programming:

A Multi-Level View of Query Answering,

1996, ISBN 91-7871-858-9.

No 475 Fredrik Nilsson: Strategi och ekonomisk styrning -

En studie av hur ekonomiska styrsystem utformas

och används efter företagsförvärv, 1997, ISBN 91-

7871-914-3.

No 480 Mikael Lindvall: An Empirical Study of Requirements-Driven

Impact Analysis in Object-Oriented

Software Evolution, 1997, ISBN 91-7871-927-5.

No 485 Göran Forslund: Opinion-Based Systems: The Cooperative

Perspective on Knowledge-Based Decision

Support, 1997, ISBN 91-7871-938-0.

No 494 Martin Sköld: Active Database Management

Systems for Monitoring and Control, 1997, ISBN 91-

7219-002-7.

No 495 Hans Olsén: Automatic Verification of Petri Nets in

a CLP framework, 1997, ISBN 91-7219-011-6.

No 498 Thomas Drakengren: Algorithms and Complexity

for Temporal and Spatial Formalisms, 1997, ISBN 91-

7219-019-1.

No 502 Jakob Axelsson: Analysis and Synthesis of Heterogeneous

Real-Time Systems, 1997, ISBN 91-7219-035-3.

No 503 Johan Ringström: Compiler Generation for Data-

Parallel Programming Languages from Two-Level

Semantics Specifications, 1997, ISBN 91-7219-045-0.

No 512 Anna Moberg: Närhet och distans - Studier av kommunikationsmönster

i satellitkontor och flexibla

kontor, 1997, ISBN 91-7219-119-8.

No 520 Mikael Ronström: Design and Modelling of a

Parallel Data Server for Telecom Applications, 1998,

ISBN 91-7219-169-4.

No 522 Niclas Ohlsson: Towards Effective Fault Prevention

- An Empirical Study in Software Engineering, 1998,

ISBN 91-7219-176-7.

No 526 Joachim Karlsson: A Systematic Approach for

Prioritizing Software Requirements, 1998, ISBN 91-

7219-184-8.

No 530 Henrik Nilsson: Declarative Debugging for Lazy

Functional Languages, 1998, ISBN 91-7219-197-X.

No 555 Jonas Hallberg: Timing Issues in High-Level Synthesis,

1998, ISBN 91-7219-369-7.

No 561 Ling Lin: Management of 1-D Sequence Data - From

Discrete to Continuous, 1999, ISBN 91-7219-402-2.

No 563 Eva L Ragnemalm: Student Modelling based on Collaborative

Dialogue with a Learning Companion,

1999, ISBN 91-7219-412-X.

No 567 Jörgen Lindström: Does Distance matter? On geographical

dispersion in organisations, 1999, ISBN 91-7219-439-1.

No 582 Vanja Josifovski: Design, Implementation and

Evaluation of a Distributed Mediator System for

Data Integration, 1999, ISBN 91-7219-482-0.

No 589 Rita Kovordányi: Modeling and Simulating

Inhibitory Mechanisms in Mental Image

Reinterpretation - Towards Cooperative Human-

Computer Creativity, 1999, ISBN 91-7219-506-1.

No 592 Mikael Ericsson: Supporting the Use of Design

Knowledge - An Assessment of Commenting

Agents, 1999, ISBN 91-7219-532-0.

No 593 Lars Karlsson: Actions, Interactions and Narratives,

1999, ISBN 91-7219-534-7.

No 594 C. G. Mikael Johansson: Social and Organizational

Aspects of Requirements Engineering Methods - A

practice-oriented approach, 1999, ISBN 91-7219-541-

X.

No 595 Jörgen Hansson: Value-Driven Multi-Class Overload

Management in Real-Time Database Systems, 1999,

ISBN 91-7219-542-8.

No 596 Niklas Hallberg: Incorporating User Values in the

Design of Information Systems and Services in the

Public Sector: A Methods Approach, 1999, ISBN 91-

7219-543-6.

No 597 Vivian Vimarlund: An Economic Perspective on the

Analysis of Impacts of Information Technology:

From Case Studies in Health-Care towards General

Models and Theories, 1999, ISBN 91-7219-544-4.

No 598 Johan Jenvald: Methods and Tools in Computer-

Supported Taskforce Training, 1999, ISBN 91-7219-

547-9.

No 607 Magnus Merkel: Understanding and enhancing

translation by parallel text processing, 1999, ISBN 91-

7219-614-9.

No 611 Silvia Coradeschi: Anchoring symbols to sensory

data, 1999, ISBN 91-7219-623-8.

No 613 Man Lin: Analysis and Synthesis of Reactive

Systems: A Generic Layered Architecture

Perspective, 1999, ISBN 91-7219-630-0.

No 618 Jimmy Tjäder: Systemimplementering i praktiken -

En studie av logiker i fyra projekt, 1999, ISBN 91-

7219-657-2.

No 627 Vadim Engelson: Tools for Design, Interactive

Simulation, and Visualization of Object-Oriented

Models in Scientific Computing, 2000, ISBN 91-7219-

709-9.

No 637 Esa Falkenroth: Database Technology for Control

and Simulation, 2000, ISBN 91-7219-766-8.

No 639 Per-Arne Persson: Bringing Power and Knowledge

Together: Information Systems Design for Autonomy

and Control in Command Work, 2000, ISBN 91-7219-

796-X.

No 660 Erik Larsson: An Integrated System-Level Design for

Testability Methodology, 2000, ISBN 91-7219-890-7.

No 688 Marcus Bjäreland: Model-based Execution

Monitoring, 2001, ISBN 91-7373-016-5.

No 689 Joakim Gustafsson: Extending Temporal Action

Logic, 2001, ISBN 91-7373-017-3.

No 720 Carl-Johan Petri: Organizational Information Provision -

Managing Mandatory and Discretionary Use

of Information Technology, 2001, ISBN 91-7373-126-9.

No 724 Paul Scerri: Designing Agents for Systems with Adjustable

Autonomy, 2001, ISBN 91-7373-207-9.

No 725 Tim Heyer: Semantic Inspection of Software

Artifacts: From Theory to Practice, 2001, ISBN 91-7373-208-7.

No 726 Pär Carlshamre: A Usability Perspective on Requirements

Engineering - From Methodology to Product

Development, 2001, ISBN 91-7373-212-5.

No 732 Juha Takkinen: From Information Management to

Task Management in Electronic Mail, 2002, ISBN 91-7373-258-3.

No 745 Johan Åberg: Live Help Systems: An Approach to

Intelligent Help for Web Information Systems, 2002,

ISBN 91-7373-311-3.

No 746 Rego Granlund: Monitoring Distributed Teamwork

Training, 2002, ISBN 91-7373-312-1.

No 757 Henrik André-Jönsson: Indexing Strategies for Time

Series Data, 2002, ISBN 91-7373-346-6.

No 747 Anneli Hagdahl: Development of IT-supported

Interorganisational Collaboration - A Case Study in

the Swedish Public Sector, 2002, ISBN 91-7373-314-8.

No 749 Sofie Pilemalm: Information Technology for Non-

Profit Organisations - Extended Participatory Design

of an Information System for Trade Union Shop

Stewards, 2002, ISBN 91-7373-318-0.

No 765 Stefan Holmlid: Adapting users: Towards a theory

of use quality, 2002, ISBN 91-7373-397-0.

No 771 Magnus Morin: Multimedia Representations of Distributed

Tactical Operations, 2002, ISBN 91-7373-421-7.

No 772 Pawel Pietrzak: A Type-Based Framework for Locating

Errors in Constraint Logic Programs, 2002, ISBN

91-7373-422-5.

No 758 Erik Berglund: Library Communication Among Programmers

Worldwide, 2002, ISBN 91-7373-349-0.

No 774 Choong-ho Yi: Modelling Object-Oriented Dynamic

Systems Using a Logic-Based Framework, 2002, ISBN

91-7373-424-1.

No 779 Mathias Broxvall: A Study in the Computational

Complexity of Temporal Reasoning, 2002, ISBN 91-

7373-440-3.

No 793 Asmus Pandikow: A Generic Principle for Enabling

Interoperability of Structured and Object-Oriented

Analysis and Design Tools, 2002, ISBN 91-7373-479-9.

No 785 Lars Hult: Publika Informationstjänster. En studie av

den Internetbaserade encyklopedins bruksegenskaper,

2003, ISBN 91-7373-461-6.

No 800 Lars Taxén: A Framework for the Coordination of

Complex Systems' Development, 2003, ISBN 91-7373-604-X.

No 808 Klas Gäre: Tre perspektiv på förväntningar och

förändringar i samband med införande av

informationssystem, 2003, ISBN 91-7373-618-X.

No 821 Mikael Kindborg: Concurrent Comics -

programming of social agents by children, 2003,

ISBN 91-7373-651-1.

No 823 Christina Ölvingson: On Development of

Information Systems with GIS Functionality in

Public Health Informatics: A Requirements

Engineering Approach, 2003, ISBN 91-7373-656-2.

No 828 Tobias Ritzau: Memory Efficient Hard Real-Time

Garbage Collection, 2003, ISBN 91-7373-666-X.

No 833 Paul Pop: Analysis and Synthesis of

Communication-Intensive Heterogeneous Real-Time

Systems, 2003, ISBN 91-7373-683-X.

No 852 Johan Moe: Observing the Dynamic Behaviour of

Large Distributed Systems to Improve Development

and Testing – An Empirical Study in Software

Engineering, 2003, ISBN 91-7373-779-8.

No 867 Erik Herzog: An Approach to Systems Engineering

Tool Data Representation and Exchange, 2004, ISBN

91-7373-929-4.

No 872 Aseel Berglund: Augmenting the Remote Control:

Studies in Complex Information Navigation for

Digital TV, 2004, ISBN 91-7373-940-5.

No 869 Jo Skåmedal: Telecommuting’s Implications on

Travel and Travel Patterns, 2004, ISBN 91-7373-935-9.

No 870 Linda Askenäs: The Roles of IT - Studies of

Organising when Implementing and Using

Enterprise Systems, 2004, ISBN 91-7373-936-7.

No 874 Annika Flycht-Eriksson: Design and Use of Ontologies

in Information-Providing Dialogue Systems,

2004, ISBN 91-7373-947-2.

No 873 Peter Bunus: Debugging Techniques for Equation-

Based Languages, 2004, ISBN 91-7373-941-3.

No 876 Jonas Mellin: Resource-Predictable and Efficient

Monitoring of Events, 2004, ISBN 91-7373-956-1.

No 883 Magnus Bång: Computing at the Speed of Paper:

Ubiquitous Computing Environments for Healthcare

Professionals, 2004, ISBN 91-7373-971-5.

No 882 Robert Eklund: Disfluency in Swedish human-human

and human-machine travel booking dialogues,

2004, ISBN 91-7373-966-9.

No 887 Anders Lindström: English and other Foreign

Linguistic Elements in Spoken Swedish. Studies of

Productive Processes and their Modelling using

Finite-State Tools, 2004, ISBN 91-7373-981-2.

No 889 Zhiping Wang: Capacity-Constrained Production-inventory

systems - Modelling and Analysis in both a

traditional and an e-business context, 2004, ISBN 91-85295-08-6.

No 893 Pernilla Qvarfordt: Eyes on Multimodal Interaction,

2004, ISBN 91-85295-30-2.

No 910 Magnus Kald: In the Borderland between Strategy

and Management Control - Theoretical Framework

and Empirical Evidence, 2004, ISBN 91-85295-82-5.

No 918 Jonas Lundberg: Shaping Electronic News: Genre

Perspectives on Interaction Design, 2004, ISBN 91-

85297-14-3.

No 900 Mattias Arvola: Shades of use: The dynamics of

interaction design for sociable use, 2004, ISBN 91-

85295-42-6.

No 920 Luis Alejandro Cortés: Verification and Scheduling

Techniques for Real-Time Embedded Systems, 2004,

ISBN 91-85297-21-6.

No 929 Diana Szentivanyi: Performance Studies of Fault-

Tolerant Middleware, 2005, ISBN 91-85297-58-5.

No 933 Mikael Cäker: Management Accounting as

Constructing and Opposing Customer Focus: Three

Case Studies on Management Accounting and

Customer Relations, 2005, ISBN 91-85297-64-X.

No 937 Jonas Kvarnström: TALplanner and Other

Extensions to Temporal Action Logic, 2005, ISBN 91-

85297-75-5.

No 938 Bourhane Kadmiry: Fuzzy Gain-Scheduled Visual

Servoing for Unmanned Helicopter, 2005, ISBN 91-

85297-76-3.

No 945 Gert Jervan: Hybrid Built-In Self-Test and Test

Generation Techniques for Digital Systems, 2005,

ISBN 91-85297-97-6.

No 946 Anders Arpteg: Intelligent Semi-Structured Information

Extraction, 2005, ISBN 91-85297-98-4.

No 947 Ola Angelsmark: Constructing Algorithms for

Constraint Satisfaction and Related Problems - Methods

and Applications, 2005, ISBN 91-85297-99-2.

No 963 Calin Curescu: Utility-based Optimisation of

Resource Allocation for Wireless Networks, 2005,

ISBN 91-85457-07-8.

No 972 Björn Johansson: Joint Control in Dynamic

Situations, 2005, ISBN 91-85457-31-0.

No 974 Dan Lawesson: An Approach to Diagnosability

Analysis for Interacting Finite State Systems, 2005,

ISBN 91-85457-39-6.

No 979 Claudiu Duma: Security and Trust Mechanisms for

Groups in Distributed Services, 2005, ISBN 91-85457-

54-X.

No 983 Sorin Manolache: Analysis and Optimisation of

Real-Time Systems with Stochastic Behaviour, 2005,

ISBN 91-85457-60-4.

No 986 Yuxiao Zhao: Standards-Based Application

Integration for Business-to-Business

Communications, 2005, ISBN 91-85457-66-3.

No 1004 Patrik Haslum: Admissible Heuristics for

Automated Planning, 2006, ISBN 91-85497-28-2.

No 1005 Aleksandra Tešanovic: Developing Reusable and

Reconfigurable Real-Time Software using Aspects

and Components, 2006, ISBN 91-85497-29-0.

No 1008 David Dinka: Role, Identity and Work: Extending

the design and development agenda, 2006, ISBN 91-

85497-42-8.

No 1009 Iakov Nakhimovski: Contributions to the Modeling

and Simulation of Mechanical Systems with Detailed

Contact Analysis, 2006, ISBN 91-85497-43-X.

No 1013 Wilhelm Dahllöf: Exact Algorithms for Exact

Satisfiability Problems, 2006, ISBN 91-85523-97-6.

No 1016 Levon Saldamli: PDEModelica - A High-Level

Language for Modeling with Partial Differential

Equations, 2006, ISBN 91-85523-84-4.

No 1017 Daniel Karlsson: Verification of Component-based

Embedded System Designs, 2006, ISBN 91-85523-79-8.

No 1018 Ioan Chisalita: Communication and Networking

Techniques for Traffic Safety Systems, 2006, ISBN 91-

85523-77-1.

No 1019 Tarja Susi: The Puzzle of Social Activity - The

Significance of Tools in Cognition and Cooperation,

2006, ISBN 91-85523-71-2.

No 1021 Andrzej Bednarski: Integrated Optimal Code

Generation for Digital Signal Processors, 2006, ISBN 91-

85523-69-0.

No 1022 Peter Aronsson: Automatic Parallelization of

Equation-Based Simulation Programs, 2006, ISBN 91-

85523-68-2.

No 1030 Robert Nilsson: A Mutation-based Framework for

Automated Testing of Timeliness, 2006, ISBN 91-

85523-35-6.

No 1034 Jon Edvardsson: Techniques for Automatic

Generation of Tests from Programs and

Specifications, 2006, ISBN 91-85523-31-3.

No 1035 Vaida Jakoniene: Integration of Biological Data,

2006, ISBN 91-85523-28-3.

No 1045 Genevieve Gorrell: Generalized Hebbian

Algorithms for Dimensionality Reduction in Natural

Language Processing, 2006, ISBN 91-85643-88-2.

No 1051 Yu-Hsing Huang: Having a New Pair of Glasses -

Applying Systemic Accident Models on Road Safety,

2006, ISBN 91-85643-64-5.

No 1054 Åsa Hedenskog: Perceive those things which cannot

be seen - A Cognitive Systems Engineering

perspective on requirements management, 2006,

ISBN 91-85643-57-2.

No 1061 Cécile Åberg: An Evaluation Platform for Semantic

Web Technology, 2007, ISBN 91-85643-31-9.

No 1073 Mats Grindal: Handling Combinatorial Explosion in

Software Testing, 2007, ISBN 978-91-85715-74-9.

No 1075 Almut Herzog: Usable Security Policies for Runtime

Environments, 2007, ISBN 978-91-85715-65-7.

No 1079 Magnus Wahlström: Algorithms, measures, and

upper bounds for Satisfiability and related problems,

2007, ISBN 978-91-85715-55-8.

No 1083 Jesper Andersson: Dynamic Software Architectures,

2007, ISBN 978-91-85715-46-6.

No 1086 Ulf Johansson: Obtaining Accurate and

Comprehensible Data Mining Models - An Evolutionary

Approach, 2007, ISBN 978-91-85715-34-3.

No 1089 Traian Pop: Analysis and Optimisation of

Distributed Embedded Systems with Heterogeneous

Scheduling Policies, 2007, ISBN 978-91-85715-27-5.

No 1091 Gustav Nordh: Complexity Dichotomies for CSP-

related Problems, 2007, ISBN 978-91-85715-20-6.

No 1106 Per Ola Kristensson: Discrete and Continuous Shape

Writing for Text Entry and Control, 2007, ISBN 978-

91-85831-77-7.

No 1110 He Tan: Aligning Biomedical Ontologies, 2007, ISBN

978-91-85831-56-2.

No 1112 Jessica Lindblom: Minding the body - Interacting

socially through embodied action, 2007, ISBN 978-91-

85831-48-7.

No 1113 Pontus Wärnestål: Dialogue Behavior Management

in Conversational Recommender Systems, 2007,

ISBN 978-91-85831-47-0.

No 1120 Thomas Gustafsson: Management of Real-Time

Data Consistency and Transient Overloads in

Embedded Systems, 2007, ISBN 978-91-85831-33-3.

No 1127 Alexandru Andrei: Energy Efficient and Predictable

Design of Real-time Embedded Systems, 2007, ISBN

978-91-85831-06-7.

No 1139 Per Wikberg: Eliciting Knowledge from Experts in

Modeling of Complex Systems: Managing Variation

and Interactions, 2007, ISBN 978-91-85895-66-3.

No 1143 Mehdi Amirijoo: QoS Control of Real-Time Data

Services under Uncertain Workload, 2007, ISBN 978-

91-85895-49-6.

No 1150 Sanny Syberfeldt: Optimistic Replication with

Forward Conflict Resolution in Distributed Real-Time

Databases, 2007, ISBN 978-91-85895-27-4.

No 1155 Beatrice Alenljung: Envisioning a Future Decision

Support System for Requirements Engineering - A

Holistic and Human-centred Perspective, 2008, ISBN

978-91-85895-11-3.

No 1156 Artur Wilk: Types for XML with Application to

Xcerpt, 2008, ISBN 978-91-85895-08-3.

No 1183 Adrian Pop: Integrated Model-Driven Development

Environments for Equation-Based Object-Oriented

Languages, 2008, ISBN 978-91-7393-895-2.

No 1185 Jörgen Skågeby: Gifting Technologies -

Ethnographic Studies of End-users and Social Media

Sharing, 2008, ISBN 978-91-7393-892-1.

No 1187 Imad-Eldin Ali Abugessaisa: Analytical tools and

information-sharing methods supporting road safety

organizations, 2008, ISBN 978-91-7393-887-7.

No 1204 H. Joe Steinhauer: A Representation Scheme for

Description and Reconstruction of Object

Configurations Based on Qualitative Relations, 2008,

ISBN 978-91-7393-823-5.

No 1222 Anders Larsson: Test Optimization for Core-based

System-on-Chip, 2008, ISBN 978-91-7393-768-9.

No 1238 Andreas Borg: Processes and Models for Capacity

Requirements in Telecommunication Systems, 2009,

ISBN 978-91-7393-700-9.

No 1240 Fredrik Heintz: DyKnow: A Stream-Based

Knowledge Processing Middleware Framework, 2009,

ISBN 978-91-7393-696-5.

No 1241 Birgitta Lindström: Testability of Dynamic Real-

Time Systems, 2009, ISBN 978-91-7393-695-8.

No 1244 Eva Blomqvist: Semi-automatic Ontology

Construction based on Patterns, 2009, ISBN 978-91-7393-683-5.

No 1249 Rogier Woltjer: Functional Modeling of Constraint

Management in Aviation Safety and Command and

Control, 2009, ISBN 978-91-7393-659-0.

No 1260 Gianpaolo Conte: Vision-Based Localization and

Guidance for Unmanned Aerial Vehicles, 2009, ISBN

978-91-7393-603-3.

No 1262 AnnMarie Ericsson: Enabling Tool Support for

Formal Analysis of ECA Rules, 2009, ISBN 978-91-7393-

598-2.

No 1266 Jiri Trnka: Exploring Tactical Command and

Control: A Role-Playing Simulation Approach, 2009,

ISBN 978-91-7393-571-5.

No 1268 Bahlol Rahimi: Supporting Collaborative Work

through ICT - How End-users Think of and Adopt

Integrated Health Information Systems, 2009, ISBN

978-91-7393-550-0.

No 1274 Fredrik Kuivinen: Algorithms and Hardness Results

for Some Valued CSPs, 2009, ISBN 978-91-7393-525-8.

No 1281 Gunnar Mathiason: Virtual Full Replication for

Scalable Distributed Real-Time Databases, 2009,

ISBN 978-91-7393-503-6.

No 1290 Viacheslav Izosimov: Scheduling and Optimization

of Fault-Tolerant Distributed Embedded Systems,

2009, ISBN 978-91-7393-482-4.

No 1294 Johan Thapper: Aspects of a Constraint

Optimisation Problem, 2010, ISBN 978-91-7393-464-0.

No 1306 Susanna Nilsson: Augmentation in the Wild: User

Centered Development and Evaluation of

Augmented Reality Applications, 2010, ISBN 978-91-

7393-416-9.

No 1313 Christer Thörn: On the Quality of Feature Models,

2010, ISBN 978-91-7393-394-0.

No 1321 Zhiyuan He: Temperature Aware and Defect-

Probability Driven Test Scheduling for System-on-

Chip, 2010, ISBN 978-91-7393-378-0.

No 1333 David Broman: Meta-Languages and Semantics for

Equation-Based Modeling and Simulation, 2010,

ISBN 978-91-7393-335-3.

No 1337 Alexander Siemers: Contributions to Modelling and

Visualisation of Multibody Systems Simulations with

Detailed Contact Analysis, 2010, ISBN 978-91-7393-

317-9.

No 1354 Mikael Asplund: Disconnected Discoveries:

Availability Studies in Partitioned Networks, 2010,

ISBN 978-91-7393-278-3.

No 1359 Jana Rambusch: Mind Games Extended:

Understanding Gameplay as Situated Activity, 2010,

ISBN 978-91-7393-252-3.

No 1373 Sonia Sangari: Head Movement Correlates to Focus

Assignment in Swedish, 2011, ISBN 978-91-7393-154-0.

No 1374 Jan-Erik Källhammer: Using False Alarms when

Developing Automotive Active Safety Systems, 2011,

ISBN 978-91-7393-153-3.

No 1375 Mattias Eriksson: Integrated Code Generation, 2011,

ISBN 978-91-7393-147-2.

No 1381 Ola Leifler: Affordances and Constraints of

Intelligent Decision Support for Military Command

and Control – Three Case Studies of Support

Systems, 2011, ISBN 978-91-7393-133-5.

No 1386 Soheil Samii: Quality-Driven Synthesis and

Optimization of Embedded Control Systems, 2011,

ISBN 978-91-7393-102-1.

No 1419 Erik Kuiper: Geographic Routing in Intermittently-

connected Mobile Ad Hoc Networks: Algorithms

and Performance Models, 2012, ISBN 978-91-7519-

981-8.

No 1451 Sara Stymne: Text Harmonization Strategies for

Phrase-Based Statistical Machine Translation, 2012,

ISBN 978-91-7519-887-3.

No 1455 Alberto Montebelli: Modeling the Role of Energy

Management in Embodied Cognition, 2012, ISBN

978-91-7519-882-8.

No 1465 Mohammad Saifullah: Biologically-Based Interactive

Neural Network Models for Visual Attention and

Object Recognition, 2012, ISBN 978-91-7519-838-5.

No 1490 Tomas Bengtsson: Testing and Logic Optimization

Techniques for Systems on Chip, 2012, ISBN 978-91-

7519-742-5.

No 1481 David Byers: Improving Software Security by

Preventing Known Vulnerabilities, 2012, ISBN 978-

91-7519-784-5.

No 1496 Tommy Färnqvist: Exploiting Structure in CSP-

related Problems, 2013, ISBN 978-91-7519-711-1.

No 1503 John Wilander: Contributions to Specification,

Implementation, and Execution of Secure Software,

2013, ISBN 978-91-7519-681-7.

No 1506 Magnus Ingmarsson: Creating and Enabling the

Useful Service Discovery Experience, 2013, ISBN 978-

91-7519-662-6.

No 1547 Wladimir Schamai: Model-Based Verification of

Dynamic System Behavior against Requirements:

Method, Language, and Tool, 2013, ISBN 978-91-

7519-505-6.

No 1551 Henrik Svensson: Simulations, 2013, ISBN 978-91-

7519-491-2.

No 1559 Sergiu Rafiliu: Stability of Adaptive Distributed

Real-Time Systems with Dynamic Resource

Management, 2013, ISBN 978-91-7519-471-4.

No 1581 Usman Dastgeer: Performance-aware Component

Composition for GPU-based Systems, 2014, ISBN

978-91-7519-383-0.

No 1602 Cai Li: Reinforcement Learning of Locomotion based

on Central Pattern Generators, 2014, ISBN 978-91-

7519-313-7.

No 1652 Roland Samlaus: An Integrated Development

Environment with Enhanced Domain-Specific

Interactive Model Validation, 2015, ISBN 978-91-

7519-090-7.

No 1663 Hannes Uppman: On Some Combinatorial

Optimization Problems: Algorithms and Complexity,

2015, ISBN 978-91-7519-072-3.

No 1664 Martin Sjölund: Tools and Methods for Analysis,

Debugging, and Performance Improvement of

Equation-Based Models, 2015, ISBN 978-91-7519-071-6.

No 1666 Kristian Stavåker: Contributions to Simulation of

Modelica Models on Data-Parallel Multi-Core

Architectures, 2015, ISBN 978-91-7519-068-6.

No 1680 Adrian Lifa: Hardware/Software Codesign of

Embedded Systems with Reconfigurable and

Heterogeneous Platforms, 2015, ISBN 978-91-7519-040-

2.

No 1685 Bogdan Tanasa: Timing Analysis of Distributed

Embedded Systems with Stochastic Workload and

Reliability Constraints, 2015, ISBN 978-91-7519-022-8.

No 1691 Håkan Warnquist: Troubleshooting Trucks –

Automated Planning and Diagnosis, 2015, ISBN 978-

91-7685-993-3.

No 1702 Nima Aghaee: Thermal Issues in Testing of

Advanced Systems on Chip, 2015, ISBN 978-91-7685-

949-0.

Linköping Studies in Arts and Science

No 504 Ing-Marie Jonsson: Social and Emotional

Characteristics of Speech-based In-Vehicle

Information Systems: Impact on Attitude and

Driving Behaviour, 2009, ISBN 978-91-7393-478-7.

No 586 Fabian Segelström: Stakeholder Engagement for

Service Design: How service designers identify and

communicate insights, 2013, ISBN 978-91-7519-554-4.

No 618 Johan Blomkvist: Representing Future Situations of

Service: Prototyping in Service Design, 2014, ISBN

978-91-7519-343-4.

No 620 Marcus Mast: Human-Robot Interaction for Semi-

Autonomous Assistive Robots, 2014, ISBN 978-91-

7519-319-9.

Linköping Studies in Statistics

No 9 Davood Shahsavani: Computer Experiments

Designed to Explore and Approximate Complex

Deterministic Models, 2008, ISBN 978-91-7393-976-8.

No 10 Karl Wahlin: Roadmap for Trend Detection and

Assessment of Data Quality, 2008, ISBN 978-91-7393-

792-4.

No 11 Oleg Sysoev: Monotonic regression for large

multivariate datasets, 2010, ISBN 978-91-7393-412-1.

No 13 Agné Burauskaite-Harju: Characterizing Temporal

Change and Inter-Site Correlations in Daily and Sub-

daily Precipitation Extremes, 2011, ISBN 978-91-7393-

110-6.

Linköping Studies in Information Science

No 1 Karin Axelsson: Metodisk systemstrukturering - att

skapa samstämmighet mellan

informationssystemarkitektur och verksamhet, 1998, ISBN 91-7219-296-8.

No 2 Stefan Cronholm: Metodverktyg och användbarhet -

en studie av datorstödd metodbaserad

systemutveckling, 1998, ISBN 91-7219-299-2.

No 3 Anders Avdic: Användare och utvecklare - om

anveckling med kalkylprogram, 1999, ISBN 91-7219-

606-8.

No 4 Owen Eriksson: Kommunikationskvalitet hos

informationssystem och affärsprocesser, 2000, ISBN 91-

7219-811-7.

No 5 Mikael Lind: Från system till process - kriterier för

processbestämning vid verksamhetsanalys, 2001,

ISBN 91-7373-067-X.

No 6 Ulf Melin: Koordination och informationssystem i

företag och nätverk, 2002, ISBN 91-7373-278-8.

No 7 Pär J. Ågerfalk: Information Systems Actability -

Understanding Information Technology as a Tool for

Business Action and Communication, 2003, ISBN 91-

7373-628-7.

No 8 Ulf Seigerroth: Att förstå och förändra

systemutvecklingsverksamheter - en taxonomi för

metautveckling, 2003, ISBN 91-7373-736-4.

No 9 Karin Hedström: Spår av datoriseringens värden –

Effekter av IT i äldreomsorg, 2004, ISBN 91-7373-963-

4.

No 10 Ewa Braf: Knowledge Demanded for Action -

Studies on Knowledge Mediation in Organisations,

2004, ISBN 91-85295-47-7.

No 11 Fredrik Karlsson: Method Configuration method

and computerized tool support, 2005, ISBN 91-85297-

48-8.

No 12 Malin Nordström: Styrbar systemförvaltning - Att

organisera systemförvaltningsverksamhet med hjälp

av effektiva förvaltningsobjekt, 2005, ISBN 91-85297-

60-7.

No 13 Stefan Holgersson: Yrke: POLIS - Yrkeskunskap,

motivation, IT-system och andra förutsättningar för

polisarbete, 2005, ISBN 91-85299-43-X.

No 14 Benneth Christiansson, Marie-Therese Christiansson: Mötet mellan process och komponent

- mot ett ramverk för en verksamhetsnära

kravspecifikation vid anskaffning av

komponentbaserade informationssystem, 2006, ISBN 91-85643-

22-X.
