Linköping Studies in Science and Technology
Dissertations. No. 1702
Thermal Issues in Testing of
Advanced Systems on Chip
By
Nima Aghaee
Department of Computer and Information Science
Linköping University
SE-581 83 Linköping, Sweden
Linköping 2015
Copyright © 2015 Aghaee Ghaleshahi, Nima
ISBN 978-91-7685-949-0
ISSN 0345-7524
Printed by LiU-Tryck 2015
URL: http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-120798
Abstract
Many cutting-edge computer and electronic products are powered by
advanced Systems-on-Chip (SoC). Advanced SoCs encompass superb
performance together with a large number of functions. This is achieved by
efficient integration of a huge number of transistors. Such very large scale
integration is enabled by a core-based design paradigm as well as deep-
submicron and 3D-stacked-IC technologies. These technologies are
susceptible to reliability and testing complications caused by thermal
issues. Three crucial thermal issues related to temperature variations,
temperature gradients, and temperature cycling are addressed in this thesis.
Existing test scheduling techniques rely on temperature simulations to
generate schedules that meet thermal constraints such as overheating
prevention. The difference between the simulated temperatures and the
actual temperatures is called temperature error. This error, for past
technologies, is negligible. However, advanced SoCs experience large
errors due to large process variations. Such large errors have costly
consequences, such as overheating, and must be taken care of. This thesis
presents an adaptive approach to generate test schedules that handle such
temperature errors.
Advanced SoCs manufactured as 3D-stacked-ICs experience large
temperature gradients. Temperature gradients accelerate certain early-life
defect mechanisms. These mechanisms can be accelerated using gradient-
based, burn-in like operations so that the defects are detected before
shipping. Moreover, temperature gradients exacerbate some delay-related
defects. In order to detect such defects, testing must be performed when
appropriate temperature gradients are enforced. Schedule-based
techniques that enforce the temperature gradients for burn-in-like
operations and delay testing are proposed in this thesis.
The last thermal issue addressed by this thesis is related to temperature
cycling. Temperature cycling test procedures are usually applied to safety-
critical systems to detect cycling-related early-life failures. Such failures
affect advanced SoCs, particularly through-silicon-via structures in 3D-
stacked-ICs. An efficient schedule-based cycling-test technique that
combines cycling acceleration with testing is proposed in this thesis. The
proposed technique fits into existing 3D testing procedures and does not
require temperature chambers. Therefore, the overall cycling acceleration
and testing cost can be drastically reduced.
All the proposed techniques have been implemented and evaluated with
extensive experiments based on ITC’02 benchmarks as well as a number
of 3D stacked ICs. Experiments show that the proposed techniques work
effectively and reduce the test costs. We have also developed a fast
temperature simulation technique based on a closed-form solution for the
temperature equations. Experiments demonstrate that the proposed
simulation technique reduces the test schedule generation time by more
than half.
Populärvetenskaplig sammanfattning (Popular Science Summary)
Many groundbreaking computer and electronics products are powered by
advanced Systems-on-Chip (SoC). Advanced SoCs offer outstanding
performance together with a large number of functions. This is achieved
through efficient integration of a large number of transistors. Such
large-scale integration is enabled by a core-based design paradigm as well
as deep-submicron and 3D-stacked-IC technologies. These technologies are
susceptible to reliability and testing complications caused by thermal
problems. Three important thermal issues concerning temperature
variations, temperature gradients, and temperature cycling are addressed
in this thesis.
Existing test scheduling techniques rely on temperature simulations to
generate schedules that meet thermal constraints. The difference between
the simulated temperatures and the actual temperatures is an error. For
earlier technologies, this error is negligible. Advanced SoCs, however,
experience large errors due to large process variations. Such large errors
have costly consequences, such as overheating, and must be dealt with.
Advanced SoCs manufactured as 3D-stacked-ICs experience large
temperature gradients. Temperature gradients accelerate the onset of
certain defect mechanisms when the product is new. These mechanisms can
be artificially accelerated by applying gradients so that the
corresponding faults are detected in time. Moreover, temperature gradients
exacerbate certain delay-related defects. To detect such defects, the tests
must be performed while appropriate temperature gradients are applied.
The last thermal issue addressed in this thesis is related to temperature
cycling. Temperature-cycling tests are used to detect cycling-related
failures early. Such failures affect advanced SoCs, particularly
through-silicon-via structures in 3D-stacked-ICs. Existing
temperature-cycling test methods are too expensive for 3D-stacked-ICs, and
therefore new, cheaper techniques must be developed.
This thesis proposes efficient schedule-based solutions for the thermal
problems discussed above. These include thermal test and reliability
problems associated with process variation, temperature gradients, and
temperature variations. A fast temperature simulation technique is
proposed in this thesis. Extensive experiments have demonstrated the
effectiveness of the proposed techniques.
Acknowledgements
I would like to express my sincere gratitude and appreciation to my
advisors Professor Zebo Peng and Professor Petru Eles. I am thankful for
the opportunity, support, education, and training that they have provided
throughout my doctoral studies.
I would like to thank and express my appreciation to the Swedish National
Graduate School in Computer Science, CUGS, for funding and supporting
my research and studies.
I cannot forget the impact the high quality doctoral courses had on my
professional life by offering new perspectives. Even though I cannot name
them one by one, I must thank the professors and teachers who offered them.
Many thanks go to my friends and other employees at the embedded
systems laboratory, ESLAB, and at the department of computer and
information science, IDA, (including my colleagues, administration,
technical sections, etc.) for the pleasant and supportive work place that they
have created.
All this support would not have worked without the extraordinary support
and motivation from my parents and siblings. Thank you all!
Nima Aghaee Ghaleshahi
Linköping, September 2015
Please note that those copies of this thesis that are printed by LiU-Tryck
are in grayscale except for pages 8, 161, and 171. All the full color figures
can be found in the electronic copy.
Contents
Abstract
Populärvetenskaplig sammanfattning (Popular Science Summary)
Acknowledgements
Chapter 1 Introduction
1.1 Motivation
1.2 Contributions
1.3 Publications
1.4 Thesis Organization
Chapter 2 Preliminaries
2.1 Temperature Related Defects
2.1.1 Temperature Dependent Defects
2.1.2 Early Life Failures
2.1.3 Delay Faults
2.2 Core-Based SoC Testing
2.3 3D Stacked IC Testing
2.4 Test Scheduling
2.5 Test Power and Temperature
2.6 Temperature Simulation
2.7 Meta-Heuristic
2.7.1 Motivational Example
2.7.2 Particle Swarm Optimization
Chapter 3 Related Work
3.1 SoC Test Scheduling
3.2 3D Stacked IC Testing
3.3 Temperature-Aware Test Scheduling
3.4 Process Variation Effects on Power and Temperature
3.5 Multi-Temperature Testing
3.6 Temperature Gradients and Burn-In
3.7 Testing for Delay-Related Defects
3.8 Temperature Cycling
3.9 Test Reordering
Chapter 4 Process-Variation Aware SoC Test Scheduling Techniques
4.1 Introduction
4.2 Motivational Example
4.3 Problem Formulation
4.4 Temperature Error Model
4.5 Adaptive Test Scheduling
4.5.1 Tree Construction
4.5.2 Linear Schedule Tables
4.5.3 Sub-Tree Evaluation
4.5.4 Sub-Tree Scheduling
4.5.5 Remarks
4.6 A Fast Temperature Simulation Approach
4.7 Experimental Results
4.7.1 Fast Temperature Simulation Approach
4.7.2 Adaptive Test Scheduling Technique
4.8 Adaptive Multi-Temperature Testing
4.9 Remarks
4.10 Conclusions
4.11 Notations and Abbreviations
Chapter 5 Temperature-Gradient Based Burn-In and Test Scheduling
5.1 Introduction
5.1.1 Test for Early-Life Failures
5.1.2 Test for Delay Faults
5.2 Temperature-Gradient Based Burn-In
5.2.1 Motivation and Problem Description
5.2.2 Steady State Solution
5.2.3 Transient Solution
5.2.4 Transient-Based Heuristic
5.2.5 Remarks
5.2.6 Experimental Results
5.3 Temperature-Gradient Based Test
5.3.1 Straightforward Algorithm
5.3.2 Fast Heuristic
5.3.3 Experimental Results
5.4 Temperature-Map Ordering
5.4.1 Map Ordering Technique
5.4.2 Experimental Results
5.5 Conclusions
5.6 Notations and Abbreviations
Chapter 6 Integrated Temperature-Cycling Acceleration and Test
6.1 Preliminaries
6.1.1 Circuit under Test and Test Access Mechanism
6.1.2 Thermal Model
6.1.3 Temperature Cycling Model
6.2 Motivational Examples
6.2.1 ATC Rate for a Simple Scenario
6.2.2 Optimal Cycling in a Simplified Scenario
6.2.3 Effect of the Test Application Order
6.3 Problem Formulation
6.4 Three-Phase Approach
6.5 Integrated Approach
6.5.1 Path-Graph Scheduling Algorithm
6.5.2 Length of the Power Averaging Window
6.5.3 Priorities for TAM Access
6.5.4 Node Ordering in the Test Graph
6.5.5 Remarks
6.6 Experimental Results
6.6.1 Cycling Acceleration
6.6.2 Performance of the Integrated Approach
6.7 Conclusions
6.8 Notations and Abbreviations
Chapter 7 Conclusions and Future Work
7.1 Conclusions
7.2 Future Work
References
Chapter 1 Introduction
This thesis deals with temperature-related test issues. We focus on
manufacturing test of digital electronics that are produced by Very Large
Scale Integration (VLSI) techniques. The thermal test issues that are dealt
with in this thesis result in two categories of imperfect products being sent
to market: (1) products that are defective and (2) products that, even though
fully functional at the beginning, will fail during field operation
shortly after being deployed.
The test issues are considered for System-on-Chip (SoC) designs where
usually a core-based test architecture is in place. In such cases, the Test
Access Mechanism (TAM) is most often scan-based. We focus mainly on
advanced SoCs, where a fabrication technique with very small feature size
is used, usually referred to as deep submicron technology.
Reducing the feature size has been a means to integrate more functionality
within an Integrated Circuit (IC) with good operational speed, manageable
power consumption, and acceptable production cost. This trend cannot be
endlessly continued, as the feature size is getting close to the size of a
single atom. An alternative for integrating more functionality into a single
package is 3D Stacked IC (3D-SIC) technology. 3D-SIC technology can
efficiently bond multiple dies into a single package. In this thesis,
we sometimes refer to this package as an IC. This thesis focuses on
advanced SoCs that have very small feature size or are manufactured by
3D-SIC technology. These technologies are affected by temperature-
related testing and reliability issues.
This chapter continues with the motivations for this thesis. Then a
summary of contributions is given, followed by a list of the author’s
publications that contain parts of these contributions. Finally, the
organization of the thesis is explained.
1.1 Motivation
As the feature size gets smaller, some parts of a modern IC must
include a precise, small number of certain atoms (see footnote 1). Having a
few atoms more or less than the planned number will therefore result in a
significant change
in the characteristics of the circuit. The manufacturing Process Variation
(PV) for older technologies that have a relatively large feature size is
negligible. However, for an advanced SoC, new techniques are required to
address the effects of the PV that is no longer negligible. PV includes
variations in the geometry of the chips’ components and variation in the
properties of the chips’ materials. For example, the effective channel
length may vary and result in variation of the threshold voltage and sub-
threshold leakage. These variations will result in differences in several
aspects of the circuit’s performance including its leakage current which is
an important contributor to the overall power consumption. Consequently,
the chips will experience power and temperature variations [Choi07,
Nebel97].
This means that the thermal aspects of hardware testing must be revised to
prevent potential damage. An important thermal issue in testing of
advanced SoCs has been thermal safety. Advanced SoCs suffer from
exceedingly large power densities under test, so much so that the testing
must be slowed down to allow for cooling; otherwise the IC under test will
overheat. In general, a fast testing procedure is desirable to reduce the
testing costs. But in this case, some testing speed is traded off to avoid
overheating. Overheating may result in good dies failing the test, since the
die’s temperature is higher than the intended operational temperatures.
Worse still, dies may be damaged if their temperatures exceed the safe
temperature limit.
The overheating problem can be efficiently addressed by carefully
scheduling the tests. This includes inserting cooling intervals into the
schedule exactly where they are required. This can be achieved
with the help of temperature simulation. An important assumption for
existing simulation-based techniques is that all the dies have similar
thermal behavior. Therefore, the result of temperature simulations and thus
the generated test schedules are valid for all of the dies.
Footnote 1: For example, see the number of dopant atoms:
http://www.itrs.net/itwg/beyond_cmos/2008ERD_December/02_4_Architecture_SuhwanKim.pdf
Process variation renders the above assumption untrue for advanced SoCs.
What happens with one die is different from another die. One die may run
warmer than another and therefore need more cooling; otherwise, it
overheats. On the other hand, a die that runs cooler can be tested faster,
saving valuable testing time and thus reducing costs. This means that
statistical approaches for temperature and PV-aware test scheduling are
required, as introduced in this thesis.
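The effect of process variation on cooling intervals can be illustrated with a minimal sketch. It assumes a single-node lumped RC thermal model and a greedy rule that idles for one step whenever the next test step would exceed the limit. The function name, the parameter values, and the model itself are illustrative assumptions and are not the simulator or the scheduling algorithm used in this thesis.

```python
def schedule_with_cooling(test_steps, power_w, t_limit=100.0, t_amb=25.0,
                          r_th=2.0, c_th=0.5, dt=0.1):
    """Greedy sketch: apply test steps and insert idle (cooling) steps when needed.

    Assumed single-node thermal model (explicit Euler step):
        T_next = T + dt / c_th * (P - (T - t_amb) / r_th)
    Returns (total_steps, cooling_steps, peak_temperature).
    """
    # Guard: if a single test step from ambient already overheats,
    # no amount of cooling makes the schedule feasible.
    if t_amb + dt / c_th * power_w >= t_limit:
        raise ValueError("test power too high for this thermal model")

    temp = t_amb
    peak = temp
    done = cooling = total = 0
    while done < test_steps:
        heated = temp + dt / c_th * (power_w - (temp - t_amb) / r_th)
        if heated <= t_limit:
            temp = heated          # safe: apply one test step
            done += 1
        else:
            # unsafe: idle for one step (zero power) and let the die cool
            temp = temp + dt / c_th * (0.0 - (temp - t_amb) / r_th)
            cooling += 1
        total += 1
        peak = max(peak, temp)
    return total, cooling, peak


if __name__ == "__main__":
    # Three dies running the same test; only the dissipated power differs,
    # standing in for process variation (values are hypothetical).
    for label, power in [("cooler die", 36.0), ("nominal die", 40.0),
                         ("warmer die", 48.0)]:
        total, cooling, peak = schedule_with_cooling(200, power)
        print(f"{label}: {total} steps, {cooling} cooling, peak {peak:.1f} C")
```

In this sketch the cooler die needs no cooling steps at all, whereas the warmer die needs more of them than the nominal die, which is the reason a single schedule derived from one simulation cannot suit every die.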
Temperature also plays an important role in testing. For example, some
defects are activated only at high temperatures. This means that the
device works perfectly at low temperatures, but fails when it is too hot.
High-temperature defects are very common; therefore many existing
techniques stress the die with high temperature while testing. They are
common because resistive opens in metals are common. Some resistive
open defects only manifest themselves at high temperatures since the
temperature coefficient of resistivity of the involved metals is positive. A
large number of interconnects, including the crucial clock network, are
made of metals.
Besides these high-temperature defects, there are other defects that
depend on temperature. For example, the signal delay depends on the
temperature. In an advanced SoC, an extensive clock network runs all over
the IC to ensure the correct timing of the operations. Some areas in the IC
might be hot, while other areas are cold. Exacerbated by negative effects
of process variation or otherwise minor defects, this may result in some
signal paths being much slower than intended. This can result in timing
errors that occur only when certain sites have certain temperatures (usually
very different temperatures). This type of defect can only be detected
when certain temperatures are enforced on certain sites in the IC. These
temperature arrangements can be captured by a temperature map that
shows the temperatures for different sites in the IC. Some defects may need
their corresponding temperature map to be enforced while testing for them.
A temperature map also implies certain temperature gradients, that is,
temperature differences among different sites. Temperature gradients have
an effect on the detection of early-life failures. So far we have focused
on defects that exist immediately after manufacturing. However, there are
defects that, even though they do not exist just after manufacturing, will
occur shortly after the device is put into use. Burn-in techniques already
exist that speed up the device’s early life before testing in order to
detect certain early-life failures. One burn-in technique is to operate the
ICs in a hot environment, usually with increased voltage. This speeds up a
number of aging mechanisms, including electromigration. Recent research has
shown that some early-life failures develop in sites that experience large
temperature gradients [Smorodin08]. The defect-related gradients can be
captured with a temperature map that is enforced on the IC using the
techniques proposed in this thesis.
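As a small illustration of these notions, a temperature map can be represented as a grid of site temperatures, from which the gradients between neighbouring sites follow directly. The sketch below is a hypothetical representation chosen for illustration only; the grid layout, the function names, the tolerance check, and all temperature values are assumptions and do not reproduce the data structures used later in this thesis.

```python
def max_adjacent_gradient(temp_map):
    """Return (largest |T_a - T_b|, site_a, site_b) over 4-neighbour site pairs."""
    best = (0.0, None, None)
    rows, cols = len(temp_map), len(temp_map[0])
    for r in range(rows):
        for c in range(cols):
            # Looking only right and down visits every neighbouring pair once.
            for dr, dc in ((0, 1), (1, 0)):
                r2, c2 = r + dr, c + dc
                if r2 < rows and c2 < cols:
                    diff = abs(temp_map[r][c] - temp_map[r2][c2])
                    if diff > best[0]:
                        best = (diff, (r, c), (r2, c2))
    return best


def map_is_enforced(actual, target, tolerance):
    """True if every site of `actual` is within `tolerance` degrees of `target`."""
    return all(abs(a - t) <= tolerance
               for row_a, row_t in zip(actual, target)
               for a, t in zip(row_a, row_t))


if __name__ == "__main__":
    # Hypothetical target map (degrees Celsius) for a 2 x 3 grid of sites.
    target = [[60.0, 62.0, 90.0],
              [58.0, 61.0, 88.0]]
    measured = [[61.0, 63.5, 88.0],
                [57.0, 60.0, 89.5]]
    print(max_adjacent_gradient(target))
    print(map_is_enforced(measured, target, tolerance=2.0))
```

Here the largest gradient lies between the sites at positions (0, 1) and (0, 2), and the measured map counts as enforcing the target only when a deviation of up to 2 degrees per site is accepted.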
Another phenomenon that is related to early-life failures is temperature
cycling. Exposing the IC to a number of large-scale temperature changes
before testing it makes some early-life failures detectable. A simple
burn-in will not help to detect these early-life defects, and the affected devices
will fail shortly after being employed in the field. The existing temperature-
cycling tests use temperature chambers [Mil04] and, therefore, the
temperature-cycling test is costly. A low-cost temperature-cycling test is
proposed in this thesis that uses high-power tests, among other stimuli, to
enforce the required amount of cycling on the IC.
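The idea of producing temperature cycles from the test stimuli themselves, rather than from a temperature chamber, can be sketched as follows. The sketch assumes a single-node lumped RC thermal model and a simple hysteresis rule that alternates between a high-power test and idling. The function name, thresholds, and power values are hypothetical, and the rule is only an illustration, not the cycling technique proposed in this thesis.

```python
def cycle_with_tests(n_steps, power_w=60.0, t_low=40.0, t_high=90.0,
                     t_amb=25.0, r_th=2.0, c_th=0.5, dt=0.1):
    """Alternate a high-power test and idling to swing the die temperature.

    Assumed thermal model (explicit Euler step):
        T_next = T + dt / c_th * (P - (T - t_amb) / r_th)
    Returns (completed_cycles, temperature_trace).
    """
    temp = t_amb
    heating = True
    cycles = 0
    trace = []
    for _ in range(n_steps):
        power = power_w if heating else 0.0   # test applied vs. idle
        temp += dt / c_th * (power - (temp - t_amb) / r_th)
        trace.append(temp)
        if heating and temp >= t_high:
            heating = False                   # hot enough: start cooling
        elif not heating and temp <= t_low:
            heating = True                    # cool enough: one cycle completed
            cycles += 1
    return cycles, trace


if __name__ == "__main__":
    cycles, trace = cycle_with_tests(500)
    print(f"{cycles} cycles, peak {max(trace):.1f} C")
```

The upper threshold bounds the peak temperature (up to one step of overshoot), so in this sketch the cycling amplitude is controlled without ever approaching an overheating limit placed above it.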
1.2 Contributions
The first contribution of this thesis is the development of stochastic
approaches for thermally-safe and multi-temperature testing under large
process variation. The usual cost function for test scheduling is the
deterministic test application time, which is not appropriate for
situations in which some dies overheat due to the negative
consequences of process variation. A probabilistic cost function is
introduced to include the cost of the overheated ICs. Later on, for
multi-temperature testing, this cost function is extended to take the cost of the
test escapes (due to temperature-dependent defects) into account. Adaptive
approaches, which utilize these cost functions, are proposed to deal with
intra-die variations and temperature fluctuations over time [Aghaee11a,
Aghaee14b]. Test scheduling techniques that take the temperature into
account use a thermal simulator in order to estimate the temperatures
before the actual testing. A fast temperature simulation technique is
introduced to facilitate faster process-variation aware schedule generation
[Aghaee13a].
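To make the notion of a probabilistic cost function concrete, the following sketch combines the test application time with the expected cost of overheating. It assumes that the temperature error is Gaussian with zero mean and that the cost is a weighted sum of the two terms. The function names, the weights, and the candidate schedules are hypothetical; the actual cost function of this thesis is defined in chapter 4 and may differ.

```python
import math


def overheat_probability(simulated_peak, t_limit, error_sigma):
    """P(actual peak > t_limit), assuming actual = simulated + N(0, sigma^2)."""
    z = (t_limit - simulated_peak) / (error_sigma * math.sqrt(2.0))
    return 0.5 * (1.0 - math.erf(z))


def expected_cost(test_time, simulated_peak, t_limit, error_sigma,
                  time_cost_per_s=1.0, overheat_cost=100.0):
    """Test-time cost plus the expected cost of an overheated die."""
    p = overheat_probability(simulated_peak, t_limit, error_sigma)
    return time_cost_per_s * test_time + overheat_cost * p


if __name__ == "__main__":
    # Hypothetical candidate schedules: (test time in s, simulated peak in C).
    # Longer schedules contain more cooling and therefore peak lower.
    candidates = [(10.0, 98.0), (12.0, 92.0), (15.0, 85.0)]
    best = min(candidates,
               key=lambda s: expected_cost(s[0], s[1], t_limit=100.0,
                                           error_sigma=4.0))
    print("schedule with lowest expected cost:", best)
```

With these numbers the fastest schedule is penalized by its high overheating probability and the slowest one by its length, so the intermediate schedule has the lowest expected cost. A purely deterministic cost function would have selected the fastest schedule.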
The second contribution of this thesis is a collection of techniques for
enforcing the given temperature maps on the ICs. Enforcing certain
temperature gradients on an IC for a given time makes the related gradient-
dependent early-life failures detectable by a targeted test performed later
[Aghaee14a]. Enforcing certain temperature maps while testing for
gradient-dependent defects (including some hard-to-detect delay faults)
helps to detect them [Aghaee13b]. Ordering these temperature maps and
consequently their related tests in an effective manner can reduce the test
application time, as proposed in [Aghaee15b].
The third and last contribution of this thesis targets cycling-dependent
early-life failures. The proposed algorithm utilizes the normal tests (tests
not related to cycling) and other stimuli in order to enforce a high level of
temperature-cycling activity. This is performed in a controlled manner, so
that no overheating or excessive cycling threatens the IC or test
performance [Aghaee15a]. The order of the tests affects the dissipated
power in the circuit under test. This fact is utilized by the proposed
algorithm to achieve a short test application time (including the
temperature-cycling time).
1.3 Publications
The contributions of this thesis are reported in the following articles:
N Aghaee, Z He, Z Peng, P Eles. Temperature-aware SoC test scheduling
considering inter-chip process variation. 19th IEEE Asian Test
Symposium (ATS), pp 395–398. Shanghai, China, Dec 2010.
N Aghaee, Z Peng, P Eles. Adaptive temperature-aware SoC test
scheduling considering process variation. 14th Euromicro Conference on
Digital System Design (DSD), pp 197–204. Oulu, Finland, Aug 2011.
N Aghaee, Z Peng, P Eles. Process-variation and temperature aware SoC
test scheduling using particle swarm optimization. 6th IEEE International
Design and Test Workshop (IDT), pp 1–6. Beirut, Lebanon, Dec 2011.
N Aghaee, Z Peng, P Eles. Process-variation and temperature aware SoC
test scheduling technique. Journal of Electronic Testing: Theory and
Applications, vol 29, no 4, pp 499–520. Aug 2013.
N Aghaee, Z Peng, P Eles. Temperature-gradient based test scheduling for
3D stacked ICs. 20th IEEE International Conference on Electronics,
Circuits, and Systems (ICECS), pp 405–408. Abu Dhabi, UAE, Dec 2013.
N Aghaee, Z Peng, P Eles. Process-variation aware multi-temperature test
scheduling. 27th International Conference on VLSI Design (VLSID), pp
32–37. Mumbai, India, Jan 2014.
N Aghaee, Z Peng, P Eles. An efficient temperature-gradient based burn-
in technique for 3D stacked ICs. Design, Automation and Test in Europe
Conference (DATE). Dresden, Germany, Mar 2014.
N Aghaee, Z Peng, P Eles. An integrated temperature-cycling acceleration
and test technique for 3D stacked ICs. 20th Asia and South Pacific Design
Automation Conference (ASP-DAC), pp 526–531. Chiba, Japan, Jan 2015.
N Aghaee, Z Peng, P Eles. Temperature-gradient-based burn-in and test
scheduling for 3-D stacked ICs. IEEE Transactions on Very Large Scale
Integration (VLSI) Systems, Accepted.
N Aghaee, Z Peng, P Eles. Efficient test application for rapid multi-
temperature testing. 25th Great Lakes Symposium on VLSI (GLSVLSI),
pp 3–8. Pittsburgh, PA, USA, May 2015.
N Aghaee, Z Peng, P Eles. A test-ordering based temperature-cycling
acceleration technique for 3D stacked ICs. Journal of Electronic
Testing: Theory and Applications, Accepted.
1.4 Thesis Organization
This thesis is organized in 7 chapters. The current chapter, chapter 1, is the
introduction. The next chapter, chapter 2, explains the preliminaries.
Related work is reviewed in chapter 3. Chapter 4 presents the proposed
process-variation aware SoC test scheduling techniques. Chapter 5 focuses
on temperature-gradient-based burn-in and test scheduling for 3D-stacked-
ICs. Chapter 6 presents our integrated temperature-cycling acceleration
and test techniques. Chapter 7 concludes the thesis and discusses future
work.
Chapter 2 Preliminaries
This chapter introduces preliminaries that are helpful for understanding the
rest of this thesis. Temperature-related defects and the tests to detect them
are discussed in section 2.1. The testing procedure for core-based
system-on-chip designs is explained in section 2.2. Through-silicon vias and the
3D stacked IC technology that is based on them are briefly introduced in
section 2.3. Test scheduling approaches are reviewed in section 2.4. Power
and temperature issues are discussed in section 2.5. A temperature
simulation technique is introduced in section 2.6. A meta-heuristic
approach is introduced in section 2.7.
2.1 Temperature Related Defects
A well-known category of manufacturing defects affects the correct
operation of the IC immediately after manufacturing. Therefore, they can be
tested for immediately after the manufacturing process without any
particular environment- or temperature-related requirement. We refer to this
type of defect as normal defects. Normal defects are relatively easy to
detect since they show up immediately after manufacturing and can be
detected independently of the environmental conditions. An example of such
a defect is a normal stuck-at fault.
2.1.1 Temperature Dependent Defects
Another category of defects is environment-sensitive and shows up only
under certain environmental conditions. An important sub-category of
these defects is temperature-sensitive defects [Needham98]. For example,
some defects show up only when the IC follows a certain temperature
pattern [Hagihara97].
An example of such a temperature-sensitive defect is a resistive open, which
is a major cause of test escapes [Needham98]. It occurs when a connection
between two circuit nodes has a conductance high enough to be considered
connected at normal temperatures, but at high temperatures the
conductance decreases so much that the connection is considered
disconnected. This may occur since most interconnects on the
chip are made from metals, and the conductance of those metals has a
negative temperature coefficient. Therefore, it is expected that a large
number of such defects appear at high temperatures. On the other hand,
other defects manifest themselves differently with respect to
temperature. For example, in [Needham98] a defect (“Dark Via”) is
reported that “had previously passed all production tests, but then failed a
monitor test at cold temperature”. Several other defects are also identified
in [Needham98] that similarly appear only at low temperatures.
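The high-temperature behavior of a metallic resistive open can be illustrated with the usual linear approximation R(T) = R_ref (1 + α (T − T_ref)). In the sketch below, the coefficient of about 0.0039 per degree Celsius is roughly that of copper or aluminium near room temperature, while the defect resistance of 800 Ω and the 1000 Ω threshold above which the connection is treated as open are hypothetical values chosen only for illustration.

```python
def resistance_at(temp_c, r_ref_ohm, alpha_per_c=0.0039, t_ref_c=25.0):
    """Linear model R(T) = R_ref * (1 + alpha * (T - T_ref))."""
    return r_ref_ohm * (1.0 + alpha_per_c * (temp_c - t_ref_c))


def behaves_as_open(temp_c, r_ref_ohm, r_open_ohm=1000.0):
    """True if the connection resistance exceeds the (assumed) open threshold."""
    return resistance_at(temp_c, r_ref_ohm) > r_open_ohm


if __name__ == "__main__":
    # Hypothetical defective connection: 800 ohm at 25 C.
    for temp in (25.0, 125.0):
        print(temp, resistance_at(temp, 800.0), behaves_as_open(temp, 800.0))
```

Under these assumptions the defective connection stays below the threshold at 25 °C but exceeds it at 125 °C, which is why such a defect can escape a test performed only at normal temperature.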
Besides the temperature coefficient for conductivity of the material,
thermal expansion may also contribute to temperature-dependent defects.
The Dark Via defect, which appears at low temperature, could be seen as
voids between interconnect and via [Needham98, Segura04]. This
observation could be explained with thermal expansion in metals that fills
up the voids and increases the conductivity. This effect is illustrated in
Figure 2.1.1, where large voids at low temperature shrink at high
temperature because of thermal expansion. Therefore, the conductance of
the via may increase despite the reduced conductivity of the via's
material.
Other similar defects also exist. For example, some defects for a different
technology (i.e., copper-based interconnects) are studied in [Zschech02]
and interface voids are mentioned along with sidewall voids and bulk voids
(shown also in Figure 2.1.1) as temperature-dependent defects. Moreover,
similar to possible temperature-dependent mechanisms for open defects,
one may think of temperature dependent mechanisms for short or bridging
defects.
Figure 2.1.1 Voids in a via create a resistive open. (a) Large voids at low temperature. (b) At high temperature, materials expand and voids shrink. Void types: (i) bulk void, (ii) sidewall void, (iii) interface void.
Another type of temperature-dependent defect that is hard to detect is
the silicide open [Tseng00]. Silicide is used to make local interconnects. In its
defect-free condition, such a local interconnect has a positive temperature
coefficient of resistance, whereas a defective one has a negative coefficient.
Detecting such defects at normal temperature is difficult since the
difference between the two is not recognizable. Testing at low temperatures is a good
solution since there will be a recognizable difference between the defect-free
and the defective interconnects [Tseng00].
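The effect of opposite temperature coefficients can be illustrated with a small sketch. It assumes a simple linear resistance model and purely hypothetical coefficient values (they are not taken from [Tseng00]); it only shows why the two interconnects are indistinguishable at the reference temperature and separate as the temperature moves away from it.

```python
def resistance(r_ref, tcr, temp_c, ref_temp_c=25.0):
    """Linear temperature model: R(T) = R_ref * (1 + tcr * (T - T_ref))."""
    return r_ref * (1.0 + tcr * (temp_c - ref_temp_c))

# Hypothetical values, chosen only for illustration.
GOOD = {"r_ref": 100.0, "tcr": +0.003}       # positive coefficient (defect-free)
DEFECTIVE = {"r_ref": 100.0, "tcr": -0.003}  # negative coefficient (defective)

def resistance_gap(temp_c):
    """Absolute resistance difference between the two interconnects."""
    good = resistance(temp_c=temp_c, **GOOD)
    bad = resistance(temp_c=temp_c, **DEFECTIVE)
    return abs(good - bad)

if __name__ == "__main__":
    for temp_c in (25.0, 0.0, -40.0):
        print(f"{temp_c:6.1f} C  gap = {resistance_gap(temp_c):.1f} ohm")
```

In this toy model the gap is zero at 25 °C and grows as the device is cooled, which matches the qualitative argument for low-temperature testing.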
Resistive-open and stuck-open defects are experimentally studied in
[Li01]. The resistive-opens occur more frequently (39 samples) compared
to stuck-open defects (11 samples) [Li01]. By knowing the location of the
resistive defects and the materials involved in those defects, the proper test
temperatures can be found and the appropriate tests can be developed
[Li01].
Interconnect malfunctions (e.g., opens and shorts) are not the only sources
of temperature-dependent defects; transistor malfunctions are also a source
of concern. This issue is studied in [Long04] and the impact of temperature
is demonstrated. The thermal behavior of a transistor depends on its
quiescent point and therefore higher or lower temperatures, per se, do not
imply better or worse results. Usually, in order to minimize the effect of the
temperature, transistors are biased at the Zero-Temperature-Coefficient
(ZTC) point. ZTC is a point where the temperature will not affect the
transistor behavior. The problem is that there will be variations in the actual
quiescent points of the manufactured transistors and therefore temperature
will affect them. This will lead to defects that are hard to detect. Multi-
temperature testing can help to detect such defects [Long04].
2.1.2 Early Life Failures
Another category of defects consists of early-life failures. These can be
seen as manufacturing imperfections that do not manifest as defects
right after manufacturing and therefore cannot be detected by the
manufacturing test that is performed immediately after fabrication. A
burn-in process is usually used to push the IC through its early-life in an
accelerated manner. The existing techniques operate the device under high
temperature and perhaps with increased voltage and/or frequency. These
techniques handle the normal early-life failures that can be efficiently
accelerated this way. Two subcategories of early-life failures that are
different from the usual ones are explained below.
There are early-life failures that show up at certain sites in the IC where
large temperature gradients are in place for relatively long periods of time
[Smorodin08]. In order to efficiently detect these defects, corresponding
temperature gradients must be enforced for a certain duration of time
before testing. The second type of defects are those that are made
detectable by temperature cycling. This means that the device goes through
an aggressive temperature cycling before being tested for the related
defects [Mil04]. This way some other imperfections that are not detectable
immediately after the manufacturing can be detected.
2.1.3 Delay Faults
Another category of defects, which shares some features with the
temperature-related defects mentioned above, consists of delay-related
faults. These happen when a signal propagates slower (or, in relative terms,
faster in some cases) than expected (e.g., a clock signal affected by skew). This
may happen due to temperature gradients and usually results in wrong data
being latched in memory elements. This can be due to data and clock
timings not being correct with respect to each other (e.g., due to different
temperatures at different sites). It can also be that the IC under test cannot
work at the intended frequency but works correctly at a slower
clock. At-speed and delay tests are usually used to detect these defects
[Ahmed05, Higami13, Ko08].
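A minimal sketch of the timing check behind such delay faults is given below. It uses the textbook setup-slack relation for a launch/capture flip-flop pair; the linear delay-versus-temperature model and all numbers are assumptions made for illustration, not values from the cited works.

```python
def setup_slack(clock_period, clk_to_q, path_delay, setup_time, skew):
    """Setup slack for a launch/capture flip-flop pair (all times in ns).

    skew = capture clock arrival - launch clock arrival. A negative slack
    means the data arrives too late and wrong data may be latched.
    """
    data_arrival = clk_to_q + path_delay
    required = clock_period + skew - setup_time
    return required - data_arrival

def path_delay_at(nominal_delay, temp_c, ref_temp_c=25.0, delay_tc=0.002):
    """Hypothetical linear delay-vs-temperature model (an assumption)."""
    return nominal_delay * (1.0 + delay_tc * (temp_c - ref_temp_c))

if __name__ == "__main__":
    hot_delay = path_delay_at(1.7, 105.0)
    print(setup_slack(2.0, 0.1, path_delay_at(1.7, 25.0), 0.1, 0.0))  # passes
    print(setup_slack(2.0, 0.1, hot_delay, 0.1, 0.0))                 # fails when hot
    print(setup_slack(2.5, 0.1, hot_delay, 0.1, 0.0))                 # passes at slower clock
```

The third call mirrors the case described above: the same hot path fails at the intended clock period but passes at a slower clock.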
2.2 Core-Based SoC Testing
A simple explanation for testing is that certain stimuli are applied to the
site of the targeted defect to activate it and then the circuit outputs are
compared against the correct outputs to detect the defect. In order to
generate such a test, the circuit model and the possible defect models must
be analyzed. This is a tedious task best done with the help of a computer
algorithm. Therefore, an Automated Test Pattern Generation (ATPG) tool
is used to generate the tests that cover a large number of defects while the
tests are kept acceptably short [Abramovici94].
The decision about which defects to target and which tests to include in the
test procedure of a certain product has a number of aspects. Incorporating
tests for all of the defects in a modern system-on-chip will make the test
application time very long. Testing costs are considerable, especially if
costly test equipment is involved. But shipping defective devices is also
costly, since they are usually covered by the manufacturer’s guarantee. The
failures that show up after the device has left the fabrication and test facility
cost much more than the defective device’s own cost [Davis94]. The
testing process is therefore designed to minimize the overall cost. The other
aspect to be considered is reliability for safety-critical applications. The
devices manufactured for safety-critical applications usually go through
much more elaborate tests to comply with the high reliability requirements.
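The cost trade-off described above can be sketched as a simple expected-cost comparison. The cost model, the candidate test plans, and every number below are hypothetical and serve only to show how a higher field-failure cost (as in safety-critical applications) shifts the choice towards longer tests.

```python
def expected_cost_per_device(test_time_s, tester_cost_per_s, defect_rate,
                             coverage, field_failure_cost):
    """Test cost plus the expected cost of defective devices that escape."""
    test_cost = test_time_s * tester_cost_per_s
    escape_probability = defect_rate * (1.0 - coverage)
    return test_cost + escape_probability * field_failure_cost

# Hypothetical candidate test plans: (name, test time in s, defect coverage).
PLANS = [("short", 1.0, 0.90), ("medium", 3.0, 0.98), ("long", 10.0, 0.999)]

def cheapest_plan(plans, tester_cost_per_s=0.05, defect_rate=0.02,
                  field_failure_cost=100.0):
    """Return the plan with the lowest expected cost per device."""
    return min(plans, key=lambda p: expected_cost_per_device(
        p[1], tester_cost_per_s, defect_rate, p[2], field_failure_cost))

if __name__ == "__main__":
    print(cheapest_plan(PLANS)[0])
    print(cheapest_plan(PLANS, field_failure_cost=10000.0)[0])
```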
A modern system-on-chip includes a large number of memory elements
(e.g., flip-flops and registers) and therefore the number of states that such
digital designs include is huge. Moreover, taking the circuit from one state
to another state that is needed for some other tests can be very time
consuming. This is one of the motivations for Design for Testability (DfT)
techniques that include a Test Access Mechanism (TAM) on core-based
systems-on-chip.
A test access mechanism is used to provide test access to all the cores.
There might be some other testable modules in a system-on-chip that are
not conventional cores. These modules are also accessible using the TAM.
There is always a trade-off between the test acceleration gained by
inclusion of a TAM and the cost of the TAM itself that includes its area on
the die, the delays that it adds to the signal paths, and its static power
consumption. The TAM design is usually kept small to avoid these
overheads. Therefore, it is extremely unlikely that the TAM can provide
simultaneous access to all modules. Consequently, during the test some of
the modules must wait while other modules are being tested.
The tests are usually performed using Automated Test Equipment (ATE),
which puts the device in the test mode, feeds it with stimuli, and checks the
outputs of the circuit under test for defects.
2.3 3D Stacked IC Testing
Existing systems-on-chip like Apple A8X and Xbox One have 3 and 5
billion (i.e., $10^9$) transistors, respectively. Larger numbers of transistors
have already been integrated. For example, Intel Xeon E5-2600 v3 has 5.6
billion transistors [Intel13], Nvidia Kepler GK110 has 7.1 billion
transistors [Nvidia12], and Xilinx Virtex UltraScale XCVU440 has 20
billion [Santarini14]. These figures indicate the extremely large number of
transistors that will be integrated into advanced systems-on-chip in order
to provide a wider range of functionalities as well as higher computational
power.
More functions as well as higher computational power are traditionally
achieved by shrinking the feature size as well as some other minor
improvements so that a large number of possibly faster transistors fit on a
single die. For more than that, a number of dies must be connected. These
inter-die interconnects are usually long and thick. Moreover, a relatively
small number of interconnects can be made per die area (i.e., low
interconnect density). These lead to high power consumption as well as
low data transmission rate.
A promising technology for efficiently connecting different dies is based
on Through Silicon Vias (TSV). A through silicon via is a via that runs
throughout the bulk silicon and allows the dies to be stacked on top of each
other while making electrical connections. The ICs fabricated this way are
called 3D Stacked ICs (3D-SIC). This technology supports high density
signal connections with a short wire length that translates into high
bandwidth communication (both number of lines and the frequency that
they support) with a small power consumption.
TSVs are manufactured in the individual dies. They are initially contained
within the die, since their length is smaller than the die’s thickness.
Therefore, a thinning step follows in order to carefully remove extra
thickness of the die. After the thinning process, the TSVs reach the surface
of the die.
So-called micro-bumps are placed on the surface of the die. The
micro-bumps are the points where electrical connections are made, for example
by soldering. The dies must be carefully aligned before bonding can
take place.
The steps in the manufacturing process may involve multiple bonding
stages. A testing procedure at each of these stages may help to reduce the
overall costs. These tests are referred to as pre-bond, mid-bond, and post-
bond test stages. The pre-bond test is performed before bonding when the
die is separate. If a defect goes undetected to the next steps, some other
potentially perfect dies as well as the bonding efforts are wasted because
of the defective die. Similarly, a mid-bond test may be helpful especially
if an expensive die is going to be bonded to a low-cost partial stack. In this
case, it might be a good idea to test the partial stack before bonding. At the
end of the bonding process, a post-bond test can be performed.
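The benefit of pre-bond testing can be sketched with a compound-yield calculation. The sketch assumes independent die defects, an ideal pre-bond test that screens out every defective die, and hypothetical yield values; none of these come from the text above.

```python
from math import prod

def stack_yield_without_prebond(die_yields, bond_yield=1.0):
    """Probability that a stack built from untested dies is defect-free."""
    return prod(die_yields) * bond_yield ** (len(die_yields) - 1)

def stack_yield_with_prebond(die_yields, bond_yield=1.0):
    """With an ideal pre-bond test only good dies are stacked, so only
    the bonding steps can still introduce defects."""
    return bond_yield ** (len(die_yields) - 1)

if __name__ == "__main__":
    dies = [0.9, 0.9, 0.9]  # hypothetical per-die yields
    print(stack_yield_without_prebond(dies, bond_yield=0.98))
    print(stack_yield_with_prebond(dies, bond_yield=0.98))
```

Without pre-bond testing, one bad die wastes the other dies in the stack together with the bonding effort, which is why the first number is so much lower.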
The bonding can also be done with wafers instead of individual dies.
In this case, the wafers are aligned and bonded and then diced. Since the
dies are not yet diced during the bonding, it is not possible to choose the
non-defective dies to be bonded together. In such a scenario, the wafers
can be matched, positioned, and aligned so that the low defect-rate areas
of the two wafers meet each other. The probability of ending up with
defective stacks is reduced this way, although it is not possible to fully
prevent good dies from being wasted.
So far we have explained die-to-die bonding and wafer-to-wafer bonding.
Another alternative is die-to-wafer bonding. In this case, a
particular layer in the 3D-SIC structure is diced into dies while the other
wafer is not diced. This way bonding known bad dies can be avoided.
The TSV manufacturing process and bonding process are new sources of
defects that do not exist for normal 2D ICs. Therefore, a more elaborate
testing process may be required, especially for defects that are related to
the TSV fabrication or the bonding process.
For 3D stacked IC testing, the TAM is designed so that the test access is
possible at different test stages [Ieee14a]. 3D-SICs experience more
thermal issues than the conventional 2D ICs. These include the issues that
affect the conventional 2D ICs as well as thermo-mechanical issues related
to TSV technology. Moreover, the dies cannot cool as efficiently as 2D ICs
that usually have many low-resistance thermal paths for cooling. The
situation is particularly difficult for dies located in the middle of the stack.
2.4 Test Scheduling
As mentioned before, the test access mechanism, in either 2D or 3D SoCs,
is a resource bottleneck for testing. Therefore, tests must be scheduled in
order to minimize the test application time. A test schedule determines at
each time-point which modules must run their tests. Moreover, it
determines which test must be performed for the module.
Test scheduling can be done with or without partitioning and interleaving.
Schedules without partitioning [Chou97, Zorian93] are simpler but in
general result in large test application times. In this case, when a module
starts a certain test it runs to the test’s completion and the schedule cannot
make changes when a test is being applied. Nowadays, partitioning and
interleaving of tests is common [Marinissen00]. In this case, a test can be
halted for a while and other modules may use the released TAM resources.
This thesis uses test partitioning and interleaving for all the proposed
scheduling approaches.
The authors in [Iyengar02] have formulated the test scheduling problem as
a rectangle packing problem. The problem is proven to be NP-complete
and is solved using a Mixed-Integer Linear Programming (MILP) approach
in [Chakrabarty00]. The test scheduling problem becomes even more
complicated when, for instance, the thermal issues must be taken into
account.
Here we briefly explain the main ideas related to test scheduling using an
example. Assume that the SoC under test consists of three modules $m_0$,
$m_1$, and $m_2$ as shown in Figure 2.4.1a. Assume that the test access
mechanism can accommodate only two of these modules at a time (i.e., the
TAM width is two).
There are two Built-In Self-Test (BIST) modules $b_0$ and $b_1$ as shown in
Figure 2.4.1a. Each of them performs only a part of the tests for the
corresponding module. $b_0$ uses the TAM to test $m_0$ but $b_1$ is directly
connected to $m_1$ and can test it without occupying the TAM. Assume that
each module has four tests and that each one of them is a node in a directed
path-graph (i.e., there is only one path in the test graph). The $k$-th test for
module $m_j$ is denoted by $n_{j,k}$ (with $k$ counted from zero) as shown in
Figure 2.4.1b. The fourth test for module $m_0$ (i.e., $n_{0,3}$) is performed by
the BIST $b_0$ while $n_{1,3}$ is performed by $b_1$. The rest of the tests (marked
as normal in Figure 2.4.1b) are performed using an ATE through the TAM.
Since the TAM cannot support simultaneous testing of all modules, the
tests must be scheduled. A shorter test application time is desirable and
therefore the test schedule must be optimized for a minimal test
application time. In general, there could be other constraints, in addition
to TAM, including power, temperature, and tester memory constraints. The
scheduling objective may include other factors, in addition to test
application time, including test throughput and perhaps test coverage
considering defect probabilities.
Figure 2.4.1 Examples for (a) a SoC, and (b) tests. (a) SoC with modules $m_0$, $m_1$, $m_2$, BIST modules $b_0$, $b_1$, and the TAM. (b) Test nodes $n_{0,0}$ to $n_{2,3}$, each marked as BIST or normal.
Let us focus only on test application time reduction under TAM limitation.
A module can be only in one of these two states: active (i.e., testing) or
inactive. A schedule indicates the test cycles (time) at which a change in one or
more of the modules’ states must happen and what that change is. The
schedule indicates that at cycle $i_0$ modules $m_0$ and $m_1$ start testing, as
indicated in Figure 2.4.2a. Since the tests’ path-graphs are given, there
is no need to include the test nodes in the schedule; however, the tests being
applied are shown in Figure 2.4.2d. The active modules go through their
1st, 2nd, and 3rd test without any new entry in the schedule.
At test cycle $i_1$, the BIST tests ($n_{0,3}$ and $n_{1,3}$) start as indicated in Figure
2.4.2c. Since $b_1$ has dedicated access to module $m_1$, it does not occupy the
TAM and, therefore, module $m_2$ can gain access to the TAM, as shown in
Figure 2.4.2b. Consequently, all three modules are active simultaneously.
At test cycle $i_2$, testing of $m_0$ and $m_1$ is complete. Testing of $m_2$ continues
to completion at cycle $i_3$.
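The walk-through above can be reproduced with a short greedy scheduling sketch. It assumes that all tests have the same unit length, that modules are served in a fixed priority order, and that only the last test of the second module bypasses the TAM through its dedicated BIST connection; the names and the priority rule are illustrative assumptions, not the scheduling method of this thesis.

```python
TAM_WIDTH = 2
TESTS_PER_MODULE = 4
# Tests that run over a dedicated (non-TAM) BIST connection.
# Assumption from the example: only b1 -> m1 bypasses the TAM, for the last test.
OFF_TAM = {("m1", 3)}
PRIORITY = ["m0", "m1", "m2"]  # hypothetical fixed priority order

def build_schedule(test_len=1):
    """Greedy scheduling; returns {module: [(start, end, test_index), ...]}."""
    done = {m: 0 for m in PRIORITY}      # number of finished tests per module
    timeline = {m: [] for m in PRIORITY}
    t = 0
    while any(n < TESTS_PER_MODULE for n in done.values()):
        tam_used = 0
        for m in PRIORITY:
            k = done[m]
            if k >= TESTS_PER_MODULE:
                continue
            needs_tam = (m, k) not in OFF_TAM
            if needs_tam and tam_used >= TAM_WIDTH:
                continue                  # module waits for TAM access
            tam_used += 1 if needs_tam else 0
            timeline[m].append((t, t + test_len, k))
            done[m] += 1
        t += test_len
    return timeline

if __name__ == "__main__":
    for module, entries in build_schedule().items():
        print(module, entries)
```

With unit-length tests, the third module starts when the BIST tests begin and finishes three test lengths after the other two modules, matching the four schedule cycles of the example.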
In the above example, we assumed that the order of the tests is fixed, but
in reality it might be possible to reorder tests to achieve better results. In
that case, the nodes (e.g., Figure 2.4.2d) must be included in the schedule.
This means that at least two additional entries in the schedule table (Figure
2.4.2a) between cycles $i_0$ and $i_1$ as well as two more additional entries
between cycles $i_2$ and $i_3$ must be added to indicate transition to new test
nodes. (In fact one entry is sufficient since the last node is trivial.)
Figure 2.4.2 Example for a test schedule: (a) the test schedule; (b) TAM occupation; (c) BIST activity; (d) test nodes.
Moreover, we assumed that testing is always done for all of the specified
tests, but in reality testing may be terminated as soon as a defect is found.
In this case the optimization objective (e.g., test application time) is a
stochastic quantity (e.g., expected test application time) that is evaluated
based on the defect probabilities (or statistics).
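A minimal sketch of such an expected test application time is shown below. It assumes that tests are applied in a fixed order, that testing can only be aborted at test boundaries, and that the failure probabilities are independent; the probabilities themselves are hypothetical.

```python
def expected_test_time(tests):
    """tests: ordered list of (duration, fail_probability).

    Testing stops right after the first failing test, so a test only
    contributes its duration if all earlier tests passed.
    """
    expected, p_reach = 0.0, 1.0
    for duration, p_fail in tests:
        expected += p_reach * duration
        p_reach *= 1.0 - p_fail
    return expected

if __name__ == "__main__":
    print(expected_test_time([(10.0, 0.0), (10.0, 0.5)]))
    print(expected_test_time([(10.0, 0.5), (10.0, 0.0)]))
```

Placing the test that is more likely to fail first lowers the expected time, which is why defect probabilities matter for the objective.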
A test schedule can be adaptive, depending on certain run-time parameters.
An adaptive schedule acts based on the actual value of an otherwise
stochastic quantity during the test. An example is sensing the actual
temperature and changing the schedule accordingly. In this case, a number
of schedule pieces are generated and during the test, the temperature is
sensed when required and the schedule-piece that fits the situation is
selected.
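The selection step of an adaptive schedule can be sketched as a lookup over pre-generated schedule pieces. The temperature boundaries and piece names are hypothetical placeholders; how the pieces are actually generated is not shown here.

```python
from bisect import bisect_right

# Hypothetical schedule pieces, each prepared offline for a temperature range.
# Boundaries are in degrees Celsius: below 60, from 60 up to 80, 80 and above.
BOUNDARIES = [60.0, 80.0]
PIECES = ["aggressive", "moderate", "cool-down first"]

def select_piece(sensed_temp_c):
    """Pick the pre-generated schedule piece whose range contains the reading."""
    return PIECES[bisect_right(BOUNDARIES, sensed_temp_c)]

if __name__ == "__main__":
    for reading in (45.0, 72.5, 95.0):
        print(reading, "->", select_piece(reading))
```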
2.5 Test Power and Temperature
The circuit under test consumes power as a result of switching activity
during the test process, similar to when the circuit is in operation. In
general, power density for digital circuits is increasing with the advancement
of technology and increased integration. One of the problems is that this
dense power dissipation leads to very high temperatures and can affect the
correct system behavior. The situation is worse during testing. In
particular, scan-chain-based DfT features result in even higher power
densities. It is reported that the test power dissipation can be as large as
twice the normal power [Bonhomme02, Zorian93].
Overheating, that is, incorrect device behavior or damage to the device
because of high temperature, must be prevented. A
category of efficient approaches that do not make the testing unnecessarily
long is based on changing the test schedules [Rosinger06]. In order to
prevent overheating during the test, temperature simulations are performed
before the actual test during the scheduling process. The simulated
temperature shows the time intervals in the schedule where overheating
may occur. One of the options is to halt the test to allow for cooling at such
time intervals. This way, cooling, which slows down the testing process, is
added to the schedule only when it is needed.
Process variation results in large variations in the dissipated power in
advanced SoC designs [Cheng00]. This results in considerable variations
in the temperature of the device and poses difficulties for the offline
temperature-aware test scheduling techniques that are deterministic (e.g.,
[Rosinger06]). To handle this situation, stochastic approaches are proposed
in this thesis in chapter 4.
The dissipated power in a circuit depends on the current input values and
the circuit’s state. The state depends on the previous inputs. Therefore, the
dissipated power during the test depends on the order of the tests [Girard97]. This
phenomenon is used in chapter 6 to harvest different power values from
the same set of tests.
The power dissipations are calculated based on the given switching
activities and the IC power-related characteristics. The actual dissipated
power also depends on the leakage current (i.e., static power). The leakage
current, itself, depends on the temperature. As mentioned before, a
temperature simulation is always performed in this thesis. The simulated
temperatures are used to guide the schedule generation. They are also used
to approximate the static power, which is the component that depends on the
temperature.
Leakage current plays an essential role in thermal runaway. Thermal
runaway is a situation in which the static power, per se, can keep increasing
the temperature, even beyond the safe limit. This means that introducing a
halt that takes away the dynamic power will not stop the temperature from
increasing. Consequently, the temperature increases further, which increases
the static power, and the increased static power in turn increases the
temperature [Vassighi06].
This positive feedback loop continues until the circuit is disconnected
from the power source or until the circuit is damaged. Once started,
thermal runaway usually progresses quickly. However, it only starts at high
temperatures. In the usual
DfT architectures only the dynamic power can be controlled by the
schedule. Therefore, in schedule-based solutions, such high temperatures
must be avoided.
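The positive feedback loop can be illustrated with a toy simulation of a single thermal node whose dynamic power has been switched off. The exponential leakage model, the forward-Euler integration, and all parameter values are assumptions chosen for illustration; they are not taken from [Vassighi06] or from the models used later in this thesis.

```python
from math import exp

def simulate_idle(temp_start, t_amb=25.0, r_th=2.0, c_th=1.0,
                  p_leak_amb=2.0, k_leak=0.03, dt=0.01, steps=20000,
                  t_damage=400.0):
    """Forward-Euler simulation of one thermal node with dynamic power off.

    Returns (final_temperature, ran_away). All parameters are hypothetical.
    """
    temp = temp_start
    for _ in range(steps):
        # Static (leakage) power grows exponentially with temperature.
        p_static = p_leak_amb * exp(k_leak * (temp - t_amb))
        # Heat balance: leakage heats the node, the ambient path cools it.
        d_temp = (p_static - (temp - t_amb) / r_th) / c_th
        temp += d_temp * dt
        if temp >= t_damage:
            return temp, True
    return temp, False

if __name__ == "__main__":
    print(simulate_idle(120.0))  # halt happens early enough: the node cools down
    print(simulate_idle(145.0))  # halt happens too late: leakage alone keeps heating
```

With these numbers, a halt at 120 °C lets the node settle near its rest temperature, whereas a halt at 145 °C comes too late and the temperature keeps rising even without dynamic power.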
2.6 Temperature Simulation
As mentioned above, in order to estimate the actual temperatures during
the test, temperature simulations are performed during the scheduling
process. This paradigm has been used in all chapters of this thesis. A
temperature simulator consists of a thermal model and an algorithm to
analyze it. The thermal model describes the mathematical relation between
the IC characteristics, the dissipated power, and the temperatures.
There exists a range of thermal models. Some of them focus only on the
steady-state temperatures, which means that the dynamic response cannot
be obtained. Other thermal models focus only on each individual
module and ignore the heat transfer among modules. In this thesis we use
a thermal model that supports dynamic response analysis and takes the heat
transfer among modules into account, similar to the widely used thermal
simulator, HotSpot [Huang07, Huang06, Stan03].
This model is a lumped element model meaning that the chip is modeled
as a combination of thermal resistances and thermal capacitances. An
example for such a thermal model is given in Figure 2.6.1. A typical
thermal model consists of a number of lumped elements connected to each
other. A connection point of thermal elements is called a node.
An equivalent view is that an IC is divided into small elements each of
which is characterized by a single temperature. Each of these small
elements is represented as an individual node in the model. In Figure 2.6.1,
two cores are modeled as two nodes (i.e., elements) which are connected
to two exclusive power sources. Power sources represent the power
dissipated by the cores.
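Assume that the thermal model consists of $N$ nodes and $M$ is the number
of cores. In a high quality thermal model, usually the number of nodes is
larger than the number of cores, $N > M$, (e.g., six thermal nodes for two
cores) as shown in Figure 2.6.1. Assume that $\mathbf{P}$ is the power vector and $\mathbf{T}$
is the temperature vector. The mathematical representation of the thermal
model is a system of ordinary differential equations:
$$\mathbf{A}\,\frac{d\mathbf{T}}{dt} + \mathbf{B}\,\mathbf{T} = \mathbf{P} \qquad (2.6.1)$$
Figure 2.6.1 An example of a lumped element thermal model (elements shown: Core 1, Core 2, resistance, capacitance, ambient, power source).
The properties of the thermal model are encapsulated into two $N \times N$
matrices $\mathbf{A}$ and $\mathbf{B}$. $\mathbf{T}$ and $\mathbf{P}$ are temperature and power vectors. The
mathematical representation of this commonly used model (equation 2.6.1)
is a system of linear constant-coefficient differential equations. As an
example, assume that a SoC has two cores ($M = 2$) and assume that the
model has four nodes ($N = 4$). The expanded characteristic equation of
the model is
$$\begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \\ a_{41} & a_{42} & a_{43} & a_{44} \end{bmatrix} \frac{d}{dt} \begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{bmatrix} + \begin{bmatrix} b_{11} & b_{12} & b_{13} & b_{14} \\ b_{21} & b_{22} & b_{23} & b_{24} \\ b_{31} & b_{32} & b_{33} & b_{34} \\ b_{41} & b_{42} & b_{43} & b_{44} \end{bmatrix} \begin{bmatrix} T_1 \\ T_2 \\ T_3 \\ T_4 \end{bmatrix} = \begin{bmatrix} P_1 \\ P_2 \\ 0 \\ 0 \end{bmatrix}$$
$T_1$ and $T_2$ are core temperatures which should be taken care of. $P_1$ and
$P_2$ are the power values applied to the cores.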
For architectural design purposes, usually the dissipated power is assumed
to correspond to a fixed scenario. The inputs are the IC characteristics that
are varied to find a good design. The outputs are the temperatures that
somehow affect the cost function for the architectural design. This
viewpoint is useful, for example, for designing the TAM¹. For this
viewpoint, numerical approximation is a good choice to solve equation 2.6.1.
In order to numerically analyze and solve the combination of the thermal
model and the dissipated power values, a time interval which is called a
simulation cycle is defined. The length of the simulation cycle is determined
based on a number of factors including the required accuracy. The
computed temperatures are recorded and reported for each simulation
cycle. It is common to assume that the power ($\mathbf{P}$ in equation 2.6.1) is
constant during a simulation cycle.
The numerical approximations are usually done with very small
intermediate steps, and as a result, the complete temperature curve for the
interval is meticulously constructed. HotSpot uses the Runge-Kutta
method for the numerical approximation [Huang06]. Though only the
temperature at the end of the simulation cycle is registered, many points of
the temperature curve are calculated.
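The numerical procedure described above can be sketched with a fixed-step, classical fourth-order Runge-Kutta integrator for a small lumped model. This is only an illustration of the idea, not HotSpot's actual solver; the two-node model, the diagonal capacitance vector, the conductance matrix, and the choice of measuring temperatures relative to ambient are all assumptions.

```python
def derivative(temps, power, cap, cond):
    """dT/dt for cap[i] * dT_i/dt + sum_j cond[i][j] * T_j = power[i]."""
    n = len(temps)
    return [(power[i] - sum(cond[i][j] * temps[j] for j in range(n))) / cap[i]
            for i in range(n)]

def rk4_step(temps, power, cap, cond, h):
    """One classical fourth-order Runge-Kutta step of size h."""
    def shifted(k, scale):
        return [t + scale * ki for t, ki in zip(temps, k)]
    k1 = derivative(temps, power, cap, cond)
    k2 = derivative(shifted(k1, h / 2), power, cap, cond)
    k3 = derivative(shifted(k2, h / 2), power, cap, cond)
    k4 = derivative(shifted(k3, h), power, cap, cond)
    return [t + h / 6 * (a + 2 * b + 2 * c + d)
            for t, a, b, c, d in zip(temps, k1, k2, k3, k4)]

def simulate_cycle(temps, power, cap, cond, cycle_len, substeps=100):
    """Advance one simulation cycle with constant power.

    Many intermediate points are computed, but only the end value is returned.
    """
    h = cycle_len / substeps
    for _ in range(substeps):
        temps = rk4_step(temps, power, cap, cond, h)
    return temps

# Hypothetical two-node model: node 0 is a core, node 1 is a heat spreader
# that is connected to the ambient. Temperatures are relative to ambient.
CAP = [0.5, 5.0]                     # thermal capacitances (J/K)
COND = [[2.0, -2.0], [-2.0, 2.5]]    # thermal conductance matrix (W/K)

if __name__ == "__main__":
    temps = [0.0, 0.0]
    for cycle in range(5):
        temps = simulate_cycle(temps, [10.0, 0.0], CAP, COND, cycle_len=1.0)
        print(cycle, [round(t, 3) for t in temps])
```

Keeping the power constant within `simulate_cycle` corresponds to the common assumption that the power does not change during a simulation cycle.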
1 Not the viewpoint of this thesis. In this thesis we assume that the TAM is
already designed and given along with the other IC specifications.
This thesis’ viewpoint is that the IC characteristics are fixed. The inputs
that are varied are the power values. They vary because they depend on the
tests and the schedules. A range of different schedules are explored to find
a near optimal schedule. The outputs are temperatures. The thermal models
work equally well for both of the above viewpoints, whether the IC
characteristics are fixed or not. However, the difference in these
viewpoints means that different approaches may be appropriate for solving
the thermal model.
Since the physical design of the devices is assumed to be fixed, a
superposition-based approach such as the one suggested in [Yao09] can be used.
This superposition-based approach is particularly helpful if the tests are
partitioned in advance (before the scheduling process) and if large errors
in static power (due to temperature-dependent leakage) are acceptable. In
this thesis, a third approach, different from both the Runge-Kutta and the
superposition-based approaches, is used. A fast temperature simulation
scheme is proposed in section 4.6.
2.7 Meta-Heuristic
The test scheduling process is usually based on a number of decision
variables. These decision variables go through an optimization process in
order to generate a near optimal test schedule. A cost function is defined to
evaluate the quality of alternative schedules which are themselves based
on the combinations of the decision variable values. A motivational
example explains these concepts. Then, particle swarm optimization,
which is a meta-heuristic frequently used in this thesis, is introduced.
2.7.1 Motivational Example
A thermal-safe scheduling paradigm is discussed here to explain basic
ideas of thermal-aware test scheduling and optimization. The objective is
to generate a test schedule with the minimal Test Application Time (TAT).
The constraint is that the temperature must not exceed the overheating level
denoted by $T_H$ (this includes a safety margin).
We consider an IC made of only one module. Therefore, there are no
constraints for access to modules using the test access mechanism. Assume
that the tests dissipate a constant power (including both dynamic and static
power) denoted by $P_T$. It is assumed that $P_T$ is so large that it results in
overheating. Usually leakage and clock network power result in a non-zero
power dissipation during cooling. This cooling power, which is
denoted by $P_C$ ($P_C < P_T$), results in a rest temperature (denoted by $T_R$)
that is higher than ambient ($T_R > T_A$).
The module temperature is initially equal to the ambient temperature
denoted by $T_A$. As discussed above, the test is paused as soon as the
temperature reaches $T_H$. Testing is resumed after sufficient
cooling. The question is how much cooling is sufficient. A certain
temperature level can be considered as sufficient. Let us denote this
sufficient temperature level by $T_{SC}$ ($T_R < T_{SC} < T_H$). Thus, the
sufficient-cooling temperature, $T_{SC}$, is the decision variable in this problem
formulation. The temperature curve is plotted in Figure 2.7.1.
Since the power values (i.e., $P_T$ and $P_C$) are constants, the testing and
cooling patterns are periodic, as can be seen in Figure 2.7.1. In each of
these periods, the testing time is denoted by $t_T$ and the cooling time
by $t_C$. There is also a delay associated with starting or resuming
the testing process, denoted by $d$. This delay is associated with the testing
equipment and architecture and cannot be changed. A part of this delay,
denoted by $d_C$, causes the temperature to drop further to a low
temperature level, denoted by $T_L$.
The other part of the switching delay, denoted by $d_T$, results in an
effective test time that is shorter than the testing time, $t_T$. Therefore, the
actual time during which testing takes place is equal to $t_T - d_T$. Assuming that one test unit
(e.g., a thousand test bits) is applied per second, and assuming that the test
length is $L$ test units, the total number of testing/cooling periods is
approximately
$$\frac{L}{t_T - d_T}.$$
Therefore,
$$TAT \approx \frac{L}{t_T - d_T}\,(t_T + t_C) \qquad (2.7.1)$$
Figure 2.7.1 Temperature curve for a simple thermal-aware testing scenario (temperature versus time).
Assume that the module under test is thermally modeled by a single
thermal element using equation 2.6.1. The module’s heat capacitance is
denoted by $C$ (analogous to $\mathbf{A}$). The heat resistance between the module
and the ambient is equal to $R$ (analogous to $\mathbf{B}^{-1}$). In this case, equation
2.6.1 can be described for the testing part of the period as:
$$C\,\frac{dT}{dt} + \frac{T - T_A}{R} = P_T$$
For the cooling part of the period, the thermal equation can be written as:
$$C\,\frac{dT}{dt} + \frac{T - T_A}{R} = P_C$$
These equations can be used to compute the values of $t_T$ and $t_C$ as:
$$t_T = RC \ln\frac{T_A + R P_T - T_L}{T_A + R P_T - T_H} \qquad (2.7.2a)$$
and
$$t_C = RC \ln\frac{T_H - T_R}{T_L - T_R}, \quad T_R = T_A + R P_C \qquad (2.7.2b)$$
Using equations 2.7.1–2, TAT values are plotted for a range of sufficient-cooling temperature values
in Figure 2.7.2. It is assumed that ,
, and (this is the rest temperature, ).
The TAT is minimal when .
In the above example, there was only one decision variable, no TAM
congestion, constant testing and cooling power values, and a simple
thermal model. Therefore, the optimization problem was solvable by
plotting TAT versus the sufficient-cooling temperature. The problem is that
none of the above assumptions is realistic.
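The single-variable optimization of this example can be sketched as a sweep over the sufficient-cooling temperature. The sketch assumes a single-node RC thermal model with its standard exponential solution, a cooling time that includes the cooling-side part of the switching delay, and entirely hypothetical parameter values; the variable names are chosen here for illustration and the numbers do not reproduce Figure 2.7.2.

```python
from math import exp, log, inf

# Hypothetical parameters, chosen only to illustrate the trade-off.
T_A, T_H = 25.0, 100.0      # ambient and overheating temperatures (deg C)
R, C = 2.0, 10.0            # thermal resistance (K/W) and capacitance (J/K)
P_T, P_C = 60.0, 10.0       # testing and cooling power (W)
D_C, D_T = 1.0, 2.0         # cooling-side and testing-side parts of the delay (s)
L = 1000.0                  # test length in test units (one unit per second)

def tat(t_sc):
    """Approximate test application time for a sufficient-cooling temperature."""
    tau = R * C
    t_rest = T_A + R * P_C      # steady-state temperature while cooling
    t_hot = T_A + R * P_T       # steady-state temperature while testing
    # Temperature keeps dropping during the cooling-side part of the delay.
    t_low = t_rest + (t_sc - t_rest) * exp(-D_C / tau)
    t_test = tau * log((t_hot - t_low) / (t_hot - T_H))    # heat from t_low to T_H
    t_cool = tau * log((T_H - t_rest) / (t_low - t_rest))  # cool from T_H to t_low
    if t_test <= D_T:
        return inf              # no effective test time is left in a period
    return L / (t_test - D_T) * (t_test + t_cool)

def best_cooling_temperature(step=0.5):
    """Sweep candidate temperatures strictly between rest and overheating level."""
    t_rest = T_A + R * P_C
    count = int((T_H - t_rest) / step)
    candidates = [t_rest + step * k for k in range(1, count)]
    return min(candidates, key=tat)

if __name__ == "__main__":
    best = best_cooling_temperature()
    print(best, round(tat(best), 1))
```

Cooling too little leaves almost no effective test time per period, while cooling too much wastes time near the rest temperature, so the minimum lies in between.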
In reality there are a number of decision variables (e.g., one for each
module). Because of TAM congestion, a module cannot start or resume
testing regardless of other modules. Testing and cooling powers can be
different for different test stimuli and they also depend on the
temperature. A module’s temperature may need to be modeled with several
thermal elements. A thermal element’s temperature depends on the test
stimuli power and the temperature of the adjacent thermal elements. This
situation is much more complex than the above example and it will be
extremely time-consuming to find the exact optimal schedule. Therefore,
a near-optimal solution that can be found in an affordably short time is
preferred. For this purpose, particle swarm optimization, which is a
population-based meta-heuristic, is used in this thesis.
2.7.2 Particle Swarm Optimization
Let us review a more realistic version of the thermal-safe scheduling
discussed in the previous section. For this purpose the IC’s temperature
must be simulated offline during the schedule generation, as shown in
Figure 2.7.3. As soon as the temperature reaches the overheating level,
the test is halted to allow for cooling. The test cycle at which testing is
paused (i.e., the module becomes inactive) is registered in the schedule
table as shown in Figure 2.7.3b–c.
Temperature simulation continues and when the temperature reduces to
the sufficient-cooling temperature, the module activity (i.e., testing) may
resume. The actual resumption may be delayed due to testing equipment
and architecture characteristics. Moreover, the delay may be due to TAM
congestion which forces the module to wait for test access. The test cycle at
which testing resumes is also registered in the schedule table in Figure
2.7.3b–c. Since the power values are not constant, the heating time differs
from one testing interval to another.
Figure 2.7.2 Test application time (TAT) versus sufficient-cooling temperature
This constructive, on-the-fly, and temperature-simulation-based
scheduling continues until all the tests are scheduled. This point marks the
test application time that must be minimized using a meta-heuristic2.
There are a number of meta-heuristics that can be used for optimization. A
population-based meta-heuristic is usually used in such situations. A well-known
example of this category of algorithms is the genetic algorithm
[Falkenauer98, Maulik00]. In this thesis we often use a Particle Swarm
Optimization (PSO) technique, which is briefly explained here.
Particle swarm optimization mimics the social behavior of a swarm
searching for food [Poli07]. Each individual member of the swarm is called
a particle. A particle is represented by two attributes, its location and its
velocity. The location in fact is a solution which, usually, is represented by
a coordinate in a Cartesian system. The velocity keeps the particles moving
in the search space.
2 The technique used in this example is from [He08a]. The actual optimization
problems in this thesis are more sophisticated than this example.
Figure 2.7.3 Test scheduling based on temperature simulation
(a) Temperature curve; (b) test cycles registered in the schedule table; (c) module states in the
schedule table. (Curves are only illustrative.)
Each particle remembers its previous best location, and in addition to this
individual memory, the swarm remembers the best location any of its
particles have visited before, the global best. The previous bests and the
global best are then used to give a hint to the random velocities. A
canonical form of the particle swarm optimization is expressed by the
following equations [Poli07]:
(2.7.3)
(2.7.4)
This canonical form of the particle swarm optimization uses equation 2.7.3
to update the velocity. The coefficients in equation 2.7.3 ( , ,
and ) are given as a part of the chosen canonical form. The
and are two distinct random numbers between 0 and 1 which are
renewed iteratively. The location and velocity on the right hand side of
equation 2.7.3 are the previous values and the left hand side velocity is the
new value. The new location is the sum of the previous location and the
new velocity as expressed in equation 2.7.4. Sometimes an action is needed
to prevent the new location from going outside the valid search space. This
can be done by limiting its value to the valid extremes (i.e., replacing an
out-of-range value with the nearest valid extreme).
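For orientation, a widely used inertia-weight form of these updates that matches the description above (three fixed coefficients and two random numbers) is sketched below. The symbol names are assumptions introduced here, since the symbols of equations 2.7.3 and 2.7.4 are not reproduced in this text: $x$ is the particle's location, $v$ its velocity, $b$ its previous best location, $g$ the global best, and $r_1$, $r_2$ are random numbers between 0 and 1.

```latex
\begin{align}
v_{\mathrm{new}} &= \omega\, v + c_1 r_1 \,(b - x) + c_2 r_2 \,(g - x) \\
x_{\mathrm{new}} &= x + v_{\mathrm{new}}
\end{align}
```

Commonly quoted coefficient values for this form are $\omega \approx 0.7298$ and $c_1 = c_2 \approx 1.4962$. Whether exactly these values are used in equation 2.7.3 is an assumption.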
For instance, in the example above the decision variable (i.e., the sufficient-cooling
temperature) must be larger than the rest temperature and smaller
than the overheating temperature ( ). Smaller
values will result in an infinite loop in the scheduling algorithm since the
temperature will never become smaller than . Larger values have a
similar effect, since when cooling the temperature only decreases and
cannot increase beyond . In these cases the scheduling
algorithm will wait forever for a temperature that cannot be reached.
A simple form of the particle swarm optimization is presented below:
1. Generate the initial locations (in the valid search space)
2. Generate random initial velocities (in a reasonable range)
3. Evaluate the solutions
4. Find the best solutions as follows:
   a. Loop for all particles:
      i. If the current location is better than the previous best location, replace it and check
         if it is better than the global best; if so, replace the global best. (For the first
         iteration, copy the current solution as previous best, and find the global best among
         the previous best solutions.)
5. If the termination condition is met, exit with the global best as final solution.
6. Update the swarm as follows:
   a. Loop for all particles:
      i. Update the velocity according to equation 2.7.3
      ii. Update the particle’s location according to equation 2.7.4
      iii. Limit the location to the valid extent of the search space
7. Go to step 3.
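The steps above can be sketched for a single decision variable as follows. This is a minimal illustration, not the implementation used in this thesis. The function name `pso_minimize`, the fixed iteration budget used as termination condition, and the coefficient values `w`, `c1`, and `c2` (commonly quoted values for an inertia-weight form of the velocity update) are assumptions.

```python
import random

def pso_minimize(cost, lower, upper, particles=10, iterations=200, seed=0,
                 w=0.7298, c1=1.49618, c2=1.49618):
    """One-dimensional particle swarm search for a minimum of cost on [lower, upper]."""
    rng = random.Random(seed)
    span = upper - lower
    # Steps 1-2: initial locations in the valid range and modest random velocities.
    x = [rng.uniform(lower, upper) for _ in range(particles)]
    v = [rng.uniform(-0.1 * span, 0.1 * span) for _ in range(particles)]
    best_x = list(x)                                 # previous best location per particle
    best_cost = [float("inf")] * particles
    global_x, global_cost = x[0], float("inf")
    for _ in range(iterations):                      # step 5: fixed iteration budget
        costs = [cost(xi) for xi in x]               # step 3: evaluate the solutions
        for i in range(particles):                   # step 4: refresh the bests
            if costs[i] < best_cost[i]:
                best_x[i], best_cost[i] = x[i], costs[i]
                if costs[i] < global_cost:
                    global_x, global_cost = x[i], costs[i]
        for i in range(particles):                   # step 6: update the swarm
            r1, r2 = rng.random(), rng.random()
            v[i] = (w * v[i] + c1 * r1 * (best_x[i] - x[i])
                    + c2 * r2 * (global_x - x[i]))
            x[i] = min(upper, max(lower, x[i] + v[i]))   # limit to the valid extent
    return global_x, global_cost

# Usage with a toy cost function that has its minimum at 3.
location, value = pso_minimize(lambda t: (t - 3.0) ** 2, 0.0, 10.0)
```

In the scheduling context, `cost` would be the schedule generation that returns the test application time for a given sufficient-cooling temperature.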
In order to see how PSO works, assume that the lower location value
corresponds to more cooling (e.g., as the decision variable in Figure
2.7.3a). Therefore, a negative velocity guides the particle towards more
cooling and if more cooling in this iteration helps to reduce the cost (TAT
for the above example), it is reasonable to keep a negative velocity for the
next iteration as well.
If such a move makes the particle the best in the swarm, that will affect the
velocities of other particles as well. If a particle is far from the
promising search region, its velocity will, generally, be larger due to large
difference values in equation 2.7.3. This allows a fast move towards a better
area. The particle slows down when it approaches the promising area due
to small difference values in equation 2.7.3. This enables a detailed search
in the promising areas.
The evaluation of the cost function (e.g., schedule length) for different
particles can be performed in parallel (e.g., using multiple threads). This
might be very helpful, especially if temperature simulations are involved.
In many cases (e.g., scheduling) as the evaluation proceeds, the cost (e.g.,
test application time) grows. Therefore, as soon as it becomes certain that
a particle (parallel thread) has no chance of affecting the
local best (and, therefore, the global best), the thread can be stopped. There
is no need to further evaluate the particle, since it is not going to be used
in equation 2.7.3.
This is important since the CPU time is usually proportional to the schedule
length and, hence, to the test application time. Bad schedules that do not
contribute to equation 2.7.3 usually correspond to a long test application
time and take a long CPU time to complete. Therefore, aborting their
corresponding threads drastically speeds up the search. As soon as a good
particle is found, the bad ones can be stopped.
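The early-termination idea can be sketched as follows. The sketch is sequential instead of multi-threaded, and it assumes that the cost is accumulated from non-negative contributions (e.g., the lengths of schedule segments), so that it can only grow. The function name `evaluate_with_cutoff` is hypothetical, and the cutoff is taken to be the particle's previous best cost.

```python
def evaluate_with_cutoff(partial_costs, cutoff):
    """Accumulate a cost that only grows; give up once it reaches the cutoff."""
    total = 0.0
    for step in partial_costs:
        total += step
        if total >= cutoff:
            return None          # cannot improve on the particle's previous best
    return total

# Usage: the third segment pushes the cost past the previous best of 100,
# so the last segment is never evaluated.
segments = iter([40.0, 35.0, 50.0, 20.0])
result = evaluate_with_cutoff(segments, cutoff=100.0)
not_evaluated = list(segments)
```

Because the global best is never worse than any particle's previous best, a particle that cannot improve its own previous best cannot improve the global best either.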
Chapter 3 Related Work
Recently, thermal issues affecting testing procedures have been intensively
studied [Tadayon00]. A promising class of solutions is scheduling-based
[He06b, Rosinger06]. Some of these issues, like thermal-safe testing, have
been previously addressed, while others, like issues related to
process variation, temperature gradients, and temperature cycling, have not
been sufficiently studied. This chapter provides an overview of the related
work.
3.1 SoC Test Scheduling
An optimal solution for the test scheduling problem for core-based systems
is presented in [Chakrabarty00, Chakrabarty02]. Test data and the test
access mechanism are assumed to be given. The decision variables are the
start times for the tests. The optimization objective is to minimize the total
test application time. It is shown that this problem is NP-complete. A
solution based on a mixed-integer linear programming (MILP) formulation
is suggested. It is shown that the MILP solution is too slow for large designs.
Consequently, an efficient heuristic algorithm for dealing with such large
designs is presented in [Chakrabarty00, Chakrabarty02].
A method to address both the scheduling and the design of DfT features,
together, is proposed in [Huang01, Huang02]. The objective is to reduce
the test application time and the constraints include the power budget. The
problem is formulated as a two-dimensional bin-packing problem which is
solved using a best-fit heuristic algorithm [Huang01, Huang02].
A test scheduling approach that supports test preemption is introduced in
[Iyengar01]. The constraints include the power budget and precedence
constraints. Precedence constraints impose that the generated schedules
preserve desirable orderings among tests. Allowing test preemptions
results in shorter schedules [Iyengar01].
A test design approach that optimizes some of the DfT features along with
the schedule generation is proposed in [Zou03]. A simulated annealing
based heuristic is used to solve the test scheduling problem that is
formulated as a two-dimensional bin packing problem. The width of the
core wrapper is one of the decision variables optimized by the simulated
annealing algorithm [Zou03].
An abort-on-first-fail approach with a test power constraint is introduced in
[He06a]. Abort on first fail means that the test is terminated as soon as a
defect is detected. Defect probabilities for cores and power constraints are
assumed as given. Test partitioning is performed by the scheduling
algorithm. A heuristic generates the test schedules with partitioning,
aiming at a minimal test application time [He06a].
A test scheduling approach for 3D-SIC is proposed in [SenGupta12]. For
normal 2D ICs, the same test schedule is used both at wafer sort and at
package test. In a 3D-SIC a number of dies are integrated into a single
package. Therefore, the package will have a collection of the tests for
individual dies in addition to tests for the TSV interconnects. A technique
for co-optimization of the wafer sort and the package test is proposed for
3D-SIC. The proposed approach utilizes an on-chip JTAG infrastructure
and efficiently re-uses JTAG lines to perform testing of different cores
[SenGupta12].
3.2 3D Stacked IC Testing
Miniaturization and performance requirements result in the usage of new
technologies, such as 3D-SICs based on TSVs. Their advanced fabrication
processes as well as physical access limitations result in major testing
challenges. The manufacturing steps of TSV-based ICs and their testing
challenges are introduced in [Marinissen09]. The necessary steps for
wafer-level and package-level testing in addition to the required test data,
wafer-level probe access, and DfT features are discussed [Marinissen09].
A technique for clock network synthesis that supports pre-bond testability
for 3D-SICs is proposed in [Kim10]. This prevents bonding of a bad die to
good dies by testing the dies before stacking. The pre-bond clock network
testing requires a complete 2D clock tree on each die. The proposed tree
topology generation algorithm uses a minimal number of TSV-related
buffer resources. Moreover, self-controlled clock transmission gates are
proposed in order to eliminate transmission gate control lines.
Consequently, the number of TSVs and clock-network power consumption
are reduced [Kim10].
A DfT architecture for 3D-SICs allowing pre-bond die testing as well as
mid- and post-bond stack testing is proposed in [Marinissen10a]. This
architecture facilitates modular testing, in which the various dies, their
cores, the inter-die TSV-based interconnects, and the external I/Os can be
tested as separate modules. This helps to achieve an optimal test flow for
various 3D-SIC designs. The proposed architecture is based on the existing
DfT features at the core, die, and product level. A die-level wrapper which
can be based on either IEEE Std 1500 or IEEE Std 1149.1 is proposed
[Marinissen10a].
A DfT architecture based on a modular test paradigm for 3D-SIC is
proposed in [Marinissen10b]. Different dies, their cores, the TSVs, and the
external I/Os can be tested individually. The proposed architecture is based
on existing DfT hardware at the core, die, and product level. Die-level
wrappers compatible with IEEE Std 1500 are supported. The proposed DfT
includes dedicated probe pads on the non-bottom dies to facilitate pre-bond
testing. Moreover, TSVs working as “test escalators” for routing test
control and data signals up and down during mid- and post-bond testing
are supported. Furthermore, a hierarchical “wrapper instruction register”
chain can be included in the design [Marinissen10b]. Some of the test
techniques and DfT features are further discussed in [Plas11].
Challenges in testing of 3D-SIC for manufacturing defects and their
potential solutions are discussed in [Marinissen10c]. These are divided into
the following categories: test flow, test data, and test access. Examples of
interconnect defects, including voids in TSVs and misaligned micro-bumps,
are discussed [Marinissen10c].
The die-stacking steps that include thinning, alignment, and bonding may
introduce defects. Therefore, the partial stacks (mid-bond) and the
complete stacks (post-bond) may need testing. A test architecture
optimization technique for 3D-SIC is proposed in [Noia10a] to minimize
the test application time for both mid- and post-bond test stages. It is
demonstrated that the optimal DfT architecture obtained when these test stages
are considered differs from the one obtained when only the final test stage is
considered [Noia10a].
A DfT architecture optimization technique for 3D-SIC is proposed in
[Noia10b]. It is shown that 3D-SICs with large and complex dies placed at
the lower layers require less test time than stacks with complex dies at
higher layers [Noia10b].
Test challenges for 3D-SICs are discussed in [Marinissen12a, Noia11].
The need for standards like IEEE P1838 is discussed. P1838 covers test
wrapper hardware and a description language. This
includes a generic die wrapper, which creates a standardized interface
for each die in a stack. The wrapper design must enable pre-, mid-, and
post-bond test access. All of this enables partial and complete stack tests,
including die-external tests [Noia11].
Another DfT architecture for 3D-SIC is proposed in [Marinissen12b]. It
supports modular testing, meaning different dies, cores, TSV-based
interconnects, and external I/Os can be tested during the relevant pre-, mid-
or post-bond stages. The proposed architecture makes it possible to
optimize the test flow under various conditions. It also provides yield
monitoring and first-order fault diagnosis [Marinissen12b].
A DfT architectural optimization for 3D-SIC is proposed in [Noia12] to
minimize the test application time for mid- or post-bond testing. Optimal
architecture and the corresponding test schedule are obtained for a scenario
where only the post-bond testing is performed. It is demonstrated that the
optimal architecture and schedule are different for the scenario where
mid-bond testing is added to the existing post-bond tests [Noia12].
The optimal test flow for 3D-SIC is studied in [Taouil12]. A framework
that embodies different test flows for die to wafer bonding paradigms is
introduced. The cost associated with a range of test flows is assessed for
several die yield and stack size alternatives. It is shown that the inclusion
of pre- and mid-bond testing potentially reduces the overall cost
[Taouil12].
3.3 Temperature-Aware Test Scheduling
Without simulating the temperatures, the thermally safe schedules for
advanced SoCs will be unnecessarily long. This is due to the large safety
margins which are necessary when the temperature values are unknown.
Prior to the actual test (during the schedule generation) our knowledge of
the actual temperatures during testing, without simulating the
temperatures, will be severely limited. Therefore, temperature-aware test
scheduling techniques that use a kind of temperature simulation are
introduced.
A temperature-aware scheduling technique is proposed in [He06b]. The
objective is to minimize the test application time and the constraints
include keeping the temperature under a safe limit. Test partitioning is
supported and the time between two consecutive partitions can be utilized
for cooling. Interleaving allows tests for other cores to be performed while
a hot core is interrupted for cooling. This allows for efficient TAM
utilization and a short schedule is achieved. The problem is formulated as
a combinatorial optimization and a Constraint Logic Programming (CLP)
formulation is used to solve it [He06b]. A faster heuristic-based approach
for this purpose was later proposed in [He07].
The power impact of scan chain testing is studied in [Bild08]. It is shown
that the scan-chain power consumption is considerably higher for at-speed
testing compared to the operational mode. An exact test schedule
optimization for minimization of test application time under temperature
constraints is introduced. This exact approach could be slow for practical
purposes and therefore a fast heuristic-based approach is proposed
[Bild08].
A temperature-simulation driven test scheduling algorithm is proposed in
[He08a]. Instantaneous simulated temperatures are used to guide the
partitioning of the tests and lengths of the cooling intervals. Interleaving of
tests for different cores is supported to achieve a high TAM utilization
[He08a].
A partitioning and interleaving approach is introduced in [He08b]. The
suggested method formulates the number of test partitions and the length
of cooling intervals as an optimization problem and then uses constraint
logic programming to solve it. The temperature is simulated using HotSpot
[Huang06] and is constrained to avoid overheating. Since constraint logic
programming is too slow to handle long tests, a heuristic is proposed for
the test scheduling [He08b].
A partition-based temperature-aware test scheduling algorithm is proposed
in [Yao09, Yao11a]. Tests are partitioned and the proper start time for each
partition is defined as a decision variable. The optimization goal is to
achieve a short test application time under the temperature constraints. A
superposition-based temperature simulation scheme is proposed. The
actual temperature simulation is performed for each partition only once at
the very beginning using HotSpot [Huang06]. Later on these simulated
temperatures are combined based on the superposition principle in order to
obtain the temperature for different situations [Yao09, Yao11a].
A temperature-aware combined TAM design and test scheduling technique
is proposed in [Yu09]. The proposed approach supports cycle-accurate
temperature simulation as well as test partitioning and interleaving.
Maximal TAM size and maximal safe temperature are given as constraints.
A heuristic-based approach is used to minimize the test application time.
In addition to the temperature simulations, the heuristic is guided by the
power density and test application time of individual partitions [Yu09].
A temperature-aware test scheduling technique supporting the abort on
first fail testing approach is proposed in [He09]. In such an approach
testing is terminated as soon as a defect is detected. Therefore, the defect
probabilities must be known before the scheduling. The proposed test
scheduling technique supports partitioning and interleaving of tests. The
proposed algorithm uses the simulated temperatures to guide the
partitioning of tests and to determine the duration of the cooling intervals.
The objective is to minimize the expected test application time while the
temperatures of the cores are kept below the thermal safety limit [He09].
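To illustrate why defect probabilities matter for the expected test application time, the following sketch uses a deliberately simplified model: tests are applied one after another, testing is aborted only at the end of a test that detects a defect, and defects are independent. This model, the function name, and the numbers are assumptions for illustration and do not reproduce the formulation in [He09].

```python
def expected_test_time(tests):
    """Expected duration of a sequence of (duration, defect_probability) tests."""
    expected = 0.0
    reach = 1.0                      # probability that testing gets this far
    for duration, p_defect in tests:
        expected += reach * duration
        reach *= 1.0 - p_defect      # continue only if no defect was detected
    return expected

# Usage: the same two tests in two different orders.
short_first = expected_test_time([(10.0, 0.5), (20.0, 0.1)])
long_first = expected_test_time([(20.0, 0.1), (10.0, 0.5)])
```

In this toy case, applying the short test with the high defect probability first gives the smaller expected time.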
A temperature-driven test access routing and test scheduling technique for
three-dimensional SoCs is introduced in [Chandran09]. Three-dimensional design
of DfT features combined with partition-based test scheduling is studied.
The proposed temperature-aware technique minimizes the test application
time while constraints on the available hardware resources are taken into
account [Chandran09].
In [Vinay10], it is shown that 3D-SICs may rapidly become too hot since
the thermal resistance between dies located at the middle of the stack and
the heat sink is large. A temperature-aware test scheduling technique is
then proposed [Vinay10]. The proposed approach focuses on vertical
temperature distribution in the 3D IC to avoid overheating. Moreover, a
new test partitioning scheme is proposed based on the power variations.
The proposed techniques consist of heuristics aiming at minimizing the test
application time. The proposed thermal model is a linear RC-model that
focuses on vertical temperature effects. It is demonstrated that the proposed
technique can achieve a uniform vertical temperature distribution
[Vinay10].
A test partitioning method for temperature-aware testing of 3D-SIC is
proposed in [Millican14]. The objective is to generate a test schedule with
a minimal test application time under the thermal-safety constraints. The
partitions are determined based on a partitioning temperature so that the
temperature within a partition does not vary too much [Millican14].
A test scheduling technique for 3D-SICs based on a session-less approach
is proposed in [Flottes15]. Testing start times are formulated as the
decision variables and test application time is minimized. A set of
constraints including TAM availability, power budget, and thermal limits
must be respected. A greedy heuristic is proposed and experimentally
evaluated in [Flottes15]. The session-less approach generates shorter
schedules compared with the session-based ones. Session-based
approaches can afford to find the optimal schedule, while session-less
approaches usually cannot find the exact optimum. The proposed heuristic
can find a near-optimal solution for the large problem sizes that result from
session-less approaches [Flottes15].
3.4 Process Variation Effects on Power and Temperature
Process variation causes uncertainty in circuit parameters including the
electric currents and therefore the dissipated power. Variations in the
dissipated power values will result in temperature variations. This means
that the temperature for two different fabricated instances of the same
entity (e.g., an identical core design) will be different.
Consider a homogeneous multi-core SoC design. Assume that all the cores
are executing exactly the same tasks with identical memory and resource
access patterns (also identical state and input data). Assume that all the
cores started from the ambient temperature (i.e., identical initial
conditions) and the voltages are precisely equal. Assume that there is no
heat transfer among the cores and that the cores’ cooling capabilities are
designed to be identical. The difference in their working temperatures is
due to the so-called intra-die variations1.
1 Intra-die and inter-die variations are formally defined based on the concept of
temperature error that will be introduced in chapter 4.
Now consider two single-core SoCs with the same design. Assume that
they are executing exactly the same tasks with identical memory and
resource access patterns (also identical state and input data). Assume that
both ICs started from the ambient temperature (i.e., identical initial
conditions) and the voltages are precisely equal. The difference in their
working temperatures is due to the so-called inter-die variations.
The impact of process variation on leakage power for a 0.18μm
Complementary Metal Oxide Semiconductor (CMOS) technology is
studied in [Srivastava02]. It is shown that the process variation can
drastically affect the leakage current. Based on Monte Carlo simulations
an analytical model for estimating the average and the standard deviation
of the leakage current is developed. It is then demonstrated that the average
leakage obtained by taking the PV into account is significantly different
from the leakage predicted by the deterministic models [Srivastava02].
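The gap between the average leakage under variation and the deterministic prediction can be illustrated with a small Monte Carlo sketch. It assumes that the subthreshold leakage depends exponentially on the threshold voltage and that the threshold-voltage shift is Gaussian. The function name and the parameter values are illustrative assumptions and do not reproduce the model of [Srivastava02].

```python
import math
import random

def mean_leakage_ratio(sigma_vth=0.03, slope=0.039, samples=100_000, seed=1):
    """Average leakage over sampled Vth shifts, relative to the nominal leakage."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        dvth = rng.gauss(0.0, sigma_vth)     # threshold-voltage shift in volts
        total += math.exp(-dvth / slope)     # leakage relative to the nominal value
    return total / samples

ratio = mean_leakage_ratio()
# Mean of the corresponding lognormal distribution, for comparison.
analytic = math.exp(0.03 ** 2 / (2 * 0.039 ** 2))
```

Because the exponential is convex, the average over the variation is larger than the leakage at the nominal threshold voltage, which corresponds to a ratio of 1 in this sketch.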
Process variation is a major challenge for designing with technology nodes
smaller than 90nm [Borkar03]. Large variations in voltage, current, power,
temperature, and delay are expected. PV causes serious difficulties in
designing advanced electronics and to address these difficulties a shift in
the design paradigm, from existing deterministic approaches to adaptive or
stochastic approaches (either probabilistic or statistical) is necessary
[Borkar03].
A method for estimating the leakage current variations due to PV is
proposed in [Rao03]. The problem is analyzed for both inter- and intra-die
variations and a closed form Probability Density Function (PDF) for
calculating the leakage current is developed. Distributions of individual
gates’ leakage currents are then combined to calculate the average and
variance for a whole design. The closed form results are then validated
against a set of Monte Carlo simulations [Rao03].
A stochastic approach for leakage power minimization based on dual Vth2
technologies considering PV is proposed in [Liu04]. A statistical model of
PV is used in this stochastic optimization. Probabilistic analytical models
are then developed to predict the impact of PV on the leakage power and
delays. This model indicates that the existing non-probabilistic analysis
significantly (around ) underestimates the leakage power. Based on the
proposed model the value of the second Vth is optimized considering the
PV [Liu04].
2 Vth is the threshold voltage in CMOS-based technologies. One Vth value is
sufficient to fabricate working ICs. However some manufacturers offer the
possibility of using two different Vth values in a single die in order to achieve
better performance/power trade-offs.
A different stochastic Dual-Vth optimization technique considering PV is
proposed in [Srivastava04]. It is shown that the deterministic methods are
not appropriate in the presence of large process variations. The proposed
stochastic approach can improve the leakage by 15–35% compared with
the traditional deterministic approaches [Srivastava04].
A method for analyzing the leakage power under intra-die process
variations is proposed in [Chang05]. A lognormal distribution is used to
approximate the leakage current of individual gates and the overall leakage
of a die is determined by combining these lognormal distributions. Both
subthreshold leakage and gate tunneling leakage are considered
[Chang05].
The leakage power is very sensitive to process variations and therefore PV
results in large temperature variations [Choi07]. The temperature
variations in FinFET circuits are affected by both the channel length
variations and the body thickness variations. The temperature
variations caused by PV are assessed using Monte Carlo simulations
combined with temperature simulations. The simulations show that circuits
with large switching activity suffer from larger temperature variations.
This is due to larger static power as a result of the higher temperature
caused by large switching activity. It is shown that under a moderate
process variation (e.g., for channel length and body thickness)
thermal runaway can occur in more than 15% of chips in a 28nm FinFET
technology [Choi07].
Process variation has various causes, including [Nowka08]:
· imprecisions in alignment, rotation and magnification (lithography);
· interference effects from neighboring shapes (lithography);
· fluctuations in the photon absorption positions (lithography);
· fluctuations in the dosage of chemicals used for etching and treatment;
· random dopant fluctuation;
· gate oxide thickness fluctuation;
· Chemical Mechanical Polishing (CMP) unevenness.
Gate oxide thickness variation results in variation in the threshold voltage
and consequently in variations in the static power dissipation [Nowka08].
Some of the stochastic approaches for dealing with the PV (e.g., statistical
static timing analysis) cannot handle the dynamic changes during operation
[Ganapathy10]. A new method based on multivariate regression is
proposed to model the temporal delay variations under PV. Such variations
are related to temperature variations [Ganapathy10].
On-chip temperature sensors are used to achieve a temperature-aware test
scheduling and reduce the test application time compared to a static
schedule [Yao11c]. Due to large PV the estimated test power values can
be very different from the actual ones. Consequently, the estimated
temperatures during the offline scheduling phase (prior to the actual test)
can be inaccurate. A test architecture that supports dynamic test scheduling
is assumed. A heuristic is suggested to generate the static schedule that the
method is based on. Then a dynamic test scheduling method using on-chip
temperature sensors is proposed [Yao11c].
Dynamic reliability management techniques dynamically tune a system’s
operation based on the tradeoff between performance and reliability. The
proposed method in [Zhuo10] takes the spatial and temporal variations
(including PV) into account. Moreover, the proposed technique is
workload-aware meaning that it reacts to sudden workload variations
[Zhuo10].
A flexible probabilistic framework for evaluation of the transient power
and temperature variations under large PV is introduced in [Ukhov14a].
This models the probability functions of the fluctuating parameters. The
proposed technique captures the power and temperature variations in a
closed-form analytical model [Ukhov14a].
A system-level framework for the analysis of temperature-related failures
affected by PV is proposed in [Ukhov14b]. This includes a probabilistic
technique for dynamic steady-state temperature modeling and a closed-
form stochastic modeling of the system. Temperature cycling induced
aging is analyzed in presence of large PV. The proposed technique
minimizes the expected energy consumption under performance,
temperature, and reliability constraints [Ukhov14b].
3.5 Multi-Temperature Testing
A detailed study of defects found in a commercial microprocessor is
performed in [Needham98]. For these high-production-volume microprocessors,
the manufacturing tests are designed so that very small test-escape
statistics are achieved. Some of the escaped defective devices are
rigorously analyzed to find out the defect’s type, its electrical effect, and
the possible methods to detect such defects easily. Lessons learned from
these defects in combination with the technology trends enable the authors
to determine what should be done to achieve and maintain high-quality
manufacturing and test. This includes defects that can be detected by
multi-temperature testing but are otherwise hard to detect [Needham98].
The conclusions from a failure analysis study at SEMATECH3 are reported
in [Nigh98]. The testing procedures, IC stressing to achieve high
reliability, characterization of the defects, fault diagnosis, and physical
analysis are presented for a number of devices. Testing at different
temperatures is discussed in [Nigh98].
Delay-defect test-escapes are examined in [Tseng00]. Among these
defects, those caused by high-resistance interconnects are very
challenging to detect. A cold testing technique that performs
the test at low temperature can help. Cold testing is particularly effective
for detecting the silicide open defects [Tseng00].
The behavior of resistive open defects is studied in [Li01]. Temperature-dependent
defects that motivate multi-temperature testing are discussed.
The effects of temperature on testing are investigated and an effective
testing method for resistive opens is presented [Li01]. It is suggested that
by knowing the location of such defects and the materials involved in those
defects, the proper testing temperatures can be found and the appropriate
test patterns can be generated [Li01]. Such testing temperatures and test
patterns are used to perform multi-temperature testing.
Parametric failures are more frequent in advanced electronics, where the
feature size is very small [Segura02]. These hard-to-detect failures are
experimentally studied and classified. Multi-parameter test strategies are
3 SEMiconductor MAnufacturing TECHnology (SEMATECH) is a research
consortium for IC manufacturing. http://public.sematech.org/ (May 2015)
suggested to address this complex test problem. These issues are also
discussed in [Segura04].
Due to very small copper interconnect dimensions in advanced electronics,
physical failure analysis is needed to address the potential defects. Failure
localization and defect analysis are challenges for copper inlaid
technologies [Zschech02]. Failure localization and analysis using
FIB/SEM4 and TEM5 are described. The voids in copper interconnects and
buried residuals in vias are studied in [Zschech02]. These imperfections
result in temperature-sensitive defects that necessitate multi-temperature
testing.
Multi-temperature testing is analyzed in [Long04] based on experimental
data from 0.25 μm and 0.18 μm technologies. A model is then developed from
these data and used to design new temperature-based tests that improve
test quality. Temperature-based test data are presented for a range of
measurements, including the transistor characteristics needed to
parameterize the model [Long04].
Some imperfections in the chip (e.g., some resistive opens or shorts) will
not hinder the normal operation of the chip just after fabrication, at the
time the manufacturing test is performed. However, these imperfections are
reliability threats because they are weak points in the circuit that wear out
quickly and will lead to failures during the expected lifetime of the chip
[Long04, Needham98]. Some of these imperfections can be identified by
multi-temperature testing.
Performance outliers and defects are examined across the expected
operating temperature range [Schuermyer04]. Minimum testing
requirements to detect temperature dependent outliers at wafer sort and
final test are investigated. This is based on data from a 0.18μm technology
obtained at 30°C and 85°C. It is argued that temperature-sensitive defects
are expected to become more frequent in advanced technologies and,
therefore, it is important to develop effective test methodologies for them
[Schuermyer04].
4 Focused Ion Beam (FIB) is a visualization technique used for site-specific
analysis of materials. It is similar to a Scanning Electron Microscope (SEM).
5 Transmission Electron Microscopy is a visualization technique based on electron
beams transmitted through the object.
Resistive defects are important in advanced electronics but they need
special conditions in order to be detected. The detection approaches for
resistive bridging (short) defects are studied in [Engelke08, Kundu05].
Testing at low temperatures may help. Resistive bridge defects are studied
under multiple environmental conditions. Moreover, imperfections that are
not defects at nominal conditions but could deteriorate and become early-
life failures are studied. It is suggested that there exist appropriate
combinations of these tests that provide satisfactory test coverage for
different types of defects [Engelke08, Kundu05].
The performance of advanced electronics that are made by deep submicron
technologies can be affected by phenomena that were previously
considered not to be important [Wu10]. One such phenomenon is
Inversed Temperature Dependence (ITD). ITD means that the delay of
electronics may decrease with temperature, contrary to the traditional
understanding that the delay increases with temperature.
The reason is that, at high temperatures, the effect of the smaller
threshold voltage (implying faster operation) dominates the effect of the
smaller carrier mobility (implying slower operation). Traditionally,
delays are checked at two temperature corners, one representing the best
case (which used to occur at low temperatures) and the other representing
the worst case (which used to occur at high temperatures). For advanced
electronics that experience ITD, the high temperature may not correspond
to the worst-case delays [Wu10].
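The competition between threshold voltage and carrier mobility can be
illustrated with a minimal sketch based on the alpha-power-law delay model.
The model form and all parameter values (vth0, k_vth, m, alpha, and the two
supply voltages used in the example) are illustrative assumptions and are
not taken from [Wu10].

```python
def gate_delay(temp_k, vdd, vth0=0.40, k_vth=1.0e-3, m=1.5, alpha=1.3, t0=300.0):
    """Relative gate delay from an alpha-power-law model (assumed form).

    All parameters are hypothetical:
      vth0  threshold voltage at the reference temperature t0 [V]
      k_vth threshold voltage drop per kelvin [V/K]
      m     mobility temperature exponent, mobility ~ (T / t0) ** (-m)
      alpha velocity-saturation exponent of the alpha-power law
    """
    mobility = (temp_k / t0) ** (-m)          # falls as temperature rises
    vth = vth0 - k_vth * (temp_k - t0)        # also falls as temperature rises
    return vdd / (mobility * (vdd - vth) ** alpha)


for vdd in (1.2, 0.6):
    cold, hot = gate_delay(300.0, vdd), gate_delay(400.0, vdd)
    trend = "slower when hot" if hot > cold else "faster when hot (ITD)"
    print(f"Vdd = {vdd:.1f} V: hot/cold delay ratio = {hot / cold:.2f}, {trend}")
```

Under these assumed parameters, the delay grows with temperature at the
higher supply voltage and shrinks at the lower one, which is the ITD
behavior described above.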
Advanced electronics require new types of testing, such as temperature
testing, in order to maintain high product quality. The effect of test
temperature on the quality of the tests is studied in [Jagan10]. A low-cost
alternative to temperature testing is proposed. Moreover, the proposed
technique determines the appropriate test conditions for the best test
quality and lowest cost. The proposed test flow is experimentally evaluated
on an industrial-standard die. A defect's behavior at low temperature is
studied using Shmoo plots6 [Jagan10].
The need for testing advanced core-based SoCs at different temperatures
is discussed in [He10]. A multi-temperature test scheduling technique for
SoCs is then introduced, which assumes that tests should be applied inside
predefined
6 Shmoo plot is a graphical representation of a device’s response to a range of
conditions and inputs (e.g., temperature and voltage).
temperature ranges. The proposed scheduling approach minimizes the test
application time and ensures that tests are only applied within the valid
temperature ranges [He10]. For this purpose the temperatures of the cores
are simulated. Based on the simulated temperatures, heating or cooling
intervals are introduced into the schedule [He10]. The proposed method is
based on partitioning and interleaving; therefore, while a core is in
its cooling interval, other cores may utilize the test access mechanism
capacity that has just been made available [He10].
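The insertion of heating and cooling intervals based on simulated
temperatures can be sketched for a single core as follows. This is only an
illustration of the idea and not the algorithm of [He10]: the first-order
thermal model, the function names, and all numeric values are assumptions,
and the interleaving of several cores over the test access mechanism is
left out.

```python
def step(temp, power, t_amb=45.0, r_th=2.0, tau=10.0, dt=1.0):
    """One explicit-Euler step of a first-order thermal RC model (assumed)."""
    return temp + dt / tau * (t_amb + r_th * power - temp)


def schedule_core(test_len, lo, hi, p_test, p_heat, temp, max_steps=10_000):
    """Partition a test of test_len time steps with heating/cooling intervals.

    A test step is applied only if the simulated temperature is at least lo
    and would not exceed hi after the step; otherwise a heating stimulus or
    an idle (cooling) step is inserted instead.
    """
    actions, done = [], 0
    while done < test_len:
        if len(actions) >= max_steps:
            raise RuntimeError("temperature window cannot be met")
        if temp < lo:
            action, power = "heat", p_heat      # too cold: apply heating stimulus
        elif step(temp, p_test) > hi:
            action, power = "cool", 0.0         # would overheat: idle the core
        else:
            action, power = "test", p_test      # inside the valid range
            done += 1
        temp = step(temp, power)
        actions.append(action)
    return actions


actions = schedule_core(test_len=50, lo=60.0, hi=90.0,
                        p_test=30.0, p_heat=15.0, temp=45.0)
print("".join(a[0] for a in actions))   # h = heat, t = test, c = cool
```

In a multi-core setting, the steps marked as cooling for one core are the
slots in which other cores could use the freed test access capacity.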
Another multi-temperature test scheduling scheme for SoCs is introduced
in [Yao11b]. It assumes that tests should be applied within their specified
temperature ranges (which can differ from each other). Cooling intervals
are inserted if the core temperature is too high, and heating stimuli are
applied when the temperature must be increased in order to meet the
required temperature conditions for correct testing [Yao11b]. The
proposed scheduling approach in [Yao11b] is based on list scheduling and
assumes that tests always run to completion without interruption. The
initial list order is determined based on the lowest valid temperatures for
the tests. The list schedule determines the earliest start times for tests. The
test application time is minimized and it is ensured that tests are applied
within correct temperature ranges [Yao11b].
3.6 Temperature Gradients and Burn-In
The presence of voids in Cu structures results in important reliability issues
for advanced electronics. The mechanical stress in the interface between
the Cu and capping layers7 is experimentally investigated in [Murray12].
In technologies that deposit the cap at lower temperatures, the Cu does not
show considerable depth-dependent stress. Even though an annealing
technique can decrease the stress gradient, when the temperature goes back
to the room temperature after being close to the deposition temperature, the
gradient appears again [Murray12].
A mechanism that causes defect formation in metallization (e.g.,
interconnects) under fast temperature cycle stress is studied in
[Smorodin08]. The lateral temperature distribution (i.e., temperature
gradient) causes an accumulating plastic deformation of the metal layer.
7 Capping layer is the electrical insulation used to insulate different interconnects
and wire lines in a die.
Large deformations occur in sites which experience large temperature
gradients [Smorodin08].
Burn-in is used to accelerate various aging and failure mechanisms so that
the imperfections that may cause infant mortality are detected before the
product is shipped [Miller01]. Burn-in acceleration is achieved by
imposing high temperatures, high voltages, high toggle rates, and/or high
current density on the circuit under test. One traditional test flow is
an ATE test, followed by burn-in, followed by another ATE test. It is
suggested that the first ATE test and the burn-in can be combined into
a hybrid burn-in, improving the overall test process [Miller01].
Reliability predictions are based on a number of sources of information
including in-service field return data and physics of failure [Bayle10].
Previously, predictions were mainly based on empirical data but recently
physics of failure is being incorporated into the lifetime models. The
existing reliability models usually are based on steady-state temperature.
A new methodology that combines several recent works that address new
mechanisms of failure (e.g., hot carrier and delamination) is proposed for
aeronautic applications [Bayle10].
3.7 Testing for Delay-Related Defects
Gradients and early-life failures are discussed above. Gradients have
other negative consequences as well, one of which is discussed in the
following. Different temperatures at different sites mean that signal
delays, which depend on the sites that a signal's route passes through,
will differ. This can cause delay-related faults that must be detected
using at-speed and delay tests, as discussed below.
Some floating-point data-paths are developed for graphics and simulation
applications in [Hagihara97] using a 0.35-micron technology. They are
designed to be embedded in a vector pipelined processor for use in
supercomputers. An online test technique is introduced to improve the
reliability under actual operating conditions, which include temperature
gradients. The technique makes it possible to detect delay faults as well as
static faults (i.e., normal defects) [Hagihara97].
For advanced SoCs containing millions of gates and operating at
frequencies in the gigahertz range, at-speed test is crucial [Ahmed05]. The
launch-off-shift method has some advantages over the launch-off-capture
technique but, for perfect transition fault testing, requires an at-speed
scan enable signal. A scan-based at-speed test is introduced in
[Ahmed05] that is based on multiple local fast scan enable signals. The
scan enable control information is sent as test data through the scan.
Moreover, an innovative scan cell is introduced to generate the fast local
scan enable signal [Ahmed05].
Keeping the power consumption in check during at-speed testing is
investigated in [Ko08]. A common practice is to divide the scan chain to
control shift power by activating mutually exclusive flip-flops at different
times during the scan cycle. However, the existing automatic test pattern
generation techniques do not provide means to control the capture power.
Therefore, a new scan chain division algorithm is introduced in [Ko08]. It
takes into account the signal dependencies and partitions the circuit such
that both shift and capture power can be reduced. Moreover, a technique
for utilizing partial scan combined with the scan chain divisions is
proposed [Ko08].
Test power constraints are usually due to the power delivery limitations.
These limitations could be due to the limited capability of the power
network in the device or of the test equipment [Zhao10].
Excessive switching activity during the launch-to-capture cycle of a delay
test causes many problems, including overkill of dies and damage to the
ATE probes [Zhao10]. A fast technique for finding the high-power
patterns and replacing them with power-safe ones is introduced. A pattern
is considered high-power relative to the ATE's power limit. The proposed
technique takes the spatial and electrical properties of the power
distribution network into account [Zhao10].
At-speed scan-based testing may be affected by launch safety issues
[Wen11]. This means that the test results are incorrect because of excessive
launch switching activity which is related to the test stimulus launch in the
at-speed test cycle. A power-aware test generation flow is proposed in
[Wen11] to guarantee a safe launch. The proposed rescue and mask scheme
targets the excessive switching activity around the long path that the test
vector targets. The rescue phase reduces the power. If the new power value
is still too large the test responses are masked. The proposed approach
guarantees launch safety with a negligible impact on test quality and costs
[Wen11].
Launch-off-capture and launch-off-shift are the two major at-speed scan-
based delay testing techniques [Bosio11]. Usually, launch-off-shift offers
higher fault coverage and faster test than the launch-off-capture technique.
However, it suffers from higher peak power consumption in the
launch-to-capture cycle. A don't-care filling technique is proposed to
adjust peak power consumption in relation to the power consumption in
functional mode. The objective is to generate a test set with peak power
values similar
to the functional power [Bosio11].
3D-SICs are manufactured using micro-bumps that connect two of the
stacked dies together [Shibin15]. Moreover, TSVs provide electrical
connections between the front- and back-side of a die. It is reported that
imec8 and Cadence9 have developed a 3D-DfT architecture based on DfT
die wrappers [Shibin15]. The TSVs and micro-bumps can be tested for
static defects (e.g., hard opens and shorts) by existing techniques. Such
interconnects might also be affected by resistive opens and shorts, which
usually manifest themselves as delay faults. The reported 3D-DfT has
recently been enhanced to support at-speed transition-based delay-fault
testing. The reported framework works at mission-mode speed and employs the
already existing clock distribution network [Shibin15].
A delay fault simulator for combinational circuits is developed in
[Manikandan11]. It helps to develop the delay tests faster. The experiments
consider K-longest path sets of ISCAS'85 benchmarks. A large number of
single input test patterns are repeated a number of times to achieve
statistically valid data. The proposed technique is reported to provide good
fault coverage and 20% speed-up [Manikandan11].
A transient fault injection technique for simulation-based fault-injection in
advanced SoCs is proposed in [Rohani13]. The proposed technique can
inject a wide range of faults without modifying the top-level design.
Moreover, the proposed technique is fast. Two experimental case studies
show that the proposed technique reduces the CPU time by 10% compared
with other similar techniques [Rohani13].
8 Interuniversity MicroElectronics Centre (IMEC) is an electronics research
center. http://www2.imec.be/be_en/home.html (May 2015)
9 Cadence Design Systems Inc is an electronic design automation company.
http://www.cadence.com/cadence/Pages/default.aspx (May 2015)
3.8 Temperature Cycling
It has long been known that varying mechanical stress in metals will
result in metal fatigue and consequently lead to metal structure failure. The
varying stress has various causes, including mechanical load variations and
temperature fluctuations (i.e., cycling). Accurate estimates of the effect of
such fluctuations help to predict the lifetime of a part. This enables a
better (simulation-based) design of the parts. Besides, it enables the timely
replacement of the parts, which translates into safe and cost-efficient
maintenance of the structure (e.g., a ship or a plane). A well-known
approach for estimating this aging effect is the Rainflow counting
algorithm proposed in [Matsuishi68].
Cycle counting methods (e.g., Rainflow algorithm) identify equivalent full
and half cycles within the irregular load profile [Musallam12]. Then the
cycle-based lifetime models can be used. The original Rainflow algorithm
is applied offline, meaning that the whole temperature or load profile over
the desired operational time period must be available before it can start.
Therefore, it cannot be used in applications that need real-time results. An online
counting algorithm which uses a stack-based implementation is proposed
in [Musallam12] and used in this thesis.
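For illustration, the identification of full and half cycles can be
sketched with the offline three-point Rainflow formulation (in the style of
ASTM E1049). This is not the online algorithm of [Musallam12]; the function
names are hypothetical and the input profile in the example is arbitrary.

```python
from collections import defaultdict


def reversals(series):
    """Reduce a profile to its turning points (first and last point kept)."""
    if not series:
        return []
    pts = [series[0]]
    for x in series[1:]:
        if x == pts[-1]:
            continue                              # skip plateaus
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (x - pts[-1]) > 0:
            pts[-1] = x                           # same direction: extend
        else:
            pts.append(x)                         # direction change: new reversal
    return pts


def rainflow(series):
    """Return {range: count}, where count sums full (1.0) and half (0.5) cycles."""
    counts = defaultdict(float)
    stack = []
    for point in reversals(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])        # most recent range
            y = abs(stack[-2] - stack[-3])        # previous range
            if x < y:
                break
            if len(stack) == 3:                   # y contains the starting point
                counts[y] += 0.5
                stack.pop(0)
            else:                                 # y is a closed (full) cycle
                counts[y] += 1.0
                del stack[-3:-1]
    for a, b in zip(stack, stack[1:]):            # leftovers are half cycles
        counts[abs(b - a)] += 0.5
    return dict(counts)


profile = [-2, 1, -3, 5, -1, 3, -4, 4, -2]
print(sorted(rainflow(profile).items()))
# -> [(3, 0.5), (4, 1.5), (6, 0.5), (8, 1.0), (9, 0.5)]
```

The whole profile must be available before counting starts, which is the
offline limitation mentioned above.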
Time-dependent average temperature effect is combined with the results
from the Rainflow algorithm in a single lifetime model in [GopiReddy14].
A month-long load profile is used as a test profile to estimate temperatures
in a power system for reactive compensation of load [GopiReddy14].
The Insulated Gate Bipolar Transistor (IGBT) is a power-electronic device
with a relatively wide range of applications, including automotive traction
[Held97]. Such applications require high reliability, in particular under
power cycling. Power cycling causes temperature changes which lead to
mechanical stress. This can lead to defects such as lifting of bond wires. A
fast cycling test that activates the failure mechanism is suggested to enable
reproduction of millions of cycles in a short time. The effectiveness of the
proposed approach is verified by a mechanical analysis. A model is
developed to relate the number of cycles-to-failure to the magnitude of
temperature changes [Held97].
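A generic cycles-to-failure relation of the Coffin-Manson type, combined
with linear damage accumulation (Miner's rule), can be sketched as follows.
The power-law form is a common textbook model and is not claimed to be the
exact model of [Held97]; the constants a and b and the cycle counts in the
example are hypothetical.

```python
def cycles_to_failure(delta_t, a=1.0e14, b=5.0):
    """Coffin-Manson style power law: N_f = a * delta_t ** (-b).

    delta_t is the temperature swing in kelvin; a and b are hypothetical
    fitting constants that would normally come from cycling experiments.
    """
    return a * delta_t ** (-b)


def miner_damage(cycle_counts, a=1.0e14, b=5.0):
    """Linear damage accumulation: sum of n_i / N_f(delta_t_i).

    cycle_counts maps a temperature swing to the number of cycles
    experienced at that swing. Failure is predicted when the sum reaches 1.
    """
    return sum(n / cycles_to_failure(dt, a, b) for dt, n in cycle_counts.items())


# Doubling the swing from 20 K to 40 K shortens the lifetime by 2 ** b.
print(cycles_to_failure(20.0) / cycles_to_failure(40.0))
print(miner_damage({20.0: 1000, 40.0: 100}))
```

With a large exponent b, a small number of large temperature swings can
dominate the accumulated damage.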
Another mechanism affecting the lifetime of electronic devices can be
modeled by the Arrhenius equation. An important parameter is the
activation energy10 that relates to the working temperature [Groebel01].
Accelerated-test data are experimentally obtained and then used to
accurately estimate the activation energy. A software package dedicated to
this experimental approach is used to speed up the process. Accelerated-
life test data are acquired for a thermally stressed hard-drive system and
analyzed using the Arrhenius-Weibull model. The Arrhenius model
parameters are estimated using a maximum likelihood algorithm. Then, the
activation energy is estimated [Groebel01].
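The role of the activation energy can be illustrated with the standard
Arrhenius acceleration factor between a use temperature and a stress
temperature. This is a minimal sketch; the activation energy and the
temperatures in the example are hypothetical and are not values from
[Groebel01].

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def acceleration_factor(ea_ev, t_use_c, t_stress_c):
    """Arrhenius acceleration factor exp(Ea / k * (1 / T_use - 1 / T_stress)).

    ea_ev is the activation energy in eV; temperatures are given in degrees
    Celsius and converted to kelvin.
    """
    t_use = t_use_c + 273.15
    t_stress = t_stress_c + 273.15
    return math.exp(ea_ev / K_B_EV * (1.0 / t_use - 1.0 / t_stress))


# Hypothetical example: Ea = 0.7 eV, use at 55 C, stress at 125 C.
print(acceleration_factor(0.7, 55.0, 125.0))
```

A larger activation energy makes the mechanism more sensitive to
temperature, so the same stress temperature gives a larger acceleration.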
A few procedures for extracting the statistical parameters of temperature
cycling experienced by power devices for different mission profiles (e.g.,
how an electric vehicle is driven) are investigated [Ciappa03a, Ciappa03b].
These statistical models help to design efficient accelerated tests and to
fine-tune the lifetime models. A precise lifetime model that takes into
account the creep11 experienced by compliant materials under thermal
cycles is developed in [Ciappa03a, Ciappa03b].
Electronics reliability is affected by the average working temperature as
well as the temperature cycling effect [Hirschmann06, Hirschmann07].
Temperature simulation is used to estimate the dynamic temperature
values. Temperature cycling plays an important role in lifetime prediction
models. A technique for detecting all relevant temperature cycles is
developed in [Hirschmann06, Hirschmann07].
A lifetime model for solder joints under cyclic thermal-mechanical loading
is developed in [Lu07]. The model combines a linear damage accumulation
10 Activation energy is a term primarily used in chemistry to approximately
describe the minimum energy required to start (activate) a reaction. The reaction
is modeled by the Arrhenius equation that has the activation energy as a main
parameter. In other situations that are not exactly a chemical reaction, but the
Arrhenius equation is used for pure modelling purposes, the term “activation
energy” is nevertheless used for the main parameter in the model disregarding
its original namesake.
11 Creep or cold flow is when a solid material moves slowly or deforms
(permanently) under mechanical stresses. Exposure to stress during a relatively
long period of time can do this. Heat exacerbates creep. The amount of stress that
can cause this is less than the value needed to literally bend the material
instantaneously.
with the effect of accumulated plastic strain12. The model is then used to
predict the lifetime for a power module that operates under mixed cyclic
loading conditions (e.g., a train’s traction system) [Lu07].
A solder fatigue model based on a modified Coffin-Manson approach is
proposed in [Vasudevan08]. The proposed model is evaluated using
temperature cycling experiments. The experimental data for various types
of packages and sockets have been used. The proposed model’s error is
reported to be less than 6% [Vasudevan08].
Reliability issues of through-silicon vias are investigated in [Kamto09]. The
experiments performed using a technology based on deep reactive ion
etching show that TSVs with tapered sidewalls can be formed. The TSVs
experience temperature cycling. Considerable increase in the electrical
resistance of the paths going through TSVs is observed after temperature
cycling. Perfect TSVs only show small increases in resistance for 200
cycles. Moreover, small changes in resistance are observed when TSVs
experience high temperatures for extended periods of time [Kamto09].
The acceleration factor for solder depends on the magnitude of temperature
changes, dwell times, ramp rates, actual values of temperature extremes,
and the type of package [Syed10]. A lifetime model that relates the actual
real-life lifetime with accelerated lifetime based on a number of factors
including temperature cycling is proposed [Syed10].
The relation between the initial electrical resistance of TSVs and failures
due to temperature cycling as well as electromigration is studied in
[Frank10]. Physical analysis shows that a carbon impurity layer at one end
of the problematic TSVs is developed. This impurity results in failure
under temperature cycling while it has no correlation with defects caused
by electromigration [Frank10].
The thermal stress distribution for a TSV array is studied in [Kuo11,
Kuo12]. In TSV-based structures, there are large coefficient of thermal
expansion (CTE) mismatches between the silicon substrate, the dielectric
material, and the filler metal. Therefore, the thermal stress at the material
interfaces is large and results in material failure or delamination. The
12 Strain within a material is either elastic or plastic. While elastic strain can
only cause a reversible distortion, plastic strain can result in non-reversible
deformation including cracking of the material.
thermal-mechanical stress distribution of a TSV array model under
accelerated temperature cycling is investigated using a finite element
approach. The surface area between TSVs is squeezed at high temperature
and this results in compressive stress at the surface area. The analysis
shows that large stress occurs around pads. This may result in failure or
delamination of TSV pads. The simulations indicate that for larger pads
that result in smaller space between TSVs the stress is larger. Smaller pads
experience higher stress close to the pad corners but the stress is smaller at
the middle of the bottom pad. The proposed analysis technique helps to
identify possible failure regions in the TSV structure [Kuo11, Kuo12].
Large shear stress13 develops at the interfaces between different materials
during temperature cycling, especially if the difference between their CTEs
is large [Kumar12]. The shear stress may cause interfaces to slide by a
diffusional process. This results in relative dimensional changes in the
materials. This is a reliability risk for TSV-based structures, which not only
suffer from temperature cycling but also convey large current densities.
Experimental results demonstrate interfacial sliding caused by temperature
cycling in the presence of electric current. The presence of current moves
the affected area in the direction of electron flow. This leads to exacerbated
protrusion (or intrusion) of the TSV relative to the situation with
temperature cycling only (when the electric current is negligible) [Kumar12].
The effect of temperature cycling as well as some other thermal
phenomena on the performance of TSV-based electronics is studied in
[Cherman12]. The transistor performance is affected by the stress induced
by the TSV. It is reported that high working temperature increases the
TSV-induced stress while temperature cycling decreases this stress. These
stress variations may be due to the TSV creep [Cherman12].
A study for understanding the effect of temperature cycling on the signal
integrity of TSV-based electronics is conducted in [Okoro12]. Radio
frequency signals are used to detect discontinuities in the isolation liner
around the TSV metal body. Signal degradation increases with temperature
cycling. Atomic Force Microscopy (AFM) showed that void formation and
growth in the isolation liner is the root cause [Okoro12].
13 Shear stress is the stress force parallel to the surface of a material. It differs
from the normal stress, which acts perpendicular to the surface.
System reliability is affected by a number of factors including the amount
of temperature cycling [Chantem13]. Task assignment and scheduling may
help to even out the wear of the cores in an advanced multi-core system. A
dynamically-activated task assignment and scheduling algorithm that
prolongs system lifetime is proposed in [Chantem13].
Thermal-mechanical failures of TSVs including the TSV protrusions from
the die surface are studied in [Zhang13]. The TSV protrusions are observed
after wafer bonding, thinning, and TSV revealing. TSV protrusion on the
backside is affected by temperature cycling. Protrusion magnitude can be
fitted to an exponential model which suggests a grain boundary diffusion
mechanism might be behind it [Zhang13].
Temperature-related mechanical stress in TSV structures is studied in
[Jiang14]. An X-ray micro-beam diffraction visualization technique is used
to observe the stress and deformation in TSV with submicron resolution.
Local plasticity in TSV and the deformation induced by thermal stresses
are investigated using this technique. Grain growth in TSV metal body
affects the stress relaxation during temperature cycling and consequently
the residual stress and plasticity in the TSV structure [Jiang14].
3.9 Test Reordering
During the test, the power consumption of the circuit under test may exceed
its power rating, as discussed before. A test power reduction technique
based on test vector ordering is proposed in [Chakravarty94,
Dabholkar98]. The objective is to minimize the tests' average switching
activity. It is demonstrated that the test ordering problem is NP-hard.
Consequently, a greedy approach for finding a low-power test order is
proposed. An elaborate power model based on the transition count in the
scan chain is used [Chakravarty94, Dabholkar98].
Another test planning technique is proposed in [Girard97] to reorder the
test vectors to minimize the switching activity of the circuit under test. A
close connection between the actual number of transitions and the
Hamming distance between tests is confirmed. Consequently, a fast
algorithm to calculate Hamming distances is used instead of the actual
transition count which is excessively time-consuming to calculate. A
greedy heuristic is then used to find a low power test order [Girard97].
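A greedy reordering driven by Hamming distances can be sketched as follows.
This is a nearest-neighbor illustration of the general idea and not
necessarily the exact heuristic of [Girard97]; the function names and the
four test vectors are hypothetical.

```python
def hamming(a, b):
    """Number of bit positions in which two equal-length vectors differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_order(vectors):
    """Start from the first vector and repeatedly pick the closest unused one."""
    if not vectors:
        return []
    remaining = list(range(1, len(vectors)))
    order = [0]
    while remaining:
        last = vectors[order[-1]]
        nxt = min(remaining, key=lambda i: hamming(last, vectors[i]))
        remaining.remove(nxt)
        order.append(nxt)
    return order


def total_distance(vectors, order):
    """Sum of Hamming distances between consecutive vectors in the order."""
    return sum(hamming(vectors[i], vectors[j]) for i, j in zip(order, order[1:]))


tests = ["0000", "1111", "0001", "1110"]
order = greedy_order(tests)
print(order, total_distance(tests, order))          # [0, 2, 1, 3] 5
print(total_distance(tests, [0, 1, 2, 3]))          # 11 for the original order
```

The Hamming distance only approximates the switching activity inside the
circuit, but it is much cheaper to compute than an actual transition count.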
In safety-critical applications, the electronics are frequently tested
(possibly even in-field and online) by Built-in Self-Test (BIST)
modules [Flores99]. Online testing, in particular, can consume a large part
of the overall power budget. A test ordering technique for power reduction
is proposed in [Flores99]. The switching activities of the circuit under
test are approximated by Hamming distances between subsequent tests. The
problem is equivalent to a travelling salesman problem. The problem is
simplified so that an ILP technique can be used. Moreover, the Christofides
algorithm is employed to find a low-power test order [Flores99].
A method for reducing the test application time while respecting a power
budget is proposed in [Rosinger02]. The method focuses on the test power
peaks. These peak values depend on the order of the tests. The tests are
reordered so that the power peaks for different cores are not overlapping.
This leads to a minimized TAT under power constraints [Rosinger02]. The
technique works as follows: First, power dissipation is minimized. Then,
the current results are further improved by test application time
minimization. When minimizing the test application time, the power is
considered as a constraint [Rosinger02].
Testing during the burn-in process is a common practice since it reduces
test and burn-in costs [Bahukud08a, Bahukud08b, Bahukud09]. However,
power variations caused by scan-based testing may lead to large
temperature fluctuations. This affects the accuracy of the burn-in process
since the actual temperatures are not exactly known. Reducing power
variations in order to reduce the temperature variations during burn-in is
investigated in [Bahukud08a, Bahukud08b, Bahukud09]. The variation is
reduced through test reordering. An ILP approach as well as a greedy
algorithm are used to properly reorder the tests. An efficient transition
counting method is proposed to rapidly estimate the test power values.
Then, a heuristic-based test-pattern ordering technique is proposed to
minimize the fluctuations in the power dissipation during test
[Bahukud08a, Bahukud08b, Bahukud09].
Scan-based testing usually causes much more switching activity than
normal circuit operation [Tudu09]. This results in large power
consumption which in turn leads to supply droop and yield loss. An
efficient technique for test vector reordering to achieve an acceptably low
peak power is proposed in [Tudu09]. The peak power values are
represented by a complete directed graph. Consequently, a number of
graph-based techniques are employed to reduce the peak power. Removing
the edges with peak power larger than a certain threshold is one of the pre-
processing techniques. After that, the remaining graph is searched for a
Hamiltonian path. Other techniques, such as repeating a test, adding an all-
zero test, and adding an all-one test are also studied. The average power is
also minimized under the peak power constraint [Tudu09].
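The edge-pruning and Hamiltonian-path steps can be sketched with a small
backtracking search. This is an illustration under assumptions and not the
implementation of [Tudu09]: the cost matrix, the threshold, and the function
name are hypothetical, and the search is exponential in the worst case.

```python
def hamiltonian_order(cost, threshold):
    """Find a test order that visits every test once using only safe edges.

    cost[i][j] is the (hypothetical) peak power when test j follows test i.
    Edges with cost above threshold are treated as removed from the graph.
    Returns a list of test indices, or None if no such order exists.
    """
    n = len(cost)

    def extend(path, used):
        if len(path) == n:
            return path
        for nxt in range(n):
            if nxt not in used and cost[path[-1]][nxt] <= threshold:
                found = extend(path + [nxt], used | {nxt})
                if found:
                    return found
        return None

    for start in range(n):
        found = extend([start], {start})
        if found:
            return found
    return None


peak = [[0, 5, 2],
        [5, 0, 3],
        [2, 3, 0]]
print(hamiltonian_order(peak, threshold=3))   # [0, 2, 1]
print(hamiltonian_order(peak, threshold=1))   # None
```

When no path survives the pruning, the additional options mentioned above,
such as inserting an all-zero or all-one test, would add new low-cost edges
to the graph.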
Chapter 4 Process-Variation Aware SoC
Test Scheduling Techniques
This chapter presents techniques to address the negative effects of process
variation on the thermal issues during test. In advanced SoCs manufactured
by deep submicron technologies, the portion and the absolute value of the
temperature error induced by process variation (PV) are considerable.
PV causes large errors mainly due to power variations [Choi07]. Large
error magnitudes directly translate into the need for larger safety margins
and consequently excessively long test application times.
The usual offline test scheduling techniques are vulnerable to temperature
errors since the error values are not known a priori. Therefore, the
temperatures that are simulated offline could differ considerably
from the actual temperatures. Since the actual temperatures are accessible
during test through temperature sensing, an online scheduling alternative
seems promising. However, online test scheduling has its own drawbacks,
such as additional delays due to temperature readout times and run time
overhead. As a compromise, an adaptive approach is proposed in this
chapter to take advantage of both offline and online scheduling paradigms.
4.1 Introduction
Two process variation aware methods are proposed in [Aghaee10] in order
to maximize the test throughput. One of these techniques is offline and the
other is hybrid (quasi-static). The optimization objective (i.e., testing
throughput) is defined to take the cost of the overheated chips into account
in addition to the test application time. However, these techniques handle
neither intra-die process variation nor temperature error fluctuations. In
this section an adaptive test scheduling method is introduced which
navigates the tests according to the intra-die process variation thermal
effects and temporal deviations in thermal behavior of the chip. It makes
use of multiple on-chip temperature sensors to provide intra-die
temperature information.
Integration of such sensors is already practical. For example, the Power5
was reported to have 24 sensors in 2004 [Clabes04]. A variety of
mechanisms to access the sensors during test are proposed in [Ieee14b,
Yao11c]. The proposed approach in this thesis assumes an overhead for
sensor access and tries to reduce the number of sensor accesses.
As mentioned in section 3.4, there are related sensor-based works in this
area that support neither partitioning nor temperature-dependent leakage.
The method proposed in this chapter is based on partitioning and
interleaving, which reduces and/or utilizes the cooling times in order to
decrease the overall test application time. It also handles the long and
power intensive tests which are not thermally-safe. Moreover,
temperature-dependent leakage is taken into account.
The proposed method generates a near optimal schedule tree at design time
(offline-phase). During testing (online-phase), each chip traverses the
schedule tree, starting from the tree’s root and ending at one of the tree’s
leaves, depending on the actual temperatures. The schedule indicates when
a core is testing and when it is in the cooling state. The order of the test
sequences is untouched and the schedule tree’s size (i.e., storage footprint)
is small.
Traversing the schedule tree requires a very small delay overhead for
jumping from one point in the schedule tables to another point. This way,
the complexity is moved into the offline-phase and the memory/delay
overhead of the online-phase is minor. To our knowledge, this is the first
work to present an approach which incorporates the on-chip temperature
sensors data, repetitively during test, in order to adapt to the temperature
deviations caused by process variation and to achieve a superior test
performance.
4.2 Motivational Example
Assume that there are two chip instances from a set of chips
manufactured for a given design. When the temperature error between the
actual temperature and the expected one is negligible, the temperatures of
the two chips during a test process are equal and the same offline test schedule
is used for both of them. As illustrated in Figure 4.2.1a, both chips
are tested without overheating, since the test schedule includes cooling
periods whenever the thermal simulator indicates that the chip temperature
will exceed the limit.
Due to process variation, however, the thermal responses of different
chips to the same test sequence will differ. Now, assume that one of the
two chips is warmer than expected, while the other behaves normally. As
illustrated in Figure 4.2.1b, the warmer chip will overheat. To prevent this, a more
conservative offline schedule has to be designed based on the thermal
profile of the warmer chip and used for both chips, as illustrated in Figure 4.2.1c.
This new schedule, S2, will avoid overheating, but will lead to a longer test
application time compared with S1. For the chip that behaves normally, this test
application time is unnecessarily long, since the original schedule, S1, in
Figure 4.2.1a is a safe schedule for this particular chip. For a set of
manufactured chips with large temperature variations, in order to generate
a globally conservative offline schedule, the hottest chip will be used to
determine the test schedule. This test schedule will introduce overly long
cooling periods for most of the chips, leading to an inefficient test process.
Figure 4.2.1 Test schedule examples
Temperature curves (a) when there is no temperature error; (b) when there is time-invariant
temperature error and schedule S1 is used; (c) when there is time-invariant temperature error and
schedule S2 is used; (d) when there is time-variant temperature error. (Curves are only illustrative.)
The hybrid technique presented in [Aghaee10] addresses the above
problem with the help of a chip classification scheme. This scheme consists
of several test schedules for different temperature error ranges. After
applying a short test sequence, the actual temperature of the chip under test
is measured using a sensor and depending on its value, the proper test
schedule is selected. Therefore, the hotter chips will use a test schedule
with more cooling, while the colder chips will have less cooling. The
overheating issue is solved and the test application time will not be made
unnecessarily long. This approach works fine under the assumption that
the thermal behavior of the chips is time invariant (e.g., Figure 4.2.1a–c).
However, in the case of large process variation, the thermal behavior is
time variant and the technique presented in [Aghaee10] will not be able to
achieve high quality schedules. The variation of thermal response with
time is illustrated in Figure 4.2.1d. In this case, the temperature of the warmer chip
gradually rises relative to that of the other chip, and the warmer chip eventually overheats. A
scheduling method capable of capturing temporal deviations is therefore
required to deal with this new situation.
The temperature behavior given in Figure 4.2.1d is captured in more detail in Figure
4.2.2a. The rise of the temperature of the warmer chip starts
at a certain time, as shown in Figure 4.2.2a. Since the warmer chip will only overheat after a certain point in time,
both chips can be safely tested with schedule S1 up to that point. At that point, the actual
temperature of the chip under test can be obtained via sensors. The
actual temperature can then be compared to a Threshold and the following
two different situations can be identified: the temperature is at or below the
Threshold, which indicates a chip that behaves normally, or the temperature is
above the Threshold, which indicates a chip that is warmer than expected.
For the rest of the test, after that point, two dedicated schedules, S2 and S3, are
generated in the offline phase for the normal chip and the warmer chip, respectively. Therefore, in
the online phase the test of the normal chip continues with schedule S2, as in Figure
4.2.2a, and the test of the warmer chip continues with schedule S3, as in Figure 4.2.2b.
In this illustrative example, at the end of S1, the schedule branches
to either S2 or S3 based on the actual temperature. This information and the
branching condition can be captured in a branching table, B1, in Figure
4.2.2. As shown in Figure 4.2.2a, the normal chip is tested initially with S1 and then
with S2, while, as shown in Figure 4.2.2b, the warmer chip is initially tested with S1 and
then with a more conservative schedule, S3.
The segments of the schedule which are executed sequentially without
branching are called linear schedules. An adaptive test schedule consists
therefore of a number of branching tables in addition to multiple linear
schedule tables. Note that the original test sequences are saved elsewhere
in an intact order without being duplicated.
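The relation between linear schedule tables and branching tables can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the class names, the `read_sensor` callback, the table rows, and the threshold of 80.0 are hypothetical, and the sketch covers the single-sensor, two-way branch of Figure 4.2.2 rather than the general multi-core case.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class LinearSchedule:
    """One linear schedule table: (time, state) rows plus an optional branch."""
    name: str
    rows: List[Tuple[int, str]]            # (time step, "testing" or "cooling")
    branching_table: Optional[str] = None  # table consulted when the rows end


@dataclass
class BranchingTable:
    """Maps a temperature condition to the next linear schedule table."""
    name: str
    threshold: float
    at_or_below: str  # next table when temperature <= threshold
    above: str        # next table when temperature > threshold


def run_adaptive_schedule(schedules, branches, start, read_sensor):
    """Traverse the schedule tree from `start` to a leaf; return visited tables."""
    visited = []
    current = schedules[start]
    while True:
        visited.append(current.name)
        if current.branching_table is None:
            return visited  # a leaf of the schedule tree was reached
        table = branches[current.branching_table]
        temperature = read_sensor()  # one sensor access per branching table
        if temperature <= table.threshold:
            current = schedules[table.at_or_below]
        else:
            current = schedules[table.above]


# Hypothetical tables shaped like the example of Figure 4.2.2.
SCHEDULES = {
    "S1": LinearSchedule("S1", [(0, "testing"), (1, "cooling"), (2, "testing")], "B1"),
    "S2": LinearSchedule("S2", [(3, "testing"), (4, "cooling")]),
    "S3": LinearSchedule("S3", [(3, "cooling"), (4, "testing"), (5, "cooling")]),
}
BRANCHES = {"B1": BranchingTable("B1", threshold=80.0, at_or_below="S2", above="S3")}
```

Because the online phase only follows table references and performs one comparison per branching table, the run-time work stays small, which matches the intent of moving the complexity into the offline phase.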
Although the above illustrative example was about a single-core design,
the focus of this thesis is on multi-core SoCs. It is assumed that, due to the
intra-die process variation, each core has its own thermal behavior similar
to what is described above for a chip. Moreover, multi-core designs usually
are affected by lateral heat dissipation among the cores and also by the
limited test bus width which is shared by different cores.
Figure 4.2.2 Schedule and branching tables
Temperature curves when there is time-variant temperature error (a) when both chips are tested with
linear schedules S1 and S2; (b) when, by referring to the branching table, B1, the test of the warmer chip continues
with linear schedule S3 after the branching moment. (Curves are only illustrative.)
The figure also shows the linear schedule tables S1, S2, and S3, each with the columns Time, State, and
Branching Table ID (only one entry, in S1, refers to B1), and the following branching table.
Branching Table B1
Condition                   Linear Schedule Table ID
Temperature ≤ Threshold     S2
Temperature > Threshold     S3
Temperature curves for a SoC with four cores, as an example, are given in
Figure 4.2.3 [He08a]. It shows how the temperatures of the different cores
change over time. For a given core, when it is tested, its temperature
increases. When a core is not tested, there are no switching activities, and
it starts to cool down, as shown by the temperature curve going down.
To guarantee thermal safety, testing is interrupted when a core reaches the
high temperature threshold. As shown in Figure 4.2.3, more than one core
may be tested at the same time (e.g., the temperatures of both core 1 and
core 3 rise during the same interval because of testing). Cores will utilize the
available TAM which is freed during the cooling intervals of other cores
(e.g., core 1 utilizes a cooling interval of core 3) [He08a].
4.3 Problem Formulation
The goal is to generate an efficient adaptive test schedule, offline. This is
formulated as an optimization problem. The input consists of a SoC design
with its set of cores and their corresponding test sequences and their
switching activities. The floor plan, the thermal parameters, the static
power parameters, and the dynamic power parameters for the chip are
given as inputs. The statistical data that models the temperature deviations
are also given as input. The adaptive test schedule should be generated to
minimize the test application time and the probability of overheating.
These objectives are captured by a cost function which expresses the
overall efficiency of the generated test schedule, as discussed in the
following.
Figure 4.2.3 Temperature curves for a four core chip under test
The test schedule should be generated under two constraints. The first
constraint is the available test bus width. The test bus width limits the
number of cores that can be tested in parallel. The second constraint is the
available Automatic Test Equipment (ATE) memory which limits the size
and the number of the linear schedule tables and branching tables. It is
assumed that the available memory after loading the test patterns will be
utilized for storing the schedule and, therefore, the amount of memory
dedicated to the schedule will not introduce new costs.
In this thesis a comprehensive cost function is defined by combining the
cost of the overheated chips and the cost of the test facility operation, as
follows:
\[ \text{Cost} = \frac{\text{Cost of the Test Facility per time unit}}{\text{Test Throughput}} + \text{Price of One Chip} \times \frac{\text{Test Overheating Probability}}{1 - \text{Test Overheating Probability}} \qquad (4.3.1) \]
The first term in the cost function is related to the test facility operation
cost, which is defined as the operational Cost of the Test Facility per time
unit divided by the Test Throughput. The cost of the test
facility operation per time unit depends on the cost of the ATE machines,
their maintenance costs, and other operational costs. The test throughput
captures the applied test size per time unit and is explained later.
The second term of the cost function is related to the cost of the overheated
chips, which is the product of the Price of One Chip and the
expected number of overheated chips. The expected number of overheated
chips is calculated based on the Test Overheating Probability, which
represents the number of overheated chips per number of chips entering
the test facility.
In equation 4.3.1 the test overheating probability is divided by
its complement (one minus the test overheating probability) in order to give
the expected number of overheated chips per number
of non-overheated chips. The cost of the test facility per time unit
and the price of one chip depend on the particular manufacturing
and test facility and on the particular SoC. To have a simple model for the
test throughput, assume that the given test facility is characterized by
its overall Effective Test Time per Second and Test Handling
Time.
1 A list of notations and abbreviations is provided in section 4.11.
The effective test time per second is the total test time that the test facility
provides per second. For example, if there are two ATE machines working in parallel,
the effective test time per second could be as high as two. It therefore depends on the
number and specification of the ATE machines and possibly other test
facility specifications. The test handling time represents the wasted time
during which chips are not actually under test (e.g., placing, connecting, and
detaching the chips) and therefore, it depends on the test facility
specifications. The test throughput, which depends on the Applied Test
Size and the Test Application Time, is calculated as:
(4.3.2)
In order to gain a better understanding of the test throughput, the
Normalized Test Throughput is defined by normalizing the test
throughput to the effective test time per second and assuming
that the test handling time is negligible, as follows:
(4.3.3)
The normalized test throughput is proportional to the applied test
size divided by the test application time. It is also proportional to the
percentage of the chips that have completed the test without overheating.
Therefore a large test application time and a large test overheating probability
will result in a small test throughput and consequently the cost component
related to the test facility operation will be higher.
In this thesis, the parameters that do not depend on the test schedule
are considered to be constants. The cost function is then
normalized so that all constants are lumped into one new constant, the
Balancing Coefficient. The result is the Normalized Cost Function,
which is expressed as:
(4.3.4)
The balancing coefficient is in direct proportion to the price of one
chip and in inverse proportion to the cost of the test facility per time
unit. The first term in the above equation captures the test facility
operation cost.
The second term captures the balanced cost of the overheated chips and is
proportional to the test overheating probability and the balancing
coefficient. The balancing coefficient balances the cost of the overheated
chips against the cost of the test facility operation. Expensive chips will
result in a larger balancing coefficient and an expensive test facility will
result in a smaller balancing coefficient.
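The cost model of this section can be tried out numerically with the following minimal sketch. It rests on assumptions that should be checked against equations 4.3.1 to 4.3.4: the abbreviations (`ctf`, `pc`, `top`, `tat`, `ats`, `etts`, `tht`, `bc`) are hypothetical names built from the initials of the quantities defined above, the throughput expression is one plausible reading of the description, and the normalized form assumes that the test handling time is negligible.

```python
def throughput(etts, ats, tat, tht, top):
    """Assumed test throughput: applied test size of non-overheated chips per time unit.

    etts: effective test time per second, ats: applied test size,
    tat: test application time, tht: test handling time,
    top: test overheating probability.
    """
    return etts * ats * (1.0 - top) / (tat + tht)


def cost(ctf, pc, tt, top):
    """Cost as described for equation 4.3.1.

    ctf: cost of the test facility per time unit, pc: price of one chip,
    tt: test throughput, top: test overheating probability.
    """
    return ctf / tt + pc * top / (1.0 - top)


def normalized_cost(tat, top, bc):
    """Assumed normalized cost with all schedule-independent constants in bc."""
    return tat / (1.0 - top) + bc * top / (1.0 - top)


# Hypothetical numbers, chosen only to show that the two forms agree
# up to a constant factor when the handling time is zero.
ETTS, ATS, TAT, THT, TOP = 2.0, 1000.0, 50.0, 0.0, 0.1
CTF, PC = 4.0, 20.0
TT = throughput(ETTS, ATS, TAT, THT, TOP)
BC = PC * ETTS * ATS / CTF  # grows with chip price, shrinks with facility cost
SCALE = ETTS * ATS / CTF    # constant factor removed by the normalization
```

Under these assumptions, `cost(CTF, PC, TT, TOP) * SCALE` equals `normalized_cost(TAT, TOP, BC)`, so minimizing either expression selects the same schedule.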
4.4 Temperature Error Model
As previously defined, temperature error is the difference between the
expected temperature (can be estimated by simulation) and the actual
temperature (can be measured by sensors). This error can be categorized
into spatial temperature error and temporal temperature error. Spatial
temperature error shows that different cores have different temperature
errors while the temporal temperature error shows that the same core has
different errors at different times.
A temperature error model gives the probabilities of the temperature errors
for every core in every test cycle. The spatial error model gives the initial
error distribution and then the temporal error model is used to recursively
estimate the error distribution for the next cycle.
For example, a spatial temperature error model which consists of a discrete
distribution may show that at the very beginning of the test the probability of
a particular error value in core 1 is 0.001 while the probability of the
same error in core 2 is 0.02. The spatial error model is specified using a
look up table which is assumed to be given as one of the inputs. Assuming
that the error for a SoC design may range from to by a
resolution of , the number of the look up table entries ( ) would be
80 for a core and for a SoC with cores.
The temporal temperature error model is assumed to be a discrete-time
model which means that the temperature error is fixed during a period and
then it changes discretely from one period to the next. Therefore, the
temporal temperature error model specification has two parts, the period
which is called temporal error period and a table of error change
probabilities. The temporal temperature error table gives the probability of
a particular change in error.
For example, a temporal temperature error model shows that the
probability that the error increases by is 0.015. Assume that the
temporal error period is and the error is measured to be at time
0, as shown in Figure 4.4.1. The error will remain up to (
). Then after the exact error is not known
any more. However the probability of a certain error can be estimated using
the temporal error model. In this example, the probability of a temperature
error equal to , between and is
0.015. Without a measurement at , the only available information is
that the probability of a temperature error equal to
is 0.015 × 0.015, between and . In Figure 4.4.1, a new
measurement is done at time and the actual error is .
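The recursive use of the temporal error model can be illustrated with a small Python sketch. It assumes that error changes in successive temporal error periods are independent, which is what the product 0.015 × 0.015 in the example suggests. The integer error units, the starting error of 2 units, and the complementary probability of 0.985 for "no change" are hypothetical values chosen only for illustration.

```python
from collections import defaultdict


def propagate(error_dist, change_probs):
    """Advance an error distribution by one temporal error period.

    error_dist:   {error value: probability} at the start of the period
    change_probs: {change in error: probability} from the temporal error table
    """
    next_dist = defaultdict(float)
    for error, p_error in error_dist.items():
        for change, p_change in change_probs.items():
            next_dist[error + change] += p_error * p_change
    return dict(next_dist)


# A measurement makes the error known exactly (probability 1.0).
measured = {2: 1.0}
# Hypothetical temporal error table: +1 unit with probability 0.015, else unchanged.
changes = {1: 0.015, 0: 0.985}

after_one_period = propagate(measured, changes)
after_two_periods = propagate(after_one_period, changes)
```

With these inputs, the probability of one increase after the first period is 0.015 and the probability of two consecutive increases after the second period is 0.015 × 0.015. A new measurement would replace the distribution by a single value with probability one.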
The size of the temperature error data set, given as input, might be quite
large. In such a case it is necessary to extract a smaller set of data which is
representative of the original data in accordance with the accuracy and
speed requirements. This is done by clustering the errors into error clusters.
The error clusters are characterized by temperature Error-cluster Borders.
The temperature error range, resolution, and error clusters are
assumed to be identical for all cores in this thesis.
The Temperature Error Values and the Spatial
Temperature Error Probabilities are
original inputs which are given for a SoC with cores for temperature
Figure 4.4.1 An example for temporal temperature error probabilities
error samples. The Temporal Temperature Error Probability ( ) is the
other input and it gives the probability for a certain change in the error
value. The probability that the temperature error value changes from
to is
(4.4.1)
The error clustering is assumed to be uniform and the error-clusters
borders, , are identical for all cores. Assuming error
clusters, the size of the original data set reduces to . Error clustering
will divide the -dimensional error space into error cells indexed using
Cartesian system (i.e., ). For example, assume that for a
SoC with two cores, each core has two error clusters. The 2-dimensional
error space is divided into four error cells, indexed with , , ,
and . While the original size of the error space is , the number of
error cells is . Assuming and , the original size is
while the size of the clustered error space, with , is .
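The division of the error space into error cells can be sketched as follows. The function names are hypothetical, cells are indexed from zero here for convenience, and the choice that an error lying exactly on a border belongs to the upper cluster is an assumption.

```python
from bisect import bisect_right
from itertools import product


def cluster_index(error, borders):
    """Index of the error cluster containing `error`.

    `borders` holds the sorted interior borders, so k borders give k + 1 clusters.
    """
    return bisect_right(borders, error)


def error_cell(core_errors, borders):
    """Cartesian index of the error cell for one error value per core."""
    return tuple(cluster_index(e, borders) for e in core_errors)


def all_error_cells(num_cores, num_clusters):
    """Every error cell of a SoC whose cores share the same error clusters."""
    return list(product(range(num_clusters), repeat=num_cores))


# Two cores with two error clusters each, as in the example above.
cells = all_error_cells(num_cores=2, num_clusters=2)
```

The number of cells is the number of clusters raised to the power of the number of cores, which is why a small number of clusters keeps the clustered error space manageable.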
4.5 Adaptive Test Scheduling
The proposed adaptive method is based on the on-chip temperature sensors
implemented on each core. During test, the actual temperatures of selected
cores are read at certain selected moments. A group of chips with similar
thermal behavior which are tested with the same schedule is called a chip
cluster. During the test, chips are dynamically classified into one of the
chip clusters and are tested using its corresponding schedule. The chip
clusters vary during the test, and at every adaptation moment (time moment
corresponding to a certain branching table) the chip clusters change into a
new scheme which is suitable for the new situation.
The parameters that affect the efficiency of the adaptive method are the
moments when branching/adaptation happens, the number of edges (i.e.,
linear schedule tables) and the branching conditions (i.e., chip clustering).
For the example in Figure 4.2.2, the adaptation happens at the end of S1, the
number of edges is two (two linear schedule tables, S2 and S3), and the
branching condition is a comparison of the temperature with the Threshold.
Since the possible branching moments are multiples of the temporal error
period, the first design decision is whether to branch or not at a possible
node in a schedule tree. This design decision will be merged with the
second design decision which is the number of edges (i.e., the number of
chip clusters). The third design decision is the chip clustering for nodes.
These problems are summarized into the following two sub-problems.
1. How many chip clusters, at each possible node in the schedule tree, are
suitable? The special case of one edge implies no branching, no sensor
reading, and no extra effort.
2. What is the proper chip clustering into the given number of chip
clusters? The number of chip clusters is known from the previous
question. Depending on the chip clustering some cores may not need
sensor readout.
The second question is only relevant when the answer to the first question
is larger than one. The above questions are then formulated in two different
forms: the first question is described as a tree topology and the second
question as the chip clustering for the nodes of that tree topology.
A candidate schedule tree is generated by combining a candidate tree
topology with a candidate chip clustering. The number of candidate tree
topologies and the number of alternative chip clusterings grow very fast
with parameters like temporal error resolution and the number of cores.
Since the number of candidate trees is the product of the tree topology
alternatives and the chip clustering alternatives, the search space is so huge
that ordinary search approaches would not work fast enough. Therefore a
constructive method is suggested to deal with this high complexity.
The schedule tree is constructed by adding small partial trees to its leaves.
These small partial trees which are the building blocks of the schedule tree
are called sub-trees. A sub-tree consists of a small number of linear
schedules and branching tables which makes it possible to be clustered and
optimized (scheduled) at once. The tree that is under construction with
unfinished tests is called an unfinished tree.
For example, assume that there is an unfinished tree, Tree 1, as shown in
Figure 4.5.1a. The linear schedule tables of Figure 4.2.2 correspond to the
edges of Tree 1 while the branching table corresponds to node 1, as shown
in Figure 4.5.1a. Two sub-trees with one and with two edges are shown in
Figure 4.5.1b. Tree 1 has two leaves and combinations of the sub-trees are
added to them in order to generate the offspring as shown in Figure 4.5.1c.
Offspring 2, for example, is generated by attaching the Sub-tree 1 to node
2 of Tree 1 and attaching the Sub-tree 2 to node 3 of Tree 1.
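The generation of offspring can be sketched as an enumeration of all assignments of sub-tree topologies to the active leaves. This is a simplified illustration: the function name and the dictionary representation are hypothetical, and the scheduling and clustering of each attached sub-tree are left out.

```python
from itertools import product


def generate_offspring(leaves, subtree_topologies):
    """All ways of attaching one sub-tree topology to every active leaf.

    Returns a list of {leaf node: sub-tree topology} assignments.
    """
    offspring = []
    for choice in product(subtree_topologies, repeat=len(leaves)):
        offspring.append(dict(zip(leaves, choice)))
    return offspring


# Tree 1 has the leaves 2 and 3; two sub-tree topologies are available.
offspring = generate_offspring(
    leaves=[2, 3],
    subtree_topologies=["Sub-tree 1", "Sub-tree 2"],
)
```

For two leaves and two topologies this yields four assignments, one of which attaches Sub-tree 1 to node 2 and Sub-tree 2 to node 3. The count grows exponentially with the number of active leaves, which is one reason why only a few promising trees are kept between iterations.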
The proposed constructive algorithm is shown in Figure 4.5.2. The inputs
to the algorithm include the switching activities of the tests in order to
compute the dynamic power, the thermal error model in order to estimate
the temperature errors, and the thermal model of the chip in order to predict
the temperatures.
Furthermore, the algorithm requires the electrical model of the chip in
order to compute the static power and the dynamic power and in order to
be informed about the test bus width limit. The test facility specifications
are also inputs to the algorithm; they provide the knowledge of the
available ATE memory, delay overheads, and the balancing coefficient
(see equation 4.3.4).
The algorithm starts with an initialization phase, as shown in Figure 4.5.2.
Here, the unfinished tree, sub-tree topologies, temperature error model, and
thermal simulator are initialized. Then it proceeds with constructing the
schedule tree out of the sub-trees as will be explained in section 4.5.1. The
linear schedule tables are discussed in section 4.5.2. The sub-tree
evaluation is explained in section 4.5.3. The sub-tree scheduling which is
based on an optimization heuristic is explained in section 4.5.4.
4.5.1 Tree Construction
The schedule tree construction starts with a root node and in each iteration
an unfinished tree extends and multiplies by adding alternative
combinations of sub-trees to its active leaf nodes, as shown in Figure 4.5.1.
Then, a small number of promising under-construction trees are selected
as unfinished trees from the offspring list to be used in the next iteration.
Figure 4.5.1 Constructive method
Main components are (a) unfinished tree (Tree 1), (b) sub-tree topologies (Sub-tree 1 and Sub-tree 2), and
(c) offspring trees (Offspring 1, 2, and 3). For S1, S2, S3, and B1 in (a) refer to Figure 4.2.2.
For example, an unfinished tree list will be selected from the offspring list
(partially shown in Figure 4.5.1c) to go on with. The algorithm, as shown
in Figure 4.5.2, ends when all the unfinished trees have completed the test.
The selection process keeps the ATE memory constraint satisfied by not
selecting the candidates that will exceed the memory limit. A naïve
algorithm will have a tendency to create many edges in all iterations at the
beginning since this reduces the cost. As a result, the naïve
algorithm will put many edges near the root of the tree and later on, as the
memory fills up, there will not be any possibility to add a new edge. In order
to provide the algorithm with the freedom to put more edges in the more
beneficial regions, the selection in our proposed algorithm is done based
on the Scaled Cost Function as defined in the following.
Figure 4.5.2 The proposed constructive method
Flowchart. Inputs: thermal model, test switching activities, thermal error model, electrical model, and
test facility specification. Steps: (1) initialize the unfinished trees (shown in Figure 4.5.1a), the
sub-tree topologies (shown in Figure 4.5.1b), the thermal simulator (discussed in section 4.6), and the
temperature error model (discussed in section 4.4); (2) generate offspring trees (shown in Figure 4.5.1):
perform core clustering for errors and schedule the sub-trees AL1.StT1, AL1.StT2, AL2.StT1, ..., ALlast.StTlast
(discussed in section 4.5.4), then connect the scheduled sub-trees (ALi.StTj) to the corresponding leaf
node (ALi) in order to generate all possible combinations (discussed in section 4.5.1); (3) select the
unfinished trees from the offspring trees list using equation 4.5.1 (discussed in section 4.5.1); (4) if
there is any active leaf in the unfinished trees, continue with step 2, otherwise select the final schedule
tree from the offspring trees list using equation 4.3.4 (discussed in section 4.5).
ALi.StTj denotes the j-th Sub-tree Topology to be connected to the i-th Active Leaf node of the unfinished tree.
\[ \text{Scaled Cost Function} = \text{Normalized Cost Function} \times (\text{number of nodes} + \text{adjusting offset}) \qquad (4.5.1) \]
The normalized cost function (equation 4.3.4) is scaled by the tree’s
number of nodes plus an adjusting offset. Now, adding nodes to the tree is
only beneficial if it gives a reasonable cost reduction, otherwise a smaller
tree may get a lower scaled cost and manage to survive to the next iteration,
while bigger trees are discarded. In general, bigger trees will have a smaller
normalized cost but not necessarily a smaller scaled cost. The effect of the number of nodes
is adjusted by the adjusting offset. A small adjusting offset promotes
having fewer edges compared to a large adjusting offset which promotes
having more edges. In other words, a larger adjusting offset reduces the
sensitivity to the number of nodes. An extremely large adjusting offset
means that the number of nodes has almost no effect on decision making
while the normalized cost function dominates the decision making process.
To satisfy the memory constraint, when an unfinished tree is selected based
on its scaled cost function, it is scheduled for the rest of the test using only
linear schedule tables, which means no further branching. During this
scheduling, the linear scheduling aborts as soon as the memory limit is
violated. If the linear scheduling succeeds in respecting the memory limit,
the candidate survives to the next iteration. Otherwise, the currently chosen
unfinished tree is discarded and the next candidate, with a larger or equal
scaled cost, is tested for its compliance with the memory constraint. The
scheduling fails if no candidate can meet the memory constraint,
meaning that the limit is too tight even for a linear schedule.
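The selection step can be sketched as follows, assuming that "scaled by" in equation 4.5.1 means multiplication by the number of nodes plus the adjusting offset. The dictionary keys, the `keep` parameter, and the precomputed `linear_completion_size` (the schedule size if the tree is completed without further branching) are hypothetical simplifications of the linear scheduling check described above.

```python
def scaled_cost(normalized_cost, num_nodes, adjusting_offset):
    """Assumed form of the scaled cost: cost times (nodes + offset)."""
    return normalized_cost * (num_nodes + adjusting_offset)


def select_unfinished_trees(candidates, adjusting_offset, memory_limit, keep):
    """Pick up to `keep` candidates with the lowest scaled cost that fit in memory.

    Each candidate is a dict with the keys 'name', 'ncf', 'nodes', and
    'linear_completion_size'.
    """
    ranked = sorted(
        candidates,
        key=lambda c: scaled_cost(c["ncf"], c["nodes"], adjusting_offset),
    )
    selected = []
    for candidate in ranked:
        # Discard candidates that violate the memory limit even without branching.
        if candidate["linear_completion_size"] <= memory_limit:
            selected.append(candidate["name"])
        if len(selected) == keep:
            break
    if not selected:
        raise ValueError("memory limit is too tight even for a linear schedule")
    return selected


# Hypothetical offspring list.
CANDIDATES = [
    {"name": "A", "ncf": 100.0, "nodes": 3, "linear_completion_size": 40},
    {"name": "B", "ncf": 90.0, "nodes": 7, "linear_completion_size": 120},
    {"name": "C", "ncf": 95.0, "nodes": 4, "linear_completion_size": 60},
]
```

With a small offset of 2 the compact tree A ranks first, while a very large offset such as 1000 ranks the trees almost purely by normalized cost, so B would rank first if it did not exceed the memory limit of 100 used in this example.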
4.5.2 Linear Schedule Tables
A linear schedule table captures a schedule without branching. The linear
schedule table entries (start/stop times for each and all cores) are optimized
in the offline phase to reduce the probability of overheating. The
temperatures are checked frequently in order to keep the overheating
probability small.
The start/stop states in the linear schedule tables are generated using the
heuristic proposed in [He08a]. According to this heuristic, the test of the
cores with lower temperature and higher remaining test size will be started
or resumed earlier. Activating the cores with lower temperatures is
desirable because it provides longer testing intervals and therefore reduces
the number of test partitions and their corresponding overheads.
Moreover, by choosing the colder cores while the effect of adjacent cores
is taken into account by temperature simulation, the algorithm in fact
activates the cores which are far from the currently active cores. This will
save the newly activated cores from the accumulated heat in their possible
neighbors and furthermore, by not activating the adjacent cores, the newly
deactivated cores will experience faster cooling. The heuristic also gives an
advantage to the cores with longer remaining tests, thus maximizing the
interleaving opportunities. Besides, the situation in which a long test
sequence leads to a long total test application time is avoided.
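A possible reading of this priority rule is sketched below. How [He08a] combines the two criteria is not stated here, so the lexicographic ordering (temperature first, remaining test size as a tie-breaker), the greedy filling of the test bus, and all field names are assumptions made for illustration.

```python
def pick_cores_to_activate(cores, bus_width, temperature_limit):
    """Greedy pick: colder cores first, ties broken by larger remaining test size.

    Each core is a dict with the keys 'name', 'temperature', 'remaining',
    and 'tam_width'. Cores at or above the limit are left to cool.
    """
    eligible = [
        c for c in cores
        if c["remaining"] > 0 and c["temperature"] < temperature_limit
    ]
    eligible.sort(key=lambda c: (c["temperature"], -c["remaining"]))
    active, used_width = [], 0
    for core in eligible:
        # Respect the shared test bus width constraint.
        if used_width + core["tam_width"] <= bus_width:
            active.append(core["name"])
            used_width += core["tam_width"]
    return active


# Hypothetical snapshot of a four-core SoC.
CORES = [
    {"name": "c1", "temperature": 60.0, "remaining": 500, "tam_width": 8},
    {"name": "c2", "temperature": 45.0, "remaining": 200, "tam_width": 8},
    {"name": "c3", "temperature": 45.0, "remaining": 900, "tam_width": 16},
    {"name": "c4", "temperature": 95.0, "remaining": 300, "tam_width": 4},
]
active_cores = pick_cores_to_activate(CORES, bus_width=24, temperature_limit=90.0)
```

In this snapshot c4 is too hot to be tested, c3 and c2 are the coldest cores and together fill the 24-bit bus, and c1 has to wait for a later decision point.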
As mentioned before, each chip cluster is tested with a dedicated linear
schedule. Every chip cluster is represented by a single error value which
will be used to estimate the actual temperature based on the simulated
temperature; this error value is called representative temperature error. The
estimated temperature is updated periodically by correcting the cores’
simulated temperatures with the representative temperature error. The
estimated temperature is then used to compute the static power and to
determine the ‘state’ of the cores (i.e., testing or cooling).
For example, assume that there are two chips in a certain chip
cluster and the chips consist of only one core. Therefore, at a certain
moment in time, there are two error values corresponding to the
two chips. But the linear scheduling heuristic works with one error value
for one chip cluster. Therefore, the representative temperature error,
which is a real number, is defined as a value which represents the chips'
error values.
The representative temperature error is updated periodically with the
temporal error period (see section 4.4) while the estimated temperature,
static power, and state of the cores are updated more frequently. After
updating the state of the cores, the dynamic power sequence is computed.
The initial temperatures are available as the results of the previous
temperature simulation. Having dynamic and static power sequences in
addition to the initial temperatures, the next temperature simulation is
performed.
The representative temperature error for a chip cluster is viewed as a safety
margin in [Aghaee10] and its optimal value is experimentally computed
for a number of examples. Those experiments suggest that the optimal
value for a representative temperature error is equal to the border between
the chip cluster and the adjacent chip cluster that has larger error (i.e.,
hottest possible chip in the chip cluster). This is true for all chip clusters
except the last one that has the largest error. For example, for a chip cluster
stretching from a lower border to an upper border, the upper border would be a good
choice for the representative temperature error of this chip cluster. The
representative temperature errors are assigned in a similar way in this
thesis.
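The assignment of representative temperature errors and their use in estimating temperatures can be sketched as follows. The function names are hypothetical, and the sign convention (a larger error means a hotter chip, so the error is added to the simulated temperature) is an assumption inferred from the description of the hottest possible chip in a cluster.

```python
def representative_errors(borders, last_cluster_error):
    """One representative temperature error per chip cluster.

    `borders` are the sorted borders between adjacent chip clusters. Every
    cluster except the last is represented by its upper border; the value for
    the last cluster is supplied separately (for example by an optimizer).
    """
    return list(borders) + [last_cluster_error]


def estimated_temperature(simulated, representative_error):
    """Correct a simulated temperature with the cluster's representative error."""
    return simulated + representative_error


# Hypothetical borders (in degrees) splitting the chips into three clusters.
errors = representative_errors(borders=[-2.0, 3.0], last_cluster_error=7.5)
estimate = estimated_temperature(simulated=70.0, representative_error=errors[1])
```

Using the upper border makes the estimate conservative for every chip in the cluster, since no chip in that cluster has a larger error than the border.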
To have an example from a different point of view, assume that in total
there are four chips and the chips consist of only one core.
Therefore, at a certain moment in time, there are four error values
corresponding to the four chips. Assume that the chip-clustering algorithm (explained in
section 4.5.4) has generated two chip clusters. The
representative temperature error for the chip cluster that has smaller errors
is the border between the two clusters, and the representative temperature error for the
last chip cluster is formulated as an optimization variable along with
the chip-cluster borders in the chip-clustering algorithm. This is
particularly important when the number of chip clusters is small. This is
usually the case² and therefore the cluster on the high temperature extreme
will contain a non-negligible number of chips.
The sub-tree optimization method encodes the problem based on chip-
clusters borders. The representative temperature errors are defined as chip-
clusters borders for all chip clusters but the last one. For the last error
cluster (one with the largest errors), the representative temperature errors
are encoded along with the chip-clusters borders as the sub-tree
optimization variables. This will be explained in more details in section
4.5.4.
The optimization problem for a linear schedule table is to minimize the
partial normalized cost function by finding the proper start/stop times. This
is done based on the heuristic proposed in [He08a]. The utilized test bus
width is the sum of the TAM widths of the active cores for tests which
utilize the TAM. The schedule size is the product of the number of the
linear schedule table entries and the record size. The schedule tree is
equivalent to a number of linear schedule tables (edges) in addition to a
number of branching tables (nodes), as shown in Figure 4.5.1a. The linear
schedule table is explained above and the rest of the construction process
will be explained in the following sections.
2 Refer to [Aghaee10].
4.5.3 Sub-Tree Evaluation
The schedule tree is constructed by attaching sub-trees to the leaves of the
unfinished trees (See Figure 4.5.1). For this purpose, the proper schedule
for a sub-tree topology should be found. In a sense, a sub-tree is a tree and
the cost function introduced in section 4.3 should be usable. However,
there is a subtle difference between their objectives. For the schedule tree
the objective is its very own cost. For a sub-tree the objective is, on the
contrary, the cost of the schedule tree that is to be constructed. Therefore,
the cost of the final schedule tree should be estimated assuming that this
particular sub-tree is used in its construction. This makes the cost
evaluation different for the sub-trees.
To find the near-optimal schedule for a sub-tree topology, a partial cost
function must be used for different sub-tree clustering alternatives. For the
evaluation of the cost function (equation 4.3.4), the expected values of the
test application time, the applied test size, and the test overheating
probability should be computed by utilizing the temperature error
statistics.
The expected values are computed while each edge is being scheduled. In
the formulation of the schedule tree, an edge is represented by its
destination node. Assuming that the number of nodes is , the Nodes’
Probabilities , the Nodes’ Applied Test Sizes
, and the Nodes’ Test Application Times
are used to compute the expected applied test size and the expected
test application time as follows:
(4.5.2)
(4.5.3)
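The two expected values can be read as probability-weighted sums over the nodes, where each node stands for its incoming edge. The sketch below illustrates that reading only; the names `node_prob`, `node_ats`, and `node_tat` and all numbers are hypothetical and are not the notation or data of this thesis.

```python
def expected_value(node_prob, node_value):
    """Probability-weighted sum over the nodes (a node represents its incoming edge)."""
    if len(node_prob) != len(node_value):
        raise ValueError("one value per node is required")
    return sum(p * v for p, v in zip(node_prob, node_value))


# Hypothetical sub-tree: one root edge followed by two branches.
node_prob = [1.0, 0.7, 0.3]   # probability that a chip traverses each edge
node_ats = [400, 250, 100]    # test size applied on each edge (illustrative units)
node_tat = [500, 300, 350]    # time spent on each edge, cooling included

expected_ats = expected_value(node_prob, node_ats)
expected_tat = expected_value(node_prob, node_tat)
```

Only the ratio of the two results enters the first term of the cost function, so the units chosen for the illustrative values do not matter.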
In order to explain the expected test overheating probability and to
understand how node probabilities are computed, the notions of node
clustering and error cells are introduced here. The temperature errors of the
cores constitute an error space that has one dimension per core. For
example, in Figure 4.5.3 there are two cores and therefore the error space
is two-dimensional. The horizontal axis represents the error values of the
first core and the vertical axis represents the error values of the second
core. There are four error clusters for each core and therefore there are
sixteen error cells in Figure 4.5.3.
This is specifically important for the nodes at which branching takes place.
Branching at a node is, in fact, a clustering of the chips into a number of
groups, so that each chip cluster corresponds to an exclusive edge that
branches out of that node. Chips are identified by their cores’ errors and
therefore a chip clustering is a partitioning of the multi-dimensional error
space into a number of chip clusters.
This means that a chip cluster is a combination of specific error intervals
of the cores. A candidate ‘sub-tree clustering’ is a set of chip clustering
alternatives for the nodes. Equivalently, a candidate ‘sub-tree clustering’
could be viewed as a set of nodes’ clustering alternatives for a sub-tree
topology. An error cell is a cell in the multi-dimensional error space,
separated by the cores’ error-cluster borders, and therefore its projection
on a core’s error axis is an error cluster for that core. A node clustering
could thus be seen as assigning error cells to chip clusters or, equivalently,
labeling error cells with chip clusters. An example of the labeling of the
error cells is shown in Figure 4.5.3. There are two cores in the figure, and
the numbers (“0” or “1” in this case) inside the rectangular error cells are
the labels.
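The labeling of error cells can be pictured as a small table keyed by the per-core error-cluster indices. The sketch below rebuilds the cell counts described for Figure 4.5.3 (two cores, four error clusters per core). The rule used to pick the cells of chip cluster 0 is an assumption made for illustration, and `label_cell` and `border` are hypothetical names.

```python
from itertools import product

CLUSTERS_PER_CORE = 4  # error clusters per core, as in Figure 4.5.3
NUM_CORES = 2


def label_cell(cell, border=1):
    """Assumed rule: chip cluster 0 holds the cells whose error-cluster index
    is at most `border` on every core; all other cells go to chip cluster 1."""
    return 0 if all(index <= border for index in cell) else 1


# One error cell per combination of per-core error clusters.
cells = list(product(range(CLUSTERS_PER_CORE), repeat=NUM_CORES))
labeling = {cell: label_cell(cell) for cell in cells}

count_0 = sum(1 for label in labeling.values() if label == 0)
count_1 = len(cells) - count_0
```

With these settings there are sixteen cells, four labeled 0 and twelve labeled 1, which matches the counts given in the caption of Figure 4.5.3.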
A candidate sub-tree topology will have a number of candidate clustering
alternatives which label the nodes’ error cells with the relevant chip
clusters. Each chip cluster for a node corresponds to an edge branching out
of that node and corresponds to a linear schedule table. Each node has its
own dedicated Error-Cell Labeling. Looking from a branching node, a
succeeding node corresponds to a chip cluster and therefore it receives a
Node’s Cluster Label to represent that chip cluster (or equivalently the
preceding edge and corresponding linear schedule). This label indicates which
of the branching node’s chip clusters will lead to a certain succeeding node.
Figure 4.5.3 An example for error-cells labeling
Four error-cells are labeled with 0 that is the ID of the chip cluster number 0 and the remaining twelve
error-cells are labeled with 1 that is the ID of the chip cluster number 1.
[Grid of error cells. Horizontal axis: Core 1 error clusters (0 to 3). Vertical axis: Core 2 error clusters (0 to 3).]
The probabilities of error cells for different nodes and consequently the
probabilities of those nodes are computed based on the temperature error
model and based on the chip clusterings of the preceding nodes. In order
to speed up the computation of the Error-Cells Probabilities, the
Error Cell Change Probabilities are pre-computed as shown below,
in equation 4.5.5. The error cell change probabilities are, in fact, the
concentrated effect of the temporal error model which is repeatedly used
to compute the error-cells and nodes probabilities.
It is assumed that the variation in the probabilities inside an error cluster is
negligible. Furthermore, it is assumed that the error change probabilities
for different cores are independent. The error-cell probabilities change
from node to node and therefore most of the time the equations are about
two nodes, the origin and the destination. The error cells for the origin
node are superscripted with and for the destination node with .
is computed as follows:
(4.5.4)
is the temporal temperature error probability, is temperature
error value, and is error cluster border. is computed as follows:
(4.5.5)
The error-cell probabilities for the root node (i.e., ) are computed
based on the spatial temperature error probabilities ( ) as follows:
(4.5.6)
The error-cell probabilities for non-root nodes (i.e., ) are computed
based on the predecessor node which is denoted by . First, error-cell
probabilities just after the branching are extracted from the predecessor
node as follows:
(4.5.7)
While scheduling an edge, overheating may occur in some of the cells
(ranges of chips) which have larger temperature errors. Consequently, the
probability of these cells at the end of the edge (after the corresponding
chunk of the test is applied) is considered to be zero. The error-cell
probabilities after overheating are computed based on the
Representative Temperature Error (introduced in section 4.5.2) as
represented below, in equation 4.5.8. Overheating of a core occurs
when the core’s actual temperature, which is estimated by adding the
representative temperature error to the Simulated Temperature, exceeds the
High Temperature Threshold. A chip is considered overheated if at least
one of its cores overheats.
after overheatingafter branching
(4.5.8)
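The overheating rule behind equation 4.5.8 can be sketched as follows. This is an illustration only: the dictionary layout, the function names, and the numbers are hypothetical, and each cell is simply given one representative error per core.

```python
def chip_overheats(simulated_temps, representative_errors, threshold):
    """A chip overheats if, for at least one core, the simulated temperature
    plus the representative temperature error exceeds the threshold."""
    return any(t + e > threshold
               for t, e in zip(simulated_temps, representative_errors))


def zero_overheated_cells(cell_probs, cell_errors, simulated_temps, threshold):
    """Return a copy of the error-cell probabilities in which every cell
    that overheats on this edge has its probability set to zero."""
    return {
        cell: 0.0 if chip_overheats(simulated_temps, cell_errors[cell], threshold) else prob
        for cell, prob in cell_probs.items()
    }


# Hypothetical two-core example with four error cells.
cell_probs = {(0, 0): 0.4, (1, 0): 0.3, (0, 1): 0.2, (1, 1): 0.1}
cell_errors = {(0, 0): (2.0, 3.0), (1, 0): (8.0, 3.0),
               (0, 1): (2.0, 7.0), (1, 1): (8.0, 7.0)}
after_overheating = zero_overheated_cells(
    cell_probs, cell_errors, simulated_temps=(80.0, 85.0), threshold=90.0)
surviving_probability = sum(after_overheating.values())
```

In this example the two cells with the larger error on the second core exceed the threshold, so only the probability mass of the other two cells survives the edge.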
According to the temperature error models (introduced in section 4.4) the
error-cell probabilities, , after temporal changes are computed as:
(4.5.9)
The node’s probability, , is computed as follows:
(4.5.10)
Node’s not Overheating Probability ( ) is the probability that a chip
which corresponds to this edge according to the chip clustering scheme, is
not overheated after traversing this edge. for a node, , is computed
as follows:
(4.5.11)
Finally, error-cell probabilities, , are computed as:
(4.5.12)
Edges are scheduled by determining the linear schedule tables as explained
in section 4.5.2. Then the candidate sub-tree clustering is evaluated using
the partial cost function which is based on the expected applied test size,
the expected test application time, and the predicted test overheating
probability. The first two are already introduced in equations 4.5.2 and
4.5.3, and the last one is explained below.
Evaluation of a partial tree is in fact an attempt to predict the cost of the
completed schedule tree, based on the current situation of that partial tree.
For this purpose, it is assumed that the final schedule tree will be composed
of a number of similar partial trees (building blocks for the final schedule
tree). These partial trees are assumed to have similar expected applied test
size, expected test application time, and expected test overheating
probability. These expected values are assumed to be similar to those of
the partial tree that we are evaluating.
Therefore, a good prediction approach for the test application time and the
applied test size would be their current expected values multiplied by the
predicted total number of partial trees. Since only the ratio of the predicted
test application time to the predicted applied test size matters in the cost
function (the first term in equation 4.3.4), a good choice for predicted
values of these variables is their expected values. But the situation for the
Predicted Test Overheating Probability is different since its value
does not change linearly when a number of similar partial trees (building
blocks) are put one after the other (unlike the test application time and the
applied test size).
Assuming that there are leaves in the tree, the Leaf’s Overheating
Probability is the overheating probability for the path
from the root node to the specified leaf node. Its computation includes
multiplication over nodes that belong to the specified root-to-leaf path. The
overheating probability for leaf is computed as:
is a leaf node for all nodes, , belonging to the root–to– path
(4.5.13)
The expected test overheating, assuming a total of leaf nodes, is
computed as:
(4.5.14)
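The multiplication over a root-to-leaf path can be illustrated with a short sketch. It assumes that the path stays safe only if every edge on it is traversed without overheating, so the leaf's overheating probability is one minus the product of the per-edge not-overheating probabilities. The function name and the numbers are hypothetical, and the sketch does not cover how the leaves are combined in equation 4.5.14.

```python
from math import prod


def leaf_overheating_probability(path_not_overheating):
    """One minus the product of the not-overheating probabilities of the
    nodes (edges) on the root-to-leaf path."""
    return 1.0 - prod(path_not_overheating)


# Hypothetical path with three nodes.
lop = leaf_overheating_probability([1.0, 0.99, 0.98])
```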
The expected test overheating probability can be used in equation 4.3.4 to
evaluate a fully constructed schedule tree. For the partial cost function,
when the tree is not yet fully constructed, the predicted test overheating
probability is used instead (it replaces the expected test overheating
probability in equation 4.5.1). The predicted test overheating probability
is computed as:
(4.5.15)
Here, λ is the total number of partial trees (building blocks) that are
assumed to be similar to the current partial tree and will construct the final
schedule tree. The expected test overheating probability is computed for the
partial tree as expressed in equation 4.5.14 and then the predicted test
overheating probability is computed by assuming that these λ partial trees
have overheating rates equal to the current partial tree’s overheating rate.
λ is computed based on the expected Number of Partial Trees, which is defined
as the total test size divided by the expected applied test size for the
current partial tree.
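The non-linear growth of the predicted overheating probability can be illustrated with a short sketch. It assumes that the λ building blocks overheat independently and with the same probability, so that a chip survives the whole tree only if it survives every block. The function and parameter names are hypothetical.

```python
def predicted_overheating(expected_overheating, num_blocks):
    """Overheating probability after `num_blocks` similar partial trees are
    put one after the other, assuming independent and equal overheating rates."""
    return 1.0 - (1.0 - expected_overheating) ** num_blocks


# Hypothetical values: 1% overheating per block, ten blocks.
ptop = predicted_overheating(0.01, 10)
```

The result is slightly below ten times the per-block value, which shows why the prediction cannot simply be scaled by the number of blocks in the way the test application time and the applied test size can.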
A naïve algorithm will use the expected number of partial trees instead of λ
in equation 4.5.15. However, because of the localities in the schedule tree,
partial trees (building blocks) with a lot of cooling may exist. For these
partial trees, the expected applied test size is small and consequently the
expected number of partial trees will be estimated pessimistically. This
unrealistic estimation may result in exceedingly large predicted test
overheating probabilities, and consequently a long schedule tree
with too much cooling may receive a low cost and be selected.
Therefore, limiting the expected number of partial trees, , would be
helpful for good schedules to receive a more realistic cost. A reciprocal
limiter is used here which amplifies small inputs and attenuates large
inputs. In the proposed reciprocal limiter, the output is always one when
the input is one and the output is equal to the input in a point that is called
. The output will be always smaller than the limit which is (
). The limited output, λ, is calculated based on the input, , as follows:
(4.5.16)
A larger promotes lower overheating since the maximal value of λ
increases and also because of the increased limiter’s amplification for the
values which are less than . On the other hand, a smaller
will result in schedules with shorter test application time.
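The stated properties of the reciprocal limiter can be checked on a small example. The sketch below uses one simple reciprocal form that satisfies them, output = (F + 1) − F / input, where F is the point at which the output equals the input. This form and the names `reciprocal_limiter` and `fixed_point` are assumptions made for illustration and may differ from equation 4.5.16.

```python
def reciprocal_limiter(expected_blocks, fixed_point):
    """Assumed reciprocal form: equals 1 at input 1, equals the input at
    `fixed_point`, and stays below the limit `fixed_point + 1`."""
    if expected_blocks < 1 or fixed_point <= 1:
        raise ValueError("expected_blocks must be >= 1 and fixed_point > 1")
    return (fixed_point + 1.0) - fixed_point / expected_blocks


# Hypothetical fixed point of 5 blocks.
at_one = reciprocal_limiter(1, 5)      # stays at one
small = reciprocal_limiter(2, 5)       # amplified: larger than the input
at_fixed = reciprocal_limiter(5, 5)    # equal to the input
large = reciprocal_limiter(100, 5)     # attenuated: below the limit of 6
```

Under this form, inputs between one and the fixed point are amplified and inputs above it are attenuated, so a single parameter sets both the crossover point and the upper limit.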
At this point, the introduction to computation of the expected test
application time, expected applied test size, expected test overheating
probability, and predicted test overheating probability is completed and the
expected cost for a sub-tree could be computed using them. Therefore, the
clustering alternatives for a sub-tree topology could be evaluated using the
scaled cost (equation 4.5.1). The clustering alternatives are explored by
PSO and the best scheduled sub-tree is selected at the end. This
optimization is further discussed in the next section.
4.5.4 Sub-Tree Scheduling
As mentioned before, the schedule tree is constructed by attaching sub-
trees to unfinished trees’ leaves (See Figure 4.5.1). For this purpose, the
proper schedule for a sub-tree topology should be found. In order to
schedule a sub-tree topology which is going to be connected to the
specified leaf node of the unfinished tree ( in Figure 4.5.2) a
heuristic, as shown in Figure 4.5.4, iteratively generates alternative chip
clustering schemes and evaluates them. The evaluation is explained in
section 4.5.3 and requires the sub-trees’ edges (i.e., linear schedule tables)
to be scheduled as explained in section 4.5.2.
A chip clustering scheme for a sub-tree specifies which chips will take
which edges. The chips are specified by their cores’ errors and therefore
the problem could be seen as assigning chip clusters to the error cells
located in the -dimensional error space. The search space could be seen
as the collection of different alternatives for . For example for
a chip with two cores, the general form is and therefore, for a
sub-tree with two nodes, the solutions will be similar to the two alternatives
given in Figure 4.5.5.
A solution encoding scheme is suggested in [Aghaee11b] which labels the
error cells with chip clusters. The number of the decision variables grows
exponentially with the number of cores and therefore the computational
complexity is very high. In this thesis, we suggest a solution encoding
scheme which encodes the chip-cluster borders instead of the error cells.
For a node with succeeding chip clusters the number of decision variables
is . For chip clusters, there are chip-cluster borders and the
- value is the representative temperature error for the last cluster. Here,
the number of the decision variables grows in proportion to the number of
cores and therefore the computational complexity is much smaller
compared with the scheme suggested in [Aghaee11b].
Two examples for the suggested solution encoding for a sub-tree with only
one node, similar to sub-tree 2 in Figure 4.5.1b, are shown in Figure 4.5.6.
The solutions correspond to an SoC which has only two cores. There are
three temperature error clusters per core and the number of edges (i.e.,
number of chip clusters) in the corresponding sub-tree is two. r1 and r2 are
the representative temperature errors for the last chip cluster. The 0-th chip
cluster in Figure 4.5.6a is larger than the 0-th chip cluster in Figure 4.5.6b.
An equivalent viewpoint is to compare the number of error cells which
are labeled with 0. Another equivalent viewpoint is to compare the
chip-clusters borders on the vertical axes (i.e., the third element in the
solution encodings).
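The border-based encoding can be decoded back into error-cell labels. The sketch below does this for the two encodings given in Figure 4.5.6. It assumes that each core contributes its chip-cluster borders followed by one representative temperature error, and that a cell's chip cluster is the largest per-core interval index. That rule, the function names, and the placeholder strings `"r1"` and `"r2"` are assumptions made for illustration.

```python
from itertools import product


def split_encoding(encoding, num_cores):
    """Split a flat encoding into per-core (borders, representative_error) pairs."""
    per_core, remainder = divmod(len(encoding), num_cores)
    if remainder or per_core < 1:
        raise ValueError("encoding length must be a positive multiple of num_cores")
    parts = []
    for core in range(num_cores):
        chunk = encoding[core * per_core:(core + 1) * per_core]
        parts.append((list(chunk[:-1]), chunk[-1]))
    return parts


def label_cells(encoding, num_cores, clusters_per_core):
    """Label every error cell with a chip-cluster ID (assumed decoding rule)."""
    parts = split_encoding(encoding, num_cores)
    labels = {}
    for cell in product(range(clusters_per_core), repeat=num_cores):
        # Per-core interval index: number of borders below the cell's error cluster.
        intervals = [sum(1 for b in borders if index > b)
                     for index, (borders, _) in zip(cell, parts)]
        labels[cell] = max(intervals)
    return labels


labels_a = label_cells([1, "r1", 1, "r2"], num_cores=2, clusters_per_core=3)
labels_b = label_cells([1, "r1", 0, "r2"], num_cores=2, clusters_per_core=3)
zeros_a = sum(1 for v in labels_a.values() if v == 0)
zeros_b = sum(1 for v in labels_b.values() if v == 0)
```

Under this decoding the encoding of panel (a) yields four cells in chip cluster 0 and that of panel (b) yields two. The flat encoding holds one entry per core and chip cluster, four in this example.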
Figure 4.5.4 Sub-tree optimization algorithm
[Flowchart. Input: ALi.StTj, the j-th Sub-tree Topology to be connected to the i-th Active Leaf node of
the unfinished tree. Steps: initialize the swarm (generate particles’ locations and velocities); check
whether the particles are in the valid range and the required chip clusters exist, and fix for
admissibility if not (explained in section 4.5.4); schedule the edges of the sub-tree ALi.StTj for the
different chip clustering alternatives, one per particle (explained in section 4.5.2); select the local and
global bests (evaluated according to section 4.5.3); if the convergence condition is not met, update the
particles’ velocities using equation 4.5.17 and their locations using equation 4.5.18 and repeat;
otherwise report the global best as the final chip clustering. Output: the scheduled ALi.StTj.]
The possible solutions are then explored using particle swarm
optimization. Although PSO is introduced in section 2.7, let us briefly
review it here. A candidate solution is called a particle and is represented
by its location and its velocity. The locations are the encoded solutions and
the velocities are used to determine the next candidate solutions. Each
particle remembers its previous best location, and the swarm remembers
the global best solution, that is, the best location any of its particles has
ever visited. The previous bests and the global best are then used to give a
hint to the random velocities.
Figure 4.5.5 Error-cell labeling alternatives
Two alternatives are for a chip with two cores and a sub-tree with two nodes. In the labeling plan, each
row corresponds to a node and each column to an error cell.
Alternative 1: rows [0 0 1 1] and [1 1 1 0]
Alternative 2: rows [0 0 0 1] and [0 1 1 1]
Figure 4.5.6 Two examples for error-cells labeling
Error cells are labeled with chip clusters’ IDs (numbers inside the small rectangles). The solution
encodings are given below the error spaces.
[Two panels. Horizontal axes: Core 1 error clusters (0 to 2). Vertical axes: Core 2 error clusters (0 to 2).
(a) Solution encoding is [ 1, r1, 1, r2 ]. (b) Solution encoding is [ 1, r1, 0, r2 ].]
A canonical form of PSO uses equation 4.5.17, below, to update the
velocities. The coefficients in equation 4.5.17 are given as a part of the
chosen canonical form [Poli07]. and are two distinct
randomly generated numbers between 0 and 1. The location and the
velocity on the right hand side of equation 4.5.17 are the current values,
and the left hand side velocity is the next value.
(4.5.17)
Since the location, in this sub-tree scheduling problem, is a natural number,
the next location is the rounded sum of the current location and the next
velocity, as expressed in equation 4.5.18, below:
(4.5.18)
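The two update rules can be sketched together. The sketch uses the widely cited constriction form of canonical PSO, with constriction coefficient 0.7298 and acceleration coefficients 2.05. These constants, the per-dimension random draws, and the function name `update_particle` are assumptions made for illustration and may differ from the coefficients used in equation 4.5.17.

```python
import random

CHI = 0.7298        # constriction coefficient (assumed value)
PHI1 = PHI2 = 2.05  # acceleration coefficients (assumed values)


def update_particle(location, velocity, personal_best, global_best, rng=random):
    """Return the next (location, velocity) of one particle.

    The next velocity is pulled toward the particle's own best and the global
    best with random weights; the next location is the rounded sum of the
    current location and the next velocity, so it stays an integer vector."""
    next_location, next_velocity = [], []
    for x, v, p, g in zip(location, velocity, personal_best, global_best):
        r1, r2 = rng.random(), rng.random()
        v_next = CHI * (v + PHI1 * r1 * (p - x) + PHI2 * r2 * (g - x))
        next_velocity.append(v_next)
        next_location.append(round(x + v_next))
    return next_location, next_velocity


# A particle that already sits on both bests with zero velocity does not move.
loc, vel = update_particle([2, 5], [0.0, 0.0], [2, 5], [2, 5],
                           rng=random.Random(0))
```

Because both attraction terms scale with the distance to the best locations, particles that are far away move fast and particles near the best locations move slowly.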
There are two admissibility conditions to ensure that the particles are valid
solutions. The first condition is the valid range and the other is the presence
of required chip clusters. For example assume that the errors range from
to and therefore smaller or larger errors will never happen in
practice. If it happens that one element in the next particle’s location is
, then this particle is out of range.
An example for a required chip cluster not being present is as follows.
Assume that there are three edges in a certain node and, therefore, three
chip clusters are necessary. It may happen that in the next particle’s
location, the first and the second chip-cluster borders are assigned with
identical values and therefore the second chip cluster is missing.
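The two admissibility conditions can be expressed as a simple check on the chip-cluster borders of one core. The sketch assumes that the borders are listed in increasing cluster order, so that a cluster is missing whenever two consecutive borders are not strictly increasing. The function name and the numeric range are hypothetical, and the repair step itself is not sketched.

```python
def is_admissible(borders, lowest, highest):
    """Check one core's chip-cluster borders against both conditions:
    every border lies in the valid range, and consecutive borders are
    strictly increasing so that no chip cluster is empty."""
    in_range = all(lowest <= b <= highest for b in borders)
    clusters_present = all(a < b for a, b in zip(borders, borders[1:]))
    return in_range and clusters_present


# Three chip clusters need two borders; the valid range here is 0 to 5.
ok = is_admissible([1, 3], lowest=0, highest=5)
missing_cluster = is_admissible([2, 2], lowest=0, highest=5)
out_of_range = is_admissible([1, 7], lowest=0, highest=5)
```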
The proposed solution encoding which is based on chip-clusters borders
works well with particle swarm optimization, since the location and
velocity in PSO’s terminology correspond to the location and velocity for
chip-clusters borders. A typical particle in the beginning is far from being
good and experiences a high velocity towards the better location since
typically the difference between the best location and the current location
is large at the beginning. Therefore a rapid convergence towards the
preferred value for the chip cluster border will take place.
Later on, a typical particle will be close to the optimal location and
according to equation 4.5.17 it will move slower, thus pinpointing the
preferred value for the chip cluster border. Some experiments, for chip
clustering optimization for sub-trees using PSO, are reported in
[Aghaee11b]. The experiments showed that the PSO performs well for this
purpose. Therefore, it is used here as a part of the proposed SoC test
scheduling technique.
4.5.5 Remarks
The proposed optimization technique is structured so that it enables
parallel implementations with different granularities. The alternative sub-
tree topologies ( in Figure 4.5.2) could be optimized in parallel.
For example, assuming one unfinished tree with two leaf nodes and three
sub-tree topologies, there will be six combinations (two leaf nodes times
three topologies) to optimize in parallel.
Furthermore, at the lower level of sub-tree scheduling, each alternative
chip clustering in PSO ( in Figure 4.5.4) could be generated
(corresponding edges being scheduled) in parallel with other alternative
schedules. The scheduling of the edges (i.e., optimizing the linear schedule
tables) is the part that requires temperature simulation (dashed-line blocks
in Figure 4.5.4). Therefore, these computationally expensive parts could
be implemented in parallel in two different nested levels.
The proposed adaptive approach in this thesis combines the benefits of an
online scheduling technique with the benefits of an offline scheduling
approach and avoids their shortcomings. An online schedule will introduce
very large overheads that are associated with sensor readouts, decision
making process, and pausing/resuming the tests. An offline schedule, on
the other hand, is not capable of reacting to variations but has no run-time
overheads. In a fully online approach, reading the temperature sensors for
all cores as often as it is necessary and making the corresponding decisions
based on the acquired data will cause a very large load on the test access
mechanism and will introduce large delays to the schedule. Our proposed
approach uses temperature simulations as much as possible offline and
accesses carefully-selected cores’ sensors at carefully-selected times
during the test.
There is one schedule tree for a chip that addresses all cores individually.
For example, in a linear schedule table that corresponds to an edge, it is
stated that at time cores and are being tested, while cores and
are cooling. It might be that at another time, , cores and are
being tested, while cores and are cooling. The linear schedule table
is similar to in Figure 4.2.2, but instead of the second column that shows
only one column for state (in Figure 4.2.2), there are as many state columns
as there are cores. There is only one branching table for one node in the
schedule tree (similar to in Figure 4.2.2) but it contains, in every row,
conditions that include at least one core and at most all the cores.
4.6 A Fast Temperature Simulation Approach
In order to evaluate the candidates, the test application time and the test
overheating probability should be computed as previously explained. In
order to calculate the test application time and the test overheating
probability, the temperatures of the cores are required. Therefore, for every
candidate schedule which is examined by the meta-heuristic ( in
Figure 4.5.4), temperature simulation should be performed. Temperature
simulation is in the main loop in Figure 4.5.4 which itself is in the main
loop in Figure 4.5.2. This means that the temperature simulation, which is
performed inside the optimization loop, is repeated a very large number of times.
On the other hand, the temperature simulation is the slowest step in the
iterative part of the algorithm. Therefore, the temperature simulation is the
bottleneck. It limits the number of cores which can be handled by the
proposed method, and it is also a limiting factor for the quality of
the schedules.
Since the optimization heuristic will have a time consuming process inside
its main loop, the time required to achieve a high quality schedule will be
excessively large and impractical, thus the quality might be sacrificed by
ending the optimization process prematurely. It is, therefore, important to
use a fast temperature simulation approach.
As previously discussed, the temperature simulation is based on a thermal
model and a technique to solve the model response to the given input power
profile. The input power consists of the static power and the dynamic
power. The static power depends on the chip and on the temperature, while
the dynamic power depends on the chip and on the input test sequence.
Both the static power and the dynamic power are time-variant, but for
practicality reasons, it is commonly assumed that the power is constant
during a simulation cycle (a discrete-time model is assumed; see footnote 3). Therefore,
in the following we focus on a single simulation cycle in which the input
power is constant. The input power is updated with new static and dynamic
power values, based on the results of the previous simulation cycle and
then the simulation for the next cycle is performed.
The thermal model was previously discussed in section 2.6. Equation 2.6.1
is repeated below for convenience.
(4.6.1)
Assume that the thermal model consists of nodes and is the number
of cores ( ). The properties of the thermal model are encapsulated
into two matrices and . and are temperature and
power vectors. The mathematical representation of this model (equation
4.6.1) is a system of linear constant-coefficient differential equations and
therefore it is a linear time-invariant (LTI) system [Oppenheim97]. In fact
the thermal model is a linear time-invariant lumped element model and
both the heat capacities (captured in matrix ) and thermal conductivities
(captured in matrix ) are linear and time invariant.
The other part of the temperature simulation is to solve the model in order
to find its response to the input power. Usually, the simulation time is
divided into smaller intervals in which the power could be assumed to be
fixed. Then equation 4.6.1 is solved iteratively for each interval.
In order to solve equation 4.6.1 there are two distinct approaches, the
numerical approximation and the closed form solution. The numerical
approximations are usually done with very small intermediate steps, and as
a result, the complete temperature curve for the interval is constructed.
HotSpot uses the Runge-Kutta method for numerical approximation
[Huang06]. Though only the temperature at the end of the interval is
registered, many points of the temperature curve are calculated. Since we
do not need such a detailed temperature curve and we only need the
temperature at the end of the intervals, the equation is solved analytically
in order to give the temperature at the end of the intervals in a closed form.
3 Simulation cycle is explained in section 2.6.
In addition to the granularity of the temperature curve, another important
factor which affects the simulation speed is how frequently equation 4.6.1
needs to be solved. The scheduling technique presented in this thesis
requires a large number of simulations. Note that the system is LTI.
Moreover, the only change in the inputs (within the simulation intervals)
is a scaling of the previous inputs. Therefore, the differential equation needs
to be solved only once at the very beginning [Oppenheim97].
The responses to the scaled versions of the previous inputs are obtained by
scaling (a matrix multiplication) the previous outputs. Since the
computational cost of the scaling is less than the computational cost of
solving the equation from scratch, a method which utilizes the LTI
properties (i.e., scaling and superposition [Oppenheim97]) is faster than
the Runge-Kutta method when numerous simulations are required.
In situations where the thermal simulator is invoked quite frequently, the
input power is just being scaled from cycle to cycle, and the thermal model
is kept unchanged, the closed form solution is faster than the numerical
techniques. Therefore, we continue with the simulation approach which is
based on the closed form solution. By using Laplace transform
[Oppenheim97] and assuming that is the initial temperature vector and
is the temperature at the end of an interval, the closed form solution is
(4.6.2)
is the identity matrix of size and is the length of the interval.
Now, and matrices are defined as follows.
(4.6.3)
(4.6.4)
With the help of and , equation 4.6.2 could be written as
(4.6.5)
Equation 4.6.5 could be understood intuitively by thinking about the
system being LTI. According to the superposition principle, the effects of
the initial value and the input power will add up, thus the plus sign between
the two terms. The scaling property of the system could also be verified
rapidly, as the scaling of an input, or with a certain factor, will scale
its own effect by the same factor.
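For reference, the chain from the thermal model to the two-matrix form can be written out for a generic lumped model. The notation below is assumed for illustration and is not necessarily that of equations 4.6.1 to 4.6.5: C is the heat-capacity matrix, G the thermal-conductance matrix (assumed invertible), T the temperature vector, P the power vector (constant over the interval), T_0 the initial temperatures, and Δt the interval length.

```latex
% Assumed notation for a generic lumped thermal model (not the thesis symbols).
\begin{align*}
C\,\frac{dT}{dt} + G\,T &= P
  && \text{thermal model} \\
T(\Delta t) &= e^{-C^{-1}G\,\Delta t}\,T_0
  + \left(I - e^{-C^{-1}G\,\Delta t}\right) G^{-1} P
  && \text{closed form for constant } P \\
M &= e^{-C^{-1}G\,\Delta t}, \qquad N = (I - M)\,G^{-1}
  && \text{computed once} \\
T(\Delta t) &= M\,T_0 + N\,P
  && \text{evaluated every cycle}
\end{align*}
```

Both terms of the last line are linear in their own input, so scaling either T_0 or P scales only its own contribution, and the two contributions add up.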
The temperature simulation is done in two phases, an initialization phase
and then the operational phase. In the initialization phase the model is
invoked and based on it and are computed (this is shown in Figure
4.5.2 in regard to the overall scheduling method). The operational phase is
the iterative computation of the temperatures for different times using
equation 4.6.5. Since the thermal model is time invariant, the initialization
is done only once at the very beginning of the design process. Throughout
the offline scheduling phase, only the iterative computations are
performed.
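The two phases can be sketched for a tiny model with two thermal nodes. The sketch assumes a generic model of the form C dT/dt + G T = P with a diagonal heat-capacity matrix and temperatures measured relative to ambient. The matrix exponential is computed by scaling and squaring with a truncated Taylor series, which is a simple stand-in for the Padé approximation mentioned in section 4.6. All names and numbers are hypothetical.

```python
def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def mat_vec(a, x):
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def mat_exp(a, squarings=10, terms=12):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    n = len(a)
    scaled = [[x / 2 ** squarings for x in row] for row in a]
    result, term = identity(n), identity(n)
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(r_row, t_row)]
                  for r_row, t_row in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def initialize(capacities, conductance, dt):
    """Initialization phase: compute M = exp(-C^-1 G dt) and N = (I - M) G^-1 once."""
    n = len(capacities)
    rate = [[-conductance[i][j] * dt / capacities[i] for j in range(n)]
            for i in range(n)]
    m = mat_exp(rate)
    i_minus_m = [[(1.0 if i == j else 0.0) - m[i][j] for j in range(n)]
                 for i in range(n)]
    return m, mat_mul(i_minus_m, inverse_2x2(conductance))


def step(m, n_mat, temps, power):
    """Operational phase: one simulation cycle, T_next = M T + N P."""
    return [a + b for a, b in zip(mat_vec(m, temps), mat_vec(n_mat, power))]


# Hypothetical two-node model.
M, N = initialize(capacities=[2.0, 1.0],
                  conductance=[[1.5, -0.5], [-0.5, 1.0]], dt=0.1)
temps = [0.0, 0.0]
for _ in range(2000):                      # constant power for 2000 cycles
    temps = step(M, N, temps, [1.0, 2.0])
```

After the one-time initialization, each simulation cycle costs two small matrix-vector products, and with constant power the temperatures settle at the steady state G⁻¹P, here (1.6, 2.8) above ambient.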
In the closed form solution, the most computationally expensive part is the
matrix exponential for which is a part of equation 4.6.3. The matrix
exponential could be computed using numerical methods such as Padé
approximation [Higham05]. In fact the initialization phase for the closed
form solution includes calculating equation 4.6.3 and therefore it is very
time consuming. However, the operational phase only includes computing
equation 4.6.5 and therefore it is fast.
On the other hand, for the Runge-Kutta approach [Press07], the
initialization is fast since there is no need for computations which are as
heavy as equation 4.6.3. However, the operational phase is slow since the
equation is required to be solved in many fine steps through a large number
of intermediate time instances. The conclusion is that the Runge-Kutta
method is faster for a limited number of simulations and the closed form
method is faster for a large number of simulations. The experiments in
section 4.7.1 will support this statement.
4.7 Experimental Results
Two distinct contributions in chapter 4 are the temperature simulation
approach and the adaptive scheduling technique. These are experimentally
evaluated in this section. All experiments are performed on a desktop
computer with Intel® Xeon® W3520 processor and 8 GB of memory. The
experiments for temperature simulation are presented first.
4.7.1 Fast Temperature Simulation Approach
A temperature simulation approach based on the closed form solution is
suggested in section 4.6 in order to increase the simulation speed. The
problem with numerical approximation approaches for temperature
simulation is that they are very slow for a large number of simulation cycles,
especially when there are a large number of cores. Temperature
simulations for an SoC with 100 cores and for different numbers of
simulation cycles are performed using the proposed approach and using
HotSpot [Huang07], and the CPU times are plotted in Figure 4.7.1a.
The numerical approximation approaches, such as the one used by
HotSpot, perform faster than the suggested approach for a small number of
simulation cycles. But for simulations longer than 1700 cycles, the
proposed approach is faster than HotSpot, as shown in Figure 4.7.1a. In
general, this difference grows at a rate close to 0.011 seconds per
simulation cycle and reaches a CPU time difference of 100 seconds for
10000 simulation cycles. This is important since, for every edge in every
candidate schedule tree, temperature simulation is performed for the
number of test cycles plus cooling cycles.
Temperature simulations are performed using the proposed approach and
using HotSpot [Huang07] for 10000 simulation cycles for different
numbers of cores, and the CPU times are plotted in Figure 4.7.1b. In
general, the CPU time difference increases rapidly with the number of
cores and the difference reaches 100 seconds for 100 cores. This is also
important, since achieving a good schedule in reasonable time becomes
infeasible with a small increase in the number of cores, when the slower
approach is in use.
Figure 4.7.1 CPU times for temperature simulation with HotSpot and the suggested approach. The simulations are performed (a) for 100 cores for different
numbers of simulation cycles and (b) for 10000 simulation cycles for different numbers of cores.
[Plots: CPU time [sec] (0 to 180) versus (a) simulation cycles (0 to 10000) and (b) number of cores (0 to 100); curves: HotSpot and Suggested Approach.]
4.7.2 Adaptive Test Scheduling Technique
The proposed adaptive SoC test scheduling technique is experimentally
evaluated in this section. The first set of experiments is performed on SoCs
with different number of cores and the CPU times are reported. Then,
experiments are done for ITC’02 [Marinissen02] benchmark chips with
random test switching activities generated using a Markov chain similar to
[Yao11c]. Finally, an experiment is performed for the d695 benchmark
chip from ITC’02 with real switching activities based on real test data from
[Samii06]. The costs of the test schedules and the test schedule sizes are
reported for the last two sets of experiments. The experimental setup is
briefly introduced at the beginning and then the results are presented.
The static power is computed using the temperature dependent model
given in [Liao05]. The temperature simulations are performed using the
approach proposed in section 4.6. The spatial temperature error is assumed
to have normal distribution ranging from to with
a resolution of . The temporal temperature error is also
assumed to have a normal distribution ranging from to
with a resolution of . It is assumed that there are twenty
temperature error clusters.
The balancing coefficient is assumed to be equal to ten. It is
assumed that each entry in a linear schedule table occupies 64 bits and each
entry in a branching table per core per edge occupies 32 bits. For example,
a node with two succeeding edges for a SoC with two cores occupies
2 × 2 × 32 = 128 bits.
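The size assumptions above can be written as a small helper. This is a sketch under the stated entry sizes only; the function names are illustrative, and it assumes that a node's branching table holds exactly one entry per core per succeeding edge.

```python
LINEAR_ENTRY_BITS = 64   # per entry in a linear schedule table
BRANCH_ENTRY_BITS = 32   # per core, per succeeding edge, in a branching table


def branching_table_bits(num_edges, num_cores):
    """Memory for the branching table of one node (assumed one entry per core per edge)."""
    return num_edges * num_cores * BRANCH_ENTRY_BITS


def linear_table_bits(num_entries):
    """Memory for a linear schedule table with the given number of entries."""
    return num_entries * LINEAR_ENTRY_BITS


# A node with two succeeding edges for a SoC with two cores.
print(branching_table_bits(num_edges=2, num_cores=2))
```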
The first set of experiments is performed on a number of SoCs with
different number of cores ranging from five to 50 cores. Markov chains are
used to generate random test switching activity sequences having random
averages and random lengths. The experiments are performed for at least
five randomly generated sets of tests for each chip and the average CPU
times are reported in Table 4.7.1. Note that even for a 50-core SoC, the
CPU time remains in an affordable range.
Table 4.7.1 CPU times for SoCs with different number of cores
Number of Cores 5 10 15 20 25 30 35 40 45 50
CPU time [Sec] 9 46 52 132 208 308 590 762 1141 1367
The second set of experiments is performed on ITC’02 SoCs with
randomly generated test switching activities similar to the first set of
experiments but this time tests for a chip have constant power averages and
length. The proposed technique is compared with the two methods
proposed in [Aghaee10]. The first one is an Offline method which uses
only one linear schedule and the other is a Hybrid method which selects a
linear schedule (out of a set of pre-generated schedules) only once during
the test process. The test costs offered by the Offline and Hybrid methods
proposed in [Aghaee10] and by the proposed technique in this chapter are
computed using the metric given in equation 4.3.4 and are reported in Table
4.7.2.
Column 1 is the name of the ITC’02 circuits. Columns 2 and 3 are the costs
(based on equation 4.3.4) for schedules generated by Offline and Hybrid
approaches proposed in [Aghaee10], respectively. The costs of the
schedules generated by the proposed adaptive approach are reported in
column 4 in Table 4.7.2. The percentage reduction in cost achieved by the
Hybrid and adaptive approaches compared with the Offline approach are
reported in columns 5 and 6, respectively. Column 7 is the percentage
reduction in cost achieved by the proposed adaptive approach compared
with the Hybrid approach. On average, the adaptive method proposed here
reduces the cost by 76% relative to the Offline method and by 43% relative
to the Hybrid method. This demonstrates the advantage of the proposed
adaptive method.
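As a worked check of how the percentage columns follow from the cost columns, the sketch below recomputes the d695 row of Table 4.7.2. The helper name `reduction_percent` is illustrative, and rounding to the nearest integer is an assumption about how the table entries were produced.

```python
def reduction_percent(baseline, improved):
    """Percentage cost reduction of `improved` relative to `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Costs for d695 as reported in Table 4.7.2.
offline, hybrid, proposed = 0.50, 0.12, 0.06

print(round(reduction_percent(offline, hybrid)))    # Hybrid relative to Offline
print(round(reduction_percent(offline, proposed)))  # Proposed relative to Offline
print(round(reduction_percent(hybrid, proposed)))   # Proposed relative to Hybrid
```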
Table 4.7.2 Test cost for test scheduling techniques
ITC’02     Costs                         Percentage reduction      Percentage reduction
chips                                    relative to the Offline   relative to the Hybrid
           Offline   Hybrid   Proposed   Hybrid      Proposed      Proposed
a586710    1.44      0.56     0.54       61          62            4
d281       0.69      0.45     0.03       35          96            93
d695       0.50      0.12     0.06       76          88            50
f2126      2.71      1.39     0.51       49          81            63
g1023      5.09      4.27     1.99       16          61            53
h953       0.46      0.14     0.11       70          76            21
p22810     1.22      0.70     0.69       43          43            1
p34392     0.75      0.72     0.06       4           92            92
p93791     1.02      0.13     0.08       87          92            38
q12710     1.32      0.40     0.23       70          83            42
t512505    0.48      0.23     0.13       52          73            43
u226       1.05      0.43     0.37       59          65            14
Average                                  52          76            43
The ATE memory occupied to store the schedules (i.e., the schedule size)
is reported in Table 4.7.3. The cost reduction comes with increase in the
schedule size because of increased number of linear schedules and
branching tables, which consume ATE memory space. The average
increase in schedule size compared to Offline is 87% for Hybrid and 308%
for the proposed adaptive method. When compared to Hybrid, the average
schedule size increase for the proposed method is 117%. The increase in
the usage of ATE memory (as given in Table 4.7.3) refers only to the
memory space used to store the schedule. This is usually small, compared
with the memory space used to store the test patterns. Therefore a large
increase in the schedule size is very likely to be translated into a small
increase in the usage of the ATE memory as a whole.
The proposed scheduling method will utilize the available ATE memory
even if a very small reduction in cost (e.g., from 0.70 to 0.69 for p22810 in
Table 4.7.2) is achieved. Since the number of nodes contributes to the
scaled cost function (equation 4.5.1), a larger schedule will not be
generated (e.g., 195% larger for p22810 in Table 4.7.3 compared with
hybrid solution) if it does not reduce the cost compared with a smaller
schedule.
Table 4.7.3 ATE memory utilized only for schedule
ITC’02     Utilized memory for           Percentage increase       Percentage increase
chips      schedule [bit]                relative to the Offline   relative to the Hybrid
           Offline   Hybrid   Proposed   Hybrid      Proposed      Proposed
a586710    1216      1888     4768       55          292           152
d281       1088      1280     2624       18          141           105
d695       1280      2176     3392       70          165           54
f2126      704       960      2368       36          236           147
g1023      576       1088     4480       89          678           312
h953       576       1088     1472       89          155           35
p22810     704       1888     5568       168         691           195
p34392     832       1472     2688       77          223           83
p93791     704       1920     3136       173         345           63
q12710     640       1024     1664       60          160           62
t512505    1152      2336     3712       103         222           59
u226       320       672      1568       110         390           133
Average                                  87          308           117
The ATE memory constraint will affect the quality of the adaptive test
schedules. The proposed algorithm will not generate even an offline
schedule when the available memory is too small to accommodate it. By
increasing the available ATE memory, first an offline schedule and then a
hybrid schedule will be generated. With the further increase of the
available memory, better schedules with lower costs will be generated.
This trend continues until the cost reaches a minimum beyond which
further cost reduction is impossible. The minimum cost is usually dictated
by the branching overheads (time to read sensors and react accordingly).
The reduction of the cost with the increase of the memory limit is shown
in Table 4.7.4. The memory limit is increased in eight steps. It is expected
that the increase in the memory limit improves the cost before it reaches
the saturation limit. The saturation limit for this set of experiments is equal
to 1320. Memory sizes and limits for the schedule are given in bytes. The
CPU time increases in general with the increase of memory limit. This
trend continues even if the cost is not improved (after the saturation) since
the algorithm has more space to search and thus it takes more time. The
costs and sizes are normalized to the first working schedule (row 4) and
reported in columns 4 and 5 of Table 4.7.4.
The last experiment is performed on d695 (one of the ITC’02 chips) using
the real test switching activities. The costs and schedule sizes are reported
in Table 4.7.5. The Hybrid method improves the cost compared to Offline
method by 59% while the proposed adaptive technique achieves a
reduction of 71%. The proposed technique improves the cost by 30% over
the Hybrid method. The schedule size for the proposed method is 169%
and 49% larger than Offline and Hybrid, respectively. As we expected, the
improvement in cost and the increase in the schedule size are in the ranges
suggested before by the second set of experiments.
Table 4.7.4 Costs and utilized memory volumes for different memory limits
Memory   Results
limit    Cost     Size   Cost (%)   Size (%)   CPU time (H:M:S)
300      Aborted, memory limit is too tight    1:03:42
500      3.3875   460    100.00     100.00     3:15:21
750      3.3875   460    100.00     100.00     3:34:20
1000     2.9389   920    86.76      200.00     3:41:03
1250     2.9389   920    86.76      200.00     3:48:47
1500     2.7170   1320   80.21      286.96     3:53:52
1750     2.7170   1320   80.21      286.96     3:59:12
2000     2.7170   1320   80.21      286.96     4:04:16
As previously mentioned, the effect of increased schedule size on the total
consumed ATE memory is small. For example consider the experiments
with the d695 chip with real switching activities. The size of the schedule
for the adaptive solution is approximately 7 Kbit while the test size is
approximately 1324 Kbit. Therefore the percentage increase in total
utilized ATE memory from the offline solution to the adaptive solution is
0.34%. This means that the adaptive method achieves 71% reduction in
cost relative to the offline method, with a small expense of 0.34% increase
in the occupied ATE memory.
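The 0.34% figure can be reproduced from the d695 schedule sizes reported in Table 4.7.5. The sketch below assumes that 1 Kbit denotes 1000 bits and that the total ATE memory is the test data plus the schedule; both are assumptions, and the test size itself is only approximate.

```python
TEST_DATA_BITS = 1324 * 1000     # approximate test size (assuming 1 Kbit = 1000 bits)
OFFLINE_SCHEDULE_BITS = 2688     # Table 4.7.5, Offline
ADAPTIVE_SCHEDULE_BITS = 7232    # Table 4.7.5, Proposed


def total_memory_increase_percent(test_bits, old_schedule_bits, new_schedule_bits):
    """Percentage growth of the total ATE memory (test data plus schedule)."""
    old_total = test_bits + old_schedule_bits
    new_total = test_bits + new_schedule_bits
    return 100.0 * (new_total - old_total) / old_total


increase = total_memory_increase_percent(
    TEST_DATA_BITS, OFFLINE_SCHEDULE_BITS, ADAPTIVE_SCHEDULE_BITS
)
print(f"{increase:.2f}%")
```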
4.8 Adaptive Multi-Temperature Testing
As previously discussed in section 3.5, temperature-dependent defects are
a challenge for achieving high test quality for advanced SoC. The existing
multi-temperature test scheduling methods optimize the test schedule for
the shortest test application time while making sure that the tests are
applied inside the specified temperature ranges [He10, Yao11b]. These
methods neglect the temperature deviations that are mainly caused by
process variation. Therefore, a large process variation implies a decreased
number of chips that are tested within the specified temperature ranges,
which will reduce the effectiveness of the tests and, in the worst case, may
lead to damage of the chips due to overheating.
In order to maximize the chances that the tests are applied within the
intended temperature ranges, static schedules should be designed
pessimistically. In this case, a large process variation implies a very long
test application time due to the intensive use of the heating and cooling
intervals. This means that the chips under test are heated up/cooled down
more than actually needed in order to make sure that the majority of the
chips are warm/cold enough. This is similar to the discussions about
the safety margins in section 3.3. A detailed discussion and analysis of
safety margins can be found in [Aghaee10].
The test application time for multi-temperature testing is much longer than
the normal testing and therefore the test cost is higher [He10, Yao11b].
Table 4.7.5 Cost and ATE memory utilized for schedule for d695
                                     Offline   Hybrid   Proposed   Percentage change         Percentage change
                                                                   relative to the Offline   relative to the Hybrid
                                                                   Hybrid      Proposed      Proposed
Cost                                 20.84     8.53     5.93       - 59        - 71          - 30
Utilized memory for schedule [bit]   2688      4992     7232       + 86        + 169         + 49
This becomes a serious cost issue, in particular in situations where the normal
test application time is already very long, as it is for advanced SoCs. The
proposed methods in [He10, Yao11b] provide satisfactory results when the
temperature at a certain test cycle could be assumed to be identical for all
chips of the same design.
However, advanced SoCs manufactured with deep submicron technologies
are likely to have different temperatures at the same test cycle because of
process variation. The negative effect of temperature variations on the
thermal safety of the SoCs during test is addressed by the scheduling
methods proposed in the previous sections. These methods try to limit the
cores’ maximum temperatures so that the test damages caused by
overheating during the test process are minimized. Similar techniques can
be applied in the context of multi-temperature testing. We have proposed
a technique to generate test schedules so that the tests have a large
likelihood of being applied at the correct temperatures [Aghaee14b]. Here,
we briefly explain the methodology.
As mentioned before, the adaptive multi-temperature testing is similar to
the thermal-safe approach introduced in this chapter. The key difference is
that heating stimuli and cooling intervals are used to bring the temperature
inside the required range. Only then can the tests be applied. Due to testing,
the temperature may exceed the high limit. In this case, the testing is
paused at an appropriate moment and then cooling intervals are introduced.
On the other hand, if the temperature falls below the low limit, testing is
paused and a heating sequence is introduced, instead.
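The control flow described above can be sketched as a simple loop. This is only an illustration under strong assumptions: a hypothetical one-core model in which the temperature changes by a fixed amount per cycle for each activity, with made-up rates and limits. It is not the scheduling algorithm of [Aghaee14b].

```python
# Hypothetical per-cycle temperature changes [deg C]; illustrative values only.
RATE = {"heat": 2.0, "test": 1.0, "cool": -1.5}


def run_multi_temperature_test(temp, low, high, test_cycles):
    """Apply `test_cycles` test cycles, each only while temp is within [low, high].

    Returns the list of actions taken, one per cycle.
    """
    actions = []
    remaining = test_cycles
    while remaining > 0:
        if temp < low:
            action = "heat"     # below the range: apply a heating sequence
        elif temp > high:
            action = "cool"     # above the range: pause testing and cool down
        else:
            action = "test"     # inside the range: the test may be applied
            remaining -= 1
        temp += RATE[action]
        actions.append(action)
    return actions


if __name__ == "__main__":
    trace = run_multi_temperature_test(temp=40.0, low=60.0, high=65.0, test_cycles=10)
    print(trace.count("heat"), trace.count("test"), trace.count("cool"))
```

In an adaptive setting, the branch taken in each iteration would be decided from sensor readings rather than from a simulated temperature.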
The thermal-aware techniques only support one temperature limit which is
the overheating limit and exceeding it adds to the overall cost of testing by
increasing the number of overheated chips. On the other hand, multi-
temperature techniques have to consider the upper limit and the lower limit
of the temperature interval characteristic to each test. Exceeding these
limits results in test escapes which means that some defective chips may
not be detected. This new contributor to the testing costs is defined in
[Aghaee14b] and is added to the costs already defined in section 4.3.
Having to handle a low temperature limit adds to the complexity of the
techniques presented in section 4.5. Representative temperature error is
introduced in section 4.5.2. Every chip cluster is represented by a dedicated
representative temperature error. In fact this representative value is defined
and optimized with regard to the high temperature limit. Therefore, having
an additional low limit means that another representative is required with
regard to this low limit.
The proposed adaptive multi-temperature technique is explained in detail
in [Aghaee14b] and is supported by experiments. The overall cost that
captures costs related to the test application time, overheating, and out of
required-range testing is minimized. Required ATE memory, CPU times,
and cost dependency on the amount of process variation are also reported
in [Aghaee14b].
4.9 Remarks
Although the proposed adaptive techniques are developed to handle PV-related
temperature errors, they can be adapted to handle other non-ideal
situations. Such situations may happen during in-field testing, where the
initial and ambient temperatures (among other parameters like voltage)
may vary. Acquired temperature data using on-chip (or on-board) sensors
help to select the most appropriate linear-schedule and minimize the costs.
The temperature error that is explained in section 4.4 is discussed in more
detail here. In order to distinguish between the effects of the process
variation and other undesirable thermal effects, four different temperatures
can be defined. The first one is the expected temperature, which is the
temperature of a normal chip that is not affected by undesirable thermal
effects (including process variation). The expected temperature is an
abstract concept and its exact value cannot be acquired. The second one
is the simulated temperature, which is the temperature computed by simulation.
The aim of simulation is to compute the expected temperature and
therefore, ideally, the simulated temperature is equal to the expected
temperature. The third one is the actual temperature, which is the real-world
temperature of the chip. Its exact value is usually impossible to acquire due to
measurement errors. The fourth and last one is the measured temperature, which
is the temperature reported by the temperature sensors.
Based on the above definitions, three different temperature errors can be
identified. The first one is the simulator error, which is the difference between
the expected temperature and the simulated temperature. Inaccuracies
in the thermal model and in the algorithms on which the simulator is based
contribute to this error. The second one is the measurement error, which is the
difference between the actual temperature and the measured temperature.
Inaccuracies in the sensor technologies contribute to it. The third and
last one is the variation error, which is defined as the difference between the
actual temperature and the expected temperature. This error has various
sources including process variation, ambient temperature fluctuations, and
voltage variations.
Even though the temperature simulator errors and sensor measurement
errors are not addressed explicitly in this thesis, in practice when the
temperature error model is being tuned empirically, a considerable amount
of these errors will also be covered. There still might be small residual
errors which are not captured by the temperature error model. These small
residual errors are addressed by introducing a small safety margin (e.g., a
slightly lower overheating limit than the actual overheating limit is used in
practice). The effect of this small safety margin on cost is negligible as
discussed in [Aghaee10].
The focus of this chapter is process variation which mainly contributes to
the variation error and therefore in this thesis we focus on this category of
errors. To avoid these complications, we usually consider the temperature
error as the difference between the expected temperature which is
estimated by simulation and the actual temperature which is measured by
sensors. That is, we assume that the actual temperature and measured
temperature are equivalent. Moreover, we assume that the simulated and
expected temperatures are equivalent. Nevertheless, our proposed
approaches can be used to address other errors like the simulator errors and
the measurement errors, although they are not explicitly designed for
addressing these types of errors.
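The relation between the four temperatures and the three errors can be made concrete with a small sketch. The class and property names are illustrative, the numeric values are made up, and the sign conventions are assumptions, since the text only speaks of "differences".

```python
from dataclasses import dataclass


@dataclass
class Temperatures:
    expected: float    # ideal chip, unaffected by undesirable thermal effects
    simulated: float   # computed by the thermal simulator
    actual: float      # real-world temperature of the chip
    measured: float    # reported by the temperature sensor

    # Sign conventions below are assumptions made for this illustration.
    @property
    def simulator_error(self):
        return self.simulated - self.expected

    @property
    def measurement_error(self):
        return self.measured - self.actual

    @property
    def variation_error(self):
        return self.actual - self.expected

    @property
    def observed_error(self):
        """Error available in practice: measured minus simulated temperature."""
        return self.measured - self.simulated


t = Temperatures(expected=80.0, simulated=79.5, actual=84.0, measured=84.3)
# The observed error decomposes into the three error types.
decomposed = t.variation_error + t.measurement_error - t.simulator_error
assert abs(t.observed_error - decomposed) < 1e-9
```

When the simulator and measurement errors are assumed to be zero, the observed error coincides with the variation error, which matches the simplification adopted in this chapter.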
4.10 Conclusions
This chapter mainly presents an adaptive SoC test scheduling technique to
deal with spatial and temporal temperature deviations, caused by process
variations in deep submicron technologies. Mitigating the negative
variation effects on the multi-temperature testing, reported in
[Aghaee14b], is similar to the thermal-safe testing and therefore is just
briefly discussed above.
The key contribution of this chapter is an algorithm to generate a set of
efficient test schedules, each corresponding to a different thermal behavior
of different cores during test. The on-chip temperature sensors are used to
monitor the actual temperatures of the different cores and to guide the
selection of the corresponding test schedules accordingly, during the test.
This way, the overall test efficiency will be improved considerably.
The proposed technique consists of two distinct algorithms, the test
scheduler and the thermal simulator. The temperature-aware test scheduler
is a constructive algorithm which generates tree-based test schedules by
putting the optimized sub-trees together. Sub-tree optimization is basically
a chip-clustering algorithm which involves a linear test scheduling
algorithm. A new sub-tree scheduling algorithm is proposed here. The
linear scheduling algorithm requires a thermal simulator in its main loop.
A fast temperature simulation approach is proposed in order to speed up
the temperature-aware test scheduling algorithm.
The proposed adaptive test scheduling technique generates process-
variation and temperature aware test schedules for SoCs with a large
number of cores. The algorithm has a relatively short run-time and
generates high quality test schedules. The proposed technique has been
experimentally evaluated using a number of experiments including ITC’02
benchmark SoCs.
4.11 Notations and Abbreviations
Notation Description
Capacitances vector in the thermal model
Applied Test Size
Resistances vector in the thermal model
Balancing Coefficient
The table that determines with which linear schedule table a
specific chip should be tested. (See the example in section 4.2)
Number of cores
Chip cluster A group of chips with similar thermal behavior that are tested
with the same Linear schedule table. A chip cluster corresponds
to an edge in the schedule tree.
Chip-cluster border The border line between two Chip clusters. For two adjacent
Chip clusters the border is a set of natural numbers, each
corresponding to an individual core. A border represents a
particular error value. (See section 4.5.4)
Chip clustering Finding the optimal partitioning of the -dimensional error
space into an already known number of Chip clusters for the
nodes of a tree. (See full explanation in section 4.5.4)
Cost of the Test Facility per time unit
Expected Applied Test Size
(temperature) Error-clusters Borders
Error Cell Change Probabilities
ECCP before being normalized
Error-Cell Labeling
Error-Cells Probabilities
ECP just after branching
ECP just after overheating
ECP after temporal changes (according to temperature error
model)
Error cluster A range of error values which are to be treated as one single error
value. Error clusters are separated by Error-clusters Borders, EB.
Expected Test Application Time
Expected Test Overheating Probability
Effective Test Time per Second
High Temperature Threshold
Identity matrix
The point that the output is equal to the input and not equal to one,
in the proposed reciprocal limiter.
Number of temperature error clusters
Linear schedule table A schedule that specifies stop/start times for the test of each and
every core, individually. This will correspond to an edge or to a
single Chip cluster. (See the example in section 4.2)
Leaves’ Overheating Probabilities
Number of temperature error values
Number of nodes in a tree
Nodes’ Applied Test Sizes
Normalized Cost Function
Node’s Cluster Label
Node’s not Overheating Probability
Node A node in the schedule tree that corresponds to the ending of a
Linear schedule table (i.e., a place that branching is possible).
Nodes’ Probabilities
expected Number of Partial Trees, similar to the current partial
tree, that are required to construct the complete schedule tree
Nodes’ Test Application Times
Normalized Test Throughput
Power vector
Partial cost function NCF evaluated for a part of the schedule tree (e.g., a sub-tree).
Price of One Chip
Particle Swarm Optimization
Predicted Test Overheating Probability
Number of leaf nodes
Number of succeeding edges for a node
Scaled Cost Function that is used to select the unfinished trees
out of a group of offspring trees.
Simulated Temperature
Spatial Temperature Error Probabilities
Test Access Mechanism
Test Application Time
The period for the discrete-time temperature error model. The
error values are updated regularly with a frequency equal
to . (See section 4.4)
Temperature Error Values
Test Handling Time
Test Overheating Probability
Test Throughput
Temporal Temperature Error Probability
Number of nodes in the thermal model
Transfer matrix for initial temperatures
Transfer matrix for power values
Temperatures vector in thermal model
Initial temperatures
Temperatures at the end of the interval of size t
Temperatures at t-th time sample (in section 4.6)
Temperature of w-th thermal node
The output of the proposed limiter, applied on the expected
number of partial trees, .
Chapter 5 Temperature-Gradient Based
Burn-In and Test Scheduling
Large temperature gradients (e.g., temperature difference between two
adjacent cores) exacerbate various types of defects including early-life
failures and delay faults. The capability to detect these temperature-gradient
induced defects is crucial for advanced SoCs. In particular, 3D-SICs
exhibit considerably larger temperature gradients compared with
normal ICs (for example, a factor of three is reported in [Plas10]) and therefore
temperature-gradient based test is crucial for them.
The gradients are captured and represented by temperature maps. This
chapter presents schedule based techniques to enforce temperature maps
on the IC. A temperature map specifies the temperatures for different sites
(e.g., cores) in the IC at a given time-point. It usually specifies the high and
the low temperature limits for each site. Alternatively, the intermediate
temperatures (half-way from low limit to high limit) can be used to
represent a temperature map, in particular if the differences between the high
and low limits are similar for all sites.
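A temperature map as described above can be represented with a small data structure. This is a minimal sketch; the class, function, and site names as well as the limit values are illustrative assumptions, not notation from this thesis.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteLimits:
    low: float    # low temperature limit for the site [deg C]
    high: float   # high temperature limit for the site [deg C]

    @property
    def intermediate(self):
        """Temperature half-way between the low and the high limit."""
        return (self.low + self.high) / 2.0


# A temperature map at one time-point: site name -> limits (made-up values).
temperature_map = {
    "core0": SiteLimits(low=85.0, high=95.0),
    "core1": SiteLimits(low=55.0, high=65.0),
}


def gradient(tmap, site_a, site_b):
    """Intermediate-temperature difference between two sites."""
    return tmap[site_a].intermediate - tmap[site_b].intermediate


def satisfies(tmap, temps):
    """True if every site temperature lies within its [low, high] limits."""
    return all(tmap[s].low <= temps[s] <= tmap[s].high for s in tmap)


print(gradient(temperature_map, "core0", "core1"))
print(satisfies(temperature_map, {"core0": 90.0, "core1": 60.0}))
```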
5.1 Introduction
5.1.1 Test for Early-Life Failures
Burn-in is a common way of accelerating and detecting early-life failures
and it should be done with low cost in a reasonably short time. For this
purpose, usually, the dies are operated at elevated temperature and voltage.
The elevated temperature and voltage speed up the aging and wear
mechanisms so that the dies experience their early life before testing. The
wear mechanisms that are speeded up include metal stress voiding and
electromigration, metal slivers bridging shorts, as well as gate-oxide wear-
out and breakdown [Semenov03].
Recently, several studies have, however, shown that some wear
mechanisms are speeded up more efficiently by large temperature
gradients rather than the high temperature itself. A temperature-gradient
induced wear mechanism is identified in [Smorodin08] which shows that
a metal layer elevation develops rapidly on the sites that experience large
temperature gradients. Moreover, in the atomic flux equation that models
the electromigration, temperature gradient is present directly and also
indirectly through its effect on the mechanical-stress gradient [Pak11].
Therefore, a burn-in process that has not created the appropriate thermal
scenarios will not sufficiently speed up the formation of the defects and,
consequently, such early-life defects will go undetected. In order to prevent
these test escapes, it is necessary to introduce a burn-in process that
enforces appropriate temperature scenarios on the IC. This necessity is
more urgent for the ICs that suffer from large temperature gradients, such
as 3D-SIC.
3D-SIC technology, similar to other deep submicron technologies, suffers
from high power densities. Additionally, power densities are considerably
higher in the test mode compared to the functional mode, in particular for
core-based designs. Consequently overheating may damage the ICs under
test. This means that the application of test stimuli to ICs can raise their
temperatures beyond their tolerable limits. This often undesirable effect is,
however, utilized in this thesis to heat up the IC for burn-in.
In our case the stimuli are not necessarily actual test patterns. Instead, they
could be specially generated sequences which cause large switching
activities. Such stimuli are called heating sequences. The use of the heating
sequences to heat up the IC from inside means that special equipment for
heating the IC from outside is not necessary. This will lead to large
reduction of cost, and also allow for the generation of necessary
temperature gradients.
Some temperature gradients might be enforced on an IC by applying
appropriate inputs to the IC’s input ports in the functional mode. This
might work, to some extent, for 2D ICs, since from the functional point of
view all the required circuitry, including the input ports, is fabricated and
available when the IC enters the test process. For 2D ICs, there are usually
two possible stages for burn-in: Wafer-Level Burn-In (WLBI) which is
performed before packaging and Die-Level Burn-In (DLBI) performed
after packaging [Semenov03]. For 3D-SIC, however, there are more
stages, including pre-bond, mid-bond, post-bond, and final stages
[Taouil12].
Existence of the test stages before the IC is fully assembled is a key
difference between the 2D and 3D-SIC burn-in process. In the case of 3D-
SIC, using input ports in the functional mode may benefit burn-in for the
post-bond and the final stages similar to 2D ICs. But for the pre-bond or
mid-bond stages, the inputs to the die or partially stacked dies are not
necessarily the inputs to the IC. The input ports to the unit under test for
3D-SICs, before the final bonding, are likely to include a number of TSVs.
The TSVs and test equipment are not designed to support simultaneous
application of functional signals, particularly to large number of TSVs
(even though they might be designed to allow simple electrical tests for the
TSV itself). Therefore, the use of the IC’s ports for enforcing the
temperature gradients is not possible for the pre-bond and mid-bond stages.
Despite this lack of access in the functional mode, the TAM provides access to
the cores in the test mode [Ieee14a]. Therefore, the heating sequences
could be applied using the TAM in order to enforce the desired gradients.
The necessity to utilize the TAM has yet another reason that is not specific
to 3D-SICs. The thermal gradients in some maps might be placed in
locations that cannot be properly stimulated through functional input ports.
Such thermal maps can often be enforced if the TAM is used. The reason
is that the TAM, in the test mode, provides direct access to cores; while in
the normal operational mode, a core might be limited to receive inputs only
from an adjacent core. Therefore, heating could be targeted toward a
specific core using the TAM.
5.1.2 Test for Delay Faults
Advanced SoCs manufactured by 3D-SIC technology suffer from a
considerably larger number of delay faults as compared with previous
technologies [Deutsch12]. The causes for these delay faults include
resistive bridges and vias, power droops, and cross-talk noise effects.
Therefore, delay-fault testing is necessary to provide sufficient fault
coverage [Patil07, Raina07]. A large number of pre-bond TSV defects are
resistive in nature and, moreover, the mechanical stress caused by TSVs
contributes also to delay faults [Chakrabarty12, Deutsch12]. Therefore, the
expected number of delay faults for 3D-SIC is much larger than that of 2D
ICs.
Since temperature has a significant effect on delay, its impact should be
taken into account for delay-fault test. A very important effect of
temperature on signal integrity is its effect on the clock network [Bota04].
Delay faults usually occur because of increased clock skew and a major
contributor to skew in 3D-SICs is temperature gradient [Mondal07]. Since
propagation delays depend on temperature, different temperatures on
different sites (i.e., temperature gradients) result in different clock skews.
Temperature gradients may reach up to 50 °C between adjacent cores for normal
operation and even higher during test [Borkar03, Bota04, Mondal07]. Such
large temperature gradients may lead to considerable clock skew and thus
many delay faults.
Moreover, the difference between the temperature maps during the normal
functional operation and temperature maps during test will result in non-
realistic delay faults [Bota04]. These delay faults usually happen because
of increased clock skew. Therefore, in order to detect the realistic delay
faults during the test, the test should be performed when the die has a
temperature map which corresponds to a normal functional situation.
In order to test a die under the thermal conditions that correspond to reality,
a simple technique is to first operate the chip with realistic inputs,
disregarding the test, until the specified thermal map is created. The test is
then started and continues as long as the difference between the actual thermal
map and the specified thermal map remains acceptable. When the difference grows
larger than accepted, the test is halted and the specified thermal map is
re-created, again disregarding the test. Apart from being slow, this scenario
has another problem in the case of 3D-SICs.
As discussed in the previous section, usually a die in a 3D stack has a large
number of TSVs as its input ports. The TSVs and test equipment are not
expected to be designed to support simultaneous application of realistic
signals, particularly to a large number of TSVs. Therefore, it is not possible
to use the IC’s real functional inputs to create the specified thermal map
for pre-bond and mid-bond tests. However, the test access mechanism can
be utilized for this purpose. This will be further discussed in section 5.2.1.
Besides creating realistic gradient scenarios to avoid test overkill, certain
unreal scenarios (see footnote 1) may help to detect certain early-life defects
before they cause further costs. As previously discussed, in the normal operational
mode such unreal scenarios may not be achievable since not all involved
cores are accessible. However, in the test mode, TAM may provide access
to these involved cores.
As mentioned before, the temperature gradients in 3D-SICs are much
larger than in 2D ICs [Plas10]. This will exacerbate temperature-gradient
related issues including delay faults, in particular, for 3D-SIC. Therefore,
the associated tests should be performed when the proper temperature
maps are enforced. A temperature map specifies the appropriate
temperatures for different sites (e.g., cores) in the IC. These temperatures
are to be realized simultaneously in order to enforce the proper temperature
gradients. The temperature maps are given along with their corresponding
tests. Besides gradient-based burn-in, the other objective of this chapter
is to introduce a technique to apply the tests while the corresponding maps
are enforced on the IC.
5.2 Temperature-Gradient Based Burn-In
5.2.1 Motivation and Problem Description
As discussed earlier, a temperature map specifies the desired temperature
values for different sites (e.g., cores). The temperature maps are to be given
by the user, who studies the typical temperature-gradient induced failure
mechanisms in an IC analytically or experimentally [Pak11, Smorodin08].
Each map corresponds to a particular temperature condition of an IC, such
as large temperature differences between adjacent cores (i.e., large
temperature gradients), that can accelerate aging for early-life failures or
enlarge the delay fault effect so that they can easily be tested for. There
might also be some locations in the ICs such that their temperatures are not
important regarding the targeted defects. Such locations are indicated as
don’t-cares. Even though they are marked as don’t-cares, their temperature
should, however, be kept below the overheating limit (denoted by
) in order to prevent damage.
Footnote 1: Unreal gradients are scenarios that are not expected to happen
during field operations. The opposite is “realistic gradients”, which happen
during normal operations.
When the expected locations in the IC simultaneously have the temperature
values that are specified by a map, it is said that that temperature map is
enforced. The specified temperature maps should be enforced quickly. In
case of burn-in, the temperatures should then be maintained for a given
period of time to achieve the intended effect and for test it should be
maintained as long as the corresponding tests are being performed.
Usually, there are many temperature maps that one would like to achieve
and maintain. Therefore, it is important to achieve them rapidly whether
the ICs start from the ambient temperature or from another map. The order
of the maps has a considerable impact on the overall burn-in/test time and
will be discussed in depth later in this chapter. For the time being, we
assume that the order of the maps is given and focus on other aspects of the
problem. In our work, a temperature map will be achieved by using heating
sequences sent through the TAM. Moreover, it is assumed that no test is
applied when an IC is kept under a temperature map for burn-in. This
assumption will be relaxed in section 5.3 so that the tests can be applied
when an IC is kept under a temperature map.
Assume that there are modules in an IC (on one or multiple dies) and
their tests can be started and stopped independently (e.g., the modules are
cores with core wrappers in a core-based design). In order to enforce the
specified temperature maps, heating sequences are used to heat up some of
the modules. The average power of the heating sequence is given by a real
number, denoted by for module . It is assumed that
the TAM only affords (a positive integer number) modules to be tested
simultaneously.
Assume that the desired temperature map is specified by a low temperature
limit and a high temperature limit for each module and the don’t-care
modules are declared separately. For example, a temperature map specifies
that module has a low temperature limit equal to and a high
temperature limit equal to .
The inputs to the proposed method include temperature maps, the IC’s
temperature model, the IC’s electrical model (e.g., specification of the
TAM and power-related specifications), switching activities of the heating
sequences, ambient temperature ( ), and overheating limit ( ). The output
is a schedule that guides the application of the heating sequences to the
modules so that their temperatures move into the specified ranges and stay
there.
Footnote 2: A list of notations and abbreviations is provided in section 5.6.
As an example, consider an IC with 3 modules, , , and . Assume
that a temperature map is specified as , ,
, , , and , and no module is specified
as don’t-care. These temperature limits are shown in Figure 5.2.1a with
dashed/dotted lines.
A temperature simulation is performed based on a proper periodic schedule
and the simulated temperatures are shown in Figure 5.2.1a. Starting from
the ambient temperature ( ), the modules’ temperatures
steadily rise until they are inside the specified ranges. As shown in this
example, applying heating sequences can drive the modules of an IC into
a high temperature situation. For example, the temperature of module
has reached at around Time Units (TU). A TU consists of
test cycles in this example.
[Figure 5.2.1: Temperature curves for an example. (a) Temperature [°C] of modules m0, m1, and m2 versus time [×10^4 TU]; (b) a magnified view covering three periods of the schedule, with intervals 0, 1, and 2 marked; (c) one period, from t0 to t3, showing the temperature and the heating on/off intervals.]
The temperatures around the TU point are amplified and shown
in Figure 5.2.1b. The time interval shown in Figure 5.2.1b corresponds to
three periods of the schedule. Since the schedule is periodic, one period
captures the entire schedule, which is repeated in a cyclic manner. Figure
5.2.1c further amplifies and shows one period of the schedule that starts at
t0 and ends at t3. The length of the period for this schedule is denoted by
T ( ). One period is divided into three intervals, specified by
numbers 0, 1, and 2 in Figure 5.2.1b.
They correspond to the time intervals (see footnote 3) [t0 t1], [t1 t2], and [t2 t3] in Figure
5.2.1c, respectively. The schedule specifies that the heating sequence for
module is applied only in the [ ] interval, the [ ]
interval, and in general in [ ] intervals (
), assuming that the process starts at time . The application of
the heating sequences for module and module are specified in a
similar manner by the schedule.
For the [ ] period, the time intervals that the heating sequences are
applied are depicted by gray areas in Figure 5.2.1c. In this example, the
TAM provides access to one module at a time ( ). Therefore, in
interval [ ] only module receives a heating sequence. Similarly, in
[ ] only is heated and the same goes for interval [ ] for . We
need an efficient algorithm to generate such schedules.
5.2.2 Steady State Solution
Let us first analyze a simplified situation, where we assume that a steady
state power could be provided for the modules. In this case, there exists a
steady state solution that could generate and maintain the specified
temperature map.
Providing continuous steady state powers simultaneously for all modules
is, however, very likely to be impossible mainly due to TAM limitations.
One solution is to use the maximal practical power for each core in
combination with a Pulse Width Modulation (PWM) technique. Therefore,
the best that can be achieved is a discrete stimulus sequence that has
constant long-term average power with small ripples. This way, the
modules have a time-divided multiple access to the TAM.
In order to reduce the risk of out-of-range temperatures due to ripples in the
input power, the desired steady state temperatures are defined at the middle
of the specified ranges. Such ripples could be seen
in the temperature curves given in Figure 5.2.1. In order to find the power
values that result in the specified temperatures, the IC’s temperature model
should be analyzed.
Footnote 3: The notation [a b] is used to represent an interval that ranges from a to b.
As previously discussed, the temperature model works by dividing an IC
into elements represented by nodes. Each node has a heat capacitance
modelling its thermal capacity. Adjacent nodes are connected through a
heat resistance that models the thermal conductivity between them. They
are connected together in a network configuration, similar to an electric
circuit. The temperatures correspond to voltages and the heat dissipation
corresponds to a current source. A node is called active if it directly
receives electrical power caused by switching activities.
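The electrical analogy above can be illustrated with a small simulation. The following is only a minimal sketch: the two-node network, all parameter values, the function name `simulate_rc_network`, and the use of a forward-Euler step are assumptions made for illustration and are not taken from the thesis. Node 0 is active and node 1 is passive, so node 1 is heated only through its neighbor.

```python
def simulate_rc_network(capacitance, g_ambient, g_between, power,
                        t_ambient, dt, steps):
    """Forward-Euler simulation of a lumped thermal RC network (illustrative).

    capacitance[i]  : heat capacitance of node i
    g_ambient[i]    : thermal conductance from node i to the ambient
    g_between[i][j] : thermal conductance between nodes i and j (symmetric)
    power[i]        : heat dissipated in node i (zero for a passive node)
    """
    n = len(capacitance)
    temps = [t_ambient] * n  # every node starts at the ambient temperature
    for _ in range(steps):
        new_temps = []
        for i in range(n):
            # Net heat flow into node i: own power minus losses to the
            # ambient and to the neighboring nodes.
            flow = power[i] - g_ambient[i] * (temps[i] - t_ambient)
            for j in range(n):
                flow -= g_between[i][j] * (temps[i] - temps[j])
            new_temps.append(temps[i] + dt * flow / capacitance[i])
        temps = new_temps
    return temps


# Hypothetical two-node example: only node 0 dissipates power.
final_temps = simulate_rc_network(
    capacitance=[1.0, 1.0],
    g_ambient=[0.1, 0.1],
    g_between=[[0.0, 0.5], [0.5, 0.0]],
    power=[2.0, 0.0],
    t_ambient=30.0,
    dt=0.01,
    steps=20000,
)
print(final_temps)
```

In this sketch the passive node settles above the ambient temperature although it receives no power of its own, which is the heat-transfer effect that the network model captures.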
A 3D-SIC is usually laid out so that the main blocks (e.g., logic and
memory) are placed at a certain distance from the TSVs to avoid
undesirable effects induced by TSVs such as high mechanical stress. Such
forbidden areas are called Keep-Out-Zones (KOZ) [Chakrabarty12,
Deutsch12]. A collection of the TSVs placed next to each other (perhaps
to overlap the KOZ of different TSVs and save area on the die) is called a
TSV block. A TSV block may consist of only one TSV if the TSVs are
placed far apart.
In this section (section 5.2.2) it is assumed that a module is a single active
thermal node. Furthermore, it is assumed that TSV blocks are always
thermally don’t-care and do not dissipate heat (are passive thermal nodes)
since their drivers are placed together with the corresponding modules.
These assumptions will be relaxed in section 5.2.4. The temperature
equation (equation 2.6.1) is repeated below for convenience:
(5.2.1)
Like before, is the temperature vector and is the power vector. Heat
transfer among nodes is included in the temperature model and it means
that a node can be heated up by its neighboring nodes even if it has no
switching activities.
The specified temperature map consists, in fact, of the steady state
temperatures that must be enforced on the IC for a while. A temperature
map could be thought of as the targeted steady state temperatures, , which
are composed of the desired steady state temperatures for each module
(e.g., for module ). Since is, in this case, equivalent to the steady
state temperatures, which are considered constant (for a certain amount of
time), its derivatives are zero (no variation in time). Therefore, equation
5.2.1 (similar to equation 3 in [Aghaee14b]) may be written as
(5.2.2)
This means that it is possible to calculate the required powers that lead to
the specified temperature map. In order for the specified temperature map
to be achievable, the computed steady state power values must satisfy a
feasibility and a schedulability condition. The first part of the feasibility
condition is that the computed steady state power for module ( )
should be larger than or equal to the stray power dissipated by the module.
The stray power is an unintended part of the power that could not be
independently controlled. Its value for module is denoted by . It
consists of the leakage power in addition to the clock networks’ power. As
previously discussed, the clock networks’ power can be large [Oberg03].
Therefore, it is important to take it into account.
The second part of the feasibility condition is that should be less than
or equal to the average power of the corresponding heating sequence, ,
plus . Therefore, the feasibility condition is:
(5.2.3)
Usually the feasibility condition is easily met if the specified temperature
map is realistic (e.g., the specified temperature is neither lower than the
ambient nor larger than the achievable temperature). Assuming that
equation 5.2.3 is satisfied, the schedulability condition which is related to
the limited TAM bandwidth should be verified. The challenging problem
here is to create the required average power values, , using the available
TAM bandwidth. This is done by selectively applying the heating
sequences to the modules.
The continuous application of the heating sequence generates an average
dynamic power equal to . The desired power values, , which are
smaller than , are created by applying the heating sequence, ,
for a fraction of a time period. The average power in a period should be
made equal to the required steady state power. As mentioned before, this
is done using a technique similar to PWM. The ratio of the duration of
heating sequence application to the overall time period is therefore called
Duty-cycle ( ) and its value is calculated using the following equation.
(5.2.4)
The duty-cycles might not be achievable if their values are relatively large
and if the TAM does not provide sufficient bandwidth. For example,
assume a design with two modules, with the duty-cycles D0 = 0.6 and
D1 = 0.8. This means that in a period of time equal to 1, we need access to
module 0 for 60% of the time and access to module 1 for 80% of the time.
Therefore, simultaneous access to more than one module (0.6 + 0.8 = 1.4
modules) is needed. This means that the TAM must provide simultaneous
access to these two modules otherwise these duty-cycles are not
schedulable and the specified temperature map cannot be enforced.
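The duty-cycle computation and the bandwidth check in the example above can be sketched as follows. This is a hedged illustration, not the thesis's implementation: the function names, the argument names `p_ss`, `p_stray`, and `p_heat`, and the numerical values are assumptions. The duty cycle is taken as the share of the heating-sequence power needed on top of the stray power, which follows the verbal description of the PWM-like technique.

```python
def duty_cycle(p_ss, p_stray, p_heat):
    """Fraction of the period during which the heating sequence is applied.

    p_ss    : required steady state power of the module
    p_stray : stray (leakage plus clock network) power of the module
    p_heat  : average power of the module's heating sequence
    """
    return (p_ss - p_stray) / p_heat


def is_schedulable(duties, width, tol=1e-9):
    """Check the feasibility and schedulability conditions.

    duties : duty cycles of all modules
    width  : number of modules the TAM can serve simultaneously
    """
    feasible = all(-tol <= d <= 1.0 + tol for d in duties)
    return feasible and sum(duties) <= width + tol


# The two-module example from the text: 0.6 + 0.8 = 1.4 > 1.
print(is_schedulable([0.6, 0.8], width=1))
print(is_schedulable([0.6, 0.8], width=2))
```

With a TAM that serves one module at a time the check fails, whereas access to two modules at a time makes the same duty cycles schedulable.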
Note that can be divided into pieces; for example could be
implemented by first applying the heating sequence for a duration equal to
at the beginning of the period and then for a duration of
at the end of the same period. The feasibility and schedulability
conditions could be written together using the duty cycle concept as
follows:
$$
\begin{aligned}
&0 \le D_m \le 1 \quad \text{for every module } m\\
&\sum_{m} D_m \le W
\end{aligned}
\tag{5.2.5}
$$
In fact, the first line in equation 5.2.5 is identical to the feasibility condition
in equation 5.2.3, which is written here in terms of the duty cycles. The
second line in equation 5.2.5 is the schedulability condition, where W is
the number of modules that can access the TAM simultaneously. Given a
temperature map that satisfies both feasibility and schedulability
conditions, it is relatively simple to develop a schedule to deliver the
required duty cycles.
Figure 5.2.2a gives an illustrative example, where the available
parallelism, W, provided by the TAM is represented by the number of rows
that could be filled with the duty-cycles, D_m. The scheduling
algorithm starts by sorting the duty-cycles and then allocating them from
the largest one to the smallest ones by filling the rows from the lowest one
upwards. Note that if a duty-cycle starts in the lower row and continues to
an upper row, it will not reach the end of the upper row, since D_m ≤ 1.
Therefore, a duty cycle will, at most, be assigned to two TAM rows and a
module needs to switch at most twice during a period. The overheads
associated with switching are thus negligible. The fractions of the time
period that the modules receive heating sequences are illustrated in
Figure 5.2.2b. At every moment in time only three modules are receiving
their heating sequences (the TAM limitation is not exceeded), and the
average of applied heating sequence for a module in a period is equal to
the specified steady state power.
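The row-filling procedure described above can be sketched in a few lines. The function name `fill_rows`, the output layout, and the tolerance handling are assumptions made for this illustration; only the idea of sorting the duty cycles and filling the TAM rows from the lowest row upwards comes from the text. Times are expressed as fractions of the period.

```python
def fill_rows(duties, width, tol=1e-9):
    """Allocate duty cycles to TAM rows, largest first (illustrative sketch).

    Returns {module: [(row, start, end), ...]} where start and end are
    fractions of the schedule period.
    """
    if any(d < -tol or d > 1.0 + tol for d in duties) or sum(duties) > width + tol:
        raise ValueError("duty cycles are not schedulable")
    # Python's sort is stable, so equal duty cycles keep their module order.
    order = sorted(range(len(duties)), key=lambda m: duties[m], reverse=True)
    schedule = {m: [] for m in order}
    row, pos = 0, 0.0
    for m in order:
        remaining = duties[m]
        while remaining > tol:
            if 1.0 - pos <= tol:  # current row is full, move one row up
                row, pos = row + 1, 0.0
            piece = min(remaining, 1.0 - pos)
            schedule[m].append((row, pos, pos + piece))
            pos += piece
            remaining -= piece
    return schedule


# Duty cycles of the example in Figure 5.2.2 with three TAM rows.
for module, pieces in fill_rows([1.0, 0.75, 0.75, 0.5], width=3).items():
    print(module, pieces)
```

Because no duty cycle exceeds one, the two pieces of a split duty cycle never overlap in time, so a module is never asked to use two rows at once.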
As mentioned before, a thermal map may leave the temperatures for some
nodes unspecified (don’t-care nodes). Besides, the temperatures for
inactive thermal nodes (e.g., TSV blocks) are also left unspecified. On the
other hand, in order to compute the steady state powers, these temperatures
should also be known. The proper choice of temperatures for the don’t-
care nodes may help a thermal map that is otherwise not schedulable
become schedulable. The problem of finding proper temperature values for
the don’t-care nodes could be formulated as a Linear Programming (LP)
problem. Since we are more interested in knowing the duty cycles than the
temperatures, the problem formulation is, then, written with the duty cycles
as decision variables, as shown in Figure 5.2.3. The main objective is to
find a feasible solution.
In Figure 5.2.3, the temperatures, , should have the values specified by
the thermal map, (line 4). If not specified by the temperature map (e.g.,
don’t care modules or inactive nodes) the temperatures should be between
the ambient and the overheating temperature (line 5). For an inactive
module, the power value should be equal to the stray power, , and
therefore the duty cycles should be zero (line 6). For an active node, the
duty cycles are between zero and one (line 7) according to equation 5.2.5.
[Figure 5.2.2: An example for scheduled duty-cycles, with W = 3 TAM rows, M = 4 modules (m = 0 to 3), and period T. Sorted duty-cycles: D0 = 1.00, D1 = 0.75, D2 = 0.75, D3 = 0.50. (a) Row allocation over one period in quarters, from the top row down: D2 D2 D3 D3; D1 D1 D1 D2; D0 D0 D0 D0. (b) Heating on/off per module from t0 to t0+T.]
Besides, the duty cycles should satisfy the schedulability condition
according to equation 5.2.5 (line 8 in Figure 5.2.3). The relation between
the power values, , and the duty cycles is defined by equation 5.2.4.
The temperatures, , are computed based on power values, , using
equation 5.2.2 (by replacing and with vectors composed of
and , respectively). If the LP solver finds a feasible solution, then the
thermal map is achievable and the duty cycles are returned by the LP
solver. We also have the temperature values of the don’t-care modules.
Knowing the duty cycles, a proper period for the PWM-like method has to
be found.
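The constraints described above for Figure 5.2.3 can be summarized in the following sketch. The symbol names are assumptions chosen for illustration and need not match the notation of the thesis: $\Theta_n$ and $P_n$ are the temperature and power of node $n$, $\Theta^{map}_n$ is the temperature given by the map, $\Theta^{amb}$ and $\Theta^{max}$ are the ambient temperature and the overheating limit, $P^{heat}_n$ and $P^{stray}_n$ are the heating-sequence and stray powers, $D_n$ is the duty cycle, and $W$ is the TAM parallelism. The steady state relation between temperatures and powers is assumed to be linear with a coefficient matrix $\mathbf{B}$.

```latex
\begin{aligned}
\text{find}\quad & D_n,\ \Theta_n \\
\text{subject to}\quad
& \Theta_n = \Theta^{map}_n && \text{nodes specified by the map} \\
& \Theta^{amb} \le \Theta_n \le \Theta^{max} && \text{don't-care or inactive nodes} \\
& D_n = 0 && \text{inactive nodes} \\
& 0 \le D_n \le 1 && \text{active nodes} \\
& \textstyle\sum_n D_n \le W && \text{schedulability} \\
& P_n = D_n\,P^{heat}_n + P^{stray}_n,\quad \mathbf{B}\,\boldsymbol{\Theta} = \mathbf{P} && \text{power and temperature relations}
\end{aligned}
```

Since the main objective is only to find a feasible solution, no particular cost function is assumed in this sketch.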
Finding an Appropriate Period for PWM-Based Schedule
The duty cycles and the scheduling approach, discussed so far, are
independent of the schedule’s period, . They generate the modules’
temperatures such that their average equals the specified steady state
temperatures. The period, , should be short enough so that the fluctuations
in the temperatures do not violate the specified limits ( and ). On the
other hand, a longer period is desirable in order to minimize the switching
actions in the schedule. An example for the results obtained by the
proposed algorithm could be seen in Figure 5.2.1a. After the temperatures
have completed their transitions to their new values (after TU),
the proper choice of the period keeps them inside the specified ranges, with
a relatively low number of switching actions in the schedule.
[Figure 5.2.3: Linear programming formulation, listing the decision variables, the objective, and the constraints on lines 1 to 9; equations 5.2.2 and 5.2.4 relate the variables.]
In order to find a relatively long period, T, that nevertheless keeps the
temperature fluctuations inside the specified ranges, two different
situations should be considered: (H) the heating sequence is applied (e.g., the
second half of the period for module in Figure 5.2.2b); and (L) no stimuli
are applied (e.g., the first half of the period for module in Figure
5.2.2b). In order to estimate the proper period for situation (H), equation
5.2.1 is re-written around the steady state temperature for the heating
sequence power, as shown in equation 5.2.6a. For situation (L), equation
5.2.6b is used, instead.
(5.2.6a)
(5.2.6b)
An illustrative example for the above equations is given in Figure 5.2.4.
Equation 5.2.6a describes the tangent line that touches the temperature
curve at point A, around the steady state temperature. A similar example
for equation 5.2.6b is the tangent line, CD, in Figure 5.2.4. Equation 5.2.6a
is then used to estimate the desired value for the period focusing only on
the high temperature limit. Assume that the proper , only focusing on
situation (H) and ignoring situation (L), is denoted by . Similarly, the
proper , only focusing on situation (L), is denoted by .
It is safe to assume that ( is the duty cycle) is the amount of
time that will result in a near violation situation for module in situation (H). In order to estimate , first the derivative on the left side of equation
5.2.6a is linearly approximated as follows:
(5.2.7)
Now, is computed for module as
(5.2.8)
[Figure 5.2.4: An example for the computation of a safe period. A long but safe period is computed so that the temperature limits will not be violated any more. The plot shows the temperature of module m over time together with the heating sequence on/off signal, the points A, B, C, and D on the tangent lines, and the intervals D_m × T_m^H and (1 − D_m) × T_m^L.]
The values for are obtained from the right side of equation 5.2.6a
and, consequently, the values for are computed using equation 5.2.8.
For example, in Figure 5.2.4, when the module is receiving active power,
the derivative that is represented by a straight line is tangential to the
temperature curve at its intersection point with the steady state temperature
at point A and later on intersects with the high temperature limit at point
B. The period, , is then calculated based on the time difference between
A and B. The other part of the line that stands between A and the low
temperature limit is deliberately left out in order to achieve a shorter period
that is safe in most of the situations (e.g., variation in the input power).
In a similar manner values for situation (L), , are calculated based on
equation 5.2.6b focusing only on the low temperature limit. Since the
temperatures should not violate any of the specified limits, the shortest
( ) is selected as the acceptable period for module .
The actual period, , should be the smallest among the acceptable periods
for all modules ( ) so that none of the temperature limits for
the modules is violated. For example, after the temperatures have
completed their transitions to their new values in Figure 5.2.1 (after
time units), the proper choice of the period keeps them inside the
specified ranges, despite relatively large fluctuations caused by the relatively
low number of switching actions in the schedule.
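The period selection described above can be sketched as follows. This is an illustration under stated assumptions rather than the thesis's implementation: the function name `safe_period`, the dictionary keys, and the numerical values are hypothetical, and the temperature slopes at the steady state temperature (the right-hand sides of equations 5.2.6a and 5.2.6b) are assumed to be already known. Following Figure 5.2.4, the heating phase lasts D_m × T and must not cross the high limit, while the idle phase lasts (1 − D_m) × T and must not cross the low limit.

```python
import math


def safe_period(nodes):
    """Return the shortest acceptable period over all modules (sketch).

    Each entry of `nodes` is a dict with:
      ss, low, high : steady state temperature and its allowed range
      duty          : duty cycle in [0, 1]
      slope_on      : temperature slope at `ss` while heating (> 0)
      slope_off     : temperature slope at `ss` with no stimuli (< 0)
    """
    period = math.inf
    for node in nodes:
        duty = node["duty"]
        # Situation (H): the tangent must not reach the high limit
        # within the heating phase of length duty * period.
        if duty > 0.0 and node["slope_on"] > 0.0:
            t_high = (node["high"] - node["ss"]) / (node["slope_on"] * duty)
            period = min(period, t_high)
        # Situation (L): the tangent must not reach the low limit
        # within the idle phase of length (1 - duty) * period.
        if duty < 1.0 and node["slope_off"] < 0.0:
            t_low = (node["ss"] - node["low"]) / (-node["slope_off"] * (1.0 - duty))
            period = min(period, t_low)
    return period


# Hypothetical values for two modules.
modules = [
    {"ss": 100.0, "low": 90.0, "high": 110.0, "duty": 0.5,
     "slope_on": 2.0, "slope_off": -2.0},
    {"ss": 60.0, "low": 55.0, "high": 65.0, "duty": 0.25,
     "slope_on": 4.0, "slope_off": -1.0},
]
print(safe_period(modules))
```

Taking the minimum over both situations and over all modules mirrors the rule that none of the temperature limits may be violated.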
Moreover, the average of the applied heating sequences for each module is
equal to the specified steady state power for it. For example in Figure
5.2.1c, modules , , and receive 50, 35, and 15 percent of ,
, and plus , , and , respectively. This is indicated by the
width of the gray areas as compared with the schedule’s period, (
).
5.2.3 Transient Solution
Until now, it was assumed that the power values applied to an IC during
transition to a new map are the same steady state powers that are used to
maintain the new map afterwards. This implies that the transition to a new
map is very slow and the transition time may be excessively long. For
example, as shown in Figure 5.2.1, it takes about time units for
the IC to reach the specified thermal map from the ambient temperature.
Here, in this section, burn-in time is the time required for bringing the IC
into a thermal situation that complies with the first thermal map and then
to the next map, until all maps are applied. It is likely that a large number
of thermal maps are specified and therefore the transition to a new map
should happen very fast. After transition, the map is maintained using the
steady state powers, , as calculated in the previous section. In order to
reduce the burn-in time, a new solution that takes the transient response
into account and uses larger or smaller power values (compared to the
steady state solution) is presented here.
In this section, the transient response is taken into account while
minimizing the overall transition time. We start by looking into the analytic
solution for equation 5.2.1. This was previously discussed in section 4.6.
The closed-form solution for a duration of time equal to , is copied below
from equation 4.6.5:
(5.2.9)
In the above equation, and are matrices that are computed based
on and , and for a duration of time equal to , as follows (similar to
equation 4.6.3–4):
(5.2.10a)
(5.2.10b)
In the rest of this chapter and are represented as and ,
respectively. The initial temperatures are expressed by and the
temperatures at time is denoted by . is the power vector that is
assumed to be constant for the time interval . An intuitive explanation of
equation 5.2.9 is that determines how fast the initial temperatures fade
away and determines how fast the input power affects the temperatures.
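The intuition behind the closed-form solution can be illustrated with a single thermal node. The sketch below is an assumption-laden special case, not the matrix form used in the thesis: it assumes one node with heat capacitance `cap`, conductance `cond` to the ambient, a constant input `power`, and temperatures measured relative to the ambient. The scalar `decay` plays the role of the term that makes the initial temperature fade away, and `(1 - decay) / cond` plays the role of the term that lets the input power take effect.

```python
import math


def node_temperature(theta0, power, cap, cond, t):
    """Temperature rise of a single RC node after time t (scalar sketch).

    theta0 : initial temperature rise above the ambient
    power  : constant heat input during [0, t]
    cap    : heat capacitance of the node
    cond   : thermal conductance from the node to the ambient
    """
    decay = math.exp(-cond * t / cap)  # how much of theta0 is still present
    return decay * theta0 + (1.0 - decay) * power / cond


# Hypothetical values: the node starts 5 degrees above the ambient.
for t in (0.0, 10.0, 100.0):
    print(t, node_temperature(5.0, 2.0, 1.0, 0.1, t))
```

For long durations the result approaches `power / cond`, which corresponds to the steady state of this one-node model.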
As mentioned before, achieving a new temperature map in a short time is
crucial and, therefore, this transition should happen as fast as possible.
Once the IC’s temperatures have converged to the specified temperature
map, they can be maintained using the steady state powers, , found by
the steady state solution as presented in the previous section.
We would like to extend the steady state solution approach to equation
5.2.9, which includes the transient response, in order to find the
schedulable power values that result in the shortest transition time. The
new problem can be formulated as:
Find the shortest transition time, , and the corresponding power
values, , such that the specified map is achievable.
The transition time from map to map is defined as the time required
to construct the temperatures specified by map starting from
temperatures specified by map .
This problem can be solved using an iterative approach that tries different
alternatives for . The main part of the proposed algorithm is illustrated in
Figure 5.2.5. The algorithm uses the latest information regarding the
interval that contains the optimal transition time. This interval is denoted
by [λ σ]. At any step, it is known from the previous steps that the specified
map is not achievable for transition times shorter than λ.
It is also known that since the temperature map is achievable for a
transition time equal to σ, longer transition times are not optimal. Initially
λ is set to zero and σ to the transition time for the steady state approach (1st
step in Figure 5.2.5). This steady state transition time is obtained by
simulating the temperatures when the steady state schedule is used. A
number of candidate transition times with uniform distances are selected
between λ and σ (2nd step in Figure 5.2.5) according to:
(5.2.11)
The -th candidate transition time is denoted by . is the number of
parallel LP solvers and its value is selected based on the degree of
parallelism offered by the platform that runs the algorithm. For example,
for a machine that supports eight threads, eight is a reasonable choice for
. For each candidate , solving the LP formulation determines whether
the temperature map is achievable or not (3rd step in Figure 5.2.5). This is
represented by the Boolean variable, , for the -th candidate transition
time.
The value of σ is updated to be equal to the smallest candidate transition
time that leads to schedulable power values. The value of λ is updated to be
equal to the largest candidate transition time that leads to power values that
are not schedulable (4th step in Figure 5.2.5). Note that if the map is
achievable for all the candidate transition times in Figure 5.2.5, then λ
remains unchanged. On the other extreme, if none of the candidates are
schedulable then σ remains unchanged. The algorithm stops when the
smallest transition time is found with acceptably low error (i.e., as shown
in the conditional step in Figure 5.2.5). The error is bounded by (σ − λ) and
therefore if this difference is smaller than the specified limit, then the
actual error, too, will be smaller than that limit.
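The interval-narrowing search can be sketched as below. This is a simplified illustration under explicit assumptions: the function name, the parameter names, and the exact spacing of the candidates are hypothetical, and the LP solver is replaced by a user-supplied callable `achievable(t)` that reports whether the map can be reached within transition time `t`. The candidates are evaluated one after another here, whereas the text evaluates them with parallel LP solvers. The sketch also assumes that every transition time longer than an achievable one is achievable as well.

```python
def shortest_transition_time(achievable, sigma, n_candidates=8, eps=1e-3, lam=0.0):
    """Narrow the interval [lam, sigma] around the shortest transition time.

    achievable   : callable, stands in for solving the LP for a candidate time
    sigma        : a transition time known to be achievable (upper bound)
    n_candidates : number of candidates tried per iteration
    eps          : acceptable error on the returned transition time
    lam          : a transition time below which the map is not achievable
    """
    while sigma - lam >= eps:
        step = (sigma - lam) / (n_candidates + 1)
        # Uniformly spaced candidates strictly between lam and sigma.
        candidates = [lam + (j + 1) * step for j in range(n_candidates)]
        results = [achievable(t) for t in candidates]
        good = [t for t, ok in zip(candidates, results) if ok]
        bad = [t for t, ok in zip(candidates, results) if not ok]
        if good:
            sigma = min(good)  # smallest achievable candidate
        if bad:
            lam = max(bad)     # largest candidate that is not achievable
    return sigma


# Hypothetical oracle: the map is achievable for transition times of 3.7 or more.
print(shortest_transition_time(lambda t: t >= 3.7, sigma=10.0))
```

Each iteration shrinks the interval by a factor of `n_candidates + 1`, so a larger number of parallel solvers reduces the number of iterations.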
The problem formulation for the LP solver that is used in the 3rd step in
Figure 5.2.5, is similar to the LP formulation in the previous section
(Figure 5.2.3) with the following differences: (1) Instead of s, the
temperatures at the end of the transition time, s, are used. (2) Instead of
equation 5.2.2, equation 5.2.9 is used to calculate the temperatures based
on the power values. The relation between the power values and the duty
cycles defined by equation 5.2.4 is modified by replacing with and
used as indicated in line 9 in Figure 5.2.3. If the LP solver finds a feasible
solution, the temperature map is achievable. This information is then used
to update the and values.
Since during the transition the temperatures will not be in the specified
ranges, the period for the PWM-like method is not crucial, unlike in the
steady state solution. Therefore, it is sufficient that the period is much
smaller than the transition time, , so that the average power is a
meaningful quantity for this span of time. For the experiments, the steady-
state-solutions’ periods are also used for the transient solution (they are
much smaller than ).
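[Figure 5.2.5: Main algorithm for the transient solution. The flowchart initializes σ to the transition time for the steady state solution, selects the candidate transition times, solves one LP per candidate, sets σ to the minimum candidate whose result is TRUE and λ to the maximum candidate whose result is FALSE, and then tests whether (σ − λ) is below the specified limit: if yes, the minimal transition time has been found; if no, the steps are repeated.]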
The matrix exponent computation for , in equation 5.2.10, is performed
using techniques proposed in [Ukhov12]. These techniques are used in
order to speed up the repeated recalculations of and for alternative
transition times. They are based on eigenvalue decomposition, utilizing the
inherent properties of matrices and and replace the excessively time
consuming matrix exponent calculations in equation 5.2.10 with simpler
operations. Although these techniques speed up the calculations, the
required time is still very large, as experimentally shown in section 5.2.6.
Even though the transient solution is an intuitive extension of the steady
state solution and greatly outperforms it, it is slow in generating the
schedules. Therefore, a new approach that avoids the time-consuming
successive calculations of and is necessary. Such an approach is
proposed in the next section, based on a fast heuristic. Moreover, this new
approach is capable of handling a more realistic problem formulation
compared with the steady state and transient solutions.
5.2.4 Transient-Based Heuristic
So far, it has been assumed that it is possible to apply a heating sequence to
an arbitrarily selected active thermal node and, simultaneously, avoid
application of heating sequences to all other nodes. This implies that the
smallest element in the temperature model should not be smaller than the
corresponding module on the TAM, in order to be able to control the
heating sequence application to it independently from all other elements.
On the other hand, a temperature model with finer granularities might be
preferable in order to achieve a better spatial precision in the simulated
temperatures and perhaps the gradients. This way, the temperature maps
can be planned with a higher resolution. Therefore, a technique that allows
the modules to be further divided into a number of sub-modules is
advantageous. These sub-modules correspond to a higher number of nodes
in the temperature model.
Let us assume that the overall number of thermal nodes, denoted by , is
larger than or equal to the number of modules ( ). In the rest of this
chapter, the desired temperature maps are specified for the thermal nodes
instead of the modules. Consequently, the temperature map specifies that
node has low temperature limit equal to and high temperature limit
equal to ( ).
In this new context, the switching activities for heating sequences are more
specific and provide information concerning the power breakdown among
active thermal nodes. For example, assuming that module is divided into
two active thermal nodes and , instead of only one heating sequence for
module , there will be two heating sequences corresponding to these two
nodes.
The average power of a heating sequence for active node is represented
by . The other active node of that module (i.e., node ) may also receive
power, denoted by . Therefore, when trying to heat up node with
, node is also heated by . Similarly, when trying to heat up node
with , node is also heated by . Such a situation cannot be
handled by the techniques previously proposed.
Furthermore, power dissipation for TSV blocks is now supported, and the
TSV drivers/buffers may be placed in TSV blocks and their desired
temperatures might also be specified in the temperature maps (not always
don’t-care, as assumed in the previous sections).
The proposed technique allows longer heating intervals during transition
time as opposed to relatively shorter heating intervals during steady state
(assuming that the new map’s temperature is higher). This relatively long
application of the heating sequence is called boosting. Boosting of an
active node stops when the node reaches the Stop Boosting temperature,
. The stop boosting temperature may be higher than the high
temperature limit, , but it is always lower than .
Boosting is helpful in different ways. One way is to achieve the following
desirable scenario. Assume that the node is initially heated beyond
( ). Then the node does not need to receive a heating sequence for
a while and this leaves the TAM available for other nodes. Meanwhile, the
temperature keeps decreasing naturally and just before the end of the
transition time (the moment that all other nodes are in their specified
temperature ranges), the temperature drops below the high temperature
limit.
This simplifies and shortens the schedule for the transition period and,
therefore, is desirable. An example for the temperature curves when the
transient-based heuristic is used is given in Figure 5.2.6 for thermal node
. The overall transition time is indicated by the gray area. The temperature
of node passes through the valid temperature range already in the interval
(a) in Figure 5.2.6. But the termination of the transition interval is deferred
since at least one of the other nodes, when is in the valid temperature
range, is outside its valid range.
A node’s temperature will naturally decrease if no power or little power is
applied to it, but it should not fall below the low temperature limit.
Therefore, a heating sequence should be applied at some point, before the
temperature falls out of range. This point is marked with a temperature
level named Heating Trigger and denoted by for active thermal node
( ). The heating sequence should be applied when the
temperature of node falls below .
The difference between and provides sufficient time for the node
to wait for gaining access to the TAM without its temperature falling below
. In Figure 5.2.6, the heating is required at the beginning of the interval
(c), but since the TAM is not available, the node waits. At the beginning of
the interval (d) the node has finally gained access to the TAM and the
heating begins.
Heating should stop when the temperature reaches the high temperature
limit. The time it takes to get back to the low temperature limit could be
utilized to heat up other nodes that need heating. When a
module consists of multiple active thermal nodes, the heating sequence
can only be applied if all of these thermal nodes have temperatures lower
than their high temperature limit.
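The rules described so far for a single node can be summarized in a minimal sketch. It is a simplified reading of the text, not the thesis implementation: the phase names, the `NodeLimits` container, and the function names are hypothetical, and the temperature values in the usage lines are made up.

```python
from dataclasses import dataclass


@dataclass
class NodeLimits:
    low: float              # low temperature limit
    heating_trigger: float  # heating is requested below this level
    high: float             # high temperature limit
    stop_boosting: float    # boosting ends here (may exceed `high`)


def next_phase(temp: float, lim: NodeLimits, phase: str, tam_free: bool) -> str:
    """Return the next phase ('boost', 'heat', 'wait', or 'idle') of one node."""
    if phase == "boost":
        target = lim.stop_boosting    # boosting continues up to this level
    elif phase == "heat":
        target = lim.high             # regular heating stops at the high limit
    else:                             # 'idle' or 'wait'
        target = lim.heating_trigger  # ask for heat only below the trigger
    if temp >= target:
        return "idle"                 # no heating needed, cool naturally
    if not tam_free:
        return "wait"                 # heating needed but the TAM is busy
    return "boost" if phase == "boost" else "heat"


def module_may_heat(temps, limits) -> bool:
    """A module may be heated only if all of its nodes are below their high limit."""
    return all(t < lim.high for t, lim in zip(temps, limits))


limits = NodeLimits(low=60.0, heating_trigger=65.0, high=80.0, stop_boosting=85.0)
phase_after_boost = next_phase(85.0, limits, "boost", tam_free=True)
```

The gap between `heating_trigger` and `low` is what gives a waiting node time to obtain the TAM before its temperature leaves the valid range.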
The nodes that simultaneously require heating should be accommodated
within the available bandwidth of the TAM. This bandwidth might not be
sufficient for all of them and, therefore, the nodes that need heating more
than others should be prioritized. The priorities for using the TAM are
determined based on the regional need for heating (denoted by around
a node ).
Figure 5.2.6 An example for transient-based heuristic (temperature of a thermal node versus time over intervals (a) to (h); phase labels: Boosting, Pause, Cooling, Wait, Heating; the transition period is marked)
The value of is recomputed whenever node needs heating. A node
requires heating in the following two situations: (1) When , after
the transition, for example the interval (d) in Figure 5.2.6. (2) When
, during the transition, for example the interval (a) in Figure 5.2.6. In
the following, we explain how to calculate for situation (1). Regional
need for heating for situation (2) is obtained in a similar manner by
replacing with .
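The use of the regional need for heating as a priority can be sketched as a greedy allocation of the TAM. This is a minimal sketch under assumptions: each heating request is assumed to occupy a fixed TAM width, the regional need is assumed to be already computed, and the names `grant_tam`, `requests`, and `tam_width` are hypothetical.

```python
def grant_tam(requests, tam_width):
    """Grant TAM access in decreasing order of regional need for heating.

    requests: list of (node_id, regional_need, width) tuples.
    Returns the ids of the nodes that receive their heating sequence.
    """
    granted = []
    remaining = tam_width
    for node_id, need, width in sorted(requests, key=lambda r: r[1], reverse=True):
        if width <= remaining:  # the request fits into the leftover bandwidth
            granted.append(node_id)
            remaining -= width
    return granted


# Made-up example: three nodes compete for a TAM of width 10.
example = [("a", 0.2, 4), ("b", 0.9, 6), ("c", 0.5, 4)]
winners = grant_tam(example, tam_width=10)
```

In this made-up example, node "a" has the lowest regional need and has to wait, which corresponds to the wait intervals in Figure 5.2.6.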
Equation 5.2.1 is re-written here with the approximate derivatives:
(5.2.12)
The input power, , in equation 5.2.1 is substituted with the stray power,
, plus the PWM power of heating sequences, . Vector is the
vector form of the regional need for heating and consists of s. Equation
5.2.12 is written for one test cycle with period which is a very short time.
The equation is then solved for the nodes that need heating as follows.
(5.2.13)
The regional need for heating, , depends on the required heating for node
(consider the summations when is equal to ), on the required heating
that is related to the adjacent nodes (consider the summations when
denotes an adjacent node to ), and on the average power of the
corresponding heating sequence, .
The regional need for heating for a node has the highest dependency on the
node itself, and then a relatively high dependency on the adjacent nodes
(this characteristic is captured by the temperature model). The influence of
other nodes located far away from the targeted node is small. The heat
transfer between nodes is taken into account automatically, since equation
5.2.13 is derived from the temperature equation (equation 5.2.1) and
includes the thermal conductances from matrix . This is reflected by
in equation 5.2.13.
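To illustrate what replacing the derivatives with approximations means, the following sketch performs one forward-difference step of a generic compact thermal model. The model form C dT/dt = P - G T (temperatures relative to ambient) is an assumption made for this example and is not claimed to be the exact form of equation 5.2.1. The names `cap`, `cond`, and `power`, as well as all numbers, are hypothetical.

```python
def euler_step(temps, power, cap, cond, dt):
    """One forward-difference step of C dT/dt = P - G T.

    temps, power, cap: per-node lists; cond: conductance matrix G as a
    list of rows; dt: the (short) time step, e.g. one test cycle.
    """
    n = len(temps)
    new_temps = []
    for i in range(n):
        # Heat leaving node i through the conductances, including neighbors.
        flow = sum(cond[i][j] * temps[j] for j in range(n))
        new_temps.append(temps[i] + dt * (power[i] - flow) / cap[i])
    return new_temps


# Two coupled nodes; only node 0 receives power.
G = [[1.5, -0.5], [-0.5, 1.5]]
step1 = euler_step([0.0, 0.0], [1.0, 0.0], [1.0, 1.0], G, 0.1)
step2 = euler_step(step1, [1.0, 0.0], [1.0, 1.0], G, 0.1)
```

After the second step, the unpowered node has also warmed up slightly. This is the effect of the off-diagonal conductances, and it is the reason why the need for heating of a node depends on its neighbors.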
Equation 5.2.13 ensures that the priority for using the TAM is given to the
regions that need longer heating times, for example because of large
and small . Furthermore, the locality of this heuristic is
helpful because adjacent nodes are likely to be in the same module and
therefore these nodes will receive some desirable active heating power
( ) or heat transferred from module .
The problem with heat transfer exists also in the previous sections, but it
was taken care of automatically by the LP solver. An effect of the interplay
between priorities could be seen in Figure 5.2.6. The waiting period in the
interval (f) is much shorter than the waiting period in the interval (c). The
length of a waiting period depends on the other nodes’ priorities in addition
to the node ’s priority. The priorities in thermal boost mode are computed
in a similar manner by replacing with (e.g., in equation 5.2.12–
13).
As discussed before, the performance of the transient-based heuristic
strongly depends on the stop boosting, , and heating trigger, ,
temperatures. One example is the priorities calculated using equation
5.2.13, since they depend on after the transition and on during the
transition. Efficient values for these temperature levels for each
temperature map and each thermal node are found using a PSO technique,
as introduced in section 2.7.
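As a rough illustration of how a PSO can tune two temperature levels, a minimal particle swarm sketch is given below. It is not the implementation used in this thesis. The cost function `toy_cost` is a made-up stand-in; in the actual flow, evaluating a candidate would require generating the schedule and measuring its transition time. The inertia and acceleration constants, the bounds, and all names are assumptions.

```python
import random


def pso_minimize(cost, bounds, particles=12, iterations=40, seed=0):
    """Minimal particle swarm optimization over box-bounded variables."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(particles)]
    vel = [[0.0] * dim for _ in range(particles)]
    best_pos = [p[:] for p in pos]         # personal best positions
    best_cost = [cost(p) for p in pos]     # personal best costs
    g = min(range(particles), key=lambda i: best_cost[i])
    g_pos, g_cost = best_pos[g][:], best_cost[g]  # global best
    for _ in range(iterations):
        for i in range(particles):
            for d, (lo, hi) in enumerate(bounds):
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * rng.random() * (best_pos[i][d] - pos[i][d])
                             + 1.5 * rng.random() * (g_pos[d] - pos[i][d]))
                # Move and clamp to the allowed temperature interval.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < best_cost[i]:
                best_pos[i], best_cost[i] = pos[i][:], c
                if c < g_cost:
                    g_pos, g_cost = pos[i][:], c
    return g_pos, g_cost


def toy_cost(levels):
    """Made-up stand-in for the transition time of a generated schedule."""
    heating_trigger, stop_boosting = levels
    return (heating_trigger - 66.0) ** 2 + (stop_boosting - 83.0) ** 2


# Heating trigger searched in [60, 80], stop boosting in [80, 95] (made-up bounds).
toy_bounds = [(60.0, 80.0), (80.0, 95.0)]
best_levels, best_value = pso_minimize(toy_cost, toy_bounds)
```

Clamping keeps every candidate inside its allowed interval, which is how constraints such as a stop boosting temperature below the overheating limit could be enforced in such a search.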
5.2.5 Remarks
The output of the steady state and transient solutions is a periodic offline
schedule, so a compact periodic schedule is one of their
advantages. The transient solution additionally returns the
transition time as an output. The periodic schedule generated by the
transient solution is applied only during the transition time; after that, the
steady state schedule must be used. A periodic schedule means that there
is a constant average power for each module during the transition, despite
the fact that a higher or lower average power might be suitable for different
periods. The transient-based heuristic addresses this issue by generating a
non-periodic offline schedule that facilitates the heating for the nodes that
need it the most. Furthermore, the introduction of the boost mode helps to
reduce the switching overheads in the schedule. For these reasons, the
transient-based heuristic offers a reduced transition time.
The proposed approaches also support heating sequences generated by a
Built-In Self-Test (BIST) engine. An example of the use of BIST engines
during burn-in in order to achieve high toggle coverage is reported in
[Carbine97]. Such BIST engines that stimulate high switching activities in
a certain area of the IC under burn-in can be used to produce heating
sequences online. The only difference, in our context, is that if the BIST
engine does not occupy the TAM, then it can be scheduled at any time as
needed.
For instance, assuming that module can receive its heating sequence
from an adjacent BIST engine that is not occupying the TAM, the 8th line of
the LP formulation in Figure 5.2.3 should be changed to:
. The situation for the transient-based heuristic is even
simpler, since the algorithm only needs to know that module can
receive its heating sequence at any time. Then, does not need to
compete with other modules for access to TAM. Consequently, there is no
need to evaluate the regional need for heating for .
The techniques proposed above make it possible to perform burn-in based
on heating sequences without requiring a heat chamber. One of the
situations when a heat chamber might be required is for the ICs that are
designed to work in an extremely high temperature environment. For
example, a microcontroller for a car engine is designed for low power so
that its temperature does not rise much above the very high ambient
temperature in the engine area. When such a chip is tested or operated at
a regular, low ambient temperature, its power density is not sufficient to
raise its temperature to the high level it has under normal working
conditions.
Another such situation is when some parts of an IC (e.g., package pins, die
to pin connections, and the interposer) cannot be heated up sufficiently by
input stimuli. In such cases, an extremely hot burn-in condition might be
required that is not achievable by exclusive use of heating sequences. Even
in such cases the use of the methods proposed in this thesis for enforcing
the temperature gradients will still be useful. The proposed algorithms do
not need any modifications to work under such situations, except for
setting a large ambient temperature corresponding to the heat chamber
temperature. Note that, as discussed previously in section 4.6, the thermal
behavior is modeled as a Linear Time Invariant (LTI) system. Therefore, a
higher ambient temperature adds directly to the temperature increases created
by the application of the heating sequences.
The focus of this chapter is not on the issues related to process variations.
Small temperature variations can be tolerated by introducing a safety
margin for the specified temperature limits, in particular the overheating
temperature. Large temperature variations need a variation-aware
technique, for example, by combining the method proposed in this chapter
with the techniques proposed in the previous chapter. This is, however,
outside the scope of this thesis.
We use the term “temperature gradients” strictly to refer to spatial
temperature differences, but we also use it in a relaxed manner to refer to
different sites’ temperature values. For example, the temperature difference
between two adjacent modules and that is , is exactly a
temperature gradient and accelerates early-life defect mechanisms in the affected area.
However, the fact that module ’s temperature is equal to and ’s is
is not directly a temperature gradient. These facts are captured by a
temperature map and affect the signal delays (for signals that are routed
through or close to these modules).
5.2.6 Experimental Results
The proposed techniques are evaluated for twelve experimental ICs with
one to three layers, as detailed in Table 5.2.1, columns 2, 3, and 4. The
one-layer experimental ICs (rows 1 to 4) are bare dies and could represent
the pre-bond test stage. The ICs that have two layers (rows 5 to 8) could
represent the mid-bond test stage. The ICs with three layers (rows 9 to 12)
could represent the post-bond test stage.
There are two, four, eight, and 16 physical modules per layer for different
dies, resulting in a total number of modules ranging from two to 48, as
given in column 3. There are one, two, and three TSV blocks per layer on
the dies, resulting in the total number of TSV blocks given in column 4,
ranging from one to nine. Each TSV block hosts a relatively large number
of TSVs. The dies are assumed to be stacked in a face-to-back
configuration.
The temperature models are extracted using an approach similar to the
method proposed in [Coskun09] for 3D-SIC. This is an extended form of
the technique used by HotSpot [Huang07] for normal 2D ICs. The heating
patterns’ switching activities are generated using Markov chains, as in
[Yao11c]. The temperature maps specify the valid temperature ranges
for nodes in the temperature model. The valid ranges are randomly selected
between , and some modules/nodes are randomly selected to be
don’t-care.
Only temperature maps that can be achieved in practice are considered. An
example of a temperature map that cannot be achieved is one that requires
a central node with a very low temperature and adjacent nodes with very
high temperatures. In this case, the temperature gradient is huge and would
probably require negative power (active cooling) for the central node.
The transient solution (section 5.2.3) and the transient-based heuristic
(section 5.2.4) are evaluated and compared with the steady state solution
(section 5.2.2). The transient-based method is capable of handling
temperature models having multiple nodes per module, while the steady
state and transient solutions only support one thermal node per module. In
order to have comparable experiments, the temperature model that is
supported by the steady state method is used for the other techniques.
The CPU time to generate the schedules with the transient-based method for
all twelve experimental ICs together is about 12 minutes, while the
transient solution takes 17 minutes and the steady state method completes in 2
seconds. As discussed earlier, the time required to bring the IC into a
thermal situation that complies with the first temperature map and then to
the next map until all maps are applied is defined as the overall transition
time in this work.
Table 5.2.1 Percentage changes achieved by proposed techniques
                                               Percentage change in
          IC Specifications                    overall transition time
IC        Number of  Number of  Number of      Transient  Transient-based
Number    layers     modules    TSV blocks     solution   heuristic
1         1          2          1              -83.88     -97.82
2         1          4          1              -68.35     -73.05
3         1          8          2              -64.97     -69.95
4         1          16         3              -56.93     -62.63
5         2          4          2              -64.37     -68.37
6         2          8          2              -58.32     -65.94
7         2          16         4              -57.19     -63.82
8         2          32         6              -43.99     -55.14
9         3          6          3              -70.44     -97.18
10        3          12         3              -57.15     -93.17
11        3          24         6              -84.56     -95.87
12        3          48         9              -92.06     -94.52
Average                                        -66.85     -78.12
The percentage changes in overall transition time offered by the transient
solution and the transient-based heuristic, compared with the steady state
solution, are given in columns 5 and 6 of Table 5.2.1, respectively.
A considerable speed-up (78% on average) is achieved by the transient-based
heuristic, which also outperforms the transient solution.
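The averages in Table 5.2.1 can be rechecked with a few lines of arithmetic. The sketch below recomputes the column means from the table values and converts a percentage change into a ratio of transition times. The helper names are hypothetical.

```python
# Columns 5 and 6 of Table 5.2.1 (percentage change in overall transition time).
transient_solution = [-83.88, -68.35, -64.97, -56.93, -64.37, -58.32,
                      -57.19, -43.99, -70.44, -57.15, -84.56, -92.06]
transient_heuristic = [-97.82, -73.05, -69.95, -62.63, -68.37, -65.94,
                       -63.82, -55.14, -97.18, -93.17, -95.87, -94.52]


def mean(values):
    return sum(values) / len(values)


def speedup_factor(percentage_change):
    """Ratio old time / new time for a given percentage change in time."""
    return 1.0 / (1.0 + percentage_change / 100.0)


avg_solution = round(mean(transient_solution), 2)    # -66.85
avg_heuristic = round(mean(transient_heuristic), 2)  # -78.12
```

A change of -78.12% means that the transition takes 21.88% of the time needed by the steady state solution, that is, it is roughly 4.6 times shorter.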
The CPU times of the transient-based heuristic for different numbers of
modules are given in Figure 5.2.7. Even though they grow rapidly with the
increase in the number of modules, the CPU time for an IC with 48 modules
is still relatively short (480 sec).
5.3 Temperature-Gradient Based Test
For the temperature-gradient based test, the goal is to make sure that the
tests are performed when the temperature gradients are correctly captured
on the IC. This means that the specified temperature maps should be
reached and maintained during the corresponding test periods. In the
following, a straightforward algorithm and then a fast heuristic are
proposed.
5.3.1 Straightforward Algorithm
This algorithm works by changing between two modes, the temperature
construction mode and the test mode. Initially the temperature construction
mode is activated and it creates the specified temperature map using a
method similar to the transient-based heuristic proposed in section 5.2.4.
Then the test mode is activated and the tests that are scheduled with a third
party algorithm (e.g., the scheduling method proposed in [SenGupta12]) are
applied. The test temperatures are simulated at design time and as soon as
at least one of the thermal nodes is out of its specified range, the test mode
is paused and the temperature construction mode takes over again. When
all thermal nodes are brought back into the specified temperature ranges,
the temperature construction mode is paused and testing resumes.
Figure 5.2.7 CPU time versus number of modules (vertical axis: CPU time [sec]; horizontal axis: Number of Modules)
Similar to the transient-based heuristic, if the temperature of a node is
lower than the heating trigger temperature, it should be heated by applying
the heating sequence to it. If there are many nodes that need heating (more
than what the TAM can support), priority is given to those with higher
regional need for heating as defined in section 5.2.4. The construction
mode, unlike the transient-based heuristic, should not heat the nodes up to
their high temperature limit since the power of the tests that are applied
immediately after the construction mode may rapidly heat up the node
beyond the high temperature limit. Therefore, Testing Trigger temperatures
which are denoted by for node ( ) are introduced
here. During the temperature construction mode, the heating for node
stops as soon as the temperature reaches .
In the test mode, as soon as the temperature of a node reaches the high
temperature limit, the test mode is immediately paused, the temperature
construction mode is activated and, consequently, a cooling interval is
applied. The cooling continues until the node is cooled down to the testing
trigger temperature, , and then the node is ready for testing again. The
actual activation of the test mode will also depend on the temperatures of
the other nodes. Efficient values for testing trigger temperatures, , for
each map are found using a particle swarm optimization technique along
with and .
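The alternation between the two modes can be sketched as a small decision function. This is one plausible reading of the rules above, not the thesis implementation: the test mode is left as soon as any node leaves its range or reaches its high limit, and testing resumes when every node lies between its heating trigger and its testing trigger. The tuple layout and the name `next_mode` are hypothetical, and the numbers are made up.

```python
def next_mode(mode, temps, limits):
    """Decide the next mode of the straightforward algorithm.

    mode: 'test' or 'construct'.
    limits: per node, (low, heating_trigger, testing_trigger, high).
    """
    if mode == "test":
        # Testing continues only while every node stays inside its range.
        in_range = all(low <= t < high
                       for t, (low, _, _, high) in zip(temps, limits))
        return "test" if in_range else "construct"
    # Construction mode: heat or cool until every node is ready for testing.
    ready = all(trigger <= t <= testing
                for t, (_, trigger, testing, _) in zip(temps, limits))
    return "test" if ready else "construct"


node_limits = [(60.0, 65.0, 75.0, 80.0)] * 2
mode_after_overheat = next_mode("test", [70.0, 80.0], node_limits)
```

Because a single node can force the whole IC out of the test mode, the testing trigger has to leave enough headroom below the high limit. This is why its value is tuned together with the other two temperature levels.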
The inputs to the methods proposed here in section 5.3 include the inputs
to the methods proposed in section 5.2 in addition to the test specifications
(e.g., test switching activities). The output is a set of offline schedules.
Moreover, the proper values for the heating trigger, , stop boosting
temperatures, , and testing trigger temperatures, , which result in a
rapid test could also be considered as the outputs that provide a basis for
an online scheduling scenario.
The straightforward algorithm is simple, and allows the choice of a desired
arbitrary test schedule that is used in the test mode. But the overall test
application time offered by this method is very long. Note that the total test
application time also includes the time intervals spent on temperature
construction.
5.3.2 Fast Heuristic
The fast heuristic schedules the tests together with the heating sequences
such that the specified temperature map is maintained. This way, a shorter
test application time can be achieved. An illustrative example for the
proposed method is given in Figure 5.3.1 for a single thermal node. The
proposed technique has similarities to the temperature construction
algorithm in section 5.2.4. For example, stop boosting temperature, ,
indicates that the boosting should stop, as illustrated at the end of interval
(a) in Figure 5.3.1. After being too warm, the node should cool until its
temperature gets below the testing trigger temperature, , as shown in
interval (b).
When the temperatures for all of the other thermal nodes covered by
module are between their high temperature limit, , and their heating
trigger, ( ), testing may start, as in interval (c) in Figure 5.3.1.
All other nodes should be within their temperature limits . Testing
continues until the temperature of at least one of the nodes goes beyond the
high temperature limit or falls below the heating trigger . For
example at the end of interval (c), the node is too cold for testing and a
heating interval should be introduced. Note that the TAM may no longer
be available and, therefore, the node is waiting for access to the TAM in
interval (d).
Finally, when access to the TAM is obtained, the heating sequence is
applied in interval (e). In order to start heating, all nodes covered by a
module should be colder than the high temperature limit since the heating
sequence for one node is very likely to inject power to other nodes in the
same module (as explained in section 5.2.4). Heating continues until the
temperature goes beyond the testing trigger temperature and, then, the test
resumes as in interval (f) in Figure 5.3.1. When the temperature reaches the
high temperature limit, a cooling interval is introduced as in interval (g).
Figure 5.3.1 An example for the fast heuristic (temperature of a single thermal node over intervals (a) to (h); phase labels in order: Boost, Cool, Test, Wait, Heat, Test, Cool, Test)
This procedure continues until all tests corresponding to the current
temperature map are completed.
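The sequence of intervals in Figure 5.3.1 can be reproduced with a small state function for a single node considered in isolation. This is a sketch under assumptions: the dependence on the other nodes of the module is ignored, and the names `Levels` and `fast_phase` as well as the numeric levels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Levels:
    heating_trigger: float
    testing_trigger: float
    high: float
    stop_boosting: float


def fast_phase(phase: str, temp: float, lv: Levels, tam_free: bool) -> str:
    """Next phase of one node: 'boost', 'cool', 'test', 'wait', or 'heat'."""
    if phase == "boost":
        return "boost" if temp < lv.stop_boosting else "cool"
    if phase == "cool":
        # Cool until the temperature gets below the testing trigger.
        return "cool" if temp >= lv.testing_trigger else "test"
    if phase == "test":
        if temp >= lv.high:
            return "cool"  # too warm for testing
        if temp < lv.heating_trigger:
            return "heat" if tam_free else "wait"  # too cold for testing
        return "test"
    if phase == "wait":
        return "heat" if tam_free else "wait"
    if phase == "heat":
        # Heat until the temperature goes beyond the testing trigger.
        return "heat" if temp <= lv.testing_trigger else "test"
    raise ValueError(f"unknown phase: {phase}")


levels = Levels(heating_trigger=65.0, testing_trigger=75.0,
                high=80.0, stop_boosting=85.0)
after_boost = fast_phase("boost", 85.0, levels, tam_free=True)
```

Feeding the function with a simulated temperature trace yields the Boost, Cool, Test, Wait, Heat, Test, Cool, Test pattern shown in the figure.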
As mentioned before, nodes will compete for access to the TAM and,
therefore, some of them should be prioritized. First, the nodes that require
heating (rather than the tests) are granted access to the TAM. This helps to
keep the temperatures within the specified limits most of the time and, thus,
keeps the flow of the tests uninterrupted. Note that if even one node falls
out of its specified range, all tests must be interrupted until the map is
achieved again. This wastes a lot of time, since the tests for the modules
that are in their specified range must also be interrupted. The priorities for the
nodes that require heating are determined based on the regional need for
heating as proposed in section 5.2.4 (equation 5.2.13).
If the TAM is left with some available bandwidth after the heating
sequences are scheduled, the modules that are thermally qualified may
resume their tests. A module is thermally qualified if none of the nodes that
correspond to that module is required by the previously discussed rules
to receive heating, wait for heating, or receive cooling. The priority is given
to the modules that are expected to offer long test endurance. The test
endurance is denoted by for module , and is defined as:
(5.3.1)
The test endurance is directly proportional to the remaining test size
denoted by for module . The larger the remaining test size, the longer
the test endurance. The thermal tolerance, denoted by for module ,
is the other contributor to the test endurance. High thermal tolerance, ,
indicates that the module is capable of receiving tests for a relatively long
time without exceeding the specified thermal limits. Therefore, a module
with large thermal tolerance may remain under test for a relatively long
time. The thermal tolerance is defined as:
(5.3.2)
In equation 5.3.2, it is assumed that module covers active thermal
nodes. ( ) denotes the expected thermal distance to a
temperature limit for node and is defined as:
(5.3.3)
As mentioned in section 5.2.2, the desired steady state power is the
power that results in a temperature equal to .
Equation 5.3.3 indicates that if the upcoming tests have relatively high
average power, then it is likely that the thermal node exceeds the high
temperature limit and, therefore, the difference between the current
temperature, , and the high temperature limit, , is a good measure for
thermal tolerance.
Similarly, for a relatively low power test, it is more likely that the
temperature falls below the heating trigger in the future. Therefore, the
difference between the current temperature, , and the heating trigger
temperature, , is a good measure for thermal tolerance. Thermal
tolerance, , is defined as the smallest ( ) since as
soon as a single node is out of the specified range , regardless of
the temperatures of the other nodes, the test should be interrupted. Note that if
the temperature falls below , only for a node in module , then the test
is interrupted only for module .
A proper value for the testing trigger temperature, is selected so that
the temperature variation during test (caused by the variations in the test
power) rarely results in the temperatures below or above . Every
time that or are violated, the test must be interrupted and a heating
or cooling interval must be introduced, respectively. Since these are time
consuming, a proper value helps to obtain a short test application time
by reducing the number of interruptions. Besides the testing trigger
temperature, stop boosting and heating trigger temperatures ( and
respectively) have a considerable effect on the test application time and
therefore proper values for them should be found. A particle swarm
optimization technique, as discussed in section 2.7, is used to find the
proper values for , , and for each map.
5.3.3 Experimental Results
The fast heuristic (section 5.3.2) is evaluated and compared with the
straightforward method (section 5.3.1). An experimental setup similar to
section 5.2.6 is used here. This includes experimental ICs described in
Table 5.2.1. For convenience, columns 1–4 from this table are repeated in
Table 5.3.1 that reports the experimental results. The temperature model
used for these experiments has multiple nodes per module, as opposed to
experiments presented in section 5.2.6.
The total time required to enforce a temperature map and maintain it while
the tests are being applied, in addition to the time spent applying the
corresponding tests, is defined as the test time in this section. The
percentage change in test time offered by the fast heuristic compared with
the straightforward method is given in column 5 of Table 5.3.1, which
shows that a considerable speed-up (67% on average) is achieved.
The percentage change in CPU time required by the fast heuristic
compared with the straightforward method is -36%. The overall CPU time
depends on the interaction between the computational complexity of a
single decision point (see footnote 4) in the schedule and the schedule
length. The experimental results indicate that since the fast heuristic makes
better decisions than the straightforward method, the overall
length of the schedule is reduced considerably and therefore the overall
CPU time is also reduced. This happens despite the fast heuristic’s
higher computational complexity for individual decision points. In fact, the
schedule length is an important contributor to the CPU time, since longer
schedules require longer temperature simulations and temperature
simulation is, per se, very time consuming.
4 A decision point is a point in the schedule where the scheduling algorithm must
decide about the upcoming states (e.g., whether to cool, wait, heat, or test).
Table 5.3.1 Percentage changes achieved by fast heuristic
          IC Specifications                    Percentage change in
IC        Number of  Number of  Number of      test time achieved by
Number    layers     modules    TSV blocks     fast heuristic
1         1          2          1              -16.97
2         1          4          1              -39.69
3         1          8          2              -63.35
4         1          16         3              -94.77
5         2          4          2              -8.70
6         2          8          2              -60.80
7         2          16         4              -78.17
8         2          32         6              -95.04
9         3          6          3              -75.90
10        3          12         3              -84.81
11        3          24         6              -87.08
12        3          48         9              -94.72
Average                                        -66.67
The CPU times of the fast heuristic for different numbers of modules are
given in Figure 5.3.2. Even though they grow rapidly with the increase in
the number of modules, the CPU time for an IC with 48 modules is still
acceptably short. The CPU times for the burn-in (section 5.2) are relatively
shorter, since here the tests are also scheduled along with the heating sequences.
The growth rate of the CPU times, as shown in Figure 5.3.2, is tolerable,
as it is for the transient-based heuristic (section 5.2.4). This was expected
since these algorithms are very similar.
5.4 Temperature-Map Ordering
The order in which the maps are enforced has a considerable impact on the
overall burn-in and test time. Since there are usually a number of
temperature maps to be applied, their ordering is important. In this section
we present methods to rapidly obtain a proper order for temperature maps
that results in a short burn-in and test time.
5.4.1 Map Ordering Technique
To simplify the discussions, let us assume that the temperature map for a
thermal node is represented by the middle value of the specified
temperature range . As an example, assume that an
IC has two thermal nodes and the initial temperature is . The specified
temperatures, by temperature map , are denoted by . This
means that temperatures and are specified by map for nodes
and , respectively. Assume that there are three temperature maps
denoted by , , and . These maps specify the following temperatures:
= { , }, = { , }, and = { , },
respectively.
5 The notation { , , …, } is used to represent an ordered sequence of
elements ( ).
Figure 5.3.2 CPU time versus number of modules (vertical axis: CPU time [min]; horizontal axis: Number of Modules)
These temperature maps are represented in Figure 5.4.1a–b by three points
in a Cartesian space. The temperature for node is represented by the
horizontal axis, , and for node by the vertical axis, . The initial
order of temperature maps { , , } requires a long time to increase the
temperature for node from 30 to 110 ( in Figure 5.4.1a), then decrease
it to 40 ( in Figure 5.4.1a), and then again increase it from 40 to 110 (
in Figure 5.4.1a). This process will take a long time due to the required
large changes in the temperature. In contrast, it is much faster to work with
the maps ordered as { , , }, since in this case, the required
temperature changes consist of smaller temperature variations, as shown in
Figure 5.4.1b.
As discussed earlier, in order to minimize the overall transition time for
burn-in, a particle swarm optimization (PSO) technique finds the proper
values for stop boosting and heating trigger temperatures ( s and s,
respectively). The map orders should be optimized along with these
temperatures, since all of these factors have a crucial effect on the overall
transition time for a given set of temperature maps. The naïve approach to
find proper map orders is to introduce them as decision variables into the
PSO along with s and s. Experiments showed that this naïve
approach requires a very long CPU time to complete. The optimized
values for and depend on the map order, so each map order leads to
its own optimized values.
The initial PSO population in the naïve approach consists only of random
solutions (random s, s, and random map orders). Introducing a
relatively good map order into the initial population of PSO (among other
initial solutions that are random) will help to speed up the search. This
approach is denoted by A1. The idea for approach A1 is to rapidly find a
potentially good map order using some initialization heuristic and
introduce it into the initial PSO population. By doing this, the search should
speed up while the quality of the final values for s and s is kept
reasonably high. Experiments suggest that in the majority of cases, PSO
finds a better map order than the one produced by the initialization
heuristic.
It is, in fact, possible to find a potentially good map order without having
to go through the time-consuming optimization of s and s.
Furthermore, it is possible to do so without the relatively time-consuming
scheduling procedures for the heating sequences. A temperature map can
be considered as a point in an -dimensional Euclidean space ( is the
number of thermal nodes). The thermal distance between two maps is
defined as the Euclidean distance between them (e.g., between maps
and in Figure 5.4.1b). For a sequence of the maps, the total thermal
distance (TTD) is defined as the sum of the thermal distances between
successive maps. For example, TTD for Figure 5.4.1a is approximately
257, while for Figure 5.4.1b it is 108, which is much smaller. In general, a
sequence of maps with smaller TTD is expected to have a shorter transition
time compared with a sequence with larger TTD.
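The TTD metric can be sketched in a few lines. The snippet below is a minimal illustration and not the thesis implementation: the function names are hypothetical, the two-node temperatures are made-up values rather than those of Figure 5.4.1, and the path is assumed to start from the initial temperature, in line with the three segments per panel in Figure 5.4.1.

```python
import math


def thermal_distance(map_a, map_b):
    """Euclidean distance between two temperature maps (one value per thermal node)."""
    return math.dist(map_a, map_b)


def total_thermal_distance(initial, ordered_maps):
    """Sum of thermal distances along initial -> first map -> second map -> ..."""
    path = [initial, *ordered_maps]
    return sum(thermal_distance(a, b) for a, b in zip(path, path[1:]))


# Hypothetical two-node example (degrees Celsius); not the values used in the thesis.
initial = (30.0, 30.0)
mu0, mu1, mu2 = (110.0, 90.0), (40.0, 30.0), (110.0, 30.0)

ttd_a = total_thermal_distance(initial, [mu0, mu1, mu2])
ttd_b = total_thermal_distance(initial, [mu1, mu2, mu0])
print(f"order (mu0, mu1, mu2): TTD = {ttd_a:.1f}")
print(f"order (mu1, mu2, mu0): TTD = {ttd_b:.1f}")
```

With these made-up maps the second order has the smaller TTD, which mirrors the qualitative comparison between the two panels.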
Note also that the time required to change the temperature differs from
node to node depending on the node’s location on the IC, the adjacent
nodes’ temperatures, the heating sequence powers, and so on. Moreover,
depending on these factors, the rise time and the fall time for the
temperature of a certain node are also different (e.g., in many cases heating
up is faster than cooling down, with the same temperature gap). The TTD
does not take these differences into account in favor of a simple but
meaningful metric that is fast to evaluate. However, when the map order is
optimized using PSO, all these previously ignored factors are automatically
taken into account.
Figure 5.4.1 The total thermal distance (TTD)
(a) A bad map order: {μ0, μ1, μ2}, TTD = |a0| + |a1| + |a2|.
(b) A good map order: {μ1, μ2, μ0}, TTD = |b0| + |b1| + |b2|.
(Both panels plot the temperature of one thermal node against the other,
with both axes in °C.)
This problem is similar to finding the shortest Hamiltonian path in a
complete graph whose vertices are temperature maps and in which the distance
between two vertices is their Euclidean distance. Therefore, the initial
heuristic-based map order that is added to the PSO’s initial population in
approach A1 is called the shortest Hamiltonian path. For the reasons
discussed previously, this shortest path does not necessarily correspond to
the optimal map order.
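As an illustration of the idea, the following sketch finds the map order with the smallest TTD by exhaustive search over all permutations. This is a stand-in chosen for clarity and is practical only for a handful of maps; the text does not state which shortest-path procedure is actually used. All names and temperature values are hypothetical.

```python
import itertools
import math


def total_thermal_distance(initial, ordered_maps):
    """Sum of Euclidean distances along initial -> first map -> second map -> ..."""
    path = [initial, *ordered_maps]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def shortest_hamiltonian_order(initial, maps):
    """Return the list of map indices with minimal TTD (exhaustive search)."""
    best = min(
        itertools.permutations(range(len(maps))),
        key=lambda order: total_thermal_distance(initial, [maps[i] for i in order]),
    )
    return list(best)


# Hypothetical two-node maps (degrees Celsius).
initial = (30.0, 30.0)
maps = [(110.0, 90.0), (40.0, 30.0), (110.0, 30.0)]
seed_order = shortest_hamiltonian_order(initial, maps)
print("seed order for the initial PSO population:", seed_order)
```

The resulting order would only seed the search; as noted above, the smallest TTD does not guarantee the shortest transition time.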
If A1 is allowed to run for a long time, it will produce very high quality
solutions. However, for larger designs, this is unaffordable. We have
therefore proposed the A2 approach, which consists of a short run of A1
followed by a post-PSO optimization of map orders. The motivation for
this is that the PSO in A1 can rapidly identify possible solutions
in the near-optimal area of the search space, but it then becomes very slow.
Once the near-optimal area is known, other optimization techniques can be
deployed to rapidly improve the results. In the following, the post-PSO
optimization for the map orders is discussed.
In the general case, the post-PSO optimization could be excessively
time-consuming. A greedy heuristic is therefore used to rapidly find a
near-optimal solution. The greedy approach is characterized by its size, . This
size is the number of alternative partial solutions that are kept at each step
(i.e., among the vertices with equal depth in the search tree). A greedy
heuristic with size works as follows. Starting from the root vertex (initial
temperature) in the search tree, vertices (i.e., temperature maps) that
have the shortest partial transition times are selected. These correspond to
the first map in the final map order. Here the scheduling is performed to
calculate the actual transition times.
Then, new vertices that have the shortest partial transition times are
selected out of the set of vertices that succeed the previously selected
vertices. At this point, two maps (in the final map order) have been
scheduled. This procedure repeats until all maps are scheduled. For equal
to one, at each step the map that is the fastest to achieve is selected.
A large slows down the search but may provide better results. Our
experiments showed that 10 is a good choice for .
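The greedy procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_transition_time` is a hypothetical stand-in for the actual scheduling step (it simply lets the slowest node decide and makes cooling twice as slow as heating), and all names and temperatures are invented for the example.

```python
def toy_transition_time(current, target):
    """Hypothetical cost model: slowest node decides, cooling takes twice as long."""
    times = []
    for now, goal in zip(current, target):
        delta = goal - now
        times.append(delta if delta >= 0 else -2.0 * delta)
    return max(times)


def greedy_map_order(initial, maps, transition_time, size):
    """Keep the `size` fastest partial map orders at every depth of the search tree."""
    # Each partial solution: (elapsed time, tuple of map indices, current temperatures).
    partial = [(0.0, (), initial)]
    for _ in range(len(maps)):
        candidates = []
        for elapsed, order, temps in partial:
            for i, target in enumerate(maps):
                if i not in order:
                    step = transition_time(temps, target)
                    candidates.append((elapsed + step, order + (i,), target))
        candidates.sort(key=lambda c: c[0])
        partial = candidates[:size]
    best_time, best_order, _ = partial[0]
    return list(best_order), best_time


# Hypothetical two-node maps (degrees Celsius).
initial = (30.0, 30.0)
maps = [(110.0, 90.0), (40.0, 30.0), (110.0, 30.0)]
order_1, time_1 = greedy_map_order(initial, maps, toy_transition_time, size=1)
order_2, time_2 = greedy_map_order(initial, maps, toy_transition_time, size=2)
print(order_1, time_1)
print(order_2, time_2)
```

In this toy setting the larger size keeps an alternative partial order alive that turns out to be faster overall, which is the trade-off between search time and result quality mentioned in the text.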
Although the general case addresses large, time-consuming ICs, for
smaller ICs it is possible to find the optimal map order (i.e., the exact
solution) using an exact algorithm (e.g., branch and bound). Since a
relatively good solution is already found by PSO in approach A1, we can skip
many paths in the search tree that result in a larger transition time,
without wasting time to fully schedule them. For example, assuming that the
map order in Figure 5.4.1b is already found by A1, there is no need to
schedule (in Figure 5.4.1a) at all. Scheduling may also be aborted before
completion, since the overall transition time of this path in the search
tree exceeds the overall transition time of the path corresponding to
Figure 5.4.1b before it even gets to vertex . Note that in this algorithm,
the edges are actual transition times and not the Euclidean distances.
Despite the significant acceleration achieved by using the near-optimal
result from approach A1, larger examples remain excessively time-consuming,
and finding their optimal solution is therefore not practical.
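The pruning idea can be illustrated with a small depth-first branch and bound. The sketch below is not the thesis implementation: `toy_transition_time` is a hypothetical replacement for the real scheduling step, and `incumbent_time` plays the role of the transition time already obtained from A1. A partial path is abandoned as soon as its elapsed time reaches the best known time.

```python
def toy_transition_time(current, target):
    """Hypothetical cost model: slowest node decides, cooling takes twice as long."""
    times = []
    for now, goal in zip(current, target):
        delta = goal - now
        times.append(delta if delta >= 0 else -2.0 * delta)
    return max(times)


def branch_and_bound(initial, maps, transition_time, incumbent_time=float("inf")):
    """Return (order, time); order is None if nothing beats the incumbent."""
    best = {"time": incumbent_time, "order": None}

    def extend(order, temps, elapsed):
        if elapsed >= best["time"]:
            return  # prune: this path cannot beat the best known solution
        if len(order) == len(maps):
            best["time"], best["order"] = elapsed, list(order)
            return
        for i, target in enumerate(maps):
            if i not in order:
                extend(order + [i], target, elapsed + transition_time(temps, target))

    extend([], initial, 0.0)
    return best["order"], best["time"]


# Hypothetical two-node maps (degrees Celsius) and a hypothetical incumbent of 200.
initial = (30.0, 30.0)
maps = [(110.0, 90.0), (40.0, 30.0), (110.0, 30.0)]
exact_order, exact_time = branch_and_bound(initial, maps, toy_transition_time, 200.0)
print(exact_order, exact_time)
```

Pruning is valid here because transition times are non-negative, so the elapsed time of a partial path is a lower bound on the time of any completion of it.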
Although this section has focused on map ordering for the temperature-
gradient based burn-in, the map ordering for the delay test is very similar
and the same technique can be used. Moreover, there might be a map
dependency graph (e.g., because of corresponding tests’ dependencies)
which dictates that certain maps must be applied in a certain order. Although
not discussed in this section, the proposed approach can accommodate such
scenarios.
5.4.2 Experimental Results
The experimental setup is similar to that of section 5.2.6. All experiments
are performed on a desktop computer with an Intel® Xeon® W3520 processor
and 8 GB of memory. The percentage change in CPU time for the A1 approach
compared with the naïve approach is -266% on average. Furthermore, the
overall transition time achieved by A1 is 18% smaller than the overall
transition time achieved by the naïve approach.
Optimal map orders are found for some of the small experimental ICs to
be used for comparison purposes. It is not practical to find optimal map
orders for all the experimental ICs because of the excessive search time
that relatively large ICs require. The overall transition times achieved by
A1 are around 23% larger than the overall transition times offered by the
optimal map orders. As mentioned before, this shows that the map orders
found by A1 are close to optimal, but A2 can do better. In the following,
A2, which includes the post-PSO optimization, is compared with A1, which
terminates after the PSO optimization.
The greedy approach with a population size of one ( ) is used to find
map orders for all of the experimental ICs. The results show a 16%
improvement over the A1 results, but they are 13% worse than the optimal.
Increasing the population size to ten ( ) further improves the results,
so that there is a 21% improvement over A1 and the results are only 7% worse
than the optimal. However, it almost doubles the search time. In short, A1
finds map orders that result in an overall transition time around 23% worse
than optimal. The post-PSO optimization in A2 improves the map orders by
21%, which brings them very close to the optimum.
5.5 Conclusions
Early-life failures and delay faults that are dependent on temperature-
gradients introduce additional challenges to achieve efficient burn-in and
delay-fault test. The negative effects of temperature gradients are more
pronounced for 3D-SIC technology, since their magnitude is much larger.
The challenge for burn-in is that some defects develop and cause early-life
failures very rapidly when the IC is working with certain temperature maps
that include large temperature gradients. These are difficult to enforce by
traditional burn-in methods. The challenge for delay-fault test is that some
defects can be detected only when a certain temperature map is enforced
on the IC.
In order to effectively detect these defects, it is necessary to construct and
maintain the specified temperature maps during burn-in and delay-fault
test. The methods proposed in this thesis utilize the available test access
mechanisms in order to do so. The specified temperature maps are
constructed and maintained by selectively applying high-power stimuli to
the IC. Therefore, there is no need for expensive equipment to heat up the
chip externally. To our knowledge, this is the first technique to achieve
temperature maps for burn-in and test without any external heating
mechanism.
For burn-in, a steady state solution is introduced that is fast to generate the
schedules, but the schedules are slow to achieve the specified temperatures.
A schedule in this case consists of a single periodic schedule for each map.
The steady state solution has been extended to the transient solution which
is slow in generating the schedules, but constructs the maps faster. Finally,
the transient-based heuristic is proposed to support a more precise
temperature model, and offer a shorter overall transition time by generating
schedules that rapidly bring the IC to the specified temperature conditions.
The experiments indicate that this method outperforms the transient
solution. Moreover, this method is 78% faster than the steady state solution
in realizing the specified temperature maps.
For delay-fault test, a straightforward method is proposed that is based on
two working modes, the temperature construction mode and the test mode.
The temperature construction mode works similarly to the transient-based
method for burn-in and brings the IC to the specified temperature
conditions. Then, the test mode applies the tests according to a given test
schedule until the IC’s temperatures leave the specified range, at which
point the temperature construction mode is activated again. This continues until all
tests are performed. Furthermore, another method (fast heuristic) has been
developed to schedule the heating and cooling intervals mixed with the
tests. Therefore, the test time offered by this method is reduced. The
experiments indicate that the fast heuristic is 67% faster in performing the
tests compared with the straightforward method.
The order of the temperature maps has a considerable effect on the overall
burn-in and test time. Map orders therefore need to be optimized together
with the other decision variables, since they affect the optimal values of
those variables. Experiments for map ordering show that the introduction of
an initialization heuristic that adds an initial map order to the PSO’s
initial population speeds up the search by 266% on average. Furthermore, the
overall transition time improves by 18% on average for burn-in. The overall
transition times are further improved by 21% through the introduction of a
post-PSO optimization stage that consists of a greedy approach.
5.6 Notations and Abbreviations
Notation Description
Represents heat capacitances in the thermal model. is the
matrix element at -th row and -th column.
Represents thermal conductance (related to heat transfer) in the
thermal model. is the matrix element at -th row and -th
column.
Need for heating in a general case. is -th thermal-element’s
need for heating.
Duty cycle for module in PWM method
Testing endurance for module .
Identity matrix
Number of modules ( ).
is the -th module.
Total number of thermal elements in the thermal model (
)
Power value(s) in a general case. is power for module .
Power values in transient solution
Heating sequences’ powers
Heating sequence power received by node when heating is
intended for node .
Steady state power values in transient solution
Stray power
PSO Particle Swarm Optimization [Poli07]
Number of parallel LP solvers in transient solution
Remaining tests’ size
Proper schedule period in PWM method, calculated solely for
heating interval of module
Proper schedule period in PWM method, calculated solely for
cooling interval of module
TAM Test Access Mechanism
TAT Test Application Time
Thermal tolerance for module .
TTD Total Thermal Distance
TAM width: number of modules that can be accessed at the same
time
Transfer matrix for initial temperatures considering a time interval
equal to
Transfer matrix for power values considering a time interval equal
to
Boolean variable indicating that the -th LP solver has found a valid
solution
Thermal distance for -th active thermal element.
Acceptable error in the minimal transition time in transient solution
Temperatures vector in a general case. is the temperature for
module . is the temperature for -th thermal element.
Ambient temperature
Overheating temperature limit
Initial temperatures
Final temperatures after seconds
Stop-boosting temperature limit in a general case. is stop
boosting limit for -th thermal element.
Steady state temperatures
is high temperature limit for module . is high temperature
limit for -th thermal node.
is low temperature limit for module . is low temperature
limit for -th thermal node.
Testing-trigger temperature threshold in a general case. is
testing trigger threshold for -th thermal element.
Lower bound for optimal transition time in transient solution. The
upcoming temperature map cannot be achieved if transition time
is smaller than . See .
-th temperature map.
Chapter 6 Integrated Temperature-Cycling Acceleration and Test
Large and frequent temperature changes (i.e., temperature cycling) create
fatigue and wearout in Integrated Circuits (ICs), as pointed out earlier in
section 3.8. Temperature cycling damages ICs in various ways,
including solder joint fatigue, fracture in bond wires, and die deformation
[Jedec10]. In addition to these undesirable effects, 3D stacked ICs suffer
from defects related to through-silicon vias (TSVs). TSV protrusion and void
formation in TSVs are two such defects. These effects are worsened by
temperature cycling. Furthermore, some other defects, including resistive
opens and stress-induced carrier mobility reduction, can also be worsened
by temperature cycling [Kumar12, Okoro14, Zhang13].
This chapter presents a schedule-based technique that integrates
temperature cycling acceleration with the testing procedure. The cycling
acceleration is achieved by mixing heating sequences and cooling intervals
with test sequences in an efficient order. Furthermore, tests and heating
sequences are reordered so that a rapid testing and acceleration process is
achieved. The proposed technique contrasts with the existing
approaches, which are based on temperature chambers and can be impractical
for 3D-SICs due to their unaffordable costs and limitations.
6.1 Preliminaries
Temperature-cycling exacerbates a number of defect mechanisms, as
pointed out before. Therefore, operating the dies under intensive
temperature cycling can effectively accelerate such failures so that they can
be detected by the subsequent test, before the 3D-SIC is shipped out. This
procedure is called temperature-cycling acceleration [Jedec09, Mil04].
Note that even though both conventional burn-in test and temperature-
cycling test are designed to detect early-life failures, temperature-cycling
is different from the conventional burn-in. These two aim at accelerating
different aging mechanisms: cycling acceleration does not accelerate the
aging mechanisms that burn-in does, and vice versa. To briefly
explain this difference, let us focus only on two distinct aging mechanisms.
During burn-in, the device is operated in a very hot environment with
increased voltage to accelerate electromigration. This must continue for a
relatively long time to allow for sufficient migration (detectable atomic
build-up or depletion). In contrast, simply operating the device at a
single temperature does not create cycling-related material fatigue. It is the
variation of the mechanical stress (as a result of varying temperature) that
does it. The required amounts of burn-in and cycling are decided based on
analytical, experimental, and empirical studies that are outside the scope
of this thesis. In this thesis we solely focus on temperature-cycling and
assume that the required amount of cycling is given by the user.
Let us have a closer look at protrusion of TSVs out of the die surface
caused by temperature cycling. Right after TSV fabrication, there is
normally no protrusion and the TSVs have about the same length as the
die’s thickness. However, after a few temperature-cycles an increase in the
TSV length may be observed. The TSV length will continue to increase
with the number of cycles [Kumar12, Zhang13]. After a certain amount of
temperature cycling, the TSV length approaches a maximum level. Further
temperature cycling will have almost no effect on the TSV length.
The TSV protrusion can be further exacerbated by the electrical
current it carries [Kumar12, Zhang13]. Therefore, operating the IC during
this procedure (letting the current flow) speeds up the cycling
acceleration.
The existing procedure for temperature-cycling acceleration is based on
one or multiple temperature chambers [Jedec09]. Although this procedure
is usually affordable for 2D ICs, it is likely to be too expensive for 3D-
SICs. Due to TSV-related defects, a larger number of dies manufactured to
be a part of a 3D-SIC may require cycling acceleration compared with 2D
ICs. The shortcomings of the traditional approach include costs for running
the temperature chambers as well as the time and equipment required for
handling the dies/stacks between test equipment and chambers. Besides,
chambers are slow, meaning that only very low frequency cycling is
possible.
Moreover, the 3D-SIC manufacturing process includes multiple bonding
stages. Corresponding to these bonding stages, pre-, mid-, or post-bond
tests are introduced in order to avoid: (1) wasting a good die bonded to a
bad die or stack, (2) wasting bonding effort for bonding bad dies or stacks,
and (3) wasting packaging effort spent on a bad stack. Based on the cost
breakdown, temperature-cycling acceleration could be beneficial at one or
multiple test stages. In order to avoid costs associated with the traditional
techniques, in current practice, some or even all of the temperature-cycling
acceleration operations are avoided. Therefore, the temperature-cycling
related early-life failure rates in the final products will be unnecessarily
high. Integrating the temperature-cycling acceleration with the tests that
are performed at different stages and eliminating the need for temperature
chambers will reduce the overall manufacturing costs.
As previously mentioned, advanced SoCs, especially those manufactured
as 3D-SICs, experience excessively large power densities during test.
High power densities lead to excessively high temperatures, in particular
for the middle dies in a 3D stack. This otherwise undesirable thermal effect
is, however, utilized here to generate large amounts of temperature-
cycling. Temperature-cycling acceleration is achieved by frequent
switching between high power tests that heat up the IC and pauses that
allow for cooling.
A deliberate pause for cooling is called a cooling interval. A cooling
interval is a time interval during which no stimuli are applied to a core and,
therefore, the core’s temperature decreases, as already discussed in earlier
chapters. Some cooling intervals are usually present in the original test
schedule for thermal-safety reasons, as discussed in chapter 4. More
intensive temperature-cycling acceleration can be achieved by introducing
additional cooling intervals and stronger heating sequences into the
process. A stronger heating sequence consists of stimuli that generate
larger switching activities in a core and, therefore, increase the core’s
temperature faster than usual (as discussed in chapter 4 and chapter 5). The
mixture of cooling intervals and heating sequences can generate the
required temperature-cycling acceleration effect.
A test sequence’s bit streams define the circuit-under-test’s power
dissipation in combination with the previously applied test sequence
(circuit’s state) as well as the core’s power-related properties.
Consequently, the power dissipation generated by a series of tests depends
on the order in which they are applied [Chakravarty94]. This phenomenon
is employed in this thesis in order to produce extreme power values for
tests as well as heating sequences and, consequently, achieve a high speed
temperature-cycling process.
The existing methods for managing ICs’ temperatures (in relation to the
testing processes) focus on one of two issues:
1. Keeping the temperatures under a global upper temperature limit to
prevent overheating (e.g., sections 4.1–4.7), or
2. Respecting upper and lower temperature bounds for cores in order to
target temperature-dependent defects (e.g., section 4.8) or
gradient-dependent defects (chapter 5).
In all the above cases, the cores’ temperatures are considered independent
of their cycling effects. Integrating temperature cycling acceleration with
the test procedure was previously studied in [Aghaee15a]. This chapter
develops an integrated temperature cycling technique based on this study.
Moreover, an efficient technique to order the tests and heating sequences
to achieve a high-speed temperature-cycling process is proposed.
6.1.1 Circuit under Test and Test Access Mechanism
It is assumed that there are modules (cores) in the 3D-SIC under test.
These modules are located on different levels of stacked dies. The modules
that are on different layers are connected using TSVs. Tests for each
module can be started and stopped independently of other modules. The
modules could be cores with core wrappers in a core-based design. The
extension of this scenario to 3D-SIC is proposed as the IEEE P1838
standard [Ieee14a]. Test stimuli are, therefore, transferred through a test
access mechanism to the relevant module. It is assumed that the TAM only
affords (a positive integer number) modules to be tested at the same
time. Other modules, therefore, have to queue up and wait for TAM access.
6.1.2 Thermal Model
In order to obtain the temperature values from power values, a thermal
model that describes the thermal behavior of the IC must be used. The
temperature equation (introduced in section 2.6, equation 2.6.1) is repeated
here for convenience:
(6.1.1)
All the thermal characteristics of the IC are captured in two matrices
and , obtained in a manner similar to [Coskun09, Huang06]. is the
temperature vector and is the power. and consist of s and s,
respectively, put together in a vector format. Index indicates the relevant
module. There are a total of modules ( ). As
discussed in section 4.6, equation 6.1.1 can be solved in the time domain
assuming that the power values are constant during a period of time equal
to . The result from equation 4.6.5 is repeated here for convenience:
(6.1.2)
The initial temperature is expressed by and the temperature after a
period of seconds (note that a fraction of a second is used in practice) is
represented by . Matrices and are copied below from equations
4.6.3–4:
(6.1.3a)
(6.1.3b)
The identity matrix is denoted by . The above equations are explained in
the following case study, assuming that there is only one module ( )
with its heat capacitance denoted by (analogous to ). The heat
resistance between the module and the ambient is equal to (analogous to
). In this case, equation 6.1.2 can be re-written as:
(6.1.4)
Since there is only one module, the vectors and matrices are reduced to
scalar values. A larger initial temperature ( ), power ( ), or resistance
( ) results in higher final temperature ( ), if other factors are kept
unchanged. A larger period ( ) means that the contribution of the initial
temperature is smaller while the effect of power on the final temperature is
larger. In the vector form, increasing the period translates into a decreased
and an increased . A large time-constant ( ) means that the initial
temperature takes longer to lose its effect while power takes longer to
noticeably affect the final temperature. In the vector form, increasing the
time-constant translates into an increased and a decreased .
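The single-module case can be explored numerically. The sketch below assumes the standard first-order RC response with temperatures measured relative to ambient; this form, the function name, and the numeric values are assumptions made for illustration and are not copied from equation 6.1.4.

```python
import math


def final_temperature(t_initial, power, resistance, capacitance, period):
    """Temperature after `period` seconds of constant power (relative to ambient)."""
    # Assumed first-order model: the initial temperature decays with the time
    # constant R*C while the power contribution builds up towards R*P.
    decay = math.exp(-period / (resistance * capacitance))
    return t_initial * decay + resistance * power * (1.0 - decay)


# Hypothetical values: R = 2 K/W, C = 0.5 J/K (time constant 1 s), P = 10 W.
for period in (0.0, 0.5, 1.0, 5.0):
    print(period, round(final_temperature(50.0, 10.0, 2.0, 0.5, period), 2))
```

A longer period shrinks the weight of the initial temperature and grows the weight of the power term, while a larger time constant has the opposite effect, which matches the qualitative discussion above.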
1 A list of notations and abbreviations is provided in section 6.8.
6.1.3 Temperature Cycling Model
The effect of temperature cycling can be described based on the Amount
of Temperature Cycling induced fatigue (denoted by for module ).
Based on the Arrhenius-Coffin-Manson model [Held97, Jedec10], ATC is
estimated as:
(6.1.5)
Considering module , is the number of temperature cycles and
is the amplitude of temperature changes during cycling. In the above
equation, a regular cycling pattern is assumed. It means that the
temperature monotonically increases from an arbitrary temperature, , to
and then monotonically decreases back to .
Usually, when the actual temperature curve is only slightly different from
a regular pattern, the average amplitude is used for . must be
larger than (a very small threshold value) in order to be considered in
the temperature cycling calculations. However, it is not unusual to
completely ignore since the typical temperature changes are much
larger than .
The effect of the average temperature is captured in the exponential term.
The average temperature is expressed by . , , , , and are
constants that are obtained analytically or empirically by reliability
analysts. A comprehensive explanation and details of equation 6.1.5 can
be found in [Jedec10, Held97]. As equation 6.1.5 suggests, a large number
of cycles, , or a large temperature swing, , will result in a large
cycling effect.
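To make the roles of the quantities above concrete, the following sketch evaluates one plausible Coffin-Manson style expression with an Arrhenius factor. The exact form of equation 6.1.5 is not reproduced here: the functional form, the parameter names (`q`, `e_a`, `scale`, `delta_t0`), and all numeric values are assumptions for illustration only.

```python
import math

BOLTZMANN_EV = 8.617333262e-5  # Boltzmann constant in eV/K


def atc_estimate(n_cycles, delta_t, t_avg_kelvin, *,
                 delta_t0=0.0, q=2.0, e_a=0.0, scale=1.0):
    """Assumed form: scale * N * (dT - dT0)**q * exp(-Ea / (k * T_avg))."""
    if delta_t <= delta_t0:
        return 0.0  # swings below the threshold are not counted
    arrhenius = math.exp(-e_a / (BOLTZMANN_EV * t_avg_kelvin))
    return scale * n_cycles * (delta_t - delta_t0) ** q * arrhenius


# Hypothetical numbers: 10 cycles with a 40 K swing around 350 K.
print(atc_estimate(10, 40.0, 350.0))
print(atc_estimate(10, 40.0, 350.0, e_a=0.1))
print(atc_estimate(10, 40.0, 380.0, e_a=0.1))
```

Under this assumed form, more cycles or a larger swing increase the estimate, and a non-zero activation energy makes the estimate grow with the average temperature.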
6.2 Motivational Examples
6.2.1 ATC Rate for a Simple Scenario
As an example, consider an IC with two modules ( ). Assume that
the TAM can only support one module to be tested at a time ( =1).
Assume that and . The required
amounts of temperature cycling are and for modules and
, respectively. In this chapter, tests that target cycling-dependent defects
are called cycling tests and the other tests are called normal tests. Cycling
tests can only be applied after the required amount of temperature cycling,
, is achieved.
A three-phase approach is introduced here. In phase 1, normal tests are
scheduled. Thermal-aware test scheduling based on the approach proposed
in [He08a] is used. The corresponding temperature curves are
shown in Figure 6.2.1 (green2 for and blue for ). The normal tests
for module end at . Phase 1 starts at time 0 and ends at , which is defined
as .
Phase 2 starts by evaluating the ATC generated in phase 1. This value is
less than the required in this example. Therefore, phase 2 will
generate additional temperature cycling. This is done by applying the
heating sequences and cooling intervals. Corresponding temperature
cycles can be seen in Figure 6.2.1 from to . Time-point marks the
point when the required is achieved for module . Phase 2 ends
when all required ATCs for all modules are met. This point is marked with
that is defined as . After this, phase 3 starts by applying the
cycling tests. Phase 3 ends when all the cycling tests are complete. This
point is marked with .
A small test application time (TAT) is always desirable. The test
application time from 0 to and
from to is already minimized by the given third-party test scheduling
algorithm. The only time reduction opportunity is to speed up phase 2. This
means that a large ATC should be achieved in a short time. Therefore,
should be maximized. Here we assume a uniform periodic
temperature profile, meaning that all cycles have the same amplitude.
2 Figure 6.2.1 is printed in grayscale in copies printed by LiU-Tryck.
Figure 6.2.1 Temperature curves for the three-phase approach
(Curves are illustrative. The plot shows temperature [°C] against time,
with the time axis divided into phase 1, phase 2, and phase 3.)
Moreover, for this motivational example we assume that in equation 6.1.5:
, , , and .
Since it is assumed that , the exponential term can be ignored for
the moment. Furthermore, since it is assumed that , could
also be ignored. The ATC rate (denoted by for module ) can,
therefore, be defined as:
(6.2.1)
The frequency of temperature changes (i.e., the number of cycles per time
unit) depends on the physical properties of the system and the amplitude
of temperature changes, . It is possible to achieve a high frequency
(i.e., a large ) if is small. A large amplitude, on the other hand,
increases the ATC only if its effect dominates the resulting reduction in
the frequency.
6.2.2 Optimal Cycling in a Simplified Scenario
In order to clarify the tradeoff between the frequency and the amplitude of
the temperature cycling, the physical properties of the system should be
captured in the ATC rate equation (equation 6.2.1). In the following this is
done for a simple IC with only one module. The thermal model for such a
case was discussed in section 6.1.2, equation 6.1.4. Remember that is the
heat capacitance and is the thermal resistance between the module and
the ambient. Assume that the heating sequence generates a power equal to
and the power during a cooling interval is zero. Assume that the
temperature varies between and . Both and are positive
real numbers.
The period of a temperature cycle is denoted by . This period consists of
a rise time denoted by plus a fall time denoted by . is the time the
temperature takes to increase from to . is the time taken to
decrease from to . These values are calculated as follows. First,
the system’s differential equation is solved in the time domain similar to
equation 6.1.4 for a period of (i.e., ):
(6.2.2)
Let us denote by and by . For the heating situation:
(6.2.3)
Then
(6.2.4a)
Similarly for cooling ( ), can be calculated:
(6.2.4b)
The period, , is calculated as follows:
(6.2.5)
Now, the ATC rate (equation 6.2.1) could be re-written incorporating the
physical properties of the system:
(6.2.6)
Let us first focus on the optimal value for , assuming that is constant.
In this case optimality happens when the denominator in equation 6.2.6 is
minimized. Considering a realistic situation, this is equivalent to finding
the minimum for
(6.2.7)
Following a closed-form approach:
(6.2.8)
The valid solution is . Here for the sake of simplicity, the
ambient temperature was not included in the equations. Since the
temperature model is a linear time-invariant (LTI) system as discussed in
section 4.6, the ambient temperature can be added later on. Assume that
the power and resistance values are such that . This means that
considering the ambient temperature ( ), the IC’s temperature will
increase to if no control is applied. Thus, the optimal value for is
.
The resulting equations for finding the optimal value for do not have a
simple closed form. Therefore, a numerical method is employed. The ATC
rate versus for is plotted in Figure 6.2.2. If and
, then the ATC rate is maximal at . For values of
less than the ATC rate increases by increase in . This is due
to the increase in amplitude, , dominating the decrease in
frequency, , in equation 6.2.1. For larger values the ATC rate
decreases by increase in . This is due to the increase in amplitude,
, being dominated by the decrease in frequency, . In other
words, a very large temperature cycle takes too much time to complete.
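The trade-off can be illustrated numerically for the single-module case. The sketch below is only an illustration under stated assumptions: it uses a first-order RC thermal model with temperatures measured relative to ambient, takes the ATC rate to be the amplitude raised to an exponent q divided by the cycle period, and uses made-up values for R, C, P, and q. All function and variable names are hypothetical.

```python
import math

def cycle_period(t_low, t_high, r, c, p):
    """Rise time plus fall time of one cycle in a first-order RC model.

    Assumed model: C*dT/dt = P - T/R while heating, and P = 0 while cooling.
    Temperatures are relative to ambient.
    """
    steady = r * p  # temperature reached if heating were never interrupted
    if not (0.0 < t_low < t_high < steady):
        raise ValueError("need 0 < t_low < t_high < R*P")
    rise = r * c * math.log((steady - t_low) / (steady - t_high))
    fall = r * c * math.log(t_high / t_low)
    return rise + fall

def atc_rate(t_mid, delta_t, r, c, p, q):
    """Assumed ATC rate: (amplitude ** q) * frequency."""
    period = cycle_period(t_mid - delta_t / 2, t_mid + delta_t / 2, r, c, p)
    return delta_t ** q / period

# Hypothetical values chosen so that R*P = 100.
R, C, P, Q = 1.0, 1.0, 100.0, 2.0

# Fixed amplitude: sweep the middle of the temperature interval.
best_mid = max(range(11, 90), key=lambda m: atc_rate(m, 20.0, R, C, P, Q))

# Fixed middle: sweep the amplitude.
best_delta = max(range(2, 100, 2), key=lambda d: atc_rate(50.0, d, R, C, P, Q))
```

With these assumed values, the sweep over the interval middle selects R*P/2, and the sweep over the amplitude selects a value strictly inside the swept range, which reflects the amplitude versus frequency trade-off described above.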
If the assumption that does not hold, the temperature cycling rate
equation, equation 6.2.6, will be as follows:
(6.2.9)
The inclusion of the exponential (Arrhenius) term results in a larger (or
equal) optimal value. Since both the exponential term and
equation 6.2.6 are increasing when is smaller than , the optimal
value cannot happen for a smaller than . After this point, the value
of equation 6.2.6 decreases while the exponential term is increasing. The
optimal can be in this region ( ). Besides, the introduction of
the exponential term leads to dependency of the optimal on the value of
.
In the general case (without assumptions made solely for the motivational
examples), the optimal value for could be very different compared with
the obtained here. Moreover, the assumptions made for obtaining
equation 6.2.1 will not be valid and therefore the situation will be more
complicated than discussed in the above paragraph. In such situations a
numerical approach is best suited to find the optimal values for and .
Figure 6.2.2 ATC rate, , versus for three-phase approach
Moreover, in the general case, there are multiple modules competing for
access to TAM and their interference makes the problem even more
complicated, so complex that a heuristic is the only practical technique to
deal with the problem.
6.2.3 Effect of the Test Application Order
In general, the circuit under test’s consumed power depends on the order
in which the tests are performed. Let us consider the scan chain itself.
Different orders of the tests will result in different transition counts and
thus different power values.
Consider a 4-bit scan chain as shown in Figure 6.2.3. Assume that 0101,
1111, and 1010 are the test stimuli. The order 1010-1111-0101, as shown
in Figure 6.2.3a, results in 12 transitions in the scan chain during shift-in.
Another test order, 1111-1010-0101, as shown in Figure 6.2.3b, results in
22 transitions and thus higher power dissipation. Assuming that the
temperature of the core should be reduced, arranging the tests in their low
power order may avoid an additional cooling interval. Alternatively, if the
core is in its heating interval of the cycling process, the high power
arrangement may replace an unnecessary heating sequence application.
This will ensure that TAM is not unnecessarily occupied by dummy
heating sequences. Both situations help to shorten the test application time.
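The transition counts in this example can be reproduced with a short script. The sketch below assumes that the first pattern is already in the chain, that each following pattern is shifted in one bit per clock from the left with its rightmost bit entering first, and that a transition is any scan cell changing value between consecutive clock cycles. The function name is hypothetical.

```python
def shift_in_transitions(test_order):
    """Count scan-cell value changes while shifting in the given test order.

    test_order is a list of equal-length bit strings. The first pattern is
    taken as the initial chain content (an assumption of this sketch).
    """
    state = list(test_order[0])
    total = 0
    for pattern in test_order[1:]:
        for bit in reversed(pattern):        # rightmost bit enters first
            new_state = [bit] + state[:-1]   # shift right by one cell
            total += sum(old != new for old, new in zip(state, new_state))
            state = new_state
    return total

low_power = shift_in_transitions(["1010", "1111", "0101"])   # 12
high_power = shift_in_transitions(["1111", "1010", "0101"])  # 22
```

The same function can be used to compare all permutations of a small test set when looking for a low power or a high power arrangement.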
6.3 Problem Formulation
As discussed before, along with pre-, mid-, or post-bond tests,
temperature-cycling acceleration might be beneficial. In this case, there
will be tests that target cycling-dependent defects (i.e. cycling tests) in
addition to other tests (i.e., normal tests). Normal tests are scheduled along
with heating and cooling intervals in order to generate the required amount
of temperature cycling. The cycling tests can be performed afterward.
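Figure 6.2.3 Test orders
(a) A low power order. (b) A high power order.
(a) Scan-chain contents during shift-in: 1010, 1101, 1110, 1111, 1111, 1111, 0111, 1011, 0101. Transitions per scan cell: 3, 3, 3, 3. Total transitions = 12.
(b) Scan-chain contents during shift-in: 1111, 0111, 1011, 0101, 1010, 1101, 0110, 1011, 0101. Transitions per scan cell: 7, 6, 5, 4. Total transitions = 22.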
The amount of temperature cycling can be easily calculated using equation
6.1.5 if the temperature swings in a uniform periodic manner similar to
Figure 6.3.1a. In Figure 6.3.1a five cycles with amplitudes equal to can
be identified. In the general case, for example when the IC is under test,
the temperature fluctuations are irregular, as shown in Figure 6.3.1b. In
this case, identifying cycles and their amplitudes is not straightforward. For
such irregular patterns, the number and amplitudes of the cycles are
calculated using the widely used Rainflow-counting algorithm
[Matsuishi68].
As mentioned previously, the required amount of temperature cycling is
denoted by . The current amount of temperature cycling generated
by normal tests or heating sequences (e.g., phase 1 and phase 2 in Figure
6.2.1), up to a given time, , is denoted by . For a certain test
schedule, the temperature curves are obtained using temperature
simulations. Then a fast version of the Rainflow-counting algorithm,
introduced in [Musallam12], calculates .
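To make the cycle identification concrete, the sketch below implements the common three-point rainflow rule on a list of sampled temperatures. It is not the fast variant of [Musallam12] used in this chapter; it is a plain reference version, and the function names are hypothetical. Each result is a pair (range, count), where count is 1.0 for a full cycle and 0.5 for a half cycle.

```python
def reversals(series):
    """Keep only the turning points (local extrema) and the two end points."""
    pts = [series[0]]
    for value in series[1:]:
        if value == pts[-1]:
            continue                      # skip plateaus
        if len(pts) >= 2 and (pts[-1] - pts[-2]) * (value - pts[-1]) > 0:
            pts[-1] = value               # still moving in the same direction
        else:
            pts.append(value)
    return pts

def rainflow_cycles(series):
    """Three-point rainflow counting; returns a list of (range, count)."""
    cycles = []
    stack = []
    for point in reversals(series):
        stack.append(point)
        while len(stack) >= 3:
            x = abs(stack[-1] - stack[-2])   # most recent range
            y = abs(stack[-2] - stack[-3])   # previous range
            if x < y:
                break
            if len(stack) == 3:
                # The previous range contains the starting point: half cycle.
                cycles.append((y, 0.5))
                stack.pop(0)
            else:
                # The previous range is a closed loop: full cycle.
                cycles.append((y, 1.0))
                last = stack.pop()
                stack.pop()
                stack.pop()
                stack.append(last)
    # Whatever remains on the stack is counted as half cycles.
    for a, b in zip(stack, stack[1:]):
        cycles.append((abs(b - a), 0.5))
    return cycles

# A uniform periodic pattern with five swings of amplitude 20.
uniform = [50, 70, 50, 70, 50, 70, 50, 70, 50, 70, 50]
uniform_count = sum(count for _, count in rainflow_cycles(uniform))
```

For the uniform pattern the counts add up to five cycles of range 20. For an irregular curve the same routine returns a mix of ranges that can then be fed into the cycling equation.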
Assuming that for , , only normal tests can be
performed before time . The cycling tests can only be performed after
the required amount of cycling ( ) has been applied. Therefore, after
time , cycling tests can be performed too. The test application time,
, marks the point that testing module is complete. consists
of the time spent before and after time . The goal is to generate a
schedule with a minimal overall TAT. The overall test application time is
defined as .
As previously discussed, the power dissipation during a test depends on
the previous test, among other factors. Assuming that test for module
immediately follows test , the dynamic power is expressed by
. The overall power dissipation (in the circuit under test), denoted
by , consists of the dynamic power, , plus the stray power,
Figure 6.3.1 Temperature patterns
(a) Uniform periodic, with cycles numbered 1 to 5. (b) Irregular.
denoted by ( ). The dynamic power is caused by
the switching activities of the circuit under test. As introduced in section 5.2.2,
the stray power is defined, in this thesis, as the sum of all those power
values whose dissipations cannot be independently controlled with existing
test controls. This includes the leakage power as well as the clock
networks’ power. Stray power’s exact value depends on the module’s
current temperature since the leakage power depends on the temperature.
In this chapter, the stray power (including temperature dependent leakage)
is taken into account.
It is assumed that module has tests including both normal and
cycling tests. Relevant test properties can be captured in a test graph.
Consider an IC that consists of two modules ( ). Assume that module
has two tests ( ) as shown in Figure 6.3.2a. Module has three
tests ( ) as shown in Figure 6.3.2b. Assume that one of the tests for
module is a normal test (the node is marked with N) and the other is a
cycling test (marked with C). A node that corresponds to a heating
sequence (marked with H) is also included in the test graph. Tests and the
heating sequence for module are marked in a similar manner. Total test
powers are shown on the edges in Figure 6.3.2. In the general case,
there are a number of normal and cycling tests in addition to a number of
heating sequences.
At each time point, during the test, there could be some tests that cannot be
performed. This is due to a number of reasons, including the limited
Figure 6.3.2 Test graphs: (a) module (b) module .
Test graphs consist of normal (N), cycling (C), and heating (H) nodes.
(a) Nodes: s0,0 (N), s0,1 (C), s0,2 (H).
(b) Nodes: s1,0 (N), s1,1 (C), s1,2 (C), s1,3 (H).
Edges carry the total test powers, labeled p0,0-1, p0,1-0, and so on.
capacity of the TAM as well as the cycling tests that cannot be performed
before the required ATC is applied. A validity checker is used to make sure
that the scheduling algorithm takes these limitations into account. The
validity checker updates the set of Valid Tests (VaT) if a new test can be
performed in parallel with the tests that are already selected for the current
time point. It also makes sure that any test that cannot be applied in parallel
with the currently selected tests does not remain in VaT. This is based on
the knowledge of previously applied tests as well as the partial set of tests
selected to be applied next.
Moreover, the current amount of the ATC is also taken into account. For
example, assume that in Figure 6.3.2 normal tests ( and ) have been
performed previously. Assume that is already selected to be applied
next and the required ATC for is already achieved. In this case VaT is
, meaning that , , or can be applied in parallel
with without violating the TAM limit or the ATC requirement. Although
using (i.e., the heating sequence) does not make sense since the
required ATC is already achieved, it would be a valid choice from the
VaT’s point of view. Note that the heating sequences can be applied
repeatedly, as needed, while repeating the tests is usually unnecessary.
The goal is to schedule the tests so that all the cycling tests are performed
after the required amount of ATC is achieved and the overall test
application time (including the cycling process) is minimized. This is
achieved by scheduling and reordering the tests and the heating sequences.
High power test stimuli and heating sequences can increase the modules’
temperatures. A module may become so hot that unrealistic failures show
up or the device itself is damaged. In order to avoid these undesirable
overheating situations, the modules’ temperatures must be kept below the
overheating temperature ( ) at any time. The overheating
temperature is equal to the temperature limit minus a safety margin to
ensure thermal safety. The power dissipation during a pause is equal to the
stray power, (which includes leakage).
The problem can be formally stated as follows. The inputs to the suggested
technique include the IC’s thermal model, the IC’s electrical model (e.g.,
specification of the TAM and power-related specifications), the test graph
(i.e., the cycling tests, normal tests, and the switching activities of the tests
and heating sequences), the ambient temperature ( ), and the
required amount of temperature cycling, . The objective is to
minimize the test application time. The output is the corresponding
schedule that guides the application of the tests and heating sequences in
proper order so that all the tests are performed rapidly and correctly.
The generated schedule will imply, for each of the modules, a certain
ordering of the test graph's nodes. The ordering can be represented by a
directed path in each of the original test graphs (e.g., graphs in Figure
6.3.2). This directed path must visit each test node at least once and may
visit heating nodes as many times as needed. Applying a test or a heating
sequence is equivalent to visiting the corresponding test or the heating
node.
The test ordering and scheduling can also be viewed as converting the
original test graph into a final path-graph. A path-graph is defined as a
graph with only one directed path that connects all the nodes. There is no
other edge in a path-graph except those on this unique path. The final path-
graph must include all of the test nodes, while the heating nodes are
included as needed. The complete test scheduler that includes the ordering
algorithm decides at which point to insert a node taken from the original
test graph into the final path-graph.
6.4 Three-Phase Approach
The basics of the three-phase approach are briefly explained in section
6.2.1. Section 6.2.2 presented a technique to find the best temperature
interval ( to ) for a simplified scenario. As discussed before, if
the coefficient (in equation 6.1.5) is much larger than the average
temperature ( ) and the high temperature level ( ) is smaller
than the overheating temperature, , the analysis in section 6.2
remains valid.
However, these assumptions are often not valid. For example, the
overheating temperature may be relatively low compared with . For
the example in section 6.2.2, is equal to while the
overheating temperature might be . There are some other
complications, as well. In practice there are a number of modules, instead
of one, and their temperatures depend on each other due to heat transfer.
Moreover, the power values fluctuate with time. Besides, power values
include the stray powers that depend on the temperature due to the
temperature dependent leakage currents. Additionally, the modules may
not be able to receive their heating sequences at desired times, due to the
TAM limitation. New approaches capable of taking all these situations into
account are, therefore, proposed in this section.
As discussed in section 6.2.1, in phases 1 and 3 the tests are scheduled using
thermally safe third-party algorithms. It is assumed that these algorithms
perform optimization to reduce the test application time. Our focus will
therefore be on phase 2 where new algorithms can be designed to minimize
the test application time. This was demonstrated using a small example in
section 6.2.2. Assume that in phase 2 the temperature of module is
intended to swing between a low temperature level and a high
temperature level ( ). In comparison with the example in
section 6.2.2, and have roles similar to that of and ,
respectively.
The heating sequences are assumed to be powerful enough to raise the
module’s temperature to . The high temperature level should always be
lower than the overheating temperature ( ) to avoid any
kind of damage. Since all the normal tests and all the cycling tests are to
be separately scheduled using third-party algorithms and then performed in
two isolated phases (phases 1 and 3), there is no need to represent them in
the test graph. Consequently the test graph reduces to only include the
heating nodes (nodes marked with H in Figure 6.3.2). This simplifies the
problem of finding proper paths in these reduced graphs. For each module,
a greedy approach is used here and the heating node that offers the highest
heating power is selected to follow the current node.
An on-the-fly approach is used to schedule the heating sequences for phase
2 based on the simulated temperatures. The temperatures that are obtained
by simulation are then compared with and in order to generate the
schedule. High power heating nodes are used to rapidly increase the
temperature. Immediately after the temperature reaches its peak at , a
cooling interval is introduced to reduce the temperature back to . Then,
for the sake of fast cycling, the heating sequence must be immediately
applied again. However, the TAM might not be available at this moment.
Consequently, the temperature may fall below from time to time.
Heating sequences for different modules will compete for access to TAM.
The priority is decided based on the following equation:
(6.4.1)
Both and depend on time and are shortened forms of and
, respectively, at time . The priority is higher if the module’s
current temperature is much below . Note that the priorities are
calculated only for modules that need heating, therefore . The
reason for the inclusion of this difference term (i.e., ) in the
priority assessment is that if a module gets very cold, warming it up
again takes a long time. Therefore, the colder modules are given a higher
priority.
A module that has a large amount of temperature cycling left to fill also
has a higher priority. This is indicated by . Such a module is likely to
need a relatively long time to achieve its required ATC. Consequently, it
is likely that at the later stages of phase 2 this module remains alone. This
implies that the interleaving opportunities for TAM access will be reduced.
Consequently TAM utilization may decrease and test application time may
increase. A small value, , is added to the denominator in order to prevent
numerical problems when ATC is zero (e.g., at the beginning of phase 2,
if there has not been any normal test).
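The priority rule described above can be sketched as follows. This is an assumption about the shape of the rule, not a restatement of equation 6.4.1: the sketch multiplies the gap between the high temperature level and the current temperature by the ratio of the required cycling to the cycling achieved so far, with a small epsilon in the denominator. All names and numbers are hypothetical.

```python
def heating_priority(t_high, t_now, atc_required, atc_now, eps=1e-6):
    """Assumed priority of a module that is waiting for a heating sequence.

    Larger when the module is far below its high temperature level and when
    little of its required temperature cycling has been achieved so far.
    """
    if t_now >= t_high:
        raise ValueError("priorities are only computed for modules that need heating")
    return (t_high - t_now) * atc_required / (atc_now + eps)

def pick_module(candidates):
    """Return the name of the module with the highest priority.

    candidates maps a module name to (t_high, t_now, atc_required, atc_now).
    """
    return max(candidates, key=lambda name: heating_priority(*candidates[name]))

# Two hypothetical modules competing for the TAM.
waiting = {
    "m0": (80.0, 70.0, 100.0, 40.0),   # warm, 40 percent of cycling done
    "m1": (80.0, 50.0, 100.0, 40.0),   # colder, same cycling progress
}
winner = pick_module(waiting)
```

In this made-up case the colder module is served first. The epsilon term only matters when no cycling has been accumulated yet, where it keeps the ratio finite.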
The test application time for the schedules generated by this on-the-fly
approach depends on and . These temperature levels could assume
a range of values provided that . The
temperature that corresponds to the stray power is called stray temperature
and is denoted by (always ).
The temperature of a module cannot be lower than this because of the stray
power dissipation.
The combination of these temperature levels ( and ) among different
modules affects the test application time. The proper values for these
decision variables will be found in an external optimization loop, as shown
in Figure 6.4.1. In the inner scheduling loop, the temperature levels (i.e.,
decision variables) defined by the outer optimization loop are used to
generate the schedule. In Figure 6.4.1, the scheduler boxes inside the
dashed box represent multiple copies of the inner scheduling algorithm.
Although a single scheduler is sufficient to perform the
optimization, several are used in parallel to speed up the
procedure.
The outer optimization loop in Figure 6.4.1 makes use of a particle swarm
optimization algorithm. PSO is an iterative population-based optimization
metaheuristic, as discussed in section 2.7. For each alternative solution in
the PSO’s population, an on-the-fly scheduling is performed (inside the
dashed box in Figure 6.4.1) to compute the cost function (i.e., TAT).
The operation of the PSO algorithm is summarized here for convenience. The
algorithm starts from a random initial population, similar to other
population-based metaheuristics (e.g., evolutionary methods). The
population is referred to as a swarm in PSO terms. An individual in the
population is referred to as a particle. Each particle goes through a number
of alternative solutions, one at a time, as the algorithm iterates.
Each particle has a location in the search space (i.e., the current alternative
solution). A particle records the best solution it has ever encountered, the
local best. The swarm records the best solution its particles have ever
encountered, the global best. Based on these best solutions and the
previous alternative solution a velocity is determined which also
incorporates some randomization. Velocity is the vector that determines
the next location for a particle. The particles move throughout the search
space in a guided random manner until they gather around a near-optimal
solution.
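The PSO loop described above can be sketched in a few lines. This is a generic textbook form, given here only as an illustration: the inertia and acceleration coefficients are assumed values, the particles are evaluated one after another instead of in parallel, and the cost function is a toy stand-in for the test application time returned by the on-the-fly scheduler. All names are hypothetical.

```python
import random

def pso_minimize(cost, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_local=1.5, c_global=1.5, seed=0):
    """Minimize cost(position) inside box bounds with a basic PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    local_pos = [p[:] for p in pos]            # best location seen by each particle
    local_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=local_cost.__getitem__)
    global_pos, global_cost = local_pos[g][:], local_cost[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # Velocity: inertia plus random pulls toward the two bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c_local * rng.random() * (local_pos[i][d] - pos[i][d])
                             + c_global * rng.random() * (global_pos[d] - pos[i][d]))
                # Move and keep the particle inside the search box.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < local_cost[i]:
                local_pos[i], local_cost[i] = pos[i][:], c
                if c < global_cost:
                    global_pos, global_cost = pos[i][:], c
    return global_pos, global_cost

def toy_tat(levels):
    """Toy cost with a known minimum of 100 at (40, 80); not a real scheduler."""
    t_low, t_high = levels
    return (t_low - 40.0) ** 2 + (t_high - 80.0) ** 2 + 100.0

best_levels, best_tat = pso_minimize(toy_tat, [(20.0, 60.0), (60.0, 100.0)])
```

In the setting of this chapter, each call to the cost function would run the inner scheduling loop for one particle, which is why evaluating the particles in parallel pays off.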
Figure 6.4.1 External optimization loop based on particle swarm optimization
The algorithm is used to minimize the test application time. Inside the dashed box, copies of the scheduling heuristics are performed in parallel for a number of particles.
Flowchart blocks: Initialize the swarm; Schedule the tests for each particle (1st scheduler, 2nd scheduler, ..., last scheduler); Update the local bests and the global best; Finished? (No/Yes); Update the swarm (velocities & locations); Final schedule & test application time.
Signals: alternative decision variables; schedules & test application times.
6.5 Integrated Approach
Let us assume, now, that the orders in which normal test nodes (e.g., nodes
marked with N in Figure 6.3.2) must be visited are given. Furthermore,
assume that the order for heating sequence nodes (e.g., nodes marked with
H in Figure 6.3.2) is also given. This means that the original test graph is
broken down into a number of sub-graphs. These include, among other
sub-graphs, two separate directed path-graphs: one for the normal tests and
the other for the heating sequences.
This simplified scenario which involves two separate path-graphs will be
discussed first and a path-graph scheduling algorithm will be introduced in
sections 6.5.1–3. Afterwards, section 6.5.4 explains how to employ this
path-graph scheduling algorithm to solve the original problem that
involves the original test graph (i.e., the problem described in section 6.3).
An example in the following paragraphs, using Figure 6.5.1, explains how
the proposed schedule generation works. Figure 6.5.2 shows how different
blocks of the algorithm are put together. The example in Figure 6.5.1
explains how all these blocks work together to generate a schedule. Let us
assume that path-graph scheduling (i.e., Path-graph scheduling block in
Figure 6.5.2) determines that the module must receive heating at test
cycle. Test cycles are shown in Figure 6.5.1f. It asks test graph node
ordering (i.e., the node ordering block in Figure 6.5.2) for options.
Test graph ordering replies with two options (as shown in Figure 6.5.1d):
The first option is [ , ] that is a path-graph consisting of high power
normal test nodes. The second option is [ , ] that consists of heating
nodes. This interaction is depicted in Figure 6.5.2 as the loop between the
path-graph scheduling block and the node ordering block. The output of
the node ordering block is monitored to determine if all tests are completed.
The path-graph scheduling decides to go on with [ , ]. Now, the
power values are known and temperature simulation is performed to obtain
the temperatures. This interaction is depicted in Figure 6.5.2 as the loop
between the path-graph scheduling block and the temperature simulator
block.
The simulated temperatures are plotted in Figure 6.5.1a–b. As module
heats up, module is slightly warmed up by the transferred heat from
. It is assumed that the die in this example consists of only two modules.
Moreover, it is assumed that the test access mechanism provides access to
only one of the modules at a time. The module that occupies the TAM is
depicted in Figure 6.5.1c.
Every decision (i.e., change in the schedule) is recorded in the schedule as
a new entry. Each entry consists of the corresponding cycle in addition to
the node and state for each and every module. For example, a decision was
made at cycle to start . This is registered in the schedule as shown in
Figure 6.5.1f–j. Applying continues smoothly to the end and then
starts (at ), as previously suggested by the node ordering block.
At cycle the temperature of reaches the high level and cooling is
required. The node ordering block is consulted and it returns [ , ]
that consists of low power normal tests. The other alternative is a pause
Figure 6.5.1 An example for schedule generation
(Curves are illustrative.)
(a)-(b) Temperature curves and node transitions for the two modules. (c) TAM occupancy. (d)-(e) Test graph ordering options for m0 and m1. (f) Test cycles. (g)-(j) Schedule entries (node and state, where a state is Pause or Start/Resume) for m0 and m1.
Schedule nodes per cycle:
cycle   i0    i1    i2    i3    i4    i5    i6    i7
m0      s0,0  s0,2  s0,2  s0,2  s0,2  s0,3  s0,1  s0,1
m1      -     -     s1,4  s1,2  s1,2  s1,2  s1,2  s1,1
(cooling interval). Since the application of is not complete, the
application of low power normal tests is not possible. Therefore, a cooling
interval is introduced. This frees the TAM for the other module to utilize.
The node ordering block suggests either [ , ] or [ , ]. The
scheduler decides to go with ; a new entry for the cycle is added to
the schedule, and then the simulations and scheduling continue. Note that
if the temperature reaches the overheating limit (that is higher than the high
level discussed here and, therefore, is not shown in Figure 6.5.1) only a
pause can be selected (definitely not a low power test).
At cycle the temperature of reaches the high level and cooling is
required. The node ordering block is consulted and it returns [ , ]
that consists of low power normal tests. The other alternative, as always
for cooling, is a pause. Since the application of is not complete, the
application of low power normal tests is not possible. Therefore, a cooling
interval is introduced. This frees the TAM for the other module to utilize.
Figure 6.5.2 Integrated scheduling approach
The decision variables are highlighted with gray.
Blocks: Test graph; Node ordering; Path-graph scheduling; Priority calculation; Temperature simulator; All tests scheduled? (No/Yes); Schedule & test application time.
Decision variables and signals shown: threshold on power difference to decide between cooling interval or low-power test application; threshold on power difference to decide between heating sequence or high-power test application; length of the power assessment window for node ordering in the cooling, heating, ordinary, and thermal emergency situations; stop cooling, low cycling, high cycling, and emergency temperature limits; temperatures; current amounts of temperature cycling; remaining test sizes; priorities; power values; alternative path-graphs; alternative decision variables.
Since was pending, it is resumed and there is no need to consult the
node ordering block at the moment. At cycle , the node ordering block is
consulted and is selected for application.
At cycle the temperature of reaches the high level and cooling is
required. The node ordering block is consulted and it returns [ , ]
that consists of low power normal tests. Obviously, the other alternative is
a pause. This time the application of is complete and, therefore,
can actually be selected. However, the path-graph scheduler decides that,
in any case, a pause is better. Note that before a node is started or resumed,
its validity (VaT as discussed in section 6.3) is checked. If the node is not in
the VaT list, either another alternative must be selected or the module must
wait until the incompatible tests are complete. The above process, as explained in
Figure 6.5.1, continues until all tests are performed.
6.5.1 Path-Graph Scheduling Algorithm
The test application time could be reduced if normal tests (phase 1) are
integrated into the temperature-cycling acceleration process (phase 2). For
example, a test can be employed to heat a module and avoid an unnecessary
inclusion of a heating node. It may happen that a test is not powerful
enough to increase the module's temperature to , and yet it is beneficial
to include it to partially heat the module. A heating node is introduced
afterwards to rapidly increase the temperature up to .
Similar to this heating scenario, a mixed cooling scenario is also possible.
The benefit of these mixing scenarios is that although the temperature will
change slowly (increasing the test application time), a part of the tests is
being applied (decreasing the TAT). In a mixed cooling scenario, a low
power test is introduced when the temperature must decrease to create a
cycle. Although the module's temperature decreases, it
may not decrease to . A cooling interval is then introduced to complete
the cycle.
Assume that a high power test is being applied in a heating scenario as
shown in Figure 6.5.3a. Assume that the high-power test’s power for the
current time interval is denoted by . This power rapidly increases the
temperature at the beginning. Assume that this level of power is applied
for a long time. In this case a steady state temperature equal to will
eventually be reached. As the current temperature approaches , the
heating rate decreases. The derivative of the temperature (i.e., heating rate)
is shown in the lower part of Figure 6.5.3a. When the difference between
the heating-sequence’s heating rate and the test’s heating rate increases
beyond a certain threshold ( in Figure 6.5.3a), it is time to
switch to the heating sequence.
This will rapidly increase the temperature to . The temperature caused
by heating sequence (shown as the red curve in Figure 6.5.3a) introduces a
heating rate much larger than that of the test. Therefore, it is better to save
the rest of the tests for a time when the initial temperature is lower and the
tests can offer a large heating rate. The rate of temperature change (heating
rate in this case) is . Therefore the condition on heating rate is:
(6.5.1)
The temperature when the heating sequence is applied is denoted by .
When the high-power test is applied, the temperature is denoted by .
The heating rate can be calculated based on the current temperature and
upcoming power values using equation 6.1.1:
(6.5.2)
Combining equation 6.5.1, equation 6.5.2, and the equivalent of equation
6.5.2 written for the heating sequences (instead of high power tests in
equation 6.5.2) results in:
(6.5.3)
Figure 6.5.3 Thresholds in the integrated approach
(a) Heating and (b) Cooling.
Each panel shows the temperatures (upper part) and their derivatives (lower part) for the heating sequence, testing, and the cooling interval.
Considering the fact that at the moment of decision making, there is only
one actual temperature, ( ), the condition can be further
simplified to:
(6.5.4)
This could be re-written to have the condition expressed for the power
values:
(6.5.5)
Renaming to results in:
(6.5.6a)
Similarly, for the situation that the temperature must decrease (as shown in
Figure 6.5.3b), the proper condition for switching from a test to a cooling
interval is:
(6.5.6b)
The power of the low-power test is denoted by and the power of the
cooling interval (i.e., the stray power) is denoted by . Switching to the
cooling interval when indicated by the above equation speeds up the
cooling. In this way, the normal tests are employed efficiently during
the temperature-cycling process so that the overall test application time is
further reduced.
According to equations 6.5.6a–b, the scheduling heuristic does not need to
compute the derivatives of the upcoming tests’ temperatures. Instead, it is
sufficient to compare the upcoming power values. Whenever the inequality
in equation 6.5.6a is satisfied, test nodes are followed by heating nodes and
whenever the inequality in equation 6.5.6b is satisfied, the testing is paused
for cooling purpose. The variables and (elements that construct
and vectors), are to be optimized along with and , in the outer
optimization loop (e.g., Figure 6.4.1), to achieve a short test application
time.
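The power-based switching rules can be sketched as two small decision functions. The reasoning behind them, assuming a linear single-module model of the form C dT/dt = P - T/R, is that both candidate actions start from the same current temperature, so the temperature terms cancel and the difference in heating (or cooling) rates equals the difference in powers divided by C. The sketch below assumes this form of the inequalities; the function names, the returned labels, and the numbers are hypothetical.

```python
def choose_heating_source(p_heating_seq, p_test, threshold):
    """Pick what to apply next while the module must heat up.

    Switch to the heating sequence when it offers sufficiently more power
    than the upcoming part of the high-power test (assumed rule).
    """
    if p_heating_seq - p_test > threshold:
        return "heating sequence"
    return "test"

def choose_cooling_action(p_low_test, p_stray, threshold):
    """Pick what to do next while the module must cool down.

    Pause (cooling interval, only stray power is dissipated) when the
    low-power test dissipates sufficiently more than the stray power
    (assumed rule).
    """
    if p_low_test - p_stray > threshold:
        return "pause"
    return "test"

# Hypothetical average upcoming powers in watts.
heating_choice = choose_heating_source(p_heating_seq=10.0, p_test=7.0, threshold=2.0)
cooling_choice = choose_cooling_action(p_low_test=2.0, p_stray=1.0, threshold=1.5)
```

Because only upcoming power values are compared, the scheduler can make these decisions without running a temperature simulation for each alternative.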
These variables are optimized using PSO similar to the one explained in
section 6.4. As mentioned before, Figure 6.5.2 shows how the components
of the integrated approach (excluding the outer PSO loop that is already
explained in Figure 6.4.1) are put together. The path-graph scheduling is
shown in Figure 6.5.2 as a part of the scheduling algorithm. Other
components of Figure 6.5.2 will be explained in the upcoming sections.
Since the optimization process is similar to the PSO discussed in section
6.4, Figure 6.5.2, as a whole, can be viewed as one of the scheduler boxes
shown inside the dashed box in Figure 6.4.1. The alternative decision
variables shown in gray above Figure 6.5.2 come from Figure 6.4.1.
6.5.2 Length of the Power Averaging Window
The average upcoming powers (i.e., , , and ) can be calculated
for a short segment of the tests or heating sequences that immediately
follows. The shortest length of this segment is denoted by for module
. Having a much shorter segment than results in higher computational
effort without a significant improvement in the accuracy. Taking multiple
s into account helps to obtain a long-term estimate of the power values.
A much longer minimal segment length than is not desirable since an
accurate estimate then becomes unlikely.
The proper value of depends on the dynamics of the system. Consider
a that corresponds to percent ( ) of the final response
to a step input. Here, the final response is the steady state temperature and
the step input is when zero input power is followed by a constant power.
Assuming a constant power, the temperature equation in the time-domain
can be written according to equations 6.1.2–3. We assume that the step
response starts from the initial temperature equal to zero ( ).
Replacing with the percent of the final temperature results in:
(6.5.7)
Since the steady state situation means negligible variations in the
temperature, the temperature derivative can be assumed zero ( ).
By combining this observation with equation 6.1.1, the steady state
temperature can be described as:
(6.5.8)
Replacing from the above equation and from equation 6.1.3b in
equation 6.5.7 results in:
(6.5.9)
Here we are going to replace a scalar time, , with a matrix of time, .
Besides, we assume that the equivalence of the sides in the above equation
is achieved by satisfying the following equation (equation 6.5.10). These
assumptions work for estimating the values of ’s [Lin84].
(6.5.10)
Replacing from equation 6.1.3a results in
(6.5.11)
And finally
(6.5.12)
( ) is the time constants matrix [Lin84] (analogous to in
equations 6.1.4, 6.2.2–6, and 6.2.9 for a single-element case) and is the
matrix that contains the values of s. A diagonal element in (i.e.,
that is denoted by ) represents the proper minimal length for averaging
the upcoming test powers for module .
A ’s value obtained this way is not too short and will contain the
required information. On the other hand, the use of such values
prevents the temperature changes that are larger than from going
unnoticed. This percentage, , is only used for estimating the upcoming
tests’ average powers. The temperature simulations are always performed
based on the original power sequence. Therefore, the value of will not
affect them.
A set of experiments reported in [Aghaee15a] evaluates the accuracy of
values estimated using equation 6.5.12. The accurate value for is
obtained from high-quality temperature simulations. The average error
is found to be around 5%. This confirms that the above estimates are
sufficiently accurate in practice.
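For the single-element case, the window length can be checked numerically. The sketch below assumes a first-order thermal model with thermal resistance R and heat capacitance C (time constant RC), a step response that starts from zero, and a target of x percent of the steady-state temperature. Under these assumptions the time needed to reach x percent is -RC ln(1 - x/100). The function names and the numeric values are illustrative only.

```python
import math


def averaging_window_length(r_th, c_th, x_percent):
    """Time for a first-order step response to reach x percent of its final value."""
    if not 0.0 < x_percent < 100.0:
        raise ValueError("x_percent must be strictly between 0 and 100")
    return -r_th * c_th * math.log(1.0 - x_percent / 100.0)


def step_response(r_th, c_th, power, t):
    """Temperature rise at time t for a constant power applied from t = 0."""
    return r_th * power * (1.0 - math.exp(-t / (r_th * c_th)))


# Illustrative values: R = 2 K/W, C = 0.5 J/K, x = 10 percent, P = 3 W.
tau = averaging_window_length(2.0, 0.5, 10.0)
steady_state = 2.0 * 3.0
print(tau, step_response(2.0, 0.5, 3.0, tau) / steady_state)
```

A smaller x gives a shorter window, which tracks smaller temperature changes at the price of more frequent power averaging.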
6.5.3 Priorities for TAM Access
Normal tests, heating sequences, and cycling tests may compete for access
to the TAM. The priority for letting module access the TAM is assigned
based on the following criterion.
(6.5.13)
Similar to equation 6.4.1, the priority is higher for the colder modules and
for the modules with larger remaining ATC. Moreover, a module’s priority
is higher if its current amount of remaining tests (denoted by ) is larger.
Both normal and cycling tests are taken into account for calculation.
The motivation for inclusion of , similar to that of , is to avoid a
small number of modules running long after all other modules have
completed their tests. Such a scenario implies inefficient use of TAM due
to lack of interleaving opportunities.
In the above equation, is used to calculate the priority for a module
running a heating sequence. For normal or cycling tests, instead of ,
that indicates sufficient cooling, as introduced in section 2.7, is used in
equation 6.5.13. In case of the cycling tests, is replaced with one
(removed from equation 6.5.13) since the value of ATC is not relevant
anymore (after the required ATC is achieved). The priorities are calculated
based on frequently updated values for the amount of temperature cycling,
the temperatures, and the size of the remaining tests. These values are sent
out from the path-graph scheduling box in Figure 6.5.2 and the resulting
priorities are sent back to it.
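The qualitative behavior of the priority criterion can be sketched as follows. The multiplicative form used here is only an assumption that is consistent with the description above (a colder module, a larger remaining ATC, and a larger amount of remaining tests each raise the priority, and the ATC factor is replaced with one for cycling tests). The names `tam_priority` and `grant_tam`, as well as all numeric values, are hypothetical.

```python
def tam_priority(temperature, reference_temperature, remaining_atc,
                 remaining_test_size, is_cycling_test=False):
    """Assumed priority: temperature headroom x remaining ATC x remaining tests.

    reference_temperature is the limit that applies to the module's current
    activity (heating sequence, normal test, or cycling test).
    """
    atc_factor = 1.0 if is_cycling_test else remaining_atc
    return (reference_temperature - temperature) * atc_factor * remaining_test_size


def grant_tam(priorities, tam_width):
    """Return the names of the modules that get TAM access, highest priority first."""
    ranked = sorted(priorities, key=lambda name: priorities[name], reverse=True)
    return ranked[:tam_width]


# Two modules competing for a TAM that serves one module at a time.
priorities = {
    "m0": tam_priority(50.0, 90.0, 2.0, 100.0),
    "m1": tam_priority(70.0, 90.0, 2.0, 100.0),
}
print(grant_tam(priorities, 1))
```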
6.5.4 Node Ordering in the Test Graph
The path-graph scheduling algorithm cannot be directly employed to solve
the problem that involves the original test graph (e.g., Figure 6.3.2). The
path-graph scheduling needs to know, at certain time points, the order of
the test nodes that will follow and sometimes also the order of the heating
nodes. A path-graph format is usually used to represent the node order for
different sub-graphs. These orders may change during the scheduling, as
different nodes are being included in the schedule’s final path-graph.
A node ordering technique is introduced in this section to determine the
proper node orders repeatedly during the scheduling process.
This node ordering technique, put together with the path-graph scheduling
algorithm (section 6.5.1), solves the problem that involves the original test
graph, as shown in Figure 6.5.2.
For example, consider a test sub-graph with three normal and three heating
nodes, as shown in Figure 6.5.4a. The graph is simplified for the situation
in which the required ATC is not achieved yet. Therefore, all cycling tests
can be safely removed from the original test graph, for the moment.
Assume that the node (a normal test) is already included in the
schedule. Assume that after completion, the temperature must increase
in order to create a cycle.
The path-graph scheduling only needs to know what the sequence of the
normal test nodes (e.g., [ , ] in Figure 6.5.4b) would be if it decides
to continue the schedule with the high power tests. Furthermore, it needs
to know what the sequence of the heating nodes (e.g., [ , , ] in
Figure 6.5.4b) would be if it decides to continue with the heating
sequences. Based on the average power of these upcoming tests and
heating sequences, the path-graph scheduling algorithm decides which
node to include in the schedule, next.
For the above heating case, the high power orders for tests and heating
nodes are desirable. Similar to the above heating scenario, a node ordering
is performed also for the cooling scenario. In this case, there are no heating
nodes, and a low power order for the test nodes is the only thing to be
determined.
Let us continue with the test ordering for a heating scenario. Assume that
just one node can be considered at a time to determine the high power
order. Continuing with the previous example where is already selected,
if is larger than , then is selected to immediately follow
. Since only one node is left, the node ordering for the test nodes must
be [ , ].
Figure 6.5.4 Node ordering: (a) original graph with normal test nodes (N)
s0,0, s0,1, s0,2 and heating nodes (H) s0,3, s0,4, s0,5; (b) graph ordered
during the scheduling process for the time point that comes just after the
already included normal test node has been scheduled, with the normal test
nodes shown as s0,0, s0,2, s0,1, the heating nodes as s0,4, s0,3, s0,5, and
the edge labels p0,0-2 and p0,0-4
Instead of only one node at a time, two nodes at a time can also be
considered to determine the high-power order. In this case, the decision is
made based on a normalized power value for two nodes. For example, if
>
then the node ordering for the test nodes must be [ , ]. The
normalized power value, if node follows node , is denoted by .
Therefore, in the above example, . The number of nodes taken
into account at a time could be larger. Moreover, it might be helpful to
consider only a part of the test sequence at the beginning of a node.
Therefore, the ordering criterion can be generalized to consider the power
values inside a power assessment window. The length of the power
assessment window is defined as multiples of . The multiplier is denoted
by and it is sufficient to capture the window length. Thus, the length
of the power assessment window is cycles (of high-power test or
heating sequence). Assume that a node consists of samples and
cycles is equal to nodes plus cycles ( ). This
means that nodes ( ) will be involved. Assuming [ , , …,
, ] as the supposed node order, its normalized power is:
(6.5.14)
It is assumed that node is visited immediately after node ( for
). Note that unlike the test nodes that are visited only once, the
heating nodes may be repeated as needed. For example the heating nodes
could have [ ] as the order, although this has not happened in
Figure 6.5.4b. Similar to , which is for heating situations, is used
for the cooling situations.
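The window-based ordering criterion can be illustrated with a short sketch. It assumes that the normalized power of a supposed node order is the mean of the per-cycle power samples that fall inside the power assessment window, where the window covers a number of full nodes plus some cycles of the next node. The function name, the node data, and the window length are hypothetical.

```python
def normalized_power(node_order, window_cycles):
    """Mean power over the first window_cycles cycles of the given node order.

    node_order is a list of nodes; each node is a list of per-cycle powers.
    """
    if window_cycles <= 0:
        raise ValueError("window_cycles must be positive")
    accumulated = 0.0
    remaining = window_cycles
    for samples in node_order:
        take = min(remaining, len(samples))  # full node or leading part only
        accumulated += sum(samples[:take])
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("node order is shorter than the window")
    return accumulated / window_cycles


# Hypothetical per-cycle power samples for two candidate test nodes.
node_b = [1.0, 1.0, 1.0]
node_c = [3.0, 3.0, 3.0]
candidates = {"c then b": [node_c, node_b], "b then c": [node_b, node_c]}

# Window of 5 cycles: one full node (3 cycles) plus 2 cycles of the next node.
high_power_order = max(candidates,
                       key=lambda name: normalized_power(candidates[name], 5))
print(high_power_order)
```

For a heating node that is repeated, the same node simply appears more than once in `node_order`. For a cooling situation, `min` replaces `max` in the selection.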
A small power assessment window results in fast schedule generation, but
the generated schedules might not be as short as they would have been with
a large window. A large window, on the other hand, results in slow schedule
generation. Moreover, too large a window may delay the use of some of the
best heating sequences so much that they are left unused at the end. The
proper window lengths for heating and cooling situations are obtained in
the outer optimization loop. In the inner ordering/scheduling loop, the
window lengths defined by the outer optimization loop are used to generate
the orders and the schedule, as shown in Figure 6.5.2. The outer
optimization loop consists of a particle swarm optimization algorithm, as
previously described in Figure 6.4.1.
After the required amount of cycling is achieved, the remaining normal
tests and cycling tests must be performed. In this case, heating nodes as
well as the already applied normal tests can be safely removed from the
original test graph. The newly created test sub-graph must be converted to
a path-graph whenever the path-graph scheduling algorithm (detailed in
section 6.5.1) demands a new node.
The module temperature may be high due to its previous activities or
because of a high temperature in adjacent modules (heat transfer among
modules). When the module’s temperature is too high and close to the
overheating limit ( ), it might be helpful to find a node ordering
that swiftly reduces the power. In this situation, the power in a short time
window is what matters; moreover, a rather low-power test sequence can
usually be found if the power-assessment window is rather short.
It might not matter if this node order results in a higher test power some
time later, since then the module might be cold. In such an emergency
situation a short power-assessment window, denoted by , is used. There
is an emergency situation if the current temperature is larger than the
emergency temperature limit, denoted by . If the temperature is less than
then the situation is ordinary. In any case, the nodes must be ordered in
a low power configuration as detailed above.
The length of power-assessment window in this ordinary situation is
denoted by . It might be helpful to have a long ordinary power-
assessment window ( ) to avoid large switching activities in a
long-term sense (as opposed to short-term low-power in emergency
situations). The value of is optimized in the outer optimization loop
along with , , , , , , , , and . This outer
optimization loop is similar to Figure 6.4.1 and sends the alternative
decision variables to the integrated scheduling algorithm, as shown in
Figure 6.5.2. The alternative decision variables shown coming to Figure
6.5.2 are from Figure 6.4.1. The generated schedule and its corresponding
TAT shown going out of Figure 6.5.2 are used in Figure 6.4.1.
The search to find the best order (e.g., a path in a graph similar to Figure
6.5.4) is performed using a branch and bound approach which searches the
graph down to a depth equal to cycles (also , , or ,
correspondingly). The cost function (normalized power similar to equation
6.5.14) can be replaced with the accumulated power since all the
alternatives have the same ( , , or , correspondingly). When a
low power order is required (corresponding to the situation in which ,
, or are used), the search can be very fast, since after a relatively
good path is found, the bad candidates’ accumulated power values rapidly
exceed the already found relatively small power value. Consequently, the
inferior candidate paths are rapidly discarded. The search for heating
situations may take longer but, nevertheless, the overall schedule
generation procedure is adequately fast.
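The pruning effect in the low-power case can be illustrated with a minimal branch and bound sketch. It assumes non-negative per-cycle power samples, uses the accumulated power inside the window as the cost, and discards a partial order as soon as its accumulated power reaches the best value found so far. The function name and the node data are hypothetical, and test dependencies are not modeled.

```python
def lowest_power_order(nodes, window_cycles):
    """Find the node order with the smallest accumulated power in the window.

    nodes maps a node name to its list of non-negative per-cycle powers.
    Returns (order, accumulated_power).
    """
    best = {"cost": float("inf"), "order": None}

    def search(order, used, cost, remaining):
        if cost >= best["cost"]:
            return  # bound: this branch cannot beat the best order found so far
        if remaining == 0 or len(used) == len(nodes):
            best["cost"], best["order"] = cost, list(order)
            return
        for name, samples in nodes.items():
            if name in used:
                continue
            take = min(remaining, len(samples))
            order.append(name)
            used.add(name)
            search(order, used, cost + sum(samples[:take]), remaining - take)
            order.pop()
            used.remove(name)

    search([], set(), 0.0, window_cycles)
    return best["order"], best["cost"]


# Three hypothetical test nodes and a window of 3 cycles.
nodes = {"a": [5.0, 5.0], "b": [1.0, 1.0], "c": [2.0, 9.0]}
print(lowest_power_order(nodes, 3))
```

Since the powers are non-negative, the accumulated power of a partial order never decreases as the order is extended, which is what makes the early discarding safe.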
6.5.5 Remarks
As mentioned before, the proposed techniques are designed so that a test
dependency graph can be accommodated. In many other cases, the test
dependencies rule out some of the combinations in the schedule and
therefore reduce the search space, which helps to achieve a faster schedule
generation (shorter CPU time). In the case of test ordering, especially for
the test graph, test dependencies will remove some of the edges in the test
graph. Therefore, techniques that rely on the test graph being a
complete graph are not helpful in this case.
The proposed technique uses temperature simulations in order to generate
a test schedule that has certain temperature characteristics. We use an
accurate simulator and, therefore, the temperature error is small. Since the
error is minor, a safety margin is sufficient to prevent overheating, as
discussed in chapter 4.
To ensure that a sufficient amount of cycling has been applied before the
related tests, a slightly larger amount of required cycling can be assumed.
It is assumed that a node in the test graph can be paused and resumed. This
is required for on-demand cooling as well as partitioning and interleaving.
In other words a session-less testing scheme is used. A certain module can
pause and resume its test but it cannot change to a different node before it
completes the node that it has already started.
A node in a test graph may consist of a single test vector or a number of
vectors that are applied one after the other. In general, a node in the test
graph consists of a single test vector and, therefore, the test graph is large.
Despite the large size of the test graphs, the scheduling heuristic is capable
of handling them, since it is very fast. However, if the number of test vectors
is excessively large, then the schedule generation may become slow. In
such a situation, multiple test vectors can be grouped into a single node in
the test graph. Ideally, nodes whose different orders do not cause
large power dissipation differences should be grouped together. This
reduces the computational effort while limiting the loss of ordering
effectiveness. The test-vector clustering problem (i.e., how to group the
test vectors into the nodes of a test graph) is, however, outside the scope of
this thesis.
It should also be mentioned that there are scenarios in which a
chamber-based technique is required. For example, after packaging, a
chamber-based technique is required to perform cycling tests that target
the IC features external to the dies.
A chamber-based approach, however, enforces the maximal cycling
acceleration on all modules. This may lead to a longer overall test time and
unnecessary aging of modules that require less cycling acceleration. The
integrated approach, on the other hand, can be faster and cheaper than the
chamber-based approach. Moreover, it supports different amounts of
temperature cycling for different modules. For example, one module can
receive very little cycling acceleration, while another module receives a
very large cycling acceleration, as needed.
6.6 Experimental Results
Experiments have been performed to demonstrate that the proposed
technique can efficiently achieve desired temperature-cycling
accelerations. Moreover, it is demonstrated that the proposed integrated
approach offers a smaller test application time and, therefore, outperforms
the three-phase approach. However, if the normal or cycling test schedules
provided by a third party have to be used, the three-phase approach must
be chosen. In the following, first the cycling acceleration effect is
demonstrated in section 6.6.1 and then, in section 6.6.2, the performance
of the proposed approach is discussed.
6.6.1 Cycling Acceleration
The proposed integrated approach is used to perform tests and cycling
acceleration for an IC with two modules, as a demonstrator example. It is
assumed that the TAM in this example can only support one module to be
tested at a time ( =1). The corresponding temperature curves are plotted
in Figure 6.6.1a. At the beginning (before ) there are many normal
tests that are properly mixed with heating sequences and cooling intervals
in order to create a high cycling rate. As time goes on, the number of
normal tests that can be effectively used reduces and, therefore, the
majority of cycling is generated by a mix of heating sequences and cooling
intervals (which is, in general, faster and more effective). Around ,
the required amounts of temperature cycling for the two modules are met
and the cycling tests as well as the remaining normal tests can be applied
until all tests are performed (around ).
Figure 6.6.1 Cycling acceleration: (a) temperature [°C] versus time [ms]
for modules m0 and m1; (b) accelerated time [hours] versus time [ms] for
modules m0 and m1; (c) magnified temperature [°C] curve for module m0,
with markers at 1266 and 1318 (×4000 cycles)
As more and more temperature cycles are performed (as in Figure 6.6.1a),
the amount of temperature cycling accumulates, as suggested by the
increasing accelerated time in Figure 6.6.1b. The vertical axis in Figure
6.6.1b is the accelerated cycling time and the horizontal axis is the actual
time. Moreover, the temperature curves in Figure 6.6.1a are used to
compare the proposed integrated approach (Alternative 1, below) with a
chamber-based technique (Alternative 2):
Alternative 1
Let us evaluate our proposed integrated approach here. A middle section
of the temperature curve from Figure 6.6.1a is magnified in Figure 6.6.1c
for module . Temperature swings between and , resulting in
a temperature-cycle amplitude equal to ( ). The average
temperature is approximately ( ). A temperature cycle
happens in test cycles. The test is performed at
. Therefore:
.
Equation 6.1.5 should be used to calculate the ATC value. Here we assume
that , , , , and . Therefore,
the ATC value achieved in a second is:
Alternative 2
Assume a chamber-based approach that uses Thermotron® test chamber
number SE-400-15-15. According to its specifications, this chamber can
create a temperature-cycle, similar to that of Alternative 1, in
approximately 380 seconds. Therefore:
and , .
The ATC value achieved in a second is:
The amount of temperature cycling per second achieved by the
chamber-based technique is around while the integrated approach
achieves around . This means that our approach outperforms the
chamber-based technique by a huge margin (almost 180000 times³).
³ Although other chamber setups may perform better, their corresponding
margins will still be very large.
6.6.2 Performance of the Integrated Approach
The proposed techniques are evaluated on a set of 24 experimental ICs as
detailed in Table 6.6.1. Column 1 indicates the ICs’ serial numbers. These
ICs have one to four stacked dies (column 2). The ICs with one layer
(numbers 1 to 6) correspond to dies at the pre-bond test stage. The ICs with
more than one layer represent a mid-bond or a post-bond test stage. Each
die accommodates 2, 12, 20, 30, 42, or 49 modules, resulting in 2 to 196
modules per IC, as shown in column 3. The number of the corresponding
nodes in the test graphs is between 60 and 5880 and test sizes are between
234 kB and 22 MB. The thermal models are extracted using an approach
similar to [Coskun09]. The switching activities for tests and heating
sequences are generated using Markov chains, similar to [Yao11c]. All
experiments are performed on a desktop computer with an Intel® Xeon®
W3520 processor and 8 GB of memory.

Table 6.6.1 Percentage changes achieved by integrated approach
(Columns 4 to 7: integrated approach compared with the three-phase approach)

                           Percentage change in TATs   Percentage change in CPU times
      Number     Number    Three-phase  Three-phase    Three-phase  Three-phase
      of         of        w/o          with           w/o          with
IC    layers     modules   ordering¹    ordering²      ordering³    ordering⁴
 1    1            2        –22.99       –21.75           0.00         0.00
 2    1           12        –17.02       –14.81           0.00         0.00
 3    1           20        –29.73        –2.12           0.00         0.00
 4    1           30        –22.24        –5.51         300.00       300.00
 5    1           42        –12.65       –10.08         200.00       200.00
 6    1           49        –33.33       –32.19         500.00       500.00
 7    2            4        –19.08       –19.08           0.00         0.00
 8    2           24         –2.13        –2.13         200.00       200.00
 9    2           40         –4.58        –3.74         500.00       500.00
10    2           60         –8.09        –7.34         350.00       350.00
11    2           84         –9.81        –9.52         325.00       325.00
12    2           98         –7.11        –2.94         300.00       300.00
13    3            6        –38.16       –35.60           0.00         0.00
14    3           36        –32.47       –20.71         400.00       400.00
15    3           60        –15.05       –13.85         333.33       333.33
16    3           90        –9.034        –8.92         283.33       283.33
17    3          126        –10.88        –7.45         169.23       191.67
18    3          147        –22.32       –12.98          83.33        83.33
19    4            8        –52.83       –44.92           0.00         0.00
20    4           48        –22.12       –21.95          25.00        25.00
21    4           80        –25.04       –24.86          13.04        13.04
22    4          120        –16.22       –16.03          18.48        15.96
23    4          168        –16.97       –16.96          14.43        18.08
24    4          196        –22.99       –13.99          20.50        26.87
Average                     –19.70       –15.39         168.15       169.40

¹ Percentage change in the integrated approach’s test application time ( )
compared with the 3-phase approach without test ordering ( )
² Percentage change in the integrated approach’s test application time ( )
compared with the 3-phase approach with test ordering ( )
³ Percentage change in the integrated approach’s CPU time ( ) compared
with the 3-phase approach without test ordering ( )
⁴ Percentage change in the integrated approach’s CPU time ( ) compared
with the 3-phase approach with test ordering ( )
The required amount of cycling is separately defined for each module in
the experimental ICs. It does not depend on the scheduling method and,
therefore, both the three-phase and integrated approaches must enforce
identical amounts of temperature cycling. Since before enforcing ,
cycling tests cannot be performed, the cycling operations continue until
is enforced. This implies that it is sufficient to compare the test
application times.
The integrated approach achieves a shorter TAT than the three-phase
approach for all of the experimental ICs. Two types of the three-phase
approach are considered. Type 1 does not use any ordering
technique. Type 2 uses a test ordering technique. The percentage changes
compared with the three-phase approach without test ordering (type 1) and
with test ordering (type 2) are reported in columns 4 and 5, respectively.
On average, the proposed technique outperforms the three-phase technique
with ordering and the three-phase technique without ordering by about 15
percent and 20 percent, respectively. Since test ordering reduces the TAT
of the three-phase approach, the improvements achieved by the integrated
approach are larger when compared with the three-phase approach without
test ordering.
Compared with the three-phase approach, the integrated approach is more
complicated and, therefore, it takes more time to run. The CPU times are
rounded to seconds before the percentage changes are calculated. The
percentage changes in the CPU times are reported in columns 6 and 7 for
the integrated approach compared with the three-phase approach without
and with test ordering, respectively. On average, the CPU time of the
proposed technique is larger than that of the three-phase technique with
ordering and the three-phase technique without ordering by about 169
percent and 168 percent, respectively.
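The reported averages can be reproduced from the per-IC values of Table 6.6.1. The sketch below assumes the usual definition of percentage change, (new - old) / old × 100, and copies the four data columns from the table. The CPU times in the rounding example (1 second versus 4 seconds) are hypothetical and only show why the CPU columns contain values such as 0.00 and 300.00 when the times are rounded to whole seconds first.

```python
def percentage_change(new, old):
    """Percentage change of new relative to old."""
    return (new - old) / old * 100.0


def mean(values):
    return sum(values) / len(values)


# Columns 4 to 7 of Table 6.6.1 (ICs 1 to 24).
tat_vs_no_ordering = [-22.99, -17.02, -29.73, -22.24, -12.65, -33.33,
                      -19.08, -2.13, -4.58, -8.09, -9.81, -7.11,
                      -38.16, -32.47, -15.05, -9.034, -10.88, -22.32,
                      -52.83, -22.12, -25.04, -16.22, -16.97, -22.99]
tat_vs_ordering = [-21.75, -14.81, -2.12, -5.51, -10.08, -32.19,
                   -19.08, -2.13, -3.74, -7.34, -9.52, -2.94,
                   -35.60, -20.71, -13.85, -8.92, -7.45, -12.98,
                   -44.92, -21.95, -24.86, -16.03, -16.96, -13.99]
cpu_vs_no_ordering = [0.00, 0.00, 0.00, 300.00, 200.00, 500.00,
                      0.00, 200.00, 500.00, 350.00, 325.00, 300.00,
                      0.00, 400.00, 333.33, 283.33, 169.23, 83.33,
                      0.00, 25.00, 13.04, 18.48, 14.43, 20.50]
cpu_vs_ordering = [0.00, 0.00, 0.00, 300.00, 200.00, 500.00,
                   0.00, 200.00, 500.00, 350.00, 325.00, 300.00,
                   0.00, 400.00, 333.33, 283.33, 191.67, 83.33,
                   0.00, 25.00, 13.04, 15.96, 18.08, 26.87]

for column in (tat_vs_no_ordering, tat_vs_ordering,
               cpu_vs_no_ordering, cpu_vs_ordering):
    print(round(mean(column), 2))

# Hypothetical rounded CPU times: 1 s (three-phase) versus 4 s (integrated).
print(percentage_change(4, 1))
```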
The CPU times for the three-phase approaches with and without test
ordering are comparable. Compared with the three-phase approach without
ordering, the three-phase approach with ordering requires more computation
at the decision-points⁴ in the schedule. This is due to the time taken to search
for a good node order. On the other hand, the three-phase approach with
node ordering offers slightly shorter schedules, which means fewer
decision-points. Consequently, shorter schedules sometimes compensate
for the time-consuming node ordering operations, but not always.
Therefore, the number in column 6 is sometimes smaller than the number
in column 7 (for example, for IC number 17) and sometimes larger (for
example, for IC number 22).
Note that in these experiments the CPU times include the scheduling
times for the normal and cycling tests. If the schedules had been
provided by a third party, the actual CPU times for the three-phase
approaches would be smaller. Consequently, the numbers reported in
columns 6 and 7 would be larger than the current values. A shorter CPU
time can be considered an advantage of the three-phase approach.
CPU times, in general, grow with the test size, as shown in Figure 6.6.2a.
Moreover, CPU times also grow with the number of modules and layers, as
shown in Figure 6.6.2b. The data points in Figure 6.6.2a represent
multiples of test sizes used in Figure 6.6.2b. The growth rates are, however,
acceptably low and the scheduling process for the largest IC (number 24)
takes less than 5 minutes to complete.
⁴ A decision-point is a point at which a module’s state
(testing/heating/cooling) or test/heating node may change.
Figure 6.6.2 CPU time growth (a) with test size (CPU time [min] versus
test size [kB]), (b) with IC complexity (CPU time [sec] versus number of
modules × number of layers)
6.7 Conclusions
Temperature-cycling acceleration is a useful technique to help the
detection of cycling-dependent early-life failures. These failures are
usually not considered a major issue for conventional 2D ICs. Therefore,
cycling acceleration is usually recommended only when a high degree of
reliability is crucial. Recent studies have shown that the cycling-dependent
early-life failures can be a major issue for 3D stacked ICs. The existing
cycling acceleration procedures are very costly since they are usually
performed using temperature chambers. In this thesis we propose an
inexpensive technique to order the tests and heating sequences so that the
required temperature cycling effects can be achieved in a short time,
without the use of temperature chambers.
For this purpose, tests are ordered differently based on the power required
in each situation. When a module’s temperature must increase to
generate a temperature cycle, a high-power ordering of the tests and
heating sequences is used. When the temperature must decrease, a
low-power ordering of the tests is used instead. During the
tests, after the required cycling is achieved, a long-term or a short-term
low-power ordering of the tests is selected, depending on the module’s
current temperature. All of these help to achieve a short test application
time, as demonstrated by the experiments. Consequently, this integrated
approach is well suited for incorporation into the pre-, mid-, and post-bond
test stages for 3D stacked ICs.
6.8 Notations and Abbreviations
Notation Description
Represents heat capacitances in the thermal model
Amount of temperature cycling
Required amount of temperature cycling for module
Represents thermal conductance (related to heat transfer) in the
thermal model
Heat capacitance in a simple single module case (analogous to )
Identity matrix
Constants in ATC equation
Number of temperature cycles for module
Power value(s) in a general case
Heating sequences’ powers
High-power tests’ powers
Low-power tests’ powers
Stray power
PSO Particle Swarm Optimization [Poli07]
Thermal resistance in a simple single module case (analogous to
)
Remaining tests’ size
Node in module ’s test graph (same as th test for module
)
TAM Test Access Mechanism
TAT Test Application Time
Transfer matrix for initial temperatures
Transfer matrix for power values
Constant in ATC equation
Number of samples/cycles in one test/heating node.
Temperatures vector in a general case
Ambient temperature
Overheating temperature limit
Stray temperature (caused by stray power, )
Initial temperature(s)
Final temperature(s) after seconds
Steady state temperatures
High cycling temperature limit for module
Low cycling temperature limit for module
Emergency temperature limit for module
Stop cooling temperature limit for module
Threshold for temperature cycle amplitudes
Average cycling temperature for module
The amplitude of temperature cycles for module
Heating rate of the heating sequences
Heating rate of the high-power tests
Remaining number of cycles when full nodes are subtracted from
the length of power assessment window ( ). covers
nodes plus cycles.
Minimal segment lengths for power averaging/assessment (Vector
and element formats)
Average cycling temperature for single module example
Number of full nodes in the power assessment window ( ).
covers nodes plus cycles.
TAM access priority for module
ATC rate
Half of temperature cycle amplitude for single module example
Time period between the initial temperatures and the final
temperature
Threshold on power difference to decide between cooling interval
or low-power test application
Number of nodes involved in the power assessment window
( ).
Threshold on power difference to decide between heating
sequence or high-power test application
Length of the power assessment window. cycles are
considered. Refers indifferently to , , , or .
for node ordering in Cooling situation.
for node ordering in thermal Emergency situation.
for node ordering in Heating situation.
for node ordering in Ordinary situation.
Chapter 7 Conclusions and Future Work
7.1 Conclusions
Many cutting-edge computer and electronic products are based on
advanced Systems-on-Chip (SoC). Advanced SoCs are manufactured with
deep submicron and 3D-stacked-IC technologies. These advanced
manufacturing technologies enable the integration of a large number of
high performance functions. Such advanced manufacturing technologies
face a number of thermal challenges with regard to their reliability and
testing procedures. These challenges, related to temperature uncertainty,
temperature gradients, and temperature cycling, have been addressed in this
thesis.
Temperature Uncertainty
Advanced SoCs manufactured with deep submicron technologies suffer
from process variation and its thermal consequences. Existing testing
techniques rely on temperature simulations to predict the circuit-under-
tests’ temperatures and design the test so that overheating is prevented. The
difference between the expected temperatures and the actual temperatures
is called temperature error. This error, for past technologies, is negligible.
However, advanced SoCs experience large error magnitudes due to large
process variations. Such large error magnitudes have costly consequences
(e.g., test overkill and overheating) and must be taken care of.
This thesis presents several scheduling-based approaches to take care of
the temperature errors induced by process variation. An adaptive technique
for addressing the intra-die and time-variant errors is introduced. This
technique is designed to support a thermal-safe test which means that a
high-temperature limit must be respected. A slightly different scenario is
multi-temperature testing, which also requires considering a
low-temperature limit. An adaptive technique to deal with temperature errors
that affect multi-temperature testing is therefore proposed.
Temperature Gradients
Temperature gradients in a chip accelerate certain defect mechanisms
including some types of early-life failures. Therefore, performing a
burn-in-like operation that enforces appropriate gradients helps to
accelerate and detect these early-life failures. A test-scheduling-based
approach for performing burn-in-like operations is proposed in this thesis.
The proposed approach enforces the required temperature gradients by
selectively applying high-power test stimuli to the circuit under test. This
way, the required lifetime acceleration is achieved without requiring
temperature chambers.
Temperature gradients also affect some delay-related defects. The delay
experienced by a signal depends on the temperature of its path. Moreover,
some defects (e.g., resistive opens) can also affect the delay. Signals
travelling through different paths may therefore experience different
delays because of a subtle defect in one of the paths as well as the path
temperatures. This means that the circuit may operate correctly when the
gradients are negligible even though a subtle defect exists. However, the
same defect may cause a fault when a certain gradient occurs on the
chip. In order to detect such subtle defects, the related tests must be
applied while appropriate temperature gradients are enforced. 3D stacked ICs
experience large gradients and, therefore, the proposed techniques are
developed so that they can be efficiently applied to 3D stacked ICs.
Temperature Cycling
Temperature-cycling test procedures are usually applied to safety-critical
systems to detect cycling-related early-life failures. Such failures affect
advanced SoCs, particularly through-silicon-via structures in 3D-stacked-
ICs. An efficient schedule-based cycling-test technique that combines
cycling acceleration with testing is proposed in this thesis.
The circuit-under-test’s dissipated power depends on the order in which
the tests are applied. Therefore, the tests are reordered by the proposed
technique to adjust the power dissipation levels as needed. This helps to
achieve a short test application time. Moreover, the proposed technique fits
into existing 3D testing procedures and does not require temperature
chambers. Therefore, the overall cycling acceleration and testing cost can
be drastically reduced.
Temperature Simulation and Experiments
A fast temperature simulation technique based on a closed-form solution
for the temperature equations is introduced in this thesis. Dedicated
experiments show that the proposed simulation technique reduces the
schedule generation time by more than half. This technique is used in the
majority of the experiments reported in this thesis.
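
As a rough illustration of why a closed-form solution can be fast, the sketch below advances a single-node lumped thermal RC model over a piecewise-constant power schedule without any numerical integration. The single-node model, the function names, and the parameters r_th, c_th, and t_amb are assumptions made for this example only; the simulation technique developed in this thesis is not reproduced here.

```python
import math


def step_temperature(t0, power, dt, r_th, c_th, t_amb):
    """Temperature after dt seconds of constant power (hypothetical one-node model).

    Solves C * dT/dt = P - (T - T_amb) / R in closed form:
    T(dt) = T_ss + (T0 - T_ss) * exp(-dt / (R * C)), with T_ss = T_amb + P * R.
    """
    t_ss = t_amb + power * r_th  # steady-state temperature for this power level
    return t_ss + (t0 - t_ss) * math.exp(-dt / (r_th * c_th))


def simulate(schedule, t_init, r_th, c_th, t_amb):
    """Evaluate the temperature at the end of each (power, duration) interval."""
    temps = [t_init]
    for power, duration in schedule:
        temps.append(
            step_temperature(temps[-1], power, duration, r_th, c_th, t_amb)
        )
    return temps


if __name__ == "__main__":
    # Illustrative values only: a heating interval followed by a cooling interval.
    print(simulate([(10.0, 1.0), (0.0, 1.0)], 25.0, 2.0, 0.5, 25.0))
```

Each interval costs one exponential evaluation regardless of its length, which is the property that makes a closed-form approach attractive inside a schedule-generation loop.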
All the techniques proposed in this thesis have been implemented and
evaluated with extensive experiments based on the ITC’02 benchmarks as well
as a number of experimental 3D-stacked-ICs. The experiments show that the
proposed techniques work effectively and reduce costs, in particular
the costs related to addressing thermal issues and early-life failures.
7.2 Future Work
In this thesis, we focused on the manufacturing test process. However,
in-field and online testing are also required, for example, for
safety-critical systems. Issues similar to those considered in this thesis
for manufacturing testing can cause problems during in-field and online
testing as well. Temperature issues caused by process variation and
temperature gradients are among them.
Temperature cycling for applications that require frequent in-field or
online testing is another direction for future research. Designing
testing procedures that minimize temperature cycling can be of interest,
in order to slow down the aging process. The same holds for minimizing
the gradients. Utilizing the gradients that already exist during normal
operation for online gradient-based testing can be efficient and is
therefore interesting to study.
Adaptive online testing is another related topic. Temperature cycling and
gradients can be monitored during normal operation, and online tests
targeting the weakened areas (where defects are likely) can be applied. The
temperature errors caused by process variation can also be estimated
during normal operation, and a decision between a slow (low-power) and a
fast (high-power) online test scheme can then be made accordingly. For
example, for modules that run warmer than usual, a longer low-power online
test might be a good choice. On the other hand, for modules that run colder
than usual, a faster high-power online test might be a good choice. This
may change over time, partly in relation to gradients and cycling.
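
The selection rule described above could be sketched as follows. This is only a minimal illustration: the function name, the scheme labels, the margin parameter, and the "default" fallback are hypothetical and are not part of the techniques proposed in this thesis.

```python
def choose_online_test(measured_temp, expected_temp, margin=2.0):
    """Pick an online test scheme from the estimated temperature error.

    The error is the measured temperature minus the expected one. A module
    that runs warmer than expected gets the slow low-power scheme; a module
    that runs colder gets the fast high-power scheme. The margin (assumed,
    in the same unit as the temperatures) avoids reacting to small errors.
    """
    error = measured_temp - expected_temp
    if error > margin:
        return "slow_low_power"
    if error < -margin:
        return "fast_high_power"
    return "default"


if __name__ == "__main__":
    print(choose_online_test(70.0, 60.0))
    print(choose_online_test(50.0, 60.0))
```

Because the estimated error may drift over time, such a decision would have to be re-evaluated periodically rather than fixed once.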
In a manufacturing test setup, testing frequency can be used to alter the
power dissipation, if the DfT circuitry and the ATE support it. For
example, when cooler testing is preferred, the frequency can be reduced.
If heating is required, then the frequency could be increased to generate
more heat. Although not used in this thesis, testing frequency can be added
as another decision variable to the problem formulation.
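
To make the role of frequency concrete, the sketch below uses the standard first-order model of dynamic power, P = a * C * V^2 * f, and inverts it to find the test frequency that would yield a desired dynamic power. Leakage power is ignored, and the function names, parameters, and frequency bounds are assumptions for illustration; the problem formulation in this thesis does not include them.

```python
def dynamic_power(activity, capacitance, vdd, freq_hz):
    """First-order dynamic power: activity * C * Vdd^2 * f (leakage ignored)."""
    return activity * capacitance * vdd ** 2 * freq_hz


def frequency_for_power(target_power, activity, capacitance, vdd, f_min, f_max):
    """Frequency giving target_power, clamped to the range [f_min, f_max].

    The bounds stand for whatever range the DfT circuitry and the ATE
    support (hypothetical parameters).
    """
    freq = target_power / (activity * capacitance * vdd ** 2)
    return min(max(freq, f_min), f_max)


if __name__ == "__main__":
    # Illustrative values only: 0.5 switching activity, 1 nF, 1.0 V.
    print(dynamic_power(0.5, 1e-9, 1.0, 100e6))
    print(frequency_for_power(0.05, 0.5, 1e-9, 1.0, 10e6, 200e6))
```

In a scheduler, such a frequency value could be chosen per test interval alongside the test order, lowering it when cooling is needed and raising it when heating is needed.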
Defect explorations and reliability studies can identify new challenging
issues that need to be addressed, especially for new technologies. Many
potential issues regarding through-silicon vias have already been identified,
some of which are discussed in this thesis. As 3D stacked IC technology
matures, more issues may be identified. This is particularly important for
applications that require high reliability.
The focus of this thesis is mainly on logic, even though some of the
proposed techniques can be applied to any entity that has the
properties of a module (as discussed earlier). Several non-logic
components are usually integrated into advanced SoCs and similar
devices. Memory modules are among the most important of these components and
are widely studied in connection with conventional 2D and 3D-stacked
technologies. Process variation, temperature gradients, and temperature
cycling affect memories, too.
Other components are similarly affected by these negative
effects. Image sensors, which are widely used today, are among them. CMOS
image sensors can be manufactured using through-silicon vias.
Consequently, defects related to through-silicon vias affect such image
sensors, in addition to defects from other sources. Developing new testing
techniques as well as extending and specializing the methods proposed in this
thesis can be of interest for all these non-logic components.
References
[Abramovici94] Miron Abramovici, Melvin A Breuer, and Arthur D Friedman. DIGITAL SYSTEMS
TESTING AND TESTABLE DESIGN, 1994. [Online]. Available:
http://eu.wiley.com/WileyCDA/WileyTitle/productCd-0780310624.html.
[Accessed: 29-May-2015].
[Aghaee10] Nima Aghaee, Zhiyuan He, Zebo Peng, and Petru Eles. TEMPERATURE-AWARE
SOC TEST SCHEDULING CONSIDERING INTER-CHIP PROCESS VARIATION, in 19th
Asian Test Symposium (ATS), pages 395–398, 2010.
[Aghaee11a] Nima Aghaee, Zebo Peng, and Petru Eles. ADAPTIVE TEMPERATURE-AWARE SOC
TEST SCHEDULING CONSIDERING PROCESS VARIATION, in 14th Euromicro
Conference on Digital System Design (DSD), Oulu, Finland, pages 197–204, 2011.
[Aghaee11b] Nima Aghaee, Zebo Peng, and Petru Eles. PROCESS-VARIATION AND
TEMPERATURE AWARE SOC TEST SCHEDULING USING PARTICLE SWARM
OPTIMIZATION, in 6th International Design and Test Workshop (IDT), Beirut,
Lebanon, pages 1–6, 2011.
[Aghaee13a] Nima Aghaee, Zebo Peng, and Petru Eles. PROCESS-VARIATION AND
TEMPERATURE AWARE SOC TEST SCHEDULING TECHNIQUE, Journal of Electronic
Testing, vol. 29, no. 4, pages 499–520, Aug. 2013.
[Aghaee13b] Nima Aghaee, Zebo Peng, and Petru Eles. TEMPERATURE-GRADIENT BASED TEST
SCHEDULING FOR 3D STACKED ICS, in 20th International Conference on
Electronics, Circuits, and Systems (ICECS), Abu Dhabi, UAE, pages 405–408,
2013.
[Aghaee14a] Nima Aghaee, Zebo Peng, and Petru Eles. AN EFFICIENT TEMPERATURE-GRADIENT
BASED BURN-IN TECHNIQUE FOR 3D STACKED ICS, in Design, Automation and Test
in Europe Conference and Exhibition (DATE), Dresden, Germany, pages 1–4,
2014.
[Aghaee14b] Nima Aghaee, Zebo Peng, and Petru Eles. PROCESS-VARIATION AWARE MULTI-
TEMPERATURE TEST SCHEDULING, in 27th International Conference on VLSI
Design, Mumbai, India, pages 32–37, 2014.
[Aghaee15a] Nima Aghaee, Zebo Peng, and Petru Eles. AN INTEGRATED TEMPERATURE-
CYCLING ACCELERATION AND TEST TECHNIQUE FOR 3D STACKED ICS, in 20th Asia
and South Pacific Design Automation Conference (ASP-DAC), Chiba, Japan, pages
526–531, 2015.
[Aghaee15b] Nima Aghaee, Zebo Peng, and Petru Eles. TEMPERATURE-GRADIENT-BASED BURN-
IN AND TEST SCHEDULING FOR 3-D STACKED ICS, IEEE Transactions on Very
Large Scale Integration (VLSI) Systems, vol. PP, no. 99, pages 1–1, 2015.
[Ahmed05] Nisar Ahmed, Mohammad Tehranipoor, and CP Ravikumar. ENHANCED LAUNCH-
OFF-CAPTURE TRANSITION FAULT TESTING, in International Test Conference, pages
255–264, 2005.
[Ayala09] José L Ayala, Arvind Sridhar, Vinod Pangracious, David Atienza, and Yusuf
Leblebici. THROUGH SILICON VIA-BASED GRID FOR THERMAL CONTROL IN 3D
CHIPS, in Nano-Net, Springer, pages 90–98, 2009.
[Bahukud08a] Sudarshan Bahukudumbi and Krishnendu Chakrabarty. POWER MANAGEMENT FOR
WAFER-LEVEL TEST DURING BURN-IN, in 17th Asian Test Symposium, pages 231–
236, 2008.
[Bahukud08b] Sudarshan Bahukudumbi and Krishnendu Chakrabarty. TEST-PATTERN ORDERING
FOR WAFER-LEVEL TEST DURING BURN-IN, in 26th VLSI Test Symposium, pages
193–198, 2008.
[Bahukud09] Sudarshan Bahukudumbi and Krishnendu Chakrabarty. POWER MANAGEMENT
USING TEST-PATTERN ORDERING FOR WAFER-LEVEL TEST DURING BURN-IN, IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. 17, no. 12,
pages 1730–1741, Dec. 2009.
[Bayle10] F Bayle and A Mettas. TEMPERATURE ACCELERATION MODELS IN RELIABILITY
PREDICTIONS: JUSTIFICATION AND IMPROVEMENTS, in annual Reliability and
Maintainability Symposium (RAMS), pages 1–6, 2010.
[Bild08] David R Bild, Sanchit Misra, Thidapat Chantem, Prabhat Kumar, Robert P Dick,
X Sharon Hu, Li Shang, and Alok Choudhary. TEMPERATURE-AWARE TEST
SCHEDULING FOR MULTIPROCESSOR SYSTEMS-ON-CHIP, in IEEE/ACM
International Conference on Computer-Aided Design, pages 59–66, 2008.
[Bonhomme02] Y Bonhomme, P Girard, C Landrault, and S Pravossoudovitch. TEST POWER: A BIG
ISSUE IN LARGE SOC DESIGNS, in 1st International Workshop on Electronic Design,
Test and Applications, pages 447–449, 2002.
[Borkar03] Shekhar Borkar, Tanay Karnik, Siva Narendra, Jim Tschanz, Ali Keshavarzi, and
Vivek De. PARAMETER VARIATIONS AND IMPACT ON CIRCUITS AND
MICROARCHITECTURE, in 40th annual Design Automation Conference, pages 338–
342, 2003.
[Bosio11] A Bosio, L Dilillo, P Girard, A Todri, A Virazel, K Miyase, and X Wen. POWER-
AWARE TEST PATTERN GENERATION FOR AT-SPEED LOS TESTING, in 20th Asian Test
Symposium, pages 506–510, 2011.
[Bota04] Sebastiàn A Bota, M Rosales, JL Rosello, A Keshavarzi, and J Segura. WITHIN DIE
THERMAL GRADIENT IMPACT ON CLOCK-SKEW: A NEW TYPE OF DELAY-FAULT
MECHANISM, in International Test Conference, pages 1276–1283, 2004.
[Carbine97] Adrian Carbine and Derek Feltham. PENTIUM® PRO PROCESSOR DESIGN FOR
TEST AND DEBUG, in International Test Conference, pages 294–303, 1997.
[Chakrabarty00] Krishnendu Chakrabarty. TEST SCHEDULING FOR CORE-BASED SYSTEMS USING
MIXED-INTEGER LINEAR PROGRAMMING, IEEE Transactions on Computer-Aided
Design of Integrated Circuits and Systems, vol. 19, no. 10, pages 1163–1174, Oct.
2000.
[Chakrabarty02] Krishnendu Chakrabarty, Vikram Iyengar, and Anshuman Chandra. TEST
SCHEDULING USING MIXED-INTEGER LINEAR PROGRAMMING, Frontiers in
Electronic Testing: Test Resource Partitioning for System-on-a-Chip, Springer US,
pages 97–118, 2002.
[Chakrabarty12] Krishnendu Chakrabarty, Sergej Deutsch, Himanshu Thapliyal, and Fangming Ye.
TSV DEFECTS AND TSV-INDUCED CIRCUIT FAILURES: THE THIRD DIMENSION IN
TEST AND DESIGN-FOR-TEST, in International Reliability Physics Symposium, page
5F–1, 2012.
[Chakravarty94] S Chakravarty and VP Dabholkar. TWO TECHNIQUES FOR MINIMIZING POWER
DISSIPATION IN SCAN CIRCUITS DURING TEST APPLICATION, in 3rd Asian Test
Symposium, pages 324–329, 1994.
[Chandran09] Unni Chandran and Dan Zhao. THERMAL DRIVEN TEST ACCESS ROUTING IN HYPER-
INTERCONNECTED THREE-DIMENSIONAL SYSTEM-ON-CHIP, in 24th IEEE
International Symposium on Defect and Fault Tolerance in VLSI Systems, pages
410–418, 2009.
[Chang05] Hongliang Chang and Sachin S Sapatnekar. FULL-CHIP ANALYSIS OF LEAKAGE
POWER UNDER PROCESS VARIATIONS, INCLUDING SPATIAL CORRELATIONS, in 42nd
annual Design Automation Conference, pages 523–528, 2005.
[Chantem13] Thidapat Chantem, Yun Xiang, X Sharon Hu, and Robert P Dick. ENHANCING
MULTICORE RELIABILITY THROUGH WEAR COMPENSATION IN ONLINE ASSIGNMENT
AND SCHEDULING, in Design, Automation Test in Europe, pages 1373–1378, 2013.
[Cheng00] Kwang Ting Cheng, S Dey, M Rodgers, and K Roy. TEST CHALLENGES FOR DEEP
SUB-MICRON TECHNOLOGIES, in Design Automation Conference, pages 142–149,
2000.
[Cherman12] VO Cherman, J De Messemaeker, K Croes, B Dimcic, G Van der Plas, I De Wolf,
G Beyer, B Swinnen, and E Beyne. IMPACT OF THROUGH SILICON VIAS ON FRONT-
END-OF-LINE PERFORMANCE AFTER THERMAL CYCLING AND THERMAL STORAGE,
in International Reliability Physics Symposium, pages 2B.3.1–2B.3.5, 2012.
[Choi07] Jung Hwan Choi, Jayathi Murthy, and Kaushik Roy. THE EFFECT OF PROCESS
VARIATION ON DEVICE TEMPERATURE IN FINFET CIRCUITS, in IEEE/ACM
international conference on Computer-aided design, pages 747–751, 2007.
[Chou97] RM Chou, KK Saluja, and VD Agrawal. SCHEDULING TESTS FOR VLSI SYSTEMS
UNDER POWER CONSTRAINTS, IEEE Transactions on Very Large Scale Integration
(VLSI) Systems, vol. 5, no. 2, pages 175–185, Jun. 1997.
[Ciappa03a] M Ciappa, F Carbognani, P Cova, and W Fichtner. LIFETIME PREDICTION AND
DESIGN OF RELIABILITY TESTS FOR HIGH-POWER DEVICES IN AUTOMOTIVE
APPLICATIONS, in 41st annual International Reliability Physics Symposium, pages
523–528, 2003.
[Ciappa03b] M Ciappa, F Carbognani, and Wolfgang Fichtner. LIFETIME PREDICTION AND
DESIGN OF RELIABILITY TESTS FOR HIGH-POWER DEVICES IN AUTOMOTIVE
APPLICATIONS, IEEE Transactions on Device and Materials Reliability, vol. 3, no.
4, pages 191–196, Dec. 2003.
[Clabes04] Joachim Clabes, Joshua Friedrich, Mark Sweet, Jack DiLullo, Sam Chu, Donald
Plass, James Dawson, Paul Muench, Larry Powell, and Michael Floyd. DESIGN
AND IMPLEMENTATION OF THE POWER5™ MICROPROCESSOR, in 41st annual
Design Automation Conference, pages 670–672, 2004.
[Coskun09] AK Coskun, JL Ayala, D Atienza, TS Rosing, and Y Leblebici. DYNAMIC
THERMAL MANAGEMENT IN 3D MULTICORE ARCHITECTURES, in Design,
Automation Test in Europe, pages 1410–1415, 2009.
[Dabholkar98] V Dabholkar, S Chakravarty, I Pomeranz, and S Reddy. TECHNIQUES FOR
MINIMIZING POWER DISSIPATION IN SCAN AND COMBINATIONAL CIRCUITS DURING
TEST APPLICATION, IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 17, no. 12, pages 1325–1333, Dec. 1998.
[Davis94] Brendan Davis. THE ECONOMICS OF AUTOMATIC TESTING. McGraw-Hill, 1994.
[Deutsch11] Sergej Deutsch, Vivek Chickermane, Brion Keller, Subhasish Mukherjee, Mario
Konijnenburg, Erik Jan Marinissen, and Sandeep K Goel. AUTOMATION OF 3D-
DFT INSERTION, in 20th Asian Test Symposium, pages 395–400, 2011.
[Deutsch12] Sergej Deutsch, Krishnendu Chakrabarty, Shreepad Panth, and Sung Kyu Lim.
TSV STRESS-AWARE ATPG FOR 3D STACKED ICS, in 21st Asian Test Symposium,
pages 31–36, 2012.
[Engelke08] P Engelke, I Polian, M Renovell, S Kundu, B Seshadri, and B Becker. ON
DETECTION OF RESISTIVE BRIDGING DEFECTS BY LOW-TEMPERATURE AND LOW-
VOLTAGE TESTING, IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 27, no. 2, pages 327–338, Feb. 2008.
[Falkenauer98] Emanuel Falkenauer. GENETIC ALGORITHMS AND GROUPING PROBLEMS.
Chichester ; New York: Wiley, 1998.
[Flores99] P Flores, J Costa, H Neto, J Monteiro, and J Marques-Silva. ASSIGNMENT AND
REORDERING OF INCOMPLETELY SPECIFIED PATTERN SEQUENCES TARGETING
MINIMUM POWER DISSIPATION, in 12th International Conference On VLSI Design,
pages 37–41, 1999.
[Flottes15] Marie-Lise Flottes, Joao Azevedo, Giorgio Di Natale, and Bruno Rouzeyre.
SESSION-LESS BASED THERMAL-AWARE 3D-SIC TEST SCHEDULING, in 20th
European Test Symposium, Cluj-Napoca, Romania, 2015.
[Frank10] T Frank, Cedrick Chappaz, P Leduc, L Arnaud, S Moreau, A Thuaire, R El-
Farhane, and L Anghel. RELIABILITY APPROACH OF HIGH DENSITY THROUGH
SILICON VIA (TSV), in 12th Electronics Packaging Technology Conference, pages
321–324, 2010.
[Ganapathy10] Shrikanth Ganapathy, Ramon Canal, Antonio Gonzalez, and Antonio Rubio.
CIRCUIT PROPAGATION DELAY ESTIMATION THROUGH MULTIVARIATE
REGRESSION-BASED MODELING UNDER SPATIO-TEMPORAL VARIABILITY, in
Design, Automation & Test in Europe, pages 417–422, 2010.
[Girard97] P Girard, C Landrault, S Pravossoudovitch, and D Severac. REDUCTION OF POWER
CONSUMPTION DURING TEST APPLICATION BY TEST VECTOR ORDERING [VLSI
CIRCUITS], Electronics Letters, vol. 33, no. 21, pages 1752–1754, Oct. 1997.
[GopiReddy14] L GopiReddy, LM Tolbert, B Ozpineci, and JOP Pinto. RAINFLOW ALGORITHM
BASED LIFETIME ESTIMATION OF POWER SEMICONDUCTORS IN UTILITY
APPLICATIONS, in 29th annual Applied Power Electronics Conference and
Exposition, pages 2293–2299, 2014.
[Gorev13] M Gorev, R Ubar, P Ellervee, S Devadze, J Raik, and M Min. AT-SPEED SELF-
TESTING OF HIGH-PERFORMANCE PIPE-LINED PROCESSING ARCHITECTURES, in
NORCHIP Conference, pages 1–6, 2013.
[Groebel01] DJ Groebel, A Mettas, and Feng-Bin Sun. DETERMINATION AND INTERPRETATION
OF ACTIVATION ENERGY USING ACCELERATED-TEST DATA, in annual Reliability
and Maintainability Symposium, pages 58–63, 2001.
[Hagihara97] Y Hagihara, S Inui, F Okamoto, M Nishida, T Nakamura, and H Yamada.
FLOATING-POINT DATAPATHS WITH ONLINE BUILT-IN SELF SPEED TEST, IEEE
Journal of Solid-State Circuits, vol. 32, no. 3, pages 444–449, Mar. 1997.
[Held97] M Held, P Jacob, G Nicoletti, P Scacco, and MH Poech. FAST POWER CYCLING
TEST OF IGBT MODULES IN TRACTION APPLICATION, in International Conference
on Power Electronics and Drive Systems, vol. 1, pages 425–430, 1997.
[He06a] Zhiyuan He, Zebo Peng, and P Eles. POWER CONSTRAINED AND DEFECT-
PROBABILITY DRIVEN SOC TEST SCHEDULING WITH TEST SET PARTITIONING, in
Design, Automation and Test in Europe, vol. 1, pages 1–6, 2006.
[He06b] Zhiyuan He, Zebo Peng, P Eles, P Rosinger, and BM Al-Hashimi. THERMAL-
AWARE SOC TEST SCHEDULING WITH TEST SET PARTITIONING AND INTERLEAVING,
in 21st International Symposium on Defect and Fault Tolerance in VLSI Systems,
pages 477–485, 2006.
[He07] Zhiyuan He, Zebo Peng, and P Eles. A HEURISTIC FOR THERMAL-SAFE SOC TEST
SCHEDULING, in International Test Conference, pages 1–10, 2007.
[He08a] Zhiyuan He, Zebo Peng, and Petru Eles. SIMULATION-DRIVEN THERMAL-SAFE TEST
TIME MINIMIZATION FOR SYSTEM-ON-CHIP, in 17th Asian Test Symposium, pages
283–288, 2008.
[He08b] Zhiyuan He, Zebo Peng, Petru Eles, Paul Rosinger, and Bashir M Al-Hashimi.
THERMAL-AWARE SOC TEST SCHEDULING WITH TEST SET PARTITIONING AND
INTERLEAVING, Journal of Electronic Testing, vol. 24, no. 1–3, pages 247–257,
Jan. 2008.
[He09] Zhiyuan He, Zebo Peng, and P Eles. THERMAL-AWARE TEST SCHEDULING FOR
CORE-BASED SOC IN AN ABORT-ON-FIRST-FAIL TEST ENVIRONMENT, in 12th
Euromicro Conference on Digital System Design, Architectures, Methods and
Tools, pages 239–246, 2009.
[He10] Zhiyuan He, Zebo Peng, and P Eles. MULTI-TEMPERATURE TESTING FOR CORE-
BASED SYSTEM-ON-CHIP, in Design, Automation Test in Europe, pages 208–213,
2010.
[Higami13] Yoshinobu Higami, Hiroshi Takahashi, Shin-ya Kobayashi, and Kewal K Saluja.
TEST GENERATION FOR DELAY FAULTS ON CLOCK LINES UNDER LAUNCH-ON-
CAPTURE TEST ENVIRONMENT, IEICE Transactions on Information and Systems,
vol. E96-D, no. 6, pages 1323–1331, Jun. 2013.
[Higham05] N Higham. THE SCALING AND SQUARING METHOD FOR THE MATRIX EXPONENTIAL
REVISITED, SIAM Journal on Matrix Analysis and Applications, vol. 26, no. 4,
pages 1179–1193, Jan. 2005.
[Hirschmann06] D Hirschmann, D Tissen, S Schroder, and RW De Doncker. RELIABILITY
PREDICTION FOR INVERTERS IN HYBRID ELECTRICAL VEHICLES, in 37th Power
Electronics Specialists Conference, pages 1–6, 2006.
[Hirschmann07] D Hirschmann, D Tissen, S Schroder, and RW De Doncker. RELIABILITY
PREDICTION FOR INVERTERS IN HYBRID ELECTRICAL VEHICLES, IEEE Transactions
on Power Electronics, vol. 22, no. 6, pages 2511–2517, Nov. 2007.
[Huang01] Yu Huang, Wu-Tung Cheng, Chien-Chung Tsai, N Mukherjee, O Samman, Y
Zaidan, and SM Reddy. RESOURCE ALLOCATION AND TEST SCHEDULING FOR
CONCURRENT TEST OF CORE-BASED SOC DESIGN, in 10th Asian Test Symposium,
pages 265–270, 2001.
[Huang02] Yu Huang, SM Reddy, Wu-Tung Cheng, P Reuter, N Mukherjee, Chien-Chung
Tsai, O Samman, and Y Zaidan. OPTIMAL CORE WRAPPER WIDTH SELECTION AND
SOC TEST SCHEDULING BASED ON 3-D BIN PACKING ALGORITHM, in International
Test Conference, pages 74–82, 2002.
[Huang06] Wei Huang, S Ghosh, S Velusamy, K Sankaranarayanan, K Skadron, and MR Stan.
HOTSPOT: A COMPACT THERMAL MODELING METHODOLOGY FOR EARLY-STAGE
VLSI DESIGN, IEEE Transactions on Very Large Scale Integration (VLSI) Systems,
vol. 14, no. 5, pages 501–513, May 2006.
[Huang07] Wei Huang. HOTSPOT—A CHIP AND PACKAGE COMPACT THERMAL MODELING
METHODOLOGY FOR VLSI DESIGN, Dissertation, University of Virginia, 2007.
[Ieee14a] IEEE P1838 3D-TEST WORKING GROUP, 2014. [Online]. Available:
http://grouper.ieee.org/groups/3Dtest/. [Accessed: 29-May-2015].
[Ieee14b] IEEE STANDARD FOR ACCESS AND CONTROL OF INSTRUMENTATION EMBEDDED
WITHIN A SEMICONDUCTOR DEVICE, IEEE Std 1687-2014, pages 1–283, Dec. 2014.
[Intel13] INTEL XEON E5-2600 V3 PROCESSOR OVERVIEW: HASWELL-EP UP TO 18 CORES,
PC PERSPECTIVE, 2013. [Online]. Available:
http://www.pcper.com/reviews/Processors/Intel-Xeon-E5-2600-v3-Processor-
Overview-Haswell-EP-18-Cores. [Accessed: 28-May-2015].
[Iyengar01] Vikram Iyengar and Krishnendu Chakrabarty. PRECEDENCE-BASED, PREEMPTIVE,
AND POWER-CONSTRAINED TEST SCHEDULING FOR SYSTEM-ON-A-CHIP, in 19th
VLSI Test Symposium, pages 368–374, 2001.
[Iyengar02] Vikram Iyengar, Krishnendu Chakrabarty, and Erik Jan Marinissen. ON USING
RECTANGLE PACKING FOR SOC WRAPPER/TAM CO-OPTIMIZATION, in 20th VLSI
Test Symposium, pages 253–258, 2002.
[Jagan10] L Jagan, C Hora, B Kruseman, S Eichenberger, AK Majhi, and V Kamakoti.
IMPACT OF TEMPERATURE ON TEST QUALITY, in 23rd International Conference on
VLSI Design, pages 276–281, 2010.
[Jedec09] TEMPERATURE CYCLING. JEDEC Solid State Technology Association, 2009.
[Jedec10] FAILURE MECHANISMS AND MODELS FOR SEMICONDUCTOR DEVICES, 2010.
[Online]. Available: http://www.jedec.org/standards-documents/docs/jep-122e.
[Accessed: 23-May-2014].
[Jiang14] T Jiang, C Wu, N Tamura, M Kunz, B Kim, H Son, M Suh, J Im, R Huang, and P
Ho. STUDY OF STRESSES AND PLASTICITY IN THROUGH-SILICON VIA STRUCTURES
FOR 3D INTERCONNECTS BY X-RAY MICRO-BEAM DIFFRACTION, IEEE
Transactions on Device and Materials Reliability, vol. 14, no. 2, pages 698–703,
June 2014.
[Kamto09] A Kamto, Y Liu, L Schaper, and SL Burkett. RELIABILITY STUDY OF THROUGH-
SILICON VIA (TSV) COPPER FILLED INTERCONNECTS, Thin Solid Films, vol. 518, no.
5, pages 1614–1619, Dec. 2009.
[Kim10] Tak-Yung Kim and Taewhan Kim. CLOCK TREE SYNTHESIS WITH PRE-BOND
TESTABILITY FOR 3D STACKED IC DESIGNS, in 47th Design Automation
Conference, pages 723–728, 2010.
[Ko08] HF Ko and N Nicolici. AUTOMATED SCAN CHAIN DIVISION FOR REDUCING SHIFT
AND CAPTURE POWER DURING BROADSIDE AT-SPEED TEST, IEEE Transactions on
Computer-Aided Design of Integrated Circuits and Systems, vol. 27, no. 11, pages
2092–2097, Nov. 2008.
[Kumar12] P Kumar, I Dutta, and MS Bakir. INTERFACIAL EFFECTS DURING THERMAL
CYCLING OF CU-FILLED THROUGH-SILICON VIAS (TSV), Journal of Electronic
Materials, vol. 41, no. 2, pages 322–335, Feb. 2012.
[Kundu05] S Kundu, P Engelke, I Polian, and B Becker. ON DETECTION OF RESISTIVE
BRIDGING DEFECTS BY LOW-TEMPERATURE AND LOW-VOLTAGE TESTING, in 14th
Asian Test Symposium, pages 266–271, 2005.
[Kuo11] Chi-Wei Kuo and Hung-Yin Tsai. THERMAL STRESS ANALYSIS AND FAILURE
MECHANISMS FOR THROUGH SILICON VIA ARRAY, in 6th International
Microsystems, Packaging, Assembly and Circuits Technology Conference, pages
169–172, 2011.
[Kuo12] Chi-Wei Kuo and Hung-Yin Tsai. THERMAL STRESS ANALYSIS AND FAILURE
MECHANISMS FOR THROUGH SILICON VIA ARRAY, in 13th Intersociety Conference
on Thermal and Thermomechanical Phenomena in Electronic Systems, pages 202–
206, 2012.
[Liao05] Weiping Liao, Lei He, and KM Lepak. TEMPERATURE AND SUPPLY VOLTAGE
AWARE PERFORMANCE AND POWER MODELING AT MICROARCHITECTURE LEVEL,
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 24, no. 7, pages 1042–1053, Jul. 2005.
[Lin84] Tzu-Mu Lin and CA Mead. SIGNAL DELAY IN GENERAL RC NETWORKS, IEEE
Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol.
3, no. 4, pages 331–349, Oct. 1984.
[Li01] JC Li, Chao-Wen Tseng, and EJ McCluskey. TESTING FOR RESISTIVE OPENS AND
STUCK OPENS, in International Test Conference, pages 1049–1058, 2001.
[Liu04] Michael Liu, Wei-Shen Wang, and Michael Orshansky. LEAKAGE POWER
REDUCTION BY DUAL-VTH DESIGNS UNDER PROBABILISTIC ANALYSIS OF VTH
VARIATION, in International Symposium on Low Power Electronics and Design,
pages 2–7, 2004.
[Loi08] Igor Loi, Subhasish Mitra, Thomas H Lee, Shinobu Fujita, and Luca Benini. A
LOW-OVERHEAD FAULT TOLERANCE SCHEME FOR TSV-BASED 3D NETWORK ON
CHIP LINKS, in IEEE/ACM International Conference on Computer-Aided Design,
pages 598–602, 2008.
[Long04] E Long, WR Daasch, R Madge, and B Benware. DETECTION OF TEMPERATURE
SENSITIVE DEFECTS USING ZTC, in 22nd VLSI Test Symposium,
pages 185–190, 2004.
[Lu07] Hua Lu, T Tilford, and DR Newcombe. LIFETIME PREDICTION FOR POWER
ELECTRONICS MODULE SUBSTRATE MOUNT-DOWN SOLDER INTERCONNECT, in
International Symposium on High Density packaging and Microsystem Integration,
pages 1–10, 2007.
[Manikandan11] P Manikandan, BB Larsen, and EJ Aas. AN ENHANCED PATH DELAY FAULT
SIMULATOR FOR COMBINATIONAL CIRCUITS, in 14th Euromicro Conference on
Digital System Design, pages 375–381, 2011.
[Marinissen00] EJ Marinissen, SK Goel, and M Lousberg. WRAPPER DESIGN FOR EMBEDDED CORE
TEST, in International Test Conference, pages 911–920, 2000.
[Marinissen02] EJ Marinissen, V Iyengar, and K Chakrabarty. A SET OF BENCHMARKS FOR
MODULAR TESTING OF SOCS, in International Test Conference, pages 519–528,
2002.
[Marinissen09] Erik Jan Marinissen and Yervant Zorian. TESTING 3D CHIPS CONTAINING
THROUGH-SILICON VIAS, in International Test Conference, pages 1–11, 2009.
[Marinissen10a] Erik Jan Marinissen, Chun-Chuan Chi, Jouke Verbree, and Mario Konijnenburg.
3D DFT ARCHITECTURE FOR PRE-BOND AND POST-BOND TESTING, in International
3D Systems Integration Conference, pages 1–8, 2010.
[Marinissen10b] Erik Jan Marinissen, Jouke Verbree, and Mario Konijnenburg. A STRUCTURED AND
SCALABLE TEST ACCESS ARCHITECTURE FOR TSV-BASED 3D STACKED ICS, in 28th
VLSI Test Symposium, pages 269–274, 2010.
[Marinissen10c] Erik Jan Marinissen. CHALLENGES IN TESTING TSV-BASED 3D STACKED ICS: TEST
FLOWS, TEST CONTENTS, AND TEST ACCESS, in Asia Pacific Conference on Circuits
and Systems, pages 544–547, 2010.
[Marinissen12a] Erik Jan Marinissen. CHALLENGES AND EMERGING SOLUTIONS IN TESTING TSV-
BASED 2½D- AND 3D-STACKED ICS, in Design, Automation and Test in Europe,
pages 1277–1282, 2012.
[Marinissen12b] Erik Jan Marinissen, Chun-Chuan Chi, Mario Konijnenburg, and Jouke Verbree. A
DFT ARCHITECTURE FOR 3D-SICS BASED ON A STANDARDIZABLE DIE WRAPPER,
Journal of Electronic Testing, vol. 28, no. 1, pages 73–92, Feb. 2012.
[Matsuishi68] M Matsuishi and T Endo. FATIGUE OF METALS SUBJECTED TO VARYING STRESS,
Japan Society of Mechanical Engineers, Fukuoka, Japan, pages 37–40, 1968.
[Maulik00] Ujjwal Maulik and Sanghamitra Bandyopadhyay. GENETIC ALGORITHM-BASED
CLUSTERING TECHNIQUE, Pattern recognition, vol. 33, no. 9, pages 1455–1465,
2000.
[Mil04] TEMPERATURE CYCLING (MIL-STD-883; METHOD 1010), DLA Land and Maritime
Mil. Specs & Drawings, Jun-2004. [Online]. Available:
http://www.landandmaritime.dla.mil/programs/milspec/ListDocs.aspx?BasicDoc=
MIL-STD-883. [Accessed: 28-May-2014].
[Miller01] Mark Miller. NEXT GENERATION BURN-IN AND TEST SYSTEMS FOR ATHLON
MICROPROCESSORS: HYBRID BURN-IN, in BiTS Workshop, 2001.
[Millican14] SK Millican and KK Saluja. A TEST PARTITIONING TECHNIQUE FOR SCHEDULING
TESTS FOR THERMALLY CONSTRAINED 3D INTEGRATED CIRCUITS, in 27th
International Conference on VLSI Design, pages 20–25, 2014.
[Mohapatra07] Debabrata Mohapatra, Georgios Karakonstantis, and Kaushik Roy. LOW-POWER
PROCESS-VARIATION TOLERANT ARITHMETIC UNITS USING INPUT-BASED ELASTIC
CLOCKING, in International Symposium on Low Power Electronics and Design,
pages 74–79, 2007.
[Mondal07] Mosin Mondal, Andrew J Ricketts, Sami Kirolos, Tamer Ragheb, Greg Link,
Narayanan Vijaykrishnan, and Yehia Massoud. THERMALLY ROBUST CLOCKING
SCHEMES FOR 3D INTEGRATED CIRCUITS, in Design, Automation & Test in Europe,
pages 1–6, 2007.
[Murray12] Conal E Murray, ET Ryan, Paul R Besser, C Witt, Jean L Jordan-Sweet, and MF
Toney. EVOLUTION OF STRESS GRADIENTS IN CU FILMS AND FEATURES INDUCED
BY CAPPING LAYERS, Microelectronic Engineering, vol. 92, pages 95–100, Apr.
2012.
[Musallam12] M Musallam and CM Johnson. AN EFFICIENT IMPLEMENTATION OF THE RAINFLOW
COUNTING ALGORITHM FOR LIFE CONSUMPTION ESTIMATION, IEEE Transactions
on Reliability, vol. 61, no. 4, pages 978–986, Dec. 2012.
[Nebel97] Wolfgang Nebel and Jean P Mermet. LOW POWER DESIGN IN DEEP SUBMICRON
ELECTRONICS. Norwell, MA, USA: Kluwer Academic Publishers, 1997.
[Needham98] Wayne Needham, Cheryl Prunty, and Eng Hong Yeoh. HIGH VOLUME
MICROPROCESSOR TEST ESCAPES, AN ANALYSIS OF DEFECTS OUR TESTS ARE
MISSING, in International Test Conference, pages 25–34, 1998.
[Nigh98] P Nigh, D Vallett, P Patel, J Wright, F Motika, D Forlenza, R Kurtulik, and W
Chong. FAILURE ANALYSIS OF TIMING AND IDDQ-ONLY FAILURES FROM THE
SEMATECH TEST METHODS EXPERIMENT, in International Test Conference, pages
43–52, 1998.
[Noia10a] Brandon Noia, Krishnendu Chakrabarty, and Erik Jan Marinissen. OPTIMIZATION
METHODS FOR POST-BOND DIE-INTERNAL/EXTERNAL TESTING IN 3D STACKED ICS,
in International Test Conference, pages 1–9, 2010.
[Noia10b] Brandon Noia, Sandeep Kumar Goel, Krishnendu Chakrabarty, Erik Jan
Marinissen, and Jouke Verbree. TEST-ARCHITECTURE OPTIMIZATION FOR TSV-
BASED 3D STACKED ICS, in 15th European Test Symposium, pages 24–29, 2010.
[Noia11] Brandon Noia and Krishnendu Chakrabarty. TESTING AND DESIGN-FOR-
TESTABILITY TECHNIQUES FOR 3D INTEGRATED CIRCUITS, in 20th Asian Test
Symposium, pages 474–479, 2011.
[Noia12] Brandon Noia, Krishnendu Chakrabarty, and Erik Jan Marinissen. OPTIMIZATION
METHODS FOR POST-BOND TESTING OF 3D STACKED ICS, Journal of Electronic
Testing, vol. 28, no. 1, pages 103–120, Feb. 2012.
[Nowka08] Kevin Nowka. SURVIVAL OF VLSI DESIGN - COPING WITH DEVICE VARIABILITY
AND UNCERTAINTY, in Circuits and Systems Workshop: System-on-Chip - Design,
Applications, Integration, and Software, Dallas, pages 1–6, 2008.
[Nvidia12] NVIDIA’S NEXT GENERATION CUDA COMPUTE ARCHITECTURE: KEPLER GK110.
2012.
[Oberg03] Johnny Oberg. NETWORKS ON CHIP, A. Jantsch and H. Tenhunen, Eds. Hingham,
MA, USA: Kluwer Academic Publishers, pages 153–172, 2003.
[Okoro12] C Okoro and YS Obeng. EFFECT OF THERMAL CYCLING ON THE SIGNAL INTEGRITY
AND MORPHOLOGY OF TSV ISOLATION LINER-SIO2, in International Interconnect
Technology Conference, pages 1–3, 2012.
[Okoro14] Chukwudi Okoro, June W Lau, Fardad Golshany, Klaus Hummler, and Yaw S
Obeng. A DETAILED FAILURE ANALYSIS EXAMINATION OF THE EFFECT OF THERMAL
CYCLING ON CU TSV RELIABILITY, IEEE Transactions on Electron Devices, vol.
61, no. 1, pages 15–22, Jan. 2014.
[Oppenheim97] Alan V Oppenheim, Alan S Willsky, and Syed Hamid Nawab. SIGNALS AND
SYSTEMS, 2nd ed. Upper Saddle River, N.J: Prentice Hall, 1997.
[Pak11] JS Pak, Mohit Pathak, Sung Kyu Lim, and David Z Pan. MODELING OF
ELECTROMIGRATION IN THROUGH-SILICON-VIA BASED 3D IC, in 61st Electronic
Components and Technology Conference, pages 1420–1427, 2011.
[Patil07] Srinivas Patil. AT-SPEED SCAN TESTS: REALITY OR FANTASY? PANEL 1.4, in
International Test Conference, pages 1–1, 2007.
[Plas10] G Van der Plas, S Thijs, D Linten, G Katti, P Limaye, A Mercha, M Stucchi, H
Oprins, B Vandevelde, N Minas, M Cupac, M Dehan, M Nelis, R Agarwal, W
Dehaene, Y Travaly, E Beyne, and P Marchal. VERIFYING
ELECTRICAL/THERMAL/THERMO-MECHANICAL BEHAVIOR OF A 3D STACK -
CHALLENGES AND SOLUTIONS, in Custom Integrated Circuits Conference, pages
1–4, 2010.
[Plas11] Geert Van Der Plas, Erik-Jan Marinissen, Nikolaos Minas, and Paul Marchal.
METHOD AND DEVICE FOR TESTING TSVS IN A 3D CHIP STACK, U.S. Patent
US20110102011 A1, 05-May-2011.
[Poli07] Riccardo Poli, James Kennedy, and Tim Blackwell. PARTICLE SWARM
OPTIMIZATION, Swarm Intelligence, vol. 1, no. 1, pages 33–57, Jun. 2007.
[Press07] William H Press. NUMERICAL RECIPES: THE ART OF SCIENTIFIC COMPUTING, 3rd
ed. Cambridge, UK ; New York: Cambridge University Press, 2007.
[Raina07] Rajesh Raina. AT-SPEED SCAN TESTS: REALITY OR FANTASY? PANEL 1.5, in
International Test Conference, pages 1–2, 2007.
[Rao03] Rajeev Rao, Ashish Srivastava, David Blaauw, and Dennis Sylvester. STATISTICAL
ESTIMATION OF LEAKAGE CURRENT CONSIDERING INTER- AND INTRA-DIE PROCESS
VARIATION, in International Symposium on Low Power Electronics and Design,
pages 84–89, 2003.
[Rohani13] Alireza Rohani and Hans G Kerkhoff. RAPID TRANSIENT FAULT INSERTION IN
LARGE DIGITAL SYSTEMS, Microprocessors and Microsystems, vol. 37, no. 2, pages
147–154, Mar. 2013.
[Rosinger02] PM Rosinger, BM Al-Hashimi, and N Nicolici. POWER PROFILE MANIPULATION: A
NEW APPROACH FOR REDUCING TEST APPLICATION TIME UNDER POWER
CONSTRAINTS, IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 21, no. 10, pages 1217–1225, Oct. 2002.
[Rosinger06] Paul Rosinger, Bashir M Al-Hashimi, and Krishnendu Chakrabarty. THERMAL-
SAFE TEST SCHEDULING FOR CORE-BASED SYSTEM-ON-CHIP INTEGRATED CIRCUITS,
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 25, no. 11, pages 2502–2512, 2006.
[Samii06] Soheil Samii, Erik Larsson, Krishnendu Chakrabarty, and Zebo Peng. CYCLE-
ACCURATE TEST POWER MODELING AND ITS APPLICATION TO SOC TEST
SCHEDULING, in International Test Conference, pages 1–10, 2006.
[Santarini14] Mike Santarini. XILINX SHIPS INDUSTRY’S FIRST 20-NM ALL PROGRAMMABLE
DEVICES, Xcell, vol. 1, no. 86, page 14, 2014.
[Sarangi08] SR Sarangi, B Greskamp, R Teodorescu, J Nakano, A Tiwari, and J Torrellas.
VARIUS: A MODEL OF PROCESS VARIATION AND RESULTING TIMING ERRORS FOR
MICROARCHITECTS, IEEE Transactions on Semiconductor Manufacturing, vol. 21,
no. 1, pages 3–13, Feb. 2008.
[Schuermyer04] C Schuermyer, J Ruffler, R Daasch, and R Madge. MINIMUM TESTING
REQUIREMENTS TO SCREEN TEMPERATURE DEPENDENT DEFECTS, in International
Test Conference, pages 300–308, 2004.
[Segura02] J Segura, A Keshavarzi, J Soden, and C Hawkins. PARAMETRIC FAILURES IN
CMOS ICS - A DEFECT-BASED ANALYSIS, in International Test Conference, pages
90–99, 2002.
[Segura04] Jaume Segura and Charles F Hawkins. CMOS ELECTRONICS: HOW IT WORKS, HOW
IT FAILS. Hoboken, NJ, USA: John Wiley & Sons, Inc., 2004.
[Semenov03] O Semenov, A Vassighi, M Sachdev, A Keshavarzi, and CF Hawkins. EFFECT OF
CMOS TECHNOLOGY SCALING ON THERMAL MANAGEMENT DURING BURN-IN, IEEE
Transactions on Semiconductor Manufacturing, vol. 16, no. 4, pages 686–695,
Nov. 2003.
[SenGupta12] Breeta SenGupta, Urban Ingelsson, and Erik Larsson. SCHEDULING TESTS FOR 3D
STACKED CHIPS UNDER POWER CONSTRAINTS, Journal of Electronic Testing, vol.
28, no. 1, pages 121–135, Feb. 2012.
[Shibin15] Konstantin Shibin, Vivek Chickermane, Brion Keller, Christos Papameletis, and
Erik Jan Marinissen. AT-SPEED DELAY TESTING OF INTER-DIE CONNECTIONS OF
2.5D- AND 3D-SICS, in 20th European Test Symposium, 2015.
[Smorodin08] T Smorodin, J Wilde, P Alpern, and M Stecher. A TEMPERATURE-GRADIENT-
INDUCED FAILURE MECHANISM IN METALLIZATION UNDER FAST THERMAL
CYCLING, IEEE Transactions on Device and Materials Reliability, vol. 8, no. 3,
pages 590–599, Sep. 2008.
[Srivastava02] Ashish Srivastava, Robert Bai, David Blaauw, and Dennis Sylvester. MODELING
AND ANALYSIS OF LEAKAGE POWER CONSIDERING WITHIN-DIE PROCESS
VARIATIONS, in International Symposium on Low Power Electronics and Design,
pages 64–67, 2002.
[Srivastava04] Ashish Srivastava, Dennis Sylvester, and David Blaauw. STATISTICAL
OPTIMIZATION OF LEAKAGE POWER CONSIDERING PROCESS VARIATIONS USING
DUAL-VTH AND SIZING, in 41st annual Design Automation Conference, pages 773–
778, 2004.
[Stan03] Mircea R Stan, Kevin Skadron, Marco Barcella, Wei Huang, Karthik
Sankaranarayanan, and Sivakumar Velusamy. HOTSPOT: A DYNAMIC COMPACT
THERMAL MODEL AT THE PROCESSOR-ARCHITECTURE LEVEL, Microelectronics
Journal, vol. 34, no. 12, pages 1153–1165, 2003.
[Syed10] A Syed. LIMITATIONS OF NORRIS-LANDZBERG EQUATION AND APPLICATION OF
DAMAGE ACCUMULATION BASED METHODOLOGY FOR ESTIMATING ACCELERATION
FACTORS FOR PB FREE SOLDERS, in 11th International Conference on Thermal,
Mechanical Multi-Physics Simulation, and Experiments in Microelectronics and
Microsystems, pages 1–11, 2010.
[Tadayon00] Pooya Tadayon. THERMAL CHALLENGES DURING MICROPROCESSOR TESTING, Intel
Technology Journal, vol. 4, no. 3, pages 1–8, 2000.
[Taouil10a] Mottaqiallah Taouil, Said Hamdioui, Kees Beenakker, and Erik Jan Marinissen.
TEST COST ANALYSIS FOR 3D DIE-TO-WAFER STACKING, in 19th Asian Test
Symposium, pages 435–441, 2010.
[Taouil10b] Mottaqiallah Taouil, Said Hamdioui, Jouke Verbree, and Erik Jan Marinissen. ON
MAXIMIZING THE COMPOUND YIELD FOR 3D WAFER-TO-WAFER STACKED ICS, in
International Test Conference, pages 1–10, 2010.
[Taouil11] Mottaqiallah Taouil, Said Hamdioui, and Erik Jan Marinissen. HOW SIGNIFICANT
WILL BE THE TEST COST SHARE FOR 3D DIE-TO-WAFER STACKED-ICS?, in 6th
International Conference on Design & Technology of Integrated Systems in
Nanoscale Era, pages 1–6, 2011.
[Taouil12] Mottaqiallah Taouil, Said Hamdioui, Kees Beenakker, and Erik Jan Marinissen.
TEST IMPACT ON THE OVERALL DIE-TO-WAFER 3D STACKED IC COST, Journal of
Electronic Testing, vol. 28, no. 1, pages 15–25, Feb. 2012.
[Tseng00] Chao-Wen Tseng, EJ McCluskey, Xiaoping Shao, and DM Wu. COLD DELAY
DEFECT SCREENING, in 18th VLSI Test Symposium, pages 183–188, 2000.
[Tudu09] JT Tudu, E Larsson, V Singh, and VD Agrawal. ON MINIMIZATION OF PEAK POWER
FOR SCAN CIRCUIT DURING TEST, in 14th European Test Symposium, pages 25–30,
2009.
[Ukhov12] Ivan Ukhov, Min Bao, Petru Eles, and Zebo Peng. STEADY-STATE DYNAMIC
TEMPERATURE ANALYSIS AND RELIABILITY OPTIMIZATION FOR EMBEDDED
MULTIPROCESSOR SYSTEMS, in 49th annual Design Automation Conference, New
York, NY, USA, pages 197–204, 2012.
[Ukhov14a] I Ukhov, P Eles, and Z Peng. PROBABILISTIC ANALYSIS OF POWER AND
TEMPERATURE UNDER PROCESS VARIATION FOR ELECTRONIC SYSTEM DESIGN,
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems,
vol. 33, no. 6, pages 931–944, Jun. 2014.
[Ukhov14b] I Ukhov, P Eles, and Z Peng. TEMPERATURE-CENTRIC RELIABILITY ANALYSIS AND
OPTIMIZATION OF ELECTRONIC SYSTEMS UNDER PROCESS VARIATION, IEEE
Transactions on Very Large Scale Integration (VLSI) Systems, vol. PP, no. 99,
pages 1–1, 2014.
[Vassighi06] Arman Vassighi and Manoj Sachdev. THERMAL AND POWER MANAGEMENT OF
INTEGRATED CIRCUITS. Boston: Kluwer Academic Publishers, 2006.
[Vasudevan08] V Vasudevan and Xuejun Fan. AN ACCELERATION MODEL FOR LEAD-FREE (SAC)
SOLDER JOINT RELIABILITY UNDER THERMAL CYCLING, in 58th Electronic
Components and Technology Conference, pages 139–145, 2008.
[Velenis10] Dimitrios Velenis, Erik Jan Marinissen, and Eric Beyne. COST EFFECTIVENESS OF
3D INTEGRATION OPTIONS, in International 3D Systems Integration Conference,
pages 1–6, 2010.
[Verbree10] Jouke Verbree, Erik Jan Marinissen, Philippe Roussel, and Dimitrios Velenis. ON
THE COST-EFFECTIVENESS OF MATCHING REPOSITORIES OF PRE-TESTED WAFERS
FOR WAFER-TO-WAFER 3D CHIP STACKING, in 15th European Test Symposium,
pages 36–41, 2010.
[Vinay10] NS Vinay, Indira Rawat, Erik Larsson, MS Gaur, and Virendra Singh. THERMAL
AWARE TEST SCHEDULING FOR STACKED MULTI-CHIP-MODULES, in East-West
Design & Test Symposium, pages 343–349, 2010.
[Wen11] Xiaoqing Wen, Kazunari Enokimoto, Kohei Miyase, Yuta Yamato, Michael A
Kochte, Seiji Kajihara, Patrick Girard, and Mohammad Tehranipoor. POWER-
AWARE TEST GENERATION WITH GUARANTEED LAUNCH SAFETY FOR AT-SPEED
SCAN TESTING, in 29th VLSI Test Symposium, pages 166–171, 2011.
[Wu10] Sean H Wu, Alexander Tetelbaum, and Li-C Wang. HOW DOES INVERSE
TEMPERATURE DEPENDENCE AFFECT TIMING SIGN-OFF, in Emerging Technologies
and Circuits, A. Amara, T. Ea, and M. Belleville, Eds. Springer Netherlands, pages
179–189, 2010.
[Yao09] Chunhua Yao, Kewal K Saluja, and Parameswaran Ramanathan. PARTITION BASED
SOC TEST SCHEDULING WITH THERMAL AND POWER CONSTRAINTS UNDER DEEP
SUBMICRON TECHNOLOGIES, in Asian Test Symposium, pages 281–286, 2009.
[Yao11a] Chunhua Yao, Kewal K Saluja, and Parameswaran Ramanathan. POWER AND
THERMAL CONSTRAINED TEST SCHEDULING UNDER DEEP SUBMICRON
TECHNOLOGIES, IEEE Transactions on Computer-Aided Design of Integrated
Circuits and Systems, vol. 30, no. 2, pages 317–322, Feb. 2011.
[Yao11b] C Yao, KK Saluja, and P Ramanathan. TEMPERATURE DEPENDENT TEST
SCHEDULING FOR MULTI-CORE SYSTEM-ON-CHIP, in 20th Asian Test Symposium,
pages 27–32, 2011.
[Yao11c] Chunhua Yao, Kewal K Saluja, and Parameswaran Ramanathan. THERMAL-
AWARE TEST SCHEDULING USING ON-CHIP TEMPERATURE SENSORS, in 24th
International Conference on VLSI Design, pages 376–381, 2011.
[Yu09] TE Yu, T Yoneda, K Chakrabarty, and H Fujiwara. TEST INFRASTRUCTURE DESIGN
FOR CORE-BASED SYSTEM-ON-CHIP UNDER CYCLE-ACCURATE THERMAL
CONSTRAINTS, in Asia and South Pacific Design Automation Conference, pages
793–798, 2009.
[Zhang13] Dingyou Zhang, K Hummler, L Smith, and JJQ Lu. BACKSIDE TSV PROTRUSION
INDUCED BY THERMAL SHOCK AND THERMAL CYCLING, in 63rd Electronic
Components and Technology Conference, pages 1407–1413, 2013.
[Zhao10] Wei Zhao, Junxia Ma, M Tehranipoor, and S Chakravarty. POWER-SAFE
APPLICATION OF TRANSITION DELAY FAULT PATTERNS CONSIDERING CURRENT
LIMIT DURING WAFER TEST, in 19th Asian Test Symposium, pages 301–306, 2010.
[Zhuo10] Cheng Zhuo, Dennis Sylvester, and David Blaauw. PROCESS VARIATION AND
TEMPERATURE-AWARE RELIABILITY MANAGEMENT, in Design, Automation and
Test in Europe, pages 580–585, 2010.
[Zorian93] Y Zorian. A DISTRIBUTED BIST CONTROL SCHEME FOR COMPLEX VLSI DEVICES,
in 11th annual VLSI Test Symposium, pages 4–9, 1993.
[Zou03] Wei Zou, SM Reddy, I Pomeranz, and Yu Huang. SOC TEST SCHEDULING USING
SIMULATED ANNEALING, in 21st VLSI Test Symposium, pages 325–330, 2003.
[Zschech02] Ehrenfried Zschech, Eckhard Langer, Hans-Juergen Engelmann, and Kornelia
Dittmar. PHYSICAL FAILURE ANALYSIS IN SEMICONDUCTOR INDUSTRY—
CHALLENGES OF THE COPPER INTERCONNECT PROCESS, Materials Science in
Semiconductor Processing, vol. 5, no. 4–5, pages 457–464, Aug. 2002.
Department of Computer and Information Science
Linköpings universitet
Dissertations
Linköping Studies in Science and Technology
Linköping Studies in Arts and Science
Linköping Studies in Statistics
Linköping Studies in Information Science
Linköping Studies in Science and Technology
No 14 Anders Haraldsson: A Program Manipulation
System Based on Partial Evaluation, 1977, ISBN 91-
7372-144-1.
No 17 Bengt Magnhagen: Probability Based Verification of
Time Margins in Digital Designs, 1977, ISBN 91-7372-
157-3.
No 18 Mats Cedwall: Semantisk analys av processbeskrivningar
i naturligt språk, 1977, ISBN 91-7372-168-9.
No 22 Jaak Urmi: A Machine Independent LISP Compiler
and its Implications for Ideal Hardware, 1978, ISBN
91-7372-188-3.
No 33 Tore Risch: Compilation of Multiple File Queries in
a Meta-Database System, 1978, ISBN 91-7372-232-4.
No 51 Erland Jungert: Synthesizing Database Structures
from a User Oriented Data Model, 1980, ISBN 91-
7372-387-8.
No 54 Sture Hägglund: Contributions to the Development
of Methods and Tools for Interactive Design of
Applications Software, 1980, ISBN 91-7372-404-1.
No 55 Pär Emanuelson: Performance Enhancement in a
Well-Structured Pattern Matcher through Partial
Evaluation, 1980, ISBN 91-7372-403-3.
No 58 Bengt Johnsson, Bertil Andersson: The Human-
Computer Interface in Commercial Systems, 1981,
ISBN 91-7372-414-9.
No 69 H. Jan Komorowski: A Specification of an Abstract
Prolog Machine and its Application to Partial
Evaluation, 1981, ISBN 91-7372-479-3.
No 71 René Reboh: Knowledge Engineering Techniques
and Tools for Expert Systems, 1981, ISBN 91-7372-
489-0.
No 77 Östen Oskarsson: Mechanisms of Modifiability in
large Software Systems, 1982, ISBN 91-7372-527-7.
No 94 Hans Lunell: Code Generator Writing Systems, 1983,
ISBN 91-7372-652-4.
No 97 Andrzej Lingas: Advances in Minimum Weight
Triangulation, 1983, ISBN 91-7372-660-5.
No 109 Peter Fritzson: Towards a Distributed Programming
Environment based on Incremental Compilation,
1984, ISBN 91-7372-801-2.
No 111 Erik Tengvald: The Design of Expert Planning
Systems. An Experimental Operations Planning
System for Turning, 1984, ISBN 91-7372-805-5.
No 155 Christos Levcopoulos: Heuristics for Minimum
Decompositions of Polygons, 1987, ISBN 91-7870-
133-3.
No 165 James W. Goodwin: A Theory and System for Non-
Monotonic Reasoning, 1987, ISBN 91-7870-183-X.
No 170 Zebo Peng: A Formal Methodology for Automated
Synthesis of VLSI Systems, 1987, ISBN 91-7870-225-9.
No 174 Johan Fagerström: A Paradigm and System for
Design of Distributed Systems, 1988, ISBN 91-7870-
301-8.
No 192 Dimiter Driankov: Towards a Many Valued Logic of
Quantified Belief, 1988, ISBN 91-7870-374-3.
No 213 Lin Padgham: Non-Monotonic Inheritance for an
Object Oriented Knowledge Base, 1989, ISBN 91-
7870-485-5.
No 214 Tony Larsson: A Formal Hardware Description and
Verification Method, 1989, ISBN 91-7870-517-7.
No 221 Michael Reinfrank: Fundamentals and Logical
Foundations of Truth Maintenance, 1989, ISBN 91-
7870-546-0.
No 239 Jonas Löwgren: Knowledge-Based Design Support
and Discourse Management in User Interface
Management Systems, 1991, ISBN 91-7870-720-X.
No 244 Henrik Eriksson: Meta-Tool Support for Knowledge
Acquisition, 1991, ISBN 91-7870-746-3.
No 252 Peter Eklund: An Epistemic Approach to Interactive
Design in Multiple Inheritance Hierarchies, 1991,
ISBN 91-7870-784-6.
No 258 Patrick Doherty: NML3 - A Non-Monotonic
Formalism with Explicit Defaults, 1991, ISBN 91-
7870-816-8.
No 260 Nahid Shahmehri: Generalized Algorithmic
Debugging, 1991, ISBN 91-7870-828-1.
No 264 Nils Dahlbäck: Representation of Discourse-
Cognitive and Computational Aspects, 1992, ISBN
91-7870-850-8.
No 265 Ulf Nilsson: Abstract Interpretations and Abstract
Machines: Contributions to a Methodology for the
Implementation of Logic Programs, 1992, ISBN 91-
7870-858-3.
No 270 Ralph Rönnquist: Theory and Practice of Tense-
bound Object References, 1992, ISBN 91-7870-873-7.
No 273 Björn Fjellborg: Pipeline Extraction for VLSI Data
Path Synthesis, 1992, ISBN 91-7870-880-X.
No 276 Staffan Bonnier: A Formal Basis for Horn Clause
Logic with External Polymorphic Functions, 1992,
ISBN 91-7870-896-6.
No 277 Kristian Sandahl: Developing Knowledge Management
Systems with an Active Expert Methodology,
1992, ISBN 91-7870-897-4.
No 281 Christer Bäckström: Computational Complexity of
Reasoning about Plans, 1992, ISBN 91-7870-979-2.
No 292 Mats Wirén: Studies in Incremental Natural
Language Analysis, 1992, ISBN 91-7871-027-8.
No 297 Mariam Kamkar: Interprocedural Dynamic Slicing
with Applications to Debugging and Testing, 1993,
ISBN 91-7871-065-0.
No 302 Tingting Zhang: A Study in Diagnosis Using
Classification and Defaults, 1993, ISBN 91-7871-078-2.
No 312 Arne Jönsson: Dialogue Management for Natural
Language Interfaces - An Empirical Approach, 1993,
ISBN 91-7871-110-X.
No 338 Simin Nadjm-Tehrani: Reactive Systems in Physical
Environments: Compositional Modelling and Framework
for Verification, 1994, ISBN 91-7871-237-8.
No 371 Bengt Savén: Business Models for Decision Support
and Learning. A Study of Discrete-Event
Manufacturing Simulation at Asea/ABB 1968-1993,
1995, ISBN 91-7871-494-X.
No 375 Ulf Söderman: Conceptual Modelling of Mode
Switching Physical Systems, 1995, ISBN 91-7871-516-
4.
No 383 Andreas Kågedal: Exploiting Groundness in Logic
Programs, 1995, ISBN 91-7871-538-5.
No 396 George Fodor: Ontological Control, Description,
Identification and Recovery from Problematic
Control Situations, 1995, ISBN 91-7871-603-9.
No 413 Mikael Pettersson: Compiling Natural Semantics,
1995, ISBN 91-7871-641-1.
No 414 Xinli Gu: RT Level Testability Improvement by
Testability Analysis and Transformations, 1996, ISBN
91-7871-654-3.
No 416 Hua Shu: Distributed Default Reasoning, 1996, ISBN
91-7871-665-9.
No 429 Jaime Villegas: Simulation Supported Industrial
Training from an Organisational Learning
Perspective - Development and Evaluation of the
SSIT Method, 1996, ISBN 91-7871-700-0.
No 431 Peter Jonsson: Studies in Action Planning:
Algorithms and Complexity, 1996, ISBN 91-7871-704-
3.
No 437 Johan Boye: Directional Types in Logic
Programming, 1996, ISBN 91-7871-725-6.
No 439 Cecilia Sjöberg: Activities, Voices and Arenas:
Participatory Design in Practice, 1996, ISBN 91-7871-
728-0.
No 448 Patrick Lambrix: Part-Whole Reasoning in
Description Logics, 1996, ISBN 91-7871-820-1.
No 452 Kjell Orsborn: On Extensible and Object-Relational
Database Technology for Finite Element Analysis
Applications, 1996, ISBN 91-7871-827-9.
No 459 Olof Johansson: Development Environments for
Complex Product Models, 1996, ISBN 91-7871-855-4.
No 461 Lena Strömbäck: User-Defined Constructions in
Unification-Based Formalisms, 1997, ISBN 91-7871-
857-0.
No 462 Lars Degerstedt: Tabulation-based Logic Programming:
A Multi-Level View of Query Answering,
1996, ISBN 91-7871-858-9.
No 475 Fredrik Nilsson: Strategi och ekonomisk styrning -
En studie av hur ekonomiska styrsystem utformas
och används efter företagsförvärv, 1997, ISBN 91-
7871-914-3.
No 480 Mikael Lindvall: An Empirical Study of Requirements-Driven
Impact Analysis in Object-Oriented
Software Evolution, 1997, ISBN 91-7871-927-5.
No 485 Göran Forslund: Opinion-Based Systems: The Cooperative
Perspective on Knowledge-Based Decision
Support, 1997, ISBN 91-7871-938-0.
No 494 Martin Sköld: Active Database Management
Systems for Monitoring and Control, 1997, ISBN 91-
7219-002-7.
No 495 Hans Olsén: Automatic Verification of Petri Nets in
a CLP framework, 1997, ISBN 91-7219-011-6.
No 498 Thomas Drakengren: Algorithms and Complexity
for Temporal and Spatial Formalisms, 1997, ISBN 91-
7219-019-1.
No 502 Jakob Axelsson: Analysis and Synthesis of Heterogeneous
Real-Time Systems, 1997, ISBN 91-7219-035-3.
No 503 Johan Ringström: Compiler Generation for Data-
Parallel Programming Languages from Two-Level
Semantics Specifications, 1997, ISBN 91-7219-045-0.
No 512 Anna Moberg: Närhet och distans - Studier av
kommunikationsmönster i satellitkontor och flexibla
kontor, 1997, ISBN 91-7219-119-8.
No 520 Mikael Ronström: Design and Modelling of a
Parallel Data Server for Telecom Applications, 1998,
ISBN 91-7219-169-4.
No 522 Niclas Ohlsson: Towards Effective Fault Prevention
- An Empirical Study in Software Engineering, 1998,
ISBN 91-7219-176-7.
No 526 Joachim Karlsson: A Systematic Approach for
Prioritizing Software Requirements, 1998, ISBN 91-
7219-184-8.
No 530 Henrik Nilsson: Declarative Debugging for Lazy
Functional Languages, 1998, ISBN 91-7219-197-x.
No 555 Jonas Hallberg: Timing Issues in High-Level
Synthesis, 1998, ISBN 91-7219-369-7.
No 561 Ling Lin: Management of 1-D Sequence Data - From
Discrete to Continuous, 1999, ISBN 91-7219-402-2.
No 563 Eva L Ragnemalm: Student Modelling based on
Collaborative Dialogue with a Learning Companion,
1999, ISBN 91-7219-412-X.
No 567 Jörgen Lindström: Does Distance matter? On
geographical dispersion in organisations, 1999, ISBN 91-
7219-439-1.
No 582 Vanja Josifovski: Design, Implementation and
Evaluation of a Distributed Mediator System for
Data Integration, 1999, ISBN 91-7219-482-0.
No 589 Rita Kovordányi: Modeling and Simulating
Inhibitory Mechanisms in Mental Image
Reinterpretation - Towards Cooperative Human-
Computer Creativity, 1999, ISBN 91-7219-506-1.
No 592 Mikael Ericsson: Supporting the Use of Design
Knowledge - An Assessment of Commenting
Agents, 1999, ISBN 91-7219-532-0.
No 593 Lars Karlsson: Actions, Interactions and Narratives,
1999, ISBN 91-7219-534-7.
No 594 C. G. Mikael Johansson: Social and Organizational
Aspects of Requirements Engineering Methods - A
practice-oriented approach, 1999, ISBN 91-7219-541-
X.
No 595 Jörgen Hansson: Value-Driven Multi-Class Overload
Management in Real-Time Database Systems, 1999,
ISBN 91-7219-542-8.
No 596 Niklas Hallberg: Incorporating User Values in the
Design of Information Systems and Services in the
Public Sector: A Methods Approach, 1999, ISBN 91-
7219-543-6.
No 597 Vivian Vimarlund: An Economic Perspective on the
Analysis of Impacts of Information Technology:
From Case Studies in Health-Care towards General
Models and Theories, 1999, ISBN 91-7219-544-4.
No 598 Johan Jenvald: Methods and Tools in Computer-
Supported Taskforce Training, 1999, ISBN 91-7219-
547-9.
No 607 Magnus Merkel: Understanding and enhancing
translation by parallel text processing, 1999, ISBN 91-
7219-614-9.
No 611 Silvia Coradeschi: Anchoring symbols to sensory
data, 1999, ISBN 91-7219-623-8.
No 613 Man Lin: Analysis and Synthesis of Reactive
Systems: A Generic Layered Architecture
Perspective, 1999, ISBN 91-7219-630-0.
No 618 Jimmy Tjäder: Systemimplementering i praktiken -
En studie av logiker i fyra projekt, 1999, ISBN 91-
7219-657-2.
No 627 Vadim Engelson: Tools for Design, Interactive
Simulation, and Visualization of Object-Oriented
Models in Scientific Computing, 2000, ISBN 91-7219-
709-9.
No 637 Esa Falkenroth: Database Technology for Control
and Simulation, 2000, ISBN 91-7219-766-8.
No 639 Per-Arne Persson: Bringing Power and Knowledge
Together: Information Systems Design for Autonomy
and Control in Command Work, 2000, ISBN 91-7219-
796-X.
No 660 Erik Larsson: An Integrated System-Level Design for
Testability Methodology, 2000, ISBN 91-7219-890-7.
No 688 Marcus Bjäreland: Model-based Execution
Monitoring, 2001, ISBN 91-7373-016-5.
No 689 Joakim Gustafsson: Extending Temporal Action
Logic, 2001, ISBN 91-7373-017-3.
No 720 Carl-Johan Petri: Organizational Information
Provision - Managing Mandatory and Discretionary Use
of Information Technology, 2001, ISBN 91-7373-126-9.
No 724 Paul Scerri: Designing Agents for Systems with
Adjustable Autonomy, 2001, ISBN 91-7373-207-9.
No 725 Tim Heyer: Semantic Inspection of Software
Artifacts: From Theory to Practice, 2001, ISBN 91-
7373-208-7.
No 726 Pär Carlshamre: A Usability Perspective on
Requirements Engineering - From Methodology to Product
Development, 2001, ISBN 91-7373-212-5.
No 732 Juha Takkinen: From Information Management to
Task Management in Electronic Mail, 2002, ISBN 91-
7373-258-3.
No 745 Johan Åberg: Live Help Systems: An Approach to
Intelligent Help for Web Information Systems, 2002,
ISBN 91-7373-311-3.
No 746 Rego Granlund: Monitoring Distributed Teamwork
Training, 2002, ISBN 91-7373-312-1.
No 757 Henrik André-Jönsson: Indexing Strategies for Time
Series Data, 2002, ISBN 91-7373-346-6.
No 747 Anneli Hagdahl: Development of IT-supported
Interorganisational Collaboration - A Case Study in
the Swedish Public Sector, 2002, ISBN 91-7373-314-8.
No 749 Sofie Pilemalm: Information Technology for Non-
Profit Organisations - Extended Participatory Design
of an Information System for Trade Union Shop
Stewards, 2002, ISBN 91-7373-318-0.
No 765 Stefan Holmlid: Adapting users: Towards a theory
of use quality, 2002, ISBN 91-7373-397-0.
No 771 Magnus Morin: Multimedia Representations of
Distributed Tactical Operations, 2002, ISBN 91-7373-421-7.
No 772 Pawel Pietrzak: A Type-Based Framework for
Locating Errors in Constraint Logic Programs, 2002, ISBN
91-7373-422-5.
No 758 Erik Berglund: Library Communication Among
Programmers Worldwide, 2002, ISBN 91-7373-349-0.
No 774 Choong-ho Yi: Modelling Object-Oriented Dynamic
Systems Using a Logic-Based Framework, 2002, ISBN
91-7373-424-1.
No 779 Mathias Broxvall: A Study in the Computational
Complexity of Temporal Reasoning, 2002, ISBN 91-
7373-440-3.
No 793 Asmus Pandikow: A Generic Principle for Enabling
Interoperability of Structured and Object-Oriented
Analysis and Design Tools, 2002, ISBN 91-7373-479-9.
No 785 Lars Hult: Publika Informationstjänster. En studie av
den Internetbaserade encyklopedins bruksegenskaper,
2003, ISBN 91-7373-461-6.
No 800 Lars Taxén: A Framework for the Coordination of
Complex Systems' Development, 2003, ISBN 91-
7373-604-X.
No 808 Klas Gäre: Tre perspektiv på förväntningar och
förändringar i samband med införande av
informationssystem, 2003, ISBN 91-7373-618-X.
No 821 Mikael Kindborg: Concurrent Comics -
programming of social agents by children, 2003,
ISBN 91-7373-651-1.
No 823 Christina Ölvingson: On Development of
Information Systems with GIS Functionality in
Public Health Informatics: A Requirements
Engineering Approach, 2003, ISBN 91-7373-656-2.
No 828 Tobias Ritzau: Memory Efficient Hard Real-Time
Garbage Collection, 2003, ISBN 91-7373-666-X.
No 833 Paul Pop: Analysis and Synthesis of
Communication-Intensive Heterogeneous Real-Time
Systems, 2003, ISBN 91-7373-683-X.
No 852 Johan Moe: Observing the Dynamic Behaviour of
Large Distributed Systems to Improve Development
and Testing – An Empirical Study in Software
Engineering, 2003, ISBN 91-7373-779-8.
No 867 Erik Herzog: An Approach to Systems Engineering
Tool Data Representation and Exchange, 2004, ISBN
91-7373-929-4.
No 872 Aseel Berglund: Augmenting the Remote Control:
Studies in Complex Information Navigation for
Digital TV, 2004, ISBN 91-7373-940-5.
No 869 Jo Skåmedal: Telecommuting’s Implications on
Travel and Travel Patterns, 2004, ISBN 91-7373-935-9.
No 870 Linda Askenäs: The Roles of IT - Studies of
Organising when Implementing and Using
Enterprise Systems, 2004, ISBN 91-7373-936-7.
No 874 Annika Flycht-Eriksson: Design and Use of
Ontologies in Information-Providing Dialogue Systems,
2004, ISBN 91-7373-947-2.
No 873 Peter Bunus: Debugging Techniques for Equation-
Based Languages, 2004, ISBN 91-7373-941-3.
No 876 Jonas Mellin: Resource-Predictable and Efficient
Monitoring of Events, 2004, ISBN 91-7373-956-1.
No 883 Magnus Bång: Computing at the Speed of Paper:
Ubiquitous Computing Environments for Healthcare
Professionals, 2004, ISBN 91-7373-971-5.
No 882 Robert Eklund: Disfluency in Swedish human-
human and human-machine travel booking
dialogues, 2004, ISBN 91-7373-966-9.
No 887 Anders Lindström: English and other Foreign
Linguistic Elements in Spoken Swedish. Studies of
Productive Processes and their Modelling using
Finite-State Tools, 2004, ISBN 91-7373-981-2.
No 889 Zhiping Wang: Capacity-Constrained
Production-inventory systems - Modelling and Analysis in both a
traditional and an e-business context, 2004, ISBN 91-
85295-08-6.
No 893 Pernilla Qvarfordt: Eyes on Multimodal Interaction,
2004, ISBN 91-85295-30-2.
No 910 Magnus Kald: In the Borderland between Strategy
and Management Control - Theoretical Framework
and Empirical Evidence, 2004, ISBN 91-85295-82-5.
No 918 Jonas Lundberg: Shaping Electronic News: Genre
Perspectives on Interaction Design, 2004, ISBN 91-
85297-14-3.
No 900 Mattias Arvola: Shades of use: The dynamics of
interaction design for sociable use, 2004, ISBN 91-
85295-42-6.
No 920 Luis Alejandro Cortés: Verification and Scheduling
Techniques for Real-Time Embedded Systems, 2004,
ISBN 91-85297-21-6.
No 929 Diana Szentivanyi: Performance Studies of Fault-
Tolerant Middleware, 2005, ISBN 91-85297-58-5.
No 933 Mikael Cäker: Management Accounting as
Constructing and Opposing Customer Focus: Three
Case Studies on Management Accounting and
Customer Relations, 2005, ISBN 91-85297-64-X.
No 937 Jonas Kvarnström: TALplanner and Other
Extensions to Temporal Action Logic, 2005, ISBN 91-
85297-75-5.
No 938 Bourhane Kadmiry: Fuzzy Gain-Scheduled Visual
Servoing for Unmanned Helicopter, 2005, ISBN 91-
85297-76-3.
No 945 Gert Jervan: Hybrid Built-In Self-Test and Test
Generation Techniques for Digital Systems, 2005,
ISBN 91-85297-97-6.
No 946 Anders Arpteg: Intelligent Semi-Structured
Information Extraction, 2005, ISBN 91-85297-98-4.
No 947 Ola Angelsmark: Constructing Algorithms for
Constraint Satisfaction and Related Problems - Methods
and Applications, 2005, ISBN 91-85297-99-2.
No 963 Calin Curescu: Utility-based Optimisation of
Resource Allocation for Wireless Networks, 2005,
ISBN 91-85457-07-8.
No 972 Björn Johansson: Joint Control in Dynamic
Situations, 2005, ISBN 91-85457-31-0.
No 974 Dan Lawesson: An Approach to Diagnosability
Analysis for Interacting Finite State Systems, 2005,
ISBN 91-85457-39-6.
No 979 Claudiu Duma: Security and Trust Mechanisms for
Groups in Distributed Services, 2005, ISBN 91-85457-
54-X.
No 983 Sorin Manolache: Analysis and Optimisation of
Real-Time Systems with Stochastic Behaviour, 2005,
ISBN 91-85457-60-4.
No 986 Yuxiao Zhao: Standards-Based Application
Integration for Business-to-Business
Communications, 2005, ISBN 91-85457-66-3.
No 1004 Patrik Haslum: Admissible Heuristics for
Automated Planning, 2006, ISBN 91-85497-28-2.
No 1005 Aleksandra Tešanovic: Developing Reusable and
Reconfigurable Real-Time Software using Aspects
and Components, 2006, ISBN 91-85497-29-0.
No 1008 David Dinka: Role, Identity and Work: Extending
the design and development agenda, 2006, ISBN 91-
85497-42-8.
No 1009 Iakov Nakhimovski: Contributions to the Modeling
and Simulation of Mechanical Systems with Detailed
Contact Analysis, 2006, ISBN 91-85497-43-X.
No 1013 Wilhelm Dahllöf: Exact Algorithms for Exact
Satisfiability Problems, 2006, ISBN 91-85523-97-6.
No 1016 Levon Saldamli: PDEModelica - A High-Level
Language for Modeling with Partial Differential
Equations, 2006, ISBN 91-85523-84-4.
No 1017 Daniel Karlsson: Verification of Component-based
Embedded System Designs, 2006, ISBN 91-85523-79-8.
No 1018 Ioan Chisalita: Communication and Networking
Techniques for Traffic Safety Systems, 2006, ISBN 91-
85523-77-1.
No 1019 Tarja Susi: The Puzzle of Social Activity - The
Significance of Tools in Cognition and Cooperation,
2006, ISBN 91-85523-71-2.
No 1021 Andrzej Bednarski: Integrated Optimal Code
Generation for Digital Signal Processors, 2006, ISBN 91-
85523-69-0.
No 1022 Peter Aronsson: Automatic Parallelization of
Equation-Based Simulation Programs, 2006, ISBN 91-
85523-68-2.
No 1030 Robert Nilsson: A Mutation-based Framework for
Automated Testing of Timeliness, 2006, ISBN 91-
85523-35-6.
No 1034 Jon Edvardsson: Techniques for Automatic
Generation of Tests from Programs and
Specifications, 2006, ISBN 91-85523-31-3.
No 1035 Vaida Jakoniene: Integration of Biological Data,
2006, ISBN 91-85523-28-3.
No 1045 Genevieve Gorrell: Generalized Hebbian
Algorithms for Dimensionality Reduction in Natural
Language Processing, 2006, ISBN 91-85643-88-2.
No 1051 Yu-Hsing Huang: Having a New Pair of Glasses -
Applying Systemic Accident Models on Road Safety,
2006, ISBN 91-85643-64-5.
No 1054 Åsa Hedenskog: Perceive those things which cannot
be seen - A Cognitive Systems Engineering
perspective on requirements management, 2006,
ISBN 91-85643-57-2.
No 1061 Cécile Åberg: An Evaluation Platform for Semantic
Web Technology, 2007, ISBN 91-85643-31-9.
No 1073 Mats Grindal: Handling Combinatorial Explosion in
Software Testing, 2007, ISBN 978-91-85715-74-9.
No 1075 Almut Herzog: Usable Security Policies for Runtime
Environments, 2007, ISBN 978-91-85715-65-7.
No 1079 Magnus Wahlström: Algorithms, measures, and
upper bounds for Satisfiability and related problems,
2007, ISBN 978-91-85715-55-8.
No 1083 Jesper Andersson: Dynamic Software Architectures,
2007, ISBN 978-91-85715-46-6.
No 1086 Ulf Johansson: Obtaining Accurate and Compre-
hensible Data Mining Models - An Evolutionary
Approach, 2007, ISBN 978-91-85715-34-3.
No 1089 Traian Pop: Analysis and Optimisation of
Distributed Embedded Systems with Heterogeneous
Scheduling Policies, 2007, ISBN 978-91-85715-27-5.
No 1091 Gustav Nordh: Complexity Dichotomies for CSP-
related Problems, 2007, ISBN 978-91-85715-20-6.
No 1106 Per Ola Kristensson: Discrete and Continuous Shape
Writing for Text Entry and Control, 2007, ISBN 978-
91-85831-77-7.
No 1110 He Tan: Aligning Biomedical Ontologies, 2007, ISBN
978-91-85831-56-2.
No 1112 Jessica Lindblom: Minding the body - Interacting so-
cially through embodied action, 2007, ISBN 978-91-
85831-48-7.
No 1113 Pontus Wärnestål: Dialogue Behavior Management
in Conversational Recommender Systems, 2007,
ISBN 978-91-85831-47-0.
No 1120 Thomas Gustafsson: Management of Real-Time
Data Consistency and Transient Overloads in
Embedded Systems, 2007, ISBN 978-91-85831-33-3.
No 1127 Alexandru Andrei: Energy Efficient and Predictable
Design of Real-time Embedded Systems, 2007, ISBN
978-91-85831-06-7.
No 1139 Per Wikberg: Eliciting Knowledge from Experts in
Modeling of Complex Systems: Managing Variation
and Interactions, 2007, ISBN 978-91-85895-66-3.
No 1143 Mehdi Amirijoo: QoS Control of Real-Time Data
Services under Uncertain Workload, 2007, ISBN 978-
91-85895-49-6.
No 1150 Sanny Syberfeldt: Optimistic Replication with For-
ward Conflict Resolution in Distributed Real-Time
Databases, 2007, ISBN 978-91-85895-27-4.
No 1155 Beatrice Alenljung: Envisioning a Future Decision
Support System for Requirements Engineering - A
Holistic and Human-centred Perspective, 2008, ISBN
978-91-85895-11-3.
No 1156 Artur Wilk: Types for XML with Application to
Xcerpt, 2008, ISBN 978-91-85895-08-3.
No 1183 Adrian Pop: Integrated Model-Driven Development
Environments for Equation-Based Object-Oriented
Languages, 2008, ISBN 978-91-7393-895-2.
No 1185 Jörgen Skågeby: Gifting Technologies -
Ethnographic Studies of End-users and Social Media
Sharing, 2008, ISBN 978-91-7393-892-1.
No 1187 Imad-Eldin Ali Abugessaisa: Analytical tools and
information-sharing methods supporting road safety
organizations, 2008, ISBN 978-91-7393-887-7.
No 1204 H. Joe Steinhauer: A Representation Scheme for De-
scription and Reconstruction of Object
Configurations Based on Qualitative Relations, 2008,
ISBN 978-91-7393-823-5.
No 1222 Anders Larsson: Test Optimization for Core-based
System-on-Chip, 2008, ISBN 978-91-7393-768-9.
No 1238 Andreas Borg: Processes and Models for Capacity
Requirements in Telecommunication Systems, 2009,
ISBN 978-91-7393-700-9.
No 1240 Fredrik Heintz: DyKnow: A Stream-Based Know-
ledge Processing Middleware Framework, 2009,
ISBN 978-91-7393-696-5.
No 1241 Birgitta Lindström: Testability of Dynamic Real-
Time Systems, 2009, ISBN 978-91-7393-695-8.
No 1244 Eva Blomqvist: Semi-automatic Ontology Construc-
tion based on Patterns, 2009, ISBN 978-91-7393-683-5.
No 1249 Rogier Woltjer: Functional Modeling of Constraint
Management in Aviation Safety and Command and
Control, 2009, ISBN 978-91-7393-659-0.
No 1260 Gianpaolo Conte: Vision-Based Localization and
Guidance for Unmanned Aerial Vehicles, 2009, ISBN
978-91-7393-603-3.
No 1262 AnnMarie Ericsson: Enabling Tool Support for For-
mal Analysis of ECA Rules, 2009, ISBN 978-91-7393-
598-2.
No 1266 Jiri Trnka: Exploring Tactical Command and
Control: A Role-Playing Simulation Approach, 2009,
ISBN 978-91-7393-571-5.
No 1268 Bahlol Rahimi: Supporting Collaborative Work
through ICT - How End-users Think of and Adopt
Integrated Health Information Systems, 2009, ISBN
978-91-7393-550-0.
No 1274 Fredrik Kuivinen: Algorithms and Hardness Results
for Some Valued CSPs, 2009, ISBN 978-91-7393-525-8.
No 1281 Gunnar Mathiason: Virtual Full Replication for
Scalable Distributed Real-Time Databases, 2009,
ISBN 978-91-7393-503-6.
No 1290 Viacheslav Izosimov: Scheduling and Optimization
of Fault-Tolerant Distributed Embedded Systems,
2009, ISBN 978-91-7393-482-4.
No 1294 Johan Thapper: Aspects of a Constraint
Optimisation Problem, 2010, ISBN 978-91-7393-464-0.
No 1306 Susanna Nilsson: Augmentation in the Wild: User
Centered Development and Evaluation of
Augmented Reality Applications, 2010, ISBN 978-91-
7393-416-9.
No 1313 Christer Thörn: On the Quality of Feature Models,
2010, ISBN 978-91-7393-394-0.
No 1321 Zhiyuan He: Temperature Aware and Defect-
Probability Driven Test Scheduling for System-on-
Chip, 2010, ISBN 978-91-7393-378-0.
No 1333 David Broman: Meta-Languages and Semantics for
Equation-Based Modeling and Simulation, 2010,
ISBN 978-91-7393-335-3.
No 1337 Alexander Siemers: Contributions to Modelling and
Visualisation of Multibody Systems Simulations with
Detailed Contact Analysis, 2010, ISBN 978-91-7393-
317-9.
No 1354 Mikael Asplund: Disconnected Discoveries:
Availability Studies in Partitioned Networks, 2010,
ISBN 978-91-7393-278-3.
No 1359 Jana Rambusch: Mind Games Extended:
Understanding Gameplay as Situated Activity, 2010,
ISBN 978-91-7393-252-3.
No 1373 Sonia Sangari: Head Movement Correlates to Focus
Assignment in Swedish, 2011, ISBN 978-91-7393-154-0.
No 1374 Jan-Erik Källhammer: Using False Alarms when
Developing Automotive Active Safety Systems, 2011,
ISBN 978-91-7393-153-3.
No 1375 Mattias Eriksson: Integrated Code Generation, 2011,
ISBN 978-91-7393-147-2.
No 1381 Ola Leifler: Affordances and Constraints of
Intelligent Decision Support for Military Command
and Control – Three Case Studies of Support
Systems, 2011, ISBN 978-91-7393-133-5.
No 1386 Soheil Samii: Quality-Driven Synthesis and
Optimization of Embedded Control Systems, 2011,
ISBN 978-91-7393-102-1.
No 1419 Erik Kuiper: Geographic Routing in Intermittently-
connected Mobile Ad Hoc Networks: Algorithms
and Performance Models, 2012, ISBN 978-91-7519-
981-8.
No 1451 Sara Stymne: Text Harmonization Strategies for
Phrase-Based Statistical Machine Translation, 2012,
ISBN 978-91-7519-887-3.
No 1455 Alberto Montebelli: Modeling the Role of Energy
Management in Embodied Cognition, 2012, ISBN
978-91-7519-882-8.
No 1465 Mohammad Saifullah: Biologically-Based Interactive
Neural Network Models for Visual Attention and
Object Recognition, 2012, ISBN 978-91-7519-838-5.
No 1490 Tomas Bengtsson: Testing and Logic Optimization
Techniques for Systems on Chip, 2012, ISBN 978-91-
7519-742-5.
No 1481 David Byers: Improving Software Security by
Preventing Known Vulnerabilities, 2012, ISBN 978-
91-7519-784-5.
No 1496 Tommy Färnqvist: Exploiting Structure in CSP-
related Problems, 2013, ISBN 978-91-7519-711-1.
No 1503 John Wilander: Contributions to Specification,
Implementation, and Execution of Secure Software,
2013, ISBN 978-91-7519-681-7.
No 1506 Magnus Ingmarsson: Creating and Enabling the
Useful Service Discovery Experience, 2013, ISBN 978-
91-7519-662-6.
No 1547 Wladimir Schamai: Model-Based Verification of
Dynamic System Behavior against Requirements:
Method, Language, and Tool, 2013, ISBN 978-91-
7519-505-6.
No 1551 Henrik Svensson: Simulations, 2013, ISBN 978-91-
7519-491-2.
No 1559 Sergiu Rafiliu: Stability of Adaptive Distributed
Real-Time Systems with Dynamic Resource
Management, 2013, ISBN 978-91-7519-471-4.
No 1581 Usman Dastgeer: Performance-aware Component
Composition for GPU-based Systems, 2014, ISBN
978-91-7519-383-0.
No 1602 Cai Li: Reinforcement Learning of Locomotion based
on Central Pattern Generators, 2014, ISBN 978-91-
7519-313-7.
No 1652 Roland Samlaus: An Integrated Development
Environment with Enhanced Domain-Specific
Interactive Model Validation, 2015, ISBN 978-91-
7519-090-7.
No 1663 Hannes Uppman: On Some Combinatorial
Optimization Problems: Algorithms and Complexity,
2015, ISBN 978-91-7519-072-3.
No 1664 Martin Sjölund: Tools and Methods for Analysis,
Debugging, and Performance Improvement of
Equation-Based Models, 2015, ISBN 978-91-7519-071-6.
No 1666 Kristian Stavåker: Contributions to Simulation of
Modelica Models on Data-Parallel Multi-Core
Architectures, 2015, ISBN 978-91-7519-068-6.
No 1680 Adrian Lifa: Hardware/Software Codesign of
Embedded Systems with Reconfigurable and
Heterogeneous Platforms, 2015, ISBN 978-91-7519-040-
2.
No 1685 Bogdan Tanasa: Timing Analysis of Distributed
Embedded Systems with Stochastic Workload and
Reliability Constraints, 2015, ISBN 978-91-7519-022-8.
No 1691 Håkan Warnquist: Troubleshooting Trucks –
Automated Planning and Diagnosis, 2015, ISBN 978-
91-7685-993-3.
No 1702 Nima Aghaee: Thermal Issues in Testing of
Advanced Systems on Chip, 2015, ISBN 978-91-7685-
949-0.
Linköping Studies in Arts and Science
No 504 Ing-Marie Jonsson: Social and Emotional
Characteristics of Speech-based In-Vehicle
Information Systems: Impact on Attitude and
Driving Behaviour, 2009, ISBN 978-91-7393-478-7.
No 586 Fabian Segelström: Stakeholder Engagement for
Service Design: How service designers identify and
communicate insights, 2013, ISBN 978-91-7519-554-4.
No 618 Johan Blomkvist: Representing Future Situations of
Service: Prototyping in Service Design, 2014, ISBN
978-91-7519-343-4.
No 620 Marcus Mast: Human-Robot Interaction for Semi-
Autonomous Assistive Robots, 2014, ISBN 978-91-
7519-319-9.
Linköping Studies in Statistics
No 9 Davood Shahsavani: Computer Experiments De-
signed to Explore and Approximate Complex Deter-
ministic Models, 2008, ISBN 978-91-7393-976-8.
No 10 Karl Wahlin: Roadmap for Trend Detection and As-
sessment of Data Quality, 2008, ISBN 978-91-7393-
792-4.
No 11 Oleg Sysoev: Monotonic regression for large
multivariate datasets, 2010, ISBN 978-91-7393-412-1.
No 13 Agné Burauskaite-Harju: Characterizing Temporal
Change and Inter-Site Correlations in Daily and Sub-
daily Precipitation Extremes, 2011, ISBN 978-91-7393-
110-6.
Linköping Studies in Information Science
No 1 Karin Axelsson: Metodisk systemstrukturering - att
skapa samstämmighet mellan informationssystem-
arkitektur och verksamhet, 1998, ISBN 91-7219-296-8.
No 2 Stefan Cronholm: Metodverktyg och användbarhet -
en studie av datorstödd metodbaserad
systemutveckling, 1998, ISBN 91-7219-299-2.
No 3 Anders Avdic: Användare och utvecklare - om
anveckling med kalkylprogram, 1999, ISBN 91-7219-
606-8.
No 4 Owen Eriksson: Kommunikationskvalitet hos infor-
mationssystem och affärsprocesser, 2000, ISBN 91-
7219-811-7.
No 5 Mikael Lind: Från system till process - kriterier för
processbestämning vid verksamhetsanalys, 2001,
ISBN 91-7373-067-X.
No 6 Ulf Melin: Koordination och informationssystem i
företag och nätverk, 2002, ISBN 91-7373-278-8.
No 7 Pär J. Ågerfalk: Information Systems Actability - Un-
derstanding Information Technology as a Tool for
Business Action and Communication, 2003, ISBN 91-
7373-628-7.
No 8 Ulf Seigerroth: Att förstå och förändra system-
utvecklingsverksamheter - en taxonomi för
metautveckling, 2003, ISBN 91-7373-736-4.
No 9 Karin Hedström: Spår av datoriseringens värden –
Effekter av IT i äldreomsorg, 2004, ISBN 91-7373-963-
4.
No 10 Ewa Braf: Knowledge Demanded for Action -
Studies on Knowledge Mediation in Organisations,
2004, ISBN 91-85295-47-7.
No 11 Fredrik Karlsson: Method Configuration - method
and computerized tool support, 2005, ISBN 91-85297-
48-8.
No 12 Malin Nordström: Styrbar systemförvaltning - Att
organisera systemförvaltningsverksamhet med hjälp
av effektiva förvaltningsobjekt, 2005, ISBN 91-85297-
60-7.
No 13 Stefan Holgersson: Yrke: POLIS - Yrkeskunskap,
motivation, IT-system och andra förutsättningar för
polisarbete, 2005, ISBN 91-85299-43-X.
No 14 Benneth Christiansson, Marie-Therese Christiansson: Mötet mellan process och komponent
- mot ett ramverk för en verksamhetsnära
kravspecifikation vid anskaffning av komponent-
baserade informationssystem, 2006, ISBN 91-85643-
22-X.