Low Power Principles
TRANSCRIPT

8/3/2019 Low Power Principles
1/58

Author: Agatino Pennisi
[[email protected]]
Low Power Architectures and Design
AST-Lab Catania
Index

1. Introduction
2. Basic Principles
2.1. Sources of Power Consumption
2.2. Switching Power
2.3. Short-Circuit Power
2.4. Static Power
2.5. Power-Delay and Energy-Delay Products
3. Technology Level Optimizations
3.1. Technology Scaling
3.2. Threshold Voltage Reduction
3.3. Technology Level Conclusion
4. Layout Level Optimizations
5. Circuit Level Optimizations
5.1. Dynamic Logic
5.2. Pass-Transistor Logic
5.3. Asynchronous Logic
5.4. Transistor Sizing
5.5. Design Style
5.6. Circuit Level Conclusion
6. Logic and Architecture Level Optimizations
6.1. Logic Level Optimizations
6.2. Architecture Level Optimizations
7. Software and System Level Optimizations
Conclusions
References
1. Introduction
The growing market of mobile, battery-powered electronic systems (e.g., cellular phones, personal
digital assistants, etc.) demands the design of microelectronic circuits with low power dissipation that
can be powered by lightweight batteries with long times between recharges.
The power consumed by a circuit is defined as p(t) = i(t)v(t), where i(t) is the instantaneous current
provided by the power supply and v(t) is the instantaneous supply voltage. Power minimization targets
either the maximum instantaneous power or the average power: the latter impacts battery lifetime and
heat-dissipation system cost, while the former constrains the design of the power grid and power supply circuits.
It is important to stress from the outset that power minimization is never the only objective in
real-life designs. Performance is always a critical metric that cannot be neglected. Unfortunately, in most
cases power can be reduced only at the price of some performance degradation. For this reason, several
joint power-performance metrics have been proposed in the past. In many designs, the power-delay
product (i.e., energy) is an acceptable metric. Energy minimization rules out design choices that
heavily compromise performance to reduce power consumption. When performance has priority over
power consumption, the energy-delay product (equivalent to power × delay²) can be adopted to
tightly control performance degradation.
Besides power vs. performance, another key trade-off in VLSI design is power vs. flexibility.
Several authors have observed that application specific designs are orders of magnitude more power
efficient than general-purpose systems programmed to perform the same computation. On the other
hand, flexibility (programmability) is often an indispensable requirement, and designers must strive to
achieve maximum power efficiency without compromising flexibility.
2. Basic Principles

2.1. Sources of Power Consumption

The three major sources of power dissipation in a digital CMOS circuit are:

$P = P_{Switching} + P_{Short\text{-}Circuit} + P_{Leakage}$    (Eq. 2.1)

Fig. 2.1. Sources of power consumption
P_Switching, also called switching power, is due to charging and discharging the capacitors driven by the
circuit.
P_Short-Circuit, also called short-circuit power, is caused by the short-circuit currents that arise when pairs
of PMOS/NMOS transistors conduct simultaneously.
P_Leakage originates from substrate injection and subthreshold effects. For older technologies (0.8
µm and above), P_Switching was predominant. For deep-submicron processes, P_Leakage becomes more important.
Design for low power implies the ability to reduce all three components.
Optimizations can be achieved by facing the power problem from different perspectives: design and
technology. Enhanced design capabilities mostly impact switching and short-circuit power; technology
improvements, on the other hand, contribute to reductions of all three components.
2.2. Switching Power

Switching power for a CMOS gate working in a synchronous environment is modeled by the following
equation:

$P_{Switching} = \frac{1}{2} C_L V_{DD}^2 f_{Clock} E_{SW}$    (Eq. 2.2)

where C_L is the output load of the gate, V_DD is the supply voltage, f_Clock is the clock frequency and E_SW
is the switching activity of the gate, defined as the probability of the gate's output making a logic
transition during one clock cycle.

Reductions of P_Switching are achievable by:
1. supply voltage scaling
2. frequency scaling
3. minimization of switched capacitance
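As a quick sanity check on Eq. (2.2), a minimal Python sketch; the parameter values (load, supply, clock, activity) are illustrative assumptions, not from the text:

```python
# Sketch: first-order switching-power estimate from Eq. (2.2).
# All parameter values below are illustrative assumptions.

def switching_power(c_load, v_dd, f_clock, e_sw):
    """P_Switching = 1/2 * C_L * V_DD^2 * f_Clock * E_SW."""
    return 0.5 * c_load * v_dd**2 * f_clock * e_sw

# Example: 50 fF load, 2.5 V supply, 100 MHz clock, activity 0.2
p = switching_power(50e-15, 2.5, 100e6, 0.2)
print(f"{p * 1e6:.3f} µW")  # 3.125 µW
```

The quadratic dependence on V_DD is visible directly in the formula, which is why voltage scaling is the most effective of the three knobs listed above.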
1. Supply Voltage Scaling
advantage: scaling V_DD reduces P_Switching quadratically
drawback: scaling V_DD lowers circuit speed (decreasing circuit performance)

To compensate for the decrease in circuit performance introduced by reduced voltage, speed optimization
is applied first, followed by supply voltage scaling, which brings the design back to its original timing,
but with a lower power requirement.

2. Frequency Scaling
advantage: scaling f_Clock reduces P_Switching linearly
drawback: scaling f_Clock lowers circuit speed (decreasing circuit performance)

Selective frequency scaling (as well as voltage scaling) may thus be applied to units that do not limit
overall performance, at no penalty in the overall system speed.
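The two knobs interact: a unit whose frequency is halved can often tolerate a lower supply as well, compounding the savings. A sketch under the crude first-order assumption that delay scales as 1/V_DD; all values are illustrative:

```python
# Sketch: combined frequency and voltage scaling of Eq. (2.2).
# Crude first-order assumption: delay ~ 1/V_DD, so halving f_Clock
# also permits halving V_DD. All values are illustrative.

def p_switching(c_load, v_dd, f_clock, e_sw=1.0):
    return 0.5 * c_load * v_dd**2 * f_clock * e_sw

base = p_switching(50e-15, 2.0, 200e6)
freq_only = p_switching(50e-15, 2.0, 100e6)       # f halved
volt_and_freq = p_switching(50e-15, 1.0, 100e6)   # f and V_DD halved

print(base / freq_only)      # 2.0  (linear savings)
print(base / volt_and_freq)  # 8.0  (cubic savings)
```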
3. Minimization of Switched Capacitance

Optimization approaches that have a lower impact on performance, while still allowing significant power
savings, are those targeting the minimization of the switched capacitance (i.e., the product of the
capacitive load and the switching activity).

Static solutions (i.e., applicable at design time) handle switched-capacitance minimization through
area optimization (which corresponds to a decrease in the capacitive load) and switching-activity
reduction via exploitation of different kinds of signal correlations (temporal, spatial, spatio-temporal).
Dynamic techniques, on the other hand, aim at eliminating power waste that may originate from the
application of certain system workloads (i.e., the data being processed).
2.3. Short-Circuit Power

In actual designs, the assumption of zero rise and fall times of the input waveforms is not correct.
The finite slope of the input signal causes a direct current path between V_DD and GND for a short
period of time during switching, while the NMOS and the PMOS transistors are conducting
simultaneously. This is illustrated in Figure 2.2. Under the (reasonable) assumption that the resulting
current spikes can be approximated as triangles and that the inverter is symmetrical in its rising and
falling responses, we can compute the energy consumed per switching period,

Fig. 2.2. Short-circuit currents during transients

$E_{dp} = V_{DD} \frac{I_{peak} t_{sc}}{2} + V_{DD} \frac{I_{peak} t_{sc}}{2} = t_{sc} V_{DD} I_{peak}$    (Eq. 2.3)

as well as the average power consumption

$P_{dp} = t_{sc} V_{DD} I_{peak} f = C_{sc} V_{DD}^2 f$    (Eq. 2.4)
The short-circuit (also called direct-path) power dissipation is proportional to the switching
activity, similar to the capacitive power dissipation. t_sc represents the time both devices are
conducting. For a linear input slope, this time is reasonably well approximated by Eq. (2.5), where t_s
represents the 0-100% transition time.

$t_{sc} = \frac{V_{DD} - 2V_T}{V_{DD}}\, t_s = \frac{V_{DD} - 2V_T}{V_{DD}} \cdot \frac{t_{r(f)}}{0.8}$    (Eq. 2.5)
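A small sketch combining Eq. (2.3) and Eq. (2.5); the device values (V_DD, V_T, rise time, I_peak) are illustrative assumptions:

```python
# Sketch: short-circuit time and energy per Eqs. (2.3) and (2.5).
# Values (V_DD, V_T, rise time, I_peak) are illustrative assumptions.

def t_short_circuit(v_dd, v_t, t_rise_10_90):
    """t_sc = (V_DD - 2*V_T)/V_DD * t_s, with t_s = t_r / 0.8 (0-100% time)."""
    t_s = t_rise_10_90 / 0.8
    return (v_dd - 2 * v_t) / v_dd * t_s

def e_direct_path(t_sc, v_dd, i_peak):
    """E_dp = t_sc * V_DD * I_peak (triangular current-spike approximation)."""
    return t_sc * v_dd * i_peak

t_sc = t_short_circuit(v_dd=2.5, v_t=0.5, t_rise_10_90=200e-12)
e_dp = e_direct_path(t_sc, v_dd=2.5, i_peak=0.2e-3)
print(f"t_sc = {t_sc*1e12:.0f} ps, E_dp = {e_dp*1e15:.1f} fJ")  # 150 ps, 75.0 fJ
```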
I_peak is determined by the saturation current of the devices and is hence directly proportional to the
sizes of the transistors. The peak current is also a strong function of the ratio between input and
output slopes. This relationship is best illustrated by the following simple analysis. Consider a static
CMOS inverter with a 0→1 transition at the input. Assume first that the load capacitance is very
large, so that the output fall time is significantly larger than the input rise time (Figure 2.3a). Under
those circumstances, the input moves through the transient region before the output starts to change.
As the source-drain voltage of the PMOS device is approximately 0 during that period, the device
shuts off without ever delivering any current. The short-circuit current is close to zero in this case.
Consider now the reverse case, where the output capacitance is very small, and the output fall time is
substantially smaller than the input rise time (Figure 2.3b). The drain-source voltage of the PMOS
device equals V_DD for most of the transition period, guaranteeing the maximal short-circuit current
(equal to the saturation current of the PMOS). This clearly represents the worst-case condition.
Fig. 2.3. Impact of load capacitance on short-circuit current
The conclusions of the above analysis are confirmed in Figure 2.4, which plots the short-circuit
current through the NMOS transistor during a low-to-high transition as a function of the load
capacitance.
Fig. 2.4. CMOS inverter short-circuit current through NMOS transistor as a
function of the load capacitance (for a fixed input slope of 500 psec).
This analysis leads to the conclusion that the short-circuit dissipation is minimized by making the
output rise/fall time larger than the input rise/fall time. On the other hand, making the output rise/fall
time too large slows down the circuit and can cause short-circuit currents in the fan-out gates. This
is a perfect example of how local optimization that ignores the global picture can lead to an
inferior solution.
2.4. Static Power

The static (or steady-state) power dissipation of a circuit is expressed by Eq. (2.6), where I_stat is the
current that flows between the supply rails in the absence of switching activity.

$P_{stat} = I_{stat} V_{DD}$    (Eq. 2.6)

Ideally, the static current of the CMOS inverter is equal to zero, as the PMOS and NMOS devices are
never on simultaneously in steady-state operation. There is, unfortunately, a leakage current flowing
through the reverse-biased diode junctions of the transistors, located between the source or drain and
the substrate, as shown in Figure 2.5. This contribution is, in general, very small and can be ignored.
For the device sizes under consideration, the leakage current per unit drain area typically ranges
between 10-100 pA/µm² at room temperature. For a die with 1 million gates, each with a drain area of
0.5 µm² and operated at a supply voltage of 2.5 V, the worst-case power consumption due to diode
leakage equals 0.125 mW, which is clearly not much of an issue. However, be aware that the junction
leakage currents are caused by thermally generated carriers. Their value increases exponentially with
increasing junction temperature. At 85 °C (a common junction temperature limit for commercial
hardware), the leakage currents increase by a factor of 60 over their room-temperature values.
Keeping the overall operation temperature of a circuit low is consequently a desirable goal.
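The 0.125 mW figure in the worked example above can be reproduced directly, together with the quoted 60x increase at an 85 °C junction temperature:

```python
# Sketch: the diode-leakage worked example from the text.
# Worst-case leakage density (100 pA/µm²) from the quoted 10-100 range.

N_GATES = 1_000_000
DRAIN_AREA_UM2 = 0.5          # per gate
J_LEAK_WORST = 100e-12        # 100 pA/µm²
V_DD = 2.5

i_leak = N_GATES * DRAIN_AREA_UM2 * J_LEAK_WORST   # total leakage current (A)
p_leak = i_leak * V_DD
print(f"room temperature: {p_leak * 1e3:.3f} mW")       # 0.125 mW
print(f"at 85 °C (x60):   {p_leak * 60 * 1e3:.1f} mW")  # 7.5 mW
```

The second line makes the text's point concrete: the same die that is harmless at room temperature dissipates a far less negligible 7.5 mW of junction leakage at the 85 °C limit.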
Fig. 2.5. Sources of leakage currents in CMOS inverter (for Vin = 0 V)
As the temperature is a strong function of the dissipated heat and its removal mechanisms, this can
only be accomplished by limiting the power dissipation of the circuit and/or by using chip packages
that support efficient heat removal.
An emerging source of leakage current is the subthreshold current of the transistors. An MOS
transistor can experience a drain-source current, even when VGS is smaller than the threshold voltage
(Figure 2.6).
The closer the threshold voltage is to zero volts, the larger the leakage current at V_GS = 0 V and the
larger the static power consumption. To offset this effect, the threshold voltage of the device has
generally been kept high enough. Standard processes feature V_T values that are never smaller than
0.5-0.6 V and that in some cases are even substantially higher (~0.75 V).
Fig. 2.6. Decreasing the threshold increases the subthreshold current at V_GS = 0

This approach is being challenged by the reduction in supply voltages that typically goes with deep-
submicron technology scaling. Scaling the supply voltages while keeping the threshold voltage
constant results in an important loss in performance, especially when V_DD approaches 2V_T.

One approach to address this performance issue is to scale the device thresholds down as well. This
moves the delay curve of Figure 3.1 (right) to the left, which means that the performance penalty for
lowering the supply voltage is reduced. Unfortunately, the threshold voltages are lower-bounded by
the amount of allowable subthreshold leakage current, as demonstrated in Figure 2.6. The choice of
the threshold voltage hence represents a trade-off between performance and static power dissipation.
The continued scaling of the supply voltage predicted for the next generations of CMOS technologies
will however force the threshold voltages ever downwards, and will make subthreshold conduction a
dominant source of power dissipation. Process technologies that contain devices with sharper turn-off
characteristic will therefore become more attractive. An example of the latter is the SOI (Silicon-on-
Insulator) technology whose MOS transistors have slope-factors that are close to the ideal 60
mV/decade.
This lower bound on the thresholds is in some sense artificial. The idea that the leakage current in a
static CMOS circuit has to be zero is a preconception. Certainly, the presence of leakage currents
degrades the noise margins, because the logic levels are no longer equal to the supply rails. As long as
the noise margins are within range, this is not a compelling issue. The leakage currents, of course,
cause an increase in static power dissipation. This is offset by the drop in supply voltage, which is
enabled by the reduced thresholds at no cost in performance, and results in a quadratic reduction in
dynamic power. For a 0.25 µm CMOS process, the following circuit configurations obtain the same
performance: a 3 V supply with a 0.7 V V_T, and a 0.45 V supply with a 0.1 V V_T.
The dynamic power consumption of the latter is, however, 45 times smaller! Choosing the correct
values of supply and threshold voltages once again requires a trade-off. The optimal operation point
depends upon the activity of the circuit.
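The quoted ~45x figure follows from the quadratic dependence of dynamic power on V_DD (Eq. 2.2) at equal C_L and f:

```python
# Sketch: checking the ~45x dynamic-power ratio between the two
# equal-performance operating points (3 V / 0.7 V V_T vs. 0.45 V / 0.1 V V_T).
# Dynamic power scales with V_DD^2 at fixed C_L and f.

ratio = (3.0 / 0.45) ** 2
print(f"{ratio:.1f}x")  # 44.4x, i.e. the ~45x quoted in the text
```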
In the presence of a sizable static power dissipation, it is essential that non-active modules are
powered down, lest static power dissipation become dominant. Power-down, also called standby,
can be accomplished by disconnecting the unit from the supply rails, or by lowering the
supply voltage.
2.5. Power-Delay and Energy-Delay Products

The total power consumption of the CMOS inverter is now expressed as the sum of its three
components:

$P_{tot} = P_{dyn} + P_{dp} + P_{stat} = \left(C_L V_{DD}^2 + V_{DD} I_{peak} t_s\right) f_{0 \to 1} + V_{DD} I_{stat}$    (Eq. 2.7)

In typical CMOS circuits, the capacitive dissipation is by far the dominant factor. The direct-path
consumption can be kept within bounds by careful design, and should hence not be an issue. Leakage
is ignorable at present, but this might change in the not too distant future.

In Chapter 1, we introduced the power-delay product, PDP, as a quality measure for a logic gate.

$PDP = P_{av}\, t_p$    (Eq. 2.8)

The PDP presents a measure of energy, as is apparent from the units (W·sec = Joule). Assuming that
the gate is switched at its maximum possible rate of f_max = 1/(2t_p), and ignoring the contributions of the
static and direct-path currents to the power consumption, we find

$PDP = C_L V_{DD}^2 f_{max}\, t_p = \frac{C_L V_{DD}^2}{2}$    (Eq. 2.9)

The PDP stands for the average energy consumed per switching event (that is, for a 0→1 or a 1→0
transition). Remember that earlier we had defined E_av as the average energy per switching cycle (or
per energy-consuming event). As each inverter cycle contains a 0→1 and a 1→0 transition, E_av hence
is twice the PDP.
The validity of the PDP as a quality metric for a process technology or gate topology is questionable.
It measures the energy needed to switch the gate, which is an important property for sure. Yet for a
given structure, this number can be made arbitrarily low by reducing the supply voltage. From this
perspective, the optimum voltage to run the circuit at would be the lowest possible value that still
ensures functionality. This comes at a major expense in performance, as discussed earlier. A more
relevant metric should combine a measure of performance and energy. The energy-delay product,
EDP, does exactly that.

$EDP = PDP \cdot t_p = P_{av}\, t_p^2 = \frac{C_L V_{DD}^2}{2}\, t_p$    (Eq. 2.10)

It is worth analyzing the voltage dependence of the EDP. Higher supply voltages reduce delay, but
harm the energy, and the opposite is true for low voltages. An optimum operation point should hence
exist. Assuming that NMOS and PMOS transistors have comparable threshold and saturation voltages,
we can simplify the propagation delay expression as follows:

$t_{pHL} = \frac{C_L V_{DD}}{0.52\, k'_n (W/L)_n V_{DSATn} \left(V_{DD} - V_{Tn} - V_{DSATn}/2\right)} \approx \frac{\alpha_n C_L V_{DD}}{V_{DD} - V_{Te}}$    (Eq. 2.11)

where V_Te = V_T + V_DSAT/2, and α_n is a technology parameter. Combining Eq. (2.10) and Eq. (2.11),

$EDP = \frac{\alpha_n C_L^2 V_{DD}^3}{2\,(V_{DD} - V_{Te})}$    (Eq. 2.12)
The optimum supply voltage can be obtained by taking the derivative of Eq. (2.12) with respect to V_DD,
and equating the result to 0.

$V_{DDopt} = \frac{3}{2} V_{Te}$    (Eq. 2.13)

The remarkable outcome from this analysis is the low value of the supply voltage that simultaneously
optimizes performance and energy. For sub-micron technologies with thresholds in the range of 0.5 V,
the optimum supply is situated around 1 V.
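The differentiation step leading from Eq. (2.12) to Eq. (2.13) can be filled in explicitly; treating α_n, C_L and V_Te as constants:

```latex
% Derivation of the EDP-optimal supply voltage from Eq. (2.12).
\frac{\partial\, EDP}{\partial V_{DD}}
  = \frac{\alpha_n C_L^2}{2} \cdot
    \frac{3 V_{DD}^2 \left(V_{DD} - V_{Te}\right) - V_{DD}^3}{\left(V_{DD} - V_{Te}\right)^2}
  = \frac{\alpha_n C_L^2}{2} \cdot
    \frac{V_{DD}^2 \left(2 V_{DD} - 3 V_{Te}\right)}{\left(V_{DD} - V_{Te}\right)^2} = 0
\quad\Longrightarrow\quad
V_{DDopt} = \frac{3}{2} V_{Te}.
```

With V_Te on the order of 0.6-0.7 V for the 0.5 V thresholds mentioned in the text, this indeed lands the optimum supply near 1 V.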
3. Technology Level Optimizations

3.1. Technology Scaling

Scaling of physical dimensions is a well-known technique for reducing circuit power consumption. The
first-order effects of scaling can be fairly easily derived. Device gate capacitances are of the form:

$C_{Gate} = W L\, C_{ox} = W L\, \frac{\varepsilon_{ox}}{t_{ox}}$    (Eq. 3.1)

If we scale down W, L, and t_ox by S, then this capacitance will scale down by S as well. Consequently,
if system data rates and supply voltages remain unchanged, this factor of S reduction in capacitance is
passed on directly to power:

$P_{\text{Fixed performance, fixed voltage}} \propto \frac{1}{S}$    (Eq. 3.2)

The effect of scaling on delays is equally promising. Based on Eq. (3.3), the transistor current drive
increases linearly with S.

$I_{dd} = \frac{\mu C_{ox}}{2} \frac{W}{L} \left(V_{dd} - V_t\right)^2$    (Eq. 3.3)
As a result, propagation delays, which are proportional to capacitance and inversely proportional to
drive current, scale down by a factor of S².

Assuming we are only trying to maintain system throughput rather than increase it, the improvement in
circuit performance can be traded for lower power by reducing the supply voltage. In particular,
neglecting V_t effects, the voltage can be reduced by a factor of S². This results in an S⁴ reduction in
device currents, and along with the capacitance scaling leads to an S⁵ reduction in power:

$P_{\text{Fixed performance, variable voltage}} \propto \frac{1}{S^5}$    (Eq. 3.4)

This discussion, however, ignores many important second-order effects. For example, as scaling
continues, interconnect parasitics eventually begin to dominate and change the picture substantially.
The resistance of a wire is proportional to its length and inversely proportional to its thickness and
width. Since in this discussion we are considering the impact of technology scaling on a fixed design,
the local and global wire lengths should scale down by S along with the width and thickness of the
wire. This means that the wire resistance should scale up by a factor of S overall. The wire
capacitance is proportional to its width and length and inversely proportional to the oxide thickness.
Consequently, the wire capacitance scales down by a factor of S. To summarize:
$R_w \propto S, \qquad C_w \propto \frac{1}{S}, \qquad t_{wire} \propto R_w C_w \propto 1$    (Eq. 3.5)

This means that, unlike gate delays, the intrinsic interconnect delay does not scale down with physical
dimensions. So at some point interconnect delays will start to dominate over gate delays and it will no
longer be possible to scale down the supply voltage. This means that once again power is reduced
solely due to capacitance scaling:

$P_{\text{Parasitics dominated}} \propto \frac{1}{S}$    (Eq. 3.6)

Actually, the situation is even worse, since the above analysis did not consider second-order effects
such as the fringing component of wire capacitance, which may actually grow with reduced
dimensions. As a result, realistically speaking, power may not scale down at all, but instead may stay
approximately constant with technology scaling or even increase:

$P_{\text{Including 2nd-order effects}} \propto 1 \text{ or more}$    (Eq. 3.7)
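The three first-order scaling regimes derived above can be summarized in a small sketch (S is the factor by which W, L and t_ox are divided):

```python
# Sketch: first-order power-scaling regimes from Eqs. (3.2), (3.4), (3.6).
# S is the dimension-scaling factor.

def power_scaling(s):
    """Power-reduction factor (new power / old power) for each regime."""
    return {
        "fixed performance, fixed voltage": 1 / s,         # Eq. (3.2)
        "fixed performance, variable voltage": 1 / s**5,   # Eq. (3.4)
        "parasitics dominated": 1 / s,                     # Eq. (3.6)
    }

for regime, factor in power_scaling(2.0).items():
    print(f"{regime}: power scales by {factor:g}")
```

For S = 2 the middle regime promises a 32x reduction, while once parasitics dominate the savings fall back to the same 2x as plain capacitance scaling, which is the section's main point.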
The conclusion is that technology scaling offers significant benefits in terms of power only up to a
point. Once parasitics begin to dominate, the power improvements slack off or disappear completely.
So we cannot rely on technology scaling to reduce power indefinitely. We must turn to other
techniques for lowering power consumption.
3.2. Threshold Voltage Reduction
Many process parameters, aside from lithographic dimensions, can have a large impact on circuit
performance. For example, at low supply voltages the value of the threshold voltage (Vt) is extremely
important. Threshold voltage places a limit on the minimum supply voltage that can be used without
incurring unreasonable delay penalties (Fig. 3.1). Based on this, it would seem reasonable to consider
reducing threshold voltages in a low-power process.
Fig. 3.1. Energy and delay as a function of supply voltage
Unfortunately, sub-threshold conduction and noise margin considerations limit how low V_t can be set.
Although devices are ideally off for gate voltages below V_t, in reality there is always some sub-
threshold conduction, even for V_gs < V_t.
The methodology should be applicable not only to different technologies, but also to different circuit
and logic styles. Whenever possible, scaling and circuit techniques should be combined with the high-
level methodology to further reduce power consumption; however, the general low-power strategy
should not require these tricks. The advantages of scaling and low-level techniques cannot be
overemphasized, but they should not be the sole arena from which the designer can extract power
gains.
4. Layout Level Optimizations

There are a number of layout-level techniques that can be applied to reduce power. The simplest of
these techniques is to select upper-level metals to route high-activity signals. The higher-level metals
are physically separated from the substrate by a greater thickness of silicon dioxide. Since the physical
capacitance of these wires decreases with increasing oxide thickness t_ox, there is some advantage to
routing the highest-activity signals in the higher-level metals. For example, in a typical process, metal
three will have about a 30% lower capacitance per unit area than metal two. It should be noted, however,
that the technique is most effective for global rather than local routing, since connecting to a higher-
level metal requires more vias, which add area and capacitance to the circuit. Still, the concept of
associating high-activity signals with low physical-capacitance nodes is an important one and appears
in many different contexts in low-power design.
For example, we can combine this notion with the locality theme to arrive at a general strategy for
low-power placement and routing. The placement and routing problem crops up in many different
guises in VLSI design. Place and route can be performed on pads, functional blocks, standard cells,
gate arrays, etc. Traditional placement involves minimizing area and delay. Minimizing delay, in turn,
translates to minimizing the physical capacitance (or length) of wires.
In contrast, placement for low-power concentrates on minimizing the activity-capacitance product
rather than the capacitance alone. In summary, high activity wires should be kept short or, in a
manner of speaking, local. Tools have been developed that use this basic strategy to achieve about an
18% reduction in power.
Although intelligent placement and routing of standard cells and gate arrays can help to improve their
power efficiency, the locality achieved by low-power place and route tools rarely approaches what can
be achieved by a full-custom design. Design-time issues and other economic factors, however, may in
many cases preclude the use of full-custom design. In these instances, the concepts presented here
regarding low-power placement and routing of standard cells and gate arrays may prove useful.
Moreover, even for custom designs, these low-power strategies can be applied to placement and
routing at the block level.
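A toy sketch of the activity-weighted cost function such a low-power placement tool would minimize, contrasted with the traditional capacitance-only objective; the net names and values are invented for illustration:

```python
# Sketch: low-power placement objective. Traditional place-and-route
# minimizes total wire capacitance (length); low-power placement minimizes
# the activity-capacitance product instead. Values are illustrative.

def placement_cost(nets, low_power=True):
    """nets maps name -> (estimated wire capacitance in fF, switching activity)."""
    if low_power:
        return sum(cap * act for cap, act in nets.values())
    return sum(cap for cap, _ in nets.values())

nets = {
    "clock_en": (120.0, 0.50),   # short but very active
    "data_bus": (250.0, 0.25),   # longer, moderately active
}
print(placement_cost(nets, low_power=False))  # 370.0 (capacitance only)
print(placement_cost(nets, low_power=True))   # 122.5 (activity-weighted)
```

Under the activity-weighted view, a short, highly active net can dominate the cost even when a longer, quieter net dominates the length-driven view; that is exactly the "keep high-activity wires local" strategy described above.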
5. Circuit Level Optimization
In this section, we go beyond the traditional synchronous fully-complementary static CMOS circuit
style to consider the relative advantages and disadvantages of other design strategies; we will
consider five topics relating to low-power circuit design: dynamic logic, pass-transistor logic,
asynchronous logic, transistor sizing, and design style (e.g. full custom versus standard cell).
5.1. Dynamic Logic

In static logic, node voltages are always maintained by a conducting path from the node to one of the
supply rails. In contrast, dynamic logic nodes go through periods during which there is no path to the
rails, and voltages are maintained as charge dynamically stored on nodal capacitances. Figure 5.1
shows an implementation of a complex boolean expression in both static and dynamic logic. In the
dynamic case, the clock period is divided into a pre-charge and an evaluation phase. During pre-
charge, the output is charged to V_dd. Then, during the next clock phase, the NMOS tree evaluates the
logic function and discharges the output node if necessary. Relative to static CMOS, dynamic logic has
both advantages and disadvantages in terms of power.

Historically, dynamic design styles have been touted for their inherent low-power properties. For
example, dynamic design styles often have significantly reduced device counts.
Fig. 5.1. Static and dynamic implementations of F = A(B + C)
Since the logic evaluation function is fulfilled by the NMOS tree alone, the PMOS tree can be replaced
by a single pre-charge device. These reduced device counts result in a corresponding decrease in
capacitive loading, which can lead to power savings. Moreover, by avoiding stacked PMOS
transistors, dynamic logic is amenable to low-voltage operation, where the ability to stack devices is
limited. In addition, dynamic gates don't experience short-circuit power dissipation. Whenever static
circuits switch, a brief pulse of transient current flows from V_dd to ground, consuming power.
Furthermore, dynamic logic nodes are guaranteed to have a maximum of one transition per clock
cycle.
Static gates do not follow this pattern and can experience a glitching phenomenon whereby output
nodes undergo unwanted transitions before settling at their final value. This causes excess power
dissipation in static gates. So in some sense, dynamic logic avoids some of the overhead and waste
associated with fully-complementary static logic.
In practice, however, dynamic circuits have several disadvantages. For instance, each of the pre-
charge transistors in the chip must be driven by a clock signal. This implies a dense clock distribution
network and its associated capacitance and driving circuitry. These components can contribute
significant power consumption to the chip. In addition, with each gate influenced by the clock, issues
of skew become even more important and difficult to handle.
Fig. 5.2. Output activities for static and dynamic logic gates (with random inputs)
Also, the clock is a high (actually, maximum) activity signal, and having it connected to the PMOS
pull-up network can introduce unnecessary activity into the circuit. For commonly used boolean logic
gates, Figure 5.2 shows the probability that the outputs make an energy-consuming (i.e., zero-to-one)
transition for random gate inputs. In all cases, the activity of the dynamic gates is higher than that of
the static gates. We can show that, in general, for any boolean signal X, the activity of a dynamically
pre-charged wire carrying X must always be at least as high as the activity of a statically-driven wire:

dynamic case: $P_{wire}(0 \to 1) = P(X = 0)$
static case: $P_{wire}(0 \to 1) = P(X_t = 1 \mid X_{t-1} = 0)\, P(X_{t-1} = 0) \le P(X = 0)$    (Eq. 5.1)
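Eq. (5.1) can be checked numerically for a 2-input NAND with independent, uniformly random inputs:

```python
# Sketch: checking Eq. (5.1) for a 2-input NAND with random, independent
# inputs. Static wire: an energy-consuming 0->1 transition needs consecutive
# outputs 0 then 1. Dynamic (pre-charged) wire: recharged 0->1 whenever the
# evaluation phase pulled it low, i.e. whenever the function evaluated to 0.
import itertools

def nand(a, b):
    return 0 if (a and b) else 1

p0 = sum(nand(a, b) == 0 for a, b in itertools.product([0, 1], repeat=2)) / 4
p1 = 1 - p0

static_activity = p0 * p1    # P(X_{t-1}=0) * P(X_t=1), independent cycles
dynamic_activity = p0        # one pre-charge per cycle the output was low

print(static_activity, dynamic_activity)  # 0.1875 0.25
assert dynamic_activity >= static_activity
```

For the NAND, P(X=0) = 1/4, so the dynamic wire switches in 1 cycle out of 4 versus 3 out of 16 for the static wire, matching the ordering in Figure 5.2.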
In conclusion, dynamic logic has certain advantages and disadvantages for low-power operation. The
key is to determine which of the conflicting factors is dominant. In certain cases, a dynamic
implementation might actually achieve a lower overall power consumption. Furthermore, the savings
in terms of glitching and short-circuit power, while possibly significant, can also be achieved in static
logic through other means (discussed in Section 6). All of this, coupled with the robustness of static
logic at low voltages, gives the designer little incentive to select a dynamic implementation for a low-
power system.
5.2. Pass-Transistor Logic
As with dynamic logic, pass-transistor logic offers the possibility of reduced transistor counts. Figure
5.3 illustrates this fact with an equivalent pass-transistor implementation of the static logic function of
Figure 5.1. Once again, the reduction in transistors results in lower capacitive loading from devices.
This might make pass-transistor logic attractive as a low-power circuit style.
Fig. 5.3. Complementary pass-transistor implementations of F = A(B + C)
Like dynamic logic, however, pass-transistor circuits suffer from several drawbacks. First, pass
transistors have asymmetrical voltage driving capabilities. For example, NMOS transistors do not
pass high voltages efficiently, and experience reduced current drive as well as a Vt drop at the
output. If the output is used to drive a PMOS gate, static power dissipation can result.
These flaws can be remedied by using additional hardware: for instance, complementary transmission
gates consisting of an NMOS and a PMOS pass transistor in parallel, or a level-restoring circuit, as
shown in Figure 5.4. Unfortunately, this forfeits the power savings offered by reduced device counts.
Also, efficient layout of pass-transistor networks can be problematic. Sharing of source/drain diffusion
regions is often not possible, resulting in increased parasitic junction capacitances.
In summary, there may be situations in which pass-transistor logic is more power efficient than fully-
complementary logic; however, the benefits are likely to be small relative to the orders of magnitude
savings possible from higher level techniques. So, again, circuit-level power saving techniques should
be used whenever appropriate, but should be subordinate to higher level considerations.
Fig. 5.4. Level-restoring Circuit
5.3. Asynchronous Logic
Asynchronous logic refers to a circuit style employing no global clock signal for synchronization.
Instead, synchronization is provided by handshaking circuitry used as an interface between gates (see
Figure 5.5). While more common at the system level, asynchronous logic has failed to gain acceptance
at the circuit level, mainly on area and performance grounds. It is worthwhile to
reevaluate asynchronous circuits in the context of low power.
Fig. 5.5. Asynchronous circuit with handshaking
Typically, asynchronous circuits are classified as shown in Figure 5.6.
The self-timed concept is based on an architecture in which there are registers, arithmetic logic units,
control units, and control signals, but no clock signal. The sequence of computations is managed by
local synchronization signals (see Figure 5.7).
Fig. 5.6. Classification of asynchronous circuits
Fig. 5.7. Example of a self-timed system
Strictly speaking, self-timed systems are a subset of asynchronous systems, which in general have no
global clock signal. Within the self-timed class there is a set of systems, called speed independent,
which work properly regardless of the delays of their internal components, though not of the delays of
their interconnections (Fig. 5.8).
Fig. 5.8. Example of a speed independent system
Fig. 5.9. Example of a delay insensitive system
Within the speed-independent class there is a set of systems, called delay insensitive, which work
properly regardless of the delays of both their internal components and their interconnections (Fig. 5.9).
The primary power advantages of asynchronous logic can be classified as avoiding waste. The clock signal
in synchronous logic contains no information; therefore, power associated with the clock driver and
distribution network is in some sense wasted. Avoiding this power consumption component might offer
significant benefits. In addition, asynchronous logic uses completion signals, thereby avoiding
glitching, another form of wasted power. Finally, with no clock signal and with computation triggered
by the presence of new data, asynchronous logic contains a sort of built in power-down mechanism for
idle periods.
While asynchronous logic sounds like the ideal low-power design style, several issues impede its acceptance
in low-power arenas. Depending on the size of its associated logic structure, the overhead of the
handshake interface and completion signal generation circuitry can be large in terms of both area and
power.
Since this circuitry does not contribute to the actual computations, transitions on handshake signals
are wasted. This is similar to the waste due to clock power consumption, though it is not as severe
since handshake signals have lower activity than clocks. Finally, fewer design tools support
asynchronous design than synchronous design, making asynchronous circuits more difficult to design.
At the small granularity with which it is commonly implemented, the overhead of the asynchronous
interface circuitry dominates over the power-saving attributes of the design style. It should be
emphasized, however, that this is mainly a function of the granularity of the handshaking circuitry. It
would certainly be worthwhile to consider using asynchronous techniques to eliminate the necessity of
distributing a global clock between blocks of larger granularity. For example, large modules could
operate synchronously off local clocks, but communicate globally using asynchronous interfaces. In
this way, the interface circuitry would represent a very small overhead component, and the most
power consuming aspects of synchronous circuitry (i.e. global clock distribution) would be avoided.
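The handshaking interface discussed above can be sketched behaviorally. The following Python model of a four-phase (return-to-zero) protocol is a hypothetical illustration - the function and its transition accounting are assumptions, not taken from the text - but it makes the overhead explicit: every transfer spends four handshake-signal transitions that carry no data.

```python
def four_phase_transfer(data_items):
    """Model a sender/receiver pair using four-phase handshaking.
    Each item costs four signal transitions: req up, ack up, req down,
    ack down. Returns the received items and the transition count."""
    received = []
    transitions = 0
    for item in data_items:
        transitions += 1       # sender raises req (data valid)
        received.append(item)  # receiver latches the data ...
        transitions += 1       # ... and raises ack
        transitions += 1       # sender lowers req
        transitions += 1       # receiver lowers ack
    return received, transitions
```

At fine granularity this fixed per-transfer cost dominates; amortized over large block-to-block transfers, as in the locally synchronous scheme just described, it becomes a very small overhead component.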
5.4. Transistor Sizing
Regardless of the circuit style employed, the issue of transistor sizing for low power arises. The
primary trade-off involved is between performance and cost - where cost is measured by area and
power. Transistors with larger gate widths provide more current drive than smaller transistors.
Unfortunately, they also contribute more device capacitance to the circuit and, consequently, result in
higher power dissipation. Moreover, larger devices experience more severe short-circuit currents,
which should be avoided whenever possible.
In addition, if all devices in a circuit are sized up, then the loading capacitance increases in the same
proportion as the current drive, resulting in little performance improvement beyond the point of
overcoming fixed parasitic capacitance components. In this sense, large transistors become self-loading,
and the benefit of large devices must be reevaluated. A sensible low-power strategy is to use
minimum size devices whenever possible. Along the critical path, however, devices should be sized up
to overcome parasitics and meet performance requirements.
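The self-loading argument can be captured in a first-order model. The Python sketch below uses invented, unit-free constants (an assumption for illustration): drive current and self-capacitance both scale with the width W, so delay approaches a floor while switching energy grows without bound.

```python
def delay(w, c_fixed=10.0, c_per_w=1.0, i_per_w=1.0):
    """First-order gate delay ~ (fixed parasitic load + self-load) / drive:
    (c_fixed + c_per_w * w) / (i_per_w * w)."""
    return (c_fixed + c_per_w * w) / (i_per_w * w)

def switching_energy(w, c_fixed=10.0, c_per_w=1.0, vdd=1.0):
    """Switching energy ~ total capacitance * VDD^2, linear in w."""
    return (c_fixed + c_per_w * w) * vdd ** 2
```

With these constants, doubling W from 10 to 20 cuts delay from 2.0 to 1.5, but doubling again only reaches 1.25 while energy climbs from 30 to 50: the diminishing-returns regime in which upsizing should be reserved for the critical path.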
5.5. Design Style
Another decision which can have a large impact on the overall chip power consumption is selection of
design style: e.g. full custom, gate array, standard cell, etc. Not surprisingly, full-custom design offers
the best possibility of minimizing power consumption. In a custom design, all the principles of low-
power including locality, regularity, and sizing can be applied optimally to individual circuits.
Unfortunately, this is a costly alternative in terms of design time, and can rarely be employed
exclusively as a design strategy. Other possible design styles include gate arrays and standard cells.
Gate arrays offer one alternative for reducing design cycles at the expense of area, power, and
performance. While not offering the flexibility of full-custom design, gate-array CAD tools could
nevertheless be altered to place increased emphasis on power. For example, gate arrays offer some
control over transistor sizing through the use of parallel transistor connections.
Standard cell synthesis is another commonly employed strategy for reducing design time. Current
standard cell libraries and tools, however, offer little hope of achieving low-power operation. In many
ways, standard cells represent the antithesis of a low-power methodology. First and foremost,
standard cells are often severely oversized. Most standard cell libraries were designed for maximum
performance and worst-case loading from inter-cell routing. As a result, they experience significant
self-loading and waste correspondingly significant amounts of power. To overcome this difficulty,
standard cell libraries must be expanded to include a selection of cells of identical functionality, but
varying driving strengths. With this in place, synthesis tools could select the smallest (and lowest-power)
cell required to meet timing constraints, while avoiding the wasted power associated with
oversized transistors. In addition, the standard cell layout style with its islands of devices and
extensive routing channels tends to violate the principles of locality central to low-power design.
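The proposed library expansion can be sketched as a simple selection rule. The cell names and numbers below are invented for illustration: several functionally identical NAND cells differ only in drive strength, and a synthesis tool picks the lowest-energy variant that still meets the path's timing budget.

```python
# Hypothetical 2-input NAND variants: (name, delay in ns at the given
# load, switching energy in fJ). All values are invented.
NAND2_VARIANTS = [
    ("NAND2_X1", 0.40, 1.0),
    ("NAND2_X2", 0.25, 1.8),
    ("NAND2_X4", 0.15, 3.2),
]

def pick_cell(timing_budget_ns, variants=NAND2_VARIANTS):
    """Lowest-energy variant meeting the budget; fall back to the
    fastest cell if the budget cannot be met at all."""
    feasible = [v for v in variants if v[1] <= timing_budget_ns]
    if feasible:
        return min(feasible, key=lambda v: v[2])
    return min(variants, key=lambda v: v[1])
```

A relaxed budget selects the small X1 cell, a tight one forces the oversized X4 drive; the point of the expanded library is that only genuinely critical paths pay the X4 energy price.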
5.6. Circuit Level Conclusion
Clearly, numerous circuit-level techniques are available to the low-power designer. These techniques
include careful selection of a circuit style: static vs. dynamic, synchronous vs. asynchronous, fully-
complementary vs. pass-transistor, etc. Other techniques involve transistor sizing or selection of a
design methodology such as full-custom or standard cell. Some of these techniques can be applied in
conjunction with higher level power reduction techniques. When possible, designers should take
advantage of this fact and exploit both low and high-level techniques in concert. Often, however,
circuit-level techniques will conflict with the low-power strategies based on higher abstraction levels.
In these cases, the designer must determine which techniques offer the largest power reductions. As
evidenced by the previous discussion, circuit-level techniques typically offer reductions of a factor of
two or less, while some higher-level strategies, with their more global impact, can produce savings of
an order of magnitude or more. In such situations, considerations imposed by the higher-level
technique should dominate and the designer should employ those circuit-level methodologies most
amenable to the selected high-level strategy.
6. Logic and Architecture Level Optimizations
Logic-level power optimization has been extensively researched in the last few years. Given the
complexity of modern digital devices, hand-crafted logic-level optimization is extremely expensive in
terms of design time and effort. Hence, it is cost-effective only for structured logic in large-volume
components, like microprocessors (e.g., functional units in the data-path). Fortunately, several
optimizations for low power have been automated and are now available in commercial logic synthesis
tools, enabling logic-level power optimization even for unstructured logic and for low-volume VLSI
circuits. During logic optimization, technology parameters such as supply voltage are fixed, and the
degrees of freedom are in selecting the functionality and sizing the gates implementing a given logic
specification. As for technology and circuit-level techniques, power is never the only cost metric of
interest. In most cases, performance is tightly constrained as well.
6.1. Logic Level Optimizations
A common setting is constrained power optimization, where a logic network can be transformed to
minimize power only if critical path length is not increased. Under this hypothesis, an effective
technique is based on path equalization.
Path equalization ensures that signal propagation from inputs to outputs of a logic network follows
paths of similar length. When paths are equalized, most gates have aligned transitions at their inputs,
thereby minimizing spurious switching activity (which is created by misaligned input transitions). This
technique is very helpful in arithmetic circuits, such as adders or multipliers.
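A toy model conveys why equalization suppresses glitches. The bound below is a simplifying assumption (a gate may toggle once per distinct input arrival time before settling), not a result from the text, but it captures the mechanism: aligned arrivals permit only the single useful transition.

```python
def worst_case_output_transitions(arrival_times):
    """Crude upper bound on a gate's output transitions in one cycle:
    one per distinct input arrival time. Fully equalized paths give a
    single, glitch-free transition."""
    return len(set(arrival_times))
```

For a 3-input gate, skewed arrivals allow up to three output transitions per cycle, while equalized arrivals allow one; only the last transition does useful work, so the difference is pure wasted switching power.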
Glue logic and controllers have much more irregular structure than arithmetic units, and their gate-
level implementations are characterized by a wide distribution of path delays. These circuits can be
optimized for power by resizing. Resizing focuses on fast combinational paths. Gates on fast paths
are down-sized, thereby decreasing their input capacitances, while at the same time slowing down
signal propagation. By slowing down fast paths, propagation delays are equalized, and power is
reduced by joint spurious-switching and capacitance reduction. Resizing does not always imply
down-sizing. Power can also be reduced by enlarging (or buffering) heavily loaded gates, to increase their
output slew rates. Fast transitions minimize short-circuit power of the gates in the fan-out of the gate
which has been sized up, but its input capacitance is increased. In most cases, resizing is a complex
optimization problem involving a tradeoff between output switching power and internal short-circuit
power on several gates at the same time.
Other logic-level power minimization techniques are re-factoring, remapping, phase assignment and
pin swapping. All these techniques can be classified as local transformations. They are applied on
gate netlists, and focus on nets with large switched capacitance.
Most of these techniques replace a gate, or a small group of gates, around the target net, in an effort
to reduce capacitance and switching activity. Similarly to resizing, local transformations must
carefully balance short circuit and output power consumption.
Fig. 6.1. Local transformations: (a) re-mapping, (b) phase assignment, (c) pin swapping
Figure 6.1 shows three examples of local transformations. In (a) a re-mapping transformation is
shown, where a high-activity node (marked with x) is removed thanks to a new mapping onto an
AND-OR gate. In (b), phase assignment is exploited to eliminate one of the two high-activity nets
marked with x. Finally, pin swapping is applied in (c) to connect a high-activity net with the input
pin of the 4-input NAND with minimum input capacitance.
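Pin swapping, in particular, reduces to a tiny assignment problem. In the Python sketch below (activities and capacitances are invented), switching power on each net scales with its activity times the input capacitance of the pin it drives, so pairing high-activity nets with low-capacitance pins minimizes the total.

```python
def min_switched_capacitance(net_activities, pin_capacitances):
    """Pair nets with functionally equivalent input pins so that the
    activity-weighted capacitance sum is minimized: sort activities
    descending and capacitances ascending (optimal for this cost by
    the rearrangement inequality)."""
    nets = sorted(net_activities, reverse=True)
    pins = sorted(pin_capacitances)
    return sum(a * c for a, c in zip(nets, pins))
```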
6.2. Architecture Level Optimizations
Complex digital circuits usually contain units (or parts thereof) that are not performing useful
computations at every clock cycle. Think, for example, of arithmetic units or register files within a
microprocessor or, more simply, of the registers of an ordinary sequential circuit. The idea, known for a
long time in the community of IC designers, is to disable the logic which is not in use during some
particular clock cycles, with the objective of limiting power consumption. In fact, stopping certain
units from making useless transitions causes a decrease in the overall switched capacitance of the
system, thus reducing the switching component of the power dissipated. Optimization techniques based
on the principle above belong to the broad class of dynamic power management (DPM) methods.
The natural domain of applicability of DPM is system-level design; therefore, it will be discussed in
greater detail in the next section. Nevertheless, this paradigm has also been successfully adopted in
the context of architectural optimization.
Clock gating provides a way to selectively stop the clock, and thus force the original circuit to make no
transition, whenever the computation to be carried out by a hardware unit at the next clock cycle is
useless. In other words, the clock signal is disabled in accordance with the idle conditions of the unit.
As an example of use of the clock-gating strategy, consider the traditional block diagram of a
sequential circuit, shown on the left of Figure 6.2.
Fig. 6.2. Example of gated clock architecture
It consists of a combinational logic block and an array of state registers which are fed by the next-
state logic and which provide some feed-back information to the combinational block itself through the
present-state input signals. The corresponding gated-clock architecture is shown on the right of the
figure. The circuit is assumed to have a single clock, and the registers are assumed to be edge-
triggered flip-flops. The combinational block Fa is controlled by the primary inputs, the present-state
inputs, and the primary outputs of the circuit, and it implements the activation function of the clock
gating mechanism. Its purpose is to selectively stop the local clock of the circuit when no state or
output transition takes place. The block named L is a latch, transparent when the global clock signal
CLK is inactive. Its presence is essential for correct operation of the system, since it takes care of
filtering glitches that may occur at the output of block Fa.
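The gated-clock scheme of Figure 6.2 can be mimicked behaviorally. The Python sketch below is an illustrative assumption (a trivial "hold" next-state function and invented names), not the text's circuit: the activation function flags cycles in which no state or output change would occur, and only the remaining cycles deliver a clock edge to the register.

```python
def run_gated_register(inputs, state=0):
    """Count clock edges seen by a register with and without gating.
    'None' models a cycle with no new data; an input equal to the
    current state would also cause no transition, so the activation
    function Fa gates the clock off in both cases."""
    ungated_edges = 0
    gated_edges = 0
    for x in inputs:
        ungated_edges += 1                  # free-running clock: edge every cycle
        idle = (x is None) or (x == state)  # activation function Fa
        if not idle:
            gated_edges += 1                # glitch-filtered enable passes the edge
            state = x
    return state, ungated_edges, gated_edges
```

With the input sequence [1, 1, None, 2, 2] the register receives 2 edges instead of 5; in a real design the saved edges translate into saved clock-distribution and register switching power.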
The clock management logic is synthesized from the Boolean function representing the idle conditions
of the circuit. It may well be the case that considering all such conditions results in additional circuitry
that is too large and power-consuming. It may then be necessary to synthesize a simplified function
that dissipates as little power as possible while still stopping the clock with maximum efficiency. Because
of its effectiveness, clock gating has been applied extensively in real designs, and it has lately found its
way into industrial-strength CAD tools (e.g., Power Compiler by Synopsys).
Power savings obtained by gating the clock distribution network of some hardware resources come at
the price of a global decrease in performance. In fact, resuming the operation of an inactive resource
introduces a latency penalty that negatively impacts system speed. In other words, with clock gating
(or with any similar DPM technique), performance and throughput of an architecture are traded for
power.
7. Software and System Level Optimizations
Electronic systems and subsystems consist of hardware platforms with several software layers. Many
system features depend on the hardware/software interaction, e.g., programmability and flexibility,
performance and energy consumption. Software does not consume energy per se, but it is the
execution and storage of software that requires energy consumption by the underlying hardware.
Software execution corresponds to performing operations on hardware, as well as accessing and
storing data.
Thus, software execution involves power dissipation for computation, storage, and communication.
Moreover, storage of computer programs in semiconductor memories requires energy (refresh of
DRAMs, static power for SRAMs).
The energy budget for storing programs is typically small (with the choice of appropriate components)
and predictable at design time. Thus, we will concentrate on energy consumption of software during
its execution. Nevertheless, it is important to remember that reducing the size of programs, which is a
usual objective in compilation, correlates with reducing their energy storage costs. Additional
reduction of code size can be achieved by means of compression techniques. The energy cost of
executing a program depends on its machine code and on the hardware architecture parameters.
The machine code is derived from the source code through compilation. Typically, the energy cost of the
machine code is affected by the back-end of software compilation, that controls the type, number and
order of operations, and by the means of storing data, e.g., locality (registers vs. memory arrays),
addressing, and ordering. Nevertheless, some architecture-independent optimizations can be useful in
general to reduce energy consumption, e.g., selective loop unrolling and software pipelining.
Software instructions can be characterized by the number of cycles needed to execute them and by the
energy required per cycle. The energy consumed by an instruction depends weakly on the state of the
processor (i.e., on the previously executed instruction).
On the other hand, the energy varies significantly when the instruction requires storage in registers or
in memory (caches).
The traditional goal of a compiler is to speed up the execution of the generated code, by reducing the
code size (which correlates with execution latency) and minimizing spills to memory.
Interestingly enough, executing machine code of minimum size would consume the minimum energy, if
we neglect the interaction with memory and we assume a uniform energy cost of each instruction.
Energy-efficient compilation strives to achieve machine code that requires less energy than the code
produced by a traditional, performance-driven compiler, by leveraging the non-uniformity in instruction energy
cost, and the different energy costs for storage in registers and in main memory due to addressing and
address decoding. Nevertheless, results are sometimes contradictory.
Whereas for some architectures energy-efficient compilation gives a competitive advantage as
compared to traditional compilation, for some others the most compact code is also the most
economical in terms of energy, thus obviating the need of specific low-power compilers.
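A minimal instruction-level energy model illustrates the leverage an energy-aware compiler has. All per-instruction costs below are invented: each opcode has a base energy, and memory operands pay an extra penalty for addressing and cache access, so keeping values in registers is rewarded.

```python
# Hypothetical per-instruction base energies (nJ); all values invented.
ENERGY_NJ = {"add": 1.0, "mul": 2.0, "load": 4.0, "store": 4.5}
MEMORY_OPS = {"load", "store"}

def program_energy(instructions, mem_penalty_nj=1.5):
    """Total energy: base cost per instruction plus a memory-access
    penalty, reflecting the register-vs-memory storage cost gap."""
    total = 0.0
    for op in instructions:
        total += ENERGY_NJ[op]
        if op in MEMORY_OPS:
            total += mem_penalty_nj  # addressing + cache access overhead
    return total
```

Under this model, a register-allocated sequence ["load", "add", "add"] costs 7.5 nJ while a spilled version ["load", "add", "load", "add"] costs 13.0 nJ; minimizing spills, as the text notes, saves both time and energy.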
Power-aware operating systems (OSs) trade generality for energy efficiency. In the case of embedded
electronic systems, OSs are streamlined to support just the required applications. On the other hand,
such an approach may not be applicable to OSs for personal computers, where the user wants to
retain the ability to execute a wide variety of applications.
Energy efficiency in an operating system can be achieved by designing an energy aware task
scheduler. Usually, a scheduler determines the set of start times for each task, with the goal of
optimizing a cost function related to the completion time of all tasks, while satisfying real-time
constraints, if applicable. Since tasks are associated with resources having specific energy models, the
scheduler can exploit this information to reduce run-time power consumption.
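One simple way an energy model enters scheduling is through wake-up amortization. In the hypothetical Python sketch below (all costs invented), each contiguous burst of activity on a resource pays a fixed wake-up energy plus run energy, so a scheduler that clusters tasks into fewer, longer bursts lowers the total.

```python
def schedule_energy(burst_lengths, wake_cost=3.0, run_cost_per_unit=1.0):
    """Energy of a schedule described by its active-burst lengths: each
    burst pays one wake-up cost plus energy proportional to its length."""
    return sum(wake_cost + run_cost_per_unit * b for b in burst_lengths)
```

Two 2-unit tasks run back to back (one burst of length 4) cost 7.0, while the same tasks in two separate bursts cost 10.0: clustering work lengthens idle periods and saves the second wake-up.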
Operating systems achieve major energy savings by implementing dynamic power management (DPM)
of the system resources. DPM dynamically reconfigures an electronic system to provide the requested
services and performance levels with a minimum number of active components or a minimum load on
such components. Dynamic power management encompasses a set of techniques that achieve energy-
efficient computation by selectively shutting down or slowing down system components when they are
idle (or partially unexploited). DPM can be implemented in different forms including, but not limited
to, clock gating, clock throttling, supply voltage shut-down, and dynamically varying power supplies.
Several system-level design trade-offs can be explored to reduce energy consumption. Some of these
design choices belong to the domain of hardware/software co-design, and leverage the migration of
hardware functions to software or vice versa. For example, the Advanced Configuration and Power
Interface (ACPI) standard, initiated by Intel, Microsoft and Toshiba, provides a portable hw/sw
interface that makes it easy to implement DPM policies for personal computers in software.
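The core shut-down decision in DPM is a break-even computation. The sketch below states the standard rule with invented numbers: sleeping pays off only when the idle interval is long enough to amortize the energy of the sleep/wake transition.

```python
def break_even_time(p_on, p_sleep, e_transition):
    """Idle duration beyond which sleeping wins: solve
    p_on * t = p_sleep * t + e_transition for t."""
    return e_transition / (p_on - p_sleep)

def should_sleep(idle_time, p_on=2.0, p_sleep=0.1, e_transition=3.8):
    """Oracle policy: sleep only during idle periods longer than the
    break-even time (a real DPM policy must predict idle_time)."""
    return idle_time > break_even_time(p_on, p_sleep, e_transition)
```

With the example numbers the break-even time is 2.0 time units; since a real policy cannot observe the idle time in advance, practical DPM relies on predictive or stochastic policies such as those surveyed in [28].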
Conclusions
Electronic design aims at striking a balance between performance and power efficiency. Designing
low-power applications is a multi-faceted problem, because of the plurality of embodiments that a
system specification may have and the variety of degrees of freedom that designers can exploit to
reduce power. In this brief tutorial, we showed different design options and the corresponding
advantages and disadvantages. We tried to relate general-purpose low-power design solutions to a few
successful chips that use them to various extents. Even though we described only a few samples of
design techniques and implementations, we think that our samples are representative of the state of the
art of current technologies and can suggest future developments and improvements.
References
[1] J. Rabaey and M. Pedram, Low Power Design Methodologies. Kluwer, 1996.
[2] J. Mermet and W. Nebel, Low Power Design in Deep Submicron Electronics. Kluwer, 1997.
[3] A. Chandrakasan and R. Brodersen, Low-Power CMOS Design. IEEE Press, 1998.
[4] T. Burd and R. Brodersen, "Processor Design for Portable Systems," Journal of VLSI Signal
Processing Systems, vol. 13, no. 2-3, pp. 203-221, August 1996.
[5] D. Ditzel, "Transmeta's Crusoe: Cool Chips for Mobile Computing," Hot Chips Symposium,
August 2000.
[6] J. Montanaro, et al., "A 160-MHz, 32-b, 0.5-W CMOS RISC Microprocessor," IEEE Journal of
Solid-State Circuits, vol. 31, no. 11, pp. 1703-1714, November 1996.
[7] V. Lee, et al., "A 1-V Programmable DSP for Wireless Communications," IEEE Journal of
Solid-State Circuits, vol. 32, no. 11, pp. 1766-1776, November 1997.
[8] M. Takahashi, et al., "A 60-mW MPEG4 Video Codec Using Clustered Voltage Scaling with
Variable Supply-Voltage Scheme," IEEE Journal of Solid-State Circuits, vol. 33, no. 11, pp. 1772-1780,
November 1998.
[9] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, "Low-Power CMOS Digital Design," IEEE
Journal of Solid-State Circuits, vol. 27, no. 4, pp. 473-484, April 1992.
[10] F. Najm, "A Survey of Power Estimation Techniques in VLSI Circuits," IEEE Transactions on
VLSI Systems, vol. 2, no. 4, pp. 446-455, December 1994.
[11] M. Pedram, "Power Estimation and Optimization at the Logic Level," International Journal of
High-Speed Electronics and Systems, vol. 5, no. 2, pp. 179-202, 1994.
[12] P. Landman, "High-Level Power Estimation," ISLPED-96: ACM/IEEE International Symposium
on Low Power Electronics and Design, pp. 29-35, Monterey, California, August 1996.
[13] E. Macii, M. Pedram, and F. Somenzi, "High-Level Power Modeling, Estimation, and Optimization,"
IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, no. 11, pp.
1061-1079, November 1998.
[14] S. Borkar, "Design Challenges of Technology Scaling," IEEE Micro, vol. 19, no. 4, pp. 23-29,
July-August 1999.
[15] S. Thompson, P. Packan, and M. Bohr, "MOS Scaling: Transistor Challenges for the 21st
Century," Intel Technology Journal, Q3, 1998.
[16] Z. Chen, J. Shott, and J. Plummer, "CMOS Technology Scaling for Low Voltage Low Power
Applications," ISLPE-98: IEEE International Symposium on Low Power Electronics, pp. 56-57, San
Diego, CA, October 1994.
[17] Y. Ye, S. Borkar, and V. De, "A New Technique for Standby Leakage Reduction in High-Performance
Circuits," 1998 Symposium on VLSI Circuits, pp. 40-41, Honolulu, Hawaii, June 1998.
[18] M. Pedram, "Power Minimization in IC Design: Principles and Applications," ACM Transactions
on Design Automation of Electronic Systems, vol. 1, no. 1, pp. 3-56, January 1996.
[19] B. Chen and I. Nedelchev, "Power Compiler: A Gate Level Power Optimization and Synthesis
System," ICCD-97: IEEE International Conference on Computer Design, pp. 74-79, Austin, Texas,
October 1997.
[20] L. Benini, P. Siegel, and G. De Micheli, "Automatic Synthesis of Gated Clocks for Power
Reduction in Sequential Circuits," IEEE Design and Test of Computers, vol. 11, no. 4, pp. 32-40,
December 1994.
[21] Y. Yoshida, B.-Y. Song, H. Okuhata, T. Onoye, and I. Shirakawa, "An Object Code Compression
Approach to Embedded Processors," ISLPED-98: ACM/IEEE International Symposium on Low Power
Electronics and Design, pp. 265-268, Monterey, California, August 1997.
[22] L. Benini, A. Macii, E. Macii, and M. Poncino, "Selective Instruction Compression for Memory
Energy Reduction in Embedded Systems," ISLPED-99: ACM/IEEE International Symposium on
Low Power Electronics and Design, pp. 206-211, San Diego, California, August 1999.
[23] H. Lekatsas and W. Wolf, "Code Compression for Low Power Embedded Systems," DAC-37:
ACM/IEEE Design Automation Conference, pp. 294-299, Los Angeles, California, June 2000.
[24] S. Segars, K. Clarke, and L. Goudge, "Embedded Control Problems, Thumb and the
ARM7TDMI," IEEE Micro, vol. 15, no. 5, pp. 22-30, October 1995.
[25] D. Brooks, et al., "Power-Aware Microarchitecture: Design and Modeling Challenges for
Next-Generation Microprocessors," IEEE Micro, vol. 20, no. 6, pp. 26-44, November 2000.
[26] L. Benini and G. De Micheli, Dynamic Power Management: Design Techniques and CAD Tools.
Kluwer, 1997.
[27] Intel, SA-1100 Microprocessor Technical Reference Manual, 1998.
[28] L. Benini, A. Bogliolo, and G. De Micheli, "A Survey of Design Techniques for System-Level
Dynamic Power Management," IEEE Transactions on VLSI Systems, vol. 8, no. 3, pp. 299-316,
June 2000.
Some Interesting Links
1) Center for Low Power Electronics:
http://clpe.ece.arizona.edu/
2) Bibliography on Dynamic Power Management:
http://www.cse.unsw.edu.au/~danielp/cs1/power/files/bib.shtml
3) European Low Power Initiative for Electronic System Design:
http://www.ddtc.dimes.tudelft.nl/LowPower/index_f.html
4) Low Power IP Library:
http://www.ee.ed.ac.uk/~SLIg/iplibrary.html