Low Power Principles

Upload: vermajiii

Post on 06-Apr-2018

  • 8/3/2019 Low Power Principles

    1/58

Low Power Principles

Author: Agatino Pennisi

[[email protected]]

Low Power Architectures and Design

AST-Lab Catania


    Index

1. Introduction

2. Basic Principles

2.1. Sources of Power Consumption

2.2. Switching Power

2.3. Short-Circuit Power

2.4. Static Power

2.5. Power-Delay and Energy-Delay Products

3. Technology Level Optimizations

3.1. Technology Scaling

3.2. Threshold Voltage Reduction

3.3. Technology Level Conclusion

4. Layout Level Optimizations

5. Circuit Level Optimizations

5.1. Dynamic Logic

5.2. Pass-Transistor Logic

5.3. Asynchronous Logic

5.4. Transistor Sizing

5.5. Design Style

5.6. Circuit Level Conclusion

6. Logic and Architecture Level Optimizations

6.1. Logic Level Optimizations

6.2. Architecture Level Optimizations

7. Software and System Level Optimizations

Conclusions

References


    1. Introduction

The growing market of mobile, battery-powered electronic systems (e.g., cellular phones, personal digital assistants, etc.) demands the design of microelectronic circuits with low power dissipation that can be powered by lightweight batteries with long times between re-charges.

The power consumed by a circuit is defined as p(t) = i(t)·v(t), where i(t) is the instantaneous current provided by the power supply and v(t) is the instantaneous supply voltage. Power minimization targets either the maximum instantaneous power or the average power. The latter impacts battery lifetime and the cost of the heat-dissipation system; the former constrains the design of the power grid and the power-supply circuits.

    It is important to stress from the outset that power minimization is never the only objective in real-

    life designs. Performance is always a critical metric that cannot be neglected. Unfortunately, in most

    cases, power can be reduced at the price of some performance degradation. For this reason, several

metrics for joint power-performance evaluation have been proposed in the past. In many designs, the power-delay product (i.e., energy) is an acceptable metric. Energy minimization rules out design choices that


heavily compromise performance to reduce power consumption. When performance has priority over power consumption, the energy-delay product (which is equivalent to power × delay²) can be adopted to tightly control performance degradation.

    Besides power vs. performance, another key trade-off in VLSI design is power vs. flexibility.

    Several authors have observed that application specific designs are orders of magnitude more power

    efficient than general-purpose systems programmed to perform the same computation. On the other

    hand, flexibility (programmability) is often an indispensable requirement, and designers must strive to

    achieve maximum power efficiency without compromising flexibility.


    2. Basic Principles

2.1. Sources of Power Consumption

The three major sources of power dissipation in a digital CMOS circuit are:

P = P_Switching + P_Short-Circuit + P_Leakage    (Eq. 2.1)

Fig. 2.1. Sources of power consumption


P_Switching, also called switching power, is due to charging and discharging the capacitors driven by the circuit.

P_Short-Circuit, called short-circuit power, is caused by the short-circuit currents that arise when pairs of PMOS/NMOS transistors are conducting simultaneously.

P_Leakage originates from substrate injection and subthreshold effects. For older technologies (0.8 µm and above), P_Switching was predominant. For deep-submicron processes, P_Leakage becomes more important.

Design for low power implies the ability to reduce all three components.

Optimizations can be achieved by facing the power problem from different perspectives: design and technology. Enhanced design capabilities mostly impact switching and short-circuit power; technology improvements, on the other hand, contribute to reductions of all three components.


    2.2. Switching Power

    Switching power for a CMOS gate working in a synchronous environment is modeled by the following

    equation:

P_Switching = (1/2) · C_L · V_DD^2 · f_Clock · E_SW    (Eq. 2.2)

where C_L is the output load of the gate, V_DD is the supply voltage, f_Clock is the clock frequency, and E_SW is the switching activity of the gate, defined as the probability of the gate's output making a logic transition during one clock cycle.

Reductions of P_Switching are achievable by:

1. supply voltage scaling

2. frequency scaling

3. minimization of switched capacitance
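As a numerical sketch of Eq. (2.2), the switching-power model can be evaluated directly. The component values below (50 fF load, 2.5 V supply, 200 MHz clock, 10% activity) are illustrative assumptions, not figures from the slides:

```python
# Switching-power model of Eq. (2.2): P = 1/2 * C_L * V_DD^2 * f_Clock * E_SW.
# All component values are illustrative assumptions.

def switching_power(c_load, v_dd, f_clock, e_sw):
    """Average switching power of a gate in a synchronous design."""
    return 0.5 * c_load * v_dd ** 2 * f_clock * e_sw

# Example: 50 fF load, 2.5 V supply, 200 MHz clock, 10% switching activity.
p = switching_power(50e-15, 2.5, 200e6, 0.1)        # -> 3.125e-06 W

# Halving V_DD cuts P by 4x (quadratic); halving f_Clock cuts it by 2x (linear).
p_low_v = switching_power(50e-15, 1.25, 200e6, 0.1)
p_low_f = switching_power(50e-15, 2.5, 100e6, 0.1)
```

The two follow-up calls reproduce the quadratic-vs-linear trade-off discussed next: voltage scaling is the more powerful knob, but both cost speed.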


    1. Supply Voltage Scaling

advantage: scaling V_DD scales P_Switching down quadratically

drawback: scaling V_DD lowers circuit speed (decreasing circuit performance)

To compensate for the decrease in circuit performance introduced by the reduced voltage, speed optimization is applied first, followed by supply voltage scaling, which brings the design back to its original timing, but with a lower power requirement.

    2. Frequency Scaling

advantage: scaling f_Clock scales P_Switching down linearly

drawback: scaling f_Clock lowers circuit speed (decreasing circuit performance)

Selective frequency scaling (as well as voltage scaling) may thus be applied to units that are not performance-critical, at no penalty in the overall system speed.


    3. Minimization of Switched Capacitance

Optimization approaches that have a lower impact on performance, yet still allow significant power savings, are those targeting the minimization of the switched capacitance (i.e., the product of the capacitive load and the switching activity).

Static solutions (i.e., applicable at design time) handle switched-capacitance minimization through area optimization (which corresponds to a decrease in the capacitive load) and through switching-activity reduction via the exploitation of different kinds of signal correlations (temporal, spatial, spatio-temporal). Dynamic techniques, on the other hand, aim at eliminating the power waste that may originate from certain system workloads (i.e., the data being processed).


    2.3. Short-Circuit Power

In actual designs, the assumption of zero rise and fall times for the input waveforms is not correct. The finite slope of the input signal causes a direct current path between V_DD and GND for a short period of time during switching, while the NMOS and the PMOS transistors are conducting simultaneously. This is illustrated in Figure 2.2. Under the (reasonable) assumption that the resulting current spikes can be approximated as triangles, and that the inverter is symmetrical in its rising and falling responses, we can compute the energy consumed per switching period,

Fig. 2.2. Short-circuit currents during transients

E_dp = V_DD · I_peak · t_sc / 2 + V_DD · I_peak · t_sc / 2 = t_sc · V_DD · I_peak    (Eq. 2.3)

as well as the average power consumption

P_dp = t_sc · V_DD · I_peak · f = C_sc · V_DD^2 · f    (Eq. 2.4)
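A small numeric sketch of Eqs. (2.3) and (2.4), approximating each current spike as a triangle of peak I_peak and base t_sc; the numbers (2.5 V, 0.2 mA peak, 100 ps overlap, 200 MHz) are illustrative assumptions:

```python
# Short-circuit (direct-path) dissipation, Eqs. (2.3)-(2.4).
# All numeric values are illustrative assumptions.

def direct_path_energy(v_dd, i_peak, t_sc):
    """Energy per switching period: two triangular spikes of area I_peak*t_sc/2."""
    return 2 * (v_dd * i_peak * t_sc / 2)        # = t_sc * V_DD * I_peak

def direct_path_power(v_dd, i_peak, t_sc, f):
    """Average power: energy per period times switching frequency."""
    return direct_path_energy(v_dd, i_peak, t_sc) * f

e = direct_path_energy(2.5, 0.2e-3, 100e-12)     # 5e-14 J per period
p = direct_path_power(2.5, 0.2e-3, 100e-12, 200e6)

# Equivalent short-circuit capacitance C_sc = t_sc * I_peak / V_DD, so that
# P_dp can also be written as C_sc * V_DD^2 * f, as in Eq. (2.4).
c_sc = 100e-12 * 0.2e-3 / 2.5
```

The last line shows the rewriting used in Eq. (2.4): the direct-path loss behaves like an extra switched capacitance C_sc.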


The short-circuit (also called direct-path) power dissipation is proportional to the switching activity, similar to the capacitive power dissipation. t_sc represents the time both devices are conducting. For a linear input slope, this time is reasonably well approximated by Eq. (2.5), where t_s represents the 0-100% transition time.

t_sc = ((V_DD - 2·V_T) / V_DD) · t_s = ((V_DD - 2·V_T) / V_DD) · t_r(f) / 0.8    (Eq. 2.5)

    Ipeak is determined by the saturation current of the devices and is hence directly proportional to the

    sizes of the transistors. The peak current is also a strong function of the ratio between input and

    output slopes. This relationship is best illustrated by the following simple analysis. Consider a static

CMOS inverter with a 0→1 transition at the input. Assume first that the load capacitance is very large, so that the output fall time is significantly larger than the input rise time (Figure 2.3a). Under

    those circumstances, the input moves through the transient region before the output starts to change.

    As the source-drain voltage of the PMOS device is approximately 0 during that period, the device

    shuts off without ever delivering any current. The short-circuit current is close to zero in this case.

    Consider now the reverse case, where the output capacitance is very small, and the output fall time is

    substantially smaller than the input rise time (Figure 2.3b). The drain-source voltage of the PMOS

    device equals VDD for most of the transition period, guaranteeing the maximal short-circuit current

    (equal to the saturation current of the PMOS). This clearly represents the worst-case condition.


    Fig. 2.3. Impact of load capacitance on short-circuit current

    The conclusions of the above analysis are confirmed in Figure 2.4, which plots the short-circuit

    current through the NMOS transistor during a low-to-high transition as a function of the load

    capacitance.

    Fig. 2.4. CMOS inverter short-circuit current through NMOS transistor as a

    function of the load capacitance (for a fixed input slope of 500 psec).


    This analysis leads to the conclusion that the short-circuit dissipation is minimized by making the

    output rise/fall time larger than the input rise/fall time. On the other hand, making the output rise/fall

    time too large slows down the circuit and can cause short-circuit currents in the fan-out gates. This

presents a perfect example of how local optimization that ignores the global picture can lead to an

    inferior solution.


    2.4. Static Power

The static (or steady-state) power dissipation of a circuit is expressed by Eq. (2.6), where I_stat is the

    current that flows between the supply rails in the absence of switching activity.

P_stat = I_stat · V_DD    (Eq. 2.6)

    Ideally, the static current of the CMOS inverter is equal to zero, as the PMOS and NMOS devices are

    never on simultaneously in steady-state operation. There is, unfortunately, a leakage current flowing

    through the reverse-biased diode junctions of the transistors, located between the source or drain and

    the substrate as shown in Figure 2.5. This contribution is, in general, very small and can be ignored.

For the device sizes under consideration, the leakage current per unit drain area typically ranges between 10-100 pA/µm² at room temperature. For a die with 1 million gates, each with a drain area of 0.5 µm² and operated at a supply voltage of 2.5 V, the worst-case power consumption due to diode leakage equals 0.125 mW, which is clearly not much of an issue. However, be aware that the junction leakage currents are caused by thermally generated carriers. Their value increases exponentially with increasing junction temperature. At 85°C (a common junction temperature limit for commercial hardware), the leakage currents increase by a factor of 60 over their room-temperature values. Keeping the overall operating temperature of a circuit low is consequently a desirable goal.
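The back-of-the-envelope estimate above can be reproduced directly from the figures quoted in the text:

```python
# Worked version of the diode-leakage estimate in the text: 1 million gates,
# 0.5 um^2 drain area each, worst-case 100 pA/um^2 at room temperature, 2.5 V.

n_gates = 1_000_000
drain_area_um2 = 0.5
leakage_density = 100e-12          # A per um^2, worst case
v_dd = 2.5

i_leak = n_gates * drain_area_um2 * leakage_density   # total leakage current, A
p_leak = i_leak * v_dd                                # 0.125 mW, as in the text

# The text quotes a ~60x increase at an 85 C junction temperature.
p_leak_85c = p_leak * 60                              # 7.5 mW
```

Even with the 60x thermal penalty the diode-leakage term stays in the milliwatt range for this example, which is why the text treats subthreshold conduction, not junction leakage, as the emerging concern.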


    Fig. 2.5. Sources of leakage currents in CMOS inverter (for Vin = 0 V)

    As the temperature is a strong function of the dissipated heat and its removal mechanisms, this can

    only be accomplished by limiting the power dissipation of the circuit and/or by using chip packages

    that support efficient heat removal.

    An emerging source of leakage current is the subthreshold current of the transistors. An MOS

    transistor can experience a drain-source current, even when VGS is smaller than the threshold voltage

    (Figure 2.6).

The closer the threshold voltage is to zero volts, the larger the leakage current at V_GS = 0 V and the larger the static power consumption. To offset this effect, the threshold voltage of the device has generally been kept sufficiently high. Standard processes feature V_T values that are never smaller than 0.5-0.6 V and that in some cases are even substantially higher (~0.75 V).


    Fig. 2.6. Decreasing the threshold increases the subthreshold current at VGS=0

    This approach is being challenged by the reduction in supply voltages that typically goes with deep-

    submicron technology scaling. Scaling the supply voltages while keeping the threshold voltage

    constant results in an important loss in performance, especially when VDD approaches 2VT .

    One approach to address this performance issue is to scale the device thresholds down as well. This

moves the delay curve in the right-hand plot of Figure 3.1 to the left, which means that the performance penalty for lowering

    the supply voltage is reduced. Unfortunately, the threshold voltages are lower-bounded by the amount

    of allowable subthreshold leakage current, as demonstrated in Figure 2.6. The choice of the threshold

    voltage hence represents a trade-off between performance and static power dissipation.


    The continued scaling of the supply voltage predicted for the next generations of CMOS technologies

    will however force the threshold voltages ever downwards, and will make subthreshold conduction a

    dominant source of power dissipation. Process technologies that contain devices with sharper turn-off

    characteristic will therefore become more attractive. An example of the latter is the SOI (Silicon-on-

    Insulator) technology whose MOS transistors have slope-factors that are close to the ideal 60

    mV/decade.

    This lower bound on the thresholds is in some sense artificial. The idea that the leakage current in a

    static CMOS circuit has to be zero is a preconception. Certainly, the presence of leakage currents

    degrades the noise margins, because the logic levels are no longer equal to the supply rails. As long as

    the noise margins are within range, this is not a compelling issue. The leakage currents, of course,

cause an increase in static power dissipation. This is offset by the drop in supply voltage, which is enabled by the reduced thresholds at no cost in performance, and results in a quadratic reduction in

dynamic power. For a 0.25 µm CMOS process, the following circuit configurations obtain the same performance: a 3 V supply with 0.7 V V_T, and a 0.45 V supply with 0.1 V V_T. The dynamic power consumption of the latter is, however, 45 times smaller! Choosing the correct values of supply and threshold voltages once again requires a trade-off. The optimal operating point depends upon the activity of the circuit.
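The "45 times smaller" figure follows directly from the quadratic V_DD dependence of Eq. (2.2), since the two configurations run at the same speed and frequency:

```python
# The text's 0.25 um example: a 3 V / 0.7 V-V_T design and a 0.45 V / 0.1 V-V_T
# design reach the same performance. With equal C_L, f and activity, dynamic
# power scales as V_DD^2 (Eq. 2.2), so the ratio is simply (3 / 0.45)^2.

ratio = (3.0 / 0.45) ** 2        # ~44.4, i.e. roughly the 45x quoted in the text
```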


In the presence of a sizable static power dissipation, it is essential that non-active modules are powered down, lest static power dissipation become dominant. Power-down, also called standby, can be accomplished by disconnecting the unit from the supply rails or by lowering the supply voltage.


    2.5. Power-Delay and Energy-Delay Products

    The total power consumption of the CMOS inverter is now expressed as the sum of its three

    components:

P_tot = P_dyn + P_dp + P_stat = (C_L · V_DD^2 + V_DD · I_peak · t_s) · f_0→1 + V_DD · I_stat    (Eq. 2.7)

In typical CMOS circuits, the capacitive dissipation is by far the dominant factor. The direct-path consumption can be kept within bounds by careful design, and should hence not be an issue. Leakage is ignorable at present, but this might change in the not-too-distant future.

In Chapter 1, we introduced the power-delay product, PDP, as a quality measure for a logic gate.

PDP = P_av · t_p    (Eq. 2.8)

The PDP presents a measure of energy, as is apparent from the units (W·s = Joule). Assuming that the gate is switched at its maximum possible rate of f_max = 1/(2·t_p), and ignoring the contributions of the static and direct-path currents to the power consumption, we find

PDP = C_L · V_DD^2 · f_max · t_p = C_L · V_DD^2 / 2    (Eq. 2.9)

The PDP stands for the average energy consumed per switching event (that is, for a 0→1 or a 1→0 transition). Remember that earlier we had defined E_av as the average energy per switching cycle (or per energy-consuming event). As each inverter cycle contains a 0→1 and a 1→0 transition, E_av is hence twice the PDP.


    The validity of the PDP as a quality metric for a process technology or gate topology is questionable.

It measures the energy needed to switch the gate, which is an important property for sure. Yet for a given structure, this number can be made arbitrarily low by reducing the supply voltage. From this perspective, the optimum voltage to run the circuit at would be the lowest possible value that still ensures functionality. This comes at a major expense in performance, as discussed earlier. A more

    relevant metric should combine a measure of performance and energy. The energy-delay product,

    EDP, does exactly that.

EDP = PDP · t_p = P_av · t_p^2 = (C_L · V_DD^2 / 2) · t_p    (Eq. 2.10)

    It is worth analyzing the voltage dependence of the EDP. Higher supply voltages reduce delay, but

    harm the energy, and the opposite is true for low voltages. An optimum operation point should hence

    exist. Assuming that NMOS and PMOS transistors have comparable threshold and saturation voltages,

    we can simplify the following propagation delay expression.

t_pHL = (0.52 · C_L · V_DD) / ((W/L)_n · k'_n · V_DSATn · (V_DD - V_Tn - V_DSATn/2))  ≈  t_p = (α · C_L · V_DD) / (V_DD - V_Te)    (Eq. 2.11)

where V_Te = V_T + V_DSAT/2, and α is a technology parameter. Combining Eq. (2.10) and Eq. (2.11),

EDP ≈ (α · C_L^2 · V_DD^3) / (2 · (V_DD - V_Te))    (Eq. 2.12)


    The optimum supply voltage can be obtained by taking the derivative of Eq. (2.12) with respect to VDD,

    and equating the result to 0.

V_DDopt = (3/2) · V_Te    (Eq. 2.13)

    The remarkable outcome from this analysis is the low value of the supply voltage that simultaneously

    optimizes performance and energy. For sub-micron technologies with thresholds in the range of 0.5 V,

    the optimum supply is situated around 1 V.
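Eq. (2.13) can be checked numerically; the V_DSAT value below is an illustrative assumption (the text only fixes V_T ≈ 0.5 V):

```python
# Optimum supply voltage from Eq. (2.13): V_DDopt = (3/2) * V_Te, with
# V_Te = V_T + V_DSAT/2. V_DSAT = 0.2 V is an assumed, illustrative value.

def v_dd_optimum(v_t, v_dsat):
    """EDP-optimal supply voltage for given threshold and saturation voltages."""
    v_te = v_t + v_dsat / 2
    return 1.5 * v_te

# With V_T = 0.5 V (as in the text) and an assumed V_DSAT of 0.2 V, the
# optimum lands near the ~1 V the text quotes.
v_opt = v_dd_optimum(0.5, 0.2)   # 0.9 V
```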


3. Technology Level Optimizations

    3.1. Technology Scaling

    Scaling of physical dimensions is a well-known technique for reducing circuit power consumption. The

    first-order effects of scaling can be fairly easily derived. Device gate capacitances are of the form:

C_Gate = C_ox · W · L = (ε_ox / t_ox) · W · L    (Eq. 3.1)

If we scale down W, L, and t_ox by S, then this capacitance will scale down by S as well. Consequently, if system data rates and supply voltages remain unchanged, this factor-of-S reduction in capacitance is passed on directly to power:

Fixed performance, fixed voltage:  P ∝ 1/S    (Eq. 3.2)

The effect of scaling on delays is equally promising. Based on Eq. (3.3), the transistor current drive increases linearly with S.

I_dd = (C_ox / 2) · (W/L) · (V_dd - V_t)^2    (Eq. 3.3)


As a result, propagation delays, which are proportional to capacitance and inversely proportional to drive current, scale down by a factor of S^2.

Assuming we are only trying to maintain system throughput rather than increase it, the improvement in circuit performance can be traded for lower power by reducing the supply voltage. In particular, neglecting V_t effects, the voltage can be reduced by a factor of S^2. This results in an S^4 reduction in device currents, which, along with the capacitance scaling, leads to an S^5 reduction in power:

Fixed performance, variable voltage:  P ∝ 1/S^5    (Eq. 3.4)
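The first-order bookkeeping behind Eqs. (3.2) and (3.4) can be sketched as follows (first-order effects only, ignoring the interconnect parasitics discussed next):

```python
# First-order scaling from Section 3.1: scaling W, L and t_ox by S reduces
# gate capacitance by S. At fixed voltage and data rate, P ~ 1/S (Eq. 3.2);
# additionally scaling the voltage by S^2 gives P ~ 1/S^5 (Eq. 3.4).

def power_scaling(s, scale_voltage=False):
    """Relative power after scaling dimensions by S (first order only)."""
    cap_factor = 1 / s                   # C ~ W*L/t_ox scales down by S
    if scale_voltage:
        v_factor = (1 / s ** 2) ** 2     # P ~ V^2, and V is reduced by S^2
        return cap_factor * v_factor     # 1/S^5 overall
    return cap_factor                    # 1/S at fixed voltage

p_fixed_v = power_scaling(2)                        # 0.5  = 1/S
p_scaled_v = power_scaling(2, scale_voltage=True)   # 1/32 = 1/S^5
```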

    This discussion, however, ignores many important second-order effects. For example, as scaling

    continues, interconnect parasitics eventually begin to dominate and change the picture substantially.

    The resistance of a wire is proportional to its length and inversely proportional to its thickness and

    width. Since in this discussion we are considering the impact of technology scaling on a fixed design,

    the local and global wire lengths should scale down by S along with the width and thickness of the

    wire. This means that the wire resistance should scale up by a factor of S overall. The wire

    capacitance is proportional to its width and length and inversely proportional to the oxide thickness.

    Consequently, the wire capacitance scales down by a factor of S. To summarize:


R_w ∝ S,  C_w ∝ 1/S,  t_wire = R_w · C_w ∝ 1    (Eq. 3.5)

    This means that, unlike gate delays, the intrinsic interconnect delay does not scale down with physical

    dimensions. So at some point interconnect delays will start to dominate over gate delays and it will no

    longer be possible to scale down the supply voltage. This means that once again power is reduced

    solely due to capacitance scaling:

Parasitics dominated:  P ∝ 1/S    (Eq. 3.6)

    Actually, the situation is even worse since the above analysis did not consider second-order effects

    such as the fringing component of wire capacitance, which may actually grow with reduced

    dimensions. As a result, realistically speaking, power may not scale down at all, but instead may stay

    approximately constant with technology scaling or even increase:

Including 2nd-order effects:  P ∝ 1 or more    (Eq. 3.7)


    The conclusion is that technology scaling offers significant benefits in terms of power only up to a

    point. Once parasitics begin to dominate, the power improvements slack off or disappear completely.

    So we cannot rely on technology scaling to reduce power indefinitely. We must turn to other

    techniques for lowering power consumption.

    3.2. Threshold Voltage Reduction

    Many process parameters, aside from lithographic dimensions, can have a large impact on circuit

    performance. For example, at low supply voltages the value of the threshold voltage (Vt) is extremely

    important. Threshold voltage places a limit on the minimum supply voltage that can be used without

incurring unreasonable delay penalties (Fig. 3.1). Based on this, it would seem reasonable to consider

    reducing threshold voltages in a low-power process.

    Fig. 3.1. Energy and delay as a function of supply voltage


Unfortunately, sub-threshold conduction and noise margin considerations limit how low V_t can be set. Although devices are ideally off for gate voltages below V_t, in reality there is always some sub-threshold conduction, even for V_GS below V_t.


3.3. Technology Level Conclusion

The methodology should be applicable not only to different technologies, but also to different circuit and logic styles. Whenever possible, scaling and circuit techniques should be combined with the high-level methodology to further reduce power consumption; however, the general low-power strategy should not require these tricks. The advantages of scaling and low-level techniques cannot be overemphasized, but they should not be the sole arena from which the designer can extract power gains.


4. Layout Level Optimizations

    There are a number of layout-level techniques that can be applied to reduce power. The simplest of

    these techniques is to select upper level metals to route high activity signals. The higher level metals

    are physically separated from the substrate by a greater thickness of silicon dioxide. Since the physical

capacitance of these wires decreases with increasing tox , there is some advantage to routing

    the highest activity signals in the higher level metals. For example, in a typical process metal three

    will have about a 30% lower capacitance per unit area than metal two. It should be noted, however,

    that the technique is most effective for global rather than local routing, since connecting to a higher

    level metal requires more vias, which add area and capacitance to the circuit. Still, the concept of

    associating high activity signals with low physical capacitance nodes is an important one and appears

    in many different contexts in low-power design.

    For example, we can combine this notion with the locality theme to arrive at a general strategy for

    low-power placement and routing. The placement and routing problem crops up in many different

    guises in VLSI design. Place and route can be performed on pads, functional blocks, standard cells,

    gate arrays, etc. Traditional placement involves minimizing area and delay. Minimizing delay, in turn,

    translates to minimizing the physical capacitance (or length) of wires.


    In contrast, placement for low-power concentrates on minimizing the activity-capacitance product

    rather than the capacitance alone. In summary, high activity wires should be kept short or, in a

    manner of speaking, local. Tools have been developed that use this basic strategy to achieve about an

    18% reduction in power.

    Although intelligent placement and routing of standard cells and gate arrays can help to improve their

    power efficiency, the locality achieved by low-power place and route tools rarely approaches what can

    be achieved by a full-custom design. Design-time issues and other economic factors, however, may in

    many cases preclude the use of full-custom design. In these instances, the concepts presented here

    regarding low-power placement and routing of standard cells and gate arrays may prove useful.

    Moreover, even for custom designs, these low-power strategies can be applied to placement and

    routing at the block level.
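The activity-capacitance objective described in this section can be sketched as a cost function; the net data below (activity, capacitance pairs) is illustrative, not from the text:

```python
# Low-power placement objective from Section 4: minimize the sum of
# activity * capacitance over the nets, rather than capacitance alone.
# The net data below is an illustrative assumption.

def switched_capacitance(nets):
    """Cost = sum of (switching activity * wire capacitance) over all nets."""
    return sum(activity * cap for activity, cap in nets)

# Two placements of the same three nets (activity, capacitance in farads).
# Both have 60 fF of total wire, but the second keeps the high-activity
# net short, so it wins on power despite equal total capacitance.
placement_a = [(0.5, 40e-15), (0.1, 10e-15), (0.05, 10e-15)]
placement_b = [(0.5, 10e-15), (0.1, 40e-15), (0.05, 10e-15)]

cost_a = switched_capacitance(placement_a)   # 21.5 fF effective
cost_b = switched_capacitance(placement_b)   # 9.5 fF effective
```

This is the "high-activity wires should be kept short" rule in miniature: a placer that minimizes cost_b-style objectives trades wire length between nets according to their activities.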


5. Circuit Level Optimizations

    In this section, we go beyond the traditional synchronous fully-complementary static CMOS circuit

    style to consider the relative advantages and disadvantages of other design strategies; we will

    consider five topics relating to low-power circuit design: dynamic logic, pass-transistor logic,

    asynchronous logic, transistor sizing, and design style (e.g. full custom versus standard cell).

    5.1. Dynamic Logic

    In static logic, node voltages are always maintained by a conducting path from the node to one of the

    supply rails. In contrast, dynamic logic nodes go through periods during which there is no path to the

    rails, and voltages are maintained as charge dynamically stored on nodal capacitances. Figure 5.1

    shows an implementation of a complex boolean expression in both static and dynamic logic. In the

    dynamic case, the clock period is divided into a pre-charge and an evaluation phase. During pre-

    charge, the output is charged to Vdd. Then, during the next clock phase, the NMOS tree evaluates the

logic function and discharges the output node if necessary. Relative to static CMOS, dynamic logic has both advantages and disadvantages in terms of power.

    Historically, dynamic design styles have been touted for their inherent low-power properties. For

    example, dynamic design styles often have significantly reduced device counts.


Fig. 5.1. Static and dynamic implementations of F = ¬(A·B + C)

    Since the logic evaluation function is fulfilled by the NMOS tree alone, the PMOS tree can be replaced

    by a single pre-charge device. These reduced device counts result in a corresponding decrease in

    capacitive loading, which can lead to power savings. Moreover, by avoiding stacked PMOS

    transistors, dynamic logic is amenable to low voltage operation where the ability to stack devices is

limited. In addition, dynamic gates don't experience short-circuit power dissipation. Whenever static circuits switch, a brief pulse of transient current flows from Vdd to ground, consuming power.

    Furthermore, dynamic logic nodes are guaranteed to have a maximum of one transition per clock

    cycle.


    Static gates do not follow this pattern and can experience a glitching phenomenon whereby output

nodes undergo unwanted transitions before settling at their final value. This causes excess power dissipation in static gates. So in some sense, dynamic logic avoids some of the overhead and waste

    associated with fully-complementary static logic.

    In practice, however, dynamic circuits have several disadvantages. For instance, each of the pre-

    charge transistors in the chip must be driven by a clock signal. This implies a dense clock distribution

    network and its associated capacitance and driving circuitry. These components can contribute

    significant power consumption to the chip. In addition, with each gate influenced by the clock, issues

    of skew become even more important and difficult to handle.

    Fig. 5.2. Output activities for static and dynamic logic gates (with random inputs)


    Also, the clock is a high (actually, maximum) activity signal, and having it connected to the PMOS

pull-up network can introduce unnecessary activity into the circuit. For commonly used boolean logic gates, Figure 5.2 shows the probability that the outputs make an energy-consuming (i.e. zero-to-one)

    transition for random gate inputs. In all cases, the activity of the dynamic gates is higher than that of

    the static gates. We can show that, in general, for any boolean signal X, the activity of a dynamically

    pre-charged wire carrying X must always be at least as high as the activity of a statically-driven wire:

dynamic case:  P_wire(0→1) = P(X = 0)

static case:   P_wire(0→1) = P(X_t = 0) · P(X_t+1 = 1 | X_t = 0) ≤ P(X = 0)    (Eq. 5.1)
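A concrete instance of Eq. (5.1): the output of a 2-input NAND with independent, equiprobable inputs drawn fresh each cycle (the setting of Figure 5.2):

```python
# Eq. (5.1) for a 2-input NAND output X with random i.i.d. inputs.
# X = 0 only when both inputs are 1, so P(X=0) = 1/4.
#
# Dynamic (pre-charged) wire: it is re-charged every cycle in which it was
# discharged, so its 0->1 activity equals P(X=0).
# Static wire: it makes a 0->1 transition only when X actually goes 0 -> 1;
# with i.i.d. inputs, P(X_{t+1}=1 | X_t=0) = P(X=1).

p_x0 = 1 / 4
p_x1 = 1 - p_x0

dynamic_activity = p_x0          # 0.25
static_activity = p_x0 * p_x1    # 3/16 = 0.1875
```

As Eq. (5.1) states, the dynamically pre-charged wire can never have lower 0→1 activity than the statically driven one, since the static activity is the dynamic one multiplied by a conditional probability at most equal to 1.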

    In conclusion, dynamic logic has certain advantages and disadvantages for low-power operation. The

    key is to determine which of the conflicting factors is dominant. In certain cases, a dynamic

    implementation might actually achieve a lower overall power consumption. Furthermore, the savings

    in terms of glitching and short-circuit power, while possibly significant, can also be achieved in static

    logic through other means (discussed in Section 6). All of this, coupled with the robustness of static

logic at low voltages, gives the designer less incentive to select a dynamic implementation of a low-power system.


    5.2. Pass-Transistor Logic

    As with dynamic logic, pass-transistor logic offers the possibility of reduced transistor counts. Figure

    5.3 illustrates this fact with an equivalent pass-transistor implementation of the static logic function of

    Figure 5.1. Once again, the reduction in transistors results in lower capacitive loading from devices.

    This might make pass-transistor logic attractive as a low-power circuit style.

    Fig. 5.3. Complementary pass-transistor implementations of F = (A·B + C)'

    Like dynamic logic, however, pass-transistor circuits suffer from several drawbacks. First, pass

    transistors have asymmetrical voltage driving capabilities. For example, NMOS transistors do not

    pass high voltages efficiently, and experience reduced current drive as well as a Vt drop at the

    output. If the output is used to drive a PMOS gate, static power dissipation can result.
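
    A quick numeric illustration of the threshold drop (the supply and threshold values below are assumed for illustration, not taken from the text):

```python
VDD = 2.5   # supply voltage (V), assumed value
VTN = 0.5   # NMOS threshold voltage (V), assumed value

# An NMOS pass transistor with gate and drain at VDD stops conducting
# once its source rises to VDD - VTN (it needs Vgs > Vtn to stay on),
# so a logic '1' is degraded by one threshold drop at the output.
v_out_high = VDD - VTN
print(f"degraded '1' at pass-transistor output: {v_out_high:.2f} V")

# If this degraded '1' drives a PMOS gate, that PMOS sees
# Vsg = VDD - v_out_high = VTN, so it is barely (or not quite) off,
# and a static current path through the driven stage can result.
v_sg_pmos = VDD - v_out_high
print(f"Vsg left on the driven PMOS: {v_sg_pmos:.2f} V")
```
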


    These flaws can be remedied with additional hardware - for instance, a complementary transmission gate consisting of an NMOS and a PMOS pass transistor in parallel, or a level-restoring circuit, as shown in Figure 5.4. Unfortunately, this forfeits the power savings offered by reduced device counts.

    Also, efficient layout of pass-transistor networks can be problematic. Sharing of source/drain diffusion

    regions is often not possible, resulting in increased parasitic junction capacitances.

    In summary, there may be situations in which pass-transistor logic is more power efficient than fully-

    complementary logic; however, the benefits are likely to be small relative to the orders of magnitude

    savings possible from higher level techniques. So, again, circuit-level power saving techniques should

    be used whenever appropriate, but should be subordinate to higher level considerations.


    Fig. 5.4. Level-restoring Circuit


    5.3. Asynchronous Logic

    Asynchronous logic refers to a circuit style employing no global clock signal for synchronization.

    Instead, synchronization is provided by handshaking circuitry used as an interface between gates (see

    Figure 5.5). While more common at the system level, asynchronous logic has failed to gain acceptance

    at the circuit level, mainly on area and performance grounds. It is worthwhile to

    reevaluate asynchronous circuits in the context of low power.

    Fig. 5.5. Asynchronous circuit with handshaking

    Typically, asynchronous circuits are classified as shown in Figure 5.6.

    The self-timed concept is based on an architecture containing registers, arithmetic logic units, control units, and control signals, but no clock signal. The sequence of computations is managed by local synchronization signals (see Figure 5.7).


    Fig. 5.6. Classification of asynchronous circuits

    Fig. 5.7. Example of a self-timed system

    Strictly speaking, self-timed systems are a subset of asynchronous systems, which in general have no global clock signal. Within self-timed systems there is a class of systems, called speed independent, which work correctly regardless of the delays of their internal components, though not of the delays of their interconnections (Fig. 5.8).

    Fig. 5.8. Example of a speed independent system

    Fig. 5.9. Example of a delay insensitive system


    Within speed-independent systems there is a class of systems, called delay insensitive, which work correctly regardless of the delays of both their internal components and their interconnections (Fig. 5.9).

    The primary power advantages of asynchronous logic can be classified as avoiding waste. The clock signal

    in synchronous logic contains no information; therefore, power associated with the clock driver and

    distribution network is in some sense wasted. Avoiding this power consumption component might offer

    significant benefits. In addition, asynchronous logic uses completion signals, thereby avoiding

    glitching, another form of wasted power. Finally, with no clock signal and with computation triggered

    by the presence of new data, asynchronous logic contains a sort of built in power-down mechanism for

    idle periods.

    While asynchronous sounds like the ideal low-power design style, several issues impede its acceptance

    in low-power arenas. Depending on the size of its associated logic structure, the overhead of the

    handshake interface and completion signal generation circuitry can be large in terms of both area and

    power.

    Since this circuitry does not contribute to the actual computations, transitions on handshake signals

    are wasted. This is similar to the waste due to clock power consumption, though it is not as severe

    since handshake signals have lower activity than clocks. Finally, fewer design tools support asynchronous logic than synchronous logic, making asynchronous circuits more difficult to design.


    At the small granularity with which it is commonly implemented, the overhead of the asynchronous

    interface circuitry dominates over the power saving attributes of the design style. It should be emphasized, however, that this is mainly a function of the granularity of the handshaking circuitry. It

    would certainly be worthwhile to consider using asynchronous techniques to eliminate the necessity of

    distributing a global clock between blocks of larger granularity. For example, large modules could

    operate synchronously off local clocks, but communicate globally using asynchronous interfaces. In

    this way, the interface circuitry would represent a very small overhead component, and the most

    power consuming aspects of synchronous circuitry (i.e. global clock distribution) would be avoided.
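
    As a rough illustration of the handshaking idea, the sketch below (entirely hypothetical, not from the text) models a four-phase req/ack channel between two blocks and counts control-wire transitions, which occur only when data actually moves:

```python
def four_phase_transfer(items):
    """Sketch of a four-phase (return-to-zero) req/ack handshake channel.

    Each transfer costs four control-wire transitions (req up, ack up,
    req down, ack down), and the channel makes no transitions at all
    while idle -- computation is triggered only by the arrival of data.
    """
    received = []
    control_transitions = 0
    for item in items:
        control_transitions += 1   # sender raises req (data valid)
        received.append(item)      # receiver latches the data...
        control_transitions += 1   # ...and raises ack
        control_transitions += 1   # sender lowers req
        control_transitions += 1   # receiver lowers ack; channel idle again
    return received, control_transitions

data = [3, 1, 4, 1, 5]
out, transitions = four_phase_transfer(data)
print(out, transitions)   # 4 control transitions per item -> 20 for 5 items
```

    The overhead is proportional to the number of transfers, which is why coarse-grained (inter-block) handshaking amortizes it far better than per-gate handshaking.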

    5.4. Transistor Sizing

    Regardless of the circuit style employed, the issue of transistor sizing for low power arises. The

    primary trade-off involved is between performance and cost - where cost is measured by area and

    power. Transistors with larger gate widths provide more current drive than smaller transistors.

    Unfortunately, they also contribute more device capacitance to the circuit and, consequently, result in higher power dissipation. Moreover, larger devices experience more severe short-circuit currents,

    which should be avoided whenever possible.


    In addition, if all devices in a circuit are sized up, then the loading capacitance increases in the same

    proportion as the current drive, resulting in little performance improvement beyond the point of overcoming fixed parasitic capacitance components. In this sense, large transistors become self-

    loading and the benefit of large devices must be reevaluated. A sensible low-power strategy is to use

    minimum size devices whenever possible. Along the critical path, however, devices should be sized up

    to overcome parasitics and meet performance requirements.
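
    A first-order model makes the self-loading effect concrete. All constants below are illustrative assumptions:

```python
# First-order sizing model: delay ~ C_load / I_drive.  Drive current and
# device self-capacitance both scale with gate width W, while the fixed
# parasitic (wiring) load does not -- so delay saturates as W grows,
# but switched capacitance (hence dynamic power) keeps growing linearly.
C_FIXED = 20.0   # fixed parasitic load driven by the gate, fF (assumed)
C_SELF = 2.0     # device capacitance per unit width, fF/unit (assumed)
I_UNIT = 1.0     # drive current per unit width (arbitrary units)

def delay(w):
    return (C_FIXED + C_SELF * w) / (I_UNIT * w)

def switched_cap(w):   # proxy for dynamic power at fixed activity
    return C_FIXED + C_SELF * w

for w in (1, 2, 4, 8, 16, 32):
    print(f"W={w:2d}  delay={delay(w):6.2f}  switched cap={switched_cap(w):5.1f} fF")
# Delay approaches the self-loading limit C_SELF / I_UNIT = 2.0, so each
# added unit of width buys less speed, while power grows without bound.
```
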

    5.5. Design Style

    Another decision which can have a large impact on the overall chip power consumption is selection of

    design style: e.g. full custom, gate array, standard cell, etc. Not surprisingly, full-custom design offers

    the best possibility of minimizing power consumption. In a custom design, all the principles of low-

    power including locality, regularity, and sizing can be applied optimally to individual circuits.

    Unfortunately, this is a costly alternative in terms of design time, and can rarely be employed

    exclusively as a design strategy. Other possible design styles include gate arrays and standard cells.

    Gate arrays offer one alternative for reducing design cycles at the expense of area, power, and

    performance. While not offering the flexibility of full-custom design, gate-array CAD tools could

    nevertheless be altered to place increased emphasis on power. For example, gate arrays offer some

    control over transistor sizing through the use of parallel transistor connections.


    Standard cell synthesis is another commonly employed strategy for reducing design time. Current

    standard cell libraries and tools, however, offer little hope of achieving low power operation. In many ways, standard cells represent the antithesis of a low-power methodology. First and foremost,

    standard cells are often severely oversized. Most standard cell libraries were designed for maximum

    performance and worst-case loading from inter-cell routing. As a result, they experience significant

    self-loading and waste correspondingly significant amounts of power. To overcome this difficulty,

    standard cell libraries must be expanded to include a selection of cells of identical functionality, but

    varying driving strengths. With this in place, synthesis tools could select the smallest (and lowest

    power cell) required to meet timing constraints, while avoiding the wasted power associated with

    oversized transistors. In addition, the standard cell layout style with its islands of devices and

    extensive routing channels tends to violate the principles of locality central to low-power design.


    5.6. Circuit Level Conclusion

    Clearly, numerous circuit-level techniques are available to the low-power designer. These techniques

    include careful selection of a circuit style: static vs. dynamic, synchronous vs. asynchronous, fully-

    complementary vs. pass-transistor, etc. Other techniques involve transistor sizing or selection of a

    design methodology such as full-custom or standard cell. Some of these techniques can be applied in

    conjunction with higher level power reduction techniques. When possible, designers should take

    advantage of this fact and exploit both low and high-level techniques in concert. Often, however,

    circuit-level techniques will conflict with the low-power strategies based on higher abstraction levels.

    In these cases, the designer must determine which techniques offer the largest power reductions. As

    evidenced by the previous discussion, circuit-level techniques typically offer reductions of a factor of two or less, while some higher level strategies with their more global impact can produce savings of

    an order of magnitude or more. In such situations, considerations imposed by the higher-level

    technique should dominate and the designer should employ those circuit-level methodologies most

    amenable to the selected high-level strategy.


    6. Logic and Architecture Level Optimizations

    Logic-level power optimization has been extensively researched in the last few years. Given the

    complexity of modern digital devices, hand-crafted logic-level optimization is extremely expensive in

    terms of design time and effort. Hence, it is cost-effective only for structured logic in large-volume

    components, like microprocessors (e.g., functional units in the data-path). Fortunately, several

    optimizations for low power have been automated and are now available in commercial logic synthesis

    tools, enabling logic-level power optimization even for unstructured logic and for low-volume VLSI

    circuits. During logic optimization, technology parameters such as supply voltage are fixed, and the

    degrees of freedom are in selecting the functionality and sizing the gates implementing a given logic

    specification. As for technology and circuit-level techniques, power is never the only cost metric of

    interest. In most cases, performance is tightly constrained as well.

    6.1. Logic Level Optimizations

    A common setting is constrained power optimization, where a logic network can be transformed to

    minimize power only if critical path length is not increased. Under this hypothesis, an effective

    technique is based on path equalization.


    Path equalization ensures that signal propagation from inputs to outputs of a logic network follows

    paths of similar length. When paths are equalized, most gates have aligned transitions at their inputs,

    thereby minimizing spurious switching activity (which is created by misaligned input transitions). This

    technique is very helpful in arithmetic circuits, such as adders or multipliers.
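
    The effect of misaligned input transitions can be seen in a toy simulation (the circuit and delay values are invented for illustration): an OR gate fed by a signal and its inverted copy arriving through a slower path produces a spurious output pulse that disappears once the two path delays are equalized:

```python
def simulate_or(a_wave, inv_delay):
    """y(t) = a(t) OR (NOT a(t - inv_delay)): one input arrives late
    through an inverter path.  With unequal path delays, a falling edge
    on 'a' makes y glitch 1->0->1; with equalized paths it never does.
    """
    transitions = 0
    prev = None
    for t in range(len(a_wave)):
        a = a_wave[t]
        b = 1 - a_wave[max(0, t - inv_delay)]   # delayed inverted copy
        out = a | b
        if prev is not None and out != prev:
            transitions += 1                    # count output switching
        prev = out
    return transitions

a = [1] * 5 + [0] * 5                  # 'a' falls once, at t = 5
t_skewed = simulate_or(a, inv_delay=2)  # misaligned paths: spurious pulse
t_equal = simulate_or(a, inv_delay=0)   # equalized paths: no glitch
print(t_skewed, t_equal)   # 2 spurious transitions vs 0
```
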

    Glue logic and controllers have much more irregular structure than arithmetic units, and their gate-

    level implementations are characterized by a wide distribution of path delays. These circuits can be

    optimized for power by resizing. Resizing focuses on fast combinational paths. Gates on fast paths

    are down-sized, thereby decreasing their input capacitances, while at the same time slowing down

    signal propagation. By slowing down fast paths, propagation delays are equalized, and power is

    reduced by joint spurious switching and capacitance reduction. Resizing does notalways imply down-

    sizing.Power can be reduced also by enlarging (or buffering) heavily loaded gates, to increase their

    output slew rates. Fast transitions minimize short-circuit power of the gates in the fan-out of the gate

    which has been sized up, but its input capacitance is increased. In most cases, resizing is a complex

    optimization problem involving a tradeoff between output switching power and internal short-circuit

    power on several gates at the same time.

    Other logic-level power minimization techniques are re-factoring, remapping, phase assignment and

    pin swapping. All these techniques can be classified as local transformations. They are applied on

    gate netlists, and focus on nets with large switched capacitance.


    Most of these techniques replace a gate, or a small group of gates, around the target net, in an effort

    to reduce capacitance and switching activity. Similarly to resizing, local transformations must

    carefully balance short circuit and output power consumption.

    Fig. 6.1. Local transformations: (a) re-mapping, (b) phase assignment, (c) pin swapping

    Figure 6.1 shows three examples of local transformations. In (a) a re-mapping transformation is

    shown, where a high-activity node (marked with x ) is removed thanks to a new mapping onto an

    AND-OR gate. In (b), phase assignment is exploited to eliminate one of the two high-activity nets

    marked with x. Finally, pin swapping is applied in (c) to connect a high-activity net with the input

    pin of the 4-input NAND with minimum input capacitance.
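
    A sketch of the pin-swapping arithmetic, using the standard switched-power expression P = a·C·Vdd²·f (the pin capacitances, activities, and operating point below are invented for illustration):

```python
# Switched power on a net: P = a * C * Vdd^2 * f  (activity a, capacitance C).
VDD, FREQ = 1.2, 100e6   # assumed operating point (V, Hz)

def switched_power(activity, cap_farads):
    return activity * cap_farads * VDD ** 2 * FREQ

# Hypothetical input pins of a 4-input NAND: logically equivalent, but
# with different input capacitances.  Two nets drive this gate: a
# high-activity net (a = 0.40) and a quiet one (a = 0.05).
# Naive assignment: high-activity net on the highest-capacitance pin.
p_before = switched_power(0.40, 2.0e-15) + switched_power(0.05, 1.1e-15)
# Pin swapping: move the high-activity net to the lowest-capacitance pin.
p_after = switched_power(0.40, 1.1e-15) + switched_power(0.05, 2.0e-15)
print(f"before swap: {p_before * 1e9:.1f} nW, after swap: {p_after * 1e9:.1f} nW")
```

    The function computed by the gate is unchanged; only the assignment of nets to equivalent pins differs, yet the switched power drops noticeably.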


    6.2. Architecture Level Optimizations

    Complex digital circuits usually contain units (or parts thereof) that are not performing useful

    computations at every clock cycle. Think, for example, of arithmetic units or register files within a

    microprocessor or, more simply, to registers of an ordinary sequential circuit. The idea, known for a

    long time in the community of IC designers, is to disable the logic which is not in use during some

    particular clock cycles, with the objective of limiting power consumption. In fact, stopping certain

    units from making useless transitions causes a decrease in the overall switched capacitance of the

    system, thus reducing the switching component of the power dissipated. Optimization techniques based

    on the principle above belong to the broad class of dynamic power management (DPM) methods.

    The natural domain of applicability of DPM is system-level design; therefore, it will be discussed in greater detail in the next section. Nevertheless, this paradigm has also been successfully adopted in

    the context of architectural optimization.

    Clock gating provides a way to selectively stop the clock, and thus force the original circuit to make no

    transition, whenever the computation to be carried out by a hardware unit at the next clock cycle is

    useless. In other words, the clock signal is disabled in accordance with the idle conditions of the unit.

    As an example of use of the clock-gating strategy, consider the traditional block diagram of a

    sequential circuit, shown on the left of Figure 6.2.


    Fig. 6.2. Example of gated clock architecture

    It consists of a combinational logic block and an array of state registers which are fed by the next-

    state logic and which provide some feed-back information to the combinational block itself through the

    present-state input signals. The corresponding gated-clock architecture is shown on the right of the

    figure. The circuit is assumed to have a single clock, and the registers are assumed to be edge-

    triggered flip-flops. The combinational block Fa is controlled by the primary inputs, the present-state

    inputs, and the primary outputs of the circuit, and it implements the activation function of the clock

    gating mechanism. Its purpose is to selectively stop the local clock of the circuit when no state or

    output transition takes place. The block named L is a latch, transparent when the global clock signal

    CLK is inactive. Its presence is essential for a correct operation of the system, since it takes care of

    filtering glitches that may occur at the output of block Fa.
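
    A behavioral sketch of the idea (the activation function here is a toy one, chosen purely for illustration): the clock reaches the register only when the stored state would actually change, so the result is identical but far fewer clock edges are delivered:

```python
def run(inputs, gated):
    """Count clock edges delivered to a register over an input sequence.

    'fa' below plays the role of the activation function Fa: it enables
    the clock only when the next input differs from the stored state,
    i.e. only when a state transition would actually take place.
    """
    state = 0
    clock_edges = 0
    for x in inputs:
        fa = x != state            # activation function: idle-condition test
        if not gated or fa:        # gated clock = CLK AND latched Fa
            clock_edges += 1       # the register receives a clock edge
            state = x
    return state, clock_edges

inputs = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1]
final_free, edges_free = run(inputs, gated=False)
final_gated, edges_gated = run(inputs, gated=True)
print(edges_free, edges_gated)   # 10 vs 3: same final state, fewer edges
assert final_free == final_gated
```
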


    The clock management logic is synthesized from the Boolean function representing the idle conditions

    of the circuit. It may well be the case that considering all such conditions results in additional circuitry

    that is too large and power consuming. It may then be necessary to synthesize a simplified function,

    which dissipates the minimum possible power, and stops the clock with maximum efficiency. Because

    of its effectiveness, clock-gating has been applied extensively in real designs and it has lately found its

    way in industry-strength CAD tools (e.g., Power Compiler by Synopsys).

    Power savings obtained by gating the clock distribution network of some hardware resources come at

    the price of a global decrease in performance. In fact, resuming the operation of an inactive resource

    introduces a latency penalty that negatively impacts system speed. In other words, with clock gating

    (or with any similar DPM technique), performance and throughput of an architecture are traded for

    power.


    7. Software and System Level Optimizations

    Electronic systems and subsystems consist of hardware platforms with several software layers. Many

    system features depend on the hardware/software interaction, e.g., programmability and flexibility,

    performance and energy consumption. Software does not consume energy per se, but it is the

    execution and storage of software that requires energy consumption by the underlying hardware.

    Software execution corresponds to performing operations on hardware, as well as accessing and

    storing data.

    Thus, software execution involves power dissipation for computation, storage, and communication.

    Moreover, storage of computer programs in semiconductor memories requires energy (refresh of

    DRAMs, static power for SRAMs).

    The energy budget for storing programs is typically small (with the choice of appropriate components)

    and predictable at design time. Thus, we will concentrate on energy consumption of software during

    its execution. Nevertheless, it is important to remember that reducing the size of programs, which is a

    usual objective in compilation, correlates with reducing their energy storage costs. Additional

    reduction of code size can be achieved by means of compression techniques. The energy cost of

    executing a program depends on its machine code and on the hardware architecture parameters.



    The machine code is derived from the source code from compilation. Typically, the energy cost of the

    machine code is affected by the back-end of software compilation, that controls the type, number and

    order of operations, and by the means of storing data, e.g., locality (registers vs. memory arrays),

    addressing, order. Nevertheless, some architecture independent optimizations can be useful in general

    to reduce energy consumption, e.g., selective loop unrolling and software pipelining.

    Software instructions can be characterized by the number of cycles needed to execute them and by the

    energy required per cycle. The energy consumed by an instruction depends weakly on the state of the

    processor (i.e., on the previously executed instruction).

    On the other hand, the energy varies significantly when the instruction requires storage in registers or

    in memory (caches).
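
    A toy energy model (all instruction costs below are invented for illustration) shows how a register spill changes the energy of an otherwise identical computation:

```python
# Toy per-instruction energy model: energy per cycle is roughly uniform,
# but instructions that touch memory cost extra cycles and extra
# per-access energy.  All numbers are illustrative only.
ENERGY_PER_CYCLE_NJ = 1.0
MEM_ACCESS_EXTRA_NJ = 3.0
CYCLES = {"add": 1, "mul": 2, "load": 3, "store": 3}

def program_energy(instrs):
    total = 0.0
    for op in instrs:
        total += CYCLES[op] * ENERGY_PER_CYCLE_NJ
        if op in ("load", "store"):
            total += MEM_ACCESS_EXTRA_NJ   # addressing, decoding, array access
    return total

# Same computation, two register-allocation outcomes: the second spills
# an intermediate value to memory and pays for it in energy.
no_spill = ["load", "add", "mul", "store"]
with_spill = ["load", "add", "store", "load", "mul", "store"]
print(program_energy(no_spill), program_energy(with_spill))  # 15.0 27.0
```
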

    The traditional goal of a compiler is to speed up the execution of the generated code, by reducing the

    code size (which correlates with execution latency) and minimizing spills to memory.

    Interestingly enough, executing machine code of minimum size would consume the minimum energy, if

    we neglect the interaction with memory and we assume a uniform energy cost of each instruction.

    Energy-efficient compilation strives at achieving machine code that requires less energy as compared

    to a performance-driven traditional compiler, by leveraging the non-uniformity in instruction energy

    cost, and the different energy costs for storage in registers and in main memory due to addressing and

    address decoding. Nevertheless, results are sometimes contradictory.



    Whereas for some architectures energy-efficient compilation gives a competitive advantage as

    compared to traditional compilation, for some others the most compact code is also the most

    economical in terms of energy, thus obviating the need of specific low-power compilers.

    Power-aware operating systems (OSs) trade generality for energy efficiency. In the case of embedded

    electronic systems, OSs are streamlined to support just the required applications. On the other hand,

    such an approach may not be applicable to OSs for personal computers, where the user wants to

    retain the ability of executing a wide variety of applications.

    Energy efficiency in an operating system can be achieved by designing an energy aware task

    scheduler. Usually, a scheduler determines the set of start times for each task, with the goal of

    optimizing a cost function related to the completion time of all tasks, and to satisfy real time

    constraints, if applicable. Since tasks are associated with resources having specific energy models, the

    scheduler can exploit this information to reduce run-time power consumption.

    Operating systems achieve major energy savings by implementing dynamic power management (DPM)

    of the system resources. DPM dynamically reconfigures an electronic system to provide the requested

    services and performance levels with a minimum number of active components or a minimum load on

    such components. Dynamic power management encompasses a set of techniques that achieve energy-

    efficient computation by selectively shutting down or slowing down system components when they are

    idle (or partially unexploited). DPM can be implemented in different forms including, but not limited



    to, clock gating, clock throttling, supply voltage shut-down, and dynamically varying power supplies.
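
    The shutdown decision behind DPM is often framed in terms of a break-even time: an idle period is worth a shutdown only if it is long enough to amortize the transition overhead. The numbers below are illustrative assumptions:

```python
# A component is worth shutting down only when the idle period exceeds
# the break-even time, at which the shutdown/wake-up overhead is paid
# back by the lower sleep power.  All numbers are illustrative.
P_ON = 400e-3          # power while active but idle (W)
P_SLEEP = 10e-3        # power in the sleep state (W)
E_TRANSITION = 1.5e-3  # energy to enter and leave the sleep state (J)

def idle_energy(t_idle, shut_down):
    """Energy spent over an idle period of t_idle seconds."""
    if shut_down:
        return E_TRANSITION + P_SLEEP * t_idle
    return P_ON * t_idle

# Break-even: E_TRANSITION + P_SLEEP * t == P_ON * t
t_be = E_TRANSITION / (P_ON - P_SLEEP)
print(f"break-even idle time: {t_be * 1e3:.2f} ms")

# Long idle periods favor shutdown; short ones do not.
assert idle_energy(2 * t_be, shut_down=True) < idle_energy(2 * t_be, shut_down=False)
assert idle_energy(0.5 * t_be, shut_down=True) > idle_energy(0.5 * t_be, shut_down=False)
```

    A DPM policy, then, is essentially a predictor of whether the upcoming idle period will exceed this break-even time.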

    Several system-level design trade-offs can be explored to reduce energy consumption. Some of these

    design choices belong to the domain of hardware/software co-design, and leverage the migration of

    hardware functions to software or vice versa. For example, the Advanced Configuration and Power

    Interface (ACPI) standard, initiated by Intel, Microsoft and Toshiba, provides a portable hw/sw

    interface that makes it easy to implement DPM policies for personal computers in software.


    Conclusions

    Electronic design aims at striking a balance between performance and power efficiency. Designing

    low power applications is a multi-faceted problem, because of the plurality of embodiments that a

    system specification may have and the variety of degrees of freedom that designers have to cope with

    power reduction. In this brief tutorial, we showed different design options and the corresponding

    advantages and disadvantages. We tried to relate general-purpose low-power design solutions to a few

    successful chips that use them to various extents. Even though we described only a few samples of

    design techniques and implementations, we think that our samples are representative of the state of the

    art of current technologies and can suggest future developments and improvements.


    References

    [1] J. Rabaey and M. Pedram, Low Power Design Methodologies. Kluwer, 1996.

    [2] J. Mermet and W. Nebel, Low Power Design in Deep Submicron Electronics. Kluwer, 1997.

    [3] A. Chandrakasan and R. Brodersen, Low-Power CMOS Design. IEEE Press, 1998.

    [4] T. Burd and R. Brodersen, Processor Design for Portable Systems, Journal of VLSI Signal Processing Systems, vol. 13, no. 2-3, pp. 203-221, August 1996.

    [5] D. Ditzel, Transmeta's Crusoe: Cool Chips for Mobile Computing, Hot Chips Symposium, August 2000.

    [6] J. Montanaro, et al., A 160-MHz, 32-b, 0.5-W CMOS RISC Microprocessor, IEEE Journal of Solid-State Circuits, vol. 31, no. 11, pp. 1703-1714, November 1996.

    [7] V. Lee, et al., A 1-V Programmable DSP for Wireless Communications, IEEE Journal of Solid-State Circuits, vol. 32, no. 11, pp. 1766-1776, November 1997.

    [8] M. Takahashi, et al., A 60-mW MPEG4 Video Codec Using Clustered Voltage Scaling with Variable Supply-Voltage Scheme, IEEE Journal of Solid-State Circuits, vol. 33, no. 11, pp. 1772-1780, November 1998.

    [9] A. P. Chandrakasan, S. Sheng, and R. W. Brodersen, Low-Power CMOS Digital Design, IEEE Journal of Solid-State Circuits, vol. 27, no. 4, pp. 473-484, April 1992.

    [10] F. Najm, A Survey of Power Estimation Techniques in VLSI Circuits, IEEE Transactions on VLSI Systems, vol. 2, no. 4, pp. 446-455, December 1994.

    [11] M. Pedram, Power Estimation and Optimization at the Logic Level, International Journal of High-Speed Electronics and Systems, vol. 5, no. 2, pp. 179-202, 1994.

    [12] P. Landman, High-Level Power Estimation, ISLPED-96: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 29-35, Monterey, California, August 1996.

    [13] E. Macii, M. Pedram, F. Somenzi, High-Level Power Modeling, Estimation, and Optimization, IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems, vol. 17, no. 11, pp. 1061-1079, November 1998.

    [14] S. Borkar, Design Challenges of Technology Scaling, IEEE Micro, vol. 19, no. 4, pp. 23-29, July-August 1999.

    [15] S. Thompson, P. Packan, and M. Bohr, MOS Scaling: Transistor Challenges for the 21st Century, Intel Technology Journal, Q3, 1998.

    [16] Z. Chen, J. Shott, and J. Plummer, CMOS Technology Scaling for Low Voltage Low Power Applications, ISLPE-94: IEEE International Symposium on Low Power Electronics, pp. 56-57, San Diego, CA, October 1994.

    [17] Y. Ye, S. Borkar, and V. De, A New Technique for Standby Leakage Reduction in High-Performance Circuits, 1998 Symposium on VLSI Circuits, pp. 40-41, Honolulu, Hawaii, June 1998.

    [18] M. Pedram, Power Minimization in IC Design: Principles and Applications, ACM Transactions on Design Automation of Electronic Systems, vol. 1, no. 1, pp. 3-56, January 1996.

    [19] B. Chen and I. Nedelchev, Power Compiler: A Gate Level Power Optimization and Synthesis System, ICCD-97: IEEE International Conference on Computer Design, pp. 74-79, Austin, Texas, October 1997.

    [20] L. Benini, P. Siegel, and G. De Micheli, Automatic Synthesis of Gated Clocks for Power Reduction in Sequential Circuits, IEEE Design and Test of Computers, vol. 11, no. 4, pp. 32-40, December 1994.

    [21] Y. Yoshida, B.-Y. Song, H. Okuhata, T. Onoye, and I. Shirakawa, An Object Code Compression Approach to Embedded Processors, ISLPED-97: ACM/IEEE International Symposium on Low Power Electronics and Design, pp. 265-268, Monterey, California, August 1997.

    [22] L. Benini, A. Macii, E. Macii, and M. Poncino, Selective Instruction Compression for Memory Energy Reduction in Embedded Systems, ISLPED-99: ACM/IEEE 1999 International Symposium on Low Power Electronics and Design, pp. 206-211, San Diego, California, August 1999.

    [23] H. Lekatsas and W. Wolf, Code Compression for Low Power Embedded Systems, DAC-37: ACM/IEEE Design Automation Conference, pp. 294-299, Los Angeles, California, June 2000.

    [24] S. Segars, K. Clarke, and L. Goudge, Embedded Control Problems, Thumb and the ARM7TDMI, IEEE Micro, vol. 15, no. 5, pp. 22-30, October 1995.

    [25] D. Brooks, et al., Power-Aware Microarchitecture: Design and Modeling Challenges for Next-Generation Microprocessors, IEEE Micro, vol. 20, no. 6, pp. 26-44, November 2000.

    [26] L. Benini and G. De Micheli, Dynamic Power Management: Design Techniques and CAD Tools. Kluwer, 1997.

    [27] Intel, SA-1100 Microprocessor Technical Reference Manual. 1998.

    [28] L. Benini, A. Bogliolo, and G. De Micheli, A Survey of Design Techniques for System-Level Dynamic Power Management, IEEE Transactions on VLSI Systems, vol. 8, no. 3, pp. 299-316, June 2000.

    Some Interesting Links

    1) Center for Low Power Electronics:

    http://clpe.ece.arizona.edu/

    2) Bibliography on Dynamic Power Management:

    http://www.cse.unsw.edu.au/~danielp/cs1/power/files/bib.shtml

    3) European Low Power Initiative for Electronic System Design:

    http://www.ddtc.dimes.tudelft.nl/LowPower/index_f.html

    4) Low Power IP Library:

    http://www.ee.ed.ac.uk/~SLIg/iplibrary.html