low-power-estimation .ppt

7/28/2019 Low-Power-Estimation .ppt

1/82

Chapter 3 Power Estimation

Simulation based: Vectors are given, the circuit is known, and simulation is performed. The instantaneous currents are averaged.

Probabilistic analysis based: Averaging of the inputs is performed first, and probabilistic measures are extracted.


2/82


3/82

Power is also dissipated due to glitching activity in a circuit. Glitches occur due to different delays through different paths of the circuit.

A hazardous transition occurs at the output of the AND gate due to different delays through two different paths converging at the inputs to the AND gate.


4/82


5/82

The glitches can die while propagating through a logic gate if the width of the glitch is much smaller than the inertial delay of the logic gate.


6/82

3.1 Modeling of Signals

Stochastic process: Let $g(t)$, $t \in (-\infty, +\infty)$, be a stochastic process that takes the values of logical 0 or logical 1, transitioning from one to the other at random times.

Strict-sense stationary (SSS): A stochastic process is said to be strict-sense stationary if its statistical properties are invariant to a shift of the time origin. More importantly, the mean of such a process does not change with time.


7/82

Mean ergodic: If a constant-mean process $g(t)$ has a finite variance and is such that $g(t)$ and $g(t+\tau)$ become uncorrelated as $\tau \to \infty$, then $g(t)$ is mean ergodic.

Definition 3.1 (Signal Probability) The signal probability of signal $g(t)$ is given by

$P(g) = \lim_{T \to \infty} \frac{1}{2T} \int_{-T}^{+T} g(t)\,dt$


8/82

Definition 3.2 (Signal Activity) The signal activity of a logic signal $g(t)$ is given by

$A(g) = \lim_{T \to \infty} \frac{n_g(T)}{T}$

where $n_g(T)$ is the number of transitions of $g(t)$ in the time interval between $-T/2$ and $+T/2$.


9/82

If the primary inputs to the circuit are modeled as mutually independent SSS mean-ergodic 0-1 processes, then the probability of signal $g(t)$ assuming the logic value 1 at any given time $t$ becomes a constant, independent of time. It is referred to as the equilibrium signal probability of the random quantity $g(t)$ and is denoted by $P(g=1)$, which we refer to simply as signal probability.

Hence, $A(g)$ becomes the expected number of transitions per unit time.
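Definitions 3.1 and 3.2 can be approximated on a finite, uniformly sampled waveform. The sketch below is only an illustration: the function name `signal_stats`, the sample list, and the sampling step `dt` are assumptions, not part of the slides.

```python
def signal_stats(samples, dt):
    """Estimate signal probability P(g) and activity A(g) from 0/1 samples.

    samples: 0/1 values taken every dt time units (hypothetical input format).
    """
    if not samples:
        raise ValueError("need at least one sample")
    total_time = len(samples) * dt
    # The time average of g(t) approximates P(g) for a mean-ergodic signal.
    probability = sum(samples) / len(samples)
    # Count value changes between consecutive samples.
    transitions = sum(1 for a, b in zip(samples, samples[1:]) if a != b)
    # Transitions per unit time approximates A(g).
    activity = transitions / total_time
    return probability, activity


wave = [0, 1, 1, 0, 1, 0, 0, 1]
p, a = signal_stats(wave, dt=1.0)
print(p, a)
```

Longer observation windows bring both estimates closer to the limits in the definitions.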


10/82


11/82

3.2 Signal Probability Calculation

Inputs: Signal probabilities of all the inputs to a circuit

Output: Signal probabilities for all nodes of the circuit

Step 1: For each input signal and gate output in the circuit, assign a unique variable.

Step 2: Starting at the inputs and proceeding to the outputs, write the expression for the output of each gate as a function of its input expressions.


12/82

Step 3: Suppress all exponents in a given expression to obtain the correct probability for that signal.

Reconvergent fanout can produce expressions for the signal probability of internal nodes having exponents greater than 1. Intuitively, in probability expressions involving independent primary inputs, such exponents cannot be present.
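The exponent-suppression step can be sketched by keeping each probability expression as a multilinear polynomial. In this illustration (all names are hypothetical), a monomial is a set of variable names, so multiplying x by x yields x again, which is exactly the suppression of exponents.

```python
def var(name):
    """Polynomial for a single primary input."""
    return {frozenset([name]): 1}


def const(c):
    return {frozenset(): c}


def add(p, q, sign=1):
    out = dict(p)
    for mono, c in q.items():
        out[mono] = out.get(mono, 0) + sign * c
    return {m: c for m, c in out.items() if c != 0}


def mul(p, q):
    out = {}
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            mono = m1 | m2  # set union: x * x -> x (exponent suppressed)
            out[mono] = out.get(mono, 0) + c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def p_not(p):
    return add(const(1), p, sign=-1)


def p_and(p, q):
    return mul(p, q)


def p_or(p, q):
    return add(add(p, q), mul(p, q), sign=-1)


def evaluate(poly, probs):
    """Substitute input signal probabilities into the polynomial."""
    total = 0.0
    for mono, c in poly.items():
        term = c
        for v in mono:
            term *= probs[v]
        total += term
    return total


a, b, c = var("a"), var("b"), var("c")
f = p_or(p_and(a, b), p_and(a, c))  # input a fans out and reconverges
probs = {"a": 0.5, "b": 0.5, "c": 0.5}
print(evaluate(f, probs))  # 0.375
```

Without suppression the same expression would evaluate to 0.4375, because the term for a appears squared in the product of the two AND outputs.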


13/82

Let $f$ be written in a canonical sum of products of primary inputs as follows:

$f = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} s_k \right)$, where $s_k$ is either $x_k$ or $\bar{x}_k$. Since the product terms inside the summation are mutually exclusive and the primary inputs are independent, we have

$P(f) = \sum_{i=1}^{p} \left( \prod_{k=1}^{n} P(s_k) \right)$. This expression is defined as the canonical sum of probability products of $f$.


14/82

$P(x_k)\,P(\bar{x}_k) = P(x_k)\,(1 - P(x_k))$

$= P(x_k) - P^2(x_k)$

$= P(x_k) - P(x_k)$ (after suppressing the exponent)

$= 0$


15/82

3.3 Probabilistic Techniques for Signal Activity Estimation


16/82

3.3.1 Switching Activity in Combinational Logic

The Boolean difference of $f_j$ with respect to $x_i$ is defined as follows:

$\partial f_j / \partial x_i = f_j|_{x_i=1} \oplus f_j|_{x_i=0}$, where $\oplus$ denotes the exclusive-or operation.

The Boolean difference signifies the condition under which output $f_j$ is sensitized to input $x_i$.


17/82

If the primary inputs $x_i$, $i = 1, \ldots, n$, to logic gate M are not spatially correlated, then the signal activity at output $f_j$ is given by

$A(f_j) = \sum_{i=1}^{n} P(\partial f_j / \partial x_i)\, A(x_i)$   (3.6)

$P(\partial f_j / \partial x_i)$ signifies the probability of sensitizing input $x_i$ to output $f_j$, while $P(\partial f_j / \partial x_i)\, A(x_i)$ is the contribution of switching activity at output $f_j$ due to input $x_i$ only.


18/82


19/82

$\partial f_{and} / \partial x_1 = f_{and}|_{x_1=1} \oplus f_{and}|_{x_1=0} = x_2 \oplus 0 = x_2$

$A(f_{and}) = P(x_2)\,A(x_1) + P(x_1)\,A(x_2)$
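Equation (3.6) can be checked numerically by enumerating the other inputs of a gate. The sketch below assumes independent inputs and a gate given as a Python function over a list of bits; the function names and the numeric probabilities and activities are illustrative only.

```python
from itertools import product


def boolean_difference_prob(f, n, i, probs):
    """Probability that output f is sensitized to input i (independent inputs)."""
    total = 0.0
    others = [k for k in range(n) if k != i]
    for bits in product([0, 1], repeat=n - 1):
        x = [0] * n
        weight = 1.0
        for k, b in zip(others, bits):
            x[k] = b
            weight *= probs[k] if b else 1.0 - probs[k]
        x[i] = 1
        f1 = f(x)
        x[i] = 0
        f0 = f(x)
        if f1 ^ f0:  # Boolean difference evaluates to 1 for this assignment
            total += weight
    return total


def activity(f, n, probs, acts):
    """Equation (3.6): sum of sensitization probability times input activity."""
    return sum(boolean_difference_prob(f, n, i, probs) * acts[i] for i in range(n))


def and2(x):
    return x[0] & x[1]


# P(x1) = 0.5, P(x2) = 0.25, A(x1) = 0.2, A(x2) = 0.4 (made-up numbers)
a_and = activity(and2, 2, probs=[0.5, 0.25], acts=[0.2, 0.4])
print(a_and)
```

For the AND gate this reproduces P(x2)A(x1) + P(x1)A(x2), which is 0.25 for the numbers used here.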


20/82

Equation (3.6) fails to consider the effect of simultaneous switching of signals at logic gate inputs and, hence, can grossly overestimate signal activity.


21/82

The output switching activity is zero


22/82

3.4 Statistical techniques

The circuit is simulated repeatedly using a logic simulator and the switching activities at various nodes are noted.

Statistical mean estimation techniques are used in determining the stopping criteria in the Monte Carlo simulations.


23/82

3.4.1 Estimating Average Power in Combinational Circuits

Burch et al. experimentally determined that the power consumed by a circuit over a period $T$ has a normal distribution.

Let $p$ and $s$ be the measured average and the standard deviation of a random sample of $N$ power values, each measured over time $T$. Then with $(1 - \alpha) \times 100\%$ confidence we can write the following inequality:

$|p - P_{avg}| < t_{\alpha/2}\, s / \sqrt{N}$


24/82

where $t_{\alpha/2}$ is obtained from a t-distribution with $N - 1$ degrees of freedom and $P_{avg}$ is the true average power.

$|p - P_{avg}| / p < t_{\alpha/2}\, s / (p \sqrt{N}) < \epsilon$

$\epsilon$: the desired percentage error for the given confidence level $(1 - \alpha) \times 100\%$.
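The stopping criterion can be sketched as a loop that keeps drawing power samples until the relative half-width of the confidence interval drops below the target error. This is a minimal illustration, not the procedure from Burch et al.: the function name, the default parameters, and the synthetic power source are assumptions, and the normal quantile is used as a large-N stand-in for the t-distribution value.

```python
import random
import statistics


def monte_carlo_power(sample_power, eps=0.05, alpha=0.05,
                      min_samples=30, max_samples=100_000):
    """Draw power samples until z * s / (p * sqrt(N)) < eps."""
    # Assumption: for N >= 30 the normal quantile approximates t_{alpha/2}.
    z = statistics.NormalDist().inv_cdf(1.0 - alpha / 2.0)
    samples = [sample_power() for _ in range(min_samples)]
    while len(samples) < max_samples:
        p = statistics.fmean(samples)
        s = statistics.stdev(samples)
        if z * s / (p * len(samples) ** 0.5) < eps:
            break  # desired accuracy reached at the chosen confidence level
        samples.append(sample_power())
    return statistics.fmean(samples), len(samples)


# Synthetic stand-in for one simulation run: mean 10, standard deviation 1.
rng = random.Random(0)
estimate, n_used = monte_carlo_power(lambda: rng.gauss(10.0, 1.0), eps=0.01)
print(estimate, n_used)
```

In a real flow `sample_power` would run the simulator over one interval of length T with fresh random input vectors and return the measured power.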


25/82

3.4.2 Estimating Average Power in Sequential Circuits

The basic idea of Monte Carlo methods for estimating activity of individual nodes is to simulate a circuit by applying random-pattern inputs. The simulation has converged when the activities of individual nodes satisfy some stopping criteria.


26/82


27/82


28/82

3.5 Estimation of Glitching Power

Static Hazard: A static hazard is defined as the possible occurrence of a transient pulse on a signal line whose static value is not supposed to change.

Dynamic Hazard: A dynamic hazard is the possible occurrence of a spurious transition during the occurrence of a functional $0 \to 1$ or $1 \to 0$ transition.


29/82

Three-valued logic simulation for AND Gate

AND | 0  1  X
----+--------
0   | 0  0  0
1   | 0  1  X
X   | 0  X  X


30/82

Logic simulation can be used to detect probable static hazards by using a six-valued logic.

The estimate is pessimistic because some of these hazards might not be present under certain delay conditions.


31/82

Six-valued logic for Static hazard analysis

Logic value             Bit sequence at t, intermediate time, t+1
0   - static 0          000
1   - static 1          111
R   - rising            0U1
F   - falling           1U0
SH0 - static 0 hazard   0U0
SH1 - static 1 hazard   1U1


32/82

AND Operation with Six-Valued Logic

AND | 0    1    R    F    SH0  SH1
----+------------------------------
0   | 0    0    0    0    0    0
1   | 0    1    R    F    SH0  SH1
R   | 0    R    R    SH0  SH0  R
F   | 0    F    SH0  F    SH0  F
SH0 | 0    SH0  SH0  SH0  SH0  SH0
SH1 | 0    SH1  R    F    SH0  SH1
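The six-valued AND table can be derived from the three-valued AND applied position by position to the bit sequences. The sketch below assumes that the symbol U in the bit sequences behaves like X in the three-valued table; the dictionary and function names are illustrative.

```python
# Bit sequence of each six-valued symbol (start, intermediate, end).
SEQ = {"0": "000", "1": "111", "R": "0U1", "F": "1U0", "SH0": "0U0", "SH1": "1U1"}
NAME = {seq: name for name, seq in SEQ.items()}


def and3(a, b):
    """Three-valued AND on single symbols '0', '1', 'U' (U plays the role of X)."""
    if a == "0" or b == "0":
        return "0"  # a controlling 0 forces the output regardless of the other input
    if a == "1" and b == "1":
        return "1"
    return "U"


def and6(p, q):
    """Six-valued AND obtained by applying and3 to each bit position."""
    seq = "".join(and3(a, b) for a, b in zip(SEQ[p], SEQ[q]))
    return NAME[seq]


order = ["0", "1", "R", "F", "SH0", "SH1"]
for p in order:
    print(p.ljust(4), " ".join(and6(p, q).ljust(3) for q in order))
```

For example, a rising input ANDed with a falling input gives the sequence 0U0, a static 0 hazard, because both inputs may briefly be 1 at the intermediate time.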


33/82

{1000, 1100, 1110} corresponding to fast, medium, and slow falling signals.

Eight-valued logic is required for logic simulation to detect dynamic hazards.


34/82

Eight-valued logic for dynamic hazard analysis

Logic value              Bit sequence at t, two intermediate times, t+1
0   - static 0           0000
1   - static 1           1111
R   - rising             0001, 0011, 0111
F   - falling            1000, 1100, 1110
SH0 - static 0 hazard    0100, 0010, 0110
SH1 - static 1 hazard    1011, 1101, 1001
DH0 - dynamic 0 hazard   1010
DH1 - dynamic 1 hazard   0101


35/82

3.5.2 Delay Models

A circuit node where two reconvergent paths with different delays meet may have a large number of spurious transitions.

However, even in a tree-structured circuit with balanced paths there can be a large number of spurious transitions due to slight variations in delays.


36/82


37/82


38/82


39/82


40/82

3.5.2.1 Statistical Estimation

Delays are modeled as random variables and should be regenerated from time to time during the simulation.

Each new set of delays corresponds to another die, or to the same die under different operating conditions such as temperature and power supply voltage.


41/82

Activity $a$

$a = F(PI, D)$, where $PI$ denotes the primary input vectors and $D$ is a random vector consisting of all the random variables of gate delays.
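The dependence of activity on random delays can be illustrated with a toy circuit, y = x AND (NOT x), which never switches under a zero-delay model. The sketch below is a deliberately simplified inertial-delay model: the threshold rule, the delay range, and all names are assumptions made for illustration.

```python
import random


def glitch_transitions(rising_edges, inv_delay, and_inertial_delay):
    """Spurious output transitions of y = x AND (NOT x) in a toy delay model."""
    # A rising edge on x makes y = 1 for inv_delay time units. The pulse is
    # assumed to survive only if it is at least as wide as the AND gate's
    # inertial delay; otherwise it dies inside the gate.
    if inv_delay >= and_inertial_delay:
        return 2 * rising_edges  # each surviving pulse is a 0->1 and a 1->0
    return 0


def average_activity(rising_edges, sim_time, trials, rng, and_inertial_delay=0.5):
    """Average a = F(PI, D) over randomly drawn inverter delays D."""
    total = 0
    for _ in range(trials):
        inv_delay = rng.uniform(0.2, 1.0)  # new delay sample (hypothetical range)
        total += glitch_transitions(rising_edges, inv_delay, and_inertial_delay)
    return total / (trials * sim_time)


act = average_activity(rising_edges=5, sim_time=100.0, trials=1000,
                       rng=random.Random(1))
print(act)
```

Each drawn delay plays the role of a different die or operating condition, so the averaged result reflects glitching activity across that variation.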


42/82


43/82


44/82

The difference is due to the glitching activity.


45/82

While the different nonzero-delay models do track each other (except for one circuit, C6288, which has a depth of about 120 levels), it is clear that the nonzero-delay models can produce very different results compared to the zero-delay case. The difference is due to the glitching activity.


46/82

For some circuits minimum and maximum

average power can vary widely if uncertain

specifications of primary inputs exist.


47/82


48/82

The delay mismatch of different paths causes spurious transitions.


49/82

They are 20 times greater than those

obtained using the zero-delay model.


50/82

3.8 Power Dissipation in Domino CMOS

Domino logic circuits do not have direct-path short-circuit currents except when static pull-up devices are used to moderate the charge redistribution problem or when clock skew is not well dealt with.


51/82


52/82


53/82


54/82

Figure 3.36 CMOS gate y = (x1 + x2)x3 (signal B, fixed, drives X1; signal A, varying, drives X2)


55/82

3.10 Power Estimation at the Circuit Level

The gate presents a variable capacitance to the power/ground rails. The magnitude of this capacitance depends on the logic values at the inputs to the gate.

Suppose two signals A and B are to be connected to the two equivalent inputs $x_1$ and $x_2$ of the gate in Figure 3.36, and A very often has a transition while B stays at zero. Then A should be connected to $x_2$ and B to $x_1$, as this results in lower power consumption than the other assignment.


56/82

3.11 High Level Power Estimation


57/82

The lower order bits of a word are essentially uncorrelated in space and time, with a signal probability of 0.5 and a switching activity of 0.25, and are essentially independent of the data distribution.

The higher order bits show complete dependence because they represent the sign extension in two's complement representation.
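The two bit regions can be observed on a synthetic data stream. The sketch below assumes a zero-mean Gaussian stream of 16-bit two's complement words and reads the switching activity of 0.25 as the probability of a 0-to-1 transition between consecutive words; both the data model and that reading are assumptions.

```python
import random


def bit_stats(words, width):
    """Per-bit signal probability and 0->1 transition probability of a word stream."""
    mask = (1 << width) - 1
    enc = [w & mask for w in words]  # two's complement encoding of each word
    n = len(enc)
    probs, rises = [], []
    for b in range(width):
        bits = [(w >> b) & 1 for w in enc]
        probs.append(sum(bits) / n)
        rising = sum(1 for u, v in zip(bits, bits[1:]) if (u, v) == (0, 1))
        rises.append(rising / (n - 1))
    return probs, rises


rng = random.Random(0)
words = [round(rng.gauss(0.0, 1000.0)) for _ in range(20000)]
probs, rises = bit_stats(words, width=16)
print(round(probs[0], 3), round(rises[0], 3))    # least significant bit
print(round(probs[15], 3), round(rises[15], 3))  # sign bit
```

With this data range the top bits always carry the same value as the sign bit, which is the complete dependence among high order bits described above.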


58/82

3.12 Information-Theory-Based Approaches

The output entropy of Boolean functions can be used to predict the average minimized area of CMOS combinational circuits.

If $x$ is a random variable with a signal probability $p$, then the entropy of $x$ is defined as

$H(x) = p \log_2 \frac{1}{p} + (1 - p) \log_2 \frac{1}{1 - p}$


59/82

For a discrete variable $x$, which can take $n$ different values, the entropy is defined as

$H(x) = \sum_{i=1}^{n} p_i \log_2 \frac{1}{p_i}$, where $p_i$ is the probability that $x$ takes the $i$th value $x_i$.

Given input signal probabilities of 0.5, the output entropy of the Boolean function can be used to predict the area of its average minimized implementation as

$A = K \cdot \frac{2^n}{n} \cdot H(Y)$, where $A$ is the area of the implementation, $K$ is the proportionality constant, and $Y = f(X)$.
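The entropy and area formulas can be evaluated directly for small functions. The sketch below is restricted to single-output functions with uniformly distributed inputs, and the proportionality constant is set to 1 purely as a placeholder; the function names are illustrative.

```python
import math
from itertools import product


def entropy(probabilities):
    """H = sum of p_i * log2(1 / p_i), skipping zero-probability values."""
    return sum(p * math.log2(1.0 / p) for p in probabilities if p > 0.0)


def output_entropy(f, n):
    """Output entropy of a single-output Boolean function with P(input = 1) = 0.5."""
    ones = sum(f(x) for x in product([0, 1], repeat=n))
    p1 = ones / 2 ** n
    return entropy([p1, 1.0 - p1])


def area_estimate(f, n, k=1.0):
    """A = K * (2^n / n) * H(Y); k = 1.0 is a placeholder, not a calibrated value."""
    return k * (2 ** n / n) * output_entropy(f, n)


def and3(x):
    return x[0] & x[1] & x[2]


def xor3(x):
    return x[0] ^ x[1] ^ x[2]


print(output_entropy(and3, 3), area_estimate(and3, 3))
print(output_entropy(xor3, 3), area_estimate(xor3, 3))
```

The three-input XOR has the maximum output entropy of 1 bit, so the model predicts a larger area for it than for the three-input AND.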


60/82


61/82


62/82

RTL: Register Transfer Level

A high level technique to estimate power can be applied in the following three steps:

Step 1: Determine the input/output entropies of the combinational logic block by running RTL simulation of the sequential circuit.

Step 2: From the input/output entropies, determine the switching activity, the area, and an estimate of average power.

Step 3: Combine with latch and clock power to determine the total power dissipation.


63/82

Two approaches to determining lower bounds of maximum dynamic power in static CMOS circuits: deterministic (automatic test generation based) and simulation-based approaches.


64/82

The instantaneous power dissipation due to two consecutive input binary vectors is proportional to

$P_i = \sum_{\text{all gates } g} T(g)\, C(g)$,

where $C(g)$ denotes the output capacitance of gate $g$ and $T(g)$ is a binary variable that indicates whether gate $g$ switches or not in response to the two input vectors.


65/82

To justify the transition [i.e., to see if T(*) = 1 is achievable], the modified justification mechanism in a 5-V D algorithm (an ATG algorithm for stuck-at faults) is used.

In CMOS circuits, the capacitive load of a logic gate can be approximated by the fanout of the gate.

$P_i = \sum_{\text{all gates } g} [g(V_1) \oplus g(V_2)]\, F(g)$, where $V_1$, $V_2$ denote two consecutive input binary vectors to the circuit, g(*) represents the Boolean function of gate $g$ in terms of PIs, and $F(g)$ denotes the number of fanouts of gate $g$.
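The fanout-weighted switching measure can be evaluated for a given vector pair, and on a very small circuit the maximizing pair can be found by brute force. The sketch below uses exhaustive enumeration in place of the ATG-based justification described above, and the three-gate circuit with its fanout counts is entirely hypothetical.

```python
from itertools import product

# Hypothetical 3-input circuit: gate name -> (function of primary inputs, fanout).
GATES = {
    "g1": (lambda a, b, c: a & b, 2),
    "g2": (lambda a, b, c: b | c, 1),
    "g3": (lambda a, b, c: (a & b) ^ (b | c), 1),
}


def pair_power(v1, v2):
    """Sum over all gates of [g(V1) xor g(V2)] * F(g)."""
    return sum((g(*v1) ^ g(*v2)) * fanout for g, fanout in GATES.values())


def max_power_pair():
    """Exhaustive search over all vector pairs (feasible only for tiny circuits)."""
    vectors = list(product([0, 1], repeat=3))
    candidates = ((pair_power(v1, v2), v1, v2) for v1 in vectors for v2 in vectors)
    return max(candidates, key=lambda t: t[0])


best_power, best_v1, best_v2 = max_power_pair()
print(best_power, best_v1, best_v2)
```

In this example g3 is the XOR of g1 and g2, so all three gates can never switch together and the maximum is 3 rather than the total fanout of 4. Constraints of this kind are what the justification step has to resolve on larger circuits.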


66/82

The justification mechanism in the algorithm includes two processes: backtracing and implication.

Two composite values are said to be in conflict if they have 0 and 1 at the same position.

Experiments show the test generation approach is superior to the traditional simulation-based technique in both efficiency and the quality of the results.


67/82

Fig 7.6 The D algorithm: sensitization step (circuit with signals a to i, with f marked GUT)


68/82

Each gate is associated with a stack to store all the composite logic values a/b that have been assigned to $g$ [a and b denote $g(V_1)$ and $g(V_2)$, respectively]. The variables a and b can be 1, 0, or u (unknown). At each gate, the top of the stack stores the most recently updated value for the gate.


69/82

After assigning a rising transition (0/1) to x, y(V2) is forced to be 1.


70/82


71/82

Circuit level simulation

Extract circuit netlist description from layout
  Captures internal (diffusion) and external (wiring and gate fanout) capacitances

Run an analog simulation
  Characterization of device models (nfets, pfets)
  Solution of a large system of equations, so very computationally intensive (< few thousand transistors)

Can accurately estimate (within a few %) dynamic and leakage power dissipation

HSPICE, Spectre (Cadence), PowerMill (Synopsys)


72/82

Gate level simulation

Perform logic simulation to obtain the switching events for each net (signal)
  Logic description in structural VHDL or Verilog
  Zero-delay or unit-delay timing models

Determine the frequency of each net, $f_y = t_y / (2T)$, where $t_y$ is the number of logic switches of net $y$ and $T$ is the simulation time, to compute dynamic power

$P_{dyn} = \sum_y C_y V_{DD}^2 f_y$

Pre-layout, so $C_y$ must be estimated
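The dynamic power calculation from simulation toggle counts can be sketched as follows, summing the per-net contribution over all nets. The net names, toggle counts, and capacitance values are made up for illustration.

```python
def dynamic_power(toggle_counts, capacitances, vdd, sim_time):
    """Sum over nets y of C_y * VDD^2 * f_y, with f_y = t_y / (2T)."""
    power = 0.0
    for net, toggles in toggle_counts.items():
        freq = toggles / (2.0 * sim_time)  # two switches make one full cycle
        power += capacitances[net] * vdd ** 2 * freq
    return power


toggles = {"n1": 200, "n2": 50}       # hypothetical switch counts from simulation
caps = {"n1": 10e-15, "n2": 20e-15}   # hypothetical net capacitances in farads
p_dyn = dynamic_power(toggles, caps, vdd=1.0, sim_time=1e-6)
print(p_dyn)  # in watts
```

With these numbers n1 contributes 1.0 microwatt and n2 contributes 0.5 microwatt.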


73/82

Gate internal and leakage power

Use gate characterization ($E(g, e)$) and logic simulation event count ($f(g, e)$) to calculate the gate's dynamic internal power (short circuit and charging/discharging of internal capacitors)

$P_{int} = \sum_g \sum_e E(g, e)\, f(g, e)$

During simulation, record the fraction of time $T(g, s)/T$ that each gate $g$ stays in a particular state $s$ to calculate leakage power

$P_{leak} = \sum_g \sum_s E(g, s)\, T(g, s) / T$


74/82

Capacitance estimation

Device (diffusion and gate) capacitance
  Depends on width/length of the driving gate's source/drain diffusion and fanout gates
  Part of characterization of cell based designs

Wiring capacitance
  Depends on placement and routing
  Wire load: predict wire length of a net from the number of pins incident to the net
  Mapping table can be constructed from historical data of existing designs


75/82


76/82

Gate level simulation considerations

Simulation vectors need to be chosen carefully (application dependent)

Internal power really depends on operating voltage, temperature, process: multidimensional characterization

Accuracy within 5-10% of HSPICE

Signal glitches may not be modeled precisely (glitches depend on delays in the circuit)


77/82

Gate level probabilistic analysis

For each internal net $y$, determine the signal probability of the net with respect to the given signal probabilities of the primary inputs

From the signal probabilities, determine the transition density $D(y)$ of each internal net $y$

Compute the total power

$P = 0.5 \sum_y C_y V_{DD}^2 D(y)$

Pre-layout, so $C_y$ must be estimated


78/82

Determining signal probabilities

Signal probability definition

$P_1 = t_1 / (t_0 + t_1)$ and $P_0 = 1 - P_1$, where $t_1$ and $t_0$ are the times the signal spends at 1 and at 0

Propagate the given statistical quantities from the primary inputs to the internal signal nets and outputs of the circuit

Propagate quantities using a probabilistic signal propagation model


79/82

Signal propagation model

Apply Shannon's decomposition to the n-input Boolean function $y = f(x_1, \ldots, x_n)$

$y = x_i f_{x_i} + !x_i f_{!x_i}$, where $f_{x_i}$ ($f_{!x_i}$) is the new Boolean function obtained by setting $x_i = 1$ ($x_i = 0$) in $f(x_1, \ldots, x_n)$

$P(y) = P(x_i f_{x_i}) + P(!x_i f_{!x_i}) = P(x_i) P(f_{x_i}) + P(!x_i) P(f_{!x_i})$

Apply recursively (note: $P(!x_i) = 1 - P(x_i)$)
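The recursive use of Shannon's decomposition can be sketched as below. It assumes independent primary inputs and expands them in index order; the function name and the example function are illustrative.

```python
def signal_probability(f, probs, assigned=()):
    """P(f = 1) by recursive Shannon expansion over independent inputs."""
    i = len(assigned)
    if i == len(probs):
        return float(f(assigned))  # all inputs fixed: f is the constant 0 or 1
    p1 = signal_probability(f, probs, assigned + (1,))  # cofactor with x_i = 1
    p0 = signal_probability(f, probs, assigned + (0,))  # cofactor with x_i = 0
    return probs[i] * p1 + (1.0 - probs[i]) * p0


def example(x):
    # y = x1 x2 + x1 x3, with x1 feeding both product terms
    return (x[0] & x[1]) | (x[0] & x[2])


p_y = signal_probability(example, [0.5, 0.5, 0.5])
print(p_y)  # 0.375
```

Because each step fixes one input to a constant, the expansion handles reconvergent fanout correctly, at the cost of visiting up to 2^n input assignments.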


80/82

Determine transition density

For a transition (1-to-0 or 0-to-1) on $x_i$ to cause a transition on $y$, $f_{x_i} \oplus f_{!x_i} = 1$ must hold; this is the Boolean difference of $y$ with respect to $x_i$, denoted $dy/dx_i$

$P(dy/dx_i)$ is the probability that $dy/dx_i$ evaluates to 1, and $D(x_i)$ is the transition density of $x_i$

Then the total transition density of the net $y$ is

$D(y) = \sum_i P(dy/dx_i)\, D(x_i)$


81/82

Gate level probabilistic analysis considerations

Computationally efficient: only the signal probabilities and transition densities of each net must be computed to evaluate

$P = 0.5 \sum_y C_y V_{DD}^2 D(y)$

Assumes the given signal probabilities for the primary inputs are correct (if they are wrong, large errors are possible)

Gives average power dissipation values


82/82

Architectural level simulation

Perform RTL simulation to obtain the input activity for each major functional unit
  Architectural description in behavioral VHDL or Verilog or C, C++

Energy characterization of functional units
  Transition-sensitive energy models: system busses, ALUs, register file, pipeline registers
  Analytical energy models: caches, DRAMs
