2.low power device

8/4/2019 2.Low Power Device


2. Physics of Power Dissipation in CMOS FET Devices



For an ideal MIS diode, the energy difference $\phi_{ms}$ between the metal work function $\phi_m$ and the semiconductor work function $\phi_s$ is zero:

$\phi_{ms} \equiv \phi_m - (\chi + E_g/2q + \psi_B) = 0$ (2.1)

where $\chi$ is the semiconductor electron affinity (from conduction band to vacuum level), $E_g$ the band gap (from valence band to conduction band), $\phi_B$ the potential barrier between the metal and the insulator, and $\psi_B$ the potential difference between the Fermi level $E_F$ and the intrinsic Fermi level $E_i$.



The Fermi-Dirac Function

$f_{FD}(E) = 1 / (1 + \exp((E - E_F)/kT))$

The Fermi-Dirac distribution function gives the probability that a certain energy state will be occupied by an electron.

As in a gas, the electrons in a solid are in constant motion and are consequently changing their energy and momentum.
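The occupation probability can be evaluated numerically. The following is a minimal sketch, assuming energies are given in eV and a temperature of 300 K; the function name `fermi_dirac` and the sample energies are illustrative and do not come from the slides.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def fermi_dirac(energy_ev, fermi_ev, temp_k=300.0):
    """Probability that a state at energy_ev is occupied by an electron."""
    x = (energy_ev - fermi_ev) / (K_B_EV * temp_k)
    return 1.0 / (1.0 + math.exp(x))

# Sample the distribution around the Fermi level (illustrative offsets).
for offset in (-0.1, 0.0, 0.1):
    print(f"E - EF = {offset:+.1f} eV -> f = {fermi_dirac(offset, 0.0):.4f}")
```

A state exactly at the Fermi level has occupation probability 1/2; states well below it are almost always filled and states well above it almost always empty.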



P-type



CMOS Gate Power equations

$P = C_L V_{DD}^2 f_{0\to1} + t_{sc} V_{DD} I_{peak} f_{0\to1} + V_{DD} I_{leakage}$

Dynamic term: $C_L V_{DD}^2 f_{0\to1}$

Short-circuit term: $t_{sc} V_{DD} I_{peak} f_{0\to1}$

Leakage term: $V_{DD} I_{leakage}$
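The three terms can be compared numerically. This is a rough sketch in SI units; the function name `cmos_gate_power` and all sample values (50 fF load, 1.8 V supply, 100 MHz transition frequency, and so on) are hypothetical and chosen only to show the relative size of the terms.

```python
def cmos_gate_power(c_load, v_dd, f_01, t_sc, i_peak, i_leak):
    """Return the dynamic, short-circuit, leakage, and total power in watts."""
    dynamic = c_load * v_dd ** 2 * f_01
    short_circuit = t_sc * v_dd * i_peak * f_01
    leakage = v_dd * i_leak
    return {
        "dynamic": dynamic,
        "short_circuit": short_circuit,
        "leakage": leakage,
        "total": dynamic + short_circuit + leakage,
    }

# Hypothetical operating point, not taken from the slides.
power = cmos_gate_power(
    c_load=50e-15,  # F
    v_dd=1.8,       # V
    f_01=100e6,     # 0->1 transitions per second
    t_sc=50e-12,    # s
    i_peak=100e-6,  # A
    i_leak=1e-9,    # A
)
for name, value in power.items():
    print(f"{name:>13}: {value:.3e} W")
```

Because the dynamic term scales with the square of the supply voltage, lowering $V_{DD}$ is the most effective single lever in this expression.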



Maxwell-Boltzmann statistics relate the equilibrium hole concentration to the intrinsic Fermi level:

$p_0 = n_i \exp((E_i - E_F)/kT)$ (2.2)
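Equation (2.2) can be inverted to estimate the bulk potential from the doping level. The sketch below assumes full ionization ($p_0 \approx N_A$) and an intrinsic concentration of $10^{10}$ cm$^{-3}$ for Si at 300 K; both are assumptions not stated in the slides, and the function names are illustrative.

```python
import math

K_B_EV = 8.617333262e-5  # Boltzmann constant in eV/K

def hole_concentration(n_i, ei_minus_ef_ev, temp_k=300.0):
    """Equation (2.2): p0 = ni * exp((Ei - EF) / kT)."""
    return n_i * math.exp(ei_minus_ef_ev / (K_B_EV * temp_k))

def bulk_potential(n_a, n_i, temp_k=300.0):
    """Invert (2.2) with p0 ~ NA (assumed): (Ei - EF)/q = (kT/q) ln(NA/ni), in volts."""
    return K_B_EV * temp_k * math.log(n_a / n_i)

N_I = 1e10   # cm^-3, assumed intrinsic concentration of Si at 300 K
N_A = 1e17   # cm^-3, hypothetical acceptor doping

psi_b = bulk_potential(N_A, N_I)
print(f"psi_B = {psi_b:.3f} V")
print(f"p0 recovered from (2.2) = {hole_concentration(N_I, psi_b):.3e} cm^-3")
```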



P substrate (the Fermi level $E_F$ in the semiconductor is now $qV$ below the Fermi level in the metal gate)



P substrate



If the applied voltage is increased sufficiently, the bands bend far enough that level $E_i$ at the surface crosses over to the other side of level $E_F$. This is brought about by the tendency of carriers to occupy states with the lowest total energy.

In the present condition of inversion, $E_F$ at the surface lies closer to $E_c$ than to $E_v$, and electrons outnumber holes at the surface.



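At the onset of strong inversion the bands at the surface are bent by $2\psi_B$: $E_i$ at the surface now lies as far below $E_F$ as it lies above $E_F$ in the bulk, where $\psi_B$ is the potential difference between the Fermi level $E_F$ and the intrinsic Fermi level $E_i$ in the bulk.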



The value of $V$ necessary to reach the onset of strong inversion is called the threshold voltage.



Surface Space Charge Region and the Threshold Voltage

Poisson equation:

$\nabla \cdot \mathbf{D} = \rho(x, y, z)$ (2.3)

where $\mathbf{D}$, the electric displacement vector, is equal to $\varepsilon_s \mathbf{E}$ under low-frequency or static conditions; $\varepsilon_s$ is the permittivity of Si; $\mathbf{E}$ the electric field vector; and $\rho(x, y, z)$ the total electric charge density.



Threshold voltage

$V_T = (2d/\varepsilon_i)\,[q \varepsilon_s N_A \psi_B (1 - e^{-2\beta\psi_B})]^{0.5} + 2\psi_B$

The total voltage needed to offset the effect of the nonzero work function difference and the presence of the charges is referred to as the flat-band voltage $V_{FB}$:

$V_{FB} = \phi_{ms} - Q_T d/\varepsilon_i$



Threshold voltage

$V_T = (2d/\varepsilon_i)\,[q \varepsilon_s N_A \psi_B (1 - e^{-2\beta\psi_B})]^{0.5} + 2\psi_B + V_{FB}$
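The threshold voltage expression can be evaluated directly. The sketch below is a rough illustration in SI units, assuming that $d$ is the insulator thickness, $\varepsilon_i$ the insulator permittivity (taken as $3.9\,\varepsilon_0$ for SiO2), $\varepsilon_s = 11.7\,\varepsilon_0$ for Si, and $\beta = q/kT$; none of these values are given in the slides. The doping, oxide thickness, and flat-band voltage are hypothetical.

```python
import math

Q = 1.602176634e-19        # elementary charge, C
K_B = 1.380649e-23         # Boltzmann constant, J/K
EPS_0 = 8.8541878128e-12   # vacuum permittivity, F/m
EPS_SI = 11.7 * EPS_0      # assumed permittivity of Si
EPS_OX = 3.9 * EPS_0       # assumed permittivity of the insulator (SiO2)

def threshold_voltage(d, n_a, psi_b, v_fb=0.0, temp_k=300.0):
    """V_T from the expression above; d in m, n_a in m^-3, psi_b and v_fb in V."""
    beta = Q / (K_B * temp_k)  # assumed meaning of beta: q/kT
    charge_term = Q * EPS_SI * n_a * psi_b * (1.0 - math.exp(-2.0 * beta * psi_b))
    return (2.0 * d / EPS_OX) * math.sqrt(charge_term) + 2.0 * psi_b + v_fb

# Hypothetical device: NA = 1e17 cm^-3 (1e23 m^-3), 5 nm oxide,
# psi_B from (kT/q) ln(NA/ni) with an assumed ni = 1e10 cm^-3.
psi_b = (K_B * 300.0 / Q) * math.log(1e17 / 1e10)
vt_ideal = threshold_voltage(5e-9, 1e23, psi_b)
vt_real = threshold_voltage(5e-9, 1e23, psi_b, v_fb=-0.9)
print(f"V_T with V_FB = 0:      {vt_ideal:.3f} V")
print(f"V_T with V_FB = -0.9 V: {vt_real:.3f} V")
```

The flat-band voltage enters as a plain additive shift, so any work function difference or insulator charge moves $V_T$ by the same amount.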



2.2.3.1 Effects Influencing Threshold Voltage

$V_T$ decreases when $L$ (length) is decreased, varies with $Z$ (width), and decreases when the drain-source voltage $V_{DS}$ is increased.



Drain-induced barrier lowering (DIBL) is the basis for a number of more complex models of the threshold voltage shift.

It refers to the decrease in threshold voltage due to the depletion region charges in the potential barrier between the source and the channel at the semiconductor surface.



A recent model adopts a quasi-two-dimensional approach to solving the two-dimensional Poisson equation.

$\partial E_x/\partial x$ at each point $(x, y)$ can be replaced with the average of its value at $(0, y)$ and at $(W, y)$.



Short channel effect

The minimum value of the surface potential increases with decreasing channel length and increasing $V_{DS}$.



2.2.3.2 Subsurface Drain-Induced Barrier Lowering (Punchthrough)

The punchthrough voltage $V_{PT}$ is defined as the value of $V_{DS}$ at which $I_{D,st}$ reaches some specific magnitude with $V_{GS} = 0$.

The parameter $V_{PT}$ can be roughly approximated as the value of $V_{DS}$ for which the sum of the widths of the source and the drain depletion regions becomes equal to $L$.



If the field in the oxide, $E_{ox}$, is large enough, the voltage drop across the depletion layer suffices to enable tunneling in the drain via a near-surface trap.

The minority carriers emitted to the incipient inversion layer are laterally removed to the substrate, completing a path for a gate-induced drain leakage (GIDL) current. In CMOS circuits this leakage current contributes to standby power.



2.3 Power Dissipation in CMOS

The first ICs ever fabricated used a PMOS process. This is due to the simplicity of fabrication of a p-channel enhancement mode MOS field-effect transistor (PMOST) with threshold voltage $V_{Tp}$ [...]

$E_{0\to1} = C_L V_{DD}^2$

The energy stored in a capacitor with capacitance $C_L$ and voltage $V_{DD}$ across its plates is $C_L V_{DD}^2/2$; the rest of the energy, another $C_L V_{DD}^2/2$, is converted into heat.
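The equal split between stored and dissipated energy can be checked numerically. The sketch below models the charging path as a fixed linear resistance, which is a simplifying assumption (a real pull-up transistor is nonlinear), and integrates the resistive loss $i^2 R$ with a midpoint rule. The function name and component values are hypothetical.

```python
import math

def charging_energy_split(c_load, v_dd, r_on, steps=20_000):
    """Return (energy drawn from supply, energy stored on C, heat in R) in joules."""
    tau = r_on * c_load
    dt = 20.0 * tau / steps  # integrate over 20 time constants
    dissipated = 0.0
    for k in range(steps):
        t_mid = (k + 0.5) * dt
        current = (v_dd / r_on) * math.exp(-t_mid / tau)  # RC charging current
        dissipated += current * current * r_on * dt
    drawn = c_load * v_dd ** 2
    stored = 0.5 * c_load * v_dd ** 2
    return drawn, stored, dissipated

# The dissipated energy does not depend on the resistance value.
for r_on in (1e3, 10e3):
    drawn, stored, heat = charging_energy_split(50e-15, 1.8, r_on)
    print(f"R = {r_on:7.0f} ohm: drawn {drawn:.3e} J, stored {stored:.3e} J, heat {heat:.3e} J")
```

The resistance only sets how fast the capacitor charges; the heat generated per transition stays at half the energy drawn from the supply.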



Networks of pass transistors



2.3.3 The Load Capacitance



The overall load capacitance is modeled as the parallel combination of four capacitances:

the gate capacitance $C_g$,

the overlap capacitance $C_{ov}$,

the diffusion capacitance $C_{diff}$,

and the interconnect capacitance $C_{int}$.



2.3.3.2 The Overlap Capacitance

$C_{gd1} = C_{gd2} = 2 C_{ox} x_d W$

$C_{gd3} = C_{gd4} = C_{gs3} = C_{gs4} = C_{ox} x_d W$

The total overlap capacitance is simply the sum of all the above:

$C_{ov} = C_{gd1} + C_{gd2} + C_{gd3} + C_{gd4} + C_{gs3} + C_{gs4}$
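As a quick numerical illustration, the sketch below sums the six overlap terms and then adds the other load components in parallel. It assumes a single width $W$ for all four transistors, as the expressions above do; the function names and every numeric value are hypothetical.

```python
def overlap_capacitance(c_ox, x_d, width):
    """Sum of the six overlap terms; c_ox in F/m^2, x_d and width in m."""
    unit = c_ox * x_d * width
    c_gd1 = c_gd2 = 2.0 * unit                 # factor of 2 as in the expression above
    c_gd3 = c_gd4 = c_gs3 = c_gs4 = unit
    return c_gd1 + c_gd2 + c_gd3 + c_gd4 + c_gs3 + c_gs4

def load_capacitance(c_g, c_ov, c_diff, c_int):
    """Capacitors in parallel add directly."""
    return c_g + c_ov + c_diff + c_int

# Hypothetical values: Cox = 6 fF/um^2, xd = 0.05 um, W = 1 um.
c_ov = overlap_capacitance(c_ox=6e-3, x_d=0.05e-6, width=1e-6)
c_load = load_capacitance(c_g=3e-15, c_ov=c_ov, c_diff=2e-15, c_int=5e-15)
print(f"C_ov = {c_ov * 1e15:.2f} fF, C_L = {c_load * 1e15:.2f} fF")
```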



2.3.3.3 Diffusion Capacitance

Two components: the bottomwall area capacitance and the sidewall capacitance



2.4.1 Principles of Low-Power Design

Using the lowest possible supply voltage

Using the smallest geometry, highest frequency devices but operating them at the lowest possible frequency

Using parallelism and pipelining to lower required frequency of operation

Power management by disconnecting the power source when the system is idle

Designing systems to have lowest requirements on subsystem performance for the given user level functionality



2.4.3 Fundamental Limits

The limit from thermodynamic principles results from the need to have, at any node with an equivalent resistor $R$ to the ground, the signal power $P_s$ exceed the available noise power $P_{avail}$.

The quantum theoretic limit on low power comes from the Heisenberg uncertainty principle. In order to be able to measure the effect of a switching transition of duration $\Delta t$, it must involve an energy greater than $h/\Delta t$:

$P \ge h/(\Delta t)^2$

where $h$ is Planck's constant.



Finally, the fundamental limit based on electromagnetic theory requires the velocity of propagation of a high-speed pulse on an interconnect to be always less than the speed of light in free space, $c_0$:

$\tau \ge L/c_0$

where $L$ is the length of the interconnect and $\tau$ is the interconnect transit time.
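The quantum and electromagnetic limits reduce to two one-line bounds that are easy to evaluate. The sketch below is only an order-of-magnitude illustration; the function names, the 1 ps switching time, and the 1 cm interconnect length are hypothetical choices.

```python
H_PLANCK = 6.62607015e-34  # Planck's constant, J*s
C0 = 299_792_458.0         # speed of light in free space, m/s

def quantum_power_limit(delta_t):
    """Lower bound on switching power, h / (delta_t)^2, in watts."""
    return H_PLANCK / delta_t ** 2

def min_transit_time(length_m):
    """Lower bound on interconnect transit time, L / c0, in seconds."""
    return length_m / C0

p_min = quantum_power_limit(1e-12)   # hypothetical 1 ps transition
tau_min = min_transit_time(0.01)     # hypothetical 1 cm interconnect
print(f"P   >= {p_min:.3e} W for a 1 ps transition")
print(f"tau >= {tau_min:.3e} s for a 1 cm interconnect")
```

Both bounds sit far below what practical CMOS circuits reach, which is why the material, device, circuit, and system limits matter more in practice.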



2.4.4 Material Limits

The attributes of a semiconductor material that determine the properties of a device built with the material are:

Carrier mobility $\mu$

Carrier saturation velocity $v_s$

Self-ionizing electric field strength $E_c$

Thermal conductivity $K$



Consider an SOI structure formed by surrounding the above generic device with a hemispherical shell of SiO2 of radius $r_i$, which implies a two-order-of-magnitude reduction in thermal conductivity.



The response time of the global interconnect circuit is

$\tau = (2.3 R_{tr} + R_{int}) C_{int}$

where $R_{tr}$ is the output resistance of the driving transistor and $R_{int}$ and $C_{int}$ are the total resistance and capacitance, respectively, of the global interconnect.
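A small numerical sketch of the response time expression follows. The function name and the component values (1 kΩ driver, 200 Ω and 1 pF of wire) are hypothetical and only show how the driver resistance dominates when it is much larger than the wire resistance.

```python
def interconnect_response_time(r_tr, r_int, c_int):
    """tau = (2.3 * R_tr + R_int) * C_int, in seconds for ohms and farads."""
    return (2.3 * r_tr + r_int) * c_int

# Hypothetical driver and wire values.
tau = interconnect_response_time(r_tr=1e3, r_int=200.0, c_int=1e-12)
print(f"tau = {tau * 1e9:.2f} ns")
```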



2.4.7 System Limits

The architecture of the chip

The power-delay product of the CMOS

technology used to implement the chip

The heat removal capacity of the chip package

The clock frequency

Its physical size



Energy characterization

Transition-sensitive energy models

Single energy tables: bit-independent modules, e.g., flip-flops

Multiple energy tables: large bit-dependent modules, e.g., 32-b adders; large multi-element modules, e.g., register files

Transition-sensitive energy equations

System-level interconnect capacitance values

Analytical energy models

Cache and main memory



Transition-sensitive energy model

Must first design and lay out a functional unit and then simulate it to capture switch capacitances

Bit independent: bus lines, pipeline registers (one bit switching does not affect other bit slices' operations)

Bit dependent: ALU, decoders

Once constructed, the models can be reused in simulations of other architectures built with the same technology



Switch Capacitance Table

Previous Input Vector    Current Input Vector    Switch Capacitance
000                      000                     cap_{0,0}
000                      001                     cap_{0,1}
...                      ...                     ...
111                      111                     cap_{2^n-1,2^n-1}
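A table of this shape can be used as a lookup keyed by the (previous, current) input pair. The sketch below is illustrative only: the capacitance values are placeholders (10 fF per toggled input bit) rather than simulated data, and the energy per transition is assumed to be the switched capacitance times $V_{DD}^2$, following the $C_L V_{DD}^2$ form of the dynamic power term.

```python
from itertools import product

def build_placeholder_table(n_bits, cap_per_toggle=10e-15):
    """Build a (previous, current) -> capacitance table with placeholder values."""
    vectors = ["".join(bits) for bits in product("01", repeat=n_bits)]
    table = {}
    for prev, cur in product(vectors, repeat=2):
        toggles = sum(a != b for a, b in zip(prev, cur))
        table[(prev, cur)] = toggles * cap_per_toggle  # hypothetical, not simulated
    return table

def switched_energy(cap_table, vectors, v_dd):
    """Sum the table entries over consecutive input pairs and scale by VDD^2."""
    total_cap = sum(cap_table[(prev, cur)] for prev, cur in zip(vectors, vectors[1:]))
    return total_cap * v_dd ** 2  # assumed energy model: C_switched * VDD^2

table = build_placeholder_table(2)
energy = switched_energy(table, ["00", "01", "11", "11"], v_dd=1.8)
print(f"{len(table)} rows, energy = {energy:.3e} J")
```

The table has $2^n \times 2^n$ rows for an $n$-input module, which is what makes uncompressed tables impractical for wide modules.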



Table Compression

Problem

Results in large uncompressed table (e.g., 16-bit adder: $2^{32}$ rows)

Excessive simulation (e.g., $2^{32}$!)

Solution

Clustering Algorithm

Reference: Huzefa Mehta, et al., "Module Energy Characterization using Clustering," DAC '96

For 16-bit adder, to keep 12% average error: 1000 simulation points, 97 rows



2:1 Multiplexer Table

Uncompressed (64 rows)
000 000 0.00
000 001 0.00
000 010 0.00
000 011 0.00
000 100 0.04
000 101 0.05
000 110 0.04
000 111 0.05
001 000 0.00
001 001 0.00
...

Compressed (32 rows)
000 0xx 0.00
000 100 0.04
000 101 0.05
000 110 0.04
000 111 0.05
001 0xx 0.00
...

Reduced (11 rows)
000 0xx 0.00
000 1xx 0.045
001 0xx 0.00
...
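The effect of the reduction can be reproduced on the rows shown. The sketch below is not the clustering algorithm from the cited reference; it is a much simpler stand-in that groups rows by the previous vector and the leading bit of the current vector, then averages each group. The row values are copied from the uncompressed excerpt; the function name `reduce_table` is illustrative.

```python
from collections import defaultdict

# Rows from the uncompressed excerpt: (previous vector, current vector, capacitance).
ROWS = [
    ("000", "000", 0.00), ("000", "001", 0.00), ("000", "010", 0.00),
    ("000", "011", 0.00), ("000", "100", 0.04), ("000", "101", 0.05),
    ("000", "110", 0.04), ("000", "111", 0.05), ("001", "000", 0.00),
    ("001", "001", 0.00),
]

def reduce_table(rows, keep_bits=1):
    """Merge rows whose current vectors share the first keep_bits bits, averaging values."""
    groups = defaultdict(list)
    for prev, cur, cap in rows:
        pattern = cur[:keep_bits] + "x" * (len(cur) - keep_bits)
        groups[(prev, pattern)].append(cap)
    return {key: sum(caps) / len(caps) for key, caps in groups.items()}

reduced = reduce_table(ROWS)
for (prev, pattern), cap in reduced.items():
    print(prev, pattern, f"{cap:.3f}")
```

Averaging 0.04, 0.05, 0.04, and 0.05 gives the 0.045 entry of the reduced table, at the cost of a small error on each individual row.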



Memory System Energy Model

Parameterizable analytical energy models for the on-chip memories that capture:

Energy dissipated by bitlines: precharge, read and write cycles

Energy dissipated by wordlines: when a particular row is being read and written

Energy dissipated by storage cell on access

Energy dissipated by address decoders

Energy dissipated by peripheral circuits: cache control logic, comparators, etc.

Off-chip main memory energy is based on per-access cost



Cache energy model example

On-chip cache

Energy = Ebus + Ecell + Epad

Ecell = β * (Wl_length) * (Bl_length + 4.8) * (Nhit + 2 * Nmiss)

Wl_length = m * (T + 8L + St)

Bl_length = C / (m * L)

Nhit = number of hits; Nmiss = number of misses; C = cache size; L = cache line size in bytes; m = set associativity; T = tag size in bits; St = # of status bits per line

β = 1.44e-14 (technology based cell access cost of SRAM)

Em = 4.95e-9 (technology based access cost of DRAM)
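The cell-array term of this model is simple enough to evaluate by hand or in a few lines of code. The sketch below implements only Ecell, Wl_length, and Bl_length as written above; it assumes the cache size C is given in bytes and that the 1.44e-14 coefficient is in joules, neither of which the slide states. The cache configuration and hit/miss counts are hypothetical.

```python
BETA = 1.44e-14  # technology-based SRAM cell access cost from the slide (unit assumed: J)

def wordline_length(m, tag_bits, line_bytes, status_bits):
    """Wl_length = m * (T + 8L + St)."""
    return m * (tag_bits + 8 * line_bytes + status_bits)

def bitline_length(cache_bytes, m, line_bytes):
    """Bl_length = C / (m * L); C assumed to be in bytes."""
    return cache_bytes / (m * line_bytes)

def cell_energy(cache_bytes, line_bytes, m, tag_bits, status_bits, n_hit, n_miss):
    """Ecell = beta * Wl_length * (Bl_length + 4.8) * (Nhit + 2 * Nmiss)."""
    wl = wordline_length(m, tag_bits, line_bytes, status_bits)
    bl = bitline_length(cache_bytes, m, line_bytes)
    return BETA * wl * (bl + 4.8) * (n_hit + 2 * n_miss)

# Hypothetical 8 KB, 2-way cache with 32-byte lines, 20-bit tags, 2 status bits.
e_cell = cell_energy(8192, 32, 2, 20, 2, n_hit=900, n_miss=100)
print(f"Wl_length = {wordline_length(2, 20, 32, 2)}")
print(f"Bl_length = {bitline_length(8192, 2, 32):.0f}")
print(f"Ecell     = {e_cell:.3e}")
```

A miss is weighted twice as heavily as a hit in this model, so reducing the miss rate lowers the cell energy as well as the off-chip access cost.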



Architectural Level Analysis Considerations

Very computationally efficient

Requires predefined analytical and transition-sensitive energy characterization models

Requires design only to RTL (with some idea as to the kind of functional units planned)

Coarse grain: use of gated clocks implicit

Reasonably accurate (within 5% - 15% of SPICE)



Simulation based, so can be used to support architectural, compiler, OS, and application level experimentation

WattWatcher (Sente), DesignPower and PowerCompiler (Synopsys), prototype academic tools (Wattch, Princeton; SimplePower, PSU)
