experiences of low power design implementation and ... · experiences of low power design...

19
Experiences of Low Power Experiences of Low Power Design Implementation and Design Implementation and Verification Verification Shih Shih - - Hao Hao Chen Chen Global Global Unichip Unichip Corp. Corp. Hsinchu Hsinchu , Taiwan , Taiwan

Upload: vuongnhu

Post on 29-Jun-2018

213 views

Category:

Documents


0 download

TRANSCRIPT

Experiences of Low Power Experiences of Low Power Design Implementation and Design Implementation and

VerificationVerification

ShihShih--HaoHao ChenChenGlobal Global UnichipUnichip Corp.Corp.

HsinchuHsinchu, Taiwan, Taiwan

11

AgendaAgenda

IntroductionIntroduction

Power GatingPower Gating

Low Power VerificationLow Power Verification

Dynamic IR PreventionDynamic IR Prevention

ConclusionConclusion

11

22

33

44

55

22

Force and ImpactForce and Impact–– Application, cost, process technology driven Application, cost, process technology driven –– Complexity, power density, leakage increaseComplexity, power density, leakage increase–– TradeTrade--off: area, power, performance, reliability, cost, off: area, power, performance, reliability, cost, ……

Technology is Driven to Low PowerTechnology is Driven to Low Power

Require battery powerRequire battery power

ComplexityComplexity

Design methodologyDesign methodology

VerificationVerification

Power densityPower densityComponentComponent

LeakageLeakage

90nm

45nm

65nm

33

Should be achievedShould be achieved fromfrom various fieldsvarious fields–– Software, architecture, logical design, physical Software, architecture, logical design, physical

implementation, IP/library support, process technology, implementation, IP/library support, process technology, ……

Power OptimizationPower Optimization

HighHigh

LowLow

OpportunityOpportunity a*C*Fa*C*F

MSV MSV DVFS DVFS

SRPG SRPG

MultiMulti--VT opt.VT opt.

Power gatingPower gatingBack biasingBack biasing

GateGate--length opt.length opt.

ImplementationImplementation

Modeling &Modeling &EstimationEstimation

VerificationVerification

AlgorithmAlgorithmSchedulingScheduling

Instruction opt.Instruction opt.

FF clusteringFF clusteringSizing, Clock meshSizing, Clock mesh

Wire reductionWire reduction

PipeliningPipeliningMemory hierarchyMemory hierarchyOperand isolationOperand isolation

Clock gatingClock gating

VV22 LeakageLeakageSoftwareSoftware

PolicyPolicy

System System ArchitectureArchitecture

SynthesisSynthesisLogic DesignLogic Design

Physical Physical ImplementationImplementation

IP/Library IP/Library ProcessProcess

44

Power Gating Power Gating OffOff--chip controlchip control–– Take long time to wakeTake long time to wake--upup

OnOn--chip controlchip control–– SwitchSwitch--able pad implementation needs extra I/O spaceable pad implementation needs extra I/O space–– Power switch implementation becomes more popular Power switch implementation becomes more popular

OnOn--chip controlchip control

OffOff--chipchipcontrolcontrol

55

The most effectiveThe most effective way to manage way to manage leakageleakagePenaltyPenalty–– IIncurncurss power power by sleep transistorsby sleep transistors and and have have area penaltyarea penalty–– Has performance degradation issue due to IR drop Has performance degradation issue due to IR drop –– NNeedeedss extra extra dede--capcap to reduce the power noiseto reduce the power noise

MTCMOS TechniqueMTCMOS Technique

Gated BlocksGated Blocks

IR dropIR drop

R1: True VDD grid R1: True VDD grid R1: True VDD grid

R2: Power switchR2: Power switchR2: Power switch

R3: Virtual VDD gridR3: Virtual VDD gridR3: Virtual VDD grid

Power SourcePower SourcePower Source

PowerPower--upupsequencesequence

66

Determine switch allocation, powerDetermine switch allocation, power--up sequence up sequence –– Subject to: area, power, rampSubject to: area, power, ramp--up time and peak currentup time and peak current

0.E+00

1.E-04

2.E-04

3.E-04

4.E-04

5.E-04

6.E-04

7.E-04

8.E-04

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 Vds (V)

Current (A)Vg=0PMOS (Header)

Power Switch Planning Power Switch Planning

Mutation(domino)

Concurrence(fish-bone)

SteadySteady--statestateLinear regionLinear region

33

PowerPower--upupSaturation regionSaturation region22

OffOff11

Ramp-up time

VoltageVDD

upper/lower bound of rampupper/lower bound of ramp--up timeup time

Current

Peak

One-by-one(daisy chain)

trade-off betweenpeak current & ramp-up time

77

Power Switch AssemblyPower Switch AssemblyPartition/Cluster switch cells in banksPartition/Cluster switch cells in banks–– Peak current limit (max concurrence)Peak current limit (max concurrence)–– WakeWake--up time limit (max depth)up time limit (max depth)

7

6

1

5

2

4

3

8

7

2

6

3

5

4

4

5

5

6

7

3

4

6

5

6

7

9

8

Root

DepthDepthwithin bankwithin bank

Concurrence = 5Concurrence = 5Concurrence = 5Concurrence = 5

Concurrence = 4Concurrence = 4

DepthDepthcrosscross--banksbanks

88

Effect of PowerEffect of Power--up Sequenceup Sequence

2,400 switch cells partitioned in 20 banks, 2,400 switch cells partitioned in 20 banks, and assembled in the same configuration and assembled in the same configuration (Domino Fashion)(Domino Fashion)

4 root position: BL, BR, TL, TR4 root position: BL, BR, TL, TR

Power source

Lb (x,y)

Pi(x,y)

Pi(x,y)

Root

99

root

7

6

1

5

2

4

3

7

2

6

3

5

4

4

5

5

6

7

3

7

4

6

5

12

11

6

10

7

9

8

root

4

3

2

2

5

4

9

7

3

Power Switch OptimizationPower Switch OptimizationSizing/RemovalSizing/RemovalReorderingReordering–– To reduce rush current and dynamic IR problems To reduce rush current and dynamic IR problems

7

6

3

45

7

6

8

10

9

10

8

9

7

8 5

C5=5

C6=6C7=7

C4=4

C5=4

C6=3C7=4

C4=52

4

4

3

57

36

1

6

1010

BL BR

TR TL

Dynamic IR vs. PowerDynamic IR vs. Power--up Sequenceup Sequence

VSSVDDConfig

5.5%5.5%TR5%5.5%TL5.5%6%BR6.5%7%BL

Dynamic IR(%)

candidatecandidate

1111

Verification is a Key to SuccessVerification is a Key to SuccessComprehensive low power verificationComprehensive low power verification–– DDesign quality check esign quality check –– Electrical check, functionalElectrical check, functional correctnesscorrectness, I, IR/EM R/EM analysisanalysis–– Dynamic IRDynamic IR

MultiMulti--depths Sleep Modedepths Sleep Mode

missing isolation or shifter?missing isolation or shifter?

incapable sleep control incapable sleep control propagation?propagation?

incorrect power domain incorrect power domain connection? power routing?connection? power routing?

incomplete clock gating?incomplete clock gating?

good decision?good decision?

IR/EM analysis?IR/EM analysis?

1212

Dynamic IR Drop Affects YieldDynamic IR Drop Affects YieldDynamic IR Dynamic IR is critical at 90nm and belowis critical at 90nm and below–– TTiming variation becomes more voltage sensitiveiming variation becomes more voltage sensitive–– IIncreasing the power grid width ncreasing the power grid width is notis not efficient enoughefficient enough

Source: TSMC

ItIt’’s s not only speed degradationnot only speed degradation problemproblem, but also cause chip failure, but also cause chip failure !

1313

Scan Mode Dynamic IR is CriticalScan Mode Dynamic IR is CriticalEven worst in scan modeEven worst in scan mode–– Delay is more sensitive to IR drop as technology shrinkingDelay is more sensitive to IR drop as technology shrinking–– SimulationSimulation--based approach is effort consumingbased approach is effort consuming–– Analysis at early stage is criticalAnalysis at early stage is critical

VDD: 0% ~ 0.75%VDD: 0% ~ 0.75% VDD: 17.5% ~ 18.7%VDD: 17.5% ~ 18.7%

Static AnalysisStatic Analysis VCDVCD--based Analysisbased Analysis

1414

Dynamic IR AnalysisDynamic IR AnalysisTraditional flow is inefficient Traditional flow is inefficient –– Effort consuming to grab VCD pattern and timing windowEffort consuming to grab VCD pattern and timing window–– No enough space around No enough space around hot spots hot spots ((too late)too late)–– May worsenMay worsen leakage and yieldleakage and yield (i(inefficient denefficient de--cap insertioncap insertion))

0K

50K

100K

150K

200K

Deca

p In

serti

on

02

46

810

1214

IR-d

rop

(%)

Nr. De-cap IR-drop

Gain ~1% dynamic IR saving, but pay ~200K de-cap cells.

Power planningPower planning

PlacementPlacement

1st power analysis1st power analysis

CTS/Post-routeCTS/Post-route

2nd power analysis2nd power analysis

Insert De-capInsert De-cap

Insert De-capInsert De-cap

Meet?

Meet?

VCDpattern

VCDpattern

NoYes

YesNo

Traditional Flow Traditional Flow Traditional Flow

Lengthy Loop

1515

Dynamic IR Failure PredictionDynamic IR Failure PredictionFlipFlip--flop density ruleflop density rule–– Cell padding is applied during CTS and timing opt.Cell padding is applied during CTS and timing opt.–– DDynamicynamic--IR IR is controlled is controlled while maintaining timingwhile maintaining timing–– DeDe--cap cell insertion becomes more efficientcap cell insertion becomes more efficient

VCDVCD--based Analysisbased Analysis Dynamic IR PredictionDynamic IR Prediction

1616

Dynamic IR Prevention FlowDynamic IR Prevention FlowPreliminary planningPreliminary planning–– Power grid Power grid –– DeDe--cap precap pre--insertioninsertion–– Power switch configurationPower switch configuration–– Prediction, fixingPrediction, fixing

Power aware DFTPower aware DFT–– Clock gating Clock gating –– LocationLocation--based grouping based grouping –– Test clocks varyingTest clocks varying

Power switch opt.Power switch opt.–– Sizing, removal, reorderingSizing, removal, reordering

Power planningPower planning

PlacementPlacement

1st power analysis1st power analysis

CTS/Post-routeCTS/Post-route

2nd power analysis2nd power analysis

Decap insertionDecap insertion

Decap insertionDecap insertion

Meet?

Meet?

VCD patternTiming Window

VCD patternTiming Window

No

Yes

Yes

No

Traditional Flow Traditional Flow Traditional Flow

lengthy loop

PreliminaryPower Planning

PreliminaryPreliminaryPower PlanningPower Planning

Power-aware DFTPowerPower--aware DFTaware DFT

Power SwitchConfiguration

Power SwitchPower SwitchConfigurationConfiguration

PlacementPlacementPlacement

Simulation-freeDynamic IR Prediction

SimulationSimulation--freefreeDynamic IR PredictionDynamic IR Prediction

Static/TransientPower Analysis

Static/TransientStatic/TransientPower AnalysisPower Analysis

CTS/Post-routeTiming Optimization

CTS/PostCTS/Post--routerouteTiming OptimizationTiming Optimization

Meet

Meet Switch RemovalReordering

Switch RemovalSwitch RemovalReorderingReordering

Padding Auto-fixingPadding Padding

AutoAuto--fixingfixing

1717

ConclusionConclusionPower efficiencyPower efficiency–– Compromise between different mechanismsCompromise between different mechanisms–– Require good decision, cRequire good decision, comprehensive verification omprehensive verification

PowerPower--gating is a promising leakage controlgating is a promising leakage control–– Challenge of verification among various sleep modesChallenge of verification among various sleep modes–– Configuration with dynamic IR considerationConfiguration with dynamic IR consideration

Dynamic IR prevention flow improves yieldDynamic IR prevention flow improves yield–– FlipFlip--flop density check achieves a good quality flop density check achieves a good quality –– Fixing becomes much easier without timing degradationFixing becomes much easier without timing degradation–– DynamicDynamic--IR is controlled during CTS and timing opt.IR is controlled during CTS and timing opt.

1818