experiences of low power design implementation and ... · experiences of low power design...
TRANSCRIPT
Experiences of Low Power Experiences of Low Power Design Implementation and Design Implementation and
VerificationVerification
ShihShih--HaoHao ChenChenGlobal Global UnichipUnichip Corp.Corp.
HsinchuHsinchu, Taiwan, Taiwan
11
AgendaAgenda
IntroductionIntroduction
Power GatingPower Gating
Low Power VerificationLow Power Verification
Dynamic IR PreventionDynamic IR Prevention
ConclusionConclusion
11
22
33
44
55
22
Force and ImpactForce and Impact–– Application, cost, process technology driven Application, cost, process technology driven –– Complexity, power density, leakage increaseComplexity, power density, leakage increase–– TradeTrade--off: area, power, performance, reliability, cost, off: area, power, performance, reliability, cost, ……
Technology is Driven to Low PowerTechnology is Driven to Low Power
Require battery powerRequire battery power
ComplexityComplexity
Design methodologyDesign methodology
VerificationVerification
Power densityPower densityComponentComponent
LeakageLeakage
90nm
45nm
65nm
33
Should be achievedShould be achieved fromfrom various fieldsvarious fields–– Software, architecture, logical design, physical Software, architecture, logical design, physical
implementation, IP/library support, process technology, implementation, IP/library support, process technology, ……
Power OptimizationPower Optimization
HighHigh
LowLow
OpportunityOpportunity a*C*Fa*C*F
MSV MSV DVFS DVFS
SRPG SRPG
MultiMulti--VT opt.VT opt.
Power gatingPower gatingBack biasingBack biasing
GateGate--length opt.length opt.
ImplementationImplementation
Modeling &Modeling &EstimationEstimation
VerificationVerification
AlgorithmAlgorithmSchedulingScheduling
Instruction opt.Instruction opt.
FF clusteringFF clusteringSizing, Clock meshSizing, Clock mesh
Wire reductionWire reduction
PipeliningPipeliningMemory hierarchyMemory hierarchyOperand isolationOperand isolation
Clock gatingClock gating
VV22 LeakageLeakageSoftwareSoftware
PolicyPolicy
System System ArchitectureArchitecture
SynthesisSynthesisLogic DesignLogic Design
Physical Physical ImplementationImplementation
IP/Library IP/Library ProcessProcess
44
Power Gating Power Gating OffOff--chip controlchip control–– Take long time to wakeTake long time to wake--upup
OnOn--chip controlchip control–– SwitchSwitch--able pad implementation needs extra I/O spaceable pad implementation needs extra I/O space–– Power switch implementation becomes more popular Power switch implementation becomes more popular
OnOn--chip controlchip control
OffOff--chipchipcontrolcontrol
55
The most effectiveThe most effective way to manage way to manage leakageleakagePenaltyPenalty–– IIncurncurss power power by sleep transistorsby sleep transistors and and have have area penaltyarea penalty–– Has performance degradation issue due to IR drop Has performance degradation issue due to IR drop –– NNeedeedss extra extra dede--capcap to reduce the power noiseto reduce the power noise
MTCMOS TechniqueMTCMOS Technique
Gated BlocksGated Blocks
IR dropIR drop
R1: True VDD grid R1: True VDD grid R1: True VDD grid
R2: Power switchR2: Power switchR2: Power switch
R3: Virtual VDD gridR3: Virtual VDD gridR3: Virtual VDD grid
Power SourcePower SourcePower Source
PowerPower--upupsequencesequence
66
Determine switch allocation, powerDetermine switch allocation, power--up sequence up sequence –– Subject to: area, power, rampSubject to: area, power, ramp--up time and peak currentup time and peak current
0.E+00
1.E-04
2.E-04
3.E-04
4.E-04
5.E-04
6.E-04
7.E-04
8.E-04
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 1.1 1.2 Vds (V)
Current (A)Vg=0PMOS (Header)
Power Switch Planning Power Switch Planning
Mutation(domino)
Concurrence(fish-bone)
SteadySteady--statestateLinear regionLinear region
33
PowerPower--upupSaturation regionSaturation region22
OffOff11
Ramp-up time
VoltageVDD
upper/lower bound of rampupper/lower bound of ramp--up timeup time
Current
Peak
One-by-one(daisy chain)
trade-off betweenpeak current & ramp-up time
77
Power Switch AssemblyPower Switch AssemblyPartition/Cluster switch cells in banksPartition/Cluster switch cells in banks–– Peak current limit (max concurrence)Peak current limit (max concurrence)–– WakeWake--up time limit (max depth)up time limit (max depth)
7
6
1
5
2
4
3
8
7
2
6
3
5
4
4
5
5
6
7
3
4
6
5
6
7
9
8
Root
DepthDepthwithin bankwithin bank
Concurrence = 5Concurrence = 5Concurrence = 5Concurrence = 5
Concurrence = 4Concurrence = 4
DepthDepthcrosscross--banksbanks
88
Effect of PowerEffect of Power--up Sequenceup Sequence
2,400 switch cells partitioned in 20 banks, 2,400 switch cells partitioned in 20 banks, and assembled in the same configuration and assembled in the same configuration (Domino Fashion)(Domino Fashion)
4 root position: BL, BR, TL, TR4 root position: BL, BR, TL, TR
Power source
Lb (x,y)
Pi(x,y)
Pi(x,y)
Root
99
root
7
6
1
5
2
4
3
7
2
6
3
5
4
4
5
5
6
7
3
7
4
6
5
12
11
6
10
7
9
8
root
4
3
2
2
5
4
9
7
3
Power Switch OptimizationPower Switch OptimizationSizing/RemovalSizing/RemovalReorderingReordering–– To reduce rush current and dynamic IR problems To reduce rush current and dynamic IR problems
7
6
3
45
7
6
8
10
9
10
8
9
7
8 5
C5=5
C6=6C7=7
C4=4
C5=4
C6=3C7=4
C4=52
4
4
3
57
36
1
6
1010
BL BR
TR TL
Dynamic IR vs. PowerDynamic IR vs. Power--up Sequenceup Sequence
VSSVDDConfig
5.5%5.5%TR5%5.5%TL5.5%6%BR6.5%7%BL
Dynamic IR(%)
candidatecandidate
1111
Verification is a Key to SuccessVerification is a Key to SuccessComprehensive low power verificationComprehensive low power verification–– DDesign quality check esign quality check –– Electrical check, functionalElectrical check, functional correctnesscorrectness, I, IR/EM R/EM analysisanalysis–– Dynamic IRDynamic IR
MultiMulti--depths Sleep Modedepths Sleep Mode
missing isolation or shifter?missing isolation or shifter?
incapable sleep control incapable sleep control propagation?propagation?
incorrect power domain incorrect power domain connection? power routing?connection? power routing?
incomplete clock gating?incomplete clock gating?
good decision?good decision?
IR/EM analysis?IR/EM analysis?
1212
Dynamic IR Drop Affects YieldDynamic IR Drop Affects YieldDynamic IR Dynamic IR is critical at 90nm and belowis critical at 90nm and below–– TTiming variation becomes more voltage sensitiveiming variation becomes more voltage sensitive–– IIncreasing the power grid width ncreasing the power grid width is notis not efficient enoughefficient enough
Source: TSMC
ItIt’’s s not only speed degradationnot only speed degradation problemproblem, but also cause chip failure, but also cause chip failure !
1313
Scan Mode Dynamic IR is CriticalScan Mode Dynamic IR is CriticalEven worst in scan modeEven worst in scan mode–– Delay is more sensitive to IR drop as technology shrinkingDelay is more sensitive to IR drop as technology shrinking–– SimulationSimulation--based approach is effort consumingbased approach is effort consuming–– Analysis at early stage is criticalAnalysis at early stage is critical
VDD: 0% ~ 0.75%VDD: 0% ~ 0.75% VDD: 17.5% ~ 18.7%VDD: 17.5% ~ 18.7%
Static AnalysisStatic Analysis VCDVCD--based Analysisbased Analysis
1414
Dynamic IR AnalysisDynamic IR AnalysisTraditional flow is inefficient Traditional flow is inefficient –– Effort consuming to grab VCD pattern and timing windowEffort consuming to grab VCD pattern and timing window–– No enough space around No enough space around hot spots hot spots ((too late)too late)–– May worsenMay worsen leakage and yieldleakage and yield (i(inefficient denefficient de--cap insertioncap insertion))
0K
50K
100K
150K
200K
Deca
p In
serti
on
02
46
810
1214
IR-d
rop
(%)
Nr. De-cap IR-drop
Gain ~1% dynamic IR saving, but pay ~200K de-cap cells.
Power planningPower planning
PlacementPlacement
1st power analysis1st power analysis
CTS/Post-routeCTS/Post-route
2nd power analysis2nd power analysis
Insert De-capInsert De-cap
Insert De-capInsert De-cap
Meet?
Meet?
VCDpattern
VCDpattern
NoYes
YesNo
Traditional Flow Traditional Flow Traditional Flow
Lengthy Loop
1515
Dynamic IR Failure PredictionDynamic IR Failure PredictionFlipFlip--flop density ruleflop density rule–– Cell padding is applied during CTS and timing opt.Cell padding is applied during CTS and timing opt.–– DDynamicynamic--IR IR is controlled is controlled while maintaining timingwhile maintaining timing–– DeDe--cap cell insertion becomes more efficientcap cell insertion becomes more efficient
VCDVCD--based Analysisbased Analysis Dynamic IR PredictionDynamic IR Prediction
1616
Dynamic IR Prevention FlowDynamic IR Prevention FlowPreliminary planningPreliminary planning–– Power grid Power grid –– DeDe--cap precap pre--insertioninsertion–– Power switch configurationPower switch configuration–– Prediction, fixingPrediction, fixing
Power aware DFTPower aware DFT–– Clock gating Clock gating –– LocationLocation--based grouping based grouping –– Test clocks varyingTest clocks varying
Power switch opt.Power switch opt.–– Sizing, removal, reorderingSizing, removal, reordering
Power planningPower planning
PlacementPlacement
1st power analysis1st power analysis
CTS/Post-routeCTS/Post-route
2nd power analysis2nd power analysis
Decap insertionDecap insertion
Decap insertionDecap insertion
Meet?
Meet?
VCD patternTiming Window
VCD patternTiming Window
No
Yes
Yes
No
Traditional Flow Traditional Flow Traditional Flow
lengthy loop
PreliminaryPower Planning
PreliminaryPreliminaryPower PlanningPower Planning
Power-aware DFTPowerPower--aware DFTaware DFT
Power SwitchConfiguration
Power SwitchPower SwitchConfigurationConfiguration
PlacementPlacementPlacement
Simulation-freeDynamic IR Prediction
SimulationSimulation--freefreeDynamic IR PredictionDynamic IR Prediction
Static/TransientPower Analysis
Static/TransientStatic/TransientPower AnalysisPower Analysis
CTS/Post-routeTiming Optimization
CTS/PostCTS/Post--routerouteTiming OptimizationTiming Optimization
Meet
Meet Switch RemovalReordering
Switch RemovalSwitch RemovalReorderingReordering
Padding Auto-fixingPadding Padding
AutoAuto--fixingfixing
1717
ConclusionConclusionPower efficiencyPower efficiency–– Compromise between different mechanismsCompromise between different mechanisms–– Require good decision, cRequire good decision, comprehensive verification omprehensive verification
PowerPower--gating is a promising leakage controlgating is a promising leakage control–– Challenge of verification among various sleep modesChallenge of verification among various sleep modes–– Configuration with dynamic IR considerationConfiguration with dynamic IR consideration
Dynamic IR prevention flow improves yieldDynamic IR prevention flow improves yield–– FlipFlip--flop density check achieves a good quality flop density check achieves a good quality –– Fixing becomes much easier without timing degradationFixing becomes much easier without timing degradation–– DynamicDynamic--IR is controlled during CTS and timing opt.IR is controlled during CTS and timing opt.