![Page 1: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/1.jpg)
Low Power Processor Design
VLSI Systems Lab.
March 28
Bong-Il Park
![Page 2: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/2.jpg)
Introduction
• Processor power consumption

| Power | Strategy | Cost |
|---|---|---|
| 1 Watt | none | none |
| 3-5 Watt | heat sink, air flow | $1-5 |
| 5-15 Watt | fan sink | $10-15 |
| 15+ Watt | exotic | $50+ |

[Figure labels: Laptop Computer, Low power processor]
![Page 3: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/3.jpg)
Power Reduction Technique
• Process level
  – low voltage
  – low capacitance
• Circuit level
  – transistor (TR) sizing
  – adiabatic circuit
  – low power arithmetic components
• Logic level
  – precomputation logic
  – logic synthesis
  – retiming
• System level
  – frequency reduction
  – voltage reduction
  – power management mode
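Several of the techniques listed above (low voltage, low capacitance, frequency reduction) act on individual terms of the standard CMOS switching-power relation P = a · C · V² · f. The sketch below is not from the slides; the function name and all numeric values are illustrative assumptions.

```python
def dynamic_power(activity, capacitance_f, vdd_v, freq_hz):
    """Switching power of a CMOS node: P = activity * C * Vdd^2 * f."""
    return activity * capacitance_f * vdd_v ** 2 * freq_hz

# Illustrative values only (not taken from the slides).
base = dynamic_power(0.2, 1e-9, 3.3, 66e6)
half_v = dynamic_power(0.2, 1e-9, 1.65, 66e6)   # voltage reduction
half_f = dynamic_power(0.2, 1e-9, 3.3, 33e6)    # frequency reduction

print(f"baseline      : {base:.3f} W")
print(f"half voltage  : {half_v / base:.2f} x baseline")
print(f"half frequency: {half_f / base:.2f} x baseline")
```

Halving the supply voltage cuts switching power to a quarter, while halving the frequency only halves it and leaves the energy per operation unchanged.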
![Page 4: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/4.jpg)
System Level
• Execution unit idle time (PowerPC 603)
[Figure: bar chart on a 0 to 100 axis for the Load/Store, Fixed Point, Floating Point, and Special Register units, with series SPECfp92 and SPECint92]
![Page 5: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/5.jpg)
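System Level
• Power management support

| Processor | Mode | Power |
|---|---|---|
| PowerPC 603 | Normal | 2.2 |
| PowerPC 603 | Doze | 0.366 |
| PowerPC 603 | Nap | 0.135 |
| PowerPC 603 | Sleep | 0.047 |
| MIPS 4200 | Normal | 1.5 |
| MIPS 4200 | Reduced | 0.4 |
| Strong ARM | Normal | 0.9 |
| Strong ARM | Idle | 0.02 |
| Strong ARM | Sleep | 0.00002 |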
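To see why these modes matter, the average power of a mostly idle processor can be estimated as a time-weighted sum of the per-mode values. The sketch below uses the Strong ARM values listed above and treats them as watts, which is an assumption since the transcript gives no unit. The 10% / 30% / 60% split between Normal, Idle, and Sleep is hypothetical.

```python
def average_power(mode_power, time_fraction):
    """Time-weighted average power over the given operating modes."""
    if abs(sum(time_fraction.values()) - 1.0) > 1e-9:
        raise ValueError("time fractions must sum to 1")
    return sum(mode_power[m] * time_fraction[m] for m in time_fraction)

# Per-mode values from the slide; unit assumed to be watts.
strongarm = {"Normal": 0.9, "Idle": 0.02, "Sleep": 0.00002}
# Hypothetical usage profile, not from the slides.
duty = {"Normal": 0.10, "Idle": 0.30, "Sleep": 0.60}

avg = average_power(strongarm, duty)
print(f"average power: {avg:.4f} (vs {strongarm['Normal']} always in Normal)")
```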
![Page 6: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/6.jpg)
Power Estimation
• Simulation-based techniques
  – circuit simulation
  – switch level simulation: IRSIM
  – transistor level simulation: PowerMill
  – gate level simulation
  – Monte Carlo simulation
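As an illustration of the Monte Carlo approach, the sketch below applies random input vectors to a toy circuit, counts node toggles, and stops once the estimate is statistically stable. The 4-bit ripple-carry adder, the zero-delay toggle counting, the batch size, and the 2% stopping threshold are all assumptions made for this example, not details from the slides.

```python
import random
import statistics

WIDTH = 4  # toy circuit size, chosen for illustration

def adder_nodes(a, b):
    """Return the sum and carry node values of a ripple-carry adder."""
    nodes, carry = [], 0
    for i in range(WIDTH):
        x, y = (a >> i) & 1, (b >> i) & 1
        nodes.append(x ^ y ^ carry)              # sum bit i
        carry = (x & y) | (carry & (x ^ y))      # carry out of bit i
        nodes.append(carry)
    return nodes

def monte_carlo_toggles(rel_error=0.02, batch=200, max_batches=500, seed=1):
    """Estimate mean node toggles per input vector (zero-delay model)."""
    rng = random.Random(seed)
    prev = adder_nodes(rng.getrandbits(WIDTH), rng.getrandbits(WIDTH))
    batch_means = []
    for _ in range(max_batches):
        toggles = 0
        for _ in range(batch):
            cur = adder_nodes(rng.getrandbits(WIDTH), rng.getrandbits(WIDTH))
            toggles += sum(p != c for p, c in zip(prev, cur))
            prev = cur
        batch_means.append(toggles / batch)
        if len(batch_means) >= 5:
            mean = statistics.mean(batch_means)
            stderr = statistics.stdev(batch_means) / len(batch_means) ** 0.5
            if stderr <= rel_error * mean:       # estimate is stable enough
                break
    return statistics.mean(batch_means), len(batch_means) * batch

estimate, vectors = monte_carlo_toggles()
print(f"about {estimate:.2f} toggles per vector after {vectors} vectors")
```

Multiplying the toggle estimate by a per-node capacitance, the supply voltage squared, and the clock frequency would turn it into a power figure.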
![Page 7: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/7.jpg)
Power Estimation
• Probabilistic techniques
  – combinational circuits
    • zero delay model
    • real delay model
  – sequential circuits
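Under the zero delay model, a probabilistic estimator propagates signal probabilities through the gates instead of simulating vectors. The sketch below assumes spatially and temporally independent inputs, in which case a node that is 1 with probability p switches with probability 2p(1 - p) per cycle. The example circuit and the helper names are hypothetical.

```python
def p_and(pa, pb):
    return pa * pb

def p_or(pa, pb):
    return pa + pb - pa * pb

def p_not(pa):
    return 1.0 - pa

def activity(p):
    """Switching probability per cycle of a node with signal probability p."""
    return 2.0 * p * (1.0 - p)

# Example circuit: y = (a AND b) OR (NOT c), inputs are 1 with probability 0.5.
pa = pb = pc = 0.5
p_n1 = p_and(pa, pb)
p_n2 = p_not(pc)
p_y = p_or(p_n1, p_n2)

for name, p in [("n1", p_n1), ("n2", p_n2), ("y", p_y)]:
    print(f"{name}: P(1) = {p:.3f}, activity = {activity(p):.3f}")
```

The independence assumption holds here because the three inputs do not reconverge. Circuits with reconvergent fanout need correlation handling, and a real delay model would also have to account for glitches.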
![Page 8: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/8.jpg)
Logic Level
• Logic Synthesis
  – precomputation logic
  – retiming
  – state assignment
  – path balancing
  – technology mapping
  – gate resizing
[Figure: block diagram with labels g, R, R1, R2, R3, A]
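Precomputation logic evaluates a small predictor one cycle early and, when the output is already determined, holds the remaining input registers so the main logic does not switch. A minimal sketch, assuming an 8-bit A > B comparator whose most significant bits act as the predictor; this example circuit is an assumption, not the one drawn on the slide.

```python
import random

N = 8  # comparator width, chosen for illustration

def precomputed_compare(a, b):
    """Return (a > b, load_low): load_low is False when the MSBs decide."""
    a_msb, b_msb = a >> (N - 1), b >> (N - 1)
    if a_msb != b_msb:
        # MSBs differ: result is known, low-order registers can be held.
        return a_msb > b_msb, False
    # MSBs equal: the low-order bits must be loaded and compared.
    low_mask = (1 << (N - 1)) - 1
    return (a & low_mask) > (b & low_mask), True

rng = random.Random(0)
trials = 10000
loads = 0
for _ in range(trials):
    a, b = rng.getrandbits(N), rng.getrandbits(N)
    result, load_low = precomputed_compare(a, b)
    assert result == (a > b)      # precomputation must not change the output
    loads += load_low

print(f"low-order registers loaded in {loads / trials:.0%} of cycles")
```

With uniformly random inputs the MSBs differ about half the time, so the low-order registers are clocked in only about half of the cycles.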
![Page 9: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/9.jpg)
Memory
• Architectural selection
  – select as little of the array as possible
  – dynamically powering up sense amp.
  – clocking only as needed
[Figure: Memory Block Diagram. Split Array, half of columns active. Blocks: 1/2 Cell Array, Row Decoder, 1/2 Column Decoder. Signals: W/R, Data, Addr]
![Page 10: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/10.jpg)
Clock
• Fast transition time and low skew
  – consume lots of power
  – 10~20% of total chip power
• Clock power management
  – clock branches are segmented and can be enabled as needed
[Figure label: PLL]
![Page 11: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/11.jpg)
Datapath Signal Activity
• PowerPC 601

| Instruction Type | # of instructions | # of 0 to 1 | # of 1 to 0 | Switching Factor |
|---|---|---|---|---|
| Shift | 3175 | 1.98 | 4.63 | 0.21 |
| ADD/SUB | 4937 | 3.41 | 4.31 | 0.24 |
| EA calculation | 15496 | 3.15 | 2.37 | 0.17 |
| MUL/DIV | 1070 | 1.50 | 1.56 | 0.10 |
| Control Register | 192 | 1.44 | 2.26 | 0.12 |
| Compare | 2349 | 3.81 | 4.11 | 0.25 |
| Branch | 58 | 5.03 | 13.50 | 0.58 |
| Total | 27277 | 3.05 | 3.13 | 0.19 |
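The Switching Factor column is numerically consistent with (0-to-1 count + 1-to-0 count) / 32, and the Total row with instruction-weighted averages. This reading, including the 32-bit width, is inferred from the numbers and is not stated on the slide. A short check:

```python
BUS_WIDTH = 32  # assumed datapath width; the slide does not state it

# (instruction type, instructions, avg 0->1, avg 1->0, reported factor)
ROWS = [
    ("Shift",            3175, 1.98,  4.63, 0.21),
    ("ADD/SUB",          4937, 3.41,  4.31, 0.24),
    ("EA calculation",  15496, 3.15,  2.37, 0.17),
    ("MUL/DIV",          1070, 1.50,  1.56, 0.10),
    ("Control Register",  192, 1.44,  2.26, 0.12),
    ("Compare",          2349, 3.81,  4.11, 0.25),
    ("Branch",             58, 5.03, 13.50, 0.58),
]

def switching_factor(rise, fall, width=BUS_WIDTH):
    """Fraction of datapath bits that toggle per instruction."""
    return (rise + fall) / width

for name, count, rise, fall, reported in ROWS:
    computed = switching_factor(rise, fall)
    print(f"{name:17s} computed {computed:.3f}  reported {reported:.2f}")

n_total = sum(r[1] for r in ROWS)
avg_rise = sum(r[1] * r[2] for r in ROWS) / n_total   # weighted by count
avg_fall = sum(r[1] * r[3] for r in ROWS) / n_total
print(f"total {n_total}: avg 0->1 {avg_rise:.2f}, avg 1->0 {avg_fall:.2f}")
```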
![Page 12: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/12.jpg)
Traditional Method
• Use enabling logic
  – enable only the adder needed
  – reduce the signal activities
• Minimizing temporal bit transition activity
  – gray coding
  – bus inversion coding
[Figure: Adder and MUX blocks with clk/control signals and inputs A, B, C, D]
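Gray coding reduces temporal bit transitions for sequential values because consecutive code words differ in exactly one bit. The sketch below compares total bit flips for an 8-bit counting sequence in plain binary and in reflected Gray code; the sequence and helper names are chosen for illustration only.

```python
def to_gray(n):
    """Convert a binary number to its reflected Gray code."""
    return n ^ (n >> 1)

def transitions(seq):
    """Total number of bit flips between consecutive values."""
    return sum(bin(a ^ b).count("1") for a, b in zip(seq, seq[1:]))

addresses = list(range(256))                      # e.g. sequential addresses
binary_toggles = transitions(addresses)
gray_toggles = transitions([to_gray(a) for a in addresses])

print(f"binary: {binary_toggles} bit flips, gray: {gray_toggles} bit flips")
# prints "binary: 502 bit flips, gray: 255 bit flips"
```

The benefit depends on the values being mostly sequential, as with instruction addresses; for random data Gray coding gives no advantage.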
![Page 13: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/13.jpg)
Datapath Components:Adder
• Characteristics
  – the number of transitions varies widely with the adder structure

| Adder Type (32 bit) | Delay (in gate units) | # of gates | # of transitions (average) |
|---|---|---|---|
| Ripple Carry | 68 | 288 | 182 |
| Carry Skip(1) | 33 | 304 | 392 |
| Carry Skip(2) | 19 | 350 | 437 |
| Carry Lookahead | 14 | 401 | 405 |
| Carry Select | 14 | 597 | 711 |
| Conditional Sum | 15 | 857 | 1323 |
![Page 14: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/14.jpg)
Datapath Components: Multiplier
• Characteristics
  – many transitions
  – 50% or more of the nodes have a transition probability of one transition per clock

| Multiplier Type (32 bit) | Delay (in gate units) | # of gates | # of transitions (average) |
|---|---|---|---|
| Modified Array | 98 | 2405 | 7348 |
| Wallace/Dadda | 51 | 2569 | 3874 |
![Page 15: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/15.jpg)
Future Works
• Low power processor design
  – Analysis of arithmetic components
  – Proposal of arithmetic components for low power
• Proposal of a low power processor architecture
  – Analysis of various architectures in terms of power
![Page 16: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/16.jpg)
Continued Story:
ACCENT_Light
VLSI Systems Lab. KAIST.
Mar. 28. 1998.
You-Sung Chang
![Page 17: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/17.jpg)
Previous Work
• Dr. Bong has done it!
• Everyone knows it well by now.
• Nothing to explain.
![Page 18: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/18.jpg)
Feature of Accent
• Highly integrated CISC Processor-Core
• 4-stage Pipelined Architecture
• Configuration
  – Pre-fetch Cache
  – Decode
  – Execution
  – Memory Management
  – Micro-code
  – External Interface
  – Embedded DRAM
  – . . .
![Page 19: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/19.jpg)
Low Power in Accent
• Support Programmable Very Complex Code
• Micro-code based Stripe Power Control
• Pre-charging Biasing in Mask-ROM
• Inverse Data Store in Embedded DRAM
• Minimizing switching in BUS transfer
![Page 20: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/20.jpg)
Very Complex Code
• Maximize the advantage of CISC micro-code approach
• Adaptive Programmable Micro-code
  – Program analyzer extracts application-specific instructions
  – Compile micro-code ROM and decoder
• A small loop is translated into a complex instruction
  – Small code size
  – Gives more idle time to the pre-fetch and decode units
  – Saves power through the small code size and by blocking the clock during the induced idle time of the pre-fetch and decode units
![Page 21: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/21.jpg)
Stripe Power Control
• Clocking only as needed
  – Obvious!
  – How?
    • Cut the data-path into strips
    • Power control using micro-code field information
    • Request enables clocking for peripheral units
[Figure: Func1 and Func2 blocks, each with an F/F and a Pass Latch; clock labels Gated clock 1, Gated clock 2, Gated clock 3, Gated clock 4]
![Page 22: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/22.jpg)
Mask ROM
• Selective pre-charging/discharging for Micro-code ROM.
• Using the static statistics, assemble Micro-code ROM cell column by column.
• Simulation shows
  – Not so effective for Micro-code ROM
  – Some potential for constant ROM
![Page 23: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/23.jpg)
Embedded DRAM
• Full voltage pre-charging
  – Does not need half voltage generator
• Single-ended type
• Read/Write word by word
• To save power, minimize switching in the bit-line
• Store inverse data, marked with an indicator, if '0' is dominant
[Figure: Sense Amplifier; labels: Reference, Pre-charged (low cap.), Pre-charged (high cap.)]
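The inverse data store idea can be sketched in a few lines: when a word has more '0' bits than '1' bits, its complement is stored together with an indicator bit, so the stored word never has a majority of zeros. The 16-bit word size and the function names are assumptions for illustration. The premise that a stored '0' is the costly value (it discharges a bit-line pre-charged to full voltage) is likewise an assumption about the slide's intent.

```python
WORD_BITS = 16                    # assumed word size
MASK = (1 << WORD_BITS) - 1

def encode(word):
    """Return (stored_word, indicator); invert when zeros outnumber ones."""
    ones = bin(word).count("1")
    if WORD_BITS - ones > ones:
        return ~word & MASK, 1    # store the complement, set the indicator
    return word, 0

def decode(stored, indicator):
    """Recover the original word from the stored word and indicator."""
    return ~stored & MASK if indicator else stored

for word in (0x0001, 0xFFF0, 0x00FF):
    stored, flag = encode(word)
    zeros = WORD_BITS - bin(stored).count("1")
    print(f"{word:#06x} -> stored {stored:#06x}, indicator {flag}, zeros {zeros}")
```

The cost is one extra indicator bit per word plus the inversion logic on the read and write paths.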
![Page 24: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/24.jpg)
BUS Transfer
• One-Hot Coding
• Gray Coding
• Bus Inversion Coding (Stan, 1994)
[Figure: N-bit data carried over a BUS of N+1 lines; the 1 extra line is the inversion indicator]
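A minimal sketch of bus inversion coding, assuming an 8-bit bus that starts at all zeros: if sending the next word would flip more than half of the data lines, its complement is sent instead and the indicator line is raised. As a simplification, only the data lines are counted in the decision, while the comparison at the end also counts toggles on the indicator line. Function names and the random test data are hypothetical.

```python
import random

def popcount(x):
    return bin(x).count("1")

def bus_invert_encode(words, n):
    """Yield (bus_value, indicator) pairs for n-bit data words."""
    mask = (1 << n) - 1
    prev_bus = 0                        # bus assumed to start at all zeros
    for w in words:
        if popcount(w ^ prev_bus) > n / 2:
            bus, inv = ~w & mask, 1     # fewer flips if sent inverted
        else:
            bus, inv = w, 0
        yield bus, inv
        prev_bus = bus

def bus_invert_decode(pairs, n):
    mask = (1 << n) - 1
    return [(~bus & mask) if inv else bus for bus, inv in pairs]

def total_toggles(values):
    """Count bit flips between consecutive bus states, starting from 0."""
    prev, total = 0, 0
    for v in values:
        total += popcount(v ^ prev)
        prev = v
    return total

N = 8
rng = random.Random(0)
data = [rng.getrandbits(N) for _ in range(2000)]
coded = list(bus_invert_encode(data, N))

raw = total_toggles(data)
# Pack the indicator as bit N so its toggles are counted too.
enc = total_toggles([bus | (inv << N) for bus, inv in coded])
print(f"unencoded: {raw} toggles, bus-invert (N+1 lines): {enc} toggles")
```

With this scheme no transfer flips more than N/2 data lines, which bounds the peak as well as the average switching on the bus.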
![Page 25: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/25.jpg)
Self Evaluation
• Evaluation of Anticipation
  – Support Programmable Very Complex Code (H)
  – Micro-code based Stripe Power Control (M)
  – Pre-charging Biasing in Mask-ROM (M)
  – Inverse Data Store in Embedded DRAM (H)
  – Minimizing switching in BUS transfer (X)
H: high, M: medium, L: low, X: X, its dedicated signification
![Page 26: Low Power Processor Design VLSI Systems Lab. 3 월 28 일 박 봉 일](https://reader035.vdocuments.site/reader035/viewer/2022062315/5697bf851a28abf838c87a94/html5/thumbnails/26.jpg)
Further Work
• Complete power estimation for each block
  – Functional Blocks in Data-path
  – Pre-fetch and Decoder
• Inspect physical constraints in pre-charging biasing
• Estimate the power advantage of inverse data store
• Target : Workshop ~
• Task force : Caviar, Woosee, Bipark