inst.eecs.berkeley.edu/~ee241b
Borivoje Nikolić
EE241B : Advanced Digital Circuits
Lecture 21 – DVS II
1EECS241B L21 DVS2
April 8, NY Times: Computers Already Learn From Us. But Can They Teach
Themselves?
Scientists are exploring approaches that would help machines develop their own sort of common sense
Features Pieter Abbeel and Sergey Levine from UC Berkeley
Announcements
• Assignment 4 due next Friday.
• Reading• T. Burd, et al, JSSC, Nov 2000.
2EECS241B L21 DVS2
Outline
•Module 5• Dynamic voltage and frequency scaling
3EECS241B L21 DVS2
5.F Dynamic Voltage Scaling
4EECS241B L21 DVS2
Power /Energy Optimization Space
Constant Throughput/Latency Variable Throughput/Latency
Energy Design Time Sleep Mode Run Time
Active
Logic designScaled VDD
Trans. sizingMulti-VDD
Clock gatingDFS, DVS
Leakage
Stack effectsTrans sizingScaling VDD
+ Multi-VTh
Sleep T’sMulti-VDD Variable VTh
+ Input control
DVSVariable VTh
EECS241B L21 DVS2 5
Adaptive Supply Voltages
EECS241B L21 DVS2 6
Processors for Portable Devices
1000
100
10
1Perfo
rman
ce (M
IPS)
Processor Energy (Watt*sec)1 100.1
• Eliminate performance energy trade-off
PDAs
Pocket-PCs
NotebookComputers
DynamicVoltageScaling
BurdISSCC’00
EECS241B L21 DVS2 7
Typical MPEG IDCT Histogram
EECS241B L21 DVS2 8
timeSystem Idle
DesiredThroughput
Maximum Processor Speed
Background andhigh-latency processes
Compute-intensive andlow-latency processes
• Maximize Peak Throughput• Minimize Average Energy/operation
System Optimizations:BurdISSCC’00
Processor Usage Model
EECS241B L21 DVS2 9
Compute ASAP:
Deliv
ered
Thr
ough
put
Clock Frequency Reduction:
Excessthroughput
Always high throughput
Energy/operation remains unchanged…while throughput scaled down with fCLK
fCLKReduced
time
time
Common Design Approaches (Fixed VDD)
EECS241B L21 DVS2 10
0
0.5
1
0 0.5 1
Ener
gy/o
pera
tion
Throughput ( fCLK)
Constant supply voltage
Reduce VDD, slowcircuits down.
~10x EnergyReduction
3.3V
1.1V
BurdISSCC’00
Scale VDD with Clock Frequency
EECS241B L21 DVS2 11
InverterRingOscRegFileSRAM
1.0
0.5
0VT 2VT 3VT 4VT
Norm
alize
d m
ax. f
CLK
VDD
Delay tracks within +/- 10%BurdISSCC’00
CMOS Circuits Track Over VDD
EECS241B L21 DVS2 12
time
• Dynamically scale energy/operation with throughput.• Always minimize speed minimize average energy/operation.• Extend battery life up to 10x with the exact same hardware!
Vary fCLK,VDDDe
liver
edTh
roug
hput
1 2 Dynamically adapt
BurdISSCC’00
Dynamic Voltage Scaling (DVS)
EECS241B L21 DVS2 13
• DVS requires a voltage scheduler (VS).• VS predicts workload to estimate CPU cycles.• Applications supply completion deadlines.
0
20
40
60
80
0 0.2 0.4 0.6 1.0 1.2 1.40.8
Processor Speed (MPEG)
F DESI
RED
(MHz
)
Time (sec)
Operating System Sets Processor Speed
DESIREDCPU cycles F
time
EECS241B L21 DVS2 14
RST Counter
Latch
Digital Loop Filter
L CDD
VDD
PENAB
NENABFERR
FMEAS
f1MHz
0110
100 FDES
+Register
fCLK
Ring Oscillator Processor
IDD
• Feedback loop sets VDD so that FERR 0.• Ring oscillator delay-matched to CPU critical paths.• Custom loop implementation Can optimize CDD.
7
Buck converter
Set byO.S.
BurdISSCC’00
Converter Loop Sets VDD, fCLK
EECS241B L21 DVS2 15
• Circuit design constraints. (Functional verification)
• Circuit delay variation. (Timing verification)
• Noise margin reduction. (Power grid, coupling)
• Delay sensitivity. (Local power distribution)
Design verification complexity similar to high-performance processor design @ fixed VDD
Design Over Wide Range of Voltages
EECS241B L21 DVS2 16
• Cannot use NMOS pass gates – fails for VDD < 2VT.
• Functional verification only needed at one VDD value.
InverterRingOscRegFileSRAM
1.0
0.5
0VT 2VT 3VT 4VT
Norm
alize
d m
ax. f
CLK
VDD
BurdISSCC’00
Delay Variation & Circuit Constraints
EECS241B L21 DVS2 17
+40
+20
0
-20Perc
ent D
elay V
ariat
ion
VDDVT 2VT 3VT 4VT
• Timing verification only needed at min. & max. VDD.
Delay relative to ring oscillator
Gate
Interconnect
DiffusionSeries
Four extreme cases ofcritical paths:
All vary monotonically with VDD.
BurdISSCC’00
Relative Delay Variation
EECS241B L21 DVS2 18
Multiple Path Tracking
A. Drake, ISSCC’07EECS241B L21 DVS2 19
Multiple Path Tracking
Cho, ISSCC’16EECS241B L21 DVS2 20
Tracking with SRAM in Critical Path
Mismatch between logic and SRAM
SRAM multiplictive replica
Niki, JSSC’11EECS241B L21 DVS2 21
Alternative: Error Detection
Bull, ISSCC’2010EECS241B L21 DVS2 22
• Static CMOS logic.
• Ring oscillator.
• Dynamic logic (& tri-state busses).
• Sense amp (& memory cell).
Max. allowed |dVDD/dt| Min. CDD = 100nF (0.6m)Circuits continue to properly operate as VDD changes
Design for Dynamically Varying VDD
EECS241B L21 DVS2 23
VDD
• Static CMOS robustly operates with varying VDD.
Vin = 0 Vout = VDDrds|PMOS
CL
Vout
max. = 4ns
0.6m CMOS: |dVDD/dt| < 200V/s
Static CMOS Logic
EECS241B L21 DVS2 24
Ring Oscillator
• Output fCLK instantaneously adapts to new VDD.
60 80 100 120 140 160 180 200 220 240 260
0
1
2
3
4Vo
lts
Time (ns)
fCLK
VDD
Simulated with dVDD/dt = 20V/s
EECS241B L21 DVS2 25
VDD
Vout
Vin
clk
clk
Volts
Time
VoutVDDFalse logic low: VDD > VTP
Latch-up: VDD > Vbe
Errors
• Cannot gate clock in evaluation state.• Tri-state busses fail similarly Use hold circuit.
0.6m CMOS: |dVDD/dt| < 20V/s
clk = 1
VDD
VDD
Dynamic Logic
EECS241B L21 DVS2 26
100
80
60
40
20
00 1 2 3 4 5 6
Dhry
ston
e 2.1
MIPS
Energy (mW/MIPS)
85 MIPS @5.6 mW/MIPS
(3.8V)
6 MIPS @0.54 mW/MIPS
(1.2V)
• Dynamic operation can increase energy efficiency > 10x.
x
Static VDD
Dynamic VDD
BurdISSCC’00
Measured System Performance & Energy
EECS241B L21 DVS2 27
VDD-Hopping
MPEG-4 encoding
Nor
mal
ized
pow
er
0
0.2
0.4
0.6
0.8
1
2 3 8
# of frequency levels1
Transition time
between ƒ levels
= 200µs
Time
n-th slice finished hereNext milestone
#n #n+1
Application slicing and software feedback guarantee real-time operation.
Two hopping levels are sufficient.EECS241B L21 DVS2 28
Dithering Between Supply Levels
Keller et al, ESSCIRC’16
EECS241B L21 DVS2 29
• Done with switched-capacitor DC-DC converters which efficiently work only at discrete levels
Dithering Between Supply Levels
• Dithering fills in between fixed DC-DC modes
EECS241B L21 DVS2 30
Keller et al, ESSCIRC’16
Next Lecture
• Low-power design• Clock gating
• Power gating
EECS241B L21 DVS2 31