Design Techniques for Energy-Quality Scalable Digital Systems
Candidate: Daniele Jahier Pagliari
Supervisors: Enrico Macii, Massimo Poncino
Turin, 18 May 2018
Doctoral Program in Computer and Control Engineering (30th Cycle)
Outline
• Introduction and Motivation
• EQ Scalable Design Techniques for Processing Hardware
• EQ Scalable Design Techniques for Serial Interconnects
• EQ Scalable Design Techniques for OLED displays
• Conclusions and Future Work
December 10, 2018Design Techniques for Energy-Quality Scalable Digital Systems
Introduction
• Energy efficiency is a key objective in modern digital systems
• A lot of energy is spent in ensuring that the system performs reliable, precise and accurate operations (e.g. floating point, redundancy, etc.)
• Drivers: mobile and IoT (battery operated devices, energy harvesting)
• Diminishing returns of classic techniques (technology scaling, voltage scaling)
Introduction
• Many modern computing applications are error tolerant (or resilient)
• For these applications, controlled errors in internal operations do not have a dramatic impact on final output quality
• Error tolerance can have different origins:
  • Input properties
  • Algorithms
  • Output properties
Introduction
• Noisy data (e.g. from sensors) are affected by environmental noise
  • Errors can be tolerated as long as their effect on outputs is negligible w.r.t. the effect of noise
• Redundant data do not add information
  • Their computation can be approximated or skipped without degrading output quality
[Figure: input properties as a source of error tolerance: an error injected while processing a noisy input still yields a noisy output]
Introduction
• The definition of correct outputs can be fuzzy or informal
  • If correct outputs are unknown (e.g. optimization problems)
  • If multiple outputs are equivalently good (e.g. Google search)
• Many applications have human users
  • Small or rare errors (in time and space) are not perceived by our sense organs
[Figure: original image compared with the same image after randomizing the 3 LSBs]
Introduction
• Some computational patterns naturally reduce the effect of errors
• Iterative refinement steps converge to correct results even in presence of (controlled) errors
  • E.g. Gradient Descent
• Statistical aggregation tends to reduce the effect of errors
  • E.g. Data mining, clustering, etc.
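The resilience of iterative refinement can be illustrated with a minimal sketch. It assumes a toy objective f(x) = x², and the function name, learning rate and error bound are all hypothetical choices made for illustration. Every gradient is corrupted by a bounded relative error, yet the iterate still converges to the minimum.

```python
import random


def noisy_gradient_descent(x0, lr=0.1, steps=100, rel_err=0.1, seed=0):
    """Minimize f(x) = x**2 using gradients with a bounded relative error.

    All parameters are illustrative assumptions, not values from the slides.
    """
    rng = random.Random(seed)
    x = x0
    for _ in range(steps):
        exact_grad = 2.0 * x                      # gradient of x**2
        error = rng.uniform(-rel_err, rel_err)    # controlled relative error
        x -= lr * exact_grad * (1.0 + error)      # approximate update step
    return x


# With lr = 0.1 and |error| <= 0.1, each step multiplies x by a factor
# between 0.78 and 0.82, so the iterate shrinks towards 0 regardless of
# the specific error sequence.
print(abs(noisy_gradient_descent(10.0)) < 1e-6)
```

The errors slow convergence slightly but do not change the fixed point, which is the property the slide refers to.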
[Figure: algorithms as a source of error tolerance: F(x) versus x for an iterative method, correct trajectory compared with the trajectory with errors]
Introduction
• Purposely introducing errors (i.e. relaxing the precision, reliability and accuracy of operations) can yield energy benefits:
  • Reducing data-path precision
  • Reducing design margins
  • Eliminating redundancy
  • Evaluating approximate functions
  • Etc.
• Energy-Quality (EQ) scalable design techniques exploit this tradeoff systematically for error tolerant applications
• The available “quality slack” depends on: task, context and inputs
Introduction
• EQ Scalable System Architecture:
[Figure: EQ scalable system architecture. Offline: application code and workloads feed error tolerance identification and error tolerance characterization. Online: a quality monitor observes the context and the output “quality”, and an EQ knobs control block sets the EQ knobs of the EQ scalable SW/HW]
Motivation
• Our work focuses on three important aspects which are given little consideration in the EQ scalable design state-of-the-art:
1. Generality:
  • Holistic EQ scalability (not limited to processing)
  • Runtime quality-configurability
2. Automation and integration:
  • Compatibility with EDA tools
  • Compatibility with standard protocols
3. Focus on overheads:
  • Avoid implementations that offset energy gains
[Figure: example target systems: smartphone (video playback) and IoT sensor node]
Outline
• Introduction and Motivation
• EQ Scalable Design Techniques for Processing Hardware
• EQ Scalable Design Techniques for Serial Interconnects
• EQ Scalable Design Techniques for OLED displays
• Conclusions and Future Work
EQ Scalable Design of Processing Hardware
• Target: Hardware (HW) data-path modules
• Objectives:
  • Automation (integration with EDA tools)
  • Generality
• Two variants of EQ scalable data-path HW:
  • Reduced-Precision Redundancy (RPR)
  • Dynamic Voltage and Accuracy Scaling (DVAS)
EDA Flow for Reduced-Precision Redundancy
• Reduced-Precision Redundancy (RPR):
  • Voltage Over-Scaling (VOS) on the original HW block (MDSP)
  • Error-Control (EC) block to mitigate the effect of timing errors
• EC block structure:
  • Estimator of the error-free output, implemented as a reduced-precision Replica of the MDSP
  • Decision block to select between MDSP and Replica outputs
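The role of the decision block can be sketched with the compare-and-threshold rule commonly described in the RPR literature. Whether the decision block synthesized by the proposed flow has exactly this form is an assumption. The function names, the multiplier example, the injected bit flip and the threshold value are all hypothetical.

```python
def truncate_lsbs(x: int, dropped_bits: int) -> int:
    """Model a reduced-precision operand by zeroing its LSBs."""
    return (x >> dropped_bits) << dropped_bits


def rpr_select(mdsp_out: int, replica_out: int, threshold: int) -> int:
    """Decision block: trust the MDSP unless it is far from the replica.

    A large deviation is interpreted as a timing error caused by VOS,
    in which case the reduced-precision replica output is used instead.
    """
    if abs(mdsp_out - replica_out) > threshold:
        return replica_out
    return mdsp_out


# Hypothetical example: the MDSP is a multiplier, the replica multiplies
# operands with 3 LSBs dropped.
a, b = 201, 117
exact = a * b                                         # error-free MDSP output
replica = truncate_lsbs(a, 3) * truncate_lsbs(b, 3)   # coarse estimate
faulty = exact ^ (1 << 14)                            # timing error on a high bit

print(rpr_select(exact, replica, threshold=2048))   # MDSP output is kept
print(rpr_select(faulty, replica, threshold=2048))  # replica output is used
```

The threshold must exceed the largest deviation the replica shows in error-free operation; otherwise correct MDSP outputs would be discarded.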
EDA Flow for Reduced-Precision Redundancy
• Limitations of classic RPR implementations:
1. Simplified and unrealistic assumptions on the input statistics:
  • All timing path activations assumed equally probable
2. No integration with standard EDA tools:
  • Simplified VOS timing degradation model
  • Ad-hoc replica implementation
EDA Flow for Reduced-Precision Redundancy
• Goal of the proposed method:
  • Automatically add RPR to the existing gate-level netlist of a data-path HW block
  • Under a user-defined minimum output quality constraint
• Features:
  1. Functionality agnostic
  2. Fully automatic and integrated with EDA tools
  3. Based on back-annotated simulations (with realistic models)
• Flow steps (from the slide diagram): MDSP simulation in VOS; error behavior analysis; then, for each VOS voltage, optimal replica synthesis and decision block synthesis
• Tools: Mentor QuestaSim, Synopsys Design Compiler, Synopsys PrimeTime
EDA Flow for Reduced-Precision Redundancy
• Experimental Results:
  [Figure: FIR filter RPR power savings for two input sets under identical conditions and constraints]
  [Figure: power savings under realistic quality constraints for FIR, FFT, IIR, CORDIC and SRU]
  [Figure: area overheads under realistic quality constraints for FIR, FFT, IIR, CORDIC and SRU]
• Accurate consideration of input statistics is fundamental
• The proposed method is general
EDA Flows based on DAS/DVAS
• Dynamic (Voltage) and Accuracy Scaling:
  [Figure: precision ↓ (gate input LSBs) leads to switching activity ↓ and slack ↑; slack ↑ allows supply voltage ↓; the result is dynamic power ↓ and leakage power ↓]
• Advantages:
  • Based on technological knobs only, no architectural modification
  • General
  • Low overheads
  • Many energy/quality configurations
• Limitations:
  • Integration with standard EDA flows
  • Slack does not increase as expected (“wall of slack”)
  • Limited power benefits!
[Figure: histogram of the number of endpoints versus slack [ns]]
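The first link of the DVAS chain (lower precision by gating input LSBs, hence lower switching activity) can be illustrated with a small sketch. It only models toggles on the operand bits of a data-path block; the function names and the operand values are hypothetical, and no claim is made about the actual power figures.

```python
def gate_lsbs(word: int, gated_bits: int) -> int:
    """Force the given number of LSBs of a non-negative word to 0."""
    return word & ~((1 << gated_bits) - 1)


def input_toggles(words) -> int:
    """Count bit toggles between consecutive operands on the same input."""
    return sum(bin(a ^ b).count("1") for a, b in zip(words, words[1:]))


# Hypothetical 10-bit operand stream applied to one input of the block.
operands = [0x3A7, 0x1C2, 0x2F9, 0x05D]
gated = [gate_lsbs(w, 4) for w in operands]   # reduced-precision mode

# Gated LSBs never toggle, so the toggle count cannot increase.
print(input_toggles(operands), input_toggles(gated))
```

Fewer toggling inputs also deactivate part of the logic cones behind them, which is where the additional timing slack for voltage scaling is expected to come from.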
EDA Flows based on DAS/DVAS
• Solution 1: Combination with fine-grain Vth tuning
• Split the HW block into Vth “domains”
• Use Vth tuning to speed up timing-critical sections of the HW for each precision
• Implemented on FDSOI using back-biasing
• Flow steps (from the slide diagram): automatic Vth domains insertion (from the original placement to a placement with Vth domains); then, for each precision, find the optimal configuration of VDD and Vth
• Tools: Cadence Innovus, Synopsys PrimeTime, Mentor QuestaSim
EDA Flows based on DAS/DVAS
• Solution 2: Application-driven Synthesis Flow
• Use multi-scenario optimization to prevent the wall of slack
• Take into account the application-dependent usage frequency of each precision
• Flow steps (from the slide diagram): for each precision and for decreasing VDD values, incremental multi-scenario synthesis followed by total energy estimation
• Tools: Synopsys Design Compiler, Synopsys PrimeTime
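A minimal sketch of how the usage frequency of each precision could enter the total energy estimate: each candidate implementation is scored by its usage-weighted energy per operation. The weighting formula, the candidate names and all numbers are hypothetical assumptions for illustration, not results from the flow.

```python
def expected_energy(energy_per_op, usage_freq):
    """Usage-weighted energy: sum over precisions of frequency * energy."""
    return sum(usage_freq[p] * energy_per_op[p] for p in usage_freq)


def pick_best(candidates, usage_freq):
    """Return the candidate name with the lowest usage-weighted energy."""
    return min(candidates,
               key=lambda name: expected_energy(candidates[name], usage_freq))


# Hypothetical application profile: fraction of operations per bit-width.
usage = {16: 0.1, 8: 0.6, 4: 0.3}

# Hypothetical normalized energy per operation of two synthesized designs.
candidates = {
    "full_precision_only": {16: 1.0, 8: 0.8, 4: 0.7},
    "multi_scenario":      {16: 1.1, 8: 0.5, 4: 0.3},
}

print(pick_best(candidates, usage))
```

In this made-up profile the design that is slightly worse at full precision wins overall, because the application spends most of its time at reduced precision.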
EDA Flows based on DAS/DVAS
• Experimental Results (Solution 1):
  [Figure: maximum power savings for Multiplier, Adder, FFT and FIR]
  [Figure: area overheads for Multiplier, Adder, FFT and FIR]
  [Figure: adder bit-width versus power consumption [W] for Proposed (2x2 Domains), DVAS (NoBB) and DVAS (FBB)]
• Still general
• Lower overheads w.r.t. RPR
EQ Scalable Design of Processing Hardware
• Qualitative comparison of the proposed methods:
• RPR based solution:
  • Rare unpredictable errors
  • Two “quality modes”
  • ≈100% area overhead
• DVAS based solutions:
  • Systematic “errors”
  • Many “quality modes”
  • ≈10% area overhead
[Figure: the two solutions positioned on low-to-high scales of flexibility, output quality and overheads]
1. Jahier Pagliari et al., “An automated design flow for approximate circuits based on reduced precision redundancy”, ICCD 2015
2. Jahier Pagliari et al., “A methodology for the design of dynamic accuracy operators by runtime back bias”, DATE 2017
3. Jahier Pagliari et al., “Application-driven synthesis of energy-efficient reconfigurable-precision operators”, ISCAS 2018
4. Jahier Pagliari et al., “Energy-Efficient Digital Processing via Approximate Computing”, (Chapter) Springer 2016
5. Jahier Pagliari et al., “Fine-grain Back Biasing for the Design of Energy-Quality Scalable Operators”, IEEE TCAD 2018
Outline
• Introduction and Motivation
• EQ Scalable Design Techniques for Processing Hardware
• EQ Scalable Design Techniques for Serial Interconnects
• EQ Scalable Design Techniques for OLED displays
• Conclusions and Future Work
EQ Scalable Design of Serial Interconnects
• Target: Serial Buses
  • De facto standard for sensors, actuators and I/O controller interfaces
  • Higher frequencies, no jitter, reduced crosstalk, lower cost (# of pins)
  • Power consumption:
    • Mostly dynamic ($P_{dyn} = \alpha C_{tot} V_{DD}^2 f$)
    • Energy reduction can be achieved by reducing $\alpha$ → data encoding!
• Relevant?
  • The serial transmission of a 12-bit datum can consume as much as the execution of a 32-bit instruction! (e.g. on a large off-chip PCB trace)
  • A system may include tens of serial buses
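Since the dynamic energy of a serial line grows with the number of bit transitions, the quantity an encoding tries to reduce can be measured with a few lines of code. This is a minimal sketch assuming MSB-first serialization of fixed-width words; the function names and the bit order are illustrative assumptions.

```python
def serialize(words, width):
    """Flatten fixed-width words into one MSB-first bit stream."""
    bits = []
    for w in words:
        bits.extend((w >> i) & 1 for i in reversed(range(width)))
    return bits


def transition_count(bits):
    """Number of 0->1 and 1->0 transitions along the serial stream."""
    return sum(1 for a, b in zip(bits, bits[1:]) if a != b)


# An alternating pattern is the worst case; a constant stream never toggles.
print(transition_count(serialize([0b1010, 0b1010], 4)))
print(transition_count(serialize([0b0000, 0b0000], 4)))
```

Comparing this count before and after encoding gives the transition count reduction that bus encodings are typically evaluated on.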
EQ Scalable Design of Serial Interconnects
• Error tolerant serial bus traces:
  • Highly temporally correlated on average
  • Often “bursty”: long almost constant (idle) sections and short (bursty) sections of fast and large variation
• Goals:
  • Integration: compatibility with standard protocols (I2C, SPI, etc.)
  • Overheads: encoding and decoding HW/SW do not offset the energy gains on the bus
Proposed Encodings (ADE and Serial-T0): leverage the correlation and “burstiness” of data to introduce controlled approximations on encoded data with small impact on output quality.
Most information is conveyed by bursty sections!
EQ Scalable Design of Serial Interconnects
• Approximate Differential Encoding (ADE):
  • Exploit the effectiveness of Differential Encoding (DE) for correlated data
  • Combine it with quality-driven LSB saturation to improve savings
  • General encoding for error tolerant data
[Figure: worked example of input words and codewords at times t-1, t and t+1, annotated with total transition counts; the input words total 14 transitions]
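The idea behind ADE can be sketched as follows, under loudly stated assumptions: differential encoding is modeled as a bitwise XOR with the previously reconstructed word, and the quality-driven LSB saturation is replaced by simple LSB masking as a stand-in. The class names and the masking rule are hypothetical; the published encoding may differ in both respects.

```python
class ApproxDiffEncoder:
    """XOR-based differential encoder that ignores changes in the LSBs."""

    def __init__(self, width: int, approx_bits: int):
        full = (1 << width) - 1
        self.keep_mask = full & ~((1 << approx_bits) - 1)
        self.prev = 0  # word currently held by the receiver

    def encode(self, x: int) -> int:
        code = (x ^ self.prev) & self.keep_mask  # drop LSB differences
        self.prev ^= code                        # mirror the decoder state
        return code


class DiffDecoder:
    """Matching decoder: accumulates the XOR differences."""

    def __init__(self):
        self.prev = 0

    def decode(self, code: int) -> int:
        self.prev ^= code
        return self.prev


samples = [100, 101, 103, 102, 180, 181]       # slowly varying 8-bit data
enc, dec = ApproxDiffEncoder(8, 2), DiffDecoder()
codes = [enc.encode(x) for x in samples]
decoded = [dec.decode(c) for c in codes]
print(codes)
print(decoded)
```

Small variations map to all-zero codewords, which cause no transitions on the line, while the reconstruction error stays below 2^approx_bits because the encoder tracks the value the receiver actually holds.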
EQ Scalable Design of Serial Interconnects
• Serial-T0 (ST0):
  • Exploit idle sections of bursty data for energy savings
  • Selectively transmit the correct datum or a special 0-transitions pattern (interpreted as “repeat previous datum”)
  • Specific for “bursty” signals (e.g. images)
[Figure: worked example of input words and codewords at times t-1, t and t+1: 13 total transitions before encoding, 7 after]
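The selective transmission in Serial-T0 can be sketched as below. The special 0-transitions pattern is modeled abstractly by a `REPEAT` marker, and the decision rule (repeat when the new datum is within `max_error` of the last transmitted one) is an assumption made for illustration; how the real pattern is distinguished from valid data on the wire is not modeled.

```python
REPEAT = None  # stands for the special 0-transitions pattern


def st0_encode(samples, max_error):
    """Send REPEAT while the input stays close to the last sent datum."""
    codewords, last_sent = [], None
    for x in samples:
        if last_sent is not None and abs(x - last_sent) <= max_error:
            codewords.append(REPEAT)   # idle section: no transitions
        else:
            codewords.append(x)        # bursty section: send exact datum
            last_sent = x
    return codewords


def st0_decode(codewords):
    """Replace every REPEAT with the previously received datum."""
    out, current = [], None
    for c in codewords:
        if c is not REPEAT:
            current = c
        out.append(current)
    return out


samples = [50, 51, 49, 50, 120, 121, 50]
codes = st0_encode(samples, max_error=2)
print(codes)
print(st0_decode(codes))
```

Comparing against the last transmitted datum, rather than the previous input, keeps the error bounded by `max_error` even when the input drifts slowly.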
EQ Scalable Design of Serial Interconnects
• Experimental Results:
  [Figure: EQ tradeoff for accelerometer data: TC reduction [%] versus error [%] for ADE, ST0, LSBS, RAKE, SILENT and 2-LIWT]
  [Figure: EQ tradeoff for RGB image data: TC reduction [%] versus error [%] for ADE, ST0, LSBS, RAKE, SILENT and 2-LIWT]
EQ Scalable Design of Serial Interconnects
• Experimental Results:
  [Figure: EQ tradeoff for an OCR application. Error = 4%, Saving = 92%, 1 wrong digit!]
1. Jahier Pagliari et al., “Approximate differential encoding for energy-efficient serial communication”, GLSVLSI 2016
2. Jahier Pagliari et al., “Serial T0: approximate bus encoding for energy-efficient transmission of sensor signals”, DAC 2016
3. Jahier Pagliari et al., “Zero-Transition Serial Encoding for Image Sensors”, IEEE Sensors Journal 2017
4. Jahier Pagliari et al., “Approximate Energy-Efficient Encoding for Serial Interfaces”, ACM TODAES 2017
Outline
• Introduction and Motivation
• EQ Scalable Design Techniques for Processing Hardware
• EQ Scalable Design Techniques for Serial Interconnects
• EQ Scalable Design Techniques for OLED displays
• Conclusions and Future Work
EQ Scalable Image Transformations for OLEDs
• Target: OLED Displays
  • Composed of emissive devices
  • Image-dependent power consumption
• New dimension:
  • Trade-off power for image “error”
• State-of-the-art Algorithms:
  • Brightness (luminance) scaling: for each pixel, $Y' = kY$, $k < 1$
  • Power saving + image (contrast) enhancement: for each pixel, $Y' = T(Y)$
EQ Scalable Image Transformations for OLEDs
• Limitations of the state-of-the-art:
  • High-complexity (nonlinear optimization, histogram processing)
  • No implementation overheads analysis
• Proposed Techniques:
  • Low-overhead Adaptive Brightness Scaling (LABS)
  • Low-overhead Adaptive Power Saving and contrast Enhancement (LAPSE)
• Goals:
  • Automation: plug-and-play frameworks based on regression models trained with representative images
  • Overheads: implementable in SW or HW, in real-time, with low energy consumption
EQ Scalable Image Transformations for OLEDs
• Low-overhead Adaptive Brightness Scaling (LABS):
  • Adaptive brightness scaling: change the scaling factor k depending on the image ($Y' = kY$)
  • Scale brighter images more: $k_{opt} \propto 1 / \sum Y$
• Co-optimize power saving and image alteration: Power-Similarity Product (PSP)
EQ Scalable Image Transformations for OLEDs
• Low-overhead Adaptive Brightness Scaling (LABS):
  • Linear regression to fit optimal scaling factor to image luminance
  • Trained offline with representative images
  • Online transformation becomes O(#pixels) and only involves simple operations
[Figure: offline phase: linear fit of k_opt versus normalized Ytot, with slope m and intercept q]
[Figure: online phase: RGB to YCbCr conversion, accumulation of Y into Ytot, computation of k_opt from m and q, scaling of Y into Y' by k_opt, YCbCr to RGB conversion]
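The online phase of LABS can be sketched in a few functions: accumulate the luminance, evaluate the linear model k_opt = m · Ytot + q, and scale Y. Several details are assumptions: the full-range JFIF-style YCbCr conversion, the normalization of Ytot by 255 times the pixel count, the clamping of k_opt to [0, 1], and the values of m and q, which would come from the offline training.

```python
def rgb_to_ycbcr(r, g, b):
    """Full-range (JFIF-style) RGB to YCbCr conversion."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y, cb, cr):
    """Inverse conversion, rounded and clamped to 8-bit RGB."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return tuple(min(255, max(0, round(c))) for c in (r, g, b))


def labs_transform(pixels, m, q):
    """Scale the luminance of all pixels by an image-dependent factor."""
    ycc = [rgb_to_ycbcr(*p) for p in pixels]
    y_tot = sum(y for y, _, _ in ycc) / (255.0 * len(ycc))  # assumed norm.
    k_opt = min(1.0, max(0.0, m * y_tot + q))               # linear model
    return k_opt, [ycbcr_to_rgb(k_opt * y, cb, cr) for y, cb, cr in ycc]


# Hypothetical trained coefficients: brighter images get a smaller k_opt.
k, out = labs_transform([(200, 200, 200)], m=-0.5, q=1.1)
print(round(k, 3), out)
```

The whole transformation is one pass to accumulate Y and one pass to scale it, which matches the O(#pixels) cost stated on the slide.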
EQ Scalable Image Transformations for OLEDs
• LABS Experimental Results:
  [Figure: average power saving [%] on the BSDS, INRIA and Kodak datasets for all, bright and dark images; average savings at MSSIM ≅ 0.93]
EQ Scalable Image Transformations for OLEDs
• Low-overhead Adaptive Power Saving and Contrast Enhancement (LAPSE):
  • Observation: state-of-the-art transformations can be approximated by a 3rd-order polynomial of the pixel luminance
  • Cubic fit: $Y' = T(Y)$, with $T(Y) = a_3 Y^3 + a_2 Y^2 + a_1 Y$
  • Easy to implement in SW and HW
• Goal: minimize power and maximize contrast under a maximum alteration (MSSIM) constraint
[Figure: PCCE output and its cubic fit, Y_out,i,j versus Y_i,j, both on a 0 to 250 scale]
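The per-pixel cost of a 3rd-order polynomial with no constant term, T(Y) = a3·Y³ + a2·Y² + a1·Y, can be made concrete with Horner's rule, which needs three multiplications and two additions per pixel. The sketch below is illustrative: the coefficient values are hypothetical and the clamping to the 0 to 255 range is an assumption.

```python
def apply_cubic(y, a3, a2, a1):
    """Evaluate T(y) = a3*y**3 + a2*y**2 + a1*y with Horner's rule."""
    t = y * (a1 + y * (a2 + y * a3))
    return min(255.0, max(0.0, t))  # keep the result in the 8-bit range


def transform_luminance(luma, coeffs):
    """Apply the same cubic to every luminance value of an image."""
    a3, a2, a1 = coeffs
    return [apply_cubic(y, a3, a2, a1) for y in luma]


# Hypothetical coefficients; in LAPSE they would be derived per image.
coeffs = (1e-5, -0.002, 1.0)
print(transform_luminance([0, 100, 200], coeffs))
```

Because the polynomial has no constant term, black pixels stay black, which is where an emissive display consumes the least power.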
EQ Scalable Image Transformations for OLEDs
• Low-overhead Adaptive Power Saving and Contrast Enhancement (LAPSE):
  • Training-based approach similar to LABS
  • Different objective and constraints
[Figure: LAPSE pipeline with offline and online phases. Online: RGB to YCbCr conversion; Phase 1, O(WH): compute luminance statistics from Y; Phase 2, O(1): compute the coefficients a; Phase 3, O(WH): apply T() pixel by pixel; YCbCr to RGB conversion]
EQ Scalable Image Transformations for OLEDs
• LAPSE Experimental Results:
  [Figure: input images, each transformed with the state-of-the-art technique (PCCE) and with LAPSE. Savings shown: 61.6%, 59.5%, 60.3%, 67.4%, 55.1%, 69.4%. MSSIM values shown: 0.692, 0.799, 0.795, 0.751, 0.860, 0.691]
• Similar results despite much lower complexity
EQ Scalable Image Transformations for OLEDs
• LAPSE Experimental Results:
  • About 10x faster than the state-of-the-art in software
  • Real-time implementation feasible in hardware
  • Hardware energy overhead per image ≅ 1000x smaller than OLED energy consumption
  • (LABS implementation is even simpler!)

Image Size   SW Ex. Time [ms]   HW Ex. Time [ms]
512x512      6.49               0.52
1280x1280    34.68              3.28

1. Jahier Pagliari et al., “Low-overhead adaptive contrast enhancement and power reduction for OLEDs”, DATE 2016
2. Jahier Pagliari et al., “Optimal content-dependent dynamic brightness scaling for OLED displays”, PATMOS 2017
3. Jahier Pagliari et al., “LAPSE: Low-Overhead Adaptive Power Saving and Contrast Enhancement for OLEDs”, IEEE TIP (under review)
Outline
• Introduction and Motivation
• EQ Scalable Design Techniques for Processing Hardware
• EQ Scalable Design Techniques for Serial Interconnects
• EQ Scalable Design Techniques for OLED displays
• Conclusions and Future Work
Conclusions and Future Work
• A set of EQ scalable design techniques has been presented that:
  1. Target different components of a digital system, not limited to processing
  2. Consider in detail the integration with industrial best practices (e.g. tools, standard protocols)
  3. When possible, favor automated and widely applicable (general) solutions
  4. Thoroughly evaluate and try to reduce the energy overheads associated with EQ scalability
• All proposed techniques allow runtime tuning of the energy-quality tradeoff:
  • Fundamental considering the time-varying nature of quality constraints
Conclusions and Future Work
• Future directions:
1. Apply the proposed techniques within complete applications (started):
  • E.g. use EQ scalable HW blocks for machine learning acceleration
2. Investigate system-level EQ scalable design:
  • Synergistic application of techniques for different components and abstraction levels
  • Implementing the EQ “control loop” becomes more complex