Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Design for Low-Power IoT Systems: Coarse-Grained Reconfigurable Acceleration Units
FRANCESCA PALUMBO
UNIVERSITÀ DEGLI STUDI DI SASSARI
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Design for Low-Power IoT Systems
Module 1: Transport Triggering Approach Jarmo Takala, Tampere University of Technology, Finland
Module 2: Coarse-Grained Reconfigurable Acceleration Units
Francesca Palumbo, Università degli Studi di Sassari, Italy
Module 3: Markov Decision Processes and Power/Performance/Thermal (PPT) Marilyn Wolf, Georgia Tech., USA
Module 4: Factored MDPs and Application Examples
Shuvra Bhattacharyya, U. Maryland, USA and Tampere U. Technology, Finland
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Overview
• Motivations
- What we need adaptation for
- Triggers and Types
• Coarse-Grained Reconfigurable Systems
- Computing Spectrum and Reconfigurable Systems Classification
- Heterogeneous and Irregular Coarse-Grained Reconfigurable Accelerators
• Power Management
- Issues and Strategies
- Low-Power Coarse-Grained Reconfigurable Accelerators
• Reconfigurable FFT Example and Remarks
3
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Overview
• Motivations
- What we need adaptation for
- Triggers and Types
• Coarse-Grained Reconfigurable Systems
- Computing Spectrum and Reconfigurable Systems Classification
- Heterogeneous and Irregular Coarse-Grained Reconfigurable Accelerators
• Power Management
- Issues and Strategies
- Low-Power Coarse-Grained Reconfigurable Accelerators
• Reconfigurable FFT Example and Remarks
4
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Numbers: opportunity or issue?
5
> 7 billion 20 MWh/year 1,800 kg oil
De
sign
ed b
y Fr
ee
pik
=
> 1 billion smartphones 8.4 billion connected things in 2017 (+31% wrt 2016)
20.4 billion by 2020
http://www.gartner.com/newsroom/id/3598917
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
SMART-TRANSPORTATION: autonomous electric vehicle, improved driver assistance and care. Path towards destinations may vary, even diverging from the optimal one, according to user preferences.
SMART-SOCIETY: increased building efficiency and comfort, i.e. lightning/air
quality management can be adjusted to the room status.
SMART HEALTH: distributed healthcare assistance to improve quality of life and active and healthy ageing, functionalities can be changed according to the daily tasks.
Some examples ... Connectivity and real-time situation awareness are nowadays common in different scenarios.
6
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Reconfiguration: Recipe for Compromises
Reconfiguration may allow to optimally implement complex/demanding systems, managing
numerous/conflicting requirements and a variety of functionalities.
Modern digital devices (real-time and ad-hoc) are pervasive (98% of computers are embedded) and
interconnected.
They may also present sensing and actuating capabilities, leading to the concept of CPS.
Safe
ty
Secu
rity
Ce
rtif
.
Dis
trib
.
HM
I
Seam
less
MP
SoC
Ene
rgy
Automotive x x x x x x x
Aerospace x x x x x x x
Healthcare x x x x x x x x
Consumer x x x
7
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Triggers for Adaptation ENVIRONMENTAL AWARENESS: Influence of the environment on the system, i.e. daylight vs. nocturnal, radiation level changes, etc. Sensors are needed to interact with the environment and capture conditions variations.
USER-COMMANDED: System-User interaction, i.e. user preferences, etc. Proper human-machine interfaces are needed to enable interaction and capture commands.
SELF-AWARENESS: The internal status of the system varies while operating and may lead to reconfiguration needs, i.e. chip temperature variation, low battery. Status monitors are needed to capture the status of the system.
8
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Types of Adaptation FUNCTIONALITY-ORIENTED: To adapt functionality because the CPS mission changes, or the data being processed changes and adaptation is required. It may be parametric (a constant changes) or fully functional (algorithm changes).
NON-FUNCTIONAL REQUIREMENTS-ORIENTED: Functionality is fixed, but system requires adaptation to accommodate to changing requirements, i.e. execution time or energy consumption.
REPAIR-ORIENTED: For safety and reliability purposes, adaptation may be used in case of faults. Adaptation may add self-healing or self-repair features. e.g.: HW task migration for permanent faults, or scrubbing (continuous fault verification) and repair.
A
B C
9
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Overview
• Motivations
- What we need adaptation for
- Triggers and Types
• Coarse-Grained Reconfigurable Systems
- Computing Spectrum and Reconfigurable Systems Classification
- Heterogeneous and Irregular Coarse-Grained Reconfigurable Accelerators
• Power Management
- Issues and Strategies
- Low-Power Coarse-Grained Reconfigurable Accelerators
• Reconfigurable FFT Example and Remarks
10
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Computing Spectrum
ASIC DSP GPU CPU
GP
Flexibility Efficiency
CG
RECONF
FG
Fine-Grained (FG)
Coarse-Grained (CG)
bit-level word-level
flexibility
speed
memory
Reconfigurable computing provides a trade-off between execution efficiency typical of ASICs and flexibility mainly exhibited by GP devices.
11
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Computing Spectrum
ASIC DSP GPU CPU
GP
Flexibility Efficiency
CG
RECONF
FG
Fine-Grained (FG)
Dynamic Partial (DP)
Coarse-Grained (CG)
bit-level bit-level word-level
flexibility
speed
memory
12
DPR
Reconfigurable computing provides a trade-off between execution efficiency typical of ASICs and flexibility mainly exhibited by GP devices.
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Virtual vs. Dynamic & Partial VRC Virtual Reconfigurable Circuits
• High reconfiguration speed
• Lower operation speed (mux and size)
• Higher Area Overhead
• Technology independent (ASIC or FPGA)
DPR Dynamic and Partial Reconfiguration
• Lower reconfiguration speeds
• Better operation speed (no mux/less logic)
• Better Resource Utilization (no dark logic)
• Higher Flexibility and Scalability
• Technology dependent (FPGA)
13
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Multi-Dataflow Composer: Development 2010 2011 2012 2013 2014 2015 2016
Baseline tool specification: Multi-Dataflow Composer (MDC) tool
MPEG-RVC Framework Integration: Orcc + MDC + Xronos + Turnus
Structural Profiler
Low-Power Extension
Co-processor Generator
2017
Generalization Process
2018
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Multi-Dataflow Composer: Evaluation 2010 2011 2012 2013 2014 2015 2016 2017 2018
Reconfigurable Image/Video Coding: JPEG e H.264
Neural Signal Decoding
Adaptive Filtering: HEVC Encoding
Cryptographic Systems
CERBERO H2020
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Multi-Dataflow Composer: Design Suite
Dynamic Power Manager
Multi Dataflow Composer Tool
Structural Profiler
Co
-Pro
cess
or
Gen
era
tor
Power Efficiency
Functional Complexity Time to Market:
Design & Mapping Automation
Constraint Driven
Optimisation
Fast Integration and Prototyping
MDC design suite
http://sites.unica.it/rpct/
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
C
Dataflow Model of Computation COMPUTING PARADIGM:
• directed flow graph of actors (functional units)
• communication with tokens (packets of data) exchange through dedicated channels
PECULIARITIES:
• explicit intrinsic application parallelism
• modularity favours model re-usability/adaptivity
EXTERNAL INTERFACE:
• I/O ports number
• I/O ports depth
• I/O ports tokens burst
A
B
D
actions
state
17
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular: approach
coarse grained
substrate
C D A
B C D
A
B
1:1
coarse grained reconfigurable
substrate SB
E
SB C D
A
B
2:1
C D A
B
E D A
α
β
http://sites.unica.it/rpct/
18
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular: MDC framework
Multi-Dataflow Composer (MDC) tool: Dataflow 2 HW tool
• Given N input dataflow specification, it generates the HDL Coarse-Grain (CG) reconfigurable substrate
• Handles programmability, by defining switching and configuration logic
• Deploy Xilinx-compliant IP blocks, plus their drivers
HETEROGENOUS FUNCTIONAL UNITS IRREGULAR INTERCONNECT
19
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
MDC-based Reconfiguration: Adaptation Types
Execution Profile
A SW
A SW
A
in
SW out
A A A in out1
A A in out2
A in out3
A B C in out1
A D
B
in
B
E C out2
A
SW
C
in
Execution Profile
out
B
B SW
E D
SW
Functional Oriented
A B C in1 out1
B E in2 out2
A
SW C in1
Execution Profile
out1 B
E
SW
in2
out2
Non-Functional Oriented
A
B C
20
Troughput vs
Energy
QoS vs
Energy
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Overview
• Motivations
- What we need adaptation for
- Triggers and Types
• Coarse-Grained Reconfigurable Systems
- Computing Spectrum and Reconfigurable Systems Classification
- Heterogeneous and Irregular Coarse-Grained Reconfigurable Accelerators
• Power Management
- Issues and Strategies
- Low-Power Coarse-Grained Reconfigurable Accelerators
• Reconfigurable FFT Example and Remarks
21
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
The Power Issue Power consumption =
Dynamic power + Static power
Dynamic: activity dependent
• Short-circuit: when the output line of a transistor is switching, there is a period of time when both the PMOS and the NMOS transistors are on (I·V·f)
• Switching power: due to the charging and discharging of the load capacitance when logic transitions occur (determined by the formula C·V2·f).
Static: not activity dependent, but due to leakage currents.
• Do not depend on switching and operating frequency.
• As transistors get smaller, channel lengths become shorter and leakage currents increase.
90 nm Inflection Point
22
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
The Power Issue: Textbook Techniques
• DESIGN: Multi Vt, Clock gating, Power gating, Multi Vdd, DVFS.
• PROCESS: Multi Vt, PD SOI, FD SOI, FinFet, Body Bias, Multi oxide devices, Minimize
capacitance by custom design.
• ARCHITECTURE: power-constrained DSE, hw customization and IP specific techniques,
parallelism, mapping.
Dynamic Clock gating
Variable frequency Voltage islands
Variable/multi power supply Dynamic Voltage & Frequency Scaling
Static Multi-threshold dev.
Power gating Back (substrate) bias Multi-oxide devices
SOI CMOS
INTRINSICALLY SYSTEM LEVEL
MANAGEABLE AT SYSTEM LEVEL
23
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Clock-Based Techniques
• Power consumed by flip-flops.
• Power consumed by combinatorial logic driven by registers.
• Power consumed by the clock buffer tree.
24
Reduce frequency whenever you can. Stop the clock when the component is not active.
Fine-Grained
Coarse-Grained
50% less dynamic power
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
V3 V2
Power-Based Techniques
• Power-aware partitioning.
• Switching-off power island brings local leakage to zero.
• Modes of operation -> power down all the idle chip regions.
25
Reduce the voltage level of a power island whenever you can. Switch it off when it is not active.
V4 V4
V1 V3 V2
V1
V3
V2 V4
V1 OFF
V2
V1
OFF
V2 V4
V4
V1 OFF OFF
OFF OFF OFF
Static Voltage Scaling (SVS)
SVS with Power Shut Off (PSO)
SVS with PSO and Power Modes
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Clock-/Power-Based Techniques: Overhead
26
Switch off the clock, applicable to both ASIC and FPGA. • @ASIC: simple and gates. • @FPGA: dedicated vendor specific cells.
Switch off the power supply, ASIC only (partially FPGA). • Sleep transistors: to switch on and off power supply. • Isolation logic: to avoid the transmission of spurious signals from
gated regions to normally-on cells. • Retention logic: to maintain the internal state of gated regions.
MAIN
REGISTER
SHADOW REGISTER
SAVE RESTORE
Retention Register
POWER-DOWN BLOCK
P_UP
ALWAYS-ON BLOCK
Isolation Cell
POWER-DOWN BLOCK
Vdd
Power Switch-Off Cell
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Clock-/Power-Based Techniques: Overhead
27
Switch off the clock, applicable to both ASIC and FPGA. • @ASIC: simple and gates. • @FPGA: dedicated vendor specific cells.
Switch off the power supply, ASIC only (partially FPGA). • Sleep transistors: to switch on and off power supply. • Isolation logic: to avoid the transmission of spurious signals from
gated regions to normally-on cells. • Retention logic: to maintain the internal state of gated regions.
Vary mode changing Vdd, ASIC only.
• Level shifters: to pass signals between portions of the design that operate on different voltages.
POWER
DOMAIN 1 0.8 V
POWER DOMAIN 2
1.2 V
Level Shifters
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Clock-/Power-Based Techniques: Management
The Common Power Format (CPF), used within the Cadence design flow, is meant to define power-saving techniques early in the design process.
• Technology part: depends on the adopted technology. It defines the libraries to be used (for each operating conditions) and the cells (i.e. isolation, sleep transistors, state retentions) to be used within them.
• Power intent part: depends on the design. It defines all the rules to correctly operating the inserted switch (i.e. defines the on/off conditions of the sleep transistors), defines the set of available domains and which logic blocks belong to them
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
PSO Example • Reconfigurable Filter, the
depth of the filter can vary.
• 2 different Logic Regions (LR), one always on and one switchable
• Switchable LR needs the insertion of
- isolation, retention and power switch cells;
- one power controller to handle the control signals [1*shut-o + 1*isol. + 2*reten.]
29
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular CGR: power issue
α
C D A
B E D A D F
β γ
30
Common element
SB0 E
A
C B
SB1
SB2
F D
Multi-Dataflow Graph
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular CGR: power issue
α
C D A
B E D A D F
β γ
SB0 E
A
C B
SB1
SB2
F D
31
Common element
α execution: E and F, being not involved, are wasting power!
Multi-Dataflow Graph
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular CGR: power issue
α
C D A
B E D A D F
β γ
SB0 E
A
C B
SB1
SB2
F D
32
Common element
β execution: B C and D, being not involved, are wasting power!
Multi-Dataflow Graph
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular CGR: power issue
α
C D A
B E D A D F
β γ
SB0 E
A
C B
SB1
SB2
F D
33
Common element
γ execution: B C and D, being not involved, are wasting power!
Multi-Dataflow Graph
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular CGR: MDC approach
C
F
D
A
B
E
LR 1 2 3 4 5
actors A B,C D E F
α 1 1 1 0 0
β 1 0 1 1 0
γ 0 0 1 0 1 γ α
β
SB
E A
C B
SB
SB
F D
E D A
D F
C D A
B α
β
γ
MD
C
fro
nt-
en
d
34
LRs Identification
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Heterogeneous and Irregular CGR: MDC approach
low power (clock gated) CGR substrate
en generator
C
F
D
A
B
E
clk
configurator en1
en2
en3
en4
en5
LR actors α β γ
1 A 1 1 0
2 B,C 1 0 0
3 D 1 1 1
4 E 0 1 0
5 F 1 0 1
MDC back-end
35
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Overview
• Motivations
- What we need adaptation for
- Triggers and Types
• Coarse-Grained Reconfigurable Systems
- Computing Spectrum and Reconfigurable Systems Classification
- Heterogeneous and Irregular Coarse-Grained Reconfigurable Accelerators
• Power Management
- Issues and Strategies
- Low-Power Coarse-Grained Reconfigurable Accelerators
• Reconfigurable FFT Example and Conclusions
36
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
CG Reconfigurable FFT Design
37
x0
x4
x2
x6
x1
x5
x3
x7
y0
y1
y2
y3
y4
y5
y6
y7
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
-1
radix-2 butterfly
stage 1 stage 2 stage 3
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
CG Reconfigurable FFT Design
38
b_00
b_10
b_20
b_30
b_10
b_11
b_21
b_31
b_02
b_12
b_22
b_32
12 butterflies
3cc
b_00
b_10
b_20
b_30
4 butterflies b_00
1 butterfly
b_00
b_10
2 butterflies
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Low-Power CG Reconfigurable FFT Design
b_00
b_10
b_20
b_30
b_10
b_11
b_21
b_31
b_02
b_12
b_22
b_32
SWITCHABLE AREA
39
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Low-Power CG Reconfigurable FFT: 90nm ASIC
FFT: power vs throughput
Dynamic trade-off management
FFT: Area MDC offers automatic
implementation of power-gated and clock-gated designs
40
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Final Remarks • Reconfiguration is a good recipe to address flexibility
- Functional: change the tasks is possible.
- Non-Functional: change the execution profile is possible.
• Power is not necessarily an issue
- The Multi-Dataflow composer tool offer ways to minimize consumption of unused resources, which a negligible impact/effort.
Example of multi-profile CGR system: HEVC interpolator
[ESL17] Carlo Sau, Francesca Palumbo, Maxime Pelcat, Julien Heulot, Erwan Nogues, Daniel Menard, Paolo Meloni, and Luigi Raffo. “Challenging the Best HEVC Fractional Pixel FPGA Interpolators with Reconfigurable and Multi-frequency Approximate Computing” in IEEE Embedded Systems Letters, vol. 9, no. 3, pp. 65-68, Sept. 2017.
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Intelligent system DEsign and Application (IDEA) @ UNISS
42
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
The MDC Group – UNISS + UNICA team
43
UNIVERSITY OF SASSARI
UNIVERSITY OF CAGLIARI
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
CERBERO H2020 Project
44
Coordinator:
Michal Masin (IBM), [email protected]
Scientific Coordinator:
Francesca Palumbo (UniSS), [email protected]
Innovation Manager:
Katiuscia Zedda (Abinsula), [email protected]
Dissemination-Communication Manager:
Francesco Regazzoni (USI), [email protected]
www.cerbero-h2020.eu
@CERBERO_h2020
EU Commission for funding the CERBERO (Cross-layer modEl-based fRamework for multi-oBjective dEsign of Reconfigurable systems in unceRtain hybRid envirOnments) project as part of the H2020 Programme under grant agreement No 732105.
Design for Low-Power Internet-of-Things (IoT) Systems – ISCAS 2018
Design for Low-Power IoT Systems: CG Reconfigurable Acceleration Units
FRANCESCA PALUMBO
UNIVERSITÀ DEGLI STUDI DI SASSARI