8/18/2019 IcySoC - Nano-Tera 2016
IcySoC: Inexact Sub- and Near-Threshold Systems for
Ultra-Low Power Devices
Nano Tera Annual Meeting
CSEM, EPFL, ETHZ, EM
2016
-
Applications and Motivation
Trend toward mobility and IoT
Wearable Health Monitoring
Environmental Monitoring and Automated Surveillance
Implanted Medical Devices
Internet of Things
Personal Electronics
…
Overhead for communication motivates processing of data close to the sensor node
Energy efficiency and low power become very important
-
The Energy Efficiency Challenges
Energy efficiency of today's systems is insufficient for desired battery lifetime
Difficult to maintain good energy efficiency for small workloads
[Figure: energy per operation versus workload; labels: Embedded applications, Workload requirements, Energy efficiency, Challenge; axis ticks from 10,000 to 1,000,000,000]
-
The IcySoC Project
Objective of the project:
Develop technologies for ultra-low-power computing in various technologies, ranging from 180 nm down to 28 nm
-
Near- and Sub-Threshold Design
Idea: redesign the basic building blocks of an integrated circuit to render operation at low supply voltages more efficient
Careful choice of the right operating voltage
Careful choice of the technology and technology options
Custom low-voltage cells are less susceptible to variations and have less leakage than standard libraries
Memories are particularly critical: custom memory structures designed for low voltages are more reliable and allow for lower operating voltages without failing
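The choice of operating voltage can be illustrated with a toy model. The sketch below is not from the slides: it assumes a textbook-style energy model in which dynamic energy per operation scales with the square of the supply voltage, while leakage energy per operation grows as the circuit slows down exponentially below threshold. All constants (C_EFF, I_LEAK, V_T, N_VTH, T0) are hypothetical values chosen only to make the trade-off visible.

```python
import math

# Hypothetical, illustrative parameters (not IcySoC measurements).
C_EFF = 1e-12         # effective switched capacitance per operation [F]
I_LEAK = 1e-6         # leakage current, assumed constant over voltage [A]
V_T = 0.45            # threshold voltage [V]
N_VTH = 1.5 * 0.026   # slope factor times thermal voltage [V]
T0 = 1e-9             # delay scaling constant [s]


def delay(vdd):
    """Time per operation, using a smooth drive-current model that is
    exponential below V_T and roughly quadratic above it."""
    drive = math.log1p(math.exp((vdd - V_T) / (2 * N_VTH))) ** 2
    return T0 * vdd / drive


def energy_per_op(vdd):
    """Dynamic energy plus leakage energy integrated over one operation."""
    dynamic = C_EFF * vdd ** 2
    leakage = I_LEAK * vdd * delay(vdd)
    return dynamic + leakage


def minimum_energy_point(v_lo=0.15, v_hi=1.2, steps=1000):
    """Grid search for the supply voltage with the lowest energy per operation."""
    grid = (v_lo + i * (v_hi - v_lo) / steps for i in range(steps + 1))
    best = min(grid, key=energy_per_op)
    return best, energy_per_op(best)


v_mep, e_mep = minimum_energy_point()
print(f"minimum energy point: {v_mep:.3f} V, {e_mep * 1e15:.1f} fJ/op")
```

With these assumed constants the minimum lies below the threshold voltage: lowering the supply further saves dynamic energy but the slower operation lets leakage dominate.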
-
Allalin: icyflex2 System Integration in ALP 180 nm
icyflex2 &'E processor
32-bit processor, 40 kgates
Sub-threshold RAM, ROM and standard cells
Latch-based design
Minimum energy point: 1?.1 pJ/cycle at 0.3? V and 1? kHz (2? MHz at 1.8 V)
Leakage levels as low as 2.8 nW at 0.48 V and -2?
-
PULP: A Parallel Ultra Low Power Platform (ETHZ)
Multi-core platform ideally suited for parallel workloads
OpenRISC processor
Configurable number of cores keeps platform highly flexible
Allows for easy integration of custom (e.g., approximate computing) accelerators
Tightly coupled data memories avoid management overhead of autonomous caches
Complete programming and emulation environment
-
PULP as a Silicon Proven Platform
The PULP platform has been taped out and tested successfully many times and in many configurations by ETH. Some examples in various technologies:
Fulmine: 65 nm PULP with accelerators for crypto and imaging.
PULPv3: 28 nm FD-SOI, 3rd generation PULP system with BB support
Phoebe: 65 nm approximate 4-core PULP with shared LNU
Honey Bunny: 28 nm complete PULP system with four cores
Some PULP platforms are even available as open source to be used by others around the world
-
Approximate Computing: A New Design Paradigm
Many applications tolerate a certain amount of loss in QoS
Use the potential to approximate to partially compensate for performance loss due to voltage scaling
Approximation makes it possible to tolerate a certain amount of circuit failures due to reliability issues
-
Approximate Arithmetic Units
[Figure: circuit schematics illustrating Gate-Level Pruning (GLP) and the Inexact Speculative Adder (ISA)]
Inexact Speculative Adder (ISA)
Principle
Speculated carry
Error compensation
Advantages
High speed and low power
High-level integration
Gate-Level Pruning (GLP)
Principle
Significance-activity ranking
Pruning of least significant nodes
Advantages
Tunable accuracy
Integration in standard flow
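The idea of a speculated carry can be sketched in software. The example below is a generic block-based speculative adder and does not reproduce the ISA architecture or its error-compensation logic: the function name, the 16-bit width and the 4-bit block size are assumptions made for illustration. Each block guesses its carry-in from the previous block alone, which shortens the carry chain at the cost of occasional errors.

```python
def speculative_add(a, b, width=16, block=4):
    """Approximate (a + b) mod 2**width.

    Each block's carry-in is speculated from the previous block only,
    assuming that block itself received no carry. Long carry chains
    are therefore cut, which is where errors come from.
    """
    mask = (1 << block) - 1
    result = 0
    for lo in range(0, width, block):
        a_blk = (a >> lo) & mask
        b_blk = (b >> lo) & mask
        if lo == 0:
            carry_in = 0
        else:
            prev_a = (a >> (lo - block)) & mask
            prev_b = (b >> (lo - block)) & mask
            carry_in = (prev_a + prev_b) >> block  # speculated carry
        result |= ((a_blk + b_blk + carry_in) & mask) << lo
    return result


# Short carry chains are handled exactly.
print(hex(speculative_add(0x1234, 0x1111)))  # 0x2345
# A carry that ripples through two blocks is lost.
print(hex(speculative_add(0x00FF, 0x0001)))  # 0x0, exact result is 0x100
```

An adder of this kind is exact whenever no carry has to propagate across more than one block boundary.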
-
Approximate Arithmetic Units
1. V. Camus, J. Schlachter, and C. Enz, "A Low-power Carry Cut-Back Approximate Adder with Fixed-point Implementation and Floating-point Precision," to be presented at DAC 2016.
2. J. Schlachter, V. Camus, and C. Enz, "Design of Energy-Efficient Discrete Cosine Transform using Pruned Arithmetic Circuits," to be presented at ISCAS 2016.
3. J. Schlachter, V. Camus, and C. Enz, "Near/Sub-Threshold Circuits and Approximate Computing: The Perfect Combination for Ultra-Low-Power Systems," ISVLSI 2015.
4. V. Camus, J. Schlachter, and C. Enz, "Energy-efficient Digital Design through Inexact and Approximate Arithmetic Circuits," NEWCAS 2015.
5. V. Camus, J. Schlachter, and C. Enz, "Energy-efficient Inexact Speculative Adder with High Performance and Accuracy Control," ISCAS 2015.
6. J. Schlachter, V. Camus, C. Enz, and K. V. Palem, "Automatic Generation of Inexact Digital Circuits by Gate-level Pruning," ISCAS 2015.
JPEG encoding with approximate DCT (pruned adders/subtractors)
/19 power-area savings
HDR tone-mapping with approximate FPU (pruned/speculated adders/mult.), EPFL / ETH chip
*.9 power-area savings
[Figure: image comparison with panels Original, Exact, and Approximate]
-
Approximate Logarithmic Number Units
We have shown that in a multi-core setting, logarithmic number units (LNUs) can be shared efficiently.
Phoebe (right top) contains four 32-bit OpenRISC cores that share a common approximate LNU.
Small precision relaxations by 1 ulp in the LNU can reduce area by more than 40%
Applications (right bottom) show gains of up to ?.?4x in energy efficiency (or 1.?1x on average).
This work has resulted in three major publications:
M. Gautschi, M. Schaffner, F. K. Gürkaynak, L. Benini, "A 65nm CMOS 6.4-to-29.2pJ/FLOP@0.8V Shared Logarithmic Floating Point Unit for Acceleration of Nonlinear Function Kernels in a Tightly Coupled Processor Cluster", ISSCC 2016
Y. Popoff, F. Scheidegger, M. Schaffner, M. Gautschi, F. K. Gürkaynak, L. Benini, "High-Efficiency Logarithmic Number Unit Design Based on an Improved Cotransformation Scheme", DATE 2016
M. Gautschi, M. Schaffner, F. K. Gürkaynak, L. Benini, "Accuracy and Performance Trade-offs of Logarithmic Number Units in Multi-Core Clusters", ARITH 2016
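For readers unfamiliar with logarithmic number systems, the following minimal sketch shows why such a unit is attractive for nonlinear kernels. It is a software illustration only, not the Phoebe LNU: the fixed-point format, the FRAC_BITS value and all function names are assumptions. Values are stored as fixed-point base-2 logarithms, so multiplication, division and square root become an add, a subtract and a shift; only addition needs a nonlinear function, which is the costly part of a hardware LNU.

```python
import math

FRAC_BITS = 12  # hypothetical number of fractional bits of the stored logarithm


def to_lns(x):
    """Encode a positive real as a fixed-point base-2 logarithm."""
    return round(math.log2(x) * (1 << FRAC_BITS))


def from_lns(e):
    """Decode a fixed-point base-2 logarithm back to a real value."""
    return 2.0 ** (e / (1 << FRAC_BITS))


def lns_mul(a, b):
    return a + b      # log(x * y) = log x + log y


def lns_div(a, b):
    return a - b      # log(x / y) = log x - log y


def lns_sqrt(a):
    return a >> 1     # log(sqrt(x)) = log(x) / 2


def lns_add(a, b):
    """log(x + y) = log(max) + log2(1 + 2**(log(min) - log(max)))."""
    hi, lo = max(a, b), min(a, b)
    d = (lo - hi) / (1 << FRAC_BITS)
    return hi + round(math.log2(1.0 + 2.0 ** d) * (1 << FRAC_BITS))


print(from_lns(lns_mul(to_lns(3.0), to_lns(5.0))))  # close to 15
print(from_lns(lns_add(to_lns(3.0), to_lns(5.0))))  # close to 8
```

Lowering FRAC_BITS in this sketch plays the role of a precision relaxation: results get coarser while the stored word gets narrower.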
-
Perturbation of FIR Coefficients for Low-Power Operation
Perturb coefficients in FIR filters to minimize the multipliers' dynamic power.
Power characterization of multipliers based on constant coefficient operand
Optimize exact coefficients for approximated filter with max. 3 dB error on stopband
Both exact and low-power operations ensured
Multipliers' dynamic power varies with constraints and windowing method
Dynamic power in multipliers can be reduced up to "!)"*
Accepting only a 3 dB error on the stopband allows up to "#)"* dynamic power reduction
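A coefficient perturbation search of this kind could look roughly as follows. This is a sketch under several assumptions and not the method used in the project: the power cost is replaced by a simple proxy (the number of nonzero bits in each integer coefficient) instead of a characterized multiplier power model, the stopband constraint is interpreted as "the peak stopband gain may rise by at most 3 dB", and the search is a plain greedy pass. All function names are hypothetical.

```python
import cmath
import math


def peak_stopband_db(coeffs, w_stop, points=64):
    """Peak magnitude over [w_stop, pi], in dB relative to the DC gain."""
    dc = abs(sum(coeffs))
    peak = 0.0
    for i in range(points + 1):
        w = w_stop + (math.pi - w_stop) * i / points
        h = sum(c * cmath.exp(-1j * w * n) for n, c in enumerate(coeffs))
        peak = max(peak, abs(h))
    return 20 * math.log10(max(peak, 1e-12) / dc)


def cost(coeffs):
    """Assumed power proxy: total number of nonzero bits in the coefficients."""
    return sum(bin(abs(c)).count("1") for c in coeffs)


def perturb_coefficients(coeffs, w_stop, max_err_db=3.0, max_delta=2):
    """Greedily move each coefficient by at most max_delta LSBs when this
    lowers the cost proxy and keeps the stopband within the error budget.
    Symmetry (linear phase) is not preserved in this simplified pass."""
    limit = peak_stopband_db(coeffs, w_stop) + max_err_db
    best = list(coeffs)
    for i in range(len(best)):
        for delta in range(-max_delta, max_delta + 1):
            trial = best[:i] + [coeffs[i] + delta] + best[i + 1:]
            if (sum(trial) != 0
                    and cost(trial) < cost(best)
                    and peak_stopband_db(trial, w_stop) <= limit):
                best = trial
    return best


exact = [3, 14, 31, 47, 55, 47, 31, 14, 3]  # hypothetical low-pass taps
approx = perturb_coefficients(exact, w_stop=0.6 * math.pi)
print(exact, cost(exact))
print(approx, cost(approx))
```

Keeping both coefficient sets allows a filter to switch between exact and low-power operation.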
-
32-bit Microprocessor with Dynamic Clock Adjustment
DynOR: 32-bit ?-stage OpenRISC microprocessor with dynamic clock adjustment in 28 nm FD-SOI [DATE 201?]
Exploit dynamic timing margins
Custom-designed clock generation unit
Application-specific speedup of up to
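Why the speedup depends on the application can be seen from a small model. The sketch below is an illustration only and does not describe the DynOR implementation: it assumes a single-issue core in which each cycle is occupied by one instruction, and the per-instruction delays in DELAY_NS are invented numbers. A static clock must cover the slowest instruction in every cycle, while an adjusted clock only needs to cover the instruction actually executing.

```python
# Hypothetical critical-path delay per instruction class, in ns (not measured).
DELAY_NS = {"add": 0.6, "load": 0.8, "mul": 1.0, "branch": 0.7}


def static_time(program):
    """Every cycle uses the worst-case period."""
    return len(program) * max(DELAY_NS.values())


def dynamic_time(program):
    """Each cycle uses the period required by the current instruction."""
    return sum(DELAY_NS[op] for op in program)


def speedup(program):
    return static_time(program) / dynamic_time(program)


print(speedup(["add", "add", "load", "mul"]))  # about 1.33
print(speedup(["mul", "mul"]))                 # 1.0, no margin to exploit
```

In a pipelined core the period chosen for a cycle would have to cover all instructions in flight, so the gain obtained there is smaller than this simplified model suggests.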
-
Bringing it all Together: IcySoC Chips with Help of All Partners
Three versions
Sub-VT
Low Leakage
Low VT
CSEM: Optimized SRAM macrocells for sub-VT operation
ETHZ: Four-core PULP system with approximate LNUs
ICLAB: Pruned and speculative adders
TCL: Approx. FIR
EM-Marin: Chip manufacture using the ALP 180 nm technology
-
Thank you for your attention
-
Chips sent to Manufacturing
Fulmine: 65 nm PULP with accelerators for crypto and imaging.
PULPv3: 28 nm FD-SOI, 3rd generation PULP system with BB support
Diego: 180 nm four-core PULP system with Low VT libraries
Phoebe: 65 nm approximate 4-core PULP with shared LNU
Honey Bunny: 28 nm complete PULP system with four cores
Manny: 180 nm Sub-VT optimized four-core PULP systems
Sid: 180 nm four-core PULP system with Low Leakage libraries
Diana: 65 nm PULP with different FPUs using pruned arithmetic.