IcySoC - Nano-Tera 2016

Upload: nanoterach

Post on 06-Jul-2018


  • 8/18/2019 IcySoC - Nano-Tera 2016

    1/17

    IcySoC: Inexact Sub- and Near-Threshold Systems for Ultra-Low Power Devices

    Nano Tera Annual Meeting

     CSEM, EPFL, ETHZ, EM

    2016

    2/17

    Applications and Motivation

    Trend toward mobility and IoT

    Wearable Health Monitoring

    Environmental Monitoring and Automated Surveillance

    Implanted Medical Devices

    Internet of Things

    Personal Electronics

    Overhead for communication motivates processing of data close to the sensor node

    Energy efficiency and low power become very important

    3/17

    The Energy Efficiency Challenges

    Energy efficiency of today's systems is insufficient for desired battery lifetime

    Difficult to maintain good energy efficiency for small workloads

    [Figure: energy per operation versus workload for embedded applications, comparing workload requirements with energy efficiency and marking the gap between them as the challenge]

    4/17

    The IcySoC Project

    Objective of the project:

    Develop technologies for ultra-low-power computing in various technologies, ranging from 180nm down to 28nm

    5/17

    Near- and Sub-Threshold Design

    Idea: redesign the basic building blocks of an integrated circuit to render operation at low supply voltages more efficient

    Careful choice of the right operating voltage

    Careful choice of the technology and technology options

    Custom low-voltage cells are less susceptible to variations and have less leakage than standard libraries

    Memories are particularly critical: custom memory structures designed for low voltages are more reliable and allow for lower operating voltages without failing

    6/17

    Allalin icyflex2 System Integration in ALP 180 nm

    icyflex2 CSEM processor

    32-bit processor, 40 kgates

    Sub-threshold RAM, ROM and standard cells

    Latch-based design

    Minimum Energy Point: '12' pJ/cycle @ )2,1 V and '7 kHz (*: MHz @ 1.8 V)

    Leakage levels as low as 2.8 nW at 0.48 V and %*:
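    A minimum energy point exists because dynamic energy falls with the supply voltage while leakage energy per cycle grows as the circuit slows down. The sketch below illustrates that trade-off under a first-order sub-threshold model that is an assumption of this example and does not come from the slides; the constants `N_VT` and `LEAK_RATIO` are illustrative and are not measured icyflex2 values.

```python
import math

# Illustrative constants, NOT measured icyflex2 data (assumptions of this sketch).
N_VT = 0.04          # slope factor times thermal voltage, in volts
LEAK_RATIO = 2000.0  # lumps logic depth, activity and leakage into one factor


def energy_per_cycle(vdd: float) -> float:
    """Normalised energy per cycle, in units of C*V^2 at 1 V."""
    dynamic = vdd ** 2
    # Delay grows exponentially as vdd drops, so leakage is integrated for longer.
    leakage = vdd ** 2 * LEAK_RATIO * math.exp(-vdd / N_VT)
    return dynamic + leakage


def minimum_energy_point(lo: float = 0.15, hi: float = 1.0, step: float = 0.01) -> float:
    """Sweep the supply voltage and return the value with the lowest energy."""
    steps = int(round((hi - lo) / step))
    candidates = [lo + i * step for i in range(steps + 1)]
    return min(candidates, key=energy_per_cycle)
```

    With these assumed constants the sweep settles well below the nominal supply, which is the qualitative behaviour the slide refers to; real values depend on the technology and the design.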

    7/17

    PULP: A Parallel Ultra Low Power Platform (ETHZ)

    Multi-core platform ideally suited for parallel workloads

    OpenRISC processor

    Configurable number of cores keeps platform highly flexible

    Allows for easy integration of custom (e.g., approximate computing) accelerators

    Tightly coupled data memories avoid management overhead of autonomous caches

    Complete programming and emulation environment

    8/17

    PULP as a Silicon Proven Platform

    The PULP platform has been taped out and tested successfully many times and in many configurations by ETH. Some examples in various technologies:

    Fulmine: 65nm PULP with accelerators for crypto and imaging.

    PULPv3: 28nm FDSOI, 3rd generation PULP system with BB support

    Phoebe: 65nm approximate 4-core PULP with shared LNU

    Honey Bunny: 28nm complete PULP system with four cores

    Some PULP platforms are even available as open source to be used by others around the world

    9/17

    Approximate Computing: A New Design Paradigm

    Many applications tolerate a certain amount of loss in QoS

    Use the potential to approximate to partially compensate for performance loss due to voltage scaling

    Approximation makes it possible to tolerate a certain amount of circuit failures due to reliability issues

    10/17

    Approximate Arithmetic Units

    [Figure: adder block diagrams illustrating the two techniques]

    Gate-level Pruning (GLP)

    Principle: significance-activity ranking, pruning of least significant nodes

    Advantages: tunable accuracy, integration in standard flow

    Inexact Speculative Adder (ISA)

    Principle: speculated carry, error compensation

    Advantages: high speed and low power, high-level integration
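    The speculated-carry principle can be illustrated in software. The following is a minimal behavioural sketch, assuming a simple scheme in which each block guesses its carry-in from only a few bits below it; the function name and the parameters `block` and `spec` are hypothetical, the error compensation stage mentioned on the slide is deliberately left out, and this is not the authors' ISA architecture.

```python
def speculative_add(a: int, b: int, width: int = 16, block: int = 4, spec: int = 2) -> int:
    """Add two unsigned integers block by block with speculated carries.

    Each block of `block` bits guesses its carry-in by looking only at the
    `spec` bits directly below it (spec <= block), so no carry chain is
    longer than block + spec bits. The result is taken modulo 2**width.
    """
    block_mask = (1 << block) - 1
    spec_mask = (1 << spec) - 1
    result = 0
    for lo in range(0, width, block):
        if lo == 0:
            carry = 0  # the lowest block has no carry-in
        else:
            # Speculate: assume no carry enters the `spec`-bit window.
            sa = (a >> (lo - spec)) & spec_mask
            sb = (b >> (lo - spec)) & spec_mask
            carry = (sa + sb) >> spec
        block_sum = ((a >> lo) & block_mask) + ((b >> lo) & block_mask) + carry
        result |= (block_sum & block_mask) << lo
    return result & ((1 << width) - 1)
```

    With the defaults, `speculative_add(12, 4)` returns the exact sum 16, whereas `speculative_add(3, 13)` returns 0 because the carry is generated below the two speculation bits. Such rare, large errors are what an error compensation stage would have to limit.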

    11/17

    Approximate Arithmetic Units

    1. V. Camus, J. Schlachter, and C. Enz, “A Low-power Carry Cut-Back Approximate Adder with Fixed-point Implementation and Floating-point Precision,” to be presented at DAC 2016.

    2. J. Schlachter, V. Camus, and C. Enz, “Design of Energy-Efficient Discrete Cosine Transform using Pruned Arithmetic Circuits,” to be presented at ISCAS 2016.

    3. J. Schlachter, V. Camus, and C. Enz, “Near/Sub-Threshold Circuits and Approximate Computing: The Perfect Combination for Ultra-Low-Power Systems,” ISVLSI 2015.

    4. V. Camus, J. Schlachter, and C. Enz, “Energy-efficient Digital Design Through Inexact and Approximate Arithmetic Circuits,” NEWCAS 2015.

    5. V. Camus, J. Schlachter, and C. Enz, “Energy-efficient Inexact Speculative Adder with High Performance and Accuracy Control,” ISCAS 2015.

    6. J. Schlachter, V. Camus, C. Enz, and K. V. Palem, “Automatic Generation of Inexact Digital Circuits by Gate-level Pruning,” ISCAS 2015.

    JPEG encoding with approximate DCT (pruned adders/subtractors)

    /19 power-area savings

    HDR tone-mapping with approximate FPU (pruned/speculated adders/mult.), EPFL / ETH chip

    *.9 power-area savings

    [Figure: image comparison labelled Original, Exact and Approximate]

    12/17

    Approximate Logarithmic Number Units

    We have shown that in a multi-core setting, logarithmic number units (LNUs) can be shared efficiently.

    Phoebe (right top) contains four 32-bit OpenRISC cores that share a common approximate LNU.

    Small precision relaxations by 1 ulp in the LNU can reduce area by more than 40%

    Applications (right bottom) show gains of up to :2:-= in energy efficiency (or '21'= on average).
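    The appeal of a logarithmic number unit is that multiplication, division and square roots reduce to cheap fixed-point operations once values are held as logarithms. The sketch below shows that idea only; the encoding with `frac_bits` fractional bits and all function names are assumptions made for illustration and do not describe the LNU implemented in Phoebe.

```python
import math


def to_lns(x: float, frac_bits: int = 8) -> int:
    """Encode a positive value as a fixed-point log2 with `frac_bits` fractional bits."""
    return round(math.log2(x) * (1 << frac_bits))


def from_lns(code: int, frac_bits: int = 8) -> float:
    """Decode a fixed-point log2 code back to a linear value."""
    return 2.0 ** (code / (1 << frac_bits))


def lns_mul(a: int, b: int) -> int:
    """Multiplication becomes an integer addition in the log domain."""
    return a + b


def lns_div(a: int, b: int) -> int:
    """Division becomes an integer subtraction in the log domain."""
    return a - b


def lns_sqrt(a: int) -> int:
    """Square root becomes a right shift by one bit (rounds toward minus infinity)."""
    return a >> 1
```

    Lowering `frac_bits` mimics a precision relaxation: the arithmetic stays the same while the representable values become coarser, so accuracy is traded against datapath width.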

    This work has resulted in three major publications:

    M. Gautschi, M. Schaffner, F. K. Gürkaynak, L. Benini, “A 65nm CMOS 6.4-to-29.2pJ/FLOP@0.8V Shared Logarithmic Floating Point Unit for Acceleration of Nonlinear Function Kernels in a Tightly Coupled Processor Cluster”, ISSCC 2016

    Y. Popoff, F. Scheidegger, M. Schaffner, M. Gautschi, F. K. Gürkaynak, L. Benini, “High-Efficiency Logarithmic Number Unit Design Based on an Improved Cotransformation Scheme”, DATE 2016

    M. Gautschi, M. Schaffner, F. K. Gürkaynak, L. Benini, “Accuracy and Performance Trade-offs of Logarithmic Number Units in Multi-Core Clusters”, ARITH 2016

    13/17

    Perturbation of FIR Coefficients for Low-Power Operation

    Perturb coefficients in FIR filters to minimize the multipliers' dynamic power.

    Power characterization of multipliers based on constant coefficient operand

    Optimize exact coefficients for approximated filter with max. 3dB error on stopband

    Both exact and low-power operations ensured
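    The coefficient perturbation idea can be sketched as a small greedy search. Everything specific here is an assumption of the example rather than the method from the slides: the number of non-zero bits in an integer coefficient stands in for the measured multiplier power, the search only checks the stopband peak (passband behaviour is ignored), and the names `perturb`, `limit` and `delta` are hypothetical.

```python
import cmath


def magnitude(coeffs, omega):
    """Magnitude of the FIR frequency response at normalised frequency omega."""
    return abs(sum(c * cmath.exp(-1j * omega * n) for n, c in enumerate(coeffs)))


def stopband_peak(coeffs, omegas):
    """Largest response magnitude over the sampled stopband frequencies."""
    return max(magnitude(coeffs, w) for w in omegas)


def set_bits(c):
    """Assumed power proxy: non-zero bits in the coefficient magnitude."""
    return bin(abs(c)).count("1")


def perturb(coeffs, omegas, limit, delta=2):
    """Greedily replace each coefficient by a nearby value with fewer set bits,
    accepting a change only if the stopband peak stays at or below `limit`."""
    out = list(coeffs)
    for i, original in enumerate(coeffs):
        best = original
        for cand in range(original - delta, original + delta + 1):
            trial = out[:i] + [cand] + out[i + 1:]
            if set_bits(cand) < set_bits(best) and stopband_peak(trial, omegas) <= limit:
                best = cand
        out[i] = best
    return out
```

    A real flow would replace `set_bits` with the characterised power of the multiplier for each constant operand, as the slide describes, and would constrain the passband as well.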

    Multipliers' dynamic power varies with constraints and windowing method

    Dynamic power in multipliers can be reduced up to "!)"*

    Accepting only a 3 dB error on the stopband allows up to "#)"* dynamic power reduction

    14/17

    32-bit Microprocessor with Dynamic Clock Adjustment

    DynOR: 32-bit -stage OpenRISC microprocessor with dynamic clock adjustment in 28nm FD-SOI [DATE *)':]

    Exploit dynamic timing margins

    Custom-designed clock generation unit

    Application-specific speedup of up to
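    Why dynamic clock adjustment yields an application-specific speedup can be shown with a toy timing model. The sketch assumes that each instruction class has its own critical-path delay and that the clock generator can match it cycle by cycle; the delay table and the instruction names are hypothetical and are not DynOR measurements.

```python
# Hypothetical critical-path delays per instruction class, in picoseconds.
# Illustrative values only, not DynOR data.
DELAY_PS = {"add": 1000, "load": 1400, "mul": 2000}


def static_time(trace):
    """Run time with a fixed clock set by the slowest instruction class."""
    return len(trace) * max(DELAY_PS.values())


def dynamic_time(trace):
    """Run time when every cycle is only as long as its instruction needs."""
    return sum(DELAY_PS[op] for op in trace)


def speedup(trace):
    """Speedup of the dynamically clocked run over the statically clocked one."""
    return static_time(trace) / dynamic_time(trace)
```

    A trace dominated by fast instructions gains the most, while a trace made up only of the slowest class gains nothing, which is why the achievable speedup depends on the application.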

    15/17

    Bringing it all Together: IcySoC Chips with Help of All Partners

    Three versions: Sub-VT, Low Leakage, Low VT

    CSEM: Optimized SRAM macrocells for sub-VT operation

    ETHZ: Four-core PULP system with approximate LNUs

    ICLAB: Pruned and speculative adders

    TCL: Approx. FIR

    EM-Marin: Chip manufacture using the ALP 180nm technology


    16/17

    Thank you for your attention


    17/17

    Chips sent to Manufacturing

    Fulmine: 65nm PULP with accelerators for crypto and imaging.

    PULPv3: 28nm FDSOI, 3rd generation PULP system with BB support

    Diego: 180nm four-core PULP system with Low VT libraries

    Phoebe: 65nm approximate 4-core PULP with shared LNU

    Honey Bunny: 28nm complete PULP system with four cores

    Manny: 180nm Sub-VT optimized four-core PULP system

    Sid: 180nm four-core PULP system with Low Leakage libraries

    Diana: 65nm PULP with different FPUs using pruned arithmetic.