greendroid mobile application processor - intranet...

GreenDroid Mobile Application Processor
Morici Simone, Spiriti Emanuele


Post on 16-Feb-2018


TRANSCRIPT

Page 1: GreenDroid Mobile Application Processor - Intranet DEIBhome.deib.polimi.it/silvano/FilePDF/ARC-MULTIMEDIA/GreenDroid... · The Idea yIn 2005 industrial shift from single-threaded

GreenDroid Mobile Application Processor
Morici Simone, Spiriti Emanuele

Page 2: The main scenario

Key technological problem, called the utilization wall
The percentage of transistors that can be switched at full frequency decreases because of power constraints
Dark silicon

Page 3: Utilization wall

Before 2005, threshold voltage and supply voltage scaled with each new process generation
Nowadays this is impossible because of leakage-related limits
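The effect of frozen supply voltage can be illustrated with a first-order scaling model. This is a minimal sketch that is not taken from the slides: it assumes the textbook relation that switching power is proportional to transistor count x frequency x capacitance x Vdd^2, and the function name `utilization` and the scaling factor `s` are introduced here for illustration only.

```python
def utilization(s: float, dennard: bool) -> float:
    """Fraction of a fixed-area chip that can switch at full frequency
    under a constant power budget, after one shrink by a factor s.

    Assumed first-order model (not from the slides):
    power ~ transistors * frequency * capacitance * vdd**2
    """
    transistors = s ** 2                 # more devices fit in the same area
    frequency = s                        # each device can switch faster
    capacitance = 1.0 / s                # per-device capacitance shrinks
    vdd = 1.0 / s if dennard else 1.0    # supply voltage scales only before 2005
    full_power = transistors * frequency * capacitance * vdd ** 2
    # If full-speed power exceeds the budget (normalized to 1.0),
    # only a fraction of the chip can be active at once.
    return min(1.0, 1.0 / full_power)
```

Under these assumptions, `utilization(1.4, dennard=True)` stays at 1.0, while `utilization(1.4, dennard=False)` drops to roughly one half; the inactive remainder is the dark silicon.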

Page 4: The Idea

In 2005, industry shifted from single-threaded to multicore processors
But the dark silicon area is still increasing
Dark silicon gets cheaper and cheaper
The power budget gets exponentially more valuable
Trade this low-cost resource to increase energy efficiency

Page 5: Brief description of GreenDroid

45 nm multicore prototype
It targets the Android software stack
Automatically generated "conservation cores" (c-cores from now on) are used to reduce energy consumption (instead of maximizing performance)
It tries to solve the utilization wall and, consequently, the dark silicon issues
Patching support to accommodate software updates

Page 6: The architecture 1/2

We have an array/matrix of tiles
Each tile is made up of:
◦ General-purpose CPU
◦ OCN (on-chip network, point-to-point mesh interconnection)
◦ L1 data cache
◦ Specialized interface
◦ 8 to 15 c-cores per tile
Each tile is unique.

Page 7: Architecture 2/2

Each c-core is coupled to the CPU via the L1 cache and the specialized interface
This lets the CPU:
◦ Pass arguments to c-cores
◦ Perform context switches
◦ Reconfigure the hardware
The OCN is used for memory traffic and synchronization.
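The tile organization described on the two architecture slides can be sketched as a small data structure. This is only an illustrative model under stated assumptions: the class names `CCore` and `Tile`, the field names, and the helper `mesh_neighbors` are hypothetical and do not come from the GreenDroid design itself; only the 8 to 15 c-cores per tile and the mesh interconnection are taken from the slides.

```python
from dataclasses import dataclass, field


@dataclass
class CCore:
    hot_function: str        # hypothetical label for the code region it mirrors
    powered: bool = False    # assumed default: gated off until needed


@dataclass
class Tile:
    row: int
    col: int
    c_cores: list = field(default_factory=list)

    def __post_init__(self):
        # The slides give 8 to 15 c-cores per tile.
        if not 8 <= len(self.c_cores) <= 15:
            raise ValueError("a tile holds 8 to 15 c-cores")


def mesh_neighbors(row, col, rows, cols):
    """Tiles directly linked to (row, col) in a point-to-point mesh."""
    candidates = [(row - 1, col), (row + 1, col), (row, col - 1), (row, col + 1)]
    return [(r, c) for r, c in candidates if 0 <= r < rows and 0 <= c < cols]
```

For example, in a hypothetical 4x4 array a corner tile has two mesh neighbors and an interior tile has four.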

Page 8: The architecture (figure)

Page 9: Architecture details

The CPU is a 32-bit, 7-stage, in-order pipeline and has an FPU and a multiplier
The frequency of 1.5 GHz is set by the cache access time
Together, the L1 caches of all the tiles provide a larger L2 cache
Coherence is provided by lightweight L2 directories at the DRAM interfaces, which use the L1 caches as victim caches
C-cores are power gated; otherwise the power budget would be exceeded

Page 10: Execution Model

Execution starts on one of the CPUs
When the CPU recognizes hot code, it transfers execution to the appropriate c-core
Execution moves from tile to tile according to the availability of the c-cores and their specialization
Data associated with a given c-core usually resides in the associated L1 cache
C-cores are largely transparent to developers.
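The tile-to-tile movement described in this execution model can be sketched as a simple selection routine. This is a hedged illustration, not the actual GreenDroid runtime: the function `pick_target`, the `tiles` dictionary layout, and the policy of preferring the current tile (on the assumption that its L1 cache already holds the data) are all introduced here for the example.

```python
def pick_target(hot_region, current_tile, tiles):
    """Decide where a hot code region should run.

    tiles maps a tile id to a dict {hot_region_name: c_core_is_free}.
    Returns (tile_id, "c-core") when a matching free c-core exists,
    otherwise (current_tile, "cpu") so the general-purpose CPU runs it.
    """
    # Assumed policy: try the current tile first, then the others.
    ordered = [current_tile] + [t for t in tiles if t != current_tile]
    for tile_id in ordered:
        if tiles.get(tile_id, {}).get(hot_region, False):
            return tile_id, "c-core"
    return current_tile, "cpu"
```

With `tiles = {0: {"parse": True}, 1: {"parse": False, "decode": True}}`, a request for `"parse"` issued from tile 1 migrates to tile 0, while a region with no matching c-core stays on the local CPU.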

Page 11: The Android stack (figure)

Page 12: The Android stack

The hot code resides mainly in:
◦ Commonly used applications (e.g. web browser, mail)
◦ Application libraries
◦ Dalvik virtual machine
◦ A few locations in the kernel
95% of the code is covered by c-core execution, and 72% is due to the virtual machine

Page 13: Main ideas behind c-cores

We must profile the code
A specialized circuit (c-core) tries to mirror the hot code, adding extra logic that allows patching
Cold code runs on the CPU
A specialized compiler is responsible for recognizing which code aligns with the c-cores
We also have a runtime system that manages the allocation of c-cores according to availability

Page 14: C-cores details

Data path
◦ Functional units (adders, shifters) to execute instructions
◦ Multiplexers to implement control decisions
◦ Registers
Control unit
◦ Implements the state machine that mirrors the Control Flow Graph
◦ Tracks branch outcomes (computed in the data path) to determine which hardware block must be active
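The split between data path and control unit can be illustrated with a software model of a state machine that walks a control flow graph. This is a minimal sketch under assumptions: the summation loop, the names `run_c_core`, `SUM_LOOP`, `entry`, `loop_test` and `body`, and the dictionary encoding of the graph are all hypothetical and chosen only for this example.

```python
def run_c_core(cfg, start, regs):
    """Model of a control unit stepping through basic blocks.

    cfg maps a block name to (operation, next_if_false, next_if_true).
    Each operation stands in for the data path: it updates the registers
    and returns the branch outcome that the control unit tracks.
    """
    state, trace = start, []
    while state is not None:
        trace.append(state)              # which hardware block is active
        operation, if_false, if_true = cfg[state]
        taken = operation(regs)          # branch outcome from the data path
        state = if_true if taken else if_false
    return regs, trace


def entry(regs):
    regs["i"], regs["acc"] = 0, 0
    return False                         # unconditional fall-through


def loop_test(regs):
    return regs["i"] < regs["n"]         # conditional branch


def body(regs):
    regs["acc"] += regs["i"]
    regs["i"] += 1
    return False                         # unconditional jump back to the test


# Hypothetical CFG for: acc = 0; for i in range(n): acc += i
SUM_LOOP = {
    "entry": (entry, "loop_test", "loop_test"),
    "loop_test": (loop_test, None, "body"),
    "body": (body, "loop_test", "loop_test"),
}
```

Calling `run_c_core(SUM_LOOP, "entry", {"n": 4})` leaves `acc` at 6, and the returned trace lists the sequence of active blocks.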

Page 15: A graphical representation (figure)

Page 16: Synthesizing c-cores

The design of the c-cores is not done by hand.
A C/C++-to-Verilog toolchain is used to convert the code into hardware
The toolchain identifies the main loops and functions given a target workload
The CFG and the data control flow graph are created

Page 17: Synthesizing c-cores

The compiler generates:
◦ The Verilog code for the control unit
◦ The data path that closely mimics the representations
◦ Function stubs that applications can call in place of the original functions to invoke the hardware
◦ A description of the c-core, used when we update the function
Small changes in source code correspond to small changes in hardware
Since the target is to minimize energy consumption and not to achieve better performance, we can exploit many more C constructs than when we try to get more parallelism in the code
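The role of a function stub can be sketched in software. This is an assumption-laden illustration rather than the toolchain's actual output: `make_stub`, `software_version`, `c_core_version` and `c_core_is_free` are hypothetical names, and the fallback to the software version when no c-core is free is an assumption consistent with the availability-based allocation mentioned earlier.

```python
def make_stub(software_version, c_core_version, c_core_is_free):
    """Build a drop-in replacement for the original function.

    The stub invokes the hardware when a matching c-core is free and
    otherwise falls back to the original code on the CPU (assumed policy).
    """
    def stub(*args):
        if c_core_is_free():
            return c_core_version(*args)
        return software_version(*args)
    return stub
```

Because the stub has the same call signature as the original function, application code does not need to know whether a c-core handled the call.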

Page 18: Patching

Since the software evolves, the c-cores must adapt too
◦ Redefine compile-time constants in hardware
◦ Exception mechanism that allows control to be transferred back and forth between the CPU and the c-cores
The area of the chip is increased
But the experiments show that the adaptation process can hold for a decade
Remember that the mean lifecycle of a smartphone is 3 years
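The two patching mechanisms can be modeled in a few lines. This is a hedged sketch, not the real hardware interface: the class `PatchableCCore`, its methods, and the toy `scale` and `offset` blocks are hypothetical. Constants live in rewritable registers, and a block whose logic has changed is redirected to a software routine, which stands in for the exception that hands control to the CPU and back.

```python
class PatchableCCore:
    def __init__(self, constants, blocks):
        self.constants = dict(constants)   # rewritable "compile-time" constants
        self.blocks = list(blocks)         # ordered (name, hardware_operation)
        self.cpu_overrides = {}            # blocks now executed on the CPU

    def patch_constant(self, name, value):
        if name not in self.constants:
            raise KeyError(name)
        self.constants[name] = value

    def patch_block(self, name, software_operation):
        # Models the exception path: this block traps to the CPU.
        self.cpu_overrides[name] = software_operation

    def run(self, regs):
        for name, hardware_operation in self.blocks:
            operation = self.cpu_overrides.get(name, hardware_operation)
            operation(regs, self.constants)
        return regs


def scale(regs, constants):
    regs["x"] *= constants["K"]


def offset(regs, constants):
    regs["x"] += constants["C"]
```

Starting from `PatchableCCore({"K": 3, "C": 1}, [("scale", scale), ("offset", offset)])`, a software update that only changes `K` needs `patch_constant`, while an update that changes what the `offset` step does needs `patch_block`; the unchanged blocks keep running in hardware.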

Page 19: Accelerators

Most specialized hardware is used to achieve better performance
That requires simple code that exposes parallelism and a simple way to create a circuit that is neither costly nor complex
In GreenDroid, accelerators are mainly used to reduce energy consumption
More code can be suitable for a c-core that executes it
First we accelerate the code that can be parallelized, then we take the remaining code and try to map as much of it as possible to c-cores

Page 20: High-level synthesis tools

Since the code is different and less parallelizable, we must have a completely automatic toolchain
We can't have a user-aided process because
◦ Code is too large
◦ Code is constantly evolving
◦ HLS supports I/O and system calls
◦ Also parts of the kernel are translated

Page 21: Conclusions

56% less energy consumption due to the absence of fetch/decode and register file on c-cores
35% energy savings come from the specialization of the code
These are great results, and since the utilization wall worsens exponentially, this way of thinking must be considered in every future architecture, both desktop and mobile