
System-level power analysis and estimation

September 20, 2006

Chong-Min Kyung


Power Estimation & Analysis

• Power calculation needs three models: architecture, component, and activity

[Figure: power estimation flow. Recoverable labels: clock & power network; lower-level specification; architecture (component allocation, scheduling operations)]


Estimation vs. Analysis

• Analysis: for a given structure, i.e., a netlist of components
• Estimation (= design prediction followed by analysis):
  – Used when the information on the structure of the design is incomplete
  – Used to explore different design alternatives and find the best
    • Example: to estimate the interconnect power, one needs a floorplan prediction with clock and power network
  – In exploring the alternatives, maintaining the relative order between the prediction and the actual implementation is often enough.
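The point about relative order can be illustrated with a small check, sketched here under assumptions: the alternative names and the power figures in mW are hypothetical, and `preserves_order` is an illustrative helper, not part of any tool mentioned in the slides. It tests whether every pair of alternatives is ranked the same way by the estimate and by the (later) measurement.

```python
from itertools import combinations

def preserves_order(estimated, actual):
    """Return True if no pair of alternatives is ranked in opposite
    order by the estimate and by the actual measurement."""
    for a, b in combinations(estimated, 2):
        est_diff = estimated[a] - estimated[b]
        act_diff = actual[a] - actual[b]
        if est_diff * act_diff < 0:  # opposite signs: this pair is swapped
            return False
    return True

# Hypothetical power figures in mW, for illustration only.
estimated = {"A": 120.0, "B": 95.0, "C": 150.0}
actual = {"A": 160.0, "B": 130.0, "C": 210.0}

# The estimates are off in absolute terms but rank the alternatives
# the same way, so they still pick the same best candidate.
print(preserves_order(estimated, actual))
print(min(estimated, key=estimated.get), min(actual, key=actual.get))
```

In this sketch the absolute error of each estimate is large, yet the exploration would still select the same alternative.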


System-level power analysis

• System-level design process:
  – 1) Allocation of components
  – 2) Partitioning the system’s tasks onto these components (or sub-systems)
  – 3) Organizing cooperation among the bound components
• System-level design inputs:
  – Specification:
    • E.g., CDFG…
  – Environmental constraints:
    • E.g., performance, power, cost, form factors, TTM (time to market), number/load of I/O’s
  – Design space restriction:
    • E.g., enforced use of certain cores, available chip area, bus structure, etc.


Implementation model

• Should be used when an execution model is not available, typically using a spreadsheet
• Usually starts with a platform: HW and SW platform
• Basically three components:
  – COTS (commercial off-the-shelf) components:
    • Maybe only a single figure is available from vendors, such as watts/MHz @ VDD for a processor
    • Guess based on experience, know-how
  – Customer-specific module:
    • Needs estimation based on a prediction of the number of gates, activity factor, and technology scaling factor
    • Power consumption of this module may be insignificant, but its use can replace the power-hungry processor.
  – Communication power:
    • Data transfer between blocks
    • Clock power, cross-coupling
• What was ignored: software structure, data
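A spreadsheet-style estimate along these lines might look like the following sketch. It is an assumption-laden illustration, not the method of the slides: the custom module and the bus use the textbook dynamic-power relation (activity × capacitance × VDD² × frequency), and every number, function name, and parameter below is hypothetical.

```python
def cots_power_w(watts_per_mhz, freq_mhz):
    """COTS part: a single vendor figure (W/MHz at a given VDD) times clock."""
    return watts_per_mhz * freq_mhz

def custom_module_power_w(gates, activity, cap_per_gate_f, vdd, freq_hz,
                          scaling=1.0):
    """Customer-specific module: predicted gate count, activity factor and a
    technology scaling factor (assumed model: alpha * C * VDD^2 * f)."""
    return gates * activity * cap_per_gate_f * vdd ** 2 * freq_hz * scaling

def communication_power_w(bus_cap_f, toggle_rate, vdd, freq_hz):
    """Data transfer between blocks, using the same assumed dynamic model."""
    return toggle_rate * bus_cap_f * vdd ** 2 * freq_hz

# Hypothetical platform: one processor, one custom block, one 32-bit bus.
rows = {
    "processor (COTS)": cots_power_w(0.0005, 200),           # 0.5 mW/MHz
    "custom module": custom_module_power_w(100_000, 0.1, 10e-15, 1.2, 100e6),
    "bus": communication_power_w(32 * 1e-12, 0.25, 1.2, 100e6),
}
total_w = sum(rows.values())

for name, watts in rows.items():
    print(f"{name:18s} {watts * 1e3:8.3f} mW")
print(f"{'total':18s} {total_w * 1e3:8.3f} mW")
```

Each row corresponds to one of the three component classes above; as the slide notes, software structure and data dependence are ignored by this kind of model.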


Execution model

• Typically given as a program in C, HDL, SystemC, or some heterogeneous combination of these

• Allows more detailed power analysis, as the dynamic system behavior is simulated. Needed for this:
  – Component power model
  – System architecture
  – Component activation pattern

• For example, a BFM (bus functional model) and the activity information for each processor component, such as issue queue, branch prediction unit, execution unit, cache, and register file, are needed
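As a rough sketch of how a component power model and an activation pattern combine, the following accumulates energy over a simulated trace. The per-activation energies, the trace, and the cycle time are hypothetical, and a real execution model would obtain the trace from the C/HDL/SystemC simulation rather than from a hand-written list.

```python
from collections import Counter

# Component power model: energy per activation in pJ (hypothetical values).
ENERGY_PJ = {
    "issue_queue": 30.0,
    "branch_predictor": 10.0,
    "execution_unit": 50.0,
    "cache": 80.0,
    "register_file": 20.0,
}

def average_power_mw(trace, cycle_time_ns):
    """trace: one list of active component names per simulated cycle.
    Returns average power in mW (pJ per ns equals mW)."""
    counts = Counter(name for cycle in trace for name in cycle)
    total_pj = sum(ENERGY_PJ[name] * n for name, n in counts.items())
    return total_pj / (len(trace) * cycle_time_ns)

# Activation pattern captured during simulation (hypothetical).
trace = [
    ["issue_queue", "cache"],
    ["execution_unit", "register_file"],
    ["branch_predictor"],
    ["execution_unit", "cache", "register_file"],
]

print(average_power_mw(trace, cycle_time_ns=2.0))
```

The system architecture enters through the set of components and through which of them can be active in the same cycle.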


Memory model

• DTSE (data transfer and storage exploration) work by Catthoor
  – Assumes that memory is the dominant power-consuming part in signal processing applications
  – Memory optimization in terms of power should be done first
  – Objective:
    • Increase data locality
    • Suppress memory access
    • Optimize memory hierarchy
  – By doing:
    • Global loop and control flow transformations
    • Data reuse analysis
    • Storage cycle distribution
    • Memory allocation and assignment
    • In-place optimization
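The effect of data reuse on memory traffic can be seen in a toy example. This is only an illustration of the idea of suppressing memory accesses through locality, not the DTSE methodology itself; `CountingMemory` and both functions are hypothetical helpers, and a sliding-window sum stands in for a signal processing kernel.

```python
from collections import deque

class CountingMemory:
    """Stand-in for a large background memory that counts its reads."""
    def __init__(self, data):
        self._data = list(data)
        self.reads = 0

    def read(self, index):
        self.reads += 1
        return self._data[index]

    def __len__(self):
        return len(self._data)

def window_sums_naive(mem, w):
    # Every output re-reads all w samples from the background memory.
    return [sum(mem.read(i + k) for k in range(w))
            for i in range(len(mem) - w + 1)]

def window_sums_reuse(mem, w):
    # A small local buffer keeps the last w samples, so each sample
    # is fetched from the background memory only once.
    buf = deque(maxlen=w)
    out = []
    for i in range(len(mem)):
        buf.append(mem.read(i))
        if len(buf) == w:
            out.append(sum(buf))
    return out

naive_mem = CountingMemory(range(8))
reuse_mem = CountingMemory(range(8))
naive_out = window_sums_naive(naive_mem, 3)
reuse_out = window_sums_reuse(reuse_mem, 3)
print(naive_out == reuse_out, naive_mem.reads, reuse_mem.reads)
```

Both versions compute the same result, but the second one moves most accesses from the large memory to a small buffer, which is the kind of trade the memory hierarchy optimization aims for.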


Memory model

• Memory chip: power model is available in the data sheet

• Compiled memory core:
  – The power model should be parameterized, at least in terms of size. For that, a simulation model is needed. But because the memory has a flat hierarchy, simulating it takes too long.
  – Therefore, an abstraction model is needed. A capacitance model is difficult to get, as it reveals critical information of the memory vendor. -> Functional models that do not disclose any internal cell structure are okay.
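One possible shape for such a functional, size-parameterized model is sketched below. It assumes the vendor (or a one-time characterization run) supplies energy per access at a few memory sizes, and it interpolates linearly between them; the class name, the interpolation choice, and all numbers are assumptions made for illustration. Nothing about the internal cell structure is exposed.

```python
from bisect import bisect_left

class CompiledMemoryModel:
    """Energy per access as a function of size, from characterized points."""

    def __init__(self, characterized_pj):
        # characterized_pj: {size_in_words: energy_per_access_in_pJ}
        self.sizes = sorted(characterized_pj)
        self.energies = [characterized_pj[s] for s in self.sizes]

    def energy_per_access_pj(self, size_words):
        if not self.sizes[0] <= size_words <= self.sizes[-1]:
            raise ValueError("size outside the characterized range")
        i = bisect_left(self.sizes, size_words)
        if self.sizes[i] == size_words:
            return self.energies[i]
        # Linear interpolation between the two neighbouring points.
        s0, s1 = self.sizes[i - 1], self.sizes[i]
        e0, e1 = self.energies[i - 1], self.energies[i]
        return e0 + (e1 - e0) * (size_words - s0) / (s1 - s0)

    def total_energy_pj(self, size_words, accesses):
        return accesses * self.energy_per_access_pj(size_words)

# Hypothetical characterization data.
model = CompiledMemoryModel({1024: 5.0, 4096: 9.0, 16384: 17.0})
print(model.energy_per_access_pj(2560))
print(model.total_energy_pj(2560, accesses=1000))
```

The number of accesses would come from the execution model, so the memory itself never has to be simulated at the cell level.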


Other things to include in the execution model

• Interconnect power model
  – Input: physical layout and material properties
  – Built based on measurement and simulation
  – However, on-chip interconnect is difficult to model, especially when complex bus encoding is used.

• Models for power management policy
  – Hardware for DPM (dynamic power management)
  – Software
  – RTOS
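A power management policy can be modeled very simply inside an execution model. The sketch below uses a fixed-timeout shutdown policy, which is one common DPM policy chosen here only as an example; the slides do not name a specific policy, and the idle periods, power levels, and wake-up energy are hypothetical.

```python
def energy_with_timeout(idle_periods_s, timeout_s, p_idle_w, p_sleep_w,
                        e_wakeup_j):
    """Energy spent during idle periods when the component is put to sleep
    after staying idle for timeout_s seconds."""
    total = 0.0
    for idle in idle_periods_s:
        if idle <= timeout_s:
            total += idle * p_idle_w              # never went to sleep
        else:
            total += timeout_s * p_idle_w         # waiting for the timeout
            total += (idle - timeout_s) * p_sleep_w
            total += e_wakeup_j                   # cost of waking up again
    return total

# Hypothetical workload and component parameters.
idle_periods = [0.5, 2.0, 10.0]
managed_j = energy_with_timeout(idle_periods, timeout_s=1.0, p_idle_w=0.2,
                                p_sleep_w=0.01, e_wakeup_j=0.05)
always_on_j = sum(idle_periods) * 0.2

print(managed_j, always_on_j)
```

Whether the policy lives in hardware, in application software, or in the RTOS, the model only needs the resulting state transitions and their costs.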


Algorithm-Level Power Estimation in Orinoco

• Activity estimation:
  – Code instrumentation: inserts protocol statements to capture the activity during execution

• Architecture estimation:
  – High-level synthesis:
    • Scheduling
    • Allocation
    • Binding
  – Physical planning:
    • Floorplanning
    • Clock tree generation
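The idea of code instrumentation can be sketched as follows. This is not how Orinoco itself instruments code; it only shows the principle, with a hypothetical `record` call standing in for an inserted protocol statement and a dot product standing in for the algorithm under analysis.

```python
from collections import Counter

activity = Counter()

def record(operation):
    """Stand-in for an inserted protocol statement: logs one operation."""
    activity[operation] += 1

def dot_product(a, b):
    acc = 0
    for x, y in zip(a, b):
        record("mul")   # inserted by the instrumentation step
        product = x * y
        record("add")   # inserted by the instrumentation step
        acc += product
    return acc

result = dot_product([1, 2, 3], [4, 5, 6])
print(result, dict(activity))
```

Running the instrumented program on representative input data yields the operation counts that the architecture estimation then maps onto scheduled and bound resources.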


Algorithmic-level power estimation and analysis

• Algorithmic-level design
  – Objective: optimize in terms of performance, cost and power
  – Means:
    • Selection of the algorithm performing the requested function
    • Optimization of the algorithm
    • Partitioning the algorithm into HW and SW


• Algorithm selection: selecting the most power-efficient one
  – Comparison is based on the most power-efficient realization of each candidate, estimated without actual implementation.

• Optimization:
  – Reducing the number of control statements, e.g., by loop unrolling, local statement reordering, memory access reordering
  – Floating-point for SW vs. fixed-point arithmetic for HW

• Partitioning:
  – Trade-off analysis between HW and SW implementations
  – SW-to-HW transformation: moving the computational kernels of the algorithm to power-optimized application-specific hardware
    • No need for consecutive control steps to perform a single instruction
    • No need for memory access to find out what to do next
    • Minimal datapath just for performing the given task
    • Maximal concurrency exploitable compared to a processor core
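The reduction of control statements by loop unrolling can be made concrete by counting loop-condition evaluations. The sketch below is illustrative only: the explicit `checks` counter is an assumption added to make the control overhead visible, and the unrolling factor of 4 is arbitrary.

```python
def sum_rolled(data):
    """One loop-condition check per element, plus the final failing check."""
    total, i, checks = 0, 0, 0
    while True:
        checks += 1
        if not i < len(data):
            break
        total += data[i]
        i += 1
    return total, checks

def sum_unrolled4(data):
    """Same sum with the loop body unrolled four times."""
    if len(data) % 4 != 0:
        raise ValueError("length must be a multiple of 4 in this sketch")
    total, i, checks = 0, 0, 0
    while True:
        checks += 1
        if not i < len(data):
            break
        total += data[i] + data[i + 1] + data[i + 2] + data[i + 3]
        i += 4
    return total, checks

samples = list(range(16))
print(sum_rolled(samples))
print(sum_unrolled4(samples))
```

Both functions return the same sum, while the unrolled version evaluates the loop condition far fewer times; on a processor each of those evaluations costs instruction fetches and branches that a dedicated hardware datapath would not need.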