FreqLeak: A frequency step based method for efficient
leakage power characterization in a system
Arun Joseph, Anand Haridass, Charles Lefurgy*, Sreekanth Pai, Spandana Rachamalla, Francesco Campisano+
IBM Systems Group Bangalore, *IBM Austin Research Labs, +IBM Systems Group Austin
Contact: [email protected]
Summary
Accurate estimation of leakage power at runtime requires power measurements across a wide range of temperature and voltage conditions.
Testing individual chips, especially at high-temperature corner conditions, is expensive in cost and time.
We introduce FreqLeak, a method for inexpensive and efficient leakage power characterization in a system.
FreqLeak enables a more thorough characterization than is practical on a wafer prober alone, where time and equipment costs limit testing.
Evaluation on POWER8 systems shows that the leakage power obtained with the proposed method is within 5% of reference measurements.
Background
o Estimating the runtime contribution of leakage power to total chip power has known benefits for system power management.
o Leakage power is strongly non-linear in temperature, voltage and process.
o Significant errors can result if runtime models are based on only a subset of leakage power measurements.
[Figure: Normalized Power vs Temperature @ Voltage=V1. Y-axis: normalized power (1.00 to 2.60); x-axis: Temperature (C), 45 to 85; series: Fast, Nom, Slow.]
Key Challenges
o Leakage characterization typically done during wafer test.
o Test time is at a premium due to cost constraints.
o High volume of wafers passes through a small number of wafer probers.
o This eliminates the possibility of testing several temperature points.
o Measurements are limited to a few voltage points and two extreme temperature conditions.
[Diagram labels: Manufacturing Test; EPROM; Service Processor (Power Management Policy); Processor 1, Processor 2, Processor 3; Leakage Power Tables in Vital Product Data.]
Existing Approaches: Limitations
o Do not enable leakage characterization in a production system environment.
o Hardware testers. (also: expensive in both cost and time)
o Heaters / Heat guns. (also: expensive, reliability issues)
o Only a subset of leakage power measurements is obtained, which is then scaled.
Why leakage characterization in a system?
Systems are more readily available than testers and heaters.
Opportunity to optimize system power management based on specific chips used.
Characterization performed closer to field conditions.
Enables validation late in the product life cycle, often required in industry use-cases.
Enables re-characterization of vendor chips in systems, without disassembling the system.
FreqLeak: Overview
New method for efficient leakage power characterization in a system.
o Three step method.
o Repeat for different conditions.
FreqLeak: Highlights
Can be done in a system using existing system controls for voltage, temperature and frequency.
Different combinations of system controls and constant utilization workloads can be leveraged for creating a broad range of measurements for the chip.
FreqLeak Methodology: System Controls
o Controls for voltage, temperature and frequency.
o Constant utilization workload, run in a loop.
o Workload designed such that it heats the chip to a fairly uniform temperature profile.
o Measured outputs include total power for a particular voltage rail, plus temperature and voltage readings from the on-chip sensors.
FreqLeak: Step 1 (Workload Induced Pre-heating)
o Enable the power characterization mode of the processor.
o A constant utilization workload is run in an infinite loop.
Workload heats the chip to a temperature based on cooling system controls.
Acts as “built-in heater” for high temperature leakage characterization.
Temperature profile within 1-3 C across the different thermal sensors.
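As a rough illustration of the pre-heating check, the sketch below tests whether a set of on-chip thermal sensor readings is close to a target temperature and fairly uniform. The function names, the `tolerance_c` margin and the sample readings are assumptions made for illustration; only the 1-3 C spread comes from the slide.

```python
def temperature_spread(sensor_temps_c):
    """Return the max-min spread (in C) across thermal sensor readings."""
    if not sensor_temps_c:
        raise ValueError("need at least one sensor reading")
    return max(sensor_temps_c) - min(sensor_temps_c)


def is_preheated(sensor_temps_c, target_c, max_spread_c=3.0, tolerance_c=1.0):
    """True when the chip is near the target and the profile is fairly uniform.

    max_spread_c follows the 1-3 C spread quoted on the slide.
    tolerance_c is an assumed margin around the set point (not from the slides).
    """
    mean_c = sum(sensor_temps_c) / len(sensor_temps_c)
    uniform = temperature_spread(sensor_temps_c) <= max_spread_c
    on_target = abs(mean_c - target_c) <= tolerance_c
    return uniform and on_target


# Hypothetical readings from three sensors while targeting 85 C.
print(is_preheated([84.0, 85.5, 86.0], target_c=85.0))
```

In a real system the readings would come from the on-chip sensors and the target would be reached by adjusting the cooling controls; both are outside this sketch.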
FreqLeak: Step 2 (Frequency Stepping)
o Dynamic power (D_N) of a given frequency domain (N): D_N = C_eff * V^2 * F_N = K_N * F_N
o Keeping the on-chip voltage and temperature constant, increasing the frequency of domain N by a small delta (∆F_N) brings in a measurable increase in dynamic power. The new dynamic power is: Ð_N = K_N * (F_N + ∆F_N)
A very small increase in frequency will realistically not change the on-chip temperature profile.
If the temperature does increase, bring it back to the set point by adjusting the temperature control.
FreqLeak: Step 2 (Frequency Stepping)
o Because leakage is unchanged at constant voltage and temperature, the measured change in total power (∆Ŧ_N) equals the change in dynamic power: ∆Ŧ_N = K_N * ∆F_N
o By repeating the above steps, compute K_N for every frequency domain N on the voltage rail.
o Total dynamic power (DP) for the given rail can be computed as: DP = ∑_N (K_N * F_N)
o FreqLeak based leakage power (FL) is computed from the actual total power measured (Ŧ_N) as: FL = Ŧ_N - DP
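The arithmetic of the frequency step can be sketched in a few lines. This is a minimal illustration under the slide's assumptions (constant voltage and temperature during the step); the function names and all numbers are made up, with frequency in GHz and power in watts.

```python
def estimate_k(power_before_w, power_after_w, freq_before, freq_after):
    """Estimate K_N for one domain from a single frequency step.

    K_N = (change in measured total power) / (change in frequency),
    valid only if voltage and temperature stayed constant during the step.
    """
    delta_f = freq_after - freq_before
    if delta_f == 0:
        raise ValueError("frequency step must be non-zero")
    return (power_after_w - power_before_w) / delta_f


def leakage_power(total_power_w, domains):
    """FL = total power - sum(K_N * F_N) over (K_N, F_N) pairs on the rail."""
    dynamic_w = sum(k * f for k, f in domains)
    return total_power_w - dynamic_w


# Made-up rail with two frequency domains, total power 110 W at baseline.
k_a = estimate_k(110.0, 112.0, 3.0, 3.1)    # domain A stepped 3.0 -> 3.1 GHz
k_b = estimate_k(110.0, 110.5, 2.0, 2.05)   # domain B stepped 2.0 -> 2.05 GHz
fl = leakage_power(110.0, [(k_a, 3.0), (k_b, 2.0)])
print(f"K_A={k_a:.2f} W/GHz, K_B={k_b:.2f} W/GHz, FL={fl:.2f} W")
```

Each domain is stepped separately while the others stay at their baseline frequency, so each power difference isolates one K_N.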
FreqLeak: Step 3 (Creation of leakage table)
o Repeat to obtain power measurements across a broad range of voltage, temperature, and constant-utilization workload conditions.
o Store the extracted leakage power in the form of a table in vital product data.
o While running any workload on the system, compute the workload-dependent runtime dynamic power: DP_t = Measured total power - FL(V,T) from leakage table
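The runtime use of the leakage table can be sketched as a simple lookup followed by a subtraction. The table layout (keys of millivolts and degrees C), the entries and the function name are assumptions for illustration; the slides do not specify how the table is stored in vital product data or whether values between grid points are interpolated.

```python
# Hypothetical leakage table: (voltage in mV, temperature in C) -> leakage in W.
# All values are made up for illustration.
LEAKAGE_TABLE_W = {
    (1000, 55): 12.0,
    (1000, 85): 21.0,
    (1200, 55): 20.0,
    (1200, 85): 35.0,
}


def runtime_dynamic_power(measured_total_w, voltage_mv, temp_c,
                          table=LEAKAGE_TABLE_W):
    """DP_t = measured total power - FL(V, T), using an exact table lookup."""
    key = (voltage_mv, temp_c)
    if key not in table:
        raise KeyError(f"no leakage entry for {voltage_mv} mV at {temp_c} C")
    return measured_total_w - table[key]


# A workload drawing 150 W on the rail at 1.2 V and 85 C.
print(runtime_dynamic_power(150.0, 1200, 85))
```

Integer keys avoid floating-point equality problems in the lookup; a production table would likely need a policy for operating points that fall between characterized entries.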
FreqLeak: Other Key Aspects
o Keeping on-chip temperature and voltage as constant as possible.
o Criteria for the absolute size of the frequency step required. (f2 - f1)
o Determining the start and stop of the frequency step. (f2 and f1)
o Determining the number of frequency steps required.
o State-dependency of leakage power.
These aspects were studied using experiments in the hardware lab.
Experimental Setup
Used an IBM Power S824 server with 22 nm POWER8 microprocessors: a 2-socket server in a 19-inch rack-mounted, 4U (EIA units) mechanical form factor. Ships with 2 x IBM POWER8 chips (in 6/12, 8/16, 24 core configurations) supporting a maximum of 1024 GB total memory (16 DDR3 CDIMM slots - 16 GB, 32 GB, 64 GB @1600 MHz).
Experimental Evaluation Methodology
o FreqLeak used to get leakage power (FL) for the POWER8 VDD rail for a particular voltage=V and temperature=T.
o Very accurate reference hardware leakage power (HL) at the same voltage and temperature obtained using expensive external heaters.
o Experiments done across a considerable range of voltage, temperature and hardware parts.
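The comparison of FL against HL reduces to a relative error. The sketch below assumes the error is expressed relative to the reference HL and that a positive value means FreqLeak overestimated; the slides report signed "FL vs HL Error %" values but do not spell out the formula, so this convention is an assumption.

```python
def error_percent(fl_w, hl_w):
    """Signed error of FreqLeak leakage (FL) relative to the reference (HL)."""
    if hl_w == 0:
        raise ValueError("reference leakage must be non-zero")
    return (fl_w - hl_w) / hl_w * 100.0


def within_bound(fl_w, hl_w, bound_percent=5.0):
    """True when the magnitude of the error is below the given bound."""
    return abs(error_percent(fl_w, hl_w)) < bound_percent


# Made-up values: FreqLeak reports 10.3 W where the heater reference gives 10.0 W.
print(f"{error_percent(10.3, 10.0):.1f}%", within_bound(10.3, 10.0))
```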
[Diagram: for POWER8 VDD leakage at voltage=V and temperature=T, the FreqLeak value (FL, obtained under a workload) is compared with the external heater value (HL) to give Error %.]
Experimental Results
[Figure: FreqLeak Accuracy at T=85C. Y-axis: normalized leakage power (0.5 to 3.5); x-axis: Normalized Voltage (%), -10 to 25; series: HL, FL1, FL2, FL3, FL4.]
[Figure: FreqLeak Accuracy at T=55C. Y-axis: normalized leakage power (0.4 to 1.8); x-axis: Normalized Voltage (%), -10 to 25; series: HL, FL5, FL6.]
Leakage power error <5% across a considerable range of voltage, temperature, and hardware parts.
Experimental Results
Freq Step   FL vs HL Error %   Freq Step   FL vs HL Error %
Step 1      3.8                3-step      2.9
Step 2      2.3                3-step      2.0
Step 3      4.6                3-step      2.8
Part = PR1 Voltage=1.2V Temperature = 85C Workload = WC
o N-step method, where n=3.
o Improved accuracy.
o Additional effort in characterization.
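The slides do not say how the n frequency steps are combined. One plausible reading, shown here purely as an assumption and not as the authors' procedure, is to fit a straight line through the (frequency, total power) points and take its slope as K_N, which averages out measurement noise in any single step. The function name and data are made up.

```python
def fit_k(freqs, powers):
    """Least-squares slope of total power vs frequency across several steps.

    Assumed way of combining n steps; the source only states that an
    n-step method (n=3) improves accuracy at the cost of extra effort.
    """
    n = len(freqs)
    if n < 2 or n != len(powers):
        raise ValueError("need at least two matching (frequency, power) points")
    mean_f = sum(freqs) / n
    mean_p = sum(powers) / n
    sxx = sum((f - mean_f) ** 2 for f in freqs)
    if sxx == 0:
        raise ValueError("frequencies must not all be equal")
    sxy = sum((f - mean_f) * (p - mean_p) for f, p in zip(freqs, powers))
    return sxy / sxx


# Baseline plus three steps (GHz, W); values are illustrative only.
k = fit_k([3.0, 3.1, 3.2, 3.3], [110.0, 112.1, 113.9, 116.0])
print(f"K_N ~ {k:.2f} W/GHz")
```

A single step uses two points and is fully exposed to sensor noise in either reading; using more points trades extra characterization time for a steadier slope, which matches the trade-off listed on the slide.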
Conclusion
We introduced FreqLeak, an efficient method for post-silicon leakage power characterization in a system.
We advocate supplementing wafer test with FreqLeak.
We present how FreqLeak can be implemented using existing system controls and power measurements.
Experimental evaluation of FreqLeak on the IBM POWER8 microprocessor chip demonstrates the efficiency and accuracy of the proposed approach.
Backup
Experimental Results
Across 2 unique hardware parts, range of voltage conditions (1.0-1.25V), different frequency start/stop steps, at T=85C
Voltage (V)   FL vs HL Error %
1.00          3.2
1.10          -3.8
1.20          4.8
1.25          4.6
Part = PR1 Frequency Step = f1 to f2 Temperature = 85C Workload = WA
Voltage (V)   FL vs HL Error %
1.00          -2.3
1.10          3.6
1.20          4.8
1.25          -2.7
Part = PR2 Frequency Step = f3 to f4 Temperature = 85C Workload = WA
Experimental Results
Voltage (V)   PR1 - FL vs HL Error %   PR2 - FL vs HL Error %
1.00          3.9                      -1.3
1.20          0.0                      -0.1
1.25          3.6                      -3.1
Frequency Step = f3 to f4 Temperature = 85C Workload = WB
Voltage (V)   FL vs HL Error %   Voltage (V)   FL vs HL Error %
1.00          0.0                1.00          -0.9
1.10          1.8                1.10          2.9
1.20          2.4                1.20          4.7
Part = PR1 Frequency Step = f5 to f6 Temperature = 85C Workload = WC
Part = PR2 Frequency Step = f3 to f4 Temperature = 85C Workload = WC
Freq Step   FL vs HL Error %   Freq Step   FL vs HL Error %   Freq Step   FL vs HL Error %
Step 1      3.8                Step 1      0.7                Step 1      0.9
Step 2      2.3                Step 2      1.9                Step 2      3.2
Step 3      4.6                Step 3      1.3                Step 3      0.9
Part = PR1 Voltage=1.2V Temperature = 85C Workload = WC
Part = PR2 Voltage=1.25V Temperature = 85C Workload = WC
Part = PR2 Voltage=1V Temperature = 75C Workload = WD
Voltage (V)   FL vs HL Error %   Voltage (V)   FL vs HL Error %
1.00          0.1                1.00          -4.5
1.10          3.8                1.10          3.2
1.20          -4.4               1.20          -1.1
Part = PR2 Frequency Step = f7 to f8 Temperature = 55C Workload = WE
Part = PR1 Frequency Step = f3 to f4 Temperature = 55C Workload = WE
Across a wider range of parts, frequency steps, workloads, and temperature.