energy management in virtualized environments gaurav dhiman, giacomo marchetti, raid ayoub, tajana...
Post on 03-Jan-2016
Energy Management in Virtualized Environments
Gaurav Dhiman, Giacomo Marchetti, Raid Ayoub, Tajana Simunic Rosing (CSE-UCSD)
Inside Xen Hypervisor
Online Learning Algorithm
Virtualization
DPM
Performs dynamic evaluation of a set of DPM and DVFS policies at run time and selects the one best suited to the current workload
Guarantees convergence and performance close to that of the best available policy in the set
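The selection step described above can be illustrated with a generic multiplicative-weights scheme over a set of policies ("experts"). This is a minimal sketch under stated assumptions, not the authors' exact algorithm: the class name `ExpertSelector`, the decay factor `beta`, the helper `blended_loss`, and the use of `alpha` as the weight on energy loss are all hypothetical choices made for illustration.

```python
import random


class ExpertSelector:
    """Multiplicative-weights selection over a fixed set of policies ("experts").

    Hypothetical illustration: names and the update rule are assumptions,
    not taken from the poster.
    """

    def __init__(self, n_experts, beta=0.75, seed=0):
        if n_experts < 1:
            raise ValueError("need at least one expert")
        if not 0.0 < beta < 1.0:
            raise ValueError("beta must lie strictly between 0 and 1")
        self.weights = [1.0 / n_experts] * n_experts
        self.beta = beta
        self._rng = random.Random(seed)

    def probabilities(self):
        """Return the normalized weight of each expert."""
        total = sum(self.weights)
        return [w / total for w in self.weights]

    def select(self):
        """Pick the operational expert, sampling in proportion to weight."""
        indices = range(len(self.weights))
        return self._rng.choices(indices, weights=self.weights)[0]

    def update(self, losses):
        """Scale each weight by beta**loss; every loss must lie in [0, 1]."""
        if len(losses) != len(self.weights):
            raise ValueError("one loss per expert is required")
        for i, loss in enumerate(losses):
            if not 0.0 <= loss <= 1.0:
                raise ValueError("losses must lie in [0, 1]")
            self.weights[i] *= self.beta ** loss
        # Renormalize so repeated updates do not underflow.
        total = sum(self.weights)
        self.weights = [w / total for w in self.weights]


def blended_loss(energy_loss, delay_loss, alpha):
    """Combine energy and delay losses; alpha weights the energy term (assumed)."""
    return alpha * energy_loss + (1.0 - alpha) * delay_loss
```

After each evaluation interval, every expert is scored as if it had been in control, so experts that are currently dormant can still gain weight and be selected later.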
OS implementation and Results
Summary
Hypervisor VM scheduler implementation
Power Management: DPM/DVFS
Workload characterization aware
Adaptive Behavior
Motivations and Goals
Lower datacenter energy consumption
Handle non-stationary workloads
Service - VM - Customization
Energy Oriented Scheduler
- Implements a scheduler capable of adapting to workload (guest) characteristics
- Migration: Guest balancing and clustering
- Co-locate guests to free up resources
- Online Learning Algorithm
Supported by NSF-GreenLight project, CNS, Sun Microsystems, UC Micro, Cisco, GSRC/DARPA
[Figure: Xen hypervisor architecture. Guest 1, Guest 2, ..., Guest n (each running Apps on an OS) sit above the Hypervisor, which contains the Credit Scheduler, Workload Characterization, and Online Learning Algorithm blocks. The Hardware layer comprises I/O (HDD, N/W) and CPUs (CPU0, CPU1, CPU2, ..., CPUn). Guests are characterized as I/O intensive or CPU intensive.]
VM Scheduling
Virtual Machine Power Oriented Scheduling
Workload migration across physical machines
Minimize impact on performance
Workload characterization
- I/O Intensiveness: Maintain metrics for I/O accesses per guest
- CPU Intensiveness: Use CPU performance counters
CPU intensive (µ -> 1) vs Memory intensive (µ -> 0)
µ = measure of CPU intensiveness
Leakage impact (ρ)
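The poster does not give the formula for µ. As a rough sketch only, one way to obtain a value with the stated behavior (close to 1 for CPU-intensive phases, close to 0 for memory-intensive phases) is to take the fraction of cycles that were not spent stalled on memory. The function name, the two counter arguments, and the formula itself are assumptions for illustration; real performance counter names vary by CPU.

```python
def cpu_intensiveness(total_cycles, memory_stall_cycles):
    """Return a value in [0, 1]; higher means more CPU intensive.

    Assumed definition (not from the poster):
        mu = 1 - memory_stall_cycles / total_cycles
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if not 0 <= memory_stall_cycles <= total_cycles:
        raise ValueError("memory_stall_cycles must lie in [0, total_cycles]")
    return 1.0 - memory_stall_cycles / total_cycles
```

Both counts would be sampled over the same interval, so µ tracks phase changes in the workload rather than a lifetime average.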
DVFS
[Figure: For qsort, frequency of selection (axis ticks 0 to 80) of each CPU frequency (208MHz, 312MHz, 416MHz, 520MHz) for low α, medium α, and high α. Annotations: Higher energy savings; Lower perf delay.]
Identifies both CPU-intensive and memory-intensive phases correctly
[Figure: Avg. μ over time. Labels: 0.75, 0.4; 25%, 75%; CPU intensive, mem intensive.]
Energy Saving/Performance Delay Results for CPU
Experimental Setup
- Workloads: qsort, djpeg, blowfish, dgzip
- CPU: XScale
[Figure: Online learning controller. The Controller performs Expert selection over a Working Set of experts (Expert 1, Expert 2, Expert 3, ..., Expert N). The Operational Expert manages power for the Device; the other experts are Dormant Experts.]
DPM & DVFS
Experimental Setup
- AMD quad core CPU
- SPEC benchmarks
Benchmark   Freq   %delay   %Energy savings (PM-i)
                            PM-1    PM-2    PM-3
mcf         1.9    29       5.2     0.7     -0.5
mcf         1.4    63       8.1     0.1     -2.1
mcf         0.8    163      8.1     -6.3    -10.7
bzip2       1.9    37       4.7     -0.6    -2.1
bzip2       1.4    86       7.4     -2.4    -5
bzip2       0.8    223      7.8     -9.0    -14
art         1.9    32       6       1       -0.1
art         1.4    76       7.3     -1.7    -4
art         0.8    202      8       -8      -13
sixtrack    1.9    37       5       -0.5    -2
sixtrack    1.4    86       6       -4.3    -7.2
sixtrack    0.8    227      7       -11     -16.1
Device   Trace Name   t_RI   σ(t_RI)
HDD      HP-1 Trace   20.5   29
HDD      HP-2 Trace   5.9    8.4
HDD      HP-3 Trace   17.2   2
t_RI: Average Request Inter-arrival Time (in sec)
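The two trace statistics in the table above can be computed from a list of request arrival timestamps as sketched below. The function name and the choice of the population standard deviation are assumptions; the poster does not say which estimator was used.

```python
from statistics import mean, pstdev


def inter_arrival_stats(arrival_times):
    """Return (mean, standard deviation) of gaps between consecutive requests.

    arrival_times: timestamps in seconds, in non-decreasing order.
    Uses the population standard deviation (an assumption).
    """
    if len(arrival_times) < 2:
        raise ValueError("need at least two arrivals")
    gaps = [later - earlier
            for earlier, later in zip(arrival_times, arrival_times[1:])]
    if any(gap < 0 for gap in gaps):
        raise ValueError("arrival_times must be sorted")
    return mean(gaps), pstdev(gaps)
```

A standard deviation that is large relative to the mean indicates bursty arrivals, which is the case where a single fixed timeout is least likely to suit the whole trace.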
Expert                                      Characteristics
Fixed Timeout                               Timeout = 7*T_be
Adaptive Timeout (Douglis, USENIX’95)       Initial timeout = 7*T_be; Adjustment = +1*T_be / -1*T_be
Exponential Predictive (Hwang, ICCAD’97)    I_{n+1} = a*i_n + (1 - a)*I_n, with a = 0.5
TISMDP (Simunic, TCAD’01)                   Optimized for delay constraint of 3.5% on HP-1 trace
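The exponential predictive expert's update, I_{n+1} = a*i_n + (1 - a)*I_n, can be sketched as follows, where i_n is the most recent observed idle period and I_n the previous prediction. The function names, the initial prediction of 0, and the rule of sleeping only when the prediction exceeds T_be are assumptions added for illustration.

```python
def exponential_predictions(idle_periods, a=0.5, initial=0.0):
    """Return the prediction made after each observed idle period.

    Implements I_{n+1} = a * i_n + (1 - a) * I_n.
    The starting prediction `initial` is an assumption.
    """
    if not 0.0 <= a <= 1.0:
        raise ValueError("a must lie in [0, 1]")
    prediction = initial
    predictions = []
    for observed in idle_periods:
        prediction = a * observed + (1.0 - a) * prediction
        predictions.append(prediction)
    return predictions


def should_sleep(predicted_idle, t_be):
    """Sleep only if the predicted idle period exceeds T_be (assumed rule)."""
    return predicted_idle > t_be
```

With a = 0.5 the latest idle period and the accumulated history carry equal weight, so a single unusually long idle period moves the prediction halfway toward it.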
Policy   Description
PM-1     Switch CPU to ACPI state C1 (remove clock supply) and move to lowest voltage setting
PM-2     Switch CPU to ACPI state C6 (remove power)
PM-3     Switch CPU to ACPI state C6 and switch the memory to self-refresh mode
Policy       HP-1 Trace           HP-2 Trace           HP-3 Trace
             %delay   %energy     %delay   %energy     %delay   %energy
Oracle       0        68.17       0        65.9        0        71.2
Timeout      4.2      49.9        4.4      46.9        3.3      55
Ad Timeout   7.7      66.3        8.7      64.7        6        67.7
TISMDP       3.4      44.8        2.26     36.7        1.8      42.3
Predictive   8        66.6        9.2      65.2        6.5      68
Preference            HP-1 Trace           HP-2 Trace           HP-3 Trace
                      %delay   %energy     %delay   %energy     %delay   %energy
Low delay             3.5      45          2.61     37.41       2.55     49.5
(intermediate)        6.13     60.64       5.86     54.2        4.36     61.02
High energy savings   7.68     65.5        8.59     64.1        5.69     66.28
DPM: With Individual Experts
DPM: With Online Learning
Recent CPUs might perform better with a “run to sleep” policy due to:
- Improved CPU efficiency
- Idle power management support
Power/Performance Results for HDD HP-1 trace
Comparison with fixed timeout experts