predicting web service levels during vm live migrations · i )identify and quantify e ects on (web)...
TRANSCRIPT
Predicting Web Service LevelsDuring VM Live Migrations
Helmut Hlavacs and Thomas TreutnerResearch Group Entertainment Computing, University of Vienna, Austria
Introduction
I VM live migration important for energy efficiencyI Enables us to establish energy efficient target distribution of VMsI Supposedly no perceivable service downtime while live migratingI Live migration is resource intensive (iterative page copying)I Experiments: Influence on service levels while migrating?I Modelling: Predict service levels based on utilization?
Scenario
I Virtualized data center, static consolidation (P2V)I Provisioning for peak load, still bad energy efficiencyI E.g., 9-5-cycles, very low utilization at nightI Live migration enables dynamic consolidationI But: Seldom used, fear of possible side effectsI⇒ Identify and quantify effects on (web) service levelsI⇒ Find most influencing utilization metrics
Experiment
I Two servers, a single VM, migrating forth and backI VM disk image on central node (Gbit, open-iscsi)I qemu-kvm VM: Linux, Apache2, PHP5, MediaWikiI SQL VM and load generation on an extra nodesI Logging utilization of servers and VM, more than 100 variablesI Rise load from 50 to 600 concurrent virtual users and backI Migrate every 15min, track response time of last 5minI Maximum allowed response time: 1s
Data Overview
0.1
0.2
0.3
0.4
0.5
0.6
0.7
0.8
0.9
0 20 40 60 80 100 120 140 160 180 200
CP
U U
tiliz
atio
n (
VM
no
rme
d)
Interval
0
1
2
3
4
5
6
0 20 40 60 80 100 120 140 160 180 200
Lo
ad
5 A
ve
rag
e
Interval
0
5000
10000
15000
20000
25000
06 12 18 00 06 12 18 00 06 12
Re
sp
on
se
Tim
e [
ms]
Hours
0.82
0.84
0.86
0.88
0.9
0.92
0.94
0.96
0.98
1
0 20 40 60 80 100 120 140 160 180 200
Se
rvic
e L
eve
l R
atio
Interval
0
1
2
3
4
5
6
7
8
0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9
Lo
ad
5 A
ve
rag
e
CPU Utilization (VM normed)
Measured
M/M/1
M/D/1
M/G/1, c2B=0.42
0.82
0.84
0.86
0.88
0.9
0.92
0.94
0.96
0.98
1
0 1 2 3 4 5 6
Se
rvic
e L
eve
l
Load5 Average
MeasuredTheoretic
I We can interpret the UNIX load as approximation to Q, the average numberof jobs in a Markovian M/M/1 queue, and the VM’s CPU utilization as ρ,
the system utilization: QM/M/1 = ρ2
1−ρI UNIX load is exponentially averaged by definition and the service times are
not necessarily exactly exponentially distributed: QM/G/1 = ρ2
1−ρ ·(1+c2
B)
2
I For deterministic service times c2B = 0, resulting in QM/D/1 = ρ2
2(1−ρ)
I Simple linear regression delivers the coefficient c2B = 0.42
I P(T ≤ x) = FT(x) = 1− e−µ(1−ρ)x, the theoretical probability that theresponse time T is lower than or equal to a limit x for a given service rate µ
Service Levels for Low/Medium/High Workload Scenarios
I Low (top right): Slightly increasedresponse times during live migration,seldom response time violations
I Medium (bottom left): SLA ratiogenerally satisfies the 97% limit
I High (bottom right): Often andheavy violations, unacceptable lowservice levels, typically decreased by20-25% percentage points
0
2
4
6
8
10
12
14
16
0.85
0.9
0.95
1
Respo
nse T
ime
[s]
SL
A R
atio
Time
HTTP Response TimeMigration started
Migration finishedSLA Ratio
Maximum allowed response time
0
0.5
1
1.5
2
2.5
3
3.5
4
4.5
0.95
0.96
0.97
0.98
0.99
1
Respo
nse T
ime
[s]
SL
A R
atio
Time
HTTP Response TimeMigration started
Migration finishedSLA Ratio
Maximum allowed response time
0
5
10
15
20
0.5
0.6
0.7
0.8
0.9
1
Respo
nse T
ime
[s]
SL
A R
atio
Time
HTTP Response TimeMigration started
Migration finishedSLA Ratio
Maximum allowed response time
Model Selection
I Stepwise model selection:Akaike Information Criterion
I Finds trade-off between number ofparameters (model size) andgoodness of fit (model quality)
I For comparison: Exhaustiveall-subsets-regression (LEAPS)
I LEAPS: Find best of all possiblemodels for given range of model size
I Computationally intensive even ifnumber of variables is limited
I R2Adj increases slightly with increased
model complexity. UNIX load5
contributes R2Adj of ∼ 90%
0.9
0.91
0.92
0.93
0.94
0.95
0.96
0.97
1 2 3 4 5 6 7 8 9 10 11 12
R2 A
dj
Model Size
Best value
Most Influencing Model ParametersVariable Meaning Estimate Std. Error Pr(> |t|)Intercept 2.395e+00 5.069e-01 3.00e-06
wp01 load5 VM UNIX load5 -1.871e-02 2.627e-03 3.67e-12
wp01 swapUsed VM swap used -7.656e-07 8.809e-08 < 2e-16
wp01 residentSize SQ Suared amount of resident memory used by
the qemu-kvm process.
-4.652e-14 1.652e-14 0.00506
src host cpu proc s Tasks created/s, source host. -2.475e+00 9.166e-01 0.00716
src host cpu proc s SQ 1.091e+00 4.133e-01 0.00856
wp01 cpu util vmnorm SQ -1.328e-01 3.187e-02 3.63e-05
wp01 cpu util vmnorm CPU util measured inside VM 9.517e-02 2.316e-02 4.64e-05
wp01 load5 SQ Squared UNIX load5 of the VM. -1.140e-03 1.918e-04 5.22e-09
wp01 freeMemRatio SQ Squared ratio of free memory inside the VM. 1.976e-02 9.462e-03 0.03727
Conclusions
I Impact of live migration on SL depends on amount of workloadI Tighter SLAs can be fulfilled during low and medium workloadI Migrating during high load causes massive decrease of service levelI Service level variance during a live migration to 90% predictable
using only a single variable, the UNIX load5 average, models with 12variables can explain 95% of variance
I Systems using live migration as a mechanism to realize a more energy efficienttarget distribution and have service level targets need to consider the UNIX
load average, but typical hypervisors do not collect/export this informationI Hypervisors should be extended to export load information (cf. free memory)
Future Work
I Influence of additional VMs (idle, utilized, mixed)I Linux and qemu-kvm: Kernel Samepage Merging (KSM)I Database VM migration, currently taboo due to potentially severe influenceI qemu-kvm parameters: Bandwidth limits, maximum allowed downtimeI Predict migration delay, energy consumption, service downtime
Created with LATEXbeamerposter http://www-i6.informatik.rwth-aachen.de/~dreuw/latexbeamerposter.php
http://cs.univie.ac.at/research/research-groups/entertainment-computing/ <firstname>.<lastname>@univie.ac.at