varun gupta carnegie mellon university 1 with: mor harchol-balter (cmu)
TRANSCRIPT
Optimal Load Balancing Policies for Heterogeneous Server Farms
VARUN GUPTACarnegie Mellon University
1
With:
Mor Harchol-Balter(CMU)
2
Warm up
Khomogeneous
First-Come-First-Servedservers
Exponentially-distributedjob sizes
Load Balancer
Knows queue lengthsNot job sizes
Q: What is the optimal load balancing policy?A: Join-the-Shortest-Queue
Q: Why?A: JSQ = Minimize Expected Response time of arrival
GOAL:Minimize MeanResponse Time
E[T]
Poisson(λ)
3
This Talk
μ=4
μ=4
μ=1
Kheterogeneous
First-Come-First-Servedservers
Exponentially-distributedjob sizes
Load Balancer
Knows queue lengthsNot job sizes
Q: What is the optimal load balancing policy?
μ=1
GOAL:Minimize MeanResponse Time
E[T]
Poisson(λ)
Smart-JSQ = Join-Shortest-Queue
(with smart tie breaks)
4
MER = Minimum Expected Response
time
μ=4
μ=1
μ=4
μ=1
μ=4
μ=1
μ=4
μ=1
Q: Which is the better policy?Q: What is the optimal policy?
5
OutlineMany-servers limit:
Simulation Results• Effect of K• Effect of arrival rate (λ)• Effect of degree of heterogeneity
Light-traffic regime Heavy-traffic regime
Partial characterization of the optimal policy
Complete characterization of optimal policies First asymptotic approximations
6
Many-servers light-traffic limit4
4
1
Poisson(λ)
1
K1 = α1 K
K2 = α2 K
7
Many-servers light-traffic limit4
4
1
K1 = K/3
K2 = 2K/3
Poisson(λ)
Q: Performance of MER
1
Case 1: λ < 4K/3
• Fast can handle λ• Arrivals find at least one fast idle• E[T] = 1/4
Case 2: λ > 4K/3
• Fast can not handle λ• Can not use slow until each fast has 3 jobs !
8
Many-servers light-traffic limit4
4
1
K1 = K/3
K2 = 2K/3
Poisson(λ)
Q: Performance of Smart-JSQ
1
Case 1: λ < 4K/3
• Fast can handle λ• Arrivals find at least one fast idle• E[T] = 1/4
Case 2: λ > 4K/3
Use slow as soon as each fast has 1 job !
9
Many-servers light-traffic limit4
4
1
K1 = K/3
K2 = 2K/3
Poisson(λ)
Smart-JSQ better than MER!
…but any policy which sends to slow when all fast are busy is identical in light-
traffic
1
HYBRID(smart-
JSQ+MER)
Smart-JSQ
10
MER
μ=4
μ=1
μ=4
μ=1
Light-traffic HYBRID = Smart-JSQ
μ=4
μ=1
μ=4
μ=1
μ=4
μ=1
μ=4
μ=1
smart-JSQ when some server idle
MER when all busy
11
OutlineMany-servers limit:
Simulation Results• Effect of K• Effect of arrival rate (λ)• Effect of degree of heterogeneity
Light-traffic regime Heavy-traffic regime
Partial characterization of the optimal policy
Complete characterization of optimal policies First asymptotic approximations
12
Many-servers heavy-traffic limit
4
4
1
1
K1 = α1 K
K2 = α2 K
Poisson(λ)
GOALAnalysis of policies for
heterogeneous servers
Analysis of JSQ for
homogeneous server
13
Many-servers heavy-traffic limit
4
4
1
1
K1 = α1 K
K2 = α2 K
Poisson(λ)
GOALAnalysis of policies for
heterogeneous servers
Analysis of JSQ for
homogeneous server
14
Many-servers heavy-traffic analysis for homogenous JSQ
KPoisson(λ)
Analysis technique: Markov chain for total jobs in system
0 1 K 2K+22K+12K3K/2
λ λ λ λ λ λ
? = mean departure rate given 3K/2 jobs
15
N = 3K/2
Poisson(λ)
K/2
K/2
Rate = K/2 Rate = K
Departure rate = K–1
(not K)
Finding the O(1) fluctuations critical
to analysis
O(1) idle queues
16
N = (1+γ)K(0<γ<1)
Poisson(λ)
γK
(1- γ)K
Rate = (1-γ)K Rate = K
Departure rate = K–(1-γ)/ γ
(not K)
Finding the O(1) fluctuations critical
to analysis
O(1) idle queues
17
Many-servers heavy-traffic analysis for homogenous JSQ
KPoisson(λ)
Analysis technique: Markov chain for total jobs in system
0 1 K 2K+22K+12K3K/2
λ λ λ λ λ λ
KKK-1Asymptotically negligible
probability mass First closed-form approx for JSQ!
18
Many-servers heavy-traffic limit
4
4
1
1
K1 = α1 K
K2 = α2 K
Poisson(λ)
GOALAnalysis of policies for
heterogeneous servers
Analysis of JSQ for
homogeneous server
19
Many-servers heavy-traffic limit
OPT policy maximize departure rate for each NÞ (preemptively) send jobs to slow servers even when they have 1 job and all fast servers have >
1
Smart-JSQ is optimal in many-servers
4
4
1
1
K1 = α1 K
K2 = α2 K
Poisson(λ)
20
OutlineMany-servers limit:
Simulation Results• Effect of K• Effect of arrival rate (λ)• Effect of degree of heterogeneity
Light-traffic regime Heavy-traffic regime
Partial characterization of the optimal policy
Complete characterization of optimal policies First asymptotic approximations
21
0 200 400 6000.1
0.5
0.9
1.3
Arrival rate (λ)
E[T]
Many-servers light-traffic
μ1=4, μ2=1, K1=100, K2=200
Smart-JSQ
22
0 200 400 6000.1
0.5
0.9
1.3
Arrival rate (λ)
E[T]
Many-servers light-traffic
μ1=4, μ2=1, K1=100, K2=200
MER
Smart-JSQ
23
0 200 400 6000.1
0.5
0.9
1.3
Arrival rate (λ)
E[T]
Many-servers light-traffic
μ1=4, μ2=1, K1=100, K2=200
MER
Smart-JSQ
HYBRID
24
590 592 594 596 598 6000.4
0.8
1.2
1.6
2
Arrival rate (λ)
E[T]
Many-servers heavy-traffic
μ1=4, μ2=1, K1=100, K2=200
MER
Smart-JSQ
HYBRID
25
3 300.4
0.6
0.8
1
1.2
1.4
Number of servers
E[T]
Effect of number of servers
μ1=4, μ2=1, α1=1/3, α2=2/3
Smart-JSQ
26
3 300.4
0.6
0.8
1
1.2
1.4
Number of servers
E[T]
Effect of number of servers
μ1=4, μ2=1, α1=1/3, α2=2/3
MER
Smart-JSQ
27
3 300.4
0.6
0.8
1
1.2
1.4
Number of servers
E[T]
Effect of number of servers
μ1=4, μ2=1, α1=1/3, α2=2/3
MER
Smart-JSQ
HYBRID
Conclusions A new many-servers heavy-traffic scaling to
analyze load balancing policies
First closed-form approx of load balancing heuristics
Choosing the right load balancer• Few servers, Small load, High heterogeneity HYBRID• Many servers, High load, Low heterogeneity smart-JSQ
28
29
0 200 400 600 800 1000 12000.1
0.5
0.9
1.3
Arrival rate (λ)
E[T]
Many-servers light-traffic
μ1=8, μ2=1, K1=100, K2=400
MER
Smart-JSQ≈ HYBRID
30
1190 1192 1194 1196 1198 12000.4
0.8
1.2
1.6
2
Arrival rate (λ)
E[T]
Many-servers heavy-traffic
μ1=8, μ2=1, K1=100, K2=400
MER
Smart-JSQ
HYBRID
31
5 500.3
0.5
0.7
0.9
1.1
Number of servers
E[T]
Effect of number of servers
μ1=8, μ2=1, α1=1/5, α2=4/5
MER
Smart-JSQ
HYBRID