fair scheduling in web servers cs 213 lecture 17 l.n. bhuyan

Fair Scheduling in Web Servers

CS 213 Lecture 17

L.N. Bhuyan

Differentiated Service in a Web Cluster: Objective

• Create an arbitrary number of service quality classes and assign a priority weight for each class.

• Provide service differentiation for different user classes in terms of the allocation of CPU and disk I/O capacities

Ref: Demand Driven Service Differentiation in Cluster-based Network Servers, Infocom 2001, by Zhu, Tang and Yang

Target System

Service Differentiation

• Requests of higher classes receive better service than requests of lower classes, especially when the system is heavily loaded

• Requests from lower classes should not be sacrificed for requests from higher classes when the system load is light

Definitions

• Request classes: C1, C2, …, CN

• Corresponding Weights: W1, W2, …, WN

• Stretch factors: S1, S2, …, SN

stretch factor: the ratio of the response time of a request to the service demand of that request

• Average arrival rate: λi

• Average processing rate: μi

• Minimum resource requirement of class i: ρi = λi / μi
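
The definitions above can be made concrete with a small sketch. The class name `ServiceClass`, the field names, and the sample numbers are illustrative assumptions, not part of the lecture.

```python
from dataclasses import dataclass


@dataclass
class ServiceClass:
    """One service quality class (hypothetical container for the symbols above)."""
    name: str
    weight: float        # W_i, priority weight of the class
    arrival_rate: float  # lambda_i, average requests arriving per second
    service_rate: float  # mu_i, average requests processed per second

    def min_resource_requirement(self) -> float:
        """rho_i = lambda_i / mu_i."""
        return self.arrival_rate / self.service_rate


def stretch_factor(response_time: float, service_demand: float) -> float:
    """Ratio of a request's response time to its service demand."""
    return response_time / service_demand


# Made-up example values.
gold = ServiceClass("gold", weight=3.0, arrival_rate=20.0, service_rate=100.0)
print(gold.min_resource_requirement())   # 0.2
print(stretch_factor(0.75, 0.25))        # 3.0
```

A stretch factor of 3.0 here means the request took three times as long to complete as its own service demand.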

Optimization Problem

• Minimize:

F = W1·S1 + W2·S2 + … + WN·SN

such that: S1 ≤ S2 ≤ … ≤ SN

S1, S2, …, SN ≤ K

where K > 1 is a predefined bound on the stretch factor
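
As a rough illustration of the problem statement, the sketch below evaluates the objective F and checks the two constraints for a candidate set of stretch factors. It does not solve the optimization. The function names and the sample numbers are assumptions made for this example.

```python
from typing import Sequence


def objective(weights: Sequence[float], stretches: Sequence[float]) -> float:
    """F = W1*S1 + W2*S2 + ... + WN*SN."""
    return sum(w * s for w, s in zip(weights, stretches))


def is_feasible(stretches: Sequence[float], bound: float) -> bool:
    """Check S1 <= S2 <= ... <= SN and Si <= K for every class."""
    ordered = all(a <= b for a, b in zip(stretches, stretches[1:]))
    bounded = all(s <= bound for s in stretches)
    return ordered and bounded


# Made-up values: class 1 is the highest class and gets the smallest stretch.
weights = [3.0, 2.0, 1.0]
stretches = [1.5, 2.0, 4.0]
K = 5.0

print(objective(weights, stretches))  # 12.5
print(is_feasible(stretches, K))      # True
```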

Optimal Solution

Scheduling Optimization for Resource-Intensive Web Requests

In SPAA’99, by Zhu, Smith and Yang

Request Classification

• Static Data

web pages, images, etc.

do not consume many system resources

Request Classification (Cont.)

• Dynamic Data

e-commerce, database searching, personalized information

generated dynamically; place greater I/O and CPU demands on the server

1 to 2 orders of magnitude longer processing time than static requests, based on IBM Olympics and Alexandria Digital Library data

Flat Architecture

• Server nodes can process both static and dynamic requests

Master/Slave Architecture

• Server nodes are divided into two groups:

Slave group processes only dynamic requests

Master group can handle both static and dynamic requests
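
The master/slave split can be sketched as a simple dispatcher. This is only an illustration: the class `MasterSlaveDispatcher`, the round-robin choice within each group, and the parameter `master_dynamic_fraction` (the share of dynamic requests sent to masters) are assumptions, not the scheme from the cited paper.

```python
import itertools
import random
from typing import Optional, Sequence


class MasterSlaveDispatcher:
    """Route static requests to masters; split dynamic requests between groups."""

    def __init__(self, masters: Sequence[str], slaves: Sequence[str],
                 master_dynamic_fraction: float,
                 rng: Optional[random.Random] = None) -> None:
        # Round-robin inside each group is an assumption made for brevity.
        self._masters = itertools.cycle(masters)
        self._slaves = itertools.cycle(slaves)
        self._fraction = master_dynamic_fraction
        self._rng = rng or random.Random()

    def route(self, is_dynamic: bool) -> str:
        if not is_dynamic:
            # Slaves never serve static requests.
            return next(self._masters)
        # random() lies in [0, 1), so fraction 0.0 never picks a master
        # and fraction 1.0 always does.
        if self._rng.random() < self._fraction:
            return next(self._masters)
        return next(self._slaves)


dispatcher = MasterSlaveDispatcher(["m1", "m2"], ["s1", "s2"],
                                   master_dynamic_fraction=0.0)
print(dispatcher.route(is_dynamic=False))  # m1
print(dispatcher.route(is_dynamic=True))   # s1
```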

How to partition a cluster

• Questions:

1: Given p nodes, what is the optimal number of master nodes?

2: What percentage of dynamic requests should be processed on masters?

Goal: ensure the master/slave architecture outperforms the flat architecture

M/S and Flat Models

Performance Metric

• Stretch factor: the ratio of response time at a particular load to that at no load

• Average stretch factor is better suited than average response time for systems with highly variable task sizes. Average stretch factor indicates the load on a system.
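
To see how the stretch factor tracks load, consider a single node modeled as an M/M/1 queue. That model is an assumption made only for this example, not something the slides specify. Its mean response time is 1/(μ − λ) and its no-load response time is 1/μ, so the stretch factor is μ/(μ − λ) = 1/(1 − ρ).

```python
def mm1_stretch(arrival_rate: float, service_rate: float) -> float:
    """Stretch factor of an M/M/1 queue: mu / (mu - lambda).

    Requires arrival_rate < service_rate, otherwise the queue is unstable.
    """
    if arrival_rate >= service_rate:
        raise ValueError("arrival rate must be below service rate")
    return service_rate / (service_rate - arrival_rate)


# Made-up service rate of 100 requests per second.
print(mm1_stretch(0.0, 100.0))   # 1.0 (no load)
print(mm1_stretch(50.0, 100.0))  # 2.0 (50% utilization)
print(mm1_stretch(75.0, 100.0))  # 4.0 (75% utilization)
```

Under this model the stretch factor depends only on utilization, not on the absolute size of a request, which is one way to read the claim that it indicates system load.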

Evaluation results

• M/S: up to 69% performance improvement over flat

Separation of dynamic and static content

• Resource reservation: up to 68% improvement

• Resource requirement sampling: up to 23% improvement

Performance Guarantees for Internet Services (Gage)

• Environment: Web hosting services

multiple logical web servers (service subscribers) hosted on a single physical web server cluster

• Gage:

guarantees each logical web server a pre-specified level of performance

expressed as a distinct number of URL requests serviced per second

Components

Each service subscriber maintains a queue

• Request classification: determines the queue for each incoming request

• Request scheduling: determines which queue to serve next to meet the QoS requirement of each subscriber

• Resource usage accounting: captures the detailed resource usage associated with each subscriber’s service requests
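
A minimal sketch of per-subscriber queues with classification and usage accounting follows. Classifying by host name, the class name `SubscriberQueues`, and the usage fields are assumptions for illustration; they are not taken from the Gage implementation.

```python
from collections import defaultdict, deque
from dataclasses import dataclass
from typing import Dict


@dataclass
class Request:
    host: str  # logical web server the request is addressed to
    url: str


class SubscriberQueues:
    """Per-subscriber request queues plus a running resource usage tally."""

    def __init__(self, host_to_subscriber: Dict[str, str]) -> None:
        self._host_to_subscriber = dict(host_to_subscriber)
        self.queues = defaultdict(deque)
        self.usage = defaultdict(
            lambda: {"cpu_ms": 0.0, "disk_ms": 0.0, "net_bytes": 0})

    def classify(self, request: Request) -> str:
        """Request classification: pick the subscriber queue and enqueue."""
        subscriber = self._host_to_subscriber[request.host]
        self.queues[subscriber].append(request)
        return subscriber

    def account(self, subscriber: str, cpu_ms: float, disk_ms: float,
                net_bytes: int) -> None:
        """Resource usage accounting: add one request's usage to the tally."""
        tally = self.usage[subscriber]
        tally["cpu_ms"] += cpu_ms
        tally["disk_ms"] += disk_ms
        tally["net_bytes"] += net_bytes


sq = SubscriberQueues({"shop.example.com": "shop"})
print(sq.classify(Request("shop.example.com", "/cart")))  # shop
sq.account("shop", cpu_ms=5.0, disk_ms=2.0, net_bytes=1000)
```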

The Gage System

• QoS guarantee

QoS is expressed as a fixed number of generic URL requests, where a generic request represents an average web site access

Currently, a generic request is assumed to be 10 msec of CPU time, 10 msec of disk I/O and 2000 bytes of network bandwidth

Each subscriber is given a fixed number of generic requests

Other possible QoS metrics: response time, delay jitter, etc.

• Using TCP splicing
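
The generic request defined above can serve as a unit for charging real requests against a subscriber's allocation. The slides do not say how the three resources are combined; the sketch below assumes a request is charged by its most heavily used resource. That rule and the function name are assumptions for illustration only.

```python
# Generic request as stated in the slides.
GENERIC_CPU_MS = 10.0
GENERIC_DISK_MS = 10.0
GENERIC_NET_BYTES = 2000.0


def generic_units(cpu_ms: float, disk_ms: float, net_bytes: float) -> float:
    """Cost of one request in generic-request units.

    Assumption: the charge is set by the bottleneck resource, i.e. the
    largest of the three per-resource ratios.
    """
    return max(cpu_ms / GENERIC_CPU_MS,
               disk_ms / GENERIC_DISK_MS,
               net_bytes / GENERIC_NET_BYTES)


# A request using 25 msec of CPU, 10 msec of disk I/O and 4000 bytes.
print(generic_units(25.0, 10.0, 4000.0))  # 2.5
```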

Request Scheduling

Two decisions:

• Which request should be serviced next, according to each subscriber’s static resource reservation and dynamic resource usage

• Which RPN (request processing node) should service this request, according to the load information on each RPN and so as to exploit access locality
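
The two decisions can be sketched as two small functions. The selection rules used here are assumptions chosen for illustration and are not Gage's actual policy: the next subscriber is the backlogged one with the lowest usage-to-reservation ratio, and the node that last served a URL is preferred unless its load exceeds the least-loaded node's by more than `max_imbalance`.

```python
from typing import Dict, Optional, Sequence


def pick_subscriber(queues: Dict[str, Sequence],
                    reserved: Dict[str, float],
                    used: Dict[str, float]) -> Optional[str]:
    """Decision 1: choose which subscriber queue to serve next."""
    backlogged = [s for s, q in queues.items() if q]
    if not backlogged:
        return None
    # Serve the subscriber furthest below its reservation.
    return min(backlogged, key=lambda s: used.get(s, 0.0) / reserved[s])


def pick_rpn(url: str, load: Dict[str, int],
             last_served_by: Dict[str, str],
             max_imbalance: int = 2) -> str:
    """Decision 2: choose the node that will process the request."""
    least_loaded = min(load, key=load.get)
    preferred = last_served_by.get(url)
    # Keep locality only if it does not overload the preferred node.
    if preferred in load and load[preferred] - load[least_loaded] <= max_imbalance:
        return preferred
    return least_loaded


queues = {"a": ["r1"], "b": ["r2"], "c": []}
reserved = {"a": 100.0, "b": 50.0, "c": 10.0}
used = {"a": 80.0, "b": 10.0, "c": 0.0}
print(pick_subscriber(queues, reserved, used))              # b
print(pick_rpn("/x", {"n1": 5, "n2": 1}, {"/x": "n1"}))     # n2
print(pick_rpn("/x", {"n1": 2, "n2": 1}, {"/x": "n1"}))     # n1
```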
