

A QoS Control Mechanism to Provide Service Differentiation and Overload Protection to Internet Scalable Servers

Daniel F. García, Member, IEEE, Javier García, Joaquín Entrialgo, Manuel García, Pablo Valledor, Rodrigo García, and Antonio M. Campos

Abstract: Nowadays, enterprises providing services through Internet often require online services supplied by other enterprises. This entails the cooperation of enterprise servers using Web services technology. The service exchange between enterprises must be carried out with a determined level of quality, which is usually established in a service level agreement (SLA). However, the fulfilment of SLAs is not an easy task and requires equipping the servers with special control mechanisms which control the quality of the services supplied. The first contribution of this research work is the analysis and definition of the main requirements that these control mechanisms must fulfil. The second contribution is the design of a control mechanism which fulfils these requirements and overcomes numerous deficiencies posed by previous mechanisms. The designed mechanism provides differentiation between distinct categories of service consumers as well as protection against server overloads. Furthermore, it scales in a cluster and does not require any modification to the system software of the host server, or to its application logic.

Index Terms: QoS, quality of service, Web services, Internet servers, overload protection, service level agreement, SLA, cluster computing.

    1 INTRODUCTION

Nowadays, enterprises providing services through Internet must use online services supplied by other enterprises to compose the final services required by their end clients. Therefore, the Internet servers of several enterprises must work in cooperation to provide the required services.

In order to ensure that the quality of service (QoS) perceived by end clients is acceptable, the servers must include techniques and mechanisms that guarantee a minimum level of QoS. Although QoS has multiple aspects such as response time, throughput, availability, reliability, and security, the primary aspect of QoS considered in this work is related to response time.

The operational environment in which the QoS control mechanism presented in this paper is intended to work is an enterprise server (or cluster) acting as a Web services provider, which supplies a collection of services through Internet to servers of other enterprises. In this paper, we refer to the server providing services as the provider server and to the servers consuming services as consumer servers. The services are deployed using Web services technology. The consumer servers usually belong to enterprises which provide online services to end clients. In this operational environment, the consumer servers consume the services delivered by the provider server in order to respond to the requests carried out by the end clients. This operational environment defines C2B connections between end clients and consumer servers, and B2B relationships between consumer servers and provider servers.

In order that consumer servers can provide end clients with the appropriate quality of service, they must be guaranteed a determined QoS from the provider server. To this end, a Service Level Agreement (SLA) is established between each consumer server and the provider server. The SLAs specify the maximum response time of each service delivered by the provider. The SLAs also specify the consumer server category, which states the degree of QoS that the consumer receives from the provider.

In this type of operational environment, the interactions between the consumer servers and the provider server are mainly carried out in secure mode using the SSL protocol. To avoid the great overhead entailed by negotiating an independent secure session for each request, all the requests of each consumer server (occurring in a determined period of time) are usually grouped in a single secure session. In this paper, the collection of interactions carried out between a provider and a consumer server within a single secure session is called a business session.


D.F. García, J. García, J. Entrialgo, and M. García are with the Department of Computer Science and Engineering, University of Oviedo, Campus Universitario, Edificio 1, 33204 Gijón, Spain. E-mail: {dfgarcia, javier, joaquin, mgarcia}@uniovi.es.

P. Valledor is with the R&D Technological Centre KiN (CDT), ArcelorMittal R&D Technological Centre, Centro de Desarrollo Tecnológico, PO Box 90, 33400 Avilés, Spain. E-mail: [email protected].

R. García and A.M. Campos are with the Departamento I+D+i, CTIC Foundation, Parque Científico Tecnológico de Gijón, Edificio Centros Tecnológicos, 33203 Cabueñes, Gijón, Spain. E-mail: {rodrigo.garcia, antonio.campos}@fundacionctic.org.

Manuscript received 29 July 2008; revised 22 Dec. 2008; accepted 19 Jan. 2009; published online 27 Jan. 2009. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TSC-2008-07-0070.

Digital Object Identifier no. 10.1109/TSC.2009.3. 1939-1374/09/$25.00 © 2009 IEEE. Published by the IEEE Computer Society.


The provider server may be implemented by means of a single server or a cluster server. In the latter case, a load balancing cluster is used. This kind of cluster (shown in Fig. 1) is made up of a set of nodes which execute the same application logic, supplying the same set of Web services to the consumer servers.

Fig. 1. Load balancing cluster implementation of a provider server.

The application logic of the provider server is commonly implemented using a Web server, as well as other components which provide a wide variety of services, such as connectivity with databases and handling of communications with other applications. The systems implementing Web services with the support of these environments are currently known as application servers. Therefore, the provider server depicted in Fig. 1 must also be considered an application server.

Much research effort in the QoS control of servers has been oriented to Web servers. In spite of this, many techniques available in the literature are inapplicable to real systems. As an example, in overloading periods, many techniques discard Web requests arriving at the server without considering that these requests are integrated in the navigation sessions of the clients.

Furthermore, very little effort has been devoted to the QoS control of application servers, as will be discussed in Section 3. The research presented in this paper fills a gap by providing useful QoS control techniques suitable for a broad range of application servers. The research results presented in this paper are also important due to the increasing deployment of application servers operating with Web services technology which must fulfill QoS requirements. The QoS control mechanism presented in this paper has been designed for an application server, providing a useful improvement in the QoS control of this type of server.

The research topics addressed through this paper are described below with the following organization. Section 2 states the general requirements demanded of any useful QoS control mechanism. Section 3 analyzes the QoS mechanisms described in previous research work, showing their deficiencies. In Section 4, the proposed design of a new QoS mechanism is explained. Sections 5 and 6 describe the experimental environment in which the QoS mechanism has been tested and the experimental results. Finally, Section 7 concludes the paper.

2 REQUIREMENTS OF THE QOS CONTROL MECHANISM

After analyzing the different QoS mechanisms presented in the literature (described in Section 3) and taking into account the operational environment in which they must work, we propose a set of requirements that a QoS mechanism for an application server should address. These requirements are the following.

• Requirement 1 (R1): The QoS control mechanism must be able to manage several simultaneous services with a specific response time limitation for each service (not a single limitation for all the services as used in other research works). This is because SLAs usually specify an individual response time limitation for each service. In accordance with current usage, the response time limitations applied to services must be expressed in terms of the 90th percentile.

• Requirement 2 (R2): The QoS control mechanism must supply service differentiation in the service provided to the consumers. This means that the requests of preferential consumers must have a high degree of probability of being serviced, even when the load supported by the provider server is high. However, the requests of less important consumers may be rejected (especially when the load is high) to reserve the computational resources of the provider server for the preferential consumers. In this way, the QoS control mechanism supports the concept of consumer category, usually used in SLAs.

• Requirement 3 (R3): The QoS control mechanism must support the grouping of interactions (between the provider and the consumers) in sessions. The reason for this is that in most cases, the interactions between provider and consumer servers must be carried out in secure mode using SSL and other security technologies.

• Requirement 4 (R4): The QoS control mechanism integrated in the provider server should not require modifications in the system software of the server. In this way, the QoS mechanism is independent of the underlying computing platform. Some research works propose modifications to the system software to implement QoS. However, this is only possible in open source systems, in which the source code of the system is accessible. In many commercial systems, such modifications are not possible.

• Requirement 5 (R5): The QoS control mechanism integrated in the provider server should not require modifications in the application logic of the server. Thus, the QoS mechanism can be incorporated into the server easily, because it does not require any change in the software of the server.

• Requirement 6 (R6): The QoS control mechanism should be easy to configure. This facilitates its integration in production servers.

• Requirement 7 (R7): The QoS control mechanism must be scalable to operate in both a single server and a cluster server of any number of nodes. Nowadays, scalability is a fundamental issue in the design of provider servers, since these must be designed to adapt to possible increases in the demand of service requests. Because of this, the QoS control mechanism must be designed to scale with the server in which it operates.

• Requirement 8 (R8): The QoS control mechanism must provide protection against overloads. Under overload conditions, the SLAs may not be fulfilled, so overload situations must be properly managed. To handle such situations, an admission control procedure must be included in the QoS control mechanism.

All these requirements have guided the design of the QoS control mechanism presented in this paper.

    3 RELATED RESEARCH WORK

There are many research works which deal with QoS control mechanisms for provider servers. Most of the mechanisms try to fulfill one of two objectives: either the differentiation of the service provided to the consumers or the management of overload by controlling the admission of new service requests. There are also mechanisms that try to fulfill the two objectives simultaneously. Below, the main research works on QoS control mechanisms are analyzed.

    3.1 Service Differentiation and Prioritization

The initial works [1], [2], [3] first classified the consumers or their requests in classes, and then scheduled the requests of each class with different priorities. Pradhan et al. [4] developed a system to classify the Web requests according to their IP connection parameters, and then share the CPU between the classes to achieve the response time goals of each class. Sharma et al. [5] proposed a mechanism to manage the QoS by prioritization of the requests that arrive at the server.

Schroeder and Harchol-Balter [6] proposed the SRPT scheduling mechanism to improve the response times of Web servers when they support transitory overloads. The Web requests are classified automatically by the size of the requested file, giving the highest priority to requests for smaller files. McWherter et al. [7] developed prioritization mechanisms for the transactions performed in database environments to assure the QoS perceived by high priority consumers. Later, Schroeder et al. [8] continued this research by developing a transaction scheduler, which operates interposed between the consumers and the server to guarantee the fulfillment of the QoS requirements of the transactions.

Totok and Karamcheti [9] developed a mechanism which dynamically assigns higher priorities to the requests that potentially provide a higher reward, typically higher economic revenue. Wei et al. [10] developed a mechanism that uses the slowdown metric, which is defined as the relation between the waiting time suffered by the requests and their service time.

There is also a set of mechanisms [11], [12], [13], [14] based on classic control theory whose objective is to maintain the processing time of the requests around a reference value independent of the load supported by the server. These mechanisms combine feedback control loops with feed-forward control actions based on the predictions generated by queuing models. Lu et al. [15] demonstrated that these mechanisms can also be used to provide a relative differentiation of the processing time of the requests of different consumers. Wei et al. [16] developed similar mechanisms using a fuzzy control approach. Wei and Xu [17] developed a double self-tuning controller to adjust the processing rate of the requests of premium customers, so that these requests experience a processing delay equal to the specified SLA. Wang et al. [18] developed the AutoParam controller to adjust the processing capacity of three virtual machines installed in the same physical server in order to maintain the mean response time of transactions around a reference value, called the Service Level Objective (SLO).

When the server is a cluster organized in layers, the utilization of these techniques is very difficult, because they require a complex queuing model of the underlying cluster that must be developed and adjusted for each particular system. Some researchers try to avoid the problem of building complex models of multitier servers by using identification techniques. Liu et al. [19] developed a controller to maintain predefined ratios between QoS metrics of several request classes. It uses an online recursive least-squares estimator to obtain the parameters of a linear MIMO model. This controller maintains the ratios under overload conditions, and therefore provides service differentiation, but it is not able to guarantee the absolute SLA of preferential requests.

    3.2 Admission Control

One of the initial works on admission control was developed by Cherkasova and Phaal [20]. They designed an admission controller for Web servers that considered the consumer sessions. Chen et al. [21] implemented a mechanism similar to that proposed by Cherkasova and Phaal, but they did not consider the concept of session. Elnikety et al. [22] proposed the insertion of an admission controller between the Web server and the database server of an e-commerce server. The objective is to maximize the throughput, which is not directly linked with the QoS perceived by the consumers.

Kamra et al. [23] proposed an admission controller that uses a PI controller and tested its behavior with the TPC-W benchmark, but using a single SLA for all the request classes of the benchmark. Welsh and Culler [24] proposed a mechanism to manage overloads in servers organized in multiple logical and physical layers. It requires the assignment of a maximum percentile of the response time to each layer, which is difficult to specify.

Lee et al. [25] proposed a mechanism whose objective is to maintain a predefined ratio between the waiting times obtained for the requests corresponding to different consumer categories when the load conditions of a Web server vary. Other researchers [26], [27], [28] approach the QoS control problem as an optimization problem of the server operation.



3.3 Combination of Service Differentiation and Admission Control

Bhatti and Friedrich [29] developed the WebQoS mechanism to control the QoS in Web servers. It classifies the requests into basic and premium classes. Basic requests can be rejected to maintain the QoS of premium requests. Bhoj et al. [30] developed the Web2K mechanism, which also provides differentiated QoS to two types of requests. Li and Jamin [31] designed an algorithm to distribute the network bandwidth and the processing capacity of a Web server among several classes of consumers. The requests of a consumer class that do not have resources available are rejected.

Urgaonkar and Shenoy [32] proposed an admission control mechanism specifically designed to remain operational under extreme overloads. It increases its efficiency by sorting the requests into classes and admitting or rejecting sets of requests instead of individual requests. Later, Urgaonkar [33] introduced the concept of session into the mechanism. Yue and Wang [34] developed a session-based admission control system for e-commerce servers, classifying the clients into two classes: premium if they have purchased products previously and basic if not. Under overload conditions, the system first rejects the basic clients.

    3.4 Analysis of the Mechanisms

Table 1 has been compiled to compare the QoS mechanisms. Each column represents one QoS mechanism, labeled by its reference number. Each row describes the level of fulfillment of one of the requirements stated in Section 2. In the table, there are three groups: the mechanisms focused on service differentiation are on the left (those mainly based on control theory are shaded), those focused on admission control are in the centre, and the mechanisms focused on both are on the right. A brief summary of the results of Table 1 is given below.

Only a few mechanisms consider different classes of requests (R1). There are also a few mechanisms that consider several categories of customers (R2) and apply service differentiation. Only two of the analyzed mechanisms consider both requirements simultaneously (R1 and R2), although this is a great advantage.

The concept of consumer session (R3) is only considered by a few mechanisms, although it is essential for Web applications (C2B environments) and is very useful for secure Web services (B2B environments).

Most of the mechanisms analyzed are not independent of the computing platform on which they operate (R4). Some of them require modifications of the operating system, and many of them also require modifications to the Web server software. On the other hand, few of the mechanisms require modifications to the Web applications running on the server (R5).

The configuration of many QoS control mechanisms is not easy (R6), because the manager must know their internal operation and the configuration parameters are not related to variables with physical meaning. Most QoS control mechanisms have been designed for a Web server operating in a single computer, and only a few mechanisms have been specifically designed with the capability of operating in clusters (R7).

Finally, only the mechanisms that implement an admission control strategy are capable of managing sustained overloads (R8).

As shown, most of the QoS control mechanisms analyzed above fail to fulfill many of the requirements stated in Section 2. The QoS control mechanism presented in this paper provides specific solutions to the difficulties posed by the previous QoS control mechanisms. Furthermore, it can be applied to any kind of complex server, that is, not only to Web servers, but to application servers as well.

    4 DESIGN OF THE QOS CONTROL MECHANISM

To control the quality of service provided by a server, sensors and actuators must be integrated in the server (see Fig. 2). The sensors measure the controlled variables, such as the response time of requests and the utilization of the resources of the server. The actuators operate on the control variables to modify the operation of the server, for example, by changing the priority or the number of the server threads which provide service to the requests.

The standard operation of a controller is based on comparing the information received from sensors with the objectives established by the administrator of the server. Then, the controller executes a control algorithm which uses the results of the comparisons to calculate the values of the control variables that will be sent to the actuators.


TABLE 1. Related Work.


In the following sections, we describe how the standard QoS control mechanism has been adapted to our particular environment, and how our proposed design for the QoS control mechanism fulfils the requirements stated in Section 2.

    4.1 Architectural Design

Our proposed architectural design for the QoS control mechanism is based on two components: a monitor and a controller. The monitor is a software component which encapsulates the sensors and the actuators of the QoS control mechanism. The sensors are developed using measurement functions provided by the run-time environment of the server, and the actuators are implemented by means of an admission procedure. The controller is also a software component, which contains the QoS control algorithms.

When the QoS control mechanism is deployed in a cluster server, there must be one monitor for each node of the cluster, but only one controller is required for the whole cluster. The controller is usually hosted in one of the nodes of the cluster, although it could be located in any other computer connected to the cluster. Fig. 3 shows the deployment of the QoS control mechanism in a load balancing cluster, such as the one depicted in Fig. 1 (Requirement 7). In Fig. 3, the components of the QoS control mechanism (the monitors and the controller) are represented in gray, and the flows of information between these components and the other elements of the cluster are depicted using dotted arrows.

As can be seen in Fig. 3, the monitors are located between the application logic of the server and the consumers (through the load balancer device). In this way, the monitors can intercept the consumers' requests for service, as well as the responses supplied by the application logic of the server. For each request and each response, the monitor takes measurements and sends them to the controller, which uses these measurements to calculate the response times of the provided services.

With the proposed architectural design, the QoS control mechanism operates as a front end subsystem of the server, which is independent of the system software of the server (Requirement 4) and its application logic (Requirement 5).

To take into account the grouping of requests in business sessions (Requirement 3), whenever a consumer demands the opening of a new session, the monitor sends a new session request to the controller, which determines if the session should be admitted or rejected. Then the controller sends a message with this information to the monitor, which admits or rejects the opening of the session as required.


    Fig. 2. Elements of a standard QoS control mechanism for a server.

    Fig. 3. Deployment of the QoS control mechanism in a cluster server.


    4.2 Controller Design Strategy

A standard controller operates periodically, generating the necessary values of one or more control variables to control the evolution of one or more controlled variables. In the case of our controller, the controlled variables are the 90th percentiles of the response times of all the Web services provided by the server. The objective of the controller is to maintain these percentiles under the predefined values established in the SLA. To achieve this objective, the control variable used by the controller is the available processing capacity of the server, measured in business sessions and differentiating between the various consumer categories. This variable controls the admission procedure located in the monitors of the QoS control mechanism. By managing the admission of new sessions, the number of concurrent sessions processed by the server can be limited, so that the maximum percentiles established in the SLA are not surpassed. If the SLA includes minimum throughput requirements, these could only be satisfied for the highest priority business sessions during server overload periods.

The 90th percentiles of the response time are calculated using the response times of the Web services completed within a temporal window, which is displaced in time using a tracking period, as shown in Fig. 4. In our experimental environment, after carrying out a series of tests, we have chosen the values of 100 seconds for the temporal window and 10 seconds for the tracking period. These values make it possible to detect load changes quickly while maintaining an acceptable variance of the 90th percentiles.
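As an illustration of this windowing scheme, the sketch below (in Python; the real mechanism was implemented on a Java application server, and the class layout, method names, and use of NumPy are our own assumptions) keeps the completed response times per service and reports the 90th percentile over the last 100 seconds:

```python
import time
from collections import deque

import numpy as np

WINDOW_SECONDS = 100.0   # temporal window reported in the paper
TRACKING_PERIOD = 10.0   # how often the window is re-evaluated


class PercentileCalculator:
    """Keeps (completion_time, response_time) samples per service and reports
    the 90th percentile over the last WINDOW_SECONDS seconds."""

    def __init__(self):
        self.samples = {}  # service name -> deque of (completion_time, response_time)

    def record(self, service, response_time, now=None):
        now = time.time() if now is None else now
        self.samples.setdefault(service, deque()).append((now, response_time))

    def percentiles(self, now=None):
        now = time.time() if now is None else now
        result = {}
        for service, window in self.samples.items():
            # Drop samples that have fallen out of the temporal window.
            while window and window[0][0] < now - WINDOW_SECONDS:
                window.popleft()
            if window:
                result[service] = float(np.percentile([rt for _, rt in window], 90))
        return result
```

A tracking loop would simply call percentiles() once every TRACKING_PERIOD seconds and forward the values to the controller.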

The controller must calculate a new value of the control variable (number of concurrent sessions) periodically. Following classic control theory, the common approach to calculating new values of the control variable would be based on the differences between the measured 90th percentiles and their maximum admissible values defined by the SLA. However, the common approach poses a serious problem, because the 90th percentile is a response time metric which evolves with a significant delay with relation to the load of the server, making it an unsuitable variable on which to base the control, since it would lead to a delayed actuation with relation to that required. Furthermore, the percentile is a noisy variable, which is also an undesirable characteristic for a variable used as the basis of a control system.

In classic control theory, the usual procedure to control the output variables of a system that can only be measured with delay is to use an input-output model of the system. The model is employed to estimate the values of the outputs at a future time as a function of the current inputs. Then, the estimated values of the outputs are used by the controller instead of the real outputs, which are not known.

The construction of an input-output model of the server to estimate the 90th percentiles poses two main problems: 1) the delay of the 90th percentile is different for each service and varies with the load of the server, and 2) the dynamic behavior of each service is quite different from the others. Due to these problems, the construction of a dynamic model of the server that provides the evolution of the 90th percentiles over time is a complex task. Typical dynamic models are simulation programs and linear difference equations whose parameters must be reestimated periodically.

A more practical alternative to the dynamic model is the construction of a static input-output model, which supplies the expected steady-state values of the 90th percentile of each service for any number of business sessions processed by the server. In this way, the static model supplies the same solution that would be provided by a dynamic model when its output reaches the steady state. This approach, based on a static model, is very reasonable for controlling a system with a single input and multiple output variables, when the behavior of the output variables differs greatly.

To obtain the model of the server, our approach is based on characterizing the input-output behavior of the server using a measurement technique. To this end, a sequence of load experiments is carried out. In each experiment, the server processes a fixed number of concurrent sessions, and at the end of the experiment, the 90th percentile of the response time of each Web service provided by the server is calculated. After finishing all the experiments and calculations, a curve is generated for each service, which relates the 90th percentile of the response time of the service to the number of sessions being processed. Then, using a regression technique, a fourth-order polynomial is fitted to each curve. The set of calculated polynomials constitutes a single-input multiple-output (SIMO) model. This model provides the expected 90th percentiles of the response times of all the Web services provided by the server as a function of the concurrent sessions it processes.
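A minimal sketch of this model-building step is shown below. The service names, session counts, and response times are hypothetical values, not measurements from the paper, and NumPy polynomial fitting stands in for whatever regression tool was actually used:

```python
import numpy as np

# Hypothetical load-experiment results: for each Web service, the 90th percentile
# of the response time (seconds) measured while the server processed a fixed
# number of concurrent sessions.
sessions = np.array([10, 20, 40, 60, 80, 100])
p90_per_service = {
    "service_a": np.array([0.10, 0.12, 0.18, 0.30, 0.55, 1.10]),
    "service_b": np.array([0.20, 0.25, 0.40, 0.70, 1.30, 2.60]),
}

# One fourth-order polynomial per service; together they form the SIMO model
# (single input: concurrent sessions; multiple outputs: one 90th percentile per service).
simo_model = {
    service: np.polynomial.Polynomial.fit(sessions, p90, deg=4)
    for service, p90 in p90_per_service.items()
}
```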

Our approach is to use the SIMO model to determine the maximum number of concurrent sessions that the server can process, so that the maximum 90th percentiles established in the SLA are not surpassed. This has an essential advantage: the number of concurrent sessions is known by the controller instantaneously, while the 90th percentiles can only be known with delay. Therefore, the control based on the concurrent sessions will lead to a more precise and faster actuation. We refer to the maximum number of concurrent sessions as the maximum admissible sessions.

Fig. 4. Calculation procedure of the 90th percentiles of the response times.

Fig. 5 represents the curves of the SIMO model corresponding to the Web services of the server, as well as the calculation procedure of the maximum admissible sessions using these curves. Each curve relates the number of concurrent sessions being processed (S) to the 90th percentile of the response time of the corresponding Web service (Pi). In Fig. 5, the maximum 90th percentile established in the SLA for the Web service i is called Pmi. Introducing each Pmi in its corresponding curve, the number of concurrent sessions (Smi) corresponding to this percentile is obtained. To guarantee the fulfilment of the SLA of all the Web services provided by the server, the minimum value obtained from all the Smi must be chosen as the maximum admissible sessions. We refer to this value as Sm.

Fig. 5. Calculation procedure of the maximum admissible sessions.

Using the SIMO model, our mechanism is able to manage different classes of Web services, each one with its specific SLA (Requirement 1), simultaneously.
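Continuing the previous sketch, Sm can be obtained by checking candidate session counts against each fitted curve and its SLA limit. The scan-based inversion (instead of solving the quartic analytically) and the search bounds are our own simplifications:

```python
import numpy as np

def max_admissible_sessions(simo_model, sla_limits, s_min=1, s_max=500):
    """For each service i, find the largest session count S with Pi(S) <= Pmi,
    then return the minimum over all services (Sm)."""
    candidates = np.arange(s_min, s_max + 1)
    per_service_limits = []
    for service, poly in simo_model.items():
        p90 = poly(candidates)                              # predicted 90th percentiles
        feasible = candidates[p90 <= sla_limits[service]]   # session counts meeting the SLA
        per_service_limits.append(feasible.max() if feasible.size else s_min)
    return int(min(per_service_limits))

# Example, using the polynomials fitted above and hypothetical SLA limits (seconds):
# sm = max_admissible_sessions(simo_model, {"service_a": 0.5, "service_b": 1.0})
```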

    4.3 Controller Design and Operation

The controller has been designed to operate periodically. In each control period, the controller calculates a new value of the control variable (the available processing capacity). After carrying out several tests, we have chosen a value of 10 seconds for the control period.

Fig. 6 depicts the controller design, showing the main functional blocks of the controller and the flows of information between them. Some functional blocks of the controller are executed periodically, once each control period. These functional blocks are depicted in black in Fig. 6. Other functional blocks are executed asynchronously, each time they have to respond to a determined event. These functional blocks are depicted in gray.

The collector of measurements receives the measurements sent by the monitors. The measurements contain the measured response times of the Web services, and records of the beginnings and ends of business sessions. The collector of measurements stores the measured response times in Buffer A. The calculator of the 90th percentiles accesses this buffer periodically to calculate the percentiles corresponding to each control period. The beginnings and ends of sessions are forwarded to the session tracker, which is in charge of calculating the instantaneous concurrent sessions being processed by the server. This variable is stored in Buffer B and represents the actual load processed by the server.
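A possible sketch of such a session tracker is given below; the method names and the per-category bookkeeping used to feed Buffer C are assumptions, since the paper only describes the tracker's role:

```python
class SessionTracker:
    """Derives the instantaneous number of concurrent sessions from the
    session-begin and session-end records forwarded by the monitors."""

    def __init__(self):
        self.concurrent = 0          # instantaneous concurrent sessions (the Buffer B value)
        self.new_per_category = {}   # new sessions per category in the current control period

    def session_started(self, category):
        self.concurrent += 1
        self.new_per_category[category] = self.new_per_category.get(category, 0) + 1

    def session_ended(self):
        self.concurrent = max(0, self.concurrent - 1)

    def end_of_control_period(self):
        """Returns and resets the per-category arrival counts (the records that
        the text describes as being stored in Buffer C)."""
        counts, self.new_per_category = self.new_per_category, {}
        return counts
```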

The calculated 90th percentiles are not used in the control, for the reasons stated in the previous section, but they are supplied by the controller for information purposes. Using these percentiles, the administrator of the server can monitor the server behavior and verify that the 90th percentiles established in the SLA are fulfilled.

In each control period, the SIMO model of the server is used to obtain the number of maximum admissible sessions which can be accepted by the server. However, this is not currently necessary, because the model could be used offline to obtain the maximum admissible sessions and eliminated from the controller. The reason for incorporating the SIMO model of the server in the controller is to obtain a controller design prepared for a future evolution of the controller, in which the current fixed model (that is, a model which does not change in time) will be substituted by an adaptive model, which self-adjusts to slight variations in the server behavior. In this future scenario, the model will have to be used online.

At the beginning of each control period, the controller compares the instantaneous concurrent sessions with the maximum admissible sessions, and obtains the current admissible sessions, which are the sessions that the server can admit during the current control period, so that the maximum 90th percentiles are not surpassed. The value of the current admissible sessions is supplied to the capacity distributor, which calculates the differentiated available processing capacity of the server. This value indicates the number of sessions of each consumer category that can be admitted in the server during the current control period.

Finally, the admission/rejection calculator is the element in charge of admitting or rejecting the new session requests sent from the monitors. This calculator uses the differentiated available processing capacity to determine how many sessions of each consumer category can be admitted during the current control period. The available processing capacity is recalculated at the beginning of each control period, and during the rest of the period the calculator operates asynchronously, admitting or rejecting the new session requests as they are received.

The detailed operation of the capacity distributor and the admission/rejection calculator, as well as the algorithms they use to manage the capacity of the server, are described in the following section.

    4.4 Management of the Server Capacity

At the beginning of each control period, the available capacity of the server is calculated by subtracting the instantaneous concurrent sessions from the maximum admissible sessions. The difference between these two values becomes the total entrance tickets to the server for the current control period. We call this value T0. The value of T0 is sent to the capacity distributor (see Fig. 6), which is in charge of distributing the available capacity of the server (T0) between the different categories of consumers trying to open new business sessions in the server. To illustrate the operation of our controller, three consumer categories are considered: gold, silver, and bronze. Nevertheless, our controller can operate with any number of categories.

The capacity distributor also uses the records stored in Buffer C by the session tracker. This buffer holds the new session requests, corresponding to each consumer category, which have arrived in a determined number of past control periods. By applying a moving average technique to the records stored in Buffer C, the capacity distributor estimates the new session requests corresponding to each consumer category that are likely to arrive at the server during the current control period. The estimated new session requests for gold, silver, and bronze consumers are denoted by RG, RS, and RB, respectively.
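The estimation step could look roughly as follows; the history depth and the rounding policy are assumptions, as the paper does not specify them:

```python
import math
from collections import deque


class ArrivalEstimator:
    """Estimates the new session requests per consumer category expected in the
    next control period as a moving average over the last few control periods."""

    def __init__(self, categories=("gold", "silver", "bronze"), periods=6):
        # periods = 6 is an assumed history depth, not a value from the paper.
        self.history = {c: deque(maxlen=periods) for c in categories}

    def add_period(self, counts):
        # counts: new session requests observed per category in the period just ended
        # (i.e., the records drained from Buffer C).
        for category, window in self.history.items():
            window.append(counts.get(category, 0))

    def estimate(self):
        # Returns RG, RS, RB; rounded up so a non-zero average never collapses to zero.
        return {c: math.ceil(sum(w) / len(w)) if w else 0
                for c, w in self.history.items()}
```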

Once RG, RS, and RB have been calculated, the capacity distributor executes the algorithm which generates the capacity distribution. This algorithm uses T0, RG, RS, and RB as inputs and generates TG0, TS0, TB0, and TR0 as outputs. TG0, TS0, and TB0 are the calculated entrance tickets for gold, silver, and bronze consumers, respectively, and TR0 denotes the remaining tickets, that is, the tickets that are left over after the available tickets (T0) have been distributed among the three consumer categories.

Fig. 6. Controller design.

Fig. 7. Capacity distribution algorithm.

Fig. 7 shows the algorithm which carries out the capacity distribution. The algorithm uses an auxiliary variable Taux, which is initialized with the available tickets (T0) and then used to achieve the ticket distribution. After a set of initializations, the algorithm begins reserving a number of tickets for gold consumers equal to the estimated new session requests of gold consumers for the current control period, RG. However, if the estimation RG is greater than the available tickets T0, all the tickets are reserved for gold consumers. Next, the number of available tickets (stored in Taux) is decremented by the number of tickets already reserved for gold consumers. The reservation procedure continues with silver consumers and then with bronze consumers. If there are tickets left at the end of the reservation procedure, they are assigned to the remaining tickets (TR0). These tickets can be used by consumers of any category during the current control period.
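A sketch of this priority-ordered reservation, using the variable names of the text, is shown below (the function signature is our own; the paper presents the algorithm as a flow chart in Fig. 7):

```python
def distribute_capacity(t0, rg, rs, rb):
    """Reserve tickets for gold, then silver, then bronze consumers, up to the
    estimated arrivals; whatever is left becomes the remaining tickets (TR0)."""
    t_aux = t0
    tg0 = min(rg, t_aux)   # reserve up to the estimated gold arrivals
    t_aux -= tg0
    ts0 = min(rs, t_aux)   # then silver
    t_aux -= ts0
    tb0 = min(rb, t_aux)   # then bronze
    t_aux -= tb0
    tr0 = t_aux            # remaining tickets, usable by any category
    return tg0, ts0, tb0, tr0


# Example: with T0 = 10 and estimates RG = 4, RS = 5, RB = 6, the distribution
# is (4, 5, 1, 0): bronze consumers only receive what is left over.
```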

The functional block that determines whether a new session request should be admitted or rejected is the admission/rejection calculator (see Fig. 6). This calculator uses four variables to maintain the number of available tickets of each consumer category, as well as the number of remaining tickets, at every instant during the control period. These variables are called TG, TS, TB, and TR. At the beginning of the control period, these variables are initialized with the distribution of entrance tickets calculated by the capacity distributor (TG0, TS0, TB0, and TR0). Thus, TG is initialized with TG0, TS with TS0, and so on.

Later, during the rest of the control period, variables TG, TS, TB, and TR are used by the calculator to determine whether to admit or reject each new session request. When a new session request arrives at the controller, the calculator first establishes the category of the consumer making the request. Then, it checks the appropriate variable to determine if there is an available ticket for that consumer category. If there is a ticket, the session is admitted and the corresponding variable is decremented by one. If there are no available tickets, the calculator checks variable TR to determine if a remaining ticket is available. If so, the session is admitted and TR is decremented. If not, the session is rejected.
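A sketch of this admit/reject decision follows; the class structure is an assumption, but the ticket-checking order mirrors the description above:

```python
class AdmissionCalculator:
    """Admits or rejects new session requests against the per-category ticket
    counters (TG, TS, TB) and the shared remaining tickets (TR)."""

    def __init__(self, tg0, ts0, tb0, tr0):
        # Initialized once per control period with the capacity distributor's output.
        self.tickets = {"gold": tg0, "silver": ts0, "bronze": tb0}
        self.remaining = tr0

    def admit(self, category):
        if self.tickets.get(category, 0) > 0:
            self.tickets[category] -= 1    # consume a ticket of the consumer's category
            return True
        if self.remaining > 0:             # otherwise fall back on the remaining tickets
            self.remaining -= 1
            return True
        return False                       # no tickets left: reject the session
```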

The admission/rejection calculator also manages the end of each session. At the end of a session, the capacity of the server increases by one session, and therefore a new entrance ticket is generated. We refer to this ticket as the returned ticket. Then, the calculator must assign the returned ticket to the available tickets of any of the consumer categories (TG, TS, or TB) or to the remaining tickets (TR). To achieve this assignment, the calculator executes the algorithm for the returned ticket assignment shown in Fig. 8. The strategy followed by this algorithm to assign each returned ticket to a category is based on the completion of the ticket reservation carried out by the capacity distributor at the beginning of the control period. The capacity distributor generates the ticket reservation (TG0, TS0, and TB0), allocating the available tickets (T0) to the different consumer categories according to the estimated new session requests for the three categories in the control period (RG, RS, and RB). If the number of available tickets (T0) is less than the total estimated requests in the control period (RG + RS + RB), the ticket reservation is incomplete. The objective of the algorithm is to complete this reservation, assigning each returned ticket first to gold, next to silver, and finally to bronze consumers. To assign the returned ticket, the algorithm uses three variables, called T0G, T0S, and T0B, which, as in the case of TG, TS, and TB, are also initialized with the values TG0, TS0, and TB0 at the beginning of the control period.

Fig. 8. Algorithm for the returned ticket assignment.

The algorithm for the returned ticket assignment starts by applying two security conditions to assure that gold consumers and then silver consumers always have a minimum number of tickets. The minimum number of tickets for gold consumers is held in variable TGmin, and for silver consumers, in variable TSmin. Both variables have been initialized with a value of one in the algorithm shown in Fig. 8. However, these variables could also be initialized with a greater number to minimize the rejection of sessions of these consumer categories. The next three conditional statements applied by the algorithm are used to complete the ticket reservation. The completion process in a category finishes when the total assigned tickets to the category (T0G, T0S, or T0B) reaches the number of estimated new sessions of the category for the control period (RG, RS, or RB). Finally, if the returned ticket has not been used to satisfy any of the previous conditions of the algorithm, it is assigned to the remaining tickets (TR).
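The following sketch captures this returned-ticket policy; the function signature is our own, and whether the security conditions also update the per-category completion counters is not specified in the text, so that detail is an assumption:

```python
def assign_returned_ticket(calc, assigned, estimated, tg_min=1, ts_min=1):
    """calc is the AdmissionCalculator sketched above; assigned tracks the total
    tickets handed to each category so far in this control period (the T0G, T0S,
    and T0B counters); estimated holds RG, RS, and RB."""
    tickets = calc.tickets
    # Security conditions: keep at least tg_min / ts_min tickets available
    # for gold and then silver consumers at all times.
    if tickets["gold"] < tg_min:
        tickets["gold"] += 1
        assigned["gold"] += 1
        return
    if tickets["silver"] < ts_min:
        tickets["silver"] += 1
        assigned["silver"] += 1
        return
    # Completion of the initial reservation, again in priority order.
    for category in ("gold", "silver", "bronze"):
        if assigned[category] < estimated[category]:
            tickets[category] += 1
            assigned[category] += 1
            return
    # Otherwise the ticket goes to the shared pool of remaining tickets.
    calc.remaining += 1
```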

    5 EXPERIMENTAL ENVIRONMENT

The proposed QoS control mechanism has been tested on a representative B2B environment: the TPC-App benchmark deployed on a cluster server. This benchmark emulates a transactional application server which provides services to other servers in a B2B environment. Fig. 9 shows a schematic representation of the interactions implemented in the TPC-App benchmark. As can be seen in the figure, the benchmark supplies seven Web services to the service consumers and uses four other external Web services, which emulate services provided by external enterprises. All the services supplied by the TPC-App use a database to store and retrieve data. Additional information about the design and implementation of the TPC-App can be found in [35].

Fig. 10 shows the experimental environment. This is made up of the TPC-App benchmark, the emulator of service consumers, and the emulators of external services, as well as their corresponding computing platforms. As can be seen in the figure, the benchmark is deployed in a cluster of two layers: the application layer and the data layer.

The application layer is arranged in a load balancing cluster of three nodes, and the data layer, in a single server. Both the load balancing cluster and the single server make up the complete cluster executing the TPC-App. All the computers of the complete cluster are identical biprocessors based on Pentium Xeon 1 GHz running the Windows Server 2003 operating system. In addition to the complete cluster, two additional single servers are used to host the emulator of service consumers and the emulators of external services.

The application logic of the benchmark has been implemented using the support of the WebLogic run-time environment of BEA Systems, which provides a set of clustering facilities. This environment uses a load balancer to distribute the Web service requests between the nodes of the cluster using round-robin scheduling. The load balancer is located in a different node from the nodes executing the TPC-App application logic. In this way, the two nodes executing this logic support the same average load. Fig. 10 also shows (shaded in gray) the integration of the components of the QoS control mechanism in the load balancing cluster. One monitor is located in each node executing the TPC-App application logic, and the controller is placed in the load balancing node.

The QoS control mechanism requires the service consumer emulator to include two pieces of information in each service request: the consumer category making the request (gold, silver, or bronze) and the type of request, which indicates if it starts, finishes, or is within a business session. These pieces of information are inserted in the header of the SOAP protocol used to encapsulate the request. Thus, the software of the application logic of the benchmark does not require any modification, and therefore the QoS control mechanism behaves as a transparent element with relation to the application executed by the cluster.
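For illustration, a consumer request could be wrapped as sketched below; the header element names and namespace are invented for the example, since the paper only states that the category and the request type travel in the SOAP header:

```python
# Assumed, illustrative element names; only the SOAP 1.1 envelope namespace is standard.
SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"

def wrap_request(body_xml, consumer_category, request_type):
    """Wraps a Web service request body in a SOAP envelope whose header carries
    the consumer category (gold/silver/bronze) and the request type
    (start, end, or within a business session)."""
    return f"""<soap:Envelope xmlns:soap="{SOAP_ENV}">
  <soap:Header>
    <qos:ConsumerCategory xmlns:qos="urn:example:qos">{consumer_category}</qos:ConsumerCategory>
    <qos:RequestType xmlns:qos="urn:example:qos">{request_type}</qos:RequestType>
  </soap:Header>
  <soap:Body>{body_xml}</soap:Body>
</soap:Envelope>"""
```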


    Fig. 10. Experimental environment.

    Fig. 9. Interactions in the TPC-App benchmark.


Fig. 13 shows that during the first overload period (interval [3,000-4,000]), very few new session requests corresponding to gold consumers are rejected. The number of rejections increases for silver consumers, and finally, in the case of bronze consumers, most of the requests are rejected. During the second overload period (interval [5,000-6,000]), more intense in load, the number of rejections of new session requests corresponding to gold consumers increases slightly. This effect is much more notable for silver consumers, and extreme in the case of bronze consumers, for which all the new session requests are rejected.

This experiment demonstrates that the QoS control mechanism carries out an effective differentiation of the service provided to consumers, reserving the processing capacity of the cluster for the preferential consumers during the overload periods (Requirement 2). Moreover, as expected, the QoS control mechanism does not produce overreservation of processing capacity for the preferential consumers when the cluster operates under normal load conditions.

    7 CONCLUSIONS AND FUTURE WORK

The first relevant contribution of this work is the definition of the requirements that any QoS control mechanism for a provider server working in a B2B environment should address. Most of the currently available mechanisms fail to address many of these requirements.

The second important contribution is the development of a new QoS control mechanism, specifically designed to fulfil the established requirements. This mechanism considers classes of requests and categories of consumers; it groups the requests carried out by consumers in secure sessions; it also guarantees the SLAs during overloads, giving priority to the service of preferential consumers; it is independent of the system software of the server, as well as of its application logic; and the adjustment of its configuration parameters is very simple.

This research work not only presents a new QoS control mechanism, but also establishes the basic design principles for these mechanisms. These principles can be used as a reasonable starting point in the design of QoS control mechanisms in the near future.

The next step in this research will be the substitution of the fixed SIMO model used in the calculation of the maximum admissible sessions by an adaptive model. The input-output behavior of a server can change slightly during its operation, so an adaptive model can reproduce the server behavior more precisely than a fixed model can.

REFERENCES

[1] L. Eggert and J. Heidemann, "Application-Level Differentiated Services for Web Servers," World-Wide Web J., vol. 2, no. 3, pp. 133-142, Aug. 1999.

[2] J. Almeida, M. Dabu, A. Manikutty, and P. Cao, "Providing Differentiated Levels of Service in Web Content Hosting," Proc. First Workshop Internet Server Performance (WISP 98), June 1998.

[3] T.F. Abdelzaher and K.G. Shin, "QoS Provisioning with qContracts in Web and Multimedia Servers," Proc. 20th Real-Time Systems Symp. (RTSS 99), pp. 44-53, Dec. 1999.

[4] P. Pradhan, R. Tewari, S. Sahu, C. Chandra, and P. Shenoy, "An Observation-Based Approach Toward Self-Managing Web Servers," Proc. 10th Int'l Workshop Quality of Service (IWQoS 02), May 2002.

[5] A. Sharma, H. Adankar, and S. Sengupta, "Managing QoS Through Prioritization in Web Services," Proc. Fourth Web Information Systems Eng. Workshop (WISEW 03), pp. 140-148, Dec. 2003.

[6] B. Schroeder and M. Harchol-Balter, "Web Servers Under Overload: How Scheduling Can Help," Technical Report CMU-CS-02-143, Computer Science Dept., Carnegie-Mellon Univ., 2002.

[7] Y. McWherter, B. Schroeder, A. Ailamaki, and M. Harchol-Balter, "Priority Mechanisms for OLTP and Transactional Web Applications," Proc. Int'l Conf. Data Eng. (ICDE 04), pp. 535-546, 2004.

[8] B. Schroeder, M. Harchol-Balter, A. Iyengar, and E.M. Nahum, "Achieving Class-Based QoS for Transactional Workloads," Proc. 22nd Int'l Conf. Data Eng. (ICDE 06), p. 153, Apr. 2006.

[9] A. Totok and V. Karamcheti, "Improving Performance of Internet Services through Reward-Driven Request Prioritization," Proc. 14th Int'l Workshop Quality of Service (IWQoS 06), June 2006.

[10] J. Wei, X. Zhou, and C.-Z. Xu, "Robust Processing Rate Allocation for Proportional Slowdown Differentiation on Internet Servers," IEEE Trans. Computers, vol. 54, no. 8, pp. 964-977, Aug. 2005.

[11] L. Sha, X. Liu, Y. Lu, and T. Abdelzaher, "Queueing Model Based Network Server Performance Control," Proc. 23rd IEEE Real-Time Systems Symp. (RTSS 02), pp. 81-90, Dec. 2002.

[12] D. Henriksson, Y. Lu, and T. Abdelzaher, "Improved Prediction for Web Server Delay Control," Proc. 16th Euromicro Conf. Real-Time Systems (ECRTS 04), pp. 61-68, June 2004.

[13] X. Liu, R. Zhang, J. Heo, Q. Wang, and L. Sha, "Timing Performance Control in Web Server Systems Utilizing Server Internal State Information," Proc. Int'l Conf. Networking and Services (ICNS 05), Oct. 2005.

[14] J. Heo, X. Liu, L. Sha, and T. Abdelzaher, "Autonomous Delay Regulation for Multithreaded Internet Servers," Proc. Int'l Symp. Performance Evaluation of Computer and Telecomm. Systems (SPECTS 06), pp. 465-472, July 2006.

[15] Y. Lu, T. Abdelzaher, C. Lu, L. Sha, and X. Liu, "Feedback Control with Queuing-Theoretic Prediction for Relative Delay Guarantees in Web Servers," Proc. Ninth Real-Time Technology and Applications Symp. (RTAS 03), May 2003.

[16] Y. Wei, C. Liu, T. Voigt, and F. Ren, "Fuzzy Control for Guaranteeing Absolute Delays in Web Servers," Proc. Second Int'l Conf. Quality of Service in Heterogeneous Wired/Wireless Networks (QSHINE 05), Aug. 2005.

[17] J. Wei and C.-Z. Xu, "eQoS: Provisioning of Client-Perceived End-to-End QoS Guarantees in Web Servers," Proc. 13th Int'l Workshop Quality of Service (IWQoS 05), pp. 123-135, June 2005.

[18] Z. Wang, X. Liu, A. Zhang, C. Stewart, X. Zhu, T. Kelly, and S. Singhal, "AutoParam: Automated Control of Application-Level Performance in Virtualized Server Environments," Proc. Second IEEE Int'l Workshop Feedback Control Implementation and Design in Computing Systems and Networks (FeBID 07), pp. 2-7, 2007.

[19] X. Liu, X. Zhu, P. Padala, Z. Wang, and S. Singhal, "Optimal Multivariate Control for Differentiated Services on a Shared Hosting Platform," Proc. 46th IEEE Conf. Decision and Control (CDC 07), pp. 12-14, Dec. 2007.

[20] L. Cherkasova and P. Phaal, "Session-Based Admission Control: A Mechanism for Peak Load Management of Commercial Web Sites," IEEE Trans. Computers, vol. 51, no. 6, pp. 669-685, June 2002.

[21] X. Chen and P. Phaal, "An Admission Control Scheme for Predictable Server Response Time for Web Accesses," Proc. 10th World Wide Web Conf. (WWW 01), pp. 545-554, May 2001.

[22] S. Elnikety, E. Nahum, J. Tracey, and W. Zwaenepoel, "A Method for Transparent Admission Control and Request Scheduling in E-Commerce Web Sites," Proc. 13th World Wide Web Conf. (WWW 04), May 2004.

[23] A. Kamra, V. Misra, and E. Nahum, "A Self-Tuning Controller for Managing the Performance of 3-Tiered Web Sites," Proc. 12th Int'l Workshop Quality of Service (IWQoS 04), June 2004.

[24] M. Welsh and D. Culler, "Adaptive Overload Control for Busy Internet Servers," Proc. Fourth USENIX Conf. Internet Technologies and Systems (USITS 03), Mar. 2003.

[25] S.C. Lee, J.C. Lui, and D.K. Yau, "A Proportional-Delay DiffServ-Enabled Web Server: Admission Control and Dynamic Adaptation," IEEE Trans. Parallel and Distributed Systems, vol. 15, no. 5, pp. 385-400, May 2004.

