

Optimal Multiserver Configuration for Profit Maximization in Cloud Computing

Junwei Cao, Senior Member, IEEE, Kai Hwang, Fellow, IEEE,

Keqin Li, Senior Member, IEEE, and Albert Y. Zomaya, Fellow, IEEE

Abstract—As cloud computing becomes more and more popular, understanding the economics of cloud computing becomes critically important. To maximize the profit, a service provider should understand both service charges and business costs, and how they are determined by the characteristics of the applications and the configuration of a multiserver system. The problem of optimal multiserver configuration for profit maximization in a cloud computing environment is studied. Our pricing model takes such factors into consideration as the amount of a service, the workload of an application environment, the configuration of a multiserver system, the service-level agreement, the satisfaction of a consumer, the quality of a service, the penalty of a low-quality service, the cost of renting, the cost of energy consumption, and a service provider's margin and profit. Our approach is to treat a multiserver system as an M/M/m queuing model, such that our optimization problem can be formulated and solved analytically. Two server speed and power consumption models are considered, namely, the idle-speed model and the constant-speed model. The probability density function of the waiting time of a newly arrived service request is derived. The expected service charge to a service request is calculated. The expected net business gain in one unit of time is obtained. Numerical calculations of the optimal server size and the optimal server speed are demonstrated.

Index Terms—Cloud computing, multiserver system, pricing model, profit, queuing model, response time, server configuration, service charge, service-level agreement, waiting time


1 INTRODUCTION

CLOUD computing is quickly becoming an effective and efficient way of consolidating computing resources and computing services [10]. By centralized management of resources and services, cloud computing delivers hosted services over the Internet, such that accesses to shared hardware, software, databases, information, and all resources are provided to consumers on-demand. Cloud computing is able to provide the most cost-effective and energy-efficient way of computing resource management and computing service provision. Cloud computing turns information technology into ordinary commodities and utilities by using the pay-per-use pricing model [3], [5], [18]. However, cloud computing will never be free [8], and understanding the economics of cloud computing becomes critically important.

One attractive cloud computing environment is a three-tier structure [15], which consists of infrastructure vendors, service providers, and consumers. The three parties are also called cluster nodes, cluster managers, and consumers in cluster computing systems [21], and resource providers, service providers, and clients in grid computing systems [19]. An infrastructure vendor maintains basic hardware and software facilities. A service provider rents resources from the infrastructure vendors, builds appropriate multiserver systems, and provides various services to users. A consumer submits a service request to a service provider, receives the desired result from the service provider with a certain service-level agreement, and pays for the service based on the amount of the service and the quality of the service. A service provider can build different multiserver systems for different application domains, such that service requests of different nature are sent to different multiserver systems. Each multiserver system contains multiple servers, and such a multiserver system can be devoted to serving one type of service requests and applications. An application domain is characterized by two basic features, i.e., the workload of an application environment and the expected amount of a service. The configuration of a multiserver system is characterized by two basic features, i.e., the size of the multiserver system (the number of servers) and the speed of the multiserver system (the execution speed of the servers).

Like all businesses, the pricing model of a service provider in cloud computing is based on two components, namely, the income and the cost. For a service provider, the income (i.e., the revenue) is the service charge to users, and the cost is the renting cost plus the utility cost paid to infrastructure vendors. A pricing model in cloud computing includes many considerations, such as the amount of a service (the requirement of a service), the workload of an application environment, the configuration (the size and the speed) of a multiserver system, the service-level agreement, the satisfaction of a consumer (the expected service time), the quality of a service (the task waiting time and the task response time),

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS, VOL. 24, NO. 6, JUNE 2013 1087

. J. Cao is with the Research Institute of Information Technology, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China. E-mail: [email protected].

. K. Hwang is with the Department of Electrical Engineering, University of Southern California, Los Angeles, CA 90089. E-mail: [email protected].

. K. Li is with the Department of Computer Science, State University of New York, New Paltz, New York 12561. E-mail: [email protected].

. A.Y. Zomaya is with the School of Information Technologies, University of Sydney, Sydney, NSW 2006, Australia. E-mail: [email protected].

Manuscript received 23 Feb. 2012; revised 18 June 2012; accepted 28 June 2012; published online 25 June 2012. Recommended for acceptance by V.B. Misic, R. Buyya, D. Milojicic, and Y. Cui. For information on obtaining reprints of this article, please send e-mail to: [email protected], and reference IEEECS Log Number TPDSSI-2012-02-0137. Digital Object Identifier no. 10.1109/TPDS.2012.203.

1045-9219/13/$31.00 © 2013 IEEE Published by the IEEE Computer Society


the penalty of a low-quality service, the cost of renting, the cost of energy consumption, and a service provider's margin and profit. The profit (i.e., the net business gain) is the income minus the cost. To maximize the profit, a service provider should understand both service charges and business costs, and in particular, how they are determined by the characteristics of the applications and the configuration of a multiserver system.

The service charge to a service request is determined by two factors, i.e., the expected length of the service and the actual length of the service. The expected length of a service (i.e., the expected service time) is the execution time of an application on a standard server with a baseline or reference speed. Once the baseline speed is set, the expected length of a service is determined by a service request itself, i.e., the service requirement (amount of service) measured by the number of instructions to be executed. The longer (shorter, respectively) the expected length of a service is, the more (less, respectively) the service charge is. The actual length of a service (i.e., the actual service time) is the actual execution time of an application. The actual length of a service depends on the size of a multiserver system, the speed of the servers (which may be faster or slower than the baseline speed), and the workload of the multiserver system. Notice that the actual service time is a random variable, which is determined by the task waiting time once a multiserver system is established.

There are many different service performance metrics in service-level agreements [2]. Our performance metric in this paper is the task response time (or the turnaround time), i.e., the time taken to complete a task, which includes task waiting time and task execution time. The service-level agreement is the promised time to complete a service, which is a constant times the expected length of a service. If the actual length of a service is within the service-level agreement (i.e., a service request is completed in time), the service will be fully charged. However, if the actual length of a service exceeds the service-level agreement, the service charge will be reduced. The longer (shorter, respectively) the actual length of a service is, the more (less, respectively) the reduction of the service charge is. In other words, there is a penalty for a service provider that breaks a service-level agreement. If the actual service time exceeds a certain limit (which is service request dependent), a service will be entirely free with no charge. Notice that the service charge of a service request is a random variable, and we are interested in its expectation.

The cost of a service provider includes two components, i.e., the renting cost and the utility cost. The renting cost is proportional to the size of a multiserver system, i.e., the number of servers. The utility cost is essentially the cost of energy consumption and is determined by both the size and the speed of a multiserver system. The faster (slower, respectively) the speed is, the more (less, respectively) the utility cost is. To calculate the cost of energy consumption, we need to establish certain server speed and power consumption models.

To increase the revenue of business, a service provider can construct and configure a multiserver system with many servers of high speed. Since the actual service time (i.e., the task response time) contains task waiting time and task execution time, more servers reduce the waiting time and faster servers reduce both waiting time and execution time. Hence, a powerful multiserver system reduces the penalty of breaking a service-level agreement and increases the revenue. However, more servers (i.e., a larger multiserver system) increase the cost of facility renting from the infrastructure vendors and the cost of base power consumption. Furthermore, faster servers increase the cost of energy consumption. Such increased cost may outweigh the gain from penalty reduction. Therefore, for an application environment with a specific workload, which includes the task arrival rate and the average task execution requirement, a service provider needs to decide an optimal multiserver configuration (i.e., the size and the speed of a multiserver system), such that the expected profit is maximized.

In this paper, we study the problem of optimal multiserver configuration for profit maximization in a cloud computing environment. Our approach is to treat a multiserver system as an M/M/m queuing model, such that our optimization problem can be formulated and solved analytically. We consider two server speed and power consumption models, namely, the idle-speed model and the constant-speed model. Our main contributions are as follows. We derive the probability density function (pdf) of the waiting time of a newly arrived service request. This result is significant in its own right and is the basis of our discussion. We calculate the expected service charge to a service request. Based on these results, we get the expected net business gain in one unit of time, and obtain the optimal server size and the optimal server speed numerically. To the best of our knowledge, there has been no similar investigation in the literature, although the method of optimal multicore server processor configuration has been employed for other purposes, such as managing the power and performance tradeoff [17].

One line of related research is user-centric, market-based, and utility-driven resource management and task scheduling, which have been considered for cluster computing systems [7], [20], [21] and grid computing systems [4], [12], [19]. To compete and bid for shared computing resources through the use of economic mechanisms such as auctions, a user can specify the value (utility, yield) of a task, i.e., the reward (price, profit) of completing the task. A utility function, which measures the value and importance of a task as well as a user's tolerance to delay and sensitivity to quality of service, supports market-based bidding, negotiation, and admission control. By taking an economic approach to providing service-oriented and utility computing, a service provider allocates resources and schedules tasks in such a way that the total profit earned is maximized. Instead of traditional system-centric performance optimization such as minimizing the average task response time, the main concern in such a computational economy is user-centric performance optimization, i.e., maximizing the total utility delivered to the users (i.e., the total user-perceived value).

2 A MULTISERVER MODEL

Throughout the paper, we use $P[e]$ to denote the probability of an event $e$. For a random variable $x$, we use $f_x(t)$ to represent the probability density function of $x$, $F_x(t)$ to represent the cumulative distribution function (cdf) of $x$, and $\bar{x}$ to represent the expectation of $x$.

A cloud computing service provider serves users' service requests by using a multiserver system, which is constructed and maintained by an infrastructure vendor and rented by the service provider. The architecture of the multiserver system can be quite flexible. Examples are blade servers and blade centers where each server is a server blade [16], clusters of traditional servers where each server is an ordinary processor [7], [20], [21], and multicore server processors where each server is a single core [17]. We will simply call these blades/processors/cores servers. Users (i.e., customers of a service provider) submit service requests (i.e., applications and tasks) to a service provider, and the service provider serves the requests (i.e., runs the applications and performs the tasks) on a multiserver system.

Assume that a multiserver system $S$ has $m$ identical servers. In this paper, a multiserver system is treated as an M/M/m queuing system, which is elaborated as follows. There is a Poisson stream of service requests with arrival rate $\lambda$, i.e., the interarrival times are independent and identically distributed (i.i.d.) exponential random variables with mean $1/\lambda$. A multiserver system $S$ maintains a queue with infinite capacity for waiting tasks when all the $m$ servers are busy. The first-come-first-served (FCFS) queuing discipline is adopted. The task execution requirements (measured by the number of instructions to be executed) are i.i.d. exponential random variables $r$ with mean $\bar{r}$. The $m$ servers (i.e., blades/processors/cores) of $S$ have identical execution speed $s$ (measured by the number of instructions that can be executed in one unit of time). Hence, the task execution times on the servers of $S$ are i.i.d. exponential random variables $x = r/s$ with mean $\bar{x} = \bar{r}/s$.

Notice that although an M/G/m queuing system has been considered (see, e.g., [13]), the M/M/m queuing model is the only model that accommodates an analytical and closed-form expression of the probability density function of the waiting time of a newly arrived service request.

Let $\mu = 1/\bar{x} = s/\bar{r}$ be the average service rate, i.e., the average number of service requests that can be finished by a server of $S$ in one unit of time. The server utilization is $\rho = \lambda/(m\mu) = \lambda\bar{x}/m = (\lambda/m)(\bar{r}/s)$, which is the average percentage of time that a server of $S$ is busy. Let $p_k$ denote the probability that there are $k$ service requests (waiting or being processed) in the M/M/m queuing system for $S$. Then, we have ([14, p. 102])

$$p_k = \begin{cases} p_0 \dfrac{(m\rho)^k}{k!}, & k \le m; \\[2mm] p_0 \dfrac{m^m \rho^k}{m!}, & k \ge m; \end{cases}$$

where

$$p_0 = \left( \sum_{k=0}^{m-1} \frac{(m\rho)^k}{k!} + \frac{(m\rho)^m}{m!} \cdot \frac{1}{1-\rho} \right)^{-1}.$$

The probability of queuing (i.e., the probability that a newly submitted service request must wait because all servers are busy) is

$$P_q = \sum_{k=m}^{\infty} p_k = \frac{p_m}{1-\rho} = \frac{p_0 (m\rho)^m}{m!} \cdot \frac{1}{1-\rho}.$$

The average number of service requests (in waiting or in execution) in $S$ is

$$N = \sum_{k=0}^{\infty} k p_k = m\rho + \frac{\rho}{1-\rho} P_q.$$

Applying Little's result, we get the average task response time as

$$T = \frac{N}{\lambda} = \bar{x}\left(1 + \frac{P_q}{m(1-\rho)}\right) = \bar{x}\left(1 + \frac{p_m}{m(1-\rho)^2}\right).$$

The average waiting time of a service request is

$$W = T - \bar{x} = \frac{p_m}{m(1-\rho)^2}\,\bar{x}.$$

The waiting time is the source of customer dissatisfaction. A service provider should keep the waiting time at a low level by providing enough servers and/or increasing server speed, and be willing to pay back to a customer in case the waiting time exceeds a certain limit.
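The closed-form expressions above are straightforward to evaluate. The following minimal Python sketch (ours, not from the paper) computes $p_0$, $P_q$, $N$, $T$, and $W$; the function name and any example parameters are illustrative assumptions.

```python
from math import factorial

def mmm_metrics(m, lam, r_bar, s):
    """Return (p0, Pq, N, T, W) for an M/M/m queue with arrival rate lam,
    mean task execution requirement r_bar, and per-server speed s."""
    mu = s / r_bar                      # average service rate of one server
    rho = lam / (m * mu)                # server utilization
    assert rho < 1, "the queue is unstable when rho >= 1"
    p0 = 1.0 / (sum((m * rho) ** k / factorial(k) for k in range(m))
                + (m * rho) ** m / factorial(m) / (1 - rho))
    pm = p0 * (m * rho) ** m / factorial(m)
    Pq = pm / (1 - rho)                 # probability that a new request must wait
    N = m * rho + rho / (1 - rho) * Pq  # average number of requests in S
    T = N / lam                         # Little's result
    W = T - 1 / mu                      # average waiting time
    return p0, Pq, N, T, W
```

For example, `mmm_metrics(7, 6.45, 1.0, 1.0)` evaluates a system of the kind used in the numerical illustrations later in the paper.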

3 POWER CONSUMPTION MODELS

Power dissipation and circuit delay in digital CMOS circuits can be accurately modeled by simple equations, even for complex microprocessor circuits. CMOS circuits have dynamic, static, and short-circuit power dissipation; however, the dominant component in a well-designed circuit is dynamic power consumption $P$ (i.e., the switching component of power), which is approximately $P = aCV^2f$, where $a$ is an activity factor, $C$ is the loading capacitance, $V$ is the supply voltage, and $f$ is the clock frequency [6]. In the ideal case, the supply voltage and the clock frequency are related in such a way that $V \propto f^{\phi}$ for some constant $\phi > 0$ [22]. The processor execution speed $s$ is usually linearly proportional to the clock frequency, namely, $s \propto f$. For ease of discussion, we will assume that $V = bf^{\phi}$ and $s = cf$, where $b$ and $c$ are some constants. Hence, we know that power consumption is $P = aCV^2f = ab^2Cf^{2\phi+1} = (ab^2C/c^{2\phi+1})s^{2\phi+1} = \xi s^{\alpha}$, where $\xi = ab^2C/c^{2\phi+1}$ and $\alpha = 2\phi+1$. For instance, by setting $b = 1.16$, $aC = 7.0$, $c = 1.0$, $\phi = 0.5$, $\alpha = 2\phi+1 = 2.0$, and $\xi = ab^2C/c^{\alpha} = 9.4192$, the value of $P$ calculated by the equation $P = aCV^2f = \xi s^{\alpha}$ is reasonably close to that in [11] for the Intel Pentium M processor.
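As a quick sanity check of this algebra (a sketch using only the parameter values quoted above), one can verify numerically that $\xi = ab^2C/c^{2\phi+1} = 9.4192$ and that $\xi s^{\alpha}$ agrees with $aCV^2f$:

```python
# Verifies xi = a*b^2*C / c^(2*phi+1) and P = xi*s^alpha == aC*V^2*f
# when V = b*f^phi and s = c*f; the clock frequency f is an arbitrary
# illustrative value, not a claim about any real processor.
aC, b, c, phi = 7.0, 1.16, 1.0, 0.5
alpha = 2 * phi + 1
xi = aC * b**2 / c**alpha          # = ab^2C / c^(2*phi+1) = 9.4192

f = 1.5                            # arbitrary clock frequency
V, s = b * f**phi, c * f
P_direct = aC * V**2 * f           # P = aCV^2 f
P_model = xi * s**alpha            # P = xi * s^alpha (same value)
```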

We will consider two types of server speed and power consumption models. In the idle-speed model, a server runs at zero speed when there is no task to perform. Since the power for speed $s$ is $\xi s^{\alpha}$, the average amount of energy consumed by a server in one unit of time is $\rho\xi s^{\alpha} = (\lambda/m)\bar{r}\xi s^{\alpha-1}$, where we notice that the speed of a server is zero when it is idle. The average amount of energy consumed by an $m$-server system $S$ in one unit of time, i.e., the power supply to the multiserver system $S$, is $P = m\rho\xi s^{\alpha} = \lambda\bar{r}\xi s^{\alpha-1}$, where $m\rho = \lambda\bar{x}$ is the average number of busy servers in $S$. Since a server still consumes some amount of power $P^*$ even when it is idle (assume that an idle server consumes certain base power $P^*$, which includes static power dissipation, short-circuit power dissipation, and other leakage and wasted power [1]), we will include $P^*$ in $P$, i.e., $P = m(\rho\xi s^{\alpha} + P^*) = \lambda\bar{r}\xi s^{\alpha-1} + mP^*$. Notice that when $P^* = 0$, the above $P$ is independent of $m$.



In the constant-speed model, all servers run at the speed $s$ even if there is no task to perform. Again, we use $P$ to represent the power allocated to multiserver system $S$. Since the power for speed $s$ is $\xi s^{\alpha}$, the power allocated to multiserver system $S$ is $P = m(\xi s^{\alpha} + P^*)$.
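The two models can be summarized in a short sketch (our own illustration; the helper names and any parameter values used with them are assumptions, not from the paper):

```python
# rho = lam*r_bar/(m*s) must be < 1 for the queue to be stable.
def power_idle(m, lam, r_bar, s, xi, alpha, P_star):
    """Idle-speed model: P = m*(rho*xi*s^alpha + P*) = lam*r_bar*xi*s^(alpha-1) + m*P*."""
    rho = lam * r_bar / (m * s)
    return m * (rho * xi * s**alpha + P_star)

def power_const(m, lam, r_bar, s, xi, alpha, P_star):
    """Constant-speed model: P = m*(xi*s^alpha + P*)."""
    return m * (xi * s**alpha + P_star)
```

For the same parameters, the idle-speed model never consumes more power than the constant-speed model, since only the busy fraction $\rho < 1$ of each server draws dynamic power.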

4 WAITING TIME DISTRIBUTION

Let $W$ denote the waiting time of a new service request that arrives to a multiserver system. In this section, we find the pdf $f_W(t)$ of $W$. To this end, we consider $W$ in different situations, depending on the number of tasks in the queuing system when a new service request arrives. Let $W_k$ denote the waiting time of a new task that arrives to an M/M/m queuing system under the condition that there are $k$ tasks in the queuing system when the task arrives.

We define a unit impulse function $u_z(t)$ as follows:

$$u_z(t) = \begin{cases} z, & 0 \le t \le \dfrac{1}{z}; \\[2mm] 0, & t > \dfrac{1}{z}. \end{cases}$$

The function $u_z(t)$ has the following property:

$$\int_0^{\infty} u_z(t)\,dt = 1;$$

namely, $u_z(t)$ can be treated as a pdf of a random variable with expectation

$$\int_0^{\infty} t\,u_z(t)\,dt = z\int_0^{1/z} t\,dt = \frac{1}{2z}.$$

Let $z \to \infty$ and define $u(t) = \lim_{z\to\infty} u_z(t)$. It is clear that any random variable whose pdf is $u(t)$ has expectation 0.

The following theorem gives the pdf of the waiting time of a newly arrived service request:

Theorem 1. The pdf of the waiting time $W$ of a newly arrived service request is

$$f_W(t) = (1 - P_q)u(t) + m\mu p_m e^{-(1-\rho)m\mu t},$$

where $P_q = p_m/(1-\rho)$ and $p_m = p_0(m\rho)^m/m!$.

Sketch of the Proof. Let $W_k$ be the waiting time of a new service request if there are $k$ tasks in the queuing system when the service request arrives. We find the pdf of $W_k$ for all $k \ge 0$. Then, we have

$$f_W(t) = \sum_{k=0}^{\infty} p_k f_{W_k}(t).$$

Actually, $W_k$ can be found for two cases, i.e., when $k < m$ and when $k \ge m$. A complete proof of the theorem is given in Section 9. □

Notice that a multiserver system with multiple identical servers has been configured to serve requests from a certain application domain. Therefore, we will only focus on task waiting time in a waiting queue and do not consider other sources of delay, such as resource allocation and provision, virtual machine instantiation and deployment, and other overhead in a complex cloud computing environment.
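Theorem 1 implies that the mean waiting time is $P_q/((1-\rho)m\mu)$, which coincides with the expression for $W$ in Section 2. The following Monte Carlo sketch (ours, with illustrative parameters, not the authors' code) simulates an FCFS M/M/m queue and lets one compare the empirical mean waiting time against this value:

```python
import heapq
import random
from math import factorial

def simulate_mean_wait(m, lam, mu, n, seed=1):
    """Empirical mean waiting time over n arrivals to an FCFS M/M/m queue."""
    rng = random.Random(seed)
    free = [0.0] * m                         # next-free times of the m servers
    heapq.heapify(free)
    t, total_wait = 0.0, 0.0
    for _ in range(n):
        t += rng.expovariate(lam)            # Poisson arrivals
        start = max(t, free[0])              # wait for the earliest-free server
        total_wait += start - t
        heapq.heapreplace(free, start + rng.expovariate(mu))
    return total_wait / n

def theoretical_mean_wait(m, lam, mu):
    """Mean waiting time implied by Theorem 1: Pq / ((1-rho)*m*mu)."""
    rho = lam / (m * mu)
    p0 = 1.0 / (sum((m * rho) ** k / factorial(k) for k in range(m))
                + (m * rho) ** m / factorial(m) / (1 - rho))
    Pq = p0 * (m * rho) ** m / factorial(m) / (1 - rho)
    return Pq / ((1 - rho) * m * mu)         # = pm/(m*(1-rho)^2) * (1/mu)
```

With a fixed seed and a large number of arrivals, the empirical and theoretical means agree to within sampling error.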

5 SERVICE CHARGE

If all the servers have a fixed speed $s$, the execution time of a service request with execution requirement $r$ is known as $x = r/s$. The response time to the service request is $T = W + x = W + r/s$. The response time $T$ is related to the service charge to a customer of a service provider in cloud computing.

To study the expected service charge to a customer, we need a complete specification of a service charge based on the amount of a service, the service-level agreement, the satisfaction of a consumer, the quality of a service, the penalty of a low-quality service, and a service provider's margin and profit.

Let $s_0$ be the baseline speed of a server. We define the service charge function for a service request with execution requirement $r$ and response time $T$ to be

$$C(r, T) = \begin{cases} ar, & 0 \le T \le \dfrac{c}{s_0}r; \\[2mm] ar - d\left(T - \dfrac{c}{s_0}r\right), & \dfrac{c}{s_0}r < T \le \left(\dfrac{a}{d} + \dfrac{c}{s_0}\right)r; \\[2mm] 0, & T > \left(\dfrac{a}{d} + \dfrac{c}{s_0}\right)r. \end{cases}$$

The above function is defined with the following rationale:

. If the response time $T$ to process a service request is no longer than $(c/s_0)r = c(r/s_0)$ (i.e., a constant $c$ times the task execution time with speed $s_0$), where the constant $c$ is a parameter indicating the service-level agreement, and the constant $s_0$ is a parameter indicating the expectation and satisfaction of a consumer, then a service provider considers that the service request is processed successfully with high quality of service and charges a customer $ar$, which is linearly proportional to the task execution requirement $r$ (i.e., the amount of service), where $a$ is the service charge per unit amount of service (i.e., a service provider's margin and profit).

. If the response time $T$ to process a service request is longer than $(c/s_0)r$ but no longer than $(a/d + c/s_0)r$, then a service provider considers that the service request is processed with low quality of service and the charge to a customer should decrease linearly as $T$ increases. The parameter $d$ indicates the degree of penalty of breaking the service-level agreement.

. If the response time $T$ to process a service request is longer than $(a/d + c/s_0)r$, then a service provider considers that the service request has been waiting too long, so there is no charge and the service is free.

Notice that the task response time $T$ is compared with the task execution time on a server with speed $s_0$ (i.e., the baseline or reference speed). The actual speed $s$ of a server can be decided by a service provider, which can be either lower or higher than $s_0$, depending on the workload (i.e., $\lambda$ and $\bar{r}$), the system parameters (e.g., $m$, $\xi$, and $P^*$), and the service charge function (i.e., $a$, $c$, and $d$), such that the net business gain to be defined below is maximized.
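The piecewise definition of $C(r,T)$ is straightforward to transcribe. The sketch below (ours, not the authors' code) does so directly; the parameter values used in the usage comment are the illustrative ones ($a = 10$, $c = 3$, $d = 1$, $s_0 = 1$) appearing later in the paper.

```python
def charge(r, T, a, c, d, s0):
    """Service charge C(r, T) for execution requirement r and response time T."""
    if T <= (c / s0) * r:
        return a * r                           # SLA met: full charge
    if T <= (a / d + c / s0) * r:
        return a * r - d * (T - (c / s0) * r)  # late: linear penalty
    return 0.0                                 # too late: service is free

# e.g., with a=10, c=3, d=1, s0=1 and r=1, the charge falls from 10 at
# T <= 3 linearly down to 0 at T = a/d + c/s0 = 13.
```

Note that the three pieces join continuously: the linear penalty reaches exactly 0 at $T = (a/d + c/s_0)r$.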



To build our discussion upon our earlier result on task waiting time, we notice that the service charge function can be rewritten equivalently in terms of $r$ and $W$ as

$$C(r, W) = \begin{cases} ar, & 0 \le W \le \left(\dfrac{c}{s_0} - \dfrac{1}{s}\right)r; \\[2mm] \left(a + \dfrac{cd}{s_0} - \dfrac{d}{s}\right)r - dW, & \left(\dfrac{c}{s_0} - \dfrac{1}{s}\right)r < W \le \left(\dfrac{a}{d} + \dfrac{c}{s_0} - \dfrac{1}{s}\right)r; \\[2mm] 0, & W > \left(\dfrac{a}{d} + \dfrac{c}{s_0} - \dfrac{1}{s}\right)r. \end{cases}$$

The following theorem gives the expected charge to a service request:

Theorem 2. The expected charge to a service request is

$$\bar{C} = a\bar{r}\left(1 - \frac{P_q}{\big((ms - \lambda\bar{r})(c/s_0 - 1/s) + 1\big)\big((ms - \lambda\bar{r})(a/d + c/s_0 - 1/s) + 1\big)}\right),$$

where $P_q = p_m/(1-\rho)$ and $p_m = p_0(m\rho)^m/m!$.

Sketch of the Proof. The proof is actually a detailed calculation of $\bar{C}$, which contains two steps. In the first step, we calculate $C(r)$, i.e., the expected charge to a service request with execution requirement $r$, based on the pdf of $W$ obtained from Theorem 1. In the second step, we calculate $\bar{C}$ based on the pdf of $r$. A complete proof of the theorem is given in Section 9. □

In Fig. 1, we consider the expected charge to a service request with execution requirement $r$, i.e.,

$$C(r) = ar - \frac{dP_q}{(1-\rho)m\mu}\left(e^{-(1-\rho)m\mu(c/s_0 - 1/s)r} - e^{-(1-\rho)m\mu(a/d + c/s_0 - 1/s)r}\right).$$

We assume that $\bar{r} = 1$ billion instructions, $m = 7$ servers, $s_0 = 1$ billion instructions per second, $s = 1$ billion instructions per second, $a = 10$ cents per one billion instructions (Note: The monetary unit "cent" in this paper may not be identical but should be linearly proportional to the real cent in US dollars.), $c = 3$, and $d = 1$ cent per second. (Note: These parameters are chosen only for illustration and should be scaled to any values.) For $\lambda = 6.15, 6.35, 6.55, 6.75, 6.95$ service requests per second, we show $C(r)$ for $0 \le r \le 3$. It can be seen that the service charge is a decreasing function of $\lambda$, since the waiting time and lateness penalty increase as $\lambda$ increases. It can also be seen that the service charge is an increasing function of $r$, i.e., large service requests generate more revenue than small service requests.

In Fig. 2, we further display C(r)/ar using the same parameters as in Fig. 1. Since ar is the ideal (maximum) charge to a service request with execution requirement r, C(r)/ar can be viewed as the normalized service charge. For λ = 6.15, 6.35, 6.55, 6.75, 6.95 service requests per second, we show C(r)/ar for 0 ≤ r ≤ 3. The normalized service charge is a decreasing function of λ, since the waiting time and the lateness penalty increase as λ increases. It is also an increasing function of r, i.e., the percentage of service charge lost to waiting time decreases as the service requirement r increases. In other words, a provider is more likely to make a profit from large service requests and more likely to give free service to small service requests. It can be verified that as r approaches 0, the normalized service charge approaches $\lim_{r\to 0} C(r)/(ar) = 1 - P_q$, where $P_q$ increases (and $1 - P_q$ decreases) as λ increases. It can also be verified that as r approaches infinity, the normalized service charge approaches $\lim_{r\to\infty} C(r)/(ar) = 1$ for all λ.
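The curves in Figs. 1 and 2 can be reproduced numerically. The sketch below is ours, not the paper's code: it computes the exact P_q via the Erlang C formula and evaluates C(r) with the illustrative parameters above, assuming the usual M/M/m service rate μ = s/r̄ per server.

```python
import math

def erlang_c(m, rho):
    """P_q = p_m / (1 - rho): probability that an arriving request must wait."""
    a = m * rho                                    # offered load
    tail = a ** m / (math.factorial(m) * (1.0 - rho))
    p0 = 1.0 / (sum(a ** k / math.factorial(k) for k in range(m)) + tail)
    return p0 * tail                               # equals p_m / (1 - rho)

def expected_charge(r, lam, m=7, rbar=1.0, s=1.0, s0=1.0, a=10.0, c=3.0, d=1.0):
    """Expected charge C(r) for a request of execution requirement r."""
    rho = lam * rbar / (m * s)
    mu = s / rbar                                  # service rate of one server
    Pq = erlang_c(m, rho)
    k = (1.0 - rho) * m * mu
    return (a * r - d * Pq / k
            * (math.exp(-k * (c / s0 - 1.0 / s) * r)
               - math.exp(-k * (a / d + c / s0 - 1.0 / s) * r)))
```

Consistent with the discussion above, C(r) decreases in λ, increases in r, and C(r)/(ar) tends to 1 − P_q for small r.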

6 NET BUSINESS GAIN

Since the number of service requests processed in one unit of time is λ in a stable M/M/m queuing system, the expected service charge in one unit of time is λC, which is the expected revenue of a service provider. Assume that the rental cost of one server for one unit of time is β. Also assume that the cost of energy is γ per Watt. The cost of a service provider is the sum of the cost of infrastructure renting and the cost of energy consumption, i.e., βm + γP. Then, the expected net business gain (i.e., the net profit) of a service provider in one unit of time is G = λC − (βm + γP), which is the revenue minus the cost. The above equation is

$$
G = \lambda C - \left(\beta m + \gamma\left(\lambda \bar{r} \xi s^{\alpha-1} + m P^*\right)\right)
$$

for the idle-speed model, and

$$
G = \lambda C - \left(\beta m + \gamma m\left(\xi s^{\alpha} + P^*\right)\right)
$$

for the constant-speed model.
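The net business gain for both power models can be sketched as follows. This is our illustration, not the paper's code; it uses the exact expected charge of Theorem 2, the power model P = ξs^α, and our reading of the paper's illustrative parameter values as defaults.

```python
import math

# Sketch of G = lam*C - (beta*m + gamma*P) for the two power models.
# Default parameter values follow one reading of the paper's examples
# (an assumption, not the paper's definitive settings).

def erlang_c(m, rho):
    a = m * rho
    tail = a ** m / (math.factorial(m) * (1.0 - rho))
    p0 = 1.0 / (sum(a ** k / math.factorial(k) for k in range(m)) + tail)
    return p0 * tail

def net_gain(lam, m=7, s=1.0, model="idle", rbar=1.0, s0=1.0,
             a=10.0, c=3.0, d=1.0, alpha=2.0, xi=9.4192,
             beta=1.5, gamma=0.1, Pstar=2.0):
    rho = lam * rbar / (m * s)
    Pq = erlang_c(m, rho)
    D2 = (m * s - lam * rbar) * (c / s0 - 1.0 / s) + 1.0
    D3 = (m * s - lam * rbar) * (a / d + c / s0 - 1.0 / s) + 1.0
    C = a * rbar * (1.0 - Pq / (D2 * D3))          # Theorem 2
    if model == "idle":
        power = lam * rbar * xi * s ** (alpha - 1) + m * Pstar
    else:                                          # constant-speed model
        power = m * (xi * s ** alpha + Pstar)
    return lam * C - (beta * m + gamma * power)
```

Since ρ < 1 implies λr̄ < ms, the idle-speed model always consumes less power than the constant-speed model, so its gain is higher at the same operating point.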

In Figs. 3 and 4, we demonstrate the revenue λC and the net business gain G in one unit of time as functions of λ for the two power consumption models, respectively, using the same parameters as in Figs. 1 and 2. Furthermore, we assume that P* = 2 Watts, α = 2.0, ξ = 9.4192, β = 1.5 cents per second, and γ = 0.1 cents per Watt. For 0 ≤ λ ≤ 7, we show λC and G. The cost of infrastructure renting is 14 cents per second, and the cost of energy consumption is 0.5λ + 7 cents per second for the idle-speed model and 10.5 cents per second for the constant-speed model. We observe that both λC and G increase with λ almost linearly and then drop sharply after a certain point. In other words, more service requests bring more revenue and net business gain; however, after the number of service requests per unit of time passes a certain point, the excessive waiting time causes increased lateness penalty, so that there is no revenue and the business gain becomes negative.

Fig. 1. Service charge C(r) versus r and λ. Fig. 2. Normalized service charge C(r)/ar versus r and λ.

There are two situations that cause negative business gain. In the first case, there is not enough business (i.e., too few service requests). In this case, a service provider should consider reducing the number of servers m and/or the server speed s, so that the cost of infrastructure renting and the cost of energy consumption can be reduced. In the second case, there is too much business (i.e., too many service requests). In this case, a service provider should consider increasing the number of servers and/or the server speed, so that the waiting time can be reduced and the revenue can be increased. However, increasing the number of servers and/or the server speed also increases the cost of infrastructure renting and the cost of energy consumption. Therefore, we face the problem of selecting the optimal server size and/or server speed so that the profit is maximized.

7 PROFIT MAXIMIZATION

To formulate and solve our optimization problems analytically, we need a closed-form expression of C. To this end, let us use the following closed-form approximation:

$$
\sum_{k=0}^{m-1} \frac{(m\rho)^k}{k!} \approx e^{m\rho},
$$

which is very accurate when m is not too small and ρ is not too large [17]. We also need Stirling's approximation of m!, i.e., $m! \approx \sqrt{2\pi m}\,(m/e)^m$. Therefore, we get the following closed-form approximation of p_m:

$$
p_m \approx \frac{1-\rho}{\sqrt{2\pi m}\,(1-\rho)\left(e^{\rho}/(e\rho)\right)^m + 1},
$$

and the following closed-form approximation of P_q:

$$
P_q \approx \frac{1}{\sqrt{2\pi m}\,(1-\rho)\left(e^{\rho}/(e\rho)\right)^m + 1}.
$$
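A quick numerical check of this approximation (our sketch, not part of the paper) shows that it tracks the exact Erlang C value closely for moderate m and ρ:

```python
import math

# Compare the exact Erlang C probability with the closed-form approximation
# P_q ≈ 1 / (sqrt(2*pi*m)*(1-rho)*(e^rho/(e*rho))**m + 1).

def pq_exact(m, rho):
    a = m * rho
    tail = a ** m / (math.factorial(m) * (1.0 - rho))
    p0 = 1.0 / (sum(a ** k / math.factorial(k) for k in range(m)) + tail)
    return p0 * tail

def pq_approx(m, rho):
    return 1.0 / (math.sqrt(2.0 * math.pi * m) * (1.0 - rho)
                  * (math.exp(rho) / (math.e * rho)) ** m + 1.0)
```

As the paper notes, the approximation degrades when m is small or ρ is close to 1; for ρ ≈ 0.9 and m = 7 the absolute error is already noticeable.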

By using the above closed-form expression of P_q, we get a closed-form approximation of the expected service charge to a service request:

$$
C \approx a\bar{r}\left(1 - \frac{1}{\left(\sqrt{2\pi m}\,(1-\rho)\left(e^{\rho}/(e\rho)\right)^m + 1\right)\left((ms-\lambda\bar{r})(c/s_0 - 1/s) + 1\right)\left((ms-\lambda\bar{r})(a/d + c/s_0 - 1/s) + 1\right)}\right).
$$

For convenience, we rewrite C as $C = a\bar{r}\left(1 - \frac{1}{D_1 D_2 D_3}\right)$, where

$$
\begin{aligned}
D_1 &= \sqrt{2\pi m}\,(1-\rho)\left(e^{\rho}/(e\rho)\right)^m + 1, \\
D_2 &= (ms - \lambda\bar{r})(c/s_0 - 1/s) + 1, \\
D_3 &= (ms - \lambda\bar{r})(a/d + c/s_0 - 1/s) + 1.
\end{aligned}
$$

Our discussion in this section is based on this closed-form expression of C.
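The D_1 D_2 D_3 factorization translates directly into code. The following hypothetical sketch (ours, not the paper's) compares the approximate C with the exact C of Theorem 2 at the illustrative parameter values:

```python
import math

# C ≈ a*rbar*(1 - 1/(D1*D2*D3)) versus the exact C = a*rbar*(1 - Pq/(D2*D3)).

def charges(lam, m, s, rbar=1.0, s0=1.0, a=10.0, c=3.0, d=1.0):
    rho = lam * rbar / (m * s)
    D1 = (math.sqrt(2.0 * math.pi * m) * (1.0 - rho)
          * (math.exp(rho) / (math.e * rho)) ** m + 1.0)
    D2 = (m * s - lam * rbar) * (c / s0 - 1.0 / s) + 1.0
    D3 = (m * s - lam * rbar) * (a / d + c / s0 - 1.0 / s) + 1.0
    # exact P_q via the Erlang C formula (requires integer m)
    off = m * rho
    tail = off ** m / (math.factorial(m) * (1.0 - rho))
    pq = tail / (sum(off ** k / math.factorial(k) for k in range(m)) + tail)
    c_exact = a * rbar * (1.0 - pq / (D2 * D3))
    c_approx = a * rbar * (1.0 - 1.0 / (D1 * D2 * D3))
    return c_exact, c_approx
```

The advantage of the approximation is that D_1 is a smooth function of m, so m can be treated as a continuous variable when differentiating G, which is exactly what the following subsections do.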

7.1 Optimal Size

Given λ, r̄, s, P*, ξ, α, β, γ, a, c, and d, our first problem is to find m such that G is maximized. To maximize G, we need to find m such that

$$
\frac{\partial G}{\partial m} = \lambda \frac{\partial C}{\partial m} - (\beta + \gamma P^*) = 0
$$

for the idle-speed model, and

$$
\frac{\partial G}{\partial m} = \lambda \frac{\partial C}{\partial m} - \left(\beta + \gamma\left(\xi s^{\alpha} + P^*\right)\right) = 0
$$

for the constant-speed model, where

$$
\frac{\partial C}{\partial m} = \frac{a\bar{r}}{(D_1 D_2 D_3)^2}\left(D_2 D_3 \frac{\partial D_1}{\partial m} + D_1 D_3 \frac{\partial D_2}{\partial m} + D_1 D_2 \frac{\partial D_3}{\partial m}\right).
$$

To continue the calculation, we rewrite D_1 as $D_1 = \sqrt{2\pi m}\,(1-\rho)R + 1$, where $R = \left(e^{\rho}/(e\rho)\right)^m$. Notice that $\ln R = m \ln\left(e^{\rho}/(e\rho)\right) = m(\rho - \ln\rho - 1)$. Since

$$
\frac{\partial \rho}{\partial m} = -\frac{\lambda\bar{r}}{m^2 s} = -\frac{\rho}{m},
$$


Fig. 3. Revenue and net business gain versus λ (idle-speed model). Fig. 4. Revenue and net business gain versus λ (constant-speed model).


we get

$$
\frac{1}{R}\frac{\partial R}{\partial m} = (\rho - \ln\rho - 1) + m\left(1 - \frac{1}{\rho}\right)\frac{\partial \rho}{\partial m} = -\ln\rho,
$$

and

$$
\frac{\partial R}{\partial m} = -R\ln\rho.
$$

Now, we have

$$
\begin{aligned}
\frac{\partial D_1}{\partial m}
&= \sqrt{2\pi}\left(\frac{1}{2\sqrt{m}}(1-\rho)R + \sqrt{m}\left(-\frac{\partial \rho}{\partial m}\right)R + \sqrt{m}\,(1-\rho)\frac{\partial R}{\partial m}\right) \\
&= \sqrt{2\pi}\left(\frac{1}{2\sqrt{m}}(1-\rho)R + \frac{\sqrt{m}\,\rho}{m}R - \sqrt{m}\,(1-\rho)R\ln\rho\right) \\
&= \sqrt{2\pi}\left(\frac{1}{2\sqrt{m}}(1-\rho)R + \frac{1}{\sqrt{m}}\,\rho R - \sqrt{m}\,(1-\rho)(\ln\rho)R\right) \\
&= \sqrt{2\pi}\left(\frac{1}{2\sqrt{m}}(1+\rho)R - \sqrt{m}\,(1-\rho)(\ln\rho)R\right).
\end{aligned}
$$

Furthermore, we have

$$
\frac{\partial D_2}{\partial m} = \frac{cs}{s_0} - 1,
$$

and

$$
\frac{\partial D_3}{\partial m} = \frac{as}{d} + \frac{cs}{s_0} - 1.
$$

Although there is no closed-form solution for m, we notice that ∂G/∂m is a decreasing function of m. Therefore, m can be found numerically by using the standard bisection method.

In Figs. 5 and 6, we demonstrate the net business gain G in one unit of time as a function of m and λ for the two power consumption models, respectively, using the same parameters as in Figs. 1, 2, 3, and 4. For λ = 2.9, 3.9, 4.9, 5.9, 6.9, we display G for m large enough such that ρ < 1. We notice that there is an optimal choice of m such that G is maximized. Using our analytical results, we can find m such that ∂G/∂m = 0. The optimal value of m is 3.67479, 4.79218, 5.89396, 6.98457, 8.06655, respectively, for the idle-speed model, and 3.54842, 4.64834, 5.73478, 6.81160, 7.88104, respectively, for the constant-speed model.

Such server size optimization has a clear physical interpretation. When m is small such that ρ is close to 1, the waiting times of service requests are excessively long, and the service charges and the net business gain are low. As m increases, the waiting times are significantly reduced, and the service charges and the net business gain increase. However, as m increases further, there is no more increase in the expected service charge, which has an upper bound of ar̄; on the other hand, the cost of the service provider (i.e., the rental cost and the base power consumption) increases, so that the net business gain is actually reduced. Hence, there is an optimal choice of m which maximizes the profit.
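The bisection search for the optimal m can be sketched as follows. This is our illustration, not the paper's code: it treats m as a continuous variable, uses the closed-form C and the idle-speed cost model with our reading of the illustrative parameter values, and differentiates G numerically rather than with the analytical ∂G/∂m derived above.

```python
import math

def c_closed(lam, m, s, rbar=1.0, s0=1.0, a=10.0, c=3.0, d=1.0):
    """Closed-form approximation C = a*rbar*(1 - 1/(D1*D2*D3))."""
    rho = lam * rbar / (m * s)
    D1 = (math.sqrt(2.0 * math.pi * m) * (1.0 - rho)
          * (math.exp(rho) / (math.e * rho)) ** m + 1.0)
    D2 = (m * s - lam * rbar) * (c / s0 - 1.0 / s) + 1.0
    D3 = (m * s - lam * rbar) * (a / d + c / s0 - 1.0 / s) + 1.0
    return a * rbar * (1.0 - 1.0 / (D1 * D2 * D3))

def gain_idle(lam, m, s, rbar=1.0, alpha=2.0, xi=9.4192,
              beta=1.5, gamma=0.1, Pstar=2.0):
    cost = beta * m + gamma * (lam * rbar * xi * s ** (alpha - 1) + m * Pstar)
    return lam * c_closed(lam, m, s) - cost

def optimal_m(lam, s=1.0, hi=50.0, eps=1e-6):
    """Bisection on dG/dm, which the paper shows is decreasing in m."""
    lo = lam / s + 1e-3                  # keep rho < 1 (with rbar = 1)
    h = 1e-5
    dG = lambda m: (gain_idle(lam, m + h, s) - gain_idle(lam, m - h, s)) / (2 * h)
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)
        if dG(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Since dG/dm is positive near ρ = 1 and negative for large m, the search converges to the interior maximizer; a practical deployment would round m to a neighboring integer.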

7.2 Optimal Speed

Given λ, r̄, m, P*, ξ, α, β, γ, a, c, and d, our second problem is to find s such that G is maximized. To maximize G, we need to find s such that

$$
\frac{\partial G}{\partial s} = \lambda \frac{\partial C}{\partial s} - \gamma\lambda\bar{r}\,\xi(\alpha - 1)s^{\alpha-2} = 0
$$

for the idle-speed model, and

$$
\frac{\partial G}{\partial s} = \lambda \frac{\partial C}{\partial s} - \gamma m \xi \alpha s^{\alpha-1} = 0
$$

for the constant-speed model, where

$$
\frac{\partial C}{\partial s} = \frac{a\bar{r}}{(D_1 D_2 D_3)^2}\left(D_2 D_3 \frac{\partial D_1}{\partial s} + D_1 D_3 \frac{\partial D_2}{\partial s} + D_1 D_2 \frac{\partial D_3}{\partial s}\right).
$$

Similar to the calculation in the last section, we have

$$
\frac{\partial \rho}{\partial s} = -\frac{\lambda\bar{r}}{m s^2} = -\frac{\rho}{s},
$$

$$
\frac{1}{R}\frac{\partial R}{\partial s} = m\left(1 - \frac{1}{\rho}\right)\frac{\partial \rho}{\partial s} = \frac{m}{s}(1-\rho),
$$

and

$$
\frac{\partial R}{\partial s} = \frac{m}{s}(1-\rho)R.
$$


Fig. 5. Net business gain G versus m and λ (idle-speed model). Fig. 6. Net business gain G versus m and λ (constant-speed model).


Now, we have

$$
\begin{aligned}
\frac{\partial D_1}{\partial s}
&= \sqrt{2\pi m}\left(\left(-\frac{\partial \rho}{\partial s}\right)R + (1-\rho)\frac{\partial R}{\partial s}\right) \\
&= \sqrt{2\pi m}\left(\frac{\rho}{s}R + \frac{m}{s}(1-\rho)^2 R\right) \\
&= \sqrt{2\pi m}\left(\rho + m(1-\rho)^2\right)\frac{R}{s}.
\end{aligned}
$$

Furthermore, we have

$$
\frac{\partial D_2}{\partial s} = m\left(\frac{c}{s_0} - \frac{1}{s}\right) + (ms - \lambda\bar{r})\frac{1}{s^2} = \frac{mc}{s_0} - \frac{\lambda\bar{r}}{s^2},
$$

and

$$
\frac{\partial D_3}{\partial s} = m\left(\frac{a}{d} + \frac{c}{s_0} - \frac{1}{s}\right) + (ms - \lambda\bar{r})\frac{1}{s^2} = m\left(\frac{a}{d} + \frac{c}{s_0}\right) - \frac{\lambda\bar{r}}{s^2}.
$$

Although there is no closed-form solution for s, we notice that ∂G/∂s is a decreasing function of s. Therefore, s can be found numerically by using the standard bisection method.

In Figs. 7 and 8, we demonstrate the net business gain G in one unit of time as a function of s and λ for the two power consumption models, respectively, using the same parameters as in Figs. 1, 2, 3, 4, 5, and 6. For λ = 2.9, 3.9, 4.9, 5.9, 6.9, we display G for s large enough such that ρ < 1. We notice that there is an optimal choice of s such that G is maximized. Using our analytical results, we can find s such that ∂G/∂s = 0. The optimal value of s is 0.63215, 0.76982, 0.90888, 1.04911, 1.19011, respectively, for the idle-speed model, and 0.57015, 0.71009, 0.85145, 0.99348, 1.13584, respectively, for the constant-speed model.

Such server speed optimization also has a clear physical interpretation. When s is small such that ρ is close to 1, the waiting times of service requests are excessively long, and the service charges and the net business gain are low. As s increases, the waiting times are significantly reduced, and the service charges and the net business gain increase. However, as s increases further, there is no more increase in the expected service charge, which has an upper bound of ar̄; on the other hand, the cost of the service provider (i.e., the cost of energy consumption) increases, so that the net business gain is actually reduced. Hence, there is an optimal choice of s which maximizes the profit.

7.3 Optimal Size and Speed

Given λ, r̄, P*, ξ, α, β, γ, a, c, and d, our third problem is to find m and s such that G is maximized. To maximize G, we need to find m and s such that ∂G/∂m = 0 and ∂G/∂s = 0, where ∂G/∂m and ∂G/∂s were derived in the last two sections. The two equations can be solved by a nested bisection search procedure.
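A nested bisection for the joint problem can be sketched as follows (our illustration, under the same assumed idle-speed parameters as before): the inner loop finds the optimal s for a fixed m, and the outer loop searches over m along the inner optimum.

```python
import math

def c_closed(lam, m, s, rbar=1.0, s0=1.0, a=10.0, c=3.0, d=1.0):
    rho = lam * rbar / (m * s)
    D1 = (math.sqrt(2.0 * math.pi * m) * (1.0 - rho)
          * (math.exp(rho) / (math.e * rho)) ** m + 1.0)
    D2 = (m * s - lam * rbar) * (c / s0 - 1.0 / s) + 1.0
    D3 = (m * s - lam * rbar) * (a / d + c / s0 - 1.0 / s) + 1.0
    return a * rbar * (1.0 - 1.0 / (D1 * D2 * D3))

def gain_idle(lam, m, s, beta=1.5, gamma=0.1, xi=9.4192, alpha=2.0, Pstar=2.0):
    # idle-speed cost model, with rbar = 1
    cost = beta * m + gamma * (lam * xi * s ** (alpha - 1) + m * Pstar)
    return lam * c_closed(lam, m, s) - cost

def bisect_max(f, lo, hi, eps=1e-5):
    """Maximize a smooth single-peaked f by bisection on its derivative."""
    h = 1e-6
    while hi - lo > eps:
        mid = 0.5 * (lo + hi)
        if f(mid + h) - f(mid - h) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def optimal_ms(lam, m_hi=20.0, s_hi=10.0):
    """Nested bisection: inner over s for fixed m, outer over m."""
    best_s = lambda m: bisect_max(lambda s: gain_idle(lam, m, s),
                                  lam / m + 1e-3, s_hi)   # keep rho < 1
    m_star = bisect_max(lambda m: gain_idle(lam, m, best_s(m)), 1.0, m_hi)
    return m_star, best_s(m_star)
```

As in Section 7.3, the returned m is generally not an integer; a practical setting would compare the neighboring integer values of m, re-optimizing s for each.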

In Figs. 9 and 10, we demonstrate the net business gain G in one unit of time as a function of s and m for the two power consumption models, respectively, using the same parameters as in Figs. 1, 2, 3, 4, 5, 6, 7, and 8, where λ = 6.9. For m = 4, 5, 6, 7, 8, we display G for s large enough such that ρ < 1. Using our analytical results, we can find m and s such that ∂G/∂m = 0 and ∂G/∂s = 0. For the idle-speed model, the theoretically optimal values are m = 5.56827 and s = 1.46819, which result in the maximum G = 49.25361 by using the closed-form approximation of C. Practically, m can be either 5 or 6. When m = 5, the optimal value of s is 1.62236, which results in the maximum G = 49.16510. When m = 6, the optimal value of s is 1.37044, which results in the maximum G = 49.18888. Hence, the practically optimal setting is m = 6 and s = 1.37044, and the maximum net business gain in one unit of time is G = 49.29273 by using the exact expression of C. For the constant-speed model, the theoretically optimal values are m = 5.79074 and s = 1.35667, which result in the maximum G = 47.80769 by using the closed-form approximation of C. Practically, m can be either 5 or 6. When m = 5, the optimal value of s is 1.55839, which results in the maximum G = 47.63979. When m = 6, the optimal value of s is 1.31213, which results in the maximum G = 47.78640. Hence, the practically optimal setting is m = 6 and s = 1.31213, and the maximum net business gain in one unit of time is G = 47.91830 by using the exact expression of C.

Fig. 7. Net business gain G versus s and λ (idle-speed model). Fig. 8. Net business gain G versus s and λ (constant-speed model). Fig. 9. Net business gain G versus s and m (idle-speed model). Fig. 10. Net business gain G versus s and m (constant-speed model).

8 SIMULATION SETTINGS AND RESULTS

See Section 1 of the supplementary material, which can be found on the Computer Society Digital Library at http://doi.ieeecomputersociety.org/10.1109/TPDS.2012.203.

9 PROOFS OF THEOREMS 1 AND 2

See Section 2 of the supplementary material, available online.

10 CONCLUDING REMARKS

We have proposed a pricing model for cloud computing which takes many factors into consideration, such as the requirement r of a service, the workload λ of an application environment, the configuration (m and s) of a multiserver system, the service-level agreement c, the satisfaction (r and s_0) of a consumer, the quality (W and T) of a service, the penalty d of a low-quality service, the cost (β and m) of renting, the cost (ξ, γ, P*, and P) of energy consumption, and a service provider's margin and profit a. By using an M/M/m queuing model, we formulated and solved the problem of optimal multiserver configuration for profit maximization in a cloud computing environment. Our discussion can be easily extended to other service charge functions, and our methodology can be applied to other pricing models.

ACKNOWLEDGMENTS

The authors are grateful to the three anonymous reviewers for their constructive comments. Part of the work was performed while K. Hwang, K. Li, and A.Y. Zomaya were visiting Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, in the winter of 2011 and the summer of 2012 as Intellectual Ventures endowed visiting chair professors. This work was partially supported by the Ministry of Science and Technology of China under National 973 Basic Research Grants No. 2011CB302805 and No. 2011CB302505 and National 863 High-Tech Program Grant No. 2011AA040501; the Ministry of Industry and Information Technology of China under the Internet of Things program; the Innovation R&D Team Program of Guangdong Province, China, under contract No. 201001D0104726115; and Australian Research Grant DP1097110.

REFERENCES

[1] http://en.wikipedia.org/wiki/CMOS, 2012.
[2] http://en.wikipedia.org/wiki/Service_level_agreement, 2012.
[3] M. Armbrust et al., "Above the Clouds: A Berkeley View of Cloud Computing," Technical Report No. UCB/EECS-2009-28, Feb. 2009.
[4] R. Buyya, D. Abramson, J. Giddy, and H. Stockinger, "Economic Models for Resource Management and Scheduling in Grid Computing," Concurrency and Computation: Practice and Experience, vol. 14, pp. 1507-1542, 2007.
[5] R. Buyya, C.S. Yeo, S. Venugopal, J. Broberg, and I. Brandic, "Cloud Computing and Emerging IT Platforms: Vision, Hype, and Reality for Delivering Computing as the Fifth Utility," Future Generation Computer Systems, vol. 25, no. 6, pp. 599-616, 2009.
[6] A.P. Chandrakasan, S. Sheng, and R.W. Brodersen, "Low-Power CMOS Digital Design," IEEE J. Solid-State Circuits, vol. 27, no. 4, pp. 473-484, Apr. 1992.
[7] B.N. Chun and D.E. Culler, "User-Centric Performance Analysis of Market-Based Cluster Batch Schedulers," Proc. Second IEEE/ACM Int'l Symp. Cluster Computing and the Grid, 2002.
[8] D. Durkee, "Why Cloud Computing Will Never Be Free," Comm. ACM, vol. 53, no. 5, pp. 62-69, 2010.
[9] R. Ghosh, K.S. Trivedi, V.K. Naik, and D.S. Kim, "End-to-End Performability Analysis for Infrastructure-as-a-Service Cloud: An Interacting Stochastic Models Approach," Proc. 16th IEEE Pacific Rim Int'l Symp. Dependable Computing, pp. 125-132, 2010.
[10] K. Hwang, G.C. Fox, and J.J. Dongarra, Distributed and Cloud Computing. Morgan Kaufmann, 2012.
[11] "Enhanced Intel SpeedStep Technology for the Intel Pentium M Processor," White Paper, Intel, Mar. 2004.
[12] D.E. Irwin, L.E. Grit, and J.S. Chase, "Balancing Risk and Reward in a Market-Based Task Service," Proc. 13th IEEE Int'l Symp. High Performance Distributed Computing, pp. 160-169, 2004.
[13] H. Khazaei, J. Misic, and V.B. Misic, "Performance Analysis of Cloud Computing Centers Using M/G/m/m+r Queuing Systems," IEEE Trans. Parallel and Distributed Systems, vol. 23, no. 5, pp. 936-943, May 2012.
[14] L. Kleinrock, Queueing Systems: Theory, vol. 1. John Wiley and Sons, 1975.
[15] Y.C. Lee, C. Wang, A.Y. Zomaya, and B.B. Zhou, "Profit-Driven Service Request Scheduling in Clouds," Proc. 10th IEEE/ACM Int'l Conf. Cluster, Cloud and Grid Computing, pp. 15-24, 2010.
[16] K. Li, "Optimal Load Distribution for Multiple Heterogeneous Blade Servers in a Cloud Computing Environment," Proc. 25th IEEE Int'l Parallel and Distributed Processing Symp. Workshops, pp. 943-952, May 2011.
[17] K. Li, "Optimal Configuration of a Multicore Server Processor for Managing the Power and Performance Tradeoff," J. Supercomputing, vol. 61, no. 1, pp. 189-214, 2012.
[18] P. Mell and T. Grance, "The NIST Definition of Cloud Computing," Nat'l Inst. of Standards and Technology, http://csrc.nist.gov/groups/SNS/cloud-computing/, 2009.
[19] F.I. Popovici and J. Wilkes, "Profitable Services in an Uncertain World," Proc. ACM/IEEE Conf. Supercomputing, 2005.
[20] J. Sherwani, N. Ali, N. Lotia, Z. Hayat, and R. Buyya, "Libra: A Computational Economy-Based Job Scheduling System for Clusters," Software - Practice and Experience, vol. 34, pp. 573-590, 2004.
[21] C.S. Yeo and R. Buyya, "A Taxonomy of Market-Based Resource Management Systems for Utility-Driven Cluster Computing," Software - Practice and Experience, vol. 36, pp. 1381-1419, 2006.
[22] B. Zhai, D. Blaauw, D. Sylvester, and K. Flautner, "Theoretical and Practical Limits of Dynamic Voltage Scaling," Proc. 41st Design Automation Conf., pp. 868-873, 2004.



Junwei Cao received the bachelor's and master's degrees in control theories and engineering in 1996 and 1998, respectively, both from Tsinghua University, Beijing, China. He received the PhD degree in computer science from the University of Warwick, Coventry, United Kingdom, in 2001. He is currently a professor and vice director of the Research Institute of Information Technology, Tsinghua University, Beijing, China. He is also the director of the Common Platform and Technology Division, Tsinghua National Laboratory for Information Science and Technology. Before joining Tsinghua University in 2006, he was a research scientist at the MIT LIGO Laboratory and NEC Laboratories Europe for about five years. He has published more than 130 papers, which have been cited by international scholars more than 3,000 times. He is the book editor of Cyberinfrastructure Technologies and Applications, published by Nova Science in 2009. His research is focused on advanced computing technologies and applications. He is a senior member of the IEEE and the IEEE Computer Society, and a member of the ACM and the CCF.

Kai Hwang received the PhD degree from the University of California, Berkeley, in 1972. He is a professor of EE/CS at the University of Southern California. He also chairs the IV-endowed visiting chair professor group at Tsinghua University in China. He has published eight books and more than 218 scientific papers in computer architecture, parallel processing, distributed systems, cloud computing, network security, and Internet applications. His popular books have been adopted worldwide and translated into four foreign languages. His published papers had been cited more than 11,000 times by early 2012. His latest book, Distributed and Cloud Computing: from Parallel Processing to the Internet of Things (with G. Fox and J. Dongarra), was published by Kaufmann in 2011. He received the 2004 CFC Outstanding Achievement Award, and the Founders Award for his pioneering work in parallel processing from IEEE IPDPS in 2011. He served as a founding editor-in-chief of the Journal of Parallel and Distributed Computing for 28 years. He has delivered 34 keynote addresses on advanced computing systems and cutting-edge information technologies at major IEEE/ACM conferences. He has performed advisory, consulting, and collaborative work for IBM, Intel, MIT Lincoln Lab, JPL at Caltech, ETL in Japan, ITRI in Taiwan, GMD in Germany, INRIA in France, and the Chinese Academy of Sciences. He is a fellow of the IEEE (1986).

Keqin Li is a SUNY distinguished professor in computer science and an Intellectual Ventures endowed visiting chair professor at Tsinghua University, China. His research interests are mainly in the design and analysis of algorithms, parallel and distributed computing, and computer networking. He has contributed extensively to processor allocation and resource management; design and analysis of sequential/parallel, deterministic/probabilistic, and approximation algorithms; parallel and distributed computing systems performance analysis, prediction, and evaluation; job scheduling, task dispatching, and load balancing in heterogeneous distributed systems; dynamic tree embedding and randomized load distribution in static networks; parallel computing using optical interconnections; dynamic location management in wireless communication networks; routing and wavelength assignment in optical networks; and energy-efficient power management and performance optimization. He has published more than 240 research publications and has received several Best Paper Awards for his highest quality work. He is currently on the editorial boards of IEEE Transactions on Parallel and Distributed Systems, IEEE Transactions on Computers, International Journal of Parallel, Emergent and Distributed Systems, International Journal of High Performance Computing and Networking, and Optimization Letters. He is a senior member of the IEEE.

Albert Y. Zomaya is currently the chair professor of high performance computing and networking and an Australian Research Council Professorial fellow in the School of Information Technologies, The University of Sydney. He is also the director of the Centre for Distributed and High Performance Computing, which was established in late 2009. He is the author/coauthor of seven books and more than 400 papers, and the editor of nine books and 11 conference proceedings. He is the editor-in-chief of the IEEE Transactions on Computers and serves as an associate editor for 19 leading journals, such as the IEEE Transactions on Parallel and Distributed Systems and the Journal of Parallel and Distributed Computing. He is the recipient of the Meritorious Service Award (in 2000) and the Golden Core Recognition (in 2006), both from the IEEE Computer Society. He also received the IEEE Technical Committee on Parallel Processing Outstanding Service Award and the IEEE Technical Committee on Scalable Computing Medal for Excellence in Scalable Computing, both in 2011. He is a chartered engineer and a fellow of AAAS, the IEEE, and IET (United Kingdom).

For more information on this or any other computing topic, please visit our Digital Library at www.computer.org/publications/dlib.
