[ieee 2014 ieee international conference on services computing (scc) - anchorage, ak, usa...



Quality of Web Service Prediction by Collective Matrix Factorization

Richong Zhang, Chune Li, Hailong Sun, Yanghao Wang, Jinpeng Huai

School of Computer Science and Engineering, Beihang University

Beijing, China
Email: {zhangrc, lichune, sunhl, wangyh, huaijp}@act.buaa.edu.cn

Abstract—This paper studies the problem of predicting the quality of web services. We formalize the QoS prediction problem by incorporating multiple contextual characteristics via collective matrix factorization, which simultaneously factors the user-service quality matrix and the contextual information matrices. Using the service category and location context, we develop three context-aware QoS prediction models and algorithms to demonstrate the advantages of this modeling technique. Experiments on real-life data sets demonstrate the advantages of the proposed models.

Keywords-quality of web services; matrix factorization; QoS prediction.

I. INTRODUCTION

Web services and RESTful APIs are increasingly prevalent in web development. By taking advantage of the growing availability of these components, web developers enjoy a simpler development experience than ever before. Still, the quality of web services and APIs varies significantly. In addition, with the explosive growth in the number of publicly available service components, potential users have to spend an immense amount of time finding the services that best help them develop web applications.

Some web service and API aggregation sites, such as Seekda¹ and ProgrammableWeb², list these components and now allow users to write feedback, annotate tags, or rate the services, and they make this information available to web developers. In addition, Zheng et al. examine the performance of web services and make the measured results available to all web developers³. Potential service users may use this collected information to estimate the performance of services and decide whether to adopt them in their applications. Nevertheless, feedback and performance measurements take time to accumulate before a genuinely high-quality service or API can be discovered. To address these problems, our goal is to develop models that effectively discover the highest-quality services and thereby assist web developers in their programming.

Previous studies of web service quality prediction primarily consider the user-service invocation quality matrix, where each entry is the quality experienced by a user when calling a web service. In practice, however, other contextual properties of users and services also affect the quality of a web service, such as the functional category of the service and the location where it is hosted. This paper studies the possibility of incorporating such side information into web service quality prediction systems to improve prediction performance.

¹ webservices.seekda.com
² www.programmableweb.com/
³ www.wsdream.net

Because contextual characteristics may affect the quality of web services, a quality of service (QoS) prediction model should be able to characterize all of these features. In this paper, we exploit collective matrix factorization [1], which considers these contextual features simultaneously. Specifically, we formalize context-aware prediction models for the quality of web services and design learning algorithms for them. These models combine multiple contextual characteristics of web services and their users to make the QoS predictor more effective.

Furthermore, we develop QoS predictors based on stochastic gradient descent. Experiments on data sets from wsdream.net, together with comparisons against existing QoS prediction algorithms, confirm that the proposed models achieve lower prediction error.

The remainder of this paper is organized as follows. Section II introduces related work. Section III discusses the contextual characteristics of web service invocation. Section IV presents a formulation of QoS prediction and provides algorithms derived from a collective matrix factorization model. Section V presents an experimental evaluation of our approach. The paper ends with some discussion and brief conclusions in Section VI.

II. RELATED WORK

A. Web Service Recommendation and QoS Prediction

The general goal of web service recommendation and QoS prediction is to predict missing values in the user-service invocation quality matrix. Collaborative filtering is one of the most commonly used recommendation approaches and has been successfully exploited by many service recommender systems and QoS prediction methods. For example, in [2] and [3], the authors proposed methods that determine user similarity by collaborative filtering and predict QoS data from similar users' service invocation histories.

2014 IEEE International Conference on Services Computing
978-1-4799-5066-9/14 $31.00 © 2014 IEEE
DOI 10.1109/SCC.2014.64

In addition, Zheng et al. [4] proposed a hybrid collaborative filtering approach that combines user-based and item-based methods to solve the web service recommendation problem. They also conducted several large-scale evaluations on real-world web services [5] [6] and provided QoS data sets that have promoted research on QoS-driven web services.

Model-based approaches to QoS prediction adopt ideas from pattern recognition, data mining, and machine learning. Ge et al. [7] proposed a QoS prediction method based on pattern recognition that considers the impact of the diversity of user experience across different network environments and platforms.

Service discovery has also been widely researched in Service Oriented Architecture (SOA) based systems, and researchers have considered non-functional aspects such as web service selection and recommendation based on the quality of service [8].

B. Context-aware Prediction

Schmidt et al. [9] defined context as the situation or environment that a device or user is in. The intuition behind context-aware recommender systems is that, in some application scenarios, user preferences are not uniform across contexts, which may lead to poor performance of recommender systems that ignore context. Based on the stage at which contextual information is integrated, context-aware recommender systems are categorized into contextual pre-filtering, contextual post-filtering, and contextual modeling [10]. Recently, matrix factorization has attracted the attention of researchers [11]–[13], and the effectiveness of this model has been confirmed by a number of studies. In addition, [14] proposes a context-aware recommender system for mobile application discovery that uses the implicit feedback in personal usage histories together with tensor factorization to make predictions.

C. Matrix Factorization

The matrix factorization (MF) model achieved the best performance in the Netflix challenge [15]. Researchers have since focused on extending the MF model to predict unobserved ratings more accurately, and many variants have been proposed to incorporate additional factors. For example, traditional neighbor-based collaborative filtering has been combined with the MF model [16], and the temporal dynamics of users' tastes and items' timeliness have been modeled as temporal factors [17]. Probabilistic approaches to matrix factorization [18] [19] have also been proposed to identify hidden connections between features and predict the missing values of matrices.

III. CONTEXTUAL CHARACTERISTICS OF WEB SERVICE INVOCATION

We believe that the performance of a web service invocation depends on many contextual characteristics. In reality, the QoS attributes of a web service are affected not only by the service itself but also by the contextual characteristics of service providers and service users. These characteristics, e.g., network conditions and service categories, affect the quality that invoking users can achieve. Figure 1 and Figure 2 show the average response time and throughput of invocations, grouped by the country where the invoker is located and the country where the service is hosted. The darker the color, the greater the value (that is, the longer the response time or the larger the throughput). This example covers 28 countries, and the two figures show that QoS varies significantly between pairs of locations. It can also be seen that the quality (response time and throughput) of invocations within the same country is in general better than when the invoker and the service are in different countries. Figure 3 and Figure 4 illustrate that the average response time and throughput also differ greatly across service categories. The categories considered in this study are described in Section V-A.

These statistics confirm that contextual characteristics affect the quality of web services. This calls for a QoS prediction system capable of combining service information and contextual information to generate predictions.

Figure 1. Average response time. The two-dimensional matrix is a visual representation of a corresponding matrix whose entries represent the average response time of invocations from and to different countries. Higher values are indicated with darker cells.

In this paper, we take the web service response time

and throughput prediction as examples of the quality of

web services to illustrate the performance of the proposed

models.


Figure 2. Average throughput. The two-dimensional matrix is a visual representation of a corresponding numerical matrix whose entries show the average throughput of invocations from and to different countries. As in the previous figure, higher values are indicated with darker cells.

Figure 3. Average response time of different categories of services. [Plot not reproduced; horizontal axis: service category 1 to 8; vertical axis: 0 to 1.4.]

Figure 4. Average throughput of different categories of services. [Plot not reproduced; horizontal axis: service category 1 to 8; vertical axis: 0 to 2.5 × 10^4.]

IV. QUALITY OF WEB SERVICE PREDICTION MODEL

In this section, we formulate the QoS prediction problem and present the QoS prediction models.

A. Problem Statement

We denote the set of all web services and the set of all users by $\mathcal{S}$ and $\mathcal{U}$, respectively. We denote by $X_{u,s}$ the quality of web service $s$ experienced by user $u$, and by $\hat{X}_{u,s}$ the estimate of this value. The service quality $X_{u,s}$ is defined for every $(u, s) \in \mathcal{U} \times \mathcal{S}$ but is observed only for some of these pairs. The objective of the QoS prediction problem is to estimate the missing values of the user-service invocation quality matrix $X$.

B. Matrix Factorization Model

The matrix factorization model has been used to solve this QoS prediction problem. Matrix factorization assumes a latent low-dimensional space $\mathbb{R}^D$ in which a user feature $p_u$ is defined for each user $u$ and a service feature $q_s$ is defined for each service $s$. That is, $p_u$ and $q_s$ both belong to $\mathbb{R}^D$, and the estimated QoS value $\hat{X}_{u,s}$ is defined by the inner product of these two vectors, namely,

$$\hat{X}_{u,s} = p_u q_s^T. \tag{1}$$

Representing the collection of $q_s$'s as a $|\mathcal{S}| \times D$ matrix $Q$ and the collection of $p_u$'s as a $|\mathcal{U}| \times D$ matrix $P$, the estimation problem reduces to the following minimization problem: find $(Q, P)$ that minimizes

$$\|X - PQ^T\|^2 + \rho\|Q\|^2 + \rho\|P\|^2 \tag{2}$$

for some given positive value of $\rho$. The notation $\|\cdot\|$ denotes the matrix Frobenius norm.
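As a minimal sketch of Eqs. (1) and (2), the following Python fragment evaluates the inner-product estimate and the regularized objective on toy data. The function names, the toy matrices, and the choice to sum the data term over observed entries only (the missing entries have no target value) are assumptions made for illustration, not details given in the paper.

```python
def predict(p_u, q_s):
    """Estimated QoS for one (user, service) pair: the inner product of Eq. (1)."""
    return sum(a * b for a, b in zip(p_u, q_s))


def mf_objective(observed, P, Q, rho):
    """Regularized squared error in the spirit of Eq. (2).

    Assumption: the data term covers observed entries only.
    `observed` maps (u, s) index pairs to measured QoS values.
    """
    fit = sum((x - predict(P[u], Q[s])) ** 2 for (u, s), x in observed.items())
    reg = sum(v * v for row in P for v in row) + sum(v * v for row in Q for v in row)
    return fit + rho * reg


# Toy data (hypothetical): 2 users, 2 services, D = 2 latent dimensions.
P = [[1.0, 0.0], [0.5, 0.5]]
Q = [[2.0, 1.0], [0.0, 4.0]]
observed = {(0, 0): 2.0, (1, 1): 1.0}
loss = mf_objective(observed, P, Q, rho=0.1)
```

Here the first observed entry is fitted exactly and the second has residual -1, so the data term is 1 and the remainder of the loss comes from the regularizer.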

An extension of matrix factorization is regularized SVD with bias [20], which formulates the estimate $\hat{X}_{u,s}$ as

$$\hat{X}_{u,s} = \mu + b_s + b_u + p_u q_s^T, \tag{3}$$

where $\mu$ is the average of all QoS values, and $b_u$ and $b_s$ are the user bias and the service bias, respectively. Denote by $B_U$ the collection of all $b_u$'s and by $B_S$ the collection of all $b_s$'s. The estimation problem then reduces to the following minimization problem: find $(B_U, B_S, Q, P)$ that minimizes

$$\|X - \hat{X}\|^2 + \rho(\|Q\|^2 + \|P\|^2 + \|B_U\|^2 + \|B_S\|^2), \tag{4}$$

where $\hat{X}$ is the matrix of estimates given by (3).

The optimization problems stated in (2) and (4) can both be solved using gradient descent or stochastic gradient descent algorithms.
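The biased estimate of Eq. (3) and one stochastic gradient step on a single observed value can be sketched as follows. The update form (error-driven step with per-parameter shrinkage, factor 2 absorbed into the step size) is a common choice assumed here for illustration; the function names and toy numbers are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def predict_biased(mu, b_u, b_s, p_u, q_s):
    """Eq. (3): global mean plus service bias, user bias and latent interaction."""
    return mu + b_s + b_u + dot(p_u, q_s)


def sgd_step_biased(x, mu, b_u, b_s, p_u, q_s, lam, rho):
    """One stochastic gradient step on a single observed value x (assumed form).

    Returns the updated (b_u, b_s, p_u, q_s); mu is kept fixed.
    """
    err = x - predict_biased(mu, b_u, b_s, p_u, q_s)
    new_b_u = b_u + lam * (err - rho * b_u)
    new_b_s = b_s + lam * (err - rho * b_s)
    new_p = [p + lam * (err * q - rho * p) for p, q in zip(p_u, q_s)]
    new_q = [q + lam * (err * p - rho * q) for p, q in zip(p_u, q_s)]
    return new_b_u, new_b_s, new_p, new_q


# Toy check (hypothetical values): one step should shrink the error.
x, mu = 3.0, 1.0
b_u, b_s, p_u, q_s = 0.0, 0.0, [0.1, 0.1], [0.1, 0.1]
err_before = x - predict_biased(mu, b_u, b_s, p_u, q_s)
b_u, b_s, p_u, q_s = sgd_step_biased(x, mu, b_u, b_s, p_u, q_s, lam=0.1, rho=0.01)
err_after = x - predict_biased(mu, b_u, b_s, p_u, q_s)
```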

C. Collective Matrix Factorization

The previous subsection showed that the traditional matrix factorization model can address the QoS prediction problem. However, it does not consider other explicit factors and contextual characteristics. In practice, the performance of the learning algorithm can be expected to improve when more contextual information is incorporated. In this part, we introduce collective matrix factorization (CMF) [1] to enhance matrix factorization models that predict the quality of web services based merely on user-service quality pairs.

1) CMF with Service Category Information: We denote the service category data by $Y \in \mathbb{R}^{|\mathcal{S}| \times |\mathcal{C}|}$ (the service-category matrix), where $\mathcal{C}$ is the set of categories and each element $Y_{s,c}$ is a binary value denoting whether service $s$ belongs to category $c$. Earlier in this section, we defined the user-service matrix $X$, the user feature $P$, and the service feature $Q$. By using the service feature $Q$ as a shared factor in factorizing both the user-service matrix and the service-category matrix, the two matrices $X$ and $Y$ can be decomposed at the same time.

As in the traditional matrix factorization discussed above, we assume a latent low-dimensional space $\mathbb{R}^D$ in which a category feature $w_c$ is defined for each service category $c$. That is, $q_s$ and $w_c$ both belong to $\mathbb{R}^D$, and the estimated service category $\hat{Y}_{s,c}$ is defined by the inner product of these two vectors, namely,

$$\hat{Y}_{s,c} = q_s w_c^T. \tag{5}$$

Denoting the collection of $w_c$'s as a $|\mathcal{C}| \times D$ matrix $W$, the loss function of this collective learning problem is defined as

$$L_T(P, Q, W) = \alpha\|X - PQ^T\|^2 + \beta\|Y - QW^T\|^2 + \rho(\|P\|^2 + \|W\|^2 + \|Q\|^2), \tag{6}$$

where $\alpha, \beta \in [0, 1]$ weight the relative importance of service quality and categories, with $\alpha + \beta = 1$. This model is referred to as “CMF-T” in the rest of this paper.
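To make the role of the shared factor in Eq. (6) concrete, the sketch below evaluates the CMF-T loss on toy matrices: the same Q appears in both residual terms. The helper names are hypothetical, and X is treated as fully observed purely to keep the example short.

```python
def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def residual_sq(M, A, B):
    """Squared Frobenius norm of M - A B^T for dense M given as a list of rows."""
    return sum((M[i][j] - dot(A[i], B[j])) ** 2
               for i in range(len(M)) for j in range(len(M[0])))


def frob_sq(A):
    """Squared Frobenius norm of a matrix given as a list of rows."""
    return sum(v * v for row in A for v in row)


def cmf_t_loss(X, Y, P, Q, W, alpha, beta, rho):
    """Eq. (6): Q is shared between the user-service and service-category terms."""
    return (alpha * residual_sq(X, P, Q)
            + beta * residual_sq(Y, Q, W)
            + rho * (frob_sq(P) + frob_sq(W) + frob_sq(Q)))


# Toy case (hypothetical): identity factors reproduce X and Y exactly,
# so only the regularizer contributes to the loss.
I2 = [[1.0, 0.0], [0.0, 1.0]]
loss = cmf_t_loss(X=I2, Y=I2, P=I2, Q=I2, W=I2, alpha=0.6, beta=0.4, rho=0.5)
```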

2) CMF with User and Service Location Information: Introducing the service category can help the model learn more precise latent relations between users and services. The location context of web service invocations can be incorporated into the model in the same way, so that this contextual feature is also learned through the collective model.

We denote by $\mathit{LU} \in \mathbb{R}^{|\mathcal{U}| \times |\mathcal{L}|}$ the user-location matrix, which records where each service user is located; here $\mathcal{U}$ is the set of service invokers and $\mathcal{L}$ is the set of all location contexts of service users. The vector $v_l \in \mathbb{R}^D$ denotes the latent feature of location $l$, and the estimated user location $\widehat{\mathit{LU}}_{u,l}$ is defined by the inner product of the user feature vector $p_u$ and the location feature vector $v_l$:

$$\widehat{\mathit{LU}}_{u,l} = p_u v_l^T. \tag{7}$$

Similarly, we consider the location information of service providers. We denote by $\mathit{LS} \in \mathbb{R}^{|\mathcal{S}| \times |\mathcal{E}|}$ the service-location matrix, which records where each service is hosted; here $\mathcal{S}$ is the set of services and $\mathcal{E}$ is the set of all location contexts of service providers. The vector $o_e \in \mathbb{R}^D$ denotes the latent feature of the provider location $e$, and the estimated provider location $\widehat{\mathit{LS}}_{s,e}$ is defined by the inner product of the service feature vector $q_s$ and the location feature vector $o_e$:

$$\widehat{\mathit{LS}}_{s,e} = q_s o_e^T. \tag{8}$$

Denoting the collection of $v_l$'s as a $|\mathcal{L}| \times D$ matrix $V$ and the collection of $o_e$'s as a $|\mathcal{E}| \times D$ matrix $O$, the objective function of this model can be defined as

$$L_L(P, Q, V, O) = \alpha\|X - PQ^T\|^2 + \gamma\|\mathit{LU} - PV^T\|^2 + \delta\|\mathit{LS} - QO^T\|^2 + \rho(\|P\|^2 + \|Q\|^2 + \|V\|^2 + \|O\|^2), \tag{9}$$

where $\alpha + \gamma + \delta = 1$.

This model is referred to as “CMF-L” in the remainder of this paper.

3) CMF with Category and Location Information: In this model, we incorporate all the information mentioned above; that is, we factorize the user-service matrix, the user-location matrix, the service-category matrix, and the service-location matrix at the same time. The objective function can be defined as

$$L_{TL}(P, Q, W, V, O) = \alpha\|X - PQ^T\|^2 + \beta\|Y - QW^T\|^2 + \gamma\|\mathit{LU} - PV^T\|^2 + \delta\|\mathit{LS} - QO^T\|^2 + \rho(\|P\|^2 + \|Q\|^2 + \|W\|^2 + \|V\|^2 + \|O\|^2), \tag{10}$$

where $\alpha + \beta + \gamma + \delta = 1$. This model is referred to as “CMF-TL” in this paper.

At this point, we have not only arrived at a sensible and well-defined notion of the quality of web services but have also translated the problem of QoS prediction into an optimization problem. This optimization problem can be solved by the stochastic gradient descent algorithm [21].

D. Algorithms

Overall, we take a stochastic gradient-based approach to minimizing the objective functions. For each latent vector $b$ in an objective function $L$, the update rule is

$$b \leftarrow b - \lambda \frac{\partial L}{\partial b}, \tag{11}$$

where $\lambda$ is the step size.
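The per-entry updates listed in Algorithms 1 and 2 can be related to Eq. (11) by restricting the objective to a single observed entry. The short derivation below is a sketch under two assumptions that the paper does not state explicitly: the regularizer on $p_u$ is applied once per observed entry, and the factor 2 of the gradient is absorbed into the step size.

```latex
% Part of the CMF-T objective that involves p_u through one observed entry (u, s):
\ell_{us}(p_u) = \alpha \, (X_{us} - p_u q_s^T)^2 + \rho \, \|p_u\|^2

% Its gradient with respect to p_u:
\frac{\partial \ell_{us}}{\partial p_u}
  = -2\alpha \, (X_{us} - p_u q_s^T) \, q_s + 2\rho \, p_u

% Substituting into the update rule (11) with step size \lambda / 2:
p_u \leftarrow p_u + \lambda \left[ \alpha \, (X_{us} - p_u q_s^T) \, q_s - \rho \, p_u \right]
```

The same steps with the roles of $p_u$ and $q_s$ exchanged give the update for $q_s$, and replacing $\alpha$, $X$ by $\beta$, $Y$ (or $\gamma$, $\mathit{LU}$; $\delta$, $\mathit{LS}$) gives the updates for the contextual factors.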

For the first collective matrix factorization model, CMF-T, which simultaneously factors the matrices $X$ and $Y$, the parameters to be updated are the user feature $P$, the service feature $Q$, and the category feature $W$. The partial derivatives of the objective function $L_T$ with respect to these parameters are as follows:

$$\frac{\partial L_T}{\partial p_u} = 2\alpha (X_{u:} - p_u Q^T)(-Q) + 2\rho p_u \tag{12}$$

$$\frac{\partial L_T}{\partial w_c} = 2\beta (Y_{:c} - Q w_c^T)^T(-Q) + 2\rho w_c \tag{13}$$

$$\frac{\partial L_T}{\partial q_s} = 2\alpha (X_{:s} - P q_s^T)^T(-P) + 2\beta (Y_{s:} - q_s W^T)(-W) + 2\rho q_s \tag{14}$$

Here $X_{u:}$ denotes the row vector of $X$ corresponding to user $u$, and $Y_{:c}$ denotes the column vector of $Y$ corresponding to category $c$. Let $P_{u:}$ denote the row vector $p_u$ and let $Q_{s:}$ denote the row vector $q_s$; both are length-$D$ vectors. Algorithm 1 shows the stochastic gradient algorithm used to estimate the parameters.

initialize P = rand(), Q = rand(), W = rand()
repeat
    for each (u, s) for which X_us is observed do
        P_u: ← P_u: + λ[α(X_us − P_u: Q_s:^T) Q_s: − ρ P_u:]
        Q_s: ← Q_s: + λ[α(X_us − P_u: Q_s:^T) P_u: − ρ Q_s:]
    end for
    for each (s, c) for which Y_sc is nonzero do
        Q_s: ← Q_s: + λ[β(Y_sc − Q_s: W_c:^T) W_c: − ρ Q_s:]
        W_c: ← W_c: + λ[β(Y_sc − Q_s: W_c:^T) Q_s: − ρ W_c:]
    end for
    record RMSE(P, Q, testX)
    if λ > minStep then
        λ ← 0.99 λ
    end if
until maxIteration is reached or the convergence criteria are met

Algorithm 1: CMF-T: simultaneously factorizing matrices X and Y.
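A runnable Python sketch of Algorithm 1 on toy data follows. Everything beyond the update rules themselves is an assumption made for illustration: the names `sgd_pair` and `train_cmf_t`, the hyperparameter values, the toy observations, the fixed iteration count in place of a convergence test, and the choice to update both rows from their pre-update values. Recording RMSE on a test set is omitted for brevity.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def sgd_pair(A, i, B, j, target, weight, lam, rho):
    """One stochastic update of rows A[i] and B[j] toward target ~ A[i] . B[j].

    Both rows are updated from their pre-update values (an assumption;
    the listing could also be read as strictly sequential).
    """
    a, b = A[i], B[j]
    err = target - dot(a, b)
    A[i] = [x + lam * (weight * err * y - rho * x) for x, y in zip(a, b)]
    B[j] = [y + lam * (weight * err * x - rho * y) for x, y in zip(a, b)]


def train_cmf_t(X_obs, Y_obs, n_users, n_services, n_cats, D=2,
                alpha=0.6, beta=0.4, rho=0.01, lam=0.05,
                min_step=0.001, max_iter=200, seed=0):
    """X_obs maps (u, s) to observed QoS; Y_obs maps (s, c) to 1 for memberships."""
    rng = random.Random(seed)

    def init(n):
        return [[rng.random() for _ in range(D)] for _ in range(n)]

    P, Q, W = init(n_users), init(n_services), init(n_cats)
    for _ in range(max_iter):
        for (u, s), x in X_obs.items():       # user-service relation
            sgd_pair(P, u, Q, s, x, alpha, lam, rho)
        for (s, c), y in Y_obs.items():       # service-category relation
            sgd_pair(Q, s, W, c, y, beta, lam, rho)
        if lam > min_step:
            lam *= 0.99                       # dynamic step size
    return P, Q, W


def squared_error(X_obs, P, Q):
    """Sum of squared residuals over the observed user-service entries."""
    return sum((x - dot(P[u], Q[s])) ** 2 for (u, s), x in X_obs.items())


# Toy data (hypothetical): 2 users, 2 services, 2 categories.
X_obs = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 1.5}
Y_obs = {(0, 0): 1.0, (1, 1): 1.0}
P0, Q0, _ = train_cmf_t(X_obs, Y_obs, 2, 2, 2, max_iter=0)    # initial factors
P1, Q1, _ = train_cmf_t(X_obs, Y_obs, 2, 2, 2, max_iter=200)  # trained factors
```

Comparing `squared_error(X_obs, P0, Q0)` with `squared_error(X_obs, P1, Q1)` shows how far training moves the factors from their random initialization on this toy problem.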

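The second collective matrix factorization model, CMF-L, simultaneously factorizes the matrices $X$, $\mathit{LU}$, and $\mathit{LS}$. We compute the partial derivatives of the objective function $L_L$ with respect to the user factors $P$, service factors $Q$, user location factors $V$, and service location factors $O$, and then update the parameters according to Eq. (11):

$$\frac{\partial L_L}{\partial p_u} = 2\alpha (X_{u:} - p_u Q^T)(-Q) + 2\gamma (\mathit{LU}_{u:} - p_u V^T)(-V) + 2\rho p_u \tag{15}$$

$$\frac{\partial L_L}{\partial q_s} = 2\alpha (X_{:s} - P q_s^T)^T(-P) + 2\delta (\mathit{LS}_{s:} - q_s O^T)(-O) + 2\rho q_s \tag{16}$$

$$\frac{\partial L_L}{\partial v_l} = 2\gamma (\mathit{LU}_{:l} - P v_l^T)^T(-P) + 2\rho v_l \tag{17}$$

$$\frac{\partial L_L}{\partial o_e} = 2\delta (\mathit{LS}_{:e} - Q o_e^T)^T(-Q) + 2\rho o_e \tag{18}$$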

initialize P = rand(), Q = rand(), V = rand(), O = rand()
repeat
    for each (u, s) for which X_us is observed do
        P_u: ← P_u: + λ[α(X_us − P_u: Q_s:^T) Q_s: − ρ P_u:]
        Q_s: ← Q_s: + λ[α(X_us − P_u: Q_s:^T) P_u: − ρ Q_s:]
    end for
    for each (u, l) for which LU_ul is nonzero do
        P_u: ← P_u: + λ[γ(LU_ul − P_u: V_l:^T) V_l: − ρ P_u:]
        V_l: ← V_l: + λ[γ(LU_ul − P_u: V_l:^T) P_u: − ρ V_l:]
    end for
    for each (s, e) for which LS_se is nonzero do
        Q_s: ← Q_s: + λ[δ(LS_se − Q_s: O_e:^T) O_e: − ρ Q_s:]
        O_e: ← O_e: + λ[δ(LS_se − Q_s: O_e:^T) Q_s: − ρ O_e:]
    end for
    record RMSE(P, Q, testX)
    if λ > minStep then
        λ ← 0.99 λ
    end if
until maxIteration is reached or the convergence criteria are met

Algorithm 2: CMF-L: simultaneously factorizing matrices X, LU and LS.

The corresponding pseudocode is shown in Algorithm 2.

The third collective matrix factorization model proposed in the previous section, CMF-TL, simultaneously factors the matrices X, Y, LU, and LS. Because it is a combination of CMF-T and CMF-L, its gradient descent algorithm is similar to those two algorithms, so we do not include the update rules here; the algorithm listing is also omitted due to the length limit. For all of these algorithms, we use a dynamic step size, updated after each iteration, to make the minimization efficient. We note that three contextual characteristics (service category, user location, and service location) are considered in this paper and that these models can easily be extended to incorporate other contextual information.
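Since the CMF-TL listing is omitted, one way to picture it is as a loop over weighted relations that share factor matrices. The sketch below is an assumption built by analogy with Algorithms 1 and 2, not the authors' implementation; the function `train_cmf`, the relation tuples, the factor names, the hyperparameters and the toy data are all hypothetical. Adding another context would amount to appending one more relation.

```python
import random


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def train_cmf(relations, sizes, D=2, rho=0.01, lam=0.05,
              min_step=0.001, max_iter=200, seed=0):
    """Generic collective factorization by stochastic gradient steps.

    relations: list of (observations, row_factor, col_factor, weight), where
               observations maps (i, j) to a target value.
    sizes:     dict mapping each factor name to its number of rows.
    """
    rng = random.Random(seed)
    F = {name: [[rng.random() for _ in range(D)] for _ in range(n)]
         for name, n in sizes.items()}
    for _ in range(max_iter):
        for obs, row_name, col_name, w in relations:
            A, B = F[row_name], F[col_name]
            for (i, j), target in obs.items():
                a, b = A[i], B[j]
                err = target - dot(a, b)
                A[i] = [x + lam * (w * err * y - rho * x) for x, y in zip(a, b)]
                B[j] = [y + lam * (w * err * x - rho * y) for x, y in zip(a, b)]
        if lam > min_step:
            lam *= 0.99  # dynamic step size, as in Algorithms 1 and 2
    return F


# Toy CMF-TL setup (hypothetical data): weights follow alpha + beta + gamma + delta = 1.
X_obs = {(0, 0): 1.0, (0, 1): 2.0, (1, 0): 1.5}   # user-service QoS
Y_obs = {(0, 0): 1.0, (1, 1): 1.0}                # service-category membership
LU_obs = {(0, 0): 1.0, (1, 1): 1.0}               # user-location membership
LS_obs = {(0, 1): 1.0, (1, 0): 1.0}               # service-location membership
relations = [
    (X_obs, "P", "Q", 0.5),
    (Y_obs, "Q", "W", 0.3),
    (LU_obs, "P", "V", 0.1),
    (LS_obs, "Q", "O", 0.1),
]
sizes = {"P": 2, "Q": 2, "W": 2, "V": 2, "O": 2}
F0 = train_cmf(relations, sizes, max_iter=0)      # initial factors
F1 = train_cmf(relations, sizes, max_iter=200)    # trained factors


def x_error(F):
    """Sum of squared residuals on the observed user-service entries."""
    return sum((x - dot(F["P"][u], F["Q"][s])) ** 2 for (u, s), x in X_obs.items())
```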

V. EXPERIMENTAL RESULTS

In this section, we introduce the metric and the prediction

results in our experiments.

A. Dataset and Evaluation Metric

We downloaded the user-service invocation records (WSDream-QoSDataset2⁴) as the data set for our experimental study. We randomly chose 2,502 services and classified them into 8 categories by analyzing their WSDL files. Table I lists the number of services in each category.

⁴ http://www.wsdream.net/dataset.html

Table I
STATISTICS ON THE NUMBER OF SERVICES IN 8 DIFFERENT CATEGORIES.

Name            # of services   Percentage (%)
E-commerce      755             30.18
Media           124             4.96
Schedule        146             5.84
Financial       90              3.60
Geographical    74              2.96
Government      24              0.96
Network         1125            44.96
Communication   164             6.55

There are 63 service locations (service provider countries) and 31 service invoker locations (user countries) in the data set. Table II lists the countries that host at least 10 services and the number of services hosted in each. From these two observations, we can see the need to introduce extra contextual characteristics, since the quality may be affected by the service category and by the locations of both users and providers.

The user-service, user-location, service-category, and service-location matrices are generated from this data set. To simulate sparseness in the user-service matrix, we randomly remove QoS values from the training matrix and the testing matrix, which yields sparse matrices with data densities of 10%, 30%, and 50%.
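The density settings can be simulated with a random split of the observed entries, as in the sketch below. The paper does not detail its exact procedure, so the function name, the use of the held-out entries as the test set, and the fixed seed are assumptions.

```python
import random


def sample_density(entries, density, seed=0):
    """Keep a random fraction `density` of the entries for training.

    entries maps (u, s) to an observed QoS value. The remaining entries are
    returned as a held-out set (an assumption about how testing data is formed).
    """
    rng = random.Random(seed)
    keys = sorted(entries)          # fixed order so the split is reproducible
    rng.shuffle(keys)
    n_train = round(density * len(keys))
    train = {k: entries[k] for k in keys[:n_train]}
    held_out = {k: entries[k] for k in keys[n_train:]}
    return train, held_out


# Toy matrix (hypothetical): 10 users x 10 services, fully observed.
full = {(u, s): float(u + s) for u in range(10) for s in range(10)}
train_10, test_10 = sample_density(full, 0.10)
train_50, test_50 = sample_density(full, 0.50)
```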

B. Evaluation Measures

To evaluate the performance of our algorithms, we use the Root Mean Square Error (RMSE) to compare them with the basic matrix factorization model and the biased matrix factorization model. RMSE is a statistical accuracy metric widely used to measure prediction quality in collaborative filtering methods.

RMSE is defined by the following equation:

$$\mathrm{RMSE} = \sqrt{\frac{\sum_{X_{u,s} \in T} (X_{u,s} - \hat{X}_{u,s})^2}{|T|}}, \tag{19}$$

where $X_{u,s}$ is the observed QoS of service $s$ invoked by user $u$, $\hat{X}_{u,s}$ is the corresponding predicted QoS value, and $T$ is the testing set.
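Eq. (19) translates directly into a few lines of Python. In this sketch the test set is a dictionary of observed values and the predictor is any callable; both representations are assumptions chosen for brevity.

```python
import math


def rmse(test_set, predict):
    """Root mean square error of Eq. (19).

    test_set maps (u, s) to the observed QoS value;
    predict is a callable returning the estimate for (u, s).
    """
    total = sum((x - predict(u, s)) ** 2 for (u, s), x in test_set.items())
    return math.sqrt(total / len(test_set))


# Toy check (hypothetical values): a constant predictor of 2.0 is off by 1 on both entries.
toy_test = {(0, 0): 1.0, (0, 1): 3.0}
toy_rmse = rmse(toy_test, lambda u, s: 2.0)
```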

C. Results and Analysis

1) Impact of Parameters: In this part, we vary the number of latent dimensions $D$. Figure 5 shows the performance of CMF-T when predicting the response time at 10% density for different values of $D$. The figure shows that the RMSE is best when $D$ is set to 50. A similar trend appears in Figure 6, which illustrates the impact of the number of latent features $D$ on CMF-T when predicting the throughput of services; here the best performance is achieved when $D$ is 200. A reasonable number of dimensions can thus be determined experimentally when applying our method in a different environment.

Figure 5. RMSE performance of CMF-T over different numbers of latent dimensions on the response time data with 10% data density. [Plot not reproduced; horizontal axis: latent dimensions, ticks 20 to 140; vertical axis: RMSE, 1.23 to 1.31.]

Figure 6. RMSE performance of CMF-T over different numbers of latent dimensions on the throughput data with 10% data density. [Plot not reproduced; horizontal axis: latent dimensions, ticks 50 to 400; vertical axis: RMSE, 74.5 to 76.]

2) RMSE Performance Comparison: To confirm the improvement gained by incorporating contextual information, we compare prediction performance with existing QoS prediction methods based on matrix factorization: MF, which predicts the missing QoS values by factorizing the user-service matrix, and MFB, which extends MF by adding biases (as shown in Eq. (4)). The three models proposed in this study, CMF-T, CMF-L, and CMF-TL, are also compared with one another.

For each model, we examine the performance under various parameter settings and choose the setting that performs best. For example, in the experiments on predicting response time, the parameter settings are:

• for the CMF-T model, α = 0.6, β = 0.4;

• for the CMF-L model, α = 0.6, γ = 0.2, δ = 0.2;

• for the CMF-TL model, α = 0.5, β = 0.3, γ = 0.1, δ = 0.1.
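The paper does not say how the candidate weight settings were generated. One simple possibility, assumed here purely for illustration, is a grid of positive multiples of 0.1 that sum to 1, which contains all three reported settings. The function name `weight_grid` is hypothetical.

```python
from itertools import product


def weight_grid(n_weights, steps=10):
    """All tuples of n_weights positive multiples of 1/steps that sum to 1."""
    grid = []
    for parts in product(range(1, steps), repeat=n_weights):
        if sum(parts) == steps:          # integer check avoids float rounding issues
            grid.append(tuple(k / steps for k in parts))
    return grid


cmf_t_grid = weight_grid(2)    # candidate (alpha, beta) pairs
cmf_l_grid = weight_grid(3)    # candidate (alpha, gamma, delta) triples
cmf_tl_grid = weight_grid(4)   # candidate (alpha, beta, gamma, delta) tuples
```

Each candidate tuple would then be used to train the corresponding model, keeping the setting with the lowest validation RMSE. The grid grows quickly with the number of weights, which is consistent with the remark that CMF-TL has a larger parameter space to search.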

Table III and Table IV show the comparison results of

different approaches to the prediction of throughput and

437

Page 7: [IEEE 2014 IEEE International Conference on Services Computing (SCC) - Anchorage, AK, USA (2014.6.27-2014.7.2)] 2014 IEEE International Conference on Services Computing - Quality of

Table IISTATISTICS ON SERVICES IN SOME COUNTRIES.

Country Name # 1 % 2 Country Name # % Country Name # % Country Name # %

Argentina 13 0.52 Czech 40 1.60 Japan 14 0.56 Spain 43 1.72Australia 49 1.96 Denmark 63 2.52 Netherlands 53 2.12 Sweden 41 1.64Austria 30 1.20 Finland 12 0.48 New Zealand 20 0.80 Switzerland 33 1.32Belgium 25 1.00 France 59 2.36 Norway 16 0.64 Turkey 17 0.68Brazil 20 0.80 Germany 115 4.60 Poland 13 0.52 United Kingdom 209 8.35Canada 53 2.12 Iceland 12 0.48 Republic of Korea 24 0.96 United States 1146 45.80China 185 7.39 Israel 11 0.44 Russian Federation 17 0.68CostaRica 10 0.40 Italy 38 1.52 Singapore 13 0.521 The number of services in each country.2 The percentage of services in per country over all services.

Table IIIRMSE PERFORMANCE COMPARISON ON THROUGHPUT. THE UNIT IS

KBPS.

10% 30% 50%

MF 86.47 86.13 73.50MFB 86.39 87.01 73.30CMF-T 76.85 64.63 62.03CMF-L 75.51 64.80 61.75CMF-TL 75.83 64.48 62.08

Table IVRMSE PERFORMANCE COMPARISON ON RESPONSE TIME. THE UNIT IS

SECOND.

10% 30% 50%

MF 1.294 1.128 1.094MFB 1.255 1.113 1.078CMF-T 1.258 1.097 1.074CMF-L 1.248 1.091 1.070CMF-TL 1.251 1.089 1.071

response time of service invocations, respectively. The experimental results show that our three proposed models achieve a lower RMSE than the baselines in most situations. This indicates that incorporating the category and location information can improve the performance of QoS prediction. For throughput in particular, CMF-T, CMF-L, and CMF-TL all improve on MF and MFB by more than 10% in terms of RMSE. Table III and Table IV also show that the compared models generally perform better as the sparseness of the training set is reduced; the one exception is MFB on throughput, whose RMSE rises slightly from the 10% to the 30% setting. Incorporating either the service category or the location information outperforms both traditional matrix factorization and biased matrix factorization in every setting except one: for response time at 10%, CMF-T (1.258) is marginally behind MFB (1.255). However, combining both kinds of contextual information does not always achieve better performance than incorporating only one context. This may be because the parameter space is much larger for CMF-TL and the best-fit parameters were not discovered.
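The "more than 10%" statement for throughput can be checked directly from Table III. The short sketch below is only an illustration: it takes the improvement to be the relative RMSE reduction against the better of the two baselines at each setting, which is an assumption about how the figure was computed, and the names `relative_improvement` and `worst_case_improvement` are hypothetical.

```python
# RMSE values copied from Table III (throughput, kbps); columns are 10%, 30%, 50%.
THROUGHPUT_RMSE = {
    "MF":     (86.47, 86.13, 73.50),
    "MFB":    (86.39, 87.01, 73.30),
    "CMF-T":  (76.85, 64.63, 62.03),
    "CMF-L":  (75.51, 64.80, 61.75),
    "CMF-TL": (75.83, 64.48, 62.08),
}
BASELINES = ("MF", "MFB")
PROPOSED = ("CMF-T", "CMF-L", "CMF-TL")


def relative_improvement(baseline_rmse, model_rmse):
    """Fraction by which the model reduces the baseline RMSE."""
    return (baseline_rmse - model_rmse) / baseline_rmse


def worst_case_improvement(table):
    """Smallest gain of any proposed model over the best baseline, across columns."""
    n_columns = len(next(iter(table.values())))
    gains = []
    for i in range(n_columns):
        best_baseline = min(table[b][i] for b in BASELINES)
        for model in PROPOSED:
            gains.append(relative_improvement(best_baseline, table[model][i]))
    return min(gains)


if __name__ == "__main__":
    print(f"{worst_case_improvement(THROUGHPUT_RMSE):.1%}")
```

Even in the least favourable case (CMF-T against MFB at the 10% setting) the reduction is about 11%, so the claim holds for every column of Table III under this reading.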

In summary, we have shown the effectiveness of our models: CMF-T, CMF-L, and CMF-TL outperform the compared QoS prediction approaches, with the largest margin on throughput. In our future research we will investigate the problem of selecting parameters from a large space and provide a more efficient parameter selection strategy.

VI. CONCLUSION AND FUTURE WORK

In this paper we propose context-aware prediction models for the quality of web services. One of the main contributions is to incorporate contextual features of service users and service providers to predict QoS values more accurately. Specifically, we exploit the collective matrix factorization model to design context-aware predictors that improve on existing QoS prediction approaches based solely on traditional matrix factorization or its variants. The experimental results confirm an increase in prediction accuracy in terms of RMSE.

In this study, to show the advantage of the collective matrix factorization model, we consider the service location, user location, and service category as the contextual characteristics. Many other QoS-related properties can be collected and considered in the prediction model in the future. In addition, as discussed in the experiment section, strategies for selecting parameters from a large space should also be investigated in our future work.

VII. ACKNOWLEDGEMENT

This work was supported partly by the National Natural Science Foundation of China (No. 61300070, No. 61103031), partly by the China 863 program (No. 2013AA01A213, No. 2012AA011203), the China 973 program (No. 2014CB340305), partly by the State Key Lab for Software Development Environment (SKLSDE-2013ZX-16), partly by A Foundation for the Author of National Excellent Doctoral Dissertation of PR China (No. 201159), and partly by the Program for New Century Excellent Talents in University.

REFERENCES

[1] A. P. Singh and G. J. Gordon, “Relational learning via collective matrix factorization,” in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’08. New York, NY, USA: ACM, 2008, pp. 650–658. [Online]. Available: http://doi.acm.org/10.1145/1401890.1401969


[2] L. Shao, J. Zhang, Y. Wei, J. Zhao, B. Xie, and H. Mei, “Personalized QoS prediction for web services via collaborative filtering,” in Web Services, 2007. ICWS 2007. IEEE International Conference on. IEEE, 2007, pp. 439–446.

[3] Q. Zhang, C. Ding, and C. Chi, “Collaborative filtering based service ranking using invocation histories,” in Web Services (ICWS), 2011 IEEE International Conference on. IEEE, 2011, pp. 195–202.

[4] Z. Zheng, H. Ma, M. R. Lyu, and I. King, “WSRec: A collaborative filtering based web service recommender system,” in Web Services, 2009. ICWS 2009. IEEE International Conference on. IEEE, 2009, pp. 437–444.

[5] Z. Zheng, Y. Zhang, and M. R. Lyu, “Distributed QoS evaluation for real-world web services,” in Web Services (ICWS), 2010 IEEE International Conference on. IEEE, 2010, pp. 83–90.

[6] Y. Zhang, Z. Zheng, and M. R. Lyu, “Exploring latent features for memory-based QoS prediction in cloud computing,” in Reliable Distributed Systems (SRDS), 2011 30th IEEE Symposium on. IEEE, 2011, pp. 1–10.

[7] J. Ge, Z. Chen, J. Peng, T. Li, and L. Zhang, “Web service recommendation based on QoS prediction method,” in Cognitive Informatics (ICCI), 2010 9th IEEE International Conference on. IEEE, 2010, pp. 109–112.

[8] M. Zhang, X. Liu, R. Zhang, and H. Sun, “A web service recommendation approach based on QoS prediction using fuzzy clustering,” in Services Computing (SCC), 2012 IEEE Ninth International Conference on. IEEE, 2012, pp. 138–145.

[9] A. Schmidt, M. Beigl, and H.-W. Gellersen, “There is more to context than location,” Computers & Graphics, vol. 23, no. 6, pp. 893–901, 1999.

[10] G. Adomavicius and A. Tuzhilin, “Context-aware recommender systems,” in Recommender Systems Handbook. Springer, 2011, pp. 217–253.

[11] B. Hidasi and D. Tikk, “Fast ALS-based tensor factorization for context-aware recommendation from implicit feedback,” in Machine Learning and Knowledge Discovery in Databases. Springer, 2012, pp. 67–82.

[12] H. Wermser, A. Rettinger, and V. Tresp, “Modeling and learning context-aware recommendation scenarios using tensor decomposition,” in Advances in Social Networks Analysis and Mining (ASONAM), 2011 International Conference on. IEEE, 2011, pp. 137–144.

[13] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, A. Hanjalic, and N. Oliver, “TFMAP: Optimizing MAP for top-n context-aware recommendation,” in Proceedings of the 35th International ACM SIGIR Conference on Research and Development in Information Retrieval. ACM, 2012, pp. 155–164.

[14] A. Karatzoglou, L. Baltrunas, K. Church, and M. Bohmer, “Climbing the app wall: enabling mobile app discovery through context-aware recommendations,” in Proceedings of the 21st ACM International Conference on Information and Knowledge Management, ser. CIKM ’12. New York, NY, USA: ACM, 2012, pp. 2527–2530. [Online]. Available: http://doi.acm.org/10.1145/2396761.2398683

[15] Y. Koren, R. Bell, and C. Volinsky, “Matrix factorization techniques for recommender systems,” Computer, vol. 42, no. 8, pp. 30–37, Aug. 2009.

[16] Y. Koren, “Factorization meets the neighborhood: a multifaceted collaborative filtering model,” in Proceedings of the 14th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’08. New York, NY, USA: ACM, 2008, pp. 426–434. [Online]. Available: http://doi.acm.org/10.1145/1401890.1401944

[17] Y. Koren, “Collaborative filtering with temporal dynamics,” Commun. ACM, vol. 53, no. 4, pp. 89–97, Apr. 2010. [Online]. Available: http://doi.acm.org/10.1145/1721654.1721677

[18] R. Salakhutdinov and A. Mnih, “Probabilistic matrix factorization,” in NIPS, 2007.

[19] D. Agarwal and B.-C. Chen, “Regression-based latent factor models,” in Proceedings of the 15th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, ser. KDD ’09. New York, NY, USA: ACM, 2009, pp. 19–28. [Online]. Available: http://doi.acm.org/10.1145/1557019.1557029

[20] Y. Koren, “Factorization meets the neighborhood: a multifaceted collaborative filtering model,” in KDD, 2008, pp. 426–434.

[21] L. Bottou, “Stochastic learning,” in Advanced Lectures on Machine Learning, ser. Lecture Notes in Artificial Intelligence, LNAI 3176, O. Bousquet and U. von Luxburg, Eds. Berlin: Springer Verlag, 2004, pp. 146–168. [Online]. Available: http://leon.bottou.org/papers/bottou-mlss-2004
