Hindawi
Mobile Information Systems
Volume 2019, Article ID 6423805, 11 pages
https://doi.org/10.1155/2019/6423805


Research Article

Mobile Service Recommendation via Combining Enhanced Hierarchical Dirichlet Process and Factorization Machines

Buqing Cao,1 Bing Li,2 Jianxun Liu,1 Mingdong Tang,3 Yizhi Liu,1 and Yanxinwen Li4

1 School of Computer Science and Engineering, Hunan University of Science and Technology, Xiangtan, China
2 School of Computer Science, Wuhan University, Wuhan, China
3 School of Information Science and Technology, Guangdong University of Foreign Studies, Guangzhou, China
4 Department of Computer Science, New Jersey Institute of Technology, Newark, NJ, USA

Correspondence should be addressed to Buqing Cao; buqingcao@gmail.com

Received 4 November 2018; Revised 30 January 2019; Accepted 5 March 2019; Published 25 March 2019

Academic Editor: Paolo Bellavista

Copyright © 2019 Buqing Cao et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Mashup is becoming a promising software development method in the mobile service computing environment, enabling software developers to compose existing mobile services into new or value-added composite RESTful web applications. Because the number of mobile services on the Internet is growing rapidly, it is difficult to find the most suitable services for building a user-desired Mashup application. In this paper, we integrate a word-embedding-enhanced hierarchical Dirichlet process with factorization machines to recommend mobile services for building high-quality Mashup applications. The method first extends the description documents of Mashup applications and mobile services by using the Word2vec tool and derives latent topics from the extended description documents by exploiting the hierarchical Dirichlet process. Second, the factorization machine is trained on these latent topics to predict the probability that a mobile service is invoked by a Mashup and to recommend high-quality mobile services for Mashup development. Finally, the performance of the proposed method is comprehensively evaluated. The experimental results indicate that, compared with existing recommendation methods, the proposed method achieves significant improvements in MAE and RMSE.

1. Introduction

Service computing offers an exciting paradigm for service provision and consumption. The field is now embracing new opportunities in the mobile Internet era, which is characterized by ubiquitous wireless connectivity and powerful smart devices that let users consume or provide services anytime and anywhere [1]. As emerging techniques such as cloud and mobile computing become more prevalent, the way we provide and consume services is ever-changing [1]. Owing to the rapid development and effective utilization of mobile computing technologies, services are no longer limited to traditional platforms and contexts. To some degree, traditional service computing is extended by delivering services through mobile techniques. That is to say, a service can be deployed on cloud servers or mobile devices and delivered over wireless networks. Mobile devices can play the roles of provider, broker, and consumer simultaneously. Mobile service computing (i.e., the combination of service computing and mobile computing) enables us to provide and access services anytime and anywhere, which greatly facilitates our life, work, and study [1]. Recently, Mashup technology has become a popular software development method in the mobile service computing environment. It allows software developers to compose existing mobile services to create novel, composite RESTful services [2]. A tremendous number of mobile services have been released by various service providers [3]. For instance, as of November 2018, there were more than 20,000 services on ProgrammableWeb. It is therefore a great challenge to find, among so many services, the most suitable ones for building the Mashup application that the user expects. To address this challenge, some researchers use service recommendation techniques to improve service discovery [4, 5]. Among them, topic model technologies such as latent Dirichlet allocation (LDA) [6] have been used to obtain the latent topics of Mashups and services and thereby improve the accuracy of recommendations [4, 5]. However, LDA requires the number of topics to be fixed in advance. Finding the optimal number requires repeated model training, which is very time-consuming. To address this problem, Teh et al. [7] proposed the hierarchical Dirichlet process (HDP) model, which derives the optimal number of topics from the data and thus saves time and cost. We use it to model and derive the topics of Mashups and services in order to achieve more accurate service recommendations. Moreover, topic training and modelling usually need a large-scale corpus, whereas the description documents of Mashups and services are usually short and their corpora are insufficient. In the field of information retrieval, some researchers exploit Word2vec [8] to expand short text into long text so that a topic model can effectively estimate the latent topics of the text for more accurate information searching. The Word2vec model, proposed by Google [8], can process a large-scale text corpus and generate word embedding vectors with high efficiency. In this paper, we exploit Word2vec to extend the description documents of Mashups and mobile services and build a dense word embedding representation for more accurate topic modelling.

Recently, matrix factorization has been widely applied in service recommendation [9, 10]. It usually decomposes the Mashup-service matrix into two lower-dimensional matrices by using the service invocations in historical Mashups. However, service recommendation based on matrix factorization depends on having enough records of historical Mashup-service interactions [10]. To solve this problem, additional information such as users' social relations [11] or location similarity [12] has been incorporated into matrix factorization to achieve better recommendation accuracy. However, matrix factorization is applicable only to a specific, single kind of input data and is not suited to general prediction tasks, so incorporating additional information can degrade recommendation performance. Rendle [13, 14] proposed factorization machines (FMs), a general predictor that works with any real-valued feature vector. FMs can be used for general prediction tasks and model all interactions between the input variables. They can therefore predict the probability that a service is invoked by a Mashup. In this paper, we combine word-embedding-enhanced HDP and FMs to recommend mobile services for Mashup development. The contributions of this paper are as follows:

(i) We use Word2vec to extend the description documents of Mashups and mobile services and build a dense word embedding representation for more accurate topic modelling. Based on the word embedding vectors, we exploit HDP to derive the latent topics from the extended description documents of Mashups and mobile services.

(ii) We employ FMs to train on the latent topics derived from the HDP and predict the probability that a mobile service is invoked by a Mashup. Additional valuable information, such as cooccurrence and popularity, is exploited to achieve high-quality mobile service recommendation for Mashup development.

(iii) We crawl a real Web service dataset from ProgrammableWeb and perform a series of experiments. The experimental results indicate that, compared with existing methods, the proposed method achieves significant improvements in MAE and RMSE.

The rest of this paper is organized as follows. Section 2 presents the proposed approach. Section 3 presents the experimental results. Section 4 reviews related work. Section 5 provides a discussion. Finally, we draw conclusions and discuss future work in Section 6.

2. Method Overview

This section consists of four subsections, which describe the overall framework, description extension, topic modelling, and mobile service recommendation of the proposed method in detail.

2.1. Overall Framework of Mobile Service Recommendation. The overall framework of our mobile service recommendation method is shown in Figure 1. It includes three main parts: description extension, topic modelling, and service recommendation. In the description extension part, we first extract the description documents of Mashups and mobile services from the Web service dataset and then use a Word2vec model trained on the English Wikipedia corpus as the extension source to obtain their extended words. The original description documents and the extended words are then preprocessed together as the input of the next step. In the topic modelling part, we use the HDP topic model to model the extended description documents of Mashups and mobile services and derive their latent topics. In the service recommendation part, when a user submits a Mashup requirement, our method applies the FM model to predict and recommend mobile services for that requirement.

2.2. Description Extension of Mashup and Mobile Service Based on Word2vec. The description documents of Mashups and mobile services are usually short: they contain relatively few words, and words cooccur too rarely. For example, according to our statistics, a service description document on ProgrammableWeb contains only 27.16 words on average. With so few words, it is difficult to derive latent topics effectively when the HDP topic model is trained on these documents. Extending the description documents is therefore necessary for topic modelling. Google released the Word2vec tool [8] in 2013 to express words as real-valued vectors. It is an open-source word embedding toolkit that, using ideas from deep learning, reduces the processing of document content to vector operations in a K-dimensional vector space [8]. The word embedding vectors capture the semantic and grammatical relations of words as well as the context information of the document, which yields a more accurate word representation. Wikipedia is recognized as the most comprehensive and authoritative online encyclopaedia on the Internet and offers a rich corpus, so we use the Wikipedia corpus as the extension source for the description documents of Mashups and mobile services. More concretely, we first use the Word2vec tool to train on the Wikipedia corpus and obtain its word embedding model. We then use the trained model to extend the description documents. That is, for each word $w_i$ in the preprocessed service description document $S_{text}$, the Top-N words most similar to $w_i$, denoted $T_{w_i} = (t_1, t_2, \ldots, t_N)$, are identified in the word embedding space of the Wikipedia corpus and used as the extended words. The extended service description document is denoted as $S_{ExText} = \{(w_1, T_{w_1}), (w_2, T_{w_2}), \ldots, (w_n, T_{w_n})\}$, where $n$ is the total number of words in the service description document and $N$ is the number of extended words per original word. In Section 3, we perform comparative experiments to determine the optimal number of extended words.
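The Top-N extension step can be illustrated with a minimal sketch. It assumes the word embeddings are already available as a plain dictionary; the function names `cosine` and `extend_description` and the two-dimensional toy vectors are hypothetical and stand in for a model trained on Wikipedia.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def extend_description(words, embeddings, top_n):
    """Pair every word w_i with its top_n most similar words T_{w_i}."""
    extended = []
    for w in words:
        if w not in embeddings:
            extended.append((w, []))  # out-of-vocabulary: nothing to add
            continue
        neighbours = sorted(
            (t for t in embeddings if t != w),
            key=lambda t: cosine(embeddings[w], embeddings[t]),
            reverse=True,
        )[:top_n]
        extended.append((w, neighbours))
    return extended

# Toy 2-dimensional embeddings (hypothetical values, not trained vectors).
toy_embeddings = {
    "earth": [1.0, 0.1],
    "planet": [0.9, 0.2],
    "mars": [0.8, 0.3],
    "google": [0.1, 1.0],
    "gmail": [0.2, 0.9],
}
s_ex_text = extend_description(["earth", "google"], toy_embeddings, top_n=2)
```

A real pipeline would query a trained embedding model instead of scanning a dictionary; the sketch only shows how the pairs $(w_i, T_{w_i})$ are assembled.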

The word embedding model of the Wikipedia corpus is trained with the CBOW model and the Negative Sampling algorithm. In the resulting vector space, words that are closer in meaning lie closer together. Such words have similar semantic and grammatical relations and can therefore be used to expand service description documents. The input of the CBOW model is the word vectors of the context words of a specific word, and its output is the word vector of that specific word. Let $w$ and $C_w$ denote a word extracted from the service description document dataset and its context words, respectively. The probability of predicting $w$ from its context (surrounding) words can be denoted as follows:

$$p(w \mid C_w) = \frac{1}{1 + e^{-v_{C_w} W_w}}, \tag{1}$$

where $W_w$ is the parameter of the hidden layer and softmax layer in the neural network and $v_{C_w}$ is the sum of the vectors of the words in $C_w$. The training objective of the CBOW model (a maximum likelihood estimation function) is defined as follows:

$$\mathrm{OBJ}_{\mathrm{CBOW}} = \arg\max_{V_w, W_1} \left( \prod_{(w, C_w) \in T} \log\big(p(w \mid C_w)\big) \times \prod_{(w, C_w) \notin T} \Big(1 - \log\big(p(w \mid C_w)\big)\Big) \right). \tag{2}$$
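As a small worked illustration of equation (1), the sketch below sums the context word vectors into $v_{C_w}$ and passes the inner product with $W_w$ through the logistic function. The vectors are hypothetical toy values, and the function name `cbow_probability` is an assumption made for this example.

```python
import math

def cbow_probability(context_vectors, w_param):
    """Equation (1): logistic function of the summed context vector times W_w."""
    dim = len(w_param)
    # v_{C_w}: element-wise sum of the context word vectors.
    v_cw = [sum(vec[d] for vec in context_vectors) for d in range(dim)]
    score = sum(a * b for a, b in zip(v_cw, w_param))
    return 1.0 / (1.0 + math.exp(-score))

context = [[0.2, -0.1], [0.4, 0.3]]  # hypothetical context word vectors
w_target = [1.0, 0.5]                # hypothetical parameter vector W_w
p = cbow_probability(context, w_target)
```

Here $v_{C_w} = (0.6, 0.2)$, so the score is $0.6 \cdot 1.0 + 0.2 \cdot 0.5 = 0.7$ and the probability is the logistic function evaluated at 0.7.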

Next, we apply the HDP model to the extended service description document $S_{ExText}$ to determine the optimal number of topics automatically and to estimate the latent topic distribution.

[Figure 1: Framework of mobile service recommendation for Mashup. Description extension: description documents of Mashups and mobile services are extracted from the Web service dataset, extended with words obtained from the Wikipedia corpus via Word2Vec, and preprocessed. Topic modelling: the HDP topic model takes the preprocessed documents as input. Service recommendation: given a Mashup requirement, the FM model predicts and recommends mobile services.]

2.3. Topic Modelling of Mashup and Mobile Service Using HDP. The hierarchical Dirichlet process (HDP) is a multilevel Dirichlet process (DP) mixture model and a nonparametric Bayesian approach to clustering grouped data [15]. Let $(\Theta, C)$ be a measurable space, $G_0$ a probability measure on that space, and $\alpha_0$ a positive real number. The Dirichlet process [16] is a distribution over random probability measures $G$ on $(\Theta, C)$: $G$ follows a Dirichlet process if, for any finite partition $(A_1, A_2, \ldots, A_r)$ of the measurable space $\Theta$, the random vector $(G(A_1), \ldots, G(A_r))$ follows a finite-dimensional Dirichlet distribution with parameters $(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r))$:

$$\big(G(A_1), \ldots, G(A_r)\big) \sim \mathrm{Dir}\big(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r)\big). \tag{3}$$

HDP is used to model the documents of mobile services and Mashups. Figure 2 is the HDP probabilistic graph, which shows the documents of mobile services or Mashups together with their words and latent topics. In Figure 2, $\gamma$ and $\alpha_0$ are concentration parameters, and $D$ is the entire Mashup document set, each document of which is denoted $d$. $H$ is the base probability measure, and $G_0$ is the global random probability measure. $G_d$ is the topic probability distribution generated for Mashup document $d$, $\beta_{dn}$ is the topic of the $n$-th word in $d$ drawn from $G_d$, and $w_{dn}$ is the word generated from $\beta_{dn}$.

The generative process of the HDP model is as follows:

(1) Sample the global probability distribution $G_0 \sim \mathrm{DP}(\gamma, H)$.

(2) For each document $d$ in $D$, sample a topic distribution $G_d \sim \mathrm{DP}(\alpha_0, G_0)$.

(3) For each word $n \in \{1, 2, \ldots, N\}$ in $d$:

(a) Sample the topic of the $n$-th word, $\beta_{dn} \sim G_d$.

(b) Sample a word from the multinomial distribution of the topic, $w_{dn} \sim \mathrm{Multi}(\beta_{dn})$.

To perform HDP sampling, a construction is needed from which the posterior distribution of the parameters can be inferred. The Chinese Restaurant Franchise (CRF) is a representative construction of the hierarchical Dirichlet process. Assume there are $J$ Chinese restaurants, the $j$-th of which contains $m_j$ tables $(\psi_{jt})_{t=1}^{m_j}$ and $N_j$ customers. All restaurants share one menu $v = (v_k)_{k=1}^{K}$, where $K$ is the number of dishes. Each customer chooses a table, and each table is served one dish from the menu common to all restaurants, so customers at the same table share a dish $v_k$. In this way, the customers, restaurants, and dishes of the CRF correspond to the words, documents, and topics of our HDP model, respectively. Let $\delta$ denote a point-mass probability measure. The topic $\theta_{ji}$ of the word $x_{ji}$ is treated as the $i$-th customer entering restaurant $j$, and the value $\psi_{jt}$ identifies the table at which the customer is seated. The customer sits at an existing table $\psi_{jt}$ with probability $n_{jt}/(i - 1 + \alpha_0)$ or at a new table $\psi_{jt^{new}}$ with probability $\alpha_0/(i - 1 + \alpha_0)$, where $n_{jt}$ is the number of customers at the $t$-th table of the $j$-th restaurant. A customer who chooses a new table orders for it either an existing dish $v_k$ with probability $m_k/(\sum_k m_k + \gamma)$, in proportion to the popularity of that dish, or a new dish $v_{k^{new}}$ with probability $\gamma/(\sum_k m_k + \gamma)$, where $m_k$ is the number of tables serving dish $v_k$. We thus have the conditional distributions

$$\theta_{ji} \mid \theta_{j1}, \theta_{j2}, \ldots, \theta_{j,i-1}, \alpha_0, G_0 \sim \sum_{t=1}^{m_j} \frac{n_{jt}}{i - 1 + \alpha_0} \delta_{\psi_{jt}} + \frac{\alpha_0}{i - 1 + \alpha_0} G_0, \tag{4}$$

$$\psi_{jt} \mid \psi_{11}, \psi_{12}, \ldots, \psi_{21}, \ldots, \psi_{j,t-1}, \gamma, H \sim \sum_{k=1}^{K} \frac{m_k}{\sum_k m_k + \gamma} \delta_{v_k} + \frac{\gamma}{\sum_k m_k + \gamma} H. \tag{5}$$

The CRF process of assigning tables to customers and dishes to tables corresponds to the process of assigning topics to words and clustering topics across documents in the Mashup document set, respectively. After the CRF construction, the HDP model uses Gibbs sampling to infer the posterior probability distribution of its parameters and thereby obtain the topic distribution of the entire Mashup document set.
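The seating rules of the Chinese Restaurant Franchise can be sketched as a short simulation. This is only a sample from the CRF prior, without any word likelihood, so it is not the Gibbs sampler used for inference; the function name `simulate_crf` and the parameter values are assumptions chosen for illustration.

```python
import random

def simulate_crf(customers_per_restaurant, alpha0, gamma, seed=0):
    """Sample table and dish assignments from the CRF prior (no likelihood term)."""
    rng = random.Random(seed)
    dish_tables = []   # m_k: number of tables (over all restaurants) serving dish k
    restaurants = []
    for n_customers in customers_per_restaurant:
        table_counts = []  # n_jt: customers seated at table t of this restaurant
        table_dish = []    # index of the dish served at table t
        for _ in range(n_customers):
            # Existing table t has weight n_jt; a new table has weight alpha0.
            weights = table_counts + [alpha0]
            t = rng.choices(range(len(weights)), weights=weights)[0]
            if t == len(table_counts):
                # New table: existing dish k has weight m_k; a new dish has weight gamma.
                dish_weights = dish_tables + [gamma]
                k = rng.choices(range(len(dish_weights)), weights=dish_weights)[0]
                if k == len(dish_tables):
                    dish_tables.append(0)
                dish_tables[k] += 1
                table_counts.append(0)
                table_dish.append(k)
            table_counts[t] += 1
        restaurants.append((table_counts, table_dish))
    return restaurants, dish_tables

restaurants, dish_tables = simulate_crf([30, 30, 30], alpha0=1.0, gamma=1.0)
```

Because the weights are normalised inside `random.choices`, the denominators $i - 1 + \alpha_0$ and $\sum_k m_k + \gamma$ do not need to be computed explicitly. The number of dishes (topics) is not fixed in advance and grows with the data, which is the property the paper relies on.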

2.4. Mobile Service Recommendation for Mashup Using FMs

2.4.1. Rating Prediction in Recommendation System and FMs. In a traditional recommendation system with user set $U = \{u_1, u_2, \ldots\}$ and item set $I = \{i_1, i_2, \ldots\}$, the rating prediction function is denoted as follows:

$$y: U \times I \rightarrow \mathbb{R}, \tag{6}$$

where $y$ is the rating and $y(u, i)$ represents the rating of user $u$ for item $i$.

FMs are a general predictor that can estimate reliable parameters under very high sparsity [13, 14]. They combine the advantages of SVMs with those of factorization models. Like SVMs, FMs work with any real-valued feature vector; unlike SVMs, they use factorized parameters to model all interactions between feature variables. They are therefore well suited to predicting the ratings of items for users. Assume an input feature matrix $x \in \mathbb{R}^{n \times p}$ and an output target vector $y = (y_1, y_2, \ldots, y_n)^T$, where $n$ is the number of input-output pairs and $p$ is the number of input features; that is, the $i$-th row vector $x_i \in \mathbb{R}^p$ has $p$ input feature values, and $y_i$ is the target value of $x_i$. Based on $x$ and $y$, the 2-order FM can be denoted as follows:

$$\hat{y}(x) = w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j=i+1}^{p} x_i x_j \sum_{f=1}^{k} v_{if} v_{jf}, \tag{7}$$

[Figure 2: Probabilistic graph of HDP, with nodes $H$, $G_0$, $G_d$, $\beta_{dn}$, and $w_{dn}$, hyperparameters $\gamma$ and $\alpha$, and plates $N$ and $D$.]

where $w_0$ is the global bias, $k$ is the dimensionality of the factorization, $w_i$ models the strength of the $i$-th feature, and $x_i x_j$ ranges over all pairwise combinations of the feature variables $x_i$ and $x_j$ in a training instance. The model parameters $\{w_0, w_1, \ldots, w_p, v_{11}, \ldots, v_{pk}\}$ are denoted as follows:

$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^{p}, \quad V \in \mathbb{R}^{p \times k}. \tag{8}$$
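To make the 2-order FM score concrete, the sketch below evaluates it for one feature vector in two ways: term by term, and with the linear-time reformulation of the pairwise term given in Rendle's FM paper. The parameter values are hypothetical, and the function names are assumptions made for this example.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score, computed term by term over all feature pairs."""
    p, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(p))
    pairwise = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            dot = sum(V[i][f] * V[j][f] for f in range(k))
            pairwise += dot * x[i] * x[j]
    return w0 + linear + pairwise

def fm_predict_fast(x, w0, w, V):
    """Same score in O(k * p): 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)."""
    p, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(p))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(p))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(p))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

# Hypothetical parameters for p = 4 features and k = 2 factors.
x = [1.0, 0.0, 0.3, 0.7]
w0, w = 0.1, [0.2, -0.1, 0.4, 0.0]
V = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]
score = fm_predict(x, w0, w, V)
```

For these values the linear part is 0.32 and the pairwise part is 0.114, so the score is 0.534. Both functions return the same value; the second avoids the double loop over feature pairs, which matters when the feature vector is long and sparse.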

2.4.2. Prediction and Recommendation of Mobile Service for Mashup Based on FMs. The prediction and recommendation of mobile services is a typical classification problem, and we treat it as the task of ranking mobile services and recommending sufficiently many relevant ones for a given Mashup. The classification result is denoted as $y \in \{-1, 1\}$. When $y = 1$, the relevant mobile service is recommended to the given Mashup. In the experiments, however, formula (7) yields only predicted values ranging from 0 to 1. We therefore rank these predicted values, label the top-K results as positive (+1) and the rest as negative (-1), and recommend the mobile services with positive labels to the given (target) Mashup.
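The ranking-and-labelling step can be sketched in a few lines. The service identifiers and scores below are hypothetical, and `label_top_k` is a name introduced only for this illustration.

```python
def label_top_k(scores, k):
    """Label the k highest-scoring services +1 and all other services -1."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    positives = set(ranked[:k])
    return {s: (1 if s in positives else -1) for s in scores}

# Hypothetical predicted scores for four candidate services.
predicted = {"S1": 0.92, "S2": 0.17, "S3": 0.28, "S4": 0.55}
labels = label_top_k(predicted, k=2)
recommended = [s for s, label in labels.items() if label == 1]
```

With K = 2, the services S1 and S4 receive the positive label and would be recommended to the target Mashup.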

In FM modelling, the target Mashup and the active mobile service can be regarded as the user and the item, respectively. In addition to these two features, we add further features, namely similar mobile services, similar Mashups, and the popularity and cooccurrence of mobile services, to improve the accuracy of prediction and recommendation. These additional features are used as input feature vectors in FM modelling. The model in formula (6) can therefore be extended to the following prediction model with six dimensions:

$$y: MA \times MS \times SMS \times SMA \times CO \times POP \rightarrow S, \tag{9}$$

where MA is the target Mashup, MS is the active mobile service, SMS represents the similar mobile services, SMA represents the similar Mashups, CO represents the cooccurrence of mobile services, POP indicates the popularity of mobile services, and S indicates the prediction ranking score. The similar Mashups and mobile services are derived from our HDP model in Section 2.3.

Figure 3 is an example of recommending mobile services for a target Mashup using the FM model. The data consist of two parts: the input feature vector set X and the output target set Y. Each row contains a feature vector $x_i$ and its corresponding target score $y_i$. The first binary indicator matrix (Box 1) indicates the target Mashup MA. The second binary indicator matrix (Box 2) indicates the active mobile service MS. The third indicator matrix (Box 3) contains the Top-S mobile services SMS most similar to the active mobile service in Box 2. For example, in the first row the similarity between S1 and S2 (S3) is 0.3 (0.7). The fourth indicator matrix (Box 4) contains the Top-M Mashups SMA most similar to the target Mashup in Box 1. For instance, the similarity between M2 and M1 (M3) is 0.3 (0.7). The fifth indicator matrix (Box 5) indicates the cooccurrence CO of active mobile services composed or invoked by the same Mashup in historical records. The sixth indicator matrix (Box 6) indicates the popularity POP, that is, the number of times the active mobile service has been composed or invoked by Mashups. The target Y is the output of the model: the prediction ranking score S is classified as positive (+1) or negative (-1) against a given threshold. If $y_i > 0.5$, then S = +1; otherwise S = -1. The mobile services with positive values are selected and recommended to the target Mashup. For example, if S1 and S3 are the two active mobile services available to the target Mashup M1, S1 is selected and recommended to M1 because it has the higher predicted value ($y_2 = 0.92$ for S1 versus $y_6 = 0.28$ for S3 in Figure 3). We investigate the influence of top-S and top-M on recommendation performance in the Experiments section.
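The six-block layout of an FM input row can be sketched as a simple concatenation. The example reproduces the first row of Figure 3 restricted to three Mashups and three services; the function names and the zero-based indices are assumptions made for this illustration.

```python
def one_hot(index, size):
    """Binary indicator vector with a single 1 at position index."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def dense(weights, size):
    """Expand a sparse {index: weight} mapping into a dense vector."""
    return [weights.get(i, 0.0) for i in range(size)]

def build_feature_vector(mashup, service, similar_services, similar_mashups,
                         cooccurrence, popularity, n_mashups, n_services):
    """Concatenate the blocks MA, MS, SMS, SMA, CO, and POP into one FM input row."""
    return (one_hot(mashup, n_mashups)
            + one_hot(service, n_services)
            + dense(similar_services, n_services)
            + dense(similar_mashups, n_mashups)
            + dense(cooccurrence, n_services)
            + [float(popularity)])

# First row of Figure 3: target Mashup M2 (index 1), active service S1 (index 0).
x1 = build_feature_vector(
    mashup=1, service=0,
    similar_services={1: 0.3, 2: 0.7},
    similar_mashups={0: 0.3, 2: 0.7},
    cooccurrence={1: 0.5, 2: 0.5},
    popularity=12, n_mashups=3, n_services=3,
)
```

In practice such rows are very sparse, so a sparse representation would be preferable to dense Python lists; the dense form is used here only to keep the block layout visible.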

3. Experiments

3.1. Experiment Dataset and Settings. In this experiment, we first crawled 3,929 real Mashups, 10,648 services, and 12,715 invocations between these Mashups and services from ProgrammableWeb. Each Mashup and service is preprocessed to obtain its standard description information. Second, we use the Word2vec tool to expand the description documents of Mashups and services from the English Wikipedia corpus published in April 2017 and obtain their word embedding vectors. More concretely, the gensim module in Python is used to train on the English Wikipedia corpus and produce its word embedding vectors, and Table 1 lists the specific parameters of Word2vec. Finally, the trained model is used to expand the description documents of Mashups and services: the Top-N words most similar to each original word are identified and used as the extended words. For instance, the 10 most similar words for "Earth" and "Google" are shown in Table 2. All Mashups in the dataset are divided evenly into 5 subsets, of which 1 is used as the testing set and the other 4 are combined into the training set. Five-fold cross validation is conducted, and the results of the folds are averaged to obtain the reported experimental results. For the testing set, we randomly remove score values from the Mashup-service matrix so that each active Mashup provides 10, 20, or 30 score values, and we call these settings Given 10, Given 20, and Given 30, respectively. The removed score values are used as the expected values. Similarly, for the training set, we randomly remove score values so that the Mashup-service matrix becomes sparser, with a density of 10%, 20%, or 30%.
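The five-fold protocol can be sketched as follows. The shuffling seed, the way folds are cut, and the function name `five_fold_splits` are assumptions made for this illustration; the paper does not specify them.

```python
import random

def five_fold_splits(mashup_ids, n_folds=5, seed=42):
    """Yield (train_ids, test_ids) pairs; each fold serves as the test set once."""
    ids = list(mashup_ids)
    random.Random(seed).shuffle(ids)
    # Cut the shuffled ids into n_folds interleaved subsets of near-equal size.
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for i in range(n_folds):
        test = folds[i]
        train = [m for j, fold in enumerate(folds) if j != i for m in fold]
        yield train, test

# Twenty hypothetical Mashup ids: 16 for training and 4 for testing per fold.
splits = list(five_fold_splits(range(20)))
```

The metric values computed on the five test sets would then be averaged to produce the reported result.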

3.2. Evaluation Metrics. Mean absolute error (MAE) is the average of the absolute differences between the predicted values and the true values, and it measures how far predictions deviate from the data [16]. Root-mean-squared error (RMSE) is the square root of the mean of the squared deviations between the predicted values and the true values over the $N$ predictions [16]. We adopt MAE and RMSE as the evaluation metrics of Web API recommendation. The smaller the MAE and RMSE, the better the recommendation performance.

$$\mathrm{MAE} = \frac{1}{N} \sum_{i,j} \left| r_{ij} - \hat{r}_{ij} \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i,j} \left( r_{ij} - \hat{r}_{ij} \right)^2}, \tag{10}$$

where $N$ is the number of predicted scores, $r_{ij}$ is the true score of Mashup $M_i$ for service $S_j$, and $\hat{r}_{ij}$ is the predicted score of $M_i$ for $S_j$.
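Both metrics are straightforward to compute. The sketch below uses four hypothetical true and predicted scores.

```python
import math

def mae(true_scores, predicted_scores):
    """Mean absolute error over paired true and predicted scores."""
    n = len(true_scores)
    return sum(abs(r - p) for r, p in zip(true_scores, predicted_scores)) / n

def rmse(true_scores, predicted_scores):
    """Root-mean-squared error over paired true and predicted scores."""
    n = len(true_scores)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(true_scores, predicted_scores)) / n)

true_scores = [1.0, 0.0, 1.0, 0.0]        # hypothetical invocation records
predicted_scores = [0.9, 0.2, 0.6, 0.1]   # hypothetical model outputs
```

For these values the absolute errors are 0.1, 0.2, 0.4, and 0.1, so the MAE is 0.2, and the RMSE is the square root of 0.055. RMSE penalises the single large error (0.4) more heavily than MAE does.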

3.3. Baseline Methods. We compare our proposed approach with the following baseline methods:

(i) SPCC. Similar to IPCC [17], the service-based Pearson correlation coefficient (SPCC) approach measures the similarities between mobile services and performs recommendation.

(ii) MPCC. Similar to UPCC [17], the Mashup-based Pearson correlation coefficient (MPCC) approach measures the similarities between Mashups and performs recommendation.

(iii) PMF. In collaborative filtering, probabilistic matrix factorization (PMF) is a very popular matrix factorization model [10]. The historical invocation records between Mashups and mobile services are denoted as a matrix $R = [r_{ij}]_{n \times k}$, where $r_{ij} = 1$ if the mobile service is invoked by the Mashup and $r_{ij} = 0$ otherwise. The probability that the mobile service $S_i$ is invoked by the Mashup $M_j$ is predicted as $\hat{r}_{ij} = S_i^T M_j$.

(iv) LDA-FMs. The topic probability distributions of the description documents of Mashups and mobile services are first derived by the LDA model and then used to train FMs, which predict the probability that a mobile service is invoked by a Mashup and recommend high-quality mobile services. Besides the topic information, the cooccurrence and popularity of mobile services are exploited in the FM modelling.

(v) HDP-FMs. The prior work [18] integrates HDP and FMs to recommend mobile services for a target Mashup. The HDP model is applied to derive the topic probability distributions of the description documents of Mashups and mobile services. As in LDA-FMs, the topic information and the cooccurrence and popularity of mobile services are all used in the FM modelling.

(vi) EHDP-FMs +e proposed method in this paper isan extended work of the prior work [18] It firstly

Table 1 Specific parameters of Word2vec

Parameter ValueSize (the dimension of word vector) 200Window (the length of the window) 10Sample (the threshold of sampling) 0001Negative (the number of negative sampling) 5Sg (whether or not the Skip-gram model is used) 0 (No)Hs (whether or not hierarchical Softmax model isused) 0 (No)

Table 2 Extension example of the two words ldquoEarthrdquo andldquoGooglerdquo

Original word Earth Google

Extended words

Planet GmailMartian DropboxMars Evernote

Venusian AppPlanets Adsense

Spaceship YahooUniverse MicrosoftPlanetary FlickrMoon HotmailDeimos Mapquest

Figure 3: FM model of recommending mobile service for Mashup. Each feature vector X1 to X8 concatenates six blocks: Box 1, Mashup (MA: M1, M2, M3, ...); Box 2, mobile service (MS: S1, S2, S3, ...); Box 3, similar mobile services (SMS: S1, S2, S3, ...); Box 4, similar Mashups (SMA: M1, M2, M3, ...); Box 5, cooccurrence (CO: S1, S2, S3, ...); and Box 6, popularity (POP: Freq). The target Y holds the scores (S) y1 to y8, each labelled +1 or -1.


uses the Word2vec tool to expand the description documents of mobile services and Mashups from the Wikipedia corpus. Then the HDP model is applied to derive the topic probability distributions of the extended description documents of mobile services and Mashups. Finally, the FM is deployed to predict and recommend high-quality mobile services for Mashups.
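As a rough illustration of how the FM-based methods above score one Mashup-service pair, the following sketch evaluates the standard second-order factorization machine model of Rendle [13] on a single feature vector laid out like the blocks of Figure 3. The feature layout, weights, and latent factors are made-up assumptions for illustration, not values from the experiments.

```python
from itertools import combinations


def fm_predict(x, w0, w, v):
    """Second-order FM score: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j.

    The pairwise term uses the linear-time identity
    0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2].
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, v):
    """Same model written as the explicit double sum (for checking only)."""
    score = w0 + sum(wi * xi for wi, xi in zip(w, x))
    for i, j in combinations(range(len(x)), 2):
        dot = sum(a * b for a, b in zip(v[i], v[j]))
        score += dot * x[i] * x[j]
    return score


# Hypothetical 6-feature vector: [Mashup one-hot (2) | service one-hot (2) |
# cooccurrence weight | popularity]. All numbers are illustrative assumptions.
x = [1.0, 0.0, 0.0, 1.0, 0.5, 3.0]
w0 = 0.1
w = [0.2, -0.1, 0.05, 0.3, 0.4, 0.01]
v = [[0.1, 0.2], [0.0, 0.1], [0.3, -0.2], [0.2, 0.1], [-0.1, 0.4], [0.05, 0.02]]

score = fm_predict(x, w0, w, v)
```

The two functions compute the same value; the first is the form usually preferred because its cost grows linearly with the number of features and latent factors.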

3.4. Experimental Results

3.4.1. Recommendation Performance Comparison. To study the performance of mobile service recommendation, we compare our method with five baseline methods. We select the optimal number of extended words for each original word in the description documents of Mashups and services to achieve the best recommendation result in our EHDP-FMs; a detailed investigation of this choice is given in Section 3.4.2. Table 3 reports the MAE and RMSE of the recommendation methods. It shows that EHDP-FMs greatly outperforms SPCC and MPCC, significantly surpasses PMF and LDA-FMs, and consistently, if slightly, exceeds HDP-FMs. The reason is that, in EHDP-FMs, (a) more useful word information can be obtained from the extended description content of Mashups and services; (b) more similar Mashups and similar services in topic distribution are identified by using the HDP technology; and (c) the FM models and trains this useful information (including similar Mashups and similar services and the cooccurrence and popularity of services) to achieve more accurate prediction of the service probability score. Furthermore, when the given number of score values increases from 10 to 30 and the density of the training matrix rises from 10% to 30%, the MAE and RMSE of EHDP-FMs drop. That is to say, more given score values and a denser (less sparse) training matrix mean better recommendation accuracy.
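For readers who want to reproduce the two metrics, a minimal sketch of MAE and RMSE under their standard definitions is given below. The sample values are invented for illustration and are unrelated to Table 3.

```python
import math


def mae(actual, predicted):
    """Mean absolute error between observed and predicted scores."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error between observed and predicted scores."""
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical invocation labels (1 = invoked) and predicted probabilities.
actual = [1, 0, 1, 0]
predicted = [0.9, 0.2, 0.6, 0.1]

sample_mae = mae(actual, predicted)
sample_rmse = rmse(actual, predicted)
```

Because RMSE squares each error before averaging, it penalizes a few large misses more heavily than MAE does, which is why the two columns in a comparison table can rank methods slightly differently.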

3.4.2. Effect of the Number of Extended Words on Mobile Service Recommendation. The experiments investigate the effect of the number of extended words on mobile service recommendation in our proposed method. During the experiments, we set the number of extended words to 1, 3, 5, and 7 (denoted as EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, and EHDP-FMs-7, respectively) with training matrix density = 10% and report their MAE and RMSE values in Figures 4 and 5. The experimental results indicate that the performance of EHDP-FMs-1 is the worst in all cases. This is because the extended description documents of Mashups and services are still short and contain little useful information when only one word is added for each original word. The MAE and RMSE of EHDP-FMs-3 are the best in all cases. However, when the number of extended words continues to increase from 5 to 7, the recommendation performance decreases. The reason is that too many extended words introduce irrelevant syntactic and semantic information, which may prevent the HDP topic model from mining the latent topics accurately and therefore weakens the performance of service recommendation. Therefore, we select 3 extended words for each original word of the description documents of Mashups and services in our EHDP-FMs method. The observations indicate that it is very important to choose an appropriate number of extended words for mobile service recommendation.

3.4.3. HDP-FMs Performance vs. LDA-FMs Performance with Different Topic Numbers. In this experiment, we set the number of topics to 3, 6, 12, and 24 for LDA-FMs, denoted as LDA-FMs-3/6/12/24. The experimental results in Figures 6 and 7 show the MAE and RMSE values when the training matrix density is equal to 10%. We observe that the performance of HDP-FMs is the best. At the same time, the MAE and RMSE of LDA-FMs-12 are close to those of HDP-FMs and better than those of LDA-FMs-3, LDA-FMs-6, and LDA-FMs-24. The observations indicate that HDP-FMs is better than LDA-FMs, since it can automatically derive the optimal number of topics instead of requiring repeated training like LDA.

3.4.4. Impacts of Top-S and Top-M in HDP-FMs. We investigate the effects of top-S and top-M on mobile service recommendation in order to obtain their optimal values. The optimal values of top-M (top-S) for all similar top-S (top-M) services (Mashups) are obtained, i.e., S = 5 for all top-M similar Mashups and M = 10 for all top-S similar mobile services. Under the setting of training matrix density = 10% and given number = 30, the MAEs of HDP-FMs are presented in Figures 8 and 9. Figure 8 shows that the MAE of HDP-FMs rises steadily as S grows from 5 to 25. Figure 9 indicates that HDP-FMs reaches its best (lowest) MAE when M = 10 and that the MAE rises as M increases or decreases from that value. The observations mean that it is very important to identify suitable values of S and M for the HDP-FMs method.

4. Related Works

Service recommendation is a hot topic in service-oriented computing [19]. Traditional service recommendation addresses the quality problem of Mashup services in order to realize high-quality service recommendation. Picozzi et al. [20] showed that the quality of a single service can facilitate recommendation. Cappiello [22] analyzed the quality attributes of Mashup components (APIs), and information quality in Mashups is studied in [21]. In addition, the collaborative filtering (CF) technique is widely exploited in QoS-based service recommendation [16]. It can be used to measure the similarity of services or users, predict the missing QoS values on the basis of the QoS records of similar services or similar users, and recommend services to users.

According to the results in references [23, 24], the problems of data sparsity and the long tail lead to inaccurate and incomplete search results. To attack these problems, some researchers use matrix factorization to decompose historical QoS or Mashup-service interactions to obtain service recommendations [25, 26]. Zheng et al. [26] proposed a collaborative QoS prediction method in which a neighbourhood-integrated matrix factorization model is designed to predict the QoS values of personalized Web services. Xu et al. [9] proposed a social-aware service recommendation method in which the multidimensional social relationships among potential users, Mashups, topics, and services are depicted by a coupled matrix model. These methods aim to transform the Mashup-service rating matrix or QoS matrix into a feature space with lower dimensions and predict the probability of services being invoked by Mashups or the unknown QoS values.
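To make the idea of factorizing an invocation matrix concrete, the sketch below fits two low-rank factor matrices to a small binary service-Mashup matrix by stochastic gradient descent and scores each pair by a dot product. It is a generic regularized matrix factorization written for illustration, not the exact model of any cited work; the matrix, rank, learning rate, and regularization weight are all assumptions.

```python
import random


def init_factors(rows, cols, rank, seed=0):
    """Small random starting factors; the seed keeps the run repeatable."""
    rng = random.Random(seed)
    S = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(rows)]
    M = [[rng.uniform(0.0, 0.5) for _ in range(rank)] for _ in range(cols)]
    return S, M


def predict(S, M, i, j):
    """Predicted score for service i and Mashup j: dot product of factors."""
    return sum(a * b for a, b in zip(S[i], M[j]))


def squared_error(R, S, M):
    return sum(
        (R[i][j] - predict(S, M, i, j)) ** 2
        for i in range(len(R))
        for j in range(len(R[0]))
    )


def factorize(R, rank=2, epochs=500, lr=0.05, reg=0.02, seed=0):
    """Fit R[i][j] ~ S[i] . M[j] with L2-regularized SGD over all entries."""
    S, M = init_factors(len(R), len(R[0]), rank, seed)
    for _ in range(epochs):
        for i in range(len(R)):
            for j in range(len(R[0])):
                err = R[i][j] - predict(S, M, i, j)
                for f in range(rank):
                    s, m = S[i][f], M[j][f]
                    S[i][f] += lr * (err * m - reg * s)
                    M[j][f] += lr * (err * s - reg * m)
    return S, M


# Hypothetical invocation records: rows are services, columns are Mashups.
R = [[1, 0, 1], [1, 0, 0], [0, 1, 0], [0, 1, 1]]
S0, M0 = init_factors(len(R), len(R[0]), rank=2)
S, M = factorize(R)
error_before = squared_error(R, S0, M0)
error_after = squared_error(R, S, M)
```

Treating every zero as an observed negative, as done here, is a simplification; with sparse real data one would usually train only on observed entries or sample negatives.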

Considering that matrix factorization depends on abundant historical interaction records, recent work incorporates additional information into matrix factorization to obtain more accurate service recommendation [5, 10–12]. Among them, Ma et al. [11] integrate matrix factorization with geographic and social influence to recommend points of interest. Chen et al. [12] proposed a personalized service recommendation method that uses location information and the QoS of Web services to cluster services and users. Yao et al. [10] studied the historical invocation relationship between Mashups and services to infer the implicit functional correlation between services and incorporated this correlation into the matrix factorization model to facilitate service recommendation. Liu and Fulia [5] proposed collaborative topic regression, which combines probabilistic topic modelling and probabilistic matrix factorization for service recommendation.

The existing methods based on matrix factorization undoubtedly improve the performance of service recommendation. At the same time, we observed that few of them exploited the historical invocation between services and

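Table 3: MAE and RMSE comparison of multiple recommendation approaches.

| Given number | Method | Density 10%: MAE | Density 10%: RMSE | Density 20%: MAE | Density 20%: RMSE | Density 30%: MAE | Density 30%: RMSE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 10 | SPCC | 0.4258 | 0.5643 | 0.4005 | 0.5257 | 0.3932 | 0.5036 |
| 10 | MPCC | 0.4316 | 0.5701 | 0.4108 | 0.5293 | 0.4035 | 0.5113 |
| 10 | PMF | 0.2417 | 0.3835 | 0.2263 | 0.3774 | 0.2014 | 0.3718 |
| 10 | LDA-FMs | 0.2091 | 0.3225 | 0.1969 | 0.3116 | 0.1832 | 0.3015 |
| 10 | HDP-FMs | 0.1547 | 0.2874 | 0.1329 | 0.2669 | 0.1283 | 0.2498 |
| 10 | EHDP-FMs | 0.1308 | 0.2507 | 0.1154 | 0.2372 | 0.1081 | 0.2093 |
| 20 | SPCC | 0.4135 | 0.5541 | 0.3918 | 0.5158 | 0.3890 | 0.5003 |
| 20 | MPCC | 0.4413 | 0.5712 | 0.4221 | 0.5202 | 0.4151 | 0.5109 |
| 20 | PMF | 0.2398 | 0.3559 | 0.2137 | 0.3427 | 0.1992 | 0.3348 |
| 20 | LDA-FMs | 0.1989 | 0.3104 | 0.1907 | 0.3018 | 0.1801 | 0.2894 |
| 20 | HDP-FMs | 0.1486 | 0.2713 | 0.1297 | 0.2513 | 0.1185 | 0.2291 |
| 20 | EHDP-FMs | 0.1227 | 0.2419 | 0.1055 | 0.2216 | 0.0952 | 0.1904 |
| 30 | SPCC | 0.4016 | 0.5447 | 0.3907 | 0.5107 | 0.3739 | 0.5012 |
| 30 | MPCC | 0.4518 | 0.5771 | 0.4317 | 0.5159 | 0.4239 | 0.5226 |
| 30 | PMF | 0.2214 | 0.3319 | 0.2091 | 0.3117 | 0.1986 | 0.3052 |
| 30 | LDA-FMs | 0.1970 | 0.3096 | 0.1865 | 0.2993 | 0.1794 | 0.2758 |
| 30 | HDP-FMs | 0.1377 | 0.2556 | 0.1109 | 0.2461 | 0.1047 | 0.2057 |
| 30 | EHDP-FMs | 0.1113 | 0.2248 | 0.0926 | 0.2057 | 0.0804 | 0.1673 |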

Figure 4: MAE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; y-axis: MAE; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).

Figure 5: RMSE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; y-axis: RMSE; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).


Mashups to derive potential topics, and they did not use FMs to model and train these potential topics to predict the probability of Mashups calling services and thus obtain more accurate service recommendation. In our previous work [8, 27, 28], we mainly addressed LDA or enhanced LDA topic models for Web service clustering [27, 28] and also exploited the word embedding technique to enhance the accuracy of service clustering [8]. Driven by these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the potential topics from the description documents of mobile services and Mashups to support FM model training. We use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantic representation of Mashup applications and mobile services is consolidated and mined by extending their description documents and modelling their topic probability distributions, and the quality prediction is performed by exploiting FMs to train and model multiple-dimension features of mobile services. The high-quality mobile services are ranked and recommended to build Mashups by simultaneously considering their functionality representation and quality features. The accuracy of mobile service recommendation is significantly improved as a result.

Although the above approach appears effective for Mashup development, it would be better if a

Figure 6: MAE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 7: RMSE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 8: Impact of Top-S in HDP-FMs (x-axis: Top-S, 5 to 25; y-axis: MAE).

Figure 9: Impact of Top-M in HDP-FMs (x-axis: Top-M, 5 to 25; y-axis: MAE).


prototype can be designed and implemented to validate the effectiveness and application value of the approach. The objective of the prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. We can use Python 3.5 and MySQL 5.6, together with Flask and Pyecharts, to develop the prototype system. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for a given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between these Mashups and services from ProgrammableWeb and builds the corresponding data tables to store them. Because the description documents of Mashups and mobile services contain some useless or meaningless terms, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes the following steps. Tokenization: words are segmented by spaces, and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python. Removing stop words (common short words or symbols that have no practical meaning but occur frequently, such as "the", "to", "a", "an", "with", and "at"): the stop word list in NLTK is applied to remove them. Stemming: various forms of a word are usually used in grammatical expression, such as "provide", "providing", "provides", and "provided", and their common word endings, such as "ing", "s", and "ed", should be removed.

(2) In the second part (description extension of Mashups and mobile services), the English Wikipedia corpus, trained by Word2vec based on the gensim module in Python, is exploited as the description extension source for Mashups and mobile services. The Top-N most similar words to an original word in the description documents of Mashups and mobile services are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, with time complexity $O(\log N)$. Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains the similarity matrix, whose time and space complexity are both $O(n^2)$, where n is the total number of words. During description extension, the Top-N most similar words to an original word can be found simply by table look-up in the trained English Wikipedia corpus, with time complexity $O(1)$ and space complexity $O(N^2)$.

(3) In the third part (topic modelling of Mashups and mobile services), the hierarchical Dirichlet process is used to extract the implicit topics of Mashups and mobile services; it clusters (groups) service data according to the cooccurrence of word frequency. It can automatically determine the optimal number of topics, which avoids adjusting the number of topics repeatedly and so saves time. It can also accurately predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which lets the prototype system work in real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of topic modelling, and a download function is provided so that users can download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile services for a given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services with which the developer can build a novel Mashup application. During recommendation, factorization machines train and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from the topic similarity based on the HDP model, and quality features, i.e., the cooccurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of recommendation.
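The preprocessing steps of part (1) can be sketched in a few lines. The version below is a dependency-free stand-in written for illustration: it uses a regular expression for tokenization, a tiny hand-written stop list, and a naive suffix stripper for the endings "ing", "ed", and "s". In the prototype described above these steps would be handled by NLTK, so every name and rule here is an assumption rather than the authors' implementation.

```python
import re

# A tiny illustrative stop list (assumption); the prototype would use NLTK's list.
STOP_WORDS = {"the", "to", "a", "an", "with", "and", "at"}

# Endings named in the text; a real stemmer applies many more rules.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def strip_suffix(token):
    """Remove the first matching ending, keeping at least three characters."""
    for suffix in SUFFIXES:
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, drop stop words, then strip common endings."""
    return [strip_suffix(t) for t in tokenize(text) if t not in STOP_WORDS]


tokens = preprocess("Providing maps to a user with the Google API.")
```

A suffix stripper this crude will mangle some words (for example, it shortens any long word ending in "s"), which is one reason an established stemmer is preferable in practice.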

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing by combining word-embedding-enhanced HDP and FMs. The experimental results on the ProgrammableWeb dataset show that, compared with the existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate fine-grained service relationship information and apply it to the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371; the National Key R&D Program of China under grant no. 2017YFB1400602; the Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136; and the Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., “Toward mobile service computing: opportunities and challenges,” IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, “Category-aware API clustering and distributed recommendation for automatic mashup creation,” IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid).

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, “WT-LDA: user tagging augmented LDA for web service clustering,” in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, “Incorporating user, topic, and service-related latent factors into web service recommendation,” in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, “Latent dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, “Hierarchical dirichlet process,” Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, “WE-LDA: a word embeddings augmented LDA model for web services clustering,” in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, “A social-aware service recommendation approach for mashup creation,” in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, “Service recommendation for mashup composition with implicit correlation regularization,” in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June-July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, “Recommender Systems with social regularization,” in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, “Web service recommendation via exploiting location and QoS information,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, “Factorization machines,” in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, “Factorization machines with libFM,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, “Sharing clusters among related groups: hierarchical dirichlet processes,” Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, “WSRec: a collaborative filtering based web service recommender system,” in Proceedings IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, “Web APIs recommendation for mashup development based on hierarchical dirichlet process and factorization machines,” in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, “Reputation measurement and malicious feedback rating prevention in web service recommendation system,” IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, “Quality-based recommendations for mashup composition,” in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, “Information quality in mashups,” IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, “A quality model for mashup components,” in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, “An empirical study of programmable web: a network analysis on a service-mashup system,” in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, “Manifold-learning based API recommendation for mashup creation,” in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, “An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, “Collaborative web service QoS prediction via neighborhood integrated matrix factorization,” IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, “Integrated content and network-based service clustering and web APIs recommendation for mashup development,” IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, “Domain-aware mashup service clustering based on LDA topic model from multiple data sources,” Information and Software Technology, vol. 90, pp. 40–54, 2017.



techniques to improve service discovery [4, 5]. Among them, some topic model technologies, such as latent Dirichlet allocation (LDA) [6], have been utilized to obtain the latent topics of Mashups and services to improve the accuracy of recommendations [4, 5]. However, LDA needs the optimal number of topics to be identified in advance. To obtain it, model training must be performed repeatedly, which is very time-consuming. Aiming at this problem, Teh et al. [7] proposed the hierarchical Dirichlet process (HDP) model, which can derive the optimal number of topics automatically and so saves time and cost. We use it to model and derive the topics of Mashups and services to achieve more accurate service recommendations. Moreover, the topic training and modelling of LDA usually need a large-scale corpus. However, the description documents of Mashups and services are usually short, and their corpora are insufficient. In the field of information retrieval, some researchers exploit Word2vec [8] to expand short text into long text so that a topic model can effectively estimate the latent topics of the text for more accurate information searching. The Word2vec model, proposed by Google [8], can process a large-scale text corpus and generate word embeddings vectors with high efficiency. In this paper, we exploit Word2vec to extend the description documents of Mashups and mobile services to build a dense word embeddings vector representation for more accurate topic modelling.

Recently, matrix factorization has been widely applied in service recommendation [9, 10]. It usually decomposes the Mashup-service matrix into two matrices with lower dimensions by using the service invocations in historical Mashups. But service recommendation based on matrix factorization mainly depends on having enough records of historical Mashup-service interactions [10]. To solve this problem, some additional information, such as users' social relations [11] or location similarity [12], is incorporated into matrix factorization to achieve better accuracy of service recommendation. However, matrix factorization is only applicable to special, single-source input data and is not suitable for general prediction tasks. Therefore, the incorporation of additional information degrades the performance of service recommendation. Rendle [13, 14] proposed factorization machines (FMs), a general predictor that works with any real-valued feature vector. It can be used for general prediction tasks and models all interactions between the input variables. Therefore, it can predict the probability of services being invoked by Mashups. In this paper, we combine word-embedding-enhanced HDP and FMs to recommend mobile services for Mashup development. The contributions of this paper are as follows:

(i) We use Word2vec to extend the description documents of Mashups and mobile services to build a dense word embeddings vector representation for more accurate topic modelling. Based on the word embeddings vector, we exploit HDP to derive the latent topics from the extended description documents of Mashups and mobile services.

(ii) We employ FMs to train the latent topics derived from the HDP and predict the probability of mobile services being invoked by Mashups. Various valuable information, such as cooccurrence and popularity, is exploited to achieve high-quality mobile service recommendation for Mashup development.

(iii) We crawl a real Web service dataset from ProgrammableWeb and perform a series of experiments. The experimental results indicate that, compared with the existing methods, the proposed method achieves significant improvements in MAE and RMSE.

The rest of this paper is organized as follows: Section 2 describes the proposed approach. Section 3 presents the experimental results. Section 4 reviews related works. Section 5 provides discussion. Finally, we draw conclusions and discuss future work in Section 6.

2. Method Overview

This section consists of four subsections, which describe in detail the overall framework, description extension, topic modelling, and mobile service recommendation of the proposed method, respectively.

2.1. Overall Framework of Mobile Service Recommendation. The overall framework of our mobile service recommendation method is shown in Figure 1. It includes three main parts, i.e., description extension, topic modelling, and service recommendation. In the description extension part, we first extract the description documents of Mashups and mobile services from the Web service dataset and then exploit the English Wikipedia corpus trained by Word2vec as the extension source of these description documents to obtain their extended words. Finally, the original description documents and the extended words of Mashups and mobile services are preprocessed together as the input of the next step. In the topic modelling part, we use the HDP topic model to model the extended description documents of Mashups and mobile services and derive their latent topics. In the service recommendation part, when a user submits a Mashup requirement, our method applies the FM model to predict and recommend mobile services for the given Mashup requirement.

22 Description Extension of Mashup and Mobile ServiceBased onWord2vec +e description documents of Mashupand mobile service are usually short in which the number ofcontained words is relatively few and the word frequencycooccurrence is not enough For example according to ourstatistics on an average every service description documenton ProgrammableWeb contains only 2716 words Due tothe limited words (corpus) it is difficult to effectively derivethe latent topics when using HDP topic model to train andmodel the description documents of Mashup and mobile

2 Mobile Information Systems

service. Therefore, extending the description documents of Mashups and mobile services is necessary for topic modelling. In 2013, Google released Word2vec [8], a tool that expresses words as real-valued vectors. It is an open-source word embedding toolkit that reduces the processing of document content to vector operations in a K-dimensional vector space by using ideas from deep learning [8]. The word embedding vectors capture not only the semantic and grammatical relations of words but also the context information of the document, which yields a more accurate vector representation. Wikipedia is widely recognized as the most comprehensive and authoritative online encyclopaedia on the Internet and provides a rich corpus, so we use the Wikipedia corpus as the extension source of the description documents of Mashups and mobile services. More concretely, we first use the Word2vec tool to train on the Wikipedia corpus and obtain its word embedding model. Then, we exploit the trained word vector model to extend the description documents of Mashups and mobile services. That is to say, for each word $w_i$ in the preprocessed service description document $S_{text}$, the Top-N most similar words $T_{w_i} = (t_1, t_2, \ldots, t_N)$ to $w_i$ are identified in the word embedding space of the Wikipedia corpus and used as the extended words. The extended service description document is denoted as $S_{ExText} = \{(w_1, T_{w_1}), (w_2, T_{w_2}), \ldots, (w_n, T_{w_n})\}$, where $n$ is the total number of words in the service description document and $N$ is the number of extended words. In the next section, we perform comparative experiments to determine the optimal number of extended words.
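The extension step can be illustrated with a small sketch. It assumes a plain dictionary of word vectors and cosine similarity as the similarity measure; the two-dimensional vectors in `TOY_VECTORS` and the helper names are hypothetical and stand in for a model trained on Wikipedia.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def top_n_similar(word, vectors, n):
    """Return the n vocabulary words closest to `word`, excluding the word itself."""
    if word not in vectors:
        return []  # out-of-vocabulary words receive no extension
    scored = [(cosine(vectors[word], vec), other)
              for other, vec in vectors.items() if other != word]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [other for _, other in scored[:n]]

def extend_document(tokens, vectors, n):
    """Build S_ExText: each original word w_i paired with its extension T_{w_i}."""
    return [(w, top_n_similar(w, vectors, n)) for w in tokens]

# Hypothetical 2-D embeddings, invented for illustration only.
TOY_VECTORS = {
    "earth": [0.9, 0.1],
    "planet": [0.85, 0.2],
    "mars": [0.8, 0.3],
    "google": [0.1, 0.9],
    "gmail": [0.15, 0.95],
}

extended = extend_document(["earth", "google"], TOY_VECTORS, n=1)
print(extended)
```

With a real embedding model, the dictionary lookup would be replaced by the model's own nearest-neighbour query; the pairing of each word with its Top-N neighbours stays the same.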

The word embedding model of the Wikipedia corpus is trained with the CBOW model using the negative sampling algorithm, so that words that are closer in meaning lie closer together in the word vector space. These close words have similar semantic and grammatical relations and can therefore be used for service description document expansion. The input of the CBOW model is the word vectors of the context words of a specific word, and its output is the word vector of that specific word. Suppose $w$ and $C_w$ represent a word extracted from the service description document dataset and its context words, respectively. The probability of predicting $w$ from its context (surrounding) words can be denoted as follows:

$$p(w \mid C_w) = \frac{1}{1 + e^{-v_{C_w} W_w}} \tag{1}$$

where $W_w$ is the parameter of the hidden layer and softmax layer in the neural network and $v_{C_w}$ is the sum of the vectors of the words in $C_w$. The training objective of the CBOW model is defined as follows (maximum likelihood estimation function):

$$\mathrm{OBJ}_{\mathrm{CBOW}} = \arg\max_{V_w, W_1} \left( \prod_{(w, C_w) \in T} \log p(w \mid C_w) \times \prod_{(w, C_w) \notin T} \bigl(1 - \log p(w \mid C_w)\bigr) \right) \tag{2}$$
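As a small numerical illustration of Eq. (1), the sketch below sums the context vectors and applies the logistic function to their inner product with the parameter vector. The vectors are hypothetical values chosen for the example, and the function name is an assumption, not part of any Word2vec implementation.

```python
import math

def cbow_probability(context_vectors, w_param):
    """p(w | C_w) = 1 / (1 + exp(-v_Cw . W_w)), where v_Cw sums the context vectors."""
    dim = len(w_param)
    v_cw = [sum(vec[d] for vec in context_vectors) for d in range(dim)]
    score = sum(a * b for a, b in zip(v_cw, w_param))
    return 1.0 / (1.0 + math.exp(-score))

context = [[0.2, 0.1], [0.3, -0.1]]  # hypothetical context-word vectors
w_target = [1.0, 2.0]                # hypothetical parameter vector W_w
p = cbow_probability(context, w_target)
print(round(p, 4))
```

Here the summed context vector is (0.5, 0.0), so the score is 0.5 and the probability is the logistic function evaluated at 0.5.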

Next, we use the HDP model to model the extended service description document $S_{ExText}$, so as to determine the optimal number of topics automatically and estimate the latent topic distribution.

Figure 1: Framework of mobile service recommendation for Mashup. Description extension: description documents of Mashups and mobile services are extracted from the Web service dataset, extended with words obtained from the Wikipedia corpus via Word2vec, and preprocessed. Topic modelling: the HDP topic model is applied to Mashups and mobile services. Service recommendation: given a Mashup requirement as input, the FM model predicts and recommends mobile services for the Mashup.

2.3. Topic Modelling of Mashup and Mobile Service Using HDP. The hierarchical Dirichlet process (HDP) is a multilevel form of the Dirichlet process (DP) mixture model, and it is also a nonparametric Bayesian approach to clustering grouped data [15]. Assume $(\Theta, C)$ is a measurable space, $G_0$ is a probability measure on this space, and $\alpha_0$ is a positive real number. The Dirichlet process [16] is interpreted as the distribution of a random probability measure $G$ over $(\Theta, C)$: $G$ follows a Dirichlet process if, for any finite partition $(A_1, A_2, \ldots, A_r)$ of the measurable space $\Theta$, the random vector $(G(A_1), \ldots, G(A_r))$ is distributed as a finite-dimensional Dirichlet distribution with parameters $(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r))$:

$$(G(A_1), \ldots, G(A_r)) \sim \mathrm{Dir}(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r)) \tag{3}$$

HDP is used to model the documents of mobile services and Mashups. Figure 2 shows the probabilistic graph of HDP, which relates the documents of mobile services or Mashups to their words and latent topics. In this graph, $\gamma$ and $\alpha_0$ are concentration parameters, and $D$ represents the entire Mashup document set, in which each Mashup document is denoted as $d$. In Figure 2, $H$ represents the base probability measure, and $G_0$ indicates the global random probability measure. The topic probability distribution generated for Mashup document $d$ is denoted as $G_d$, the topic of the $n$-th word in $d$ drawn from $G_d$ is denoted as $\beta_{dn}$, and $w_{dn}$ is the word generated from $\beta_{dn}$.

The generative process of the HDP model is as follows:

(1) Sample the probability distribution $G_0 \sim DP(\gamma, H)$.

(2) For each $d$ in $D$, sample a topic distribution $G_d \sim DP(\alpha_0, G_0)$.

(3) For each word $n \in \{1, 2, \ldots, N\}$ in $d$:

(a) Sample a topic for the $n$-th word, $\beta_{dn} \sim G_d$.

(b) Sample a word from the multinomial distribution of the topic words, $w_{dn} \sim \mathrm{Multi}(\beta_{dn})$.
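The generative steps above can be simulated with a short sketch. It is an approximation under stated assumptions rather than the inference procedure of this paper: the global measure is truncated to a fixed number of topics via stick-breaking, the base measure H is assumed to be a symmetric Dirichlet over a toy vocabulary, and all function names and default values are hypothetical.

```python
import random

def stick_breaking(gamma, truncation, rng):
    """Truncated stick-breaking weights, standing in for G0 ~ DP(gamma, H)."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        frac = rng.betavariate(1.0, gamma)
        weights.append(remaining * frac)
        remaining *= 1.0 - frac
    weights.append(remaining)  # the last stick takes the leftover mass
    return weights

def dirichlet(alphas, rng):
    """Draw from Dirichlet(alphas) by normalising gamma variates."""
    draws = [max(rng.gammavariate(max(a, 1e-9), 1.0), 1e-12) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]

def generate_corpus(num_docs, doc_len, vocab_size, gamma=1.0, alpha0=1.0,
                    truncation=10, seed=0):
    rng = random.Random(seed)
    # Topic-word distributions drawn from H (assumed symmetric Dirichlet).
    topics = [dirichlet([0.5] * vocab_size, rng) for _ in range(truncation)]
    global_weights = stick_breaking(gamma, truncation, rng)  # step (1)
    corpus = []
    for _ in range(num_docs):
        # Step (2): with a truncated G0, Gd ~ DP(alpha0, G0) reduces to a
        # Dirichlet draw over the same topics.
        doc_weights = dirichlet([alpha0 * b for b in global_weights], rng)
        words = []
        for _ in range(doc_len):
            k = rng.choices(range(truncation), weights=doc_weights)[0]  # step (3a)
            w = rng.choices(range(vocab_size), weights=topics[k])[0]    # step (3b)
            words.append(w)
        corpus.append(words)
    return corpus, global_weights

corpus, global_weights = generate_corpus(num_docs=3, doc_len=5, vocab_size=6)
print(corpus)
```

Because every document reuses the same global topics with its own weights, documents share topics while differing in their mixture, which is the grouping behaviour HDP is chosen for.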

To perform HDP sampling, a construction method is needed for inferring the posterior distribution of the parameters. The Chinese restaurant franchise (CRF) is a representative construction method that provides a way to construct the Dirichlet process. Assume there are $J$ Chinese restaurants; restaurant $j$ contains $m_j$ tables $(\psi_{jt})_{t=1}^{m_j}$ and seats $N_j$ customers. All tables in the $J$ restaurants share a menu $v = (v_k)_{k=1}^{K}$, where $K$ is the number of dishes. Customers can choose a table at random, and each table is served one dish from the menu common to all restaurants. In this way, the customers, restaurants, and dishes of the Chinese restaurant franchise correspond to the words, documents, and topics of our HDP model, respectively. Assuming $\delta$ is a probability measure, the topic distribution $\theta_{ji}$ of the word $x_{ji}$ is treated as a customer entering the restaurant, and each distinct value $\psi_{jt}$ corresponds to the table at which the customer is seated. The customer sits at an existing table $\psi_{jt}$ with probability $n_{jt}/(i - 1 + \alpha_0)$ or chooses a new table $\psi_{jt^{new}}$ with probability $\alpha_0/(i - 1 + \alpha_0)$. Here, $n_{jt}$ indicates the number of customers at the $t$-th table of the $j$-th restaurant. If the customer chooses a new table, she/he assigns an existing dish $v_k$ to the new table with probability $m_k/(\sum_k m_k + \gamma)$, according to the popularity of the dishes already chosen, or a new dish $v_{k^{new}}$ with probability $\gamma/(\sum_k m_k + \gamma)$. Here, $m_k$ indicates the number of tables serving the dish $v_k$. We have the following conditional distributions:

$$\theta_{ji} \mid \theta_{j1}, \theta_{j2}, \ldots, \theta_{j,i-1}, \alpha_0, G_0 \sim \sum_{t=1}^{m_j} \frac{n_{jt}}{i - 1 + \alpha_0}\,\delta_{\psi_{jt}} + \frac{\alpha_0}{i - 1 + \alpha_0}\,G_0 \tag{4}$$

$$\psi_{jt} \mid \psi_{11}, \psi_{12}, \ldots, \psi_{21}, \ldots, \psi_{j,t-1}, \gamma, H \sim \sum_{k=1}^{K} \frac{m_k}{\sum_k m_k + \gamma}\,\delta_{v_k} + \frac{\gamma}{\sum_k m_k + \gamma}\,H \tag{5}$$

In fact, the above CRF process of assigning tables and dishes to customers corresponds to the processes of word-topic assignment and document-topic clustering in the Mashup document set, respectively. After the CRF construction, the HDP model uses Gibbs sampling to infer the posterior probability distribution of its parameters and thereby obtain the topic distribution of the entire Mashup document set.
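The seating and dish probabilities of Eqs. (4) and (5) can be computed directly from the current counts. The sketch below does only that; it is not a Gibbs sampler, and the function names and example counts are hypothetical.

```python
def table_probabilities(table_counts, alpha0):
    """Eq. (4): probabilities of joining each existing table, and of a new table.

    table_counts[t] is n_jt; their sum equals i - 1 for the i-th customer.
    """
    denom = sum(table_counts) + alpha0
    return [n / denom for n in table_counts], alpha0 / denom

def dish_probabilities(dish_table_counts, gamma):
    """Eq. (5): probabilities of serving each existing dish, and of a new dish.

    dish_table_counts[k] is m_k, the number of tables serving dish k.
    """
    denom = sum(dish_table_counts) + gamma
    return [m / denom for m in dish_table_counts], gamma / denom

# Hypothetical state: one restaurant with two tables (3 and 1 customers),
# and three dishes served at 2, 1, and 1 tables across the franchise.
existing_tables, new_table = table_probabilities([3, 1], alpha0=1.0)
existing_dishes, new_dish = dish_probabilities([2, 1, 1], gamma=1.0)
print(existing_tables, new_table)
print(existing_dishes, new_dish)
```

In this example the fifth customer joins the first table with probability 3/5 and opens a new table with probability 1/5; a new table then receives a new dish with probability 1/5.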

2.4. Mobile Service Recommendation for Mashup Using FMs

2.4.1. Rating Prediction in Recommendation System and FMs. In a traditional recommendation system, for a user set $U = \{u_1, u_2, \ldots\}$ and an item set $I = \{i_1, i_2, \ldots\}$, the rating prediction function is denoted as follows:

$$y: U \times I \to R \tag{6}$$

where $y$ is the rating and $y(u, i)$ represents the rating of user $u$ for item $i$.

FMs are a general predictor that can estimate reliable parameters under very high sparsity [13, 14]. They combine the advantages of SVMs with those of factorization models. Like SVMs, they work with any real-valued feature vector; unlike SVMs, they use factorized parameters to model all interactions between feature variables. Therefore, FMs are well suited to predicting the ratings of items by users. Assume there are an input feature matrix $x \in R^{n \times p}$ and an output target vector $y = (y_1, y_2, \ldots, y_n)^T$. Here, $n$ is the number of input-output pairs and $p$ is the number of input features; i.e., the $i$-th row vector $x_i \in R^p$ has $p$ input feature values, and $y_i$ is the target value of $x_i$. On the basis of $x$ and $y$, the 2-order FM can be denoted as follows:

$$\hat{y}(x) = w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j=i+1}^{p} x_i x_j \sum_{f=1}^{k} v_{if} v_{jf} \tag{7}$$

Figure 2: Probabilistic graph of HDP (nodes $H$, $G_0$, $G_d$, $\beta_{dn}$, $w_{dn}$; concentration parameters $\gamma$ and $\alpha$; plates $D$ and $N$).

where $w_0$ is the global bias, $w_i$ models the strength of the $i$-th feature, $k$ is the dimensionality of the factorization, and $x_i x_j$ ranges over all pairwise combinations of feature variables in a training instance. The model parameters $\{w_0, w_1, \ldots, w_p, v_{11}, \ldots, v_{pk}\}$ are denoted as follows:

$$w_0 \in R, \qquad w \in R^{p}, \qquad V \in R^{p \times k} \tag{8}$$
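To make Eq. (7) concrete, the sketch below evaluates the 2-order FM score in two ways: a direct double loop over feature pairs, and the well-known equivalent rewriting of the pairwise term that runs in time linear in the number of features. The parameter values are hypothetical, and no training is shown.

```python
def fm_predict(x, w0, w, V):
    """Direct evaluation of Eq. (7): bias + linear term + factorized pairwise term."""
    p = len(x)
    linear = sum(w[i] * x[i] for i in range(p))
    pairwise = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            dot = sum(vi * vj for vi, vj in zip(V[i], V[j]))
            pairwise += x[i] * x[j] * dot
    return w0 + linear + pairwise

def fm_predict_fast(x, w0, w, V):
    """Same score via 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    p, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(p))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(p))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(p))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

# Hypothetical parameters for p = 3 features and k = 2 factors.
x = [1.0, 0.0, 2.0]
w0 = 0.5
w = [0.1, 0.2, 0.3]
V = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
score = fm_predict(x, w0, w, V)
print(score, fm_predict_fast(x, w0, w, V))
```

Only the pair (feature 1, feature 3) is active here, contributing 1 x 2 x 0.5 = 1.0, so the score is 0.5 + 0.7 + 1.0 = 2.2 under both evaluations.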

2.4.2. Prediction and Recommendation of Mobile Service for Mashup Based on FMs. The prediction and recommendation of mobile services is treated as a classification problem: the task is to rank mobile services and recommend sufficiently relevant ones for a given Mashup. The result of classification can be denoted as $y \in \{-1, 1\}$. When $y = 1$, the relevant mobile services are recommended to the given Mashup. In the experiment, however, formula (7) only yields predicted values ranging from 0 to 1. We therefore first rank these predicted values, then label the top-K results as positive (+1) and the rest as negative (−1), and finally recommend the mobile services with positive labels to the given (target) Mashup.
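The ranking-then-labelling step can be sketched as follows. The score dictionary is a hypothetical example, and ties are broken by service name purely to keep the output deterministic.

```python
def label_top_k(scores, k):
    """Label the k highest-scoring services +1 and all others -1."""
    ranked = sorted(scores, key=lambda s: (-scores[s], s))
    positives = set(ranked[:k])
    return {s: (1 if s in positives else -1) for s in scores}

# Hypothetical predicted scores for one target Mashup.
scores = {"S1": 0.92, "S2": 0.17, "S3": 0.28}
labels = label_top_k(scores, k=1)
recommended = [s for s, label in labels.items() if label == 1]
print(labels, recommended)
```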

In FM modelling, the target Mashup and the active mobile service can be considered as user and item, respectively. In addition to these two feature dimensions, we add other features, namely similar mobile services, similar Mashups, and the popularity and cooccurrence of mobile services, to improve the accuracy of prediction and recommendation. These additional features are used as part of the input feature vector in FM modelling. Therefore, the model in formula (6) can be extended to the following prediction model with six dimensions:

$$y: MA \times MS \times SMS \times SMA \times CO \times POP \to S \tag{9}$$

where MA is the target Mashup, MS is the active mobile service, SMS represents the similar mobile services, SMA represents the similar Mashups, CO represents the cooccurrence of mobile services, POP indicates the popularity of mobile services, and S indicates the prediction ranking score. The similar Mashups and mobile services are derived from our HDP model in Section 2.3.

Figure 3 is an example of recommending mobile services for a target Mashup using the FM model, in which the data consist of two parts: the input feature vector set X and the output target set Y. Each row includes a feature vector $x_i$ and its corresponding target score $y_i$. The first binary indicator matrix (Box 1) indicates the target Mashup MA. The second binary indicator matrix (Box 2) indicates the active mobile service MS. The third indicator matrix (Box 3) gives the Top-S mobile services SMS most similar to the active mobile service in Box 2; for example, the similarity between S1 and S2 (S3) is 0.3 (0.7). The fourth indicator matrix (Box 4) gives the Top-M Mashups SMA most similar to the target Mashup in Box 1; for instance, the similarity between M2 and M1 (M3) is 0.3 (0.7). The fifth indicator matrix (Box 5) indicates the cooccurrence CO of active mobile services composed or invoked by the same Mashup in historical records. The sixth indicator matrix (Box 6) indicates the popularity POP, i.e., the number of times the active mobile service has been composed or invoked by Mashups. The target Y represents the output of the model, and the prediction ranking score S is classified as positive (+1) or negative (−1) according to a given threshold: if $y_i > 0.5$, then S = +1; otherwise, S = −1. The mobile services with positive values are selected and recommended to the target Mashup. For example, if there are two active mobile services S1 and S3 to choose from for the target Mashup M1, S1 is selected and recommended to M1 because it has the higher prediction value (in Figure 3, $y_2 = 0.92$ for S1 versus $y_6 = 0.28$ for S3). Furthermore, we investigate the influences of top-S and top-M on recommendation performance in the Experiments section.
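A feature vector with the six blocks described above can be assembled by simple concatenation. The sketch below rebuilds the first row of Figure 3, restricted to three Mashups and three services; the helper names and the dictionary-based inputs are assumptions made for illustration.

```python
def one_hot(index, size):
    """Binary indicator vector with a single 1 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]

def build_feature_vector(mashup_idx, service_idx, num_mashups, num_services,
                         similar_services, similar_mashups, cooccurrence,
                         popularity):
    """Concatenate Boxes 1-6: MA, MS, SMS, SMA, CO, POP."""
    box1 = one_hot(mashup_idx, num_mashups)
    box2 = one_hot(service_idx, num_services)
    box3 = [similar_services.get(j, 0.0) for j in range(num_services)]
    box4 = [similar_mashups.get(i, 0.0) for i in range(num_mashups)]
    box5 = [cooccurrence.get(j, 0.0) for j in range(num_services)]
    return box1 + box2 + box3 + box4 + box5 + [float(popularity)]

# First row of Figure 3: target Mashup M2 (index 1), active service S1 (index 0).
x1 = build_feature_vector(
    mashup_idx=1, service_idx=0, num_mashups=3, num_services=3,
    similar_services={1: 0.3, 2: 0.7},   # Box 3: S2 and S3 similar to S1
    similar_mashups={0: 0.3, 2: 0.7},    # Box 4: M1 and M3 similar to M2
    cooccurrence={1: 0.5, 2: 0.5},       # Box 5
    popularity=12,                       # Box 6
)
print(x1)
```

Most entries of such a vector are zero, which is the sparse setting in which FMs are reported to estimate parameters reliably.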

3. Experiments

3.1. Experiment Dataset and Settings. In this experiment, we first crawled 3,929 real Mashups, 10,648 services, and 12,715 invocations between these Mashups and services from ProgrammableWeb. Each Mashup or service is preprocessed to obtain its standard description information. Second, we use the Word2vec tool to expand the description documents of Mashups and services from the English Wikipedia corpus published in April 2017 and obtain their word embedding vectors. More concretely, the gensim module in Python is applied to train on the English Wikipedia corpus and produce its word embedding vectors; Table 1 presents the specific parameters of Word2vec. Finally, the trained word embedding model is used to expand the description documents of Mashups and services: the Top-N words most similar to each original word are identified and used as the extended words. For instance, the top 10 expanded words for the two words "Earth" and "Google" are shown in Table 2. All Mashups in the dataset are evenly divided into 5 subsets, of which 1 subset is used as the testing set and the other 4 subsets are combined as the training set. Five-fold cross validation is conducted, and the results of the folds are averaged to obtain the reported experimental results. For the testing set, we randomly remove some score values from the Mashup-service matrix so that the number of score values provided by the active Mashups is 10, 20, or 30; these settings are called Given 10, Given 20, and Given 30, respectively. The removed score values are used as the expected values. Similarly, for the training set, we randomly remove some score values so that the Mashup-service matrix becomes sparser, with density 10%, 20%, and 30%, respectively.
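The five-fold protocol can be sketched as below. This is a generic illustration, assuming a random shuffle with a fixed seed and round-robin assignment to folds; it is not taken from the authors' code, and the function name is hypothetical.

```python
import random

def five_fold_splits(items, folds=5, seed=0):
    """Yield (train, test) pairs: each subset serves once as the testing set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    subsets = [shuffled[i::folds] for i in range(folds)]
    for i in range(folds):
        test = subsets[i]
        train = [x for j, subset in enumerate(subsets) if j != i for x in subset]
        yield train, test

# Ten hypothetical Mashup identifiers.
splits = list(five_fold_splits(range(10)))
for train, test in splits:
    print(sorted(test), len(train))
```

The metric of interest would be computed on each testing subset and then averaged over the five folds.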

3.2. Evaluation Metrics. Mean absolute error (MAE) is the average of the absolute differences between the predicted values and the true values [16]. Root-mean-squared error (RMSE) is the square root of the sum of the squared deviations between the predicted values and the true values divided by the number of predictions N [16]. We adopt the MAE and RMSE as the evaluation metrics of Web API recommendation. The smaller the MAE and RMSE, the better the recommendation performance.

$$\mathrm{MAE} = \frac{1}{N} \sum_{ij} \left| r_{ij} - \hat{r}_{ij} \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{ij} \left( r_{ij} - \hat{r}_{ij} \right)^2} \tag{10}$$

where $N$ is the number of predicted scores, $r_{ij}$ indicates the true score of Mashup $M_i$ for service $S_j$, and $\hat{r}_{ij}$ indicates the predicted score of $M_i$ for $S_j$.
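Both metrics in Eq. (10) are straightforward to compute from paired lists of true and predicted scores. The values below are hypothetical and serve only to show the arithmetic.

```python
import math

def mae(true_scores, predicted_scores):
    """Mean absolute error over N paired scores."""
    n = len(true_scores)
    return sum(abs(r - p) for r, p in zip(true_scores, predicted_scores)) / n

def rmse(true_scores, predicted_scores):
    """Root-mean-squared error over N paired scores."""
    n = len(true_scores)
    squared = sum((r - p) ** 2 for r, p in zip(true_scores, predicted_scores))
    return math.sqrt(squared / n)

true_scores = [1.0, 0.0, 1.0, 0.0]   # hypothetical invocation indicators
predicted = [0.5, 0.5, 1.0, 0.0]     # hypothetical predictions
print(mae(true_scores, predicted), rmse(true_scores, predicted))
```

RMSE penalises large individual errors more heavily than MAE, which is why the two are usually reported together.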

3.3. Baseline Methods. We choose the following methods as baselines and compare them with our proposed approach:

(i) SPCC. Similar to IPCC [17], the service-based Pearson correlation coefficient (SPCC) approach measures the similarities between mobile services and performs recommendation.

(ii) MPCC. Similar to UPCC [17], the Mashup-based Pearson correlation coefficient (MPCC) approach measures the similarities between Mashups and performs recommendation.

(iii) PMF. Probabilistic matrix factorization (PMF) is a very popular matrix factorization model in collaborative filtering [10]. The historical invocation records between Mashups and mobile services are denoted as a matrix $R = [r_{ij}]_{n \times k}$, where $r_{ij} = 1$ if the mobile service is invoked by the Mashup and $r_{ij} = 0$ otherwise. The probability of the mobile service $S_i$ being invoked by the Mashup $M_j$ is predicted as $\hat{r}_{ij} = S_i^T M_j$.

(iv) LDA-FMs. The topic probability distributions of the description documents of Mashups and mobile services are first derived by the LDA model and are then used to train FMs, which predict the probability of a mobile service being invoked by a Mashup and recommend high-quality mobile services. Besides the topic information, the cooccurrence and popularity of mobile services are exploited in the FM modelling.

(v) HDP-FMs. The prior work [18], which integrates HDP and FMs to recommend mobile services for a target Mashup. The HDP model is applied to derive the topic probability distributions of the description documents of Mashups and mobile services. Similarly, the topic information and the cooccurrence and popularity of mobile services are all used in the FM modelling.

(vi) EHDP-FMs. The method proposed in this paper, which extends the prior work [18]. It first uses the Word2vec tool to expand the description documents of mobile services and Mashups from the Wikipedia corpus. Then, the HDP model is applied to derive the topic probability distributions of the extended description documents of mobile services and Mashups. Finally, the FM is used to predict and recommend high-quality mobile services for the Mashup.

Table 1: Specific parameters of Word2vec.

| Parameter | Value |
|---|---|
| Size (the dimension of word vector) | 200 |
| Window (the length of the window) | 10 |
| Sample (the threshold of sampling) | 0.001 |
| Negative (the number of negative sampling) | 5 |
| Sg (whether or not the Skip-gram model is used) | 0 (No) |
| Hs (whether or not hierarchical Softmax model is used) | 0 (No) |

Table 2: Extension example of the two words "Earth" and "Google".

| Original word | Earth | Google |
|---|---|---|
| Extended words | Planet | Gmail |
| | Martian | Dropbox |
| | Mars | Evernote |
| | Venusian | App |
| | Planets | Adsense |
| | Spaceship | Yahoo |
| | Universe | Microsoft |
| | Planetary | Flickr |
| | Moon | Hotmail |
| | Deimos | Mapquest |

| Row | Box 1: Mashup MA (M1 M2 M3 …) | Box 2: Mobile service MS (S1 S2 S3 …) | Box 3: Similar mobile service SMS (S1 S2 S3 …) | Box 4: Similar Mashup SMA (M1 M2 M3 …) | Box 5: Cooccurrence CO (S1 S2 S3 …) | Box 6: Popularity POP (Freq) | Score S |
|---|---|---|---|---|---|---|---|
| X1 | 0 1 0 … | 1 0 0 … | 0 0.3 0.7 … | 0.3 0 0.7 … | 0 0.5 0.5 … | 12 | 0.36 (−1) |
| X2 | 1 0 0 … | 1 0 0 … | 0 0.5 0.5 … | 0 0.5 0.5 … | 0 1 0 … | 3 | 0.92 (+1) |
| X3 | 0 1 0 … | 0 1 0 … | 0.7 0 0.3 … | 0.5 0 0.5 … | 0.5 0 0.5 … | 7 | 0.17 (−1) |
| X4 | 0 0 1 … | 0 1 0 … | 0.6 0 0.4 … | 0.4 0.6 0 … | 0.5 0 0.5 … | 21 | 0.43 (−1) |
| X5 | 0 0 1 … | 0 0 1 … | 0.3 0.7 0 … | 0.1 0.9 0 … | 0.5 0.5 0 … | 5 | 0.69 (+1) |
| X6 | 1 0 0 … | 0 0 1 … | 0.4 0.1 0 … | 0 0.8 0.2 … | 0.5 0.5 0 … | 3 | 0.28 (−1) |
| X7 | 0 1 0 … | 0 1 0 … | 0.4 0 0.6 … | 0.4 0 0.6 … | 0.5 0 0.5 … | 8 | 0.55 (+1) |
| X8 | 0 0 1 … | 1 0 0 … | 0 0.8 0.2 … | 0.7 0.3 0 … | 0 1 0 … | 1 | 0.74 (+1) |
| … | … | … | … | … | … | … | … |

Figure 3: FM model of recommending mobile service for Mashup (input feature vectors X, rows X1 to X8; output targets Y, scores y1 to y8).

3.4. Experimental Results

3.4.1. Recommendation Performance Comparison. To study the performance of mobile service recommendation, we compare our method with the five baseline methods. For our EHDP-FMs, we select the number of extended words per original word in the description documents of Mashups and services that achieves the best recommendation result; a detailed investigation is given in Section 3.4.2. Table 3 reports the MAE and RMSE of the recommendation methods. It shows that our EHDP-FMs greatly outperforms SPCC and MPCC, significantly surpasses PMF and LDA-FMs, and consistently exceeds HDP-FMs by a smaller margin. The reason is that, in EHDP-FMs, (a) more useful word information is obtained from the extended description content of Mashups and services; (b) more Mashups and services that are similar in topic distribution are identified by the HDP technique; and (c) the FM models and trains on this useful information (including similar Mashups and similar services and the cooccurrence and popularity of services) to achieve more accurate prediction of service probability scores. Furthermore, when the number of given score values increases from 10 to 30 and the density of the training matrix rises from 10% to 30%, the MAE and RMSE of EHDP-FMs drop steadily. That is to say, more score values and a denser training matrix lead to better recommendation accuracy.

3.4.2. Effect of the Number of Extended Words on Mobile Service Recommendation. These experiments investigate the effect of the number of extended words on mobile service recommendation in our proposed method. We set the number of extended words to 1, 3, 5, and 7 (denoted as EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, and EHDP-FMs-7, respectively) with training matrix density = 10% and report the resulting MAE and RMSE values in Figures 4 and 5. The experimental results indicate that the performance of EHDP-FMs-1 is the worst in all cases. This is because, when only one word is added for each original word, the extended description documents of Mashups and services are still short and contain little useful information. The MAE and RMSE of EHDP-FMs-3 are the best in all cases. However, when the number of extended words increases further to 5 and 7, the recommendation performance decreases. The reason is that too many extended words introduce irrelevant syntactic and semantic information, which may prevent the HDP topic model from mining the latent topics accurately and therefore weakens the performance of service recommendation. Therefore, we select 3 extended words for each original word in the description documents of Mashups and services in our EHDP-FMs method. These observations indicate that it is important to choose an appropriate number of extended words for mobile service recommendation.

3.4.3. HDP-FMs Performance vs. LDA-FMs Performance with Different Topic Numbers. In this experiment, we set the number of topics for LDA-FMs to 3, 6, 12, and 24, denoted as LDA-FMs-3/6/12/24. Figures 6 and 7 show the MAE and RMSE values, respectively, when the training matrix density is 10%. We observe that the performance of HDP-FMs is the best. The MAE and RMSE of LDA-FMs-12 are close to those of HDP-FMs and better than those of LDA-FMs-3, LDA-FMs-6, and LDA-FMs-24. These observations indicate that HDP-FMs is better than LDA-FMs, since it automatically derives the optimal number of topics instead of requiring repeated training as LDA does.

3.4.4. Impacts of Top-S and Top-M in HDP-FMs. We investigate the effects of top-S and top-M on mobile service recommendation in order to obtain their optimal values. The optimal value of each parameter is obtained across all settings of the other, i.e., S = 5 for all values of top-M similar Mashups and M = 10 for all values of top-S similar mobile services. With training matrix density = 10% and given number = 30, the MAEs of HDP-FMs are presented in Figures 8 and 9. Figure 8 shows that the MAE of HDP-FMs rises constantly when S grows from 5 to 25. Figure 9 indicates that the MAE of HDP-FMs reaches its best (lowest) value when M = 10 and rises as M increases or decreases from 10. These observations mean that it is important to identify suitable values of S and M for the HDP-FMs method.

4. Related Works

Service recommendation is a hot topic in service-oriented computing [19]. Traditional service recommendation addresses the quality of Mashup services in order to realize high-quality service recommendation. Picozzi et al. [20] showed that the quality of single services can drive recommendation. Cappiello [22] analyzed the quality attributes of Mashup components (APIs) and information quality in Mashups [21]. In addition, the collaborative filtering (CF) technique is widely used in QoS-based service recommendation [16]. It can be used to measure the similarity of services or users, predict missing QoS values on the basis of the QoS records of similar services or similar users, and recommend services to users.

The problems of data sparsity and the long tail lead to inaccurate and incomplete search results, according to references [23, 24]. To address these problems, some researchers use matrix factorization to decompose historical QoS or Mashup-service interactions to obtain service recommendations [25, 26]. Zheng et al. [26] proposed a collaborative QoS prediction method in which a neighbourhood-integrated matrix factorization model is designed to predict the QoS values of personalized Web services. Xu et al. [9] proposed a social-aware service recommendation method in which the multidimensional social relationships among potential users, Mashups, topics, and services are described by a coupled matrix model. These methods aim to transform the Mashup-service rating matrix or QoS matrix into a lower-dimensional feature space and predict the probability of services being invoked by Mashups or the unknown QoS values.

Considering that matrix factorization depends on abundant historical interaction records, recent work incorporates additional information into matrix factorization to obtain more accurate service recommendation [5, 10–12]. Among them, Ma et al. [11] integrate matrix factorization with geographic and social influence to recommend points of interest. Chen et al. [12] proposed a personalized service recommendation that uses location information and QoS of Web services to cluster services and users. Yao et al. [10] studied the historical invocation relationships between Mashups and services to infer the implicit functional correlation between services and incorporated this correlation into the matrix factorization model to facilitate service recommendation. Liu and Fulia [5] proposed collaborative topic regression, which combines probabilistic topic modelling and probabilistic matrix factorization for service recommendation.

The existing methods based on matrix factorization undoubtedly improve the performance of service recommendation. At the same time, we observe that few of them use the historical invocations between services and Mashups to derive latent topics, and they do not use FMs to model and train these latent topics to predict the probability of Mashups calling services and so obtain more accurate service recommendation. In our previous work [8, 27, 28], we mainly addressed the LDA or enhanced LDA topic model for Web service clustering [27, 28] and also exploited the word embedding technique to enhance the accuracy of service clustering [8]. Driven by these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the latent topics from the description documents of mobile services and Mashups to support FM model training. We use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.

Table 3: MAE and RMSE comparison of multiple recommendation approaches.

| Given | Method | MAE (density 10%) | RMSE (density 10%) | MAE (density 20%) | RMSE (density 20%) | MAE (density 30%) | RMSE (density 30%) |
|---|---|---|---|---|---|---|---|
| Given 10 | SPCC | 0.4258 | 0.5643 | 0.4005 | 0.5257 | 0.3932 | 0.5036 |
| | MPCC | 0.4316 | 0.5701 | 0.4108 | 0.5293 | 0.4035 | 0.5113 |
| | PMF | 0.2417 | 0.3835 | 0.2263 | 0.3774 | 0.2014 | 0.3718 |
| | LDA-FMs | 0.2091 | 0.3225 | 0.1969 | 0.3116 | 0.1832 | 0.3015 |
| | HDP-FMs | 0.1547 | 0.2874 | 0.1329 | 0.2669 | 0.1283 | 0.2498 |
| | EHDP-FMs | 0.1308 | 0.2507 | 0.1154 | 0.2372 | 0.1081 | 0.2093 |
| Given 20 | SPCC | 0.4135 | 0.5541 | 0.3918 | 0.5158 | 0.3890 | 0.5003 |
| | MPCC | 0.4413 | 0.5712 | 0.4221 | 0.5202 | 0.4151 | 0.5109 |
| | PMF | 0.2398 | 0.3559 | 0.2137 | 0.3427 | 0.1992 | 0.3348 |
| | LDA-FMs | 0.1989 | 0.3104 | 0.1907 | 0.3018 | 0.1801 | 0.2894 |
| | HDP-FMs | 0.1486 | 0.2713 | 0.1297 | 0.2513 | 0.1185 | 0.2291 |
| | EHDP-FMs | 0.1227 | 0.2419 | 0.1055 | 0.2216 | 0.0952 | 0.1904 |
| Given 30 | SPCC | 0.4016 | 0.5447 | 0.3907 | 0.5107 | 0.3739 | 0.5012 |
| | MPCC | 0.4518 | 0.5771 | 0.4317 | 0.5159 | 0.4239 | 0.5226 |
| | PMF | 0.2214 | 0.3319 | 0.2091 | 0.3117 | 0.1986 | 0.3052 |
| | LDA-FMs | 0.1970 | 0.3096 | 0.1865 | 0.2993 | 0.1794 | 0.2758 |
| | HDP-FMs | 0.1377 | 0.2556 | 0.1109 | 0.2461 | 0.1047 | 0.2057 |
| | EHDP-FMs | 0.1113 | 0.2248 | 0.0926 | 0.2057 | 0.0804 | 0.1673 |

Figure 4: MAE values of EHDP-FMs-1/3/5/7 (x-axis: Given number 10, 20, 30; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).

Figure 5: RMSE values of EHDP-FMs-1/3/5/7 (x-axis: Given number 10, 20, 30; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantic representation of Mashup applications and mobile services is consolidated and mined by extending their description documents and modelling their topic probability distributions, and quality prediction is performed by using FMs to train and model multiple feature dimensions of mobile services. High-quality mobile services are ranked and recommended for building Mashups by simultaneously considering their functional representation and quality features. As a result, the accuracy of mobile service recommendation is significantly improved.

Figure 6: MAE of HDP-FMs and LDA-FMs (x-axis: Given number 10, 20, 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 7: RMSE of HDP-FMs and LDA-FMs (x-axis: Given number 10, 20, 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 8: Impact of Top-S in HDP-FMs (x-axis: Top-S 5, 10, 15, 20, 25; y-axis: MAE).

Figure 9: Impact of Top-M in HDP-FMs (x-axis: Top-M 5, 10, 15, 20, 25; y-axis: MAE).

Although the above approach seems effective for Mashup development, it would be better if a prototype were designed and implemented to validate its effectiveness and application value. The objective of such a prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. Python 3.5, MySQL 5.6, Flask, and Pyecharts can be used to develop the prototype system. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for a given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between them from ProgrammableWeb and stores them in corresponding data tables. Because the description documents of Mashups and mobile services contain some useless or meaningless terms, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes the following: tokenization, in which words are segmented by spaces and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python; stop-word removal, in which the stop-word list in NLTK is applied to remove common short words or symbols that have no practical meaning but occur frequently (such as the, to, a, an, with, and at); and stemming, in which the common endings (such as ing, s, and ed) of the various grammatical forms of a word (such as provide, providing, provides, and provided) are removed.

(2) In the second part (description extension of Mashup and mobile service), the English Wikipedia corpus trained by Word2vec, based on the gensim module in Python, is used as the description extension source of Mashups and mobile services. The Top-N words most similar to an original word in the description documents are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, with time complexity $O(\log N)$. Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains the similarity matrix, whose time and space complexity are both $O(n^2)$, where $n$ is the total number of words. During description extension, the Top-N words most similar to an original word can be found by a table look-up in the trained English Wikipedia model, with time complexity $O(1)$ and space complexity $O(N^2)$.

(3) In the third part (topic modelling of Mashup and mobile service), the hierarchical Dirichlet process is used to extract the implicit topics of Mashups and mobile services; it clusters (groups) service data according to word cooccurrence frequency. It automatically determines the optimal number of topics, which avoids adjusting the number of topics repeatedly and so saves time. It can also predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which allows the prototype system to work in real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python are used to present the results of topic modelling, and a download function lets users download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile service for the given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services for the software developer to build a novel Mashup application. During recommendation, factorization machines train on and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from the topic similarity based on the HDP model, and quality features, i.e., the cooccurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of recommendation.
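The preprocessing steps in part (1) can be illustrated with a minimal, standard-library-only sketch. It is not the prototype's code: the prototype uses NLTK, whereas the tiny stop-word set and the naive suffix stripper below are stand-ins assumed here purely for illustration, and all function names are hypothetical.

```python
import re

# Tiny illustrative stop-word list; the paper uses NLTK's stop-word table instead.
STOP_WORDS = {"the", "to", "a", "an", "with", "and", "at", "of", "for", "in"}

# Naive suffix stripping; a stand-in for a real stemmer, not NLTK's algorithm.
SUFFIXES = ("ing", "ed", "es", "s", "e")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word, min_stem_len=3):
    """Strip the first matching suffix if enough of the word remains."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= min_stem_len:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, and stem the remaining tokens."""
    return [stem(tok) for tok in tokenize(text) if tok not in STOP_WORDS]


print(preprocess("Provides maps to the users, providing a provided service."))
```

With this stripper, the four forms “provide,” “provides,” “providing,” and “provided” all reduce to the same stem, which is the effect the stemming step is meant to achieve; a production system would rely on a proper stemmer instead.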

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing that combines word-embeddings-enhanced HDP with FMs. The experimental results on the ProgrammableWeb dataset show that, compared with the existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate fine-grained service relationship information and incorporate it into the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

10 Mobile Information Systems

Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371; the National Key R&D Program of China under grant no. 2017YFB1400602; the Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136; and the Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., “Toward mobile service computing: opportunities and challenges,” IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, “Category-aware API clustering and distributed recommendation for automatic mashup creation,” IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid)

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, “WT-LDA: user tagging augmented LDA for web service clustering,” in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, “Incorporating user, topic, and service-related latent factors into web service recommendation,” in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, “Latent dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, “Hierarchical dirichlet process,” Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, “WE-LDA: a word embeddings augmented LDA model for web services clustering,” in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, “A social-aware service recommendation approach for mashup creation,” in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, “Service recommendation for mashup composition with implicit correlation regularization,” in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June-July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, “Recommender systems with social regularization,” in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, “Web service recommendation via exploiting location and QoS information,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, “Factorization machines,” in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, “Factorization machines with libFM,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, “Sharing clusters among related groups: hierarchical dirichlet processes,” Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, “WSRec: a collaborative filtering based web service recommender system,” in Proceedings IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, “Web APIs recommendation for mashup development based on hierarchical dirichlet process and factorization machines,” in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, “Reputation measurement and malicious feedback rating prevention in web service recommendation system,” IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, “Quality-based recommendations for mashup composition,” in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, “Information quality in mashups,” IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, “A quality model for mashup components,” in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, “An empirical study of programmable web: a network analysis on a service-mashup system,” in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, “Manifold-learning based API recommendation for mashup creation,” in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, “An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, “Collaborative web service QoS prediction via neighborhood integrated matrix factorization,” IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, “Integrated content and network-based service clustering and web APIs recommendation for mashup development,” IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, “Domain-aware mashup service clustering based on LDA topic model from multiple data sources,” Information and Software Technology, vol. 90, pp. 40–54, 2017.


service. Therefore, it is very necessary for topic modelling to extend the description documents of Mashup and mobile service. In 2013, Google developed Word2vec [8], a tool that expresses words as real-valued vectors. It is an open-source word embeddings toolkit that simplifies document content processing into vector operations in a K-dimensional vector space by using the idea of deep learning [8]. The word embeddings vector not only contains the semantic and grammatical relations of words but also captures the context information of the document, which yields a more accurate word embeddings vector representation. Wikipedia is recognized as the most comprehensive and authoritative online encyclopaedia on the Internet and provides a rich corpus. We use the Wikipedia corpus as the extension source of the description documents of Mashup and mobile service. More concretely, we first use the Word2vec tool to train on the Wikipedia corpus and obtain its word embeddings vector model. Then, we exploit the trained word vector model to extend the description documents of Mashup and mobile service. That is to say, for each word $w_i$ in the preprocessed service description document $S_{text}$, the Top-N most similar words $T_{w_i} = (t_1, t_2, \ldots, t_N)$ to $w_i$ are identified from the word embeddings vector space of the Wikipedia corpus and used as the extended words. The extended service description document is denoted as $S_{ExText} = \{(w_1, T_{w_1}), (w_2, T_{w_2}), \ldots, (w_n, T_{w_n})\}$, where $n$ is the total number of words in the service description document and $N$ is the number of extended words. In the next section, we will perform comparative experiments to determine the optimal number of extended words.
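The extension step can be sketched as follows. This is only an illustration under stated assumptions: the two-dimensional vectors are invented toy values rather than trained Wikipedia embeddings, cosine similarity is assumed as the similarity measure, and the function names are hypothetical. In the paper, the vectors come from a Word2vec model trained with gensim.

```python
import math

# Toy 2-dimensional embeddings, invented purely for illustration.
TOY_VECTORS = {
    "earth": (1.0, 0.1),
    "planet": (0.9, 0.2),
    "mars": (0.8, 0.3),
    "google": (0.1, 1.0),
    "gmail": (0.2, 0.9),
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def top_n_similar(word, vectors, n):
    """Return the n vocabulary words closest to `word`, i.e. T_w."""
    others = [w for w in vectors if w != word]
    others.sort(key=lambda w: cosine(vectors[word], vectors[w]), reverse=True)
    return others[:n]


def extend_document(words, vectors, n):
    """Build S_ExText as a list of (w_i, T_{w_i}) pairs.

    Words missing from the vocabulary get an empty extension list.
    """
    return [(w, top_n_similar(w, vectors, n) if w in vectors else []) for w in words]


print(extend_document(["earth", "google"], TOY_VECTORS, n=2))
```

Each original word is kept and paired with its N nearest neighbours, so the extended document grows roughly by a factor of N + 1.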

The word embeddings vector model of the Wikipedia corpus is trained with the CBOW model using the negative sampling algorithm, in which words that are closer in meaning lie closer together in the word vector space. These close words have similar semantic and grammatical relations and can therefore be used for service description document expansion. The input of the CBOW model is the word vectors of the context-related words of a specific word, and its output is the word vector of that specific word. Suppose $w$ and $C_w$ represent a word extracted from the service description document dataset and its context-related words, respectively. The probability of predicting $w$ from the context-related (surrounding) words can be denoted as follows:

$$p(w \mid C_w) = \frac{1}{1 + e^{-v_{C_w} W_w}}, \tag{1}$$

where $W_w$ is the parameter of the hidden layer and softmax layer in the neural network and $v_{C_w}$ is the sum of the vectors of each word in $C_w$. The training objective of the CBOW model is defined as follows (maximum likelihood estimation function):

$$\mathrm{OBJ}_{\mathrm{CBOW}} = \arg\max_{V_w, W_1} \left( \prod_{(w, C_w) \in T} \log p(w \mid C_w) \times \prod_{(w, C_w) \notin T} \bigl(1 - \log p(w \mid C_w)\bigr) \right). \tag{2}$$
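As a small illustration of equation (1), the sketch below sums the context word vectors into $v_{C_w}$, takes the dot product with $W_w$, and applies the sigmoid. The two-dimensional vectors and the function name are assumptions made for this example only; real Word2vec training additionally updates these parameters, which is not shown.

```python
import math


def cbow_probability(context_vectors, output_weights):
    """Equation (1): sigmoid of the dot product between the summed
    context vectors v_{C_w} and the output parameters W_w."""
    dim = len(output_weights)
    v_c = [sum(vec[d] for vec in context_vectors) for d in range(dim)]
    score = sum(a * b for a, b in zip(v_c, output_weights))
    return 1.0 / (1.0 + math.exp(-score))


# Hypothetical 2-dimensional vectors for two context words.
context = [(0.5, -0.5), (0.5, 1.5)]
print(cbow_probability(context, (0.0, 0.0)))  # zero weights give exactly 0.5
print(cbow_probability(context, (1.0, 1.0)))
```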

Next, we use the HDP model to model the extended service description document $S_{ExText}$, automatically determining the optimal number of topics and estimating the latent topic distribution.

Figure 1: Framework of mobile service recommendation for Mashup. The framework comprises three stages: description extension (Wikipedia corpus, Word2vec, extended words), topic modelling (HDP topic model), and service recommendation (FM model).

2.3. Topic Modelling of Mashup and Mobile Service Using HDP. The hierarchical Dirichlet process (HDP) is a Dirichlet process (DP) mixture model with multilevel form, and it is also a nonparametric Bayesian approach to clustering grouped data [15]. Assume $(\Theta, C)$ is a measurable space, $G_0$ is a probability measure on that space, and $\alpha_0$ is a positive real number. The Dirichlet process [16] is interpreted as the distribution of a random probability measure $G$ over $(\Theta, C)$: $G$ follows a Dirichlet process if, for any finite partition $(A_1, A_2, \ldots, A_r)$ of the measurable space $\Theta$, the random vector $(G(A_1), \ldots, G(A_r))$ is distributed as a finite-dimensional Dirichlet distribution with parameters $(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r))$:

$$(G(A_1), \ldots, G(A_r)) \sim \mathrm{Dir}(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r)). \tag{3}$$

HDP is used to model the documents of mobile services and Mashups. Figure 2 is the HDP probabilistic graph, which shows the documents of mobile services or Mashups together with their words and latent topics. In it, $\gamma$ and $\alpha_0$ are concentration parameters, and $D$ represents the entire Mashup document set, in which each Mashup document is denoted as $d$. In the lower part of Figure 2, $H$ is the base probability measure and $G_0$ is the global random probability measure. The topic probability distribution generated for Mashup document $d$ is denoted as $G_d$, the topic of the $n$-th word in $d$ drawn from $G_d$ is denoted as $\beta_{dn}$, and $w_{dn}$ is the word generated from $\beta_{dn}$.

The generative process of the HDP model is as follows:

(1) Sample the probability distribution $G_0 \sim \mathrm{DP}(\gamma, H)$.

(2) For each $d$ in $D$, sample a topic distribution $G_d \sim \mathrm{DP}(\alpha_0, G_0)$.

(3) For each word $n \in \{1, 2, \ldots, N\}$ in $d$:

(a) Sample a topic of the $n$-th word, $\beta_{dn} \sim G_d$.

(b) Sample a word from the multinomial distribution of the topic words, $w_{dn} \sim \mathrm{Multi}(\beta_{dn})$.

To perform HDP sampling, a construction method is needed to infer the posterior distribution of the parameters. The Chinese Restaurant Franchise (CRF) is a representative construction method, which provides a way to construct the Dirichlet process. Assume there are $J$ Chinese restaurants, where restaurant $j$ contains $m_j$ tables $(\psi_{jt})_{t=1}^{m_j}$ and seats $N_j$ customers. In the $j$-th restaurant, each table shares a menu $v = (v_k)_{k=1}^{K}$, where $K$ is the number of foods. Customers can choose a table at random, and each table is served a dish from a menu common to all restaurants. In this way, the customers, restaurants, and foods in the Chinese restaurant franchise correspond to the words, documents, and topics in our HDP model, respectively. Assuming $\delta$ is a probability measure, the topic distribution $\theta_{ji}$ of the word $x_{ji}$ is treated as a customer entering the restaurant, and each distinct value $\psi_{jt}$ corresponds to the table where the customer is seated. The customer sits at table $\psi_{jt}$ with probability $n_{jt}/(i - 1 + \alpha_0)$ or, with probability $\alpha_0/(i - 1 + \alpha_0)$, chooses a new table $\psi_{jt^{new}}$, which shares a food $v_k$. Here, $n_{jt}$ is the number of customers at the $t$-th table of the $j$-th restaurant. If the customer chooses a new table, she/he assigns an existing food $v_k$ to the new table with probability $m_k/(\sum_k m_k + \gamma)$, according to the popularity of the foods already chosen, or a new food $v_{k^{new}}$ with probability $\gamma/(\sum_k m_k + \gamma)$. Here, $m_k$ is the number of tables serving the food $v_k$. We have the conditional distributions

$$\theta_{ji} \mid \theta_{j1}, \theta_{j2}, \ldots, \theta_{j,i-1}, \alpha_0, G_0 \sim \sum_{t=1}^{m_j} \frac{n_{jt}}{i - 1 + \alpha_0} \delta_{\psi_{jt}} + \frac{\alpha_0}{i - 1 + \alpha_0} G_0, \tag{4}$$

$$\psi_{jt} \mid \psi_{11}, \psi_{12}, \ldots, \psi_{21}, \ldots, \psi_{j,t-1}, \gamma, H \sim \sum_{k=1}^{K} \frac{m_k}{\sum_k m_k + \gamma} \delta_{v_k} + \frac{\gamma}{\sum_k m_k + \gamma} H. \tag{5}$$

In fact, the above CRF process of distributing tables and foods to customers corresponds to the process of word-topic distribution and document-topic clustering in the Mashup document set, respectively. After the construction of the CRF, the HDP model uses Gibbs sampling to infer the posterior probability distribution of its parameters, so as to obtain the topic distribution of the entire Mashup document set.
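The seating and dish-selection probabilities that appear in equations (4) and (5) can be sketched in a few lines. This is only an illustration of the CRF prior for a single restaurant, assuming hypothetical function names and a fixed random seed; it does not perform the Gibbs sampling used for posterior inference.

```python
import random


def table_probabilities(table_counts, alpha0):
    """Seating probabilities for the next customer in one restaurant.

    Existing table t gets n_jt / (i - 1 + alpha0) and a new table gets
    alpha0 / (i - 1 + alpha0), where i - 1 customers are already seated.
    """
    seated = sum(table_counts)
    probs = [n / (seated + alpha0) for n in table_counts]
    probs.append(alpha0 / (seated + alpha0))  # last entry: open a new table
    return probs


def dish_probabilities(dish_table_counts, gamma):
    """Dish probabilities for a new table: m_k / (sum_k m_k + gamma) for an
    existing dish and gamma / (sum_k m_k + gamma) for a new dish."""
    total = sum(dish_table_counts)
    probs = [m / (total + gamma) for m in dish_table_counts]
    probs.append(gamma / (total + gamma))  # last entry: a brand-new dish
    return probs


def seat_customers(num_customers, alpha0, seed=0):
    """Simulate table assignments in a single restaurant."""
    rng = random.Random(seed)
    counts = []
    for _ in range(num_customers):
        probs = table_probabilities(counts, alpha0)
        choice = rng.choices(range(len(probs)), weights=probs)[0]
        if choice == len(counts):
            counts.append(1)      # customer opens a new table
        else:
            counts[choice] += 1   # customer joins an existing table
    return counts


print(table_probabilities([2, 1], alpha0=1.0))  # [0.5, 0.25, 0.25]
print(seat_customers(20, alpha0=1.0))
```

Larger values of `alpha0` make new tables more likely, which is how the concentration parameter controls the number of clusters that emerge.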

2.4. Mobile Service Recommendation for Mashup Using FMs

2.4.1. Rating Prediction in Recommendation System and FMs. In a traditional recommendation system, for a user set $U = \{u_1, u_2, \ldots\}$ and an item set $I = \{i_1, i_2, \ldots\}$, the rating prediction function is denoted as follows:

$$y: U \times I \rightarrow \mathbb{R}, \tag{6}$$

where $y$ is the rating and $y(u, i)$ represents the rating of user $u$ for item $i$.

FMs are a universal predictor that can estimate reliable parameters under very high sparsity [13, 14]. They combine the advantages of SVMs with those of factorization models. Like SVMs, FMs work with any real-valued feature vector; unlike SVMs, they use factorized parameters to model all interactions between feature variables. Therefore, they are well suited to predicting the rating of items for users. Assume there are an input feature matrix $X \in \mathbb{R}^{n \times p}$ and an output target vector $y = (y_1, y_2, \ldots, y_n)^T$. Here, $n$ is the number of input-output pairs and $p$ is the number of input features; that is, the $i$-th row vector $x_i \in \mathbb{R}^p$ has $p$ input feature values, and $y_i$ is the predicted target value of $x_i$. On the basis of $X$ and $y$, the 2-order FM can be denoted as follows:

$$\hat{y}(x) = w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j=i+1}^{p} x_i x_j \sum_{f=1}^{k} v_{if} v_{jf}, \tag{7}$$

where $w_0$ is the global bias, $k$ is the dimensionality of the factorization, $w_i$ models the strength of the $i$-th feature, and $x_i x_j$ ranges over all pairwise combinations of the feature variables $x_i$ and $x_j$ in a training instance. The model parameters $\{w_0, w_1, \ldots, w_p, v_{11}, \ldots, v_{pk}\}$ are denoted as follows:

$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^{p}, \quad V \in \mathbb{R}^{p \times k}. \tag{8}$$

Figure 2: Probabilistic graph of HDP.
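Equation (7) can be evaluated directly with a double loop over feature pairs. The sketch below does so and also shows the equivalent linear-time form of the pairwise term described by Rendle [13]. The parameter values are hypothetical and chosen only so that the result is easy to check by hand; training of the parameters is not shown.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score, written directly from equation (7)."""
    p, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(p))
    pairwise = 0.0
    for i in range(p):
        for j in range(i + 1, p):
            dot = sum(V[i][f] * V[j][f] for f in range(k))
            pairwise += x[i] * x[j] * dot
    return w0 + linear + pairwise


def fm_predict_fast(x, w0, w, V):
    """Same score via the O(k * p) reformulation of the pairwise term."""
    p, k = len(x), len(V[0])
    linear = sum(w[i] * x[i] for i in range(p))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(p))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(p))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Hypothetical parameters for p = 3 features and k = 2 factors.
w0, w = 0.5, [0.1, -0.2, 0.3]
V = [[1.0, 1.0], [0.5, 0.5], [0.0, 2.0]]
x = [1.0, 0.0, 2.0]
print(fm_predict(x, w0, w, V), fm_predict_fast(x, w0, w, V))
```

For this input, the bias contributes 0.5, the linear term 0.7, and the only nonzero pairwise term (features 1 and 3) contributes 4, so both functions return approximately 5.2.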

2.4.2. Prediction and Recommendation of Mobile Service for Mashup Based on FMs. The prediction and recommendation of mobile services is a typical classification problem, and it is regarded as the task of ranking mobile services and recommending enough related mobile services for a given Mashup. The result of classification can be denoted as $y \in \{-1, 1\}$. When $y = 1$, the relevant mobile services are recommended to the given Mashup. However, in the experiment, we can only obtain predicted values ranging from 0 to 1 by formula (7). We first rank these predicted values, then label the top-K results as positive (+1) and the rest as negative (−1), and finally recommend the mobile services with positive values to the given (target) Mashup.

In the modelling of FMs, the target Mashup and the active mobile services can be considered as the user and the item, respectively. In addition to these two-dimensional features (target Mashup and active mobile services), we add other features, such as similar mobile services, similar Mashups, and the popularity and cooccurrence of mobile services, to improve the accuracy of prediction and recommendation. The additional features can be used as input feature vectors in FM modelling. Therefore, the model in formula (6) can be extended to the following prediction model with six dimensions:

$$y: MA \times MS \times SMS \times SMA \times CO \times POP \rightarrow S, \tag{9}$$

where MA is the target Mashup, MS is the active mobile service, SMS represents the similar mobile services, SMA represents the similar Mashups, CO represents the cooccurrence of mobile services, POP indicates the popularity of mobile services, and S indicates the prediction ranking score. These similar Mashups and mobile services are derived from our HDP model in Section 2.3.

Figure 3 is an example of recommending mobile services for a target Mashup using the FM model, in which the data consist of two parts: the input feature vector set X and the output target set Y. Each row includes a feature vector $x_i$ and its corresponding target score value $y_i$. The first binary indicator matrix (Box 1) indicates the target Mashup MA. The second binary indicator matrix (Box 2) indicates the active mobile service MS. The third indicator matrix (Box 3) contains the Top-S mobile services SMS similar to the active mobile service in Box 2; for example, the similarity between S1 and S2 (S3) is 0.3 (0.7). The fourth indicator matrix (Box 4) contains the Top-M similar Mashups SMA of the target Mashup in Box 1; for instance, the similarity between M2 and M1 (M3) is 0.3 (0.7). The fifth indicator matrix (Box 5) indicates the cooccurrence CO of active mobile services composed or invoked by the same Mashup in historical records. The sixth indicator matrix (Box 6) indicates the popularity POP (i.e., the frequency/times) of active mobile services composed or invoked by Mashups. Target Y represents the output of the model, and the prediction ranking score S is classified as positive (+1) or negative (−1) according to a given threshold: if $y_i > 0.5$, then S = +1; otherwise, S = −1. The mobile services with positive values are selected and recommended to the target Mashup. For example, if there are two candidate active mobile services, S1 and S3, for the target Mashup M1, S1 is selected and recommended to M1 because it has the higher prediction value ($y_2 = 0.92$ for S1 versus 0.28 for S3 in Figure 3). Furthermore, we will investigate the influences of top-S and top-M on recommendation performance in the Experiments section.
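The layout of one FM input row and the threshold rule can be sketched as follows, using the values shown in Figure 3 (row X2 and the eight listed scores). The function names are hypothetical, and each box is truncated to the three columns visible in the figure; the real feature vectors are much wider.

```python
def build_feature_vector(ma, ms, sms, sma, co, pop):
    """Concatenate the six feature groups (Boxes 1-6) into one FM input row."""
    return list(ma) + list(ms) + list(sms) + list(sma) + list(co) + [pop]


def label_scores(scores, threshold=0.5):
    """Map predicted scores to +1 / -1 using the threshold rule."""
    return [1 if s > threshold else -1 for s in scores]


# Row X2 of Figure 3: target Mashup M1 and active service S1.
x2 = build_feature_vector(
    ma=[1, 0, 0], ms=[1, 0, 0], sms=[0, 0.5, 0.5],
    sma=[0, 0.5, 0.5], co=[0, 1, 0], pop=3,
)

# Predicted scores y1..y8 as listed in Figure 3.
scores = [0.36, 0.92, 0.17, 0.43, 0.69, 0.28, 0.55, 0.74]
labels = label_scores(scores)

# Candidates for Mashup M1 in Figure 3: S1 (row 2) and S3 (row 6).
candidates = {"S1": scores[1], "S3": scores[5]}
best = max(candidates, key=candidates.get)
print(len(x2), labels, best)
```

The labels produced by the 0.5 threshold match the (+1)/(−1) annotations in Figure 3, and S1 is the service chosen for M1.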

3. Experiments

3.1. Experiment Dataset and Settings. In this experiment, we first crawled 3929 real Mashups, 10648 services, and 12715 invocations between these Mashups and services from ProgrammableWeb. For each Mashup or service, preprocessing is performed to obtain its standard description information. Second, we use the Word2vec tool to expand the description document of each Mashup or service from the English Wikipedia corpus published in April 2017 and obtain the word embeddings vectors. More concretely, the gensim module in Python is applied to train on the English Wikipedia corpus and produce its word embeddings vectors; Table 1 presents the specific parameters of Word2vec. Finally, the trained English Wikipedia model is exploited to expand the description documents of Mashups and services: the Top-N words most similar to each original word are identified and used as the extended words. For instance, the 10 most similar expanded words of the two words “Earth” and “Google” are shown in Table 2. All Mashups in the dataset are evenly divided into 5 subsets, of which 1 subset is used as the testing set and the other 4 subsets are combined as the training set. Five-fold cross validation is conducted, and the results of the folds are averaged to obtain the reported experiment results. For the testing set, by randomly removing some score values from the Mashup-service matrix, we vary the number of score values provided by the active Mashups as 10, 20, and 30 and call these settings Given 10, Given 20, and Given 30, respectively. The removed score values are used as the expected values. Similarly, for the training set, by randomly removing some score values, the Mashup-service matrix is made sparser, with density 10%, 20%, and 30%, respectively.
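The five-fold protocol described above can be sketched as follows. The shuffling, the seed, and the Mashup identifiers are assumptions made for illustration; the paper does not specify how the subsets were drawn beyond their being of equal size.

```python
import random


def five_fold_splits(items, k=5, seed=0):
    """Yield (train, test) pairs: each fold is the test set exactly once."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]  # near-equal sized folds
    for i in range(k):
        test = folds[i]
        train = [x for j, fold in enumerate(folds) if j != i for x in fold]
        yield train, test


mashups = [f"M{i}" for i in range(1, 11)]  # ten hypothetical Mashup ids
for train, test in five_fold_splits(mashups):
    print(len(train), len(test))
```

Metrics computed on each of the five test folds are then averaged to give the reported value.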

3.2. Evaluation Metrics. Mean absolute error (MAE) is the mean of the absolute differences between the observed (predicted) values and the true values [16]. Root-mean-squared error (RMSE) is the square root of the mean of the squared deviations between the observed values and the true values over the N observations [16]. We adopt MAE and RMSE as the evaluation metrics of Web API recommendation. The smaller the MAE and RMSE, the better the recommendation effect:

$$\mathrm{MAE} = \frac{1}{N} \sum_{i,j} \left| r_{ij} - \hat{r}_{ij} \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i,j} \left( r_{ij} - \hat{r}_{ij} \right)^2}, \tag{10}$$

where $N$ is the number of predicted scores, $r_{ij}$ indicates the true score of Mashup $M_i$ for service $S_j$, and $\hat{r}_{ij}$ indicates the predicted score of $M_i$ for $S_j$.
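Both metrics in equation (10) are straightforward to compute. The sketch below uses three invented score pairs purely to show the arithmetic; the function names are hypothetical.

```python
import math


def mae(true_scores, predicted_scores):
    """Mean absolute error over paired true and predicted scores."""
    n = len(true_scores)
    return sum(abs(r - p) for r, p in zip(true_scores, predicted_scores)) / n


def rmse(true_scores, predicted_scores):
    """Root-mean-squared error over paired true and predicted scores."""
    n = len(true_scores)
    return math.sqrt(sum((r - p) ** 2 for r, p in zip(true_scores, predicted_scores)) / n)


true_scores = [1.0, 0.0, 1.0]
predicted = [0.5, 0.5, 1.0]
print(mae(true_scores, predicted), rmse(true_scores, predicted))
```

Here the absolute errors are 0.5, 0.5, and 0, so the MAE is 1/3 and the RMSE is the square root of 1/6; RMSE penalizes large individual errors more heavily than MAE does.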

3.3. Baseline Methods. We choose the following methods as baselines to compare with our proposed approach:

(i) SPCC: similar to IPCC [17], the service-based Pearson correlation coefficient (SPCC) approach measures the similarities between mobile services and performs recommendation.

(ii) MPCC: similar to UPCC [17], the Mashup-based Pearson correlation coefficient (MPCC) approach measures the similarities between Mashups and performs recommendation.

(iii) PMF: in collaborative filtering, probabilistic matrix factorization (PMF) is a very popular matrix factorization model [10]. The historical invocation records between Mashups and mobile services are denoted as a matrix $R = [r_{ij}]_{n \times k}$. If $r_{ij} = 1$, the mobile service is invoked by the Mashup; otherwise, $r_{ij} = 0$. The probability of mobile service $S_i$ being invoked by Mashup $M_j$ can be predicted as $\hat{r}_{ij} = S_i^T M_j$.

(iv) LDA-FMs: the topic probability distributions of the description documents of Mashups and mobile services are first derived by the LDA model and then used to train FMs, which predict the probability of a mobile service being invoked by a Mashup and recommend high-quality mobile services. Besides the topic information, the cooccurrence and popularity of mobile services are exploited in the FM modelling.

(v) HDP-FMs: the prior work [18], which integrates HDP and FMs to recommend mobile services for a target Mashup. The HDP model is applied to derive the topic probability distributions of the description documents of Mashups and mobile services. Similarly, the topic information and the cooccurrence and popularity of mobile services are all used in the FM modelling.

(vi) EHDP-FMs: the method proposed in this paper, which extends the prior work [18]. It first uses the Word2vec tool to expand the description documents of mobile services and Mashups from the Wikipedia corpus. Then, the HDP model is applied to derive the topic probability distributions of the extended description documents of mobile services and Mashups. Finally, the FM is deployed to predict and recommend high-quality mobile services for Mashups.

Table 1: Specific parameters of Word2vec.

Parameter                                                 Value
Size (the dimension of word vector)                       200
Window (the length of the window)                         10
Sample (the threshold of sampling)                        0.001
Negative (the number of negative sampling)                5
Sg (whether or not the Skip-gram model is used)           0 (No)
Hs (whether or not hierarchical Softmax model is used)    0 (No)

Table 2: Extension example of the two words “Earth” and “Google.”

Original word     Earth        Google
Extended words    Planet       Gmail
                  Martian      Dropbox
                  Mars         Evernote
                  Venusian     App
                  Planets      Adsense
                  Spaceship    Yahoo
                  Universe     Microsoft
                  Planetary    Flickr
                  Moon         Hotmail
                  Deimos       Mapquest

Figure 3: FM model of recommending mobile service for Mashup. Each row is one input feature vector of X with its target in Y; further columns in each box and further rows are elided in the original figure.

      Box 1: MA    Box 2: MS    Box 3: SMS     Box 4: SMA     Box 5: CO      Box 6: POP    Score (S)
      M1 M2 M3     S1 S2 S3     S1  S2  S3     M1  M2  M3     S1  S2  S3     Freq
X1    0  1  0      1  0  0      0   0.3 0.7    0.3 0   0.7    0   0.5 0.5    12            y1 = 0.36 (−1)
X2    1  0  0      1  0  0      0   0.5 0.5    0   0.5 0.5    0   1   0      3             y2 = 0.92 (+1)
X3    0  1  0      0  1  0      0.7 0   0.3    0.5 0   0.5    0.5 0   0.5    7             y3 = 0.17 (−1)
X4    0  0  1      0  1  0      0.6 0   0.4    0.4 0.6 0      0.5 0   0.5    21            y4 = 0.43 (−1)
X5    0  0  1      0  0  1      0.3 0.7 0      0.1 0.9 0      0.5 0.5 0      5             y5 = 0.69 (+1)
X6    1  0  0      0  0  1      0.4 0.1 0      0   0.8 0.2    0.5 0.5 0      3             y6 = 0.28 (−1)
X7    0  1  0      0  1  0      0.4 0   0.6    0.4 0   0.6    0.5 0   0.5    8             y7 = 0.55 (+1)
X8    0  0  1      1  0  0      0   0.8 0.2    0.7 0.3 0      0   1   0      1             y8 = 0.74 (+1)

3.4. Experimental Results

3.4.1. Recommendation Performance Comparison. To study the performance of mobile service recommendation, we compare our method with the five baseline methods. We select the optimal number of extended words for each original word in the description documents of Mashups and services to achieve the best recommendation result in our EHDP-FMs; a detailed investigation is given in Section 3.4.2. Table 3 reports the MAE and RMSE comparison of the recommendation methods, which shows that our EHDP-FMs greatly outperforms SPCC and MPCC, significantly surpasses PMF and LDA-FMs, and slightly but consistently exceeds HDP-FMs. The reason is that, in EHDP-FMs, (a) more useful word information is obtained from the extended description content of Mashups and services, (b) more similar Mashups and similar services in terms of topic distribution are identified by using HDP, and (c) the FM models and trains on this useful information (including similar Mashups and similar services and the cooccurrence and popularity of services) to achieve more accurate service probability score prediction. Furthermore, when the number of given score values increases from 10 to 30 and the density of the training matrix rises from 10% to 30%, the MAE and RMSE of EHDP-FMs drop. That is to say, more score values and a denser training matrix mean better recommendation accuracy.

3.4.2. Effect of the Number of Extended Words on Mobile Service Recommendation. The experiments investigate the effect of the number of extended words on mobile service recommendation in our proposed method. We set the number of extended words to 1, 3, 5, and 7 (denoted as EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, and EHDP-FMs-7, respectively) with training matrix density = 10% and report the MAE and RMSE values in Figures 4 and 5. The experimental results indicate that the performance of EHDP-FMs-1 is the worst in all cases. This is because, when only one word is added for each original word, the extended description documents of Mashups and services are still short and contain little useful information. The MAE and RMSE of EHDP-FMs-3 are the best in all cases. However, when the number of extended words increases further to 5 and 7, the recommendation performance decreases. The reason is that too many extended words introduce irrelevant syntactic and semantic information, which may prevent the HDP topic model from mining the latent topics accurately and therefore weakens the performance of service recommendation. Therefore, we select 3 extended words for each original word of the description documents of Mashups and services in our EHDP-FMs method. These observations indicate that it is very important to choose an appropriate number of extended words for mobile service recommendation.

3.4.3. HDP-FMs Performance vs. LDA-FMs Performance with Different Topic Numbers. In this experiment, we set the number of topics to 3, 6, 12, and 24 for LDA-FMs, denoted as LDA-FMs-3/6/12/24. The experimental results in Figures 6 and 7 show the MAE and RMSE values, respectively, when the training matrix density is 10%. We observe that the performance of HDP-FMs is the best. The MAE and RMSE of LDA-FMs-12 are close to those of HDP-FMs and better than those of LDA-FMs-3, LDA-FMs-6, and LDA-FMs-24. These observations indicate that HDP-FMs is better than LDA-FMs, since it automatically derives the optimal number of topics instead of requiring repeated training like LDA.

3.4.4. Impacts of Top-S and Top-M in HDP-FMs. We investigate the effects of top-S and top-M on mobile service recommendation in order to obtain their optimal values. The optimal values are S = 5 for all top-M similar Mashups and M = 10 for all top-S similar mobile services. With training matrix density = 10% and given number = 30, the MAEs of HDP-FMs are presented in Figures 8 and 9. Figure 8 shows that the MAE of HDP-FMs rises steadily as S grows from 5 to 25. Figure 9 indicates that HDP-FMs performs best at M = 10 and that its MAE rises as M increases or decreases from that value. These observations mean that it is very important to identify suitable values of S and M for the HDP-FMs method.

4. Related Work

Service recommendation is a hot topic in service-oriented computing [19]. Traditional service recommendation addresses the quality of Mashup services in order to realize high-quality service recommendation. Picozzi et al. [20] showed that the quality of a single service can facilitate recommendation. Cappiello [22] analyzed the quality attributes of Mashup components (APIs) and the information quality in Mashups [21]. In addition, the collaborative filtering (CF) technique is widely exploited in QoS-based service recommendation [16]. It can be used to measure the similarity of services or users, predict missing QoS values on the basis of the QoS records of similar services or similar users, and recommend services to users.

The problems of data sparsity and the long tail lead to inaccurate and incomplete search results, according to references [23, 24]. To address these problems, some researchers use matrix factorization to decompose historical QoS or Mashup-service interactions and obtain service recommendations [25, 26]. Zheng et al. [26] proposed a collaborative QoS prediction method in which a neighbourhood-integrated matrix factorization model is designed to predict the QoS values of personalized Web services. Xu et al. [9] proposed a social-aware service recommendation method in which the multidimensional social relationships among potential users, Mashups, topics, and services are depicted by a coupled matrix model. These methods aim to transform the Mashup-service rating matrix or the QoS matrix into a lower-dimensional feature space and to predict the probability of services being invoked by Mashups or the unknown QoS values.

Considering that matrix factorization depends on abundant historical interaction records, recent work incorporates additional information into matrix factorization to obtain more accurate service recommendation [5, 10–12]. Among them, Ma et al. [11] integrate matrix factorization with geographic and social influence to recommend points of interest. Chen et al. [12] propose a personalized service recommendation method that uses the location information and QoS of Web services to cluster services and users. Yao et al. [10] study the historical invocation relationship between Mashups and services to infer the implicit functional correlation between services and incorporate this correlation into the matrix factorization model to facilitate service recommendation. Liu and Fulia [5] propose collaborative topic regression, which combines probabilistic topic modelling and probabilistic matrix factorization for service recommendation.

The existing methods based on matrix factorization undoubtedly improve the performance of service recommendation. At the same time, we observe that few of them use the historical invocations between services and

Table 3: MAE and RMSE comparison of multiple recommendation approaches.

                      Matrix density = 10%   Matrix density = 20%   Matrix density = 30%
Given     Method      MAE      RMSE          MAE      RMSE          MAE      RMSE
Given 10  SPCC        0.4258   0.5643        0.4005   0.5257        0.3932   0.5036
          MPCC        0.4316   0.5701        0.4108   0.5293        0.4035   0.5113
          PMF         0.2417   0.3835        0.2263   0.3774        0.2014   0.3718
          LDA-FMs     0.2091   0.3225        0.1969   0.3116        0.1832   0.3015
          HDP-FMs     0.1547   0.2874        0.1329   0.2669        0.1283   0.2498
          EHDP-FMs    0.1308   0.2507        0.1154   0.2372        0.1081   0.2093
Given 20  SPCC        0.4135   0.5541        0.3918   0.5158        0.3890   0.5003
          MPCC        0.4413   0.5712        0.4221   0.5202        0.4151   0.5109
          PMF         0.2398   0.3559        0.2137   0.3427        0.1992   0.3348
          LDA-FMs     0.1989   0.3104        0.1907   0.3018        0.1801   0.2894
          HDP-FMs     0.1486   0.2713        0.1297   0.2513        0.1185   0.2291
          EHDP-FMs    0.1227   0.2419        0.1055   0.2216        0.0952   0.1904
Given 30  SPCC        0.4016   0.5447        0.3907   0.5107        0.3739   0.5012
          MPCC        0.4518   0.5771        0.4317   0.5159        0.4239   0.5226
          PMF         0.2214   0.3319        0.2091   0.3117        0.1986   0.3052
          LDA-FMs     0.1970   0.3096        0.1865   0.2993        0.1794   0.2758
          HDP-FMs     0.1377   0.2556        0.1109   0.2461        0.1047   0.2057
          EHDP-FMs    0.1113   0.2248        0.0926   0.2057        0.0804   0.1673

Figure 4: MAE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; y-axis: MAE).

Figure 5: RMSE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; y-axis: RMSE).

Mashups to derive potential topics, and they do not use FMs to model and train these potential topics to predict the probability of Mashups calling services and thereby obtain more accurate service recommendation. Our previous work [8, 27, 28] mainly addressed LDA or enhanced LDA topic models for Web service clustering [27, 28] and also exploited word embedding techniques to enhance the accuracy of service clustering [8]. Building on these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the potential topics from the description documents of mobile services and Mashups to support FM model training. We then use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantic representation of Mashup applications and mobile services is consolidated and mined by extending their description documents and modelling their topic probability distributions, and quality prediction is performed by exploiting FMs to train and model multiple dimensions of mobile service features. High-quality mobile services are ranked and recommended for building Mashups by simultaneously considering their functional representation and quality features. As a result, the accuracy of mobile service recommendation is significantly improved.

Although the above approach appears effective for Mashup development, it will be better if a

Figure 6: MAE of HDP-FMs and LDA-FMs (series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, and HDP-FMs; x-axis: given number, 10 to 30).

Figure 7: RMSE of HDP-FMs and LDA-FMs (series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, and HDP-FMs; x-axis: given number, 10 to 30).

Figure 8: Impact of Top-S in HDP-FMs (x-axis: Top-S, 5 to 25; y-axis: MAE).

Figure 9: Impact of Top-M in HDP-FMs (x-axis: Top-M, 5 to 25; y-axis: MAE).

prototype can be designed and implemented to validate the effectiveness and application value of the approach. The objective of the prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. The prototype can be developed with Python 3.5 and MySQL 5.6, using Flask and Pyecharts. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for a given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between them from ProgrammableWeb and stores them in corresponding data tables. Because the description documents of Mashups and mobile services contain some useless or meaningless terms, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes the following steps: tokenization, in which words are segmented by spaces and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python; stop-word removal, in which the stop-word list in NLTK is applied to remove common short words or symbols that occur frequently but carry no practical meaning, such as "the," "to," "a," "an," "with," and "at"; and stemming, in which the common endings (such as "ing," "s," and "ed") of the various grammatical forms of a word, such as "provide," "providing," "provides," and "provided," are removed.

(2) In the second part (description extension of Mashups and mobile services), an English Wikipedia corpus trained by Word2vec with the gensim module in Python is exploited as the description extension source. The Top-N words most similar to each original word in the description documents of Mashups and mobile services are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, with time complexity O(log N). Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains a similarity matrix whose time and space complexity are both O(n^2), where n is the total number of words. During description extension, the Top-N words most similar to an original word can be found by a table look-up in the trained English Wikipedia corpus, with time complexity O(1) and space complexity O(n^2).

(3) In the third part (topic modelling of Mashups and mobile services), the hierarchical Dirichlet process is used to extract the implicit topics of Mashups and mobile services, which clusters (groups) service data according to word cooccurrence frequency. It automatically determines the optimal number of topics, which avoids repeatedly adjusting the number of topics and so saves time. It can also accurately predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which lets the prototype system respond in real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python are used to present the results of topic modelling, and a download function lets users download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile services for the given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services for the developer to build a novel Mashup application. During recommendation, factorization machines train and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from the topic similarity based on the HDP model, and quality features, i.e., the cooccurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the recommendation results.
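The preprocessing steps of part (1) can be sketched as follows. This is a minimal, standard-library stand-in and not the prototype's code: the regular-expression tokenizer, the tiny stop-word set, and the naive suffix stripper are assumptions that take the place of NLTK's tokenizer, stop-word list, and stemmer, and all function names are hypothetical.

```python
import re

# Hypothetical, deliberately tiny stop-word set; NLTK's list is much larger.
STOP_WORDS = {"the", "to", "a", "an", "with", "and", "at", "of", "for", "in"}

# Endings named in the text; a real stemmer handles far more cases.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def strip_suffix(word):
    """Remove the first matching ending, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(description):
    """Tokenize, drop stop words, then strip common endings."""
    tokens = tokenize(description)
    return [strip_suffix(t) for t in tokens if t not in STOP_WORDS]
```

The suffix stripper is intentionally crude; it only illustrates where stemming sits in the pipeline, and a production system would rely on a proper stemmer.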

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing by combining word-embedding-enhanced HDP and FMs. The experimental results on the ProgrammableWeb dataset show that, compared with existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate fine-grained service relationship information and incorporate it into the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371; the National Key R&D Program of China under grant no. 2017YFB1400602; the Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136; and the Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., "Toward mobile service computing: opportunities and challenges," IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, "Category-aware API clustering and distributed recommendation for automatic mashup creation," IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid).

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, "WT-LDA: user tagging augmented LDA for web service clustering," in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, "Incorporating user, topic, and service-related latent factors into web service recommendation," in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, "Latent Dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, "Hierarchical Dirichlet process," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, "WE-LDA: a word embeddings augmented LDA model for web services clustering," in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, "A social-aware service recommendation approach for mashup creation," in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, "Service recommendation for mashup composition with implicit correlation regularization," in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June-July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, "Recommender systems with social regularization," in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, "Web service recommendation via exploiting location and QoS information," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, "Factorization machines," in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, "Factorization machines with libFM," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, "Sharing clusters among related groups: hierarchical Dirichlet processes," Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, "WSRec: a collaborative filtering based web service recommender system," in Proceedings IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, "Web APIs recommendation for mashup development based on hierarchical Dirichlet process and factorization machines," in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, "Reputation measurement and malicious feedback rating prevention in web service recommendation system," IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, "Quality-based recommendations for mashup composition," in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, "Information quality in mashups," IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, "A quality model for mashup components," in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, "An empirical study of programmable web: a network analysis on a service-mashup system," in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, "Manifold-learning based API recommendation for mashup creation," in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, "An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems," IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, "Collaborative web service QoS prediction via neighborhood integrated matrix factorization," IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, "Integrated content and network-based service clustering and web APIs recommendation for mashup development," IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, "Domain-aware mashup service clustering based on LDA topic model from multiple data sources," Information and Software Technology, vol. 90, pp. 40–54, 2017.


distribution of the random probability measure $G$ over $(\Theta, C)$. $G$ follows a Dirichlet process if, for any finite partition $(A_1, A_2, \ldots, A_r)$ of the measurable space $\Theta$, the random vector $(G(A_1), \ldots, G(A_r))$ is distributed as a finite-dimensional Dirichlet distribution with parameters $(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r))$:

$$(G(A_1), \ldots, G(A_r)) \sim \mathrm{Dir}(\alpha_0 G_0(A_1), \ldots, \alpha_0 G_0(A_r)). \tag{3}$$

HDP is used to model the documents of mobile services and Mashups. Figure 2 is the HDP probabilistic graph, which shows the documents of mobile services or Mashups together with their words and potential topics. In it, $\gamma$ and $\alpha_0$ are concentration parameters, and $D$ is the entire Mashup document set, in which each Mashup document is denoted $d$. In the lower part of Figure 2, $H$ is the base probability measure and $G_0$ is the global random probability measure. The topic probability distribution generated for Mashup document $d$ is denoted $G_d$, the topic generated from $G_d$ for the $n$-th word in $d$ is denoted $\beta_{dn}$, and $w_{dn}$ is the word generated from $\beta_{dn}$.

The generative process of the HDP model is as follows:

(1) Sample the probability distribution $G_0 \sim \mathrm{DP}(\gamma, H)$.

(2) For each $d$ in $D$, sample a topic distribution $G_d \sim \mathrm{DP}(\alpha_0, G_0)$.

(3) For each word $n \in \{1, 2, \ldots, N\}$ in $d$:

(a) Sample a topic of the $n$-th word, $\beta_{dn} \sim G_d$.

(b) Sample a word from the multinomial distribution of the topic words, $w_{dn} \sim \mathrm{Multi}(\beta_{dn})$.
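The generative process above can be illustrated with a small simulation. The sketch below is an approximation under stated assumptions, not the paper's implementation: the global measure is replaced by truncated stick-breaking weights over a fixed number of topics, each document-level measure by a Dirichlet draw whose parameters are the document concentration times those weights, and the base measure H by a symmetric Dirichlet over a toy vocabulary. The function names, `truncation`, and `eta` are hypothetical.

```python
import random


def stick_breaking(concentration, truncation, rng):
    """Truncated stick-breaking weights; the last stick takes the remainder."""
    weights, remaining = [], 1.0
    for _ in range(truncation - 1):
        fraction = rng.betavariate(1.0, concentration)
        weights.append(remaining * fraction)
        remaining *= 1.0 - fraction
    weights.append(remaining)
    return weights


def sample_dirichlet(alphas, rng):
    """Draw from a Dirichlet distribution via normalized gamma variates."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def generate_corpus(vocab, doc_lengths, gamma=1.0, alpha0=1.0,
                    truncation=10, eta=0.5, seed=0):
    """Generate toy documents by following steps (1)-(3) under a truncation."""
    rng = random.Random(seed)
    # Topics drawn from the base measure H (assumed symmetric Dirichlet).
    topics = [sample_dirichlet([eta] * len(vocab), rng) for _ in range(truncation)]
    # Step (1): global topic weights, a truncated stand-in for G0.
    global_weights = stick_breaking(gamma, truncation, rng)
    corpus = []
    for length in doc_lengths:
        # Step (2): document-level weights, a finite stand-in for Gd.
        params = [max(alpha0 * w, 1e-12) for w in global_weights]
        doc_weights = sample_dirichlet(params, rng)
        words = []
        for _ in range(length):
            # Step (3a): choose the topic of the n-th word.
            topic = rng.choices(range(truncation), weights=doc_weights)[0]
            # Step (3b): choose a word from that topic's word distribution.
            words.append(rng.choices(vocab, weights=topics[topic])[0])
        corpus.append(words)
    return corpus
```

Because every document reuses the same global weights, topics that are popular overall tend to be popular in each document, which is the sharing behaviour the hierarchy is meant to provide.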

To perform HDP sampling, a construction method is needed to infer the posterior distribution of the parameters. The Chinese Restaurant Franchise (CRF) is a representative construction. Assume there are $J$ Chinese restaurants, where restaurant $j$ contains $m_j$ tables $(\psi_{jt})_{t=1}^{m_j}$ and $N_j$ customers. All tables in the $J$ restaurants share one menu $v = (v_k)_{k=1}^{K}$, where $K$ is the number of dishes. Customers choose a table at random, and each table is served one dish from the menu common to all restaurants. In this way, the customers, restaurants, and dishes in the CRF correspond to the words, documents, and topics in our HDP model, respectively. Let $\delta$ denote a probability measure. The topic distribution $\theta_{ji}$ of the word $x_{ji}$ is treated as the $i$-th customer entering restaurant $j$, and the value $\psi_{jt}$ corresponds to the table where the customer is seated. With concentration parameters $\alpha_0$ and $\gamma$, the customer sits at an occupied table $\psi_{jt}$ with probability $n_{jt}/(i-1+\alpha_0)$, sharing its dish $v_k$, or sits at a new table $\psi_{jt^{\mathrm{new}}}$ with probability $\alpha_0/(i-1+\alpha_0)$, where $n_{jt}$ is the number of customers at the $t$-th table of the $j$-th restaurant. If the customer chooses a new table, the table is assigned an existing dish $v_k$ with probability $m_k/(\sum_k m_k + \gamma)$, according to the popularity of the dish, or a new dish $v_{k^{\mathrm{new}}}$ with probability $\gamma/(\sum_k m_k + \gamma)$, where $m_k$ is the number of tables serving dish $v_k$. We have the conditional distributions

$$\theta_{ji} \mid \theta_{j1}, \ldots, \theta_{j,i-1}, \alpha_0, G_0 \sim \sum_{t=1}^{m_j} \frac{n_{jt}}{i-1+\alpha_0}\,\delta_{\psi_{jt}} + \frac{\alpha_0}{i-1+\alpha_0}\,G_0, \tag{4}$$

$$\psi_{jt} \mid \psi_{11}, \psi_{12}, \ldots, \psi_{21}, \ldots, \psi_{j,t-1}, \gamma, H \sim \sum_{k=1}^{K} \frac{m_k}{\sum_k m_k + \gamma}\,\delta_{v_k} + \frac{\gamma}{\sum_k m_k + \gamma}\,H. \tag{5}$$
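The seating rules in equations (4) and (5) can be simulated directly. The sketch below draws from the CRF prior only (it ignores the observed words, so it is not the Gibbs sampler used for inference), and the function names and the shared `dish_table_counts` list are illustrative assumptions.

```python
import random


def table_probabilities(table_counts, alpha0):
    """Equation (4): occupied-table probabilities, new-table probability last."""
    denom = sum(table_counts) + alpha0  # equals i - 1 + alpha0 for customer i
    return [n / denom for n in table_counts] + [alpha0 / denom]


def dish_probabilities(dish_table_counts, gamma):
    """Equation (5): existing-dish probabilities, new-dish probability last."""
    denom = sum(dish_table_counts) + gamma
    return [m / denom for m in dish_table_counts] + [gamma / denom]


def seat_customers(num_customers, alpha0, gamma, dish_table_counts, rng=None):
    """Seat customers in one restaurant; returns (table sizes, dish per table).

    dish_table_counts is shared by all restaurants and is updated in place.
    """
    rng = rng or random.Random()
    table_counts, table_dishes = [], []
    for _ in range(num_customers):
        probs = table_probabilities(table_counts, alpha0)
        t = rng.choices(range(len(probs)), weights=probs)[0]
        if t == len(table_counts):  # a new table also has to pick a dish
            dish_probs = dish_probabilities(dish_table_counts, gamma)
            k = rng.choices(range(len(dish_probs)), weights=dish_probs)[0]
            if k == len(dish_table_counts):  # a brand-new dish
                dish_table_counts.append(0)
            dish_table_counts[k] += 1
            table_counts.append(1)
            table_dishes.append(k)
        else:
            table_counts[t] += 1
    return table_counts, table_dishes
```

Calling `seat_customers` once per restaurant with the same `dish_table_counts` list reproduces the franchise effect: dishes that many tables already serve are more likely to be chosen by new tables in any restaurant.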

In fact, the above CRF process of assigning tables and dishes to customers corresponds to word-topic assignment and document-topic clustering in the Mashup document set, respectively. After the CRF construction, the HDP model uses Gibbs sampling to infer the posterior probability distribution of its parameters and thereby obtain the topic distribution of the entire Mashup document set.

2.4. Mobile Service Recommendation for Mashup Using FMs

2.4.1. Rating Prediction in Recommendation System and FMs. In a traditional recommendation system, for a user set $U = \{u_1, u_2, \ldots\}$ and an item set $I = \{i_1, i_2, \ldots\}$, the rating prediction function is denoted as follows:

$$y: U \times I \to \mathbb{R}, \tag{6}$$

where $y$ is the rating and $y(u, i)$ represents the rating of user $u$ for item $i$.

FMs are a general predictor that can estimate reliable parameters under very high sparsity [13, 14]. They combine the advantages of SVMs with those of factorization models. Like SVMs, FMs work with any real-valued feature vector; unlike SVMs, they use factorized parameters to model all interactions between feature variables. Therefore, they are well suited to predicting users' ratings of items. Assume there are an input feature matrix $X \in \mathbb{R}^{n \times p}$ and an output target vector $y = (y_1, y_2, \ldots, y_n)^T$. Here, $n$ is the number of input-output pairs and $p$ is the number of input features; i.e., the $i$-th row vector $x_i \in \mathbb{R}^p$ has $p$ input feature values, and $y_i$ is the prediction target of $x_i$. On the basis of $X$ and $y$, the 2-order FM for a single instance $x$ with feature values $x_1, \ldots, x_p$ can be written as follows:

$$\hat{y}(x) = w_0 + \sum_{i=1}^{p} w_i x_i + \sum_{i=1}^{p} \sum_{j=i+1}^{p} x_i x_j \sum_{f=1}^{k} v_{if} v_{jf}, \tag{7}$$

Figure 2: Probabilistic graph of HDP (labels in the graph: H, G0, Gd, βdn, wdn, γ, α, D, N).

where $w_0$ is the global bias, $w_i$ models the strength of the $i$-th feature, $k$ is the dimensionality of the factorization, and $x_i x_j$ ranges over all pairs of feature variables in a training instance. The model parameters $\{w_0, w_1, \ldots, w_p, v_{11}, \ldots, v_{pk}\}$ are denoted as follows:

$$w_0 \in \mathbb{R}, \quad w \in \mathbb{R}^{p}, \quad V \in \mathbb{R}^{p \times k}. \tag{8}$$
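As a worked illustration of equation (7), the following sketch evaluates a 2-order FM for one feature vector in two ways: directly over all feature pairs, and with the linear-time reformulation given by Rendle [13], in which the pairwise term equals half of the squared sum minus the sum of squares for each factor. The function names and the plain-list representation of the parameters are assumptions made for this example.

```python
def fm_predict_naive(x, w0, w, V):
    """Equation (7) evaluated directly: bias, linear terms, all pairwise terms."""
    p, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(p))
    for i in range(p):
        for j in range(i + 1, p):
            dot = sum(V[i][f] * V[j][f] for f in range(k))
            y += dot * x[i] * x[j]
    return y


def fm_predict_fast(x, w0, w, V):
    """Same value in O(k * p) time using the reformulated pairwise term."""
    p, k = len(x), len(V[0])
    y = w0 + sum(w[i] * x[i] for i in range(p))
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(p))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(p))
        y += 0.5 * (s * s - s_sq)
    return y
```

The fast form matters in this setting because the input vectors are long and sparse, so the cost is dominated by the nonzero features rather than by all pairs.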

2.4.2. Prediction and Recommendation of Mobile Service for Mashup Based on FMs. The prediction and recommendation of mobile services is a typical classification problem: it is regarded as the task of ranking mobile services and recommending sufficiently relevant ones for a given Mashup. The classification result can be denoted as $y \in \{-1, 1\}$. When $y = 1$, the relevant mobile services are recommended to the given Mashup. However, in the experiment, formula (7) yields only predicted values ranging from 0 to 1. We therefore first rank these predicted values, then label the top-K results as positive (+1) and the rest as negative (−1), and finally recommend the mobile services with positive labels to the given or target Mashup.
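The ranking-and-labelling step described above can be sketched in a few lines. The function name and the dictionary of scores keyed by service identifier are hypothetical choices for illustration.

```python
def label_top_k(scores, k):
    """Map {service: predicted score} to {service: +1 or -1}, top-k positive."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    positive = set(ranked[:k])
    return {service: (1 if service in positive else -1) for service in scores}
```

For example, with predicted scores for three candidate services and K = 1, only the highest-scoring service receives the label +1 and is recommended.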

In FM modelling, the target Mashup and the active mobile service can be regarded as the user and the item, respectively. In addition to these two features, we add other features, namely similar mobile services, similar Mashups, and the popularity and cooccurrence of mobile services, to improve the accuracy of prediction and recommendation. The additional features are used as input feature vectors in FM modelling. Therefore, the model in formula (6) can be extended to the following prediction model with six dimensions:

$$y: MA \times MS \times SMS \times SMA \times CO \times POP \to S, \tag{9}$$

where MA is the target Mashup, MS is the active mobile service, SMS represents the similar mobile services, SMA represents the similar Mashups, CO represents the cooccurrence of mobile services, POP indicates the popularity of mobile services, and S indicates the prediction ranking score. The similar Mashups and mobile services are derived from our HDP model in Section 2.3.

Figure 3 is an example of recommending mobile services for a target Mashup using the FM model, in which the data consist of two parts: the input feature vector set X and the output target set Y. Each row includes a feature vector $x_i$ and its corresponding target score $y_i$. The first binary indicator matrix (Box 1) indicates the target Mashup MA. The second binary indicator matrix (Box 2) indicates the active mobile service MS. The third indicator matrix (Box 3) gives the top-S mobile services SMS similar to the active mobile service in Box 2; for example, the similarity between S1 and S2 (S3) is 0.3 (0.7). The fourth indicator matrix (Box 4) gives the top-M Mashups SMA similar to the target Mashup in Box 1; for instance, the similarity between M2 and M1 (M3) is 0.3 (0.7). The fifth indicator matrix (Box 5) indicates the cooccurrence CO of active mobile services composed or invoked by the same Mashup in historical records. The sixth indicator matrix (Box 6) indicates the popularity POP (i.e., the invocation frequency) of active mobile services composed or invoked by Mashups. The target Y is the output of the model, and the prediction ranking score S is classified as positive (+1) or negative (−1) against a given threshold: if $y_i > 0.5$, then S = +1; otherwise, S = −1. The mobile services with positive values are selected and recommended to the target Mashup. For example, if there are two active mobile services, S1 and S3, available to the target Mashup M1, S1 will be selected and recommended to M1 because it has the higher prediction value, i.e., $y_2 > 0.92$. We investigate the influences of top-S and top-M on recommendation performance in the Experiments section.
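To make the layout of one input row concrete, the sketch below concatenates the six blocks described for Figure 3 into a single FM feature vector. The function names and argument order are hypothetical; the example values in the usage note are those of the first row of Figure 3, restricted to the three Mashups and three services shown there.

```python
def one_hot(index, size):
    """Binary indicator vector with a single 1 at position index."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def build_feature_vector(mashup, service, sms, sma, co, pop,
                         n_mashups, n_services):
    """Concatenate the six blocks (Boxes 1-6) into one FM input row."""
    return (one_hot(mashup, n_mashups)      # Box 1: target Mashup (MA)
            + one_hot(service, n_services)  # Box 2: active mobile service (MS)
            + list(sms)                     # Box 3: similar mobile services (SMS)
            + list(sma)                     # Box 4: similar Mashups (SMA)
            + list(co)                      # Box 5: cooccurrence (CO)
            + [float(pop)])                 # Box 6: popularity (POP)
```

Calling `build_feature_vector(1, 0, [0, 0.3, 0.7], [0.3, 0, 0.7], [0, 0.5, 0.5], 12, 3, 3)` yields a 16-element row for target Mashup M2 and active service S1.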

3. Experiments

3.1. Experiment Dataset and Settings. In this experiment, we first crawled 3929 real Mashups, 10648 services, and 12715 invocations between these Mashups and services from ProgrammableWeb. For each Mashup or service, preprocessing is performed to obtain its standard description information. Second, we use the Word2vec tool to expand the description documents of Mashups and services from the English Wikipedia corpus published in April 2017 and obtain their word embedding vectors. More concretely, the gensim module in Python is applied to train on the English Wikipedia corpus and produce its word embedding vectors; Table 1 presents the specific parameters of Word2vec. Finally, the trained English Wikipedia corpus is exploited to expand the description documents of Mashups and services: the Top-N words most similar to each original word are identified and used as the extended words. For instance, the top 10 extended words for "Earth" and "Google" are shown in Table 2. All Mashups in the dataset are evenly divided into 5 subsets, of which 1 subset is used as the testing set and the other 4 subsets are combined as the training set. Five-fold cross validation is conducted, and the results of the folds are averaged to obtain the reported experiment results. For the testing set, by randomly removing some score values from the Mashup-service matrix, we vary the number of score values provided by the active Mashups as 10, 20, and 30, and call these settings Given 10, Given 20, and Given 30, respectively. The removed score values are used as the expected values. Similarly, for the training set, by randomly removing some score values, the Mashup-service matrix becomes sparser, with density 10%, 20%, and 30%, respectively.
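The description-extension step can be sketched as a nearest-neighbour look-up over word vectors. The code below is a minimal illustration that assumes the embeddings are already available as a plain dictionary from word to vector; it uses cosine similarity computed by hand rather than the gensim API, and the function names are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_n_similar(word, embeddings, n):
    """Return the n vocabulary words most similar to word, excluding itself."""
    target = embeddings[word]
    others = [w for w in embeddings if w != word]
    others.sort(key=lambda w: cosine(target, embeddings[w]), reverse=True)
    return others[:n]


def extend_description(tokens, embeddings, n):
    """Append the top-n similar words after each token that has a vector."""
    extended = []
    for token in tokens:
        extended.append(token)
        if token in embeddings:
            extended.extend(top_n_similar(token, embeddings, n))
    return extended
```

Tokens without a vector are kept unchanged, so out-of-vocabulary words in a description are never dropped by the extension step.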

3.2. Evaluation Metrics. Mean absolute error (MAE) is the mean of the absolute differences between the observed (predicted) values and the true values, which measures the average magnitude of the prediction error [16]. Root-mean-squared error (RMSE) is the square root of the ratio of the sum of squared deviations between the observed values and the true values to the number of observations N [16]. We adopt MAE and RMSE as the evaluation metrics of Web API recommendation. The smaller the MAE and RMSE, the better the recommendation performance.

$$\mathrm{MAE} = \frac{1}{N} \sum_{ij} \left| r_{ij} - \hat{r}_{ij} \right|, \qquad \mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{ij} \left( r_{ij} - \hat{r}_{ij} \right)^2}, \tag{10}$$

where $N$ is the number of predicted scores, $r_{ij}$ indicates the true score of Mashup $M_i$ for service $S_j$, and $\hat{r}_{ij}$ indicates the predicted score of $M_i$ for $S_j$.
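Both metrics in formula (10) are straightforward to compute. The sketch below assumes the true and predicted scores are given as two flat, equally long lists of numbers; the function names are illustrative.

```python
import math


def mae(true_scores, predicted_scores):
    """Mean absolute error over paired true and predicted scores."""
    n = len(true_scores)
    return sum(abs(r - p) for r, p in zip(true_scores, predicted_scores)) / n


def rmse(true_scores, predicted_scores):
    """Root-mean-squared error over paired true and predicted scores."""
    n = len(true_scores)
    squared = sum((r - p) ** 2 for r, p in zip(true_scores, predicted_scores))
    return math.sqrt(squared / n)
```

Because RMSE squares each deviation before averaging, it penalizes a few large errors more heavily than MAE does, which is why the two are usually reported together.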

3.3. Baseline Methods. We choose the following methods as baselines to compare with our proposed approach:

(i) SPCC. Similar to IPCC [17], the service-based Pearson correlation coefficient (SPCC) approach measures the similarities between mobile services and performs recommendation.

(ii) MPCC. Similar to UPCC [17], the Mashup-based Pearson correlation coefficient (MPCC) approach measures the similarities between Mashups and performs recommendation.

(iii) PMF. In collaborative filtering, probabilistic matrix factorization (PMF) is a very popular matrix factorization model [10]. The historical invocation record between Mashups and mobile services is denoted as a matrix $R = [r_{ij}]_{n \times k}$, where $r_{ij} = 1$ indicates that the mobile service is invoked by the Mashup and $r_{ij} = 0$ otherwise. The probability of the mobile service $S_i$ being invoked by the Mashup $M_j$ can be predicted as $\hat{r}_{ij} = S_i^T M_j$.

(iv) LDA-FMs. The topic probability distributions of the description documents of Mashups and mobile services are first derived by the LDA model and then trained via FMs to predict the probability of mobile services being invoked by Mashups and to recommend high-quality mobile services. Besides the topic information, the cooccurrence and popularity of mobile services are exploited in FM modelling.

(v) HDP-FMs. The prior work [18], which integrates HDP and FMs to recommend mobile services for a target Mashup. The HDP model is applied to derive the topic probability distributions of the description documents of Mashups and mobile services. Similarly, the topic information and the cooccurrence and popularity of mobile services are all used in FM modelling.

(vi) EHDP-FMs. The method proposed in this paper, which extends the prior work [18]. It firstly

Table 1: Specific parameters of Word2vec.

| Parameter | Value |
|---|---|
| Size (the dimension of the word vector) | 200 |
| Window (the length of the window) | 10 |
| Sample (the threshold of sampling) | 0.001 |
| Negative (the number of negative samples) | 5 |
| Sg (whether or not the Skip-gram model is used) | 0 (No) |
| Hs (whether or not the hierarchical Softmax model is used) | 0 (No) |

Table 2: Extension example of the two words "Earth" and "Google".

| Original word | Earth | Google |
|---|---|---|
| Extended words | Planet | Gmail |
| | Martian | Dropbox |
| | Mars | Evernote |
| | Venusian | App |
| | Planets | Adsense |
| | Spaceship | Yahoo |
| | Universe | Microsoft |
| | Planetary | Flickr |
| | Moon | Hotmail |
| | Deimos | Mapquest |

[Figure 3: FM model of recommending mobile service for Mashup. Each row of the input X (feature vectors X1 to X8) consists of six blocks: Box 1, Mashup (MA); Box 2, mobile service (MS); Box 3, similar mobile service (SMS); Box 4, similar Mashup (SMA); Box 5, cooccurrence (CO); Box 6, popularity (POP). The output Y (y1 to y8) is the score (S), labelled +1 or -1.]

uses the Word2vec tool to expand the description documents of mobile services and Mashups from the Wikipedia corpus. Then, the HDP model is applied to derive the topic probability distributions of the extended description documents of mobile services and Mashups. Finally, the FM is deployed to predict and recommend high-quality mobile services for Mashups.
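The similarity computation behind the SPCC and MPCC baselines can be sketched as follows. This is a simplified illustration under stated assumptions: the Pearson correlation is taken over complete invocation vectors of a toy matrix, and the convention of returning 0 for constant vectors is a choice made here, not something specified in the paper.

```python
import math

def pearson(u, v):
    """Pearson correlation coefficient of two equal-length score vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        return 0.0  # assumed convention: constant vectors get similarity 0
    return cov / math.sqrt(var_u * var_v)

# Toy invocation matrix: rows are Mashups, columns are services (1 = invoked).
R = [
    [1, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
]
service_columns = [list(col) for col in zip(*R)]
sim_s0_s1 = pearson(service_columns[0], service_columns[1])  # SPCC-style: service vs service
sim_m0_m1 = pearson(R[0], R[1])                              # MPCC-style: Mashup vs Mashup
```

SPCC compares columns of the Mashup-service matrix, whereas MPCC compares rows; the prediction step then weights the scores of the most similar neighbours.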

3.4. Experimental Results

3.4.1. Recommendation Performance Comparison. To study the performance of mobile service recommendation, we compare our method with the five baseline methods. For EHDP-FMs, we select the number of extended words per original word in the description documents of Mashups and services that achieves the best recommendation result; a detailed investigation is presented in Section 3.4.2. Table 3 reports the MAE and RMSE of the recommendation methods. It shows that our EHDP-FMs greatly outperforms SPCC and MPCC, significantly surpasses PMF and LDA-FMs, and consistently exceeds HDP-FMs by a smaller margin. The reason is that, in EHDP-FMs, (a) more useful word information is obtained from the extended description content of Mashups and services, (b) more Mashups and services that are similar in topic distribution are identified by using HDP, and (c) the FM models and trains on this useful information (including similar Mashups and similar services and the cooccurrence and popularity of services) to achieve more accurate prediction of the service probability score. Furthermore, when the number of given score values increases from 10 to 30 and the density of the training matrix rises from 10% to 30%, the MAE and RMSE of EHDP-FMs drop. That is to say, more given score values and a denser training matrix lead to better recommendation accuracy.
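As a worked example of reading Table 3, the relative MAE reduction of EHDP-FMs over each baseline can be computed for one setting (Given 10, matrix density 10%). The helper `relative_reduction` is a hypothetical name; the error values are copied from Table 3.

```python
def relative_reduction(baseline_error, proposed_error):
    """Fraction by which the proposed method lowers the baseline error."""
    return (baseline_error - proposed_error) / baseline_error

# MAE at Given 10, matrix density 10%, taken from Table 3.
baseline_mae = {
    "SPCC": 0.4258,
    "MPCC": 0.4316,
    "PMF": 0.2417,
    "LDA-FMs": 0.2091,
    "HDP-FMs": 0.1547,
}
ehdp_fms_mae = 0.1308

reductions = {
    name: relative_reduction(value, ehdp_fms_mae)
    for name, value in baseline_mae.items()
}
for name, value in reductions.items():
    print(f"{name}: {value:.1%} lower MAE with EHDP-FMs")
```

In this setting the reduction is largest against the two PCC baselines and smallest against HDP-FMs, which matches the ordering described in the text.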

3.4.2. Effect of the Number of Extended Words on Mobile Service Recommendation. The experiments investigate the effect of the number of extended words on mobile service recommendation in our proposed method. We set the number of extended words to 1, 3, 5, and 7 (denoted as EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, and EHDP-FMs-7, respectively) with training matrix density = 10% and report the resulting MAE and RMSE values in Figures 4 and 5. The experimental results indicate that the performance of EHDP-FMs-1 is the worst in all cases. This is because, when only one word is added for each original word, the extended description documents of Mashups and services are still short and contain little useful information. The MAE and RMSE of EHDP-FMs-3 are the best in all cases. However, when the number of extended words increases further to 5 and 7, the recommendation performance decreases. The reason is that too many extended words introduce irrelevant syntactic and semantic information, which may prevent the HDP topic model from mining the latent topics accurately and therefore weakens the performance of service recommendation. Therefore, we select 3 extended words for each original word in the description documents of Mashups and services in our EHDP-FMs method. The observations indicate that it is very important to choose an appropriate number of extended words for mobile service recommendation.
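The description extension step, with the number of extended words as a parameter, can be sketched as below. This is not the authors' implementation: the tiny 2-dimensional `toy_embeddings` dictionary stands in for the trained Word2vec vectors (200 dimensions in Table 1), and the function names are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine similarity of two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def expand_description(words, embeddings, top_n=3):
    """Append the top_n most similar vocabulary words after each original word."""
    expanded = []
    for word in words:
        expanded.append(word)
        if word not in embeddings:
            continue  # out-of-vocabulary words are kept but not expanded
        neighbours = sorted(
            (w for w in embeddings if w != word),
            key=lambda w: cosine(embeddings[word], embeddings[w]),
            reverse=True,
        )
        expanded.extend(neighbours[:top_n])
    return expanded

# Hypothetical embeddings, for illustration only.
toy_embeddings = {
    "earth": [1.0, 0.1],
    "planet": [0.9, 0.2],
    "mars": [0.8, 0.3],
    "google": [0.1, 1.0],
    "gmail": [0.2, 0.9],
}
result = expand_description(["earth", "maps"], toy_embeddings, top_n=2)
```

Raising `top_n` lengthens every document but also admits neighbours that are less similar to the original word, which is the trade-off this experiment measures.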

3.4.3. HDP-FMs Performance vs. LDA-FMs Performance with Different Topic Numbers. In this experiment, we set the number of topics to 3, 6, 12, and 24 for LDA-FMs and denote the resulting models as LDA-FMs-3/6/12/24. Figures 6 and 7 show the MAE and RMSE values, respectively, when the training matrix density is equal to 10%. We observe that the performance of HDP-FMs is the best. At the same time, the MAE and RMSE of LDA-FMs-12 are close to those of HDP-FMs and better than those of LDA-FMs-3, LDA-FMs-6, and LDA-FMs-24. The observations suggest that HDP-FMs is preferable to LDA-FMs, since HDP automatically derives the number of topics instead of requiring repeated training with different topic numbers as LDA does.

3.4.4. Impacts of Top-S and Top-M in HDP-FMs. We investigate the effects of top-S and top-M on mobile service recommendation in order to obtain their optimal values. The optimal value of top-S (top-M) is obtained over all settings of top-M (top-S), i.e., S = 5 for all top-M similar Mashups and M = 10 for all top-S similar mobile services. Under the setting of training matrix density = 10% and given number = 30, the MAEs of HDP-FMs are presented in Figures 8 and 9. Figure 8 shows that the MAE of HDP-FMs rises steadily as S grows from 5 to 25. Figure 9 indicates that the MAE of HDP-FMs reaches its lowest value, i.e., its best performance, when M = 10 and rises as M increases or decreases from 10. The observations mean that it is very important to identify suitable values of S and M for the HDP-FMs method.
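Selecting the top-S similar services (or, symmetrically, the top-M similar Mashups) from topic distributions might look like the sketch below. The similarity measure is an assumption: one minus the Jensen-Shannon divergence is used here as a common choice for comparing topic distributions, and the normalization of the retained weights to sum to 1 is likewise illustrative. All names and distributions are hypothetical.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    def kl(a, b):
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)
    m = [(x + y) / 2 for x, y in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def top_similar(target, candidates, top_k):
    """Return the top_k (name, weight) pairs, with weights normalized to sum to 1."""
    scored = sorted(
        ((name, 1.0 - js_divergence(target, dist)) for name, dist in candidates.items()),
        key=lambda pair: pair[1],
        reverse=True,
    )[:top_k]
    total = sum(score for _, score in scored)
    if total == 0:
        return [(name, 0.0) for name, _ in scored]
    return [(name, score / total) for name, score in scored]

# Hypothetical topic distributions over three topics.
active_service = [0.7, 0.2, 0.1]
other_services = {
    "S2": [0.6, 0.3, 0.1],
    "S3": [0.1, 0.1, 0.8],
    "S4": [0.7, 0.2, 0.1],
}
neighbours = top_similar(active_service, other_services, top_k=2)
```

A larger `top_k` adds weaker neighbours to the feature vector, which is one plausible reason why the error grows when S or M moves away from its best value.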

4. Related Works

Service recommendation is a hot topic in service-oriented computing [19]. Traditional service recommendation addresses the quality problem of Mashup services in order to realize high-quality service recommendation. Picozzi et al. [20] showed that the quality of a single service can facilitate recommendation. Cappiello [22] analyzed the quality attributes of Mashup components (APIs) and information quality in Mashups [21]. In addition, the collaborative filtering (CF) technique is widely exploited in QoS-based service recommendation [16]. It can be used to measure the similarity of services or users, predict the missing QoS values on the basis of the QoS records of similar services or similar users, and recommend services to users.

The problems of data sparsity and the long tail lead to inaccurate and incomplete search results, according to references [23, 24]. To address these problems, some researchers use matrix factorization to decompose historical QoS or Mashup-service interactions to obtain service recommendations [25, 26]. Zheng et al. [26] proposed a collaborative QoS prediction method in which a neighbourhood-integrated matrix factorization model is designed to predict personalized QoS values of Web services. Xu et al. [9] proposed a social-aware service recommendation method in which the multidimensional social relationships among potential users, Mashups, topics, and services are depicted by a coupled matrix model. These methods aim to transform the Mashup-service rating matrix or the QoS matrix into a lower-dimensional feature space and predict the probability of services being invoked by Mashups or the unknown QoS values.

Considering that matrix factorization depends on abundant historical interaction records, recent work incorporates additional information into matrix factorization to obtain more accurate service recommendation [5, 10–12]. Among them, Ma et al. [11] integrate matrix factorization with geographic and social influence to recommend points of interest. Chen et al. [12] proposed a personalized service recommendation method that uses location information and QoS of Web services to cluster services and users. Yao et al. [10] studied the historical invocation relationship between Mashups and services to infer the implicit functional correlation between services and incorporated the correlation into the matrix factorization model to facilitate service recommendation. Liu and Fulia [5] proposed collaborative topic regression, which combines probabilistic topic modelling and probabilistic matrix factorization for service recommendation.

The existing methods based on matrix factorization undoubtedly improve the performance of service recommendation. At the same time, we observed that few of them used the historical invocation between services and

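Table 3: MAE and RMSE comparison of multiple recommendation approaches.

| Given | Method | MAE (density 10%) | RMSE (density 10%) | MAE (density 20%) | RMSE (density 20%) | MAE (density 30%) | RMSE (density 30%) |
|---|---|---|---|---|---|---|---|
| Given 10 | SPCC | 0.4258 | 0.5643 | 0.4005 | 0.5257 | 0.3932 | 0.5036 |
| | MPCC | 0.4316 | 0.5701 | 0.4108 | 0.5293 | 0.4035 | 0.5113 |
| | PMF | 0.2417 | 0.3835 | 0.2263 | 0.3774 | 0.2014 | 0.3718 |
| | LDA-FMs | 0.2091 | 0.3225 | 0.1969 | 0.3116 | 0.1832 | 0.3015 |
| | HDP-FMs | 0.1547 | 0.2874 | 0.1329 | 0.2669 | 0.1283 | 0.2498 |
| | EHDP-FMs | 0.1308 | 0.2507 | 0.1154 | 0.2372 | 0.1081 | 0.2093 |
| Given 20 | SPCC | 0.4135 | 0.5541 | 0.3918 | 0.5158 | 0.3890 | 0.5003 |
| | MPCC | 0.4413 | 0.5712 | 0.4221 | 0.5202 | 0.4151 | 0.5109 |
| | PMF | 0.2398 | 0.3559 | 0.2137 | 0.3427 | 0.1992 | 0.3348 |
| | LDA-FMs | 0.1989 | 0.3104 | 0.1907 | 0.3018 | 0.1801 | 0.2894 |
| | HDP-FMs | 0.1486 | 0.2713 | 0.1297 | 0.2513 | 0.1185 | 0.2291 |
| | EHDP-FMs | 0.1227 | 0.2419 | 0.1055 | 0.2216 | 0.0952 | 0.1904 |
| Given 30 | SPCC | 0.4016 | 0.5447 | 0.3907 | 0.5107 | 0.3739 | 0.5012 |
| | MPCC | 0.4518 | 0.5771 | 0.4317 | 0.5159 | 0.4239 | 0.5226 |
| | PMF | 0.2214 | 0.3319 | 0.2091 | 0.3117 | 0.1986 | 0.3052 |
| | LDA-FMs | 0.1970 | 0.3096 | 0.1865 | 0.2993 | 0.1794 | 0.2758 |
| | HDP-FMs | 0.1377 | 0.2556 | 0.1109 | 0.2461 | 0.1047 | 0.2057 |
| | EHDP-FMs | 0.1113 | 0.2248 | 0.0926 | 0.2057 | 0.0804 | 0.1673 |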

[Figure 4: MAE values of EHDP-FMs-1/3/5/7. x-axis: given number (10, 20, 30); y-axis: MAE; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7.]

[Figure 5: RMSE values of EHDP-FMs-1/3/5/7. x-axis: given number (10, 20, 30); y-axis: RMSE; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7.]

Mashups to derive potential topics, and they did not use FMs to model and train on these potential topics to predict the probability of Mashups calling services for more accurate service recommendation. In our previous work [8, 27, 28], we mainly focused on the LDA or enhanced LDA topic model for Web service clustering [27, 28] and also exploited the word embedding technique to enhance the accuracy of service clustering [8]. Driven by these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the potential topics from the description documents of mobile services and Mashups to support FM model training. We use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantic representation of Mashup applications and mobile services is consolidated and mined by extending their description documents and modelling their topic probability distributions, and the quality prediction is performed by exploiting FMs to train and model multiple dimensions of features of mobile services. High-quality mobile services are ranked and recommended for building Mashups by simultaneously considering their functional representation and quality features. The accuracy of mobile service recommendation is significantly improved as a result.

Although the above approach and solution seem very effective in Mashup development, it will be better if a

[Figure 6: MAE of HDP-FMs and LDA-FMs. x-axis: given number (10, 20, 30); series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs.]

[Figure 7: RMSE of HDP-FMs and LDA-FMs. x-axis: given number (10, 20, 30); series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs.]

[Figure 8: Impact of Top-S in HDP-FMs. x-axis: Top-S (5, 10, 15, 20, 25); y-axis: MAE.]

[Figure 9: Impact of Top-M in HDP-FMs. x-axis: Top-M (5, 10, 15, 20, 25); y-axis: MAE.]

prototype can be designed and implemented to validate the effectiveness and application value of the approach. The objective of the prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. We can use Python 3.5 and MySQL 5.6, together with Flask and Pyecharts, to develop the prototype system. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for a given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between these Mashups and services from ProgrammableWeb and builds the corresponding data tables to store them. Because the description documents of Mashups and mobile services contain some useless or meaningless terms, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes: tokenization, in which words are segmented by spaces and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python; stop word removal, in which the stop word list in NLTK is applied to remove common short words or symbols that have no practical meaning but occur frequently, such as the, to, a, an, with, and at; and stemming, in which the common endings (such as ing, s, and ed) of the various grammatical forms of a word (such as provide, providing, provides, and provided) are removed.

(2) In the second part (description extension of Mashups and mobile services), the English Wikipedia corpus trained by Word2vec, based on the gensim module in Python, is exploited as the description extension source of Mashups and mobile services. The Top-N words most similar to an original word in the description documents of Mashups and mobile services are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, whose time complexity is O(log N). Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains the similarity matrix, whose time and space complexity are both O(n^2), where n is the total number of words. During description extension, the Top-N words most similar to an original word can be found by a table look-up in the trained English Wikipedia corpus, with time complexity O(1) and space complexity O(N^2).

(3) In the third part (topic modelling of Mashups and mobile services), the hierarchical Dirichlet process is used to extract the implicit topics of Mashups and mobile services, which clusters (groups) service data according to the cooccurrence of words. It can automatically determine the number of topics, which avoids adjusting the number of topics repeatedly and so saves time. It can also predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which makes the prototype system responsive in real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of topic modelling, and a download function is provided so that users can download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile services for the given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services with which the developer can build a novel Mashup application. During recommendation, factorization machines train on and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from the topic similarity based on the HDP model, and quality features, i.e., the cooccurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of recommendation.
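The prediction and ranking step in part (4) can be sketched with the standard second-order factorization machine of Rendle [13], which scores a feature vector as a global bias plus linear terms plus factorized pairwise interactions. This is a minimal sketch, not the prototype's code: the parameter values, the three-feature input, and the candidate scores are hypothetical, and a real system would learn `w0`, `w`, and `V` from training data (for example with libFM [14]).

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score: w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j."""
    linear = sum(w_i * x_i for w_i, x_i in zip(w, x))
    k = len(V[0])  # dimensionality of the factorization
    pairwise = 0.0
    for f in range(k):
        # Linear-time identity: 0.5 * ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2)
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

def recommend_top_k(scores, k):
    """Return the names of the k services with the highest predicted scores."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]

# Hypothetical parameters and a hypothetical three-feature input vector.
x = [1.0, 0.0, 1.0]
w0 = 0.1
w = [0.2, 0.3, 0.4]
V = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
score = fm_predict(x, w0, w, V)

# Hypothetical predicted scores for three candidate services of one Mashup.
candidate_scores = {"S1": 0.92, "S2": 0.36, "S3": 0.17}
recommended = recommend_top_k(candidate_scores, k=1)
```

The pairwise term is computed with the linear-time reformulation rather than an explicit double loop, which keeps prediction cost proportional to the number of non-zero features times the factor dimension.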

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing by combining word-embedding-enhanced HDP and FMs. The experimental results on the ProgrammableWeb dataset show that, compared with the existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate fine-grained service relationship information and incorporate it into the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371, National Key R&D Program of China under grant no. 2017YFB1400602, Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136, and Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., "Toward mobile service computing: opportunities and challenges," IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, "Category-aware API clustering and distributed recommendation for automatic mashup creation," IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid).

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, "WT-LDA: user tagging augmented LDA for web service clustering," in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, "Incorporating user, topic, and service-related latent factors into web service recommendation," in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, "Latent dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, "Hierarchical dirichlet process," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, "WE-LDA: a word embeddings augmented LDA model for web services clustering," in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, "A social-aware service recommendation approach for mashup creation," in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, "Service recommendation for mashup composition with implicit correlation regularization," in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June-July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, "Recommender systems with social regularization," in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, "Web service recommendation via exploiting location and QoS information," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, "Factorization machines," in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, "Factorization machines with libFM," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, "Sharing clusters among related groups: hierarchical dirichlet processes," Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, "WSRec: a collaborative filtering based web service recommender system," in Proceedings IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, "Web APIs recommendation for mashup development based on hierarchical dirichlet process and factorization machines," in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, "Reputation measurement and malicious feedback rating prevention in web service recommendation system," IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, "Quality-based recommendations for mashup composition," in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, "Information quality in mashups," IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, "A quality model for mashup components," in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, "An empirical study of programmable web: a network analysis on a service-mashup system," in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, "Manifold-learning based API recommendation for mashup creation," in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, "An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems," IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, "Collaborative web service QoS prediction via neighborhood integrated matrix factorization," IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, "Integrated content and network-based service clustering and web APIs recommendation for mashup development," IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, "Domain-aware mashup service clustering based on LDA topic model from multiple data sources," Information and Software Technology, vol. 90, pp. 40–54, 2017.


3 Experiments

31 Experiment Dataset and Settings In this experiment wefirstly crawled 3929 real Mashups 10648 services and 12715invocations between these Mashups and services from Pro-grammableWeb As for each Mashup or service a pre-processing process is performed to obtain their standarddescription information Secondly we use the Word2vec toolto expand the description document of Mashup or servicefrom English Wikipedia corpus published on April 2017 andobtain their word embeddings vector More concretely thegensim module in Python is applied to train the EnglishWikipedia corpus and produce its word embeddings vectorand Table 1 presents the special parameters of Word2vecFinally the trained English Wikipedia corpus is exploited toexpand the description documents of Mashup and service+e most similar Top-N words to the original word areidentified and used as the extended words For instance thetop similar 10 expanded words of the two words ldquoEarthrdquo andldquoGooglerdquo are shown in Table 2 All Mashups in the dataset areuniformly divided into 5 subsets in which 1 subset is used asthe testing set and other 4 subsets are integrated as a thetraining set A five-fold cross validation is conducted and theresults for each fold are summed up to obtain their meanvalue as the reported experiment results As for the testing setthrough randomly removing some score values from thematrix of Mashup service we change the number of scorevalues provided by the active Mashups as 10 20 and 30 andcall them as Given 10 Given 20 andGiven 30 respectively Atthe same time the removed score values are exploited as theexpected values Similarly as for the training set throughrandomly removing some score values the Mashup-servicematrix becomes more sparser with density 10 20 and30 respectively

3.2. Evaluation Metrics. Mean absolute error (MAE) is the average of the absolute differences between the observed values and the true values [16]. Root-mean-squared error (RMSE) is the square root of the ratio of the sum of squared deviations between the observed values and the true values to the number of observations N [16]. We adopt the MAE and RMSE as the evaluation metrics of Web API recommendation. The smaller the MAE and RMSE, the better the recommendation effect.

$$
\mathrm{MAE} = \frac{1}{N}\sum_{i,j}\left|r_{ij} - \hat{r}_{ij}\right|, \qquad
\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i,j}\left(r_{ij} - \hat{r}_{ij}\right)^{2}},
\tag{10}
$$

where $N$ is the number of predicted scores, $r_{ij}$ indicates the true score of Mashup $M_i$ to service $S_j$, and $\hat{r}_{ij}$ indicates the predicted score of $M_i$ to $S_j$.
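As a quick check of equation (10), the two metrics can be computed directly. The sketch below uses made-up scores purely for illustration; the function and variable names are hypothetical.

```python
import math


def mae(true_scores, predicted_scores):
    """Mean absolute error over paired true and predicted scores."""
    n = len(true_scores)
    return sum(abs(r - p) for r, p in zip(true_scores, predicted_scores)) / n


def rmse(true_scores, predicted_scores):
    """Root-mean-squared error over paired true and predicted scores."""
    n = len(true_scores)
    squared = sum((r - p) ** 2 for r, p in zip(true_scores, predicted_scores))
    return math.sqrt(squared / n)


# Made-up example: four removed scores and their predictions.
true_scores = [1.0, 0.0, 1.0, 0.0]
predicted_scores = [0.9, 0.2, 0.6, 0.1]

example_mae = mae(true_scores, predicted_scores)
example_rmse = rmse(true_scores, predicted_scores)
```

Here the absolute errors are 0.1, 0.2, 0.4, and 0.1, so the MAE is 0.2, while the RMSE is the square root of 0.055 (about 0.235). RMSE is the larger of the two because it weights the single 0.4 error more heavily.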

3.3. Baseline Methods. We choose the following methods as baselines to compare with our proposed approach:

(i) SPCC. Similar to IPCC [17], the service-based Pearson correlation coefficient (SPCC) approach measures the similarities between mobile services and performs recommendation.

(ii) MPCC. Similar to UPCC [17], the Mashup-based Pearson correlation coefficient (MPCC) approach measures the similarities between Mashups and performs recommendation.

(iii) PMF. In collaborative filtering, probabilistic matrix factorization (PMF) is a very popular matrix factorization model [10]. The historical invocation record between Mashups and mobile services is denoted as a matrix $R = [r_{ij}]_{n \times k}$. If $r_{ij} = 1$, the mobile service is invoked by the Mashup; otherwise, $r_{ij} = 0$. The probability of the mobile service $S_i$ being invoked by the Mashup $M_j$ is predicted as $\hat{r}_{ij} = S_i^{T} M_j$.

(iv) LDA-FMs. The topic probability distributions of the description documents of Mashups and mobile services are first derived by the LDA model and then trained via FMs to predict the probability of mobile services being invoked by Mashups and to recommend high-quality mobile services. Besides the topic information, the cooccurrence and popularity of mobile services are exploited in the FM modelling.

(v) HDP-FMs. The prior work [18], which integrates HDP and FMs to recommend mobile services for a target Mashup. The HDP model is applied to derive the topic probability distributions of the description documents of Mashups and mobile services. Similarly, the topic information and the cooccurrence and popularity of mobile services are all used in the FM modelling.

(vi) EHDP-FMs. The proposed method in this paper, which extends the prior work [18]. It first

Table 1: Specific parameters of Word2vec.

| Parameter | Value |
| --- | --- |
| Size (the dimension of word vector) | 200 |
| Window (the length of the window) | 10 |
| Sample (the threshold of sampling) | 0.001 |
| Negative (the number of negative sampling) | 5 |
| Sg (whether or not the Skip-gram model is used) | 0 (No) |
| Hs (whether or not hierarchical Softmax model is used) | 0 (No) |

Table 2: Extension example of the two words "Earth" and "Google".

| Original word | Earth | Google |
| --- | --- | --- |
| Extended words | Planet | Gmail |
| | Martian | Dropbox |
| | Mars | Evernote |
| | Venusian | App |
| | Planets | Adsense |
| | Spaceship | Yahoo |
| | Universe | Microsoft |
| | Planetary | Flickr |
| | Moon | Hotmail |
| | Deimos | Mapquest |

Figure 3: FM model of recommending mobile service for Mashup. Each row X1 to X8 of the feature matrix X concatenates six blocks: Mashup (MA, Box 1), mobile service (MS, Box 2), similar mobile service (SMS, Box 3), similar Mashup (SMA, Box 4), cooccurrence (CO, Box 5), and popularity (POP, Box 6); the target Y holds the corresponding scores y1 to y8 (Score S).


uses the Word2vec tool to expand the description documents of mobile services and Mashups from the Wikipedia corpus. Then, the HDP model is applied to derive the topic probability distributions of the extended description documents of mobile services and Mashups. Finally, the FM is deployed to predict and recommend high-quality mobile services for the Mashup.
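The similarity measure behind the SPCC and MPCC baselines can be sketched as follows. This is a simplified illustration that computes the Pearson correlation coefficient over complete invocation vectors; the toy vectors and the function name are hypothetical, and the baselines in [17] may restrict the computation to co-invoked entries.

```python
import math


def pearson(u, v):
    """Pearson correlation coefficient between two equal-length vectors."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    var_u = sum((a - mean_u) ** 2 for a in u)
    var_v = sum((b - mean_v) ** 2 for b in v)
    if var_u == 0 or var_v == 0:
        # A constant vector carries no correlation information.
        return 0.0
    return cov / math.sqrt(var_u * var_v)


# Hypothetical invocation vectors of three services over four Mashups
# (1 = invoked by that Mashup, 0 = not invoked).
service_a = [1, 0, 1, 0]
service_b = [1, 0, 1, 0]
service_c = [0, 1, 0, 1]

sim_ab = pearson(service_a, service_b)
sim_ac = pearson(service_a, service_c)
```

For SPCC the vectors would be columns of the Mashup-service matrix (one per service); for MPCC they would be rows (one per Mashup). Services invoked by the same Mashups get a similarity near 1, and services with opposite invocation patterns get a similarity near -1.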

3.4. Experimental Results

3.4.1. Recommendation Performance Comparison. To study the performance of mobile service recommendation, we compare our method with the five baseline methods. For EHDP-FMs, we select the number of extended words per original word in the description documents of Mashups and services that achieves the best recommendation result; a detailed investigation is given in the next subsection. Table 3 reports the MAE and RMSE of the recommendation methods. EHDP-FMs greatly outperforms SPCC and MPCC, significantly surpasses PMF and LDA-FMs, and consistently exceeds HDP-FMs by a smaller margin. The reason is that in EHDP-FMs (a) more useful word information is obtained from the extended description content of Mashups and services, (b) more Mashups and services that are similar in topic distribution are identified by using the HDP technique, and (c) the FM models and trains this useful information (including similar Mashups and similar services, and the cooccurrence and popularity of services) to achieve more accurate prediction of service probability scores. Furthermore, when the number of given score values increases from 10 to 30 and the density of the training matrix rises from 10% to 30%, the MAE and RMSE of EHDP-FMs drop. That is to say, more score values and a denser training matrix mean better recommendation accuracy.
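To put the comparison in numbers, the relative MAE reduction of EHDP-FMs over each baseline can be derived from Table 3. The sketch below uses the MAE values for matrix density 10% and Given 10; the helper name `relative_reduction` is hypothetical.

```python
def relative_reduction(baseline_error, proposed_error):
    """Fraction by which the proposed method lowers the baseline error."""
    return (baseline_error - proposed_error) / baseline_error


# MAE values from Table 3 (matrix density 10%, Given 10).
baseline_mae = {
    "SPCC": 0.4258,
    "MPCC": 0.4316,
    "PMF": 0.2417,
    "LDA-FMs": 0.2091,
    "HDP-FMs": 0.1547,
}
ehdp_fms_mae = 0.1308

# Relative reduction in percent, rounded to one decimal place.
reductions = {
    name: round(100 * relative_reduction(value, ehdp_fms_mae), 1)
    for name, value in baseline_mae.items()
}
```

In this setting the MAE of EHDP-FMs is roughly 69% lower than that of SPCC and MPCC, roughly 46% lower than PMF, roughly 37% lower than LDA-FMs, and roughly 15% lower than HDP-FMs.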

3.4.2. Effect of the Number of Extended Words on Mobile Service Recommendation. The experiments investigate the effect of the number of extended words on mobile service recommendation in our proposed method. We set the number of extended words to 1, 3, 5, and 7 (denoted as EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, and EHDP-FMs-7, respectively) with training matrix density = 10% and report the resulting MAE and RMSE values in Figures 4 and 5. The experimental results indicate that the performance of EHDP-FMs-1 is the worst in all cases. This is because, when only one word is added for each original word, the extended description documents of Mashups and services are still short and contain little useful information. The MAE and RMSE of EHDP-FMs-3 are the best in all cases. However, when the number of extended words increases further to 5 and 7, the recommendation performance decreases. The reason is that too many extended words introduce irrelevant syntactic and semantic information, which may prevent the HDP topic model from mining the latent topics accurately and therefore weakens the performance of service recommendation. Therefore, we select 3 extended words for each original word in the description documents of Mashups and services in our EHDP-FMs method. The observations indicate that it is very important to choose an appropriate number of extended words for mobile service recommendation.

3.4.3. HDP-FMs Performance vs. LDA-FMs Performance with Different Topic Numbers. In this experiment, we set the number of topics to 3, 6, 12, and 24 for LDA-FMs, denoted as LDA-FMs-3/6/12/24. The experimental results in Figures 6 and 7 show the MAE and RMSE values, respectively, when the training matrix density is equal to 10%. We observe that the performance of HDP-FMs is the best. The MAE and RMSE of LDA-FMs-12 are close to those of HDP-FMs and better than those of LDA-FMs-3, LDA-FMs-6, and LDA-FMs-24. The observations suggest that HDP-FMs is better than LDA-FMs because it automatically derives the number of topics instead of requiring repeated training with different topic numbers, as LDA does.

3.4.4. Impacts of Top-S and Top-M in HDP-FMs. We investigate the effects of top-S and top-M on mobile service recommendation in order to obtain their optimal values. The optimal values of top-S and top-M are obtained, i.e., S = 5 for all top-M similar Mashups and M = 10 for all top-S similar mobile services. Under the setting of training matrix density = 10% and given number = 30, the MAEs of HDP-FMs are presented in Figures 8 and 9. Figure 8 shows that the MAE of HDP-FMs rises steadily as S grows from 5 to 25. Figure 9 indicates that the MAE of HDP-FMs is lowest when M = 10 and rises as M increases or decreases from that value. The observations mean that it is very important to identify suitable values of S and M for the HDP-FMs method.
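Selecting the top-S similar services (or top-M similar Mashups) amounts to keeping the k largest similarity scores. A minimal sketch follows, assuming the similarities are already available as a dictionary; the names and values are hypothetical.

```python
import heapq


def top_k_similar(similarities, k):
    """Return the k (name, similarity) pairs with the highest similarity."""
    return heapq.nlargest(k, similarities.items(), key=lambda item: item[1])


# Hypothetical topic similarities between an active service and other services.
service_similarities = {"S1": 0.7, "S2": 0.0, "S3": 0.3, "S4": 0.9}

top_2 = top_k_similar(service_similarities, 2)
```

Only the selected neighbours would then be written into the similar-service (or similar-Mashup) block of the FM input, so k controls how many non-zero entries that block contains.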

4. Related Works

Service recommendation is currently a hot topic in service-oriented computing [19]. Traditional service recommendation addresses the quality of Mashup services in order to realize high-quality service recommendation. Picozzi et al. [20] showed that the quality of a single service can facilitate recommendation. Cappiello [22] analyzed the quality attributes of Mashup components (APIs) and information quality in Mashups [21]. In addition, the collaborative filtering (CF) technique is widely exploited in QoS-based service recommendation [16]. It can be used to measure the similarity of services or users, predict the missing QoS values on the basis of the QoS records of similar services or similar users, and recommend services to users.

The problems of data sparsity and the long tail lead to inaccurate and incomplete search results, according to references [23, 24]. To address these problems, some researchers use matrix factorization to decompose historical QoS or Mashup-service interactions to obtain service recommendations [25, 26]. Zheng et al. [26] proposed a collaborative QoS prediction method in which a neighbourhood-integrated matrix factorization model is designed to predict the QoS values of personalized Web services. Xu et al. [9] proposed a social-aware service recommendation method in which the multidimensional social relationships among potential users, Mashups, topics, and services are described by a coupled matrix model. These methods aim to transform the Mashup-service rating matrix or QoS matrix into a lower-dimensional feature space and predict the probability of services being invoked by Mashups or the unknown QoS values.

Considering that matrix factorization depends on abundant historical interaction records, recent work incorporates additional information into matrix factorization to obtain more accurate service recommendation [5, 10–12]. Among them, Ma et al. [11] integrate matrix factorization with geographic and social influence to recommend points of interest. Chen et al. [12] propose a personalized service recommendation approach that uses location information and QoS of Web services to cluster services and users. Yao et al. [10] study the historical invocation relationship between Mashups and services to infer the implicit functional correlation between services and incorporate the correlation into the matrix factorization model to facilitate service recommendation. Liu and Fulia [5] propose collaborative topic regression, which combines probabilistic topic modelling and probabilistic matrix factorization for service recommendation.

The existing methods based on matrix factorization undoubtedly improve the performance of service recommendation. At the same time, we observed that few of them exploited the historical invocation between services and

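Table 3: MAE and RMSE comparison of multiple recommendation approaches.

| Given | Method | Density 10% MAE | Density 10% RMSE | Density 20% MAE | Density 20% RMSE | Density 30% MAE | Density 30% RMSE |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Given 10 | SPCC | 0.4258 | 0.5643 | 0.4005 | 0.5257 | 0.3932 | 0.5036 |
| | MPCC | 0.4316 | 0.5701 | 0.4108 | 0.5293 | 0.4035 | 0.5113 |
| | PMF | 0.2417 | 0.3835 | 0.2263 | 0.3774 | 0.2014 | 0.3718 |
| | LDA-FMs | 0.2091 | 0.3225 | 0.1969 | 0.3116 | 0.1832 | 0.3015 |
| | HDP-FMs | 0.1547 | 0.2874 | 0.1329 | 0.2669 | 0.1283 | 0.2498 |
| | EHDP-FMs | 0.1308 | 0.2507 | 0.1154 | 0.2372 | 0.1081 | 0.2093 |
| Given 20 | SPCC | 0.4135 | 0.5541 | 0.3918 | 0.5158 | 0.3890 | 0.5003 |
| | MPCC | 0.4413 | 0.5712 | 0.4221 | 0.5202 | 0.4151 | 0.5109 |
| | PMF | 0.2398 | 0.3559 | 0.2137 | 0.3427 | 0.1992 | 0.3348 |
| | LDA-FMs | 0.1989 | 0.3104 | 0.1907 | 0.3018 | 0.1801 | 0.2894 |
| | HDP-FMs | 0.1486 | 0.2713 | 0.1297 | 0.2513 | 0.1185 | 0.2291 |
| | EHDP-FMs | 0.1227 | 0.2419 | 0.1055 | 0.2216 | 0.0952 | 0.1904 |
| Given 30 | SPCC | 0.4016 | 0.5447 | 0.3907 | 0.5107 | 0.3739 | 0.5012 |
| | MPCC | 0.4518 | 0.5771 | 0.4317 | 0.5159 | 0.4239 | 0.5226 |
| | PMF | 0.2214 | 0.3319 | 0.2091 | 0.3117 | 0.1986 | 0.3052 |
| | LDA-FMs | 0.1970 | 0.3096 | 0.1865 | 0.2993 | 0.1794 | 0.2758 |
| | HDP-FMs | 0.1377 | 0.2556 | 0.1109 | 0.2461 | 0.1047 | 0.2057 |
| | EHDP-FMs | 0.1113 | 0.2248 | 0.0926 | 0.2057 | 0.0804 | 0.1673 |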

Figure 4: MAE values of EHDP-FMs-1/3/5/7 (x-axis: given number; y-axis: MAE; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).

Figure 5: RMSE values of EHDP-FMs-1/3/5/7 (x-axis: given number; y-axis: RMSE; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).


Mashups to derive potential topics, and they did not use FMs to model and train these potential topics to predict the probability of Mashups calling services for more accurate service recommendation. In our previous work [8, 27, 28], we mainly focused on the LDA or enhanced LDA topic model for Web service clustering [27, 28] and also exploited the word embedding technique to enhance the accuracy of service clustering [8]. Driven by these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the potential topics from the description documents of mobile services and Mashups to support FM model training. We use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantic representation of Mashup applications and mobile services is consolidated and mined by extending their description documents and modelling their topic probability distributions, and quality prediction is performed by exploiting FMs to train and model multiple-dimension features of mobile services. High-quality mobile services are ranked and recommended for building Mashups by simultaneously considering their functionality representation and quality features. As a result, the accuracy of mobile service recommendation is significantly improved.

Although the above approach appears effective for Mashup development, it would be better if a

Figure 6: MAE of HDP-FMs and LDA-FMs (x-axis: given number; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 7: RMSE of HDP-FMs and LDA-FMs (x-axis: given number; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 8: Impact of Top-S in HDP-FMs (x-axis: Top-S, 5 to 25; y-axis: MAE).

Figure 9: Impact of Top-M in HDP-FMs (x-axis: Top-M, 5 to 25; y-axis: MAE).


prototype can be designed or implemented to validate the effectiveness and application value of the approach. The objective of the prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. Python 3.5, MySQL 5.6, Flask, and Pyecharts can be used to develop the prototype system. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for the given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between them from ProgrammableWeb and stores them in corresponding data tables. Because the description documents of Mashups and mobile services contain some useless or meaningless terms/words, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes three steps. Tokenization: words are segmented by spaces, and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python. Removing stop words (common short words or symbols that have no practical meaning but occur frequently, such as the, to, a, an, with, and at): the stop word list in NLTK is applied to remove stop words. Stemming: various forms of a word are used in grammatical expressions, such as provide, providing, provides, and provided, and their common word endings, such as ing, s, and ed, should be removed.

(2) In the second part (description extension of Mashups and mobile services), the English Wikipedia corpus trained by Word2vec based on the gensim module in Python is exploited as the description extension source of Mashups and mobile services. The Top-N words most similar to an original word in the description documents of Mashups and mobile services are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, with time complexity $O(\log N)$. Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains the similarity matrix, whose time and space complexity are both $O(n^2)$, where $n$ is the total number of words. During description extension, the Top-N words most similar to an original word can be found by a table look-up in the trained English Wikipedia corpus, with time complexity $O(1)$ and space complexity $O(N^2)$.

(3) In the third part (topic modelling of Mashups and mobile services), the hierarchical Dirichlet process is used to extract the implicit topics of Mashups and mobile services, which clusters/groups service data according to the cooccurrence of word frequency. It automatically determines the number of topics, which avoids adjusting the number of topics repeatedly and so saves time. It can also predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which allows the prototype system to respond in real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of topic modelling, and a download function is provided for users to download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile services for the given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services for the developer to build a novel Mashup application. During recommendation, factorization machines train and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from the topic similarity based on the HDP model, and quality features, i.e., the cooccurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of recommendation.
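The preprocessing steps of the first part (tokenization, stop word removal, stemming) can be sketched without external dependencies. The text names NLTK for these steps; the version below is a deliberately naive stand-in that uses only the stop words and word endings quoted above, so all names in it are hypothetical and its stems are cruder than those of a real stemmer.

```python
import re

# Only the stop words given as examples in the text, not NLTK's full list.
STOP_WORDS = {"the", "to", "a", "an", "with", "at"}

# Only the word endings given as examples in the text.
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def strip_suffix(word):
    """Remove the first matching ending, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, drop stop words, then strip common word endings."""
    return [strip_suffix(token) for token in tokenize(text)
            if token not in STOP_WORDS]


tokens = preprocess("Providing maps with the Google API.")
```

The sentence above is reduced to the tokens provid, map, google, and api. A production pipeline would more likely rely on an established stemmer, because plain suffix stripping maps provides and provided to different stems.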

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing by combining word-embedding-enhanced HDP and FMs. The experimental results on the ProgrammableWeb dataset show that, compared with the existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate fine-grained service relationship information and incorporate it into the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371, National Key R&D Program of China under grant no. 2017YFB1400602, Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136, and Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., "Toward mobile service computing: opportunities and challenges," IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, "Category-aware API clustering and distributed recommendation for automatic mashup creation," IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid).

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, "WT-LDA: user tagging augmented LDA for web service clustering," in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, "Incorporating user, topic, and service-related latent factors into web service recommendation," in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, "Latent dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, "Hierarchical dirichlet process," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, "WE-LDA: a word embeddings augmented LDA model for web services clustering," in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, "A social-aware service recommendation approach for mashup creation," in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, "Service recommendation for mashup composition with implicit correlation regularization," in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June-July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, "Recommender systems with social regularization," in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, "Web service recommendation via exploiting location and QoS information," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, "Factorization machines," in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, "Factorization machines with libFM," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, "Sharing clusters among related groups: hierarchical dirichlet processes," Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, "WSRec: a collaborative filtering based web service recommender system," in Proceedings IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, "Web APIs recommendation for mashup development based on hierarchical dirichlet process and factorization machines," in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, "Reputation measurement and malicious feedback rating prevention in web service recommendation system," IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, "Quality-based recommendations for mashup composition," in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, "Information quality in mashups," IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, "A quality model for mashup components," in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, "An empirical study of programmable web: a network analysis on a service-mashup system," in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, "Manifold-learning based API recommendation for mashup creation," in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, "An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems," IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, "Collaborative web service QoS prediction via neighborhood integrated matrix factorization," IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, "Integrated content and network-based service clustering and web APIs recommendation for mashup development," IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, "Domain-aware mashup service clustering based on LDA topic model from multiple data sources," Information and Software Technology, vol. 90, pp. 40–54, 2017.


the observed times N [16] We adopt the MAE and RMSE asthe evaluation metrics of Web APIs recommendation +esmaller the MAE and RMSE mean the better the recom-mendation effect

MAE 1N

1113944ij

rij minus 1113954rij

11138681113868111386811138681113868

11138681113868111386811138681113868

RMSE

1N

1113944ij

rij minus 1113954rij1113872 11138732

1113971

(10)

where N is the number of predicted score rij indicates thetrue score of Mashup Mi to service Sj and 1113954rij indicates thepredicted score of Mi to Sj

33 Baseline Methods We choose the below methods asbaseline to compare them with our proposed approach

(i) SPCC Similar to IPCC [17] service-based utilizingPearson correlation coefficient (SPCC) approachmeasure the similarities between mobile servicesand perform recommendation

(ii) MPCC Similar to UPCC [17] Mashups-basedutilizing Pearson correlation coefficient (MPCC)approachmeasure the similarities betweenMashupsand perform recommendation

(iii) PMF In the collaborative filtering probabilisticmatrix factorization (PMF) is a very popular matrixfactorization model [10] +e historical invocationrecord between Mashups and mobile services isdenoted as a matrix R [rij]ntimesk If rij 1 themobile service is invoked by a Mashup is shownotherwise rij 0 +e probability of the mobileservice Si invoked by the Mashup Mj can be pre-dicted and represented as 1113954rij ST

i Mj(iv) LDA-FMs +e topic probability distributions of

description documents in Mashup and mobileservice firstly are derived by the LDA model andthen they are trained via FMs to predict theprobability distribution of mobile service invokedby Mashup and recommend mobile service withhigh quality Besides the topic information thecooccurrence and popularity of mobile service areexploited in the FM modelling

(v) HDP-FMs +e prior work [18] which integratesHDP and FMs to recommendmobile service for targetMashup+eHDPmodel is applied to derive the topicprobability distributions of description documents inMashup and mobile service Similarly the topic in-formation and the cooccurrence and popularity ofmobile service are all used in the FM modelling

(vi) EHDP-FMs +e proposed method in this paper isan extended work of the prior work [18] It firstly

Table 1 Specific parameters of Word2vec

Parameter ValueSize (the dimension of word vector) 200Window (the length of the window) 10Sample (the threshold of sampling) 0001Negative (the number of negative sampling) 5Sg (whether or not the Skip-gram model is used) 0 (No)Hs (whether or not hierarchical Softmax model isused) 0 (No)

Table 2 Extension example of the two words ldquoEarthrdquo andldquoGooglerdquo

Original word Earth Google

Extended words

Planet GmailMartian DropboxMars Evernote

Venusian AppPlanets Adsense

Spaceship YahooUniverse MicrosoftPlanetary FlickrMoon HotmailDeimos Mapquest

Figure 3: FM model of recommending mobile service for Mashup. Each row of the feature matrix X (X1 to X8) concatenates six boxes: Box 1, Mashup (MA: M1, M2, M3, …); Box 2, mobile service (MS: S1, S2, S3, …); Box 3, similar mobile services (SMS); Box 4, similar Mashups (SMA); Box 5, cooccurrence (CO); and Box 6, popularity (POP: Freq). Each row is paired with a target score (S) y1 to y8 in Y, labelled +1 or –1.


uses the Word2vec tool to expand the description documents of mobile services and Mashups from the Wikipedia corpus. Then, the HDP model is applied to derive the topic probability distributions of the extended description documents of mobile services and Mashups. Finally, the FM is deployed to predict and recommend high-quality mobile services for Mashups.

3.4. Experimental Results

3.4.1. Recommendation Performance Comparison. To study the performance of mobile service recommendation, we compare our method with five baseline methods. For EHDP-FMs, we select the number of extended words per original word in the description documents of Mashups and services that gives the best recommendation result; a detailed investigation is given in Section 3.4.2. Table 3 reports the MAE and RMSE of the recommendation methods. EHDP-FMs greatly outperforms SPCC and MPCC, significantly surpasses PMF and LDA-FMs, and consistently, if slightly, exceeds HDP-FMs. The reason is that in EHDP-FMs (a) more useful word information is obtained from the extended description content of Mashups and services, (b) more similar Mashups and similar services in topic distribution are identified by the HDP technique, and (c) the FM models and trains this useful information (including similar Mashups and similar services and the cooccurrence and popularity of services) to achieve more accurate service probability score prediction. Furthermore, when the given number of scores increases from 10 to 30 and the density of the training matrix rises from 10% to 30%, the MAE and RMSE of EHDP-FMs consistently drop. That is to say, more given score values and a denser training matrix lead to better recommendation accuracy.
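The two error measures reported in Table 3 can be sketched as follows, assuming the standard definitions of MAE and RMSE over pairs of actual and predicted scores. The function names and the toy scores are illustrative assumptions, not values from the experiments.

```python
import math


def mae(actual, predicted):
    """Mean absolute error: the average of |actual - predicted|."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error: the square root of the mean squared difference."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return math.sqrt(squared / len(actual))


# Hypothetical invocation labels (+1 invoked, -1 not invoked) and predicted scores.
actual = [1.0, -1.0, 1.0, -1.0]
predicted = [0.5, -0.5, 1.0, 0.0]

print(mae(actual, predicted))   # 0.5
print(rmse(actual, predicted))  # square root of 0.375
```

Lower values of both measures mean better prediction accuracy; RMSE penalizes large individual errors more heavily than MAE.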

3.4.2. Effect of the Number of Extended Words on Mobile Service Recommendation. The experiments investigate the effect of the number of extended words on mobile service recommendation in our proposed method. We set the number of extended words to 1, 3, 5, and 7 (denoted as EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, and EHDP-FMs-7, respectively) with training matrix density = 10% and report the resulting MAE and RMSE in Figures 4 and 5. The results indicate that EHDP-FMs-1 performs worst in all cases. This is because, when each original word is extended with only one word, the extended description documents of Mashups and services are still short and contain little useful information. The MAE and RMSE of EHDP-FMs-3 are the best in all cases. However, when the number of extended words increases further to 5 and 7, the recommendation performance decreases. The reason is that too many extended words introduce irrelevant syntactic and semantic information, which may cause the HDP topic model to fail to mine the latent topics accurately and therefore weakens the performance of service recommendation. Therefore, we select 3 extended words for each original word in the description documents of Mashups and services in our EHDP-FMs method. These observations indicate that it is very important to choose an appropriate number of extended words for mobile service recommendation.
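The role of the number of extended words can be illustrated with a minimal sketch of description extension. It assumes cosine similarity over word vectors as the ranking criterion; the two-dimensional vectors below are invented for illustration and stand in for Word2vec vectors trained on Wikipedia. The names `cosine`, `extend_description`, and `toy_vectors` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def extend_description(tokens, vectors, top_n=3):
    """Append to each known token its top_n most similar vocabulary words."""
    extended = []
    for token in tokens:
        extended.append(token)
        if token not in vectors:
            continue  # out-of-vocabulary words are kept but not extended
        ranked = sorted(
            (w for w in vectors if w != token),
            key=lambda w: cosine(vectors[token], vectors[w]),
            reverse=True,
        )
        extended.extend(ranked[:top_n])
    return extended


# Toy vectors, invented for illustration only.
toy_vectors = {
    "earth": [1.0, 0.0],
    "planet": [0.9, 0.1],
    "mars": [0.8, 0.3],
    "google": [0.0, 1.0],
    "gmail": [0.1, 0.9],
}

print(extend_description(["earth", "map"], toy_vectors, top_n=2))
# ['earth', 'planet', 'mars', 'map']
```

Raising `top_n` lengthens every description but pulls in words with progressively lower similarity, which is consistent with the degradation observed for 5 and 7 extended words.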

3.4.3. HDP-FMs Performance vs. LDA-FMs Performance with Different Topic Numbers. In this experiment, we set the number of topics for LDA-FMs to 3, 6, 12, and 24, denoted as LDA-FMs-3/6/12/24, respectively. Figures 6 and 7 show the MAE and RMSE values, respectively, when the training matrix density is 10%. HDP-FMs performs best. The MAE and RMSE of LDA-FMs-12 are close to those of HDP-FMs and better than those of LDA-FMs-3, LDA-FMs-6, and LDA-FMs-24. These observations indicate that HDP-FMs is better than LDA-FMs, since it automatically derives the number of topics instead of requiring repeated training with different topic numbers, as LDA does.

3.4.4. Impacts of Top-S and Top-M in HDP-FMs. We investigate the effects of top-S and top-M on mobile service recommendation in order to obtain their optimal values. The optimal values obtained are S = 5 for all top-M similar Mashups and M = 10 for all top-S similar mobile services. With training matrix density = 10% and given number = 30, the MAEs of HDP-FMs are presented in Figures 8 and 9. Figure 8 shows that the MAE of HDP-FMs rises steadily as S grows from 5 to 25. Figure 9 indicates that the MAE of HDP-FMs reaches its lowest value when M = 10 and rises as M increases or decreases from that value. These observations mean that it is very important to identify suitable values of S and M for the HDP-FMs method.
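Selecting the top-S similar services (or top-M similar Mashups) amounts to ranking candidates by the closeness of their topic distributions. The sketch below assumes Jensen-Shannon divergence as the closeness measure, which is a common choice for topic distributions but is an assumption here rather than the measure confirmed in this section; the distributions and the names `js_divergence` and `top_k_similar` are hypothetical.

```python
import math


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two discrete distributions."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with zero probability in x contribute nothing.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def top_k_similar(target, candidates, k):
    """Return the names of the k candidates closest to target in topic space."""
    ranked = sorted(candidates, key=lambda name: js_divergence(target, candidates[name]))
    return ranked[:k]


# Hypothetical topic distributions over three topics.
target = [0.7, 0.2, 0.1]
candidates = {
    "s1": [0.6, 0.3, 0.1],
    "s2": [0.1, 0.1, 0.8],
    "s3": [0.7, 0.2, 0.1],
}

print(top_k_similar(target, candidates, 2))  # ['s3', 's1']
```

A small k keeps only close neighbours, while a large k admits dissimilar items such as `s2`, which is one plausible reading of why the MAE grows as S increases.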

4. Related Works

Service recommendation is a hot topic in service-oriented computing [19]. Traditional service recommendation addresses the quality of Mashup services in order to achieve high-quality service recommendation. Picozzi et al. [20] showed that the quality of a single service can facilitate recommendation. The quality attributes of Mashup components (APIs) and information quality in Mashups [21] are analyzed by Cappiello [22]. In addition, the collaborative filtering (CF) technique is widely exploited in QoS-based service recommendation [16]. It can be used to measure the similarity of services or users, predict missing QoS values on the basis of the QoS records of similar services or similar users, and recommend services to users.
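The CF idea described above can be sketched with one common user-based formulation: Pearson correlation (PCC) as the similarity, and a prediction equal to the user's mean plus the similarity-weighted deviations of positively correlated neighbours. This is an illustrative assumption, not the exact procedure of any cited method; the QoS records and the names `pcc` and `predict` are hypothetical.

```python
import math


def pcc(u, v):
    """Pearson correlation over the services both users have records for."""
    common = [(a, b) for a, b in zip(u, v) if a is not None and b is not None]
    if len(common) < 2:
        return 0.0
    mean_u = sum(a for a, _ in common) / len(common)
    mean_v = sum(b for _, b in common) / len(common)
    num = sum((a - mean_u) * (b - mean_v) for a, b in common)
    den = math.sqrt(sum((a - mean_u) ** 2 for a, _ in common)) * math.sqrt(
        sum((b - mean_v) ** 2 for _, b in common)
    )
    return num / den if den else 0.0


def predict(records, user, service):
    """Predict a missing value from positively correlated neighbours."""
    known = [x for x in records[user] if x is not None]
    base = sum(known) / len(known)
    num = den = 0.0
    for other, row in records.items():
        if other == user or row[service] is None:
            continue
        sim = pcc(records[user], row)
        if sim <= 0:
            continue  # ignore uncorrelated and negatively correlated users
        other_known = [x for x in row if x is not None]
        num += sim * (row[service] - sum(other_known) / len(other_known))
        den += sim
    return base + num / den if den else base


# Hypothetical QoS records (rows: users, columns: services, None: missing).
records = {
    "u1": [1.0, 2.0, None],
    "u2": [2.0, 4.0, 6.0],
    "u3": [3.0, 1.0, 2.0],
}

print(predict(records, "u1", 2))  # 3.5
```

Here only `u2` is positively correlated with `u1`, so the prediction is the mean of `u1` (1.5) plus the deviation of `u2` on the third service (6.0 minus its mean 4.0).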

The problems of data sparsity and the long tail lead to inaccurate and incomplete search results, according to references [23, 24]. To address these problems, some researchers use matrix factorization to decompose historical QoS or Mashup-service interactions to obtain service recommendations [25, 26]. Zheng et al. [26] propose a collaborative QoS prediction method in which a neighbourhood-integrated matrix factorization model is designed to predict personalized QoS values of Web services. Xu et al. [9] propose a social-aware service recommendation method in which the multidimensional social relationships among potential users, Mashups, topics, and services are described by a coupled matrix model. These methods aim to transform the Mashup-service rating matrix or QoS matrix into a lower-dimensional feature space and to predict the probability of services being invoked by Mashups or the unknown QoS values.

Considering that matrix factorization depends on abundant historical interaction records, recent work incorporates additional information into matrix factorization to obtain more accurate service recommendation [5, 10–12]. Among them, Ma et al. [11] integrate matrix factorization with geographic and social influence to recommend points of interest. Chen et al. [12] propose a personalized service recommendation method that uses location information and QoS of Web services to cluster services and users. Yao et al. [10] study the historical invocation relationship between Mashups and services to infer the implicit functional correlation between services and incorporate the correlation into the matrix factorization model to facilitate service recommendation. Liu and Fulia [5] propose collaborative topic regression, which combines probabilistic topic modelling and probabilistic matrix factorization for service recommendation.

The existing methods based on matrix factorization undoubtedly improve the performance of service recommendation. At the same time, we observed that few of them exploited the historical invocations between services and

Table 3: MAE and RMSE comparison of multiple recommendation approaches.

Given | Method | Density 10%: MAE | Density 10%: RMSE | Density 20%: MAE | Density 20%: RMSE | Density 30%: MAE | Density 30%: RMSE
10 | SPCC | 0.4258 | 0.5643 | 0.4005 | 0.5257 | 0.3932 | 0.5036
10 | MPCC | 0.4316 | 0.5701 | 0.4108 | 0.5293 | 0.4035 | 0.5113
10 | PMF | 0.2417 | 0.3835 | 0.2263 | 0.3774 | 0.2014 | 0.3718
10 | LDA-FMs | 0.2091 | 0.3225 | 0.1969 | 0.3116 | 0.1832 | 0.3015
10 | HDP-FMs | 0.1547 | 0.2874 | 0.1329 | 0.2669 | 0.1283 | 0.2498
10 | EHDP-FMs | 0.1308 | 0.2507 | 0.1154 | 0.2372 | 0.1081 | 0.2093
20 | SPCC | 0.4135 | 0.5541 | 0.3918 | 0.5158 | 0.3890 | 0.5003
20 | MPCC | 0.4413 | 0.5712 | 0.4221 | 0.5202 | 0.4151 | 0.5109
20 | PMF | 0.2398 | 0.3559 | 0.2137 | 0.3427 | 0.1992 | 0.3348
20 | LDA-FMs | 0.1989 | 0.3104 | 0.1907 | 0.3018 | 0.1801 | 0.2894
20 | HDP-FMs | 0.1486 | 0.2713 | 0.1297 | 0.2513 | 0.1185 | 0.2291
20 | EHDP-FMs | 0.1227 | 0.2419 | 0.1055 | 0.2216 | 0.0952 | 0.1904
30 | SPCC | 0.4016 | 0.5447 | 0.3907 | 0.5107 | 0.3739 | 0.5012
30 | MPCC | 0.4518 | 0.5771 | 0.4317 | 0.5159 | 0.4239 | 0.5226
30 | PMF | 0.2214 | 0.3319 | 0.2091 | 0.3117 | 0.1986 | 0.3052
30 | LDA-FMs | 0.1970 | 0.3096 | 0.1865 | 0.2993 | 0.1794 | 0.2758
30 | HDP-FMs | 0.1377 | 0.2556 | 0.1109 | 0.2461 | 0.1047 | 0.2057
30 | EHDP-FMs | 0.1113 | 0.2248 | 0.0926 | 0.2057 | 0.0804 | 0.1673

Figure 4: MAE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; y-axis: MAE).

Figure 5: RMSE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; y-axis: RMSE).


Mashups to derive potential topics, and they did not use FMs to model and train these potential topics to predict the probability of Mashups calling services and thereby obtain more accurate service recommendation. Our previous work [8, 27, 28] mainly addresses LDA or enhanced LDA topic models for Web service clustering [27, 28] and exploits word embedding techniques to enhance the accuracy of service clustering [8]. Motivated by these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the latent topics from the description documents of mobile services and Mashups to support FM model training. We use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantics of Mashup applications and mobile services are mined by extending their description documents and modelling their topic probability distributions, and quality prediction is performed by exploiting FMs to train and model multidimensional features of mobile services. High-quality mobile services are ranked and recommended for building Mashups by simultaneously considering their functional representation and quality features. The accuracy of mobile service recommendation is significantly improved as a result.

Although the above approach seems very effective for Mashup development, it would be better if a

Figure 6: MAE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; y-axis label in the extracted plot: RMSE; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 7: RMSE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; y-axis label in the extracted plot: MAE; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 8: Impact of Top-S in HDP-FMs (x-axis: Top-S, 5 to 25; y-axis: MAE).

Figure 9: Impact of Top-M in HDP-FMs (x-axis: Top-M, 5 to 25; y-axis: MAE).


prototype could be designed and implemented to validate the effectiveness and application value of the approach. The objective of the prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. The prototype can be developed with Python 3.5, MySQL 5.6, Flask, and Pyecharts. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for a given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between them from ProgrammableWeb and stores them in corresponding data tables. Because the description documents of Mashups and mobile services contain some useless or meaningless terms/words, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes tokenization: words are segmented by spaces and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python; removing stop words (common short words or symbols that have no practical meaning but occur frequently, such as “the,” “to,” “a,” “an,” “with,” and “at”): the stop word list in NLTK is applied to remove them; and stemming: various forms of a word are used in grammatical expressions, such as “provide,” “providing,” “provides,” and “provided,” and their common word endings, such as “ing,” “s,” and “ed,” are removed.

(2) In the second part (description extension of Mashups and mobile services), word vectors trained by Word2vec on the English Wikipedia corpus, using the gensim module in Python, serve as the description extension source. The top-N words most similar to an original word in the description documents of Mashups and mobile services are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, with time complexity O(log N). Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains the similarity matrix, whose time and space complexity are both O(n^2), where n is the total number of words. During description extension, the top-N words most similar to an original word can be found by table look-up in the trained English Wikipedia corpus, with time complexity O(1) and space complexity O(N^2).

(3) In the third part (topic modelling of Mashups and mobile services), the hierarchical Dirichlet process is used to extract the latent topics of Mashups and mobile services, which clusters/groups service data according to word cooccurrence frequency. It automatically determines the number of topics, which avoids repeatedly adjusting the number of topics and so saves time. It can also predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which makes the prototype system real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python present the results of topic modelling, and a download function lets users download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile services for a given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services for the developer to build a novel Mashup application. During recommendation, factorization machines train and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from topic similarity based on the HDP model, and quality features, i.e., the cooccurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the recommendation results.
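The scoring step of the fourth part can be sketched with the second-order factorization machine of Rendle [13], which predicts w0 + sum_i w_i x_i + sum_{i<j} <v_i, v_j> x_i x_j for a feature vector x. The sketch below is a minimal illustration of that equation only; it omits training, and the parameter values and the three-feature input are invented. In the setting of Figure 3, x would be the concatenation of the six feature boxes.

```python
def fm_predict(x, w0, w, v):
    """Second-order factorization machine score for one feature vector x.

    w0 is the global bias, w the per-feature weights, and v the per-feature
    latent vectors (n rows of length k). The pairwise term uses the identity
    sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2],
    which costs O(k * n) instead of O(k * n^2).
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(v[0])):
        s = sum(v[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Hypothetical parameters and input, for illustration only.
x = [1.0, 0.0, 2.0]
w0 = 0.1
w = [0.2, 0.3, 0.4]
v = [[1.0, 2.0], [0.5, 0.5], [3.0, 1.0]]

score = fm_predict(x, w0, w, v)  # 0.1 + 1.0 + 10.0
```

Because the second feature is zero, the only active interaction is between the first and third features: <v_1, v_3> x_1 x_3 = 5 × 2 = 10. Candidate services would then be ranked by their predicted scores for the target Mashup.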

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing by combining word-embedding-enhanced HDP and FMs. The experimental results on the ProgrammableWeb dataset show that, compared with the existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate and incorporate fine-grained service relationship information into the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371, the National Key R&D Program of China under grant no. 2017YFB1400602, the Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136, and the Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., “Toward mobile service computing: opportunities and challenges,” IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, “Category-aware API clustering and distributed recommendation for automatic mashup creation,” IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid).

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, “WT-LDA: user tagging augmented LDA for web service clustering,” in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, “Incorporating user, topic, and service-related latent factors into web service recommendation,” in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, “Latent dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, “Hierarchical dirichlet process,” Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, “WE-LDA: a word embeddings augmented LDA model for web services clustering,” in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, “A social-aware service recommendation approach for mashup creation,” in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, “Service recommendation for mashup composition with implicit correlation regularization,” in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June-July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, “Recommender systems with social regularization,” in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, “Web service recommendation via exploiting location and QoS information,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, “Factorization machines,” in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, “Factorization machines with libFM,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, “Sharing clusters among related groups: hierarchical dirichlet processes,” Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, “WSRec: a collaborative filtering based web service recommender system,” in Proceedings IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, “Web APIs recommendation for mashup development based on hierarchical dirichlet process and factorization machines,” in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, “Reputation measurement and malicious feedback rating prevention in web service recommendation system,” IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, “Quality-based recommendations for mashup composition,” in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, “Information quality in mashups,” IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, “A quality model for mashup components,” in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, “An empirical study of programmable web: a network analysis on a service-mashup system,” in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, “Manifold-learning based API recommendation for mashup creation,” in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, “An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, “Collaborative web service QoS prediction via neighborhood integrated matrix factorization,” IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, “Integrated content and network-based service clustering and web APIs recommendation for mashup development,” IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, “Domain-aware mashup service clustering based on LDA topic model from multiple data sources,” Information and Software Technology, vol. 90, pp. 40–54, 2017.

Mobile Information Systems 11

Computer Games Technology

International Journal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom

Journal ofEngineeringVolume 2018

Advances in

FuzzySystems

Hindawiwwwhindawicom

Volume 2018

International Journal of

ReconfigurableComputing

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Applied Computational Intelligence and Soft Computing

thinspAdvancesthinspinthinsp

thinspArtificial Intelligence

Hindawiwwwhindawicom Volumethinsp2018

Hindawiwwwhindawicom Volume 2018

Civil EngineeringAdvances in

Hindawiwwwhindawicom Volume 2018

Electrical and Computer Engineering

Journal of

Journal of

Computer Networks and Communications

Hindawiwwwhindawicom Volume 2018

Hindawi

wwwhindawicom Volume 2018

Advances in

Multimedia

International Journal of

Biomedical Imaging

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Engineering Mathematics

International Journal of

RoboticsJournal of

Hindawiwwwhindawicom Volume 2018

Hindawiwwwhindawicom Volume 2018

Computational Intelligence and Neuroscience

Hindawiwwwhindawicom Volume 2018

Mathematical Problems in Engineering

Modelling ampSimulationin EngineeringHindawiwwwhindawicom Volume 2018

Hindawi Publishing Corporation httpwwwhindawicom Volume 2013Hindawiwwwhindawicom

The Scientific World Journal

Volume 2018

Hindawiwwwhindawicom Volume 2018

Human-ComputerInteraction

Advances in

Hindawiwwwhindawicom Volume 2018

Scientic Programming

Submit your manuscripts atwwwhindawicom

Page 7: MobileServiceRecommendationviaCombiningEnhanced ...downloads.hindawi.com/journals/misy/2019/6423805.pdfcorpus Ke3erv3 dataset Gscription document3f3 flashup3n3 mobi3ervice jG%3op3del

uses Word2vec tool to expand the document de-scription of mobile service and Mashup fromWikipedia corpus +en the HDP model is appliedto derive the topic probability distributions of theextended document description of mobile serviceand Mashup Finally the FM is deployed to predictand recommend high-quality mobile service forMashup

34 Experimental Results

341 Recommendation Performance Comparison To studythe performance of mobile service recommendation wecompare our method with other five baseline methods Weselect the optimal number of extended words for eachoriginal word in description documents of Mashup andservice to achieve the best recommendation result in ourEHDP-FMs A detailed investigation about it will be dis-cussed in subsequent section Table 3 reports the MAE andRMSE comparison of multiple recommendation methodswhich show our EHDP-FMs greatly outperforms WPCCand MPCC significantly surpasses PMF and LDA-FMs andslightly exceeds HDP-FMs consistently+e reason for this isthat in the EHDP-FMs (a) more useful words informationcan be obtained from the extended description content ofMashup and service (b) more similar Mashups and similarservices in topic distribution are identified by using HDPtechnology and (c) FM models and trains those usefulinformation (including similar Mashups and similar ser-vices the cooccurrence and popularity of service) to achievemore accurate service probability score prediction Fur-thermore when the given score values increase from 10 to 30and the density of training matrix rises from 10 to 30 theMAE and RMSE in the EHDP-FMs definitely drop+at is tosay more score values and training matrix with highersparsity mean better accuracy of recommendation

342 Effect of the Number of Extended Words on MobileService Recommendation +e experiments investigate theeffect of the number of extended words on mobile servicerecommendation in our proposed method During the ex-periments we set the number of extended words to 1 3 5and 7 (respectively denoted as EHDP-FMs-1 EHDP-FMs-3 EHDP-FMs-5 and EHDP-FMs-7) when training matrixdensity 10 and obtain their values of MAE and RMSE inFigures 4 and 5 +e experimental results indicate theperformance of EHDP-FMs-1 is the worst in all cases +is isbecause the extended description documents of Mashup andservice are still short and the contained useful information isless in it when only extending a word for each original wordWe can see that theMAE and RMSE of EHDP-FMs-3 are theoptimal and best in all cases However when the number ofthe extended words continues to increase from 5 to 7 therecommendation performance decreases+e reason for thisis that too many extended words contain more other ir-relevant syntax and semantics information which maybemakes the HDP topic model fail to mine the latent topicsaccurately and therefore weakens the performance of service

recommendation +erefore we select 3 extended words foreach original words of description document of Mashup andservice in our EHDP-FMmethod +e observations indicateit is very important to choose an appropriate number ofextended words for mobile service recommendation

343 HDP-FMs Performance vs LDA-FMs Performancewith Different Topic Numbers In this experiment we re-spectively set the number of topics as 3 6 12 and 24 forLDA-FMs and denote as LDA-FMs-361224 +e experi-mental results in Figures 6 and 7 respectively show theMAE and RMSE values when the training matrix density isequal to 10We also observe that the performance of HDP-FMs is the best At the same time the MAE and RMSE ofLDA-FMs-12 are close to that of HDP-FMs and surpassedthose of LDA-FMs-3 LDA-FMs-6 and LDA-FMs-24 +eobservations prove that HDP-FM is better than LDA-FMssince it can automatically derive the optimal topic numbersinstead of repeatedly training like LDA

344 Impacts of Top-S and Top-M in HDP-FMs We in-vestigate the effects of top-S and top-M to mobile servicerecommendation in order to obtain their optimal values+eoptimal values of top-M (top-S) for all similar top-S (top-M)services (Mashups) are obtained ie S 5 for all top-Msimilar Mashups and M 10 for all top-S similar mobileservices Under the setting of training matrix density 10and given number 30 the MAEs of HDP-FMs are pre-sented in Figures 8 and 9 We can see that from Figure 8 theMAE of HDP-FMs constantly rises when S grows from 5 to25 Figure 9 indicates the MAE of HDP-FMs runs up to itspeak value whenM 10 and then continuously rises with theincreasing or decreasing ofM +e observations mean that itis very important to identify suitable values of S and M forthe HDP-FM method

4 Related Works

Service recommendation is a hot topic nowadays in service-oriented computing [19] Traditional service recommen-dation solves the quality problem of Mashup services inorder to realize high-quality service recommendation +equality of a single service can facilitate recommendationshowed by Picozzi et al [20] +e quality attributes ofMashup components (APIs) and information quality inMashups [21] is analyzed by Cappiello [22] In additioncollaborative filtering (CF) technique is widely exploited inQoS-based service recommendation [16] We can use it tomeasure the similarity of services or users predict themissing QoS values on the basis of the QoS records of similarservices or similar users and recommend services to users

+e problems of the data sparsity and long tail bringabout inaccurate and imperfect search results according tothe results in references [23 24] To attack the problemsome researchers try to use matrix factorization technologyto decompose historical QoS or Mashup service interactionsto obtain service recommendations [25 26] A collaborativeQoS prediction method is proposed in which a matrix

Mobile Information Systems 7

factorization model of neighbourhood integration isdesigned to predict the QoS value of personalized Webservices by Zheng et al [26] A social awareness servicerecommendation method is proposed in which the multi-dimensional social relationships among potential usersMashups topics and services are depicted by the couplingmatrix model by Xu et al [9] +ese methods aim totransform Mashup-service rating matrix or QoS into afeature space matrix with lower dimensions and predict theprobability of services invoked by Mashups or unknownQoS

Considering that matrix factorization depends onabundant historical interaction records recent work in-corporates additional information into matrix factorizationto obtain more accurate service recommendation [5 10ndash12]Among them Ma et al [11] integrates matrix factorization

with geographic and social influence to recommend interestpoints By using location information and QoS of Webservices to cluster services and users a personalized servicerecommendation is proposed by Chen et al [12] +e his-torical invocation relationship between Mashups and ser-vices is studied to infer the implicit functional correlationbetween services and the correlation is incorporated into thematrix factorization model to facilitate service recommen-dation by Yao et al [10] Collaborative topic regression isproposed by Liu and Fulia [5] which combines probabilistictopic modelling and probabilistic matrix factorization forservice recommendation

+e existing methods based on matrix factorizationundoubtedly improve the performance of service recom-mendation At the same time we observed that few of themrecognized the historical invocation between services and

Table 3: MAE and RMSE comparison of multiple recommendation approaches.

                       Matrix density 10%    Matrix density 20%    Matrix density 30%
Given    Method        MAE       RMSE        MAE       RMSE        MAE       RMSE
10       SPCC          0.4258    0.5643      0.4005    0.5257      0.3932    0.5036
10       MPCC          0.4316    0.5701      0.4108    0.5293      0.4035    0.5113
10       PMF           0.2417    0.3835      0.2263    0.3774      0.2014    0.3718
10       LDA-FMs       0.2091    0.3225      0.1969    0.3116      0.1832    0.3015
10       HDP-FMs       0.1547    0.2874      0.1329    0.2669      0.1283    0.2498
10       EHDP-FMs      0.1308    0.2507      0.1154    0.2372      0.1081    0.2093
20       SPCC          0.4135    0.5541      0.3918    0.5158      0.3890    0.5003
20       MPCC          0.4413    0.5712      0.4221    0.5202      0.4151    0.5109
20       PMF           0.2398    0.3559      0.2137    0.3427      0.1992    0.3348
20       LDA-FMs       0.1989    0.3104      0.1907    0.3018      0.1801    0.2894
20       HDP-FMs       0.1486    0.2713      0.1297    0.2513      0.1185    0.2291
20       EHDP-FMs      0.1227    0.2419      0.1055    0.2216      0.0952    0.1904
30       SPCC          0.4016    0.5447      0.3907    0.5107      0.3739    0.5012
30       MPCC          0.4518    0.5771      0.4317    0.5159      0.4239    0.5226
30       PMF           0.2214    0.3319      0.2091    0.3117      0.1986    0.3052
30       LDA-FMs       0.1970    0.3096      0.1865    0.2993      0.1794    0.2758
30       HDP-FMs       0.1377    0.2556      0.1109    0.2461      0.1047    0.2057
30       EHDP-FMs      0.1113    0.2248      0.0926    0.2057      0.0804    0.1673
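To make the gaps in Table 3 easier to read, the following Python snippet computes the relative error reduction of EHDP-FMs over each baseline for one configuration (given number 10, matrix density 10%). The numbers are copied from Table 3; the helper name `relative_reduction` is an assumption introduced only for this illustration.

```python
# (MAE, RMSE) at given number = 10, matrix density = 10%, copied from Table 3
results = {
    "SPCC": (0.4258, 0.5643),
    "MPCC": (0.4316, 0.5701),
    "PMF": (0.2417, 0.3835),
    "LDA-FMs": (0.2091, 0.3225),
    "HDP-FMs": (0.1547, 0.2874),
    "EHDP-FMs": (0.1308, 0.2507),
}


def relative_reduction(baseline, proposed):
    """Fraction by which `proposed` lowers the error of `baseline`."""
    return (baseline - proposed) / baseline


best_mae, best_rmse = results["EHDP-FMs"]
reductions = {
    name: (relative_reduction(mae, best_mae), relative_reduction(rmse, best_rmse))
    for name, (mae, rmse) in results.items()
    if name != "EHDP-FMs"
}

for name, (d_mae, d_rmse) in reductions.items():
    print(f"{name:8s} MAE -{d_mae:.1%}  RMSE -{d_rmse:.1%}")
```

For this configuration, EHDP-FMs lowers the MAE of HDP-FMs by roughly 15% and its RMSE by roughly 13%; the same calculation can be repeated for the other rows of the table.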

Figure 4: MAE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).

Figure 5: RMSE values of EHDP-FMs-1/3/5/7 (x-axis: given number, 10 to 30; series: EHDP-FMs-1, EHDP-FMs-3, EHDP-FMs-5, EHDP-FMs-7).


Mashups to derive latent topics, and they did not use FMs to model and train these latent topics to predict the probability of Mashups calling services for more accurate service recommendation. Our previous work [8, 27, 28] mainly addresses LDA or enhanced LDA topic models for Web service clustering [27, 28] and also exploits word embedding techniques to enhance the accuracy of service clustering [8]. Driven by these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the latent topics from the description documents of mobile services and Mashups to support FM model training. We use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.
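The prediction step described above can be illustrated with the standard second-order factorization machine model of Rendle [13], in which the score is a bias plus linear terms plus factorized pairwise interactions. The sketch below is only an illustration: the layout of the feature vector and all parameter values are hypothetical and untrained, and `fm_predict_naive` exists solely to check the linear-time form against the explicit pairwise sum.

```python
def fm_predict(x, w0, w, V):
    """Second-order FM score for one feature vector x.

    x: n feature values; w0: bias; w: n linear weights;
    V: n rows of k latent factors (one row per feature).
    Uses the identity
      sum_{i<j} <v_i, v_j> x_i x_j
        = 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]
    """
    n, k = len(x), len(V[0])
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(n))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


def fm_predict_naive(x, w0, w, V):
    """Same model written as the explicit sum over feature pairs."""
    n = len(x)
    total = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total


# Hypothetical layout: [Mashup one-hot (2) | service one-hot (2) | topic similarity | popularity]
x = [1.0, 0.0, 0.0, 1.0, 0.8, 0.3]
w0 = 0.1
w = [0.2, -0.1, 0.05, 0.3, 0.4, 0.1]
V = [[0.1, 0.2], [0.0, -0.1], [0.3, 0.1], [0.2, 0.2], [0.5, -0.3], [0.1, 0.4]]
score = fm_predict(x, w0, w, V)
```

Because every pair of features interacts through the inner product of two latent vectors, the model can estimate an interaction between a Mashup and a service that never co-occurred in the training data.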

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantic representation of Mashup applications and mobile services is consolidated and mined by extending their description documents and modelling their topic probability distributions, and quality prediction is performed by exploiting FMs to train and model the multidimensional features of mobile services. High-quality mobile services are ranked and recommended for building Mashups by simultaneously considering their functional representation and quality features. As a result, the accuracy of mobile service recommendation is significantly improved.
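As a small illustration of how topic probability distributions can express functional similarity, the Python sketch below ranks candidate Mashups by the similarity of their topic vectors to a target Mashup and keeps the closest ones. Cosine similarity, the three-topic vectors, and the names `cosine` and `top_similar` are assumptions chosen for this example; the similarity measure actually used with the HDP topics may differ.

```python
import math


def cosine(p, q):
    """Cosine similarity between two topic vectors of equal length."""
    dot = sum(a * b for a, b in zip(p, q))
    norm = math.sqrt(sum(a * a for a in p)) * math.sqrt(sum(b * b for b in q))
    return dot / norm if norm else 0.0


def top_similar(target, candidates, m):
    """Return the m candidate names whose topic vectors are closest to `target`."""
    ranked = sorted(candidates, key=lambda name: cosine(target, candidates[name]), reverse=True)
    return ranked[:m]


# Hypothetical topic distributions over three topics
target_mashup = [0.7, 0.2, 0.1]
other_mashups = {
    "m1": [0.6, 0.3, 0.1],
    "m2": [0.1, 0.1, 0.8],
    "m3": [0.5, 0.4, 0.1],
}
neighbours = top_similar(target_mashup, other_mashups, m=2)
```

Here `m1` and `m3` are selected because their topic mass is concentrated on the same topic as the target, while `m2` is dominated by a different topic.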

Although the above approach seems very effective for Mashup development, it will be better if a

Figure 6: MAE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 7: RMSE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, HDP-FMs).

Figure 8: Impact of Top-S in HDP-FMs (x-axis: Top-S, 5 to 25; y-axis: MAE).

Figure 9: Impact of Top-M in HDP-FMs (x-axis: Top-M, 5 to 25; y-axis: MAE).


prototype can be designed and implemented to validate the effectiveness and application value of the approach. The objective of the prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. We can use Python 3.5 and MySQL 5.6, together with Flask and Pyecharts, to develop the prototype system. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for a given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between these Mashups and services from ProgrammableWeb and stores them in corresponding data tables. Because the description documents of Mashups and mobile services contain some useless or meaningless terms/words, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes tokenization: words are segmented by spaces, and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python; removing stop words (common short words or symbols that have no practical meaning but occur frequently, such as "the", "to", "a", "an", "with", and "at"): the stop word list in NLTK is applied to remove them; and stemming: various forms of a word are usually used in grammatical expression, such as "provide", "providing", "provides", and "provided", and their common word endings, such as "-ing", "-s", and "-ed", should be removed.

(2) In the second part (description extension of Mashups and mobile services), the English Wikipedia corpus trained by Word2vec, based on the gensim module in Python, is exploited as the description extension source for Mashups and mobile services. The Top-N words most similar to an original word in the description documents of Mashups and mobile services are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, whose time complexity is O(log N). Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains the similarity matrix, whose time and space complexity are both O(n²), where n is the total number of words. During description extension, the Top-N words most similar to an original word can be found simply by table look-up in the trained English Wikipedia corpus; the time complexity of this look-up is O(1) and its space complexity is O(N²).

(3) In the third part (topic modelling of Mashups and mobile services), the hierarchical Dirichlet process is used to extract the implicit topics of Mashups and mobile services, which clusters/groups service data according to word co-occurrence frequency. It can automatically determine the optimal number of topics, which avoids repeatedly adjusting the number of topics and so saves time. It can also accurately predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which makes the prototype system real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of topic modelling, and a download function is provided for users to download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile services for the given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services for the software developer to build a novel Mashup application. During the process of recommendation, factorization machines train and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from the topic similarity based on the HDP model, and quality features, i.e., the co-occurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the effect of recommendation.
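Steps (1) and (2) above can be sketched in a few lines of Python. In the prototype these steps would rely on NLTK and on a Word2vec model trained with gensim (where the look-up would typically be a call such as `model.wv.most_similar(word, topn=N)`); to keep the example self-contained, the stop-word list, the suffix rules, and the similarity table below are small hand-written stand-ins, and all names are hypothetical.

```python
import re

# Stand-in for NLTK's English stop-word list (assumption: tiny subset)
STOP_WORDS = {"the", "to", "a", "an", "with", "and", "at", "of", "for"}
# Naive stand-in for a real stemmer such as NLTK's PorterStemmer
SUFFIXES = ("ing", "ed", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens, dropping punctuation."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Strip one common ending, keeping at least three leading characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, remove stop words, then stem the remaining tokens."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


# Hypothetical look-up table: word -> similar words, most similar first
SIMILAR = {
    "map": ["atlas", "chart", "navigation"],
    "photo": ["image", "picture", "snapshot"],
}


def extend(tokens, top_n):
    """Append up to top_n similar words for every token found in the table."""
    extended = list(tokens)
    for token in tokens:
        extended.extend(SIMILAR.get(token, [])[:top_n])
    return extended


tokens = preprocess("Providing maps and photos to the users.")
extended = extend(tokens, top_n=2)
```

The extended token list is what a topic model such as HDP would then consume, so a short description gains related vocabulary before topics are inferred.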

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing by combining word-embedding-enhanced HDP and FMs. The experimental results on the ProgrammableWeb dataset show that, compared with the existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate and incorporate fine-grained service relationship information into the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http://49.123.0.60:8080/MashupNetwork2.0/dataset.jsp.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.


Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371; the National Key R&D Program of China under grant no. 2017YFB1400602; the Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136; and the Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., “Toward mobile service computing: opportunities and challenges,” IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, “Category-aware API clustering and distributed recommendation for automatic mashup creation,” IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid).

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, “WT-LDA: user tagging augmented LDA for web service clustering,” in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, “Incorporating user, topic, and service-related latent factors into web service recommendation,” in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, “Latent dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, “Hierarchical dirichlet process,” Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, “WE-LDA: a word embeddings augmented LDA model for web services clustering,” in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, “A social-aware service recommendation approach for mashup creation,” in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, “Service recommendation for mashup composition with implicit correlation regularization,” in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June-July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, “Recommender systems with social regularization,” in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, “Web service recommendation via exploiting location and QoS information,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, “Factorization machines,” in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, “Factorization machines with libFM,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, “Sharing clusters among related groups: hierarchical dirichlet processes,” Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, “WSRec: a collaborative filtering based web service recommender system,” in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, “Web APIs recommendation for mashup development based on hierarchical dirichlet process and factorization machines,” in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, “Reputation measurement and malicious feedback rating prevention in web service recommendation system,” IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, “Quality-based recommendations for mashup composition,” in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, “Information quality in mashups,” IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, “A quality model for mashup components,” in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, “An empirical study of programmable web: a network analysis on a service-mashup system,” in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, “Manifold-learning based API recommendation for mashup creation,” in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, “An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, “Collaborative web service QoS prediction via neighborhood integrated matrix factorization,” IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, “Integrated content and network-based service clustering and web APIs recommendation for mashup development,” IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, “Domain-aware mashup service clustering based on LDA topic model from multiple data sources,” Information and Software Technology, vol. 90, pp. 40–54, 2017.


Page 8: MobileServiceRecommendationviaCombiningEnhanced ...downloads.hindawi.com/journals/misy/2019/6423805.pdfcorpus Ke3erv3 dataset Gscription document3f3 flashup3n3 mobi3ervice jG%3op3del

factorization model of neighbourhood integration isdesigned to predict the QoS value of personalized Webservices by Zheng et al [26] A social awareness servicerecommendation method is proposed in which the multi-dimensional social relationships among potential usersMashups topics and services are depicted by the couplingmatrix model by Xu et al [9] +ese methods aim totransform Mashup-service rating matrix or QoS into afeature space matrix with lower dimensions and predict theprobability of services invoked by Mashups or unknownQoS

Considering that matrix factorization depends onabundant historical interaction records recent work in-corporates additional information into matrix factorizationto obtain more accurate service recommendation [5 10ndash12]Among them Ma et al [11] integrates matrix factorization

with geographic and social influence to recommend interestpoints By using location information and QoS of Webservices to cluster services and users a personalized servicerecommendation is proposed by Chen et al [12] +e his-torical invocation relationship between Mashups and ser-vices is studied to infer the implicit functional correlationbetween services and the correlation is incorporated into thematrix factorization model to facilitate service recommen-dation by Yao et al [10] Collaborative topic regression isproposed by Liu and Fulia [5] which combines probabilistictopic modelling and probabilistic matrix factorization forservice recommendation

+e existing methods based on matrix factorizationundoubtedly improve the performance of service recom-mendation At the same time we observed that few of themrecognized the historical invocation between services and

Table 3 MAE and RMSE comparison of multiple recommendation approaches

MethodMatrix density 10 Matrix density 20 Matrix density 30MAE RMSE MAE RMSE MAE RMSE

Given 10

SPCC 04258 05643 04005 05257 03932 05036MPCC 04316 05701 04108 05293 04035 05113PMF 02417 03835 02263 03774 02014 03718

LDA-FMs 02091 03225 01969 03116 01832 03015HDP-FMs 01547 02874 01329 02669 01283 02498EHDP-FMs 01308 02507 01154 02372 01081 02093

Given 20

SPCC 04135 05541 03918 05158 03890 05003MPCC 04413 05712 04221 05202 04151 05109PMF 02398 03559 02137 03427 01992 03348

LDA-FMs 01989 03104 01907 03018 01801 02894HDP-FMs 01486 02713 01297 02513 01185 02291EHDP-FMs 01227 02419 01055 02216 00952 01904

Given 30

SPCC 04016 05447 03907 05107 03739 05012MPCC 04518 05771 04317 05159 04239 05226PMF 02214 03319 02091 03117 01986 03052

LDA-FMs 01970 03096 01865 02993 01794 02758HDP-FMs 01377 02556 01109 02461 01047 02057EHDP-FMs 01113 02248 00926 02057 00804 01673

Given number10 20 30

MA

E

005

01

015

02

025

03

EHDP-FMs-1EHDP-FMs-3

EHDP-FMs-5EHDP-FMs-7

Figure 4 MAE values of EHDP-FMs-1357

EHDP-FMs-1EHDP-FMs-3

EHDP-FMs-5EHDP-FMs-7

Given number10 20 30

RMSE

018

02

022

024

026

028

03

032

Figure 5 RMSE values of EHDP-FMs-1357

8 Mobile Information Systems

Mashups to derive potential topics and they did not use FMsto model and train these potential topics to predict theprobability of Mashup calling services to obtain more ac-curate service recommendation In our previous work[8 27 28] we mainly address on LDA or enhanced LDAtopic model for Web services clustering [27 28] and alsoexploit word embedding technique to enhance the accuracyof service clustering [8] Driven by these methods wecombine FMs and word-embedded enhanced HDP forrecommending mobile services to build novel Mashup ap-plication We apply the HDP model to export the potentialtopics from the description documents of mobile servicesand Mashups to support FM model training We use FMs topredict the probability of Mashups calling mobile servicesand recommend high-quality services for building novelMashup application

5 Discussion

Recommending mobile service to build novel Mashup ap-plication for software developers in the mobile servicecomputing environment is becoming a promising researchtopic In our paper the functional semantic representationof Mashup applications and mobile services is fully con-solidated and mined by extending their description docu-ments and modelling their topic probability distributionand the quality prediction is performed by exploiting FMs totrain and model multiple dimension features of mobileservice +e high-quality mobile services are ranked andrecommended to build Mashup by simultaneously consid-ering their functionality representation and quality feature+e accuracy of mobile service recommendation is signifi-cantly improved as a result

Although the above approach and solution seem veryeffective in the Mashup development it will be better if a

Given number10 20 30

RMSE

02

025

03

035

04

LDA-FMs-3LDA-FMs-6LDA-FMs-12

LDA-FMs-24HDP-FMs

Figure 6 MAE of HDP-FMs and LDA-FMs

LDA-FMs-3LDA-FMs-6LDA-FMs-12

LDA-FMs-24HDP-FMs

Given number10 20 30

MA

E

012

013

014

015

016

017

Figure 7 RMSE of HDP-FMs and LDA-FMs

Top-S5 10 15 20 25

MA

E

013

014

015

016

017

018

019

02

HDP-FMs

Figure 8 Impact of Top-S in HDP-FMs

Top-M5 10 15 20 25

MA

E

013

014

015

016

017

018

019

02

HDP-FMs

Figure 9 Impact of Top-M in HDP-FMs

Mobile Information Systems 9

prototype can be designed or implemented to validate theeffectiveness and application value of the approach As weexpected the objective of the prototype system is to rank andrecommend the high-quality mobile service to softwaredevelopers for building Mashup application We can use thetools of Python 35 Mysql 56 and the technologies of Flaskand Pyecharts to develop the prototype system It shouldachieve four basic function parts ie service data crawlingand preprocessing description extension of Mashup andmobile service topic modelling of Mashup and mobileservice and recommendation of mobile service for the givenMashup requirement More concretely

(1) In the first part (service data crawling and pre-processing) the system incrementally crawls Mash-ups services and invocations between these Mashupsand services from ProgrammableWeb and builds theircorresponding data table to store Because the de-scription documents of Mashup and mobile servicecontain some useless or meaningless termswords thepreprocessing is performed to normalize and stan-dardize the description information +e pre-processing mainly includes tokenization words aresegmented by spaces and punctuation is separatedfrom words by using NLTK (Natural LanguageToolkit) in Python removing stop words (the com-mon short words or symbols that have no practicalmeaning but occur frequently such as the to a anwith and at) the stop vocabulary table in the NLTK isapplied to remove stop words stemming variousforms of a word are usually used in the grammaticalexpression such as provide providing provides andprovided and their common word endings such asing s and ed should be removed

(2) In the second part (description extension of Mashupand mobile service) English Wikipedia corpustrained byWord2vec based on the genism module inPython is exploited as the description extensionsource of Mashup and mobile service +e mostsimilar Top-N words to an original word in thedescription documents of Mashup and mobile ser-vice are identified and saved as the extended words inthe prototype system Word2vec uses the hierar-chical softmax algorithm to speed up and train wordvector for English Wikipedia corpus whose timecomplexity is O(log N) Meanwhile Word2veccalculates the similarity between words in EnglishWikipedia corpus and obtain the similarity matrixwhose time and space complexity are all O(n2) n isthe total amount of words During the process ofdescription extension the most similar Top-N wordsto an original word can be found only by the way oflook-up table in the trained English Wikipediacorpus and its time complexity is O(1) and spacecomplexity is O(N2)

(3) In the third part (topic modelling of Mashup andmobile service) hierarchical Dirichlet processtechnology is used to extract the implicit topics ofMashup and mobile service which clustersgroups

service data according to the cooccurrence of wordfrequency It can automatically determine the op-timal number of topics which avoids adjusting thenumber of topics repeatedly and so saves the timecost It can also accurately predict the topic dis-tribution of Mashup and mobile service which donot need to retrain the dataset and make the pro-totype system real time A topic modelling modulecan be designed in the prototype system in whichFlask framework and Pyecharts visualization tool inPython are used to present the effect of topicmodelling and a download function is provided todownload the transformed topic vectors of Mashupand mobile service for users

(4) In the fourth part (recommendation of mobile ser-vice for the given Mashup requirement) when asoftware developer submits a Mashup requirementthe prototype system will return a list of mobileservices with good quality a for software developer tobuild novel Mashup application During the processof recommendation factorization machines trainand model the important input features +ese inputfeatures include functional features ie similarMashups of target Mashup and similar mobile ser-vices of active mobile service derived from the topicsimilarity based on the HDP model and qualityfeatures ie the cooccurrence and popularity ofmobile services obtained from the stored data tableFMs predict the probability of mobile servicesinvocated by Mashups and recommends the high-quality mobile service for a Mashup creation Sim-ilarly Flask framework and Pyechart visualizationtool in Python are used to present the effect ofrecommendation

6 Conclusions and Future Work

+is paper proposes a mobile service recommendationmethod for Mashup development in mobile servicecomputing by combining word embeddings enhancedHDP and FMs +e experimental results on the top ofProgrammableWeb dataset show that compared with theexisting recommendation methods the proposed methodachieves significant improvements in the accuracy ofrecommendation In the future work we willinvestigate and apply fine-grained service relationshipinformation into the proposed model for more accuraterecommendation

Data Availability

+e crawled dataset from ProgrammableWeb can beaccessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

+e authors declare that there are no conflicts of interestregarding the publication of this paper

10 Mobile Information Systems

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under grant nos 61873316 6187213961572187 61772193 61702181 and 61572371 National KeyRampD Program of China under grant no 2017YFB1400602Hunan Provincial Natural Science Foundation of Chinaunder grant nos 2017JJ2098 2017JJ4036 2018JJ2139 and2018JJ2136 and Innovation Platform Open Foundation ofHunan Provincial Education Department of China undergrant no 17K033

References

[1] S Deng L Huang H Wu et al ldquoToward mobile servicecomputing opportunities and challengesrdquo IEEE CloudComputing vol 3 no 4 pp 32ndash41 2016

[2] B Xia Y Fan W Tan K Huang J Zhang and C WuldquoCategory-aware API clustering and distributed recommen-dation for automatic mashup creationrdquo IEEE Transactions onServices Computing vol 8 no 5 pp 674ndash687 2015

[3] httpsenwikipediaorgwikiMashup_(web_application_hybrid)

[4] L Chen Y Wang Q Yu Z Zheng and J Wu ldquoWT-LDAuser tagging augmented LDA for web service clusteringrdquo inProceedings of the International Conference on Service-Oriented Computing (ICSOC) Hangzhou China January2013

[5] X Liu and I Fulia ldquoIncorporating user topic and service-related latent factors into web service recommendationrdquo inProceedings of the IEEE International Conference on WebServices pp 185ndash192 New York NY USA July 2015

[6] D Blei A Ng and M Jordan ldquoLatent dirichlet allocationrdquoJournal of Machine Learning Research vol 3 pp 993ndash10222003

[7] Y W Teh M I Jordan M J Beal and D M Blei ldquoHier-archical dirichlet processrdquo Journal of the American StatisticalAssociation vol 101 no 476 pp 1566ndash1581 2004

[8] M Shi J Liu D Zhou M Tang and B Cao ldquoWE-LDA aword embeddings augmented LDA model for web servicesclusteringrdquo in Proceedings of the IEEE International Confer-ence on Web Services (ICWS) pp 9ndash16 Honolulu HI USAJune 2017

[9] W Xu J Cao L Hu J Wang and M Li ldquoA social-awareservice recommendation approach for mashup creationrdquo inProceedings of the IEEE 20th International Conference on WebServices pp 107ndash114 Santa Clara CA USA 2013

[10] L Yao X Wang Q Sheng W Ruan and W Zhang ldquoServicerecommendation for mashup composition with implicitcorrelation regularizationrdquo in Proceedings of the IEEE In-ternational Conference on Web Services pp 217ndash224 NewYork NY USA June-July 2015

[11] H Ma D Zhou C Liu M R Lyu and I King ldquoRecom-mender Systems with social regularizationrdquo in Proceedings ofthe Fourth ACM International Conference on Web Search andData Mining pp 287ndash296 ACM Hong Kong China Feb-ruary 2011

[12] X Chen Z Zheng Q Yu and M R Lyu ldquoWeb servicerecommendation via exploiting location and QoS in-formationrdquo IEEE Transactions on Parallel and DistributedSystems vol 25 no 7 pp 1913ndash1924 2014

[13] S Rendle ldquoFactorization machinesrdquo in Proceedings of theIEEE International Conference on Data Mining (ICDM)pp 995ndash1000 Sydney Australia December 2010

[14] S Rendle ldquoFactorization machines with libFMrdquo ACMTransactions on Intelligent Systems and Technology (TIST)vol 3 no 3 pp 57ndash78 2012

[15] T Ma I Sato and H Nakagawa e Hybrid NestedHierarchical Dirichlet Process and Its Application to TopicModeling with Word Differentiation Association for theAdvancement of Artificial Intelligence (AAAI) Menlo ParkCA USA 2015

[16] Y Teh M Jordan M Beal and D Blei ldquoSharing clustersamong related groups hierarchical dirichlet processesrdquo Ad-vances in Neural Information Processing System vol 37 no 2pp 1385ndash1392 2004

[17] Z Zheng H Ma M Lyu and I King ldquoWSRec a collaborativefiltering based web service recommender systemrdquo in Pro-ceedings IEEE International Conference on Web Services(ICWS) pp 437ndash444 Los Angeles CA USA July 2009

[18] B Cao B Li J Liu M Tang and Y Liu ldquoWeb APIs rec-ommendation for mashup development based on hierarchicaldirichlet process and factorization machinesrdquo in Proceedingsof Collaborate Computing Networking Applications andWorksharing Beijing China July 2016

[19] S Wang Z Zheng Z Wu M Lyu and F Yang ldquoReputationmeasurement and malicious feedback rating prevention inweb service recommendation systemrdquo IEEE Transactions onServices Computing vol 5 no 8 pp 755ndash767 2015

[20] M Picozzi M Rodolfi C Cappiello andMMatera ldquoQuality-based recommendations for mashup compositionrdquo in Cur-rent Trends in Web Engineering vol 6385 pp 360ndash371 2010

[21] C Cappiello F Daniel M Matera and C Pautasso ldquoIn-formation quality in mashupsrdquo IEEE Internet Computingvol 14 no 4 pp 14ndash22 2010

[22] C Cappiello ldquoA quality model for mashup componentsrdquo inWeb Engineering Web Engineering Lecture Notes in Com-puter Science vol 5648 pp 236ndash250 2009

[23] K Huang Y Fan and W Tan ldquoAn empirical study ofprogrammable web a network analysis on a service-mashupsystemrdquo in Proceedings of the 2012 IEEE 19th InternationalConference onWeb Services (ICWS) Honolulu HI USA June2012

[24] W Gao L Chen J Wu and H Gao ldquoManifold-learningbased API recommendation for mashup creationrdquo in Pro-ceedings of the 2015 IEEE International Conference on WebServices (ICWS) New York NY USA June 2015

[25] X Luo M Zhou Y Xia and Q Zhu ldquoAn efficient non-negative matrix-factorization-based approach to collaborativefiltering for recommender systemsrdquo IEEE Transactions onIndustrial Informatics vol 10 no 2 pp 1273ndash1284 2014

[26] Z Zheng H Ma M R Lyu and I King ldquoCollaborative webservice QoS prediction via neighborhood integrated matrixfactorizationrdquo IEEE Transactions on Services Computingvol 6 no 3 pp 289ndash299 2013

[27] B Cao X Liu M D M Rahman B Li J Liu and M TangldquoIntegrated content and network-based service clustering andweb APIs recommendation for mashup developmentrdquo IEEETransactions on Services Computing p 1 2017

[28] B Cao X Liu J Liu and M Tang ldquoDomain-aware mashupservice clustering based on LDA topic model from multipledata sourcesrdquo Information and Software Technology vol 90pp 40ndash54 2017

Mashups to derive potential topics, and they did not use FMs to model and train these potential topics to predict the probability of Mashups calling services and thereby obtain more accurate service recommendations. Our previous work [8, 27, 28] mainly addresses LDA or enhanced LDA topic models for Web service clustering [27, 28] and also exploits word embedding techniques to enhance the accuracy of service clustering [8]. Driven by these methods, we combine FMs and word-embedding-enhanced HDP to recommend mobile services for building novel Mashup applications. We apply the HDP model to derive the potential topics from the description documents of mobile services and Mashups to support FM model training. We use FMs to predict the probability of Mashups calling mobile services and recommend high-quality services for building novel Mashup applications.

5. Discussion

Recommending mobile services to software developers for building novel Mashup applications in the mobile service computing environment is becoming a promising research topic. In our paper, the functional semantic representation of Mashup applications and mobile services is consolidated and mined by extending their description documents and modelling their topic probability distributions, and quality prediction is performed by exploiting FMs to train and model the multidimensional features of mobile services. High-quality mobile services are ranked and recommended for building Mashups by simultaneously considering their functionality representation and quality features. As a result, the accuracy of mobile service recommendation is significantly improved.
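To make the role of FMs in the quality prediction more concrete, the following is a minimal sketch of the standard second-order factorization machine scoring function described by Rendle [13]. It is an illustration under assumptions, not the implementation used in the experiments: the function name `fm_predict`, the toy feature vector, and all parameter values are hypothetical, and the feature vector simply stands in for the functional and quality features of one Mashup-service pair.

```python
from typing import List


def fm_predict(x: List[float], w0: float, w: List[float],
               v: List[List[float]]) -> float:
    """Second-order factorization machine score for one feature vector.

    x  : feature vector of length n (hypothetical Mashup/service features)
    w0 : global bias
    w  : n linear weights
    v  : n x k latent factor matrix, one row per feature
    """
    n, k = len(x), len(v[0])
    linear = sum(w[i] * x[i] for i in range(n))
    # Pairwise interactions via the O(k * n) identity:
    #   sum_{i<j} <v_i, v_j> x_i x_j
    #     = 0.5 * sum_f [ (sum_i v_if x_i)^2 - sum_i (v_if x_i)^2 ]
    pairwise = 0.0
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise


# Toy usage with made-up parameters (assumed values, for illustration only).
x = [1.0, 0.0, 1.0, 0.5]
w = [0.2, -0.1, 0.05, 0.3]
v = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.4], [-0.2, 0.1]]
score = fm_predict(x, 0.1, w, v)
```

The pairwise term is computed with the linear-time reformulation rather than an explicit double loop over feature pairs, which is what makes FMs practical on sparse, high-dimensional inputs. In practice the parameters would be learned, for example with libFM [14], rather than set by hand.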

Figure 6: MAE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, and HDP-FMs).

Figure 7: RMSE of HDP-FMs and LDA-FMs (x-axis: given number, 10 to 30; series: LDA-FMs-3, LDA-FMs-6, LDA-FMs-12, LDA-FMs-24, and HDP-FMs).

Figure 8: Impact of Top-S in HDP-FMs (x-axis: Top-S, 5 to 25; y-axis: MAE).

Figure 9: Impact of Top-M in HDP-FMs (x-axis: Top-M, 5 to 25; y-axis: MAE).

Although the above approach seems effective for Mashup development, it would be better if a prototype were designed and implemented to validate its effectiveness and application value. The objective of the prototype system is to rank and recommend high-quality mobile services to software developers for building Mashup applications. Python 3.5, MySQL 5.6, Flask, and Pyecharts can be used to develop the prototype system. It should provide four basic functions, i.e., service data crawling and preprocessing, description extension of Mashups and mobile services, topic modelling of Mashups and mobile services, and recommendation of mobile services for a given Mashup requirement. More concretely:

(1) In the first part (service data crawling and preprocessing), the system incrementally crawls Mashups, services, and the invocations between them from ProgrammableWeb and stores them in corresponding data tables. Because the description documents of Mashups and mobile services contain some useless or meaningless terms, preprocessing is performed to normalize and standardize the description information. The preprocessing mainly includes (a) tokenization: words are segmented by spaces, and punctuation is separated from words by using NLTK (Natural Language Toolkit) in Python; (b) stop word removal: common short words or symbols that carry no practical meaning but occur frequently, such as "the," "to," "a," "an," "with," and "at," are removed by applying the stop word list in NLTK; and (c) stemming: a word usually appears in various grammatical forms, such as "provide," "providing," "provides," and "provided," and the common word endings, such as "-ing," "-s," and "-ed," are removed.

(2) In the second part (description extension of Mashups and mobile services), a Word2vec model trained on the English Wikipedia corpus with the gensim module in Python is exploited as the description extension source for Mashups and mobile services. The Top-N words most similar to each original word in the description documents of Mashups and mobile services are identified and saved as the extended words in the prototype system. Word2vec uses the hierarchical softmax algorithm to speed up the training of word vectors on the English Wikipedia corpus, whose time complexity is O(log N). Meanwhile, Word2vec calculates the similarity between words in the English Wikipedia corpus and obtains the similarity matrix, whose time and space complexity are both O(n^2), where n is the total number of words. During description extension, the Top-N words most similar to an original word can be found simply by table look-up in the trained English Wikipedia corpus, with time complexity O(1) and space complexity O(N^2).

(3) In the third part (topic modelling of Mashups and mobile services), the hierarchical Dirichlet process is used to extract the implicit topics of Mashups and mobile services; it clusters/groups service data according to word co-occurrence frequency. It can automatically determine the optimal number of topics, which avoids repeatedly adjusting the number of topics and so saves time. It can also predict the topic distribution of a Mashup or mobile service without retraining on the dataset, which lets the prototype system respond in real time. A topic modelling module can be designed in the prototype system, in which the Flask framework and the Pyecharts visualization tool in Python are used to present the results of topic modelling, and a download function lets users download the transformed topic vectors of Mashups and mobile services.

(4) In the fourth part (recommendation of mobile services for the given Mashup requirement), when a software developer submits a Mashup requirement, the prototype system returns a list of good-quality mobile services for the software developer to build a novel Mashup application. During the recommendation process, factorization machines train and model the important input features. These input features include functional features, i.e., similar Mashups of the target Mashup and similar mobile services of the active mobile service, derived from the topic similarity based on the HDP model, and quality features, i.e., the co-occurrence and popularity of mobile services, obtained from the stored data tables. FMs predict the probability of mobile services being invoked by Mashups and recommend high-quality mobile services for Mashup creation. Similarly, the Flask framework and the Pyecharts visualization tool in Python are used to present the recommendation results.
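The first two parts of the prototype (preprocessing and description extension) can be sketched as follows. This is a dependency-free illustration under stated assumptions rather than the prototype itself: the stop word list, the suffix-stripping rule, and the two-dimensional word vectors are hypothetical stand-ins for the NLTK stop word list, an NLTK stemmer, and Word2vec vectors trained on English Wikipedia, and all function names are made up for this example.

```python
import math
import re
from typing import Dict, List

# Hypothetical stand-ins for the NLTK stop word list and a real stemmer.
STOP_WORDS = {"the", "to", "a", "an", "with", "and", "at"}
SUFFIXES = ("ing", "ed", "s")

# Toy 2-d vectors standing in for Word2vec embeddings (assumed values).
TOY_VECTORS: Dict[str, List[float]] = {
    "map": [1.0, 0.0],
    "location": [0.9, 0.1],
    "route": [0.8, 0.3],
    "photo": [0.0, 1.0],
    "image": [0.1, 0.9],
}


def stem(word: str) -> str:
    """Strip one common ending, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text: str) -> List[str]:
    """Tokenize, drop stop words, and stem a description document."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [stem(t) for t in tokens if t not in STOP_WORDS]


def cosine(u: List[float], v: List[float]) -> float:
    """Cosine similarity of two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def top_n_similar(word: str, vectors: Dict[str, List[float]], n: int) -> List[str]:
    """Return the n vocabulary words most similar to `word`."""
    if word not in vectors:
        return []
    scored = [(cosine(vectors[word], vec), other)
              for other, vec in vectors.items() if other != word]
    scored.sort(reverse=True)
    return [other for _, other in scored[:n]]


def extend_description(tokens: List[str], vectors: Dict[str, List[float]],
                       n: int) -> List[str]:
    """Append the Top-N similar words of every token to the document."""
    extended = list(tokens)
    for token in tokens:
        extended.extend(top_n_similar(token, vectors, n))
    return extended


tokens = preprocess("The service provides maps and photos to users")
extended = extend_description(tokens, TOY_VECTORS, 1)
```

Tokens that are missing from the vector vocabulary are simply left unextended here; how out-of-vocabulary words are handled in the actual prototype is not stated in the text. The extended token lists would then be passed to the HDP topic model described in part (3).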

6. Conclusions and Future Work

This paper proposes a mobile service recommendation method for Mashup development in mobile service computing by combining word-embedding-enhanced HDP and FMs. The experimental results on the ProgrammableWeb dataset show that, compared with existing recommendation methods, the proposed method achieves significant improvements in recommendation accuracy. In future work, we will investigate fine-grained service relationship information and apply it to the proposed model for more accurate recommendation.

Data Availability

The crawled dataset from ProgrammableWeb can be accessed at http491230608080MashupNetwork20datasetjsp

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Acknowledgments

The work was supported by the National Natural Science Foundation of China under grant nos. 61873316, 61872139, 61572187, 61772193, 61702181, and 61572371; the National Key R&D Program of China under grant no. 2017YFB1400602; the Hunan Provincial Natural Science Foundation of China under grant nos. 2017JJ2098, 2017JJ4036, 2018JJ2139, and 2018JJ2136; and the Innovation Platform Open Foundation of Hunan Provincial Education Department of China under grant no. 17K033.

References

[1] S. Deng, L. Huang, H. Wu et al., "Toward mobile service computing: opportunities and challenges," IEEE Cloud Computing, vol. 3, no. 4, pp. 32–41, 2016.

[2] B. Xia, Y. Fan, W. Tan, K. Huang, J. Zhang, and C. Wu, "Category-aware API clustering and distributed recommendation for automatic mashup creation," IEEE Transactions on Services Computing, vol. 8, no. 5, pp. 674–687, 2015.

[3] https://en.wikipedia.org/wiki/Mashup_(web_application_hybrid).

[4] L. Chen, Y. Wang, Q. Yu, Z. Zheng, and J. Wu, "WT-LDA: user tagging augmented LDA for web service clustering," in Proceedings of the International Conference on Service-Oriented Computing (ICSOC), Hangzhou, China, January 2013.

[5] X. Liu and I. Fulia, "Incorporating user, topic, and service-related latent factors into web service recommendation," in Proceedings of the IEEE International Conference on Web Services, pp. 185–192, New York, NY, USA, July 2015.

[6] D. Blei, A. Ng, and M. Jordan, "Latent dirichlet allocation," Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, "Hierarchical dirichlet process," Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, "WE-LDA: a word embeddings augmented LDA model for web services clustering," in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, "A social-aware service recommendation approach for mashup creation," in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, "Service recommendation for mashup composition with implicit correlation regularization," in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June–July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, "Recommender systems with social regularization," in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, "Web service recommendation via exploiting location and QoS information," IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, "Factorization machines," in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, "Factorization machines with libFM," ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, "Sharing clusters among related groups: hierarchical dirichlet processes," Advances in Neural Information Processing System, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, "WSRec: a collaborative filtering based web service recommender system," in Proceedings IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, "Web APIs recommendation for mashup development based on hierarchical dirichlet process and factorization machines," in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, "Reputation measurement and malicious feedback rating prevention in web service recommendation system," IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, "Quality-based recommendations for mashup composition," in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, "Information quality in mashups," IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, "A quality model for mashup components," in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, "An empirical study of programmable web: a network analysis on a service-mashup system," in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, "Manifold-learning based API recommendation for mashup creation," in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, "An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems," IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, "Collaborative web service QoS prediction via neighborhood integrated matrix factorization," IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, "Integrated content and network-based service clustering and web APIs recommendation for mashup development," IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, "Domain-aware mashup service clustering based on LDA topic model from multiple data sources," Information and Software Technology, vol. 90, pp. 40–54, 2017.

Page 11: MobileServiceRecommendationviaCombiningEnhanced ...downloads.hindawi.com/journals/misy/2019/6423805.pdfcorpus Ke3erv3 dataset Gscription document3f3 flashup3n3 mobi3ervice jG%3op3del

Acknowledgments

+e work was supported by the National Natural ScienceFoundation of China under grant nos 61873316 6187213961572187 61772193 61702181 and 61572371 National KeyRampD Program of China under grant no 2017YFB1400602Hunan Provincial Natural Science Foundation of Chinaunder grant nos 2017JJ2098 2017JJ4036 2018JJ2139 and2018JJ2136 and Innovation Platform Open Foundation ofHunan Provincial Education Department of China undergrant no 17K033

References

[1] S Deng L Huang H Wu et al ldquoToward mobile servicecomputing opportunities and challengesrdquo IEEE CloudComputing vol 3 no 4 pp 32ndash41 2016

[2] B Xia Y Fan W Tan K Huang J Zhang and C WuldquoCategory-aware API clustering and distributed recommen-dation for automatic mashup creationrdquo IEEE Transactions onServices Computing vol 8 no 5 pp 674ndash687 2015

[3] httpsenwikipediaorgwikiMashup_(web_application_hybrid)

[4] L Chen Y Wang Q Yu Z Zheng and J Wu ldquoWT-LDAuser tagging augmented LDA for web service clusteringrdquo inProceedings of the International Conference on Service-Oriented Computing (ICSOC) Hangzhou China January2013

[5] X Liu and I Fulia ldquoIncorporating user topic and service-related latent factors into web service recommendationrdquo inProceedings of the IEEE International Conference on WebServices pp 185ndash192 New York NY USA July 2015

[6] D. Blei, A. Ng, and M. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.

[7] Y. W. Teh, M. I. Jordan, M. J. Beal, and D. M. Blei, “Hierarchical Dirichlet process,” Journal of the American Statistical Association, vol. 101, no. 476, pp. 1566–1581, 2004.

[8] M. Shi, J. Liu, D. Zhou, M. Tang, and B. Cao, “WE-LDA: a word embeddings augmented LDA model for web services clustering,” in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 9–16, Honolulu, HI, USA, June 2017.

[9] W. Xu, J. Cao, L. Hu, J. Wang, and M. Li, “A social-aware service recommendation approach for mashup creation,” in Proceedings of the IEEE 20th International Conference on Web Services, pp. 107–114, Santa Clara, CA, USA, 2013.

[10] L. Yao, X. Wang, Q. Sheng, W. Ruan, and W. Zhang, “Service recommendation for mashup composition with implicit correlation regularization,” in Proceedings of the IEEE International Conference on Web Services, pp. 217–224, New York, NY, USA, June–July 2015.

[11] H. Ma, D. Zhou, C. Liu, M. R. Lyu, and I. King, “Recommender systems with social regularization,” in Proceedings of the Fourth ACM International Conference on Web Search and Data Mining, pp. 287–296, ACM, Hong Kong, China, February 2011.

[12] X. Chen, Z. Zheng, Q. Yu, and M. R. Lyu, “Web service recommendation via exploiting location and QoS information,” IEEE Transactions on Parallel and Distributed Systems, vol. 25, no. 7, pp. 1913–1924, 2014.

[13] S. Rendle, “Factorization machines,” in Proceedings of the IEEE International Conference on Data Mining (ICDM), pp. 995–1000, Sydney, Australia, December 2010.

[14] S. Rendle, “Factorization machines with libFM,” ACM Transactions on Intelligent Systems and Technology (TIST), vol. 3, no. 3, pp. 57–78, 2012.

[15] T. Ma, I. Sato, and H. Nakagawa, The Hybrid Nested/Hierarchical Dirichlet Process and Its Application to Topic Modeling with Word Differentiation, Association for the Advancement of Artificial Intelligence (AAAI), Menlo Park, CA, USA, 2015.

[16] Y. Teh, M. Jordan, M. Beal, and D. Blei, “Sharing clusters among related groups: hierarchical Dirichlet processes,” Advances in Neural Information Processing Systems, vol. 37, no. 2, pp. 1385–1392, 2004.

[17] Z. Zheng, H. Ma, M. Lyu, and I. King, “WSRec: a collaborative filtering based web service recommender system,” in Proceedings of the IEEE International Conference on Web Services (ICWS), pp. 437–444, Los Angeles, CA, USA, July 2009.

[18] B. Cao, B. Li, J. Liu, M. Tang, and Y. Liu, “Web APIs recommendation for mashup development based on hierarchical Dirichlet process and factorization machines,” in Proceedings of Collaborate Computing: Networking, Applications and Worksharing, Beijing, China, July 2016.

[19] S. Wang, Z. Zheng, Z. Wu, M. Lyu, and F. Yang, “Reputation measurement and malicious feedback rating prevention in web service recommendation system,” IEEE Transactions on Services Computing, vol. 5, no. 8, pp. 755–767, 2015.

[20] M. Picozzi, M. Rodolfi, C. Cappiello, and M. Matera, “Quality-based recommendations for mashup composition,” in Current Trends in Web Engineering, vol. 6385, pp. 360–371, 2010.

[21] C. Cappiello, F. Daniel, M. Matera, and C. Pautasso, “Information quality in mashups,” IEEE Internet Computing, vol. 14, no. 4, pp. 14–22, 2010.

[22] C. Cappiello, “A quality model for mashup components,” in Web Engineering, Lecture Notes in Computer Science, vol. 5648, pp. 236–250, 2009.

[23] K. Huang, Y. Fan, and W. Tan, “An empirical study of programmable web: a network analysis on a service-mashup system,” in Proceedings of the 2012 IEEE 19th International Conference on Web Services (ICWS), Honolulu, HI, USA, June 2012.

[24] W. Gao, L. Chen, J. Wu, and H. Gao, “Manifold-learning based API recommendation for mashup creation,” in Proceedings of the 2015 IEEE International Conference on Web Services (ICWS), New York, NY, USA, June 2015.

[25] X. Luo, M. Zhou, Y. Xia, and Q. Zhu, “An efficient non-negative matrix-factorization-based approach to collaborative filtering for recommender systems,” IEEE Transactions on Industrial Informatics, vol. 10, no. 2, pp. 1273–1284, 2014.

[26] Z. Zheng, H. Ma, M. R. Lyu, and I. King, “Collaborative web service QoS prediction via neighborhood integrated matrix factorization,” IEEE Transactions on Services Computing, vol. 6, no. 3, pp. 289–299, 2013.

[27] B. Cao, X. Liu, M. D. M. Rahman, B. Li, J. Liu, and M. Tang, “Integrated content and network-based service clustering and web APIs recommendation for mashup development,” IEEE Transactions on Services Computing, p. 1, 2017.

[28] B. Cao, X. Liu, J. Liu, and M. Tang, “Domain-aware mashup service clustering based on LDA topic model from multiple data sources,” Information and Software Technology, vol. 90, pp. 40–54, 2017.

