
Research Article
Exploiting Explicit and Implicit Feedback for Personalized Ranking

Gai Li¹ and Qiang Chen²

¹School of Electronics and Information Engineering, Shunde Polytechnic, Guangdong, Shunde 528333, China
²Department of Computer Science, Guangdong University of Education, Guangdong, Guangzhou 510303, China

Correspondence should be addressed to Gai Li; ligai999@126.com

Received 14 October 2015; Accepted 12 November 2015

Academic Editor: Anna Pandolfi

Copyright © 2016 G. Li and Q. Chen. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data, rather than making full use of the information in the dataset. Until now, nobody has studied personalized ranking algorithms that exploit both explicit and implicit feedback. In order to overcome the defects of prior research, a new personalized ranking algorithm (MERR_SVD++) based on the newest xCLiMF model and the SVD++ algorithm is proposed, which exploits both explicit and implicit feedback simultaneously and optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR). Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data and has wide application prospects in the field of internet information recommendation.

1. Introduction

As e-commerce is growing in popularity, an important challenge is helping customers sort through a large variety of offered products to easily find the ones they will enjoy the most. One of the tools that address this challenge is the recommender system, which has been attracting a lot of attention recently [1–4]. Recommender systems are a subclass of information filtering systems that seek to predict the "rating" or "preference" that users would give to an item [1]. Preferences for items are learned from users' past interactions with the system, such as purchase histories or click logs. The purpose of the system is to recommend items that users might like from a large collection. Recommender systems have been applied to many areas on the Internet, such as the e-commerce system Amazon, the DVD rental system Netflix, and Google News. Recommender systems are usually classified into three categories based on how recommendations are made: content-based recommendations, collaborative filtering (CF), and hybrid approaches. Among these approaches, collaborative filtering is the most widely used and the most successful.

Recently, collaborative filtering algorithms have been widely studied in both academic and industrial fields. The data processed by collaborative filtering algorithms are divided into two categories: explicit feedback data (e.g., ratings, votes) and implicit feedback data (e.g., clicks, purchases). Explicit feedback data are more widely used in the research fields of recommender systems [1, 2, 4–7]. They are often in the form of numeric ratings from users to express their preferences regarding specific items. Implicit feedback data are easier to collect. The research on implicit feedback in CF is also called One-Class Collaborative Filtering (OCCF) [8–15], in which only positive implicit feedback or only positive examples can be observed. The explicit and implicit feedback data can be expressed in matrix form as shown in Figure 1. In the explicit feedback matrix, an element can be any real number, but often ratings are integers in the range (1–5), such as the ratings on Netflix, where a missing element represents a missing example. In the implicit feedback matrix, the positive-only user preference data can be represented as a single-valued matrix.



Figure 1: Examples of an explicit feedback matrix (a) and an implicit feedback matrix (b) for a recommender system.

Collaborative filtering algorithms can also be divided into two categories: collaborative filtering (CF) algorithms based on rating prediction [2, 4, 5, 8, 9, 12, 16, 17] and personalized ranking (PR) algorithms based on ranking prediction [3, 6, 7, 10, 11, 13–15, 18]. In collaborative filtering algorithms based on rating prediction, one predicts the actual rating for an item that a customer has not rated yet and then ranks the items according to the predicted ratings. On the other hand, in personalized ranking algorithms based on ranking prediction, one predicts a preference ordering over the yet-unrated items without going through the intermediate step of rating prediction. Actually, from the recommendation perspective, the order over the items is more important than their ratings. Therefore, in this paper we focus on personalized ranking algorithms based on ranking prediction.

The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. However, in most real-world recommender systems both explicit and implicit user feedback are abundant and could potentially complement each other. It is desirable to be able to unify these two heterogeneous forms of user feedback in order to generate more accurate recommendations. The idea of complementing explicit feedback with implicit feedback was first proposed in [16], where the author considered explicit feedback as how the users rated the movies and implicit feedback as what movies were rated by the users. The two forms of feedback were combined via a factorized neighborhood model (called Singular Value Decomposition++, SVD++), an extension of the traditional nearest item-based model, in which the item-item similarity matrix was approximated via low-rank factorization. In order to make full use of explicit and implicit feedback, Liu et al. [17] proposed the Co-Rating model, which developed matrix factorization models that could be trained from explicit and implicit feedback simultaneously. Both SVD++ and Co-Rating are based on rating prediction. Until now, nobody has studied personalized ranking algorithms that exploit both explicit and implicit feedback.

In order to overcome the defects of prior research, this paper proposes a new personalized ranking algorithm (MERR_SVD++), which exploits both explicit and implicit feedback and optimizes Expected Reciprocal Rank (ERR), based on the newest Extended Collaborative Less-Is-More Filtering (xCLiMF) model [18] and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data and has wide application prospects in the field of internet information recommendation.

The rest of this paper is organized as follows. Section 2 introduces previous related work. Section 3 demonstrates the problem formalization and the SVD++ algorithm. A new personalized ranking algorithm (MERR_SVD++) is proposed in Section 4. The experimental results and discussion are presented in Section 5, followed by the conclusion and future work in Section 6.

2. Related Work

2.1. Rating Prediction. In conventional CF tasks, the most frequently used evaluation metrics are the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). Therefore, rating prediction (as in the Netflix Prize) has been the most popular method for solving the CF problem. Rating prediction methods are always regression based: they minimize the error between predicted ratings and true ratings. The simplest algorithm for rating prediction is K-Nearest-Neighbor (KNN) [19], which predicts the missing ratings from the neighborhood of users or items. KNN is a memory-based algorithm, and one needs to compute all the similarities between different users or items. More efficient algorithms are model based: they build a model from the visible ratings and compute all the missing ratings from the model. Widely used model-based rating prediction methods include PLSA [20], the Restricted Boltzmann Machine (RBM) [21], and a series of matrix factorization techniques [22–25].


2.2. Learning to Rank. LTR is the core technology for information retrieval. When a query is input into a search engine, LTR is responsible for ranking all the documents or Web pages according to their relevance to this query or other objectives. Many LTR algorithms have been proposed recently, and they can be classified into three categories: pointwise, listwise, and pairwise [3, 26].

In the pointwise approach, it is assumed that each query-document pair in the training data has a numerical or ordinal score. Then the LTR problem can be approximated by a regression problem: given a single query-document pair, its score is predicted.

As the name suggests, the listwise approach takes the entire set of documents associated with a query in the training data as the input to construct a model and predict their scores.

The pairwise approach does not focus on accurately predicting the degree of relevance of each document; instead, it mainly cares about the relative order of two documents. In this sense, it is closer to the concept of "ranking".

2.3. Personalized Ranking Algorithms for Collaborative Filtering. Algorithms for personalized ranking can also be divided into two categories: personalized ranking with implicit feedback (PRIF) [6, 7, 10, 11, 13–15] and personalized ranking with explicit feedback (PREF) [18, 27–29]. The foremost PRIF method is Bayesian Personalized Ranking (BPR) [11], which converts the OCCF problem into a ranking problem. Pan et al. [13] proposed Adaptive Bayesian Personalized Ranking (ABPR), which generalized the BPR algorithm for heterogeneous implicit feedback and learned the confidence adaptively. Pan and Chen [15] proposed Group Bayesian Personalized Ranking (GBPR) via introducing richer interactions among users; GBPR introduces group preference to relax the individual and independence assumptions of BPR. Jahrer and Töscher [6] proposed SVD and AFM, which used a ranking-based objective function constructed from a matrix decomposition model and a stochastic gradient descent optimizer. Takács and Tikk [7] proposed RankALS, which presented a computationally effective approach for the direct minimization of a ranking objective function without sampling. Gai [10] proposed a new PRIF algorithm, Pairwise Probabilistic Matrix Factorization (PPMF), to further improve the performance of previous PRIF algorithms. Recently, Shi et al. [14] proposed Collaborative Less-is-More Filtering (CLiMF), in which the model parameters are learned by directly maximizing the well-known information retrieval metric Mean Reciprocal Rank (MRR). However, CLiMF is not suitable for other evaluation metrics (e.g., MAP, AUC, and NDCG). References [6, 7, 10, 11, 13–15] can improve the performance of OCCF by alleviating the data sparsity and imbalance problems of PRIF to a certain extent. As for PREF, [27] was adapted from PLSA and [28] employed the KNN technique of CF, both of which utilized the pairwise ranking method. Reference [29] utilized the listwise method to build its ranker, which was a variation of Maximum Margin Matrix Factorization [22]. Shi et al. [18] proposed the Extended Collaborative Less-Is-More Filtering (xCLiMF) model, which can be seen as a generalization of the CLiMF method. The key idea of the xCLiMF algorithm is that it builds a recommendation model by optimizing Expected Reciprocal Rank, an evaluation metric that generalizes Reciprocal Rank (RR) in order to incorporate users' explicit feedback. References [18, 27–29] can also improve the performance of PREF by alleviating the data sparsity and imbalance problems of PREF to a certain extent.

3. Problem Formalization and SVD++

3.1. Problem Formalization. In this paper, we use capital letters to denote a matrix (such as $X$). Given a matrix $X$, $X_{ij}$ represents its element, $X_i$ indicates the $i$th row of $X$, $X_j$ symbolizes the $j$th column of $X$, and $X^T$ stands for the transpose of $X$.

3.2. SVD++. Given a matrix $X = (X_{ij})_{m \times n}$, the total number of users is $m$ and the total number of items is $n$. If $X$ is an explicit feedback matrix, then $X_{ij} \in \{0, 1, 2, 3, 4, 5\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low-rank matrix $\hat{X} = (\hat{X}_{ij})_{m \times n}$, where $\hat{X} = U^T V$, $U \in C^{d \times m}$, and $V \in C^{d \times n}$; $U$ and $V$ denote the explicit feature matrices of rank $d$ for users and items, respectively, with $d \ll k$, where $k$ denotes the rank of $X$ and $k \le \min(m, n)$.

If $X$ is an implicit feedback matrix, then $X_{ij} \in \{0, 1\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low-rank matrix $\hat{X} = (\hat{X}_{ij})_{m \times n}$, where $\hat{X} = H^T W$, $H \in C^{d \times m}$, and $W \in C^{d \times n}$; $H$ and $W$ denote the implicit feature matrices of rank $d$ for users and items, respectively.

SVD++ is a collaborative filtering algorithm unifying explicit and implicit feedback, based on rating prediction and matrix factorization [16]. In SVD++, the matrix $V$ denotes the explicit and implicit feature matrix of items simultaneously. The feature matrix of users can be defined as

$$U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j, \tag{1}$$

where $U_u$ is used to characterize the user's explicit feedback, $|N(u)|^{-1/2} \sum_{j \in N(u)} W_j$ is used to characterize the user's implicit feedback, $N(u)$ denotes the set of all items on which user $u$ gave implicit feedback, and $W_j$ denotes the implicit feature vector of item $j$.

So the prediction formula for $\hat{X}_{ui}$ in SVD++ can be defined as

$$\hat{X}_{ui} = \left( U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j \right)^T V_i. \tag{2}$$
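To make the notation concrete, the following is a minimal NumPy sketch of the prediction rule (2). The sizes, the random factor matrices, and the chosen user/item indices are toy assumptions for illustration, not trained values from the paper.

import numpy as np

d, m, n = 10, 5, 8                 # toy sizes: factors, users, items
rng = np.random.default_rng(0)
U = rng.normal(0, 0.01, (d, m))    # explicit user factors
V = rng.normal(0, 0.01, (d, n))    # item factors
W = rng.normal(0, 0.01, (d, n))    # implicit item factors

def predict(u, i, N_u):
    """X_hat[u, i] per (2): (U_u + |N(u)|^(-1/2) sum_{j in N(u)} W_j)^T V_i."""
    implicit = W[:, N_u].sum(axis=1) / np.sqrt(len(N_u))
    return (U[:, u] + implicit) @ V[:, i]

print(predict(0, 3, [1, 2, 5]))    # hypothetical user 0, item 3, N(0) = {1, 2, 5}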

4. Exploiting Explicit and Implicit Feedback for Personalized Ranking

In this section, we will first introduce our MERR_SVD++ model, then give the learning algorithm of this model, and finally analyze its computational complexity.

4.1. Exploiting Explicit and Implicit Feedback for Personalized Ranking. In practical applications, the user scans the results list from top to bottom and stops when a result is found that fully satisfies the user's information need. The usefulness of an item at rank $i$ is dependent on the usefulness of the items at ranks less than $i$. Reciprocal Rank (RR) is an important evaluation metric in the research field of information retrieval [14], and it strongly emphasizes the relevance of results returned at the top of the list. The ERR measure is a generalized version of RR designed to be used with multiple relevance levels (e.g., ratings). ERR has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list.

Using the definition of ERR in [18, 30], we can describe ERR for a ranked item list of user $u$ as follows:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{P_{ui}}{R_{ui}}, \tag{3}$$

where $R_{ui}$ denotes the rank position of item $i$ for user $u$ when all items are ranked in descending order of the predicted relevance values, and $P_{ui}$ denotes the probability that user $u$ stops at position $i$. As in [30], $P_{ui}$ is defined as follows:

$$P_{ui} = r_{ui} \prod_{j=1}^{n} \left( 1 - r_{uj}\, I(R_{uj} < R_{ui}) \right), \tag{4}$$

where $I(x)$ is an indicator function equal to 1 if the condition is true and 0 otherwise, and $r_{ui}$ denotes the probability that user $u$ finds item $i$ relevant. Substituting (4) into (3), we obtain the calculation formula of $\mathrm{ERR}_u$:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \left( 1 - r_{uj}\, I(R_{uj} < R_{ui}) \right). \tag{5}$$

We use a mapping function similar to the one used in [18] to convert ratings (or levels of relevance in general) to probabilities of relevance as follows:

$$r_{ui} = \begin{cases} \dfrac{2^{X_{ui}} - 1}{2^{X_{\max}}}, & I_{ui} > 0, \\[2mm] 0, & I_{ui} = 0, \end{cases} \tag{6}$$

where $I_{ui}$ is an indicator function. Note that $I_{ui} > 0$ ($I_{ui} = 0$) indicates that user $u$'s preference for item $i$ is known (unknown), $X_{ui}$ denotes the rating given by user $u$ to item $i$, and $X_{\max}$ is the highest rating.
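As an illustration of (5) and (6), the sketch below computes ERR for a single user's ranked list under the cascade model. The example ratings and the maximum rating $X_{\max} = 5$ are toy assumptions; unknown items are given rating 0 so that $r_{ui} = 0$.

import numpy as np

def err_at(ratings_in_rank_order, x_max=5):
    """ERR (5) for one user's list, already sorted by predicted score."""
    r = (2.0 ** np.asarray(ratings_in_rank_order, dtype=float) - 1) / 2 ** x_max  # eq. (6)
    err, p_continue = 0.0, 1.0
    for rank, r_i in enumerate(r, start=1):
        err += p_continue * r_i / rank   # stop probability P_ui divided by rank
        p_continue *= 1.0 - r_i          # cascade: user keeps scanning
    return err

print(err_at([5, 3, 0, 1]))   # toy list: top item rated 5, then 3, unknown, 1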

In this paper, we define $R(u)$ as the set of all items on which user $u$ gave explicit feedback. Thus $N(u) = R(u)$ in a dataset where the users only gave explicit feedback, and the implicit feedback dataset is created by setting all the rating data in the explicit feedback dataset to 1. A toy example can be seen in Figure 2. If the dataset contains both explicit feedback data and implicit feedback data, $N(u) = R(u) + M(u)$, with $M(u)$ denoting the set of all items on which user $u$ only gave implicit feedback. A toy example can be seen in Figure 3. The influence of implicit feedback on the performance of MERR_SVD++ can be found in Section 5.4.2. If we use the traditional SVD++ algorithm to unify explicit and implicit feedback data, the prediction formula for $\hat{X}_{ui}$ in SVD++ can be defined as

$$\hat{X}_{ui} = \left( U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j \right)^T V_i. \tag{7}$$

So far, through the introduction of SVD++, we can exploit both explicit and implicit feedback simultaneously while optimizing the evaluation metric ERR. We therefore call our model MERR_SVD++.

Note that the value of the rank $R_{ui}$ depends on the value of $\hat{X}_{ui}$. For example, if the predicted relevance value $\hat{X}_{ui}$ of item $i$ is the second highest among all the items for user $u$, then we will have $R_{ui} = 2$.

Similar to other ranking measures, such as RR, ERR is nonsmooth with respect to the latent factors of users and items, that is, $U$, $V$, and $W$. It is thus impossible to optimize ERR directly using conventional optimization techniques. We therefore employ smoothing techniques that were also used in CLiMF [14] and xCLiMF [18] to attain a smoothed version of ERR. In particular, we approximate the rank-based terms $1/R_{ui}$ and $I(R_{uj} < R_{ui})$ in (5) by smooth functions of the model parameters $U$, $V$, and $W$. The approximations are as follows:

$$I(R_{uj} < R_{ui}) \approx g(\hat{X}_{uj} - \hat{X}_{ui}), \qquad \frac{1}{R_{ui}} \approx g(\hat{X}_{ui}), \tag{8}$$

where $g(x)$ is the logistic function, that is, $g(x) = 1/(1 + e^{-x})$.

Substituting (8) into (5), we obtain a smoothed approximation of $\mathrm{ERR}_u$:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} r_{ui}\, g(\hat{X}_{ui}) \prod_{j=1}^{n} \left( 1 - r_{uj}\, g(\hat{X}_{u(j-i)}) \right). \tag{9}$$

Note that, for notational convenience, we make use of the substitution $\hat{X}_{u(j-i)} = \hat{X}_{uj} - \hat{X}_{ui}$.
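A small numerical sketch of the smoothing in (8): the logistic function of score differences approximates the rank indicator $I(R_{uj} < R_{ui})$, and $g(\hat{X}_{ui})$ stands in for $1/R_{ui}$. The scores below are toy values.

import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))

x_hat = np.array([2.1, 0.4, -1.3])   # toy predicted scores, best first
for i in range(3):
    for j in range(3):
        exact = float(x_hat[j] > x_hat[i])                    # I(R_uj < R_ui)
        print(i, j, exact, round(float(g(x_hat[j] - x_hat[i])), 3))
print(np.round(g(x_hat), 3))          # g(x_hat_ui) as a smooth stand-in for 1/R_ui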

Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize $\ln((1/|N(u)|)\mathrm{ERR}_u)$. Specifically, we have

$$\begin{aligned} \{U_u, V, W\} &= \arg\max_{U_u, V, W} \mathrm{ERR}_u = \arg\max_{U_u, V, W} \ln\left( \frac{1}{|N(u)|}\mathrm{ERR}_u \right) \\ &= \arg\max_{U_u, V, W} \ln\left( \sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \left( 1 - r_{uj}\, g(\hat{X}_{u(j-i)}) \right) \right). \end{aligned} \tag{10}$$

Figure 2: A toy example of a dataset in which the users only gave explicit feedback: (a) denotes the explicit feedback dataset and (b) denotes the implicit feedback dataset.

Figure 3: A toy example of a dataset that contains both explicit feedback data and implicit feedback data: (a) denotes the explicit feedback dataset and (b) denotes the implicit feedback dataset; the numbers in bold denote the dataset for which users only gave implicit feedback.

Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of $\ln((1/|N(u)|)\mathrm{ERR}_u)$ as follows:

$$\begin{aligned} \ln\left( \frac{1}{|N(u)|}\mathrm{ERR}_u \right) &= \ln\left[ \sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \left( 1 - r_{uj}\, g(\hat{X}_{u(j-i)}) \right) \right] \\ &\ge \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \ln\left[ g(\hat{X}_{ui}) \prod_{j=1}^{n} \left( 1 - r_{uj}\, g(\hat{X}_{u(j-i)}) \right) \right] \\ &= \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \left[ \ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\left( 1 - r_{uj}\, g(\hat{X}_{u(j-i)}) \right) \right]. \end{aligned} \tag{11}$$

We can neglect the constant $1/|N(u)|$ in the lower bound and obtain a new objective function:

$$L(U_u, V, W) = \sum_{i=1}^{n} r_{ui} \left[ \ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\left( 1 - r_{uj}\, g(\hat{X}_{u(j-i)}) \right) \right]. \tag{12}$$
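The per-user objective (12) is straightforward to evaluate once the relevance probabilities from (6) and the predicted scores are available. The following is a sketch with toy inputs (the two ratings correspond to 5 and 3 under (6) with $X_{\max} = 5$; the third item is unknown).

import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))   # logistic function from (8)

def lower_bound(r, x_hat):
    """Per-user objective (12) for relevance probabilities r and scores x_hat."""
    diff = x_hat[None, :] - x_hat[:, None]          # diff[i, j] = x_hat_j - x_hat_i
    inner = np.log(1.0 - r[None, :] * g(diff)).sum(axis=1)
    return float(np.sum(r * (np.log(g(x_hat)) + inner)))

r = np.array([0.96875, 0.21875, 0.0])   # ratings 5, 3, unknown mapped by (6)
x_hat = np.array([1.2, 0.3, -0.5])      # toy predicted relevance scores
print(lower_bound(r, x_hat))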

Taking into account all $m$ users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:

$$L(U, V, W) = \sum_{u=1}^{m} \sum_{i=1}^{n} r_{ui} \left[ \ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\left( 1 - r_{uj}\, g(\hat{X}_{u(j-i)}) \right) \right] - \frac{\lambda}{2} \left( \|U\|^2 + \|V\|^2 + \|W\|^2 \right), \tag{13}$$


in which $\lambda$ denotes the regularization coefficient and $\|U\|$ denotes the Frobenius norm of $U$. Note that the lower bound $L(U, V, W)$ is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters $U$, $V$, and $W$.

4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: $\arg\max_{U, V, W} L(U, V, W)$. Note that $L(U, V, W)$ represents an approximation of the mean value of ERR across all the users. We can thus remove the constant coefficient $1/m$, since it has no influence on the optimization of $L(U, V, W)$. Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:

$$\frac{\partial L}{\partial U_u} = \sum_{i=1}^{n} r_{ui} \left[ g(-\hat{X}_{ui}) V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\, (V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})} \right] - \lambda U_u, \tag{14}$$

$$\frac{\partial L}{\partial V_i} = r_{ui} \left[ g(-\hat{X}_{ui}) + \sum_{j=1}^{n} r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui}) \left( \frac{1}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})} - \frac{1}{1 - r_{ui}\, g(\hat{X}_{ui} - \hat{X}_{uj})} \right) \right] \left( U_u + |N(u)|^{-1/2} \sum_{k \in N(u)} W_k \right) - \lambda V_i, \tag{15}$$

$$\frac{\partial L}{\partial W_i} = r_{ui} \left[ g(-\hat{X}_{ui}) V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\, (V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})} \right] - \lambda W_i. \tag{16}$$

The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research in [8–11, 13, 18] shows that using the normal distribution $N(0, 0.01)$ for the initialization of the feature matrices is very effective, so we still use this approach for the initialization of $U$, $V$, and $W$ in our proposed algorithm.

Algorithm 1 (MERR_SVD++):

Input: training set $X$, learning rate $\alpha$, regularization coefficient $\lambda$, number of features $d$, and maximal number of iterations itermax.
Output: feature matrices $U$, $V$, $W$.
for $u = 1, 2, \ldots, m$ do
    index the relevant items for user $u$: $N(u) = \{ j \mid X_{uj} > 0,\ 0 \le j \le n \}$
end
initialize $U^{(0)}$, $V^{(0)}$, and $W^{(0)}$ with random values, and set $t = 0$
repeat
    for $u = 1, 2, \ldots, m$ do
        update $U_u$: $U_u^{(t+1)} = U_u^{(t)} + \alpha\, \partial L / \partial U_u^{(t)}$
        for $i \in N(u)$ do
            update $V_i$, $W_i$:
            $V_i^{(t+1)} = V_i^{(t)} + \alpha\, \partial L / \partial V_i^{(t)}$
            $W_i^{(t+1)} = W_i^{(t)} + \alpha\, \partial L / \partial W_i^{(t)}$
        end
    end
    $t = t + 1$
until $t \ge$ itermax
$U = U^{(t)}$, $V = V^{(t)}$, $W = W^{(t)}$

4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in $X$, the computational complexity of the gradient in (14) is $O(d\bar{n}^2 m + d\bar{n}m)$. Note that $\bar{n}$ denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also $O(d\bar{n}^2 m + d\bar{n}m)$, and the computational complexity of the gradient in (15) is $O(d\bar{n}^2 m + d\bar{n}m)$. Hence, the complexity of the learning algorithm in one iteration is of the order $O(d\bar{n}^2 m)$. In the case that $\bar{n}$ is a small number, that is, $\bar{n}^2 \ll m$, the complexity is linear in the number of users in the data collection. Note that we have $s = \bar{n}m$, in which $s$ denotes the number of nonzeros in the user-item matrix. The complexity of the learning algorithm is then $O(d\bar{n}s)$. Since we usually have $\bar{n} \ll s$, the complexity remains close to $O(ds)$ even in the case that $\bar{n}$ is large; that is, it is linear in the number of nonzeros (i.e., relevant observations in the data). In sum, our analysis shows that MERR_SVD++ is suitable for large-scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
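For concreteness, here is a loop-based NumPy sketch of Algorithm 1 with the per-user gradients (14)–(16) transcribed directly. It is a sketch under stated assumptions, not the authors' MATLAB implementation; the hyperparameter defaults and the toy usage at the end are illustrative. Sums run only over $N(u)$, which is equivalent because $r_{uj} = 0$ kills the terms for unknown items.

import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))

def g_prime(x):
    y = g(x)
    return y * (1.0 - y)

def train_merr_svdpp(X, d=10, alpha=0.01, lam=0.01, itermax=50, x_max=5):
    m, n = X.shape
    rng = np.random.default_rng(0)
    U = rng.normal(0, 0.01, (d, m))    # N(0, 0.01) initialization, as in the paper
    V = rng.normal(0, 0.01, (d, n))
    W = rng.normal(0, 0.01, (d, n))
    N = [np.flatnonzero(X[u] > 0) for u in range(m)]   # relevant items per user
    for _ in range(itermax):
        for u in range(m):
            idx = N[u]
            if idx.size == 0:
                continue
            r = (2.0 ** X[u, idx].astype(float) - 1) / 2 ** x_max        # eq. (6)
            comp = U[:, u] + W[:, idx].sum(axis=1) / np.sqrt(idx.size)   # eq. (1)
            s = comp @ V[:, idx]                                         # scores, eq. (7)
            # eq. (14): gradient of L w.r.t. the user factor U_u
            grad_u = -lam * U[:, u]
            for a, i in enumerate(idx):
                pair = sum(r[b] * g_prime(s[b] - s[a]) * (V[:, i] - V[:, j])
                           / (1.0 - r[b] * g(s[b] - s[a]))
                           for b, j in enumerate(idx))
                grad_u += r[a] * (g(-s[a]) * V[:, i] + pair)
            U[:, u] += alpha * grad_u
            for a, i in enumerate(idx):
                # eq. (15): gradient w.r.t. the item factor V_i
                scal = sum(r[b] * g_prime(s[b] - s[a])
                           * (1.0 / (1.0 - r[b] * g(s[b] - s[a]))
                              - 1.0 / (1.0 - r[a] * g(s[a] - s[b])))
                           for b in range(idx.size))
                V[:, i] += alpha * (r[a] * (g(-s[a]) + scal) * comp - lam * V[:, i])
                # eq. (16): gradient w.r.t. the implicit item factor W_i
                vec = sum(r[b] * g_prime(s[b] - s[a]) * (V[:, i] - V[:, j])
                          / (1.0 - r[b] * g(s[b] - s[a]))
                          for b, j in enumerate(idx))
                W[:, i] += alpha * (r[a] * (g(-s[a]) * V[:, i] + vec) - lam * W[:, i])
    return U, V, W

# Toy usage (hypothetical data): rows are users, columns items, 0 = unknown.
# X = np.array([[5, 3, 0, 0], [0, 4, 2, 0], [1, 0, 0, 5]])
# U, V, W = train_merr_svdpp(X, d=4, itermax=10)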

5. Experiment

5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1–5 scale) from ca. 6K users on 3.7K movies. The sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1–5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies, in which each user has rated more than 100 different movies.


5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. And our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). So we use NDCG and ERR as evaluation metrics for the predictability of models in this paper.

NDCG is one of the most widely used measures for ranking problems. To define $\mathrm{NDCG}_u$ for a user $u$, one first needs to define $\mathrm{DCG}_u$:

$$\mathrm{DCG}_u = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i + 1)}, \tag{17}$$

where $\mathrm{pref}(i)$ is a binary indicator returning 1 if the $i$th item is preferred and 0 otherwise. $\mathrm{DCG}_u$ is then normalized by the ideal ranked list into the interval $[0, 1]$:

$$\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u}, \tag{18}$$

where $\mathrm{IDCG}_u$ denotes the DCG of the ranked list sorted exactly according to the user's tastes: positive items are placed at the head of the ranked list. The NDCG of all the users is the mean score over users.

ERR is a generalized version of Reciprocal Rank (RR) designed to be used with multiple relevance levels (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user $u$ as follows:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \left( 1 - r_{uj}\, I(R_{uj} < R_{ui}) \right). \tag{19}$$

Similar to NDCG, the ERR of all the users is the mean score over users.

Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
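As a reference point, the following sketch computes NDCG@5 per (17) and (18) for a single user. Two simplifying assumptions are made: a base-2 logarithm for the $\log(i+1)$ discount, and an ideal list formed by sorting the same truncated top-5 indicators.

import numpy as np

def ndcg_at_5(pref):
    """NDCG@5 for binary preference indicators, given in recommended order."""
    pref = np.asarray(pref, dtype=float)[:5]
    discounts = np.log2(np.arange(2, pref.size + 2))   # log(i + 1) for i = 1..5
    dcg = ((2 ** pref - 1) / discounts).sum()          # eq. (17), truncated at 5
    idcg = ((2 ** np.sort(pref)[::-1] - 1) / discounts).sum()
    return dcg / idcg if idcg > 0 else 0.0             # eq. (18)

print(ndcg_at_5([1, 0, 1, 0, 0]))   # toy list: preferred items at ranks 1 and 3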

5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. Generated recommendation lists for each user are compared to the ground truth in the test set in order to measure the performance.

All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter $\lambda$ was selected from the range $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10\}$, and the optimal parameter value was used. The learning rate $\alpha$ was selected from the set $A$, $A = \{\alpha \mid \alpha = 0.0001 \times 2^c,\ \alpha \le 0.5,\ c > 0 \text{ and } c \text{ is a constant}\}$, and the optimal parameter value was also used. In order to compare their performances fairly, for all matrix factorization models we set the number of features to 10. The optimal values of all parameters for all the baseline models used are determined individually. More detailed parameter-setting methods for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
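For clarity, the two hyperparameter grids described above can be written out as follows (a sketch; integer exponents $c$ are an assumption, and the actual selection in the paper was run in MATLAB):

# Grids for lambda and alpha as described in the setup above.
lambdas = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10]
alphas = [0.0001 * 2 ** c for c in range(1, 13) if 0.0001 * 2 ** c <= 0.5]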

5.4. Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions:

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large-scale use cases?

5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below:

(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold. On the ML1m dataset and the extracted Netflix dataset, we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.

Figure 4: The performance comparison of MERR_SVD++ and baselines. Panels show NDCG@5 and ERR@5 on the ML1m dataset ((a), (b)) and the extracted Netflix dataset ((c), (d)) under the different "Given" conditions; the compared methods are CLiMF, CofiRank, xCLiMF, SVD++, Co-Rating, and MERR_SVD++.

The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR would not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.

Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, because the SVD++ model only attempts to approximate the observed ratings and does not model preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback could indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of both ERR and NDCG across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data would inevitably make recommendation models based on binary relevance data, such as CLiMF, suboptimal for use cases with explicit feedback data.

Hence, we give a positive answer to our first research question.

Figure 5: The influence of implicit feedback on the performance of MERR_SVD++ (ERR@5 and NDCG@5 versus the number of implicit feedback items added per user).

5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ can be found in Figure 5. Here $N(u) = R(u) + M(u)$, where $M(u)$ denotes the set of all items on which user $u$ only gave implicit feedback. Rows denote the increased numbers of implicit feedback items for each user, and columns denote the quality of ERR and NDCG. In our experiment, the increased numbers of implicit feedback items are the same for each user. We use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the ERR and NDCG of MERR_SVD++ synchronously and linearly improve as implicit feedback is added for each user, which confirms our belief that implicit feedback could indeed complement explicit feedback.

With this experimental result, we give a positive answer to our second research question.

5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for the training set at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set and their rated items as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under each condition increases almost linearly with the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.

Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set (runtime per iteration in seconds versus the percentage of users used for training, under the "Given 5", "Given 10", and "Given 15" conditions).

The observations from this experiment allow us to answer our last research question positively.

6. Conclusion and Future Work

The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. Until now, nobody has studied personalized ranking algorithms that exploit both explicit and implicit feedback. In order to overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the field of internet information recommendation. And because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.

For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep problem and the cold start problem of personalized recommendation. Also, we would like to explore more useful information from the explicit feedback and implicit feedback simultaneously.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).

References

[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.

[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.

[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.

[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.

[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.

[6] M. Jahrer and A. Töscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.

[7] G. Takács and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.

[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.

[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.

[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.

[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.

[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.

[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.

[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.

[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.

[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.

[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.

[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.

[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.

[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.

[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.

[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.

[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.

[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.

[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.

[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.

[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.

[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.

[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.

[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.

[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text REtrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.


Page 2: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

2 Mathematical Problems in Engineering

Item

Use

r

1 3

3 2

4 4

4 3 1

5 5

2 5

(a)

Item

Use

r

1 1

1

1 1

1 1

1 1

1 1

(b)

Figure 1 Examples of an explicit feedback matrix (a) and an implicit feedback matrix (b) for a recommender system

on rating prediction [2 4 5 8 9 12 16 17] and personalizedranking (PR) algorithms based on ranking prediction [36 7 10 11 13ndash15 18] In collaborative filtering algorithmsbased on rating prediction one predicts the actual rating foran item that a customer has not rated yet and then ranksthe items according to the predicted ratings On the otherhand for personalized ranking algorithms based on rankingprediction one predicts a preference ordering over the yetunrated items without going through the intermeditate stepof rating prediction Actually from the recommendationperspective the order over the items is more important thantheir ratingTherefore in this paper we focus on personalizedranking algorithms based on ranking prediction

The problem of the previous researches on personalized ranking is that they focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. However, in most real-world recommender systems both explicit and implicit user feedback are abundant and could potentially complement each other. It is desirable to be able to unify these two heterogeneous forms of user feedback in order to generate more accurate recommendations. The idea of complementing explicit feedback with implicit feedback was first proposed in [16], where the author considered explicit feedback as how the users rated the movies and implicit feedback as what movies were rated by the users. The two forms of feedback were combined via a factorized neighborhood model (called Singular Value Decomposition++, SVD++), an extension of the traditional nearest item-based model in which the item-item similarity matrix was approximated via low-rank factorization. In order to make full use of explicit and implicit feedback, Liu et al. [17] proposed the Co-Rating model, which developed matrix factorization models that could be trained from explicit and implicit feedback simultaneously. Both SVD++ and Co-Rating are based on rating prediction. Until now, nobody has studied personalized ranking algorithms by exploiting both explicit and implicit feedback.

In order to overcome the defects of prior researches, this paper proposes a new personalized ranking algorithm (MERR_SVD++) which exploits both explicit and implicit feedback and optimizes Expected Reciprocal Rank (ERR), based on the newest Extended Collaborative Less-Is-More Filtering (xCLiMF) model [18] and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ showed a linear correlation with the number of ratings. Because of its high precision and good expansibility, MERR_SVD++ is suitable for processing big data and has wide application prospects in the field of Internet information recommendation.

The rest of this paper is organized as follows. Section 2 introduces previous related work. Section 3 demonstrates the problem formalization and the SVD++ algorithm. A new personalized ranking algorithm (MERR_SVD++) is proposed in Section 4. The experimental results and discussion are presented in Section 5, followed by the conclusion and future work in Section 6.

2. Related Work

2.1. Rating Prediction. In conventional CF tasks, the most frequently used evaluation metrics are the Mean Absolute Error (MAE) and the Root Mean Square Error (RMSE). Therefore, rating prediction (such as in the Netflix Prize) has been the most popular method for solving the CF problem. Rating prediction methods are always regression based: they minimize the error between predicted ratings and true ratings. The simplest algorithm for rating prediction is K-Nearest-Neighbor (KNN) [19], which predicts the missing ratings from the neighborhood of users or items. KNN is a memory-based algorithm, and one needs to compute all the similarities between different users or items. More efficient algorithms are model based: they build a model from the visible ratings and compute all the missing ratings from the model. Widely used model-based rating prediction methods include PLSA [20], the Restricted Boltzmann Machine (RBM) [21], and a series of matrix factorization techniques [22–25].


2.2. Learning to Rank. LTR is the core technology for information retrieval. When a query is input into a search engine, LTR is responsible for ranking all the documents or Web pages according to their relevance to this query or other objectives. Many LTR algorithms have been proposed recently, and they can be classified into three categories: pointwise, listwise, and pairwise [3, 26].

In the pointwise approach, it is assumed that each query-document pair in the training data has a numerical or ordinal score. Then the LTR problem can be approximated by a regression problem: given a single query-document pair, its score is predicted.

As the name suggests, the listwise approach takes the entire set of documents associated with a query in the training data as the input to construct a model and predict their scores.

The pairwise approach does not focus on accurately predicting the degree of relevance of each document; instead, it mainly cares about the relative order of two documents. In this sense, it is closer to the concept of "ranking".

2.3. Personalized Ranking Algorithms for Collaborative Filtering. The algorithms for personalized ranking can be divided into two categories: personalized ranking with implicit feedback (PRIF) [6, 7, 10, 11, 13–15] and personalized ranking with explicit feedback (PREF) [18, 27–29]. The foremost of PRIF is Bayesian Personalized Ranking (BPR) [11], which converts the OCCF problem into a ranking problem. Pan et al. [13] proposed Adaptive Bayesian Personalized Ranking (ABPR), which generalized the BPR algorithm for heterogeneous implicit feedback and learned the confidence adaptively. Pan et al. [15] proposed Group Bayesian Personalized Ranking (GBPR) via introducing richer interactions among users; GBPR introduces group preference to relax the individual and independence assumptions of BPR. Jahrer and Toscher [6] proposed SVD and AFM, which used a ranking-based objective function constructed by a matrix decomposition model and a stochastic gradient descent optimizer. Takacs and Tikk [7] proposed RankALS, which presented a computationally effective approach for the direct minimization of a ranking objective function without sampling. Gai [10] proposed a new PRIF algorithm, Pairwise Probabilistic Matrix Factorization (PPMF), to further improve the performance of previous PRIF algorithms. Recently, Shi et al. [14] proposed Collaborative Less-is-More Filtering (CLiMF), in which the model parameters are learned by directly maximizing the well-known information retrieval metric Mean Reciprocal Rank (MRR). However, CLiMF is not suitable for other evaluation metrics (e.g., MAP, AUC, and NDCG). References [6, 7, 10, 11, 13–15] can improve the performance of OCCF by solving the data sparsity and imbalance problems of PRIF to a certain extent. As for PREF, [27] was adapted from PLSA and [28] employed the KNN technique of CF, both of which utilized the pairwise ranking method. Reference [29] utilized the listwise method to build its ranker, which was a variation of Maximum Margin Matrix Factorization [22]. Shi et al. [18] proposed the Extended Collaborative Less-Is-More Filtering (xCLiMF) model, which can be seen as a generalization of the CLiMF method. The key idea of the xCLiMF algorithm is that it builds a recommendation model by optimizing Expected Reciprocal Rank, an evaluation metric that generalizes Reciprocal Rank (RR) in order to incorporate users' explicit feedback. References [18, 27–29] can also improve the performance of PREF by solving the data sparsity and imbalance problems of PREF to a certain extent.

3. Problem Formalization and SVD++

3.1. Problem Formalization. In this paper, we use capital letters to denote a matrix (such as $X$). Given a matrix $X$, $X_{ij}$ represents its element, $X_i$ indicates the $i$th row of $X$, $X_j$ symbolizes the $j$th column of $X$, and $X^{T}$ stands for the transpose of $X$.

3.2. SVD++. Given a matrix $X = (X_{ij})_{m \times n}$, where the total number of users is $m$ and the total number of items is $n$: if $X$ is an explicit feedback matrix, then $X_{ij} \in \{0, 1, 2, 3, 4, 5\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low-rank matrix $\hat{X} = (\hat{X}_{ij})_{m \times n}$, where $\hat{X} = U^{T}V$, $U \in C^{d \times m}$, $V \in C^{d \times n}$. $U$ and $V$ denote the explicit feature matrices with rank $d$ for users and items, respectively; $d \ll k$, where $k$ denotes the rank of $X$ and $k \le \min(m, n)$.

If $X$ is an implicit feedback matrix, then $X_{ij} \in \{0, 1\}$ or $X_{ij}$ is unknown. We want to approximate $X$ with a low-rank matrix $\hat{X} = (\hat{X}_{ij})_{m \times n}$, where $\hat{X} = H^{T}W$, $H \in C^{d \times m}$, $W \in C^{d \times n}$. $H$ and $W$ denote the implicit feature matrices with rank $d$ for users and items, respectively.

SVD++ is a collaborative filtering algorithm unifying explicit and implicit feedback, based on rating prediction and matrix factorization [16]. In SVD++, the matrix $V$ denotes the explicit and implicit feature matrix of items simultaneously.

The feature matrix of users can be defined as
$$U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j, \tag{1}$$
where $U_u$ is used to characterize the user's explicit feedback, $|N(u)|^{-1/2} \sum_{j \in N(u)} W_j$ is used to characterize the user's implicit feedback, $N(u)$ denotes the set of all items on which user $u$ gave implicit feedback, and $W_j$ denotes the implicit feature vector of item $j$.

So the prediction formula of $\hat{X}_{ui}$ in SVD++ can be defined as
$$\hat{X}_{ui} = \Big(U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j\Big)^{T} V_i. \tag{2}$$
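To make (2) concrete, the following is a minimal NumPy sketch of the SVD++ prediction rule. The layout (users and items stored as $d$-dimensional columns) and all variable names are our own illustrative assumptions; the authors' implementation was in MATLAB and is not reproduced here.

```python
import numpy as np

def svdpp_predict(U_u, V_i, W, N_u):
    """Eq. (2): predicted score of item i for user u.

    U_u : (d,)   explicit user factors
    V_i : (d,)   factors of the target item
    W   : (d, n) implicit item factors, one column per item
    N_u : indices of items with implicit feedback from user u
    """
    implicit = W[:, N_u].sum(axis=1) / np.sqrt(len(N_u))  # |N(u)|^(-1/2) * sum_j W_j
    return (U_u + implicit) @ V_i

# Toy usage with d = 10 latent factors and n = 50 items
rng = np.random.default_rng(0)
d, n = 10, 50
U_u = rng.normal(0, 0.01, d)
V = rng.normal(0, 0.01, (d, n))
W = rng.normal(0, 0.01, (d, n))
N_u = np.array([3, 7, 21])          # items user u interacted with
print(svdpp_predict(U_u, V[:, 0], W, N_u))
```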

4. Exploiting Explicit and Implicit Feedback for Personalized Ranking

In this section, we will first introduce our MERR_SVD++ model, then give the learning algorithm of this model, and finally analyze its computational complexity.

4.1. Exploiting Explicit and Implicit Feedback for Personalized Ranking. In practical applications, the user scans the results list from top to bottom and stops when a result is found that fully satisfies the user's information need. The usefulness of an item at rank $i$ is dependent on the usefulness of the items at ranks less than $i$. Reciprocal Rank (RR) is an important evaluation metric in the research field of information retrieval [14], and it strongly emphasizes the relevance of results returned at the top of the list. The ERR measure is a generalized version of RR designed to be used with multiple relevance level data (e.g., ratings). ERR has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list.

Using the definition of ERR in [18, 30], we can describe ERR for a ranked item list of user $u$ as follows:
$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{P_{ui}}{R_{ui}}, \tag{3}$$
where $R_{ui}$ denotes the rank position of item $i$ for user $u$ when all items are ranked in descending order of the predicted relevance values, and $P_{ui}$ denotes the probability that the user $u$ stops at position $i$. As in [30], $P_{ui}$ is defined as follows:
$$P_{ui} = r_{ui} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big), \tag{4}$$
where $I(x)$ is an indicator function equal to 1 if the condition is true and 0 otherwise, and $r_{ui}$ denotes the probability that user $u$ finds item $i$ relevant. Substituting (4) into (3), we obtain the calculation formula of $\mathrm{ERR}_u$:
$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big). \tag{5}$$

We use a mapping function similar to the one used in [18] to convert ratings (or levels of relevance in general) to probabilities of relevance as follows:
$$r_{ui} = \begin{cases} \dfrac{2^{X_{ui}} - 1}{2^{X_{\max}}}, & I_{ui} > 0, \\[2mm] 0, & I_{ui} = 0, \end{cases} \tag{6}$$
where $I_{ui}$ is an indicator function. Note that $I_{ui} > 0$ ($I_{ui} = 0$) indicates that user $u$'s preference for item $i$ is known (unknown), $X_{ui}$ denotes the rating given by user $u$ to item $i$, and $X_{\max}$ is the highest rating.

In this paper, we define $R(u)$ as the set of all items on which user $u$ gave explicit feedback. Thus $N(u) = R(u)$ in a dataset where the users only gave explicit feedback, and the implicit feedback dataset is created by setting all the rating data in the explicit feedback dataset to 1; a toy example can be seen in Figure 2. If the dataset contains both explicit feedback data and implicit feedback data, $N(u) = R(u) + M(u)$, with $M(u)$ denoting the set of all items on which user $u$ only gave implicit feedback; a toy example can be seen in Figure 3. The influence of implicit feedback on the performance of MERR_SVD++ can be found in Section 5.4.2. If we use the traditional SVD++ algorithm to unify explicit and implicit feedback data, the prediction formula of $\hat{X}_{ui}$ in SVD++ can be defined as
$$\hat{X}_{ui} = \Big(U_u + |N(u)|^{-1/2} \sum_{j \in N(u)} W_j\Big)^{T} V_i. \tag{7}$$
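As an illustration of (5) and (6), here is a small NumPy sketch (ours, assuming a 1-5 rating scale) that maps ratings to relevance probabilities and then evaluates ERR for one user by walking down the predicted ranking; the cascade loop is equivalent to the product over higher-ranked items in (5).

```python
import numpy as np

def relevance_prob(ratings, x_max=5.0):
    """Eq. (6): r_ui = (2^X_ui - 1) / 2^X_max for rated items, 0 for unrated ones."""
    r = (2.0 ** ratings - 1.0) / 2.0 ** x_max
    return np.where(ratings > 0, r, 0.0)

def err_user(r, scores):
    """Eq. (5): ERR_u, accumulating the stop probability down the ranked list."""
    order = np.argsort(-scores)      # R_ui: items in descending predicted relevance
    err, p_reach = 0.0, 1.0          # p_reach = prod over higher-ranked j of (1 - r_uj)
    for rank, i in enumerate(order, start=1):
        err += p_reach * r[i] / rank
        p_reach *= 1.0 - r[i]
    return err

ratings = np.array([5, 0, 3, 0, 4])           # 0 means unknown preference (I_ui = 0)
scores = np.array([0.9, 0.1, 0.4, 0.2, 0.7])  # predicted X_hat_ui
print(err_user(relevance_prob(ratings), scores))
```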

So far, through the introduction of SVD++, we can exploit both explicit and implicit feedback simultaneously by optimizing the evaluation metric ERR. So we call our model MERR_SVD++.

Note that the value of the rank $R_{ui}$ depends on the value of $\hat{X}_{ui}$. For example, if the predicted relevance value $\hat{X}_{ui}$ of item $i$ is the second highest among all the items for user $u$, then we will have $R_{ui} = 2$.

Similar to other ranking measures such as RR, ERR is also nonsmooth with respect to the latent factors of users and items, that is, $U$, $V$, and $W$. It is thus impossible to optimize ERR directly using conventional optimization techniques. We thus employ smoothing techniques that were also used in CLiMF [14] and xCLiMF [18] to attain a smoothed version of ERR. In particular, we approximate the rank-based terms $1/R_{ui}$ and $I(R_{uj} < R_{ui})$ in (5) by smooth functions with respect to the model parameters $U$, $V$, and $W$. The approximation formulas are as follows:
$$I(R_{uj} < R_{ui}) \approx g(\hat{X}_{uj} - \hat{X}_{ui}), \qquad \frac{1}{R_{ui}} \approx g(\hat{X}_{ui}), \tag{8}$$
where $g(x)$ is the logistic function, that is, $g(x) = 1/(1 + e^{-x})$.

Substituting (8) into (5), we obtain a smoothed approximation of $\mathrm{ERR}_u$:
$$\mathrm{ERR}_u = \sum_{i=1}^{n} r_{ui}\, g(\hat{X}_{ui}) \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big). \tag{9}$$
Note that, for notational convenience, we make use of the substitution $\hat{X}_{u(j-i)} = \hat{X}_{uj} - \hat{X}_{ui}$.
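The smoothed surrogate (9) is straightforward to evaluate; below is a short sketch of ours that mirrors the formula literally, including the product over all $j$. Because every term is built from logistic functions of the model scores, the result is differentiable in $U$, $V$, and $W$, which is what the gradients in Section 4.2 exploit.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def smoothed_err(r, x_hat):
    """Eq. (9): differentiable approximation of ERR_u for one user."""
    total = 0.0
    for i in range(len(x_hat)):
        # product over all j of (1 - r_uj * g(X_hat_uj - X_hat_ui)), as written in (9)
        damping = np.prod(1.0 - r * sigmoid(x_hat - x_hat[i]))
        total += r[i] * sigmoid(x_hat[i]) * damping
    return total
```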

Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize $\ln((1/|N(u)|)\,\mathrm{ERR}_u)$. Specifically, we have
$$U_u, V, W = \arg\max_{U_u, V, W} \mathrm{ERR}_u = \arg\max_{U_u, V, W} \ln\Big(\frac{1}{|N(u)|}\,\mathrm{ERR}_u\Big) = \arg\max_{U_u, V, W} \ln\Bigg(\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Bigg). \tag{10}$$


Figure 2: A toy example of a dataset in which the users only gave explicit feedback: (a) denotes the explicit feedback dataset; (b) denotes the implicit feedback dataset.


Figure 3: A toy example of a dataset that contains both explicit feedback data and implicit feedback data: (a) denotes the explicit feedback dataset; (b) denotes the implicit feedback dataset, and the numbers in bold denote the data for which users only gave implicit feedback.

Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of $\ln((1/|N(u)|)\,\mathrm{ERR}_u)$ as follows:
$$\begin{aligned} \ln\Big(\frac{1}{|N(u)|}\,\mathrm{ERR}_u\Big) &= \ln\Bigg(\sum_{i=1}^{n} \frac{r_{ui}\, g(\hat{X}_{ui})}{|N(u)|} \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Bigg) \\ &\ge \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \ln\Bigg(g(\hat{X}_{ui}) \prod_{j=1}^{n} \big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Bigg) \\ &= \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \Bigg(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Bigg). \end{aligned} \tag{11}$$

We can neglect the constant $1/|N(u)|$ in the lower bound and obtain a new objective function:
$$L(U_u, V, W) = \sum_{i=1}^{n} r_{ui} \Bigg(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Bigg). \tag{12}$$

Taking into account all $m$ users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:
$$L(U, V, W) = \sum_{u=1}^{m} \sum_{i=1}^{n} r_{ui} \Bigg(\ln g(\hat{X}_{ui}) + \sum_{j=1}^{n} \ln\big(1 - r_{uj}\, g(\hat{X}_{u(j-i)})\big)\Bigg) - \frac{\lambda}{2}\big(\|U\|^2 + \|V\|^2 + \|W\|^2\big), \tag{13}$$
in which $\lambda$ denotes the regularization coefficient and $\|U\|$ denotes the Frobenius norm of $U$. Note that the lower bound $L(U, V, W)$ is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters $U$, $V$, and $W$.
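For reference, here is a compact sketch (ours) of the per-user term of (12); summing it over users and subtracting the Frobenius penalty gives (13). The relevance probabilities are assumed to come from the mapping in (6), and `X_hat` is assumed to be the precomputed score matrix from (7).

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lower_bound_user(r, x_hat):
    """Eq. (12): per-user lower bound on ln((1/|N(u)|) * ERR_u), up to a constant."""
    total = 0.0
    for i in np.flatnonzero(r):      # terms with r_ui = 0 contribute nothing
        total += r[i] * (np.log(sigmoid(x_hat[i]))
                         + np.sum(np.log(1.0 - r * sigmoid(x_hat - x_hat[i]))))
    return total

def objective(R_prob, X_hat, U, V, W, lam):
    """Eq. (13): summed per-user lower bounds minus the Frobenius-norm regularizer."""
    data_term = sum(lower_bound_user(R_prob[u], X_hat[u]) for u in range(R_prob.shape[0]))
    return data_term - 0.5 * lam * (np.sum(U**2) + np.sum(V**2) + np.sum(W**2))
```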

4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: $\arg\max_{U,V,W} L(U, V, W)$. Note that $L(U, V, W)$ represents an approximation of the mean value of ERR across all the users; we can thus remove the constant coefficient $1/m$, since it has no influence on the optimization of $L(U, V, W)$. Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:

$$\frac{\partial L}{\partial U_u} = \sum_{i=1}^{n} r_{ui} \Bigg(g(-\hat{X}_{ui})\,V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\Bigg) - \lambda U_u, \tag{14}$$

$$\frac{\partial L}{\partial V_i} = r_{ui} \Bigg(g(-\hat{X}_{ui}) + \sum_{j=1}^{n} r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui}) \bigg(\frac{1}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})} - \frac{1}{1 - r_{ui}\, g(\hat{X}_{ui} - \hat{X}_{uj})}\bigg)\Bigg) \Big(U_u + |N(u)|^{-1/2} \sum_{k \in N(u)} W_k\Big) - \lambda V_i, \tag{15}$$

$$\frac{\partial L}{\partial W_i} = r_{ui} \Bigg(g(-\hat{X}_{ui})\,V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'(\hat{X}_{uj} - \hat{X}_{ui})\,(V_i - V_j)}{1 - r_{uj}\, g(\hat{X}_{uj} - \hat{X}_{ui})}\Bigg) - \lambda W_i. \tag{16}$$

The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research in [8–11, 13, 18] shows that using the normal distribution $N(0, 0.01)$ for the initialization of the feature matrices is very effective, so we also use this approach for the initialization of $U$, $V$, and $W$ in our proposed algorithm.

Input: training set $X$, learning rate $\alpha$, regularization coefficient $\lambda$, the number of features $d$, and the maximal number of iterations itermax.
Output: feature matrices $U$, $V$, $W$.
  for $u = 1, 2, \ldots, m$ do
    Index the relevant items for user $u$: $N(u) = \{j \mid X_{uj} > 0,\ 0 \le j \le n\}$
  end
  Initialize $U^{(0)}$, $V^{(0)}$, and $W^{(0)}$ with random values, and set $t = 0$
  repeat
    for $u = 1, 2, \ldots, m$ do
      Update $U_u$: $U_u^{(t+1)} = U_u^{(t)} + \alpha\, \partial L / \partial U_u^{(t)}$
      for $i \in N(u)$ do
        Update $V_i$, $W_i$: $V_i^{(t+1)} = V_i^{(t)} + \alpha\, \partial L / \partial V_i^{(t)}$, $W_i^{(t+1)} = W_i^{(t)} + \alpha\, \partial L / \partial W_i^{(t)}$
      end
    end
    $t = t + 1$
  until $t \ge$ itermax
  $U = U^{(t)}$, $V = V^{(t)}$, $W = W^{(t)}$

Algorithm 1: MERR_SVD++.

4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in $X$, the computational complexity of the gradient in (14) is $O(d\bar{n}^2 m + d\bar{n} m)$, where $\bar{n}$ denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also $O(d\bar{n}^2 m + d\bar{n} m)$, and the computational complexity of the gradient in (15) is $O(d\bar{n}^2 m + d\bar{n} m)$ as well. Hence, the complexity of the learning algorithm in one iteration is of the order $O(d\bar{n}^2 m)$. In the case that $\bar{n}$ is a small number, that is, $\bar{n}^2 \ll m$, the complexity is linear in the number of users in the data collection. Note that we have $s = \bar{n} m$, in which $s$ denotes the number of nonzeros in the user-item matrix; the complexity of the learning algorithm is then $O(d\bar{n} s)$. Since we usually have $\bar{n} \ll s$, the complexity remains $O(d\bar{n} s)$ even in the case that $\bar{n}$ is large, that is, linear in the number of nonzeros (i.e., relevant observations in the data). In sum, our analysis shows that MERR_SVD++ is suitable for large-scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.
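To make the update step concrete, here is a self-contained NumPy sketch (ours, not the authors' MATLAB code) of Algorithm 1's inner loop for a single user. It restricts the pairwise sums in (14)-(16) to $N(u)$, since all terms with $r_{uj} = 0$ vanish, uses the identity $g'(x) = g(x)(1 - g(x))$, and assumes the rating cap $X_{\max}$ equals the largest observed rating.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def update_user(u, X, U, V, W, alpha=0.01, lam=0.001):
    """One inner-loop pass of Algorithm 1 for user u, following Eqs. (14)-(16).

    X: (m, n) rating matrix with 0 = unknown; U: (d, m); V, W: (d, n).
    """
    N_u = np.flatnonzero(X[u] > 0)
    if N_u.size == 0:
        return
    r = (2.0 ** X[u, N_u] - 1.0) / 2.0 ** X.max()              # Eq. (6)
    a_u = U[:, u] + W[:, N_u].sum(axis=1) / np.sqrt(N_u.size)  # SVD++ user vector, Eq. (1)
    x = a_u @ V[:, N_u]                                        # scores of relevant items
    diff = x[None, :] - x[:, None]                             # diff[i, j] = x_j - x_i
    g = sigmoid(diff)
    gp = g * (1.0 - g)                                         # g'(x_j - x_i)
    denom = 1.0 - r[None, :] * g                               # 1 - r_j * g(x_j - x_i)
    coeff = r[None, :] * gp / denom
    Vr = V[:, N_u]
    # pair[:, i] = sum_j coeff[i, j] * (V_i - V_j)
    pair = Vr * coeff.sum(axis=1) - Vr @ coeff.T
    core = (Vr * sigmoid(-x) + pair) * r                       # r_i * [g(-x_i) V_i + ...]
    U[:, u] += alpha * (core.sum(axis=1) - lam * U[:, u])            # Eq. (14)
    c = sigmoid(-x) + (r[None, :] * gp * (1.0 / denom - 1.0 / denom.T)).sum(axis=1)
    V[:, N_u] += alpha * (np.outer(a_u, r * c) - lam * V[:, N_u])    # Eq. (15)
    W[:, N_u] += alpha * (core - lam * W[:, N_u])                    # Eq. (16)
```

Running `update_user` over all users for itermax iterations reproduces the outer structure of Algorithm 1; only the relevant items per user are touched, which matches the $O(d\bar{n}s)$ scaling argued above.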

5. Experiment

5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1-5 scale) from ca. 6K users on 3.7K movies; the sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1-5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies in which each user has rated more than 100 different movies.
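A quick sanity check of the quoted sparseness, using the standard full ML1m counts (6,040 users, 3,706 movies, 1,000,209 ratings):

```python
users, items, ratings = 6040, 3706, 1_000_209
sparseness = 1.0 - ratings / (users * items)
print(f"{sparseness:.2%}")  # -> 95.53%, matching the figure quoted above
```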


5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. Our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). So we use NDCG and ERR as evaluation metrics for the predictability of models in this paper.

NDCG is one of the most widely used measures for ranking problems. To define $\mathrm{NDCG}_u$ for a user $u$, one first needs to define $\mathrm{DCG}_u$:
$$\mathrm{DCG}_u = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i + 1)}, \tag{17}$$
where $\mathrm{pref}(i)$ is a binary indicator returning 1 if the $i$th item is preferred and 0 otherwise. $\mathrm{DCG}_u$ is then normalized by the ideal ranked list into the interval $[0, 1]$:
$$\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u}, \tag{18}$$
where $\mathrm{IDCG}_u$ denotes the score obtained when the ranked list is sorted exactly according to the user's tastes, that is, when positive items are placed at the head of the ranked list. The NDCG of all the users is the mean score over the users.

ERR is a generalized version of Reciprocal Rank (RR) designed to be used with multiple relevance level data (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user $u$ as follows:
$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \big(1 - r_{uj}\, I(R_{uj} < R_{ui})\big). \tag{19}$$
Similar to NDCG, the ERR of all the users is the mean score over the users.

Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
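A small sketch (ours) of the truncated metrics: both are computed over only the first 5 positions of the recommendation list, and the logarithm base in NDCG cancels in the ratio (18), so base 2 is used here by convention.

```python
import numpy as np

def ndcg_at_k(pref, scores, k=5):
    """Eqs. (17)-(18) truncated at k, with binary preferences pref in {0, 1}."""
    k = min(k, len(scores))
    discounts = np.log2(np.arange(2, k + 2))            # log(i + 1) for i = 1..k
    order = np.argsort(-scores)[:k]
    gains = (2.0 ** pref[order] - 1.0) / discounts
    ideal = (2.0 ** np.sort(pref)[::-1][:k] - 1.0) / discounts
    return gains.sum() / ideal.sum() if ideal.sum() > 0 else 0.0

def err_at_k(r, scores, k=5):
    """Eq. (19) truncated at k (cascade form, as in the ERR sketch above)."""
    order = np.argsort(-scores)[:min(k, len(scores))]
    err, p_reach = 0.0, 1.0
    for rank, i in enumerate(order, start=1):
        err += p_reach * r[i] / rank
        p_reach *= 1.0 - r[i]
    return err
```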

5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. Generated recommendation lists for each user are compared to the ground truth in the test set in order to measure the performance.

All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter $\lambda$ was selected from the range $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10\}$, and the optimal parameter value was used. The learning rate $\alpha$ was selected from the set $A = \{\alpha \mid \alpha = 0.0001 \times 2^{c},\ \alpha \le 0.5,\ c > 0 \text{ and } c \text{ is a constant}\}$, and the optimal parameter value was also used. In order to compare their performances fairly, for all matrix factorization models we set the number of features to be 10. The optimal values of all parameters for all the baseline models used are determined individually; more detailed setting methods of the parameters for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times for each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
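For concreteness, the two search grids above can be written out as follows (a sketch; the enumeration stops where the $\alpha \le 0.5$ constraint cuts off the doubling sequence):

```python
lambdas = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1, 10]   # candidate regularization values
alphas = []                                        # A = {0.0001 * 2^c : alpha <= 0.5, c > 0}
c = 1
while 0.0001 * 2 ** c <= 0.5:
    alphas.append(0.0001 * 2 ** c)
    c += 1
print(alphas)  # 0.0002, 0.0004, ..., 0.4096
```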

5.4. Experiment Results. In this section, we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions:

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable for large-scale use cases?

5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below:

(i) Co-Rating [17]: a state-of-the-art CF model that can be trained from explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold: on the ML1m dataset and the extracted Netflix dataset we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.

The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR.

Figure 4: The performance comparison of MERR_SVD++ and baselines: (a) NDCG@5 on ML1m; (b) ERR@5 on ML1m; (c) NDCG@5 on the extracted Netflix dataset; (d) ERR@5 on the extracted Netflix dataset.

Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement of ERR aligns consistently with the improvement of NDCG, indicating that optimizing ERR would not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.

Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, which is because the SVD++ model only attempts to approximate the observed ratings and does not model preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback could indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of both ERR and NDCG across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data would inevitably make recommendation models based on binary relevance data, such as CLiMF, suboptimal for use cases with explicit feedback data.

Hence, we give a positive answer to our first research question.

Figure 5: The influence of implicit feedback on the performance of MERR_SVD++ (ERR@5 and NDCG@5 against the increased numbers of implicit feedback for each user).

5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ can be found in Figure 5. Here $N(u) = R(u) + M(u)$, where $M(u)$ denotes the set of all items on which user $u$ only gave implicit feedback. Rows denote the increased numbers of implicit feedback for each user, and columns denote the quality of ERR and NDCG. In our experiment, the increased numbers of implicit feedback for each user are the same, and we use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the quality of ERR and NDCG of MERR_SVD++ synchronously and linearly improves with the increase of implicit feedback for each user, which confirms our belief that implicit feedback could indeed complement explicit feedback.

With this experimental result, we give a positive answer to our second research question.

5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for training sets at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set and their rated items as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under each condition increases almost linearly with the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated to be linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.

Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set (runtime per iteration, in seconds, against the percentage of users used for training, under the "Given 5", "Given 10", and "Given 15" conditions).

The observations from this experiment allow us to answer our last research question positively.

6. Conclusion and Future Work

The problem of the previous researches on personalized ranking is that they focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. Until now, nobody has studied personalized ranking algorithms by exploiting both explicit and implicit feedback. In order to overcome the defects of prior researches, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ showed a linear correlation with the number of ratings. Because of its high precision and good expansibility, MERR_SVD++ is suitable for processing big data, can greatly improve the recommendation speed and validity by solving the latency problem of personalized recommendation, and has wide application prospects in the field of Internet information recommendation. And because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can solve the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.

For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep problem and the cold start problem of personalized recommendation. Also, we would like to explore more useful information from the explicit feedback and implicit feedback simultaneously.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).

References

[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.
[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.
[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.
[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.
[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.
[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.
[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.
[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis, and Cybernetics, pp. 395–404, Wuhan, China, October 2014.
[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.
[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis, and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.
[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.
[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.
[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.
[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.
[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.
[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.
[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.
[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.
[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.
[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.
[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.
[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.
[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.
[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.
[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.
[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.
[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.
[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.
[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.
[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.
[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.


Page 3: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

Mathematical Problems in Engineering 3

22 Learning to Rank LTR is the core technology forinformation retrieval When a query is input into a searchengine LTR is responsible for ranking all the documentsor Web pages according to their relevance to this query orother objectives Many LTR algorithms have been proposedrecently and they can be classified into three categoriespointwise listwise and pairwise [3 26]

In the pointwise approach it is assumed that each query-document pair in the training data has a numerical or ordinalscore Then the LTR problem can be approximated by aregression problem given a single query-document pair itsscore is predicted

As the name suggests the listwise approach takes theentire set of documents associatedwith a query in the trainingdata as the input to construct amodel and predict their scores

The pairwise approach does not focus on accuratelypredicting the degree of relevance of each document insteadit mainly cares about the relative order of two documents Inthis sense it is closer to the concept of ldquorankingrdquo

23 Personalized Ranking Algorithms for Collaborative Fil-tering The algorithms about personalized ranking can alsobe divided into two categories personalized ranking withimplicit feedback (PRIF) [6 7 10 11 13ndash15] and personalizedranking with explicit feedback (PREF) [18 27ndash29] Theforemost of PRIF is Bayesian Personalized Ranking (BPR)[11] which converts the OCCF problem into a rankingproblem Pan et al [13] proposedAdaptive BayesianPersonal-ized Ranking (ABPR) which generalized BPR algorithm forhomogeneous implicit feedback and learned the confidenceadaptively Pan et al [15] have proposed Group BayesianPersonalized Ranking (GBPR) via introducing richer inter-actions among users In GBPR it introduces group prefer-ence to relax the individual and independence assumptionsof BPR Jahrer and Toscher [6] proposed SVD and AFMwhich used a ranking based objective function constructedby matrix decomposition model and a stochastic gradientdescent optimizer Takacs and Tikk [7] proposed RankALSwhich presented a computationally effective approach forthe direct minimization of a ranking objective functionwithout sampling Gai [10] proposed a new PRIF algorithm(Pairwise Probabilistic Matrix Factorization (PPMF)) to fur-ther improve the performance of previous PRIF algorithmsRecently Shi et al [14] proposed Collaborative Less-is-MoreFiltering (CLiMF) in which the model parameters werelearned by directly maximizing the well-known informationretrieval metric Mean Reciprocal Rank (MRR) HoweverCLiMF is not suitable for other evaluation metrics (egMAP AUC and NDCG) References [6 7 10 11 13ndash15]can improve the performance of OCCF by solving the datasparsity and imbalance problems of PRIF to a certain extentAs for PREF [27] was adapted from PLSA and [28] employedthe KNN technique of CF both of which utilized the pairwiseranking method Reference [29] utilized the listwise methodto build its ranker which was a variation of MaximumMargin Factorization [22] Shi et al [18] proposed ExtendedCollaborative Less-Is-More Filtering (xCLiMF)model whichcould be seen as a generalization of the CLiMF methodThe key idea of the xCLiMF algorithm is that it builds a

recommendation model by optimizing Expected ReciprocalRank an evaluation metric that generalizes Reciprocal Rank(RR) in order to incorporate userrsquo explicit feedback Refer-ences [18 27ndash29] can also improve the performance of PREFby solving the data sparsity and imbalance problems of PREFto a certain extent

3 Problem Formalization and SVD++

31 Problem Formalization In this paper we use capitalletters to denote a matrix (such as 119883) Given a matrix 119883119883

119894119895represents its element 119883

119894indicates the 119894th row of 119883 119883

119895

symbolizes the 119895th column of119883 and119883

119879 stands for the trans-pose of119883

32 SVD++ Given that a matrix 119883 = (119883

119894119895)

119898times119899 the total

number of users is 119898 and the total number of items is 119899 if119883 is an explicit feedback matrix then119883

119894119895isin 0 1 2 3 4 5 or

119883

119894119895is unknown We want to approximate 119883 with a low rank

matrix

119883 = (

119883

119894119895)

119898times119899 where

119883 = 119880

119879119881 119880 isin 119862

119889times119898 119881 isin 119862

119889times119899119880 and119881 denote the explicit feature matrix with ranks of 119889 forusers and items respectively 119889 ≪ 119896 and 119896 denotes the rankof119883 119896 le min(119898 119899)

If 119883 is an implicit feedback matrix then 119883

119894119895isin 0 1 or

119883

119894119895is unknown We want to approximate 119883 with a low rank

matrix

119883 = (

119883

119894119895)

119898times119899 where

119883 = 119867

119879119882 119867 isin 119862

119889times119898 119882 isin

119862

119889times119899119867 and119882 denote the implicit feature matrix with ranksof 119889 for users and items respectively

SVD++ is a collaborative filtering algorithm unifyingexplicit and implicit feedback based on rating prediction andmatrix factorization [16] In SVD++ matrix 119881 denotes theexplicit and implicit feature matrix of items simultaneously

The feature matrix of users can be defined as

119880

119906+ |119873 (119906)|

minus12sum

119895isin119873(119906)

119882

119895 (1)

where 119880

119906is used to characterize the userrsquos explicit feed-

back |119873(119906)|

minus12sum

119895isin119873(119906)119882

119895is used to characterize the userrsquos

implicit feedback 119873(119906) denotes the set of all items that user119906 gave implicit feedback and119882

119895denotes the implicit feature

vector of item 119895So the prediction formula of 119883

119906119894in SVD++ can be defined

as

119883

119906119894= (119880

119906+ |119873 (119906)|

minus12sum

119895isin119873(119906)

119882

119895)

119879

119881

119894

(2)

4 Exploiting Explicit and Implicit Feedbackfor Personalized Ranking

In this section we will firstly introduce our MERR SVD++model then give the learning algorithm of this model andfinally analyze its computational complexity

41 Exploiting Explicit and Implicit Feedback for PersonalizedRanking In practical applications the user scans the results

4 Mathematical Problems in Engineering

list from top to bottom and stops when a result is found thatfully satisfies the userrsquos information need The usefulness ofan item at rank 119894 is dependent on the usefulness of the itemsat rank less than 119894 Reciprocal Rank (RR) is an importantevaluationmetric in the research field of information retrieval[14] and it strongly emphasizes the relevance of resultsreturned at the top of the list The ERR measure is ageneralized version of RR designed to be used with multiplerelevance level data (eg ratings) ERR has similar propertiesto RR in that it strongly emphasizes the relevance of resultsreturned at the top of the list

Using the definition of ERR in [18 30] we can describeERR for a ranked item list of user 119906 as follows

ERR119906=

119899

sum

119894=1

119875

119906119894

119877

119906119894

(3)

where119877119906119894denotes the rank position of item 119894 for user 119906 when

all items are ranked in descending order of the predictedrelevance values And119875

119906119894denotes the probability that the user

119906 stops at position 119894 As in [30] 119875119906119894is defined as follows

119875

119906119894= 119903

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (4)

where 119868(119909) is an indicator function equal to 1 if the conditionis true and otherwise 0 And 119903

119906119894denotes the probability that

user 119906 finds the item 119894 relevant Substituting (4) into (3) weobtain the calculation formula of ERR

119906

ERR119906=

119899

sum

119894=1

119903

119906119894

119877

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (5)

We use a mapping function similar to the one used in[18] to convert ratings (or levels of relevance in general) toprobabilities of relevance as follows

119903

119906119894=

2

119883119906119894minus 1

2

119883max 119868

119906119894gt 0

0 119868

119906119894= 0

(6)

where 119868

119906119894is an indicator function Note that 119868

119906119894gt 0 (119868

119906119894=

0) indicates that user 119906rsquos preference to item 119894 is known(unknown) 119883

119906119894denotes the rating given by user 119906 to item 119894

and119883max is the highest ratingIn this paper we define that 119877(119906) denotes the set of all

items that user 119906 gave explicit feedback so 119873(119906) = 119877(119906) inthe dataset that the users only gave explicit feedback and theimplicit feedback dataset is created by setting all the ratingdata in explicit feedback dataset as 1 A toy example can beseen in Figure 2 If the dataset contains both explicit feedbackdata and implicit feedback data 119873(119906) = 119877(119906) + 119872(119906)119872(119906) denoting the set of all items that user 119906 only gaveimplicit feedback A toy example can be seen in Figure 3The influence of implicit feedback on the performance ofMERR SVD++ can be found in Section 542 If we use thetraditional SVD++ algorithm to unify explicit and implicit

feedback data the prediction formula of 119883119906119894in SVD++ can

be defined as

119883

119906119894= (119880

119906+ |119873 (119906)|

minus12sum

119895isin119873(119906)

119882

119895)

119879

119881

119894

(7)

So far through the introduction of SVD++ we canexploit both explicit and implicit feedback simultaneouslyby optimizing evaluation metric ERR So we call our modelMERR SVD++

Note that the value of rank 119877

119906119894depends on the value of

119883

119906119894 For example if the predicted relevance value

119883

119906119894of item

119894 is the second highest among all the items for user 119906 then wewill have 119877

119906119894= 2

Similar to other ranking measures such as RR, ERR is nonsmooth with respect to the latent factors of users and items, that is, $U$, $V$, and $W$. It is thus impossible to optimize ERR directly using conventional optimization techniques. We therefore employ the smoothing techniques that were also used in CLiMF [14] and xCLiMF [18] to attain a smoothed version of ERR. In particular, we approximate the rank-based terms $1/R_{ui}$ and $I(R_{uj} < R_{ui})$ in (5) by smooth functions of the model parameters $U$, $V$, and $W$. The approximate formulas are as follows:

$$I\!\left(R_{uj} < R_{ui}\right) \approx g\!\left(\hat{X}_{uj} - \hat{X}_{ui}\right), \qquad \frac{1}{R_{ui}} \approx g\!\left(\hat{X}_{ui}\right) \tag{8}$$

where $g(x)$ is the logistic function, that is, $g(x) = 1/(1 + e^{-x})$.
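A quick numeric check, with assumed score values of our own choosing, shows how the logistic surrogate behaves like a soft version of the indicator in (8):

```python
import numpy as np

def g(x):
    # logistic function g(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

xhat_i, xhat_j = 0.2, 1.5   # assumed predicted scores; item j scores higher
print(g(xhat_j - xhat_i))   # ~0.79: soft value of the indicator I(R_uj < R_ui)
print(g(xhat_i))            # ~0.55: smooth surrogate for 1/R_ui
```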

Substituting (8) into (5), we obtain a smoothed approximation of $\mathrm{ERR}_u$:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} r_{ui}\, g\!\left(\hat{X}_{ui}\right) \prod_{j=1}^{n} \left(1 - r_{uj}\, g\!\left(\hat{X}_{u(j-i)}\right)\right) \tag{9}$$

Note that, for notational convenience, we make use of the substitution $\hat{X}_{u(j-i)} = \hat{X}_{uj} - \hat{X}_{ui}$.

Given the monotonicity of the logarithm function, the model parameters that maximize (9) are equivalent to the parameters that maximize $\ln\left((1/|N(u)|)\,\mathrm{ERR}_u\right)$. Specifically, we have

$$\begin{aligned} U_u, V, W &= \arg\max_{U_u, V, W} \mathrm{ERR}_u = \arg\max_{U_u, V, W} \ln\!\left(\frac{1}{|N(u)|}\,\mathrm{ERR}_u\right) \\ &= \arg\max_{U_u, V, W} \ln\!\left(\sum_{i=1}^{n} \frac{r_{ui}\, g\!\left(\hat{X}_{ui}\right)}{|N(u)|} \prod_{j=1}^{n} \left(1 - r_{uj}\, g\!\left(\hat{X}_{u(j-i)}\right)\right)\right) \end{aligned} \tag{10}$$


Figure 2: A toy example of a dataset in which the users only gave explicit feedback: (a) denotes the explicit feedback dataset (user-item ratings on a 1–5 scale); (b) denotes the implicit feedback dataset (all rated entries set to 1).

Figure 3: A toy example of a dataset that contains both explicit feedback data and implicit feedback data: (a) denotes the explicit feedback dataset; (b) denotes the implicit feedback dataset, where the entries in bold denote the items for which users only gave implicit feedback.

Based on Jensen's inequality and the concavity of the logarithm function, in a similar manner to [14, 18], we derive the lower bound of $\ln\left((1/|N(u)|)\,\mathrm{ERR}_u\right)$ as follows:

$$\begin{aligned} \ln\!\left(\frac{1}{|N(u)|}\,\mathrm{ERR}_u\right) &= \ln\!\left(\sum_{i=1}^{n} \frac{r_{ui}\, g\!\left(\hat{X}_{ui}\right)}{|N(u)|} \prod_{j=1}^{n}\left(1 - r_{uj}\, g\!\left(\hat{X}_{u(j-i)}\right)\right)\right) \\ &\ge \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui} \ln\!\left(g\!\left(\hat{X}_{ui}\right) \prod_{j=1}^{n}\left(1 - r_{uj}\, g\!\left(\hat{X}_{u(j-i)}\right)\right)\right) \\ &= \frac{1}{|N(u)|} \sum_{i=1}^{n} r_{ui}\left(\ln g\!\left(\hat{X}_{ui}\right) + \sum_{j=1}^{n} \ln\!\left(1 - r_{uj}\, g\!\left(\hat{X}_{u(j-i)}\right)\right)\right) \end{aligned} \tag{11}$$

We can neglect the constant $1/|N(u)|$ in the lower bound and obtain a new objective function:

$$L(U_u, V, W) = \sum_{i=1}^{n} r_{ui}\left(\ln g\!\left(\hat{X}_{ui}\right) + \sum_{j=1}^{n} \ln\!\left(1 - r_{uj}\, g\!\left(\hat{X}_{u(j-i)}\right)\right)\right) \tag{12}$$

Taking into account all $m$ users and using the Frobenius norm of the latent factors for regularization, we obtain the objective function of MERR_SVD++:

$$L(U, V, W) = \sum_{u=1}^{m} \sum_{i=1}^{n} r_{ui}\left(\ln g\!\left(\hat{X}_{ui}\right) + \sum_{j=1}^{n} \ln\!\left(1 - r_{uj}\, g\!\left(\hat{X}_{u(j-i)}\right)\right)\right) - \frac{\lambda}{2}\left(\|U\|^2 + \|V\|^2 + \|W\|^2\right) \tag{13}$$


in which $\lambda$ denotes the regularization coefficient and $\|U\|$ denotes the Frobenius norm of $U$. Note that the lower bound $L(U, V, W)$ is much less complex than the original objective function in (9), and standard optimization methods, for example, gradient ascent, can be used to learn the optimal model parameters $U$, $V$, and $W$.
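To make the objective concrete, the following NumPy sketch evaluates the per-user lower bound (12) for given relevance probabilities and predicted scores. The vectorized layout and the names `per_user_objective`, `r_u`, and `xhat_u` are our own assumptions for illustration:

```python
import numpy as np

def g(x):
    # logistic function g(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))

def per_user_objective(r_u, xhat_u):
    """Per-user lower bound L(U_u, V, W) from (12).

    r_u    : (n,) relevance probabilities r_ui from (6); zero for unrated items
    xhat_u : (n,) predicted relevance scores from (7)
    """
    diff = xhat_u[None, :] - xhat_u[:, None]      # diff[i, j] = xhat_uj - xhat_ui
    inner = np.log(1.0 - r_u[None, :] * g(diff))  # ln(1 - r_uj g(xhat_uj - xhat_ui))
    # as written in (12), the inner sum over j runs over all items, including j = i
    return float(np.sum(r_u * (np.log(g(xhat_u)) + inner.sum(axis=1))))

# toy usage: 4 items, two of them rated
r_u = np.array([0.9, 0.0, 0.3, 0.0])
xhat_u = np.array([0.8, -0.1, 0.2, -0.5])
print(per_user_objective(r_u, xhat_u))
```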

4.2. Optimization. We can now maximize the objective function (13) with respect to the latent factors: $\arg\max_{U, V, W} L(U, V, W)$. Note that $L(U, V, W)$ represents an approximation of the mean value of ERR across all the users. We can thus remove the constant coefficient $1/m$, since it has no influence on the optimization of $L(U, V, W)$. Since the objective function is smooth, we can use gradient ascent for the optimization. The gradients can be derived in a similar manner to xCLiMF [18], as shown in the following:

$$\frac{\partial L}{\partial U_u} = \sum_{i=1}^{n} r_{ui}\left[g\!\left(-\hat{X}_{ui}\right)V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'\!\left(\hat{X}_{uj}-\hat{X}_{ui}\right)\left(V_i - V_j\right)}{1 - r_{uj}\, g\!\left(\hat{X}_{uj}-\hat{X}_{ui}\right)}\right] - \lambda U_u \tag{14}$$

$$\begin{aligned} \frac{\partial L}{\partial V_i} ={}& r_{ui}\left[g\!\left(-\hat{X}_{ui}\right) + \sum_{j=1}^{n} r_{uj}\, g'\!\left(\hat{X}_{uj}-\hat{X}_{ui}\right)\right. \\ &\left.\cdot\left(\frac{1}{1 - r_{uj}\, g\!\left(\hat{X}_{uj}-\hat{X}_{ui}\right)} - \frac{1}{1 - r_{ui}\, g\!\left(\hat{X}_{ui}-\hat{X}_{uj}\right)}\right)\right] \\ &\cdot\left(U_u + |N(u)|^{-1/2} \sum_{k \in N(u)} W_k\right) - \lambda V_i \end{aligned} \tag{15}$$

$$\frac{\partial L}{\partial W_i} = r_{ui}\left[g\!\left(-\hat{X}_{ui}\right)V_i + \sum_{j=1}^{n} \frac{r_{uj}\, g'\!\left(\hat{X}_{uj}-\hat{X}_{ui}\right)\left(V_i - V_j\right)}{1 - r_{uj}\, g\!\left(\hat{X}_{uj}-\hat{X}_{ui}\right)}\right] - \lambda W_i \tag{16}$$

The learning algorithm for the MERR_SVD++ model is outlined in Algorithm 1.

The published research papers in [8–11, 13, 18] show that using the normal distribution $N(0, 0.01)$ for the initialization of the feature matrices is very effective, so we also use this approach for the initialization of $U$, $V$, and $W$ in our proposed algorithm.
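Before the formal listing of Algorithm 1 below, the following sketch shows how the gradient (14) and one ascent step for a single user's factor $U_u$ might be implemented. The names, array shapes, and demo values are illustrative assumptions of ours, not the authors' MATLAB code:

```python
import numpy as np

def g(x):
    return 1.0 / (1.0 + np.exp(-x))

def dg(x):
    # derivative of the logistic function: g'(x) = g(x) (1 - g(x))
    s = g(x)
    return s * (1.0 - s)

def grad_U_u(U_u, V, W, r_u, N_u, lam):
    """Gradient (14) of the objective with respect to the user factor U_u.

    V, W : (n, d) explicit / implicit item factors
    r_u  : (n,) relevance probabilities (zero for unrated items)
    N_u  : indices of items with feedback from user u
    """
    implicit = W[N_u].sum(axis=0) / np.sqrt(len(N_u))
    xhat = V @ (U_u + implicit)              # predictions (7) for all items
    diff = xhat[None, :] - xhat[:, None]     # diff[i, j] = xhat_uj - xhat_ui
    coef = r_u[None, :] * dg(diff) / (1.0 - r_u[None, :] * g(diff))
    grad = np.zeros_like(U_u)
    for i in N_u:                            # terms with r_ui = 0 vanish
        pairwise = (coef[i][:, None] * (V[i] - V)).sum(axis=0)
        grad += r_u[i] * (g(-xhat[i]) * V[i] + pairwise)
    return grad - lam * U_u

# one ascent step of Algorithm 1 for user u, with learning rate alpha
rng = np.random.default_rng(1)
n, d, alpha, lam = 6, 3, 0.01, 0.001
V, W = rng.normal(0, 0.01, (n, d)), rng.normal(0, 0.01, (n, d))
U_u, N_u = rng.normal(0, 0.01, d), [0, 2, 5]
r_u = np.zeros(n)
r_u[N_u] = [0.5, 0.9, 0.2]
U_u = U_u + alpha * grad_U_u(U_u, V, W, r_u, N_u, lam)
```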

Algorithm 1 (MERR_SVD++)

Input: training set $X$; learning rate $\alpha$; regularization coefficient $\lambda$; number of features $d$; maximal number of iterations itermax.
Output: feature matrices $U$, $V$, $W$.

    for u = 1, 2, ..., m do
        index the relevant items for user u: N(u) = {j | X_uj > 0, 0 <= j <= n}
    end
    initialize U^(0), V^(0), and W^(0) with random values; set t = 0
    repeat
        for u = 1, 2, ..., m do
            update U_u: U_u^(t+1) = U_u^(t) + alpha * dL/dU_u^(t)
            for i in N(u) do
                update V_i, W_i: V_i^(t+1) = V_i^(t) + alpha * dL/dV_i^(t);
                                 W_i^(t+1) = W_i^(t) + alpha * dL/dW_i^(t)
            end
        end
        t = t + 1
    until t >= itermax
    U = U^(t); V = V^(t); W = W^(t)

4.3. Computational Complexity. Here we first analyze the complexity of the learning process for one iteration. By exploiting the data sparseness in $X$, the computational complexity of the gradient in (14) is $O(d\bar{n}^2 m + d\bar{n}m)$. Note that $\bar{n}$ denotes the average number of relevant items across all the users. The computational complexity of the gradient in (16) is also $O(d\bar{n}^2 m + d\bar{n}m)$, and the computational complexity of the gradient in (15) is likewise $O(d\bar{n}^2 m + d\bar{n}m)$. Hence, the complexity of the learning algorithm in one iteration is of the order $O(d\bar{n}^2 m)$. In the case that $\bar{n}$ is a small number, that is, $\bar{n}^2 \ll m$, the complexity is linear in the number of users in the data collection. Note that we have $s = \bar{n}m$, in which $s$ denotes the number of nonzeros in the user-item matrix; the complexity of the learning algorithm is then $O(d\bar{n}s)$. Since we usually have $\bar{n} \ll s$, the complexity is $O(d\bar{n}s)$ even in the case that $\bar{n}$ is large, that is, linear in the number of nonzeros (i.e., relevant observations in the data). In sum, our analysis shows that MERR_SVD++ is suitable for large-scale use cases. Note that we also empirically verify the complexity of the learning algorithm in Section 5.4.3.

5. Experiment

5.1. Datasets. We use two datasets for the experiments. The first is the MovieLens 1 million dataset (ML1m) [18], which contains ca. 1M ratings (1–5 scale) from ca. 6K users on 3.7K movies. The sparseness of the ML1m dataset is 95.53%. The second dataset is the Netflix dataset [2, 16, 17], which contains 100,000,000 ratings (1–5 scale) from 480,189 users on 17,770 movies. Due to the huge size of the Netflix data, we extract a subset of 10,000 users and 10,000 movies in which each user has rated more than 100 different movies.


5.2. Evaluation Metrics. Just as has been justified by [17], NDCG is a very suitable evaluation metric for personalized ranking algorithms which combine explicit and implicit feedback. Moreover, our proposed MERR_SVD++ algorithm exploits both explicit and implicit feedback simultaneously and optimizes the well-known personalized ranking evaluation metric Expected Reciprocal Rank (ERR). We therefore use NDCG and ERR as the evaluation metrics for the predictive quality of the models in this paper.

NDCG is one of the most widely used measures for ranking problems. To define $\mathrm{NDCG}_u$ for a user $u$, one first needs to define $\mathrm{DCG}_u$:

$$\mathrm{DCG}_u = \sum_{i=1}^{m} \frac{2^{\mathrm{pref}(i)} - 1}{\log(i + 1)} \tag{17}$$

where $\mathrm{pref}(i)$ is a binary indicator returning 1 if the $i$th item is preferred and 0 otherwise. $\mathrm{DCG}_u$ is then normalized by the ideal ranked list into the interval $[0, 1]$:

$$\mathrm{NDCG}_u = \frac{\mathrm{DCG}_u}{\mathrm{IDCG}_u} \tag{18}$$

where $\mathrm{IDCG}_u$ denotes the DCG of the ideal ranked list, that is, the list sorted exactly according to the user's tastes, with positive items placed at the head of the ranked list. The NDCG of all the users is the mean score over all users.

ERR is a generalized version of Reciprocal Rank (RR) designed to be used with multiple relevance level data (e.g., ratings). It has similar properties to RR in that it strongly emphasizes the relevance of results returned at the top of the list. Using the definition of ERR in [18], we can define ERR for a ranked item list of user $u$ as follows:

$$\mathrm{ERR}_u = \sum_{i=1}^{n} \frac{r_{ui}}{R_{ui}} \prod_{j=1}^{n} \left(1 - r_{uj}\, I\!\left(R_{uj} < R_{ui}\right)\right) \tag{19}$$

Similar to NDCG, the ERR of all the users is the mean score over all users.

Since in recommender systems the user's satisfaction is dominated by only a few items at the top of the recommendation list, our evaluation in the following experiments focuses on the performance of the top-5 recommended items, that is, NDCG@5 and ERR@5.
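To make the evaluation protocol concrete, below is a minimal sketch of both metrics at cutoff 5. It assumes base-2 logarithms in the DCG discount of (17) (a common convention; the base is not stated here) and reuses the rating-to-relevance mapping of (6); all function names are illustrative:

```python
import numpy as np

def relevance(ratings, x_max=5):
    # rating -> relevance probability mapping (6); unrated entries (0) map to 0
    return (2.0 ** np.asarray(ratings, dtype=float) - 1.0) / 2.0 ** x_max

def err_at_k(ranked_rel, k=5):
    """ERR@k per (19): ranked_rel lists the relevance probabilities r_ui of
    the recommended items in ranked order (best-ranked item first)."""
    err, p_continue = 0.0, 1.0
    for pos, r in enumerate(ranked_rel[:k], start=1):
        err += p_continue * r / pos   # user reaches position pos and stops there
        p_continue *= 1.0 - r         # user is not satisfied and continues
    return err

def ndcg_at_k(ranked_pref, k=5):
    """NDCG@k per (17)-(18) with binary pref(i), using base-2 logarithms."""
    gains = 2.0 ** np.asarray(ranked_pref[:k], dtype=float) - 1.0
    disc = np.log2(np.arange(2, gains.size + 2))
    dcg = float((gains / disc).sum())
    ideal = np.sort(np.asarray(ranked_pref, dtype=float))[::-1][:k]
    idcg = float(((2.0 ** ideal - 1.0) / np.log2(np.arange(2, ideal.size + 2))).sum())
    return dcg / idcg if idcg > 0 else 0.0

# toy usage: ratings of the top-5 ranked items (0 = unrated) and binary preferences
print(err_at_k(relevance([5, 3, 0, 4, 1])))
print(ndcg_at_k([1, 0, 0, 1, 0]))
```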

5.3. Experiment Setup. For each dataset, we randomly selected 5 rated items (movies) and 1000 unrated items (movies) for each user to form a test set. We then randomly selected a varying number of rated items from the rest to form a training set. For example, just as in [14, 18], under the condition of "Given 5" we randomly selected 5 rated items (disjoint from the items in the test set) for each user in order to generate a training set. We investigated a variety of "Given" conditions for the training sets, that is, 5, 10, and 15 for the ML1m dataset and 10, 20, and 30 for the extracted Netflix dataset. The recommendation lists generated for each user are compared to the ground truth in the test set in order to measure the performance.

All the models were implemented in MATLAB R2009a. For MERR_SVD++, the value of the regularization parameter $\lambda$ was selected from the range $\{10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 10^{-1}, 1, 10\}$, and the optimal parameter value was used. The learning rate $\alpha$ was selected from the set $A = \{\alpha \mid \alpha = 0.0001 \times 2^{c},\ \alpha \le 0.5,\ c > 0,\ c \text{ a constant}\}$, and the optimal parameter value was also used. In order to compare their performances fairly, we set the number of features to 10 for all matrix factorization models. The optimal values of all parameters for all the baseline models were determined individually; more detailed parameter settings for all the baselines can be found in the corresponding references. For all the algorithms used in our experiments, we repeated the experiment 5 times under each of the different conditions of each dataset, and the performances reported were averaged across the 5 runs.
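A simple way to realize this selection is an exhaustive grid search over $\lambda$ and $\alpha$. The sketch below enumerates the stated grids; `train_and_validate` is a hypothetical stand-in for training MERR_SVD++ and scoring it on a validation split (here a dummy score so the snippet runs):

```python
import itertools

def train_and_validate(lam, alpha, d=10):
    # hypothetical stand-in: train MERR_SVD++ (Algorithm 1) with these
    # hyperparameters and return validation ERR@5; dummy score for illustration
    return -abs(lam - 1e-3) - abs(alpha - 0.0128)

lambdas = [1e-5, 1e-4, 1e-3, 1e-2, 1e-1, 1.0, 10.0]  # grid from Section 5.3
alphas, a = [], 0.0001
while a * 2 <= 0.5:        # enumerate alpha = 0.0001 * 2^c <= 0.5 with c > 0
    a *= 2
    alphas.append(a)

best = max(itertools.product(lambdas, alphas),
           key=lambda p: train_and_validate(*p))
print("best (lambda, alpha):", best)
```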

5.4. Experiment Results. In this section we present a series of experiments to evaluate MERR_SVD++. We designed the experiments in order to address the following research questions:

(1) Does the proposed MERR_SVD++ outperform state-of-the-art personalized ranking approaches for top-N recommendation?

(2) Does the performance of MERR_SVD++ improve when we only increase the number of implicit feedback data for each user?

(3) Is MERR_SVD++ scalable to large-scale use cases?

5.4.1. Performance Comparison. We compare the performance of MERR_SVD++ with that of five baseline algorithms. The approaches we compare with are listed below.

(i) Co-Rating [17]: a state-of-the-art CF model that can be trained on explicit and implicit feedback simultaneously.

(ii) SVD++ [16]: the first proposed CF model that combines explicit and implicit feedback.

(iii) xCLiMF [18]: a state-of-the-art PR approach which aims at directly optimizing ERR for top-N recommendation in domains with explicit feedback data (e.g., ratings).

(iv) CofiRank [29]: a PR approach that optimizes the NDCG measure [28] for domains with explicit feedback data (e.g., ratings). The implementation is based on the publicly available software package from the authors.

(v) CLiMF [14]: a state-of-the-art PR approach that optimizes the Mean Reciprocal Rank (MRR) measure [31] for domains with implicit feedback data (e.g., click, follow). We use the explicit feedback datasets by binarizing the rating values with a threshold: on the ML1m dataset and the extracted Netflix dataset we take ratings 4 and 5 (the highest two relevance levels), respectively, as the relevance threshold for top-N recommendation.

The results of the experiments on the ML1m and the extracted Netflix datasets are shown in Figure 4. Rows denote the varieties of "Given" conditions for the training sets based on the ML1m and the extracted Netflix datasets, and columns denote the quality of NDCG and ERR. Figure 4 shows that MERR_SVD++ outperforms the baseline approaches in terms of both ERR and NDCG in all of the cases. The results show that the improvement in ERR aligns consistently with the improvement in NDCG, indicating that optimizing ERR does not degrade the utility of recommendations as captured by the NDCG measure. It can be seen that the relative performance of MERR_SVD++ improves as the number of observed ratings from the users increases. This result indicates that MERR_SVD++ can learn better top-N recommendation models if more observations of the graded relevance data from users can be used. The results also reveal that it is difficult to model user preferences encoded in multiple levels of relevance with limited observations, in particular when the number of observations is lower than the number of relevance levels.

Figure 4: The performance comparison of MERR_SVD++ and the baselines: (a) NDCG@5 on the ML1m dataset; (b) ERR@5 on the ML1m dataset; (c) NDCG@5 on the extracted Netflix dataset; (d) ERR@5 on the extracted Netflix dataset, each plotted against the "Given" conditions for the training sets, with curves for CLiMF, CofiRank, xCLiMF, SVD++, Co-Rating, and MERR_SVD++.

Compared to Co-Rating, which is based on rating prediction, MERR_SVD++ is based on ranking prediction and succeeds in enhancing the top-ranked performance by optimizing ERR. As reported in [17], the performance of SVD++ is slightly weaker than that of Co-Rating, because the SVD++ model only attempts to approximate the observed ratings and does not model the preferences expressed in implicit feedback. The results also show that Co-Rating and SVD++ significantly outperform xCLiMF and CofiRank, which confirms our belief that implicit feedback can indeed complement explicit feedback. It can be seen in Figure 4 that xCLiMF significantly outperforms CLiMF in terms of both ERR and NDCG across all the settings of relevance thresholds and datasets. The results indicate that the information loss from binarizing multilevel relevance data inevitably makes recommendation models based on binary relevance data, such as CLiMF, suboptimal for use cases with explicit feedback data.

Hence, we give a positive answer to our first research question.

Figure 5: The influence of implicit feedback on the performance of MERR_SVD++: ERR@5 and NDCG@5 on the extracted Netflix dataset ("Given 10") versus the increased number of implicit feedback entries per user.

5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ can be seen in Figure 5. Here $N(u) = R(u) + M(u)$, where $M(u)$ denotes the set of all items for which user $u$ only gave implicit feedback. Rows denote the increased numbers of implicit feedback entries for each user, and columns denote the quality of ERR and NDCG. In our experiment the increased number of implicit feedback entries is the same for each user, and we use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the ERR and NDCG of MERR_SVD++ improve synchronously and linearly with the increase of implicit feedback for each user, which confirms our belief that implicit feedback can indeed complement explicit feedback.

With this experimental result, we give a positive answer to our second research question.

5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for training sets at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we used different numbers of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set, together with their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under each condition increases almost linearly with the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is also shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.

The observations from this experiment allow us to answer our last research question positively.

Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set: runtime per iteration (seconds) versus the percentage of users used for training (10%–100%), under the "Given 5", "Given 10", and "Given 15" conditions.

6. Conclusion and Future Work

The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. Until now, nobody had studied personalized ranking algorithms exploiting both explicit and implicit feedback. In order to overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ scales linearly with the number of ratings. Because of its high precision and good extensibility, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the


field of internet information recommendation. Moreover, because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.

For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep problem and the cold start problem of personalized recommendation. We would also like to explore more useful information from explicit feedback and implicit feedback simultaneously.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).

References

[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.

[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.

[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.

[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.

[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.

[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.

[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.

[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.

[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.

[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.

[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.

[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.

[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.

[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.

[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.

[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.

[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.

[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.

[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.

[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.

[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.

[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.

[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.

[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.

[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.

[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.

[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.

[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (ACM SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.

[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank–maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.

[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.

[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text REtrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.


Page 4: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

4 Mathematical Problems in Engineering

list from top to bottom and stops when a result is found thatfully satisfies the userrsquos information need The usefulness ofan item at rank 119894 is dependent on the usefulness of the itemsat rank less than 119894 Reciprocal Rank (RR) is an importantevaluationmetric in the research field of information retrieval[14] and it strongly emphasizes the relevance of resultsreturned at the top of the list The ERR measure is ageneralized version of RR designed to be used with multiplerelevance level data (eg ratings) ERR has similar propertiesto RR in that it strongly emphasizes the relevance of resultsreturned at the top of the list

Using the definition of ERR in [18 30] we can describeERR for a ranked item list of user 119906 as follows

ERR119906=

119899

sum

119894=1

119875

119906119894

119877

119906119894

(3)

where119877119906119894denotes the rank position of item 119894 for user 119906 when

all items are ranked in descending order of the predictedrelevance values And119875

119906119894denotes the probability that the user

119906 stops at position 119894 As in [30] 119875119906119894is defined as follows

119875

119906119894= 119903

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (4)

where 119868(119909) is an indicator function equal to 1 if the conditionis true and otherwise 0 And 119903

119906119894denotes the probability that

user 119906 finds the item 119894 relevant Substituting (4) into (3) weobtain the calculation formula of ERR

119906

ERR119906=

119899

sum

119894=1

119903

119906119894

119877

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (5)

We use a mapping function similar to the one used in[18] to convert ratings (or levels of relevance in general) toprobabilities of relevance as follows

119903

119906119894=

2

119883119906119894minus 1

2

119883max 119868

119906119894gt 0

0 119868

119906119894= 0

(6)

where 119868

119906119894is an indicator function Note that 119868

119906119894gt 0 (119868

119906119894=

0) indicates that user 119906rsquos preference to item 119894 is known(unknown) 119883

119906119894denotes the rating given by user 119906 to item 119894

and119883max is the highest ratingIn this paper we define that 119877(119906) denotes the set of all

items that user 119906 gave explicit feedback so 119873(119906) = 119877(119906) inthe dataset that the users only gave explicit feedback and theimplicit feedback dataset is created by setting all the ratingdata in explicit feedback dataset as 1 A toy example can beseen in Figure 2 If the dataset contains both explicit feedbackdata and implicit feedback data 119873(119906) = 119877(119906) + 119872(119906)119872(119906) denoting the set of all items that user 119906 only gaveimplicit feedback A toy example can be seen in Figure 3The influence of implicit feedback on the performance ofMERR SVD++ can be found in Section 542 If we use thetraditional SVD++ algorithm to unify explicit and implicit

feedback data the prediction formula of 119883119906119894in SVD++ can

be defined as

119883

119906119894= (119880

119906+ |119873 (119906)|

minus12sum

119895isin119873(119906)

119882

119895)

119879

119881

119894

(7)

So far through the introduction of SVD++ we canexploit both explicit and implicit feedback simultaneouslyby optimizing evaluation metric ERR So we call our modelMERR SVD++

Note that the value of rank 119877

119906119894depends on the value of

119883

119906119894 For example if the predicted relevance value

119883

119906119894of item

119894 is the second highest among all the items for user 119906 then wewill have 119877

119906119894= 2

Similar to other ranking measures such as RR ERR isalso nonsmooth with respect to the latent factors of users anditems that is 119880 119881 and 119882 It is thus impossible to optimizeERRdirectly using conventional optimization techniquesWethus employ smoothing techniques that were also used inCLiMF [14] and xCLiMF [18] to attain a smoothed versionof ERR In particular we approximate the rank-based terms1119877

119906119894and 119868(119877

119906119895lt 119877

119906119894) in (5) by smooth functions with

respect to the model parameters 119880 119881 and 119882 The approx-imate formula is as follows

119868 (119877

119906119895lt 119877

119906119894) asymp 119892 (

119883

119906119895minus

119883

119906119894)

1

119877

119906119894

= 119892 (

119883

119906119894)

(8)

where 119892(119909) is a logistic function that is 119892(119909) = 1(1 + 119890

minus119909)

Substituting (8) into (5) we obtain a smoothed approxi-mation of ERR

119906

ERR119906=

119899

sum

119894=1

119903

119906119894119892 (

119883

119906119894)

119899

prod

119895=1

(1 minus 119903

119906119895119892 (

119883

119906(119895minus119894))) (9)

Note that for notation convenience we make use of thesubstitution

119883

119906(119895minus119894)=

119883

119906119895minus

119883

119906119894

Given the monotonicity of the logarithm function themodel parameters that maximize (9) are equivalent to theparameters that maximize ln((1|119873(119906)|)ERR

119906) Specifically

we have

119880

119906 119881119882 = arg max

119880119906119881119882

ERR119906

= arg max119880119906119881119882

ln ((1119873 (119906))ERR119906)

= arg max119880119906119881119882

ln(

119899

sum

119895=1

119903

119906119894119892 (

119883

119906119894)

|119873 (119906)|

119899

prod

119895=1

(1 minus 119903

119906119895119892 (

119883

119906(119895minus119894))))

(10)

Mathematical Problems in Engineering 5

Item

Use

r

2

5

4

4 3

(a)

Item

Use

r

1

1

1

1 1

(b)

Figure 2 A toy example of the dataset that the users only gave explicit feedback (a) denotes the explicit feedback dataset (b) denotes theimplicit feedback dataset

Item

Use

r

2

5

4

4 3

(a)

Item

Use

r1 1 1

1 1

1

1 1

(b)

Figure 3 A toy example of the dataset that contains both explicit feedback data and implicit feedback data (a) denotes the explicit feedbackdataset (b) denotes the implicit feedback dataset and the numbers in bold denote the dataset that users only gave implicit feedback

Based on Jensenrsquos inequality and the concavity of thelogarithm function in a similar manner to [14 18] we derivethe lower bound of ln((1|119873(119906)|)ERR

119906) as follows

ln(

1

|119873 (119906)|

ERR119906)

= ln[

[

119899

sum

119895=1

119903

119906119894119892 (

119883

119906119894)

|119873 (119906)|

119899

prod

119895=1

(1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

ge

1

|119873 (119906)|

sdot

119899

sum

119895=1

119903

119906119894ln[

[

119892 (

119883

119906119894)

119899

prod

119895=1

(1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

=

1

|119873 (119906)|

sdot

119899

sum

119895=1

119903

119906119894[

[

ln119892 (

119883

119906119894) +

119899

sum

119895=1

ln (1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

(11)

We can neglect the constant 1|119873(119906)| in the lower bound andobtain a new objective function

119871 (119880

119906 119881119882)

=

119899

sum

119895=1

119903

119906119894[

[

ln119892 (

119883

119906119894) +

119899

sum

119895=1

ln (1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

(12)

Taking into account all119898 users and using the Frobenius normof the latent factors for regularization we obtain the objectivefunction of MERR SVD++

119871 (119880119881119882) =

119898

sum

119906

119899

sum

119894=1

119903

119906119894[

[

ln119892 (

119883

119906119894)

+

119899

sum

119895=1

ln (1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

minus

120582

2

(119880

2+ 119881

2

+ 119882

2)

(13)

6 Mathematical Problems in Engineering

in which 120582 denotes the regularization coefficient and 119880

denotes the Frobenius norm of119880 Note that the lower bound119871(119880 119881119882) is much less complex than the original objectivefunction in (9) and standard optimization methods forexample gradient ascend can be used to learn the optimalmodel parameters 119880 119881 and119882

42 Optimization We can now maximize the objectivefunction (13) with respect to the latent factorsarg max

119880119881119882119871(119880119881119882) Note that 119871(119880119881119882) represents an

approximation of the mean value of ERR across all the usersWe can thus remove the constant coefficient 1119898 since ithas no influence on the optimization of 119871(119880 119881119882) Since theobjective function is smooth we can use gradient ascent forthe optimization The gradients can be derived in a similarmanner to xCLiMF [18] as shown in the following

120597119871

120597119880

119906

=

119899

sum

119894=1

119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894

+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894) (119881

119894minus 119881

119895)

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

]

]

minus 120582119880

119906

(14)

120597119871

120597119881

119894

= 119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894)

sdot (

1

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

minus

1

1 minus 119903

119906119894119892 (

119883

119906119894minus

119883

119906119895)

)

]

]

(119880

119906+ |119873 (119906)|

minus12

sdot sum

119896isin119873(119906)

119882

119896) minus 120582119881

119894

(15)

120597119871

120597119882

119894

= 119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894

+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894) (119881

119894minus 119881

119895)

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

]

]

minus 120582119882

119894

(16)

The learning algorithm for the MERR SVD++ model isoutlined in Algorithm 1

The published research papers in [8ndash11 13 18] showthat the use of the normal distribution 119873(0 001) for theinitialization of feature matrix is very effective so we still usethis approach for the initialization of 119880 119881 and 119882 in ourproposed algorithm

43 Computational Complexity Here we first analyze thecomplexity of the learning process for one iteration Byexploiting the data sparseness in 119883 the computational com-plexity of the gradient in (14) is 119874(119889

2119898 + 119889119898) Note that

Input Training set119883 learning rate 120572 regularization 120582The number of the feature 119889 and the maximal numberOf iterations itermaxOutput feature matrix 119880 119881119882for 119906 = 1 2 119898 doIndex relevant items for user 119906119873(119906) = 119895 | 119883

119894119895gt 0 0 le 119895 le 119899

endInitialize 119880(0) 119881(0) and119882

(0) with random valuesand 119905 = 0repeatfor 119906 = 1 2 119898 doUpdate 119880

119906

119880

(119905+1)

119906= 119880

(119905)

119906+ 120572

120597119871

120597119880

(119905)

119906

for 119894 isin 119873(119906) doUpdate 119881

119894119882119894

119881

119894

(119905+1)= 119881

119894

(119905)+ 120572

120597119871

120597119881

119894

(119905)

119882

119894

(119905+1)= 119882

119894

(119905)+ 120572

120597119871

120597119882

119894

(119905)

EndEnd

119905 = 119905 + 1until 119905 ge itermax119880 = 119880

(119905) 119881 = 119881

(119905)119882 = 119882

(119905)

Algorithm 1 MERR SVD++

denotes the average number of relevant items across all theusers The computational complexity of the gradient in (16)is also 119874(119889

2119898 + 119889119898) The computational complexity of the

gradient in (15) is 119874(119889

2119898 + 119889119898) Hence the complexity

of the learning algorithm in one iteration is in the order of119874(119889

2119898) In the case that is a small number that is 2 ≪ 119898

the complexity is linear to the number of users in the datacollection Note that we have 119904 = 119898 in which 119904 denotes thenumber of nonzeros in the user-item matrix The complexityof the learning algorithm is then 119874(119889119904) Since we usuallyhave ≪ 119904 the complexity is 119874(119889119904) even in the case that is large that is being linear to the number of nonzeros (ierelevant observations in the data) In sum our analysis showsthat MERR SVD++ is suitable for large scale use cases Notethat we also empirically verify the complexity of the learningalgorithm in Section 543

5 Experiment

51 Datasets We use two datasets for the experiments Thefirst is the MovieLens 1 million dataset (ML1m) [18] whichcontains ca 1M ratings (1ndash5 scale) from ca 6K users and 37Kmovies The sparseness of the ML1m dataset is 9553 Thesecond dataset is the Netflix dataset [2 16 17] which contains100000000 ratings (1ndash5 scale) from 480189 users on 17770movies Due to the huge size of the Netflix data we extract asubset of 10000 users and 10000 movies in which each userhas rated more than 100 different movies

Mathematical Problems in Engineering 7

52 Evaluation Metrics Just as has been justified by [17]NDCG is a very suitable evaluation metric for personalizedranking algorithmswhich combine explicit and implicit feed-back And our proposed MERR SVD++ algorithm exploitsboth explicit and implicit feedback simultaneously andoptimizes the well-known personalized ranking evaluationmetric Expected Reciprocal Rank (ERR) So we use theNDCG and ERR as evaluation metrics for the predictabilityof models in this paper

NDCG is another most widely used measure for rankingproblems To define NDCG

119906for a user 119906 one first needs to

define DCG119906

DCG119906=

119898

sum

119894=1

2

pref(119894)minus 1

log (119894 + 1)

(17)

where pref(119894) is a binary indicator returning 1 if the 119894th item ispreferred and 0 otherwise DCG

119906is then normalized by the

ideal ranked list into the interval [0 1]

NDCG119906=

DCG119906

IDCG119906

(18)

where IDCG119906denotes that the ranked list is sorted exactly

according to the userrsquos tastes positive items are placed at thehead of the ranked listTheNDCGof all the users is themeanscore of each user

ERR is a generalized version of Reciprocal Rank (RR)designed to be used with multiple relevance level data (egratings) It has similar properties to RR in that it stronglyemphasizes the relevance of results returned at the top of thelist Using the definition of ERR in [18] we can define ERRfor a ranked item list of user 119906 as follows

ERR119906=

119899

sum

119894=1

119903

119906119894

119877

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (19)

Similar to NDCG the ERR of all the users is the mean scoreof each user

Since in recommender systems the userrsquos satisfaction isdominated by only a few items on the top of the recommenda-tion list our evaluation in the following experiments focuseson the performance of top-5 recommended items that isNDCG5 and ERR5

53 Experiment Setup For each dataset we randomlyselected 5 rated items (movies) and 1000 unrated items(movies) for each user to form a test set We then randomlyselected a varying number of rated items from the rest toform a training set For example just as in [14 18] underthe condition of ldquoGiven 5rdquo we randomly selected 5 rateditems (disjoint to the items in the test set) for each user inorder to generate a training set We investigated a variety ofldquoGivenrdquo conditions for the training sets that is 5 10 and 15for the ML1m dataset and 10 20 and 30 for the extractedNetflix dataset Generated recommendation lists for each userare compared to the ground truth in the test set in order tomeasure the performance

All the models were implemented in MATLAB R2009aForMERR SVD++ the value of the regularization parameter

120582 was selected from range 10

minus5 10

minus4 10

minus3 10

minus2 10

minus1 1 10

and optimal parameter value was used And the learning rate120572 was selected from set 119860 119860 = 120572 | 120572 = 00001 times 2

119888 120572 le

05 119888 gt 0 and 119888 is a constant and the optimal parametervalue was also used In order to compare their performancesfairly for all matrix factorization models we set the numberof features to be 10 The optimal values of all parametersfor all the baseline models used are determined individuallyMore detailed setting methods of the parameters for all thebaselines can be found in the corresponding references Forall the algorithms used in our experiments we repeated theexperiment 5 times for each of the different conditions of eachdataset and the performances reported were averaged across5 runs

54 Experiment Results In this section we present a seriesof experiments to evaluate MERR SVD++ We designedthe experiments in order to address the following researchquestions

(1) Does the proposed MERR SVD++ outperform state-of-the-art personalized ranking approaches for top-Nrecommendation

(2) Does the performance of MERR SVD++ improvewhen we only increase the number of implicit feed-back data for each user

(3) Is MERR SVD++ scalable for large scale use cases

541 Performance Comparison We compare the perfor-mance of MERR SVD++ with that of five baseline algo-rithms The approaches we compare with are listed below

(i) Co-Rating [17] a state-of-the-art CF model thatcan be trained from explicit and implicit feedbacksimultaneously

(ii) SVD++ [16] the first proposed CF model that com-bines explicit and implicit feedback

(iii) xCLiMF [18] a state-of-the-art PR approach whichaims at directly optimizing ERR for top-N recom-mendation in domains with explicit feedback data(eg ratings)

(iv) CofiRank [29] a PR approach that optimizes theNDCG measure [28] for domains with explicit feed-back data (eg ratings) The implementation is basedon the publicly available software package from theauthors

(v) CLiMF [14] a state-of-the-art PR approach that opti-mizes the Mean Reciprocal Rank (MRR) measure[31] for domains with implicit feedback data (egclick follow) We use the explicit feedback datasetsby binarizing the rating values with a threshold Onthe ML1m dataset and the extracted Netflix datasetwe take ratings 4 and 5 (the highest two relevancelevels) respectively as the relevance threshold fortop-N recommendation

The results of the experiments on the ML1m and theextractedNetflix datasets are shown in Figure 4 Rows denote

8 Mathematical Problems in Engineering

0 Given 5 Given 10 Given 150

001

002

003

004

005

006The results on the ML1m dataset

The varieties of ldquoGivenrdquo conditions for the training sets

ND

CG

5

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(a)

0 Given 5 Given 10 Given 150

001

002

003

004

005

006

007The results on the ML1m dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(b)

0 Given 10 Given 20 Given 300

002

004

006

008

010

012The results on the extracted Netflix dataset

ND

CG

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(c)

0 Given 10 Given 20 Given 300

001

002

003

004

005

006

007

008The results on the extracted Netflix dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(d)

Figure 4 The performance comparison of MERR SVD++ and baselines

the varieties of ldquoGivenrdquo conditions for the training sets basedon the ML1m and the extracted Netflix datasets and columnsdenote the quality of NDCG and ERR Figure 4 showsthat MERR SVD++ outperforms the baseline approachesin terms of both ERR and NDCG in all of the cases Theresults show that the improvement of ERR aligns consistentlywith the improvement of NDCG indicating that optimizingERR would not degrade the utility of recommendations thatare captured by the NDCG measure It can be seen thatthe relative performance of MERR SVD++ improves as thenumber of observed ratings from the users increases Thisresult indicates that MERR SVD++ can learn better top-Nrecommendation models if more observations of the gradedrelevance data from users can be used The results alsoreveal that it is difficult to model user preferences encoded

in multiple levels of relevance with limited observations inparticular when the number of observations is lower than thenumber of relevance levels

Compared to Co-Rating which is based on ratingprediction MERR SVD++ is based on ranking predictionand succeeds in enhancing the top-ranked performance byoptimizing ERR As reported in [17] the performance ofSVD++ is slightly weaker than that of Co-Rating which isbecause SVD++ model only attempts to approximate theobserved ratings and does not model preferences expressedin implicit feedback The results also show that Co-Ratingand SVD++ significantly outperform xCLiMF and CofiRankwhich confirms our belief that implicit feedback could indeedcomplement explicit feedback It can be seen in Figure 4that xCLiMF significantly outperforms CLiMF in terms of

Mathematical Problems in Engineering 9

0 10 20 30 40 50005

006

007

008

009

010

The increased numbers of implicit feedback for each user

The v

alue

of e

valu

atio

n m

etric

s

ERR5NDCG5

Figure 5 The influence of implicit feedback on the performance ofMERR SVD++

both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data

Hence we give a positive answer to our first researchquestion

542 The Influence of Implicit Feedback on the Performanceof MERR SVD++ The influence of implicit feedback on theperformance of MERR SVD++ can be found in Figure 5Here 119873(119906) = 119877(119906) + 119872(119906) 119872(119906) denotes the set of allitems that user 119906 only gave implicit feedback Rows denotethe increased numbers of implicit feedback for each userand columns denote the quality of ERR and NDCG In ourexperiment the increased numbers of implicit feedback foreach user are the same We use the extracted Netflix datasetunder the condition of ldquoGiven 10rdquo Figure 5 shows that thequality of ERR and NDCG of MERR SVD++ synchronouslyand linearly improves with the increase of implicit feedbackfor each user which confirms our belief that implicit feedbackcould indeed complement explicit feedback

With this experimental result we give a positive answerto our second research question

543 Scalability The last experiment investigated the scala-bility of MERR SVD++ by measuring the training time thatwas required for the training set at different scales Firstlyas analyzed in Section 43 the computational complexityof MERR SVD++ is linear in the number of users in thetraining set when the average number of items rated peruser is fixed To demonstrate the scalability we used differentnumbers of users in the training set under each conditionwe randomly selected from 10 to 100 users in the trainingset and their rated items as the training data for learning thelatent factors The results on the ML1m dataset are shown inFigure 6 We can observe that the computational time under

10 20 30 40 50 60 70 80 90 1000

5

10

15

20

25

30

35

40

Users for training ()

Runt

ime p

er it

erat

ion

(sec

)

Given 5Given 10Given 15

Figure 6 Scalability analysis of MERR SVD++ in terms of thenumber of users in the training set

each condition increases almost linearly to the increase of thenumber of users Secondly as also discussed in Section 43the computational complexity of MERR SVD++ could befurther approximated to be linear to the amount of knowndata (ie nonzero entries in the training user-item matrix)To demonstrate this we examined the runtime of the learningalgorithm against different scales of the training sets underdifferent ldquoGivenrdquo conditions The result is shown in Figure 6from which we can observe that the average runtime of thelearning algorithm per iteration increases almost linearly asthe number of nonzeros in the training set increases

The observations from this experiment allow us to answer our last research question positively.

6. Conclusion and Future Work

The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset; until now, nobody had studied personalized ranking algorithms that exploit both explicit and implicit feedback. To overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ grows linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the field of internet information recommendation. Moreover, because it exploits both explicit and implicit feedback simultaneously, MERR_SVD++ can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.

For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep and cold start problems of personalized recommendation. We would also like to explore more useful information from explicit and implicit feedback simultaneously.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the Mechanical and Electrical Professional Group Engineering Technology Development Center in Foshan City (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).

References

[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.

[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.

[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.

[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.

[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.

[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.

[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.

[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.

[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.

[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.

[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.

[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.

[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.

[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.

[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.

[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.

[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.

[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.

[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.

[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.

[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.

[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, British Columbia, Canada, 2004.

[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.

[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.

[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.

[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.

[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.

[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.

[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank - maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.

[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.

[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text Retrieval Conference (TREC-8 '99), Gaithersburg, Md, USA, November 1999.


Page 5: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

Mathematical Problems in Engineering 5

Item

Use

r

2

5

4

4 3

(a)

Item

Use

r

1

1

1

1 1

(b)

Figure 2 A toy example of the dataset that the users only gave explicit feedback (a) denotes the explicit feedback dataset (b) denotes theimplicit feedback dataset

Item

Use

r

2

5

4

4 3

(a)

Item

Use

r1 1 1

1 1

1

1 1

(b)

Figure 3 A toy example of the dataset that contains both explicit feedback data and implicit feedback data (a) denotes the explicit feedbackdataset (b) denotes the implicit feedback dataset and the numbers in bold denote the dataset that users only gave implicit feedback

Based on Jensenrsquos inequality and the concavity of thelogarithm function in a similar manner to [14 18] we derivethe lower bound of ln((1|119873(119906)|)ERR

119906) as follows

ln(

1

|119873 (119906)|

ERR119906)

= ln[

[

119899

sum

119895=1

119903

119906119894119892 (

119883

119906119894)

|119873 (119906)|

119899

prod

119895=1

(1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

ge

1

|119873 (119906)|

sdot

119899

sum

119895=1

119903

119906119894ln[

[

119892 (

119883

119906119894)

119899

prod

119895=1

(1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

=

1

|119873 (119906)|

sdot

119899

sum

119895=1

119903

119906119894[

[

ln119892 (

119883

119906119894) +

119899

sum

119895=1

ln (1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

(11)

We can neglect the constant 1|119873(119906)| in the lower bound andobtain a new objective function

119871 (119880

119906 119881119882)

=

119899

sum

119895=1

119903

119906119894[

[

ln119892 (

119883

119906119894) +

119899

sum

119895=1

ln (1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

(12)

Taking into account all119898 users and using the Frobenius normof the latent factors for regularization we obtain the objectivefunction of MERR SVD++

119871 (119880119881119882) =

119898

sum

119906

119899

sum

119894=1

119903

119906119894[

[

ln119892 (

119883

119906119894)

+

119899

sum

119895=1

ln (1 minus 119903

119906119895119892 (

119883

119906(119895minus119894)))

]

]

minus

120582

2

(119880

2+ 119881

2

+ 119882

2)

(13)

6 Mathematical Problems in Engineering

in which 120582 denotes the regularization coefficient and 119880

denotes the Frobenius norm of119880 Note that the lower bound119871(119880 119881119882) is much less complex than the original objectivefunction in (9) and standard optimization methods forexample gradient ascend can be used to learn the optimalmodel parameters 119880 119881 and119882

42 Optimization We can now maximize the objectivefunction (13) with respect to the latent factorsarg max

119880119881119882119871(119880119881119882) Note that 119871(119880119881119882) represents an

approximation of the mean value of ERR across all the usersWe can thus remove the constant coefficient 1119898 since ithas no influence on the optimization of 119871(119880 119881119882) Since theobjective function is smooth we can use gradient ascent forthe optimization The gradients can be derived in a similarmanner to xCLiMF [18] as shown in the following

120597119871

120597119880

119906

=

119899

sum

119894=1

119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894

+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894) (119881

119894minus 119881

119895)

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

]

]

minus 120582119880

119906

(14)

120597119871

120597119881

119894

= 119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894)

sdot (

1

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

minus

1

1 minus 119903

119906119894119892 (

119883

119906119894minus

119883

119906119895)

)

]

]

(119880

119906+ |119873 (119906)|

minus12

sdot sum

119896isin119873(119906)

119882

119896) minus 120582119881

119894

(15)

120597119871

120597119882

119894

= 119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894

+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894) (119881

119894minus 119881

119895)

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

]

]

minus 120582119882

119894

(16)

The learning algorithm for the MERR SVD++ model isoutlined in Algorithm 1

The published research papers in [8ndash11 13 18] showthat the use of the normal distribution 119873(0 001) for theinitialization of feature matrix is very effective so we still usethis approach for the initialization of 119880 119881 and 119882 in ourproposed algorithm

43 Computational Complexity Here we first analyze thecomplexity of the learning process for one iteration Byexploiting the data sparseness in 119883 the computational com-plexity of the gradient in (14) is 119874(119889

2119898 + 119889119898) Note that

Input Training set119883 learning rate 120572 regularization 120582The number of the feature 119889 and the maximal numberOf iterations itermaxOutput feature matrix 119880 119881119882for 119906 = 1 2 119898 doIndex relevant items for user 119906119873(119906) = 119895 | 119883

119894119895gt 0 0 le 119895 le 119899

endInitialize 119880(0) 119881(0) and119882

(0) with random valuesand 119905 = 0repeatfor 119906 = 1 2 119898 doUpdate 119880

119906

119880

(119905+1)

119906= 119880

(119905)

119906+ 120572

120597119871

120597119880

(119905)

119906

for 119894 isin 119873(119906) doUpdate 119881

119894119882119894

119881

119894

(119905+1)= 119881

119894

(119905)+ 120572

120597119871

120597119881

119894

(119905)

119882

119894

(119905+1)= 119882

119894

(119905)+ 120572

120597119871

120597119882

119894

(119905)

EndEnd

119905 = 119905 + 1until 119905 ge itermax119880 = 119880

(119905) 119881 = 119881

(119905)119882 = 119882

(119905)

Algorithm 1 MERR SVD++

denotes the average number of relevant items across all theusers The computational complexity of the gradient in (16)is also 119874(119889

2119898 + 119889119898) The computational complexity of the

gradient in (15) is 119874(119889

2119898 + 119889119898) Hence the complexity

of the learning algorithm in one iteration is in the order of119874(119889

2119898) In the case that is a small number that is 2 ≪ 119898

the complexity is linear to the number of users in the datacollection Note that we have 119904 = 119898 in which 119904 denotes thenumber of nonzeros in the user-item matrix The complexityof the learning algorithm is then 119874(119889119904) Since we usuallyhave ≪ 119904 the complexity is 119874(119889119904) even in the case that is large that is being linear to the number of nonzeros (ierelevant observations in the data) In sum our analysis showsthat MERR SVD++ is suitable for large scale use cases Notethat we also empirically verify the complexity of the learningalgorithm in Section 543

5 Experiment

51 Datasets We use two datasets for the experiments Thefirst is the MovieLens 1 million dataset (ML1m) [18] whichcontains ca 1M ratings (1ndash5 scale) from ca 6K users and 37Kmovies The sparseness of the ML1m dataset is 9553 Thesecond dataset is the Netflix dataset [2 16 17] which contains100000000 ratings (1ndash5 scale) from 480189 users on 17770movies Due to the huge size of the Netflix data we extract asubset of 10000 users and 10000 movies in which each userhas rated more than 100 different movies

Mathematical Problems in Engineering 7

52 Evaluation Metrics Just as has been justified by [17]NDCG is a very suitable evaluation metric for personalizedranking algorithmswhich combine explicit and implicit feed-back And our proposed MERR SVD++ algorithm exploitsboth explicit and implicit feedback simultaneously andoptimizes the well-known personalized ranking evaluationmetric Expected Reciprocal Rank (ERR) So we use theNDCG and ERR as evaluation metrics for the predictabilityof models in this paper

NDCG is another most widely used measure for rankingproblems To define NDCG

119906for a user 119906 one first needs to

define DCG119906

DCG119906=

119898

sum

119894=1

2

pref(119894)minus 1

log (119894 + 1)

(17)

where pref(119894) is a binary indicator returning 1 if the 119894th item ispreferred and 0 otherwise DCG

119906is then normalized by the

ideal ranked list into the interval [0 1]

NDCG119906=

DCG119906

IDCG119906

(18)

where IDCG119906denotes that the ranked list is sorted exactly

according to the userrsquos tastes positive items are placed at thehead of the ranked listTheNDCGof all the users is themeanscore of each user

ERR is a generalized version of Reciprocal Rank (RR)designed to be used with multiple relevance level data (egratings) It has similar properties to RR in that it stronglyemphasizes the relevance of results returned at the top of thelist Using the definition of ERR in [18] we can define ERRfor a ranked item list of user 119906 as follows

ERR119906=

119899

sum

119894=1

119903

119906119894

119877

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (19)

Similar to NDCG the ERR of all the users is the mean scoreof each user

Since in recommender systems the userrsquos satisfaction isdominated by only a few items on the top of the recommenda-tion list our evaluation in the following experiments focuseson the performance of top-5 recommended items that isNDCG5 and ERR5

53 Experiment Setup For each dataset we randomlyselected 5 rated items (movies) and 1000 unrated items(movies) for each user to form a test set We then randomlyselected a varying number of rated items from the rest toform a training set For example just as in [14 18] underthe condition of ldquoGiven 5rdquo we randomly selected 5 rateditems (disjoint to the items in the test set) for each user inorder to generate a training set We investigated a variety ofldquoGivenrdquo conditions for the training sets that is 5 10 and 15for the ML1m dataset and 10 20 and 30 for the extractedNetflix dataset Generated recommendation lists for each userare compared to the ground truth in the test set in order tomeasure the performance

All the models were implemented in MATLAB R2009aForMERR SVD++ the value of the regularization parameter

120582 was selected from range 10

minus5 10

minus4 10

minus3 10

minus2 10

minus1 1 10

and optimal parameter value was used And the learning rate120572 was selected from set 119860 119860 = 120572 | 120572 = 00001 times 2

119888 120572 le

05 119888 gt 0 and 119888 is a constant and the optimal parametervalue was also used In order to compare their performancesfairly for all matrix factorization models we set the numberof features to be 10 The optimal values of all parametersfor all the baseline models used are determined individuallyMore detailed setting methods of the parameters for all thebaselines can be found in the corresponding references Forall the algorithms used in our experiments we repeated theexperiment 5 times for each of the different conditions of eachdataset and the performances reported were averaged across5 runs

54 Experiment Results In this section we present a seriesof experiments to evaluate MERR SVD++ We designedthe experiments in order to address the following researchquestions

(1) Does the proposed MERR SVD++ outperform state-of-the-art personalized ranking approaches for top-Nrecommendation

(2) Does the performance of MERR SVD++ improvewhen we only increase the number of implicit feed-back data for each user

(3) Is MERR SVD++ scalable for large scale use cases

541 Performance Comparison We compare the perfor-mance of MERR SVD++ with that of five baseline algo-rithms The approaches we compare with are listed below

(i) Co-Rating [17] a state-of-the-art CF model thatcan be trained from explicit and implicit feedbacksimultaneously

(ii) SVD++ [16] the first proposed CF model that com-bines explicit and implicit feedback

(iii) xCLiMF [18] a state-of-the-art PR approach whichaims at directly optimizing ERR for top-N recom-mendation in domains with explicit feedback data(eg ratings)

(iv) CofiRank [29] a PR approach that optimizes theNDCG measure [28] for domains with explicit feed-back data (eg ratings) The implementation is basedon the publicly available software package from theauthors

(v) CLiMF [14] a state-of-the-art PR approach that opti-mizes the Mean Reciprocal Rank (MRR) measure[31] for domains with implicit feedback data (egclick follow) We use the explicit feedback datasetsby binarizing the rating values with a threshold Onthe ML1m dataset and the extracted Netflix datasetwe take ratings 4 and 5 (the highest two relevancelevels) respectively as the relevance threshold fortop-N recommendation

The results of the experiments on the ML1m and theextractedNetflix datasets are shown in Figure 4 Rows denote

8 Mathematical Problems in Engineering

0 Given 5 Given 10 Given 150

001

002

003

004

005

006The results on the ML1m dataset

The varieties of ldquoGivenrdquo conditions for the training sets

ND

CG

5

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(a)

0 Given 5 Given 10 Given 150

001

002

003

004

005

006

007The results on the ML1m dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(b)

0 Given 10 Given 20 Given 300

002

004

006

008

010

012The results on the extracted Netflix dataset

ND

CG

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(c)

0 Given 10 Given 20 Given 300

001

002

003

004

005

006

007

008The results on the extracted Netflix dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(d)

Figure 4 The performance comparison of MERR SVD++ and baselines

the varieties of ldquoGivenrdquo conditions for the training sets basedon the ML1m and the extracted Netflix datasets and columnsdenote the quality of NDCG and ERR Figure 4 showsthat MERR SVD++ outperforms the baseline approachesin terms of both ERR and NDCG in all of the cases Theresults show that the improvement of ERR aligns consistentlywith the improvement of NDCG indicating that optimizingERR would not degrade the utility of recommendations thatare captured by the NDCG measure It can be seen thatthe relative performance of MERR SVD++ improves as thenumber of observed ratings from the users increases Thisresult indicates that MERR SVD++ can learn better top-Nrecommendation models if more observations of the gradedrelevance data from users can be used The results alsoreveal that it is difficult to model user preferences encoded

in multiple levels of relevance with limited observations inparticular when the number of observations is lower than thenumber of relevance levels

Compared to Co-Rating which is based on ratingprediction MERR SVD++ is based on ranking predictionand succeeds in enhancing the top-ranked performance byoptimizing ERR As reported in [17] the performance ofSVD++ is slightly weaker than that of Co-Rating which isbecause SVD++ model only attempts to approximate theobserved ratings and does not model preferences expressedin implicit feedback The results also show that Co-Ratingand SVD++ significantly outperform xCLiMF and CofiRankwhich confirms our belief that implicit feedback could indeedcomplement explicit feedback It can be seen in Figure 4that xCLiMF significantly outperforms CLiMF in terms of

Mathematical Problems in Engineering 9

0 10 20 30 40 50005

006

007

008

009

010

The increased numbers of implicit feedback for each user

The v

alue

of e

valu

atio

n m

etric

s

ERR5NDCG5

Figure 5 The influence of implicit feedback on the performance ofMERR SVD++

both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data

Hence we give a positive answer to our first researchquestion

542 The Influence of Implicit Feedback on the Performanceof MERR SVD++ The influence of implicit feedback on theperformance of MERR SVD++ can be found in Figure 5Here 119873(119906) = 119877(119906) + 119872(119906) 119872(119906) denotes the set of allitems that user 119906 only gave implicit feedback Rows denotethe increased numbers of implicit feedback for each userand columns denote the quality of ERR and NDCG In ourexperiment the increased numbers of implicit feedback foreach user are the same We use the extracted Netflix datasetunder the condition of ldquoGiven 10rdquo Figure 5 shows that thequality of ERR and NDCG of MERR SVD++ synchronouslyand linearly improves with the increase of implicit feedbackfor each user which confirms our belief that implicit feedbackcould indeed complement explicit feedback

With this experimental result we give a positive answerto our second research question

543 Scalability The last experiment investigated the scala-bility of MERR SVD++ by measuring the training time thatwas required for the training set at different scales Firstlyas analyzed in Section 43 the computational complexityof MERR SVD++ is linear in the number of users in thetraining set when the average number of items rated peruser is fixed To demonstrate the scalability we used differentnumbers of users in the training set under each conditionwe randomly selected from 10 to 100 users in the trainingset and their rated items as the training data for learning thelatent factors The results on the ML1m dataset are shown inFigure 6 We can observe that the computational time under

10 20 30 40 50 60 70 80 90 1000

5

10

15

20

25

30

35

40

Users for training ()

Runt

ime p

er it

erat

ion

(sec

)

Given 5Given 10Given 15

Figure 6 Scalability analysis of MERR SVD++ in terms of thenumber of users in the training set

each condition increases almost linearly to the increase of thenumber of users Secondly as also discussed in Section 43the computational complexity of MERR SVD++ could befurther approximated to be linear to the amount of knowndata (ie nonzero entries in the training user-item matrix)To demonstrate this we examined the runtime of the learningalgorithm against different scales of the training sets underdifferent ldquoGivenrdquo conditions The result is shown in Figure 6from which we can observe that the average runtime of thelearning algorithm per iteration increases almost linearly asthe number of nonzeros in the training set increases

The observations from this experiment allow us to answerour last research question positively

6 Conclusion and Future Work

The problem of the previous researches on personalizedranking is that they focused on either explicit feedbackdata or implicit feedback data rather than making full useof the information in the dataset Until now nobody hasstudied personalized ranking algorithm by exploiting bothexplicit and implicit feedback In order to overcome thedefects of prior researches in this paper we have presenteda new personalized ranking algorithm (MERR SVD++) byexploiting both explicit and implicit feedback simultaneouslyMERR SVD++ optimizes the well-known evaluation metricExpected Reciprocal Rank (ERR) and is based on the newestxCLiMF model and SVD++ algorithm Experimental resultson practical datasets showed that our proposed algorithmoutperformed existing personalized ranking algorithms overdifferent evaluation metrics and that the running time ofMERR SVD++ showed a linear correlation with the num-ber of rating Because of its high precision and the goodexpansibility MERR SVD++ is suitable for processing bigdata and can greatly improve the recommendation speedand validity by solving the latency problem of personalizedrecommendation and has wide application prospect in the

10 Mathematical Problems in Engineering

field of internet information recommendation And becauseMERR SVD++ exploits both explicit and implicit feedbacksimultaneously MERR SVD++ can solve the data sparsityand imbalance problems of personalized ranking algorithmsto a certain extent

For futurework we plan to extend our algorithm to richerones so that our algorithm can solve the grey sheep problemand cold start problem of personalized recommendationAlso we would like to explore more useful information fromthe explicit feedback and implicit feedback simultaneously

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is sponsored in part by the National NaturalScience Foundation of China (nos 61370186 6140326461402122 61003140 61033010 and 61272414) Science andTechnology Planning Project of Guangdong Province (nos2014A010103040 and 2014B010116001) Science and Tech-nology Planning Project of Guangzhou (nos 2014J4100032and 201510010203) the Ministry of Education and ChinaMobile Research Fund (no MCM20121051) the secondbatch open subject of mechanical and electrical professionalgroup engineering technology development center in Foshancity (no 2015-KJZX139) and the 2015 Research BackboneTeachers Training Program of Shunde Polytechnic (no 2015-KJZX014)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge and DataEngineering vol 17 no 6 pp 734ndash749 2005

[2] S Ahn A Korattikara and N Liu ldquoLarge-scale distributedBayesian matrix factorization using stochastic gradientMCMCrdquo in Proceedings of the 21th ACMSIGKDDConference onKnowledgeDiscovery andDataMining pp 401ndash410 ACMPressSydney Australia August 2015

[3] S Balakrishnan and S Chopra ldquoCollaborative rankingrdquo in Pro-ceedings of the 5th ACM International Conference onWeb Searchand DataMining (WSDM rsquo12) pp 143ndash152 IEEE Seattle WashUSA February 2012

[4] Q Yao and J Kwok ldquoAccelerated inexact soft-impute for fastlarge-scale matrix completionrdquo in Proceedings of the 24rd Inter-national Joint Conference on Artificial Intelligence pp 4002ndash4008 ACM Press Buenos Aires Argentina July 2015

[5] Q Diao M Qiu C-Y Wu A J Smola J Jiang and C WangldquoJointly modeling aspects ratings and sentiments for movierecommendation (JMARS)rdquo in Proceedings of the 20th ACMSIGKDD International Conference on Knowledge Discovery andDataMining (KDD rsquo14) pp 193ndash202 ACMNewYorkNYUSAAugust 2014

[6] M Jahrer and A Toscher ldquoCollaborative filtering ensemble forrankingrdquo in Proceedings of the 17nd International Conference on

Knowledge Discovery and Data Mining pp 153ndash167 ACM SanDiego Calif USA August 2011

[7] G Takacs and D Tikk ldquoAlternating least squares for person-alized rankingrdquo in Proceedings of the 5th ACM Conference onRecommender Systems pp 83ndash90 ACM Dublin Ireland 2012

[8] G Li and L Li ldquoDual collaborative topicmodeling from implicitfeedbacksrdquo in Proceedings of the IEEE International Conferenceon Security Pattern Analysis and Cybernetics pp 395ndash404Wuhan China October 2014

[9] H Wang X Shi and D Yeung ldquoRelational stacked denoisingautoencoder for tag recommendationrdquo in Proceedings of the29th AAAI Conference on Artificial Intelligence pp 3052ndash3058ACM Press Austin Tex USA January 2015

[10] L Gai ldquoPairwise probabilistic matrix factorization for implicitfeedback collaborative filteringrdquo in Proceedings of the IEEEInternational Conference on Security Pattern Analysis andCybernetics (SPAC rsquo14) pp 181ndash190 Wuhan China October2014

[11] S Rendle C Freudenthaler Z Gantner and L Schmidt-Thieme ldquoBPR Bayesian personalized ranking from implicitfeedbackrdquo in Proceedings of the 25th Conference on Uncertaintyin Artificial Intelligence (UAI rsquo09) pp 452ndash461 Morgan Kauf-mann Montreal Canada 2009

[12] L Guo J Ma H R Jiang Z M Chen and C M XingldquoSocial trust aware item recommendation for implicit feedbackrdquoJournal of Computer Science and Technology vol 30 no 5 pp1039ndash1053 2015

[13] W K Pan H Zhong C F Xu and Z Ming ldquoAdaptive Bayesianpersonalized ranking for heterogeneous implicit feedbacksrdquoKnowledge-Based Systems vol 73 pp 173ndash180 2015

[14] Y Shi A Karatzoglou L Baltrunas M Larson N Oliver andA Hanjalic ldquoCLiMF collaborative less-is-more filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 3077ndash3081 ACMBeijing ChinaAugust 2013

[15] WK Pan andLChen ldquoGBPR grouppreference based bayesianpersonalized ranking for one-class collaborative filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 2691ndash2697 Beijing China 2013

[16] Y Koren ldquoFactor in the neighbors scalable and accurate col-laborative filteringrdquoACMTransactions on Knowledge Discoveryfrom Data vol 4 no 1 pp 1ndash24 2010

[17] N N Liu E W Xiang M Zhao and Q Yang ldquoUnifyingexplicit and implicit feedback for collaborative filteringrdquo inProceedings of the 19th International Conference on Informationand Knowledge Management (CIKM rsquo10) pp 1445ndash1448 ACMToronto Canada October 2010

[18] Y Shi A Karatzoglou L BaltrunasM Larson andA HanjalicldquoXCLiMF Optimizing expected reciprocal rank for data withmultiple levels of relevancerdquo in Proceedings of the 7th ACMConference on Recommender Systems (RecSys rsquo13) pp 431ndash434ACM Hongkong October 2013

[19] J Jiang J Lu G Zhang and G Long ldquoScaling-up item-based collaborative filtering recommendation algorithm basedon Hadooprdquo in Proceedings of the 7th IEEE World Congress onServices pp 490ndash497 IEEE Washington DC USA July 2011

[20] T Hofmann ldquoLatent semantic models for collaborative filter-ingrdquo ACM Transactions on Information Systems vol 22 no 1pp 89ndash115 2004

[21] R Salakhutdinov A Mnih and G Hinton ldquoRestricted Boltz-mannmachines for collaborative filteringrdquo in Proceedings of the

Mathematical Problems in Engineering 11

24rd International Conference onMachine Learning (ICML rsquo07)pp 791ndash798 ACM Press Corvallis Ore USA June 2007

[22] N Srebro J Rennie andT Jaakkola ldquoMaximum-marginmatrixfactorizationrdquo in Proceedings of the 18th Annual Conferenceon Neural Information Processing Systems pp 11ndash217 BritishColumbia Canada 2004

[23] J T Liu C H Wu and W Y Liu ldquoBayesian probabilisticmatrix factorization with social relations and item contents forrecommendationrdquo Decision Support Systems vol 55 no 3 pp838ndash850 2013

[24] D Feldman and T Tassa ldquoMore constraints smaller coresetsconstrained matrix approximation of sparse big datardquo in Pro-ceedings of the 21th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo15) pp 249ndash258ACM Press Sydney Australia August 2015

[25] S Mirisaee E Gaussier and A Termier ldquoImproved local searchfor binarymatrix factorizationrdquo in Proceedings of the 29th AAAIConference on Artificial Intelligence pp 1198ndash1204 ACM PressAustin Tex USA January 2015

[26] T Y Liu Learning to Rank for Information Retrieval SpringerNew York NY USA 2011

[27] NN LiuM Zhao andQ Yang ldquoProbabilistic latent preferenceanalysis for collaborative filteringrdquo in Proceedings of the 18thInternational Conference on Information and Knowledge Man-agement (CIKM rsquo09) pp 759ndash766 ACM Press Hong KongNovember 2009

[28] N N Liu and Q Yang ldquoEigenRank a ranking-orientedapproach to collaborative filteringrdquo in Proceedings of the 31stAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval (ACM SIGIR rsquo08) pp 83ndash90 ACM Singapore July 2008

[29] M Weimer A Karatzoglou Q V Le and A J SmolaldquoCofiRankndashmaximum margin matrix factorization for collab-orative rankingrdquo in Proceedings of the 21th Conference onAdvances in Neural Information Processing Systems pp 79ndash86Curran Associates Vancouver Canada 2007

[30] O Chapelle D Metlzer Y Zhang and P Grinspan ldquoExpectedreciprocal rank for graded relevancerdquo in Proceedings of the 18thACM International Conference on Information and KnowledgeManagement (CIKM rsquo09) pp 621ndash630 ACM New York NYUSA November 2009

[31] E M Voorhees ldquoThe trec-8 question answering track reportrdquoin Proceedings of the 8th Text Retrieval Conference (TREC-8 rsquo99)Gaithersburg Md USA November 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 6: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

6 Mathematical Problems in Engineering

in which 120582 denotes the regularization coefficient and 119880

denotes the Frobenius norm of119880 Note that the lower bound119871(119880 119881119882) is much less complex than the original objectivefunction in (9) and standard optimization methods forexample gradient ascend can be used to learn the optimalmodel parameters 119880 119881 and119882

42 Optimization We can now maximize the objectivefunction (13) with respect to the latent factorsarg max

119880119881119882119871(119880119881119882) Note that 119871(119880119881119882) represents an

approximation of the mean value of ERR across all the usersWe can thus remove the constant coefficient 1119898 since ithas no influence on the optimization of 119871(119880 119881119882) Since theobjective function is smooth we can use gradient ascent forthe optimization The gradients can be derived in a similarmanner to xCLiMF [18] as shown in the following

120597119871

120597119880

119906

=

119899

sum

119894=1

119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894

+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894) (119881

119894minus 119881

119895)

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

]

]

minus 120582119880

119906

(14)

120597119871

120597119881

119894

= 119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894)

sdot (

1

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

minus

1

1 minus 119903

119906119894119892 (

119883

119906119894minus

119883

119906119895)

)

]

]

(119880

119906+ |119873 (119906)|

minus12

sdot sum

119896isin119873(119906)

119882

119896) minus 120582119881

119894

(15)

120597119871

120597119882

119894

= 119903

119906119894[

[

119892 (minus

119883

119906119894)119881

119894

+

119899

sum

119895=1

119903

119906119895119892

1015840(

119883

119906119895minus

119883

119906119894) (119881

119894minus 119881

119895)

1 minus 119903

119906119895119892 (

119883

119906119895minus

119883

119906119894)

]

]

minus 120582119882

119894

(16)

The learning algorithm for the MERR SVD++ model isoutlined in Algorithm 1

The published research papers in [8ndash11 13 18] showthat the use of the normal distribution 119873(0 001) for theinitialization of feature matrix is very effective so we still usethis approach for the initialization of 119880 119881 and 119882 in ourproposed algorithm

43 Computational Complexity Here we first analyze thecomplexity of the learning process for one iteration Byexploiting the data sparseness in 119883 the computational com-plexity of the gradient in (14) is 119874(119889

2119898 + 119889119898) Note that

Input Training set119883 learning rate 120572 regularization 120582The number of the feature 119889 and the maximal numberOf iterations itermaxOutput feature matrix 119880 119881119882for 119906 = 1 2 119898 doIndex relevant items for user 119906119873(119906) = 119895 | 119883

119894119895gt 0 0 le 119895 le 119899

endInitialize 119880(0) 119881(0) and119882

(0) with random valuesand 119905 = 0repeatfor 119906 = 1 2 119898 doUpdate 119880

119906

119880

(119905+1)

119906= 119880

(119905)

119906+ 120572

120597119871

120597119880

(119905)

119906

for 119894 isin 119873(119906) doUpdate 119881

119894119882119894

119881

119894

(119905+1)= 119881

119894

(119905)+ 120572

120597119871

120597119881

119894

(119905)

119882

119894

(119905+1)= 119882

119894

(119905)+ 120572

120597119871

120597119882

119894

(119905)

EndEnd

119905 = 119905 + 1until 119905 ge itermax119880 = 119880

(119905) 119881 = 119881

(119905)119882 = 119882

(119905)

Algorithm 1 MERR SVD++

denotes the average number of relevant items across all theusers The computational complexity of the gradient in (16)is also 119874(119889

2119898 + 119889119898) The computational complexity of the

gradient in (15) is 119874(119889

2119898 + 119889119898) Hence the complexity

of the learning algorithm in one iteration is in the order of119874(119889

2119898) In the case that is a small number that is 2 ≪ 119898

the complexity is linear to the number of users in the datacollection Note that we have 119904 = 119898 in which 119904 denotes thenumber of nonzeros in the user-item matrix The complexityof the learning algorithm is then 119874(119889119904) Since we usuallyhave ≪ 119904 the complexity is 119874(119889119904) even in the case that is large that is being linear to the number of nonzeros (ierelevant observations in the data) In sum our analysis showsthat MERR SVD++ is suitable for large scale use cases Notethat we also empirically verify the complexity of the learningalgorithm in Section 543

5 Experiment

51 Datasets We use two datasets for the experiments Thefirst is the MovieLens 1 million dataset (ML1m) [18] whichcontains ca 1M ratings (1ndash5 scale) from ca 6K users and 37Kmovies The sparseness of the ML1m dataset is 9553 Thesecond dataset is the Netflix dataset [2 16 17] which contains100000000 ratings (1ndash5 scale) from 480189 users on 17770movies Due to the huge size of the Netflix data we extract asubset of 10000 users and 10000 movies in which each userhas rated more than 100 different movies

Mathematical Problems in Engineering 7

52 Evaluation Metrics Just as has been justified by [17]NDCG is a very suitable evaluation metric for personalizedranking algorithmswhich combine explicit and implicit feed-back And our proposed MERR SVD++ algorithm exploitsboth explicit and implicit feedback simultaneously andoptimizes the well-known personalized ranking evaluationmetric Expected Reciprocal Rank (ERR) So we use theNDCG and ERR as evaluation metrics for the predictabilityof models in this paper

NDCG is another most widely used measure for rankingproblems To define NDCG

119906for a user 119906 one first needs to

define DCG119906

DCG119906=

119898

sum

119894=1

2

pref(119894)minus 1

log (119894 + 1)

(17)

where pref(119894) is a binary indicator returning 1 if the 119894th item ispreferred and 0 otherwise DCG

119906is then normalized by the

ideal ranked list into the interval [0 1]

NDCG119906=

DCG119906

IDCG119906

(18)

where IDCG119906denotes that the ranked list is sorted exactly

according to the userrsquos tastes positive items are placed at thehead of the ranked listTheNDCGof all the users is themeanscore of each user

ERR is a generalized version of Reciprocal Rank (RR)designed to be used with multiple relevance level data (egratings) It has similar properties to RR in that it stronglyemphasizes the relevance of results returned at the top of thelist Using the definition of ERR in [18] we can define ERRfor a ranked item list of user 119906 as follows

ERR119906=

119899

sum

119894=1

119903

119906119894

119877

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (19)

Similar to NDCG the ERR of all the users is the mean scoreof each user

Since in recommender systems the userrsquos satisfaction isdominated by only a few items on the top of the recommenda-tion list our evaluation in the following experiments focuseson the performance of top-5 recommended items that isNDCG5 and ERR5

53 Experiment Setup For each dataset we randomlyselected 5 rated items (movies) and 1000 unrated items(movies) for each user to form a test set We then randomlyselected a varying number of rated items from the rest toform a training set For example just as in [14 18] underthe condition of ldquoGiven 5rdquo we randomly selected 5 rateditems (disjoint to the items in the test set) for each user inorder to generate a training set We investigated a variety ofldquoGivenrdquo conditions for the training sets that is 5 10 and 15for the ML1m dataset and 10 20 and 30 for the extractedNetflix dataset Generated recommendation lists for each userare compared to the ground truth in the test set in order tomeasure the performance

All the models were implemented in MATLAB R2009aForMERR SVD++ the value of the regularization parameter

120582 was selected from range 10

minus5 10

minus4 10

minus3 10

minus2 10

minus1 1 10

and optimal parameter value was used And the learning rate120572 was selected from set 119860 119860 = 120572 | 120572 = 00001 times 2

119888 120572 le

05 119888 gt 0 and 119888 is a constant and the optimal parametervalue was also used In order to compare their performancesfairly for all matrix factorization models we set the numberof features to be 10 The optimal values of all parametersfor all the baseline models used are determined individuallyMore detailed setting methods of the parameters for all thebaselines can be found in the corresponding references Forall the algorithms used in our experiments we repeated theexperiment 5 times for each of the different conditions of eachdataset and the performances reported were averaged across5 runs

54 Experiment Results In this section we present a seriesof experiments to evaluate MERR SVD++ We designedthe experiments in order to address the following researchquestions

(1) Does the proposed MERR SVD++ outperform state-of-the-art personalized ranking approaches for top-Nrecommendation

(2) Does the performance of MERR SVD++ improvewhen we only increase the number of implicit feed-back data for each user

(3) Is MERR SVD++ scalable for large scale use cases

541 Performance Comparison We compare the perfor-mance of MERR SVD++ with that of five baseline algo-rithms The approaches we compare with are listed below

(i) Co-Rating [17] a state-of-the-art CF model thatcan be trained from explicit and implicit feedbacksimultaneously

(ii) SVD++ [16] the first proposed CF model that com-bines explicit and implicit feedback

(iii) xCLiMF [18] a state-of-the-art PR approach whichaims at directly optimizing ERR for top-N recom-mendation in domains with explicit feedback data(eg ratings)

(iv) CofiRank [29] a PR approach that optimizes theNDCG measure [28] for domains with explicit feed-back data (eg ratings) The implementation is basedon the publicly available software package from theauthors

(v) CLiMF [14] a state-of-the-art PR approach that opti-mizes the Mean Reciprocal Rank (MRR) measure[31] for domains with implicit feedback data (egclick follow) We use the explicit feedback datasetsby binarizing the rating values with a threshold Onthe ML1m dataset and the extracted Netflix datasetwe take ratings 4 and 5 (the highest two relevancelevels) respectively as the relevance threshold fortop-N recommendation

The results of the experiments on the ML1m and theextractedNetflix datasets are shown in Figure 4 Rows denote

8 Mathematical Problems in Engineering

0 Given 5 Given 10 Given 150

001

002

003

004

005

006The results on the ML1m dataset

The varieties of ldquoGivenrdquo conditions for the training sets

ND

CG

5

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(a)

0 Given 5 Given 10 Given 150

001

002

003

004

005

006

007The results on the ML1m dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(b)

0 Given 10 Given 20 Given 300

002

004

006

008

010

012The results on the extracted Netflix dataset

ND

CG

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(c)

0 Given 10 Given 20 Given 300

001

002

003

004

005

006

007

008The results on the extracted Netflix dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(d)

Figure 4 The performance comparison of MERR SVD++ and baselines

the varieties of ldquoGivenrdquo conditions for the training sets basedon the ML1m and the extracted Netflix datasets and columnsdenote the quality of NDCG and ERR Figure 4 showsthat MERR SVD++ outperforms the baseline approachesin terms of both ERR and NDCG in all of the cases Theresults show that the improvement of ERR aligns consistentlywith the improvement of NDCG indicating that optimizingERR would not degrade the utility of recommendations thatare captured by the NDCG measure It can be seen thatthe relative performance of MERR SVD++ improves as thenumber of observed ratings from the users increases Thisresult indicates that MERR SVD++ can learn better top-Nrecommendation models if more observations of the gradedrelevance data from users can be used The results alsoreveal that it is difficult to model user preferences encoded

in multiple levels of relevance with limited observations inparticular when the number of observations is lower than thenumber of relevance levels

Compared to Co-Rating which is based on ratingprediction MERR SVD++ is based on ranking predictionand succeeds in enhancing the top-ranked performance byoptimizing ERR As reported in [17] the performance ofSVD++ is slightly weaker than that of Co-Rating which isbecause SVD++ model only attempts to approximate theobserved ratings and does not model preferences expressedin implicit feedback The results also show that Co-Ratingand SVD++ significantly outperform xCLiMF and CofiRankwhich confirms our belief that implicit feedback could indeedcomplement explicit feedback It can be seen in Figure 4that xCLiMF significantly outperforms CLiMF in terms of

Mathematical Problems in Engineering 9

0 10 20 30 40 50005

006

007

008

009

010

The increased numbers of implicit feedback for each user

The v

alue

of e

valu

atio

n m

etric

s

ERR5NDCG5

Figure 5 The influence of implicit feedback on the performance ofMERR SVD++

both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data

Hence, we give a positive answer to our first research question.
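To make the evaluation concrete, the following is a minimal sketch of ERR@5 and NDCG@5 for a single ranked list. It assumes the standard exponential gain mapping of Chapelle et al. [30]; the exact graded-relevance variants defined in Section 5.2 may differ in minor details, and the function names are hypothetical:

```python
import numpy as np

def err_at_k(grades, k=5, g_max=5):
    # ERR with the mapping of [30]: grade g stops the user with
    # probability (2**g - 1) / 2**g_max.
    p_stop = (2.0 ** np.asarray(grades[:k], dtype=float) - 1.0) / 2 ** g_max
    err, p_continue = 0.0, 1.0
    for rank, p in enumerate(p_stop, start=1):
        err += p_continue * p / rank
        p_continue *= 1.0 - p
    return err

def ndcg_at_k(grades, k=5):
    # DCG with gain 2**g - 1 and log2(rank + 1) discount, normalized
    # by the DCG of the ideally reordered list.
    g = np.asarray(grades, dtype=float)
    discounts = np.log2(np.arange(2, k + 2))
    dcg = ((2 ** g[:k] - 1) / discounts[: min(k, len(g))]).sum()
    ideal = np.sort(g)[::-1]
    idcg = ((2 ** ideal[:k] - 1) / discounts[: min(k, len(g))]).sum()
    return dcg / idcg if idcg > 0 else 0.0

# Grades (0-5) of the top recommended items for one user:
print(err_at_k([5, 0, 4, 0, 3]), ndcg_at_k([5, 0, 4, 0, 3]))
```

Both metrics are then averaged over all test users, as described in Section 5.2.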

5.4.2. The Influence of Implicit Feedback on the Performance of MERR_SVD++. The influence of implicit feedback on the performance of MERR_SVD++ can be seen in Figure 5. Here N(u) = R(u) + M(u), where M(u) denotes the set of all items for which user u gave only implicit feedback. Rows denote the increased numbers of implicit feedback for each user, and columns denote the quality of ERR and NDCG. In our experiment the increased number of implicit feedback items is the same for each user. We use the extracted Netflix dataset under the condition of "Given 10". Figure 5 shows that the quality of ERR and NDCG of MERR_SVD++ improves synchronously and linearly with the increase of implicit feedback for each user, which confirms our belief that implicit feedback can indeed complement explicit feedback.
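A plausible way to see why extra implicit feedback helps is through the SVD++ scoring rule of Koren [16], on which MERR_SVD++ builds: the user representation is augmented by a term summed over the item set N(u), so enlarging M(u) directly enriches the user model. A minimal sketch (hypothetical names; dense NumPy arrays for clarity):

```python
import numpy as np

def svdpp_score(mu, b_u, b_i, q_i, p_u, y, N_u):
    # SVD++ prediction:
    #   mu + b_u + b_i + q_i . (p_u + |N(u)|^(-1/2) * sum_{j in N(u)} y_j)
    # N_u is a nonempty list of item indices with explicit or implicit feedback.
    implicit = y[N_u].sum(axis=0) / np.sqrt(len(N_u))
    return mu + b_u + b_i + q_i @ (p_u + implicit)

rng = np.random.default_rng(0)
d, n_items = 10, 100                      # 10 latent features, as in Section 5.3
y = rng.normal(scale=0.1, size=(n_items, d))
q_i, p_u = rng.normal(size=d), rng.normal(size=d)
print(svdpp_score(3.6, 0.1, -0.2, q_i, p_u, y, N_u=[3, 17, 42]))
```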

With this experimental result, we give a positive answer to our second research question.

5.4.3. Scalability. The last experiment investigated the scalability of MERR_SVD++ by measuring the training time required for training sets at different scales. Firstly, as analyzed in Section 4.3, the computational complexity of MERR_SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate the scalability, we varied the number of users in the training set: under each condition we randomly selected from 10% to 100% of the users in the training set, together with their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6. We can observe that the computational time under each condition increases almost linearly with the number of users. Secondly, as also discussed in Section 4.3, the computational complexity of MERR_SVD++ can be further approximated as linear in the amount of known data (i.e., nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm against different scales of the training sets under different "Given" conditions. The result is shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.

[Figure 6: Scalability analysis of MERR_SVD++ in terms of the number of users in the training set. The x-axis gives the percentage of users used for training (10–100); the y-axis gives the runtime per iteration in seconds (0–40), with one curve each for the "Given 5", "Given 10", and "Given 15" conditions.]
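A minimal sketch of such a timing harness is given below; `train_one_iteration` is a hypothetical stand-in for one pass of the MERR_SVD++ learner (the paper's own implementation is in MATLAB):

```python
import random
import time

def measure_scalability(train_one_iteration, all_users, seed=0):
    # Time one training iteration while growing the random user sample
    # from 10% to 100% of the training set, as in the protocol above.
    rng = random.Random(seed)
    timings = []
    for pct in range(10, 101, 10):
        users = rng.sample(all_users, int(len(all_users) * pct / 100))
        start = time.perf_counter()
        train_one_iteration(users)
        timings.append((pct, time.perf_counter() - start))
    return timings
```

A near-linear trend in the returned `(percentage, seconds)` pairs is what Figure 6 reports.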

The observations from this experiment allow us to answer our last research question positively.

6. Conclusion and Future Work

The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. To the best of our knowledge, no prior work has studied a personalized ranking algorithm that exploits both explicit and implicit feedback. In order to overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR_SVD++) that exploits both explicit and implicit feedback simultaneously. MERR_SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the recent xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperformed existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR_SVD++ scales linearly with the number of ratings. Because of its high precision and good scalability, MERR_SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by alleviating the latency problem of personalized recommendation, and has wide application prospects in the field of Internet information recommendation. Moreover, because MERR_SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.

For future work, we plan to extend our algorithm to richer models so that it can address the gray sheep and cold-start problems of personalized recommendation. We would also like to explore more useful information from explicit and implicit feedback simultaneously.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is sponsored in part by the National Natural Science Foundation of China (nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (no. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (no. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (no. 2015-KJZX014).

References

[1] G. Adomavicius and A. Tuzhilin, "Toward the next generation of recommender systems: a survey of the state-of-the-art and possible extensions," IEEE Transactions on Knowledge and Data Engineering, vol. 17, no. 6, pp. 734–749, 2005.

[2] S. Ahn, A. Korattikara, and N. Liu, "Large-scale distributed Bayesian matrix factorization using stochastic gradient MCMC," in Proceedings of the 21st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 401–410, ACM Press, Sydney, Australia, August 2015.

[3] S. Balakrishnan and S. Chopra, "Collaborative ranking," in Proceedings of the 5th ACM International Conference on Web Search and Data Mining (WSDM '12), pp. 143–152, IEEE, Seattle, Wash, USA, February 2012.

[4] Q. Yao and J. Kwok, "Accelerated inexact soft-impute for fast large-scale matrix completion," in Proceedings of the 24th International Joint Conference on Artificial Intelligence, pp. 4002–4008, ACM Press, Buenos Aires, Argentina, July 2015.

[5] Q. Diao, M. Qiu, C.-Y. Wu, A. J. Smola, J. Jiang, and C. Wang, "Jointly modeling aspects, ratings and sentiments for movie recommendation (JMARS)," in Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '14), pp. 193–202, ACM, New York, NY, USA, August 2014.

[6] M. Jahrer and A. Toscher, "Collaborative filtering ensemble for ranking," in Proceedings of the 17th International Conference on Knowledge Discovery and Data Mining, pp. 153–167, ACM, San Diego, Calif, USA, August 2011.

[7] G. Takacs and D. Tikk, "Alternating least squares for personalized ranking," in Proceedings of the 5th ACM Conference on Recommender Systems, pp. 83–90, ACM, Dublin, Ireland, 2012.

[8] G. Li and L. Li, "Dual collaborative topic modeling from implicit feedbacks," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics, pp. 395–404, Wuhan, China, October 2014.

[9] H. Wang, X. Shi, and D. Yeung, "Relational stacked denoising autoencoder for tag recommendation," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 3052–3058, ACM Press, Austin, Tex, USA, January 2015.

[10] L. Gai, "Pairwise probabilistic matrix factorization for implicit feedback collaborative filtering," in Proceedings of the IEEE International Conference on Security, Pattern Analysis and Cybernetics (SPAC '14), pp. 181–190, Wuhan, China, October 2014.

[11] S. Rendle, C. Freudenthaler, Z. Gantner, and L. Schmidt-Thieme, "BPR: Bayesian personalized ranking from implicit feedback," in Proceedings of the 25th Conference on Uncertainty in Artificial Intelligence (UAI '09), pp. 452–461, Morgan Kaufmann, Montreal, Canada, 2009.

[12] L. Guo, J. Ma, H. R. Jiang, Z. M. Chen, and C. M. Xing, "Social trust aware item recommendation for implicit feedback," Journal of Computer Science and Technology, vol. 30, no. 5, pp. 1039–1053, 2015.

[13] W. K. Pan, H. Zhong, C. F. Xu, and Z. Ming, "Adaptive Bayesian personalized ranking for heterogeneous implicit feedbacks," Knowledge-Based Systems, vol. 73, pp. 173–180, 2015.

[14] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, N. Oliver, and A. Hanjalic, "CLiMF: collaborative less-is-more filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 3077–3081, ACM, Beijing, China, August 2013.

[15] W. K. Pan and L. Chen, "GBPR: group preference based Bayesian personalized ranking for one-class collaborative filtering," in Proceedings of the 23rd International Joint Conference on Artificial Intelligence (IJCAI '13), pp. 2691–2697, Beijing, China, 2013.

[16] Y. Koren, "Factor in the neighbors: scalable and accurate collaborative filtering," ACM Transactions on Knowledge Discovery from Data, vol. 4, no. 1, pp. 1–24, 2010.

[17] N. N. Liu, E. W. Xiang, M. Zhao, and Q. Yang, "Unifying explicit and implicit feedback for collaborative filtering," in Proceedings of the 19th International Conference on Information and Knowledge Management (CIKM '10), pp. 1445–1448, ACM, Toronto, Canada, October 2010.

[18] Y. Shi, A. Karatzoglou, L. Baltrunas, M. Larson, and A. Hanjalic, "xCLiMF: optimizing expected reciprocal rank for data with multiple levels of relevance," in Proceedings of the 7th ACM Conference on Recommender Systems (RecSys '13), pp. 431–434, ACM, Hong Kong, October 2013.

[19] J. Jiang, J. Lu, G. Zhang, and G. Long, "Scaling-up item-based collaborative filtering recommendation algorithm based on Hadoop," in Proceedings of the 7th IEEE World Congress on Services, pp. 490–497, IEEE, Washington, DC, USA, July 2011.

[20] T. Hofmann, "Latent semantic models for collaborative filtering," ACM Transactions on Information Systems, vol. 22, no. 1, pp. 89–115, 2004.

[21] R. Salakhutdinov, A. Mnih, and G. Hinton, "Restricted Boltzmann machines for collaborative filtering," in Proceedings of the 24th International Conference on Machine Learning (ICML '07), pp. 791–798, ACM Press, Corvallis, Ore, USA, June 2007.

[22] N. Srebro, J. Rennie, and T. Jaakkola, "Maximum-margin matrix factorization," in Proceedings of the 18th Annual Conference on Neural Information Processing Systems, pp. 11–217, British Columbia, Canada, 2004.

[23] J. T. Liu, C. H. Wu, and W. Y. Liu, "Bayesian probabilistic matrix factorization with social relations and item contents for recommendation," Decision Support Systems, vol. 55, no. 3, pp. 838–850, 2013.

[24] D. Feldman and T. Tassa, "More constraints, smaller coresets: constrained matrix approximation of sparse big data," in Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15), pp. 249–258, ACM Press, Sydney, Australia, August 2015.

[25] S. Mirisaee, E. Gaussier, and A. Termier, "Improved local search for binary matrix factorization," in Proceedings of the 29th AAAI Conference on Artificial Intelligence, pp. 1198–1204, ACM Press, Austin, Tex, USA, January 2015.

[26] T. Y. Liu, Learning to Rank for Information Retrieval, Springer, New York, NY, USA, 2011.

[27] N. N. Liu, M. Zhao, and Q. Yang, "Probabilistic latent preference analysis for collaborative filtering," in Proceedings of the 18th International Conference on Information and Knowledge Management (CIKM '09), pp. 759–766, ACM Press, Hong Kong, November 2009.

[28] N. N. Liu and Q. Yang, "EigenRank: a ranking-oriented approach to collaborative filtering," in Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '08), pp. 83–90, ACM, Singapore, July 2008.

[29] M. Weimer, A. Karatzoglou, Q. V. Le, and A. J. Smola, "CofiRank: maximum margin matrix factorization for collaborative ranking," in Proceedings of the 21st Conference on Advances in Neural Information Processing Systems, pp. 79–86, Curran Associates, Vancouver, Canada, 2007.

[30] O. Chapelle, D. Metlzer, Y. Zhang, and P. Grinspan, "Expected reciprocal rank for graded relevance," in Proceedings of the 18th ACM International Conference on Information and Knowledge Management (CIKM '09), pp. 621–630, ACM, New York, NY, USA, November 2009.

[31] E. M. Voorhees, "The TREC-8 question answering track report," in Proceedings of the 8th Text REtrieval Conference (TREC-8), Gaithersburg, Md, USA, November 1999.


Page 7: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

Mathematical Problems in Engineering 7

52 Evaluation Metrics Just as has been justified by [17]NDCG is a very suitable evaluation metric for personalizedranking algorithmswhich combine explicit and implicit feed-back And our proposed MERR SVD++ algorithm exploitsboth explicit and implicit feedback simultaneously andoptimizes the well-known personalized ranking evaluationmetric Expected Reciprocal Rank (ERR) So we use theNDCG and ERR as evaluation metrics for the predictabilityof models in this paper

NDCG is another most widely used measure for rankingproblems To define NDCG

119906for a user 119906 one first needs to

define DCG119906

DCG119906=

119898

sum

119894=1

2

pref(119894)minus 1

log (119894 + 1)

(17)

where pref(119894) is a binary indicator returning 1 if the 119894th item ispreferred and 0 otherwise DCG

119906is then normalized by the

ideal ranked list into the interval [0 1]

NDCG119906=

DCG119906

IDCG119906

(18)

where IDCG119906denotes that the ranked list is sorted exactly

according to the userrsquos tastes positive items are placed at thehead of the ranked listTheNDCGof all the users is themeanscore of each user

ERR is a generalized version of Reciprocal Rank (RR)designed to be used with multiple relevance level data (egratings) It has similar properties to RR in that it stronglyemphasizes the relevance of results returned at the top of thelist Using the definition of ERR in [18] we can define ERRfor a ranked item list of user 119906 as follows

ERR119906=

119899

sum

119894=1

119903

119906119894

119877

119906119894

119899

prod

119895=1

(1 minus 119903

119906119895119868 (119877

119906119895lt 119877

119906119894)) (19)

Similar to NDCG the ERR of all the users is the mean scoreof each user

Since in recommender systems the userrsquos satisfaction isdominated by only a few items on the top of the recommenda-tion list our evaluation in the following experiments focuseson the performance of top-5 recommended items that isNDCG5 and ERR5

53 Experiment Setup For each dataset we randomlyselected 5 rated items (movies) and 1000 unrated items(movies) for each user to form a test set We then randomlyselected a varying number of rated items from the rest toform a training set For example just as in [14 18] underthe condition of ldquoGiven 5rdquo we randomly selected 5 rateditems (disjoint to the items in the test set) for each user inorder to generate a training set We investigated a variety ofldquoGivenrdquo conditions for the training sets that is 5 10 and 15for the ML1m dataset and 10 20 and 30 for the extractedNetflix dataset Generated recommendation lists for each userare compared to the ground truth in the test set in order tomeasure the performance

All the models were implemented in MATLAB R2009aForMERR SVD++ the value of the regularization parameter

120582 was selected from range 10

minus5 10

minus4 10

minus3 10

minus2 10

minus1 1 10

and optimal parameter value was used And the learning rate120572 was selected from set 119860 119860 = 120572 | 120572 = 00001 times 2

119888 120572 le

05 119888 gt 0 and 119888 is a constant and the optimal parametervalue was also used In order to compare their performancesfairly for all matrix factorization models we set the numberof features to be 10 The optimal values of all parametersfor all the baseline models used are determined individuallyMore detailed setting methods of the parameters for all thebaselines can be found in the corresponding references Forall the algorithms used in our experiments we repeated theexperiment 5 times for each of the different conditions of eachdataset and the performances reported were averaged across5 runs

54 Experiment Results In this section we present a seriesof experiments to evaluate MERR SVD++ We designedthe experiments in order to address the following researchquestions

(1) Does the proposed MERR SVD++ outperform state-of-the-art personalized ranking approaches for top-Nrecommendation

(2) Does the performance of MERR SVD++ improvewhen we only increase the number of implicit feed-back data for each user

(3) Is MERR SVD++ scalable for large scale use cases

541 Performance Comparison We compare the perfor-mance of MERR SVD++ with that of five baseline algo-rithms The approaches we compare with are listed below

(i) Co-Rating [17] a state-of-the-art CF model thatcan be trained from explicit and implicit feedbacksimultaneously

(ii) SVD++ [16] the first proposed CF model that com-bines explicit and implicit feedback

(iii) xCLiMF [18] a state-of-the-art PR approach whichaims at directly optimizing ERR for top-N recom-mendation in domains with explicit feedback data(eg ratings)

(iv) CofiRank [29] a PR approach that optimizes theNDCG measure [28] for domains with explicit feed-back data (eg ratings) The implementation is basedon the publicly available software package from theauthors

(v) CLiMF [14] a state-of-the-art PR approach that opti-mizes the Mean Reciprocal Rank (MRR) measure[31] for domains with implicit feedback data (egclick follow) We use the explicit feedback datasetsby binarizing the rating values with a threshold Onthe ML1m dataset and the extracted Netflix datasetwe take ratings 4 and 5 (the highest two relevancelevels) respectively as the relevance threshold fortop-N recommendation

The results of the experiments on the ML1m and theextractedNetflix datasets are shown in Figure 4 Rows denote

8 Mathematical Problems in Engineering

0 Given 5 Given 10 Given 150

001

002

003

004

005

006The results on the ML1m dataset

The varieties of ldquoGivenrdquo conditions for the training sets

ND

CG

5

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(a)

0 Given 5 Given 10 Given 150

001

002

003

004

005

006

007The results on the ML1m dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(b)

0 Given 10 Given 20 Given 300

002

004

006

008

010

012The results on the extracted Netflix dataset

ND

CG

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(c)

0 Given 10 Given 20 Given 300

001

002

003

004

005

006

007

008The results on the extracted Netflix dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(d)

Figure 4 The performance comparison of MERR SVD++ and baselines

the varieties of ldquoGivenrdquo conditions for the training sets basedon the ML1m and the extracted Netflix datasets and columnsdenote the quality of NDCG and ERR Figure 4 showsthat MERR SVD++ outperforms the baseline approachesin terms of both ERR and NDCG in all of the cases Theresults show that the improvement of ERR aligns consistentlywith the improvement of NDCG indicating that optimizingERR would not degrade the utility of recommendations thatare captured by the NDCG measure It can be seen thatthe relative performance of MERR SVD++ improves as thenumber of observed ratings from the users increases Thisresult indicates that MERR SVD++ can learn better top-Nrecommendation models if more observations of the gradedrelevance data from users can be used The results alsoreveal that it is difficult to model user preferences encoded

in multiple levels of relevance with limited observations inparticular when the number of observations is lower than thenumber of relevance levels

Compared to Co-Rating which is based on ratingprediction MERR SVD++ is based on ranking predictionand succeeds in enhancing the top-ranked performance byoptimizing ERR As reported in [17] the performance ofSVD++ is slightly weaker than that of Co-Rating which isbecause SVD++ model only attempts to approximate theobserved ratings and does not model preferences expressedin implicit feedback The results also show that Co-Ratingand SVD++ significantly outperform xCLiMF and CofiRankwhich confirms our belief that implicit feedback could indeedcomplement explicit feedback It can be seen in Figure 4that xCLiMF significantly outperforms CLiMF in terms of

Mathematical Problems in Engineering 9

0 10 20 30 40 50005

006

007

008

009

010

The increased numbers of implicit feedback for each user

The v

alue

of e

valu

atio

n m

etric

s

ERR5NDCG5

Figure 5 The influence of implicit feedback on the performance ofMERR SVD++

both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data

Hence we give a positive answer to our first researchquestion

542 The Influence of Implicit Feedback on the Performanceof MERR SVD++ The influence of implicit feedback on theperformance of MERR SVD++ can be found in Figure 5Here 119873(119906) = 119877(119906) + 119872(119906) 119872(119906) denotes the set of allitems that user 119906 only gave implicit feedback Rows denotethe increased numbers of implicit feedback for each userand columns denote the quality of ERR and NDCG In ourexperiment the increased numbers of implicit feedback foreach user are the same We use the extracted Netflix datasetunder the condition of ldquoGiven 10rdquo Figure 5 shows that thequality of ERR and NDCG of MERR SVD++ synchronouslyand linearly improves with the increase of implicit feedbackfor each user which confirms our belief that implicit feedbackcould indeed complement explicit feedback

With this experimental result we give a positive answerto our second research question

543 Scalability The last experiment investigated the scala-bility of MERR SVD++ by measuring the training time thatwas required for the training set at different scales Firstlyas analyzed in Section 43 the computational complexityof MERR SVD++ is linear in the number of users in thetraining set when the average number of items rated peruser is fixed To demonstrate the scalability we used differentnumbers of users in the training set under each conditionwe randomly selected from 10 to 100 users in the trainingset and their rated items as the training data for learning thelatent factors The results on the ML1m dataset are shown inFigure 6 We can observe that the computational time under

10 20 30 40 50 60 70 80 90 1000

5

10

15

20

25

30

35

40

Users for training ()

Runt

ime p

er it

erat

ion

(sec

)

Given 5Given 10Given 15

Figure 6 Scalability analysis of MERR SVD++ in terms of thenumber of users in the training set

each condition increases almost linearly to the increase of thenumber of users Secondly as also discussed in Section 43the computational complexity of MERR SVD++ could befurther approximated to be linear to the amount of knowndata (ie nonzero entries in the training user-item matrix)To demonstrate this we examined the runtime of the learningalgorithm against different scales of the training sets underdifferent ldquoGivenrdquo conditions The result is shown in Figure 6from which we can observe that the average runtime of thelearning algorithm per iteration increases almost linearly asthe number of nonzeros in the training set increases

The observations from this experiment allow us to answerour last research question positively

6 Conclusion and Future Work

The problem of the previous researches on personalizedranking is that they focused on either explicit feedbackdata or implicit feedback data rather than making full useof the information in the dataset Until now nobody hasstudied personalized ranking algorithm by exploiting bothexplicit and implicit feedback In order to overcome thedefects of prior researches in this paper we have presenteda new personalized ranking algorithm (MERR SVD++) byexploiting both explicit and implicit feedback simultaneouslyMERR SVD++ optimizes the well-known evaluation metricExpected Reciprocal Rank (ERR) and is based on the newestxCLiMF model and SVD++ algorithm Experimental resultson practical datasets showed that our proposed algorithmoutperformed existing personalized ranking algorithms overdifferent evaluation metrics and that the running time ofMERR SVD++ showed a linear correlation with the num-ber of rating Because of its high precision and the goodexpansibility MERR SVD++ is suitable for processing bigdata and can greatly improve the recommendation speedand validity by solving the latency problem of personalizedrecommendation and has wide application prospect in the

10 Mathematical Problems in Engineering

field of internet information recommendation And becauseMERR SVD++ exploits both explicit and implicit feedbacksimultaneously MERR SVD++ can solve the data sparsityand imbalance problems of personalized ranking algorithmsto a certain extent

For futurework we plan to extend our algorithm to richerones so that our algorithm can solve the grey sheep problemand cold start problem of personalized recommendationAlso we would like to explore more useful information fromthe explicit feedback and implicit feedback simultaneously

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is sponsored in part by the National NaturalScience Foundation of China (nos 61370186 6140326461402122 61003140 61033010 and 61272414) Science andTechnology Planning Project of Guangdong Province (nos2014A010103040 and 2014B010116001) Science and Tech-nology Planning Project of Guangzhou (nos 2014J4100032and 201510010203) the Ministry of Education and ChinaMobile Research Fund (no MCM20121051) the secondbatch open subject of mechanical and electrical professionalgroup engineering technology development center in Foshancity (no 2015-KJZX139) and the 2015 Research BackboneTeachers Training Program of Shunde Polytechnic (no 2015-KJZX014)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge and DataEngineering vol 17 no 6 pp 734ndash749 2005

[2] S Ahn A Korattikara and N Liu ldquoLarge-scale distributedBayesian matrix factorization using stochastic gradientMCMCrdquo in Proceedings of the 21th ACMSIGKDDConference onKnowledgeDiscovery andDataMining pp 401ndash410 ACMPressSydney Australia August 2015

[3] S Balakrishnan and S Chopra ldquoCollaborative rankingrdquo in Pro-ceedings of the 5th ACM International Conference onWeb Searchand DataMining (WSDM rsquo12) pp 143ndash152 IEEE Seattle WashUSA February 2012

[4] Q Yao and J Kwok ldquoAccelerated inexact soft-impute for fastlarge-scale matrix completionrdquo in Proceedings of the 24rd Inter-national Joint Conference on Artificial Intelligence pp 4002ndash4008 ACM Press Buenos Aires Argentina July 2015

[5] Q Diao M Qiu C-Y Wu A J Smola J Jiang and C WangldquoJointly modeling aspects ratings and sentiments for movierecommendation (JMARS)rdquo in Proceedings of the 20th ACMSIGKDD International Conference on Knowledge Discovery andDataMining (KDD rsquo14) pp 193ndash202 ACMNewYorkNYUSAAugust 2014

[6] M Jahrer and A Toscher ldquoCollaborative filtering ensemble forrankingrdquo in Proceedings of the 17nd International Conference on

Knowledge Discovery and Data Mining pp 153ndash167 ACM SanDiego Calif USA August 2011

[7] G Takacs and D Tikk ldquoAlternating least squares for person-alized rankingrdquo in Proceedings of the 5th ACM Conference onRecommender Systems pp 83ndash90 ACM Dublin Ireland 2012

[8] G Li and L Li ldquoDual collaborative topicmodeling from implicitfeedbacksrdquo in Proceedings of the IEEE International Conferenceon Security Pattern Analysis and Cybernetics pp 395ndash404Wuhan China October 2014

[9] H Wang X Shi and D Yeung ldquoRelational stacked denoisingautoencoder for tag recommendationrdquo in Proceedings of the29th AAAI Conference on Artificial Intelligence pp 3052ndash3058ACM Press Austin Tex USA January 2015

[10] L Gai ldquoPairwise probabilistic matrix factorization for implicitfeedback collaborative filteringrdquo in Proceedings of the IEEEInternational Conference on Security Pattern Analysis andCybernetics (SPAC rsquo14) pp 181ndash190 Wuhan China October2014

[11] S Rendle C Freudenthaler Z Gantner and L Schmidt-Thieme ldquoBPR Bayesian personalized ranking from implicitfeedbackrdquo in Proceedings of the 25th Conference on Uncertaintyin Artificial Intelligence (UAI rsquo09) pp 452ndash461 Morgan Kauf-mann Montreal Canada 2009

[12] L Guo J Ma H R Jiang Z M Chen and C M XingldquoSocial trust aware item recommendation for implicit feedbackrdquoJournal of Computer Science and Technology vol 30 no 5 pp1039ndash1053 2015

[13] W K Pan H Zhong C F Xu and Z Ming ldquoAdaptive Bayesianpersonalized ranking for heterogeneous implicit feedbacksrdquoKnowledge-Based Systems vol 73 pp 173ndash180 2015

[14] Y Shi A Karatzoglou L Baltrunas M Larson N Oliver andA Hanjalic ldquoCLiMF collaborative less-is-more filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 3077ndash3081 ACMBeijing ChinaAugust 2013

[15] WK Pan andLChen ldquoGBPR grouppreference based bayesianpersonalized ranking for one-class collaborative filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 2691ndash2697 Beijing China 2013

[16] Y Koren ldquoFactor in the neighbors scalable and accurate col-laborative filteringrdquoACMTransactions on Knowledge Discoveryfrom Data vol 4 no 1 pp 1ndash24 2010

[17] N N Liu E W Xiang M Zhao and Q Yang ldquoUnifyingexplicit and implicit feedback for collaborative filteringrdquo inProceedings of the 19th International Conference on Informationand Knowledge Management (CIKM rsquo10) pp 1445ndash1448 ACMToronto Canada October 2010

[18] Y Shi A Karatzoglou L BaltrunasM Larson andA HanjalicldquoXCLiMF Optimizing expected reciprocal rank for data withmultiple levels of relevancerdquo in Proceedings of the 7th ACMConference on Recommender Systems (RecSys rsquo13) pp 431ndash434ACM Hongkong October 2013

[19] J Jiang J Lu G Zhang and G Long ldquoScaling-up item-based collaborative filtering recommendation algorithm basedon Hadooprdquo in Proceedings of the 7th IEEE World Congress onServices pp 490ndash497 IEEE Washington DC USA July 2011

[20] T Hofmann ldquoLatent semantic models for collaborative filter-ingrdquo ACM Transactions on Information Systems vol 22 no 1pp 89ndash115 2004

[21] R Salakhutdinov A Mnih and G Hinton ldquoRestricted Boltz-mannmachines for collaborative filteringrdquo in Proceedings of the

Mathematical Problems in Engineering 11

24rd International Conference onMachine Learning (ICML rsquo07)pp 791ndash798 ACM Press Corvallis Ore USA June 2007

[22] N Srebro J Rennie andT Jaakkola ldquoMaximum-marginmatrixfactorizationrdquo in Proceedings of the 18th Annual Conferenceon Neural Information Processing Systems pp 11ndash217 BritishColumbia Canada 2004

[23] J T Liu C H Wu and W Y Liu ldquoBayesian probabilisticmatrix factorization with social relations and item contents forrecommendationrdquo Decision Support Systems vol 55 no 3 pp838ndash850 2013

[24] D Feldman and T Tassa ldquoMore constraints smaller coresetsconstrained matrix approximation of sparse big datardquo in Pro-ceedings of the 21th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo15) pp 249ndash258ACM Press Sydney Australia August 2015

[25] S Mirisaee E Gaussier and A Termier ldquoImproved local searchfor binarymatrix factorizationrdquo in Proceedings of the 29th AAAIConference on Artificial Intelligence pp 1198ndash1204 ACM PressAustin Tex USA January 2015

[26] T Y Liu Learning to Rank for Information Retrieval SpringerNew York NY USA 2011

[27] NN LiuM Zhao andQ Yang ldquoProbabilistic latent preferenceanalysis for collaborative filteringrdquo in Proceedings of the 18thInternational Conference on Information and Knowledge Man-agement (CIKM rsquo09) pp 759ndash766 ACM Press Hong KongNovember 2009

[28] N N Liu and Q Yang ldquoEigenRank a ranking-orientedapproach to collaborative filteringrdquo in Proceedings of the 31stAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval (ACM SIGIR rsquo08) pp 83ndash90 ACM Singapore July 2008

[29] M Weimer A Karatzoglou Q V Le and A J SmolaldquoCofiRankndashmaximum margin matrix factorization for collab-orative rankingrdquo in Proceedings of the 21th Conference onAdvances in Neural Information Processing Systems pp 79ndash86Curran Associates Vancouver Canada 2007

[30] O Chapelle D Metlzer Y Zhang and P Grinspan ldquoExpectedreciprocal rank for graded relevancerdquo in Proceedings of the 18thACM International Conference on Information and KnowledgeManagement (CIKM rsquo09) pp 621ndash630 ACM New York NYUSA November 2009

[31] E M Voorhees ldquoThe trec-8 question answering track reportrdquoin Proceedings of the 8th Text Retrieval Conference (TREC-8 rsquo99)Gaithersburg Md USA November 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 8: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

8 Mathematical Problems in Engineering

0 Given 5 Given 10 Given 150

001

002

003

004

005

006The results on the ML1m dataset

The varieties of ldquoGivenrdquo conditions for the training sets

ND

CG

5

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(a)

0 Given 5 Given 10 Given 150

001

002

003

004

005

006

007The results on the ML1m dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(b)

0 Given 10 Given 20 Given 300

002

004

006

008

010

012The results on the extracted Netflix dataset

ND

CG

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(c)

0 Given 10 Given 20 Given 300

001

002

003

004

005

006

007

008The results on the extracted Netflix dataset

ERR

5

The varieties of ldquoGivenrdquo conditions for the training sets

CLiMFCofiRankxCLiMF

SVD++Co_RatingMERR_SVD++

(d)

Figure 4 The performance comparison of MERR SVD++ and baselines

the varieties of ldquoGivenrdquo conditions for the training sets basedon the ML1m and the extracted Netflix datasets and columnsdenote the quality of NDCG and ERR Figure 4 showsthat MERR SVD++ outperforms the baseline approachesin terms of both ERR and NDCG in all of the cases Theresults show that the improvement of ERR aligns consistentlywith the improvement of NDCG indicating that optimizingERR would not degrade the utility of recommendations thatare captured by the NDCG measure It can be seen thatthe relative performance of MERR SVD++ improves as thenumber of observed ratings from the users increases Thisresult indicates that MERR SVD++ can learn better top-Nrecommendation models if more observations of the gradedrelevance data from users can be used The results alsoreveal that it is difficult to model user preferences encoded

in multiple levels of relevance with limited observations inparticular when the number of observations is lower than thenumber of relevance levels

Compared to Co-Rating which is based on ratingprediction MERR SVD++ is based on ranking predictionand succeeds in enhancing the top-ranked performance byoptimizing ERR As reported in [17] the performance ofSVD++ is slightly weaker than that of Co-Rating which isbecause SVD++ model only attempts to approximate theobserved ratings and does not model preferences expressedin implicit feedback The results also show that Co-Ratingand SVD++ significantly outperform xCLiMF and CofiRankwhich confirms our belief that implicit feedback could indeedcomplement explicit feedback It can be seen in Figure 4that xCLiMF significantly outperforms CLiMF in terms of

Mathematical Problems in Engineering 9

0 10 20 30 40 50005

006

007

008

009

010

The increased numbers of implicit feedback for each user

The v

alue

of e

valu

atio

n m

etric

s

ERR5NDCG5

Figure 5 The influence of implicit feedback on the performance ofMERR SVD++

both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data

Hence we give a positive answer to our first researchquestion

542 The Influence of Implicit Feedback on the Performanceof MERR SVD++ The influence of implicit feedback on theperformance of MERR SVD++ can be found in Figure 5Here 119873(119906) = 119877(119906) + 119872(119906) 119872(119906) denotes the set of allitems that user 119906 only gave implicit feedback Rows denotethe increased numbers of implicit feedback for each userand columns denote the quality of ERR and NDCG In ourexperiment the increased numbers of implicit feedback foreach user are the same We use the extracted Netflix datasetunder the condition of ldquoGiven 10rdquo Figure 5 shows that thequality of ERR and NDCG of MERR SVD++ synchronouslyand linearly improves with the increase of implicit feedbackfor each user which confirms our belief that implicit feedbackcould indeed complement explicit feedback

With this experimental result we give a positive answerto our second research question

543 Scalability The last experiment investigated the scala-bility of MERR SVD++ by measuring the training time thatwas required for the training set at different scales Firstlyas analyzed in Section 43 the computational complexityof MERR SVD++ is linear in the number of users in thetraining set when the average number of items rated peruser is fixed To demonstrate the scalability we used differentnumbers of users in the training set under each conditionwe randomly selected from 10 to 100 users in the trainingset and their rated items as the training data for learning thelatent factors The results on the ML1m dataset are shown inFigure 6 We can observe that the computational time under

10 20 30 40 50 60 70 80 90 1000

5

10

15

20

25

30

35

40

Users for training ()

Runt

ime p

er it

erat

ion

(sec

)

Given 5Given 10Given 15

Figure 6 Scalability analysis of MERR SVD++ in terms of thenumber of users in the training set

each condition increases almost linearly to the increase of thenumber of users Secondly as also discussed in Section 43the computational complexity of MERR SVD++ could befurther approximated to be linear to the amount of knowndata (ie nonzero entries in the training user-item matrix)To demonstrate this we examined the runtime of the learningalgorithm against different scales of the training sets underdifferent ldquoGivenrdquo conditions The result is shown in Figure 6from which we can observe that the average runtime of thelearning algorithm per iteration increases almost linearly asthe number of nonzeros in the training set increases

The observations from this experiment allow us to answerour last research question positively

6 Conclusion and Future Work

The problem of the previous researches on personalizedranking is that they focused on either explicit feedbackdata or implicit feedback data rather than making full useof the information in the dataset Until now nobody hasstudied personalized ranking algorithm by exploiting bothexplicit and implicit feedback In order to overcome thedefects of prior researches in this paper we have presenteda new personalized ranking algorithm (MERR SVD++) byexploiting both explicit and implicit feedback simultaneouslyMERR SVD++ optimizes the well-known evaluation metricExpected Reciprocal Rank (ERR) and is based on the newestxCLiMF model and SVD++ algorithm Experimental resultson practical datasets showed that our proposed algorithmoutperformed existing personalized ranking algorithms overdifferent evaluation metrics and that the running time ofMERR SVD++ showed a linear correlation with the num-ber of rating Because of its high precision and the goodexpansibility MERR SVD++ is suitable for processing bigdata and can greatly improve the recommendation speedand validity by solving the latency problem of personalizedrecommendation and has wide application prospect in the

10 Mathematical Problems in Engineering

field of internet information recommendation And becauseMERR SVD++ exploits both explicit and implicit feedbacksimultaneously MERR SVD++ can solve the data sparsityand imbalance problems of personalized ranking algorithmsto a certain extent

For futurework we plan to extend our algorithm to richerones so that our algorithm can solve the grey sheep problemand cold start problem of personalized recommendationAlso we would like to explore more useful information fromthe explicit feedback and implicit feedback simultaneously

Conflict of Interests

The authors declare that there is no conflict of interestsregarding the publication of this paper

Acknowledgments

This work is sponsored in part by the National NaturalScience Foundation of China (nos 61370186 6140326461402122 61003140 61033010 and 61272414) Science andTechnology Planning Project of Guangdong Province (nos2014A010103040 and 2014B010116001) Science and Tech-nology Planning Project of Guangzhou (nos 2014J4100032and 201510010203) the Ministry of Education and ChinaMobile Research Fund (no MCM20121051) the secondbatch open subject of mechanical and electrical professionalgroup engineering technology development center in Foshancity (no 2015-KJZX139) and the 2015 Research BackboneTeachers Training Program of Shunde Polytechnic (no 2015-KJZX014)

References

[1] G Adomavicius and A Tuzhilin ldquoToward the next generationof recommender systems a survey of the state-of-the-art andpossible extensionsrdquo IEEE Transactions on Knowledge and DataEngineering vol 17 no 6 pp 734ndash749 2005

[2] S Ahn A Korattikara and N Liu ldquoLarge-scale distributedBayesian matrix factorization using stochastic gradientMCMCrdquo in Proceedings of the 21th ACMSIGKDDConference onKnowledgeDiscovery andDataMining pp 401ndash410 ACMPressSydney Australia August 2015

[3] S Balakrishnan and S Chopra ldquoCollaborative rankingrdquo in Pro-ceedings of the 5th ACM International Conference onWeb Searchand DataMining (WSDM rsquo12) pp 143ndash152 IEEE Seattle WashUSA February 2012

[4] Q Yao and J Kwok ldquoAccelerated inexact soft-impute for fastlarge-scale matrix completionrdquo in Proceedings of the 24rd Inter-national Joint Conference on Artificial Intelligence pp 4002ndash4008 ACM Press Buenos Aires Argentina July 2015

[5] Q Diao M Qiu C-Y Wu A J Smola J Jiang and C WangldquoJointly modeling aspects ratings and sentiments for movierecommendation (JMARS)rdquo in Proceedings of the 20th ACMSIGKDD International Conference on Knowledge Discovery andDataMining (KDD rsquo14) pp 193ndash202 ACMNewYorkNYUSAAugust 2014

[6] M Jahrer and A Toscher ldquoCollaborative filtering ensemble forrankingrdquo in Proceedings of the 17nd International Conference on

Knowledge Discovery and Data Mining pp 153ndash167 ACM SanDiego Calif USA August 2011

[7] G Takacs and D Tikk ldquoAlternating least squares for person-alized rankingrdquo in Proceedings of the 5th ACM Conference onRecommender Systems pp 83ndash90 ACM Dublin Ireland 2012

[8] G Li and L Li ldquoDual collaborative topicmodeling from implicitfeedbacksrdquo in Proceedings of the IEEE International Conferenceon Security Pattern Analysis and Cybernetics pp 395ndash404Wuhan China October 2014

[9] H Wang X Shi and D Yeung ldquoRelational stacked denoisingautoencoder for tag recommendationrdquo in Proceedings of the29th AAAI Conference on Artificial Intelligence pp 3052ndash3058ACM Press Austin Tex USA January 2015

[10] L Gai ldquoPairwise probabilistic matrix factorization for implicitfeedback collaborative filteringrdquo in Proceedings of the IEEEInternational Conference on Security Pattern Analysis andCybernetics (SPAC rsquo14) pp 181ndash190 Wuhan China October2014

[11] S Rendle C Freudenthaler Z Gantner and L Schmidt-Thieme ldquoBPR Bayesian personalized ranking from implicitfeedbackrdquo in Proceedings of the 25th Conference on Uncertaintyin Artificial Intelligence (UAI rsquo09) pp 452ndash461 Morgan Kauf-mann Montreal Canada 2009

[12] L Guo J Ma H R Jiang Z M Chen and C M XingldquoSocial trust aware item recommendation for implicit feedbackrdquoJournal of Computer Science and Technology vol 30 no 5 pp1039ndash1053 2015

[13] W K Pan H Zhong C F Xu and Z Ming ldquoAdaptive Bayesianpersonalized ranking for heterogeneous implicit feedbacksrdquoKnowledge-Based Systems vol 73 pp 173ndash180 2015

[14] Y Shi A Karatzoglou L Baltrunas M Larson N Oliver andA Hanjalic ldquoCLiMF collaborative less-is-more filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 3077ndash3081 ACMBeijing ChinaAugust 2013

[15] WK Pan andLChen ldquoGBPR grouppreference based bayesianpersonalized ranking for one-class collaborative filteringrdquo inProceedings of the 23rd International Joint Conference on Artifi-cial Intelligence (IJCAI rsquo13) pp 2691ndash2697 Beijing China 2013

[16] Y Koren ldquoFactor in the neighbors scalable and accurate col-laborative filteringrdquoACMTransactions on Knowledge Discoveryfrom Data vol 4 no 1 pp 1ndash24 2010

[17] N N Liu E W Xiang M Zhao and Q Yang ldquoUnifyingexplicit and implicit feedback for collaborative filteringrdquo inProceedings of the 19th International Conference on Informationand Knowledge Management (CIKM rsquo10) pp 1445ndash1448 ACMToronto Canada October 2010

[18] Y Shi A Karatzoglou L BaltrunasM Larson andA HanjalicldquoXCLiMF Optimizing expected reciprocal rank for data withmultiple levels of relevancerdquo in Proceedings of the 7th ACMConference on Recommender Systems (RecSys rsquo13) pp 431ndash434ACM Hongkong October 2013

[19] J Jiang J Lu G Zhang and G Long ldquoScaling-up item-based collaborative filtering recommendation algorithm basedon Hadooprdquo in Proceedings of the 7th IEEE World Congress onServices pp 490ndash497 IEEE Washington DC USA July 2011

[20] T Hofmann ldquoLatent semantic models for collaborative filter-ingrdquo ACM Transactions on Information Systems vol 22 no 1pp 89ndash115 2004

[21] R Salakhutdinov A Mnih and G Hinton ldquoRestricted Boltz-mannmachines for collaborative filteringrdquo in Proceedings of the

Mathematical Problems in Engineering 11

24rd International Conference onMachine Learning (ICML rsquo07)pp 791ndash798 ACM Press Corvallis Ore USA June 2007

[22] N Srebro J Rennie andT Jaakkola ldquoMaximum-marginmatrixfactorizationrdquo in Proceedings of the 18th Annual Conferenceon Neural Information Processing Systems pp 11ndash217 BritishColumbia Canada 2004

[23] J T Liu C H Wu and W Y Liu ldquoBayesian probabilisticmatrix factorization with social relations and item contents forrecommendationrdquo Decision Support Systems vol 55 no 3 pp838ndash850 2013

[24] D Feldman and T Tassa ldquoMore constraints smaller coresetsconstrained matrix approximation of sparse big datardquo in Pro-ceedings of the 21th ACM SIGKDD International Conference onKnowledge Discovery and Data Mining (KDD rsquo15) pp 249ndash258ACM Press Sydney Australia August 2015

[25] S Mirisaee E Gaussier and A Termier ldquoImproved local searchfor binarymatrix factorizationrdquo in Proceedings of the 29th AAAIConference on Artificial Intelligence pp 1198ndash1204 ACM PressAustin Tex USA January 2015

[26] T Y Liu Learning to Rank for Information Retrieval SpringerNew York NY USA 2011

[27] NN LiuM Zhao andQ Yang ldquoProbabilistic latent preferenceanalysis for collaborative filteringrdquo in Proceedings of the 18thInternational Conference on Information and Knowledge Man-agement (CIKM rsquo09) pp 759ndash766 ACM Press Hong KongNovember 2009

[28] N N Liu and Q Yang ldquoEigenRank a ranking-orientedapproach to collaborative filteringrdquo in Proceedings of the 31stAnnual International ACM SIGIR Conference on Research andDevelopment in Information Retrieval (ACM SIGIR rsquo08) pp 83ndash90 ACM Singapore July 2008

[29] M Weimer A Karatzoglou Q V Le and A J SmolaldquoCofiRankndashmaximum margin matrix factorization for collab-orative rankingrdquo in Proceedings of the 21th Conference onAdvances in Neural Information Processing Systems pp 79ndash86Curran Associates Vancouver Canada 2007

[30] O Chapelle D Metlzer Y Zhang and P Grinspan ldquoExpectedreciprocal rank for graded relevancerdquo in Proceedings of the 18thACM International Conference on Information and KnowledgeManagement (CIKM rsquo09) pp 621ndash630 ACM New York NYUSA November 2009

[31] E M Voorhees ldquoThe trec-8 question answering track reportrdquoin Proceedings of the 8th Text Retrieval Conference (TREC-8 rsquo99)Gaithersburg Md USA November 1999

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Page 9: Research Article Exploiting Explicit and Implicit Feedback ...downloads.hindawi.com/journals/mpe/2016/2535329.pdfLTR is the core technology for information retrieval. When a query

Mathematical Problems in Engineering 9

0 10 20 30 40 50005

006

007

008

009

010

The increased numbers of implicit feedback for each user

The v

alue

of e

valu

atio

n m

etric

s

ERR5NDCG5

Figure 5 The influence of implicit feedback on the performance ofMERR SVD++

both ERR and NDCG across all the settings of relevancethresholds and datasets The results indicate that the infor-mation loss from binarizing multilevel relevance data wouldinevitably make recommendation models based on binaryrelevance data such as CLiMF suboptimal for the use caseswith explicit feedback data

Hence we give a positive answer to our first researchquestion

542 The Influence of Implicit Feedback on the Performanceof MERR SVD++ The influence of implicit feedback on theperformance of MERR SVD++ can be found in Figure 5Here 119873(119906) = 119877(119906) + 119872(119906) 119872(119906) denotes the set of allitems that user 119906 only gave implicit feedback Rows denotethe increased numbers of implicit feedback for each userand columns denote the quality of ERR and NDCG In ourexperiment the increased numbers of implicit feedback foreach user are the same We use the extracted Netflix datasetunder the condition of ldquoGiven 10rdquo Figure 5 shows that thequality of ERR and NDCG of MERR SVD++ synchronouslyand linearly improves with the increase of implicit feedbackfor each user which confirms our belief that implicit feedbackcould indeed complement explicit feedback

With this experimental result we give a positive answerto our second research question

5.4.3. Scalability. The last experiment investigated the scalability of MERR SVD++ by measuring the training time required for training sets of different scales. First, as analyzed in Section 4.3, the computational complexity of MERR SVD++ is linear in the number of users in the training set when the average number of items rated per user is fixed. To demonstrate this scalability, we varied the number of users in the training set: under each condition, we randomly selected from 10% to 100% of the users in the training set, together with their rated items, as the training data for learning the latent factors. The results on the ML1m dataset are shown in Figure 6, from which we can observe that the computational time under each condition increases almost linearly with the number of users.

Figure 6: Scalability analysis of MERR SVD++ in terms of the number of users in the training set (x-axis: users for training (%), 10-100; y-axis: runtime per iteration (sec), 0-40; one curve each for "Given 5", "Given 10", and "Given 15").

Second, as also discussed in Section 4.3, the computational complexity of MERR SVD++ can be further approximated as linear in the amount of known data (i.e., the nonzero entries in the training user-item matrix). To demonstrate this, we examined the runtime of the learning algorithm on training sets of different scales under the different "Given" conditions. The result is also shown in Figure 6, from which we can observe that the average runtime of the learning algorithm per iteration increases almost linearly as the number of nonzeros in the training set increases.

The observations from this experiment allow us to answer our last research question positively.
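For readers who want to rerun this kind of measurement, the following minimal harness (our assumed setup with a placeholder update step, not the paper's benchmark code) subsamples a fraction of users, makes one pass over their interactions, and times it; because the pass touches each nonzero entry once, the per-iteration runtime should grow roughly linearly with the number of nonzeros, matching Figure 6.

```python
# Assumed timing harness: runtime per iteration vs. fraction of users kept.

import time
import numpy as np

def one_epoch(interactions, n_factors=10, lr=0.01):
    """Stand-in for one SGD pass of the learner; cost is O(nonzeros)."""
    P = {}  # user -> latent factor vector
    for u, i, r in interactions:
        p_u = P.setdefault(u, np.zeros(n_factors))
        p_u += lr * (r - p_u.sum())  # placeholder O(k) update per rating
    return P

rng = np.random.default_rng(1)
all_users = np.arange(1000)
# Toy data: 20 ratings per user on 500 items with grades 1-5.
data = [(u, rng.integers(500), rng.integers(1, 6)) for u in all_users for _ in range(20)]

for frac in (0.1, 0.5, 1.0):  # keep 10%, 50%, 100% of users
    kept = set(rng.choice(all_users, size=int(frac * len(all_users)), replace=False))
    subset = [t for t in data if t[0] in kept]
    t0 = time.perf_counter()
    one_epoch(subset)
    print(f"{int(frac * 100):>3}% users, {len(subset)} nonzeros: "
          f"{time.perf_counter() - t0:.4f}s per iteration")
```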

6. Conclusion and Future Work

The problem with previous research on personalized ranking is that it focused on either explicit feedback data or implicit feedback data rather than making full use of the information in the dataset. Until now, nobody had studied personalized ranking algorithms that exploit both explicit and implicit feedback. To overcome the defects of prior research, in this paper we have presented a new personalized ranking algorithm (MERR SVD++) that exploits both explicit and implicit feedback simultaneously. MERR SVD++ optimizes the well-known evaluation metric Expected Reciprocal Rank (ERR) and is based on the newest xCLiMF model and the SVD++ algorithm. Experimental results on practical datasets showed that our proposed algorithm outperforms existing personalized ranking algorithms over different evaluation metrics and that the running time of MERR SVD++ grows linearly with the number of ratings. Because of its high precision and good expansibility, MERR SVD++ is suitable for processing big data, can greatly improve recommendation speed and validity by easing the latency problem of personalized recommendation, and has wide application prospects in the field of Internet information recommendation. Moreover, because MERR SVD++ exploits both explicit and implicit feedback simultaneously, it can alleviate the data sparsity and imbalance problems of personalized ranking algorithms to a certain extent.

For future work, we plan to extend our algorithm to richer models so that it can address the grey sheep and cold start problems of personalized recommendation. We would also like to explore more useful information from explicit and implicit feedback simultaneously.

Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.

Acknowledgments

This work is sponsored in part by the National Natural Science Foundation of China (Nos. 61370186, 61403264, 61402122, 61003140, 61033010, and 61272414), the Science and Technology Planning Project of Guangdong Province (Nos. 2014A010103040 and 2014B010116001), the Science and Technology Planning Project of Guangzhou (Nos. 2014J4100032 and 201510010203), the Ministry of Education and China Mobile Research Fund (No. MCM20121051), the second batch open subject of the mechanical and electrical professional group engineering technology development center in Foshan city (No. 2015-KJZX139), and the 2015 Research Backbone Teachers Training Program of Shunde Polytechnic (No. 2015-KJZX014).

