Toward Robust Recommendation Systems for Scholarly Papers and Mobile Apps Kazunari Sugiyama National University of Singapore


Post on 05-Jul-2020


Page 1: Toward Robust Recommendation Systems for Scholarly Papers ...sugiyama/papers/Talk-Drexel-201… · Mobile Apps Kazunari Sugiyama National University of Singapore . Today’s talk

Toward Robust Recommendation Systems for Scholarly Papers and Mobile Apps

Kazunari Sugiyama

National University of Singapore

Today’s talk
• Outline of Singapore
• Scholarly Paper Recommendation
• Mobile App Recommendation
• Popularity Prediction for Web 2.0 Items

2 WING, NUS

Outline of Singapore
• Population: 5.39 million (in March 2014)
- Chinese 74%, Malay 13%, Indian 9%, Others 4% [“Statistics Singapore,” http://www.singstat.gov.sg/]
• Area: 716 km²
• Language: Malay, Mandarin, English, Tamil
• Universities (marked on the map)
- NUS: National University of Singapore
- NTU: Nanyang Technological University
- SMU: Singapore Management University
- SUTD: Singapore University of Technology and Design
• For comparison (map showing Penn State @State College and Drexel @Philadelphia): 119,283 km², 166.6 times larger than Singapore

Research Topics in Web IR and NLP Group (WING) by A/P Min-Yen Kan

Research areas: DL, IR/MM/HCI, NLP

Graduate Students
- Jun Ping Ng: Temporal Relationship Identification
- Aobo Wang: Informal Chinese Language Processing
- Jovian Lin: Recommendation Systems for Mobile Applications
- Tao Chen: Topics in Weibo
- Xiangnan He: Topics in Web 2.0
- Muthu Kumar: NUS Co-author analysis

Research Staff
- Kazunari Sugiyama: Recommender Systems in Digital Libraries
- Dongyuan Lu: Social Media Mining

http://wing.comp.nus.edu.sg/

Scholarly Paper Recommendation

Kazunari Sugiyama and Min-Yen Kan:
- “Scholarly Paper Recommendation via User's Recent Research Interests” (JCDL’10)
- “Exploiting Potential Citation Papers in Scholarly Paper Recommendation” (JCDL’13, “Vannevar Bush Best Paper Award”)

Introduction

How many papers were published in 2012?

Source: Originally pinned from Nature (http://www.nature.com/news/366-days-2012-in-review-1.12042), with data from Thomson Reuters/Essential Science Indicators.


Introduction
Recommendation System (especially, content-based system)

[Figure: a researcher’s user profile is compared with candidate papers to recommend, producing recommended papers]

Introduction
Feature vector construction for candidate papers

[Figure: a candidate paper represented by its full text and by its fragments]


Introduction
Feature vector construction for candidate papers

Citation and Reference Papers
• Citation papers: endorsement of the target paper
• Text used: full text, or fragments (“abstract,” “introduction,” “conclusion,” …)
• Authors of citation papers
- May not cite relevant papers due to space limits
- May be unaware of the relevant papers
• Papers missed for these reasons: potential citation papers

Our Goal
• To find potential citation papers so that candidate papers to recommend are modeled more accurately, leading to better recommendation
• To refine the use of citation papers in characterizing candidate papers to recommend by using fragments

Outline of Baseline System [Sugiyama and Kan, JCDL’10]

[Figure: a researcher with user profile $P_{user}$, and candidate papers to recommend with feature vectors $F^{p_{rec_1}}$ to $F^{p_{rec_t}}$]

(1) Construct user profile $P_{user}$ from each researcher’s past papers
(2) Compute similarity between $P_{user}$ and $F^{p_{rec_j}}$ $(j = 1, \dots, t)$
(3) Recommend papers with high similarity

[Figure: baseline system annotated with its components: forgetting factor, weighting scheme, cosine similarity]
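The three baseline steps (build a user profile, compare it with each candidate paper, recommend the most similar ones) can be sketched in Python. This is a minimal illustration rather than the authors' implementation: the exponential form of the forgetting factor, the default `gamma`, the function names `build_user_profile` and `recommend`, and the toy term-weight vectors are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def build_user_profile(past_papers, gamma=0.2):
    """Step (1): sum the researcher's paper vectors, down-weighting older ones.

    past_papers: list of (years_before_latest_paper, vector) pairs.
    exp(-gamma * age) is an ASSUMED form of the forgetting factor.
    """
    profile = {}
    for age, vec in past_papers:
        decay = math.exp(-gamma * age)
        for term, w in vec.items():
            profile[term] = profile.get(term, 0.0) + decay * w
    return profile

def recommend(profile, candidates, top_n=2):
    """Steps (2) and (3): rank candidate papers by cosine similarity."""
    scored = [(cosine(profile, vec), pid) for pid, vec in candidates.items()]
    scored.sort(reverse=True)
    return [pid for _, pid in scored[:top_n]]

# Hypothetical toy data: one recent paper and one three-year-old paper.
past = [(0, {"recommendation": 1.0, "citation": 0.5}),
        (3, {"retrieval": 1.0})]
cands = {"p1": {"recommendation": 0.9, "citation": 0.4},
         "p2": {"retrieval": 1.0},
         "p3": {"biology": 1.0}}
ranking = recommend(build_user_profile(past), cands, top_n=2)
```

With these toy vectors the recent paper dominates the profile, so the candidate sharing its terms is ranked ahead of the one matching the older paper.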


[Sugiyama and Kan, JCDL’10]

[Figure: target paper $p$ with its citation (cit) papers $p_{cit_1}, p_{cit_2}, \dots, p_{cit_k}$ (weights $W^{p_{cit_1} \to p}, W^{p_{cit_2} \to p}, \dots, W^{p_{cit_k} \to p}$) and, through its references, its reference (ref) papers $p_{ref_1}, p_{ref_2}, \dots, p_{ref_l}$ (weights $W^{p \to p_{ref_1}}, W^{p \to p_{ref_2}}, \dots, W^{p \to p_{ref_l}}$)]

Proposed Method [Sugiyama and Kan, JCDL’13]

[Figure: the same graph extended with potential citation (pc) papers $p_{pc_1}, p_{pc_2}, \dots, p_{pc_j}$ (weights $W^{p_{pc_1} \to p}, W^{p_{pc_2} \to p}, \dots, W^{p_{pc_j} \to p}$), shown alongside the citation (cit) papers and the reference (ref) papers]

Proposed Method
(1) Leveraging Potential Citation Papers
(2) Leveraging Fragments in Potential Citation Papers

Proposed Method
(1) Leveraging Potential Citation Papers
How are potential citation papers discovered?

[Figure: matrix whose rows are all papers in the dataset and whose columns are papers as citation papers in the dataset. Each filled cell holds a cosine similarity (sim), e.g., sim: 0.581 and sim: 0.330 in the target paper’s row.
Step 1: Pearson correlation with the target paper’s row (values shown: 0.538, 0.216, 0.475, 0.304, 0.513, 0.487) determines the neighborhood of the target paper (e.g., set to 4).
Step 2: the empty cells of the target paper’s row are filled in from the neighborhood (row shown: 0.435, 0.581, 0.536, 0.211, 0.330, 0.472, 0.368), and the top-scoring papers become “potential citation papers” (e.g., set to 3).]
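The two steps shown in the figure (find a neighborhood by Pearson correlation, then fill the target paper's empty cells and keep the highest-scoring ones) resemble neighborhood-based collaborative filtering. The sketch below illustrates that general technique under stated assumptions: the correlation-weighted average used for prediction, the function names, and the toy matrix are hypothetical and are not taken from the slides.

```python
import math

def pearson(a, b):
    """Pearson correlation over the cells that both rows have filled."""
    common = [i for i in range(len(a)) if a[i] is not None and b[i] is not None]
    if len(common) < 2:
        return 0.0
    ma = sum(a[i] for i in common) / len(common)
    mb = sum(b[i] for i in common) / len(common)
    num = sum((a[i] - ma) * (b[i] - mb) for i in common)
    den = math.sqrt(sum((a[i] - ma) ** 2 for i in common)
                    * sum((b[i] - mb) ** 2 for i in common))
    return num / den if den else 0.0

def potential_citations(matrix, target, n_neighbors, n_pc):
    """Return the column indices predicted as potential citation papers."""
    row = matrix[target]
    # Step 1: neighborhood = rows most correlated with the target row.
    sims = sorted(((pearson(row, other), r) for r, other in enumerate(matrix)
                   if r != target), reverse=True)[:n_neighbors]
    # Step 2: predict each empty cell as a correlation-weighted average
    # (an ASSUMED prediction rule) over positively correlated neighbors.
    predicted = {}
    for col, value in enumerate(row):
        if value is not None:
            continue
        pairs = [(s, matrix[r][col]) for s, r in sims
                 if s > 0 and matrix[r][col] is not None]
        if pairs:
            predicted[col] = (sum(s * v for s, v in pairs)
                              / sum(s for s, _ in pairs))
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    return ranked[:n_pc]

# Hypothetical toy matrix; None marks an empty cell. Row 0 is the target.
matrix = [
    [0.6, 0.3, None, None],
    [0.7, 0.2, 0.9, 0.1],
    [0.5, 0.4, 0.8, 0.2],
    [0.1, 0.9, 0.1, 0.7],
]
found = potential_citations(matrix, target=0, n_neighbors=2, n_pc=1)
```

Here `n_neighbors` and `n_pc` play the roles of the neighborhood size and the number of potential citation papers that the slide sets to 4 and 3.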

Proposed Method
Identified Potential Citation Papers

[Figure: identified potential citation papers, each marked as “Potential citation paper”]

Proposed Method
(1) Leveraging Potential Citation Papers
How is the sparsity of the matrix solved?

Imputation. The value in each cell is the cosine similarity between papers.

Original matrix (empty cells shown as “-”; cell positions follow the matching values in the imputed matrix):

-     | 0.233 | -     | -     | 0.628
0.233 | -     | 0.147 | -     | -
-     | 0.147 | -     | 0.265 | -
-     | -     | 0.265 | -     | -
0.628 | -     | -     | -     | -

Imputed matrix:

1.000 | 0.233 | 0.723 | 0.538 | 0.628
0.233 | 1.000 | 0.147 | 0.476 | 0.156
0.723 | 0.147 | 1.000 | 0.265 | 0.521
0.538 | 0.476 | 0.265 | 1.000 | 0.268
0.628 | 0.156 | 0.521 | 0.268 | 1.000
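Because every cell is a cosine similarity between two papers, one plausible way to obtain a dense matrix is to compute that similarity for every pair of papers from their term vectors. The sketch below shows only this idea; the slides do not spell out the imputation procedure, so the function `imputed_matrix` and the toy vectors are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def imputed_matrix(vectors):
    """Dense, symmetric matrix of pairwise cosine similarities."""
    n = len(vectors)
    return [[cosine(vectors[i], vectors[j]) for j in range(n)]
            for i in range(n)]

# Hypothetical term vectors for three papers.
papers = [{"mobile": 1.0, "search": 1.0},
          {"mobile": 1.0},
          {"gesture": 1.0}]
m = imputed_matrix(papers)
```

As in the slide's imputed matrix, the result is symmetric with ones on the diagonal (up to floating-point rounding).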

Proposed Method
(1) Leveraging Potential Citation Papers
How is the sparsity of the matrix solved?

Target paper and the corresponding imputed similarities of its neighborhood, taken from the “Imputed matrix”:

Target paper | 1.000 | 0.233 | ?     | ?     | 0.628
Neighborhood | 0.233 | 1.000 | 0.147 | 0.476 | 0.156
Neighborhood | 0.538 | 0.476 | 0.265 | 1.000 | 0.268
Neighborhood | 0.628 | 0.156 | 0.521 | 0.268 | 1.000

Target paper after prediction:

1.000 | 0.233 | 0.682 | 0.453 | 0.628

The top-scoring predicted cells give the “potential citation papers” (e.g., set to 1).

Proposed Method
(1) Leveraging Potential Citation Papers
Feature Vector Construction for Target Papers

[Figure: target paper $p$ linked to its citation papers $p_{cit_1}, \dots, p_{cit_k}$ (weights $W^{p_{cit_y} \to p}$), its potential citation papers $p_{pc_1}, \dots, p_{pc_j}$ (weights $W^{p_{pc_x} \to p}$), and, through its references, its reference papers $p_{ref_1}, \dots, p_{ref_l}$ (weights $W^{p \to p_{ref_z}}$)]

$$F^{p} = f^{p} + \sum_{x=1}^{j} W^{p_{pc_x} \to p} f^{p_{pc_x}} + \sum_{y=1}^{k} W^{p_{cit_y} \to p} f^{p_{cit_y}} + \sum_{z=1}^{l} W^{p \to p_{ref_z}} f^{p_{ref_z}}$$

The weights $W$ are annotated in the slide as cosine similarity.
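The feature vector of a target paper is its own vector plus weighted sums over its potential citation, citation, and reference papers. A minimal sketch of that sum follows, assuming sparse vectors stored as dicts and weights that were already computed (for instance as cosine similarities); the function names and numbers are hypothetical.

```python
def add_scaled(total, vec, weight):
    """Add weight * vec into total, in place."""
    for term, w in vec.items():
        total[term] = total.get(term, 0.0) + weight * w

def feature_vector(f_p, pc, cit, ref):
    """F^p = f^p + sum of W * f over pc, cit, and ref papers.

    pc, cit, ref: lists of (weight, vector) pairs, one per related paper.
    """
    total = dict(f_p)
    for group in (pc, cit, ref):
        for weight, vec in group:
            add_scaled(total, vec, weight)
    return total

# Hypothetical toy vectors and weights.
F = feature_vector(
    {"a": 1.0},
    pc=[(0.5, {"a": 1.0, "b": 2.0})],
    cit=[(0.2, {"b": 1.0})],
    ref=[(0.1, {"c": 1.0})],
)
```

Keeping each group as a list of (weight, vector) pairs mirrors the three sums in the equation, so dropping the `pc` group gives a variant that uses only citation and reference papers.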

Proposed Method
(2) Leveraging Fragments in Potential Citation Papers
• [frg-SIM]: Fragments with cosine similarity weighting
• [frg-TW]: [frg-SIM] with tunable weight

Proposed Method
(2) Leveraging Fragments in Potential Citation Papers
[frg-SIM]: Fragments with cosine similarity weighting

[Figure: target paper $p$ linked to its citation papers $p_{cit_1}, \dots, p_{cit_k}$, its potential citation papers $p_{pc_1}, \dots, p_{pc_j}$, and its reference papers $p_{ref_1}, \dots, p_{ref_l}$, with the corresponding weights $W$]

$$F^{p} = f^{p} + \sum_{x=1}^{j} W^{p_{pc_x} \to p} f^{p_{pc_x}} + \sum_{y=1}^{k} W^{p_{cit_y} \to p} f^{p_{cit_y}} + \sum_{x=1}^{j} W_{(frg)}^{p_{pc_x} \to p} f_{(frg)}^{p_{pc_x}} + \sum_{y=1}^{k} W_{(frg)}^{p_{cit_y} \to p} f_{(frg)}^{p_{cit_y}} + \sum_{z=1}^{l} W^{p \to p_{ref_z}} f^{p_{ref_z}}$$

The sums without the $(frg)$ mark over potential citation and citation papers use the full text; the sums marked $(frg)$ use fragments (“abstract,” “introduction,” “conclusion,” etc.).

Proposed Method
(2) Leveraging Fragments in Potential Citation Papers
[frg-TW]: [frg-SIM] with tunable weight

[Figure: target paper $p$ linked to its citation papers $p_{cit_1}, \dots, p_{cit_k}$, its potential citation papers $p_{pc_1}, \dots, p_{pc_j}$, and its reference papers $p_{ref_1}, \dots, p_{ref_l}$, with the corresponding weights $W$]

Full text:
$$S_{full} = \sum_{x=1}^{j} W^{p_{pc_x} \to p} f^{p_{pc_x}} + \sum_{y=1}^{k} W^{p_{cit_y} \to p} f^{p_{cit_y}}$$

Fragments (“abstract,” “introduction,” “conclusion,” etc.):
$$S_{frg} = \sum_{x=1}^{j} W_{(frg)}^{p_{pc_x} \to p} f_{(frg)}^{p_{pc_x}} + \sum_{y=1}^{k} W_{(frg)}^{p_{cit_y} \to p} f_{(frg)}^{p_{cit_y}}$$

$F^{p}$ is $f^{p}$ plus $S_{full}$ and $S_{frg}$, balanced against each other by the tunable weight $\alpha$ and its complement $(1 - \alpha)$, plus the reference-paper sum $\sum_{z=1}^{l} W^{p \to p_{ref_z}} f^{p_{ref_z}}$.
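The tunable-weight variant can be sketched as a convex combination of the full-text terms and the fragment terms. Which group receives α and which receives (1 − α) is not recoverable from the slide text, so the assignment below (α on full text, 1 − α on fragments) is an assumption, as are the function names and toy values.

```python
def weighted_sum(pairs):
    """Sum of weight * vector over (weight, vector) pairs, as a dict."""
    total = {}
    for weight, vec in pairs:
        for term, w in vec.items():
            total[term] = total.get(term, 0.0) + weight * w
    return total

def frg_tw(f_p, full_terms, frg_terms, ref_terms, alpha):
    """Combine full-text and fragment terms with a tunable weight alpha.

    ASSUMPTION: alpha scales the full-text sums and (1 - alpha) scales the
    fragment sums; swap the two factors if the opposite convention applies.
    """
    result = dict(f_p)
    parts = ((alpha, weighted_sum(full_terms)),
             (1.0 - alpha, weighted_sum(frg_terms)),
             (1.0, weighted_sum(ref_terms)))
    for scale, part in parts:
        for term, w in part.items():
            result[term] = result.get(term, 0.0) + scale * w
    return result

# Hypothetical toy input; alpha = 0.4 is the value reported in the results.
F = frg_tw({"a": 1.0},
           full_terms=[(1.0, {"a": 1.0})],
           frg_terms=[(1.0, {"b": 1.0})],
           ref_terms=[(0.5, {"c": 2.0})],
           alpha=0.4)
```

Setting `alpha` to 1.0 under this convention ignores the fragments entirely, and 0.0 ignores the full text of citation and potential citation papers.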

Experiments
Experimental Data

(a) Researchers (they have publication lists in DBLP)

Item | Training set | Test set
Number of researchers | 25 | 25
Average number of DBLP papers | 10.4 | 9.6
Average number of relevant papers in our dataset | 76.3 | 74.5
Average number of citations | 15.3 (max. 169) | 14.4 (max. 145)
Average number of references | 15.8 (max. 47) | 14.2 (max. 58)

(b) Candidate papers to recommend (constructed from ACM Digital Library)

Item | Training set | Test set
Number of papers | 50,176 | 50,175
Average number of citations | 19.4 (max. 175) | 16.5 (max. 158)
Average number of references | 15.7 (max. 45) | 15.4 (max. 53)

(Basic dataset has been released from http://www.comp.nus.edu.sg/~sugiyama/SchPaperRecData.html)

Experiments
Evaluation Measure
• NDCG@5, 10 [Järvelin and Kekäläinen, SIGIR’00]
- Gives more weight to highly ranked items
- Incorporates different relevance levels through different gain values (1: relevant search results, 0: irrelevant search results)
• MRR [Voorhees, TREC-8, ’99]
- Provides insight into the ability to return a relevant item at the top of the ranking
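Both measures are short to compute. The sketch below uses a common formulation of nDCG (gain divided by log2(rank + 1)) with the binary gains from the slide, and it computes the ideal ordering from the ranked list itself; both choices are assumptions and may differ in detail from the evaluation scripts used in the experiments.

```python
import math

def dcg_at_k(gains, k):
    """Discounted cumulative gain of the first k items (ranks start at 1)."""
    return sum(g / math.log2(i + 2) for i, g in enumerate(gains[:k]))

def ndcg_at_k(gains, k):
    """DCG normalized by the DCG of the ideally ordered list.

    ASSUMPTION: the ideal ordering is taken over the items in `gains`.
    """
    ideal = dcg_at_k(sorted(gains, reverse=True), k)
    return dcg_at_k(gains, k) / ideal if ideal else 0.0

def mrr(rankings):
    """Mean reciprocal rank of the first relevant item in each ranking."""
    total = 0.0
    for gains in rankings:
        for i, g in enumerate(gains):
            if g > 0:
                total += 1.0 / (i + 1)
                break
    return total / len(rankings)

# Gains follow the slide: 1 = relevant, 0 = irrelevant.
perfect = ndcg_at_k([1, 1, 0], 3)
late_hit = ndcg_at_k([0, 1], 2)
mean_rr = mrr([[0, 1], [1, 0]])
```

A ranking whose relevant items all come first scores 1.0 on nDCG, while MRR only looks at the position of the first relevant item for each researcher.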

Experiments
Experimental Results
(1) Leveraging potential citation papers*
- [Tune:pc] Parameter tuning to discover potential citation papers
(2) Leveraging fragments in potential citation papers*
- [Tune:frg-SIM] Fragments with cosine similarity weighting
- [Tune:frg-TW] [frg-SIM] with tunable weight
(3) Applying optimized parameters to test set

* For the detailed optimization process, please refer to the following paper: K. Sugiyama and M.-Y. Kan: “Exploiting Potential Citation Papers in Scholarly Paper Recommendation” (JCDL’13)

Baseline [Nascimento et al., JCDL’11]

[Figure: the researcher’s user profile is built from the titles of published papers; candidate papers to recommend are represented by their title and abstract; comparing the two yields the recommended papers]

Baseline [Wang and Blei, KDD’11]: Collaborative topic regression
Combines ideas from collaborative filtering and content analysis based on probabilistic topic modeling

[Figure: user-paper matrix with users $u_1, u_2, \dots, u_i, \dots, u_U$ and papers $p_1, p_2, \dots, p_j, \dots, p_N$, where $r_{ij} \in \{0, 1\}$ indicates whether user $u_i$ includes paper $p_j$ in the user’s preference]

[Figure: graphical model of collaborative topic regression with variables $\lambda_v$, $\alpha$, $\theta$, $z$, $\beta$, $v$, $r$, $w$, $\lambda_u$, $u$ and plates $K$, $N$, $J$, $I$; the paper content is the title and abstract]

(3) Applying Optimized Parameters to Test Set

Method | nDCG@5 | MRR
pc-IMP (n=4, Npc=6), frg-SIM (Full text + Conclusion) | 0.572 | 0.787
pc-IMP (n=4, Npc=6), frg-TW (α=0.4, Full text + Conclusion) | 0.579 | 0.793
Baseline system [Sugiyama and Kan, JCDL’10] (Weight “SIM,” Th=0.4, γ=0.23, d=3) | 0.525 | 0.751
[Nascimento et al., JCDL’11] (“Frequency of bi-gram” obtained from title and abstract) | 0.336 | 0.438
[Wang and Blei, KDD’11] (“In-matrix prediction” in collaborative topic regression) | 0.393 | 0.495

Microscopic Analysis
• 1st relevant result in the recommendation list for a “Mobile Computing” researcher
- [Sugiyama and Kan, JCDL’10]: 52nd
- [Sugiyama and Kan, JCDL’13]: 1st
• Example of identified potential citation papers

[Figure: target paper “Real world Gesture analysis” together with papers labeled “Biomechanics,” “Computer-based music conducting systems,” “Machine learning,” and “Human Computer Interaction” (HCI) papers]

Limitations

Target paper: “Understanding mobile user’s behavior” (interdisciplinary paper)
- Mobile technology
- User search behavior
- Clustering

Identified potential citation papers:
- “Mobile Technology”
- “Mobile Technology”
- “Mobile Technology”

Mobile App Recommendation

Jovian Lin, Kazunari Sugiyama, Min-Yen Kan, and Tat-Seng Chua:
- “Addressing Cold-Start in App Recommendation: Latent User Models Constructed from Twitter Followers” (SIGIR ’13)
- “New and Improved: Modeling Versions to Improve App Recommendation” (SIGIR ’14, to appear)

INFORMATION OVERLOAD!

Two Important Observations in Apps
1. Apps contain references to their Twitter accounts.
2. Early signals about apps can be present in social networks, even before ratings are received.

Example: Evernote iOS app (release date: 8 May 2012)
- Has had an account on Twitter since Feb 2008
- By May 2012, Evernote’s Twitter account already had 120,000 followers and 1,300 tweets.
- Ratings timeline: May 2012, 0 ratings; Jun 2012, 0 ratings; Jul 2012, first few ratings start coming in; … Dec 2012, 118,827 ratings

Estimate the probability that “a target user u will like an app a”:

$$p(+ \mid a, u) = \sum_{t \in T(a)} p(+ \mid t, u)\, p(t \mid a)$$

• $+$: “like”; $a$: app; $u$: user
• $p(t \mid a)$: uniform distribution over the various Twitter-followers ($t$) following app $a$
• $p(+ \mid t, u)$: probability that the presence of Twitter-follower $t$ indicates that it is “liked” by user $u$; derived from Pseudo-Documents and Pseudo-Words
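The estimate sums, over the Twitter-followers of an app, the probability that each follower signals a "like" for the user, weighted uniformly. A minimal sketch follows; the follower IDs and per-follower probabilities are hypothetical values, and treating an unseen follower as probability 0 is an assumption.

```python
def like_probability(followers_of_app, p_like_given_follower):
    """p(+|a,u) = sum over t in T(a) of p(+|t,u) * p(t|a).

    followers_of_app: list of Twitter-follower IDs T(a).
    p_like_given_follower: dict mapping follower ID to p(+|t,u) for one user.
    p(t|a) is uniform over T(a), as stated on the slide.
    """
    if not followers_of_app:
        return 0.0
    p_t = 1.0 / len(followers_of_app)
    return sum(p_like_given_follower.get(t, 0.0) * p_t
               for t in followers_of_app)

# Hypothetical values for one user and one app with two followers.
score = like_probability(["twitterID10", "twitterID12"],
                         {"twitterID10": 0.8, "twitterID12": 0.4})
```

Because the app is described only through its followers, the score can be computed for an app that has not yet received any ratings, which is the cold-start case the slides motivate.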

Pseudo-Document and Pseudo-Words

User u rated three apps:
- App a (disliked), followed by: twitterID10, twitterID12
- App b (liked), followed by: twitterID10, twitterID12, twitterID29
- App c (liked), followed by: twitterID29, twitterID31

Pseudo-document of user u, where each pseudo-word is a pair (Twitter-follower ID, preference indicator):

(twitterID10, DISLIKED) (twitterID12, DISLIKED) (twitterID10, LIKED) (twitterID12, LIKED) (twitterID29, LIKED) (twitterID29, LIKED) (twitterID31, LIKED)
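A user's pseudo-document pairs every Twitter-follower of each rated app with the user's preference for that app. The sketch below rebuilds the slide's example; the function name and the assignment of the "disliked" label to app a are assumptions made for illustration.

```python
def build_pseudo_document(ratings, followers):
    """Return the list of pseudo-words for one user.

    ratings: list of (app, preference) pairs, preference being
             "LIKED" or "DISLIKED".
    followers: dict mapping each app to its list of Twitter-follower IDs.
    """
    return [(t, pref) for app, pref in ratings
            for t in followers.get(app, [])]

followers = {
    "a": ["twitterID10", "twitterID12"],
    "b": ["twitterID10", "twitterID12", "twitterID29"],
    "c": ["twitterID29", "twitterID31"],
}
ratings = [("a", "DISLIKED"), ("b", "LIKED"), ("c", "LIKED")]
doc = build_pseudo_document(ratings, followers)
```

A follower shared by several rated apps contributes one pseudo-word per app, which is why (twitterID29, LIKED) appears twice in the example.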

Constructing Latent Groups

Pseudo-documents such as

(twitterID10, DISLIKED) (twitterID12, DISLIKED) (twitterID10, LIKED) (twitterID12, LIKED) (twitterID29, LIKED) (twitterID29, LIKED) (twitterID31, LIKED)

are given to LDA, which yields a per-topic word distribution $p(+, t \mid z)$ and a per-document topic distribution $p(z \mid u)$:

$$p(+ \mid t, u) = \sum_{z \in Z} p(+, t \mid z)\, p(z \mid u)$$

$p(+ \mid t, u)$: probability that the presence of Twitter-follower $t$ indicates that it is “liked” by user $u$.
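Once LDA has produced the two distributions, the per-follower probability is a sum over latent groups. The sketch below only performs that marginalization; the distributions are hand-written toy values standing in for the output of a trained topic model, and the function name is hypothetical.

```python
def p_like_given_follower(word_given_topic, topic_given_user, follower):
    """p(+|t,u) = sum over z of p(+,t|z) * p(z|u).

    word_given_topic: list (one entry per topic z) of dicts mapping a
                      pseudo-word (follower, preference) to p(word|z).
    topic_given_user: list of p(z|u) for the user's pseudo-document.
    """
    word = (follower, "LIKED")
    return sum(dist.get(word, 0.0) * topic_given_user[z]
               for z, dist in enumerate(word_given_topic))

# Toy distributions for two latent groups (NOT the output of a real model).
word_given_topic = [
    {("twitterID10", "LIKED"): 0.5, ("twitterID10", "DISLIKED"): 0.5},
    {("twitterID10", "LIKED"): 0.1, ("twitterID12", "LIKED"): 0.9},
]
topic_given_user = [0.25, 0.75]
p = p_like_given_follower(word_given_topic, topic_given_user, "twitterID10")
```

The resulting value is what the app-level estimate uses as p(+ | t, u) for each Twitter-follower of a candidate app.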

Dataset
We collected data from the Apple iTunes Store and Twitter from September to December 2012.

Statistics:
- 1,289,668 ratings
- 7,116 apps (with Twitter accounts)
- 10,133 users

Restrictions:
- Each user must give at least 10 ratings for apps.
- Each Twitter ID is related to at least 5 apps.

RQ1: How does the performance of the Twitter-followers feature compare with other features?

[Figure: Recall (0 to 0.7) versus number of recommended apps (M, 20 to 200) for Pseudo-Docs (W), Pseudo-Docs (D), Pseudo-Docs (G), Pseudo-Docs (T), and Pseudo-Docs (All), where All = all features, T = Twitter-followers, G = genres, D = developers, W = words]

RQ2: How does our method compare with other techniques?

[Figure: Recall (0 to 0.7) versus number of recommended apps (M, 20 to 200) on the full dataset for VSM (Words), VSM (Twitter), LDA, CTR, Pseudo-Docs (Twitter)*, and Pseudo-Docs (All)**; the curve labels appear in the order Pseudo-Docs (All), Pseudo-Docs (Twitter), CTR, LDA, VSM (Twitter), VSM (Words)]


RQ3: Do the latent groups make any sense? What can we learn from them?

Version Sensitive Recommendation
Relationship between versions of apps and users

Example of Changelog and Genres in App
• Changelog: from the Tumblr app
• Genres in App: from Apple’s iOS app store (as of January 2014)

Our Approach to Version Sensitive Recommendation for Mobile Apps
1. Generate latent topics from version features using semi-supervised topic models (labeled LDA, LLDA) to characterize each version
2. Discriminate the topics based on genre metadata and identify important topics based on a customized popularity score
3. Compute a personalized score with respect to an app and its version to recommend relevant mobile apps to each user

Experiments
Experimental Data
Crawled from iTunes App Store and App Annie
• App metadata (6,524 apps)
• Version information (109,338 versions)
• Ratings (1,000,809 ratings)
• Users (9,797 users)

Evaluation Measure
• Recall
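Recall is the only evaluation measure named for the app experiments, and the plots report it against the number of recommended apps M. The slides do not give its exact definition, so the sketch below uses a common per-user form (the fraction of a user's liked apps that appear in the top M recommendations) as an assumption; the names and data are hypothetical.

```python
def recall_at_m(recommended, liked, m):
    """Fraction of the user's liked apps found in the top-m recommendations."""
    if not liked:
        return 0.0
    hits = len(set(recommended[:m]) & set(liked))
    return hits / len(liked)

# Hypothetical ranking for one user who liked three held-out apps.
ranking = ["a", "b", "c", "d"]
liked = {"b", "d", "e"}
r2 = recall_at_m(ranking, liked, 2)
r4 = recall_at_m(ranking, liked, 4)
```

Averaging this value over all users for each M would give one curve of the kind shown in the Recall plots.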

[Figure: overview of the experimental data, with legend entries for app metadata, version information, and rating information]

Experimental Results

[Figure: Recall of various combinations of recommendation techniques]

[Figure: Recall of version sensitive recommendation (VSR) against other individual recommendation techniques]

Experimental Results
Three most important topics

[Figure: the three most important topics; legend: #xxx: version category, #xxx: genre category]

Popularity Prediction for Web 2.0 Items

Xiangnan He, Ming Gao, Min-Yen Kan, Yiqun Liu, and Kazunari Sugiyama:
- “Predicting the Popularity of Web 2.0 Items Based on User Comments” (SIGIR ’14, to appear)

User Generated Content: A driving force of Web 2.0

Daily growth of UGC:
- Twitter: 500+ million tweets
- Flickr: 1+ million images
- YouTube: 360,000+ hours of videos

Challenges:
- Information overload [1]
- Dynamic, temporally evolving Web
- Rich but noisy UGC

[1] Xiangnan He et al. Comment-based Multi-view Clustering of Web 2.0 Items. In Proc. of WWW 2014.

Problem of Current Search Engine
Example: Querying “The Voice of China” (on July 24th, 2013)
(A Chinese reality talent show started in 2012, 1 season/year)

• The top 3 results of Google constrained to the YouTube.com domain: less than 10,000 views
• First result of the 2nd season: ranked 16th, but more than 100,000 views

Proposed Approach
Bipartite User-Item Ranking (BUIR)

Incorporate the following three hypotheses into our model:
H1: Temporal factor
H2: Social influence factor
H3: Current popularity factor

Regularization-based ranking:
• Define the regularizer for each hypothesis
• Combine the regularizers as the cost function
• Optimize the cost function for the final ranking

Map each vertex in the graph to a real number so that the value reflects
- the vertex’s popularity (for items) or
- influence (for users)

Experiments
Evaluate the prediction by the number of views received in the 3 days after the ranking time

Experimental Data
• YouTube (21,653 videos)
• Flickr (26,815 images)
• Last.fm (16,824 artists)
(10%: parameter tuning in regularization, 90%: test)

Evaluation Measure
• Spearman’s rank correlation coefficient
• Normalized discounted cumulative gain (nDCG)
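Spearman's rank correlation compares the ordering of predicted popularity scores with the ordering of the view counts observed afterwards. The sketch below computes it as the Pearson correlation of rank vectors, with tied values sharing their average rank; the scores and view counts are hypothetical, and the function names are illustrative.

```python
def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the two rank vectors."""
    rx, ry = ranks(x), ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    num = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    den = (sum((a - mx) ** 2 for a in rx)
           * sum((b - my) ** 2 for b in ry)) ** 0.5
    return num / den if den else 0.0

# Hypothetical predicted scores and the views received afterwards.
predicted = [0.9, 0.5, 0.1]
future_views = [1200, 300, 50]
rho = spearman(predicted, future_views)
```

Only the ordering matters for this measure, so a predictor does not need to estimate the view counts themselves to score well.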

Experimental Results

Spearman coefficient (%) of overall evaluation

Method | YouTube | Flickr | Last.fm
View count | 73.39 | 58.42 | 67.31
Comment count in the past | 83.35 | 59.43 | 67.21
Comment count in the future | 84.53 | 59.41 | 67.20
Multivariate linear model | 78.24 | 58.00 | 38.09
PageRank | 80.72 | 28.15 | 10.24
BUIR | 87.72 | 64.60 | 70.43

[Figure: Improvement in Spearman coefficient between BUIR and the best baselines of query-specific evaluation]

Summary of Today’s Talk
• Scholarly Paper Recommendation
- Identify “potential citation papers”
- Leverage fragments of the paper
• Mobile App Recommendation
- Apply LDA by taking Twitter followers and version updates into account
• Popularity Prediction for Web 2.0 Items
- Apply regularization based on users’ comments

Thank you very much!