Collaborative Personalized Twitter Search with Topic-Language Models

Jan Vosecky, Kenneth Wai-Ting Leung, Wilfred Ng

Supported by SIGIR Travel Grant

Upload: jan-vosecky

Post on 02-Jul-2015

Category:

Social Media


DESCRIPTION

The vast amount of real-time and social content in microblogs results in information overload for users searching microblog data. Given a user’s search query, delivering content that is relevant to her interests is a challenging problem. Traditional methods for personalized Web search are insufficient in the microblog domain because of the diversity of topics, the sparseness of user data, and the highly social nature of microblogs. In particular, social interactions between users need to be considered in order to accurately model a user’s interests, alleviate data sparseness, and tackle the cold-start problem. In this paper, we therefore propose a novel framework for Collaborative Personalized Twitter Search. At its core, we develop a collaborative user model, which exploits the user’s social connections to obtain a comprehensive account of her preferences. We then propose a novel user model structure to manage the topical diversity in Twitter and to enable semantic-aware query disambiguation. Our framework integrates a variety of information about the user’s preferences in a principled manner.

TRANSCRIPT

Page 1: Collaborative Personalized Twitter Search with Topic-Language Models

Collaborative Personalized

Twitter Search with Topic-Language Models

Jan Vosecky

Kenneth Wai-Ting Leung

Wilfred Ng

Supported by SIGIR Travel Grant

Page 2: Collaborative Personalized Twitter Search with Topic-Language Models

Microblogs

Page 3: Collaborative Personalized Twitter Search with Topic-Language Models

Microblogs

Tweet 1

Tweet 2

User-generated content

– Short length

– Informal language, free-form

– Diverse topics

Very high volume

Information overload

Page 4: Collaborative Personalized Twitter Search with Topic-Language Models

Searching on Twitter

“When you've got 5 minutes to fill, Twitter is a great way to fill 35 minutes”

@mattcutts

Page 5: Collaborative Personalized Twitter Search with Topic-Language Models

Searching for “ipad” on Twitter

Around 50 tweets mentioning “iPad” posted within 1 minute

Page 6: Collaborative Personalized Twitter Search with Topic-Language Models

Personalizing Twitter Search

Page 7: Collaborative Personalized Twitter Search with Topic-Language Models

Microblog data

• Compared with traditional domains (e.g. web search, news search):

– Explicitly stated user interests

• tweets, conversations, re-tweets

– Social network structure

• following

Page 8: Collaborative Personalized Twitter Search with Topic-Language Models

Personalization challenge

• Individual user’s data

– Diverse

– Sparse

• User’s social connections

Putting all kinds of information into a single user model: inaccurate, noisy

Page 9: Collaborative Personalized Twitter Search with Topic-Language Models

Personalization challenge

• Individual user’s data

– Diverse

– Sparse

• User’s social connections

– Diverse friends, topics

– Need to carefully organize friends’ information

Short messages

Few messages

Few social connections

Little search history

Page 10: Collaborative Personalized Twitter Search with Topic-Language Models

Personalization challenge

• Individual user’s data

– Diverse

– Sparse

• User’s social connections

– Diverse friends, topics

– Need to carefully organize friends’ information for it to be useful

Page 11: Collaborative Personalized Twitter Search with Topic-Language Models

Contributions

• Individual user’s data

– Diverse

– Sparse

• User’s social connections

– Diverse friends, topics

Topics

Novel User Model structure

Collaborative User Model

Page 12: Collaborative Personalized Twitter Search with Topic-Language Models

Language modeling IR

Query likelihood model

– Given a query Q and a document D, rank D by the likelihood P(Q | D) that D’s language model generates Q
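
The slide's own equation is not preserved in this transcript. For reference, the standard textbook form of the query likelihood model is sketched below. The Jelinek-Mercer smoothed estimate is an assumption, chosen because it is the variant named among the baselines later in the talk. The symbols c(w, D) (count of word w in D), |D| (document length), P(w | C) (collection language model), and λ (interpolation weight) are standard notation rather than symbols taken from the slide.

```latex
P(Q \mid D) = \prod_{w \in Q} P(w \mid D),
\qquad
P(w \mid D) = (1 - \lambda)\,\frac{c(w, D)}{|D|} + \lambda\, P(w \mid C)
```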

Topic Models

A latent topic in LDA: “Information Technology”

Word        Probability
Google      0.00040
Android     0.00020
Microsoft   0.00010
App         0.00010
Security    0.00009
Email       0.00008
Login       0.00005
Virus       0.00004

Page 13: Collaborative Personalized Twitter Search with Topic-Language Models

Scope of our approach

• Input to our algorithm:

– Set of n documents returned by Twitter given query Q

• Our task:

– Rank the documents according to:

• Query

• User model
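
The task above is a re-ranking step, which can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `rerank` and the `score` callback are hypothetical, and the toy scoring function simply counts shared words.

```python
from typing import Callable, List, Set


def rerank(query: str, user_model: Set[str], docs: List[str],
           score: Callable[[str, Set[str], str], float]) -> List[str]:
    """Sort the n retrieved documents by a score that sees both the
    query and the user model (highest score first)."""
    return sorted(docs, key=lambda d: score(query, user_model, d), reverse=True)


def toy_score(query: str, user_model: Set[str], doc: str) -> float:
    """Hypothetical stand-in score: query-word matches plus matches
    with words the user is interested in."""
    words = set(doc.lower().split())
    query_hits = len(words & set(query.lower().split()))
    interest_hits = len(words & user_model)
    return query_hits + interest_hits


docs = ["ipad cake recipe", "ipad android coding tips"]
ranked = rerank("ipad", {"android", "coding"}, docs, toy_score)
```

Passing the scoring function as a parameter keeps the re-ranking step independent of whichever user model supplies the score.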

Page 14: Collaborative Personalized Twitter Search with Topic-Language Models

Proposed Framework

Page 15: Collaborative Personalized Twitter Search with Topic-Language Models

At a Glance: Proposed User Model

Page 16: Collaborative Personalized Twitter Search with Topic-Language Models

At a Glance: Proposed User Model

Page 17: Collaborative Personalized Twitter Search with Topic-Language Models

At a Glance: Proposed User Model

Page 18: Collaborative Personalized Twitter Search with Topic-Language Models

At a Glance: Proposed User Model

Page 19: Collaborative Personalized Twitter Search with Topic-Language Models

Individual User Model

ID  Tweet                                     Time    Topic
1   Manchester playing tonight                1. 1.   Sport
2   Doing some android coding                 2. 1.   IT
3   Great game, great win for manchester!     5. 1.   Sport
4   Had a great apple cake with chocolate     6. 1.   Food
5   My java code keeps throwing exceptions    10. 1.  IT

Topic   W            Words
IT      2/5 = 40%    Android: 6, Coding: 2, Java: 2
Sport   2/5 = 40%    Manchester: 5, Play: 4, Win: 2
Food    1/5 = 20%    Cake: 6, Apple: 5, Oven: 2
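
The topic weights W on this slide are the share of the user's tweets assigned to each topic. A minimal sketch of that arithmetic, using the five example tweets, is given below. The variable names are illustrative, and the per-topic word counts shown on the slide are not recomputed because they do not follow from these five tweets alone.

```python
from collections import Counter

# (tweet, topic) pairs from the example on this slide
tweets = [
    ("Manchester playing tonight", "Sport"),
    ("Doing some android coding", "IT"),
    ("Great game, great win for manchester!", "Sport"),
    ("Had a great apple cake with chocolate", "Food"),
    ("My java code keeps throwing exceptions", "IT"),
]

# Count tweets per topic, then normalize by the total number of tweets.
topic_counts = Counter(topic for _, topic in tweets)
total = sum(topic_counts.values())
topic_weights = {topic: n / total for topic, n in topic_counts.items()}

for topic, w in topic_weights.items():
    print(f"{topic}: W = {topic_counts[topic]}/{total} = {w:.0%}")
```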

Page 20: Collaborative Personalized Twitter Search with Topic-Language Models

Individual User Model (IM)

Is u interested in word w from topic k?

Is u interested in topic k?

Is word w related to topic k?

Prior prob. of topic k

Recent interest is more important:

From user From topic model
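
The equations on this slide are not preserved in the transcript. The sketch below illustrates one common way to realize the two ideas that survive in the annotations: weighting recent tweets more heavily, and mixing a user-specific estimate with the topic model's estimate. The exponential decay, the half-life, the mixing weight `lam`, and all function names are assumptions for illustration, not the authors' formulas.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple


def decayed_topic_interest(tweets: List[Tuple[float, str]], now: float,
                           half_life: float = 7.0) -> Dict[str, float]:
    """Estimate P(k | u) from (timestamp_in_days, topic) pairs, with
    each tweet's vote halving every `half_life` days (assumed form)."""
    mass: Dict[str, float] = defaultdict(float)
    for t, topic in tweets:
        mass[topic] += math.exp(-math.log(2) * (now - t) / half_life)
    total = sum(mass.values())
    return {k: m / total for k, m in mass.items()}


def user_word_prob(p_user: float, p_topic_model: float, lam: float = 0.5) -> float:
    """Mix the user's own estimate of P(w | k) with the topic model's
    estimate (simple linear interpolation, assumed)."""
    return (1 - lam) * p_user + lam * p_topic_model


# Hypothetical history: the Sport tweets are old, the IT tweets are recent.
history = [(0.0, "Sport"), (1.0, "Sport"), (13.0, "IT"), (14.0, "IT")]
interest = decayed_topic_interest(history, now=14.0)
```

With equal tweet counts per topic, the decayed estimate still favors IT because those tweets are more recent.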

Page 21: Collaborative Personalized Twitter Search with Topic-Language Models

Personalization using IM

Is the Query relevant to topic k?

Is Q related to topic k in general?

Is the User interested in topic k?

Is Q related to the words in topic k that User is interested in?

Is the Document relevant to topic k?

Is D related to topic k in general?

Is the User interested in topic k?

Is D related to the words in topic k that User is interested in?

Prior Document probability

Page 22: Collaborative Personalized Twitter Search with Topic-Language Models

Personalization using IM

Q = australia

I’m interested in IT and travel

I’ve never tweeted about Australia

TV

Music

IT

Travel

Politics

Business

0.1

0.3

Tweets (D):

– Top 10 restaurants in Australia

– iPhones, iPads, and Macs Hacked and Hijacked for Ransom in Australia - Gotta Be Mobile

Page 23: Collaborative Personalized Twitter Search with Topic-Language Models

Personalization using IM

Q = australia

I’m interested in IT and travel

I have tweeted about IT in Australia

TV

Music

IT

Travel

Politics

Business

0.6

0.3

Tweets (D):

– Top 10 restaurants in Australia

– iPhones, iPads, and Macs Hacked and Hijacked for Ransom in Australia - Gotta Be Mobile
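
The two “australia” slides can be read as a topic-matching computation: the query's relevance to each topic, as seen through the user's model, is matched against each tweet's topics. The sketch below is an illustration under assumptions. The values 0.1, 0.3, and 0.6 are taken from the slides and assumed to be the query's weights for IT (before and after the user tweets about IT in Australia) and for Travel; each tweet is assumed to belong to a single topic; and the dot-product score is a simplification, not the authors' ranking formula.

```python
from typing import Dict, List


def topic_match(query_topics: Dict[str, float], doc_topics: Dict[str, float]) -> float:
    """Sum over topics k of (query relevance to k) * (document relevance to k)."""
    return sum(w * doc_topics.get(k, 0.0) for k, w in query_topics.items())


def rank(query_topics: Dict[str, float], docs: Dict[str, Dict[str, float]]) -> List[str]:
    """Return document titles sorted by descending topic-match score."""
    return sorted(docs, key=lambda d: topic_match(query_topics, docs[d]), reverse=True)


# Assumed single-topic assignments for the two example tweets.
docs = {
    "Top 10 restaurants in Australia": {"Travel": 1.0},
    "iPhones, iPads, and Macs Hacked and Hijacked for Ransom in Australia": {"IT": 1.0},
}

never_tweeted = {"IT": 0.1, "Travel": 0.3}  # user never tweeted about Australia
tweeted_it = {"IT": 0.6, "Travel": 0.3}     # user tweeted about IT in Australia

before = rank(never_tweeted, docs)
after = rank(tweeted_it, docs)
```

Under these assumptions the restaurant tweet ranks first in the first scenario and the hacking tweet ranks first in the second, which matches the intuition the slides convey.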

Page 24: Collaborative Personalized Twitter Search with Topic-Language Models

Collaborative User Model

Model                 Topic   Words
Friend 1              Sport   Manchester: 5, Play: 4, Win: 2
                      Food    Cake: 6, Apple: 5, Oven: 2
Friend 2              Sport   Manchester: 5, Play: 4, Win: 2
Friend 3              IT      Android: 6, Coding: 2, Java: 2
                      Music   Radiohead: 4, Listen: 2, Song: 5
Collaborative Model   Sport   Manchester: 5, Play: 4, Win: 2
                      IT      Android: 6, Coding: 2, Java: 2
                      Music   Radiohead: 4, Listen: 2, Song: 5
                      Food    Cake: 6, Apple: 5, Oven: 2

Page 25: Collaborative Personalized Twitter Search with Topic-Language Models

Collaborative User Model

• Weighted sum of the IMs of the top-n friends

– based on the amount of interactions (re-tweets, mentions, conversations)

• Weight of each friend f:

– wP(f): Popularity of f

– wA(u,f): Affinity of u and f

• Weight of each f’s topic k:

– wB(u,k): Topic bias

– wI(u,f,k): Topic-interaction between u and f
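
A minimal sketch of the weighted sum described above. How the four weights are combined is not stated on the slide; the sketch assumes the friend weight is wP(f) · wA(u,f) and the topic weight is wB(u,k) · wI(u,f,k). All numeric values and names below are made up for illustration.

```python
from collections import defaultdict
from typing import Dict

TopicModel = Dict[str, Dict[str, float]]  # topic -> word -> count


def collaborative_model(friend_models: Dict[str, TopicModel],
                        friend_weight: Dict[str, float],
                        topic_weight: Dict[str, Dict[str, float]]) -> TopicModel:
    """Weighted sum of friends' individual models, per topic and word.

    friend_weight[f] stands for wP(f) * wA(u, f); topic_weight[f][k]
    stands for wB(u, k) * wI(u, f, k). Both products are assumptions.
    """
    cm: TopicModel = defaultdict(lambda: defaultdict(float))
    for f, model in friend_models.items():
        for k, words in model.items():
            w = friend_weight[f] * topic_weight[f].get(k, 0.0)
            for word, count in words.items():
                cm[k][word] += w * count
    return {k: dict(words) for k, words in cm.items()}


friends = {
    "friend1": {"Sport": {"manchester": 5, "play": 4}},
    "friend2": {"Sport": {"manchester": 5}, "IT": {"android": 6}},
}
cm = collaborative_model(
    friends,
    friend_weight={"friend1": 0.5, "friend2": 1.0},
    topic_weight={"friend1": {"Sport": 1.0}, "friend2": {"Sport": 0.2, "IT": 1.0}},
)
```

Here the word "manchester" in Sport receives 0.5 · 1.0 · 5 from the first friend and 1.0 · 0.2 · 5 from the second, so a friend who rarely interacts with the user about a topic contributes little to it.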

Page 26: Collaborative Personalized Twitter Search with Topic-Language Models

Personalization using IM and CM

From user From topic model From friends

Dirichlet smoothing

Depends on the amount of user’s tweets
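
Dirichlet smoothing makes the mixing weight depend on how much data the user has. A sketch in its standard form follows: with a user count c and total n for a topic, and a background probability p, the estimate is (c + μ·p) / (n + μ). Treating the combined estimate from friends and the topic model as the background, and the value of μ, are assumptions; the slide's exact formula is not preserved.

```python
def dirichlet_smoothed(count: float, total: float, p_background: float,
                       mu: float = 100.0) -> float:
    """Standard Dirichlet-smoothed estimate of a word probability.

    count / total come from the user's own tweets in a topic;
    p_background is the fallback estimate (assumed here to come from
    friends and the topic model). With few tweets the background
    dominates; with many tweets the user's own data dominates.
    """
    return (count + mu * p_background) / (total + mu)


# A user with no tweets falls back entirely on the background estimate.
cold_start = dirichlet_smoothed(count=0, total=0, p_background=0.01)

# A user with many tweets stays close to their own relative frequency (0.05).
active_user = dirichlet_smoothed(count=500, total=10_000, p_background=0.01)
```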

Page 27: Collaborative Personalized Twitter Search with Topic-Language Models

Search User Model (SM)

• Feedback sources: Queries + clicks

• What does a ‘click’ mean?

URL click, re-tweet, favorite

Page 28: Collaborative Personalized Twitter Search with Topic-Language Models

Search User Model (SM)

• Feedback sources: Queries + clicks

• Feedback from a ‘click’:

– Query-topic: preference for topic k when issuing Q

– Topic-word: preference for words in topic k

– Topic: user’s search bias towards topic k
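
The three kinds of feedback listed above can be kept as simple counters that are updated on each click. The class and field names below are hypothetical, and the assumption that a clicked tweet comes with a single topic label is a simplification.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import DefaultDict, List


@dataclass
class SearchUserModel:
    """Click-feedback counters for one user (illustrative structure)."""
    query_topic: DefaultDict[str, Counter] = field(
        default_factory=lambda: defaultdict(Counter))  # query -> topic counts
    topic_word: DefaultDict[str, Counter] = field(
        default_factory=lambda: defaultdict(Counter))  # topic -> word counts
    topic: Counter = field(default_factory=Counter)    # overall topic bias

    def record_click(self, query: str, tweet_words: List[str], topic: str) -> None:
        """Update all three preference counters from one clicked tweet."""
        self.query_topic[query][topic] += 1
        self.topic_word[topic].update(tweet_words)
        self.topic[topic] += 1


sm = SearchUserModel()
sm.record_click("australia", ["iphones", "hacked", "australia"], "IT")
sm.record_click("australia", ["restaurants", "australia"], "Travel")
sm.record_click("ipad", ["ipad", "hacked"], "IT")
```

Normalizing any of these counters would give the corresponding preference distribution, for example the user's topic preference when issuing a particular query.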

Page 29: Collaborative Personalized Twitter Search with Topic-Language Models

Evaluation

Page 30: Collaborative Personalized Twitter Search with Topic-Language Models

Evaluation

Page 31: Collaborative Personalized Twitter Search with Topic-Language Models

Query log collection

• Evaluation interface

– Submit query, returns tweets from Twitter API

– Rate relevant tweets

Page 32: Collaborative Personalized Twitter Search with Topic-Language Models

Datasets

• Controlled user study (Log_CoS)

– 11 users

• In-the-wild user study (Log_IwS)

– 24 users

Page 33: Collaborative Personalized Twitter Search with Topic-Language Models

Ranking Results

Baselines:

Query likelihood (J-M smoothing)

Topic model-based IR

Personalized search (User-specific language models)

Collaborative search (Cluster-specific language models)

Collaborative Personalized search

Page 34: Collaborative Personalized Twitter Search with Topic-Language Models

Ranking Results

Page 35: Collaborative Personalized Twitter Search with Topic-Language Models

Ranking Results

Page 36: Collaborative Personalized Twitter Search with Topic-Language Models

Ranking Results

Page 37: Collaborative Personalized Twitter Search with Topic-Language Models

Ranking Results

Page 38: Collaborative Personalized Twitter Search with Topic-Language Models

Comparison of models

[Figure: average per-user ranking performance after processing i of the user’s queries; panels (a) Log_CoS, (b) Log_IwS]

Page 39: Collaborative Personalized Twitter Search with Topic-Language Models

Query types

[Figure: performance by query type; panels (a) Log_CoS, (b) Log_IwS]

Page 40: Collaborative Personalized Twitter Search with Topic-Language Models

In summary

• Collaborative Personalized Twitter Search

– User’s tweets

– User’s friends’ tweets

– User’s search activity

– Organized around topics

• topic-specific language models

Page 41: Collaborative Personalized Twitter Search with Topic-Language Models

Future work

• Query-dependent personalization strategies

• Selection of an optimal set of friends for the collaborative model

• Integrating spatial and temporal features

Page 42: Collaborative Personalized Twitter Search with Topic-Language Models

Thank You!

Jan Vosecky

Kenneth Wai-Ting Leung

Wilfred Ng

Supported by SIGIR Travel Grant