Determining Relevance Rankings from Search Click Logs Dr. Carson Kai-Sang Leung Inderjeet Singh (Database and Data Mining Lab)

Upload: inderjeet-singh

Post on 10-May-2015


Page 1: Determining Relevance Rankings from Search Click Logs

Determining Relevance Rankings from Search Click Logs

Dr. Carson Kai-Sang Leung

Inderjeet Singh (Database and Data Mining Lab)

Page 2: Determining Relevance Rankings from Search Click Logs

04/11/2023 Comp 7220

Road Map

• Introduction
• Problem
• Solution Methodology
• Evaluation

Page 3: Determining Relevance Rankings from Search Click Logs

What are search click logs?

Page 4: Determining Relevance Rankings from Search Click Logs

What are their uses?

• Mining user behaviour/preferences
• Predict document relevance
• Re-rank the search results
• Compare different ranking functions (train/test)
• Optimize ad performance
• Query suggestions

How big are these logs?
◦ 10+ terabytes of entries each day
◦ Composed of billions of distinct (query, URL) pairs

Page 5: Determining Relevance Rankings from Search Click Logs

Introduction

• Documents/results presented in order of the relevance to the query
• Many ranking factors considered when ranking these results
• Ranking factors depend on query, document and query-document pair
• Improving ranking based on user preferences (likes/dislikes)
• Personalized search + Social search
• Recency (temporal) ranking

Page 6: Determining Relevance Rankings from Search Click Logs

[David Green; blog]

Page 7: Determining Relevance Rankings from Search Click Logs

User clicks as Votes (Intrinsic feedback)

# of clicks received

[CIKM'09 Tutorial]
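
The "clicks as votes" idea can be illustrated with a minimal sketch that tallies clicks per (query, URL) pair and orders URLs by their tally. The log format (one `(query, clicked_url)` tuple per click), the sample entries, and the function names `count_votes` and `rank_by_votes` are assumptions made for illustration; they are not taken from the slides.

```python
from collections import Counter, defaultdict

# Hypothetical click log: one (query, clicked_url) tuple per click.
click_log = [
    ("gmail", "mail.google.com"),
    ("gmail", "mail.google.com"),
    ("gmail", "en.wikipedia.org/wiki/Gmail"),
    ("ddr3 memory", "en.wikipedia.org/wiki/DDR3_SDRAM"),
]


def count_votes(log):
    """Return {query: Counter({url: number_of_clicks})}."""
    votes = defaultdict(Counter)
    for query, url in log:
        votes[query][url] += 1
    return votes


def rank_by_votes(votes, query):
    """Order the URLs clicked for `query` by descending click count."""
    return [url for url, _ in votes[query].most_common()]


votes = count_votes(click_log)
print(rank_by_votes(votes, "gmail"))
```

Raw counts treat every click as an equally informative vote, which is the simplest possible reading of the log; it takes no account of where a result was displayed or of how much users trust the site.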

Page 8: Determining Relevance Rankings from Search Click Logs

Problem

Trust factor: users prefer certain URLs over others, e.g., wikipedia.com, stackoverflow.com, Yahoo Answers, about.com

What is missing (in previous models)?
• Modelling the trust factor
• Clicks on sponsored results
• Related queries/searches (sidebars)
• Realistic and flexible assumptions on user behaviour

Page 9: Determining Relevance Rankings from Search Click Logs

Page 10: Determining Relevance Rankings from Search Click Logs

Different intents for search

1. Informational query – “DDR3 memory”, “SATA 3 hard drives”, “American history”

2. Navigational query – “gmail”, “digg”, “CIBC”, “CIBC credit cards”

Page 11: Determining Relevance Rankings from Search Click Logs

How does user behaviour modelling work?

[Flowchart: for each of two successive results, the decisions "Snippet Examine?", "Snippet Attractive?" and "Enough Utility?" with Yes/No branches, terminating in "End"]
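
The examine / attractive / utility decisions can be sketched as a small session simulation. The exact branch wiring of the flowchart is not recoverable from the transcript, so the wiring here is an assumption: an unexamined snippet ends the session, an unattractive snippet sends the user to the next result, and a click with enough utility ends the session. The function name `simulate_session` and the three per-result probabilities are hypothetical.

```python
import random


def simulate_session(results, rng):
    """Walk down the result list and return the 0-based ranks that were clicked.

    Each result is a dict with three hypothetical probabilities:
    'examine', 'attractive' and 'utility'.
    """
    clicks = []
    for rank, result in enumerate(results):
        if rng.random() >= result["examine"]:
            break  # snippet not examined: assume the session ends here
        if rng.random() >= result["attractive"]:
            continue  # snippet not attractive: move on to the next result
        clicks.append(rank)  # attractive snippet: the user clicks
        if rng.random() < result["utility"]:
            break  # enough utility: the user is satisfied and stops
    return clicks


# Illustrative values only: the first snippet is never attractive,
# the second is always clicked and always satisfies the user.
results = [
    {"examine": 1.0, "attractive": 0.0, "utility": 0.0},
    {"examine": 1.0, "attractive": 1.0, "utility": 1.0},
    {"examine": 1.0, "attractive": 1.0, "utility": 1.0},
]
print(simulate_session(results, random.Random(0)))
```

Fitting a click model runs this in the opposite direction: the clicks are observed in the log, and the per-result probabilities are the unknowns to be estimated.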

Page 12: Determining Relevance Rankings from Search Click Logs

Solution Methodology

• Realistic and flexible assumptions on user behaviour (session modelling)
• Consider trust bias (trust factor)
• Order the results for a particular query by the relevance scores predicted by the model
• Compare this order to the editorial ranking
• Is it a good model? Yes, if the two orderings agree to a considerable extent
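
One way to quantify how far the model's ordering agrees with the editorial ranking is the fraction of document pairs that both orderings rank the same way. The slides do not name an agreement measure, so this pairwise measure, the function name `pairwise_agreement`, and the sample scores and grades are all assumptions for illustration.

```python
from itertools import combinations


def pairwise_agreement(model_scores, editorial_grades):
    """Fraction of document pairs ordered the same way by model and editors.

    Both arguments map document id -> score. Pairs tied in either
    mapping are skipped.
    """
    agree = total = 0
    for a, b in combinations(model_scores, 2):
        model_diff = model_scores[a] - model_scores[b]
        editorial_diff = editorial_grades[a] - editorial_grades[b]
        if model_diff == 0 or editorial_diff == 0:
            continue
        total += 1
        if (model_diff > 0) == (editorial_diff > 0):
            agree += 1
    return agree / total if total else float("nan")


# Hypothetical predicted relevance scores and editorial grades for one query.
model_scores = {"d1": 0.9, "d2": 0.4, "d3": 0.6}
editorial_grades = {"d1": 3, "d2": 1, "d3": 0}
print(pairwise_agreement(model_scores, editorial_grades))
```

Here the model and the editors agree on the pairs (d1, d2) and (d1, d3) but disagree on (d2, d3), so two of the three pairs match.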

Page 13: Determining Relevance Rankings from Search Click Logs

What Next?

• Test the ranking function on different classes of queries for metric gains
• If the metrics improve over the baseline ranking function, model insights can be used as a feature in the ranking function
• Derive a retrieval/ranking function
• Deploy this model as a feature/factor for predicting relevance in a learning-to-rank algorithm

Page 14: Determining Relevance Rankings from Search Click Logs

Evaluation

Metrics
• Discounted Cumulative Gain (DCG)
• Normalized DCG (NDCG)
• Precision
• Recall

Two types of data
1. Search click logs (from real or meta search engines)
2. Benchmarking dataset LEarning TO Rank (LETOR) for information retrieval
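
The four metrics listed above can be sketched in a few lines. DCG has several variants in the literature; this sketch assumes the common form with gain 2^grade - 1 and a log2(rank + 1) discount, which the slides do not specify. The function names, the relevance threshold, and the sample grades are illustrative assumptions.

```python
import math


def dcg(grades, k=None):
    """Discounted cumulative gain of graded relevance labels in ranked order."""
    top = grades[:k] if k is not None else grades
    # enumerate() is 0-based, so rank = i + 1 and the discount is log2(rank + 1).
    return sum((2 ** g - 1) / math.log2(i + 2) for i, g in enumerate(top))


def ndcg(grades, k=None):
    """DCG divided by the DCG of the ideal (descending-grade) ordering."""
    ideal = dcg(sorted(grades, reverse=True), k)
    return dcg(grades, k) / ideal if ideal > 0 else 0.0


def precision_recall_at_k(grades, k, relevant_threshold=1):
    """Precision and recall at cutoff k, treating grade >= threshold as relevant.

    Recall is computed only over the documents present in `grades`.
    """
    hits = sum(1 for g in grades[:k] if g >= relevant_threshold)
    total_relevant = sum(1 for g in grades if g >= relevant_threshold)
    precision = hits / k
    recall = hits / total_relevant if total_relevant else 0.0
    return precision, recall


# Hypothetical editorial grades of the results, in the order the ranker returned them.
grades = [3, 0, 2, 1]
print(dcg(grades), ndcg(grades), precision_recall_at_k(grades, k=2))
```

NDCG equals 1.0 only when the ranker returns the documents in descending order of grade, which makes it convenient for comparing ranking functions across queries with different numbers of relevant documents.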

Page 15: Determining Relevance Rankings from Search Click Logs

[Chapelle and Zhang, 2009]

[Guo et al., 2009]

Page 16: Determining Relevance Rankings from Search Click Logs

References

David Green Blog. http://davidgreen.com/comparative-value-of-google-search-rankings (accessed 20 April 2011)

Fan Guo and Chao Liu. Statistical Models for Web Search Click Log Analysis. Tutorial, 2009

Fan Guo, Chao Liu, and Yi Min Wang. Efficient multiple-click models in web search. In Proceedings of the Second Web Search and Data Mining (WSDM) Conference, Barcelona, Spain, pages 124-131. ACM, 9-11 February 2009

Olivier Chapelle and Ya Zhang. A dynamic Bayesian network click model for web search ranking. In Proceedings of the 18th International Conference on World Wide Web (WWW), Madrid, Spain, pages 1-10. ACM, 20-24 April 2009

Page 17: Determining Relevance Rankings from Search Click Logs

[Tmcnet.com Blog]