Using Statistical Decision Theory and Relevance Models for Query-Performance Prediction
Anna Shtok and Oren Kurland and David Carmel
SIGIR 2010
Hao-Chin Chang
Department of Computer Science & Information Engineering
National Taiwan Normal University
2011/08/01
Outline
• Introduction
• Relevance-Model
• Relevance Score
– Clarity
– WIG
– NQC
– QF
• Ranking List
• Experiment
• Conclusion
Introduction
• We present a novel framework for query-performance prediction that is based on statistical decision theory and relevance models.
• We consider a ranking induced by a retrieval method M in response to a query q as a decision taken so as to satisfy the underlying information need.
• Our goal is to predict the query performance of M with respect to q.
• We instantiate various query-performance predictors from the framework by varying the
– estimates of the relevance model
– measures for the quality of a relevance-model estimate
– measures of similarity between ranked lists
Relevance-Model
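• $\hat{R}_{q;S}$ represents the information need $I_q$

$$\mathrm{Score}(q;d) = \log p(q|d) = \sum_{q_i \in q} \log p(q_i|d)$$

$$p(w|\hat{R}_{q;S}) = \lambda\, p(w|q) + (1-\lambda) \sum_{d \in S} p(w|d)\, p(d|q) = \lambda\, p(w|q) + (1-\lambda) \sum_{d \in S} p(w|d)\, \frac{p(q|d)}{\sum_{d' \in S} p(q|d')}$$

• Negative Cross Entropy

$$\mathrm{Score}_{CE}(\hat{R}_{q;S}; d) = \sum_{w} p(w|\hat{R}_{q;S}) \log p(w|d)$$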
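The relevance-model estimate and the negative cross-entropy score on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names, the toy document models, and the interpolation weight `lam` (0.5 here) are assumptions, and each document language model is assumed to be already smoothed so that no probability is zero.

```python
import math

def document_weights(query_terms, doc_models):
    """p(d|q) for each d in S: p(q|d) normalized over the documents in S."""
    # p(q|d) is the product of p(q_i|d) over the query terms.
    likelihoods = [math.prod(model[t] for t in query_terms) for model in doc_models]
    total = sum(likelihoods)
    return [value / total for value in likelihoods]

def relevance_model(query_terms, doc_models, lam=0.5):
    """Interpolate the query model with the weighted mixture of document models."""
    weights = document_weights(query_terms, doc_models)
    vocab = set().union(*doc_models)
    rm = {}
    for w in vocab:
        # Maximum-likelihood estimate of p(w|q) from the query text.
        p_w_q = query_terms.count(w) / len(query_terms)
        mixture = sum(p_d * m.get(w, 0.0) for p_d, m in zip(weights, doc_models))
        rm[w] = lam * p_w_q + (1 - lam) * mixture
    return rm

def cross_entropy_score(rm, doc_model):
    """Negative cross entropy between the relevance model and a document model."""
    return sum(p * math.log(doc_model[w]) for w, p in rm.items() if p > 0)

# Hypothetical toy data: two smoothed unigram models over a three-term vocabulary.
docs = [
    {"apple": 0.5, "pie": 0.3, "car": 0.2},
    {"apple": 0.2, "pie": 0.2, "car": 0.6},
]
rm = relevance_model(["apple", "pie"], docs)
reranked = sorted(range(len(docs)), key=lambda i: cross_entropy_score(rm, docs[i]), reverse=True)
print(reranked)
```

Sorting the documents by `cross_entropy_score` gives the relevance-model-based ranking that is later compared with the original ranking.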
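Relevance Score (Clarity, WIG)
• Clarity measures the score by the KL divergence between the relevance model and the collection model

$$\hat{P}_{Clarity}(\hat{R}_{q;S}|I_q) = \sum_{w} P(w|\hat{R}_{q;S}) \log \frac{P(w|\hat{R}_{q;S})}{P(w|C)}$$

• WIG is based on estimating the presumed percentage of relevant documents in the set S from which $\hat{R}_{q;S}$ is constructed

$$\hat{P}_{WIG}(\hat{R}_{q;S}|I_q) = \frac{1}{|S|} \frac{1}{\sqrt{|q|}} \sum_{d \in S} \left( \mathrm{Score}_{QL}(q;d) - \mathrm{Score}_{QL}(q;D) \right)$$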
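As a rough illustration of the two measures on this slide, the sketch below computes Clarity as the KL divergence of a relevance model from the collection model, and WIG as the length-normalized average gap between document scores and the corpus score. The function names and toy inputs are hypothetical; the models are plain dictionaries mapping terms to probabilities, and the collection model is assumed to be non-zero for every term in the relevance model.

```python
import math

def clarity(rm, collection_model):
    """KL divergence of the relevance model from the collection model."""
    return sum(p * math.log(p / collection_model[w]) for w, p in rm.items() if p > 0)

def wig(doc_scores, corpus_score, query_length):
    """Average score gain of the documents in S over the corpus, scaled by 1/sqrt(|q|)."""
    norm = 1.0 / (len(doc_scores) * math.sqrt(query_length))
    return norm * sum(score - corpus_score for score in doc_scores)

# Hypothetical inputs: a peaked relevance model and log-likelihood retrieval scores.
collection = {"apple": 0.1, "pie": 0.1, "the": 0.8}
focused = {"apple": 0.6, "pie": 0.3, "the": 0.1}
print(clarity(focused, collection))
print(wig([-2.0, -3.0], corpus_score=-5.0, query_length=4))
```

A relevance model identical to the collection model has a Clarity of zero, which matches the intuition that such a model carries no query-specific information.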
Relevance Score (NQC)
• NQC is based on the hypothesis that the standard deviation of retrieval scores in the result list is negatively correlated with the potential amount of query drift, i.e., non-query-related information manifested in the list.
• $\mu$ is the mean retrieval score in $D^{[v]}_{q;QL}$

$$\hat{P}_{NQC}(\hat{R}_{q;S}|I_q) = \frac{\sqrt{\frac{1}{v} \sum_{d \in D^{[v]}_{q;QL}} \left( \mathrm{Score}_{QL}(q;d) - \mu \right)^2}}{\left| \mathrm{Score}_{QL}(q;D) \right|}$$
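The NQC measure amounts to a standard deviation divided by the corpus score, which the following sketch makes concrete. The function name and the sample scores are hypothetical, and taking the absolute value of the corpus score is an assumption made here so that the result stays non-negative when scores are log-probabilities.

```python
import math

def nqc(top_scores, corpus_score):
    """Standard deviation of the top-v retrieval scores, normalized by the corpus score."""
    v = len(top_scores)
    mu = sum(top_scores) / v                              # mean retrieval score
    variance = sum((s - mu) ** 2 for s in top_scores) / v
    # Assumption: normalize by the magnitude of the (negative) corpus log-likelihood.
    return math.sqrt(variance) / abs(corpus_score)

# Hypothetical query-likelihood scores for the v = 4 highest-ranked documents.
print(nqc([-1.0, -1.5, -3.0, -3.5], corpus_score=-4.0))
```

A result list whose scores are all equal yields zero, the case the hypothesis associates with the most query drift.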
Relevance Score (QF)
• The goal is to represent the ranked list L by a language model
• Terms are ranked by their contribution to the language model’s KL (Kullback-Leibler) divergence from the background collection model
• The top-ranked terms are chosen to form the new query Q’
• P(D|L) is estimated by a linearly decreasing function of the rank of document D

$$P(w|L) = \sum_{D \in L} p(w|D)\, p(D|L)$$

• Each term in P(w|L) is ranked
• The top N terms ranked by $p(w|L) \log \frac{p(w|L)}{p(w|C)}$ form a weighted query $Q' = \{(w_i, t_i)\}$
• $w_i$ denotes the i-th ranked term
• The weight $t_i$ is the KL-divergence contribution of $w_i$
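The term-selection step of QF can be sketched as follows. This is an illustrative reading of the slide, not the original implementation: the function names and toy models are hypothetical, and the linearly decreasing P(D|L) is assumed to take the form n, n-1, ..., 1 normalized to sum to one, which is only one possible choice.

```python
import math

def list_language_model(doc_models):
    """P(w|L): mixture of document models with linearly decreasing rank weights."""
    n = len(doc_models)
    raw = [n - rank for rank in range(n)]        # n for the top document, 1 for the last
    total = sum(raw)
    p_d = [value / total for value in raw]       # P(D|L), assumed linear in the rank
    vocab = set().union(*doc_models)
    return {w: sum(p * m.get(w, 0.0) for p, m in zip(p_d, doc_models)) for w in vocab}

def top_terms(list_model, collection_model, n_terms):
    """Weighted query: the n_terms terms with the largest KL-divergence contribution."""
    contribution = {
        w: p * math.log(p / collection_model[w])
        for w, p in list_model.items() if p > 0
    }
    ranked = sorted(contribution.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n_terms]

# Hypothetical ranked list of two documents and a background collection model.
ranked_docs = [{"apple": 0.6, "the": 0.4}, {"car": 0.5, "the": 0.5}]
collection = {"apple": 0.05, "car": 0.05, "the": 0.9}
print(top_terms(list_language_model(ranked_docs), collection, n_terms=2))
```

Common terms such as "the" receive a negative contribution because they are less frequent in the list than in the collection, so they drop out of the weighted query.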
Similarity between ranked lists
• Pearson’s coefficient, Spearman’s-ρ, and Kendall’s-τ correlation between the original list ranking and its relevance-model-based ranking are computed

$$\mathrm{Sim}\left( \pi_M(q; D^{[k]}_{q;M}),\ \pi_M(\hat{R}_{q;S}; D^{[k]}_{q;M}) \right)$$

$$U(\pi_M(q; D); I_q) \approx \mathrm{Sim}\left( \pi_M(q; D^{[k]}_{q;M}),\ \pi_M(\hat{R}_{q;S}; D^{[k]}_{q;M}) \right) \hat{p}(\hat{R}_{q;S}|I_q)$$
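To make the final prediction concrete, the sketch below computes the three correlation measures between the scores that the original query and the relevance model assign to the same k documents, and multiplies the chosen similarity by a relevance-model quality estimate. All names are hypothetical, ties are not handled, and the plain (tie-free) variant of Kendall's coefficient is used, so this is a simplified illustration rather than the evaluated predictors.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(scores):
    """Rank positions (1 = highest score); ties are not handled."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result

def spearman(x, y):
    """Spearman's rho: Pearson's correlation computed on the rank positions."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        sign = (x[i] - x[j]) * (y[i] - y[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)

def predict_performance(original_scores, rm_scores, rm_quality, sim=pearson):
    """Similarity of the two rankings scaled by the relevance-model quality estimate."""
    return sim(original_scores, rm_scores) * rm_quality

# Hypothetical scores for the same three documents under the query and the relevance model.
print(predict_performance([3.0, 2.0, 1.0], [3.0, 1.0, 2.0], rm_quality=0.5, sim=spearman))
```

Here `rm_quality` stands for any of the quality measures from the earlier slides (Clarity, WIG, NQC, or QF).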
Experiment
Conclusion
• Improving the sampling technique used for relevance model construction
• Devising and adapting better measures of representativeness for relevance models constructed from clusters