Using Statistical Decision Theory and Relevance Models for Query-Performance Prediction
Anna Shtok and Oren Kurland and David Carmel
SIGIR 2010
Hao-Chin Chang
Department of Computer Science & Information Engineering
National Taiwan Normal University
2011/08/01
Outline
• Introduction
• Relevance-Model
• Relevance Score
– Clarity
– WIG
– NQC
– QF
• Ranking List
• Experiment
• Conclusion
Introduction
• We present a novel framework for query-performance prediction that is based on statistical decision theory and relevance models.
• We consider a ranking induced by a retrieval method in response to a query as a decision taken so as to satisfy the underlying information need.
• Our goal is to predict the query performance of the retrieval method M with respect to the query q.
• We instantiate various query-performance predictors from the framework by varying the
– estimates of the relevance model
– measures of the quality of a relevance-model estimate
– measures of similarity between ranked lists
Relevance-Model
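• $\hat{R}_{q;S}$ represents the information need $I_q$

$$\mathrm{Score}_{QL}(q;d) = \log p(q|d) = \sum_{q_i \in q} \log p(q_i|d)$$

$$p(w|\hat{R}_{q;S}) = \lambda\, p(w|q) + (1-\lambda) \sum_{d \in S} p(w|d)\, p(d|q) = \lambda\, p(w|q) + (1-\lambda) \sum_{d \in S} p(w|d)\, \frac{p(q|d)}{\sum_{d' \in S} p(q|d')}$$

• Negative Cross Entropy

$$\mathrm{Score}_{CE}(\hat{R}_{q;S};d) = \sum_{w} p(w|\hat{R}_{q;S}) \log p(w|d)$$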
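The three scores on this slide can be sketched in a few lines of Python. This is only a minimal illustration, assuming each document in S is given as a dictionary of smoothed unigram probabilities over a shared vocabulary. The function names and the default λ = 0.5 are hypothetical choices, not values from the paper.

```python
import math

def score_ql(query, doc_lm):
    """Query-likelihood score: sum of log p(q_i | d) over the query terms."""
    return sum(math.log(doc_lm[t]) for t in query)

def relevance_model(query, doc_lms, lam=0.5):
    """Interpolate the query MLE with a p(d|q)-weighted mixture of document models.

    lam=0.5 is an assumed default for illustration only.
    """
    # p(d|q) is p(q|d) normalized over the documents in S
    likelihoods = [math.exp(score_ql(query, lm)) for lm in doc_lms]
    total = sum(likelihoods)
    p_d_given_q = [l / total for l in likelihoods]
    vocab = set().union(*doc_lms)
    rm = {}
    for w in vocab:
        p_w_q = query.count(w) / len(query)  # maximum-likelihood estimate from the query
        mix = sum(lm.get(w, 0.0) * p for lm, p in zip(doc_lms, p_d_given_q))
        rm[w] = lam * p_w_q + (1 - lam) * mix
    return rm

def score_ce(rm, doc_lm):
    """Negative cross entropy: sum over w of p(w|R) * log p(w|d)."""
    return sum(p * math.log(doc_lm[w]) for w, p in rm.items() if p > 0)
```

A document whose language model is closer to the relevance model receives a higher (less negative) cross-entropy score, so ranking by `score_ce` yields the relevance-model-based ranking.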
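Relevance Score (Clarity, WIG)
• The Clarity score is measured by the KL divergence between the relevance model and the collection model

$$\hat{p}_{Clarity}(\hat{R}_{q;S}|I_q) = \sum_{w} p(w|\hat{R}_{q;S}) \log \frac{p(w|\hat{R}_{q;S})}{p(w|C)}$$

• WIG is based on estimating the presumed percentage of relevant documents in the set $S$ from which $\hat{R}_{q;S}$ is constructed

$$\hat{p}_{WIG}(\hat{R}_{q;S}|I_q) = \frac{1}{|S|} \frac{1}{\sqrt{|q|}} \sum_{d \in S} \left( \mathrm{Score}_{QL}(q;d) - \mathrm{Score}_{QL}(q;D) \right)$$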
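Both measures reduce to short computations once the relevance model and the retrieval scores are available. The sketch below is illustrative only: it assumes the relevance model and the collection model are dictionaries of term probabilities, and that the query-likelihood scores of the documents in S and of the corpus have already been computed. The function and parameter names are hypothetical.

```python
import math

def clarity(rm, collection_lm):
    """KL divergence of the relevance model from the collection model."""
    return sum(p * math.log(p / collection_lm[w]) for w, p in rm.items() if p > 0)

def wig(doc_scores, corpus_score, query_len):
    """Mean gap between document scores and the corpus score, scaled by 1/sqrt(|q|)."""
    gap = sum(s - corpus_score for s in doc_scores)
    return gap / (len(doc_scores) * math.sqrt(query_len))
```

Clarity is zero when the relevance model equals the collection model and grows as the model becomes more focused. WIG grows as the documents in S score further above the corpus.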
Relevance Score (NQC)
• NQC is based on the hypothesis that the standard deviation of retrieval scores in the result list is negatively correlated with the potential amount of query drift, i.e., non-query-related information manifested in the list.
• $\mu$ is the mean retrieval score in $D^{[v]}_{q;QL}$, the $v$ highest-ranked documents

$$\hat{p}_{NQC}(\hat{R}_{q;S}|I_q) = \frac{\sqrt{\frac{1}{v} \sum_{d \in D^{[v]}_{q;QL}} \left( \mathrm{Score}_{QL}(q;d) - \mu \right)^2}}{\left| \mathrm{Score}_{QL}(q;D) \right|}$$
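NQC is the standard deviation of the top retrieval scores, normalized by the magnitude of the corpus score. A minimal sketch, assuming the scores of the v highest-ranked documents and the corpus score are already computed (the function name is hypothetical):

```python
import math

def nqc(top_scores, corpus_score):
    """Standard deviation of the top-v scores divided by |corpus score|."""
    v = len(top_scores)
    mu = sum(top_scores) / v                          # mean retrieval score
    var = sum((s - mu) ** 2 for s in top_scores) / v  # population variance
    return math.sqrt(var) / abs(corpus_score)
```

A list whose scores are all close to the mean gives a value near zero, which under the hypothesis above suggests more query drift.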
Relevance Score(QF)
The goal is to represent the ranked list L by a language model.
Terms are ranked by their contribution to the language model’s KL (Kullback-Leibler) divergence from the background collection model.
Top ranked terms will be chosen to form the new query Q’
Relevance Score(QF)
• P(D|L) is estimated by a linearly decreasing function of the rank of document D
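• Each term in P(w|L) is ranked by its contribution to the KL divergence from the collection model
• The top N ranked terms form a weighted query $Q' = \{(w_i, t_i)\}$
• $w_i$ denotes the i-th ranked term
• The weight $t_i$ is the KL-divergence contribution of $w_i$

$$P(w|L) = \sum_{D \in L} P(w|D)\, P(D|L)$$

$$t_i = P(w_i|L) \log \frac{P(w_i|L)}{P(w_i|C)}$$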
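The construction of the weighted query can be sketched as follows. This is an illustration under stated assumptions: documents are dictionaries of term probabilities listed in rank order, and P(D|L) is taken proportional to (n − rank), which is just one possible linearly decreasing function. The function names are hypothetical.

```python
import math

def list_language_model(ranked_doc_lms):
    """P(w|L) = sum over D of P(w|D) * P(D|L), with P(D|L) decreasing linearly in rank."""
    n = len(ranked_doc_lms)
    weights = [n - r for r in range(n)]  # assumed form: top document gets n, last gets 1
    z = sum(weights)
    lm = {}
    for doc_lm, wt in zip(ranked_doc_lms, weights):
        for w, p in doc_lm.items():
            lm[w] = lm.get(w, 0.0) + p * wt / z
    return lm

def weighted_query(list_lm, collection_lm, n_terms):
    """Return the top n_terms as (term, KL-divergence contribution) pairs."""
    contrib = {
        w: p * math.log(p / collection_lm[w])
        for w, p in list_lm.items() if p > 0
    }
    ranked = sorted(contrib.items(), key=lambda item: item[1], reverse=True)
    return ranked[:n_terms]
```

Terms that are much more frequent in the list than in the collection receive the largest weights and therefore dominate the new query.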
Similarity between ranked lists
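• Pearson's coefficient, Spearman's ρ, and Kendall's τ are computed between the original ranking of the list and its relevance-model-based ranking

$$\mathrm{Sim}\left( \pi_M(q; D^{[k]}_{q;M}),\ \pi_M(\hat{R}_{q;S}; D^{[k]}_{q;M}) \right)$$

$$U\left( \pi_M(q;D); I_q \right) \approx \sum_{\hat{R}_{q;S}} \mathrm{Sim}\left( \pi_M(q; D^{[k]}_{q;M}),\ \pi_M(\hat{R}_{q;S}; D^{[k]}_{q;M}) \right) p(\hat{R}_{q;S}|I_q)$$

where $\pi_M(\cdot\,; D)$ denotes the ranking of $D$ induced by $M$, and $D^{[k]}_{q;M}$ is the list of the $k$ highest-ranked documents.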
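The three correlation measures, and the way a similarity value is combined with a representativeness estimate, can be sketched in plain Python. This is a minimal illustration that assumes a single relevance-model estimate, score lists aligned over the same k documents, and no tied scores. All function names are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equally long score lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def ranks(scores):
    """Rank positions (1 = highest score); ties are assumed not to occur."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    result = [0] * len(scores)
    for pos, i in enumerate(order, start=1):
        result[i] = pos
    return result

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank positions."""
    return pearson(ranks(x), ranks(y))

def kendall_tau(x, y):
    """Kendall's tau: (concordant - discordant) pairs over all pairs."""
    n = len(x)
    s = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return s / (n * (n - 1) / 2)

def predict(orig_scores, rm_scores, representativeness, sim=pearson):
    """Similarity of the two rankings weighted by the relevance-model estimate quality."""
    return sim(orig_scores, rm_scores) * representativeness
```

Here `representativeness` stands for any of the estimates from the earlier slides (Clarity, WIG, NQC, or QF), and `sim` selects the inter-list similarity measure.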
Experiment
Conclusion
• Improving the sampling technique used for relevance model construction
• Devising and adapting better measures of representativeness for relevance models constructed from clusters