![Page 1: On the Foundations of Diverse Information Retrievalusers.cecs.anu.edu.au/~ssanner/Papers/sigir12_slides.pdfOn the Foundations of Diverse Information Retrieval Scott Sanner, Kar Wai](https://reader030.vdocuments.site/reader030/viewer/2022041022/5ed3fc598d46b66d226337fc/html5/thumbnails/1.jpg)
On the Foundations of Diverse Information Retrieval
Scott Sanner, Kar Wai Lim, Shengbo Guo, Thore Graepel, Sarvnaz Karimi, Sadegh Kharazmi
Outline
• Need for diversity
• The answer: MMR
• But what was the question?
– Expected n-call@k
Search Result Ranking
• We query the daily news for “technology” and get this:
• Is this desirable?
• Note that de-duplication does not solve this problem
Recommendation
• Book search for “cowboys”*
*These are actual results I got from an e-book search engine.
• Why are they mostly romance books? – Will this appeal to all demographics?
Diversity Beyond IR: Machine Learning
• Classifying Computer Science web pages – Select top features by some feature scoring metric
• computer
• computers
• computing
• computation
• computational
• Certainly all are appropriate – But do these cover all relevant web pages well?
– A better approach? MRMR?
Diversity in IR
• In this talk, focus on diversity from an IR perspective:
– De-duplication (all search engines handle this: locality-sensitive hashing)
  • Same page, different URL
  • Different page versions (copied Wiki articles)
– Source diversity (easy)
  • Web pages vs. news vs. image search vs. Youtube
– Sense ambiguity (easily addressed through user reformulation)
  • Java, Jaguar, Apple
  • Arguably not the main motivation
– Intrinsic diversity (faceted information needs)
  • Heathrow (checkin, food services, ground transport)
– Extrinsic diversity (diverse user population)
  • Teens vs. parents, men vs. women, location
Radlinski and Joachims: diverse information needs (SIGIR Forum 2009)
How do these relate to previous examples?
Diversification in IR
• Maximum marginal relevance (MMR)
– Carbonell & Goldstein, SIGIR 1998
– Standard diversification approach in IR
• MMR Algorithm:
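• $S_k$ is the subset of k selected documents from D
• Greedily build $S_k$ from $S_{k-1}$, where $S_0 = \emptyset$, as follows:
$$s_k^* = \arg\max_{s_k \in D \setminus S_{k-1}} \Big[ \lambda\, \mathrm{Sim}_1(s_k, q) - (1 - \lambda) \max_{s_i \in S_{k-1}} \mathrm{Sim}_2(s_k, s_i) \Big]$$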
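The greedy MMR loop can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `mmr_select`, the Jaccard similarity, and the toy documents are all assumptions made for the example, and any similarity functions could be passed in for Sim1 and Sim2.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of terms (an illustrative choice)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def mmr_select(query, docs, k, lam, sim1=jaccard, sim2=jaccard):
    """Greedily pick up to k document names by the MMR criterion.

    docs maps a document name to its set of terms; lam is the
    relevance/diversity trade-off (lam = 1 means pure relevance).
    """
    selected = []
    remaining = list(docs)
    while remaining and len(selected) < k:
        def score(name):
            # Redundancy is the similarity to the closest already-selected doc.
            redundancy = max((sim2(docs[name], docs[s]) for s in selected),
                             default=0.0)
            return lam * sim1(docs[name], query) - (1.0 - lam) * redundancy
        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: two near-duplicate "fruit" pages and one "company" page.
toy_docs = {
    "a": {"apple", "fruit"},
    "b": {"apple", "fruit", "pie"},
    "c": {"apple", "company", "stock", "phone"},
}
toy_query = {"apple"}
```

With these toy inputs, a pure-relevance setting (`lam=1.0`) selects the two fruit pages, while `lam=0.5` swaps the second one for the company page.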
What was the Question?
• MMR is an algorithm; we don’t know what underlying objective it is optimizing.
• There have been previous formalization attempts, but the full question went unanswered for 14 years
– Chen and Karger, SIGIR 2006 came closest
• This talk: a complete derivation of MMR
– Many assumptions
– Arguably the assumptions you are making when using MMR!
Where do we start?
Let’s try to relate the set/ranking objective Precision@k to diversity*
*Note: this is non-standard in IR! IR evaluates these objectives empirically but never derives algorithms to directly optimize them (largely because of long-tail queries and no labels).
Relating Precision@k Objectives to Diversity
• Chen and Karger, SIGIR 2006: 1-call@k
– At least one document in Sk should be relevant (P@k=1)
– Very diverse: encourages you to “cover your bases” with Sk
  • Sanner et al, CIKM 2011: 1-call@k derives MMR with λ = ½
• van Rijsbergen, 1979: Probability Ranking Principle (PRP)
– Rank items by probability of relevance (e.g., modeled via term freq)
  • PRP relates to k-call@k (P@k=k) which relates to MMR with λ = 1
– Not diverse: encourages kth item to be very similar to first k-1 items
• So either λ = ½ (1-call@k, very diverse) or λ = 1 (k-call@k, not diverse)?
– Should really tune λ for MMR based on query ambiguity
  • Santos, MacDonald, Ounis, CIKM 2011: Learn best λ given query features
– So what derives λ ∈ [½, 1]?
  • Any guesses?
Small fraction of queries have diverse information needs: need good experimental design
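As a concrete reading of the n-call@k family, here is a small Python sketch of the metric on a ranked list of binary relevance labels. The function name and the label format are assumptions for illustration only.

```python
def n_call_at_k(relevance, n, k):
    """Return 1 if at least n of the top-k results are relevant, else 0.

    relevance is a ranked list of 0/1 labels (best-ranked result first).
    n = 1 gives 1-call@k; n = k gives k-call@k.
    """
    hits = sum(1 for label in relevance[:k] if label)
    return int(hits >= n)
```

For the ranked labels `[0, 1, 0, 0, 1]`, 1-call@3 is satisfied, 2-call@3 is not, and 2-call@5 is.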
Empirical Study of n-call@k
• How does diversity of n-call@k change with n?
[Figure; vertical axis: Estimate of Results Diversity]
Clearly, diversity (pairwise document correlation) decreases with n in n-call@k
J. Wang and J. Zhu. Portfolio theory of information retrieval, SIGIR 2009
Hypothesis
• Let’s try optimizing 2-call@k
– Derivation builds on Sanner et al, CIKM 2011
– Optimizing this leads to MMR with $\lambda = \tfrac{2}{3}$
• There seems to be a trend relating λ and n:
– n=1: $\lambda = \tfrac{1}{2}$
– n=2: $\lambda = \tfrac{2}{3}$
– n=k: $\lambda = 1$
• Hypothesis
– Optimizing n-call@k leads to MMR with $\lim_{k \to \infty} \lambda(k, n) = \frac{n}{n+1}$
Recap
• We wanted to know what objective leads to MMR diversification
• Evidence supports that optimizing n-call@k leads to diverse MMR-like behavior where $\lambda = \frac{n}{n+1}$
• Can we derive MMR from n-call@k?
One Detail is Missing…
• We want to optimize n-call@k
– i.e., at least n of k documents should be relevant
– Great, but given a query and corpus, how do we do this?
• Key question: how to define “relevance”?
– Need a model for this – probabilistic given PRP connections
– If diversity is needed to cover latent information needs, the relevance model must include latent query/doc “topics”
Latent Binary Relevance Retrieval Model
• Formalize as optimization in the graphical model:
– si: doc selection i=1..k
– ti: topic for i
– ri: i relevant?
– q: query
– t’: topic for q
[Figure: graphical model with nodes si, ti, ri for i = 1..k, plus q and t’]
How to determine latent topics?
[Figure legend: observed vs. latent nodes]
• Need CPTs for
– P(ti | si)
– P(t’ | q)
• Can…
– Set arbitrarily
  • Topics are words
  • L1-norm TF or TF-IDF!
– Topic modeling (not quite LDA)
[Figure: graphical model with nodes si, ti, ri for i = 1..k, plus q and t’]
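The “topics are words” option can be sketched as follows: treat each distinct term as a topic and use L1-normalized term frequency as the conditional distribution P(t | s). The function name and the whitespace tokenization are assumptions made for this illustration; the same function could be applied to a query to obtain P(t’ | q).

```python
from collections import Counter


def topic_distribution(text):
    """L1-normalized term frequencies, read as P(topic | text).

    Each distinct lower-cased token is treated as its own topic.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {term: count / total for term, count in counts.items()}
```

Because the values are non-negative and sum to 1, the result is a valid conditional probability table row for one document or query.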
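Defining Relevance
• Adapt 0-1 loss model of PRP:
$$P(r_i = 1 \mid t', t_i) = \begin{cases} 0 & \text{if } t_i \neq t' \\ 1 & \text{if } t_i = t' \end{cases}$$
[Figure: graphical model with nodes si, ti, ri for i = 1..k, plus q and t’]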
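Optimizing Expected 1-call@k
$$S^* = \arg\max_{S = \{s_1, \ldots, s_k\}} \text{Exp-1-call@}k(S, \vec{q})$$
$$\begin{aligned}
\text{Exp-1-call@}k(S, \vec{q})
&= E\Big[ \bigvee_{i=1}^{k} r_i = 1 \,\Big|\, s_1, \ldots, s_k, \vec{q} \Big]
 = P\Big( \bigvee_{i=1}^{k} r_i = 1 \,\Big|\, s_1, \ldots, s_k, \vec{q} \Big) \\
&= P\big( r_1 = 1 \vee [r_1 = 0 \wedge r_2 = 1] \vee [r_1 = 0 \wedge r_2 = 0 \wedge r_3 = 1] \vee \ldots \mid s_1, \ldots, s_k, \vec{q} \big) \\
&= \sum_{i=1}^{k} P(r_i = 1, r_1 = 0, \ldots, r_{i-1} = 0 \mid s_1, \ldots, s_k, \vec{q}) \\
&= \sum_{i=1}^{k} P(r_i = 1 \mid r_1 = 0, \ldots, r_{i-1} = 0, s_1, \ldots, s_k, \vec{q})\, P(r_1 = 0, \ldots, r_{i-1} = 0 \mid S, \vec{q})
\end{aligned}$$
• All disjuncts are mutually exclusive, so the probability of the disjunction is a sum
• sk is D-separated from r1…rk-1, so can ignore when greedy!
Greedy:
$$s_i^* = \arg\max_{s_i} P(r_i = 1 \mid r_1 = 0, \ldots, r_{i-1} = 0, s_1^*, \ldots, s_{i-1}^*, s_i, \vec{q})$$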
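Objective to Optimize: s1*
• Take a greedy approach (like MMR)
• Choose s1 via AccRel first
$$\begin{aligned}
s_1^* &= \arg\max_{s_1} P(r_1 = 1 \mid s_1, \vec{q}) \\
&= \arg\max_{s_1} \sum_{t_1, t'} I[t' = t_1]\, P(t' \mid \vec{q})\, P(t_1 \mid s_1) \\
&= \arg\max_{s_1} \underbrace{\sum_{t'} P(t' \mid \vec{q})\, P(t_1 = t' \mid s_1)}_{\text{inner product of two topic vectors}}
\end{aligned}$$
• Can derive numerous kernels including TF, TF-IDF, LSI
[Figure: graphical model restricted to q, t’, s1, t1, r1]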
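The first greedy step is an inner product between the query's topic vector and each document's topic vector. A minimal Python sketch, assuming topic distributions are given as dictionaries mapping topic to probability (the function names and toy distributions are hypothetical):

```python
def relevance(query_dist, doc_dist):
    """Sum over topics t of P(t | q) * P(t | s): the probability that the
    document's topic matches the query's topic."""
    return sum(p_q * doc_dist.get(topic, 0.0)
               for topic, p_q in query_dist.items())


def select_first(query_dist, doc_dists):
    """Return the name of the document that maximizes the relevance score."""
    return max(doc_dists, key=lambda name: relevance(query_dist, doc_dists[name]))


# Hypothetical toy distributions.
toy_query = {"apple": 0.5, "fruit": 0.5}
toy_docs = {
    "d1": {"apple": 0.5, "company": 0.5},
    "d2": {"apple": 0.5, "fruit": 0.5},
}
```

Here `d2` scores 0.5 and `d1` scores 0.25, so `d2` is selected first.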
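Objective to Optimize: s2*
• Choose s2 via AccRel next
– Condition on chosen s1* and r1=0
$$\begin{aligned}
s_2^* &= \arg\max_{s_2} P(r_2 = 1 \mid r_1 = 0, s_1^*, s_2, \vec{q}) \\
&= \arg\max_{s_2} \sum_{t_1, t_2, t'} I[t_2 = t']\, P(t_1 \mid s_1^*)\, I[t_1 \neq t']\, P(t_2 \mid s_2)\, P(t' \mid \vec{q}) \\
&= \arg\max_{s_2} \sum_{t'} P(t' \mid \vec{q})\, P(t_2 = t' \mid s_2)\, \big(1 - P(t_1 = t' \mid s_1^*)\big) \\
&= \arg\max_{s_2} \underbrace{\Big[ \sum_{t'} P(t' \mid \vec{q})\, P(t_2 = t' \mid s_2) \Big]}_{\text{relevance}} - \underbrace{\Big[ \sum_{t'} P(t' \mid \vec{q})\, P(t_1 = t' \mid s_1^*)\, P(t_2 = t' \mid s_2) \Big]}_{\text{non-diversity penalty}}
\end{aligned}$$
• Query-topic weighted diversity!
[Figure: graphical model restricted to q, t’, s1, t1, r1, s2, t2, r2]
What is λ? ½.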
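One way to read off the value ½, as a sketch: name the two bracketed sums Sim1 and Sim2. Identifying them with MMR's two similarity terms is an assumption of this sketch; unlike in MMR, the second term here still depends on the query.

```latex
\begin{aligned}
\mathrm{Sim}_1(s_2, \vec{q}) &:= \sum_{t'} P(t' \mid \vec{q})\, P(t_2 = t' \mid s_2), \\
\mathrm{Sim}_2(s_2, s_1^*, \vec{q}) &:= \sum_{t'} P(t' \mid \vec{q})\, P(t_1 = t' \mid s_1^*)\, P(t_2 = t' \mid s_2), \\
s_2^* &= \arg\max_{s_2} \big[ \mathrm{Sim}_1(s_2, \vec{q}) - \mathrm{Sim}_2(s_2, s_1^*, \vec{q}) \big] \\
      &= \arg\max_{s_2} \big[ \tfrac{1}{2}\, \mathrm{Sim}_1(s_2, \vec{q}) - \tfrac{1}{2}\, \mathrm{Sim}_2(s_2, s_1^*, \vec{q}) \big].
\end{aligned}
```

The last step uses only the fact that scaling by a positive constant does not change the argmax. Both terms carry equal weight, which matches the MMR form λ·Sim1 − (1 − λ)·Sim2 with λ = ½; with a single previously selected document, the max over the selected set is trivial.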
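Objective to Optimize: sk*, k>2
$$s_k^* = \arg\max_{s_k \in D \setminus S_{k-1}} \sum_{t'} P(t_k = t' \mid s_k)\, P(t' \mid \vec{q}) \prod_{i=1}^{k-1} \big[ 1 - P(t_i = t' \mid s_i^*) \big]$$
$$\prod_{i=1}^{k-1} \big[ 1 - P(t_i = t' \mid s_i^*) \big] = 1 - \underbrace{\Big[ \sum_{i=1}^{k-1} P(t_i = t' \mid s_i^*) - \sum_{i=1}^{k-1} \sum_{j=i+1}^{k-1} P(t_i = t' \mid s_i^*)\, P(t_j = t' \mid s_j^*) + \ldots \Big]}_{\approx\, P(t' \mid s_1^*, \ldots, s_{k-1}^*)}$$
• Derives topic t’ coverage by the Principle of Inclusion-Exclusion!
• Provides set-covering view of diversity.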
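The greedy objective above can be sketched in Python. This is an illustration under the slide's model, with topic distributions given as dictionaries; the function names and the toy data are assumptions. The sketch also computes expected 1-call@k in closed form, as the sum over t’ of P(t’|q)·[1 − ∏ᵢ(1 − P(tᵢ = t’|sᵢ))], which equals the sum of the greedy gains for the selected documents.

```python
from math import prod


def greedy_one_call(query_dist, doc_dists, k):
    """Greedily select up to k documents for expected 1-call@k.

    Each step scores a candidate s by
    sum_t P(t|q) * P(t|s) * prod_{chosen c} (1 - P(t|c)).
    Returns the selected names and the gain of each step.
    """
    selected, gains = [], []
    remaining = list(doc_dists)

    def gain(name):
        return sum(
            p_q * doc_dists[name].get(t, 0.0)
            * prod(1.0 - doc_dists[c].get(t, 0.0) for c in selected)
            for t, p_q in query_dist.items()
        )

    while remaining and len(selected) < k:
        best = max(remaining, key=gain)
        gains.append(gain(best))
        selected.append(best)
        remaining.remove(best)
    return selected, gains


def expected_one_call(query_dist, doc_dists, names):
    """P(at least one selected document matches the query topic)."""
    return sum(
        p_q * (1.0 - prod(1.0 - doc_dists[n].get(t, 0.0) for n in names))
        for t, p_q in query_dist.items()
    )


# Hypothetical toy data: two identical "fruit" pages and one "company" page.
toy_query = {"fruit": 0.5, "company": 0.5}
toy_docs = {
    "f1": {"fruit": 1.0},
    "f2": {"fruit": 1.0},
    "c1": {"company": 0.8, "other": 0.2},
}
```

On the toy data the second pick is the company page rather than the duplicate fruit page, because the fruit topic is already covered.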
So far…
• We’ve seen hints of MMR from E[1-call@k] – Need a few more assumptions to get to MMR
• Let’s also generalize to E[n-call@k] for general n:
$$\text{Exp-}n\text{-call@}k(S_k, \vec{q}) = E[R_k \geq n \mid s_1, \ldots, s_k, \vec{q}]$$
where $R_k = \sum_{i=1}^{k} r_i$ is the number of relevant documents among the k selected
Optimization Objective
• Continue with greedy approach for E[n-call@k]
– Select the next document sk* given all previously chosen documents Sk-1:
$$s_k^* = \arg\max_{s_k} E[R_k \geq n \mid S_{k-1}^*, s_k, \vec{q}]$$
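For small examples, expected n-call@k can be computed exactly under the latent topic model: given the query topic t’, the relevance variables are independent with success probabilities P(tᵢ = t’ | sᵢ), so the inner quantity is the tail of a sum of independent Bernoulli variables. The sketch below is illustrative only; the function names and the dictionary representation are assumptions.

```python
def prob_at_least(probs, n):
    """P(at least n successes) among independent Bernoulli trials."""
    dist = [1.0]  # dist[j] = P(exactly j successes so far)
    for p in probs:
        new = [0.0] * (len(dist) + 1)
        for j, mass in enumerate(dist):
            new[j] += mass * (1.0 - p)
            new[j + 1] += mass * p
        dist = new
    return sum(dist[n:])


def expected_n_call(query_dist, selected_doc_dists, n):
    """Sum over query topics t of P(t|q) * P(at least n docs match t)."""
    return sum(
        p_q * prob_at_least([d.get(t, 0.0) for d in selected_doc_dists], n)
        for t, p_q in query_dist.items()
    )


# Hypothetical toy data.
toy_query = {"fruit": 0.5, "company": 0.5}
diverse = [{"fruit": 1.0}, {"company": 0.8, "other": 0.2}]
redundant = [{"fruit": 1.0}, {"fruit": 1.0}]
```

On the toy data the diverse pair scores higher for n = 1 while the redundant pair scores higher for n = 2, in line with the observation that diversity decreases with n.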
Derivation
• Nontrivial
– Only an overview of “key tricks” here
• For full details, see
– Sanner et al, CIKM 2011: 1-call@k (gentler introduction)
  • http://users.cecs.anu.edu.au/~ssanner/Papers/cikm11.pdf
– Lim et al, SIGIR 2012: n-call@k
  • http://users.cecs.anu.edu.au/~ssanner/Papers/sigir12.pdf
– and online SIGIR 2012 appendix
  • http://users.cecs.anu.edu.au/~ssanner/Papers/sigir12_app.pdf
Marginalise out all subtopics (using conditional probability)
We write rk as conditioned on Rk-1, where it decomposes into two mutually exclusive events, hence the +
Start to push latent topic marginalizations as far in as possible.
First term in + is independent of sk so can remove from max!
• We arrive at the simplified objective
• This is still a complicated expression, but it can be expressed recursively…
Recursion
A conditional decomposition very similar to the one in the first part of the derivation.
Unrolling the Recursion
• We can unroll the previous recursion, express it in closed-form, and substitute:
Where’s the max? MMR has a max.
Deterministic Topic Probabilities
• We assume that the topics of each document are known (deterministic), hence $P(t_i \mid s_i) \in \{0, 1\}$
– Likewise for P(t|q)
– This means that a document refers to exactly one topic and likewise for queries, e.g.,
  • If you search for “Apple” you meant the fruit OR the company, but not both
  • If a document refers to “Apple” the fruit, it does not discuss the company Apple Computer
Deterministic Topic Probabilities
• Generally:
• Deterministic:
Convert a ∏ to a max
• Assuming deterministic topic probabilities, we can convert a ∏ to a max and vice versa
• For $x_i \in \{0 \text{ (false)}, 1 \text{ (true)}\}$:
$$\max_i x_i = \bigvee_i x_i = \neg \bigwedge_i (\neg x_i) = 1 - \bigwedge_i (1 - x_i) = 1 - \prod_i (1 - x_i)$$
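The identity is easy to check exhaustively for short binary vectors. A minimal Python sketch (the function name is made up for the example), which also shows that the identity relies on the values being 0 or 1:

```python
from itertools import product
from math import prod


def max_via_product(xs):
    """Compute 1 - prod(1 - x); equals max(xs) when every x is 0 or 1."""
    return 1 - prod(1 - x for x in xs)


# Exhaustive check over all binary vectors of length 1 to 4.
identity_holds = all(
    max(xs) == max_via_product(xs)
    for length in range(1, 5)
    for xs in product((0, 1), repeat=length)
)
```

For non-binary probabilities the two sides differ, e.g. `max_via_product((0.5, 0.5))` is 0.75 while the max is 0.5, which is why the deterministic topic assumption is needed for this step.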
Convert a ∏ to a max
• From the optimizing objective when , we can write
Objective After max
Combinatorial Simplification
• Deterministic topics also permit combinatorial simplification of some of the terms
• Assuming that m documents out of the chosen (k-1) are relevant, the top term and the bottom term are each non-zero a number of times that depends only on m and n.
Final form
• After…
– assuming a deterministic topic distribution,
– converting ∏ to a max, and
– combinatorial simplification
Topic marginalization leads to probability product kernel Sim1(·, ·): this is any kernel that L1 normalizes inputs, so can use with TF, TF-IDF!
MMR drops q dependence in Sim2(·, ·).
argmax invariant to constant multiplier, use Pascal’s rule to normalize coefficients to [0,1]:
Comparison to MMR
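• The optimising objective used in MMR is
$$s_k^* = \arg\max_{s_k \in D \setminus S_{k-1}} \Big[ \lambda\, \mathrm{Sim}_1(s_k, q) - (1 - \lambda) \max_{s_i \in S_{k-1}} \mathrm{Sim}_2(s_k, s_i) \Big]$$
• We note that the optimizing objective for expected n-call@k has the same form as MMR, with λ determined by n and m
– but m is unknown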
Expectation of m
• m is the expected number of relevant documents; we can lower bound it as m ≥ n
• With the assumption m=n, we obtain $\lambda = \frac{n}{n+1}$
– Our hypothesis!
– If instead m is constant, this still yields an MMR-like algorithm
• $\lambda = \frac{n}{n+1}$ also roughly follows the empirical behavior observed earlier; variation is likely due to m for each corpus
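The normalization by Pascal's rule can be illustrated numerically. The sketch below ASSUMES that the relevance and diversity coefficients are the binomial counts C(m, n−1) and C(m, n); those counts are not legible in this transcript, so treat them as a hypothesis that happens to reproduce the stated result for m = n. Pascal's rule gives C(m, n−1) + C(m, n) = C(m+1, n), so the normalized relevance weight would be n/(m+1).

```python
from fractions import Fraction
from math import comb


def mmr_lambda(n, m):
    """Normalized relevance weight under the ASSUMED coefficients.

    ASSUMPTION (not confirmed by the slides): the relevance term has weight
    comb(m, n - 1) and the diversity term has weight comb(m, n).
    """
    relevance_weight = comb(m, n - 1)
    diversity_weight = comb(m, n)
    # By Pascal's rule the denominator equals comb(m + 1, n).
    return Fraction(relevance_weight, relevance_weight + diversity_weight)
```

Under this assumption, `mmr_lambda(1, 1)` is 1/2 and `mmr_lambda(2, 2)` is 2/3, matching the trend noted earlier, and larger m with n fixed gives a smaller λ (more weight on diversity).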
Summary of Contributions
• We derived MMR from n-call@k!
– After 14 years, we have insight as to what MMR is optimizing!
– Don’t like the assumptions?
• Write down the objective you want
• Derive the solution!
Bigger Picture: Prob ML for IR
• Search engines are complex beasts
– Manually optimized (which has grown out of empirical IR philosophy)
• But there are probabilistic derivations for popular algorithms in IR
– TF-IDF, BM25, Language Model
• Opportunity for more modeling, learning, optimization
– Probabilistic models of (latent) information needs
– And solutions which autonomously learn and optimize these needs!