TRANSCRIPT
Probabilistic Modeling on Rankings
Jose A. Lozano, Ekhine Irurozki
Intelligent Systems Group, The University of the Basque Country
ACML, Singapore, November 4th, 2012
Introduction
Outline of the talk
1 Introduction
2 Thurstone Models
3 Plackett-Luce Model
4 Ranking Induced by Pair-comparisons
5 Distance-Based Ranking Models
6 Other Models
7 Independence
8 Datasets and Software
9 Conclusions
Problems we are interested in
Identity-tracking
Preferences
Information retrieval
Probabilistic Modeling on Ranking
Remarks
Social science
Machine learning: ECML, ICML, UAI, NIPS, JMLR
NOT "learning to rank"
A model can be explained from many different points of view
Probabilistic graphical model bias
Permutations
What are permutations?
A permutation σ can be seen as a bijection from the set {1, 2, ..., n} onto itself:

    σ : {1, 2, ..., n} → {1, 2, ..., n},  i ↦ σ(i)

Interpretation
σ(i) represents the rank associated to the i-th element
σ−1(i) represents the i-th ranked element
If Sn is the set of permutations of n elements, then (Sn, ∘) forms the symmetric group, with composition

    σ1 ∘ σ2 (i) = σ1(σ2(i))
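As a minimal sketch of the group operations (assuming 1-based permutations stored as Python lists, a convention chosen here and not from the talk):

```python
def compose(s1, s2):
    """(s1 ∘ s2)(i) = s1(s2(i)), with permutations stored 1-based."""
    return [s1[s2[i] - 1] for i in range(len(s1))]

def inverse(s):
    """inverse(s)[j-1] is the element whose rank under s is j."""
    inv = [0] * len(s)
    for i, rank in enumerate(s):
        inv[rank - 1] = i + 1
    return inv
```

Composing a permutation with its inverse recovers the identity e.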
Notation and representation
The identity permutation (1 2 ... n) will be denoted by e
A permutation σ can be equivalently represented as a permutation matrix M = [mij], where

    mij = 1 if σ(i) = j, 0 otherwise
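A small sketch of the matrix representation (the helper name is illustrative; σ is 1-based, matrix indices 0-based):

```python
def permutation_matrix(sigma):
    """M[i][j] = 1 iff sigma(i) = j."""
    n = len(sigma)
    return [[1 if sigma[i] == j + 1 else 0 for j in range(n)]
            for i in range(n)]
```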
Learning probability distributions over permutations
The data may come in several forms.

Complete rankings:

    (1 2 3 4 5 6 7)  (2 3 1 7 6 5 4)  (6 7 1 3 4 2 5)
    (5 2 7 6 3 4 1)  (5 1 7 2 4 3 6)  (3 5 7 1 2 4 6)  ...

Top-k rankings:

    (1 2 3 4 5 6 7 − − − − ...)  (2 3 1 7 6 5 4 − − − − ...)  ...

Rankings with ties:

    {1} {2} {4, 5}
    {2, 3, 1} {7} {6, 5, 4}
    {6, 7, 1, 3} {4} {2}
    {5, 2} {7, 6} {3, 4} {1}  ...

Rankings with unobserved middle positions:

    (1 2 3 4 − − ... − − 5 6 7)  (2 3 1 7 − − ... − − 6 5 4)  ...

Pairwise preferences:

    1 ≻ 5,  1 ≻ 7,  3 ≻ 4,  5 ≻ 7,  ...

In every case the goal is to learn a distribution

    p : Sn → [0, 1],  σ ↦ p(σ)
Representing probability distributions over permutations
Problems
How many parameters are needed?

    p(1 2 3 4 5), p(2 1 3 4 5), p(3 2 1 4 5), ..., p(5 4 3 2 1)

5! − 1 parameters; in general, n! − 1
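The parameter count can be made concrete for n = 3 (the probability values below are made-up, purely for illustration):

```python
from itertools import permutations

# An explicit distribution over S_3 stores 3! = 6 probabilities,
# i.e. 3! - 1 = 5 free parameters once normalization is imposed.
probs = [0.30, 0.25, 0.20, 0.10, 0.10, 0.05]
p = {s: q for s, q in zip(permutations((1, 2, 3)), probs)}
```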
Bayesian networks
Structure: X1 → X3 ← X2, X3 → X4, X3 → X5 ← X6, with a parameter vector θi attached to each node Xi.

Example parameters for X4: θ4 = (θ41, θ42), with

    θ411 = p(X4 = 0 | X3 = 0)
    θ412 = p(X4 = 1 | X3 = 0)

General case:

    θijk = p(Xi = x_i^k | Pa_i = pa_i^j)

p(x) = p(x1) · p(x2) · p(x3|x1, x2) · p(x4|x3) · p(x5|x3, x6) · p(x6)
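The factorization above can be sketched with made-up binary CPTs (all numbers are illustrative, not from the slides):

```python
# p(x) = p(x1)·p(x2)·p(x3|x1,x2)·p(x4|x3)·p(x5|x3,x6)·p(x6), binary variables.
p1 = {0: 0.6, 1: 0.4}
p2 = {0: 0.7, 1: 0.3}
p6 = {0: 0.5, 1: 0.5}
p3 = {(a, b): {0: 0.8, 1: 0.2} for a in (0, 1) for b in (0, 1)}  # p(x3|x1,x2)
p4 = {a: {0: 0.9, 1: 0.1} for a in (0, 1)}                       # p(x4|x3)
p5 = {(a, b): {0: 0.4, 1: 0.6} for a in (0, 1) for b in (0, 1)}  # p(x5|x3,x6)

def joint(x1, x2, x3, x4, x5, x6):
    """Joint probability read off the Bayesian-network factorization."""
    return (p1[x1] * p2[x2] * p3[(x1, x2)][x3] *
            p4[x3][x4] * p5[(x3, x6)][x5] * p6[x6])
```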
Bayesian networks for permutations

    X1 = 1 ⇒ X2 ≠ 1
    X1 = 1 ⇒ X3 ≠ 1
    X1 = 1 ⇒ X4 ≠ 1

The mutual-exclusion constraints couple every pair of variables, so the network over X1, X2, X3, X4 becomes fully connected:

p(x) = p(x1) · p(x2|x1) · p(x3|x1, x2) · p(x4|x1, x2, x3)
Inference over permutations
Recommender systems (Sun and Lebanon, 2012)
What is the most preferred film given the current personal rankings?

    argmax_{i ∉ {3, 7, 9, 15, 4}} p(σ−1(1) = i | 3 ≻ 7, 9, 15 ≻ 4)

What is the most probable order of the films given the current rankings?

    argmax_σ p(σ | 3 ≻ 7, 9, 15 ≻ 4)
Information retrieval
Label ranking (Cheng and Hüllermeier, 2009; Chen et al., 2010): finding the ranking that maximizes the probability
Thurstone Models
Thurstone Order Statistic Models
Ranking biscuits in relation to their sweetness

    B1 → 5.6 ∼ X1
    B2 → 4.9 ∼ X2
    B3 → 9.3 ∼ X3
    B4 → 2.1 ∼ X4
    B5 → 3.4 ∼ X5

Ranking the perceived values in increasing order gives the permutation (4, 3, 5, 1, 2): B4 (the smallest value) gets rank 1, then B5 gets rank 2, and so on.
Basics
Each item is associated with a true continuous value: sweetness of a cookie, loudness of a sound, etc.
A judge assesses the cookies or the sounds and ranks them.
Errors arise because of the lack of exactness of the judge's sensory apparatus.
The output of the assessment is a ranking of the items.
Codify a permutation as a real-valued vector
Given a real vector (x1, x2, ..., xn) of length n, a permutation can be obtained by ranking the positions using the values xi (i = 1, ..., n)
Given

    (2.35, 3.42, 9.35, 0.32, 11.54, 10.42, 5.23, 4.2, 7.8)

the permutation obtained is (2 3 7 1 9 8 5 4 6)
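A sketch of this decoding (the function name is illustrative):

```python
def vector_to_permutation(x):
    """Rank the positions of x in increasing order of value:
    the position holding the smallest value gets rank 1, and so on."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    sigma = [0] * len(x)
    for rank, i in enumerate(order, start=1):
        sigma[i] = rank
    return sigma
```

Applied to the vector above it reproduces the slide's permutation.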
Definition
Given random variables X1, X2, ..., Xn with a continuous joint distribution F(x1, ..., xn), we can define a random ranking σ such that σ(i) is the rank that Xi occupies among X1, X2, ..., Xn. In this way:

    P(σ) = P(X_{σ−1(1)} < X_{σ−1(2)} < ... < X_{σ−1(n)})

Unconstrained Xi's (and therefore F) produce all kinds of distributions over Sn
Thurstone Models
n−1 parameter model
Assumption: all the Xi are independent (this produces a proper subset of the distributions over permutations)
All the distributions have the same shape: Fi(x) = F(x − μi)
The parameters of the model are (μ2 − μ1, ..., μn − μ1)
Most common models: Fi Gaussian, Fi Gumbel (Luce's model)
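Sampling from the independent Gaussian case can be sketched as follows (the function name and the fixed standard deviation are choices of this sketch):

```python
import random

def sample_thurstone(mu, n_samples=1, sd=1.0):
    """Independent Gaussian Thurstone model: each judge draws
    X_i ~ N(mu[i], sd) and ranks the items by the drawn values
    (smallest value -> rank 1)."""
    out = []
    for _ in range(n_samples):
        x = [random.gauss(m, sd) for m in mu]
        order = sorted(range(len(mu)), key=lambda i: x[i])
        perm = [0] * len(mu)
        for rank, i in enumerate(order, start=1):
            perm[i] = rank
        out.append(perm)
    return out
```

With well-separated means, the item with the largest mean usually receives the largest rank.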
Learning I
MLE of the parameters leads to a complex numerical integration problem:

    p(σ) = P(X_{σ−1(1)} < X_{σ−1(2)} < ... < X_{σ−1(n)})
         = ∫_Ω f(x1, x2, ..., xn) dx1 ··· dxn

where f is the joint density and Ω = {(x1, ..., xn) | x_{σ−1(1)} < x_{σ−1(2)} < ... < x_{σ−1(n)}, xi ∈ R, i = 1, ..., n}
Learning Approaches
Böckenholt (1993): numerical integration (n ≤ 4)
Yao & Böckenholt (1999): Bayesian Gibbs sampling (n ≤ 10)
Limited information (Maydeu-Olivares, 1999; 2001; 2003): pair, triplet and tetrad comparison frequencies (n ≤ 10)
Sampling
It depends on the complexity of F(x1, ..., xn)
TrueSkill™ (Herbrich et al., 2007)
Plackett-Luce Model
Ranking objects with Plackett-Luce

Each object is associated with a positive weight, and at the first stage object i is selected with probability proportional to its weight:

    O1 → w1 / (w1 + w2 + w3 + w4 + w5)
    O2 → w2 / (w1 + w2 + w3 + w4 + w5)
    O3 → w3 / (w1 + w2 + w3 + w4 + w5)
    O4 → w4 / (w1 + w2 + w3 + w4 + w5)
    O5 → w5 / (w1 + w2 + w3 + w4 + w5)

The selected object receives the next rank and is removed, and the selection is repeated over the remaining objects. Building the ranking (4, 3, 5, 1, 2) this way (first O4, then O5, O2, O1 and finally O3) gives:

    p((4, 3, 5, 1, 2)) = w4 / (w1 + w2 + w3 + w4 + w5) · w5 / (w1 + w2 + w3 + w5) · w2 / (w1 + w2 + w3) · w1 / (w1 + w3)
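The staged product generalizes directly; a sketch (the function name is illustrative):

```python
def plackett_luce_prob(sigma, w):
    """Probability of a ranking under Plackett-Luce.
    sigma[i] is the (1-based) rank of item i; w[i] is item i's weight."""
    n = len(sigma)
    inv = [0] * n                    # inv[r] = item ranked r+1
    for item, rank in enumerate(sigma):
        inv[rank - 1] = item
    p, remaining = 1.0, sum(w)
    for r in range(n - 1):
        p *= w[inv[r]] / remaining   # choose the next item among those left
        remaining -= w[inv[r]]
    return p
```

With equal weights every ranking gets probability 1/n!, and the probabilities sum to 1 over Sn.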
Definition procedure
Parameters: W = (w1, w2, ..., wn) with Σ_{i=1}^{n} wi = 1 and wi > 0
wi is the probability of choosing object i
Procedure:
1 An object is chosen using W
2 Update W and choose another object

Model

    P(σ) = Π_{i=1}^{n−1} w_{σ−1(i)} / Σ_{j=i}^{n} w_{σ−1(j)}
Properties
The choice probability ratio between two items is independent of any other items in the set
Easy to extend to partial rankings:

    P(σ) = Π_{i=1}^{n′−1} w_{σ−1(i)} / Σ_{j=i}^{n′} w_{σ−1(j)}

The Plackett-Luce model can be seen as a Thurstone model with Fi Gumbel
Learning
Given a dataset σ1, σ2, ..., σN, find W
Two main approaches have been developed:
MM algorithm (Hunter, 2004)
Bayesian approach by means of message passing (Guiver and Snelson, 2009)
P-L Learning
Minorization-Maximization (MM) algorithm
It is an iterative algorithm to calculate the ML parameters
The EM algorithm can be considered a special case of MM
The key idea: find a function Q(w|w′) such that (minorization)

    Q(w|w′) ≤ L(w),  with equality if w = w′

In this case it follows that

    if Q(w|w′) ≥ Q(w′|w′) then L(w) ≥ L(w′)
The algorithm is as follows:
1 Give an initial guess for the parameters w^0
2 Set l = 0
3 Construct the function Q(w|w^l)
4 Find w^{l+1} = argmax_w Q(w|w^l)
5 If ||w^{l+1} − w^l|| < θ stop; otherwise set l = l + 1 and go to step 3

Two questions remain:
How to construct Q(w|w^l)
How to calculate argmax_w Q(w|w^l)
MM for P-L
Log-likelihood given a sample σ1, ..., σN:

    ln L(w|σ1, ..., σN) = Σ_{k=1}^{N} Σ_{i=1}^{n−1} [ ln w_{(σk)−1(i)} − ln Σ_{j=i}^{n} w_{(σk)−1(j)} ]

Constructing Q
For all x, y > 0 we have −ln x ≥ 1 − ln y − x/y
Taking y in the previous formula to be the corresponding sum evaluated at w^l and applying the bound to ln L, we obtain Q
MM for P-L. Function Q

    Q(w|w^l) = Σ_{k=1}^{N} Σ_{i=1}^{n−1} [ ln w_{(σk)−1(i)} − ( Σ_{j=i}^{n} w_{(σk)−1(j)} ) / ( Σ_{j=i}^{n} w^l_{(σk)−1(j)} ) ]
MM for P-L. Maximize Q
We maximize Q by setting its derivative to 0:

    w^{l+1}_r = h_r / ( Σ_{k=1}^{N} Σ_{i=1}^{n−1} δ_{kir} [ Σ_{j=i}^{n} w^l_{(σk)−1(j)} ]^{−1} )

where h_r is the number of rankings in which the r-th item is ranked higher than last, and

    δ_{kir} = 1 if item r occupies position i or beyond in the k-th permutation (i.e., it is still available at stage i), 0 otherwise
Comments on the MM approach
To guarantee convergence several properties have to be satisfied. For instance:

    "...in every possible partition of the items into two nonempty subsets, some item in the second set ranks higher than some item in the first set at least once..."

It can be applied to partial rankings without any changes
It can be combined with the Newton-Raphson method
Hunter (2004) reports experiments with up to n = 80
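A compact sketch of the resulting update for complete rankings (in the spirit of Hunter, 2004; the data format — each ranking given as an ordering, most preferred first — and all names are choices of this sketch):

```python
def mm_plackett_luce(rankings, n_iter=100):
    """MM updates for Plackett-Luce ML estimation on complete rankings.
    Each ranking lists item ids in order of preference; the weights are
    renormalized after every iteration."""
    n = 1 + max(max(r) for r in rankings)
    w = [1.0 / n] * n
    for _ in range(n_iter):
        denom = [0.0] * n
        wins = [0] * n
        for r in rankings:
            tail = sum(w[i] for i in r)       # weight mass still available
            for pos, item in enumerate(r[:-1]):
                wins[item] += 1               # item chosen at this stage
                inv_tail = 1.0 / tail
                for j in r[pos:]:
                    denom[j] += inv_tail      # items still in the choice set
                tail -= w[item]
        w = [wins[i] / denom[i] if denom[i] > 0 else w[i] for i in range(n)]
        s = sum(w)
        w = [x / s for x in w]
    return w
```

On data where one item is usually preferred, its estimated weight dominates.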
Bayesian Approach (Guiver and Snelson, 2009)
Tries to avoid the overfitting that the ML parameters can suffer in some situations
Gives a method that can be applied globally, without the constraints of the MM method
Bayesian approach: assume a prior distribution over the parameters and carry out inference to obtain the a-posteriori distribution given the dataset
The authors assume a Gamma distribution as a prior:

    wi ∼ Gamma(wi | α0, β0)

Assume a fully factorized posterior: p(w) ≈ Π_{i=1}^{n} p(wi)
Use a message-passing algorithm (power Expectation-Propagation) to carry out inference
P-L Inference
Sampling is trivial in this model
Marginal calculation can be exponential: P(σ−1(n) = i)
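Sampling via the sequential-choice definition, as a sketch (names are illustrative):

```python
import random

def sample_pl_ordering(w):
    """Sample an ordering (most to least preferred) from Plackett-Luce:
    repeatedly draw an item with probability proportional to its weight
    among the items not yet placed."""
    items, weights = list(range(len(w))), list(w)
    ordering = []
    while items:
        i = random.choices(range(len(items)), weights=weights)[0]
        ordering.append(items.pop(i))
        weights.pop(i)
    return ordering
```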
Machine Learning Applications
Cheng et al. (2010): label ranking
Chen et al. (2007): document ranking
Ranking Induced by Pair-comparisons
Ranking induced by pair-comparisons
Definition
Babington-Smith proposes a model based on pair comparisons:

    pi,j = probability of preferring item i to item j, if only that comparison were to be made

Assuming no ties, for n objects there are (n choose 2) possible comparisons
Definition
Given an ordering it is easy to get the pair comparisons:

    (3 1 2) ⇒ 3 > 1, 3 > 2, 1 > 2

The opposite is not true in general; a set of comparisons can be cyclic and therefore consistent with no ordering:

    3 > 1, 1 > 2, 2 > 3
Definition
A ranking is obtained by carrying out all the comparisons, repeating the process until a consistent set of comparisons is obtained:

    P(σ) ∝ Π_{(i,j): σ−1(i) < σ−1(j)} pi,j

For instance, given the ordering (1 3 4 2):

    p(1 3 4 2) ∝ p1,3 · p1,4 · p1,2 · p3,4 · p3,2 · p4,2

The number of parameters is quadratic: n(n − 1)/2
There is no closed form for the normalization constant
Simplifications...
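The unnormalized probability is a product over all pairs, and the normalization constant requires summing over all n! orderings; a brute-force sketch for small n (names are illustrative):

```python
from itertools import permutations

def bs_prob(ordering, p):
    """Unnormalized Babington-Smith probability of an ordering
    (items listed from most to least preferred); p[i][j] is the
    probability of preferring item i to item j."""
    prob = 1.0
    for a in range(len(ordering)):
        for b in range(a + 1, len(ordering)):
            prob *= p[ordering[a]][ordering[b]]
    return prob

def bs_normalizer(p, n):
    """Brute-force partition function: feasible only for small n."""
    return sum(bs_prob(list(s), p) for s in permutations(range(n)))
```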
Mallows-Bradley-Terry models
The model is defined by a vector of weights (v1, ..., vn), each associated with an item:

    pi,j = vi / (vi + vj),  with Σ_{i=1}^{n} vi = 1 and vi > 0

MM algorithms have been given to learn the MLE parameters (Hunter, 2004)
Extensions to Mallows-Bradley-Terry models
Agresti (1990) considers a model where the probability of i beating j depends on which individual is listed first:

    pij = θvi / (θvi + vj)  if i is home
    pij = vi / (vi + θvj)   if j is home

Rao and Kupper (1967) allow ties:

    P(i beats j) = vi / (vi + θvj)
    P(j beats i) = vj / (θvi + vj)
    P(i ties j) = (θ² − 1) vi vj / ((θvi + vj)(vi + θvj))
Distance-Based Ranking Models
Distance-based ranking models (Mallows models)
Definition
    p(σ|θ, σ0) = (1/Z(θ)) e^{−θ d(σ,σ0)}

It is an exponential model
d is a distance between permutations such that:
    d(σ, π) ≥ 0 ∀σ, π, with equality iff σ = π
    Right-invariance: d(σ, π) = d(σφ, πφ) ∀σ, π, φ
Two parameters:
    Central permutation σ0
    Spread parameter θ ≥ 0
Z(θ) is the partition function
Mallows model
Equivalent to the Gaussian distribution for permutations
Distances between permutations I
Kendall-tau metric. Measures the number of disagreements between two permutations:

    T(π, σ) = Σ_{i<j} I[(π(i) − π(j))(σ(i) − σ(j)) < 0]

Example

    π = (2 3 1 5 4)   σ = (1 3 4 5 2)

    (1,2) → π(1) = 2 < π(2) = 3   σ(1) = 1 < σ(2) = 3   (agreement)
    (1,3) → π(1) = 2 > π(3) = 1   σ(1) = 1 < σ(3) = 4   (disagreement)
    ...
Distances between permutations II
Kendall-tau is equivalent to the minimum number of adjacent swaps needed to go from π−1 to σ−1.
0 ≤ T(π, σ) ≤ n(n − 1)/2
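A direct sketch of the definition (the function name is illustrative):

```python
def kendall_tau(pi, sigma):
    """Number of item pairs on whose relative order pi and sigma disagree;
    pi[i] is the rank that pi assigns to item i+1."""
    n = len(pi)
    return sum(1 for i in range(n) for j in range(i + 1, n)
               if (pi[i] - pi[j]) * (sigma[i] - sigma[j]) < 0)
```

On the slide's example π = (2 3 1 5 4), σ = (1 3 4 5 2) the distance is 4.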
Distances between permutations III
Spearman rho metric:

    R(π, σ) = ( Σ_{i=1}^{n} (π(i) − σ(i))² )^{1/2}

Spearman footrule:

    F(π, σ) = Σ_{i=1}^{n} |π(i) − σ(i)|
Distances between permutations IV
Hamming distance:

    H(π, σ) = Σ_{i=1}^{n} I[π(i) ≠ σ(i)]

Cayley metric:

    C(π, σ) = minimum number of transpositions to go from π to σ
Distances between permutations V
Ulam metric:

    U(π, σ) = n − (maximum number of items ranked in the same relative order by π and σ)
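Of these, the Cayley metric has a convenient characterization: it equals n minus the number of cycles of πσ−1 (each cycle of length L needs L − 1 transpositions). A sketch, assuming 1-based permutations as lists:

```python
def cayley(pi, sigma):
    """Cayley distance via the cycle decomposition of pi ∘ sigma^{-1}."""
    n = len(pi)
    inv_sigma = [0] * n
    for i, s in enumerate(sigma):
        inv_sigma[s - 1] = i + 1                     # 1-based inverse
    comp = [pi[inv_sigma[i] - 1] for i in range(n)]  # pi ∘ sigma^{-1}
    seen, cycles = [False] * n, 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:
                seen[j] = True
                j = comp[j] - 1
    return n - cycles
```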
Mallows model
Learning
Given a dataset σ1, σ2, ..., σN, find θ and σ0
Maximum likelihood estimation:
    σ0 = argmin_σ Σ_{i=1}^{N} d(σi, σ)  (consensus ranking, NP-hard for T)
    Given σ0, θ can be found by a standard numerical method for convex optimization

Inference
For any distance d: Gibbs sampling
There are alternative methods depending on d
Mallows model with Kendall distance
Definition
In this particular case the partition function can be calculated in closed form:

    p(σ|θ, σ0) = [ (1 − e^{−θ})^{n−1} / Π_{i=1}^{n−1} (1 − e^{−(n−i+1)θ}) ] e^{−θ T(σ,σ0)}

Learning reduces to the Kemeny ranking problem:

    σ0 = argmin_σ Σ_{i=1}^{N} T(σi, σ)
Learning
Ali and Meila (2012) compare 104 algorithms on the Kemeny ranking problem. Borda is one of the best:
1 Calculate, for each item i, its average rank:

    vi = (1/N) Σ_{j=1}^{N} σj(i)

2 Assign rank 1 in σ0 to the item with the smallest of v1, ..., vn, rank 2 to the second smallest, and so on
Mallows model
Problems
Unimodality
Estimation
Permutations at the same distance from σ0 have the same probability
Extensions
Mixture of Mallows models (Murphy and Martin, 2003; Lee and Yu, 2012; Lu and Boutilier, 2011)
Two-sided infinite extension (Gnedin and Olshanski, 2012)
Bao and Meila (2010) extend the Mallows model to infinitely many items; they learn the model from top-t lists and extend it to multi-modal data
Generalized Mallows models
Generalized Mallows models
Definition
Fligner and Verducci (1986): if a distance d can be written as

    d(π, σ) = Σ_{i=1}^{n} Si(σ, π)

and the Si are independent under the uniform distribution, then the Mallows model is factorizable. Moreover, it can be generalized to an n-parameter model:

    P(σ) ∝ exp( −Σ_{j=1}^{n} θj Sj(σ, σ0) )

where a different spread parameter θj is associated with each position of the permutation
Kendall distance
Kendall distance can be decomposed as:

    T(π, σ) = Σ_{i=1}^{n−1} Vi(π, σ)

where

    Vi(π, σ) = Σ_{j=i+1}^{n} I[(π(i) − π(j))(σ(i) − σ(j)) < 0]

There exists a bijection between the set of vectors (v1, ..., vn−1) and the set of permutations σ
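For σ0 = e this bijection is the classical inversion-table construction; a sketch (function names are illustrative):

```python
def v_vector(sigma):
    """V_i(sigma, e): number of later positions holding a smaller value,
    i.e. the inversion table of sigma (i = 1, ..., n-1)."""
    n = len(sigma)
    return [sum(1 for j in range(i + 1, n) if sigma[j] < sigma[i])
            for i in range(n - 1)]

def from_v_vector(v):
    """Inverse map: rebuild the permutation from (v1, ..., v_{n-1})."""
    remaining = list(range(1, len(v) + 2))
    return [remaining.pop(r) for r in v] + remaining
```

The sum of the V-vector is exactly the Kendall distance to the identity.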
Kendall distance
If we assign a different spread parameter θi to each position i, the exponent −θ T(σ, σ0) is replaced by the weighted sum

    −Σ_{i=1}^{n−1} θi Vi(σ, σ0)

and the model can be written as:

    p(σ) = Π_{i=1}^{n−1} (1/ψi(θi)) e^{−θi Vi(σ,σ0)} = Π_{i=1}^{n−1} [ (1 − e^{−θi}) / (1 − e^{−(n−i+1)θi}) ] e^{−θi Vi(σ,σ0)}
Generalized Mallows models with Kendall distance
Sampling
The distribution of Vi (i = 1, ..., n − 1) under the model is known:

    p(Vi(σ, σ0) = r) = e^{−rθi} / ψi(θi)

Learning
Mandhani and Meila (2009) give an A* search algorithm with an admissible heuristic.
Generalized Mallows model
Cayley distance
Define variables Xi as follows:

    cycleσ(i) = {σ^k(i) | k = 0, 1, ...},  where σ^0(i) = i and σ^k(i) = σ^{k−1}(σ(i))

    Xi(σ) = 0 if i = max cycleσ(i), 1 otherwise

Cayley distance can be written as:

    C(π, σ) = C(πσ−1, σσ−1) = C(πσ−1, e) = Σ_{i=1}^{n−1} Xi(πσ−1)

A development similar to the one for Kendall can be given
Multi-stage ranking models
Definition
At each step a decision is taken
A modal ranking σ0 is assumed to exist
The decision only depends on the number of items left
It has n(n − 1)/2 parameters:

    p(m, r) = probability of taking the m-th best decision at stage r

Let Vr = m if the (m + 1)-th best decision is taken at stage r. Then

    P(π) = Π_{r=1}^{n−1} p(Vr, r)   (a factorizable model)

Mallows and Plackett-Luce can be seen as multi-stage models
Other Models
Models based on marginals
First-order marginals
First-order marginals Q = [qij]:

    qij = P(σ(i) = j)

Easy to learn and to represent
Which distribution is represented by these marginals? Infinitely many distributions share the same first-order marginals
First-order marginals
Criterion to choose between them: take the one with maximum entropy:

    P(σ) = exp( Σ_{i=1}^{n} Y(i,σ(i)) − 1 )

where Y ∈ R^{n×n}
Obtaining Y is #P-hard (Agrawal et al., 2008)
High-order marginals
Huang et al. (2009) consider models based on high-order marginals
They use the Fourier transform over the symmetric group to carry out inference tasks
Irurozki et al. (2011) give a first approach to learning these kinds of models
Probabilistic Modeling on Rankings
Other Models
Non-parametric model I
Based on Mallows kernelsLebanon and Mao (2008) give a probabilistic approachbased on Mallows kernels with Kendall distance
p(σ) = (1 / (N·Z(θ))) · ∑_{i=1}^{N} exp(−θ·d(σ, σi))
They were able to learn from partially ranked data.
Efficient learning.
Lacks: marginalization, conditioning.
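A minimal sketch of this kernel estimator, assuming the Kendall distance and computing the Mallows normalizing constant Z(θ) by brute-force enumeration of S_n, so it is only feasible for small n:

```python
from itertools import combinations, permutations
from math import exp

def kendall(s1, s2):
    # Kendall distance: number of item pairs on whose relative order
    # the two rankings disagree
    return sum((s1[i] < s1[j]) != (s2[i] < s2[j])
               for i, j in combinations(range(len(s1)), 2))

def mallows_kde(sigma, centers, theta):
    # p(sigma) = 1 / (N Z(theta)) * sum_i exp(-theta * d(sigma, sigma_i)),
    # where Z(theta) is the Mallows normalizing constant, which does not
    # depend on the center by the invariance of the Kendall distance
    n = len(sigma)
    e = tuple(range(n))
    Z = sum(exp(-theta * kendall(tau, e)) for tau in permutations(range(n)))
    return sum(exp(-theta * kendall(sigma, c)) for c in centers) / (len(centers) * Z)
```

Because each kernel sums to Z(θ) over S_n, the estimate is a proper distribution over S_n.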
Non-parametric model II
Modifying Mallows kernels
Sun and Lebanon (2012) use a triangular kernel based on the Kendall distance to solve problems in recommendation systems.
They can deal with all kinds of partial rankings.
Based on the Fourier transform
Barbosa and Kondor (2010) give another kernel approach based on the use of the Fourier transform.
Independence
Luce model
A ranking σ is built by first selecting the preferred item, then, among the remaining items, the second preferred one, and so on. Given n items, each with weight wi:
P(σ) = ∏_{k=1}^{n−1} P_{ik,...,in}(σ(k)) = ∏_{k=1}^{n−1} w_{σ−1(k)} / ∑_{j=k}^{n} w_{σ−1(j)}
L-decomposability induces a conditional independence
P(σ−1(k) = ik | σ−1(1) = i1, . . . , σ−1(k − 1) = ik−1) = P_{ik,...,in}(ik)

Thurstone models and the Mallows models based on the Kendall, Hamming or Spearman's footrule distances are also L-decomposable.
Definition (independence based on relative rankings)
The subsets A ⊂ {1, . . . , n} and B = A^c are independent under distribution p(σ) if
p(σ) = pA(σA) · pB(σB)
First order marginal independence
Necessary (but not sufficient) condition for full independence.
Given this collection of permutations, the frequency matrix (how often each item occupies each position) can be computed:

(1 2 5 3 4)
(2 1 4 3 5)
(1 2 3 4 5)
(1 2 3 5 4)
(1 2 5 4 3)
(2 1 5 3 4)
First order independence detection
Finding the independent sets is a clustering problem.

Approximation quality
What if every time item 2 is in position 1, item 3 is in position 4?
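The frequency matrix for the collection above can be computed directly. As a sketch, Q[i][j] counts how often item i + 1 occupies position j + 1 (reading each permutation as position → item), and the zero blocks expose candidate independent sets:

```python
def frequency_matrix(perms):
    # Q[i][j] = number of rankings in which item i+1 occupies position j+1
    n = len(perms[0])
    Q = [[0] * n for _ in range(n)]
    for perm in perms:
        for pos, item in enumerate(perm):
            Q[item - 1][pos] += 1
    return Q

perms = [(1, 2, 5, 3, 4), (2, 1, 4, 3, 5), (1, 2, 3, 4, 5),
         (1, 2, 3, 5, 4), (1, 2, 5, 4, 3), (2, 1, 5, 3, 4)]
Q = frequency_matrix(perms)
# items 1 and 2 only ever occupy positions 1-2, so {1, 2} and {3, 4, 5}
# satisfy the necessary condition for first-order independence
```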
Riffle independence (Huang et al. 2010)
Riffle shuffle
Definition
Distribution p is said to be riffle independent if

p(σ) = m(τ(A,B)) · pA(σA) · pB(σB)
Simplifications
Different settings of the interleaving distribution lead to simpler models.
To take home
A natural criterion in the ranking domain.
First generate the rankings on the independent subsets and then shuffle those rankings to obtain the complete one.
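A sketch of this generate-then-shuffle construction; the samplers rank_A and rank_B and the binary interleaving mask are hypothetical stand-ins for pA, pB and the interleaving distribution m(τ):

```python
def riffle_sample(rank_A, rank_B, interleave):
    """Riffle two independently generated rankings into a full one.

    rank_A, rank_B: callables returning a ranking (list) of each subset.
    interleave: callable returning a 0/1 mask over the full positions,
    with a 1 wherever an A-item should be placed (|A| ones in total).
    """
    a, b = rank_A(), rank_B()
    mask = interleave()
    out, ai, bi = [], 0, 0
    for take_a in mask:
        if take_a:
            out.append(a[ai]); ai += 1
        else:
            out.append(b[bi]); bi += 1
    return out
```

For example, riffling [1, 2] and [3, 4, 5] with the mask [1, 0, 1, 0, 0] yields [1, 3, 2, 4, 5]: the relative orders within A and within B are preserved, only the interleaving varies.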
Summary
L-decomposability: the probability of selecting an item at each stage does not depend on the order of the already selected items.
First order independence: a subset of items A always maps into a subset of the positions X (|A| = |X|).
Riffle independence: given two riffle independent subsets A and B and items a1, a2 ∈ A and b ∈ B, knowing that a1 is ranked before a2 does not give any information about where b is ranked.
Datasets and Software
The application of rankings has changed through history, from psychology to machine learning.
So have the size and amount of data, from small datasets to huge amounts of data.
Kinds of data
Partial ranking: not every item has received a rank, e.g. a top-5 ranking.
Rank with ties: a set of items is preferred to another set, but there is no order within the items of a set.
Implicit data: the acts of users are known (buying film A or visiting web page B) but not their opinion about it; there is no rating or ranking, and no negative feedback.
German sample (Croon, 1988)
Small database of the permutations of 4 elements, cited in (Böckenholt, 2002).
Idea (Fligner et al. 1986)
98 college students were asked to rank five words, namely thought, play, theory, dream, and attention, regarding their association with the word idea.
Papers that cite it include (Gupta et al. 2002).
Cars (Böckenholt, 2002)
279 Spanish college students were asked to rank four compact cars according to their purchase preferences. The authors provide the ranking patterns' observed frequencies in this sample.
APA (Diaconis, 1988)
The American Psychological Association dataset includes 15449 ballots of the 1980 presidential election, 5738 of which are complete rankings, in which the candidates are ranked from most to least favorite.
It is very frequently used (Huang et al. 2010).
Paired comparisons (Hunter, 2003)
NFL dataset: results of the 1997 NFL season.
NASCAR dataset: results of the 2002 racing season; 36 races and 83 drivers took part.
Sushi (Kamishima et al. 2010)
This information, collected via the web, comprises user information as well as their preferences in sushi, including:
1025 partial rankings on 100 different kinds of sushi.
1025 full rankings on 10 different kinds of sushi (items are ranked by every ranker).
Cited by (Huang et al. 2010).
Jester (Goldberg et al. 2001)
4.1 million continuous ratings (−10.00 to +10.00) of 100 jokes from 73421 users.
Appears in (Meila et al. 2010).
Both of the following are offline.

Netflix (Bennett et al. 07)
Movie recommender system. In 2009 a 1,000,000 dollar prize was awarded to BellKor's Pragmatic Chaos for besting Netflix's recommender.
Offline for privacy reasons.
WikiLens
WikiLens was a generalized collaborative recommender system that allowed its community to define item types (movie, book, album, restaurant, web) and categories for each type, and then rate and get recommendations for items.
Book crossing (Ziegler, 2005)
Collected from the Book-Crossing community.
Contains 278,858 users (with demographic information) providing 1,149,780 ratings (explicit / implicit) about 271,379 books.
MovieLens
Rankings with ties.
MovieLens 100k: 100,000 ratings (1-5) from 1000 users on 1700 movies.
MovieLens 1M: 1 million ratings from 6000 users on 4000 movies.
MovieLens 10M: 10 million ratings and 100,000 tag applications applied to 10,000 movies by 72,000 users. In this dataset, the movies are linked to movie review systems. Each movie has its IMDb and RT identifiers, English and Spanish titles, picture URLs, genres, directors, actors (ordered by "popularity"), RT audience's and experts' ratings and scores, countries, and filming locations.
HetRec 2011 conference
Highly sparse databases.
movielens 2k: extension of the MovieLens10M dataset, which contains personal ratings and tags about movies.
delicious 2k: implicit data from the Delicious social bookmarking system. Each of the 1867 users has bookmarks, tag assignments, i.e. tuples [user, tag, bookmark], and contact relations within the dataset's social network. Each bookmark has a title and URL.
lastfm 2k: implicit dataset in which each user has a list of most listened music artists, tag assignments, i.e. tuples [user, tag, artist], and friend relations within the dataset's social network. Each artist has a Last.fm URL and a picture URL.
Non-parametric models
The SnOB software provides general functions on permutations, such as the fast Fourier transform.
The props toolbox (based on SnOB) provides some efficient inference algorithms for distributions over permutations, and examples as described in the paper (Huang et al. 2009).
Mallows model
pyMallows is a set of Python routines for fitting and simulating from a generalized Mallows model. Learning algorithms are implemented for both full and partial rankings.
Paired comparisons
prefmod fits and simulates data as paired comparisons, Bradley-Terry models and others, and their extensions. The package includes datasets for testing.
BradleyTerry2, eba and psychotree are other R packages for paired comparisons.
MMBT is a MATLAB package for the estimation of the Bradley-Terry model and its extension to the multi-comparison case.
Conclusions
Many processes are based on orderings and rankings.
New technologies allow storing thousands of partial rankings of thousands of items.
New probabilistic approaches need to be developed to deal with them.
Two ways:
To adapt well-known probabilistic models.
To create new specific models for the new kind of data.