sac treck 2008
the effect of correlation coefficients on
communities of recommenders
neal lathia, stephen hailes, licia capra
department of computer science
university college london
n.lathia@cs.ucl.ac.uk
ACM SAC TRECK, Fortaleza, Brazil: March 2008
Trust, Recommendations, Evidence and other Collaboration Know-how
recommender systems:
built on collaboration between users
collaborative filtering research: design methods to solve problems
1. accuracy, coverage
2. data sparsity, cold-start
3. incorporating tag knowledge
for example,
… a method to classify content correctly
data → intelligent process → predicted ratings
our focus: k-nearest neighbours (kNN)
how do we model kNN collaborative filtering?
a graph of cooperating users
[figure: graph of users centred on "me"]
nodes = users
links = weighted according to similarity
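The graph view can be sketched in a few lines, assuming the similarity weights on the links from "me" have already been computed. The class name `NeighbourGraph`, the method `topK`, and the sample weights are illustrative assumptions, not part of the slides.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class NeighbourGraph {
    // Returns the ids of the k neighbours with the highest similarity weight,
    // strongest first.
    static List<String> topK(Map<String, Double> weights, int k) {
        return weights.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed())
                .limit(k)
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical links from "me" to three other users.
        Map<String, Double> weights = new HashMap<>();
        weights.put("u1", 0.87);
        weights.put("u2", -0.50);
        weights.put("u3", 0.12);
        System.out.println(topK(weights, 2)); // the two strongest links
    }
}
```

Changing the similarity measure changes the weights in the map, and therefore which users end up in the top-k neighbourhood.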
accuracy, coverage
to answer this question, we need to find the optimal weighting:
the best similarity measure for the dataset, from the many available:
$$w_{a,b} = \frac{|R_a \cap R_b|}{|R_a \cup R_b|}$$
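$$w_{a,b} = \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2 \sum_i (r_{b,i} - \bar{r}_b)^2}}$$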
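$$w_{a,b} = \frac{1}{N} \cdot \frac{\sum_i (r_{a,i} - \bar{r}_a)(r_{b,i} - \bar{r}_b)}{\sqrt{\sum_i (r_{a,i} - \bar{r}_a)^2 \sum_i (r_{b,i} - \bar{r}_b)^2}}$$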
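Two of the measures on this slide can be sketched as follows: the co-rated proportion (items both users rated, divided by items either rated) and the Pearson correlation over co-rated items. This is a minimal sketch, not the authors' implementation. The class and method names are hypothetical, and the code assumes means are taken over the co-rated items only and that a zero denominator yields a similarity of 0.

```java
import java.util.HashSet;
import java.util.Set;

class SimilarityMeasures {
    // |Ra ∩ Rb| / |Ra ∪ Rb|, where each set holds the ids of rated items.
    static double coRatedProportion(Set<Integer> ratedByA, Set<Integer> ratedByB) {
        Set<Integer> union = new HashSet<>(ratedByA);
        union.addAll(ratedByB);
        if (union.isEmpty()) {
            return 0.0; // assumption: no ratings at all means no similarity
        }
        Set<Integer> both = new HashSet<>(ratedByA);
        both.retainAll(ratedByB);
        return (double) both.size() / union.size();
    }

    // Pearson correlation; a[i] and b[i] are the two users' ratings of the
    // same co-rated item i.
    static double pearson(double[] a, double[] b) {
        int n = a.length;
        double meanA = 0.0, meanB = 0.0;
        for (int i = 0; i < n; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= n;
        meanB /= n;
        double num = 0.0, sqA = 0.0, sqB = 0.0;
        for (int i = 0; i < n; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            num += da * db;
            sqA += da * da;
            sqB += db * db;
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0.0 ? 0.0 : num / den; // assumption: flat profile gives 0
    }
}
```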
and there are more still…
$$w_{a,b} = \frac{\sum_i (r_{a,i} - 2.5)(r_{b,i} - 2.5)}{\sqrt{\sum_i (r_{a,i} - 2.5)^2 \sum_i (r_{b,i} - 2.5)^2}}$$
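A variant that measures deviations from a fixed scale midpoint rather than from each user's mean can be sketched like this. The class name `ConstrainedPearson` is hypothetical, the midpoint is passed in as a parameter (2.5 is the value that appears on the slide), and returning 0 for a zero denominator is an assumption.

```java
class ConstrainedPearson {
    // Correlation of deviations from a fixed midpoint; a[i] and b[i] are the
    // two users' ratings of the same co-rated item i.
    static double similarity(double[] a, double[] b, double midpoint) {
        double num = 0.0, sqA = 0.0, sqB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - midpoint;
            double db = b[i] - midpoint;
            num += da * db;
            sqA += da * da;
            sqB += db * db;
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0.0 ? 0.0 : num / den; // assumption: undefined case gives 0
    }

    public static void main(String[] args) {
        double[] mine = {2, 3, 1, 5, 3};
        double[] neighbour = {4, 1, 3, 2, 3};
        System.out.println(similarity(mine, neighbour, 2.5));
    }
}
```

Because the midpoint is fixed, two users who both rate everything highly come out as similar, whereas a mean-centred measure would discard that shared bias.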
concordance: proportion of agreement
$$w_{a,b} = \frac{C - D}{N - T}$$
+0.5, +3.0: concordant
-1.5, +1.5: discordant
+1.5, +/-?: tied
(Somers’ d)
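The concordance count can be sketched as below, under a stated assumption about how pairs are classified: an item is concordant when both users deviate from their own mean in the same direction, discordant when they deviate in opposite directions, and tied when either deviation is zero. The similarity is then (C - D) / (N - T). The class name `SomersD` and the choice to pass the means in as parameters are hypothetical.

```java
class SomersD {
    // a[i] and b[i] are the two users' ratings of co-rated item i;
    // meanA and meanB are each user's mean rating.
    static double concordance(double[] a, double meanA, double[] b, double meanB) {
        int concordant = 0, discordant = 0, tied = 0;
        for (int i = 0; i < a.length; i++) {
            double product = (a[i] - meanA) * (b[i] - meanB);
            if (product > 0) {
                concordant++;   // same direction of deviation
            } else if (product < 0) {
                discordant++;   // opposite directions
            } else {
                tied++;         // at least one rating sits on the mean
            }
        }
        int n = a.length;
        // Assumption: if every pair is tied, report no similarity.
        return n == tied ? 0.0 : (double) (concordant - discordant) / (n - tied);
    }
}
```

For example, ratings {4, 1, 5, 3} and {5, 2, 1, 4} with both means at 3 give two concordant items, one discordant, and one tied, so the weight is (2 - 1) / (4 - 1).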
community view of the graph:
(a very small example)
[figure: graph centred on "me", links labelled with similarity weights between -0.99 and 0.99]
or, put another way:
(a very small example)
[figure: graph centred on "me", links labelled good, bad, or none]
what is the best way of generating the graph?
like this?
(a very small example)
[figure: graph centred on "me", one possible assignment of good, bad, and none labels to the links]
or like this?
(a very small example)
[figure: graph centred on "me", an alternative assignment of good, bad, and none labels to the links]
similarity values depend on the method used:
there is no agreement between measures
my profile: [2][3][1][5][3]
neighbour profile: [4][1][3][2][3]
pearson: -0.50 (bad)
weighted-pearson: -0.05 (near zero)
cosine angle: 0.76 (good)
co-rated proportion: 1.00 (very good)
concordance: -0.06 (near zero)
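Two of the values on this slide can be reproduced with a short sketch. The cosine of the angle between the two rating vectors comes out at about 0.76. The weighted-pearson value is consistent with scaling the Pearson value by n/50 for n co-rated items; that threshold of 50 is an assumption inferred from the numbers (5 items, -0.50 becoming -0.05), not something the slide states. The class and method names are hypothetical.

```java
class ProfileMeasures {
    // Cosine of the angle between two rating vectors over the same items.
    static double cosine(double[] a, double[] b) {
        double dot = 0.0, sqA = 0.0, sqB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            sqA += a[i] * a[i];
            sqB += b[i] * b[i];
        }
        double den = Math.sqrt(sqA * sqB);
        return den == 0.0 ? 0.0 : dot / den;
    }

    // Scales a correlation down when it rests on few co-rated items.
    // The threshold (50 here) is an assumed parameter.
    static double significanceWeighted(double pcc, int coRated, int threshold) {
        return pcc * Math.min(coRated, threshold) / (double) threshold;
    }

    public static void main(String[] args) {
        double[] mine = {2, 3, 1, 5, 3};
        double[] neighbour = {4, 1, 3, 2, 3};
        System.out.println(cosine(mine, neighbour));             // about 0.76
        System.out.println(significanceWeighted(-0.50, 5, 50));  // about -0.05
    }
}
```

The same pair of profiles therefore reads as "good" under one measure and "near zero" under another, which is the disagreement the slide points at.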
nodes = users
links = weighted according to similarity
each method will change the distribution of similarity across the graph
… the pearson distribution
[figure: "Pearson Distribution" histogram; x-axis: Range (similarity bins of width 0.05 from -1.0 to 1.0); y-axis: Proportion (0 to 0.09)]
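A distribution of this kind can be produced by binning every similarity value in the graph and reporting the proportion that falls in each bin. The sketch below assumes equal-width bins over [-1.0, 1.0], with a value of exactly 1.0 placed in the last bin. The class name `SimilarityHistogram` is hypothetical.

```java
class SimilarityHistogram {
    // Returns, for each of `bins` equal-width bins over [-1.0, 1.0],
    // the proportion of similarity values that fall inside it.
    static double[] proportions(double[] similarities, int bins) {
        double[] result = new double[bins];
        if (similarities.length == 0) {
            return result;
        }
        double width = 2.0 / bins;
        for (double s : similarities) {
            int index = (int) ((s + 1.0) / width);
            // Clamp so that 1.0 (and any rounding overshoot) lands in the last bin.
            index = Math.max(0, Math.min(bins - 1, index));
            result[index] += 1.0 / similarities.length;
        }
        return result;
    }

    public static void main(String[] args) {
        double[] weights = {-1.0, 0.0, 0.5, 1.0}; // hypothetical link weights
        double[] p = proportions(weights, 40);     // 40 bins of width 0.05
        System.out.println(p.length);
    }
}
```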
… the modified pearson distributions: weighted-PCC, constrained-PCC
[figure: "Modified Pearson Distributions" histogram; series: Weighted-PCC, Constrained-PCC; x-axis: Range (-1.0 to 1.0); y-axis: Proportion (0 to 0.35)]
… and other measures
[figure: "Other Distributions" histogram; series: Co-Rated, Somers, VSS; x-axis: Range (-1.0 to 1.0); y-axis: Proportion (0 to 0.7)]
somers’ d, co-rated, cosine angle
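an experiment with random numbers
what happens if we do this?
java.util.Random r = new java.util.Random();
// numNeighbours: size of the community around "me" (assumed name)
double[] similarity = new double[numNeighbours];
for (int i = 0; i < similarity.length; i++) {
    // uniformly random similarity in [-1.0, 1.0)
    similarity[i] = (r.nextDouble() * 2.0) - 1.0;
}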
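The results table that follows has columns labelled R(0.5, 1.0), Constant(1.0), and R(-1.0, 1.0). Reading R(low, high) as a uniformly random weight in that range is an assumption, though it matches the snippet above for R(-1.0, 1.0). Under that reading, the three baselines can be sketched as follows. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Random;

class RandomSimilarity {
    // One uniformly random weight in [low, high) per neighbour.
    static double[] uniform(int neighbours, double low, double high, Random r) {
        double[] weights = new double[neighbours];
        for (int i = 0; i < neighbours; i++) {
            weights[i] = low + r.nextDouble() * (high - low);
        }
        return weights;
    }

    // The same weight for every neighbour.
    static double[] constant(int neighbours, double value) {
        double[] weights = new double[neighbours];
        Arrays.fill(weights, value);
        return weights;
    }

    public static void main(String[] args) {
        Random r = new Random(42); // fixed seed so a run can be repeated
        double[] positive = uniform(10, 0.5, 1.0, r);
        double[] full = uniform(10, -1.0, 1.0, r);
        double[] flat = constant(10, 1.0);
        System.out.println(positive.length + full.length + flat.length);
    }
}
```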
Neighborhood  Co-Rated  Somers’ d  PCC     wPCC    R(0.5, 1.0)  Constant(1.0)  R(-1.0, 1.0)
1             0.9449    0.9492     1.1150  0.9596  1.0665       1.0406         1.0341
10            0.8498    0.8355     1.0455  0.8277  0.9595       0.9495         0.9689
30            0.7979    0.7931     0.9464  0.7847  0.8903       0.9108         0.8848
50            0.7852    0.7817     0.9007  0.7733  0.8584       0.8922         0.8498
100           0.7759    0.7728     0.8136  0.7647  0.8222       0.8511         0.8153
153           0.7726    0.7727     0.7817  0.7638  0.8053       0.8243         0.8024
229           0.7717    0.7771     0.7716  0.7679  0.7919       0.7992         0.8058
459           0.7718    0.7992     0.8073  0.8025  0.7773       0.7769         0.7811
accuracy:
$$\mathrm{MAE} = \frac{\sum |p_{a,i} - r_{a,i}|}{N}$$
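Mean absolute error averages the absolute difference between each predicted rating and the rating the user actually gave. A minimal sketch, assuming predictions and actual ratings are held in parallel arrays (the class name `Mae` is hypothetical):

```java
class Mae {
    // predicted[i] and actual[i] refer to the same user-item pair.
    static double meanAbsoluteError(double[] predicted, double[] actual) {
        double total = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            total += Math.abs(predicted[i] - actual[i]);
        }
        return total / predicted.length;
    }

    public static void main(String[] args) {
        double[] predicted = {3, 4, 5};
        double[] actual = {1, 4, 4};
        System.out.println(meanAbsoluteError(predicted, actual)); // (2 + 0 + 1) / 3
    }
}
```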
…cross-validation results in paper
movielens u1 subset…
coverage:
$$\mathrm{Coverage} = \frac{\#\,\text{uncovered predictions}}{\#\,\text{predictions}}$$
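Read this way, the coverage figure is the share of requested predictions that could not be made, so lower is better. A minimal sketch, assuming an uncovered prediction is represented as `Double.NaN` (that encoding and the class name `Coverage` are hypothetical):

```java
class Coverage {
    // Proportion of requested predictions that could not be produced.
    // An uncovered prediction is marked with Double.NaN (assumed encoding).
    static double uncoveredProportion(double[] predictions) {
        if (predictions.length == 0) {
            return 0.0;
        }
        int uncovered = 0;
        for (double p : predictions) {
            if (Double.isNaN(p)) {
                uncovered++;
            }
        }
        return (double) uncovered / predictions.length;
    }

    public static void main(String[] args) {
        double[] predictions = {3.5, Double.NaN, 4.0, Double.NaN};
        System.out.println(uncoveredProportion(predictions)); // 2 of 4 uncovered
    }
}
```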
…cross-validation results in paper
movielens u1 subset…
Neighborhood  Co-Rated  Somers’ d  PCC      wPCC     Oracle
1             0.67795   0.57165    0.96725  0.61375  0.00495
10            0.15455   0.0999     0.80515  0.1114   0.00495
30            0.0512    0.0407     0.57225  0.04135  0.00495
50            0.03065   0.0266     0.3641   0.0251   0.00495
100           0.01515   0.01645    0.08345  0.01485  0.00495
153           0.00945   0.0122     0.0273   0.01135  0.00495
229           0.00715   0.00965    0.01165  0.00915  0.00495
459           0.00495   0.0054     0.00495  0.00495  0.00495
(best coverage when all of community used)
why do we get these results?
a) our error measures are not good enough?
$$\mathrm{MAE} = \frac{\sum |p_{a,i} - r_{a,i}|}{N}$$
$$\mathrm{Coverage} = \frac{\#\,\text{uncovered predictions}}{\#\,\text{predictions}}$$
J. Herlocker, J. Konstan, L. Terveen, and J. Riedl. Evaluating collaborative filtering recommender systems. In ACM Transactions on Information Systems, volume 22, pages 5–53. ACM Press, 2004.
S.M. McNee, J. Riedl, and J.A. Konstan. Being accurate is not enough: How accuracy metrics have hurt recommender systems. In Extended Abstracts of the 2006 ACM Conference on Human Factors in Computing Systems. ACM Press, 2006.
$$\mathrm{RMSE} = \sqrt{\frac{\sum (p_{a,i} - r_{a,i})^2}{N}}$$
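Root mean squared error squares each prediction error before averaging, so a few large misses weigh more than many small ones. A minimal sketch, assuming parallel arrays of predicted and actual ratings (the class name `Rmse` is hypothetical):

```java
class Rmse {
    // predicted[i] and actual[i] refer to the same user-item pair.
    static double rootMeanSquaredError(double[] predicted, double[] actual) {
        double total = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double error = predicted[i] - actual[i];
            total += error * error;
        }
        return Math.sqrt(total / predicted.length);
    }

    public static void main(String[] args) {
        double[] predicted = {3, 4};
        double[] actual = {1, 4};
        System.out.println(rootMeanSquaredError(predicted, actual)); // sqrt((4 + 0) / 2)
    }
}
```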
b) is there something wrong with the dataset?
c) is user-similarity not strong enough to capture the best recommender relationships in the graph?
one proposal…
N. Lathia, S. Hailes, L. Capra. Trust-Based Collaborative Filtering. To appear In IFIPTM 2008: Joint iTrust and PST Conferences on Privacy, Trust management and Security. Trondheim, Norway. June 2008.
is modelling filtering as a trust-management problem a potential solution?
once we do that, more questions arise…
what other graph properties emerge from kNN collaborative filtering?
how does the graph evolve over time?
current work
N. Lathia, S. Hailes, L. Capra. Evolving Communities of Recommenders: A Temporal Evaluation. Research Note RN/08/01, Department of Computer Science, University College London. Under Submission.
N. Lathia, S. Hailes, L. Capra. kNN User Filtering: A Temporal Implicit Social Network. Current Work.
read more: http://mobblog.cs.ucl.ac.uk (trust, recommendations, …)
questions?