![Page 1: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/1.jpg)
Online Social Networks: Navigation, Search, Recommendation
1
Many slides adapted from Lada Adamic (Michigan)
![Page 2: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/2.jpg)
Today's Plan
Final project details: recap and tips
Searching a social network
Real systems: node recommendation
2
![Page 3: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/3.jpg)
Search in structured networks
3
![Page 4: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/4.jpg)
Mary
Bob
Jane
Who could introduce me to Richard Gere?
How do we search?
Friends collage – luc, Flickr; http://creativecommons.org/licenses/by/2.0/deed.en
Richard Gere – spaceodissey, Flickr; http://creativecommons.org/licenses/by/2.0/deed.en
4
![Page 5: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/5.jpg)
[Figure: power-law graph, with nodes labeled by the number of nodes found]
5
![Page 6: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/6.jpg)
How would you search for a node here?
6
![Page 7: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/7.jpg)
What about here?
7
![Page 8: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/8.jpg)
gnutella network fragment
8
![Page 9: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/9.jpg)
Gnutella network
[Figure: cumulative nodes found at step (0 to 1) vs. step (0 to 100), for high degree seeking with 1st neighbors and high degree seeking with 2nd neighbors]
50% of the files in a 700 node network can be found in < 8 steps
9
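A high degree seeking walk of the kind plotted on this slide can be sketched as follows. This is only an illustration under stated assumptions: the adjacency-dict representation, the function name `high_degree_search`, the rule of preferring unvisited neighbors, and the toy graph are all hypothetical and not taken from the slides.

```python
import random

def high_degree_search(adj, start, max_steps, rng=None):
    """Walk the graph, always moving to the highest-degree unvisited neighbor.

    Returns the cumulative number of distinct nodes "found" after each step,
    where a node counts as found once it is the current node or one of its
    first neighbors (the "1st neighbors" variant).
    """
    rng = rng or random.Random(0)
    current = start
    visited = {start}
    found = {start} | set(adj[start])
    history = [len(found)]
    for _ in range(max_steps):
        unvisited = [v for v in adj[current] if v not in visited]
        if unvisited:
            # Greedy step: prefer the best-connected neighbor not yet visited.
            nxt = max(unvisited, key=lambda v: len(adj[v]))
        elif adj[current]:
            # Assumption: if every neighbor was already visited, step randomly.
            nxt = rng.choice(sorted(adj[current]))
        else:
            break
        current = nxt
        visited.add(current)
        found.add(current)
        found |= set(adj[current])
        history.append(len(found))
    return history

# Hypothetical toy graph: hub 0 links to 1..5, and node 5 links to 6 and 7.
toy = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0},
       5: {0, 6, 7}, 6: {5}, 7: {5}}
history = high_degree_search(toy, start=1, max_steps=3)
```

Starting from a leaf, the walk reaches the hub in one step and sees most of the toy graph immediately, which is the intuition behind the steep early rise of the curves.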
![Page 10: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/10.jpg)
And here?
10
![Page 11: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/11.jpg)
here?
11
![Page 12: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/12.jpg)
here?
Source: http://maps.google.com
![Page 13: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/13.jpg)
here?
Source: http://maps.google.com
![Page 14: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/14.jpg)
here?
Source: http://maps.google.com
![Page 15: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/15.jpg)
Small world experiments review
Milgram (1960’s), Dodds, Muhamad, Watts (2003)
Given a target individual and a particular property, pass the message to a person you correspond with who is “closest” to the target.
Short chain lengths – six degrees of separation
Typical strategy – if far from the target, choose someone geographically closer; if geographically close to the target, choose someone professionally closer
[Figure: map with NE and MA marked]
Source: undetermined
Source: NASA, U.S. Government; http://visibleearth.nasa.gov/view_rec.php?id=2429
15
![Page 16: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/16.jpg)
Is this the whole picture?
Why are small worlds navigable?
Source: Watts, D.J., Strogatz, S.H. (1998) Collective dynamics of 'small-world' networks. Nature 393:440-442.
![Page 17: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/17.jpg)
How are people able to find short paths?
How to choose among hundreds of acquaintances?
Strategy:
Simple greedy algorithm - each participant chooses the correspondent who is closest to the target with respect to the given property
Models:
geography: Kleinberg (2000)
hierarchical groups: Watts, Dodds, Newman (2001), Kleinberg (2001)
high degree nodes: Adamic, Puniyani, Lukose, Huberman (2001), Newman (2003)
17
![Page 18: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/18.jpg)
Reverse small world experiment
• Killworth & Bernard (1978):
• Given hypothetical targets (name, occupation, location, hobbies, religion…) participants choose an acquaintance for each target
• Acquaintance chosen based on
• (most often) occupation, geography
• only 7% because they “know a lot of people”
• Simple greedy algorithm: most similar acquaintance
• two-step strategy rare
Source: Peter D. Killworth and H. Russell Bernard (1978). The Reverse Small World Experiment. Social Networks 1:159–92.
![Page 19: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/19.jpg)
How many hops actually separate any two individuals in the world?
• Participants are not perfect in routing messages
• They use only local information
• “The accuracy of small world chains in social networks”, Peter D. Killworth, Chris McCarty, H. Russell Bernard & Mark House:
– Analyze 10920 shortest path connections between 105 members of an interviewing bureau, together with the equivalent conceptual, or ‘small world’, routes, which use individuals’ selections of intermediaries.
– This permits the first study of the impact of accuracy within small world chains.
– The mean small world path length (3.23) is 40% longer than the mean of the actual shortest paths (2.30)
– The model suggests that people make a less than optimal small world choice more than half the time.
19
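A quick arithmetic check of the numbers on this slide, assuming the 10920 connections are ordered pairs of distinct members (that reading is an assumption, not stated on the slide):

$$
105 \times 104 = 10920, \qquad \frac{3.23}{2.30} \approx 1.40
$$

So the count matches every ordered pair among the 105 members, and the ratio of the two mean path lengths gives the quoted 40%.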
![Page 20: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/20.jpg)
Spatial search
Kleinberg, “The Small World Phenomenon, An Algorithmic Perspective”, Proc. 32nd ACM Symposium on Theory of Computing, 2000. (Nature 2000)
nodes are placed on a lattice and connect to nearest neighbors
additional links placed with $p_{uv} \sim d_{uv}^{-r}$
“The geographic movement of the [message] from Nebraska to Massachusetts is striking. There is a progressive closing in on the target area as each new person is added to the chain”
S. Milgram, “The small world problem”, Psychology Today 1, 61, 1967
20
![Page 21: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/21.jpg)
no locality: $p \sim p_0$
When r=0, links are randomly distributed, average shortest path (ASP) ~ log(n), n size of grid
When r=0, the expected time of any decentralized algorithm is at least $a_0 n^{2/3}$
When r<2, the expected time is at least $a_r n^{(2-r)/3}$
21
![Page 22: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/22.jpg)
Overly localized links on a lattice: $p \sim 1/d^4$
When r>2, expected search time ~ $N^{(r-2)/(r-1)}$
22
![Page 23: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/23.jpg)
Links balanced between long and short range: $p \sim 1/d^2$
When r=2, the expected time of a decentralized algorithm (DA) is at most $C (\log N)^2$
23
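The lattice model and greedy routing from the last few slides can be tried out with a small simulation. This is a minimal sketch, assuming an n x n grid with Manhattan distance, four lattice neighbors, and a single long-range contact per node drawn with probability proportional to d^(-r). The function names and the choice to sample each long-range contact lazily are assumptions for illustration. Lazy sampling is harmless here because greedy routing never revisits a node.

```python
import random

def manhattan(a, b):
    """Lattice (Manhattan) distance between two grid points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def long_range_contact(u, n, r, rng):
    """Sample one contact v != u with probability proportional to d(u, v)**(-r)."""
    nodes = [(x, y) for x in range(n) for y in range(n) if (x, y) != u]
    weights = [manhattan(u, v) ** (-r) for v in nodes]
    return rng.choices(nodes, weights=weights, k=1)[0]

def greedy_route(n, r, source, target, rng):
    """Number of hops greedy routing needs to get from source to target."""
    current, hops = source, 0
    while current != target:
        x, y = current
        # Local contacts: the (up to four) lattice neighbors inside the grid.
        candidates = [(x + dx, y + dy)
                      for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
                      if 0 <= x + dx < n and 0 <= y + dy < n]
        # One long-range contact, sampled when the node is first reached.
        candidates.append(long_range_contact(current, n, r, rng))
        # Greedy rule: forward to the contact closest to the target.
        current = min(candidates, key=lambda v: manhattan(v, target))
        hops += 1
    return hops

rng = random.Random(1)
hops = greedy_route(20, 2.0, (0, 0), (19, 19), rng)
```

Because some lattice neighbor always reduces the distance by one, the hop count can never exceed the starting Manhattan distance. Averaging `greedy_route` over many source and target pairs for several values of r is one way to explore the dependence on r described above.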
![Page 24: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/24.jpg)
demo
• how does the probability of long-range links affect search?
http://www.ladamic.com/netlearn/NetLogo4/SmallWorldSearch.html
24
![Page 25: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/25.jpg)
Testing search models on social networks
advantage: have access to entire communication network and to individuals’ attributes
Use a well defined network:
HP Labs email correspondence over 3.5 months
Edges are between individuals who sent at least 6 email messages each way
450 users
median degree = 10, mean degree = 13
average shortest path = 3
Node properties specified:
degree
geographical location
position in organizational hierarchy
Can greedy strategies work?
25
![Page 26: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/26.jpg)
the network otherwise known as sample.gdf
26
![Page 27: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/27.jpg)
Strategy 1: High degree search
Power-law degree distribution of all senders of email passing through HP labs
[Figure: log-log plot of the outdegree distribution with an a = 2.0 fit; x-axis: outdegree (number of recipients sender has sent email to), 10^0 to 10^4; y-axis: frequency (proportion of senders), 10^-8 to 10^0]
27
![Page 28: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/28.jpg)
Filtered network (at least 6 messages sent each way)
[Figure: p(k) vs. number of email correspondents k (0 to 80), shown on linear and semi-log axes]
Degree distribution no longer power-law, but Poisson
It would take 40 steps on average (median of 16) to reach a target!
![Page 29: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/29.jpg)
Strategy 2: Geography
29
![Page 30: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/30.jpg)
Communication across corporate geography
[Figure: email links among locations labeled 1L, 1U, 2L, 2U, 3L, 3U, 4U]
87% of the 4000 links are between individuals on the same floor
source: Adamic and Adar, How to search a social network, Social Networks, 27(3), p.187-203, 2005.
![Page 31: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/31.jpg)
Cubicle distance vs. probability of being linked
[Figure: log-log plot of proportion of linked pairs (10^-3 to 10^0) vs. distance in feet (10^2 to 10^3); curves: measured, 1/r, and 1/r^2 (optimum for search)]
source: Adamic and Adar, How to search a social network, Social Networks, 27(3), p.187-203, 2005.
![Page 32: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/32.jpg)
LiveJournal
• LiveJournal provides an API to crawl the friendship network + profiles
– friendly to researchers
– great research opportunity
• basic statistics
– Users (stats from April 2006)
• How many users, and how many of those are active?
– Total accounts: 9980558
– ... active in some way: 1979716
– ... that have ever updated: 6755023
– ... updating in last 30 days: 1300312
– ... updating in last 7 days: 751301
– ... updating in past 24 hours: 216581
32
![Page 33: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/33.jpg)
Predominantly female & young demographic
• Male: 1370813 (32.4%)
• Female: 2856360 (67.6%)
• Unspecified: 1575389
Age distribution
Age  Count
13  18483
14  87505
15  211445
16  343922
17  400947
18  414601
19  405472
20  371789
21  303076
22  239255
23  194379
24  152569
25  127121
26  98900
27  73392
28  59188
29  48666
33
![Page 34: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/34.jpg)
Geographic Routing in Social Networks
• David Liben-Nowell, Jasmine Novak, Ravi Kumar, Prabhakar Raghavan, and Andrew Tomkins (PNAS 2005)
• data used
– Feb. 2004
– 500,000 LiveJournal users with US locations
– giant component (77.6%) of the network
– clustering coefficient: 0.2
34
![Page 35: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/35.jpg)
Degree distributions
• The broad degree distributions we’ve learned to know and love
– but more probably lognormal than power law
the indegree distribution is broader than the outdegree distribution
Source: http://www.tomkinshome.com/andrew/papers/science-blogs/pnas.pdf
![Page 36: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/36.jpg)
Results of a simple greedy geographical algorithm
• Choose source s and target t randomly
• Try to reach target’s city – not target itself
• At each step, the message is forwarded from the current message holder u to the friend v of u geographically closest to t
• Variant 1: stop if d(v,t) > d(u,t)
– 13% of the chains are completed
• Variant 2: if d(v,t) > d(u,t), pick a neighbor at random in the same city if possible, else stop
– 80% of the chains are completed
Source: http://www.tomkinshome.com/andrew/papers/science-blogs/pnas.pdf
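The two stopping rules on this slide can be sketched as a single routine with a switch. This is an illustration only: the data layout (dicts for friends, coordinates, and city labels), the function name `geo_greedy`, the Euclidean distance, and the toy data are assumptions. Treating a tie in distance as "no progress" is also an assumption, made so the walk cannot loop on equally distant friends.

```python
import math
import random

def dist(p, q):
    """Euclidean distance between two (x, y) points (an assumed metric)."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def geo_greedy(friends, location, city, s, t, fallback, rng, max_hops=1000):
    """Return True if a chain started at s reaches the city of target t.

    fallback=False: stop as soon as no friend is closer to t.
    fallback=True:  in that case hand the message to a random friend in the
                    current holder's city if one exists, else stop.
    """
    u = s
    for _ in range(max_hops):
        if city[u] == city[t]:
            return True  # the goal is the target's city, not the target itself
        v = min(friends[u], key=lambda w: dist(location[w], location[t]),
                default=None)
        if v is not None and dist(location[v], location[t]) < dist(location[u], location[t]):
            u = v
            continue
        if not fallback:
            return False
        same_city = [w for w in friends[u] if city[w] == city[u]]
        if not same_city:
            return False
        u = rng.choice(same_city)
    return False

# Hypothetical toy data: a's only friend b lives farther from t than a does,
# but b knows c, who lives in t's city.
friends = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "t": []}
location = {"a": (0, 0), "b": (-1, 0), "c": (4, 0), "t": (5, 0)}
city = {"a": "X", "b": "X", "c": "Y", "t": "Y"}
strict = geo_greedy(friends, location, city, "a", "t", False, random.Random(0))
relaxed = geo_greedy(friends, location, city, "a", "t", True, random.Random(0))
```

In the toy data the strict variant gives up at the first holder, while the same-city fallback lets the chain escape the dead end. That is the mechanism the slide credits for the jump in completed chains.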
![Page 37: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/37.jpg)
the geographic basis of friendship
• d = d(u,v) the distance between pairs of people
• The probability that two people are friends given their distance is equal to
– $P(d) = \epsilon + f(d)$, where $\epsilon$ is a constant independent of geography
– $\epsilon$ is $5.0 \times 10^{-6}$ for LiveJournal users who are very far apart
Source: http://www.tomkinshome.com/andrew/papers/science-blogs/pnas.pdf 37
![Page 38: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/38.jpg)
the geographic basis of friendship
• The average user will have ~ 2.5 non-geographic friends
• The other friends (5.5 on average) are distributed according to an approximate 1/distance relationship
• But 1/d was proved not to be navigable by Kleinberg, so what gives?
Source: http://www.tomkinshome.com/andrew/papers/science-blogs/pnas.pdf 38
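The figure of roughly 2.5 non-geographic friends is consistent with the distance-independent constant (written $\epsilon$ here, value $5.0 \times 10^{-6}$) multiplied by the population size. This assumes the population $N$ is the roughly 500,000 users from the data slide:

$$
\epsilon \cdot N \approx (5.0 \times 10^{-6}) \times 500{,}000 = 2.5, \qquad 2.5 + 5.5 = 8 \text{ friends on average}
$$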
![Page 39: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/39.jpg)
Navigability in networks of variable geographical density
• Kleinberg assumed a uniformly populated 2D lattice
• But population is far from uniform
• population networks and rank-based friendship
– probability of knowing a person depends not on absolute distance but on relative distance (i.e. how many people live closer): $\Pr[u \to v] \sim 1/\mathrm{rank}_u(v)$
Source: http://www.tomkinshome.com/andrew/papers/science-blogs/pnas.pdf 39
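Rank-based friendship can be made concrete with a few lines of code. This is a minimal sketch, assuming points in the plane, Euclidean distance, and ranks assigned by sorted distance (ties broken by sort order). The function names and toy positions are hypothetical.

```python
import math
import random

def rank_probabilities(u, positions):
    """Pr[u -> v] proportional to 1 / rank_u(v), where rank 1 is u's nearest node."""
    others = sorted((v for v in positions if v != u),
                    key=lambda v: math.dist(positions[u], positions[v]))
    weights = [1.0 / rank for rank in range(1, len(others) + 1)]
    total = sum(weights)
    return {v: w / total for v, w in zip(others, weights)}

def sample_friend(u, positions, rng):
    """Draw one friend of u according to the rank-based probabilities."""
    probs = rank_probabilities(u, positions)
    nodes = list(probs)
    return rng.choices(nodes, weights=[probs[v] for v in nodes], k=1)[0]

# Hypothetical toy layout: c is far away, but only its rank matters.
near = {"u": (0, 0), "a": (1, 0), "b": (2, 0), "c": (10, 0)}
far = {"u": (0, 0), "a": (1, 0), "b": (2, 0), "c": (1000, 0)}
probs_near = rank_probabilities("u", near)
probs_far = rank_probabilities("u", far)
friend = sample_friend("u", near, random.Random(0))
```

Moving c from distance 10 to distance 1000 leaves every probability unchanged, because only the number of closer people enters the formula. This is how the model adapts to non-uniform population density.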
![Page 40: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/40.jpg)
what if we don’t have geography?
40
![Page 41: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/41.jpg)
does community structure help?
41
![Page 42: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/42.jpg)
Hierarchical small world models
Kleinberg, “Small-World Phenomena and the Dynamics of Information”, NIPS 14, 2001
Individuals classified into a hierarchy,
$h_{ij}$ = height of the least common ancestor.
$p_{ij} \sim b^{-a h_{ij}}$
[Figure: tree of height h with branching ratio b=3]
e.g. state-county-city-neighborhood
industry-corporation-division-group
Theorem: If a = 1 and outdegree is polylogarithmic, can achieve s ~ O(log n)
Group structure models:
Individuals belong to nested groups
q = size of smallest group that v,w belong to
$f(q) \sim q^{-a}$
Theorem: If a = 1 and outdegree is polylogarithmic, can achieve s ~ O(log n)
42
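The hierarchical link probability can be computed directly for a complete b-ary tree. This is a sketch under assumptions: leaves are numbered 0 to b^depth - 1 from left to right, the weights b^(-a h) are normalized per node, and the function names are made up for illustration.

```python
def lca_height(i, j, b):
    """Height of the least common ancestor of leaves i and j in a b-ary tree."""
    h = 0
    while i != j:
        # Moving one level up maps a node index to its parent's index.
        i //= b
        j //= b
        h += 1
    return h

def link_probabilities(i, n_leaves, b, a):
    """Normalized p_ij proportional to b**(-a * h_ij) for all leaves j != i."""
    weights = {j: b ** (-a * lca_height(i, j, b))
               for j in range(n_leaves) if j != i}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}

# Toy example: b = 3 and two levels, so 9 leaves.
probs = link_probabilities(0, 9, 3, 1)
```

With a = 1 the two siblings of leaf 0 each get probability 1/4 and the six more distant leaves each get 1/12, so both levels of the hierarchy receive the same total weight. That equal spread across scales is the hierarchical counterpart of the balanced r = 2 lattice case.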
![Page 43: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/43.jpg)
Why search is fast in hierarchical topologies
[Figure: hierarchy containing source S and target T, with groups R and R’]
$\lambda^2 |R| < |R'| < \lambda |R|$
$k = c \log^2 n$: calculate the probability that s fails to have a link in R’
43
![Page 44: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/44.jpg)
hierarchical models with multiple hierarchies
individuals belong to hierarchically nested groups
multiple independent hierarchies h=1,2,..,H coexist corresponding to occupation, geography, hobbies, religion…
$p_{ij} \sim \exp(-a x)$
Source: Identity and Search in Social Networks: Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman;
Science 17 May 2002 296: 1302-1305. < http://arxiv.org/abs/cond-mat/0205383v1 >
44
![Page 45: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/45.jpg)
Source: Identity and Search in Social Networks: Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman;
Science 17 May 2002 296: 1302-1305. < http://arxiv.org/abs/cond-mat/0205383v1 >
45
![Page 46: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/46.jpg)
Identity and search in social networks
Watts, Dodds, Newman (2001)
Message chains fail at each node with probability p
Network is ‘searchable’ if a fraction r of messages reach the target:
$q = \langle (1-p)^{L} \rangle \geq r$
[Figure legend: N=102400, N=204800, N=409600]
Source: Identity and Search in Social Networks: Duncan J. Watts, Peter Sheridan Dodds, and M. E. J. Newman;
Science 17 May 2002 296: 1302-1305. < http://arxiv.org/abs/cond-mat/0205383v1 >
46
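The searchability condition can be turned into a bound on chain length. If every chain had the same length L (a simplifying assumption), then (1 - p)^L >= r rearranges to L <= ln r / ln(1 - p). The sketch below uses hypothetical function names, with p = 0.25 and r = 0.05 taken from the attrition figures quoted on the results slide later in the deck.

```python
import math

def completion_probability(lengths, p):
    """q = average of (1 - p)**L over the observed chain lengths L."""
    return sum((1 - p) ** L for L in lengths) / len(lengths)

def max_searchable_length(p, r):
    """Largest common chain length L for which (1 - p)**L >= r still holds."""
    return math.log(r) / math.log(1 - p)

bound = max_searchable_length(0.25, 0.05)
q_short = completion_probability([4, 5, 6], 0.25)
```

With 25% attrition per step and a 5% completion threshold, the bound comes out at a little over 10 steps, so chains much longer than that would make a network unsearchable in this sense.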
![Page 47: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/47.jpg)
Small World Model, Watts et al.
Fits Milgram’s data well
Model parameters:
$N = 10^8$
z = 300
g = 100
b = 10
a = 1, H = 2
$L_{model}$ = 6.7
$L_{data}$ = 6.5
more slides on this:
http://www.aladdin.cs.cmu.edu/workshops/wsa/papers/dodds-2004-04-10search.pdf
47
![Page 48: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/48.jpg)
does it work in practice? back to HP Labs: Organizational hierarchy
48
![Page 49: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/49.jpg)
Email correspondence superimposed on the organizational hierarchy
source: Adamic and Adar, How to search a social network, Social Networks, 27(3), p.187-203, 2005.
49
![Page 50: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/50.jpg)
Example of search path
distance 1
distance 1
distance 2
hierarchical distance = 5
search path distance = 4
distance 1
50
![Page 51: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/51.jpg)
Probability of linking vs. distance in hierarchy
in the ‘searchable’ regime: 0 < a < 2 (Watts, Dodds, Newman 2001)
[Figure: probability of linking (0 to 0.6) vs. hierarchical distance h (2 to 10); observed data and fit exp(-0.92*h)]
51
![Page 52: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/52.jpg)
Results
[Figure: number of pairs (up to 5 x 10^4) vs. number of steps in search (0 to 25)]
distance   hierarchy   geography   geodesic   org   random
median     4           7           3          6     28
mean       5.7 (4.7)   12          3.1        6.1   57.4
[Figure: number of pairs (0 to 16000) vs. number of steps (0 to 20), for hierarchy and geography]
source: Adamic and Adar, How to search a social network, Social Networks, 27(3), p.187-203, 2005.
52
![Page 53: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/53.jpg)
Expt 2: Searching a social networking website
Source: ClubNexus - Orkut Buyukkokten, Tyler Ziemann
![Page 54: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/54.jpg)
Source: ClubNexus - Orkut Buyukkokten, Tyler Ziemann
![Page 55: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/55.jpg)
Profiles:
status (UG or G)
year
major or department
residence
gender
Personality (choose 3 exactly):
you: funny, kind, weird, …
friendship: honesty/trust, common interests, commitment, …
romance: same as friendship
freetime: socializing, getting outside, reading, …
support: unconditional accepters, comic-relief givers, eternal optimists
Interests (choose as many as apply):
books: mystery & thriller, science fiction, romance, …
movies: western, biography, horror, …
music: folk, jazz, techno, …
social activities: ballroom dancing, barbecuing, bar-hopping, …
land sports: soccer, tennis, golf, …
water sports: sailing, kayaking, swimming, …
other sports: sky diving, weightlifting, billiards, …
55
![Page 56: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/56.jpg)
Differences between data sets
HP labs email network:
• complete image of communication network
• affinity not reflected
Online community:
• partial information of social network
• only friends listed
56
![Page 57: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/57.jpg)
Degree Distribution for Nexus Net
2469 users, average degree 8.2
[Figure: number of users with so many links vs. number of links, on linear axes (0 to 100 links, 0 to 250 users) and on log-log axes (10^0 to 10^2)]
source: Adamic and Adar, How to search a social network, Social Networks, 27(3), p.187-203, 2005.
![Page 58: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/58.jpg)
Problem: how to construct hierarchies?
Probability of linking by separation in years
[Figure: prob. two undergrads are friends (0 to 0.014) vs. separation in years (0 to 3); data and $(x+1)^{-1.1}$ fit]
[Figure: prob. two grads are friends (0 to 0.02) vs. separation in years (0 to 5); data and $(x+1)^{-1.7}$ fit]
source: Adamic and Adar, How to search a social network, Social Networks, 27(3), p.187-203, 2005.
![Page 59: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/59.jpg)
Hierarchies not useful for other attributes:
Geography
[Figure: probability of being friends (0 to 0.06) vs. distance between residences (0 to 600)]
Other attributes: major, sports, freetime activities, movie preferences…
source: Adamic and Adar, How to search a social network, Social Networks, 27(3), p.187-203, 2005.
59
![Page 60: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/60.jpg)
Strategy using user profiles
prob. two undergrads are friends (consider simultaneously)
• both undergraduate, both graduate, or one of each
• same or different year
• both male, both female, or one of each
• same or different residences
• same or different major/department
Results

| strategy | median | mean |
|---|---|---|
| random | 133 | 390 |
| high degree | 39 | 137 |
| profile | 21 | 53 |

With an attrition rate of 25%, 5% of the messages get through at an average of 4.8 steps
=> hence the network is barely searchable
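One way to see why attrition hurts long search paths: if every hop independently drops the message with probability 0.25, a chain of n hops completes with probability 0.75^n. The sketch below assumes that independence (the slide does not state the attrition model), and the function name is illustrative. The step counts 21 and 53 are the median and mean for the profile strategy in the table.

```python
def survival_probability(steps: int, attrition: float = 0.25) -> float:
    """Probability that a message survives `steps` hops when each hop
    independently drops it with probability `attrition` (an assumption)."""
    if steps < 0 or not 0.0 <= attrition <= 1.0:
        raise ValueError("steps must be >= 0 and attrition in [0, 1]")
    return (1.0 - attrition) ** steps

# Short chains often complete; chains as long as the median (21) or
# mean (53) path of the profile strategy almost never do.
for steps in (5, 21, 53):
    print(steps, survival_probability(steps))
```

Under this model only the shortest chains contribute to the messages that get through, which fits a low completion rate at a small average number of steps.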
![Page 61: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/61.jpg)
Summary
Individuals associate on different levels into groups.
Group structure facilitates decentralized search using social ties.
Hierarchy search faster than geographical search
A fraction of 'important' individuals are easily findable
Humans may be more resourceful in executing search tasks:
• making use of weak ties
• using more sophisticated strategies
![Page 62: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/62.jpg)
Link Recommendation on Social Networks
• Basics of recommender systems
• Friends on Facebook
• Connections on LinkedIn
• WTF ("who to follow") on Twitter (to be continued)
![Page 63: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/63.jpg)
Recommender Systems
• Systems that take user preferences about items as input and output recommendations
• Early examples
• Bellcore Music Recommender (1995)
• MIT Media Lab: Firefly (1996)
Best example: Amazon.com
Worst example: Amazon.com
Also:
Netflix
eBay
Google Reader
iTunes Genius
digg.com
Hulu.com
![Page 64: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/64.jpg)
Recommender Systems
• Basic idea
– recommend item i to user u for the purpose of
• Exposing them to something they would not have otherwise seen
• Leading customers to the Long Tail
• Increasing customers’ satisfaction
• Data for recommender systems (need to know who likes what)
– Purchases/rentals
– Ratings
– Web page views
– Which do you think is best?
![Page 65: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/65.jpg)
Recommender Systems
• Two types of data:
• Explicit data: user provides information about their preferences
– Pro: high quality ratings
– Con: Hard to get: people cannot be bothered
• Implicit data: infer whether or not user likes product based on behavior
– Pro: Much more data available, less invasive
– Con: Inference often wrong (does purchase imply preference?)
• In either case, data is just a big matrix
– Users x items
– Entries binary or real-valued
• Biggest Problem:
– Sparsity: most users have not rated most products.
[Figure: example ratings matrix with most cells empty. Known ratings in its six rows: 1 3 5 5 4 / 5 4 4 2 1 3 / 2 4 1 2 3 4 3 5 / 2 4 5 4 2 / 4 3 4 2 2 5 / 1 3 3 2 4.]
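Because the users x items matrix is mostly empty, it is usually stored as a map of observed entries rather than a dense grid. A minimal sketch of that idea; the dictionary layout, the names, and the toy data are assumptions for illustration only.

```python
def density(ratings, n_users, n_items):
    """Fraction of the users x items matrix that is filled in.
    `ratings` maps (user, item) -> rating; absent keys are unknown ratings."""
    return len(ratings) / (n_users * n_items)

# Toy data: 2 users, 2 items, 3 observed ratings.
ratings = {("alice", "X"): 5, ("alice", "Y"): 4, ("bob", "X"): 5}

print(ratings.get(("bob", "Y")))   # None: Bob has not rated Y
print(density(ratings, 2, 2))      # 0.75
```

Real rating data sets are far sparser than this toy example, which is why the missing entries, not the observed ones, dominate the modeling problem.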
![Page 66: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/66.jpg)
Recommender Systems: Models
Two camps on how to make recommendations:
– Collaborative Filtering (CF)
• Use collective intelligence from all available rating information to make predictions for individuals
• Depends on the fact that user tastes are correlated and commutative:
• If Alice and Bob both like X and Alice likes Y then Bob is more likely to like Y
– Content based
• Extracts “features” from items for a big regression or rule-based model
• See www.nanocrowd.com
– 15 years of research in the field
– Conventional wisdom:
• CF performs better when there is sufficient data
• Content-based is useful when there is little data
![Page 67: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/67.jpg)
Detour: the Netflix Prize
![Page 68: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/68.jpg)
Netflix
• A US-based DVD rental-by-mail company
• >10M customers, 100K titles, ships 1.9M DVDs per day
Good recommendations = happy customers
![Page 69: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/69.jpg)
Netflix Prize
• October, 2006:
• Offers $1,000,000 for an improved recommender algorithm
• Training data
• 100 million ratings
• 480,000 users
• 17,770 movies
• 6 years of data: 2000-2005
• Test data
• Last few ratings of each user (2.8 million)
• Evaluation via RMSE: root mean squared error
• Netflix Cinematch RMSE: 0.9514
• Competition
• $1 million grand prize for 10% improvement
• If 10% not met, $50,000 annual “Progress Prize” for best improvement

Training data (sample):

| user | movie | score | date |
|---|---|---|---|
| 1 | 21 | 1 | 2002-01-03 |
| 1 | 213 | 5 | 2002-04-04 |
| 2 | 345 | 4 | 2002-05-05 |
| 2 | 123 | 4 | 2002-05-05 |
| 2 | 768 | 3 | 2003-05-03 |
| 3 | 76 | 5 | 2003-10-10 |
| 4 | 45 | 4 | 2004-10-11 |
| 5 | 568 | 1 | 2004-10-11 |
| 5 | 342 | 2 | 2004-10-11 |
| 5 | 234 | 2 | 2004-12-12 |
| 6 | 76 | 5 | 2005-01-02 |
| 6 | 56 | 4 | 2005-01-31 |

Test data (sample):

| user | movie | score | date |
|---|---|---|---|
| 1 | 212 | ? | 2003-01-03 |
| 1 | 1123 | ? | 2002-05-04 |
| 2 | 25 | ? | 2002-07-05 |
| 2 | 8773 | ? | 2002-09-05 |
| 2 | 98 | ? | 2004-05-03 |
| 3 | 16 | ? | 2003-10-10 |
| 4 | 2450 | ? | 2004-10-11 |
| 5 | 2032 | ? | 2004-10-11 |
| 5 | 9098 | ? | 2004-10-11 |
| 5 | 11012 | ? | 2004-12-12 |
| 6 | 664 | ? | 2005-01-02 |
| 6 | 1526 | ? | 2005-01-31 |
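The evaluation metric on this slide, root mean squared error, and the relative improvement over a baseline can be sketched as follows. The function names and the four toy predictions are illustrative; only the 0.9514 baseline comes from the slide.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def improvement_pct(baseline_rmse, new_rmse):
    """Relative RMSE reduction over a baseline, in percent."""
    return 100.0 * (baseline_rmse - new_rmse) / baseline_rmse

# Hypothetical predictions for four held-out ratings.
print(rmse([3.5, 4.0, 2.0, 5.0], [4, 4, 1, 5]))

# RMSE that would be a 10% improvement over the slide's Cinematch score.
print(0.9 * 0.9514)
```

Because errors are squared before averaging, a few badly wrong predictions cost more than many slightly wrong ones.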
![Page 70: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/70.jpg)
Netflix Prize
• Competition design
• Hold-out set created by taking last 9 ratings for each user
– Non-random, biased set
• Hold-out set split randomly three ways:
– Probe Set – appended to training data to allow unbiased estimation of RMSE
– Submit ratings for the (Quiz+Test) Sets – Netflix returns RMSE on the Quiz Set only
– Quiz Set results posted on public leaderboard, but Test Set used to determine the winner!
» Prevents overfitting
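The competition design above can be sketched in a few lines: hold out each user's most recent ratings, then split the held-out pool at random into three parts. This is only an illustration. The equal-thirds split, the tuple layout, and the function name are assumptions; the slide only says the split is random and three-way.

```python
import random

def split_holdout(ratings, k=9, seed=0):
    """ratings: list of (user, movie, score, date) with ISO date strings.
    Holds out each user's last k ratings (k >= 1), then splits the held-out
    pool randomly into probe, quiz and test parts (equal thirds: assumption)."""
    by_user = {}
    for row in ratings:
        by_user.setdefault(row[0], []).append(row)

    train, holdout = [], []
    for rows in by_user.values():
        rows.sort(key=lambda r: r[3])   # oldest first; ISO dates sort as text
        train.extend(rows[:-k])
        holdout.extend(rows[-k:])

    rng = random.Random(seed)
    rng.shuffle(holdout)
    third = len(holdout) // 3
    probe = holdout[:third]             # released with labels alongside training data
    quiz = holdout[third:2 * third]     # scored on the public leaderboard
    test = holdout[2 * third:]          # scored privately to pick the winner
    return train, probe, quiz, test
```

Since the held-out ratings are each user's latest ones, they are not a random sample of all ratings, which is the "non-random, biased set" point on the slide.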
![Page 71: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/71.jpg)
Data Characteristics
[Figure: Mean Score vs. Date of Rating; mean score (3.2 to 3.8) by date (2000 to 2006).]
[Figure: percentage of ratings at each rating value (1 to 5), for Training (m = 3.60) and Probe (m = 3.67).]
![Page 72: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/72.jpg)
Ratings per movie/user

| User ID | # Ratings | Mean Rating |
|---|---|---|
| 305344 | 17,651 | 1.90 |
| 387418 | 17,432 | 1.81 |
| 2439493 | 16,560 | 1.22 |
| 1664010 | 15,811 | 4.26 |
| 2118461 | 14,829 | 4.08 |
| 1461435 | 9,820 | 1.37 |

Avg #ratings/user: 208
Avg #ratings/movie: 5627
![Page 73: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/73.jpg)
Data Characteristics
• Most Loved Movies

| Most Loved Movies | Avg rating | Count |
|---|---|---|
| The Shawshank Redemption | 4.593 | 137812 |
| Lord of the Rings: The Return of the King | 4.545 | 133597 |
| The Green Mile | 4.306 | 180883 |
| Lord of the Rings: The Two Towers | 4.460 | 150676 |
| Finding Nemo | 4.415 | 139050 |
| Raiders of the Lost Ark | 4.504 | 117456 |

Most Rated Movies
• Miss Congeniality
• Independence Day
• The Patriot
• The Day After Tomorrow
• Pretty Woman
• Pirates of the Caribbean
Highest Variance
• The Royal Tenenbaums
• Lost In Translation
• Pearl Harbor
• Miss Congeniality
• Napoleon Dynamite
• Fahrenheit 9/11
![Page 74: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/74.jpg)
8pm6am10/18pm
• ARRRRGH! We have one more chance….
![Page 75: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/75.jpg)
![Page 76: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/76.jpg)
Test Set Results
• BellKor’s Pragmatic Chaos: 0.8567
• The Ensemble: 0.8567
• Tie breaker was submission date/time
• They won by 20 minutes!
But really:
• BellKor’s Pragmatic Chaos: 0.856704
• The Ensemble: 0.856714
• Also, a combination of the BPC (10.06%) and Ensemble (10.06%) scores results in a 10.19% improvement!
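Blending can beat both of its inputs when the two models' errors are not perfectly correlated. The sketch below uses a plain 50/50 average of two hypothetical models on toy data; the slide does not say how the two teams' scores were combined, so the blending rule and all names here are assumptions.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists."""
    return math.sqrt(sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(actual))

def blend(pred_a, pred_b, weight_a=0.5):
    """Convex combination of two prediction lists."""
    return [weight_a * a + (1.0 - weight_a) * b for a, b in zip(pred_a, pred_b)]

actual = [4.0, 2.0, 5.0, 3.0]
model_a = [4.5, 2.5, 4.5, 3.0]   # hypothetical model A
model_b = [3.5, 2.0, 5.5, 3.5]   # hypothetical model B, errors partly opposite to A

print(rmse(model_a, actual), rmse(model_b, actual))
print(rmse(blend(model_a, model_b), actual))
```

In this toy case the two models are equally accurate on their own, yet their average is closer to the truth because some of their errors cancel.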
![Page 77: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/77.jpg)
BellKor Approach
• The prize winning solutions were an ensemble of many separate solution sets
• Progress Prize 2007: 103 sets
• Progress Prize 2008 (w/Big Chaos): 205 sets
• Grand Prize 2009 (w/ BC and Pragmatic Theory): > 800 sets!!
– Used two main classes of models
• Nearest Neighbors
• Latent Factor Models (via Singular Value Decomposition)
• Also regularized regression, not a big factor
• Teammates used neural nets and other methods
• Approaches mainly algorithmic, not statistical in nature
![Page 78: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/78.jpg)
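Data representation (excluding dates)
[Ratings matrix: 6 movies (rows) × 12 users (columns). Each cell is either an unknown rating or a rating between 1 and 5. Column positions were lost in extraction; the known ratings in each row, in user order, are listed below.]
movie 1: 1 3 5 5 4
movie 2: 5 4 4 2 1 3
movie 3: 2 4 1 2 3 4 3 5
movie 4: 2 4 5 4 2
movie 5: 4 3 4 2 2 5
movie 6: 1 3 3 2 4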
![Page 79: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/79.jpg)
Nearest Neighbors
![Page 80: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/80.jpg)
Nearest Neighbors
[Figure: the 6 movies × 12 users ratings matrix; each cell is either an unknown rating or a rating between 1 and 5.]
![Page 81: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/81.jpg)
Nearest Neighbors
[Figure: the same ratings matrix, with the cell for movie 1, user 5 marked "?".]
- estimate rating of movie 1 by user 5
![Page 82: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/82.jpg)
Nearest Neighbors
[Figure: the same ratings matrix, with the cell for movie 1, user 5 marked "?".]
Neighbor selection:
Identify movies similar to 1, rated by user 5
![Page 83: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/83.jpg)
Nearest Neighbors
[Figure: the same ratings matrix, with the cell for movie 1, user 5 marked "?".]
Compute similarity weights:
s13=0.2, s16=0.3
![Page 84: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/84.jpg)
Nearest Neighbors
[Figure: the same ratings matrix, with the cell for movie 1, user 5 filled in as 2.6.]
Predict by taking weighted average:
(0.2*2+0.3*3)/(0.2+0.3)=2.6
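The weighted average on this slide can be checked with a few lines of Python. A minimal sketch; the function name and the (similarity, rating) pair layout are illustrative choices, not from the slides.

```python
def predict_rating(neighbors):
    """Similarity-weighted average of a user's ratings of neighboring items.
    `neighbors` is a list of (similarity, rating) pairs."""
    total_weight = sum(s for s, _ in neighbors)
    if total_weight == 0:
        raise ValueError("need at least one neighbor with non-zero similarity")
    return sum(s * r for s, r in neighbors) / total_weight

# User 5 rated movie 3 as 2 (s13 = 0.2) and movie 6 as 3 (s16 = 0.3).
print(round(predict_rating([(0.2, 2), (0.3, 3)]), 2))   # 2.6
```

The more similar movie (movie 6) pulls the estimate toward its rating of 3, so the result lands above the plain average of 2.5.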
![Page 85: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/85.jpg)
Nearest Neighbors
– To predict the rating for user u on item i:
• Use similar users’ ratings for similar movies:
rui = rating for user u and item i
bui = baseline rating for user u and item i
sij = similarity between items i and j
N(i,u) = neighborhood of item i for user u (might be fixed at k)

$$\hat{r}_{ui} = \frac{\sum_{j \in N(i,u)} s_{ij}\, r_{uj}}{\sum_{j \in N(i,u)} s_{ij}}$$
![Page 86: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/86.jpg)
Nearest Neighbors
• Useful to “center” the data, and model residuals
• What is sij?
– Cosine distance
– Correlation
• What is N(i,u)?
– Top-k
– Threshold
• What is bui?
• How to deal with missing values?
• Choose several different options and throw them in!
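The two similarity options and the top-k neighborhood named on this slide can be sketched as below. Missing values are handled here by comparing two items only over users who rated both, which is one option among several. The data layout (each item as a user -> rating dict) and all function names are assumptions.

```python
import math

def co_rated(item_a, item_b):
    """Ratings of both items from the users who rated both.
    Each item is a dict mapping user -> rating."""
    common = sorted(set(item_a) & set(item_b))
    return [item_a[u] for u in common], [item_b[u] for u in common]

def cosine_similarity(item_a, item_b):
    xs, ys = co_rated(item_a, item_b)
    norm = math.sqrt(sum(x * x for x in xs)) * math.sqrt(sum(y * y for y in ys))
    return sum(x * y for x, y in zip(xs, ys)) / norm if norm else 0.0

def pearson_correlation(item_a, item_b):
    xs, ys = co_rated(item_a, item_b)
    if len(xs) < 2:
        return 0.0
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    dx = [x - mean_x for x in xs]          # centered ratings
    dy = [y - mean_y for y in ys]
    denom = math.sqrt(sum(d * d for d in dx)) * math.sqrt(sum(d * d for d in dy))
    return sum(a * b for a, b in zip(dx, dy)) / denom if denom else 0.0

def top_k_neighbors(target, candidates, k, similarity=pearson_correlation):
    """The k candidate items most similar to `target`, as (score, name) pairs."""
    scored = [(similarity(target, ratings), name) for name, ratings in candidates.items()]
    return sorted(scored, reverse=True)[:k]
```

Correlation centers each item's ratings before comparing them, so two items that are rated in opposite order by the same users come out negative, while cosine on raw ratings stays positive.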
![Page 87: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/87.jpg)
Nearest Neighbors, cont
• This is called “item-item” NN
– Can also do user-user
– Which do you think is better?
• Advantages of NN
– Few modeling assumptions
– Easy to explain to users
– Most popular RS tool
![Page 88: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/88.jpg)
Nearest Neighbors, Modified
• Problem with traditional k-NN:
• Similarity weights are calculated globally, and
• do not account for correlation among the neighbors
– We estimate the weights (wij) simultaneously via a least squares optimization:
Basically, a regression using the ratings in the neighborhood.
– Shrinkage helps address correlation
– (don’t try this at home)
![Page 89: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/89.jpg)
Latent factor models – Singular Value Decomposition
SVD finds concepts
[Figure: movies placed on two latent axes, "geared towards females" to "geared towards males" and "serious" to "escapist". Movies shown: The Princess Diaries, The Lion King, Braveheart, Lethal Weapon, Independence Day, Amadeus, The Color Purple, Dumb and Dumber, Ocean’s 11, Sense and Sensibility.]
![Page 90: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/90.jpg)
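Matrix Decomposition - SVD
Example with 3 factors (concepts)
Each user and each item is described by a feature vector across concepts
[Figure: the items × users ratings matrix, with one unknown entry marked "?", is approximated (~) by the product of an items × 3 matrix and a 3 × users matrix.]

Item factors (6 items × 3 factors):

| item | factor 1 | factor 2 | factor 3 |
|---|---|---|---|
| 1 | .1 | -.4 | .2 |
| 2 | -.5 | .6 | .5 |
| 3 | -.2 | .3 | .5 |
| 4 | 1.1 | 2.1 | .3 |
| 5 | -.7 | 2.1 | -2 |
| 6 | -1 | .7 | .3 |

User factors (3 factors × 12 users):

| factor | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | 1.1 | -.2 | .3 | .5 | -2 | -.5 | .8 | -.4 | .3 | 1.4 | 2.4 | -.9 |
| 2 | -.8 | .7 | .5 | 1.4 | .3 | -1 | 1.4 | 2.9 | -.7 | 1.2 | -.1 | 1.3 |
| 3 | 2.1 | -.4 | .6 | 1.7 | 2.4 | .9 | -.3 | .4 | .8 | .7 | -.6 | .1 |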
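In a latent factor model, the predicted rating for an item and a user is the inner product of their feature vectors. A minimal sketch with made-up 3-factor vectors; the numbers and the function name are illustrative only.

```python
def predict(item_factors, user_factors):
    """Predicted rating = inner product of the item's and the user's feature vectors."""
    if len(item_factors) != len(user_factors):
        raise ValueError("item and user must have the same number of factors")
    return sum(q * p for q, p in zip(item_factors, user_factors))

item = [0.5, 1.0, -0.5]   # hypothetical item vector over 3 concepts
user = [2.0, 1.5, 1.0]    # hypothetical user vector over the same concepts

print(predict(item, user))   # 2.0
```

A large product on one concept means the user cares about that concept and the item has a lot of it; a negative product lowers the predicted rating.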
![Page 91: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/91.jpg)
Factorization-based modeling
[Figure: the ratings matrix approximated (~) by the product of the item-factor and user-factor matrices from the previous slide.]
• This is a strange way to use SVD!
– Usually for reducing dimensionality, here for filling in missing data!
– Special techniques to do SVD w/ missing data
• Alternating Least Squares = variant of EM algorithms
• Probably most popular model among contestants
– 12/11/2006: Simon Funk describes an SVD based method
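One common way to fit such a factorization when most entries are missing is stochastic gradient descent over the observed ratings only, with a small regularization term. The sketch below is in that spirit; it is not a reproduction of any contestant's method and not Alternating Least Squares. The hyperparameters, names, and data layout are all assumptions.

```python
import random

def factorize(ratings, n_users, n_items, n_factors=2,
              lr=0.02, reg=0.02, epochs=500, seed=0):
    """Fit user factors P and item factors Q so that dot(P[u], Q[i])
    approximates each observed rating. `ratings` is a list of
    (user_index, item_index, rating); unobserved cells are never touched."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(p * q for p, q in zip(P[u], Q[i]))
            for f in range(n_factors):
                pu, qi = P[u][f], Q[i][f]
                # Gradient step on squared error with L2 regularization.
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    return sum(p * q for p, q in zip(P[u], Q[i]))
```

After fitting, `predict(P, Q, u, i)` can be called for any user and item pair, including pairs that were never rated, which is how the factorization fills in missing data.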
![Page 92: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/92.jpg)
Latent Factor Models, Modified
• Problem with traditional SVD:
– User and item factors are determined globally
– Each user described as a fixed linear combination across factors
– What if there are different people in the household?
• Let the linear combination change as a function of the item rated.
• Substitute pu with pu(i), and add similarity weights
• Again, don’t try this at home!
![Page 93: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/93.jpg)
First 2 Singular Vectors
![Page 94: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/94.jpg)
Incorporating Implicit Data
• Implicit Data: what you choose to rate is an important piece of information, separate from how you rate it.
• Helps incorporate negative information, especially for those users with low variance.
• Can be fit in NN or SVD
![Page 95: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/95.jpg)
WTF: The Who to Follow Service at Twitter
• Twitter's user recommendation service, responsible for creating millions of connections daily between users based on shared interests, common connections, and other related factors.
• Reference: http://www.stanford.edu/~rezab/papers/wtf_overview.pdf
![Page 96: Online Social Networks: Navigation, Search, Recommendationeugene/cs190/lectures/april23-osn3.pdf · Theorem: If a= 1 and outdegree is polylogarithmic, can s ~ O(log n) Group structure](https://reader036.vdocuments.site/reader036/viewer/2022071100/5fd9532a5d73db352a659add/html5/thumbnails/96.jpg)
Facebook EdgeRank
• http://techcrunch.com/2010/04/22/facebook-edgerank/
http://econsultancy.com/us/blog/7885-the-ultimate-guide-to-the-facebook-edgerank-algorithm
http://cs229.stanford.edu/proj2007/DaniyalzadeLipus-FacebookFriendSuggestion.pdf
• To be continued…. http://cameronmarlow.com/papers