stanford university · anchor text: i go to stanford where i study what ... webgraph we can ......
TRANSCRIPT
![Page 1: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/1.jpg)
CS224W: Social and Information Network AnalysisJure Leskovec, Stanford University
http://cs224w.stanford.edu
![Page 2: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/2.jpg)
How to organize/navigate it?
First try: Human curatedWeb directories Yahoo, DMOZ, LookSmart
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 2
![Page 3: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/3.jpg)
SEARCH!
Find relevant docs in a small and trusted set: Newspaper articles Patents, etc.
Two traditional problems: Synonimy: buy – purchase, sick – ill Polysemi: jaguar
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 3
![Page 4: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/4.jpg)
Does more documents mean better results?
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 4
![Page 5: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/5.jpg)
What is “best” answer to query “Stanford”? Anchor Text: I go to Stanford where I study
What about query “newspaper”? No single right answer
Scarcity (IR) vs. abundance (Web) of information Web: Many sources of information. Who to “trust”?
Trick: Pages that actually know about newspapers might all be pointing to many newspapers
Ranking!
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 5
![Page 6: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/6.jpg)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 6
the “golden triangle”
![Page 7: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/7.jpg)
Web pages are not equally “important” www.joe‐schmoe.com vs. www.stanford.edu
We already know:Since there is large diversity in the connectivity of the webgraph we can rank the pages by the link structure
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 7
![Page 8: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/8.jpg)
We will cover the following Link Analysis approaches to computing importances of nodes in a graph: Hubs and Authorities (HITS) Page Rank Topic‐Specific (Personalized) Page Rank
Sidenote: Various notions of node centrality: Node u Degree dentrality = degree of u Betweenness centrality = #shortest paths passing through u Closeness centrality = avg. length of shortest paths from u to all other nodes Eigenvector centrality = like PageRank
11/8/2011 8Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
![Page 9: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/9.jpg)
Goal (back to the newspaper example): Don’t just find newspapers.Find “experts” – people who link in a coordinated way to good newspapers
Idea: Links as votes Page is more important if it has more links In‐coming links? Out‐going links?
Hubs and AuthoritiesEach page has 2 scores: Quality as an expert (hub): Total sum of votes of pages pointed to
Quality as an content (authority): Total sum of votes of experts
Principle of repeated improvement11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 9
NYT: 10
Ebay: 3
Yahoo: 3
CNN: 8
WSJ: 9
![Page 10: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/10.jpg)
Interesting pages fall into two classes:1. Authorities are pages containing useful
information Newspaper home pages Course home pages Home pages of auto manufacturers
2. Hubs are pages that link to authorities List of newspapers Course bulletin List of US auto manufacturers
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 10
NYT: 10Ebay: 3Yahoo: 3CNN: 8WSJ: 9
![Page 11: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/11.jpg)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 11
![Page 12: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/12.jpg)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 12
![Page 13: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/13.jpg)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 13
![Page 14: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/14.jpg)
A good hub links to many good authorities
A good authority is linked from many good hubs
Model using two scores for each node: Hub score and Authority score Represented as vectors h and a
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 14
![Page 15: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/15.jpg)
Each page i has 2 scores: Authority score: Hub score:
HITS algorithm: Initialize: Then keep iterating: Authority: →
Hub: →
normalize: ,
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 15
[Kleinberg ‘98]
i
j1 j2 j3 j4
→
i
j1 j2 j3 j4
→
![Page 16: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/16.jpg)
HITS converges to a single stable point Slightly change the notation: Vector a = (a1…,an), h = (h1…,hn) Adjacency matrix (n x n): Mij=1 if ij
Then:
So: And likewise:
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
j
jijiji
ji aMhah
Mah hMa T
16
[Kleinberg ‘98]
![Page 17: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/17.jpg)
HITS algorithm in new notation: Set: a = h = 1n
Repeat: h=Ma, a=MTh Normalize
Then: a=MT(Ma)
Thus, in 2k steps: a=(MT M)k ah=(M MT)k h
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
new hnew a
a is being updated (in 2 steps):MT(M a)=(MT M) a
h is updated (in 2 steps):M (MT h)=(MMT) h
Repeated matrix powering17
![Page 18: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/18.jpg)
Definition: Let Ax=x for some scalar , vector x, matrix A Then x is an eigenvector, and is its eigenvalue
Fact: If A is symmetric (Aij=Aji) (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors w1…wnthat form a basis (coordinate system) with eigenvalues 1... n (|i||i+1|)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 18
![Page 19: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/19.jpg)
Let’s write x in coordinate system w1…wnx=i i wi
x has coordinates (1,…, n) Suppose: 1 ... n (|1| … |n|) Akx = k x = i i
ki wi As k, if we normalize
Ak x 1 1 w1(contribution of all other coordinates 0)
So authority a is eigenvector of MT Massociated with largest eigenvalue 1
Similarly: hub h is eigenvector of M MT
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 19
lim→
→ ∞
![Page 20: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/20.jpg)
A “vote” from an important page is worth more
A page is important if it is pointed to by other important pages
Define a “rank” rj for node j
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 20
ji
ij
rr(i)dout
yy
mmaaa/2
y/2a/2
m
y/2
The web in 1839
Flow equations:ry = ry /2 + ra /2ra = ry /2 + rm
rm = ra /2
![Page 21: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/21.jpg)
Stochastic adjacency matrix M Let page j has dj out‐links If j → i, then Mij = 1/dj else Mij = 0 M is a column stochastic matrix Columns sum to 1
Rank vector r: vector with an entry per page ri is the importance score of page i i ri = 1
The flow equations can be written r = M r
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 21
i
j
13
![Page 22: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/22.jpg)
Imagine a random web surfer: At any time t, surfer is on some page u At time t+1, the surfer follows an out‐link from u uniformly at random Ends up on some page v linked from u Process repeats indefinitely
Let: p(t) … vector whose ith coordinate is the prob. that the surfer is at page i at time t p(t) is a probability distribution over pages
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 22
![Page 23: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/23.jpg)
Where is the surfer at time t+1? Follows a link uniformly at random
p(t+1) = Mp(t) Suppose the random walk reaches a state
p(t+1) = Mp(t) = p(t)then p(t) is stationary distribution of a random walk
Our rank vector r satisfies r = Mr So, it is a stationary distribution for the random walk
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 23
![Page 24: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/24.jpg)
Given a web graph with n nodes, where the nodes are pages and edges are hyperlinks Assign each node an initial page rank Repeat until convergence calculate the page rank of each node
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 24
ji
tit
jrr
i
)()1(
d
di …. out-degree of node i
![Page 25: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/25.jpg)
Power Iteration: Set
→
And iterate
Example:ry 1/3 1/3 5/12 9/24 6/15ra = 1/3 3/6 1/3 11/24 … 6/15rm 1/3 1/6 3/12 1/6 3/15
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
yy
aa mm
y a my ½ ½ 0a ½ 0 1
m 0 ½ 0
25
Iteration 0, 1, 2, …
ry = ry /2 + ra /2ra = ry /2 + rm
rm = ra /2
![Page 26: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/26.jpg)
Does this converge?
Does it converge to what we want?
Are results reasonable?
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 26
ji
tit
jrr
i
)()1(
d Mrr orequivalently
![Page 27: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/27.jpg)
Example:ra 1 0 1 0rb 0 1 0 1
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 27
=
bbaa
Iteration 0, 1, 2, …
![Page 28: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/28.jpg)
Example:ra 1 0 0 0rb 0 1 0 0
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 28
=
bbaa
Iteration 0, 1, 2, …
![Page 29: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/29.jpg)
2 problems: Some pages are “dead ends” (have no out‐links) Such pages cause importance to “leak out”
Spider traps (all out links arewithin the group) Eventually spider traps absorb all importance
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 29
![Page 30: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/30.jpg)
Power Iteration: Set
→
And iterate
Example:ry 1/3 2/6 3/12 5/24 0ra = 1/3 1/6 2/12 3/24 … 0rm 1/3 1/6 1/12 2/24 0
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 30
Iteration 0, 1, 2, …
yy
aa mm
y a my ½ ½ 0a ½ 0 0
m 0 ½ 0
ry = ry /2 + ra /2ra = ry /2rm = ra /2
![Page 31: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/31.jpg)
Power Iteration: Set
→
And iterate
Example:ry 1/3 2/6 3/12 5/24 0ra = 1/3 1/6 2/12 3/24 … 0rm 1/3 3/6 7/12 16/24 1
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 31
Iteration 0, 1, 2, …
yy
aa mm
y a my ½ ½ 0a ½ 0 0
m 0 ½ 1
ry = ry /2 + ra /2ra = ry /2rm = ra /2 + rm
![Page 32: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/32.jpg)
Markov Chains Set of states X Transition matrix P where Pij = P(Xt=i | Xt‐1=j) π specifying the probability of being at eachstate x X
Goal is to find π such that π = π P
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 32
)()1( tt Mrr
![Page 33: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/33.jpg)
Markov chains theory
Fact: For any start vector, the power method applied to a Markov transition matrix P will converge to a unique positive stationary vector as long as P is stochastic, irreducibleand aperiodic.
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 33
![Page 34: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/34.jpg)
Stochastic: every column sums to 1
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 34
yy
aa mm
y a my ½ ½ 1/3a ½ 0 1/3
m 0 ½ 1/3
ry = ry /2 + ra /2 + rm /3ra = ry /2+ rm /3rm = ra /2 + rm /3
)1( en
aMS e…vector
of all 1
![Page 35: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/35.jpg)
A chain is periodic if there exists k > 1 such that the interval between two visits to some state s is always a multiple of k.
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 35
yy
aa mm
![Page 36: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/36.jpg)
From any state, there is a non‐zero probability of going from any one state to any another.
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 36
yy
aa mm
![Page 37: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/37.jpg)
Google’s solution:At each step, random surfer has two options: With probability 1-,
follow a link at random With probability ,
jump to some page uniformly at random PageRank equation [Brin‐Page, 98]
r=
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 37
di … outdegreeof node i
Assuming we follow random teleport links with probability 1.0 from dead-ends
![Page 38: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/38.jpg)
The Google Matrix:
G is stochastic, aperiodic and irreducible.
G is dense but computable using sparse mtx H
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 38
![Page 39: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/39.jpg)
PageRank as a principal eigenvectorr = Mr rj=i ri/di
But we really want:rj = (1- ) ij ri/di +
Define:M’ij = (1- ) Mij + 1/n
Then: r = M’r What is ? In practice =0.15 (5 links and jump)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 39
di … out‐degree of node i
![Page 40: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/40.jpg)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 40
![Page 41: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/41.jpg)
PageRank and HITS are two solutions to the same problem: What is the value of an in‐link from u to v? In the PageRank model, the value of the link depends on the links into u In the HITS model, it depends on the value of the other links out of u
The destinies of PageRank and HITS post‐1998 were very different
11/8/2011 41Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu
![Page 42: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/42.jpg)
![Page 43: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/43.jpg)
Goal: Evaluate pages not just by popularity but by how close they are to the topic
Teleporting can go to: Any page with equal probability (we used this so far)
A topic‐specific set of “relevant” pages Topic‐specific (personalized) PageRank
M’ij = (1-) Mij + /|S| if i in S (S...teleport set)
= (1-) Mij otherwise
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 43
![Page 44: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/44.jpg)
Graphs and web search: Ranks nodes by “importance”
Personalized PageRank: Ranks proximity of nodes to the teleport nodes S
Proximity on graphs: Q: What is most related conference to ICDM? Random Walks with Restarts Teleport back: S={single node}
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 44
ICDM
KDD
SDM
Philip S. Yu
IJCAI
NIPS
AAAI M. Jordan
Ning Zhong
R. Ramakrishnan
…
…
… …
Conference Author
![Page 45: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/45.jpg)
Link Farms: networks of millions of pages design to focus PageRank on a few undeserving webpages
To minimize their influence use a teleport set of trusted webpages E.g., homepages of universities
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 45
![Page 46: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/46.jpg)
![Page 47: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/47.jpg)
Rich get richer [Cho et al., WWW ‘04] Two snapshots of the web‐graph at two different time points Measure the change: In the number of in‐links PageRank
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 47
http://oak.cs.ucla.edu/~cho/papers/cho-bias.pdf
![Page 48: Stanford University · Anchor Text: I go to Stanford where I study What ... webgraph we can ... (in our case MT M and M MT are symmetric) Then A has n orthogonal unit eigenvectors](https://reader033.vdocuments.site/reader033/viewer/2022050413/5f8999eb65d3911b1622e647/html5/thumbnails/48.jpg)
11/8/2011 Jure Leskovec, Stanford CS224W: Social and Information Network Analysis, http://cs224w.stanford.edu 48