Modeling the Spread of Influence on the Blogosphere Akshay Java, Pranam Kolari, Tim Finin, and Tim Oates
UMBC Tech Report
04/12/06
Outline
What is influence?
Basic Influence Model
Influence models for the blogosphere
Results
Conclusions
What is Influence?
Main Entry: in·flu·ence
Pronunciation: 'in-"flü-&n(t)s, esp Southern in-'
Function: noun
Etymology: Middle English, from Middle French, from Medieval Latin influentia, from Latin influent-, influens, present participle of influere to flow in, from in- + fluere to flow -- more at FLUID
1 a : an ethereal fluid held to flow from the stars and to affect the actions of humans b : an emanation of occult power held to derive from stars
2 : an emanation of spiritual or moral force
3 a : the act or power of producing an effect without apparent exertion of force or direct exercise of command b : corrupt interference with authority for personal gain
4 : the power or capacity of causing an effect in indirect or intangible ways : SWAY
5 : one that exerts influence
- under the influence : affected by alcohol : DRUNK <was arrested for driving under the influence>
NOT This Kind of Influence! ;-)
Motivation
Influence models have been studied for cocitation graphs: David Kempe, Jon Kleinberg, and Eva Tardos, "Maximizing the Spread of Influence through a Social Network," KDD 2003
Applies to blogs as well. Recent examples: startups, Microsoft Origami, Walmart, DoD
GOAL: Predict influential blogs
Target nodes to help achieve a “Tipping Point”*
* The Tipping Point: Malcolm Gladwell
Influence Models for the Blogosphere
[Figure: Blog Graph (nodes 1-5) and the corresponding Influence Graph (nodes 1-5), whose edges carry weights such as 2/5, 1/5, 1/3, 1/2, and 1]
W_{u,v} = C_{u,v} / d_v
U links to V => U is influenced by V
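The weighting rule above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the slide does not define C or d, so the sketch assumes that C counts parallel links between a pair of blogs and that d is the total link count of the influenced blog, which makes the weights flowing into each node sum to 1. The function name `influence_weights` and the input format are hypothetical.

```python
from collections import Counter, defaultdict

def influence_weights(blog_links):
    """Build a weighted influence graph from blog hyperlinks.

    blog_links: iterable of (linker, linked) pairs, one pair per hyperlink.
    A link from `linker` to `linked` is read as "linker is influenced by
    linked", so the influence edge points from `linked` to `linker`.

    ASSUMPTION: the weight is (number of parallel links) divided by
    (total links made by the influenced blog).
    """
    counts = Counter(blog_links)            # (linker, linked) -> parallel links
    total_links = Counter()
    for (linker, _linked), c in counts.items():
        total_links[linker] += c            # all links made by the influenced blog

    weights = defaultdict(dict)             # weights[influencer][influenced]
    for (linker, linked), c in counts.items():
        weights[linked][linker] = c / total_links[linker]
    return dict(weights)

links = [("a", "b"), ("a", "b"), ("a", "c"), ("d", "b")]
w = influence_weights(links)
```

Here blog `a` links twice to `b` and once to `c`, so under the stated assumption `b` influences `a` with weight 2/3 and `c` influences `a` with weight 1/3.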
Basic Influence Models
Linear Threshold Model
Node v becomes active when Σ_w b_{v,w} ≥ θ_v,
where the sum runs over the active neighbors w of v
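A minimal sketch of the linear threshold rule follows, assuming the influence graph is stored as a nested dictionary `weights[w][v]` (the weight with which w influences v) and that thresholds are supplied per node or drawn uniformly at random. The names `linear_threshold`, `weights`, and `thresholds` are illustrative, not taken from the talk.

```python
import random

def linear_threshold(weights, seeds, thresholds=None, rng=None):
    """Run the linear threshold process until no further node activates.

    weights[w][v]: influence weight of w on v (b_{v,w} in the slide).
    seeds: initially active nodes.
    thresholds: optional dict node -> theta; drawn uniformly from [0, 1)
    when omitted (an assumption of this sketch).
    """
    rng = rng or random.Random(0)
    nodes = set(weights) | {v for nbrs in weights.values() for v in nbrs}
    if thresholds is None:
        thresholds = {v: rng.random() for v in nodes}

    # Index incoming weights per node: incoming[v][w] = b_{v,w}.
    incoming = {v: {} for v in nodes}
    for w, nbrs in weights.items():
        for v, b in nbrs.items():
            incoming[v][w] = b

    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for v in nodes - active:
            total = sum(b for w, b in incoming[v].items() if w in active)
            if total >= thresholds[v]:
                active.add(v)
                changed = True
    return active

example_weights = {"a": {"b": 0.6}, "b": {"c": 0.5}, "d": {"c": 0.5}}
example_thresholds = {"a": 0.5, "b": 0.5, "c": 0.8, "d": 0.5}
one_seed = linear_threshold(example_weights, {"a"}, example_thresholds)
two_seeds = linear_threshold(example_weights, {"a", "d"}, example_thresholds)
```

With seed `a` alone, `c` stays inactive because only `b` contributes weight 0.5 against a threshold of 0.8; adding `d` as a seed raises the active weight on `c` to 1.0 and activates it.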
Cascade Model
P_{v,w}: the probability with which node v can activate its neighbor w, independent of history.
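The cascade model can be sketched as below. This is an illustrative simulation assuming the common independent cascade formulation, in which each newly activated node gets a single chance to activate each inactive neighbor; the function name `independent_cascade` and the `prob[v][w]` layout are assumptions of this sketch.

```python
import random

def independent_cascade(prob, seeds, rng=None):
    """Simulate one run of the cascade model.

    prob[v][w]: probability that an active v activates its neighbor w.
    Each newly active node tries each inactive neighbor exactly once,
    regardless of what happened earlier (independent of history).
    """
    rng = rng or random.Random(0)
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for v in frontier:
            for w, p in prob.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active

example_prob = {"a": {"b": 1.0, "c": 0.0}, "b": {"d": 1.0}}
reached = independent_cascade(example_prob, {"a"})
```

The example uses probabilities of 0 and 1 so the outcome is deterministic: `a` always activates `b`, never `c`, and `b` in turn always activates `d`.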
[Figure: Influence Graph (nodes 1-5) with edge weights and a threshold θ_v, showing active and inactive nodes]
Node Selection Heuristics
Inlinks: easily spammed
Centrality: expensive to compute for very large graphs
PageRank: requires link information; however, it is easy to compute
Greedy heuristic: computationally expensive; however, it performs better
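To illustrate why the greedy heuristic is expensive, here is a minimal sketch of greedy seed selection in the style of hill climbing: at each step it adds the node with the largest estimated spread, where spread is estimated by repeated cascade simulations. The slides do not give the exact procedure, so the function names, the use of the cascade model, and the number of simulation runs are all assumptions of this sketch.

```python
import random

def simulate_cascade(prob, seeds, rng):
    """One cascade run; prob[v][w] is the chance that v activates w."""
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for v in frontier:
            for w, p in prob.get(v, {}).items():
                if w not in active and rng.random() < p:
                    active.add(w)
                    next_frontier.append(w)
        frontier = next_frontier
    return active

def estimate_spread(prob, seeds, runs, rng):
    """Average number of active nodes over `runs` simulated cascades."""
    return sum(len(simulate_cascade(prob, seeds, rng)) for _ in range(runs)) / runs

def greedy_seeds(prob, k, runs=100, seed=0):
    """Pick k seed nodes, each time adding the node with the best estimated spread."""
    rng = random.Random(seed)
    nodes = set(prob) | {w for nbrs in prob.values() for w in nbrs}
    chosen = []
    for _ in range(k):
        best, best_spread = None, -1.0
        for candidate in sorted(nodes - set(chosen)):
            spread = estimate_spread(prob, chosen + [candidate], runs, rng)
            if spread > best_spread:
                best, best_spread = candidate, spread
        if best is None:      # fewer than k nodes in the graph
            break
        chosen.append(best)
    return chosen

example_prob = {"a": {"b": 1.0, "c": 1.0}, "d": {"e": 1.0}}
picked = greedy_seeds(example_prob, k=2, runs=5)
```

Every greedy step re-simulates the cascade for every remaining candidate, so the cost grows with the number of nodes, the number of seeds, and the number of runs. Ranking by inlinks or PageRank needs only a single pass over the link structure, which is the trade-off the list above describes.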
Effect of Splogs on Node Selection (indegree vs. PageRank)
Almost 54% of the links were from splogs/failed to splogs/failed!
Effect of Splogs on Inlinks
rank URL #inlinks
1 http://www.livejournal.com/users/pics 3072
2 http://www.boingboing.net 2191
3 http://www.dailykos.com 2017
4 http://www.engadget.com 1942
5 http://profiles.blogdrive.com 1526
6 http://michellemalkin.com 1242
7 http://www.opinionjournal.com 1232
8 http://instapundit.com 1187
9 http://slashdot.org 1124
10 http://www.powerlineblog.com 909
11 http://www.huffingtonpost.com/theblog 905
12 http://corner.nationalreview.com 853
13 http://www.talkingpointsmemo.com 733
14 http://www.captainsquartersblog.com/mt 728
15 http://espn-presents2003-world-seriesofpoker.blogspot.com 711
16 http://3-world-series-of-poker-online-3.blogspot.com 711
17 http://worldseries-of-poker-network-tv-show.blogspot.com 711
18 http://wsop2003.blogspot.com 711
19 http://wsop-bracelet1.blogspot.com 711
20 http://worldseries-poker.blogspot.com 711
21 http://worldseries-of-poker-official.blogspot.com 711
22 http://worldseries-of-poker-wsop.blogspot.com 711
23 http://world-series-of-poker-nocd-patch66.blogspot.com 711
24 http://4-world-series-of-poker-past-winners.blogspot.com 711
25 http://7-wsop-games-7.blogspot.com 711
Tightly knit community of splogs
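As a rough illustration of how splogs distort an inlink ranking, the sketch below counts inlinks with and without links that touch a known set of splogs. The link data, the blog names, and the `splogs` set are invented for the example, and `rank_by_inlinks` is a hypothetical helper; the talk does not describe how splogs were filtered.

```python
from collections import Counter

def rank_by_inlinks(links, splogs=frozenset()):
    """Rank blogs by inlink count, ignoring any link that touches a splog.

    links: iterable of (source, target) pairs, one per hyperlink.
    splogs: set of blog identifiers to exclude (assumed to be known already).
    Returns a list of (blog, inlink_count) sorted by descending count.
    """
    counts = Counter(
        target
        for source, target in links
        if source not in splogs and target not in splogs
    )
    return counts.most_common()

# Made-up data: two splogs linking to each other, plus one legitimate blog.
example_links = [
    ("splog1", "splog2"), ("splog2", "splog1"), ("splog1", "splog2"),
    ("x", "blog"), ("y", "blog"),
]
raw_ranking = rank_by_inlinks(example_links)
clean_ranking = rank_by_inlinks(example_links, splogs={"splog1", "splog2"})
```

In the raw ranking the mutually linking splogs sit alongside the legitimate blog; once links touching the splog set are dropped, only the legitimate blog remains.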
Conclusions
Influence models can be applied to blogs, not just cocitation graphs
Splogs are a problem
Greedy heuristics work well; PageRank is an inexpensive approximation
Ideas for CIKM 06
Good or bad influence? Associating sentiment with links.
Finding influential blogs for a topic. (SVM accuracy 75-85%)
Community structure of blogs.