Page 1: University of California at Santa Barbara

University of California at Santa Barbara

Christo Wilson, Bryce Boe, Alessandra Sala, Krishna P. N. Puttaswamy, and Ben Zhao

User Interactions in Social Networks and their Implications

Page 2: University of California at Santa Barbara

Social Networks

4/2/2009

Page 3: University of California at Santa Barbara

Social Applications

Enable new ways to solve problems for distributed systems
• Social web search
• Social bookmarking
• Social marketplaces
• Collaborative spam filtering (RE: Reliable Email)

How popular are social applications?
• Facebook Platform: 50,000 applications
• Popular ones have >10 million users each

Page 4: University of California at Santa Barbara

Social Graphs and User Interactions

Social applications rely on
1. Social graph topology
2. User interactions

Currently, social applications are evaluated using only the social graph
• This assumes all social links are equally important/interactive

Is this true in reality?
• Milgram’s familiar stranger
• Connections for ‘status’ rather than ‘friendship’

Incorrect assumptions lead to faulty application design and evaluation

Page 5: University of California at Santa Barbara

Goals

Question: Are social links valid indicators of real user interaction?

First large-scale study of Facebook
• 10 million users (15% of total users) / 24 million interactions

Use data to show highly skewed distribution of interactions
• <1% of people on Facebook talk to >50% of their friends

Propose a new model for social graphs that includes interaction information
• Interaction Graph

Reevaluate existing social applications using the new model
• In some cases, they break entirely

Page 6: University of California at Santa Barbara

Outline

• Characterizing Facebook
• Analyzing User Interactions
• Interaction Graphs
• Effects on Social Applications

Page 7: University of California at Santa Barbara

Crawling Facebook for Data

Facebook is the most popular social network

Crawling social networks is difficult
• Too large to crawl completely, must be sampled
• Privacy settings may prevent crawling

Thankfully, Facebook is divided into ‘networks’
• Represent geographic regions, schools, companies
• Regional networks are not authenticated

Page 8: University of California at Santa Barbara

Crawling for Data, cont.

Crawled Facebook regional networks
• 22 largest networks: London, Australia, New York, etc.
• Timeframe: March to May 2008
• Start with 50 random ‘seed’ users, perform breadth-first search (BFS)

Data recorded for each user:
• Friends list
• History of wall posts and photo comments
  • Collectively referred to as interactions
  • Most popular publicly accessible Facebook applications
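The seed-and-BFS crawl described on this slide can be sketched roughly as follows. This is a minimal illustration, not the crawler used in the study: `get_friends` is a hypothetical callable standing in for whatever fetches a user's visible friends list, and the toy network is made-up data.

```python
from collections import deque

def bfs_crawl(seeds, get_friends):
    """Breadth-first crawl from a list of seed users.

    get_friends is a hypothetical callable that returns the visible
    friends of a user (an empty list if the profile cannot be read).
    """
    visited = set(seeds)
    queue = deque(seeds)
    edges = set()
    while queue:
        user = queue.popleft()
        for friend in get_friends(user):
            # Store each undirected link once, as a sorted pair.
            edges.add((min(user, friend), max(user, friend)))
            if friend not in visited:
                visited.add(friend)
                queue.append(friend)
    return visited, edges

# Toy stand-in for one regional network (made-up data).
toy = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2], 5: [6], 6: [5]}
users, links = bfs_crawl([1], lambda u: toy.get(u, []))
```

Users 5 and 6 are never reached from seed 1, which is one reason a crawl like this would start from many random seeds.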

Page 9: University of California at Santa Barbara

High Level Graph Statistics

                                  Facebook         Orkut [1]
Number of Users Crawled           10,697,000       1,846,000
Percentage of Total Users         15%              26.9%
Number of Social Links Crawled    408,265,000      22,613,000
Radius                            9.8              6
Diameter                          13.4             9
Average Path Length               4.8              4.25
Clustering Coefficient            0.164            0.171
Power-law Coefficient             α=1.5, D=0.55    α=1.5, D=0.6

[1] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In Proc. of IMC, October 2007.

• Based on Facebook’s total size of 66 million users in early 2008
• Represents ~50% of all users in the crawled regions
• ~49% of links were crawlable
• This provides a lower bound on the average number of in-network friends
• Avg. social degree = ~77
• Low average path length and high clustering coefficient indicate Facebook is small-world
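As a rough sanity check on the average degree figure, the usual relation for an undirected graph (average degree = 2 × edges / vertices) can be applied to the rounded table values. This assumes each crawled social link is undirected and counted once, which the slides do not state explicitly.

```python
users_crawled = 10_697_000
links_crawled = 408_265_000

# Each undirected link adds one to the degree of two users.
avg_degree = 2 * links_crawled / users_crawled
print(round(avg_degree, 1))
```

The result lands close to the reported ~77; the small gap is plausibly due to rounding in the table values.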

Page 10: University of California at Santa Barbara

Outline

• Characterizing Facebook
• Analyzing User Interactions
• Interaction Graphs
• Effects on Social Applications

Page 11: University of California at Santa Barbara

Analyzing User Interactions

Having established that Facebook has the expected social graph properties…

Question: Are social links valid indicators of real user interaction?

Examine distribution of interactions among friends

Page 12: University of California at Santa Barbara

Distribution Among Friends

[Figure: CDF plot. x-axis: % of Friends Involved (0 to 50); y-axis: % of Users (CDF). Curves: 70% Interaction Cumulative Fraction; 90% Interaction Cumulative Fraction.]

For 50% of users, 70% of interaction comes from 7% of friends.

For 50% of users, 100% of interaction comes from 20% of friends.

Almost nobody interacts with more than 50% of their friends!

• Social degree does not accurately predict human behavior
• Initial question: Are social links valid indicators of real user interaction?
  Answer: NO
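One way to read the plotted quantity: for each user, find the smallest fraction of friends that accounts for a given share (70%, 90%, 100%) of that user's interactions. The sketch below is an illustration under that reading, not the authors' code; the function name and the per-friend counts are made up.

```python
def friends_fraction_for_share(interactions_per_friend, share):
    """Smallest fraction of a user's friends that accounts for `share`
    of the user's interactions. Friends with zero interactions still
    count toward the size of the friends list."""
    total = sum(interactions_per_friend)
    if total == 0:
        return 0.0
    needed = share * total
    running = 0
    ranked = sorted(interactions_per_friend, reverse=True)
    for k, count in enumerate(ranked, start=1):
        running += count
        if running >= needed:
            return k / len(ranked)
    return 1.0

# One hypothetical user with 20 friends, only 5 of whom ever interact.
counts = [40, 20, 10, 5, 5] + [0] * 15
frac_70 = friends_fraction_for_share(counts, 0.70)
frac_100 = friends_fraction_for_share(counts, 1.0)
```

Computing this value for every user and plotting its CDF would give curves of the kind described in the figure.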

Page 13: University of California at Santa Barbara

Outline

• Characterizing Facebook
• Analyzing User Interactions
• Interaction Graphs
• Effects on Social Applications

Page 14: University of California at Santa Barbara

A Better Model of Social Graphs

Answer to our initial question:
• Not all social links are created equal
• Implication: social links alone cannot be used to evaluate social applications

What is the right way to model social networks?
• More accurately approximate reality by taking user interactivity into account
• Interaction Graphs
  • Chun et al., IMC 2008

Page 15: University of California at Santa Barbara

Interaction Graphs

Definition: a social graph parameterized by…
• n : minimum number of interactions per edge
• t : some window of time for interactions

n = 1 and t = {2004 to the present}
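The definition above can be sketched as a filter over the social graph. This is a minimal illustration that assumes interactions are logged as (user, user, date) records and that direction is ignored when counting interactions per edge; the function name and the data are hypothetical.

```python
from collections import Counter
from datetime import date

def interaction_graph(social_edges, interactions, n, t_start, t_end):
    """Keep a social edge only if its two endpoints interacted at
    least n times within the window [t_start, t_end]."""
    counts = Counter()
    for a, b, when in interactions:
        if t_start <= when <= t_end:
            # frozenset ignores who initiated the interaction.
            counts[frozenset((a, b))] += 1
    return {e for e in social_edges if counts[frozenset(e)] >= n}

social = {("a", "b"), ("a", "c"), ("b", "c")}
log = [
    ("a", "b", date(2008, 3, 1)),
    ("b", "a", date(2008, 4, 2)),
    ("a", "c", date(2006, 1, 1)),
]
# n = 1 over the whole history, as on the slide.
full = interaction_graph(social, log, 1, date(2004, 1, 1), date(2008, 5, 31))
# A stricter graph: n = 2 within 2008 only.
recent = interaction_graph(social, log, 2, date(2008, 1, 1), date(2008, 5, 31))
```

The edge between "b" and "c" never carries an interaction, so it is pruned in both graphs.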

Page 16: University of California at Santa Barbara

Social vs. Interaction Degree

[Figure: plot of interaction degree against social degree. x-axis: Social Degree (0 to 1400); y-axis: Interaction Degree (0 to 500). Annotations: 1:1 Degree Ratio; Dunbar’s Number (150); 99% of Facebook Users.]

• Interaction graph prunes useless edges
• Results agree with theoretical limits on human social cognition

Page 17: University of California at Santa Barbara

Interaction Graph Analysis

Do Interaction Graphs maintain expected social network graph properties?

                          Social Graph     Interaction Graph
Number of Vertices        10,697,000       8,403,000
Number of Edges           408,265,000      94,665,000
Radius                    9.8              12.4
Diameter                  13.4             19.8
Average Path Length       4.8              7.3
Clustering Coefficient    0.164            0.078
Power-law Coefficient     α=1.5, D=0.55    α=1.5, D=0.24

• Interaction Graphs still have
  • Power-law scaling
  • Scale-free behavior
  • Small-world clustering
• … but they exhibit these characteristics less strongly than the full social network

Page 18: University of California at Santa Barbara

Outline

• Characterizing Facebook
• Analyzing User Interactions
• Interaction Graphs
• Effects on Social Applications

Page 19: University of California at Santa Barbara

Social Applications, Revisited

Recap:
• Need a better model to evaluate social applications
• Interaction Graphs augment social graphs with interaction information

How do these changes affect social applications?
• Sybilguard
• Analysis of Reliable Email in the paper

Page 20: University of California at Santa Barbara

Sybilguard

Sybilguard is a system for detecting Sybil nodes in social graphs

Why do we care about detecting Sybils?
• Social network based games:
• Social marketplaces:

How Sybilguard works
• Key insight: few edges between Sybils and legitimate users (attack edges)
• Use persistent routing tables and random walks to detect attack edges

Page 21: University of California at Santa Barbara

Sybilguard Algorithm

Step 1: Bootstrap the network.
• All users exchange signed keys.
• Key exchange implies that both parties are human and trustworthy.

Step 2: Choose a verifier (A) and a suspect (B).
• A and B send out random walks of a certain length (2).
• Look for intersections.
• A knows B is not a Sybil because multiple paths intersect, and they do so at different nodes.

[Figure: diagram showing verifier A and suspect B.]
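The intersection test in Step 2 can be sketched as below. This is a heavily simplified illustration, not the Sybilguard protocol itself: it omits key exchange and verification, and it assumes an undirected graph given as a dictionary of neighbor lists. The persistent routing tables are modeled as one fixed random permutation per node, so a route entering a node from a given neighbor always leaves toward the same neighbor. All function names are hypothetical.

```python
import random

def build_routing_tables(adj, seed=0):
    """Give every node a fixed random permutation of its neighbors:
    a route arriving from neighbor i always leaves toward table[i]."""
    rng = random.Random(seed)
    tables = {}
    for node, nbrs in adj.items():
        shuffled = nbrs[:]
        rng.shuffle(shuffled)
        tables[node] = dict(zip(nbrs, shuffled))
    return tables

def random_route(tables, start, first_hop, length):
    """Follow the routing tables for `length` hops from `start`."""
    route = [start, first_hop]
    prev, cur = start, first_hop
    for _ in range(length - 1):
        nxt = tables[cur][prev]
        route.append(nxt)
        prev, cur = cur, nxt
    return route

def intersecting_routes(adj, tables, verifier, suspect, length):
    """Count verifier routes that share a node with any suspect route."""
    suspect_nodes = set()
    for hop in adj[suspect]:
        suspect_nodes.update(random_route(tables, suspect, hop, length))
    return sum(
        1
        for hop in adj[verifier]
        if suspect_nodes & set(random_route(tables, verifier, hop, length))
    )

# Fully connected 4-node graph: every route from A (0) meets B's (1) routes.
k4 = {i: [j for j in range(4) if j != i] for i in range(4)}
k4_tables = build_routing_tables(k4)
hits = intersecting_routes(k4, k4_tables, verifier=0, suspect=1, length=2)

# Two disconnected pairs: no route from node 0 can reach node 2's routes.
split = {0: [1], 1: [0], 2: [3], 3: [2]}
misses = intersecting_routes(split, build_routing_tables(split), 0, 2, 2)
```

A verifier would then accept the suspect only if enough of its routes intersect; the threshold is a design parameter not given on the slide.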

Page 22: University of California at Santa Barbara

Sybilguard Algorithm, cont.

[Figure: diagram showing verifier A and suspect B.]

Page 23: University of California at Santa Barbara

Sybilguard Caveats

• Bootstrapping requires human interaction
• Evaluating Sybilguard on the social graph is overly optimistic because most friends never interact!
• Better to evaluate using Interaction Graphs

Page 24: University of California at Santa Barbara

Expected Impact

Fewer edges and lower clustering lead to reduced performance
• Why? Self-loops

[Figure: diagram showing nodes A and B.]

Page 25: University of California at Santa Barbara

Sybilguard on Interaction Graphs

[Figure: CDF plot. x-axis: Random Walk Path Length (0 to 1400); y-axis: % of Intersections (CDF). Curves: Social Graph; Full Interaction Graph; Interaction Graph (1 Year); Interaction Graph (6 Months).]

• When evaluated under real-world conditions, performance of social applications changes dramatically

Page 26: University of California at Santa Barbara

Conclusion

• First large-scale analysis of Facebook
• Answer the question: Are social links valid indicators of real user interaction?
• Formulate new model of social networks: Interaction Graphs
• Demonstrate the effect of Interaction Graphs on social applications
• Final takeaway: when building social applications, use interaction graphs!

Page 27: University of California at Santa Barbara

Anonymized Facebook data (social graphs and interaction graphs) will be available for download soon at the Current Lab website!

http://current.cs.ucsb.edu/facebook

Questions?

Page 28: University of California at Santa Barbara

Social Networks

Social networks are popular platforms for interaction, communication and collaboration

• > 110 million users
  • 9th most trafficked site on the Internet
• > 170 million users
  • #1 photo sharing site
  • 4th most trafficked site on the Internet
  • 114% user growth in 2008
• > 800 thousand users
  • 1,689% user growth in 2008

Page 29: University of California at Santa Barbara

High Level Graph Statistics

                                  Facebook         Orkut [1]
Number of Users Crawled           10,697,000       1,846,000
Percentage of Total Users         15%              26.9%
Number of Social Links Crawled    408,265,000      22,613,000
Radius                            9.8              6
Diameter                          13.4             9
Average Path Length               4.8              4.25
Clustering Coefficient            0.164            0.171
Power-law Coefficient             α=1.5, D=0.55    α=1.5, D=0.6

[1] A. Mislove, M. Marcon, K. P. Gummadi, P. Druschel, and B. Bhattacharjee. Measurement and analysis of online social networks. In Proc. of IMC, October 2007.

• Based on Facebook’s total size of 66 million users in early 2008
• Represents ~50% of all users in the crawled regions
• ~49% of links were crawlable
• This provides a lower bound on the average number of in-network friends
• Avg. social degree = ~77
• Clustering Coefficient measures strength of local cliques
• Measured between zero (random graphs) and one (complete connectivity)
• Social networks display power law degree distribution
• Alpha is the curve of the power law
• D is the fitting error
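To make the clustering coefficient note concrete, here is a minimal sketch of one common definition: for each node, the fraction of pairs of its neighbors that are themselves connected, averaged over all nodes. The slides do not say which exact variant the study used, so treat this as an illustration; the toy graph is made up.

```python
from itertools import combinations

def local_clustering(adj, node):
    """Fraction of pairs of `node`'s neighbors that are linked."""
    nbrs = adj[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
    return 2 * links / (k * (k - 1))

def average_clustering(adj):
    """Mean of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)

# A triangle (a, b, c) with one extra node d hanging off c.
toy_graph = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
cc = average_clustering(toy_graph)
```

Here a and b score 1, c scores 1/3 (only one of its three neighbor pairs is linked), and d scores 0, so the average is 7/12.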

Page 30: University of California at Santa Barbara

Social Degree CDF

[Figure: CDF plot. x-axis: Social Degree (ticks at 1, 10, 100, 1000); y-axis: % of Users (CDF). Curves: YouTube; LiveJournal; Facebook; Orkut.]

Page 31: University of California at Santa Barbara

Nodes vs. Total Interactions

[Figure: CDF plot. x-axis: % of Nodes (0 to 60); y-axis: % of Total Interaction (CDF). Curves: Sorted by Degree; Sorted by Total Interactions.]

Top 10% of most well-connected users are responsible for 60% of total interactions

Top 10% of most interactive users are responsible for 85% of total interactions

• Social degree does not accurately predict human behavior
• Interactions are highly skewed towards a small percent of the Facebook population
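The two curves differ only in how users are ranked before their interactions are accumulated. A minimal sketch of that computation, with a hypothetical function name and made-up per-user numbers:

```python
def top_share(interactions, rank_by, top_fraction):
    """Share of all interactions contributed by the top `top_fraction`
    of users, where users are ranked by the values in `rank_by`."""
    order = sorted(range(len(interactions)),
                   key=lambda i: rank_by[i], reverse=True)
    k = max(1, int(len(order) * top_fraction))
    total = sum(interactions)
    if total == 0:
        return 0.0
    return sum(interactions[i] for i in order[:k]) / total

# Ten hypothetical users: total interactions and social degree.
interactions = [50, 20, 10, 5, 5, 4, 3, 2, 1, 0]
degrees = [30, 90, 15, 10, 8, 60, 5, 4, 3, 2]

by_interaction = top_share(interactions, interactions, 0.10)
by_degree = top_share(interactions, degrees, 0.10)
```

In this toy data the most interactive user produces half of all interactions, while the highest-degree user produces only a fifth, mirroring the gap between the two curves.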