PRYING DATA FROM A SOCIAL NETWORK

Joseph Bonneau

Jonathan Anderson

George Danezis

Computer Laboratory

ASONAM Conference

Athens, Greece

July 20, 2009

Joseph Bonneau (University of Cambridge) Prying Data From a Social Network July 20, 2009 1 / 1

I. Research Question

How can we extract data from a social network on a large scale?

Our Case Study

Why Facebook is interesting:

Size: 225 M users
Complexity
    Third-Party Applications
    Public Listings
    FB Connect
Accurate Profiles

Data of Interest

User Profiles
Social Graph
Traffic Data

Potential Adversaries

Advertisers
Marketers
Data Aggregators
    Credit Ratings Agencies
    Insurance Companies
Law Enforcement
Intelligence
Employers
Educators
Online Scammers
Research Community

What This Talk is Not

Mechanics of large-scale parallelized web crawling
Largest academic crawls: ∼ 10 M profiles
See Wilson et al. User Interactions in Social Networks and their Implications. EuroSys 2009.

II. Data Extraction Techniques

Public Listings
False Profiles
Malicious Applications
Phishing
Facebook API

1.) Public Listings

Not protected from crawling
Able to extract ∼ 500 k per day, desktop PC
Extract entire network in ∼ 500 machine-days

Get only 8 links per listing
Can still extract many useful features (Bonneau et al. 2009)
    High Degree Nodes
    Small Dominating Sets
    Highly Central Nodes
    Communities
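
A toy sketch can illustrate why high-degree nodes remain detectable when each listing exposes only 8 links. It assumes the 8 friends shown are a uniform random sample of the user's friend list, which the slides do not state. The graph, the function names, and the seed are hypothetical, and this is not the authors' crawler.

```python
import random
from collections import Counter

def sample_public_listings(friends, k=8, seed=0):
    """Return the undirected edges visible if every listing shows at most k friends.

    Assumption: the k friends shown are picked uniformly at random.
    """
    rng = random.Random(seed)
    visible = set()
    for user, friend_set in friends.items():
        shown = rng.sample(sorted(friend_set), min(k, len(friend_set)))
        for friend in shown:
            visible.add(frozenset((user, friend)))
    return visible

def observed_degrees(edges):
    """Count how many visible edges touch each user."""
    counts = Counter()
    for edge in edges:
        for user in edge:
            counts[user] += 1
    return counts

# Hypothetical toy graph: node 0 is a hub, nodes 1..n-1 also form a ring.
n = 50
friends = {u: set() for u in range(n)}

def add_edge(a, b):
    friends[a].add(b)
    friends[b].add(a)

for u in range(1, n):
    add_edge(0, u)                    # everyone is a friend of the hub
    add_edge(u, 1 + (u % (n - 1)))    # next node on the ring

edges = sample_public_listings(friends, k=8, seed=1)
degrees = observed_degrees(edges)
print(degrees.most_common(3))
```

The hub's own listing shows only 8 of its 49 friends, but each friend's listing also reveals the link back to the hub, so the hub still tops the observed-degree ranking in this toy case.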

2.) False Profiles

80% of users will befriend a frog (Krishnamurthy and Wills, 2008)
Can then crawl profiles with Friend-of-Friend Privacy

70-90% of users viewable within a sub-network
Regional networks being phased out

3.) Malicious Applications

3.) Top Applications

Application                         # Users
 1. How Well Do You Know Me?     28,074,528
 2. Causes                       25,508,174
 3. MyCalendar                   18,403,878
 4. We’re Related                16,860,948
 5. LivingSocial                 16,618,043
 6. Movies                       16,128,539
 7. RockYou Live                 14,931,229
 8. Texas HoldEm Poker           14,594,931
 9. Pet Society                  12,743,918
10. Mafia Wars                   12,694,729
11. MindJolt Games               12,346,549
12. Top Friends                  12,144,263
13. MyCalendar                   12,128,128
14. Slide FunSpace               11,088,636
15. Farm Town                    11,001,529

Source: InsideFacebook.com, 7/7/09

3.) Top Developers

Developer                           # Users
 1. Zynga                        54,778,127
 2. RockYou!                     37,783,778
 3. Playfish                     33,030,872
 4. How Well Do You Know Me?     28,074,528
 5. Slide, Inc.                  27,149,377
 6. Causes                       25,508,174
 7. MyCalendar                   18,403,878
 8. LivingSocial                 17,543,375
 9. FamilyLink.com               17,299,316
10. Flixster                     16,128,539
11. MindJolt                     12,346,549
12. My Calendar                  12,128,128
13. Slashkey                     11,001,529
14. 6 waves                      10,809,797
15. Zwigglers                    10,006,859

Source: InsideFacebook.com, 7/7/09

3.) Weekly Application Churn

Application                                     # Users
 1. MindJolt Games                           +2,444,470
 2. We’re Related                            +1,291,531
 3. Quizzer                                    +959,600
 4. Farm Town                                  +953,428
 5. Pet Society                                +840,296
 6. MyCalendar                                 +820,085
 7. What Type Of Girl Are you?                 +743,560
 8. FARKLE                                     +731,537
 9. Food Fling!                                +713,604
10. Music                                      +621,588
11. Barn Buddy                                 +600,105
12. What Era Should You Time Travel To?        +558,301
13. Texas HoldEm Poker                         +490,325
14. Cities I’ve Visited                        +488,831
15. Waka-Waka                                  +486,538

Source: InsideFacebook.com, 7/7/09

4.) Profile Compromise & Phishing

Email Phishing

Password Sharing

Facebook Connect

5.) Facebook Query Language

    SELECT uid, name, affiliations FROM user
    WHERE uid IN (X, Y, ... Z);

Step 1: Fetch Name/UID pairs
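
As a rough sketch of how Step 1 might be batched, the snippet below only builds the query text for consecutive blocks of UIDs. It does not send anything to Facebook. The helper names, the UID range, and the batch size of 1,000 (taken from the talk's remark that about 1,000 users can be queried at a time) are assumptions for illustration.

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]

def user_query(uids):
    """Build the Step 1 query text for one batch of numeric UIDs."""
    # int() guards against anything non-numeric ending up in the query text.
    id_list = ", ".join(str(int(uid)) for uid in uids)
    return f"SELECT uid, name, affiliations FROM user WHERE uid IN ({id_list});"

BATCH_SIZE = 1000                  # assumption: illustrative batch size
uids = list(range(1, 2501))        # hypothetical UID range
queries = [user_query(batch) for batch in chunked(uids, BATCH_SIZE)]

print(len(queries))
print(queries[0][:80])
```

Enumerating a numeric UID range works only if UIDs are dense enough to guess, which this sketch simply assumes.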

    SELECT uid1, uid2 FROM friend
    WHERE uid1 IN (X, Y, ... Z)
    AND uid2 IN (U, V, ... W);

Step 2: Fetch Friendships
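
Step 2 has to cover every pair of UID batches, since a friendship can link users in any two batches. The sketch below generates one query per unordered pair of batches, including each batch paired with itself. It assumes friendship is symmetric, so one query per unordered pair is enough. The helper names and the tiny batches are hypothetical.

```python
from itertools import combinations_with_replacement

def friend_query(batch_a, batch_b):
    """Build the Step 2 query text for one pair of UID batches."""
    a = ", ".join(str(int(uid)) for uid in batch_a)
    b = ", ".join(str(int(uid)) for uid in batch_b)
    return f"SELECT uid1, uid2 FROM friend WHERE uid1 IN ({a}) AND uid2 IN ({b});"

def friendship_queries(batches):
    """Yield one query per unordered pair of batches (a batch may pair with itself)."""
    for batch_a, batch_b in combinations_with_replacement(batches, 2):
        yield friend_query(batch_a, batch_b)

# Hypothetical batches of two UIDs each; real batches would hold about 1,000.
batches = [[1, 2], [3, 4], [5, 6], [7, 8]]
pair_queries = list(friendship_queries(batches))

print(len(pair_queries))
print(pair_queries[1])
```

With 4 batches this gives 6 cross-batch queries plus 4 within-batch queries, so the number of queries grows much faster than the number of batches.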

Can query sets of ∼ 1,000 users at a time
Can fetch all Name/UID pairs in ∼ 600 machine-days
Quadratic blowup in friendship queries:

$\binom{N/1{,}000}{2} \approx \binom{200{,}000}{2} \approx 2 \cdot 10^{10}$

Still, useful to fill in gaps from other methods
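
The estimate above can be checked with a few lines of arithmetic. The user count of 200 million is an assumption, chosen because it yields the 200,000 batches that appear in the formula (the talk quotes 225 M users elsewhere).

```python
from math import comb

users = 200_000_000     # assumption: round figure implied by 200,000 batches
batch_size = 1_000      # about 1,000 users per query, as stated in the talk

batches = users // batch_size
pair_queries = comb(batches, 2)    # unordered pairs of distinct batches

print(batches)
print(pair_queries)
print(f"{pair_queries:.1e}")

# Doubling the number of batches roughly quadruples the number of pair queries.
print(comb(2 * batches, 2) / pair_queries)
```

The count of batch pairs grows with the square of the number of batches, which is why exhaustive friendship queries are impractical for the whole network but workable for a small subset.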

III.) Simulation

How many nodes must be “compromised” to view a large portion of the network?
Assume all nodes have friends-only or friend-of-friend privacy
Test growth of node coverage and edge coverage
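
A minimal sketch of this kind of simulation follows. It is not the authors' code, and it rests on stated assumptions: a profile counts as visible if it lies within 1 hop (friends-only) or 2 hops (friend-of-friend) of a compromised node, and a link counts as visible if at least one endpoint's profile is visible. The random graph, its size, and the seeds are hypothetical.

```python
import random

def covered_nodes(friends, compromised, hops):
    """Return nodes within `hops` steps of any compromised node (inclusive)."""
    seen = set(compromised)
    frontier = set(compromised)
    for _ in range(hops):
        frontier = {v for u in frontier for v in friends[u]} - seen
        seen |= frontier
    return seen

def coverage(friends, compromised, hops):
    """Return (node coverage, edge coverage) as fractions of the whole graph."""
    nodes = covered_nodes(friends, compromised, hops)
    edges = {frozenset((u, v)) for u in friends for v in friends[u]}
    # Assumption: a visible profile exposes its full friend list.
    seen_edges = {edge for edge in edges if edge & nodes}
    return len(nodes) / len(friends), len(seen_edges) / len(edges)

def random_graph(n, avg_degree, seed=0):
    """Build a uniform random graph as a dict of neighbour sets."""
    rng = random.Random(seed)
    friends = {u: set() for u in range(n)}
    remaining = n * avg_degree // 2
    while remaining > 0:
        a, b = rng.sample(range(n), 2)
        if b not in friends[a]:
            friends[a].add(b)
            friends[b].add(a)
            remaining -= 1
    return friends

graph = random_graph(2000, 10, seed=1)
compromised = random.Random(2).sample(sorted(graph), 20)   # 1% chosen at random

for label, hops in (("friends-only", 1), ("friend-of-friend", 2)):
    node_cov, edge_cov = coverage(graph, compromised, hops)
    print(f"{label}: {node_cov:.1%} of nodes, {edge_cov:.1%} of links")
```

A uniform random graph lacks the hubs and clustering of a real friendship graph, so the figures it prints only illustrate the mechanics, not the talk's results.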

Data Set

Crawled ∼ 15,000 users from Stanford University
Used FQL method, took < 12 hours.

Experimental Results

[Figure: coverage plots; panels labeled Friends-Only and Friend-of-Friend, for Nodes and Links]

Experimental Results

                                         50% profiles   90% links
Targeted compromise, friend-only                0.16%       0.14%
Random compromise, friend-only                  0.71%       0.60%
Friend requests, friend-only                    50.0%       19.6%
Targeted compromise, friend-of-friend           0.01%       0.01%
Random compromise, friend-of-friend             0.04%       0.03%
Friend requests, friend-of-friend               0.16%       0.14%

Simulation Conclusions

Only need to compromise a small fraction of network
Initial gains very fast

Friend-of-friend makes discovery 10-20 times faster
Targeted compromise doesn’t help much
Phishing needs to be taken seriously...
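
One way to read the "10-20 times faster" figure is as the ratio between the friend-only and friend-of-friend percentages in the results table. The sketch below computes those ratios from the table values as transcribed. The dictionary layout and the function name are hypothetical, and treating the ratio as the speedup is an interpretation, not something the talk states.

```python
# Percent of accounts needed, copied from the results table:
# (to see 50% of profiles, to see 90% of links)
friend_only = {
    "Targeted compromise": (0.16, 0.14),
    "Random compromise": (0.71, 0.60),
    "Friend requests": (50.0, 19.6),
}
friend_of_friend = {
    "Targeted compromise": (0.01, 0.01),
    "Random compromise": (0.04, 0.03),
    "Friend requests": (0.16, 0.14),
}

def speedups(strategy):
    """Ratio of friend-only to friend-of-friend effort for one strategy."""
    pairs = zip(friend_only[strategy], friend_of_friend[strategy])
    return tuple(round(fo / fof, 1) for fo, fof in pairs)

for strategy in friend_only:
    profiles_ratio, links_ratio = speedups(strategy)
    print(f"{strategy}: profiles x{profiles_ratio}, links x{links_ratio}")
```

Under this reading the two compromise strategies land between about 14 and 20, while the friend-request rows give a much larger ratio.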

General Conclusions

Many ways to get data out of a modern SNS
Most users unaware of these methods
Data collection practical for many motivated parties

Thank You

Questions?
