SURF: Detecting and Measuring Search Poisoning
Long Lu, Roberto Perdisci, and Wenke Lee
Georgia Tech and University of Georgia


Page 1: SURF:  Detecting and Measuring Search Poisoning

SURF: Detecting and Measuring Search Poisoning
Long Lu, Roberto Perdisci, and Wenke Lee
Georgia Tech and University of Georgia

Page 2: SURF:  Detecting and Measuring Search Poisoning

Search engines

SURF: Detecting and Measuring Search Poisoning, 18th ACM Conference on Computer and Communications Security

Page 3: SURF:  Detecting and Measuring Search Poisoning

SEO


• Optimizing website presentation to search crawlers
  – Emphasizing keyword relevance
  – Demonstrating popularity
• Black-hat SEO
  – Artificially inflating relevance
  – Dishonest but typically non-malicious

Page 4: SURF:  Detecting and Measuring Search Poisoning

Search poisoning


Page 5: SURF:  Detecting and Measuring Search Poisoning

Search poisoning
• Aggressively abusing SEO
  – Forging relevance
  – Employing link farms
  – Redirecting visitors
• Inadequate countermeasures
  – IR quality assurance
  – Designed for less adversarial scenarios
  – Robust solutions needed


Page 6: SURF:  Detecting and Measuring Search Poisoning

Malicious search user redirection

• Preserving poisoning infrastructure
• Filtering out detection traffic
• Enabling affiliate network


Page 7: SURF:  Detecting and Measuring Search Poisoning

Observations
• Analyzed 1,048 search poisoning cases
  – Ubiquitous cross-site redirections
  – Poisoning as a service
  – Variety in malicious applications
  – Persistence under transient appearances


Page 8: SURF:  Detecting and Measuring Search Poisoning

Goals
• Generality: not specific to malicious content hosted on the terminal page
• Robustness: cannot be trivially evaded by attackers
• Wide deployability: not dependent on proprietary data or a special environment

SURF (Search User Redirection Finder)

Page 9: SURF:  Detecting and Measuring Search Poisoning

SURF overview

[Diagram: SURF components and feature sources]
• Components: Instrumented Browser, Feature Extractor, SURF Classifier
• Feature sources: browser events, network info, search result

Page 10: SURF:  Detecting and Measuring Search Poisoning

SURF prototype
• Instrumented browser
  – Stripped IE with customizations (~1k SLOC in C#)
  – Listening and responding to rendering events
• Feature extractor
  – Offline execution to facilitate experiments
• SURF classifier
  – Weka's J48
  – Simple, efficient, and easily interpreted
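J48 is Weka's implementation of the C4.5 decision tree learner, which ranks candidate splits by gain ratio. The following is a minimal Python sketch of that criterion for a single numeric threshold, not Weka's code. The function names and the toy "cross-site hops" values are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(values, labels, threshold):
    """Gain ratio of the binary split `value <= threshold` vs. `value > threshold`."""
    left = [l for v, l in zip(values, labels) if v <= threshold]
    right = [l for v, l in zip(values, labels) if v > threshold]
    if not left or not right:
        return 0.0  # a split that separates nothing carries no information
    n = len(labels)
    remainder = len(left) / n * entropy(left) + len(right) / n * entropy(right)
    info_gain = entropy(labels) - remainder
    split_info = entropy(["L"] * len(left) + ["R"] * len(right))
    return info_gain / split_info


# Hypothetical toy data: cross-site hop counts and labels (1 = poisoned).
hops = [0, 0, 1, 3, 4, 5]
labels = [0, 0, 0, 1, 1, 1]
print(gain_ratio(hops, labels, threshold=1))  # 1.0 (the split is perfect)
```

A tree built this way stays readable as a set of threshold rules, which fits the "easily interpreted" point on the slide.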


Page 11: SURF:  Detecting and Measuring Search Poisoning

Detection features
• Redirection composition
  – Total redirection hops
  – Cross-site redirection hops
  – Redirection consistency
• Chained webpages
  – Landing-to-terminal distance
  – Page rendering errors
  – IP-to-name ratio
• Poisoning resistance
  – Keyword poisoning resistance
  – Search rank
  – Good rank confidence

Page 12: SURF:  Detecting and Measuring Search Poisoning

Detection features (1/3)
Redirection composition: total redirection hops, cross-site redirection hops, redirection consistency
• Regular vs. malicious search redirection
• Covering all types of redirections
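The two hop-count features can be illustrated with a minimal Python sketch. It assumes the redirection chain is already available as an ordered list of URLs (landing page first, terminal page last) and approximates a "site" by the last two hostname labels. The function names and that site heuristic are illustrative assumptions, not taken from the SURF prototype. Redirection consistency is left out because the slides do not define it.

```python
import ipaddress
from urllib.parse import urlsplit


def site_of(url):
    """Approximate the site of a URL (assumption: last two hostname labels).

    A production system would more likely consult a public-suffix list.
    """
    host = (urlsplit(url).hostname or "").lower()
    try:
        ipaddress.ip_address(host)
        return host  # an IP-literal host is treated as its own site
    except ValueError:
        pass
    labels = host.split(".")
    return ".".join(labels[-2:]) if len(labels) >= 2 else host


def redirection_hops(chain):
    """Return (total hops, cross-site hops) for an ordered URL chain."""
    pairs = list(zip(chain, chain[1:]))
    cross_site = sum(1 for src, dst in pairs if site_of(src) != site_of(dst))
    return len(pairs), cross_site


# Hypothetical chain, using reserved example domains.
chain = [
    "http://blog.example.com/post?q=trendy",  # landing page from the search result
    "http://example.com/go",                  # same-site hop
    "http://redirector.example.net/r?id=7",   # cross-site hop
    "http://shop.example.org/buy",            # cross-site hop to the terminal page
]
print(redirection_hops(chain))  # (3, 2)
```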

Page 13: SURF:  Detecting and Measuring Search Poisoning

Detection features (2/3)
Chained webpages: landing-to-terminal distance, page rendering errors, IP-to-name ratio
• Webpages involved in redirections
• Distance = min {geo_dist, org_dist}
• Premature termination on errors
• Unnamed malicious hosts
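Two of these features lend themselves to a short Python sketch. The slide gives only the formula Distance = min {geo_dist, org_dist}, so the sketch takes both distances as precomputed numbers. The IP-to-name ratio is interpreted here as the number of chained hosts addressed by IP literal divided by the number addressed by name. That reading, the function names, and the handling of the zero-name case are assumptions.

```python
import ipaddress
import math
from urllib.parse import urlsplit


def is_ip_literal(host):
    """True if the host string is an IPv4 or IPv6 address rather than a name."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def ip_to_name_ratio(urls):
    """Ratio of IP-addressed hosts to named hosts among the chained URLs."""
    hosts = [urlsplit(u).hostname or "" for u in urls]
    ip_count = sum(1 for h in hosts if h and is_ip_literal(h))
    name_count = sum(1 for h in hosts if h and not is_ip_literal(h))
    if name_count == 0:
        # Assumption: report infinity when every host is unnamed.
        return math.inf if ip_count else 0.0
    return ip_count / name_count


def landing_to_terminal_distance(geo_dist, org_dist):
    """Distance = min {geo_dist, org_dist}, as stated on the slide."""
    return min(geo_dist, org_dist)


# Hypothetical chain mixing named hosts and documentation-range IP addresses.
urls = [
    "http://example.com/a",
    "http://192.0.2.10/b",
    "http://198.51.100.7/c",
    "http://example.org/d",
]
print(ip_to_name_ratio(urls))                  # 1.0
print(landing_to_terminal_distance(0.8, 0.3))  # 0.3
```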

Page 14: SURF:  Detecting and Measuring Search Poisoning

Detection features (3/3)
Poisoning resistance: keyword poison resistance, search rank, good rank confidence
• Derived from search keyword and result
• Poison resistance
  – Difficulty of poisoning a keyword
  – Avg {PageRank of top 10 results}
• Good rank confidence
  – Poison resistance / search rank
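The two formulas on this slide translate directly into code. The sketch below assumes PageRank values for the top 10 results are already available as numbers and that search rank is 1-based. The function names and sample values are hypothetical.

```python
def poison_resistance(top10_pageranks):
    """Average PageRank of the top 10 results for a keyword."""
    if not top10_pageranks:
        raise ValueError("need at least one PageRank value")
    return sum(top10_pageranks) / len(top10_pageranks)


def good_rank_confidence(resistance, search_rank):
    """Poison resistance divided by the result's (1-based) search rank."""
    if search_rank < 1:
        raise ValueError("search rank is 1-based")
    return resistance / search_rank


# Hypothetical PageRank values for the top 10 results of one keyword.
pageranks = [6, 5, 5, 4, 4, 3, 3, 2, 2, 1]
resistance = poison_resistance(pageranks)
print(resistance)                           # 3.5
print(good_rank_confidence(resistance, 7))  # 0.5
```

Under this formula, a result ranked lower for the same keyword gets a smaller confidence value, and a keyword whose top results carry high PageRank gets a larger one.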

Page 15: SURF:  Detecting and Measuring Search Poisoning

Evaluation
• Semi-manually labeled datasets
  – 2,344 samples collected in Oct 2010
  – Labeling methods do not overlap with detection features

[Bar chart: negative vs. positive samples, y-axis 0 to 1,400; legend: Benign, Rogue pharmacy, Drive-by download, Fake AV]

Page 16: SURF:  Detecting and Measuring Search Poisoning

Evaluation
• Accuracy
  – 10-fold cross validation
  – On average, 99.1% TP, 0.9% FP
• Generality
  – Cross-category validation
  – Oblivious to on-page malicious content
• Robustness
  – Simulating compromised features
  – Evaluating accuracy degradation
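The accuracy numbers above come from 10-fold cross validation reported as true-positive and false-positive rates. As a rough illustration of those mechanics only, the Python sketch below splits sample indices into 10 folds with a plain shuffled (unstratified) partition and computes both rates from label lists. It is a simplification for illustration, not the procedure Weka runs, and all names are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Partition range(n_samples) into k disjoint folds after shuffling."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def tp_fp_rates(y_true, y_pred):
    """Return (true-positive rate, false-positive rate); 1 marks a poisoned sample."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    tpr = tp / (tp + fn) if tp + fn else 0.0
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return tpr, fpr


folds = k_fold_indices(2344)           # dataset size taken from the slides
print(len(folds), sum(len(f) for f in folds))  # 10 2344

# Each fold serves once as the test set; the other nine form the training set.
test_fold = folds[0]
train = [i for fold in folds[1:] for i in fold]

# Toy labels only, to show the rate computation.
print(tp_fp_rates([1, 1, 0, 0], [1, 0, 0, 1]))  # (0.5, 0.5)
```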


Page 17: SURF:  Detecting and Measuring Search Poisoning

Discussion
• Unselected features
  – Evadable or dependent on search-internal data
  – Domain reputation
• Deployment scenarios
  – Regular users, search engines, security vendors
  – Enabling community efforts


Page 18: SURF:  Detecting and Measuring Search Poisoning

Empirical measurements
• 7-month measurement study (2010-9 ~ 2011-4)
• 12 million search results analyzed
• On a daily basis:
  – Retrieve trendy keywords
  – Dispatch search jobs to SURF bots
  – Each SURF bot visits each search result and produces logs
  – Feature extraction and classification
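The daily steps above can be sketched as a small Python driver. Every function passed in is a hypothetical stand-in: the slides do not say how keywords are retrieved, how jobs are dispatched, or what the logs contain. The sketch runs sequentially rather than dispatching to separate bots, and it folds feature extraction and classification into one `classify` callable.

```python
def run_daily_measurement(fetch_keywords, search, visit, classify):
    """Run one day's pass: keywords -> search results -> visit logs -> labels."""
    records = []
    for keyword in fetch_keywords():
        for rank, url in enumerate(search(keyword), start=1):
            log = visit(url)  # stand-in for the instrumented-browser visit
            records.append({
                "keyword": keyword,
                "rank": rank,
                "url": url,
                "poisoned": classify(log),
            })
    return records


# Stub inputs for illustration; the threshold rule is NOT SURF's classifier.
records = run_daily_measurement(
    fetch_keywords=lambda: ["keyword-a"],
    search=lambda kw: ["http://example.com/1", "http://example.org/2"],
    visit=lambda url: {"url": url,
                       "cross_site_hops": 0 if "example.com" in url else 3},
    classify=lambda log: log["cross_site_hops"] >= 2,
)
print([r["poisoned"] for r in records])  # [False, True]
```

Passing the steps in as callables keeps the driver independent of any particular search engine, browser instrumentation, or classifier.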

Page 19: SURF:  Detecting and Measuring Search Poisoning

Empirical measurements
• 7-day window
  – Poisoning lag and poisoned volume
  – Avg. landing page lifetime: 1.7 days


Page 20: SURF:  Detecting and Measuring Search Poisoning

Empirical measurements
• 7-month window
  – More than 50% of trendy keywords poisoned


Page 21: SURF:  Detecting and Measuring Search Poisoning

Empirical measurements
• 7-month window
  – Unique landing domains observed per week


Page 22: SURF:  Detecting and Measuring Search Poisoning

Empirical measurements
• 7-month window
  – Terminal page variety survey

[Stacked chart: share of terminal page types (0% to 100%) per month, 2010-9 through 2011-3; legend: Unknown, Void Page, Click Fraud, Rogue Pharmacy, Scam (discount luxury), Scam (local service), Scam (free gift), Rogue Search Engine, Drive-by download, FakeAV]

Page 23: SURF:  Detecting and Measuring Search Poisoning

Conclusion
• In-depth study of search poisoning
• Design and evaluation of SURF
• Long-term measurement of search poisoning
