Temporal Correlations between Spam and Phishing Websites

Tyler Moore, Richard Clayton and Henry Stern

Center for Research on Computation and Society
Harvard University

USENIX LEET '09, Boston, MA
April 21, 2009

Outline

1 Data collection methodology
  - Motivation
  - Phishing website and spam data sources
  - Examining our datasets

2 Comparing website lifetimes and spam campaigns
  - Spam campaign duration
  - Phishing spam volume over time

3 Discussion
  - What metric of phishing harm is best?
  - Does phishing website take-down help?

Motivation

Removing impersonating content from the hosting website is a key phishing countermeasure
- Banks, or third-party take-down companies, collect 'feeds' of phishing URLs
- They verify the URLs in a feed, then issue take-down notices to the relevant ISPs and/or registrars
- Average phishing website lifetime (eCrime '07): 62 to 95 hours
- Long tail of long-lived phishing websites (7% live longer than 1 week)

Motivating question: do long-lived phishing websites matter?
- Our view: only if victims still visit the website
- Since spam drives victims to phishing websites, we must compare the timing of phishing spam to the time phishing websites are alive

Data sources

Phishing website lifetimes
- Amalgamate several feeds: PhishTank, APWG, one large brand owner, and two take-down companies (each a combination of outside feeds and proprietary collection)
- Automated testing system continuously queries sites until they stop responding or change

Phishing spam campaigns
- Subset of Ironport's spam corpus marked as phishing
- Primarily 3rd party spam traps, but also customer submissions
- Define spam campaign as all spam associated with a single phishing host
- Define spam campaign duration as the time difference between first and last email advertising phishing host
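The two definitions above (a campaign is all spam for one phishing host; its duration is the gap between the first and last email) can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `campaign_durations`, the `(host, timestamp)` record shape, and the sample records are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def campaign_durations(emails):
    """Group (host, sent_at) pairs by host and return {host: last - first}."""
    seen = defaultdict(list)
    for host, sent_at in emails:
        seen[host].append(sent_at)
    # Duration is the time difference between the first and last email per host.
    return {host: max(times) - min(times) for host, times in seen.items()}


# Hypothetical spam records, not taken from the study's data.
emails = [
    ("phish.example.com", datetime(2008, 9, 24, 8, 0)),
    ("phish.example.com", datetime(2008, 9, 26, 14, 0)),
    ("other.example.net", datetime(2008, 9, 25, 9, 30)),
]
durations = campaign_durations(emails)
for host, duration in durations.items():
    print(host, duration)
```

Under this definition a host that is seen in only one email has a campaign duration of zero.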

Types of phishing websites

Ordinary phishing website hosting
- Free webspace (http://www.bankname.freespacesitename.com/signin/)
- Compromised machine (http://www.example.com/~user/images/www.bankname.com/)

Fast-flux-hosted phishing websites
- Register many innocuous-sounding domains (lof80.info)
- Send out phishing email with URL http://www.volksbank.de.netw.oid3614061.lof80.info/vr
- Resolve domain to random selection of 5 or 10 botnet-infected machines, which proxy to a back-end server
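Because ordinary attacks are tracked per host and fast-flux attacks per registered domain, a URL has to be reduced to one of those two keys before it can be matched against spam. A rough sketch follows, using the fast-flux URL from this slide. The helper names are hypothetical, and taking the last two labels of the hostname is a simplifying assumption: it is wrong for suffixes such as co.uk, where a Public Suffix List lookup would be needed.

```python
from urllib.parse import urlsplit


def hostname(url):
    """Return the lower-cased hostname part of a URL."""
    return urlsplit(url).hostname


def naive_registered_domain(url):
    """Return the last two labels of the hostname.

    Assumption: a two-label heuristic for illustration only; real code
    should consult the Public Suffix List.
    """
    return ".".join(hostname(url).split(".")[-2:])


url = "http://www.volksbank.de.netw.oid3614061.lof80.info/vr"
print(hostname(url))
print(naive_registered_domain(url))
```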

Phishing datasets for our study

Phishing website dataset
- 12 693 phishing URLs for last week of September 2008
- Pares down to 4 084 ordinary websites and 120 fast-flux domains

Phishing spam dataset
- Checked phishing spam sent June to December 2008
- 430 ordinary phishing hosts appeared in spam list
- 103 fast-flux phishing domains appeared in spam list
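As a quick piece of arithmetic on the counts above, the share of websites that also show up in the spam list differs sharply between the two attack types. The variable and function names below are illustrative; the four counts are the ones stated on this slide.

```python
# Counts taken from the slide above.
ordinary_sites, ordinary_in_spam = 4084, 430
fastflux_domains, fastflux_in_spam = 120, 103


def share(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


ordinary_pct = share(ordinary_in_spam, ordinary_sites)
fastflux_pct = share(fastflux_in_spam, fastflux_domains)
print(f"ordinary hosts seen in spam: {ordinary_pct:.1f}%")
print(f"fast-flux domains seen in spam: {fastflux_pct:.1f}%")
```

Roughly one ordinary host in ten appears in the spam list, against more than eight in ten fast-flux domains.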

Questions we examine

What can we measure?
1 Spam campaign volume
2 Spam campaign duration
3 Phishing website lifetimes

How should we measure it?
1 On its own
2 Phishing website lifetimes: relative to the start/end of spam campaigns
3 Phishing spam campaigns: relative to first appearance/take-down time

Lifetimes of phishing websites and spam campaigns

             Website lifetime (hrs)     Spam duration (hrs)
             mean        median         mean        median
Ordinary       52          18            106           0
Fast-flux      97          21             97          28

CDF of phishing website and spam campaign lifetimes

[Figure: CDF of phishing host and spam lifetimes. x-axis: lifetime (hours), 0 to 4000; y-axis: proportion, 0.0 to 1.0. Series: phishing spam duration (ordinary), phishing spam duration (fast-flux), phishing host lifetime (ordinary), phishing host lifetime (fast-flux).]
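For readers who want to reproduce a curve of this kind, an empirical CDF gives, for each lifetime x, the proportion of observations that are at most x. The sketch below is an assumption about how such a plot could be computed, not the authors' tooling, and the lifetimes in it are invented.

```python
from bisect import bisect_right


def ecdf(values):
    """Return a function F where F(x) is the proportion of values <= x."""
    if not values:
        raise ValueError("need at least one observation")
    ordered = sorted(values)
    n = len(ordered)

    def F(x):
        # bisect_right counts how many sorted values are <= x.
        return bisect_right(ordered, x) / n

    return F


# Hypothetical lifetimes in hours, for illustration only.
lifetimes_hours = [2, 5, 18, 18, 40, 95, 170, 1200]
F = ecdf(lifetimes_hours)
print(F(18))   # proportion of lifetimes of at most 18 hours
print(F(168))  # proportion of lifetimes of at most one week
```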

Spam timing relative to phishing website uptime

[Figure, left graph: CDF of time difference between host detection and first spam. x-axis: # hours after phishing host is detected, -2000 to 2000; y-axis: proportion, 0.0 to 1.0. Series: ordinary phishing, fast-flux.]

[Figure, right graph: CDF of time difference between host removal and last spam. x-axis: # hours after phishing host is removed, -2000 to 2000; y-axis: proportion, 0.0 to 1.0. Series: ordinary phishing, fast-flux.]

0 is the time the host appears (left graph) or is removed (right graph)

Left graph: time of first spam in campaign

Right graph: time of last spam in campaign
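The quantity on each x-axis is a signed offset in hours: first spam relative to host detection (left graph) and last spam relative to host removal (right graph). A minimal sketch of that calculation, with hypothetical function names and invented timestamps:

```python
from datetime import datetime


def hours_between(later, earlier):
    """Signed difference later - earlier in hours."""
    return (later - earlier).total_seconds() / 3600.0


def spam_offsets(first_spam, last_spam, detected, removed):
    """Return (first spam vs detection, last spam vs removal) in hours.

    Negative values mean the spam was sent before the reference event.
    """
    return hours_between(first_spam, detected), hours_between(last_spam, removed)


# Hypothetical timestamps for one phishing host.
start_offset, end_offset = spam_offsets(
    first_spam=datetime(2008, 9, 24, 6, 0),
    last_spam=datetime(2008, 9, 27, 12, 0),
    detected=datetime(2008, 9, 24, 10, 0),
    removed=datetime(2008, 9, 26, 12, 0),
)
print(start_offset, end_offset)
```

In this invented case the first spam precedes detection by four hours and the last spam arrives a day after removal.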

Initial observations

Fast-flux hosting is more tightly correlated with spam transmission
- Most spam starts around the time the website appears
- Most spam stops around the time the website is removed

Ordinary phishing websites exhibit higher variance
- First spam may be sent well before or after the website appears
- 29% of spam campaigns send their final message more than a day before the website is removed
- 35% of spam campaigns send their final message more than a day after the website is removed
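Percentages like the 29% and 35% above come from counting campaigns whose final message falls more than a day either side of removal. One way this could be computed is sketched below; the function name and the offsets are made up for illustration, so the printed shares are not the study's figures.

```python
def share_outside_window(offsets_hours, window_hours=24.0):
    """Return (% of offsets earlier than -window, % later than +window)."""
    n = len(offsets_hours)
    before = sum(1 for h in offsets_hours if h < -window_hours)
    after = sum(1 for h in offsets_hours if h > window_hours)
    return 100.0 * before / n, 100.0 * after / n


# Hypothetical last-spam offsets in hours relative to website removal.
offsets = [-300.0, -30.0, -2.0, 0.5, 10.0, 26.0, 48.0, 500.0, -12.0, 3.0]
early_pct, late_pct = share_outside_window(offsets)
print(f"final message >1 day before removal: {early_pct:.0f}%")
print(f"final message >1 day after removal: {late_pct:.0f}%")
```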

Spam volume over time

[Figure: CDF of phishing spam volume over time. x-axis: days after first spam is sent, 0 to 60; y-axis: proportion of total spam volume, 0.0 to 1.0. Series: ordinary phishing, fast-flux.]

Spam volume relative to phishing website uptime

[Figure, left graph: CDF of spam volume relative to phishing website detection. x-axis: days after phishing host is detected, -60 to 40; y-axis: proportion of total spam volume, 0.0 to 1.0. Series: ordinary phishing, fast-flux.]

[Figure, right graph: CDF of spam volume relative to phishing website removal. x-axis: days after phishing host is removed, -60 to 40; y-axis: proportion of total spam volume, 0.0 to 1.0. Series: ordinary phishing, fast-flux.]

0 is the time the host appears (left graph) or is removed (right graph)

Both graphs: volume of spam sent by time t
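Unlike the earlier per-campaign CDFs, these curves weight every point by message volume: the value at t is the proportion of all spam sent by day t, measured from detection or removal. A small sketch under assumed inputs; the `(day, count)` pairs and the function name `volume_cdf` are hypothetical.

```python
def volume_cdf(observations, t):
    """Proportion of total spam volume sent at or before day t.

    observations: list of (days_relative_to_reference, message_count),
    where day 0 is the reference event (detection or removal).
    """
    total = sum(count for _, count in observations)
    sent = sum(count for day, count in observations if day <= t)
    return sent / total


# Hypothetical daily message counts around the reference event.
observations = [(-3, 10), (-1, 40), (0, 100), (1, 40), (5, 10)]
for t in (-1, 0, 5):
    print(t, volume_cdf(observations, t))
```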

Observations on spam volume

Most spam is sent around the time the website appears
- Almost all fast-flux spam appears on the day surrounding website appearance
- 16% of ordinary phishing spam is sent more than a day before detection, 3% more than a day after detection

Most spam stops once the website is removed
- 99.997% of fast-flux spam is sent prior to website removal
- 4% of spam advertising ordinary phishing websites is sent out after removal

How do you measure the impact of phishing?

             Websites           Website lifetime        Spam volume
             #         %        Hrs          %          #              %
Ordinary     4 084     97.0     20 602.7     68.0         978 693.1    32.0
Fast-flux      120      3.0      9 673.8     32.0       2 080 035.7    68.0
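The percentage columns for website lifetime and spam volume can be rechecked from the raw totals in the table. The snippet below does only that arithmetic; the dictionary layout and function name are illustrative choices.

```python
# Totals copied from the table above.
lifetime_hours = {"ordinary": 20602.7, "fast-flux": 9673.8}
spam_volume = {"ordinary": 978693.1, "fast-flux": 2080035.7}


def percentage_shares(totals):
    """Return each entry as a percentage of the column total, to one decimal."""
    grand_total = sum(totals.values())
    return {name: round(100.0 * value / grand_total, 1)
            for name, value in totals.items()}


print(percentage_shares(lifetime_hours))
print(percentage_shares(spam_volume))
```

The two metrics rank the attack types in opposite order: ordinary websites account for most of the lifetime hours, while fast-flux domains account for most of the spam volume.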

Do long-lived phishing websites matter?

Some phishing websites are very long-lived

Should we bother removing them?

If spam is still being sent for the websites, then users may still be at risk

Phishing website take-down does help!

[Figure: Phishing websites sending 'fresh' spam after detection. x-axis: days after phishing website first reported, 0 to 40; y-axis: % live phishing sites still sending spam, 0 to 100. Series: ordinary phishing, fast-flux.]
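One plausible reading of this metric, stated here as an assumption rather than the authors' exact definition: among websites still alive d days after first being reported, count the share whose campaign sends at least one message after day d. The function name and the sample sites are hypothetical.

```python
def pct_live_still_sending(sites, day):
    """Percentage of sites alive after `day` that still receive fresh spam.

    sites: list of (removed_day, last_spam_day), both measured in days
    after the website was first reported. Assumed definition: a site is
    live if removed_day > day, and sends fresh spam if last_spam_day > day.
    Returns None when no site is still live.
    """
    live = [site for site in sites if site[0] > day]
    if not live:
        return None
    fresh = sum(1 for _, last_spam_day in live if last_spam_day > day)
    return 100.0 * fresh / len(live)


# Hypothetical (removed_day, last_spam_day) pairs.
sites = [(1, 0), (3, 2), (10, 9), (20, 4), (40, 35)]
for day in (0, 15, 50):
    print(day, pct_live_still_sending(sites, day))
```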

Conclusion

We have brought together data on phishing website lifetimes and the spam campaigns which advertise them

Most spam is sent around the time when the website is alive

Phishing attacks using fast-flux techniques transmit spam more effectively

Phishing website take-down helps, because spam transmission continues until removal

http://people.seas.harvard.edu/~tmoore/
