detecting adversaries using metafeatures

Chad Mills, Program Manager

Windows Live Safety Platform, Microsoft

Training Time: features → training → feature weights
Run Time: features + (feature, weight) pairs → score

Assumption:
◦ Spam words continue to appear in spam messages
◦ Good words continue to appear in good messages

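Example messages:
◦ million dollars transfer guardian
◦ March community social fellow

(feature, weight) pairs: (dollars, 0.2) (million, 0.1) (transfer, 0.1) (community, -0.01) (social, -0.01) (fellow, -0.01) (guardian, 0.03) (March, -0.08)

Scores: 0.37 (first message), -0.11 (second message)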
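The run-time scoring above can be sketched as a sum of per-word weights. This is a minimal Python sketch, not the production filter: the weights are copied from the (feature, weight) pairs listed above, while the function name `content_score` and the rule that unknown words contribute 0 are assumptions made for illustration.

```python
# Weights copied from the slide's (feature, weight) pairs.
WEIGHTS = {
    "dollars": 0.2, "million": 0.1, "transfer": 0.1, "guardian": 0.03,
    "community": -0.01, "social": -0.01, "fellow": -0.01, "March": -0.08,
}


def content_score(words, weights=WEIGHTS):
    """Sum the weights of the words in a message.

    Assumption: words without a learned weight contribute 0.
    """
    return sum(weights.get(word, 0.0) for word in words)


good_score = content_score(["March", "community", "social", "fellow"])
spam_score = content_score(["million", "dollars", "transfer", "guardian"])

print(round(good_score, 2))     # -0.11
print(spam_score > good_score)  # True
```

A higher score means the message looks more like spam; where to place the spam threshold is a separate tuning decision that the slides do not specify.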

From: "Chelsea Clark" <[email protected]>

Subject: Get Paid For your Opinion

<style>…
opens NRSU syringe />
Korean relations header greeting Airllines Phantom CVS Rae 504 1009 perf undertaking paced Liquidation reduction />…

Overall   Group of words
Good      newsletter peers month select these
Good      late click commissioner media
Good      smoothly off close support before
Good      okay sponsor rock go by ads
Good      none cases text membership

Good Message + Spammy Words (Free, Nigeria, Viagra) = Borderline Spam Message
Borderline Spam + Unknown Words (late, click, commissioner) = Inbox, so late, click, commissioner are Good Words
Borderline Spam + Unknown Words (newsletter, select, month) = Junk Folder, so newsletter, select, month are Non-Good Words

Chaff Spam: [spam content] newsletter peers month select these late click commissioner media smoothly off close support before okay sponsor rock go by ads none cases text membership

Legitimate Mail: March is all about the Zune community. This month, you can help create a new feature for The Social, get tips from a fellow Zune user and find out the winners of the Your Zune Your Choice Awards.
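To illustrate why chaff works against a filter that only sums word weights, here is a small Python sketch. Every number in it is hypothetical: the slides give no weights for these words and no threshold, so the values below are chosen only to show the dilution effect.

```python
# Hypothetical weights: positive for spammy words, negative for good words.
WEIGHTS = {"free": 0.3, "viagra": 0.4, "nigeria": 0.3}
CHAFF = "newsletter peers month select these late click commissioner media".split()
WEIGHTS.update({word: -0.1 for word in CHAFF})

THRESHOLD = 0.5  # hypothetical junk-folder cutoff


def content_score(words):
    """Sum of per-word weights; unknown words count as 0 (assumption)."""
    return sum(WEIGHTS.get(word, 0.0) for word in words)


spam_words = ["free", "viagra", "nigeria"]
plain_score = content_score(spam_words)            # spam content only
chaffed_score = content_score(spam_words + CHAFF)  # same spam plus good-word chaff

print(plain_score > THRESHOLD)    # True: caught
print(chaffed_score > THRESHOLD)  # False: slips through
```

The spam content is unchanged in both messages; only the appended good words pull the summed score under the cutoff.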

Sum of weights (content filter score)
Average weight
Standard Deviation
Percent of words that are good
Percent of words that are spam
Number of features
Maximum feature weight
Number of strong spam words
Etc.
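The metafeatures listed above can be computed from a message's per-word weights. The following Python sketch is one possible reading, not the actual implementation: it assumes a negative weight means a good word and a positive weight means a spam word, uses a hypothetical cutoff of 0.15 for "strong" spam words, and uses the population standard deviation because the slides do not say which variant was used.

```python
from statistics import mean, pstdev


def extract_metafeatures(weights, strong_spam=0.15):
    """Compute summary statistics over one message's feature weights.

    weights: non-empty list of per-word weights.
    strong_spam: hypothetical cutoff for a "strong" spam word.
    """
    n = len(weights)
    return {
        "sum": sum(weights),                      # content filter score
        "average": mean(weights),
        "std_dev": pstdev(weights),               # population std dev (assumption)
        "pct_good": sum(w < 0 for w in weights) / n,
        "pct_spam": sum(w > 0 for w in weights) / n,
        "num_features": n,
        "max_weight": max(weights),
        "num_strong_spam": sum(w >= strong_spam for w in weights),
    }


# Hypothetical chaffed message: two strong spam words plus four good words.
meta = extract_metafeatures([0.4, 0.3, -0.1, -0.1, -0.1, -0.1])
print(meta["num_features"], meta["max_weight"], meta["num_strong_spam"])  # 6 0.4 2
```

In this made-up message the sum is modest, but the maximum weight and the spread stay high, which is the kind of distribution a second classifier can pick up on.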

Training Time: features → training → feature weights; features + (feature, weight) pairs → metafeature extraction → Metafeatures → training → Metafeature weights
Run Time: features + (feature, weight) pairs → metafeature extraction → Metafeatures + (Metafeature, weight) pairs → score

Example messages:
◦ million dollars transfer guardian: Sum: 0.37, σ: 0.09, Max: 0.2
◦ March community social fellow: Sum: -0.11, σ: 0.04, Max: -0.1

(feature, weight) pairs: (dollars, 0.2) (million, 0.1) (transfer, 0.1) (community, -0.01) (social, -0.01) (fellow, -0.01) (guardian, 0.03) (March, -0.08)

Features → (feature, weight) → Metafeatures → (Metafeature, weight)
(Sum: 0.37, 1.0) (σ: 0.09, 0.8) (Max: 0.2, 0.1) → score 1.9
(Sum: -0.11, -0.8) (σ: 0.04, -0.6) (Max: -0.1, -0.3) → score -1.7
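The second-stage scores above are the sums of the metafeature weights for each message. A minimal Python sketch of that arithmetic follows. The pairs are copied from the slide; treating each (name, value) combination as a discrete lookup key is an assumption for illustration, since the slides do not say how metafeature values are mapped to weights.

```python
# (metafeature, weight) pairs copied from the slide's two example messages.
METAFEATURE_WEIGHTS = {
    ("sum", 0.37): 1.0, ("sigma", 0.09): 0.8, ("max", 0.2): 0.1,
    ("sum", -0.11): -0.8, ("sigma", 0.04): -0.6, ("max", -0.1): -0.3,
}


def metafeature_score(metafeatures):
    """Sum the weights of a message's metafeatures.

    Assumption: each (name, value) pair is looked up as a discrete feature,
    and pairs without a learned weight contribute 0.
    """
    return sum(
        METAFEATURE_WEIGHTS.get((name, value), 0.0)
        for name, value in metafeatures.items()
    )


first = metafeature_score({"sum": 0.37, "sigma": 0.09, "max": 0.2})
second = metafeature_score({"sum": -0.11, "sigma": 0.04, "max": -0.1})

print(round(first, 1))   # 1.9
print(round(second, 1))  # -1.7
```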

Hotmail Feedback Loop
◦ Messages classified by recipients

Training Set: 1,800,000 messages
◦ Ending on 5/20/07

Evaluation Set: 50,000 messages
◦ Data from 5/21/07

45% improvement in true positives (TP) at low false positive (FP) levels

At a reasonable False Positive rate:
◦ 98% of unique catches are chaff spam
◦ Caught 99.5% of chaff spam missed by regular content filter
◦ Similar types of False Positives as regular filter
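Results like these compare filters by their true positive rate at a fixed false positive rate. The Python sketch below shows one way such a number could be computed from scored, labeled messages. It is an illustration only: the function name, the threshold search, and the toy data are assumptions, not the evaluation code behind the figures above.

```python
def tp_rate_at_fp(scores, labels, max_fp_rate):
    """Best true positive rate among thresholds whose FP rate <= max_fp_rate.

    scores: filter scores, higher means more spam-like.
    labels: True for spam, False for good mail (both classes must be present).
    A message is flagged when its score is >= the threshold.
    """
    spam = [s for s, is_spam in zip(scores, labels) if is_spam]
    good = [s for s, is_spam in zip(scores, labels) if not is_spam]
    best = 0.0
    for threshold in sorted(set(scores)):
        fp_rate = sum(s >= threshold for s in good) / len(good)
        tp_rate = sum(s >= threshold for s in spam) / len(spam)
        if fp_rate <= max_fp_rate:
            best = max(best, tp_rate)
    return best


# Toy data: three spam messages followed by three good messages.
scores = [0.9, 0.8, 0.4, 0.7, 0.2, 0.1]
labels = [True, True, True, False, False, False]

print(tp_rate_at_fp(scores, labels, 0.0))   # two of three spam caught with no FPs
print(tp_rate_at_fp(scores, labels, 0.34))  # 1.0
```

Running the same function on the scores of two filters at the same `max_fp_rate` gives directly comparable TP rates.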

Challenges Remaining
◦ Primarily just helped on spam with chaff
◦ Relies on base content filter to detect spam with obfuscated content (e.g. v1agra) or naïve spam without any chaff

Spam messages with good word chaff have unnatural weight distributions

Metafeatures can identify and catch these messages

This resulted in a 45% improvement in TP

Gains were limited to spam with good word chaff
