Automated Worm Fingerprinting

Sumeet Singh, Cristian Estan, George Varghese, and Stefan Savage

Introduction

Problem: how to react quickly to worms?

- CodeRed (2001): infected ~360,000 hosts within 11 hours
- Sapphire/Slammer (376 bytes, 2003): infected ~75,000 hosts within 10 minutes

Existing Approaches

- Detection
  - Ad hoc intrusion detection
- Characterization
  - Manual signature extraction
    - Isolate and decompile a new worm
    - Look for and test unique signatures
    - Can take hours or days

Existing Approaches

- Containment
  - Updates to anti-virus and network filtering products

Earlybird

Automatically detect and contain new worms

- Two observations
  - Some portion of the content in existing worms is invariant
  - It is rare to see the same string recurring from many sources to many destinations

Earlybird

- Automatically extracted the signatures of all known worms
  - Also Blaster, MyDoom, and Kibuv.B, hours or days before any public signatures were distributed

Few false positives

Background and Related Work

- Slammer scanned almost all IP addresses in under 10 minutes
  - Limited only by bandwidth constraints

The SQL Slammer Worm: 30 Minutes After “Release”

- Infections doubled every 8.5 seconds
- Spread 100X faster than Code Red
- At peak, scanned 55 million hosts per second

Network Effects Of The SQL Slammer Worm

At the height of infections:
- Several ISPs noted significant bandwidth consumption at peering points
- Average packet loss approached 20%
- South Korea lost almost all Internet service for a period of time
- Financial ATMs were affected
- Some airline ticketing systems were overwhelmed

Signature-Based Methods

- Effective only if signatures can be generated quickly
  - For CodeRed, within 60 minutes
  - For Slammer, within 1 to 5 minutes

Worm Detection

Three classes of methods:
- Scan detection
- Honeypots
- Behavioral techniques

Scan Detection

- Look for unusual frequency and distribution of address scanning
- Limitations
  - Not suited to worms that spread in a non-random fashion (e.g. email)
  - Detects infected sites
  - Does not produce a signature

Honeypots

- Monitored idle hosts with untreated vulnerabilities
  - Used to isolate worms
- Limitations
  - Manual extraction of signatures
  - Depend on quick infections

Behavioral Detection

- Looks for unusual system call patterns
  - e.g. sending a packet from the same buffer that contains a received packet
- Can detect slow-moving worms
- Limitations
  - Needs application-specific knowledge
  - Cannot infer a large-scale outbreak

Characterization

Process of analyzing and identifying a new worm

- Current approaches
  - Use a priori vulnerability signatures
  - Automated signature extraction

Vulnerability Signatures

- Example: Slammer worm
  - UDP traffic on port 1434 that is longer than 100 bytes (buffer overflow)
- Can be deployed before the outbreak
- Can only be applied to well-known vulnerabilities
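A vulnerability signature of this kind is just a predicate over packet fields. The sketch below is a minimal illustration of the Slammer example; the function name and the way the packet is represented (protocol string, destination port, payload length) are assumptions made for the example.

```python
def matches_slammer_signature(protocol: str, dst_port: int, payload_len: int) -> bool:
    """A priori vulnerability signature from the slide:
    UDP traffic to port 1434 that is longer than 100 bytes."""
    return protocol.lower() == "udp" and dst_port == 1434 and payload_len > 100


# A 376-byte UDP packet to port 1434 matches; a short packet does not.
print(matches_slammer_signature("udp", 1434, 376))
print(matches_slammer_signature("udp", 1434, 60))
```

Such a rule can be written before any outbreak, but only once the vulnerability itself is known.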

Some Automated Signature Extraction Techniques

- Allow viruses to infect decoy programs
- Extract the modified regions of the decoy
- Use heuristics to identify invariant code strings across infected instances
- Limitation
  - Assumes the presence of a virus in a controlled environment

Some Automated Signature Extraction Techniques

- Honeycomb
  - Finds longest common subsequences among sets of strings found in messages
- Autograph
  - Uses network-level data to infer worm signatures
- Limitations
  - Scale and full distributed deployments

Containment

Mechanism used to deter the spread of an active worm:
- Host quarantine
  - Via IP ACLs on routers or firewalls
- String-matching
- Connection throttling
  - On all outgoing connections

Defining Worm Behavior

- Content invariance
  - Portions of a worm are invariant (e.g. the decryption routine)
- Content prevalence
  - Appears frequently on the network
- Address dispersion
  - Distribution of destination addresses is more uniform, to spread fast

Finding Worm Signatures

- Traffic pattern is sufficient for detecting worms
  - Relatively straightforward
  - Extract all possible substrings
  - Raise an alarm when
    - FrequencyCounter[substring] > threshold1
    - SourceCounter[substring] > threshold2
    - DestCounter[substring] > threshold3
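The alarm rule above can be sketched directly in Python. This is a naive illustration, not the EarlyBird implementation: the function name `find_suspicious`, the fixed substring length, and the threshold values are assumptions chosen for the example, and it keeps exact sets of addresses instead of compact counters.

```python
from collections import defaultdict


def find_suspicious(packets, length=4, t_freq=3, t_src=2, t_dst=2):
    """packets: iterable of (src, dst, payload) with payload as bytes.
    Returns the substrings that exceed all three thresholds."""
    frequency = defaultdict(int)   # FrequencyCounter[substring]
    sources = defaultdict(set)     # distinct sources per substring
    dests = defaultdict(set)       # distinct destinations per substring
    for src, dst, payload in packets:
        for i in range(len(payload) - length + 1):
            s = payload[i:i + length]
            frequency[s] += 1
            sources[s].add(src)
            dests[s].add(dst)
    return {
        s for s in frequency
        if frequency[s] > t_freq
        and len(sources[s]) > t_src
        and len(dests[s]) > t_dst
    }
```

A string sent by one host to many destinations (a mailing list, say) is frequent but not dispersed on the source side, so it does not raise an alarm here.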

Detecting Common Strings

- Cannot afford to detect all substrings
- Maybe can afford to detect all strings with a small fixed length




A horse is a horse, of course, of course

F_1 = (c_1 p^4 + c_2 p^3 + c_3 p^2 + c_4 p^1 + c_5) mod M

F_2 = (c_2 p^4 + c_3 p^3 + c_4 p^2 + c_5 p^1 + c_6) mod M
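The point of this form is that F_2 can be derived from F_1 in constant time: subtract the contribution of the oldest character, multiply by p, and add the new character. The sketch below is a simple polynomial rolling hash in that form; the values of `p` and `M` are assumptions for the example, not parameters taken from the slides.

```python
def fingerprints(data: bytes, beta: int = 5, p: int = 257, M: int = (1 << 61) - 1):
    """Yield the fingerprint of every beta-byte window of data, left to right."""
    if len(data) < beta:
        return
    top = pow(p, beta - 1, M)      # p^(beta-1) mod M: weight of the oldest byte
    f = 0
    for c in data[:beta]:          # Horner's rule computes F_1
        f = (f * p + c) % M
    yield f
    for k in range(beta, len(data)):
        # Drop the oldest byte, shift by p, append the newest byte.
        f = ((f - data[k - beta] * top) * p + data[k]) % M
        yield f


text = b"A horse is a horse, of course, of course"
fps = list(fingerprints(text))
print(len(fps))  # one fingerprint per 5-byte window
```

Each update costs a fixed number of multiplications and additions regardless of the window length.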

Too CPU-Intensive

- Each packet with a payload of s bytes has s-β+1 strings of length β
  - A packet with 1,000 bytes of payload needs 1,000-40+1 = 961 (roughly 960) fingerprints for a string length of 40
  - Still too expensive
- Prone to denial-of-service attacks

CPU Scaling

- Random sampling may miss many substrings
- Solution: value sampling
  - Track only certain substrings, e.g. those whose fingerprint has its last 6 bits equal to 0
  - P(not tracking a worm) = P(not tracking any of its substrings)

CPU Scaling

- Example
  - Track only substrings whose fingerprint has its last 6 bits equal to 0
  - String length = 40
  - A 1,000-character string has about 960 substrings, hence about 960 fingerprints: ‘11100…101010’ … ‘10110…000000’ …
  - Track only the ‘xxxxx…000000’ substrings
  - On average about 960 / 2^6 = 15 substrings of the string have a fingerprint that ends in six 0s
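Value sampling decides from the fingerprint itself whether a substring is tracked, so the same substring is always tracked or always ignored, whichever packet it appears in. The sketch below illustrates this; the use of a truncated SHA-256 as the fingerprint and the helper names are assumptions for the example.

```python
import hashlib


def window_fingerprint(window: bytes) -> int:
    # Stand-in fingerprint (assumption): first 8 bytes of SHA-256.
    return int.from_bytes(hashlib.sha256(window).digest()[:8], "big")


def value_sample(payload: bytes, beta: int = 40, bits: int = 6):
    """Return (offset, fingerprint) for windows whose fingerprint ends in `bits` zero bits."""
    mask = (1 << bits) - 1
    kept = []
    for i in range(len(payload) - beta + 1):
        fp = window_fingerprint(payload[i:i + beta])
        if fp & mask == 0:         # deterministic, content-based choice
            kept.append((i, fp))
    return kept
```

Because the choice depends only on content, two packets carrying the same invariant string select the same windows from it, which random per-packet sampling would not guarantee.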

CPU Scaling

- P(finding a 100-byte signature) = 55%
- P(finding a 200-byte signature) = 92%
- P(finding a 400-byte signature) = 99.64%
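These probabilities can be approximated with a simple model. The sketch below assumes that each of the x-β+1 substrings of an x-byte signature is sampled independently with probability f = 1/64 and that β = 40; both the independence assumption and the function name are assumptions of this example, not something stated on the slide.

```python
def p_detect(sig_len: int, beta: int = 40, f: float = 1 / 64) -> float:
    """Probability that at least one beta-byte substring of the signature is sampled."""
    n = sig_len - beta + 1           # number of substrings of length beta
    return 1.0 - (1.0 - f) ** n      # 1 - P(no substring is tracked)


for length in (100, 200, 400):
    print(length, round(p_detect(length), 4))
```

Under this model the 200-byte and 400-byte cases land close to the figures quoted above. The 100-byte case comes out somewhat higher than 55%, so the quoted figure presumably rests on slightly different assumptions.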

Estimating Content Prevalence

Find the packet payloads that appear at least x times among the N packets sent during a given interval

Estimating Content Prevalence

- Table[payload]
  - 1 GB table filled in 10 seconds
- Table[hash[payload]]
  - 1 GB table filled in 4 minutes
  - Tracking millions of ants to track a few elephants
  - Collisions... false positives

[Singh et al. 2002]

Multistage Filters

[Figure sequence: each packet in the stream is hashed, e.g. Hash(Pink), Hash(Green), into an array of counters; when its counter has reached the threshold, the packet is inserted into packet memory]

- Collisions are OK
- Multiple stages (Stage 1, Stage 2)
- No false negatives! (guaranteed detection)

Conservative Updates

[Figure: gray = all prior packets; some counter increments are marked "Redundant"]
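A multistage filter with conservative update can be sketched as follows. This is a minimal illustration under stated assumptions: the class name, the stage and bin counts, the use of salted BLAKE2b as the per-stage hash, and the choice to report a key once its smallest counter reaches the threshold are all choices made for the example.

```python
import hashlib


class MultistageFilter:
    def __init__(self, stages: int = 4, bins: int = 1024, threshold: int = 3):
        self.stages, self.bins, self.threshold = stages, bins, threshold
        self.counters = [[0] * bins for _ in range(stages)]

    def _index(self, stage: int, key: bytes) -> int:
        # One independent hash per stage (assumption: BLAKE2b salted by stage number).
        h = hashlib.blake2b(key, digest_size=8, salt=bytes([stage])).digest()
        return int.from_bytes(h, "big") % self.bins

    def add(self, key: bytes) -> bool:
        """Count one occurrence; return True once every stage has reached the threshold."""
        idx = [self._index(s, key) for s in range(self.stages)]
        target = min(self.counters[s][i] for s, i in enumerate(idx)) + 1
        for s, i in enumerate(idx):
            # Conservative update: only raise counters that are below the new
            # minimum; incrementing the larger ones would be redundant.
            if self.counters[s][i] < target:
                self.counters[s][i] = target
        return target >= self.threshold


f = MultistageFilter(threshold=3)
print([f.add(b"worm-substring") for _ in range(3)])
```

Collisions can only inflate a counter, never deflate it, so the smallest counter of a key is never below its true count. That is why a key that really occurs often enough is always detected.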


Estimating Address Dispersion

- Not sufficient to count the number of source and destination pairs
  - e.g. sending a mail to a mailing list
    - Two sources: the mail server and the sender
    - Many destinations
- Need to count the unique source and destination traffic flows
  - For each substring

[Estan et al. 2003]

Bitmap counting – direct bitmap

- Set bits in the bitmap using the hash of the flow ID of incoming packets
- Different flows have different hash values
- Packets from the same flow always hash to the same bit
- Collisions OK, estimates compensate for them
- As the bitmap fills up, estimates get inaccurate
- Solution: use more bits
- Problem: memory scales with the number of flows

[Figure: flows hashed into a bitmap, e.g. HASH(green)=10001001, HASH(blue)=00100100, HASH(violet)=10010101, HASH(orange)=11110011, HASH(pink)=11100000, HASH(yellow)=01100011]
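A direct bitmap can be sketched in a few lines. The estimate used here is the standard one for this structure: with b bits of which z are still zero, the number of distinct flows is about b·ln(b/z), which is how collisions are compensated for. The class name and the SHA-256 based hash are assumptions for the example.

```python
import hashlib
import math


class DirectBitmap:
    def __init__(self, nbits: int = 1024):
        self.nbits = nbits
        self.bits = [False] * nbits

    def add(self, flow_id: bytes) -> None:
        # Packets from the same flow always set the same bit.
        h = int.from_bytes(hashlib.sha256(flow_id).digest()[:8], "big")
        self.bits[h % self.nbits] = True

    def estimate(self) -> float:
        zeros = self.bits.count(False)
        if zeros == 0:
            return float("inf")      # bitmap saturated: no usable estimate
        return self.nbits * math.log(self.nbits / zeros)


bm = DirectBitmap(1024)
for i in range(200):
    for _ in range(5):               # repeated packets of a flow change nothing
        bm.add(f"flow-{i}".encode())
print(round(bm.estimate()))
```

As the fraction of zero bits shrinks, the logarithm becomes very sensitive to each remaining zero, which is the inaccuracy noted above for a bitmap that is filling up.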

Bitmap counting – virtual bitmap

- Solution: a) store only a portion of the bitmap, b) multiply the estimate by a scaling factor
- Problem: estimate is inaccurate when few flows are active

[Figure: example flow hashes HASH(blue)=00100100 and HASH(pink)=11100000]
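The two steps of the virtual bitmap can be sketched as below: flows are hashed into a large virtual range, only the first `nbits` positions are actually stored, and the direct-bitmap estimate is multiplied by the ratio between the two. The class name, the `scale` parameter, and the SHA-256 based hash are assumptions for the example.

```python
import hashlib
import math


class VirtualBitmap:
    def __init__(self, nbits: int = 1024, scale: int = 8):
        self.nbits = nbits               # bits actually stored
        self.scale = scale               # virtual bitmap is scale times larger
        self.bits = [False] * nbits

    def add(self, flow_id: bytes) -> None:
        h = int.from_bytes(hashlib.sha256(flow_id).digest()[:8], "big")
        pos = h % (self.nbits * self.scale)
        if pos < self.nbits:             # a) store only a portion of the bitmap
            self.bits[pos] = True

    def estimate(self) -> float:
        zeros = self.bits.count(False)
        if zeros == 0:
            return float("inf")
        # b) multiply the direct-bitmap estimate by the scaling factor
        return self.scale * self.nbits * math.log(self.nbits / zeros)


vb = VirtualBitmap(1024, 8)
for i in range(4000):
    vb.add(f"flow-{i}".encode())
print(round(vb.estimate()))
```

With only a handful of active flows, few or none of them fall into the stored portion, so the scaled estimate is dominated by sampling noise.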

Bitmap counting – multiple bitmaps

- Solution: use many bitmaps, each accurate for a different range
- Use the bitmap that is accurate for the current range to estimate the number of flows

Bitmap counting – multiresolution bitmap

- Problem: must update up to three bitmaps per packet
- Solution: combine the bitmaps into one (OR)

[Figure: example flow hash HASH(pink)=11100000]

Putting It Together

[Flow diagram]

- Packet: header | payload
- Compute substring fingerprints over the payload; each fingerprint is a key
- AD entry exists for the key?
  - Yes: update counters (src cnt, dest cnt)
    - counters > dispersion threshold? Report key as suspicious worm
  - Else: update counter (cnt) in the prevalence table
    - cnt > prevalence threshold? Create AD entry
- Content Prevalence Table (multistage filters): key, cnt
- Address Dispersion Table (scalable counters): key, src cnt, dest cnt

Putting It Together

- Sample frequency: 1/64
- String length: 40
- Use 4 hash functions to update the prevalence table
  - Multistage filter reset every 60 seconds
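The per-packet flow can be sketched end to end. This is a simplified illustration, not the EarlyBird code: a plain dictionary stands in for the multistage filter, exact sets stand in for the scalable counters, the fingerprint is a truncated SHA-256, periodic resets are omitted, and the class and constant names are assumptions. Seeding a new AD entry with the current packet's addresses is also a choice made for this sketch.

```python
import hashlib
from collections import defaultdict

BETA = 40                  # substring length
SAMPLE_MASK = (1 << 6) - 1 # keep fingerprints whose last 6 bits are 0 (1/64)
PREVALENCE_T = 3           # prevalence threshold
DISPERSION_T = 30          # distinct sources and destinations required


def fingerprint(window: bytes) -> int:
    # Stand-in fingerprint (assumption): first 8 bytes of SHA-256.
    return int.from_bytes(hashlib.sha256(window).digest()[:8], "big")


class Detector:
    def __init__(self):
        self.prevalence = defaultdict(int)  # stand-in for the multistage filter
        self.dispersion = {}                # key -> (set of sources, set of dests)
        self.reported = set()

    def process(self, src, dst, payload: bytes):
        """Handle one packet; return the keys newly reported as suspicious."""
        alerts = []
        for i in range(len(payload) - BETA + 1):
            key = fingerprint(payload[i:i + BETA])
            if key & SAMPLE_MASK:
                continue                    # not a sampled substring
            entry = self.dispersion.get(key)
            if entry is None:
                self.prevalence[key] += 1
                if self.prevalence[key] <= PREVALENCE_T:
                    continue
                entry = self.dispersion[key] = (set(), set())  # create AD entry
            sources, dests = entry
            sources.add(src)
            dests.add(dst)
            if (len(sources) > DISPERSION_T and len(dests) > DISPERSION_T
                    and key not in self.reported):
                self.reported.add(key)
                alerts.append(key)
        return alerts
```

Content that is repeated many times between the same two hosts passes the prevalence test but never the dispersion test, so it is not reported.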

System Design

Two major components:
- Sensors
  - Sift through traffic for a given address space
  - Report signatures
- An aggregator
  - Coordinates real-time updates
  - Distributes signatures

Implementation and Environment

- Written in C and MySQL (5,000 lines)
- Prototype executes on a 1.6 GHz AMD Opteron 242 1U server
  - Linux 2.6 kernel

EarlyBird Performance

- Processes 1 TB of traffic per day
- 200 Mbps of continuous traffic
- Can be pipelined and parallelized to achieve 40 Gbps

Parameter Tuning

- Prevalence threshold: 3
  - Very few signatures repeat
- Address dispersion threshold
  - 30 sources and 30 destinations
  - Reset every few hours
  - Reduces the number of reported signatures down to ~25,000

Parameter Tuning

- Tradeoff between speed and accuracy
  - Can detect Slammer in 1 second as opposed to 5 seconds
    - With 100x more reported signatures

Memory Consumption

- Prevalence table
  - 4 stages
    - Each with ~500,000 bins (8 bits/bin)
    - 2 MB total
- Address dispersion table
  - 25K entries (28 bytes each)
  - < 1 MB
- Total: < 4 MB
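The memory figures follow from simple arithmetic, shown here as a quick check. The variable names are made up for the example and 25K is taken as 25,000.

```python
stages = 4
bins_per_stage = 500_000
bits_per_bin = 8
prevalence_bytes = stages * bins_per_stage * bits_per_bin // 8   # bytes for all stages

ad_entries = 25_000
bytes_per_entry = 28
dispersion_bytes = ad_entries * bytes_per_entry

total_bytes = prevalence_bytes + dispersion_bytes
print(prevalence_bytes, dispersion_bytes, total_bytes)
```

The two tables together come to well under the 4 MB total quoted above.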

Trace-Based Verification

Two main sources of false positives:
- 2,000 common protocol headers
  - e.g. HTTP, SMTP
  - Whitelisted
- Spam e-mails
- BitTorrent
  - Many-to-many download

False Negatives

- So far none
  - Detected every worm outbreak

Inter-Packet Signatures

- An attacker might evade detection by splitting an invariant string across packets
- With 7 MB of extra memory, EarlyBird can keep per-flow state and fingerprint across packets
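One way to fingerprint across packet boundaries is to remember the last β-1 bytes of each flow and prepend them to the next payload of that flow. The sketch below illustrates the idea only; the class name and the dictionary keyed by flow ID are assumptions, and it ignores reordering, retransmission, and state expiry.

```python
class FlowStitcher:
    def __init__(self, beta: int = 40):
        self.beta = beta
        self.tails = {}                  # flow_id -> last beta-1 bytes seen

    def feed(self, flow_id, payload: bytes):
        """Return every beta-byte window that ends in this payload,
        including windows that start in earlier packets of the flow."""
        data = self.tails.get(flow_id, b"") + payload
        windows = [data[i:i + self.beta]
                   for i in range(len(data) - self.beta + 1)]
        keep = self.beta - 1
        self.tails[flow_id] = data[-keep:] if keep > 0 else b""
        return windows


st = FlowStitcher(beta=4)
print(st.feed("flow-1", b"ab"))      # too short: no window yet
print(st.feed("flow-1", b"cdE"))     # windows now span both packets
```

Since only β-1 bytes are kept per flow, every returned window contains at least one new byte, so no window is produced twice.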

Live Experience with EarlyBird

- Detected precise signatures
  - CodeRed variants
  - MyDoom mail worm
  - Sasser
  - Kibvu.B

Conclusions

- EarlyBird is a promising approach
  - To detect unknown worms in real time
  - To extract signatures automatically
  - To detect spam with minor changes
- Wire-speed signature learning is viable
