Network-based and Attack-resilient Length Signature Generation for Zero-day Polymorphic Worms
Zhichun Li1, Lanjia Wang2, Yan Chen1 and Judy Fu3
1 Lab for Internet and Security Technology (LIST), Northwestern Univ.
2 Tsinghua University, China
3 Motorola Labs, USA
3
Limitations of Exploit Based Signature
[Figure: traffic filtering between the Internet and our network using the signature 10.*01. Sample payloads: 1010101, 10111101, 11111100, 00010111. Annotation: Polymorphism!]
Polymorphic worms might not have an exact exploit-based signature.
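As a small illustration of this limitation, the sketch below applies the slide's example signature 10.*01 to the four bit strings from the figure, treating them as payloads. Interpreting the signature as a Python regular expression, and the function name matches_exploit_signature, are assumptions made for illustration and are not part of the talk.

```python
import re

# The exploit-based signature from the slide, read as a regular expression
# (an assumption made for this sketch).
SIGNATURE = re.compile(r"10.*01")

def matches_exploit_signature(payload: str) -> bool:
    """Return True if the signature occurs anywhere in the payload."""
    return SIGNATURE.search(payload) is not None

# The four sample payloads shown in the figure.
payloads = ["1010101", "10111101", "11111100", "00010111"]
results = {p: matches_exploit_signature(p) for p in payloads}

for payload, hit in results.items():
    print(payload, "blocked" if hit else "passes the filter")
```

Two of the four payloads contain no byte pattern that the signature describes, so they pass the filter. This is the gap that a vulnerability-based signature aims to close.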
4
Vulnerability Signature
• Works for polymorphic worms
• Works for all worms that target the same vulnerability
[Figure: vulnerability-signature traffic filtering between the Internet and our network. Labels: Unknown Vulnerability. Annotation: Better!]
5
Benefits of Network Based Detection
• At the early stage of a worm outbreak, only a limited number of worm samples are available.
• Host-based sensors can cover only a limited IP space, which might cause scalability issues.
[Figure: gateway routers between the Internet and our network, compared with host-based detection. Annotation: Early Detection!]
6
Design Space and Related Work
• Most host-based approaches depend on a lot of host information, such as the source or binary code of the vulnerable program, the vulnerability condition, execution traces, etc.
Design space (signature type, deployment):
Exploit based, network based: [Polygraph-SSP05] [Hamsa-SSP06] [PADS-INFOCOM05] [CFG-RAID05] [Nemean-Security05]
Exploit based, host based: [DOCODA-CCS05] [TaintCheck-NDSS05]
Vulnerability based, network based: LESG (this paper)
Vulnerability based, host based: [Vulsig-SSP06] [Vigilante-SOSP05] [COVERS-CCS05] [ShieldGen-SSP07]
7
Outline
• Motivation and Related Work
• Design of LESG
• Problem Statement
• Three Stage Algorithm
• Attack Resilience Analysis
• Evaluation
• Conclusions
8
Basic Ideas
• At least 75% of vulnerabilities are due to buffer overflows
• A length-based signature is intrinsic to the buffer overflow vulnerability and hard to evade
• However, there could be thousands of fields, so selecting the optimal field set is hard
[Figure: a protocol message overflowing the vulnerable buffer. Annotation: Overflow!]
9
Framework
[Figure: system framework. Components: Network Tap, Known Worm Filter, Protocol Classifier (TCP 25, TCP 53, TCP 80, . . ., TCP 137, UDP 1434), Worm Flow Classifier, Normal Traffic Reservoir, Normal Traffic Pool, Suspicious Traffic Pool, LESG, Signatures. Annotations: Real time, Policy driven. References: ICDCS06, INFOCOM06, TON]
13
Length-based Signature Definition
Example protocol message fields: Name, Type, Class, TTL, RDlength, RDATA
• Length signature: a (field, length) pair, e.g. (Name,100)
• Signature set: {(Name,100), (Class,50), (RDATA,300)}, combined with an “OR” relationship
• Ground truth signature: (RDATA,315), the buffer length of the vulnerable RDATA field
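A minimal sketch of how such a signature set could be checked against a parsed message is given below. The slide does not state the comparison rule, so the strict "longer than the threshold" test is an assumption, as are the function name matches and the two sample messages.

```python
from typing import Dict, List, Tuple

LengthSignature = Tuple[str, int]  # (field name, length threshold)

def matches(field_lengths: Dict[str, int],
            signature_set: List[LengthSignature]) -> bool:
    """OR relationship: one matching (field, threshold) pair is enough."""
    # Assumption: a pair matches when the field is strictly longer
    # than its threshold.
    return any(field_lengths.get(field, 0) > limit
               for field, limit in signature_set)

# The signature set from the slide.
signature_set = [("Name", 100), ("Class", 50), ("RDATA", 300)]

# Hypothetical parsed messages, given as field name -> length in bytes.
benign = {"Name": 24, "Type": 2, "Class": 2, "TTL": 4, "RDlength": 2, "RDATA": 16}
worm = {"Name": 24, "Type": 2, "Class": 2, "TTL": 4, "RDlength": 2, "RDATA": 420}

print(matches(benign, signature_set))
print(matches(worm, signature_set))
```

Because the check looks only at field lengths, it does not depend on the byte content of the payload, which is what polymorphic worms vary.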
14
Problem Formulation
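[Figure: the suspicious pool and the normal pool are the inputs to LESG, which outputs a signature]
• Constraint: the worms in the suspicious pool that are not covered by the signature are at most a given bound
• Objective: minimize the false positives in the normal pool
• With noise, the problem is NP-hard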
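The two quantities in this formulation can be measured as in the sketch below: coverage is the matched fraction of the suspicious pool, and the false positive rate is the matched fraction of the normal pool. The helper match_rate, the strict "longer than" matching rule, and the toy pools are assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Message = Dict[str, int]           # field name -> field length
LengthSignature = Tuple[str, int]  # (field name, length threshold)

def match_rate(pool: List[Message],
               signature_set: List[LengthSignature]) -> float:
    """Fraction of messages in the pool matched by any signature in the set."""
    if not pool:
        return 0.0
    # Assumption: a field matches when it is strictly longer than the threshold.
    hits = sum(
        any(msg.get(field, 0) > limit for field, limit in signature_set)
        for msg in pool
    )
    return hits / len(pool)

signature_set = [("RDATA", 300)]

# Hypothetical pools. The suspicious pool holds three worm flows and
# one benign flow (noise).
suspicious_pool = [{"RDATA": n} for n in (400, 350, 500, 20)]
normal_pool = [{"RDATA": n} for n in (10, 20, 30, 310, 40)]

coverage = match_rate(suspicious_pool, signature_set)
false_positive = match_rate(normal_pool, signature_set)
print(coverage, false_positive)
```

The noise flow in the suspicious pool is why coverage stays below 100% even for a good signature, so the constraint is stated as a bound on uncovered flows.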
16
Stages I and II
Stage I: Field Filtering
• Keep fields with COV ≥ 1% and FP ≤ 0.1%
Stage II: Length Optimization
• Trade off between specificity and sensitivity
• Score function: Score(COV, FP)
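The two stages can be sketched together for a single field, as below. The slide does not define Score(COV, FP), so the score function here is a placeholder. The candidate-threshold enumeration, the strict "longer than" matching rule, and the toy pools are assumptions as well.

```python
from typing import Dict, List, Optional, Tuple

Message = Dict[str, int]  # field name -> field length

COV_MIN = 0.01  # Stage I threshold from the slide: COV >= 1%
FP_MAX = 0.001  # Stage I threshold from the slide: FP <= 0.1%

def rate(pool: List[Message], field: str, limit: int) -> float:
    """Fraction of messages whose field is strictly longer than limit."""
    if not pool:
        return 0.0
    return sum(m.get(field, 0) > limit for m in pool) / len(pool)

def score(cov: float, fp: float) -> float:
    """Placeholder score; the talk's Score(COV, FP) is not given on the slide."""
    return cov - 100.0 * fp

def best_length(field: str, suspicious: List[Message],
                normal: List[Message]) -> Optional[Tuple[int, float, float]]:
    """Return (threshold, COV, FP) with the best score, or None if filtered out."""
    best = None
    # Candidate thresholds: just below each length seen in the suspicious pool.
    limits = sorted({m[field] - 1 for m in suspicious if m.get(field, 0) > 0})
    for limit in limits:
        cov = rate(suspicious, field, limit)
        fp = rate(normal, field, limit)
        if cov < COV_MIN or fp > FP_MAX:
            continue  # Stage I: this threshold fails the filter
        if best is None or score(cov, fp) > score(best[1], best[2]):
            best = (limit, cov, fp)  # Stage II: keep the best-scoring length
    return best

# Hypothetical pools: three worm flows plus one noise flow, and 100 normal flows.
suspicious = [{"Name": 20, "RDATA": n} for n in (400, 420, 450, 30)]
normal = [{"Name": 20, "RDATA": n} for n in range(10, 110)]

print(best_length("RDATA", suspicious, normal))
print(best_length("Name", suspicious, normal))
```

In this toy data the Name field is filtered out because every threshold that covers the suspicious pool also matches normal traffic, while RDATA keeps the threshold that covers all three worm flows without false positives.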
17
Stage III
• Find the optimal set of fields as the signature, with high coverage and low false positives
• Separate the fields into two sets, FP=0 and FP>0
– Opportunistic step (FP=0)
– Attack resilience step (FP>0)
• A similar greedy algorithm is used for each step
18
Stage III (cont.)
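Pools: suspicious, normal
Fields: Name, Type, Class, TTL, Comments, RDATA
Candidate signatures [coverage, false positive]:
(Name,100) [40%, 0.03%]
(Class,50) [35%, 0.09%]
(RDATA,300) [50%, 0.05%]
(Comments,2000) [10%, 0.1%]
Stage I thresholds: COV0 ≥ 1%, FP0 ≤ 0.1%
Highlighted: 50% coverage, 0.05% false positive. Residual coverage threshold: ≥ 5%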
19
Stage III (cont.)
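Pools: suspicious, normal
Fields: Name, Type, Class, TTL, Comments, RDATA
Signature set so far: {(RDATA,300)}
Remaining candidates [residual coverage, false positive]:
(Class,50) [25%, 0.02%]
(Name,100) [3%, 0.08%]
(Comments,2000) [1%, 0.05%]
Stage I thresholds: COV0 ≥ 1%, FP0 ≤ 0.1%
Signature set: 50% coverage, 0.05% false positive. Residual coverage threshold: ≥ 5%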
20
Stage III (cont.)
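Pools: suspicious, normal
Fields: Name, Type, Class, TTL, Comments, RDATA
Signature set so far: {(RDATA,300),(Class,50)}
Remaining candidates [residual coverage, false positive]:
(Class,50) [25%, 0.02%]
(Name,100) [3%, 0.08%]
(Comments,2000) [1%, 0.05%]
Stage I thresholds: COV0 ≥ 1%, FP0 ≤ 0.1%
Signature set: (50+25)% coverage, (0.05+0.02)% false positive. Residual coverage threshold: γ ≥ 5%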
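The walk-through above can be sketched as a greedy loop over candidate signatures. The selection criterion (largest residual coverage first) and the function greedy_select are assumptions, and the sketch does not separate the FP=0 and FP>0 steps. The sample ID sets are hypothetical: they are chosen to reproduce the coverage figures on the slides, and the normal-pool sets are chosen only so that the final false positive rate is 0.07%.

```python
from typing import Dict, List, Set, Tuple

Signature = Tuple[str, int]  # (field name, length threshold)
GAMMA = 0.05                 # residual coverage threshold from the slides

def greedy_select(candidates: Dict[Signature, Tuple[Set[int], Set[int]]],
                  n_suspicious: int,
                  n_normal: int) -> Tuple[List[Signature], float, float]:
    """candidates maps a signature to (matched suspicious ids, matched normal ids)."""
    chosen: List[Signature] = []
    covered: Set[int] = set()
    false_pos: Set[int] = set()
    remaining = dict(candidates)
    while remaining:
        # Assumed criterion: take the candidate adding the most uncovered samples.
        best = max(remaining, key=lambda s: len(remaining[s][0] - covered))
        residual = len(remaining[best][0] - covered) / n_suspicious
        if residual < GAMMA:
            break  # no remaining candidate adds enough coverage
        sus, norm = remaining.pop(best)
        chosen.append(best)
        covered |= sus
        false_pos |= norm
    return chosen, len(covered) / n_suspicious, len(false_pos) / n_normal

# Hypothetical match sets over 100 suspicious and 10,000 normal samples.
candidates = {
    ("Name", 100): (set(range(13, 53)), {7, 8, 9}),               # COV 40%
    ("Class", 50): (set(range(40, 75)), {5, 6}),                  # COV 35%
    ("RDATA", 300): (set(range(0, 50)), {0, 1, 2, 3, 4}),         # COV 50%
    ("Comments", 2000): (set(range(41, 51)), set(range(10, 20))), # COV 10%
}

chosen, cov, fp = greedy_select(candidates, n_suspicious=100, n_normal=10_000)
print(chosen, cov, fp)
```

With these sets the loop picks (RDATA,300) first, then (Class,50) with 25% residual coverage, and stops because the best remaining residual coverage is below γ.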
21
Attack Resilience Bounds
• Depending on whether deliberate noise injection (DNI) exists, we get different bounds
• With 50% noise in the suspicious pool, we get the worst-case bounds FN<2% and FP<1%
• In practice, the DNI attack can only achieve FP<0.2%
• Resilient to most attacks proposed in other papers
23
Methodology
• Protocol parsing with Bro and BINPAC (IMC 2006)
• Worm workload
– Eight polymorphic worms created based on real-world vulnerabilities, including the CodeRed II and Lion worms
– DNS, SNMP, FTP, SMTP
• Normal traffic data
– 27GB from a university gateway and 123GB email log
24
Results
• Single/multiple worms with noise
– Noise ratio: 0~80%
– False negative: 0~1% (mostly 0)
– False positive: 0~0.01% (mostly 0)
• Pool size requirement
– 10 or 20 flows are enough even with 20% noise
• Speed results
– With 500 samples in the suspicious pool and 320K samples in the normal pool, for DNS: parsing takes 58 secs, LESG takes 18 secs
25
Conclusions
• A novel network-based automated worm signature generation approach
– Works for zero-day polymorphic worms with unknown vulnerabilities
– First work that is both vulnerability based and network based, using length signatures for buffer overflow vulnerabilities
– Provable attack resilience
– Shown to be fast and accurate in experiments