Polygraph: Automatically Generating Signatures for Polymorphic Worms
James Newsome, Brad Karp, and Dawn Song
Carnegie Mellon University
Presented by Ryan Gates
Overview
Goal
Composition of a worm
Invariant bytes and Tokens
Types of signatures
◦ Conjunction
◦ Token Subsequence
◦ Bayes
Polygraph Signature Generator
Metrics
Results
Evaluation
Goal
Automate the generation of worm signatures
◦ Specifically polymorphic worms
Prevent polymorphic worms from going undetected
◦ Including perfectly polymorphic instances
Decomposition of a worm
Figure 1. Polymorphed ApacheKnacker
Invariant bytes
Wild card bytes
Code bytes
Invariant Bytes
Invariant framing
◦ Reserved keywords or well-known binary constants that are part of the wire protocol
◦ For example "HTTP" or "GET"
Invariant overwrite values
◦ High-order bytes of the overwritten address
◦ For example in BIND-TSIG "\xFF\xBF"
Many invariant substrings are too short to prevent false positives on their own.
The solution is to let each set of invariant bytes be represented by a token
Tokens
Tokens must not be a substring of another token
◦ For example HTTP not TTP
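The substring rule above can be illustrated with a small filter. This is only a sketch of the rule itself, not of how Polygraph extracts tokens; the function name `drop_contained_tokens` is hypothetical.

```python
def drop_contained_tokens(tokens):
    """Keep only tokens that are not a substring of another, different token."""
    unique = set(tokens)
    kept = [
        token
        for token in unique
        if not any(token != other and token in other for other in unique)
    ]
    # Sort so the result is deterministic.
    return sorted(kept)
```

With the slide's example, `drop_contained_tokens([b"HTTP", b"TTP", b"GET"])` keeps `HTTP` and `GET` and drops `TTP`, because `TTP` is contained in `HTTP`.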
Conjunction Signature
Token Subsequence Signature
Bayes Signature
◦ Each token value represents the probability of that token being present in an actual worm flow.
Conjunction Signatures
Every token in the conjunction signature must be found in the payload for there to be a match
All tokens are required to match, which reduces false positives
For example in the Apache-Knacker signature, 'GET', 'HTTP/1.1\r\n', and ':' are tokens in a conjunction signature
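Conjunction matching as described on this slide can be sketched in a few lines. The tokens are the ones listed above; the function name and the sample payloads are assumptions made for illustration.

```python
def matches_conjunction(payload, tokens):
    """Return True only if every token occurs somewhere in the payload."""
    return all(token in payload for token in tokens)


# Tokens from the Apache-Knacker example on the slide.
APACHE_KNACKER_TOKENS = [b"GET", b"HTTP/1.1\r\n", b":"]

# Hypothetical payloads, made up for this example.
request_with_all_tokens = b"GET /x HTTP/1.1\r\nHost: example\r\n\r\n"
request_missing_tokens = b"POST /x HTTP/1.0\r\n\r\n"
```

Order is ignored here: a payload matches as long as all three tokens appear anywhere in it.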
Token Subsequence Signatures
Similar to the conjunction signature, but more restrictive.
All tokens must be present in the correct order to reduce false positives
Typically modeled using Regular Expressions
For example in the Apache-Knacker signature, "GET.*HTTP/1.1\r\n.*…"
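One way to model an ordered token match with a regular expression is to escape each token and join the tokens with `.*`. This is a minimal sketch; the helper names are hypothetical, and the token list reuses the HTTP tokens mentioned on the slides.

```python
import re


def subsequence_pattern(tokens):
    """Build a regex that requires the tokens to appear in the given order."""
    # re.escape keeps characters such as '.' in 'HTTP/1.1' literal.
    # DOTALL lets '.*' span line breaks inside the payload.
    return re.compile(b".*".join(re.escape(token) for token in tokens), re.DOTALL)


def matches_subsequence(payload, tokens):
    """Return True if all tokens occur in the payload in the given order."""
    return subsequence_pattern(tokens).search(payload) is not None


HTTP_TOKENS = [b"GET", b"HTTP/1.1\r\n", b":"]
```

A payload that contains all three tokens in a different order would satisfy a conjunction signature but not this one, which is why the ordered form is the more restrictive of the two.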
Bayes Signature
Set of tokens, each with a score
If the sum of the scores of the tokens present in a flow exceeds a threshold, then the flow is considered a match.
A sample signature would include '\x00\x00\xFA': 1.7574
Benefits
◦ Less rigid, which helps prevent false positives for common tokens.
◦ Higher quality signatures with a more diverse suspicious pool.
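The scoring rule can be sketched as a sum over the tokens found in a payload. Only the `'\x00\x00\xFA': 1.7574` entry comes from the slide; the second score, the threshold, and the function names are hypothetical values chosen for illustration.

```python
def bayes_score(payload, token_scores):
    """Sum the scores of every signature token that occurs in the payload."""
    return sum(score for token, score in token_scores.items() if token in payload)


def matches_bayes(payload, token_scores, threshold):
    """A flow matches when its total score exceeds the threshold."""
    return bayes_score(payload, token_scores) > threshold


# First entry: sample from the slide. Second entry and threshold: hypothetical.
EXAMPLE_SCORES = {b"\x00\x00\xFA": 1.7574, b"\xFF\xBF": 2.0}
EXAMPLE_THRESHOLD = 3.0
```

Unlike the conjunction signature, a missing token only lowers the score instead of ruling out a match, which is what makes this signature type less rigid.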
Limitations of Signature Types
The Bayes signature is unaffected by noise until the noise grows beyond 80%. At that point there will be 100% false negatives.
◦ Noise this high means the flow classifier did a very poor job of classifying the flows.
Conjunction and Token Subsequence signatures cannot handle multiple types of worms
◦ The solution is to use clustering to separate the worms into manageable clusters
Clustering
Clustering helps the conjunction and token subsequence signatures deal with variety
Used to divide the suspicious flows into a number of different pools.
Divide the suspicious pool into several clusters, each containing similar types of flows
◦ Clusters should not be too general
◦ Clusters should not be too specific
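The slide does not say which clustering algorithm is used, so the following is only a greedy agglomerative sketch under stated assumptions: each flow is already reduced to a set of tokens, a cluster's signature is the conjunction of tokens shared by all its flows, and merging stops once the merged signature would match too many innocuous flows. All names and the `max_fp` parameter are hypothetical.

```python
from itertools import combinations


def conjunction_signature(cluster):
    """Tokens shared by every flow in the cluster (each flow is a frozenset)."""
    return frozenset.intersection(*cluster)


def false_positive_rate(signature, innocuous_pool):
    """Fraction of innocuous flows that contain every token of the signature."""
    if not signature:
        return 1.0  # an empty signature would match everything
    hits = sum(1 for flow in innocuous_pool if signature <= flow)
    return hits / len(innocuous_pool)


def cluster_flows(suspicious_pool, innocuous_pool, max_fp=0.0):
    """Greedily merge clusters while the merged signature stays specific enough."""
    clusters = [[frozenset(flow)] for flow in suspicious_pool]
    innocuous = [frozenset(flow) for flow in innocuous_pool]
    while len(clusters) > 1:
        best = None
        for i, j in combinations(range(len(clusters)), 2):
            merged = clusters[i] + clusters[j]
            fp = false_positive_rate(conjunction_signature(merged), innocuous)
            if best is None or fp < best[0]:
                best = (fp, i, j, merged)
        fp, i, j, merged = best
        if fp > max_fp:
            break  # any further merge would make the signature too general
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)] + [merged]
    return clusters
```

Starting from one cluster per flow guards against clusters that are too general, while merging whenever the false positive rate allows it guards against clusters that are too specific.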
Polygraph Signature Generator
The polygraph monitor must have access to the network's packet flow.
An imperfect flow classifier sorts packet flows into either the suspicious or innocuous pool.
The flow classifier does not distinguish between different worms; it only separates suspicious flows from innocuous flows.
The flow classifier is reliable but imperfect, so some flows are misclassified. The result is noise.
Uses samples to determine appropriate signatures for worms present in the suspicious flow pool.
Resilient to noise in the system
Metrics
Quality
◦ Low percentage of false positives and false negatives
Efficiency in generation
◦ Lower computational cost
Efficiency in matching
◦ Should not inhibit the network traffic
Generate small signature sets
◦ Limit the number of signatures
Robustness
◦ Yield high quality signatures even with noise and a variety of worms
◦ Resistance to clever evasion by worms
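The quality metric above comes down to two rates, which can be computed as in this sketch. It assumes a signature is represented as any function that returns True for a matching payload; the name `error_rates` and that representation are assumptions, not something the slides define.

```python
def error_rates(matcher, innocuous_flows, worm_flows):
    """Return (false positive rate, false negative rate) for a signature matcher."""
    # False positive: an innocuous flow that the signature matches.
    false_positives = sum(1 for flow in innocuous_flows if matcher(flow))
    # False negative: a worm flow that the signature fails to match.
    false_negatives = sum(1 for flow in worm_flows if not matcher(flow))
    return (
        false_positives / len(innocuous_flows),
        false_negatives / len(worm_flows),
    )
```

For instance, a matcher that only checks for `GET` will flag every innocuous HTTP GET request, which shows up directly as a high false positive rate.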
Results | ApacheKnacker
Table 1. ApacheKnacker signatures. These signatures were successfully generated for suspicious pools containing at least 3 worm samples.
The best performer was Token Subsequence
The ordering used in the Token Subsequence signature helps reduce the number of false positives.
Results | BIND-TSIG
Table 2. BIND-TSIG signatures. These signatures were successfully generated for suspicious pools containing at least 3 worm samples.
The best performers were Conjunction and Token Subsequence.
Bayes signature quality is degraded when the tokens are common in other innocuous flows.
Results | Coincidental Pattern
The Coincidental Pattern attack injects invariant bytes into the wildcard bytes to confuse the signature generator.
Contribution
Polygraph helps to automate signature generation
Examined the effects that polymorphism in worms could have on worm signature generation and matching.
Introduced imperfections in the classification of network flows
Limitations
Worms that lack invariant code
Requires a flow classifier and at least 3 worm samples
If the innocuous pool is too diverse, there will be too many false positives.
Improvements and Future Work
Take advantage of multiple cores.
Incorporate the design of an efficient flow classifier
Determine how feasible it is to inspect network traffic
Determine an algorithm to choose the best signature to use
References
J. Newsome, B. Karp, and D. Song. Polygraph: Automatically generating signatures for polymorphic worms. In IEEE Security and Privacy Symposium, 2005.