Is Sampled Data Sufficient for Anomaly Detection
Ip Wing Chung Peter (05133660)
Ngan Sze Chung (05928650)
Abstract
Traffic measurement in networks is important for
  Network management
  Anomaly detection for security analysis
Inspect the full packet trace?
  The most accurate
  Consumes network resources
  Affects normal traffic
[Figure: a monitor sampling a point-to-point link between Router A and Router B]
Abstract
Sampling techniques
  Conserve network resources
  How many samples?
Sampling techniques vs. anomaly detection algorithms
Abstract
Introduction
Background and Methods
Impact of Sampling on Volume Anomaly Detection
Impact of Sampling on Portscan Detection
Conclusion and Future Work
Introduction
Aim
  To study the impact of sampling on anomaly detection
Objectives
  To study 4 existing sampling techniques
  To study 3 common anomaly detection algorithms
  To simulate the result by feeding the sampled data to the algorithms to detect the anomalies
  To evaluate the impact of sampling on the anomaly detection algorithms
Background and Methods
Sampling
Volume Anomaly Detection
Portscan Detection
Trace Data
Methodology
Sampling
Random packet sampling
  Sample each packet with a small probability r < 1
  Classify sampled packets into flows based on source/destination IP address and port, and protocol
  A flow is terminated by a timeout (1 min) or by explicit TCP semantics (FIN)
Sampling
Random packet sampling
  Simple to implement
  Low CPU power and memory requirements
  Inaccurate for flow statistics
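A minimal sketch of random packet sampling followed by flow classification, as described on the two slides above. The packet representation (a dict with hypothetical keys ts, src, dst, sport, dport, proto, size, fin) and the function names are assumptions made for illustration only.

```python
import random

def sample_packets(packets, r, seed=0):
    """Keep each packet independently with probability r."""
    rng = random.Random(seed)
    return [pkt for pkt in packets if rng.random() < r]

def classify_into_flows(packets, timeout=60.0):
    """Group packets into flows by 5-tuple.

    A flow ends on a FIN packet or when the gap since its last
    packet exceeds `timeout` seconds (1 min on the slide).
    Returns a list of (key, packet_count, byte_count).
    """
    open_flows = {}   # 5-tuple -> [packet_count, byte_count, last_seen]
    finished = []
    for pkt in sorted(packets, key=lambda p: p["ts"]):
        key = (pkt["src"], pkt["dst"], pkt["sport"], pkt["dport"], pkt["proto"])
        flow = open_flows.get(key)
        if flow is not None and pkt["ts"] - flow[2] > timeout:
            finished.append((key, flow[0], flow[1]))   # timed out
            flow = None
        if flow is None:
            flow = open_flows[key] = [0, 0, pkt["ts"]]
        flow[0] += 1
        flow[1] += pkt["size"]
        flow[2] = pkt["ts"]
        if pkt.get("fin"):                             # explicit TCP termination
            finished.append((key, flow[0], flow[1]))
            del open_flows[key]
    # Flows still open at the end of the trace are reported as they are.
    finished.extend((k, f[0], f[1]) for k, f in open_flows.items())
    return finished
```

Running classify_into_flows(sample_packets(trace, r)) gives the flow records a detector would see under packet sampling; the flow sizes are those of the sampled packets only, which is why flow statistics become inaccurate.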
Sampling
Random flow sampling
  Sample each flow with a small probability p < 1
  Improves accuracy for flow statistics
  Classifies packets into flows first
  Prohibitive memory and CPU power requirements
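A rough streaming sketch of random flow sampling. Each packet is represented only by its flow key, and the per-flow decision table is an illustrative assumption, not something the slides specify. It shows why the scheme is costly: state is kept for every flow seen, whether or not the flow is sampled.

```python
import random

def random_flow_sampling(packet_keys, p, seed=0):
    """Sample whole flows with probability p.

    packet_keys: iterable of flow keys, one per packet.
    Returns (sampled, n_tracked) where `sampled` maps each sampled
    flow key to its full packet count and `n_tracked` is the number
    of flows for which state had to be kept.
    """
    rng = random.Random(seed)
    decision = {}   # flow key -> True/False, one entry per flow seen
    sampled = {}    # flow key -> packet count (sampled flows only)
    for key in packet_keys:
        if key not in decision:
            decision[key] = rng.random() < p   # decide once per flow
        if decision[key]:
            sampled[key] = sampled.get(key, 0) + 1
    return sampled, len(decision)
```

Because the decision is made once per flow, every sampled flow keeps all of its packets, so there is no distortion of flow sizes. Flow timeouts are ignored here for brevity.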
Sampling
Smart sampling
  Sample a flow of size x with probability p(x)
  p(x) is determined by a threshold z (e.g. z = 40000)
  Biased towards large flows
Flow 1, 40 bytes
Flow 2, 15580 bytes
Flow 3, 8196 bytes
Flow 4, 5350789 bytes
Flow 5, 532 bytes
Flow 6, 4000 bytes
sample with 100% probability
sample with 0.1% probability
sample with 10% probability
Where z is a threshold that trades off accuracy
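The slide does not spell out the form of p(x). The sketch below assumes p(x) = min(1, x / z), which is consistent with the three probabilities listed above for z = 40000 (40 bytes gives 0.1%, 4000 bytes gives 10%, and any flow of at least z bytes gives 100%). The function names are illustrative.

```python
import random

def smart_sampling_probability(size_bytes, z):
    """Assumed form: p(x) = min(1, x / z)."""
    return min(1.0, size_bytes / z)

def smart_sample(flow_sizes, z, seed=0):
    """Return the flows kept by one pass of smart sampling.

    flow_sizes: dict mapping flow name -> size in bytes.
    """
    rng = random.Random(seed)
    return {
        name: size
        for name, size in flow_sizes.items()
        if rng.random() < smart_sampling_probability(size, z)
    }

# The six flows from the slide, with z = 40000.
slide_flows = {
    "Flow 1": 40, "Flow 2": 15580, "Flow 3": 8196,
    "Flow 4": 5350789, "Flow 5": 532, "Flow 6": 4000,
}
probabilities = {
    name: smart_sampling_probability(size, 40000)
    for name, size in slide_flows.items()
}
```

Under this assumption a larger z keeps fewer small flows, so z controls how aggressively the sample is thinned.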
Sampling
Sample-and-hold (S&H)
Sampling
Sample-and-hold (S&H)
  Flow table lookup for each packet
  If found, the flow entry is updated by every subsequent packet once it has been created in the S&H table
  If not found, a flow entry is created with probability p (e.g. p = 1/3 in the previous example)
  Sampling is biased toward “elephant” flows
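The lookup rule above can be sketched as follows. Packets are modelled as hypothetical (flow_key, size) pairs and the entry-creation decision is made per packet, as on the slide.

```python
import random

def sample_and_hold(packets, p, seed=0):
    """Sample-and-hold over a stream of (flow_key, size_bytes) packets.

    Once a flow has an entry, every later packet of that flow is
    counted ("hold"). A packet of a flow without an entry creates
    one with probability p ("sample").
    """
    rng = random.Random(seed)
    table = {}   # flow key -> bytes counted since the entry was created
    for key, size in packets:
        if key in table:
            table[key] += size      # held: always counted
        elif rng.random() < p:
            table[key] = size       # entry created by this packet
    return table
```

A flow with many packets gets many chances to create an entry, which is where the bias toward “elephant” flows comes from. Packets that arrive before the entry is created are never counted, so the recorded size can only underestimate the true size.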
Volume Anomaly Detection
Detect network traffic anomalies (e.g. DoS attacks)
  These induce volume anomalies
  Abrupt changes in packet or flow count measurements
Discrete wavelet transform (DWT) based detection
  Proven effective at detecting volume anomalies
DWT-Based Detection
Applies wavelet decomposition to the packet or flow time series
Detects volume changes at various time scales
3 steps
  Decomposition
  Re-synthesis
  Detection
DWT-Based Detection
Decomposition
  Decompose the original signal to identify changes
  The DWT calculates the wavelet coefficients
[Figure: the original signal passed through a high-pass filter and a low-pass filter]
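As an illustration of the decomposition step, here is a small sketch using the Haar wavelet. The slides do not say which wavelet is used, so the choice of Haar, the function names and the number of levels are all assumptions. The low-pass filter gives the approximation (trend) and the high-pass filter gives the detail (change) coefficients.

```python
import math

def haar_step(signal):
    """One DWT level with the Haar wavelet.

    Returns (approx, detail): low-pass and high-pass outputs,
    each half the length of the input (input length must be even).
    """
    s = 1 / math.sqrt(2)
    approx = [(signal[i] + signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    detail = [(signal[i] - signal[i + 1]) * s for i in range(0, len(signal) - 1, 2)]
    return approx, detail

def haar_decompose(signal, levels):
    """Repeatedly split the low-pass output.

    Returns (final_approx, details) where details[0] holds the
    finest-scale coefficients and details[-1] the coarsest.
    """
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details
```

A constant signal produces all-zero detail coefficients at every level, while an abrupt jump shows up as a large detail coefficient at the scale where it occurs.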
DWT-Based Detection
Re-synthesis
  Coefficients are aggregated into high, mid and low bands
  Low-band signal: slow-varying trends
  High-band signal: highlights sudden variations
  Mid-band signal: sum of the rest
DWT-Based Detection
Detection
  Compute the variance of the high- and mid-band signals over a time interval
  Deviation score = local variance / global variance
  Intervals whose deviation score is higher than a predefined threshold are marked as volume anomalies
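A minimal sketch of the detection step. It assumes the input is the already re-synthesised high/mid-band signal, and that the local variance is taken over non-overlapping windows; the window length and the threshold are illustrative parameters, not values from the slides.

```python
from statistics import pvariance

def deviation_scores(band_signal, window):
    """Deviation score per window: local variance / global variance."""
    global_var = pvariance(band_signal)
    scores = []
    for start in range(0, len(band_signal) - window + 1, window):
        local_var = pvariance(band_signal[start:start + window])
        # A flat signal has zero global variance; report a score of 0.
        scores.append(local_var / global_var if global_var > 0 else 0.0)
    return scores

def volume_anomalies(band_signal, window, threshold):
    """Indices of the windows whose deviation score exceeds the threshold."""
    scores = deviation_scores(band_signal, window)
    return [i for i, score in enumerate(scores) if score > threshold]
```

For example, a signal that is flat except for one burst yields a score well above 1 for the window containing the burst and 0 for the others.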
Portscan Detection
2 online portscan detection techniques
  Threshold Random Walk (TRW)
  Time Access Pattern Scheme (TAPS)
Threshold Random Walk (TRW)
2 hypotheses
  H0: a source is a “normal” host
  H1: a source is a scanner
Rationale:
  A normal host is far more likely to have a successful connection than a scanner, which randomly probes the address space.
Threshold Random Walk (TRW)
Hypothesis testing on a sequence of events
  To determine which hypothesis is more likely
  Let Y = {Y1, Y2, . . . , Yi} represent the random vector of connections observed from a source, where Yi = 0 if the ith connection is successful and Yi = 1 otherwise
Threshold Random Walk (TRW)
Likelihood ratio: Λ(Y) = Π_i Pr[Yi | H1] / Pr[Yi | H0]
  When the likelihood ratio crosses either one of two predefined thresholds, the corresponding hypothesis is selected as the most likely.
  Requires ~6 observed events to detect scanners successfully
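A sketch of the sequential test, under stated assumptions. The success probabilities theta0 (normal host) and theta1 (scanner) and the two thresholds eta0 and eta1 are illustrative values chosen for this example, not parameters taken from the slides.

```python
def trw_decide(outcomes, theta0=0.8, theta1=0.2, eta0=0.01, eta1=100.0):
    """Sequential likelihood-ratio test over connection outcomes.

    outcomes: iterable of Yi, with 0 = successful and 1 = failed.
    theta0 / theta1: assumed probability that a connection succeeds
    under H0 (normal host) / H1 (scanner).
    Returns (verdict, events_used).
    """
    ratio = 1.0
    used = 0
    for y in outcomes:
        used += 1
        if y == 0:
            ratio *= theta1 / theta0                # success favours H0
        else:
            ratio *= (1 - theta1) / (1 - theta0)    # failure favours H1
        if ratio >= eta1:
            return "scanner", used
        if ratio <= eta0:
            return "normal", used
    return "pending", used
```

With these illustrative values each failure multiplies the ratio by about 4 and each success divides it by 4, so a run of consecutive failures reaches the upper threshold after a handful of events. The number of events needed in practice depends on the parameters chosen.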
Threshold Random Walk (TRW)
TRWSYN: backbone adaptation of TRW
  Backbone traffic is usually uni-directional
  Difficult to predict “failed” / “succeeded” connections
TRWSYN oracle:
  Marks single-SYN-packet flows as failed connections
  Detects TCP portscans ONLY
Time Access Pattern Scheme (TAPS)
Access pattern observation: a scanner initiates connections to a larger spread of
  destination IP addresses (horizontal scan)
  port numbers (vertical scan)
That means the ratio γ between the numbers of distinct IP addresses and port numbers is larger for a scanner.
Time Access Pattern Scheme (TAPS)
Hypothesis test, similar to TRW
  A single-packet flow counts as a failed connection
  In each time bin (say i), for each source, compute the ratio γ and compare it with a predefined threshold k
  Event variable Yi = 0 if γ < k, 1 if γ >= k
  Update the likelihood ratio
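A rough sketch of the per-bin event computation for one source. The slides do not give the exact form of γ; taking the larger of (distinct IPs / distinct ports) and its inverse, so that both horizontal and vertical scans produce a large γ, is an assumption of this example, as are the function names.

```python
def taps_ratio(dst_ips, dst_ports):
    """Spread ratio for one source in one time bin (assumed form)."""
    n_ips = len(set(dst_ips))
    n_ports = len(set(dst_ports))
    if n_ips == 0 or n_ports == 0:
        return 0.0
    return max(n_ips / n_ports, n_ports / n_ips)

def taps_events(bins, k):
    """Event variable Yi per time bin: 1 if gamma >= k, else 0.

    bins: list of (dst_ips, dst_ports) pairs, one pair per time bin,
    built from the single-packet flows of one source.
    """
    return [1 if taps_ratio(ips, ports) >= k else 0 for ips, ports in bins]
```

The resulting Yi sequence would then drive a sequential likelihood-ratio update of the same kind as in TRW.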
Trace Data
2 links in a Tier-1 ISP’s backbone network
  2 OC-48 links between backbone routers on the West Coast and East Coast
  BB-West: large percentage of scanning traffic
  BB-East: large volume
Collected by IPMON
Methodology
The 4 sampling schemes use different parameters
A common metric is required for a fair comparison
  We choose: percentage of sampled flows
  The schemes differ in: memory requirement, CPU utilization
Methodology
Note: although the percentage of sampled flows is fixed, smart sampling and sample-and-hold are biased towards large flows
Impact of Sampling on Volume Anomaly Detection
  Volume Anomaly Detection Result
  Feature Variation Due to Sampling
Detection from the original trace
A total of 21 abrupt changes detected in the original trace
Number of detections ↓ as sampling interval ↑
Random flow sampling performs the best
Smart sampling and sample-and-hold drop much faster
No false positives in detection
Feature Variation Due to Sampling
Difference in detection performance
  Most volume spikes are caused by a sudden increase in small packet flows
  Random flow sampling is unbiased by flow size
  The others are biased towards large flows
  Smart sampling and sample-and-hold are designed to track heavy hitters
  They perform poorly compared to packet sampling
Feature Variation Due to Sampling
No false positives
  Simply put, a spike in the samples must have existed in the original trace
  It is not an artifact of sampling
  Sampling only ↓ the number of detections and does not cause any false detections
Feature Variation Due to Sampling
Number of detections ↓ as sampling interval ↑, even with random flow sampling
  The technique is based on the number of sampled events and the local variance
  Hypothesis: sampling introduces distortion in the variance
[Figure legend: Success, Fail]
Feature Variation Due to Sampling
Sampling introduces distortion in the variance
  Sampling scales down the original time series by a fraction p
  Assume variance = σ^2 and average rate = μ
  New scaled-down variance = p^2 σ^2
  Sampling also involves the removal of discrete points, i.e. the original point process is sampled binomially
  Total variance = p^2 σ^2 + p(1 - p)μ, where p(1 - p)μ comes from the binomial random variable
Feature Variation Due to Sampling
[Figure: total variance, with its two components (removal of discrete points and scaled-down variance); annotation: > 70% when N = 500]
This affects detection!
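The variance argument can be made concrete with a small calculation. If each original event is kept independently with probability p, and the original per-interval count has mean mu and variance sigma2 (symbol names chosen here for illustration), the law of total variance gives a sampled variance of p^2 * sigma2 + p * (1 - p) * mu. Relating p to the sampling interval as p = 1 / N is an assumption of this sketch.

```python
def sampled_variance(sigma2, mu, p):
    """Variance of a binomially sampled count series.

    Returns (scaled, binomial, total):
      scaled   = p^2 * sigma2       (original variance, scaled down)
      binomial = p * (1 - p) * mu   (added by removing discrete points)
      total    = scaled + binomial
    """
    scaled = p * p * sigma2
    binomial = p * (1 - p) * mu
    return scaled, binomial, scaled + binomial

def binomial_share(sigma2, mu, sampling_interval):
    """Fraction of the total variance due to the sampling itself."""
    p = 1.0 / sampling_interval
    scaled, binomial, total = sampled_variance(sigma2, mu, p)
    return binomial / total if total > 0 else 0.0
```

As N grows, p^2 shrinks faster than p(1 - p), so the share of variance introduced by sampling rises and the original signal's own variance is increasingly masked.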
Impact of Sampling on Portscan Detection
Metrics
  Desirable to have HIGH Rs and LOW Rf+
  Focus on the success ratio Rs and the false positive ratio Rf+ (because Rs + Rf- = 1)
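The metric definitions did not survive in this transcript. The sketch below assumes that all three ratios are normalised by the number of true scanners: Rs is the fraction of true scanners detected, Rf- the fraction missed (so that Rs + Rf- = 1, as stated above), and Rf+ the number of falsely flagged sources divided by the number of true scanners. The Rf+ denominator in particular is an assumption.

```python
def portscan_ratios(detected, truth):
    """Return (Rs, Rf_plus, Rf_minus) for a set of detected sources.

    detected: sources flagged as scanners by the detector.
    truth: ground-truth scanners (must be non-empty).
    """
    detected, truth = set(detected), set(truth)
    if not truth:
        raise ValueError("ground truth must contain at least one scanner")
    n_true = len(truth)
    rs = len(detected & truth) / n_true          # success ratio
    rf_minus = len(truth - detected) / n_true    # false negative ratio
    rf_plus = len(detected - truth) / n_true     # false positive ratio (assumed denominator)
    return rs, rf_plus, rf_minus
```

With this normalisation Rf+ can exceed 1 when a detector flags more innocent sources than there are real scanners.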
Impact of Sampling on Portscan Detection
Challenge: determining the true scanners
  The final list of scanners manually generated by Sridharan (in Impact of Packet Sampling on Portscan Detection) is used as the ground truth
  Less interested in absolute accuracy
  More interested in relative performance as a function of sampling scheme and sampling rate
TRWSYN under Sampling
Rs and Rf+ ratios for the BB-West trace as functions of effective sampling interval for all four sampling schemes
TRWSYN under Sampling
Random packet sampling
  As the base case for comparison
Success ratio Rs
  Initially increases slightly for small N (seems advantageous)
  Drops off for large N
TRWSYN under Sampling
False positive ratio Rf+
  Follows similar behaviour to Rs but on a larger scale
  Increases 3 times when N goes from 1 to 10
Random packet sampling
  As the base case for comparison
TRWSYN under Sampling
2 key effects of packet sampling
  Flow-reduction: the number of flows observed is reduced
  Flow-shortening: multi-packet flows are reduced to single-packet flows
Recall the TRWSYN algorithm: a single-SYN-packet flow is marked as a connection failure, which points to a potential scanner
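The two effects can be quantified for a single flow. Assuming each packet is kept independently with probability p (with p = 1 / N for sampling interval N, an assumption of this sketch), a flow of n packets is missed entirely, seen as a single packet, or seen as a multi-packet flow with the probabilities below.

```python
def packet_sampling_effects(n_packets, p):
    """Outcome probabilities for one flow under random packet sampling.

    missed        -> flow-reduction (no packet of the flow is sampled)
    single_packet -> flow-shortening when n_packets > 1
    multi_packet  -> flow still observed with two or more packets
    """
    missed = (1 - p) ** n_packets
    single = n_packets * p * (1 - p) ** (n_packets - 1)
    return {
        "missed": missed,
        "single_packet": single,
        "multi_packet": 1 - missed - single,
    }
```

For a 10-packet flow, going from p = 0.5 to p = 0.1 turns "mostly seen as multi-packet" into "mostly seen as a single packet or missed", which is the regime where a single-packet oracle starts misclassifying regular flows.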
TRWSYN under Sampling
Small sampling interval
  Flow-reduction: slight impact, high Rs
  Flow-shortening: substantial impact, ↑ single-packet flows
Impact:
  Scanners’ multi-packet flows, initially missed, are shortened and detected: Rs increases
  Regular multi-packet flows are shortened and “detected”: Rf+ increases
TRWSYN under Sampling
Large sampling interval
  Flow-reduction dominates
  Fewer decisions (detections)
  Rs and Rf+ decrease
TRWSYN under Sampling
3 flow sampling schemes
  Decisions are based on the entire flow
  No flow-shortening
  Flow-reduction dominates the impact
Exception: sample-and-hold
  Mid-flow shortening
  Decisions are only made on SYN-packet flows
  Introduces NO false positives
TRWSYN under Sampling
Both Rs and Rf+ decrease almost monotonically as N increases
Rf+ lower than packet sampling
TRWSYN under Sampling
In terms of Rf+,
  Flow sampling >> Packet sampling
In terms of Rs,
  Random Flow Sampling > Random Packet Sampling > Smart Sampling > Sample-and-Hold
  Cause: bias towards large flows, which makes them suffer more from flow-reduction
TAPS under Sampling
Critical parameter: time bin
  For each sampling scheme and each sampling rate, use the optimal time bin (the one that maximizes Rs)
  The optimal time bin is an increasing function of the sampling interval
  True for both packet sampling and flow sampling schemes
TAPS under Sampling
Results of portscan detection with TAPS for Trace BB-West
TAPS under Sampling
Rs decreases as the sampling interval increases
Random flow sampling performs the best
Random packet sampling performs as well as the remaining 2 flow sampling schemes
  Cause: bias towards large flows, which tend to miss small (critical) flows
TAPS under Sampling
Random packet sampling
  Rf+ initially increases due to flow-shortening
  Then drops off at large sampling intervals due to flow-reduction
Flow sampling schemes
  No/minor flow-shortening
  Low Rf+
  Monotonically decreases with sampling interval
TAPS under Sampling
TAPS uses the address range distribution for detection
  Insensitive to the 4 schemes
  No distortion introduced
  Low Rf+
  e.g. with random packet sampling, TAPS yields 1/10 of the Rf+ of TRWSYN
Conclusion
Random flow sampling
  Performs the best
  Prohibitive resource requirements
Random packet sampling
  Suffers from flow-shortening
Smart sampling and sample-and-hold
  Biased towards large flows
  Perform worse than random packet sampling in volume anomaly detection
Conclusion
All 4 sampling schemes
  Degrade all 3 anomaly detection algorithms
  In terms of Rs and Rf+
Is Sampled Data Sufficient for Anomaly Detection?
  Remains an open question