Applied Anomaly Based IDS

Applied Anomaly Based IDS Craig Buchanan University of Illinois at Urbana-Champaign CS 598 MCC 4/30/13


TRANSCRIPT

Page 1: Applied Anomaly Based IDS

Applied Anomaly Based IDS
Craig Buchanan
University of Illinois at Urbana-Champaign
CS 598 MCC
4/30/13

Page 2: Applied Anomaly Based IDS

Outline
• K-Nearest Neighbor
• Neural Networks
• Support Vector Machines
• Lightweight Network Intrusion Detection (LNID)

Page 3: Applied Anomaly Based IDS

K-Nearest Neighbor
• "Use of K-Nearest Neighbor classifier for intrusion detection" [Liao, Computers and Security]

Page 4: Applied Anomaly Based IDS

K-nearest neighbor on text

1. Categorize training documents into a vector space model, A
   • Word-by-document matrix A
   • Rows = words
   • Columns = documents
   • Represents the weight of each word in the set of documents
2. Build a vector for the test document, X
3. Classify X into A using K-nearest neighbor
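A minimal sketch of these three steps in Python (the documents, labels, plain term-frequency weights, and cosine similarity below are illustrative placeholders, not the paper's exact weighting):

```python
from collections import Counter
import math

# Hypothetical training "documents" (lists of words) with known classes.
train_docs = [(["open", "read", "close"], "normal"),
              (["open", "write", "write", "close"], "normal"),
              (["execve", "setuid", "execve"], "attack")]

def to_vector(words):
    """Steps 1-2: represent a document as word -> frequency weights."""
    return Counter(words)

def cosine(u, v):
    """Cosine similarity between two sparse word-frequency vectors."""
    shared = set(u) & set(v)
    dot = sum(u[w] * v[w] for w in shared)
    norm = math.sqrt(sum(x * x for x in u.values())) * \
           math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

def knn_classify(test_words, k=3):
    """Step 3: label the test document by its k most similar training docs."""
    x = to_vector(test_words)
    neighbors = sorted(((cosine(x, to_vector(d)), label) for d, label in train_docs),
                       reverse=True)[:k]
    labels = [label for _, label in neighbors]
    return max(set(labels), key=labels.count)

print(knn_classify(["open", "read", "read", "close"], k=2))  # -> "normal"
```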

Page 5: Applied Anomaly Based IDS

Text categorization
• Create vector space model A
   • a_ij – weight of word i in document j
• Useful variables
   • N – number of documents in the collection
   • M – number of distinct words in the collection
   • f_ij – frequency of word i in document j
   • n_i – total number of times word i occurs in the collection

Page 6: Applied Anomaly Based IDS

Text categorization
• Frequency weighting
• Term frequency – inverse document frequency (tf*idf)
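The equations for these two weightings did not survive into the transcript; a common form, written with the variables above, is (an assumption: the paper may normalize differently, and df_i, the number of documents containing word i, is introduced here only for the idf term):

$$a_{ij} = f_{ij} \quad \text{(frequency weighting)}, \qquad a_{ij} = f_{ij}\,\log\frac{N}{df_i} \quad \text{(tf*idf)}$$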

Page 7: Applied Anomaly Based IDS

Text categorization
• System call = "word"
• Program execution = "document"
• close, execve, open, mmap, open, mmap, munmap, mmap, mmap, close, …, exit

Page 8: Applied Anomaly Based IDS

Document Classification
• Distance measured by Euclidean distance
   • X – test document
   • D_j – jth training document
   • t_i – word shared by X and D_j
   • x_i – weight of word t_i in X
   • d_ij – weight of word t_i in D_j
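Written out with these symbols, the Euclidean distance between the test document and the jth training document is the standard:

$$d(X, D_j) = \sqrt{\sum_i \left(x_i - d_{ij}\right)^2}$$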

Page 9: Applied Anomaly Based IDS

Anomaly detection
• If X contains an unknown system call, then abnormal
• If X is the same as any D_j, then normal
• Otherwise, K-nearest neighbor:
   • Calculate sim_avg over the k nearest neighbors
   • If sim_avg > threshold, then normal
   • Else abnormal
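The same decision procedure as a Python sketch (sim, train_traces, known_calls, and the threshold value are placeholders; any similarity function, e.g. the cosine from the earlier sketch, can be plugged in):

```python
def is_anomalous(test_trace, train_traces, known_calls, sim, k=5, threshold=0.8):
    """Return True if the process trace (list of system calls) looks abnormal."""
    # Rule 1: any system call never seen in training -> abnormal.
    if any(call not in known_calls for call in test_trace):
        return True
    # Rule 2: exact match with any training trace -> normal.
    if any(test_trace == t for t in train_traces):
        return False
    # Rule 3: average similarity to the k nearest training traces vs. threshold.
    sims = sorted((sim(test_trace, t) for t in train_traces), reverse=True)[:k]
    sim_avg = sum(sims) / len(sims)
    return sim_avg <= threshold   # normal if sim_avg > threshold
```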

Page 10: Applied Anomaly Based IDS

Results

Page 11: Applied Anomaly Based IDS

Results

Page 12: Applied Anomaly Based IDS

Neural Networks
• "Intrusion Detection with Neural Networks" [Ryan, AAAI Technical Report 1997]
• Learn user profiles ("prints") to detect intrusion

Page 13: Applied Anomaly Based IDS

NNID System

1. Collect training data
   • Audit logs from each user
2. Train the neural network
3. Obtain a new command distribution vector
4. Compare to training data (sketched below)
   • Anomaly if:
      • Associated with a different user
      • Not clearly associated with any user
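A sketch of that comparison step (the 0.5 confidence cutoff and the exact output format are assumptions, not values from the paper):

```python
import numpy as np

def check_user(output_vector, claimed_user, users, min_confidence=0.5):
    """Flag an anomaly from the network's 10-way output for one command vector.

    output_vector : activations of the 10 output units (one per user)
    claimed_user  : the user the audit log says issued the commands
    users         : list of the 10 user names, in output-unit order
    """
    predicted = users[int(np.argmax(output_vector))]
    confidence = float(np.max(output_vector))
    if confidence < min_confidence:
        return "anomaly: not clearly associated with any user"
    if predicted != claimed_user:
        return "anomaly: commands look like user %s" % predicted
    return "normal"
```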

Page 14: Applied Anomaly Based IDS

Collect training data
• Type of data
   • as, awk, bc, bibtex, calendar, cat, chmod, comsat, cp, cpp, cut, cvs, date, df, diff, du, dvips, egrep, elm, emacs, …, w, wc, whereis, xbiff++, xcalc, xdvi, xhost, xterm
• Type of platform
   • Audit trail logging
   • Small number of users
   • Not a large target

Page 15: Applied Anomaly Based IDS

Train Neural Network
• Map the frequency of each command to a nonlinear scale (see sketch below)
   • 0.0 to 1.0 in 0.1 increments
   • 0.0 – never used
   • 0.1 – used once or twice
   • 1.0 – used > 500 times
• Concatenate the values into a 100-dimensional command distribution vector
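A sketch of that mapping in Python (only the 0.0, 0.1, and 1.0 bins are given on the slide; the intermediate bin edges below are illustrative assumptions, roughly logarithmic in the command count):

```python
# Illustrative bin edges: a count <= edge maps to the corresponding 0.0-0.9 value.
EDGES = [0, 2, 4, 8, 16, 32, 64, 125, 250, 500]

def scale(count):
    """Map a raw command count onto the nonlinear 0.0-1.0 scale."""
    for value, edge in enumerate(EDGES):
        if count <= edge:
            return value / 10.0
    return 1.0  # used more than 500 times

def command_vector(counts, commands):
    """Concatenate scaled values into the 100-dimensional distribution vector."""
    return [scale(counts.get(cmd, 0)) for cmd in commands]
```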

Page 16: Applied Anomaly Based IDS

Neural Network
• 3-layer backpropagation architecture
   • Input layer: 100 units
   • Hidden layer: 30 units
   • Output layer: 10 units
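The same 100-30-10 shape as a forward pass in numpy (the weights are random placeholders; the original system trains them with standard backpropagation, which is omitted here):

```python
import numpy as np

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(100, 30)), np.zeros(30)   # input -> hidden
W2, b2 = rng.normal(size=(30, 10)), np.zeros(10)    # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(command_vector):
    """command_vector: the 100-dimensional scaled command distribution."""
    hidden = sigmoid(command_vector @ W1 + b1)
    output = sigmoid(hidden @ W2 + b2)   # one output unit per user profile
    return output

# Example: forward(np.zeros(100)) returns 10 activations, one per user.
```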

Page 17: Applied Anomaly Based IDS

Results

Page 18: Applied Anomaly Based IDS

Results
• Rejected 63% of random user vectors
• Anomaly detection rate: 96%
• Correctly identified user: 93%
• False alarm rate: 7%

Page 19: Applied Anomaly Based IDS

Support Vector Machines
• "Intrusion Detection Using Neural Networks and Support Vector Machines" [Mukkamala, IEEE 2002]

Page 20: Applied Anomaly Based IDS

SVM IDS

1. Preprocess randomly selected raw TCP/IP traffic
2. Train the SVM
   • 41 input features
   • +1 – normal
   • -1 – attack
3. Classify new traffic as normal or anomalous
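A minimal training/classification sketch with scikit-learn (the 41-feature rows and labels here are synthetic stand-ins for the preprocessed connection records; the RBF kernel choice is an assumption):

```python
import numpy as np
from sklearn.svm import SVC

# Synthetic stand-in for preprocessed connection records: 41 features each,
# labelled +1 (normal) or -1 (attack).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 41))
y_train = rng.choice([1, -1], size=200)

clf = SVC(kernel="rbf")        # kernel choice is an assumption
clf.fit(X_train, y_train)

new_traffic = rng.normal(size=(5, 41))
print(clf.predict(new_traffic))   # array of +1 / -1 labels
```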

Page 21: Applied Anomaly Based IDS

SVM IDS Features

Feature name   | Description                                      | Type
Duration       | Length of the connection                         | Continuous
Protocol type  | TCP, UDP, etc.                                   | Discrete
Service        | HTTP, TELNET, etc.                               | Discrete
Src_bytes      | Number of data bytes from source to destination  | Continuous
Dst_bytes      | Number of data bytes from destination to source  | Continuous
Flag           | Normal or error status                           | Discrete
Land           | If connection is from/to the same host/port      | Discrete
Wrong_fragment | Number of "wrong" fragments                      | Continuous
…              | …                                                | …

Page 22: Applied Anomaly Based IDS

Results

[Chart: SVM prediction vs. actual class label (+1 / -1) for the test traffic]

Page 23: Applied Anomaly Based IDS

Recent Anomaly-based IDS
• "An efficient network intrusion detection" [Chen, Computer Communications 2010]
• Lightweight Network Intrusion Detection (LNID) system

Page 24: Applied Anomaly Based IDS

LNID Approach
• Detect R2L (remote-to-local) and U2R (user-to-root) attacks
• Assume the attack is in the first few packets
• Calculate an anomaly score for each packet

Page 25: Applied Anomaly Based IDS

LNID System Architecture

Page 26: Applied Anomaly Based IDS

Anomaly Score
• Based on Mahoney's network IDS [21-24]:
   • M.V. Mahoney, P.K. Chan, "PHAD: Packet header anomaly detection for identifying hostile network traffic," Florida Institute of Technology Technical Report CS-2001-04, 2001.
   • M.V. Mahoney, P.K. Chan, "Learning nonstationary models of normal network traffic for detecting novel attacks," in: Proceedings of the 8th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2002, pp. 376-385.
   • M.V. Mahoney, P.K. Chan, "Learning models of network traffic for detecting novel attacks," Florida Institute of Technology Technical Report CS-2002-08, 2002.
   • M.V. Mahoney, "Network traffic anomaly detection based on packet bytes," in: Proceedings of the 2003 ACM Symposium on Applied Computing, 2003, pp. 346-350.

Page 27: Applied Anomaly Based IDS

Anomaly Score (Mahoney)

• score = t · n / r, where:
   • t = time elapsed since the last time the attribute was anomalous
   • n = number of training or observed instances
   • r = number of novel values of the attribute

Page 28: Applied Anomaly Based IDS

Anomaly Score (revised)

• Drops the time factor t; the score depends only on:
   • n = number of training or observed instances
   • r = number of novel values of the attribute
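A sketch of the per-attribute scoring under the reconstruction above (the original Mahoney score is t·n/r; the revised formula is not legible in this transcript, so score_revised below is an assumption that it simply drops the time factor):

```python
def score_mahoney(t, n, r):
    """Mahoney-style anomaly score for one attribute holding a novel value.

    t : time since this attribute was last anomalous
    n : number of training or observed instances
    r : number of novel (distinct) values seen for the attribute
    """
    return t * n / r

def score_revised(n, r):
    """Revised score (assumption: the time factor is dropped)."""
    return n / r

def packet_score(anomalous_attrs):
    """Sum the per-attribute scores over attributes whose value is novel."""
    return sum(score_revised(n, r) for n, r in anomalous_attrs)
```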

Page 29: Applied Anomaly Based IDS

Anomaly Scoring Comparison

Page 30: Applied Anomaly Based IDS

Attributes
• Attribute = packet byte
   • 256 possible values
• 48 attributes (packet bytes)
   • 20 bytes of IP header
   • 20 bytes of TCP header
   • 8 bytes of payload
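A sketch of pulling those 48 attributes out of a raw packet (assumes a plain IPv4/TCP frame with 20-byte headers and no options, matching the byte counts on the slide):

```python
def packet_attributes(raw: bytes):
    """Return the 48 attribute bytes: 20 IP header + 20 TCP header + 8 payload."""
    ip_header  = raw[0:20]                      # assumes no IP options
    tcp_header = raw[20:40]                     # assumes no TCP options
    payload    = raw[40:48].ljust(8, b"\x00")   # pad short payloads with zeros
    return list(ip_header + tcp_header + payload)  # 48 values, each 0-255
```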

Page 31: Applied Anomaly Based IDS

Results
• Detection rate:

              | Total (%) | U2R (%) | R2L (%) | # FA/Day
   LNID       | 73        | 70      | 77      | 2
   NETAD      | 68        | 55      | 78      | 10
   Lee et al. | 78        | 18      | 10      |

• Workload:
   • LNID – 0.3% of traffic
   • NETAD – 3.16% of traffic
   • Lee et al. – 100% of traffic

Page 32: Applied Anomaly Based IDS

Results
• Hard-to-detect attacks:

   Attack name    | Description                                     | LNID        | PHAD      | DARPA
   loadmodule     | U2R, SunOS, set IFS to call trojan suid program | 1/3         | 0/3       | 1/3
   ncftp          | R2L, FTP exploit                                | 4/5         | 0/5       | 0/5
   sechole        | U2R, NT bug exploit                             | 3/3         | 1/3       | 1/3
   perl           | U2R, Linux exploit                              | 2/3         | 0/3       | 0/4
   sqlattack      | U2R, escape from SQL database shell             | 3/3         | 0/3       | 0/3
   xterm          | U2R, Linux buffer overflow in suid root prog.   | 3/3         | 0/3       | 1/3
   Detection rate |                                                 | 16/20 (80%) | 1/20 (5%) | 3/21 (14%)

Page 33: Applied Anomaly Based IDS

Questions or Comments