![Page 1: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/1.jpg)
Data Mining for NetworkIntrusion Detection
S Terry Brugger
UC Davis
Department of Computer Science
Data Mining for Network Intrusion Detection – p.1/55
![Page 2: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/2.jpg)
Overview• This is important for defense in depth• Much work has been done in the area, but no
solution yet• I will investigate an ensemble approach as a
possible solution
Data Mining for Network Intrusion Detection – p.2/55
![Page 3: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/3.jpg)
We need to detect intrusions• Can’t stop intrusions, so need to mitigate
them• Can mitigate (stop the attackers) when they’re
detected, or take other corrective action(improving defenses)
• Part of defense in depth
Data Mining for Network Intrusion Detection – p.3/55
![Page 4: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/4.jpg)
Current IDSs are notsufficient
• Only detect known attacks• Can’t detect insider attacks (privilege abuse)• Don’t have a holistic picture of the network to
detect multi-step attacks over a long timeperiod
• Data for detection is available, but sysadminresources are limited
Data Mining for Network Intrusion Detection – p.4/55
![Page 5: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/5.jpg)
The solution is Data Mining
Data Mining: The process of extracting usefuland previously unnoticed models or patternsfrom large data stores.
(Also called “sensemaking”.)
Data Mining for Network Intrusion Detection – p.5/55
![Page 6: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/6.jpg)
Data mining should be donein an Offline environment
• Last line of defense – used in concert withreal-time systems
• Allows system to be queried post hoc• More complete session information• Data mining techniques are expensive (even
with mitigation through cost-based models[Lee])
Data Mining for Network Intrusion Detection – p.6/55
![Page 7: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/7.jpg)
More reasons to work in anOffline environment
• Periodic batch processing provides trade-offbetween timeliness and efficiency
• Allows for holistic picture, grouping relatedactivity
• Harder to attack IDS via denial of service
Data Mining for Network Intrusion Detection – p.7/55
![Page 8: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/8.jpg)
We mine network connectionrecords
• Readily available• Efficient (good size to information ratio)• Easy for data mining methods to operate on• Avoids privacy issues and encryption of data
streams
Data Mining for Network Intrusion Detection – p.8/55
![Page 9: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/9.jpg)
How network connectionrecords break down
• Intrinsic attributes• Essential attributes
• Axis and reference attributes
• Secondary attributes
• Calculated attributes
Data Mining for Network Intrusion Detection – p.9/55
![Page 10: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/10.jpg)
Place-holder for intrinsic at-tributes table
Data Mining for Network Intrusion Detection – p.10/55
![Page 11: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/11.jpg)
Place-holder for calculatedattributes table
Data Mining for Network Intrusion Detection – p.11/55
![Page 12: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/12.jpg)
Additional usefulinformation
• Calendar schema• Normalization (pseudo-Bayes estimators,
probability given other values)• Compression (UDP, ICMP, source net,
information gain)• Selection using genetic algorithm [Helmer]
Data Mining for Network Intrusion Detection – p.12/55
![Page 13: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/13.jpg)
Many datasets, none great• Information Exploration Shootout (IES)• Internet Traffic Archive [LBL]• Security Suite 16 [InfoWorld]• DARPA Off-line Intrusion Detection Evaluation
• 1998
• KDD-Cup
• 1999
Data Mining for Network Intrusion Detection – p.13/55
![Page 14: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/14.jpg)
What’s so bad withthe IDEval?
• McHugh identified numerous proceduralproblems• Unrealistic data rates
• Failure to show relation to real traffic
Data Mining for Network Intrusion Detection – p.14/55
![Page 15: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/15.jpg)
More problems with IDEval?• Mahoney & Chan found problems with the
data• Some fields like TTL predictable
• Allowed naive methods to achieve high detectionrates
• Correctable by mixing with real traffic
• Despite all this, DARPA dataset still thestandard
Data Mining for Network Intrusion Detection – p.15/55
![Page 16: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/16.jpg)
Some desired features inexisting systems
• Information Security Officer’s Assistant(ISOA) [Winkler] and Distributed IntrusionDetection System (DIDS) [Snapp] did datafusion and multi-sensor correlation
• SRI work: IDES, NIDES, EMERALD providemore published research in this area
Data Mining for Network Intrusion Detection – p.16/55
![Page 17: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/17.jpg)
Commercial offerings tocorrelate alarms
• RealSecure SiteProtector• Symantec ManHunt• nSecure nPatrol• Cisco IDS• Network Flight Recorder (NFR)
Data Mining for Network Intrusion Detection – p.17/55
![Page 18: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/18.jpg)
Commercial offerings foraudit trail integrity
• Computer Associates’ eTrust IntrusionDetection Log View
• NetSecure Log
Data Mining for Network Intrusion Detection – p.18/55
![Page 19: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/19.jpg)
Some functionality offeredby services
• SANS Internet Storm Center• dShield (Independent Storm Center Analysis
and Coordination Center)• myNetWatchman• Security Focus DeepSight Analyzer• Managed service available from
Counterpane, ISS, and Symantec
Data Mining for Network Intrusion Detection – p.19/55
![Page 20: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/20.jpg)
Internet Storm Center
Data Mining for Network Intrusion Detection – p.20/55
![Page 21: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/21.jpg)
Two major data miningapproaches
• Statistical (top-down)• Machine learning (bottom-up)
Data Mining for Network Intrusion Detection – p.21/55
![Page 22: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/22.jpg)
Statistical techniques• Probability of record given correlated
probability of individual fields [SRI]• Probability of record given Bayes network of
conditional probabilities [Staniford]• Probability of value not seen in training given
alphabet size and time since last anomaly[Mahoney]
• Decision trees (ID3) [Sinclair]
Data Mining for Network Intrusion Detection – p.22/55
![Page 23: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/23.jpg)
Many types ofmachine learning
• Classification• Clustering• Support Vector Machines [Eskin,Mukkamala]• Others
Data Mining for Network Intrusion Detection – p.23/55
![Page 24: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/24.jpg)
Classification approaches• Inductive rule [Lee,Helmer,Warrender]• Genetic algorithms
[Neri,Sinclair,Dasgupta,Crosbie,Chittur]• Fuzzy rules [Dickerson,Luo]• Neural nets [Giacinto,Ghosh,Ryan,Endler]• Immunological [Hofmeyr,Dasgupta,Fan]
Data Mining for Network Intrusion Detection – p.24/55
![Page 25: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/25.jpg)
Clustering approaches• Fixed width, k-nearest neighbor
[Portnoy,Eskin,Chan]• k-means [Bloedorn]• Learning Vector Quantization [Marin]• Simulated annealing [Staniford]
Data Mining for Network Intrusion Detection – p.25/55
![Page 26: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/26.jpg)
More Clustering approaches• Approximate Distance Clustering & AKMDE
[Marchette]• Dynamic Clustering [Sequeira]• Parzen-window [Yeung]• Instance-based learner [Lane]
Data Mining for Network Intrusion Detection – p.26/55
![Page 27: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/27.jpg)
Other approaches• Colored Petri nets [Kumar]• Graphs [Staniford,Tolle]• Markov models [Lane,Warrender]
Data Mining for Network Intrusion Detection – p.27/55
![Page 28: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/28.jpg)
Other proposed methods
Proposed by Method
[Denning] operational model
[Denning] mean and standard deviation
[Denning] multivariate model
[Denning] Markov process model
[Kumar] generalized Markov chain
[Denning] time series model
[Frank,Endler] Recurrent neural network
Data Mining for Network Intrusion Detection – p.28/55
![Page 29: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/29.jpg)
More proposed methods
[Chan,Prodromidis] C4.5, ID3, CART, WPEBLS
[Bass] Dempster-Shafer method
[Bass] Generalized EPT
[Lane] Spectral analysis
[Lane] Principle component analysis
[Lane] Linear regression
[Lane] Linear predictive coding
[Lane] (γ, ǫ)-similarity
Data Mining for Network Intrusion Detection – p.29/55
![Page 30: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/30.jpg)
Research to date has pro-vided progress, no solution
• Most data mining methods for ID are good atdetecting particular types of malicious activity
• False positive rates are high (base-ratefallacy [Axelsson])
Data Mining for Network Intrusion Detection – p.30/55
![Page 31: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/31.jpg)
Better performance throughEnsemble techniques
(Also called meta-learning or multi-strategylearning)
“It is well known in the machine learning literature
that appropriate combination of a number of weak
classifiers can yield a highly accurate global clas-
sifier.” [Lane]
Data Mining for Network Intrusion Detection – p.31/55
![Page 32: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/32.jpg)
More support for Ensembletechniques
Neri notes “that combining classifiers learned by
different learning methods, such as hill-climbing
and genetic evolution, can produce higher clas-
sification performances because of the different
knowledge captured by complementary search
methods.”
Data Mining for Network Intrusion Detection – p.32/55
![Page 33: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/33.jpg)
Ensemble techniquesimportant for ID
“In reality there are many different types ofintrusions, and different detectors are needed todetect them.” [Axelsson]
“Combining evidence from multiple base classi-
fiers . . . is likely to improve the effectiveness in de-
tecting intrusions.” [Lee]
Data Mining for Network Intrusion Detection – p.33/55
![Page 34: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/34.jpg)
Some work has been donewith Ensemble techniques
• Manually built covariance matrix in [N]IDES touse multiple classifiers
• Crosbie’s autonomous agents and Staniford’sSPICE also do basic correlation of statisticalclassifiers
• Lee, Fan, et al. proposed use forincorporating classifiers trained on new dataand aging out old classifiers
Data Mining for Network Intrusion Detection – p.34/55
![Page 35: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/35.jpg)
More prior work withEnsemble techniques
• Lee, Fan, et al. also used cost-basedmeta-classifiers
• ADAM uses multiple classifiers for filtering[Barbará]
• Giacinto et al. use ensemble of neural netstrained on different feature sets
Data Mining for Network Intrusion Detection – p.35/55
![Page 36: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/36.jpg)
Outstanding questions
1. For baseline purposes, what is the accuracy of acontemporary NID on the DARPA dataset?
2. Ideal number of states for a Hidden Markov Model, andwhat parameters influence this value?
3. Ideal feature sets for different data mining techniques?
4. Should connectionless protocols like UDP and ICMP,be compressed to a single connection (as in TCP)?
5. Separate training sets for classifiers andmeta-classifiers?
Data Mining for Network Intrusion Detection – p.36/55
![Page 37: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/37.jpg)
More Outstanding questions
6. What is the accuracy of ensemble based offline NIDemploying numerous, different, techniques?
7. How much data is required in order to properly train adata-mining based IDS?
8. How dependent is data mining performance on trainingon same network as it’s used?
9. Should hosts and / or services be grouped together forusage profiles?
Data Mining for Network Intrusion Detection – p.37/55
![Page 38: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/38.jpg)
More Outstanding questions
10. Other forms of data compression to improve accuracy?
11. Predictive capabilities of an offline network intrusiondetection system?
12. How much will the incorporation additional datasources improve performance?
13. Better accuracy by considering state of hosts withconnection as transition operator?
14. Does the ideal time window, w, depend on the currentstate of a host?
Data Mining for Network Intrusion Detection – p.38/55
![Page 39: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/39.jpg)
Big questions
15. What similarities or differences exist in the trafficcharacteristics between different types of networks thatimpact the performance characteristics of a networkintrusion detector?
16. What is the user acceptability level of false alarms?
17. How much can false alarms be reduced through theuse of user feedback, and learning algorithms orclassifier retraining?
Data Mining for Network Intrusion Detection – p.39/55
![Page 40: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/40.jpg)
What am I going to do aboutit?
Data Mining for Network Intrusion Detection – p.40/55
![Page 41: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/41.jpg)
Answer the first six1. For baseline purposes, what is the accuracy of a
contemporary NID on the DARPA dataset?
2. Ideal number of states for a Hidden Markov Model, andwhat parameters influence this value?
3. Ideal feature sets for different data mining techniques?
Data Mining for Network Intrusion Detection – p.41/55
![Page 42: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/42.jpg)
Goal4. Should connectionless protocols like UDP and ICMP,
be compressed to a single connection (as in TCP)?
5. Separate training sets for classifiers andmeta-classifiers?
6. What is the accuracy of ensemble based offline NIDemploying numerous, different, techniques?
Data Mining for Network Intrusion Detection – p.42/55
![Page 43: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/43.jpg)
Approach• Datasets
• 1998, 1999 DARPA TCP data
• 1998 DARPA mixed with real data
• Baseline Snort• Connection Mining (tcpreduce)
Data Mining for Network Intrusion Detection – p.43/55
![Page 44: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/44.jpg)
Place-holder for ConnectionTable Creation
Data Mining for Network Intrusion Detection – p.44/55
![Page 45: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/45.jpg)
Place-holder for Results Ta-ble Creation
Data Mining for Network Intrusion Detection – p.45/55
![Page 46: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/46.jpg)
Base method creation• Training mode• Classification mode
Data Mining for Network Intrusion Detection – p.46/55
![Page 47: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/47.jpg)
Anomaly detection methods• Bayes network• Non-self bit-vectors• Hidden Markov Model
Data Mining for Network Intrusion Detection – p.47/55
![Page 48: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/48.jpg)
Classification methods• Decision tree• Associative rules• Neural network• Elman network• Genetic algorithm• Clustering algorithm• Support Vector Machine
Data Mining for Network Intrusion Detection – p.48/55
![Page 49: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/49.jpg)
Further approach• Ideal parameter determination• Ideal feature set determination• Base classifier analysis• Training information population
Data Mining for Network Intrusion Detection – p.49/55
![Page 50: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/50.jpg)
About Meta-classifiers1. Total anomaly from individual
2. Total probe from individual
3. Total DoS from individual
4. Total R2L from individual
5. Total U2L from individual
6. Total threat from individual
7. Total threat from totals
Data Mining for Network Intrusion Detection – p.50/55
![Page 51: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/51.jpg)
Meta-classifiers• Naive-Bayes• Decision tree• Associative rules• Neural network• Genetic algorithm• Support Vector Machine
Data Mining for Network Intrusion Detection – p.51/55
![Page 52: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/52.jpg)
Final approach• Ideal meta-classifier training• Ideal performance testing• Analysis and writeup
Data Mining for Network Intrusion Detection – p.52/55
![Page 53: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/53.jpg)
Timeline
Step Finish date
Mixed dataset generated 7 October 2004
Baseline 1 November 2004
Connection mining 15 October 2004
Table creation 15 August 2004
Base classifiers 1 April 2005
HMM parameter estimation 15 April 2005
Ideal feature set determination 1 July 2005
Data Mining for Network Intrusion Detection – p.53/55
![Page 54: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/54.jpg)
A little more time
Base classifier analysis 15 August 2005
Training information population 1 September 2005
Meta-classifiers 22 October 2005
Ideal meta-classifier training 15 December 2005
Ideal performance testing 1 February 2006
Data Mining for Network Intrusion Detection – p.54/55
![Page 55: Data Mining for Network Intrusion Detectionzow/papers/dmnid_qualpres.pdfData Mining for Network Intrusion Detection S Terry Brugger UC Davis Department of Computer Science Data Mining](https://reader033.vdocuments.site/reader033/viewer/2022042112/5e8e0c3b6c6bae3e1c5d37c3/html5/thumbnails/55.jpg)
Completion of dissertation
1 May 2006
Data Mining for Network Intrusion Detection – p.55/55