
Towards Host-based Detection and Collaborative Network Containment of Fast Spreading Active Worms

Frank Akujobi, Ioannis Lambadaris

Department of Systems and Computer Engineering
fakujobi, [email protected]

and Evangelos Kranakis

Department of Computer [email protected]

Carleton University, 1125 Colonel By Dr., Ottawa, ON K1S 5B6, Canada

Abstract – The complex propagation techniques of malicious worms and the severity of worm-based attacks have made the study of defenses against worm propagation crucial. Fast spreading malicious worms in particular have been known to cause significant havoc on the networks they attack. In this paper, we propose and analyze the Endpoint Detection And Network Containment (EDANC) approach for defending against fast spreading worms. The EDANC detection and correlation engine is based on the Generalized Evidence Processing (GEP) theory, a decision level multi-sensor data fusion technique. With GEP theory, evidence collected by distributed detectors determines the probability associated with a decision under a hypothesis. Several pieces of evidence from the detectors are combined to arrive at an optimal fused decision by minimizing a decision risk function. The EDANC scheme also employs an automated collaborative network-centric containment approach for worm defense. Random scanning worm behavior is considered in analyzing the worm detection interval associated with using the EDANC technique. Further, we present the Analytical Active Worm Containment (AAWC) model, a non-deterministic discrete-time model used to model the vulnerable host population protected as a result of the EDANC collaborative defense mechanism in a large scale network. Analysing the AAWC model alongside a known discrete-time worm propagation model, we demonstrate quantitatively the effectiveness of the EDANC technique in defending against large scale fast spreading scanning worm attacks.

Keywords – Anomaly detection, Generalized Evidence Processing theory, Detection interval, Collaborative containment.

I. INTRODUCTION AND CONTRIBUTIONS

Anomaly-based intrusion detection has emerged as a mechanism for defending against advanced malicious intrusions that are either previously unknown or are capable of bypassing traditional signature-based detection schemes [11] [16]. Anomaly-based Intrusion Detection Systems (AIDS) can be classified as network AIDS or host AIDS depending on where the anomalous behavior is detected. Network AIDS infer suspicious activity in a network by detecting anomalous network traffic patterns [30] [7] and trends [32]. While this approach can effectively detect fast spreading worms under favorable conditions, it cannot reliably detect malicious intrusions that do not cause anomalous network traffic patterns [9]. Host-based AIDS have been more successful at detecting malicious worm intrusions irrespective of the scanning behavior of worms, since a detection alert is generated when an intrusion attempts to alter the standard state¹ of the endpoint. Typically, such attempts are in the form of anomalous system calls [20] or unauthorized intrusions which cause the host AIDS to trigger an alert. This aligns with the argument in [9] that since end hosts running vulnerable software are the targets of malicious code attacks, they ought to be the point of detection. Recent work [9] [1] [4] and vendor implementations [26] have recorded success in using host AIDS for detecting unauthorized intrusions. Host AIDS are capable of leveraging large amounts of detailed context about applications and system behavior to effectively detect anomalous host behaviors [25]. The technique adopted in [9] shows that with properly instrumented detection software, host-based intrusion detection is effective and capable of eliminating false positives.

This research was partially funded by grants from the Natural Sciences and Engineering Research Council (NSERC) and Mathematics of Information Technology and Complex Systems (MITACS).

In this paper, we introduce the use of the Generalized Evidence Processing (GEP) theory, a decision level multi-sensor data fusion technique, for detecting fast spreading malicious worms by combining intrusion detection evidence provided by distributed host-based intrusion detectors [28] [29]. We recently demonstrated the use of the GEP theory for detecting slow scanning malicious worms [3]. This paper extends our work to detection and collaborative containment of fast scanning malicious worms in enterprise networks. There have been previous attempts to use two major evidence combining theories for intrusion detection: the Bayesian theory and the Dempster-Shafer theory [17] [24]. Proponents of the Bayesian theory criticize the Dempster-Shafer theory for a lack of rigor in the axiomatic definition of evidence and the inability to use a priori probabilities when they are known [28] [14]. On the other hand, proponents of the Dempster-Shafer theory criticize the Bayesian theory for a lack of flexibility when it comes to fuzzy decisions where the evidence might not support hard decisions, the difficulty in defining a priori probabilities and likelihood functions, as well as the mutual exclusivity requirement for competing hypotheses [24] [10]. The GEP theory unifies both theories in a generalized framework and combines their advantages [28] [10]. With GEP, the evidence collected by the host detectors determines the probability mass associated with a decision under a hypothesis. The probability mass assignments may be based on the Bayesian likelihood function or correspond to the belief functions used in the Dempster-Shafer evidential theory. The evidence is combined to arrive at an optimal fused decision by minimizing a decision risk function.

¹ Pre-defined standard states of endpoints are typically determined by established security policies and standards.


We propose the EDANC scheme which uses host-based anomaly intrusion detectors as sensors in a target cell and runs a GEP theory based correlation algorithm on their gateway router, which acts as a fusion center to achieve an optimal intrusion detection decision. Based on that decision, the gateway router communicates with other routers in the network to achieve collaborative worm containment in the network. With the EDANC scheme, the host detectors do not participate in normal traffic transactions but function solely as anomaly detectors in the target cell. Their role is to make local decisions (malicious or benign) concerning detected intrusions and communicate that decision to their gateway router. There is evidence of the effectiveness of host-based anomaly detection techniques in detecting unauthorized and malicious host intrusions with minimal false alarms [9] [13]. The EDANC technique is unique because it leverages probabilistic evidence from distributed host-based anomaly detectors about unauthorized intrusions, as well as an optimal correlation of the evidence based on the Generalized Evidence Processing theory, to achieve network-based collaborative containment of fast spreading malicious worms. The main contributions of this work are:

• We propose an optimized detection and correlation scheme based on the Generalized Evidence Processing (GEP) theory, a data fusion technique known to have advantages over the two major evidence combining theories that have dominated the field of distributed evidence processing: the Bayesian theory and the Dempster-Shafer theory. Our proposed EDANC scheme uses the GEP-based detection decision to achieve collaborative containment of fast spreading worms.

• We present an analysis of the EDANC detection interval for a fast scanning worm. Experimenting on a live test-bed, we evaluate the EDANC scheme and show that the results obtained are in agreement with analytical results.

• We develop the Analytical Active Worm Containment (AAWC) model, a discrete-time model used to model a vulnerable host population protected as a result of EDANC collaborative defense in a large scale network.

• Using the AAWC model we demonstrate quantitatively the collaborative containment capability of the EDANC scheme. We define containment of an infectious worm as the complete halting of further worm spread. Therefore, when a fast spreading worm is contained using the EDANC scheme, significant portions of the vulnerable uninfected population are protected from infection. We also evaluate introducing immunization to the EDANC collaborative defense.

A. Outline

Section II introduces the Generalized Evidence Processing theory. In sections III and IV we present a description and analysis of the EDANC scheme. Experimentation with EDANC on a live test-bed is presented in section V. In section VI, using the AAWC model, we investigate the protection capability of the EDANC collaborative defense. Section VII concludes the paper and points to future work.

II. INTRODUCTION - GENERALIZED EVIDENCE PROCESSING THEORY

Some known limitations of the Bayesian decision processes include the inability to deal with both non-mutually exclusive multiple hypotheses and uncertainty [14] [10]. The Generalized Evidence Processing (GEP) theory extends the Bayesian inference framework to deal with these limitations. As an example, let $H_0$ and $H_1$ be the two hypotheses under testing. The events associated with the probability space can be attributed to the two hypotheses $H_0$ and $H_1$ with probabilities $P(H_0)$ and $P(H_1)$ respectively, where $P(H_0) + P(H_1) = 1$. Let $d_0$, $d_1$, and $d_2$ be the decisions which correspond to the propositions "$H_0$ is true", "$H_1$ is true" and "$H_0$ or $H_1$ is true"² respectively. With Bayesian inference where $H_0$ and $H_1$ are mutually exclusive events, the probability associated with $d_2$ is equivalent to:

$$P(H_0) + P(H_1) = 1$$

This shows an inability to account for non-mutually exclusive events and uncertainty (or indecision) within the Bayesian framework. The Dempster-Shafer technique also has shortcomings, as explained earlier in section I. The GEP theory enjoys the merits of both the Bayesian and the Dempster-Shafer techniques and avoids their shortcomings [14].

[Figure 1: local sensor/detector observations $z_1, \ldots, z_5$ result in local decisions $d_1, \ldots, d_5$. Using GEP theory, the fused decision at the gateway router is determined from the local detector decisions, via the transformations $\tilde{D} = \{f(d)\}$ and $\tilde{D} = g(Z)$.]

Fig. 1. Transformation from $N$ local detector observations to a fused decision. $N = 5$.

The GEP theory is a unified evidence theory which accounts for indecision and combines evidence that supports non-mutually exclusive propositions to arrive at a decision by minimizing a cumulative risk function. In a distributed multi-sensor system with $N$ sensors, let $Z$ be the observation (data) space which results in individual local decisions on the sensors. We represent $Z$ as $Z = \{ z : z = (z_1, z_2, \ldots, z_N),\ z_i \in \{0, 1, 2\} \}$, where 0 implies a "benign observation", 1 implies a "malicious observation" and 2 implies an "uncertainty" about the nature of the observation. Also, let two hypotheses $H_1$ and $H_0$ be considered, where $H_1$ is the hypothesis that the observation is malicious and $H_0$ is the hypothesis that the observation is benign. Each local sensor observation results in a local sensor decision (see Fig. 1). Hence, the vector of observations $z$ results in a vector of local decisions $\{ d : d = (d_1, d_2, \ldots, d_N),\ d_i \in \{0, 1, 2\} \}$, where 0, 1, and 2 are the individual local decisions which correspond to the propositions "$H_0$ is true", "$H_1$ is true" and "$H_0$ or $H_1$ is true" (i.e. an indecision).

Using the GEP theory, the local sensor decisions are combined at a fusion center to arrive at a fused decision that minimizes a cumulative risk function. As depicted in Fig. 1, let $g(Z)$ be the transformation from the observation space $Z$ into the fused decision space $\tilde{D} = \{ D : D = f(d),\ d = (d_1, d_2, \ldots, d_N),\ d_i \in \{0, 1, 2\} \}$ such that $\tilde{D} = g(Z)$, $\tilde{D} = \{0, 1, 2\}$ and

$$D = 0, 1, 2 \qquad (1)$$

where $D = 0, 1, 2$ are the fused decisions that "$H_0$ is true", "$H_1$ is true" and "$H_0$ or $H_1$ is true" respectively.

² Decision $d_2$ therefore represents an indecision about the true nature of the hypothesis.

In practical worm detection systems, distributed detectors in a network can make observations of malicious intrusions in the network and report their individual decisions to a central processor (such as a gateway router). The transformation $g(Z)$ corresponds to the function of a correlation algorithm running on the gateway router that takes as input the individual local decisions of the detectors and outputs a fused decision.

Following the GEP theory [28], let $C_{ij}$ be the cost associated with a decision $i$ in the set $\tilde{D}$ at the fusion center when hypothesis $H_j$ is true, where $0 \le C_{ij} \le 1$. We assume that there is no penalty for a correct decision, hence the associated cost for a correct decision is zero (i.e. $C_{00} = C_{11} = 0$). We now determine the decision rules which ensure that the fused decision (1) made at the fusion center minimizes the cumulative decision risk $R$. The cumulative risk $R$ can be expressed as:

$$R = \sum_{i} \sum_{j} C_{ij}\, P(H_j) \int_{D=i} P(d \mid H_j), \quad j = 0, 1 \text{ and } i = 0, 1, 2 \qquad (2)$$

where $D = 0, 1, 2$ are the fused decisions that "$H_0$ is true", "$H_1$ is true" and "$H_0$ or $H_1$ is true" respectively, which occupy the fused decision space. Solving,

$$R = \int_{D=0} \left[ P(H_0) C_{00} P(d \mid H_0) + P(H_1) C_{01} P(d \mid H_1) \right] + \int_{D=1} \left[ P(H_0) C_{10} P(d \mid H_0) + P(H_1) C_{11} P(d \mid H_1) \right] + \int_{D=2} \left[ P(H_0) C_{20} P(d \mid H_0) + P(H_1) C_{21} P(d \mid H_1) \right]$$

$R$ is minimized if the fusion decision rule assigns $D$ (the fused decision) to the region ($D = 0$, $D = 1$ or $D = 2$) that results in the least integrand under the three integrals. Since $d$ is a vector with discrete components, we can write the fusion decision rules as follows:

$$P_0 C_{00} P(d \mid H_0) + P_1 C_{01} P(d \mid H_1) \underset{D=0 \text{ or } D=2}{\overset{D=1 \text{ or } D=2}{\gtrless}} P_0 C_{10} P(d \mid H_0) + P_1 C_{11} P(d \mid H_1)$$

$$P_0 C_{00} P(d \mid H_0) + P_1 C_{01} P(d \mid H_1) \underset{D=0 \text{ or } D=1}{\overset{D=2 \text{ or } D=1}{\gtrless}} P_0 C_{20} P(d \mid H_0) + P_1 C_{21} P(d \mid H_1)$$

$$P_0 C_{20} P(d \mid H_0) + P_1 C_{21} P(d \mid H_1) \underset{D=2 \text{ or } D=0}{\overset{D=1 \text{ or } D=0}{\gtrless}} P_0 C_{10} P(d \mid H_0) + P_1 C_{11} P(d \mid H_1)$$

where $P_0 = P(H_0)$ and $P_1 = P(H_1)$, and $A \underset{d_j}{\overset{d_i}{\gtrless}} B$ implies that decision $d_i$ is made if $A > B$, otherwise decision $d_j$ is made.

Dividing both sides by $P(d \mid H_0)$ and defining $\Lambda(d) = \frac{P(d \mid H_1)}{P(d \mid H_0)}$, the decision rules become:

$$[C_{01} - C_{11}]\, \Lambda(d) \underset{D=0 \text{ or } D=2}{\overset{D=1 \text{ or } D=2}{\gtrless}} \frac{P_0}{P_1} [C_{10} - C_{00}]$$

$$[C_{01} - C_{21}]\, \Lambda(d) \underset{D=0 \text{ or } D=1}{\overset{D=2 \text{ or } D=1}{\gtrless}} \frac{P_0}{P_1} [C_{20} - C_{00}]$$

$$[C_{21} - C_{11}]\, \Lambda(d) \underset{D=2 \text{ or } D=0}{\overset{D=1 \text{ or } D=0}{\gtrless}} \frac{P_0}{P_1} [C_{10} - C_{20}]$$

Solving, with $C_{00} = C_{11} = 0$:

$$\Lambda(d) \underset{D=0 \text{ or } D=2}{\overset{D=1 \text{ or } D=2}{\gtrless}} \frac{P_0}{P_1} \cdot \frac{C_{10}}{C_{01}} \qquad (3)$$

$$\Lambda(d) \underset{D=0 \text{ or } D=1}{\overset{D=2 \text{ or } D=1}{\gtrless}} \frac{P_0}{P_1} \cdot \frac{C_{20}}{C_{01} - C_{21}} \qquad (4)$$

$$\Lambda(d) \underset{D=2 \text{ or } D=0}{\overset{D=1 \text{ or } D=0}{\gtrless}} \frac{P_0}{P_1} \cdot \frac{C_{10} - C_{20}}{C_{21}} \qquad (5)$$

According to equations (3), (4) and (5), the fusion decision rules depend on the values of the $C_{ij}$ costs and the a priori probabilities $P(H_1)$ and $P(H_0)$ of the two hypotheses, $H_1$ and $H_0$ respectively. We assume that the a priori probabilities are known, hence we are interested in estimating $\Lambda(d)$.

For malicious worm detection, $C_{10}$ and $C_{01}$ are the costs associated with a false positive decision and a false negative decision respectively. For worm detection systems without a bias for false positives or false negatives, cost values $C_{10} = C_{01}$ are appropriate to ensure the same penalty for decisions that result in either false positives or false negatives.

Equations (3), (4) and (5) also show that the GEP framework can make use of the a priori probabilities of both hypotheses $H_0$ and $H_1$ if they are known. When they are not known, we assume that $P(H_0) = P(H_1)$, thus nullifying the impact of the a priori probabilities on the fusion decision rules in (3), (4), (5). Also, note that the GEP decision process reduces to a binary decision process if indecision is not considered. In this case, only the decision rule (3) applies since $C_{20}$ and $C_{21}$ become undefined.
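As an illustration of the minimum-risk principle behind the decision rules, the following is a minimal Python sketch rather than the implementation used in this work. It picks the fused decision whose risk term $P_0 C_{i0} P(d \mid H_0) + P_1 C_{i1} P(d \mid H_1)$ is smallest. The function name `min_risk_decision`, the argument layout and the example likelihood values are assumptions made for illustration, and ties are broken in favor of the lowest decision index, which is also an assumption.

```python
from fractions import Fraction


def min_risk_decision(lik_h0, lik_h1, cost, p0=Fraction(1, 2), p1=Fraction(1, 2)):
    """Return the fused decision D that minimizes the per-decision risk term.

    lik_h0, lik_h1: P(d | H0) and P(d | H1) for the observed decision vector d.
    cost[i][j]:     cost C_ij of deciding i when hypothesis H_j is true.
    p0, p1:         a priori probabilities of H0 and H1 (equal if unknown).
    Hypothetical helper; ties go to the lowest decision index.
    """
    risks = [
        p0 * cost[i][0] * lik_h0 + p1 * cost[i][1] * lik_h1
        for i in range(len(cost))
    ]
    return min(range(len(risks)), key=risks.__getitem__)


# Example costs: C00 = C11 = 0, C10 = C01 = 1, C20 = C21 = 1/3
# (rows are decisions D = 0, 1, 2; columns are hypotheses H0, H1).
third = Fraction(1, 3)
costs_with_indecision = [[0, 1], [1, 0], [third, third]]

# Likelihood ratio 1/3, 1 and 3 respectively (illustrative values only).
benign = min_risk_decision(Fraction(3, 10), Fraction(1, 10), costs_with_indecision)
unsure = min_risk_decision(Fraction(1, 4), Fraction(1, 4), costs_with_indecision)
malicious = min_risk_decision(Fraction(1, 10), Fraction(3, 10), costs_with_indecision)
```

Dropping the third row of the cost table removes the indecision option, in which case the same function behaves as a binary likelihood ratio test.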

To illustrate an application of the decision rules, we consider different possible cases as was done in [28]. We assume the a priori probabilities of both hypotheses $H_1$ and $H_0$ are unknown, hence $P(H_0) = P(H_1)$, and that the cost of an incorrect decision is greater than the cost associated with an indecision (i.e. $C_{10} > C_{20}$, $C_{01} > C_{21}$).

A. Case 1: $C_{00} = C_{11} = 0$, $C_{10} > 2C_{20}$, $C_{01} > 2C_{21}$

Let $C_{10} = C_{01} = 1$, $C_{20} = C_{21} = \frac{1}{3}$. Hence, $\frac{C_{10}}{C_{01}} = 1$, $\frac{C_{20}}{C_{01} - C_{21}} = 0.5$, and $\frac{C_{10} - C_{20}}{C_{21}} = 2$. Equations (3), (4), (5) are used to generate the partition in Fig. 2. In this case, Fig. 2 shows that the indecision region lies between the two definite decision regions. This case is applicable to practical detection systems that are not always capable of providing evidence to support definite decisions, hence the option of indecision is provided.


[Figure 2: plot of the likelihood ratio $\Lambda(x)$ against $x$, partitioned into the regions "Decide D=0", "Decide D=2" and "Decide D=1".]

Fig. 2. Case 1: The indecision region lies between the two definite decision regions.

[Figure 3: plot of the likelihood ratio $\Lambda(x)$ against $x$, partitioned into the regions "Decide D=0" and "Decide D=1".]

Fig. 3. Case 2: The indecision region is completely eliminated.

B. Case 2: $C_{00} = C_{11} = 0$, $C_{10} = 2C_{20}$, $C_{01} = 2C_{21}$

Let $C_{10} = C_{01} = 1$, $C_{20} = C_{21} = 0.5$. Hence, $\frac{C_{10}}{C_{01}} = \frac{C_{20}}{C_{01} - C_{21}} = \frac{C_{10} - C_{20}}{C_{21}} = 1$. All three thresholds have the same value and the indecision region is non-existent, as shown in Fig. 3. In this case the decision process corresponds to a standard binary decision process. This case is applicable if the detection system is always capable of providing hard evidence sufficient to support a decision, or if the system is not capable of dealing with indecision.

C. Case 3: $C_{00} = C_{11} = 0$, $C_{10} < 2C_{20}$, $C_{01} < 2C_{21}$

Let $C_{10} = C_{01} = 1$, $C_{20} = C_{21} = \frac{2}{3}$. Hence, $\frac{C_{10}}{C_{01}} = 1$, $\frac{C_{20}}{C_{01} - C_{21}} = 2$, and $\frac{C_{10} - C_{20}}{C_{21}} = 0.5$. In this case, Fig. 4 shows that the two definite decision regions lie between two indecision regions, the exact opposite of Case 1. Case 3 represents a detection system that exhibits a standard binary decision process within a likelihood ratio bound (in this case $0.5 \le \Lambda(d) \le 2$). Beyond the bound, the detection system is incapable of making a definite decision.

Practical detection systems are more suited to Case 1 and Case 2. Though the GEP theory can be generalized to deal with detector indecision, the implementation of our proposed EDANC detection scheme does not consider indecision in this paper. We focus on detectors that are capable of making decisions and leave detector indecision for future work.
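The three cases can be checked numerically. The sketch below is illustrative only: it evaluates the threshold expressions $\frac{P_0}{P_1}\frac{C_{10}}{C_{01}}$, $\frac{P_0}{P_1}\frac{C_{20}}{C_{01} - C_{21}}$ and $\frac{P_0}{P_1}\frac{C_{10} - C_{20}}{C_{21}}$ under the stated assumptions $C_{00} = C_{11} = 0$ and equal priors. The function name `gep_thresholds` is hypothetical, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction


def gep_thresholds(c10, c01, c20, c21, p0=Fraction(1, 2), p1=Fraction(1, 2)):
    """Return the three likelihood-ratio thresholds for the GEP decision rules.

    Assumes C00 = C11 = 0 and C01 > C21 (hypothetical helper for illustration).
    The tuple is ordered as: 0-vs-1 threshold, 0-vs-2 threshold, 2-vs-1 threshold.
    """
    prior_ratio = Fraction(p0) / Fraction(p1)
    t_01 = prior_ratio * c10 / c01
    t_02 = prior_ratio * c20 / (c01 - c21)
    t_21 = prior_ratio * (c10 - c20) / c21
    return t_01, t_02, t_21


# C10 = C01 = 1 throughout; only the indecision costs change between cases.
case1 = gep_thresholds(1, 1, Fraction(1, 3), Fraction(1, 3))
case2 = gep_thresholds(1, 1, Fraction(1, 2), Fraction(1, 2))
case3 = gep_thresholds(1, 1, Fraction(2, 3), Fraction(2, 3))
```

With these inputs the indecision band in Case 1 spans likelihood ratios between 1/2 and 2, collapses to the single threshold 1 in Case 2, and inverts in Case 3.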

[Figure 4: plot of the likelihood ratio $\Lambda(x)$ against $x$, partitioned into the regions "Do not decide D=0", "Decide D=1", "Decide D=0" and "Do not decide D=1".]

Fig. 4. Case 3: The definite decision regions lie between two indecision regions.

[Figure 5: diagram labels: Attacking Network - 1; gateway routers GR-1, GR-2 and GR-3; Target Network-A, Target Network-B, Target Network-C and Target Network-D; cells within a target network; direction of malicious traffic flow.]

Fig. 5. Typical worm attack on multiple networks

III. THE EDANC SCHEME

Fig. 5 depicts a typical attack scenario in which single or multiple attackers in Network-1 and Network-2 launch fast spreading scanning worm attacks on several enterprise networks (Target Network-A, Target Network-B, Target Network-C, Target Network-D). Typically, well-designed enterprise networks are logically subdivided into cells or network zones as shown in Fig. 5.

A. Detection Technique

The EDANC scheme uses host-based anomaly detection software running on detector endpoints (DEs) for detecting malicious intrusions into the network. In a large enterprise network, a number of DEs located within distributed cells detect and respond to intrusion attempts targeted at the cells. Individual DEs perform real-time monitoring and recording of profiles of all network traffic originating from outside their cell and targeted at the DEs. We define a profile as a 4-tuple consisting of srcIP, dstport, proto, payload. srcIP is the source IP address in the IP header of packets captured by the DE, dstport is the target port, proto is the transport layer protocol used, and payload is the signature of the exploit carried in the payload of the IP packet. When the host AIDS software running on a detector endpoint (DE) makes a positive detection of a malicious or unauthorized intrusion, the following occurs:


1) The DE immediately sends an alert to other participating DEs within the cell. DEs communicate only with other DEs.

2) When the alert is received, the DEs within the target cell start real-time recording of profiles for all network traffic originating from outside their cell and targeted at the DEs for a pre-set capture interval. The DE capture interval corresponds to the detection window with duration $T_d$.

3) For each traffic profile $i$ detected in the target cell by a DE, two hypotheses $H_1$ and $H_0$ are considered, where $H_1$ is the hypothesis that the traffic profile $i$ is malicious and $H_0$ is the hypothesis that the traffic profile $i$ is benign. For the profile $i$, let $d^i_j$ be the individual local decision by the $j^{th}$ DE based on observed intrusion attempts. $d^i_j = 1$ if $H_1$ is decided and $d^i_j = 0$ if $H_0$ is decided. We assume the anomaly detection software running on a DE is capable of making such a decision. For this work, we considered a binary local detection outcome which did not include indecision. However, the GEP theory is capable of dealing with indecision as explained earlier. For a target cell with $n$ DEs, let $d^i = (d^i_1, d^i_2, \ldots, d^i_{n_i})$ be the vector of individual DE decisions on traffic profile $i$.

4) At the end of the detection window, the DEs in the cell transfer their records and local decisions to their upstream gateway router and continue monitoring for unauthorized intrusions. If, during the detection window, the anomaly detection software running on the DEs does not alert and hence no local decisions are made, then the DEs do not transfer any records to the gateway router.

The DEs do not initiate communication with any host outside their cell, nor do they participate in normal traffic transactions. Their role is to make local decisions (malicious or benign) concerning detected intrusions and communicate that decision to their gateway router.
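The records that DEs hand to the gateway router can be pictured with a small data-structure sketch. The Python below is a hypothetical illustration, not the EDANC implementation: the `Profile` field names mirror the 4-tuple (srcIP, dstport, proto, payload), while `decision_vectors`, the DE identifiers and the sample address are assumptions introduced here.

```python
from collections import defaultdict
from typing import NamedTuple


class Profile(NamedTuple):
    """Traffic profile 4-tuple recorded by a detector endpoint (DE)."""
    src_ip: str       # source IP address from the IP header
    dst_port: int     # targeted port
    proto: str        # transport layer protocol
    payload_sig: str  # signature of the exploit found in the payload


def decision_vectors(reports):
    """Group binary local decisions by profile.

    reports: iterable of (de_id, profile, decision), with decision 1 for
    malicious (H1) and 0 for benign (H0).
    Returns {profile: [decisions ordered by DE identifier]}.
    """
    per_profile = defaultdict(dict)
    for de_id, profile, decision in reports:
        if decision not in (0, 1):
            raise ValueError("local decisions are binary: 0 (benign) or 1 (malicious)")
        per_profile[profile][de_id] = decision
    return {
        profile: [by_de[de_id] for de_id in sorted(by_de)]
        for profile, by_de in per_profile.items()
    }


# Illustrative detection window: three DEs report on one profile.
worm = Profile("203.0.113.7", 445, "tcp", "sig-a")
window_reports = [("DE-2", worm, 1), ("DE-1", worm, 1), ("DE-3", worm, 0)]
vectors = decision_vectors(window_reports)
```

Each list in `vectors` plays the role of the decision vector for one profile, and its length is the number of DEs that observed that profile.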

B. Correlation Technique

The upstream gateway router receives the records and local decisions transferred from the DEs in the target cell. The gateway router runs a correlation engine based on Generalized Evidence Processing (GEP) theory data fusion to determine the most likely profile(s) associated with the detected malicious or unauthorized intrusion(s). Multiple correlation processes can run on the gateway router simultaneously.

1) Correlation Engine (based on GEP Theory): At the gateway router, we are interested in using collected DE local decisions in making an optimal fused decision which minimizes a cumulative decision risk. For each traffic profile $i$ with associated records and local decisions received from the DEs, two hypotheses $H_1$ and $H_0$ are considered, where $H_1$ is the hypothesis that the traffic profile $i$ is malicious and $H_0$ is the hypothesis that the traffic profile $i$ is benign. In general, the a priori probabilities of the two hypotheses, $P^i(H_1)$ and $P^i(H_0)$, can be estimated using historical data or experience. However, without loss of generality, we assume that $P^i(H_1)$ and $P^i(H_0)$ for each profile $i$ are the same. Hence, the GEP optimal decision criterion at the fusion center (the gateway router) which minimizes the cumulative decision risk can be expressed using the following likelihood ratio rule (derived from (3)):

$$\Lambda(d^i) = \frac{P(d^i \mid H_1)}{P(d^i \mid H_0)} \underset{D=0}{\overset{D=1}{\gtrless}} \frac{C_{10}}{C_{01}} = \eta \qquad (6)$$

where $C_{10}$ and $C_{01}$ are the costs of a false positive decision and a false negative decision respectively. $D = 1$ is the fused decision that traffic profile $i$ is a malicious traffic profile while $D = 0$ is the decision that traffic profile $i$ is a benign traffic profile. The choice of $C_{10}$ and $C_{01}$ is system design driven and has an impact on the behavior of our algorithm, as we show in our experiments. With our implementation, the GEP decision process reduces to a binary decision process, hence we do not consider indecision³. This corresponds to Case 2 in Section II.

To express (6) in more practical terms, let $P_{d_j}$ denote the detection probability and $P_{f_j}$ denote the false alarm probability of the $j^{th}$ individual detector. Both $P_{d_j}$ and $P_{f_j}$ depend on the quality of the detector. In our implementation, the DEs are homogeneous⁴ since all DEs are assumed to run the same anomaly host-based intrusion detection software, hence $P_{d_j} = P_d$ and $P_{f_j} = P_f$ for all $j$. Also, $P_d > P_f$. See Table I for the relationship between $H_1$, $H_0$ and $P_d$, $P_f$.

TABLE I
RELATIONSHIP BETWEEN $H_1$, $H_0$ AND $P_d$, $P_f$

True nature    Detector decision $H_1$    Detector decision $H_0$
$H_1$          $P_d$                      $1 - P_d$
$H_0$          $P_f$                      $1 - P_f$

For a particular profile $i$, let $n_i$ be the total number of detectors in the target cell with observations of profile $i$ and $m_i$ be the total number of such detectors with individual local decisions which favor $H_1$. If we assume the observations on individual DEs are conditionally independent given hypotheses $H_1$ and $H_0$, then according to GEP, the conditional probabilities at the gateway router are

$$P(d^i \mid H_1) = P_d^{\,m_i} (1 - P_d)^{\,n_i - m_i}$$

$$P(d^i \mid H_0) = P_f^{\,m_i} (1 - P_f)^{\,n_i - m_i}$$

Hence, the likelihood ratio test in (6) is equivalent to

$$\Lambda(d^i) = \frac{P(d^i \mid H_1)}{P(d^i \mid H_0)} = \left( \frac{P_d}{P_f} \right)^{m_i} \left( \frac{1 - P_d}{1 - P_f} \right)^{n_i - m_i} \underset{D=0}{\overset{D=1}{\gtrless}} \frac{C_{10}}{C_{01}} \qquad (7)$$

Based on computation of $\Lambda(d^i)$, the likelihood ratio test in (7) determines whether the correlation algorithm considers traffic profile $i$ as a malicious traffic profile (i.e. $D = 1$) or a benign traffic profile (i.e. $D = 0$).

2) Algorithm for handling multiple malicious attacks: When multiple simultaneous malicious fast worm attacks occur and are detected using the correlation algorithm, there is a need to determine how to respond to the multiple attacks. Our proposed approach involves computing the combined probability of detection, $P_D^i$, for each profile $i$ determined to be malicious using the likelihood ratio test in (7). $P_D^i$ for each detected traffic profile $i$ is computed using the following expressions (see Table II for a description of the notation):

$$P_D^i = P\left( \Lambda(d^i) \ge \eta \mid H_1 \right) \qquad (8)$$

If we let $l_i$ be the minimum number of detectors which favor $H_1$ required to satisfy the condition $\Lambda(d^i) \ge \eta$ (determined using (7)), then $P_D^i$ defined in (8) can be expressed as:

$$P_D^i = \sum_{m_i \,:\, m_i \ge l_i} \binom{n_i}{m_i} (P_d)^{m_i} (1 - P_d)^{n_i - m_i}$$

Containment action for the multiple malicious traffic profiles is triggered in a sequence ordered by the value of $P_D^i$ for the different profiles. Containment for profile $m$ is triggered before that of profile $l$ if

$$P_D^m > P_D^l \qquad (9)$$

TABLE II
PARAMETERS FOR GEP-BASED CORRELATION ALGORITHM

Notation          Explanation
$P_D^i$           Combined probability of positive detection for traffic profile $i$
$P_{d_j}$         Detection probability for the $j^{th}$ individual detector
$P_{f_j}$         False alarm probability for the $j^{th}$ individual detector
$d^i_j$           Individual local binary decision by the $j^{th}$ DE on intrusion attempts due to profile $i$
$\Lambda(d^i)$    GEP likelihood ratio for optimal fused decision
$\eta$            GEP likelihood ratio threshold, also equivalent to $C_{10}/C_{01}$
$C_{ij}$          Cost or penalty associated with a detector decision $i$ when the true hypothesis is $H_j$
$n_i$             Total number of detectors with observations of profile $i$
$m_i$             Total number of detectors with observations of profile $i$ and that favor $H_1$
$l_i$             Number of detectors which favor $H_1$ required to minimally satisfy $\Lambda(d^i) \ge \eta$

³ Indecision within the GEP theory framework is reserved for future work.

⁴ The implementation can be modified to use heterogeneous DEs.
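To make the correlation step concrete, here is a minimal Python sketch of the likelihood ratio test, the minimum number of agreeing detectors, and the combined detection probability used to order containment, assuming homogeneous detectors. All function names and the numeric values are illustrative assumptions, and the comparison is taken as "greater than or equal to the threshold", which is one reading of the fused decision rule.

```python
from fractions import Fraction
from math import comb


def likelihood_ratio(n, m, p_d, p_f):
    """Likelihood ratio for n reporting detectors, m of which favor H1."""
    return (p_d / p_f) ** m * ((1 - p_d) / (1 - p_f)) ** (n - m)


def min_detectors(n, p_d, p_f, eta):
    """Smallest m with likelihood ratio >= eta, or None if no m <= n qualifies."""
    for m in range(n + 1):
        if likelihood_ratio(n, m, p_d, p_f) >= eta:
            return m
    return None


def combined_detection_probability(n, l, p_d):
    """Probability that at least l of n detectors decide 'malicious' under H1."""
    return sum(comb(n, m) * p_d ** m * (1 - p_d) ** (n - m) for m in range(l, n + 1))


def containment_order(detection_probs):
    """Order profile identifiers by decreasing combined detection probability."""
    return sorted(detection_probs, key=detection_probs.get, reverse=True)


# Illustrative values only: 5 detectors, P_d = 0.9, P_f = 0.1, threshold 1.
p_d, p_f = Fraction(9, 10), Fraction(1, 10)
needed = min_detectors(5, p_d, p_f, 1)
p_detect = combined_detection_probability(5, needed, p_d)
order = containment_order({"profile-a": Fraction(3, 5), "profile-b": p_detect})
```

Because the detection probability exceeds the false alarm probability, the ratio grows with the number of agreeing detectors, so a linear scan for the minimum count is sufficient in this sketch.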

C. Containment Technique

The EDANC containment technique is based on a distributed defense scheme which uses a reactive blocking protocol (explained in more detail below) running on gateway routers to achieve collaborative containment. Using the detection technique explained above, the gateway router attached to the target cell under malicious attack identifies the malicious traffic profile(s) and triggers a containment action. First, the gateway router attached to the target cell (in Fig. 6, Cell-1 is the target cell and GR-1 is the attached gateway router) executes a reactive block⁵ against the traffic profile(s) identified as malicious by automatically applying router filters against ingress connections from the profile(s) into the cell. The filter is applied against the 4-tuple consisting of the IP address of the attacker, the target vulnerable port, the transport protocol used, and the payload content. The gateway router of the target cell (GR-1 in Fig. 6) then originates a block notification message to its neighbor routers (GR-2, GR-3 and GR-8 in Fig. 6) which informs them of the blocked profile(s). With the EDANC containment technique, all gateway routers run the reactive blocking protocol when a block notification message is received. The reactive blocking protocol that runs on the gateway routers determines how the routers respond to a block notification message. We describe the reactive blocking protocol below using a number of processes.

⁵ Reactive blocking, blocking and filtering are used interchangeably to describe the gateway router's containment action.

[Figure 6: diagram labels: gateway routers GR-1 to GR-10 with attached cells Cell-1 to Cell-10; cell under direct attack; notification messages.]

Fig. 6. Distributed collaborative containment strategy

1) Filtering process: The protocol starts by creating a router filter on a neighbor gateway router that receives the block notification message (in this case GR-2, GR-3 and GR-8 in Fig. 6 are the neighbor gateway routers). The filter blocks ingress traffic that matches the profile(s) contained in the block notification message from entering all cells or subnets existing on the neighbor gateway router.

2) Monitoring process: With the filter still applied, a neighbor gateway router carries out real-time monitoring and recording of the number of hits on the filter caused by the blocked profile(s) for a time interval of $\tau$ seconds to verify the actual existence of the suspected malicious worm activity. In our experiments, a tractable value was chosen for parameter $\tau$ based on experience. After $\tau$ seconds the algorithm computes a probing rate $r_i$ for each blocked profile $i$:

$$r_i = \frac{\text{number of hits on filter by profile } i}{\tau}$$

It then carries out the following conditional test with chosen parameter $\rho$. As with $\tau$, a tractable value was chosen for parameter $\rho$ based on experience with our experiments.

If ( $r_i \ge \rho$ ) {
    then profile $i$ is associated with a worm attack;
    transition to the Notification process;
}
If ( $r_i < \rho$ ) {
    then profile $i$ is not associated with a worm attack;
    transition to the Unblocking process.
}

3) Notification process: During this process, the neighbor gateway router notifies its own neighbor routers of the suspicious profile by sending a block notification message to them. This also triggers the reactive blocking protocol on the new neighbor gateway routers. Note that a block notification message is not sent back along the path on which it was received. This ensures that a gateway router that originates a block notification message does not receive the same message from its neighbors. As an example, in Fig. 6, GR-2, which received an initial block notification message from GR-1, will send a block notification message to GR-4 and GR-5 but will not send the message back to GR-1.

4) Unblocking process: In this state the filter is removed to prevent a denial of service on non-malicious traffic.

Fast spreading malicious worms are known to exhibit probing rates on the order of tens of thousands of probes per second. In the event of a fast spreading worm attack, the reactive blocking protocol on most gateway routers will reach the Notification process if the $\tau$ and $\rho$ parameters are properly chosen. In that scenario, all enterprise routers quickly become aware of the suspicious profile(s) and establish filters against ingress traffic that matches the profile(s), thus achieving fast and automated enterprise-wide containment.
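The four processes of the reactive blocking protocol can be sketched as a small simulation. The Python below is an illustrative sketch under several assumptions, not the protocol implementation: the function name `propagate_block`, the per-router hit counts, the numeric monitoring interval and rate threshold, and the rule that a router handles only the first notification it receives are all hypothetical. The topology encodes only the adjacencies named in the text (GR-1 with GR-2, GR-3 and GR-8; GR-2 with GR-4 and GR-5) and treats the remaining routers as leaves.

```python
from collections import deque


def propagate_block(topology, origin, filter_hits, interval_s, rate_threshold):
    """Simulate block notifications spreading from the origin gateway router.

    topology:       {router: [neighbor routers]}
    filter_hits:    {router: hits on the filter during the monitoring interval}
    interval_s:     monitoring interval in seconds
    rate_threshold: probing rate at or above which the profile is treated as a worm
    Returns (routers that keep the filter, routers that unblock).
    """
    blocking, unblocked = {origin}, set()
    handled = {origin}
    # Each queue entry is (receiving router, router the message came from).
    pending = deque((nbr, origin) for nbr in topology[origin])
    while pending:
        router, sender = pending.popleft()
        if router in handled:  # assumption: only the first notification is processed
            continue
        handled.add(router)
        # Filtering and Monitoring processes: count hits, then compute the rate.
        rate = filter_hits.get(router, 0) / interval_s
        if rate >= rate_threshold:
            # Notification process: forward to neighbors, never back to the sender.
            blocking.add(router)
            pending.extend((nbr, router) for nbr in topology[router] if nbr != sender)
        else:
            # Unblocking process: remove the filter to avoid denial of service.
            unblocked.add(router)
    return blocking, unblocked


topology = {
    "GR-1": ["GR-2", "GR-3", "GR-8"],
    "GR-2": ["GR-1", "GR-4", "GR-5"],
    "GR-3": ["GR-1"], "GR-8": ["GR-1"],
    "GR-4": ["GR-2"], "GR-5": ["GR-2"],
}
hits = {"GR-2": 50000, "GR-3": 40000, "GR-8": 10, "GR-4": 30000, "GR-5": 0}
kept, removed = propagate_block(topology, "GR-1", hits, interval_s=1.0, rate_threshold=1000)
```

In this run GR-8 and GR-5 see too few hits and remove their filters, while the routers on the worm's path keep blocking and continue to forward the notification.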

IV. DETECTION INTERVAL ANALYSIS

For a particular malicious worm traffic with profile $i$, the detection interval, $t_{D,i}$, is the interval between the time the worm scan first hits a target cell and the time the worm is positively detected in the target cell. As explained in section III-B, a minimum of $k_i$ DEs in the target cell must make a positive detection of the malicious intrusion to satisfy the likelihood ratio condition for the GEP optimal fused decision in favor of $H_1$ to be made. Hence, the detection interval, $t_{D,i}$, is the interval between the time the worm scan first hits a target cell and the time a minimum of $k_i$ DEs in the target cell make a positive detection of the malicious intrusion. The detection interval, $t_{D,i}$, comprises the total inter-infection interval, $t_{I,i}$, and the total time to infect, $t_{\mathrm{infect},i}$, for the particular malicious traffic with profile $i$ (i.e. $t_{D,i} = t_{I,i} + t_{\mathrm{infect},i}$). These parameters are explained in the following sections. We also show in section IV-B that since $t_{\mathrm{infect},i}$ is negligible for fast scanning worms, $t_{D,i} \approx t_{I,i}$.

A. Total inter-infection interval, $t_{I,i}$

Inter-infection interval is the time interval between successive hits experienced by hosts in a target cell as a result of worm scans. The total inter-infection interval, $t_{I,i}$, is the sum of the time intervals between successive worm scans of the DEs in the target cell until a minimum of $k_i$ DEs in the target cell make a positive detection of intrusion.

We model scanning of hosts in the target cell by a Poisson process with an average rate of $r$ hosts/second (h/s). Use of the Poisson distribution to model the random scanning behavior of malicious worms is not uncommon; the authors in [22], [12] and [31] used the Poisson process to model the behavior of random scanning worms. The inter-infection interval between hosts is therefore an exponential random variable with mean $1/r$, and the total inter-infection interval, $t_{I,i}$, is the sum of inter-infection intervals until at least $k_i$ DEs in the target cell make a positive detection. Two network scenarios are possible:

1) The total number of hosts in the target cell is $W$ and the number of detectors (DEs) in the target cell is $m$, where $m < W$.

2) All hosts in the target cell function as detectors, hence $m = W$.

1) Scenario 1: $m < W$: In this scenario, we assume that there are a total of $W$ hosts in the target cell, comprising $m$ DEs and therefore $W - m$ non-detectors (non-DEs), and that $s_i$ hosts (comprising DEs and non-DEs) in the target cell are scanned by the intrusion traffic with profile $i$ before the condition for detection is met. Then the minimum total inter-infection interval, $t_{I,\min,i}$ $^{7}$, will be an Erlang-$s_i$ random variable with mean $s_i / r$. But

$$s_i = k_i + h_i, \qquad m \geq k_i$$

where $h_i$ is the number of non-DEs that were scanned before a positive detection decision is made. $h_i$ is a random variable with values that range between $0$ and $W - k_i$. Since $h_i$ can take values between $0$ and $W - k_i$ with equal probability, we can assume that $h_i$ follows a uniform distribution, $h_i \sim U(0, W - k_i)$. Hence,

$$E[t_{I,\min,i}] = E\left[\frac{s_i}{r}\right] = E\left[\frac{k_i + h_i}{r}\right] = \frac{k_i + W}{2r} \tag{10}$$

2) Scenario 2: $m = W$: In this scenario, we assume that the number of detectors, $m$, in the target cell is equal to the total number of hosts, $W$, in the target cell and that $s_i$ hosts (comprising only DEs) in the target cell are scanned by the intrusion traffic with profile $i$ before the condition for detection is met. Then $s_i = k_i$. The minimum total inter-infection interval, $t_{I,\min,i}$, is also an Erlang-$k_i$ random variable, with mean $k_i / r$. Since both $k_i$ and $r$ are constants,

$$E[t_{I,\min,i}] = E\left[\frac{k_i}{r}\right] = \frac{k_i}{r} \tag{11}$$
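The two expectations in (10) and (11) can be checked numerically. The sketch below is illustrative only: the function names and the values of `k_i`, `W` and `r` are assumptions, and the simulation draws the number of scanned non-DEs as a discrete uniform integer on 0 to W - k_i, which has the same mean as the uniform distribution assumed in the text.

```python
import random


def mean_interval_scenario1(k_i, W, r):
    """Closed form of (10): expected minimum total inter-infection interval, m < W."""
    return (k_i + W) / (2.0 * r)


def mean_interval_scenario2(k_i, r):
    """Closed form of (11): expected minimum total inter-infection interval, m = W."""
    return k_i / r


def simulate_scenario1(k_i, W, r, trials=5000, seed=1):
    """Monte Carlo estimate for Scenario 1 under the same assumptions as (10)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        h_i = rng.randint(0, W - k_i)            # non-DEs scanned before detection
        s_i = k_i + h_i                          # hosts scanned in total
        total += rng.gammavariate(s_i, 1.0 / r)  # Erlang-s_i interval with rate r
    return total / trials


# Hypothetical parameter values, chosen only for illustration.
k_i, W, r = 100, 254, 30.0
print("closed form (10):", mean_interval_scenario1(k_i, W, r))
print("simulated       :", round(simulate_scenario1(k_i, W, r), 2))
print("closed form (11):", round(mean_interval_scenario2(127, r), 2))
```

Both closed forms are inversely proportional to the scanning rate, which is why the detection interval falls as the worm scans faster.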

B. Total time to infect, $t_{\mathrm{infect},i}$

This is the time interval it takes to scan and infect a vulnerable host in the target cell. This time is largely dependent on the nature of the intrusion attack and the vulnerability being exploited on the endpoint. For analysis, we assume that $t_{\mathrm{infect},i}$ is negligible for fast spreading worms. Therefore,

$$t_{D,\min,i} \approx E[t_{I,\min,i}] \tag{12}$$

where $t_{D,\min,i}$ is the minimum average detection interval for profile $i$.

Using simulations, Fig. 7 was generated using (10), (11) and (12). Fig. 7 shows that $t_{D,\min,i}$ decreases progressively with increase in worm scanning rate.

$^{7}$ $t_{I,\min,i}$ is the minimum total inter-infection interval, which occurs when hosts in the target cell are scanned only once in a single worm attack instance. Multiple host scans by a particular malicious worm will result in a greater total inter-infection interval for the worm.

[Figure 7: minimum average detection interval (seconds) versus worm scanning rate (hosts/second). Curves: Scenario 1, m = 128, m < W; Scenario 2, m = 254, m = W.]

Fig. 7. Average detection interval using (10), (11) and (12). $W = 254$.



[Figure 8: gateway routers GR-1, GR-2 and GR-3; Target Network-A, Network-B, Network-C and Network-D, each subdivided into cells; arrows indicate the direction of malicious traffic flow.]

Fig. 8. Test-bed used for experimentation

V. EXPERIMENTATION

A. Description of test-bed setup

Fig. 8 shows the topology of our live test-bed, which is described in more detail in this section. Worm attacks are sourced from Network-1 and Network-2 and targeted at vulnerable hosts in Network-A, Network-B, Network-C and Network-D. The target networks are logically subdivided into cells or network zones as shown in Fig. 8. Detector endpoints (DEs) that run the EDANC detection algorithm are located within the target cells and communicate with their gateway router (GR-1 and GR-2). The gateway router runs our GEP-based correlation algorithm.

To evaluate the functionality and performance of the EDANC scheme on a live test-bed, we emulated scanning worm attacks using modified Blaster worm source code [21]. To emulate multiple malicious attacks, the source code was used to instrument two worms that exploited two different vulnerabilities. The first, worm-1, was instrumented to create a directory named /root/infected-1 on the target host and copy a file named malicious-1 into that directory over TCP port 888. The second, worm-2, was instrumented to create a directory named /root/infected-2 on the target host and copy a file named malicious-2 into that directory over UDP port 999. Hosts in Network-1 and Network-2 were used to launch worm-1 and worm-2 attacks respectively on hosts in the target networks (Network-A, Network-B, Network-C and Network-D). We used OpenVZ virtualization $^{8}$ [27] to create the required vulnerable host population in the target networks. Up to 254 virtual hosts were created on a number of Linux workstations running OpenVZ kernel-2.6.22 to emulate a class C network population in each target network. Scanning rates of up to 70 hosts per second (h/s) were generated from a single worm-1 or worm-2 attack instance and, since all hosts in the target networks were vulnerable, the scanning rate is equivalent to the infection rate. In comparison, the Code Red worm [19] infected 359,000 hosts in less than 14 hours, equivalent to an average infection rate on the order of 7.1 h/s. The Witty worm [23] infected 110 hosts in the first 10 seconds, equivalent to an average infection rate of about 11 h/s, while the Slammer worm [18] infected more than 75,000 hosts within 10 minutes, equivalent to an average infection rate of over 125 h/s.
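The average infection rates quoted for the three historical worms follow from dividing the host count by the elapsed time. The short check below reproduces that arithmetic; the helper name is hypothetical, and because the text gives bounds ("less than 14 hours", "more than 75,000 hosts") the Code Red and Slammer figures are lower bounds on the true averages.

```python
def average_infection_rate(hosts, seconds):
    """Average number of hosts infected per second over the given period."""
    return hosts / seconds


code_red = average_infection_rate(359_000, 14 * 3600)  # 14 hours in seconds
witty = average_infection_rate(110, 10)                # first 10 seconds
slammer = average_infection_rate(75_000, 10 * 60)      # 10 minutes in seconds

print(f"Code Red: {code_red:.1f} h/s")  # prints "Code Red: 7.1 h/s"
print(f"Witty:    {witty:.1f} h/s")     # prints "Witty:    11.0 h/s"
print(f"Slammer:  {slammer:.1f} h/s")   # prints "Slammer:  125.0 h/s"
```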

For our test, the host-based Anomaly Intrusion Detection System (AIDS) running on the DEs was emulated using a detector agent (DA) that constantly monitored the directory structure and content of the DE, and generated an alert when a file named malicious-1 or malicious-2 was found in a directory named /root/infected-1 or /root/infected-2 respectively on the DE. We used the same probability of detection, $p_d$, as was used in [13]. In our implementation, a snort-based IDS was used for real-time recording on the DE $^{9}$, and the worm attacks randomly scanned hosts in one target network before selecting another target network. All hosts on the target networks functioned as detectors, hence $m = W$. The detection window parameter, $t_d$, on the detectors was kept fixed. Benign background network scans were maintained throughout the experiments.
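The detector agent's check can be sketched as a single scan over the watched directories. This is a hypothetical illustration rather than the agent used on the test-bed: the `WATCHED` mapping and `scan_once` function are assumed names, and the root directory is a parameter so the sketch can run against a temporary directory instead of /root.

```python
import tempfile
from pathlib import Path

# Directory name -> file name whose presence indicates an infection.
WATCHED = {"infected-1": "malicious-1", "infected-2": "malicious-2"}


def scan_once(root):
    """Return (directory, file) pairs for every infection marker found under root."""
    root = Path(root)
    alerts = []
    for dirname, filename in WATCHED.items():
        if (root / dirname / filename).is_file():
            alerts.append((dirname, filename))
    return alerts


# Demonstration in a throwaway directory standing in for /root on a DE.
with tempfile.TemporaryDirectory() as tmp:
    before = scan_once(tmp)
    (Path(tmp) / "infected-1").mkdir()
    (Path(tmp) / "infected-1" / "malicious-1").write_text("payload")
    after = scan_once(tmp)

print(before)  # prints []
print(after)   # prints [('infected-1', 'malicious-1')]
```

A real agent would call such a scan repeatedly, or subscribe to file-system events, and report each alert to its gateway router.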

The purpose of the experiments was to demonstrate how the EDANC detection technique can be used for detecting fast propagating worm attacks. The experiments may not be representative of all the possible worm attack scenarios that exist or may exist on the Internet today.

B. Experiment 1: Single worm attack instance - varying scanning rate

In this experiment, a single attacking host in network-1 was used to launch direct attacks on hosts in the target networks. The objective of the experiment was to evaluate the responsiveness of the EDANC scheme in detecting unauthorized malicious intrusions in a target network. The worm scanning rate was varied to investigate its impact on the EDANC detection scheme.

Fig. 9(a) shows that the average detection interval observed on GR-1 and GR-2 reduces progressively with increase in worm scanning rate. These observations concur with the results obtained analytically in Section IV. The results show that the EDANC scheme is capable of automatically detecting malicious intrusion attacks with a scanning rate of over 30 h/s within the short average detection interval shown in Fig. 9(a), measured from the start of the attack on a network with cells containing 254 hosts. Fig. 9(a) also presents the confidence interval for the average detection interval.

$^{8}$ OpenVZ is an operating system-level virtualization technology based on the Linux kernel and operating system.

$^{9}$ Note that our emulation of host-based detection with detector agents and snort-based real-time logging was only used to demonstrate the EDANC scheme. Other host AIDS software such as Thirdbrigade's host AIDS, Cisco's Security Agent and Tripwire's host AIDS can be used for host-based detection in enterprise deployment of the EDANC scheme.

[Figure 9(a): Experiment 1: Average detection interval. Average detection interval (seconds) observed on both GR-1 and GR-2 versus worm scanning rate (hosts/second).]

[Figure 9(b): Experiment 2: Effect of varying $p_f$ on $k_i$. Minimum number of positive detectors required for a fused decision, observed on GR-1 and GR-2, versus probability of false detection, $p_f$.]

[Figure 9(c): Experiment 3: Effect of $\gamma = C_{10}/C_{01}$ on $k_i$. Number of positive detectors required for a fused decision versus $\gamma$.]

[Figure 9(d): Experiment 4: Combined detection probability, $P_{D,i}$, for Worm-1 and Worm-2 over the experiment runs.]

Fig. 9. Experiment results with EDANC.

C. Experiment 2: Effect of varying probability of false positive detection, $p_f$

A second set of experiments was carried out to investigate the impact of varying the probability of false positive detection, $p_f$, of the individual detectors on the GEP-based correlation algorithm explained in section III-B.1. A single attacking host in network-1 was used to launch direct attacks on hosts in the target networks at a fixed scanning rate, and the attack session was stopped when a detection decision was made by the correlation algorithm running on GR-1 and GR-2. The probability of detection, $p_d$, of the individual detectors was held fixed. The number of hosts in a target cell was set to 254. Several experiment runs were made, and in each run the value of the probability of false positive detection, $p_f$, of the individual detectors in the target network was modified.

Figure 9(b) shows that an increase in $p_f$ results in a corresponding increase in the number of positive detectors, $k_i$ $^{10}$, required to meet the conditions for a fused decision in favor of $H_1$ using the likelihood ratio test in (7). This result is somewhat intuitive, since an increase in $p_f$ means that the individual detectors are less accurate. In this scenario, the correlation algorithm therefore required more reports of individual detector local decisions to arrive at a fused decision. Conversely, more accurate detectors will cause the correlation algorithm to require fewer individual detector local decisions to arrive at a fused decision.

D. Experiment 3: Effect of varying $\gamma = C_{10}/C_{01}$ on the minimum number of detectors required, $k_i$, for a fused decision

In this experiment, a single attacking host in network-1 was again used to launch direct attacks on hosts in the target networks, and the attack session was stopped when a detection decision was made by the correlation algorithm running on GR-1 and GR-2. The probability of detection, $p_d$, and the probability of false detection, $p_f$, of the individual detectors were held fixed at values similar to those used in [13]. The number of hosts in a target cell was set to 254. Several experiment runs were made, and in each run the value of $\gamma = C_{10}/C_{01}$ was modified.

Figure 9(c) shows that an increase in $\gamma$ results in a corresponding increase in the number of positive detectors, $k_i$, required to meet the conditions for a fused decision in favor of $H_1$ using the likelihood ratio test in (7). In our experiments with fast scanning worms, $\gamma$ was set so that the number of positive detectors $k_i$ required to meet the conditions for a fused decision in favor of $H_1$ remained at the average value chosen for this experiment. In practical deployments, security engineers can choose a desirable number of positive detectors required for a fused decision and tune $k_i$ to match this chosen number by adjusting $\gamma$.

$^{10}$ Positive detectors refer to detectors which favor the $H_1$ hypothesis.
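The trends in Experiments 2 and 3 can be reproduced with a generic likelihood ratio fusion rule. The sketch below is not the test in (7), which is defined earlier in the paper. It assumes `m` conditionally independent detectors with identical `p_d` and `p_f`, and a decision threshold that grows in proportion to the cost ratio, as in a standard Bayesian minimum-risk test. The function name and all numeric values are illustrative assumptions.

```python
import math


def min_positive_detectors(m, p_d, p_f, threshold=1.0):
    """Smallest number k of positive detectors (out of m) whose likelihood
    ratio reaches the threshold, or None if no k in 0..m does."""
    log_hit = math.log(p_d / p_f)                   # evidence from a positive report
    log_miss = math.log((1.0 - p_d) / (1.0 - p_f))  # evidence from a negative report
    for k in range(m + 1):
        if k * log_hit + (m - k) * log_miss >= math.log(threshold):
            return k
    return None


m, p_d = 254, 0.9  # hypothetical cell size and per-detector detection probability

# Less accurate detectors (larger p_f) need more positive reports.
for p_f in (0.01, 0.05, 0.10):
    print("p_f =", p_f, "->", min_positive_detectors(m, p_d, p_f))

# A larger threshold (larger cost ratio) also needs more positive reports.
for threshold in (1.0, 1000.0):
    print("threshold =", threshold, "->", min_positive_detectors(m, p_d, 0.05, threshold))
```

Under these assumptions the required count rises with both `p_f` and the threshold, which matches the direction of the results in Fig. 9(b) and Fig. 9(c).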

E. Experiment 4: Multiple simultaneous worm attacks

In this experiment, single attacking hosts from different networks (network-1 and network-2 in Fig. 8) were used to launch different attacks (worm-1 and worm-2 respectively) on hosts in the target networks. The attacking hosts from network-1 and network-2 scanned the target networks at different scanning rates, with worm-1 scanning faster than worm-2.

We observed that both worms were identified as malicious (i.e. a fused decision in favor of $H_1$) using the likelihood ratio test in (7). The combined detection probabilities for the worm-1 and worm-2 traffic profiles are shown in Fig. 9(d). The figure shows that though both worms were identified as malicious and selected for containment, worm-1 (the faster worm) was blocked first since the combined probability of detection for worm-1 was greater than that of worm-2 (i.e. $P_{D,1} > P_{D,2}$), as explained in (9).

VI. COLLABORATIVE DEFENSE

In [2], we developed the Analytical Active Worm Containment (AAWC) model, which was used to model the host population protected as a result of the EDANC collaborative defense scheme in a large hierarchical network $^{11}$ as depicted in Fig. 10 and Fig. 11. In Fig. 10 and Fig. 11 the nodes represent network routers: the level $L$ nodes represent gateway routers, which contain cells $^{12}$, while level 0 represents the network core routers. Routers upstream of level $L$ do not contain cells. When a gateway router implements a blocking filter against an intrusion traffic, the filter is applied to all cells contained in the gateway router, hence protecting them from the intrusion traffic. Also, when an upstream router implements a blocking filter against an intrusion traffic, all cells attached to gateway routers downstream of that upstream router are protected from the intrusion traffic. While we acknowledge that our hierarchical network topology is not representative of all real-life production networks, it depicts the general topology of a hierarchical network. The authors in [5] [15] emphasized that the Internet and most well-designed large enterprise networks generally follow a hierarchical architecture. We therefore considered our topology sufficient to demonstrate the protection capability of EDANC against large scale fast spreading worm epidemics.

$^{11}$ The Internet and most well-designed large enterprise networks generally follow a hierarchical architecture [5] [15].

$^{12}$ Endpoints are logically located within a cell and a single gateway router typically contains multiple cells.

[Figure 10: hierarchical topology from the core router (Level 0) through upstream routers (Level (L-2), Level (L-1)) down to gateway routers (Level L); an arrow marks the upstream direction.]

Fig. 10. Typical large-scale hierarchical network topology.

[Figure 11: hierarchical topology with Level 0, Level 1, Level 2 and Level L, showing the attack source and the upstream direction.]

Fig. 11. Hierarchical network topology used in AAWC modeling.

[Figure 12(a): Effect of $r$ on protected population $C(t_l)$. Population of protected hosts (x 10^4) versus time (seconds) for r = 5, 10, 25, 30 and 35 h/s.]

[Figure 12(b): AAWC vs. AAWP. Population (hosts, x 10^5) versus time (seconds): infected population using the AAWP model and protected population using the AAWC model; the protected population comes to exceed the infected population.]

[Figure 12(c): Effect of network containment and immunization on infected population. Number of infected hosts (x 10^5) versus time (seconds): without containment; with containment, no immunization; with containment and immunization.]

Fig. 12.

TABLE III
PARAMETERS FOR NETWORK TOPOLOGY AND AAWC MODEL

Notation    Explanation
$L$         number of hierarchical levels in network
$b$         number of nodes that connect to an upstream node
$W$         number of hosts in each cell
$c$         number of $W$-sized cells that exist on each GR
$t_0$       time intrusion traffic is released into the network
$t_l$       time of containment at level $l$ in the hierarchical network
$\delta$    time interval for notification between routers
$C(t_l)$    total number of contained hosts after a time interval of $t_l$

Considering the random scanning behavior of malicious worms and using (10), derived earlier in section IV-A.1, we develop expressions for the total number of hosts, $C(t_l)$, protected as a result of EDANC collaborative containment action carried out by a router at level $l$ in an $L$-level network. Using the notation in Table III, it is assumed that $b$ is the same for all upstream routers, $W$ is the same for all cells and $c$ is the same for all gateway routers. We realise that this assumption might not be applicable to all networks on the Internet, but we consider it sufficient to demonstrate the protection capability of EDANC collaborative containment. $C(t_l)$ can therefore be expressed as:

$$C(t_l) = W c\, b^{L-l}, \qquad 0 \leq l \leq L \tag{13}$$

Let us assume that the worm's travel time from source to destination is negligible. This assumption is not unrealistic for fast propagating worms. If we also assume that the correlation time interval for the GEP theory based correlation engine is negligible $^{13}$, then the time of containment, $t_l$, at any level $l$ in the hierarchical network for an intrusion with traffic profile $i$ can be expressed as:

$$t_l = \frac{W + k_i}{2r} + (L - l)\,\delta \tag{14}$$

where $\delta$ is the notification time interval between routers in the network. Equation (14) comprises the average detection interval and the collaborative containment interval based on our hierarchical network topology (Fig. 10 and Fig. 11). Solving (14) for $L - l$ and substituting in (13), we get:

$$C(t_l) = W c\, b^{\left(t_l - \frac{W + k_i}{2r}\right)/\delta}, \qquad 0 \leq l \leq L \tag{15}$$

$^{13}$ The processing speed of modern gateway routers makes this assumption realistic.

TABLE IV
PARAMETERS FOR AAWP MODEL

Notation    Explanation
$N$         total number of vulnerable machines
$r$         scanning rate (the average number of machines scanned by an infected machine per unit time)
$p$         patching rate (the rate at which an infected or vulnerable machine becomes invulnerable)
$h$         size of hitlist (the number of infected machines at the beginning of the spread of active worms)
$d$         death rate (the rate at which an infection is detected on a machine and eliminated without patching)
$n_i$       number of infected machines at time tick $t_i$
$v_i$       number of vulnerable machines (including infected ones) at time tick $t_i$

Using (15), Fig. 12(a) was generated to show the impact of varying the worm scanning rate $r$ on $C(t_l)$. Fig. 12(a) shows that the protected population increases exponentially after the initial containment action on the gateway router, due to collaborative containment. Also, an increased worm scanning rate causes a faster EDANC defense response, hence a greater population is protected within a shorter time interval.
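Equations (13), (14) and (15) can be evaluated directly. The sketch below is an illustration under assumed parameter values: the function names are hypothetical, and `L`, `W`, `c`, `b`, `k_i`, `r` and `delta` are chosen only to show that (15) agrees with (13) at the containment times given by (14).

```python
import math


def containment_time(level, L, W, k_i, r, delta):
    """Equation (14): time of containment at a given level of an L-level network."""
    return (W + k_i) / (2.0 * r) + (L - level) * delta


def protected_at_level(level, L, W, c, b):
    """Equation (13): hosts protected by a containment action at the given level."""
    return W * c * b ** (L - level)


def protected_at_time(t, W, c, b, k_i, r, delta):
    """Equation (15): hosts protected, expressed as a function of time."""
    return W * c * b ** ((t - (W + k_i) / (2.0 * r)) / delta)


# Hypothetical network: 4 levels, 254-host cells, 2 cells per GR, fan-out of 4.
L, W, c, b = 4, 254, 2, 4
k_i, r, delta = 127, 30.0, 1.0

for level in range(L, -1, -1):
    t = containment_time(level, L, W, k_i, r, delta)
    by_level = protected_at_level(level, L, W, c, b)
    by_time = protected_at_time(t, W, c, b, k_i, r, delta)
    assert math.isclose(by_level, by_time, rel_tol=1e-9)
    print(f"level {level}: t = {t:.2f} s, protected hosts = {by_level}")
```

Each step upstream adds one notification interval and multiplies the protected population by the fan-out `b`, which is the exponential growth described for Fig. 12(a).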

A. EDANC Protection Capability

Using the AAWC model introduced in the previous section, we define the protection capability of the EDANC scheme in terms of the maximum number of hosts an active worm spread successfully infects before further spread of the worm due to direct scans is completely contained. The smaller the number, the more effective the defense scheme is assumed to be. In order to quantitatively analyze EDANC protection capability using the AAWC model, it is important to first model malicious worm propagation. In this section, we briefly review the Analytical Active Worm Propagation (AAWP) model [8], a well known model for worm propagation, and then adapt the AAWP model to our hierarchical network topology.

1) Review of the AAWP Model: Active worms often propagate by first randomly scanning a network, infecting vulnerable hosts in the network and then using infected hosts as a vehicle for further scanning and spreading. The Analytical Active Worm Propagation (AAWP) model [8] was chosen to model worm propagation in our analysis because it captures the behavior of random scanning worms more accurately than epidemiological models [8]. In addition, it is a discrete time model similar to our AAWC model. The AAWP model shows that the number of newly infected hosts in each time increment as a result of a random scanning worm attack is determined by parameters such as the size of the total population that the worm scans, the total number of vulnerable hosts in the population, the scanning rate of the worm, the patching rate, the death rate, and the time it takes for the worm to complete infection on a vulnerable host [8]. Using the parameters in Table IV, the AAWP model assumes that a worm randomly scans a class A network $^{14}$ and requires one time increment to infect a vulnerable host. Therefore, the probability that a host is hit by one scan is $1/2^{24}$. If at time tick $t_0$ there are $n_0 = h$ infected hosts and $v_0 = N$ vulnerable hosts, then the effective initial scanning rate will be $h r$ and there will be $(v_0 - n_0)\left[1 - \left(1 - \frac{1}{2^{24}}\right)^{n_0 r}\right]$ newly infected hosts on the next time tick, $t_1$. It was shown in [8] that with death rate $d$ and patching rate $p$, the total number of infected hosts $n_{i+1}$ at a time tick $t_{i+1}$ can be expressed as:

$$n_{i+1} = n_i + (v_i - n_i)\left[1 - \left(1 - \frac{1}{2^{24}}\right)^{n_i r}\right] - (d + p)\, n_i$$

Also, the total number of vulnerable hosts (including infected ones) reduces by a factor of $(1 - p)$ after every time tick. Hence, $v_{i+1} = (1 - p)\, v_i$ and $v_i = (1 - p)^i v_0 = (1 - p)^i N$. Solving, $n_{i+1}$ can be expressed as:

$$n_{i+1} = (1 - d - p)\, n_i + \left[(1 - p)^i N - n_i\right]\left[1 - \left(1 - \frac{1}{2^{24}}\right)^{r n_i}\right] \tag{16}$$

where $i \geq 0$, $n_0 = h$, and $v_0 = N$. According to [8], the recursion stops when there are no remaining vulnerable hosts or when the worm can no longer increase the total number of infected hosts.

$^{14}$ A class A network was chosen because most ISPs and large networks which are typical targets of large scale worm attacks have class A IP address blocks.
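The AAWP recursion in (16) is straightforward to iterate. The sketch below is a minimal illustration: the function name and the parameter values are assumptions, the address space is fixed at 2^24 as in the class A assumption above, and the loop simply runs for a given number of ticks rather than applying the stopping rule from [8].

```python
def aawp_infected(N, h, r, p, d, ticks, address_space=2 ** 24):
    """Iterate recursion (16) and return the infected count at each time tick.

    N: vulnerable hosts, h: hitlist size, r: scans per infected host per tick,
    p: patching rate, d: death rate.
    """
    n = float(h)
    history = [n]
    for i in range(ticks):
        # Probability that a given host is hit by at least one of the r*n scans.
        hit = 1.0 - (1.0 - 1.0 / address_space) ** (r * n)
        n = (1.0 - d - p) * n + ((1.0 - p) ** i * N - n) * hit
        history.append(n)
    return history


# Hypothetical values: 130048 vulnerable hosts, one initially infected host.
no_patching = aawp_infected(N=130048, h=1, r=30, p=0.0, d=0.0, ticks=25)
with_patching = aawp_infected(N=130048, h=1, r=30, p=0.1, d=0.0, ticks=25)
print("infected after 25 ticks, p = 0.0:", round(no_patching[-1], 1))
print("infected after 25 ticks, p = 0.1:", round(with_patching[-1], 1))
```

With `d = 0` the same function evaluates the adapted form of the model used for the hierarchical topology, where `N` is the total host population of the network.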

2) AAWP Model in our Hierarchical Network: We make the following assumptions in adapting the AAWP model to the hierarchical network topology.

1) The entire host population in the target networks is vulnerable to the worm attack.

2) Infection on infected hosts is eliminated only by patching. Hence death rate $d = 0$.

3) For analysis, we assume that one time tick is equivalent to one second.

Using the hierarchical network topology in Fig. 11, the total number of vulnerable hosts, $N$, in the network can be expressed as:

$$N = W c\, b^{L}$$

which is equivalent to $l = 0$ in (13).

Applying the AAWP model, (16) can be expressed as:

$$n_{i+1} = (1 - p)\, n_i + \left[(1 - p)^i \left(W c\, b^{L}\right) - n_i\right]\left[1 - \left(1 - \frac{1}{2^{24}}\right)^{r n_i}\right] \tag{17}$$

In our analysis, we assume that the attacking worm is a fast scanning malicious worm. The adapted AAWP model (17) is used to model the number of hosts infected as a result of the worm attack, while the AAWC model (15) is used to model the number of hosts protected as a result of the EDANC scheme.

Fig. 12(b) shows that while the AAWP model shows steady growth in infected population with time, the AAWC model shows an even greater growth in the number of protected hosts due to collaborative containment in the network. A perimeter is created on a gateway router or upstream router after a containment action is taken, thus preventing further direct scans. For a scanning worm attack, successful direct worm scans are stopped when the number of protected hosts (modeled using the AAWC model) exceeds the number of directly scanned hosts (modeled using the AAWP model), thus preventing further increase in the number of hosts infected by direct scans (Fig. 12(c)). Fig. 12(b) and Fig. 12(c) show that this is achievable within the 25-second window plotted after release of the worm.

B. EDANC with Immunization

Immunization by quickly deploying patches on infected hosts has been proposed as an effective defense strategy for worms [33] [6]. Worm defense using EDANC can effectively and quickly stop further direct worm scans but does not address the infectious state of hosts infected before complete containment of direct worm attacks. It also does not defend against local scanning within a protected cell that contains an infected host. For complete eradication of infection we introduce immunization to the EDANC scheme. For this analysis, a patching rate, $p$, is introduced in the AAWP model (17). This causes the number of infected hosts, $n_i$, at time $t_i$ to decrease by the patching rate on every subsequent time tick. Fig. 12(c) shows that the number of infected hosts in protected cells is significantly reduced and tends towards zero as a result of this combined approach, thus ensuring that hosts infected before complete containment of direct worm attacks do not become launching platforms for more attacks within or outside their cells.
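As a rough worked illustration of why the infected population tends towards zero, consider the idealized case (an assumption made here, not a result from the paper) in which containment has stopped all new infections, including local scanning. The infection term of the AAWP recursion then vanishes and only the patching term remains, so the infected population decays geometrically. The value $p = 0.1$ below is hypothetical.

```latex
% Idealized post-containment decay: no new infections, death rate d = 0.
n_{i+1} = (1 - p)\, n_i
\quad\Longrightarrow\quad
n_{i+j} = (1 - p)^{j}\, n_i , \qquad j \geq 0 .

% Number of ticks for the infected population to halve:
j_{1/2} = \frac{\ln 2}{-\ln(1 - p)} ,
\qquad \text{e.g. } p = 0.1 \;\Rightarrow\; j_{1/2} \approx 6.6 \text{ ticks.}
```

Any residual local scanning inside a protected cell would slow this decay, which is why the geometric rate should be read as a best case.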

VII. CONCLUSION AND FUTURE WORK

In this paper, we proposed the Endpoint Detection And Network Containment (EDANC) scheme for distributed detection and collaborative defense against fast spreading malicious worms. The EDANC detection and correlation engine is based on the Generalized Evidence Processing theory, a decision level multi-sensor data fusion technique. With the GEP theory, the evidence collected by distributed detectors determines the probability associated with a decision under a hypothesis. The pieces of evidence are combined to arrive at an optimal fused decision by minimizing a decision risk function. GEP theory is known to have advantages over the two major evidence combining theories that have dominated the field of distributed evidence processing: the Bayesian theory and the Dempster-Shafer theory. GEP theory extends the Bayesian decision theory to deal with non-mutually exclusive multiple hypotheses and detector indecision.

We presented an analysis of the EDANC detection interval for a fast scanning worm and experimented with the EDANC scheme on a live test-bed. Results obtained from experimentation concurred with analytical results. Some useful deductions from the experimental results include the following:

1) The EDANC scheme is capable of automatically detecting malicious intrusion attacks with a scanning rate of over 30 h/s within a short average detection interval (Fig. 9(a)) after the start of the attack on a network with cells containing 254 hosts.

2) An increase in the probability of false positive detection, $p_f$, of the individual detectors results in a corresponding increase in the number of positive detectors, $k_i$, required to meet the conditions for a fused decision in favor of $H_1$. Though somewhat intuitive, the results showed that in this scenario, the correlation algorithm required more reports of individual detector local decisions to arrive at a fused decision. Conversely, more accurate detectors (with lower $p_f$ values) caused the correlation algorithm to require fewer individual detector local decisions to arrive at a fused decision.

3) An increase in $\gamma = C_{10}/C_{01}$ results in a corresponding increase in the number of positive detectors, $k_i$, required to meet the conditions for a fused decision in favor of $H_1$.

4) In multiple simultaneous malicious attack scenarios, even though all the malicious attack profiles are identified as malicious and selected for containment, the profiles are blocked (or contained) in a sequence ordered by the propagation speed of the worm (or malicious traffic) associated with the traffic profile. Profiles due to faster spreading malicious traffic are blocked first.

Considering the random behavior of scanning worms, we presented the Analytical Active Worm Containment (AAWC) model, a discrete-time model used to model the vulnerable host population protected as a result of the EDANC scheme in a large scale network. Using the AAWC model, we demonstrated quantitatively the collaborative containment capability of EDANC. In addition, we observed that while network containment of worms can halt further worm spread due to direct scans, it does not recover infected hosts, nor does it prevent local scanning within a protected cell. We therefore introduced immunization by patching to EDANC, and the combined approach was successful at both quickly defending against direct attacks and recovering infected hosts within protected cells, thus preventing infectious local cell scanning.

For future work, we intend to extend our investigation to detector indecision using the GEP theory and to detection of slow scanning worms using the EDANC scheme in more complex network and traffic scenarios. With the proliferation of Web 2.0 and social networking, a new threat model for large scale infection of unprotected systems and networks by slow stealthy worms seems quite conceivable. Successful large scale infection of this nature can be exploited by bot herders and used to perpetrate significant damage on computer systems. More work is required to develop adequate defenses against this special class of worms. Another possible direction involves investigating the impact of indecisive detectors on GEP based intrusion detection of malicious worms.

REFERENCES

[1] F. Akujobi, I. Lambadaris, and E. Kranakis. Endpoint-driven intrusion detec-tion and containment of fast spreading worms in enterprise networks. In IEEEMilitary Communications Conference (MILCOM) 2007, 2007.

[2] F. Akujobi, I. Lambadaris, and E. Kranakis. Modeling host-based detectionand active worm containment. In CNS ’08: Proceedings of the 11th commu-nications and networking simulation symposium, pages 222–229, New York,NY, USA, 2008. ACM.

[3] F. Akujobi, I. Lambadaris, and E. Kranakis. Detection of Slow MaliciousWorms using Multi-sensor Data Fusion. IEEE Symposium on ComputationalIntelligence for Security and Defense Applications (CISDA 2009), 2009.

[4] F. Akujobi, I. Lambadaris, and E. Kranakis. An Integrated Approach to De-tection of Fast and Slow Scanning Worms. ACM Symposium on Information,Computer and Communications Security (ASIACCS 2009), 2009.

[5] D. Alderson, L. Li, W. Willinger, and J.C. Doyle. Understanding Internettopology: Principles, models, and validation. In IEEE/ACM Transactions onNetworking, 13(6):1205-1218, 2005.

[6] K.G. Anagnostakis, M.B. Greenwald, S. Ioannidis, A.D. Keromytis, and D. Li.A cooperative immunization system for an untrusting Internet. In 11th IEEEInternational Conference on Networking (ICON), pages 403–408, 2003.

[7] P. Barford, J. Kline, D. Plonka, and R. Amos. A signal analysis of networktraffic anomalies. In ACM SIGCOMM Internet Measurement Workshop, 2002.

[8] Z. Chen, L. Gao, and K. Kwiat. Modeling the spread of active worms. InINFOCOM 2003, 2003.

12 of 13

[9] M. Costa, J. Crowcroft, M. Castro, A. Rowstron, L. Zhou, L. Zhang, andP. Barham. Vigilante: End-to-end containment of Internet worms. In Proceed-ings of the Symposium on Systems and Operating Systems Principles (SOSP),pages 133–147, 2005.

[10] D.L. Hall and S.A.H. McMullen. Mathematical Techniques in MultisensorData Fusion. Artech House, Inc., Norwood, MA, USA, 2004.

[11] M. Handley, C. Kreibich, and V. Paxson. Network intrusion detection: Eva-sion, traffic normalization, and end-to-end protocol semantics. In USENIXSecurity Symposium, 2001.

[12] J. Jung, R.A. Milito, and V. Paxson. On the Adaptive Real-Time Detection ofFast-Propagating Network Worms. Springer Berlin/Heidelberg, 2007.

[13] J. Jung, V. Paxson, A. Berger, and H. Balakrishnan. Fast portscan detectionusing sequential hypothesis testing. In Proceedings of the IEEE Symposiumon Security and Privacy, May 9– 12, 2004., 2004.

[14] L.A. Klein. Sensor and Data Fusion: A Tool for Information Assessment andDecision Making (SPIE Press Monograph Vol. PM138). SPIE - InternationalSociety for Optical Engineering, 2004.

[15] L. Li, D. Alderson, W. W. Willinger, and J. Doyle. A first-principles approachto understanding the Internet’s router-level topology. In ACM SIGCOMM,2004, 2004.

[16] M. Locasto, K. Wang, A. Keromytis, and S. Stolfo. FLIPS: hybrid adaptiveintrusion prevention. In 8th International Symposium on Recent Advanced inIntrusion Detection (RAID), 2005.

[17] Thomas M.C. and V. Venkataramanan. Dempster-shafer theory for intrusiondetection in ad hoc networks. IEEE Internet Computing, 9(6):35–41, 2005.

[18] D. Moore, V. Paxson, S. Savage, C. Shannon, S. Staniford, and N. Weaver.Inside the slammer worm. In IEEE Security Privacy, vol. 1, no. 4, 2003.

[19] D. Moore, C. Shannon, and K. Claffy. Code Red: A case study on the spreadand victims of an Internet worm. In ACM SIGCOMM Internet MeasurementWorkshop, pages 273–284, 2002.

[20] D. Mutz, F. Valeur, C. Kruegel, and G. Vigna. Anomalous system call detec-tion. ACM Transactions on Information and System Security, 9:61–93, 2006.

[21] Network Security Resources. Governmentsecurity.org.http://www.governmentsecurity.org/forum/index.php?showtopic=4726,2003. Website was functional in 2006.

[22] D.M. Nicol. The impact of stochastic variance on worm propagation and de-tection. In WORM ’06: Proceedings of the 4th ACM workshop on Recurringmalcode, pages 57–64, New York, NY, USA, 2006. ACM.

[23] C. Shannon and D. Moore. The spread of the witty worm. In IEEE SecurityPrivacy, vol. 2, no. 4, 2004.

[24] C. Siaterlis and B. Maglaris. Towards multisensor data fusion for DoS de-tection. In Proceedings of the 2004 ACM symposium on Applied computing,pages 439–446. ACM Press, 2004.

[25] S. Singh, C. Estan, G. Varghese, and S. Savage. Automated worm fingerprint-ing. In Operating Systems Design and Implementation (OSDI), Proceedingsof the 6th conference on Symposium on Operating Systems Design Implemen-tation - Volume 6, pages 45–60, 2004.

[26] C. Sullivan. Cisco Security Agent. Cisco Press, 2005.[27] Swsoft. Openvz homepage. http://openvz.org/, 2008.[28] S.C.A. Thomopoulos. Sensor integration and data fusion. Journal of Robotic

Systems, 7(3):337–372, 1990.[29] S.C.A. Thomopoulos. Theories in distributed data fusion: Comparison and

generalization. SPIE, 1383:623, 1990.[30] D. Whyte, E. Kranakis, and P. Van Oorschot. DNS-based detection of scan-

ning worms in an enterprise network. In Network and Distributed SystemsSymposium (NDSS), 2005.

[31] V. Yegneswaran, P. Barford, and V. Paxson. Using honeynets for Internetsituational awareness. In Proceedings of the ACM/USENIX Fourth Workshopon Hot Topics in Networks, 2005.

[32] C. C. Zou, L. Gao, W. Gong, and D. Towsley. Monitoring and early warningfor Internet worms. In CCS ’03: Proceedings of the 10th ACM conferenceon Computer and communications security, pages 190–199, New York, NY,USA, 2003. ACM.

[33] C.C. Zou, D. Towsley, and W. Gong. Email worm modeling and defense. In13th International Conference on Computer Communications and Networks,pages 409-414, 2004.

13 of 13
