

SECURITY AND COMMUNICATION NETWORKS
Security Comm. Networks (2012)

Published online in Wiley Online Library (wileyonlinelibrary.com). DOI: 10.1002/sec.621

SPECIAL ISSUE PAPER

Classifying different denial-of-service attacks in cloud computing using rule-based learning

Md Tanzim Khorshed, A B M Shawkat Ali* and Saleh A. Wasimi

School of Information and Communication Technology, CQUniversity, QLD 4702, Australia

ABSTRACT

From traditional networking to cloud computing, one of the essential but formidable tasks is to detect cyber attacks and their types. A cloud provider's unwillingness to share security-related data with its clients adds to the difficulty of detection by a cloud customer. The research contributions in this paper are twofold. First, an investigative survey on cloud computing is conducted with the main focus on the gaps that are hindering cloud adoption, accompanied by a review of the threat remediation challenges. Second, some thoughts are constructed on novel approaches to address some of the widely discussed denial-of-service (DoS) attack types by using machine learning techniques. We evaluate the techniques' performances by using statistical ranking-based methods, and find the rule-based learning technique C4.5, from a set of popular learning algorithms, to be an efficient tool to classify various DoS attacks in the cloud platform. The novelty of our rather rigorous analysis is in its ability to identify insiders' activities and other DoS attacks by using performance data. The reason for using performance data rather than traditional logs and security-related data is that the performance data can be collected by the customers themselves without any help from cloud providers. To the best of our knowledge, no one has made such attempts before. Our findings and thoughts, captured through a series of experiments in our constructed cloud server, are expected to give researchers, cloud providers and customers additional insight and tools to proactively protect themselves from known or perhaps even unknown security issues that have similar patterns. Copyright © 2012 John Wiley & Sons, Ltd.

KEYWORDS

security; threats; machine learning; cyber attacks; cloud computing; DoS attacks

*Correspondence

A B M Shawkat Ali, School of Information and Communication Technology, CQUniversity, QLD 4702, Australia.
E-mail: [email protected]

1. INTRODUCTION

The concept 'Computing as Utility' has manifested in the real world as cloud computing. It holds huge potential. It can cater to the on-demand service requirements for software, platform and infrastructure. In its application, industries do not even need to plan for their IT growth in advance with this new 'pay as you go' system. Already, there is considerable interest in the IT community about its great potential as a utility and its scalability and instant-access features, but there is also nervousness and caution about the security gaps, for instance, trust, threats and risks. Cloud computing could be the most significant growth in IT infrastructure of recent times, but a great deal of work is needed in the area of security to minimise the gaps.

In this paper, attempts are made to identify different denial-of-service (DoS) attacks in cloud computing and to provide solutions to detect a particular attack type by using supervised learning techniques. To gauge the extent of the problem and gain appreciation of all its facets, an


investigative survey on cloud computing is conducted with the main focus on gaps that are hindering cloud adoption. To be useful, this necessarily needs identification and normative descriptions of the threat remediation challenges. Given the information, some thoughts have been constructed on novel approaches to address some of the widely discussed DoS attack types by using machine learning techniques. Obviously, any proposed scheme to tackle a problem needs evaluation, and this has been carried out in this study using statistical ranking-based methods. The word 'novel' is used in our analysis to describe the ability to identify insiders' activities and other DoS attacks by using performance data. The reason for using performance data rather than traditional logs and security-related data is that the performance data can be collected by the customers themselves without any help from cloud providers. To the best of our knowledge, no one has made such attempts before. Our findings and thoughts are captured through a series of experiments in our constructed cloud server, designed to give researchers, cloud providers



and customers additional insights and tools to proactively protect themselves from known or perhaps even unknown security issues that follow the same patterns.

The paper is organised as follows. Section 2 describes the main aspects of cloud computing, Section 3 identifies the gaps that are slowing down cloud adoption, and Section 4 narrates the challenges for threat remediation. In Section 5, we propose a solution to identify DoS attacks in cloud computing by using rule-based supervised learning. An extensive experimental analysis is also included in Appendix A towards the end of the paper.

2. MAIN ASPECTS OF CLOUD COMPUTING

Cloud computing actually enmeshes all the security issues from existing systems with the security issues that are created because of its unique architecture and features. To understand these unique features, we first need to look at the ramifications of a cloud system. Jeffery et al. [1] have drawn a picture that epitomises the cloud system and its main aspects. Others have performed pioneering research [2–7] to organise different aspects of cloud computing. After doing a rigorous review in our earlier research works [7,8], we have constructed a new framework to present a comprehensive narrative of a cloud system as shown in Figure 1.

In Figure 1, we have attempted to categorise the cloud system into eight main aspects. These are features, comparison, service delivery models, deployment models, roles, layers, locality and gap, the last being a new addition.

Figure 1. Understanding cloud computing [8].

Each of these main aspects has at least three subaspects. We linked all subaspects with the relevant main aspects. Our contribution is the introduction of gaps as one of the main aspects because we believe it is too important to ignore. Furthermore, our assertion is that trust issues, security threats and security risks are the main gaps of cloud computing [7,8]. We elaborate on the cloud computing gaps in the following section.

3. CLOUD COMPUTING GAPS

Despite its huge potential, cloud computing has so far not been adopted by consumers with the enthusiasm and pace that it deserves. This can be attributed to the gaps. The National Institute of Standards and Technology [9] opined that security, interoperability and portability are the major barriers to broader cloud adoption. Armbrust et al. [3] identified 10 obstacles to cloud computing: availability of service, data lock-in, data confidentiality and auditability, data transfer bottlenecks, performance unpredictability, scalable storage, bugs in large distributed systems, scaling quickly, reputation fate sharing and software licensing. Ness [10] recounted three major barriers to cloud computing: first, cloud will depend on new approaches to security; second, cloud can break static networks; and third, network automation is critical. Leavitt [11] narrated six challenges: control; performance, latency and reliability; security and privacy; related bandwidth costs; vendor lock-in and standards; and transparency. There can be many ways to define





gaps, and many parties other than cloud providers and customers are also involved. But, in practice, what really matters in the end is whether a customer or the customer's organisation is willing to join cloud computing. The reputation of an organisation and the expectations of the type of services one is going to receive from a particular provider are the key elements in choosing a cloud provider. From a detailed review [3,7–11], we can define cloud computing gaps succinctly as follows:


The factors that are slowing down migration to cloud computing from the existing system are cloud computing gaps.

In Figure 2, we have drawn a diagram showing gaps between cloud customers' expectations and perceived services based on our understanding [2–5,7–14]. Cloud customers may form their expectations on the basis of their past experience and their needs. They are likely to do some sort of survey before choosing a cloud service provider, similar to what people do before choosing an Internet service provider. Customers can also be expected to do a security survey, which is basically on three security concepts: confidentiality, integrity and availability. By contrast, the cloud service providers may promise much to lure a

Figure 2. Understanding of cloud computing gaps [8].


customer to sign a deal, but some gaps may surface later as difficult barriers to keeping those promises. We have already witnessed a gap between customers' expectations and deliverable services. In our opinion, many potential cloud customers are well aware of this and, as a consequence, are still sitting on the sidelines. They will not venture into cloud computing until they are convinced that what is on offer meets their expectations [2,7,8].

4. CHALLENGES IN THREAT REMEDIATION

In March 2010, the Cloud Security Alliance released their research findings on the top threats to cloud computing [14]. The aim was to inform cloud providers as well as their potential customers of the major risks, to help them decide whether or not to join a cloud infrastructure, and to show how to proactively protect themselves from these risks.

We have carried out research on each of their top seven threats and threat remediation [2,4,7,8,14–23]. We found that the challenges listed in Table I are impeding solutions to threats. We believe these obstacles are the gaps for threat remediation and need to be addressed in future research. Interestingly, we found many of these challenges are trust



Table I. Gaps in threat remediation [8].

Threats, with challenges in implementing threat remediation, or gaps:

Abuse and nefarious use of cloud computing:
• Privacy laws are restricting cloud providers from instant monitoring.
• Interests of different stakeholders are not necessarily in the same direction.

Insecure application programming interfaces:
• The inability to audit events associated with application programming interface use.
• Incomplete log data to enable reconstruction of management activity.

Malicious insiders:
• Providers may always try to hide their own company policies for recruiting employees.
• Solutions come into effect after the incident occurs, which is too late.
• Cloud providers' inability to monitor their employees.

Shared technology vulnerabilities:
• Shared elements were never designed for strong compartmentalization.
• Business competitors using separate virtual machines on the same physical hardware.
• Coexistence of the manufacturing sector and the retail sector.
• Use of a vulnerable operating system image for cloning can spread over many systems.

Data loss/leakage:
• Trust issues with cloud providers, which may become self-interested and store data in a lower-security area than agreed.
• Untested procedures, poor policy and inadequate data retention practices.
• Lack of knowledge.

Account, service and traffic hijacking:
• Rapid development of cloud computing also opens some new loopholes.
• The present way of digital identity management is not good enough for hybrid clouds.

Unknown risk profile:
• Cloud providers' unwillingness to provide log and audit data and security practices.
• Lack of transparency.


issues that create threats for cyber attacks. These issues are bidirectional: cloud providers do not trust their customers, thinking they may take advantage of present privacy laws and run phishing and malware attacks by using their cloud services. In the opposite direction, cloud customers have many other trust issues. For instance, providers may always try to hide their own recruiting policies, and providers' inability to monitor their employees may lead to malicious insider attacks. An image taken from an untrustworthy source can open a backdoor for attackers. Cloud providers may become complacent and too focussed on cost savings, and may store data in a lower-tiered security area than agreed, which may lead to cross-virtual machine (VM) side-channel attacks.

5. PROPOSED SOLUTION: CLASSIFYING DENIAL-OF-SERVICE ATTACKS IN CLOUD COMPUTING USING RULE-BASED LEARNING

From our survey on cloud computing threats and remediation, we have identified some challenges in Section 4. We noticed that some remediation only comes into effect after a successful attack has been executed. We also found cloud providers unwilling to provide security-related data to their customers. This is a repeated pattern and does not seem to have an easy solution, as nobody wants to reveal company secrets, including the policy for hiring employees. Because of this, we propose a 'classifying DoS attacks' model with two goals. Goal 1: it will be able to detect an attack when it starts, or at least while the attack is in progress. Goal 2: if cloud providers try to hide attack information from the customers, this model will be able to tell customers what kind

of DoS attack happened by looking at the pattern of the attack. In the sequel, we provide a brief description of the model and its development.

5.1. Background

In experiments with Amazon's EC2, researchers at the University of California and the Massachusetts Institute of Technology [24] showed that it is possible to map the internal cloud infrastructure and find out the location of a particular VM. They also showed how such findings can be used to mount cross-VM side-channel attacks to collect information from a target VM residing on the same physical machine [24]. In recent research, Rocha and Correia [19] showed how malicious insiders can steal confidential data. They depicted a set of attacks with attack videos, showing how easily an insider can obtain passwords, cryptographic keys, files and so on. Chonka et al. [25] recreated some recent real-world attack scenarios and demonstrated how HTTP-DoS and XML-DoS attacks can take place in cloud computing.

5.2. Experimental design

In our experiment, we first generate attack scripts by using attack tools and the information available in documented attack scenarios on different Internet security-related websites and blogs, such as Dancho Danchev's blog [26] or Jeremiah Grossman's blog [27]. One of the benefits of generating attack scripts is less human effort and the ability to program them to run according to the actual attack timing and duration over multiple VMs simultaneously. We have designed our experiment as given in Figure 3 for a single cloud environment.



Figure 3. Attack detection and proactive resolution in a single cloud environment using machine learning [8].


The next step is data collection; the data are collected using different data collection tools depending on the type of data. The most common types of data collected in an attack scenario could be the number of packets sent and received, processing time, round-trip time, CPU usage and so on. Machine learning techniques are used next to inform us if there is an attack. If there is a known type of attack, machine learning will take proactive action to resolve the issue and, at the same time, notify the data owner and systems/security administrators. If an unknown type of attack happens, machine learning will still be able to detect it as an attack from the variation from normal use and will notify the relevant person with the closest attack type known to its database. The process will make security administrators' job easier in fighting against unknown types of attacks.
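A minimal sketch of the detection idea above is to flag performance counters that drift far from a normal-use baseline; all metric names, sample values and the z-score threshold below are illustrative assumptions, not the paper's implementation:

```python
import statistics

def flag_anomaly(baseline, current, z_threshold=3.0):
    """Flag metrics whose current reading deviates strongly from baseline.

    baseline: dict mapping metric name -> list of normal-use samples
    current:  dict mapping metric name -> latest reading
    Returns the metrics exceeding z_threshold standard deviations.
    """
    suspicious = []
    for metric, samples in baseline.items():
        mean = statistics.fmean(samples)
        stdev = statistics.pstdev(samples) or 1e-9  # avoid division by zero
        z = abs(current[metric] - mean) / stdev
        if z > z_threshold:
            suspicious.append(metric)
    return suspicious

# Illustrative normal-use samples and one live reading (invented values).
baseline = {
    "cpu_usage_pct": [10, 12, 11, 9, 10],
    "net_packets_rx": [300, 320, 310, 305, 315],
}
current = {"cpu_usage_pct": 95, "net_packets_rx": 312}
print(flag_anomaly(baseline, current))  # CPU spikes; network looks normal
```

A real deployment would label the flagged pattern with the closest known attack type, as described above, rather than merely reporting the deviating metrics.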

For this experiment, we have chosen an HP ProLiant DL380 G4 Server [28] with the following specifications: dual Intel Pentium IV Xeon 3.2 GHz processors, 6 GB RAM, 2 × 72.8 GB hot-plug SCSI hard drives, integrated Smart Array 6i Plus RAID controller and dual network interface cards. The main reason for choosing server hardware is to prevent hardware limitations from becoming a bottleneck, which could produce incorrect data. We also chose VMWare ESXi 3.5 [29] Hypervisor as the VM manager and Windows 7 [30] as the guest operating system. Figure 4 shows a logical and physical diagram of our experiment design.

Figure 4. Logical and physical diagram of our experimental design [8].


5.3. Data preparation

According to the United States Computer Emergency Readiness Team, a DoS attack is a type of attack in which an attacker attempts to prevent legitimate users from accessing network or computer resources. Distributed DoS means the attacker is using multiple computers to launch the DoS attack [31]. However, there are several other types of DoS attacks and attack tools that are worth testing in an experimental cloud environment. The United States Computer Emergency Readiness Team [31] also listed a few symptoms of DoS and distributed DoS attacks, such as unusually slow network performance, unavailability of a particular website, inability to access any website and a dramatic increase in the amount of spam.

We generate an attack dataset for the experimental demonstration by simply gathering performance data of CPU, memory, disk and network usage from the hypervisor and guest operating system, and choose an appropriate technique for attack classification. The aim is to detect the attack pattern and alert on the type of attack that happened by looking at the change of parameters in the computer and network systems. To start with, we consider five types of the most common DoS attacks (including no attack, to distinguish from actual attacks) of cyber criminals. These are the following: no attack (short name is attack A), DoS attack using attack tool RDoS.exe (B), SYN flood attack




(C) and HTTP-DoS attack using attack tool Low Orbit IonCannon (D).

We designed this process on the belief that customers need to know all cyber attacks happening on their VM and the physical machine on which they are co-residing with others. If their business competitors acquire co-residence on the same physical hardware, or their machine is cloned without prior notice, there is always a threat. Our main goal for this experiment is to enlighten cloud customers with some basic information on how they will be able to detect different types of DoS attacks with the limited resources and access they have. Figure 5 shows a DoS attack using attack tool RDoS and the monitoring system performance chart. Figure 6 shows the CPU and disk performance plot during ping flood and RDoS attacks. Figure 7 shows the network performance chart of the hypervisor at the time of a TCP SYN flood attack. Figure 8 shows an HTTP-DoS attack using Low Orbit Ion Cannon and the monitoring of victims' performance chart. In Figure 9, we put

Figure 5. Denial-of-service attack using RDoS and monitoring system performance chart.

Figure 6. CPU and disk performance plot during ping flood and RDoS attacks [8].

the CPU and network performance plot of the victim at the time of the HTTP-DoS attack. Sometimes, there could be a combination of attacks. It is important to note that, for each attack type, a different set of parameters of the computer/network system may change; we collect data on which parameters change compared with usual/average usage. We consider 14 attributes to construct the dataset. These are different parameters of CPU, disk, network and memory performance. In practice, all the data points are treated as real values. The total number of instances in our dataset is 536.
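For illustration only, such instances can be laid out as rows of real-valued performance attributes plus a class label. The attribute names and values below are hypothetical stand-ins, since the paper does not enumerate its 14 parameters here; the class labels follow the paper's A–D scheme:

```python
import csv
import io

# Hypothetical subset of performance attributes (the real dataset uses
# 14 CPU, disk, network and memory parameters; names here are assumed).
FIELDS = ["cpu_usage_pct", "disk_reads_per_s", "net_packets_rx",
          "mem_used_mb", "attack_class"]

rows = [
    # Class A = no attack, class C = SYN flood (invented sample values).
    {"cpu_usage_pct": 11.2, "disk_reads_per_s": 4.0,
     "net_packets_rx": 310, "mem_used_mb": 812, "attack_class": "A"},
    {"cpu_usage_pct": 96.5, "disk_reads_per_s": 3.1,
     "net_packets_rx": 9500, "mem_used_mb": 840, "attack_class": "C"},
]

# Serialise to CSV, a format most learning tools (including Weka) accept.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=FIELDS)
writer.writeheader()
writer.writerows(rows)
csv_text = buf.getvalue()
print(csv_text)
```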

5.4. Denial-of-service attack classification

Classification of any activity based on predefined classes can be solved successfully using machine learning techniques. These techniques are widely available from the computational intelligence community. From the available





Figure 7. Network performance chart of the hypervisor at the time of TCP SYN flood attack [8].

Figure 8. HTTP-DoS attack using Low Orbit Ion Cannon and monitoring victims’ performance chart.

Figure 9. CPU and network performance plot of the victim at the time of HTTP-DoS attack [8].




Table III. Final ranking performance.

        Naive Bayes   Multilayer perceptron   Support vector machine   Decision tree   PART
Win          8                10                       10                    14          12
Loss         5                 1                       21                     2           0
P          0.8               1.4                     -0.6                   1.7         1.7


list, we have chosen naive Bayes [32], multilayer perceptron [33], support vector machine [34], decision tree (C4.5) [35] and Partial Tree (PART) [36] to classify our data. Naive Bayes is a probability-based technique, multilayer perceptron and support vector machine are function estimation-based techniques, and decision tree and PART are rule-based machine learning techniques. All these techniques have been implemented in Weka [37], which is a popular Java-based machine learning tool. Weka uses the C4.5 [37] algorithm for its decision tree implementation. At the beginning, we carried out some experimental tests to identify the best-suited technique for attack classification. The details of the performances are available in Table II. We primarily consider classification accuracy, number of unclassified instances and computational complexity. The classification accuracy is the percentage of activities that were classified correctly by the machine learning techniques. The number of unclassified instances basically measured a technique's limitations, that is, its failure to classify an attack at all. We are also mindful of the computational efficiency of the techniques and how well they learn, because we are dealing with comparatively large datasets. Therefore, we observe the model building and testing times, which are listed in Table II. On the basis of the classification accuracy, number of unclassified instances and computational complexity, we found that decision tree C4.5 could be a preferred choice for DoS attack classification in the cloud computing area. The classification accuracy and number of unclassified instances essentially summarise the average performances of the techniques for our attack classification task. So we tried to observe the details of performance in the attack classification scenario. As a result, we employed confusion matrix [38] analysis to see the details of the techniques' performance measures.
This matrix offers a detailed picture of the quality of the actual and predicted classifications carried out by any classification technique, class by class. Because the number of instances is 536, we adopted a 10-fold cross-validation process in our entire experiment.
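Such a confusion matrix can be tabulated directly from lists of actual and predicted labels. In this sketch, the class names follow the paper's A–D short names, but the label sequences are invented toy data:

```python
from collections import Counter

def confusion_matrix(actual, predicted, labels):
    """Count (actual, predicted) pairs; rows = actual class, columns = predicted."""
    counts = Counter(zip(actual, predicted))
    return [[counts[(a, p)] for p in labels] for a in labels]

labels = ["A", "B", "C", "D"]          # A = no attack, B-D = attack types
actual    = ["A", "A", "B", "C", "C", "D"]   # invented ground truth
predicted = ["A", "B", "B", "C", "C", "D"]   # invented classifier output
matrix = confusion_matrix(actual, predicted, labels)
for label, row in zip(labels, matrix):
    print(label, row)
# Diagonal entries are correct classifications; off-diagonal entries show
# which attack types the classifier confuses with which.
```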

It is not a wise decision to choose the best algorithm for our task on the basis of accuracy of performance alone. As a result, a statistical performance measure based on ranking has been conducted using the results of performance metrics for the different types of algorithms. The formulation and calculation details are summarised in Appendix A. The summary of the ranking performance measure is presented in Table III.

In Table III, P-values capture the performance levels that suggest we can use either the PART or the C4.5 algorithm to detect different types of DoS attack in the cloud computing

Table II. Classification performances of denial-of-service attack data.

                                Naive Bayes   Multilayer perceptron   Support vector machine   Decision tree   PART
Classification accuracy (%)        75.00            92.53                   82.08                 93.47        93.28
No. of unclassified instances          0                0                       0                     0            0
Model building time (s)             0.03             4.55                    0.67                  0.06         0.11
Model testing time (s)              0.01             0.01                    0.01                  0.00         0.01

environment. However, we suggest using the C4.5 algorithm in the cloud system, because it is a comparatively established algorithm and is computationally less expensive than PART. The next choice for our task is the multilayer perceptron. We found that the support vector machine showed the worst performance in classifying the DoS attacks in our own cloud environment.

5.5. Decision tree: C4.5

C4.5 was developed by Professor Ross Quinlan of the University of Sydney. C4.5 constructs a large tree by considering all attribute values and finalises the decision rule through pruning. It uses a heuristic approach for pruning based on the statistical significance of splits [39]. The tree construction process essentially calculates the entropy and information gain to finalise the decision tree. On the basis of this gain information, C4.5 can determine whether an attack is happening or not.

The entropy, or expected information, of a subset S is given by Equation (1):

E(S) = -\sum_{j=1}^{n} f_S(j) \log_2 f_S(j)    (1)

where

• E(S) is the information entropy of the subset S;
• n is the number of different values of the attribute in S (entropy is computed for one chosen attribute);
• f_S(j) is the frequency (proportion) of the value j in the subset S; and
• log_2 is the binary logarithm.

An entropy of 0 identifies a perfectly classified subset, whereas 1 indicates a totally random composition.

Entropy is used to determine which node to split next in the algorithm: the higher the entropy, the higher the potential to improve the classification.

The encoding information that would be gained by branching on A is given by Equation (2):


Security Comm. Networks (2012) © 2012 John Wiley & Sons, Ltd.DOI: 10.1002/sec


\[ G(S, A) = E(S) - \sum_{i=1}^{m} f_S(A_i)\,E(S_{A_i}) \tag{2} \]

where

• G(S,A) is the gain of the subset S after a split over the attribute A;
• E(S) is the information entropy of the subset S;
• m is the number of different values of the attribute A in S;
• f_S(A_i) is the frequency (proportion) of the items possessing A_i as a value for A in S;
• A_i is the ith possible value of A; and
• S_{A_i} is the subset of S containing all items where the value of A is A_i.

Gain quantifies the entropy improvement achieved by splitting over an attribute: the higher the gain, the better. The algorithm computes the information gain of each attribute to construct the final decision tree [40].
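Equation (2) admits the same treatment. The sketch below uses made-up attribute/label pairs, with `entropy()` restated so the snippet stands alone; it shows that an attribute that separates attack from normal traffic perfectly gains the full entropy, while an uninformative one gains nothing:

```python
from math import log2

def entropy(labels):
    """E(S), per Equation (1)."""
    n = len(labels)
    freqs = [labels.count(v) / n for v in set(labels)]
    return 0.0 - sum(f * log2(f) for f in freqs if f > 0)

def gain(attribute_values, labels):
    """Information gain G(S, A) of splitting labels S over attribute A,
    per Equation (2): G = E(S) - sum_i f_S(A_i) * E(S_{A_i})."""
    n = len(labels)
    total = entropy(labels)
    for value in set(attribute_values):
        subset = [lab for av, lab in zip(attribute_values, labels) if av == value]
        total -= (len(subset) / n) * entropy(subset)
    return total

# Hypothetical traffic samples, labelled attack/normal.
labels = ["attack", "attack", "normal", "normal"]
print(gain(["high", "high", "low", "low"], labels))  # 1.0 (perfect split)
print(gain(["a", "b", "a", "b"], labels))            # 0.0 (no information)
```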

We investigated the best choice among machine learning techniques for DoS attack classification in the cloud environment. Our results support the claim that C4.5 is the best choice for the task of DoS attack classification in the cloud environment.

5.6. Performance evaluation

Performance evaluation is a significant task before reporting any experimental finding. A onefold training and testing performance measure is the most popular in the classifier evaluation domain. However, statisticians suggest that, when the number of samples is less than 1000, it is better to consider the K-fold cross-validation measure. The most popular value for K is 10. In our process, we run the experiment K times, using (K − 1) folds for training and the remaining one (labelled Test Data in Figure 10) for model testing. A simple sketch of K-fold cross-validation is presented in Figure 10.

We can express the K-fold cross-validation method mathematically as follows:

\[ E = \frac{1}{K} \sum_{i=1}^{K} E_i \tag{3} \]

Figure 10. K-fold cross-validation. (The figure depicts the total number of attacks split into K folds; in each of K experiments, a different fold is held out as Test Data.)


where E is the final error estimator and K is the number of folds. We consider K = 10 in our attack classifier performance measure.

The main advantage of K-fold cross-validation is that all the attacks in the dataset are eventually used for both training and testing during the final model selection.
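The procedure of Figure 10 and Equation (3) can be sketched as follows (pure Python; the `evaluate` stub standing in for a real classifier's fold error is our own placeholder, not the paper's setup):

```python
def k_fold_errors(samples, k, evaluate):
    """Split samples into k folds; for each fold i, train on the other
    k-1 folds and test on fold i, returning the k error estimates E_i."""
    folds = [samples[i::k] for i in range(k)]
    errors = []
    for i in range(k):
        test = folds[i]
        train = [s for j, fold in enumerate(folds) if j != i for s in fold]
        errors.append(evaluate(train, test))
    return errors

def cross_validated_error(samples, k, evaluate):
    """Final estimator E = (1/K) * sum_i E_i, per Equation (3)."""
    return sum(k_fold_errors(samples, k, evaluate)) / k

# Stub evaluator: pretend the error rate is the fraction of 'attack'
# records in the test fold (a placeholder for a real classifier).
data = ["attack"] * 50 + ["normal"] * 50
stub = lambda train, test: sum(1 for s in test if s == "attack") / len(test)
print(cross_validated_error(data, 10, stub))  # 0.5
```

Every sample lands in the test fold exactly once across the K experiments, which is the advantage noted above.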

6. CONCLUSIONS

In this study, a review of cloud computing with the main focus on gaps that are hindering cloud adoption has been undertaken, and, at the same time, a review of threat remediation challenges has been performed. After experimenting with a constructed setup, we presented the performances of several of the most popular machine learning techniques for attack identification in a cloud environment, and a comparison of performances has been made. We used a statistical ranking approach for the final selection of a learning technique for the task. We found that the rule-based techniques C4.5 and PART are equally efficient in solving our problem at hand. However, on the basis of computational performance, we suggest C4.5 as the better technique for real-time attack protection in a cloud environment.

We evaluated each technique's performance through different performance evaluation metrics that included the rigorous testing of 10-fold cross-validation, true positive rate, false positive rate, precision, recall, F-measure and the area of receiver operating characteristic. In another phase, we also considered computational complexity for our final selection. Our experimental outcome corroborated not only the fact that C4.5 provided a better performance than the other techniques but also the fact that its level of performance is of an acceptable standard. The other algorithms tested were naive Bayes, multilayer perceptron, support vector machine and PART, which, as a future task, would be subjected to further tests by adopting their best parameters for more real-world attack classification problems. In addition, we plan to extend our current research to more complex cloud environments, especially public and hybrid clouds.

REFERENCES

1. Schubert L, Jeffery K, et al. The future for cloud computing: opportunities for European cloud computing beyond 2010. Expert Group report, public version 2010; 1. http://cordis.europa.eu/fp7/ict/ssai/docs/cloud-report-final.pdf

2. Khorshed MT, et al. Trust issues that create threats for cyber attacks in cloud computing. In Proceedings of IEEE ICPADS, December 7–9, 2011, Tainan, Taiwan, 2011.

3. Armbrust M, et al. Above the clouds: a Berkeley view of cloud computing. EECS Department, University of California, Berkeley, Tech. Rep. UCB/EECS-2009-28, 2009.

4. Brunette G, Mogull R. Security guidance for critical areas of focus in cloud computing V2.1. CSA (Cloud

Security Alliance), USA. Available: https://cloudsecurityalliance.org/csaguide.pdf, vol. 1, 2009.

5. Catteddu D, Hogben G. Benefits, risks and recommendations for information security. European Network and Information Security Agency (ENISA), 2009.

6. Mell P, Grance T. The NIST definition of cloud computing. National Institute of Standards and Technology 2009; 53(6): 50. http://csrc.nist.gov/publications/nistpubs/800-145/SP800-145.pdf

7. Khorshed MT, et al. Monitoring insiders activities in cloud computing using rule based learning. In Proceedings of IEEE TrustCom-11, Nov. 16–18, Changsha, China, 2011.

8. Khorshed MT, et al. A survey on gaps, threat remediation challenges and some thoughts for proactive attack detection in cloud computing. Future Generation Computer Systems 2012.

9. NIST. (2011, 21 May 2011). NIST Cloud Computing Program. Available: http://www.nist.gov/itl/cloud/

10. Ness G. (2009, 22 May 2011). 3 Major Barriers to Cloud Computing. Available: http://www.infra20.com/post.cfm/3-major-barriers-to-cloud-computing

11. Leavitt N. Is cloud computing really ready for prime time? Growth 2009; 27: 5.

12. Brodkin J. Gartner: seven cloud-computing security risks, 2008. Available: http://www.infoworld.com/d/security-central/gartner-seven-cloud-computing-security-risks-853 (Retrieved: 6th August 2012).

13. Archer J, Boehm A. Security guidance for critical areas of focus in cloud computing. Cloud Security Alliance 2009. Available: https://cloudsecurityalliance.org/guidance/csaguide.v1.0.pdf

14. Archer J, et al. (2010, 7 May 2011). Top Threats to Cloud Computing, Version 1.0. Available: http://www.cloudsecurityalliance.org/topthreats/csathreats.v1.0.pdf

15. Monfared AT. Monitoring intrusions and security breaches in highly distributed cloud environments. 2010.

16. Grosse E, et al. Cloud computing roundtable. Security & Privacy, IEEE 2010; 8: 17–23.

17. Wrenn G. (2010, 25 May 2011). Unisys Secure Cloud Addressing the Top Threats of Cloud Computing. Available: http://www.unisys.com/unisys/common/download.jsp?d_id=1120000970002010125&backurl=/unisys/ri/wp/detail.jsp&id=1120000970002010125

18. Grobauer B, et al. Understanding cloud-computing vulnerabilities. IEEE Security and Privacy 2010; 50–57. DOI: 10.1109/MSP.2010.115

19. Rocha F, Correia M. Lucy in the sky without diamonds: stealing confidential data in the cloud. 2011.

20. Yildiz M, et al. A layered security approach for cloud computing infrastructure. In 2009 10th International Symposium on Pervasive Systems, Algorithms, and Networks, IEEE, Kaohsiung, 2009; 763–767. DOI: 10.1109/I-SPAN.2009.157

21. Dahbur K, et al. A survey of risks, threats and vulnerabilities in cloud computing. In Proceedings of the 2011 International Conference on Intelligent Semantic Web-Services and Applications, ISWSA '11, ACM, New York, USA, 2011; 12.

22. Wang C, et al. Ensuring data storage security in cloud computing. In 17th International Workshop on Quality of Service (IWQoS), IEEE, Charleston, SC, 2009; 1–9.

23. Yan L, et al. Strengthen cloud computing security with federal identity management using hierarchical identity-based cryptography. Cloud Computing 2009; 5931: 167–177. DOI: 10.1007/978-3-642-10665-1_15

24. Ristenpart T, et al. Hey, you, get off of my cloud: exploring information leakage in third-party compute clouds. In Proceedings of the 16th ACM Conference on Computer and Communications Security, Chicago, Illinois, USA, November 9–13, 2009; 199–212.

25. Chonka A, et al. Cloud security defence to protect cloud computing against HTTP-DoS and XML-DoS attacks. Journal of Network and Computer Applications 2010. DOI: 10.1016/j.jnca.2010.06.004

26. Danchev D. (2011, 31 May 2011). Dancho Danchev's Blog—Mind Streams of Information Security Knowledge. Available: http://ddanchev.blogspot.com/

27. Grossman J. (2011, 19 June 2011). Jeremiah Grossman. Available: http://jeremiahgrossman.blogspot.com/

28. Hewlett-Packard Development Company. HP ProLiant DL380 G4 server - specifications, 2012. Available: http://h18000.www1.hp.com/products/servers/proliantdl380/specifications-g4.html (Retrieved: 6th August 2012).

29. VMware. (2011, 16 July 2011). VMware vSphere Hypervisor. Available: https://www.vmware.com/tryvmware/?p=esxi&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

30. Microsoft Corporation. Windows 7, 2012. Available: http://windows.microsoft.com/en-au/windows7/products/home (Retrieved: 6th August 2012).

31. McDowell M. (2009, 21 June 2011). Understanding Denial-of-Service Attacks. Available: http://www.us-cert.gov/cas/tips/ST04-015.html

32. John GH, Langley P. Estimating continuous distributions in Bayesian classifiers. In Eleventh Conference on Uncertainty in Artificial Intelligence, San Mateo, 1995; 338–345.

33. Lopez R, Onate E. A variational formulation for the multilayer perceptron. Artificial Neural Networks–ICANN 2006 2006; 4131: 159–168. DOI: 10.1007/11840817_17

34. Platt JC. Fast training of support vector machines using sequential minimal optimization. 1999; 185–208.

35. Quinlan JR. C4.5: Programs for Machine Learning. Morgan Kaufmann: San Mateo, CA, 1993.

36. Frank E, Witten IH. Generating accurate rule sets without global optimization. In Fifteenth International Conference on Machine Learning, 1998; 144–151.

37. Witten IH, et al. Data Mining: Practical Machine Learning Tools and Techniques (3rd edn). Morgan Kaufmann: San Francisco, 2011.

38. Kohavi R, Provost F. Glossary of terms. Machine Learning 1998; 30: 271–274.

39. Ali ABMS, Wasimi SA. Data Mining: Methods and Techniques. Thomson, 2007.

40. Quinlan JR. Induction of decision trees. Machine Learning 1986; 1: 81–106.

41. Ali ABMS, Smith KA. On learning algorithm selection for classification. Journal on Applied Soft Computing, Elsevier 2006; 6: 119–138.

42. Shafiullah G, et al. Prospects of renewable energy—a feasibility study in the Australian context. Renewable Energy, Elsevier 2012; 39(1): 183–197.

APPENDIX A: PERFORMANCE EVALUATION USING RANKING METHOD

To select the most efficient algorithm, a statistical performance measure based on ranking has been applied. We have considered true positive rate (TPR), false positive rate (FPR), precision (P), recall (R), F-measure (F) and the area of receiver operating characteristic (ROC) [39] as indicators of the performance metrics, as shown in Figures 4–9. To summarise the performance-measuring attributes, we use the confusion matrix for this case as illustrated in the following.

                                    Predicted
                                    No attack (negative)   Attack (positive)
Actual   No attack (negative)       a                      b
         Attack (positive)          c                      d

Following the table given previously, we can represent the measuring attributes as follows:

\[ \mathrm{TPR} = \frac{d}{c+d}; \quad \mathrm{FPR} = \frac{b}{a+b}; \quad P = \frac{d}{b+d}; \quad R = \frac{d}{c+d}; \quad F = \frac{(\beta^{2}+1)\,P \cdot \mathrm{TPR}}{\beta^{2}P + \mathrm{TPR}} \]

where β is a positive number used to adjust the relative importance of TPR and P. Finally, an ROC is a graphical representation of the trade-off between the false negative and false positive rates.

The best-performing algorithm on each of these measures is assigned rank 1 and the worst rank 0. Thus, the rank of the jth algorithm on the ith attribute, as in Equations (A1) and (A2), is calculated as stated in references [41,42]:

\[ R_{ij} = 1 - \frac{e_{ij} - \max(e_i)}{\min(e_i) - \max(e_i)} \quad \text{(for all attributes except FPR; higher values are better)} \tag{A1} \]

\[ R_{ij} = 1 - \frac{e_{ij} - \min(e_i)}{\max(e_i) - \min(e_i)} \quad \text{(for FPR; lower values are better)} \tag{A2} \]

where e_ij is the measured value for the jth algorithm on attribute i and e_i is the vector of measured values for attribute i. A detailed comparison of performances can be evaluated from these equations.

The most suitable algorithm has been assessed using the total number of best and worst ranking performances. The total number of the best and worst rankings was evaluated by using the following equation:

\[ P_i = \frac{1}{r}\left(\frac{s_i - f_i}{n}\right) + \frac{1}{r} \]

where r = 2 is the weight-shifting parameter, s_i is the total number of successes (wins) or best cases for the ith algorithm, f_i is the total number of failures (losses) or worst cases for the same algorithm, and n is the total number of attributes in the performance metrics. The value of P has finally been assigned as the algorithm performance rank for DoS attack identification in cloud computing, as summarised in Table A1, using the detailed analysis from Tables A1–A6.
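As a sketch, the measuring attributes and the ranking score can be transcribed into Python as follows; the confusion-matrix counts and metric values below are invented placeholders, not those of Tables A1–A6:

```python
def rates(a, b, c, d, beta=1.0):
    """TPR, FPR, precision P, recall R and F-measure from confusion-matrix
    counts (layout as in Appendix A): a = true negatives, b = false
    positives, c = false negatives, d = true positives."""
    tpr = d / (c + d)          # also the recall R
    fpr = b / (a + b)
    p = d / (b + d)
    f = (beta ** 2 + 1) * p * tpr / (beta ** 2 * p + tpr)
    return tpr, fpr, p, tpr, f

def rank(values, lower_is_better=False):
    """Normalised rank per Equations (A1)/(A2): the best value on a
    metric maps to 1 and the worst to 0 (direction flipped for FPR)."""
    lo, hi = min(values), max(values)
    if lower_is_better:
        return [1 - (v - lo) / (hi - lo) for v in values]
    return [1 - (v - hi) / (lo - hi) for v in values]

def performance_score(wins, losses, n, r=2):
    """P_i = (1/r) * (s_i - f_i) / n + 1/r: wins/losses counted over the
    n performance metrics, with r = 2 as the weight-shifting parameter."""
    return (wins - losses) / (r * n) + 1 / r

# Toy illustration: an algorithm that is best on 4 of 6 metrics and
# worst on none scores above the neutral value of 0.5.
print(round(performance_score(4, 0, 6), 3))  # 0.833
```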


Table A2. False positive rate for the algorithms’ performance.

Naive Bayes Multilayer perceptron Support vector machine PART C4.5

No attack 0.217 0.287 0.705 0.132 0.155
Ping flood attack 0.057 0.004 0.000 0.008 0.004
RDoS attack 0.017 0.002 0.002 0.004 0.015
SYN flood attack 0.129 0.000 0.000 0.010 0.008
HTTP-DoS attack 0.004 0.000 0.000 0.016 0.004
Weighted average 0.174 0.218 0.536 0.102 0.120

Table A3. Precision for the algorithms’ performance.

Naive Bayes Multilayer perceptron Support vector machine PART C4.5

No attack 0.915 0.916 0.817 0.958 0.952
Ping flood attack 0.442 0.895 0.000 0.852 0.920
RDoS attack 0.849 0.982 0.958 0.969 0.901
SYN flood attack 0.193 1.000 0.000 0.643 0.556
HTTP-DoS attack 0.882 1.000 1.000 0.652 0.882
Weighted average 0.861 0.929 0.776 0.933 0.929

Table A4. Recall for the algorithms’ performance.

Naive Bayes Multilayer perceptron Support vector machine PART C4.5

No attack 0.744 0.993 0.998 0.958 0.968
Ping flood attack 0.852 0.630 0.000 0.852 0.852
RDoS attack 0.682 0.818 0.348 0.955 0.970
SYN flood attack 1.000 0.375 0.000 0.563 0.313
HTTP-DoS attack 0.750 0.750 0.750 0.750 0.750
Weighted average 0.750 0.925 0.828 0.933 0.935

Table A5. F-measure for the algorithms' performance.

Naive Bayes Multilayer perceptron Support vector machine PART C4.5

No attack 0.821 0.953 0.898 0.958 0.960
Ping flood attack 0.582 0.739 0.000 0.852 0.885
RDoS attack 0.756 0.893 0.511 0.962 0.934
SYN flood attack 0.323 0.545 0.000 0.600 0.400
HTTP-DoS attack 0.811 0.857 0.857 0.698 0.811
Weighted average 0.786 0.919 0.777 0.933 0.931

Table A1. True positive rate for the algorithms’ performance.

Naive Bayes Multilayer perceptron Support vector machine PART C4.5

No attack 0.744 0.993 0.998 0.958 0.968
Ping flood attack 0.852 0.630 0.000 0.852 0.852
RDoS attack 0.682 0.818 0.348 0.955 0.970
SYN flood attack 1.000 0.375 0.000 0.563 0.313
HTTP-DoS attack 0.750 0.750 0.750 0.750 0.750
Weighted average 0.750 0.925 0.828 0.933 0.935



Table A6. Area of receiver operating characteristic for the algorithms’ performance.

Naive Bayes Multilayer perceptron Support vector machine PART C4.5

No attack 0.817 0.903 0.646 0.931 0.931
Ping flood attack 0.917 0.985 0.873 0.937 0.953
RDoS attack 0.922 0.967 0.728 0.983 0.980
SYN flood attack 0.978 0.806 0.647 0.806 0.791
HTTP-DoS attack 0.948 0.835 0.860 0.941 0.923
Weighted average 0.845 0.909 0.676 0.935 0.934
