A Network Behavior Analysis System for Cloud Computing Service




Bon-Yeh Lin 1,2, Chi-Hua Chen 1,*, Hsu-Chia Chang 1, Chi-Chun Lo 1

1 Institute of Information Management, National Chiao-Tung University, Hsinchu 300, Taiwan, ROC

2 Telecommunication Laboratories, Chunghwa Telecom Co., Ltd, Taoyuan 326, Taiwan, ROC

*corresponding author: [email protected]

Abstract

Network environments are increasingly full of distributed network services and large quantities of data. It is important to retrieve real-time network information dynamically once the information is published on the network. In determining the Optimal Cycle for Network Information Retrieval (OCNIR), a higher retrieval frequency raises the network load and the cost of information retrieval, while a retrieval frequency that is too low lengthens the delay before the information is obtained. For this reason, the OCNIR is an important issue. However, the OCNIR for cloud computing services has not been investigated. In this paper, we integrate web service techniques with a Network Behavior Analysis Model (NBAM). We implement a Service-Oriented Architecture (SOA) environment with Semantic Web (SW) and NBAM to build a Network Behavior Analysis System (NBAS) for cloud computing services. The major contents of this research are as follows: (1) the establishment of the NBAM, (2) the determination of the optimal cycle for network information retrieval, and (3) the establishment of data transmission and analysis based on SOA. We develop the approach and implement the NBAS so that individuals or enterprises can obtain the NBAM and the OCNIR, test the approach in a domain with real-time data alteration, and then evaluate it.

Keywords: Cloud computing, Web services, Service-oriented architecture, Network behavior analysis model.

1. Introduction

In recent years, cloud computing has become a popular topic. With its innovative computing model, cloud computing gives users network access to a great deal of computing capability and a variety of information services. Its business model lets users pay per use [1, 2]. Cloud computing supports virtualization and service management automation to provide flexible, high-performance computing and large-scale data analysis [6]. An enterprise does not need to build its own data center and can run various kinds of service systems on a cloud platform. This computing and business model has attracted the interest of industry and academic organizations [2, 3, 7]. In terms of the service framework, cloud computing consists of three layers: Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS) [1, 8]. For example, Google App Engine (GAE) [10] is a PaaS provider that offers an application development platform for developers and strictly allocates resources in order to maintain the automatic scalability and high availability of applications. Many users have begun to develop their own software and have it managed by a PaaS provider to reduce the deployment cost of the hardware environment [5].

Network environments are increasingly full of distributed network services and large quantities of data. It is important to retrieve real-time network information dynamically once the information is published on the network. Hence, many software developers implement crawler applications to retrieve new information. These applications can also be deployed and operated as SaaS. However, accessing and using the transmission, computing, and storage resources of a cloud computing environment incurs a production cost for application execution [4, 5]. If the retrieval frequency is higher, the network load and the cost of information retrieval are higher. If the retrieval frequency is too low, the delay before the information is obtained is long. For this reason, the Optimal Cycle for Network Information Retrieval (OCNIR) is an important issue. However, the OCNIR for cloud computing services has not been investigated.

In this paper, we integrate web service techniques with a Network Behavior Analysis Model (NBAM). We implement a Service-Oriented Architecture (SOA) environment with Semantic Web (SW) and NBAM to build a Network Behavior Analysis System (NBAS) for cloud computing services. The major contents of this research are as follows: (1) the establishment of the NBAM, (2) the determination of the optimal cycle for network information retrieval, and (3) the establishment of data transmission and analysis based on SOA. We develop the approach and implement the NBAS so that individuals or enterprises can obtain the NBAM and the OCNIR, test the approach in a domain with real-time data alteration, and then evaluate it.

2. Network Behavior Analysis System

In this section, we implement an SOA environment with Semantic Web and NBAM to build an NBAS for cloud computing services. The NBAS is a three-tier system composed of users, the Real-Time Network Behavior Analysis Server (RNBAS), and GAE, as shown in Figure 1.

Figure 1. The architecture of Network Behavior Analysis System (NBAS)

(1) Users

Users submit their Network Behavior Analysis (NBA) requirements to the NBAS, and the RNBAS carries out the NBA inference using SW and the NBAM. The NBAM provides the analysis that yields the OCNIR for each user's situation.

(2) Real-Time Network Behavior Analysis Server (RNBAS)

For NBA, the RNBAS first retrieves the network information. The RNBAS then uses the NBAM to calculate the average cost and delay time of network information retrieval. The NBAM considers both the cost and the delay time to infer the OCNIR for users. In this paper, we use a case study, news event crawling, to analyze and evaluate the NBAM.

(3) Google App Engine (GAE)

GAE is a PaaS provider that offers an application development platform for developers and strictly allocates the use of resources in order to maintain the automatic scalability and high availability of applications. The RNBAS can use the Memcache APIs, URL Retrieval APIs, E-mail APIs, and Image APIs provided by GAE to retrieve network information [10].


2.1 The collection and analysis of network information

In the experiments, CAMEO InfoTech [9] crawled and collected the news events published on Yahoo News in May 2010. We analyze the cumulative distribution of the News Inter-Arrival Time (NIAT). The average NIAT in the news event history is 1.04 hr/event. We fit an Exponential Distribution (ED) function with a mean of 1.04 hr/event to the cumulative distribution of the NIAT. The average error ratio between the measured data and the ED is 3.48%, as shown in Figure 2. The results show that the cumulative distribution of the NIAT fits the ED function. Because a Poisson process has exponentially distributed inter-arrival times, we assume that the news event arrival process is a Poisson process for NBA.

[Figure 2: plot of the cumulative distribution function (%) against News Inter-Arrival Time (hr, 0 to 25), comparing the Exponential Distribution with the Real Data]

Figure 2. The comparison of the measured news events and the fitted cumulative distribution using an exponential distribution function (the average NIAT = 1.04 hr/event)
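The fit described in Section 2.1 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' procedure: the paper does not define its "average error ratio", so the mean absolute gap between the empirical and fitted cumulative distributions is used here as a stand-in, and the inter-arrival samples are synthetic because the Yahoo News measurements are not available. All function names are hypothetical.

```python
import math
import random


def exponential_cdf(tau, mean_niat):
    """CDF of an exponential distribution with the given mean inter-arrival time."""
    return 1.0 - math.exp(-tau / mean_niat)


def empirical_cdf(samples, tau):
    """Fraction of observed inter-arrival times that are <= tau."""
    return sum(1 for s in samples if s <= tau) / len(samples)


def average_error(samples, grid):
    """Mean absolute gap between the empirical CDF and the fitted exponential CDF.

    Assumption: this metric stands in for the paper's "average error ratio",
    whose exact definition is not given.
    """
    mean_niat = sum(samples) / len(samples)  # the fitted mean is the sample mean
    gaps = [abs(empirical_cdf(samples, tau) - exponential_cdf(tau, mean_niat))
            for tau in grid]
    return sum(gaps) / len(gaps)


# Synthetic stand-in data: NOT the Yahoo News measurements used in the paper.
rng = random.Random(0)
synthetic_niat = [rng.expovariate(1 / 1.04) for _ in range(5000)]

# Evaluate on 0.5 hr .. 25 hr, the range of the horizontal axis in Figure 2.
grid = [0.5 * i for i in range(1, 51)]
print(f"average error: {average_error(synthetic_niat, grid):.4f}")
```

With real crawl timestamps, `synthetic_niat` would be replaced by the differences between consecutive publication times, expressed in hours.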

2.2 The average cost of information retrieval

In this section, we propose a model to analyze the average cost of information retrieval. We assume that the news event arrival process is a Poisson process, according to the analytical results described in Section 2.1. For the GAE environment [10], we define each URL API request to GAE as one transaction, which serves as the cost unit of news event retrieval. The parameters of our model are defined in Table 1.

Table 1. The parameters of average cost of information retrieval

| Parameter | Description |
|---|---|
| λ (event/hr) | The average arrival rate of news events |
| t (hr/crawling) | The crawl cycle time of news events |
| k (event) | The number of news events that arrived during the current crawl cycle time |
| n (event/page) | The number of news events per news list page |
| c (transaction/crawling) | The average number of transactions per crawling |

The proposed news event retrieval algorithm is shown in Figure 3. The URLs of the news events in a news list page are sorted by time. First, the algorithm retrieves the news list page and determines which URLs are new. It then retrieves the contents of the newly arrived news events through those URLs.

In the case of Yahoo News, there are six news events (n = 6) in a news list page. If all the news events in the news list page are newly arrived, the algorithm retrieves the next news list page and analyzes it as well. The cost function c(k) is defined in formula (1) to evaluate the cost incurred when k news events have newly arrived in a crawl cycle.


$$c(k) = k + \frac{k - (k \bmod n)}{n} + 1 \tag{1}$$
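As a sanity check, the crawl procedure described above can be walked step by step and its transaction count compared with the cost function. The sketch below reads formula (1) as c(k) = k + floor(k / n) + 1, that is, one transaction per newly arrived news content plus one per news list page fetched. It assumes that a news list page fetch counts as one transaction, like any other URL API request; the function names are hypothetical.

```python
def formula_cost(k, n):
    """Transactions per crawl: k contents plus the news list pages fetched."""
    return k + (k - k % n) // n + 1


def simulate_crawl(k, n):
    """Walk the crawl loop for k new events with n events per list page.

    Assumption: each list page fetch and each content fetch is one transaction.
    """
    transactions = 0
    remaining_new = k
    while True:
        transactions += 1                    # fetch one news list page
        new_on_page = min(remaining_new, n)  # new URLs found on this page
        transactions += new_on_page          # fetch the content of each new event
        remaining_new -= new_on_page
        if new_on_page < n:                  # an old URL was reached, stop paging
            break
    return transactions


# The simulated count agrees with the closed form for every k tried here.
for k in range(0, 20):
    assert simulate_crawl(k, 6) == formula_cost(k, 6)

print([formula_cost(k, 6) for k in (0, 1, 5, 6, 12)])  # prints [1, 2, 6, 8, 15]
```

Note the jump from 6 to 8 transactions between k = 5 and k = 6: when a whole page of six events is new, the crawler has to fetch one more list page to find out whether anything older is also new.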

while (true) {
    i = 0
    while (the URL of the (i+1)-th news event in the news list page is new) {
        request URL API to retrieve the contents of this news event
        i = i + 1
        if (i = n) then break
    }
    if (i < n) then break
    else retrieve the news list in the next page
}

Figure 3. The crawling algorithm of news event retrieval

In addition, the news event arrival process is assumed to be a Poisson process with news event arrival rate λ, so the probability that k news events newly arrive in a crawl cycle of length t is given by formula (2).

$$f(t) = \frac{(\lambda t)^{k} e^{-\lambda t}}{k!} \tag{2}$$

By multiplying formula (1) by formula (2) and summing over all k, we obtain the average number of transactions per crawling, as shown in formula (3).

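$$c = \sum_{k=0}^{\infty} c(k)\, f(t) = \sum_{k=0}^{\infty} \left( k + \frac{k - (k \bmod n)}{n} + 1 \right) \frac{(\lambda t)^{k} e^{-\lambda t}}{k!} \tag{3}$$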

Finally, we normalize the cost by the crawl cycle time t to obtain the average number of transactions per hour C, as shown in formula (4). Figure 4 shows the results of formula (4) for different news event arrival rates.

$$C = \frac{c}{t} \tag{4}$$

[Figure 4: plot of C (transaction/hr) against t (hr/crawling, 0 to 32), with n = 6 event/page]

Figure 4. The average crawling cost of news events with different news event arrival rates
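The average cost in formulas (3) and (4) can be evaluated numerically by truncating the Poisson sum. The sketch below is an illustration rather than the authors' implementation: it reads the per-crawl cost as c(k) = k + floor(k / n) + 1, and the truncation point `k_max` and all function names are assumptions introduced here.

```python
import math


def cost_per_crawl(k, n):
    """Transactions for k new events with n events per list page."""
    return k + k // n + 1


def poisson_pmf(k, mean):
    """Probability of k arrivals for a Poisson distribution with the given mean.

    Computed in log space so that large k does not overflow the factorial.
    """
    return math.exp(k * math.log(mean) - mean - math.lgamma(k + 1))


def average_cost_per_hour(t, lam, n, k_max=400):
    """Average transactions per hour C = c / t for crawl cycle t (hr) and rate lam (event/hr).

    Assumption: k_max is large enough that the neglected Poisson tail is negligible,
    which holds when lam * t is well below k_max.
    """
    mean = lam * t
    per_crawl = sum(cost_per_crawl(k, n) * poisson_pmf(k, mean)
                    for k in range(k_max + 1))
    return per_crawl / t


for t in (0.5, 1, 2, 4, 8, 16, 32):
    print(t, round(average_cost_per_hour(t, lam=1.0, n=6), 3))
```

With λ = 1 event/hr and n = 6 event/page, the values produced this way can be compared against the C column of Table 3. As t grows, C approaches λ(1 + 1/n) transactions per hour, since every event costs one content fetch and roughly every n-th event costs an extra list page fetch.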


2.3 The average delay time of information

This paper also considers the average delay time of network events, that is, how long an event waits before it is retrieved. According to the analytical results described in Section 2.1, the NIAT can be assumed to follow an ED. To obtain the average delay time of network events, we refer to the timing diagram shown in Figure 5. The first crawling point occurs at t1, and we define an event that arrived at t0, before t1. The second event arrives at t2, and the second crawling point occurs at t3. The relevant parameters are defined in Table 2.

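[Figure 5: timing diagram with the 1st event arrival at t0, the 1st crawling point at t1, the 2nd event arrival at t2, and the 2nd crawling point at t3; x spans t0 to t1, τ spans t0 to t2, t spans t1 to t3, and t + x − τ spans t2 to t3]

Figure 5. The timing diagram of network event arrivals and crawling points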

Table 2. The parameters of average delay time of network events

| Parameter | Description |
|---|---|
| τ (hr) | The news event inter-arrival time |
| 1/λ (hr/event) | The expected value of the news event inter-arrival time |
| x (hr) | The elapsed time from the first news event arrival to the first crawling point |

In addition, the NIAT is assumed to follow an ED with average NIAT 1/λ, so the probability density function g(τ) of the NIAT is defined in formula (5).

$$g(\tau) = \lambda e^{-\lambda \tau} \tag{5}$$

In Figure 5, the delay time (t + x − τ) of the second event is the time elapsed from its arrival until it can be retrieved. The average delay time T of the first network event that arrives in each crawl cycle is given by formula (6). Figure 6 shows the results of formula (6).

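$$T = (t + x - \tau) \times \Pr(x \le \tau \le x + t) = \int_{x=0}^{\infty} \int_{\tau=x}^{x+t} (t + x - \tau)\, g(\tau)\, d\tau\, dx = \frac{\lambda t + e^{-\lambda t} - 1}{\lambda^{2}} \tag{6}$$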


[Figure 6: plot of T (hr) against t (hr/crawling, 0 to 32)]

Figure 6. The average delay time T of the first news event arrival in each crawling cycle
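The average delay time can be checked numerically. The sketch below assumes that formula (6) is the double integral of (t + x − τ) g(τ) over τ in [x, x + t] and x in [0, ∞), with closed form T = (λt + e^{−λt} − 1) / λ². That closed form agrees with the T column of Table 3 at λ = 1 event/hr. Since Table 3 only covers λ = 1, the check says nothing about other arrival rates. The midpoint-rule integrator, its step count, and the truncation of the outer integral are illustrative choices, and all names are hypothetical.

```python
import math


def niat_pdf(tau, lam):
    """Exponential density g(tau) of the news inter-arrival time."""
    return lam * math.exp(-lam * tau)


def average_delay_closed_form(t, lam):
    """Assumed closed form of formula (6)."""
    return (lam * t + math.exp(-lam * t) - 1.0) / lam ** 2


def average_delay_numeric(t, lam, steps=300, x_max_means=40.0):
    """Midpoint-rule estimate of the double integral behind formula (6).

    Assumption: the outer integral over x is truncated at x_max_means / lam,
    where exp(-lam * x) is negligible.
    """
    x_max = x_max_means / lam
    dx = x_max / steps
    dtau = t / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * dx
        inner = 0.0
        for j in range(steps):
            tau = x + (j + 0.5) * dtau
            inner += (t + x - tau) * niat_pdf(tau, lam) * dtau
        total += inner * dx
    return total


for t in (0.5, 1, 2, 3.0565, 4, 8):
    print(t, round(average_delay_closed_form(t, lam=1.0), 3))
```

For large t the exponential term vanishes and, at λ = 1, T approaches t − 1: the first event of a cycle arrives on average one mean inter-arrival time after the previous crawl and then waits for the rest of the cycle.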

2.4 The analysis of the Optimal Cycle for Network Information Retrieval (OCNIR)

From the previous sections, we obtain (1) the average cost of information retrieval and (2) the average delay time of the first network event in each crawl cycle. When the crawl cycle time increases, the average cost of information retrieval decreases and the average delay time of the first network event in each crawl cycle increases. Therefore, there is a tradeoff between cost and delay time. In this paper, we aim to find an equilibrium by considering both of them. The OCNIR is the crawl cycle time at which the decreasing rate of the cost equals the increasing rate of the delay time. The analytical results are shown in Table 3 and Figure 7.

Table 3. The analysis of the OCNIR

| t (hr/crawling) | C (transaction/hr) | T (hr) |
|---|---|---|
| 0.5 | 3.000 | 0.107 |
| 1 | 2.001 | 0.368 |
| 2 | 1.508 | 1.135 |
| 3.0565 | 1.357 | 2.104 |
| 4 | 1.304 | 3.018 |
| 8 | 1.240 | 7.000 |
| 16 | 1.203 | 15.000 |
| 32 | 1.185 | 31.000 |

[Figure 7: plot of T (hr, 0 to 32) against C (transaction/hr, 1.0 to 3.0), with λ = 1 event/hr and n = 6 event/page]

Figure 7. The analysis of the OCNIR


3. Conclusions

Network environments are increasingly full of distributed network services and large quantities of data. It is important to retrieve real-time network information dynamically once the information is published on the network. In determining the OCNIR, a higher retrieval frequency raises the network load and the cost of information retrieval, while a retrieval frequency that is too low lengthens the delay before the information is obtained. For this reason, the OCNIR is an important issue. However, the OCNIR for cloud computing services has not been investigated. In this paper, we integrate web service techniques with the NBAM. We implement an SOA environment with Semantic Web and NBAM to build an NBAS for cloud computing services. The major contents of this research are as follows: (1) the establishment of the NBAM, (2) the determination of the optimal cycle for network information retrieval, and (3) the establishment of data transmission and analysis based on SOA. We develop the approach and implement the NBAS so that individuals or enterprises can obtain the NBAM and the OCNIR, test the approach in a domain with real-time data alteration, and then evaluate it.

To adapt dynamically to the varied characteristics of different network information, our approach can be used to develop a separate analytical service for each kind of information. Applications can then choose the analytical service that corresponds to the network information they want to retrieve. These services can be developed and provided by a PaaS provider, so they can be managed and maintained effectively. Currently, the NBAM is obtained through statistical analysis and targets news event retrieval. In the future, other kinds of network information will be analyzed to extend the usability of the NBAS.

Acknowledgements

The authors would like to thank CAMEO InfoTech for its contribution to this work. The research is supported by the National Science Council of Taiwan under grant No. NSC 99-2622-H-009-003-CC3.

References

[1] M. Armbrust, A. Fox, R. Griffith, A.D. Joseph, R. Katz, A. Konwinski, G. Lee, D. Patterson, A. Rabkin, I. Stoica, and M. Zaharia. A view of cloud computing. Communications of the ACM, 53:4 (2010), 50-58.

[2] J. Brodkin. Pricing the cloud is an ongoing challenge. Computerworld, (2009).

[3] R. Buyya, C.S. Yeo, S. Venugopal, B. Srikumar, J. Broberg, and I. Brandic. Cloud computing and emerging IT platforms: Vision, hype, and reality for delivering computing as the 5th utility. Future Generation Computer Systems, 25:6 (2009), 599-616.

[4] M. Creeger. CTO roundtable: cloud computing. Communications of the ACM, 52:8 (2009), 50-56.

[5] M. Cusumano. Cloud computing and SaaS as new computing platforms. Communications of the ACM, 53:4 (2010), 27-29.

[6] K. Hartig. What is cloud computing?: The cloud is a virtualization of resources that maintains and manages itself. Cloud Computing Journal, (2009).

[7] R. Hicks. The future of government in the cloud. FutureGov, 6:3 (2009), 58-62.

[8] D.C. Wyld. The Utility of Cloud Computing as a New Pricing - and Consumption - Model for Information Technology. International Journal of Database Management Systems (IJDMS), 1:1 (2009), 1-20.

[9] CAMEO. CAMEO Engine. CAMEO InfoTech Inc, (2010). Available: http://www.mycameo.com/website/index.htm

[10] Google App Engine. Quotas. Google Inc, (2010). Available: http://code.google.com/intl/en/appengine/docs/quotas.html


Bon-Yeh Lin received a B.S. degree in computer science from National Chiao-Tung University, Taiwan, in 1997, and an M.S. degree in information management from National Chiao-Tung University, Taiwan, in 1999. Since 1999, he has been employed by the Chunghwa Telecom Laboratories, Taiwan. At present, he is a Ph.D. candidate in the Institute of Information Management, National Chiao-Tung University, Taiwan. His major research interests include information security, cloud computing, intelligent transportation systems, and network management.

Chi-Hua Chen received a B.S. degree from National Pingtung University of Science and Technology, Taiwan, in 2007, and an M.S. degree from National Chiao-Tung University, Taiwan, in 2009, both in information management. At present, he is a Ph.D. student in the Institute of Information Management, National Chiao-Tung University, Taiwan. His major research interests include personal communications networks, cloud computing, intelligent transportation systems, and network management.

Hsu-Chia Chang is currently a Ph.D. candidate in the Institute of Information Management, National Chiao-Tung University, Hsinchu, Taiwan. Her major research interests include wireless home networks, wireless P2P, service-oriented computing, and cloud computing.

Chi-Chun Lo received a B.S. degree in mathematics from National Central University, Taiwan, in 1974, an M.S. degree in computer science from Memphis State University, Memphis, Tennessee, in 1978, and a Ph.D. degree in computer science from Polytechnic University, Brooklyn, New York, in 1987. From 1981 to 1986, he was employed by AT&T Bell Laboratories, Holmdel, New Jersey. From 1986 to 1990, he worked for Bell Communications Research, Piscataway, New Jersey. Since 1990, he has been with the Institute of Information Management, National Chiao-Tung University, Taiwan. At present, he is a professor at the institute. His major current research interests include network design algorithms, network management, network security, and network architecture.