Copyright © 2004 John Wiley & Sons, Ltd.

Self-similar and fractal nature of Internet traffic

By D. Chakraborty*†§, A. Ashir, T. Suganuma, G. Mansfield Keeni, T. K. Roy and N. Shiratori

Self-similar, bursty Internet traffic is usually characterized by the Hurst parameter (H). Such a process is also seen to possess fractal characteristics in time, described by a parameter (β), with multifractals in most cases. We observe that this highly stochastic traffic is fractal in flow density too, described by a fractal dimension (D), again with the possibility of multifractals. Internet traffic therefore needs another parameter for its description besides the usual self-similarity parameter β or H, and the simulations or models worked out to understand Internet traffic should reproduce the characteristics found in the present work. We also find a notable self-similarity of the autocorrelations of the data and its aggregates in all the cases studied. Copyright © 2004 John Wiley & Sons, Ltd.

*Correspondence to: D. Chakraborty, TAO Tohoku University Office, Sendai, Japan.
†E-mail: [email protected]
§Tel: +81-22-726-1667; Fax: +81-22-726-1667

Introduction

The self-similar burstiness of Internet traffic at almost all time scales, its slowly decaying autocorrelations (the so-called long-range effect) and its 1/f^γ power spectrum at low frequencies are all manifestations of the critical condition of the system, which arises from jamming effects at the gateways and their neighbours through which packets of information are processed [1–3]. After the pioneering works of Leland et al. [4] and Casabi [5], numerous works have appeared in the literature confirming the self-similar and fractal nature [6–11] of Internet traffic data, also with the possibility of multifractals in time resolution. The fractal properties are a consequence of jamming conditions in the Internet [1–3] and, possibly, of the fractal structure of its networks [12].

Understanding Internet traffic is an open and challenging problem, more so since the familiar Poisson model, long applied to problems in telecommunications, showed cracks in characterizing Internet traffic with the start of facsimile services [6–11]. Without an appropriate model for Internet traffic it is impossible to obtain the insight required to plan, manage or operate a network efficiently and to render a satisfactory quality of service to the users. Although parsimony in models is desirable, Internet traffic, owing to its complicated structure and stochastic behaviour, may require a number of parameters to specify its behaviour.

In this paper we report that Internet traffic is fractal in flow density (bytes transmitted per unit time) too: the number of boxes needed at a given resolution to cover the range of the data follows a power law. We also show that the autocorrelations of the characteristic time-series data follow a notable self-similar behaviour, a feature which has not been reported so far. Since Internet traffic is an example of a system composed of many stochastic components, it will also be interesting to see whether other similar natural systems [13] show the characteristics that we present here.

INTERNATIONAL JOURNAL OF NETWORK MANAGEMENT
Int. J. Network Mgmt 2004; 14: 000–000 (DOI: 10.1002/nem.512)

In the rest of the paper, after a brief account of related work and of the statistics of self-similarity in a time series, we describe how the fractal characteristics of the flow density data can be determined. We then describe the arrangement for data collection in our laboratory and give the results of our studies on our own data and on other data sets [4–6, 14, 15], which belong to either LAN or wide-area network traffic and all show self-similarity. Finally, we conclude with a summary and a note on possible areas of further research.

Related Work

Leland et al. [4] and Casabi [5] were the first to show that Internet traffic has self-similar features. The former group measured Ethernet traffic flows with errors ≤100 ms, and a survey of their data, consisting of several hundred million packets, confirmed the self-similar behaviour of the traffic over a wide range of time scales. This was contrary to the traditional Poisson model, in which the data smooth out and become featureless over large time scales. The distribution of the traffic was similar irrespective of the time scale of observation, and at all scales the autocorrelations showed similar long-range behaviour, decaying slowly according to a power law. This was confirmed by Paxson and Floyd [6] and Crovella and Bestavros [7] for wide-area traffic as well, and they suggested that the cause was related to the heavy-tailed (Pareto-type) distribution of the bytes transmitted. Willinger et al. [8] made similar observations related to the high variability of Internet traffic data. Taqqu et al. [10] examined the traffic in detail at higher resolutions and observed that Internet traffic is multifractal; that is, at these resolutions the data remain self-similar but are characterized by different parameters.

Contemporaneously with Leland et al. [4], Casabi [5] measured the round-trip time (RTT) of packets in the network and observed the 1/f law of the power spectrum of the RTT distribution. This was examined in detail by Takayasu et al. [1–3], who proposed a ‘contact process’ model with which they were able to show that the traffic follows a 1/f behaviour, but only at the critical condition. Away from the critical condition, the power spectrum at low frequencies has a 1/f^2 distribution instead. Their simulations of the traffic, based on a tree model, show that owing to jamming, when the demand for transmission exceeds the capacity of the links, the Internet can exist in a critical condition in which the network is most efficient. At the critical condition the traffic shows self-similar and fractal features.


An additional fractal feature that we present in this paper relates to the nature of the flow densities in Internet traffic, which has not been reported earlier. We describe this, along with other interesting features, in a later section. Preliminary reports of our work can be found in references [23–26].

Characteristics of Internet Traffic Data

Modern Internet traffic measurement systems [4, 5] record the time at which a packet of information arrives, with errors ≤100 ms, along with other details such as the source, the destination and the byte length of the data transferred. From this record, a time series of the data transmitted per unit time interval (the flow density) is obtained. The self-similarity of the traffic relates to, and is defined in terms of, aggregates of this time series.
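As an illustration of how a flow density series can be derived from packet records, the following is a minimal sketch, not the program used in this work. It assumes each record is a hypothetical (arrival time in seconds, byte length) pair; the function name `flow_density` and the sample trace are invented for the example.

```python
from typing import Iterable, List, Tuple


def flow_density(packets: Iterable[Tuple[float, int]], dt: float) -> List[int]:
    """Bin (timestamp_seconds, byte_length) records into bytes per interval dt."""
    records = sorted(packets)
    if not records:
        return []
    t0 = records[0][0]                       # time origin: first arrival
    n_bins = int((records[-1][0] - t0) / dt) + 1
    series = [0] * n_bins                    # empty intervals stay at zero
    for t, nbytes in records:
        series[int((t - t0) / dt)] += nbytes
    return series


# Hypothetical trace, for illustration only.
trace = [(0.00, 64), (0.40, 1500), (1.20, 512), (3.70, 40)]
print(flow_density(trace, dt=1.0))  # [1564, 512, 0, 40]
```

Intervals in which no packet arrives are kept as zeros, so that the series stays evenly spaced in time.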

—Statistics of Self-similarity—

For a self-similar time series:

$$\{X\} = \{X_1, X_2, \ldots, X_N\}, \tag{1}$$

the m-aggregate {X^(m)}, with its kth term:

$$X^{(m)}_k = \frac{1}{m} \sum_{i=m(k-1)+1}^{km} X_i, \tag{2}$$

has its variance Var[X^(m)] related to the original Var[X] as:

$$\mathrm{Var}[X^{(m)}] = \frac{\mathrm{Var}[X]}{m^{\beta}}, \quad 0 < \beta < 1, \tag{3}$$

β being the self-similarity parameter. At small m, β may have a different value from that to which it settles down as m grows.

Another characterization of self-similarity is given by [16] the rescaled range (R/S) of the data, where R(N) is the range (maximum minus minimum) of a series {L} defined as:

$$L_j = \sum_{i=1}^{j} \bigl(X_i - \mu(N)\bigr), \quad 1 \le j \le N, \tag{4}$$

with μ(N) the mean of the data of size N and S(N) the usual standard deviation. The R/S plot follows a power law at large N for a self-similar process:

$$R/S \sim (N/2)^{H}, \quad H > 0.5, \tag{5}$$

where H is the Hurst parameter. This is an extension of the Brownian-motion process, where the average distance traversed by a Brownian particle varies as N^(1/2). Most Internet traffic data show a value around 0.9 for H [4]. Note that H gives a measure of self-similarity independent of the time scale of observation, since the series L_j and L^(m)_j from the original and m-aggregated series are expected to produce the same H. β (the asymptotic value) is related to H as [4]:

$$H = 1 - \frac{\beta}{2}, \tag{6}$$

so that for full similarity (β = 0), H = 1 and for noise, which has no self-similarity (β = 1), H = 0.5. The autocorrelation R(k) = ⟨X_i X_{i+k}⟩, the average of the product X_i X_{i+k}, is supposed to be independent of m [4] for such a process:

$$R^{(m)}(k) = R(k), \tag{7}$$

i.e. the original series and the m-aggregates give the same autocorrelation, which is difficult to observe if the data set is not large. However, at large k it is found to decay slowly (remaining locally constant) according to a power law, hence the term covariance stationary (due to β < 1):

$$R(k) \sim k^{-\beta}, \quad k \text{ large}. \tag{8}$$

Due to this long-range behaviour, the power spectrum S(f) at frequency f, related to R(k) as:

$$S(f) = \sum_{k} R(k)\, e^{-ikf}, \tag{9}$$

also follows a power law at low frequencies:

$$S(f \to 0) \sim f^{-\gamma}, \quad \gamma = 1 - \beta. \tag{10}$$

—Fractal Characteristics of Internet Traffic Data—

We define [17] a process to possess fractal characteristics if there exists a relationship of the form:

$$Q(\tau) \propto \tau^{F(D)}, \tag{11}$$

where Q is a quantity depending on τ, a resolution in time or space of the observation variables, and F(D) is a simple function (most often linear) of the non-integer dimension D of the process, the fractal dimension.

Due to extreme variability, Internet traffic data exhibit such fractal-like structures over a long range of time scales. In fact one of the measures of self-similarity is based on such a definition. When Q is the variance of the data, then F is -β, so that the fractal dimension D is identified with β:

$$\mathrm{Var}[X^{(m)}] \propto m^{-\beta}. \tag{12}$$

Thus (12) describes a fractal behaviour of the data in time. Equation (10) is another similar description. The case of multifractals refers to the situation in which the exponent (β) varies from one range of scales (m) to another.

We observe that a similar description of the data {X} of (1) or {X^(m)} of (2) can be found with the resolution in the magnitude of the data. Imagine the range of the data {X} to be divided into equal segments of size ε, and count the number of segments that contain data. Let this be N(ε). Then if (see for example [18])

$$N(\varepsilon) \propto \varepsilon^{-D}, \tag{13}$$

the dimension D at resolution ε can be obtained from the magnitude of the slope of log N(ε) vs. log ε. Thus the set {X} or {X^(m)} is fractal if (13) is valid over an appreciable range of scales. For random data, since all values are equally likely, D = 1. Thus (13) gives an indication that all values in the range of the data are not equally probable if D is a fraction. As in all natural cases [19], it is to be noted that (13) is expected to give the dimension only for a range of scales that depends on the volume and accuracy of the data. Here, too, multifractals can exist, with the exponent D depending on the scale (ε) of observation.
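The two estimators of self-similarity described in this section, the variance of the m-aggregates in (2) and (3) and the rescaled range in (4) and (5), can be sketched as follows. This is an illustrative implementation under our own assumptions, not the code used for the results: the function names are invented, exponents are obtained by an ordinary least-squares fit on log–log values, and R/S is averaged over non-overlapping blocks of each size.

```python
import math
import random
from statistics import mean, pstdev, pvariance
from typing import List, Sequence


def aggregate(x: Sequence[float], m: int) -> List[float]:
    """m-aggregate of (2): means of consecutive, non-overlapping blocks of m terms."""
    return [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]


def slope(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Least-squares slope of ys against xs."""
    mx, my = mean(xs), mean(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    return num / sum((a - mx) ** 2 for a in xs)


def beta_variance_time(x: Sequence[float], ms: Sequence[int]) -> float:
    """beta of (3): minus the slope of log Var[X^(m)] against log m."""
    log_m = [math.log(m) for m in ms]
    log_var = [math.log(pvariance(aggregate(x, m))) for m in ms]
    return -slope(log_m, log_var)


def rescaled_range(x: Sequence[float]) -> float:
    """R/S: range of the cumulative deviations L_j of (4) over the standard deviation."""
    mu = mean(x)
    level, walk = 0.0, []
    for value in x:
        level += value - mu
        walk.append(level)
    return (max(walk) - min(walk)) / pstdev(x)


def hurst_rs(x: Sequence[float], sizes: Sequence[int]) -> float:
    """H of (5): slope of log(R/S) against log N, R/S averaged over blocks of size N."""
    log_n, log_rs = [], []
    for n in sizes:
        blocks = [x[i:i + n] for i in range(0, len(x) - n + 1, n)]
        log_n.append(math.log(n))
        log_rs.append(math.log(mean(rescaled_range(b) for b in blocks)))
    return slope(log_n, log_rs)


# Illustration on synthetic Gaussian noise (not traffic data): beta should be near 1.
random.seed(1)
noise = [random.gauss(0.0, 1.0) for _ in range(2 ** 13)]
beta = beta_variance_time(noise, [2 ** j for j in range(0, 7)])
print("beta =", round(beta, 2), "H from (6) =", round(1 - beta / 2, 2))
print("H from R/S =", round(hurst_rs(noise, [2 ** j for j in range(4, 11)]), 2))
```

For a multifractal series the fit would be repeated separately over each range of m in which the log–log plot is straight.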

Results and Implications

—Data Collection Environment—

In addition to data collected from different sources [4, 6, 14, 15], we also made arrangements to test the self-similar nature of Internet traffic in our communications laboratory, Shiratori Laboratory (SL) of RIEC, Tohoku University. Figure 1 shows the data collection point of the SL LAN. The SL LAN is a 100 Mbps Ethernet link connected to the Tohoku University ATM campus network (622 Mbps). It serves 10 servers and 50 users. The traffic traversing the network originates mainly from Web, ftp and mail services, as well as from some experiments carried out in the laboratory. Two types of flow were recorded, all inbound octets from the rest of the world to SL and all outbound octets from SL to the Internet, in time intervals of 1 minute between 1 February 2000 and 29 February 2000, a busy period in the University.

Table 1 lists, along with our data, the various Internet traffic data sets that we studied, with collection periods ranging from 1 hour to about 1 month, time intervals Δt from 2^-7 second to 1 minute, and average flow densities varying over a wide range, from 1 byte to about 500 Kbytes per second. The time series {X} of (1), i.e. the transmitted bytes per interval of time Δt, was obtained from the transmitted packet length vs. time stamp records by a program. The BC-p data correspond to LAN traffic during busy hours and the rest belong to WAN traffic. Descriptions of these data can be found in the cited references. The SSIO and SSOO data correspond, respectively, to all in-octets to SL and all out-octets from SL, as mentioned above.

Figure 1. Data collection point at Shiratori laboratory LAN
[Figure labels: Shiratori Lab LAN, 130.34.202.0/26, 100 Mbps; Tohoku University Campus Network [622 Mbps]; Internet; data collector; In, Out]

Table 1. Traffic data profile

Data            Ref.  Resolution  Duration  Data type  Average flow
BC-Oct89Ext     4     2^-7 s      34 h      Ext        1.146 Bytes/s
BC-Oct89Ext4    4     2^-7 s      21 h      Ext        3.416 Bytes/s
BC-p Oct89      4     2^-7 s      29 min    LAN        358.23 KBytes/s
BC-p Aug89      4     2^-7 s      52 min    LAN        167.24 KBytes/s
DEC pkt-1       6     2^-7 s      1 h       tcp        122.95 KBytes/s
DEC pkt-2       6     2^-7 s      1 h       tcp        232.11 KBytes/s
DEC pkt-3       6     2^-7 s      1 h       tcp        176.41 KBytes/s
DEC pkt-4       6     2^-7 s      1 h       tcp        256.77 KBytes/s
LBL pkt-3       6     2^-7 s      2 h       tcp        32.81 KBytes/s
LBL pkt-4       6     2^-7 s      2 h       tcp        17.77 KBytes/s
LBL pkt-5       6     2^-7 s      2 h       tcp        12.76 KBytes/s
MAWI            14    10 ms       120 h     All        437.10 KBytes/s
SSOO            15    1 min       29 days   OutOcts    2.56 KBytes/s
SSIO            15    1 min       29 days   InOcts     502.78 Bytes/s

Abbreviations: Ext, external arrivals; All, all packets; pkt, packets; OutOcts, all out-octets; InOcts, all in-octets; Kbytes, 1024 bytes.

—Observed Characteristics—

Self-similarity from change of variance: temporal multifractals

All the data sets show high self-similarity (Table 2), as indicated by the values of H, except the last one (SSIO), which does not (β ≈ 1). Most of the cases have temporal multifractals (as in Figure 2), described by the βs in the respective scales given by the ranges of m (Table 2), and, as reported earlier [6–11], these are found in the data sets with small Δt. The low value of β in certain ranges of time scales shows that at these resolutions the processes look most self-similar, which may be of help in a prediction process. There is a general trend towards higher values of β at higher resolutions (low m).

Figure 2. Temporal multifractals of data BC-Oct89Ext (Δt = 2^-7 s, [4]) with (a) β = 0.19 (m = 2^6 ~ 2^14) and (b) β = 0.61 (m = 2^0 ~ 2^6). Axes: log Var[m] vs. log m.

Table 2. Multifractal βs and Ds of Internet traffic data in different ranges of time scales (m)

Columns: Data; H; β1, β2, β3; D1, D2, D3. Each row lists its values in column order; the scales (m) of the βs and Ds follow in the same order on the line below.

BC-Oct89Ext     0.90  0.61  0.19  0.82  0.56
                (2^1–2^5) (2^5–2^14) (2^0–2^12) (2^8–2^12)
BC-Oct89Ext4    1.00  0.08  0.24  0.12  0.95  0.6  0.4
                (2^1–2^4) (2^4–2^11) (2^11–2^14) (2^0–2^8) (2^4–2^12) (2^8–2^12)
BC-p Oct89      0.79  0.6  0.18  0.98  0.88  0.62
                (2^1–2^4) (2^4–2^10) (2^10–2^14) (2^0) (2^8) (2^12)
BC-p Aug89      0.84  0.49  0.34  0.98  0.83  0.69
                (2^1–2^4) (2^4–2^14) (2^0) (2^8) (2^12)
DEC pkt-1       0.83  0.58  0.33  0.95  0.83  0.76
                (2^1–2^4) (2^4–2^14) (2^0) (2^8) (2^12)
DEC pkt-2       0.98  0.54  0.3  0.1  0.975  0.88  0.79, 0.37
                (2^1–2^4) (2^4–2^11) (2^11–2^14) (2^0) (2^8) (2^12)
DEC pkt-3       0.79  0.66  0.4  0.95  0.84  0.70
                (2^1–2^4) (2^4–2^14) (2^0) (2^8) (2^12)
DEC pkt-4       0.79  0.59  0.39  0.97  0.85  0.68
                (2^1–2^4) (2^4–2^14) (2^0) (2^8) (2^12)
LBL pkt-3       0.79  0.46  0.23  0.53  0.88  0.74
                (2^1–2^4) (2^4–2^12) (2^12–2^15) (2^0–2^8) (2^4–2^12)
LBL pkt-4       0.75  0.56  0.27  0.48  0.84  0.68  0.21
                (2^1–2^4) (2^4–2^11) (2^11–2^15) (2^0–2^12) (2^4–2^8) (2^12)
LBL pkt-5       0.88  0.68  0.32  0.88  0.67  0.23
                (2^1–2^4) (2^4–2^12) (2^0–2^12) (2^4–2^8) (2^12)
MAWI            0.95  0.12  0.012  0.84  0.90, 0.48
                (2^1–2^5) (2^5–2^14) (2^0–2^10) (2^12)
SSOO            0.80  0.47  0.5
                (2^1–2^7) (2^0–2^7)
SSIO            0.58  0.95  0.5
                (2^1–2^7) (2^0–2^5)

(The blank places in the table imply that multifractal characteristics were not found in these cases.)

We also found that these multifractals remained almost constant, or changed only very slowly, in time; one example is shown in Figure 3. Here β was calculated over a sliding window of 256K data (1K = 1024 bytes), corresponding to a flow of approximately 32 min. Each time the window slides, after approximately 8 min, it discards the earliest 64K data and adds a new set of the same length. In this way we obtain the variation of β with time, and the same procedure can be applied to on-line determination. Since our laboratory data were taken at intervals of 1 min, and the volume of data was therefore not large, the time dependence of the parameters could not be determined for small time windows (as in Figure 3). But averaged over a sliding window of about 15 days (a set of 16K data), moved every 2 days (an interval of 2K data), β for both SSIO and SSOO was found to be almost constant in time.

Figure 3. β and fractal dimension D (f.d) for data BC-Oct89Ext of LBL [4], calculated over a sliding time window of 2048 s, after every 512 s, showing that these parameters remain almost constant in time. The βs correspond to low-time resolutions (m = 2^0 ~ 2^6). Axes: beta and f.d vs. time (1 unit = 512 s).
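The sliding-window calculation of β described above can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure actually coded for this study: the function names are ours, β in each window is taken from a least-squares fit of log Var[X^(m)] against log m, and the demonstration uses synthetic noise with a much smaller window than the 256K points (step 64K) quoted in the text.

```python
import math
import random
from statistics import mean, pvariance
from typing import List, Sequence


def beta_of_window(x: Sequence[float], ms: Sequence[int]) -> float:
    """beta from the slope of log Var[X^(m)] vs. log m within one window."""
    log_m, log_var = [], []
    for m in ms:
        agg = [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]
        log_m.append(math.log(m))
        log_var.append(math.log(pvariance(agg)))
    mx, my = mean(log_m), mean(log_var)
    num = sum((a - mx) * (b - my) for a, b in zip(log_m, log_var))
    return -num / sum((a - mx) ** 2 for a in log_m)


def sliding_beta(x: Sequence[float], window: int, step: int,
                 ms: Sequence[int]) -> List[float]:
    """beta for each window of `window` points, advanced by `step` points."""
    return [beta_of_window(x[start:start + window], ms)
            for start in range(0, len(x) - window + 1, step)]


# Synthetic stand-in for a flow density series (assumption: Gaussian noise).
random.seed(3)
series = [random.gauss(0.0, 1.0) for _ in range(8192)]
betas = sliding_beta(series, window=4096, step=1024, ms=[1, 2, 4, 8, 16, 32])
print([round(b, 2) for b in betas])
```

Because each step reuses three quarters of the previous window, consecutive estimates are strongly correlated; an on-line version could update block sums incrementally instead of recomputing them.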

Nature of autocorrelations of m-aggregates

The autocorrelation R(k), the average of the product X_i X_{i+k}, initially shows a hyperbolic fall-off as in (8), but at larger k it is full of structures, like the data. We find a notable self-similarity when we compare the plots for different m-aggregates (Figure 4). As a test of self-similarity, Leland et al. [4] observed the m-independence of (7), obtained from different m-aggregated series all of the same length. This requires a data set of long duration. We observe that, given an unaggregated series of length N (say), from which different m-aggregated series each of length N_m = N/m are obtained, the autocorrelations R^(m)(k) for large m and k follow:

$$R^{(m_1)}(k_1) = R^{(m_2)}(k_2), \quad m_1 k_1 = m_2 k_2. \tag{14}$$

The higher, and in most cases all, m-aggregated series give the same autocorrelations in terms of actual time (mk), which is easier to verify and more practical (we tested as far as mk = N/2 for all the cases studied). This does not occur for ordinary data, for noise or for a signal from a dynamic system that is not self-similar. Thus we have a different indication of self-similarity, one that is also useful for verification in a small data set: the autocorrelation plots at higher, and in most cases all, levels of aggregation look similar, as in (14). As a matter of fact, when we tested with artificial data (obtained by inverse Fourier transformation) from a power spectrum around 1/|f|^γ (a signature of critical behaviour), with random deviations and random phases, we did find such self-similarity of the autocorrelations for all m-aggregates. This fact, coupled with (7), makes the autocorrelation function itself self-similar and fractal-like, a feature leading to the data (flow densities) forming a fractal set, which we discuss below. With increasing time resolution the details of the autocorrelation emerge while the previous values are preserved. On closer inspection, we do find them to be self-similar and fractal-like (which cannot be described by H or β alone).
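The comparison of autocorrelations at equal actual time mk can be set up as in the sketch below. It is only an illustration of the bookkeeping: the function names are ours, R(k) is taken as the plain average of products as defined in the text, and the toy series is not traffic data, so no agreement between the two columns is implied for it.

```python
from typing import Dict, List, Sequence


def aggregate(x: Sequence[float], m: int) -> List[float]:
    """m-aggregate: means of consecutive, non-overlapping blocks of m terms."""
    return [sum(x[i:i + m]) / m for i in range(0, len(x) - m + 1, m)]


def autocorr(x: Sequence[float], k: int) -> float:
    """R(k): average of the products X_i * X_{i+k} over all available i."""
    n = len(x) - k
    return sum(x[i] * x[i + k] for i in range(n)) / n


def autocorr_actual_time(x: Sequence[float], m: int,
                         lags_mk: Sequence[int]) -> Dict[int, float]:
    """R^(m)(k) tabulated against actual time mk (each lag must be a multiple of m)."""
    xm = aggregate(x, m)
    return {mk: autocorr(xm, mk // m) for mk in lags_mk}


# Toy series, for illustration only.
series = [float((7 * i) % 13) for i in range(512)]
lags = [0, 16, 32, 64, 128]
r4 = autocorr_actual_time(series, 4, lags)
r16 = autocorr_actual_time(series, 16, lags)
for mk in lags:
    print(mk, round(r4[mk], 3), round(r16[mk], 3))
```

With real data, the two columns would be plotted against mk on the same axes, as in Figure 4, and inspected up to mk = N/2.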

Fractal dimension of flow density data: a new parameter of Internet traffic

In order to determine the fractal dimension D of (13), we normalize a data set to the range [0, 1], which is divided into 2^k segments (or boxes), each of size ε = 2^-k. A datum of value X is placed in box number Integer(X/ε). The box-counting method counts the number of boxes N(ε) that contain data (some of these may contain more than one datum) and determines D from the log–log plot of N(ε) vs. ε.
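The box-counting steps just described can be sketched as follows. This is a minimal version under our own assumptions: the function names are invented, a datum equal to the maximum is placed in the last box, and D is taken from a single least-squares fit over all supplied k, whereas for multifractals the fit would be made separately over each straight segment of the plot.

```python
import math
from statistics import mean
from typing import Sequence


def box_count(x: Sequence[float], k: int) -> int:
    """Number of occupied boxes of size eps = 2**-k after normalising x to [0, 1]."""
    lo, hi = min(x), max(x)
    if hi == lo:
        return 1                              # all data fall in a single box
    n_boxes = 2 ** k
    occupied = set()
    for value in x:
        index = int((value - lo) / (hi - lo) * n_boxes)   # Integer(X / eps)
        occupied.add(min(index, n_boxes - 1))             # X = 1 goes in the last box
    return len(occupied)


def box_dimension(x: Sequence[float], ks: Sequence[int]) -> float:
    """D of (13): slope of log N(eps) against log(1/eps)."""
    log_inv_eps = [k * math.log(2.0) for k in ks]
    log_n = [math.log(box_count(x, k)) for k in ks]
    mx, my = mean(log_inv_eps), mean(log_n)
    num = sum((a - mx) * (b - my) for a, b in zip(log_inv_eps, log_n))
    return num / sum((a - mx) ** 2 for a in log_inv_eps)


# Evenly spread values fill every box, so the estimate is D = 1.
ramp = [i / 1023 for i in range(1024)]
print(round(box_dimension(ramp, range(1, 9)), 3))  # 1.0
```

The range of k should stop before boxes become so small that each datum sits alone in its box, since N(ε) then saturates at the number of data points.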

All the data sets show fractal behaviour of the flow densities. The scaling described by (13) holds over an appreciable range of precisions (more than two orders of magnitude in all the cases studied), but with D depending on the time resolution, like the βs (as shown in Table 2 by D1, D2, etc., in the time scales given by the ranges of m). In general, D decreases with m, which implies clustering of the data when viewed on larger time scales. In some cases, for example the LBL-BC (Figure 5) and LBL-tcp packets, multifractals appear as the time resolution is decreased, with a gradual switch-over to another dimension at higher m. The 1-hour DEC-tcp packets appear to behave randomly (with D ≈ 1 at low m), the difference being noticeable only at higher m. An important point to note is that although a finite set of random data tends to form clusters under m-aggregation, thereby apparently forming fractals, it does not give rise to the sharp multifractals that we observe here (as in Figure 5). The fractal dimensions of the aggregates of a set of random numbers (of the same volume as the data) change (from D = 1 at m = 1) at a slower rate with m, probably because the variance drops faster. In the stochastic processes studied here, as m varies either D changes at a faster rate or multifractals appear with the previous dimensions (almost) unchanged.

Figure 4. A typical R^(m)(k) vs. mk (autocorrelations for BC-Oct89Ext), m = 16 and 256, plotted at mk = 0(256)2048·256 for comparison

We verified that the appearance of these multifractals on aggregation is not an artifact of the reduced data size. For the m = 256 case of the BC-Oct89Ext data (of Figure 5), with data size 64K (corresponding to approximately 32 hours of data), we find similar multifractals, with D1 ~ 0.75 and D2 ~ 0.5, for data sizes of 32K and 16K (approximately 16 hours and 8 hours of data), slightly different from those obtained from the whole data set, D1 = 0.81 and D2 = 0.57. For m = 4096 (data size 4K) we find a greater range of scales for D2 ~ 0.57. Thus there is a tendency to switch over to a fractal with lower D at larger time scales. While this data size may seem insufficient to show multifractals, we note that the same feature at such low resolutions is obtained for other data sets as well (Table 2), such as MAWI (resolution 40 s), SSIO and SSOO (60 s), all with data size >10K and D ~ 0.5.

We have tested that although the 1/|f|^γ law of (10) gives rise to fractals described by β and D, multifractals in both time and flow density resolutions cannot be obtained from such a power spectrum (even with large fluctuations of up to 50%). To reproduce them it is necessary to identify the corresponding features in simulations of Internet traffic (such as those in [1–3]).
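Artificial data with a 1/|f|^γ power spectrum and random phases, of the kind used in these tests, can be generated along the following lines. This is a sketch under our own assumptions, not the generator used for the paper: the inverse Fourier transform is written as an explicit sum of cosines, the `jitter` argument (a fractional random deviation applied to the power of each component, which must stay below 1) is our reading of "fluctuations", and all names are invented.

```python
import math
import random
from typing import List, Sequence


def synth_power_law(n: int, gamma: float, jitter: float = 0.0,
                    seed: int = 0) -> List[float]:
    """Real series of length n with power ~ f**-gamma and uniformly random phases."""
    rng = random.Random(seed)
    components = []
    for f in range(1, n // 2):
        # Power of component f, optionally perturbed by up to +/- jitter (fraction).
        power = f ** (-gamma) * (1.0 + jitter * rng.uniform(-1.0, 1.0))
        phase = rng.uniform(0.0, 2.0 * math.pi)
        components.append((f, math.sqrt(power), phase))
    return [sum(a * math.cos(2.0 * math.pi * f * t / n + p)
                for f, a, p in components)
            for t in range(n)]


def spectral_power(x: Sequence[float], f: int) -> float:
    """Squared magnitude of the discrete Fourier component of x at integer frequency f."""
    n = len(x)
    re = sum(v * math.cos(2.0 * math.pi * f * t / n) for t, v in enumerate(x))
    im = sum(v * math.sin(2.0 * math.pi * f * t / n) for t, v in enumerate(x))
    return re * re + im * im


series = synth_power_law(256, gamma=1.0, jitter=0.0, seed=42)
# With no jitter the power at f = 2 is four times that at f = 8 for gamma = 1.
print(round(spectral_power(series, 2) / spectral_power(series, 8), 3))
```

The direct sum costs on the order of n^2 operations, which is adequate for short test series; a fast Fourier transform would be the natural choice for series of the lengths analysed here.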

Figure 5. Multifractals of flow density (data BC-Oct89Ext, Δt = 2^-7 s, [4]) at time resolutions m = 1 (D = 0.85), 16 (D = 0.84), 256 (D1 = 0.81, D2 = 0.57) and 4096 (D1 = 0.81, D2 = 0.55). Data are normalized to [0, 1] and the range of scales in the figure corresponds to 2^-20 to 1. Axes: log N(eps) vs. log (eps), both in arbitrary units.

We also studied the variation of these features with time, and a typical result is shown in Figure 3. In most cases, the fractal dimension was constant or very slowly changing in time, depending only on the time resolution of observation. This indicates that these stochastic processes follow fixed dynamics. In some cases, for example the BC-Oct89Ext and BC-Oct89Ext4 data of LBL, we find that if we treat the byte sizes of the packets as a series, irrespective of the time of flow, the data have a fractal structure with D ≈ 0.75, which also remains constant in time. The same holds for the LBL-tcp packets (3, 4 and 5), with D ≈ 0.8. We find similar descriptions of some stochastic processes in the excellent collection of Briggs [20]. The other cases, which show D ≈ 1.0, indicate that it is the pattern of inter-arrival times that makes the equal-interval time series fractal. The statistics of the inter-arrival times need to be worked out to get a clue to this feature.

Internet traffic therefore appears to have another parameter of description, its fractal dimension D, which depends on the time resolution (for which a convention may be set when specifying D). As seen in Table 2, two sets of data (with the same Δt) having more or less the same β (or D) are differentiated by the other parameter D (or β). The case of SSIO is interesting: it has a convincing fractal characteristic (Figure 6) but much less temporal self-similarity (β ≈ 0.95). Usually β ≈ 1 represents noisy data with no fractal-like behaviour, but in this case we find a consistent fractal characteristic (over an appreciable range of precisions) given by D ≈ 0.50. Although the autocorrelations of its different aggregates (Figure 7) have features similar to those of random data, and its self-similarity is not as pronounced as in the other cases, the data are not noisy.

Conclusion and Future ResearchWe have shown that given Internet traffic time

series data of length N (say) from which are

obtained its m-aggregates each of length N/m,

there is self-similarity in autocorrelations from

such aggregates as R(m)(k) = R(mk) (m and k large),

i.e. the autocorrelations from different m-

aggregates are the same in terms of actual time

(mk). This does not occur for an ordinary data,

SELF-SIMILAR AND FRACTAL NATURE OF INTERNET TRAFFIC 9

Copyright © 2004 John Wiley & Sons, Ltd. Int. J. Network Mgmt 2004; 14: 000–000

5

10

15

20

25

30

35

40

3

5

7

9

–18 –16 –14 –12 –10 –8 –6 –4

log

N(e

ps)

[arb

. uni

ts]

log (eps) [arb. units]

Data SSIO, D=0.5

Figure 6. Fractals in flow density of data SSIO, Dt = 60s

NEM512 12/5/03 05:39 PM Page 9

Page 10: Self-similar and fractal nature of Internet trafï¬c

noise or a signal from a dynamic system which is

not self-similar. Such an indication is also useful

for verification of self-similarity in a small data set.

Just as b gives the rate of decrease of the variance of m-aggregated data, the flow density is also seen to be fractal, with a dimension D that depends on the time resolution of observation. This provides Internet traffic with another descriptive parameter, the fractal dimension D of the dataset, besides the commonly used self-similarity parameter b. Multifractals are also possible in both cases, especially for data of high time resolution (of the order of 1 ms). These parameters remain almost constant, or change very slowly, over time, which hints that the dynamics of the process is fixed. It is therefore also required that the simulations or models worked out to understand Internet traffic reproduce the fractal descriptions found in the present work. Another important point is that nature offers other systems similar to Internet traffic, such as other traffic systems (for example vehicular traffic), stochastic flows (like water drainage in a river) and growths, and such systems are expected to follow behaviour similar to that observed here.
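The role of b as the rate of decrease of the variance of m-aggregated data can be illustrated with a variance-time fit. The sketch below assumes the power-law form Var(X^(m)) ∝ m^(-b) and mean-based aggregation; the function name and the choice of block sizes are hypothetical, and this is not necessarily the procedure used to obtain the values reported in this paper.

```python
import numpy as np

def variance_time_slope(x, block_sizes):
    """Estimate b from Var(X^(m)) ~ m^(-b) by a least-squares line
    fitted to log(variance) against log(m). Assumes non-constant data."""
    x = np.asarray(x, dtype=float)
    log_m, log_var = [], []
    for m in block_sizes:
        n = (len(x) // m) * m                     # drop incomplete block
        agg = x[:n].reshape(-1, m).mean(axis=1)   # m-aggregate (block means)
        log_m.append(np.log(m))
        log_var.append(np.log(agg.var()))
    slope, _intercept = np.polyfit(log_m, log_var, 1)
    return -slope
```

For uncorrelated noise the estimate is close to 1, while long-range dependent traffic is expected to give a smaller value. Under the usual convention of Leland et al., the Hurst parameter would follow as H = 1 - b/2.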


Applications of these self-similarity features to Internet traffic, and further extensions of our work, are possible by working out the following problems:

1. Understanding the characteristics of these self-similar stochastic processes may help in statistical prediction of the flow, which is useful to the network manager in rendering a satisfactory quality of service. Although the ARIMA method [21] is used in practice for statistical prediction, further accuracy could be achieved if self-similarity features were taken into account. This could be done by simultaneously fitting the original data and its aggregates with the same set of parameters.

2. The characteristics may be of use in identifying traffic that is ‘not natural’, which may be due to intruders [22] in networks, or an ‘abnormal’ traffic situation in a busy city. A change in the nature of the traffic will probably result in a change of its characteristics. Since the calculations to determine the characteristics (b and D) require little CPU time, an in-situ determination of abnormal traffic may be possible by noting the change of parameters.

3. The cause of flow density data having such fractal structures, described by a dimension D that depends on the time resolution, is not yet understood. Another interesting piece of work will therefore be to look for a description of the system that reproduces these parameters.

4. In simulations of Internet traffic, such as the phase-transition model in networks worked out by Takayasu et al. [1–3], it will also be interesting to see what happens to the self-similarity and fractal structure of the flow densities in situations away from the critical point of such a model. This is essential since network performance is expected to be best near the critical condition [1–3], which may be identified by the variation of these parameters.
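The in-situ monitoring idea in item 2 can be sketched as a box-counting estimate of D recomputed over successive windows of flow density data. The Python code below is a minimal illustration, assuming that N(eps), the number of boxes of size eps needed to cover the observed flow density values, follows N(eps) ∝ eps^(-D). The function names, window length and step are hypothetical choices, not taken from this paper.

```python
import numpy as np

def box_count_dimension(values, epsilons):
    """Estimate D from N(eps) ~ eps^(-D), where N(eps) is the number of
    occupied boxes of width eps covering the range of the data values."""
    values = np.asarray(values, dtype=float)
    lo = values.min()
    log_eps, log_n = [], []
    for eps in epsilons:
        # index of the box each value falls into; count distinct boxes
        boxes = np.unique(np.floor((values - lo) / eps).astype(np.int64))
        log_eps.append(np.log(eps))
        log_n.append(np.log(len(boxes)))
    slope, _intercept = np.polyfit(log_eps, log_n, 1)
    return -slope

def windowed_estimates(values, window, step, estimator):
    """Apply an estimator to successive windows of the series so that
    a drift in the estimated parameter can be noted."""
    values = np.asarray(values, dtype=float)
    return [estimator(values[i:i + window])
            for i in range(0, len(values) - window + 1, step)]
```

A monitor could, for example, call `windowed_estimates(flow, 3600, 600, lambda w: box_count_dimension(w, eps_list))`, where `flow` and `eps_list` are placeholders for the measured series and a chosen set of resolutions, and flag windows whose estimate departs markedly from the long-term value. The range of eps over which the power law holds has to be chosen from the data.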

[Figure 7. $R^{(m)}(k)$ vs. mk, m = 1, 8 and 32 for data SSIO, Δt = 60 s. Vertical axis in arbitrary units; mk from 0 to 5000.]

References

1. Takayasu M, Takayasu H, Sato T. Critical behaviors and 1/f noise in information traffic. Physica A 1996; 233: 824–834.

2. Takayasu M, Takayasu H, Fukuda K. Dynamic phase transition observed in the Internet traffic flow. Physica A 2000; 277: 248–255.

3. Fukuda K, Takayasu H, Takayasu M. Origin of critical behavior in Ethernet traffic. Physica A 2000; 287: 289–301.

4. Leland WE, Taqqu MS, Willinger W, Wilson DV. On the self-similar nature of Ethernet traffic. IEEE/ACM Transactions on Networking Feb. 1994; 2(1): 1–15. (Data collected from: http://ita.ee.lbl.gov/html/contrib/BC.html.)

5. Csabai I. 1/f noise in computer network traffic. J. Phys. A: Math. Gen. 1994; 27: L417–L421.

6. Paxson V, Floyd S. Wide area traffic: the failure of Poisson modeling. IEEE/ACM Transactions on Networking June 1995; 3(3): 226. (Data collected from: http://ita.ee.lbl.gov/html/contrib/{DEC-PKT, LBL-TCP-3, LBL-PKT}.html.)

7. Crovella ME, Bestavros A. Self-similarity in World Wide Web traffic: evidence and possible causes. In Proceedings of Sigmetrics’96, 1996; 160–169.

8. Willinger W, Taqqu MS, Sherman R, Wilson DV. Self-similarity through high variability: statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Transactions on Networking 1997; 5: 71–86.

9. Adler R, Feldman R, Taqqu MS. A practical guide to heavy tails (Self-similarity and heavy tails: Structural modeling of network traffic). Birkhauser, Boston, February 1998.

10. Taqqu MS, Teverovsky V, Willinger W. Is network traffic self-similar or multifractal? Fractals 1997; 5: 63.

11. Feldmann A, Gilbert AC, Willinger W, Kurtz TG. The changing nature of network traffic: scaling phenomena. In ACM SIGCOMM98 1998; Comp. Comm. Rev.

12. Caldarelli G, Marchetti R, Pietronero L. The fractal properties of Internet. Europhysics Letters 2000; 52: 386–992.

13. Mandelbrot BB, Van Ness JW. Fractional Brownian motions, fractional noises and applications. SIAM Rev. 1968; 10: 422.

14. WIDE group. http://tracer.csl.sony.co.jp/mawi/ (mawi of wide).

15. Data from Shiratori Laboratory RIEC, Tohoku University (available on request, http://www.shiratori.riec.tohoku.ac.jp/).

16. Hurst HE, Black R, Simaika Y. Long-term storage: An Experimental Study. London: Constable, 1965.

17. Willinger W, Paxson V. Where mathematics meets the Internet. Notices of the Am. Math. Soc. Sept. 1998; 45(8): 961.

18. Schuster HG. Deterministic chaos: An introduction, 3rd edn. John Wiley & Sons, March 1995.

19. Addison PS. Fractals and chaos: An illustrated course. Institute of Physics Publishing, Sept. 1997.

20. Briggs J. Fractals: the patterns of chaos. Simon and Schuster, New York, Oct. 1992.

21. Box GEP, Jenkins GM, Reinsel GC. Time series analysis: forecasting and control, 3rd edn. Prentice Hall, New Jersey, February 1994.

22. Mansfield G, Ohta K, Takei Y, Kato N, Nemoto Y. Towards trapping wily intruders at large. Computer Networks 2000; 34: 659.

23. Ashir A, Suganuma T, Kinoshita T, Roy TK, Mansfield G, Shiratori N. Network traffic characterization and network information services: R&D on JGN. Computer Communication (in press).

24. Mansfield G, Roy TK, Shiratori N. Self-similar and fractal nature of Internet traffic data. In Proceedings of the 15th International Conference on Information Networking, Beppu City, Kyushu, Japan, Jan.–Feb. 2001; 227–231.

25. Ashir A, Chakraborty D, Roy TK, Mansfield G, Shiratori N. Some characteristics of traffic data. In Proceedings of the International Conference on Electrical and Engineering, ICECE 2001, Dhaka, Bangladesh, January 2001; 226–229.

26. Roy TK, Chakraborty D, Ashir A, Mansfield G, Shiratori N. Characteristics of Internet traffic data. Workshop on Multimedia and Distributed Processing, Tokyo, Japan, 2000; 12: 75–80.
