Copyright © 2004 John Wiley & Sons, Ltd.
Self-similar and fractal nature of Internet traffic
By D. Chakraborty*†§, A. Ashir, T. Suganuma, G. Mansfield Keeni, T. K. Roy and N. Shiratori
Self-similar, bursty Internet traffic is usually characterized by the Hurst parameter (H). Such a process is also seen to possess fractal characteristics in time, described by a parameter (β), with multifractals in most cases. We observe that this highly stochastic traffic is also fractal in flow density, described by a fractal dimension (D), again with the possibility of multifractals. Describing Internet traffic therefore requires another parameter besides the usual self-similarity parameter β or H, and the simulations or models worked out to understand Internet traffic should reproduce the characteristics found in the present work. We also find a notable self-similarity in the autocorrelations of the data and its aggregates in all the cases studied.
*Correspondence to: D. Chakraborty, TAO Tohoku University Office, Sendai, Japan.
†E-mail: [email protected]
§Tel: +81-22-726-1667; Fax: +81-22-726-1667
Introduction
The self-similar burstiness of Internet traffic at almost all time scales, its slowly decaying autocorrelations (the so-called long-range effect) and its 1/f^γ power spectrum at low frequencies are all manifestations of a critical condition of the system, arising from jamming at the gateways and their neighbours through which packets of information are processed [1–3]. After the pioneering works of Leland et al. [4] and Csabai [5], numerous works have appeared in the literature confirming the self-similar and fractal nature [6–11] of Internet traffic data, including the possibility of multifractals in time resolution. The fractal properties are a consequence of jamming conditions in the Internet [1–3] and, possibly, of the fractal structure of its networks [12].
Understanding Internet traffic is an open and challenging problem, all the more so since the familiar Poisson model, long applied to problems in telecommunications, showed cracks in characterizing traffic with the start of facsimile services [6–11]. Without an appropriate model for Internet traffic it is impossible to obtain the insight required to plan, manage or operate a network efficiently and to render a satisfactory quality of service to users. Although parsimony in models is desirable, Internet traffic, owing to its complicated structure and stochastic behaviour, may require a number of parameters to specify its behaviour.
In this paper we report that Internet traffic is also fractal in flow density (bytes transmitted per unit time), with the number of boxes needed at a given resolution to cover the range of the data following a power law. We also show that the autocorrelations of the characteristic time-series data follow a notable self-similar behaviour, a feature which has not been reported so far. Since Internet traffic is an example of a system composed of many stochastic components, it will also be interesting to see whether other similar natural systems [13] follow characteristics similar to those that we present here.

International Journal of Network Management
Int. J. Network Mgmt 2004; 14: 000–000 (DOI: 10.1002/nem.512)
In the rest of this paper, after a brief account of related work and of the statistics of self-similarity in a time series, we describe how the fractal characteristics of the flow density data can be determined. We then describe the arrangement for data collection in our laboratory and give the results of our studies on our own data and on other data sets [4–6, 14–15], which come from either LAN or wide-area network traffic, all showing self-similarity. Finally, we conclude with a summary and a note on possible areas of further research.
Related Work
Leland et al. [4] and Csabai [5] were the first to show that Internet traffic has self-similar features. The former group measured Ethernet traffic flows with errors ≤100 μs, and a survey of their data, consisting of several hundred million packets, confirmed the self-similar behaviour of the traffic over a wide range of time scales. This was contrary to the traditional Poisson model, in which the data smooth out and become featureless over large time scales. The distribution of the traffic was similar irrespective of the time scale of observation, and at all scales the autocorrelations had similar long-range behaviour, decaying slowly according to a power law. This was confirmed by Paxson and Floyd [6] and Crovella and Bestavros [7] for wide-area traffic as well, and they suggested that the cause was related to the heavy-tailed distribution (of Pareto type) of the bytes transmitted. Willinger et al. [8] made similar observations related to the high variability of Internet traffic data. Taqqu et al. [10] examined the traffic in detail at higher resolutions and observed that Internet traffic is multifractal, that is, at these resolutions the data remain self-similar but with different parameters.
Contemporaneously with Leland et al. [4], Csabai [5] measured the round-trip time (RTT) of packets in the network and observed the 1/f law in the power spectrum of the RTT distribution. This was examined in detail by Takayasu et al. [1–3], who proposed a ‘contact process’ model with which they were able to show that the traffic follows 1/f behaviour, but only at the critical condition. Away from the critical condition the power spectrum at low frequencies has a 1/f^2 distribution instead. Their simulations of the traffic based on a tree model show that, owing to jamming when the demand for transmission exceeds the capacity of the links, the Internet can exist in a critical condition in which the network is most efficient. At the critical condition the traffic shows self-similar and fractal features.
An additional fractal feature we present in this paper relates to the nature of the flow densities in Internet traffic, which has not been reported earlier. We describe this, along with other interesting features, in a later section. Preliminary reports of our work can be found in references [23–26].
Characteristics of Internet Traffic Data
Modern Internet traffic measurement systems [4, 5] record the time, with errors ≤100 μs, at which a packet of information arrives, along with other details such as the source, the destination and the byte length of the data transferred. From this record, a time series of the data transmitted per unit time interval (flow density) is obtained. The self-similarity of the traffic relates to, and is defined in terms of, aggregates of this time series.
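As an illustration of how such a record becomes a time series, the following is a minimal sketch, not the authors' program. It assumes the record is available as two parallel arrays (arrival times in seconds, packet lengths in bytes); the function name `flow_density` and the four-packet sample are hypothetical.

```python
import numpy as np

def flow_density(timestamps, lengths, dt):
    """Bytes transmitted in each consecutive interval of width dt (seconds).

    timestamps: packet arrival times in seconds; lengths: packet sizes in bytes.
    Intervals that contain no packet get 0. Assumes at least one packet.
    """
    t = np.asarray(timestamps, dtype=float)
    b = np.asarray(lengths, dtype=float)
    # Index of the interval each packet falls into, counted from the first arrival.
    bins = np.floor((t - t.min()) / dt).astype(np.int64)
    # Sum the byte lengths of all packets that share an interval.
    return np.bincount(bins, weights=b)

# Hypothetical four-packet record, binned at dt = 2**-7 s.
series = flow_density([0.0, 0.001, 0.009, 0.02], [64, 1500, 40, 100], dt=2**-7)
```

The first two packets fall into the same interval, so their lengths are summed; the resulting array plays the role of the series whose aggregates are studied.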
—Statistics of Self-similarity—
For a self-similar time series:

$$\{X\} = \{X_1, X_2, \ldots, X_N\}, \tag{1}$$

the m-aggregate $\{X^{(m)}\}$ with its kth term:

$$X_k^{(m)} = \frac{1}{m} \sum_{i=(k-1)m+1}^{km} X_i, \tag{2}$$

has its variance $\mathrm{Var}[X^{(m)}]$ related to the original Var[X] as:

$$\mathrm{Var}[X^{(m)}] = \mathrm{Var}[X]\, m^{-\beta}, \quad 0 < \beta < 1, \tag{3}$$

β being the self-similarity parameter. At small m, β may have a different value from that to which it settles as m grows.

Another characterization of self-similarity is given by the rescaled range (R/S) of the data [16], where R(N) is the range of a series {L} defined as:

$$L_j = \sum_{i=1}^{j} \bigl(X_i - \mu(N)\bigr), \quad 1 \le j \le N, \tag{4}$$

with μ(N) the mean of the data of size N and S(N) the usual standard deviation. The R/S plot follows a power law at large N for a self-similar process:

$$R/S \sim (N/2)^{H}, \quad H > 0.5, \tag{5}$$

where H is the Hurst parameter. This is an extension of the Brownian-motion process, where the average distance traversed by a Brownian particle varies as $N^{1/2}$. Most Internet traffic data show a value around 0.9 for H [4]. It is to be noted that H gives a measure of self-similarity independent of the time scale of observation, since the series $L_j$ and $L_j^{(m)}$ from the original and m-aggregated series are expected to produce the same H. β (the asymptotic value) is related to H as [4]:

$$H = 1 - \frac{\beta}{2}, \tag{6}$$

so that for full similarity (β = 0), H = 1 and for noise, which has no self-similarity (β = 1), H = 0.5. The autocorrelation $R(k) = \langle X_i X_{i+k} \rangle$, the average of the product $X_i X_{i+k}$, is supposed to be independent of m [4] for such a process:

$$R^{(m)}(k) = R(k), \tag{7}$$

i.e. the original series and the m-aggregates give the same autocorrelation, which is difficult to observe if the data set is not large. However, at large k it is found to decay slowly (remaining locally constant) according to a power law, hence the term covariance stationary (due to β < 1):

$$R(k) \sim k^{-\beta}, \quad k \text{ large}. \tag{8}$$
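The two estimates of H described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure used for the results reported here: the function names are hypothetical, the fits are plain least-squares lines in log–log coordinates, and the input is synthetic Gaussian noise rather than traffic data.

```python
import numpy as np

def aggregate(x, m):
    """m-aggregate of eq. (2): means of consecutive non-overlapping blocks of m values."""
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // m
    return x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)

def beta_variance_time(x, ms):
    """beta from eq. (3): minus the slope of log Var[X^(m)] against log m."""
    log_var = [np.log2(aggregate(x, m).var()) for m in ms]
    slope, _ = np.polyfit(np.log2(ms), log_var, 1)
    return -slope

def rescaled_range(x):
    """R/S of one block: range of the cumulative deviations L_j of eq. (4) over the std."""
    x = np.asarray(x, dtype=float)
    L = np.cumsum(x - x.mean())
    return (L.max() - L.min()) / x.std()

def hurst_rs(x, sizes):
    """H from eq. (5): slope of log(mean R/S) against log N over block sizes N."""
    x = np.asarray(x, dtype=float)
    mean_rs = []
    for n in sizes:
        blocks = x[:(len(x) // n) * n].reshape(-1, n)
        mean_rs.append(np.mean([rescaled_range(b) for b in blocks]))
    slope, _ = np.polyfit(np.log2(sizes), np.log2(mean_rs), 1)
    return slope

# Synthetic uncorrelated noise as a placeholder input (assumption, not traffic data).
rng = np.random.default_rng(0)
noise = rng.normal(size=2**16)
beta = beta_variance_time(noise, [2**j for j in range(1, 9)])
h_from_beta = 1.0 - beta / 2.0                      # eq. (6)
h_from_rs = hurst_rs(noise, [2**j for j in range(6, 13)])
```

For uncorrelated noise both routes should give H near 0.5 (β near 1). For a series with H around 0.9, as quoted above for most traffic data, eq. (6) corresponds to β around 0.2.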
Due to this long-range behaviour the power spectrum S(f) at frequency f, related to R(k) as:

$$S(f) = \sum_{k} R(k)\, e^{-ikf}, \tag{9}$$

also follows a power law at low frequencies:

$$S(f \to 0) \sim f^{-\gamma}, \quad \gamma = 1 - \beta. \tag{10}$$

—Fractal Characteristics of Internet Traffic Data—
We define [17] a process to possess fractal characteristics if there exists a relationship of the form:

$$Q(\tau) \propto \tau^{F(D)}, \tag{11}$$

where Q is a quantity depending on τ, a resolution in time or space of the observation variables, and F(D) is a simple function (most often linear) of the non-integer dimension D of the process, the fractal dimension.

Due to extreme variability, Internet traffic data exhibit such fractal-like structures over a long range of time scales. In fact one of the measures of self-similarity is based on such a definition. When Q is the variance of the data, F is −β, so that the fractal dimension D is identified with β:

$$\mathrm{Var}[X^{(m)}] \propto m^{-\beta}. \tag{12}$$

Thus (12) describes a fractal behaviour of the data in time. Equation (10) is another similar description. The case of multifractals refers to the situation in which the exponent (β) varies from one range of scales (m) to another.

We observe that a similar description of the data {X} of (1) or $\{X^{(m)}\}$ of (2) can be found with the resolution in the magnitude of the data. Imagine the range of the data {X} to be divided into equal segments of size ε, and count the number of segments that contain data. Let this be N(ε). Then if (see for example [18])

$$N(\varepsilon) \propto \varepsilon^{-D}, \tag{13}$$

the dimension D at resolution ε can be obtained from the slope of log N(ε) vs. log ε. Thus the set {X} or $\{X^{(m)}\}$ is fractal if (13) is valid over an appreciable range of scales. For random data, since all values are equally likely, D = 1. Thus (13) gives an indication that all values in the range of the data are not equally probable if D is a fraction. As in all natural cases [19], (13) is expected to give the dimension only for a range of scales that depends on the volume and accuracy of the data. Here too multifractals can exist, with the exponents D depending on the scales (ε) of observation.
Results and Implications
—Data Collection Environment—
In addition to data collected from different sources [4, 6, 14, 15], we also made arrangements to test the self-similar nature of Internet traffic in our communications laboratory, Shiratori Laboratory (SL) of RIEC, Tohoku University. Figure 1 shows the data collection point of the SL LAN. The SL LAN is a 100 Mbps Ethernet link connected to the Tohoku University ATM campus network (622 Mbps). It serves 10 servers and 50 users. The traffic traversing the network originates mainly from Web, ftp and mail services, as well as from some experiments carried out in the laboratory. Two types of flow were recorded, all inbound octets from the rest of the world to SL and all outbound octets from SL to the Internet, in time intervals of 1 minute between 1 February 2000 and 29 February 2000, a busy period in the University.
Table 1 lists, along with our data, the various Internet traffic data sets that we studied, with collection periods ranging from 1 hour to about 1 month, time intervals Δt from 2^-7 second to 1 minute, and average flow densities varying over a wide range, from 1 byte to about 500 Kbytes per second. The time series {X} of (1), i.e. transmitted bytes per interval of time Δt, was obtained from the transmitted packet length vs. time stamp records by a program. The BC-p data correspond to LAN traffic during busy hours and the rest belong to WAN traffic. Descriptions of these data can be found in the cited references. The SSIO and SSOO data correspond, respectively, to all in-octets to SL and all out-octets from SL, as mentioned above.

Figure 1. Data collection point at Shiratori laboratory LAN. (Diagram labels: Shiratori Lab LAN, 130.34.202.0/26, 100 Mbps; Tohoku University Campus Network, 622 Mbps; Internet; data collector, In/Out.)

| Data | Ref. | Resolution | Duration | Data type | Average flow |
|---|---|---|---|---|---|
| BC-Oct89Ext | 4 | 2^-7 s | 34 h | Ext | 1.146 Bytes/s |
| BC-Oct89Ext4 | 4 | 2^-7 s | 21 h | Ext | 3.416 Bytes/s |
| BC-p Oct89 | 4 | 2^-7 s | 29 min | LAN | 358.23 KBytes/s |
| BC-p Aug89 | 4 | 2^-7 s | 52 min | LAN | 167.24 KBytes/s |
| DEC pkt-1 | 6 | 2^-7 s | 1 h | tcp | 122.95 KBytes/s |
| DEC pkt-2 | 6 | 2^-7 s | 1 h | tcp | 232.11 KBytes/s |
| DEC pkt-3 | 6 | 2^-7 s | 1 h | tcp | 176.41 KBytes/s |
| DEC pkt-4 | 6 | 2^-7 s | 1 h | tcp | 256.77 KBytes/s |
| LBL pkt-3 | 6 | 2^-7 s | 2 h | tcp | 32.81 KBytes/s |
| LBL pkt-4 | 6 | 2^-7 s | 2 h | tcp | 17.77 KBytes/s |
| LBL pkt-5 | 6 | 2^-7 s | 2 h | tcp | 12.76 KBytes/s |
| MAWI | 14 | 10 ms | 120 h | All | 437.10 KBytes/s |
| SSOO | 15 | 1 min | 29 days | OutOcts | 2.56 KBytes/s |
| SSIO | 15 | 1 min | 29 days | InOcts | 502.78 Bytes/s |

Abbreviations: Ext, external arrivals; All, all packets; pkt, packets; OutOcts, all out-octets; InOcts, all in-octets; Kbytes, 1024 bytes.
Table 1. Traffic data profile

—Observed Characteristics—
Self-similarity from change of variance: temporal multifractals. All the data sets show (in Table 2) high self-similarity, as shown by the values of H, except the last one (SSIO), which does not (β ≈ 1). Most of the cases have temporal multifractals (as in Figure 2), described by the βs in the respective scales given by the ranges of m (in Table 2), and as reported earlier [6–11] these are found in the data sets with small Δt. The low values of β in certain ranges of time scales show that at these resolutions the processes look most self-similar, which may be of help in a prediction process. There is a general trend towards higher values of β at higher resolutions (low m).

Figure 2. Temporal multifractals of data BC-Oct89Ext (Δt = 2^-7 s [4]) with (a) β = 0.19 (m = 2^6 ~ 2^14) and (b) β = 0.61 (m = 2^0 ~ 2^6). (Axes: log Var[m] vs. log m.)

| Data | H | β1 (scales m) | β2 (scales m) | β3 (scales m) | D1 (scales m) | D2 (scales m) | D3 (scales m) |
|---|---|---|---|---|---|---|---|
| BC-Oct89Ext | 0.90 | 0.61 (2^1–2^5) | 0.19 (2^5–2^14) | | 0.82 (2^0–2^12) | 0.56 (2^8–2^12) | |
| BC-Oct89Ext4 | 1.00 | 0.08 (2^1–2^4) | 0.24 (2^4–2^11) | 0.12 (2^11–2^14) | 0.95 (2^0–2^8) | 0.6 (2^4–2^12) | 0.4 (2^8–2^12) |
| BC-p Oct89 | 0.79 | 0.6 (2^1–2^4) | 0.18 (2^4–2^10) | (2^10–2^14) | 0.98 (2^0) | 0.88 (2^8) | 0.62 (2^12) |
| BC-p Aug89 | 0.84 | 0.49 (2^1–2^4) | 0.34 (2^4–2^14) | | 0.98 (2^0) | 0.83 (2^8) | 0.69 (2^12) |
| DEC pkt-1 | 0.83 | 0.58 (2^1–2^4) | 0.33 (2^4–2^14) | | 0.95 (2^0) | 0.83 (2^8) | 0.76 (2^12) |
| DEC pkt-2 | 0.98 | 0.54 (2^1–2^4) | 0.3 (2^4–2^11) | 0.1 (2^11–2^14) | 0.975 (2^0) | 0.88 (2^8) | 0.79, 0.37 (2^12) |
| DEC pkt-3 | 0.79 | 0.66 (2^1–2^4) | 0.4 (2^4–2^14) | | 0.95 (2^0) | 0.84 (2^8) | 0.70 (2^12) |
| DEC pkt-4 | 0.79 | 0.59 (2^1–2^4) | 0.39 (2^4–2^14) | | 0.97 (2^0) | 0.85 (2^8) | 0.68 (2^12) |
| LBL pkt-3 | 0.79 | 0.46 (2^1–2^4) | 0.23 (2^4–2^12) | 0.53 (2^12–2^15) | 0.88 (2^0–2^8) | 0.74 (2^4–2^12) | |
| LBL pkt-4 | 0.75 | 0.56 (2^1–2^4) | 0.27 (2^4–2^11) | 0.48 (2^11–2^15) | 0.84 (2^0–2^12) | 0.68 (2^4–2^8) | 0.21 (2^12) |
| LBL pkt-5 | 0.88 | 0.68 (2^1–2^4) | 0.32 (2^4–2^12) | | 0.88 (2^0–2^12) | 0.67 (2^4–2^8) | 0.23 (2^12) |
| MAWI | 0.95 | 0.12 (2^1–2^5) | 0.012 (2^5–2^14) | | 0.84 (2^0–2^10) | 0.90, 0.48 (2^12) | |
| SSOO | 0.80 | 0.47 (2^1–2^7) | | | 0.5 (2^0–2^7) | | |
| SSIO | 0.58 | 0.95 (2^1–2^7) | | | 0.5 (2^0–2^5) | | |

(The blank places in the table imply that multifractal characteristics were not found in these cases.)
Table 2. Multifractals β’s and D’s of Internet traffic data in different ranges of time scales (m’s)

Figure 3. β and fractal dimension D (f.d) for data BC-Oct89Ext of LBL [4], calculated over a sliding time window of 2048 s, after every 512 s, showing that these parameters remain almost constant in time. The βs correspond to low-time resolutions (m = 2^0 ~ 2^6). (Axes: β and f.d vs. time, 1 unit = 512 s.)

We also found that these multifractals remained almost constant or very slowly changing in time, one of them being shown in Figure 3. Here β was calculated over a sliding window of 256K data (1K = 1024 data points), corresponding to a flow of approximately 32 min. The window slides after approximately 8 min, discarding the previous 64K data and adding a new set of the same length. In this way we obtain the variation of β with time, and the same procedure can be applied to on-line determination. Since our laboratory data were taken at intervals of 1 min, so that the volume of data was not large, the time dependence of the parameters could not be determined for small time windows (as in Figure 3). But averaged over a sliding window of about 15 days (with a set of 16K data) after every 2 days (corresponding to an interval of 2K data), β for both SSIO and SSOO was found to be almost constant in time.
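The sliding-window calculation can be sketched as below. This is a minimal illustration assuming the variance-time fit of (3) is applied afresh in each window; the function names are hypothetical and the input is synthetic noise used only as a placeholder, so the window and step sizes are the only values taken from the text.

```python
import numpy as np

def beta_variance_time(x, ms):
    """Minus the slope of log Var[X^(m)] against log m, as in eq. (3)."""
    log_var = []
    for m in ms:
        n_blocks = len(x) // m
        agg = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_var.append(np.log2(agg.var()))
    slope, _ = np.polyfit(np.log2(ms), log_var, 1)
    return -slope

def sliding_beta(x, window, step, ms):
    """beta in each window of `window` points, advancing by `step` points."""
    x = np.asarray(x, dtype=float)
    starts = range(0, len(x) - window + 1, step)
    return np.array([beta_variance_time(x[s:s + window], ms) for s in starts])

K = 1024
rng = np.random.default_rng(1)
series = rng.normal(size=1024 * K)   # placeholder data (assumption), not traffic
# Window of 256K points, sliding by 64K points, scales m = 2^0 ... 2^6.
betas = sliding_beta(series, window=256 * K, step=64 * K,
                     ms=[2**j for j in range(7)])
```

Each entry of `betas` corresponds to one position of the window, so plotting it against the window index gives a curve of the kind shown in Figure 3.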
Nature of autocorrelations of m-aggregates. The autocorrelation R(k), the average of the product $X_i X_{i+k}$, initially shows a hyperbolic fall-off as in (8), but at larger k it is full of structure, like the data. We find a notable self-similarity when we compare the plots for different m-aggregates (Figure 4). As a test of self-similarity, Leland et al. [4] observed the m-independence of (7), obtained from different m-aggregated series all of the same length. This requires a data set over a long duration. We observe that, given an unaggregated series of length N (say), from which different m-aggregated series each of length $N_m = N/m$ are obtained, the autocorrelations $R^{(m)}(k)$ for m and k large follow:

$$R^{(m_1)}(k_1) = R^{(m_2)}(k_2), \quad m_1 k_1 = m_2 k_2. \tag{14}$$

The higher, and in most cases all, m-aggregated series give the same autocorrelations in terms of actual time (mk), which is easier to verify and more practical (we tested as far as mk = N/2 for all the cases studied). This does not occur for ordinary data, noise or a signal from a dynamic system that is not self-similar. Thus we have a different indication of self-similarity, also useful for verification in a small data set: the autocorrelation plots at higher, and in most cases all, levels of aggregation look similar, as in (14). As a matter of fact, when we tested with artificial data (obtained by inverse Fourier transformation) from a power spectrum around 1/|f|^γ (a signature of critical behaviour) with random deviations and random phases, we did find such self-similarity of the autocorrelations for all m-aggregates. This fact, coupled with (7), makes the autocorrelation function itself self-similar and fractal-like, a feature leading to the data (flow densities) forming a fractal set, which we discuss below. With increasing time resolution the details of the autocorrelation emerge while preserving the previous values. On closer inspection, we do find them to be self-similar and fractal-like (which cannot be described by H or β alone).
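The comparison in terms of actual time can be set up as in the following sketch. It is only an illustration of the bookkeeping: the function names are hypothetical, R(k) is taken literally as the average of the product of values k steps apart, and the input is a synthetic placeholder series, so no conclusion about self-similarity should be read from it.

```python
import numpy as np

def aggregate(x, m):
    """Means of consecutive non-overlapping blocks of m values."""
    x = np.asarray(x, dtype=float)
    n_blocks = len(x) // m
    return x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)

def autocorr(x, k):
    """R(k) = <X_i X_{i+k}>: average of the product over all available pairs."""
    x = np.asarray(x, dtype=float)
    if k == 0:
        return np.mean(x * x)
    return np.mean(x[:-k] * x[k:])

def autocorr_at_times(x, m, times):
    """R^(m)(k) at lags k = t/m for each actual time t (t must be a multiple of m)."""
    xm = aggregate(x, m)
    return np.array([autocorr(xm, t // m) for t in times])

rng = np.random.default_rng(2)
x = rng.exponential(size=2**16)                # placeholder series (assumption)
times = np.arange(256, 2048 * 4 + 1, 256)      # lags mk shared by m = 16 and m = 256
r16 = autocorr_at_times(x, 16, times)
r256 = autocorr_at_times(x, 256, times)
```

Plotting `r16` and `r256` against `times` places both aggregates on the common axis mk, as in Figure 4.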
Fractal dimension of flow density data: a new parameter of Internet traffic. In order to determine the fractal dimension D of (13), we normalize a data set to the range [0, 1], which is divided into $2^k$ segments (or boxes), each of size ε = $2^{-k}$. If a datum has value X, it is placed in box number Integer(X/ε). The box-counting method counts the number of boxes N(ε) that contain data (some of these may contain more than one datum) and determines D from the log–log plot of N(ε) vs. ε.
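The box-counting steps can be sketched as follows. This is a minimal illustration, not the program used for the results: the function names are hypothetical, D is taken from a single least-squares line over all the supplied resolutions, and placing the value X = 1 in the last box is an assumption made so that every datum has a box.

```python
import numpy as np

def box_counts(x, ks):
    """Number of occupied boxes N(eps) for eps = 2**-k, after normalising x to [0, 1]."""
    x = np.asarray(x, dtype=float)
    u = (x - x.min()) / (x.max() - x.min())      # assumes x is not constant
    counts = []
    for k in ks:
        # Box number Integer(X/eps); the value X = 1 is kept in the last box (assumption).
        idx = np.minimum((u * 2**k).astype(np.int64), 2**k - 1)
        counts.append(np.unique(idx).size)
    return np.array(counts)

def box_dimension(x, ks):
    """D of eq. (13): minus the slope of log N(eps) against log eps."""
    log_eps = -np.asarray(ks, dtype=float)       # log2(eps) = -k
    slope, _ = np.polyfit(log_eps, np.log2(box_counts(x, ks)), 1)
    return -slope

rng = np.random.default_rng(0)
uniform = rng.random(2**16)                      # every value equally likely
d_uniform = box_dimension(uniform, list(range(1, 9)))
d_two_values = box_dimension([0.0, 1.0, 0.0, 1.0], list(range(1, 9)))
```

Uniformly distributed values occupy every box at these resolutions, which gives D = 1, while a set with only two distinct values always occupies two boxes, which gives D = 0. To look for multifractals, the fit would be restricted to separate ranges of k rather than taken over all of them.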
All the data sets show fractal behaviour of the flow densities. The scaling described by (13) holds for an appreciable range of precisions (more than two orders of magnitude in all the cases studied), but with D depending on the time resolution, like the βs (as shown in Table 2 by D1, D2, etc., in time scales given by the ranges of m). In general, D decreases with m, which implies clustering of the data when viewed at larger time scales. In some cases, for example in the LBL-BC (Figure 5) and LBL-tcp packets, multifractals appear as the time resolution is decreased, gradually switching over to another dimension at higher m. The 1-hour DEC-tcp packets appear to have a random behaviour (with D ≈ 1 at low m), the difference being noticeable only at higher m. An important point to note is that although a finite set of random data tends to form clusters under m-aggregation, thereby apparently forming fractals, it does not give rise to the sharp multifractals we observe here (as in Figure 5). The fractal dimensions of the aggregates of a set of random numbers (of the same volume as the data) change (from D = 1 at m = 1) at a slower rate with m, probably because the variance drops faster. In these stochastic processes, as m varies, either D changes at a faster rate or multifractals appear with the previous dimensions (almost) unchanged.

Figure 4. A typical $R^{(m)}(k)$ vs. mk for data BC-Oct89Ext, m = 16 and 256, plotted at mk = 0(256)2048·256 for comparison.
We verified that the appearance of these multifractals on aggregation is not an artifact of the reduced data size. For the m = 256 case of the BC-Oct89Ext data (of Figure 5), with data size 64K (corresponding to approximately 32 hours of data), we find similar multifractals with D1 ~ 0.75, D2 ~ 0.5 for data sizes 32K and 16K (approximately 16 hours and 8 hours of data), slightly different from those obtained from the whole data set, with D1 = 0.81, D2 = 0.57. For m = 4096 (data size 4K) we find a greater range of scales for D2 ~ 0.57. Thus there is a tendency to switch over to a fractal with smaller D at larger time scales. While this data size may seem too small, so that the multifractals are only apparent, we note that this feature at such low resolutions is obtained for other data sets also (Table 2), such as MAWI (resolution 40 s), SSIO and SSOO (60 s), all with data size >10K and D ~ 0.5.

We have tested that although the 1/|f|^γ law of (10) gives rise to fractals like β and D, multifractals in both time and flow density resolutions cannot be obtained from such a power spectrum (even with large fluctuations of up to 50%). For this it is necessary to identify the features in simulations of Internet traffic (such as in [1–3]).
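Artificial data of this kind can be generated as in the sketch below. It is an illustration under stated assumptions rather than the generator used for the tests: the function names are hypothetical, the spectrum is imposed on the positive-frequency bins of a real inverse FFT, and the optional `jitter` argument, which perturbs each amplitude by a uniform relative amount, is one possible reading of "random deviations".

```python
import numpy as np

def synth_power_law(n, gamma, rng, jitter=0.0):
    """Real series of length n whose power spectrum follows 1/f**gamma, with random phases."""
    freqs = np.fft.rfftfreq(n)                   # 0, 1/n, ..., 1/2
    amp = np.zeros(freqs.size)
    amp[1:] = freqs[1:] ** (-gamma / 2.0)        # |X(f)| ~ f^(-gamma/2), so S(f) ~ f^(-gamma)
    # Optional relative deviation of each amplitude, uniform in [-jitter, +jitter].
    amp *= 1.0 + jitter * rng.uniform(-1.0, 1.0, size=freqs.size)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=freqs.size)
    return np.fft.irfft(amp * np.exp(1j * phases), n=n)

def spectral_exponent(x, n_low):
    """gamma estimated from the n_low lowest non-zero frequencies of the periodogram."""
    power = np.abs(np.fft.rfft(x)) ** 2
    freqs = np.fft.rfftfreq(len(x))
    slope, _ = np.polyfit(np.log(freqs[1:n_low + 1]), np.log(power[1:n_low + 1]), 1)
    return -slope

rng = np.random.default_rng(3)
x = synth_power_law(2**14, gamma=0.8, rng=rng)
gamma_hat = spectral_exponent(x, n_low=2**10)
```

With `jitter=0` the periodogram of the generated series reproduces the imposed exponent, which is a useful check before the series is passed to the variance-time, autocorrelation or box-counting analyses.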
We also studied the variation of these features with time, and a typical result is shown in Figure 3. In most cases, the fractal dimension was constant or very slowly changing in time, depending only on the time resolution of observation. This indicates that these stochastic processes follow fixed dynamics. In some cases, for example in the BC-Oct89Ext and BC-Oct89Ext4 data of LBL, we find that if we treat the byte sizes of the packets as a series irrespective of the time of flow, the data have a fractal structure with D ≈ 0.75, which also remains constant in time. The same holds for the LBL-tcp packets (3, 4 and 5), with D ≈ 0.8. We find similar descriptions of some stochastic processes in the excellent collection of Briggs [20]. The other cases, which show D ≈ 1.0, indicate that it is the pattern of inter-arrival times that makes the equal-interval time series fractal. The statistics of the inter-arrival times need to be worked out to get a clue to this feature.

Figure 5. Multifractals of flow density (data BC-Oct89Ext, Δt = 2^-7 s [4]) at time resolutions m = 1 (D = 0.85), 16 (D = 0.84), 256 (D1 = 0.81, D2 = 0.57) and 4096 (D1 = 0.81, D2 = 0.55). Data are normalized to [0, 1] and the range of scales in the figure corresponds to 2^-20 to 1. (Axes: log N(ε) vs. log ε, arbitrary units.)
Internet traffic, therefore, appears to have another parameter of description, its fractal dimension D, which depends on the time resolution (for which a convention may be set in order to specify D). As seen in Table 2, two sets of data (with the same Δt) having more or less the same β (or D) are differentiated by the other parameter D (or β). The case of SSIO is interesting (Figure 6): it has a convincing fractal characteristic but much less temporal self-similarity (β ≈ 0.95). Usually β ≈ 1 represents noisy data with no fractal-like behaviour, but in this case we find a consistent fractal characteristic (over an appreciable range of precisions) given by D ≈ 0.50. Although the autocorrelations of the different aggregates (Figure 7) appear to have features similar to those of random data, and the self-similarity is not as pronounced as in the other cases, the data are not noisy.
Conclusion and Future Research
We have shown that, given an Internet traffic time series of length N (say) from which its m-aggregates, each of length N/m, are obtained, there is self-similarity in the autocorrelations of such aggregates, $R^{(m)}(k) = R(mk)$ (m and k large), i.e. the autocorrelations from different m-aggregates are the same in terms of actual time (mk). This does not occur for ordinary data, noise or a signal from a dynamic system which is not self-similar. Such an indication is also useful for verification of self-similarity in a small data set. Similar to β, which gives the rate of decrease of the variance of m-aggregated data, the flow density is also seen to be fractal with dimension D, depending on the time resolution of observation. This provides Internet traffic with another parameter of description, the fractal dimension D of the data set, besides the commonly used self-similarity parameter β. Multifractals are also possible in both cases, especially for data of high resolution in time (of the order of 1 ms). These parameters, remaining almost constant or changing very slowly over time, hint that the dynamics of the process is fixed. It is therefore also required that the simulations or models worked out to understand Internet traffic reproduce the fractal descriptions found in the present work. Another important point is that in nature there are, similar to Internet traffic, other traffic systems (for example vehicular traffic), stochastic flows (like water drainage in a river) and growths, and it is expected that such systems will also follow behaviour similar to that observed here.

Figure 6. Fractals in flow density of data SSIO, Δt = 60 s, D = 0.5. (Axes: log N(ε) vs. log ε, arbitrary units.)
Application of self-similarity features to Internet traffic and further extension of our work are possible by working out the following problems:
1. Understanding the characteristics of these self-similar stochastic processes may help in a statistical prediction of the flow, useful to the network manager in rendering a satisfactory quality of service. Although the ARIMA method [21] is in practice for statistical prediction, further accuracy could be achieved if self-similarity features were taken into account. This could be done by a simultaneous fitting to the original series and its aggregates with the same set of parameters.
2. The characteristics may be of use in differentiating traffic that is ‘not natural’, which may be due to intruders [22] in networks or to an ‘abnormal’ traffic situation in a busy city. A change in the nature of the traffic will probably result in a change of its characteristics. Since the calculations to determine the characteristics (β and D) require little CPU time, an in-situ determination of abnormal traffic may be possible by noting the change of parameters.
3. The cause of flow density data having such fractal structures, described by a dimension D dependent on the time resolution, is not yet understood. Another interesting line of work will be to look for a description of the system that reproduces these parameters.
4. In simulations of Internet traffic, such as the phase-transition model in networks worked out by Takayasu et al. [1–3], it will also be interesting to see what happens to the self-similarity and fractal structure of the flow densities in situations away from the critical point. This is essential since network performance is expected to be best near the critical condition [1–3], which may be identified by the variation of these parameters.
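As a purely hypothetical illustration of item 2, the sketch below flags windows whose parameter value departs from the typical level of the sequence. The rule (distance from the median compared with a multiple of the median absolute deviation), the function name `flag_changes`, the thresholds and the sample values are all assumptions introduced here; the paper does not specify a detection rule.

```python
import numpy as np

def flag_changes(values, k=5.0, min_dev=1e-3):
    """Mark entries whose distance from the median exceeds k times the median absolute deviation.

    values: one parameter value (for example beta or D) per time window.
    min_dev: floor on the threshold, so a perfectly flat sequence flags nothing spurious.
    """
    v = np.asarray(values, dtype=float)
    dev = np.abs(v - np.median(v))
    threshold = max(k * np.median(dev), min_dev)
    return dev > threshold

# Hypothetical per-window beta values with one window that departs from the rest.
flags = flag_changes([0.20, 0.21, 0.19, 0.20, 0.60, 0.20])
```

Only the fifth window is marked in this sample. In practice the per-window values would come from a sliding-window estimate of β or D, and the thresholds would need tuning against real traffic.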
Figure 7. $R^{(m)}(k)$ vs. mk, m = 1, 8 and 32 for data SSIO, Δt = 60 s. (Vertical axis in arbitrary units.)

References
1. Takayasu M, Takayasu H, Sato T. Critical behaviors and 1/f noise in information traffic. Physica A 1996; 233: 824–834.
2. Takayasu M, Takayasu H, Fukuda K. Dynamic phase transition observed in the Internet traffic flow. Physica A 2000; 277: 248–255.
3. Fukuda K, Takayasu H, Takayasu M. Origin of critical behavior in Ethernet traffic. Physica A 2000; 287: 289–301.
4. Leland WE, Taqqu MS, Willinger W, Wilson DV. On the self-similar nature of Ethernet traffic. IEEE/ACM Transactions on Networking Feb. 1994; 2(1): 1–15. (Data collected from: http://ita.ee.lbl.gov/html/contrib/BC.html.)
5. Csabai I. 1/f noise in computer network traffic. J. Phys. A: Math. Gen. 1994; 27: L417–L421.
6. Paxson V, Floyd S. Wide area traffic: the failure of Poisson modeling. IEEE/ACM Transactions on Networking June 1995; 3(3): 226. (Data collected from: http://ita.ee.lbl.gov/html/contrib/{DEC-PKT, LBL-TCP-3, LBL-PKT}.html.)
7. Crovella ME, Bestavros A. Self-similarity in World Wide Web traffic: evidence and possible causes. In Proceedings of Sigmetrics’96, 1996; 160–169.
8. Willinger W, Taqqu MS, Sherman R, Wilson DV. Self-similarity through high variability: statistical analysis of Ethernet LAN traffic at the source level. IEEE/ACM Transactions on Networking 1997; 5: 71–86.
9. Adler R, Feldman R, Taqqu MS. A practical guide to heavy tails (Self-similarity and heavy tails: Structural modeling of network traffic). Birkhauser, Boston, February 1998.
10. Taqqu MS, Teverovsky V, Willinger W. Is network traffic self-similar or multifractal? Fractals 1997; 5: 63.
11. Feldmann A, Gilbert AC, Willinger W, Kurtz TG. The changing nature of network traffic: scaling phenomena. In ACM SIGCOMM98 1998; Comp. Comm. Rev.
12. Caldarelli G, Marchetti R, Pietronero L. The fractal properties of Internet. Europhysics Letters 2000; 52: 386–992.
13. Mandelbrot BB, Van Ness JW. Fractional Brownian motions, fractional noises and applications. SIAM Rev. 1968; 10: 422.
14. WIDE group. http://tracer.csl.sony.co.jp/mawi/ (mawi of wide).
15. Data from Shiratori Laboratory RIEC, Tohoku University (available on request, http://www.shiratori.riec.tohoku.ac.jp/).
16. Hurst HE, Black R, Simaika Y. Long-term storage: An Experimental Study. London: Constable, 1965.
17. Willinger W, Paxson V. Where mathematics meets the Internet. Notices of the Am. Math. Soc. Sept. 1998; 8(45): 961.
18. Schuster HG. Deterministic chaos: An introduction, 3rd edn. John Wiley & Sons, March 1995.
19. Addison PS. Fractals and chaos: An illustrated course. Institute of Physics Publishing, Sept. 1997.
20. Briggs J. Fractals: the patterns of chaos. Simon and Schuster, New York, Oct. 1992.
21. Box GEP, Jenkins GM, Reinsel GC. Time series analysis: forecasting and control, 3rd edn. Prentice Hall, New Jersey, February 1994.
22. Mansfield G, Ohta K, Takei Y, Kato N, Nemoto Y. Towards trapping wily intruders at large. Computer Networks 2000; 34: 659.
23. Ashir A, Suganuma T, Kinoshita T, Roy TK, Mansfield G, Shiratori N. Network traffic characterization and network information services: R&D on JGN. Computer Communication (in press).
24. Mansfield G, Roy TK, Shiratori N. Self-similar and fractal nature of Internet traffic data. In Proceedings of the 15th International Conference on Information Networking, Beppu City, Kyushu, Japan, Jan.–Feb. 2001; 227–231.
25. Ashir A, Chakraborty D, Roy TK, Mansfield G, Shiratori N. Some characteristics of traffic data. In Proceedings of the International Conference on Electrical and Engineering, ICECE 2001, Dhaka, Bangladesh, January 2001; 226–229.
26. Roy TK, Chakraborty D, Ashir A, Mansfield G, Shiratori N. Characteristics of Internet traffic data. Workshop on Multimedia and Distributed Processing, Tokyo, Japan, 2000; 12: 75–80.