
LA-UR-00-3765

The Failure of TCP in High-Performance Computational Grids*

W. Feng and P. Tinnakornsrisuphap
feng@lanl.gov, tinnakor@cae.wisc.edu

Research & Development in Advanced Network Technology (RADIANT)
P.O. Box 1663, M.S. B255, Los Alamos National Laboratory, Los Alamos, NM 87545

Department of Computer & Information Science, Ohio State University, 2015 Neil Avenue, Columbus, OH 43210

Department of Electrical & Computer Engineering, University of Wisconsin-Madison, 1415 Engineering Drive, Madison, WI 53706

Abstract

Distributed computational grids depend on TCP to ensure reliable end-to-end communication between nodes across the wide-area network (WAN). Unfortunately, TCP performance can be abysmal even when buffers on the end hosts are manually optimized. Recent studies blame the self-similar nature of aggregate network traffic for TCP's poor performance because such traffic is not readily amenable to statistical multiplexing in the Internet, and hence computational grids.

In this paper, we identify a previously ignored source of self-similarity, a source that is readily controllable: TCP. Via an experimental study, we examine the effects of the TCP stack on network traffic using different implementations of TCP. We show that even when aggregate application traffic ought to smooth out as more applications' traffic is multiplexed, TCP induces burstiness into the aggregate traffic load, thus adversely impacting network performance. Furthermore, our results indicate that TCP performance will worsen as WAN speeds continue to increase.

Keywords: network traffic characterization, self-similarity, TCP, computational grid, distributed computing

*This work was supported by the U.S. Dept. of Energy through Los Alamos National Laboratory contract W-7405-ENG-36.

1 Introduction

Distributed computational grids, managed by middleware such as Globus [4] and Legion [5], require support for the fluctuating and heterogeneous demands of individual users. The ability to characterize the behavior of the resulting aggregate network traffic can provide insight into how traffic should be scheduled to make efficient use of the network in a computational grid.

Several studies conclude that aggregate network traffic is self-similar (or fractal) [7, 12], and thus not amenable to the statistical-multiplexing techniques currently found on the Internet. Additional studies claim that the heavy-tailed distributions of file size, packet interarrival time, and transfer duration fundamentally contribute to the self-similar nature of aggregate network traffic [10, 11, 16]. While these heavy-tailed distributions may contribute to self-similarity, we will illustrate that TCP itself is the primary source of self-similarity and that this behavior may have dire consequences in computational grids as WAN speeds increase into the gigabit-per-second (Gb/s) range. In particular, we show that even when application traffic is rigged to produce a Hurst parameter H of 0.5, TCP adversely modulates this application traffic and produces H = 1.0 (i.e., self-similar traffic). While we focus on the TCP-induced self-similarity of TCP Reno and TCP Vegas, concurrent research by Veres


et al. [14, 15] proves that TCP Tahoe is also a source of self-similarity but does so through an examination of congestion windows (cwnd) and their attractors.

2 Background

TCP is a connection-oriented service which guarantees the reliable, in-order delivery of a stream of bytes, hence freeing the application from having to worry about missing or reordered data. It includes a flow-control mechanism, which ensures that a sender does not overrun the buffer capacity of the receiver, and a congestion-control mechanism, which tries to prevent too much data from being injected into the network, thereby causing packet loss within the network. While the size of the flow-control window is static, the size of the congestion window evolves over time, according to the status of the network.
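As a minimal sketch of the interaction just described (a hypothetical Python helper, not from the paper), the sender's effective window is simply the smaller of the two limits:

```python
def allowed_in_flight(cwnd: int, rwnd: int) -> int:
    """Packets a sender may have unacknowledged: rwnd enforces flow
    control (receiver buffer), cwnd enforces congestion control."""
    return min(cwnd, rwnd)
```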

2.1 TCP Congestion Control

Currently, the most widely used TCP implementation is TCP Reno [6]. Its congestion-control mechanism consists of two phases: (1) slow start and (2) congestion avoidance. In the slow-start phase, the congestion window grows exponentially (i.e., doubles every time the sender successfully transmits a congestion window's worth of packets across the network) until a timeout occurs, which implies that a packet has been lost. At this point, a threshold value is set to the halved window size; TCP Reno resets the congestion window to one and re-enters the slow-start phase, increasing the congestion window exponentially up to the threshold. When the threshold is reached, TCP Reno then enters its congestion-avoidance phase in which the congestion window is increased by "one packet" every time the sender successfully transmits a congestion window's worth of packets across the network. When a packet is lost during the congestion-avoidance phase, TCP Reno takes the same actions as when a packet is lost during slow start.

TCP Reno now also implements fast-retransmit and fast-recovery mechanisms for both the slow-start and congestion-avoidance phases. Rather than timing out while waiting for the acknowledgement (ACK) of a lost packet, if the sender receives three duplicate ACKs (indicating that some packet was lost but later packets were received), the sender immediately retransmits the lost packet (fast retransmit). Because later packets were received, the network congestion is assumed to be less severe than if all packets were lost, and the sender only halves its congestion window and re-enters the congestion-avoidance phase (fast recovery) without going through the slow-start phase again.
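The following sketch (a simplified Python model under the assumptions above, not the paper's simulation code) traces the Reno congestion window, in packets per RTT, through slow start, congestion avoidance, timeouts, and fast recovery:

```python
def reno_cwnd_trace(loss_events, rtts=100, ssthresh=64):
    """Simplified TCP Reno congestion-window model.

    loss_events maps an RTT index to 'timeout' or '3dupack'.
    Returns the congestion window observed at each RTT.
    """
    cwnd, trace = 1.0, []
    for t in range(rtts):
        trace.append(cwnd)
        loss = loss_events.get(t)
        if loss == 'timeout':
            ssthresh = max(cwnd / 2, 2)  # threshold = half the window
            cwnd = 1.0                   # restart slow start
        elif loss == '3dupack':
            ssthresh = max(cwnd / 2, 2)  # fast retransmit ...
            cwnd = ssthresh              # ... and fast recovery: skip slow start
        elif cwnd < ssthresh:
            cwnd *= 2                    # slow start: exponential growth
        else:
            cwnd += 1                    # congestion avoidance: linear growth
    return trace

# Example: a timeout at RTT 10, then three duplicate ACKs at RTT 30.
print(reno_cwnd_trace({10: 'timeout', 30: '3dupack'})[:40])
```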

TCP Vegas [1] introduces a new congestion-control mechanism that tries to prevent rather than react to congestion. When the congestion window increases in size, the expected sending rate (ER) increases as well. But if the actual sending rate (AR) stays roughly the same, this implies that there is not enough bandwidth available to send at ER, and thus, any increase in the size of the congestion window will result in packets filling up the buffer space at the bottleneck gateway. TCP Vegas attempts to detect this phenomenon and avoid congestion at the bottleneck gateway by adjusting the congestion-window size, and hence ER, as necessary to adapt to the available bandwidth.

To adjust the window size appropriately, TCP Vegas defines two threshold values, α and β, for the congestion-avoidance phase, and a third threshold value, γ, for the transition between the slow-start and congestion-avoidance phases. Conceptually, α = 1 implies that TCP Vegas tries to keep at least one packet from each stream queued in the gateway, while β = 3 keeps at most three packets.

If Diff = ER - AR, then when Diff < α, Vegas increases the congestion window linearly during the next RTT; when Diff > β, Vegas decreases the window linearly during the next RTT; otherwise, the congestion window remains unchanged. The γ parameter can be viewed as the "initial" β when TCP Vegas enters its congestion-avoidance phase.
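A sketch of the Vegas adjustment rule described above (a hypothetical helper; rates are assumed to be expressed in packets per base RTT so that Diff approximates the number of packets this stream keeps queued at the gateway):

```python
def vegas_update(cwnd, expected_rate, actual_rate, alpha=1, beta=3):
    """One TCP Vegas congestion-avoidance step on the congestion window."""
    diff = expected_rate - actual_rate   # Diff = ER - AR
    if diff < alpha:
        return cwnd + 1                  # under-utilizing: linear increase next RTT
    if diff > beta:
        return cwnd - 1                  # queue building up: linear decrease next RTT
    return cwnd                          # alpha <= Diff <= beta: leave cwnd unchanged
```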

To further enhance TCP performance, Floyd et al. proposed the use of random early detection (RED) gateways [3] to detect incipient congestion. RED gateways maintain a weighted average of the queue length. As long as the average queue length stays below the minimum threshold (min_th), all packets are queued. When the average queue length exceeds min_th, packets are dropped with probability p. And when the average queue length exceeds the maximum threshold (max_th), all arriving packets are dropped.
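A minimal sketch of the RED behavior described above (it omits the count-based spacing of drops used in the full algorithm and assumes a linear ramp of the drop probability up to a configurable max_p):

```python
import random

def red_avg(avg_queue, instantaneous_queue, weight=0.002):
    """Exponentially weighted moving average of the queue length."""
    return (1 - weight) * avg_queue + weight * instantaneous_queue

def red_should_drop(avg_queue, min_th, max_th, max_p=0.02):
    """Drop decision for an arriving packet, given the average queue length."""
    if avg_queue < min_th:
        return False                     # below min_th: queue everything
    if avg_queue >= max_th:
        return True                      # above max_th: drop all arrivals
    p = max_p * (avg_queue - min_th) / (max_th - min_th)
    return random.random() < p           # in between: probabilistic early drop
```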

2.2 Probability and Statistics

The Central Limit Theorem states that the sum of a large number of finite-mean, finite-variance, independent variables (e.g., Poisson) approaches a Gaussian random variable with less variability (i.e., less "spread" or burstiness) than the original distribution(s). So, if each random variable represents the traffic generated by a particular communication stream, then the sum of a large number of these streams represents aggregate network traffic with less variability, and thus less variation or spread in the required bandwidth; i.e., network traffic is less bursty or more smooth. Such aggregate traffic behavior enables statistical-multiplexing techniques to be effective over the Internet. Unfortunately, even if application-generated traffic streams have finite means and variances and are independent, TCP can modulate these streams in such a way that they are no longer independent, leading to aggregate network traffic with wildly oscillating and unpredictable bandwidth demands [13].

To measure the burstiness of aggregate TCP traffic, we use the coefficient of variation (c.o.v.) [2, 13], the ratio of the standard deviation to the mean of the observed number of packets arriving at a gateway in each round-trip propagation delay. The c.o.v. gives a normalized value for the "spread" of a distribution and allows for the comparison of "spreads" over a varying number of communication streams. If the c.o.v. is small, the amount of traffic coming into the gateway in each RTT will concentrate mostly around the mean, and therefore will yield better performance via statistical multiplexing. For purposes of comparison, we also use the Hurst parameter, H, from self-similar modeling [7, 10, 11, 12, 16]. While H may be better at determining long-term buffer requirements (i.e., large time granularities), it does not provide insight into how well statistical multiplexing performs, i.e., at small time granularities such as a round-trip time (RTT).
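Concretely, the measured c.o.v. can be computed from the series of per-RTT packet-arrival counts at a gateway (a sketch, not the paper's analysis code):

```python
from statistics import mean, pstdev

def cov(per_rtt_counts):
    """Coefficient of variation: standard deviation / mean of the number
    of packets arriving at the gateway in each round-trip time."""
    return pstdev(per_rtt_counts) / mean(per_rtt_counts)
```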

3 Experimental Study

To understand the dynamics of TCP, we use ns [9] to model the portion of the Energy Sciences network (ESnet) that is specific to our computational grid, namely the network between Los Alamos National Laboratory (LANL) and Sandia National Laboratory (SNL). We then synthetically generate application traffic to better understand how TCP modulates input traffic. This modulation can have a profound impact on the c.o.v. and H parameters, and hence, on the throughput and packet-loss percentage of network traffic.

3.1 Network Topology

Figures 1 and 2 show the current ESnet backbone and an abstraction of the ESnet backbone, respectively. We assume that there are up to M hosts in the local-area networks (LANs) of both LANL and SNL sending traffic back and forth. The traffic generated from an Ethernet-attached host i in LANL's LAN goes through an Ethernet/FDDI interface to a FDDI/ATM Cisco 7507 router to a Fore ASX-200BX ATM switch in Albuquerque (ABQ POP) to a Cisco 7507 router to host i at SNL, and vice versa. While the traffic generated between LANL and SNL alone cannot cause any congestion at the ABQ POP's WAN ATM switch, data from the "Internet Cloud" is significant enough to interfere with traffic from both sites.

The Cisco 7507 routers have buffers large enough (48 MB) to handle incoming and outgoing traffic. The ABQ POP's Fore ASX-200BX switch can only buffer 64K ATM cells per port, or about 3 MB per port.

Figure 3 shows an abstraction of the proposed topology for the future ESnet backbone. The ABQ POP ATM switch remains a Fore ASX-200BX, but the interface speed changes to OC-12 (622 Mb/s). The routers on either side of the ATM switch are upgraded to Cisco 7512 routers, which have 33% more buffer space than the 7507s. Lastly, both LAN architectures are replaced by Gigabit Ethernet (1000 Mb/s). In short, all the link bandwidths scale up, but the ABQ POP ATM switch remains essentially the same.

3.2 Traffic Characterization

The c.o.v. provides a quantitative measure of how traffic smoothes out when a large number of finite-mean, finite-variance, independent streams are aggregated (via the Central Limit Theorem). To characterize the TCP modulation of traffic, we first generate application traffic according to a known distribution, e.g., Poisson. We then compare the c.o.v. of this distribution (i.e., the theoretical c.o.v.) to the c.o.v. of the traffic transmitted by TCP (i.e., the measured c.o.v.). This allows us to determine whether TCP modulates the traffic, and if so, how it affects the shape (burstiness) of the traffic, and hence the performance of the network.

For instance, when N independent Poisson sources each generate packets at a rate of λ with a round-trip time delay of T, the c.o.v. of the aggregated sources is

    c.o.v.(Poisson, T) = 1 / √(NλT)        (1)

Hence, this theoretical c.o.v. of Poisson traffic decreases as a function of 1/√N, implying that traffic should become smoother as N increases (or as more sources are aggregated).
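As a quick check of Eq. 1 (a sketch, not part of the paper; the numbers assume the current-ESnet Poisson parameters given later in Tables 1 and 3):

```python
from math import sqrt

def theoretical_cov(n_sources, rate_pkts_per_s, rtt_s):
    """Eq. 1: the per-RTT packet count of N aggregated Poisson sources is
    Poisson with mean N*lambda*T, so c.o.v. = sqrt(N*lambda*T) / (N*lambda*T)."""
    return 1.0 / sqrt(n_sources * rate_pkts_per_s * rtt_s)

# 1/lambda = 4 ms => 250 packets/s per client, round-trip time T = 4.6 ms.
print(theoretical_cov(2, 250, 0.0046))   # ~0.659, as quoted in Section 3.3
print(theoretical_cov(60, 250, 0.0046))  # ~0.120
```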

The self-similar model is also used to characterize network traffic. In this model, the Hurst parameter H theoretically lies between 0.5 and 1.0. When H = 1.0, the aggregate traffic is self-similar and exhibits long-range dependence. Because we generate application traffic for each client identically according to a known distribution, i.e., Poisson, with a finite mean and variance, the Hurst parameter H from self-similar modeling should be 0.5 if TCP does not adversely modulate the application traffic. However, as we will show later, TCP modulates the traffic to have self-similar characteristics, particularly TCP Reno.
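One common way to estimate H from a packet-count series is the aggregated-variance method (a sketch; the paper does not state which estimator it uses):

```python
import numpy as np

def hurst_aggregated_variance(counts, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate H from the slope of log(variance of block means) versus
    log(block size): slope = 2H - 2, hence H = 1 + slope/2."""
    x = np.asarray(counts, dtype=float)
    log_m, log_v = [], []
    for m in block_sizes:
        n_blocks = len(x) // m
        if n_blocks < 2:
            continue
        block_means = x[:n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_v.append(np.log(block_means.var()))
    slope = np.polyfit(log_m, log_v, 1)[0]
    return 1.0 + slope / 2.0

# Independent Poisson counts should yield an estimate near H = 0.5.
rng = np.random.default_rng(0)
print(hurst_aggregated_variance(rng.poisson(10, 100_000)))
```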

3.3 Experimental Set-Up

Tables 1 and 2 show the network parameters for the current and proposed ESnet topologies, respectively. The values for network delay are derived from traceroute information between LANL and SNL. Network-hardware values such as buffer size are derived from vendor specifications. Tables 3 and 4 contain appropriately scaled traffic-source parameters for the current and proposed ESnet, respectively.

Each LANL client generates Poisson traffic, i.e., single packets are submitted to the TCP stack with exponentially distributed interpacket arrival times with mean 1/λ. We vary the total offered traffic load by varying the number of clients M and test two different implementations of TCP (Reno and Vegas) and two different queueing disciplines in the routers (FIFO and RED).
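A minimal sketch of one client's Poisson application traffic (a hypothetical helper; the study itself drives the ns simulator with such sources):

```python
import random

def poisson_packet_times(mean_interarrival_s, duration_s, seed=None):
    """Submission times of single packets handed to the TCP stack, with
    exponentially distributed interpacket times of mean 1/lambda."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while t < duration_s:
        t += rng.expovariate(1.0 / mean_interarrival_s)
        times.append(t)
    return times

# Current-ESnet client: mean interarrival 4 ms over the 200 s test.
print(len(poisson_packet_times(0.004, 200.0, seed=1)))  # roughly 50,000 packets
```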


Figure 1. Network Topology of the Energy Sciences Network (ESNet)

Figure 2. Abstraction of the Current ESnet Architecture.


Figure 3. Abstraction of the Proposed ESnet Architecture.

The cross traffic produced by the rest of the "ATM Cloud" is modeled as Poisson session arrivals with a Pareto-distributed service time for each session. For each session, a new TCP connection is randomly created to either LANL or SNL. This cross traffic is characterized by its session-arrival rate, mean file size, and Pareto shape parameter.
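A sketch of this cross-traffic model (a hypothetical helper; the arrival rate, mean file size, and Pareto shape come from Tables 3 and 4):

```python
import random

def cross_traffic_sessions(mean_interarrival_s, mean_file_bytes,
                           pareto_shape, duration_s, seed=None):
    """Poisson session arrivals, each transferring a Pareto-distributed file
    over a new TCP connection to either LANL or SNL."""
    rng = random.Random(seed)
    # Scale chosen so the mean file size equals mean_file_bytes
    # (mean = shape * scale / (shape - 1) for shape > 1).
    scale = mean_file_bytes * (pareto_shape - 1) / pareto_shape
    t, sessions = 0.0, []
    while t < duration_s:
        t += rng.expovariate(1.0 / mean_interarrival_s)
        size = scale * rng.paretovariate(pareto_shape)
        sessions.append((t, size, rng.choice(["LANL", "SNL"])))
    return sessions

# Table 3 values: 0.25 s between sessions, 2 MB mean file size, shape 1.5.
print(len(cross_traffic_sessions(0.25, 2e6, 1.5, 200.0, seed=1)))
```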

To see how TCP modulates traffic, we calculate the theoretical c.o.v. of the aggregate traffic generated by the LANL clients (based on the Poisson distribution each client uses to generate its traffic) and compare it to the measured c.o.v. of the aggregate TCP-modulated traffic at the routers.

For example, using the experimental parameters from Table 1, the theoretical c.o.v. of Poisson traffic is 0.659 when N = 2 and drops all the way down to 0.120 when N = 60. However, in Sections 4 and 5, we will find that the measured c.o.v. (after TCP modulation) can be upwards of 300% larger than the theoretical c.o.v.!

4 Results: Current ESnet Topology

When the amount of traffic being generated is larger than the available bandwidth, i.e., ⌈(155 - 64)/3⌉ = 31 hosts, TCP congestion control kicks in and modulates the traffic to be more bursty than expected, i.e., the measured c.o.v. is significantly larger than the theoretical c.o.v. from Eq. 1, as shown in Figure 4. In particular, with the theoretical c.o.v. reaching a low of 0.120 when N = 60, the measured c.o.v. for TCP Reno is generally 300% higher than the theoretical c.o.v., while the measured c.o.v. for TCP Vegas is never more than 40% higher (and in most cases is less than 10% higher).

TCP Reno begins to induce burstiness when as few as eight clients are aggregated. By 20 clients, the induced burstiness becomes so significant that it adversely impacts both throughput and packet-loss rates, as shown in Figures 5 and 6. (These results are in stark contrast to our results in a more commodity 10-Mb/s switched Ethernet environment [13].) The enhanced burstiness that we see here results from a combination of (1) the fluctuation of the congestion-window sizes, particularly in TCP Reno [13], and (2) the magnification of TCP's inability to adapt appropriately to short-lived congestion over high bandwidth-delay links.

Figure 5 shows the number of packets transmitted through both routers. The throughput becomes saturated around 30 clients, or when the link capacity minus the average throughput of outside sources equals the traffic generated by the sources. TCP Vegas obviously does a better job of utilizing the link, as the number of packets transmitted is higher than with TCP Reno. This substantiates the findings in [8].

Figure 6 shows that TCP Vegas does not lose any packets, i.e., its packet-loss rate is 0, while TCP Reno loses a measurable 0.26% of packets when the network is not even congested, at fewer than 20 clients. The 0.26% loss rate corroborates the loss-rate numbers reported by ESnet; while such a number is oftentimes dismissed, it should not be, because when a packet is dropped before it reaches its destination, all the resources that it has consumed in transit are wasted. Over a 155 Mb/s link, a loss rate of 0.26% translates to 403 Kb of information being lost per second. This situation gets worse as the WAN scales up in speed (as we will see later in Section 5).
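The wasted-bandwidth arithmetic is straightforward (a sketch, not from the paper):

```python
def wasted_kbps(link_mbps, loss_rate):
    """Bandwidth consumed by packets that are ultimately dropped, in Kb/s."""
    return link_mbps * 1000 * loss_rate

print(wasted_kbps(155, 0.0026))  # ~403 Kb/s on the current 155 Mb/s link
print(wasted_kbps(622, 0.05))    # ~31,100 Kb/s (31 Mb/s), the Section 5 case
```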

Figures 5 and 6 also show that the presence of a RED gateway at both the LANL and SNL routers does not cause much change in the c.o.v. or the overall network performance because the bottleneck of the network is at the ABQ POP, not at the routers. In addition, because the buffer size in the ATM switch at the ABQ POP is small compared to both routers (which have buffers that are several orders of magnitude higher than the bandwidth-delay product), the buffers occupied in the routers are almost always smaller than the RED min_th, resulting in very little difference in network performance or the c.o.v. of aggregate traffic.


Parameter | Value
LANL network speed | 100 Mb/s
LANL network delay | 0.3385 ms
SNL network speed | 100 Mb/s
SNL network delay | 0.287 ms
SNL to ABQ POP speed | 155 Mb/s
SNL to ABQ POP delay | 0.13 ms
SNL router buffer size | 33554 packets
Packet size | 1500 bytes
Round-trip propagation delay | 4.6 ms
Total test time | 200 s
TCP Vegas α, β, γ | 1, 3, 1
RED gateway min_th | 6711 packets
RED gateway max_th | 26843 packets

Table 1. Network Parameters for Current ESnet.

Parameter | Value
LANL network speed | 1000 Mb/s
LANL network delay | 0.67 ms
LANL to ABQ POP speed | 622 Mb/s
LANL to ABQ POP delay | 1.21 ms
LANL router buffer size | 44739 packets
SNL network speed | 1000 Mb/s
SNL network delay | 0.287 ms
SNL to ABQ POP speed | 622 Mb/s
SNL to ABQ POP delay | 0.13 ms
SNL router buffer size | 44739 packets
ABQ POP ATM buffer (per port) | 2315 packets
Packet size | 1500 bytes
Round-trip propagation delay | 4.6 ms
Total test time | 200 s
TCP Vegas α, β, γ | 1, 3, 1
RED gateway min_th | 8948 packets
RED gateway max_th | 35791 packets

Table 2. Network Parameters for Proposed ESnet.

Parameter | Value
Maximum # of clients | 60
Poisson mean packet intergeneration (1/λ) | 4 ms
Average traffic rate | 3 Mb/s
ON/OFF mean burst period | 3 ms
ON/OFF mean idle period | 97 ms
ON/OFF Pareto shape | 1.5
ON traffic rate | 100 Mb/s
Average traffic rate | 3 Mb/s
Outside-source mean time between sessions | 0.25 s
Outside-source mean file size per session | 2 MB
Outside-source average traffic rate | 64 Mb/s

Table 3. Traffic-Source Parameters for Current ESnet.

Parameter | Value
Maximum # of clients | 60
Poisson mean packet intergeneration (1/λ) | 0.8 ms
Average traffic rate | 15 Mb/s
ON/OFF mean burst period | 1.5 ms
ON/OFF mean idle period | 98.5 ms
ON/OFF Pareto shape | 1.5
ON traffic rate | 1000 Mb/s
Average traffic rate | 15 Mb/s
Outside-source mean time between sessions | 0.25 s
Outside-source mean file size per session | 2 MB
Outside-source average traffic rate | 64 Mb/s

Table 4. Traffic-Source Parameters for Proposed ESnet.


Figure 4. C.O.V. of Traffic at LANL Router.

Figure 5. Number of Packets Transmitted Through LANL and SNL Routers.

Figure 6. Packet-Loss Percentage for LANL and SNL Sources.


Figure 7 shows the Hurst parameter H as a function of traffic load. If H = 0.5, the traffic streams are independent of one another, while at H = 1.0, the traffic streams exhibit self-similarity (burstiness), or more precisely, long-range dependence. So, when N independent Poisson sources are mathematically aggregated together, the Hurst parameter H is 0.5. If TCP does not adversely modulate our application-generated traffic of independent Poisson sources, H should remain at 0.5 across all offered traffic loads.

However, as Figure 7 shows, TCP Reno dramatically induces burstiness and long-range dependence into the traffic streams, as its H value quickly reaches 1.0 when only twelve clients have been aggregated together over an uncongested network. Meanwhile, TCP Vegas does a much better job of not adversely modulating the traffic. Unfortunately, TCP Reno (not Vegas) is currently the most popular and virtually ubiquitous implementation of TCP in the world today.

Figure 7. Hurst Parameter at LANL Router.

5 Results: Proposed ESnet Topology

As in the previous section, when the amount of traffic being generated is larger than the available bandwidth, i.e., ⌈(622 - 64)/15⌉ = 38 hosts in the proposed ESnet, TCP congestion control really kicks in and adversely modulates the application traffic.


However, over this proposed ESnet, the adverse modulation by TCP is even more pronounced than in the current ESnet, i.e., the measured c.o.v. is enormously larger than the theoretical c.o.v. from Eq. 1, as shown in Figure 8. In this particular case, the theoretical c.o.v. reaches a low of 0.054 when N = 60, while the measured c.o.v. for TCP Reno and TCP Vegas are upwards of 642% and 457% higher than the theoretical c.o.v.! These numbers indicate that TCP performance may worsen as WAN speeds continue to increase; further evidence of this is provided below.

TCP Reno starts to induce burstiness when as few as eight clients are aggregated. By 34 clients, the induced burstiness becomes so significant that it adversely impacts both throughput and packet-loss rates, as shown in Figures 9 and 10.

In contrast to Figure 5, where TCP Vegas shows better throughput than TCP Reno, Figure 9 shows that the throughput of TCP Reno is slightly better than that of TCP Vegas. However, a detailed look at a snapshot of the congestion-window evolution (Figures 11-14) indicates that the reason why the throughput of TCP Reno is better than TCP Vegas may be due to the many packet retransmissions by TCP Reno. Consequently, TCP Vegas's goodput is likely to be better than TCP Reno's.

Figures 11 and 13 show that TCP Reno increases its congestion window to an extraordinarily high value (>> 1000) during slow start even though the optimal value is less than 100 packets. Why does this happen? Because there is no mechanism other than packet loss to inform TCP Reno of the appropriate value for the congestion window. With the large bursts that TCP Reno allows with the O(1000-packet) congestion window, the network becomes severely congested and must back off twice in Figure 13 without transmitting anything for more than two seconds. This suggests that the slow-start mechanism of TCP Reno does not adapt quickly enough when the bandwidth-delay product is very large.

Returning to the figure on packet-loss percentage (Figure 10), TCP Reno's loss rate exceeds 5% when the network is heavily congested, while TCP Vegas once again produces no packet loss. Over a 622 Mb/s link, a loss rate of 5% translates to a loss of over 31 Mb/s! This kind of loss rate is clearly unacceptable for the multimedia applications that must be supported in computational grids, e.g., remote steering of visualization data and video-teleconferencing.

As in Section 4, Figures 9 and 10 also show that the presence of a RED gateway at both the LANL and SNL routers does not cause much change in the c.o.v. or the overall network performance because the bottleneck of the network is at the ABQ POP, not at the routers. In addition, because the buffer size in the ATM switch at the ABQ POP is small compared to both routers (which have buffers that are several orders of magnitude higher than the bandwidth-delay product), the buffers occupied in the routers are almost always smaller than the RED min_th, resulting in very little difference in network performance or the c.o.v. of aggregate traffic.

Figure 8. C.O.V. of Traffic at LANL Router.

Figure 9. Number of Packets Transmitted Through LANL and SNL Routers.

Figure 10. Packet-Loss Percentage for LANL and SNL Sources.


Figure 11. TCP Reno Congestion Window (Clients = 35).

Figure 12. TCP Vegas Congestion Window (Clients = 35).

Figure 13. TCP Reno Congestion Window (Clients = 60).

Figure 14. TCP Vegas Congestion Window (Clients = 60).



6 Discussion

What are the implications of the above results for tomorrow's Next-Generation Internet and high-performance computational grids? First, in contrast to the literature of the past five years, we have shown that TCP itself is a primary source of self-similarity, particularly TCP Reno. Due to the burstiness that is induced by TCP Reno, the throughput and packet-loss metrics are adversely affected, reducing overall network performance.

Second, based on our results, TCP Vegas [1] deserves serious consideration as a high-performance TCP for both the Internet as well as distributed computational grids. (However, the weakness in this claim is that we did not test TCP Vegas in a dynamically changing environment of short-lived connections. That is, we did not model the arrival and departure patterns of TCP connections.) For a somewhat opposing view, we direct the reader to [8], where Mo et al. pit a TCP Reno connection against a TCP Vegas connection with the input traffic originating from an infinite file stream.

Third, the performance of TCP worsens as WAN speeds are scaled up. Evidence of this was presented in Section 5. In addition, there exists another case where TCP performance will suffer as WAN speeds scale, a case that was not specifically addressed in our experimental study. For an intuitive understanding, one must think about the TCP Reno congestion-control mechanism in the context of a high bandwidth-delay product, e.g., 1 Gb/s WAN x 100 ms round-trip time (RTT) = 100 Mb. For the sake of argument, assume that the "optimal window size" for a particular connection is 50 Mb. TCP Reno continually increases its window size until it induces packet loss (i.e., just above 50 Mb) and then chops its window size in half (i.e., 25 Mb). Thus, having all TCP connections use packet loss as a way to incessantly probe network state can obviously induce burstiness. Furthermore, re-convergence to the "optimal window size" using TCP's absolute linear increase takes much too long and results in lowered network utilization. In this particular case, convergence can take as long as (50 Mb - 25 Mb) / (1500 B/RTT * 8 b/B) = 2,084 RTTs, or (2,084 RTTs * 100 ms/RTT) = 208.4 seconds = 3.472 minutes.
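The convergence arithmetic above can be reproduced as follows (a sketch, not from the paper):

```python
def reno_reconvergence_s(window_gap_bits, packet_bits=1500 * 8, rtt_s=0.1):
    """Time to close a window gap at one packet of additive increase per RTT."""
    rtts = window_gap_bits / packet_bits
    return rtts * rtt_s

# Growing back from 25 Mb to the 50 Mb "optimal" window at 1500 B/RTT, RTT = 100 ms:
print(reno_reconvergence_s(50e6 - 25e6))  # ~2,083 RTTs => ~208 s (~3.5 minutes)
```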

7 Conclusion

The ability to characterize the behavior of aggregate network traffic can provide insight into how traffic should be scheduled to make efficient use of the network, and yet still deliver the expected quality of service to end users. These issues are of fundamental importance in widely-distributed, high-speed computational grids.

Our experimental study illustrates that the congestion-control mechanisms of TCP Reno and TCP Vegas modulate the traffic generated by the application layer. TCP Reno, in particular, adversely modulates the traffic to be significantly more bursty, which subsequently affects the performance of statistical multiplexing in the gateway. This modulation occurs for a number of reasons: (1) the rapid fluctuation of the congestion-window sizes caused by the continual "additive increase / multiplicative decrease (or restart slow start)" probing of the network state and (2) the dependency between the congestion-control decisions made by multiple TCP streams, i.e., TCP streams tend to recognize congestion in the network at the same time and thus halve their congestion windows at the same time (see Figures 11 and 13, for example).

Furthermore, over the proposed ESnet topology, we were able to magnify TCP's inability (particularly Reno's) to adapt appropriately to congestion over high bandwidth-delay links. Thus, the work presented here concludes that if we continue on the path that we are on, using TCP Reno as the networking substrate for high-performance computational grids, overall network performance will suffer. In particular, TCP Reno's congestion-control mechanism induces burstiness and dependency between streams, which ultimately limit the effectiveness of statistical multiplexing in routers. In addition, TCP Reno's "packet-loss induced" probing of network state does not allow a TCP connection to maintain its optimal window size.

Finally, this work also opens up future research opportunities in the profiling of application-generated network traffic. Why is such profiling important? Because the traffic distribution that sources generate will likely have an impact on the performance of the congestion-control protocol (even though the average aggregated traffic rate is the same). While the research community currently understands what network traffic looks like on the wire, there is little understanding of what the traffic looks like when it enters the TCP protocol. Oftentimes, researchers simply use an infinite-sized file as input into TCP. Why is this bad? Because it is analogous to pumping an infinitely long memory-reference pattern into a cache-coherency protocol in order to test the effectiveness of the protocol. Rather than do that, researchers in computer architecture profiled memory-reference patterns in real parallel programs and used these profiles as input into their cache-coherency protocols. Likewise, we, the network research community, ought to start doing the same.



References

[1] L. Brakmo and L. Peterson. TCP Vegas: End-to-End Congestion Avoidance on a Global Internet. IEEE Journal on Selected Areas in Communications, 13(8):1465-1480, October 1995.

[2] W. Feng and P. Tinnakornsrisuphap. The Adverse Impact of the TCP Congestion-Control Mechanism in Heterogeneous Computing Systems. In Proceedings of the International Conference on Parallel Processing (ICPP '00), August 2000.

[3] S. Floyd and V. Jacobson. Random Early Detection Gateways for Congestion Avoidance. IEEE/ACM Transactions on Networking, 1(4):397-413, August 1993.

[4] I. Foster and C. Kesselman. The Grid: Blueprint for a New Computing Infrastructure. Morgan Kaufmann Publishers, 1999.

[5] A. Grimshaw, W. Wulf, and the Legion team. The Legion Vision of a Worldwide Virtual Computer. Communications of the ACM, 40(1):39-45, January 1997.

[6] V. Jacobson. Congestion Avoidance and Control. In Proceedings of the SIGCOMM '88 Symposium, pages 314-332, August 1988.

[7] W. Leland, M. Taqqu, W. Willinger, and D. Wilson. On the Self-Similar Nature of Ethernet Traffic (Extended Version). IEEE/ACM Transactions on Networking, 2(1):1-15, February 1994.

[8] J. Mo, R. J. La, V. Anantharam, and J. Walrand. Analysis and Comparison of TCP Reno and Vegas. In Proceedings of INFOCOM '99, March 1999.

[9] ns: Network Simulator. http://www-mash.cs.berkeley.edu/ns.

[10] K. Park, G. Kim, and M. Crovella. On the Relationship Between File Sizes, Transport Protocols, and Self-Similar Network Traffic. In Proceedings of the 4th International Conference on Network Protocols, October 1996.

[11] K. Park, G. Kim, and M. Crovella. On the Effect of Traffic Self-Similarity on Network Performance. In Proceedings of the SPIE International Conference on Performance and Control of Network Systems, 1997.

[12] V. Paxson and S. Floyd. Wide-Area Traffic: The Failure of Poisson Modeling. IEEE/ACM Transactions on Networking, 3(3):226-244, June 1995.

[13] P. Tinnakornsrisuphap, W. Feng, and I. Philp. On the Burstiness of the TCP Congestion-Control Mechanism in a Distributed Computing System. In Proceedings of the International Conference on Distributed Computing Systems (ICDCS '00), April 2000.

[14] A. Veres and M. Boda. The Chaotic Nature of TCP Congestion Control. In Proceedings of INFOCOM 2000, March 2000.

[15] A. Veres, Zs. Kenesi, S. Molnar, and G. Vattay. On the Propagation of Long-Range Dependence in the Internet. In Proceedings of SIGCOMM 2000, August/September 2000.

[16] W. Willinger, V. Paxson, and M. Taqqu. Self-Similarity and Heavy Tails: Structural Modeling of Network Traffic. In A Practical Guide to Heavy Tails: Statistical Techniques and Applications, pages 27-53, 1998.