achieving near-optimaltraffic engineering solutions for...

22
1 Achieving Near-Optimal Traffic Engineering Solutions for Current OSPF/IS-IS Networks Ashwin Sridharan , Roch Gu´ erin , Christophe Diot ashwin,guerin @ee.upenn.edu , [email protected] Abstract— Traffic engineering is aimed at distributing traffic so as to “optimize” a given performance criterion. The ability to carry out such an optimal distribution depends on both the routing protocol and the forwarding mechanisms in use in the network. In IP networks running the OSPF or IS-IS protocols, routing is over shortest paths, and forwarding mechanisms are constrained to distributing traffic uniformly over equal cost shortest paths. These constraints often make achieving an optimal distribution of traffic impossible. In this paper, we propose and evaluate an approach, based on manipulating the set of next hops for routing prefixes, that is capable of realizing near optimal traffic distribution without any change to existing routing protocols and forwarding mechanisms. In addition, we explore the trade-off that exists between performance and the overhead associated with the addi- tional configuration steps that our solution requires. The paper’s contributions are in formulating and evaluating an approach to traffic engineering for existing IP networks that achieves performance levels comparable to that of- fered when deploying other forwarding technologies such as MPLS. Index Terms— Routing, Networks, Traffic Engineering, Aggregation. I. I NTRODUCTION As the amount and criticality of data being carried on IP networks grows, it is becoming increasingly important to manage network resources in order to ensure reliable and acceptable performance. Fur- thermore, it is desirable to accomplish this while minimizing or deferring costly upgrades. One of the techniques that is being evaluated by many Internet Service Providers to achieve this goal is traffic engineering. Traffic engineering aims at using information about the traffic entering and leaving the network to optimize network performance. Most Dept. of Elec. Eng., Univ. of Pennsylvania, Philadelphia, PA 19104 Sprint Advanced Technology Labs, Burlingame, CA, 94010 The work of A. Sridharan and R. Gu´ erin was supported in part by NSF Grant ANI-9902943 This work was done when Christophe Diot was at Sprint ATL. often the output of traffic engineering is an “opti- mal” set of paths and link loads that produce the best possible performance given the available resources. This set of paths can then be used by network administrators within an autonomous system to con- trol the flow and distribution of traffic across the network. However, explicitly setting up such paths and (optimally) assigning traffic to them, typically calls for changes to both the routing protocols and the forwarding mechanism they rely on, e.g., through the introduction of new technology such as MPLS [14]. Currently, two of the most widely used Interior Gateway routing protocols are OSPF [13] and IS- IS [4]. Hence it would be beneficial to devise solutions that allow these protocols to emulate “optimal routing,” thus leveraging their widespread deployment. There are two main difficulties in doing so. The first is that these protocols use shortest path routing with destination based forwarding. The second is that when the protocols generate multiple equal cost paths for a given destination routing prefix, the underlying forwarding mechanism per- forms load balancing across those paths by equally splitting traffic on the corresponding set of next hops . These added constraints make it difficult or im- possible to achieve optimal traffic engineering link loads. One of the first works to explore this issue was [6], where a local search heuristic was proposed for optimizing OSPF weights assuming knowledge about the traffic matrix. [6] showed that in spite of these constraints, properly selecting OSPF weights could yield significant performance improvements. However, the paper also showed that for some topologies, performance can still be substantially different from the optimal solution. Subsequently, a result from linear programming ([2][Chap. 17, Sec. 17.2]) was used in [17] to prove that any set of routes can be converted into a set of shortest paths based on some link weights, that matches

Upload: others

Post on 01-Oct-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

1

Achieving Near-Optimal Traffic EngineeringSolutionsfor CurrentOSPF/IS-ISNetworks

AshwinSridharan�, Roch Guerin

�, ChristopheDiot

���ashwin,guerin @ee.upenn.edu , [email protected]

Abstract— Traffic engineering is aimed at distrib utingtraffic so as to “optimize” a given performance criterion.The ability to carry out such an optimal distrib utiondependson both the routing protocol and the forwardingmechanismsin usein the network. In IP networks runningthe OSPF or IS-IS protocols, routing is over shortestpaths, and forwarding mechanisms are constrained todistrib uting traffic uniformly over equal cost shortestpaths. Theseconstraints often make achieving an optimaldistrib ution of traffic impossible.In this paper, we proposeand evaluatean approach,basedon manipulating the setofnext hops for routing prefixes,that is capableof realizingnear optimal traffic distrib ution without any change toexisting routing protocols and forwarding mechanisms.In addition, we explore the trade-off that exists betweenperformance and the overhead associatedwith the addi-tional configuration steps that our solution requires. Thepaper’s contributions are in formulating and evaluating anapproach to traffic engineering for existing IP networksthat achieves performance levels comparable to that of-fered when deploying other forwarding technologiessuchas MPLS.

Index Terms— Routing, Networks, Traffic Engineering,Aggregation.

I . INTRODUCTION

As theamountandcriticality of databeingcarriedon IP networks grows, it is becomingincreasinglyimportantto managenetwork resourcesin order toensurereliable and acceptableperformance.Fur-thermore,it is desirableto accomplishthis whileminimizing or deferring costly upgrades.One ofthe techniquesthat is being evaluated by manyInternet Service Providers to achieve this goal istraffic engineering.Traffic engineeringaimsat usinginformation about the traffic enteringand leavingthenetwork to optimizenetwork performance.Most

�Dept.of Elec.Eng.,Univ. of Pennsylvania,Philadelphia,PA 19104���Sprint AdvancedTechnologyLabs,Burlingame,CA, 94010

The work of A. Sridharanand R. Guerin was supportedin part byNSF GrantANI-9902943This work wasdonewhenChristopheDiot wasat Sprint ATL.

often the output of traffic engineeringis an “opti-mal” setof pathsandlink loadsthatproducethebestpossibleperformancegiven the availableresources.This set of paths can then be used by networkadministratorswithin anautonomoussystemto con-trol the flow and distribution of traffic acrossthenetwork. However, explicitly settingup suchpathsand (optimally) assigningtraffic to them, typicallycalls for changesto both the routing protocolsand the forwarding mechanismthey rely on, e.g.,throughthe introductionof new technologysuchasMPLS [14].

Currently, two of the most widely usedInteriorGateway routing protocolsare OSPF[13] and IS-IS [4]. Hence it would be beneficial to devisesolutions that allow these protocols to emulate“optimal routing,” thus leveragingtheir widespreaddeployment.Therearetwo maindifficultiesin doingso. The first is that these protocols use shortestpathroutingwith destinationbasedforwarding.Thesecondis that whenthe protocolsgeneratemultipleequal cost paths for a given destination routingprefix, the underlying forwarding mechanismper-forms load balancingacrossthosepathsby equallysplitting traffic on thecorrespondingsetof next hops. Theseaddedconstraintsmake it difficult or im-possibleto achieve optimal traffic engineeringlinkloads.One of the first works to explore this issuewas[6], wherea local searchheuristicwasproposedfor optimizing OSPFweightsassumingknowledgeaboutthe traffic matrix. [6] showed that in spiteoftheseconstraints,properly selectingOSPFweightscould yield significantperformanceimprovements.However, the paper also showed that for sometopologies,performancecan still be substantiallydifferent from the optimal solution. Subsequently,a result from linear programming([2][Chap. 17,Sec.17.2]) was usedin [17] to prove that any setof routes can be converted into a set of shortestpaths basedon some link weights, that matches

Page 2: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

2

or improves upon the performanceof the originalsetof routes.This establishesthat the shortestpathlimitation is in itself not a major hurdle.However,the resultof [17] assumesforwardingdecisionsthatare specific to each source-destinationpair, andmore importantly, the ability to split traffic in anarbitrary ratio over different shortestpaths. Bothof theseassumptionsare at odds with current IPforwardingmechanisms.

In this paper, we proposean approachthat reme-dies both of theseproblems.It builds on [17] bytaking advantageof the fact that shortestpathscanbe used to achieve optimal link loads, but it iscompatiblewith both destinationbasedforwardingand even splitting of traffic over equal cost paths.Compatibilitywith destinationbasedforwardingcanbe achieved througha very minor extensionto theresultof [17], simply by takingadvantageof a prop-erty of shortestpathsandreadjustingtraffic splittingratiosaccordingly. Accommodatingtheconstraintofevensplittingof traffic acrossmultipleshortestpathsis a morechallengingtask.Thesolutionwe proposeleveragesthe fact that current day routers havethousandsof route entries(destinationrouting pre-fixes)in their routing table.Insteadof changingtheforwarding mechanismresponsiblefor distributingtraffic acrossequalcostpaths,weplanto controltheactual(sub)setof shortestpaths(next hops)assignedto routingprefix entriesin theforwardingtable(s)ofa router. In other words, for each prefix we defineasetof allowablenext hops,bycarefullyselectingthissubsetfromthesetof next hopscorrespondingto theshortestpaths computedby the routing algorithm.This allows us to control how traffic is distributedwithout modifying existing routing protocolssuchasOSPFor IS-IS, andwithout requiringchangestothedatapathof currentrouters,i.e., their forwardingmechanism.It does require some changesto thecontrolpathof routersin orderto allow theselectiveinstallmentof next hopsin the forwarding table.

Our initial focusis ongainingabetterunderstand-ing of how well the selective installmentof nexthopsfor different routing prefixescanapproximatean optimal traffic allocation(set of link loads).Weshow that this problemis NP-Completeand henceone must resort to heuristics, Towards this end,we presentheuristics with provable performancebounds and show through experimentsthat theyperform quite well. Even though we study theseheuristics in the context of a routing problem,

we believe that they are generic enough to beof potential use in other load balancing scenar-ios. The main finding from our investigation isthat the performanceachieved by this approachis essentially indistinguishablefrom the optimal.This being said, an obvious drawback of “hand-crafting” thesetof next hopsthatareto beinstalledfor each routing prefix in a router’s forwardingtable, is the configurationoverheadit introduces.The potential magnitude(proportional to the sizeof the routing/forwarding table) of this overheadcould make this approachimpractical.As a result,our next step is to investigatea solution that canhelp mitigate this overhead,albeit at the cost ofa possibledegradationin performance.Specifically,we limit the numberof routing prefixes for whichwe perform the proposedselective installment ofnext hops. Our results indicate that a significantreductionin configurationoverheadcanbeachievedwithout a major lossof performance.

The rest of the paper is structuredas follows.SectionII introducesthelinearprogramformulationusedin [17] to generatean”optimal” setof shortestpaths,andintroducesour proposedmodificationstomake it compatiblewith existing IP routers.Sec-tion III presentsasetof heuristicsfor approximatingan optimal traffic distribution by manipulatingtheset of next hops assignedto each routing prefix.A performancebound is derived for one of theheuristics(seeAppendix).SectionIV presentssev-eral experimentsthat first establishthe efficacy ofthe heuristicsof SectionIII, and then explore theimpact on performanceof lowering configurationoverhead.SectionV provides a brief summaryofthe paper’s contributions and outline directionsforfuture work.

I I . FROM OPTIMAL ROUTING TO SHORTEST

PATH ROUTING

In this section,we first briefly review the clas-sic result from linear programming[2][Chap. 17,Sec.17.2] that was cast in the context of routingin communicationnetworks in [17] to show howoptimal routingcanbe achievedusingonly shortestpaths.We thendiscusswhy this resultis not directlyusablein current IP networks, and finally proposesolutionsthatallow usto implementtheresultunderthe existing paradigm.

The network is modeledasa directedgraph ������ ���with ����� � � nodesand ����� � � directed

Page 3: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

3

links. We assumethe existenceof a traffic matrix�whereentry

� ����������� ��� � denotesthe averageintensity of traffic from ingressnode

�to egress

node�

for commodity "!$# . A good referenceon how to constructsuch a traffic matrix can befound in [5]. Assume that an optimal allocationbasedon somenetwork wide cost function yieldsa set of paths % � for each commodity , so thatthe total bandwidthconsumedby thesepaths1 onlink

�'&(�)*�is +,�-/. . It can be shown that the same

performance,in termsof the bandwidthconsumedon eachlink, canbe achieved with a setof shortestpathsby formulating and solving a linear programandits dual. The linear programcanbe formulatedas ([17])02143 5 -76 .(8:9�; � 9=< � �?>

[email protected]

.(A 5 .?6 -B8:9�; >�-/.DC .EA 5 -B6 .(8:9�; >

�-/. �"F �&HG� ���I���� 2!J#.(A 5 .?6 -B8:9�; >

�.K- C .EA 5 -B6 .(8:9�; >�-/. �$L �& � ��� 2!M#

� 9�N � �?>�[email protected] +,�-@. �Q&=�)*� ! �

F O > �-76 .PO L �'&=�)R� ! �S T!U# (1)

where> �-/. is the fraction of traffic for commodity that flows through link

�'&=�)R�. Solving the linear

programgives a traffic allocation V +> �-76 .=W that con-sumesno more than +,�-76 . amountof bandwidthonany link

�Q&=�)*�. In order to obtain link weights for

shortestpathrouting, thedualof the linearprogramas formulatedin [17] needsto be solved:0TX�Y � 9�NZ6Q[79�\S]

�[_^ C 5 -76 .(8:9�; +,�-/.�`[email protected]

]�.aC ]

�- O `M-/.�b L c� 2!M# d�Q&=�)*� ! �`U-/.Pe F]�f ^ �"Fhg (2)

Thedualgivesa setof link weights V +`U-76 . W , fromwhich a setof shortestpathscanbeconstructedthatare consistentwith the traffic allocation variablesV +> �-76 .�W .

It is, however, important to understandthat al-thoughroutingcannow bedoneovershortestpaths,

1The bandwidthconsumedon a link is assumedto be the sumofthe traffic on all pathsthat usethe link

this is still quite different from the forwardingparadigmcurrently deployed in existing OSPFandIS-IS networks. The reasonsare two-fold andbothcanbetracedto theoutputof theprimal LP, namely,thetraffic allocation V +> �-76 . W . Firstly, observe that thetraffic allocation is for eachcommodityor source-destinationpair. This meansthat the routing proto-col couldpossiblygeneratedifferentsetof next hopsfor each source-destinationpair on which trafficis to be forwarded. This impacts the forwardingmechanismon the data path, as it now needstomake decisionson the basis of both source anddestinationaddresses.

Thesecondproblemrelatesto thefactthatcurrentforwardingmechanismssupportonly equalsplittingof traffic on the set of equal cost next hops. Thelinear program yields a traffic allocation that isnot guaranteedto obey this constraint. Modifyingthe forwarding engineto supportunequalsplitting(see, for example, [16]) of traffic would involvesignificant and expensive changes.The functionused to select the next hop on which to send apacket would have to be modified, and additionalinformationstoredin theforwardingentriesin orderto achieve thedesiredsplit ratios.This changeis allthe moredifficult sinceit impactsthe datapath. Inthe next two sub-sectionswe suggestmethodsthatovercomeboth theseproblems.

A. DestinationBasedAggregation of Traffic

Thefirst problemof translatinga traffic allocationthat distinguishesbetweensource-destinationpairsinto one that only dependson destinationsis rel-atively straightforward. It can be achieved simplyby transforming the individual splitting ratios ofsource-destinationpairs that sharea commondes-tination into a possibly different splitting ratio forthe aggregate traffic associatedwith the commondestination.The reasonthis is possibleis becauseall routesareshortestpaths.Shortestpathshave theproperty that segmentsof shortestpaths are alsoshortestpaths, so that once two flows headedtothe samedestinationmeetat a commonnodetheywill subsequentlyfollow the sameset of shortestpaths.This meansthatwe neednot distinguishthesepacketsbasedon their sourceaddress,andcanmakesplitting and forwardingdecisionssimply basedontheir destinationaddress.

Also observe that, the aggregation of trafficheadedtowardsa destinationcanbe doneoncethe

Page 4: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

4

flow per source-destinationpair hasbeenobtainedfrom solving Eq. (1), or in the forumulationof thelinearprogramitself. If theaggregationis donefromthe source-destinationpair flows, the new splittingratiosthat areto be usedon the aggregatetraffic inorderto achieve the sametraffic distribution canbecomputedas follows.

Let the traffic headingfor destination�

at node&iG� � be denotedasj [- � .EA 5 .?6 -_849�; f A 5 f 6 [7849=< ��?> �-76 . g

Thefractionof destination�

traffic that is forwardedon link

�'&=�)R�is then

k [-/. � f A 5 f 6 [7849=< � �?>�-76 .

j [- gIt canbe readily seenthat using k [-/. asthe fractionof theoverall traffic headedtowarddestination

�and

senton link�'&(�)*�

will maintainthe optimal trafficprofile.

The aggregation of flows could also be donein the formulation of the linear program itself.Indeed, since forwarding is done basedonly ondestination,we canignorecommoditiesandinsteadaggregate traffic basedon the destinationfor anymulti-commodity flow problem. This schemewasfirst presentedin [1]. A re-formulation of Eqns.(1) and (2) in the destinationaggregated form ispresentedbelow.

The primal Linear Programis given by :

021:3 5 -76 .E849�; [79�\ml[-/.

subjectto

.EA 5 .?6 -_849�; l[-/. C .(A 5 -76 .(8:9�; l

[-/. �Con [ �& � �� �p�& � ������ � ���F �q gsrc & ! �t � ! �

[79�u l[[email protected] +,�-@. �Q&=�)*� ! �

wheren [ � � A [ ^wv [ � � g(3)

Note that the variablesof the Primal (Eqn: (3),

l [-76 . , representtraffic on link&=�)

headedtowardsdestination

�. Also, theflow conservationconstraints

have beenmodifiedto accomodatethe aggregation,where n [ representsall traffic headedtowardsdes-tination

�.

The dual is given by :

02XIY [79�uyx[z [ C 5 -76 .(8:9�; +,�-/.�`M-/.

subjectto

][- C ]

[.HO `U-/.{b L c � !}| ~�Q&=�)*� ! �`[email protected] F][[ �"F c � !}|

(4)

where x [ representsthe right handside array intheflow conservationconstraintsof theprimal, Eqn(3), that is :

� [ �Q&(� �C�n [ & � �� � �� & � ���I�� � ���F q gsr

andasbefore, L b +`M-B6 . still representlink weightsforshortestpathcomputation.Theabove formulationisnot only conformal to the paradigmof destinationbased forwarding, but also reduces the numberof variables by a factor of | through removalof information regarding source-destinationpairs.This greatly improves the computationcomplexityinvolved in obtainingan optimal solution.

B. ApproximatingUnequalSplit of Traffic

In the previous sub-section,we saw that solvingthe problemof source-destinationbasedforwardingdecisionswas relatively straightforward. Unfortu-nately, the same does not hold for the unevensplitting issue,and as mentionedearlier providingsuch a capability is a significant departurefromcurrent operations.Our proposalto overcomethisproblemis to take advantageof the fact that today’srouting tables are relatively large, with multiplerouting prefixes associatedwith the same egressrouter. By controlling the (sub)setof next hopsthat eachrouting prefix is allowed to use,we cancontrol the traffic headedtoward a particularegressrouter(destination).In other words, insteadof thecurrent operationthat has all routing prefixes usethe full set of next hops,we proposeto selectivelycontrol this choice basedon the amountof traffic

Page 5: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

5

associatedwith eachrouting prefix and the desiredlink loadsfor an optimal traffic allocation.

Thefollowing exampleillustratesthe ideabehindthe approach.Assumethat at somenode,therearefour routing prefixes, *� �� �� and �� that map toa commondestination� andhave traffic intensities��� *� � ��� ���� �� � ��� ���� �� � ��� and

��� �� � ��� . Lettherebethreeshortestpathsassociatedwith destina-tion � , so that the routing tablehas3 possiblenexthopsto � , andassumethat the optimal distributionof traffic to the threenext hops is

j ����� j �y���and

j ����� . Wecanthenintuitively matchthis trafficdistribution by the following next hop assignment:

*��� Hops1, 3

�� � Hop 1

�� � Hop 3

�� � Hop 2

The resulting traffic distribution isj��� � � �h��� b����L � ��� j��� � � �h��L � ��� j��� � � �h��� b �h��L � ��� ,

which matchesthe optimal allocation.The advantageof the above approachis that the

forwarding mechanismon the data path remainsunchanged,as packets are still distributed evenlyoverthesetof next hopsassignedto aroutingprefix.Thismeansthatacloseapproximationof anoptimaltraffic engineeringsolution might be feasibleevenin the context of existing routing and forwardingtechnologies.Thereare,however, a numberof chal-lengesthatfirst needto beaddressed.Thefirst is theneedfor traffic information at the granularityof arouting prefix entry insteadof a destination(egressrouter).This in itself is not an insurmountabletaskas most of the techniquescurrently usedto gathertraffic data, e.g., router mechanisms’like Cisco’sNetflow or Juniper’s cflowd, canbe readily adaptedto yield information at the granularityof a routingprefix.

Thesecondissueconcernstheconfigurationover-headinvolved in communicatingto eachrouter thesubsetof next hopsto beusedfor eachroutingpre-fix,. This canclearly representa substantialamountof configurationdata, as routing tables are largeand the information that needsto be conveyed istypically different for each router. The approachwe proposeand study is to identify a small setof prefixes for which careful allocation of nexthops is done and rely on default behavior for theremaining prefixes. The trade-off will then be in

terms of how close one can get to an optimaltraffic distribution, while configuring the smallestpossiblenumberof routing prefixes.We investigatethis trade-off in Section IV, where we find thatnear optimum performancecan often be achievedby configuringonly a small numberof routes.

The third and last challengeis to actually for-mulate a methodfor determiningwhich subsetofnext hopsto choosefor eachroutingprefix in orderto approximatean optimal allocation.The goal ofany solution will be to minimize somemetric thatmeasuresdiscrepancy betweentheoptimaltraffic al-locationandthe oneachieved underequal-splittingconstraintson any hop. We explored two metrics:the maximumgap betweenthe optimal traffic andthe allocatedtraffic on any hop, and the maximumloadonany hop,wherethe loadonahopis theratioof the allocatedtraffic and the optimal traffic. Wenote that this allocation problem is a generalizedversion of schedulingunsplittable tasks on a setof processorswith speed-up,andhenceis NP-hard[8]. In the next section,we proposesomesimpleheuristicsthatarebothfastandstill give reasonablygoodperformance.

I I I . HEURISTICS FOR TRAFFIC SPLITTING

Ideally, oneshouldconsidertheproblemof selec-tive next-hop allocationat the global level, that is,do a concurrent optimal assignmentof next hopsfor each routing prefix at each node. However,since even the single node allocation problem isNP-Complete2, we proposeheuristicsthat performindependentcomputationsfor each routing prefixat eachnode. Thesecomputationsare basedonlyon the incoming traffic at the nodeand the desiredoutgoing traffic profile. A potential problem withthis approachis that the traffic arriving at a nodemaynot matchtheoptimalprofile dueto theheuris-tic decisionsat someupstreamnode.Consequently,the profile of the outgoingtraffic from the nodeinquestion,couldfurtherdeviatefrom thedesiredone.However, as we shall see, the heuristicsperformexcellently and henceincoming traffic seenat anynodeandthe resultantoutgoingtraffic have a near-optimalprofile. We proposethreeheuristicsthataregreedyin natureandtry to minimizeoneof thetwometricsmentionedin SectionII-B.

2SeeAppendix for proof

Page 6: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

6

In the following discussion,we focuson a givenegresspoint,andusethewords“stream”and“trafficintensity of a routing prefix associatedwith theegresspoint” interchangeably. The threeheuristicswe proposework broadly in the following fashion.Whenperformingcomputationat an arbitrarynode

1) Orderroutingprefixesdestinedto a particularegress router in decreasingorder of trafficintensity,

2) Sequentiallyassigneachrouting prefix to asubsetof next hopssoasto minimizea givenmetric.

For clarity we use the following notation in oursubsequentdiscussion:At an arbitrary node � , when assigningroutingprefixes associatedwith an arbitrary egressrouter(destination)� to next hops:

1) Denotethe set of next hops to egressrouter� by ��� 6 � � V¡L � g g g ¢ W ����� 6 � �I� ¢.

2) Denotethe desired(optimal) traffic load (foregressrouter � ) on hop £2!¤��� 6 � by

jI¥.

3) Denotethe traffic intensityof routing prefix&

by ¦ - . Denotethe collective setof the routingprefixes(at � for � ) that needto be assignedto next hopsby §¨� 6 � .

4) Denotethe traffic load on hop £ after heuris-tic © has assigned

&routing prefixes by ª -¥ .

Assumeª -¥ �"F for& O F .

We now describethe threeheuristicswe investi-gatein the paper.MAX-MIN RESIDUAL CAPACITY : Thisheuris-

tic triesto assigneachroutingprefixsothatthemin-imum gap betweenthe optimal and desiredtrafficon any hop is maximized.Although this may seemto be at oddswith the goalof matchingthe optimalprofile, the intuition behindsuchan assignmentisto alwayskeepenoughresidualcapacity(differencebetweenoptimal and assignedtraffic) so as to beable to accommodatesubsequentrouting prefixes.Sinceall routing prefixesmustbe allocateda setofnext hops,by keepingenoughresidualcapacitywetry to ensurethatanallocationdoesnot ”overflow”.Algorithm MAX-MIN RESIDUAL-CAPACITY:

1) Sort the set of prefixes §¨� 6 � in descendingorderof traffic intensity.

2) For eachprefix& !«§¨� 6 � choosea subsetof

next hops +¬ !­��� 6 � , with cardinality � +¬ �which maximizes0T1:3 ¥ 9=® � j�¥ C ª -B¯ �¥ b ¦ -� +¬ �

� g

Note that Step2) canbe easilyachieved by simplysorting all the next hops in decreasingorder oftheir residualcapacity

j�¥ C ª -_¯ �¥ , indexing them inthat order, going through an increasingsequenceof¬ � Vh� O £°�T� !±� W £²� L � g�g g �¢

assignmentsover � ¬ � hops and choosingthebest one, ie., one with maximum min gap. Thecomplexity of thealgorithmis ³ � |�´4µ�¶~| b | ¢ � � .The first term representthe complexity of sortingthe streamsand the second term representsthecomplexity of finding the best allocation for eachstream.

MIN-MAX GAP : This heuristic tries to assigneachroutingprefix soasto minimize themaximumgapbetweenthe optimal anddesiredtraffic on anyhop. Observe that even though, the metric usedby this heuristic is the oppositeof that used byheuristicMAX-MIN RESIDUAL CAPACITY, bothessentiallytry to achieve the samegoal. This isbecauseboth heuristicsmustobey the conservationconstraintof assigningall routing prefixes.Algorithm MIN-MAX GAP:

1) Sort the set of prefixes §~� 6 � in descendingorderof traffic intensity.

2) For eachprefix& !·§~� 6 � , choosea subsetof

next hops +¬ !¸� , with cardinality � +¬ �which minimizes

02XIY ¥ 9=® � jI¥ C ª -B¯ �¥ b ¦ -� +¬ �� g

Again, note that this step can be executed inthe same fashion as for MAX-MIN RESIDUALCAPACITY. Also note that the complexity ofthe algorithm is the sameas for the MAX-MINRESIDUAL CAPACITY.

MIN-MAX LOAD :The Min-Max Load heuristicis similar to awork conservingschedulingalgorithmwhich tries to minimize the maximum load onany processor(the maximummakespan).HeuristicMIN-MAX LOAD tries to minimize the maximumratio of assignedtraffic to the optimal traffic loadover all hops.The differencenow is that eachtask(stream)can be split equally amongmultiple pro-cessors(next hops)and the processors(next hops)canhave differentspeeds(optimal traffic loads).

Algorithm MIN-MAX LOAD:

Page 7: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

7

1) Sort the set of prefixes §¨� 6 � in descendingorderof traffic intensity.

2) For eachprefix& !«§¨� 6 � choosea subsetof

next hops +¬ !¸� , with cardinality � +¬ �which minimizes

02XIY ¥ 9=® � ª -B¯ �¥ b ¹ º»�¼½»jI¥ �Step 2) can be achieved in two stages.First, foreachindex ¾y�¿L � g g g ¢ , do a virtual assignmentof routing prefix

&to a set of ¾ hopswhich yields

thesmallestmaximum.This canbedoneby simply

sortingtheset V ª-B¯ �¥ b ¦ - ��¾jI¥ W £T!}� in increasing

order, re indexing them and virtually assigning&

only to the first ¾ hops.Second,from all the

¢such possible assign-

ments,choosethe onewith the smallestmaximumfor an actual assignment.In caseof a tie, choosea lexicographicallysmaller assignment.The com-plexity of the algorithm is again, the sameas forthe first two heuristics.

We outline our implementationof the heuristicsin the pseudo-codebelow :

procedure Selective Hop AllocationInput À �

Link Weights V¡L b +`U-/. W , optimal trafficallocation V +j [-/.�W , Traffic Matrix

� �For each destinationnode � do

Run Dijkstra’s algorithmwith weights V +`U-/. WFor each node � G��� in order of decreasing

distancefrom � doApply the heuristic to the set of routing

prefixes §¨� 6 � to determine,for eachrouting prefix&, the setof next hops � -

For each routing prefix& !}§¨� 6 � do

Update the intensity of the correspondingrouting prefix at eachnode

) !}� -done

donedone

For the sake of continuity, analysisof theheuris-tics is postponedtill the appendix.Therewe showsthat the MIN-MAX LOAD heuristicacievesa loadratio that is within a factor of

� L b ´ 3 ¢ ��� � of theratio achieved by an optimumallocation(undertheequal splitting constraint),while the deviation ofthe MIN-MAX REISUDAL CAPACITY from theoptimalallocationis no morethan L¢ � C ¦ ['Á � ´ 3 ¢Â� b� | C �Ià C L � ¦ ��- � � .

IV. EXPERIMENTS

In order to evaluate the effectivenessof ourapproach,we conductedtwo setsof experimentsonartificially generatedtopologiesas well as on anactual ISP topology, namely the Sprint Backbone.In the first set we studiedthe performanceof ourheuristicswhen comparedagainstoptimal routing.In the secondset of experimentswe studied thetrade-off between performanceand configurationoverheadby varying the numberof routing prefixesfor which we controlled the set of next hops theywereassigned.

For purposesof comparison,we solved a linearmulti-commodity flow routing problem with thesame piecewise linear cost function used in [6].The only constraintin the routing problemis flowconservation and consequentlyit provides a lowerboundon the performanceof any routing scheme,for the samemetric. Henceforth, we shall refer tothis problemas the “optimal routing problem” andits solution as the “optimal routing solution”. Thesolution to this problem is a set of paths (trafficallocation)for eachcommoditywhich yields +,�-/. , thebandwidthconsumedon eachlink (

&(�)*�.

We reproduce the optimal allocation problemwith regardto thiscostfunctionbelow for complete-ness.Let theflow to destinationnode

�on link

�Q&=�)*�be denotedby Ä [-B6 . . Let the total flow on link

�Q&=�)*�isj -76 . � [79�\ Ä [-B6 . and the capacity is Å -B6 . . Denote

the cost of link�'&(�)*�

by Æ -76 . � j -B6 . Å -76 . � , which isa piecewise linear function that approximatesanexponentially growing curve (the exact piecesofthe function are presentedin Eqn. (6) - Eqn. (11).The cost grows as the traffic on the link increasesand the rate of growth accelerateswith increasingutilization. Evolution of the cost function with linkload is shown in Figure 1. The optimal routingproblemmay thenbe formulatedas:

021:3 5 -76 .E849�; Æ -76 . � j -B6 . Å -B6 . �subjectto

.EA 5 .?6 -_849�; Ä[.?6 - C .(A 5 -76 .(8:9�; Ä

[-76 . �C�n [ & � �� � & � ���F o.w.

(5)

c & ! ���� ! �j -76 . � [79�\ Ä[-76 .

Page 8: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

8

Çp-B6 . � j -76 . ��Å -76 . c �Q&=�)*� ! �Æ -76 . � j -76 . ÇÈ-76 . O L���É (6)

Æ -76 . ��É j -76 . C �É Å -76 . L���É O ÇÈ-76 . O �h��É (7)

Æ -76 . �¿LIF j -76 . C LI�É Å -76 . �h��É O ÇÈ-76 . O �h��L�F (8)

Æ -76 . ��Ê�F j -76 . C LIÊ��É Å -B6 . �h��LIF O ÇÈ-76 . O L (9)

Æ -76 . ����F�F j -76 . C LI�����É Å -76 . L O ÇÈ-76 . O L�LLIF (10)

Æ -76 . ����F�F�F j -B6 . C LI��É¡LI�É Å -B6 . L�L���L�F O ÇÈ-76 . (11)

where,asbeforen [ � � A [ ^�v [ � �(12)

Notethattheapproachespresentedin [17] and[6]are not limited to any particularcost function. Wesimply chosethis costfunctionasanexample.Alsonotethat theabove formulationis basedon the ideaof destinationbasedaggregation ([1]) presentedinSectionII-A andwasusedfor computationbecauseof its lower complexity. In the rest of the section,we explain our experimentalsetup anddiscussourobservationsregardingperformanceandcomplexitytrade-off.

0 20 40 60 80 100 1200

1

2

3

4

5

6x 10

4

Link Load f

Co

st

Fu

nctio

n Φ

(f,C

)

Fig. 1. Evolution of cost function Ë ºQÌ ÍwÎ7Ï?ºQÌ Í�ÐKÑ�º'Ì ÍwÒ as a function oflink load Ï º'Ì Í

A. ExperimentalSetUp

For our experiments,theartificial topologiesweregeneratedusing the Georgia Tech [18] and BRITE[12] topologygenerators3. Thetopologiesgeneratedusing the generatorswere random graphschosenfrom agrid usingeitherauniform or waxmandistri-bution.Thelink capacitieswereeithersetuniformlyto 500 Mbps in somecases,or chosenrandomlyfrom a uniform distribution in other cases.Actualphysicallink capacitieswereusedfor the topologybasedon the Sprint backbone.

For the artificially generated topologies, wepresentresultsfor randomtraffic matricesthatweregeneratedby picking the traffic intensity of eachroutingprefix from a paretoor uniform distribution.The choice of a paretodistribution was motivatedby measurementstaken from several routerson theSprint backbone.We also experimentedwith otherdistributions,eg. uniform, bimodal,exponentialandgaussian.Of these,we presentresultsfor the uni-form distribution since it is representative of theothers.The other parameterof importanceis thenumberof routingprefixassociatedwith eachegressrouter, that is, thegranularityof thematrix.For this,weagainusedbothuniformandparetodistributions,asit givesareasonablecoveragefor thepossibledif-ferencein the numberof available routing prefixesto a given egressrouter. The Sprint traffic matrixwasbasedon actualtraffic tracesdownloadedfromaccesslinks to two of the Sprint backbonerouters.The traceswas measuredat the granularityof therouting tableentries4 andgivesus two rows of thetraffic matrix. The routing prefix intensitiesin theremainingrows were generatedartificially using aparetodistribution for bothintensityandgranularity.

Eachexperimentwasconductedin the followingfashion:

1) For eachnetwork topology, we generatedran-dom traffic matrices,varying both the totalnumberof routing prefixes and distribution5

from which theingresstraffic intensityof eachrouting prefix waspicked.

2) Hot spotswereintroducedin thetraffic matrixby randomlyselectingelementsfrom thetraf-fic matrix and scalingthem to createseveral

3BRITE allows several options for generating topologies: ASLevel, Hierarchicalandrouterlevel. We chosetherouterlevel option.

4The routing prefix intensitiesareaveragesover 10 hrs.5Exceptin the caseof the Sprint traffic matrix.

Page 9: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

9

instancesof thetraffic matrix.We testedcaseswhereonly someof the traffic elementswerechosenandalso caseswhereall entrieswerechosen.In thelattercase,this involvesscalingthe entire traffic matrix.

3) The “optimal routing problem” (12) wasthensolved for eachsuch instance(topology andtraffic matrix).

4) The linear program(3), with the optimal linkbandwidthsfrom the “optimal routing solu-tion” asinput, wassolved to obtainthe trafficallocation (which was aggregated basedondestination,ref. SectionII-A) and the set oflink weights.

5) Finally, the threeheuristicswererun over thenetwork with thelink weightsandtraffic flowsfrom thepreviousstep(pleasereferto pseudo-code).

We usedILOG CPLEX to solve theoptimalroutingproblemandthe linearprogram(3). On a Dell 25001 Ghz machine, it took about � hours to solvethe optimal routing problem and L�F minutes forthe LP (Eqn. (3)) and our heuristicson the largestnetworks.

0 2 4 6 8 10 12 14 16

x 109

0

0.5

1

1.5

2

2.5

3

3.5

4x 10

6

Total Traffic

Co

st F

un

ctio

n

Sprint Backbone

OptimalMax−Min Residual CapacityMin−Max Residual GapMin−Max Load F&T HeuristcMax link util=100%Max link util=70%Max. link util=25%

Fig. 2. Sprint Backbone:Performanceof the 3 heuristics

B. PerformanceComparisonagainstOptimalRout-ing

We now presentand discussthe resultsof ourexperiments.In Figure2 we plot Costvs Total Traf-fic Demandfor all the 3 heuristicsand the optimal

0.4 0.6 0.8 1 1.2 1.4 1.6

x 1010

0

0.1

0.2

0.3

Total Traffic

Pe

rce

nta

ge

De

via

tio

n f

rom

Op

tim

al

ISP Backbone

Max−Min Residual CapacityMin−Max Residual GapMin−Max Load

Fig. 3. SprintBackbone:%Deviationof the3 heuristicsfrom optimal

0 0.5 1 1.5 2 2.5 3

x 109

0

1

2

3

4

5

6

7

8

9x 10

5

Co

st F

un

ctio

n

Total Traffic

50 Nodes 200 Edges Graph

OptimalMax−Min Residual CapacityMin−Max Residal GapMin−Max LoadF&T HeuristicMax Link Util=0.90Max Link Util=0.75Max Link Util=0.60Max Link Util=0.25

Fig. 4. BRITE 50 Node 200 Edge graph: Performanceof the 3heuristicswith an averageof 26500routing prefixesper node

routing for theSprint topology. Thehorizontallinesrepresentvarious levels of maximum averagelinkutilization over all links for optimal routing. Theentire traffic matrix wasscaledfor the experimentsinvolving the Sprint backbone.backbone.From thefigure, we seethat in all the cases,the heuristicsare very near the optimal solution indicating thatthey are able to match the optimal traffic splitvery closely. Moreover, all threeheuristicsperformequally well in all instances.For comparison,wehave alsoshown theperformanceof standardOSPFrouting with weights computedusing our imple-

Page 10: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

10

0 0.5 1 1.5 2 2.5 3

x 109

0

0.2

0.4

0.6

Total Traffic

Pe

rce

nta

ge

De

via

tio

n f

rom

Op

tim

al

50 Nodes 200 Edges Graph

Max−Min Residual CapacityMin−Max Residual GapMin−Max Load

Fig. 5. BRITE 50 Node 200 Edge graph:% Deviation of the 3heuristicsfrom theoptimalwith anaverageof 26500routingprefixesper node

0 0.5 1 1.5 2 2.5

x 1010

0

5

10

15x 10

5

Total Traffic

Co

st F

un

ctio

n

30 Nodes 238 Edges Graph

OptimalMax−Min Residual CapacityMin−Max Residual GapMin−Max Load F&T HeuristcMax link util=100%Max link util=70%Max. link util=25%

Fig. 6. GT TopologyGenerator30 Nodegraph:Performanceof the3 heuristicswith an averageof 17000routing prefixesper node

mentationof the heuristicproposedin [6] (denotedby “F&T Heuristic” in the graph). In Figure 3,we plot the % deviation from the optimal for allthreeheuristics.We canseeagainthat theheuristicsclearly perform very well ( well within LIÓ of theoptimal).

We now look at the artifically generatedgraphs.In Figure4, we plot costvs total traffic demandforall 3 heuristicsand optimal routing on a 50 Node200Edgegraphwith a granularityof 26500routing

0 0.5 1 1.5 2 2.5 3

x 1010

−0.5

0

0.5

1

1.5

2

2.5

3

3.5

4

Total Traffic

Pe

rce

nta

ge

De

via

tio

n f

rom

Op

tim

al

30 Nodes 238 Edges Graph

Max−Min Residual CapacityMin−Max Residual GapMin−Max Load

Fig. 7. GT TopologyGenerator30 Nodegraph:%Deviation of the3heuristicsfrom theoptimalwith anaverageof 17000routingprefixesper node

1 1.5 2 2.5 3

x 1010

0

1

2

3

4

5

6

7x 10

6C

ost F

un

ctio

n

Total Traffic

60 Nodes 250 Edges Graph

OptimalMax−Min Residual CapacityMin−Max Residal GapMin−Max LoadF&T HeuristicMax Link Util=1.00Max Link Util=0.75Max Link Util=0.60Max Link Util=0.25

Fig. 8. 60 Node 250 Edge graph (GT Topology Generator):Performanceof the heuristicvs. optimal routing

prefixesper node.This numberwaschosensimplyasan approximationof the numberof routing pre-fixes in a backbonerouter. We have conductedex-perimentswith upto100,000routingprefixesandasfew as500 routing prefixeswithout any significantchangein performance.The graph was generatedusing the BRITE generatorand all links were setto a capacityof 500Mbps.The entriesin the trafficmatrix for this experimentwere generatedfrom aParetodistribution andwasscaledby selecting70%

Page 11: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

11

0.5 1 1.5 2 2.5 3 3.5

x 1010

10−3

10−2

10−1

100

101

102

103

Total Traffic

% D

evia

tio

n fro

m O

ptim

al

60 Nodes 250 Edges Graph

Max−Min Residual CapacityMin−Max Residual GapMin−Max Load

Fig. 9. 60 Node 250 Edge graph (GT Topology Generator):%Deviation of the heuristicfrom optimal routing

0.8 0.9 1 1.1 1.2 1.3

x 1010

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

x 106

Co

st F

un

ctio

n

Total Traffic

50 Nodes 154 Edges Graph

OptimalMax−Min Residual CapacityMin−Max Residal GapMin−Max LoadF&T HeuristicMax Link Util=1.00Max Link Util=0.75Max Link Util=0.60Max Link Util=0.25

Fig. 10. 50 Node154 EdgeGraph:Performanceof the 3 heuristics

of the traffic elementsas hotspots.In Figure 5 weplot the % deviation from the optimal for the threeheuristics.The low percentagedeviation

� Fhg'��Ó CL�Ó � from theoptimalvaluehighlightshow effectivethe heuristicsare.

Resultsfor a 30 Node238Edgegraphareshownin Figure6 andFigure7. The graphwasgeneratedby the Georgia Tech topology generatorusing theuniform distribution and their link capacitieswereset to 500 Mbps. To generatehotstpots,around70%of the entriesin the traffic matrix werescaled,with entriesbeingchosenfrom aParetodistribution.

0.7 0.8 0.9 1 1.1 1.2 1.3 1.4

x 1010

10−2

10−1

100

101

102

Total Traffic

% D

evia

tio

n fro

m O

ptim

al

50 Nodes 154 Edges Graph

Max−Min Residual CapacityMin−Max Residual GapMin−Max Load

Fig. 11. 50 Node154 EdgeGraph:%Deviation of the 3 heuristicsfrom optimal

0 1 2 3 4 5 6 7 8

x 1010

0

2

4

6

8

10

12

14

16

18x 10

6

Co

st F

un

ctio

n

Total Traffic

50 Nodes 206 Edges Graph

OptimalMax−Min Residual CapacityMin−Max Residal GapMin−Max LoadF&T HeuristicMax Link Util=1.00Max Link Util=0.75Max Link Util=0.60Max Link Util=0.25

Fig. 12. 50 Node206 EdgeWaxmanGraph:Performanceof the 3heuristics

Again, we seethat theheuristicsperformquitewellfor the 30 Nodegraph,closelytrackingthe optimalin Figure 6 and deviate less than 4% from theoptimal (Figure7).

Till now we have seenthatall the threeheuristicsperform equally well. However, this pattern failson the next three topologiesthat we present.Forall the 3 remaining topologies, the traffic matrixgranularityaswell as intensitieswerechosenfroma uniform distribution. Hotspotswere createdbyscaling50% of the node-pairsin the traffic matrix.

Page 12: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

12

0 1 2 3 4 5 6 7 8 9

x 1010

10−4

10−3

10−2

10−1

100

101

102

103

104

Total Traffic

% D

evia

tio

n fro

m O

ptim

al

50 Nodes 206 Edges Graph

Max−Min Residual CapacityMin−Max Residual GapMin−Max Load

Fig. 13. 50 Node206 EdgeWaxmanGraph:% Deviation of the 3heuristicsfrom the optimal

Costand% deviation curvesfor a 60 node250edgetopologyareshown in Figures8 and9 respectively.Resultsfor a 50 node 154 edgerandomtopologyare presentedin Figures10 and 11. Both topolo-gies were generatedby the Georgia Tech topologygeneratorusing the uniform distribution and theirlink capacitieswere set to 500 Mbps. Finally, thecostand% deviation graphsfor a 50 node206edgerandomtopologygeneratedusingthewaxman-1dis-tribution areshown in Figures12 and13. The linkcapacitiesfor this topology were randomlychosenfrom a uniform distribution. Observe, that for all 3topologies,theMAX-MIN RESIDUAL CAPACITYandMIN-MAX GAP deviate significantlyfrom theoptimal in someinstances(this is mostpronouncedfor the 60 node 250 edge topology, Figure 9).However, theMIN-MAX LOAD performsverywellin all instances.

C. Non-Integer Link Weights

We should point out that the dual of the linearprogram,Eqn. (4), is not guaranteedto yield inte-ger solutionsfor link weights. In practice,routingprotocols like OSPFand IS-IS have a finite fieldwidth for link weight information. If the weightsarenon-integer, fitting theminto the providedfield,canresult in lossof precision,which in turn affectsthe routing andperformance.Henceit is importantto study how the availableprecisionaffectsperfor-mance.In the experimentswe performed,the link

weightsalmostalways turnedout to be integersatlow utilizations ( ÔÕÊ���Ó � , but at higher loads, thelink weightscould be non-integer6. For suchcases,we roundedof thenon-integerweightsto 5 decimalplacesandthenscaledthemto obtainintegers.Thisoperationrequiresa field width of around17 bits.Current implementationsof OSPFand IS-IS allowonly 16 and 8 bits respectively, but their trafficengineeringextensionsallow at least24 bits for linkweights([10], [15]) which is sufficient. The impactof field lengthon % deviation from optimalcostforthe50 Node206WaxmanGraphis shown in Figure14for field lengthsof 8, 16and24bits.Observethatthefield-lengthhaslittle impacton performancetilllink utilizations start increasingat which point thelower masked weightsstart to deviate significantly(around LIF�F�Ó utilization).

We notehowever, that rounding-off is not alwaysthe main source of imprecision in link weights.There are two other factors which play a majorrole in theprecisionof link weights.First of courseis the fact that computersare finite precisionma-chinesandhencethe weightsobtainedare actuallyintegers approximations.The secondfact is thatthe optimizationsolver we used,or for that matter,any solver, requiresa non-zerotoleranceto yieldfeasiblesolutions.Thesetwo causescombinedcanresult in link weightsthat are accurateonly withina certaintolerance.Consequently, roundingoff thelink weights to obtain integer weights, will carryover the inaccuracy, regardlessof the field width.Specifically, loss of precision is not causeddueto round-off, but rather due to intrinsic factors inthe optimization.While we did not encountersuchproblemsat low to medium link utilizations, thisoften resultedin imprecisionat higher utilizations.As anexample,in Figure14, for the lastdatapoint,observe that even the 24 bit maskresultsin an de-viation of around��F�Ó . To circumvent this problem,we currentlyassumelink weightsto beequalif theyare within the specifiedtolerance.However, otherapproacheswhich inherently yield integer weightsarerequiredin order to truly overcomethis issue.

Finally, we should mention that the load bal-ancing techniquewe have presentedin the paper,namely, pruning the set of next hops for routingprefixes,remainsunaffectedby this issue.Rather, it

6Strictly speaking,the weightswere integers,becausea computeris a finite-precisionmachine

Page 13: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

13

0 1 2 3 4 5 6

x 1010

10−4

10−3

10−2

10−1

100

101

102

103

% D

evia

tio

n fro

m O

ptim

al

Total Traffic

50 Nodes 206 Edges Graph

Mask − 16Mask − 24Mask − 8

Fig. 14. 50 Node206WaxmanTopology:Impactof Integerweightson performance

is the linear programusedto computeweightsthatis at the core of this issue.Indeed,we could havecomputedthe link weightsandtraffic allocationbyany other methodand then applied our heuristicsin order to attain the desired profile within theconstraintsimposedby theforwardingparadigm.Asan example, one could computethe link weightsusing the heuristic proposedby Fortz et. al. [6],and then potentially improve performancethrougha more fine-tunedload balancingof traffic on thecomputedpaths,insteadof splitting traffic equally.If the constrainingfactor was a poor set of linkweights,then modifying the traffic profile may notoffer much improvement.However, if the decidingfactorwascoarsegranularityof loadbalancingoverthepaths,significantimprovementmaybe obtainedwith our heuristic.We presentan example of thisprocedurefor the 60 Node 250 Graph in Figure15, were we show the performanceimprovementof traffic allocation with the MIN-MAX LOADheuristic over equal splitting. Note that for boththe curves, the sameset of paths were used,butthe MIN-MAX Load heuristicallows for finer loadbalancing.In order to achieve this, we first solvethe optimal routing problem but with traffic nowconstrainedto flow only on pathscomputedusingthelink weightsobtainedfrom the’F & T’ heuristic.TheMIN-MAX LOAD heuristicis thenrun to shapethe allocatedtraffic to matchthe optimal load.

1 1.5 2 2.5 3

x 1010

0

1

2

3

4

5

6

7x 10

6

Co

st F

un

ctio

n

Total Traffic

60 Nodes 250 Edges Graph

OptimalF&T With Min−Max Load BalancingF&T HeuristicMax Link Util=1.00Max Link Util=0.75Max Link Util=0.25

Fig. 15. 60 Node250EdgeTopology:Impactof MIN-MAX LOADHeuristicon load balancingwith weightscomputedusingFortz andThorupHeuristic

D. LoweringConfiguration Overhead

Our other goal was to investigatethe trade-offbetweenconfigurationoverheadand performance.Recall that in the original approachthe heuristicsdecide the subsetof next hops assignedto everyrouting prefix. However, it hasbeenobserved thatin practice ([3]), a large fraction of the traffic isdistributed over a relatively small numberof rout-ing prefixes. Our analysisof the backbonetracesobtainedfrom the Sprint router show that 95% ofthe total traffic was accountedfor by only 10% ofthe routing prefixes,confirmingthe resultsreportedin [3]. Figure16 highlights this observation, wherewe have plottedthe cumulative traffic intensityasafunctionof the numberof routing prefixessortedindecreasingorder of their traffic intensities.We canpotentially exploit such a phenomenonby config-uring the set of next hopsfor only a few selectiverouting prefixes that carry most of the traffic andallowing the default assignmentof all next hopsfor the remaining routing prefixes. This has theadvantageof lowering configurationoverhead,butraisesthe questionof how it impactsperformance.

We carried out a systematicstudy of such atrade-off on all the previous topologies. In eachinstance,we configuredthesetof next hopsat eachnodefor only a certainset of routing prefixes thatwere selectedbasedon the amountof traffic theycarried.The remainingrouting prefixes were split

Page 14: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

14

equally over the entire set of next hops as wouldhappenwith defaultOSPF/IS-ISbehavior. Thesetofconfiguredrouting prefixes was then progressivelyincreasedin eachexperimentto determinethe evo-lution of the impact on performance.In all casesthe MIN-MAX LOAD heuristic was used whenconfiguringthe setof next hops.

Theresultingperformancecurvesfor the50Node200 Edge graph are shown in Figure 17 and thenumber of configuredrouting prefixes are shownin Table I. Each curve on the plot is referencedby the amountof traffic that was accountedfor bythe configuredrouting prefixes. This can be cross-referencedfrom the table against the number ofrouting prefixes that were configured.We observethat on an average,by configuringabout165 rout-ing prefixes per router we get good performancetill about 50% maximum link utilization. If weconfigurenext hops for about17 % of all routingprefixes,or 4500entries,at a router, we accountforapproximately75% of the traffic and the resultingperformanceis quitecloseto thatof optimalrouting.

Experimentsconductedon the Sprint Backbone(Figure18, TableII) yield similarly encouragingre-sults.We getgoodperformanceup to approximately50-60% maximum link utilization, by configuringonly 200 routingprefixesper routerandup to morethan70%link utilization if weconfigure600routingprefixesper router.

The 50 node 154 edgetopology (Table V, andFigure 21), 60 node250 edgetopology (Table IVand Figure 20) and the 30 node 238 edge graph(Table III, Figure 19) presentan interestingcasein that by carefully configuring only É C � % ofthe prefixes we get close to optimal performanceover a very wide rangeof link utlizations.We donot get quite suchdramaticresultsfor the 50 node206edgeWaxmangraph,(TableVI andFigure22),but even in this case,we get goodperformancebyconfiguring arounda quarterof the prefixes ( ��F�Óof the traffic).

V. CONCLUSION

In this paper, we have describedand evaluatedan approachthat has the potential for providingthe benefits of traffic engineeringto existing IPnetworks, i.e., without requiring changesto eithertheroutingprotocolsor theforwardingmechanisms.

Our contribution is three-fold.First, we proposea solution whereby we can closely approximate

0 1 2 3 4 5 6 7

x 105

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1Cumulative Contribution of Routes at an ISP Router

Route Index in Decreasing Order of Intensity

Fra

ctio

n o

f T

ota

l R

ate

Fig. 16. Cumulative contribution of routing prefixes at a SprintRoutersortedin decreasingorderof intensity.

Prefixes Total No. of % Allocated % ofConfigured Prefixes Fraction Traffic

75 26500 0.3% 10 %165 26500 0.62% 20 %1252 26500 4.7 % 50 %4500 26500 17.0 % 75 %11747 26500 44.32% 90 %

TABLE I

CONFIGURATION OVERHEAD: 50 NODE 200 EDGE GRAPH, ALL

ENTRIES ARE PER NODE

0 0.5 1 1.5 2 2.5

x 109

0

1

2

3

4

5

6

7

8x 10

5

Total Traffic

Co

st F

un

ctio

n

50 Nodes 200 Edges Graph

Optimal10%20%50%75%90%Max Link util=100%Max Link Util = 50%Max Link Util=25%

Fig. 17. BRITE 50Node200Edgegraph: Performanceasa functionof configurationoverhead

Page 15: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

15

Prefixes Total No. of % Allocated % ofconfigured Prefixes Fraction Traffic

160 30700 0.5% 10 %200 30700 0.6% 20 %620 30700 2 % 50 %1750 30700 6 % 75 %4180 30700 14 % 90 %

TABLE II

CONFIGURATION OVERHEAD: SPRINT BACKBONE, ALL ENTRIES

ARE PER NODE

4 5 6 7 8 9 10 11 12 13 14

x 109

1

2

3

4

5

6

7x 10

5

Total Traffic

Co

st F

un

ctio

n

ISP Backbone

Optimal10%20%50%75%90%Max Link util=90%Max Link Util=70%Max Link Util = 50%Max Link Util=25%

Fig. 18. SprintBackbone:Performanceasa functionof configurationoverhead

the optimal link loads without changing currentforwarding mechanisms,namely, by carefully con-trolling thesetof next hopsfor eachprefix.Second,we proposethree simple heuristics,with provableperformancebounds (see Appendix) in order toimplement our solution. All three hueristicswereshown to give good resultsin most of our experi-ments,and the Min-Max Load heuristicin particu-lar, performedexcellently in all of them.We believethe heuristicsare generalenoughto be potentiallyusefulin their own right. Finally, we showed,usingactual traffic traces, that configuration overheadcan be vastly reducedwithout significant loss ofperformance.Specifically, by only configuringnexthops for a small set of prefixes, we were able toobtain near-optimal performancefor link loads ofup to 70%.This is obviouslyanimportantaspectforthe practicaldeployment of our traffic engineeringsolution.

There are clearly many other aspectsthat need

Avg No. of Total No. of % Allocated % ofRoutes Routes Fraction Traffic

configured37 17000 0.2% 10 %88 17000 0.5% 20 %585 17000 3.5 % 50 %2761 17000 16.2 % 75 %7500 17000 43 % 90 %

TABLE III

CONFIGURATION OVERHEAD: 30 NODE 238 EDGE GRAPH, ALL

ENTRIES ARE PER NODE

0 0.5 1 1.5 2 2.5 3

x 1010

0

0.5

1

1.5

2

2.5x 10

6

Total Traffic

Co

st F

un

ctio

n

30 Nodes 238 Edges Graph

Optimal10%20%50%75%90%Max Link util=90%Max Link Util = 50%Max Link Util=25%

Fig. 19. GT-ITM 30 Node 238 Edge Graph: Performanceas afunction of configurationoverhead

Avg No. of Total No. of % Allocated % ofRoutes Routes Fraction Traffic

configured5125 117890 4.3% 10 %14183 117890 12.0% 20 %32754 117890 27.5 % 50 %57940 117890 49.1 % 75 %80500 117890 68.2% 90 %

TABLE IV

CONFIGURATION OVERHEAD: 60 NODE 250 EDGE TOPOLOGY

(GT TOPOLOGY GENERATOR), ALL ENTRIES ARE PER NODE

Page 16: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

16

1 1.5 2 2.5 3

x 1010

0

1

2

3

4

5

6

7x 10

6

Total Traffic

Co

st F

un

ctio

n60 Nodes 250 Edges Graph

Optimal10%20%50%75%90%Max Link util=90%Max Link Util = 75%Max Link Util = 50%Max Link Util=25%

Fig. 20. GT-ITM 60 Node 250 EdgeTopology: Performanceas afunction of configurationoverhead

0.8 0.9 1 1.1 1.2 1.3

x 1010

0

0.5

1

1.5

2

2.5

3

3.5

4

4.5

x 106

Total Traffic

Co

st F

un

ctio

n

50 Nodes 154 Edges Graph

Optimal10%20%50%75%90%Max Link util=90%Max Link Util = 75%Max Link Util = 50%Max Link Util=25%

Fig. 21. 50 Node 154 Edge Graph:Performanceas function ofconfigurationOverhead

Prefixes Total No. of % Allocated % ofconfigured Prefixes Fraction Traffic

5640 145827 3.8% 10 %15805 145827 10.8% 20 %37108 145827 25.4 % 50 %67132 145827 46 % 75 %95420 145827 65.4 % 90 %

TABLE V

CONFIGURATION OVERHEAD: 50 NODE 154 EDGE GRAPH, ALL

ENTRIES ARE PER NODE

0 1 2 3 4 5 6 7 8

x 1010

0

1

2

3

4

5

6

7

8

9x 10

6

Total Traffic

Co

st F

un

ctio

n

50 Nodes 206 Edges Graph

Optimal10%20%50%75%90%Max Link util=90%Max Link Util = 75%Max Link Util = 50%Max Link Util=25%

Fig. 22. 50Node206EdgeWaxmanGraph:Performanceasfunctoinof configurationOverhead

Prefixes Total No. of % Allocated % ofconfigured Prefixes Fraction Traffic

4221 97772 4.3% 10 %11706 97772 11.9% 20 %27131 97772 27.5 % 50 %48098 97772 49.8 % 75 %

3345244 97772 68 % 90 %

TABLE VI

CONFIGURATION OVERHEAD: 50 NODE 206 EDGE WAXMAN

GRAPH, ALL ENTRIES ARE PER NODE

to be addressedin order to formulate a fully op-erationalsolution to traffic engineering.One topicwe are currently investigating is that of makingour traffic engineeringsolutionrobustto unexpectedchangesin network topology, e.g.,link or routerfail-ures.This is obviously an importantaspectandoneof the areasthat traffic engineeringsolutions thatrely on new forwarding technologies,e.g., MPLS,have focusedon, e.g.,[11]. Thegeneraldirectionofthe solutionwe arepursuing,is to computea setoflink weightsthatarerobustenoughto givegoodper-formancein normalor failurescenarios.In caseof afailure, the assignmentof next hops,asproposedinthe paper, would eitherremainthe same,or defaultto the standardassignmentfor routing prefixesthataregoing over an entirely differentsetof next hopsafter the failure. We are currently investigatingtheperformanceand robustnessof such an approach.See [7] for similar work. Another issue we are

Page 17: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

17

investigating,is that in somecaseswe encountered,link weights were not integer. Although this didnot happenoften, it is still problematicsince thedual of the LP (2) doesnot guaranteeinteger linkweightsas requiredby OSPF/IS-IS.In our experi-ments,we roundedoff the link weights to 5 digitaccuracy. However sucha solutionmay not alwaysbefeasibledueto thelimited field lengthfor weightsin currentOSPFand IS-IS protocols,which limitsscaling.Instead,we are investigatingmethodsthatwill allows us to obtain integer weightsconsistentwith theoriginal weights.We believe thata solutionthataccountsfor bothproblemsis feasible,andthatthis work representsanotherargumentin favor ofevolving the currentinfrastructureto supporttrafficengineering,if andwhenneeded,ratherthanembarkon a migration to a rather different technology.There may be justifications for such a migration,e.g.,bettersupportfor policiesor VPNs,but trafficengineeringdoesnot appearto be oneof them,andwehopethattheresultsof thispapercanhelpclarifythis issue.

REFERENCES

[1] H. Abrahamsson,B. Ahlgren, J. Alonso, A. Andersson,andP. Kreuger. A multi path routing algorithm for IP networksbasedon flow optimization. In International Workshop onQualityof future InternetServices, Zurich,Switzerland,October2002.

[2] R. K. Ahuja, T. L. Magnanti,andJ. B. Orlin. NetworkFlows,Chapter17, Section17.2. Prentice-HallInc., 1990.

[3] S.Bhattacharya,C. Diot, J.Jetcheva,andN. Taft. Pop-level andaccess-linklevel traffic dynamicsin a tier-1 pop.ProceedingsofACM SIGCOMMInternetMeasurementWorkshop(IMW 2001),November2001.

[4] R. Callon. Use of OSI IS-IS for routing in TCP/IP and dualenvironments. RequestFor Comments(Standard)RFC 1195,InternetEngineeringTaskForce,December1990.

[5] A. Feldmann,A. Greenberg, C. Lund, N. Reingold,J. Rexford,and F. True. Deriving traffic demandsfor operationalIP net-works: Methodologyandexperience.IEEE/ACM Transactionson Networking, 9, 2001.

[6] B. Fortz and M. Thorup. Internet traffic engineeringbyoptimizing OSPFweights.In Proceedingsof INFOCOM’2000,Tel Aviv, Israel,March 2000.

[7] B. Fortz and M. Thorup. OSPF/IS-ISweights in a changingworld. Journal on SelectedAreasin Communication, February2002.

[8] M. R. Garey andD. S. Johnson.Computers and Intractability:A Guide to the Theory of NP-Completeness. W. H. FreemanandCompany, 1979.

[9] R. L. Graham. Boundson multiprocessingtiming anamolies.SIAM Journal on AppliedMathematics, 17, March 1969.

[10] D. Katz, D. Yeung, and K. Kompella. Traffic Engineer-ing Extensions to OSPF Version 2. INTERNET-DRAFT,draft-katz-yeung-ospf-traffic-09.txt, October2002. work in progress.

[11] M. Kodialamand T.V. Lakshman. Dynamic routing of band-width guaranteedtunnelswith restoration.In PROCEEDINGSof INFOCOM, Tel Aviv, Israel,March 2000.

[12] A. Medina,A. Lakhina,I. Matta,andJ. Byers.BRITE: Bostonuniversity representative internettopology generator. http://cs-www.bu.edu/brite,BostonUniversity, April 2001.

[13] J. Moy. OSPFVersion 2. RequestFor Comments(Standard)RFC 2328, InternetEngineeringTaskForce,April 1998.

[14] E. Rosen,A. Viswanathan,andR. Callon. Multiprotocol labelswitching architecture. RequestFor Comments(StandardsTrack) RFC 3031, Internet EngineeringTask Force, January2001.

[15] H. Smit. IS-IS Extensionsfor Traffic Engineering.INTERNET-DRAFT, draft-ietf-isis-traffic-04.txt, Decem-ber 2002. work in progress.

[16] C. Villamizar. OSPF optimized multipath OSPF-OMP.INTERNET-DRAFT, draft-villamizar-ospf-omp-01.txt, February1999. (work in progress).

[17] Z. Wang, Y. Wang, and L. Zhang. Internet traffic engineer-ing without full meshoverlaying. In Proceedingsof INFO-COM’2001, Anchorage,Alaska,April 2001.

[18] E. W. Zegura. GT-ITM: Georgia TechInternetwork Topology Models (software).http://www.cc.gatech.edu/fac/ellen.zegura/gt-itm/gt-itm/tar.gz,Georgia Tech,1996.

APPENDIX

We first show that problem of splitting prefixesequally over a subsetof next hopsis NP-completeeven at a single node.The problem can be statedas follows.

INSTANCE: A set of flows Ö × ØhÙ�ÚKÛ withintensity of flow Ù�Ú denotedby ÜdÝÙ�ÚÞ , set of nexthops ß�×àØháRâ�ã á�ä�ã å å åIã áçæHÛ with capacitiesof hopáçÚ denotedby èDÝáçÚÞ andan integer é .

QUESTION: Is thereallocationof eachflow Ù�Ú to asetof next hops,ie., êMëìÙ�Útí Øhá*âEã á�ä�ã å å�å�ã áçæHÛ anda split î�ÝïÙ�ÚïÞ¨×�ð�êDÝÙ�ÚÞHð , suchthat

ñ2òIóô(õ=ö Ø÷ ô

èøÝá ô Þ Û ù éwhere

÷ ô × ú�û4ü ôEõ�ý�þ ú�û4ÿ ÜdÝÙ�ÚÞî�ÝïÙ�ÚÞ .We restrict the above problemby setting �Õ×�� .

This implies that now eachflow hasthreechoices,either hop � , or hop � , or be equally split onboth hops. Observe that this is equivalent to apriori splitting a flow Ù�Ú into two equalflows withhalf the intensity and then making independentdecisionsfor eachhalf, but now without splittingthenew flows. This new problemcanbe recastintoa modifiedform of thepartitionproblem,which we

Page 18: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

18

call MODIFIED-PARTITION, definedas follows.

INSTANCE: A setof elements� ×�Ø�� ÚÛ��PØ�� Ú Û withsize of element �çÚ given by ÙRÝ��çÚÞ , and an integer� . Further, Ù*Ý �çÚÞ�×$Ù*Ý � Ú Þ , that is, for eachelement�çÚ in � , thereis an element � Ú with identical size.We shall henceforthrefer to this as the mirroringproperty.QUESTION: Is therea partition of � into disjointsets � and ����� suchthat� � õ���� Ù*Ý�� ÚïÞ�× � õ�������� Ù*Ý �hÞ(åTheorem: MODIFIED-PARTITION is NP-Complete.Proof: It is easy to see that MODIFIED-PARTITION � NP, sinceone only has to guessapartition andcheckif it satisfiesthe relation.

We transform3DM to MODIFIED-PARTITION.The proof closely follows the outline of reductionfrom 3DM to PARTITION presentedin [8]. Let thesets �Âã�� ã���ã with ð � ð ×�ð!� ð ×�ð � ð × " , and# $ � %&� %'� , the candidatematchingset,bean arbitrary instanceof 3DM. Let the elementsofthe setsbe denotedby

� × Ø)(oâEã�(�ä�ã å å åIã*(,+�Û� × Ø�Ütâ(ã�Ü¡ä ã å å åIã�Ü-+�Û� × Ø).�âEã*.�ä�ã å å åIã*.�+IÛ

and # ×"Ø)/ â(ã*/ ä�ã å å�å�ã*/10�Ûwhere á�×�ð # ð . We must construct a set ofelements� with size Ù*Ý �hÞ2�4365 , that satisfiesthemirroring property and a number � such that �containsa subset� satisfying� � õ7��� Ù*Ý �hÞ�× � õ�������� Ù*Ý �hÞif andonly if

#containsa matching.ThesetA will

contain �há98:� elements.The first á areconstructedfrom

#, whereeachelement� Ú is associatedwith a

triple /UÚ;� # . Thesize Ù*Ý �çÚÞ of � Ú will bespecifiedin its binary representation,in termsof a string of0’s and1’s divided into <�" zonesof îT×>=@?BA�C ä Ý��háD8��Þ E bits each.Eachof thesezonesis labeledwithanelementfrom �F�,�G�H� asshown in Figure23.

The representationfor Ù*Ý �çÚÞ dependson the cor-respondingtriple /UÚH× ÝI(,J þ Ú ÿ ã�Ü-K þ Ú ÿ ã*.�L þ Ú ÿ ÞM� #

. It

NONPOPQOQRORSOST UOUVOVWOWXYOYZ [O[\O\]O]^_O_`aaaaaaaaaaa

bbbbbbbbbbb

ccccccccccc

w w w x x x y y y1 1 12 2 2q q q

Fig. 23. Labelingof the d�e zones,eachcontainingfhgji kml*nporq@sutOvwyx{zbits of the binary representationof |}q{~ x .

has a 1 in the rightmost bit position of the zoneslabeled( J þ Ú ÿ ã�Ü K þ Ú ÿ and . L þ Ú ÿ . Hence,thesizeof �çÚ isgiven by

Ù*Ý �çÚïÞ~×��� þ�� + � J þ Ú ÿ:ÿ 8��� þ ä�+ � K þ Ú ÿ4ÿ 8��� þ + � L þ Ú ÿ:ÿThenext á elementsof � arejust replicasof thefirstá , that is Ù*Ý �çÚ 5 0 Þ ×¿Ù*Ý �çÚïÞ . Note that if we sum upall entriesin a zoneover all elementsØ��çÚZë���ù��dù�há Û , it never exceeds�há2×�� � ��� . Hencein adding

� õ7��� Ù*Ý �hÞ for any subset � $ Ø��çÚ ë���ù��mù��há�Û ,therewill never beany carriesfrom onezoneto thenext. Now if we let� × � + � âôu��� � � ôwhich is the numberwhose binary representationhasa 1 in therightmostpositionof every zone,thenany subset� $ Ø��çÚZë���ù��dù��há Û will satisfy

� õ���� ÙRÝ��hÞ~× �if and only if

# × Ø)/UÚdë!� Ú���� Û is a matchingfor

#. Also note that by virtue of construction�

can contain only distinct elementsand hencenoduplicates,In otherwords, � $ Ø��çÚ ë���ù��dù á Û .

We now definethe remaining2 elementsof theset � , � and � with sizes

Ù*Ý �=Þ¨×�Ù*Ý�� Þ × � � � 0Ú � â Ù*Ý �hÞ

where � is a numbersuchthat ÙRÝ��=Þ�� � . The finalstep is to choose � which we set to be equal to�)����� .

Now if,# is a matching,thenwe musthave for

the correspondingset � that � õ�� � Ù*Ý�� Þ�× �. The

Page 19: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

19

remainingelementsof � sumup to

ä�0Ú � â Ù*Ý �hÞ;�

� 8���ÙRÝ��=Þ× � 0Ú � â Ù*Ý �hÞ��

� 8��)� � ��� 0Ú � â Ù*Ý�� Þ× Ý��)�j���IÞ �which satisfiesthe requirement,that is

Ý��)�����IÞ � õ7� � Ù*Ý �hÞ¨× � õ7����� � Ù*Ý �hÞ .Conversely, if there exists a subset � $ � , suchthat

Ý��)�����IÞ � õ���� Ù*Ý �hÞø× � õ7������� Ù*Ý�� Þthenby virtue of the propertythat

� õ�� ÙRÝ��hÞ~×��)� � ,

the elementsof � must sum up to�

. Moreover,neither � nor � can be in � since they areboth larger than

�. Hence only elementsØ��çÚ­ë���ù �«ù á Û can be in � which results

in a matching. Thus, 3DM � MODIFIED-PARTITION and the theoremis proved.

Beforewe analysetheheuristics,we definesomenotation.

1) As before,let �D��� � denotethe setof streams(prefixes)at node � destinedto egressrouter/ andlet ÜìÚ denotethe intensityof stream� .Also, let  �×¢¡ �£¡ .

2) Let ß denotethe set of next-hops at � for/ . Let è)0 denotethe capacityof hop á and�Õ×¢¡�ß�¡ .3) For simplicity we refer to the load on hop á

as÷ 0 . The context will make it clear if this

is the load before or after assignmentof anyparticularprefix.

Let

¤ ×¥Ú � â ÜìÚ

¦ × æ0 õ=ö è§0�å

A. Analysisof the MIN-MAX LOAD Heuristic

Our analysisof the MIN-MAX LOAD heuristicconsistsof two steps.We first give a performanceboundwhenthesetof routesis unordered.We thendemonstratean improved bound when the set ofroutesis orderedaccordingto their traffic intensity.

Proposition1: Heuristic MIN-MAX LOADachieves a load that is no more than Ý*�¨8G?B©&�ÂÞtimes that of the maximum load with an optimalalgorithm, where � is the number of next hops(for a given destination).Proof: The proof proceedsin two steps.First weidentify a key property of the MAX-MIN LOADheuristic,namely that for eachvirtual assignment,we can associatea distinct hop. Next we use thispropertyto establishthe main result.

Denotethe maximumload achieved by HeuristicMIN-MAX LOAD as ªPÝ��D��� ��ã ß Þ . Let hop «¢� ßachieve this loadandthe lastprefix assignedto hop« be prefix ¬ . By definition, we have

ª Ý �D��� ��ã ß Þ®­ ÷ 0è)0 ¯ á°�­ßoåLet ª�±�Ý �D��� ��ã��ÂÞ denote the maximum load

achievedby anoptimumalgorithmthat satisfiestheequal subsetsplitting constraint,i.e., splits trafficequally acrossthe subsetof next hops that havebeenassignedto a route.Note that ªO±�Ý��D��� ��ã ß Þ¨­¤¦ where

¤¦ is the optimum attainedif arbitrarysplitting of routeswasallowed at the node.

First recallhow thesecondstepof heuristicMIN-MAX LOAD works.Theheuristicdoesa virtual as-signmentof thestreamsoveranincreasingsequenceof hops, ² ײØ�³2ù²á�ë�³´� ßoÛhã á�×µ��ã)�hã å�å åIã)�as describedin the algorithm and choosesthe ar-rangementwith thesmallestmaximumfor anactualassignment.We make the following observations:

1) The smallestmaximumamongall virtual as-signmentsof prefix ¬ must be ª Ý �D��� ��ã�ß�Þ .If there were a smaller maximum, it wouldcontradict our assumptionthat ¬ is the lastprefix assignedto hop « .

2) In avirtual assignmentof prefix ¬ over î hops,let ¶ � denotethe index of the hop with themaximumload amongthe virtually assignedhops(to which aportion Ü ô7· î of theroutewasvirtually assigned). Then ¶ � mustbemaximalover all hops, for that virtual assignment.Ifit were not, let there exists a hop á (which

Page 20: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

20

mustbelongto thesetof virtually unassignedhops)suchthat÷ 0è)0 � ÷{¸}¹ 8 Ü ô7· îè ¸ ¹ å (13)

Since all hops satisfy the property÷ 0è§0 ù

ª Ý �D��� ��ã ß Þ , we have an assignmentof pre-fix ¬ which yields a lower load ratio thanª Ý �D��� ��ã ß Þ . This contradictsour initial as-sumptionthat ¬ was the last prefix assignedto hop « such that its load was ªPÝ��D��� ��ã ß Þ .Hence¶ � mustbe maximal.

From Observation 1, we have÷{¸ ¹ 8 Ü ô7· îè ¸º¹ ­ ª Ý �D��� ��ã ß Þ(å (14)

Clearly, thefollowing relationholdsfor all the �»�Hîvirtually unassignedhops÷ 0¼8 Ü ô7· îè)0 ­ ÷{¸º¹ 8 Ü ô§· îè ¸º¹ (15)

since only the î smallesthops are chosenin thevirtual assignment.We then have from Equations(14) and(15) that the following relationmustholdfor ¶ � and the remaining(if any) �>� î virtuallyunassignedhops:÷ 0¼8 Ü ô7· îè)0 ­ ªPÝ��D��� ��ã ß Þ(å (16)

Using these2 observationswe make the followingclaim.Claim : For every virtual assignmentover î�×�Zå å�å§� hopsof prefix ¬ , we can identify a distincthop áZÝ'î�Þ that satisfies÷ 0 þ � ÿ 8 Ü ô7· îè 0 þ � ÿ ­ ªPÝ��D��� ��ã ß Þ(å (17)

Proof: The proof is by induction. For a virtualassignmentby the heuristic over î hops, let ½ �denotethe setof hopssuchthat:

½ � × Øhá2ë ÷ 0¼8 Ü ô�· îè)0 ­�ª Ý �D��� ��ã ß Þ(ÛhåNote from Observation2 andEquation(16) that ½ �comprisesof at least ¶ � and the �¾�¤î unassignedhops.Hence

¡h½ � ¡)­��¾� î¿8�� .Also notethat ½ � is a non-decreasingsequenceforîy×�� ã)�À����ã å å�åIã�� , becauseif somehop á°�Á½°� ,

then áÂ�1½°� � âEã �íÄ� . Theclaim certainlyholdsfor îy×�� sinceby Observation1, thereis at least1hop which hasa load of ª Ý �D��� ��ã�ß�Þ . Let the claimhold for all îy×�� ã��Å�Æ��ã å å åIã*� . Thenby the non-decreasingproperty, ½°� containsall the �À�j�Ç84�distinct hops. Let us look at a virtual assignmentover �1��� hops.We know that

¡h½°� � â,¡)­��¾�»�18�� .Since only �>���»8Ä� hops have beenassociated(with virtual assignmentsî�× � to î ×È� ) andby the non-decreasingproperty, ½°� � â , containsallthesehops, there is at least 1 hop which has notyet beenusedand hencecan be associatedwith avirtual assignmentover �Â�Æ� hops.This completesthe proof.We arenow in a position to prove Proposition 1.By our previous result, for each possible virtualassignmentof prefix ¬ over î­×£�Iã)�hã å å�åIã�� hops,we have a distinct hop á Ý'î�Þ which satisfiesEqua-tion (17). SummingEquation(17) over this set of� distinct hops,we have

æ� � â

÷ 0 þ � ÿ ­ æ� � â è)0ɪ Ý��Ê��� ��ã ß�Þ;�

æ� � â

Ü ôî ãæ� � â

÷ 0 þ � ÿ ­ ¦ ªPÝ��D��� ��ã ß Þ;� æ� � â

Ü ôî 㤠�­Ü ô ­ ¦ ªPÝ��D��� ��ã ß Þ;� Ü ô Ý�?Ë©���8��IÞ (18)

ªPÝ��D��� ��ã ß Þ ù ¤¦ 8 Ü ô¦ Ý�?B©��ÂÞ~ù ¤¦ Ýy�Ì8�?B©��ÂÞEãªPÝ��D��� ��ã ß Þ ù ªO±�Ý �D��� ��ã ß Þ�Ý*�h8�?Ë©Á�ÂÞ(ã (19)

where the LHS (18) of follows from the fact thatthetotal assignedloadis not morethan

¤ ��Ü ô . ThisprovesProposition 1.

Theaboveanalysisholdsfor anarbitraryorderingof theprefixes.If theprefixesareorderedin decreas-ing order as is the casein MAX-MIN LOAD, theboundcan be improved ascan be the performanceof the algorithm.Our proof for this resultdraws onthe methodusedin [9].

Proposition2: If the prefixes are assignedindecreasingorderof their traffic intensities,then,

ª Ý �D��� ��ã�ß�ÞªO±�Ý �D��� ��ã ß Þ ù Ýy�Ì8 ?B©&�� ÞEåProof: The proof is by contradiction.Assumethatthe above resultdoesnot hold for someorderedsetof prefixes with intensities Í�× Ø�Ütâ�ã�Ü¡äçã å å åIã�Ü ¥ Û ,

Page 21: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

21

where

ÜtâD­ Ü¡ä�­ å å å�­ Ü ¥ åWithout loss of generality assume Ü ¥ is theintensity of the last prefix(  ) assignedto the hop,which achieves the maximumload underheuristicMIN-MAX LOAD. If it is not, we can truncatethe sequenceup to the prefix last assignedto ahop which achieves the maximum load withoutaffecting the maximum achieved by MIN-MAXLOAD. If the optimum for the truncatedsequenceis ª ± Ý � ã ß Þ , then ª ± Ý � ã ß ÞPù�ªO±�Ý��D��� ��ã ß Þ andourassumptionstill holds.Let asbefore,¤ × ¥

Ú � â ÜìÚ and¦ × 0

0 � â è)0 .Following the exact same analysis as for the

arbitraryorderingwe have

ª Ý��Ê��� ��ã ß�Þ ù ¤¦G8 Ü ¥¦ ?Ë©Á� ãù ªO±�Ý��Ê��� ��ã ß�ÞÎ8 Ü ¥¦ ?B©�� ã

ªPÝ��D��� ��ã ß ÞªO±�Ý��Ê��� ��ã ß�Þ ù �Ì8 Ü ¥ªO±�Ý'Ü�ã èpÞ;Ï ¦4?Ë©�� åBy our assumption,we have

�Ì8 Ü ¥ªO±�Ý �D��� ��ã ß Þ;Ï ¦ ?B©&� � �Ì8 ?B©��� ãor

Ü ¥¦ � ªO±�Ý �D��� ��ã�ß�Þ� åNote that Ü ¥¦ is the smallestachievable load underany split. Hence,if the above inequality, and ourassumption,is to hold, we can have only oneroute (destinationprefix) in � . However it is clearfrom the algorithm itself that MIN-MAX LOADachieves an optimal allocation when there is onlyonestream(prefix).This provesProposition 2.

B. Analysisof the MAX-MIN RESIDUAL CAPAC-ITY heuristic.

In this section,we presenta boundon theperfor-manceof the MAX-MIN RESIDUAL CAPACITYheuristic. Note that the MIN-MAX GAP heuris-tic performs in a fashion that is identical to theMAX-MIN RESIDUAL CAPACITY heuristic,eventhoughthey aretrying to improve opposingmetrics.Hencewe shall focuson the former.

For simplicity we assumethat¤ × ¦

. DenotebyÐÎÑ theoptimumachievedby analgorithmwhich can

split the traffic arbitrarily. Then Ð Ñ ×�Ò . To seethis,note that if somenodehasexcesscapacity Ó2�µÒ ,thenby conservationandthe fact that

¤ × ¦, some

othernodehasnegative capacity �ÔÓ andhence,theminumumresidualcapacityÐ ×G�¿Ó Õ�Ò . This canbe increasedto Ò by shifting Ó units of traffic fromthe first nodeto the latter.

Let the minumum residualcapacity that resultsfrom executing the MIN-MAX RESIDUAL CA-PACITY is Ð L . Clearly, Ð L ù ÐÎÑ ù¾Ò . Further, likebefore,let somehop « attainsÐ L . Let the laststreamassignedto this hop is streamÜ-Ö . Let the load onhop á beforeassignmentof this streamis

÷ 0 . Clearlyby definition, we musthave

è)0 � ÷ 0 ­ Ð L å (20)

Recall, that the heuristicfirst sortsthe next hopsin non-decreasingorderof residualcapacity, è)0×� ÷ 0 ,and then goes through an increasingsequenceofvirtual assignmentsover ¬ × ��ã)�hã å�å åIã)� hopsand chosesthe one with the maximum minimumresidualcapacity. For convenience,let us index thenext hopsas ��ã)�hã�å å å�ã�� in non-decreasingorderofresidualcapacity. Note that by definition, the maxi-mumminimumresidualcapacityafterassignmentofstreamÜ-Ö mustbe Ð L . Let us look at theassignmentof Ü-Ö , to thefirst hop,ie., 1 only. We mustthenhave

è�âÎ� Ý ÷ â�8 Ü-Ö Þ ù Ð L å (21)

To seethis, observe that if è�â×�UÜ-ÖØ� Ð L , thenalongwith Eqn.(20),it impliesthatwehaveanassignmentof stream Ü-Ö whereby hop « does not reach theresidualcapacityof Ð L . But thiscontradictsour intialdefinitionof streamÜ-Ö beingthelaststreamassignedto hop « . By the samereasoning,we musthave

èIäÊ� Ý ÷ äÊ8 Ü-Ö · ��Þ ù Ð L (22)...

è)0 � Ý ÷ æ�8 Ü-Ö · �ÂÞ ù Ð L å (23)

Adding equations(21) thru (23), we have

æ0 � â è)0 �

æ0 � â

÷ 0 � Ü-Ö æÚ � â Ý�� Þ¨ù��¾Ï Ð L å

Substitutingæ0 � â ÷ 0U× ¤ � ¥

Ú � Ö ÜìÚ and using the fact

Page 22: Achieving Near-OptimalTraffic Engineering Solutions for ...khayyam.seas.upenn.edu/software/zebra/ashwin_split_hops-report.pdf · 1 Achieving Near-OptimalTraffic Engineering Solutions

22

that¦ × ¤ , we obtain,

Ð L ­ Ý �� Þ Ý � Ü-Ö�Ýy�Ì8�?B©,�ÂÞÎ8¥Ú � Ö ÜìÚÞ

Ð L ­ Ý �� Þ �­Ü-Ö�Ý�?Ë©Ù�ÂÞ;8¥Ú � Ö 5 â ÜìÚ å (24)

A simpleapproximationof the boundis

Ð L ­���Ý Ü-� ��Ú� Þ�Ý�?Ë©,�ÂÞ (25)

where Ü-� ��Ú × ñ2òIó ÚØ�ÜìÚKÛ . A better approximationwould be

Ð L ­ Ý �� Þ �­Ü-ÖIÛ Ý�?B©Ù�ÂÞ�8�Ý  À�ë Ñ ���IÞïÜ-�tÚ{�where Ü-�tÚÜ�2× ñ´Ý © ÚØ�ÜìÚ Û and

« Ñ × ò�Þ C ñ°Ý ©Ö�ß ��Ü-Ö�Ý�?Ë©,�ÂÞÎ8 Ý� Ä�ëÌ���IÞïÜ-�tÚÜ�áà