resilient ip backbones · 2007. 12. 27. · ip backbone alternatives ip-over-dwdm ip-over-otn...
TRANSCRIPT
1
Resilient IP Backbones
Debanjan Saha, Tellium, Inc.
2
Outline
Industry overview
IP backbone alternatives
  IP-over-DWDM
  IP-over-OTN
Traffic routing & planning
Network case studies
Research issues
3
Tellium Overview
Aurora optical switch
  Aurora 32
  Aurora 128
  Aurora 512
StarNet software suite
  Element management system
  Network operating system
  Planning and modeling tools
Value proposition
  Dynamic provisioning
  Mesh restoration
[Figure: Aurora Optical Switch, 512 x 512 configuration]
4
Industry Overview
Good news - data traffic is growing at a healthy rate
  100% growth rate in 2001, on track for 85% growth rate in 2002
  Data traffic has surpassed voice traffic by volume
Bad news - it is not a profitable business
  Bandwidth prices are falling 30% yearly
  A lot of unused capacity in the network
  Capital spending has fallen 40% from its peak in 2000
Need to increase network efficiency through consolidation and better utilization
  Common IP/MPLS backbone for all data traffic
  The IP backbone needs to be reliable
[Figure: Internet traffic growth, 1998 to 2002. Source: RHK consulting]
5
Troubling Failure Statistics: IP Networks are Fragile
Failures are frequent
  6 million outages recorded in three years
  More than 50% of routes have an MTTF of 15 days or less
  More than 75% of routes have an MTTF of 30 days or less
Much lower than carrier-class availability
  30-35% of routes have more than 99.99% availability
  10% of the routes have availability below 95%
Service can be affected for a long time
  Only 30% of the outages are repaired within an hour
  40% of the failures last from more than an hour to several days
  Routing convergence after a failure takes 15 minutes
Source: Craig Labovitz, Abha Ahuja, and Farnam Jahanian, "Experimental Study of Internet Stability and Wide-Area Backbone Failures."
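To put the availability figures on this slide in perspective, the short sketch below converts an availability percentage into expected downtime per year. The function name and the 365-day year are illustrative assumptions, not part of the original study.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: non-leap year


def downtime_hours_per_year(availability_percent: float) -> float:
    """Expected hours of unavailability per year for a given availability."""
    return (1.0 - availability_percent / 100.0) * HOURS_PER_YEAR


if __name__ == "__main__":
    for availability in (99.99, 95.0):
        hours = downtime_hours_per_year(availability)
        print(f"{availability}% available: about {hours:.2f} hours down per year")
```

At 99.99% a route is down for under an hour per year; at 95% it is down for roughly 18 days.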
6
Internet Routing Architecture: Impact of Failures
[Figure: Autonomous System #1 and Autonomous System #2, each with Area 1 and Area 2, joined at an exchange point; labels: BGP, IGP: OSPF/IS-IS]
Inter-domain route failures
  Faults in connectivity between providers
  Loss of provider's connectivity to customers
BGP requires all routes from a peer to be withdrawn after loss of the peering session
A change in link state triggers OSPF flooding
7
Router Interconnections
[Figure: routers connected over a protected SONET ring (UPSR/BLSR), and routers connected over unprotected WDM]
8
Backbone POP Architecture
[Figure: PoP diagram showing customer routers, access routers, and core routers, with links labeled OC3/OC12, OC48, and OC192, and core links leading to other PoPs]
Backbone routers are dual redundant
Backbone routers are connected to other POPs over unprotected wavelengths
Access routers are connected to both backbone routers
Customer routers are connected to one or, rarely, two access routers
9
Different Types of Failures
Transport failures
  Caused by fiber cuts and WDM failures
  May cause multiple links to fail at the same time and take hours to fix
  Optical layer restoration can mask transport failures from the IP layer
Router failures
  Can be software or hardware failures
  Software failures can be fixed in minutes
Network failures
  Typically due to congestion, misconfiguration, and malicious attacks
Maintenance problems
  Primarily a customer network issue; backbone routers typically do not suffer from this problem
[Figure: pie chart of failure causes. Legend: transport failure, router failure, maintenance, network problem. Slice labels: 26%, 28%, 27%, 19%]
10
Transport Failures Cause Multiple Failures at IP Layer
[Figure: PoP 1 through PoP 6, with links marked "Shares conduit" and "Shares DWDM"]
Wavelengths are often glassed through at multiple PoPs
Fibers between different PoP pairs sometimes share conduits
The IP layer is unaware of risk dependencies at the transport layer
A single failure at the transport layer may trigger multiple failures at the IP layer
Protection at the transport layer can alleviate much of the problem
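The dependency described on this slide can be sketched as a mapping from IP links to the transport resources they ride on. The PoP pairs and resource names below are hypothetical, chosen only to show how one conduit or DWDM failure takes down several IP links at once.

```python
from collections import defaultdict

# Hypothetical data: each IP link lists the transport resources it depends on.
ip_link_risks = {
    ("PoP1", "PoP2"): {"conduit-A"},
    ("PoP1", "PoP3"): {"conduit-A", "dwdm-X"},
    ("PoP4", "PoP5"): {"dwdm-X"},
    ("PoP5", "PoP6"): {"conduit-B"},
}


def build_srlgs(link_risks):
    """Group IP links by shared transport resource (shared risk link groups)."""
    srlgs = defaultdict(set)
    for ip_link, risks in link_risks.items():
        for risk in risks:
            srlgs[risk].add(ip_link)
    return dict(srlgs)


def failed_ip_links(link_risks, failed_resource):
    """IP links that go down when a single transport resource fails."""
    return {l for l, risks in link_risks.items() if failed_resource in risks}


if __name__ == "__main__":
    print(build_srlgs(ip_link_risks))
    print(failed_ip_links(ip_link_risks, "conduit-A"))
```

In this toy data a cut in `conduit-A` fails two IP links that look independent at the IP layer.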
11
IP Layer Protection
IP rerouting around the failure
  Current mode of operation in the Internet today
  Time-tested mechanism; simple and robust
  Routing convergence can be slow; the network is unstable during recovery
  Networks run at very low utilization to absorb temporary overload due to failures
End-to-end MPLS protection
  Potentially faster restoration
  Requires traffic engineering and pre-planning of primary and backup paths
  Backup and primary LSPs should be shared-risk disjoint
  More expensive than IP rerouting
  Routing stability and scalability issues are still open
[Figure: end-to-end MPLS restoration over nodes A through F, showing a primary path and a backup path]
12
MPLS Fast Reroute
The LSP shown traverses A, B, C, D, E, F
Each detour avoids
  The immediate downstream node and the link towards it
  Except for the last detour, which avoids only link DE
Merged detours reduce the state maintained, reduce signaling overhead, and improve utilization
[Figure: fast reroute example on nodes A through F, with detours to avoid AB, BC, CD, DE, and link DE]
[Figure: fast reroute with merged detour example, with a detour to avoid AB, a merged detour to avoid AB and BC, and a detour to avoid CD]
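The detour rule on this slide can be illustrated with a small sketch. The topology below is hypothetical (the slide's actual graph is not recoverable from the transcript), and a plain hop-count breadth-first search stands in for whatever constrained path computation a real router would run. For each hop of the LSP, the detour avoids the next link and the next node, except at the last hop, where only the link is avoided because the next node is the egress.

```python
from collections import deque

# Hypothetical topology: the LSP A-B-C-D-E-F plus side nodes G, H, I, J.
EDGES = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E"), ("E", "F"),
         ("A", "G"), ("G", "C"), ("B", "H"), ("H", "D"),
         ("C", "I"), ("I", "E"), ("D", "J"), ("J", "F"), ("E", "J")]
LSP = ["A", "B", "C", "D", "E", "F"]


def build_adj(edges):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    return adj


def shortest_path(adj, src, dst, banned_nodes=frozenset(), banned_links=frozenset()):
    """Hop-count BFS that skips banned nodes and banned (undirected) links."""
    prev = {src: None}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        if u == dst:
            path = []
            while u is not None:
                path.append(u)
                u = prev[u]
            return path[::-1]
        for v in adj[u]:
            if v in prev or v in banned_nodes:
                continue
            if frozenset((u, v)) in banned_links:
                continue
            prev[v] = u
            queue.append(v)
    return None


def detours(adj, lsp):
    """One detour per hop: avoid the next link, and the next node unless it is the egress."""
    egress = lsp[-1]
    result = {}
    for here, nxt in zip(lsp, lsp[1:]):
        avoid_nodes = frozenset() if nxt == egress else frozenset({nxt})
        avoid_links = {frozenset((here, nxt))}
        result[here] = shortest_path(adj, here, egress, avoid_nodes, avoid_links)
    return result


if __name__ == "__main__":
    for node, path in detours(build_adj(EDGES), LSP).items():
        print(node, "->", path)
```

Detour merging is not modeled here; each detour is computed independently all the way to the egress.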
13
Optical Mesh Restoration
Similar to protection at the SONET layer, but at a fraction of the cost
Protects the backbone links between routers using shared mesh protection
Fast, robust, and scalable mechanism for protecting the transport layer
Does not protect against router failures; complementary mechanisms are necessary
  Optical bypass reduces the impact of router failures
  IP rerouting or MPLS restoration for handling router failures
[Figure: transceiver port count (thousands), split into service and restoration ports, for ring, 1+1 dedicated, and shared mesh protection; annotations: 28%, 52%. Network: 75 cities with 93 fiber links, degree of connectivity of 2.48]
14
Optical Restoration Advantages: Fast, Scalable, and Efficient
Restores hundreds of wavelengths, possibly containing thousands of MPLS LSPs, in less than 200 ms
  Scalability of MPLS restoration is a big leap of faith. A fiber cut or DWDM failure may cause thousands of LSPs to fail.
Takes advantage of physical plant information for efficient diverse path routing; physical plant information is not available at the IP/MPLS layer
  Physical plant information is very important for routing primary and backup paths in an SRLG-disjoint fashion.
[Figure: maximum and average restoration latencies (ms, axis 0 to 200) versus number of OC-48 lightpaths simultaneously failed (6, 11, 15, 24, 34). Network: 50 nodes, 910 OC-48 lightpaths]
15
Optical Restoration Advantages: Improves Network Stability
Optical restoration does not impact IP routing or MPLS label distribution after a failure
  Every failed link is replaced by a backup link of the same capacity, so the topology does not change
  In most cases the routers would not even detect failures at the optical layer
When optical restoration is used, IP layer utilization does not change after a failure and subsequent restoration
Avoids the need to revert thousands of LSPs from protection to working paths when failures are repaired
[Figure: restoration at the IP layer (primary route, backup route) compared with restoration at the optical layer (primary lightpath, backup lightpath); with optical restoration, connectivity between routers remains the same]
16
Routing with Express Bypass Links
[Figure: IP routed path BEFORE express bypass; lightpath established as express bypass; IP routed path AFTER express bypass]
17
Primary and Backup Path Routing: MPLS and Optical
Pick the next demand d from the demand set
Find the set of k smallest-cost paths S = {P1, P2, ..., Pk} from source to destination
For each primary path Pi, find the backup path as follows
  For each link in the graph
    Set the cost to infinity if it is part of Pi
    Set the cost to ε if it is shareable
    Set the cost to the original link cost otherwise
  Find the shortest backup path Bi
Find the primary and backup path pair such that Pi + Bi is the smallest
Repeat until the demand set is exhausted
Repeat the whole process N times
[Figure: example graph with link costs labeled α and ε]
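The per-demand step of this heuristic can be sketched as follows. This is a minimal illustration under several assumptions: candidate primaries are found by brute-force enumeration of loop-free paths (workable only on small graphs; a production planner would more likely use a k-shortest-paths algorithm), ε is set to an arbitrary small constant `EPS`, and the set of shareable links is supplied by the caller. The outer loops over the demand set and the N repetitions are omitted, and all names are hypothetical.

```python
import heapq

INF = float("inf")
EPS = 1e-3  # assumed value for the small cost ε of a shareable link


def link(u, v):
    """Key for an undirected link."""
    return frozenset((u, v))


def build_adj(costs):
    adj = {}
    for l in costs:
        u, v = tuple(l)
        adj.setdefault(u, []).append(v)
        adj.setdefault(v, []).append(u)
    for n in adj:
        adj[n].sort()
    return adj


def simple_paths(adj, src, dst, path=None):
    """Yield every loop-free path from src to dst (brute force)."""
    path = path or [src]
    if src == dst:
        yield path
        return
    for v in adj[src]:
        if v not in path:
            yield from simple_paths(adj, v, dst, path + [v])


def dijkstra(adj, cost, src, dst):
    """Return (cost, path) of the cheapest path, or (INF, None)."""
    dist, prev, done = {src: 0.0}, {}, set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        for v in adj[u]:
            nd = d + cost(u, v)
            if nd < dist.get(v, INF):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return INF, None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return dist[dst], path[::-1]


def route_demand(costs, shareable, src, dst, k=3):
    """Pick the (primary, backup) pair with the smallest combined cost."""
    adj = build_adj(costs)

    def path_cost(p):
        return sum(costs[link(a, b)] for a, b in zip(p, p[1:]))

    primaries = sorted(simple_paths(adj, src, dst), key=path_cost)[:k]
    best = None
    for primary in primaries:
        used = {link(a, b) for a, b in zip(primary, primary[1:])}

        def backup_cost(u, v):
            l = link(u, v)
            if l in used:
                return INF   # backup must not reuse a primary link
            if l in shareable:
                return EPS   # backup capacity can be shared here
            return costs[l]

        b_cost, backup = dijkstra(adj, backup_cost, src, dst)
        if backup is None:
            continue
        total = path_cost(primary) + b_cost
        if best is None or total < best[0]:
            best = (total, primary, backup)
    return best


if __name__ == "__main__":
    costs = {link("S", "A"): 1, link("A", "T"): 1, link("S", "B"): 2,
             link("B", "T"): 2, link("S", "T"): 5}
    print(route_demand(costs, {link("B", "T")}, "S", "T"))
```

In the toy graph the shareable link B-T makes the S-B-T backup cheap, so the pair (S-A-T, S-B-T) wins.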
18
Routing IP Flow over Optical Backbone
Start with a fully connected logical graph consisting of physical links and logical lightpaths
Pick the next demand d from the demand set
Set the cost of a logical link as a function of
  Original cost of the physical link and the switch ports
  Remaining capacity of the link and the size of the demand
Find the smallest-cost path from the source to the destination
Repeat until the demand set is exhausted
Repeat the whole process N times
[Figure: graph showing physical links and logical links]
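A minimal sketch of one pass of this procedure is shown below. The slide does not give the cost function, so `link_cost` here is purely an assumed example: it scales a base cost by how much of the remaining capacity the demand would consume, and returns infinity when the demand does not fit. Link names, capacities, and demand sizes are hypothetical, and the N-fold repetition is left out.

```python
import heapq

INF = float("inf")


def link_cost(base_cost, remaining, demand):
    """Assumed cost function: more expensive as the link fills up."""
    if remaining <= 0 or remaining < demand:
        return INF
    return base_cost * (1.0 + demand / remaining)


def cheapest_path(links, src, dst, demand):
    """Dijkstra over the logical graph; returns (node path, link keys used)."""
    adj = {}
    for key in links:
        u, v = key
        adj.setdefault(u, []).append((v, key))
        adj.setdefault(v, []).append((u, key))
    dist, prev, done = {src: 0.0}, {}, set()
    heap = [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == dst:
            break
        for v, key in adj.get(u, []):
            c = link_cost(links[key]["base"], links[key]["remaining"], demand)
            if c < INF and d + c < dist.get(v, INF):
                dist[v] = d + c
                prev[v] = (u, key)
                heapq.heappush(heap, (d + c, v))
    if dst not in dist:
        return None, []
    nodes, keys = [dst], []
    while nodes[-1] != src:
        u, key = prev[nodes[-1]]
        nodes.append(u)
        keys.append(key)
    return nodes[::-1], keys


def route_all(links, demands):
    """Route demands one by one, consuming capacity along each chosen path."""
    routes = []
    for src, dst, size in demands:
        nodes, keys = cheapest_path(links, src, dst, size)
        for key in keys:
            links[key]["remaining"] -= size
        routes.append(nodes)
    return routes


if __name__ == "__main__":
    links = {
        ("A", "B"): {"base": 1.0, "remaining": 10.0},
        ("B", "C"): {"base": 1.0, "remaining": 10.0},
        ("A", "C"): {"base": 3.0, "remaining": 10.0},
    }
    print(route_all(links, [("A", "C", 8.0), ("A", "C", 4.0)]))
```

In the toy example the first demand takes the cheaper two-hop route, and the second is pushed onto the direct link because the two-hop route no longer has room.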
19
IP Network Study: Network, Traffic, and Pricing Assumptions
Network model
  12-node, 17-link US backbone network; average node degree is 2.8
Traffic model
  Total of 66 bidirectional demands; average demand between POPs is 3 Gbps
  75% year-over-year growth assumed
Pricing model
  Street prices for routers used in the study: OC-48 and OC-192 ports at $45K and $125K per port, respectively
  OC-48 and OC-192 OXC ports are assumed to be $10K and $35K, respectively
  OXC common equipment cost of $350K
Different scenarios considered
[Figure: network topology]
20
Router Port Comparison: IP-over-WDM Link Utilization 50%
IP-over-WDM: unprotected; utilization kept below 50% to handle overload due to failure
IP-over-OTN: mesh protected; utilization kept below 50%
Large reduction in number of router ports
  From 2,096 to 1,581 in Y1
  From 18,324 to 13,848 in Y5
Large reduction in number of routers
  From 64 to 52 in Y1
  From 388 to 244 in Y5
24% reduction in router ports with IP-over-OTN
Number of 2.5G ports by year:
  Year   IP-over-WDM router ports   IP-over-OTN router ports   IP-over-OTN AOS ports
  Y1      2,096                      1,581                      1,823
  Y2      3,560                      2,683                      3,025
  Y3      6,112                      4,613                      5,103
  Y4     10,558                      7,971                      8,831
  Y5     18,324                     13,848                     15,322
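As a quick check of the headline figure, the sketch below recomputes the router port reduction from the Y1 and Y5 port counts quoted on the slide. The variable and function names are illustrative.

```python
# Router port counts quoted on the slide (2.5G ports).
wdm_ports = {"Y1": 2096, "Y5": 18324}   # IP-over-WDM
otn_ports = {"Y1": 1581, "Y5": 13848}   # IP-over-OTN


def reduction_percent(before: int, after: int) -> float:
    """Relative reduction, in percent, going from `before` to `after`."""
    return 100.0 * (before - after) / before


if __name__ == "__main__":
    for year in ("Y1", "Y5"):
        pct = reduction_percent(wdm_ports[year], otn_ports[year])
        print(f"{year}: {pct:.1f}% fewer router ports with IP-over-OTN")
```

Both years come out between 24% and 25%, consistent with the 24% stated on the slide.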
21
Network Cost Comparison: IP-over-WDM Link Utilization 50%
IP-over-WDM: unprotected with 50% utilization
IP-over-OTN: mesh protected with 50% utilization
Negligible cost increase
  Inter-quad router ports not accounted for
Optical protection increases the robustness and stability of the network
Extends the lifetime of routers by 2-3 years
Protection capacity can be used by other applications
$37M or 5% CAPEX increase with IP-over-OTN
Total cost by year (in $ million):
  Year   IP-over-WDM total cost   IP-over-OTN total cost
  Y1      87                       95
  Y2     147                      156
  Y3     251                      265
  Y4     434                      456
  Y5     752                      789
22
Router Port Comparison: IP-over-WDM Link Utilization 30%
IP-over-WDM: unprotected; utilization kept below 30% to handle overload due to failure
IP-over-OTN: links are protected at the optical layer; utilization kept below 50%
Large reduction in number of router ports
  From 2,706 to 1,581 in Y1
  From 24,086 to 13,848 in Y5
Large reduction in number of routers
  From 88 to 52 in Y1
  From 568 to 244 in Y5
42% reduction in router ports with IP-over-OTN
Number of 2.5G ports by year:
  Year   IP-over-WDM router ports   IP-over-OTN router ports   IP-over-OTN AOS ports
  Y1      2,706                      1,581                      1,823
  Y2      4,612                      2,683                      3,025
  Y3      7,988                      4,613                      5,103
  Y4     13,864                      7,971                      8,831
  Y5     24,086                     13,848                     15,322
23
Network Cost Comparison: IP-over-WDM Link Utilization 30%
IP-over-WDM: unprotected with 30% max link utilization
IP-over-OTN: mesh protected with 50% max link utilization
Substantial cost savings
Optical protection increases the robustness and stability of the network
Extends the lifetime of routers by 2-3 years
In IP-over-OTN, protection capacity can be used by other applications
Similar savings possible in ATM/FR networks
$170M or 18% CAPEX savings with IP-over-OTN
Total cost by year (in $ million):
  Year   IP-over-WDM total cost   IP-over-OTN total cost
  Y1     109                       95
  Y2     184                      156
  Y3     319                      265
  Y4     552                      456
  Y5     959                      789
24
MPLS and Optical Protection: Cost Comparison
MPLS protection (primary LSP, backup LSP)
  Burns 4 intermediate backbone router ports in each of POP2 and POP3
  Total cost of $1.0M, assuming 10G router ports at $125K each
Optical mesh protection (primary lightpath, backup lightpath)
  Burns 4 OXC ports in each of POP1 and POP4
  Burns 2 OXC ports in each of POP2 and POP3
  Total cost of $420K, assuming 10G OXC ports at $35K each
[Figure: POP 1 through POP 4 under MPLS protection and under optical mesh protection]
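The two totals on this slide follow directly from the port counts and per-port prices it quotes. The short sketch below spells out the arithmetic; the dictionary layout and names are illustrative.

```python
ROUTER_PORT_10G = 125_000  # $ per 10G router port (from the slide)
OXC_PORT_10G = 35_000      # $ per 10G OXC port (from the slide)

# Ports consumed per POP, as stated on the slide.
mpls_router_ports = {"POP2": 4, "POP3": 4}
optical_oxc_ports = {"POP1": 4, "POP4": 4, "POP2": 2, "POP3": 2}


def total_cost(ports_per_pop: dict, price_per_port: int) -> int:
    """Total port cost across all POPs."""
    return sum(ports_per_pop.values()) * price_per_port


mpls_cost = total_cost(mpls_router_ports, ROUTER_PORT_10G)
optical_cost = total_cost(optical_oxc_ports, OXC_PORT_10G)

if __name__ == "__main__":
    print(f"MPLS protection:         ${mpls_cost:,}")
    print(f"Optical mesh protection: ${optical_cost:,}")
```

Eight router ports at $125K give $1.0M, while twelve OXC ports at $35K give $420K.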
25
IP Network Study: Network, Traffic, and Pricing Assumptions
Network model
  12-node, 17-link US backbone network; average node degree is 2.8
Traffic model
  Total of 66 bidirectional demands; average demand is 1.5 Gbps
  50% year-over-year growth assumed
Pricing model
  Street prices for routers used in the study: OC-48 and OC-192 ports at $45K and $125K per port, respectively
  OC-48 and OC-192 OXC ports are assumed to be $10K and $35K, respectively
  OXC common equipment cost of $350K
Different scenarios considered
[Figure: network topology]
26
Router Port Reduction: MPLS vs. Optical Restoration
IP-over-WDM: restoration using MPLS shared backup restoration
IP-over-OTN: restoration using shared optical mesh restoration
Only core routers considered
Large reduction in number of router ports
  From 732 to 128 in Y1
  From 3,414 to 502 in Y5
Large reduction in number of routers
  From 42 to 24 in Y1
  From 146 to 54 in Y5
84% reduction in transit router ports
Transit traffic at IP-over-WDM POPs
  Burns 2 router ports 50% of the time
  Burns 4 router ports 50% of the time
Number of 2.5G ports by year:
  Year   IP-over-WDM router ports   IP-over-OTN router ports   IP-over-OTN AOS ports
  Y1       732                        128                        686
  Y2     1,052                        186                        982
  Y3     1,544                        250                      1,412
  Y4     2,292                        354                      2,012
  Y5     3,414                        502                      2,918
27
Network Cost Savings: MPLS vs. Optical Restoration
54% CAPEX savings
Substantial cost savings
  Average of 54% over 5 years
  Total of $133M over 5 years excluding DWDM
Similar savings possible in ATM/FR networks
Inter router-quad tie-ports not accounted for
Transit traffic at IP-over-WDM POPs
  Burns 2 router ports 50% of the time
  Burns 4 router ports 50% of the time
Total cost by year (in $ million):
  Year   IP-over-WDM total cost   IP-over-OTN total cost
  Y1      55                       27
  Y2      77                       37
  Y3     112                       51
  Y4     165                       72
  Y5     245                      102
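The average savings figure can be reproduced from the yearly totals read off the chart. The sketch below assumes the average is taken over the five per-year percentages, which is an interpretation rather than something the slide states; the names are illustrative.

```python
# Total cost per year in $ million, as read from the chart (Y1..Y5).
wdm_cost = [55, 77, 112, 165, 245]   # IP-over-WDM with MPLS restoration
otn_cost = [27, 37, 51, 72, 102]     # IP-over-OTN with optical restoration


def yearly_savings_percent(before, after):
    """Per-year relative savings in percent."""
    return [100.0 * (b - a) / b for b, a in zip(before, after)]


savings = yearly_savings_percent(wdm_cost, otn_cost)
average_savings = sum(savings) / len(savings)

if __name__ == "__main__":
    for year, pct in enumerate(savings, start=1):
        print(f"Y{year}: {pct:.1f}%")
    print(f"Average: {average_savings:.1f}%")
```

Under that assumption the savings grow from about 51% in Y1 to about 58% in Y5 and average roughly 54%.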
28
Switched Optical Backbone Advantages: Fast and Easy Provisioning
Risk management
  IP traffic patterns can vary
    Long-term variations: due to changes in application mix, peering policy, etc.
    Short-term variations: due to user behavior, scheduled time-of-day activities, etc.
  Traffic projection is difficult
  Fast reconfiguration reduces the detrimental business impact of inaccurate traffic projection
New services
  Storage and backup applications, SANs
  Datacenter-to-NAP connectivity for special event hosting
  Remote peering with other ISPs
29
Research Issues: IP Networks
Multi-area traffic engineering
  Aggregation of traffic engineering information
  Traffic engineering across AS boundaries
Primary and backup path routing
  Shared risk group information
  Distributed routing avoiding the same shared risk groups
Constraint-based Shortest Path First routing algorithms
Fast convergence of routing protocols
[Figure: AS with Area 0, Area 1, and Area 2, and a link to other ASes]
30
Research Issues: Optical Networks
[Figure: optical networks and optical subnets interconnected via UNI, E-NNI, and I-NNI interfaces]
Dynamic provisioning
  Generalized MPLS (GMPLS)
  Neighbor discovery, topology discovery, routing, signaling
Mesh restoration
  Shared mesh restoration
  Restoration protocols
IP-optical convergence
  Optical UNI
  Event-driven network reconfiguration