AMS-IX version 4 - NANOG
An MPLS/VPLS based internet exchange
AMS-IX version 4
Overview
‣ AMS-IX version 3
‣ Short overview
‣ Bottlenecks and limitations
‣ AMS-IX version 4
‣ The MPLS/VPLS platform
‣ AMS-IX v3 to v4 migration
‣ Operational Experience
AMS-IX version 3
June 2009 situation, before the start of the migration
[Diagram: v3 topology across six colocation sites (GlobalSwitch, NIKHEF, Telecity, SARA, euNetworks, Equinix). 10/100/1000 customer ports terminate on edge switches (BI15000 and MLX8); 10GE customers connect through photonic cross-connects (pxc-*) to stub switches (MLX16/MLX32); two core switches (core-eun-301 and core-glo-201, both MLX32) interconnect the sites.]
Characterization - AMS-IX version 3
‣ E, FE and (N *) GE connections on BI-15k or RX8 switches
‣ (N *) 10GE connections resiliently connected to the switching platform (MLX16 or MLX32) via PXCs
‣ Brocade “port security” on customer interface to enforce one MAC per port rule for loop prevention
Characterization - AMS-IX version 3
‣ Two networks: one active at any moment in time
‣ Selection of the active network by VSRP
‣ The switches of the inactive network block their ports to prevent loops
‣ PSCD, the photonic switch control daemon
‣ AMS-IX-developed software that acts on VSRP traps and manages the PXCs (a sketch of the idea follows below)
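To make the PSCD role concrete, here is a minimal, hypothetical sketch of such a daemon: it reacts to a VSRP mastership change and tells the photonic cross-connects to move the 10GE customer fibres to the newly active network. All class names, port numbers and the PxcClient interface are invented for illustration; the real PSCD and the Glimmerglass PXC API are not shown in the presentation.

```python
# Minimal, hypothetical sketch of a PSCD-like reaction to a VSRP failover.
# Names, ports and the PxcClient interface are invented for illustration;
# this is not the real AMS-IX PSCD or the Glimmerglass API.

from dataclasses import dataclass

@dataclass
class VsrpTrap:
    network: str     # "red" or "blue"
    new_state: str   # "master" or "backup"

class PxcClient:
    """Stand-in for a photonic cross-connect controller."""
    def connect(self, customer: str, access_port: int) -> None:
        print(f"PXC: moving {customer} to access port {access_port}")

# For each 10GE customer fibre: the PXC output port towards the red and
# towards the blue access switch (example data only).
CUSTOMER_PORTS = {
    "cust-a": {"red": 101, "blue": 201},
    "cust-b": {"red": 102, "blue": 202},
}

def handle_trap(trap: VsrpTrap, pxc: PxcClient) -> None:
    """Move every customer fibre to the network that just became master."""
    if trap.new_state != "master":
        return
    for customer, ports in CUSTOMER_PORTS.items():
        pxc.connect(customer, ports[trap.network])

if __name__ == "__main__":
    # Blue network wins VSRP mastership -> switch all customers to blue.
    handle_trap(VsrpTrap(network="blue", new_state="master"), PxcClient())
```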
Topology Failover
[Diagram, AMS-IX Version 3 Platform, two panels. Normal state: Core Red is VSRP MASTER, Core Blue is BACKUP; the 10GE access switches of the backup (blue) network keep their ports BLOCKED for traffic, VSRP HELLOs run between the networks, and customer 10GE routers reach the active network through the PXCs while 1GE customers connect via the edge switches. Failover: on a problem or maintenance in the red network, the VSRP priority of the Red master is set lower than the Blue priority; Core Blue becomes MASTER, Core Red becomes BACKUP, and the red side blocks its ports instead.]
Traffic and Port Prognoses
[Chart: "Historic and predicted customer ports", number of ports (0-500) from 1999 through 2012, with series for the existing port speeds (GE, 10GE), a 10GE long-term prediction, and 40/100GE.]
Bottlenecks and Limitations
AMS-IX version 3
‣ Core switches (MLX32, 128 line-rate 10GE ports) fully utilized
‣ Limits ISL upgrades
‣ In summer 2009 no substantially bigger switches were on the market
‣ Platform failover introduces a short link flap on all 10GE customer ports; in a few (but increasing number of) cases this leads to BGP flapping
‣ With more and more 10GE customer ports, the impact on overall platform stability keeps growing
‣ Growth in the number of 10GE connections and in 10GE customer LAG sizes requires larger 10GE access switches
‣ Smaller switches => less local switching => larger ISL trunks
AMS-IX version 4
Requirements - AMS-IX version 4
‣ Scale the core to at least double the number of ports (Q2/3 2009)
‣ Keep resilience in the platform and in the 10GE access, but reduce the impact of a failover
‣ Increase the number of 10GE customer ports per access switch
‣ More local switching
‣ Migrate to single architecture platform
‣ Reduce management overhead
‣ Use future-proof (3 to 5 year) hardware that allows upscaling to high-density 10GE (2010) and 40/100GE (end of 2010, early 2011)
Complete MPLS/VPLS topology
AMS-IX version 4
[Diagram: v4 topology across the same six colocation sites (GlobalSwitch, SARA, Telecity, NIKHEF, euNetworks, Equinix). Customer 10/100/1000 and 10GE ports connect via the pxc-* photonic cross-connects to edge switches (all MLX8) and stub switches (MLX16/MLX32); four core switches (core-eun-301/302 and core-glo-201/202, all MLX32) interconnect the sites.]
Overview - AMS-IX version 4
‣ MPLS/VPLS-based peering platform
‣ Scaling of core switches by adding extra switches in parallel
‣ 4 LSPs between each pair of access switches
‣ Load balancing of traffic over 4 LSPs between each pair of access switches
‣ Retain 10GE access switch resilience
‣ Keep 10GE customer connection on PXC
‣ No need for complete platform failover anymore
‣ Local impact only (single pair of access switches on a site)
Characterization - AMS-IX version 4
‣ OSPF
‣ BFD for fast detection of link failures
‣ RSVP-TE signalled LSPs over predefined paths
‣ primary and secondary (backup) paths defined
‣ VPLS instance per VLAN
‣ Statically defined VPLS peers (LDP signalled)
‣ Traffic load-balanced over parallel LSPs over all core routers (see the hashing sketch after this list)
‣ Layer 2 ACLs instead of Port Security
‣ Manual adjustment for now
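The sketch below illustrates the general idea behind the load balancing: a per-flow hash deterministically maps each flow onto one of the parallel LSPs between a pair of access switches, so frames of a flow stay in order. The hash fields and algorithm are assumptions for illustration; the presentation does not describe what the Brocade MLX platform actually hashes on.

```python
# Generic illustration of per-flow load balancing over 4 parallel LSPs.
# Hashing on src/dst MAC with hashlib is an illustrative assumption,
# not the vendor's actual hashing algorithm.

import hashlib
from dataclasses import dataclass

@dataclass(frozen=True)
class Frame:
    src_mac: str
    dst_mac: str

LSPS = ["lsp-via-core-eun-301", "lsp-via-core-eun-302",
        "lsp-via-core-glo-201", "lsp-via-core-glo-202"]

def pick_lsp(frame: Frame, lsps=LSPS) -> str:
    """Map a frame to one of the parallel LSPs; same flow -> same LSP."""
    key = f"{frame.src_mac}-{frame.dst_mac}".encode()
    digest = hashlib.sha256(key).digest()
    return lsps[digest[0] % len(lsps)]

if __name__ == "__main__":
    f = Frame(src_mac="00:11:22:33:44:55", dst_mac="66:77:88:99:aa:bb")
    print(pick_lsp(f))   # deterministic choice for this flow
    g = Frame(src_mac="00:11:22:33:44:66", dst_mac="66:77:88:99:aa:bb")
    print(pick_lsp(g))   # a different flow may land on another LSP
```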
MPLS/VPLS setup
core-loctation-1
MLX-32
core-location-1
MLX-32
core-location-2
MLX-32
core-location-2
MLX-32
PE-1-red
MLX-16
PXC
PE-1-blue
MLX-32
PE-3
MLX-8
core-loctation-1
MLX-32
core-location-1
MLX-32
core-location-2
MLX-32
core-location-2
MLX-32
PE-1-red
MLX-16
PXC
PE-1-blue
MLX-32
PE-3
MLX-8
core-loctation-1
MLX-32
core-location-1
MLX-32
core-location-2
MLX-32
core-location-2
MLX-32
PE-1-red
MLX-16
PXC
PE-3
MLX-8
PE-1-blue
MLX-32
core-loctation-1
MLX-32
core-location-1
MLX-32
core-location-2
MLX-32
core-location-2
MLX-32
PE-1-red
MLX-16
PXC
PE-1-blue
MLX-32
PE-3
MLX-8
MPLS path over one corebackup path over secondcore on different site
PhysicalInterconnection
MPLS/VPLS setup - Resilience
PE router failure: customers are switched to the second access switch.
[Diagram: on failure of PE-1-red, the PXC moves the customer connection to PE-1-blue; the legend distinguishes the primary path, the backup path, the LSP over the primary path and the LSP over the backup path.]
How did we do the platform migration?
AMS-IX v3 to v4 migration
[Diagram: the full AMS-IX v3 topology on the left and the target v4 MPLS/VPLS topology on the right, with a question mark for the migration path between them.]
Migration steps: Initial situation
AMS-IX v3 to v4 migration
[Diagram: the AMS-IX v3 topology (June 2009 situation), as shown earlier.]
Preparation - Platform Migration
‣ Build a new version of PSCD (Photonic Switch Control Daemon)
‣ No longer driven by VSRP traps, but by LSP state in the MPLS cloud
‣ Develop configuration automation
‣ Describe the network in XML and generate the configurations from that (a sketch of the idea follows after this list)
‣ Move non-MPLS-capable access switches behind MPLS routers and a PXC, as a 10GE customer connection
‣ Upgrade all non-MPLS-capable 10GE access switches to Brocade MLX hardware
‣ Define a migration scenario with no customer impact
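The presentation only states that the network is described in XML and that configurations are generated from it; the fragment below is a minimal sketch of that idea. The XML layout, loopback addresses and the CLI-like output text are invented for illustration and are not the AMS-IX tooling or the real device syntax.

```python
# Minimal sketch: describe the network in XML, generate config stanzas.
# The XML schema and the generated CLI-like text are invented examples.

import xml.etree.ElementTree as ET

NETWORK_XML = """
<network>
  <pe name="edge-sar-001" loopback="10.0.0.1"/>
  <pe name="edge-tel-005" loopback="10.0.0.5"/>
  <core name="core-eun-301"/>
  <core name="core-glo-201"/>
</network>
"""

def generate_lsp_config(xml_text: str) -> str:
    """Emit one LSP stanza per (source PE, destination PE, core) combination."""
    root = ET.fromstring(xml_text)
    pes = root.findall("pe")
    cores = [c.get("name") for c in root.findall("core")]
    lines = []
    for src in pes:
        for dst in pes:
            if src is dst:
                continue
            for core in cores:
                lines += [
                    f"lsp {src.get('name')}-to-{dst.get('name')}-via-{core}",
                    f"  to {dst.get('loopback')}",
                    f"  primary-path via-{core}",
                    "",
                ]
    return "\n".join(lines)

if __name__ == "__main__":
    print(generate_lsp_config(NETWORK_XML))
```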
Migration steps: Initial situation simplified
AMS-IX v3 to v4 migration
[Diagram: simplified to two colocation sites; each site has a red and a blue core switch, red and blue 10GE access switches, GE access switches, PXCs and customer routers (N x 10GE and n x 1GE per site).]
• Only 2 colocation sites shown, for simplicity
• Double L2 network
• VSRP for master/slave selection and loop protection
Migration steps: move GE access behind PXC
AMS-IX v3 to v4 migration
[Diagram: as before, but the GE access switches now connect through the PXCs to the 10GE access switches, like a 10GE customer connection.]
• It is not possible to connect a GE access switch to both the MPLS/VPLS cloud and the basic L2 network, hence this move
Migration steps: Migrate one half to MPLS/VPLS
AMS-IX v3 to v4 migration
[Diagram: the blue half is rebuilt as MPLS/VPLS, with P routers in the core and PE routers at the access layer; the red half remains a plain L2 network with its core switches.]
• Production traffic stays on the L2 network (red)
• Migrate the blue network to MPLS/VPLS
• Traffic between the two PE routers is load-balanced over 2 LSPs, one over each P router
• Test functionality and connections using test traffic sent by Anritsu traffic generators
Migration steps: Production on MPLS/VPLS, L2 backup
AMS-IX v3 to v4 migration
[Diagram: same topology as the previous step; the MPLS/VPLS (blue) side now carries production, the red L2 network remains as backup behind the PXCs.]
• Move production traffic to MPLS/VPLS cloud
• Use PXCs for failover
• New PSCD
• Run production on MPLS/VPLS cloud for 6 weeks
Migration steps: Two MPLS/VPLS platforms
AMS-IX v3 to v4 migration
[Diagram: both halves are now MPLS/VPLS, each with its own P and PE routers per colocation site; they still form two separate networks.]
• Migrate second half of the platform to MPLS/VPLS
• Test functionality and connections using test traffic sent by Anritsu traffic generators
Migration steps: production on second MPLS/VPLS platform
AMS-IX v3 to v4 migration
[Diagram: same dual MPLS/VPLS topology; production traffic now runs on the red MPLS/VPLS cloud.]
• Move production traffic to red MPLS/VPLS cloud using the newly developed version of PSCD to manage the PXCs
• Still two separate networks, both MPLS/VPLS based
Migration steps: integration to single MPLS/VPLS cloud
AMS-IX v3 to v4 migration
[Diagram: a single MPLS/VPLS cloud; every PE router is connected to all four P routers, and customers connect to their local PE routers through the PXCs.]
• All PE routers connected to all P routers
• Between each pair of PE routers there are 4 LSPs, one over each P router (see the sketch after this list)
• Traffic between each pair of PE routers is load-balanced over the 4 LSPs
• 10GE customer connections distributed over the local PE routers
• Resilience of the 10GE customer connection to the local PE router by means of the PXCs
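To give a feel for the size of the resulting LSP mesh, here is a small back-of-the-envelope sketch of "4 LSPs between each pair of PE routers, one over each P router". The PE-router counts are assumptions (the presentation gives no exact figure), and counting one LSP per ordered PE pair (i.e. per direction) is also an assumption.

```python
# Back-of-the-envelope for the size of the LSP mesh (illustrative numbers).
# The PE counts below are assumptions; four P routers is from the v4 design.

def lsp_count(num_pe: int, num_p: int = 4) -> int:
    """Unidirectional LSPs: one per ordered PE pair per P router."""
    return num_pe * (num_pe - 1) * num_p

if __name__ == "__main__":
    for pes in (6, 12, 18):
        print(f"{pes:2d} PE routers -> {lsp_count(pes):4d} LSPs")
    # Even modest PE counts yield hundreds to over a thousand LSP
    # definitions, which is why generated configuration is a necessity.
```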
Migration - Conclusion
‣ Traffic load balancing over multiple core switches solves scaling issues in the core
‣ Increased stability of the platform
‣ Backbone failures are handled in the MPLS cloud and not seen at the access level.
‣ Access switch failures are handled by the PXCs and affect only a single pair of switches, not the whole platform
‣ Upscaling access switches to Brocade MLX32 allows for higher access port density
Operational Experiences
Operational experience - Issues
‣ BFD instability
‣ High LP CPU load caused BFD timeouts
‣ Resolved by increasing timers
‣ Bug: ghost tunnels
‣ Double “Up” event for LSP path
‣ Results in unequal load-balancing
‣ Scheduled to be fixed in next patch release
Operational experience - Issues (2)
‣ Multicast replication
‣ Replication done on ingress PE, not on core
‣ Only uses the first link of the aggregate of the first LSP
‣ With PIM-SM snooping the traffic is balanced over multiple links, but this has some serious bugs
‣ Bugfixes and load-sharing of multicast traffic over multiple LSPs scheduled for next major release
Operational experience - Issues (3)
‣ Delay spikes in the RIPE TTM graphs
‣ TTM datagrams are sent at a low rate (2 packets per minute), with some entropy (the source port changes)
‣ Brocade VPLS CAM: entries are programmed individually for each backbone port and age out after 60 s
‣ For 24-port aggregates, traffic often arrives on a port without a programmed entry => CPU learning => high delay (see the rough calculation below)
‣ Does not affect real-world traffic
‣ Much shorter interval between frames, so the entries stay programmed
‣ Looking into changing/disabling CAM aging
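A rough back-of-the-envelope, assuming the changing source port spreads the TTM datagrams roughly uniformly over the 24 member links: at 2 packets per minute, each individual link sees on average one TTM packet every 12 minutes, far beyond the 60-second CAM aging, so nearly every TTM packet hits a stale entry. The uniform-hashing assumption is mine; the 24 links, 2 packets/minute and 60 s aging are from the slide.

```python
# Rough estimate of how often a TTM probe hits an aggregate member whose
# VPLS CAM entry has aged out. Uniform per-packet hashing over the member
# links is an assumption made for illustration.

LINKS = 24            # member links in the aggregate (from the slide)
RATE_PER_MIN = 2      # TTM datagrams per minute (from the slide)
AGING_S = 60          # CAM aging time in seconds (from the slide)

# Packets that hashed to any member during one aging window:
packets_per_window = RATE_PER_MIN * AGING_S / 60

# Chance that none of those packets landed on the member the next packet
# uses, i.e. its CAM entry is stale and the frame needs CPU learning:
p_stale = ((LINKS - 1) / LINKS) ** packets_per_window

print(f"~{p_stale:.0%} of TTM packets hit an unprogrammed port")  # ~92%
```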
Operational experience - Issues (4)
‣ From 213.136.17.28: icmp_seq=1 Packet is claustrophobic
‣ Limited to single user
‣ Suspecting problem caused by protocol-stack on client ;-)
Operational experience - The good stuff
‣ Increased stability
‣ Backbone failures handled by MPLS (not seen by customers)
‣ Access switch failures handled for a single pair of switches
‣ Phased relocation of traffic streams
‣ Looped traffic filtered by L2 ACL => No effect on linecard CPU
Operational experience - The good stuff (2)
‣ Easier debugging of customer ports
‣ Simply swap to different, active switch using Glimmerglass PXC
‣ Config generation
‣ Absolute necessity due to size of MPLS/VPLS configuration
‣ Fairly simple because of single hardware platform
Operational experience - The good stuff (3)
‣ Scalability (future options)
‣ Bigger core devices
‣ Do not need to be MPLS-capable
‣ Load-sharing over > 4 cores
‣ Pending feature request
‣ Use of different cores for sets of PEs
‣ Multiple layers of P-routers
Conclusions
‣ Some issues found
‣ Nothing with impact on customer traffic
‣ Traffic load-sharing over multiple devices solves scaling issues in the core
‣ Increased stability of the platform
‣ Backbone failures not seen at the access level
‣ Access switch failures trigger failover for corresponding Glimmerglass PXCs only
‣ Upscaling access switches allows for higher access port density
‣ Single hardware platform simplifies configuration generation
Questions?