
Route Analytics for RFC 2547bis MPLS VPNs

Putting the Layer 3 into Layer 3 MPLS VPN Management


MPLS VPN Route Analytics

2013 Packet Design, Inc. 1

Executive Summary

Layer 3 MPLS VPNs are an important component of most service provider WAN offerings today, and represent a major revenue stream. Yet while service providers enjoy a level of maturity in service delivery of RFC 2547bis MPLS VPNs, they still suffer from immature MPLS VPN service management capabilities. Management systems have focused on traditional device-oriented connectivity and performance statistics, and are unable to provide the per-customer visibility needed to ensure customers’ site-to-site IP reachability, VPN privacy and correct VPN routing policy. These key customer and operational concerns can be addressed by a new technology for IP Layer 3 network management.

The lack of a Layer 3 management capability has significant operational impacts, such as reduced service assurance and operational efficiency, and bottom-line impacts, such as lost revenue when Service Level Agreements are not met and a longer time to verified customer provisioning and revenue. This white paper examines the current operational state of MPLS VPN management, introduces IP route analytics technology and explains how Packet Design’s VPN Explorer provides the missing Layer 3 management needed to help service providers maximize efficiency, productivity and profitability in their MPLS VPN service offerings.

The State of MPLS VPN Management

RFC 2547bis MPLS VPNs are based on the notion of a virtualized network infrastructure where each customer’s traffic is routed through the provider’s network based on information maintained in separate Virtual Routing and Forwarding (VRF) tables. Customer edge (CE) routers peer with service provider edge (PE) routers, where each VRF acts as a private container for an individual customer’s VPN routing information, directing the flow of traffic between that customer’s sites, while maintaining the privacy of their site addresses from other customers. This virtualized, per-customer routing information is distributed between the PE routers participating in each customer’s VPN using an extended version of the BGP routing protocol known as MBGP. Customer traffic between any two PE routers participating in their VPN is forwarded across one or more provider (P) routers in the MPLS core of the service provider’s network. For a more detailed introduction to MPLS VPNs, see the Resources section at the end of this paper.
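The per-customer separation described above can be illustrated with a minimal sketch. It assumes a hypothetical `vrf_tables` structure and made-up VRF names, prefixes and next hops; it is not how any particular router stores its VRFs. The point is only that a lookup is confined to one customer's table, so two customers can use the same prefix without conflict.

```python
import ipaddress

# Hypothetical VRF tables on one PE router: VRF name -> {prefix: next hop}.
# All names and addresses are illustrative assumptions.
vrf_tables = {
    "customer_a": {"10.1.0.0/16": "ce-a1", "10.1.2.0/24": "ce-a2"},
    "customer_b": {"10.1.0.0/16": "ce-b1"},
}

def lookup(vrf, destination):
    """Longest-prefix match restricted to a single customer's VRF."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in vrf_tables[vrf].items():
        net = ipaddress.ip_network(prefix)
        # Keep the matching prefix with the longest mask.
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None
```

Here `lookup("customer_a", "10.1.2.5")` and `lookup("customer_b", "10.1.2.5")` resolve to different next hops even though the destination address is identical, because each lookup sees only its own customer's routes.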

Despite the virtualized, logical nature of the MPLS VPN architecture, MPLS VPN management solutions have been remarkably device oriented. Management of MPLS networks and services has so far been accomplished using SNMP polling of router interfaces, and is focused on router connectivity or performance issues. Connectivity-oriented management solutions identify affected VPNs when there is a device or interface failure on a CE, PE or P router, while performance management solutions provide periodic data on delay, jitter, and packet loss within the MPLS core. Ironically, the excess bandwidth and path redundancy available in most service provider core networks mean that relatively few problems result from device failures or traffic congestion. Together, connectivity and performance management tools deliver some visibility into VPN status, but they both miss necessary visibility into the virtualized, distributed routing upon which the individual customer VPNs are built.

MPLS VPN Routing is a Key Management Challenge

Most of the problems in managing RFC 2547bis MPLS VPN services are due to the complex virtualized routing configurations required. Today, the configuration and validation of MPLS VPNs is largely a manual process, which is error-prone due to the lack of uniformity in configurations across the aggregate set of customer VPNs.

• Each customer VPN requires configuration of many devices, possibly from multiple vendors

• Each PE router hosts a large number of VRFs

• PE participation in customer VPNs varies based on site locations

• Customer VPN routing policy varies: some are full-mesh connectivity WANs and some are hub and spoke topologies

Both design errors and human errors can cause configuration problems, and these errors can occur at multiple stages in the life of the service:

• During new service provisioning

• When making a change to a customer’s service

• As a result of changes to other customers’ services

• Through competing configurations with other services, such as Internet access service

• During routine router maintenance or upgrades

• Due to vendor “soft” failures or bugs

In addition to configuration errors, MPLS VPNs are prone to routing problems because customers are in essence peering directly with the service provider’s core BGP routing operation. This CE-to-PE peering is vulnerable to the same problems that are common to all router peerings, yet may have greater impact due to the multiple VPNs supported by a common BGP mesh. For example:

• Route leakage: A CE router may mistakenly redistribute incorrect routes, or even the full Internet routing table, and advertise those routes to its corresponding PE router. If proper filtering is not enabled, this can consume the memory in the PE router, possibly disrupting service to all VPNs supported by the PE router. In addition, the leaked Internet routes may be advertised to other PE routers participating in that customer’s VPN, further affecting the VPN network or the entire BGP mesh. This is a complex problem, since there can be a wide variance in the number of prefixes per customer VPN; service providers have observed anywhere from 5 to 50,000 prefixes being routed within a customer VPN.

• Route flapping: A CE router may inject a flapping route from the customer’s site into the service provider’s network, which would not be visible to an SNMP-based management system, and could impact CPU utilization of the PE routers and disrupt multiple customer VPNs.
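Both problems above can be spotted from routing data alone. The following is a minimal sketch under stated assumptions: the function names, the `(vpn, prefix, action)` event format, the default limit of 1000 prefixes and the threshold of four transitions are all hypothetical choices for illustration, not values taken from any product or standard.

```python
from collections import Counter

def find_leaks(prefix_counts, limits, default_limit=1000):
    """Return VPNs whose advertised prefix count exceeds their configured limit.

    prefix_counts: VPN name -> number of prefixes currently advertised.
    limits: VPN name -> per-VPN limit (assumed to come from provisioning data).
    """
    return sorted(vpn for vpn, count in prefix_counts.items()
                  if count > limits.get(vpn, default_limit))

def find_flaps(events, min_transitions=4):
    """Return (vpn, prefix) pairs that changed state at least min_transitions times.

    events: iterable of (vpn, prefix, action), action is 'announce' or 'withdraw'.
    """
    last_action = {}
    transitions = Counter()
    for vpn, prefix, action in events:
        key = (vpn, prefix)
        # Count a transition only when the action differs from the previous one.
        if key in last_action and last_action[key] != action:
            transitions[key] += 1
        last_action[key] = action
    return {key: n for key, n in transitions.items() if n >= min_transitions}
```

Because per-VPN prefix counts vary so widely, a single global limit would be either too loose for small VPNs or too tight for large ones, which is why the sketch takes a per-VPN limit with a fallback default.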


The complexity of problems that can occur in MPLS VPN routing is exacerbated by the fact that BGP, which manages the VPN routing, is itself a very complex routing protocol that is difficult to manage. Even in a pure MPLS VPN network, BGP can easily generate hundreds of thousands of messages per second, creating a problem for network engineers who have to sort through large volumes of BGP events to detect or isolate reported customer problems.

New Metrics for MPLS VPN Management Required

In the face of these VPN routing challenges, a new set of service management metrics is needed for VPN visibility beyond devices and interfaces. The following three VPN service metrics must be proactively managed in order to deliver reliable Layer 3 MPLS VPN service, and are not addressed by traditional, SNMP-based management tools.

Per-Customer VPN Reachability: Reachability expresses whether VPN prefixes are properly distributed between each customer’s sites by the service provider’s BGP routing, enabling the site-to-site connectivity expected by the customer. Reachability management requires an historical understanding and baselining of customer advertised prefixes and PE router participation in each VPN so that deviations such as dropped/added prefixes can be immediately detected and investigated to ensure that they do not signify a disruption of service. In addition, monitoring of active versus baseline PE routers per customer VPN shows any dropped PE routers, which may signify a service disruption due to a misbehaving router, or a peering problem with a particular site.
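The baseline comparison described above amounts to set arithmetic over the prefixes (or PE routers) seen for one VPN. The sketch below is illustrative only: the function name is hypothetical, and the deviation formula (changed items as a percentage of the baseline size) is an assumption, since this paper does not define how a deviation percentage is computed.

```python
def reachability_deviation(baseline, current):
    """Compare the current prefix set of one VPN with its baseline.

    Returns (withdrawn, added, percent). The percentage is an assumed metric:
    changed prefixes relative to the size of the baseline.
    """
    withdrawn = baseline - current   # in the baseline, no longer advertised
    added = current - baseline       # advertised now, not in the baseline
    if not baseline:
        return withdrawn, added, 0.0
    percent = 100.0 * (len(withdrawn) + len(added)) / len(baseline)
    return withdrawn, added, percent
```

The same function applies unchanged to sets of PE router names, which gives the active-versus-baseline PE participation check mentioned above.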

Per-Customer VPN Privacy: A fundamental requirement of an MPLS VPN service is that customer traffic and routing are kept private. MPLS VPN privacy management ensures that there is no route leakage between individual customer VPNs. Since MPLS VPNs allow different customers to use the same IP address space, such as private addresses (RFC 1918), route distinguishers (RDs) are used to distinguish customer routes as they are distributed across the BGP mesh between PE routers. Yet RDs can be misconfigured or assigned to the wrong customer, causing overlapping routes between two customer VPNs. Historical baselining and deviation monitoring of per-customer PE router participation and prefixes advertised by each participating PE router allow operators to catch rogue or misconfigured RDs that would otherwise escape detection until a customer reports a privacy violation.
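One simple check for the RD misassignment described above is to look for an RD that appears under more than one customer. This is a sketch under assumptions: the `(customer, rd, prefix)` tuple format and the idea that each route can be attributed to a customer from provisioning data are hypothetical, and the RD values are made up.

```python
from collections import defaultdict

def shared_rds(routes):
    """Return RDs that appear under more than one customer.

    routes: iterable of (customer, rd, prefix) tuples; attributing a route to a
    customer is assumed to be possible from provisioning records.
    """
    owners = defaultdict(set)
    for customer, rd, _prefix in routes:
        owners[rd].add(customer)
    # An RD with several owners makes identical prefixes collide.
    return {rd: sorted(names) for rd, names in owners.items() if len(names) > 1}
```

An RD shared by two customers is a problem precisely because both may use the same RFC 1918 prefix: once the RD no longer differs, their VPN routes become indistinguishable in the BGP mesh.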

Per-Customer VPN Policy: Customer WANs are not all configured as full-mesh networks, so MPLS VPNs must be configured according to the customer’s chosen policy (e.g. hub and spoke). Route targets (RTs) are utilized to set routing policy within each customer’s VPN. As with other VPN routing parameters, RTs can easily be misconfigured, causing sub-optimal or disrupted connectivity within the customer VPN. Historical baselining and monitoring of deviations in per-customer RTs helps operators immediately notice any routing policy changes and investigate them.
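How route targets express a policy such as hub and spoke can be sketched as follows. The rule modeled here is the standard one: a VRF imports a route when the route's export RTs intersect the VRF's import RTs. The site names, RT values and the `can_reach` helper are hypothetical, chosen only for illustration.

```python
def can_reach(sites):
    """For each site, list the other sites whose routes it imports.

    sites: name -> {"export": set of RTs, "import": set of RTs}.
    Site X learns site Y's routes when Y's export RTs intersect X's import RTs.
    """
    learned = {}
    for x, x_cfg in sites.items():
        learned[x] = sorted(
            y for y, y_cfg in sites.items()
            if y != x and y_cfg["export"] & x_cfg["import"]
        )
    return learned

# Hub and spoke: spokes export one RT and import only the hub's RT.
hub_and_spoke = {
    "hub":    {"export": {"65000:100"}, "import": {"65000:200"}},
    "spoke1": {"export": {"65000:200"}, "import": {"65000:100"}},
    "spoke2": {"export": {"65000:200"}, "import": {"65000:100"}},
}
```

With this configuration each spoke learns only the hub's routes. Changing a single import RT on one spoke silently alters that picture, which is the kind of deviation that per-customer RT baselining is meant to surface.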

These metrics are more problematic for service providers than core connectivity or performance issues, yet are unaddressed by traditional management solutions.


Coping with the Rate of Change in VPN Routing Management

Routing changes in an IP network occur at a high and constant rate compared to the typical rate of change in device and interface status. This makes SNMP polling-based management approaches unsuitable for addressing the key routing management challenges in delivering MPLS VPN services, since typical multi-minute polling cycles can easily miss critical event information. BGP, in particular, is a verbose protocol, capable of generating thousands of messages per second in a large MPLS VPN core network, and millions of messages per second in an Internet-connected network. This rate and volume of raw BGP data is beyond human ability to analyze effectively. Given the complexity of virtualized service configurations in an MPLS VPN network, the result is that analyzing and troubleshooting MPLS VPN issues requires a heavy investment in time, expertise and manpower.
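Why a multi-minute polling cycle misses routing events can be shown with a small simulation. This is a sketch with invented numbers: the event times, the 300-second polling interval and the helper names are assumptions made for illustration only.

```python
def state_at(events, t, initial=True):
    """Reachability of a prefix at time t, given time-sorted (timestamp, is_up) events."""
    state = initial
    for timestamp, is_up in events:
        if timestamp > t:
            break
        state = is_up
    return state

def changes_seen_by_polling(events, interval, end, initial=True):
    """Count state changes visible to a poller that samples every `interval` seconds."""
    seen, previous = 0, initial
    for t in range(interval, end + 1, interval):
        current = state_at(events, t, initial)
        if current != previous:
            seen += 1
        previous = current
    return seen

# A prefix is withdrawn and re-announced twice within 15 minutes.
flap_events = [(70, False), (130, True), (400, False), (430, True)]
```

With these events, a poller sampling every 300 seconds finds the prefix up at every sample and records no change at all, although four routing events occurred. A listener that receives the routing updates themselves sees each one.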

The amount of effort it takes to analyze the routing issues inherent in today’s MPLS VPN networks slows validation of new service provisioning, deters taking preventive measures on emerging problems, causes delays in mean-time-to-discover (MTTD) and mean-time-to-repair (MTTR) when problems occur, and lowers service assurance and operational productivity. This is a significant challenge for service providers, who have a critical management requirement for real-time monitoring and diagnostic tools to address customer VPN issues.

Route Analytics—Network-Layer Management for MPLS VPNs

A new technology called IP route analytics offers a fundamentally new approach to MPLS VPN management that meets the challenge of complex routing issues by tapping into the distributed intelligence at the heart of IP networks—IP routing. IP routing protocols such as OSPF, EIGRP, IS-IS, BGP, and MP-BGP, operating in routers and switches, control the network-layer behavior of an IP network. IP route analytics leverages the information in the routing protocols to let engineers easily understand how the network is operating, providing a network-layer management solution that complements traditional device management solutions.

Specifically for RFC 2547bis MPLS VPNs, route analytics works by:

• Listening to the IGP and MBGP routing protocol exchanges in the service provider’s network

• Computing a real-time, network-wide routing topology map

• Computing real-time, per-customer VPN routing maps including PE participation for each VPN, advertised prefixes for each VPN site, and VPN routing policies

• Establishing a per-customer baseline of MPLS VPN routing, from which to analyze and alert on changes

• Monitoring and displaying MPLS VPN routing topology changes as they happen, with the ability to view them on a per-customer basis

• Detecting and alerting on routing events or failures as routers announce them

• Recording, analyzing and reporting on historical routing events and trends, in aggregate and per customer VPN

• Analyzing the impact of routing changes or failures on the “as-running” network before they happen
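The last step in the list above, analyzing a failure before it happens, can be sketched as a reachability check on a copy of the topology with one link removed. The topology, router names and `reachable` helper are hypothetical, and a real routing model would also account for metrics and policy; this shows only the basic idea.

```python
from collections import deque

def reachable(links, source, failed=frozenset()):
    """Breadth-first search over undirected links, skipping any failed links."""
    adjacency = {}
    for a, b in links:
        if (a, b) in failed or (b, a) in failed:
            continue
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    seen, queue = {source}, deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

# Illustrative core: pe2 has two paths from pe1, pe3 has only one.
links = [("pe1", "p1"), ("pe1", "p2"), ("p1", "pe2"), ("p2", "pe2"), ("p2", "pe3")]
```

Calling `reachable(links, "pe1", failed={("p2", "pe3")})` shows that pe3 would be cut off by that single failure, while failing the link between pe1 and p1 leaves every PE reachable through p2.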

The result is the most accurate, up-to-date picture of how the MPLS VPN network is operating: a network-wide, real-time, holistic view of the network’s behavior, with detailed insight into each customer VPN’s routing operation. Route analytics provides the information network engineers need to quickly validate newly provisioned customer VPNs, detect and resolve dynamic, network-layer problems, and plan for and prevent unpredictable or adverse network behavior before it can affect customer VPN services.

IP route analytics lets network engineers visualize and understand the dynamic operation of the network as never before. By monitoring the routing control plane, IP route analytics solutions are able to construct the router’s view of the network and individual customer VPNs, “seeing” topology, addressing and other logical changes or problems in real-time as they occur. Loss of customer VPN site-to-site IP reachability is immediately detected, even when device-level status is unchanged or unknown. Routing instabilities or changes that go unnoticed by conventional management systems, but which impact customer VPN availability and performance, are visible within seconds, leading to early detection of service outages and reduced time to repair.

Since route analytics automates these functions, network engineering can reduce MTTR when problems occur, regain lost productivity and leverage the historical and forensic audit trail to more easily explain routing problems in the network, increasing customer credibility and satisfaction. Additional impact-analysis functions enable network engineering to model failure scenarios on the as-running network model, to proactively ensure the continued health of VPN services.

For a full introduction to route analytics technology features and benefits, please read the Packet Design white paper: Route Analytics—Foundation of Modern Network Operations, and visit Packet Design’s website at:

http://www.packetdesign.com


VPN Explorer

Packet Design’s VPN Explorer is the first route analytics solution developed specifically for MPLS VPNs, offering service providers a wealth of customer VPN management capabilities.

Low-Overhead Appliance Solution

VPN Explorer is a 1U-high, rack-mountable appliance that installs easily and is up and running in hours. Configured to peer with the PE router mesh or with route reflectors, whether used in a single autonomous system (AS) or a confederation of member ASes, VPN Explorer is compatible with any MPLS VPN-enabled BGP deployment. A single VPN Explorer can provide network-wide route analytics for all customer VPNs in even the largest service provider networks. Since VPN Explorer only listens to BGP messages and does not inject routes, it imposes minimal overhead on routers and the network and is neither a bottleneck nor a failure point for traffic. Pre-configured reports, histograms, and detailed drill-down analyses reduce time to value and avoid the expensive report development cycles associated with many network management products.

Faster Provisioning Validation

When new VPN services are provisioned, VPN Explorer is an invaluable tool for ensuring the operational status and accuracy of customer services. Because it delivers a real-time view of PE router (i.e. site) participation, Layer 3 site-to-site reachability and routing policy, operations groups can ensure that the customer’s VPN is provisioned as intended. This unprecedented visibility helps speed time to deployment and customer revenue.

Automatic Per-Customer VPN Baselining

Once installed and configured with appropriate BGP peerings, VPN Explorer automatically monitors all customer VPNs, and automatically computes a baseline for every route, PE router and Route Target in each customer’s VPN. This baseline is used to compute deviations in site reachability and potential privacy violations between separate customer VPNs.

Customizable Alarm Thresholds

VPN Explorer allows operators and network engineers to customize baseline deviation thresholds and set severity levels for alarms that are viewable via the VPN Explorer GUI, or sent to other network management elements via SNMP traps or Syslog messages.
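Mapping a baseline deviation to an alarm severity through operator-defined thresholds can be sketched as below. The severity names and threshold values are invented for illustration and are not VPN Explorer's actual configuration model.

```python
def severity(deviation_percent, thresholds):
    """Return the highest severity whose threshold the deviation meets, else None.

    thresholds: severity name -> minimum deviation percentage (operator-defined).
    """
    matched = [(limit, name) for name, limit in thresholds.items()
               if deviation_percent >= limit]
    # The largest satisfied threshold determines the severity.
    return max(matched)[1] if matched else None

# Hypothetical thresholds an operator might choose.
example_thresholds = {"minor": 10, "major": 30, "critical": 60}
```

Under these example thresholds a 54% deviation maps to "major" and a 70% deviation to "critical", while a deviation below 10% raises no alarm.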

At-a-Glance View of Overall MPLS VPN Service Health

VPN Explorer provides a top-level, “VPN Explorer” view of MPLS VPN service health, allowing network engineers to easily monitor the following information, as illustrated in Figure 1:

• A summary dashboard of overall health of the VPN network

• Alarms on specific customer VPNs with deviations from baseline based on customizable thresholds and severity levels

• Lists of customers with highest deviations from baseline to help prioritize response

  o Route reachability deviation by customer or by RT

  o PE router (site) participation deviation by customer or by RT

• Quick navigation of analysis tools and pre-packaged reports to rapidly pinpoint and troubleshoot customer VPN issues

Figure 1: VPN Explorer: VPN Explorer view

VPN Explorer lets network operators quickly identify customer VPNs that have experienced the greatest deviation from baseline, while the alarm feature aids in prioritizing responses.


For example, Figure 1 indicates that Customer58 has had significant changes in both reachability (70%) and PE participation (66%). The information shown in Figure 1 is from an actual service provider network, but the customer names have been anonymized. In production deployments, real customer names may be utilized.

Per-Customer VPN Reachability

Summary VPN reachability reports by customer, VPN reachability over time (Figure 2) and many other reports and graphs are available to help network operators ensure that customer site-to-site route reachability is correct. For example, Figure 2 shows that Customer5’s VPN has experienced a 54% deviation in route reachability, with two routes withdrawn from the baseline, and 100 new routes added. The two lost routes may be the most immediate concern, since it is important to ensure that lost routes, which impact site-to-site reachability, are not due to an error. However, such a large addition of new routes also warrants investigation to ensure that a route leak or other error is not injecting inappropriate routes into Customer5’s VPN.

Figure 3: Customer VPN route reachability over time

Drill-down reports give operators a detailed view of specific routers and routes that have changed from a customer’s baseline, including:

• Prefixes and associated PE routers

• BGP attributes such as:

  o AS Path, Local-Pref and Communities

• Router status:

  o Up/Down

  o Included in baseline or new

Per-Customer VPN Privacy

In some cases, a customer VPN indicating new routes or new PE routers may mean that a privacy violation has occurred. VPN Explorer gives operators immediate visibility into these and other deviations from baseline, letting them proactively investigate and resolve potential issues, even before customers become aware of them. Figure 4 shows that, of the 100 new routes added to Customer5’s VPN, 13 come from two new PE routers (easily identified in a PE Participation report not shown) that were added to the baseline of eight PE routers. This may be understood as network growth, but further investigation may be advisable to ensure that two new PE routers from another customer’s VPN were not mistakenly added to Customer5’s VPN through misconfiguration, causing a privacy violation.

Figure 4: Two new PE routers in Customer5’s VPN account for 13 new routes—a possible privacy issue

In some instances, a privacy violation can be caused by misconfigurations during maintenance or provisioning of another customer’s VPN, in which case the PE participation report may reveal another customer VPN that recently lost two PE routers from its baseline.
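The cross-check described above, where one customer's lost PE routers match another customer's new ones, can be sketched as a comparison of per-customer PE sets. The function name, the data layout and the customer and router names in the usage note are hypothetical.

```python
def moved_pes(baselines, current):
    """Find PE routers that left one customer's baseline and appeared in another's VPN.

    baselines, current: customer name -> set of PE router names.
    Returns a list of (pe, lost_from, gained_by) tuples.
    """
    lost = {c: baselines[c] - current.get(c, set()) for c in baselines}
    gained = {c: current[c] - baselines.get(c, set()) for c in current}
    suspects = []
    for source, lost_pes in lost.items():
        for destination, gained_pes in gained.items():
            if source != destination:
                # A PE lost by one customer and gained by another is suspicious.
                for pe in sorted(lost_pes & gained_pes):
                    suspects.append((pe, source, destination))
    return suspects
```

For example, if one customer's baseline contains pe7 and pe8 and both routers later show up only in a second customer's VPN, the function returns one tuple per router naming both customers, which gives the operator a concrete starting point for the investigation.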


VPN Explorer’s complete, historical record of PE router and route changes gives the network operator a solid footing for troubleshooting and forensic analysis when needed.

Per-Customer VPN Policy

VPN route reachability and PE router participation can be viewed over time (Figure 5) by customer name or by individual route targets, allowing operators to identify possible misconfigured routing policies that affect how sites are logically connected and potentially result in reachability problems.

Figure 5: Reachability deviation can be viewed by route targets or by customers


Per-Customer VPN Routing Activity Histograms

Histogram views of per-customer VPN routing events (Figure 6) provide an easy way for operators to visually identify periods of problematic or excess activity and drill down on selected time periods to determine root cause.

Figure 6: Histogram shows routing events per customer
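A histogram of routing events like the one in Figure 6 is, at its core, a count of events per fixed time bucket. The sketch below assumes events are available as plain timestamps in seconds; the function names and the bucket width are illustrative choices.

```python
from collections import Counter

def event_histogram(timestamps, bucket_seconds):
    """Count events per fixed-width time bucket, keyed by bucket start time."""
    return Counter((ts // bucket_seconds) * bucket_seconds for ts in timestamps)

def busiest_bucket(histogram):
    """Return (bucket_start, count) for the bucket with the most events."""
    return max(histogram.items(), key=lambda item: item[1])
```

Running `event_histogram` separately on each customer's events and then inspecting the busiest bucket points the operator at the time window worth drilling into.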

Conclusion

IP route analytics harnesses the IP routing control plane underlying MPLS VPN services to deliver indispensable information for the reliable operation and management of customer VPN services. From predictive and early warning indicators of MPLS VPN service health issues, to detailed forensics and troubleshooting tools, IP route analytics helps service providers deliver superior MPLS VPN services through superior MPLS VPN service management.

Given its unique capabilities, valuable benefits, low cost and ease of implementation, VPN Explorer delivers an extraordinary ROI. VPN Explorer is part of a suite of fully integrated IP routing, MPLS VPN, RSVP-TE traffic engineering and NetFlow analysis solutions, offering the most comprehensive monitoring, analysis, modeling and capacity planning capabilities available for Layer 3 network engineering and operations. Used alongside traditional device and performance management systems, VPN Explorer and the Packet Design solution suite enable service providers to attain faster time to service revenue, higher service assurance, increased operational productivity and greater profitability.

To learn more about Packet Design and VPN Explorer, please:

• Email us at [email protected]

• Visit Packet Design’s web site at http://www.packetdesign.com