Migrating to OpenFlow SDNs
Justin Dustzadeh, Huawei
US Ignite ONF GENI workshop, October 8, 2013
© 2013 Open Networking Foundation
Outline
• Overview
• Migration Use Cases
• Conclusion
Migration Working Group: Overview
Objective
• Accelerate adoption of open SDN; assist network operators with recommendations on SDN migration

Timeline
• Formed in April 2013; 1st milestone deliverable ready, 3 other milestones through 2Q2014

Focus
• Examine real-world migration use cases, gather best practices, and make recommendations on migration methods, tools, and systems

Who
• Team of industry experts and practitioners who have carried out, or have an interest in carrying out, SDN migrations
ONF Migration Working Group: Charter, Goals & Migration Steps
1. Identify core requirements of the Target Network
2. Prepare the Starting Network for migration
3. Phased migration of service
4. Validate the result

[Diagram: the Starting Network is taken through a Phased Migration to the Target Network, whose devices come under an OpenFlow Controller; the four steps above are annotated on the diagram.]
What Are We Producing? Migration WG Deliverables
1st milestone:
• Submit document on use cases and migration methods, leveraging the experience of prior work by network operators

2nd milestone:
• Submit document describing the goals and metrics for the migration

3rd milestone:
• Publish prototype working code for migration, and validate the metrics

4th milestone:
• Demonstration of prototype migration tool chain
ONF SDN Architecture

[Diagram: the ONF SDN architecture.]
SDN Migration Approaches: 1. Direct Upgrade
[Diagram: in the Starting Network, Operational Support Systems manage devices that each run an integrated control plane. Existing equipment is upgraded in place with OpenFlow Agents, producing the Target Network, where an OpenFlow Controller & Configurator manages the devices.]
SDN Migration Approaches: 2. Phased (Parallel) Upgrade

[Diagram: the Starting Network, with Operational Support Systems and per-device control planes, moves through a Phased Deployment into the Target Network; during the transition, an OpenFlow Controller & Configurator operates alongside the legacy control planes and Operational Support Systems.]
SDN Migration Approaches: A Closer Look at Device Types
• Legacy Switch: traditional switch/router with integrated control and forwarding planes
• OpenFlow Switch: OpenFlow forwarding only, with the control plane residing external to the device
• Hybrid Switch: OpenFlow forwarding as well as legacy control and data planes
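The taxonomy above can be summarized by which control planes each device type exposes. The following is an illustrative Python sketch (the class, names, and fields are my own, not from the deck):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DeviceType:
    name: str
    legacy_control: bool    # integrated traditional RIB/FIB control plane
    openflow_control: bool  # forwarding programmable by an external controller

LEGACY   = DeviceType("legacy",   legacy_control=True,  openflow_control=False)
OPENFLOW = DeviceType("openflow", legacy_control=False, openflow_control=True)
HYBRID   = DeviceType("hybrid",   legacy_control=True,  openflow_control=True)

def needs_controller(dev: DeviceType) -> bool:
    """A device participates in OpenFlow-based SDN once any of its
    forwarding is programmed by an external controller."""
    return dev.openflow_control
```

A hybrid switch is the only type for which both flags are true, which is why the hybrid scenarios later in the deck require the device to maintain legacy and OpenFlow forwarding state simultaneously.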
SDN Migration Approaches: A Closer Look at Device Types
Three approaches for migration to OpenFlow-based SDN:
1. Legacy to Greenfield
2. Legacy to Mixed
3. Legacy to Hybrid
Real-World Migration Approaches: Deployment Scenarios
1. Legacy to Greenfield
   – Either no existing deployment, or
   – Legacy network upgraded to become OpenFlow-enabled, with the Control Machine replaced by an OpenFlow controller
Real-World Migration Approaches: Deployment Scenarios
2. Legacy to Mixed (or "Ships-in-the-Night")
   – New OpenFlow devices are deployed and co-exist with traditional switches/routers, and interface with legacy Control Machines
   – The OpenFlow controller and traditional devices need to exchange routing information via the legacy Control Machine
Real-World Migration Approaches: Deployment Scenarios
3. Hybrid Network Deployment
   – Mixed Network deployments and Hybrid devices (with both legacy and OpenFlow functionality) coexist
   – Hybrid devices interface with both the OpenFlow Controller and the legacy Control Machine
Real-World Considerations: Network Domains and Layers
• Service enablement is often the motivation for SDN migration
• Services can be end-to-end
  – Overlaid on conceptual (virtual) networks
  – Spanning several network segments
  – Spanning several layers of technology, some or all of which are addressable by OpenFlow
• OpenFlow could address layer 0, 1, 2, 2.5, 3, and 4-7 applications
  – Different use cases require specific migration recommendations
• Examples
  – Application-specific capacity scheduling at lower layers, DPI-based service chaining at the IP edge, etc.
Outline
• Overview
• Migration Use Cases
• Conclusion
Types of Networks
Campus
• Multiple buildings
• Campus backbone
• Groups of users, BYOD
• Heterogeneous IT

Enterprise
• Various sizes
• Sub-networks, storage
• Security
• WAN optimization, LB

DC
• Multi-tenant, virtualization
• Mid-size to hyperscale
• VM mobility
• Disaster recovery

WAN
• Significant diversity
• Multiple domains
• Carriers (access, transport)
• Many customers
Migration Use Cases
1. Campus Network: Stanford OpenFlow deployment
2. Network Edge: NTT’s BGP-Free Edge field trial
3. Inter-Data Center WAN: Google’s SDN-powered WAN (B4)
Campus Network Use Case: Stanford OpenFlow Deployment
Motivation:
• Understand and verify the new SDN technology
• Motivate the need for SDN through innovative experiments
• Contribute back to the OpenFlow specification and community
Stanford OpenFlow Deployment: Objectives
Overview:
• Part of the Stanford campus network migrated to OpenFlow in 2010
• Migration initially focused on wireless users
• Later expanded to selected wired users
• Multiple islands across the William Gates CS building and Paul Allen CIS building
• Eventual goal was to expand OpenFlow support to several other L2 VLANs and then interconnect them at an L3 router
Stanford OpenFlow Deployment: Topology
William Gates CS Building: OpenFlow-Enabled Network (McKeown Group)
• Production network in the 3A wing OpenFlow-enabled
• 6 48-port 1GE OpenFlow switches from 4 vendors
• 30 WiFi APs based on ALIX PCEngine boxes with dual 802.11g interfaces, running the Linux-based software reference switch from the OpenFlow website
• 1 WiMAX base station

Paul Allen CIS/CIX Building: OpenFlow-Enabled Network
• VLAN 98 was OpenFlow-enabled
• 6 48-port 1GE OpenFlow switches from 1 vendor
• 14 WiFi APs based on ALIX PCEngine boxes with dual 802.11g interfaces
Stanford OpenFlow Deployment: Migration Requirements
Target Network Requirements:
• Network availability > 99.9%
• Fail-safe scheme to revert the network back to legacy mode
• Network performance close to the legacy network's performance
• No effect on user experience in any way

Phased Migration:
• Migration planned to provide better visibility into network traffic and allow network experimentation for select users (opt-in)
• Migrate select VLANs and users to OpenFlow control, allowing for a clear path of staged deployment within the existing campus network
Stanford OpenFlow Deployment: Migration Approach
Four phases for the gradual move of individual users, then VLANs, to OpenFlow:

1. Add OpenFlow support on hardware (a one-time firmware update)

2. Verify OpenFlow support on the switch:
   – Add an experimental VLAN / test hosts managed by an external controller. Once verified, move to the next phase

3. Migrate users to the new network:
   – Create a new non-OpenFlow network and safely migrate users to it before using OpenFlow for production traffic (minimizing risk). Main steps:
   – Add a new production sub-network; gradually add/move users to the new subnet; verify reachability within the new network

4. Enable OpenFlow for the new subnet:
   – Once the new subnetwork was functional, enable OpenFlow control for that network by configuring the controller
   – Again, verify correctness, reachability, performance, and stability using standard monitoring tools, and user-experience information collected in surveys
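The reachability verification in phase 3 can be sketched as a small helper. Assuming probe results (e.g. from ping sweeps or user-generated traffic) have already been collected into a map of host pairs, it lists the pairs to investigate before cut-over. This is purely illustrative, not Stanford's tooling:

```python
def unreachable_pairs(probe_results):
    """probe_results: dict mapping (src, dst) -> True if the probe succeeded.
    Returns the sorted list of host pairs that failed reachability."""
    return sorted(pair for pair, ok in probe_results.items() if not ok)

# Hypothetical probe results for three hosts on the new subnet.
results = {
    ("h1", "h2"): True,
    ("h1", "h3"): True,
    ("h2", "h3"): False,  # would be investigated before enabling OpenFlow
}
```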
Stanford OpenFlow Deployment: Tools, Monitoring and Statistics

[Figures: monitoring infrastructure for the OpenFlow network; data-plane statistics used to verify stability; control-plane statistics from the SNAC controller showing traffic volume and CPU usage.]
Stanford OpenFlow Deployment: Migration Acceptance
Correctness and Reachability:
• Reachability verified using user/probe-generated traffic; completion of the requests made confirmed correctness and reachability

Performance:
• Correlating monitored statistics in the data plane and control plane made it possible to identify anomalies and incorrect behaviors

Stability:
• Statistics monitored over a long period of time; progression plots made frequently to verify the stability and health of the network

Service Acceptance:
• Network stability gradually improved as switches, controllers, and understanding matured. User surveys stopped providing relevant data as users started seeing consistently acceptable service
Network Edge Use Case: Problem Statement
Challenges with Traditional Models:
• Heavy load on edge routers in traditional BGP deployment models
  – BGP adjacencies, routes/paths for address families: IPv4/6, VPNv4/6, etc.
  – BGP state machine, policy-based BGP updates, best-path calculation
  – Frequent service changes (provisioning new customers or updating policies)
• Limited resources (CPU, memory) and proprietary OS
• Service agility & innovation dependent on vendor implementation
[Diagram: current BGP deployment model. CEs for Customer A and Customer B run eBGP to edge routers PE1/PE2 on an IP/MPLS backbone; the PEs run iBGP with route reflectors RR1 and RR2 across P routers.]
Network Edge Use Case: NTT's BGP-Free Edge Field Trial
Motivation:
• Extend the notion of a BGP-free core to the edge of the network
• Simplified, low-cost routing edge architecture with centralized BGP policy management, leveraging OpenFlow/SDN
• Accelerated deployment of edge services
BGP-Free Edge: SDN Architecture
Overview:
• Move the BGP control plane to a commodity x86 server and use OpenFlow-enabled switches for the forwarding plane
• Simplification of eBGP routing (control-plane load) on the edge router
• Flexibility to calculate customized BGP best paths not only for each ingress point, but also on a per-customer basis
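The per-customer best-path idea can be sketched as follows. This is a deliberately simplified decision process (local preference taken from a per-customer policy, then shortest AS path), not NTT's implementation; the field names and policy shape are assumptions:

```python
def best_path(paths, customer_policy):
    """Pick a best path from candidate routes for one prefix.
    paths: list of dicts with 'next_hop', 'as_path' (list of ASNs),
           and 'origin_peer'.
    customer_policy: dict mapping origin_peer -> local preference
                     (default 100, as in common BGP implementations)."""
    def rank(p):
        local_pref = customer_policy.get(p["origin_peer"], 100)
        # Higher local-pref wins, then shorter AS path.
        return (-local_pref, len(p["as_path"]))
    return min(paths, key=rank)

# Two candidate routes for the same prefix, learned from different peers.
paths = [
    {"next_hop": "10.0.0.1", "as_path": [65001, 65010], "origin_peer": "peer_a"},
    {"next_hop": "10.0.0.2", "as_path": [65002], "origin_peer": "peer_b"},
]
```

Because the route controller holds all candidate paths centrally, it can run this selection once per customer policy instead of once per router, which is the flexibility the slide describes.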
[Diagram: BGP-free edge SDN architecture. A control layer containing a BGP Route Controller and an OpenFlow Controller sits above PE1/PE2 on the IP/MPLS backbone; eBGP sessions from the customer CEs are redirected to the BGP Route Controller, while OpenFlow sessions program the edge devices.]
BGP-Free Edge: SDN Architecture
• Remote BGP peers (e.g. CEs) connected to the edge device as before
• BGP sessions not handled/terminated by the OF-enabled edge device
• OpenFlow controller pre-programs default flows on the edge device
• Edge device sends all BGP control-plane traffic from internal and external peers to the BGP route controller
BGP-Free Edge: Pre-Migration Assessment
• Support of required scale and future growth
• Consistency of OpenFlow versions between controller and switch
• A BGP route controller capable of handling the BGP process
• Ensure that appropriate APIs, scripting, and other operational tools are compatible with the SDN-based deployment
• Ensure BGP peer creation and activation can be automated (optional)
• Ensure proper training is provided to NOC staff
BGP-Free Edge: Migration Procedures (Ships-in-the-Night)
1. Configure an iBGP session between the RR and the BGP route controller so that the BGP route controller can learn routes from the entire network

2. Configure BGP between the OpenFlow controller and the BGP route controller

3. Program a default flow entry in the OpenFlow controller to initially forward traffic matching the OpenFlow entry to the BGP route controller
   – Alternatively, TCP port 179 can be programmed to match and forward only BGP traffic to the BGP route controller

4. Before the migration, capture BGP path information for a random sample of prefixes. This will help validate accurate BGP path information after migration

5. Configure a VLAN per customer and configure a corresponding BGP session on the BGP route controller

6. Once the session is established, decommission the session on the legacy router

7. The BGP route controller runs the BGP best-path selection algorithm and passes the best paths to the OpenFlow controller, which in turn programs the OpenFlow switch

8. Once the forwarding table is programmed, control traffic continues to be forwarded to the BGP route controller, while data traffic now follows the path through the OF and non-OF-enabled switches/routers along the way to the destination

9. Repeat the above steps to migrate the rest of the BGP sessions and additional edge routers
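The alternative in step 3 (matching only BGP traffic) boils down to a simple classification rule. In a real OpenFlow controller this would be expressed as a flow entry matching IP protocol 6 with TCP port 179 (plus the reverse direction); the stdlib sketch below captures just the punt decision:

```python
TCP = 6       # IP protocol number for TCP
BGP_PORT = 179  # well-known BGP port

def punt_to_route_controller(ip_proto, src_port, dst_port):
    """Return True if a flow should be redirected to the BGP route
    controller rather than forwarded in the data plane. BGP sessions
    can be initiated from either side, so match the port in either
    direction."""
    return ip_proto == TCP and BGP_PORT in (src_port, dst_port)
```

Everything this predicate rejects is handled by the forwarding entries the controller programs in step 7.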
BGP-Free Edge: Migration Procedures
[Diagram: migration in progress. CE1-CE4 (carrying VPN1 and VPN2 services) connect to PE1-PE4 around a traditional IP/MPLS core (P1, P2); a control layer holds the BGP Route Controller, OpenFlow Controller, and BGP Route Reflector. The legend distinguishes post-migration BGP sessions, OpenFlow sessions, and BGP sessions not yet migrated.]
• Remove the CE BGP session on the old PE and configure a new BGP session on the BGP controller for the corresponding CE
• Remove the BGP session on the RR for the corresponding PE
• Establish a BGP session between the BGP route controller and the BGP RR
• Establish a routing session between the OF controller and the BGP controller
• The OF controller programs the forwarding table on PE1 once it has all the routing information
• Repeat the same steps for the rest of the PEs in the network which need to be migrated
BGP-Free Edge: Migration Approach
From the traditional BGP-speaking edge router to the BGP-free paradigm:
• Greenfield Deployment
  – All edge devices are OpenFlow-capable (BGP-free), with BGP terminated at the route controller
  – Perhaps the easiest migration model
• Mixed ("Ships-in-the-Night") Deployment
  – A new BGP-free edge router is deployed and co-exists with other traditional BGP-speaking routers
  – The new BGP-free edge devices and the traditional devices need to exchange routing information via the BGP route controller
• Hybrid Network Deployment
  – Legacy and OpenFlow devices coexist. The edge device runs both BGP and OpenFlow: it continues to run BGP while BGP sessions and corresponding policies are gradually offloaded to the BGP route controller
  – Requires careful planning and considerably more resources during the transition stage, especially since the edge device has to maintain the regular forwarding and OpenFlow forwarding tables along with the BGP table
BGP-Free Edge: Post-Migration
Post-Migration Acceptance:
• All BGP sessions on the BGP route controller should be up
• Ensure the BGP route controller receives and sends all expected BGP routes with proper next hops from customers and the RR, and selects the correct BGP best paths
• To ensure BGP routes are learned accurately, compare the BGP output for select prefixes with the sample output captured in step 4 of the migration

Service Acceptance:
• Any existing Internet or VPN services should function normally
• A random sample of prefixes in the Internet, as well as for select customers, can be used to validate service continuity
• Appropriate troubleshooting steps such as ping and traceroute can be employed to check connectivity
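The comparison against the step-4 capture can be sketched as a diff over the sampled prefixes. This is a hypothetical helper, with next hops standing in for the full set of BGP path attributes one would actually compare:

```python
def path_changes(pre, post):
    """pre/post: dict mapping prefix -> captured best-path attribute
    (here simplified to the next hop). Returns prefixes whose best
    path differs, plus any that disappeared after migration."""
    changed = {p: (pre[p], post[p])
               for p in pre if p in post and pre[p] != post[p]}
    missing = sorted(set(pre) - set(post))
    return changed, missing

# Hypothetical captures for two sampled (documentation-range) prefixes.
pre  = {"198.51.100.0/24": "192.0.2.1", "203.0.113.0/24": "192.0.2.2"}
post = {"198.51.100.0/24": "192.0.2.1", "203.0.113.0/24": "192.0.2.9"}
```

An empty diff (no changes, nothing missing) would be one of the acceptance signals above; any flagged prefix becomes a candidate for ping/traceroute troubleshooting.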
Inter-Data Center WAN: Google's OpenFlow-Powered WAN (B4)
Overview:
• Google's WAN is organized as 2 backbones
  – Internet-facing (I-scale) network carrying user traffic
  – Internal (G-scale) network carrying traffic between data centers; B4 is its OpenFlow-powered SDN
• Use SDN to manage the WAN as a fabric versus a collection of boxes
  – Delivery of Google's global user-based services (Google Web Search, Google+, Gmail, YouTube, Google Maps, etc.) would not be scalable with traditional technologies due to their non-linear complexity in management and configuration
Google's WAN (B4): Highlights
• 1000s of individual applications, with different traffic volumes, different latency sensitivities, and different overall priorities
  – user-data copies (e.g., email, documents, audio/video files) to remote data centers for availability/durability
  – remote storage access for computation over inherently distributed data sources
  – large-scale data push synchronizing state across multiple data centers
• Example: user data represents the lowest volume on B4, is the most latency-sensitive, and is of the highest priority
• B4 was built with a 3-layer architecture:
  – Switch hardware layer
  – Site controller layer
  – Global control layer
Google's WAN (B4): Architecture
• Switch hardware layer
  – Switch hardware custom-built from multiple merchant networking chips
  – Forwards traffic and does not run complex control software
• Site controller layer
  – Network control systems hosting OpenFlow controllers and network control applications
• Global layer
  – Logically centralized applications (e.g. an SDN Gateway and a central TE server) that enable central control of the entire network
Instead of building one integrated, centralized service combining routing and traffic engineering, Google chose to deploy routing and traffic engineering as independent services, with the standard routing service deployed initially and central TE subsequently deployed as an overlay.
• Focus initial work on SDN infrastructure
• Able to fall back to shortest-path routing in case of issues with the TE service
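The shortest-path fallback can be illustrated with a plain Dijkstra over link metrics; this mirrors the design choice described above (TE as a removable overlay on top of standard routing), not Google's actual code:

```python
import heapq

def shortest_path(graph, src, dst):
    """Dijkstra over a graph given as dict: node -> list of (neighbor, cost).
    Returns the list of nodes on a minimum-cost path from src to dst
    (dst is assumed reachable in this sketch)."""
    dist, prev = {src: 0}, {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(heap, (nd, v))
    # Walk predecessors back from dst to src.
    path, node = [], dst
    while node != src:
        path.append(node)
        node = prev[node]
    path.append(src)
    return list(reversed(path))

# Hypothetical 3-site topology with link costs.
sites = {"A": [("B", 1), ("C", 5)], "B": [("C", 1)], "C": []}
```

If the TE overlay misbehaves, routes computed this way are always available as a safe default.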
Google's WAN (B4): Pre-Migration Assessment
A number of B4's characteristics led to the design approach:

• Elastic bandwidth demands:
  – The majority of Google's inter-data-center traffic involves synchronizing large data sets across sites. These applications benefit from as much bandwidth as they can get, but can tolerate periodic failures with temporary bandwidth reductions.
• Moderate number of sites:
  – While B4 must scale along multiple dimensions, targeting the data center deployments meant that the total number of WAN sites would be a few dozen.
Google's WAN (B4): Pre-Migration Assessment
• End-application control:
  – Google controls both the applications and the site networks connected to B4. Hence, it can enforce relative application priorities and control bursts at the network edge, rather than through overprovisioning or complex functionality in B4.
• Cost sensitivity:
  – B4's capacity targets and growth rate led to unsustainable cost projections.
  – The traditional approach of provisioning WAN links at 30-40% utilization (or 2-3x the cost of a fully utilized WAN) to protect against failures and packet loss, combined with prevailing per-port router cost, would make the network prohibitively expensive.
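The two figures quoted above are consistent with each other: running links at 30-40% utilization means buying roughly 2.5-3.3x (i.e. about 2-3x) the capacity of a fully utilized WAN. A quick arithmetic check:

```python
def overprovision_factor(utilization):
    """Capacity (and roughly cost) multiple implied by a target link
    utilization, relative to a fully utilized network."""
    return 1 / utilization

low  = round(overprovision_factor(0.40), 1)  # 40% utilization
high = round(overprovision_factor(0.30), 1)  # 30% utilization
```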
Google's WAN (B4): Migration Approach
• Integration of the target network with legacy routing
  – Provides a gradual path for enabling OpenFlow in the production network
• BGP integration as a step toward deploying new protocols customized to the requirements of, for instance, a private-WAN setting
• The migration path moved in stages from a fully distributed, monolithic control- and data-plane hardware architecture to a physically decentralized (though logically centralized) control-plane architecture
• The hybrid migration for the Google B4 network proceeded in 3 general stages (see next slides)
Google's WAN (B4): Migration Approach (Step 1: Legacy)
1. Legacy: In the initial stage, the network connects data centers through legacy nodes using eBGP/iBGP and IS-IS routing. Cluster border routers interface the data centers to the network.

[Diagram: legacy stage of the hybrid SDN deployment.]
Google's WAN (B4): Migration Approach (Step 2: Mixed)
2. Mixed: In this phase, a subset of nodes in the network are OpenFlow-enabled and controlled by the logically centralized controller, built from Paxos, an OpenFlow controller, and Quagga.

[Diagram: mixed stage of the hybrid SDN deployment.]
Google's WAN (B4): Migration Approach (Step 3: Final)
3. Final: All nodes are OpenFlow-enabled and the controller controls the entire network. There is no longer a direct correspondence between the data centers and the network. The controller also includes a TE server that guides traffic engineering in the network.

[Diagram: final stage of the hybrid SDN deployment.]
Google's WAN (B4): Post-Migration Acceptance
• Google's WAN (B4) has been in deployment for 3 years
• Carries more traffic than Google's public-facing WAN, and has a higher growth rate
• Among the first and largest SDN/OpenFlow deployments
• Scales to meet application bandwidth demands more efficiently than would otherwise be possible
• Supports rapid deployment and iteration of novel control functionality such as TE
• Enables tight integration with end applications for adaptive behavior in response to failures or changing communication patterns
Outline
• Overview
• Migration Use Cases
• Conclusion
Example Guidelines: Recommendations, Best Practices, etc.
Example Recommendations and Best Practices:
• Focus on service continuity with minimal disruption
• Analysis of OpenFlow features and desired capabilities on the controller and OpenFlow switch
• Detailed gap analysis to understand the impact on existing services
• Availability of alternate options to mitigate risk during migration
• Consistency of OpenFlow versions between controller and switch
• The OpenFlow switch must be upgraded to run the appropriate code and hardware firmware before migration can be initiated
• Provisioning of the necessary network management tools for the migrated network, for proper management and monitoring of traffic & devices
Example Guidelines: Recommendations, Best Practices, etc.
Example Recommendations and Best Practices (cont'd):
• Detailed method of procedure for the step-by-step migration, with back-out procedures clearly documented in case of unexpected results
• Investigate whether reverting the configuration can be automated to minimize disruption in case of deteriorated performance
• Create pre- and post-migration checklists with specific samples of applications and/or source-destination prefixes to be used for connectivity and service-continuity checks
• Appropriate troubleshooting steps such as ping, traceroute, or accessing an application can be employed to check connectivity
• In a mixed environment, a dummy service such as a customer VPN can be created to verify service availability
Summary
• OpenFlow is still evolving as new use cases and deployment models emerge
• Legacy networks can successfully migrate to OpenFlow-based SDN
  – The 3 use cases illustrate diverse migration scenarios for the WAN, campus/LAN, and service provider/Internet edge
  – The Google and Stanford use cases (both in production) are good examples of successful migration to OpenFlow
  – Alternative options are available today to address any gaps with OpenFlow
• More work ahead
  – Share your real-world SDN migration experience with the community
How Can You Get Involved? Migration Working Group Charter

1st milestone:
• Submit document on use cases and migration methods, leveraging the experience of prior work by network operators

2nd milestone:
• Submit document describing the goals and metrics for the migration

3rd milestone:
• Publish prototype working code for migration, and validate the metrics

4th milestone:
• Demonstration of prototype migration tool chain