A Survey and a Layered Taxonomy of Software-Defined Networking



1553-877X (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information.

This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2014.2320094, IEEE Communications Surveys & Tutorials

IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 16, NO. 1, JANUARY 2014

A Survey and a Layered Taxonomy of Software-Defined Networking

Yosr Jarraya, Member, IEEE, Taous Madi, and Mourad Debbabi, Member, IEEE

Abstract—Software-Defined Networking (SDN) has recently gained unprecedented attention from industry and research communities, and it seems unlikely that this attention will diminish in the near future. The ideas brought by SDN, although often described as a "revolutionary paradigm shift" in networking, are not completely new, since they have their foundations in programmable networks and control-data plane separation projects. SDN promises simplified network management by enabling network automation, fostering innovation through programmability, and decreasing CAPEX and OPEX by reducing costs and power consumption. In this paper, we analyze and categorize a number of relevant research works toward realizing SDN's promises. We first provide an overview of SDN's roots and then describe the architecture underlying SDN and its main components. Thereafter, we present existing SDN-related taxonomies and propose a taxonomy that classifies the reviewed research works and brings relevant research directions into focus. We dedicate the second part of this paper to studying and comparing current SDN-related research initiatives and describing the main issues that may arise from the adoption of SDN. Furthermore, we review several domains where the use of SDN shows promising results. We also summarize some foreseeable future research challenges.

Index Terms—Software-Defined Networking, OpenFlow, Programmable Networks, Controller, Management, Virtualization, Flow

    I. Introduction

For a long time, networking technologies have evolved at a slower pace than other communication technologies. Network equipment such as switches and routers has traditionally been developed by manufacturers: each vendor designs its own firmware and other software to operate its own hardware in a proprietary, closed way. This has slowed innovation in networking technologies and increased management and operation costs whenever new services, technologies, or hardware were to be deployed within existing networks. The architecture of today's networks consists of three core logical planes: the control plane, the data plane, and the management plane. So far, network hardware has been developed with tightly coupled control and data planes; thus, traditional networks are said to follow an "inside the box" paradigm. This significantly increases the complexity and cost of

Yosr Jarraya (Corresponding author), Taous Madi, and Mourad Debbabi are with the Concordia Institute for Information Systems Engineering (CIISE), Concordia University, Montreal, Quebec, Canada. E-mail: {y [email protected]}.

network administration and management. Aware of these limitations, networking research communities and industrial market leaders have collaborated to rethink the design of traditional networks. Thus, proposals for a new networking paradigm, namely programmable networks [1], have emerged (e.g., active networks [2] and Open Signalling (OpenSig) [3]).

Recently, Software-Defined Networking (SDN) has gained popularity in both academia and industry. SDN is not a revolutionary proposal; rather, it reshapes earlier proposals investigated several years ago, mainly programmable networks and control-data plane separation projects [4]. It is the outcome of a long-term process triggered by the desire to bring the network "out of the box". The principal endeavors of SDN are to separate the control plane from the data plane and to centralize the network's intelligence and state. Some of the SDN predecessors that advocate control-data plane separation are the Routing Control Platform (RCP) [5], 4D [6], [7], the Secure Architecture for the Networked Enterprise (SANE) [8], and, more recently, Ethane [9], [10]. The SDN philosophy is based on dissociating control from the network forwarding elements (switches and routers), logically centralizing network intelligence and state (at the controller), and abstracting the underlying network infrastructure from the applications [11]. SDN is very often linked to the OpenFlow protocol. The latter is a building block for SDN, as it enables creating a global view of the network and offers a consistent, system-wide programming interface to centrally program network devices. OpenFlow is an open protocol that was born in academia, at Stanford University, out of the Clean Slate Project¹. In [12], OpenFlow was proposed for the first time to enable researchers to run experimental protocols [13] in the campus networks they use every day. Currently, the Open Networking Foundation (ONF), a non-profit industry consortium, is in charge of actively supporting the advancement of SDN and the standardization of OpenFlow, whose specification is currently published as version 1.4.0 [14].

The main objective of this paper is to survey the literature on SDN over the period 2008-2013 in order to provide a deep and comprehensive understanding of this paradigm, its related technologies, its domains of application, as well as the main issues that need to be solved to sustain its success. Despite SDN's youth, we have identified a large number of scientific publications, not

¹ http://cleanslate.stanford.edu/


counting miscellaneous blogs, magazine articles, online forums, etc. To the best of our knowledge, this paper is the first comprehensive survey on the SDN paradigm. While reviewing the literature, we found a few papers surveying specific aspects of SDN [15]-[17]. For instance, Bozakov and Sander [15] focus on the OpenFlow protocol and provide implementation scenarios using OpenFlow and the NOX controller. Sezer et al. [17] briefly present a survey on SDN concepts and issues while considering a limited number of surveyed works. Lara et al. [16] present an OpenFlow-oriented survey and concentrate on a two-layer architecture of SDN: the control and data layers. They review and compare OpenFlow specifications, from the earliest versions up to version 1.3.0, and then present works on OpenFlow capabilities, applications, and deployments around the world. Although important research issues have been identified, these surveys do not mention other relevant aspects such as distributed controllers, northbound APIs, and SDN programming languages.

In the present paper, we aim at providing a more

comprehensive and up-to-date overview of SDN by targeting more than one aspect, analyzing the most relevant research works, and identifying foreseeable future research directions. The main contributions of this paper are as follows:

• Provide a comprehensive tutorial on SDN and OpenFlow by studying their roots, their architecture, and their principal components.

• Propose a taxonomy that allows classifying the reviewed research works, bringing relevant research directions into focus, and easing the understanding of the related domains.

• Elaborate a survey of the most relevant research proposals supporting the adoption and the advancement of SDN.

• Identify new issues raised by the adoption of SDN that still need to be addressed by future research efforts.

The paper is structured as follows: Section II is a preliminary section where the roots of SDN are briefly presented. Section III is dedicated to unveiling SDN concepts, components, and architecture. Section IV discusses existing SDN taxonomies and elaborates a novel taxonomy for SDN. Section V surveys the research works on SDN, organized according to the proposed taxonomy; it identifies issues brought by the SDN paradigm and the currently proposed solutions. Section VI describes open issues that still need to be addressed in this domain. The paper ends with a conclusion in Section VII.

    II. Software-Defined Networking Roots

SDN finds its roots in the programmable networks and control-data plane separation paradigms. In the following, we give a brief overview of these two research directions and then highlight how SDN differs from them.

The key principle of programmable networks is to allow more flexible and dynamically customizable networks. To materialize this concept, two separate schools of thought emerged: OpenSig [3], from the telecommunications community, and active networks [2], from the IP networking community. Active networking emerged from DARPA [2] in the mid-1990s. Its fundamental idea was to allow customized programs to be carried by packets and then executed by network equipment. After executing these programs, the behavior of switches/routers could change with different levels of granularity. Various suggestions on the levels of programmability exist in the literature [1]. Active networks introduced a high level of dynamism for deploying new services at run-time. At the same time, the OpenSig community proposed to control networks through a set of well-defined network programming interfaces and distributed programming environments (middleware toolkits such as CORBA) [3]. In that case, physical network devices are manipulated like distributed computing objects. This would allow service providers to construct and manage new network services (e.g., routing, mobility management, etc.) with QoS support.

Both active networking and OpenSig raised many performance, isolation, complexity, and security concerns. First, they require each packet (or subset of packets) to be processed separately by network nodes, which raises performance issues. They also require executing code at the infrastructure level, which means most, if not all, routers must be fundamentally upgraded, raising security and complexity problems. This was not accepted by major network device vendors and consequently hampered research and industrial development in these directions. A brief survey of approaches to programmable networks can be found in [18], where SDN is considered a separate proposal towards programmable networks besides three other paradigms, namely approaches based on: (1) improved hardware routers, such as active networks, OpenSig, and the Juniper Network Operating System SDK (Junos SDK); (2) software routers, such as Click and XORP; and (3) virtualization, such as network virtualization, overlay networks, and virtual routers. The survey on programmable networks presented in [1] is more comprehensive but less recent. SDN resembles past research on programmable networks, particularly active networking. However, while SDN emphasizes a programmable control plane, active networking focuses on programmable data planes [19].

After programmable networks, projects towards control-data plane separation emerged, supported by efforts towards a standard open interface between the control and data planes, such as the Forwarding and Control Element Separation (ForCES) framework [20], and by efforts to enable logically centralized control of the network, such as the Path Computation Element Protocol (PCEP) [21] and RCP [5]. Although focusing on control-data plane separation, these efforts rely on existing routing protocols. Thus, these proposals neither support a wide range of functionalities (e.g., dropping, flooding, or modifying packets) nor allow for a wider range of header


fields matching [19]. These restrictions impose significant limitations on the range of applications supported by programmable controllers [19]. Furthermore, most of the aforementioned proposals failed to face backwards-compatibility challenges and constraints, which inhibited immediate deployment.

To broaden the vision of control and data plane separation, researchers explored clean-slate architectures for logically centralized control such as 4D [6], SANE [8], and Ethane [9]. The Clean Slate 4D project [6] was one of the first advocating the redesign of control and management functions from the ground up based on sound principles. SANE [8] is a single protection layer consisting of a logically centralized server that enforces several security policies (access control, firewall, network address translation, etc.) within the enterprise network. More recently, Ethane [9] extended SANE based on the principle of incremental deployment in enterprise networks. In Ethane, two components can be distinguished. The first is a controller that knows the global network topology and contains the global network policy used to determine the fate of all packets; it also performs route computation for the permitted flows. The second is a set of simple and dumb Ethane switches. These switches consist of a simple flow table and use a secure channel to communicate with the controller, exchanging information and receiving forwarding rules. This principle of packet processing constitutes the basis of SDN's proposal.

The success and fast progress of SDN are largely due to the success of OpenFlow and the new vision of a network operating system. Unlike previous proposals, the OpenFlow specification relies on backwards compatibility with the hardware capabilities of commodity switches. Thus, enabling OpenFlow's initial set of capabilities on switches did not require a major hardware upgrade, which encouraged immediate deployment. In later versions of the OpenFlow switch specification (i.e., starting from 1.1.0), an OpenFlow-enabled switch supports a number of tables containing multiple packet-handling rules, where each rule matches a subset of the traffic and performs a set of actions on it. This potentially prepares the floor for a large set of controller applications with sophisticated functionalities. Furthermore, the deployment of OpenFlow testbeds by researchers, not only on single campus networks but also over a wide-area backbone network, demonstrated the capabilities of this technology. Finally, a network operating system, as envisioned by SDN, abstracts the state from the logic that controls the behavior of the network [19], which enables a flexible, programmable control plane.

In the following sections, we present a tutorial and a comprehensive survey on SDN in which we highlight the challenges that have to be faced to provide better chances for the SDN paradigm.
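As a concrete, deliberately simplified illustration of the Ethane/SDN principle described above (dumb switches holding only a flow table, and a logically centralized controller holding the policy), the following Python sketch models the table-miss and rule-installation loop. All class and method names are ours, not from any real controller API.

```python
class Switch:
    """Forwarding element: it matches packets against installed rules only."""
    def __init__(self, name):
        self.name = name
        self.flow_table = {}          # match field (dst address) -> action

    def handle_packet(self, packet, controller):
        action = self.flow_table.get(packet["dst"])
        if action is None:
            # Table miss: ask the controller (the "packet-in" pattern).
            action = controller.packet_in(self, packet)
        return action


class Controller:
    """Holds the global policy; switches contain no control logic."""
    def __init__(self, policy):
        self.policy = policy          # dst address -> action, network-wide

    def packet_in(self, switch, packet):
        action = self.policy.get(packet["dst"], "drop")
        # Install the rule so future packets stay in the data plane.
        switch.flow_table[packet["dst"]] = action
        return action


ctrl = Controller(policy={"10.0.0.2": "forward:port2"})
sw = Switch("s1")
first = sw.handle_packet({"dst": "10.0.0.2"}, ctrl)   # miss -> controller
second = sw.handle_packet({"dst": "10.0.0.2"}, ctrl)  # hit -> local table
```

Only the first packet of a flow reaches the controller; subsequent packets are forwarded entirely in the data plane, which is precisely the behavior that made Ethane incrementally deployable.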

    III. SDN: Global Architecture and Meronomy

In this section, we present the architecture of SDN and describe its principal components. According to the ONF, SDN is an emerging architecture that decouples the network control and forwarding functions. This enables the network control to become directly programmable and the underlying infrastructure to be abstracted for applications and network services². In such an architecture, the infrastructure devices become simple forwarding engines that process incoming packets based on a set of rules generated on the fly by a controller (or a set of controllers) at the control layer according to some predefined program logic. The controller generally runs on a remote commodity server and communicates over a secure connection with the forwarding elements using a set of standardized commands. The ONF presents in [11] a high-level architecture for SDN that is vertically split into three main functional layers:

• Infrastructure Layer: Also known as the data plane [11], it consists mainly of Forwarding Elements (FEs), including physical and virtual switches accessible via an open interface, and performs packet switching and forwarding.

• Control Layer: Also known as the control plane [11], it consists of a set of software-based SDN controllers providing consolidated control functionality, through open APIs, to supervise the network forwarding behavior through an open interface. Three communication interfaces allow the controllers to interact: the southbound, northbound, and east/westbound interfaces, which will be briefly presented next.

• Application Layer: It mainly consists of the end-user business applications that consume the SDN communication and network services [22]. Examples of such business applications include network visualization and security business applications [23].

Figure 1 illustrates this architecture while detailing some key parts of it, such as the control layer, the application layer, as well as the communication interfaces among the three layers.

[Fig. 1. SDN Architecture [11], [23] — the Infrastructure Layer (physical and virtual switches), the Control Layer (SDN controllers linked by eastbound/westbound APIs), and the Application Layer (business applications such as network virtualization, network access control and Bring Your Own Device (BYOD), security (Fw, IDPS, WAF), unified network monitoring and analysis, and other business apps), connected by the southbound API (e.g., OpenFlow, ForCES, PCEP) and the northbound API (e.g., REST API).]

An SDN controller interacts with these three layers through three open interfaces:

² ONF, https://www.opennetworking.org/sdn-resources/sdn-definition


• Southbound: This communication interface allows the controller to interact with the forwarding elements in the infrastructure layer. OpenFlow, a protocol maintained by the ONF [14], is, according to the ONF, a foundational element for building SDN solutions and can be viewed as a promising implementation of such an interaction. At the time of writing, the latest OpenFlow version is 1.4 [14]. The evolution of the OpenFlow switch specification will be summarized in later sections. Most non-OpenFlow-based SDN solutions from various vendors employ proprietary protocols, such as Cisco's Open Network Environment Platform Kit (onePK) [24] and Juniper's Contrail [25]. Other alternatives to OpenFlow exist, for instance the Forwarding and Control Element Separation (ForCES) framework [20], which defines an architectural framework with associated protocols to standardize information exchange between the control and forwarding layers. It has existed for several years as an IETF proposal but has never achieved the level of adoption of OpenFlow. A comparison between OpenFlow and ForCES can be found in [26].

• Northbound: This communication interface enables the programmability of the controllers by exposing universal network abstraction data models and other functionalities within the controllers for use by applications at the application layer. It is considered a software API, rather than a protocol, that allows programming and managing the network. At the time of writing, there is no standardization effort yet from the ONF, which is encouraging innovative proposals from the various controller developers. According to the ONF, different levels of abstraction (as "latitudes") and different use cases (as "longitudes") have to be characterized, which may lead to more than a single northbound interface to serve all use cases and environments. Among the various proposals, several vendors offer REpresentational State Transfer (REST)-based APIs [27] to provide a programmable interface to their controllers for use by the business applications.

• East/Westbound: This is an envisioned communication interface that is not currently supported by an accepted standard. It is mainly meant to enable communication between groups or federations of controllers to synchronize state for high availability [28].
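To make the northbound interface concrete, the sketch below builds a hypothetical JSON flow-rule payload of the kind a business application might POST to a controller's REST endpoint. Since no northbound standard exists, every controller defines its own schema; the field names here are illustrative only and do not belong to any particular controller.

```python
import json

# Hypothetical flow-rule request; field names are ours, not a standard schema.
flow_rule = {
    "switch": "00:00:00:00:00:00:00:01",   # datapath ID of the target switch
    "name": "allow-web",
    "priority": 100,
    "eth_type": "0x0800",                  # match IPv4 traffic
    "ipv4_dst": "10.0.0.2",
    "actions": "output=2",                 # forward out of port 2
}

# The application would POST this body over HTTP to the controller's REST
# API; here we only show the serialization step.
body = json.dumps(flow_rule)
```

The point of the northbound interface is exactly this: the application expresses intent in terms of flows and actions, and the controller translates it into southbound (e.g., OpenFlow) messages to the switches.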

From another perspective, several logical layers defined by abstraction have been presented for the control [29] and data [30] layers. These layers of abstraction simplify understanding of the SDN vision, decrease network programming complexity, and facilitate reasoning about such networks. Figure 2 compiles these logical layers, which can be described in a bottom-up approach as follows:

• Physical Forwarding Plane: The set of physical network forwarding elements [30].

• Network Virtualization (or slicing): An abstraction layer that aims at providing great flexibility to achieve operational goals while being independent of the underlying physical infrastructure. It is responsible for configuring the physical forwarding elements so that the network implements the desired behavior as specified by the logical forwarding plane [30]. At this layer, there exist proposals to slice network flows, such as FlowVisor [13], [30], [31].

• Logical Forwarding Plane: A logical abstraction of the physical forwarding plane that provides an end-to-end forwarding model and allows abstracting from the physical infrastructure. This abstraction is realized by the network virtualization layer [30].

• Network Operating System: A Network Operating System (NOS) may be thought of as software that abstracts the installation of state in network switches from the logic and applications that control the behavior of the network [4]. It provides the ability to observe and control a network by offering a programmatic interface (NOS API) as well as an execution environment for programmatic control of the network [32]. The NOS needs to communicate with the forwarding elements in two directions: it receives information in order to build the global state view, and it pushes the needed configurations in order to control the forwarding mechanisms of these elements [29]. The concept of a single network operating system has been extended to a distributed network operating system to accommodate large-scale networks, as in ONOS³, where open-source software is used to maintain consistency across distributed state and to provide a network topology database to the applications [4].

• Global Network View: An annotated network graph provided through an API [29].

• Network Hypervisor: Its main function is to map the abstract network view to the global network view and vice versa [33].

• Abstract Network View: It provides the applications with the minimal amount of information required to specify management policies, exposing an abstract view of the network rather than a topologically faithful one [29].
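The role of the network hypervisor, mapping between the global network view and an abstract network view, can be illustrated with a toy "one big switch" abstraction: the annotated global graph is collapsed into a single logical switch that exposes only host-facing edge ports to the applications. The function and data layout below are our own simplification, not any published hypervisor's interface.

```python
def one_big_switch(ports, links):
    """Collapse a global view into a single-logical-switch abstract view.

    ports: {switch: [port, ...]} -- all ports in the global network view.
    links: set of (switch, port) endpoints used by switch-to-switch links.
    Returns the host-facing edge ports, as (switch, port) pairs, which are
    the only ports the abstract view exposes to applications.
    """
    internal = set(links)
    edge = []
    for sw, plist in sorted(ports.items()):
        for p in plist:
            if (sw, p) not in internal:
                edge.append((sw, p))
    return edge


global_view_ports = {"s1": [1, 2, 3], "s2": [1, 2]}
inter_switch = {("s1", 3), ("s2", 1)}     # s1:3 <-> s2:1 is an internal link
abstract_ports = one_big_switch(global_view_ports, inter_switch)
# Applications see one logical switch with ports s1:1, s1:2, and s2:2;
# the internal link disappears from the abstract view.
```

A real hypervisor must also perform the reverse mapping, translating policies written against the abstract ports back into rules on the physical forwarding elements, which is the harder direction.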

In the following, we provide details on the role and implementations of each layer in the architecture.

    A. Forwarding Elements

In order to be useful in an SDN architecture, forwarding elements, mainly switches, have to support a southbound API, particularly OpenFlow. OpenFlow switches come in two flavors: software-based (e.g., Open vSwitch (OVS) [34]-[36]) and OpenFlow-enabled hardware-based implementations (e.g., NetFPGA [37]). Software switches are typically well designed and feature-complete; however, even mature implementations are often quite slow. Table I provides a list of software

³ Open Network Operating System (ONOS), http://www.sdncentral.com/projects/onos-open-network-operating-system/


[Fig. 2. SDN Abstraction Layers [29], [30] — bottom-up: the data plane comprises the forwarding elements (FEs) of the physical forwarding plane; network virtualization or slicing realizes the logical forwarding plane above it; the control plane comprises the network OS building the global network view, the network hypervisor, and the control program operating on the abstract network view, with the applications on top.]

switches supporting OpenFlow. Hardware-based OpenFlow switches are typically implemented as Application-Specific Integrated Circuits (ASICs), either using merchant silicon from vendors or using a custom ASIC. They provide line-rate forwarding for a large number of ports but lack the flexibility and feature completeness of software implementations [38]. Various commercial vendors support OpenFlow in their hardware switches, including but not limited to HP, NEC, Pronto, Juniper, Cisco, Dell, and Intel.

An OpenFlow-enabled switch can be subdivided into three main elements [12], namely, a hardware layer (or datapath), a software layer (or control path), and the OpenFlow protocol:

The datapath consists of one or more flow tables and a group table, which perform packet lookups and forwarding. A flow table consists of flow entries, each associated with an action (or a set of actions) that tells the switch how to process the flow. Flow tables are typically populated by the controller. A group table consists of a set of group entries and allows additional methods of flow forwarding to be expressed.

The control path is a channel that connects the switch to the controller for signaling and programming purposes. Commands and packets are exchanged through this channel using the OpenFlow protocol.

The OpenFlow protocol [14] provides the means of communication between the controller and the switches. Exchanged messages may include information on received packets, sent packets, statistics collection, actions to be performed on specific flows, etc.
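As an illustration of the message framing, the following Python sketch packs and parses the fixed 8-byte header (version, type, length, transaction id) that prefixes every OpenFlow message. The wire constants follow the switch specification; the helper names are ours:

```python
import struct

# The common 8-byte header that prefixes every OpenFlow message:
# version (1 byte), type (1 byte), length (2 bytes), xid (4 bytes),
# all in network byte order, per the OpenFlow switch specification.
OFP_HEADER = struct.Struct("!BBHI")

OFP_VERSION_1_4 = 0x05   # wire version for OpenFlow 1.4
OFPT_ECHO_REQUEST = 2    # symmetric message used for liveness checks

def pack_header(msg_type, payload=b"", xid=0, version=OFP_VERSION_1_4):
    """Build a raw OpenFlow message: fixed header followed by the payload."""
    length = OFP_HEADER.size + len(payload)
    return OFP_HEADER.pack(version, msg_type, length, xid) + payload

def unpack_header(data):
    """Parse the fixed header from a raw message."""
    version, msg_type, length, xid = OFP_HEADER.unpack_from(data)
    return {"version": version, "type": msg_type,
            "length": length, "xid": xid}

msg = pack_header(OFPT_ECHO_REQUEST, xid=42)
hdr = unpack_header(msg)
```

A real controller would, of course, rely on a protocol library rather than hand-rolled framing; the sketch only shows the header layout the text refers to.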

The first release of OpenFlow was published by Stanford University in 2008. Since 2011, the OpenFlow switch specification has been maintained and improved by ONF, starting from version 1.0 [43] onward. Version 1.0 is currently widely adopted by OpenFlow vendors. In that version, forwarding is based on a single flow table, and matching focuses only on layer-2 information and IPv4 addresses. The support of multiple flow tables and MPLS tags was introduced in version 1.1, while IPv6 support was included in version 1.2.

In version 1.3 [44], support for multiple parallel channels between switches and controllers was added. The latest available OpenFlow switch specification, published in 2013, is version 1.4 [14]. Its main improvements are the retrofitting of various parts of the protocol with the TLV structures introduced in version 1.2 for extensible match fields, and a flow monitoring framework that allows a controller to monitor, in real time, the changes made to any subset of the flow tables by other controllers. In the rest of this paper, we follow the OpenFlow switch specification version 1.4 [14] unless a version is explicitly mentioned.

A flow table entry in an OpenFlow-enabled switch is constituted of several fields that can be classified as follows:

Match fields, used to match packets based on a 15-tuple of packet header fields, the ingress port, and optionally packet metadata. Figure 3 illustrates the packet header fields grouped according to OSI layers L1-L4.

Priority of the flow entry, which determines the matching precedence of the flow entry.

An action set that specifies the actions to be performed on packets matching the header fields. The three basic actions are: forward the packet to a port (or a set of ports), forward the flow's packets to the controller, and drop the flow's packets.

Counters to keep track of flow statistics (the number of packets and bytes for each flow, and the time since the last packet matched the flow).

Timeouts specifying the maximum lifetime or idle time before the flow is expired by the switch.
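The fields above can be captured in a simple data model. The following sketch is illustrative only; the field names are ours, not the specification's:

```python
from dataclasses import dataclass, field

@dataclass
class FlowEntry:
    """Illustrative model of a flow table entry, mirroring the
    match / priority / actions / counters / timeouts structure."""
    match: dict                                   # e.g. {"ip_dst": "10.0.0.1"}
    priority: int = 0                             # higher value wins on overlap
    actions: list = field(default_factory=list)   # e.g. ["output:2"]
    packet_count: int = 0                         # per-flow counters
    byte_count: int = 0
    idle_timeout: int = 0    # seconds of inactivity before expiry (0 = none)
    hard_timeout: int = 0    # absolute lifetime before expiry (0 = none)

    def record(self, nbytes):
        """Update the per-flow counters when a packet matches this entry."""
        self.packet_count += 1
        self.byte_count += nbytes

e = FlowEntry(match={"eth_type": 0x0800, "ip_dst": "10.0.0.1"},
              priority=100, actions=["output:2"])
e.record(1500)
```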

[Fig. 3 lists the flow match fields grouped by OSI layer: Ingress Port and Metadata (L1); Ethernet Src., Ethernet Dest., Ethernet Type, VLAN ID, and VLAN Priority (L2); MPLS Label and MPLS Traffic Class (L2.5); IP Src., IP Dest., IP Proto., and IP TOS field (L3); Transport Src. Port and Transport Dest. Port (L4).]

Fig. 3. Flow Identification in OpenFlow

OpenFlow messages can be categorized into three main types [14]: controller-to-switch, asynchronous, and symmetric. Controller-to-switch messages are initiated by the controller and used to manage or inspect the state of the switches. A switch may initiate asynchronous messages in order to update the controller on network events and changes to the switch's state. Finally, symmetric messages are initiated, without solicitation, by either the switch or the controller; they are used, for instance, to test the liveness of a controller-switch connection.

TABLE I
OpenFlow Stacks and Switch Implementations

OpenFlow Switch | Description | Open Source | Language | Origin | OpenFlow Version
Open vSwitch [34] | OpenFlow stack: soft switch and control stack for hardware switching | yes | C/Python | Multiple contributors | v 1.0
OpenFlow Reference Implementation [39] | OpenFlow stack that follows the specification | yes | C | Stanford University/Nicira Networks | v 0.8
Pica8 [40] | Hardware-independent software for hardware switching | not yet | C | Pica8 | v 1.2
Indigo [41] | OpenFlow on physical and hypervisor switches, based on the Stanford reference implementation | yes | C/Lua | Big Switch Networks | v 1.0
Pantou/OpenWRT [42] | Enables commercial OpenWRT wireless devices with OpenFlow | yes | C | - | v 1.0

Once an ingress packet arrives at an OpenFlow switch, the latter performs a lookup in the flow tables based on pipeline processing [14]. A flow table entry is uniquely identified by its match fields and its priority. A packet matches a given flow table entry if the values in the packet match those specified in the entry's fields. A flow table entry field with a value of ANY (an omitted or wildcard field) matches all possible values in the header. Only the highest-priority flow entry that matches the packet must be selected. If the packet matches multiple flow entries with the same highest priority, the selected flow entry is explicitly undefined [14]. To remedy such a scenario, the OpenFlow specification [14] provides a mechanism that enables the switch to optionally verify whether a newly added flow entry overlaps with an existing entry. Thus, a packet can be matched exactly to a flow (microflow), matched to a flow with wildcard fields (macroflow), or match no flow at all. If a match is found, the set of actions defined in the matching flow table entry is performed. If there is no match, the switch forwards the packet (or just its header) to the controller to request a decision. After consulting the associated policy located at the management plane, the controller responds with a new flow entry to be added to the switch's flow table. This entry is used by the switch to handle the queued packet as well as subsequent packets of the same flow.

In order to dynamically and remotely configure OpenFlow switches, a protocol, namely the OpenFlow Configuration and Management Protocol (OF-CONFIG) [45], is also maintained by ONF. It enables the configuration of essential artifacts so that an OpenFlow controller can communicate with the network switches via OpenFlow. It operates on a slower time scale than OpenFlow, as it is used, for instance, to enable or disable a port on a switch, to set the IP address of the controller, etc.
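The lookup semantics described earlier (wildcard matching, highest-priority selection, and a table-miss fallback) can be sketched as follows; the table representation and helper names are ours, not the specification's:

```python
def matches(entry_match, pkt):
    """A header field omitted from the match acts as a wildcard (ANY)."""
    return all(pkt.get(f) == v for f, v in entry_match.items())

def lookup(flow_table, pkt):
    """Return the highest-priority matching entry, or None on a table miss."""
    hits = [e for e in flow_table if matches(e["match"], pkt)]
    return max(hits, key=lambda e: e["priority"], default=None)

flow_table = [
    # Exact match on the destination IP, highest priority.
    {"match": {"ip_dst": "10.0.0.1"}, "priority": 10, "actions": ["output:1"]},
    # Empty match = all-wildcard table-miss entry: send to the controller.
    {"match": {}, "priority": 0, "actions": ["controller"]},
]

hit = lookup(flow_table, {"ip_dst": "10.0.0.1", "tcp_dst": 80})
miss = lookup(flow_table, {"ip_dst": "10.0.0.2"})
```

Here `hit` selects the priority-10 entry, while `miss` falls through to the all-wildcard entry and would be sent to the controller, as in the reactive scenario described above.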

    B. Controllers

The controller is the core of an SDN network, as it is the main part of the NOS. It lies between the network devices at one end and the applications at the other end. An SDN controller takes the responsibility of establishing every flow in the network by installing flow entries on switch devices. One can distinguish two flow setup modes: proactive and reactive. In the proactive mode, flow rules are pre-installed in the flow tables; the flow setup thus occurs before the first packet of a flow arrives at the OpenFlow switch. The main advantages of a proactive flow setup are a negligible setup delay and a reduced frequency of contacting the controller; however, it may overflow the flow tables of the switches. In a reactive flow setup, a flow rule is set by the controller only if no matching entry exists in the flow tables, as soon as the first packet of a flow reaches the OpenFlow switch. Thus, only the first packet triggers a communication between the switch and the controller. These flow entries expire and are wiped out after a pre-defined timeout of inactivity. Although a reactive flow setup suffers from a large round-trip time, it provides a certain degree of flexibility to make flow-by-flow decisions while taking into account QoS requirements and traffic load conditions. To respond to a flow setup request, the controller first checks the flow against policies at the application layer and decides on the actions that need to be taken. Then, it computes a path for the flow and installs new flow entries in each switch belonging to this path, including the initiator of the request.
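The reactive mode can be sketched as follows. The topology model, the naive BFS path computation, and all names are simplifications of ours, not any controller's actual API:

```python
def compute_path(topology, src_sw, dst_sw):
    """Naive BFS over a {switch: [neighbor, ...]} adjacency map."""
    queue, seen = [[src_sw]], {src_sw}
    while queue:
        path = queue.pop(0)
        if path[-1] == dst_sw:
            return path
        for nxt in topology.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

def handle_packet_in(topology, flow_tables, src_sw, dst_sw, match):
    """On a table miss, compute a path and install a flow entry on every
    switch along it, including the switch that raised the request."""
    path = compute_path(topology, src_sw, dst_sw)
    for sw in path or []:
        flow_tables[sw].append({"match": match, "idle_timeout": 10})
    return path

topo = {"s1": ["s2"], "s2": ["s1", "s3"], "s3": ["s2"]}
tables = {sw: [] for sw in topo}
path = handle_packet_in(topo, tables, "s1", "s3", {"ip_dst": "10.0.0.9"})
```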

With respect to the flow entries installed by the controller, there is a design choice over the controlled flow granularity, which raises a trade-off between flexibility and scalability based on the requirements of network management. Although fine-grained traffic control, based on micro-flows, offers flexibility, it can be infeasible to implement, especially in large networks. As opposed to micro-flows, macro-flows can be built by aggregating several micro-flows, simply by replacing exact bit patterns with wildcards. Applying coarse-grained traffic control, based on macro-flows, gains scalability at the cost of flexibility.
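The aggregation step can be sketched as follows: micro-flows are collapsed into macro-flows by wildcarding (dropping) every field that is not kept. The helper name and the dict-based flow representation are ours:

```python
def aggregate(microflows, keep_fields):
    """Build macro-flows by wildcarding every field not in keep_fields:
    exact bit patterns on the dropped fields effectively become ANY."""
    macros = set()
    for flow in microflows:
        macros.add(tuple(sorted((f, v) for f, v in flow.items()
                                if f in keep_fields)))
    return [dict(m) for m in macros]

micro = [
    {"ip_src": "10.0.0.1", "ip_dst": "10.0.1.1", "tcp_dst": 80},
    {"ip_src": "10.0.0.2", "ip_dst": "10.0.1.1", "tcp_dst": 80},
    {"ip_src": "10.0.0.3", "ip_dst": "10.0.1.1", "tcp_dst": 443},
]
# Wildcard the source address: three micro-flows collapse into two macro-flows.
macro = aggregate(micro, keep_fields={"ip_dst", "tcp_dst"})
```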

In order to get an overview of the traffic in the switches, statistics are communicated between the controller and the switches. There are two ways of moving statistics from the switch to the controller: push-based and pull-based flow monitoring. In a push-based approach, statistics are sent by each switch to the controller to signal specific events, such as the setup of a new flow or the removal of a flow table entry due to an idle or hard timeout. This mechanism does not inform the controller about the behavior of a flow before the entry times out, which makes it unsuitable for flow scheduling. In a pull-based approach, the controller collects the counters for a set of flows matching a given flow specification. It can optionally request a report aggregated over all flows matching a wildcard specification. While this can save switch-to-controller bandwidth, it prevents the controller from learning much about the behavior of specific flows. A pull-based approach requires tuning the delay between controller requests, as this may impact the scalability and reliability of operations based on statistics gathering.
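A pull-based collection round can be sketched as follows: the controller queries each switch's table and aggregates the counters of all entries matching a given specification. The data layout and names are illustrative assumptions of ours:

```python
def poll_stats(switches, spec):
    """One polling round: aggregate packet/byte counters over all flow
    entries (across all switches) whose match fields satisfy spec."""
    total_pkts = total_bytes = 0
    for table in switches.values():
        for entry in table:
            if all(entry["match"].get(f) == v for f, v in spec.items()):
                total_pkts += entry["packets"]
                total_bytes += entry["bytes"]
    return total_pkts, total_bytes

switches = {
    "s1": [{"match": {"ip_dst": "10.0.1.1"}, "packets": 4, "bytes": 6000}],
    "s2": [{"match": {"ip_dst": "10.0.2.2"}, "packets": 1, "bytes": 90}],
}
pkts, nbytes = poll_stats(switches, {"ip_dst": "10.0.1.1"})
```

In a real deployment this loop would run periodically; as noted above, choosing that polling interval is the delicate part.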

In what follows, we present a list of the most prominent existing OpenFlow controllers. Note that most of the reviewed SDN controllers (except RYU) currently support OpenFlow version 1.0. The controllers are summarized in Table II and compared based on the availability of the source code, the implementation language, whether multi-threading is supported, the availability of a graphical user interface, and finally their originators.

NOX [32] is the first publicly available OpenFlow controller; it is single-threaded. Several derivatives of NOX exist. A multi-threaded successor of NOX, namely NOX-MT, has been proposed in [46]. QNOX [57] is a QoS-aware version of NOX based on Generalized OpenFlow, an extension of OpenFlow supporting multi-layer networking in the spirit of GMPLS. FortNox [58] is another extension of NOX, which implements a conflict analyzer to detect and reconcile conflicting flow rules caused by dynamic OpenFlow application insertions. Finally, the POX [47] controller is a pure Python controller, redesigned to improve the performance of the original NOX controller.

Maestro [48], [59], [60] takes advantage of multi-core technology to perform low-level parallelism while keeping a simple programming model for application programmers. It achieves performance by distributing tasks evenly over the available worker threads. Moreover, Maestro processes a batch of flow requests at once, which increases its efficiency. It has been shown that, on an eight-core server, Maestro's throughput achieves near-linear scalability for processing flow setup requests.

Beacon [49] was built at Stanford University. It is a multi-threaded, cross-platform, modular controller that optionally embeds the Jetty enterprise web server and a custom extensible user interface framework. Code bundles in Beacon can be started, stopped, refreshed, and installed at runtime, without interrupting other non-dependent bundles.

SNAC [50] uses a web-based policy manager to monitor the network. A flexible policy definition language and a user-friendly interface are incorporated to configure devices and monitor events.

Floodlight [52] is a simple and performant Java-based OpenFlow controller that was forked from Beacon. It has been tested with both physical and virtual OpenFlow-compatible switches. It is now supported and enhanced by a large community including Intel, Cisco, HP, and IBM.

McNettle [53] is an SDN controller programmed with Nettle [61], a Domain-Specific Language (DSL) embedded in Haskell that allows programming OpenFlow networks in a declarative style. Nettle is based on the principles of Functional Reactive Programming (FRP), which allows programming dynamic controllers. McNettle operates on shared-memory multi-core servers to achieve global visibility, high throughput, and low latency.

RISE [51], for Research Infrastructure for large-Scale network Experiments, is an OpenFlow controller based on Trema4. The latter is an OpenFlow stack framework based on Ruby and C. Trema provides an integrated testing and debugging environment and includes a development environment with an integrated tool chain.

MUL [54] is a C-based multi-threaded OpenFlow SDN controller that supports a multi-level northbound interface for hooking up applications.

RYU [55] is a component-based SDN framework. It is open source and fully developed in Python. It allows layer-2 segregation of tenants without using VLANs. It supports OpenFlow v1.0, v1.2, v1.3, and the Nicira extensions.

OpenDaylight [56] is an open source project and a software SDN controller implementation contained within its own Java virtual machine. As such, it can be deployed on any hardware and operating system platform that supports Java. As northbound APIs, it supports the OSGi framework [62] for local controller programmability and bidirectional REST [27] for remote programmability. Companies such as ConteXtream, IBM, NEC, Cisco, Plexxi, and Ericsson are actively contributing to OpenDaylight.

    C. Programming SDN Applications

SDN applications interact with the controllers through the northbound interface to request the network state and/or to request and manipulate the services provided by the network. While the southbound interface between the controller software and the forwarding elements is reasonably well defined through standardization efforts on the underlying protocols, such as OpenFlow and ForCES, there is no standard yet for the interactions between controllers and SDN applications. This may stem from the fact that the northbound interface is more a set of software-defined APIs than a protocol exposing the universal network abstraction data models and the functionality within the controller [23]. Programming using these APIs allows SDN applications to easily interface with and reconfigure the network and its components, or to pull specific data based on their particular needs [63]. On the one hand, northbound APIs can enable basic network functions, including path computation, routing, traffic steering, and security. On the other hand, they also allow orchestration systems such as OpenStack Quantum [64] to manage network services in a cloud.

4 http://trema.github.com/trema/

TABLE II
SDN Controllers

Controller | Open Source | Language | Multi-threaded | GUI | Origin
NOX [32] | yes | C++/Python | no | yes | Nicira Networks
NOX-MT [46] | yes | C++ | yes | no | Nicira Networks and Big Switch Networks
POX [47] | yes | Python | - | yes | Nicira Networks
Maestro [48] | yes | Java | yes | no | Rice University
Beacon [49] | yes | Java | yes | yes | Stanford University
SNAC [50] | no | C++/Python | no | yes | Nicira Networks
RISE [51] | yes | C and Ruby | non-guaranteed | no | NEC
Floodlight [52] | yes | Java | - | yes | Big Switch Networks
McNettle [53] | yes | Nettle/Haskell | no | no | Yale University
MUL [54] | yes | C | yes | yes | KulCloud
RYU [55] | yes | Python | - | - | NTT OSRG and VA Linux
OpenDaylight [56] | yes | Java | yes | yes | Multiple contributors

SDN programming frameworks generally consist of a programming language and, possibly, appropriate tools for compiling and validating the OpenFlow rules generated by the application program, as well as for querying the network state. SDN programming languages can be compared according to three main design criteria: the level of abstraction of the programming language, the class of language it belongs to, and the type of programmed policies:

Level of Abstraction: Low-Level vs. High-Level. Low-level programming languages let developers deal with OpenFlow-related details, whereas high-level programming languages translate the information provided by the OpenFlow protocol into high-level semantics, allowing programmers to focus on network management goals instead of the details of low-level rules.

Programming: Logic vs. Functional Reactive. Most existing network management languages adopt the declarative programming paradigm, which means that only the logic of the computation is described (what the program should accomplish), while the control flow (how to accomplish it) is delegated to the implementation. Nevertheless, there exist two different programming styles for expressing network policies: Logic Programming (LP) and Functional Reactive Programming (FRP). In logic programming, a program is constituted of a set of logical sentences; it applies particularly to areas of artificial intelligence. Functional reactive programming [65] is a paradigm that provides an expressive and mathematically sound approach for programming reactive systems in a declarative manner. The most important feature of FRP is that it captures both continuous time-varying behaviors and event-based reactivity. It is consequently used in areas such as robotics and multimedia.

Policy Logic: Passive vs. Active. A programming language can be devised to develop either passive or active policies. A passive policy can only observe the network state, while an active policy is programmed to reactively affect the network-wide state in response to certain network events. An example of an active policy is to limit the network access of a device or a user based on a maximum bandwidth usage.

In the following, we examine the most relevant programming frameworks proposed for developing SDN applications.

1) Frenetic [66], [67]: It is a high-level network programming language constituted of two levels of abstraction. The first is a low-level abstraction consisting of a runtime system that translates high-level policies and queries into low-level flow rules and then issues the OpenFlow commands needed to install these rules on the switches. The second is a high-level abstraction used to define a declarative network query language resembling the Structured Query Language (SQL), and an FRP-based network policy management library. The query language provides means for reading the state of the network, merging different queries, and expressing high-level predicates for classifying, filtering, transforming, and aggregating the packet streams traversing the network. To govern packet forwarding, the FRP-based policy management library offers high-level packet-processing operators that manipulate packets as discrete streams only. This library allows reasoning about a unified architecture based on a "see every packet" abstraction and describing network programs without the burden of low-level details. The Frenetic language offers operators that allow combining policies in a modular way, which facilitates building new tools out of simpler reusable parts. Frenetic has been used to implement many services, such as load balancing, network topology discovery, and fault-tolerant routing, and it is designed to cooperate with the NOX controller.

Frenetic defines only parallel composition, which gives each application the illusion of operating on its own copy of each packet. Monsanto et al. [71] define an extension of the Frenetic language with a sequential composition operator, so that one module acts on the packets produced by another module. Furthermore, an abstract packet model was introduced to allow programmers to extend packets with virtual fields used to associate packets with high-level metadata, together with a topology abstraction. The latter abstraction allows limiting the scope of the network view and of module actions, which achieves information hiding and protection, respectively.

TABLE III
SDN Programming Languages

Framework | Level of Abstraction | Query Language | Runtime System | Implementation Language | Programming Type | Policies Type
Frenetic [66], [67] | high | yes | yes | Python | FRP | Active
NetCore [68] | high | yes | yes | Python | FRP | Active
Nettle [61] | low | no | no | Haskell | FRP | Active
FML [69] | high | no | no | Python/C++ | LP | Passive
Procera [70] | high | no | no | Haskell | FRP | Active
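To make the two composition operators concrete, the following Python sketch models a policy as a function from a packet to a list of output packets. This mimics the semantics described above and is not Frenetic's actual syntax; all names are ours:

```python
def set_field(field, value):
    """Policy: rewrite one header field of the packet."""
    def policy(pkt):
        out = dict(pkt)
        out[field] = value
        return [out]
    return policy

def forward(port):
    """Policy: tag the packet with an output port."""
    return set_field("out_port", port)

def parallel(p1, p2):
    """Each sub-policy operates on its own copy of the packet;
    the results are combined (Frenetic's parallel composition)."""
    return lambda pkt: p1(dict(pkt)) + p2(dict(pkt))

def sequential(p1, p2):
    """p2 acts on the packets produced by p1 (the sequential
    composition added by Monsanto et al.)."""
    return lambda pkt: [out for mid in p1(pkt) for out in p2(mid)]

pkt = {"ip_dst": "10.0.0.1"}
tagged = sequential(set_field("vlan", 10), forward(2))(pkt)
copies = parallel(forward(1), forward(2))(pkt)  # e.g. monitor and route
```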

2) NetCore [68]: It is a successor of Frenetic that enriches Frenetic's policy management library and proposes algorithms for compiling monitoring policies and managing controller-switch interactions. NetCore has a formal semantics and its algorithms have been proved correct. NetCore defines a core calculus for high-level network programming that manipulates two components: predicates, which match sets of packets, and policies, which specify the locations to which these packets are forwarded. Set-theoretic operations are defined to build more complex predicates and policies from simple ones. In contrast to Frenetic, the NetCore compiler uses wildcard rules to generate switch classifiers (sets of packet-forwarding rules), which increases the efficiency of packet processing on the switches.
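The set-theoretic construction of predicates can be sketched as follows; predicates are modeled as boolean functions over packets, with intersection, union, and complement as combinators. This is an illustration of the idea, not NetCore's calculus:

```python
def match(field, value):
    """Atomic predicate: the packet's field equals the given value."""
    return lambda pkt: pkt.get(field) == value

def p_and(a, b):                      # intersection of packet sets
    return lambda pkt: a(pkt) and b(pkt)

def p_or(a, b):                       # union of packet sets
    return lambda pkt: a(pkt) or b(pkt)

def p_not(a):                         # complement of a packet set
    return lambda pkt: not a(pkt)

# Complex predicate built from simple ones: web traffic to one server.
web_to_server = p_and(match("ip_dst", "10.0.0.1"), match("tcp_dst", 80))
```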

3) Nettle [61]: It is another FRP-based approach for programming OpenFlow networks that is embedded in Haskell [72], a strongly typed language. It defines signal functions that transform messages issued by switches into commands generated by the controller. Nettle allows manipulating continuous quantities (values) that reflect abstract properties of a network, such as the volume of messages on a network link. It provides a declarative mechanism for describing time-sensitive and time-varying behaviors, such as dynamic load balancing. Compared to Frenetic, Nettle is considered a low-level programming language, which makes it more appropriate for programming controllers. However, it can be used as a basis for developing higher-level DSLs for different tasks, such as traffic engineering and access control. Moreover, Nettle has a sequential operator for creating compound commands, but it lacks support for composing modules affecting overlapping portions of the flow space, as proposed by Frenetic.

4) Procera [70]: It is an FRP-based high-level language embedded in Haskell. It offers a declarative, expressive, extensible, and compositional framework for network operators to express realistic network policies that react to dynamic changes in network conditions. These changes can originate from OpenFlow switches or even from external events, such as user authentication, time of day, bandwidth measurements, server load, etc. For example, access to a network can be denied when a temporal bandwidth usage condition occurs.
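The bandwidth example just mentioned can be sketched as a fold over a stream of usage events. The event model, cap, and names are assumptions of ours; real Procera policies are written in Haskell's FRP style rather than plain Python:

```python
CAP_BYTES = 10_000_000  # illustrative per-window usage cap

def policy(usage_events):
    """Fold a stream of (user, bytes) events into per-user decisions:
    deny access once a user's accumulated usage exceeds the cap."""
    used, decision = {}, {}
    for user, nbytes in usage_events:
        used[user] = used.get(user, 0) + nbytes
        decision[user] = "deny" if used[user] > CAP_BYTES else "allow"
    return decision

events = [("alice", 4_000_000), ("bob", 2_000_000), ("alice", 7_000_000)]
verdict = policy(events)
```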

5) Flow-based Management Language (FML) [69]: It is a declarative language based on non-recursive Datalog, a declarative logic programming language. An FML policy file consists of a set of declarative statements and may additionally include external references, for instance, to SQL queries. While combining policy statements written by different authors is made easy, conflicts may be created. Therefore, a conflict resolution mechanism is defined as a layer on top of the core semantics of FML. For each new application of FML, developers can define a set of keywords that they need to implement. FML is written in C++ and Python and operates within NOX. Although FML provides a high-level abstraction, in contrast to Procera it lacks expressiveness for describing dynamic policies, where forwarding decisions change over time. Moreover, FML policies are passive, which means they can only observe the network state without modifying it.

    IV. SDN Taxonomy

The first step in understanding SDN is to elaborate a classification using a taxonomy that simplifies and eases the understanding of the related domains. In the following, we elaborate a taxonomy of the main issues raised by the SDN networking paradigm and the solutions designed to address them. Our proposed taxonomy provides a hierarchical view and classifies the identified issues and solutions per layer: infrastructure, control, and application. We also consider inter-layer concerns, mainly application/control, control/infrastructure, and application/control/infrastructure.

While reviewing the literature, we found only two taxonomies, each focusing on a single aspect of SDN. The first is a taxonomy based on switch-level SDN deployment provided by Gartner in a non-public report [73], which we will not detail here, and the second focuses on abstractions for the control plane [18]. The latter abstractions are meant to ensure compatibility with low-level hardware/software and to enable making decisions based on the entire network. The three control plane abstractions proposed in [18], [74] are as follows:

Forwarding Abstraction: a flexible forwarding model that should support any needed forwarding behavior and should hide the details of the underlying hardware. This corresponds to the aforementioned logical forwarding plane.

Distributed State Abstraction: This abstraction aims at abstracting away the complex distributed mechanisms used today in many networks and at separating state management from protocol design and implementation. It allows providing a single coherent global view of the network through an annotated network graph accessible for control via an API. An implementation of such an abstraction is a Network Operating System (NOS).

Specification (or Configuration) Abstraction: This layer allows specifying the behavior of the desired control requirements (such as access control, isolation, QoS, etc.) on the abstract network model and corresponds to the abstract network view presented earlier.

Each of these existing taxonomies focuses on a single specific aspect, and we believe that none of them serves our purpose. Thus, we present in the following a hierarchical taxonomy that comprises three levels: the SDN layer (or layers) of concern, the issues identified for that layer (or layers), and the solutions proposed in the literature to address these issues.

    A. Infrastructure Layer

At this layer, the main issues identified in the literature are the performance and scalability of the forwarding devices, as well as the correctness of the flow entries.

1) Performance and Scalability of the Forwarding Devices: To tackle performance and scalability issues at this layer, the main solution classes identified are described as follows:

Switches' Resources Utilization: Resources on switches, such as CPU power, packet buffer size, flow table size, and the bandwidth of the control datapath, are scarce and may create performance and scalability issues at the infrastructure layer. Works tackling this class of problems propose either optimizing the utilization of these resources or modifying the switches' hardware and architecture.

Lookup Procedure: The implementation of the switch lookup procedure may have an important impact on switch-level performance. A trade-off exists between using hardware and/or software tables, since tables of the first type are expensive resources, while the implementation of tables of the second type may add lookup latency. Works tackling this class of problems propose ways to deal with this trade-off.

2) Correctness of Flow Entries: Several factors may lead to inconsistencies and conflicts within OpenFlow configurations at the infrastructure level, among them the distributed state of the OpenFlow rules across various flow tables and the involvement of multiple independent OpenFlow rule writers (administrators, protocols, etc.). Several approaches have been proposed to tackle this issue, and the solutions can be classified as follows:

Run-time Formal Verification: In this thread, the verification is performed at run time, which allows capturing bugs before damage occurs. This class of solutions is based on formal methods such as model checking.

Offline Formal Verification: In this case, the formal verification is performed offline and the check is only run periodically.

    B. Control Layer

As far as network control is concerned, the identified critical issues are the performance, scalability, and reliability of the controller, and the security of the control layer.

1) Performance, Scalability, and Reliability: The control layer can become a bottleneck of an SDN network if it relies on a single controller to manage a medium to large network. Among the envisioned solutions, we can find the following categories:

Control Partitioning: Horizontal or Vertical. In large SDN networks, partitioning the network into multiple controlled domains should be envisaged. We can distinguish two main types of control plane distribution [75]:

Horizontally distributed controllers: Multiple controllers are organized in a flat control plane where each one governs a subset of the network switches. This deployment comes in two flavors: with state replication or without state replication.

Vertically distributed controllers: This is a hierarchical control plane where controller functionalities are organized vertically. In this deployment model, control tasks are distributed to different controllers depending on criteria such as network view and locality requirements. Thus, local events are handed to controllers that are lower in the hierarchy, while more global events are handled at higher levels.
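The vertical deployment model can be sketched as a simple event dispatcher. The following Python sketch is illustrative only: the class names, event types, and the two-level hierarchy are assumptions, not an API from any surveyed system.

```python
# Sketch of vertical (hierarchical) control-plane dispatch: local controllers
# handle events with purely local scope and escalate network-wide events to a
# root controller with a global view.

class RootController:
    """Top-level controller with a global network view (hypothetical)."""
    def handle(self, event):
        return f"root handled {event['type']}"

class LocalController:
    """Bottom-level controller serving one switch domain."""
    LOCAL_EVENTS = {"mac_learning", "arp_reply", "local_flow_setup"}

    def __init__(self, root):
        self.root = root

    def handle(self, event):
        # Local events never leave the bottom layer; everything else
        # (e.g. inter-domain routing) is escalated up the hierarchy.
        if event["type"] in self.LOCAL_EVENTS:
            return f"local handled {event['type']}"
        return self.root.handle(event)

root = RootController()
local = LocalController(root)
print(local.handle({"type": "arp_reply"}))           # stays at the bottom layer
print(local.handle({"type": "inter_domain_route"}))  # escalated to the root
```

The design choice mirrored here is that the frequency of local events is much higher than that of global ones, so keeping them at the bottom layer offloads the root controller.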

Distributed Controllers Placement: Distributed controllers may solve the performance and scalability issue; however, they raise a new issue, which is determining the number of needed controllers and their placement within the controlled domain. Research works in this direction aim at finding an optimized solution to this problem.
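One common formalization of this placement problem minimizes the worst-case switch-to-controller latency. The sketch below uses a greedy k-center-style heuristic over a latency matrix; it is an illustration of the problem shape, not the optimization method of any particular surveyed work.

```python
# Greedy sketch of the controller-placement problem: given switch-to-site
# latencies, pick k candidate sites so that the worst-case latency from any
# switch to its nearest chosen controller is small.

def place_controllers(latency, k):
    """latency[i][j] = latency from switch i to candidate site j."""
    n_switches, n_sites = len(latency), len(latency[0])
    chosen = []
    for _ in range(k):
        # Pick the site that most reduces the current worst-case latency.
        best_site, best_worst = None, float("inf")
        for s in range(n_sites):
            if s in chosen:
                continue
            worst = max(min(latency[i][j] for j in chosen + [s])
                        for i in range(n_switches))
            if worst < best_worst:
                best_site, best_worst = s, worst
        chosen.append(best_site)
    return chosen, best_worst

lat = [[1, 9, 9],
       [9, 1, 9],
       [9, 9, 1]]
sites, worst = place_controllers(lat, 2)
print(sites, worst)  # two switches reach a controller at latency 1, one at 9
```

Exact formulations (e.g. integer programming) and average-latency objectives are also used in the literature; the greedy heuristic is only one way to approach the trade-off.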

2) Security of the Controller: The scalability issue of the controller enables targeted flooding attacks, which lead to control plane saturation. A possible solution to this problem is Adding Intelligence to the Infrastructure, which relies on adding programmability to the infrastructure layer to prevent congestion of the control plane.

    C. Application Layer

At this layer, we can distinguish two main research directions studied in the literature: developing SDN applications to manage specific network functionalities and developing SDN applications for managing specific environments (called use cases).

1553-877X (c) 2013 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See http://www.ieee.org/publications_standards/publications/rights/index.html for more information. This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI 10.1109/COMST.2014.2320094, IEEE Communications Surveys & Tutorials. IEEE COMMUNICATIONS SURVEYS & TUTORIALS, VOL. 16, NO. 1, JANUARY 2014.

1) SDN Applications: At this layer, SDN applications interact with the controllers to achieve a specific network function in order to fulfill the network operators' needs. We can categorize these applications according to the network functionality or domain they target, including security, quality of service (QoS), traffic engineering (TE), universal access control lists (U-ACL) management, and load balancing (LB).

2) SDN Use Cases: On the other side, several SDN

applications are developed in order to serve a specific use case in a given environment. Among possible SDN use cases, we focus on the application of SDN in cloud computing, Information-Centric Networking (ICN), mobile networks, network virtualization, and Network Function Virtualization (NFV).

    D. Control/Infrastructure Layers

In this part of the taxonomy, we focus on issues that may span the control and infrastructure layers and the connection between them.

1) Performance and Scalability: In the SDN design

vision of keeping the data plane simple and delegating the control task to a logically centralized controller, the switches-to-controller connection tends to be highly solicited. This not only adds latency to the processing of the first packets of a flow in the switches' buffers but can also lead to irreversible damage to the network, such as packet loss and paralysis of the whole network. In order to tackle this issue, we mainly found proposals on Control Load Devolving, which is based on delegating some of the control load to the infrastructure layer to reduce the frequency with which the switches contact the controller.

2) Network Correctness: The controller is in charge

of instructing the switches in the data plane on how to process incoming flows. As this dynamic insertion of forwarding rules may cause potential violations of network properties, verifying network correctness at run-time is essential to keeping the network operational. Among the proposed solutions, we cite the use of Algorithms for Run-time Verification to check the correctness of the network while inserting new forwarding rules. The verification involves checking network-wide policies and invariants such as absence of loops and reachability properties.
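The loop-freedom invariant mentioned above can be checked with a simple graph traversal. The sketch below is a minimal illustration; production tools restrict the check to the equivalence classes of packets affected by the new rule to keep it fast.

```python
# Minimal sketch of run-time verification of a network-wide invariant:
# before a new forwarding rule takes effect, check that the resulting
# next-hop graph for a class of packets stays loop-free.

def has_forwarding_loop(next_hop):
    """next_hop maps switch -> next switch for one class of packets."""
    for start in next_hop:
        seen, node = set(), start
        while node in next_hop:       # follow the path until traffic exits
            if node in seen:
                return True           # revisited a switch: forwarding loop
            seen.add(node)
            node = next_hop[node]
    return False

rules = {"s1": "s2", "s2": "s3"}      # s3 forwards out of the network
assert not has_forwarding_loop(rules)

rules["s3"] = "s1"                    # candidate rule would close a cycle
assert has_forwarding_loop(rules)     # so its insertion is rejected
```

Reachability can be checked with the same traversal by asking whether the walk from a given ingress switch terminates at the intended egress.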

    E. Application/Control Layers

Various SDN applications are developed by different network administrators to manage network functionalities by programming the controllers' capabilities. Two main issues were examined in this context: policy correctness and the northbound interface security threats represented by adversarial SDN applications.

1) Policy Correctness: Conflicts between OpenFlow

rules may occur due to multiple requests made by several SDN applications. Different solutions are proposed for conflict detection and resolution; they can be classified as either approaches for Run-time Formal Verification using well-established formal methods or Custom Algorithms for Run-time Verification.

2) Northbound Interface Security: Multiple SDN applications may request a set of OpenFlow rule insertions, which may lead to the creation of security breaches in the ongoing OpenFlow network configuration. Among the envisaged solutions is the use of a Role-based Authorization model to assign a level of authorization to each SDN application.
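The role-based idea can be sketched as a priority check at the northbound interface. The role names and API below are illustrative assumptions, loosely inspired by security-authorization levels in systems such as FortNOX [58], not that system's actual interface.

```python
# Hedged sketch of role-based authorization for rule insertion: each
# application is registered with a role, and an application may override an
# existing rule only if its role ranks at least as high as the role that
# installed that rule.

ROLE_PRIORITY = {"admin": 3, "security_app": 2, "app": 1}

class Authorizer:
    def __init__(self):
        self.app_roles = {}

    def register(self, app, role):
        self.app_roles[app] = role

    def may_override(self, app, existing_rule_role):
        return (ROLE_PRIORITY[self.app_roles[app]]
                >= ROLE_PRIORITY[existing_rule_role])

auth = Authorizer()
auth.register("load_balancer", "app")
auth.register("firewall", "security_app")
print(auth.may_override("firewall", "app"))                # True
print(auth.may_override("load_balancer", "security_app"))  # False
```
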

    F. Application/Control/Infrastructure Layers

The decisions taken by the SDN applications deployed at the application layer influence the OpenFlow rules configured at the infrastructure layer. This influence is directed via the control layer. In this part of the taxonomy, we focus on issues that concern all three SDN layers.

1) Policy Updates Correctness: Modifications in the policies programmed by the SDN applications may result in inconsistent modifications of the OpenFlow network configurations. Such configuration changes are a common source of network instability. To prevent this critical problem, works propose solutions to verify and/or ensure consistent updates. We thus enumerate two classes of solutions:

Formal Verification of Updates: Formal verification approaches, such as model checking, are mainly used to verify that updates are consistent (i.e., updates preserve well-defined behaviors when transitioning between configurations).

Update Mechanism/Protocol: This class of solutions proposes a mechanism or protocol that ensures that updates are performed without introducing inconsistent transient configurations.
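A well-known mechanism in this class stamps rules with a configuration version so that every packet is processed entirely by one configuration. The sketch below follows the spirit of the per-packet consistent-update idea of Reitblatt et al. [135]; the class and method names are illustrative, not the paper's formalism.

```python
# Sketch of a two-phase, versioned update: internal switches hold rules for
# both the old and new configuration versions, ingress switches tag packets
# with one version, and in-flight packets tagged with the old version still
# match the old rules, so no packet sees a mix of configurations.

class VersionedTable:
    def __init__(self):
        self.rules = {}                     # (version, match) -> action

    def install(self, version, match, action):
        self.rules[(version, match)] = action

    def lookup(self, version, match):
        return self.rules.get((version, match))

table = VersionedTable()
table.install(1, "10.0.0.0/8", "port1")     # old configuration
table.install(2, "10.0.0.0/8", "port2")     # new configuration, staged first

# Phase 1: both versions coexist. Phase 2: ingress flips the tag from 1 to 2.
assert table.lookup(1, "10.0.0.0/8") == "port1"
assert table.lookup(2, "10.0.0.0/8") == "port2"

# Only after all version-1 packets drain are the old rules garbage-collected.
del table.rules[(1, "10.0.0.0/8")]
```
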

2) Network Correctness: While network correctness at the Control/Infrastructure layers is mostly about the newly inserted OpenFlow rules and the existing ones at the infrastructure layer, network correctness at the Application/Control/Infrastructure layers concerns the policies specified by the applications and the existing OpenFlow rules. Among the proposed solutions is Offline Testing, which uses testing techniques to check generic correctness properties, such as the absence of forwarding loops or black holes, and application-specific properties over SDN networks, taking the three layers into account.

    V. SDN Issues and Research Directions

In this section, we present a survey on the most relevant research initiatives studying problematic issues raised by SDN and providing proposals towards supporting the adoption of SDN concepts in today's networks. The reviewed works are organized using our taxonomy: either belonging to a specific functional layer or concerning a cross-layer issue. We identified a set of the most critical concerns that may either catalyze the successful growth and adoption of SDN or restrain its advance. These concerns are scalability, performance, reliability, correctness, and security. Figure 4 provides an overview of the reviewed research works classified using our taxonomy.


SDN Research:

- Infrastructure Layer
  - Performance and Scalability: Switch Resources Utilization [37], [76]-[80]; Lookup Procedure [37], [79], [81]
  - Correctness of Flow Entries: Run-time Formal Verification [82]; Offline Formal Verification [83], [84]
- Control Layer
  - Performance, Scalability, and Reliability: Control Partitioning [85]-[89]; Controllers Placement [90]-[92]
  - Controller Security: Adding Intelligence to the Infrastructure [93]
- Application Layer
  - Applications: Security [94]-[98]; QoS [30], [99], [100]; TE [101]-[103]; U-ACL [104]; LB [105]
  - Use Cases: Cloud [106]-[117]; Mobile [118]-[122]; ICN [123]-[125]; NFV [126]; Network Virtualization [127]
- Control/Infrastructure Layers
  - Performance and Scalability: Control Devolving [128]-[130]
  - Network Correctness: Algorithm for Run-time Verification [131], [132]
- Application/Control Layers
  - Policy Correctness: Run-time Formal Verification [134]; Algorithm for Run-time Verification [58], [133]
  - Northbound Interface Security: Role-based Authorization [58]
- Application/Control/Infrastructure Layers
  - Policy Updates Correctness: Formal Verification of Updates [135], [136]; Update Mechanism/Protocol [135]-[137]
  - Network Correctness: Offline Testing [138], [139]

Fig. 4. Overview of the Surveyed Research Works Classified According to the Proposed Taxonomy


    A. Infrastructure Layer

1) Performance and Scalability: Despite the undeniable advantages brought by SDN since its inception, it has introduced several concerns, including scalability and performance. These concerns stem from various aspects, including the implementation of OpenFlow switches. The most relevant switch-level problems that limit the support of the SDN/OpenFlow paradigm are as follows:

Flow Tables Size. The CAM is a special type of memory that is content addressable. CAM is much faster than RAM as it allows parallel lookup search. A TCAM, for ternary CAM, can match 3-valued inputs: 0, 1, and X, where X denotes the "don't care" condition (usually referred to as the wildcard condition). Thus, a TCAM entry typically requires a greater number of entries if stored in CAM. With the emergence of SDN, the volume of flow entries is expected to grow several orders of magnitude higher than in traditional networks. This is due to the fact that OpenFlow switches rely on fine-grained management of flows (microflows) to maintain complete visibility in a large OpenFlow network. However, TCAM entries are a relatively expensive resource in terms of ASIC area and power consumption.
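The ternary matching described above can be emulated in a few lines, which also shows why one TCAM entry may expand into many CAM entries. This is a software illustration of the semantics only; the field widths and actions are assumptions.

```python
# Sketch of TCAM-style ternary matching: each entry is a pattern of 0, 1,
# and X ("don't care"); a header matches if every non-X bit agrees. A
# hardware TCAM compares a key against all entries in parallel; this loop
# emulates that behavior in software.

def ternary_match(header_bits, pattern):
    return all(p == "X" or p == h for h, p in zip(header_bits, pattern))

def tcam_lookup(header_bits, entries):
    """entries: ordered (pattern, action) pairs; first match wins."""
    for pattern, action in entries:
        if ternary_match(header_bits, pattern):
            return action
    return "drop"

entries = [("1010XXXX", "fwd_port1"),   # one wildcard entry stands in for
           ("10101111", "fwd_port2")]   # 16 exact entries in a plain CAM
print(tcam_lookup("10101111", entries))  # "fwd_port1" (first match wins)
print(tcam_lookup("01010000", entries))  # "drop"
```

The pattern "1010XXXX" covers 2^4 = 16 distinct headers, which is exactly the expansion a binary CAM would need to store it exactly.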

Lookup Procedure. Two types of flow tables exist: hash tables and linear tables. A hash table is used to store microflows, where the hash of the flow is used as an index for fast lookups. The hashes of the exact flows are typically stored in Static RAM (SRAM) on the switch. One drawback of this type of memory is that it is usually off-chip, which causes lookup latencies. Linear tables are typically used for storing macroflows and are usually implemented in TCAM, which is most efficient for storing flow entries with wildcards. TCAM is often located on the switching chip, which decreases lookup delays. In ordinary switches, the lookup mechanism is the main operation performed, whereas in OpenFlow-enabled switches, other operations are considered, especially the insert operation. This can lead to higher power dissipation and longer access latency [76] than in regular switches.
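The two-table lookup just described can be sketched as a two-stage function: an exact-match hash lookup first, then a linear scan of wildcard rules, with a miss punted to the controller. The flow-key layout and rule format below are simplifying assumptions.

```python
# Sketch of the two-table lookup: an exact-match hash table (SRAM-like) is
# consulted first for microflows, then a linear wildcard table (TCAM-like)
# is scanned for macroflows; a table miss triggers a reactive flow setup.

def lookup(flow_key, hash_table, linear_table):
    # Stage 1: O(1) exact match on the full (src, dst, port) flow key.
    if flow_key in hash_table:
        return hash_table[flow_key]
    # Stage 2: linear scan of wildcard rules (dst-prefix only, for brevity).
    for prefix, action in linear_table:
        if flow_key[1].startswith(prefix):
            return action
    return "send_to_controller"          # miss: ask the controller

hash_table = {("10.0.0.1", "10.0.0.2", 80): "port3"}
linear_table = [("10.0.", "port4")]

print(lookup(("10.0.0.1", "10.0.0.2", 80), hash_table, linear_table))   # exact
print(lookup(("10.0.0.9", "10.0.0.7", 22), hash_table, linear_table))   # wildcard
print(lookup(("192.168.1.1", "8.8.8.8", 53), hash_table, linear_table)) # miss
```
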

CPU Power. For a purely software-based OpenFlow switch, every flow is handled by the system CPU, and thus performance is determined by the switch's CPU power. Furthermore, CPU is needed to encapsulate packets to be transmitted to the controller for a reactive flow setup through the secure channel. However, in traditional networks, the CPU on a switch was not intended to handle per-flow operations, thus limiting the supported rate of OpenFlow operations. Furthermore, the limited power of a switch CPU can restrict the bandwidth between the switch and the controller, which will be discussed in the cross-layer issues.

Bandwidth Between CPU and ASIC. The control datapath between the ASIC and the CPU is typically a slow path, as it is not frequently used in traditional switch operation.

Packet Buffer Size. The switch packet buffer is characterized by a limited size, which may lead to packet drops and cause throughput degradation.

Various research works [37], [76]-[81] addressed one or more of these issues to improve the performance and scalability of the SDN data plane, and specifically of OpenFlow switches. These works are summarized in Table IV.

2) Correctness of Flow Entries: More than half of

network errors are due to misconfiguration bugs [140]. Misconfiguration has a direct impact on the security and efficiency of the network because of forwarding loops, content delivery failures, isolation guarantee failures, access control violations, etc. Skowyra et al. [84] propose an approach based on formal methods to model and verify OpenFlow learning switch networks with respect to properties such as network correctness, network convergence, and mobility-related properties. The verified properties are expressed in LTL and PCTL*, and both the SPIN and PRISM model-checkers are used. McGeer [83] discusses the complexity of verifying OpenFlow networks. Therein, a network of OpenFlow switches is considered as an acyclic network of high-dimensional Boolean functions. Such verification is shown to be NP-complete by a reduction from SAT. Furthermore, restricting the OpenFlow rule set to prefix rules makes the verification complexity polynomial. FlowChecker [82] is another tool to analyze, validate, and enforce end-to-end OpenFlow configurations in federated OpenFlow infrastructures. Various types of misconfiguration are investigated: intra-switch misconfiguration within a single FlowTable, as well as inter-switch or inter-federated inconsistencies in a path of OpenFlow switches across the same or different OpenFlow infrastructures. For federated OpenFlow infrastructures, FlowChecker runs as a master controller communicating with the various controllers to identify and resolve inconsistencies, using symbolic model-checking over Binary Decision Diagrams (BDDs) to encode OpenFlow configurations. These works are compared in Table V.
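The intra-switch case can be illustrated with a direct check for overlapping ternary patterns that prescribe different actions. Tools like FlowChecker do this symbolically over BDDs; the pairwise comparison below is only a toy version of the same question.

```python
# Toy sketch of intra-switch misconfiguration detection: two rules conflict
# when their match patterns can both cover some packet but their actions
# differ.

def patterns_overlap(p1, p2):
    """Ternary patterns overlap unless some bit is 0 in one and 1 in the
    other; X ("don't care") is compatible with anything."""
    return all(a == "X" or b == "X" or a == b for a, b in zip(p1, p2))

def find_conflicts(rules):
    """rules: list of (pattern, action). Returns conflicting index pairs."""
    return [(i, j)
            for i in range(len(rules))
            for j in range(i + 1, len(rules))
            if patterns_overlap(rules[i][0], rules[j][0])
            and rules[i][1] != rules[j][1]]

rules = [("10XX", "fwd1"), ("1X0X", "fwd2"), ("0XXX", "drop")]
print(find_conflicts(rules))  # [(0, 1)]: a packet 1000 matches both rules
```
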

    B. Control Layer

1) Performance, Scalability, and Reliability: Concerns about performance and scalability have been considered major in SDN since its inception. The most determinant factors that impact the performance and scalability of the control plane are the number of new flow installs per second that the controller can handle and the flow setup delay. Benchmarks on NOX [32] showed that it could handle at least 30,000 new flow installs per second while maintaining a sub-10 ms flow setup delay [142]. Nevertheless, recent experimental studies suggest that these numbers are insufficient to overcome scalability issues. For example, it has been shown in [143] that the median flow arrival rate in a cluster of 1,500 servers is about 100,000 flows per second. This level of performance, despite its suitability for some deployment environments such as enterprises, raises legitimate questions


TABLE IV: Survey on Initiatives Addressing Performance and Scalability at the Infrastructure Layer

- Kannan and Banerjee [76]. Addressed issue: TCAM flow table size and power dissipation. Proposed solution: compact the TCAM by replacing flow entries with a shorter Flow-ID. Required modification: packet headers carry the Flow-ID, and a Flow-ID table is maintained at the controller site.
- Narayanan et al. [77]. Addressed issue: flow table size, bus speed between the hardware pipeline and the on-chip CPU, and CPU power. Proposed solution: use multi-core programmable ASICs and a new switch architecture (Split SDN Data Plane) splitting the forwarding fast path into TCAM-based and software-based paths, together with a high-speed bus. Required modification: switch hardware and architecture.
- Lu et al. [78]. Addressed issue: packet buffer size, CPU power, and bandwidth between CPU and ASIC. Proposed solution: equip switches with powerful CPUs and gigabytes of DRAM, and set up a high-bandwidth internal link between the CPU and the ASIC. Required modification: switch hardware and architecture.
- Tanyingyong et al. [81]. Addressed issue: software-based lookup procedure. Proposed solution: use a standard commodity Network Interface Card (NIC) and achieve a fast decision path by caching flow table entries on the NIC. Required modification: switch hardware and architecture.
- Luo et al. [79]. Addressed issue: CPU power and lookup procedure. Proposed solution: a network-processor-based acceleration card and an OpenFlow switching algorithm. Required modification: switch hardware and architecture.
- Naous et al. [37]. Addressed issue: flow table size, CPU power, and lookup procedure. Proposed solution: implementation of a full line-rate OpenFlow switch on NetFPGA. Required modification: switch hardware and architecture.
- Kang et al. [80]. Addressed issue: flow table size. Proposed solution: an algorithm for rule placement that distributes forwarding policies and supports dynamic incremental policy updates. Required modification: a rule placement algorithm within the controller to satisfy switch table-size constraints.

on scaling implications. In large networks, increasing the number of switches increases the number of OpenFlow messages. Furthermore, networks with large diameters may incur additional flow setup delay. At the control layer, the partitioning of the control, the number and placement of controllers in the network, and the design and implementation choices of the controlling software are various proposals to address these issues.

The deployment of the SDN controller may have a high

impact on the reliability of the control plane. In contrast to traditional networks, where one has to deal with network link and node failures only, the SDN controller and the switches-to-controller links may also fail. In a network managed by a single controller, the failure of the latter may collapse the entire network. Moreover, in case of failure in OpenFlow SDN systems, the number of forwarding rules that need to be modified to recover can be very large as the number of hosts grows. Thus, ensuring the reliability of the controlling entity is vital for SDN-based networks.

As for the design and implementation choices, some

works suggest taking advantage of multi-core technology and propose multi-threaded controllers to improve performance. NOX-MT [46], Maestro [48], and Beacon [49] are examples of such controllers. Tootoonchian et al. [46] show through experiments that multi-threaded controllers exhibit better performance than single-threaded ones and may boost performance by an order of magnitude.

In the following, we focus first on various frameworks

proposing specific architectures for partitioning the control plane, and then discuss works proposing solutions to determine the number of needed controllers and their placement in order to tackle performance issues.

Control Partitioning: Several proposals [85]-[89] suggest an architectural solution that employs multiple controllers deployed according to a specific configuration. HyperFlow [85] is a distributed event-based control plane for OpenFlow. It keeps network control logically centralized but uses multiple physically distributed NOX controllers. These controllers share the same consistent network-wide view and run as if they were controlling the whole network. For the sake of performance, HyperFlow instructs each controller to locally serve a subset of the data plane requests by redirecting OpenFlow messages to the intended target. HyperFlow is implemented as an application over NOX [32] and is in charge of proactively and transparently pushing the network state to the other controllers using a publish/subscribe messaging system. Based on this system, HyperFlow is resilient to network partitions and component failures and allows interconnecting independently managed OpenFlow networks while minimizing cross-region control traffic.

Onix [86] is another distributed control platform, where

one or multiple instances may run on one or more clustered servers in the network. Onix controllers operate on a global view of the network state, where the state of each controller is stored in a Network Information Base (NIB) data structure. To replicate the NIB over instances, two choices of data stores are offered, with different degrees of durability and consistency: a replicated transactional database for state applications that favor durability and strong consistency, and a memory-based


TABLE V: Survey on Initiatives Addressing Correctness Issues

- Al-Shaer and Al-Haj [82] (FlowChecker). Layer: Infrastructure. Goal: flow tables in OpenFlow switch networks against properties expressed in temporal logic. Approach: symbolic model-checking over BDDs. Type of verification: run-time verification.
- McGeer [83]. Layer: Infrastructure. Goal: flow tables in OpenFlow switch networks against network properties. Approach: formal verification, solving satisfiability problems over networks of logic functions. Type of verification: offline verification.
- Skowyra et al. [84]. Layer: Infrastructure. Goal: OpenFlow learning switch networks between mobile end-hosts against network correctness, network convergence, and mobility-related properties. Approach: formal verification of the design using the Verificare tool against a library of requirements, using model-checkers. Type of verification: offline design verification.
- Khurshid et al. [131] (VeriFlow). Layer: Control/Infrastructure. Goal: checks for network-wide invariant violations using existing flow table rules as each forwarding rule is inserted by the controller. Approach: algorithm searching for overlapping rules using equivalence classes of packets. Type of verification: run-time verification.
- Kazemian et al. [132] (NetPlumber). Layer: Control/Infrastructure. Goal: checks for network-wide policies and invariants on every event using existing flow table rules. Approach: algorithm based on header space analysis of forwarding rules. Type of verification: run-time verification.
- Porras et al. [58] (FortNOX). Layer: Application/Control. Goal: conflict detection and resolution between OpenFlow rules newly inserted by applications and existing OpenFlow rules. Approach: algorithm based on alias set rule reduction for detecting rule contradictions. Type of verification: run-time verification.
- Wang et al. [133]. Layer: Application/Control. Goal: conflict detection and resolution between flow table rulesets and SDN firewall application rules. Approach: algorithm based on header space analysis for detecting and solving conflicts. Type of verification: run-time verification.
- Son et al. [134] (FLOVER). Layer: Application/Control. Goal: compliance of updated flow table rulesets with respect to non-bypass security properties. Approach: formal verification using an SMT solver. Type of verification: run-time verification.
- Reitblatt et al. [135], [136] (Kinetic). Layer: Application/Control/Infrastructure. Goal: invariance of trace properties over updated policies. Approach: a mechanism for consistent updates and formal verification using a model-checker. Type of verification: offline verification.
- McGeer [137]. Layer: Application/Control/Infrastructure. Goal: consistent updates of the programmed policies. Approach: a protocol for OpenFlow rule updates that ensures per-packet rule-set consistency. Type of verification: not specified.
- Canini et al. [138] (NICE). Layer: Application/Control/Infrastructure. Goal: testing the behavior of the controller and its applications integrated with an abstract model of the switches. Approach: state-space search using symbolic execution and custom explicit-state model-checking. Type of verification: offline testing.
- Kuzniar et al. [139] (OFTEN). Layer: Application/Control/Infrastructure. Goal: testing the behavior of the controller with its applications integrated with real OpenFlow switches. Approach: based on NICE [138], synchronized with the state of real OpenFlow switches. Type of verification: offline testing.

one-hop Distributed Hash Table (DHT) for volatile state that is more tolerant to inconsistencies. Onix supports at least two control scenarios. The first is horizontally distributed Onix instances, where each one manages a partition of the workload. The second is a federated and hierarchical structuring of Onix clusters, where the network managed by a cluster of Onix nodes is aggregated so that it appears as a single node in a separate cluster's NIB. In this setting, a global Onix instance performs domain-wide traffic engineering. Control applications on Onix handle four types of network failures: forwarding element failures, link failures, Onix instance failures, and failures in connectivity between network elements and Onix instances as well as between Onix instances themselves.

The work in [87] proposes a deployment of multiple horizontally distributed controllers in a cluster of servers, each installed on a distinct server. At any point in time, a single master controller, the one with the smallest system load, is

elected and is periodically monitored by all of the other controllers for possible failure. In case of failure, master re-election is performed. The master controller dynamically maps switches to controllers using IP aliasing. This allows dynamic addition and removal of controllers to the cluster and switch migration between controllers, and thus deals with the failure of a controller or of a switch-to-controller link.
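The election rule just described, pick the live controller with the smallest load, can be sketched in a few lines. The heartbeat and monitoring machinery is elided, and the data layout is an assumption for illustration.

```python
# Toy sketch of least-loaded master election: every controller reports its
# load; the least-loaded live controller becomes master, and the others
# re-elect among the survivors when the master fails.

def elect_master(controllers):
    """controllers: dict name -> (load, alive). Least-loaded live node wins."""
    live = {n: load for n, (load, alive) in controllers.items() if alive}
    return min(live, key=live.get)

cluster = {"c1": (0.7, True), "c2": (0.2, True), "c3": (0.5, True)}
assert elect_master(cluster) == "c2"

# c2 fails; the periodic monitoring by the other controllers triggers
# re-election among the survivors.
cluster["c2"] = (0.2, False)
assert elect_master(cluster) == "c3"
```
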

These frameworks reduce the limitations of a centralized controller deployment, but they overlook an important aspect: not all applications require a network-wide state. In SDN, it is up to the controller to maintain a consistent global state of the network. Observation granularity refers to the set of information enclosed in the network view. Controllers generally provide some visibility of the network under control, including the switch-level topology: links, switches, hosts, and middleboxes. This view is


TABLE VI: SDN Control Frameworks

- HyperFlow [85]. Partitioning: horizontal with state replication. Cross-controller coordination: publish/subscribe messaging. Observation granularity: full. Implementation: application within NOX [32]. Failure recovery: network partitions and component failure.
- Onix [86]. Partitioning: horizontal with state replication, or vertical. Cross-controller coordination: replicated NIB in a replicated transactional database and a DHT. Observation granularity: full. Implementation: multiple Onix controllers (cluster) running on a set of physical servers. Failure recovery: switch, link, Onix instance, switch-to-Onix link, and Onix-to-Onix link.
- Yazıcı et al. [87]. Partitioning: horizontal with state replication. Cross-controller coordination: JGroups notifications and messaging. Observation granularity: full. Implementation: multiple controllers (a cluster with a master controller), each on a distinct server. Failure recovery: controller and switch-to-controller link.
- Kandoo [88]. Partitioning: vertical (two levels), with a centralized controller (possibly horizontal with state replication) at the top layer and a horizontal deployment with no state replication at the bottom layer. Cross-controller coordination: depending on the top layer. Observation granularity: partial (bottom layer), full (top layer). Implementation: the bottom-layer controller can be implemented on a commodity middlebox, as a switch proxy, or directly on OpenFlow switches. Failure recovery: none.
- SiBF [89]. Partitioning: horizontal with no state replication. Cross-controller coordination: KeySpace, a distributed NoSQL database integrated with each RM. Observation granularity: not specified. Implementation: each RM is a standalone application running in one of the rack servers. Failure recovery: server and network failures.
- CPRecovery [141]. Partitioning: centralized. Cross-controller coordination: primary-to-secondary controller communication (for state replication). Observation granularity: full. Implementation: application within NOX [32]. Failure recovery: controller failure.
- FlowVisor [13], [30], [31]. Partitioning: horizontal with no state replication. Cross-controller coordination: none. Observation granularity: partial (a network-slice view for each controller). Implementation: a software slicing layer between the forwarding and control planes, housed outside the switch. Failure recovery: none.

referred to as a partial network-wide view, whereas a view that also accounts for the network traffic is considered a full network-wide view. Note that a partial network-wide view changes at a low pace and consequently can be scalably maintained. Based on this observation, Kandoo [88] proposes a two-layered hierarchy; a bottom control