service provider devops for large scale modern network services

7
Service Provider DevOps for Large Scale Modern Network Services Juhoon Kim 1 , Catalin Meirosu 2 , Ioanna Papafili 3 , Rebecca Steinert 4 , Sachin Sharma 5 , Fritz-Joachim Westphal 1 , Mario Kind 1 , Apoorv Shukla 6 , Felici´ an N´ emeth 7 , Antonio Manzalini 8 1 Deutsche Telekom AG, Germany; 2 Ericsson AB, Sweden; 3 OTE, Greece; 4 SICS, Sweden; 5 iMinds, Belgium; 6 TU-Berlin, Germany, 7 BME, Hungary; 8 TI, Italy Abstract—Network service providers are facing challenges for deploying new services mainly due to the growing complexity of software architecture and development process. Moreover, the recent architectural innovation of network systems such as Network Function Virtualization (NFV), Software-defined Net- working (SDN), and Cloud computing increases the development and operation complexity yet again. One of the emerging solutions to this problem is a novel software development concept, namely DevOps, that is widely employed by major Internet software companies. Although the goals of DevOps in data centers are well-suited for the demands of agile service creation, additional requirements specific to the virtualized and software-defined network environment are important to be addressed from the perspective of modern network carriers. In this paper, we thoroughly debate DevOps requirements for developing a modern service creation platform by taking EU FP7 project UNIFY as a reference architecture and suggest the corresponding extensions of UNIFY interfaces that meet the discovered requirements. I. I NTRODUCTION Network service providers are victims of their own success in the telecommunication business. System architectures and development processes become more and more complex due to the evolution of network technologies and innovation based on the rich Internet infrastructure. Moreover, recent research efforts made in academia and industry show a strong trend towards utilizing virtualized and software-defined network environments that adds complexity to the development and op- eration of large scale network services. This complexity keeps network service providers tardy in creation and deployment of new services and eventually hinders the further evolution of network technologies. Major Internet software companies have been experiencing similar issues and designed tools and methods that aim at improving the efficiency of the software development and at narrowing the distance between the development and oper- ations. The concept of unifying such modern development methods and tools is commonly referred to as DevOps. The communication and collaboration between developers as well as the integration of an individual piece of work are especially emphasized tasks of DevOps in the software development. Several projects in data centers put this novel development concept into practice by tackling multiple challenges mainly brought up by the geographical distribution of network nodes. Examples of such challenges are: A high cost of operation in terms of time and hu- man/financial resources due to the physical manage- ment of distributed nodes Limited visibility of network and service states that makes it difficult to assure the Quality of Experience (QoE) Difficulties in pinpointing the cause and location of problems (troubleshooting) and debugging Difficulties in deploying services quickly and fre- quently, e.g., due to the validation of service integ- ration or regression test This movement in data centers inspires service providers to adopt automated development and operation processes within the scope of virtualized and software-defined network infra- structure. The virtualized network infrastructure differs from the traditional definition of network resources in the data center in the perspective of its independence from the fixed location and/or hardware; hence provides great opportunities to network carriers to lower operation costs. However, existing manage- ment techniques and tools are developed to solve particular and well-defined problems of static network environments and therefore would need to be adapted to the requirements of virtualized network environments. In this paper, we discuss DevOps requirements within dynamic network environments by taking EU FP7 project UNIFY [1] as a reference architecture built on the highly virtualized network infrastructure. Based on this discussion, we extend the UNIFY APIs in order to support developers in creating DevOps tools that fulfil the requirements of modern network service providers. We call this extended DevOps concept Service-provider DevOps or SP-DevOps in this paper. Further, we show how SP-DevOps can be used for the actual service creation in the UNIFY framework by applying this concept to one of use cases developed in [2]. The rest of this paper is structured as follows. In Section II, we discuss DevOps requirements in dynamic network envir- onments such as virtualized and software-defined networks. Section III gives an overview of UNIFY architecture that is used as a reference architecture throughout this paper. In Sec- tion IV, four main processes of SP-DevOps are described. After that, the extension of UNIFY APIs and the use case of SP-DevOps are illustrated in Section V and Section VI, respectively. Finally, Section VII presents the related studies and Section VIII summarizes this paper. 978-3-901882-76-0 @2015 IFIP 1391

Upload: duongdung

Post on 13-Feb-2017

220 views

Category:

Documents


2 download

TRANSCRIPT

Page 1: Service Provider DevOps for Large Scale Modern Network Services

Service Provider DevOps for Large ScaleModern Network Services

Juhoon Kim1, Catalin Meirosu2, Ioanna Papafili3, Rebecca Steinert4, Sachin Sharma5,Fritz-Joachim Westphal1, Mario Kind1, Apoorv Shukla6, Felician Nemeth7, Antonio Manzalini8

1Deutsche Telekom AG, Germany; 2Ericsson AB, Sweden; 3OTE, Greece; 4SICS, Sweden;5iMinds, Belgium; 6TU-Berlin, Germany, 7BME, Hungary; 8TI, Italy

Abstract—Network service providers are facing challenges fordeploying new services mainly due to the growing complexityof software architecture and development process. Moreover,the recent architectural innovation of network systems such asNetwork Function Virtualization (NFV), Software-defined Net-working (SDN), and Cloud computing increases the developmentand operation complexity yet again. One of the emerging solutionsto this problem is a novel software development concept, namelyDevOps, that is widely employed by major Internet softwarecompanies. Although the goals of DevOps in data centers arewell-suited for the demands of agile service creation, additionalrequirements specific to the virtualized and software-definednetwork environment are important to be addressed from theperspective of modern network carriers.

In this paper, we thoroughly debate DevOps requirementsfor developing a modern service creation platform by takingEU FP7 project UNIFY as a reference architecture and suggestthe corresponding extensions of UNIFY interfaces that meet thediscovered requirements.

I. INTRODUCTION

Network service providers are victims of their own successin the telecommunication business. System architectures anddevelopment processes become more and more complex dueto the evolution of network technologies and innovation basedon the rich Internet infrastructure. Moreover, recent researchefforts made in academia and industry show a strong trendtowards utilizing virtualized and software-defined networkenvironments that adds complexity to the development and op-eration of large scale network services. This complexity keepsnetwork service providers tardy in creation and deployment ofnew services and eventually hinders the further evolution ofnetwork technologies.

Major Internet software companies have been experiencingsimilar issues and designed tools and methods that aim atimproving the efficiency of the software development and atnarrowing the distance between the development and oper-ations. The concept of unifying such modern developmentmethods and tools is commonly referred to as DevOps. Thecommunication and collaboration between developers as wellas the integration of an individual piece of work are especiallyemphasized tasks of DevOps in the software development.

Several projects in data centers put this novel developmentconcept into practice by tackling multiple challenges mainlybrought up by the geographical distribution of network nodes.Examples of such challenges are:

• A high cost of operation in terms of time and hu-man/financial resources due to the physical manage-ment of distributed nodes

• Limited visibility of network and service states thatmakes it difficult to assure the Quality of Experience(QoE)

• Difficulties in pinpointing the cause and location ofproblems (troubleshooting) and debugging

• Difficulties in deploying services quickly and fre-quently, e.g., due to the validation of service integ-ration or regression test

This movement in data centers inspires service providers toadopt automated development and operation processes withinthe scope of virtualized and software-defined network infra-structure. The virtualized network infrastructure differs fromthe traditional definition of network resources in the data centerin the perspective of its independence from the fixed locationand/or hardware; hence provides great opportunities to networkcarriers to lower operation costs. However, existing manage-ment techniques and tools are developed to solve particularand well-defined problems of static network environments andtherefore would need to be adapted to the requirements ofvirtualized network environments.

In this paper, we discuss DevOps requirements withindynamic network environments by taking EU FP7 projectUNIFY [1] as a reference architecture built on the highlyvirtualized network infrastructure. Based on this discussion,we extend the UNIFY APIs in order to support developers increating DevOps tools that fulfil the requirements of modernnetwork service providers. We call this extended DevOpsconcept Service-provider DevOps or SP-DevOps in this paper.Further, we show how SP-DevOps can be used for the actualservice creation in the UNIFY framework by applying thisconcept to one of use cases developed in [2].

The rest of this paper is structured as follows. In Section II,we discuss DevOps requirements in dynamic network envir-onments such as virtualized and software-defined networks.Section III gives an overview of UNIFY architecture that isused as a reference architecture throughout this paper. In Sec-tion IV, four main processes of SP-DevOps are described.After that, the extension of UNIFY APIs and the use caseof SP-DevOps are illustrated in Section V and Section VI,respectively. Finally, Section VII presents the related studiesand Section VIII summarizes this paper.

978-3-901882-76-0 @2015 IFIP 1391

Page 2: Service Provider DevOps for Large Scale Modern Network Services

II. DEVOPS REQUIREMENTS FOR SERVICE PROVIDERS

The UNIFY project [3], as well as the IETF [4] and theTMForum [5] have recently initiated activities that investigatethe challenges related to adopting DevOps practices in carriernetworks. In this section, we summarize the current status ofthe discussion that we are driving in IETF on this topic, andmake some additional considerations.

A. Observability

Telecom operator infrastructure is characterized by a widedistribution over geographical areas, large numbers of nodesand functions involved in providing services and stringentcontractual demands on high availability.

In software-defined infrastructures, we define observabilityas the property that provides visibility on the status of bothphysical and virtual components of the infrastructure at thetime scale relevant for a particular task. One important re-quirement for observability is scalability of the componentsinvolved in the various processes, including message bussesand databases. The scalability is also affected by trade-offsbetween the communication overhead and the reliability of aparticular estimate and the strategic placement of the monitor-ing functions.

The SNMP agent paradigm commonly employed in man-agement is clearly limiting the capability to provide scalableobservability, and the OpenFlow extensions for monitoringunfortunately follow the same unscalable communication pat-tern. We believe that distributed information exchange betweenall types of virtual functions (with monitoring functions asa major use case) is key to providing scalable observabilityin a carrier environment. From a DevOps perspective, suchdistributed information exchange is an important enabler forpushing automation beyond simple sets of scripts that replicatetested and trialed situations.

Programmability is another key requirement for monitoringfunctions. It is clear that given the amount of information thatcan be collected from high-speed communication links, it is notfeasible for a for-profit enterprise to monitor everything all thetime. Exposing monitoring functions through programmabilityinterfaces that enable to select what and how to monitorthrough expressive, declarative languages is important forapplying DevOps principles from the data center to carrier-grade software-defined infrastructures.

B. Stability

We define stability as a property of software-defined infra-structures that deliver highly-available telecommunication ser-vices. For developers following that agile paradigm, constantchanges are a way of life, while for the operators instability isthe result of external, usually unforeseen, events that need tobe handled. The difference in mentalities between these twotraditionally separate organizations is thus significant.

The infrastructure itself may be in a state of continuouschanges and thus inherently unstable, but the services thatexecute on top of the infrastructure are expected to be alwaysavailable. The instability in the infrastructure may be due tomanual processes (such as routine management and mainten-ance tasks) or, in particular in software-defined infrastructure,

to the actions of a myriad of controllers and orchestrators thatoptimize resource usage and functional configurations. How-ever, in order to maximize the stability at the service level, anyinstability (whether observed, predicted or scheduled) in theinfrastructure needs to be communicated to all the interestedparties and programmable interfaces that allow actuation haveto be made available.

However, cascading actions originated by different pro-cesses that have different objectives and views of the overalloperational situation need to be mitigated. Failing that, theymay lead to the emergence of non-linear behaviors where localdynamics may lead to radically different global effects whencombined.

C. Troubleshooting

Identifying the cause of a failure in an infrastructure thatis continuously undergoing changes is a major issue thatneeds to be addressed for achieving reliable operations insoftware-defined infrastructures. From a DevOps perspective,troubleshooting usually involves a series of tasks that need tobe performed by either a developer or an operator accordingto a pre-defined incident handling description. However, suchdescriptions are usually built for physical infrastructure andnetwork functions that are implemented in boxes that havefixed locations. Everything changes in a software-defined in-frastructure, where a multitude of actors may trigger changes atany time and not all changes might be available to be logged ina central point for investigation. Methods that allow automaticdefinitions of such workflows and allow automatically adaptingthe workflows to the changes in the infrastructure and virtualnetwork functions are a significant requirement for next-generation infrastructure management. An increased degree ofautomation is in line with DevOps principles and enables moreefficient use of resources.

In particular in SDN, a large amount of information regard-ing the actual state of the infrastructure is stored by controllersand orchestrators, while virtual function managers and orches-trators hold that state information at the functional levels.Exposing information that helps building automated workflowsbecomes key to addressing this requirement. Based on thisinformation, tools could be built that assist both developers andoperators by generating tailor-made troubleshooting workflowsfor particular services composed of virtual network functions.

D. Machine-to-Machine Interaction

The creation of new services is expected to be morestrategic in the future as communication services becomeprimary commodity in daily life. One of the scenarios beinginvestigated in the telecom business is based on the assumptionthat, in the future, telecom infrastructure provides new types ofservices (e.g., Cognition-as-a-Services) to a growing numberof non-human users such as smart devices loaded with thesophisticated cognitive software. This assumption is relevantto the subject of Softwarization or Network Function Virtual-ization (NFV). In this perspective, one important requirementfor DevOps is to enable both human users and autonomousmachine to verify/validate the integration of the service andto troubleshoot/debug causes of a failure via common APIs.When the autonomous machine accesses DevOps APIs, all

IFIP/IEEE IM 2015 Workshop: 10th International Workshop on Business-driven IT Management (BDIM)1392

Page 3: Service Provider DevOps for Large Scale Modern Network Services

Figure 1: Three-layered UNIFY architecture

the necessary steps need to be triggered without a humanintermediation. This eventually leads telecom providers to thereduction of Opex caused by human operations.

III. UNIFY

Although the afore-discussed properties are common De-vOps requirements underlying the most of modern telecomservice creations, viable action towards fulfilling them differsfrom one architecture to another. Therefore, we take EU FP7project UNIFY as a reference architecture to show how suchrequirements can be effectively dealt in a large scale servicecreation.

A. Architecture Overview

To remedy today’s limits in deploying and managingtelecommunication services, the UNIFY project exploits thebenefit of SDN-enabled infrastructure that pursues full networkand service virtualization so as to enable rich and flexibleservices and operational efficiency.

UNIFY introduces a three-layer architecture (as shown inFigure 1) which comprises the service layer, orchestrationlayer, and infrastructure layer. Practically, each layer is em-ployed to group functional components with similar abstrac-tions. First, the service layer comprises traditional and modern(e.g., virtualization, SDN, and/or cloud) management/businessfunctions and is responsible for translating user’s abstractservice request (Service Graph) into the detailed service de-scription (Network Function Forwarding Graph). Second, theorchestration layer maintains the global view of resources andcapabilities, what is more, it performs policy enforcementand resources orchestration between the upper layer functionsand the underlying resources. Finally, the infrastructure layerincludes resources and local resource agents (i.e., compute,storage and network) and their corresponding local agents (e.g.,OpenFlow switch agent).

The UNIFY architecture exhibits specific characteristicswhich are important to SP-DevOps such as i)interalia model-based decomposition so as to build services out of ele-mentary blocks, ii)recursive orchestration due to multi-level

Figure 2: SP-DevOps cycle in UNIFY service creation frame-work

virtualization of storage, compute and network resources,and iii)capability for multiple administrative domains allowingthe possibility to separate roles and responsibilities amongmultiple business actors, e.g., network service providers, OTTproviders, or end-users.

B. Dev & Ops in UNIFY

In UNIFY, two groups of developers are defined dependingon their roles. The first group is service developers that designand create a service by determining its resource requirementsand interconnecting desired network functions. This type of de-velopment is carried out based on a simple service descriptionlanguage. The second group is actual software developers thatimplement Virtualized Network Functions (VNFs). A VNF isa logical network component that runs within an independentcontainer (e.g., virtual machine) and that is responsible forspecific network tasks (e.g., firewall, NAT, or payload inspec-tion). In a NFV architecture, a service is created by combining(or chaining) multiple VNFs. In UNIFY, those who designand implement VNFs are referred to as VNF developers andthey correspond with developers in the software developmentdomain. Since the architecture restricts the service devel-opment to pre-verified sets of physical/virtual infrastructureresources, the greater part of the development complexity liesin implementing VNFs. Thus, unless differently stated, theterm developers addresses VNF developers in the remainingof this paper.

The major role of operators in UNIFY is the performancemanagement of a service. More specifically, operators ensurethat performance indicators of the service meet the require-ments specified in the service graph. For this, operators needa fair degree of monitoring capabilities on virtual/physicalelements of the infrastructure.

IV. SP-DEVOPS PROCESSES OF UNIFY

In this section, we show how the DevOps requirementsfor service providers discovered in Section II can be ad-dressed within the scope of the service creation in the UNIFYframework. Figure 2 briefly illustrates mapping between thefour large processes (VNF developer support, verification, ob-servability, troubleshooting) of Service Provider DevOps (SP-DevOps) and the lifecycle of the service creation in UNIFYwith regard to roles of operators and developers.

IFIP/IEEE IM 2015 Workshop: 10th International Workshop on Business-driven IT Management (BDIM) 1393

Page 4: Service Provider DevOps for Large Scale Modern Network Services

A. VNF development support

A Virtualized Network Function (VNF) is a basic func-tional unit in a Network Function Virtualization (NFV) archi-tecture. In order to facilitate VNF development to be performeddirectly in the production system, there is need for a set ofsupporting functions provided by the architecture towards theVNF developer on top of the service layer. Once a VNFunder development is deployed within the production system,VNF developers are supported with the observability, verifica-tion and troubleshooting capabilities described afterwards. ForVNF development support functionality, three sub-processesare considered:

• Adding a new VNF to the production environment:This sub-process allows developers to add a new orupdated VNF in the production environment for test-ing and debugging purposes. The service layer storesa description of the VNF capabilities and resourcerequirements given by the developer, and informs theorchestration layer about the existence of the VNF.

• Modifying an already deployed service graph witha new or updated VNF: A developer-request todeploy a VNF is received at the service layer andforwarded to the orchestration layer, which assessesthe resource allocation for current and new VNFs byquerying the service catalogue. The controller layerallocates the resources needed for the new instanceand configures the policies associated with it.

• Attach VNF to software development tools: Thedeveloper queries the service layer by providing aservice graph identifier and the type of VNF for thepurpose of debugging. The request is forwarded tothe orchestration layer, which provides an identifierfor the VNF instance and associated resources. Giventhe identifier, the developer can connect to the VNFinstance and run tools for distributed software debug-ging.

B. Verification process

In a large scale service architecture such as UNIFY, theverification process is not limited to the code validation. Theemphasized goal of the verification in UNIFY is rather theassurance that the service configuration is in concord withthe service definition, e.g., quality indicators and performanceindicators.

Verification is considered as a set of features providinggatekeeper functions to validate both the abstract servicemodels and the proposed resource configuration before actualinstantiation on the infrastructure layer takes place.

In UNIFY, the verification process is performed acrossmultiple layers in the architecture:

• The service layer verifies topological inconsistenciesand loops of the service graph (SG) and networkfunction forwarding graph (NF-FG).

• The orchestration layer validates mapping betweenthe requirement of NFs and the allocation of the re-source and capability. This layer also verifies whether

or not the placement of a new NF violates the policy(and the performance) of the already deployed NF-FG.

• The infrastructure layer (controllers) checks con-sistency of specific configuration instances such asinconsistent network configuration in the form ofOpenFlow rules.

C. Observability process

The observability process provides visibility onto the op-erational performance of service graphs deployed in the uni-fied production environment. This process is carried out byselecting points and targets of measurements and choosingcorresponding tools based on the key performance indicators(KPIs) and key quality indicators (KQIs) specified in theservice graph. As a result of such measurements, the observab-ility process generates and notifies diverse statistics of traffic,e.g., latency, throughput, and packet/byte counts, and statusreports of infrastructure components, e.g., resource usage andavailability. The definition of the measurement is deliveredfrom developers down to the infrastructure layer in a chainand the notification traverses in reverse order.

D. Troubleshooting process

A troubleshooting process is requested either bya developer manually or by some components onservice/orchestration layers automatically. The requestedtroubleshooting process aims to follow up on reportedbugs, faults, and anomalous states. Such a process mayinclude automated deployment of relevant verification andobservability tools when the faulty or anomalous conditioncannot be immediately localized from existing observationsand reports. Detected and localized faults and performancedegradations in the infrastructure layer are asynchronouslyreported to the virtualized infrastructure management layer(and forwarded if necessary to higher layers) where furtherinvestigation may take place.

V. EXTENSION OF UNIFY APIS

In order to provide SP-DevOps capabilities to the creationof services, it is crucial to ensure an appropriate level of pro-grammability in the monitoring infrastructure. Taking UNIFYas an example, a set of APIs specific to DevOps are required tobe implemented in order to support developers in implementingSP-DevOps applications (see Section VI) that are an essentialpart of the service creation and maintenance. Due to theabstracted and layered nature of the UNIFY architecture andto the wide scope of the DevOps operation in the architecture,the APIs need to be employed between almost all the layers inthe architecture and are sequentially invoked across the layers.

First, APIs that create and remove a monitoring endpoint(RegisterListener and UnregisterListener) within the virtualinfrastructure are necessary. The established monitoring end-point observes the performance of service entities and validatestheir integrity. Second, it is important to provide developerswith the capability to isolate the debugging environment fromthe production environment (i.e., debugging mode and releasemode), thus APIs that change the execution status (SetExecu-tionState and GetExecutionState) of the service are requiredto be implemented. Third, multiple APIs are needed to query

IFIP/IEEE IM 2015 Workshop: 10th International Workshop on Business-driven IT Management (BDIM)1394

Page 5: Service Provider DevOps for Large Scale Modern Network Services

and obtain the performance value in various metrics (e.g.,GetPerformanceValue).

Furthermore, there are several capabilities that are crucialto enable automated troubleshooting and debugging featuresin DevOps applications. The inter-layer communication cap-ability for notifying (SetupNotification and Notify) problems,e.g., detected and predicted failure, resource shortage, and/ormalicious activities, is one of them and the verification capab-ility (Verify) is another. These capabilities are required to beconfigurable via dedicated APIs.

VI. USE CASE: SSL VPN NETWORK SERVICE

In this section, we consider the application of SP-DevOpsby exploring the SSL VPN network service, i.e., one of theuse cases developed in UNIFY project [2]. SSL VPN networkservice is an example of a value-added service that a serviceprovider might offer on the virtualized network infrastructure.

A. Initiation of observability and troubleshooting

The configuration of a VPN service comprises several stepssuch as the configuration of the core network, the definitionof a virtual template interface which enables the dynamicconfiguration of virtual access interface per user upon request,the formation and association of each VPN to a Virtual Routingand Forwarding (VRF) configuration, and the configuration ofuser profiles and Authentication, Authorization and Account-ing (AAA) services at the customer premises.

Regarding maintenance of a VPN service, the service pro-vider is obliged to frequently monitor a multitude of interfacesand protocols such as: i) concerning the core: validation ofsuccessful running of the routing protocol, and ii) concerningthe VPN itself: validation of VRF configurations and routingtables, verification of associations of provider/customer edge(PE/CE) routers. The procedures for provisioning and main-tenance of VPN include the participation of multiple dedicatedappliances, e.g., AAA server, customer premises equipments(CPEs), DHCP server, edge and core routers. Operationalprocedures related to VPN are inefficient due to the complexmanual configuration and monitoring of the different nodesassociated with it. Adding another service on top of VPN,e.g., firewall, considerably increases further this complexity.Such manual activities are highly time-consuming and proneto errors, which in turn imply significant operating costs forthe service provider. Such costs can be further increased bypenalties due to SLA violations with regard to delivery ortroubleshooting times.

Employing the UNIFY architecture, an SSL VPN servicecan be offered over software-defined virtual infrastructure.The devices deployed at customer premises are extremelysimple and limited to packet forwarding, whereas all otherrelevant CPE functionality would be included in the vCPE1

VNF deployed on a PoP at the edge of the IP network in co-location with a vPE1 VNF that takes over the physical PE. AvSSL1 VNF would implement acceleration of SSL encryptionand decryption that are too intensive to be performed at highspeed on the computational resources deployed in the PoPs.

1‘v’ is the abbreviation for ‘virtualized’

Figure 3: Simplified graphical representation of SSL VPN.

Figure 4: Use case over a watch-point inserted for observingdata plane and control messages.

In the aforementioned setup, an observability VNF couldbe instantiated in co-location with vCPE and vPE, so as tomonitor various KPIs related to the physical node (i.e., PE)as well as other VNFs associated to it. The infrastructuremanagement enables an operator to introduce vMon1, i.e.,a virtualized monitoring VNF, that provides an enhancedgranularity view for users. A customer representative tryingto pro-actively troubleshoot connectivity problems employsvTrb1, i.e., a virtualized troubleshooting VNF, that relies ontrusted agents deployed in the virtual infrastructure to generatepackets that test connectivity conditions at different layers onthe path between the two vPE VNF instances. Finally, the vTrbVNF is integrated with the monitored infrastructure, such thatin case one of the vPE VNF instances is moved automaticallyas a result of a scale-out operation ordered by the infrastructuremanagement, the vTrb instance will follow the associated vPEinstance and inform the customer representative of the event.Figure 3 illustrates how an SSL VPN service is extended withobservability and troubleshooting capabilities.

B. An example of VNF developer support

An example of VNF development support is a stand-alone debugging tool providing network watch-points [3]that give visibility onto the OpenFlow control channel aswell as actions that facilitate troubleshooting and debuggingactivities. The tool helps to define network watch-points forcertain data path or control traffic events (policy violation, or

IFIP/IEEE IM 2015 Workshop: 10th International Workshop on Business-driven IT Management (BDIM) 1395

Page 6: Service Provider DevOps for Large Scale Modern Network Services

any OpenFlow matching rule), and is capable of triggeringdifferent troubleshooting actions. The network watch-pointsprovide an opportunity for service developers or operators todefine OpenFlow-like filters for selecting relevant packets andautomatically perform pre-defined actions collectively or on aper packet basis. Due to its monitoring functionality, matchingpackets could be filtered out and suspicious traffic could bedetected in an operational network.

For these purposes, network watch-points consist of threedifferent parts: standard OpenFlow match fields; the switchdefined by a unique data path ID (DPDI); and, troubleshootingactions (such as no action, drop, log, notify, etc). Accordingto the OpenFlow matching fields, the DPID may contain awildcarded value allowing a single entry to match multipleDPIDs. Moreover, since the match fields are standardized,the data plane operation mode can delegate the execution ofthe matching algorithm to switching hardware supporting theOpenFlow protocol.

Figure 4 illustrates an example where the network watchingtool passively observes the control traffic for a service graph,for the purposes of observing and debugging the Control App.The tool performs a matching process on every OpenFlowmessage in order to find the longest matching watch-pointentry in the watch-point repository (similar to a firewall),executing specified actions upon match. If there is no matchingentry, then the tool forwards the captured OpenFlow messageautomatically to the controller. Additionally, watch-points canhere be installed in relevant switches as flow rules with thehighest priority to capture traffic that normally would not trig-ger a control message (such as PACKET IN), for the purposeof troubleshooting. The tool operates transparently to flowtable changes, by recording and analyzing flow modificationmessages between the data plane component and the controller,such that the mapping to the correct set of actions to beexecuted is maintained.

VII. RELATED WORK

In current telecom industries, the TMForum enhancedTelecom Operations Map (eTOM) and the IT InformationLibrary (ITIL) [6] are proposed to increase the ties betweenDev and Ops. eTOM defines a set of business process modelsfor the Ops part of DevOps. ITIL provides a collection of bestpractices and guidelines for companies and practitioners onthe subject of managing IT services throughout their lifecycle.In the DevOps point of view, both the approaches (eTOMand ITIL) do not really increase the ties between Dev andOps. Because eTOM provides a set of guidelines for onlyOps part of DevOps and ITIL is difficult to implement inthe telecom industries, as this involves a large number ofactors and players. On the other hand, DevOps guidelines areexpected to be for both activities (Dev & Ops) of telecominfrastructure and are expected to decrease the number ofactors and roles, and to provide the exact procedures that arenecessary to implement ITIL in a practical and realistic way.

The SDN solutions to support the DevOps concept are forthe tools: (1) FlowSense [7] and OpenSketch [8], which estim-ate performance metrics, (2) OF-Rewind [9] and automatic testpacket generation tool [10] which localize and find root-causeanalysis of detected faults, and performance degradations and

(3) NICE [11], VeriFlow [12], and NetPlumber [13], whichcompare expected and detected system states. The advantageof FlowSense is that it follows a passive monitoring approach,which is a scalable way of estimating the behavior in thenetwork, but the limitations are related to timing - all estimatesof the link utilization requires data based on completed flowsessions, which may not be suitable in an environment wheretiming is essential for dynamic service-chains. OpenSketch,however, offers a high degree of automation, but the mainten-ance of ‘sketches’, i.e., data structures for storing informationabout packet states, takes resources.

The advantages of OFRewind is that it is capable ofrecording both control and data traffic traces of an OpenFlownetwork, and is capable of replaying it in a custom OpenFlownetwork to reproduce the bugs. The challenges in replayinginclude timing accuracy, multi-instance synchronization, andonline replay of multiple network elements. The automatic testpacket generation tool, however, finds issues in both Dev andOps by sending real test packets in a network. However, theproblem is that it requires gathering all data at a centralizedlocation or massive generation of test traffic, which putsa high load on the infrastructure. The other tools such asNICE, VeriFlow, NetPlumber, enable a (formal) verificationof specific SDN configurations (e.g., availability of a pathto the destination, absence of routing loops, access controlpolicies, or isolation between virtual networks). However,these tools operate on the (centralized) programmability of thecontrol plane only. This might be a limitation if we considera possible network deploying active network functions, i.e.,an environment that also enables programmability of the dataplane in a distributed fashion.

VIII. SUMMARY

In this paper, we discussed the DevOps requirementsspecific to the modern network service creations that followa trend towards employing virtualization and software-definednetwork technologies. Furthermore, we proposed an extendedDevOps concept (SP-DevOps) that addresses the discoveredrequirements and demonstrated how it can be used in a largescale service creation by taking EU FP7 project UNIFY as areference architecture.

In order to further explore the concept of SP-DevOps, wedescribed the four main processes, i.e., VNF developer support,verification process, observability process, and troubleshootingprocess, that together form the SP-DevOps lifecycle in aUNIFY service creation. In this paper, it is explained howeach of these processes effectively tackles the correspondingDevOps requirement for a large scale service creation onvirtualized and software-defined network environments.

ACKNOWLEDGEMENT

This work is supported by FP7 UNIFY, a research pro-ject partially funded by the European Community under theSeventh Framework Program (grant agreement no. 619609).The views expressed here are those of the authors only. TheEuropean Commission is not liable for any use that may bemade of the information in this document.

IFIP/IEEE IM 2015 Workshop: 10th International Workshop on Business-driven IT Management (BDIM)1396

Page 7: Service Provider DevOps for Large Scale Modern Network Services

REFERENCES

[1] “Unify: Unifying cloud and carrier networks,” 2014. [Online].Available: https://www.fp7-unify.eu/

[2] “Use cases and initial architecture,” 2014. [Online]. Available:https://www.fp7-unify.eu/

[3] “Initial requirements for the sp-devops concept, universal nodecapabilities and proposed tools,” 2014. [Online]. Available: https://www.fp7-unify.eu/

[4] C. Meirosu, A. Manzalini, J. Kim, R. Steinert, S. Sharma, and G. Mar-chetto, “Devops for software-defined telecom infrastructures,” 2014.

[5] “Zero-touch orchestration, operations and management project,” 2014.[Online]. Available: http://www.tmforum.org/zoom/16335/home.html

[6] “Gb921-w working together - itil and etom, v 11.3,” 2011. [Online].Available: http://www.tmforum.org/DownloadRelease135/15584/home.html

[7] C. Yu, C. Lumezanu, Y. Zhang, V. Singh, G. Jiang, and H. V.Madhyastha, “Flowsense: Monitoring Network Utilization with ZeroMeasurement Cost,” in ACM PAM, 2013.

[8] M. Yu, L. Jose, and R. Miao, “Software Defined Traffic Measurementwith OpenSketch,” in USENIX NSDI, 2013.

[9] A. Wundsam, D. Levin, S. Seetharaman, A. Feldmann et al.,“OFRewind: Enabling Record and Replay Troubleshooting for Net-works.” in USENIX ATC, 2011.

[10] H. Zeng, P. Kazemian, G. Varghese, and N. McKeown, “Automatic TestPacket Generation,” in ACM CoNEXT, 2012.

[11] M. Canini, D. Venzano, P. Peresini, D. Kostic, J. Rexford et al., “ANICE Way to Test OpenFlow Applications,” in USENIX NSDI, 2012.

[12] A. Khurshid, W. Zhou, M. Caesar, and P. Godfrey, “Veriflow: Verifyingnetwork-wide invariants in real time,” ACM SIGCOMM CCR, 2012.

[13] P. Kazemian, M. Chan, H. Zeng, G. Varghese, N. McKeown, andS. Whyte, “Real time network policy checking using header spaceanalysis.” in USENIX NSDI, 2013.

IFIP/IEEE IM 2015 Workshop: 10th International Workshop on Business-driven IT Management (BDIM) 1397