Fault Localization (Pinpoint) Project Proposal for OPNFV, September 2015, Version 0.8
Fault Localization – Overview
• The process of deducing the exact source of a failure from a set of observed indications
– A set of algorithms
– A set of APIs
– Focus on cloud NFV networking
– Extendable to compute and storage
• Fault localization is also known as fault isolation, alarm/event correlation, and root cause analysis (RCA)
Figure labels: VNF1, VNF2, VM1, VM2, Hypervisor, vSwitch, NIC, ToR Switch. Failure: network function doesn’t work. Probable causes marked in the figure: iptables not configured; MTU size misconfiguration; NIC failure.
Fault Localization (FL) – Example
• VNF #2 indicates that it is not working (no sessions, no network connectivity, etc.)
• Several causes may result in this: iptables, MTU, and NIC failure problems
• The FL process should find the exact source of the problem!
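The example above can be sketched as a small elimination loop: each probable cause gets a check, and only the causes confirmed by their checks are reported. This is a minimal Python sketch, not the Pinpoint design. The `Cause` class, the `localize` function, the keys of the `observed` dictionary, and the simplified checks (for instance, treating any MTU difference as a misconfiguration) are all hypothetical stand-ins for real probes.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical snapshot of the observed state along VNF2's data path.
State = Dict[str, object]


@dataclass
class Cause:
    name: str
    is_present: Callable[[State], bool]  # True when the check confirms this cause


def localize(causes: List[Cause], state: State) -> List[str]:
    """Return the names of all probable causes confirmed by their checks."""
    return [c.name for c in causes if c.is_present(state)]


# The three probable causes from the slide, each with an assumed, simplified check.
CAUSES = [
    Cause("iptables not configured", lambda s: not s["iptables_rule_present"]),
    Cause("MTU size misconfiguration", lambda s: s["vm_mtu"] != s["vswitch_mtu"]),
    Cause("NIC failure", lambda s: not s["nic_link_up"]),
]

# Made-up observations: only the MTU values disagree.
observed: State = {
    "iptables_rule_present": True,
    "vm_mtu": 1500,
    "vswitch_mtu": 1450,
    "nic_link_up": True,
}

result = localize(CAUSES, observed)
print(result)
```

With these made-up observations only the MTU check fires, so the sketch narrows three probable causes down to one. A real system would replace the lambdas with queries to the information sources and OAM tools described on the next slides.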
Fault Localization APIs
Fault Localization System (set of analysis methods)
Fault/performance information sources
• Events
• Alarms
• Statistics
• Logs
System configuration
• Expected/desired configuration as known by the CMS
System models
• Layering
• Dependencies
• Topology
• Connectivity
• Policy
System OAM tools
• Active tools (like ping, trace, etc.)
Calls toward the sources above: Get info, Set config, Set test, Get test-info
Requests from User/System: Find root cause(s), Find correlated failures
Results returned to User/System: Root cause(s), Correlated failures
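One way to read the "Find root cause(s)" and "Find correlated failures" requests is as queries over a dependency model combined with the set of components that currently raise alarms. The following Python sketch assumes exactly that. The `FaultLocalizer` class, its method names, and the component names in `model` are hypothetical and are not taken from any OPNFV or OpenStack API.

```python
from typing import Dict, List, Set


class FaultLocalizer:
    """Toy analysis method: dependency model plus the set of alarmed components."""

    def __init__(self, depends_on: Dict[str, List[str]]) -> None:
        self.depends_on = depends_on    # system model: component -> direct dependencies
        self.alarmed: Set[str] = set()  # fault information: components with active alarms

    def report_alarm(self, component: str) -> None:
        self.alarmed.add(component)

    def _all_dependencies(self, component: str) -> Set[str]:
        """Collect direct and indirect dependencies with an iterative walk."""
        seen: Set[str] = set()
        stack = list(self.depends_on.get(component, []))
        while stack:
            dep = stack.pop()
            if dep not in seen:
                seen.add(dep)
                stack.extend(self.depends_on.get(dep, []))
        return seen

    def find_root_causes(self) -> List[str]:
        """Alarmed components that do not depend on any other alarmed component."""
        return sorted(
            c for c in self.alarmed
            if not (self._all_dependencies(c) & self.alarmed)
        )

    def find_correlated_failures(self, root: str) -> List[str]:
        """Alarmed components that depend, directly or indirectly, on `root`."""
        return sorted(c for c in self.alarmed if root in self._all_dependencies(c))


# Hypothetical dependency chain for one VNF.
model = {
    "VNF2": ["VM2"],
    "VM2": ["vSwitch2"],
    "vSwitch2": ["NIC2"],
    "NIC2": [],
}

fl = FaultLocalizer(model)
for name in ("VNF2", "VM2", "NIC2"):
    fl.report_alarm(name)

roots = fl.find_root_causes()
correlated = fl.find_correlated_failures("NIC2")
print(roots, correlated)
```

In this sketch the alarms on VNF2 and VM2 are explained by the alarm on NIC2, because both depend on it through the model. Configuration and OAM test results are left out here to keep the example short.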
Fault Localization in OpenStack
Fault Localization System (set of analysis methods)
OpenStack components shown in the figure: Neutron/Nova; Ceilometer/Monasca/External; Neutron/Nova/External; Neutron/Nova; SDN Controller
Fault/performance information sources
• Events/Alarms
• Statistics
• Logs
• Prediction
System configuration
• Expected/desired configuration as known by the CMS
System models
• Layering
• Dependencies
• Topology
• Connectivity
• Policy
System OAM tools
• Active tools (like ping, trace, etc.)
Calls toward the sources above: Get info, Set config, Set test, Get test-info
Requests from User/System: Find root cause(s), Find correlated failures
Results returned to User/System: Root cause(s), Correlated failures
Relationships with other projects (1)
Fault Localization System (set of analysis methods)
OpenStack components shown in the figure: Neutron/Nova; Ceilometer/Monasca; Neutron/Nova/External; Neutron/Nova/Cinder etc.
OPNFV projects shown in the figure: Yardstick, Doctor, Bottleneck
Fault/performance information sources
• Events
• Alarms
• Statistics
• Logs
System configuration
• Expected/desired configuration as known by the CMS
System models
• Layering
• Dependencies
• Topology
• Connectivity
• Policy
System OAM tools
• Active tools (like ping, trace, etc.)
Calls toward the sources above: Get info, Set config, Set test, Get test-info
Requests from User/System: Find root cause(s), Find correlated failures
Results returned to User/System: Root cause(s), Correlated failures
Relationships with other projects (2)
• Projects underway or being proposed in OPNFV:
– Doctor:
• The Doctor project is focused on fault notification but also has some notion of event aggregation. In this context, it can be one of the inputs for the Pinpoint project.
– Yardstick:
• Configuration verification testing project. It provides a testing framework and several basic testing methods. These could be used as a possible OAM tools framework for the Pinpoint project.
– Bottleneck:
• This project aims to provide an automated testing environment as part of deployment, to identify system bottlenecks and measure performance in the staging phase before deployment. It is oriented toward performance and focuses on the staging phase.
Reference in NFV standard
• Requirement for distributed fault correlation in ETSI GS NFV-REL 001 V1.1.1, Resiliency Requirements, chapter 10.4
Figure: Fault correlation in NFV
Reference in ONUG RFI Requirements
• Requirement for fault correlation in Network State Collection, Correlation and Analytics Product/RFI Requirements, May 2015
Proposed Project Scope
Figure labels: Fault Localization (project scope); OpenStack services: Neutron, Ceilometer, others; SDN Controller; VIM; NFVI; VNF/VNFM; interfaces numbered 1 to 7; data labels: Config, OAM, Topology; Statistics.
Proposed Project Scope - cont.
• Focus on networking fault-localization APIs for network connectivity faults
• Use cases: service continuity, network-load-based placement and migration
• In scope:
– Network fault localization requirements in a virtual environment
– Gap analysis for the APIs for the above use cases, e.g.:
• API for root cause of a connectivity problem between VNFs/VMs
• API for OAM tools for Ethernet/IP technologies
• API to retrieve network topology information
• API for fault and performance collection engines
– Active tests and statistics retrieval required for the above use cases
• Future extensions
– Extend the APIs for
• Fault localization requirements for compute and storage
• Other OAM tools
– POC that will include simple fault localization analysis logic as a reference implementation
– Extend for upper layers of NFV (alongside OPNFV evaluation)
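The in-scope APIs for topology retrieval, OAM tools, and connectivity root cause could be combined roughly as follows: look up the path between two VMs in the topology, then probe each hop in order until one fails. This is a minimal Python sketch under assumed names. The `topology` dictionary, the node names, and `fake_probe` (standing in for an active tool such as ping or trace) are hypothetical and do not correspond to any existing API.

```python
from collections import deque
from typing import Callable, Dict, List, Optional

Topology = Dict[str, List[str]]  # node -> directly connected neighbours


def shortest_path(topology: Topology, src: str, dst: str) -> Optional[List[str]]:
    """Breadth-first search for the hop-by-hop path from src to dst."""
    parents: Dict[str, Optional[str]] = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path: List[str] = []
            cur: Optional[str] = node
            while cur is not None:
                path.append(cur)
                cur = parents[cur]
            return path[::-1]
        for nxt in topology.get(node, []):
            if nxt not in parents:
                parents[nxt] = node
                queue.append(nxt)
    return None


def first_unreachable_hop(
    path: List[str], probe: Callable[[str, str], bool]
) -> Optional[str]:
    """Probe each hop from the source; return the first hop that does not answer."""
    for hop in path[1:]:
        if not probe(path[0], hop):
            return hop
    return None


# Hypothetical topology between two VMs on different hosts.
topology: Topology = {
    "VM1": ["vSwitch1"],
    "vSwitch1": ["VM1", "NIC1"],
    "NIC1": ["vSwitch1", "ToR1"],
    "ToR1": ["NIC1", "ToR2"],
    "ToR2": ["ToR1", "NIC2"],
    "NIC2": ["ToR2", "vSwitch2"],
    "vSwitch2": ["NIC2", "VM2"],
    "VM2": ["vSwitch2"],
}

path = shortest_path(topology, "VM1", "VM2")
assert path is not None

# Simulated fault: NIC2 and everything behind it stop answering probes.
broken_at = path.index("NIC2")


def fake_probe(src: str, dst: str) -> bool:
    return path.index(dst) < broken_at


culprit = first_unreachable_hop(path, fake_probe)
print(path, culprit)
```

The sketch stops at the first unreachable hop, which is where a reference implementation might begin before adding statistics and configuration checks to tell a failed NIC from, for example, an MTU mismatch on the same hop.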
Thank You!