tr 32.cde study on alarm management

Upload: pati-jorbenadze

Post on 04-Jun-2018

217 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    1/15

    3GPP TR 32.cde V0.1.0 (2012-11)Technical Report

    3rd Generation Partnership Project;Technical Specification Group Services and System Aspects;

    Telecommunication management;Study on Alarm Management

    (Release !"

    The present document has been developed within the 3rdGeneration Partnership Project (3GPPTM) and may be further elaborated for the purposes of 3GPP.

    The present document has not been subject to any approval process by the 3GPP Orani!ational Partners and shall not be implemented.

    This "eport is provided for future development wor# within 3GPPonly. The Orani!ational Partners accept no liability for any use of this $pecification.

    $pecifications and "eports for implementation of the 3GPPTMsystem should be obtained via the 3GPP Orani!ational Partners% Publications Offices.

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    2/15

    MCC selects keywords from stock list.

    3GPP

    &eywords

    3GPP

    Postal address

    3GPP support office address

    650 Route de !uc"o#e - $o%&"' t"%o#"

    V'#*oe - +RTe#./ 33 2 2 00 +'/ 33 3 65 4 16

    'nternet

    &tt%/www.3%%.or

    Copyright Notification

    o part may be reproduced ecept as authori!ed by written permission.

    The copyriht and the foreoin restriction etend to reproduction in all media.

    * +,-- 3GPP Orani!ational Partners (/"'0 /T'$ 11$/ 2T$' TT/ TT1).

    /ll rihts reserved.

    MT$4 is a Trade Mar# of 2T$' reistered for the benefit of its members

    3GPP4 is a Trade Mar# of 2T$' reistered for the benefit of its Members and of the 3GPP Orani!ational Partners

    5T24 is a Trade Mar# of 2T$' reistered for the benefit of its Members and of the 3GPP Orani!ational PartnersG$M6 and the G$M loo are reistered and owned by the G$M /ssociation

    3GPP TR 3!#cde $%##% (!%!&"!(Release !"

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    3/15

    otet

    1ontents....................................................................................................................................................3

    7oreword...................................................................................................................................................8

    'ntroduction...............................................................................................................................................8

    - $cope......................................................................................................................................................9

    + "eferences..............................................................................................................................................9

    3 :efinitions and abbreviations.................................................................................................................93.- :efinitions..............................................................................................................................................................93.+ /bbreviations.........................................................................................................................................................;

    8 "ationale for the $tudy on /larm Manaement.....................................................................................;

    9 7ault Manaement..................................................................................................................................

    < /larms....................................................................................................................................................>

    /larm Manaement 5ifecycle..............................................................................................................--

    -, /larm states ......................................................................................................................................-+

    Annex A

    Change history.............................................................................................................15

    3GPP

    3GPP TR 3!#cde $%##% (!%!&"3(Release !"

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    4/15

    +oreword

    This Technical "eport has been produced by the 3rdGeneration Partnership Project (3GPP).

    The contents of the present document are subject to continuin wor# within the T$G and may chane followin formalT$G approval. $hould the T$G modify the contents of the present document it will be re?released by the T$G with anidentifyin chane of release date and an increase in version number as follows@

    Aersion .y.!

    where@

    the first diit@

    - presented to T$G for informationB

    + presented to T$G for approvalB

    3 or reater indicates T$G approved document under chane control.

    y the second diit is incremented for all chanes of substance i.e. technical enhancements correctionsupdates etc.

    ! the third diit is incremented when editorial only chanes have been incorporated in the document.

    7troduct"o

    The massive amount of networ# elements in a mobile system and the variety of networ# elements and infrastructure

    eCuipment creates hue amount of alarms saturatin the alarm manaement systems. 'n parallel the numbers of types ofalarm have increased to overwhelmin proportions.

    The networ# administrators are flooded with alarms and alarms with often poor Cuality. The conseCuences of badCuality alarms are severe affectin many areas. The analysis of service impacts of faults in the networ#s is an

    increasinly comple challene for operators and need ood Cuality alarms.

    The fault manaement area is well established in the telecom businessB this technical report will eplore the alarm

    information itself taret users usae of such information and mechanisms and processes to enhance usability of thealarm information.

    The telecom alarm manaement eperience described is shared in basically all areas of alarm manaement.$tandardi!ation bodies in the production and enineerin fields (e 22M/ /$') have addressed the problem and

    underta#en substantial wor# under last decade to come up with solutions.

    The objective of this study is to analyse and secure applicability and impacts of the concept of alarm manaement in

    Telecom manaement. 't is proposed to benefit from wor# in the production and enineerin field since the tas# ofalarm manaement to a very hih deree is independent of different businesses. 't is a human?machine interaction.

    3GPP

    3GPP TR 3!#cde $%##% (!%!&"'(Release !"

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    5/15

    1 $co%e

    The scope of this technical report is to improve the Cuality of alarms and enhance usability of alarm systems.

    2 Re8erece

    The followin documents contain provisions which throuh reference in this tet constitute provisions of the present

    document.

    ? "eferences are either specific (identified by date of publication edition number version number etc.) or

    non?specific.

    ? 7or a specific reference subseCuent revisions do not apply.

    ? 7or a non?specific reference the latest version applies. 'n the case of a reference to a 3GPP document (includina G$M document) a non?specific reference implicitly refers to the latest version of that document in the same

    Release as the present document.

    D-E 3GPP T" +-.>,9@ FAocabulary for 3GPP $pecificationsF.

    D+E 3GPP T$ 3+.-,-@ FPrinciples and hih level reCuirementsF.

    D3E 3GPP T$ 3+.---?-@ F7ault ManaementB Part -@ 3G fault manaement reCuirements.

    D8E 3GPP T$ 3+.---?+@ F7ault ManaementB Part +@ /larm '"PB 'nformation $ervice ('$).

    D9E 'T?T M.3,9,. series (+,,8)@ FTM 2nhanced Telecom Operations Map (eTOM)F.

    D;E /$'H'$/ standard -=.+ ?+,,>@ FManaement of /larm $ystems for the Process 'ndustries F.

    D- 2dition + (+,,,9 D-E.

    Alarm:T0:.

    Alarm management:The processes and practices for determinin documentin desinin operatin monitorin andmaintainin alarm systems.

    Alarm system: The collection of hardware and software that detects an alarm state communicates the indication ofthat state to the operator and records chanes in the alarm state.

    Alarm philosophy: / document that establishes the basic definitions principles and processes to desin implementand maintain an alarm system.

    3GPP

    3GPP TR 3!#cde $%##% (!%!&"(Release !"

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    6/15

    3.2 **re:"'t"o

    7or the purposes of the present document the abbreviations iven in T" +-.>,9 D-E and the followin apply. /n

    abbreviation defined in the present document ta#es precedence over the definition of the same abbreviation if any in

    T" +-.>,9 D-E.

    /M /larm Manaement/$' /merican ational $tandards 'nstitute

    22M/ 2nineerin 2Cuipment and Materials sersJs /ssociation7M 7ault Manaement

    '$/ 'nternational $ociety of /utomationKO$ Kuality of $ervice

    R't"o'#e 8or t&e $tudy o #'r; ''e;et

    The massive amount of networ# elements in a mobile system and the variety of networ# elements and infrastructure

    eCuipment creates hue amount of alarms saturatin operators alarm manaement systems. 'n parallel the numbers of

    types of alarms have increased to overwhelmin proportions.

    Major mobility networ# incident manaement centre can count alarms in nL-,, ,,, per day. andlin of nL -,,,different types of alarms. 7indins from independent researchers in telecom are frihtenin

    N =, of all alarms results in a trouble tic#et less than once every -,,, alarms

    N >, of all tic#ets are from Q3, most common alarm types

    N The alarm severity levels have no correlation to the real priority as juded by the networ# administrators.

    The majority of the alarms should never have been presented for the networ# administrators.

    The fundamental problem is that the networ# administrators are flooded with alarms and alarms with often poor Cuality.

    Poor Cuality in this contet can include

    N uisance alarms (repeatin and fleetin alarms redundant and cascadin alarms)

    N $tale alarms

    N /larm floods

    N /larms without response

    N /larms with the wron priority

    N Out?of?$ervice alarms

    N "edundant alarms

    Status of the alarm management environment

    N Too many alarms occurrin. Aastly over alarmed systems producin far more alarms to the operator than needed

    N Too hih proportion of them is nuisance alarms of little operational relevance

    N The majority of the alarms should never have been presented for the networ# administratorsR

    The conseCuences of bad Cuality alarms are severe affectin many areas. / few eamples

    N Too much time and resources are spent to define alarms as irrelevant S most of the alarms are now irrelevantR

    N /larm floodin add compleity in fault resolution activities and thereby delays

    N 1ontributin factor to the seriousness of major incidents caused by delayed service impacts analysis

    3GPP

    3GPP TR 3!#cde $%##% (!%!&")(Release !"

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    7/15

    N 1urrent Cuality of alarm severities as set by eCuipment are misleadin and have a neative effect on the

    networ# service

    N Operators may nelect important alarms caused by not understandable alarm information to respond to the alarm

    N $inificantly overstaffed networ# manaement centre and increased human resources allocated in the assurance

    processes

    N General bad enineerin S O$ systems staff have to cope with poor Cuality data

    N Poor alarm manaement is a major barrier to reachin operational ecellence a business ris#

    N nnecessarily comple and costly O$$ solutions that have not supported a service and customer oriented

    approach at desired deree (1/P2I driver)

    N 1ontributin factor to low success rate of alarm correlation tools in telecom. one will cure fundamental faults

    in the basic alarm system as poor Cuality alarms

    N 0ad alarm data Cuality is a sinificant every day cost driver (OP2I driver).

    The telecom alarm manaement eperience is shared in basically all areas of alarm manaement. The incitements toresolve the alarm manaement problems have been more obviously in other areas as in the production and enineerin

    field.

    The very same issues presented above are often cited as contributin factors in industrial major incidents as Milford

    aven Three Mile 'sland 1hernobyl 0P eplosion major power rid failures ? to name a few. The alarm manaementsystems have recorded alarm for hours but the fault resolution was delayed and understandin of the basic problem was

    drowned in the amount of alarms.

    $tandardi!ation bodies in the production and enineerin fields (e 22M/ /$') have addressed the problem and

    underta#en substantial wor# under last decade to come up with solutions. $olutions are reported to be adopted by

    industry insurance and reulatory bodies.

    /larm manaement in telecom is obviously an overloo#ed and immature area that needs to chane.

    3GPP has a uniCue opportunity to address these problems since 3GPP has all eperts available in the definition of a

    mobile system includin Telecom Manaement. This T" will elaborate and analyse this escalatin and severe problem

    propose solutions to share uidelines and mandatory reCuirements with the networ# element specifyin roups.

    5 +'u#t ''e;et

    The standardisation objectives for 3GPP 7ault manaement are defined in 3GPP T$ 3+.-,- D3E. 'n the contet of this

    technical report the followin identified purpose is in focus@

    The purpose of 7M is to detect failures as soon as they occur and to limit their effects on the networ# Kuality of

    $ervice (KO$) as far as possible.

    The standardisation objectives

    ? :etect failures in the networ# as soon as they occur and alert the operatin personnel as fast as possibleB

    The detailed 3GPP specifications in the area of 7ault Manaement are found in the T$3+.---?L series.

    7or fault detection the followin information is presented in T$3+.---?- D9E@

    7or each fault the fault detection process shall supply the followin information@

    N the deviceHresourceHfileHfunctionalityHsmallest replaceable unit as follows@

    N for hardware faults the smallest replaceable unit that is faultyB

    3GPP

    3GPP TR 3!#cde $%##% (!%!&"*(Release !"

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    8/15

    N for software faults the affected software component e.. corrupted file(s) or databases or software codeB

    N for functional faults the affected functionalityB

    N for faults caused by overload information on the reason for the overloadB

    N for all the above faults wherever applicable an indication of the physical and loical resources that are affected

    by the fault if applicable a description of the loss of capability of the affected resource.

    N the type of the fault (communication environmental eCuipment processin error Ko$) accordin to 'T T

    "ecommendation I.

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    9/15

    6.1 T&e ur:e"##'ce :""o

    The very essence of the surveillance functionality is to alert the operator when incidents appear in out networ#s. This isa human?machine interface and a common epectation is that operators should never overloo# important alarms. To be

    able to fulfil such reCuest the oal must be to monitor only necessary alarms at the riht time by etractin the onesneeded. The vision is@

    7ew alarms.

    1learly prioriti!ed and presented to the operator.

    2ach with a needed action.

    2ach action is ta#en.

    /larms aid the operator in an upset.

    The system is monitored so performance is maintained

    The reality in our comple networ#s with the variety of networ# elements and infrastructure eCuipment is nearly thereverse. ue amount of alarms are saturatin the alarm manaement systems. 'n parallel the numbers of types of alarm

    have increased to overwhelmin proportions. The Cuality of the alarm information is too often of poor Cuality.

    H2ditor@ 'ntroduce the concept of ihly Manaed /larmH

    4 #'r;

    4.1 #'r; de8""t"o

    The definition of alarm is not consistent between different telecom bodies and we have conflictin definitions of events

    in 3GPP e

    I.

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    10/15

    22M/ ->- D

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    11/15

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    12/15

    Of prime importance and of most compleity of the alarm manaement processes is probably the rationali!ation tas#

    that will analyse and prioriti!e many of the individual alarms. 'n telecom most of the classification of alarm severity isdone by the eCuipment vendors and will need to be analysed and often reclassified dependent on the networ#

    environment service offered etc.

    The /$' H'$/-=.+ standard stresses the audit functionality to reular follow the behaviour of the alarm systems with

    defined &P's. 'mportant part of alarm manaement will be alarm suppression methods presented in clause --.

    Obviously we find a similarity with inheritin eTOM D9E processes. 't will be up to the individual operators to tailor its

    operations and will not be taret for standardisation.

    owever bad Cuality alarms in contrast with Uood alarms will be comple or impossible to handle even in an

    advanced alarm manaement environment.

    H2ditor@ T01H

    10 #'r; t'te

    /larm $tate transition diaram from /$'H'$/ -=.+ D;E

    3GPP

    3GPP TR 3!#cde $%##% (!%!&"!(Release !"

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    13/15

    /2ditor: 1onsider etension of the resource alarm states. /$'H'$/ -=.+ arues for new states as UOut of service

    U$uppressed by desin and U$helved. 't is important that the states are controlledHset at lowest possible level. (2

    2M or M)H

    11 #'r; u%%re"o ;et&od

    / #ey idea of 22M/ ->-D

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    14/15

  • 8/13/2019 TR 32.Cde Study on Alarm Management

    15/15

    e &'e &"tory

    5hange history6ate TSG 7 TSG 6oc# 5R Rev Su,ject15omment 8ld 9e0

    o: 2012 +"rt dr'8t 0.1.0

    3GPP TR 3!#cde $%##% (!%!&"(Release !"