TCS Events, the Data Dictionary, and Alarms (oh my) Michele, Chris, and Doug Version 1B

Upload: clinton-williamson

Post on 27-Dec-2015


TCS Events, the Data Dictionary, and Alarms (oh my)

Michele, Chris, and Doug

Version 1B

25 April 2014

Overview

This is not a presentation on architecture or design of an alarm handling system. This presentation discusses the state of our system today with respect to Events, and how the ECS is being cleaned up and uses the data dictionary for alarms. ECS can be used as a model for where we can see ourselves going in the future in terms of alarm handling.

- TCS Events – Overview and purpose
- Data Dictionary – Overview and purpose
- Alarm Handler – How can we get here with what we already have at our disposal?

Software Group Presentation 2

High-level Review of Events

Events are indicators which can be used as alerts in real-time or as tracers for post-mortem analysis of telescope actions.

In real time, events are designed to alert the Telescope Operator (via the LSSGUI and the message boxes on all the TCS GUIs) and the Observing Astronomer (via the synchronous return information for telescope commands issued by the instrument software) to a situation. The Observing Astronomer sees an event only if it is packaged in the command return object.

For post-mortem analysis, events represent clues as to the state of the system at a particular instant.

Event Feedback in TCS GUIs

Event Feedback in Text File

High-level Review of Events

- Every TCS subsystem defines its own events.
- Events can be pre-defined in XML or instantiated at run-time as needed.
- Existing events can be modified at run-time (i.e., update LogString and associated parameters).
- All client commands should have “bookend” events: started accompanied by complete/warning/failed (e.g., psf.command.setZernikes.started).
- Single-shot events are issued for some circumstance of particular interest (e.g., pcs.command.setNSEphemerisTarget.extrapolation).
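The bookend convention can be sketched as follows. This is only an illustration in Python, not the TCS event API: `issue_event`, `bookend`, and the in-memory `issued` list are hypothetical, and the priorities used (5 for started/complete, 1 for failed) are an assumption based on the priority table in this presentation.

```python
from contextlib import contextmanager

# Hypothetical in-memory event log; the real TCS event interface is not shown here.
issued = []


def issue_event(name, priority=5):
    """Record an event name with a priority (1 = error ... 5 = OK)."""
    issued.append((name, priority))


@contextmanager
def bookend(subsystem, command):
    """Issue '<subsystem>.command.<command>.started', then a closing event."""
    base = f"{subsystem}.command.{command}"
    issue_event(f"{base}.started")
    try:
        yield base
    except Exception:
        # The command raised: close the bookend with a 'failed' event.
        issue_event(f"{base}.failed", priority=1)
        raise
    else:
        issue_event(f"{base}.complete")


with bookend("psf", "setZernikes"):
    pass  # the command body would run here
```

Wrapping the command body this way means a started event can never be left without its closing event, which is the coupling that today exists only by convention.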

Event Definition Example in XML

Event Definition Example in Code

Invocation of call:

Supporting method:

The “LogString” is built on-the-fly with all the necessary parameters, and the default priority is OK (5). There is no easy way to know the names or the number of events the GCS will generate.
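For readers without the slide images, a minimal sketch of an event whose LogString is assembled at run-time. The `Event` class, `make_event` helper, event name, and message template are all hypothetical and are not the GCS code; only the default priority of 5 comes from the text above.

```python
from dataclasses import dataclass


@dataclass
class Event:
    name: str
    log_string: str
    priority: int = 5  # default priority OK (5), as stated above


def make_event(name, template, **params):
    """Build the LogString at run-time from a template and its parameters."""
    return Event(name=name, log_string=template.format(**params))


# Hypothetical event name and message, for illustration only.
ev = make_event(
    "gcs.example.guideStarLost",
    "Guide star lost on {side} side after {n} frames",
    side="left",
    n=3,
)
```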

Event Logging

Event Characteristics

- Essentially isolated messages
- Not coupled except by convention (e.g., started/complete)
- Do not maintain any state (i.e., do not “latch”)
- Issued typically to indicate an unexpected or negative transaction (i.e., there is no “I am happy again” counter-event)

Event Issues

- Not fully implemented across all TCS subsystems
- Inconsistent in implementation (XML vs instantiation in code)
- Inconsistent across subsystems in terms of priority settings and associated meaning

Priority  Color   Meaning
1         red     error
2         yellow  warning
3         green   Ok?
4         cyan    Ok?
5         white   Ok?

Event Clean Up

- Implement in all subsystems
- Does it matter if some are in XML and others are generated on the fly?
- Priorities 3 - 5 need to be better defined for use (5=started/complete, 4=informational, 3=OK, or ???)
- Priority/Color will need to be reconciled with Data Dictionary Severity scheme (discussed later)
- Clarify the wording as any particular event may be packaged up as a response to an observer command.

Event Information

Reference document: 481s505

Presentation: wiki.lbto.org/bin/view/SoftwareProducts/EventSubsystem

Data Dictionary

The data dictionary is a collection of variables representing the state of the TCS at a particular moment in time.

The TCS GUIs mine the data dictionary for the values represented on the GUI.

Variable datatypes are: bit, bool, char, uchar, short, ushort, int, uint, long, ulong, float, double, and string.

- Every TCS subsystem defines its own variables.
- Only the TCS subsystem that owns a variable can write to that variable.
- Any TCS subsystem can read any variable.
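The ownership rule (one writer, any reader) can be illustrated with a toy model. The `DataDictionary` class and the variable name are hypothetical and say nothing about how the real data dictionary enforces the rule.

```python
class DataDictionary:
    """Toy data dictionary: any subsystem may read, only the owner may write."""

    def __init__(self):
        self._values = {}
        self._owners = {}

    def define(self, owner, name, value):
        """Register a variable and the subsystem that owns it."""
        self._owners[name] = owner
        self._values[name] = value

    def read(self, name):
        """Reads are open to every subsystem."""
        return self._values[name]

    def write(self, subsystem, name, value):
        """Writes are rejected unless they come from the owning subsystem."""
        if self._owners[name] != subsystem:
            raise PermissionError(f"{subsystem} does not own {name}")
        self._values[name] = value


dd = DataDictionary()
# Hypothetical variable name, for illustration only.
dd.define("pcs", "pcs.example.trackingEnabled", True)
```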

DD realization in DDViewer

DD Definition in XML

DD Accessing Variables

There are two ways to access DD variables: Gtype objects and SetValueInterface objects. I will only discuss the Gtype objects here. Below is an example of getting a DD value, getting the values associated with the lower and upper limits, and setting a DD value.
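A minimal sketch of the three operations (get a value, get its limits, set a value). It uses a hypothetical `DDVariable` class and a hypothetical variable name; it is not the Gtype API, whose actual calls are not shown in this text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class DDVariable:
    """Hypothetical stand-in for a DD entry with optional static limits."""
    name: str
    value: float
    lower: Optional[float] = None  # None means no lower limit is defined
    upper: Optional[float] = None  # None means no upper limit is defined

    def get(self) -> float:
        return self.value

    def limits(self) -> Tuple[Optional[float], Optional[float]]:
        return (self.lower, self.upper)

    def set(self, new_value: float) -> None:
        self.value = new_value


temp = DDVariable("ecs.example.glycolTemp", value=4.5, lower=0.0, upper=10.0)
current = temp.get()       # read the current value
low, high = temp.limits()  # read the lower and upper limits
temp.set(6.0)              # write a new value
```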

DD Characteristics

- Each DD variable is an independent entity – there is no concept of a set of information (e.g., coordinates RA and Dec) though arrays are supported
  - For entries which are updated at a high rate as in this example, the Dec value may not be from the same timestamp as the RA value
- There is no timestamp associated with each entry, but some subsystems have “grouped” data and there is a timestamp for the group
- Assuming the subsystems keep their DD values up to date, the variables always reflect the current state.
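The RA/Dec mismatch and the “grouped” remedy can be sketched as follows. This is an illustration within a single Python process; `PointingGroup` and `GroupedEntry` are hypothetical names, and the real data dictionary (shared between subsystems) would need its own mechanism to publish a group consistently.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class PointingGroup:
    """RA and Dec published together with one timestamp for the group."""
    ra_deg: float
    dec_deg: float
    timestamp: float


class GroupedEntry:
    def __init__(self):
        self._current = None

    def publish(self, ra_deg, dec_deg, timestamp=None):
        # Build the whole group first, then swap it in with one assignment,
        # so a reader never pairs a new RA with an old Dec.
        stamp = time.time() if timestamp is None else timestamp
        self._current = PointingGroup(ra_deg, dec_deg, stamp)

    def snapshot(self):
        """Return the most recent group; all three fields belong together."""
        return self._current


pointing = GroupedEntry()
pointing.publish(10.0, 20.0, timestamp=100.0)
```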

DD Clean up

Not all DD items have associated limits defined in the XML.

Whether or not the limits of a variable are defined in the XML, not all subsystems use the mechanism for obtaining these values and then using this information for limit checking in the subsystem code.

Should the above items be addressed with the understanding the subsystem may need greater flexibility for limit checking than what can be achieved with these static values?

DD Setting Variables

Reference document: 481s504

Presentation: wiki.lbto.org/bin/view/SoftwareProducts/ReflectiveMemory

Leveraging What We Have

After some discussions (Doug, Chris, and Michele), we concluded that, because it holds state information, the data dictionary best lends itself to the idea of an annunciator panel (at the least) or an alarm handler (at the best).

Using the ECS as a prototype

Leveraging the data dictionary for alarms:

- Allows for a better and more robust implementation of the breadcrumb and rollup currently done by the ECSGUI
- Provides a model (ECS subsystem) for the remaining TCS subsystems

The above steps allow us to produce an annunciator panel for the TCS. We can go further and …

- Export the data dictionary items (as is done for FACSUM via the DDS) to an external system

ECS as an Example

The ECS is composed of a number of subcomponents, which are themselves composed of subcomponents … This subsystem is naturally represented in a hierarchical manner.

ECS as an Example

The mirror ventilation itself comprises a number of subcomponents, which are depicted here down to the lowest-level “device”. Each device has an associated severity flag depending upon its PLC state. The next higher-level group also has an associated severity flag equal to the worst severity among its constituents; with the flag values below, the worst of error, warning, and ok is the numerically lowest value. This rollup continues until the top of the hierarchy is reached.

The current severity flag values are: error = 1, warning = 2, ok = 3, info = 4, debug = 5, and unknown = 6.

This scheme is easily exploited by the ECSGUI to color its navigation buttons and “eyebrow” in order to create the breadcrumb trail.

We should reconcile the DD severity flag levels (and colors used by the GUIs) with the event priorities as both facilities will be used.

ECS as an Example

ecs.severity = 1
  ecs.mv.severity = 2
    ecs.mv.heatExchangers.severity = 2
      ecs.mv.heatExchangers.hx0401.severity = 2
      ecs.mv.heatExchangers.hx0402.severity = 2
      ecs.mv.heatExchangers.hx0403.severity = 3
      ecs.mv.heatExchangers.hx0404.severity = 3
  ecs.dampers.severity = 1
    ecs.dampers.dp0405 = 1
    ecs.dampers.dp0406 = 3
    ecs.dampers.dp0407 = 2
    ecs.dampers.dp0408 = 3
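The rollup in this example can be reproduced with a short recursive function. This is a sketch, not the ECS code: the nested-dictionary layout and the `rollup` name are assumptions, and it only considers error (1), warning (2), and ok (3), for which the worst state is the lowest number. How info, debug, and unknown should take part in a rollup is not decided here.

```python
ERROR, WARNING, OK = 1, 2, 3


def rollup(node):
    """Return the worst severity in a tree. Leaves are ints, groups are dicts."""
    if isinstance(node, int):
        return node
    # With error=1, warning=2, ok=3 the worst state is the smallest value.
    return min(rollup(child) for child in node.values())


# Device-level flags taken from the example above.
ecs = {
    "mv": {
        "heatExchangers": {"hx0401": 2, "hx0402": 2, "hx0403": 3, "hx0404": 3},
    },
    "dampers": {"dp0405": 1, "dp0406": 3, "dp0407": 2, "dp0408": 3},
}

ecs_severity = rollup(ecs)
```

Only the device flags are stored; every group value shown in the example (2 for the heat exchangers and mirror ventilation, 1 for the dampers and the ECS as a whole) is derived.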

Considerations

The ECS example only addressed states of error, warning, and OK as it was based upon hardware. What about analog values and their associated lower and upper limits?

Should the third-party package determine when limits have been violated?

What if the limits are not firm, but rather they are based upon some real-time computation?

In order to keep the specific TCS GUI and the alarm handler synchronized (which is a must), as well as retain flexibility when the lower and upper limits must be computed in real-time based upon some system variable, the subsystem should determine when a limit has been violated. This means providing the third-party package only with severity flags. All the real intelligence is retained in the subsystem (at least for the TCS).
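A sketch of a subsystem deciding the severity itself and exporting only the flag. The function name, the warning margin, and the dew-point style example are assumptions for illustration; the text above only requires that the subsystem detect the violation, with limits that may be static or computed in real-time.

```python
ERROR, WARNING, OK = 1, 2, 3


def severity_for(value, lower, upper, warn_margin=0.0):
    """Classify a value against limits.

    Each limit may be a number (static) or a zero-argument callable
    that computes the limit from the current system state.
    """
    lo = lower() if callable(lower) else lower
    hi = upper() if callable(upper) else upper
    if value < lo or value > hi:
        return ERROR
    # Hypothetical warning band just inside the limits.
    if value < lo + warn_margin or value > hi - warn_margin:
        return WARNING
    return OK


# Hypothetical real-time limit: stay 2 degrees above a measured dew point.
state = {"dew_point": 3.0}


def lower_limit():
    return state["dew_point"] + 2.0


flag = severity_for(4.0, lower_limit, 20.0)
```

Because only `flag` leaves the subsystem, the TCS GUI and an external alarm handler cannot disagree about whether a limit was violated.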

ECS as an Example

Special states suggested by users:

- Hardware deliberately put into a non-working state so techs want to know about it at the “low-level” but do not want the condition to propagate to the high-level (may be temporary)
- Hardware deliberately put into a non-standard state (manual vs automatic) permanently
- There may be other special cases
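One possible way to handle the first special state is a per-device flag that keeps the severity visible at the device level but excludes it from the rollup. The `Device` class and its `propagate` field are hypothetical; this is a sketch of the idea, not a proposed design.

```python
from dataclasses import dataclass


@dataclass
class Device:
    severity: int           # 1 = error, 2 = warning, 3 = ok
    propagate: bool = True  # False: report at the device level only


def group_severity(devices, default=3):
    """Worst severity among the devices that are allowed to propagate upward."""
    contributing = [d.severity for d in devices.values() if d.propagate]
    # If every device is masked (or there are none), report the default (ok).
    return min(contributing, default=default)


# dp0405 is deliberately out of service: still an error locally, but masked.
dampers = {
    "dp0405": Device(1, propagate=False),
    "dp0406": Device(3),
    "dp0407": Device(2),
    "dp0408": Device(3),
}

dampers_severity = group_severity(dampers)
```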

Severity Flags

If all TCS subsystems implement this scheme, we will not only have a robust annunciator panel, but also a path to exploit existing alarm handler software packages. Admittedly, at the least, code is needed to convert our data dictionary information into the format preferred by the third-party package.

Characteristics of Alarm Handlers

- Brings the issue to the attention of Mountain personnel
- Allows for the acknowledgement of alarm (basically someone has taken ownership of the alarm – fuzzy to me, enforcement?)
- Provides guidance for actions to pursue via pop-up or URL
- Provides for processes to be triggered when a particular transaction occurs (open a subsystem GUI?)
- Accommodates suppression via filters (alarms held off in the handler until the entity is in the alarm state for a specified duration, etc.)
- Provides for logging of alarms and display of the log history (for TCS this is covered by the event log)
- Presents the alarms in a graphical, hierarchical view (logical grouping) for easy digestion

EPICS

The EPICS alarm handler can deal with literally thousands of alarms at a granular level. Since we will have already created a detailed set of severity flags in the TCS subsystems in support of the TCS GUIs, we have the flexibility to control how “deep” we want any portion of the alarm handler to be.

There are effectively four levels of alarm: invalid, major, minor, and no_alarm (though I also see error as the worst level).

Sound can be associated with the raising of an alarm – looks like only one sound accommodated (default: beep)

EPICS - Filtering

Filtering or use of masks controls when an alarm is displayed

- if a device is in alarm for more than N seconds
- if a device enters into an alarm state from a no_alarm state more than M times in N seconds
- if an alarm is even displayed on the handler (though it may still be logged)
- if an alarm must be acknowledged by the Telescope Operator
- if an alarm is logged
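The first two filters can be sketched as small stateful classes. This illustrates the logic only; it is not how the EPICS alarm handler implements filtering, and the class names and `update` interface are hypothetical.

```python
from collections import deque


class HoldOffFilter:
    """Show an alarm only after it has been continuously active for more than hold_s seconds."""

    def __init__(self, hold_s):
        self.hold_s = hold_s
        self._since = None  # time at which the current alarm episode began

    def update(self, in_alarm, now):
        if not in_alarm:
            self._since = None
            return False
        if self._since is None:
            self._since = now
        return now - self._since > self.hold_s


class CountFilter:
    """Show an alarm when it is entered more than max_count times within window_s seconds."""

    def __init__(self, max_count, window_s):
        self.max_count = max_count
        self.window_s = window_s
        self._entries = deque()  # times of no_alarm -> alarm transitions
        self._was_in_alarm = False

    def update(self, in_alarm, now):
        if in_alarm and not self._was_in_alarm:
            self._entries.append(now)
        self._was_in_alarm = in_alarm
        # Forget transitions that have fallen out of the window.
        while self._entries and now - self._entries[0] > self.window_s:
            self._entries.popleft()
        return len(self._entries) > self.max_count
```

For example, `HoldOffFilter(5)` ignores an alarm that clears within 5 seconds, while `CountFilter(2, 10)` flags a device that enters the alarm state three times within 10 seconds.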

Near term

- Clarify the meaning of the event priorities
- Establish firm set of severity flags
- Reconcile severity flag levels with event priorities
- Complete implementation of severity flags in ECS
- Clean-up the ECSGUI to use the severity flags
- Re-think the ECSGUI message box which has always been more of an Alarm box – address the Event LogStrings?
- Determine the best manner to map data dictionary items into the ASCII database needed for EPICS – and do it!
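As a starting point for the last item, a sketch that renders one severity flag as an EPICS database record in text form. The mbbi record type and its per-state value, string, and severity fields (ZRVL/ZRST/ZRSV and so on) are standard EPICS, but the choice of record type, the replacement of '.' with ':' in names, and the severity mapping (error to MAJOR, warning to MINOR, ok to NO_ALARM) are assumptions, not decisions. How the flag value reaches the record is left out.

```python
def dd_to_record(dd_name, description):
    """Render one DD severity flag as the text of an EPICS mbbi record."""
    # '.' separates record and field names in EPICS, so use ':' in the record name.
    pv = dd_name.replace(".", ":")
    # (field prefix, DD flag value, state label, assumed EPICS alarm severity)
    states = [
        ("ZR", 1, "error", "MAJOR"),
        ("ON", 2, "warning", "MINOR"),
        ("TW", 3, "ok", "NO_ALARM"),
    ]
    lines = [f'record(mbbi, "{pv}") {{', f'    field(DESC, "{description}")']
    for prefix, value, label, sev in states:
        lines.append(f'    field({prefix}VL, "{value}")')
        lines.append(f'    field({prefix}ST, "{label}")')
        lines.append(f'    field({prefix}SV, "{sev}")')
    lines.append("}")
    return "\n".join(lines)


record_text = dd_to_record("ecs.mv.severity", "Mirror ventilation")
```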
