Page 1: M4 Design and Features Review January 2007

M4 Design and Features Review January 2007

M4 Technical Committee Review of the Final M4 Design and Features

Page 2: M4 Design and Features Review January 2007

January 11, 2007

Meeting Agenda

9:00 AM – 9:45 AM

Introduction
• Review M4 Project Objectives
• Review Project Milestones
• Project Status and Schedule
• Meeting Objectives

10:00 AM – Noon

System Design and Configuration
• Target Platforms and Development Toolsets
• Revised System Architecture
• Data and Topology Configuration

Noon – 12:30 Working Lunch (provided)

12:30 – 4:30 PM

System Features
• M4 Client
• Data Submission
• Legacy Data Migration
• Failover

Page 3: M4 Design and Features Review January 2007

M4 Project Objectives

The M4 application will replace the legacy MiniMon and MultiMon applications and will run on current Microsoft Windows platforms, with the following objectives:

1. All interrogation data collected by this system will be 100% valid.

2. Interrogation data will be provided to PTAGIS in “near-real” time.

3. 99.9% uptime of all system components.

4. SxC functionality must be at least as efficient as in MultiMon.

5. Interface with G2 readers and all legacy hardware.

6. Interface with PTAGIS data management systems.

7. Ease of use.

8. Common application for all deployment scenarios.

9. Monitoring will take precedence over SxC control operations.

10. Provide, as an option, continuous operation with automated fail-over from system or application faults.

Page 4: M4 Design and Features Review January 2007

M4 Project Milestones

Project Milestones from M4 Delivery on PTAGIS Wiki

Page 5: M4 Design and Features Review January 2007

Project Status

• Large portion of project already implemented as M4 alpha release

• Decision to drop Marathon platform and provide custom failover solution

• Development put on hold as of 10/2006 per failover decision

• Architecture revised to include requested features for software failover

• Completion of PLC communication evaluation

• Presentation of revised architecture and design to M4 Committee for approval

• Review and finalize SxC requirements

• Continue production of M4 development

Page 6: M4 Design and Features Review January 2007

Project Schedule Proposal

• Jan. 2007: M4 Committee approves the revised design of M4; SxC requirements are reviewed by the SxC Subcommittee

• Feb. 2007: SxC requirements are complete and approved by SxC Subcommittee

• June 2007: Delivery of a beta M4 that performs basic monitoring and data-submission

– Some basic regression testing on SxC alpha releases throughout summer

• Sept. 2007: Delivery of a beta M4 that performs all functions including SxC and fail-over

– Thorough regression testing and tuning are performed on the beta

– Live fish test is scheduled to evaluate performance of tuned SxC

• Sept. 2007: Delivery of MobileMonitor 2.0 that interfaces with M4 beta

• Dec. 2007: M4 Committee approves production release of M4

Page 7: M4 Design and Features Review January 2007

Meeting Objective

To begin production development, the M4 Committee will need to approve the following:

1. Project Schedule Proposal

2. Development Tools and Target Platforms

3. Revised System Architecture

4. Topology Configuration Features

5. M4 Client Features

6. Data Submission Design and Features

7. Legacy Data Migration Design and Features

8. Failover Cluster Design and Features

Page 8: M4 Design and Features Review January 2007

Development Toolset

The following development tools will be used to develop M4:

• .NET Framework 2.0 and C# Language
  – .NET Framework 3.0 is being considered: its Windows Communication Foundation would replace the 2.0 Remoting features, and it includes Windows Vista support
• SQL Server Express (client database)
  – Free, lightweight and popular
  – Ease of integration and management in .NET; XML support
  – File-based deployment; automatic tuning and patching
  – Powerful, reliable, secure and scalable
  – Reporting and Replication Services
• SQL Server Standard (data submission staging database)
  – Low TCO: pricing supports low-connectivity with large data volumes
  – Hosts native XML Web Services without need for IIS
  – Processor License: $6K; Server + 5 CALs: $2K + $162 per additional CAL
• Parijet PLC Communication Library

Page 9: M4 Design and Features Review January 2007

Target Platforms

M4 will support the following platforms:

• Windows XP SP2 or better

• Windows 2003 Server

• Windows Vista

• Windows 2000*

*Windows 2000 will not be supported if .NET 3.0 is used

Failover Cluster Requirements:

• Windows 2003 Server
  – Redundant NICs supporting fail-over (Private Network)
  – Single NIC (Public Network)
  – High Performance System (dual-core, 2GB, RAID)

Page 10: M4 Design and Features Review January 2007

Supported Devices

DEVICE       DESCRIPTION                                PROTOCOL    COMMUNICATION
FS2001 ISO   Digital Angel Portable Reader              ASCII       Serial
FS1001       Digital Angel Juvenile Stationary Reader   ASCII/BPA   Serial
FS1001A      Digital Angel Adult Stationary Reader      ASCII/BPA   Serial
FS1001M      Digital Angel Multiplexer Reader           ASCII       Serial
B2CC-G2      Digital Angel B2CC Reader                  XML         Serial/Ethernet
G2 MUX       Proposed In-stream Reader                  ASCII       USB Driver
GPS          Global Positioning System                  NMEA        Serial
SLC-500      Programmable Logic Ctrl.                   ENETIP      Ethernet

Page 11: M4 Design and Features Review January 2007

Revised System Architecture

Online Revised M4 System Architecture

Page 12: M4 Design and Features Review January 2007

M4 Data

The advantages of storing data in a structured database instead of a text file:

• Data is relational in nature (status reports linked to topology version)

• Data retrieval for robust reporting and viewing on the client

• Database can be secured

• Adaptable: new types of messages can be easily added to system

• Reliable: data is stored immediately into database and not buffered

[Diagram: client database tables and their columns]

DeviceMessage: MessageKey, TopologyKey, SiteCode, AntennaGroupID, DeviceID, AntennaID, Buffered

Message: MessageKey, MessageType, LocalTimeStamp, PSTTimeStamp, Processed

MessageText: MessageKey, LineNumber, MessageText

MessageTypes: MessageType, MessageTypeDescription
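
The four tables named on this slide could be declared roughly as follows. This is a minimal sketch using Python's built-in sqlite3 module purely for illustration (M4 itself targets SQL Server Express). The table and column names come from the slide; the column types, the key constraints, and the sample values (site code 'ABC', device '01', antenna 'A1') are assumptions.

```python
import sqlite3

# Table and column names follow the slide; types and constraints are assumed.
SCHEMA = """
CREATE TABLE MessageTypes (
    MessageType            INTEGER PRIMARY KEY,
    MessageTypeDescription TEXT NOT NULL
);
CREATE TABLE Message (
    MessageKey     INTEGER PRIMARY KEY,
    MessageType    INTEGER NOT NULL REFERENCES MessageTypes(MessageType),
    LocalTimeStamp TEXT NOT NULL,
    PSTTimeStamp   TEXT NOT NULL,
    Processed      INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE DeviceMessage (
    MessageKey     INTEGER PRIMARY KEY REFERENCES Message(MessageKey),
    TopologyKey    INTEGER NOT NULL,
    SiteCode       TEXT NOT NULL,
    AntennaGroupID TEXT,
    DeviceID       TEXT NOT NULL,
    AntennaID      TEXT,
    Buffered       INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE MessageText (
    MessageKey  INTEGER NOT NULL REFERENCES Message(MessageKey),
    LineNumber  INTEGER NOT NULL,
    MessageText TEXT NOT NULL,
    PRIMARY KEY (MessageKey, LineNumber)
);
"""


def create_client_db():
    """Create an in-memory database holding the sketched schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


# Hypothetical sample data: one real-time tag message from one device.
conn = create_client_db()
conn.execute("INSERT INTO MessageTypes VALUES (1, 'Real-time tag')")
conn.execute(
    "INSERT INTO Message VALUES "
    "(1, 1, '2007-01-11 09:00:00.000', '2007-01-11 09:00:00.000', 0)"
)
conn.execute("INSERT INTO DeviceMessage VALUES (1, 1, 'ABC', NULL, '01', 'A1', 0)")

# The relational layout lets a report join a device message to its type.
rows = conn.execute(
    "SELECT d.SiteCode, t.MessageTypeDescription "
    "FROM DeviceMessage d "
    "JOIN Message m ON m.MessageKey = d.MessageKey "
    "JOIN MessageTypes t ON t.MessageType = m.MessageType"
).fetchall()
```

The join at the end illustrates the "relational in nature" point: a message row links to its type and, through TopologyKey, to a topology version.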

Page 13: M4 Design and Features Review January 2007

Message Data

M4 collects a variety of messages (data) from various sources:
• Device
• System (application and OS)
• Separation-by-Code Operations
• Failover Operations

Each message has the following attributes:
• Timestamp in Local and PST time (with millisecond resolution)
• Source information (machine, topology, device, SxC, application type)
• Message Type:

– Real-time or buffered tag

– Device (alarms, status, noise, GPS coordinates)

– Monitor operation (start, stop, pause, system status, pulse)

– Error (system and device)

– SxC operation data

– Failover (planned, system fault)
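
The dual timestamp attribute can be illustrated with a short sketch. It assumes that "PST" means a fixed UTC-8 offset all year (no daylight saving) and that timestamps are kept as ISO 8601 strings; the function name `stamp` is hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: PST is a fixed UTC-8 offset with no daylight-saving shift.
PST = timezone(timedelta(hours=-8), "PST")


def stamp(utc_now, local_tz):
    """Return (local, PST) ISO timestamps at millisecond resolution."""
    local = utc_now.astimezone(local_tz).isoformat(timespec="milliseconds")
    pst = utc_now.astimezone(PST).isoformat(timespec="milliseconds")
    return local, pst
```

Storing both values lets a site display readings in its own time zone while records from different sites stay comparable on one clock.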

Page 14: M4 Design and Features Review January 2007

Topology Configuration

• Topology configuration describes a set of physical devices and their topologic relationships that provide instrumented monitoring at one or more interrogation sites over a period of time

• It provides location-specific context to interrogation data

• A version of a topology configuration has a one-to-one relationship with collected data; M4 maintains this historical relationship between topology version and data

• Topology information is submitted to PTAGIS and integrated into SiteConfig data table

• Topology configurations are managed by a version number

Page 15: M4 Design and Features Review January 2007

Topology Component Relationships

[Diagram: hierarchy with boxes labeled Topology, Site (x3), Antenna Group (x2), Reader Device (x4)]

Page 16: M4 Design and Features Review January 2007

Topology Component Relationships

[Diagram: hierarchy with boxes labeled Topology, Site, Reader Device, Antenna (x4), Trigger Device (x3)]

Page 17: M4 Design and Features Review January 2007

Topology Component Features

Description    Provides a detailed description for this topology
Version*       Major version and revision number, e.g. 1.5
Created*       Date version was created
Modified*      Date version was last modified
Activated*     Date topology was activated (data collected)
Expiration*    Date topology expired
State*         Current topology lifecycle state: New, Pending, Active, Expired

*Read Only

Page 18: M4 Design and Features Review January 2007

Site Component Features

Site Code      Three character code assigned by PTAGIS
Description    Description of interrogation site
Type           Type of site: Juvenile (mainstem), Adult (mainstem), In Stream
Location       Optional: lat/long pair representing location of site
Supports SxC   True if SxC operations occur at this site

Page 19: M4 Design and Features Review January 2007

Antenna Group Component Features

Description         Verbose detail about this grouping of antennas
Sorting Sequence    Provides logical sorting in relation to physical layout of antenna groups
Site Entrance       True if located at entrance of a site
Site Exit           True if located at exit of a site
Final Disposition   Information on fish disposition after leaving antenna group: Unknown, Indeterminate, River, Transportation, Sample Transportation, SMP
Location            Optional: lat/long coordinate pair

Page 20: M4 Design and Features Review January 2007

Device Component Features

Device ID       Two character hexadecimal unique identifier assigned by PTAGIS
Description     Optional: verbose description of device
Device Type     Type of device: FS1001, FS1001A, FS1001M, FS2001, FS1001G2, GPS, SLC500, B2CC
Data Protocol   Communication protocols: ASCII, Binary (BPA), NMEA, SLC500, XML

Page 21: M4 Design and Features Review January 2007

Device Component Features Continued

Port Type           Type of communication port: Serial (RS-232), UDP, TCP
Port                Communication port (serial: COM1, TCP: 1599)
Ethernet Settings   Ethernet communication settings: Host Name, Remote Port
Serial Settings     Serial communication settings: Baud Rate, Parity, Data Bits, Stop Bits
Location            Optional: lat/long coordinates of device

Page 22: M4 Design and Features Review January 2007

Mux Antenna Component Features

Antenna ID    Two character hexadecimal unique identifier assigned by PTAGIS
Alias ID      Optional: two character hexadecimal site unique identifier to bypass current PTAGIS limitations
Description   Optional: verbose description of antenna placement
Location      Optional: lat/long coordinates of antenna

Page 23: M4 Design and Features Review January 2007

Gate Component Features

Description    Verbose description of gate
Type           Type of gate: Two Way, Three Way
Address        PLC bit-mask address of a physical gate
Delay Period   Period in milliseconds to delay before opening gate
Location       Optional: lat/long coordinates of gate

Page 24: M4 Design and Features Review January 2007

Creating or Modifying Topology Configuration

M4 distinguishes between two types of topology changes:

• Major Changes
  – Adding or removing a topology component (device, antenna, gate)
  – Renaming a device ID, mux-antenna identifier or site code
  – Changing the relationship between any of the components, e.g. moving a device from one antenna group to another
  – Changing the type of a device
• Minor Changes
  – Changing a serial port or other serial or Ethernet setting
  – Changing data protocol or port type for a device
  – Changing the description of a component
  – Changing any of the gate settings

Page 25: M4 Design and Features Review January 2007

Topology Versioning Rules

• Major changes require a new topology version
  – Topology version number will increment (example 1.0 to 2.0)
  – A new topology can be created with the New Topology Manager while monitoring
  – User activates a new topology and restarts the monitor to use the new topology version
  – User can import a new topology version from a file
• Minor changes can be made to the active topology version
  – Topology version number will increment (example 1.0 to 1.1)
  – Monitor must be stopped to make minor changes
  – User performs minor changes from the Topology Viewer
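
The two increment rules can be expressed as a small helper. This is a sketch only: the function name `next_version` and the "major.minor" string format are assumptions based on the examples on this slide (1.0 to 2.0, 1.0 to 1.1).

```python
def next_version(version, change):
    """Return the next topology version for a 'major' or 'minor' change."""
    major, minor = (int(part) for part in version.split("."))
    if change == "major":
        # A major change starts a new topology version and resets the revision.
        return f"{major + 1}.0"
    if change == "minor":
        # A minor change only bumps the revision of the active version.
        return f"{major}.{minor + 1}"
    raise ValueError(f"unknown change type: {change!r}")
```

Parsing the parts as integers, rather than treating the version as a decimal number, keeps revisions past nine well-defined (1.9 is followed by 1.10).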

Page 26: M4 Design and Features Review January 2007

General Topology Rules

1. A valid topology version must exist before the monitor can start
  • M4 will be installed with a default, empty topology version (0.0)
  • Any start actions will be disabled if version is not valid
  • Starting monitor from Service Control Manager will fail and generate an error
  • A valid topology version has:
    – At least one site defined
    – At least one reader device defined for a site
    – Two or more readers in any antenna group
    – All mandatory settings for each component specified and valid
    – Only one device enabled for a single port address

2. Importing a topology version will create a new topology version

3. Topology configuration will override any device id transmitted in data

4. Clustered machines must run the same topology version

5. Any changes to topology take effect the next time the monitor is started
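
The validity conditions under rule 1 lend themselves to a simple checker. The sketch below covers four of them (at least one site, at least one reader per site, two or more readers per antenna group, one enabled device per port); the check of mandatory settings is left out. The class and field names are hypothetical and are not taken from the M4 code base.

```python
from dataclasses import dataclass, field


@dataclass
class Device:
    device_id: str
    port: str
    enabled: bool = True


@dataclass
class AntennaGroup:
    description: str
    devices: list = field(default_factory=list)


@dataclass
class Site:
    site_code: str
    devices: list = field(default_factory=list)  # readers attached directly to the site
    antenna_groups: list = field(default_factory=list)


def validate(sites):
    """Return a list of rule violations; an empty list means the topology is valid."""
    errors = []
    if not sites:
        errors.append("no site defined")
    ports_in_use = set()
    for site in sites:
        readers = list(site.devices)
        for group in site.antenna_groups:
            if len(group.devices) < 2:
                errors.append(
                    f"{site.site_code}: antenna group '{group.description}' "
                    "has fewer than two readers"
                )
            readers.extend(group.devices)
        if not readers:
            errors.append(f"{site.site_code}: no reader device defined")
        for dev in readers:
            if not dev.enabled:
                continue  # disabled devices may share a port address
            if dev.port in ports_in_use:
                errors.append(
                    f"{site.site_code}: port {dev.port} is used by more than "
                    "one enabled device"
                )
            ports_in_use.add(dev.port)
    return errors
```

Returning the full list of violations, rather than stopping at the first, matches the idea of a validation tool that reports everything to fix before the monitor can start.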

Page 27: M4 Design and Features Review January 2007

M4 Client

Page 28: M4 Design and Features Review January 2007

Behind the Scenes of M4 Client

[Diagram: two deployments, one with a single Primary Server and one with a Primary Server plus a Failover Server. Box labels: User Session, M4Remoting.dll, M4 Site Monitor Service, System, Device.dll, SxC.dll, Failover.dll, Client Database, Config.XML]

Page 29: M4 Design and Features Review January 2007

M4 Client Features

[Figure: M4 Client with callouts labeled Topology Viewer, Data Viewer, Site, Monitor, Device, SxC Process, Other Topologies, Filter]

Page 30: M4 Design and Features Review January 2007

M4 Client Features

Page 31: M4 Design and Features Review January 2007

Additional M4 Client Features

• Data Viewer displays pages of data
  – Rows per page is user-defined
  – Sorting Data (TBD)
• Set Data Viewer to Auto-Refresh
  – Any filters apply
  – Cannot scroll (only data that will fit in viewer displayed)
• Right-click device to enable or disable
• Right-click device or component to generate context-sensitive reports
• SxC Operations
  – Start/Stop/Refresh SxC Operations independently from stopping/restarting Monitor
  – Access SxC Configuration (from topology component or menu)
• Export message data in a variety of formats (XML, CSV)
• Import data from MobileMonitor 2.0 or other M4 installation
• User-initiated data submissions to PTAGIS
• Reporting: Device Diagnostics, Noise, Tag Hits, Site Operations, Antenna-Group Efficiency (TBD), SxC Gate Efficiency (TBD)
• Manage Application Settings (schedules for upload, trigger devices, pulse records; failover configuration, upload settings, time zone)
• Download Wizard to download stored data from remote readers
  – Supports single serial port for multiple remote readers (maps to existing topology)
  – Converts buffered tags to real-time tags if timestamps are available

Page 32: M4 Design and Features Review January 2007

Creating a New Topology

The New Topology Manager is used to create new topology versions

• Accessed from M4 Client menu
• Available even if monitor is running
• Create new topology:
  – Blank
  – Existing Topology
  – Discover
  – Import
• Provides validation tool
• User must activate a new topology version to be used on restart of monitor
• New topology version can be saved, closed, and updated at a later time

Page 33: M4 Design and Features Review January 2007

M4 Data Submission to PTAGIS

[Diagram: the M4 Client side (M4 Site Monitor Service, Client Database, Upload Manager, Config.XML) sends XML over HTTPS/TCP to the PTAGIS side (WS-PDS Web Service, LDAP, PIT Staging Database). Arrows are numbered 1. Initiate, 2. Read Config.XML, 3. Connect HTTPS/TCP, 4. Authenticate/Authorize, 5. Upload Outstanding Data, 6. Feedback]

1. Upload process is initiated either automatically or by user

2. Upload Manager reads the configuration file (user, connection)

3. Upload Manager connects to WS-PDS web service at PTAGIS

4. Authentication and Authorization with WS-PDS based upon evidence supplied from client

5. Upload outstanding Topology Versions and Message Data

6. Upload Manager reports feedback from WS-PDS service

Page 34: M4 Design and Features Review January 2007

Data Submission Step 1: Initiating the Upload

Two ways to start an upload:

1. Manual Upload
  – End-user initiates upload manually by selecting Upload Data command from M4 Client menu
  – This upload can be initiated independent of the state of the monitor
  – Feature will allow a user to reset data for resubmission to PTAGIS
  – User will be provided feedback during the upload process with the ability to cancel the process

2. Automatic Upload
  – Data is uploaded to PTAGIS on a user-defined schedule
  – Monitor must be running
  – Data will be uploaded on the next scheduled interval when monitor is started (will not perform a make-up)
  – Uploading data should not impact performance of the system
  – Feedback from upload sessions can be viewed from a system report or the Data Viewer

Page 35: M4 Design and Features Review January 2007

Data Submission Step 2: Read the Configuration File

Before an M4 installation can upload, these settings must be configured from the M4 Client:

SETTING                   DESCRIPTION
PTAGIS Upload Interval    How frequently the M4 system will automatically upload data to PTAGIS. Zero to disable.
PTAGIS Account Name       Name of the PTAGIS user account for authentication and authorization during upload.
PTAGIS Account Password   Password associated with PTAGIS account for authentication during uploads. Encrypted.
Use VLAN Connection       Indicates whether to use a TCP connection if VLAN network is configured at the site.
PTAGIS Host Name          Name or IP address of host to query for WS-PDS service.

The M4 Client configuration manager will have a Test command to validate the configuration settings with the WS-PDS web service.

Page 36: M4 Design and Features Review January 2007

Data Submission Step 3: Connecting to the Web Service

• M4 Upload Manager on client computer queries PTAGIS host server over the network for the existence of the WS-PDS service.

• If service is disabled or network connection fails, upload session is terminated and condition is logged

• For M4 installations within Commission network
  – Use VLAN setting is true
  – A faster, more reliable TCP connection is used

• For M4 installations outside of Commission network
  – An HTTPS connection is used instead
  – Less prone to firewall issues

• Note: Windows Communication Foundation in .NET 3.0 simplifies the task of building this web service and communicating with this web service with various network bindings

Page 37: M4 Design and Features Review January 2007

Data Submission Step 4: Authentication and Authorization

Once a connection is established, the client requests authentication and authorization from the WS-PDS web service:

• The PTAGIS user name and encrypted password are sent to WS-PDS.

• WS-PDS queries the PTAGIS LDAP server with credentials for an authorization role (Data Coordinator)

• If authenticated and authorized, upload session will continue

• If not authenticated or authorized, upload session is terminated at both service and client. Condition is logged on both client and server

• PTAGIS personnel can be alerted to any failed connection attempts

Page 38: M4 Design and Features Review January 2007

Data Submission Step 5: Upload Outstanding Data to PTAGIS

Once the Upload Manager is connected and authorized, it must determine what data to submit:

• Each data message and topology version has a status flag indicating if it has been previously uploaded

• All new topology and message data are packaged together into an XML file (preserving referential integrity) and transferred to the WS-PDS service

• The WS-PDS service verifies the XML file integrity with a file hash:
  – If file is not valid, it requests a resubmission from the client
  – If file is valid, both client and server consider this a success (in case connection is broken or database is offline)

• WS-PDS loads data in XML file into staging database on server
  – XML file is preserved on server for integrity
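
The file-hash check in this step might look like the sketch below. The slide does not name a hash algorithm; SHA-256 is an assumption here, and the function names `package` and `verify` and the sample `<M4Data/>` payload are hypothetical.

```python
import hashlib


def package(xml_text):
    """Client side: return the payload bytes and their SHA-256 hex digest."""
    payload = xml_text.encode("utf-8")
    return payload, hashlib.sha256(payload).hexdigest()


def verify(payload, claimed_digest):
    """Service side: True if the received payload matches the digest sent with it."""
    return hashlib.sha256(payload).hexdigest() == claimed_digest


# Hypothetical round trip: the client packages, the service verifies.
payload, digest = package("<M4Data/>")
accepted = verify(payload, digest)
```

If `verify` returns False, the service would ask the client to resubmit; if it returns True, both sides can treat the transfer as complete even if the later database load is delayed.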

Page 39: M4 Design and Features Review January 2007

Data Submission Step 6: Upload Session Feedback

• The WS-PDS service provides asynchronous feedback to the Upload Manager residing on the client indicating any exceptions or the success of the loaded data

• If success is reported, Upload Manager updates the status field for all records in the client database that were uploaded in the session.

• The Upload Manager records the session’s success or failure in the client database and Windows Event Log viewer

• Custom alerts can be configured to notify users via email that an upload session failed.

Page 40: M4 Design and Features Review January 2007

Legacy Data Migration

1. Initiate Migration
2. Load New Topology Data (alert PTAGIS personnel)
3. Load New Message Data
4. Update Staging Data State
5. Compact Staging Data

[Diagram: the PIT Data Migration Service moves data from the PIT Staging Database (PTAGIS Staging) into Legacy PTAGIS. Arrow labels: 1. Initiate; 2. Load New Topology To SiteConfig (PTAGIS3), with Alert; 3. Load New Message Data (Int. File, Home/Ptoc/Stage, PTTP, IDL); 4. Update; 5. Compact (M4Data Archive.XML)]

Page 41: M4 Design and Features Review January 2007

Data Migration Step 1: Initiate Migration

• Staging database hosts a custom SQL Server Integration Services (SSIS) service called PIT Data Migration Service (PDMS)

• PDMS can be configured and maintained from SQL Server Management Studio or custom application interface

• PDMS service is initiated automatically upon a user-defined schedule to correspond with existing IDL service for optimum processing of data

• PDMS service can be initiated manually by PTAGIS personnel

Page 42: M4 Design and Features Review January 2007

Data Migration Step 2: Load New Topology Data

• PDMS service performs a query within staging database to determine if any new topology data needs to be migrated

• A report is generated providing a summary of topology changes at each site and emailed to target PTAGIS personnel

• PDMS connects to the PTAGIS3 database and inserts the new topology data directly into SiteConfig schema

• PDMS will alert PTAGIS personnel to any errors or faults

• If PDMS cannot migrate topology data, the migration session is aborted (no data will be uploaded until problem resolved)

Page 43: M4 Design and Features Review January 2007

Data Migration Step 3: Load New Message Data

• PDMS generates an in-memory dataset of all new message data that corresponds to PTAGIS interrogation data specifications with these configurable options:
  – Generate real-time tag records only (this could be set on a site-by-site basis)
  – Sites to exclude (can be set for period of time)
  – Limit number of real-time tags per second (Unique Off)

• PDMS transforms dataset into standard PTAGIS interrogation data files with these configurable options:
  – Allow interrogation data files to span multiple days (generates fewer files to load)
  – Suppress interrogation files that do not contain interrogation records

• PDMS submits interrogation records for traditional PTAGIS loading:
  – Generates XML header for PTTP loading and puts them into staging directory
  – Submits them directly to IDL

• This method of loading data will ease deployment of M4 with existing client applications (MiniMon/MultiMon)

Page 44: M4 Design and Features Review January 2007

Data Migration Step 4: Update Staging Data

• PDMS service updates data in Staging Database to prevent it from being migrated when the service runs again

• PDMS service can provide a utility to allow a manual reset of select data in Staging Database for reloading to PTAGIS3

• Each data record has a status field that will be used to indicate state:
  – New: (default) generated and stored in M4 Client Database
  – Uploaded: transferred from M4 Client Database to Staging Database
  – Migrated: migrated into PTAGIS3 database
  – Compact: record is compacted in the Staging Database

• The PDMS service logs the success or failure of the migration session to be used for administrative reporting
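
The status field behaves like a small state machine. The sketch below uses the four state names from this slide; the forward order New, Uploaded, Migrated, Compact follows the descriptions, while the single reset path (Migrated back to Uploaded, modeling the manual reset utility) is an assumption about how a reset would be recorded. The names `Status`, `ALLOWED`, and `advance` are hypothetical.

```python
from enum import Enum


class Status(Enum):
    NEW = "New"
    UPLOADED = "Uploaded"
    MIGRATED = "Migrated"
    COMPACT = "Compact"


# Forward transitions follow the slide. The Migrated -> Uploaded entry models
# the manual reset for reloading to PTAGIS3 and is an assumption.
ALLOWED = {
    Status.NEW: {Status.UPLOADED},
    Status.UPLOADED: {Status.MIGRATED},
    Status.MIGRATED: {Status.COMPACT, Status.UPLOADED},
    Status.COMPACT: set(),
}


def advance(current, target):
    """Return the target status if the transition is allowed, else raise."""
    if target not in ALLOWED[current]:
        raise ValueError(f"cannot move from {current.value} to {target.value}")
    return target
```

Because only records in the Uploaded state are candidates for migration, marking a record Migrated is what prevents it from being picked up again on the next run.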

Page 45: M4 Design and Features Review January 2007

Data Migration Step 5: Compacting the Staging Database

PDMS service will initiate a sub-service, either scheduled or user-driven, to compact the M4 data in the Staging Database:

– All message and topology data older than a designated period of time will lose all of their ancillary data, retaining only the minimum data (keys and state) to prevent duplication.

– Before data is compacted, PDMS will generate an XML file representing M4/MobileMonitor 2.0 data for possible future use.

– The staging database is designed to be a temporary store to facilitate data submission and migration. It is not intended to serve data to a web application.

– If end-users want to use M4/MobileMonitor 2.0 data, they should maintain it on the M4 Client Database.
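
Compaction as described above can be sketched as follows. Records are modeled as plain dictionaries, and the field names (`key`, `status`, `timestamp`) and the function name `compact` are hypothetical. The XML archive that PDMS writes before compacting is not shown.

```python
from datetime import datetime, timedelta


def compact(records, now, retention):
    """Strip ancillary fields from records older than the retention period."""
    result = []
    for rec in records:
        if now - rec["timestamp"] > retention:
            # Keep only the key and state, which is enough to detect a duplicate
            # if the same record is ever submitted again.
            result.append({"key": rec["key"], "status": "Compact"})
        else:
            result.append(rec)
    return result
```

A record inside the retention window is returned untouched, so the operation can be run repeatedly on a schedule.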

Page 46: M4 Design and Features Review January 2007

M4 Failover Services

To meet continuous operational requirements, M4 can provide automatic failover with a redundant (clustered) server

• Supports two types of failover conditions:
  – System or application fault
  – Planned failover for server maintenance

• Failover service has specific use case scenarios:
  – Interrogation sites that perform Separation-by-Code operations
  – Interrogation sites that collect a large segment of PTAGIS data and require operational redundancy

• Failover is integrated into M4 as a configurable option:
  – Does not require overhead of maintaining multiple software versions of the same application
  – By default, Failover Services are disabled in M4 to reduce complexity for casual end-user

• Failover Services should not impact system performance

Page 47: M4 Design and Features Review January 2007

Failover Service Architecture and Features

• Two redundant systems host independent M4 monitoring services
  – Data is duplicated in separate local databases
  – Both process SxC requests

• Both systems capture same Serial I/O via Ethernet using DeviceMaster

• Only one system communicates to a PLC device to provide SxC gate control

• Two monitoring services communicate health status via heartbeat channel
  – Uses private network with redundant NICs

• M4 Client provides management of failover configuration to end-user

• NTP synchronizes system time between two servers
  – Provides coarse synchronization of the two sets of data via TimeStamp field

[Diagram: an Active Server and a Standby Server, each with MS Operating System, Database, and M4 Site Monitor Service (Device.dll, SxC.dll, Failover.dll). Other labels: Private Network, Heartbeat, NTP, Public Network, RS-232 Devices, DeviceMaster, PLC, Public or Device Time Server]

Page 48: M4 Design and Features Review January 2007

M4 Failover Assumptions

To reduce complexity and avoid impacting system performance, the following assumptions apply to M4 Failover Services:

• System platforms should be identical and configured for high-performance:
  – Dual or Quad Core, 2GB RAM, RAID
  – Install transaction log of M4 Client Database on separate partition

• Data is not mirrored between two systems:
  – Data events are not synchronized and may not be recorded in same order
  – Data recovery from a failover requires manual user intervention
  – Data is synchronized with scheduled checkpoints to facilitate data recovery

• Separation-by-Code counters are computed independently on two systems
  – Counters could be synchronized if necessary

• Heartbeat communication channel represents the single point of failure
  – No guarantee of failover or gate control if this channel fails

• Topology Versions and SxC Configuration must be identical on both machines
  – M4 will detect a topology version mismatch and abort starting the monitor
  – M4 will provide utilities to push configuration changes between two machines

Page 49: M4 Design and Features Review January 2007

M4 Failover Service States and Cluster Roles

M4 Failover Service Operational States

• Active: monitoring service is controlling separation-by-code gates

• Standby: monitoring service is computing separation-by-code operations but is not controlling the gates

M4 Failover Cluster Roles

Two redundant monitoring services hosted on separate machines will be configured to start as one of two types:

• Primary: service attempts to start in the Active state

• Secondary: service attempts to start in the Standby state and will promote itself to Active if primary system does not respond

Page 50: M4 Design and Features Review January 2007

M4 Failover Service: Primary System Startup Procedures

The Primary server follows this startup procedure:

1. Sends heartbeat message proposing it is the Active service

2. Monitors heartbeat channel for messages from redundant service for a specified period of time (Discovery Period)

3. If it does not receive heartbeat message from an already Active service, it promotes itself as the Active service.

4. If it does receive heartbeat message from an already Active service, it demotes itself as Standby and continues operating in this state.

5. If communication channel fails or no heartbeat message is received at all, it will report failure and send alert, and continue operating in Active state.

Page 51: M4 Design and Features Review January 2007

M4 Failover Service: Secondary System Startup Procedures

The Secondary server follows this startup procedure:

1. Sends a heartbeat message proposing it is the Standby service

2. Monitors heartbeat channel for messages from redundant service for a specified period of time (Discovery Period)

3. If it receives a heartbeat message from an Active service, it resumes operations in Standby state.

4. If it receives a heartbeat message from a Standby service, or receives no message at all, it promotes itself as the Active service.

5. If the heartbeat communication channel fails, it will report the failure and send an alert; however, it will remain in the Standby state.
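
The Primary and Secondary startup procedures can be combined into one decision function. This is a sketch under stated assumptions: the function name `startup_state`, the string values, and the separate `channel_ok` flag (used to tell a detected channel failure apart from a silent peer) are all hypothetical. The function describes the outcome at the end of the Discovery Period.

```python
def startup_state(role, heard, channel_ok=True):
    """Decide the service state once the Discovery Period has elapsed.

    role       -- "primary" or "secondary" (the configured cluster role)
    heard      -- state advertised by the peer: "active", "standby", or None if silent
    channel_ok -- False if the heartbeat channel itself is known to have failed
    Returns (state, alert) where alert is True if a failure should be reported.
    """
    if role == "primary":
        if heard == "active":
            return "standby", False  # Primary step 4: demote to Standby
        # Primary steps 3 and 5: become Active; alert if the peer was silent
        # or the channel failed.
        return "active", (heard is None or not channel_ok)
    if role == "secondary":
        if not channel_ok:
            return "standby", True  # Secondary step 5: report, stay in Standby
        if heard == "active":
            return "standby", False  # Secondary step 3: remain in Standby
        return "active", False  # Secondary step 4: promote to Active
    raise ValueError(f"unknown role: {role!r}")
```

Note the asymmetry: on a channel failure the Primary keeps control of the gates while the Secondary stays passive, which avoids two services both trying to act as Active.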

Page 52: M4 Design and Features Review January 2007

M4 Failover Service: Failover System Procedures

When an Active service fails, the following occurs:

• The failed active service stops sending heartbeat messages

• The standby service notices the active service has not sent a heartbeat message in a specified amount of time (Failover Interval) and promotes itself as active and takes control of the PLC.

• The new active service reports the error and sends any alerts indicating the condition.

When a Standby service fails, the following occurs:

• The standby service stops sending heartbeat messages

• The active service notices the standby service has not sent a heartbeat message in a specified amount of time (Failover Interval) and reports the error and sends any alerts indicating the condition.
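
Both failure cases reduce to a timeout on the last heartbeat received. A minimal sketch, with times given as plain numbers of seconds; the class `HeartbeatWatch` and the function `on_tick` are hypothetical names, and taking control of the PLC is represented only by the returned state.

```python
class HeartbeatWatch:
    """Tracks when the peer service was last heard from."""

    def __init__(self, failover_interval, now):
        self.failover_interval = failover_interval
        self.last_seen = now

    def beat(self, now):
        """Record a heartbeat message received from the peer."""
        self.last_seen = now

    def peer_failed(self, now):
        """True if no heartbeat arrived within the Failover Interval."""
        return now - self.last_seen > self.failover_interval


def on_tick(state, watch, now):
    """Return (new_state, alert) for a periodic check of the heartbeat channel."""
    if not watch.peer_failed(now):
        return state, False
    if state == "standby":
        # The active peer is gone: promote and take over gate control.
        return "active", True
    # Already active: the standby peer is gone, so only report it.
    return state, True
```

A real service would also need to avoid sending the same alert on every tick; that bookkeeping is omitted here.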

Page 53: M4 Design and Features Review January 2007

M4 Failover Service: Configuration

The M4 Client provides the end-user with management of the following Failover Service Configuration Settings:

SETTING                 DESCRIPTION
Cluster Role            Primary or Secondary: determines which role a server will play in the failover scheme.
Channel Configuration   TBD: transport and application level configuration for the network communication channel.
Heartbeat Interval      How often, in seconds, heartbeat messages are sent to the standby server.
Discovery Period        Period, in seconds, during which the active server is determined using a promotion/demotion scheme.
Failover Alerts         A list of email addresses to which alert messages are sent on failover. An empty setting disables failover alerts.
Failover Interval       Period of time to wait to determine if a failover event has occurred.
Checkpoint Interval     Determines how often a checkpoint is sent between the two services.

Page 54: M4 Design and Features Review January 2007

M4 Failover Service: Operational Control

• M4 Client provides user with simultaneous operational control of both clustered monitors (start, stop, pause)
  – User selects the control commands from File Menu
  – Prevents failover contention

• Users can enable/disable failover on both servers

• Topology Viewer displays both monitors
  – Identifies Active and Standby
  – Remote monitor has limited display features

• Right-clicking a monitor in Topology Viewer provides independent operational control with a context-menu:
  – Allows for planned shutdown
  – Issues an immediate checkpoint
  – Failover will not report failure or send any alerts
  – Context-menu provides hint: ‘Stop this Monitor’

Page 55: M4 Design and Features Review January 2007

M4 Failover Service: Standard Operating Procedures

• User must take care to provide the same topology version to each server
  – M4 Client will provide utility to push topology and SxC changes to both servers

• User must configure Failover Service correctly
  – Servers must be identified with correct Primary and Secondary roles
  – Configuration manager will provide a test utility for heartbeat communication

• Perform a manual failover for planned shutdown of an Active server

• Reporting concurrently with SxC processing should be performed on failover server to avoid impacting performance

• Only the Primary server should be configured to automatically upload data to PTAGIS
  – Submitting data from the Secondary can result in duplicate data

• The Secondary server should be used for patching data after a failover event

• Normal startup and shutdown of systems should use dual operational controls from menu to start/stop both monitors simultaneously

Page 56: M4 Design and Features Review January 2007

M4 Failover Service: Data Recovery

• M4 Client will provide a Data Recovery tool to facilitate manual patching of data after a failover event
  – Data Recovery tool will provide side-by-side viewers of both databases
  – Viewers will be aligned by checkpoints and timestamps
  – User will select patch data from viewer and upload it to PTAGIS