detection, isolation, and recovery of sensor...

Detection, Isolation, and Recovery Detection, Isolation, and Recovery of Sensor Performance Degradationof Sensor Performance Degradation

Research Assistants/Staff: Faculty:L. JIANGL. JIANG D. D. DjurdjanovicDjurdjanovic, , JJ.. NiNi

For more information, contact Prof. J. Ni; Phone: 734-936-2918; Email: [email protected]

ObjectivesObjectives

StateState--ofof--thethe--ArtArt

Electronic Automotive Throttle SystemElectronic Automotive Throttle System

Future WorkFuture Work

• An ever-increasing number of sensors have been employed in dynamic systems to ensure correct control functionalities and diagnosis.

• Nevertheless, a sensor itself degrades and fails, just as any other dynamic system.

• A faulty sensor may cause undesirable system performance, process shut down, or even a fatal accident.

Instrument Fault Detection and Isolation Techniques

Hardware Redundancy

Analytical Redundancy

Model-based method

Knowledge-based expert system

Process history based method

Parity relations

Luenberger observer and Kalman filter

Parameter estimation

Multivariate statistical methods

Bayesian belief networks

Artificial neural networks

Desai and Ray (1984) Glockler et. al (1989)

Qin and Li (1999, 2001)

Van Dirk (1994) Upadhaya and Kerlin (1978)

Lee (1994) Betta et al. (1992)

Dunia et al., 1996 Ibarguengoytia et al. (2006) Mehranbod et al. (2003)

Betta et al., (1998) Napolitaneo et al. (1994)

ApproachesApproaches

Fit a proper model to the data within the moving window by subspace model based method

x(t+Ts) = A(t)x(t)+B(t)y(t)+K(t)e(t)y(t) = C(t)x(t)+D(t)y(t)+e(t)

Extract the time constants of the monitored system and the sensor from the poles of the

corresponding transfer function

Degrading Monitored System

Normal Compound System

Changes in slow dyanamics?

Y

Degrading Sensor

Changes in fast dyanamics?

Y

ANDN N

T

Assumption 1: There is no nonlinearity included in the system

Assumption 2: Sensor dynamics is much faster than that of the plant

Detection and Isolation of Dynamic ChangesDetection and Isolation of Dynamic Changes

Detection and Isolation of Gain ChangesDetection and Isolation of Gain Changes

Assumptions 3: The process disturbances and measurement noise are white noise

Monitored Plant Gp(s) Sensor Gs(s)

Control Signals u(t)

Sensor Measurements y(t)

Process Disturbances wp(t)

Measurement Noise wn(t)

++ +

+

( ) ( ( ) ( )) ( )p s p s nY s G G U s W s G W s= + +

Confidence Value

10 20 30 40 50 60 70 80 90 100

0.2

0.4

0.6

0.8

1

10 20 30 40 50 60 70 80 90 100

0.2

0.4

0.6

0.8

1

Time (sec)

Time Constants

0 10 20 30 40 50 60 70 80 90 1000.16

0.18

0.2

0.22

0.24

0 10 20 30 40 50 60 70 80 90 1000.02

0.03

0.04

0.05

Time (sec)

Sen

sor

Pla

nt

0 10 20 30 40 50 60 70 80 90 100

0.8

1

1.2

1.4

p

0 10 20 30 40 50 60 70 80 90 1000.4

0.6

0.8

1

1.2

1.4

Normalized Gains

Time (sec)

Sens

orPl

ant

10 20 30 40 50 60 70 80 90 1000

0.2

0.4

0.6

0.8

1

10 20 30 40 50 60 70 80 90 1000

0.2

0.4

0.6

0.8

1

Confidence Value

Time (sec)

• Develop methods for the case when the process disturbances and measurement noise are colored noise processes.

• Extend this method to detect and isolation degraded sensors in presence of nonlinearities in the monitored system

A Watchdog AgentA Watchdog Agent™™ Toolbox for Toolbox for MultisensorMultisensor Performance AssessmentPerformance Assessment

Research Assistants/Staff: Faculty:K. JohnsonK. Johnson D. D. DjurdjanovicDjurdjanovic,, J. NiJ. Ni





AccomplishmentsAccomplishments

Future WorkFuture WorkSponsorsSponsors

• The Watchdog Agent™ Toolbox is a software program that has been created to assess and predict system performance in various conditions and easily assess which combination of algorithms is the best fit for a particular application.

• Several interchangeable tools have been developed in Matlab and compiled into a standalone application. The tools that have been integrated can be seen below in the main window of the Watchdog Agent™ Toolbox.

• The software takes as input sensor data from the current, normal, and faulty operation of the equipment, and outputs the current level of degradation of the system, an ARMA prediction of future degradation, and a sensitivity analysis, ormeasure of affinity, for the chosen algorithms.

• The Watchdog Agent™ Toolbox has several functional layers, shown below, compliant with the Open System Architecture for Condition Based Maintenance (OSA-CBM).

• Modularization of the toolbox allows different methods of signal processing, feature extraction, performance evaluation, and sensor fusion to be used in any combination.

• A method for analysis of discrete event data will be added to the toolbox.

• NSF Industry/University Collaborative Research Center on Intelligent Maintenance Systems

• National Instruments

Data Acquisition National Instruments PXI system

Data Manipulation Time frequency, wavelet, FFT analysis

Condition Monitor Not currently integrated

Health Assessment

Statistical pattern recognition, logistic regression, sensor fusion

Prognostics ARMA modeling

Decision Support Not currently integrated

Presentation Graphical user interface (GUI) • Shown below is the degradation analysis from 1100 cycles of a boring tool at DaimlerChrysler; two methods of performance assessment are compared.

Statistical Pattern Recognition Logistic Regression• The following picture shows the National Instruments PXI

system which is used for data acquisition.

tool replacement tool

wear

High speed (up to 8 MHz) simultaneous sampling

PC based data acquisition

Fieldbus

Zone PLC

EthernetSwitch/Router

Ethernet

Switches Valves Pumps Motor DrivesRobot/MTControllers

“Polling”Status Monitoring

COSReporting

COSReporting

Embedded Watchdog AgentEmbedded Watchdog Agent™™ for for Industrial Automation SystemsIndustrial Automation Systems

Research Assistants/Staff: Faculty:Y. Lei Y. Lei D. D. DjudjanovicDjudjanovic, , J. NiJ. Ni







• Provide diagnostic & predictive information in a networked industrial automation system at the device level.

• Develop a methodology to evaluate effects of device-level embedded prognostics

• Create methods to minimize maintenance-related costs through strategic allocation of failure risks associated with basic embedded prognostics.

• Built two testbeds for verification and demonstration of effects and benefits of the basic prognostics in Embedded Watchdog Agent

• The Embedded Watchdog Agent™ solution

• The risk allocation problem

• Due to bandwidth limitations, limited amount of maintenance related information is available in an industrial networked automation systems.

• Inherent intelligence of networked devices is not utilized for maintenance purposes.

Level 1 & 2 EWA Testbed Sponsored By Rockwell Automation and GM

Smart DeviceNet Device Testbed Sponsored By Omron

PLC

Level 2 EWA I/O

Block

Cylinders

& Valve

PLC

Level 2 EWA I/O

Block

Cylinders

& Valve

Smart DeviceNet DeviceSmart DeviceNet Device

• Developed a risk allocation methodology for industrial automation systems with basic prognostics

340000

342000

3420

00

342000

3420

00

342000

342000

342000

344000

344000

3440

00

344000

344000

3440

00

344000

3440

00

3440

00

344000

344000

344000

344000

346000

346000

346000

3460

00

346000

346000

346000

346000

346000

3460

00

3480

00

348000

Failure Risk of Type 1 Machine (%)

Failu

re R

isk

of T

ype

2 M

achi

ne (%

)

0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

(0.0061, 0.0162) GA Searching Path

340000

342000

3420

00

342000

3420

00

342000

342000

342000

344000

344000

3440

00

344000

344000

3440

00

344000

3440

00

3440

00

344000

344000

344000

344000

346000

346000

346000

3460

00

346000

346000

346000

346000

346000

3460

00

3480

00

348000

Failure Risk of Type 1 Machine (%)

Failu

re R

isk

of T

ype

2 M

achi

ne (%

)

0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09

0.01

0.02

0.03

0.04

0.05

0.06

0.07

0.08

0.09

(0.0061, 0.0162) GA Searching Path

Risk Allocation of Risks of Failure Using Genetic Algorithm

• Developed Dynamic Risk allocation method using Basic Embedded Prognostics

• Conduct and improve dynamic optimal risk allocation using Basic Embedded Watchdog Agent™ Prognostics in complex industrial automation systems.

• Conduct industrial test and cost effect analysis of basic prognostics Embedded Watchdog Agent™.

• Defined 3 levels of EWA complexity

Flowchart for optimal risk allocation based on a Genetic Algorithm

Machine

Buffer

Machine

Buffer

??

???

?

??

OptimizationOptimization

Level 1:Operation/cycle counter

Level 2: Operation/cycle counter AND Cycle Timer

Level 3: Trending and data streaming

N N+1

Time

1 (on)

0 (off)

N N+1

Time

N N+1

Time

1 (on)

0 (off)

N N+1

Time

ΔT ΔTN N+11 (on)

0 (off)

N N+1

Time

ΔT ΔTN N+11 (on)

0 (off)

FeatureExtraction

Multisensor Perf. Evaluation

Performance Assessment

Learning

Performance Prediction

SensorySig. Proc.

Condition Diagnosis

FeatureExtraction

FeatureExtraction




Learning


SensorySig. Proc.Sensory

Sig. Proc.

Condition Diagnosis

Basic P

rognostics

Advanced Prognostics

Basic P

rognostics

Advanced Prognostics

Table: Risk levels assigned to machines of Type 1 & Type 2

layout of the system considered

NetworkFeature

Extraction



Failure modeDatabase

Watchdog Agent

NetworkNetworkFeature

ExtractionFeature

Extraction







Watchdog Agent

Network Watchdog AgentNetwork Watchdog Agent™™ for for Industrial Automation SystemsIndustrial Automation Systems

Research Assistants/Staff: Faculty:Y. Lei Y. Lei D. D. DjudjanovicDjudjanovic, , J. NiJ. Ni



MotivationMotivation




• Develop and evaluate novel network health monitoring tools for industrial automation systems

• Develop methodology that could assess and predict the performance of industrial automation networks.

• Identified key failure modes that occur in the networked industrial automation systems

• Proposed data acquisition system

• Nearly 54% of failures in a networked industrial automation system occur due to faults on the network itself.

• There is need to create a predictive agent for assessment and prediction of performance of an industrial automation network.

• Current tools are not designed to provide such capabilities

• Develop NWA for the data link layer

• Develop diagnosis methodology using data from both data link layer and physical layer

• Testbed validation and demonstration for NWA.

Measurement of key parameters of DeviceNet

PC

DeviceNet Interface Card

Data Acquisition Card

Statistic DataError Frame Ratio

Bus loadBus VoltageOvershoot

Signal-To-Noise Ratio

Framework of the Network Watchdog Agent™

• Connection problems, including loose connection of cable connectors and terminating resistors• Corrupt frames, including damaged transceivers, grounding problems, electrical Magnetic Interference (EMI)

• Identified key parameters that affect the performance of networked industrial automation systems

• Bus voltage and current• Common Mode Signal features• Waveform of Digital signal

• Preliminary results of one lost terminating resistor

-2 0 2 4 6 8 10 12 14 16

x 10-6

1.5

2

2.5

3

3.5

4(a) Normal DeviceNet Signal Profile

Time(s)

Vol

taag

e(V

)

CANLCANH

-2 0 2 4 6 8 10 12 14 16

x 10-6

1

1.5

2

2.5

3

3.5

4

(b) DeviceNet Signal Profile When Loosing One Terminating Resistor

Time(s)

Vol

tage

(V)

CANLCANH

(b) DeviceNet Signals When Loosing One Terminating Resistor

Time (s)

Time (s)

(a) Normal DeviceNet Signals

Vol

tage

(V)

Vol

tage

(V)

-2 0 2 4 6 8 10 12 14 16

x 10-6

1.5

2

2.5

3

3.5

4(a) Normal DeviceNet Signal Profile

Time(s)

Vol

taag

e(V

)

CANLCANH

-2 0 2 4 6 8 10 12 14 16

x 10-6

1

1.5

2

2.5

3

3.5

4

(b) DeviceNet Signal Profile When Loosing One Terminating Resistor

Time(s)

Vol

tage

(V)

CANLCANH

(b) DeviceNet Signals When Loosing One Terminating Resistor

Time (s)

Time (s)

(a) Normal DeviceNet Signals

Vol

tage

(V)

Vol

tage

(V)

Comparison of DeviceNet signals

under normal condition and with

one lost terminating resistor

Feature cluster under normal condition and

with one lost terminating resistor

Data collection system

• National instruments digital oscilloscope

• SST DeviceNet interface card

0 100 200 300 400 500 600 7000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

conf

iden

ce v

alue

time (second)

Plant AD

confidence valuecontrol limit

0 100 200 300 400 500 600 7000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

conf

iden

ce v

alue

time (second)

Controller AD

confidence valuecontrol limit

Anomaly Detection and Localization Anomaly Detection and Localization for Dynamic Control Systemfor Dynamic Control System

Research Assistants/Staff: Faculty:JJ. . LiuLiu J. NiJ. Ni







SponsorsSponsors

• Detecting and localizing unknown faults i.e. anomalies

• Sensing gradual performance degradation of controlled dynamic system

• Discriminating whether the performance deviation is caused by control software or plant (hardware)

• Developing effective diagnostics that can be applied comprehensively from product design through design validation, manufacturing testing on-board monitoring and finally repair service operating

• Ability to detect gradual changes of the system parameters as the dynamic system operates.

• Ability to detect faults that were not observed before (i.e. detect anomalies) using the same generic approach.

• No prior assumptions were made about the system behavior and the model of system behavior was learnt from a normal driving profile.

• Ability to detect and isolate of controller (software) anomalies from the plant (hardware) anomalies.

• With the growing complexity of both plants and control systems, effective diagnostics for all possible failures has become increasingly difficult and time consuming.

• As a result, many systems rely on limited diagnostic coverage provided by a diagnostic strategy which test only for known or anticipated failures, and presumes the system is operating normally if the full set of tests is passed.

• In addition, these tests are often developed separately and at large costs in terms of time and resources after hardware and the control system have been produced and are available for analysis.

• Development of effective method for intermittent fault detection

Inputs

Initial Conditions

OutputsDynamic System

Region of Operation

SOMs TFA & PCA

Performance Index

• The new anomaly detection method relies on the use of Self-Organizing Maps (SOM) for regionalization of the system operating conditions, followed by Time-Frequency Analysis (TFA) and Principal Component Analysis (PCA) for anomaly detection and fault isolation

• The proposed approach consists of two elements: anomaly detection which identifies whether a deviation from normal operation has occurred and fault isolation, which isolates the problem, as best as possible, to the specific component or subsystem that has failed.

• This project is supported in part by Center for Intelligent Maintenance System at University of Michigan, Ann Arbor and ETAS Inc., Ann Arbor

Method Overview

Anomaly Detection and Localization

Driving Profile:FTP75

AD on controller AD on plant

Gradual decrease in the gain factor (parameter of controller)

0 500 1000 1500 2000 2500 3000 3500 40000

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1

conf

iden

ce v

alue

time (seconds)

cv from detector for F1 cv from detector for F2cv from detector for F0control limit

Fault isolation

0 500 1000 1500 2000 2500 3000 3500 40000

0.5

1

1.5

2

2.5

3

time (seconds)

Inde

x

F0

F1

F3

F2

F0

Index Category

F0 Normal

F1 Reduced K

F2 Increased C

F3 A unknown fault

Data Fusion and Predictive Modeling for Data Fusion and Predictive Modeling for Intelligent Maintenance in Complex Intelligent Maintenance in Complex

Semiconductor Manufacturing ProcessesSemiconductor Manufacturing Processes

Research Assistants/Staff: Faculty:YY. . Liu Liu J. Ni, D. DjurdjanovicJ. Ni, D. Djurdjanovic







SponsorsSponsors

• To develop system modeling and decision support tools that integrate different data domains in a semiconductor fabrication facility into coherent performance information that will facilitate rapid and proactive countermeasures for any problem that will adversely affect the yield and uptime

• To coordinate and synthesize the integrated information from all stations into system level information that will be used to make dynamic, accurate and cost-effective maintenance scheduling and product routing decisions

• Framework for data integration and probability inference

Process

Action

Inspection

Inspection

Action

Detection

Sensor

Action

Diagnostics

Historical DB

Action

Reliability Info

ProcessInspection

QualityInspection

EquipmentMonitoring Reliability

Uncoordinated& ReactiveActions

Uncoordinated& ReactiveActions

Prognostics Tools Decision Support Tools

Monitor > Correlation > Prediction > Prevention > Optimization

P(Yield| Quality, Sensing, Reliability, Process) Decision Tools, Optimization, Synchronization

-Coordinated& ProactiveActions-Rapid Response-High Yield-Zero Downtime-Rapid Cycles

-Coordinated& ProactiveActions-Rapid Response-High Yield-Zero Downtime-Rapid Cycles

Now

Prop

osed

Paradigm Shift –High Yield Next Generation Fab

Historical DB

Action

Maint. Info

Maintenance

• Many yield models based on defect detection and critical area analysis have been developed, in which, however, only defect inspection data are utilized

• Researchers have attempted to predict yield using localized in-line data or in-situ measurements of particle contamination, which only give a partial picture of the process degradation

• Majority of maintenance operations are still based on either historical reliability of equipment, or diagnostic information from equipment performance signatures extracted from in-situ sensors

• Improved yield prediction from integrated information flow using enhanced Bayesian networks

• Enhanced maintenance decision support tools based on improved yield estimation model

• Validation of results in an industrial testbed

Semiconductor Research Corporation (SRC)

• Performing case study using semiconductor dataset

• Handling incomplete data without compromising inference accuracy

• Periodically updating inference structure when real inspection becomes available

• Simulation study to validate the proposed method

• Application to automotive industry dataset

Embedded Life Cycle Unit for Embedded Life Cycle Unit for Prognostics of Mechanical Prognostics of Mechanical

SystemsSystemsResearch Assistants/Staff: Faculty:R. R. Muminovic Muminovic Dr. Dr. D. D. Djurdjanovic Djurdjanovic / Prof. J. Ni/ Prof. J. Ni


Condition Based MaintenanceCondition Based Maintenance

• Shift the maintenance policy from reactive to proactive• Assess current performance of a railroad bogie bearing

Sensors

Maintenance

Central Control Station

Signal Pre-Processingand Analysis (LCU & Diagnostic Algorithms)

PerformanceAssessmentand Prediction

ApproachApproach

Diagnostic and Prognostic Algorithms

Life Cycle unit and diagnostic/prognostic algorithms

Condition based maintenance policy

Feature extraction – Time-frequency distribution

• Time-frequency representation is a view of a signal represented over both time and frequency.

Normalized time

Nor

mal

ized

freq

uenc

y

• Time-frequency analysis is particularly suitable for nonstationary signals whose frequency content varies over time

• High identification rate of functional and defective bearing

• Improved maintenance policy through condition based maintenance

Rifet Muminovic <[email protected]>Prof. Jun Ni <[email protected]>Dr. Dragan Djurdjanovic <[email protected]>Daniel Odry <[email protected]>

Measured vibrations

Measurement: Measured vibration and given RPM

• RPM was varied with a sinusoidal input

• Sinusoid has a period of T=60s

• Experiment was conducted with five different loads acting on the bogie cage

Support Vector Machine (SVM)Input Space Feature Space• Training vectors are

mapped into a higher dimensional space with help of a kernel

• SVM algorithm creates a maximum-margin hyperplane that separates data into two classes

Testbed

* Mapping between input and feature space

Experimental setup

MeasurementMeasurement

Expected ResultsExpected Results

For more information, contact:For more information, contact:

Classification MethodClassification Method

Immune Systems Engineering Immune Systems Engineering for Complex Dynamic Systemsfor Complex Dynamic Systems

Research Assistants/Staff: Faculty:R. Torres R. Torres J. NiJ. NiD. DjurdjanovicD. Djurdjanovic




ApproachApproach



SponsorsSponsors

• Develop novel approach to realize “immune systems”functionalities in automated systems

• Robustly detect abnormal behaviors, isolate its source, and compensate for negative effects to achieve desired performance in spite of the presence of an anomaly

• Apply local adaptive controller to the local models

• Vary parameters to simulate anomalies

• Achieve desired performance despite anomalies

• Apply same approach to practical data (i.e. automotive engine)

National Science Foundation

• Nonlinear plant is regionalized into several local linear models through “divide and conquer” methodology

• Local models are analyzed for fault detection and isolation

• Adaptive local controllers are applied per region to achieve desired performance

Benchmark Problem• Continuously Stir Tank Reactor

(CSTR) Model is used to serve as the plant for preliminary study

• Inputs (flow rate, qr and heat transfer removal rate, QC) are used to control output (concentration Cb)

Model using Growing Structure Multiple Model System (GSMMS) algorithm

Reconfigure control system to restore desired performance

Identify anomalies via Anomaly Detection Agents

Diagnose and isolate faults

Are modeling errors large?

Continue control with current GSMMS model

YESNO

Local Linear Models of Nonlinear SystemOperational Space

Local Model

w1

1 1 1ˆ= −e y y

Adaptive Local Controller

(C1)

Local Model(A1)

1y

Updating SchemeVia AD1

NominalController

v1

∑

ref

qr

Cb

QC

CSTR Schematic

Modeling Results

• Recursive Learning for both GSMMS and Fuzzy Logic have been performed on CSTR model

• The index, variance accounted for (VAF), has been used to quantify the deviation of the models from the actual plant

(Note: the VAF of two equal signals is 100%)

Comparison of ResultsGSMMS

Computation Time = 2.01 sec

VAF = 99.1833 %

FuzzyComputation Time = 29.87 sec

VAF = 99.7115 %

50 55 60 65 70 75 80 85 90 95 1000.4

0.5

0.6

0.7

0.8

0.9

1

Time (min)

CB

CB vs Time

ActualGSMMSFuzzy

Results using GSMMS and Fuzzy

detection, isolation, and recovery of sensor...

Documents