Fault Detection and Diagnosis
Using Information Measures
Rudolf Kulhavý
Honeywell Technology Center &
Institute of Information Theory and Automation
Prague, Czech Republic
Outline
• Probability-based inference revisited
  Fundamentals of information geometry
• Finite-memory inference
  Minimum Relative Entropy (MRE) approximation
• Implementation
  Markov Chain Monte Carlo (MCMC) methods
• Brute-Force Alternative
  Monte Carlo Again: Weighted Bootstrap
Likelihood-based Inference
• General regression
• Model
  $s(y \mid z, \theta), \quad \theta \in T \subset R^n, \qquad z_k = z(y_{k-1}, u_k), \quad k = m+1, \ldots, N+m$
• Likelihood function
  $l_N(\theta) = q(y_{m+1}, u_{m+1}, \ldots, y_{N+m}, u_{N+m} \mid \theta) = c \prod_{k=m+1}^{N+m} s(y_k \mid z_k, \theta)$
• Posterior density (a numerical sketch follows this slide)
  $p_N(\theta) = c \, p_0(\theta) \, l_N(\theta)$
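The posterior recursion above is easy to exercise numerically. Below is a minimal sketch, not from the slides, that evaluates $p_N(\theta) = c\,p_0(\theta)\,l_N(\theta)$ on a grid for an assumed first-order Gaussian autoregression; the model, the flat prior, and all values are illustrative.

```python
import numpy as np

# Minimal sketch of p_N(theta) = c * p_0(theta) * l_N(theta) on a grid, for an
# assumed first-order autoregression y_k = theta * y_{k-1} + e_k, e_k ~ N(0, 0.1^2).
# Model, prior, and parameter values are illustrative, not from the slides.

rng = np.random.default_rng(0)
theta_true, sigma_e = 0.8, 0.1
y = np.zeros(201)
for k in range(1, 201):                        # simulate a data record
    y[k] = theta_true * y[k - 1] + sigma_e * rng.standard_normal()

yk, zk = y[1:], y[:-1]                         # regressand y_k, regressor z_k = y_{k-1}
theta = np.linspace(0.0, 1.0, 501)
dtheta = theta[1] - theta[0]

# log l_N(theta) = sum_k log s(y_k | z_k, theta) for the Gaussian model density
loglik = np.array([-0.5 * np.sum((yk - t * zk) ** 2) / sigma_e**2 for t in theta])
post = np.exp(loglik - loglik.max())           # flat prior; shift for numerical stability
post /= post.sum() * dtheta                    # normalize to a density on the grid
print("posterior mean of theta:", (theta * post).sum() * dtheta)
```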
Information-based Inference
• Empirical density
  $r_N(y, z) = \frac{1}{N} \sum_{k=1}^{N} \delta(y - y_k,\, z - z_k)$
• Conditional inaccuracy
  $K(r_N : s_\theta) = \iint r_N(y, z) \log \frac{1}{s(y \mid z, \theta)}\, dy\, dz = -\frac{1}{N} \sum_{k=m+1}^{N+m} \log s(y_k \mid z_k, \theta)$
• Likelihood
  $l_N(\theta) = c_N \exp\left(-N\, K(r_N : s_\theta)\right)$
• Posterior density
  $p_N(\theta) = c \, p_0(\theta) \, l_N(\theta)$
Conditional Inaccuracy

$K(r : s_\theta) = \iint r(y, z) \log \frac{1}{s(y \mid z, \theta)}\, dy\, dz
= \underbrace{\iint r(y, z) \log \frac{r(y \mid z)}{s(y \mid z, \theta)}\, dy\, dz}_{\text{conditional relative entropy}}
+ \underbrace{\iint r(y, z) \log \frac{1}{r(y \mid z)}\, dy\, dz}_{\text{conditional Shannon entropy}}$
Example: Random-Coefficient AR(1)
• Model
  $y_k = (\mu + v_k)\, y_{k-1} + e_k$
• Assumptions
  – $\mu$ is constant
  – $v_k$ is $N(0, \sigma_v^2)$ distributed
  – $e_k$ is $N(0, \sigma_e^2)$ distributed
  – $\theta = (\mu, \sigma_v^2, \sigma_e^2)$
• Theoretical density (a simulation sketch follows)
  $s(y \mid z, \theta) = \frac{1}{\sqrt{2\pi\, \sigma^2(z)}} \exp\left(-\frac{(y - \mu z)^2}{2 \sigma^2(z)}\right)$
  with history-dependent variance $\sigma^2(z) = \sigma_e^2 + \sigma_v^2 z^2$!
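Since the slide gives the model and its theoretical density in closed form, a short simulation sketch may help. The values $\mu = 0.8$, $\sigma_v^2 = \sigma_e^2 = 0.03$ reuse the hypotheses tested a few slides below; everything else is illustrative.

```python
import numpy as np

# Sketch: simulate y_k = (mu + v_k) * y_{k-1} + e_k and evaluate the theoretical
# density s(y|z,theta) with history-dependent variance sigma^2(z) = sigma_e^2 + sigma_v^2 z^2.
# Parameter values mirror the hypotheses tested later; the rest is illustrative.

rng = np.random.default_rng(1)
mu, sig2_v, sig2_e = 0.8, 0.03, 0.03           # theta = (mu, sigma_v^2, sigma_e^2)

N = 1000
y = np.zeros(N + 1)
for k in range(1, N + 1):
    v = np.sqrt(sig2_v) * rng.standard_normal()
    e = np.sqrt(sig2_e) * rng.standard_normal()
    y[k] = (mu + v) * y[k - 1] + e

def s(yv, z, theta=(mu, sig2_v, sig2_e)):
    """Theoretical density: normal with mean mu*z and variance sigma^2(z)."""
    m, s2v, s2e = theta
    var = s2e + s2v * z**2                     # history-dependent variance(!)
    return np.exp(-(yv - m * z) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

print("mean model density along trajectory:", s(y[1:], y[:-1]).mean())
```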
Empirical vs Theoretical Density
[Figure: scatter plot and histogram of the pairs $(z_k, y_k)$, $z_k = y_{k-1}$, over the range $(-2, 4)$, comparing the empirical density with the theoretical one.]
Testing of Various Hypotheses
[Figure: three scatter plots of $y_k$ vs. $y_{k-1}$ with theoretical density contours for the hypotheses $\theta = 0.8$, $\sigma_v^2 = 0.03$, and $\sigma_e^2 = 0.03$.]
Minimum Inaccuracy (MI)
• Unnormalized inaccuracy
  $K(r_N : s_\theta) = \iint r_N(y, z) \log \frac{1}{s(y \mid z, \theta)}\, dy\, dz = -\frac{1}{N} \log l_N(\theta) + \text{const.}$
• Minimize over the exponential envelope $S_\theta$ of tilted densities
  $s_{\theta,\lambda}(y, z) = c_{\theta,\lambda}\, s(y \mid z, \theta)\, \exp\left(\lambda' h(y, z)\right)$:
  $\min_{\lambda \in R^n} K(r_N : s_{\theta,\lambda})$
• [Figure: $r_N(y, z)$ projected onto the envelope $S_\theta$, giving $s_{\theta,\hat\lambda}(y, z)$ above $s_\theta(y \mid z)$.]
• MI coincides with Maximum Likelihood!
Minimum Relative Entropy (MRE)
• Unnormalized relative entropy
  $D(r \,\|\, s_\theta) = \iint r(y, z) \log \frac{r(y, z)}{s(y \mid z, \theta)}\, dy\, dz$
• h-compatible set
  $R_N = \left\{ r : \iint r(y, z)\, h(y, z)\, dy\, dz = h_N = \frac{1}{N} \sum_{k=1}^{N} h(y_k, z_k) \right\}$
• Minimize over the h-compatible set:
  $\min_{r \in R_N} D(r \,\|\, s_\theta)$
• [Figure: $r_N(y, z) \in R_N$ projected onto $s_\theta(y \mid z)$, giving $s_{\theta,\hat\lambda}(y, z)$.]
• MRE generalizes Maximum Entropy!
Unnormalized Relative Entropy

$D(r \,\|\, s_\theta) = \iint r(y, z) \log \frac{r(y, z)}{s(y \mid z, \theta)}\, dy\, dz
= \underbrace{\int r(z) \int r(y \mid z) \log \frac{r(y \mid z)}{s(y \mid z, \theta)}\, dy\, dz}_{\text{conditional relative entropy}}
- \underbrace{\int r(z) \log \frac{1}{r(z)}\, dz}_{\text{marginal Shannon entropy}}$
Information Geometry
• h-projection: $s_{\theta,\hat\lambda}$ matches the data statistic,
  $\iint s_{\theta,\hat\lambda}(y, z)\, h(y, z)\, dy\, dz = \iint r_N(y, z)\, h(y, z)\, dy\, dz = h_N$
• Pythagorean theorem
  $K(r_N : s_\theta) = K(r_N : s_{\theta,\hat\lambda}) + D(s_{\theta,\hat\lambda} \,\|\, s_\theta)$
• [Figure: $r_N(y, z) \in R_N$, its h-projection $s_{\theta,\hat\lambda}(y, z)$ on $S_\theta$, and $s_\theta(y \mid z)$ forming the right-angled triangle.]
Outline
• Probability-based inference revisited
  Fundamentals of information geometry
• Finite-memory inference
  Minimum Relative Entropy (MRE) approximation
• Implementation
  Markov Chain Monte Carlo (MCMC) methods
• Brute-Force Alternative
  Monte Carlo Again: Weighted Bootstrap
MRE Approximation
1. Choose $h(y, z)$ so that $K(r_N : s_{\theta,\hat\lambda}) \approx \text{const.}$ for expected values of $\theta$
2. Approximate $K(r_N : s_\theta)$ via minimum relative entropy
   $D(R_N \,\|\, s_\theta) = \min_{r \in R_N} D(r \,\|\, s_\theta)$
3. Approximate the posterior density
   $\hat p_N(\theta) = c \, p_0(\theta) \, \exp\left(-N\, D(R_N \,\|\, s_\theta)\right)$
[Figure: $r_N(y, z) \in R_N$ projected to $s_{\theta,\hat\lambda}(y, z)$ over $s_\theta(y \mid z)$ and $S_\theta$.]
MRE Algorithm
• Convex optimization problem (easy part; a one-dimensional numerical sketch follows)
  $D(R_N \,\|\, s_\theta) = \max_{\lambda \in R^n} \left[ \lambda' h_N - \psi_\theta(\lambda) \right]$
• Logarithm of the normalizing divisor (difficult part)
  $\psi_\theta(\lambda) = \log \iint s(y \mid z, \theta)\, \exp\left(\lambda' h(y, z)\right) dy\, dz$
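A minimal sketch of the dual task in one dimension, with $\psi_\theta$ computed by quadrature. The model $s(y \mid \theta) = N(\theta, 1)$ and statistic $h(y) = y$ are assumptions chosen so the answer can be checked analytically; they are not from the slides.

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Sketch of the dual task D(R_N || s_theta) = max_lambda [lambda * h_N - psi_theta(lambda)]
# in one dimension, with psi_theta evaluated by quadrature on a grid.

theta, h_N = 0.0, 0.7                          # h_N: empirical mean of h over the data
y = np.linspace(-10.0, 10.0, 4001)
dy = y[1] - y[0]
s_theta = np.exp(-0.5 * (y - theta) ** 2) / np.sqrt(2 * np.pi)

def psi(lam):
    """Log of the normalizing divisor -- the 'difficult part'."""
    return np.log(np.sum(s_theta * np.exp(lam * y)) * dy)

# the dual objective is concave in lambda, so minimize its negative
res = minimize_scalar(lambda lam: psi(lam) - lam * h_N)
print("lambda_hat:", res.x, " D(R_N||s_theta):", -res.fun)
# Gaussian check: psi(lam) = theta*lam + lam^2/2, hence D = (h_N - theta)^2 / 2 = 0.245
```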
Choice of Statistic
• Differencing
  $h_i(y, z) = \log s(y \mid z, \theta_{i+1}) - \log s(y \mid z, \theta_i)$
• Differentiation
  $h_i(y, z) = \omega_i' \operatorname{grad}_\theta \log s(y \mid z, \theta_i)$
• Weighted integration
  $h_i(y, z) = \int w_i(\theta) \log s(y \mid z, \theta)\, d\theta, \qquad \int w_i(\theta)\, d\theta = 0$
Two Simple Hypotheses
$h(y, z) = \log \frac{s(y \mid z, \theta_1)}{s(y \mid z, \theta_0)}$ implies the exponential envelope
$s_\lambda(y \mid z) = c_\lambda\, s(y \mid z, \theta_0)\, \exp\left(\lambda\, h(y, z)\right)$
and the statistic value
$h_N = \frac{1}{N} \log \frac{l_N(\theta_1)}{l_N(\theta_0)}$

[Figure: the exponential envelope connecting $s_{\theta_0}$ and $s_{\theta_1}$, with $r_N$ projected onto it.]
Two Composite Hypotheses
[Figure: an exponential family enveloping the composite hypotheses $H_0$ and $H_1$, with $r_N$ projected onto it at $s_{\hat\lambda}$.]
Construction of h-Statistic: Differencing
[Figure: differencing statistics $h(y, z)$ for the models $y = \theta z + e$ (Cauchy noise), $y = \theta z + v$, $y = \arctan(\theta z) + e$, and $y = \sin(\theta z) + e$; each panel shows $h$ over the $(y, z)$ plane.]
Construction of h-Statistic: Differentiation
[Figure: differentiation statistics $h(y, z)$ for the model $y = \sin(\theta z) + e$ at $\theta = 0.1$, $\theta = 0.2$, and $\theta = 0.4$.]
Example: Sensor Validation
• Monitoring of signal differences
  $e_k = y_k - y_{k-1}$
• Model = mixture of 3 normal distributions
  $(1 - \theta_f - \theta_g)\, N(0, v) + \theta_f\, N(0, 0.01\, v) + \theta_g\, N(0, 100\, v)$
  (normal operation, "frozen" sensor, gross errors)
• Unknown parameters: probabilities $\theta_f, \theta_g$
• Statistic chosen (a numerical sketch follows)
  $h_i(e) = \log \frac{s_{\theta_i}(e)}{s_{\theta_0}(e)}, \qquad \theta_0 = [0, 0],\ \theta_1 = [1, 0],\ \theta_2 = [0, 1],\ \theta_3 = [1/3, 1/3]$
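A small numerical sketch of the chosen statistics for this mixture model. The base variance v of normal operation is an assumed placeholder; the mixture and the four hypotheses follow the slide.

```python
import numpy as np
from scipy.stats import norm

# Sketch of h_i(e) = log( s_{theta_i}(e) / s_{theta_0}(e) ) for the three-component
# normal mixture of the sensor-validation example. v is an assumed placeholder.

v = 1.0                                              # assumed variance of normal operation

def mixture_pdf(e, theta_f, theta_g):
    """(1 - theta_f - theta_g) N(0, v) + theta_f N(0, 0.01 v) + theta_g N(0, 100 v)."""
    return ((1 - theta_f - theta_g) * norm.pdf(e, scale=np.sqrt(v))
            + theta_f * norm.pdf(e, scale=np.sqrt(0.01 * v))
            + theta_g * norm.pdf(e, scale=np.sqrt(100 * v)))

hypotheses = [(0, 0), (1, 0), (0, 1), (1/3, 1/3)]    # theta_0 .. theta_3

def h(e, i):
    """Differencing statistic against the null theta_0 = [0, 0]."""
    return np.log(mixture_pdf(e, *hypotheses[i]) / mixture_pdf(e, *hypotheses[0]))

e = np.diff(np.array([0.0, 0.1, 0.1, 5.0, 0.2]))     # signal differences e_k = y_k - y_{k-1}
print([h(e, i).round(2) for i in (1, 2, 3)])
```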
Signal Difference
[Figure: signal difference $e_k$ over samples $k = 0, \ldots, 500$, range $(-25, 15)$.]
Relative Entropy
[Figure: $\log D(R_N \,\|\, s_\theta)$ as a surface over the parameters $(\theta_1, \theta_2)$.]
Posterior Density
[Figure: the approximate posterior density $\hat p_N(\theta)$ as a surface over $(\theta_1, \theta_2)$.]
Outline
• Probability-based inference revisited
  Fundamentals of information geometry
• Finite-memory inference
  Minimum Relative Entropy (MRE) approximation
• Implementation
  Markov Chain Monte Carlo (MCMC) methods
• Brute-Force Alternative
  Monte Carlo Again: Weighted Bootstrap
MRE Algorithm
• Dual optimization task
  $D(R_N \,\|\, s_\theta) = \max_{\lambda \in R^n} \left[ \lambda' h_N - \log \iint s(y \mid z, \theta)\, \exp\left(\lambda' h(y, z)\right) dy\, dz \right]$
• Numerical integration necessary (a sketch of this estimate follows below)
  – sample $(y^{(1)}, z^{(1)}), \ldots, (y^{(M)}, z^{(M)})$ from $s_{\theta,\lambda}(y, z)$
  – kernel estimate $\hat s_{\theta,\lambda}(y, z)$
  – from $D(s_{\theta,\lambda} \,\|\, \hat s_{\theta,\lambda}) = \varepsilon \geq 0$ it follows
    $\psi_\theta(\lambda) \approx \frac{1}{M} \sum_{i=1}^{M} \log \frac{s(y^{(i)} \mid z^{(i)}, \theta)\, \exp\left(\lambda' h(y^{(i)}, z^{(i)})\right)}{\hat s_{\theta,\lambda}(y^{(i)}, z^{(i)})}$
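A sketch of the Monte Carlo estimate of $\psi_\theta(\lambda)$ above, for the assumed one-dimensional case $s = N(0,1)$, $h(y) = y$, where the tilted density is $N(\lambda, 1)$ and can be checked analytically. For clarity it is sampled directly rather than via a Metropolis chain.

```python
import numpy as np
from scipy.stats import gaussian_kde, norm

# Sketch: psi_theta(lambda) ~= (1/M) sum_i log[ s(x_i|theta) exp(lambda h(x_i)) / s_hat(x_i) ],
# with x_i drawn from the tilted density and s_hat a kernel estimate of it. Since
# D(s_{theta,lambda} || s_hat) >= 0, the estimate slightly undershoots psi.

rng = np.random.default_rng(2)
lam, M = 0.7, 5000
x = rng.standard_normal(M) + lam              # samples from the tilted density N(lam, 1)
s_hat = gaussian_kde(x)                       # kernel estimate of the tilted density

log_num = norm.logpdf(x) + lam * x            # log[ s(x|theta) exp(lambda * h(x)) ]
psi_est = np.mean(log_num - np.log(s_hat(x)))
print("psi estimate:", psi_est, " exact:", lam**2 / 2)   # Gaussian: psi = lambda^2 / 2
```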
MRE Implementation
[Diagram: an outer Metropolis sampler draws parameters $\theta^{(1)}, \ldots, \theta^{(M)}$ driven by $D(R_N \,\|\, s_{\theta^{(i)}})$; for each $\theta$, an inner Metropolis sampler draws $x^{(1)}, \ldots, x^{(N)}$, $x = (y, z)$, from the tilted model density $s_{\theta,\lambda}(x)$ built on the model density $s_\theta(x)$, and feeds the MRE optimization.]
Sample-based Computations
• Expectation $E_N(\theta)$
• Covariance $\operatorname{Cov}_N(\theta)$
• Probability of an event
  $P(A) = \int_A p(\theta)\, d\theta$
• Marginal density of $\theta_a$ given $\theta = (\theta_a, \theta_b)$
• Predictive density
  – direct sampling: sample $y^{(i)}$ from $s(y \mid z, \theta^{(i)})$
  – Rao-Blackwellized estimate (see the sketch below)
    $\hat s_N(y \mid z) = \frac{1}{M} \sum_{i=1}^{M} s(y \mid z, \theta^{(i)})$
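A minimal sketch of the Rao-Blackwellized predictive density, which averages the model density over posterior samples. The model $s(y \mid z, \theta) = N(\theta z, 0.1^2)$ and the Gaussian cloud of posterior draws are assumptions for illustration.

```python
import numpy as np

# Sketch of s_hat_N(y|z) = (1/M) sum_i s(y|z, theta^(i)): the predictive density as an
# average of model densities over posterior samples (Rao-Blackwellized estimate).

rng = np.random.default_rng(3)
theta_samples = 0.8 + 0.05 * rng.standard_normal(1000)   # assumed posterior draws

def s(y, z, theta, sigma=0.1):
    """Assumed model density s(y|z,theta): normal with mean theta*z."""
    return np.exp(-(y - theta * z) ** 2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

def predictive(y, z):
    """Rao-Blackwellized estimate: average the model density over the theta samples."""
    return np.mean(s(y, z, theta_samples))

print(predictive(0.75, 1.0))
```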
Metropolis Sampler I.
• Sample $x^*$ from the proposal density $\pi(x)$.
• Accept $x^{(i+1)} = x^*$ w.p. $\alpha$,
  $\alpha = \min \left\{ \frac{p(x^*)/\pi(x^*)}{p(x^{(i)})/\pi(x^{(i)})},\, 1 \right\}$
[Figure: proposal density $\pi(x)$ against the target density $p(x)$.]
Metropolis Sampler II.
• Random walk: $x^* = x^{(i)} + n$ (see the sketch below).
• Accept $x^{(i+1)} = x^*$ w.p. $\alpha$,
  $\alpha = \min \left\{ \frac{p(x^*)}{p(x^{(i)})},\, 1 \right\}$
[Figure: target density $p(x)$ over $(0, 10)$.]
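A minimal random-walk Metropolis sampler in the spirit of variant II above. The bimodal target $p(x)$ is an illustrative stand-in for a posterior density; step size and chain length are arbitrary.

```python
import numpy as np

# Random-walk Metropolis: propose x* = x + n with Gaussian noise n, accept
# with probability alpha = min{ p(x*)/p(x), 1 }; p need only be known up to a constant.

rng = np.random.default_rng(4)

def p(x):                                      # unnormalized target density (assumed)
    return np.exp(-0.5 * (x - 2) ** 2) + np.exp(-0.5 * (x + 2) ** 2)

def metropolis(n_steps=10000, step=1.0, x0=0.0):
    x, chain = x0, []
    for _ in range(n_steps):
        x_star = x + step * rng.standard_normal()          # random-walk proposal
        if rng.uniform() < min(p(x_star) / p(x), 1.0):     # accept w.p. alpha
            x = x_star
        chain.append(x)                                    # keep current state either way
    return np.array(chain)

samples = metropolis()
print("mean:", samples.mean(), " std:", samples.std())
```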
Example: Metropolis Sampling
[Figure: scatter plot and histogram of Metropolis samples of $(\theta_1, \theta_2)$ over the unit square.]
Outline
• Probability-based inference revisited
  Fundamentals of information geometry
• Finite-memory inference
  Minimum Relative Entropy (MRE) approximation
• Implementation
  Markov Chain Monte Carlo (MCMC) methods
• Brute-Force Alternative
  Monte Carlo Again: Weighted Bootstrap
Weighted Bootstrap Filtering
• Model
  $x_k = f(x_{k-1}, w_{k-1}), \qquad y_k = g(x_k, v_k)$
• Time update
  $x_k^{(i)} = f\left(x_{k-1}^{(i)}, w_{k-1}^{(i)}\right), \quad i = 1, \ldots, M$
• Data update (see the sketch after this list)
  – calculate normalized weights
    $w_i = \frac{p\left(y_k \mid x_k^{(i)}\right)}{\sum_{j=1}^{M} p\left(y_k \mid x_k^{(j)}\right)}$
  – resample M-times from the discrete distribution over $\{x_k^{(i)} : i = 1, \ldots, M\}$ with probability mass $w_i$ associated with element $i$
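A sketch of the weighted bootstrap filter above for a generic state-space model. The toy dynamics $f$, the implicit measurement model $y_k = x_k + v_k$, and the noise levels are assumptions, not from the slides.

```python
import numpy as np

# Weighted bootstrap filter: time update propagates particles through the dynamics,
# data update weights them by the likelihood p(y_k | x_k^(i)) and resamples M times.

rng = np.random.default_rng(5)
M = 1000                                                   # number of particles

def f(x, w):                                               # assumed process dynamics
    return 0.9 * x + w

def time_update(particles):
    """Propagate each particle with fresh process noise w ~ N(0, 0.1^2)."""
    return f(particles, 0.1 * rng.standard_normal(M))

def data_update(particles, y, sigma_v=0.2):
    """Weight by p(y_k | x_k^(i)) for y_k = x_k + v_k, v ~ N(0, sigma_v^2); resample."""
    w = np.exp(-0.5 * (y - particles) ** 2 / sigma_v**2)   # unnormalized weights
    w /= w.sum()                                           # normalized weights w_i
    idx = rng.choice(M, size=M, p=w)                       # resample with mass w_i
    return particles[idx]

particles = rng.standard_normal(M)                         # initial particle cloud
for y_k in [0.5, 0.4, 0.6]:                                # a few measurements
    particles = data_update(time_update(particles), y_k)
print("filtered state estimate:", particles.mean())
```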
Stochastic Simulation
[Diagram: the current (filtered) state passes through the Model of Process Dynamics to give the new (predicted) state; the Model of Sensors turns it into a predicted sensor response, which is compared with the measured data; RESAMPLING closes the loop.]
Example: Nonisothermal CSTR
[Schematic: CSTR with feed $c_{Af}, T_f$, flow $F$, volume $V$, coolant flow $Q_c$, and output $c_A, T$.]

CSTR model (a simulation sketch with placeholder parameters follows)
$\frac{dc_A}{dt} = \frac{1}{\theta}\left(c_{Af} - c_A\right) - k(T)\, c_A$
$\frac{dT}{dt} = \frac{1}{\theta}\left(T_f - T\right) + \beta\, k(T)\, c_A - \chi$

Reaction rate (Arrhenius relation)
$k(T) = k_0 \exp\left(-E/(RT)\right)$

Ref: Seborg, Edgar, Mellichamp (1989), Exercise 5.21
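A forward-Euler sketch of the two state equations above. All parameter values (residence time, heat-of-reaction coefficient, Arrhenius constants, feed conditions, cooling term) are placeholders, not the values of Seborg, Edgar, Mellichamp (1989), Exercise 5.21.

```python
import numpy as np

# Sketch: integrate the CSTR state equations by forward Euler.
# Every numerical value below is an assumed placeholder.

theta_res, beta = 1.0, 0.5          # residence time and heat-of-reaction coefficient
k0, E_over_R = 1e3, 5e3             # Arrhenius parameters (assumed)
cAf, Tf, chi = 0.8, 600.0, 0.0      # feed concentration/temperature and cooling term

def k(T):
    return k0 * np.exp(-E_over_R / T)          # Arrhenius reaction rate k(T)

def step(cA, T, dt=1e-3):
    dcA = (cAf - cA) / theta_res - k(T) * cA
    dT = (Tf - T) / theta_res + beta * k(T) * cA - chi
    return cA + dt * dcA, T + dt * dT

cA, T = 0.5, 600.0                             # initial state (assumed)
for _ in range(10000):
    cA, T = step(cA, T)
print("approximate steady state:", cA, T)
```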
Variable Feed
[Figure: variations in feed concentration $c_{Af}$ [lb mole/ft³], range 0.78–0.86, and feed temperature $T_f$ [°F], range 147–151, over time 0–120.]
Cooling Effect
[Figure: periodic cooling $\chi$, range 0–5, and the resulting temperature $T$ [°F], range 130–170, over time 0–120.]
State Estimation

[Figure: estimated concentration $c_A$ [lb mole/ft³], range 0–0.03, and temperature $T$ [°F], range 130–170, over time 0–120.]
Measurement Prediction
[Figure: concentration measurements vs. predictions, $c_A$ [lb mole/ft³], range 0–0.03, and temperature measurements vs. predictions, $T$ [°F], range 130–170, over time 0–120.]
State Estimation with Sensor Validation
[Figure: estimated concentration $c_A$ [lb mole/ft³], range 0–0.03, and temperature $T$ [°F], range 130–170, over time 0–120, with sensor validation active.]
Conclusions
• Theory
  – Information geometry yields additional insight.
  – Information geometry is tolerant to approximations and "cheating".
• Algorithm
  – Iterative sampling and importance resampling Monte Carlo schemes offer powerful tools to manage the "curse of dimensionality".
• Benefit
  – A fine description of uncertainty results in lower missed and false alarm rates, and a shorter delay in detection.
Further Reading
• T.M. Cover and J.A. Thomas (1991). Elements of Information Theory. Wiley, New York.
• R.E. Blahut (1987). Principles and Practice of Information Theory. Addison-Wesley, Reading, MA.
• L. Tierney (1994). Markov chains for exploring posterior distributions. Ann. Statist., 22, 1701-1762.
• A.F.M. Smith and A.E. Gelfand (1992). Bayesian statistics without tears: a sampling-resampling perspective. Amer. Statist., 46, 84-88.
• R. Kulhavý (1996). Recursive Nonlinear Estimation: A Geometric Approach. Springer-Verlag, London.