TRANSCRIPT
Innovations, Riccati and Kullback-Leibler Divergence
Dept. of Electrical Engineering, KAIST, Daejeon, Korea
Youngchul Sung
Presented at Niigata University, Japan, Oct 22, 2010
This visit and talk are supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Education, Science and Technology (2010-0021269).
© Youngchul Sung. Oct. 22, 2010
Research Background
Signal Processing
InformationTheory
Communications
Networking
Statistical Signal Processing(Detection & Estimation)
Areas: SPTM, SAM, Adaptive, Machine Learning, Audio, Video, Multirate, Biomedical, ...
Signal Processing for Communications
Outline
Introduction
Neyman-Pearson detection of hidden Gauss-Markov signals
Optimal sampling for detection
Application to wireless sensor networks
Conclusion
Mathematical Engineering
[Figure: academic genealogy chart tracing statistical signal processing to its roots.]
German root: Friedrich Leibniz, Thomasius (1643), Mencke, Wichmannshausen, Hausen, Kästner, Pfaff, Gauss; Gudermann, Gerling, Bessel; Weierstrass, Plücker; Klein (1868), Lindemann, Sommerfeld; Hilbert; Guillemin (degree 1926, München → MIT); also Jacob Andreae, Abraham Kein (1632), König, Heyne, Bach, Kustner, Ernesti, Gesner, Budeus, Walthier, Strauch, G. Riemann, the Bernoullis
French root: Euler, d'Alembert; Lagrange, Laplace; Fourier, Poisson; Chasles, Dirichlet; Lipschitz (a German studying in France); Gaston Darboux; Émile Borel, Stieltjes, Picard; Hadamard, Henri Lebesgue; Paul Lévy; Michel Loève (École Polytechnique → UC Berkeley); Emanuel Parzen
Russian branch: Nicola Bugaev, Nikolai Luzin, Andrei Kolmogorov, Albert Shiryaev
To the U.S.: Norbert Wiener (Harvard 1913 → MIT), Royce (JHU 1878 → Harvard), Morris (1878, U. Berlin → JHU), Trendelenburg (1868, U. Berlin), Bose, Tufts (Harvard), Vannevar Bush (MIT), Arthur Kennelly (MIT), Frank Hitchcock (MIT), Claude Shannon, Caldwell, Huffman, Ragazzini (Columbia), John Russell (MIT), Zadeh, Kalman (MIT, Columbia → Stanford), Fano (MIT), Wozencraft (MIT), T. Kailath (MIT → Stanford), Linvill (MIT), Tuttle, Arthurs, I. Jacobs, L. Kleinrock, B. Widrow, Truxal (Polytech), Braun (Polytechnic), Mendel (USC); later German line: Bieberbach, Schröder, Reissig, Bunke, Nussbaum
Modern Probability and Statistics
Lebesgue: Measure theory
Kolmogorov: Measure-theoretic probability
Wiener: Probabilistic approach to system theory
Shannon: Probabilistic approach to communication theory
The Key Problem in Statistics
[Diagram: X → black box → Y]
The Key Problem in Statistics
The Wiener Model
[Diagram: X → P(Y|X) → Y]
Areas in Statistical Signal Processing
Detection theory: radar setup; digital communication setup (Shannon's formulation and the 60,000 Ph.D.'s that followed). Applications: radar, sonar, distributed (team) detection, digital communications
Point estimation theory. Applications: frequency/spectral estimation, DOA estimation, beamforming, (blind) parametric system identification
Signal tracking: optimal, adaptive, particle filtering. Applications: radar, sonar, control, digital communications, equalization, negative feedback recovery
Time series analysis. Applications: finance, prediction theory
Machine learning. Applications: pattern recognition, voice recognition, face recognition, financial modeling
Statistical Inference: Detection & Estimation
Detection criterion:  Pr{ X != X_guess(Y) }
Estimation criteria:  || X - X_guess(Y) ||^2  (deterministic least squares)
                      E{ || X - X_guess(Y) ||^2 }  (stochastic least squares, or minimum mean square error)
Detection: Neyman & Pearson, H. Chernoff, Kullback & Leibler (I-divergence), Claude Shannon, Lucien Le Cam
Estimation: Carl Friedrich Gauß (1777 ~ 1855), Thomas Bayes (1702 ~ 1762), Norbert Wiener, Rudolf Kalman, Bernard Widrow, R. Fisher (FIM, 1925), C. R. Rao, H. Cramér
Statistical Inference: Detection & Estimation
State-space model
Introduction
Neyman-Pearson detection of hidden Gauss-Markov signals
Optimal sampling for detection
Application to wireless sensor networks
Conclusion
Related Work
S. Kullback and R. A. Leibler, "On information and sufficiency," Ann. Math. Stat., 1951
R. I. Price, "Optimum detection of stochastic signals in noise with application to scatter-multipath channels," IRE Trans. Inf. Theory, 1957
F. C. Schweppe, "Evaluation of likelihood functions for Gaussian signals," IEEE Trans. Inf. Theory, 1965
S. Saito and F. Itakura, "The theoretical consideration of statistically optimum methods for speech spectral density," Electrical Communication Laboratory, NTT, Tokyo, Rep. 3107, Dec. 1966
T. Kailath, "A general likelihood-ratio formula for random signals in Gaussian noise," IEEE Trans. Inf. Theory, 1966
T. Kailath, "Likelihood ratios for Gaussian processes," IEEE Trans. Inf. Theory, 1970
M. Donsker and S. Varadhan, "Large deviations for stationary Gaussian processes," Comm. Math. Phys., 1985
I. Vajda, "Distance and discrimination rates of stochastic processes," Stoch. Proc. Appl., 1993
Y. Sung, L. Tong, and H. V. Poor, "Neyman-Pearson detection of Gauss-Markov signals in noise: Closed-form error exponent and properties," IEEE Trans. Inf. Theory, 2006
Y. Sung, X. Zhang, L. Tong, and H. V. Poor, "Sensor configuration and activation for field detection in large sensor arrays," IEEE Trans. Signal Processing, 2008
Y. Sung, H. V. Poor, and H. Yu, "How much information can one get from a wireless ad hoc sensor network over a correlated random field?" IEEE Trans. Inf. Theory, 2009
Detection of correlated random fields
[Figure: a sensor field under hypothesis H0 or H1; data are collected and a decision between H1 and H0 is made]
Detection of correlated random fields
Key factors: spatial correlation, measurement noise, sensor deployment / signal sampling
[Figure: the signal s(x) is sampled at sensor positions x_1, x_2, ..., x_n, yielding samples s_1, s_2, ..., s_n; the collected data are used to decide between H1 and H0]
Correlation and Noise
[Figure: sample realizations for different sample correlations (rows) and different SNRs (columns, SNR = 10 dB and SNR = -3 dB)]
Design parameters: number of samples and spacing
Problem Formulation: The Gauss-Gauss Problem
Underlying diffusion phenomenon (Ornstein-Uhlenbeck process):
  ds(x) = -A s(x) dx + B du(x)
Sampled signal (s_i = s(x_i)):
  s_{i+1} = a s_i + u_i,   a = e^{-A·Δ} (Δ = sample spacing),   0 < a < 1
  SNR = E{s_i^2} / E{w_i^2}
Hypotheses:
  H0: y_i = w_i
  vs.
  H1: s_{i+1} = a s_i + u_i,   y_i = s_i + w_i
[Figure: the signal s(x) sampled at positions x_1, x_2, ..., x_n, giving samples s_1, s_2, ..., s_n]
Hidden Gauss-Markov Model: The State-Space Model
Hidden Gauss-Markov model:
  s_{i+1} = a s_i + u_i
  y_i = s_i + w_i
Burg's Theorem
The maximum-entropy-rate stochastic process {S_i} satisfying the constraints
  E{ S_i S_{i+k} } = a_k,   k = 0, 1, ..., p
is the p-th order Gauss-Markov process.
Kalman filtering
Optimal estimation of state using observations is based on the state-space model.
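The model above can be sketched in a few lines of code. This is a minimal simulation, with all parameter values (a = 0.8, unit noise variances, the sample size) being illustrative choices rather than numbers from the talk; it checks Burg's AR(1) characterization empirically through the lag-1 correlation of the state.

```python
import numpy as np

# Minimal simulation of the hidden Gauss-Markov model on this slide:
#   s_{i+1} = a*s_i + u_i   (first-order Gauss-Markov state)
#   y_i     = s_i + w_i     (noisy observation)
# All parameter values below are illustrative, not from the talk.
rng = np.random.default_rng(0)
a, sigma_u, sigma_w, n = 0.8, 1.0, 1.0, 200_000

sigma_s2 = sigma_u**2 / (1 - a**2)            # stationary state variance
s = np.empty(n)
s[0] = rng.normal(0.0, np.sqrt(sigma_s2))     # start in steady state
u = rng.normal(0.0, sigma_u, n - 1)
for i in range(n - 1):
    s[i + 1] = a * s[i] + u[i]
y = s + rng.normal(0.0, sigma_w, n)

# Burg's AR(1) characterization implies E[s_i s_{i+1}] / E[s_i^2] = a.
emp = np.mean(s[:-1] * s[1:]) / np.mean(s * s)
print(f"empirical lag-1 correlation: {emp:.3f} (model a = {a})")
```

The observations y then serve as the input to the detection problem on the following slides.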
Optimal Detection: Neyman & Pearson
Likelihood ratio detector:
  log [ p_1(y_1, ..., y_n) / p_0(y_1, ..., y_n) ]  compared against a threshold
False alarm probability:  P_F = Pr{ Decision = H1 | H0 }
Miss probability:         P_M = Pr{ Decision = H0 | H1 }
Threshold design (Neyman-Pearson formulation): minimize the miss probability subject to a false alarm probability level
[Photo: Jerzy Neyman and Egon Pearson]
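The formulation can be sketched in code for the simplest i.i.d. Gaussian case p0 = N(0, s0^2) vs. p1 = N(0, s1^2), where the log-likelihood ratio reduces to an energy statistic. All numeric values (s0, s1, n, alpha, Monte Carlo sizes) are illustrative choices, not from the talk.

```python
import numpy as np

# Neyman-Pearson test sketch for i.i.d. p0 = N(0, s0^2) vs. p1 = N(0, s1^2):
# the LLR is monotone in sum(y^2), and the threshold is set empirically to
# meet a false-alarm level alpha. Illustrative values throughout.
rng = np.random.default_rng(1)
s0, s1, n, alpha = 1.0, 2.0, 50, 0.05

def llr(Y):
    # log [ p1(y)/p0(y) ] for each row of Y (each row = one length-n observation)
    return 0.5 * np.sum(Y**2, axis=1) * (1/s0**2 - 1/s1**2) + n * np.log(s0/s1)

# Threshold from the empirical (1-alpha) quantile of the LLR under H0
t0 = llr(rng.normal(0.0, s0, (20_000, n)))
tau = np.quantile(t0, 1 - alpha)

# Miss probability: deciding H0 although H1 is true
t1 = llr(rng.normal(0.0, s1, (20_000, n)))
P_M = np.mean(t1 <= tau)
print(f"P_F target {alpha}, estimated P_M = {P_M:.4f}")
```

With n = 50 samples the miss probability is already far below the false-alarm level, previewing the exponential decay analyzed next.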
Structure for Optimal Detection
R. I. Price, "Optimum detection of stochastic signals in noise with application to scatter-multipath channels," IRE Trans. Inf. Theory, 1957 (extensions by Middleton, Shepp)
F. C. Schweppe, "Evaluation of likelihood functions for Gaussian signals," IEEE Trans. Inf. Theory, 1965
T. Kailath, "A general likelihood-ratio formula for random signals in Gaussian noise," IEEE Trans. Inf. Theory, 1966
T. Kailath, "Likelihood ratios for Gaussian processes," IEEE Trans. Inf. Theory, 1970
Schweppe's Recursion
(Schweppe, 1965)
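Schweppe's idea can be sketched as follows: a Kalman filter produces the innovations e_i and their variances R_{e,i}, from which the exact Gaussian log-likelihood follows term by term. The scalar model and parameter values below are illustrative; the brute-force multivariate Gaussian log-pdf is included only as a check.

```python
import numpy as np

# Schweppe-style likelihood evaluation for the scalar model
#   s_{i+1} = a s_i + u_i,  y_i = s_i + w_i  (q = Var(u), r = Var(w)):
# log p(y_1..n) = -0.5 * sum_i [ log(2*pi*R_{e,i}) + e_i^2 / R_{e,i} ],
# with e_i, R_{e,i} from the Kalman filter. Parameter values illustrative.
a, q, r = 0.9, 1.0, 1.0
rng = np.random.default_rng(2)

def loglik_schweppe(y):
    s_pred, P = 0.0, q / (1 - a**2)     # steady-state prior on s_1
    ll = 0.0
    for yi in y:
        Re = P + r                      # innovation variance
        e = yi - s_pred                 # innovation
        ll += -0.5 * (np.log(2*np.pi*Re) + e*e/Re)
        K = P / Re                      # Kalman gain
        s_filt = s_pred + K*e
        P_filt = (1 - K) * P
        s_pred, P = a*s_filt, a*a*P_filt + q    # time update
    return ll

def loglik_direct(y):
    # Brute-force check: y ~ N(0, Sigma), Sigma = Cov(s) + r*I with
    # Cov(s)_{ij} = (q/(1-a^2)) * a^{|i-j|} for the stationary AR(1) state.
    m = len(y)
    i, j = np.indices((m, m))
    Sigma = (q/(1 - a**2)) * a**np.abs(i - j) + r*np.eye(m)
    _, logdet = np.linalg.slogdet(Sigma)
    return -0.5*(m*np.log(2*np.pi) + logdet + y @ np.linalg.solve(Sigma, y))

y = rng.normal(0.0, 1.0, 30)
print(loglik_schweppe(y), loglik_direct(y))
```

The recursion needs O(n) work versus O(n^3) for the direct Gaussian density, which is the point of Schweppe's formulation.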
Performance Analysis: Known Results - I.i.d. Case
Energy detector:
  T_n = sum_{i=1}^n y_i^2, compared against a threshold
Performance (for fixed P_F):
  log P_M ≈ -K n,   K = (1/2) [ log(1 + SNR) - SNR/(1 + SNR) ]
[Figure: P_M (log scale) vs. number of samples; a straight line with slope -K]
General Case
Challenge: the exact error probability is not available!
  P_M ≈ e^{-nK},   log P_M ≈ -nK
[Figure: P_M vs. number of samples, on linear and log scales]
Error Exponent
  K = -lim_{n→∞} (1/n) log P_M,   i.e.,   log P_M ≈ -Kn
Performance metric for large numbers of samples!
[Figure: P_M (log scale) vs. number of samples; slope = -K]
Previous Results on Error Exponents
I.i.d. observations: Stein's lemma
  K = D(p_0 || p_1) = E_{p_0} [ log ( p_0(Y) / p_1(Y) ) ] = ∫ p_0(y) log [ p_0(y) / p_1(y) ] dy
(the Kullback-Leibler divergence)
For zero-mean Gaussians:
  D( N(0, σ_0^2) || N(0, σ_1^2) ) = (1/2) log( σ_1^2 / σ_0^2 ) + (1/2) ( σ_0^2 / σ_1^2 - 1 )
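The closed-form Gaussian divergence above can be checked directly against numerical integration of p0 log(p0/p1); the variances chosen below are illustrative.

```python
import numpy as np

# Closed form: D(N(0,v0) || N(0,v1)) = 0.5*log(v1/v0) + 0.5*(v0/v1 - 1),
# checked by integrating p0*log(p0/p1) on a fine grid. Values illustrative.
def kl_gauss(v0, v1):
    return 0.5*np.log(v1/v0) + 0.5*(v0/v1 - 1.0)

def kl_numeric(v0, v1, lim=12.0, num=400_001):
    x = np.linspace(-lim, lim, num)
    p0 = np.exp(-x*x/(2*v0)) / np.sqrt(2*np.pi*v0)
    p1 = np.exp(-x*x/(2*v1)) / np.sqrt(2*np.pi*v1)
    f = p0 * np.log(p0 / p1)
    # trapezoidal rule (tails beyond |x| = 12 are negligible for these variances)
    return (np.sum(f) - 0.5*(f[0] + f[-1])) * (x[1] - x[0])

v0, v1 = 1.0, 4.0
print(kl_gauss(v0, v1), kl_numeric(v0, v1))
```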
Interpretation via Sanov’s Theorem
  Pr( Y^n : P_{Y^n} ∈ E ) ≈ exp( -n min_{P ∈ E} D(P || Q) ) = exp( -n D(P* || Q) )
[Figure: probability simplex showing the true distribution Q, the set E, and P*, the closest point of E to Q]
  P_M = Pr{ Decision = H0 | H1 }
Previous Results on Error Exponents
Asymptotic Kullback-Leibler rate (Bahadur, Vajda; 1980, 1989, 1990):
  K = lim_{n→∞} (1/n) log [ p_0(y_1, y_2, ..., y_n) / p_1(y_1, y_2, ..., y_n) ]   under H0
Previous Results on Error Exponents
Between two Markov or Gauss-Markov observations: Koopmans (1960), Hoeffding (1965), Boza (1971), Natarajan (1985), Donsker & Varadhan (1985), Benitz & Bucklew (1995), Bryc & Dembo (1997), etc.
Luschgy (1994): Neyman-Pearson detection between two Gauss-Markov signals
These rely on the Markov factorization:
  p(y_1, ..., y_n) = p(y_1) p(y_2|y_1) p(y_3|y_2) ... p(y_n|y_{n-1})
Previous Results on Error Exponents
For the state-space (hidden Markov) model, however, the Markov factorization p(y_1, ..., y_n) = p(y_1) p(y_2|y_1) ... p(y_n|y_{n-1}) does not apply: the observation process is not Markov!
[Figure: signal process s1 → s2 → s3 → s4 (Markov), each state emitting an observation y1, y2, y3, y4]
Spectral Domain Approach
Theorem (Itakura-Saito, 1970s):
  K(a, SNR) = (1/2π) ∫_{-π}^{π} D( N(0, S_y^0(ω)) || N(0, S_y^1(ω)) ) dω
where, per frequency,
  D( N(0, S_y^0(ω)) || N(0, S_y^1(ω)) ) = (1/2) log( S_y^1(ω) / S_y^0(ω) ) + (1/2) ( S_y^0(ω) / S_y^1(ω) - 1 )
[Photos: F. Itakura and S. Saito]
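The spectral formula can be evaluated numerically for the AR(1)-signal-in-noise model from the earlier slides, with S_y^0(ω) the flat noise spectrum and S_y^1(ω) the signal-plus-noise spectrum; the a → 0 sanity check recovers the single-letter KL divergence of Stein's lemma. All parameter values are illustrative.

```python
import numpy as np

# Numerical sketch of the spectral formula for AR(1) signal in white noise:
#   S_y^0(w) = r                          (noise only, H0)
#   S_y^1(w) = q/|1 - a e^{-jw}|^2 + r    (signal plus noise, H1)
#   K = (1/2pi) * integral of the per-frequency Gaussian KL over [-pi, pi].
# Values of a, q, r are illustrative.
def spectral_K(a, q, r, num=200_001):
    w = np.linspace(-np.pi, np.pi, num)
    S1 = q / (1.0 - 2.0*a*np.cos(w) + a*a) + r
    D = 0.5*np.log(S1/r) + 0.5*(r/S1 - 1.0)
    # trapezoidal rule over one full period
    return (np.sum(D) - 0.5*(D[0] + D[-1])) * (w[1] - w[0]) / (2.0*np.pi)

# Sanity check: as a -> 0 the model is i.i.d. and K reduces to D(N(0,r) || N(0,q+r)).
K0 = spectral_K(0.0, 1.0, 1.0)
stein = 0.5*np.log(2.0) - 0.25
print(K0, stein)
```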
Innovations Process
Gram-Schmidt orthogonalization of the observations:
  span{ y_1, y_2, y_3, ... } = span{ e_1, e_2, e_3, ... },   e_i ⊥ e_j for i ≠ j
  e_i = y_i - ŷ_{i | i-1, ..., 1}   (the innovation: the new information in y_i)
  R_{e,i} = E{ e_i^2 },   e_i ~ N(0, R_{e,i})
[Figure: geometric picture; y_1 = e_1, the projection of y_2 onto y_1 gives ŷ_{2|1} with e_2 = y_2 - ŷ_{2|1}, and e_3 = y_3 - ŷ_{3|{1,2}}]
Innovations Approach to Asymptotic KL Rate for the Hidden Gauss-Markov Model
Whitening filter
Innovations Approach to Asymptotic KL Rate for the Hidden Gauss-Markov Model
Error Exponent: Innovations Approach
Theorem (Sung et al. 2004):
  K(a, SNR) = -lim_{n→∞} (1/n) log P_M = (1/2) log( R_e / σ_w^2 ) + R̃_e / (2 R_e) - 1/2
where σ_w^2 = E{w_i^2} and
  R_e = lim_{n→∞} E{ e_n^2 | H1 }   (steady-state innovation variance under H1)
  R̃_e = lim_{n→∞} E{ e_n^2 | H0 }   (steady-state variance of the same whitening filter's output under H0)
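The theorem's Kalman-filter quantities can be computed for the scalar model, as sketched below under my reading of the slide (parameter values illustrative): R_e comes from the steady-state Riccati recursion under H1, and R̃_e from a scalar Lyapunov equation for the H1 whitening filter driven by white H0 data. The spectral-domain formula from the previous slides serves as a cross-check.

```python
import numpy as np

# Kalman-filter quantities for s_{i+1} = a*s_i + u_i, y_i = s_i + w_i,
# with q = E{u^2}, r = E{w^2} (illustrative values):
#   R_e  = lim E{e_i^2 | H1}  (Riccati),  R~_e = lim E{e_i^2 | H0}  (Lyapunov)
#   K    = 0.5*log(R_e/r) + R~_e/(2*R_e) - 0.5
def exponent_K(a, q, r):
    P = q                                  # steady-state prediction Riccati iteration
    for _ in range(5000):
        P = a*a * P*r / (P + r) + q
    Re = P + r                             # innovation variance under H1
    Kg = P / Re                            # steady-state Kalman gain
    c = a * (1.0 - Kg)
    # Under H0 (white y), the prediction obeys shat_{i+1} = c*shat_i + a*Kg*y_i,
    # so its variance V solves the scalar Lyapunov equation V = c^2*V + (a*Kg)^2*r,
    # and the innovation e_i = y_i - shat_i has variance R~_e = r + V.
    V = (a*Kg)**2 * r / (1.0 - c*c)
    Re_t = r + V
    return 0.5*np.log(Re/r) + Re_t/(2.0*Re) - 0.5

# Cross-check against the spectral-domain formula on the earlier slides.
def spectral_K(a, q, r, num=200_001):
    w = np.linspace(-np.pi, np.pi, num)
    S1 = q / (1.0 - 2.0*a*np.cos(w) + a*a) + r
    D = 0.5*np.log(S1/r) + 0.5*(r/S1 - 1.0)
    return (np.sum(D) - 0.5*(D[0] + D[-1])) * (w[1] - w[0]) / (2.0*np.pi)

print(exponent_K(0.8, 1.0, 1.0), spectral_K(0.8, 1.0, 1.0))
```

For a = 0 the Riccati fixed point is P = q, giving R_e = q + r and R̃_e = r, so K collapses to the Stein's-lemma divergence of the corollary on the next slide.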
Extreme Correlations
Corollary
I.i.d. case (a → 0): R̃_e = σ_0^2 and R_e = σ_1^2, so K = D(p_0 || p_1) with p_0 = N(0, σ_0^2), p_1 = N(0, σ_1^2), recovering Stein's lemma.
Perfectly correlated case (a → 1): R̃_e = R_e, so K = 0; P_M decays only sub-exponentially, on the order of n^{-1/2}.
[Figure: P_M (log scale) vs. number of samples for the i.i.d. case (a = 0) and the perfectly correlated case (a = 1)]
K vs. Correlation Strength
Theorem (Sung et al.)
For SNR ≥ 1, K decreases monotonically as a → 1: maximum spacing between samples is optimal.
For SNR < 1, there exists a non-zero optimal correlation a*, i.e., an optimal finite spacing Δ* = -(log a*)/A  (from a = e^{-A·Δ}).
[Figure: error exponent K vs. correlation coefficient a in the two SNR regimes, marking the optimal spacing]
Optimal Correlation vs. SNR
The transition is very sharp!
[Figure: optimal correlation a* vs. signal-to-noise ratio in dB, with an abrupt transition at SNR = 1]
Intuition: Coherency vs. Diversity
[Figure: error exponent K vs. correlation coefficient a at SNR = 10 dB and SNR = -3 dB; at low SNR a "signal notch" appears, reflecting the trade-off between coherent gain and diversity]
K vs. Signal-to-Noise Ratio
Theorem 5:
K is monotone increasing as SNR increases.
K is proportional to (1/2) log(1 + SNR) at high SNR.
[Figure: error exponent K vs. signal-to-noise ratio in dB, for a = exp(-1)]
Simulation Results
[Figure: simulated error probability Pe vs. number of samples, at SNR = 10 dB and SNR = -3 dB]
How Many Samples?
From P_M ≈ e^{-nK}, the number of samples required to reach a target miss probability P̂_M at false alarm level P_F is approximately
  n ≈ (1/K) log( 1 / P̂_M )
[Figure: required number of sensors vs. sensor spacing for target P_M = 10^-3 and 10^-4, at SNR = -3, 0, 3 dB, with diffusion rate A = 1; the optimal spacing and a range of reasonable spacings are marked]
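The rule of thumb above is a one-liner in code; the exponent K = 0.1 used here is an assumed, illustrative value, not a number from the talk.

```python
import math

# From P_M ≈ exp(-n*K), the sample count for a target miss probability P_hat
# is roughly n ≈ log(1/P_hat)/K. K = 0.1 is an assumed, illustrative exponent.
def required_samples(K, P_hat):
    return math.ceil(math.log(1.0 / P_hat) / K)

for target in (1e-3, 1e-4):
    print(f"target P_M = {target:g}: n = {required_samples(0.1, target)}")
```

Since n scales like 1/K, the spacing that maximizes K (previous slides) directly minimizes the number of sensors needed.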
Vector State-Space Model
Y. Sung, X. Zhang, L. Tong and H. V. Poor, “Sensor configuration and activation for field detection in large sensor arrays,” IEEE Trans. Signal Processing, 2008
Vector State-Space Model
Theorem (Sung et al.)
(Riccati)
(Lyapunov)
Y. Sung, X. Zhang, L. Tong and H. V. Poor, “Sensor configuration and activation for field detection in large sensor arrays,” IEEE Trans. Signal Processing, 2008
Known in Kalman filtering theory
Newly defined quantities
Numerical Results
Figures from Y. Sung, X. Zhang, L. Tong and H. V. Poor, “Sensor configuration and activation for field detection in large sensor arrays,” IEEE Trans. Signal Processing, 2008
Extension to d-Dimensional Gauss-Markov Random Fields
Y. Sung, H. V. Poor and H. Yu, “How much information can one get from a wireless ad hoc sensor network over a correlated random field?” IEEE Trans. Infor.Theory, 2009
Extension to d-Dimensional Gauss-Markov Random Fields
Itakura-Saito (1970s), Hannan (1973)
Sung et al. (2009)
Y. Sung, H. V. Poor and H. Yu, “How much information can one get from a wireless ad hoc sensor network over a correlated random field?” IEEE Trans. Infor.Theory, 2009
Conclusion
The performance of Neyman-Pearson detection of hidden Gauss-Markov signals was analyzed via the error exponent
An innovations approach to the asymptotic Kullback-Leibler rate was developed
A sharp transition of the error exponent as a function of correlation, depending on SNR, was proved
Kalman filtering quantities were connected to the asymptotic KL rate
Extension to the vector state-space model
Extension to the multi-dimensional case: a sufficient condition for strong convergence was established