Mathematical Preliminaries

Upload: harlan-hawkins

Post on 31-Dec-2015

TRANSCRIPT

Page 1: Mathematical Preliminaries

Mathematical Preliminaries

Page 2: Mathematical Preliminaries

Matrix Theory

Vectors — the $n$th element of vector $\mathbf{u}$ is $u(n)$:
$$\mathbf{u} = \{u(n)\} = \begin{bmatrix} u(1) \\ u(2) \\ \vdots \\ u(N) \end{bmatrix}$$

Matrices — the element in the $m$th row and $n$th column of $A$ is $a(m,n)$:
$$A = \{a(m,n)\} = \begin{bmatrix} a(1,1) & a(1,2) & \cdots & a(1,N) \\ a(2,1) & & & \vdots \\ \vdots & & & \vdots \\ a(M,1) & a(M,2) & \cdots & a(M,N) \end{bmatrix}$$

Column vectors — $A$ can be written in terms of its columns as $A = [\mathbf{a}_1 \; \mathbf{a}_2 \; \cdots \; \mathbf{a}_N]$, where
$$\mathbf{a}_k = [a(1,k),\, a(2,k),\, \ldots,\, a(M,k)]^T$$

Page 3: Mathematical Preliminaries

Lexicographic Ordering (Stacking Operation)

Row-ordered form of a matrix $X = \{x(m,n)\}$:
$$\mathbf{x} = [\,x(1,1)\; x(1,2)\cdots x(1,N)\;\; x(2,1)\cdots\;\; x(M,1)\cdots x(M,N)\,]^T = [\mathbf{r}_1\; \mathbf{r}_2\; \cdots\; \mathbf{r}_M]^T$$
where $\mathbf{r}_k = [\,x(k,1)\; x(k,2)\; \cdots\; x(k,N)\,]$ is the $k$th row vector of $X$.

Column-ordered form of a matrix:
$$\mathbf{x} = [\,x(1,1)\; x(2,1)\cdots x(M,1)\;\; x(1,2)\cdots\;\; x(1,N)\cdots x(M,N)\,]^T = [\mathbf{c}_1^T\; \mathbf{c}_2^T\; \cdots\; \mathbf{c}_N^T]^T$$
where $\mathbf{c}_k = [\,x(1,k),\, x(2,k),\, \ldots,\, x(M,k)\,]^T$ is the $k$th column vector of $X$.

Page 4: Mathematical Preliminaries


Page 5: Mathematical Preliminaries

Transposition and conjugation rules:
$$[A^T]^T = A, \quad [AB]^T = B^T A^T, \quad [A^{-1}]^T = [A^T]^{-1}$$
$$[A^*]^* = A, \quad [AB]^* = A^* B^*$$

Toeplitz matrices — $t(m,n) = t_{m-n}$, i.e. the elements along each diagonal are equal:
$$T = \begin{bmatrix} t_0 & t_{-1} & t_{-2} & \cdots & t_{-(N-1)} \\ t_1 & t_0 & t_{-1} & \cdots & \vdots \\ t_2 & t_1 & t_0 & \ddots & \vdots \\ \vdots & & \ddots & \ddots & t_{-1} \\ t_{N-1} & \cdots & & t_1 & t_0 \end{bmatrix}$$

Circulant matrices — $c(m,n) = c\big((m-n) \bmod N\big)$, i.e. each row is a circular shift of the previous row:
$$C = \begin{bmatrix} c_0 & c_{N-1} & c_{N-2} & \cdots & c_1 \\ c_1 & c_0 & c_{N-1} & \cdots & c_2 \\ c_2 & c_1 & c_0 & \cdots & c_3 \\ \vdots & & & \ddots & \vdots \\ c_{N-1} & c_{N-2} & \cdots & c_1 & c_0 \end{bmatrix}$$

Page 6: Mathematical Preliminaries

Linear convolution using Toeplitz matrix

$$y(n) = h(n) * x(n) = \sum_{k=0}^{N_x-1} h(n-k)\,x(k)$$

where $h(n) = 0$ for $n < 0$ or $n \ge N_h$, and $x(n) = 0$ for $n < 0$ or $n \ge N_x$.

Writing out the first terms and the general term:
$$y(0) = \sum_{k=0}^{N_x-1} h(-k)\,x(k) = h(0)x(0)$$
$$y(1) = \sum_{k=0}^{N_x-1} h(1-k)\,x(k) = h(1)x(0) + h(0)x(1)$$
$$y(l) = \sum_{k=0}^{N_x-1} h(l-k)\,x(k) = h(l)x(0) + h(l-1)x(1) + \cdots + h(0)x(l)$$

Page 7: Mathematical Preliminaries

$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ \vdots \\ \vdots \\ y(N_x{+}N_h{-}2) \end{bmatrix} =
\begin{bmatrix}
h(0) & 0 & \cdots & 0 \\
h(1) & h(0) & & \vdots \\
h(2) & h(1) & \ddots & 0 \\
\vdots & h(2) & \ddots & h(0) \\
h(N_h{-}1) & \vdots & & h(1) \\
0 & h(N_h{-}1) & & \vdots \\
\vdots & & \ddots & \vdots \\
0 & \cdots & 0 & h(N_h{-}1)
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ \vdots \\ x(N_x{-}1) \end{bmatrix}$$

$\mathbf{y} = H\mathbf{x}$ : $\mathbf{y}$ is $(N_x + N_h - 1) \times 1$ and $H$ is an $(N_x + N_h - 1) \times N_x$ Toeplitz matrix.
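As a sketch of this construction (the filter and signal values below are arbitrary illustrations, not from the slides), the Toeplitz matrix with $H(m,k) = h(m-k)$ reproduces linear convolution:

```python
import numpy as np

def toeplitz_conv_matrix(h, Nx):
    """(Nx+Nh-1) x Nx Toeplitz matrix H with H[m, k] = h[m-k]."""
    Nh = len(h)
    H = np.zeros((Nx + Nh - 1, Nx))
    for m in range(Nx + Nh - 1):
        for k in range(Nx):
            if 0 <= m - k < Nh:
                H[m, k] = h[m - k]
    return H

h = np.array([1.0, 2.0, 3.0])       # illustrative filter (N_h = 3)
x = np.array([4.0, 5.0, 6.0, 7.0])  # illustrative input (N_x = 4)
H = toeplitz_conv_matrix(h, len(x))
y = H @ x
assert np.allclose(y, np.convolve(h, x))  # y = Hx equals direct linear convolution
```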

Page 8: Mathematical Preliminaries

N-point circular convolution:

$$\tilde y(n) = h(n) \circledast x(n) = \sum_{k=0}^{N-1} \tilde h(n-k)\,x(k), \quad 0 \le n \le N-1 \;(0 \text{ otherwise})$$

where $\tilde h(n) = h(n \bmod N)$ is the periodic extension of $h(n)$ with period $N$, and $x(n) = 0$ for $n < 0$ or $n \ge N$.

Writing out terms:
$$\tilde y(0) = \sum_{k=0}^{N-1} \tilde h(-k)\,x(k) = h(0)x(0) + \sum_{k=1}^{N-1} h(N-k)\,x(k)$$
$$\tilde y(1) = \sum_{k=0}^{N-1} \tilde h(1-k)\,x(k) = h(1)x(0) + h(0)x(1) + \sum_{k=2}^{N-1} h(N+1-k)\,x(k)$$
$$\tilde y(l) = \sum_{k=0}^{N-1} \tilde h(l-k)\,x(k) = h(l)x(0) + \cdots + h(0)x(l) + \sum_{k=l+1}^{N-1} h(N+l-k)\,x(k)$$

Circular convolution using circulant matrix

Page 9: Mathematical Preliminaries

$$\begin{bmatrix} \tilde y(0) \\ \tilde y(1) \\ \tilde y(2) \\ \vdots \\ \tilde y(N{-}1) \end{bmatrix} =
\begin{bmatrix}
h(0) & h(N{-}1) & h(N{-}2) & \cdots & h(1) \\
h(1) & h(0) & h(N{-}1) & \cdots & h(2) \\
h(2) & h(1) & h(0) & \cdots & h(3) \\
\vdots & & & \ddots & \vdots \\
h(N{-}1) & h(N{-}2) & \cdots & h(1) & h(0)
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ \vdots \\ x(N{-}1) \end{bmatrix}$$

$\tilde{\mathbf{y}} = H\mathbf{x}$, where $H$ is a circulant matrix.

Circular convolution + zero padding ⇒ linear convolution:
with $h(n) = 0$ for $n < 0$ or $n \ge N_h$, and $x(n) = 0$ for $n < 0$ or $n \ge N_x$, circular convolution with period $N \ge N_x + N_h - 1$ gives the same result as linear convolution.
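The zero-padding statement can be checked numerically; the sequences below are arbitrary illustrations:

```python
import numpy as np

def circulant(h):
    """N x N circulant matrix with C[m, k] = h[(m-k) mod N]."""
    N = len(h)
    return np.array([[h[(m - k) % N] for k in range(N)] for m in range(N)])

h = np.array([1.0, 2.0, 3.0])
x = np.array([4.0, 5.0, 6.0, 7.0])

# Zero-pad both sequences to period N >= N_x + N_h - 1, then circularly convolve.
N = len(h) + len(x) - 1
hp = np.concatenate([h, np.zeros(N - len(h))])
xp = np.concatenate([x, np.zeros(N - len(x))])
y_circ = circulant(hp) @ xp
assert np.allclose(y_circ, np.convolve(h, x))  # matches linear convolution
```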

Page 10: Mathematical Preliminaries

(ex) Linear convolution as a Toeplitz matrix operation: $h(n) = \delta(n) + \delta(n-2)$ ($M = 3$), input $x(n)$ of length $N = 5$, output length $L = N + M - 1 = 5 + 3 - 1 = 7$:
$$y(n) = h(n) * x(n) = \sum_{k=0}^{4} h(n-k)\,x(k)$$
$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \\ y(4) \\ y(5) \\ y(6) \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\
1 & 0 & 1 & 0 & 0 \\
0 & 1 & 0 & 1 & 0 \\
0 & 0 & 1 & 0 & 1 \\
0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \\ x(4) \end{bmatrix}$$

(ex) Circular convolution as a circulant matrix operation: $h(n) = n$ for $0 \le n \le 3$, $h(n) = h(n \bmod N)$, $N = 4$:
$$y(n) = \sum_{k=0}^{N-1} h\big((n-k) \bmod N\big)\,x(k), \quad 0 \le n \le N-1$$
$$\begin{bmatrix} y(0) \\ y(1) \\ y(2) \\ y(3) \end{bmatrix} =
\begin{bmatrix}
0 & 3 & 2 & 1 \\
1 & 0 & 3 & 2 \\
2 & 1 & 0 & 3 \\
3 & 2 & 1 & 0
\end{bmatrix}
\begin{bmatrix} x(0) \\ x(1) \\ x(2) \\ x(3) \end{bmatrix}$$

Page 11: Mathematical Preliminaries

Orthogonal and unitary matrices
Orthogonal: $A^{-1} = A^T$, i.e. $AA^T = A^TA = I$
Unitary: $A^{-1} = A^{*T}$, i.e. $AA^{*T} = A^{*T}A = I$

Positive definiteness and quadratic forms
$A$ is called positive definite if $A$ is a Hermitian matrix and
$$Q = \mathbf{x}^{*T}A\mathbf{x} > 0 \quad \forall\, \mathbf{x} \ne \mathbf{0}$$
$A$ is called positive semidefinite (nonnegative definite) if $A$ is a Hermitian matrix and
$$Q = \mathbf{x}^{*T}A\mathbf{x} \ge 0 \quad \forall\, \mathbf{x} \ne \mathbf{0}$$

Theorem: if $A$ is a symmetric positive definite matrix, then all its eigenvalues are positive and the determinant of $A$ satisfies
$$0 < |A| \le \prod_{k=1}^{N} a(k,k)$$

Page 12: Mathematical Preliminaries

Diagonal forms
For any Hermitian matrix $R$ there exists a unitary matrix $\Phi$ such that
$$\Phi^{*T} R\, \Phi = \Lambda$$
where $\Lambda$ is a diagonal matrix containing the eigenvalues of $R$.

Eigenvalue and eigenvector:
$$R\boldsymbol{\phi}_k = \lambda_k \boldsymbol{\phi}_k, \quad k = 1, \ldots, N \quad (\text{or } R\Phi = \Phi\Lambda)$$
$\lambda_k$ : eigenvalue, $\boldsymbol{\phi}_k$ : eigenvector, where $\Phi = [\boldsymbol{\phi}_1 \,|\, \boldsymbol{\phi}_2 \,|\, \cdots \,|\, \boldsymbol{\phi}_N]$.
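NumPy's `eigh` computes exactly this diagonalization for a Hermitian matrix; the matrix below is an arbitrary symmetric example:

```python
import numpy as np

# Illustrative Hermitian (here real symmetric) matrix; values are arbitrary.
R = np.array([[2.0, 1.0, 0.0],
              [1.0, 2.0, 1.0],
              [0.0, 1.0, 2.0]])

lam, Phi = np.linalg.eigh(R)     # columns of Phi are orthonormal eigenvectors
Lambda = Phi.conj().T @ R @ Phi  # Phi^{*T} R Phi should be diagonal
assert np.allclose(Lambda, np.diag(lam), atol=1e-10)
assert np.allclose(Phi.conj().T @ Phi, np.eye(3), atol=1e-10)  # Phi is unitary
```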

Page 13: Mathematical Preliminaries

Block Matrices

Block matrices: matrices whose elements are themselves matrices.
$$\mathcal{A} = \{A_{i,j}\} = \begin{bmatrix} A_{1,1} & A_{1,2} & \cdots & A_{1,n} \\ A_{2,1} & A_{2,2} & \cdots & A_{2,n} \\ \vdots & & & \vdots \\ A_{m,1} & A_{m,2} & \cdots & A_{m,n} \end{bmatrix}$$

(ex) 2-D convolution:
$$y(m,n) = \sum_{m'=0}^{2}\sum_{n'=0}^{1} x(m',n')\,h(m-m',\,n-n'), \quad 0 \le m \le 3,\; 0 \le n \le 2$$
with
$$x = \begin{bmatrix} 1 & 2 \\ 4 & 5 \\ 1 & 3 \end{bmatrix}, \quad h = \begin{bmatrix} 1 & 1 \\ 1 & -1 \end{bmatrix} \;\Longrightarrow\; y = \begin{bmatrix} 1 & 3 & 2 \\ 5 & 10 & 3 \\ 5 & 5 & -2 \\ 1 & 2 & -3 \end{bmatrix}$$

Column stacking operation:
$$X = [\mathbf{x}_0\; \mathbf{x}_1], \quad \mathbf{x}_0 = \begin{bmatrix}1\\4\\1\end{bmatrix},\; \mathbf{x}_1 = \begin{bmatrix}2\\5\\3\end{bmatrix}$$
$$Y = [\mathbf{y}_0\; \mathbf{y}_1\; \mathbf{y}_2], \quad \mathbf{y}_0 = \begin{bmatrix}1\\5\\5\\1\end{bmatrix},\; \mathbf{y}_1 = \begin{bmatrix}3\\10\\5\\2\end{bmatrix},\; \mathbf{y}_2 = \begin{bmatrix}2\\3\\-2\\-3\end{bmatrix}$$
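A direct evaluation of the 2-D convolution double sum; the x and h values here are the ones consistent with the example's output table:

```python
import numpy as np

def conv2d_full(x, h):
    """Full 2-D linear convolution y(m,n) = sum_{m',n'} x(m',n') h(m-m', n-n')."""
    M1, N1 = x.shape
    M2, N2 = h.shape
    y = np.zeros((M1 + M2 - 1, N1 + N2 - 1))
    for mp in range(M1):
        for np_ in range(N1):
            # x(m',n') contributes a shifted, scaled copy of h to the output
            y[mp:mp + M2, np_:np_ + N2] += x[mp, np_] * h
    return y

x = np.array([[1.0, 2.0],
              [4.0, 5.0],
              [1.0, 3.0]])
h = np.array([[1.0, 1.0],
              [1.0, -1.0]])
y = conv2d_full(x, h)
assert np.allclose(y, [[1, 3, 2], [5, 10, 3], [5, 5, -2], [1, 2, -3]])
```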

Page 14: Mathematical Preliminaries

Let $\mathbf{x}_{n'}$ and $\mathbf{y}_n$ be the column vectors of $X$ and $Y$; then
$$\mathbf{y}_n = \sum_{n'=0}^{1} H_{n-n'}\,\mathbf{x}_{n'}$$
where $H_n = \{h(m-m',\,n)\}$, $0 \le m \le 3$, $0 \le m' \le 2$:
$$H_0 = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}, \qquad H_1 = \begin{bmatrix} 1 & 0 & 0 \\ -1 & 1 & 0 \\ 0 & -1 & 1 \\ 0 & 0 & -1 \end{bmatrix}$$
$$\begin{bmatrix} \mathbf{y}_0 \\ \mathbf{y}_1 \\ \mathbf{y}_2 \end{bmatrix} =
\begin{bmatrix} H_0 & 0 \\ H_1 & H_0 \\ 0 & H_1 \end{bmatrix}
\begin{bmatrix} \mathbf{x}_0 \\ \mathbf{x}_1 \end{bmatrix}
\;\Longrightarrow\; \mathbf{y} = \mathcal{H}\mathbf{x}, \quad \mathcal{H} : \text{a (block Toeplitz) block matrix}$$

Page 15: Mathematical Preliminaries

Kronecker Products

Definition:
$$A \otimes B \triangleq \{a(m,n)B\} = \begin{bmatrix} a(1,1)B & \cdots & a(1,N)B \\ \vdots & & \vdots \\ a(M,1)B & \cdots & a(M,N)B \end{bmatrix}$$

(ex)
$$A = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix}$$
$$A \otimes B = \begin{bmatrix} 1 & 2 & 1 & 2 \\ 3 & 4 & 3 & 4 \\ 1 & 2 & 1 & 2 \\ 3 & 4 & 3 & 4 \end{bmatrix}, \quad
B \otimes A = \begin{bmatrix} 1 & 1 & 2 & 2 \\ 1 & 1 & 2 & 2 \\ 3 & 3 & 4 & 4 \\ 3 & 3 & 4 & 4 \end{bmatrix}
\;\Rightarrow\; A \otimes B \ne B \otimes A$$

Properties (Table 2.7):
$$(A \otimes B)(C \otimes D) = (AC) \otimes (BD)$$
Evaluating $(A \otimes B)(C \otimes D)$ directly : $O(N^6)$ operations; evaluating $(AC) \otimes (BD)$ : $O(N^4)$ operations.
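Both claims are easy to verify with `numpy.kron` (random matrices used purely as an illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B, C, D = (rng.standard_normal((3, 3)) for _ in range(4))

# (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD): same result, far fewer operations on the right.
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# Kronecker products do not commute in general.
A2 = np.array([[1, 1], [1, 1]])
B2 = np.array([[1, 2], [3, 4]])
assert not np.array_equal(np.kron(A2, B2), np.kron(B2, A2))
```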

Page 16: Mathematical Preliminaries

Separable transformation
Transformation on an $N \times M$ image $U$:
$$v(k,l) = \sum_m \sum_n t(k,l;m,n)\,u(m,n)$$
The transformation is separable if $t(k,l;m,n) = a(k,m)\,b(l,n)$; then
$$V = AUB^T \quad \text{: matrix form}$$
$$\mathbf{v} = (A \otimes B)\,\mathbf{u} \quad \text{: vector form (row-ordered)}$$

Derivation: let $\mathbf{v}_k$ and $\mathbf{u}_m$ denote the row vectors of $V$ and $U$. Then
$$\mathbf{v}_k^T = \sum_m a(k,m)\,B\,\mathbf{u}_m^T$$
and stacking the rows $[\mathbf{v}_1\; \mathbf{v}_2\; \cdots]^T$ gives $\mathbf{v} = (A \otimes B)\,\mathbf{u}$.
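The equivalence of the matrix and vector forms can be sketched as follows (row-major `reshape` gives the row-ordered form; sizes and values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
N, M = 4, 5
U = rng.standard_normal((N, M))
A = rng.standard_normal((N, N))
B = rng.standard_normal((M, M))

V = A @ U @ B.T                    # matrix form of the separable transformation
v = np.kron(A, B) @ U.reshape(-1)  # vector form on the row-ordered image
assert np.allclose(v, V.reshape(-1))
```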

Page 17: Mathematical Preliminaries


Page 18: Mathematical Preliminaries

Random Signals

Definitions
Random signal: a sequence of random variables $\{u(n)\}$.
Mean: $\mu_u(n) = E[u(n)]$
Variance: $\sigma_u^2(n) = E[|u(n) - \mu_u(n)|^2]$
Covariance: $c_{uu}(n,n') = \mathrm{Cov}[u(n), u(n')] = E\{[u(n) - \mu_u(n)][u(n') - \mu_u(n')]^*\}$
Cross covariance: $c_{uv}(n,n') = \mathrm{Cov}[u(n), v(n')] = E\{[u(n) - \mu_u(n)][v(n') - \mu_v(n')]^*\}$
Autocorrelation: $r_{uu}(n,n') = E[u(n)\,u^*(n')] = c_{uu}(n,n') + \mu_u(n)\,\mu_u^*(n')$
Cross correlation: $r_{uv}(n,n') = E[u(n)\,v^*(n')] = c_{uv}(n,n') + \mu_u(n)\,\mu_v^*(n')$

Page 19: Mathematical Preliminaries

Representation for an N×1 vector
$\boldsymbol{\mu} = E[\mathbf{u}] = \{\mu_u(n)\}$ : N×1 mean vector
$C_u = \mathrm{Cov}[\mathbf{u}] = E[(\mathbf{u} - \boldsymbol{\mu}_u)(\mathbf{u} - \boldsymbol{\mu}_u)^{*T}] = \{c_{uu}(n,n')\}$ : N×N covariance matrix
$C_{uv} = \mathrm{Cov}[\mathbf{u}, \mathbf{v}] = E[(\mathbf{u} - \boldsymbol{\mu}_u)(\mathbf{v} - \boldsymbol{\mu}_v)^{*T}] = \{c_{uv}(n,n')\}$ : N×N matrix

Gaussian (or normal) distribution:
$$p(u) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\Big\{{-\frac{|u - \mu|^2}{2\sigma^2}}\Big\}$$

Gaussian random processes: a Gaussian random process is one for which the joint probability density of any finite sub-sequence is a Gaussian distribution:
$$p(\mathbf{u}) = p(u_1, u_2, \ldots, u_N) = \big[(2\pi)^{N/2}|C|^{1/2}\big]^{-1} \exp\big\{{-\tfrac{1}{2}(\mathbf{u} - \boldsymbol{\mu})^{*T} C^{-1} (\mathbf{u} - \boldsymbol{\mu})}\big\}$$
where $C$ is the covariance matrix.

Page 20: Mathematical Preliminaries

Stationary process
Strict-sense stationary: the joint density of any partial sequence $\{x(l),\, n \le l \le n+k\}$ is the same as that of the shifted sequence $\{x(l+n_0),\, n \le l \le n+k\}$, i.e.
$$F\big(x(n), x(n{+}1), \ldots, x(n{+}k)\big) = F\big(x(n{+}n_0), x(n{+}n_0{+}1), \ldots, x(n{+}n_0{+}k)\big) \quad \forall\, n_0, n, k$$

Wide-sense stationary:
$$E[u(n)] = \mu = \text{constant}, \qquad E[u(n)\,u^*(n')] = r_{uu}(n - n')$$
i.e. the covariance matrix is Toeplitz.

Gaussian process: wide-sense stationary = strict-sense stationary.

Page 21: Mathematical Preliminaries

Orthogonal: $E[xy^*] = 0$
Uncorrelated: $E[xy^*] = E[x]E[y^*]$, i.e. $E[(x - \mu_x)(y - \mu_y)^*] = 0$
Independent: $p_{xy}(x,y) = p_x(x)\,p_y(y)$

Markov processes: a process is pth-order Markov if
$$\mathrm{prob}[u(n) \mid u(n{-}1), u(n{-}2), \ldots] = \mathrm{prob}[u(n) \mid u(n{-}1), \ldots, u(n{-}p)] \quad \forall n$$

(ex) Covariance matrix of a first-order stationary Markov sequence $u(n)$ with
$$r_{uu}(n) = \sigma^2 \rho^{|n|}, \quad |\rho| < 1:$$
$$C = \sigma^2 \begin{bmatrix} 1 & \rho & \rho^2 & \cdots & \rho^{N-1} \\ \rho & 1 & \rho & \cdots & \rho^{N-2} \\ \vdots & & \ddots & & \vdots \\ \rho^{N-1} & \rho^{N-2} & \cdots & \rho & 1 \end{bmatrix} \quad : \text{Toeplitz}$$
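A small numeric check of the first-order Markov covariance (parameter values arbitrary):

```python
import numpy as np

def markov_cov(N, rho, sigma2=1.0):
    """Covariance C(i, j) = sigma^2 * rho^|i-j| of a first-order
    stationary Markov sequence."""
    idx = np.arange(N)
    return sigma2 * rho ** np.abs(idx[:, None] - idx[None, :])

C = markov_cov(5, 0.9)
# Toeplitz: constant along every diagonal.
assert np.allclose(C[1:, 1:], C[:-1, :-1])
# Positive definite for |rho| < 1: all eigenvalues positive.
assert np.all(np.linalg.eigvalsh(C) > 0)
```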

Page 22: Mathematical Preliminaries

Karhunen–Loève (KL) transform
KL transform of $\mathbf{x}$:
$$\mathbf{y} = \Phi^{*T}\mathbf{x}, \qquad \Phi^{*T} : N \times N \text{ unitary matrix}$$

Property:
$$E[\mathbf{y}\mathbf{y}^{*T}] = \Phi^{*T} E[\mathbf{x}\mathbf{x}^{*T}]\,\Phi = \Phi^{*T} R\,\Phi = \Lambda$$
$$\Rightarrow\; E[y(k)\,y^*(l)] = \lambda_k\,\delta(k - l)$$
The elements of $\mathbf{y}$ are orthogonal. $\Phi^{*T}$ is called the KL transform matrix; the rows of $\Phi^{*T}$ are the conjugate eigenvectors of $R$: $R\boldsymbol{\phi}_k = \lambda_k\boldsymbol{\phi}_k$.
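A minimal sketch: generate samples with a prescribed correlation matrix R (a first-order Markov model, assumed here purely as an example), apply the KL transform, and check that the transform coefficients come out (approximately) uncorrelated:

```python
import numpy as np

rng = np.random.default_rng(2)
N, rho = 6, 0.9
R = rho ** np.abs(np.subtract.outer(np.arange(N), np.arange(N)))  # E[x x^T]

lam, Phi = np.linalg.eigh(R)                     # R Phi = Phi diag(lam)
x = np.linalg.cholesky(R) @ rng.standard_normal((N, 100_000))
y = Phi.T @ x                                    # KL transform (real case)

Ryy = (y @ y.T) / x.shape[1]                     # sample correlation of y
assert np.allclose(Ryy, np.diag(lam), atol=0.1)  # approximately diagonal
```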

Page 23: Mathematical Preliminaries

Discrete Random Field

Definitions
Discrete random field: each sample of a 2-D sequence is a random variable.
Mean: $\mu(m,n) = E[u(m,n)]$
Covariance: $c_{uu}(m,n;m',n') = \mathrm{Cov}[u(m,n), u(m',n')] = E\{[u(m,n) - \mu(m,n)][u(m',n') - \mu(m',n')]^*\}$
White noise field: $c_{uu}(m,n;m',n') = \sigma_u^2(m,n)\,\delta(m - m',\, n - n')$
Symmetry: $c_{uu}(m,n;m',n') = c_{uu}^*(m',n';m,n)$

Page 24: Mathematical Preliminaries

Separable and isotropic image covariance functions
Separable: $c(m,n;m',n') = c_1(m,m')\,c_2(n,n')$ (nonstationary case), $c(m,n) = c_1(m)\,c_2(n)$ (stationary case)
Separable stationary covariance function:
$$c(m,n) = \sigma^2 \rho_1^{|m|}\rho_2^{|n|}, \quad |\rho_1| < 1,\; |\rho_2| < 1$$
Nonseparable exponential function:
$$c(m,n) = \sigma^2 \exp\big\{{-\sqrt{\alpha_1 m^2 + \alpha_2 n^2}}\big\}$$
which for $\alpha_1 = \alpha_2$ becomes $c(m,n) = \sigma^2 \rho^{\,d}$ with $d = \sqrt{m^2 + n^2}$ (isotropic or circularly symmetric).

Estimation of mean and autocorrelation:
$$\hat\mu = \frac{1}{MN}\sum_{m=1}^{M}\sum_{n=1}^{N} u(m,n)$$
$$\hat c_{xx}(m,n) = \frac{1}{MN}\sum_{m'}\sum_{n'} [u(m'{+}m,\, n'{+}n) - \hat\mu][u(m',n') - \hat\mu]$$

Page 25: Mathematical Preliminaries

SDF (spectral density function)

Definition: the Fourier transform of the autocorrelation function.

1-D case:
$$\mathrm{SDF}\{u(n)\}: \quad S(f) = \sum_n r_{uu}(n)\exp(-j2\pi f n)$$
$$r_{uu}(n) = \int_{-0.5}^{0.5} S(f)\exp(j2\pi f n)\,df$$

2-D case:
$$\mathrm{SDF}\{u(m,n)\}: \quad S(u,v) = \sum_m \sum_n r_{uu}(m,n)\exp[-j2\pi(mu + nv)]$$
$$r_{uu}(m,n) = \int_{-0.5}^{0.5}\int_{-0.5}^{0.5} S(u,v)\exp[j2\pi(mu + nv)]\,du\,dv$$

Average power:
$$r_{uu}(0,0) = \int_{-0.5}^{0.5}\int_{-0.5}^{0.5} S(u,v)\,du\,dv$$

Page 26: Mathematical Preliminaries

(ex) The SDF of a stationary white noise field:
$$r_{xx}(m,n) = \sigma^2\delta(m,n) \;\Longrightarrow\; S(u,v) = \sigma^2$$

Page 27: Mathematical Preliminaries

Estimation Theory

Mean square estimates
Estimate the random variable x by a suitable function g(y) of the observation y, such that
$$E[|x - g(\mathbf{y})|^2] = \iint [x - g(y)]^2 f_{xy}(x,y)\,dx\,dy \quad \text{is minimized.}$$
But $f_{xy}(x,y) = f_{x|y}(x|y)\,f_y(y)$, so
$$E[|x - g(\mathbf{y})|^2] = \int f_y(y)\int [x - g(y)]^2 f_{x|y}(x|y)\,dx\,dy$$
Since the integrand is non-negative, it is sufficient to minimize
$$\int [x - g(y)]^2 f_{x|y}(x|y)\,dx \quad \text{for every } y$$
$$\Rightarrow\; \hat g(y) = \int x\, f_{x|y}(x|y)\,dx = E[x \mid y]$$

Page 28: Mathematical Preliminaries

$E[x \mid y]$ is the minimum mean square estimate (MMSE).

Also $E[\hat x] = E[g(\mathbf{y})] = E[E[x \mid \mathbf{y}]] = E[x]$ ⇒ unbiased estimator.

◆ Theorem

Let $\mathbf{y} \triangleq [y_1\; y_2\; y_3 \cdots y_N]^T$ and x be jointly Gaussian with zero mean. The MMSE estimate is
$$E[x \mid \mathbf{y}] = \sum_{i=1}^{N} a_i y_i$$
where the $a_i$ are chosen such that
$$E\Big[\Big(x - \sum_{i=1}^{N} a_i y_i\Big)\,y_k\Big] = 0 \quad \forall\; k = 1, 2, \ldots, N$$

(Pf) The random variables $x - \sum_{i=1}^{N} a_i y_i,\; y_1, y_2, \ldots, y_N$ are jointly Gaussian. But the first one is uncorrelated with all the rest, so it is independent of them. Thus the error $x - \sum_{i=1}^{N} a_i y_i$ is independent of the random vector y.

Page 29: Mathematical Preliminaries

By this independence and the zero means,
$$E\Big[\Big(x - \sum_{i=1}^{N} a_i y_i\Big) \,\Big|\, \mathbf{y}\Big] = E\Big[x - \sum_{i=1}^{N} a_i y_i\Big] = E[x] - \sum_{i=1}^{N} a_i E[y_i] = 0$$
$$\Rightarrow\; E[x \mid \mathbf{y}] = E\Big[\sum_{i=1}^{N} a_i y_i \,\Big|\, \mathbf{y}\Big] = \sum_{i=1}^{N} a_i y_i$$

In general, minimizing the mean square error over linear estimates:
$$\min_{\{\alpha(n)\}} E[(x - \hat x)^2] = \min_{\{\alpha(n)\}} E[e^2], \qquad e = x - \sum_{n=1}^{N} \alpha(n)\,y(n) \;:\; \text{estimation error}$$
Setting
$$\frac{\partial E[e^2]}{\partial \alpha(n)} = 0 \quad \text{yields} \quad E[y(n)\,e] = 0, \quad n = 1, 2, \ldots, N$$

Page 30: Mathematical Preliminaries

The estimation error is minimized if
$$E[y(n)\,e] = 0, \quad n = 1, 2, \ldots, N \qquad \text{: orthogonality principle}$$

If x and {y(n)} are independent:
$$\hat x = E[x \mid \mathbf{y}] = E[x]$$

If they are zero-mean Gaussian random variables:
$$\hat x = \sum_{n=1}^{N} \alpha(n)\,y(n) \quad \text{: a linear combination of } \{y(n)\}$$
where $\alpha(n)$ is determined by solving linear equations.

Page 31: Mathematical Preliminaries

Orthogonality principle
The minimum mean square estimation error vector $x - \hat x$ is orthogonal to every random variable functionally related to the observations, i.e., for any $g(\mathbf{y}) = g(y(1), y(2), \ldots, y(N))$:
$$E[(x - \hat x)\,g(\mathbf{y})] = 0$$
Since $\hat x$ is a function of $\mathbf{y}$:
$$E[\hat x\, g(\mathbf{y})] = E[E[x \mid \mathbf{y}]\,g(\mathbf{y})] = E[E[x\,g(\mathbf{y}) \mid \mathbf{y}]] = E[x\,g(\mathbf{y})]$$

Substituting $\hat x = \sum_{n=1}^{N}\alpha(n)\,y(n)$:
$$\sum_{n=1}^{N} \alpha(n)\,E[y(k)\,y(n)] = E[x\,y(k)], \quad k = 1, \ldots, N$$
In matrix notation:
$$R_y \boldsymbol{\alpha} = \mathbf{r}_{xy}, \qquad R_y = \{E[y(k)\,y(n)]\},\; \mathbf{r}_{xy} = \{E[x\,y(n)]\}$$
In particular $E[(x - \hat x)\,\hat x] = 0$ and $E[(x - \hat x)\,g(\mathbf{y})] = 0$.

Page 32: Mathematical Preliminaries

Minimum MSE:
$$\sigma_e^2 = E[(x - \hat x)^2] = \sigma_x^2 - \boldsymbol{\alpha}^T \mathbf{r}_{xy}$$
If x, y(n) are nonzero-mean r.v.'s:
$$\hat x = \hat\mu_x + \sum_{n=1}^{N} \alpha(n)\,[y(n) - \mu_y(n)]$$
If x, y(n) are non-Gaussian, the results still give the best linear mean square estimate.
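A sketch of the linear MMSE estimator via the normal equations; the model generating x from the observations below is an arbitrary assumption for illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
n, N = 100_000, 3

y = rng.standard_normal((n, N))  # zero-mean observations y(1..N), one row per sample
x = y @ np.array([0.5, -1.0, 2.0]) + 0.1 * rng.standard_normal(n)

R_y = (y.T @ y) / n                 # sample {E[y(k)y(n)]}
r_xy = (y.T @ x) / n                # sample {E[x y(n)]}
alpha = np.linalg.solve(R_y, r_xy)  # normal equations R_y alpha = r_xy

e = x - y @ alpha
# Orthogonality principle: the error is uncorrelated with every observation.
assert np.all(np.abs((y.T @ e) / n) < 1e-8)
```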

Page 33: Mathematical Preliminaries

Information Theory

Information:
$$I_k = -\log_2 p_k \;[\text{bits}]$$
$p_k,\; k = 1, \ldots, L$ : probabilities of independent messages.

Entropy:
$$H = -\sum_{k=1}^{L} p_k \log_2 p_k \;[\text{bits/message}]$$
$$H_{\max} = -\sum_{k=1}^{L} \frac{1}{L}\log_2\frac{1}{L} = \log_2 L \;[\text{bits}] \quad \text{when } p_k = \frac{1}{L}$$

For a binary source, i.e. $L = 2$, $p_1 = p$, $p_2 = 1 - p$, $0 \le p \le 1$:
$$H = -p\log_2 p - (1 - p)\log_2(1 - p)$$
$H(p)$ is zero at $p = 0$ and $p = 1$ and peaks at $p = 0.5$.
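A minimal entropy helper, checking the binary and uniform cases above:

```python
import numpy as np

def entropy(p):
    """Entropy H = -sum p_k log2 p_k of a discrete distribution (0 log 0 := 0)."""
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))

# Binary source: maximum of 1 bit at p = 0.5, zero at the deterministic ends.
assert np.isclose(entropy([0.5, 0.5]), 1.0)
assert np.isclose(entropy([1.0, 0.0]), 0.0)
# Uniform source over L symbols attains H_max = log2 L.
assert np.isclose(entropy([0.25] * 4), 2.0)
```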

Page 34: Mathematical Preliminaries

Information Theory

Let x be a discrete r.v. with $S_x = \{1, 2, \ldots, K\}$ and $p_k = P_r[x = k]$, and let event $A_k \triangleq \{x = k\}$.

The uncertainty of $A_k$ is low if $p_k$ is close to one, and it is high if $p_k$ is small.

Uncertainty of an event:
$$I_x(k) = \ln\frac{1}{P_r(x = k)}, \qquad I_x(k) = 0 \;\text{ if } P_r(x = k) = 1$$

Entropy:
$$H_x = E[I_x(k)] = \sum_{k=1}^{K} P_r(x = k)\,\ln\frac{1}{P_r(x = k)}$$

Unit: bit, when the logarithm is base 2.

Page 35: Mathematical Preliminaries

Entropy as a measure of information

Consider the event $A_k$, describing the emission of symbol $s_k$ by the source with probability $p_k$:

1) if $p_k = 1$ and $p_i = 0$ for all $i \ne k$: no surprise ⇒ no information when $s_k$ is emitted by the source.

2) if $p_k$ is low: more surprise ⇒ more information when $s_k$ is emitted by the source.

$$I(s_k) = \log\frac{1}{p_k} \;:\; \text{amount of information gained after observing the event } s_k$$
$$H_x = E[I(s_k)] \;:\; \text{average information per source symbol}$$

Page 36: Mathematical Preliminaries

Ex) 16 balls: 4 balls "1", 4 balls "2", 2 balls "3", 2 balls "4", and 1 ball each of "5", "6", "7", "8".

Question: find out the number of the ball through a series of yes/no questions.

$$H_x = 2\cdot\frac{1}{4}\log_2 4 + 2\cdot\frac{1}{8}\log_2 8 + 4\cdot\frac{1}{16}\log_2 16 = \frac{44}{16} = 2.75 \;\text{bits/ball}$$

1) Ask "x=1?", "x=2?", ..., "x=7?" in turn, stopping at the first yes (after seven no's, x=8):

x=1? —yes→ x=1; —no→ x=2? —yes→ x=2; —no→ ... x=7? —yes→ x=7; —no→ x=8

The average number of questions asked:
$$E[L] = 1\cdot\frac{1}{4} + 2\cdot\frac{1}{4} + 3\cdot\frac{1}{8} + 4\cdot\frac{1}{8} + 5\cdot\frac{1}{16} + 6\cdot\frac{1}{16} + 7\cdot\frac{1}{16} + 7\cdot\frac{1}{16} = \frac{51}{16}$$

Page 37: Mathematical Preliminaries

2) Halve the candidate set at each step:

x≤2? —yes→ x=1? (yes: x=1, no: x=2)
     —no→ x≤4? —yes→ x=3? (yes: x=3, no: x=4)
           —no→ x≤6? —yes→ x=5? (yes: x=5, no: x=6)
                 —no→ x=7? (yes: x=7, no: x=8)

$$E[L] = 2\cdot\frac{1}{4} + 2\cdot\frac{1}{4} + 3\cdot\frac{1}{8} + 3\cdot\frac{1}{8} + 4\cdot\frac{1}{16} + 4\cdot\frac{1}{16} + 4\cdot\frac{1}{16} + 4\cdot\frac{1}{16} = \frac{44}{16}$$

⇒ The problem of designing the series of questions to identify x is exactly the same as the problem of encoding the output of an information source.
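The two questioning strategies and the entropy can be tallied directly:

```python
import numpy as np

p = np.array([4, 4, 2, 2, 1, 1, 1, 1]) / 16.0     # P[x = k], k = 1..8
H = -np.sum(p * np.log2(p))                       # entropy in bits/ball

questions_1 = np.array([1, 2, 3, 4, 5, 6, 7, 7])  # ask x=1?, x=2?, ... in turn
questions_2 = np.array([2, 2, 3, 3, 4, 4, 4, 4])  # halve the candidate set

EL1 = np.sum(p * questions_1)
EL2 = np.sum(p * questions_2)
assert np.isclose(H, 2.75)       # 44/16 bits/ball
assert np.isclose(EL1, 51 / 16)  # strategy 1: 3.1875 questions on average
assert np.isclose(EL2, 2.75)     # strategy 2 attains the entropy
```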

Page 38: Mathematical Preliminaries

x     p_k    fixed-length    questions            variable-length
x=1   1/4    000             yes / yes            ⇒ 1 1
x=2   1/4    001             yes / no             ⇒ 1 0
x=3   1/8    010             no / yes / yes       ⇒ 0 1 1
x=4   1/8    011             no / yes / no        ⇒ 0 1 0
x=5   1/16   100             no / no / yes / yes  ⇒ 0 0 1 1
x=6   1/16   101             no / no / yes / no   ⇒ 0 0 1 0
x=7   1/16   110             no / no / no / yes   ⇒ 0 0 0 1
x=8   1/16   111             no / no / no / no    ⇒ 0 0 0 0

3 bit/symbol fixed-length code vs. a variable-length code matched to the $p_k$
⇒ Huffman code
⇒ short codes for frequent source symbols, long codes for rare source symbols

The entropy of x gives the minimum average number of bits required to identify the outcome of x.

Page 39: Mathematical Preliminaries

Noiseless Coding Theorem (Shannon, 1948): min(R) = H(x) + ε bits/symbol

where R is the transmission rate and ε is a positive quantity that can be made arbitrarily close to zero by a sophisticated coding procedure utilizing an appropriate amount of encoding delay.

Page 40: Mathematical Preliminaries

Rate distortion function
Distortion:
$$D = E[(x - y)^2]$$
x : Gaussian r.v. of variance $\sigma^2$; y : reproduced value.

Rate distortion function of x (Gaussian source):
$$R_D = \begin{cases} \dfrac{1}{2}\log_2\dfrac{\sigma^2}{D}, & 0 \le D \le \sigma^2 \\[4pt] 0, & D > \sigma^2 \end{cases}
\;=\; \max\Big[0,\; \frac{1}{2}\log_2\frac{\sigma^2}{D}\Big]$$

For Gaussian r.v.'s $\{x(0), x(1), \ldots, x(N{-}1)\}$ with reproduced values $\{y(0), y(1), \ldots, y(N{-}1)\}$ and a fixed average distortion D:
$$R = \frac{1}{N}\sum_{k=0}^{N-1} \max\Big[0,\; \frac{1}{2}\log_2\frac{\sigma_k^2}{\theta}\Big]$$
where $\theta$ is determined by solving
$$D = \frac{1}{N}\sum_{k=0}^{N-1} \min[\theta,\, \sigma_k^2]$$
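A sketch of evaluating R for given variances: θ is found by bisection from the distortion constraint (reverse water-filling); the single-source check recovers R(D) = ½ log₂(σ²/D):

```python
import numpy as np

def rate_distortion(sigma2, D):
    """Rate (bits/sample) for independent Gaussian sources with variances
    sigma2 at average distortion D: find theta with D = mean(min(theta, sigma2)),
    then R = mean(max(0, 0.5 * log2(sigma2 / theta)))."""
    sigma2 = np.asarray(sigma2, dtype=float)
    lo, hi = 0.0, float(sigma2.max())
    for _ in range(100):  # bisection on theta
        theta = 0.5 * (lo + hi)
        if np.mean(np.minimum(theta, sigma2)) < D:
            lo = theta
        else:
            hi = theta
    return float(np.mean(np.maximum(0.0, 0.5 * np.log2(sigma2 / theta))))

# Single Gaussian source: R(D) = 0.5 * log2(sigma^2 / D).
assert np.isclose(rate_distortion([4.0], 1.0), 0.5 * np.log2(4.0), atol=1e-6)
```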