
Unit Review & Exam Preparation

M513 – Advanced DSP Techniques

M513 – Main Topics Covered in the 2011/2012 Academic Year

1. Review of DSP Basics

2. Random Signal Processing

3. Optimal and Adaptive Filters

4. Spectrum (PSD) Estimation Techniques

(Exam questions will mainly come from Parts 2 and 3, but good knowledge of Part 1 is needed!)

Part 1 – Review of DSP Basics

DSP = Digital Signal Processing =

Signal Analysis + Signal Processing

… performed in discrete-time domain

• Fourier Transform Family
• More general transforms (z-transform)
• LTI Systems and Convolution
• Guide to LTI Systems

Signal Analysis

• To analyse signals in the time domain we can use the appropriate member of the Fourier transform family

Fourier Transforms - Summary

| Transform | Time domain | Frequency domain |
|---|---|---|
| Fourier Transform | continuous, aperiodic | continuous, aperiodic |
| Fourier Series | continuous, periodic | discrete, aperiodic |
| Discrete-Time Fourier Transform | discrete, aperiodic | continuous, periodic |
| Discrete Fourier Transform | discrete, periodic | discrete, periodic |

Fourier Transform:

$$X(\Omega) = \int_{-\infty}^{\infty} x(t)\,e^{-j\Omega t}\,dt, \qquad x(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} X(\Omega)\,e^{j\Omega t}\,d\Omega$$

Fourier Series:

$$X(k) = \frac{1}{T}\int_{T} x_p(t)\,e^{-jk\Omega_0 t}\,dt, \qquad x_p(t) = \sum_{k=-\infty}^{\infty} X(k)\,e^{jk\Omega_0 t}$$

Discrete-Time Fourier Transform:

$$X(e^{j\omega}) = \sum_{n=-\infty}^{\infty} x(n)\,e^{-j\omega n}, \qquad x(n) = \frac{1}{2\pi}\int_{2\pi} X(e^{j\omega})\,e^{j\omega n}\,d\omega$$

Discrete Fourier Transform:

$$X(k) = \sum_{n=0}^{N-1} x(n)\,e^{-j\frac{2\pi}{N}nk}, \qquad x(n) = \frac{1}{N}\sum_{k=0}^{N-1} X(k)\,e^{j\frac{2\pi}{N}nk}$$

Fourier Transforms

• The following analogies can be seen:

Periodic in time ↔ Discrete in Frequency

Aperiodic in time ↔ Continuous in Frequency

Continuous in Time ↔ Aperiodic in Frequency

Discrete in Time ↔ Periodic in Frequency

More general transforms

• Two more transforms are introduced in order to generalise Fourier transforms for both continuous and discrete-time domain signals

• To understand their region of operation it is important to recognise that both the CTFT and the DTFT operate on only one limited part of the whole complex plane (the plane of complex values)

• The CTFT operates on the frequency axis, i.e. the line $\sigma = 0$ of the complex plane $s = \sigma + j\Omega$ (i.e. $s = j\Omega$).

• The DTFT operates on the frequency circle, i.e. the curve $r = 1$ of the complex plane $z = re^{j\omega}$ (i.e. $z = e^{j\omega}$).

From Laplace to Z-Transform

• Evaluate the Laplace transform of the sampled signal $x_s(t)$ and substitute $z = e^{sT}$:

$$X_s(s) = \sum_{n} x(nT)\,e^{-snT} \;\;\xrightarrow{\;z\,=\,e^{sT}\;}\;\; X(z) = \sum_{n} x(nT)\,z^{-n}$$

From Laplace to Z-Transform

• Consider again the substitution we made on the previous slide, $z = e^{sT}$:

$$z = e^{sT} = e^{(\sigma + j\Omega)T} = e^{\sigma T}\,e^{j\Omega T} = e^{\sigma T}(\cos\Omega T + j\sin\Omega T)$$

$$|z| = e^{\sigma T}\,|\cos\Omega T + j\sin\Omega T| = e^{\sigma T}, \qquad e^{\sigma T} < 1 \;\text{ for }\; \sigma < 0$$

i.e. the left half of the s-plane ($\sigma < 0$) maps into the interior of the unit circle in the z-plane ($|z| < 1$).

jΩ Axis in s-to-z Mapping

[Figure: the segment of the jΩ axis between $-\Omega_s/2$ and $\Omega_s/2$ in the s-plane maps onto the unit circle $|z| = 1$ in the z-plane under $z = e^{sT}$.]

Signal Processing

• Delay … signal
• Scale … signal
• Add … two or more samples (from the same or different signals)

Signal Filtering ⇒ Convolution

Convolution

• Gives the system input – system output relationship for LTI type systems (both DT and CT).

x(t) → [System] → y(t)
x(n) → [System] → y(n)

Impulse Response of the System

• Let h(n) be the response of the system to the δ(n) impulse input (i.e. the Impulse Response of the System)

• we denote this as δ(n) → h(n)

δ(n) → [LTI System] → h(n)

Time-invariance

• For an LTI system, if δ(n) → h(n), then δ(n−k) → h(n−k)

(this is the so-called time-invariance property of the system)

δ(n−k) → [LTI System] → h(n−k)

Linearity

• Linearity implies the following system behavior:

x(n) → [LTI System] → y(n)
a·x(n) → [LTI System] → a·y(n)
x₁(n) + x₂(n) → [LTI System] → y₁(n) + y₂(n)

Linearity and Time-Invariance

• We can now combine time-invariance and linearity:

Σₖ δ(n−k) → [LTI System] → Σₖ h(n−k)
Σₖ x(k) δ(n−k) → [LTI System] → Σₖ x(k) h(n−k)

Convolution Sum

• I.e., if δ(n) → h(n), then δ(n−k) → h(n−k), and:

$$y(n) = \sum_{k} x(k)\,h(n-k)$$

• i.e. the system output is the sum of lots of delayed impulse responses (i.e. responses to the individual, scaled impulse signals which make up the whole DT input signal)

• This sum is called the CONVOLUTION SUM
• Sometimes we use ∗ to denote the convolution operation, i.e.

$$y(n) = \sum_{k} x(k)\,h(n-k) = x(n) * h(n)$$

Convolution Sum for CT

• Similarly, for continuous-time signals and systems (but a little bit more complicated):

$$x(t) = \int_{-\infty}^{\infty} x(\tau)\,\delta(t-\tau)\,d\tau$$

• The above expression basically describes the analogue (CT) input signal x(t) as an integral (i.e. sum) of an infinite number of time-shifted and scaled impulse functions.

Important fact about convolution

Convolution in t domain ↔ Multiplication in f domain

but we also have

Multiplication in t domain ↔ Convolution in f domain
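This duality is easy to verify numerically; a small sketch (numpy assumed), zero-padding so the DFT product corresponds to the full linear convolution:

```python
import numpy as np

x = np.random.randn(16)
h = np.random.randn(8)
N = len(x) + len(h) - 1              # pad so circular convolution equals linear
lhs = np.fft.fft(np.convolve(x, h), N)
rhs = np.fft.fft(x, N) * np.fft.fft(h, N)
print(np.allclose(lhs, rhs))         # True: convolution <-> multiplication
```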

Discrete LTI Systems in Two Domains

x(n) → [h(n)] → y(n)
X(z) → [H(z)] → Y(z)

h(n) – impulse response, H(z) – transfer function of DT LTI system

Summary

DT:
• H(z) is the z-transform of the system impulse response – the System Transfer Function.
• H(ω) is the discrete-time Fourier transform of the system impulse response – the System Frequency Response.

CT:
• H(s) is the Laplace transform of the system impulse response – the System Transfer Function.
• H(Ω) is the Fourier transform of the system impulse response – the System Frequency Response.

Guide to Discrete LTI Systems

[Diagram: Difference Equation ↔ Impulse Response h(n) ↔ Transfer Function H(z) ↔ Frequency Response H(ω); h(n) → H(z) via the ZT and back via the IZT; h(n) → H(ω) via the DTFT and back via the IDTFT; H(z) → H(ω) via the substitution z = e^{jω}; moves to and from the difference equation include some mathematical manipulations.]

Guide to Continuous LTI Systems

[Diagram: Differential Equation ↔ Impulse Response h(t) ↔ Transfer Function H(s) ↔ Frequency Response H(Ω); h(t) → H(s) via the LT and back via the ILT; h(t) → H(Ω) via the FT and back via the IFT; H(s) → H(Ω) via the substitution s = jΩ; moves to and from the differential equation include some mathematical manipulations.]

Example(s) - Use guide to LTI systems to move between the various descriptions of the system.

E.g. calculate the transfer function and frequency response of the IIR filter given by the following difference equation:

$$y(n) - 0.7\,y(n-1) + 0.3\,y(n-2) = 6\,x(n-1)$$

$$Y(z) - 0.7\,z^{-1}Y(z) + 0.3\,z^{-2}Y(z) = 6\,z^{-1}X(z)$$

$$(1 - 0.7\,z^{-1} + 0.3\,z^{-2})\,Y(z) = 6\,z^{-1}X(z)$$

$$G(z) = \frac{Y(z)}{X(z)} = \frac{6\,z^{-1}}{1 - 0.7\,z^{-1} + 0.3\,z^{-2}} = \frac{6\,z}{z^2 - 0.7\,z + 0.3}$$

$$G(e^{j\omega}) = \frac{6\,e^{-j\omega}}{1 - 0.7\,e^{-j\omega} + 0.3\,e^{-j2\omega}}$$

Example(s) - Use guide to LTI systems to move between the various descriptions of the system.

Having obtained the frequency response, the system response to any frequency F₁ can easily be calculated; e.g. for ¼ of the sampling frequency F_s we have:

$$f_1 = \frac{F_1}{F_s} = \frac{F_s/4}{F_s} = \frac{1}{4}, \qquad \omega_1 = 2\pi f_1 = \frac{\pi}{2}$$

$$G(F_1) = G(e^{j\pi/2}) = \frac{6\,e^{-j\pi/2}}{1 - 0.7\,e^{-j\pi/2} + 0.3\,e^{-j\pi}} = \frac{-j6}{0.7 + 0.7j} = \ldots$$

(note, in general this is a complex number so both phase and amplitude/gain can be calculated)
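For illustration, the same substitution can be evaluated numerically; a minimal numpy sketch (variable names are illustrative only):

```python
import numpy as np

w = np.pi / 2                        # omega_1 corresponding to F1 = Fs/4
G = 6 * np.exp(-1j * w) / (1 - 0.7 * np.exp(-1j * w) + 0.3 * np.exp(-2j * w))
print(abs(G), np.angle(G))           # amplitude/gain and phase at Fs/4
```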

Example(s) - Use guide to LTI systems to move between the various descriptions of the system.

The opposite problem can also be easily solved; e.g. for the IIR filter with the transfer function:

$$H(z) = \frac{z^2 + 0.2\,z + 0.08}{z^2 - 0.5}$$

find the corresponding difference equation to implement the system.

$$\frac{Y(z)}{X(z)} = \frac{z^2 + 0.2\,z + 0.08}{z^2 - 0.5} = \frac{1 + 0.2\,z^{-1} + 0.08\,z^{-2}}{1 - 0.5\,z^{-2}}$$

$$(1 - 0.5\,z^{-2})\,Y(z) = (1 + 0.2\,z^{-1} + 0.08\,z^{-2})\,X(z)$$

$$y(n) - 0.5\,y(n-2) = x(n) + 0.2\,x(n-1) + 0.08\,x(n-2)$$

$$y(n) = x(n) + 0.2\,x(n-1) + 0.08\,x(n-2) + 0.5\,y(n-2)$$
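A small sketch showing one way to run this difference equation, assuming the reconstruction above and using scipy.signal.lfilter with the coefficients written in negative powers of z, cross-checked against the explicit recursion:

```python
import numpy as np
from scipy.signal import lfilter

# H(z) = (z^2 + 0.2 z + 0.08) / (z^2 - 0.5) in negative powers of z:
b = [1.0, 0.2, 0.08]      # numerator:   1 + 0.2 z^-1 + 0.08 z^-2
a = [1.0, 0.0, -0.5]      # denominator: 1 - 0.5 z^-2

x = np.random.randn(5)
y = lfilter(b, a, x)

# Same recursion written out: y(n) = x(n) + 0.2 x(n-1) + 0.08 x(n-2) + 0.5 y(n-2)
y2 = np.zeros_like(x)
for n in range(len(x)):
    y2[n] = (x[n] + 0.2 * (x[n-1] if n >= 1 else 0)
             + 0.08 * (x[n-2] if n >= 2 else 0)
             + 0.5 * (y2[n-2] if n >= 2 else 0))
print(np.allclose(y, y2))  # True
```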

Part 2 – Random Signals

Random signals – unpredictable type of signals (… well, more or less).

• Moments of random signals – m_x, r_xx(m)
• Autocorrelation ↔ PSD
• Filtering Random Signals – Spectral Factorisation Equations (in three domains)
• Deconvolution and Inverse Filtering
• Minimum and Non-minimum phase systems/filters

Signal Classification

• Deterministic signals
– can be characterised by a mathematical equation

• Random (Nondeterministic, Stochastic) signals
– cannot be characterised by a mathematical equation
– usually characterised by their statistics

Random (Nondeterministic, Stochastic) signals can be further classified as:

• Stationary signals – if their statistics do not change with time

• Nonstationary signals – if their statistics change with time

Signal Classification

• Wide-Sense Stationary (random) signals – random signals with constant signal statistics up to the 2nd order

• Ergodic (random) signals – random signals whose statistics can be measured by time averaging rather than ensemble averaging (i.e. the expectation of an ergodic signal is its time average)

• For simplicity reasons, we study Wide-Sense Stationary (WSS) and Ergodic Signals

1st Order Signal Statistics

• The mean value m_x of the signal x(n) is its 1st-order statistic:

$$m_x = E[x(n)] = \lim_{N\to\infty}\frac{1}{2N+1}\sum_{n=-N}^{N}x(n) = \int_{-\infty}^{\infty} x\,p(x)\,dx$$

(E – expectation operator; the single-waveform average applies because the signal is ergodic, so there is no need for ensemble averages; the integral is the more general equation.)

If m_x is constant over time, we are talking about a stationary signal x(n).

2nd Order Statistics

• The autocovariance of the signal is a 2nd-order signal statistic. It is calculated according to:

$$c_{xx}(k,l) = E\big[(x(k)-m_x)\,(x(l)-m_x)^*\big]$$

where * denotes the complex conjugate in the case of complex signals.

• For equal lags, i.e. k = l, the autocovariance reduces to the variance:

$$c_{xx}(k,k) = \sigma_x^2$$

• The variance σ² of the signal is:

$$\sigma_x^2 = E\big[(x(n)-m_x)^2\big] = E[x^2(n)] - m_x^2$$

• The variance can be considered as a measure of the signal's dispersion around its mean.

Analogies to Electrical signals

• Two zero-mean signals, with different variances

[Figure: two zero-mean signals with different variances.]

m_x – mean – DC component of electrical signal
m_x² – mean squared – DC power
E[x²(n)] – mean square – total average power
σ_x² – variance – AC power
σ_x – standard deviation – rms value

Autocorrelation

• This is also a 2nd-order statistic of the signal and is very similar (in some cases identical) to the autocovariance.

• The autocorrelation of the signal is basically the (averaged) product of the signal and its shifted version:

$$r_{xx}(k,l) = E[x(k)\,x(l)] = \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N}x(k)\,x(k+m)$$

where m = l − k.

• Autocorrelation is a measure of signal predictability – a correlated signal is one that has redundancy (it is compressible, e.g. speech, audio or video signals).

Autocorrelation

• for m = l − k we sometimes use the notation r_xx(m), or even r_x(m), instead of r_xx(k,l), i.e.

$$r_x(m) = r_{xx}(m) = r_{xx}(k,l) = E[x(k)\,x(l)] = E[x(k)\,x(k+m)] = \lim_{N\to\infty}\frac{1}{2N+1}\sum_{k=-N}^{N}x(k)\,x(k+m)$$

Autocorrelation and Autocovariance

• For zero-mean, stationary signals these two quantities are identical:

$$c_{xx}(m) = c_{xx}(k,l) = E\big[(x(k)-m_x)(x(l)-m_x)\big] = E[x(k)\,x(l)] = E[x(k)\,x(k+m)] = r_{xx}(m)$$

where m_x = 0.

• Also notice that the variance of a zero-mean signal then corresponds to the zero-lag autocorrelation:

$$c_{xx}(k,k) = c_{xx}(0) = r_{xx}(0) = \sigma_x^2$$

Autocorrelation

• Two important autocorrelation properties:

$$r_{xx}(k) = r_{xx}(-k), \qquad r_{xx}(0) \ge |r_{xx}(k)|$$

• r_xx(0) is essentially the signal power, so it must be larger than (or equal to) any other autocorrelation value of that signal (another way of looking at this property is to realise that a sample is best correlated with itself).
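Both properties are easy to observe on a time-average estimate of the ACF; a minimal numpy sketch with an illustrative helper acf_biased (not from the lecture notes):

```python
import numpy as np

def acf_biased(x, max_lag):
    """Biased time-average estimate r_xx(m) = (1/N) * sum_n x(n) x(n+m)."""
    N = len(x)
    return np.array([np.sum(x[:N - m] * x[m:]) / N for m in range(max_lag + 1)])

x = np.random.randn(10000)           # white-ish test signal
r = acf_biased(x, 5)
print(r[0], np.max(np.abs(r[1:])))   # r(0) ~ variance, dominates all other lags
```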

Example (Tutorial 1, Problems 2, 3)

The random-phase family of sinusoids:

$$x(n) = A\cos(\omega_k n + \theta)$$

A and ω_k are fixed constants and θ is a uniformly distributed random variable (i.e. equally likely to have any value in the interval −π to π).

Prove the stationarity of this process, i.e.

a) Find the mean and variance values (should be constant)

b) Find the autocorrelation function (ACF) (should depend only on the time-lag value m, otherwise constant)

Example (Tutorial 1, Problems 2, 3)

$$m_x = E[x] = \int_{-\pi}^{\pi} x\,p(\theta)\,d\theta = \frac{A}{2\pi}\int_{-\pi}^{\pi}\cos(\omega_k n + \theta)\,d\theta$$

$$= \frac{A}{2\pi}\int_{-\pi}^{\pi}\big[\cos(\omega_k n)\cos\theta - \sin(\omega_k n)\sin\theta\big]\,d\theta$$

$$= \frac{A\cos(\omega_k n)}{2\pi}\int_{-\pi}^{\pi}\cos\theta\,d\theta - \frac{A\sin(\omega_k n)}{2\pi}\int_{-\pi}^{\pi}\sin\theta\,d\theta = 0 - 0 = 0 = \text{const.}$$

Example (Tutorial 1, Problems 2, 3)

$$\sigma_x^2 = E[x^2] - (E[x])^2 = E[x^2]$$

$$= \int_{-\pi}^{\pi} x^2\,p(\theta)\,d\theta = \frac{A^2}{2\pi}\int_{-\pi}^{\pi}\cos^2(\omega_k n + \theta)\,d\theta$$

$$= \frac{A^2}{2\pi}\int_{-\pi}^{\pi}\frac{1}{2}\big[1 + \cos(2\omega_k n + 2\theta)\big]\,d\theta$$

$$= \frac{A^2}{4\pi}\int_{-\pi}^{\pi}d\theta + \frac{A^2}{4\pi}\int_{-\pi}^{\pi}\cos(2\omega_k n + 2\theta)\,d\theta = \frac{A^2}{2} + 0 = \frac{A^2}{2} = \text{const.}$$

Example (Tutorial 1, Problems 2, 3)

The random-phase family of sinusoids:

$$x(n) = A\cos(\omega_k n + \theta)$$

A and ω_k are fixed constants and θ is a uniformly distributed random variable (i.e. equally likely to have any value in the interval −π to π).

Discuss two approaches to calculating the ACF for this process. We can work directly from the definition:

$$r_x(m) = r_{xx}(m) = r_{xx}(n, n+m) = E[x(n)\,x(n+m)] = A^2\,E\big[\cos(\omega_k n + \theta)\cos(\omega_k (n+m) + \theta)\big]$$

or we can go via the PSD:

$$x(n) \;\xrightarrow{\text{DTFT}}\; X(e^{j\omega}) \;\rightarrow\; |X(e^{j\omega})|^2 = R_{xx}(e^{j\omega}) \;\xrightarrow{\text{IDTFT}}\; r_{xx}(m)$$

(either route leads to $r_{xx}(m) = \tfrac{A^2}{2}\cos(\omega_k m)$, which depends only on the lag m)

Power Spectral Density (PSD)

• The Power Spectral Density is the discrete-time Fourier transform of the autocorrelation function:

$$R_{xx}(e^{j\omega}) = \sum_{m=-\infty}^{\infty} r_{xx}(m)\,e^{-j\omega m}$$

• The PSD contains power information (i.e. the distribution of signal power across the frequency spectrum) but has no phase information (the PSD is “phase blind”).

• The PSD is always real and non-negative.

White Noise

• A white noise signal has a perfectly flat power spectrum (equal to the variance of the signal, σ_w²):

$$R_{ww}(e^{j\omega}) = \sigma_w^2$$

• The autocorrelation of white noise is a unit impulse with amplitude σ_w² – white noise is a perfectly uncorrelated signal:

$$r_{ww}(m) = \sigma_w^2\,\delta(m)$$

with R_ww and r_ww linked by the DTFT/IDTFT pair.

(not realisable in practice; we usually use pseudorandom noise with a PSD that is almost flat over a finite frequency range)

Filtering Random Signals

• The filter scales the mean value of the input signal.
• In the time domain the scaling value is the sum of the impulse response:

$$m_y = m_x \sum_{k} h(k)$$

• In the frequency domain the scaling value is the frequency response of the filter at ω = 0:

$$m_y = m_x\,H(e^{j0})$$

m_x → [DIGITAL FILTER] → m_y

Filtering Random Signals

• Cross-correlation between the filter input and output signals:

$$r_{yx}(n+k, n) = E[y(n+k)\,x(n)] = \sum_{l} h(l)\,r_{xx}(k-l)$$

or

$$r_{yx}(k) = r_{xx}(k) * h(k)$$

Filtering Random Signals

• Autocorrelation of the filter output:

$$r_{yy}(n+k, n) = E[y(n+k)\,y(n)] = \sum_{l} h(l)\,r_{yx}(k+l)$$

i.e. (using m = −l):

$$r_{yy}(k) = \sum_{m} h(-m)\,r_{yx}(k-m) = r_{yx}(k) * h(-k)$$

Filtering Random Signals

• The autocorrelation of the filter output therefore depends only on k, the difference between the indices n+k and n, i.e.:

$$r_{yy}(k) = r_{yx}(k) * h(-k)$$

• Combining with:

$$r_{yx}(k) = r_{xx}(k) * h(k)$$

we have:

$$r_{yy}(k) = r_{xx}(k) * h(k) * h(-k)$$

Spectral Factorisation Equations

$$r_y(k) = r_x(k) * h(k) * h^*(-k)$$

r_x → [h(k)] → r_yx → [h*(−m)] → r_y

Taking the DTFT of the above equation:

$$R_y(e^{j\omega}) = H(e^{j\omega})\,H^*(e^{j\omega})\,R_x(e^{j\omega}) = |H(e^{j\omega})|^2\,R_x(e^{j\omega})$$

Taking the ZT of the above equation:

$$R_y(z) = R_x(z)\,H(z)\,H^*(1/z^*)$$

Filtering Random Signals

• In terms of the z-transform:

$$R_y(z) = H(z)\,H^*\!\left(\frac{1}{z^*}\right) R_x(z)$$

• If the filter is real, H(z) = H*(z*), so:

$$R_y(z) = H(z)\,H\!\left(\frac{1}{z}\right) R_x(z)$$

• This is a special case of spectral factorisation.

Example – Tutorial 2, Problem 2

A zero mean white noise signal x(n) is applied to an FIR filter with impulse response sequence {0.5, 0, 0.75}. Derive an expression for the PSD of the signal at the output of the filter.

$$H(z) = 0.5 + 0.75\,z^{-2}$$

$$r_{xx}(m) = \sigma_x^2\,\delta(m) \quad\Rightarrow\quad R_{xx}(z) = \sigma_x^2$$

$$R_{yy}(z) = H(z)\,H(z^{-1})\,R_{xx}(z) = (0.5 + 0.75\,z^{-2})(0.5 + 0.75\,z^{2})\,\sigma_x^2$$

$$= (0.25 + 0.375\,z^{2} + 0.375\,z^{-2} + 0.5625)\,\sigma_x^2 = (0.8125 + 0.375\,z^{2} + 0.375\,z^{-2})\,\sigma_x^2$$

i.e. the only non-zero autocorrelation values of the output are:

$$r_{yy}(-2) = 0.375\,\sigma_x^2; \qquad r_{yy}(0) = 0.8125\,\sigma_x^2; \qquad r_{yy}(2) = 0.375\,\sigma_x^2$$

$$\sigma_y^2 = r_{yy}(0) = 0.8125\,\sigma_x^2$$

On the unit circle ($z = e^{j\omega}$) the PSD is $R_{yy}(e^{j\omega}) = \sigma_x^2\,(0.8125 + 0.75\cos 2\omega)$.
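A quick numerical sanity check of this result (numpy assumed; the exact numbers fluctuate with the random draw):

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(200_000)         # white noise, variance ~1
h = np.array([0.5, 0.0, 0.75])           # FIR impulse response from the example
y = np.convolve(x, h, mode="full")[:len(x)]
print(np.var(y))                         # close to 0.8125 * var(x)
```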

Example - What about IIR type filter and coloured noise signal? (not in Tutorial handouts)

An IIR filter described by the following difference equation:

$$y(n) - 0.4\,y(n-1) = x(n) - 2.5\,x(n-1)$$

is used to process a WSS signal with PSD:

$$R_x(e^{j\omega}) = \frac{1}{1 - 0.5\cos\omega}$$

Find the PSD of the filter output.

Solution – the PSD of the output is required, so the spectral factorisation equation in the ω domain can be used:

$$R_y(e^{j\omega}) = H(e^{j\omega})\,H^*(e^{j\omega})\,R_x(e^{j\omega})$$

Example - What about IIR type filter and coloured noise signal? (not in Tutorial handouts)

First calculate the transfer function of the filter, then find the frequency response:

$$Y(z) - 0.4\,z^{-1}Y(z) = X(z) - 2.5\,z^{-1}X(z)$$

$$H(z) = \frac{Y(z)}{X(z)} = \frac{1 - 2.5\,z^{-1}}{1 - 0.4\,z^{-1}} \quad\Rightarrow\quad H(e^{j\omega}) = \frac{1 - 2.5\,e^{-j\omega}}{1 - 0.4\,e^{-j\omega}}$$

… then apply spectral factorisation:

$$R_y(e^{j\omega}) = H(e^{j\omega})\,H^*(e^{j\omega})\,R_x(e^{j\omega}) = \frac{(1 - 2.5\,e^{-j\omega})(1 - 2.5\,e^{j\omega})}{(1 - 0.4\,e^{-j\omega})(1 - 0.4\,e^{j\omega})}\cdot\frac{1}{1 - 0.5\cos\omega}$$

$$= \frac{7.25 - 5\cos\omega}{(1.16 - 0.8\cos\omega)(1 - 0.5\cos\omega)}$$

Inverse Filters

• Consider the deconvolution problem as represented on the figure below:

• Our task is to design a filter which will reconstruct or reconstitute the original input x(n) from the observed output y(n).

• The reconstruction may have the arbitrary gain A and a delay of d, hence the definition of the required inverse filter is:

x(n) → [H(z)] → y(n) → [H⁻¹(z)] → A·x(n−d)

$$H^{-1}(z)\,H(z) = A\,z^{-d}$$

Inverse Filters

• The inverse system is said to ‘equalise’ the amplitude and phase response of H(z), or to ‘deconvolve’ the output y(n) in order to reconstruct the input x(n).


Inverse Filters - Problem

• If H(z) is a non-minimum phase system, its zeros outside the unit circle become poles of H⁻¹(z) outside the unit circle, and the inverse filter is unstable!

Noise Whitening

• With inverse filtering, x(n) does not have to be a white random sequence, but the inverse filter H₀⁻¹(z) has to reproduce the same sequence x(n).

• For noise whitening, the input x(n) has to be a white random sequence, as does the output u(n) of H₀⁻¹(z), but the sequences x(n) and u(n) are not the same.

Inverse filtering:  x(n) → [H₀(z)] → y(n) → [H₀⁻¹(z)] → x(n)

Noise whitening:  x(n) → [H₁(z)] → y(n) → [H₀⁻¹(z)] → u(n)

Deconvolution using autocorrelation

• Consider the filtering process again:

x(n) → [h(k)] → y(n), with $r_{yx}(k) = r_x(k) * h(k)$

• The matrix equation to be solved in order to estimate the impulse response h(k) before attempting the deconvolution is given on the next slide:

Deconvolution using autocorrelation

• using matrix notation:

• The aim is to obtain coefficients b(0), b(1), …, b(L) which can be done by inverting the matrix Rxx.

• This matrix is known as autocorrelation matrix and can be used as important 2nd order characterisation of random signal.

$$\begin{bmatrix} r_{yx}(0)\\ r_{yx}(1)\\ \vdots\\ r_{yx}(L) \end{bmatrix} = \begin{bmatrix} r_{xx}(0) & r_{xx}(1) & \cdots & r_{xx}(L)\\ r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}(L-1)\\ \vdots & \vdots & \ddots & \vdots\\ r_{xx}(L) & r_{xx}(L-1) & \cdots & r_{xx}(0) \end{bmatrix} \begin{bmatrix} b(0)\\ b(1)\\ \vdots\\ b(L) \end{bmatrix}$$

$$\mathbf{r}_{yx} = \mathbf{R}_{xx}\,\mathbf{b}$$

Deconvolution using autocorrelation

• The solution of the equation from the previous slide:

$$\mathbf{r}_{yx} = \mathbf{R}_{xx}\,\mathbf{b}$$

is obviously given by:

$$\mathbf{b} = \mathbf{R}_{xx}^{-1}\,\mathbf{r}_{yx}$$

• It is important to note the structure of the autocorrelation matrix R_xx.
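A minimal numpy/scipy sketch of this solve, with purely illustrative correlation values (not from the tutorial):

```python
import numpy as np
from scipy.linalg import toeplitz

# Hypothetical correlation estimates for a length-3 impulse-response estimate
r_xx = np.array([1.0, 0.5, 0.2])   # r_xx(0), r_xx(1), r_xx(2)
r_yx = np.array([0.9, 0.7, 0.4])   # r_yx(0), r_yx(1), r_yx(2)

R = toeplitz(r_xx)                 # symmetric Toeplitz autocorrelation matrix
b = np.linalg.solve(R, r_yx)       # b = R^-1 r_yx without explicit inversion
print(b)
```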

Toeplitz matrices

• If A_{i,j} is the element of the matrix in the i-th row and j-th column, then A_{i,j} = a_{i−j}:

$$\mathbf{A} = \begin{bmatrix} a_0 & a_1 & a_2 & \cdots & a_{n-1}\\ a_1 & a_0 & a_1 & \cdots & a_{n-2}\\ a_2 & a_1 & a_0 & \cdots & a_{n-3}\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ a_{n-1} & a_{n-2} & a_{n-3} & \cdots & a_0 \end{bmatrix}$$

• Another important DSP operation – convolution – also has a strong relation to a Toeplitz-type matrix, called the convolution matrix.

Convolution matrix

• To form the convolution matrix we need to represent the convolution operation in vector form.

• For example, the output of the FIR filter of length N can be written as:

$$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k) = \mathbf{h}^{T}\mathbf{x} = \mathbf{x}^{T}\mathbf{h}$$

where x is the vector of input samples to the filter and h is the vector of filter coefficients (the impulse response in the case of an FIR filter).

• The above equation represents the case where x(n) = 0 for n < 0.

Decomposition of Autocorrelation Matrix

• The Toeplitz structure of autocorrelation matrix is used in diagonalisation of the autocorrelation matrix – this is the important process of decomposition of the autocorrelation matrix in the form:

$$\mathbf{R} = \mathbf{Q}\,\boldsymbol{\Lambda}\,\mathbf{Q}^{-1} = \mathbf{Q}\,\boldsymbol{\Lambda}\,\mathbf{Q}^{T}$$

here: Λ – diagonal matrix containing the eigenvalues of R; Q – modal matrix containing the eigenvectors of R associated with the eigenvalues in Λ.
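A small numpy sketch of this decomposition on an example symmetric Toeplitz autocorrelation matrix (values illustrative):

```python
import numpy as np
from scipy.linalg import toeplitz

R = toeplitz([1.0, 0.5, 0.2])                  # example autocorrelation matrix
lam, Q = np.linalg.eigh(R)                     # eigenvalues (Lambda), eigenvectors (Q)
print(np.allclose(Q @ np.diag(lam) @ Q.T, R))  # True: R = Q Lambda Q^T
```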

White noise

• What would be the form of the autocorrelation matrix for the case of white noise signal?

• Assuming an ideal white noise sequence, i.e. a perfectly uncorrelated signal, its autocorrelation is a unit impulse with amplitude σ_w²: r_ww(m) = σ_w² δ(m).

• The autocorrelation matrix is in this case diagonal (all non-diagonal elements are zero):

$$\mathbf{R}_{ww} = \sigma_w^2\,\mathbf{I}$$

More on minimum and non-minimum phase systems

• Non-minimum phase systems (sometimes also called mixed-phase systems) are the systems with some of its zeros inside the unit circle and the remaining zeros outside the unit circle.

• If all of its zeros are outside the unit circle, non-minimum phase system is called a maximum phase system.

• The minimum, non-minimum and maximum phase systems can also be recognised by their phase characteristics.

• The phase characteristic of a minimum phase system has zero net phase change between the frequencies ω = 0 and ω = π, while a non-minimum phase system has a non-zero phase change between those frequencies.

More on minimum and non-minimum phase systems

• A maximum phase system has the maximum phase change between the frequencies ω = 0 and ω = π amongst all possible systems with the same amplitude response.

Example (Tutorial 2, Problem 4)

A zero-mean stationary white noise x(n) is applied to a filter with the transfer function:

$$H(z) = \frac{(z - 0.5)(z - 3)}{z^2}$$

Find all filters that can produce the same PSD as the above filter. Are those filters minimum or maximum phase filters?

Using the spectral factorisation equation:

$$S_{yy}(z) = H(z)\,H(z^{-1})\,S_{xx}(z) = \sigma_x^2\,\frac{(z - 0.5)(z - 3)}{z^2}\cdot\frac{(z^{-1} - 0.5)(z^{-1} - 3)}{z^{-2}}$$

Example (Tutorial 2, Problem 4)

Since $(z - 3)(z^{-1} - 3) = 9\,(z - \tfrac{1}{3})(z^{-1} - \tfrac{1}{3})$, the same PSD would be obtained using the filter H₀(z):

$$S_{yy}(z) = \sigma_x^2\,H_0(z)\,H_0(z^{-1}), \qquad H_0(z) = \frac{3\,(z - 0.5)(z - \tfrac{1}{3})}{z^2}$$

H₀(z) has two zeros, 0.5 and 1/3 ≈ 0.333, both inside the unit circle, i.e. this is a minimum phase filter.

The corresponding filter with zeros at z = 2 and z = 3, both outside the unit circle, is a maximum phase filter.

Part 3 – Optimal and Adaptive Digital Filters

Best filters for the task in hand:

• Wiener Filter and Equation
• Finding the minimum of the cost function (MSE) = MMSE
• Steepest Descent algorithm
• LMS and RLS algorithms
• Optimal and Adaptive Filter Configurations (i.e. applications)

Optimal and Adaptive Filters

• Optimal filters are the “best” filters for the particular task. We use knowledge about the signals to design those filters and apply them to the task.

• Adaptive Filters change their coefficients to improve their performance for the given task. They are not fixed and can therefore change their (statistical) properties over time.

• Adaptive Filters may not be optimal but are constantly striving to become optimal

Optimal (Wiener) Filter Design

• System Identification problem:

• We want to estimate the impulse response h(n) of the “unknown” discrete-time system.

• We can use the equation for the cross-correlation between the filter input and output to obtain the estimate for h(n).

x(n) → [h(n) = ?] → y(n)

Optimal (Wiener) Filter Design

1. convolution form:

$$r_{yx}(k) = r_x(k) * h(k)$$

2. matrix form:

$$\begin{bmatrix} r_{yx}(0)\\ r_{yx}(1)\\ \vdots\\ r_{yx}(L-1) \end{bmatrix} = \begin{bmatrix} r_{xx}(0) & r_{xx}(1) & \cdots & r_{xx}(L-1)\\ r_{xx}(1) & r_{xx}(0) & \cdots & r_{xx}(L-2)\\ \vdots & \vdots & \ddots & \vdots\\ r_{xx}(L-1) & r_{xx}(L-2) & \cdots & r_{xx}(0) \end{bmatrix} \begin{bmatrix} \hat h(0)\\ \hat h(1)\\ \vdots\\ \hat h(L-1) \end{bmatrix}$$

3. matrix form in short notation:

$$\mathbf{R}_{yx} = \mathbf{R}_{xx}\,\hat{\mathbf{h}}$$

Optimal (Wiener) Filter Design

3. matrix form in short notation:

$$\mathbf{R}_{yx} = \mathbf{R}_{xx}\,\hat{\mathbf{h}}$$

$\mathbf{R}_{yx}$ – cross-correlation vector (between input and output signals)

$\mathbf{R}_{xx}$ – autocorrelation matrix (of the input signal)

$\hat{\mathbf{h}}$ – estimated impulse response vector

From the above equation we can easily obtain the vector $\hat{\mathbf{h}}$:

$$\hat{\mathbf{h}} = \mathbf{R}_{xx}^{-1}\,\mathbf{R}_{yx}$$

Optimal (Wiener) Filter Design

• The equation

$$\hat{\mathbf{h}} = \mathbf{R}_{xx}^{-1}\,\mathbf{R}_{yx}$$

is also known as the Wiener-Hopf equation.

• Using this equation we have actually estimated (or designed) a filter with an impulse response close (or equal) to the impulse response of the unknown system.

• This type of optimal filter is also known as the Wiener filter.

Optimal (Wiener) Filter Design

• We can approach the problem of designing the Wiener filter estimate of the unknown system in a slightly different way. Consider a block diagram given below:

• A good estimate of the unknown filter impulse response h(n) can be obtained, if the difference/error signal between two outputs (real and estimated system) is minimal (ideally zero).

x(n) → [h(n)] → d(n)
x(n) → [ĥ(n)] → y(n)
e(n) = d(n) − y(n)

Optimal (Wiener) Filter Design

• We use the following notation:
d(n) – output of the unknown system (desired signal)
y(n) – output of the system estimate
x(n) – input signal (same for both systems)
e(n) – error signal, e(n) = d(n) − y(n)

• For e(n) → 0, we expect to achieve a good estimate of the unknown system, i.e.:

$$\hat h(n) \to h(n)$$

Optimal (Wiener) Filter Design

• Wiener filter design is actually a much more general problem

• desired signal d(n) does not have to be the output of the unknown system


Optimal (Wiener) Filter Design

• Another Wiener filter estimation example:

[Diagram: the desired signal d(n) passes through a Signal Distorting System and is corrupted by additive noise w(n), producing x(n); x(n) is filtered by the optimal filter h(n) to give y(n), and the error is e(n) = d(n) − y(n).]

w(n) – noise signal

Task: design (determine) h(n) in order to minimise the error e(n)!

Optimal (Wiener) Filter Design

• Rather than minimising the current value of the error signal e(n), we can choose a more effective approach – minimise the expected value of the squared error, the mean square error (MSE).

• The function to be minimised (the cost function) is therefore the MSE function defined as:

$$J = E[e^2(n)]$$

Mathematical Analysis

Filter output:

$$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k)$$

Error signal:

$$e(n) = d(n) - y(n)$$

MSE (cost) function:

$$J = E[e^2(n)] = E\big[(d(n) - y(n))^2\big] = E\Big[\Big(d(n) - \sum_{k=0}^{N-1} h(k)\,x(n-k)\Big)^2\Big]$$

We can try to minimise this expression directly or switch to matrix/vector notation.

Mathematical Analysis

Using vector notation:

$$\mathbf{x}(n) = \begin{bmatrix} x(n)\\ x(n-1)\\ \vdots\\ x(n-N+1) \end{bmatrix}, \qquad \mathbf{h} = \begin{bmatrix} h(0)\\ h(1)\\ \vdots\\ h(N-1) \end{bmatrix}$$

$$y(n) = \sum_{k=0}^{N-1} h(k)\,x(n-k) = \mathbf{h}^{T}\mathbf{x}(n) = \mathbf{x}^{T}(n)\,\mathbf{h}$$

$$J = E[e^2(n)] = E\big[(d(n) - y(n))^2\big]$$

Mathematical Analysis

$$J = E[e^2(n)] = E\big[(d(n) - y(n))^2\big] = E\big[(d(n) - \mathbf{h}^{T}\mathbf{x}(n))^2\big]$$

$$= E[d^2(n)] - 2\,E[d(n)\,\mathbf{h}^{T}\mathbf{x}(n)] + E[\mathbf{h}^{T}\mathbf{x}(n)\,\mathbf{x}^{T}(n)\,\mathbf{h}]$$

$$= P_d - 2\,\mathbf{h}^{T}\mathbf{R}_{dx} + \mathbf{h}^{T}\mathbf{R}_{xx}\,\mathbf{h}$$

where:

$$\mathbf{R}_{xx} = E[\mathbf{x}(n)\,\mathbf{x}^{T}(n)] \quad \text{– autocorrelation matrix}$$

$$\mathbf{R}_{dx} = E[d(n)\,\mathbf{x}(n)] \quad \text{– cross-correlation vector}$$

$$P_d = E[d^2(n)] \quad \text{– scalar}$$

Mathematical Analysis

• To find the minimum error take the derivative with respect to the coefficients, h(k), and set equal to zero.

$$\frac{\partial E[e^2(n)]}{\partial h(k)} = -2\,\mathbf{R}_{dx} + 2\,\mathbf{R}_{xx}\,\mathbf{h}_{opt} = 0$$

• Solving for h:

$$\mathbf{h}_{opt} = \mathbf{R}_{xx}^{-1}\,\mathbf{R}_{dx}$$

Wiener-Hopf Equation … again

• The Wiener-Hopf equation therefore determines the set of optimal filter coefficients in the mean-square sense.

Example (Tutorial 3, Problems 1 and 2)

Derive the Wiener-Hopf equation for the Wiener FIR filter working as a noise canceller.

A detailed derivation of the Wiener-Hopf equation is shown in Tutorial 3; for the ANC application we can start from:

$$e(n) = d(n) + v_1(n) - \hat x(n) = d(n) + v_1(n) - \sum_{k=0}^{N-1} w(k)\,v_2(n-k)$$

where d(n) is the desired signal, v₁(n) is the noise corrupting it, and v₂(n) is the reference noise input to the filter.

Example – Tutorial 3, Problems 1 and 2

Derivation of the Wiener-Hopf equation for the FIR noise canceller.

$$E[e^2(n)] = E\Big[\Big(d(n) + v_1(n) - \sum_{k=0}^{N-1} w(k)\,v_2(n-k)\Big)^2\Big]$$

Setting the derivative with respect to each coefficient to zero:

$$\frac{\partial E[e^2(n)]}{\partial w(k)} = 2\,E\Big[e(n)\,\frac{\partial e(n)}{\partial w(k)}\Big] = -2\,E[e(n)\,v_2(n-k)] = 0$$

$$E[e(n)\,v_2(n-k)] = E\Big[\Big(d(n) + v_1(n) - \sum_{l=0}^{N-1} w(l)\,v_2(n-l)\Big)\,v_2(n-k)\Big]$$

$$= r_{dv_2}(k) + r_{v_1v_2}(k) - \sum_{l=0}^{N-1} w(l)\,r_{v_2v_2}(l-k) = 0$$

Since the desired signal is uncorrelated with the reference noise, $r_{dv_2}(k) = 0$, which leaves:

$$r_{v_1v_2}(k) = \sum_{l=0}^{N-1} w(l)\,r_{v_2v_2}(l-k), \qquad k = 0, 1, \ldots, N-1$$

Example – Tutorial 3, Problems 1 and 2

Derivation of the Wiener-Hopf equation for the FIR noise canceller.

$$r_{v_1v_2}(k) = \sum_{l=0}^{N-1} w(l)\,r_{v_2v_2}(l-k)$$

or in matrix/vector form:

$$\mathbf{r}_{v_1v_2} = \mathbf{R}_{v_2v_2}\,\mathbf{w} \qquad\Rightarrow\qquad \mathbf{w}_{opt} = \mathbf{R}_{v_2v_2}^{-1}\,\mathbf{r}_{v_1v_2}$$

MSE Surface

• E[e²(n)] represents the expected value of the squared filter error e(n), i.e. the mean-square error (MSE).

• For an N-coefficient filter this is an N-dimensional surface with the Wiener-Hopf solution positioned at the bottom of this surface (i.e. this is the minimum error point).

• We can plot it for the case of a 2-coefficient filter (more than that is impossible to draw).

[Figure: mean square error surface for a 2-weight Wiener filter; the bowl-shaped surface has its minimum, the MMSE, at the Wiener optimum.]

MMSE

• Once the coefficients of the Wiener filter (i.e. the coordinates of the MMSE point) are known, the actual MMSE value is easy to calculate – we need to evaluate J for h = h_opt:

$$J_{min} = J\big|_{\mathbf{h}=\mathbf{h}_{opt}} = P_d - 2\,\mathbf{h}_{opt}^{T}\mathbf{R}_{dx} + \mathbf{h}_{opt}^{T}\mathbf{R}_{xx}\,\mathbf{h}_{opt}$$

$$= P_d - 2\,(\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx})^{T}\mathbf{R}_{dx} + (\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx})^{T}\mathbf{R}_{xx}(\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx})$$

$$= P_d - 2\,\mathbf{R}_{dx}^{T}\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx} + \mathbf{R}_{dx}^{T}\mathbf{R}_{xx}^{-1}\mathbf{R}_{xx}\,\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx}$$

$$= P_d - 2\,\mathbf{R}_{dx}^{T}\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx} + \mathbf{R}_{dx}^{T}\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx}$$

$$= P_d - \mathbf{R}_{dx}^{T}\mathbf{R}_{xx}^{-1}\mathbf{R}_{dx} = P_d - \mathbf{R}_{dx}^{T}\mathbf{h}_{opt}$$

(using the symmetry of R_xx, so that $(\mathbf{R}_{xx}^{-1})^{T} = \mathbf{R}_{xx}^{-1}$)

Example (Tutorial 3, Problems 3 and 4)

• Alternative derivation of the MMSE equation is shown in Tutorial 3, Problem 3

• Use of both Wiener-Hopf and MMSE equations is demonstrated in Tutorial 3, Problem 4

A two-coefficient Wiener filter is used to filter a zero-mean, unit-variance noisy signal x(n) = d(n) + v(n), where the noise v(n) is uncorrelated with the desired signal d(n). Find: r_dx, the optimal (Wiener-Hopf) solution w_opt, and the MMSE J_min, assuming:

$$r_d(m) = 0.6^{|m|}, \qquad r_v(m) = \delta(m)$$

Example (Tutorial 3, Problems 3 and 4)

$$r_{dx}(m) = E[d(n)\,x(n-m)] = E[d(n)\,d(n-m)] + E[d(n)\,v(n-m)] = r_d(m) + 0 = r_d(m)$$

$$\mathbf{r}_{dx} = \begin{bmatrix} r_d(0)\\ r_d(1) \end{bmatrix} = \begin{bmatrix} 1\\ 0.6 \end{bmatrix}$$

Example (Tutorial 3, Problems 3 and 4)

$$r_x(m) = E[x(n)\,x(n-m)] = E\big[(d(n)+v(n))(d(n-m)+v(n-m))\big]$$

$$= r_d(m) + r_{dv}(m) + r_{vd}(m) + r_v(m) = r_d(m) + r_v(m)$$

(the cross terms vanish because d(n) and v(n) are uncorrelated), so:

$$\mathbf{R}_{x} = \mathbf{R}_{d} + \mathbf{R}_{v} = \begin{bmatrix} 2 & 0.6\\ 0.6 & 2 \end{bmatrix}$$

$$\mathbf{w}_{opt} = \mathbf{R}_{x}^{-1}\,\mathbf{r}_{dx} = \begin{bmatrix} 2 & 0.6\\ 0.6 & 2 \end{bmatrix}^{-1} \begin{bmatrix} 1\\ 0.6 \end{bmatrix} = \begin{bmatrix} 0.549 & -0.165\\ -0.165 & 0.549 \end{bmatrix} \begin{bmatrix} 1\\ 0.6 \end{bmatrix} = \begin{bmatrix} 0.451\\ 0.165 \end{bmatrix}$$

Example (Tutorial 3, Problems 3 and 4)

$$J_{min} = E[d^2(n)] - \mathbf{r}_{dx}^{T}\,\mathbf{w}_{opt} = r_d(0) - \mathbf{r}_{dx}^{T}\,\mathbf{w}_{opt}$$

$$= 1 - \begin{bmatrix} 1 & 0.6 \end{bmatrix} \begin{bmatrix} 0.451\\ 0.165 \end{bmatrix} = 1 - 0.550 = 0.45$$
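The whole example can be verified in a few lines of numpy (a sketch, using the R_x and r_dx values derived above):

```python
import numpy as np

R_x  = np.array([[2.0, 0.6],
                 [0.6, 2.0]])     # R_d + R_v from the example
r_dx = np.array([1.0, 0.6])       # [r_d(0), r_d(1)]

w_opt = np.linalg.solve(R_x, r_dx)
J_min = 1.0 - r_dx @ w_opt        # P_d = r_d(0) = 1
print(w_opt, J_min)               # ~[0.451, 0.165], ~0.45
```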

MMSE

• Another very important observation can be made after rearranging the basic equation for the error signal:

$$e(n) = d(n) - y(n) = d(n) - \mathbf{h}^{T}\mathbf{x}(n)$$

Multiplying by x(n) and taking the expectation:

$$\mathbf{x}(n)\,e(n) = \mathbf{x}(n)\,d(n) - \mathbf{x}(n)\,\mathbf{x}^{T}(n)\,\mathbf{h}$$

$$E[\mathbf{x}(n)\,e(n)] = E[\mathbf{x}(n)\,d(n)] - E[\mathbf{x}(n)\,\mathbf{x}^{T}(n)]\,\mathbf{h} = \mathbf{R}_{dx} - \mathbf{R}_{xx}\,\mathbf{h}$$

At the optimum, $\mathbf{R}_{dx} - \mathbf{R}_{xx}\,\mathbf{h}_{opt} = 0$, i.e. the error signal is orthogonal to (uncorrelated with) the input data.

The Steepest Descent Algorithm

• The Steepest Descent method iteratively estimates the solution of the Wiener-Hopf equation using a method called gradient descent.

• This minimisation method finds the minimum by estimating the gradient of the MSE surface and taking a step in the direction opposite to the gradient.

• The basic equation of gradient descent is:

$$\mathbf{h}_{n+1} = \mathbf{h}_n - \mu\,\nabla_{\mathbf{h}} E[e^2(n)]$$

where μ is the step-size parameter and the gradient vector makes h_{n+1}(k) approach h_opt.

The Steepest Descent Algorithm

• Notice that the expression for the gradient has already been obtained in the process of calculating the Wiener filter coefficients, i.e.:

$$\nabla_{\mathbf{h}} E[e^2(n)] = \frac{\partial E[e^2(n)]}{\partial h(k)} = -2\,\mathbf{R}_{dx} + 2\,\mathbf{R}_{xx}\,\mathbf{h}_n$$

$$\Rightarrow\quad \mathbf{h}_{n+1} = \mathbf{h}_n + 2\mu\,(\mathbf{R}_{dx} - \mathbf{R}_{xx}\,\mathbf{h}_n)$$

• This is a significant improvement in our search for more efficient solutions – the coefficients are now determined iteratively and no inverse of the autocorrelation matrix is needed.

The Steepest Descent Algorithm

• We still need to estimate the autocorrelation matrix R_xx and the cross-correlation vector R_dx (for every iteration step!)

• Further simplification of the algorithm can be achieved by using the instantaneous estimates of R_xx and R_dx:

$$\mathbf{R}_{xx} = E[\mathbf{x}(n)\,\mathbf{x}^{T}(n)] \approx \mathbf{x}(n)\,\mathbf{x}^{T}(n), \qquad \mathbf{R}_{dx} = E[\mathbf{x}(n)\,d(n)] \approx \mathbf{x}(n)\,d(n)$$

The LMS (Least Mean Squares) Algorithm for Adaptive Filtering

$$\mathbf{h}_{n+1} = \mathbf{h}_n + 2\mu\,(\mathbf{R}_{dx} - \mathbf{R}_{xx}\,\mathbf{h}_n)$$

$$= \mathbf{h}_n + 2\mu\,\big(\mathbf{x}(n)\,d(n) - \mathbf{x}(n)\,\mathbf{x}^{T}(n)\,\mathbf{h}_n\big)$$

$$= \mathbf{h}_n + 2\mu\,\mathbf{x}(n)\,\big(d(n) - \mathbf{x}^{T}(n)\,\mathbf{h}_n\big)$$

$$= \mathbf{h}_n + 2\mu\,e(n)\,\mathbf{x}(n)$$

Example – Tutorial 4, Problem 3

A 4-coefficient LMS-based FIR adaptive filter works in the system identification configuration, trying to identify a system with the transfer function:

$$H(z) = \frac{1.25 + 0.35\,z^{-1}}{1 - 0.5\,z^{-1}}$$

Write the equations for the signals d(n) and e(n) and the update equation for each adaptive filter coefficient, i.e. w₁(n)…w₄(n).

Example – Tutorial 4, Problem 3

$$G(z) = \frac{D(z)}{X(z)} = \frac{1.5 + 0.3\,z^{-1}}{1 - 0.5\,z^{-1} - 0.25\,z^{-2}}$$

$$(1 - 0.5\,z^{-1} - 0.25\,z^{-2})\,D(z) = (1.5 + 0.3\,z^{-1})\,X(z)$$

$$D(z) = 1.5\,X(z) + 0.3\,z^{-1}X(z) + 0.5\,z^{-1}D(z) + 0.25\,z^{-2}D(z)$$

$$d(n) = 1.5\,x(n) + 0.3\,x(n-1) + 0.5\,d(n-1) + 0.25\,d(n-2)$$

$$y(n) = w(0)\,x(n) + w(1)\,x(n-1) + w(2)\,x(n-2) + w(3)\,x(n-3)$$

$$e(n) = d(n) - y(n)$$

weights update equations:

$$w(i) \leftarrow w(i) + 2\mu\,e(n)\,x(n-i), \qquad i = 0, 1, 2, 3$$

Applications

• Before looking into the details of the Matlab implementation of the LMS update algorithm, some practical applications for adaptive filters are considered first.

• These are:
– System Identification
– Inverse System Estimation
– Adaptive Noise Cancellation
– Linear Prediction

Applications: System Identification

Definitions of signals:
x(n) – input applied to the unknown system and the adaptive filter
y(n) – filter output
d(n) – system (desired) output
e(n) – estimation error

[Diagram: x(n) drives both the Unknown System (output d(n)) and the adaptive Digital Filter h(n) (output y(n)); e(n) = d(n) − y(n) is fed to the Adaptive Algorithm, which updates the filter.]

Identifying the response of the unknown system.

Applications: Inverse Estimation

Definitions of signals:
x(n) – input applied to the system
y(n) – filter output
d(n) – desired output (the delayed input)
e(n) – estimation error

[Diagram: x(n) passes through the System and then the adaptive Digital Filter h(n) to give y(n); a Delay block applied to x(n) provides d(n); e(n) = d(n) − y(n) drives the Adaptive Algorithm.]

Estimating the inverse of the system. The delay block ensures the causality of the estimated inverse.

Applications: Noise Cancellation

Definitions of signals:
x(n) – noise (the so-called reference signal)
y(n) – noise estimate
d(n) – signal + noise
e(n) – signal estimate

[Diagram: the Noise source provides the reference x(n) to the adaptive Digital Filter h(n), giving y(n); the Signal source plus noise gives d(n) = s(n) + n(n); e(n) = d(n) − y(n) is the signal estimate and drives the Adaptive Algorithm.]

Removing background noise from the useful signals.

Applications: Linear Predictor

Definitions of signals:
x(n) – signal to be predicted
y(n) – filter output (signal prediction)
d(n) – desired output
e(n) – estimation error

[Diagram: the AR-process signal x(n) is delayed and fed to the adaptive Digital Filter h(n), giving the prediction y(n); the undelayed x(n) serves as d(n); e(n) = d(n) − y(n) drives the Adaptive Algorithm.]

Estimating the future samples of the signal.

Applications: Linear Predictor

• Assuming that the signal x(n):
– is periodic
– is steady or varies slowly over time

• the adaptive filter can be used to predict future values of the desired signal based on past values.

• When x(n) is periodic and the filter is long enough to remember previous values, this structure, with the delay in the input signal path, can perform the prediction.

• This configuration can also be used to remove a periodic signal from stochastic noise signals.

Example

• Have a look into Tutorials 3 and 4 for examples of each discussed configuration.

Adaptive LMS Algorithm Implementation

• The LMS algorithm can easily be implemented in software. The main steps of this algorithm are:

1. Read in the next sample, x(n), and perform the filtering operation with the current version of the coefficients:

$$y(n) = \sum_{k=0}^{N-1} h_n(k)\,x(n-k)$$

2. Take the computed output and compare it with the expected output, i.e. calculate the error:

$$e(n) = d(n) - y(n)$$

3. Update the coefficients (obtain the next set of coefficients) using the following computation:

$$h_{n+1}(k) = h_n(k) + 2\mu\,e(n)\,x(n-k)$$

• This algorithm is performed in a loop so that with each new sample a new coefficient vector h_{n+1}(k) is created.

• In this way, the filter coefficients change and adapt.

Adaptive LMS Algorithm Implementation

• Before the LMS algorithm “kicks in” we also need to initialise filter coefficients; the safest option is to initialise them all to zero.

$$h_0(k) = 0 \qquad \text{for } k = 0, \ldots, N-1$$
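Putting the three steps together, here is a minimal Python/numpy sketch of the LMS loop in the system identification configuration (function and variable names are illustrative, not from the module's Matlab code):

```python
import numpy as np

def lms(x, d, N=4, mu=0.01):
    """LMS adaptive FIR filter: returns final coefficients and error history."""
    h = np.zeros(N)                     # 1. initialise all coefficients to zero
    e = np.zeros(len(x))
    for n in range(N, len(x)):
        xn = x[n - N + 1:n + 1][::-1]   # [x(n), x(n-1), ..., x(n-N+1)]
        y = h @ xn                      # 2. filter with current coefficients
        e[n] = d[n] - y                 # 3. error against desired output
        h = h + 2 * mu * e[n] * xn      # 4. coefficient update
    return h, e

# System identification demo: try to recover an "unknown" FIR system h_true
rng = np.random.default_rng(1)
x = rng.standard_normal(5000)
h_true = np.array([1.0, 0.5, -0.3, 0.1])
d = np.convolve(x, h_true)[:len(x)]
h_est, e = lms(x, d)
print(h_est)                            # converges towards h_true
```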

Other applications of adaptive filters

• PSD estimation of an observed signal
• Foetal ECG monitoring – cancelling of the maternal ECG
• Removal of mains interference in medical signals
• Radar signal processing
– Background noise removal
– RX-TX crosstalk reduction
– Adaptive jammer suppression
• Separation of speech from background noise
• Echo cancellation for speaker phones
• Beamforming

Part 4 – PSD Estimation and Signal Modelling Techniques

There are several ways to find the PSD of a signal:

• Nonparametric techniques (periodogram and correlogram)

• Parametric techniques (AR, MA and ARMA models)

• Yule-Walker Equations and Signal Predictors

Approaches to PSD estimation

• Classical, Non-parametric Techniques
– based on the Fourier Transform
– robust, require no previous knowledge about the data
– assume zero data values outside the data window – results are distorted, resolution can be low
– not suitable for short data records

• Modern, Parametric Techniques – include a priori model information concerning the spectrum to be estimated.

• Modern, Non-parametric Methods – use singular value decomposition (SVD) of the signal to separate correlated and uncorrelated signal components for easier analysis.

Non-parametric PSD estimation techniques (DFT/FFT based)

• The PSD is estimated directly from the signal itself, with no previous knowledge about the signal.

• Periodogram – based on the following formula:

$$\hat R_{xx}(e^{j\omega}) = \frac{1}{N}\Big|\sum_{n=0}^{N-1} x(n)\,e^{-j\omega n}\Big|^2 = \frac{1}{N}\,\big|X(e^{j\omega})\big|^2$$

• Correlogram – based on the following formula:

$$\hat R_{xx}(e^{j\omega}) = \sum_{m=-(N-1)}^{N-1} \hat r_{xx}(m)\,e^{-j\omega m}$$

where $\hat r_{xx}$ is an estimate of the autocorrelation of the signal x(n).

Periodogram and Correlogram

• note that since

$$\big|X(e^{j\omega})\big|^2 = X(e^{j\omega})\,X^*(e^{j\omega}) = \text{DTFT}\big\{x(k) * x^*(-k)\big\} \sim \text{DTFT}\big\{r_{xx}(m)\big\}$$

the results obtained with these two estimators should coincide (note – not a strict mathematical derivation)

• variations on the basic periodogram approach are usually used in practice

Blackman-Tukey method

• Since the correlation function at its extreme lag values is not reliable (fewer data points enter the computation), it is recommended to use lag values of about 30%-40% of the total length of the data

• The Blackman-Tukey estimate is a windowed correlogram given by:

$$\hat R_{BT}(e^{j\omega}) = \sum_{m=-(L-1)}^{L-1} w(m)\,\hat r_{xx}(m)\,e^{-j\omega m}$$

where w(m) is a window with zero values for |m| > L−1; also, L << N.

Bartlett Method

• This is an improved periodogram method (note that the previously discussed Blackman-Tukey method is a correlogram method)

• Bartlett’s method reduces the fluctuation of the periodogram by splitting up the available data of N observations into K=N/L subsections of L observations each.

• Spectral densities of produced K periodograms are then averaged.

Bartlett Method

[Diagram: the data record is split into K non-overlapping segments of L samples each; a periodogram is computed for each segment, the K periodograms are summed and the total is divided by K to give the PSD estimate.]
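A minimal numpy sketch of the Bartlett procedure (segment length L and the helper name are illustrative):

```python
import numpy as np

def bartlett_psd(x, L):
    """Average the periodograms of K = len(x)//L non-overlapping segments."""
    K = len(x) // L
    segs = x[:K * L].reshape(K, L)
    periodograms = np.abs(np.fft.fft(segs, axis=1))**2 / L
    return periodograms.mean(axis=0)     # averaging reduces the variance

x = np.random.randn(4096)
print(bartlett_psd(x, 256)[:4])          # ~flat spectrum for white noise
```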

Welch Method

• Welch proposed further modification to Bartlett method and introduced overlapped and windowed data segments defined as:

$$x_i(n) = x(n + iD)\,w(n), \qquad 0 \le n \le M-1, \quad 0 \le i \le K-1$$

where: w(n) – window of length M; D – offset distance; K – number of sections that the sequence x(n) is divided into.

Welch Method

• the i-th periodogram is

$$\hat R_i(e^{j\omega}) = \frac{1}{L}\Big|\sum_{n=0}^{L-1} x_i(n)\,e^{-j\omega n}\Big|^2$$

• the averaged periodogram is

$$\hat R(e^{j\omega}) = \frac{1}{K}\sum_{i=0}^{K-1} \hat R_i(e^{j\omega})$$

Welch Method

[Diagram: overlapped, windowed segments (offset by D samples) are taken from the data; a periodogram is computed for each of the K segments, summed and divided by K to give the PSD estimate.]
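In practice a library routine is typically used; e.g. scipy.signal.welch implements this method (a sketch with illustrative parameters; nperseg=256 with noverlap=128 gives 50% overlap):

```python
import numpy as np
from scipy.signal import welch

x = np.random.randn(4096)
f, Pxx = welch(x, fs=1.0, window="hann", nperseg=256, noverlap=128)
print(f.shape, Pxx.shape)   # frequency grid and averaged PSD estimate
```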

Modified Welch Method

• Data segments taken from the data record get progressively longer, thus introducing a better frequency resolution.

• Due to the averaging procedure the periodogram variance decreases and smoother periodograms are obtained.

Modified (Symmetric) Welch Method

[Diagram: progressively longer segments, taken symmetrically from the data record, each produce a periodogram; the K periodograms are summed and divided by K to give the PSD estimate.]

Modified (Asymmetric) Welch Method

[Diagram: progressively longer segments, all anchored at the start of the data record, each produce a periodogram; the K periodograms are summed and divided by K to give the PSD estimate.]

Comparison of nonparametric PSD estimators

• We use the quality factor Q to evaluate different nonparametric methods.

• This is the ratio of the square of the mean of the power spectral density estimate to its variance:

$$Q = \frac{\big(E\big[\hat R_{xx}(e^{j\omega})\big]\big)^2}{\text{var}\big[\hat R_{xx}(e^{j\omega})\big]}$$

Comparison of nonparametric PSD estimators

| Method | Conditions | Q | Comments |
|---|---|---|---|
| Periodogram | N→∞ | 1 | Inconsistent, independent of N |
| Bartlett | N,L→∞ | 1.11·N·Δf | Quality improves with data length |
| Welch | N,L→∞, 50% overlap | 1.39·N·Δf | Quality improves with data length |
| Blackman-Tukey | N,L→∞, triangular window | 2.34·N·Δf | Quality improves with data length |

• Δf is the 3 dB main-lobe width of the associated window

Parametric PSD estimation techniques (model based)

• use a priori model information about the spectrum to be estimated

Steps for parametric spectrum estimation

1. Select a suitable model for the procedure. This step may be based on:

- a priori knowledge of the physical mechanism that generates the random process.

- trial and error, by testing various parametric models.

(if wrong model is selected, results can be worse than when using non-parametric methods for PSD estimation)

2. Estimate the (p,q) order of the model (from the collected data and/or from a priori information).

3. Use the collected data to estimate the model parameters (coefficients).

[Diagram: stochastic signal modelling vs. deterministic signal modelling.]

Possible Models

x(n) → [b(n)] → y(n)

Moving Average (MA) – b coefficients only (i.e. FIR); the All-Zero Model.

x(n) → (+) → y(n), with feedback y(n) → [a(n)] → (+)

Autoregressive (AR) – a coefficients only; the All-Pole Model.

x(n) → [b(n)] → (+) → y(n), with feedback y(n) → [a(n)] → (+)

The most general model: Autoregressive Moving Average (ARMA) – a and b coefficients (i.e. IIR); also known as the Pole-Zero Model.

Model Equations

ARMA:

$$\sum_{k=0}^{p} a_k\,y(n-k) = \sum_{k=0}^{q} b_k\,x(n-k)$$

MA:

$$y(n) = \sum_{k=0}^{q} b_k\,x(n-k)$$

AR:

$$\sum_{k=0}^{p} a_k\,y(n-k) = x(n)$$

Model Equations in z-domain (i.e. Model Transfer Functions)

ARMA:

$$\sum_{k=0}^{p} a_k\,y(n-k) = \sum_{k=0}^{q} b_k\,x(n-k)$$

apply the z-transform:

$$Y(z)\sum_{k=0}^{p} a_k\,z^{-k} = X(z)\sum_{k=0}^{q} b_k\,z^{-k}$$

$$H(z) = \frac{Y(z)}{X(z)} = \frac{\sum_{k=0}^{q} b_k\,z^{-k}}{\sum_{k=0}^{p} a_k\,z^{-k}}$$

Model Equations in z-domain

MA:

$$H(z) = \sum_{k=0}^{q} b_k\,z^{-k}$$

AR (for a₀ = 1):

$$H(z) = \frac{1}{1 + \sum_{k=1}^{p} a_k\,z^{-k}}$$

ARMA (for a₀ = 1):

$$H(z) = \frac{\sum_{k=0}^{q} b_k\,z^{-k}}{1 + \sum_{k=1}^{p} a_k\,z^{-k}}$$

Model Equations in ω-domain

MA:

$$H(e^{j\omega}) = \sum_{k=0}^{q} b_k\,e^{-j\omega k}$$

AR (for a₀ = 1):

$$H(e^{j\omega}) = \frac{1}{1 + \sum_{k=1}^{p} a_k\,e^{-j\omega k}}$$

ARMA (for a₀ = 1):

$$H(e^{j\omega}) = \frac{\sum_{k=0}^{q} b_k\,e^{-j\omega k}}{1 + \sum_{k=1}^{p} a_k\,e^{-j\omega k}}$$

So how do we get the signal PSD from the estimated model?

• If a white noise signal w(n) is the input to our model (i.e. x(n) = w(n)), the output signal y(n) is a WSS (wide-sense stationary) signal with PSD given by:

$$R_{yy}(e^{j\omega}) = H(e^{j\omega})\,H^*(e^{j\omega})\,R_{ww}(e^{j\omega}) \qquad\text{or}\qquad R_{yy}(z) = H(z)\,H(z^{-1})\,R_{ww}(z)$$

PSD for ARMA modelled signal

$$R_{yy}(e^{j\omega}) = H(e^{j\omega})\,H^*(e^{j\omega})\,R_{ww}(e^{j\omega}) = \sigma_w^2\,\frac{\big|\sum_{k=0}^{q} b_k\,e^{-j\omega k}\big|^2}{\big|1 + \sum_{k=1}^{p} a_k\,e^{-j\omega k}\big|^2}$$

• using vector notation:

$$R_{yy}(e^{j\omega}) = \sigma_w^2\,\frac{\mathbf{e}_q^{H}\,\mathbf{b}\,\mathbf{b}^{H}\,\mathbf{e}_q}{\mathbf{e}_p^{H}\,\mathbf{a}\,\mathbf{a}^{H}\,\mathbf{e}_p}$$

PSD for ARMA modelled signal

• where H denotes the Hermitian transpose (transpose + complex conjugate) and:

$$\mathbf{e}_q = \begin{bmatrix} 1\\ e^{j\omega}\\ e^{j2\omega}\\ \vdots\\ e^{jq\omega} \end{bmatrix}, \quad \mathbf{e}_p = \begin{bmatrix} 1\\ e^{j\omega}\\ e^{j2\omega}\\ \vdots\\ e^{jp\omega} \end{bmatrix}, \quad \mathbf{b} = \begin{bmatrix} b_0\\ b_1\\ b_2\\ \vdots\\ b_q \end{bmatrix}, \quad \mathbf{a} = \begin{bmatrix} 1\\ a_1\\ a_2\\ \vdots\\ a_p \end{bmatrix}$$

PSD for AR & MA modelled signals

• Similarly, for AR modelled signals we have:

$$R_{yy}(e^{j\omega}) = \frac{\sigma_w^2}{\big|1 + \sum_{k=1}^{p} a_k\,e^{-j\omega k}\big|^2} = \frac{\sigma_w^2}{\mathbf{e}_p^{H}\,\mathbf{a}\,\mathbf{a}^{H}\,\mathbf{e}_p}$$

• and for MA models:

$$R_{yy}(e^{j\omega}) = \sigma_w^2\,\Big|\sum_{k=0}^{q} b_k\,e^{-j\omega k}\Big|^2 = \sigma_w^2\,\mathbf{e}_q^{H}\,\mathbf{b}\,\mathbf{b}^{H}\,\mathbf{e}_q$$
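As an illustration, the AR form of this PSD can be evaluated directly on a frequency grid (a numpy sketch with an illustrative AR(1) model):

```python
import numpy as np

a = np.array([1.0, -0.9])                # AR(1): y(n) = 0.9 y(n-1) + w(n)
sigma_w2 = 1.0
w = np.linspace(0, np.pi, 512)
A = np.polynomial.polynomial.polyval(np.exp(-1j * w), a)  # sum_k a_k e^{-jwk}
R_yy = sigma_w2 / np.abs(A)**2           # AR PSD: sigma^2 / |A(e^{jw})|^2
print(R_yy[0], R_yy[-1])                 # large at w=0, small at w=pi (lowpass)
```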

Statistical Signal Modelling – Yule-Walker Equations

• In statistical signal modelling the problem of determining the model coefficients boils down to solving a set of (in general nonlinear) equations called the Yule-Walker equations.

• The next couple of slides show how those equations are obtained, starting from the general expression for the ARMA model.

• Assuming a₀ = 1 we have:

$$y(n) = -\sum_{k=1}^{p} a_k\,y(n-k) + \sum_{k=0}^{q} b_k\,x(n-k)$$

Yule-Walker Equations

• Multiplying both sides of the ARMA equation by y(n−i) and taking the expectation we have:

$$E[y(n)\,y(n-i)] = -\sum_{k=1}^{p} a_k\,E[y(n-k)\,y(n-i)] + \sum_{k=0}^{q} b_k\,E[x(n-k)\,y(n-i)]$$

$$r_y(i) = -\sum_{k=1}^{p} a_k\,r_y(i-k) + \sum_{k=0}^{q} b_k\,E[x(n-k)\,y(n-i)]$$

• Since both x(n) and y(n) are jointly wide-sense stationary processes, we can rewrite the last part of this equation using the following reasoning:

Yule-Walker Equations

• i.e.

$$E[x(n-k)\,y(n-i)] = E\Big[x(n-k)\sum_{m} h(m)\,x(n-i-m)\Big] = \sigma_x^2\,h(k-i)$$

so that:

$$r_y(i) = -\sum_{k=1}^{p} a_k\,r_y(i-k) + \sigma_x^2\sum_{k=0}^{q} b_k\,h(k-i)$$

Yule-Walker Equations

• Further, for causal h(n) we can obtain the standard form of the Yule-Walker equations:

$$r_y(i) = -\sum_{k=1}^{p} a_k\,r_y(i-k) + \sigma_x^2\sum_{k=i}^{q} b_k\,h(k-i) = -\sum_{k=1}^{p} a_k\,r_y(i-k) + \sigma_x^2\sum_{k=0}^{q-i} b_{k+i}\,h(k)$$

• Introducing

$$c_i = \sum_{k=0}^{q-i} b_{k+i}\,h(k)$$

we have:

$$r_y(i) = \begin{cases} -\sum_{k=1}^{p} a_k\,r_y(i-k) + \sigma_x^2\,c_i & \text{for } 0 \le i \le q\\[4pt] -\sum_{k=1}^{p} a_k\,r_y(i-k) & \text{for } i > q \end{cases}$$

Yule-Walker Equations

• in matrix form:

$$\begin{bmatrix} r_y(0) & r_y(-1) & \cdots & r_y(-p)\\ r_y(1) & r_y(0) & \cdots & r_y(1-p)\\ \vdots & \vdots & & \vdots\\ r_y(q) & r_y(q-1) & \cdots & r_y(q-p)\\ r_y(q+1) & r_y(q) & \cdots & r_y(q+1-p)\\ \vdots & \vdots & & \vdots\\ r_y(q+p) & r_y(q+p-1) & \cdots & r_y(q) \end{bmatrix} \begin{bmatrix} 1\\ a_1\\ \vdots\\ a_p \end{bmatrix} = \sigma_x^2 \begin{bmatrix} c_0\\ c_1\\ \vdots\\ c_q\\ 0\\ \vdots\\ 0 \end{bmatrix}$$

Yule Walker Equations for AR model

$$\sum_{k=0}^{p} a_k\,y(n-k) = x(n)$$

• For the AR model (q = 0, b₀ = 1, h(0) = 1) the Yule-Walker equations reduce to:

$$r_y(i) = -\sum_{k=1}^{p} a_k\,r_y(i-k) + \sigma_x^2\,\delta(i)$$

• in matrix form:

$$\begin{bmatrix} r_y(0) & r_y(1) & \cdots & r_y(p)\\ r_y(1) & r_y(0) & \cdots & r_y(p-1)\\ \vdots & \vdots & & \vdots\\ r_y(p) & r_y(p-1) & \cdots & r_y(0) \end{bmatrix} \begin{bmatrix} 1\\ a_1\\ \vdots\\ a_p \end{bmatrix} = \sigma_x^2 \begin{bmatrix} 1\\ 0\\ \vdots\\ 0 \end{bmatrix}$$
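A minimal numpy/scipy sketch of using these equations to estimate AR coefficients from data (helper name and test model are illustrative):

```python
import numpy as np
from scipy.linalg import toeplitz

def yule_walker_ar(y, p):
    """Estimate AR(p) coefficients a_1..a_p and noise variance from data y."""
    N = len(y)
    r = np.array([np.sum(y[:N - m] * y[m:]) / N for m in range(p + 1)])
    a = np.linalg.solve(toeplitz(r[:p]), -r[1:p + 1])  # r(i) = -sum_k a_k r(i-k)
    sigma2 = r[0] + r[1:p + 1] @ a                     # sigma^2 = r(0) + sum a_k r(k)
    return a, sigma2

# Generate an AR(2) process and recover its coefficients
rng = np.random.default_rng(2)
w = rng.standard_normal(100_000)
y = np.zeros_like(w)
for n in range(2, len(w)):
    y[n] = 1.2 * y[n - 1] - 0.5 * y[n - 2] + w[n]      # i.e. a1=-1.2, a2=0.5
print(yule_walker_ar(y, 2))                            # ~[-1.2, 0.5], ~1.0
```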
