digital audio signal processing lecture-4: noise reduction

44
Digital Audio Signal Processing Lecture-4: Noise Reduction Marc Moonen/Alexander Bertrand Dept. E.E./ESAT-STADIUS, KU Leuven [email protected] homes.esat.kuleuven.be/~moonen/

Upload: rosemary-mayo

Post on 30-Dec-2015

52 views

Category:

Documents


8 download

DESCRIPTION

Digital Audio Signal Processing Lecture-4: Noise Reduction. Marc Moonen/Alexander Bertrand Dept. E.E./ESAT-STADIUS, KU Leuven [email protected] homes.esat.kuleuven.be /~ moonen /. Overview. Spectral subtraction for single -micr. noise reduction - PowerPoint PPT Presentation

TRANSCRIPT

Digital Audio Signal Processing

Lecture-4: Noise Reduction

Marc Moonen/Alexander BertrandDept. E.E./ESAT-STADIUS, KU Leuven

[email protected]

homes.esat.kuleuven.be/~moonen/

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 2

Overview

• Spectral subtraction for single-micr. noise reduction– Single-microphone noise reduction problem– Spectral subtraction basics (=spectral filtering)– Features: gain functions, implementation, musical noise,…– Iterative Wiener filter based on speech signal model

• Multi-channel Wiener filter for multi-micr. noise red.– Multi-microphone noise reduction problem– Multi-channel Wiener filter (=spectral+spatial filtering)

• Kalman filter based noise reduction– Kalman filters– Kalman filters for noise reduction

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 3

Single-Microphone Noise Reduction Problem

• Microphone signal is

• Goal: Estimate s[k] based on y[k]

• Applications: Speech enhancement in conferencing, handsfree telephony, hearing aids, …

Digital audio restoration • Will consider speech applications: s[k] = speech signal

desired signal estimate

desired signal s[k]

noise signal(s)

? ][ksy[k]

][][][ knksky

desired signal contribution

noisecontribution

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 4

Spectral Subtraction Methods: Basics

• Signal chopped into `frames’ (e.g. 10..20msec), for each frame a frequency domain representation is

(i-th frame)

• However, speech signal is an on/off signal, hence some frames have speech +noise, i.e.

some frames have noise only, i.e.

• A speech detection algorithm is needed to distinguish between these 2 types of frames (based on energy/dynamic range/statistical properties,…)

][][][ knksky

)()()( iii NSY

frames} only'-noise{`frame )( 0 )( iii NY

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 5

Spectral Subtraction Methods: Basics

• Definition: () = average amplitude of noise spectrum

• Assumption: noise characteristics change slowly, hence estimate () by (long-time) averaging over (M) noise-only frames

• Estimate clean speech spectrum Si() (for each frame), using corrupted speech spectrum Yi() (for each frame, i.e. short-time estimate) + estimated ():

based on `gain function’

)()()(ˆ iii YGS

))(ˆ),(()( ii YfG

framesonly -noise

)(1

)(ˆM

iYM

})({)( iNE

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 6

Spectral Subtraction: Gain Functions

2

2

)(

)(ˆ1)(

i

iY

G

2

2

)(

)(ˆ11

2

1)(

i

iY

G

))(),(ˆ()( ii YfG

)SNR,SNR()( priopostfGi

)(

)(ˆ1)(

ii Y

G

2

2

)(

)(ˆ1)(

i

iY

G

Ephraim-Malah = most frequently used in practice

Non-linear Estimation

Maximum Likelihood

Wiener Estimation

Spectral Subtraction

Magnitude Subtraction

see next slide

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 7

Spectral Subtraction: Gain Functions

• Example 1: Ephraim-Malah Suppression Rule (EMSR)

with:

• This corresponds to a MMSE (*) estimation of the speech spectral amplitude |Si()| based on observation Yi() ( estimate equal to E{ |Si()| | Yi() } ) assuming Gaussian a priori distributions for Si() and Ni() [Ephraim & Malah 1984].

• Similar formula for MMSE log-spectral amplitude estimation [Ephraim & Malah 1985].

(*) minimum mean squared error

prio

priopost

prio

prio

post SNR1

SNRSNR.

SNR1

SNR

SNR

1

2)( MGi

2

2

11postprio

2

2

post

102

)(

)()(1,0)-)max(SNR-(1)(SNR

)(ˆ

)()(SNR

)2

()2

()1(][

ii

i

YG

Y

IIeM

modified Bessel functions

skip fo

rmulas

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 8

Spectral Subtraction: Gain Functions

• Example 2: Magnitude Subtraction – Signal model:

– Estimation of clean speech spectrum:

– PS: half-wave rectification

)(,)(

)()()(

iyji

iii

eY

NSY

)(

)(

)(ˆ1

)(ˆ)()(ˆ

)(

)(,

i

G

i

jii

YY

eYS

i

iy

))(,0max()( ii GG

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 9

Spectral Subtraction: Gain Functions

• Example 3: Wiener Estimation – Linear MMSE estimation:

find linear filter Gi() to minimize MSE

– Solution:

Assume speech s[k] and noise n[k] are uncorrelated, then...

– PS: half-wave rectification

2

2

2

22

,

,,

,

,

)(

)(ˆ1

)(

)(ˆ)(

)(

)()(

)(

)()(

ii

i

iyy

inniyy

iyy

issi

YY

Y

P

PP

P

PG

)(

)(

)().(

)().()(

,

,

iyy

isy

ii

iii P

P

YYE

YSEG <- cross-correlation in i-th frame

<- auto-correlation in i-th frame

2

)(ˆ

)().()(

iS

YGSE iii

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 10

Spectral Subtraction: Implementation

® Short-time Fourier Transform (=uniform DFT-modulated analysis filter bank)

= estimate for Y(n ) at time i (i-th frame)

N=number of frequency bins (channels) n=0..N-1 M=downsampling factor K=frame length h[k] = length-K analysis window (=prototype filter)

® frames with 50%...66% overlap (i.e. 2-, 3-fold oversampling, N=2M..3M)® subband processing:

® synthesis bank: matched to analysis bank (see DSP-CIS)

1

0

/2].[][],[K

k

NknjekMiykhinY

y[k]Y[n,i]

Short-timeanalysis

Short-timesynthesis

Gain functions

][ˆ ks],[ˆ inS

],[ˆ in

],[].,[],[ˆ inYinGinS

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 11

Spectral Subtraction: Musical Noise

• Audio demo: car noise

• Artifact: musical noise

What? Short-time estimates of |Yi()| fluctuate randomly in noise-only frames,

resulting in random gains Gi() ® statistical analysis shows that broadband noise is transformed into

signal composed of short-lived tones with randomly distributed frequencies (=musical noise)

magnitude subtraction][ky ][̂ks

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 12

probability that speech is present, given observation

)()()(ˆ ii YGS

instantaneousaverage

Spectral Subtraction: Musical Noise

• Solutions?- Magnitude averaging: replace Yi() in calculation of Gi()

by a local average over frames

- EMSR (p7)- augment Gi() with soft-decision VAD:

Gi() P(H1 | Yi()). Gi() …

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 13

Spectral Subtraction: Iterative Wiener Filter

Example of signal model-based spectral subtraction…• Basic:

Wiener filtering based spectral subtraction (p.9), with (improved) spectra estimation based on parametric models

• Procedure:1. Estimate parameters of a speech model from noisy signal y[k]

2. Using estimated speech parameters, perform noise reduction (e.g. Wiener estimation, p. 9)

3. Re-estimate parameters of speech model from the speech signal estimate

4. Iterate 2 & 3

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 14

Spectral Subtraction: Iterative Wiener Filter

white noise

generator

pulse train……

pitch period

voiced

unvoiced

x

M

m

mjme

1

1

1

sg

speech

signal

frequency domain:

time domain:

= linear prediction parameters

all-pole filter

u[k]

)(1

)(

1

Ue

gS M

m

mjm

s

M

msm kugmksks

1

][][][

TM 1α

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 15

Spectral Subtraction: Iterative Wiener Filter

For each frame (vector) y[m] (i=iteration nr.)

1. Estimate and

2. Construct Wiener Filter (p.9)

with:

• estimated during noise-only periods

3. Filter speech frame y[m]

isg , TiMii ,,1 α

)()(

)(..)(

nnss

ss

PP

PG

)(nnP

2

1,

,

1

)(

M

m

mjim

isss

e

gP

][ˆ mis

Repeat until

some errorcriterion issatisfied

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 16

Overview

• Spectral subtraction for single-micr. noise reduction– Single-microphone noise reduction problem– Spectral subtraction basics (=spectral filtering)– Features: gain functions, implementation, musical noise,…– Iterative Wiener filter based on speech signal model

• Multi-channel Wiener filter for multi-micr. noise red.– Multi-microphone noise reduction problem– Multi-channel Wiener filter (=spectral+spatial filtering)

• Kalman filter based noise reduction– Kalman filters– Kalman filters for noise reduction

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 17

Multi-Microphone Noise Reduction Problem

(some) speech estimate

speech source

noise source(s)

microphone signals

Miknksky iii ...1],[][][ ][kn

][ks

? ][ks

speech part noise part

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 18

Multi-Microphone Noise Reduction Problem

Will estimate speech

part in microphone 1

(*) (**)

Miknksky iii ...1],[][][

? ][1 ks

(*) Estimating s[k] is more difficult, would include dereverberation... (**) This is similar to single-microphone model (p.3), where additional microphones (m=2..M) help to get a better estimate

][kn

][ks

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 19

Multi-Microphone Noise Reduction Problem

• Data model:

See Lecture-2 on multi-path propagation, with q left out for conciseness.

Hm(ω) is complete transfer function from speech source position to m-the microphone

)(

)(

)(

)(.

)(

)(

)(

)(

)(

)(

)()().(

)()()(

2

1

2

1

2

1

MMM N

N

N

S

H

H

H

Y

Y

Y

S

Nd

NSY

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 20

Multi-Channel Wiener Filter (MWF)

• Data model:

• Will use linear filters to obtain speech estimate (as in Lecture-2)

• Wiener filter (=linear MMSE approach)

Note that (unlike in DSP-CIS) `desired response’ signal S1(w) is unknown here (!), hence solution will be `unusual’…

)()().( )( NdY S

)().()().()(ˆ1

*1 YFH

M

mmm YFS

})().()({min2

1)( YFFHSE

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 21

Multi-Channel Wiener Filter (MWF)

• Wiener filter solution is (see DSP-CIS)

– All quantities can be computed !– Special case of this is single-channel Wiener filter formula on p.9

)}().({)}().({.)}().({

...

.)}().({)}().({)(

*1

*1

1

)0)}().({(with

lationcrosscorre

*1

1

ationautocorrel

*

1

NEYEE

SEE

H

NE

H

NYYY

YYYF

S

compute during speech+noise periods

compute during noise-only periods

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 22

• MWF combines spatial filtering (as in Lecture-2) with single-channel spectral filtering (as in single-channel noise reduction)

if

then…

noise vectorsteering

2

1

)()(.)(

)(

)(

)(

)(

Nd

Y

S

Y

Y

Y

M

)()}().({ NNHE ΦNN

Multi-Channel Wiener Filter (MWF)

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 23

• …then it can be shown that

• represents a spatial filtering (*)

Compare to superdirective & delay-and-sum beamforming (Lecture-2) – Delay-and-sum beamf. maximizes array gain in white noise field– Superdirective beamf. maximizes array gain in diffuse noise field – MWF maximizes array gain in unknown (!) noise field.

MWF is operated without invoking any prior knowledge (steering

vector/noise field) ! (the secret is in the voice activity detection… (explain))

(*) Note that spatial filtering can improve SNR, spectral filtering never improves SNR

(at one frequency)

)().(.)()( 1

scalar

dΦF NN

)().(1 dΦNN

Multi-Channel Wiener Filter (MWF)

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 24

• …then it can be shown that (continued)

• represents an additional `spectral post-filter’ i.e. single-channel Wiener estimate (p.9), applied to output signal

of spatial filter

(prove it!)

)().(.)()( 1

scalar

dΦF NN

1)(.)().().(

)(.)(...)( 21

*1

2

S

HS

NNH dΦd

)(

Multi-Channel Wiener Filter (MWF)

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 25

Multi-Channel Wiener Filter: Implementation

• Implementation with short-time Fourier transform: see p.10 • Implementation with time-domain linear filtering:

M

ii

Ti

iiiTi

kks

Lkykykyk

11 ].[][

]1[...]1[][][

fy

y

][kn

][ks

][1 ks

][1 ky

filter

coefficients

][][][ kkk iii nsy

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 26

Solution is…

}].[][{min2

1 fyf kksE T

TTM

TT

TTM

TT

kkkk ][...][][][

...

21

21

yyyy

ffff

]}[.][{]}[.][{.}][.][{

]}[.][{.}][.][{

11

1

1

1

knkEkykEkkE

kskEkkE

T

T

nyyy

yyyf

compute during speech+noise periods

compute during noise-only periods

• Implementation with time-domain linear filtering:

Multi-Channel Wiener Filter: Implementation

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 27

Overview

• Spectral subtraction for single-micr. noise reduction– Single-microphone noise reduction problem– Spectral subtraction basics (=spectral filtering)– Features: gain functions, implementation, musical noise,…– Iterative Wiener filter based on speech signal model

• Multi-channel Wiener filter for multi-micr. noise red.– Multi-microphone noise reduction problem– Multi-channel Wiener filter (=spectral+spatial filtering)

• Kalman filter based noise reduction– Kalman filters– Kalman filters for noise reduction

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 28

State space model of a time-varying discrete-time system

with v[k] and w[k]: mutually uncorrelated, zero mean, white noises

Then: given A[k], B[k], C[k], D[k], V[k], W[k] and

input/output-observations u[k],y[k], k=0,1,2,... then

Kalman filter produces MMSE estimates of internal states x[k], k=0,1,...

PS: will use shorthand notation here, i.e.

xk, yk ,.. instead of x[k], y[k],..

process noise

measurement noise

Kalman Filter 1/12

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 29

Kalman Filter 2/12

Definition: = MMSE-estimate of xk using all available data up until time l

• `FILTERING’ = estimate

• `PREDICTION’ = estimate

• `SMOOTHING’ = estimate

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 30

Initalization:

‘Conventional Kalman Filter’ operation @ time k (k=0,1,2,..) Given and corresponding error covariance matrix , Compute and using uk, yk :Step 1: Measurement Update (produces ‘filtered’ estimate)

(compare to standard RLS!)

Step 2: Time Update

(produces ‘1-step prediction’)

=error covariance matrix

Kalman Filter 3/12

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 31

PS: ‘Standard RLS’ is a special case of ‘Conventional KF’

Kalman Filter 4/12

Internal state vector is FIR filter

coefficients vector, which is

assumed to be time-invariant

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 32

State estimation @ time k corresponds to overdetermined set of linear

equations, where vector of unknowns contains all previous state vectors :

Kalman Filter 5/12

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 33

State estimation @ time k corresponds to overdetermined set of linear

equations, where vector of unknowns contains all previous state vectors :

Kalman Filter 6/12Te

chn

ical

det

ail…

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 34

State estimation @ time k corresponds to overdetermined set of linear

equations, where vector of unknowns contains all previous state vectors :

Kalman Filter 7/12

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 35

‘Square-Root’ Kalman Filter

Kalman Filter 8/12

Propagated from

time k-1 to time k

..hence requires only lower-right/lower part

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 36

‘Square-Root’ Kalman Filter

Kalman Filter 9/12

Propagated from

time k-1 to time k

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 37

‘Square-Root’ Kalman Filter

Kalman Filter 10/12

Propagated from

time k-1 to time k

Propagated from

time k to time k+1

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 38

Kalman Filter 11/12

QRD-

=QRD-

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 39

Kalman Filter 12/12

can be derived from square-root KF equations

: can be worked into measurement update eq.

: can be worked into state update eq.

is .

[details omitted]

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 40

Kalman filter for Speech Enhancement

• Assume AR model of speech and noise

• Equivalent state-space model is…

y=microphone signal

N

mnm

M

msm

kwgmknkn

kugmksks

1

1

][][][

][][][

u[k], w[k] = zero mean, unit

variance,white noise

][][

][][]1[

kky

kkkT xc

vAxx

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 41

Kalman filter for Speech Enhancement

with: ][]1[][]1[][ knNknksMkskT x

n

s

NM

T

n

s

g0

0gGC

A0

0AA ;100100;

1111

100

00

010

;100

00

010

NN

n

MM

s AA

Tkwkuk ][][.][ Gv

;00;00 nTns

Ts gg gg

TGGQ .

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 42

Kalman filter for Speech Enhancement

Disadvantages iterative approach:• complexity• delay

split signal

in frames

estimate

parameters

Kalman Filter

reconstruct

signal

imisg ,, ˆ;ˆ iming ,,

ˆ;ˆ

iterations

y[k]

][̂ks

],[ˆ mis

][ˆ min

Iterative algorithm

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 43

Kalman filter for Speech Enhancement

iteration index time index (no iterations)

State Estimator:

Kalman Filter

Parameters Estimator

(Kalman Filter)

D

D

],1|1[̂ kks

]1|1[ˆ kkn

][ky

]|[ˆ],|[̂ kknkks

],1|1[ˆ kkm],1|1[ˆ kkm],1|1[ˆ kkgs

]1|1[ˆ kkgn

],|[ˆ kkm],|[ˆ kkm],|[ˆ kkgs

]|[ˆ kkgn

ns ˆ,ˆ

,ˆ,ˆ mm

ns gg ˆ,ˆ

Sequential algorithm

Digital Audio Signal Processing Version 2014-2015 Lecture-4: Noise Reduction p. 44

CONCLUSIONS

• Single-channel noise reduction– Basic system is spectral subtraction– Only spectral filtering, hence can only exploit differences in spectra

between noise and speech signal:• noise reduction at expense of speech distortion• achievable noise reduction may be limited

• Multi-channel noise reduction– Basic system is MWF,– Provides spectral + spatial filtering (links with beamforming!)

• Iterative Wiener filter & Kalman filtering– Signal model based approach