
A survey on methods for blind acoustic dereverberation

Masters Thesis in Electrical Engineering (ETD018)

Osunkunle Biodun Isaac Sayed Ali Shekarchi


Supervisor: Benny Sallberg

Abstract

Reverberation is a phenomenon observed in auditoriums such as concert halls and churches. It consists of a combination of multiple echoes, and its intensity and duration depend on factors such as the dimensions and shape of the enclosure and the materials used in its construction. Reverberation is desirable in music reproduction; however, it can render speech unintelligible. There is therefore a need to control the reverberation of speech. This thesis investigates the performance of different signal processing algorithms applied to suppress reverberation. Theoretical methods that have been verified with simulations are tested with real measurements. This gives a practical evaluation of the performance to be expected when the algorithms are used.


CONTENTS

List of Figures

1 Introduction
1.1 Reverberation
1.2 The basics of room acoustics
1.3 Reverberation Estimation
1.4 Example Reverberation Calculation
1.5 System types
1.5.1 Single Input Single Output System - SISO
1.5.2 Single Input Multiple Output (SIMO) Model
1.5.3 Multiple Input Single Output (MISO) Systems
1.5.4 Multiple Input Multiple Output (MIMO) Systems
1.6 Performance Criteria
1.6.1 Normalized Projection Misalignment
1.6.2 Signal to Deviation Ratio

2 Supervised Inverse filtering based dereverberation
2.1 The NLMS algorithm for a Single Input Single Output System Identification
2.1.1 Supervised inverse filtering
2.1.2 Impulse Response Measurements - Channel identification using the NLMS algorithm
2.2 Impulse Response Measurement Results
2.2.1 Inverse Filtering and Performance Evaluation
2.3 Inverse Filtering - Least squares method
2.4 Inverse Filtering - The Multichannel Inverse Theorem - MINT

3 Robust Inverse filtering with MINT
3.1 Generalized MINT performance
3.2 Regularization performance
3.3 Algorithm Optimization Procedure
3.3.1 Algorithm Optimization Procedure Step 1 - Delay
3.3.2 Algorithm Optimization Procedure Step 2 - Regularization
3.3.3 Algorithm Optimization Procedure Step 3 - Filter Length
3.3.4 Algorithm Optimization Results

4 Unsupervised Inverse filtering based dereverberation
4.1 Basic Principles
4.2 Identifiability Conditions
4.3 Algorithms
4.4 Constrained Time Domain Multichannel LMS
4.5 Constrained Time Domain Multichannel Newton Algorithm
4.6 Unconstrained Blind Multichannel LMS algorithm with Optimal Step Size control
4.7 Frequency Domain Normalized Multichannel LMS
4.8 Performance of Selected Blind Methods

5 Conclusion

6 Matlab Scripts
6.1 Two Channel Blind Identification : 3-tap channels
6.2 Identification with the NLMS Algorithm
6.3 Normalized LMS
6.4 Single Channel Dereverberation with NLMS
6.5 System Identification using NLMS
6.6 Blind SIMO LMS Well Conditioned Inputs
6.7 Blind SIMO LMS Bad Conditioned Inputs

LIST OF FIGURES

1.1 Multiple sound propagation paths from source (S) to receiver (R)
1.2 Example Reverberation Calculation[1]
1.3 SISO Model
1.4 SIMO System
1.5 MISO System
1.6 Multiple Input Multiple Output - MIMO System

2.1 Acoustic Channel Identification with the NLMS algorithm
2.2 Impulse Response - Distance : 1m, Room : Amethyst, Channels : 1 - 4
2.3 Impulse Response - Distance : 1m, Room : Amethyst, Channels : 5 - 7
2.4 Smoothed Filter Coefficients - Distance : 1m, Room : Amethyst, Channels : 1 - 4
2.5 Smoothed Filter Coefficients - Distance : 1m, Room : Amethyst, Channels : 5 - 7
2.6 Impulse Response - Distance : 2m, Room : Amethyst, Channels : 1 - 4
2.7 Impulse Response - Distance : 2m, Room : Amethyst, Channels : 5 - 7
2.8 Smoothed Filter Coefficients - Distance : 2m, Room : Amethyst, Channels : 1 - 4
2.9 Smoothed Filter Coefficients - Distance : 2m, Room : Amethyst, Channels : 5 - 7
2.10 Measurement Setup
2.11 Direct Inverse
2.12 Mean Square Error Inverse
2.13 MINT Method
2.14 Impulse Response Convolution with LS Inverse channels 1 to 4
2.15 Impulse Response Convolution with LS Inverse channels 5 to 7
2.16 A SISO acoustic system, with Loudspeaker S1, microphone M, and acoustic room impulse response G
2.17 Inverse filtering a SISO system with the LSE Method
2.18 SDR with the supervised inverse filtering method
2.19 MINT Filtering of a 1-Input 2-Output system

3.1 SDR dependence on MINT filter delay with different SNR 2 channel system
3.2 SDR dependence on MINT filter delay with different SNR 3 channel system
3.3 MINT performance with a regularization parameter and different SNR values for a two channel system
3.4 MINT performance with a regularization parameter and different SNR values for a three channel system
3.5 SDR dependence on MINT filter length with different SNR 2 channel system
3.6 SDR dependence on MINT filter length with different SNR 3 channel system
3.7 Effects of filter length and delay on SDR with the MINT method with 2 Channels
3.8 Effects of filter length and delay on SDR with the MINT method with 3 Channels

4.1 Constrained Time Domain Multichannel LMS algorithm
4.2 Frequency Domain Normalized Multichannel LMS
4.3 Algorithm performance in well conditioned 2 channel, 3-tap system, 40dB SNR
4.4 Scaled impulse response estimates of a well conditioned system obtained using the SIMO LMS
4.5 Algorithm performance in badly conditioned 2 channel, 3-tap system, SNR = 40dB
4.6 Scaled impulse response estimates of a badly conditioned system obtained using the Multichannel Newton Algorithm
4.7 Performance of algorithms on 16 tap 3-channel system
4.8 Scaled filter estimates using the blind identification methods for a 16 tap 3-channel system
4.9 Normalized Projection Misalignment using a Simulated Room Response, 10dB SNR
4.10 Impulse Response Estimates of Simulated Room Response

5.1 System Diagram
5.2 System performance

CHAPTER 1

Introduction

1.1 Reverberation

Reverberation is a phenomenon that can occur when acoustic waves propagate in an enclosure such as an auditorium. Possible sound paths are shown in Figure 1.1 [2]. If the source of sound is suddenly turned off, some residual sound is still observed.

Reverberation is a desirable effect in music; however, it is undesirable for speech. For speech to be intelligible, the reverberation time must be less than two seconds. Because the dimensions and other physical properties of an enclosure determine its reverberation properties[1], such enclosures face conflicting requirements on reverberation. When an individual needs to speak to the audience, reverberation must be eliminated or minimized, and when an orchestral performance is to begin, reverberation should be restored. Auditorium design therefore requires acoustic considerations. The human body is a good sound absorber[1], so the acoustic properties of an auditorium change significantly between the time it is empty and the time it is full. For this reason, seats in auditoriums are made of materials that absorb sound similarly to the human body, so that equipment settings such as equalization and amplitude do not require adjustment.

Chapter 1 outlines the concepts of reverberation and the models used to study it. In Chapter 2, supervised identification of the acoustic channels using the Normalized Least Mean Squares (NLMS) algorithm is considered. These channels may not be minimum phase, which implies that they do not have a stable inverse; however, a least squares inverse is obtained using the NLMS algorithm. In order to improve the performance of the system, the MINT (multiple input/output inverse theorem) method is applied. MINT obtains the inverse of a channel by using a set of channels to represent a single channel, thereby introducing the possibility of an exact inverse. The sensitivity of MINT to noise and channel identification errors is investigated in Chapter 3. In Chapter 4, algorithms for unsupervised blind channel identification are derived and implemented. To conclude the study, we use the MINT method to find the inverse of the blindly identified channels.

Figure 1.1: Multiple sound propagation paths from source (S) to receiver (R)

1.2 The basics of room acoustics

The physical interpretation of the terminology used in the audio community is as follows. A room with a long reverberation time is referred to as a live room [3], while one with a short reverberation time is referred to as a dead room. The term intimacy describes the psychological feeling of proximity to the source of sound that a spectator experiences. Intimacy is experienced if the listener hears the first reverberant sound less than 20 milliseconds after the arrival of the original sound. A small room is usually considered intimate; however, special measures such as a canopy placed above a stage, or orchestral shells on the stage, can be used to achieve intimacy in auditoriums[3]. The amplitude of the reverberant sound relative to the direct sound is termed the fullness of the sound. Fullness is the opposite of clarity: fullness is due to a long reverberation time, while clarity is due to a short reverberation time. When the reverberation at low frequencies is greater than at high frequencies, the sound is considered warm, while if the reverberation time decreases as frequency decreases, the sound is considered brilliant. The time difference between the arrival of the direct sound and the first few reverberations determines the texture of the sound. Good texture requires that the first five reflections arrive at the listener within about 60 milliseconds of the direct sound. Also, the amplitude of the reverberations should decrease at a constant rate. When the sounds from different performers combine with uniform distribution at the observer, the sound is considered to have a good blend. The reverberant sounds must not be heard by the performers on stage, but only by the audience.

1.3 Reverberation Estimation

A mathematical relationship between the dimensions of a room and its reverberation was derived in 1898 by Wallace Sabine[1]. Sabine defined the reverberation time as the time interval it takes for the sound intensity to drop by 60dB. He realized that the reverberation time is proportional to the volume of the room and depends on the materials of which the walls and objects in the room are made. Sabine proposed the following formula for the reverberation time:

$$ RT_{60} = \frac{0.049\,V}{S a} \tag{1.1} $$

where

RT60 - reverberation time in seconds

V - volume of room in cubic feet

S - total surface area of room in square feet

a - average absorption coefficient of room surfaces

Sa - total absorption in sabins

A perfectly absorbing square foot of material is considered to have an absorption value of 1 sabin. It has been observed that the absorption coefficients usually provided by manufacturers of building materials are Sabine coefficients.
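As a minimal sketch of equation 1.1, the following Python function sums the absorption of each surface (area times coefficient, which equals the product Sa) and applies the formula. The function name, the room dimensions and the absorption coefficients are hypothetical values chosen for illustration; they are not taken from Figure 1.2.

```python
def sabine_rt60(volume_ft3, surfaces):
    """Return RT60 in seconds from Sabine's formula (imperial units).

    surfaces: iterable of (area_sq_ft, absorption_coefficient) pairs.
    """
    # Total absorption in sabins: sum of area * coefficient, i.e. S * a.
    total_absorption = sum(area * coeff for area, coeff in surfaces)
    if total_absorption <= 0:
        raise ValueError("total absorption must be positive")
    return 0.049 * volume_ft3 / total_absorption


# Hypothetical 20 ft x 30 ft x 10 ft room; coefficients are illustrative only.
room_volume = 20 * 30 * 10
room_surfaces = [
    (20 * 30, 0.02),             # floor
    (20 * 30, 0.05),             # ceiling
    (2 * (20 + 30) * 10, 0.05),  # four walls
]
rt60 = sabine_rt60(room_volume, room_surfaces)
print(f"RT60 = {rt60:.2f} s")
```

Because absorption coefficients depend on frequency, such a calculation would normally be repeated once per frequency band.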


Figure 1.2: Example Reverberation Calculation[1]

1.4 Example Reverberation Calculation

An example reverberation calculation is shown in Figure 1.2, which lists the absorption coefficients for an enclosure with a concrete floor and gypsum board walls and ceiling. It can be seen that the absorption of gypsum board is greater than that of concrete. For both materials, the absorption coefficient depends on frequency. In the last row of the table, the reverberation time is calculated for the frequency in each column. At 1kHz the reverberation time is 3.39 seconds, which will make speech unintelligible.

In order to develop algorithms applicable to any enclosure, we adopt a model that captures our assumptions on reverberation and noise. The models considered are described next.

1.5 System types

Application of signal processing techniques in the control of reverberation requires a model

of the system which is under consideration. This model will be used in algorithm devel-

opment and simulations, before practical testing of the concepts on real systems. Models

considered in this thesis are assumed to be linear and shift invariant. These system models

are generally classified into four types namely:

• Single Input Single Output (SISO) model

• Single Input Multiple Output (SIMO) model

• Multiple Input Single Output (MISO) model

• Multiple Input Multiple Output (MIMO) model

1.5.1 Single Input Single Output System - SISO

This is the simplest model used in signal processing and it is very useful for initial analysis.

It is also very widely used. The SISO model is shown in Figure 1.3 and as the name suggests,

it describes a system with a single input, which gives a single output signal. This output

signal is dependent on the input and is given by:

x(k) = h ∗ s(k) + b(k) (1.2)

where x(k) is the output signal, s(k) is the source signal of interest, and b(k) is additive

noise at the output of the system. The symbol * represents the linear convolution operation.

There may be noise at the input of the system which could have added to the source signal

s(k) before it gets convolved with the system h, but this kind of noise is disregarded in this

work.
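Equation 1.2 can be illustrated with a short Python sketch that convolves a source signal with a channel and adds output noise. The function names, the Gaussian noise model and the example values are assumptions made for illustration only.

```python
import random


def convolve(h, s):
    """Linear convolution of two finite sequences."""
    out = [0.0] * (len(h) + len(s) - 1)
    for i, h_i in enumerate(h):
        for j, s_j in enumerate(s):
            out[i + j] += h_i * s_j
    return out


def siso_output(h, s, noise_std=0.0, seed=0):
    """x(k) = h * s(k) + b(k), with b(k) modelled here as white Gaussian noise."""
    rng = random.Random(seed)
    return [c + rng.gauss(0.0, noise_std) for c in convolve(h, s)]


# Hypothetical two-tap channel driven by a unit impulse, without noise.
x = siso_output([1.0, 0.5], [1.0, 0.0, 0.0])
print(x)
```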

Figure 1.3: SISO Model

1.5.2 Single Input Multiple Output (SIMO) Model

This model forms the core of the research work of this study. The system is shown in figure

1.4.

Figure 1.4: SIMO System

In this system, a single source signal s(k) produces multiple output signals. The system can be represented mathematically as

$$ x_n(k) = h_n * s(k) + b_n(k), \quad n = 1, 2, \ldots, N \tag{1.3} $$

where the output signals are identified by their subscripts, and the signals have the same meaning as in the SISO system. If vector notation is used, this system can be rewritten as

$$ x_n(k) = \mathbf{h}_n^T \mathbf{s}(k) + b_n(k) \tag{1.4} $$

where

$$ \mathbf{h}_n = [h_{n,0}\ h_{n,1}\ \ldots\ h_{n,L-1}]^T $$

$$ \mathbf{s}(k) = [s(k)\ s(k-1)\ \ldots\ s(k-L+1)]^T $$

and $[\cdot]^T$ denotes the transpose of a vector.

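The subscripts can be discarded if matrix notation is adopted for the SIMO system, which gives:

$$ \mathbf{x}(k) = \mathbf{H}\,\mathbf{s}(k) + \mathbf{b}(k) \tag{1.5} $$

where

$$ \mathbf{x}(k) = [x_1(k)\ x_2(k)\ \ldots\ x_N(k)]^T $$

$$ \mathbf{H} = \begin{bmatrix} h_{1,0} & h_{1,1} & \cdots & h_{1,L-1} \\ h_{2,0} & h_{2,1} & \cdots & h_{2,L-1} \\ \vdots & \vdots & \ddots & \vdots \\ h_{N,0} & h_{N,1} & \cdots & h_{N,L-1} \end{bmatrix}_{N \times L} $$

$$ \mathbf{b}(k) = [b_1(k)\ b_2(k)\ \ldots\ b_N(k)]^T $$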
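The matrix form of the SIMO model can be sketched in plain Python as below. The function computes the noise-free microphone vector for one time index by building the source vector s(k) and multiplying it by each row of H. The function name and the example channel values are hypothetical.

```python
def simo_snapshot(H, s, k):
    """Noise-free microphone vector x(k) = H s(k) for an N x L channel matrix H."""
    L = len(H[0])
    # s(k) = [s(k), s(k-1), ..., s(k-L+1)]^T, with zeros before the signal starts.
    s_vec = [s[k - l] if 0 <= k - l < len(s) else 0.0 for l in range(L)]
    # Each row of H holds the impulse response of one channel.
    return [sum(h_nl * s_l for h_nl, s_l in zip(row, s_vec)) for row in H]


# Hypothetical two-channel system with two-tap impulse responses.
H = [[1.0, 0.5],
     [0.2, 0.0]]
s = [1.0, 2.0, 3.0]
x1 = simo_snapshot(H, s, 1)
print(x1)
```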


1.5.3 Multiple Input Single Output (MISO) Systems

It is possible to have an acoustic system with multiple input sources and a single microphone

signal as output. This is referred to as a Multiple Input Single Output (MISO) system and

is shown below in Figure 1.5.

Figure 1.5: MISO System

1.5.4 Multiple Input Multiple Output (MIMO) Systems

A system with multiple signal sources and multiple microphone pickups (outputs) is referred to as a Multiple Input Multiple Output (MIMO) system, and is illustrated in Figure 1.6. This is the most general configuration of all acoustic system models. Any MIMO system can be decomposed into a set of SIMO systems; thus this study focuses on the SIMO system.

Figure 1.6: Multiple Input Multiple Output - MIMO System

1.6 Performance Criteria

1.6.1 Normalized Projection Misalignment

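The Normalized Projection Misalignment (NPM) [4] criterion measures the misalignment after projecting the true impulse response onto the estimated impulse response, thereby ignoring scaling factors. This is necessary in many situations, because the estimated impulse response is usually a scaled version of the true impulse response.

$$ \mathrm{NPM}(k) = \frac{\|\varsigma(k)\|}{\|\mathbf{h}\|} \tag{1.6} $$

where

$$ \varsigma(k) = \mathbf{h} - \frac{\mathbf{h}^T \hat{\mathbf{h}}(k)}{\hat{\mathbf{h}}(k)^T \hat{\mathbf{h}}(k)}\, \hat{\mathbf{h}}(k) \tag{1.7} $$

Here $\mathbf{h}$ is the true impulse response and $\hat{\mathbf{h}}(k)$ is its estimate at time k. This implies that with a perfectly identified set of channels, the NPM will be zero.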
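A minimal Python sketch of the NPM follows, assuming the true and estimated impulse responses are given as flat lists of equal length (for a multichannel system, the channel responses would be stacked into one vector). The function name is hypothetical.

```python
import math


def npm(h_true, h_est):
    """Normalized projection misalignment between h_true and h_est."""
    dot_he = sum(a * b for a, b in zip(h_true, h_est))
    dot_ee = sum(b * b for b in h_est)
    scale = dot_he / dot_ee  # projection coefficient of h_true onto h_est
    residual = [a - scale * b for a, b in zip(h_true, h_est)]
    norm_res = math.sqrt(sum(r * r for r in residual))
    norm_h = math.sqrt(sum(a * a for a in h_true))
    return norm_res / norm_h


# A scaled copy of the true response gives zero misalignment.
print(npm([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
# An estimate orthogonal to the true response gives a misalignment of one.
print(npm([1.0, 0.0], [0.0, 1.0]))
```

The result is often reported in decibels as 20 log10 of this value, which is a presentation choice rather than part of the definition given here.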

1.6.2 Signal to Deviation Ratio

The Signal to Deviation Ratio (SDR) evaluates the difference between the original and dereverberated signals. It is determined as

$$ \mathrm{SDR} = 10 \log_{10}\left(\frac{\sum_{k=0}^{N} s(k)^2}{\sum_{k=0}^{N} \left(s(k) - \hat{s}(k)\right)^2}\right) \tag{1.8} $$

where:

$s(k)$ is the original source signal and

$\hat{s}(k)$ is the dereverberated source signal

This implies that the higher the SDR, the more closely the dereverberated signal approaches the original signal.
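The SDR can be sketched in Python as below. This sketch assumes that both sums are taken over squared samples, so that the ratio compares signal energy with deviation energy; the function name and test values are hypothetical.

```python
import math


def sdr_db(s, s_hat):
    """Signal to Deviation Ratio in dB between s and its estimate s_hat."""
    signal_energy = sum(x * x for x in s)
    deviation_energy = sum((x - y) ** 2 for x, y in zip(s, s_hat))
    if deviation_energy == 0:
        return math.inf  # the estimate matches the original exactly
    return 10.0 * math.log10(signal_energy / deviation_energy)


# A constant 10 percent amplitude error corresponds to an SDR of about 20 dB.
s = [1.0, 1.0, 1.0, 1.0]
s_hat = [0.9, 0.9, 0.9, 0.9]
print(sdr_db(s, s_hat))
```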


CHAPTER 2

Supervised Inverse filtering based dereverberation

2.1 The NLMS algorithm for a Single Input Single Output System Identification

The NLMS algorithm is one of a class of algorithms that provide an approximation of the Wiener solution to a given system setup[5][6][7]. In the first part of this work, we used the NLMS algorithm as described in [6] to obtain a supervised inverse filter of an acoustic channel.

2.1.1 Supervised inverse filtering

The steps that were taken are detailed below:

1. Channel identification using the NLMS algorithm

2. Estimation of minimum steady state error in ideal system identification

3. Calculation of observed relative error in measurement


4. Selection of RT60 using averaged squared-impulse response amplitude decay

5. Re-estimation of impulse response using RT60

6. Inverse filtering of output signal using observed impulse response

2.1.2 Impulse Response Measurements - Channel identification

using the NLMS algorithm

The system setup for channel identification based on the SISO model is shown in Figure

2.1. The signal received by the microphone contains, in addition to that received from

the speaker, ambient noise. This ambient noise comprises ventilation noise, computer

power supply noise, computer fan noise, fluorescent lighting noise, and other possible noise

contributors. This noise limits the error performance of the system identification. The

source signal (white noise bandlimited to 4kHz) is generated using CoolEditPro Software

for Windows. Analysis of the theoretical minimum error is as follows:

$$ y_i = k * w_i \tag{2.1} $$

$$ x_i = k * h_i + n_i \tag{2.2} $$

$$ e_i = k * h_i + n_i - k * w_i \tag{2.3} $$

In the initial state $w_i = 0$, so the error equals the recorded signal. The ratio of the error variance to the signal variance, evaluated as the relative error, is then

$$ \text{relative error} = 10 \log_{10}\left(\frac{\operatorname{var}(k * h_i + n_i)}{\operatorname{var}(k * h_i + n_i)}\right) = 0\ \text{dB} \tag{2.4} $$

After convergence, $w_i = \hat{h}_i$, where $\hat{h}_i$ is the estimate of the room impulse response, and the error can be approximated as

$$ e_i = k * h_i + n_i - k * \hat{h}_i \approx n_i \tag{2.5} $$

This means that the minimum measured relative error will be given by:

$$ \text{minimum relative error} = 10 \log_{10}\left(\frac{\operatorname{var}(n_i)}{\operatorname{var}(k * h_i + n_i)}\right) = 10 \log_{10}\left(\frac{\operatorname{var}(n_i)}{\operatorname{var}(x_i)}\right) \tag{2.6} $$

Equation 2.6 implies that the best results are obtained when the noise power is minimal. The theoretical minimum relative error and the corresponding measured relative error can be seen in tables 2.1, 2.2 and 2.3.
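The identification procedure and the relative error of equation 2.6 can be sketched with a small NLMS simulation in Python. This is an illustrative sketch only: the four-tap channel, the noise level and the function names are hypothetical, and the step size beta = 0.3 is assumed to correspond to the beta column of the tables.

```python
import math
import random


def nlms_identify(k_sig, x_sig, num_taps, beta=0.3, eps=1e-8):
    """Adapt w so that k * w tracks x; returns (w, error signal)."""
    w = [0.0] * num_taps
    buf = [0.0] * num_taps  # most recent input samples, newest first
    errors = []
    for k_n, x_n in zip(k_sig, x_sig):
        buf = [k_n] + buf[:-1]
        y_n = sum(w_i * b_i for w_i, b_i in zip(w, buf))
        e_n = x_n - y_n
        norm = eps + sum(b * b for b in buf)  # input power normalization
        w = [w_i + beta * e_n * b_i / norm for w_i, b_i in zip(w, buf)]
        errors.append(e_n)
    return w, errors


def variance(v):
    mean = sum(v) / len(v)
    return sum((a - mean) ** 2 for a in v) / len(v)


rng = random.Random(1)
h = [0.8, -0.4, 0.2, 0.1]                            # hypothetical room response
k_sig = [rng.gauss(0.0, 1.0) for _ in range(4000)]   # white excitation
noise = [rng.gauss(0.0, 0.05) for _ in range(4000)]  # ambient noise n
x_sig = [
    sum(h[l] * k_sig[n - l] for l in range(len(h)) if n - l >= 0) + noise[n]
    for n in range(len(k_sig))
]

w, e = nlms_identify(k_sig, x_sig, num_taps=len(h))
min_rel_err_db = 10 * math.log10(variance(noise) / variance(x_sig))
obs_rel_err_db = 10 * math.log10(variance(e[-1000:]) / variance(x_sig))
print(min_rel_err_db, obs_rel_err_db)
```

In such a simulation the observed relative error is expected to settle somewhat above the theoretical minimum, since the adaptive filter keeps fluctuating around the true response.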

Channel  Variance of    Variance of     Minimum       Measured      RT60       Filter   beta
Number   ambient noise  desired signal  Error (dB)    Error (dB)    (samples)  length

2        9.91301E-06    0.000741502     -18.73907176  -17.83349399  3500       5000     0.3
3        1.6257E-05     0.00138798      -19.31342429  -18.38510005  3500       5000     0.3
4        1.05435E-05    0.000777626     -18.67785155  -18.0108574   3500       5000     0.3
5        5.67785E-06    0.000404648     -18.5289299   -17.35749282  3500       5000     0.3
6        2.62405E-05    0.001071439     -16.10995427  -14.620191    3500       5000     0.3
7        8.69834E-06    0.000621331     -18.53886558  -17.55140998  3500       5000     0.3
8        2.5568E-05     0.001025777     -16.03356823  -14.2437262   3500       5000     0.3

Table 2.1: Room:Amethyst - Error Analysis and Measurement at a distance of 2m


Channel  Variance of    Variance of     Minimum       Measured      RT60       Filter   beta
Number   ambient noise  desired signal  Error (dB)    Error (dB)    (samples)  length

2        1.55207E-05    0.000454341     -14.66469448  -15.06718021  3500       5000     0.3
3        2.72855E-05    0.000830839     -14.83584301  -14.85222986  3500       5000     0.3
4        1.53082E-05    0.000458906     -14.76798329  -14.67470254  3500       5000     0.3
5        8.60381E-06    0.000243119     -14.51127992  -14.38037285  3500       5000     0.3
6        4.89463E-05    0.000631809     -11.10865978  -11.38598911  3500       5000     0.3
7        1.34268E-05    0.000360833     -14.29335597  -14.28905174  3500       5000     0.3
8        4.71883E-05    0.000639579     -11.32060273  -10.98358546  3500       5000     0.3

Table 2.2: Room:Amethyst - Error Analysis and Measurement at a distance of 3m



Channel  Variance of    Variance of     Minimum       Measured      RT60       Filter   beta
Number   ambient noise  desired signal  Error (dB)    Error (dB)    (samples)  length

2        N/A            0.000857483     N/A           -17.8348108   2100       5000     0.3
3        N/A            0.001721477     N/A           -18.35951894  2100       5000     0.3
4        N/A            0.000990037     N/A           -18.30126177  2100       5000     0.3
5        N/A            0.000438182     N/A           -17.21057975  2100       5000     0.3
6        N/A            0.001236768     N/A           -15.44003787  2100       5000     0.3
7        N/A            0.00078554      N/A           -18.2797752   2100       5000     0.3
8        N/A            0.001378459     N/A           -15.71278552  2100       5000     0.3

Table 2.3: Room:Smaragden - Error Analysis and Measurement at a distance of 1m

2.2 Impulse Response Measurement Results

The impulse responses identified at different distances in two rooms are shown next. From each impulse response, the RT60 of the reverberation was estimated by measuring the time for the filter weights to decay to 60dB below the initial level. The observed impulse responses and the corresponding smoothed filter coefficients are shown in figures 2.2, 2.3, 2.4 and 2.5 for measurements taken at a distance of 1 meter in room Amethyst. The impulse responses and filter coefficients obtained at a distance of 2 meters are shown in figures 2.6, 2.7, 2.8 and 2.9. From the smoothed filter coefficients, the RT60 is estimated to be 2100 samples in room Smaragden and 3500 samples in room Amethyst. The measurement setup used in this experiment is shown in figure 2.10.
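One possible way to read an RT60 value from identified filter weights is sketched below in Python. The moving average over squared coefficients, the window length of 50 taps and the function names are assumptions made for illustration; the exact smoothing used for the figures is not specified here.

```python
import math


def smoothed_energy_db(w, window=50):
    """Moving average of w^2, in dB relative to its peak value."""
    sq = [c * c for c in w]
    smooth = []
    for i in range(len(sq)):
        seg = sq[i:i + window]
        smooth.append(sum(seg) / len(seg))
    peak = max(smooth)
    return [10.0 * math.log10(max(v, 1e-30) / peak) for v in smooth]


def rt60_samples(w, window=50, drop_db=60.0):
    """Number of taps after the peak until the smoothed energy has dropped by drop_db."""
    curve = smoothed_energy_db(w, window)
    start = curve.index(max(curve))
    for i in range(start, len(curve)):
        if curve[i] <= -drop_db:
            return i - start
    return None  # the response never decays by drop_db within its length


# Synthetic response whose amplitude decays by 60 dB every 1000 taps.
w_test = [(-1) ** n * 10 ** (-3.0 * n / 1000.0) for n in range(2000)]
print(rt60_samples(w_test))
```

Dividing the returned number of samples by the sampling rate would give the reverberation time in seconds.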

k - white noise generated using CoolEditPro, band limited to 4kHz

hi - room impulse response for each channel ( i = 1, 2, 3, 4, 5, 6, 7)

wi - impulse response of adaptive filter after convergence (system identification)

di - filtered broadband noise output from room impulse response

ni - observed ambient noise at each channel (obtained from the first 5000 samples before the excitation signal is activated, or at the end of the recorded signal after the reverberation has decayed by more than 99%)

xi - recorded signal at each microphone output (desired) for channel 1, 2, ... , 7

ei - error signal of NLMS algorithm for each channel

Figure 2.1: Acoustic Channel Identification with the NLMS algorithm

2.2.1 Inverse Filtering and Performance Evaluation

The following steps were taken to perform inverse filtering using the NLMS algorithm:

1. Filter source signal using impulse response estimated with the NLMS algorithm

2. Identify the inverse filter using the NLMS algorithm

3. Convolve the identified impulse response with the inverse impulse response and verify that the result approximates a delta function

4. Perform an analysis of the convolution in step 3

The result of the convolution with the inverse filter for room Amethyst at a distance of 1 meter is shown for all the channels in figures 2.14 and 2.15. A perfect inverse filter should give the Kronecker delta as the result. It can be observed from the figures that the Kronecker delta is being approached.

Figure 2.2: Impulse Response - Distance : 1m, Room : Amethyst, Channels : 1 - 4 (panels show filter weights W against filter tap #, taps 0 to 5000; the channel 1 panel is titled "Channel = 1 Min Error = −17.8891 Observed Error = −16.5325")

Figure 2.3: Impulse Response - Distance : 1m, Room : Amethyst, Channels : 5 - 7 (filter weights W against filter tap #)

Figure 2.4: Smoothed Filter Coefficients - Distance : 1m, Room : Amethyst, Channels : 1 - 4 (panels show averaged W^2 against filter tap #; the channel 1 panel is titled "Channel = 1 Min Error = −17.8891 Observed Error = −16.5325")

Figure 2.5: Smoothed Filter Coefficients - Distance : 1m, Room : Amethyst, Channels : 5 - 7 (averaged W^2 against filter tap #)

Figure 2.6: Impulse Response - Distance : 2m, Room : Amethyst, Channels : 1 - 4 (filter weights W against filter tap #)

Figure 2.7: Impulse Response - Distance : 2m, Room : Amethyst, Channels : 5 - 7 (filter weights W against filter tap #)

Figure 2.8: Smoothed Filter Coefficients - Distance : 2m, Room : Amethyst, Channels : 1 - 4 (averaged W^2 against filter tap #)

Figure 2.9: Smoothed Filter Coefficients - Distance : 2m, Room : Amethyst, Channels : 5 - 7 (averaged W^2 against filter tap #)

Figure 2.10: Measurement Setup (diagram labels: Speaker, Microphone Array, 2m, 4cm, Height of Microphone Array = 1m, Height of speaker = 1m)

Figure 2.11: Direct Inverse (block diagram: channel H(z) with additive noise b(k), followed by the inverse filter G(z) = 1/H(z))

Figure 2.12: Mean Square Error Inverse (block diagram: channel H(z) with additive noise b(k), inverse filter G(z), and a delayed reference s(k − k_d))

Figure 2.13: MINT Method (block diagram: SIMO system H1(z) to HN(z) followed by MINT inverse filters G1(z) to GN(z))

Figure 2.14: Impulse Response Convolution with LS Inverse channels 1 to 4 (panels titled "Convolution of Inverse Filter and Impulse Response", amplitude against offset #)

2.3 Inverse Filtering - Least squares method

Following system identification, the inverse of the identified filters must be obtained. One might expect to obtain the inverse filter by direct mathematical inversion of the impulse response; however, in the case of room impulse responses, the resulting inverse filter will be unstable. This is because room impulse responses are generally non-minimum-phase[8].
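The instability of a direct inverse can be illustrated with a toy example in Python. The sketch below computes the causal impulse response of 1/G(z) by recursion for two hypothetical two-tap channels: one with its zero inside the unit circle (minimum phase) and one with its zero outside (non-minimum phase). The function name and channel values are illustrative assumptions, not measured responses.

```python
def direct_inverse_impulse_response(g, n):
    """First n samples of the causal impulse response of 1/G(z)."""
    y = []
    for k in range(n):
        x_k = 1.0 if k == 0 else 0.0  # unit impulse input
        # Recursion: g[0] y(k) = x(k) - sum_{j>=1} g[j] y(k-j)
        acc = x_k - sum(g[j] * y[k - j] for j in range(1, len(g)) if k - j >= 0)
        y.append(acc / g[0])
    return y


# Zero at z = 0.5 (inside the unit circle): the inverse decays.
print(direct_inverse_impulse_response([1.0, -0.5], 4))
# Zero at z = 2 (outside the unit circle): the inverse grows without bound.
print(direct_inverse_impulse_response([1.0, -2.0], 4))
```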

Consider the SISO acoustic system shown in figures 2.16 and 2.17, consisting of a speaker S1 and a microphone M, with transfer function $G(z^{-1})$. The inverse of the transfer function is

$$ H(z^{-1}) = \frac{1}{G(z^{-1})} \tag{2.7} $$

Figure 2.15: Impulse Response Convolution with LS Inverse, channels 5 to 7
[Three panels; horizontal axis: Offset # (1000 to 10000); vertical axis: Amplitude (0 to 0.8)]

Figure 2.16: A SISO acoustic system, with loudspeaker S1, microphone M, and acoustic room impulse response G
[Block diagram; labels: Input, inverse filter H(z⁻¹), S1, acoustic transmission channel G(z⁻¹), M, Output]

Figure 2.17: Inverse filtering a SISO system with the LSE Method
[Block diagram; labels: Input, linear SISO system g(k), FIR filter h(k), Output]

An approximate inverse filter can be obtained using the least squares error criterion [9]. This least squares inverse is constructed as an FIR filter, which is always stable. Considering the system in figure 2.16, the following relationship exists between the inverse filter h(k) and the system impulse response g(k):

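\[ d(k) = g(k) * h(k) \tag{2.8} \]

where

\[ d(k) = \begin{cases} 1, & k = 0 \\ 0, & k = 1, 2, \ldots \end{cases} \]

and ∗ denotes linear convolution. In matrix form, equation 2.8 can be written as

\[
\underbrace{\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{(L+1) \times 1}
=
\underbrace{\begin{bmatrix}
g(0) & 0 & \cdots & 0 \\
g(1) & g(0) & \ddots & \vdots \\
\vdots & g(1) & \ddots & 0 \\
g(m) & \vdots & \ddots & g(0) \\
0 & g(m) & & g(1) \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & g(m)
\end{bmatrix}}_{(L+1) \times (i+1)}
\underbrace{\begin{bmatrix} h(0) \\ h(1) \\ \vdots \\ h(i) \end{bmatrix}}_{(i+1) \times 1}
\tag{2.9}
\]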

Equation 2.9 can also be written as

\[ D = GH \tag{2.10} \]

where
m = order of the z-transform of g(k)
i = order of the FIR inverse filter
L = i + m
D = (L+1) by 1 column vector
H = (i+1) by 1 column vector
G = (L+1) by (i+1) matrix

Equation 2.10 is overdetermined, because matrix G has more rows than columns:

\[ L + 1 = m + i + 1 > i + 1 \tag{2.11} \]

An approximate solution can be computed with the least squares method using the relationship

\[ H = (G^T G)^{-1} G^T D \tag{2.12} \]

where G^T is the transpose of matrix G. The error energy of the approximate filter is given by

\[ \mathrm{MSE} = (D - GH)^T (D - GH) \tag{2.13} \]

This error does not converge to zero even if the order of the inverse filter is increased, because g(k) is a nonminimum phase impulse response [10]. There is therefore a limit to the performance that can be obtained with the least squares error method. An exact inverse of the room impulse response can be obtained using the multiple input/output inverse theorem (MINT). The performance of the LSE method when used as an inverse filter is shown in figure 2.18. To improve on this performance, MINT can be used. To apply MINT, additional channels are introduced into the acoustic system.
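The least squares inverse of equations 2.12 and 2.13 can be sketched numerically in Python with NumPy. This is a minimal illustration under stated assumptions: the two-tap responses `g_min` and `g_non` and the helper names `convolution_matrix` and `ls_inverse` are hypothetical and are not taken from the measurements in this thesis. The sketch compares the error energy for a minimum phase and a nonminimum phase response.

```python
import numpy as np

def convolution_matrix(g, n_cols):
    """Toeplitz matrix G such that G @ h equals np.convolve(g, h)."""
    g = np.asarray(g, dtype=float)
    G = np.zeros((len(g) + n_cols - 1, n_cols))
    for c in range(n_cols):
        G[c:c + len(g), c] = g          # each column is g shifted down by one row
    return G

def ls_inverse(g, order):
    """Least squares FIR inverse with order+1 taps (equation 2.12)."""
    G = convolution_matrix(g, order + 1)            # (L+1) x (i+1)
    D = np.zeros(G.shape[0])
    D[0] = 1.0                                      # d(k): unit impulse at k = 0
    H, *_ = np.linalg.lstsq(G, D, rcond=None)       # same solution as (G^T G)^-1 G^T D
    residual = D - G @ H
    return H, float(residual @ residual)            # error energy (equation 2.13)

# Illustrative (assumed) channels: zero inside / outside the unit circle.
g_min = np.array([1.0, 0.5])    # zero at z = -0.5, minimum phase
g_non = np.array([0.5, 1.0])    # zero at z = -2.0, nonminimum phase

for order in (8, 32):
    _, mse_min = ls_inverse(g_min, order)
    _, mse_non = ls_inverse(g_non, order)
    print(f"order={order:3d}  MSE(min phase)={mse_min:.3e}  MSE(nonmin phase)={mse_non:.3e}")
```

`np.linalg.lstsq` is used instead of forming (GᵀG)⁻¹ explicitly; for a full column rank G both give the same least squares solution, and the former is numerically safer.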

Figure 2.18: SDR with the supervised inverse filtering method
[Plot titled "Nonblind Inverse Filtering"; horizontal axis: SNR in dB (0 to 50); vertical axis: SDR in dB (1 to 11); NLMS method, filter length = 300 taps, step size = 0.07, delay = 165 samples]

Figure 2.19: MINT Filtering of a 1-Input 2-Output system
[Block diagram; labels: x(k); 1-input 2-output SIMO system g1(k), g2(k); inverse filters h1(k), h2(k); summation; y(k)]

2.4 Inverse Filtering - The Multichannel Inverse Theorem - MINT

The MINT method is illustrated in figure 2.19, in which an extra channel has been added to the SISO system of figure 2.17. The exact inverse filter is obtained by satisfying

\[ D(z^{-1}) = 1 = G_1(z^{-1}) H_1(z^{-1}) + G_2(z^{-1}) H_2(z^{-1}) \tag{2.14} \]

where D(z⁻¹) is the z-transform of d(k) in equation 2.8. Because G1(z⁻¹), G2(z⁻¹), H1(z⁻¹) and H2(z⁻¹) are polynomials in z⁻¹, a solution of equation 2.14 exists when the following conditions are satisfied:

1. G1(z⁻¹) and G2(z⁻¹) are relatively prime, i.e. they have no common zero in the z-plane.

2. The orders of the inverse filters H1(z⁻¹) and H2(z⁻¹) are less than those of G2(z⁻¹) and G1(z⁻¹), respectively.

The solution of the polynomial equation 2.14 gives the coefficients of a pair of FIR filters H1(z⁻¹) and H2(z⁻¹), which together form an exact inverse of the system [10].

To compute the coefficients of these FIR filters (that is, to solve the equation), equation 2.14 is written in the time domain as

\[ d(k) = g_1(k) * h_1(k) + g_2(k) * h_2(k) \tag{2.15} \]

where
g1(k) and g2(k) denote the impulse responses of G1(z⁻¹) and G2(z⁻¹)
h1(k) and h2(k) denote the impulse responses of H1(z⁻¹) and H2(z⁻¹)

A compact notation in the time domain is obtained with the matrix formulation

\[ D = G_1 H_1 + G_2 H_2 = \begin{bmatrix} G_1 & G_2 \end{bmatrix} \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \tag{2.16} \]

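or

\[
\underbrace{\begin{bmatrix} 1 \\ 0 \\ 0 \\ \vdots \\ 0 \end{bmatrix}}_{(L+1) \times 1}
=
\underbrace{\left[\begin{array}{cccc|cccc}
g_1(0) & 0 & \cdots & 0 & g_2(0) & 0 & \cdots & 0 \\
g_1(1) & g_1(0) & \ddots & \vdots & g_2(1) & g_2(0) & \ddots & \vdots \\
\vdots & g_1(1) & \ddots & 0 & \vdots & g_2(1) & \ddots & 0 \\
g_1(m) & \vdots & \ddots & g_1(0) & g_2(n) & \vdots & \ddots & g_2(0) \\
0 & g_1(m) & & g_1(1) & 0 & g_2(n) & & g_2(1) \\
\vdots & \ddots & \ddots & \vdots & \vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & g_1(m) & 0 & \cdots & 0 & g_2(n)
\end{array}\right]}_{(L+1) \times (i+j+2)}
\underbrace{\begin{bmatrix} h_1(0) \\ h_1(1) \\ \vdots \\ h_1(i) \\ h_2(0) \\ h_2(1) \\ \vdots \\ h_2(j) \end{bmatrix}}_{(i+j+2) \times 1}
\tag{2.17}
\]

where
m + 1 and n + 1 denote the lengths of the impulse responses g1(k) and g2(k)
i and j denote the orders of the FIR filters H1(z⁻¹) and H2(z⁻¹), which have i + 1 and j + 1 coefficients
L = m + i = n + j
D denotes an (L + 1) × 1 column vector
[H1^T H2^T]^T denotes an (i + j + 2) × 1 column vector
[G1 G2] denotes an (L + 1) × (i + j + 2) matrix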

[G1 G2] is forced to become a square matrix by selecting i and j for the FIR inverse filters to satisfy

i = n − 1
j = m − 1

The coefficients of the FIR filters h1(k) and h2(k) can then be computed as

\[ \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} = \begin{bmatrix} G_1 & G_2 \end{bmatrix}^{-1} D \tag{2.18} \]
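The two-channel construction of equations 2.15 to 2.18 can be sketched as follows. This is a hedged illustration: the three-tap responses `g1` and `g2` are made-up values chosen to have no common zero, and `mint_two_channel` is a hypothetical helper name. The check at the end confirms that the filtered channels sum to a unit impulse.

```python
import numpy as np

def convolution_matrix(g, n_cols):
    """Toeplitz matrix G such that G @ h equals np.convolve(g, h)."""
    g = np.asarray(g, dtype=float)
    G = np.zeros((len(g) + n_cols - 1, n_cols))
    for c in range(n_cols):
        G[c:c + len(g), c] = g
    return G

def mint_two_channel(g1, g2):
    """Exact two-channel inverse filters from equation 2.18."""
    m, n = len(g1) - 1, len(g2) - 1        # orders of G1 and G2
    i, j = n - 1, m - 1                    # inverse filter orders that make [G1 G2] square
    G = np.hstack([convolution_matrix(g1, i + 1),
                   convolution_matrix(g2, j + 1)])   # (L+1) x (i+j+2)
    D = np.zeros(G.shape[0])
    D[0] = 1.0
    H = np.linalg.solve(G, D)              # requires g1 and g2 to be coprime
    return H[:i + 1], H[i + 1:]

# Assumed example channels (not measured data).
g1 = np.array([0.5, 1.0, 0.3])
g2 = np.array([1.0, -0.4, 0.2])
h1, h2 = mint_two_channel(g1, g2)

d = np.convolve(g1, h1) + np.convolve(g2, h2)   # equation 2.15
print(np.round(d, 12))
```

If the two channels share a zero, the stacked matrix becomes singular and `np.linalg.solve` raises an error, which mirrors the first existence condition stated for equation 2.14.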

The SIMO system using three MINT channels is also considered in this study. In this case, we have

\[ D(z^{-1}) = 1 = G_1(z^{-1}) H_1(z^{-1}) + G_2(z^{-1}) H_2(z^{-1}) + G_3(z^{-1}) H_3(z^{-1}) \tag{2.19} \]

Similar to equation 2.16,

\[ D = G_1 H_1 + G_2 H_2 + G_3 H_3 = \begin{bmatrix} G_1 & G_2 & G_3 \end{bmatrix} \begin{bmatrix} H_1 \\ H_2 \\ H_3 \end{bmatrix} \tag{2.20} \]

and the filter coefficients can be obtained by

\[ \begin{bmatrix} H_1 \\ H_2 \\ H_3 \end{bmatrix} = \begin{bmatrix} G_1 & G_2 & G_3 \end{bmatrix}^{-1} D \tag{2.21} \]

System dereverberation using filters obtained with the MINT method with two and with three channels is compared in this study.

CHAPTER 3

Robust Inverse filtering with MINT

It cannot be assumed that the transfer function between a source and a receiver in a room, referred to as the room transfer function (RTF), is invariant. In this chapter, the performance of the MINT method is investigated with respect to three parameters:

• Regularization

• Filter length

• Modeling delay

These parameters are varied under different signal to noise ratio (SNR) conditions. They were adjusted to reduce the filter energy, since filters with less energy cause less signal degradation [11]. A SIMO system is considered in this performance investigation.

3.1 Generalized MINT performance

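The performance of the MINT method can be improved with the use of a regularization parameter. Recall equation 2.16:

\[ D = G_1 H_1 + G_2 H_2 = \begin{bmatrix} G_1 & G_2 \end{bmatrix} \begin{bmatrix} H_1 \\ H_2 \end{bmatrix} \tag{3.1} \]

Generalizing equation 2.16 to P channels,

\[ D = G_1 H_1 + G_2 H_2 + \ldots + G_P H_P = \begin{bmatrix} G_1 & G_2 & \cdots & G_P \end{bmatrix} \begin{bmatrix} H_1 \\ H_2 \\ \vdots \\ H_P \end{bmatrix} \tag{3.2} \]

which can be rewritten as

\[ D = GH \tag{3.3} \]

where

\[ G = \begin{bmatrix} G_1 & G_2 & \cdots & G_P \end{bmatrix} \]

and

\[
G_i = \underbrace{\begin{bmatrix}
g_i(0) & 0 & \cdots & 0 \\
g_i(1) & g_i(0) & \ddots & \vdots \\
\vdots & g_i(1) & \ddots & 0 \\
g_i(J) & \vdots & \ddots & g_i(0) \\
0 & g_i(J) & & g_i(1) \\
\vdots & \ddots & \ddots & \vdots \\
0 & \cdots & 0 & g_i(J)
\end{bmatrix}}_{M \times J+M}
\]

\[ H = [H_1 \; H_2 \; \cdots \; H_P]^T = [h_1(1), \ldots, h_1(M), \ldots, h_P(1), \ldots, h_P(M)]^T \]

\[ D = [\underbrace{0, \ldots, 0}_{d}, 1, \ldots, 0]^T \]

where
M denotes the filter length for each channel
H is the inverse filter vector
P denotes the number of channels
gi(n) is the impulse response estimate between the source and the i-th microphone
J denotes the number of taps of the impulse response estimate
d denotes a modelling delay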

3.2 Regularization performance

Using a cost function to design the inverse filter, we can write

\[ C = \|D - GH\|^2 + \delta \|H\|^2 \tag{3.4} \]

where δ is a regularization parameter. The minimizer of the cost function in equation 3.4 is given by [12]

\[ H(\delta) = (G^T G + \delta I)^{-1} G^T D \tag{3.5} \]

where I is an identity matrix. The squared L2 norm of this filter vector satisfies

\[
\begin{aligned}
\|H(\delta)\|^2 &\le \|(G^T G + \delta I)^{-1} G^T\|^2 \\
&= \|G (G^T G + \delta I)^{-1} (G^T G + \delta I)^{-1} G^T\| \\
&= \|G^T G (G^T G + \delta I)^{-1} (G^T G + \delta I)^{-1}\| \\
&\approx \|(G^T G + \delta I)^{-1}\| \\
&= \frac{1}{\mu_{\min}[(G^T G + \delta I)]} \\
&\le \|H\|^2
\end{aligned}
\tag{3.6}
\]

The approximation above is obtained by applying a Taylor series expansion to (GᵀG + δI)⁻¹. Let T denote (GᵀG)⁻¹. If δ is sufficiently small, then ‖I‖ ≫ ‖δT‖, and

\[
\begin{aligned}
(G^T G + \delta I)^{-1} &= ((G^T G)(I + \delta (G^T G)^{-1}))^{-1} \\
&= ((G^T G)(I + \delta T))^{-1} \\
&= (I + \delta T)^{-1} (G^T G)^{-1} \\
&= (I - \delta T + \delta^2 T^2 - \ldots)(G^T G)^{-1} \\
&\approx (G^T G)^{-1}
\end{aligned}
\tag{3.7}
\]

The L2 norm of an inverse matrix, as used in equation 3.6, is defined as [12]

\[ \|A^{-1}\| = \frac{1}{\mu_{\min}[A]} \]

where μ_min[A] is the minimum singular value of matrix A. The regularization parameter δ increases the minimum singular value and so reduces the norm of the inverse filter, which reduces the sensitivity to impulse response variation [11]. Because the regularization parameter also reduces the accuracy of the inverse filter, a compromise must be selected. The performance due to regularization is shown in figure 3.4 for a three-channel MINT system. This system was simulated with a fixed delay of 175 taps and different SNR values. It can be observed that the optimum value of the regularization parameter is 10⁻⁷; at this point, the SDR of the dereverberated signal is 23 dB for an input signal with an SNR of 50 dB. The delay was set to the optimum value obtained in the performance study of the MINT method with varying delay. Figure 3.3 shows the optimum values of SDR in a two-channel MINT system. The delay used for the two-channel system is 115 taps, as obtained from the performance curve of the two-channel MINT system with filter length as a variable.
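The effect of δ in equation 3.5 on the filter norm can be illustrated with a short NumPy sketch. The random matrix `G`, the delay position in `D`, and the three δ values are assumptions chosen only for illustration; they are not the room responses or settings used in the simulations of this chapter.

```python
import numpy as np

def regularized_inverse(G, D, delta):
    """H(delta) = (G^T G + delta I)^-1 G^T D  (equation 3.5)."""
    n = G.shape[1]
    return np.linalg.solve(G.T @ G + delta * np.eye(n), G.T @ D)

rng = np.random.default_rng(0)
G = rng.standard_normal((40, 20))      # stand-in for the stacked convolution matrix
D = np.zeros(40)
D[5] = 1.0                             # unit impulse after an assumed modelling delay d = 5

deltas = (0.0, 1e-2, 1.0)
norms = [float(np.linalg.norm(regularized_inverse(G, D, d))) for d in deltas]
errors = [float(np.linalg.norm(D - G @ regularized_inverse(G, D, d))) for d in deltas]

for d, nrm, err in zip(deltas, norms, errors):
    print(f"delta={d:g}  ||H||={nrm:.4f}  ||D - GH||={err:.4f}")
```

The printed values show the compromise described above: a larger δ gives a smaller filter norm but a larger residual ‖D − GH‖.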

3.3 Algorithm Optimization Procedure

In order to obtain the parameters for the optimal results, a three step procedure was

utilized. These steps were applied to the two and three channel MINT inverse filter system.

3.3.1 Algorithm Optimization Procedure Step 1 - Delay

The appropriate delay for the MINT filter is obtained as shown in figures 3.1 and 3.2 for signals at three different input SNR values. Choosing the appropriate delay improves the performance significantly in both the two and three channel MINT systems. It can also be seen that the optimum delay is independent of the noise power in the signal (SNR).

Figure 3.1: SDR dependence on MINT filter delay with different SNR, 2 channel system
[Plot titled "Effect of delay on the output"; horizontal axis: Delay (0 to 200); vertical axis: SDR dB (−100 to 0); curves for SNR = 10 dB, 20 dB and 30 dB; annotation: optimum delay = 115]

Figure 3.2: SDR dependence on MINT filter delay with different SNR, 3 channel system
[Plot titled "Optimum delay for 3 channel MINT filter"; horizontal axis: Delay (0 to 200); vertical axis: SDR dB (−80 to 0); curves for SNR = 10 dB, 20 dB and 30 dB; annotation: optimum delay = 175]

3.3.2 Algorithm Optimization Procedure Step 2 - Regularization

The optimum regularization parameter is obtained as shown in figures 3.3 and 3.4, which reveal that the noisier the environment, the larger the regularization parameter required. Regularization is also more effective in noisy environments.

Figure 3.3: MINT performance with a regularization parameter and different SNR values for a two channel system
[Plot titled "SDR dependence on regularization parameter in various SNR"; horizontal axis: regularization parameter as a power of 10 (−15 to 0); vertical axis: SDR in dB (−25 to 25); curves for SNR = 10, 20, 30, 40 and 50 dB; delay = 115 taps]

3.3.3 Algorithm Optimization Procedure Step 3 - Filter Length

The filter norm depends on the filter length M, which implies that the minimum possible filter length should be used for the inverse filter. The MINT performance observed while the filter length is varied is shown in figures 3.5 and 3.6 for a two and a three channel MINT system. The minimum required filter length is given by

\[ L_h = \frac{L - 1}{N - 1} \tag{3.8} \]

where
L denotes the channel length
N denotes the number of channels

The figures show the performance when extra taps are added to this minimum filter length.
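As a small worked example of equation 3.8, the following sketch evaluates the minimum filter length for a hypothetical channel length of 256 taps. Rounding up to the next integer when the division is not exact is an assumption made here, and the function name is illustrative.

```python
import math

def min_mint_filter_length(channel_length, num_channels):
    """Minimum inverse filter length per channel, L_h = (L - 1) / (N - 1) (equation 3.8)."""
    if num_channels < 2:
        raise ValueError("MINT needs at least two channels")
    # Assumption: round up when (L - 1) is not divisible by (N - 1).
    return math.ceil((channel_length - 1) / (num_channels - 1))

two_channel = min_mint_filter_length(256, 2)
three_channel = min_mint_filter_length(256, 3)
print(two_channel, three_channel)
```

For two channels of equal length L, this agrees with the choice i = n − 1 in section 2.4, which gives an inverse filter with L − 1 coefficients.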

Figure 3.4: MINT performance with a regularization parameter and different SNR values for a three channel system
[Plot titled "SDR dependence on regularization for various SNR"; horizontal axis: regularization parameter as a power of 10 (−15 to 0); vertical axis: SDR in dB (−30 to 30); curves for 10, 20, 30, 40 and 50 dB; delay = 175 taps; annotation: λ = 10⁻³]

3.3.4 Algorithm Optimization Results

To evaluate how the decision delay depends on the filter length in the two and three channel MINT filters, figures 3.7 and 3.8 are plotted. In the two channel MINT filter, the optimum decision delay varies as more coefficients are added to the filter. In the three channel MINT filter, however, a longer filter does not affect the best decision delay.

Comparing the least squares inverse with the two channel and three channel MINT algorithms, it can be concluded that the least squares inverse method has lower computational complexity. However, even in a noisy environment, the signal dereverberated with MINT has less distortion than that obtained with the least squares method.

With the least squares inverse method, the performance (SDR) does not keep improving as the SNR increases: the trend saturates because of the nonminimum phase room impulse response. With MINT, the SDR improves in proportion to the SNR.

In a two channel MINT system, these parameters are highly dependent on each other, so obtaining the best value for each one is difficult, while in a three channel MINT system the dependency is much weaker. It is also observed that there is no significant improvement in the dereverberated signal from two channel MINT to three channel MINT, while the computational complexity increases with the number of channels in the MINT method.

Figure 3.5: SDR dependence on MINT filter length with different SNR, 2 channel system
[Plot titled "SDR dependence on MINT filter length for various SNR"; horizontal axis: taps added to the minimum filter length (0 to 200); vertical axis: SDR in dB (6 to 24); curves for SNR = 10, 20, 30, 40 and 50 dB; delay = 115 taps, optimum λ]

Figure 3.6: SDR dependence on MINT filter length with different SNR, 3 channel system
[Plot titled "SDR dependence on MINT filter length for various SNR"; horizontal axis: taps added to the minimum filter length (0 to 200); vertical axis: SDR in dB (0 to 35); legend: SNR = 50 dB with regularization parameter 10⁻⁷, SNR = 40 dB with regularization parameter 10⁻⁶, SNR = 20 dB with regularization parameter 10⁻⁴]

Figure 3.7: Effects of filter length and delay on SDR with the MINT method with 2 Channels
[Surface plot titled "Dependency of Delay and Filter length in 2 channel"; axes: Delay, Adding length, SDR dB; SNR = 30 dB, 2 channel]

Figure 3.8: Effects of filter length and delay on SDR with the MINT method with 3 Channels
[Surface plot titled "Dependency of Delay and Filter length in 3 channel"; axes: Delay, Adding length, SDR dB]

CHAPTER 4

Unsupervised Inverse filtering based dereverberation

4.1 Basic Principles

The method employed uses the second order statistics of a signal. As shown in [13], if the single input multiple output system shown in figure 1.4 is considered, then for any pair of noise-free output signals x_i(k) and x_j(k),

\[ x_i(k) = h_i(k) * s(k) \tag{4.1} \]

\[ x_j(k) = h_j(k) * s(k) \tag{4.2} \]

and therefore

\[
\begin{aligned}
h_j(k) * x_i(k) &= h_j(k) * [h_i(k) * s(k)] \\
&= h_i(k) * [h_j(k) * s(k)] \\
&= h_i(k) * x_j(k)
\end{aligned}
\tag{4.3}
\]

This equation shows that a relationship exists between the outputs of each channel pair, because they share the same source. The relationship depends on the individual channel responses. By taking advantage of this relationship, an overdetermined system of equations can be formulated using input data and the corresponding output. Under certain conditions, the impulse responses h_i(k) and h_j(k) can be obtained uniquely, up to a scalar multiple [13]. This identification can be achieved as follows. For k = L, ..., N, where N is the last sample index of the received data x_i(k) and x_j(k), equation 4.3 becomes N − L + 1 linear equations involving h_j(·) and h_i(·):

\[ \begin{bmatrix} X_i(L) & \vdots & -X_j(L) \end{bmatrix} \begin{bmatrix} h_j \\ h_i \end{bmatrix} = 0 \tag{4.4} \]

where h_m ≜ [h_m(L), ..., h_m(0)]^T and

\[
X_m(L) = \begin{bmatrix}
x_m(L) & x_m(L+1) & \cdots & x_m(2L) \\
x_m(L+1) & x_m(L+2) & \cdots & x_m(2L+1) \\
\vdots & \vdots & \ddots & \vdots \\
x_m(N-L) & x_m(N-L+1) & \cdots & x_m(N)
\end{bmatrix}
\tag{4.5}
\]

Because this equation can be written for each pair of channels (i, j), the sets of linear equations are combined as shown in [13] to solve for all the channel impulse responses simultaneously. Denote all the channel impulse responses by h ≜ [h_1^T, ..., h_M^T]^T and define

\[
\mathcal{X}_i(L) = \begin{bmatrix}
0 & \cdots & 0 & X_{i+1}(L) & -X_i(L) & & 0 \\
\vdots & & \vdots & \vdots & & \ddots & \\
0 & \cdots & 0 & X_M(L) & 0 & & -X_i(L)
\end{bmatrix}
\tag{4.6}
\]

In the noise free case, h is in the null space of the following matrix:

\[
\mathcal{X}(L) = \underbrace{\begin{bmatrix} \mathcal{X}_1(L) \\ \vdots \\ \mathcal{X}_{M-1}(L) \end{bmatrix}}_{M \text{ blocks}}
\tag{4.7}
\]

This can be written as

\[ \mathcal{X}(L)\, h = 0 \tag{4.8} \]

For this equation to have a unique solution, certain conditions must be met, which are discussed in the next section.
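The cross relation of equation 4.3 can be verified numerically. The sketch below uses assumed data: a white Gaussian source and two random 8-tap channels, with variable names chosen for illustration only.

```python
import numpy as np

rng = np.random.default_rng(1)
s = rng.standard_normal(200)          # assumed source signal
h_i = rng.standard_normal(8)          # assumed channel i impulse response
h_j = rng.standard_normal(8)          # assumed channel j impulse response

x_i = np.convolve(h_i, s)             # equation 4.1, noise free
x_j = np.convolve(h_j, s)             # equation 4.2, noise free

lhs = np.convolve(h_j, x_i)           # h_j * x_i
rhs = np.convolve(h_i, x_j)           # h_i * x_j
cross_relation_error = float(np.max(np.abs(lhs - rhs)))
print(cross_relation_error)           # differs from zero only by rounding error
```

Adding independent noise to `x_i` and `x_j` makes the two sides differ, which is why the algorithms in the following sections minimize the cross relation error rather than solving equation 4.8 exactly.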

4.2 Identifiability Conditions

Blind channel identifiability using the cross relations method depends on the condition of the channels and the condition of the input signal:

• The channel transfer functions do not share any common zeros (they are coprime).

• There must be an input signal to excite the channels.

• The autocorrelation matrix of the input signal is of full rank.

4.3 Algorithms

The set of simultaneous linear equations obtained from the channel outputs can be solved using adaptive algorithms [4]. The following adaptive algorithms are considered:

• Constrained Time Domain Multichannel LMS

• Constrained Time Domain Multichannel Newton Algorithm

• Unconstrained Blind Multichannel LMS with Optimal Step Size Control

• Frequency Domain Normalized Multichannel LMS

The constrained time domain multichannel LMS is developed as outlined in [4]. In the absence of noise,

\[ x_i * h_j = s * h_i * h_j = x_j * h_i, \quad i, j = 1, 2, \ldots, N, \; i \neq j \tag{4.9} \]

thus at time k

\[ \mathbf{x}_i^T(k)\, h_j = \mathbf{x}_j^T(k)\, h_i, \quad i, j = 1, 2, \ldots, N, \; i \neq j \tag{4.10} \]

where

\[ \mathbf{x}_n(k) = [x_n(k) \;\; x_n(k-1) \;\; \cdots \;\; x_n(k-L+1)]^T, \quad n = 1, 2, \ldots, N \tag{4.11} \]

Multiplying equation 4.10 by x_i(k) and taking the expectation yields

\[ R_{x_i x_i} h_j = R_{x_i x_j} h_i, \quad i, j = 1, 2, \ldots, N, \; i \neq j \tag{4.12} \]

where R_{x_i x_j} = E{x_i(k) x_j^T(k)}. Equation 4.12 contains N(N − 1) different equations. Summing the N − 1 cross relations associated with one particular channel h_j yields

\[ \sum_{i=1, i \neq j}^{N} R_{x_i x_i} h_j = \sum_{i=1, i \neq j}^{N} R_{x_i x_j} h_i, \quad j = 1, 2, \ldots, N \tag{4.13} \]

This gives N equations when all the channels are considered, and the set of equations for all the channels can be written as

\[ R_{x+} h = 0 \tag{4.14} \]

where

\[
R_{x+} = \begin{bmatrix}
\sum_{n \neq 1} R_{x_n x_n} & -R_{x_2 x_1} & \cdots & -R_{x_N x_1} \\
-R_{x_1 x_2} & \sum_{n \neq 2} R_{x_n x_n} & \cdots & -R_{x_N x_2} \\
\vdots & \vdots & \ddots & \vdots \\
-R_{x_1 x_N} & -R_{x_2 x_N} & \cdots & \sum_{n \neq N} R_{x_n x_n}
\end{bmatrix}
\tag{4.15}
\]

When the identifiability conditions are met and there is no noise, the matrix R_{x+} is rank deficient by one, so the channel impulse responses can be determined uniquely up to a scale factor. In the presence of noise, however, the right hand side of equation 4.14 is not equal to zero, but becomes an error vector:

\[ e = R_{x+} h \tag{4.16} \]

A cost function based on this error can be defined as

\[ J = \|e\|^2 = e^T e \tag{4.17} \]

This cost function can be minimized in the least squares sense to determine the estimate vector ĥ:

\[ \hat{h} = \arg\min_{h} J = \arg\min_{h} h^T R_{x+}^T R_{x+} h \tag{4.18} \]

The estimate ĥ is the eigenvector of R_{x+} corresponding to its smallest eigenvalue; in the presence of noise, R_{x+} is positive definite. The estimated channel impulse response is a non-zero scaled version of the true impulse response, even though it is properly aligned with the true impulse response in vector space. The algorithms developed therefore also include constraints that prevent the estimated impulse response from reaching the trivial all-zero estimate in the process of converging.
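A batch version of this eigenvector solution can be sketched as follows. The sketch makes several assumptions that are not part of the thesis experiments: a noise free simulation with three random 4-tap channels, a white Gaussian source, a known channel length L, and sample averages in place of the expectations in equation 4.15. The function names are illustrative.

```python
import numpy as np

def delay_vectors(x, L):
    """Rows are x(k) = [x(k), x(k-1), ..., x(k-L+1)] for k = L-1, ..., len(x)-1."""
    return np.array([x[k - L + 1:k + 1][::-1] for k in range(L - 1, len(x))])

def cross_relation_matrix(xs, L):
    """Sample estimate of R_x+ (equation 4.15) from the list of channel outputs xs."""
    N = len(xs)
    V = [delay_vectors(x, L) for x in xs]
    K = V[0].shape[0]

    def R(a, b):
        return V[a].T @ V[b] / K            # estimate of E{x_a(k) x_b^T(k)}

    Rp = np.zeros((N * L, N * L))
    for j in range(N):                      # block row j
        for i in range(N):                  # block column i
            if i == j:
                block = sum(R(n, n) for n in range(N) if n != j)
            else:
                block = -R(i, j)
            Rp[j * L:(j + 1) * L, i * L:(i + 1) * L] = block
    return Rp

rng = np.random.default_rng(2)
N, L = 3, 4
h_true = rng.standard_normal((N, L))        # assumed channels
s = rng.standard_normal(2000)               # assumed white source
xs = [np.convolve(h, s) for h in h_true]

Rp = cross_relation_matrix(xs, L)
eigvals, eigvecs = np.linalg.eigh(Rp)       # symmetric matrix, eigenvalues ascending
h_est = eigvecs[:, 0]                       # eigenvector of the smallest eigenvalue

# The estimate matches the true stacked response only up to scale and sign.
cosine = float(abs(h_est @ h_true.ravel()) / np.linalg.norm(h_true))
print(eigvals[0], cosine)
```

`np.linalg.eigh` returns a unit norm eigenvector, which plays the role of the constraint that keeps the estimate away from the all-zero solution.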

4.4 Constrained Time Domain Multichannel LMS

To derive the update equation for the constrained time domain multichannel LMS algorithm (also referred to as multichannel LMS, MCLMS), the cross relations between output i and output j in equation 4.10 are used. In the presence of noise, the a priori error signal produced is

\[ e_{ij}(k+1) = \mathbf{x}_i^T(k+1)\,\hat{h}_j(k) - \mathbf{x}_j^T(k+1)\,\hat{h}_i(k), \quad i, j = 1, 2, \ldots, N \tag{4.19} \]

In this equation, ĥ_i(k) is the estimate of channel i at time k. If all the channel estimation errors are of the same importance, a cost function can be defined:

\[ \chi(k+1) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} e_{ij}^2(k+1) \tag{4.20} \]

This expression excludes the cases e_ii(k) = 0 (i = 1, 2, ..., N) and counts each pair e_ij(k) = −e_ji(k) only once. A unit norm constraint is imposed on the channel estimate ĥ(k) in order to avoid an all-zero estimate, giving a normalized error signal

\[ \epsilon_{ij}(k+1) = \frac{e_{ij}(k+1)}{\|\hat{h}(k)\|} \tag{4.21} \]

The cost function is then

\[ J(k+1) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \epsilon_{ij}^2(k+1) = \frac{\chi(k+1)}{\|\hat{h}(k)\|^2} \tag{4.22} \]

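The update equation of the constrained time domain multichannel LMS is

\[ \hat{h}(k+1) = \hat{h}(k) - \mu \nabla J(k+1) \tag{4.23} \]

As in the LMS update equation, μ is a small positive step size and ∇J(k+1) is the gradient of J(k+1) with respect to ĥ(k). This gradient is

\[
\begin{aligned}
\nabla J(k+1) = \frac{\partial J(k+1)}{\partial \hat{h}(k)}
&= \frac{\partial}{\partial \hat{h}(k)} \left[ \frac{\chi(k+1)}{\|\hat{h}(k)\|^2} \right] \\
&= \frac{\partial}{\partial \hat{h}(k)} \left[ \frac{\chi(k+1)}{\hat{h}^T(k)\, \hat{h}(k)} \right] \\
&= \frac{1}{\|\hat{h}(k)\|^2} \left[ \frac{\partial \chi(k+1)}{\partial \hat{h}(k)} - 2 J(k+1)\, \hat{h}(k) \right]
\end{aligned}
\tag{4.24}
\]

where

\[
\frac{\partial \chi(k+1)}{\partial \hat{h}(k)} =
\left[ \left( \frac{\partial \chi(k+1)}{\partial \hat{h}_1(k)} \right)^T \;
\left( \frac{\partial \chi(k+1)}{\partial \hat{h}_2(k)} \right)^T \;
\cdots \;
\left( \frac{\partial \chi(k+1)}{\partial \hat{h}_N(k)} \right)^T \right]^T
\tag{4.25}
\]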

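Next, the partial derivative of χ(k+1) is evaluated with respect to the coefficients of each of the N channel impulse responses (n = 1, 2, ..., N):

\[
\begin{aligned}
\frac{\partial \chi(k+1)}{\partial \hat{h}_n(k)}
&= \frac{\partial \left[ \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} e_{ij}^2(k+1) \right]}{\partial \hat{h}_n(k)} \\
&= \sum_{i=1}^{n-1} 2 e_{in}(k+1)\, \mathbf{x}_i(k+1) + \sum_{j=n+1}^{N} 2 e_{nj}(k+1)\, [-\mathbf{x}_j(k+1)] \\
&= \sum_{i=1}^{n-1} 2 e_{in}(k+1)\, \mathbf{x}_i(k+1) + \sum_{j=n+1}^{N} 2 e_{jn}(k+1)\, \mathbf{x}_j(k+1) \\
&= \sum_{i=1}^{N} 2 e_{in}(k+1)\, \mathbf{x}_i(k+1)
\end{aligned}
\tag{4.26}
\]

The last step uses e_nn(k+1) = 0. In matrix form, this equation can be expressed as follows:

\[
\begin{aligned}
\frac{\partial \chi(k+1)}{\partial \hat{h}_n(k)} &= 2 X(k+1)\, e_n(k+1) \\
&= 2 X(k+1) \left[ C_n(k+1) - D_n(k+1) \right] \hat{h}(k)
\end{aligned}
\tag{4.27}
\]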

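where

\[
\begin{aligned}
X(k+1) &= [\mathbf{x}_1(k+1) \;\; \mathbf{x}_2(k+1) \;\; \cdots \;\; \mathbf{x}_N(k+1)]_{L \times N} \\
e_n(k+1) &= [e_{1n}(k+1) \;\; e_{2n}(k+1) \;\; \cdots \;\; e_{Nn}(k+1)]^T \\
&= \begin{bmatrix}
\mathbf{x}_1^T(k+1)\hat{h}_n(k) - \mathbf{x}_n^T(k+1)\hat{h}_1(k) \\
\vdots \\
\mathbf{x}_N^T(k+1)\hat{h}_n(k) - \mathbf{x}_n^T(k+1)\hat{h}_N(k)
\end{bmatrix} \\
&= [C_n(k+1) - D_n(k+1)]\, \hat{h}(k) \\
C_n(k+1) &= \begin{bmatrix}
0 & \cdots & 0 & \mathbf{x}_1^T(k+1) & 0 & \cdots & 0 \\
\vdots & & \vdots & \vdots & \vdots & & \vdots \\
0 & \cdots & 0 & \mathbf{x}_N^T(k+1) & 0 & \cdots & 0
\end{bmatrix}_{N \times NL} \\
&= \left[ 0_{N \times (n-1)L} \;\; X^T(k+1) \;\; 0_{N \times (N-n)L} \right] \\
D_n(k+1) &= \begin{bmatrix}
\mathbf{x}_n^T(k+1) & 0 & \cdots & 0 \\
0 & \mathbf{x}_n^T(k+1) & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \mathbf{x}_n^T(k+1)
\end{bmatrix}_{N \times NL}
\end{aligned}
\tag{4.28}
\]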

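The two matrix products in equation 4.27 can be evaluated as

\[
\begin{aligned}
X(k+1)\, C_n(k+1) &= X(k+1) \left[ 0_{N \times (n-1)L} \;\; X^T(k+1) \;\; 0_{N \times (N-n)L} \right]_{N \times NL} \\
&= \left[ 0_{L \times (n-1)L} \;\; \sum_{i=1}^{N} \tilde{R}_{x_i x_i}(k+1) \;\; 0_{L \times (N-n)L} \right]_{L \times NL}
\end{aligned}
\tag{4.29}
\]

and

\[
X(k+1)\, D_n(k+1) = \left[ \tilde{R}_{x_1 x_n}(k+1) \;\; \tilde{R}_{x_2 x_n}(k+1) \;\; \cdots \;\; \tilde{R}_{x_N x_n}(k+1) \right]_{L \times NL}
\tag{4.30}
\]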

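where $\tilde{R}_{x_i x_j}(k+1) = \mathbf{x}_i(k+1)\,\mathbf{x}_j^T(k+1)$ (i, j = 1, 2, ..., N). The tilde denotes the instantaneous value of $R_{x_i x_j}$. Substituting equation 4.29 and equation 4.30 into equation 4.27 gives

\[
\frac{\partial \chi(k+1)}{\partial \hat{h}_n(k)} = 2 \left[ -\tilde{R}_{x_1 x_n}(k+1) \;\; -\tilde{R}_{x_2 x_n}(k+1) \;\; \cdots \;\; \sum_{i \neq n} \tilde{R}_{x_i x_i}(k+1) \;\; \cdots \;\; -\tilde{R}_{x_N x_n}(k+1) \right] \hat{h}(k)
\tag{4.31}
\]

Thus, using equation 4.31 in equation 4.24 yields

\[ \frac{\partial \chi(k+1)}{\partial \hat{h}(k)} = 2 \tilde{R}_{x+}(k+1)\, \hat{h}(k) \tag{4.32} \]

\[ \nabla J(k+1) = \frac{1}{\|\hat{h}(k)\|^2} \left[ 2 \tilde{R}_{x+}(k+1)\, \hat{h}(k) - 2 J(k+1)\, \hat{h}(k) \right] \tag{4.33} \]

where

\[
\tilde{R}_{x+}(k) = \begin{bmatrix}
\sum_{n \neq 1} \tilde{R}_{x_n x_n}(k) & -\tilde{R}_{x_2 x_1}(k) & \cdots & -\tilde{R}_{x_N x_1}(k) \\
-\tilde{R}_{x_1 x_2}(k) & \sum_{n \neq 2} \tilde{R}_{x_n x_n}(k) & \cdots & -\tilde{R}_{x_N x_2}(k) \\
\vdots & \vdots & \ddots & \vdots \\
-\tilde{R}_{x_1 x_N}(k) & -\tilde{R}_{x_2 x_N}(k) & \cdots & \sum_{n \neq N} \tilde{R}_{x_n x_n}(k)
\end{bmatrix}
\tag{4.34}
\]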

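The update equation is derived by substituting equation 4.33 into equation 4.23:

\[
\hat{h}(k+1) = \hat{h}(k) - \frac{2\mu}{\|\hat{h}(k)\|^2} \left[ \tilde{R}_{x+}(k+1)\, \hat{h}(k) - J(k+1)\, \hat{h}(k) \right]
\tag{4.35}
\]

If this estimate is normalized with the norm constraint, the final form of the update equation results:

\[
\hat{h}(k+1) = \frac{\hat{h}(k) - 2\mu \left[ \tilde{R}_{x+}(k+1)\, \hat{h}(k) - \chi(k+1)\, \hat{h}(k) \right]}{\left\| \hat{h}(k) - 2\mu \left[ \tilde{R}_{x+}(k+1)\, \hat{h}(k) - \chi(k+1)\, \hat{h}(k) \right] \right\|}
\tag{4.36}
\]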

For this update algorithm to converge, it can be shown that the step size must satisfy [9]

\[ 0 < \mu < \frac{1}{\lambda_{\max}} \tag{4.37} \]

In this equation, λmax is the largest eigenvalue of the matrix $E\{\tilde{R}_{x+}(k) - J(k) I_{NL \times NL}\}$, and $I_{NL \times NL}$ is an identity matrix of size NL by NL. Taking the expectation of equation 4.35 after convergence gives

\[
R_{x+} \frac{\hat{h}(\infty)}{\|\hat{h}(\infty)\|} = E\{J(\infty)\} \frac{\hat{h}(\infty)}{\|\hat{h}(\infty)\|}
\tag{4.38}
\]

which is the desired outcome: ĥ converges in the mean to the eigenvector of R_{x+} corresponding to its smallest eigenvalue, E{J(∞)}. The constrained time domain multichannel LMS algorithm is illustrated in figure 4.1.

Figure 4.1: Constrained Time Domain Multichannel LMS algorithm
[Flow diagram; recoverable labels: initialization ĥ_i = [1 0 0 … 0]^T / …, i = 1, 2, …, N; channel outputs x1, x2, x3; model filters ĥ1, ĥ2, ĥ3; a priori errors e_ij(k + 1); R̃_{x+}(k + 1); normalized update of ĥ(k + 1)]
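The update of equation 4.36 can be sketched in a few lines of NumPy. This is a minimal illustration, not the implementation used for the results in this thesis. The two 3-tap channels, the white source, the step size of 0.02, and the 1/√N scaling of the initial estimate are all assumptions. Instead of building the NL × NL matrix $\tilde{R}_{x+}$, the sketch uses the fact from equation 4.26 that block n of $\tilde{R}_{x+}\hat{h}$ equals the sum over i of e_in x_i.

```python
import numpy as np

def mclms(xs, L, mu):
    """Constrained time domain multichannel LMS (equation 4.36), one pass over the data."""
    N = len(xs)
    h = np.zeros((N, L))
    h[:, 0] = 1.0 / np.sqrt(N)                   # assumed unit norm initialization
    for k in range(L - 1, len(xs[0])):
        X = np.array([x[k - L + 1:k + 1][::-1] for x in xs])   # row i is x_i^T
        Y = X @ h.T                              # Y[i, j] = x_i^T h_j
        E = Y - Y.T                              # E[i, j] = e_ij (equation 4.19)
        chi = 0.5 * np.sum(E ** 2)               # sum of e_ij^2 over i < j (equation 4.20)
        Rh = E.T @ X                             # row n is sum_i e_in x_i, block n of R~ h
        h = h - 2.0 * mu * (Rh - chi * h)
        h /= np.linalg.norm(h)                   # unit norm constraint
    return h

rng = np.random.default_rng(3)
h_true = np.array([[1.0, 0.5, -0.2],
                   [1.0, -0.3, 0.4]])            # assumed channels without common zeros
s = rng.standard_normal(20000)                   # assumed white source, noise free outputs
xs = [np.convolve(h, s) for h in h_true]

h_est = mclms(xs, L=3, mu=0.02)
cosine = float(abs(np.sum(h_est * h_true)) / np.linalg.norm(h_true))
print(cosine)                                    # close to 1 when the estimate is aligned
```

The comparison uses the cosine between the stacked estimate and the stacked true response, because the algorithm can only recover the channels up to a scale factor.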

4.5 Constrained Time Domain Multichannel Newton Algorithm

The constrained time domain multichannel Newton algorithm (also referred to as the multichannel Newton algorithm, MCN) generally performs better than the MCLMS algorithm [4]. However, it is computationally intensive due to the nature of the Newton method [14]. The Newton update equation is given by

\[ \hat{h}(k+1) = \hat{h}(k) - E^{-1}\{\nabla^2 J(k+1)\}\, \nabla J(k+1) \tag{4.39} \]

where ∇²J(k+1) is the Hessian matrix of J(k+1) with respect to ĥ(k). The Hessian is obtained by taking the derivative of equation 4.33 with respect to ĥ(k) and using the formula [4]

\begin{align}
\frac{\partial}{\partial \hat{h}(k)} \left[ J(k+1)\, \hat{h}(k) \right]
&= \hat{h}(k) \left[ \frac{\partial J(k+1)}{\partial \hat{h}(k)} \right]^T + J(k+1)\, I_{NL \times NL} \tag{4.40} \\
&= \hat{h}(k)\, [\nabla J(k+1)]^T + J(k+1)\, I_{NL \times NL} \tag{4.41}
\end{align}

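The resulting Hessian matrix is

\[
\nabla^2 J(k+1) = \frac{2 \left\{ \tilde{R}_{x+}(k+1) - \left[ \hat{h}(k) (\nabla J(k+1))^T + J(k+1)\, I_{NL \times NL} \right] \right\}}{\|\hat{h}(k)\|^2}
- \frac{4 \left[ \tilde{R}_{x+}(k+1)\, \hat{h}(k) - J(k+1)\, \hat{h}(k) \right] \hat{h}^T(k)}{\|\hat{h}(k)\|^4}
\tag{4.42}
\]

Using the unit-norm constraint ‖ĥ(k)‖ = 1, equation 4.42 can be rewritten as

\[
\nabla^2 J(k+1) = 2 \left\{ \tilde{R}_{x+}(k+1) - \hat{h}(k)\, [\nabla J(k+1)]^T - J(k+1)\, I_{NL \times NL} \right\}
- 4 \left[ \tilde{R}_{x+}(k+1)\, \hat{h}(k) - J(k+1)\, \hat{h}(k) \right] \hat{h}^T(k)
\tag{4.43}
\]

Taking the mathematical expectation of equation 4.43 and using the independence assumption [12] yields

\[
E\{\nabla^2 J(k+1)\} = 2 R_{x+} - 4 \hat{h}(k) \hat{h}^T(k) R_{x+} - 4 R_{x+} \hat{h}(k) \hat{h}^T(k)
- 2 E\{J(k+1)\} \left[ I_{NL \times NL} - 4 \hat{h}(k) \hat{h}^T(k) \right]
\tag{4.44}
\]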

Because R_{x+} and E{J(k+1)} are unknown, estimated values are used. Since J(k+1) decreases as the estimates approach convergence and is small after convergence, the term E{J(k+1)} can be neglected in equation 4.44 for simplification. The matrix R_{x+} is estimated recursively in the conventional way:

\[ \hat{R}_{x+}(1) = \mathrm{diag}\{ \sigma_{x_1}^2, \ldots, \sigma_{x_1}^2, \sigma_{x_2}^2, \ldots, \sigma_{x_2}^2, \ldots, \sigma_{x_N}^2, \ldots, \sigma_{x_N}^2 \} \]

\[ \hat{R}_{x+}(k+1) = \lambda \hat{R}_{x+}(k) + \tilde{R}_{x+}(k+1), \quad k \ge 1 \tag{4.45} \]

where $\sigma_{x_n}^2$ (n = 1, 2, ..., N) is the power of x_n(k) and λ (0 < λ < 1) is an exponential forgetting factor. An estimate W(k+1) of the mean Hessian matrix of J(k+1) can be obtained as

\[
W(k+1) = 2 \hat{R}_{x+}(k+1) - 4 \hat{h}(k) \hat{h}^T(k) \hat{R}_{x+}(k+1) - 4 \hat{R}_{x+}(k+1) \hat{h}(k) \hat{h}^T(k)
\tag{4.46}
\]

From this equation, the multichannel Newton algorithm is obtained:

\[
\hat{h}(k+1) = \frac{\hat{h}(k) - 2\rho\, W^{-1}(k+1) \left[ \tilde{R}_{x+}(k+1)\, \hat{h}(k) - \chi(k+1)\, \hat{h}(k) \right]}{\left\| \hat{h}(k) - 2\rho\, W^{-1}(k+1) \left[ \tilde{R}_{x+}(k+1)\, \hat{h}(k) - \chi(k+1)\, \hat{h}(k) \right] \right\|}
\tag{4.47}
\]

where ρ is a step size close to but less than 1.

4.6 Unconstrained Blind Multichannel LMS algorithm with Optimal Step Size control

The unconstrained blind multichannel LMS algorithm with optimal step size control (also referred to as variable step size unconstrained multichannel LMS, VSS-UMCLMS) has the potential to overcome the slow convergence of the constrained blind multichannel LMS algorithm, because the step size is controlled dynamically according to the signal properties. Approximating the gradient of the cost function in the update equation 4.35 of the multichannel LMS by

\[ \nabla J(k+1) \approx \frac{2 \tilde{R}_{x+}(k+1)\, \hat{h}(k)}{\|\hat{h}(k)\|^2} \tag{4.48} \]

and excluding the unit norm constraint gives

\[ \hat{h}(k+1) = \hat{h}(k) - 2\mu \tilde{R}_{x+}(k+1)\, \hat{h}(k) \tag{4.49} \]

This update algorithm will not converge to the all-zero estimate provided that the initial estimate ĥ(0) and the true channel impulse response h are not orthogonal to one another. This can be shown by pre-multiplying equation 4.49 by h^T, which gives

\[ h^T \hat{h}(k+1) = h^T \hat{h}(k) - 2\mu\, h^T \tilde{R}_{x+}(k+1)\, \hat{h}(k) \tag{4.50} \]

As previously shown with the cross relations approach in equation 4.10, in the absence of noise the following relationship holds:

\[ h^T \tilde{R}_{x+}(k+1) = 0^T \tag{4.51} \]

This relationship implies that the gradient ∇J(k+1) is orthogonal to h at any time k, so that equation 4.50 becomes

\[ h^T \hat{h}(k+1) = h^T \hat{h}(k) \tag{4.52} \]

This shows that h^T ĥ(k) is time invariant for the unconstrained multichannel LMS (MCLMS) algorithm. If h^T ĥ(0) ≠ 0, then ĥ(k) will not converge to zero. To determine the update equation for the unconstrained MCLMS, the model filter is represented by two components, where one component ĥ_∥(k) is parallel to the true impulse response h and the other ĥ_⊥(k) is perpendicular to the true impulse response:

\[ \hat{h}(k) = \hat{h}_{\perp}(k) + \hat{h}_{\parallel}(k) \tag{4.53} \]

Because the gradient ∇J(k+1) is orthogonal to h, and h is parallel to ĥ_∥(k), ∇J(k+1) is orthogonal to ĥ_∥(k). The update equation 4.49 of the unconstrained MCLMS algorithm can therefore be represented by a pair of equations:

\[ \hat{h}_{\perp}(k+1) = \hat{h}_{\perp}(k) - \mu \nabla J(k+1) \tag{4.54} \]

and

\[ \hat{h}_{\parallel}(k+1) = \hat{h}_{\parallel}(k) \tag{4.55} \]

This pair of equations shows that the unconstrained MCLMS update adapts the model filter coefficients only in the direction perpendicular to h. Because SIMO FIR system identification yields a scaled version of the true channel impulse response, the misalignment of the estimate ĥ(k) with respect to the true channel impulse response vector h is

\[ d(k) = \min_{\alpha} \| h - \alpha \hat{h}(k) \|^2 \tag{4.56} \]

where α is an arbitrary scale factor. Inserting the representation of ĥ(k) given by equation 4.53 into equation 4.56, the minimum value of d(k) is obtained as

\[
\begin{aligned}
d(k) &= \min_{\alpha} \left[ \|\hat{h}(k)\|^2 \alpha^2 - 2 \|\hat{h}_{\parallel}(k)\| \|h\| \alpha + \|h\|^2 \right] \\
&= \frac{\|h\|^2}{1 + \left( \|\hat{h}_{\parallel}(k)\| / \|\hat{h}_{\perp}(k)\| \right)^2}
\end{aligned}
\tag{4.57}
\]

This shows that the ratio of ‖ĥ_∥(k)‖ to ‖ĥ_⊥(k)‖ indicates how close the estimate is to the true impulse response. The optimal step size μ_opt(k+1) for the unconstrained multichannel LMS algorithm at time k+1 is therefore the one that gives ĥ_⊥(k+1) a minimum norm:

\[
\begin{aligned}
\mu_{opt}(k+1) &= \arg\min_{\mu} \|\hat{h}_{\perp}(k+1)\| \\
&= \arg\min_{\mu} \|\hat{h}_{\perp}(k) - \mu \nabla J(k+1)\|
\end{aligned}
\tag{4.58}
\]

To minimize the norm of ĥ(k+1) = ĥ(k) − μ(k+1)∇J(k+1), the step size μ(k+1) should be such that ĥ(k+1) is orthogonal to ∇J(k+1). Thus ĥ(k) is projected onto ∇J(k+1) to obtain the optimum step size

\[ \mu_{opt}(k+1) = \frac{\hat{h}^T(k)\, \nabla J(k+1)}{\|\nabla J(k+1)\|^2} \tag{4.59} \]

This algorithm is termed the variable step-size unconstrained multichannel LMS (VSS-UMCLMS) [4].
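The variable step size update can be sketched as follows, combining the gradient approximation of equation 4.48 with the step size of equation 4.59. As with any short sketch, several things are assumed rather than taken from the thesis: a noise free two-channel simulation, the channel values, the initial estimate `h0`, and the guard that skips the update when the gradient is exactly zero.

```python
import numpy as np

def vss_umclms(xs, L, h0):
    """Unconstrained multichannel LMS with the step size of equation 4.59."""
    h = np.array(h0, dtype=float)                # shape (N, L)
    for k in range(L - 1, len(xs[0])):
        X = np.array([x[k - L + 1:k + 1][::-1] for x in xs])   # row i is x_i^T
        Y = X @ h.T                              # Y[i, j] = x_i^T h_j
        E = Y - Y.T                              # a priori errors e_ij
        grad = 2.0 * (E.T @ X) / np.sum(h * h)   # equation 4.48
        gg = np.sum(grad * grad)
        if gg > 0.0:                             # assumed guard against division by zero
            mu_opt = np.sum(h * grad) / gg       # equation 4.59
            h = h - mu_opt * grad                # new estimate is orthogonal to grad
    return h

rng = np.random.default_rng(4)
h_true = np.array([[1.0, 0.5, -0.2],
                   [1.0, -0.3, 0.4]])            # assumed channels
s = rng.standard_normal(5000)                    # assumed white source
xs = [np.convolve(h, s) for h in h_true]

h0 = np.zeros((2, 3))
h0[:, 0] = 1.0 / np.sqrt(2.0)                    # assumed initial estimate, not orthogonal to h
h_est = vss_umclms(xs, L=3, h0=h0)

cosine = float(abs(np.sum(h_est * h_true)) / (np.linalg.norm(h_est) * np.linalg.norm(h_true)))
drift = float(abs(np.sum(h_true * h_est) - np.sum(h_true * h0)))
print(cosine, drift)
```

The quantity `drift` checks equation 4.52: the inner product between the true response and the estimate stays constant, so only the perpendicular component is reduced.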

4.7 Frequency Domain Normalized Multichannel LMS

The frequency-domain normalized multichannel LMS algorithm (FNMCLMS) is presented in two stages. First, the frequency domain unnormalized multichannel LMS algorithm is derived. Second, the algorithm is modified by normalization. Normalization is desired because the unnormalized frequency-domain multichannel LMS has a slow convergence rate. The slow convergence is due to [4]

• the cross coupling between the channels

• the overall convergence rate being determined by the slowest converging channel

Newton's method is applied in the normalization procedure, and approximations are then used to simplify the update algorithm. In this way the eigenvalue differences are reduced and convergence is improved.

To derive the frequency domain normalized multichannel LMS, the following notation is used:

Time domain vector - $\mathbf{x}$
Time domain matrix - $X$
Frequency domain vector - $\underrightarrow{\mathbf{x}}$
Frequency domain matrix - $\underrightarrow{X}$

We begin by defining the signal y_ij(k+1) as the result of convolving x_i(k+1) with the j-th model filter ĥ_j(k):

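\[ y_{ij}(k+1) \triangleq x_i(k+1) * \hat{h}_j(k) \tag{4.60} \]

Using the overlap and save method, let the vector $\mathbf{y}'_{ij}(t+1)$ of length 2L denote the result of the circular convolution between $\mathbf{x}_i(t+1)$ and $\hat{h}_j(t)$, where t is the block index. This yields

\[ \mathbf{y}'_{ij}(t+1) = C_{x_i}(t+1)\, \hat{h}_j^{10}(t) \tag{4.61} \]

where

\[ \mathbf{y}'_{ij}(t+1) = [y'_{ij}(tL) \;\; y'_{ij}(tL+1) \;\; \cdots \;\; y'_{ij}(tL+2L-1)]^T \]

\[
C_{x_i}(t+1) = \begin{bmatrix}
x_i(tL) & x_i(tL+2L-1) & \cdots & x_i(tL+1) \\
x_i(tL+1) & x_i(tL) & \cdots & x_i(tL+2) \\
\vdots & \vdots & \ddots & \vdots \\
x_i(tL+2L-1) & x_i(tL+2L-2) & \cdots & x_i(tL)
\end{bmatrix}
\]

\[ \hat{h}_j^{10}(t) = \left[ \hat{h}_j^T(t) \;\; 0^T_{L \times 1} \right]^T = \left[ \hat{h}_{j,0}(t) \;\; \cdots \;\; \hat{h}_{j,L-1}(t) \;\; 0 \;\; \cdots \;\; 0 \right]^T \]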

The matrix Cxi(t + 1) is a circulant matrix. The last L points in the circular convolution

corresponds to the results of a linear convolution:

yij(t + 1) = W01L×2Lyij(t + 1)

= W01L×2LCxi

(t + 1)h10

j (t)

= W01L×2LCxi

(t + 1)W102L×Lhj(t) (4.62)

where

yij(t + 1) =[yij(tL) yij(tL + 1) · · · yij(tL + L − 1)

]T

W01L×2L = [0L×L IL×L]

W102L×L = [IL×L 0L×L]T

hj(t) =[

hj,0(t) hj,1(t) . . . hj,L−1(t)]T

overlapping the input data blocks by L points, and discarding the first circular convo-

lution results.
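The overlap-save property used above can be checked numerically. The following is a minimal sketch, assuming an illustrative filter length and random data; the variable names (`h10`, `y_circ`, `y_last`) are chosen here for illustration and are not taken from the thesis scripts.

```matlab
% Overlap-save check: the last L points of a 2L-point circular convolution
% equal the corresponding linear-convolution outputs.
L  = 8;                                   % filter length (illustrative value)
h  = randn(L, 1);                         % hypothetical model filter h_j
x  = randn(2*L, 1);                       % one overlapped input block of length 2L

h10    = [h; zeros(L, 1)];                % zero-padded filter, as h^{10}_j
y_circ = real(ifft(fft(x) .* fft(h10)));  % 2L-point circular convolution via the DFT

y_lin  = conv(x, h);                      % full linear convolution (length 3L-1)
y_last = y_circ(L+1:2*L);                 % role of W^{01}: keep the last L points

max_err = max(abs(y_last - y_lin(L+1:2*L)));   % difference over the retained points
```

The first L points of `y_circ` are corrupted by wrap-around and are discarded, which is why consecutive input blocks overlap by L samples.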


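An a priori error signal based on the cross relations is then

$$\begin{aligned} e_{ij}(t+1) &= y_{ij}(t+1) - y_{ji}(t+1) \\ &= W^{01}_{L\times 2L} \left[ C_{x_i}(t+1)\, W^{10}_{2L\times L}\, h_j(t) - C_{x_j}(t+1)\, W^{10}_{2L\times L}\, h_i(t) \right] \end{aligned} \tag{4.63}$$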

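The DFT is used to perform the circular convolution, with the DFT matrix $F$ defined as

$$F_{L\times L} = \begin{bmatrix} 1 & 1 & 1 & \cdots & 1 \\ 1 & e^{-j2\pi/L} & e^{-j4\pi/L} & \cdots & e^{-j2\pi(L-1)/L} \\ 1 & e^{-j4\pi/L} & e^{-j8\pi/L} & \cdots & e^{-j4\pi(L-1)/L} \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 1 & e^{-j2\pi(L-1)/L} & e^{-j4\pi(L-1)/L} & \cdots & e^{-j2\pi(L-1)^2/L} \end{bmatrix} \tag{4.64}$$

where $j$ is the square root of $-1$. The matrix $F_{L\times L}$ and its inverse are related by

$$F^{H}_{L\times L} = L\, F^{-1}_{L\times L} \tag{4.65}$$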

where $(\cdot)^H$ denotes the Hermitian transpose of a matrix. Using $F_{2L\times 2L}$ and $F^{-1}_{2L\times 2L}$, the circulant matrix $C_{x_i}(t+1)$ can be decomposed as

$$C_{x_i}(t+1) = F^{-1}_{2L\times 2L}\, \vec{D}_{x_i}(t+1)\, F_{2L\times 2L} \tag{4.66}$$

where $\vec{D}_{x_i}(t+1)$ is a diagonal matrix whose diagonal elements are given by the DFT of the first column of $C_{x_i}(t+1)$, which is the overlapped $i$th channel output of the $(t+1)$th block:

$$x_i(t+1)_{2L\times 1} = [\, x_i(tL) \;\; x_i(tL+1) \;\; \cdots \;\; x_i(tL+2L-1) \,]^T \tag{4.67}$$
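The diagonalization of a circulant matrix by the DFT, and the relation between the DFT matrix and its inverse, can be verified with a small numerical example. This is a sketch under assumed values; the block length `M` and the random first column are illustrative only.

```matlab
% Check C = F^{-1} D F for a circulant matrix and F^H = M F^{-1}.
M = 6;                                   % block length 2L (illustrative value)
x = randn(M, 1);                         % first column of the circulant matrix
C = zeros(M);
for col = 1:M
    C(:, col) = circshift(x, col - 1);   % each column is a cyclic shift of x
end
F = fft(eye(M));                         % unnormalized DFT matrix of size M
D = diag(fft(x));                        % diagonal matrix holding the DFT of x

err_decomp = norm(C - F \ (D * F));      % residual of C = F^{-1} D F
err_herm   = norm(F' - M * inv(F));      % residual of F^H = M F^{-1}
```

Because `D` is diagonal, multiplying by the circulant matrix reduces to an element-wise product in the frequency domain, which is the source of the speed advantage of the frequency domain algorithm.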

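Multiplying equation 4.63 by $F_{L\times L}$ and using (4.66) gives the block error sequence in the frequency domain:

$$\begin{aligned} \vec{e}_{ij}(t+1) &= F_{L\times L}\, e_{ij}(t+1) \\ &= F_{L\times L}\, W^{01}_{L\times 2L} \left[ C_{x_i}(t+1)\, W^{10}_{2L\times L}\, h_j(t) - C_{x_j}(t+1)\, W^{10}_{2L\times L}\, h_i(t) \right] \\ &= \vec{W}^{01}_{L\times 2L} \left[ \vec{D}_{x_i}(t+1)\, \vec{W}^{10}_{2L\times L}\, \vec{h}_j(t) - \vec{D}_{x_j}(t+1)\, \vec{W}^{10}_{2L\times L}\, \vec{h}_i(t) \right] \end{aligned} \tag{4.68}$$

where

$$\vec{W}^{01}_{L\times 2L} = F_{L\times L}\, W^{01}_{L\times 2L}\, F^{-1}_{2L\times 2L}$$

$$\vec{W}^{10}_{2L\times L} = F_{2L\times 2L}\, W^{10}_{2L\times L}\, F^{-1}_{L\times L}$$

and $\vec{h}_i(t)$ is the $L$-point DFT of the vector $h_i(t)$ at the $t$th block.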

The frequency-domain mean square error criterion, similar to that of the time domain, is now formulated as

$$J_f = E\{ J_f(t) \} \tag{4.69}$$

where the instantaneous square error of the $t$th block is given by

$$J_f(t) = \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \vec{e}^{\,H}_{ij}(t)\, \vec{e}_{ij}(t) \tag{4.70}$$

Taking the partial derivative of $J_f$ with respect to $\vec{h}^*_n(t)$, where $n = 1, 2, \ldots, N$ and $(\cdot)^*$ denotes the complex conjugate, and treating $\vec{h}_n(t)$ as a constant, yields

$$\frac{\partial J_f}{\partial \vec{h}^*_n(t)} = E\left\{ \frac{\partial J_f(t+1)}{\partial \vec{h}^*_n(t)} \right\} \tag{4.71}$$

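A single sample is used as an estimate of the expectation, as proposed in the LMS algorithm. The instantaneous value is thus

$$\begin{aligned} \frac{\partial J_f(t+1)}{\partial \vec{h}^*_n(t)} &= \frac{\partial}{\partial \vec{h}^*_n(t)} \left[ \sum_{i=1}^{N-1} \sum_{j=i+1}^{N} \vec{e}^{\,H}_{ij}(t+1)\, \vec{e}_{ij}(t+1) \right] \\ &= \frac{\partial}{\partial \vec{h}^*_n(t)} \left[ \sum_{i=1}^{n-1} \vec{e}^{\,H}_{in}(t+1)\, \vec{e}_{in}(t+1) \right] + \frac{\partial}{\partial \vec{h}^*_n(t)} \left[ \sum_{j=n+1}^{N} \vec{e}^{\,H}_{nj}(t+1)\, \vec{e}_{nj}(t+1) \right] \\ &= \sum_{i=1}^{n-1} \left[ \vec{W}^{01}_{L\times 2L}\, \vec{D}_{x_i}(t+1)\, \vec{W}^{10}_{2L\times L} \right]^H \vec{e}_{in}(t+1) - \sum_{j=n+1}^{N} \left[ \vec{W}^{01}_{L\times 2L}\, \vec{D}_{x_j}(t+1)\, \vec{W}^{10}_{2L\times L} \right]^H \vec{e}_{nj}(t+1) \\ &= \sum_{i=1}^{N} \left[ \vec{W}^{01}_{L\times 2L}\, \vec{D}_{x_i}(t+1)\, \vec{W}^{10}_{2L\times L} \right]^H \vec{e}_{in}(t+1) \end{aligned} \tag{4.72}$$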

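The last simplification step follows from $\vec{e}_{nj}(t) = -\vec{e}_{jn}(t)$ and $\vec{e}_{nn}(t) = 0$. Using this gradient with a small positive step size $\mu_f$, the frequency domain unconstrained multichannel LMS algorithm is obtained:

$$\vec{h}_n(t+1) = \vec{h}_n(t) - \mu_f\, \vec{W}^{10}_{L\times 2L} \sum_{i=1}^{N} \vec{D}^*_{x_i}(t+1)\, \vec{W}^{01}_{2L\times L}\, \vec{e}_{in}(t+1) \tag{4.73}$$

where

$$\vec{W}^{10}_{L\times 2L} = F_{L\times L}\, W^{10}_{L\times 2L}\, F^{-1}_{2L\times 2L} = \frac{1}{2} \left( \vec{W}^{10}_{2L\times L} \right)^H$$

$$\vec{W}^{01}_{2L\times L} = F_{2L\times 2L}\, W^{01}_{2L\times L}\, F^{-1}_{L\times L} = 2 \left( \vec{W}^{01}_{L\times 2L} \right)^H$$

$$W^{10}_{L\times 2L} = [\, I_{L\times L} \;\; 0_{L\times L} \,]$$

$$W^{01}_{2L\times L} = [\, 0_{L\times L} \;\; I_{L\times L} \,]^T$$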

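Selecting a unit norm yields

$$\| \vec{h}(t) \|^2 = \frac{\| h(t) \|^2}{L} = \frac{1}{L} \tag{4.74}$$

where

$$\vec{h}(t) \triangleq [\, \vec{h}^T_1(t) \;\; \vec{h}^T_2(t) \;\; \cdots \;\; \vec{h}^T_N(t) \,]^T \tag{4.75}$$

To deduce the frequency-domain constrained multichannel LMS (FCMLMS) algorithm, the unit norm constraint is enforced on equation 4.73 to get

$$\vec{h}_n(t+1) = \frac{\vec{h}_n(t) - \mu_f\, \vec{W}^{10}_{L\times 2L} \sum_{i=1}^{N} \vec{D}^*_{x_i}(t+1)\, \vec{W}^{01}_{2L\times L}\, \vec{e}_{in}(t+1)}{\sqrt{L}\, \left\| \vec{h}_n(t) - \mu_f\, \vec{W}^{10}_{L\times 2L} \sum_{i=1}^{N} \vec{D}^*_{x_i}(t+1)\, \vec{W}^{01}_{2L\times L}\, \vec{e}_{in}(t+1) \right\|}, \quad n = 1, 2, \ldots, N \tag{4.76}$$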

Normalization

As previously stated, Newton's method is applied. Define

$$\vec{S}_{x_n}(t+1) \triangleq \vec{W}^{01}_{L\times 2L}\, \vec{D}_{x_n}(t+1)\, \vec{W}^{10}_{2L\times L} \tag{4.77}$$

where $n = 1, 2, \ldots, N$. The frequency domain block error in equation 4.68 can then be written as

$$\vec{e}_{ij}(t+1) = \vec{S}_{x_i}(t+1)\, \vec{h}_j(t) - \vec{S}_{x_j}(t+1)\, \vec{h}_i(t) \tag{4.78}$$

and the instantaneous gradient in equation 4.72 can be written as

$$\frac{\partial J_f(t+1)}{\partial \vec{h}^*_n(t)} = \sum_{i=1}^{N} \vec{S}^{H}_{x_i}(t+1)\, \vec{e}_{in}(t+1) = \vec{S}^{H}(t+1)\, \vec{e}_n(t+1) \tag{4.79}$$

where

$$\vec{S}(t+1) = [\, \vec{S}^{H}_{x_1}(t+1) \;\; \vec{S}^{H}_{x_2}(t+1) \;\; \cdots \;\; \vec{S}^{H}_{x_N}(t+1) \,]^H$$

$$\vec{e}_n(t+1) = [\, \vec{e}^{\,T}_{1n}(t+1) \;\; \vec{e}^{\,T}_{2n}(t+1) \;\; \cdots \;\; \vec{e}^{\,T}_{Nn}(t+1) \,]^T$$

The Hessian matrix of $J_f(t+1)$ with respect to the filter coefficients is computed by taking the row gradient of (4.79):

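$$\begin{aligned} \vec{T}_n(t+1) &= \frac{\partial}{\partial \vec{h}^T_n(t)} \left[ \frac{\partial J_f(t+1)}{\partial \vec{h}^*_n(t)} \right] \\ &= \frac{\partial}{\partial \vec{h}^T_n(t)} \left[ \vec{S}^{H}(t+1)\, \vec{e}_n(t+1) \right] \\ &= \vec{S}^{H}(t+1)\, \frac{\partial \vec{e}_n(t+1)}{\partial \vec{h}^T_n(t)} \\ &= \sum_{i=1,\, i \neq n}^{N} \vec{S}^{H}_{x_i}(t+1)\, \vec{S}_{x_i}(t+1) \end{aligned} \tag{4.80}$$

and the filter coefficients will be updated as

$$\vec{h}_n(t+1) = \vec{h}_n(t) - \rho_f\, \vec{T}^{-1}_n(t+1)\, \vec{S}^{H}(t+1)\, \vec{e}_n(t+1) \tag{4.81}$$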

$\rho_f$ is the new step size of this algorithm. Further simplifications are then applied as shown in [4] to obtain the frequency domain normalized multichannel LMS algorithm update equation

$$\vec{h}^{10}_n(t+1) = \vec{h}^{10}_n(t) - \rho_f\, \vec{P}^{-1}_{\not n}(t+1) \sum_{i=1}^{N} \vec{D}^*_{x_i}(t+1)\, \vec{e}^{\,01}_{in}(t+1), \quad n = 1, 2, \ldots, N \tag{4.82}$$

The symbols used are

$$\vec{h}^{10}_n(t) = \vec{W}^{10}_{2L\times L}\, \vec{h}_n(t) = F_{2L\times 2L}\, [\, h^T_n(t) \;\; 0_{1\times L} \,]^T$$

$$\vec{e}^{\,01}_{in}(t+1) = \vec{W}^{01}_{2L\times L}\, \vec{e}_{in}(t+1) = F_{2L\times 2L}\, [\, 0_{1\times L} \;\; e^T_{in}(t+1) \,]^T$$

where

$$\vec{P}_{\not n}(t+1) = \lambda_p\, \vec{P}_{\not n}(t) + (1 - \lambda_p) \sum_{i=1,\, i \neq n}^{N} \vec{D}_{x_i}(t+1)\, \vec{D}^*_{x_i}(t+1) \tag{4.83}$$

$\vec{P}_{\not n}$ is the power spectrum of the multiple channel outputs, obtained using the recursion given in equation 4.83. The forgetting factor $\lambda_p$ is set as

$$\lambda_p = \left[ 1 - \frac{1}{3L} \right]^L$$

A small regularization parameter $\delta$ can be applied to the normalized algorithm to handle the situation where the input signal is too small, which would cause the inverse of the power spectrum to diverge. The frequency domain normalized multichannel LMS algorithm is then implemented as

$$\vec{h}^{10}_n(t+1) = \vec{h}^{10}_n(t) - \rho_f \left[ \vec{P}_{\not n}(t+1) + \delta I_{2L\times 2L} \right]^{-1} \sum_{i=1}^{N} \vec{D}^*_{x_i}(t+1)\, \vec{e}^{\,01}_{in}(t+1), \quad n = 1, 2, \ldots, N \tag{4.84}$$

The Frequency Domain Normalized Multichannel LMS algorithm is illustrated in figure 4.2.
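A single block update of the regularized recursion can be sketched as follows. This is an illustrative two-channel, noise-free example, not the thesis implementation: the sizes, the step size `rho_f`, the regularization `delta`, the zero initial power spectrum and all variable names are assumptions. The estimate is started at the true channels to show that the cross-relation error, and therefore the update, vanishes there.

```matlab
% One block update of the regularized normalized recursion (illustrative sizes).
L = 4; N = 2; rho_f = 0.5; delta = 1e-6;
lambda_p = (1 - 1/(3*L))^L;               % forgetting factor as given in the text
h_true = randn(L, N);                     % hypothetical true channels
s = randn(3*L, 1);                        % common source signal
x = zeros(3*L, N);
for n = 1:N
    x(:, n) = filter(h_true(:, n), 1, s); % noise-free channel outputs
end
blk = x(L+1:3*L, :);                      % one overlapped block of 2L samples
X   = fft(blk);                           % diagonals of D_{x_i}, one column per channel

h_hat = h_true;                           % start at the true solution (fixed-point check)
H10 = fft([h_hat; zeros(L, N)]);          % h^{10}_n: DFT of the zero-padded filters
P   = zeros(2*L, N);                      % running power spectra (assumed to start at zero)
H10_new = H10;
for n = 1:N
    others  = setdiff(1:N, n);
    P(:, n) = lambda_p*P(:, n) + (1 - lambda_p)*sum(abs(X(:, others)).^2, 2);
    grad = zeros(2*L, 1);
    for i = 1:N
        % circular convolution result of x_i*h_n - x_n*h_i
        y   = real(ifft(X(:, i).*H10(:, n) - X(:, n).*H10(:, i)));
        e01 = fft([zeros(L, 1); y(L+1:2*L)]);   % keep last L points, zero-pad in front
        grad = grad + conj(X(:, i)).*e01;
    end
    H10_new(:, n) = H10(:, n) - rho_f*grad./(P(:, n) + delta);  % regularize, then invert
end
step_size = norm(H10_new - H10, 'fro');   % close to zero at the true channels
```

Because the power spectrum matrix is diagonal, its regularized inverse is an element-wise division, which corresponds to the "regularize and invert" and Schur product blocks of the diagram.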

4.8 Performance of Selected Blind Methods

The performance of the selected blind identification methods was studied using increasingly longer channel impulse responses and varying SNR. As the purpose of this study is to dereverberate room impulse responses, algorithms that perform well with channel impulse responses longer than 256 taps are required. At the end of the tests, the frequency domain normalized multichannel LMS algorithm was selected for the complete two-stage dereverberation system. This decision was made because of the speed of this algorithm, which results from its frequency domain operation. The tests were executed in three stages, with different SNR values in each stage:

• 3 tap channel impulse response

• 16 tap random channel impulse response

• ≥256 tap channel room impulse response

The simulations were performed in the following sequence - a) white noise is filtered with

the impulse response, b) white noise is added to the filtered signals to simulate SNR, c)

the algorithm is applied on the filtered signals to recover the impulse responses.
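The simulation sequence described above can be sketched as follows. This is a minimal example under assumed values: the channel length, number of channels, SNR and signal length are illustrative, and `randn` is used for both the source and the additive noise.

```matlab
% Generate noisy multichannel observations for a blind identification test.
L = 16; N = 3; snr_db = 40; num_samples = 20000;   % illustrative values
h = randn(L, N);                          % random channel impulse responses
s = randn(num_samples, 1);                % a) white-noise source
x_clean = zeros(num_samples, N);
for n = 1:N
    x_clean(:, n) = filter(h(:, n), 1, s);   % a) filter the source with each channel
end
sig_pow   = mean(x_clean(:).^2);          % average power of the filtered signals
noise_std = sqrt(sig_pow * 10^(-snr_db/10));
noise     = noise_std * randn(num_samples, N);   % b) additive white noise
x = x_clean + noise;                      % noisy observations
snr_measured = 10*log10(sig_pow / mean(noise(:).^2));
% c) x would then be passed to the blind identification algorithm
```

Note that every channel is driven by the same source `s`, which is what the cross relation between channel pairs relies on.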

Minimal length channel impulse response estimation

Initial tests for convergence were carried out using a simple two-channel 3-tap system given

by

$$h_1 = [\, 1 \;\; -2\cos(\theta) \;\; 1 \,]^T$$

$$h_2 = [\, 1 \;\; -2\cos(\theta + v) \;\; 1 \,]^T$$

[Figure: block diagram. The channel signals x1, x2, x3 are transformed by an FFT; the estimates ĥ1, ĥ2, ĥ3 are initialized as ĥi = [1 0 0 … 0]^T, i = 1, 2, …, N, and transformed by an FFT; the error e_ij(t+1) is formed for i, j = 1, 2, …, N and its last L elements are kept; further blocks: Conjugate x, Average Power of x, Regularize and invert, Schur product of the sum of the error between a channel and the others with the regularized inverse of the average power, IFFT.]

Figure 4.2: Frequency Domain Normalized Multichannel LMS

[Figure: two panels plotted against Sample (0 to 300). Top: Normalized Projection Misalignment (Misalignment dB). Bottom: Error (Magnitude). Curves: VSS-UMCLMS, MCLMS µ = 0.005, MCLMS µ = 0.01.]

Figure 4.3: Algorithm performance in well conditioned 2 channel, 3-tap system, 40 dB SNR

The relative conditioning of the channels can be controlled by changing the value of v. All the blind identification algorithms performed satisfactorily with this two-channel system when the channels were well conditioned. Figure 4.3 shows the response of the unconstrained multichannel LMS algorithm with optimal step-size control and that of the constrained time domain multichannel LMS with µ = 0.005 and µ = 0.01. An NPM of between -30 dB and -55 dB is achieved by the algorithms after 300 samples. Figure 4.4 shows the scaled impulse responses estimated using the constrained time domain multichannel LMS compared with the original channel impulse responses. The constrained time domain multichannel LMS algorithm did not perform well with a badly conditioned channel pair. Figure 4.5 shows the performance of the previous algorithms, together with that of the multichannel Newton algorithm using ρ = 0.5, and the error as the algorithms progress up to 8000 samples. The SNR is 40 dB in this configuration. The multichannel LMS algorithm diverges for both values of µ, whereas the unconstrained blind multichannel LMS with optimal step-size control and the multichannel Newton algorithm converge with NPMs of -30 dB and -40 dB respectively. The estimated impulse responses are shown in figure 4.6.
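The NPM values quoted above compare an estimate with the true channels up to an arbitrary scale factor. A minimal sketch of the computation, following the projection used in the scripts of Chapter 6, is given below; the scale factor and the perturbation are illustrative assumptions, and `npm_fun` is a name chosen here.

```matlab
% Normalized projection misalignment between true and estimated channels.
theta  = pi/10;
h_true = [1, -2*cos(theta), 1, 1, -2*cos(theta + pi), 1]';   % stacked [h1; h2]

% NPM: residual of h after projecting it onto the estimate g, relative to ||h||
npm_fun = @(h, g) norm(h - ((h' * g) / (g' * g)) * g) / norm(h);

h_scaled = -0.37 * h_true;                           % exact estimate up to a scale factor
h_noisy  = h_scaled + 1e-2 * [1; -1; 1; -1; 1; -1];  % slightly perturbed estimate

npm_scaled   = npm_fun(h_true, h_scaled);            % essentially zero
npm_noisy_db = 20*log10(npm_fun(h_true, h_noisy));   % misalignment in dB
```

The scale invariance is the reason the estimated impulse responses in the figures are described as scaled: blind identification can recover the channels only up to a common gain.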

[Figure: four panels over tap index 1 to 3, including "h1 to be identified", "h1 estimate" and "h2 estimate".]

Figure 4.4: Scaled impulse response estimates of a well conditioned system obtained using the SIMO LMS

[Figure: two panels plotted against Sample (0 to 8000). Top: Misalignment dB. Bottom: Error (Magnitude, ×10^6). Curves: VSS-UMCLMS, MCLMS µ = 0.025, MCLMS µ = 0.01, MCN ρ = 0.5.]

Figure 4.5: Algorithm performance in badly conditioned 2 channel, 3-tap system, SNR = 40 dB

[Figure: four panels over tap index 1 to 3, including "h1 estimate" and "h2 estimate".]

Figure 4.6: Scaled impulse response estimates of a badly conditioned system obtained using the Multichannel Newton Algorithm

Algorithm performance with random 16 tap channel impulse responses

The following algorithms were tested using random 16 tap channel responses for a 3 channel system: MCN, VSS-UMCLMS, MCLMS, and FNMCLMS. Figure 4.7 shows the performance of the four algorithms. Three of the algorithms perform satisfactorily, with NPM between -38 dB and -48 dB at a signal to noise ratio of 40 dB. The multichannel Newton algorithm diverged in this simulation. Figure 4.8 shows the scaled estimated impulse responses obtained using FNMCLMS.

[Figure: two panels plotted against Sample (0 to 3 × 10^4). Top: Misalignment dB. Bottom: Error (Magnitude, ×10^11). Curves: VSS-UMCLMS, MCLMS µ = 0.0075, FNMCLMS ρf = 1.2, MCN ρ = 0.5.]

Figure 4.7: Performance of algorithms on 16 tap 3-channel system

[Figure: six panels over tap index 2 to 16: "h1 to be identified", "h2 to be identified", "h3 to be identified", "h1 estimate", "h2 estimate", "h3 estimate".]

Figure 4.8: Scaled filter estimates using the blind identification methods for a 16 tap 3-channel system

Algorithm performance with higher order impulse responses

Time domain algorithms were very computationally intensive with longer channel impulse responses. The time domain multichannel Newton and frequency domain normalized multichannel Newton algorithms were observed to estimate the channel impulse response. The focus was on the frequency domain normalized multichannel LMS. Simulated room impulse responses were also used to observe the performance of the algorithms. The room impulse response generator [16], which implements the image method [15], was used to obtain the room impulse responses. The algorithm was able to attain an NPM of -11 dB at an SNR of 10 dB for a 3 channel system with 256 taps. The input parameters used to generate the room impulse responses are shown below:

sv = 340;                               % Sound velocity (m/s)
fs = 8000;                              % Sample frequency (samples/s)
r = [2 1.5 2 ; 1 1.5 2; 1.5 1.5 1.5];   % Receiver positions [x_1 y_1 z_1 ; x_2 y_2 z_2] (m)
s = [2 3.5 2];                          % Source position [x y z] (m)
L = [5 4 6];                            % Room dimensions [x y z] (m)
c = 0.3;                                % Reverberation time (s)
n = 296;                                % Number of samples
mtype = 'omnidirectional';              % Type of microphone
order = -1;                             % -1 equals maximum reflection order!
dim = 3;                                % Room dimension
orientation = 0;                        % Microphone orientation (rad)
hp_filter = 1;                          % Enable high-pass filter

The performance of the FNMCLMS algorithm is shown in figure 4.9, and the corre-

sponding estimated and scaled filter coefficients are shown in figure 4.10

[Figure: Normalized Projection Misalignment (0 to -12) plotted against Samples (200 to 2000).]

Figure 4.9: Normalized Projection Misalignment using a Simulated Room Response, 10 dB SNR

[Figure: six panels over tap index 0 to 250, including "h1 to be identified", "h1 estimate", "h2 estimate" and "h3 estimate".]

Figure 4.10: Impulse Response Estimates of Simulated Room Response

CHAPTER 5

Conclusion

This work has been a study of algorithms suitable for dereverberation of room acoustics. The focus was on two-stage dereverberation based on blind identification and channel inversion with the MINT method. A representation of this approach to dereverberation is shown in figure 5.1, where Stage 1 illustrates identification. In a real-world scenario, blind identification will usually be required, as the source signal x is usually unknown. In Stage 2, channel inversion using the LS or the MINT method is executed, followed by filtering. Estimates of the room transfer functions between the source and each microphone, denoted by h1, h2 and h3, are obtained in Stage 1. These estimates are denoted ĥ1, ĥ2 and ĥ3. The room transfer functions, which are not usually minimum phase, are then inverted to give the inverse responses. With the obtained inverse responses, the microphone signals x1, x2 and x3 are filtered to recover the original input signal x. Various algorithms were analyzed for blind system performance, with further study of the frequency domain unnormalized multichannel LMS algorithm. Blind identification was possible with truncated simulated impulse responses: the early part of the simulated impulse response had to be truncated for the algorithm to perform satisfactorily. This implies that the algorithms are not suitable for real-time operation with unknown room impulse responses. The identified impulse responses were inverted using the MINT method. The complete system gave an SDR improvement from 0.3 dB to 7.3 dB in a system with an SNR of 10 dB, as shown in figure 5.2.

[Figure: block diagram. Stage 1: Identification (Non-Blind / Blind) of h1, h2, h3 from the source x and the microphone signals x1, x2, x3. Stage 2: Channel Inversion (MINT / LS) followed by Inverse Filtering of x1, x2, x3 to give the estimate of x.]

Figure 5.1: System Diagram

Currently, there is much ongoing research in the dereverberation of room acoustics, and several methods have been used to improve the convergence properties of the blind identification algorithms. Thus there is a strong possibility of better identification. Also, the algorithms diverge after prolonged operation, and many methods have been suggested to improve the long-term performance.
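The channel inversion of Stage 2 can be illustrated with a small two-channel MINT example. This is a sketch under stated assumptions, not the implementation used for the reported results: it reuses the well conditioned 3-tap channels from Section 4.8 (θ = π/10, v = π), assumes the channels are known exactly, and uses the minimum inverse-filter length for two channels. The names `H1`, `H2`, `g1` and `g2` are chosen here for illustration.

```matlab
% Two-channel MINT: find g1, g2 such that h1*g1 + h2*g2 is a unit impulse.
h1 = [1, -2*cos(pi/10), 1]';              % 3-tap channels from Section 4.8
h2 = [1, -2*cos(pi/10 + pi), 1]';
L  = numel(h1);
Lg = L - 1;                               % minimum inverse-filter length for two channels

% Convolution (Toeplitz) matrices so that H*g equals conv(h, g)
H1 = toeplitz([h1; zeros(Lg-1, 1)], [h1(1), zeros(1, Lg-1)]);
H2 = toeplitz([h2; zeros(Lg-1, 1)], [h2(1), zeros(1, Lg-1)]);

d  = [1; zeros(L + Lg - 2, 1)];           % desired overall response: unit impulse
g  = [H1 H2] \ d;                         % solve the MINT system
g1 = g(1:Lg);
g2 = g(Lg+1:end);

overall = conv(h1, g1) + conv(h2, g2);    % equalized response, ideally equal to d
```

An exact inverse exists here because the two channels share no common zeros, even though neither channel can be inverted on its own with a finite filter. With estimated rather than exact channels, the equalized response would only approximate a unit impulse.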

[Figure: five power spectral density panels, "Speaker Signal", "Microphone 1 Signal", "Microphone 2 Signal", "Microphone 3 Signal" and "Dereverberated Signal". Horizontal axis: Normalized Frequency (×π rad/sample). Vertical axis: Power/frequency (dB/rad/sample).]

Figure 5.2: System performance

CHAPTER 6

Matlab Scripts

6.1 Two Channel Blind Identification : 3-tap channels

clear all; close all;
L = 4;                                  % model filter length
N = 2;                                  % number of channels
theta = pi/10;
v = pi;                                 % sets the relative conditioning of the channels
h1 = [1 -2*cos(theta) 1]';
h2 = [1 -2*cos(theta+v) 1]';
h1 = [h1; zeros(L-length(h1),1)];       % zero-pad the channels to length L
h2 = [h2; zeros(L-length(h2),1)];
x1 = wgn(100000, 1, 40, 'real');
x2 = wgn(100000, 1, 40, 'real');
x1_filt = filter(h1, 1, x1);
x2_filt = filter(h2, 1, x2);
x = [x1_filt x2_filt];
h = ones(L,N);                          % initial filter estimates
e = zeros(N,N);
H_model = [h1;h2];
npm = []; x_chi_s = [];
[m,n] = size(h);
pos = L;
for k = 1:9000
    % cross-relation error between every pair of channels
    for i = 1:N
        for j = 1:N
            e(i,j) = (x(pos:-1:pos-L+1,i)'*h(:,j)) - (x(pos:-1:pos-L+1,j)'*h(:,i));
        end
    end
    % instantaneous squared error
    x_chi = 0;
    for i = 1:N-1
        for j = i+1:N
            x_chi = x_chi + e(i,j)^2;
        end
    end
    x_chi_s = [x_chi_s x_chi];
    xtemp = x(pos:-1:pos-L+1,:);
    rxx = rx(xtemp);                    % rx is a helper function not listed in this chapter
    pos = pos+1;
    h_long = reshape(h,m*n,1);
    J = 2*rxx*h_long / (norm(h_long))^2;
    mu = (h_long.'*J)/((norm(J)).^2);   % variable step size
    h_long = h_long - mu*J;
    % normalized projection misalignment
    npm_temp = H_model - (((H_model'*h_long)/(h_long'*h_long))*h_long);
    npm = [npm norm(npm_temp)/norm(H_model)];
    h = reshape(h_long,m,n);
end
figure;
subplot(2,2,1);plot(abs(fft(h1,512)).^2);title('h1 response');
subplot(2,2,2);plot(abs(fft(h2,512)).^2);title('h2 response');
subplot(2,2,3);plot(abs(fft(h(:,1),512)).^2);title('h1 estimate');
subplot(2,2,4);plot(abs(fft(h(:,2),512)).^2);title('h2 estimate');
figure; plot(10*log10(npm(1:300)));

6.2 Identification with the NLMS Algorithm


% start and stop samples of the recordings
amethyst_1m_start = 11820; amethyst_1m_stop = 102630;
smaragden_1m_start = 2750; smaragden_1m_stop = 128790;
ali1m_start = 6300; ali1m_stop = 92000;
ali2m_start = 1; ali2m_stop = 90000;
ali3m_start = 1; ali3m_stop = 90000;
[xx, dd1, dd2, dd3, dd4, dd5, dd6, dd7] = xload('amethyst1m');
start = amethyst_1m_start; stop = amethyst_1m_stop;
x = xx(start:stop); d1 = dd1(start:stop); d2 = dd2(start:stop);
d3 = dd3(start:stop); d4 = dd4(start:stop); d5 = dd5(start:stop);
d6 = dd6(start:stop); d7 = dd7(start:stop);
DD = [dd1 dd2 dd3 dd4 dd5 dd6 dd7];
s = 5000;                               % length of the adaptive filter
beta = 0.3;
D = [d1 d2 d3 d4 d5 d6 d7];
[m,n] = size(D);
ERROR = zeros(n,1); ENERGY = zeros(m,n);
W = zeros(s,n); RT60 = zeros(s,n);
min_error = zeros(1,7);
sig_variance = zeros(1,7);
noise_variance = zeros(1,7);
[ERROR, ENERGY, W, RT60] = batchnlms(beta,s,x,D);
% noise-to-signal ratio of each channel as a bound on the achievable error
for i = 1:7
    d_ = DD(:,i);
    sig_variance(i) = var(d_(12030:35400));
    noise_variance(i) = var(d_(512:5761));
    min_error(i) = 10*log10((noise_variance(i))/ sig_variance(i));
end
% normalized energy of the identified impulse responses
figure;
position = 0;
for i = 1:7
    subplot(2,2,i-position);plot(ENERGY(:,i));
    title(['Channel = ',num2str(i),' Min Error = ',...
        num2str(min_error(i)),' Observed Error = ',num2str(ERROR(i))]);
    xlabel(['Filter tap #']);
    ylabel('W^2');
    axis([ 0 5000 -140 1])
    if(i == 4)
        figure;
        position = 4;
    end
end
% identified impulse responses
figure;
position = 0;
for i = 1:7
    subplot(2,2,i-position);plot(W(:,i));
    xlabel(['Filter tap #']);
    ylabel('W');
    axis([ 0 5000 -0.1 0.16])
    if(i == 4)
        figure;
        position = 4;
    end
end
% averaged energy decay
figure;
position = 0;
for i = 1:7
    subplot(2,2,i-position);plot(RT60(:,i));
    xlabel(['Filter tap #']);
    ylabel('Averaged W^2');
    if(i == 4)
        figure;
        position = 4;
    end
end

6.3 Normalized LMS

function [error, energy, w1, rt60, e] = nlms2(beta,s,x1,d)
w1=zeros(s,1);
% pre-allocating the variables
e=zeros(length(x1),1);
y1=zeros(length(x1),1);
mu=zeros(length(x1),1);
rt=zeros(s,1);
% beta=0.3;
% adding zeros at the beginning of the input signal
% for performing the adaptive algorithm
X1=[zeros((s-1),1); x1];
for i=s:length(X1)
    % calculating mu based on the input power
    mu(i-s+1)=beta/(X1(i-s+1:i)'*X1(i-s+1:i)+eps);
    y1(i-s+1)=w1'*X1(i:-1:i-s+1);
    e(i-s+1)=d(i-s+1)-y1(i-s+1);
    w1=w1+mu(i-s+1)*e(i-s+1)*X1(i:-1:i-s+1);
end
error=10*log10(var(e(end-5000:end))/var(d));
% normalized energy of the impulse response
energy = 10*log10(eps+abs(w1/max(abs(w1))).^2);
for i=1:length(w1)/10
    if abs(w1(i))==max(abs(w1))
        j=i;
    end
end
% calculating the RT60
for i=1:s
    if i<100
        rt(i)=1;
    else
        rt(i)=sum((w1(i-100+1:i)/max(abs(w1))).^2)/100;
    end
end
% figure;
rt60 = (10*log10(eps+rt));

6.4 Single Channel Dereverberation with NLMS

clear all;
load('sg2');                            % g1 is the room impulse response
W = w1;
for j=1:1
    g1=W(:,j);
    delay=3100;
    s=6000;                             % length of the adaptive filter
    d_1 = wavread('smaragden1m\Track 1.wav');
    d1 = d_1(2750:128790).';            % the desired signal
    x=filter(g1,1,d1);                  % x is the input data for the adaptive algorithm
    % synchronising the delayed desired signal (d) with the input signal (x)
    d = d1(1:end-delay).';
    % initializing the adaptive filter
    w1=zeros(1,s);
    % pre-allocating the variables
    e=zeros(1,length(d));
    y1=zeros(1,length(d));
    mu=zeros(1,length(d));
    beta=.65;
    % adding zeros at the beginning of the input signal for performing the adaptive algorithm
    % X1=[zeros(1,(s-1), x1];
    X1=[zeros(1,(s-delay-2)), x];
    for i=s:length(X1)
        % calculating mu based on the input power
        mu(i-s+1)=beta/(X1(i-s+1:i)*X1(i-s+1:i)'+eps);
        y1(i-s+1)=w1*X1(i:-1:i-s+1)';
        e(i-s+1)=d(i-s+1)-y1(i-s+1);
        w1=w1+mu(i-s+1)*e(i-s+1)*X1(i:-1:i-s+1);
    end
    error(j)=10*log10(var(e(end-5000:end))/var(d));
    winv(:,j)=w1;
end
figure;
position = 0;
plot(conv(W(:,1),winv(:,1)));           % overall response of the channel and its inverse
title(['Channel=',num2str(j)]);
ylabel('amplitude')

6.5 System Identification using NLMS

function [ERROR, ENERGY, W, RT60] = batchnlms(beta,s,x,D)

[m,n] = size(D);

for i = 1:n

[ERROR(i), ENERGY(:,i), W(:,i), RT60(:,i)] = nlms2(beta,s,x,D(:,i));

end

6.6 Blind SIMO LMS Well Conditioned Inputs


L = 3;
N = 2;
theta = pi/10;
v = pi;
mu = 0.005;
runs = 300;
source_length = 100000;
h1 = [1 -2*cos(theta) 1]';
h2 = [1 -2*cos(theta+v) 1]';
ensemble = 1;
hnew = 0; npm_new = 0; x_chi_s_new = 0;
hmcnew = 0; npmmcnew = 0; x_chi_s_mc_new = 0;
hmcnew_01_new = 0; npmmc_01_new = 0;
x_chi_s_mc_01_new = 0;
for i = 1:ensemble
    x1 = wgn(source_length, 1, 0, 'real');
    xsig1 = filter(h1, 1, x1);
    xsig2 = filter(h2, 1, x1);          % reconstructed: line missing in the source listing
    meanstd = mean([std(xsig1), std(xsig2)]);
    noise_snr = 40;
    x1_filt = xsig1 + randn(source_length,1)*meanstd*10^(-noise_snr/20);
    x2_filt = xsig2 + randn(source_length,1)*meanstd*10^(-noise_snr/20);   % reconstructed
    x = [x1_filt x2_filt];              % reconstructed
    [h,npm,x_chi_s] = vssumclms(x,L,runs,h1,h2);
    mu = 0.005;
    [hmc,npmmc,x_chi_s_mc] = mclms(x,L,mu,runs,h1,h2);
    mu = 0.01;
    [hmc_01,npmmc_01,x_chi_s_mc_01] = mclms(x,L,mu,runs,h1,h2);
    % accumulate the results over the ensemble
    hnew = h + hnew;
    npm_new = npm + npm_new;
    x_chi_s_new = abs(x_chi_s) + x_chi_s_new;
    hmcnew = hmc + hmcnew; npmmcnew = npmmc + npmmcnew;
    x_chi_s_mc_new = abs(x_chi_s_mc) + x_chi_s_mc_new;
    hmcnew_01_new = hmc_01 + hmcnew_01_new; npmmc_01_new = npmmc_01 + npmmc_01_new;
    x_chi_s_mc_01_new = abs(x_chi_s_mc_01) + x_chi_s_mc_01_new;
end
% ensemble averages
npmvss = npm_new/ensemble; npmmc = npmmcnew/ensemble;
npmmc_01 = npmmc_01_new/ensemble;
x_chi_s_vss = x_chi_s_new/ensemble;
x_chi_s_mc = x_chi_s_mc_new/ensemble;
x_chi_s_mc_01 = x_chi_s_mc_01_new/ensemble;
hvss = h;
figure; subplot(2,1,1);
h = hnew/ensemble;
plot(1:runs,20*log10(npmvss),1:runs,20*log10(npmmc),1:runs,20*log10(npmmc_01));
legend('VSS-UMCLMS','MCLMS \mu = 0.005','MCLMS \mu = 0.01');
title('Normalized Projection Misalignment'); xlabel('Sample');
ylabel('Misalignment dB');
subplot(2,1,2);
plot(1:runs,x_chi_s_vss,1:runs,x_chi_s_mc,1:runs,x_chi_s_mc_01);
legend('VSS-UMCLMS','MCLMS \mu = 0.005','MCLMS \mu = 0.01');
title('Error');xlabel('Sample');
ylabel('Magnitude');
figure; subplot(2,2,1);stem(h1,'-x');title('h1 to be identified');
subplot(2,2,2);stem(h2,'-x');title('h2 to be identified');
subplot(2,2,3);stem(hvss(:,1),'-x');title('h1 estimate');
subplot(2,2,4);stem(hvss(:,2),'-x');title('h2 estimate');   % reconstructed

6.7 Blind SIMO LMS Bad Conditioned Inputs


L = 3;
N = 2;
theta = pi/10;
v = pi/10;                              % small v gives a badly conditioned channel pair
mu = 0.025;
rho = 0.5;
h1 = [1 -2*cos(theta) 1]';
h2 = [1 -2*cos(theta+v) 1]';
source_length = 100000;
x1 = wgn(source_length, 1, 0, 'real');
xsig1 = filter(h1, 1, x1);              % reconstructed: line missing in the source listing
xsig2 = filter(h2, 1, x1);              % reconstructed
meanstd = mean([std(xsig1), std(xsig2)]);
noise_snr = 40;
x1_filt = xsig1 + randn(source_length,1)*meanstd*10^(-noise_snr/20);   % reconstructed
x2_filt = xsig2 + randn(source_length,1)*meanstd*10^(-noise_snr/20);   % reconstructed
x = [x1_filt x2_filt];                  % reconstructed
runs = 8000;
[hvss,npmvss,x_chi_s_vss] = vssumclms(x,L,runs,h1,h2);
[hmc,npmmc,x_chi_s_mc] = mclms(x,L,mu,runs,h1,h2);
mu = 0.01;
[hmc_01,npmmc_01,x_chi_s_mc_01] = mclms(x,L,mu,runs,h1,h2);
[hnewton,npmnewton,x_chi_s_newton] = newtonmclms_t(x,L,mu,runs,h1,h2);
figure; subplot(2,1,1);
plot(1:runs,20*log10(npmvss),1:runs,20*log10(npmmc),1:runs,20*log10(npmmc_01),...
    1:runs,20*log10(npmnewton));
legend('VSS-UMCLMS','MCLMS \mu = 0.025','MCLMS \mu = 0.01','MCN \rho = 0.5');
title('Normalized Projection Misalignment'); xlabel('Sample');
ylabel('Misalignment dB');
subplot(2,1,2);
plot(1:runs,abs(x_chi_s_vss),1:runs,abs(x_chi_s_mc),1:runs,abs(x_chi_s_mc_01),...
    1:runs,abs(x_chi_s_newton));
legend('VSS-UMCLMS','MCLMS \mu = 0.025','MCLMS \mu = 0.01','MCN \rho = 0.5');
title('Error');xlabel('Sample');
ylabel('Magnitude');
figure; subplot(2,2,1);stem(h1,'-x');title('h1 to be identified');
subplot(2,2,2);stem(h2,'-x');title('h2 to be identified');

BIBLIOGRAPHY

[1] F. Everest. The Master Handbook of Acoustics. McGraw-Hill, 1994.

[2] Thomas Funkhouser, Jean-Marc Jot, and Nicolas Tsingos. Sounds good to me. SIGGRAPH 2002.

[3] Encyclopaedia Britannica Online. Acoustics. 2007 (Retrieved April 3, 2007).

[4] Yiteng Huang, Jacob Benesty, and Jingdong Chen. Acoustic MIMO Signal Processing (Signals and Communication Technology). Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.

[5] Monson H. Hayes. Statistical Digital Signal Processing and Modeling. 1996.

[6] Sen M. Kuo and Dennis Morgan. Active Noise Control Systems: Algorithms and DSP Implementations. John Wiley & Sons, Inc., New York, NY, USA, 1995.

[7] Osunkunle Biodun Isaac, Sani AlMoudarress, and Sayed Ali Shekarchi. Adaptive echo cancellation implementation in Matlab and DSP. Adaptive Signal Processing ETC004, Blekinge Tekniska Högskola.

[8] Stephen T. Neely and Jont B. Allen. Invertibility of a room impulse response. Journal of the Acoustical Society of America, 66(1):165–169, July 1979.

[9] Bernard Widrow. Adaptive Signal Processing. Prentice-Hall, Upper Saddle River, New Jersey 07458, 1985.

[10] Masato Miyoshi and Yutaka Kaneda. Inverse filtering of room acoustics. IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(2):145–152, February 1988.

[11] Takafumi Hikichi, Marc Delcroix, and Masato Miyoshi. On robust inverse filter design for room transfer function fluctuations. European Signal Processing Conference (EUSIPCO), 2006.

[12] S. Haykin. Adaptive Filter Theory. Prentice-Hall, Englewood Cliffs, NJ, 1986.

[13] G. Xu, H. Liu, L. Tong, and T. Kailath. A least-squares approach to blind channel identification. IEEE Transactions on Signal Processing, SP-43(12):2982–2993, December 1995.

[14] T.K. Moon and W.C. Stirling. Mathematical Methods and Algorithms. Prentice-Hall, Upper Saddle River, New Jersey 07458.

[15] Jont B. Allen and David A. Berkley. Image method for efficiently simulating small-room acoustics. Journal of the Acoustical Society of America, 65(4):943–950, 1979.

[16] E.A.P. Habets. Room impulse response generator. Technische Universiteit Eindhoven, The Netherlands, 2006.
