
A Model of Binaural Processing Based on Tree-Structure Filter-Bank

길이만, 김영익, 김화길, 구임회

Applied Mathematics, Korea Advanced Institute of Science and Technology (KAIST)

Motivation

• Design of auditory preprocessors motivated by the characteristics of biological auditory systems:

- robustness to noise

- capturing minute differences between signals (2 Hz difference)

- wide dynamic range (140 dB)

- selective attention

- source localization using two ears

Design of Basilar Membrane (BM)

Types of BM Models

• Lyon and Mead - R. F. Lyon and C. Mead, An Analog Electronic Cochlea, IEEE Transactions on Acoustics, Speech, and Signal Processing, 36(7), 1988.

• Liu - W. Liu, A. G. Andreou, and M. H. Goldstein, Jr., Voiced-Speech Representation by an Analog Silicon Model of the Auditory Periphery, IEEE Transactions on Neural Networks, 3(3), 1992.

• Kates - J. M. Kates, A Time-Domain Digital Cochlear Model, IEEE Transactions on Signal Processing, 39(12), 1991.

• Hamming BPF - O. Ghitza, Robustness against Noise: The Role of Timing-Synchrony Measurement, IEEE International Conference on Acoustics, Speech, and Signal Processing, 6.8, 1987.

[Figure: filter-bank structures built from low-pass (L) and high-pass (H) filter stages]

Design of Filter Bank

(1) Lyon & Mead

• Cascaded LPFs

• Number of filters: $O(N)$

(2) Fully Cascaded BPF

• Cascaded LPFs & HPFs

• Higher bandpass capability

• Equal delay time

• Number of filters: $O(N^2)$

(3) TSFB

• Tree structure

• Cascaded LPFs & HPFs

• Higher bandpass capability

• Equal delay time

• Versatile Q control

• Number of filters: $O(N \log N)$

Binaural Processing Models

• EE (Excitation-Excitation) cells in the medial superior olive (MSO)

- interaural cross-correlation models

• EI (Excitation-Inhibition) cells in the lateral superior olive (LSO)

- equalization-cancellation (EC) theory

Interaural Cross-correlation Model (EE-type cells)

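• Running interaural cross-correlation (Jeffress, 1948)

$$\Psi(t, \tau) = \int_{-\infty}^{t} r(t', f)\, l(t' - \tau, f)\, e^{-(t - t')/T_{int}}\, dt'$$

• Delay weighting (Colburn, 1977), with $\tau$ in ms

$$p(\tau) = \begin{cases} C, & |\tau| \le 0.15 \\ C\, e^{-(|\tau| - 0.15)/0.6}, & 0.15 < |\tau| \le 2.2 \\ 0.033\, C\, e^{-(|\tau| - 2.2)/2.3}, & |\tau| > 2.2 \end{cases}$$

• Frequency weighting (Stern and Shear, 1996)

[Equation: frequency-dependent delay weighting $p(\tau \mid f_c)$, expressed through $C(f_c)$ and $p(\tau)$, for $f_c$ between 300 and 1200 Hz]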
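The running cross-correlation and Colburn's delay weighting can be tried out numerically. The sketch below is only an illustration, not the authors' implementation: it assumes a discrete-time form of the exponentially windowed correlation, sum over m <= n of r[m] * l[m - lag] * exp(-(n - m) / T_int), evaluated recursively, and it takes the delay-weighting constants (0.15, 0.6, 2.2, 2.3, 0.033, delays in ms) from Colburn's function. The function names, the normalization `c=1.0`, and measuring `t_int` in samples are assumptions made for the example.

```python
import math


def colburn_delay_weight(tau_ms, c=1.0):
    """Delay weighting p(tau) with tau in milliseconds (c is an assumed scale)."""
    a = abs(tau_ms)
    if a <= 0.15:
        return c
    if a <= 2.2:
        return c * math.exp(-(a - 0.15) / 0.6)
    return 0.033 * c * math.exp(-(a - 2.2) / 2.3)


def running_cross_correlation(left, right, max_lag, t_int):
    """Exponentially windowed cross-correlation for lags -max_lag..max_lag.

    Each lag keeps a leaky accumulator:
        psi <- exp(-1 / t_int) * psi + right[n] * left[n - lag]
    where t_int is the integration time constant in samples.
    Returns (lags, psi) evaluated at the final sample.
    """
    decay = math.exp(-1.0 / t_int)
    lags = list(range(-max_lag, max_lag + 1))
    psi = [0.0] * len(lags)
    for n in range(len(right)):
        for i, lag in enumerate(lags):
            m = n - lag
            # Samples outside the recorded left signal are treated as silence.
            l_val = left[m] if 0 <= m < len(left) else 0.0
            psi[i] = decay * psi[i] + right[n] * l_val
    return lags, psi
```

With this sign convention, a right-ear signal that lags the left-ear signal by d samples produces its correlation peak at lag d. Multiplying `psi` by `colburn_delay_weight` at the corresponding delay (converted to ms) would emphasize small interaural delays.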

Lindemann’s Model (EI-type cells)

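• Contralateral inhibition mechanism

$$r_i(m, n) = r_{i,s}(m, n)\, r_{i,d}(m, n)$$

• Stationary-inhibition component

$$r_{i,s}(m, n) = 1 - c_s\, l(m, n), \quad \text{with } 0 \le c_s\, l(m, n) \le 1.$$

• Dynamic-inhibition component

$$r_{i,d}(m, n) = 1 - c_d\, \{k(m, n)\},$$

$$k(m, n) = r(m, n)\, l(m, n),$$

$$\text{with } 0 \le k(m, n) \le 1,\ 0 \le c_d \le 1.$$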
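To make the roles of the two components concrete, the sketch below evaluates them for a single pair of left and right sample values. It is an illustration under stated assumptions rather than Lindemann's full model: the stationary factor is taken as 1 - c_s * l, the dynamic factor as 1 - c_d * k with k = r * l, and the overall factor as their product. The function name, the default values of `c_s` and `c_d`, and the omission of any temporal smoothing of k are assumptions made for the example.

```python
def inhibition_factor(l, r, c_s=0.3, c_d=1.0):
    """Return (stationary, dynamic, combined) inhibition factors for one tap.

    l, r : left and right signal values, expected in [0, 1]
    c_s  : stationary-inhibition strength (illustrative default)
    c_d  : dynamic-inhibition strength (illustrative default)
    """
    if not (0.0 <= l <= 1.0 and 0.0 <= r <= 1.0):
        raise ValueError("l and r must lie in [0, 1]")
    if not (0.0 <= c_s <= 1.0 and 0.0 <= c_d <= 1.0):
        raise ValueError("c_s and c_d must lie in [0, 1]")
    k = r * l                 # instantaneous product of the two channels
    r_s = 1.0 - c_s * l       # stationary component
    r_d = 1.0 - c_d * k       # dynamic component (no smoothing applied here)
    return r_s, r_d, r_s * r_d
```

With a silent left channel (`l = 0`) all three factors equal 1, so the right channel is not attenuated; as `l` and the product `r * l` grow, the combined factor falls toward 0.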

Breebaart Model (EI-type cells)

• EI-type cell

• Combined EI-type cell

• Temporal windowing

• Nonlinear saturation

Shamma’s Model

The Stereausis Network

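[Figure: stereausis processor. Two sets of input channels, $x_1(t), x_2(t), \ldots, x_i(t)$ and $y_1(t), y_2(t), \ldots, y_j(t)$, one set per ear, feed a grid of nodes. Each node output $c_{ij}(n)$ is the square of a combination of $x_i(n)$ and $y_j(n)$.]

[Figure: network output for a time-shifted 600 Hz tone: a) zero shift, b) to d) nonzero shifts]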

Binaural Processing with TSFB

[Figure: block diagram with speech and noise sources, left and right HRTF stages, summation nodes, a TSFB for each channel, and feature extraction (ZCPA, F0)]
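The feature-extraction stage names ZCPA (zero crossings with peak amplitudes) without spelling it out. The sketch below follows the commonly described form of that feature for one sub-band signal: the interval between successive upward zero crossings gives a frequency estimate, and the matching histogram bin is incremented by a log-compressed peak amplitude from that interval. This is a hedged illustration, not the authors' code; the function name, the bin edges, and the `log(1 + peak)` compression are assumptions made for the example.

```python
import math


def zcpa_histogram(x, fs, bin_edges_hz):
    """Accumulate a ZCPA-style frequency histogram for one sub-band signal.

    x            : list of samples from one filter-bank channel
    fs           : sampling rate in Hz
    bin_edges_hz : ascending bin edges; bin b covers [edge[b], edge[b + 1])
    """
    hist = [0.0] * (len(bin_edges_hz) - 1)
    # Indices n where the signal goes from negative to non-negative.
    crossings = [n for n in range(1, len(x)) if x[n - 1] < 0.0 <= x[n]]
    for start, end in zip(crossings, crossings[1:]):
        freq = fs / (end - start)      # inverse of the crossing interval
        peak = max(x[start:end])       # peak amplitude inside the interval
        for b in range(len(hist)):
            if bin_edges_hz[b] <= freq < bin_edges_hz[b + 1]:
                hist[b] += math.log(1.0 + peak)
                break
    return hist
```

In a full front end, one such histogram would be accumulated per filter-bank channel and frame, and the channel histograms summed into a single feature vector.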

Simulation for Binaural Processing

[Figure: source layout showing the signal source and noise sources at 0, 45, and 90 degrees]

- Signal: TI46 male speech samples ('zero' to 'nine')

- Noise: NOISEX samples
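Noisy test material of this kind is usually produced by scaling the noise to a target signal-to-noise ratio before adding it to the speech. The sketch below shows one common way to do that and is not taken from the slides: the function name `mix_at_snr` and the use of average power over the whole utterance are assumptions.

```python
import math


def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so that the mixture has the requested SNR in dB.

    The noise is truncated to the speech length and scaled so that
    10 * log10(speech_power / scaled_noise_power) equals snr_db.
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    n = len(speech)
    p_speech = sum(s * s for s in speech) / n
    p_noise = sum(v * v for v in noise[:n]) / n
    if p_noise == 0.0:
        raise ValueError("noise has zero power")
    gain = math.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return [s + gain * v for s, v in zip(speech, noise)]
```

For a binaural simulation, the same scaling would be applied before the speech and noise are passed through their respective HRTFs.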

Simulation for Binaural Processing

Recognition rate (%) by feature and noise direction (degrees)

Feature                          ZCPA                  F0
Noise                 SNR (dB)   0     45    90        0     45    90
White Gaussian Noise  10         94.3  95.0  95.3      94.1  94.5  94.3
                      5          85.9  93.0  94.2      92.2  92.5  93.0
                      0          49.7  63.7  74.3      63.9  68.2  75.9
Op Room Noise         10         94.8  94.9  95.6      94.5  94.5  94.5
                      5          93.3  94.2  95.0      93.7  94.4  94.1
                      0          74.2  87.5  89.6      88.7  91.3  88.8
F16 Noise             10         94.7  95.2  95.2      94.5  94.4  94.2
                      5          90.7  93.5  93.6      92.1  92.2  92.1
                      0          51.0  67.9  71.2      66.4  64.7  64.7

Simulation for Monaural Processing without HRTF

Recognition rate (%) by feature

Noise                 SNR (dB)   ZCPA   F0
White Gaussian Noise  10         97.4   95.8
                      5          96.8   95.1
                      0          82.4   86.2
Op Room Noise         10         97.4   95.7
                      5          96.0   94.9
                      0          76.8   91.2
F16 Noise             10         97.3   95.1
                      5          96.9   95.1
                      0          83.5   87.5

Conclusion

• A model of binaural processing with TSFB has been suggested.

• Simulation results showed that binaural processing can be advantageous in noisy environments.

• The HRTF could degrade the performance of speech recognition.

• A new feature that combines binaural data will be investigated with respect to noise robustness.
