Applied Psychoacoustics
Lecture: Binaural Hearing

Jonas Braasch, Jens Blauert

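[Figure: coding and decoding in binaural hearing. A 3D acoustic scene (source signal s(t) with distance, azimuth, and elevation: 4D) is coded into the two eardrum signals sl(t) at the left ear and sr(t) at the right ear (2 x 1D) and decoded into a 3D auditory scene (4D?).]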

Types of Binaural Models

• Localization Models
• Detection Models
• Sound-Source Separation Models

• Pink Models
• Black-Box Models

Tasks to solve

1 What cues are available to localize a sound source?

2 How can we extract those cues in a Binaural Computational Model?

3 How can we calculate the position of the sound source from the extracted binaural cues?

Models regarding one sound source

1. What cues are available to localize a sound source?

Head-Related Coordinate System

These cues are available:

• Interaural cues (Rayleigh's duplex theory)
  – Interaural time differences (ITDs)
  – Interaural level differences (ILDs)

• Monaural cues
  – Spectral cues

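[Figure: head-related impulse response (HRIR) and the head-related transfer functions of the right and left ear, HRTF_R and HRTF_L.]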

Lateralization = intracranial
Localization = extracranial

Lateralization

figure from Jens Blauert

Interaural axis

Sideward deviation = 1-D task

How to generate ITDs and ILDs

• ITDs: delay lines
• ILDs: attenuators
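Delay lines and attenuators can be mimicked digitally with a sample delay and a gain factor. The following is a minimal sketch, not the setup used in the experiments: the function name `lateralize`, the sign conventions, and the rounding of the delay to whole samples are assumptions made for illustration.

```python
import math

def lateralize(mono, fs, itd_s=0.0, ild_db=0.0):
    """Return (left, right) sample lists for headphone presentation.

    Assumed conventions: positive itd_s delays the left channel
    (right ear leads); positive ild_db attenuates the left channel
    (right ear stronger).
    """
    delay = round(abs(itd_s) * fs)           # delay line, in whole samples
    gain = 10.0 ** (-abs(ild_db) / 20.0)     # attenuator, amplitude factor
    pad = [0.0] * delay
    delayed = pad + list(mono)               # channel that arrives later
    direct = list(mono) + pad                # channel that arrives first
    left, right = (delayed, direct) if itd_s > 0 else (direct, delayed)
    if ild_db > 0:
        left = [gain * x for x in left]
    elif ild_db < 0:
        right = [gain * x for x in right]
    return left, right

fs = 48000
tone = [math.sin(2 * math.pi * 500 * n / fs) for n in range(480)]
# 0.5 ms ITD (24 samples at 48 kHz) and 6 dB ILD, both favoring the right ear
left, right = lateralize(tone, fs, itd_s=0.0005, ild_db=6.0)
```

Rounding to whole samples limits the ITD resolution to 1/fs (about 21 µs at 48 kHz); finer steps would need fractional-delay interpolation.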


Lateralization Blur for ILDs

[Figure: lateralization blur vs. frequency for Gaussian tones and sinusoids.]

Lateralization blur = lateralization experiments
Minimum audible angle = localization experiments


ILD-induced Lateralization

[Figure: perceived sideways deviation (left to right) vs. interaural level difference (left stronger to right stronger) for broadband noise and a 600-Hz sinusoid (Sayers, 1964).]

Lateralization Blur for ITDs

[Figure: lateralization blur vs. carrier frequency (fcarrier) for ITDs in the carrier, the envelope, or the total signal; curves for pure tones and for Gaussian tones one frequency band wide.]

Gaussian tones: Gaussian-enveloped sinusoids of critical-band width

Envelope vs. Carrier Signals

ITD-induced Lateralization

[Figure: perceived sideways deviation (left to right) vs. interaural phase difference (left earlier to right earlier).]

Localization Curves

[Figure: two panels showing the direction of the auditory event φ as a function of the level difference (left louder) and of the time difference (left earlier).]


Tasks to solve

2. How can we extract those cues in a binaural computational model?

• Extracting ITDs
  – Jeffress model
  – Cross-correlation models

• Extracting ILDs
  – Excitation-inhibition cells

The Jeffress Model (1948)

Estimation of ITDs

[Figure: Jeffress model (1948) with left (L) and right (R) inputs; the resulting pattern corresponds to the interaural cross-correlation.]

Model Structure

[Figure: model structure. Sound sources are filtered by the left and right HRTFs (outer ear) and summed (+) per ear. Each ear signal passes through a bandpass filter bank and a hair-cell stage (H), which consists of half-wave rectification (HR) and a low-pass filter (LP). In each frequency band (1st, 2nd, ..., i-th, ..., n-th) the left and right signals are cross-correlated and remapped; the band outputs y1, y2, ..., yn feed a decision device.]

Cross-Correlation Models

Cherry (1959)

$$\Psi_Y(\tau) = \frac{1}{t_1 - t_0} \sum_{t=t_0}^{t_1} Y_l(t)\, Y_r(t+\tau)$$

Similarity to Jeffress' coincidence model:

[Figure: delay-line taps k, k+1, k+2, ..., k+8.]
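The cross-correlation above can be evaluated directly on sampled ear signals, with each lag playing the role of one coincidence cell on the delay line. The sketch below is an illustration under assumptions rather than the lecture's implementation: the function names, the 1 ms search range, and the white-noise test signal are all chosen here for the example.

```python
import random

def interaural_cross_correlation(yl, yr, max_lag):
    """Mean of yl[t] * yr[t + lag] for every lag in -max_lag..max_lag."""
    n = len(yl)
    t0, t1 = max_lag, n - max_lag            # keeps t + lag inside the signal
    result = {}
    for lag in range(-max_lag, max_lag + 1):
        acc = sum(yl[t] * yr[t + lag] for t in range(t0, t1))
        result[lag] = acc / (t1 - t0)
    return result

def estimate_itd(yl, yr, fs, max_itd_s=0.001):
    """Return the lag (in seconds) of the cross-correlation maximum.

    Positive values mean the right signal lags the left one.
    """
    max_lag = round(max_itd_s * fs)
    icc = interaural_cross_correlation(yl, yr, max_lag)
    best_lag = max(icc, key=icc.get)
    return best_lag / fs

fs = 48000
rng = random.Random(0)                       # fixed seed for repeatability
src = [rng.gauss(0.0, 1.0) for _ in range(2000)]
shift = 12                                   # right ear delayed by 12 samples
yl = src
yr = [0.0] * shift + src[:-shift]
itd = estimate_itd(yl, yr, fs)               # 12 / 48000 s = 0.25 ms
```

Broadband noise gives a single dominant peak; for narrow-band signals at high frequencies several peaks fall inside the search range and the maximum becomes ambiguous.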

Bandpass Filterbank

Fletcher (1940), Patterson et al. (1995)

[Figure: filter response [dB] (-60 to 10 dB) vs. frequency [Hz] (10^2 to 10^4 Hz) of the bandpass filter bank.]
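One common way to build such an auditory filter bank is with gammatone filters spaced on an ERB scale. The sketch below assumes the Glasberg and Moore ERB approximation and a fourth-order gammatone with bandwidth factor 1.019; none of these constants, nor the number of bands or the frequency range, is stated on the slides, so the band numbers here need not match those of the lecture's model.

```python
import math

def hz_to_erb_number(f_hz):
    """ERB-number scale (assumed Glasberg and Moore approximation)."""
    return 21.4 * math.log10(4.37e-3 * f_hz + 1.0)

def erb_number_to_hz(e):
    return (10.0 ** (e / 21.4) - 1.0) / 4.37e-3

def center_frequencies(f_lo, f_hi, n_bands):
    """Center frequencies equally spaced on the ERB-number scale."""
    e_lo, e_hi = hz_to_erb_number(f_lo), hz_to_erb_number(f_hi)
    step = (e_hi - e_lo) / (n_bands - 1)
    return [erb_number_to_hz(e_lo + k * step) for k in range(n_bands)]

def gammatone_ir(fc, fs, n_samples=1200, order=4):
    """Unnormalized gammatone impulse response for one band."""
    erb = 24.7 * (4.37e-3 * fc + 1.0)        # equivalent rectangular bandwidth
    b = 1.019 * erb                          # assumed bandwidth factor
    ir = []
    for n in range(n_samples):
        t = n / fs
        ir.append(t ** (order - 1) * math.exp(-2 * math.pi * b * t)
                  * math.cos(2 * math.pi * fc * t))
    return ir

fs = 48000
cfs = center_frequencies(100.0, 10000.0, 30)  # hypothetical range and count
ir = gammatone_ir(cfs[6], fs)                 # impulse response of one band
```

Filtering an ear signal with each impulse response (by convolution) yields the band signals that the later model stages operate on.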

Model Structure

[Figure: the model structure diagram, repeated: HRTFs (outer ear), bandpass filter bank, hair-cell behavior, cross-correlation, remapping, decision device.]

Blauert and Cobben (1978)

[Figure: test sound 1 (ff = 500 Hz). Time panel: rel. amplitude (-1.0 to 1.0) vs. time [s] (0 to 0.1 s). Frequency panel: level [dB] (-100 to 0 dB) vs. frequency [kHz] (0 to 12.5 kHz).]

[Figure: cross-correlation, band 7 (527 Hz): rel. amplitude vs. time (0 to 2 ms) for the left (L) and right (R) signals.]

Uncertainty in High Frequencies

[Figure: rel. amplitude vs. time (0 to 2 ms) for the left (L) and right (R) signals; the peaks that belong together are ambiguous ("? ? ?").]

[Figure: test sound 1 (ff = 500 Hz), modulated: rel. amplitude (-1.0 to 1.0) vs. time [s] (0 to 0.1 s).]

Model Structure

[Figure: the model structure diagram, repeated. The hair-cell stage (H) in each ear consists of half-wave rectification (HR) followed by a low-pass filter (LP).]
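The hair-cell stage, half-wave rectification followed by a low-pass filter, can be sketched in a few lines. The slides name only the two operations, so the first-order filter and the 800 Hz cutoff below are assumptions chosen for illustration.

```python
import math

def half_wave_rectify(x):
    """Keep positive half-waves, set negative samples to zero."""
    return [max(0.0, v) for v in x]

def one_pole_lowpass(x, fs, cutoff_hz):
    """First-order recursive low-pass with unity gain at 0 Hz."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y, state = [], 0.0
    for v in x:
        state = (1.0 - a) * v + a * state
        y.append(state)
    return y

fs = 48000
# 4 kHz tone: well above the cutoff, so mostly the mean level survives
tone = [math.sin(2 * math.pi * 4000 * n / fs) for n in range(4800)]
rectified = half_wave_rectify(tone)
smoothed = one_pole_lowpass(rectified, fs, cutoff_hz=800.0)
```

At low frequencies the output still follows the fine structure of the waveform, while at high frequencies it follows mainly the envelope, which is why envelope ITDs matter for high-frequency bands.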

Estimating ILDs using EI-cells

Reed and Blum (1990), Breebaart et al. (2001)

$$E(\alpha) = \exp\left(-\left(10^{\alpha/40}\, P_l - 10^{-\alpha/40}\, P_r\right)^2\right)$$

[Figure: EI model, band 25 (6281 Hz): EI-cell activity as a function of ILD for the left (L) and right (R) signals.]
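A bank of EI-cells can be read out by evaluating the formula above for a range of candidate ILDs and taking the most active cell. The sketch assumes that the exponent is negative, so that a cell responds maximally when its gain-adjusted left and right inputs cancel, and that Pl and Pr are amplitudes; the name `alpha` for the cell's characteristic ILD, the 0.5 dB grid, and the ±40 dB range are illustrative choices.

```python
import math

ILD_MAX = 40.0  # the constant 40 in the exponents of the formula

def ei_activity(alpha_db, p_left, p_right):
    """Activity of the EI-cell tuned to alpha_db (assumed negative exponent)."""
    diff = (10.0 ** (alpha_db / ILD_MAX) * p_left
            - 10.0 ** (-alpha_db / ILD_MAX) * p_right)
    return math.exp(-diff * diff)

def estimate_ild(p_left, p_right, step_db=0.5, span_db=40.0):
    """Return the alpha of the most active cell on a regular grid."""
    n = int(span_db / step_db)
    alphas = [k * step_db for k in range(-n, n + 1)]
    return max(alphas, key=lambda a: ei_activity(a, p_left, p_right))

# Right input twice as large as the left one: about 6.02 dB
p_left, p_right = 0.5, 1.0
alpha_hat = estimate_ild(p_left, p_right)   # nearest grid point: 6.0 dB
```

Under these assumptions the maximum lies at alpha = 20 log10(Pr / Pl), so the position of the most active cell encodes the ILD in dB.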

Tasks to solve

3. How can we calculate the position of the sound source from the extracted binaural cues?

[Figure: the model structure diagram, repeated: HRTFs (outer ear), bandpass filter bank, hair-cell behavior, cross-correlation, remapping, decision device.]


Remapping

[Figure: remapping for the directions 0°, 30°, and 90°; model based on EI-cells.]

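[Figure: the coding and decoding diagram, repeated. The 3D acoustic scene (signal s(t), distance, azimuth, elevation: 4D) is coded into the eardrum signals sl(t) and sr(t) (2 x 1D) and decoded into the 3D auditory scene (4D) by means of ITDs, ILDs, and monaural cues.]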

Head-related Coordinate System

[Figure: head-related coordinate system with the frontal, median, and horizontal planes. Forward: φ = 0°, δ = 0°; backward: φ = 180°, δ = 0°.]


Head-related transfer functions

[Figure: left and right head-related transfer functions vs. frequency [Hz], with the resulting interaural time differences [ms] and interaural level differences [dB] vs. frequency [Hz].]

Localization in the Median Plane

Blauert 1969/70: Monaural Cues

[Figure: directional bands (rel. judgement) and boosted bands (level differences) vs. frequency. Signal: 1/3-octave band noise.]

Localization of a single sound source

Types of accompanying sound sources

• Non-coherent sound sources
  – independent sound sources (e.g., street noise, concurrent speakers, accompanying musical instruments)

• Coherent sound sources
  – wall reflections
  – electronically processed sound sources (e.g., loudspeaker arrays)

Part I: Localization of a single sound source

Part II: Localization in the presence of a non-coherent sound source

Part III: Localization in the presence of coherent sound sources

Time Course

[Figure: time course of target and distracter, with intervals of 200 ms, 200 ms, and 100 ms.]

Methods

• Virtual auditory sound sources
• Individual HRTF
• 11 listeners, 10 repetitions
• Test sound and distracter:
  – noise (200 Hz to 14 kHz)
  – T/D-ratio 0 ... -15 dB
• GELP

Localization Results

Listener 6: anechoic condition

[Figure: perceived left/right [°] vs. presented left/right [°], both from -90° to 90°; legend: single source; SNR: 0 dB.]

Localization model

[Figure: localization model. Sound sources are filtered by the left and right HRTFs (outer ear) and summed (+) per ear. Each ear signal passes through a bandpass filter bank and half-wave rectification. In each frequency band (1st, 2nd, ..., i-th, ..., n-th) the left and right signals are cross-correlated and remapped; the band outputs y1, y2, ..., yn feed a decision device.]

Lateralization shifts at 0 dB T/D-ratio

[Figure: perceived left/right [°] vs. presented left/right [°], both from -90° to 90°, for the listeners; legend: single source; SNR: 0 dB.]

[Figure: running interaural cross-correlation, frequency band 5, with the distracter at 0 dB and at -10 dB; 60 dB.]

Localization Experiment

[Figure: two panels, each showing target and distracter.]

Arguments for the cross-correlation difference hypothesis

• Two noise bursts are perceived as one auditory event if their envelopes are identical and they overlap in spectrum. This can be observed even if the noise bursts have different spatial positions and are uncorrelated.

• The auditory event of the target depends strongly on the exposure time of the masker before the target onset.

• Existing models fail at very low SNRs.

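The interaural cross-correlation difference function

[Figure: time course (200 ms, 200 ms, 100 ms) with the distracter alone (D'), followed by the total signal A, which contains the target T and the distracter D.]

T = A - D

step 1: D' ≈ D

step 2: T' ≈ A - D'

[Figure: cross-correlation functions over time for signal at 30° and distracter at 0°. Panels: Distracter; Total signal; Total signal - Distracter; Target.]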

Lateralization shifts

Simulation using subtraction factor g:

T = A - g(t) D'

with a) g(0) = 0; b) g(x0) = 1

Meunier et al. (1996)
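The effect of the subtraction factor can be illustrated with toy cross-correlation functions. The slides fix only g(0) = 0 and g(x0) = 1, so the linear ramp in between, the reading of t as the distracter's exposure time, the value x0 = 0.2 s, and the single-peak functions are all assumptions made for this sketch.

```python
def subtraction_factor(t, x0):
    """Hypothetical linear ramp with g(0) = 0 and g(x0) = 1, then held at 1."""
    return min(max(t / x0, 0.0), 1.0)

def target_estimate(total, distracter, g):
    """T' = A - g * D', evaluated lag by lag."""
    return [a - g * d for a, d in zip(total, distracter)]

lags = list(range(-4, 5))
# Toy cross-correlation functions: distracter peak at lag 0, target at lag 2
distracter = [1.0 if lag == 0 else 0.0 for lag in lags]    # D'
target = [0.6 if lag == 2 else 0.0 for lag in lags]        # T
total = [t + d for t, d in zip(target, distracter)]        # A = T + D

peaks = []
for exposure in (0.0, 0.1, 0.2):
    g = subtraction_factor(exposure, x0=0.2)
    estimate = target_estimate(total, distracter, g)
    peaks.append((exposure, lags[estimate.index(max(estimate))]))
# peaks == [(0.0, 0), (0.1, 2), (0.2, 2)]
```

With g = 0 nothing is subtracted and the estimate peaks at the distracter position; as g approaches 1 the peak moves to the target position, which is the kind of shift the simulation is meant to capture.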

Including a detection algorithm

SNR=-15 dB

Conclusion

• The model is able to simulate localization and detection of broadband noise in broadband noise
• It allows localization at very low T/D-ratios
• The model explains a number of psychoacoustical phenomena (e.g., shifts of auditory events, clustering of responses)
• It can be extended to more than two sound sources

Part I: Localization of a single sound source

Part II: Localization in the presence of a non-coherent sound source

Part III: Localization in the presence of coherent sound sources

The precedence effect

(Blauert, 1983)

Time course

[Figure: lead and lag in the left and right channels over time; the lead carries ITD 1, the lag carries ITD 2, and the two are separated by the inter-stimulus interval (ISI).]

Methods

• Stimulus presentation via headphones
• Lead and lag pair:
  – Bandpass-filtered noise (500 Hz cf)
  – 100 Hz, 400 Hz, or 800 Hz frequency range
  – Lead: 300 µs ITD, lag: -300 µs, and vice versa
  – ISI 0, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5 ms
• 6 listeners
• Acoustic pointer

Psychoacoustical results

[Figure: revised precedence effect curve for narrow-band signals (Blauert & Braasch 2004): position of the auditory event between lead and lag vs. delay Δt of the lag speaker.]

ILD analysis

[Figure: ILD analysis vs. ISI [ms].]

Specialized Models

• Combining several cues
  – Centrality and straightness (Stern et al., 1988)
  – HRTF adjustment (Gaik-Lindemann, 1990)
  – Neural networks (e.g., Janko et al., 1996)

• Localizing more than one sound source
  – Contralateral inhibition (Lindemann, 1986)
  – Bayes classification (Nix, Hohmann, 1999)
  – Cross-correlation difference (Braasch, 2001)

Importance of Head Movements

Jongkees and van der Veer 1958

[Figure: azimuth angle φ of the auditory event vs. level differences between both loudspeaker signals; median values and variations between listeners.]

