applied psychoacoustics lecture: binaural hearing jonas braasch jens blauert
TRANSCRIPT
Applied Psychoacoustics Lecture: Binaural Hearing
Jonas Braasch Jens Blauert
[Diagram: 3D acoustic scene (signal s(t): distance, azimuth, elevation) → coding → eardrum signals s_l(t) (left ear) and s_r(t) (right ear) → decoding → 3D auditory scene]
4D → 2 × 1D → 4D?
Types of Binaural Models
• Localization Models
• Detection Models
• Sound-Source Separation Models
• Pink Models
• Black-Box Models
Tasks to solve
1 What cues are available to localize a sound source?
2 How can we extract those cues in a Binaural Computational Model?
3 How can we calculate the position of the sound source from the extracted binaural cues?
Models regarding one sound source
1. What cues are available to localize a sound source?
Head-Related Coordinate System
These cues are available:
• Interaural cues
– Interaural Time Differences (ITDs)
– Interaural Level Differences (ILDs)
• Monaural cues
– Spectral cues
Rayleigh's Duplex Theory
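[Figure: head-related impulse responses (HRIR) for both ears and the corresponding transfer functions HRTF_R and HRTF_L]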
Lateralization = intracranial
Localization = extracranial
Lateralization
[Figure from Jens Blauert: interaural axis]
Sideward deviation = 1-D task
How to generate ITDs and ILDs
[Figure from Jens Blauert: delay lines generate ITDs, attenuators generate ILDs]
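The delay-line and attenuator idea can be sketched in a few lines. This is a minimal illustration, not the setup used for the figures: the function name `apply_itd_ild`, the sampling rate, and the sign conventions (positive values mean left earlier and left stronger) are assumptions made for this example.

```python
import numpy as np

def apply_itd_ild(signal, fs, itd_s, ild_db):
    """Return (left, right) channels for a mono signal.

    Assumed conventions (not from the slides):
    positive itd_s  -> right channel delayed (left earlier),
    positive ild_db -> right channel attenuated (left stronger).
    """
    delay = int(round(abs(itd_s) * fs))              # delay line length in samples
    early = np.concatenate([signal, np.zeros(delay)])
    late = np.concatenate([np.zeros(delay), signal])
    left, right = (early, late) if itd_s >= 0 else (late, early)

    gain = 10.0 ** (-abs(ild_db) / 20.0)             # attenuator gain (linear)
    if ild_db >= 0:
        right = right * gain
    else:
        left = left * gain
    return left, right

fs = 48000                                           # assumed sampling rate in Hz
t = np.arange(int(0.05 * fs)) / fs
tone = np.sin(2 * np.pi * 500 * t)                   # 500-Hz sinusoid
left, right = apply_itd_ild(tone, fs, itd_s=0.0005, ild_db=6.0)
```

Presented over headphones, such a pair would be expected to produce a lateralized (intracranial) auditory event shifted toward the earlier and stronger side.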
Lateralization Blur for ILDs
[Figure from Jens Blauert: lateralization blur vs. frequency, for Gaussian tones and sinusoids]
Lateralization blur = lateralization experiments
Minimal audible angle = localization experiments
ILD-induced Lateralization
[Figure: perceived sideways deviation (left/right) vs. interaural level differences (left stronger/right stronger), for broadband noise and a 600-Hz sinusoid (Sayers, 1964)]
Lateralization Blur for ITDs
[Figure from Jens Blauert: lateralization blur vs. carrier frequency f_carrier, evaluated for envelope or carrier; curves: carrier, envelope, total signal, pure tones, critical-band-wide Gaussian tones]
Gaussian tones: Gaussian-enveloped sinusoids of critical bandwidth
Envelope vs. Carrier Signals
ITD-induced Lateralization
[Figure from Jens Blauert: perceived sideways deviation (left/right) vs. interaural phase differences (left earlier/right earlier)]
Localization Curves
[Figure from Jens Blauert: direction of the auditory event φ vs. level difference (left louder) and vs. time difference (left earlier)]
How can we extract those cues in a Binaural Computational Model?
• Extracting ITDs
– Jeffress Model
– Cross-Correlation Models
• Extracting ILDs
– Excitation-Inhibition cells
The Jeffress Model (1948)
Estimation of ITDs
[Animation: signals from the left (L) and right (R) ear travel along opposing delay lines and meet at coincidence cells; the final frame is labeled "Interaural cross correlation"]
Model Structure
[Block diagram: sound sources → outer ear (HRTF l and HRTF r per source, summed per ear) → bandpass filter bank (1st, 2nd, ..., i-th, ..., n-th frequency band) → hair-cell behavior (H) → cross-correlation → remapping (R, M) → decision device; band outputs y1, y2, ..., yn]
Hair-cell behavior (H): halfwave rectification (HR) followed by a lowpass filter (LP)
Cross-Correlation Models
Cherry (1959)
$\Psi(\tau) = \frac{1}{t_1 - t_0} \sum_{t=t_0}^{t_1} Y_l(t)\, Y_r(t+\tau)$
Similarity to Jeffress' Coincidence Model:
[Figure: delay-line taps k, k+1, k+2, ..., k+8]
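The cross-correlation model averages the product of the left signal and the delayed right signal over the window from t0 to t1, and the lag with the largest value serves as the ITD estimate. The sketch below illustrates this for a single band. The sampling rate, the white-noise test signal, and the function name `interaural_cross_correlation` are assumptions for this example, and no filter bank or hair-cell stage is included.

```python
import numpy as np

def interaural_cross_correlation(y_left, y_right, max_lag):
    """Psi(tau): mean over t of y_left(t) * y_right(t + tau), tau in samples."""
    lags = np.arange(-max_lag, max_lag + 1)
    n = len(y_left)
    core = slice(max_lag, n - max_lag)       # keeps t + tau inside the signal
    psi = np.array([
        np.mean(y_left[core] * np.roll(y_right, -lag)[core]) for lag in lags
    ])
    return lags, psi

fs = 48000                                   # assumed sampling rate in Hz
rng = np.random.default_rng(0)
source = rng.standard_normal(4800)           # 100 ms of white noise
delay = 12                                   # right ear lags by 12 samples
y_left = source
y_right = np.roll(source, delay)             # circular shift; edge effect is negligible here

lags, psi = interaural_cross_correlation(y_left, y_right, max_lag=48)
itd_estimate = lags[np.argmax(psi)] / fs     # lag of the peak, in seconds
print(f"estimated ITD: {itd_estimate * 1e6:.0f} microseconds")
```

Each lag plays the role of one coincidence cell in the Jeffress picture: the cell whose internal delay compensates the external ITD responds most strongly.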
Bandpass Filterbank
Fletcher (1940), Patterson et al. (1995)
[Figure: filter response [dB], -60 to 10 dB, vs. frequency [Hz], 10^2 to 10^4 Hz]
Model Structure
[Block diagram repeated]
Blauert and Cobben (1978)
[Figure: Test sound 1 (f = 500 Hz): spectrum (dB vs. frequency [kHz]) and waveform (rel. amplitude vs. time [s])]
[Figure: time-frequency representation; cross-correlation in band 7 (527 Hz) and band 11 (3809 Hz)]
Uncertainty in High Frequencies
[Figure: left (L) and right (R) signals, rel. amplitude vs. time [ms], 0 to 2 ms; several candidate alignments between L and R are marked "? ? ?"]
[Figure: Test sound 1 (f = 500 Hz, modulated): waveform, rel. amplitude vs. time [s]]
Model Structure
[Block diagram repeated; the hair-cell stage (H) consists of halfwave rectification (HR) followed by a lowpass filter (LP)]
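The hair-cell stage, halfwave rectification followed by a lowpass filter, can be sketched as below. The slides do not state the filter type or cutoff, so the first-order lowpass and the 1-kHz cutoff are assumptions chosen for illustration, as is the function name `hair_cell`.

```python
import numpy as np

def hair_cell(band_signal, fs, cutoff_hz=1000.0):
    """Halfwave rectification followed by a one-pole lowpass (assumed form)."""
    rectified = np.maximum(band_signal, 0.0)       # halfwave rectification (HR)
    a = np.exp(-2.0 * np.pi * cutoff_hz / fs)      # one-pole coefficient
    out = np.empty_like(rectified)
    state = 0.0
    for i, x in enumerate(rectified):              # lowpass filter (LP), unity DC gain
        state = (1.0 - a) * x + a * state
        out[i] = state
    return out

fs = 48000                                         # assumed sampling rate in Hz
t = np.arange(fs) / fs
tone = np.sin(2 * np.pi * 4000 * t)                # 4-kHz carrier, above the cutoff
env = hair_cell(tone, fs)
```

For a carrier well above the cutoff, the output is dominated by the slowly varying envelope rather than the fine structure, which is why cross-correlation in high-frequency bands operates mainly on envelopes.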
Cross-correlation, band 21 (3809 Hz)
Estimating ILDs using EI-cells
Reed and Blum (1990), Breebaart et al. (2001)
$E(\alpha) = \exp\left(\left(10^{\alpha/40} P_l - 10^{-\alpha/40} P_r\right)^2\right)$
[Figure: EI-cell activity as a function of ILD for left (L) and right (R) inputs; EI model, band 25 (6281 Hz)]
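The EI-cell expression can be evaluated over a bank of cells, each tuned to a different level difference α in dB, with inputs P_l and P_r from the left and right ear. In the form given on the slide, the exponent vanishes where the two weighted inputs cancel, so the sketch below reads the ILD off the cell with minimal activity. Treating P_l and P_r as amplitude-like quantities, the 0.5-dB grid of α values, and the function names are assumptions for this example.

```python
import numpy as np

def ei_activity(p_left, p_right, alphas_db):
    """E(alpha) = exp((10^(alpha/40) * P_l - 10^(-alpha/40) * P_r)^2)."""
    weighted_left = 10.0 ** (alphas_db / 40.0) * p_left
    weighted_right = 10.0 ** (-alphas_db / 40.0) * p_right
    return np.exp((weighted_left - weighted_right) ** 2)

def estimate_ild(p_left, p_right, alphas_db=None):
    """Return the alpha (in dB) of the EI cell with minimal activity."""
    if alphas_db is None:
        alphas_db = np.arange(-40.0, 40.5, 0.5)    # assumed bank of EI cells
    activity = ei_activity(p_left, p_right, alphas_db)
    return alphas_db[np.argmin(activity)]

# Right input twice as large as the left one: about 6 dB level difference.
ild = estimate_ild(p_left=0.1, p_right=0.2)
print(ild)
```

With amplitude-like inputs the minimum lies at α = 20·log10(P_r/P_l), so in this sketch positive values mean that the right ear is stronger.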
3. How can we calculate the position of the sound source from the extracted binaural cues?
Model Structure
[Block diagram repeated]
Remapping
[Figure: source directions 0°, 30°, 90°]
Model based on EI-cells
[Diagram repeated: 3D acoustic scene (signal s(t): distance, azimuth, elevation) → coding → eardrum signals s_l(t) and s_r(t) → decoding → 3D auditory scene]
4D → 2 × 1D → 4D, decoded using ITDs, ILDs, and monaural cues
Head-related Coordinate System
[Figure from Jens Blauert: frontal plane, median plane, horizontal plane; forward: φ = 0°, δ = 0°; backward: φ = 180°, δ = 0°]
Head-related transfer functions
[Figures: left and right HRTFs vs. frequency [Hz]; interaural time differences [ms] vs. frequency [Hz]; interaural level differences [dB] vs. frequency [Hz]]
Localization in the Median Plane
Blauert 1969/70: Monaural Cues
[Figure: directional bands and boosted bands; signal: 1/3-octave band noise; axes: level differences, rel. judgement]
Localization of a single sound source
Types of accompanying sound sources
• Non-coherent sound sources
– independent sound sources (e.g., street noise, concurrent speakers, accompanying musical instruments)
• Coherent sound sources
– wall reflections
– electronically processed sound sources (e.g., loudspeaker arrays)
Part I: Localization of a single sound source
Part II: Localization in the presence of a non-coherent sound source
Part III: Localization in the presence of coherent sound sources
Time Course
[Figure: target and distracter over time; segment durations 200 ms, 200 ms, 100 ms]
Methods
• Virtual auditory sound sources
• Individual HRTF
• 11 listeners, 10 repetitions
• Test sound and distracter:
– noise (200 Hz - 14 kHz)
– T/D-ratio 0 ... -15 dB
• GELP
Localization Results
Listener 6: anechoic condition
[Figure: perceived left/right angle [°] vs. presented left/right angle [°], -90° to 90°; conditions: single source and SNR: 0 dB]
Localization model
[Block diagram: sound sources → outer ear (HRTF l and HRTF r per source, summed per ear) → bandpass filter bank (1st, 2nd, ..., i-th, ..., n-th frequency band) → halfwave rectification → cross-correlation → remapping (R, M) → decision device; band outputs y1, y2, ..., yn]
Lateralization shifts at 0 dB T/D-ratio
[Figure: perceived left/right angle [°] vs. presented left/right angle [°], -90° to 90°; single source and SNR: 0 dB; listeners; panel labels: distracter, 0 dB, -10 dB, 60 dB]
[Figure: running interaural cross-correlation, frequency band 5, shown alongside the localization experiment; target and distracter marked]
Arguments for the cross-correlation difference hypothesis
• Two noise bursts are perceived as one auditory event if their envelopes are identical and they overlap in spectrum. This can be observed even if the noise bursts have different spatial positions and are uncorrelated.
• The auditory event of the target depends strongly on the exposure time of the masker before the target onset.
• Existing models fail at very low SNRs.
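The interaural cross-correlation difference function
[Figure: time course with distracter segments D' and D (200 ms, 200 ms) and target T (100 ms); signal at 30°, distracter at 0°; curves over time: distracter, total signal, total signal minus distracter, target]
T = A - D
step 1: D' ≈ D
step 2: T' ≈ A - D'
(T: target, D: distracter, A: total signal)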
Lateralisation shifts
Simulation using subtraction factor g:
T = A - g(t) D'
with a) g(0) = 0; b) g(x0) = 1
Meunier et al. (1996)
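The subtraction step can be illustrated with a minimal sketch: the cross-correlation function of the distracter alone (D', taken from the interval before target onset) is subtracted from that of the total signal (A), and the remaining peak indicates the target. The white-noise signals, the -10 dB T/D-ratio, the constant factor g = 1, the lag values, and all function names are assumptions for this example; the time-dependent g(t) is not modeled.

```python
import numpy as np

def icc(left, right, max_lag):
    """Cross-correlation: mean over t of left(t) * right(t + tau)."""
    lags = np.arange(-max_lag, max_lag + 1)
    core = slice(max_lag, len(left) - max_lag)     # keeps t + tau inside the signal
    psi = np.array([np.mean(left[core] * np.roll(right, -lag)[core]) for lag in lags])
    return lags, psi

def binaural(source, delay):
    """Left/right pair in which the right channel lags by `delay` samples."""
    return source, np.roll(source, delay)

rng = np.random.default_rng(1)
n, max_lag = 40000, 20
target = 10.0 ** (-10.0 / 20.0) * rng.standard_normal(n)   # T/D-ratio of -10 dB
distracter = rng.standard_normal(2 * n)

dp_l, dp_r = binaural(distracter[:n], 0)     # D': distracter alone, before target onset
d_l, d_r = binaural(distracter[n:], 0)       # D: distracter during the target
t_l, t_r = binaural(target, 8)               # target with an ITD of 8 samples
a_l, a_r = d_l + t_l, d_r + t_r              # A: total signal

lags, psi_a = icc(a_l, a_r, max_lag)
_, psi_d_est = icc(dp_l, dp_r, max_lag)
g = 1.0                                      # constant subtraction factor (assumed)
psi_t_est = psi_a - g * psi_d_est            # T' = A - g * D'

peak_total = lags[np.argmax(psi_a)]          # dominated by the distracter
peak_target = lags[np.argmax(psi_t_est)]     # peak remaining after subtraction
```

In this setup the peak of the total signal sits at the distracter lag, while the difference function peaks at the target lag, which is the behavior the hypothesis relies on at low T/D-ratios.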
Including a detection algorithm
SNR=-15 dB
Conclusion
• The model is able to simulate localization and detection of broadband noise in broadband noise
• It allows localization at very low T/D-ratios
• The model explains a number of psychoacoustical phenomena (e.g., shifts of auditory events, clustering of responses)
• It can be extended to more than two sound sources
Part I: Localization of a single sound source
Part II: Localization in the presence of a non-coherent sound source
Part III: Localization in the presence of coherent sound sources
The precedence effect
(Blauert, 1983)
Time course
[Figure: left and right channel over time; the lead carries ITD 1 and the lag carries ITD 2, separated by the ISI]
Methods
• Stimulus presentation via headphones
• Lead and lag pair:
– Bandpass-filtered noise (500 Hz center frequency)
– 100 Hz, 400 Hz or 800 Hz bandwidth
– Lead: 300 µs ITD, lag: -300 µs, and vice versa
– ISI 0, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.5 ms
• 6 listeners
• Acoustic pointer
Psychoacoustical results
[Figure: lead and lag as a function of the delay Δt of the lag speaker]
Revised precedence effect curve for narrow-band signals
Blauert & Braasch 2004
ILD analysis
[Figure: results plotted against ISI [ms]]
Specialized Models
• Combining Several Cues
– Centrality and Straightness (Stern et al., 1988)
– HRTF-adjustment (Gaik-Lindemann, 1990)
– Neural Networks (e.g., Janko et al., 1996)
• Localizing more than one sound source
– Contralateral Inhibition (Lindemann, 1986)
– Bayes Classification (Nix, Hohmann, 1999)
– Cross-Correlation Difference (Braasch, 2001)
Importance of Head movements
Jongkees and van de Veer (1958)
[Figure from Jens Blauert: azimuth angle φ of the auditory event vs. level differences between both loudspeaker signals; median values and variations between listeners]