Spherical Microphone Arrays for speech enhancement and dereverberation Etan Fisher Supervisor: Dr. Boaz Rafaely

Post on 21-Dec-2015


Nearfield Spherical Microphone Arrays for speech enhancement and dereverberation

Etan Fisher

Supervisor: Dr. Boaz Rafaely

Microphone Arrays

- Spatial sound acquisition
- Sound enhancement
- Applications: reverberation parameter estimation, dereverberation, video conferencing

Spheres

- The sphere as a symmetrical, natural entity.
- Spherical symmetry
- Facilitates direct sound field analysis: spherical Fourier transform, spherical harmonics

Photo by Aaron Logan

Nearfield Spherical Microphone Array

- Generally, the farfield, plane-wave assumption is made (Rafaely, Meyer & Elko).
- In the nearfield, the spherical wave-front must be accounted for.
- Examples: close-talk microphone, nearfield music recording, multiple speaker / video conferencing

Sound Pressure - Spherical Wave

Sound pressure on sphere r due to point source $\mathbf{r}_p$ (spherical wave), from the solution to the wave equation in spherical coordinates, with $\Omega = (\theta, \phi)$:

$$
p(k, r, \Omega) = a(k)\,\frac{e^{ik|\mathbf{r}-\mathbf{r}_p|}}{|\mathbf{r}-\mathbf{r}_p|}
= \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a(k)\, ik\, b_n(kr)\, h_n(kr_p)\, Y_n^{m*}(\theta_p, \phi_p)\, Y_n^{m}(\theta, \phi)
$$

Spherical harmonics:

$$
Y_n^{m}(\theta, \phi) = \sqrt{\frac{(2n+1)}{4\pi}\,\frac{(n-m)!}{(n+m)!}}\; P_n^{m}(\cos\theta)\, e^{im\phi}
$$

The spherical harmonics are orthogonal and complete.

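Sound Pressure - Spherical Wave

Sound pressure on sphere r due to point source $\mathbf{r}_p$:

$$
p(k, r, \Omega) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a(k)\, ik\, b_n(kr)\, h_n(kr_p)\, Y_n^{m*}(\theta_p, \phi_p)\, Y_n^{m}(\theta, \phi)
$$

$h_n(kr)$ is the spherical Hankel function.

$b_n(kr)$ is the modal frequency function (Bessel):

$$
b_n(kr) =
\begin{cases}
4\pi\, j_n(kr) & \text{open sphere} \\
4\pi \left( j_n(kr) - \dfrac{j_n'(ka)}{h_n'(ka)}\, h_n(kr) \right) & \text{rigid sphere}
\end{cases}
$$

where $a$ is the radius of the sphere.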
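As a numerical sanity check of the open-sphere expansion, the following Python sketch compares the closed-form point-source pressure with a truncated modal series. It is only an illustration under stated assumptions: unit amplitude a(k) = 1, the $e^{ik|\mathbf{r}-\mathbf{r}_p|}$ sign convention with the spherical Hankel function of the first kind (the slides do not state the kind), r < r_p, and the m-sum collapsed with the standard addition theorem for spherical harmonics. All function names and the test values are hypothetical.

```python
import cmath
import math

def sph_jn(n, x, terms=40):
    """Spherical Bessel j_n(x) via its ascending power series (adequate for small x)."""
    pref = 1.0
    for j in range(1, n + 1):
        pref *= x / (2 * j + 1)                      # builds x^n / (2n+1)!!
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= -(x * x / 2) / (k * (2 * n + 2 * k + 1))
        total += term
    return pref * total

def sph_hn1(nmax, x):
    """Spherical Hankel functions h_0..h_nmax at x (first kind is an assumption)."""
    h = [-1j * cmath.exp(1j * x) / x, -cmath.exp(1j * x) * (x + 1j) / x**2]
    for n in range(1, nmax):
        h.append((2 * n + 1) / x * h[n] - h[n - 1])  # upward recurrence
    return h[: nmax + 1]

def legendre(nmax, x):
    """Legendre polynomials P_0..P_nmax at x (Bonnet recurrence)."""
    P = [1.0, x]
    for n in range(1, nmax):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return P[: nmax + 1]

def pressure_direct(k, r, r_p, cos_theta):
    """Closed form e^{ik|r - r_p|} / |r - r_p| with a(k) = 1."""
    dist = math.sqrt(r * r + r_p * r_p - 2 * r * r_p * cos_theta)
    return cmath.exp(1j * k * dist) / dist

def pressure_series(k, r, r_p, cos_theta, nmax=25):
    """Modal series truncated at nmax, open sphere: b_n = 4*pi*j_n(kr)."""
    h = sph_hn1(nmax, k * r_p)
    P = legendre(nmax, cos_theta)
    total = 0j
    for n in range(nmax + 1):
        b_n = 4 * math.pi * sph_jn(n, k * r)
        # sum over m of Y* Y equals (2n+1)/(4*pi) * P_n(cos Theta)
        total += b_n * h[n] * (2 * n + 1) / (4 * math.pi) * P[n]
    return 1j * k * total

# Illustrative values: microphone at r = 0.1 m, source at r_p = 0.3 m.
k, r, r_p, cos_theta = 10.0, 0.1, 0.3, math.cos(0.7)
error = abs(pressure_direct(k, r, r_p, cos_theta) - pressure_series(k, r, r_p, cos_theta))
print(error)
```

The power series for j_n is used here only because it stays accurate for the small kr of this example; a library routine would be the more usual choice for larger arguments.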

Spherical Spectrum Functions: $b_n(kr)$ and $h_n(kr)$ [figure]

Spherical Spectrum Functions: $b_n(kr)\, h_n(kr)$ [figure]

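Point Source Decomposition

Sound pressure on sphere r due to point source $\mathbf{r}_p$:

$$
p(k, r, \Omega) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a(k)\, ik\, b_n(kr)\, h_n(kr_p)\, Y_n^{m*}(\Omega_p)\, Y_n^{m}(\Omega)
$$

Spherical Fourier transform:

$$
p_{nm}(k, r) = \int_{\Omega \in S^2} p(k, r, \Omega)\, Y_n^{m*}(\Omega)\, d\Omega
= a(k)\, ik\, b_n(kr)\, h_n(kr_p)\, Y_n^{m*}(\Omega_p)
$$

Spatial filter: cancel the spherical wave-front, yielding unit amplitude at $r_p = r_0$:

$$
w_{nm}(k, r) = \frac{p_{nm}(k, r)}{ik\, b_n(kr)\, h_n(kr_0)}
= a(k)\, \frac{h_n(kr_p)}{h_n(kr_0)}\, Y_n^{m*}(\Omega_p)
$$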
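To make the effect of the spatial filter more concrete, the sketch below evaluates the per-order radial term $h_n(kr_p)/h_n(kr_0)$ that remains after the division. It assumes the spherical Hankel function of the first kind (not stated on the slides); the values k = 4 rad/m (kmax/10 for the N = 4 array) and r_0 = 0.2 m are taken from elsewhere in the deck, while r_p = 0.4 m and the function names are purely illustrative.

```python
import cmath

def sph_hn1(nmax, x):
    """Spherical Hankel functions h_0..h_nmax at x (first kind is an assumption)."""
    h = [-1j * cmath.exp(1j * x) / x, -cmath.exp(1j * x) * (x + 1j) / x**2]
    for n in range(1, nmax):
        h.append((2 * n + 1) / x * h[n] - h[n - 1])  # upward recurrence
    return h[: nmax + 1]

def radial_filter(k, r_p, r_0, order):
    """Per-order ratio h_n(k r_p) / h_n(k r_0) for n = 0..order."""
    h_p = sph_hn1(order, k * r_p)
    h_0 = sph_hn1(order, k * r_0)
    return [hp / h0 for hp, h0 in zip(h_p, h_0)]

# Source twice as far away as the focus distance r_0.
gains = [abs(g) for g in radial_filter(k=4.0, r_p=0.4, r_0=0.2, order=4)]
for n, g in enumerate(gains):
    print(n, g)
```

For n = 0 the magnitude reduces to r_0/r_p, because |h_0(x)| = 1/x. In this low-frequency example the higher orders attenuate the farther source more strongly, which is the radial selectivity the filter is meant to provide.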

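Point Source Decomposition

Amplitude density:

$$
w(k, \Omega_0) = \sum_{n=0}^{\infty} \sum_{m=-n}^{n} a(k)\, \frac{h_n(kr_p)}{h_n(kr_0)}\, Y_n^{m*}(\Omega_p)\, Y_n^{m}(\Omega_0)
$$

Using the identity:

$$
\sum_{m=-n}^{n} Y_n^{m*}(\Omega_p)\, Y_n^{m}(\Omega) = \frac{2n+1}{4\pi}\, P_n(\cos\Theta)
$$

where Θ is the angle between Ω and Ωp (here Ω = Ω0),

$$
w(k, \Omega_0) = \sum_{n=0}^{\infty} a(k)\, \frac{h_n(kr_p)}{h_n(kr_0)}\, \frac{2n+1}{4\pi}\, P_n(\cos\Theta)
$$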
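The amplitude density can be evaluated directly from its Legendre form. The following is a minimal sketch, assuming the sum is truncated at the array order N, a(k) = 1, and the spherical Hankel function of the first kind; the function names and the chosen values of k, r_0 and the source positions are hypothetical.

```python
import cmath
import math

def sph_hn1(nmax, x):
    """Spherical Hankel functions h_0..h_nmax at x (first kind is an assumption)."""
    h = [-1j * cmath.exp(1j * x) / x, -cmath.exp(1j * x) * (x + 1j) / x**2]
    for n in range(1, nmax):
        h.append((2 * n + 1) / x * h[n] - h[n - 1])  # upward recurrence
    return h[: nmax + 1]

def legendre(nmax, x):
    """Legendre polynomials P_0..P_nmax at x (Bonnet recurrence)."""
    P = [1.0, x]
    for n in range(1, nmax):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return P[: nmax + 1]

def amplitude_density(k, r_p, r_0, theta, order, a=1.0):
    """w(k, Omega_0) truncated at `order`, for a source at distance r_p
    and angle theta away from the look direction Omega_0."""
    h_p = sph_hn1(order, k * r_p)
    h_0 = sph_hn1(order, k * r_0)
    P = legendre(order, math.cos(theta))
    return sum(a * h_p[n] / h_0[n] * (2 * n + 1) / (4 * math.pi) * P[n]
               for n in range(order + 1))

N, r_0, k = 4, 0.2, 10.0
on_target = amplitude_density(k, r_0, r_0, 0.0, N)          # source at the focus point
farther = amplitude_density(k, 2 * r_0, r_0, 0.0, N)        # same direction, twice the distance
off_axis = amplitude_density(k, r_0, r_0, math.pi / 2, N)   # same distance, 90 degrees away
print(abs(on_target), abs(farther), abs(off_axis))
```

With the source at the focus point every radial ratio equals one, so the truncated sum is (N+1)²/(4π); moving the source away in distance or in angle lowers the magnitude, which corresponds to the radial and angular attenuation discussed in the deck.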

Nearfield Criteria

- N: order of array
- k: wave number
- rA: array radius
- rs: source distance

Radial Attenuation [figure]

- N = 4; rA (array) = 0.1 m; k = kmax
- kmax = N/rA = 40 rad/m
- kmax = 2π fmax / c, with c = 343 m/s
- fmax = 2184 Hz
- r0: desired source location
- rp: interference location
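The band limit quoted for these plots follows from kmax = N/rA and kmax = 2π fmax / c. A short sketch of that arithmetic, assuming c = 343 m/s as on the slides (the function names are illustrative), covers the three array configurations that appear in the deck:

```python
import math

C_SOUND = 343.0  # speed of sound in m/s, the value used on the slides

def k_max(order, radius):
    """Highest usable wave number for an array of given order and radius: N / rA."""
    return order / radius

def f_max(order, radius, c=C_SOUND):
    """Frequency corresponding to k_max: f = k * c / (2 * pi)."""
    return k_max(order, radius) * c / (2 * math.pi)

# (N, rA) pairs taken from the figure parameters in the deck.
for order, radius in [(4, 0.1), (2, 0.05), (12, 0.3)]:
    print(order, radius, k_max(order, radius), round(f_max(order, radius)))
```

All three configurations share the same ratio N/rA, so each gives kmax of about 40 rad/m and fmax of about 2184 Hz.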

The remaining plots use the same kmax = N/rA = 40, fmax = 2184 Hz, r0 (desired source location) and rp (interference location), and vary only the parameters listed:

- Radial Attenuation: N = 4; rA (array) = 0.1 m; k = kmax/4
- Radial Attenuation: N = 4; rA (array) = 0.1 m; k = kmax/10
- Radial Attenuation – “Close Talk”: N = 2; rA (array) = 0.05 m; k = kmax
- Radial Attenuation – “Close Talk”: N = 2; rA (array) = 0.05 m; k = kmax/4
- Radial Attenuation – Large Array: N = 12; rA (array) = 0.3 m; k = kmax/4
- Normalized Beampattern: N = 4; rA (array) = 0.1 m; k = kmax
- Normalized Beampattern: N = 4; rA (array) = 0.1 m; k = kmax/4
- Normalized Beampattern: N = 4; rA (array) = 0.1 m; k = kmax/10

In the normalized beampatterns, the natural radial attenuation has been cancelled by multiplying the array output by the distance.

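Directional Impulse Response

Amplitude density:

$$
w(k, \Omega_0) = \sum_{n=0}^{\infty} a(k)\, \frac{h_n(kr_p)}{h_n(kr_0)}\, \frac{2n+1}{4\pi}\, P_n(\cos\Theta)
$$

Impulse response at direction Ω0:

$$
w(t) = \mathcal{F}^{-1}\{ w(k, \Omega_0) \}
$$

where $\mathcal{F}^{-1}$ is the ordinary inverse Fourier transform.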
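One way to approximate the directional impulse response numerically is to sample the amplitude density on a uniform frequency grid and apply an inverse discrete Fourier transform. The sketch below is only an illustration under several assumptions not stated on the slides: a(k) = 1, truncation at order N, the first-kind spherical Hankel function with an $e^{-i\omega t}$ synthesis, the low-frequency limit $(r_0/r_p)^{n+1}$ for the k = 0 bin, and hypothetical values for the sampling rate and transform length.

```python
import cmath
import math

def sph_hn1(nmax, x):
    """Spherical Hankel functions h_0..h_nmax at x (first kind is an assumption)."""
    h = [-1j * cmath.exp(1j * x) / x, -cmath.exp(1j * x) * (x + 1j) / x**2]
    for n in range(1, nmax):
        h.append((2 * n + 1) / x * h[n] - h[n - 1])  # upward recurrence
    return h[: nmax + 1]

def legendre(nmax, x):
    """Legendre polynomials P_0..P_nmax at x (Bonnet recurrence)."""
    P = [1.0, x]
    for n in range(1, nmax):
        P.append(((2 * n + 1) * x * P[n] - n * P[n - 1]) / (n + 1))
    return P[: nmax + 1]

def amplitude_density(k, r_p, r_0, theta, order):
    """w(k, Omega_0) truncated at `order`, with a(k) = 1."""
    P = legendre(order, math.cos(theta))
    if k == 0.0:
        # small-argument limit of h_n(k r_p) / h_n(k r_0)
        ratios = [(r_0 / r_p) ** (n + 1) for n in range(order + 1)]
    else:
        h_p, h_0 = sph_hn1(order, k * r_p), sph_hn1(order, k * r_0)
        ratios = [hp / h0 for hp, h0 in zip(h_p, h_0)]
    return sum(ratios[n] * (2 * n + 1) / (4 * math.pi) * P[n]
               for n in range(order + 1))

def directional_ir(r_p, r_0, theta, order, fs=4000.0, n_fft=64, c=343.0):
    """Real impulse response of length n_fft from bins 0..n_fft/2 of the spectrum."""
    half = n_fft // 2
    spec = [amplitude_density(2 * math.pi * (j * fs / n_fft) / c, r_p, r_0, theta, order)
            for j in range(half + 1)]
    ir = []
    for t in range(n_fft):
        acc = spec[0].real + spec[half].real * (-1) ** t
        for j in range(1, half):
            # e^{-i omega t} synthesis, paired with the assumed e^{+ikr} propagation
            acc += 2 * (spec[j] * cmath.exp(-2j * math.pi * j * t / n_fft)).real
        ir.append(acc / n_fft)
    return ir

# Source exactly at the focus point (r_p = r_0, on axis), N = 4, r_0 = 0.2 m.
ir_focus = directional_ir(r_p=0.2, r_0=0.2, theta=0.0, order=4)
print(ir_focus[0], max(abs(v) for v in ir_focus[1:]))
```

With the source at the focus point every frequency bin equals (N+1)²/(4π), so the response collapses to a single sample at t = 0; other source positions give a delayed and attenuated response whose exact shape depends on the conventions assumed above.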

Speech Dereverberation

Room IR vs. Directional IR [figure]

- {4 X 3 X 2}
- N = 4
- r = 0.1 m
- r0 = 0.2 m
- “Dry” / “Rev.” / “Derev.”

Music Dereverberation

Room IR vs. Directional IR [figure]

- {8 X 6 X 3}
- N = 4
- r = 0.1 m
- r0 = 1.9 m
- “Dry” / “Rev.” / “Derev.”

Conclusions

- Spherical wave pressure on a spherical microphone array in spherical coordinates.
- Point source decomposition achieves radial attenuation as well as angular attenuation.
- Directional impulse response (IR) vs. room IR.
- Speech and music dereverberation.

Further work:

- Develop optimal beamformer
- Experimental study of array