speech enhancement using spectral subtraction and cascaded-median based noise estimation

Slide 1

Speech Enhancement Using Spectral Subtraction and Cascaded-Median Based Noise Estimation for Hearing Impaired Listeners

[Ref: S. K. Waddi, P. C. Pandey, N. Tiwari, Proc. NCC 2013, Delhi, 15-17 Feb. 2013, Paper 3.2_2_1569696063]

[email protected] Dept., IIT Bombay

Overview
- Introduction
- Signal Processing for Spectral Subtraction
- Implementation for Real-time Processing
- Test Results
- Summary & Conclusion

1. Introduction

Sensorineural hearing loss
- Increased hearing thresholds and high-frequency loss
- Decreased dynamic range & abnormal loudness growth
- Reduced speech perception due to increased spectral & temporal masking
- Decreased speech intelligibility in noisy environments

Signal processing in hearing aids
- Frequency selective amplification
- Automatic volume control
- Multichannel dynamic range compression (settable attack time, release time, and compression ratios)

Processing for reducing the effect of increased spectral masking in sensorineural loss
- Binaural dichotic presentation (Lunner et al. 1993, Kulkarni et al. 2012)
- Spectral contrast enhancement (Yang et al. 2003)
- Multiband frequency compression (Arai et al. 2004, Kulkarni et al. 2012)

Techniques for reducing the background noise
- Directional microphone
- Adaptive filtering (a second microphone needed for noise reference)
- Single-channel noise suppression using spectral subtraction (Boll 1979, Berouti et al. 1979, Martin 1994, Loizou 2007, Lu & Loizou 2008, Paliwal et al. 2010)

Processing steps
- Dynamic estimation of non-stationary noise spectrum
  - During non-speech segments using voice activity detection
  - Continuously using statistical techniques
- Estimation of noise-free speech spectrum
  - Spectral noise subtraction
  - Multiplication by noise suppression function
- Speech resynthesis (using enhanced magnitude and noisy phase)

Research objective
Real-time single-input speech enhancement for use in hearing aids and other sensory aids (cochlear prostheses, etc.) for hearing-impaired listeners

Main challenges
- Noise estimation without voice activity detection to avoid errors under low SNR & during long speech segments
- Low signal delay (algorithmic + computational) for real-time application
- Low computational complexity & memory requirement for implementation on a low-power processor

Proposed technique: Spectral subtraction using cascaded-median based continuous updating of the noise spectrum (without using voice activity detection)
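A minimal Python sketch of one way a cascaded median could track the noise spectrum continuously, without voice activity detection. The slides do not give the structure, so this assumes a two-stage arrangement: a median over each block of p frames, followed by a median over the last q block medians. The class name, the parameters p and q, their default values, and the zero initial estimate are all hypothetical and may differ from the authors' design.

```python
from collections import deque
from statistics import median


class CascadedMedianEstimator:
    """Two-stage median tracker for a noise magnitude spectrum (illustrative only)."""

    def __init__(self, num_bins, p=3, q=3):
        self.p = p  # assumed stage-1 block length, in frames
        # Stage 1: magnitudes of the current block, per frequency bin.
        self.stage1 = [[] for _ in range(num_bins)]
        # Stage 2: the last q stage-1 medians, per frequency bin.
        self.stage2 = [deque(maxlen=q) for _ in range(num_bins)]
        # Noise estimate; stays at zero until the first block completes (assumption).
        self.noise = [0.0] * num_bins

    def update(self, magnitudes):
        """Feed one frame of magnitudes |X_n(k)| and return the noise estimate D_n(k)."""
        for k, mag in enumerate(magnitudes):
            block = self.stage1[k]
            block.append(mag)
            if len(block) == self.p:
                # A block is full: store its median and refresh the estimate.
                self.stage2[k].append(median(block))
                block.clear()
                self.noise[k] = median(self.stage2[k])
        return list(self.noise)
```

In this arrangement each bin stores at most p + q values rather than p × q, which is the kind of memory saving a low-power processor would benefit from. Short high-energy speech bursts are rejected by the medians instead of being averaged into the estimate.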

Real-time implementation: 16-bit fixed-point DSP with on-chip FFT hardware

Evaluation: Informal listening, PESQ-MOS

2. Signal Processing for Spectral Subtraction

- Dynamic estimation of non-stationary noise spectrum
- Estimation of noise-free speech spectrum
- Speech resynthesis

Power subtraction
- Windowed speech spectrum = $X_n(k)$
- Estimated noise mag. spectrum = $D_n(k)$
- Estimated speech spectrum: $Y_n(k) = \left[\,|X_n(k)|^2 - (D_n(k))^2\,\right]^{0.5} e^{j\angle X_n(k)}$
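The power subtraction rule above can be sketched for a single frame in Python. This is an illustration under stated assumptions, not the authors' fixed-point implementation: the function name `power_subtract` is hypothetical, and clamping negative power differences to zero is an assumption of this sketch, since the slide does not say how such bins are handled.

```python
import cmath
import math


def power_subtract(noisy_spectrum, noise_mag):
    """Return Y_n(k) for one frame, given complex X_n(k) and real D_n(k)."""
    enhanced = []
    for x, d in zip(noisy_spectrum, noise_mag):
        # Subtract in the power domain; the floor at zero is an assumption.
        power = max(abs(x) ** 2 - d ** 2, 0.0)
        # Combine the enhanced magnitude with the noisy phase of X_n(k).
        enhanced.append(cmath.rect(math.sqrt(power), cmath.phase(x)))
    return enhanced
```

For example, a bin with X = 3 + 4j (magnitude 5) and D = 3 keeps its phase and is scaled to magnitude 4, while a bin whose noise estimate exceeds its magnitude is set to zero.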
