effcient voice activity detection algorithms using long-term speech information

17
Ecient voice activity detection algorithms using long-term speech information Javier Ram ırez * , Jos e C. Segura 1 , Carmen Ben ıtez, Angel de la Torre, Antonio Rubio 2 Dpto. Electr onica y Tecnolog  ıa de Computadores, Universidad de Granada, Campus Universitario Fuentenueva, 18071 Granada, Spain Received 5 May 2003; received in revised form 8 October 2003; accepted 8 October 2003 Abstract Currently, there are technology barriers inhibiting speech processing systems working under extreme noisy condi- tions. The emerg ing applic ations of speec h techn ology, especial ly in the elds of wirel ess communicat ions, digital hearin g aids or spe ech rec ogniti on, are exampl es of such sys tems and oft en requir e a noise red uct ion tec hniq ue operating in combination with a precise voice activity detector (VAD). This paper presents a new VAD algorithm for improving speech detection robustness in noisy environments and the performance of speech recognition systems. The algorithm measures the long-term spectral divergence (LTSD) between speech and noise and formulates the speech/ non-speech decision rule by comparing the long-term spectral envelope to the average noise spectrum, thus yielding a high discriminating decision rule and minimizing the average number of decision errors. The decision threshold is adapted to the measured noise energy while a controlled hang-over is activated only when the observed signal-to-noise ratio is low. It is shown by conducting an analysis of the speech/non-speech LTSD distributions that using long-term information about speech signals is benecial for VAD. The proposed algorithm is compared to the most commonly used VADs in the eld, in terms of speech/non-speech discrimination and in terms of recognition performance when the VAD is used for an automatic speech recognition system. Experimental results demonstrate a sustained advantage over standard VADs such as G.729 and adaptive multi-rate (AMR) which were used as a reference, and over the VADs of the advanced front-end for distributed speech recognition. Ó 2003 Elsevier B.V. All rights reserved. Keywords: Speech/non-speech detection; Speech enhancement; Speech recognition; Long-term spectral envelope; Long-term spectral divergence 1. Introduction An important problem in many areas of speech pr ocessing is the dete rmination of pr esence of  speech periods in a given signal. This task can be identied as a statistical hypothesis problem and its purpose is the determination to which category or cla ss a gi ven si gnal belongs. The decis ion is made based on an observation vector, frequently * Corr espo nding auth or. Tel.: +34-958 243 271 ; fax: +34 - 958243230. E-mail addres ses: [email protected] (J. Ra m ırez) , segu ra@ ugr.es (J.C. Segura), [email protected] (C. Ben ıtez), [email protected] ( A. de la Torre), [email protected] (A. Rubio). 1 Tel.: +34-958 243283 ; fax: +34-958 243230 . 2 Tel.: +34-958 243193 ; fax: +34-958 243230 . 0167-6393/$ - see front matter Ó 2003 Elsevier B.V. All rights reserved. doi:10.1016/j.specom.2003.10.002 Speech Communication 42 (2004) 271–287 www.elsevier.com/locate/specom

Upload: anshuman-kumar

Post on 03-Apr-2018

214 views

Category:

Documents


0 download

TRANSCRIPT

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 1/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 2/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 3/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 4/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 5/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 6/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 7/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 8/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 9/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 10/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 11/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 12/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 13/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 14/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 15/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 16/17

7/28/2019 Effcient Voice Activity Detection Algorithms Using Long-Term Speech Information

http://slidepdf.com/reader/full/effcient-voice-activity-detection-algorithms-using-long-term-speech-information 17/17