Separation of Multispeaker Speech Using Excitation Information
B. Yegnanarayana, R. Kumara Swamy and S. R. Mahadeva Prasanna
Dept of Computer Science and Engineering
Indian Institute of Technology Madras
Chennai-600036, India
Talk at NOLISP2005
April 19, 2005
Problem
•Determine the number of speakers
•Separate individual speakers
•Enhance speech of individual speakers
Organization of the talk
•Demo illustrating the problem of multispeaker separation
•Basis: Sequences of impulses in speech production
•Proposed method for speaker separation
•Discussion: Scope of the present study and key ideas
•Conclusions
Basis for the Proposed Method of Separation
•Sequences of impulses in direct speech at mic locations
•No effect of channel or other degradations on the sequence
•No two speakers are at the same location
Proposed Method for Speaker Separation
•Record multispeaker data at 2 or more mics
•Compute the Hilbert envelope (HE) of the linear prediction (LP) residual
•Use peaks in crosscorrelation of HEs to obtain delays
•Take min of shifted HEs to derive HE of desired speaker
•Derive weight function and modified LP residual
•Synthesize speech for each speaker
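The second step in the list above, computing the HE of the LP residual, can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function names lp_residual and hilbert_envelope, the LP order of 10, and the use of a single LP analysis over the whole signal (instead of short frames) are all choices made here for brevity.

```python
import numpy as np

def lp_residual(x, order=10):
    """LP residual via the autocorrelation method (whole-signal analysis, an assumption)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    # Autocorrelation at lags 0..order
    r = np.array([np.dot(x[:n - k], x[k:]) for k in range(order + 1)])
    # Toeplitz normal equations R a = r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    R += 1e-9 * r[0] * np.eye(order)  # small ridge for numerical stability
    a = np.linalg.solve(R, r[1:])
    # Predicted sample: sum_k a[k] * x[n - 1 - k]
    pred = np.zeros(n)
    for k in range(order):
        pred[k + 1:] += a[k] * x[:n - k - 1]
    return x - pred

def hilbert_envelope(e):
    """Magnitude of the analytic signal of e, computed with the FFT."""
    e = np.asarray(e, dtype=float)
    n = len(e)
    # Keep DC, double the positive frequencies, zero the negative ones
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    analytic = np.fft.ifft(np.fft.fft(e) * h)
    return np.abs(analytic)
```

A typical call would be hilbert_envelope(lp_residual(mic_signal)), once per microphone, where mic_signal is a hypothetical 1-D array of samples.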
Time-delay estimation
(a) Peaks in the crosscorrelation plots
(b) Time delay and normalized number of samples
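As a rough illustration of the delay-estimation step, the sketch below cross-correlates two Hilbert envelopes over a limited lag range and returns the lags of the largest peaks, one per expected speaker. The function name crosscorr_delays, the parameters max_lag and num_peaks, and the simple local-maximum peak picker are assumptions made for this example. The sign convention is that a positive lag means the second envelope is delayed relative to the first.

```python
import numpy as np

def crosscorr_delays(he1, he2, max_lag, num_peaks=1):
    """Lags (in samples) of the largest cross-correlation peaks, strongest first."""
    a = np.asarray(he1, dtype=float)
    b = np.asarray(he2, dtype=float)
    n = min(len(a), len(b))
    a = a[:n] - a[:n].mean()
    b = b[:n] - b[:n].mean()
    lags = np.arange(-max_lag, max_lag + 1)
    cc = np.empty(len(lags))
    for i, lag in enumerate(lags):
        # cc(lag) = sum_m a[m] * b[m + lag]
        if lag >= 0:
            cc[i] = np.dot(a[:n - lag], b[lag:])
        else:
            cc[i] = np.dot(a[-lag:], b[:n + lag])
    cc /= np.linalg.norm(a) * np.linalg.norm(b) + 1e-12
    # Interior local maxima of the normalized cross-correlation
    is_peak = (cc[1:-1] > cc[:-2]) & (cc[1:-1] >= cc[2:])
    peak_idx = np.flatnonzero(is_peak) + 1
    ranked = peak_idx[np.argsort(cc[peak_idx])[::-1]]
    return [int(lags[i]) for i in ranked[:num_peaks]]
```

Restricting the search to max_lag reflects the fact that the largest possible delay is bounded by the microphone spacing; the value to use depends on the recording setup and is not given in the slides.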
Processing HE using time-delay
a) HE of mic-1 signal; b), c), d) Min(HE1, HE2) emphasizing
excitation information of Speakers 1, 2 and 3, respectively
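The Min(HE1, HE2) operation in the caption can be sketched as below. The idea illustrated is that, once HE2 is shifted by one speaker's delay, that speaker's excitation peaks line up in both envelopes and survive the sample-wise minimum, while peaks of the other speakers fall at different instants and are suppressed. The helper names shift and emphasize_speaker and the zero-padding at the signal ends are assumptions for this example; the delay uses the convention that a positive value means the mic-2 envelope lags the mic-1 envelope.

```python
import numpy as np

def shift(x, lag):
    """Return y with y[n] = x[n + lag], zero-padded at the ends."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    if lag > 0:
        y[:-lag] = x[lag:]
    elif lag < 0:
        y[-lag:] = x[:lag]
    else:
        y[:] = x
    return y

def emphasize_speaker(he1, he2, delay):
    """Sample-wise minimum of HE1 and HE2 after aligning HE2 by `delay` samples."""
    he1 = np.asarray(he1, dtype=float)
    return np.minimum(he1, shift(he2, delay))
```

Calling emphasize_speaker once per estimated delay gives one processed envelope per speaker, corresponding to panels b), c) and d).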
Results of Separation
a) LP residual of mic-1 signal; b), c) and d) modified residuals of Sp1, Sp2 and
Sp3; e), f) and g) speech signals after separation
Demo of Speaker Enhancement
Three-speaker case
a) Microphone-1 speech signal b) Microphone-2 speech signal
Summary
•Number of speakers (whispered), speaker separation (2 mics), speech enhancement (> 2 mics)
•Only speaker separation is addressed
•Significance of HE for delay estimation and speaker separation
Conclusions
•Need to improve the quality of enhanced speech signals
•Need more microphones for data collection
•Need to deal with moving speakers and a variable number of speakers