speaker recognition system by abhishek mahajan


Upload: abhishek-mahajan

Post on 23-Jan-2017



TRANSCRIPT

Page 1: Speaker recognition system by abhishek mahajan

SHREEJEE INSTITUTE OF TECHNOLOGY AND MANAGEMENT

Speaker Recognition

• Guided by: Mr. Prakash Singh Panwar

• By: Rajpal Singh Chouhan

• EC Branch, 1st Year

Page 2: Speaker recognition system by abhishek mahajan

What is Speaker Recognition?

Speaker recognition is the process of automatically recognizing who is speaking on the basis of individual information contained in the speech signal.

Speaker recognition covers two tasks: speaker identification and speaker verification.

Page 3: Speaker recognition system by abhishek mahajan

Speaker Identification

• Determine the speaker identity.

• Selection between a set of known voices.

• The user does not claim an identity.

Whose voice is this?

Page 4: Speaker recognition system by abhishek mahajan

Speaker Verification

• Synonyms: authentication, detection.

• User claims an identity.

• System task: accept or reject the identity claim.

Is this Ahmad’s voice?

Page 5: Speaker recognition system by abhishek mahajan

Model of Speaker Recognizer

Fig. 1: Simple model of a speaker recognizer.

[Figure labels: "Hello, Mr. John"; "You are permitted to access"]

Page 6: Speaker recognition system by abhishek mahajan

The Structure of Speaker Recognizer

Figure 2: Functional scheme of an ASR system.

[Figure labels: Feature Extraction, Feature Vector, Training Mode, Speaker Modeling, Recognition, Classification, Decision Logic, Speaker #ID (Speaker_1)]

Page 7: Speaker recognition system by abhishek mahajan

Speech Signal Analysis: Feature Extraction

• The aim is to extract the voice features that distinguish the different phonemes of a language.

[Figure: a speech signal converted into rows of feature values (illustrative digits: 515645465, 156156165, 156456454, 251561565)]

Page 8: Speaker recognition system by abhishek mahajan

MFCC extraction

Speech signal x(n) → Pre-emphasis → x'(n) → Window → x_t(n) → DFT → X_t(k) → Mel filter banks → Y_t(m) → Log(|·|²) → IDFT → MFCC y_t^(m)(k)

MFCC stands for Mel-frequency cepstral coefficients, a representation of the short-term power spectrum of a sound that is widely used in audio processing.

The MFCCs are the amplitudes of the spectrum obtained by applying the inverse transform to the log Mel filter-bank energies.
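The chain on this slide can be sketched for a single frame as follows. This is a minimal illustration under stated assumptions, not the implementation used in the project: the pre-emphasis factor 0.97, the 20 triangular filters, the 12 output coefficients, and the use of a DCT-II in place of the IDFT step (a common substitution) are all assumptions, and the function names are hypothetical.

```python
import cmath
import math

def pre_emphasis(x, alpha=0.97):
    # x'(n) = x(n) - alpha * x(n-1); alpha = 0.97 is an assumed typical value
    return [x[0]] + [x[n] - alpha * x[n - 1] for n in range(1, len(x))]

def hamming(frame):
    # Multiply the frame by a Hamming window to reduce spectral leakage
    n_samples = len(frame)
    return [s * (0.54 - 0.46 * math.cos(2 * math.pi * n / (n_samples - 1)))
            for n, s in enumerate(frame)]

def power_spectrum(frame):
    # Naive DFT; |X_t(k)|^2 for the non-negative frequency bins only
    n_samples = len(frame)
    return [abs(sum(frame[n] * cmath.exp(-2j * math.pi * k * n / n_samples)
                    for n in range(n_samples))) ** 2
            for k in range(n_samples // 2 + 1)]

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    # Triangular filters whose centres are equally spaced on the Mel scale
    top = hz_to_mel(sample_rate / 2)
    mel_points = [top * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):      # rising edge
            row[k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):      # falling edge
            row[k] = (hi - k) / (hi - mid)
        bank.append(row)
    return bank

def dct2(values, n_coeffs):
    # DCT-II, used here in place of the IDFT step (assumption)
    size = len(values)
    return [sum(v * math.cos(math.pi * k * (m + 0.5) / size)
                for m, v in enumerate(values))
            for k in range(n_coeffs)]

def mfcc_frame(frame, sample_rate=8000, n_filters=20, n_coeffs=12):
    windowed = hamming(pre_emphasis(frame))
    spectrum = power_spectrum(windowed)
    bank = mel_filterbank(n_filters, len(frame), sample_rate)
    energies = [sum(w * p for w, p in zip(row, spectrum)) for row in bank]
    # Floor the energies so the logarithm is always defined
    log_energies = [math.log(max(e, 1e-10)) for e in energies]
    return dct2(log_energies, n_coeffs)

# Usage: one 256-sample frame (32 ms at 8 kHz) of a synthetic 440 Hz tone
frame = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(256)]
coeffs = mfcc_frame(frame)
```

Each frame of speech yields one such coefficient vector; stacking the vectors of all frames gives the feature matrix passed on to speaker modeling.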

Page 9: Speaker recognition system by abhishek mahajan

[Figure panels: speech waveform of the phoneme /æ/; the waveform after pre-emphasis and Hamming windowing; power spectrum; MFCC]

Page 10: Speaker recognition system by abhishek mahajan

Speech Signal to Feature Vector

[Figure: a speech signal converted into rows of feature values (illustrative digits: 515645465, 156156165, 156456454, 251561565)]

Page 11: Speaker recognition system by abhishek mahajan

Vector Quantization (VQ)

Aim of VQ: representation of large amounts of data by a (few) prototype vectors.

Example: identification and grouping of similar data into clusters.

Each feature vector is assigned to the closest prototype w, according to a similarity or distance measure (e.g. Euclidean distance).
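The two ideas above, a small set of prototypes and nearest-prototype assignment, can be sketched as follows. The slides do not say how the codebook was trained, so the plain k-means loop here is an assumption (LBG is another common choice), and the function names are hypothetical.

```python
import math

def euclidean(a, b):
    # Euclidean distance between two equal-length vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_prototype(vector, codebook):
    # Return (index, distance) of the closest prototype w in the codebook
    distances = [euclidean(vector, w) for w in codebook]
    index = min(range(len(codebook)), key=distances.__getitem__)
    return index, distances[index]

def train_codebook(vectors, k, iterations=10):
    # Plain k-means, seeded with the first k vectors (an assumed training scheme)
    codebook = [list(v) for v in vectors[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in vectors:
            index, _distance = nearest_prototype(v, codebook)
            clusters[index].append(v)
        for i, members in enumerate(clusters):
            if members:  # keep the old prototype if a cluster is empty
                codebook[i] = [sum(c) / len(members) for c in zip(*members)]
    return codebook

# Usage: four 2-D vectors reduced to two prototypes
data = [(0, 0), (0, 1), (10, 10), (10, 11)]
codebook = train_codebook(data, k=2)
print(codebook)                             # [[0.0, 0.5], [10.0, 10.5]]
print(nearest_prototype((9, 9), codebook))  # index 1 is the closest prototype
```

In a speaker recognizer the vectors would be MFCC frames, and one codebook would be trained per enrolled speaker.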

Page 12: Speaker recognition system by abhishek mahajan

Database Creation Process

[Figure: Speaker #1, Speaker #2 and Speaker #3 are enrolled in the database; the system responds "Hello, Speaker #1" and "Hello, Speaker #2"]

Page 13: Speaker recognition system by abhishek mahajan

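Speaker Identification

[Figure: an unknown speaker (Speaker #?) is compared with database entries #1, #2 and #3; the result shown is Speaker 1 with the value 5.94, so the output is Speaker #1]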
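One way to read the figure is as a closed-set search: score the unknown utterance against every enrolled speaker and return the best match. The sketch below assumes each speaker is stored as a VQ codebook and that the score is the average Euclidean distortion (lower is better); the slides do not state what the value 5.94 measures, so this scoring rule and all names are assumptions.

```python
import math

def average_distortion(features, codebook):
    # Mean distance from each feature vector to its closest prototype
    total = sum(min(math.dist(v, w) for w in codebook) for v in features)
    return total / len(features)

def identify(features, codebooks):
    # Score the utterance against every enrolled speaker; lowest distortion wins
    scores = {name: average_distortion(features, cb)
              for name, cb in codebooks.items()}
    best = min(scores, key=scores.get)
    return best, scores

# Usage with toy 2-D "features" and two hypothetical enrolled speakers
codebooks = {
    "Speaker_1": [(0.0, 0.0), (1.0, 1.0)],
    "Speaker_2": [(10.0, 10.0), (11.0, 11.0)],
}
best, scores = identify([(0.5, 0.5), (1.0, 0.0)], codebooks)
print(best)  # Speaker_1
```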

Page 14: Speaker recognition system by abhishek mahajan

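Speaker Verification

[Figure: a user claiming to be Speaker #1 is compared with the database entry for #1 (entries #1, #2 and #3 are stored); the result shown is Speaker 1 with the value 5.94, and the decision is Accept]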
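Verification differs from identification in that only the claimed speaker's model is consulted and the score is compared with a threshold. The sketch below is a minimal illustration: the average-distortion score and the threshold value are assumptions, since the slides do not say how the accept decision is made.

```python
import math

def average_distortion(features, codebook):
    # Mean distance from each feature vector to its closest prototype
    total = sum(min(math.dist(v, w) for w in codebook) for v in features)
    return total / len(features)

def verify(features, claimed_name, codebooks, threshold):
    # Accept the claim only if the distortion against the claimed
    # speaker's codebook is below the (assumed) threshold
    score = average_distortion(features, codebooks[claimed_name])
    return score < threshold, score

# Usage with toy 2-D "features" and two hypothetical enrolled speakers
codebooks = {
    "Speaker_1": [(0.0, 0.0), (1.0, 1.0)],
    "Speaker_2": [(10.0, 10.0), (11.0, 11.0)],
}
utterance = [(0.5, 0.5), (1.0, 0.0)]
accepted, score = verify(utterance, "Speaker_1", codebooks, threshold=2.0)
rejected, _ = verify(utterance, "Speaker_2", codebooks, threshold=2.0)
```

In practice the threshold trades false acceptances against false rejections and would be tuned on held-out recordings.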


Page 15: Speaker recognition system by abhishek mahajan

Database Creation Condition

Table 1: Database description.

Parameter              Characteristics
Language               Bangla
No. of speakers        5
Speech type            Sentence reading
Recording condition    A normal room condition
Audio length           60-90 seconds
Audio type             Stereo
Sample format          16-bit PCM
Sampling frequency     8 kHz
Bit rate               1411 kbps
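The audio parameters in Table 1 can be cross-checked with the standard relation for uncompressed PCM, bit rate = sampling frequency × bits per sample × channels. The helper below is a hypothetical name used only for this arithmetic check. Under this relation, 8 kHz 16-bit stereo corresponds to 256 kbps, while 1411 kbps is the figure for 44.1 kHz 16-bit stereo; the slides do not explain how the table values were obtained.

```python
def pcm_bit_rate_kbps(sample_rate_hz, bits_per_sample, channels):
    # Uncompressed PCM bit rate in kilobits per second
    return sample_rate_hz * bits_per_sample * channels / 1000

print(pcm_bit_rate_kbps(8000, 16, 2))   # 256.0
print(pcm_bit_rate_kbps(44100, 16, 2))  # 1411.2
```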

Page 16: Speaker recognition system by abhishek mahajan

Speaker Recognition Result

Table 3: Test results for the speaker recognition system.

Speaker      No. of inputs   Correct   Incorrect   Accuracy
Speaker_1    5               5         0           100%
Speaker_2    9               8         1           88.89%
Speaker_3    6               6         0           100%
Speaker_3    12              11        1           91.67%
Speaker_4    8               8         0           100%
Speaker_5    10              10        0           100%
Total        50              48        2           96%
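The accuracy column follows from accuracy = correct / inputs × 100. The short check below recomputes it from the counts in the table, with the row labels copied as they appear there; the variable and function names are illustrative.

```python
# (label, number of inputs, number correct), copied from the table rows
rows = [
    ("Speaker_1", 5, 5),
    ("Speaker_2", 9, 8),
    ("Speaker_3", 6, 6),
    ("Speaker_3", 12, 11),
    ("Speaker_4", 8, 8),
    ("Speaker_5", 10, 10),
]

def accuracy(correct, total):
    # Percentage of correctly recognized inputs
    return 100.0 * correct / total

for label, total, correct in rows:
    print(f"{label}: {accuracy(correct, total):.2f}%")

total_inputs = sum(total for _, total, _ in rows)
total_correct = sum(correct for _, _, correct in rows)
overall = accuracy(total_correct, total_inputs)
print(f"Total: {total_inputs} inputs, {total_correct} correct, {overall:.0f}%")
```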

Page 17: Speaker recognition system by abhishek mahajan

Applications

• Transaction authentication
  – Toll fraud prevention
  – Telephone credit card purchases
  – Telephone brokerage (e.g., stock trading)

• Access control
  – Physical facilities
  – Computers and data networks

• Information retrieval
  – Customer information for call centers
  – Audio indexing (speech skimming device)

• Forensics
  – Voice sample matching