Pitch Determination by Wavelet Transformation, Santhosh Bellikoth, ECE 5525 - Speech Processing, Instructor: Dr Kepuska

Upload: louisa-bridges

Post on 04-Jan-2016


TRANSCRIPT

Page 1: Pitch Determination by Wavelet Transformation Santhosh Bellikoth ECE 5525- Speech Processing Instructor: Dr Kepuska

Pitch Determination by Wavelet Transformation

Santhosh Bellikoth
ECE 5525 - Speech Processing

Instructor: Dr Kepuska

Page 2:

Pitch Determination

Equivalent to fundamental frequency estimation

Essential component in all speech processing systems

Page 3:

Applications of Pitch Detector

Speaker identification and verification

Pitch-synchronous speech analysis and synthesis

Linguistic and phonetic knowledge acquisition

Voice disease diagnosis

Page 4:

Continuous Wavelet Transform

The continuous wavelet transform is defined as the convolution of a signal x(t) with a wavelet function ψ(t) that is shifted in time by a translation parameter b and scaled by a dilation parameter a:

$$CWT_x(b, a) = \frac{1}{a} \int x(t)\, \psi\!\left(\frac{t - b}{a}\right) dt$$

Page 5:

Dyadic Wavelet Transform

The dyadic wavelet transform is defined as

$$DyWT_x(b, 2^j) = \frac{1}{2^j} \int x(t)\, \psi\!\left(\frac{t - b}{2^j}\right) dt = x(t) * \psi_{2^j}(t)$$

$$\psi_{2^j}(t) = \frac{1}{2^j}\, \psi\!\left(\frac{t}{2^j}\right)$$

Page 6:

Dyadic Wavelet Transform Properties

Linearity

Time shift variance

Detection of sharp and slow variations in the signal, which makes it a useful tool for the analysis of speech signals.

Page 7:

Plot of Haar Wavelet and Scaling Function

[Figure: two panels, "Wavelet Function" psi(t) and "Scaling Function" phi(t), each plotted for t from 0 to 1.4]
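As a reminder of what the two curves on this slide look like, the Haar wavelet and scaling function can be written down directly. This is a minimal sketch in Python rather than the deck's MATLAB. The function names `haar_psi` and `haar_phi` are hypothetical, and the half-open interval endpoints follow the usual textbook definition rather than anything stated on the slide.

```python
def haar_psi(t):
    """Haar wavelet: +1 on [0, 0.5), -1 on [0.5, 1), 0 elsewhere."""
    if 0.0 <= t < 0.5:
        return 1.0
    if 0.5 <= t < 1.0:
        return -1.0
    return 0.0


def haar_phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0


# Sample both functions on the time axis used in the plot (0 to 1.4).
grid = [k / 10.0 for k in range(15)]
psi_samples = [haar_psi(t) for t in grid]
phi_samples = [haar_phi(t) for t in grid]
```

The wavelet integrates to zero over its support, which is why it responds to changes in the signal rather than to its level.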

Page 8:

Pitch Detection Steps

Segmentation of the speech signal

Scale selection

Computation of the wavelet transform of each frame at various scales

Locating the positions of the local maxima for each frame

Locating the positions of the GCIs

Calculation of the pitch periods

Page 9:

Segmentation of Speech Signal

1) Segmentation without Overlapping

The speech signal is segmented using a Hamming window of 40 ms duration

2) Segmentation with 50 % Overlapping

A rectangular window is used with overlapping of less than 10 %
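The two segmentation schemes can be sketched as follows. This is an illustration in Python, not the deck's own `f_ovp` routine, whose body is not shown. The names `hamming` and `split_frames` are hypothetical, and the 10 kHz sampling rate is an assumption suggested by the file name and the 400-sample frame length used in the final program.

```python
import math


def hamming(n):
    """Hamming window of n points: w[k] = 0.54 - 0.46*cos(2*pi*k/(n-1))."""
    if n < 1:
        raise ValueError("n must be positive")
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def split_frames(signal, frame_len, hop):
    """Return frames of frame_len samples whose starts are hop samples apart.

    Trailing samples that do not fill a whole frame are dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(signal) - frame_len
    return [signal[s:s + frame_len] for s in range(0, last_start + 1, hop)]


fs = 10000                       # assumed sampling rate in Hz
frame_len = 40 * fs // 1000      # 40 ms -> 400 samples
samples = [float(k) for k in range(1000)]   # stand-in for a speech signal

# Scheme 1: no overlap, each frame weighted by a Hamming window.
window = hamming(frame_len)
no_overlap = [[x * w for x, w in zip(frame, window)]
              for frame in split_frames(samples, frame_len, frame_len)]

# Scheme 2: 50 % overlap, rectangular window (samples left unweighted).
half_overlap = split_frames(samples, frame_len, frame_len // 2)
```

With 50 % overlap each sample (apart from the edges) appears in two frames, so roughly twice as many frames are produced for the same signal.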

Page 10:

Scale Selection

The dyadic wavelet transform is computed at scales a = 2^j for all j.

The number of scales used in the computation can be reduced based on the nature of the speech signal.

Page 11:

Number of Scales Selection

For a wavelet with input center frequency fci and input bandwidth Δfi, the scale parameter a corresponding to the required output center frequency fco is obtained from the relation

a = fci/fco

Page 12:

Input and Output Bandwidth

Input bandwidth of the wavelet: Δfi = 2*fci

Output bandwidth of the wavelet: Δfo = 2*fco

Page 13:

Approximation of a

If fci/fco is not a power of 2, it is rounded to the nearest power of 2.

For high-pitched speakers, the lower bound is decreased and the upper bound is increased for better results.
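The relation a = fci/fco and the rounding step can be sketched as below. This is Python rather than the deck's MATLAB, and `dyadic_scale` is a hypothetical name. The slides do not say how "nearest" is measured, so rounding the exponent log2(fci/fco) is an assumption, as is clamping the result to a >= 1. The frequencies in the example are made up for illustration.

```python
import math


def dyadic_scale(fci, fco):
    """Return a = fci / fco rounded to the nearest power of two.

    'Nearest' is taken on a log2 scale (an assumption), and the exponent is
    clamped at 0 so the scale is never below 1.
    """
    if fci <= 0 or fco <= 0:
        raise ValueError("center frequencies must be positive")
    exponent = max(0, round(math.log2(fci / fco)))
    return 2 ** exponent


# Hypothetical center frequencies in Hz.
exact = dyadic_scale(800.0, 100.0)      # ratio 8 is already a power of two
rounded_down = dyadic_scale(1000.0, 100.0)   # ratio 10
rounded_up = dyadic_scale(1000.0, 80.0)      # ratio 12.5
```

Because only a handful of dyadic scales fall inside a plausible pitch range, this step is also where the number of scales to compute gets reduced.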

Page 14:

Computation of Dyadic Wavelet Transform

The dyadic wavelet transform is computed for each frame by the following equation

$$DyWT_x(b, a = 2^j) = \frac{1}{2^j} \int x(t)\, \psi\!\left(\frac{t - b}{2^j}\right) dt = x(t) * \psi_{2^j}(t)$$
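A discrete version of this equation with the Haar wavelet can be sketched as follows. It is written in Python and is not a reimplementation of the MATLAB `cwt` call used in the final program, whose normalisation may differ. The name `haar_dywt` is hypothetical, and treating samples beyond the end of the frame as zero is an assumption.

```python
def haar_dywt(frame, scale):
    """DyWT(b, a) = (1/a) * sum_n x[n] * psi((n - b) / a) with the Haar wavelet.

    psi is +1 on the first half of its support and -1 on the second half, so
    each output is (first-half sum - second-half sum) / scale. Samples past
    the end of the frame are treated as zero (an assumption).
    """
    if scale < 2 or scale % 2:
        raise ValueError("scale must be an even integer >= 2")
    n = len(frame)
    half = scale // 2

    # prefix[k] holds the sum of frame[0:k].
    prefix = [0.0]
    for value in frame:
        prefix.append(prefix[-1] + value)

    def range_sum(lo, hi):
        # Sum of frame[lo:hi], with indices clipped to the frame length.
        return prefix[min(hi, n)] - prefix[min(lo, n)]

    return [(range_sum(b, b + half) - range_sum(b + half, b + scale)) / scale
            for b in range(n)]


# A step edge at sample 16 gives the strongest response where the wavelet
# is centered on the edge, i.e. at b = 16 - scale/2.
step = [0.0] * 16 + [1.0] * 16
response = haar_dywt(step, 8)
edge_position = response.index(min(response))
```

The response is zero over flat regions and largest in magnitude at the abrupt change, which is the property the method relies on to find glottal closure instants.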

Page 15:

Speech Signal to be Segmented

[Figure: "Speech Signal", amplitude versus number of samples (0 to 5000)]

Page 16:

First Three Frames of Original Speech Signal with 50% Overlapping

[Figure: three panels, Frame No.1, Frame No.2 and Frame No.3, amplitude versus number of samples (0 to 5000)]

Page 17:

Speech Segment and Dyadic Wavelet Transform

[Figure: "Speech Segment" and its wavelet transform at scales (a) a = 8, (b) a = 16, (c) a = 32, amplitude versus number of samples (0 to 400)]

Page 18:

Locating Positions of Local Maxima

To locate the positions of the local maxima, all the peaks of the waveform are located first.

The positions of the local maxima are then obtained by applying a threshold equal to 80% of the global maximum.
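The two-stage peak picking described on this slide can be sketched as below. This is Python, not the deck's `f_shim_max` routine, whose body is not shown. The name `local_maxima` and the way plateaus are handled (the left edge is reported) are assumptions.

```python
def local_maxima(values, fraction=0.8):
    """Indices of peaks whose height is at least fraction * global maximum.

    Stage 1 finds every upward peak; stage 2 keeps only those that reach the
    threshold (80 % of the global maximum by default).
    """
    peaks = [i for i in range(1, len(values) - 1)
             if values[i - 1] < values[i] >= values[i + 1]]
    if not peaks:
        return []
    threshold = fraction * max(values)
    return [i for i in peaks if values[i] >= threshold]


# Four peaks of heights 1.0, 0.5, 0.9 and 0.79; only two reach 0.8.
waveform = [0.0, 1.0, 0.0, 0.5, 0.0, 0.9, 0.0, 0.79, 0.0]
kept = local_maxima(waveform)
```

Lowering `fraction` admits more candidate peaks, at the cost of more spurious maxima between glottal closures.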

Page 19:

Locating All the Upside Peaks of a Waveform and Local Maxima

[Figure: two panels, "Peak Picking" and "Location of Local Maxima", amplitude versus number of samples (0 to 400)]

Page 20:

Locating the Position of GCIs (Glottal Closure Instants)

If the positions of the local maxima at one scale match the positions of the local maxima of the frame's wavelet transform computed at the next scale, those locations are taken as the GCI positions.

If they do not match, the comparison is repeated with the wavelet transform at the next higher scale.
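One way to read this cross-scale check is sketched below in Python. The deck's own comparison routine, `comp_t`, is not shown, so the rule used here is an assumption: two scales agree when they have the same number of maxima and each pair of positions differs by at most `tolerance` samples. The name `match_across_scales` is hypothetical.

```python
def match_across_scales(maxima_by_scale, tolerance):
    """Return the maxima of the first scale that agrees with the next one.

    maxima_by_scale is a list of position lists ordered from the lowest to
    the highest scale. Two scales agree when they hold the same number of
    maxima and every paired position differs by at most tolerance samples
    (an assumed rule). An empty list means no pair of scales agreed.
    """
    for current, following in zip(maxima_by_scale, maxima_by_scale[1:]):
        same_count = len(current) == len(following) and len(current) > 0
        if same_count and all(abs(a - b) <= tolerance
                              for a, b in zip(current, following)):
            return list(current)
    return []


# Hypothetical maxima positions at three consecutive scales.
agree_first = match_across_scales(
    [[10, 110, 215], [12, 108, 211], [13, 109, 212]], tolerance=10)
agree_second = match_across_scales(
    [[10, 50, 110], [12, 108], [13, 109]], tolerance=10)
no_match = match_across_scales([[10], [50]], tolerance=10)
```

Falling through to the next pair of scales mirrors the slide's second rule: when the lower scales disagree, the comparison moves up one scale.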

Page 21:

Pitch Calculation

The pitch period can be computed as

$$p = \frac{d}{f_s}$$

where d is the difference between two GCI positions in samples and f_s is the sampling frequency of the speech signal.
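As a worked example of p = d/fs, the sketch below turns GCI positions into pitch periods and a mean fundamental frequency. It is written in Python, the name `pitch_periods` is hypothetical, and the GCI positions and the 10 kHz sampling rate are made-up values for illustration.

```python
def pitch_periods(gci_positions, fs):
    """Pitch periods in seconds: p = d / fs for consecutive GCI positions."""
    if fs <= 0:
        raise ValueError("fs must be positive")
    return [(later - earlier) / fs
            for earlier, later in zip(gci_positions, gci_positions[1:])]


# Hypothetical GCI positions (in samples) at an assumed fs of 10 kHz.
periods = pitch_periods([100, 180, 262, 340], fs=10000)

# Mean fundamental frequency as the reciprocal of the mean pitch period.
mean_period = sum(periods) / len(periods)
f0 = 1.0 / mean_period
```

Here the sample differences are 80, 82 and 78, giving periods close to 8 ms and a fundamental frequency of about 125 Hz.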

Page 22:

Acoustic Measures

Jita

Jita is the absolute jitter, which gives an evaluation in msec of the period-to-period variability of the pitch period within the analyzed voice sample.

$$Jita = \frac{1}{N-1} \sum_{i=1}^{N-1} \left| p_i - p_{i+1} \right|$$

Page 23:

Jitter

Jitter percent gives an evaluation of the variability of the pitch period within the analyzed voice sample in percent.

$$Jitter(\%) = \frac{\frac{1}{N-1} \sum_{i=1}^{N-1} \left| p_i - p_{i+1} \right|}{\frac{1}{N} \sum_{i=1}^{N} p_i} \times 100$$

p is the pitch period and N is the number of pitch periods estimated.
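Both jitter measures follow directly from the list of pitch periods. The sketch below is in Python and is not the deck's own `jita` and `jitter` routines, whose bodies are not shown; the name `jitter_percent` and the example periods are made up for illustration.

```python
def jita(periods):
    """Absolute jitter: mean absolute difference of consecutive periods."""
    n = len(periods)
    if n < 2:
        raise ValueError("at least two pitch periods are required")
    return sum(abs(periods[i] - periods[i + 1]) for i in range(n - 1)) / (n - 1)


def jitter_percent(periods):
    """Jitter in percent: absolute jitter relative to the mean period."""
    mean_period = sum(periods) / len(periods)
    return 100.0 * jita(periods) / mean_period


# Hypothetical pitch periods in seconds.
example_periods = [0.0080, 0.0082, 0.0078]
absolute = jita(example_periods)
relative = jitter_percent(example_periods)
```

For these periods the consecutive differences are 0.2 ms and 0.4 ms, so Jita is 0.3 ms and, with a mean period of 8 ms, the jitter is 3.75 %.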

Page 24:

Shimmer (dB)

Shimmer in dB gives an evaluation of the period-to-period variability of the peak-to-peak amplitude within the analyzed voice sample.

$$Shimmer(dB) = \frac{1}{N-1} \sum_{i=1}^{N-1} \left| 20 \log_{10}\!\left(\frac{A_{i+1}}{A_i}\right) \right|$$

Page 25:

Shimmer (%)

Shimmer percent gives an evaluation in percent of the variability of the peak-to-peak amplitude within the analyzed voice sample.

Shimmer in percent is given by

$$Shimmer(\%) = \frac{\frac{1}{N-1} \sum_{i=1}^{N-1} \left| A_i - A_{i+1} \right|}{\frac{1}{N} \sum_{i=1}^{N} A_i} \times 100$$
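The two shimmer measures can be sketched in the same way from a list of peak-to-peak amplitudes. This is Python, not the deck's `shimdB` and `shimmer` routines, whose bodies are not shown; the names `shimmer_db` and `shimmer_percent` and the amplitudes in the example are made up for illustration.

```python
import math


def shimmer_db(amplitudes):
    """Mean absolute value of 20*log10(A[i+1] / A[i]) over consecutive periods."""
    n = len(amplitudes)
    if n < 2:
        raise ValueError("at least two amplitudes are required")
    return sum(abs(20.0 * math.log10(amplitudes[i + 1] / amplitudes[i]))
               for i in range(n - 1)) / (n - 1)


def shimmer_percent(amplitudes):
    """Mean absolute amplitude difference relative to the mean amplitude."""
    n = len(amplitudes)
    if n < 2:
        raise ValueError("at least two amplitudes are required")
    mean_diff = sum(abs(amplitudes[i] - amplitudes[i + 1])
                    for i in range(n - 1)) / (n - 1)
    return 100.0 * mean_diff / (sum(amplitudes) / n)


# Hypothetical, deliberately extreme peak-to-peak amplitudes.
example_amplitudes = [1.0, 10.0, 1.0]
in_db = shimmer_db(example_amplitudes)
in_percent = shimmer_percent(example_amplitudes)
```

Each tenfold amplitude change contributes 20 dB, so this example gives 20 dB; the mean difference of 9 against a mean amplitude of 4 gives 225 %. Amplitudes must be strictly positive for the logarithm to be defined.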

Page 26:

Conclusion

Acoustic parameters computed using the wavelet transform can be used for the objective analysis of pathological voice.

These acoustic parameters can be used to differentiate between normal and pathological voice.

Page 27:

Final Program

clc; clear all; close all;
% wavread was removed in later MATLAB releases; audioread is its replacement.
[s,fs]=wavread('U:\speech2_10k.wav');
%s=s1(1:10000);
m=400;
wL=400;
L=length(s);
nf=floor(L/wL);   % number of whole frames of wL samples
j=1;
t=10;             % passed to comp_t when maxima are compared across scales

Page 28:

Final Program

cmp1=[]; cmp2=[]; cmp3=[];
gci=[]; q=[]; d=[]; a=[];
%b=[];
disp('Enter x=1 for male voice');
disp('Enter x=2 for female voice');

Page 29:

Final Program

x=input('Enter the value of x =');
switch x
    case 1
        for i=1:nf-1
            f(j,:)=f_ovp(s,m,wL,i);
            g=gne(f(j,:));
            c1=cwt(f(j,:),4,'haar');
            c2=cwt(f(j,:),8,'haar');
            c3=cwt(f(j,:),16,'haar');
            c4=cwt(f(j,:),32,'haar');

Page 30:

Final Program

            [p1,q1,d1]=f_shim_max(c1);
            [p2,q2,d2]=f_shim_max(c2);
            [p3,q3,d3]=f_shim_max(c3);
            [p4,q4,d4]=f_shim_max(c4);
            L1=length(p1);
            L2=length(p2);
            L3=length(p3);
            L4=length(p4);
            if L1==L2
                cmp1=comp_t(p1,p2,t);

Page 31:

Final Program

            elseif L2==L3
                cmp2=comp_t(p2,p3,t);
            elseif L3==L4
                cmp3=comp_t(p3,p4,t);
            end
            if ~isempty(cmp1)
                gci=[gci,p1'];
                q=[q,q1'];
                d=[d,d1'];
            elseif ~isempty(cmp2)

Page 32:

Final Program

                gci=[gci,p2'];
                q=[q,q2'];
                d=[d,d2'];
            elseif ~isempty(cmp3)
                gci=[gci,p3'];
                q=[q,q3'];
                d=[d,d3'];
            elseif isempty(cmp1)& isempty(cmp2)
                d=[d,zeros(1,1)];
            end

Page 33:

Final Program

            % The if-block was already closed on the previous slide; the repeated
            % "end" is dropped here because it would close the for loop too early.
            a=[a g];
            % b=[b g2];
            j=j+1;
        end
        %d1=diff(gci);
    case 2
        for i=1:nf-1
            f(j,:)=f_ovp3t(s,m,wL,i);
            c1=cwt(f(j,:),8,'haar');
            c2=cwt(f(j,:),16,'haar');
            c3=cwt(f(j,:),32,'haar');
            c4=cwt(f(j,:),64,'haar');
            g=gne(f(j,:));
            [p1,q1,d1]=f_shim_max(c1);
            [p2,q2,d2]=f_shim_max(c2);

Page 34:

Final Program

            [p3,q3,d3]=f_shim_max(c3);
            [p4,q4,d4]=f_shim_max(c4);
            L1=length(p1);
            L2=length(p2);
            L3=length(p3);
            L4=length(p4);
            if L1==L2
                cmp1=comp_t(p1,p2,t);
            elseif L2==L3
                cmp2=comp_t(p2,p3,t);
            elseif L3==L4
                cmp3=comp_t(p3,p4,t);
            end
            if ~isempty(cmp1)
                gci=[gci,p1'];

Page 35:

Final Program

                q=[q,q1'];
                d=[d,d1'];
            elseif ~isempty(cmp2)
                gci=[gci,p2'];
                q=[q,q2'];
                d=[d,d2'];
            elseif ~isempty(cmp3)
                gci=[gci,p3'];
                q=[q,q3'];
                d=[d,d3'];
            elseif isempty(cmp1)& isempty(cmp2)
                d=[d,zeros(1,1)];
            end
            a=[a g];
            % b=[b g2];
            % The lines below are missing from the transcript; they mirror case 1
            % and close the for loop and the switch.
            j=j+1;
        end
end

Page 36:

Final Program

d=smooth_d(d);
p=d./fs;
L5=length(gci);
L6=length(p);
L7=abs(L5-L6);
m=mean(p);
fo=1/m;
m1=max(p);
m2=min(f_wz(p));
fh=1/m2;
fl=1/m1;
jit=jita(p);
jitt=jitter(p);
shdB=shimdB(q,L6);
sh=shimmer(q,L6);
GNE=max(a);

Page 37:

Final Program

%GNE2=max(b);
disp('Fundamental frequency =');
disp(fo);
disp('Highest frequency=');
disp(fh);
disp('Lowest frequency=');
disp(fl);
disp('Jita =');
disp(jit);
disp('Jitter in percentage');
disp(jitt);
disp('Shimmer in dB =');
disp(shdB);
disp('shimmer in percentage=');
disp(sh);

Page 38:

Final Program

disp('Press any key for plot');
pause;
if L5==L6
    stairs(gci,p);
    xlabel('Number of Samples');
    ylabel('Pitch period in msec');
    title('Pitch contour');
elseif L5<L6
    gci=[gci,zeros(1,L7)];
    stairs(gci,p);
    xlabel('Number of Samples');
    ylabel('Pitch period in msec');
    title('Pitch contour');
else
    p=[p,zeros(1,L7)];
    stairs(gci,p);
    xlabel('Number of Samples');
    ylabel('Pitch period in msec');
    title('Pitch contour');
end

Page 39:

Results and Observations

Enter x=1 for male voice
Enter x=2 for female voice
Enter the value of x =1
Fundamental frequency = 351.4493
Highest frequency= 3.3333e+003
Lowest frequency= 217.3913
Jita = 0.0021
Jitter in percentage 72.4864

Page 40:

Results and Observations

Shimmer in dB = 3.2017
shimmer in percentage= 15.6931
Press any key for plot
>> Variables created in current workspace.

Page 41:

Questions?
