Analysing voice quality - Trinity College Dublin


Analysing voice quality

John Kane

April 30, 2010


Voice quality (VQ)

Mainly a consequence of the vibration of the vocal folds.

Overall timbre of a person’s voice (organic setting and dynamic shifts).

VQ not limited to pitch and loudness.


Voice quality

Laver (1980) sought to provide quantitative physiological/acoustic descriptions of VQ.

VQs: breathy, whispery, creaky, harsh, falsetto, modal.

In real speech these VQs exist on continuous scales and in combination with others.

Voice quality examples


Voice quality (VQ)

Reveals information on speaker’s state and attitude.

Infants already sensitive to different VQs.

Mackenzie Beck (2005): VQ used before understanding of linguistic content.


Voice quality in speech communication

Contrastive linguistic purpose in some languages.

Gujarati: “twelve” vs “outside” (breathy)

Danish: “hun” vs “hund” (creaky)

Status, popular trends.

Extralinguistic, active listening (grunts etc.).

Prosodic component in neutral running speech.


Potentials of VQ/glottal source in speech technology

Improvement of naturalness in parametric speech synthesis (Cabral 2008, Raitio 2008).

Potential for more flexible/expressive speech synthesis.

Ability to aid emotion detection and paralinguistic annotation.


Difficulties measuring VQ

As listeners we are very sensitive to variation in VQ.

Difficult job for computers.

Hidden position of the vocal folds.

Vocal Folds

Robust extraction of the glottal source is a difficult job for signal processing.


Electroglottography (EGG)


Inverse filtering


Parameterisation

Time based measurements (LF model)

[Figure: waveform illustrating the time-based (LF model) measurements; axes: Time (ms), Amplitude]

Adv: Related to physiology.

DisAdv: Sensitive to noise and phase.

Frequency domain measurements

Adv: Avoids phase issues.

DisAdv: Existing parameters strongly correlated.
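As a rough illustration of the time-based measurements, the sketch below derives the LF-model shape parameters Rg, Rk and Ra (the labels used later in these slides) from the timing points of a single glottal pulse. It assumes the commonly used definitions Rg = T0 / (2 Tp), Rk = (Te - Tp) / Tp and Ra = Ta / T0. The function name, argument names and example timings are illustrative assumptions, not values from the talk.

```python
def lf_shape_parameters(t0, tp, te, ta):
    """Return LF-model shape parameters for one glottal pulse.

    All arguments are durations in the same unit (e.g. ms), measured
    from glottal opening:
      t0: fundamental period
      tp: time of peak glottal flow
      te: time of main excitation (negative peak of the flow derivative)
      ta: effective duration of the return phase
    """
    if not (0 < tp < te < t0) or ta <= 0:
        raise ValueError("expected 0 < tp < te < t0 and ta > 0")
    rg = t0 / (2.0 * tp)          # glottal "formant" relative to F0
    rk = (te - tp) / tp           # pulse skew (closing vs opening time)
    ra = ta / t0                  # normalised return-phase duration
    oq = (1.0 + rk) / (2.0 * rg)  # open quotient, equal to te / t0
    return {"Rg": rg, "Rk": rk, "Ra": ra, "OQ": oq}


# Hypothetical timings for a 100 Hz pulse (T0 = 10 ms).
params = lf_shape_parameters(t0=10.0, tp=4.0, te=6.0, ta=0.3)
print(params)
```

Small errors in locating Tp, Te or Ta in a noisy or phase-distorted waveform feed directly into these ratios, which is one way to read the noise and phase sensitivity noted above.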


Our parameterisation system

Processing chain: recorded speech → automatic inverse filtering (Alku 2002) → codebook search for initial values → two-part optimisation → model parameters.

[Figure: Speech waveform; axis: Time (ms)]

[Figure: Voice source waveform; axis: Time (ms)]

[Figure: Spectral optimisation, showing the voice source spectrum and the fitted model; axes: Frequency (Hz), Amplitude (dB)]

Parameter labels in the diagram: Rk, Rg, EE, Ra, F0.


Our Interspeech/Speech Communication submission

Description of our frequency domain parameterisation approach.

Finnish vowels: /ɑ e i o u y æ ø/, 11 speakers

BREATHY

MODAL

PRESSED


Evaluation

Robustness against simulations of difficult conditions.

Relative change: sensitivity of parameters.

Coefficient of variation: pulse-to-pulse variation.

Test conditions: clean signal; signal with additive noise (SNR = 45 dB); signal with additive noise (SNR = 30 dB); signal with recording system distortion.

Ability to discriminate voice qualities.

Explained variance: regression analysis.

Classification: linear discriminant analysis.
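The robustness measures above can be sketched in a few lines. This is a minimal illustration that assumes relative change is the absolute difference from the clean-signal value expressed as a percentage, and that the coefficient of variation is the population standard deviation divided by the mean. The slides do not give the exact formulas, so these definitions, the function names and the test tone are all assumptions.

```python
import math
import random
import statistics


def add_noise_at_snr(signal, snr_db, seed=0):
    """Add white Gaussian noise scaled to give the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in signal]
    p_signal = sum(s * s for s in signal) / len(signal)
    p_noise = sum(n * n for n in noise) / len(noise)
    # SNR(dB) = 10 * log10(p_signal / p_noise), solved for the noise gain.
    target_p_noise = p_signal / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_p_noise / p_noise)
    return [s + gain * n for s, n in zip(signal, noise)]


def relative_change(clean_value, distorted_value):
    """Percentage change of a parameter relative to its clean-signal value."""
    return abs(distorted_value - clean_value) / abs(clean_value) * 100.0


def coefficient_of_variation(values):
    """Pulse-to-pulse variation of a parameter, in percent."""
    return statistics.pstdev(values) / abs(statistics.mean(values)) * 100.0


# Hypothetical 100 Hz tone sampled at 8 kHz, standing in for a speech signal.
clean = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(800)]
noisy_30db = add_noise_at_snr(clean, snr_db=30)
noisy_45db = add_noise_at_snr(clean, snr_db=45)
```

In an evaluation of this kind, the same parameter extraction would be run on the clean and the degraded signals, and `relative_change` would then be applied to each extracted parameter.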


Overall results

Clearly better robustness against distortions imposed by recording system.

[Figure: Relative change (%) in Ra, Rk and Rg for each voice quality (breathy, modal, pressed)]


Overall results

Generally less sensitive to moderate levels of additive noise imposed on signals.

High noise levels at times affected robustness.


Overall results

Clearly higher R² scores for individual parameters.

[Figure: R-squared values (%) for the parameters Rg, Rk and Ra; legend: New system, Time system]
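One way to read an R² score for a single parameter is as the share of its variance explained by the voice quality label. The sketch below assumes a regression on the category label, for which the fitted value of each sample is its group mean. The talk does not specify the exact regression setup, so this is an illustration only, and the toy numbers are hypothetical.

```python
def r_squared_by_group(values, labels):
    """R² of a parameter when predicted by its per-label mean."""
    grand_mean = sum(values) / len(values)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    group_means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_residual = sum(
        (v - group_means[g]) ** 2 for v, g in zip(values, labels)
    )
    return 1.0 - ss_residual / ss_total


# Hypothetical parameter values for two voice qualities.
labels = ["breathy", "breathy", "pressed", "pressed"]
separable = r_squared_by_group([1.0, 1.0, 3.0, 3.0], labels)
overlapping = r_squared_by_group([1.0, 3.0, 1.0, 3.0], labels)
print(separable, overlapping)
```

A parameter whose values separate cleanly by voice quality scores close to 1, while one whose values overlap completely scores close to 0.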


Table 1: Confusion matrix of classification scores (%) of the three voice qualities using the two systems.

            Spec               Time
       Bre  Neu  Pre      Bre  Neu  Pre
Bre     79   20    1       76   22    2
Neu     32   47   21       42   43   15
Pre      6   24   70        8   28   64

Higher classification scores.
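The comparison can be summarised by averaging the diagonal of each confusion matrix in Table 1. The sketch assumes that rows are the intended voice quality and columns the predicted one (each row sums to 100), that "Spec" denotes the new frequency-domain system and "Time" the time-domain system, and that the three classes are equally weighted. None of these points is stated explicitly on the slide.

```python
# Rows and columns are ordered breathy, neutral, pressed (values in %).
SPEC = [[79, 20, 1], [32, 47, 21], [6, 24, 70]]
TIME = [[76, 22, 2], [42, 43, 15], [8, 28, 64]]


def mean_class_accuracy(matrix):
    """Average of the diagonal entries, i.e. mean per-class accuracy (%)."""
    return sum(matrix[i][i] for i in range(len(matrix))) / len(matrix)


print(round(mean_class_accuracy(SPEC), 1))  # 65.3
print(round(mean_class_accuracy(TIME), 1))  # 61.0
```

Both systems confuse the neutral class most often, which matches the view that these voice qualities lie on a continuum.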


Some thoughts

New method may overcome some of the issues which have hampered automated glottal source analysis.

Produced vowels vs running speech.

Criteria to be defined to maximise the probability of robust parameter extraction.

Extension of islands of reliability (Mokhtari & Campbell 2002).


Future work

Applying new method to analysis of glottal source dynamics with Dr. Yanushevskaya.

Further work with HMM-based classification of voice qualities with Mark Kane.

Possible collaboration with Catharine Oertel and Prof. Campbell in analysis of voice quality from naturalistic speech recordings.

Open to other collaborations!
