structured variation across sound contrasts, talkers, and...

1
Speech is variable due to talkers, contexts, speaking styles, etc. Individual talker variability observed for; · Vowels (e.g. Peterson & Barney 1952, Johnson et al. 1993) · VOT (e.g. Allen et al., 2003; Scobbie, 2006, Chodoroff et al. 2015) · CoG (e.g. Newman et al. 2001) BACKGROUND Confederate-led scripted dialogues 32 participants (80 dialogues per participant) (Ohala, 1994; Lindbolm et al. 2007; Maniwa, Jongman & Wade, 2009) Measurements (n=2514) / p, t / / s, ʃ / VOT CoG duration Locus equation (LE) slope (co-articulation) Vowel dispersion V = /i/, /æ/, /ɑ/, /oʊ/, or /u/ (Euclidean distance Neary normlalized) Speaking rate (num. of syllables per sec.) Prominence-induced “Clear speech” by confederate “mishearing” portion of dialogue · Control condition: TARGET heard correctly · Prominent condition: TARGE misheard for another C (p,t,s or ʃ) How systematic are talker differences? · Are they stable across consonants? How does this relate to speech style? · Do talkers differ in how clear they are? QUESTIONS METHODS SAMPLE DIALOGUES RESULTS This research was supported by SSHRC grant # 435-2014-1504 to Meghan Clayards and Michael Wagner. The authors would like to thank Alissa Azzi mmaturo for help with data collection and Charlene Alcena and Sarah Perill o for help with annotation. References: Allen, Miller, & DeSteno (2003). Individual talker differences in voice-onset-time. JASA, 113(1), 544-522./ Chodroff, Godfrey, Khudanpur, & Wilson (2015). Structured variability in acoustic realization: A corpus study of voice onset time in American English stops. Proceedings of ICPhS Glasgow, UK. / Johnson, Ladefoged, & Lindau (1993). Individual differences in vowel production. JASA 94(2), 701-714. / Krull (1988). Acoustic properties as p redictors of perceptual responses: A study of Swedish voiced stops. Phonetic Experimental Research, Institute of Linguistics, University of Stockholm (PERILUS) 7. / Lindblom (1990). Explaining phonetic variation: A sketch of the H&H theory. In W. J. Hardcastle & A. Marchal (Eds.), Speech Production and Speech Modeling (pp. 403-439), Dordrecht:Kluwer. / Lindblom, et al. (2007). The effect of emphatic stress on consonant vowel coarticulation. JASA, 1 21(6), 3802- 3813. / Newman, Clouse, & Burnham (2001). The perceptual consequences of within-talker variability in fricative production. JASA, 109(3), 1181-1196. / Recasens, Pallarès, & Fontdevila, (1997). A model of lingual coarticulation based on articulatory constraints. JASA, 102(1), 544–561. / Theodore, R. M., Miller, J. L., & DeSteno, D. (2009). Individual talker differences in voice-onset-time: Contextual influences. JASA, 125(6), 3974-3982 / Sc obbie, J. M. (2006). Flexibility in the face of incompatible English VOT systems. In: Laboratory Phonology 8 Varieties of Phonological Competence. Phonology and Phonetics 4-2 . Mouton de Gruyter, Berlin, pp. 367-392. Structured Variation across Sound Contrasts, Talkers, and Speech Styles Hye-Young Bang 1 & Meghan Clayards 12 [email protected] , [email protected]a 1 Department of Linguistics, McGill University, 2 School of Communication Sciences and Disorders, McGill University Talkers differ systematically across sound contrasts within a speaking style Talkers differ in degree of hyper-articulation Correlations between cues across speakers tend to be strengthened in clear speech (under prominence) Differences between talkers (at least partly) reflect global properties of talkers The systematic differences in talkers may help listeners quickly adapt to talker variability in speech perception CONCLUSION LabPhon 15 Ithaca, NY July 13-16, 2016 CONTROL PROMINENT Have you heard of “grey peep?” Green peep? No, GREY peep. What? Great peep? No! GREY peep. CONFEDERATE PARTICIPANT Have you heard of “grey peep?” Grey what? Grey PEEP. What? Grey teep? No! Grey PEEP. 0.4 0.5 0.6 0.7 100 150 VOT(/p t/) /s/_vowel dispersion prom. no yes (β = 0.32, p = 0.027) 0.4 0.5 0.6 0.7 150 200 250 300 /s/_duration stop_vowel dispersion prom. no yes (β = 0.42, p = 0.012) 6000 8000 10000 100 200 300 400 /s/_duration (mean) /s/_CoG prom. no yes (β = 0.34, p = 0.058) 100 200 100 200 300 400 /s/_duration (mean) VOT(/p t/) prom. no yes (β = 0.76, p <.001) 0.3 0.4 0.5 0.6 0.7 /s/_LE Slope prom. no yes (β = -0.37, p = 0.048) (r= 0.76, p <.001) Differences in body size or gender? Differences in speaking style? 0.4 0.5 0.6 0.7 100 150 VOT(/p t/) stop_vowel dispersion prom. no yes (β = 0.37, p = 0.021) 4000 6000 8000 10000 12000 50 100 150 200 250 VOT(/p t/) /s/_CoG prom. no yes (β = 0.42, p = 0.013) Stats are from ”prominent condition” with speaking rate as a control factor in regression VOT correlated with /s/ CoG /s/ duration correlated with /s/ CoG VOT correlated with /s/ duration VOT and /s/ duration correlated with vowel dispersion across consonants longer /s/ = smaller slope = less co-articulation 4500 5000 5500 6000 6500 7000 100 125 150 175 200 Mean F0 Mean CoG 60 80 100 120 155cm 158cm 160cm 163cm 165cm 168cm 170cm 173cm 175cm 185cm 188cm height(cm) Mean VOT 0 50 100 150 200 no yes prominence VOT 100 200 300 no yes prominence /s/_duration 4000 6000 8000 10000 no yes prominence CoG 1500 2000 2500 1000 1500 2000 2500 3000 F2_mid F2_onset m f 4500 5000 5500 6000 6500 7000 155cm 158cm 160cm 163cm 165cm 168cm 170cm 173cm 175cm 185cm 188cm height(cm) Mean CoG 60 80 100 120 100 125 150 175 200 Mean f0 Mean VOT same pattern as between talkers Clear speech · VOT· /s/ duration· CoG· LE slope· vowel dispersion

Upload: others

Post on 27-Feb-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Structured Variation across Sound Contrasts, Talkers, and ...speechlearning.lab.mcgill.ca/pdfs/BangClayards(LabPhon_2016).pdf · Structured Variation across Sound Contrasts, Talkers,

§  Speech is variable due to talkers, contexts, speaking styles, etc.§  Individual talker variability observed for;

· Vowels (e.g. Peterson & Barney 1952, Johnson et al. 1993) · VOT (e.g. Allen et al., 2003; Scobbie, 2006, Chodoroff et al. 2015)· CoG (e.g. Newman et al. 2001)

BACKGROUND

§  Confederate-led scripted dialogues§  32 participants (80 dialogues per participant) (Ohala, 1994; Lindbolm et al. 2007; Maniwa, Jongman & Wade, 2009)§  Measurements (n=2514)

/ p, t / / s, ʃ /

VOT CoGduration

Locus equation (LE) slope (co-articulation)Vowel dispersion V = /i/, /æ/, /ɑ/, /oʊ/, or /u/

(Euclidean distance Neary normlalized)

Speaking rate (num. of syllables per sec.)

§  Prominence-induced “Clear speech” by confederate “mishearing” portion of dialogue · Control condition: TARGET heard correctly

· Prominent condition: TARGE misheard for another C (p,t,s or ʃ)

§  How systematic are talker differences? · Are they stable across consonants? §  How does this relate to speech style? · Do talkers differ in how clear they are?

QUESTIONS

METHODS SAMPLE DIALOGUES

RESULTS

This research was supported by SSHRC grant # 435-2014-1504 to Meghan Clayards and Michael Wagner. The authors would like to thank Alissa Azzimmaturo for help with data collection and Charlene Alcena and Sarah Perillo for help with annotation.

References: Allen, Miller, & DeSteno (2003). Individual talker differences in voice-onset-time. JASA, 113(1), 544-522./ Chodroff, Godfrey, Khudanpur, & Wilson (2015). Structured variability in acoustic realization: A corpus study of voice onset time in American English stops. Proceedings of ICPhS Glasgow, UK. / Johnson, Ladefoged, & Lindau (1993). Individual differences in vowel production. JASA 94(2), 701-714. / Krull (1988). Acoustic properties as predictors of perceptual responses: A study of Swedish voiced stops. Phonetic Experimental Research, Institute of Linguistics, University of Stockholm (PERILUS) 7. / Lindblom (1990). Explaining phonetic variation: A sketch of the H&H theory. In W. J. Hardcastle & A. Marchal (Eds.), Speech Production and Speech Modeling (pp. 403-439), Dordrecht:Kluwer. / Lindblom, et al. (2007). The effect of emphatic stress on consonant vowel coarticulation. JASA, 121(6), 3802- 3813. / Newman, Clouse, & Burnham (2001). The perceptual consequences of within-talker variability in fricative production. JASA, 109(3), 1181-1196. / Recasens, Pallarès, & Fontdevila, (1997). A model of lingual coarticulation based on articulatory constraints. JASA, 102(1), 544–561. / Theodore, R. M., Miller, J. L., & DeSteno, D. (2009). Individual talker differences in voice-onset-time: Contextual influences. JASA, 125(6), 3974-3982 / Scobbie, J. M. (2006). Flexibility in the face of incompatible English VOT systems. In: Laboratory Phonology 8 Varieties of Phonological Competence. Phonology and Phonetics 4-2 . Mouton de Gruyter, Berlin, pp. 367-392.

Structured Variation across Sound Contrasts, Talkers, and Speech Styles Hye-Young Bang1 & Meghan Clayards12

[email protected] , [email protected] of Linguistics, McGill University,

2School of Communication Sciences and Disorders, McGill University

§  Talkers differ systematically across sound contrasts within a speaking style§  Talkers differ in degree of hyper-articulation§  Correlations between cues across speakers tend to be strengthened in clear speech (under prominence)§  Differences between talkers (at least partly) reflect global properties of talkers§  The systematic differences in talkers may help listeners quickly adapt to talker variability in speech perception

CONCLUSION

LabPhon 15Ithaca, NY

July 13-16, 2016

CO

NTR

OL

PRO

MIN

ENT

Have you heard of “grey peep?”

Green peep?No, GREY peep.

What? Great peep?No! GREY peep.

CONFEDERATE PARTICIPANT

Have you heard of “grey peep?”

Grey what?Grey PEEP.

What? Grey teep?No! Grey PEEP.

0.4

0.5

0.6

0.7

100 150

VOT(/p t/)

/s/_

vowe

l disp

ersio

n

prom. no yes

(β = 0.32, p = 0.027)0.4

0.5

0.6

0.7

150 200 250 300

/s/_duration

stop

_vow

el d

isper

sion

prom. no yes

(β = 0.42, p = 0.012)

6000

8000

10000

100 200 300 400

/s/_duration (mean)

/s/_C

oG

prom. no yes

(β = 0.34, p = 0.058)

100

200

100 200 300 400

/s/_duration (mean)

VOT(

/p t/)

prom. no yes

(β = 0.76, p <.001)

0.3

0.4

0.5

0.6

0.7

150 200 250 300

/s/_duration

/s/_

LE S

lope

prom. no yes

(β = -0.37, p = 0.048)

(r= 0.76, p <.001)

Differences in body size or gender?

Differences in speaking style?

0.4

0.5

0.6

0.7

100 150

VOT(/p t/)

stop

_vow

el d

ispe

rsio

n

prom. no yes

(β = 0.37, p = 0.021)

4000

6000

8000

10000

12000

50 100 150 200 250

VOT(/p t/)

/s/_

CoG

prom. no yes

(β = 0.42, p = 0.013)

§  Stats are from ”prominent condition” with speaking rate as a control factor in regression

§  VOT correlated with /s/ CoG§  /s/ duration correlated with /s/ CoG §  VOT correlated with /s/ duration§  VOT and /s/ duration correlated with vowel

dispersion across consonantslonger /s/ = smaller slope =less co-articulation

4500

5000

5500

6000

6500

7000

100 125 150 175 200

Mean F0

Mea

n Co

G

60

80

100

120

155cm158cm

160cm163cm

165cm168cm

170cm173cm

175cm185cm

188cm

height(cm)

Mea

n VO

T

0

50100150200

no yes

prominence

VOT

100

200

300

no yes

prominence

/s/_duration

4000

6000

8000

10000

no yes

prominence

CoG

1500

2000

2500

1000150

0200

0250

0300

0

F2_mid

F2_onset

m f 4500

5000

5500

6000

6500

7000

155cm158cm

160cm163cm

165cm168cm

170cm173cm

175cm185cm

188cm

height(cm)

Mea

n C

oG

60

80

100

120

100 125 150 175 200

Mean f0

Mea

n VO

T

same pattern as between talkers

§  Clear speech · VOT↑ · /s/ duration↑ · CoG↑ · LE slope↓ · vowel dispersion↑