
Prosody as a means of acquiring syntactic categories and building a syntactic skeleton

Gutman A.*, Dautriche I.#, Crabbé B.+ & Christophe A.#

* Zukunftskolleg, Konstanz University, Germany; # LSCP, École Normale Supérieure/CNRS, Paris; + Alpage, Paris Diderot University

Background & Objective

How can children learn a language without explicit training? Children acquiring their native language have to learn abstract syntactic categories (nouns, verbs, adjectives, etc.): these may facilitate the acquisition of word meanings [1] as well as the syntactic structures of the language. Two-year-olds already have an abstract representation of nouns and verbs [2].

What underlies this feat? We propose to simulate this process with a computational model based on experimental evidence.

A model for bootstrapping syntactic acquisition

We already know that:

- 9-month-olds perceive prosodic boundaries [3].
- Function words are frequent and situated at the edges of prosodic phrases [4].
- Toddlers use function words to categorize the following content words [5].

Given this knowledge, we may construct an approximate syntactic structure [6]:

- The prosodic structure is treated as an approximate shallow syntactic structure: a prosodic phrase = a syntactic phrase.
- Function words and content words serve to label the prosodic phrases categorically (NP, VP).

Result: a syntactic skeleton.

Goal: testing the feasibility of the labeling component of the model.

Example:
[The little boy] [is running fast]
[The little boy]NP [is running fast]VP
(S (NP The little boy) (VP is running fast))
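A minimal sketch of this labeling idea, in Python; the lookup table and function name below are illustrative assumptions, not part of the model itself:

    # Toy illustration of the syntactic-skeleton idea: each prosodic phrase is
    # treated as a syntactic phrase and labelled from its initial function word.
    # The lookup table is a hypothetical stand-in for learned knowledge.
    FUNCTION_WORD_LABELS = {"the": "NP", "a": "NP", "is": "VP", "can": "VP"}

    def label_skeleton(prosodic_phrases):
        """[['the', 'little', 'boy'], ['is', 'running', 'fast']]
        -> [('NP', ['the', 'little', 'boy']), ('VP', ['is', 'running', 'fast'])]"""
        return [(FUNCTION_WORD_LABELS.get(phrase[0], "?"), phrase)
                for phrase in prosodic_phrases]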

Overview of the computational model

Input: a transcribed corpus of child-directed speech augmented with prosodic boundaries following the algorithm of [7].

Learning simulation: a prosodic phrase is seen as a probabilistic event, to which we associate features:

- The observed features are the words at the edges of the phrase (L, R).
- The hidden feature is the syntactic category of the phrase (NP, VP).

Ex: [The_{L-1} rabbit_{R-1}]_NP [is_{L0} eating_{R0}]_VP [the_{L1} grass of the garden_{R1}]_NP
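As a rough sketch of this representation (the function name is illustrative, not from the original model), each prosodic phrase can be reduced to its edge-word features:

    # Reduce each prosodic phrase to its observed edge-word features (L, R).
    def phrase_features(prosodic_phrases):
        """[['the', 'rabbit'], ['is', 'eating'], ['the', 'grass', 'of', 'the', 'garden']]
        -> [{'L': 'the', 'R': 'rabbit'}, {'L': 'is', 'R': 'eating'},
            {'L': 'the', 'R': 'garden'}]"""
        return [{"L": words[0], "R": words[-1]} for words in prosodic_phrases]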

We predict the category $\hat{\varphi}$ of each prosodic phrase from its features $f_i$ using Naïve Bayes:

$\hat{\varphi} = \arg\max_{\varphi} \; p(\varphi) \prod_{i=1}^{l} p(f_i \mid \varphi)$

The prior probability of the category, $p(\varphi)$, and the probability of each feature given that category, $p(f_i \mid \varphi)$, are estimated via the Expectation-Maximization (EM) algorithm [8]:

- Initialization: initially assign the prosodic phrases to syntactic clusters.
- Maximization: estimate the parameters of the model.
- Expectation: re-assign the prosodic phrases to new syntactic clusters.
- Reiterate the last two steps until convergence (a sketch of this loop follows).
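A minimal sketch of this Naïve Bayes / EM loop, assuming phrases are represented by the edge-word features above; this is an illustrative re-implementation, not the authors' code:

    import math
    import random
    from collections import defaultdict

    def em_cluster(phrases, n_categories=2, n_iter=20, smoothing=0.1, seed=0):
        """phrases: list of feature dicts, e.g. {'L': 'the', 'R': 'boy'}.
        Returns a hard category index (0 .. n_categories-1) per phrase."""
        rng = random.Random(seed)

        # Initialization: random soft assignment of each phrase to a category.
        resp = []
        for _ in phrases:
            row = [rng.random() for _ in range(n_categories)]
            total = sum(row)
            resp.append([r / total for r in row])

        for _ in range(n_iter):
            # Maximization: re-estimate p(category) and p(word | slot, category)
            # from the current soft assignments, with additive smoothing.
            prior = [smoothing] * n_categories
            counts = [defaultdict(lambda: smoothing) for _ in range(n_categories)]
            for phrase, row in zip(phrases, resp):
                for c, r in enumerate(row):
                    prior[c] += r
                    for slot, word in phrase.items():
                        counts[c][(slot, word)] += r
            prior_total = sum(prior)
            prior = [p / prior_total for p in prior]
            slot_totals = [defaultdict(float) for _ in range(n_categories)]
            for c in range(n_categories):
                for (slot, word), n in counts[c].items():
                    slot_totals[c][slot] += n

            # Expectation: re-assign each phrase in proportion to the
            # Naive Bayes score p(category) * prod_i p(f_i | category).
            new_resp = []
            for phrase in phrases:
                logs = []
                for c in range(n_categories):
                    logp = math.log(prior[c])
                    for slot, word in phrase.items():
                        logp += math.log(counts[c][(slot, word)] / slot_totals[c][slot])
                    logs.append(logp)
                m = max(logs)
                probs = [math.exp(s - m) for s in logs]
                z = sum(probs)
                new_resp.append([p / z for p in probs])
            resp = new_resp

        # Decision rule: argmax over categories for each phrase.
        return [max(range(n_categories), key=lambda c: row[c]) for row in resp]

For instance, running em_cluster(phrase_features(...)) over a corpus of prosodically bracketed utterances yields one cluster index per phrase, which can then be compared against gold-standard categories.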

References

[1] Gillette, J. et al. (1999). Human simulations of vocabulary learning. Cognition, 73, 165–176.
[2] Bernal, S. et al. (2010). Two-year-olds compute syntactic structure on-line. Developmental Science, 13, 69–73.
[3] Gerken, L. et al. (1994). 9-month-olds' sensitivity to phonological versus syntactic phrases. Cognition, 51(3), 237–265.
[4] Morgan, J. L. et al. (1996). Perceptual bases of rudimentary grammatical categories. pp. 263–283.
[5] Cauvet, E. et al. (in press). Function words constrain on-line recognition of verbs and nouns in French 18-month-olds. LLD.
[6] Christophe, A. et al. (2008). Bootstrapping lexical and syntactic acquisition. Language and Speech, 51, 61–75.
[7] Nespor, M. & Vogel, I. (1986). Prosodic Phonology, volume 28. Walter de Gruyter.
[8] Dempster, A. P. et al. (1977). Maximum likelihood from incomplete data via the EM algorithm. JRSS, 39(1), 1–38.
[9] Bergelson, E. et al. (2012). At 6 to 9 months, human infants know the meanings of many common nouns. PNAS, 109.

Experiment 1 - Unsupervised clustering using Function words

Assumptions

- Children use function words to classify phrases.
- Function words are frequent and often the first word of a phrase [4].

Implementation

- Assign each prosodic phrase to a cluster labelled by its first word (the feature L).
- Retain only the clusters corresponding to the most frequent function words.
- Re-assign the rest of the corpus using the EM algorithm (as sketched below).
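One way to picture this initialization (illustrative only; the cluster count and helper name are assumptions):

    from collections import Counter

    def function_word_init(phrases, n_clusters=10):
        """Seed clusters with the most frequent phrase-initial words (typically
        function words); the remaining phrases are left for the EM step."""
        first_words = Counter(p["L"] for p in phrases)
        seed_words = [w for w, _ in first_words.most_common(n_clusters)]
        clusters = {w: [p for p in phrases if p["L"] == w] for w in seed_words}
        rest = [p for p in phrases if p["L"] not in seed_words]
        return clusters, rest  # 'rest' is later re-assigned by EM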

Evaluation & Results

The quality of each cluster is evaluated by examining the largest syntactic category which it has captured:

$\mathrm{purity}(Cl) = \dfrac{\text{number of hits of the largest category}}{\text{cluster size}}$
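In code, the purity of one cluster could be computed along these lines (a sketch; the gold labels come from a parsed reference corpus):

    from collections import Counter

    def purity(gold_labels_in_cluster):
        """Share of the cluster taken up by its largest gold category,
        e.g. purity(['NP', 'NP', 'VP']) == 2/3."""
        counts = Counter(gold_labels_in_cluster)
        return counts.most_common(1)[0][1] / len(gold_labels_in_cluster)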

Up-side:
- Relying on function words appearing at the edge of the prosodic phrase provides a powerful classification heuristic.

Down-side:
- No cluster corresponds to a single category (VP or NP); these categories are distributed over several clusters.

Experiment 2 - Semi-supervised classification using Content words

Assumptions

- The child already knows some of the most frequent content words [9], and can identify them as objects (nouns) or actions (verbs) [4].
- Based on this knowledge she can identify the class of the current prosodic phrase.

Implementation

- Pre-define a list of known frequent content words based on the corpus: a semantic seed.
- Use the last word of each phrase (the feature R) to assign it to one of three classes: Nominal, Verbal or Other.
- Re-assign the unclassified part of the corpus using the EM algorithm (see the sketch below).
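A sketch of this seed-based initialization; the seed sets below are hypothetical examples, not the actual lists derived from the corpus:

    # Hypothetical semantic seed: a few frequent nouns and verbs the child is
    # assumed to know, used to label phrases by their last word (feature R).
    SEED_NOUNS = {"rabbit", "boy", "garden"}
    SEED_VERBS = {"eating", "running"}

    def seed_classify(phrase):
        last = phrase["R"]
        if last in SEED_NOUNS:
            return "Nominal"
        if last in SEED_VERBS:
            return "Verbal"
        return None  # left unclassified here; later assigned by EM to
                     # Nominal, Verbal or Other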

[Figure: proportion of each category (NP, VP, Other) in each cluster, for the smallest seed (2% of phrases initially classified).]

Evaluation & Results

We vary the size of the semantic seed (known words), starting from {3 nouns, 1 verb}, and compare to a uniform random clustering with 3 clusters:

- The precision is very good even with a small semantic seed (highly precise VP and NP clusters, which span about 50% of these categories).
- The recall is lower, since the known words do not cover all contexts. But precision is more important than recall for learning!
- Although based on content words, the learning algorithm ultimately relies on function words: knowing a few content words allows the learner to discover the function words associated with the category label.

Conclusion

- The syntactic skeleton hypothesis is corroborated: the language learner can rely on prosodic boundaries, function words and content words in order to construct a syntactic skeleton.
- Experiment 1: while the prosodic boundaries give the structure of the prosodic skeleton, our baseline random EM shows that the prosodic structure alone does not permit the inference of meaningful phrasal categories. However, relying on function words permits the construction of good categories (high purity).
- Experiment 2: relying on knowledge of frequent content words can lead to the emergence of abstract syntactic categories. These abstract categories (VP and NP) are grounded in linguistic experience (the frequency of words) and semantic experience (verbs = actions and nouns = objects).
- Our model shows that a small semantic seed allows the discovery of function words, and that the knowledge of function words may help in the classification of novel words.

Corresponding author: ariel.gutman@uni-konstanz.de
