LOT 4: 16-20 jan06 1
Language Acquisition
4.
Elena Lieven, MPI-EVA, Leipzig
School of Psychological Sciences, University of Manchester
Outline for Session 4
MAIN TOPIC: Studying languages other than English
‘Exotic languages’ and issues they raise
Comparing cues within a language
Comparisons across languages
POST BREAK: The language-learning environment in different cultures
Typological discoveries (1)
Children are sensitive from the outset of speaking to the semantic distinctions made in their language (Bowerman & Choi)
PICTURE            Korean        English
Cassette in box    Fit tightly   In
Apple in bowl      Put loosely   In
Put top on pen     Fit tightly   On
Put book in bag    Put loosely   In
Typological discoveries (2)
Chintang – a Tibeto-Burman language of East Nepal
• Free ordering of verbal prefixes
• Tense nearer to stem than aspect
• Complex system of location marking
• Location marking also used to express interpersonal relations
What is the frequency and pattern of usage of these constructions in the speech of adults?
How are they used in speech to children?
Do children make errors predicted by putative linguistic or cognitive universals or do they learn the language in its specificity?
Communicative Environment!!
Productive morphology
• Does productivity develop?
• Are children less productive than adults?
Constructivist model: at younger ages children are less productive than their parents, even with the verbs and affixes that they know, since they are slowly building the abstract categories
Full competence model: with the verbs and affixes that they know, children are fully productive
Spanish verb inflections [Aguado Orea & Pine]
Nottingham corpus
• Lucia: 22 hours: 2;2.25 – 2;7.14
• Juan: 31 hours: 1;1-.21 – 2;5.28
• Only verbs used by both adult and child
  – stem
  – agreement properties
• Adult sample of verb tokens randomly reduced to number found in child’s speech
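The sampling control described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the (stem, inflection) token format, the function names and the toy data are hypothetical and are not taken from the Nottingham corpus or from the authors' scripts.

```python
import random
from collections import defaultdict

def downsample(tokens, n, seed=0):
    """Randomly reduce a list of verb tokens to n items (without replacement)."""
    rng = random.Random(seed)
    return rng.sample(tokens, n)

def inflections_per_stem(tokens):
    """Mean number of distinct inflections attested per verb stem."""
    forms = defaultdict(set)
    for stem, inflection in tokens:
        forms[stem].add(inflection)
    return sum(len(v) for v in forms.values()) / len(forms)

# Hypothetical (stem, inflection) tokens, invented for illustration only.
child = [("quer", "1sg"), ("quer", "1sg"), ("quer", "3sg"), ("mir", "imp")]
adult = [("quer", "1sg"), ("quer", "2sg"), ("quer", "3sg"), ("quer", "1pl"),
         ("mir", "imp"), ("mir", "3sg"), ("mir", "1sg"), ("mir", "2sg")]

# Keep only verbs used by both speakers, then match the adult token count
# to the child's before comparing the two productivity scores.
shared = {stem for stem, _ in child} & {stem for stem, _ in adult}
child = [t for t in child if t[0] in shared]
adult = [t for t in adult if t[0] in shared]
adult_matched = downsample(adult, len(child))

child_score = inflections_per_stem(child)
adult_score = inflections_per_stem(adult_matched)
```

Matching token counts first matters because a larger sample would show more inflections per stem simply by being larger.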
Number of inflections per stem
• No significant difference between parents
• Significant difference between children and parents at both tested ages
• For Juan, significant difference between first and second half of the corpus
High frequency verbs have significantly fewer errors
Some person marking is almost always correct, but overgeneralised (1sg)
Other person marking is almost always incorrect and another highly frequent form is used (3pl)
Marking of German plurals
Köpcke: Cue strength: salience, type frequency, cue validity, iconicity
Behrens: -s generalisation errors limited to distributional conditions in the input
Szagun: growth rates in type frequencies per marker match the input
Regularity – recurrent pattern
Generality – type frequency
Default – only productive plural marker – English
-s – emergency general ending – German
Schemas – independent of rest of noun declension
Inflection classes – gender and four cases in singular
Morphological productivity [Laaha et al, in submission]
- The ability to freely form new morphological forms
- Degrees of productivity:
  - All feminine and animate masculine nouns ending in schwa take the –en plural
  - –en plural fully productive for feminine nouns ending in schwa; competes with –s for feminine nouns ending in consonant
Even the youngest children sensitive to feminine/non-feminine distinction
Degree of productivity played a role at all ages
Input frequency had an effect for some plurals
Morphological transparency for some forms – leave off Umlaut
Case marking and word order in German using novel verbs
[Dittmar et al – MPI-EVA]
Distribution of SO- and OS-order with unambiguous and ambiguous case marking for German transitive sentences in the input:
SO+Case 68%
SO-Case 11%
OS+Case 21%
[Bar chart: cue availability, cue reliability and cue validity for case marking and word order, on a 0–100% scale; bar labels in the original: 100%, 86%, 86%, 68%, 87%, 79%]
Availability, reliability and validity for the grammatical cues word order and case marking for German transitive sentences in the input
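A minimal sketch of how the three cue measures can be computed, assuming the usual Competition Model definitions (availability = share of sentences in which the cue is present; reliability = share of those in which it points to the correct agent; validity = availability × reliability). The slides do not spell out the formula, so this is an assumption, and the mini-corpus below is hypothetical rather than the input data summarised above.

```python
def cue_stats(sentences, cue):
    """Return (availability, reliability, validity) for one cue.

    Each sentence is a dict mapping a cue name to True (cue present and
    pointing to the correct agent), False (present but misleading) or
    None (cue absent or ambiguous).
    """
    total = len(sentences)
    present = [s[cue] for s in sentences if s[cue] is not None]
    availability = len(present) / total
    reliability = sum(present) / len(present) if present else 0.0
    return availability, reliability, availability * reliability

# Hypothetical mini-corpus of 10 transitive sentences (invented counts).
corpus = (
    [{"word_order": True, "case": True}] * 6      # SO order, unambiguous case
    + [{"word_order": True, "case": None}] * 2    # SO order, ambiguous case
    + [{"word_order": False, "case": True}] * 2   # OS order, unambiguous case
)

word_order = cue_stats(corpus, "word_order")
case = cue_stats(corpus, "case")
```

In this toy corpus word order is always available but sometimes misleading, while case marking is never misleading but sometimes unavailable, so the two cues end up with the same validity for different reasons.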
[Bar chart: Prototype, Word order only and Conflict conditions for 2;7-year-olds (N = 16), 5-year-olds (N = 16) and 7-year-olds (N = 16), on a 0–100% scale; some bars marked **]
Mean proportion of correct pointing
Novel verb studies of Syntax (Tomasello, Cognition, 2000)
[Graph: % children (0–90) by age (axis labels 2,0 2,6 3,0 3,6 4,0 4,6 5,0 8,0); labelled data points for German [Wittek], Japanese [Matsui et al.] and Hebrew]
Weird linking [Abbot-Smith, MPI-EVA]
Models: always weird
Sentence: The bunny[NOM] is pushing/domming the dog[ACC]
Action: Dog pushing/domming bunny
Elicitation:
Action: Lion domming frog
Exp: And now you tell me what happens, ok?
Chi: Yes.
Exp: Who is doing what?
Chi: The lion, it [+nom] is domming the [+acc] frog.
[Bar chart: proportion (0–1) of grammatical and ungrammatical linking for 4s familiar verb (N = 16), 4s novel verb, 2s familiar verb (N = 16) and 2s novel verb]
Grammatical and ungrammatical linking used by German children (those who used both target verbs in a transitive or intransitive in both conditions at least once)
[Bar chart: mean proportion of linking responses (0–1), grammatical vs. ungrammatical linking, for Familiar Verb and Novel Verb in German 2;4-year-olds (N = 30) and English 2;4-year-olds (N = 30)]
Mean proportion of grammatical and ungrammatical linking used by German versus English children
Weird word order in French [Matthews et al, submitted]
• Il pousse Mary (He pushes Mary)
• Il la pousse (He pushes her)
[Bar chart: mean proportion of responses (0.0–1.0) classed as Match, Single Revert or Full Revert, by verb frequency and modelled word order: Low SOV, Low VSO, High SOV, High VSO]
Mean proportion of Matches, Single Argument Reversions and Full Reversions as a function of verb frequency and modelled word order (mean age 2;10).
Weird word order in English and French [Matthews et al, submitted]
[Bar chart: proportion of corrections (0–0.9) with % no object, % pro object and % lex object, for French SOV 2;10, French SOV 3;9, English SOV 2;9 and English SOV 3;9]
Mean proportion of canonically ordered responses that expressed no object, a pronominal object or a lexical object as a function of age and language.
Other languages [Stoll, Abbot-Smith & Lieven, in prep.]
• English has very fixed word order
  – The tiger ate the mouse
  – The mouse ate the tiger
• German is more variable but has more case inflections
  – Der Tiger frisst den Hund
  – Den Hund hat der Tiger gefressen
• Russian has ‘free word order’
  – Ja videl svoju mašinu (all 24 word orders possible)
Proportions of utterances accounted for by frames
[Bar chart: % of utterances (0–100) accounted for by frames in English, German and Russian]
Proportions of one, two and three-word frames
[Bar chart: share (0–100%) of 1-word, 2-word and 3-word frames in English, German and Russian]
Imperatives
• ENGLISH:
  – Look ... = .10    Come/on ... = .10
• GERMAN:
  – Guck(e)/mal ... = .14    Komm/mal ... = .06
    „Look ...“    „Come ...“
• RUSSIAN:
  – Skazhi ... = .09    Davaj ... = .15
    „Say ...“    „Give / Let’s ...“
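Proportions such as "Look ... = .10" can be read as the share of utterances that open with a given frame. The sketch below shows one way such figures might be obtained, under the assumption that a frame is an utterance-initial word sequence that recurs at least a minimum number of times. That definition, the threshold and the toy utterances are illustrative assumptions, not the criteria used in the study.

```python
from collections import Counter

def initial_frames(utterances, n=1, min_count=2):
    """Utterance-initial n-word sequences occurring at least min_count times."""
    starts = Counter(
        tuple(u.lower().split()[:n]) for u in utterances if len(u.split()) >= n
    )
    return {frame: c for frame, c in starts.items() if c >= min_count}

def coverage(utterances, frames, n=1):
    """Proportion of utterances that begin with one of the frames."""
    hits = sum(tuple(u.lower().split()[:n]) in frames for u in utterances)
    return hits / len(utterances)

# Hypothetical child-directed utterances, invented for illustration.
utterances = ["look at that", "look here", "come on then", "come here",
              "look a dog", "what is that", "where is it", "that is nice",
              "put it down", "no"]

frames = initial_frames(utterances)
share = coverage(utterances, frames)
```

Raising `n` to 2 or 3 gives the two-word and three-word frames; the same counting applies.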
Wh-questions
[Bar chart: % of wh-questions (0–100) accounted for by core frames in English, German and Russian; figures shown with the core frames: 27, 13, 16]
Wh-questions 1, 2 and 3-word core frames
[Bar chart: number of 1-word, 2-word and 3-word core frames (0–30) in English, German and Russian; one data label, 4, survives from the original]
• German and English:
wh word + aux/modal + pronoun/article/particle
• Russian:
wh word +/- particle
Prodrop, no articles, no copula in present tense
The AGR/TNS Omission Model
• Child’s grammar is identical to adult’s except that the child is subject to a Unique Checking Constraint that results in under-specification of Tense and/or Agreement
• Child uses non-finite verb forms in contexts where finite verb forms are obligatory
– That go there v That goes there (3sg present)
• Since AGR assigns NOM, child also produces Non-NOM subjects when AGR absent
– Him naughty, Her coming
Strengths of the ATOM
• Explains statistical patterns of error in English
–He goes and He go, but few I goes
–He goes, He go and Him go but few Him goes
• Explains why children learning other obligatory subject languages (e.g. Dutch, French) use infinitives in main clauses
–Hij lopen (He to walk) Il faire (He to do)
• Explains why children learning optional subject languages (e.g. Spanish) do not use infinitives in main clauses
–(El) habla (He speaks) not *(El) hablar (He to speak)
MOSAIC
MOSAIC is a simple distributional learner that:
• Learns utterance final words and sequences
  – Do you want a biscuit?
    Biscuit
    A biscuit
    Want a biscuit
• Generates novel utterances by linking together words that have been preceded and followed by overlapping sets of words and substituting them in utterance final sequences
  – ‘a’ linked to ‘the’ on the basis of:
    Want a biscuit
    Want the ball
  – allows:
    Want the biscuit
    Eat a biscuit
    Eat the biscuit
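The two mechanisms just described (storing utterance-final sequences, then substituting distributionally linked words) can be illustrated with a toy sketch. This is not the MOSAIC implementation: the linking rule used here (at least one shared preceding word and one shared following word) is a simplifying assumption, and the function names and input utterances are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def learn_final_sequences(utterances):
    """Store every utterance-final word sequence (suffix) of each utterance."""
    learned = set()
    for utt in utterances:
        words = utt.lower().split()
        for start in range(len(words)):
            learned.add(tuple(words[start:]))
    return learned

def link_words(utterances):
    """Link words that share a preceding AND a following word in the input."""
    before, after = defaultdict(set), defaultdict(set)
    for utt in utterances:
        words = utt.lower().split()
        for prev, nxt in zip(words, words[1:]):
            after[prev].add(nxt)
            before[nxt].add(prev)
    links = defaultdict(set)
    vocab = set(before) | set(after)
    for a, b in combinations(sorted(vocab), 2):
        if before[a] & before[b] and after[a] & after[b]:
            links[a].add(b)
            links[b].add(a)
    return links

def generate(learned, links):
    """Substitute linked words into learned sequences; keep only novel output."""
    novel = set()
    for seq in learned:
        for i, word in enumerate(seq):
            for other in links.get(word, ()):
                candidate = seq[:i] + (other,) + seq[i + 1:]
                if candidate not in learned:
                    novel.add(" ".join(candidate))
    return novel

# Hypothetical input; 'a' and 'the' both follow 'want' and precede 'biscuit'.
speech = ["Want a biscuit", "Want the ball", "Eat the biscuit"]
learned = learn_final_sequences(speech)
links = link_words(speech)
generated = generate(learned, links)
```

With this input only ‘a’ and ‘the’ become linked, so the novel output consists of rote-learned sequences with the determiner swapped.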
MOSAIC: Key Features
• Takes as input (orthographically transcribed) samples of Child-Directed Speech
• Produces output in the form of ‘utterances’ that can be compared with those of real children
• Learns to produce progressively longer utterances as a function of the amount of input it has seen
MOSAIC-Speak
ROTE LEARNED                       GENERATED
• DOESN’T FALL OUT                 • MIGHT FALL OUT
• CHEEKY FACE                      • CHEEKY FOOT
• WHERE DO YOU WANT THEM TO GO?    • WHERE DO YOU WANT HIM TO GO?
• HOLD THE CASE THEN               • TAKE THE CASE THEN
• TELL GRANDMA THEN                • SHOW GRANDMA THEN
• IT’S THE PHONE                   • IT’S A PHONE
• WHICH FRIENDS ARE THEY THEN?     • WHICH FRIENDS IS HE THEN?
• GONNA WEE IN THE POTTY           • GONNA WEE IN THE BALLOON
Method
• MOSAIC trained repeatedly on speech addressed to a particular child
• Output generated after each run through input
• Output files selected on basis of MLU
• Compared with samples of child speech matched as closely as possible for MLU
• Data from child and model coded for non-finites, simple finites and compound finites using same (automated) coding procedures
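The MLU-matching step in the method above might look like the following sketch. It assumes MLU is counted in words (it is also often counted in morphemes) and uses invented data; the function names are hypothetical and this is not the authors' procedure.

```python
def mlu(utterances):
    """Mean length of utterance, counted here in words."""
    return sum(len(u.split()) for u in utterances) / len(utterances)

def closest_run(model_runs, child_sample):
    """Index of the model output whose MLU is nearest to the child sample's."""
    target = mlu(child_sample)
    return min(range(len(model_runs)),
               key=lambda i: abs(mlu(model_runs[i]) - target))

# Hypothetical child sample and model outputs from three successive runs.
child_sample = ["want biscuit", "more juice please", "go"]
model_runs = [
    ["biscuit", "go"],
    ["want biscuit", "the ball", "go there now", "up"],
    ["do you want a biscuit", "where is the ball"],
]

best = closest_run(model_runs, child_sample)
```

Only the selected run would then be coded for non-finites, simple finites and compound finites and compared with the child's data.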
Simulating differences in patterns of finiteness marking in Dutch, German and Spanish
• Children modelled:
  – Peter - Groningen Dutch corpus (Bols, 1995)
  – Leo - MPI German corpus (Behrens, in press)
  – Juan - Nottingham Spanish corpus (Aguado-Orea, 2004)
Pattern of finiteness marking as a function of MLU for Peter and MOSAIC-Peter (Dutch)
[Two panels, each plotting the proportion (0–1) of Non-finite, Simple Finite and Compound Finite utterances: Data for Peter at MLU 1.5, 2.2, 3.1, 4.1; Model of Peter at MLU 1.4, 2.1, 2.7, 4.1]
MOSAIC simulates high proportion of OI errors in Dutch (and low proportion of compound finites)
Pattern of finiteness marking as a function of MLU for Leo and MOSAIC-Leo (German)
[Two panels, each plotting the proportion (0–1) of Non-finite, Simple Finite and Compound Finite utterances: Data for Leo at MLU 1.3, 2.2, 3, 3.8; Model of Leo at MLU 1.4, 2.3, 3, 4]
MOSAIC simulates the moderately high proportion of OI errors in German (and low proportion of compound finites)
Pattern of finiteness marking as a function of MLU for Juan and MOSAIC-Juan (Spanish)
[Two panels, each plotting the proportion (0–1) of Non-finite, Simple Finite and Compound Finite utterances: Data for Juan at MLU 2.2, 2.9, 3.8; Model of Juan at MLU 2.2, 2.7, 3.8]
MOSAIC simulates the low proportion of OI errors in Spanish (and high proportion of simple finites)
OI errors as a function of compound finites in the input and percentage of utterance final verbs in the input that were finite vs. non-finite
          OI errors at lowest    Compound finites    Utterance-final
          MLU point (%)          in input (%)        finite verbs (%)
Dutch     75                     31                  18
German    61                     22                  35
Spanish   18                     25                  74
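One way to read the table is to compare how the three languages rank on each measure. The short sketch below does this with the table's own values. It assumes that utterance-final verbs are either finite or non-finite, so that the non-finite share is 100 minus the finite share; with only three languages this is an illustration of the pattern, not a statistical test.

```python
# Percentages copied from the table above.
table = {
    "Dutch":   {"oi_errors": 75, "compound_finites": 31, "final_finite": 18},
    "German":  {"oi_errors": 61, "compound_finites": 22, "final_finite": 35},
    "Spanish": {"oi_errors": 18, "compound_finites": 25, "final_finite": 74},
}

def rank(languages, key):
    """Languages sorted from highest to lowest on the given measure."""
    return sorted(languages, key=key, reverse=True)

# Assumption: non-finite share of utterance-final verbs = 100 - finite share.
final_nonfinite = {lang: 100 - row["final_finite"] for lang, row in table.items()}

by_oi = rank(table, lambda lang: table[lang]["oi_errors"])
by_nonfinite = rank(table, lambda lang: final_nonfinite[lang])
by_compound = rank(table, lambda lang: table[lang]["compound_finites"])
```

The ordering by OI errors matches the ordering by utterance-final non-finite verbs (Dutch, German, Spanish) but not the ordering by compound finites in the input (Dutch, Spanish, German).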
Some claims made about language learning
• There are cultures in which children are not spoken to before they speak
Children only require minimal input to learn language OR
Children can learn language through overhearing
• There are cultures which believe children have to be taught language and corrected from ‘babytalk’
Children can learn language from a highly didactic interactive style
Learning to talk
Infant cognition
Intention reading and preverbal communication
Distributional analysis: prosody, phonemes, words
Form-meaning mappings
Linguistic universals?
Learning patterns, identifying slots, creating paradigms, abstraction
How does this relate to patterns of interaction with infants?
How much input is enough?
How does this relate to the amount and type of language that children hear?
Communicative Environment!!
Comparing recording situations
Our study
• Mostly outside
• Many different situations
• Mother often absent
• Many other children
Most previous studies
• Inside the house
• Mother and child playing
• Only mother present
• No other children
Characterising children’s communicative environment
• How much do people talk to children?
• How many people do children interact with?
• What types of interaction take place?
• How much do children react to what they overhear?
Data collection
INFANTS (2-3 hours per month)
             6m   8m   10m  12m  15m  18m  21m  24m
Dipkala      X    X    X    X    X    X    X    X
Saphal       X    X    X    X    X    X    X    X
TWO YEAR OLDS (3-4 hours per month)
             2;2 – 3;2    3;4 – 3;8
Khem         Monthly      Bi-monthly
Kamala       Monthly      Bi-monthly
THREE YEAR OLDS (3-4 hours per month)
             3;2 – 4;2    4;4 – 4;8
Kalpana      Monthly      Bi-monthly
Man Kumar    Monthly      Bi-monthly
Categories for characterising the communicative environment
Coded as behaviour of the child / to the child, by participant: Child, Mother, Father, Other adult, Other child
• Pointing
• Offering
• Object handling
• Mutual gaze
• Imitation
• Teasing
• Attention getting
• Showing
• Affection
• Playing
Vocalisations per minute
Transcription and data analysis
• Transcribed into Chintang/Puma
• Translated into Nepali
• Transcription, Nepali, literal and idiomatic translations into English entered on computer
• Interlinearised linguistic gloss added on computer
• Video and all transcriptions aligned and added to database archives in Nijmegen and at Tribhuvan University
• Videos analysed for amount and type of talk to children and for children’s communicative behaviour
• Children’s language analysed for productivity in the development of linguistic features of interest
Summary and conclusions
• Studying a wide variety of languages gives us access to typological differences that have a bearing on fundamental theoretical issues
• Detailed studies within a language can allow a comparison between the role of different cues and markers
• Comparison across languages for the same functions can give us insight into what makes learning easy or difficult
• Virtually all children learn to talk: what are the characteristics of their communicative environments that make this possible?
German Perfekt (w/ novel verb): children at 2;6 and 3;6
3. Role of Distributed Morphemes
Past participle (w/ novel verb) = 2;6
E: Das Kind hat den Mann ..........
C: Gemiekt!
Full utterance in Perfekt (w/ novel verb) = 3;6
E: Das Kind miekt den Mann
C: Das Kind hat den Mann gemiekt
Slobin on local cues
Wittek & Tomasello (2002) Journal of Child Language
German Perfekt (w/ novel verb): children at 2;6 and 3;6
2. Role of Type Frequency
E: Das Kind miekt den Mann
C: Das Kind hat den Mann gemiekt
E: Das Kind tammt
C: Das Kind ist getammen
Sein (ist) form productive later because lower type frequency (fewer verbs)
Wittek & Tomasello (2002) Journal of Child Language
The people and the languages
• Highly endangered languages, but nearly completely undocumented.
• Spoken in the lower foothills of the Himalayas.
• Rai ethnic group.
• Rai culture:
  – Sedentary subsistence farmers.
  – Extremely high degree of social compartmentalisation, where each household is a political unit.
  – The social system is largely identical with kinship system.
  – Shamanist ancestral worship with various degrees of Hindu and Buddhist influence.
  – Mixed with Nepali speakers and other ethnic groups, but marriage only within the culture.
The languages
Indo-European: Indo-Aryan, Balto-Slavic, Germanic, Italo-Celtic, etc.
  Indo-Aryan: Hindi, Nepali, etc.
Sino-Tibetan: Tibeto-Burman, Sinitic
  Tibeto-Burman: Kiranti, Bodish, Lolo-Burmese, etc.
    Kiranti: Central Kiranti, Eastern Kiranti, etc.
      Kiranti languages shown: Chintang, Limbu, Puma, Belhare, Bantawa, Yakkha, Camling, etc.
Language acquisition projects
1. The balance between Chintang and Nepali in children’s language development
2. Learning the special features of a Rai language
3. Documenting the communicative environment in which children learn to talk
Chintang VDC
• Chintang VDC has 9,000 – 10,000 people
• Mulgau – one of three hamlets in which Chintang is spoken as a native language
  – 85 households
  – 510 people
• With the help of the local assistants (studying B.Ed. on the Dhankuta campus), we identified 6 families who were prepared to let us film their children every month.