Assessment tools and learner corpora

Angel Chan

• Assessment Tools

– Mandarin Receptive Vocabulary Test

• Learner Corpora

– L2 spoken Mandarin Chinese corpus

– potential to build clinical corpora featuring Chinese language pathology

A New Assessment Tool for Child Mandarin Receptive Vocabulary

Angel Chan1, Kathy Lee2 & Virginia Yip3

1 Dept of Chinese & Bilingual Studies, HK Polytechnic University

2 Division of Speech Therapy, Dept of Otorhinolaryngology, Head & Neck Surgery, Faculty of Medicine, CUHK

3 Childhood Bilingualism Research Centre, Dept of Linguistics and Modern Languages, CUHK

Outline

• HK children’s exposure to Mandarin in kindergartens

• Mandarin Receptive Vocabulary Test

• Results

• Summary and significance

HK Kindergartens with Mandarin exposure

• Total no. of kindergartens: 965

• No. of kindergartens

- with Mandarin exposure = 831 (86.1%)

- without Mandarin exposure = 46 (4.8%)

- information unavailable = 88 (9.1%)

[Figure: pie chart of the three categories above; total number = 965]
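The percentages on this slide follow directly from the counts. A minimal sketch of the arithmetic, using only the figures reported above (the variable and function names are illustrative, not from the original study):

```python
# Counts reported on the slide; names are illustrative only.
counts = {
    "with Mandarin exposure": 831,
    "without Mandarin exposure": 46,
    "information unavailable": 88,
}

total = sum(counts.values())


def share(n: int, total: int) -> float:
    """Return n as a percentage of total, rounded to one decimal place."""
    return round(100 * n / total, 1)


shares = {label: share(n, total) for label, n in counts.items()}

for label, pct in shares.items():
    print(f"{label}: {pct}% (n={counts[label]})")
```

The three categories add up to the reported total of 965, so no kindergarten is counted twice or left out.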

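Children’s Mandarin exposure in Hong Kong kindergartens

[Figure: bar chart of the number of kindergartens (y-axis, 0 to 400) by minutes of Mandarin exposure per week (x-axis: 0 min, 1-20 min, 21-60 min, 61-150 min, >150 min), with separate bars for K1, K2 and K3. Exposure bands: Low Exposure (1-20 min), Average Exposure (21-60 min), High Exposure (61-150 min), High High Exposure (>150 min).]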

Growing importance of Mandarin in Hong Kong kindergartens

• Over 80% of HK kindergartens provide regular exposure to Mandarin, though with varying amounts of input.

• There is a lack of research-based understanding of HK children’s developmental profiles in Mandarin.

Available tools to assess early Mandarin vocabulary

Based on native Mandarin-speaking children in Taiwan:

• Lu, L., & Liu, H. (1988). Revised Peabody Picture Vocabulary Test: Mandarin Chinese Version (修訂畢保德圖畫詞彙測驗). Taipei, Taiwan: Psychological Publishing Co. Ltd.

Based on native Mandarin-speaking children in Beijing:

• Tardif, T., Fletcher, P., Zhang, Z.X., Liang, W.L., & Zuo, Q.H. (2008). The Chinese Communicative Development Inventory (Putonghua and Cantonese versions): Manual, Forms, and Norms. Peking University Medical Press.

• Hao, M., Shu, H., Xing, A., & Li, P. (2008). Early vocabulary inventory for Mandarin Chinese. Behavior Research Methods, 40(3), 728-733.

Lack of assessment tools

• No standardized tools to assess the Mandarin proficiency of Hong Kong preschool children.

• Lack of assessment tools even for monolingual Mandarin children in China and Taiwan.

Mandarin Receptive Vocabulary Test 普通話詞彙理解測驗

• Early vocabulary inventory for Mandarin Chinese (Hao et al. 2008)

http://brm.psychonomic-journals.org/content/40/3/728/suppl/DC1

• Data from 884 Chinese families in Beijing

• Infants and toddlers from 12 to 30 months

• Checklist and norms available via the internet

• Words from the 90th-percentile comprehension vocabulary of 30-month-olds were chosen for item construction

Mandarin Receptive Vocabulary Test 普通話詞彙理解測驗

98 target words belonging to 14 semantic categories

Target children:

– Preschool children aged 3-6

Quick to administer: 10-20 minutes

Easy to administer: each child is shown four pictures at a time and asked to point to the named picture

Target: 杯子 /bei1 zi/ ‘glass’

Phonological distracter: 被子 /bei4 zi/ ‘quilt’

Semantic distracter: 碗 /wan3/ ‘bowl’

Unrelated distracter: 枕頭 /zhen3 tou/ ‘pillow’
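The four-picture item design lends itself to a simple record that maps a child's response to an error type. The sketch below is illustrative only: the `VocabItem` class and its `classify` method are hypothetical names, not part of the published test, and the single item shown is the example from this slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VocabItem:
    """One test item: a target picture and three distracter pictures."""
    target: str
    phonological: str
    semantic: str
    unrelated: str

    def classify(self, response: str) -> str:
        """Label the picture the child pointed to."""
        labels = {
            self.target: "correct",
            self.phonological: "phonological error",
            self.semantic: "semantic error",
            self.unrelated: "unrelated error",
        }
        if response not in labels:
            raise ValueError(f"{response!r} is not one of the four pictures")
        return labels[response]


# The example item from the slide.
item = VocabItem(target="杯子", phonological="被子", semantic="碗", unrelated="枕頭")

print(item.classify("杯子"))  # correct
print(item.classify("被子"))  # phonological error
```

Recording the chosen picture, rather than only right or wrong, is what makes the later error analysis by distracter type possible.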

Subjects

• 1163 Hong Kong children (age 3-6, L1 Cantonese) who learn Mandarin as an L2

– come from 4 input condition groups, which differ in the amount of Mandarin exposure time children regularly receive in school

– ranging from 15-20 minutes to more than 150 minutes per week

• 288 L1 Mandarin children in Beijing (age 3-6)

Input condition group N

LE (1-20 min) 468

AE (21-60 min) 312

HE (61-150 min) 280

HHE (>150 min) 103

L1 288

Major findings

• input condition is the strongest factor influencing the test score (p < .05, effect size: 0.655), demonstrating that input quantity influences child L2 competence (De Houwer 2011).

Major findings

Error analysis on average error percentage

• A 3-way repeated-measures ANOVA was applied to investigate the general error patterns.

• Significant effects included:

– Distracter main effect (p < 0.001)

– Distracter*age group interaction effect (p < 0.001)

– Distracter*input condition group interaction effect (p < 0.001)

– Distracter*age group*input condition group interaction effect (p < 0.001)

Major finding

• Error analyses reveal a significant interaction between distracter type, age group and input condition group (p < 0.001), with L2 and L1 children showing distinct profiles in how the distribution of error types changes across age.

– L2 children: semantic and phonological errors are both frequent at younger ages; as children grow older, semantic errors diminish, but certain phonological errors (especially tone errors) still persist at ages 5 and 6

– L1 children: phonology is not a major problem at any age

Average Error Percentage (L1)

Error type | 3;00-3;05 (319) a | 3;06-3;11 (220) | 4;00-4;05 (178) | 4;06-4;11 (200) | 5;00-5;05 (134) | 5;06-5;11 (109)
Phonological (P) | 29.4 | 28.6 | 28.9 | 33.3 | 36.0 | 20.5
Semantic (S) | 51.2 | 47.0 | 50.9 | 48.5 | 47.9 | 49.8
Unrelated (U) | 19.4 | 24.4 | 20.2 | 18.2 | 16.1 | 29.7
Pairwise comparison with Bonferroni correction | S > U** | NS | S > U* | S > U** | NS | NS

a Numbers in brackets indicate the total number of error items.

> denotes statistically significantly larger than; ** p < 0.01, * p < 0.05, NS denotes not statistically significant
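As a reading aid for the table, the sketch below checks that each age band's three percentages sum to 100 and shows how a Bonferroni-adjusted threshold would be obtained. It assumes three pairwise comparisons per age band (P vs S, P vs U, S vs U) and a family-wise alpha of 0.05; the slide does not state either, so both are assumptions, and the variable names are illustrative.

```python
from itertools import combinations

age_bands = ["3;00-3;05", "3;06-3;11", "4;00-4;05",
             "4;06-4;11", "5;00-5;05", "5;06-5;11"]

# Average error percentages copied from the table above.
errors = {
    "Phonological": [29.4, 28.6, 28.9, 33.3, 36.0, 20.5],
    "Semantic": [51.2, 47.0, 50.9, 48.5, 47.9, 49.8],
    "Unrelated": [19.4, 24.4, 20.2, 18.2, 16.1, 29.7],
}

# Each age band's percentages should account for all error items.
column_sums = [round(sum(col), 1) for col in zip(*errors.values())]

# Most frequent error type in each age band.
dominant = [max(errors, key=lambda k: errors[k][i])
            for i in range(len(age_bands))]

# Assumed: three pairwise comparisons and a family-wise alpha of 0.05.
pairs = list(combinations(errors, 2))
alpha_adjusted = 0.05 / len(pairs)

print(column_sums)
print(dominant)
print(f"{len(pairs)} comparisons, per-comparison alpha = {alpha_adjusted:.4f}")
```

Under these assumptions a single comparison would need p below roughly 0.0167 to count as significant at the 0.05 level after correction.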

Significance

• offers researchers and clinicians a useful screening test and an alternative to parental checklists such as the Chinese Communicative Development Inventory (Tardif et al 2008) and the early vocabulary inventory for Mandarin Chinese (Hao et al. 2008) to assess receptive vocabulary competence in Mandarin.

Significance of findings

– For researchers:

• What are the optimal input conditions for acquisition in terms of quantity and quality of input?

• What are the common semantic and phonological errors?

• What do these errors tell us about the child’s developing semantic and phonological systems?

Significance of findings

– For clinical & educational practitioners and parents:

• How to create optimal input conditions in terms of the quantity and quality of input to support balanced bilingual/trilingual development?

• A baseline profile for typically developing bilingual children needs to be established for comparison with their atypically developing counterparts

• How to attend to semantic distinctions and phonological distinctions in therapy and pedagogy?

Key References

• Hao, M., Shu, H., Xing, A., & Li, P. (2008). Early vocabulary inventory for Mandarin Chinese. Behavior Research Methods, 40(3), 728-733.

• Lee, K.Y.S., Lee, L.W.T., & Cheung, P.S.P. (1996). Hong Kong Cantonese Receptive Vocabulary Test. Hong Kong: The Hong Kong Society for Child Health and Development.

Acknowledgments

• Research grants: “Constructing a Blueprint of a New Assessment Tool for Child L2 Mandarin Receptive Vocabulary” (HKPU Ref. No. 1-ZV8K) and “From Lexicon to Syntax in Childhood Bilingualism” (RGC Ref. No. CUHK 453808)

• We thank Yang Wenchun, Angela He, Jacqueline Lai, Sunny Park, Kelly Shum, Alice Tse, Eunice Wong, Hinny Wong, Reace Wong, Zhu Xin, Wang Jiao, Liu Chang, Wang Zheng, Claire Au and Joffee Lam for their participation.

A New Multimedia Shared L2 Spoken Mandarin Chinese Corpus: Construction and Linguistic Analyses

Angel CHAN1, Zhen-Hui FENG2, Wen-Chun YANG1

1 The Hong Kong Polytechnic University, 2 Lingnan University

Data Sharing

• a growing commitment to data-sharing

• basing replicable empirical and theoretical analyses on openly shared data

• initiatives to share learner language corpora with the international research community through internet interfaces have become more common

– however, these have thus far mostly been limited to corpora featuring European languages as the target languages

Existing Corpora in the Focus Area “SLABank” of Talkbank

No. | Name of Corpus | Target L2 Language | L1 | Contributors
1 | BELC | English | Spanish | Research team at the Department of English of the University of Barcelona
2 | Connolly | English | Japanese | Steve Connolly (Tokyo)
3 | CUHK | English | Chinese | Brian MacWhinney (Department of Psychology, Carnegie Mellon University)
4 | DiazRodriguez | Spanish | German/Swedish/Icelandic/Korean/Chinese | Lourdes Diaz Rodriguez (Universitat Pompeu Fabra, Spain)
5 | Dresden | English/French/Czech | German | Angelika Kubanek-German (University of Braunschweig)
6 | ESF | Dutch/English/French/German/Swedish | Arabic/Finnish/Punjabi/Spanish/Turkish | Wolfgang Klein, Clive Perdue (Max Planck Institute)
7 | FLLOC | French | English | Florence Myles (University of Southampton)

Existing Corpora in the Focus Area “SLABank” of Talkbank (cont’d)

No. | Name of Corpus | Target L2 Language | L1 | Contributors
8 | Køge | Danish | Turkish | Jens Normann Jørgensen (University of Copenhagen)
9 | Langman | Hungarian | Chinese | Juliet Langman (University of Texas at San Antonio)
10 | Liceras | Spanish | English | Juana Liceras (University of Ottawa)
11 | PAROLE | English, French, Italian | English, French | Languages research team (Laboratoire LLS) at the Université de Savoie (Chambéry, France)
12 | Qatar | English | Arabic | Yun Zhao (Carnegie Mellon University)
13 | Reading | French | English | Brian Richards (University of Reading)
14 | SPLLOC | Spanish | English | A team of researchers in Southampton, Newcastle, and York universities
15 | TCD | French | English | Seán Devitt (School of Education, Trinity College, Dublin)

Existing Corpora in the Focus Area “BilingBank” of Talkbank

No. | Name of Corpus | Target Languages | Contributors
1 | Bangor-Pilot | Welsh-English | Margaret Deuchar (University of Wales)
2 | Bangor (Welsh-English) Siarad | Welsh-English | Margaret Deuchar (Bangor University)
3 | BlumSnow | Hebrew-English | Shoshana Blum-Kulka (Hebrew University), Catherine Snow (Harvard Graduate School of Education)
4 | Eppler | German-English | Eva Eppler (University of Surrey Roehampton)
5 | Gardner-Chloros | Greek-English | Dr. P. H. Gardner-Chloros (Birkbeck College)
6 | Hatzidaki | Greek-French | Aspa Hatzidaki
7 | Køge | Turkish-Danish | Jens Normann Jørgensen (University of Copenhagen)

Existing Corpora in the Focus Area “Clinical Corpora” of Talkbank

No. | Name of Corpus | Language | Age Range | N | Contributors
1 | Bliss | English | 3;0–11;8; 2;3–11;8 | 8 normal; 7 impaired | Lynn S. Bliss (Wayne State University)
2 | Bol / Kuiken | Dutch | 4;1.16–8;1.17; 8-18; 3-9; 1;7-3;7 | 20; 20; 20; 47 | Gerard Bol (University of Groningen)
3 | Bol / Pool | Dutch | 6-7 | 6 | Gerard Bol (University of Groningen)
4 | Chiat | English | 5;0-5;8 | 3 | Shula Chiat
5 | Conti-Ramsden 1 | English | 4;0–9;0 | 4+4 | Gina Conti-Ramsden (The University of Manchester)
6 | Conti-Ramsden 2 | English | 1;11–5;8 | 3+3 | Gina Conti-Ramsden (The University of Manchester)
7 | Conti-Ramsden 3 | English | 2-4 | 4 | Gina Conti-Ramsden (The University of Manchester)

Existing Corpora in the Focus Area “Clinical Corpora” of Talkbank

No. | Name of Corpus | Language | Age Range | N | Contributors
8 | Conti-Ramsden 4 | English | 13-15 | 19, 99 | Gina Conti-Ramsden (The University of Manchester)
9 | CORDIS | Spanish | 10-21 | 52 | Teresa Fernández de Vega Losada, et al.
10 | Feldman | English | 1;2–3;0; xxx | 4 sets of twins | Heidi Feldman (Children’s Hospital)
11 | Flusberg | English | xxx | 6 autism; 6 Down | 
12 | Foudon | French | 3;9-9;2 | 8 autism | Nadège Foudon (Institut des Sciences Cognitives)
13 | Fujiki / Brinton | English | 24–77 years | 42 | Bonnie Brinton, Martin Fujiki (Brigham Young University)
14 | Hargrove | English | 3;0–6;0 | 6 | Patricia Hargrove (Mankato State University)
15 | Hooshyar | English | 1;4–2;11; 3;2–11;6; 2;8–5;9 | 40 normal; 31 Downs; 21 impaired | Nahid Hooshyar

Existing Corpora in the Focus Area “Clinical Corpora” of Talkbank

No. | Name of Corpus | Languages | Age Range | N | Contributors
16 | Le Normand-SLI | French | | 6 | Dr. Marie-Thérèse Le Normand
17 | LeNormand-Apraxia | French | | 3 | Dr. Marie-Thérèse Le Normand
18 | Levy | Hebrew | 1;10–8;4 | 14 | Yonata Levy (Hebrew University)
19 | Malakoff / Mayes | English | 2;0–2;22 | 76 | Marguerite E. Malakoff (Harvey Mudd College)
20 | MOC | Spanish | 1;6-5;5 | 1 | Ignacio Moreno-Torres, Santiago Torres, Rafael Santana (University of Málaga, University of Las Palmas)
21 | Nadig | English, French | 3-7 | 12 English, 8 French | Janet Bang, Aparna Nadig
22 | Nicholas | English | 12-48 | 90 | Johanna Nicholas

Existing Corpora in the Focus Area “Clinical Corpora” of Talkbank

No. | Name of Corpus | Target Languages | Age Range | N | Contributors
23 | Oviedo | Spanish | 7–8 | 2 | Eliseo Diez-Itza (Universidad de Oviedo)
24 | Rollins | English | 2;2–3;1 | 5 | Pamela Rosenthal Rollins (University of Texas at Dallas)
25 | Rondal | English | 3;0–12;1 | 21 Downs; 21 controls | Jean Rondal (Laboratoire de Psychologie)
26 | Serra | Spanish | 3;9-5;1 | 10 | Miquel Serra (Universitat de Barcelona)
27 | Ulm | German | 3;0–7;5 | 165 | Andrea Haege (University of Ulm)
28 | Weismer | English | 2;6, 3;6, 4;6, and 5;6 | 138 | Susan Ellis Weismer (San Diego State University)

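Existing corpora on L2 Chinese

• Only a handful of SLA corpora feature Chinese as the target language, with a recent emerging trend to share SLA learner corpora of Chinese on the internet:

– (i) 暨南大學留學生書面語語料庫 (Jinan University written-language corpus of international students)

– (ii) 暨南大學華文學院留學生口語語料庫 (Jinan University College of Chinese Language and Culture spoken-language corpus of international students)

– (iii) 北京語言大學「HSK 動態作文語料庫」 (Beijing Language and Culture University “HSK Dynamic Composition Corpus”)

– (iv) 北京語言大學漢語仲介語語料庫 (Beijing Language and Culture University Chinese interlanguage corpus)

• Most corpora feature only written data rather than spoken data

– except (ii)

• The web interfaces of all these corpora are in Chinese

– may not be user-friendly for researchers who do not read Chinese but would like to conduct cross-linguistic comparisons involving Chinese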

This project

• constructed a web-accessible and video-linked Second Language (L2) spoken Mandarin Chinese corpus in a common international interchange format

– using the frog story commonly used in cross-linguistic research (Mayer 1969, Berman & Slobin 1994)

– featuring 14 L2 adult participants (First Language (L1): English) and 6 L1 adult participants as controls

– aiming to share the corpus through the international TalkBank database platform (MacWhinney et al 2004; MacWhinney 2007; http://talkbank.org/)

Background information of the 14 L2 Subjects (L1 English)

No. | Subject | Age | Gender | Education Level | Age at which learning of Chinese started | Contexts of acquisition | Other languages
1 | Mi | 24 | F | Master | 18 | Classroom/Conversation/Reading | German, French
2 | Pa | 48 | M | Master | 28 | Classroom/Conversation/Self-learning | German, French
3 | Ga | 42 | M | Doctor | 19 | Conversation/Reading/Watching TV and movies | Spanish
4 | Je | 38 | M | Doctor | 30 | Conversation | French
5 | Jo | 37 | M | High School | 34 | Classroom/Conversation | None
6 | Ta | 36 | F | High School | 33 | Classroom/Conversation | None
7 | Aa | 23 | M | Bachelor | 22 | Classroom/Conversation | French, Swedish
8 | Al | 34 | M | Bachelor | 30 | Classroom/Conversation/Reading/Self-learning | None
9 | Ba | 20 | M | Bachelor | 18 | Classroom/Conversation | Spanish, German
10 | Ge | 34 | M | Master | 27 | Conversation | Spanish
11 | Ja | 31 | M | Master | 24 | Classroom/Conversation/Self-learning | Spanish, French
12 | Mo | 70 | M | Doctor | 22 | Classroom/Conversation/Reading/Self-learning | French
13 | Na | 23 | M | Bachelor | 20 | Classroom/Conversation | French
14 | Pi | 24 | M | Bachelor | 20 | Classroom/Conversation/Self-learning | None

Background Information of the 6 L1 Mandarin Subjects

No. | Participant | Age | Education Level | Gender | L2 Matching Subject
1 | Ya | 24 | Master | F | Mi
2 | Qi | 49 | Bachelor | M | Pa
3 | Gu | 38 | Bachelor | M | Ga
4 | Do | 35 | Doctor | M | Je
5 | Wu | 35 | Doctor | M | Jo
6 | Zha | 32 | Master | F | Ta

Corpus construction

Following the TalkBank format:

• videotape and collect speech samples from each participant

• orthographically transcribe the speech samples according to the standard CHAT format

• link each transcribed utterance to the original video data

• conduct inter-person reliability checks of

– the transcription format

– the video-linking and synchronization of the data

• perform automatic part-of-speech and English tagging of the transcriptions using the CLAN software in the TalkBank system

• manually disambiguate the automatic tagging
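The transcription and video-linking steps above can be pictured with a small reader for CHAT-style lines. This is a minimal sketch, not the CLAN implementation: it assumes that header lines start with `@`, main speaker tiers with `*`, dependent tiers with `%`, and that the media link is a `start_end` millisecond pair wrapped in the control character `\x15`. The sample lines, the `Utterance` class and the `parse_chat` function are hypothetical and written only for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Assumed media-link shape: \x15<start ms>_<end ms>\x15
BULLET = re.compile("\x15(\\d+)_(\\d+)\x15")


@dataclass
class Utterance:
    speaker: str
    text: str
    start_ms: Optional[int] = None
    end_ms: Optional[int] = None
    tiers: Dict[str, str] = field(default_factory=dict)


def parse_chat(lines: List[str]) -> List[Utterance]:
    """Collect main tiers and attach dependent tiers to the preceding one."""
    utterances: List[Utterance] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if line.startswith("*"):
            speaker, _, rest = line[1:].partition(":")
            start = end = None
            match = BULLET.search(rest)
            if match:
                start, end = int(match.group(1)), int(match.group(2))
                rest = BULLET.sub("", rest)
            utterances.append(Utterance(speaker, rest.strip(), start, end))
        elif line.startswith("%") and utterances:
            name, _, rest = line[1:].partition(":")
            utterances[-1].tiers[name] = rest.strip()
        # Header lines (starting with "@") are skipped in this sketch.
    return utterances


# Hypothetical sample, not taken from the corpus.
sample = [
    "@Begin",
    "@Participants:\tPAR Participant",
    "*PAR:\t男孩 在 找 青蛙 . \x150_2300\x15",
    "%com:\thypothetical comment tier",
    "@End",
]

parsed = parse_chat(sample)
print(parsed[0])
```

The sketch ignores continuation lines and does no validation; for real corpus files the CLAN tools remain the reference.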

Story Script for Frog, Where Are You? by Mercer Mayer, 1969

1 There once was a boy who had a dog and a pet frog. He kept the frog in a large jar in his bedroom.

2 One night while he and his dog were sleeping, the frog climbed out of the jar. He jumped out of an open window.

3 When the boy and the dog woke up the next morning, they saw that the jar was empty.

4 The boy looked everywhere for the frog. The dog looked for the frog too. When the dog tried to look in the jar, he got his head stuck.

5 The boy called out the open window, “Frog, where are you?” The dog leaned out the window with the jar still stuck on his head.

6 The jar was so heavy that the dog fell out of the window headfirst!

7 The boy picked up the dog to make sure he was ok. The dog wasn’t hurt but the jar was smashed.

8-9 The boy and the dog looked outside for the frog. The boy called for the frog.

10 He called down a hole in the ground while the dog barked at some bees in a beehive.

11 A gopher popped out of the hole and bit the boy right on his nose. Meanwhile, the dog was still bothering the bees, jumping up on the tree and barking at them.

12 The beehive fell down and all of the bees flew out. The bees were angry at the dog for ruining their home.

13 The boy wasn’t paying any attention to the dog. He had noticed a large hole in a tree. So he climbed up the tree and called down the hole.

Story Script for Frog, Where Are You? by Mercer Mayer, 1969

14 All of a sudden an owl swooped out of the hole and knocked the boy to the ground.

15 The dog ran past the boy as fast as he could because the bees were chasing him.

16 The owl chased the boy all the way to a large rock.

17 The boy climbed up on the rock and called again for his frog. He held onto some branches so he wouldn’t fall.

18 But the branches weren’t really branches! They were deer antlers. The deer picked up the boy on his head.

19 The deer started running with the boy still on his head. The dog ran along too. They were getting close to a cliff.

20-21 The deer stopped suddenly and the boy and the dog fell over the edge of the cliff.

22 There was a pond below the cliff. They landed with a splash right on top of one another.

23 They heard a familiar sound.

24 The boy told the dog to be very quiet.

25 They crept up and looked behind a big log.

26 There they found the boy’s pet frog. He had a mother frog with him.

27 They had some baby frogs and one of them jumped towards the boy.

28-29 The baby frog liked the boy and wanted to be his new pet. The boy and the dog were happy to have a new pet frog to take home. As they walked away the boy waved and said “goodbye” to his old frog and his family.

A Sample Video-linked Transcript

Adding Tagging using the CLAN software

Key characteristics of this corpus

• Will be made openly accessible on the internet

• Having video-linked (hence audio-linked too) transcribed oral data

L2 acquisition of Chinese directional complements (Feng 2011)

Directional Complements in Chinese (Wu 2011)

Six types of directional complement constructions in Chinese (Wu 2011)

Type Example

1. Simple DCs 他 走 到 了。

2. Complex DCs 他 走 进 来 了。

3. Simple DCs with Object NPs 他 搬 出 了 一张大桌子。

4. Simple DCs with Place NPs 他 走 回 宿舍 了。

5. Complex DCs with Object NPs 他 搬 出 一张大桌子 来 了。

6. Complex DCs with Place NPs 他 走 回 宿舍 来 了。

L2 acquisition of Chinese directional complements (Feng 2011)

• L1-L2 comparisons on:

- frequency of use

- accuracy

- productivity of verb usage (verb type frequency)

• Major finding:

- L1-L2 difference especially with constructions that are structurally more complex
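Of the three measures, productivity of verb usage is the least self-explanatory: it counts how many different verbs appear with directional complements (types), as opposed to how many uses there are in total (tokens). A minimal sketch of that distinction follows. The input pairs reuse the verb and complement combinations from the example sentences on the earlier slide and are not learner data; the function name is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def verb_frequencies(tokens: List[Tuple[str, str]]) -> Dict[str, int]:
    """Count verb tokens and distinct verb types in (verb, complement) pairs."""
    verbs = Counter(verb for verb, _ in tokens)
    return {"tokens": sum(verbs.values()), "types": len(verbs)}


# Illustrative input only: pairs taken from the example sentences, not from the corpus.
example_pairs = [("走", "到"), ("走", "进来"), ("搬", "出"), ("走", "回")]

result = verb_frequencies(example_pairs)
print(result)  # {'tokens': 4, 'types': 2}
```

A speaker who uses many complements but always with the same one or two verbs would score high on token frequency and low on type frequency, which is why the two are reported separately.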

Significance

• could raise the visibility of SLA learner corpora featuring Chinese as the target language

• has the potential of being further expanded and developed into a multi-mother-tongue corpus of learner Chinese

– featuring a variety of first languages as well as a variety of Chinese languages as the second languages

• could be expanded to clinical corpora featuring Chinese

Thank You!

angel.ws.chan@polyu.edu.hk
