Page 1: Qiufang Wen The national research center for foreign language education, BFSU Chinese learner corpora and second language research The 2006 International

Qiufang Wen

The national research center for foreign language education, BFSU

Chinese learner corpora and second language research

The 2006 International Symposium of Computer-Assisted Language Learning

June 2-4, 2006, Beijing

Topics to be addressed

•English corpora of Chinese learners

•Corpus-based studies on English learners in mainland China

•Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues

•Advantages and disadvantages of corpus-based studies on the interlanguage

Topic One

English corpora of Chinese learners

•Chinese Learner English Corpus (CLEC)

•College Learners’ Spoken English Corpus (COLSEC)

•Spoken and Written Corpus of Chinese Learners (SWECCL)

–Version 1

–Version 2 (under construction)

•Bilingual Corpus of Chinese English Learners (BICCEL): under construction

1. Chinese Learner English Corpus (CLEC) by Gui & Yang in 2003

•Written corpus: 1 million words

•Timed and untimed compositions

•Levels of proficiency

– Middle school students

– Non-English majors (Band 4)

– Non-English majors (Band 6)

– English majors (Band 4)

– English majors (Band 8)

•Error-tagged

Two Types of English Learners in University

English majors (Year 1 to Year 4): Band 4, Band 8

Non-English majors (Year 1 to Year 4): Band 2, Band 4, Band 6

2. College Learners’ Spoken English Corpus (COLSEC) by Yang & Wei in 2005

•Tokens: 0.7 million

•Source: National spoken English test for non-English majors

•Test items

– Teacher-student conversation

– Student-student discussion

– Teacher-student discussion

•Data format: written transcripts

3. Spoken and Written Corpus of Chinese Learners (SWECCL) by Wen, Wang & Liang in 2005 (Version 1)

SWECCL
WECCL           SECCL
1.18 million    1.46 million

Spoken (SECCL)

•Source of data

– National spoken English test: 1996-2002

– Second-year English majors

•Data format

– Digital sounds as well as transcripts of the speeches

National spoken English test for English majors: Band 4

•Test format

– Test in a lab

•The number of testees annually

– 2006: more than 16,000

– Expect to have 50,000 in the future

•Scoring procedures

– A random sample (30-35 tapes)

– Two raters scoring one tape independently

•Number of subjects

– 6 groups from each year (1996-2002)

– 42 groups (30 to 35 students each) = about 1400 students

– About 230 hours’ speech

•Testing items

Testing items

Task        Content                        Preparation time
Retelling   A story                        Listen twice but no preparation    3 min.
Monologue   Personal experience            3 min.                             3 min.
Role play   About an issue in daily life   3 min.                             4 min.

The structure of SECCL

[Tree diagram of SECCL. Node labels: Text, Tagged, Raw, Special, Article, Past Tense, Whole, Task, Year, Task A, Task B, Task C, Sound files (1996-2002)]

The written component

Written: Year 1, Year 2, Year 3, Year 4

The written component

•Source of data

– Timed compositions in class (40 minutes, no fewer than 300 words)

– Take-home compositions (no word limit)

•Types of compositions

– Argumentative (a list of topics provided)

– Narrative

SWECCL in 2007 (Version 2)

SWECCL
WECCL          SECCL
Two million    Two million

SECCL (Version 2)

•2003-2006 National Spoken English Test for second-year English majors (Band 4)

•2000-2006 National Spoken English Test for 4th-year English majors, Band 8 (Task 3)

•Longitudinal data (2001-2004)

Spoken (Band 8)

•Testing item (Task C)

– Make a comment on a given topic

•Data format

– Digital sounds as well as transcripts of the speeches

Spoken (Longitudinal)

•72 students 56 students

•40 hours’ speech

                        Year 1   Year 2   Year 3   Year 4
Data collection time    2001     2002     2003     2004

Tasks

•Reading aloud

•Retelling a story

•Talking on a given topic (Narrative)

•Talking on a given topic (argumentative)

•Conversation (Role play)

•Discussion on a given topic

4. Bilingual Corpus of Chinese English Learners (BICCEL)

BICCEL
Spoken                        Written
E-C            C-E            E-C            C-E
0.5 million    0.5 million    0.5 million    0.5 million

Spoken component of BICCEL

•National Oral English test, Band 8

– The 4th year English majors

– Interpreting from English to Chinese (Task A)

– Interpreting from Chinese to English (Task B)

– 2001-2005: 1100 testees

Written component of BICCEL

•Source of data: in-class assignment

–E-C and C-E translation

–Across the 3rd and 4th years

–30 universities across the country

Topic Two

A brief review of corpus-based studies on Chinese learner English

Sources

•China National Knowledge Infrastructure (CNKI) (online journals)

•Digital dissertation database

Corpus-based studies in mainland China

Studies by year

Year     Articles   Dissertations
2006     9          7
2005     40         28
2004     29         17
2003     8          5
2002     6          5
2001     6          1
2000     1          0
Total    99         63

Research areas

Area            Articles   Dissertations   Total
Phonological    5          1               6
Lexical         43         48              91
Grammatical     27         8               35
Discourse       8          2               10
Others          16         4               20
Total           99         63              162
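
The research-area counts can be turned into shares of the 162 studies with a few lines of Python. This is only an illustrative sketch: the counts are copied from the table, while the dictionary layout and the function name `area_shares` are assumptions made for the example and do not come from the talk.

```python
# Counts copied from the "Research areas" table: (articles, dissertations).
research_areas = {
    "Phonological": (5, 1),
    "Lexical": (43, 48),
    "Grammatical": (27, 8),
    "Discourse": (8, 2),
    "Others": (16, 4),
}


def area_shares(table):
    """Return {area: (row_total, percent_of_all_studies)}."""
    grand_total = sum(a + d for a, d in table.values())
    return {
        area: (a + d, round(100 * (a + d) / grand_total, 1))
        for area, (a, d) in table.items()
    }


shares = area_shares(research_areas)
for area, (total, pct) in shares.items():
    print(f"{area:<13}{total:>4}{pct:>7}%")
```

Run on these figures, the sketch shows lexical studies accounting for a little over half of all the work surveyed.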

Conferences & workshops

•The International conference on “Corpus Linguistics”, 25-27 October, 2003

•The First National Symposium on corpus linguistics and ELT Education, 11-13 October, 2004

•Workshop on the use of corpus in teaching and research, 17-19 March, 2006

Topic Three

Several corpus-based studies on English learners’ interlanguage by myself or together with my colleagues

Study One

Features of oral style in English compositions of advanced Chinese EFL learners

Wen, Q.F., Ding, Y.R. & Wang, W.Y. 2003. Foreign Language Teaching & Research (4): 268-274.

Study Two

A Study on Frequency Adverbs Used by Advanced English Learners in China

Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

Study Three

An analysis of English Majors’ Abstracting abilities through their English compositions

Wen, Q.F. & Liu, R.Q. 2006. Foreign Languages (2)

Study Four

•A longitudinal study on the developmental features of speaking vocabulary by English majors in mainland China

Wen, Q. F. 2006. Foreign Language Teaching and Research (3).

Study Five

•A comparison of developmental features of Speaking and Writing vocabulary by English majors

•Wen, Q. F. 2006. Foreign languages and Foreign Language Teaching (4)

Study Six

Patterns of change in speaking vocabulary development by English majors

Study Two

A Study on Frequency Adverbs Used by Advanced English Learners in China

Wen, Q. F. & Ding, Y. R. 2004. Modern Foreign Languages (2): 141-147.

Frequency Adverbs

•Adverbs used for describing “how often” something happens

•never, sometimes, usually, always

Top Twenty Frequency Adverbs

•Most frequently used by native speakers according to the analyses of the British National Corpus (BNC) by Leech, Rayson and Wilson (2001)

Top Twenty Frequency Adverbs (TTFAs)

Level of vocabulary    No.   Frequency adverbs
1000-word level        15    never, always, often, ever, *sometimes, usually, once, generally, hardly, no longer, increasingly, *twice, in general, occasionally, mostly
2000-word level        3     frequently, rarely, regularly
Academic word list     2     normally, constantly
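
As an illustration of how the twenty TTFAs might be counted in a stretch of learner text, here is a minimal Python sketch. It is not the procedure used in the study: the function name `count_ttfas` and the sample sentence are invented for this example, and plain string matching cannot separate the frequency sense of a word such as "once" or "ever" from its other uses, so real counts would need manual checking.

```python
import re
from collections import Counter

# The twenty adverbs listed in the table above (asterisks dropped).
TTFAS = [
    "never", "always", "often", "ever", "sometimes", "usually", "once",
    "generally", "hardly", "no longer", "increasingly", "twice",
    "in general", "occasionally", "mostly", "frequently", "rarely",
    "regularly", "normally", "constantly",
]

# Longer items are listed first in the alternation, and the \b word
# boundaries stop "ever" from matching inside "never".
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(w) for w in sorted(TTFAS, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def count_ttfas(text):
    """Count case-insensitive occurrences of each TTFA in a string."""
    return Counter(m.group(1).lower() for m in _PATTERN.finditer(text))


# Invented sample sentence, for illustration only.
sample = "I never go there. Sometimes I am no longer sure, but in general I always try."
print(count_ttfas(sample).most_common())
```

The multiword items "no longer" and "in general" are matched only when separated by a single space, which is a simplification of this sketch.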

Common features

•All high-frequency words

•Different frequencies in speech and writing except sometimes and twice

(Leech et al. 2001)

A comparison of TTFAs in speech and writing

•The overall difference

TTFAs are more likely to occur in writing than in speech.

•The specific differences

Speech: never, always, ever, normally

Neutral: sometimes, twice

Writing: 14 words

Previous corpus-based studies

•e.g. Altenberg & Granger, 2001; Cobb, 2002; Ringbom, 1998; Wen, Ting, & Wang, 2003

•Conflicting finding one: overuse vs. underuse

Examples

•Overuse high-frequency words in writing (Cobb, 2001)

•Overuse modal verbs (Aijmer, 2002)

•Underuse adverbial connectors (Altenberg & Tapper, 1998)

•No study on frequency adverbs

Conflicting finding two

•Tend to use written style features in their speech

•Tend to use a mixed register in either speech or in writing

•Tend to use oral style features in their writing

•Did not compare the use of high-frequency words in speech with writing

General purposes of this study

Whether Chinese EFL learners simply overuse the TTFAs or overuse some while underusing others

Whether they use the TTFAs similarly or differently when their speech is compared with their writing

Research questions

• Do they overuse or underuse the TTFAs differently between speech and writing?

• Do they differ more from native speakers in writing or in speaking with regard to the use of the TTFAs?

• Do they demonstrate a similar pattern of writing-speaking difference as native speakers in the use of the TTFAs?

Data for analysis

The learner corpus: The corpus of English majors in China (CEMIC), 955,043 words

– Spoken (SECCL): 473,408 words

– Written (CLEC): 481,635 words

The native-speaker corpus: The British National Corpus (BNC), 100 million words

– Spoken (BNCS): 10 million words

– Written (BNCW): 90 million words

Data analysis

Four comparisons

• Learners’ speech and native speakers’ speech: SECCL vs. BNCS

• Learners’ writing and native speakers’ writing: CLEC vs. BNCW

• The learner-native difference in speech and the learner-native difference in writing: SECCL vs. BNCS and CLEC vs. BNCW

• The speech-writing difference among learners and the speech-writing difference among native speakers: SECCL vs. CLEC and BNCS vs. BNCW
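
Because the four corpora differ greatly in size, comparisons of this kind are normally made on normalized rather than raw frequencies. The Python sketch below shows one common way to do that (occurrences per million words). It is an assumption-laden illustration, not the study's documented procedure: the slides do not say which unit or statistical test was used, the function names are hypothetical, the BNC sizes are the rounded figures from the slide, and the raw counts in the usage lines are invented.

```python
# Corpus sizes in words, taken from the "Data for analysis" slide
# (BNC figures are the rounded values given there).
CORPUS_SIZES = {
    "SECCL": 473_408,
    "CLEC": 481_635,
    "BNCS": 10_000_000,
    "BNCW": 90_000_000,
}


def per_million(raw_count, corpus):
    """Normalize a raw count to occurrences per million words."""
    return raw_count * 1_000_000 / CORPUS_SIZES[corpus]


def frequency_difference(learner_count, learner_corpus, native_count, native_corpus):
    """Learner minus native normalized frequency.

    A positive value suggests overuse, a negative value underuse.
    """
    return per_million(learner_count, learner_corpus) - per_million(
        native_count, native_corpus
    )


# Hypothetical raw counts for a single adverb, invented for illustration.
speech_diff = frequency_difference(500, "SECCL", 5_000, "BNCS")
writing_diff = frequency_difference(100, "CLEC", 90_000, "BNCW")
print(f"speech: {speech_diff:+.1f} per million, writing: {writing_diff:+.1f} per million")
```

A difference in normalized frequency alone does not show that the gap is statistically reliable; a significance test would be a separate step.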

Results (1)

TTFA use in learners’ spoken corpus (SECCL)

Tendency    Words
Overuse     always, once, often, sometimes, usually, hardly (6 words/407 occurrences)
Underuse    normally, never, ever, twice, generally, in general, occasionally, no longer, constantly, increasingly (10 words/48 occurrences)

Results (2)

TTFA use in learners’ written corpus (CLEC)

Tendency    Words
Overuse     always, sometimes, usually, no longer, never, once, often, generally, mostly (9 words/125 occurrences)
Underuse    constantly, occasionally, ever, regularly, rarely, frequently, twice, increasingly, normally (9 words/37 occurrences)

Results (3)

Comparison of learners’ speech with their writing in TTFA use (Overuse)

Tendency                       Frequency difference   Words
SECCL vs. BNCS (Spoken) (6)    407                    always, once, often, sometimes, usually, hardly
CLEC vs. BNCW (Written) (9)    125                    always, sometimes, usually, no longer, never, once, often, generally, mostly

Results (3)

Comparison (Underuse)

Tendency                       Frequency difference   Words
SECCL vs. BNCS (Spoken) (10)   -48                    normally, never, ever, twice, generally, in general, occasionally, no longer, constantly, increasingly
CLEC vs. BNCW (Written) (9)    -37                    normally, increasingly, twice, frequently, rarely, regularly, ever, occasionally, constantly

Results (3)

Comparison (identical or similar)

Tendency                       Frequency difference   Words
SECCL vs. BNCS (Spoken) (4)    -4                     frequently, regularly, rarely, mostly
CLEC vs. BNCW (Written) (2)    3                      in general, hardly

Results (4)

Speaking-writing differences in TTFA use in the CEMIC and the BNC

Register-neutral

– BNC: twice, sometimes (2)

– CEMIC: constantly, never, regularly, rarely, increasingly, normally (6)

Spoken-register sensitive

– BNC: never, always, normally, ever (4)

– CEMIC: always, once, often, sometimes, hardly (5)

Results (4)

Speaking-writing differences in TTFA use in the CEMIC and the BNC

Written-register sensitive

– BNC: often, once, no longer, generally, increasingly, usually, frequently, hardly, rarely, regularly, constantly, in general, occasionally, mostly (14)

– CEMIC: no longer, generally, usually, in general, ever, mostly, occasionally, frequently, twice (9)
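
One way to read the two Results (4) tables together is to ask, for each register category, which adverbs the learners place in the same category as native speakers. The Python sketch below does this with set intersection. The word lists are copied from the tables; the dictionary layout and the function name `register_agreement` are assumptions made for the example and are not part of the original study.

```python
# Register categories copied from the two Results (4) tables.
BNC = {
    "neutral": {"twice", "sometimes"},
    "spoken": {"never", "always", "normally", "ever"},
    "written": {
        "often", "once", "no longer", "generally", "increasingly", "usually",
        "frequently", "hardly", "rarely", "regularly", "constantly",
        "in general", "occasionally", "mostly",
    },
}
CEMIC = {
    "neutral": {"constantly", "never", "regularly", "rarely", "increasingly", "normally"},
    "spoken": {"always", "once", "often", "sometimes", "hardly"},
    "written": {
        "no longer", "generally", "usually", "in general", "ever", "mostly",
        "occasionally", "frequently", "twice",
    },
}


def register_agreement(native, learner):
    """For each register, return the adverbs both corpora assign to it."""
    return {register: native[register] & learner[register] for register in native}


agreement = register_agreement(BNC, CEMIC)
for register, words in agreement.items():
    print(f"{register}: {sorted(words)}")
print("same category in both corpora:", sum(len(w) for w in agreement.values()), "of 20")
```

On the lists as given in the slides, only 8 of the 20 adverbs fall into the same register category in both corpora, and 7 of those are in the written-sensitive group.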

Summary (1)

•English majors in China tend to overuse and underuse certain TTFAs in their speech and writing. The overuse tendency is stronger than the underuse tendency in both speech and writing.

Summary (2)

•The overuse tendency is more marked in their speech than in their writing while the underuse tendency is also slightly stronger in speech than in writing. Some of the overused or underused TTFAs in speech are the same as those in writing but others are different.

Summary (3)

•Chinese English majors demonstrate a pattern of speaking-writing difference that is opposite to that shown in the native speakers’ corpus: they tend to use more TTFAs in their speech than in their writing while native speakers tend to use more TTFAs in their writing than in their speech. This shows that Chinese EFL learners use TTFAs without awareness of their register differences.

Possible reasons

•Limited vocabulary (Table 1b)

•Use them as “time buyers”

•Without equivalents readily available in Chinese

Topic Four

Advantages and disadvantages of corpus-based studies on SLA

Advantage One

•A large sample stored electronically and open to the public

– Validity and reliability (replicable)

– Possible for a diachronic study

Advantage Two

•Using computer software such as WordSmith

– Effectiveness and efficiency

Advantage Three

•Understand the learner language from a different perspective

– Correct vs. incorrect

– More acceptable vs. less acceptable

– Frequency

• Overuse

• Underuse

• Non-use

Disadvantages

Can               Cannot
Product           Process
Productive        Receptive
Group patterns    Individual differences
Language use      Language knowledge

Closing Remark

•The number of researchers is increasing

•Constructing different types of corpora

•Carrying out corpus-based studies

•Findings useful for textbook writers as well as for practitioners

Thank you!!!