John De Jong: Optimizing Test & Courseware Development

Optimizing Test & Courseware Development Lisbon 23 April 2016 John De Jong SVP Global Assessment Standards, Pearson Professor of Language Testing VU University Amsterdam

Upload: eaquals

Post on 22-Jan-2018

TRANSCRIPT

Page 1: John De Jong: Optimizing Test & Courseware Development

Optimizing Test & Courseware Development

Lisbon 23 April 2016

John De Jong SVP Global Assessment Standards, Pearson

Professor of Language Testing VU University Amsterdam

Page 3: John De Jong: Optimizing Test & Courseware Development

PISA Programme for International Student Assessment

Page 4: John De Jong: Optimizing Test & Courseware Development

PISA Development over time

2000: Reading, Mathematics and Science

2003: Reading, Mathematics and Science

2006: Reading, Mathematics and Science

2009: Reading, Mathematics and Science

+ Optional Electronic Reading

2012: Reading, Mathematics and Science

+ Optional Electronic Mathematics

2015: Electronic: Reading, Mathematics and Science

+ Collaborative Problem Solving

2018: Reading, Mathematics and Science

+ Global Competence

Page 5: John De Jong: Optimizing Test & Courseware Development

Lessons from PISA

Major drivers of success of countries

• Clear standards defined at national level

• High level of teacher autonomy

Page 6: John De Jong: Optimizing Test & Courseware Development

… then, how to define standards?

Page 7: John De Jong: Optimizing Test & Courseware Development

Ranking CPS in higher education and workplace

| Applied Skill | Rank (Education) | Rank (Work) |
|---|---|---|
| Oral Communications | 3 | 1 |
| Teamwork / Collaboration | 3 | 1 |
| Problem Solving | 1 | 2 |
| Written Communications | 2 | 2 |
| Information Technology Application | 4 | 3 |
| Lifelong Learning / Self Direction | 2 | 4 |
| Professionalism / Work Ethic | 5 | 4 |
| Ethics / Social Responsibility | 6 | 4 |
| Creativity / Innovation | 3 | 5 |
| Diversity | 7 | 6 |
| Leadership | 7 | 7 |

Page 8: John De Jong: Optimizing Test & Courseware Development

Survey results

| The CPS definition … | Agree % |
|---|---|
| is clearly described | 97 |
| matches my own understanding of CPS | 95 |
| will help higher ed institutions to understand CPS | 88 |
| will help employers to understand CPS | 100 |
| is what is taught in my country | 52 |

Page 9: John De Jong: Optimizing Test & Courseware Development

Crucial reformation targets

• Establish needs

• Define learning objectives

• Define coherent and realistic curriculum

• Engage students

Page 15: John De Jong: Optimizing Test & Courseware Development

Structural approach to defining objectives

Figure dimensions: Difficulty, Domain, Language

Page 16: John De Jong: Optimizing Test & Courseware Development

Figure axes: Domains of language use / Topics; Difficulty

Self / personal experience

Negotiating with others

Deal with new

Academic

Specialized

Jokes

GE: A1 A2 B1 B2 C1 C2

AE: General MBA

PE: Waiter Politician

Coherent bank of objectives

Page 17: John De Jong: Optimizing Test & Courseware Development

A General Model of Language Development

Figure axes: General Cognition; Language Proficiency

Measuring within a population of language learners measures both linguistic and general cognitive development.

Measuring across two populations of language learners may measure cognitive development only.

Including an appropriate native-speaker population can help to measure linguistic development only.

Axis scales: “language age” 0, 1, 2, 3, 4, 5, etc.; “cognitive age” 0, 1, 2, 3, 4, 5, etc.

Page 18: John De Jong: Optimizing Test & Courseware Development

The Global Scale of English

Figure: Comparison of PTE Academic (GSE scale) with IELTS and TOEFL iBT

Page 19: John De Jong: Optimizing Test & Courseware Development

Sample page (from B1)

The Pearson Syllabus – General English

Page 20: John De Jong: Optimizing Test & Courseware Development

The need for English

Page 21: John De Jong: Optimizing Test & Courseware Development

Overview

• A vocabulary framework linked to the Global Scale of English (GSE) and the CEFR

• Organized by topics and subtopics based on the CoE Vantage specifications categorization

• Describing vocabulary targets for learners of general English

• A probabilistic model of productive vocabulary learning

• Based on the principle of incremental learning of word meanings, from basic to specialized

• Including 20k+ lemmas; 37k+ meanings; 80k+ collocations; 7k+ functional units

• Helping learners, teachers, and materials designers identify level-appropriate vocabulary

Page 22: John De Jong: Optimizing Test & Courseware Development

Methodology

Combines frequency data and teacher judgements in the following steps:

1. Corpus of 2.5 billion words > extraction of frequency list

2. Semantic annotation

• Manual tagging of 37k word meanings using the CoE ‘Vantage’ categorization

3. Teacher ratings

• Rating of each of the 37k word meanings by 10 teachers (scale: 1 to 5, plus 99 as an escape code)

4. Statistical analysis

• Rank word meanings by combining frequency data and teacher ratings

5. Fit the data onto a model, link each meaning to the CEFR / GSE

Page 23: John De Jong: Optimizing Test & Courseware Development

Lemmas and meanings

Structure vocabulary around pedagogically relevant sets using the CoE Vantage categorization

Example:

Specific Notions (Topics)

Fork > FOOD&DRINKS_tableware

SPORT&HOBBIES_gardening

TRAVEL_directions

Page 24: John De Jong: Optimizing Test & Courseware Development

Theoretical assumptions

A model of vocabulary growth based on current literature:

• Basic (A1) > 500-1k words (500 words as the minimum for elementary level, Hill, 2013; 500-1k as a general teaching target)

• Basic (A2) > boundary for high-frequency vocabulary set at 3k families for everyday conversation (Adolphs & Schmitt, 2003)

• Independent (B1) > 5k families to read authentic texts (Schmitt, 2007)

• Independent (B2) > minimum target of 10k lemmas at university level for Dutch (Hazenberg & Hulstijn, 1996); 8-9k families for unassisted comprehension (Nation, 2006)

• Proficient (C1 upwards) > 20k families known by educated L1 speakers (Nation, 2001); 50k words known by most L1 speakers (Crystal, 1981)

Hill, D. R. (2001). Survey: Graded Readers. ELT Journal, 55(3), 300-324. Oxford University Press.

Adolphs, S. & Schmitt, N. (2003). Lexical coverage of spoken discourse. Applied Linguistics, 24(4), 425-438.

Schmitt, N. (2007). Current perspectives on vocabulary teaching and learning. In J. Cummins and C. Davison (eds.), International Handbook of English language teaching: part II. NY: Springer, 827-841.

Hazenberg, S. & Hulstijn, J. H. (1996). Defining a minimal receptive second-language vocabulary for non-native university students: An empirical investigation. Applied Linguistics, 17(2), 145-163.

Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? The Canadian Modern Language Review, 63(1), 59-82.

Nation, P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.

Schmitt, N. (2000). Vocabulary in language teaching. Cambridge: Cambridge University Press, pp. 7-8.

Crystal, D. (1981). Clinical Linguistics. Vienna: Springer.

Page 26: John De Jong: Optimizing Test & Courseware Development

Data modelling 1

Figure: From GSE to ModelLem. Horizontal axis: GSE (10 to 90); vertical axis: number of lemmas (0 to 60,000). Series: Hypothesis 'CumLem'; Model 'ModelLem'.

Fitted model: $y = 0.006\,x^{3.539}$, $R^2 = 0.9842$
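The fitted power curve on this slide can be evaluated directly. The following is a minimal Python sketch, assuming that x is a GSE value between 10 and 90 and that y is the modelled cumulative number of lemmas. The function name `model_lemmas` is hypothetical and does not come from the slides.

```python
def model_lemmas(gse: float) -> float:
    """Modelled cumulative lemma count for a GSE value.

    Uses the power-law fit reported on the slide: y = 0.006 * x ** 3.539.
    Treating x as GSE and y as cumulative lemmas is an assumption
    read off the chart axes.
    """
    if not 10 <= gse <= 90:
        raise ValueError("GSE values run from 10 to 90")
    return 0.006 * gse ** 3.539


if __name__ == "__main__":
    # Print the modelled lemma count at each labelled point of the GSE axis.
    for gse in range(10, 100, 10):
        print(f"GSE {gse}: about {model_lemmas(gse):,.0f} lemmas")
```

Under these assumptions the curve rises from roughly 20 lemmas at GSE 10 to just under 50,000 at GSE 90, which stays within the 0 to 60,000 range of the chart.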

Page 27: John De Jong: Optimizing Test & Courseware Development
Page 28: John De Jong: Optimizing Test & Courseware Development

Meanings vs Lemmas

Figure: Average number of Meanings per Lemma (axis 1.0 to 2.5) by level: <T, T, A1, A2, A2+, B1, B1+, B2, B2+, C1, C2

Page 30: John De Jong: Optimizing Test & Courseware Development

Vocabulary growth

Figure: Vocabulary growth by level. Vertical axis: 0 to 14,000. Levels: PreT, T, A1, A2, A2+, B1, B1+, B2, B2+, C1, C2. Series: New Meanings, New Lemmas.

Page 31: John De Jong: Optimizing Test & Courseware Development

Cumulative vocabulary growth

Figure: Cumulative Vocabulary Growth by Level. Vertical axis: 0 to 60,000. Levels: PreT, T, A1, A2, A2+, B1, B1+, B2, B2+, C1, C2. Series: Cumul Meanings, Cumul Lemmas.

Page 32: John De Jong: Optimizing Test & Courseware Development

The vocabulary usefulness rating

1 = Essential words learners would want to acquire first

2 = Important words that become necessary at a next stage

3 = Useful words enabling more detailed and specific language

4 = Nice to have words to express concepts more accurately

5 = Extra words some language users will use occasionally

99 = “Escape”: words which are impossible to rate (you have never heard of the word before, or you cannot decide between widely different ratings)

Teachers received online training and followed specific guidelines

Each word was rated by a random 10 out of the 19 raters in an overlapping design using a pre-defined scale of 1-5
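The overlapping design described here can be illustrated with a short Python sketch. It assumes raters are numbered 1 to 19 and that the escape code 99 is left out when ratings are averaged. That exclusion rule, along with the names `assign_raters` and `mean_rating`, is an assumption for illustration and is not stated on the slides.

```python
import random
from statistics import mean

N_RATERS = 19          # size of the rater pool (from the slide)
RATERS_PER_WORD = 10   # raters drawn per word (from the slide)
ESCAPE = 99            # "impossible to rate" code (from the slide)


def assign_raters(words, rng):
    """Draw a random subset of 10 of the 19 raters for each word."""
    pool = range(1, N_RATERS + 1)
    return {word: sorted(rng.sample(pool, RATERS_PER_WORD)) for word in words}


def mean_rating(ratings):
    """Average the 1-5 ratings, skipping escape codes (assumed rule)."""
    valid = [r for r in ratings if r != ESCAPE]
    return mean(valid) if valid else None


if __name__ == "__main__":
    rng = random.Random(2016)  # fixed seed so the draw is repeatable
    for word, raters in assign_raters(["happy", "bashful"], rng).items():
        print(word, raters)
    print(mean_rating([1, 2, 99, 3]))
```

Because each word gets its own random draw, any two words usually share some raters, which is what links the ratings across the whole word list.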

Page 33: John De Jong: Optimizing Test & Courseware Development

Combine ratings and Frequency data

$$\text{Combine} = \frac{\text{Ra} \times \text{rRating} + \text{Frank} \times (1 - \text{rRating}) + \text{Frank}}{2}$$

Where

Combine is the optimal combination of ratings and Frequency data

Ra is the Rating average

rRating is the Reliability of rating data

Frank is the scaled frequency rank.
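The combination rule can be sketched in a few lines of Python. This is a minimal illustration that reads the slide layout as the whole sum divided by 2, and assumes the rating average and the scaled frequency rank are expressed on the same scale. The function and parameter names are hypothetical.

```python
def combine(rating_avg: float, rating_reliability: float, freq_rank: float) -> float:
    """Combine a rating average (Ra) with a scaled frequency rank (Frank).

    The rating average is first pulled towards the frequency rank in
    proportion to how unreliable the ratings are (rRating), and the
    result is then averaged with the frequency rank itself.
    """
    if not 0.0 <= rating_reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    weighted = rating_avg * rating_reliability + freq_rank * (1.0 - rating_reliability)
    return (weighted + freq_rank) / 2.0


if __name__ == "__main__":
    # Illustrative values only, not data from the slides.
    print(combine(rating_avg=2.0, rating_reliability=0.8, freq_rank=3.0))
```

With a reliability of 0 the result is the frequency rank alone. With a reliability of 1 it is the plain mean of the rating average and the frequency rank.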

Page 34: John De Jong: Optimizing Test & Courseware Development

Adjectives in People & relationships [personal traits]

A1: happy (23), good (22)

A2: angry (34), kind (36)

A2+: noisy (39), silly (40)

B1: upset (47), lonely (48)

B1+: confident (51), nasty (53)

B2: creative (59), sympathetic (63)

B2+: kind-hearted (67), spoiled (70)

C1: hypocritical (76), bashful (80)

C2: shifty (86), sycophantic (88)

Page 35: John De Jong: Optimizing Test & Courseware Development

Figure: Horizontal axis 1 to 5 (Essential, Important, Useful, Nice to have, Extra); vertical axis 10 to 90 (Tourist, A1, A2, B1, B2, C1, C2)

Fitted curve: $y = -3.8806x^2 + 42.05x - 24.081$, $R^2 = 0.9974$
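The quadratic fit on this slide can be evaluated as follows. The sketch assumes that x is a value on the 1 to 5 usefulness scale and that y is the corresponding position on the 10 to 90 scale. The function name `rating_to_scale` is hypothetical.

```python
def rating_to_scale(x: float) -> float:
    """Evaluate the quadratic fit from the slide for a usefulness value x.

    y = -3.8806 * x**2 + 42.05 * x - 24.081
    Interpreting x as the 1-5 usefulness value and y as the 10-90
    scale value is an assumption read off the chart axes.
    """
    if not 1 <= x <= 5:
        raise ValueError("x must lie between 1 and 5")
    return -3.8806 * x ** 2 + 42.05 * x - 24.081


if __name__ == "__main__":
    labels = ["Essential", "Important", "Useful", "Nice to have", "Extra"]
    for value, label in enumerate(labels, start=1):
        print(f"{value} ({label}): {rating_to_scale(value):.1f}")
```

Under these assumptions the curve flattens towards the top of the scale: moving from 1 to 2 adds about 30 points, while moving from 4 to 5 adds about 7.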

Page 36: John De Jong: Optimizing Test & Courseware Development

Figure: Likelihood of Success (0% to 100%) against GSE Task Difficulty (10 to 90) for a learner at 25 on GSE. Labels: “Girl, Mother”; “Boy, Father”.

Page 37: John De Jong: Optimizing Test & Courseware Development

www.English.com/GSE
