Using the Common European Framework of Reference to Report Language Test Scores

Spiros Papageorgiou
University of Michigan

[email protected]

Overview

• The Common European Framework of Reference (CEFR)

• The Manual for relating language examinations to the CEFR

• Standard setting

• An example of a CEFR standard setting study in Colombia

The CEFR

• Reference document, not prescriptive

• Basis for the elaboration of language syllabi, curricula, examinations, and textbooks

• Language objectives: Description of what language learners have to learn to do in order to use a language for communication

• Six main levels of proficiency: A1 (lowest), A2, B1, B2, C1, C2 (highest)

The Manual for Relating Examinations to the CEFR

It aims to “help the providers of examinations to develop, apply and report transparent, practical procedures in a cumulative process of continuing improvement in order to situate their examination(s) in relation to the Common European Framework” (p. 1).

Stages for Relating Test Content and Test Scores to the CEFR

• Familiarization

• Specification

• Standardization training and benchmarking

• Standard setting

• Validation

Standard Setting

• The decision-making process of classifying examination results in a number of successive levels

• Performance Level Descriptions (PLD): statements describing what learners can do with language (e.g., CEFR descriptors)

• Performance Level Labels (PLL): labels of PLD (e.g., A1–C2)

• Cut scores: the boundary between two successive levels

• Participation of expert judges (panelists)

C2 Can write clear, smoothly flowing, complex texts in an appropriate and effective style and a logical structure which helps the reader to find significant points.

C1 Can write clear, well-structured texts of complex subjects, underlining the relevant salient issues, expanding and supporting points of view at some length with subsidiary points, reasons and relevant examples, and rounding off with an appropriate conclusion.

B2 Can write clear, detailed texts on a variety of subjects related to his field of interest, synthesising and evaluating information and arguments from a number of sources.

B1 Can write straightforward connected texts on a range of familiar subjects within his field of interest, by linking a series of shorter discrete elements into a linear sequence.

A2 Can write a series of simple phrases and sentences linked with simple connectors like “and”, “but” and “because”.

A1 Can write simple isolated phrases and sentences.

(PLL: the level label at the start of each row above. PLD: the descriptor that follows it.)

An Example of a Standard Setting Study in Colombia

• Reporting scores for the Michigan English Test on the CEFR levels

• 13 participants from the 9 Binational centers in Colombia

• Familiarization with the CEFR

• Training with item difficulty (Pilot Form B)

• Angoff standard setting method

• First round of judgments

• Pilot Form A statistical information

• Second round of judgments
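As an illustration of the Angoff step listed above, here is a minimal sketch of the classic computation: each judge estimates, for every item, the probability that a borderline test taker at a given level answers correctly; a judge's cut score is the sum of those estimates, and the panel cut score is the mean across judges. The slides do not say which Angoff variant was used in the study, so the function name and the ratings below are hypothetical.

```python
from statistics import mean

def angoff_cut_score(ratings):
    """ratings[j][i] is judge j's estimated probability (0 to 1) that a
    borderline test taker answers item i correctly.
    Returns (per-judge cut scores, panel cut score)."""
    judge_cuts = [sum(judge) for judge in ratings]
    return judge_cuts, mean(judge_cuts)

# Hypothetical ratings: 3 judges x 4 items (not data from the study)
ratings = [
    [0.50, 0.75, 0.25, 0.50],
    [0.50, 0.50, 0.25, 0.75],
    [0.75, 0.75, 0.50, 0.50],
]
judge_cuts, panel_cut = angoff_cut_score(ratings)
print(judge_cuts, round(panel_cut, 2))
```

A second round of judgments, as in the study, would simply rerun the same computation on the revised ratings.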

Standard Setting Validity Evidence

• Procedural validity: examining whether the procedures followed were practical and implemented properly; that feedback given to the judges was effective; and that documentation was sufficiently compiled.

• Internal validity: addressing issues of accuracy and consistency of the standard setting results.

• External validation: collecting evidence from independent sources that support the outcome of the standard setting meeting.

The Familiarization Task

• A1 = 1, A2 = 2, B1 = 3, B2 = 4, C1 = 5, C2 = 6

Procedural Validity: Internalization of the CEFR

Correlation of descriptor level judgments with the CEFR during the Familiarization stage

Descriptors J1 J2 J3 J4 J5 J6 J7 J8 J9 J10 J11 J12 J13

Listening .85 .89 .80 .81 .71 .77 .79 .88 .80 .70 .91 .84 .79

Reading .92 .92 .85 .86 .69 .86 .84 .84 .82 .62 .90 .86 .77

Vocabulary .89 .93 .91 .96 .70 .76 .73 .92 .90 .84 .97 .90 .86

Grammar .90 .94 .97 .87 .91 .95 .89 .95 .84 .78 .93 .85 .89
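The correlations in the table above can be thought of along these lines: each descriptor's actual CEFR level and the level a judge assigned to it are coded with the 1 to 6 scheme from the Familiarization task, and the two series are correlated. The slides do not name the coefficient, so Spearman's rank correlation is an assumption here, and the descriptor data are hypothetical.

```python
CEFR_CODE = {"A1": 1, "A2": 2, "B1": 3, "B2": 4, "C1": 5, "C2": 6}

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions start..end
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical data: true levels of six descriptors and one judge's guesses
actual = ["A1", "A2", "B1", "B2", "C1", "C2"]
judged = ["A1", "A2", "B2", "B1", "C1", "C2"]
rho = spearman([CEFR_CODE[l] for l in actual], [CEFR_CODE[l] for l in judged])
print(round(rho, 2))
```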

Internal Validity: Method Consistency

Standard error of judgments (SEj) should be ≤ ½ of the standard error of the test (Section I 1.71 and Section II 1.74)

Cut score SEj incl. extreme ratings SEj excl. extreme ratings

Section I B1 1.97 1.57

Section I B2 1.34 1.34

Section I C1 1.69 1.69

Section II B1 2.00 1.71

Section II B2 2.30 1.62

Section II C1 2.57 1.71
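A minimal sketch of how a standard error of judgments is commonly computed: the standard deviation of the judges' individual cut scores divided by the square root of the number of judges. The slides do not show the formula used in the study, so this is an assumption, and the cut scores below are hypothetical. The comparison threshold is passed in as a parameter rather than derived here.

```python
from math import sqrt
from statistics import stdev

def standard_error_of_judgment(judge_cut_scores):
    """Sample SD of the judges' cut scores divided by sqrt(number of judges)."""
    return stdev(judge_cut_scores) / sqrt(len(judge_cut_scores))

def meets_criterion(sej, threshold):
    """True if SEj does not exceed the chosen threshold."""
    return sej <= threshold

# Hypothetical cut scores from four judges (not data from the study)
cuts = [10, 12, 14, 16]
sej = standard_error_of_judgment(cuts)
print(round(sej, 2), meets_criterion(sej, threshold=1.71))
```

Dropping extreme ratings, as in the right-hand column of the table, amounts to calling the same function on the list with those judges removed.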

Internal Validity: Decision Consistency

Calculating agreement coefficient rho (p0; max .98) and kappa (k; max .71)

Cut score p0 k

Section I B1 .90 .68

Section I B2 .88 .70

Section I C1 .97 .61

Section II B1 .95 .64

Section II B2 .86 .71

Section II C1 .94 .65
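To make the two coefficients concrete, the sketch below computes them from their definitions, using a table that cross-classifies test takers as below or above a cut score on two administrations: p0 is the proportion classified the same way both times, and kappa corrects p0 for chance agreement. This is only the definitional form. The slides do not say how the study estimated the coefficients (they may have been approximated from a single administration), and the counts are hypothetical.

```python
def decision_consistency(table):
    """table[r][c] = number of test takers placed in category r on one
    administration and category c on the other.
    Returns (p0, kappa)."""
    size = len(table)
    total = sum(sum(row) for row in table)
    p0 = sum(table[i][i] for i in range(size)) / total
    row_p = [sum(row) / total for row in table]
    col_p = [sum(table[r][c] for r in range(size)) / total for c in range(size)]
    p_chance = sum(r * c for r, c in zip(row_p, col_p))
    kappa = (p0 - p_chance) / (1 - p_chance)
    return p0, kappa

# Hypothetical counts: rows and columns are [below cut, at or above cut]
table = [[40, 10],
         [10, 40]]
p0, kappa = decision_consistency(table)
print(round(p0, 2), round(kappa, 2))
```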

Internal Validity: Intra-judge Consistency

Correlation of mean of judgments with empirical item difficulty

MET section/round of judgments Correlation

Section I, Round 1 .42

Section I, Round 2 .83

Section II, Round 1 .73

Section II, Round 2 .92

Internal Validity: Inter-judge Consistency

Indices of agreement and consistency

Index Section I Section II

ICC .94 .94

W .80 .76

Alpha .94 .94
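Two of the indices above can be sketched in a few lines, with judges treated as the "raters" and test items as the "cases". Cronbach's alpha uses the standard variance formula; Kendall's W is given in its simple form, which assumes no tied ratings within a judge. The ICC is left out because it has several variants and the slides do not say which was used. All data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(ratings):
    """ratings[j][i] is judge j's rating of item i."""
    k = len(ratings)
    sum_judge_vars = sum(variance(judge) for judge in ratings)
    totals = [sum(item) for item in zip(*ratings)]
    return k / (k - 1) * (1 - sum_judge_vars / variance(totals))

def simple_ranks(values):
    """Ranks 1..n; assumes no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks

def kendalls_w(ratings):
    """Kendall's coefficient of concordance without tie correction."""
    m, n = len(ratings), len(ratings[0])
    rank_sums = [sum(col) for col in zip(*(simple_ranks(j) for j in ratings))]
    mean_rank_sum = m * (n + 1) / 2
    s = sum((r - mean_rank_sum) ** 2 for r in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))

# Hypothetical ratings: 3 judges x 4 items, no ties within a judge
ratings = [
    [0.2, 0.5, 0.7, 0.9],
    [0.3, 0.4, 0.8, 0.6],
    [0.1, 0.6, 0.5, 0.8],
]
print(round(cronbach_alpha(ratings), 2), round(kendalls_w(ratings), 2))
```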

External Validity: Reasonableness of the Cut Scores

Classification of Pilot Form A test takers (N = 660) into CEFR levels

Level Section I Section II

A2 105 (15.91%) 55 (8.33%)

B1 408 (61.81%) 323 (48.94%)

B2 95 (14.39%) 214 (32.43%)

C1 52 (7.88%) 68 (10.30%)

External Validity: Comparison of Level Classifications

Exact and adjacent level agreement of classifications (N = 302) provided by a test center and the cut score

Agreement Section I Section II

Exact level 122 (40.40%) 92 (30.46%)

Within 1 level 290 (96.03%) 264 (87.42%)
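Exact and adjacent agreement, as reported in the table above, can be computed as in this minimal sketch: each test taker has one level from the test center and one from the cut scores, and the two are compared by their position on the CEFR scale. The function name and the sample data are hypothetical.

```python
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]

def level_agreement(center_levels, cut_score_levels):
    """Returns (proportion with identical levels,
    proportion with levels at most one step apart)."""
    index = {level: i for i, level in enumerate(LEVELS)}
    diffs = [abs(index[a] - index[b])
             for a, b in zip(center_levels, cut_score_levels)]
    n = len(diffs)
    exact = sum(d == 0 for d in diffs) / n
    within_one = sum(d <= 1 for d in diffs) / n
    return exact, within_one

# Hypothetical classifications for four test takers
center = ["B1", "B2", "A2", "C1"]
from_cut_scores = ["B1", "B1", "B1", "B1"]
exact, within_one = level_agreement(center, from_cut_scores)
print(exact, within_one)
```

Note that "within 1 level" includes the exact matches, which is why that row of the table is always at least as large as the exact row.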

Final Stage Before Reporting Test Scores: Equating

• A statistical procedure used to allow for comparisons of scores obtained on different test forms

• Adjustment of differences in test form difficulty (but not content)

• Scaled scores, not percentages

• Examinee position on the language ability scale

• Scores are comparable across different administrations

• Linked to the CEFR cut scores
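To illustrate the idea of adjusting for form difficulty, here is a sketch of linear equating, one simple textbook method: a score on a new form is mapped to the reference form's scale so that it sits the same number of standard deviations from the mean. The slides do not say which equating method is used for the MET, so this is purely illustrative. It also assumes the two forms were taken by comparable groups, and the scores are hypothetical.

```python
from statistics import mean, pstdev

def linear_equate(score, new_form_scores, reference_form_scores):
    """Map a score on the new form onto the reference form's scale by
    matching means and standard deviations."""
    mean_new, mean_ref = mean(new_form_scores), mean(reference_form_scores)
    sd_new, sd_ref = pstdev(new_form_scores), pstdev(reference_form_scores)
    return mean_ref + (sd_ref / sd_new) * (score - mean_new)

# Hypothetical raw scores: the new form is harder, so scores run lower
new_form = [10, 20, 30]
reference_form = [15, 25, 35]
print(linear_equate(20, new_form, reference_form))
```

Once scores from every form sit on the same scale, a single set of CEFR cut scores can be applied to all administrations.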

Reported Scores

Both section scores should be taken into account when interpreting the test results for use in decision-making.

CEFR Level MET Section I scores MET Section II scores

C1 64 and above 64 and above

B2 53–63 53–63

B1 40–52 40–52

A2 39 or below 39 or below
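The score bands in the table above translate directly into a lookup. The sketch below assumes whole-number scaled scores; the function names are hypothetical. Both sections are reported together, in line with the note that both should be considered when interpreting results.

```python
def cefr_level(scaled_score):
    """Map a MET section scaled score to the CEFR level in the table above."""
    if scaled_score >= 64:
        return "C1"
    if scaled_score >= 53:
        return "B2"
    if scaled_score >= 40:
        return "B1"
    return "A2"  # 39 or below

def report_levels(section_i_score, section_ii_score):
    """Report the CEFR level for each section side by side."""
    return {
        "Section I": cefr_level(section_i_score),
        "Section II": cefr_level(section_ii_score),
    }

print(report_levels(58, 39))
```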

For more information visit www.lsa.umich.edu/eli/testing
