Inga Wagner Centro Linguistico di Ateneo Università di Modena e Reggio Emilia Testing reading through multiple-choice? An insight within the Modena Language Centre Testing Project 3. Bremer Symposion 2011 Testen, Evaluieren, Zertifizieren 4-5 March 2011


Page 1

Inga Wagner

Centro Linguistico di Ateneo
Università di Modena e Reggio Emilia

Testing reading through multiple-choice? An insight within the Modena Language Centre Testing Project

3. Bremer Symposion 2011
Testen, Evaluieren, Zertifizieren

4-5 March 2011

Page 2

Outline

• Testing at CLA: the past

• Overview of new testing project

• Testing reading comprehension

• Focus on Multiple Choice

• MC vs other techniques: some preliminary results

Page 3

Outline of former language testing at the UNIMORE Language Center

Computerised language testing of mainly lexico-grammatical knowledge through discrete-point multiple-choice items

Uses
• Placement Test for all Faculties (English)
• Admission Test for Faculty of Arts (English, Spanish, French, German)

Other tests
• Paper-based placement test for Italian as a foreign language
• Proficiency test (B2, C1, C2) for Faculty of Arts (English, Spanish, French, German)
• International Language Certificates (e.g. Cambridge, ÖSD)

Page 4

Why a new test?

- differentiated use of tests > flexibility
- inclusion of other competences and skills (e.g. reading, listening; communicative competence)
- introduction of multiple techniques
- economy (time and resources; semi-adaptive placement test)

Page 5

Uses for the new test

Some advantages:
• ‘on-the-fly’ customisation of the same test > unlimited possibilities for test assembly (machine-scored and human-rated; detailed tagging system)

• self-contained testing and authoring software for item creation, storage and test management
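
The idea of assembling a test 'on the fly' from a tagged item bank could be sketched roughly as follows. This is only an illustration under stated assumptions, not the CLA software: the `Item` class, the `assemble_test` function, the blueprint format and the item IDs are hypothetical; only the tag names (e.g. `LETTURA`, `TEX_coes`, `LG_temp`) are taken from the project's tag list.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    item_id: str          # hypothetical identifier
    tags: frozenset       # e.g. {"LETTURA", "TEX_coes"}
    machine_scored: bool  # False would mark a human-rated item


def assemble_test(bank, blueprint, seed=None):
    """Pick items from `bank` according to `blueprint`.

    `blueprint` maps a frozenset of required tags to the number of
    items wanted with all of those tags. No item is used twice.
    Raises ValueError if the bank cannot satisfy a request.
    """
    rng = random.Random(seed)
    chosen = []
    used = set()
    for required_tags, count in blueprint.items():
        candidates = [
            item for item in bank
            if required_tags <= item.tags and item.item_id not in used
        ]
        if len(candidates) < count:
            raise ValueError(f"not enough items tagged {sorted(required_tags)}")
        picked = rng.sample(candidates, count)
        chosen.extend(picked)
        used.update(item.item_id for item in picked)
    return chosen


# Invented mini item bank for illustration.
bank = [
    Item("r1", frozenset({"LETTURA", "TEX_coes"}), True),
    Item("r2", frozenset({"LETTURA", "TEX_coes"}), True),
    Item("r3", frozenset({"LETTURA", "PRA_idiom"}), True),
    Item("g1", frozenset({"LG_temp"}), True),
    Item("g2", frozenset({"LG_modal"}), True),
]

# One reading item on cohesion plus one grammar item on tenses.
blueprint = {
    frozenset({"LETTURA", "TEX_coes"}): 1,
    frozenset({"LG_temp"}): 1,
}

test_form = assemble_test(bank, blueprint, seed=1)
print([item.item_id for item in test_form])
```

Changing the blueprint is all it takes to produce a different test from the same bank, which is the kind of flexibility a detailed tagging system makes possible.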

Page 6

Competences and skills to be tested

Competences
• grammatical
• textual
• pragmatic (e.g. sociolinguistic)

Skills
• listening
• reading
• writing

Techniques

Directed response items
• word pool
• ordering tasks
• gap-filling with selection from bank
• single matching
• multiple matching
• single multiple choice
• multiple multiple choice

Constructed response items
• writing
• open gap-filling

Page 7

List of tags

LETTURA (reading), ASCOLTO (listening), SCRITTURA INDIRETTA (indirect writing)

TEX_dis_rip, TEX_coes, TEX_marc_disc, TEX_voce_tempo

PRA_idiom, PRA_stile_reg, PRA_meta_disc, PRA_funz_strat, PRA_scopo_disc

FON_fonemi, FON_accent, FON_pron_ort, FON_into

LG_temp, LG_modal, LG_mp, LG_cong, LG_pp, LG_pron, LG_agg, LG_avv, LG_frasi, LG_lex, LG_lex immagini

LSP, L generale

Sc0, Sc1, Sc3

S-E0, S-E1, S-E2

Um0, Um1, Um2, Um6

Leg0, Leg1, Leg2

Page 8

Testing reading ability

Techniques
• Reordering
• Gap-filling with selection from bank
• Single matching
• Multiple matching
• Multiple choice

Main aims
• generalise results to non-testing situations
• avoid test method effect
• meet test criteria (validity, reliability, objectivity, practicality)

However:

“The very act of taking a test may require different sorts of reading from non-test-based reading, and may therefore limit the sorts of abilities or comprehensions that can be tested.”

(Alderson 2000: 123)

Page 9

Single Matching (e.g. text types, titles)

Page 10

Multiple Matching (e.g. scanning)

Page 11

Reordering task (cohesion, coherence)

Page 12

Gap-filling with selection (e.g. text organisation – cause/effect)

Page 13

Multiple Choice

Page 14

CLA guidelines for MC reading comprehension items

A2, B1 | B2, C1 | C2

Skimming / reading for gist (topic, text type, text purpose, etc.)

✓ ✓** ✓**

Scanning / finding specific details (e.g. names, figures, dates, or other surface information)

Careful reading (understanding explicitly stated main idea(s) and/or distinguishing that from supporting details; locating, identifying, understanding and comparing facts, etc.)

✓ ✓** ✓**

Understanding lexis (predicting meaning of (unknown) words from the context)

✓ ✓ ✓

Making inferences (deducing information that is not explicitly stated; also author's opinion, allusions, irony, critique, etc.)

✓ ✓

Guidelines also given on text length, no. of options, authenticity, instructions

(Based on Urquhart & Weir 1998)

Page 15

Pros and cons of multiple choice*

Benefits
• familiar to nearly all candidates in all places
• independent of writing ability
• objectively and easily scored
• economical in terms of candidate's time > adding to reliability of test
• versatile

Drawbacks
• unnatural process and non-authentic use of texts
• separate ability, different from reading ability > testwiseness
• validity may be limited
• high guessing
• cheating
• backwash may be harmful
• very difficult and time-consuming to write successfully

* Based on Alderson 2000, ALTE 2005, Arras 2006, Grotjahn 2000

Page 16

RELATIV GENIAL: ALBERT EINSTEIN
Deutschland feierte im Jahr 2005 das <Einsteinjahr>. Der 100. Geburtstag der Relativitätstheorie und der 50. Todestag des weltberühmten Wissenschaftlers waren der Grund. [………………………….] (aus: JUMA, Schülerzeitung, März 2005)

Warum wird das <Einsteinjahr> gefeiert?
1) Seit 100 Jahren gibt es die Relativitätstheorie. [correct]
2) Seit 50 Jahren gibt es die Relativitätstheorie. [incorrect]
3) Albert Einstein wird 100 Jahre alt. [incorrect]

Warum ist Einsteins Theorie so neu und wichtig?
1) Sie will Physik und visionäre Philosophie erneuern. [incorrect]
2) Sie definierte neue Konzepte, die nicht mehr physikalisch sind. [incorrect]
3) Sie definierte physikalische Gesetze (wie Raum, Zeit, Materie) neu. [correct]

War Einstein nur für seine Physiktheorien bekannt?
1) Nein, er ist auch für Ideen wie Pazifismus, Weltbürgertum, Menschlichkeit bekannt. [correct]
2) Ja, denn er hat sich nur mit Physik beschäftigt. [incorrect]
3) Ja, denn dafür hat er den Nobelpreis bekommen. [incorrect]

Wer feiert das Einsteinjahr?
1) Universitäten und Schulen und alle, die an Einstein interessiert sind. [correct]
2) die ganze Welt, besonders Amerika [incorrect]
3) alle Universitäten, wo Physik unterrichtet wird [incorrect]

Wie haben zum Beispiel die Schulen das Einsteinjahr gefeiert?
1) Sie haben in Bussen, in denen ein kleines Physiklabor war, verschiedene Schulen besucht. [correct]
2) Sie haben Filme und Fotos von Einstein gesehen. [incorrect]
3) Sie haben in der Schule Experimente von Einstein nachgemacht. [incorrect]

Clue to answer to last question is in wording of 4th question

Right answer is always the longest
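
The 'right answer is always the longest' clue can be screened for automatically before items go to review. The following is a minimal sketch under stated assumptions: the function names and the data layout (a list of option strings plus the index of the key) are hypothetical, and length is simply measured in characters.

```python
def key_is_longest(options, key_index):
    """Return True if the keyed (correct) option is strictly longer,
    in characters, than every distractor."""
    key_length = len(options[key_index])
    return all(
        len(option) < key_length
        for i, option in enumerate(options)
        if i != key_index
    )


def share_with_longest_key(items):
    """Fraction of items whose key is the longest option.

    `items` is a list of (options, key_index) pairs.
    """
    if not items:
        return 0.0
    flagged = sum(key_is_longest(options, key) for options, key in items)
    return flagged / len(items)


# Third item of the Einstein task (the key is option 1, i.e. index 0).
einstein_q3 = (
    [
        "Nein, er ist auch für Ideen wie Pazifismus, Weltbürgertum, Menschlichkeit bekannt.",
        "Ja, denn er hat sich nur mit Physik beschäftigt.",
        "Ja, denn dafür hat er den Nobelpreis bekommen.",
    ],
    0,
)
print(key_is_longest(*einstein_q3))  # True
```

A high share across a whole task suggests that testwise candidates could pick the key from its length alone; a single flagged item is not necessarily a problem.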

Page 17

DAS WAGNER THEATER IN BAYREUTH
Wenn wir die Landkarte von Bayern betrachten, finden wir im Nordosten des Landes die kleine Stadt Bayreuth. Jedes Jahr kommen Tausende und Abertausende von Menschen aus aller Welt hier zusammen. Was gibt denn dieser Stadt den Zauber, dass sie wie ein Magnet so viele Mensch […] (Quelle: www.bayreuth.de)

Wo liegt das Wagner-Theater?
1) in Ostdeutschland, in Leipzig [incorrect]
2) in Bayern, in München [incorrect]
3) in Bayern, in der Stadt Bayreuth [correct]

Was für Musik komponierte Wagner?
1) Musikdramen [correct]
2) Lieder und Arien [incorrect]
3) Konzertmusik [incorrect]

Will der Komponist mit seiner Musik etwas erzählen?
1) Ja, er will alte Sagen und Legenden erzählen. [correct]
2) Ja, er will das Drama des modernen Menschen erzählen. [incorrect]
3) Nein, nur die Melodie ist wichtig [incorrect]

Kommen viele Menschen, um Wagners Musik zu hören?
1) Nein, Wagners Musik gefällt nicht vielen Menschen. [incorrect]
2) Ja, viele deutsche und ausländische Gäste kommen. [correct]
3) Ja, viele Deutsche kommen jedes Jahr. [incorrect]

Ist Wagner immer in Deutschland geblieben?
1) Nein, er machte auch viele Reisen. [correct]
2) Ja, er lebt und starb in Deutschland. [incorrect]
3) Ja, denn nur hier fand er seine Inspiration. [incorrect]

Spielte der bayerische König Ludwig eine Rolle im Leben Wagners?
1) Nein, sie kannten sich gar nicht. [incorrect]
2) Ja, denn sie lebten beide in Bayern. [incorrect]
3) Ja, er war nämlich sein Förderer. [correct]

Possible problems: background knowledge; implausible distractors; text-irrelevant questions; yes-no options (> higher guessing)
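
Why fewer effective options mean 'higher guessing' can be illustrated with a simple binomial model of blind guessing. This is a sketch under assumptions that are not in the slides: every option is taken to be equally likely, and the task size and pass mark (6 items, 4 correct) are invented for the example.

```python
from math import comb


def chance_of_passing_by_guessing(n_items, n_options, pass_mark):
    """Probability of getting at least `pass_mark` of `n_items` right
    when every answer is a blind guess among `n_options` equally
    likely options (binomial model)."""
    p = 1 / n_options
    return sum(
        comb(n_items, k) * p**k * (1 - p) ** (n_items - k)
        for k in range(pass_mark, n_items + 1)
    )


# Hypothetical task: 6 items, pass mark 4.
three_options = chance_of_passing_by_guessing(6, 3, 4)
two_options = chance_of_passing_by_guessing(6, 2, 4)
print(f"3 options: {three_options:.3f}, 2 effective options: {two_options:.3f}")
```

Under this model the chance of reaching the pass mark by guessing alone rises from about 10% with three genuine options to about 34% when each item effectively offers only two choices, as can happen when the options reduce to a yes/no decision or a distractor is implausible.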

Page 18

Islamunterricht muss sein, solange es christlichen Religionsunterricht an staatlichen Schulen gibt. Doch warum gibt es den bei uns? Die ZEIT führte ein Interview mit Johann-Albrecht Haupt von der Humanistischen Union.

ZEIT online: Was halten Sie davon, dass in den Schulen demnächst Islamunterricht erteilt wird?

Johann-Albrecht Haupt: In Deutschland ist der Religionsunterricht verfassungsrechtlich geschützt. Deshalb ist es konsequent, dass auch muslimische Kinder Religionsunterricht erhalten. Allerdings halten [….]

(Quelle: Zeit-online)

Auf der Islamkonferenz wurde beschlossen, dass in Zukunft an deutschen Schulen für muslimische Schüler Islamunterricht angeboten wird. Das ist fair, da in Deutschland christliche Kinder je nach Konfession katholischen oder evangelischen Unterricht besuchen können. Doch ist das zu vereinbaren mit der Trennung von Kirche und Staat? Was denkt Johann-Albrecht Haupt?
1) Der Staat habe sich um die Vermittlung von Glauben und Religion zu kümmern, weil er sie nur auf diese Weise unter seine Kontrolle bekomme. [incorrect]
2) Christlicher Religionsunterricht sei gut, muslimischer hingegen schlecht, weil man nicht wisse, wie er eigentlich gehalten werden solle. [incorrect]
3) Der christliche Religionsunterricht sei weltanschaulich akzeptabel, ein muslimischer dagegen aus menschenrechtlichen Gründen abzulehnen. [incorrect]
4) Staat und Religion seien zweierlei. Glaubensvermittlung gehöre nicht zu den Aufgaben des Staates bzw. der staatlichen Schulen. [correct]

high reading load; language of question more difficult than text

Page 19

Main problems of submitted multiple-choice items

1. more than one genuinely correct answer / correct answer not clear

2. items test background knowledge

3. implausible distractors

4. clues to the right answer (correct answer is longest, ‘word spotting’, convergence strategy, etc.)

5. item tests what is easy or trivial, not what is important and relevant to the text

Page 20

New specifications for writing MC reading comprehension items

Formal aspects
• reduce reading load
• language should be clear and not more difficult than the text
• […]

Distractors
• make all distractors plausible (while still being incorrect)
• avoid giving clues to the right answer, such as grammatical clues, word repeats …
• avoid yes-no options (> guessing)
• […]

General
• make sure that there is one genuinely correct answer
• base each item on (important content of) text
• keep items independent
• ensure variety
• […]

Remember:
- item difficulty is determined by text, stem and options (generally, the closer the options, the more difficult the item)
- avoid unsuitable topics; topics outside the experience of candidates' likely age-group; topics assuming in-depth cultural knowledge; general-knowledge topics
- favour general-interest topics instead of the latest news
- items that test lexical knowledge should require the candidate to infer the meaning of a) an unknown word from the context b) a known word used with a different meaning
- double-check with someone else
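
One of the specifications above, avoiding yes-no options, is mechanical enough to be pre-checked before a human reviewer looks at an item. The sketch below is an assumption-laden illustration rather than part of the project's software: the function name and the small German/English word list are hypothetical, and such a check can only flag candidates for review, not judge an item.

```python
# Hypothetical word list; extend it for the other test languages as needed.
YES_NO_WORDS = {"ja", "nein", "yes", "no"}


def has_yes_no_options(options):
    """Return True if any option opens with a yes/no word,
    e.g. 'Ja, ...' or 'Nein, ...'."""
    for option in options:
        words = option.strip().split()
        if not words:
            continue  # ignore empty options
        first_word = words[0].strip(",.;:!?").lower()
        if first_word in YES_NO_WORDS:
            return True
    return False


flagged = has_yes_no_options([
    "Nein, sie kannten sich gar nicht.",
    "Ja, denn sie lebten beide in Bayern.",
    "Ja, er war nämlich sein Förderer.",
])
print(flagged)  # True
```

Checks of this kind complement, but do not replace, the final point in the list: having someone else double-check the item.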

Page 21

Where we are now

• Detailed specifications for item writers
• Statistical analysis and evaluation of pretesting and first test results (facility and discrimination index)
• Review of rejected items
• Comparing MC results to results of other techniques used for testing reading
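
The facility and discrimination indices mentioned above can be computed from pretest data along the following lines. This is a sketch of one common classical approach, not necessarily the one used in the project: the upper-lower discrimination index with top and bottom thirds is an assumption, and the response data are invented.

```python
def facility(responses):
    """Facility value: proportion of candidates who answered the item
    correctly. `responses` is a list of 0/1 scores for one item."""
    return sum(responses) / len(responses)


def discrimination(item_scores, total_scores):
    """Upper-lower discrimination index for one item.

    Candidates are ranked by total test score; the index is the item's
    facility in the top third minus its facility in the bottom third.
    Using thirds is an assumption made for this sketch.
    """
    n_group = max(1, len(total_scores) // 3)
    ranked = sorted(range(len(total_scores)), key=lambda i: total_scores[i])
    bottom = [item_scores[i] for i in ranked[:n_group]]
    top = [item_scores[i] for i in ranked[-n_group:]]
    return facility(top) - facility(bottom)


# Invented pretest data for six candidates (not from the project).
item = [1, 1, 1, 0, 1, 0]         # score on one MC item
totals = [18, 16, 15, 9, 8, 5]    # total test scores of the same candidates

print(facility(item), discrimination(item, totals))
```

Items with a facility value close to 0 or 1, or with a discrimination index near zero or negative, are the usual candidates for review or rejection.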

Page 22

Overall Performance in Proficiency Tests (N=73): Multiple Choice (MC) vs 2 Other Techniques (OT)

23% MC + OT - OT -
17.5% MC + OT + OT -
11% MC + OT + OT +
23% MC + OT ++ OT -
8% MC - OT - OT +
17.5% MC - OT + OT +

Relative differences between techniques seem to be evenly distributed

Page 23

Performance in Proficiency Tests per Level: MC vs Other Techniques

B2 (N=30)
3% MC + OT - OT -
8% MC + OT + OT -
11% MC + OT + OT +
30% MC + OT ++ OT -
18% MC - OT - OT +
30% MC - OT + OT +

C1 (N=43)
42% MC + OT - OT -
26% MC + OT + OT -
11% MC + OT + OT +
16% MC + OT ++ OT -
0% MC - OT - OT +
5% MC - OT + OT +

Different relative performances according to level – why?

Are MC reading comprehension tasks adequate to test reading ability
- at all levels?
- in a semi-professional (testing) environment?
- what alternatives are there for computerized language testing?

Page 24

Selected References

Alderson J.C. (2000) Assessing Reading. Cambridge: Cambridge University Press.

Alderson J.C., Clapham C. and Wall D. (1995) Language Test Construction and Evaluation. Cambridge: Cambridge University Press.

ALTE (2005) Materials for the guidance of test item writers. ALTE – The Association of Language Testers in Europe. Available at: www.alte.org/downloads/index.php?docid=89

Arras U. (2006) “Testen und Beurteilen des Leseverstehens in der Fremdsprache”. Babylonia, 14(3-4), 81-86.

Bachman L.F. (1990) Fundamental Considerations in Language Testing. Oxford: Oxford University Press.

Grotjahn R. (2000) “Determinanten der Schwierigkeit von Leseverstehensaufgaben: Theoretische Grundlagen und Konsequenzen für die Entwicklung des TESTDAF”. In: Bolton S. (Hrsg.) TESTDAF: Grundlagen für die Entwicklung eines neuen Sprachtests. Beiträge aus einem Expertenseminar. München: Goethe-Institut, 7-55.

Hughes A. (2003) Testing for language teachers (2nd ed.). Cambridge: Cambridge University Press.

Rupp A.A., Ferne T. and Choi H. (2006) “How assessing reading comprehension with multiple-choice questions shapes the construct: A cognitive processing perspective”. Language Testing, 23, 441–474.

Urquhart S. and Weir C. (1998) Reading in a Second Language: Process, product and practice. New York: Longman.

Westhoff G. (1997) Fertigkeit Lesen. München: Langenscheidt.