Multi-faceted Rasch analysis in rating tasks in Japan’s university entrance examinations
Rie KOIZUMI (Juntendo University)
PROMS (Pacific Rim Objective Measurement Society) 2016, Orient Hotel, Xi'an, China, August 1
Reforms in university entrance examinations in Japan (MEXT, 2014; Yomiuri Shimbun Kyouikubu, 2016)
To keep up with internationalization
To increase the number of people who can think, judge, express ideas, and act proactively
Exam construct: knowledge, skills → + ability to think logically and express effectively
L2 English test: reading, listening → + writing, speaking
Current types of university entrance examinations
(1) General examinations
(2) Recommendation-based examinations
(3) Admissions Office (AO) examinations
(4) Special selection examinations
→Restructured? All should use both academic tests and other methods such as interviews and long essays.
Types of tests in general examinations
(a) Only the National Center Test for University Admissions (Center Test)
Administered only once a year to about 550,000 examinees nationwide
(b) Only a university-developed test
Limited quality control
(c) Both
Reforms in the Center Test
Format: multiple-choice (MC) with single answers → + MC with single and multiple answers + constructed-response
×Multiple administrations per year
×Not selecting applicants based on a one-point difference in test scores
L2 English four-skill test, with the speaking section using voice recorders
Analysis: CTT → + IRT
Concrete plan proposed for the Center Test reform
For math and L1 Japanese:
2020 to 2023: Elicit relatively short answers (40–80 Japanese characters)
From 2024: Administer the test digitally and elicit longer written responses (200–300 characters)
Scored initially by humans, but later by artificial intelligence
40–80 Japanese characters
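西安は観光するところも多く、滞在はとても充実しています。兵馬俑に行き、中国の歴史と文化を学びました。兵馬俑博物館では写真をとってよいことに驚きました。(76 Japanese characters)
[English translation: Xi'an has many places to see, and my stay has been very fulfilling. I visited the Terracotta Warriors and learned about Chinese history and culture. I was surprised that taking photos is allowed in the Terracotta Warriors museum.]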
Constructed responses require judgments using rating scales
Rating involves many factors (McNamara, 1996; Fulcher, 2003)
Human raters; automated system (e.g., characteristics, training)
Rating scale (e.g., orientation, construct definition)
Task (e.g., orientation, goals)
Interlocutor
Local performance condition
Multi-faceted Rasch analysis (MFRA)
Useful in analyzing data affected by many factors (Barkaoui, 2013; Bond & Fox, 2015; Eckes, 2011; Engelhard, 2013; McNamara & Knoch, 2012)
Facets: Test takers, tasks, raters, rating criteria, + α (Two or more facets)
Obtain detailed information that cannot be derived from raw-score data analysis
E.g., Do raters produce stable scores?
Do rating scales work effectively?
Do tasks work in an expected manner?
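As a rough sketch of how the facets listed above combine, one common formulation of the many-facet Rasch model (the rating scale form) sets the log-odds of receiving category k rather than k-1 equal to test-taker ability minus criterion difficulty minus rater severity minus the threshold for step k. The function names and all numeric values below are illustrative assumptions, not estimates from any test discussed in these slides.

```python
import math

def category_probabilities(ability, difficulty, severity, thresholds):
    """Category probabilities under a many-facet rating scale model.

    Assumed model: log(P_k / P_{k-1}) = ability - difficulty - severity - thresholds[k-1]
    All inputs are in logits; thresholds holds one value per step between
    adjacent categories, so the result has len(thresholds) + 1 entries.
    """
    # Cumulative sums of the step log-odds give unnormalized log-probabilities.
    log_numerators = [0.0]
    for tau in thresholds:
        step = ability - difficulty - severity - tau
        log_numerators.append(log_numerators[-1] + step)
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(log_numerators)
    weights = [math.exp(v - peak) for v in log_numerators]
    total = sum(weights)
    return [w / total for w in weights]

def expected_score(probabilities):
    """Model-expected rating: sum of category index times its probability."""
    return sum(k * p for k, p in enumerate(probabilities))

# Hypothetical values: the same test taker and criterion, rated on a 0-3 scale
# by a lenient rater (severity -0.4) and a more severe rater (severity +0.2).
steps = [-1.5, 0.0, 1.5]
lenient = category_probabilities(1.0, 0.0, -0.4, steps)
severe = category_probabilities(1.0, 0.0, 0.2, steps)
print(round(expected_score(lenient), 2), round(expected_score(severe), 2))
```

Because rater severity enters the model as its own term, a test taker's ability estimate can be adjusted for which rater happened to score the response, which raw scores cannot do.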
Argument-based approach to validity (Chapelle et al., 2008)
F. Utilization
The degree of impact on learning and teaching
E. Extrapolation
Relationship with other test scores
D. Explanation (reflection of the construct)
Factor structure
Function analysis
C. Generalization (reliability)
The test and raters provide stable estimates.
B. Evaluation (appropriate scoring)
Statistical characteristics of tasks/a rating scale
Sufficient difficulty spread of tasks
A. Domain Definition
Relevance and representativeness of the target domain
Issues in reforms
Radical proposal in 2014
Slow and limited progress
Background:
Conflicting principles (Kuramoto & Koizumi, 2016)
Principle of education
Principle of measurement
Methodological preferences
Some users of 2-parameter item response theory
Limited users of Rasch model
Focus on speaking tests testing L2 English ability 1/2
Speaking tests for university entrance examinations
(1) Four-skill tests
TEAP
Eiken
GTEC CBT
GTEC for Students
IELTS
Cambridge English
TOEIC (LR + SW)
TOEFL iBT
TOEFL Junior Comprehensive
Focus on speaking tests testing L2 English ability 2/2
Separate speaking tests
As a supplement to the current exam (Mizohata, 2016)
SST
TSST
OPIc
Versant
TOEIC Speaking
TEAP (Test of English for Academic Purposes)
Developed by the Eiken Foundation of Japan and Sophia University (http://www.eiken.or.jp/teap/)
Construct: academic English proficiency required for learning and researching at universities
10-min face-to-face, one-on-one interviews
CEFR about A2 to B2
0 to 30 points
Rater: training session
TEAP tasks and criteria
Part 1: Interview
Part 2: Role play (interviewing an examiner)
Part 3: Monologue
Part 4: Extended interview
5 criteria
4 levels (0 to 3; Below A2, A2, B1, B2)
(a) Pronunciation, (b) Grammatical Range and Accuracy, (c) Lexical Range and Accuracy, (d) Fluency, (e) Interactional Effectiveness
TEAP analysis (Nakatsuhara, 2014)
Analysis of the pilot study data
120 test takers
5 criteria
6 trained raters
Facets, using a partial credit model
TEAP pilot study results
Favorable overall
High rater agreement (actual: 59.7%)
5.67 strata of test takers
2.8% underfitting test takers
Appropriate rating scale functions
Only a pilot study was analyzed using MFRM.
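To make the "underfitting" percentage above more concrete, the sketch below computes the usual Rasch infit and outfit mean squares for one test taker from observed ratings and model category probabilities, then counts how many test takers exceed a cutoff. The slides do not say which statistic or cutoff was used, so the `cutoff=1.5` default and the function names are assumptions for illustration only.

```python
def fit_mean_squares(observed, category_probs):
    """Return (infit, outfit) mean squares for one test taker.

    observed: list of observed ratings (category indices 0, 1, 2, ...)
    category_probs: list of model probability lists, one per observation
    """
    squared_residuals = []
    variances = []
    for x, probs in zip(observed, category_probs):
        expected = sum(k * p for k, p in enumerate(probs))
        variance = sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residuals.append((x - expected) ** 2)
        variances.append(variance)
    # Infit: information-weighted, i.e. total squared residual over total variance.
    infit = sum(squared_residuals) / sum(variances)
    # Outfit: unweighted mean of squared standardized residuals.
    outfit = sum(r / v for r, v in zip(squared_residuals, variances)) / len(variances)
    return infit, outfit

def percent_underfitting(mean_squares, cutoff=1.5):
    """Percentage of mean-square values above the cutoff (cutoff is an assumption)."""
    flagged = sum(1 for value in mean_squares if value > cutoff)
    return 100.0 * flagged / len(mean_squares)

# Hypothetical data: two dichotomous observations with 50/50 model probabilities.
infit, outfit = fit_mean_squares([1, 0], [[0.5, 0.5], [0.5, 0.5]])
print(infit, outfit)
print(percent_underfitting([0.9, 1.0, 1.6, 2.0]))
```

Mean squares well above 1 indicate ratings that are more erratic than the model predicts, which is what underfit refers to.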
GTEC CBT (Global Test of English Communication Computer Based Testing)
Developed by the Benesse Corporation and Center for Entrance Examination Standardization (http://www.benesse-gtec.com/cbt/en)
Construct: English proficiency in four skills for academic purposes
20-min computer-based test
CEFR about A2 to B2
Maximum score: 350 points
GTEC CBT tasks
Part 1: Listening and responding (6 items)
E.g., Where do people like to travel to in your country?
When is the best time of year to visit your country?
Part 2: Delivering and asking for information (3 items)
Part 3: Expressing your opinion (3 items)
E.g., Do you think technology has changed the way we live?
GTEC CBT criteria
23 criteria in total. Each item has 1 to 5 criteria.
Full marks: 1 to 3 points
Example of criteria
Part 1: Listening and responding
Respond to simple questions appropriately and clearly
Part 2: Delivering and asking for information
Based on the provided information, give the factual information and preference, and ask questions
Part 3: Expressing your opinion
State an opinion and provide reasons to support the opinion
GTEC CBT rater training, quality maintenance
Practice session and certification
Trial session
Independent ratings by two raters.
Divergent responses are assessed by the third experienced rater.
Analysis (Koizumi, Okabe, & Kashimada, 2016): 648 test takers, 23 criteria, 13 trained raters
Facets (Version 3.71.4), using a partial credit model
GTEC CBT Results 1/2

| Facet | Min~Max | M (SD) | Strata | Reliability | % of underfitting |
|---|---|---|---|---|---|
| Test takers | -6.94~5.10 | 0.92 (1.44) | 6.35 | .95 | 8% |
| Criteria | -3.18~3.35 | 0.00 (1.54) | 29.84 | 1.00 | 0% |
| Raters | -0.41~0.17 | 0.00 (0.17) | 3.64 | .86 | 0% |
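The strata and reliability columns are two views of the same quantity. As a minimal sketch, assuming the usual Rasch definitions (strata H = (4G + 1) / 3 and reliability R = G² / (1 + G²), where G is the separation index, neither of which is stated on the slides), the reported strata can be converted back to separation and then to reliability. The function names are illustrative.

```python
def separation_from_strata(strata):
    """Invert H = (4G + 1) / 3 to recover the separation index G."""
    return (3.0 * strata - 1.0) / 4.0

def reliability_from_separation(separation):
    """Separation reliability R = G^2 / (1 + G^2)."""
    return separation ** 2 / (1.0 + separation ** 2)

# Strata and reliability values as reported in the GTEC CBT results table.
reported = {
    "Test takers": (6.35, 0.95),
    "Criteria": (29.84, 1.00),
    "Raters": (3.64, 0.86),
}

for facet, (strata, reliability) in reported.items():
    g = separation_from_strata(strata)
    derived = reliability_from_separation(g)
    print(facet, "separation:", round(g, 2), "derived reliability:", round(derived, 2))
```

Under these assumed definitions the derived reliabilities round to .95, 1.00, and .86, which agrees with the reported column. Note that a high separation for raters means raters differ reliably in severity, so for the rater facet a lower value would indicate more interchangeable raters.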
GTEC CBT Results 2/2
Favorable overall
High rater agreement (actual: 79.4% > expected: 63.4%)
Criteria and raters fit the model
Bias analysis:
Rater × criteria: 0%
Rater × test taker: 1.39%
Test taker × criteria: 0.01%
Off-topic responses received unexpectedly low scores.
Appropriate rating scale functions
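The agreement figures above can be read as follows. The sketch below shows how an observed exact-agreement percentage is computed from paired ratings, and how the gap between observed and model-expected agreement can be summarized with a kappa-style index, (observed − expected) / (100 − expected). The function names and the toy ratings are assumptions; only the 79.4% and 63.4% figures come from the slide.

```python
def exact_agreement_percent(ratings_a, ratings_b):
    """Percentage of responses on which two raters gave the identical rating."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(ratings_a, ratings_b) if a == b)
    return 100.0 * matches / len(ratings_a)

def agreement_beyond_expected(observed_percent, expected_percent):
    """Kappa-style index: agreement beyond what the model expects, rescaled."""
    return (observed_percent - expected_percent) / (100.0 - expected_percent)

# Toy paired ratings (hypothetical): the raters agree on 3 of 4 responses.
print(exact_agreement_percent([3, 2, 1, 2], [3, 2, 2, 2]))

# Figures reported on the slide: actual 79.4%, expected 63.4%.
print(round(agreement_beyond_expected(79.4, 63.4), 2))
```

With the reported figures the index is roughly 0.44. A value near 0 would mean raters agree about as often as the model predicts for independent raters, while clearly positive values mean they agree more often than that.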
TSST (Telephone Standard Speaking Test)
Developed by ALC (https://tsst.alc.co.jp/tsst/e_contact.html)
Based on ACTFL OPI
Construct: how well a person can answer function-based questions spontaneously
15-min telephone-mediated test
Prompts both in L1 Japanese and L2 English
Levels 1 to 9 (Novice to Advanced)
CEFR A1 to B2
TSST
10 tasks (no preparation time; speaking for 45 sec)
Questions selected randomly from a question pool
From intermediate level to advanced level tasks
E.g.,
TSST rater training, quality maintenance
Certification
Training sessions annually
Independent ratings by three raters
One of them is an experienced rater.
Analysis (Koizumi, 2016): 5406 test takers, 771 tasks, 32 trained raters
Facets (Version 3.71.4), using a rating scale model
TSST Results 2/2
Favorable overall
Tasks and raters fit the model
The separation of tasks into 2 levels is intentional (intermediate- and advanced-level tasks)
High % of underfitting test takers
Appropriate rating scale function
Requires further analysis
Toward reforms in Japanese university entrance exams
Recent use of multi-faceted Rasch analysis in L2 speaking tests
Should be expanded to other types of tests
Routine analysis and reporting to the public will benefit both test developers and test users.
Need for cooperation with content experts and measurement professionals
Acknowledgments
I thank Dr. Zhang and the organizing committee members for inviting me to PROMS 2016.
This work was supported by the Japan Society for the Promotion of Science (JSPS) KAKENHI, Grant-in-Aid for Scientific Research (C), Grant Number 26370737.
References
Barkaoui, K. (2013). Multifaceted Rasch analysis for test evaluation. In A. Kunnan (Ed.), The companion to language assessment (Vol. III: Evaluation, Methodology, and Interdisciplinary Themes, Part 10: Quantitative analysis, pp. 1301–1322). West Sussex, UK: John Wiley & Sons. doi:10.1002/9781118411360.wbcla070
Bond, T. G., & Fox, C. M. (2015). Applying the Rasch model: Fundamental measurement in the human sciences (3rd ed.). New York, NY: Routledge.
Chapelle, C. A., Enright, M. K., & Jamieson, J. M. (Eds.). (2008). Building a validity argument for the Test of English as a Foreign Language™. New York, NY: Routledge.
Eckes, T. (2011). Introduction to many-facet Rasch measurement: Analyzing and evaluating rater-mediated assessments. Frankfurt am Main, Germany: Peter Lang.
Engelhard, G., Jr. (2013). Invariant measurement: Using Rasch models in the social, behavioral, and health sciences. New York, NY: Routledge.
Fulcher, G. (2003). Testing second language speaking. Essex, U.K.: Pearson Education Limited.
Kuramoto, N., & Koizumi, R. (2016). Current issues in large-scale educational assessment in Japan: Focus on national assessment of academic ability and university entrance. Manuscript submitted for publication.
Koizumi, R. (2016). Validity argument for TSST. Unpublished manuscript.
Koizumi, R., Okabe, Y., & Kashimada, Y. (2016, August). Rater reliability in GTEC CBT speaking section: Using multi-faceted Rasch analysis. Paper presented at the 32 Japan Society of English Language Education (JASELE), Saitama 2016. Dokkyo University, Saitama.
McNamara, T. (1996). Measuring second language performance. Essex, U.K.: Addison Wesley Longman Limited.
McNamara, T., & Knoch, U. (2012). The Rasch wars: The emergence of Rasch measurement in language testing. Language Testing, 29, 555–576. doi:10.1177/0265532211430367
MEXT (Ministry of Education, Culture, Sports, Science & Technology). (2014). [On integrated reforms in high school and university education and university entrance examination aimed at realizing a high school and university articulation system appropriate for a new era―Creating a future for the realization of the dreams and goals of all young people: Report]. Retrieved from http://www.mext.go.jp/b_menu/shingi/chukyo/chukyo0/toushin/1354191.htm (for the abbreviated version in English, see http://www.mext.go.jp/english/topics/1356088.htm)
Mizohata, Y. (2016). Nyuushi to supiikingu no hyouka [Entrance examinations and speaking assessment]. In E. Izumi & S. Kadota (Eds.), Eigo Supiikingu shidou handobukku [Handbook of teaching English speaking] (pp. 282-287). Tokyo: Taishukan.
Nakatsuhara, F. (2014). A research report on the development of the Test of English for Academic Purposes (TEAP) speaking test for Japanese university entrants—Study 1 & Study 2. Retrieved from http://www.eiken.or.jp/teap/group/pdf/teap_speaking_report1.pdf
Yomiuri Shimbun Kyouikubu. (2016). Daigaku nyuushi kaikaku [University entrance examination reforms: Reports from Japan and abroad]. Tokyo: Chuokoron-Shinsha.