University of Szeged, MTA-SZTE Research Group on the Development of Competencies
Athena seminar
University of Helsinki, 20th of October, 2016
Attila Pásztor MTA-SZTE Research Group on the Development of Competencies www.edu.u-szeged.hu/phd/people/apasztor/
Online assessment and development of
thinking in school context
Short introduction
Attila Pásztor, junior research fellow
Research Group on the Development of Competencies, Hungarian Academy of Sciences, University of Szeged
Research interest: Online assessment and development of thinking in school context
Graduated as a psychologist
PhD candidate at the Doctoral School of Education; supervisor: Benő Csapó
Title: Technology-based assessment and development of inductive reasoning
I am also involved in other projects aiming to assess and develop other (thinking) skills such as combinatorial reasoning, creativity (divergent thinking), scientific reasoning, pupils’ capacity to follow instructions, and mouse use skills.
Outline
Brief overview of the Hungarian school system
Possibilities of technology-based assessment and the eDia project
Online assessment of thinking skills:
Mouse use skills, pupils’ capacity to follow instructions, combinatorial
reasoning, creativity (divergent thinking), inductive reasoning
Possibilities of technology-based development - DGBL
Playful fostering of inductive reasoning in computer-based
environment
Implications for policy making
Brief overview of the Hungarian school system
[Figure: structure of the Hungarian school system; surviving labels: Plays, Day-nursery - Bölcsőde]
PISA results: 2000-2012
[Line chart: Mathematics, Science, Reading; vertical axis: score, 475 to 505; horizontal axis: 2000, 2003, 2006, 2009, 2012]
Problem solving
PISA 2012 - reading: below level 2 (%, OECD countries)
[Bar chart; axis: 0 to 45%. Countries in chart order: Korea, Estonia, Ireland, Japan, Poland, Canada, Finland, Switzerland, Netherlands, Australia, Germany, Denmark, Belgium, Norway, New Zealand, United States, United Kingdom, Czech Republic, Spain, Portugal, France, Austria, Italy, Hungary, Iceland, Slovenia, Turkey, Luxembourg, Greece, Sweden, Israel, Slovak Republic, Chile, Mexico. Highlighted value: 19.7]
PISA 2012 - mathematics: below level 2 (%, OECD countries)
[Bar chart; axis: 0 to 60%. Countries in chart order: Korea, Estonia, Japan, Finland, Switzerland, Canada, Poland, Netherlands, Denmark, Ireland, Germany, Austria, Belgium, Australia, Slovenia, Czech Republic, Iceland, United Kingdom, Norway, France, New Zealand, Spain, Luxembourg, Italy, Portugal, United States, Sweden, Slovak Republic, Hungary, Israel, Greece, Turkey, Chile, Mexico. Highlighted value: 28.1]
It was 23% in 2003
PISA 2012 - mathematics: on level 6 (%, OECD countries)
[Bar chart; axis: 0 to 35%. Countries in chart order: Lithuania, Russian Federation, Latvia, Sweden, Croatia, Denmark, Norway, Portugal, Hungary, Italy, Ireland, Israel, United States, Iceland, Luxembourg, United Kingdom, Slovak Republic, France, Czech Republic, Austria, Slovenia, Viet Nam, Finland, Estonia, Australia, Canada, Netherlands, New Zealand, Germany, Poland, Belgium, Switzerland, Liechtenstein, Macao-China, Japan, Korea, Hong Kong-China, Chinese Taipei, Singapore, Shanghai-China. Highlighted values: 2.1 and 3.8]
PISA 2012 - science: on level 6 (%, OECD countries)
[Bar chart; axis: 0 to 6%. Countries as extracted: Croatia, Latvia, Russian Federation, Spain, Lithuania, Macao-China, Hungary, Chinese Taipei, Slovak Republic, Italy, Israel, Iceland, Denmark, Sweden, Austria, Czech Republic, Belgium, France, Viet Nam, Switzerland, Liechtenstein, Korea, Norway, United States, Luxembourg, Slovenia, Netherlands, Ireland, Germany, Poland, Estonia, Hong Kong-China, United Kingdom, Canada, Australia, New Zealand, Finland, Japan, Shanghai-China, Singapore. Highlighted values: 0.5 and 5.8]
TIMSS results, 8th grade: 1995-2011
[Line chart: Mathematics, Science; vertical axis: score, 500 to 550; horizontal axis: 1995, 1999, 2003, 2007, 2011]
Possibilities of technology-based assessment (TBA)
Paper-based and face-to-face testing is resource- and time-consuming
Assessment in early childhood – importance of manipulation, problem of reading skills
TBA - innovative item design and test administration:
possibilities for manipulation
students can listen to the instructions
group assessment
automatic scoring
instant feedback
Easy-to-use instruments in everyday school practice
TBA: impacts research practice as well
The quality of the data can be increased – e.g. fewer human errors in coding and data administration, fewer subjective aspects of scoring
Individual testing - e.g. adaptive testing
Assessing new constructs – e.g. problem solving
Large scale assessments even with young students - pre-recorded instructions, interactive items
Follow up – longitudinal studies
Faster and "easier" data collection
But…
High costs of the development of a system in the initial period
Technical conditions in schools
Media effects
Brief introduction of the eDia project
Developing Diagnostic Assessments
Main features:
Three main domains:
- Reading
- Mathematics
- Science
The target population is 1st to 6th grade students.
edia.hu
Three dimensions:
- Psychology: thinking skills (general skills)
- Literacy: application of knowledge (PISA, 3 main domains)
- Disciplinary: content knowledge (based on the curriculum)
Further domains
Writing
ICT literacy
Economic literacy
Musical abilities
Problem solving
Combinatorial reasoning
Creativity
Social skills
Visual language
Civic competence
English
Motivation
Health literacy
Learning to learn
Inductive reasoning
Partner schools network (around 800 elementary schools and 100 secondary schools)
edia.hu/OKE
Outline
Brief overview of the Hungarian school system
Possibilities of technology-based assessment and the eDia project
Online assessment of thinking skills:
Mouse use skills, pupils’ capacity to follow teacher instructions,
combinatorial reasoning, creativity (divergent thinking), inductive
reasoning
Possibilities of technology-based development - DGBL
Playful fostering of inductive reasoning in computer-based
environment
Educational assessment and policy making in Hungary
Mouse use skills or ICT familiarity test
Methods – figural series
Pupils’ capacity to follow teacher instructions
Pupils’ capacity to follow
instructions
Important influencing factor of students’ school achievement and reasoning especially in early school years (Vainikainen, 2014)
The tasks were originally developed by Elkonin (see Elkonin & Venger, 1988)
Graphic dictation: 1+2 patterns – drawing lines in a grid according to the teacher’s dictation
Modified version by Hautamäki, J., Arinen, P., Hautamäki, A.,
Lehto, J., Lindblom, B., Kupiainen, S., Outinen, K., Pekuri, M.,
Reuhkala, M., Scheinin, P. (2001).
Methods - participants
Participants:
5628 first-grade students (2809 boys and 2687 girls, age M=7.09, SD=.48)
166 primary schools, 278 classes
Data collection: October, 2015
Methods: instruments, procedure
The test was part of an Online School Readiness test Battery
2 tasks, 23 items
21 dictation items
2 pattern following items
Procedure:
eDia platform - edia.hu (Molnár, 2015)
schools’ ICT rooms
headsets to listen to instructions
automatic scoring, instant feedback
ICT familiarity test: 10 items, Cronbach α=.62, M=91.1% SD=13.4%
+ 5 practice items: M=96.6% SD=11%
Sample task 2
Results
Task                     Nr. of items   Cronbach’s α   Mean (%)   SD (%)
Dictation tasks          21             .916           66.4       28.4
Pattern follow task 1    1              -              38.5       48.7
Pattern follow task 2    1              -              17.7       38.2
Pattern follow tasks     2              -              28.1       34.3
Foll. instr. test        23             .913           63.4       26.9
Correlations:
r (dictation, pattern follow 1-2) = .37
r (dictation, pattern follow 1) = .29
r (dictation, pattern follow 2) = .30
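The reliability values reported above are Cronbach's α coefficients. As a rough illustration of how such a coefficient can be computed from item-level scores, here is a minimal sketch using the standard formula α = k/(k-1) · (1 - Σ item variances / variance of total scores). The function name and the tiny demo data set are hypothetical and are not taken from the study.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of per-student item-score lists.

    `scores` is a hypothetical input format: one inner list per student,
    one entry per item (e.g. 1 = correct, 0 = incorrect).
    """
    k = len(scores[0])  # number of items
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Variance of each item across students. Population variance is used
    # for both terms, so the ratio equals the sample-variance version.
    item_columns = list(zip(*scores))
    item_var_sum = sum(pvariance(col) for col in item_columns)
    # Variance of the students' total scores.
    total_var = pvariance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_var_sum / total_var)


# Made-up demo data: 4 students, 3 dichotomous items.
demo = [[1, 1, 1], [1, 1, 0], [1, 0, 0], [0, 0, 0]]
alpha = cronbach_alpha(demo)
```

With only one item per pattern follow task, no α can be computed, which is why those rows show a dash.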
Results
Model   χ2        df    p      CFI    TLI    RMSEA (95% CI)
2-dim   5341.78   229   .001   .944   .938   .062 (.061–.064)
1-dim   5341.77   230   .001   .943   .937   .063 (.061–.064)
Difference test: χ2=95.29; df=3; p<.001
Testing time: 19.3 min SD=10.6
Only 41 students didn’t reach the end of the test. (N=5628)
Results – correlations with other domains
Test                  Following instructions
ICT literacy          .23
Early mathematics     .50
Early reading         .46
Inductive reasoning   .40
Further research needed…
Combinatorial reasoning test
Combinatorial reasoning
Major role in Piaget’s theory.
Plays central role in scientific reasoning, problem
solving and creativity (English, 1993; Kishta,
1979; Lockwood, 2013).
Enumeration of all constructs under the given
conditions - large number of responses, scoring.
Cartesian product – figural content
Cartesian product – formal content
Methods - scoring
Scoring (Csapó, 1988):
\[ J = \frac{x\,(T - y)}{T^{2}} \]
where x = number of right formations given by the students; y = number of redundant and wrong formations given by the students; T = number of all possible right formations.
Scores range from 0 to 1, where 1 means the production of all right formations without redundancy or any wrong answers.
Fully computerized, instant feedback after completing the test.
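A minimal sketch of how the scoring above could be automated, assuming the J-index is read as J = x(T - y)/T², with x the number of right formations, y the number of redundant and wrong formations, and T the number of all possible right formations. The function name is hypothetical, and clamping negative values to 0 (which arises when y exceeds T) is an assumption made here so that scores stay within the stated 0 to 1 range; it is not stated on the slide.

```python
def j_index(x, y, T):
    """J-index for one combinatorial task, read as J = x * (T - y) / T**2.

    x: number of right formations given by the student
    y: number of redundant and wrong formations given by the student
    T: number of all possible right formations
    """
    if T <= 0:
        raise ValueError("T must be positive")
    if x < 0 or y < 0:
        raise ValueError("x and y must be non-negative")
    j = x * (T - y) / T**2
    # Assumption: scores below 0 (when y > T) are clamped to 0.
    return max(0.0, j)


# Made-up examples for a task with T = 6 possible right formations.
perfect = j_index(6, 0, 6)       # all right formations, nothing redundant
half = j_index(3, 0, 6)          # half of the right formations
penalized = j_index(6, 3, 6)     # all right formations plus 3 redundant/wrong
```

In this reading, a student who lists every right formation but also adds three redundant ones scores the same as a student who lists only half of them without errors.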
Results
Grade       N     Number of items   Cronbach α
4th grade   219   10                .88

Model                      χ2       df   p      CFI   RMSEA
1-dimensional - content    195.36   35   <.01   .84   .15
2-dimensional - content    44.18    34   .11    .99   .04
  Difference (content):    Δχ2 = 151.18, Δdf = 1, p < .01
1-dimensional - operat.    197.87   20   <.01   .81   .19
2-dimensional - operat.    120.74   14   <.01   .87   .19
  Difference (operat.):    Δχ2 = 77.13, Δdf = 6, p < .01
Note: df = degrees of freedom; CFI = Comparative Fit Index; TLI = Tucker–Lewis Index;
RMSEA = Root Mean Square Error of Approximation; χ2 and df are estimated by WLSMV.
Results
[Histogram: frequency (nr. of students, 0 to 50) by achievement (%)]
Creativity – divergent thinking test
Divergent thinking (Guilford, 1959)
Convergent thinking: the ability to apply rules to arrive at a single ‘correct’ solution to a problem or task (e.g. intelligence tests). The process is systematic and linear.
Divergent thinking: the process of generating multiple related ideas for a given topic or solutions to a problem. It occurs in a spontaneous, free-flowing, ‘non-linear’ manner.
It has been considered as an indicator of creative potential (Kim, 2006; Runco & Acar, 2012)
Assessment of divergent thinking
Torrance Test of Creative Thinking (TTCT, 1966)
Verbal and figural tasks
Unusual Uses task (e.g. book)
Circles task
Wallach–Kogan Creativity Test (WKCT, 1965)
Verbal subtests
Alternative Uses (e.g. for a newspaper)
Instances (e.g. name all the round things you can think of)
Similarities (e.g. How are a cat and mouse similar?)
Figural subtests
Pattern Meanings and Line Meanings (interpreting abstract patterns and lines)
Instrument – divergent thinking
Divergent thinking:
Based on Torrance’s (1966) and Wallach and Kogan’s (1965) open-ended item types
Computerized data collection (eDia platform: edia.hu)
Nine tasks:
3 alternative uses tasks – verbal subtest 1 (match, cup, toothbrush)
3 instances tasks – verbal subtest 2 (transparent, produce light, jingle)
3 picture meaning tasks – figural subtest
Students had three minutes to provide answers for each task.
CENTER FOR RESEARCH ON LEARNING AND INSTRUCTION DEVELOPING DIAGNOSTIC ASSESSMENTS
TÁMOP 3.1.9-11/1-2012-0001
The three stimuli for the picture meaning tasks
Methods - participants
Participants:
Sixth-grade students (N=1984, 1005 boys and 937 girls, age M=12.05, SD=.51)
78 primary schools, 97 classes
Assessment of divergent thinking
Fluency: the ability to generate numerous responses - the total number of interpretable, meaningful, and relevant ideas generated in response to the stimulus
Flexibility: shifts in approaches to produce numerous ideas - the number of different categories of relevant responses
Originality: the ability to produce unusual ideas - the statistical rarity of the responses.
Methods – scoring divergent thinking
Fluency: number of relevant answers
Flexibility: number of categories implied by the responses
Originality: statistical rarity of the responses (formula developed by Barkóczi & Klein, 1968)
Four raters: categories were created, all answers categorized manually and decisions made about questionable answers with regard to relevance
A separate online platform was developed which calculated the three indices automatically.
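The three indices could be calculated along the following lines once the raters' category decisions are available. This is only a sketch under stated assumptions: the function and parameter names are hypothetical, repeated answers by the same student are counted once, and the originality weight used here (1 minus the share of students who gave the same answer) is a simple stand-in for statistical rarity, not the Barkóczi & Klein (1968) formula, which is not given on the slides.

```python
from collections import Counter


def score_divergent_thinking(responses, category_of, sample_counts, n_students):
    """Return (fluency, flexibility, originality) for one student and task.

    responses:     the student's raw answers (strings)
    category_of:   dict mapping each relevant answer to its rater-assigned
                   category; answers missing from it count as irrelevant
    sample_counts: Counter of how many students gave each relevant answer
    n_students:    number of students in the sample
    """
    # Normalize and de-duplicate answers (assumption), keep relevant ones only.
    normalized = {r.strip().lower() for r in responses}
    relevant = {r for r in normalized if r in category_of}

    fluency = len(relevant)
    flexibility = len({category_of[r] for r in relevant})
    # Hypothetical rarity weight: rarer answers contribute more.
    originality = sum(1 - sample_counts[r] / n_students for r in relevant)
    return fluency, flexibility, originality


# Made-up data for an "alternative uses of a match" task, 10 students.
category_of = {"light a fire": "fire", "light a candle": "fire", "toothpick": "tool"}
sample_counts = Counter({"light a fire": 8, "light a candle": 5, "toothpick": 1})
fluency, flexibility, originality = score_divergent_thinking(
    ["Light a fire", "toothpick", "light a fire", "xyzzy"],
    category_of, sample_counts, 10,
)
```

Keeping the category mapping as plain data mirrors the workflow described above: raters decide relevance and categories manually, and the indices are then derived automatically.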
Results
Results
Results
Difference test: χ2=386.01; df=3; p<.001
Assessment of inductive reasoning from kindergarten to fourth grade
Inductive reasoning and the need for fostering it
Plays a central role in knowledge acquisition and in the transfer of knowledge (Klauer & Phye, 2008; Molnár, Greiff, & Csapó, 2013).
In order to foster inductive reasoning as early as possible in educational context a detailed understanding of its nature and development is essential (Csapó, 2003).
Assessment: series, analogies, matrices, classification (Klauer, 2008; Molnár, 2011; Piaget, 1967).
Methods - participants
Fourth grade students:
5017 students (2534 boys and 2483 girls, age M=10.3, SD=.49)
Data collection: October 2014
First grade students:
6013 students (3026 boys and 2872 girls, age M=7.1, SD=.48)
Data collection: October 2015
Kindergarten:
278 students (144 boys and 134 girls, age M=5.6, SD=.69)
Data collection: February – April 2016
Altogether: 11308 children
Methods
Kindergarten
tablets
small groups, trained test administrators
1st and 4th grade
computers
schools’ ICT rooms
ICT familiarity test: 1st grade: M=91.1% SD=13.4%
Kindergarten: M=89.2%; SD=14.3%
Methods
Nr. of items per test (last column: total)
Kindergarten   3    4    9    18   34
1st grade      9    18   1    2    32
4th grade      4    18   2    30   56
Altogether: 67 items
Anchors in the three tests: 18 items
Anchors in kindergarten and 1st grade: 27 items
Anchors in 1st grade and 4th grade: 20 items
Results
Age group / subtest     Number of items   Cronbach’s α   Mean (SD) %   N
4th grade – all         56                .93            64.2 (18.9)   5017
  Figural series        20                .83            74.5 (19.9)   5016
  Figural analogies     21                .85            63.7 (21.4)   5012
  Number series         8                 .73            49.5 (25.2)   5009
  Number analogies      7                 .70            53.0 (27.1)   5004
1st grade – all         32                .89            41.2 (22.2)   6013
  Figural series        12                .81            43.1 (26.5)   5988
  Figural analogies     13                .79            39.2 (24.4)   5980
  Classification        7                 .77            41.6 (31.6)   5972
Kindergarten – all      34                .87            25.6 (17.2)   278
  Figural series        14                .80            21.8 (2.6)    276
  Figural analogies     13                .77            24.2 (21.2)   273
  Classification        7                 .71            2.8 (24.3)    271
Results
[Figure: four item-person maps; columns: parameters (logit), cases, items]
4th grade: EAP/PV=.92 (each ‘x’ represents 7.8 cases)
1st grade: EAP/PV=.87 (each ‘x’ represents 7.8 cases)
Kindergarten: EAP/PV=.81 (each ‘x’ represents 7.8 cases)
Fourth map: EAP/PV reliability: 0.943 (each ‘x’ represents 14.3 cases)
Results
[Figure: skill level (logit, from -4.5 to 4.5) by age (1 to 15)]
Results
[Figure: frequency (%, 0 to 25) by skill level (logit, from -5.5 to 5.5); series: kindergarten - not school aged, kindergarten - school aged, 1st grade, 4th grade]
Summary, further research aims
Inductive reasoning develops fast during kindergarten and in the first school years
Large individual differences
Implications for interventions
Further research: test development, data collection in 2nd and 3rd grade, interventions, relations to other constructs and school achievements, log file analyses
Face-to-face studies
Molnár, Gy. (2011). Playful fostering of 6- to 8-year-old students’ inductive reasoning. Thinking Skills and Creativity, 6(2), 91-99.
Pásztor, A. (2014). [Challenges and possibilities in digital game-based learning: Effectiveness of a playful inductive reasoning training program]. Magyar Pedagógia, 114(4), 281-301.
Hotulainen, R., Mononen, R., & Aunio, P. (2016). Thinking skills intervention for low-achieving first graders. European Journal of Special Needs Education, 1-16.
Playful fostering of inductive reasoning in a computer-based environment
Outline
Theoretical background of the program
TBA and DGBL
Pilot study
Conclusions
Theoretical background of the training program
Klauer and Phye, 2008, p. 87.
Theoretical background of the training program
Klauer and Phye, 2008, p. 89.
Theoretical background of the training program
Klauer and Phye, 2008, p. 88.
Three training programs have been developed:
• Program I for children age 5 through 8;
• Program II for children age 11 through 13;
• Program III for youth age 14 through 16.
Paper-based, content general, 120 learning tasks
Well-documented (Barkl, Porter, & Ginns, 2012; de Koning & Hamers, 1999; de Koning, Hamers, Sijtsma, & Vermeer, 2002; Hamers, de Koning, & Sijtsma, 1998; Klauer, 1996; Klauer, 1997; Klauer, 1999; Klauer, Willmes, & Phye, 2002; Tomic, 1995; Tomic & Kingma, 1998; Tomic & Klauer, 1996)
Meta-analyses (Klauer & Phye, 2008)
• Transfer on intelligence: d=.52
• Transfer on learning: d=.69
Theoretical background of the training program
Technology-based Assessment and
Digital Game-Based Learning
• Immediate feedback
• Instructional support – formative assessment
• Criteria
• Optimizing the learning process – adaptive testing
• Interactivity, tasks with manipulation
• Innovative task design – e.g. audiovisual elements
• Practical advantages
+ Gamification – e.g. story, personalization
UNIVERSITY OF SZEGED – GRADUATE SCHOOL OF EDUCATIONAL SCIENCES
In case of failure…
In case of success…
Methods - procedure
• Experimental group: N=88; matched control group from a sample of N=240
• Students were trained in groups of 20
• 5 sessions, 24 learning tasks, 20-30 minutes
• 1 session per week in the afternoon (after teaching)
Results
[Line chart: achievement (%, 30 to 70) at pre-test and post-test; series: experimental group, control group]
n.s.
t(174) = -2.288, p = .02
Cohen's d (control) = .15
Cohen's d (experimental) = .47
Cohen's d (control vs. experimental) = .33
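For readers less familiar with the effect sizes reported above, here is a minimal sketch of Cohen's d for two independent groups, using the pooled standard deviation. The slides do not state which variant of d was used (for example, how the pre-test to post-test effects within a group were computed), so the pooled-SD form is an assumption; the function name and demo scores are made up.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d for two independent groups, based on the pooled SD."""
    na, nb = len(group_a), len(group_b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two scores")
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = (
        (na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2
    ) / (na + nb - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up post-test scores, not data from the study.
d = cohens_d([2, 4, 6], [1, 3, 5])
```

By the usual rule of thumb, values around .2 are read as small, around .5 as medium, and around .8 as large effects.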
Results
• There was no significant difference in the value of development with regard to gender:
t(86) = -.520, p= .83
• and grade:
t(86) = -.425, p= .85
Results
Discussion
"Easy to use" instrument: providing individualized feedback,
no need for permanent teacher presence, can be applied in
larger groups
Transfer effects - other skills, content domain
Long term effects, placebo effects
Integrating online assessment and development
Development of the training: more effective instructional
support, motivation
Implications for policy making
Opinion vs. evidence-based policy in Hungary
Technology-based assessment: increases the effectiveness of
feedback mechanisms at all educational levels
Data driven policy making
Assessment and interventions
Knowledge transfer into classroom teaching
Importance of research-based teacher education
Should using a system like eDia be obligatory for schools?