Large scale quantitative studies in educational research
Nic Spaull | SAERA conference | Durban
Presentation available online: nicspaull.com/presentations | 12 August 2014
Objectives of the workshop
• For participants to leave with:
1. A good idea of what large-scale data exist in SA and which assessments SA participates in.
2. An appreciation of why we need them.
3. A sense of which areas of research are most amenable to analysis using quantitative data.
(The focus here is on non-technical, usually descriptive, analyses of large-scale education data. There is obviously an enormous field of complex multivariate research using quantitative data. See Hanushek and Woessmann, 2013.)
1. What do we mean by “large-scale quantitative research”?
Firstly, what do we mean when we say “large-scale quantitative studies”?
– Large-scale: usually implies some sort of representivity of an underlying population (if sample-based) or sometimes the whole population.
– There are two “main” sources of large-scale data in education:
1. Assessment data and concomitant background info (PIRLS/TIMSS/SACMEQ/ANA/Matric/NSES)
2. Administrative data like EMIS, HEMIS, PERSAL etc.
– Quantitative: the focus is more on breadth than depth.
• As an aside: in the economics of education, qualitative research that uses numerical indicators for the 15 (?) schools it is looking at would not really be considered quantitative research. The focus is still qualitative.
Qualitative vs quantitative research
• Number of schools
– Qualitative: usually a small number of schools (1-50?) selected without intending to be representative (statistically speaking)
– Quantitative: usually a large number of schools (250+) that may or may not be representative of an underlying population
• Over-arching interest
– Qualitative: depth over breadth
– Quantitative: breadth over depth
• Can make population-wide claims?
– Qualitative: No. This is one of the major limitations.
– Quantitative: Yes. This is one of the major advantages.
• Scope of research
– Qualitative: usually very specific, getting detailed information pertinent to the specific research topic.
– Quantitative: often quite broad but shallow (one dataset might be analysed from an SLM perspective, a content perspective, a resourcing perspective etc.)
• Numerical summaries of data
– Qualitative: less important
– Quantitative: more important
Personal reflections – please challenge me on these…
1. What are we talking about?
A. Types of research questions that are amenable to quantitative research:
– How many students in South Africa are literate by the end of Grade 4?
– What proportion of students have their own textbook?
– What do grade 6 mathematics teachers know relative to the curriculum?
– Which areas of the grade 9 curriculum do students battle with the most?
– How large are learning deficits in Gr3? Gr6? Gr9?
B. Types of research questions that are LESS amenable to quantitative research:
– Which teaching practices and styles promote/hinder learning?
– Questions relating to personal motivation, school culture, leadership style etc. (all of which require in-depth observation and analysis)
– All the ‘philosophical’ areas of research: What is education for? What is knowledge? Says who? Who should decide what goes into the curriculum? How should they decide? Should education be free?
That being said, researchers do tackle some “type-B” questions (the non-philosophical ones) using quantitative data, and have often made important contributions. The scope of the questions is usually quite limited, but the breadth of coverage and the ability to control for other variables often make the analysis insightful.
1. What are we talking about? (cont.)
• To provide one example: if we look at something like school leadership and management (SLM), there are various approaches to researching this, including:
– In-depth study of a small number (15) of schools (something like the SPADE analysis of Galant & Hoadley)
– Using existing large-scale data sets to try and understand how proxies of SLM are related to performance. To provide some examples…
The above analysis is taken from Gabi Wills (2013)
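Overview of assessments (sample-based or census-based, with rounds; number of schools; number of students; comparable over time?)
Cross-national studies of educational achievement
• TIMSS: sample-based (1995, 1999, 2003, 2011); 285 schools; 11969 students; comparable over time: Yes
• SACMEQ: sample-based (2000, 2007, 2013); 392 schools; 9071 students; comparable over time: Yes
• PIRLS: sample-based (2006, 2011 (Eng/Afr only)); 92 schools; 3515 students; comparable over time: Sort of
• prePIRLS: sample-based (2011); 341 schools; 15744 students; comparable over time: NA
National assessments (diagnostic)
• Systemic Evaluations: sample-based (2004 (Gr6), 2007 (Gr3)); number of schools: 2340; number of students: 54 (as extracted; possibly truncated); comparable over time: Sort of
• ANA: census-based (2011/12/13/14); number of schools: 24 (as extracted; possibly truncated); number of students: 7mil; comparable over time: Definitely not
• Verification-ANA: sample-based (2011, 2013 (Gr 3 & 6)); 2164 schools (125/prov); comparable over time: No
• NSES*: sample-based (Gr3 (2007), Gr4 (2008), Gr5 (2009)); 266 schools; 24000 students (8383 panel); comparable over time: Yes (+ longitudinal)
National assessments (certification)
• Matric: census-based; 6591 schools; about 550,000 students
*Number of schools and students is for the most recent round of assessments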
Differences between national assessments and public exams
• National assessments: like TIMSS/PIRLS/SACMEQ
• Public exams: like matric
Source: Greaney & Kellaghan (2008)
There are also other assessments which SA doesn’t take part in…
School-based
• PISA: Programme for International Student Assessment [OECD]
• ICCS: International Civic and Citizenship Education Study [IEA]
Home-based
• IALS: International Adult Literacy Survey [OECD]
• ALLS: Adult Literacy and Life Skills Survey [OECD]
• PIAAC: Programme for the International Assessment of Adult Competencies [OECD]
For more information see: http://www.ierinstitute.org/
Source: IERI Spring Academy 2013
An aside on matrix sampling…
Because one
1. can only test students for a limited amount of time (due to practical reasons and cognitive fatigue), and
2. cannot cover the full curriculum in a 2 hour test (at least not in sufficient detail for diagnostic purposes),
it becomes necessary to employ what is called matrix sampling.
• If you have 200 questions that cover the full range of the maths curriculum you could split this into 20 modules of 10 questions.
• If a student can cover 40 questions in 2 hours then they can write 4 modules.
• Different students within the same class will therefore write different tests with overlapping modules.
• Matrix sampling allows authorities to cover the full curriculum and thus get more insight into specific problem-areas, something that isn’t possible with a (much) shorter test.
• TIMSS/PIRLS/PISA all employ matrix sampling. SACMEQ 2000 and 2007 did not employ matrix sampling (all children wrote the same test) but from 2013 I think they are doing matrix sampling as well.
• This highlights one of the important features of sample-based assessments: the aim is NOT to get an accurate indication of any specific child or specific school but rather of some aggregated population (girls/boys/provinces/etc.)
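The 200-question example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rotating booklet design and the names `build_modules` and `assign_booklets` are hypothetical, not the actual booklet scheme used by TIMSS, PIRLS or PISA.

```python
import random

def build_modules(n_questions=200, module_size=10):
    """Split question IDs 0..n_questions-1 into consecutive modules."""
    questions = list(range(n_questions))
    return [questions[i:i + module_size]
            for i in range(0, n_questions, module_size)]

def assign_booklets(n_students, n_modules=20, modules_per_student=4, seed=0):
    """Give each student a booklet of consecutive modules with a rotating start.

    Assumed design for illustration: because the start position rotates,
    neighbouring booklets share modules and every module gets used.
    """
    rng = random.Random(seed)
    starts = list(range(n_modules))
    rng.shuffle(starts)
    booklets = []
    for s in range(n_students):
        start = starts[s % n_modules]
        booklets.append([(start + k) % n_modules
                         for k in range(modules_per_student)])
    return booklets

modules = build_modules()                # 20 modules of 10 questions
booklets = assign_booklets(n_students=40)

# Each student answers 4 modules x 10 questions = 40 questions
questions_per_student = sum(len(modules[m]) for m in booklets[0])
# Across the class, all 20 modules (the full curriculum) are covered
covered = {m for b in booklets for m in b}
print(questions_per_student, len(covered))  # 40 20
```

No single student sees more than 40 questions, yet the class as a whole covers all 200, which is why results are meaningful only for groups and not for individual children.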
Sample-based assessments (cont.)
• The aim of sample-based assessments is to be able to gain insight (and make statements) that pertain to an underlying population AND NOT the sampled schools.
• For example, in SACMEQ the sample was drawn such that the sampling accuracy was at least equivalent to a Simple Random Sample of 400 students, which guarantees a 95% confidence interval for sample means that is plus or minus 1/10th of a student standard deviation (see Ross et al. 2005).
– This is largely based on the intra-class correlation coefficient (ICC), which measures how much of the total variance in scores lies between schools rather than within schools.
– In South Africa this meant we needed to sample 392 schools in SACMEQ 2007.
• Important to understand that there are numerous sources of error and uncertainty, especially sampling error and measurement error. Consequently one should ALWAYS report confidence intervals or standard errors.
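To see where the "plus or minus 1/10th of a standard deviation" comes from, and how the ICC drives the number of schools, here is a rough sketch. It uses the standard design-effect formula for equal-sized clusters, 1 + (b - 1) x ICC. The cluster size of 20 students per school and the ICC values are illustrative assumptions, not SACMEQ's actual parameters, and the function names are hypothetical.

```python
import math

def ci_half_width_in_sd(n_effective, z=1.96):
    """Half-width of a 95% CI for a mean, in student standard deviations."""
    return z / math.sqrt(n_effective)

def design_effect(cluster_size, icc):
    """Design effect for equal-sized clusters: 1 + (b - 1) * icc."""
    return 1 + (cluster_size - 1) * icc

def schools_needed(n_effective, cluster_size, icc):
    """Schools required so a cluster sample matches an SRS of n_effective."""
    students = n_effective * design_effect(cluster_size, icc)
    # Round first to avoid floating-point noise pushing ceil up by one
    return math.ceil(round(students / cluster_size, 6))

# An SRS of 400 gives roughly +/- 0.1 SD at 95% confidence
print(round(ci_half_width_in_sd(400), 3))  # 0.098

# Assumed values: 20 students tested per school, two illustrative ICCs
for icc in (0.3, 0.6):
    print(icc, schools_needed(400, cluster_size=20, icc=icc))
```

The higher the ICC (i.e. the more schools differ from one another), the more schools you need to sample to reach the same accuracy.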
Sample-based assessments (cont.)
• Once you know the ICC and therefore the number of schools you need to sample, you need a sampling frame (i.e. a list of all the schools in the population).
• One can also use stratification to ensure representivity at lower levels than the whole country (e.g. province or language group).
• Randomly select schools from the sampling frame.
• For example, for the NSES 2007/8/9…
[Map of sampled schools] Brown dots = former black schools; blue dots = former white schools; purple dots = schools included in NSES (courtesy of Marisa Coetzee)
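The three steps above (sampling frame, stratification, random selection) might look something like this in Python. The frame below is made up, and `stratified_sample` is a hypothetical helper. Real studies often select schools with probability proportional to size; this sketch uses equal-probability selection within each stratum to keep things simple.

```python
import random
from collections import defaultdict

def stratified_sample(frame, stratum_key, n_per_stratum, seed=0):
    """Draw a simple random sample of schools within each stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for school in frame:
        strata[school[stratum_key]].append(school)
    sample = []
    for name in sorted(strata):
        schools = strata[name]
        k = min(n_per_stratum, len(schools))
        sample.extend(rng.sample(schools, k))  # sampling without replacement
    return sample

# Hypothetical sampling frame: 3 provinces with 50 schools each
frame = [{"school_id": f"{p}-{i:03d}", "province": p}
         for p in ("EC", "KZN", "WC") for i in range(50)]

sample = stratified_sample(frame, "province", n_per_stratum=5)
print(len(sample))  # 15
```

Stratifying by province guarantees that every province appears in the sample, which a single draw from the whole frame would not.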
What kinds of administrative data exist?
• Education Management Information Systems (EMIS)
– Annual Survey of Schools
– SNAP
– LURITS: system aimed at being able to identify and follow individual learners using unique IDs
– SA-SAMS
• HEMIS – EMIS but for higher education
• PERSAL – payroll database
• School Monitoring Survey
• Infrastructure survey
• ECD Audit 2013
Overview
• Main educational datasets in South Africa:
– PIRLS 2006, 2011
– TIMSS 1995, 1999, 2002, 2011
– SACMEQ 2000, 2007, 2013
– V-ANA 2011
– ANA 2011, 2012
– NSES 2007, 2008, 2009
– EMIS (various)
– Matric (annual)
– Household surveys (various)
PIRLS
What:
• Progress in International Reading Literacy Study
• Tests the reading literacy of grade four children from 49 countries
• Run by CEA at UP on behalf of IEA (http://timss.bc.edu/)
When and Who:
• PIRLS 2006 (grade 4 and 5)
• PIRLS* 2011 (grade 5 Eng/Afr only)
• prePIRLS (grade 4)
Examples of how we can use it?
• Issues related to LOLT
• Track reading performance over time
• International comparisons
[Figure: prePIRLS reading score 2011 by test language (English, Afrikaans, siSwati, isiZulu, isiNdebele, isiXhosa, Setswana, Sesotho, Xitsonga, Tshivenda, Sepedi), with South Africa, Botswana and Colombia for comparison.]
[Figure: kernel density of reading test scores, African language schools vs English/Afrikaans schools.]
PIRLS 2006 – see Shepherd (2011)
prePIRLS 2011 – see Howie et al (2012)
TIMSS
What:
• Trends in International Mathematics and Science Study
• Tests mathematics and science achievement of grade 4 and grade 8 pupils
• Run by HSRC in SA on behalf of IEA (http://timss.bc.edu/)
When and Who:
• TIMSS 1995, 1999 (grade 8 only)
• TIMSS 2002 (grade 8 and 9)
• TIMSS 2011 (grade 9 only)
Examples of how can we use it?
• Interaction between maths and science
• Comparative performance of maths and science achievement
• Changes over time
TIMSS 2003 Maths – see Taylor (2011)
TIMSS 2011 Science – see Spaull (2013)
[Figure: TIMSS 2011 science score for middle-income countries, compared with South Africa (Gr9) by school quintile (Quintile 1 to Quintile 5 and Independent).]
[Figure: kernel density of Grade 8 mathematics score for South Africa Quintile 5, Chile, Chile Quintile 5, Singapore and Singapore Quintile 5.]
[Figure: South African mathematics and science performance in the Trends in International Mathematics and Science Study (TIMSS 1995-2011) with 95% confidence intervals around the mean (Spaull, 2013). Y-axis: TIMSS score. Values in chart-label order:]
• TIMSS Mathematics: 1995 (Gr8) 276; 1999 (Gr8) 275; 2002 (Gr8) 264; 2002 (Gr9) 285; 2011 (Gr9) 352; TIMSS 2011 middle-income country Gr8 mean 433
• TIMSS Science: 1995 (Gr8) 260; 1999 (Gr8) 243; 2002 (Gr8) 244; 2002 (Gr9) 268; 2011 (Gr9) 332; TIMSS 2011 middle-income country Gr8 mean 443
SACMEQ
What:
• Southern and Eastern Africa Consortium for Monitoring Educational Quality
• Tests the reading and maths performance of grade six children from 15 African countries
• Run by DBE – Q. Moloi (http://www.sacmeq.org/)
When and Who:
• SACMEQ II – 2000 (grade 6)
• SACMEQ III – 2007 (grade 6)
• SACMEQ IV – 2013 (grade 6)
Examples of how can we use it?
• Regional performance over time
• Teacher content knowledge
• Understanding the determinants of numeracy and literacy
SACMEQ III – see Spaull (2013)
SACMEQ III – see McKay & Spaull (2013)
[Figure: maths-teacher mathematics score, showing the mean with the lower and upper bounds of the 95% confidence interval.]
[Figure: kernel density of learner reading score by wealth quartile (poorest 25%, second poorest 25%, second wealthiest 25%, wealthiest 25%).]
SACMEQ III (Spaull & Taylor, 2014)
ANA
What:
• Annual National Assessments
• Assessment data on learner performance in maths and language
• Collected by DBE
When and Who:
• Grades 1-6 and 9 (maths and language - FAL and HL)
Examples of how can we use it?
• Analyse performance at primary grades, potentially at the micro-level (district/circuit)
• Create indicators for dashboards
• Report cards (once ANA is externally evaluated at one grade)
• Early indicators of problems/deficits
• Planning at primary school level
• Serious comparability problems between ANA 2011 and ANA 2012 (see SVDB and Spaull interview)
ANA – see Spaull (2012)
[Figure: School Categorisation by District (KZN), Universal ANA 2011. Y-axis: percent of schools. Categories by average school numeracy and literacy score: dysfunctional schools <30%; underperforming schools 30-40%; poor schools 40-50%; good schools 50-60%; great schools 60-70%; excellent schools 70%+.]
[Figure: Correlation Between Avg. School Gr3 and Gr6 Numeracy Score (KZN), U-ANA 2011.]
[Figure: Correlation Between Avg. School Gr3 and Gr6 Numeracy Score (WC), U-ANA 2011.]
[Figure: ANA Language by grade/quintile (KZN).]
[Figure: Race Distribution by Quintile (KZN), U-ANA 2011. Categories: Black, Coloured, White, Indian, Asian, Other.]
Correlation 0.82
Correlation 0.51
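The school categorisation used in the KZN chart (dysfunctional below 30%, up to excellent at 70%+) is easy to reproduce once you have average scores per school. A minimal sketch, assuming each band includes its lower bound (so exactly 30% counts as "underperforming"); the school names and scores are made up.

```python
from bisect import bisect_right
from collections import Counter

# Band cut-offs (%) and labels taken from the categorisation chart
CUTOFFS = [30, 40, 50, 60, 70]
LABELS = ["Dysfunctional", "Underperforming", "Poor",
          "Good", "Great", "Excellent"]

def categorise(avg_score):
    """Map an average school score (%) to its category label."""
    return LABELS[bisect_right(CUTOFFS, avg_score)]

# Hypothetical average numeracy and literacy scores per school
school_scores = {
    "School A": 24.5,
    "School B": 38.0,
    "School C": 55.2,
    "School D": 71.0,
    "School E": 30.0,
}

categories = {name: categorise(s) for name, s in school_scores.items()}
counts = Counter(categories.values())
for label in LABELS:
    print(f"{label}: {counts[label]}")
```

Grouping the resulting counts by district would give the kind of table that sits behind the chart.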
EMIS
What:
• Education Management Information System
• Administrative data on enrolments, staff, schools etc.
• Collected by DBE (http://www.education.gov.za/EMIS/tabid/57/Default.aspx)
When and Who:
• Various
Examples of how can we use it?
• Analyse flow-through
• Create indicators for dashboards
– PTR, school size, LOLT etc.
• Provide an up-to-date and accurate picture of elements of the education system
• Planning
EMIS – see Taylor (2012)
[Figure: The ratio of grade 2 enrolments ten years prior to matric to matric passes by province.]
[Figure: grade 10 and Grade 12 enrolments by year, 1994 to 2011. Y-axis: number of learners, 0 to 1200000.]
“In 1999 and 2000 the numbers enrolling in grade 1 dropped substantially, by about half a million. Crucially, it is these cohorts who make up the bulk of the matric class of 2011. This was due to a change in the policy stipulating age of entry into grade 1. According to Notice 2433 of 1998, it was stipulated that children should only be allowed to enrol in grade 1 if they turned seven in that calendar year. Therefore children who previously might have entered in the year in which they turned six were now not allowed to. The policy change was announced in October 1998 and schools were expected to comply by January 2000. This would explain why grade 1 enrolments declined somewhat in 1999 and then again even more so in 2000. The reason why numbers declined as the policy was phased in is that some children who turned 7 in the 2000 calendar year had already entered in the previous year under the previous policy.”
- Taylor 2012
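A flow-through indicator like the one in the EMIS chart (matric passes relative to grade 2 enrolment ten years earlier) is just a ratio of two series. A minimal sketch: the numbers below are invented for illustration and are NOT actual EMIS figures, and I am assuming the ratio is expressed as passes per grade 2 learner.

```python
def flow_through_ratio(grade2_enrolment, matric_passes, matric_year, lag=10):
    """Matric passes in matric_year per grade 2 learner enrolled `lag` years earlier."""
    base = grade2_enrolment[matric_year - lag]
    return matric_passes[matric_year] / base

# Hypothetical figures, keyed by year
grade2_enrolment = {2001: 1_000_000}
matric_passes = {2011: 350_000}

ratio = flow_through_ratio(grade2_enrolment, matric_passes, matric_year=2011)
print(f"{ratio:.0%}")  # 35%
```

As the quote above shows, a policy change that shrinks one cohort will move this ratio, so it is worth checking enrolment trends before reading too much into a single year.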
Matric
What:
• Grade 12 examinations results
• Performance data
• Collected by DBE
When and Who:
• Various
Examples of how can we use it?
• Analyse subject choices/combinations
• Create indicators for dashboards
– % taking maths/science
– Proportion of Gr 8’s passing matric
• Relatively trustworthy and regular indication of student outcomes in SA.
• Planning
EMIS – see Taylor (2012)
[Figure: Matric 2008 (Gr 10 2006), Matric 2009 (Gr 10 2007), Matric 2010 (Gr 10 2008) and Matric 2011 (Gr 10 2009). Series on the number-of-students axis (0 to 1200000): Grade 10 (2 years earlier); Grade 12; those who pass matric; pass matric with maths. Series on the proportion-of-matrics axis (0% to 60%): proportion of matrics taking mathematics.]
Household Surveys
What:
• Surveys of households rather than schools, recording educational attainment alongside outcomes like employment and health
When and Who:
• Various
Examples of how can we use it?
• Research
• Link education to other social outcomes like employment and health
HH-Surveys – see Taylor (2012)
[Figure: Percentage of youth in employment by highest educational attainment, 1995 to 2011 (Van Broekhuizen, 2013). Y-axis: employment/LFA rate for 18-24-year-olds. Series: working-age population; all youth; youth with less than matric; youth with matric; youth with diploma; youth with degree.]
[Figure: Composition of 18-24-year-olds by highest level of education completed, 1995 to 2011 (Van Broekhuizen, 2013). Y-axis: proportion of youth with qualification. Series: with less than matric; with matric; with diploma; with degree.]
Some other research… (Discuss if time permits)
Context: low and unequal learner performance
[Figure: Kernel Density of Literacy Score by Race (KZN), U-ANA 2011. X-axis: literacy score (%). Groups: Black, White, Indian, Asian.]
[Figure: kernel density of learner reading score by wealth quartile (poorest 25%, second poorest 25%, second wealthiest 25%, wealthiest 25%).]
[Figure: kernel density of reading test score, African language schools vs English/Afrikaans schools.]
[Figure: kernel density of numeracy score 2008, ex-DET/homelands schools vs historically white schools.]
[Figure: Kernel Density of School Literacy by Quintile, U-ANA 2011. X-axis: average school literacy score. Groups: Quintile 1 to Quintile 5.]
PIRLS / TIMSS / SACMEQ / NSES / ANA / Matric… by Wealth / Language / Location / Dept…
Comparing WCED Systemic Evaluation and DBE ANA WC 2011
Quantifying learning deficits in Gr3
• Following Muralidharan & Zieleniak (2013) we classify students as performing at the grade-appropriate level if they obtain a mean score of 50% or higher on the full set of Grade 3 level questions.
[Figure 1: Kernel density of mean Grade 3 performance on Grade 3 level items by quintiles of student socioeconomic status (Systemic Evaluation 2007). X-axis: Systemic 2007 Grade 3 mean score (%) on Grade 3 level items. Groups: Quintile 5 and Quintile 1-4. The grade-3-appropriate level is marked on the chart, with annotations 51%, 11% and 16%.]
Only the top 16% of grade 3 students are performing at a Grade 3 level
(Spaull & Viljoen, 2014)
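The classification rule above (grade-appropriate if the mean score on Grade 3 level items is 50% or higher) can be sketched as follows. The marks are invented, the function names are hypothetical, and a real analysis of the Systemic Evaluation data would also apply sampling weights, which this sketch leaves out.

```python
def mean_score(item_marks):
    """Proportion of Grade 3 level items answered correctly (marks are 0 or 1)."""
    return sum(item_marks) / len(item_marks)

def share_at_grade_level(students, threshold=0.5):
    """Share of students whose mean score is at or above the threshold."""
    at_level = [s for s in students if mean_score(s) >= threshold]
    return len(at_level) / len(students)

# Hypothetical marks for five students on four Grade 3 level items
students = [
    [1, 1, 1, 0],  # 75%
    [1, 0, 0, 0],  # 25%
    [1, 1, 0, 0],  # 50%, counts as grade-appropriate ("50% or higher")
    [0, 0, 0, 0],  # 0%
    [0, 1, 0, 0],  # 25%
]

print(share_at_grade_level(students))  # 0.4
```

Running the same function separately for Quintile 5 and Quintile 1-4 students would give the kind of comparison shown in Figure 1.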
NSES question 42
NSES followed about 15000 students (266 schools) and tested them in Grade 3 (2007), Grade 4 (2008) and Grade 5 (2009).
Grade 3 maths curriculum: “Can perform calculations using appropriate symbols to solve problems involving: division of at least 2-digit by 1-digit numbers”
Question 42: share of students by quintile (Q1, Q2, Q3, Q4, Q5)
• Correct in Gr3: 16%, 19%, 17%, 17%, 39%
• Correct in Gr4: 13%, 10%, 12%, 12%, 14%
• Correct in Gr5: 13%, 14%, 14%, 15%, 13%
• Still wrong in Gr5: 59%, 57%, 57%, 55%, 35%
Even at the end of Grade 5 most (55%+) quintile 1-4 students cannot answer this simple Grade-3-level problem.
“The powerful notions of ratio, rate and proportion are built upon the simpler concepts of whole number, multiplication and division, fraction and rational number, and are themselves the precursors to the development of yet more complex concepts such as triangle similarity, trigonometry, gradient and calculus” (Taylor & Reddi, 2013: 194)
(Spaull & Viljoen, 2014)
Insurmountable learning deficits: 0.3 SD
[Figure: South African Learning Trajectories by National Socioeconomic Quintiles, based on NSES (2007/8/9) for grades 3, 4 and 5, SACMEQ (2007) for grade 6 and TIMSS (2011) for grade 9. X-axis: actual grade (and data source), Gr3 to Gr12, with projections for the grades without data. Y-axis: effective grade (0 to 13). Series: Quintile 1 to Quintile 5, Q1-4 trajectory, Q5 trajectory.]
(Spaull & Viljoen, 2014)
Data and analysis
• Answering research questions and engaging with the data requires some level of analytic proficiency with a statistical software package like Stata or SPSS (or R if you are hardcore)
• Education faculties in South Africa really need to up their game as far as quantitative analysis is concerned. For whatever reason there seems to be an anti-empirical, anti-quantitative bias across the board. This filters through into course-load priorities and expectations (or lack of expectations) on graduate students.
• Without an ability to interact with a large data set and do BASIC data analysis, any graduate student’s research opportunities are severely (and unnecessarily) limited (the same applies to faculty members)
• SALDRU (UCT) runs a free online Stata course to teach the basics of data analysis: http://www.saldru.uct.ac.za/training/online-stata-course
– There is also a two-week “UCT Summer Training Programme in Social Science Research Using Survey Data” run in January every year and well worth going to if you already have a basic background in statistics
Conclusion
• Data is essential for making informed decisions
• Using these data sets requires some level of analytic proficiency. Basic proficiency can take as little as 4 months but is infinitely valuable.
• Nationally representative datasets allow us to draw conclusions for each province and the whole country, something that is not possible from small local studies.
• DBE has access to a wealth of useful but under-utilized data
– ANA, EMIS, MATRIC, HH-SURVEYS (also PERSAL & SYSTEMIC)
• Many datasets are publicly available on request
– SACMEQ, TIMSS, PIRLS (SACMEQ 2013 soon to be available)
• “Without data you are just another person with an opinion” – Andreas Schleicher
References and useful websites
• Fleisch, B. (2008). Primary Education in Crisis: Why South African Schoolchildren underachieve in reading and mathematics (pp. 1–162). Cape Town: Juta & Co.
• Greaney, V., & Kellaghan, T. (2008). Assessing national achievement levels in education (Vol. 1). World Bank Publications.
• Reddy, V., Prinsloo, C., Visser, M., Arends, F., Winnaar, L., & Rogers, S. (2012). Highlights from TIMSS 2011: The South African perspective. Pretoria.
• Ross, K. N., Dolata, S., Ikeda, M., Zuze, L., & Murimba, S. (2005). The Conduct of the SACMEQ II Project in Kenya. Harare.
• Taylor, N., Van der Berg, S., & Mabogoane, T. (2013). What makes schools effective? Report of the National School Effectiveness Study. Cape Town: Pearson.
• Taylor, S., & Yu, D. (2009). The importance of socioeconomic status in determining educational achievement in South Africa (No. 1). Stellenbosch.
• Van der Berg, S., Burger, C., Burger, R., De Vos, M., Du Rand, G., Gustafsson, M., … Von Fintel, D. (2011). Low quality education as a poverty trap. Stellenbosch.
• http://www.sacmeq.org/
• www.oecd.org/pisa
• http://timssandpirls.bc.edu
Group exercise
• Last 45 minutes
– Split into groups of 5 (8 groups)
– Using questionnaires provided, come up with at least 5 research questions that could (potentially) be answered using that data
1. Explain which variables you would use and how (what would the graph/table look like or be populated with? Sketch the axes)
2. Why did you choose those research questions?
3. Which other large-scale data do you think you could look at to further investigate the issue?
Thank you
www.nicspaull.com/research | [email protected]
@NicSpaull
Difference between TIMSS & PISA