This article was downloaded by: [McGill University Library]
On: 30 October 2014, At: 13:21
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954. Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

New Zealand Economic Papers
Publication details, including instructions for authors and subscription information: http://www.tandfonline.com/loi/rnzp20

Is activity in online quizzes correlated with higher exam marks?
Paul McKeown (a) & Gillis Maclean (b)
(a) University of Canterbury, New Zealand; (b) Lincoln University, New Zealand
Published online: 23 Aug 2012.

To cite this article: Paul McKeown & Gillis Maclean (2013) Is activity in online quizzes correlated with higher exam marks?, New Zealand Economic Papers, 47:3, 276-287, DOI: 10.1080/00779954.2012.715826

To link to this article: http://dx.doi.org/10.1080/00779954.2012.715826

PLEASE SCROLL DOWN FOR ARTICLE

Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.

This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions



Is activity in online quizzes correlated with higher exam marks?

Paul McKeown (a)* and Gillis Maclean (b)

(a) University of Canterbury, New Zealand; (b) Lincoln University, New Zealand

(Received 25 November 2011; final version received 21 July 2012)

Online quizzes are widely used as formative learning exercises. An important innovation is that online activity log data allow direct measurement of student activity in quizzes, such as time taken and number of attempts. It has previously been shown that participation in quizzes increases dramatically when grades are awarded as incentives. Data from six semesters in an introductory economics course show a significant positive correlation between total quiz activity and final exam marks for a given first test mark. This has implications for the prediction of exam aegrotats. While the results do not necessarily imply a causal link between quiz activity and exam marks, they do provide some evidence to indicate such a link.

Keywords: online learning; online quizzes; formative assessment; student effort; economics education

1. Introduction

Innovations in online teaching and learning have been readily taken up by tertiary education institutions. Learning management systems (such as Moodle) allow much more flexibility for students, who can access resources at any time from any location. The benefits of such platforms include better communication with students and automation of some class management and assessment, all at minimal marginal cost. The introduction of digital media is causing a paradigm shift in how we teach. One such innovation is the ability to collect and analyse online activity data, and this should lead to a better understanding of online learning.

Online quizzes are widely used for both formative and summative assessment. This paper considers the use of online quizzes as formative assessment, that is, as an aid to self-regulated learning. Students report favourably on their use of online quizzes. Swan (2004) found that 94% of physics students surveyed thought that online quizzes helped them learn. Our own online surveys indicate around 90% of economics students thought that the online quizzes helped them learn. When asked, ‘how important do you rate the online quizzes for your learning in the subject?’, the modal response is ‘very important’, the highest level of importance in the options given.

However, students favouring quizzes does not necessarily mean those students make full and effective use of quizzes, or that using quizzes improves learning outcomes. McKeown and Maclean (2010a) show that awarding grade-points as an incentive increases students’ use of online quizzes. As economists we realise students

*Corresponding author. Email: [email protected]

New Zealand Economic Papers, 2013
Vol. 47, No. 3, 276–287, http://dx.doi.org/10.1080/00779954.2012.715826

© 2013 New Zealand Association of Economists Incorporated


respond to incentives, and therefore it is important to know how our incentives (in the form of course marks) influence student study habits and, subsequently, student learning outcomes. This paper therefore asks the question, ‘does increased activity in online quizzes improve student learning outcomes?’ It is hard to conclusively show a causal link, so we also consider a more practical question that can be answered using available historical data. That is, ‘are exam marks correlated with activity in online quizzes?’ To help answer these questions we look at the relationship between student activity in online quizzes and exam performance, in a large introductory economics paper (Econ101/110) over six semesters.

While there is little research that considers usage data for online quizzes in economics, there is some research in other fields. Brothen and Wambach (2001) looked at different strategies for quiz usage by psychology students and present some evidence in favour of a ‘prepare-gather, feedback-restudy’ approach versus a ‘quiz to learn’ approach. Their work is more focused on the way students use quizzes than on whether or not increased usage leads to better exam scores. From their limited sample of 29 students, they conclude that the number of quiz attempts and the average time per quiz attempt are both negatively correlated with exam marks. Our results show a positive correlation between the number of quiz attempts and exam marks. This is more in line with Grimstad and Grabe (2004), who present evidence showing a positive correlation between the number of practice questions completed and test scores. However, they use voluntary quizzes, so it may be that students who choose to do the quizzes are more able or more engaged. Our research covers the entire class and controls for student ability, and hence avoids this issue.

A further issue we consider is whether online quiz activity is suitable for inclusion in aegrotat calculations. In all New Zealand universities (and many universities elsewhere), consideration of aegrotats is standard practice when circumstances such as illness or accident prevent a student sitting an exam, or impair their performance during an exam. If the aegrotat application is accepted as genuine, the examiner is asked to estimate the likely result had the student been able to complete the assessment. Given the widespread use of aegrotats, and the lack of attention given to aegrotats in the literature, we have structured our analysis so that the results have a direct practical application to aegrotat prediction.

Data from outside the course may not be appropriate (or allowed) when calculating aegrotats. For example, it would be hard to justify the use of a student’s sex, country of origin, or GPA in other courses when calculating an exam aegrotat, even if they do improve the prediction. If there is a significant correlation between quiz activity and final exam marks, then information on students’ quiz attempts will help improve the accuracy of final exam aegrotats by adding to the pool of data that can be used to make the prediction.

2. Background to online quizzes

Econ101/110 (Principles of Economics) is Lincoln University’s entry-level economics course (with no pre-requisites). Econ101 and 110 are the same course with a different course code, and both will be referred to hereafter as Econ101 for the sake of simplicity. Econ101 is a typical introductory economics course, offered in two semesters each year. It is part of the compulsory core of several degree programmes with a variety of different majors, and includes students from many different


countries with a wide range of backgrounds, of whom only a fraction continue on to further study in economics.

As well as two tests and the exam, we wanted an extra assessment format that created frequent engagement by students, provided rapid feedback to students and teachers, and carried a grade as an incentive for students to participate. In 2008 semester one we trialled quizzes without awarding grade-points, to ensure the system worked reliably; in particular, that after data transfer from Moodle to our student records, it uniquely identified each student and recorded marks correctly (which was not a simple process). In 2008 semester two we began awarding grade-points for quiz marks.

Each semester lasts 12 weeks and there are 12 weekly quizzes. The course is divided into three modules. The first four weekly quizzes cover the material in module one, which is then assessed in test one; quizzes five to eight relate to module two, which is assessed in test two; quizzes nine to twelve relate to module three, which is assessed in the final exam. Modules one and two are also assessed in the final exam.

Each quiz consists of 10 questions. Most questions are multi-choice but, from 2009 semester one, a number of quizzes also include calculation questions. Quiz questions are randomly selected from a bank of around 30 questions for each quiz. The questions were sourced from our test and exam multiple-choice question banks, so they are directly relevant as practice for the tests and exam. Students can make unlimited attempts at any quiz while it is open. Students keep their highest mark for each quiz; this provides an incentive to continue practising, as there is no risk of losing grade-points in further attempts.
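The keep-the-highest-mark rule described above can be sketched in a few lines (an illustration only; the tuple layout is a hypothetical stand-in for Moodle's actual log schema):

```python
# Illustrative only: compute each student's recorded mark per quiz when
# unlimited attempts are allowed and the highest mark is kept.
# The (student, quiz, mark) tuples below are hypothetical sample data.
from collections import defaultdict

def best_marks(attempts):
    """Map (student, quiz) -> highest mark achieved over all attempts."""
    best = defaultdict(int)
    for student, quiz, mark in attempts:
        key = (student, quiz)
        best[key] = max(best[key], mark)
    return dict(best)

attempts = [
    ("s1", 5, 4), ("s1", 5, 7), ("s1", 5, 6),  # three tries at quiz 5
    ("s2", 5, 9),
]
print(best_marks(attempts))  # s1 keeps 7, the highest of the three attempts
```

Because repeating a quiz can never lower the recorded mark, the rule removes any downside to further attempts.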

Each week of the semester one new quiz opens in sequence. The quizzes for module one are closed just after test one, and reopened after lectures end, before the final exam. This is repeated for modules two and three (with quizzes for module three staying open until just after the exam).

Calculation questions were introduced to reduce the ability of students to gather marks simply by repeating quizzes rapidly and using the process of elimination to find answers to multi-choice questions. Students must calculate from values randomly generated each time, so they have to learn how to calculate a correct answer. This leads to more effective use of quizzes by encouraging the ‘prepare-gather, feedback-restudy’ method highlighted by Brothen and Wambach (2001).

For convenience we denote semesters by Year.Semester; e.g. 2008.1 denotes 2008 semester one and 2008.2 refers to 2008 semester two. Figure 1 shows the participation rates for quizzes one to twelve for each semester from 2008.1 to 2010.2. Participation is defined as the percentage of the class attempting a given quiz at least once. In 2008.1, quizzes carried no grade and were effectively voluntary. From 2008.2 onwards, grade-points were awarded for quizzes and this clearly increased participation. With grade-points, participation was much more consistent, with smaller spikes and less drop-off over the semester. The pattern is similar for the total number of attempts at quizzes and the total time spent on quizzes (not shown).
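The participation measure defined above can be computed directly from an attempt log; a minimal sketch with hypothetical data:

```python
# Illustrative sketch: participation for a quiz is the percentage of the
# enrolled class making at least one attempt. The sample log is hypothetical.
def participation_rate(attempt_log, quiz, class_size):
    """Percentage of the class attempting `quiz` at least once."""
    attempted = {student for student, q in attempt_log if q == quiz}
    return 100.0 * len(attempted) / class_size

log = [("s1", 1), ("s1", 1), ("s2", 1), ("s3", 2)]  # (student, quiz) pairs
print(participation_rate(log, quiz=1, class_size=4))  # 2 of 4 students -> 50.0
```

Note that repeat attempts by the same student (s1 above) do not raise participation; they show up instead in the total-attempts and total-time measures.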

Figure 2 compares the average number of quiz attempts per student on each day throughout the semester in 2008.1 and 2008.2. The attempts are adjusted for class size by dividing the total attempts by the number of students in the class. In 2008.1, when grade-points were not awarded, most quiz attempts were made in the few days preceding each test. The large peaks in semester one coincide with test days. In semester two, when grade-points were awarded for quizzes, the number of quiz attempts increased significantly. The distribution of quiz attempts over the semester


was more even in semester two but still had spikes around the test dates. It should be noted that the quizzes were closed a couple of days after each test, and there are secondary spikes after each test as students rush to attempt quizzes before closing.

The introduction of an incentive to complete quizzes clearly increased students’ participation in quizzes. The question that now needs to be answered is, ‘does increased activity in quizzes lead to better exam marks (or, at least, is it correlated with increased exam marks)?’

Figure 1. Proportion of class attempting each quiz.

Figure 2. Daily attempts per student.


3. Method

One intended purpose of online quizzes is better learning outcomes as measured by summative assessments, in particular the exam mark, which is 50% of the final grade. We define the effectiveness of the quizzes as a measurable increase in the final exam mark where students with a given test one mark make more use of the quizzes. It is important to note that the marks gained in quizzes are not used in this model. With unlimited attempts and only the highest mark recorded for each quiz, quiz marks are very high: the mean is 82 over all quizzes and 26% of all students score the maximum 100. In this context the quiz mark is less informative than the underlying quiz activity, which is an interesting result.

The model in abstract form relates performance (as measured by the final exam mark) to initial ability, quiz activity and ambition, that is:

Econ101 Performance = f(Initial Ability, Quiz Activity, Ambition)

Peter Kennedy pointed out in his keynote address to the 15th ATEC conference in Hamilton that, from his experience as the editor of the statistics section of the Journal of Economic Education, the only reliable predictor of student success seems to be their initial ability. Controlling for this variable is important in any analysis of student performance. Cameron and Lim (2010) used the Test of Economic Literacy (TEL) to control for incoming student ability, and others have used high school marks. First year university marks have been used by Hickson (2010a) to help predict marks in second year, and Hickson (2010b) uses a non-economics GPA for first year students taking an economics course as a control for ability. This paper uses a more direct control for student ability in economics, namely students’ marks in their first test in Econ101.

The first test measures achievement at the end of the first four weeks, and the final exam measures achievement at the end of the 12-week semester. Test one is a benchmark against which further progress over the remaining weeks can be measured. By dividing the semester at week five, we can use the test one mark as a proxy for the student’s initial ability at week five and consider quiz activity over the remaining eight weeks. Since effort on quizzes one to four will influence the test one mark, we must exclude quizzes one to four from the data set, leaving quizzes five to twelve only. The final exam mark is then a function of the initial ability as measured by test one and the study after test one.

It should be noted that the quiz data are strictly measures of student activity – time taken and attempts made – and do not allow discussion in terms such as effort and ambition, which we cannot observe. When two students work for the same number of minutes, we cannot tell whether one exerts more effort. Nor can we observe what study methods are being used during quiz activity. Direct measures of ambition, such as a student’s target course grade, will influence effort but are not admissible for aegrotat calculations. Therefore, we leave ambition out of this analysis but suggest it would be an interesting area for investigation in the more general context of explaining student learning. So, the model becomes:

Final Exam Mark = f(Test1 Mark, Activity on Quizzes 5–12)

From the data, two alternative measures of activity can be looked at: the total time spent working on the quizzes and the total number of attempts made.


A third specification, which is a disaggregated version of the first measure (because multiplying the number of attempts by the average time per attempt gives the total time spent), is also considered.

The four main models estimated for each semester were:

Final_i = b0 + b1 Test1_i + e_i                              (Baseline model) (1)

Final_i = b0 + b1 Test1_i + b2 TotMin_i + e_i                                 (2)

Final_i = b0 + b1 Test1_i + b2 CoS_i + e_i                                    (3)

Final_i = b0 + b1 Test1_i + b2 CoS_i + b3 AvgTime_i + e_i                     (4)

where:

b0 = intercept,
Test1_i = Test 1 mark for student i,
TotMin_i = total minutes spent on quizzes 5–12 by student i,
CoS_i = count of starts (i.e. number of attempts) for student i on quizzes 5–12,
AvgTime_i = average time per attempt for student i.
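The specifications are ordinary least squares regressions. As a self-contained illustration of fitting model (3) – not the authors' actual estimation code, and with invented data – the coefficients can be obtained from the normal equations:

```python
# Minimal OLS via the normal equations (X'X)b = X'y, solved by Gaussian
# elimination: a sketch of fitting model (3), Final = b0 + b1*Test1 + b2*CoS.
# The data below are synthetic (y is exactly 5 + 0.8*Test1 + 0.1*CoS),
# so the solver should recover those coefficients.

def ols(X, y):
    """Return the coefficient vector solving the normal equations for X, y."""
    k = len(X[0])
    # Build X'X and X'y.
    A = [[sum(X[i][p] * X[i][q] for i in range(len(X))) for q in range(k)]
         for p in range(k)]
    b = [sum(X[i][p] * y[i] for i in range(len(X))) for p in range(k)]
    # Gaussian elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, k))) / A[r][r]
    return coef

# Synthetic students: columns are (intercept, Test1, CoS); y is the exam mark.
X = [[1, t, c] for t, c in [(40, 10), (55, 25), (60, 40), (70, 15), (80, 50)]]
y = [38, 51.5, 57, 62.5, 74]
print(ols(X, y))  # ~ [5.0, 0.8, 0.1], since y is exactly linear in the regressors
```

In practice one would of course use a statistical package (the paper's 'gvlma' diagnostics suggest the authors worked in R); the hand-rolled solver here is only to keep the sketch dependency-free.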

For the combined, multi-semester data sets the model is estimated with dummy variables to capture any variation across semesters.

The average time alone is not expected to give a good indication of total activity. For example, a student who took 60 minutes for a single attempt has a much higher average time than a student who made 50 attempts at an average of 10 minutes per attempt, even though the second has far greater total activity.
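The worked example above, in numbers: average time per attempt can rank two students in the opposite order to their total activity.

```python
# The two hypothetical students from the text: one 60-minute attempt versus
# fifty attempts averaging 10 minutes each.
a_times = [60]          # student A: one 60-minute attempt
b_times = [10] * 50     # student B: fifty 10-minute attempts

avg_a, avg_b = sum(a_times) / len(a_times), sum(b_times) / len(b_times)
tot_a, tot_b = sum(a_times), sum(b_times)
assert avg_a > avg_b and tot_b > tot_a  # higher average, far lower total
print(avg_a, tot_a, avg_b, tot_b)  # 60.0 60 10.0 500
```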

Total time spent would seem to be a natural figure to use. However, activity logs record the time elapsed between the student opening and submitting the quiz. There is no record of the time the student is actually engaged in the quiz; for example, the student may leave the quiz open while they go for coffee. The measured total time a quiz attempt was open can be thought of as an upper limit on the time actually spent working on the quiz. The number of attempts is a simpler measure of activity and may provide a more reliable estimate of quiz study. We test both measures, as their usefulness depends on their ability to provide statistically significant information.

To see if the activity variables provide any improvement over ability alone, the results for models (2) to (4) are compared to the baseline model (1), which simply regresses the final exam mark on the test one mark. The significance of the activity variable coefficients is checked, and the adjusted R-squared values are compared with the baseline model to see if there is an increase in explanatory power.
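The adjusted R-squared comparison uses the standard penalised formula, adj R2 = 1 - (1 - R2)(n - 1)/(n - k - 1), where n is the sample size and k the number of regressors excluding the intercept. A quick check using the R-squared values reported for the combined 2008.2-onwards sample (models (1) and (3) in Table 2, n = 1000; k = 5 for model (1) with Test1 plus four semester dummies, k = 6 for model (3) with CoS added) reproduces the reported adjusted figures:

```python
# Adjusted R-squared penalises added regressors:
#   adj_R2 = 1 - (1 - R2) * (n - 1) / (n - k - 1)
# where n = sample size, k = number of regressors excluding the intercept.
def adjusted_r2(r2, n, k):
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)

baseline = adjusted_r2(0.675, n=1000, k=5)  # model (1): Test1 + 4 dummies
with_cos = adjusted_r2(0.695, n=1000, k=6)  # model (3): adds CoS
print(round(baseline, 3), round(with_cos, 3))  # -> 0.673 0.693
```

The penalty means a higher adjusted R-squared for model (3) is evidence that CoS adds genuine explanatory power rather than merely absorbing noise.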

4. Data

4.1. Data overview

For each semester our raw data consist of a table of course marks and a table containing data on quiz attempts. For each quiz attempt the basic data of interest are the start time, finish time, and mark out of 10. Because students can repeat quizzes


there can be more than one attempt at each quiz by each student, and the mark awarded is the maximum mark over all attempts at that quiz.

4.2. Data availability and issues

The innovation presented here is the use of data on actual student behaviour rather than surveys of student study habits. This has two advantages. Self-reports of study activity may have questionable validity and cannot be independently verified, whereas online activity logs measure activity directly. Further, as only a subset of the class responds to surveys, survey results cannot be treated as unbiased, whereas quiz data are collected for all students.

Given demographic variables cannot be used for aegrotat prediction, we have left out such variables in order to present results that are directly applicable to aegrotat predictions. Limited demographic data were available for the period covered, but we did do some preliminary work that included a dummy variable for sex. This analysis did not provide any evidence of a difference between males and females.

Table 1 summarises the differences in the data over the six semesters covered. The ‘Grade’ column is the marks awarded for all quizzes, which is 10% of the overall grade from 2008.2 onwards. ‘Inc. Calc’d Q’s’ refers to the inclusion of calculated questions (where students must provide a numerical answer) as well as multi-choice questions in the quizzes (not all quizzes include calculated questions).

Where quiz attempts had no time limit, any quiz attempt that took longer than 180 minutes had the time taken truncated to 180 minutes. Attempts that took longer than 180 minutes could not simply be dropped, as they could have been the attempts where the student attained their highest mark. Hence, a compromise was made and 180 minutes was used, to be in line with the maximum time available when attempt time was limited. To make the data more amenable, a time limit was introduced in 2010.1; thereafter, any quiz attempt kept open beyond the 180-minute limit is voided, with a mark of zero and a null value for time spent.
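The two handling rules described above can be sketched as follows (illustrative only; the function and field names are ours, not Moodle's):

```python
# Sketch of the two time-handling rules: before 2010.1, over-long attempt
# durations are truncated at 180 minutes; from 2010.1, a hard limit voids
# the attempt (mark of zero, no recorded time).
LIMIT_MIN = 180

def clean_attempt(minutes, mark, hard_limit=False):
    """Return (minutes, mark) after applying the relevant time rule."""
    if minutes <= LIMIT_MIN:
        return minutes, mark
    if hard_limit:                # 2010.1 onwards: attempt is voided
        return None, 0
    return LIMIT_MIN, mark        # earlier semesters: truncate the time

print(clean_attempt(240, 8))                   # -> (180, 8)
print(clean_attempt(240, 8, hard_limit=True))  # -> (None, 0)
```

The truncation branch preserves the mark (which may be the student's highest) while capping the time, which is why the TotMin measure for pre-2010 semesters is an upper bound rather than an exact duration.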

Given the variation in setup and data over the full sample period, regressions are run for each individual semester and then for semesters 2008.2 to 2010.2 as a group (referred to as 2008.2 onwards), semesters 2009.1 to 2010.2 as a group (referred to as 2009–2010), and semesters 2010.1 to 2010.2 as a group (referred to as Both2010Sems).

McKeown and Maclean (2010b) show how inter-semester data can be used to improve final exam aegrotats. In terms of evaluating the use of quiz data for calculating aegrotats, it is possible that the multi-semester models give better results, and therefore it is important to investigate the combined semesters.

Table 1. Summary of quiz structure.

Semester   Grade   Time limit   Inc. Calc’d Q’s   Other issues
2008.1     0       –            N
2008.2     10%     –            N
2009.1     10%     –            Y
2009.2     10%     –            Y
2010.1     10%     180 min      Y
2010.2     10%     180 min      Y                 M7.1 Earthquake!


5. Results

Because of the large number of regressions run, presenting full results would be cumbersome; full results are reported for only a subset of regressions (Table 2). Results for other periods were very similar and have been left out to save space. Full results are available from the authors. Highlights, summaries, and comments follow.

The procedure was:

(1) Test the validity of the linear models.
(2) Use the adjusted R2 values to see if the introduction of quiz variables helps.
(3) Look at the statistical significance of coefficients.
(4) Evaluate the magnitude of the effect of quiz activity.

5.1. Model specification tests using ‘gvlma’

First, the specifications are tested using the global test statistic introduced by Pena and Slate (2006) and provided in the ‘gvlma’ package for R (see Pena & Slate, 2010). This is a global test that accounts for interactions between model violations and thus provides a better test than the usual individual tests.

The ‘gvlma’ tests indicate that there was no evidence to reject the null hypothesis (that the model was specified correctly) at the 5% level of significance (except for 2009.1, which was only accepted at the 10% level).

Table 2 presents regression results for eight models. Models (1) to (4) are the four main models outlined in the method section. Models (5) to (8) are obviously not effective but are included for comparison and context. Model (5) uses Test1, CoS and Sex as explanatory variables and highlights the general result that Sex is not statistically significant. Model (6) uses just TotMin as an explanatory variable and Model (7) uses only CoS; both formulations obviously perform poorly. Model (8) uses Test1 and AvgTime as explanatory variables, and AvgTime does not provide significant explanatory power.

5.2. Explanatory power – adjusted R2

Adjusted R2 figures for the regressions are in line with the ‘gvlma’ tests. That is, the count of starts or total minutes alone do not explain much of the variation in exam marks, but the other models give a reasonable explanation of the exam marks.

Models (2), (3) and (4) have higher adjusted R2 than the base formulation (1) in all periods covered except 2008.1. Given the models fail to meet linear model assumptions in 2008.1, this is not surprising. In general, TotMin and CoS provide similar results, with CoS being slightly better in a number of periods. The addition of average time does not improve the adjusted R2. Given the total time variable was affected by truncation in semesters up to 2009.2, it is good to see that it did not perform too badly when compared with the count of starts variable. In general, the total time spent variable is more likely to have a high number of extreme values than the count of starts variable. Some attempts open for a long time are likely to overstate the time actually engaged in the quiz.

The multi-semester models do not seem to provide any improvement in the adjusted R2 figures, and this indicates that the model did not explain all the differences across semesters. For example, the 2010.2 semester was affected by a large earthquake in September 2010. The earthquake closed our university for a week and a half in the middle of the semester, and this forced a reorganisation of the remaining


Table 2. Model results for 2008.2 onwards.

                        (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
(Intercept)           5.616***   3.054*     3.157*     5.043***   3.520*    48.521***  49.271***   8.305***
                     (1.363)    (1.365)    (1.355)    (1.523)    (1.458)    (1.549)    (1.519)    (1.475)
Test1                 0.880***   0.869***   0.872***   0.849***   0.865***                          0.851***
                     (0.020)    (0.020)    (0.019)    (0.020)    (0.020)                           (0.020)
Dummy:2009S1          0.171     –1.584     –1.098     –1.107     –1.219      4.109      5.027*      0.031
                     (1.234)    (1.220)    (1.206)    (1.202)    (1.207)    (2.095)    (2.080)     (1.217)
Dummy:2009S2         –2.258*    –3.861***  –2.755*    –2.708*    –2.818*    –2.254     –0.542      –1.994
                     (1.144)    (1.130)    (1.110)    (1.128)    (1.110)    (1.955)    (1.928)     (1.148)
Dummy:2010S1          3.847**    4.581***   2.009      2.174      1.821     11.658***   8.193***    3.297**
                     (1.234)    (1.203)    (1.217)    (1.228)    (1.220)    (2.063)    (2.104)     (1.244)
Dummy:2010S2         –0.740      0.012     –2.539*    –2.041     –3.727**    9.475***   6.049**    –0.837
                     (1.187)    (1.157)    (1.171)    (1.195)    (1.281)    (1.969)    (2.010)     (1.208)
TotalMinsSpentAllQ               0.005***                                    0.007***
                                (0.001)                                     (0.001)
CountOfStarts                               0.086***   0.073***   0.094***              0.114***
                                           (0.011)    (0.011)    (0.011)               (0.018)
AverageTimeSpent                                       0.018                                       –0.015
                                                      (0.023)                                      (0.023)
Dummy:Male                                                       –0.419
                                                                 (0.779)
adj. R-squared        0.673      0.692      0.693      0.678      0.689      0.079      0.075       0.663
R-squared             0.675      0.694      0.695      0.680      0.691      0.084      0.080       0.665
sigma                11.988     11.648     11.617     11.401     11.596     20.195     20.244      11.658
F                   412.883    374.478    377.320    292.084    297.325     18.286     17.237     318.812
P                     0.000      0.000      0.000      0.000      0.000      0.000      0.000       0.000
N                  1000       1000       1000        969        938       1003       1003         969

***, ** and * beside coefficients carry the standard meaning; that is, they indicate that the individual coefficients are significant at 1%, 5% and 10% respectively.


teaching time. Thus, data from 2010.1 were not helpful when predicting marks for 2010.2, as students were operating under very different conditions.

We should note here that a subsequent earthquake at the end of semester one in 2011 caused the cancellation of a number of exams. This meant that some examiners had to make aegrotat predictions for all students in their courses, thus highlighting the need for robust aegrotat prediction. Predicting marks for all students in a course is technically more difficult than predicting exam marks for a few students in a given course, and is a subject for further research.

5.3. Coefficient significance

The results for models (2) and (3) indicate that student activity, measured by CoS or TotMin, provides additional information when explaining variation in exam marks, whereas model (4) shows that AvgTime does not have a significant coefficient, indicating that a model with only Test1 and CoS (or TotMin) is sufficient to explain the variation in the final exam mark.

5.4. Coefficient size

Given that the coefficients on CoS and TotMin are statistically significant, it is now important to consider their actual size. That is, are the coefficients big enough to mean that the activity variables have a reasonable influence on marks? For example, suppose the coefficient on CoS were 0.0001 and statistically different from zero; in practical terms this would be irrelevant, as it would require an extra 10,000 quiz attempts to gain one more mark in the exam.

To evaluate the practical significance we consider the difference in the predicted mark between a student at the third quartile (of CoS or TotMin) and a student at the first quartile, where both students have the same test one mark. For example, in 2008.2 the third quartile for TotMin is 930 and the first quartile is 244. So, given the estimated coefficient of 0.008 for TotMin, a student who spent 930 minutes on quizzes would be predicted to score 5.5 exam marks more than a student who spent 244 minutes, ceteris paribus. A similar figure can be calculated for CoS.
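The interquartile comparison above is simple arithmetic; a minimal sketch, using the 2008.2 figures quoted in the text, is:

```python
# Predicted exam-mark gap between a third-quartile and a first-quartile
# student on TotMin, holding the test one mark fixed (2008.2 figures
# quoted in the text).
coef_totmin = 0.008   # estimated coefficient on TotMin (marks per minute)
q3_minutes = 930      # third quartile of total minutes spent on quizzes
q1_minutes = 244      # first quartile

mark_gap = coef_totmin * (q3_minutes - q1_minutes)
print(round(mark_gap, 1))  # 5.5 extra predicted exam marks
```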

The inter-quartile mark range varies from 2.2 to 5.5, indicating that variation in TotMin and CoS has a meaningful effect on predicted exam marks. That is, a few more (or fewer) marks could move an aegrotat prediction between the 5-mark grade bands (GPA points). For example, a student with a large CoS might be awarded a C+ (GPA 3) versus a C (GPA 2) for a student with a small CoS.

If a student is to be awarded an aegrotat, then a point estimate of a percentage mark would be hard to justify given the size of the standard errors in the predictions. We suggest that letter-grade predictions would be more appropriate. A letter grade effectively says that the predicted mark would most likely lie somewhere within that grade band; for example, awarding an aegrotat C– would, at our institution, mean that the student was likely to score a mark somewhere in the 50 to 55% range. Such an aegrotat therefore avoids a point estimate of a student's mark.
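The suggested letter-grade approach amounts to a band lookup on the predicted mark. The sketch below is illustrative only: of the band edges, just the C– band (50–55%) is stated in the text; the other cut-offs, and the function name letter_grade, are hypothetical placeholders consistent with 5-mark bands.

```python
# Illustrative mapping from a predicted mark to a letter-grade aegrotat.
# Only the C- band (50-55%) is given in the text; all other band edges
# here are hypothetical placeholders in 5-mark steps.
def letter_grade(mark):
    bands = [(90, "A+"), (85, "A"), (80, "A-"),
             (75, "B+"), (70, "B"), (65, "B-"),
             (60, "C+"), (55, "C"), (50, "C-")]
    for cutoff, grade in bands:
        if mark >= cutoff:
            return grade
    return "fail"  # below the lowest passing band

print(letter_grade(52.3))  # C-
```

Reporting the band rather than the point estimate acknowledges the size of the prediction standard errors discussed above.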

6. Conclusion

This paper introduces data on student activity now available through online learning management systems. The data are then analysed to provide insight into the effect of

New Zealand Economic Papers 285


student activity on exam marks and the utility of such data in the prediction of exam aegrotats.

We find that both the number of quizzes attempted and the total time spent on quizzes are positively correlated with final exam marks. Aegrotat estimation is based on limited data and can be subject to significant error. The addition of quiz activity data reduces the error involved in aegrotat prediction. Therefore, measures of quiz activity are candidates for inclusion in the models used by examiners to determine aegrotat marks.

The results here show that, for a given set of quizzes and overall assessment structure, students with greater quiz activity tend to have higher exam marks. Unfortunately, it does not follow that offering twice as many quizzes would necessarily increase overall exam marks. As students operate with limited time budgets, at least some of the time put into graded quizzes will be taken from other, ungraded learning activities, and the net effect on learning outcomes could be small – or, in the extreme, even negative.

Showing that an increase in the final exam mark is correlated with an increase in time spent on quizzes, or in the number of attempts at quizzes, does not necessarily imply causation. That is, it may be that students with an ambition for higher marks tend to spend more time on quizzes (especially when quizzes provide guaranteed marks), and therefore it is the goal of higher exam marks that leads to higher quiz usage. But it could be that students who spend more time on quizzes are indeed learning more and hence improving their exam marks. It is not possible to distinguish the direction of causality from the available data, but at least the data are consistent with the hypothesis that higher quiz activity leads to greater learning and higher exam marks.

