Large-Scale Student Assessment: Guidelines for Policymakers



This article was downloaded by: [University of Guelph]
On: 18 November 2014, At: 11:26
Publisher: Routledge
Informa Ltd Registered in England and Wales Registered Number: 1072954
Registered office: Mortimer House, 37-41 Mortimer Street, London W1T 3JH, UK

International Journal of Testing
Publication details, including instructions for authors and subscription information:
http://www.tandfonline.com/loi/hijt20

Large-Scale Student Assessment: Guidelines for Policymakers
Charles Ungerleider
Published online: 13 Nov 2009.

To cite this article: Charles Ungerleider (2003) Large-Scale Student Assessment: Guidelines for Policymakers, International Journal of Testing, 3:2, 119–128, DOI: 10.1207/S15327574IJT0302_2

To link to this article: http://dx.doi.org/10.1207/S15327574IJT0302_2


Taylor & Francis makes every effort to ensure the accuracy of all the information (the “Content”) contained in the publications on our platform. However, Taylor & Francis, our agents, and our licensors make no representations or warranties whatsoever as to the accuracy, completeness, or suitability for any purpose of the Content. Any opinions and views expressed in this publication are the opinions and views of the authors, and are not the views of or endorsed by Taylor & Francis. The accuracy of the Content should not be relied upon and should be independently verified with primary sources of information. Taylor and Francis shall not be liable for any losses, actions, claims, proceedings, demands, costs, expenses, damages, and other liabilities whatsoever or howsoever caused arising directly or indirectly in connection with, in relation to or arising out of the use of the Content.


This article may be used for research, teaching, and private study purposes. Any substantial or systematic reproduction, redistribution, reselling, loan, sub-licensing, systematic supply, or distribution in any form to anyone is expressly forbidden. Terms & Conditions of access and use can be found at http://www.tandfonline.com/page/terms-and-conditions


Large-Scale Student Assessment: Guidelines for Policymakers

Charles Ungerleider
Department of Education Studies
University of British Columbia

Targeted for policy-makers, this article provides suggestions for ensuring that the benefits of using large-scale student assessments are achieved in the face of a number of challenges to their effective use. Fundamental concepts and concerns are reviewed in the context of making valid inferences from test scores. A unique feature of this article is that the author draws on his recent experiences in the educational policy arena.

Despite their utility for (a) determining whether students are meeting the standards set for their achievement and (b) providing information for improving student achievement, large-scale assessments are not typically used well. The purpose of this article is to provide policy-makers with suggestions for ensuring that the benefits of using large-scale student assessments are achieved in the face of a number of challenges to their effective use (cf. Taylor & Tubianosa, 2001; Kohn, 2000).

This article addresses major challenges to the effective use of large-scale student assessments, but does not address many issues affecting the construction of the assessments to ensure their psychometric integrity.

One challenge is to distinguish between norm- and criterion-referenced assessment. Norm-referenced assessments are primarily aimed at determining how well students perform in relation to one another. Criterion-referenced assessments are primarily aimed at determining which students have achieved mastery of a body of knowledge deemed appropriate for the students who take the test, although they are occasionally used to make comparisons among students. This article addresses the use of criterion-referenced large-scale student assessments (CRLSAs) as means of (a) determining whether students are meeting the standards set for their achievement and (b) providing information for improving student achievement.

INTERNATIONAL JOURNAL OF TESTING, 3(2), 119–128
Copyright © 2003, Lawrence Erlbaum Associates, Inc.

Requests for reprints should be sent to Charles Ungerleider, Department of Social and Education Studies, 2125 Main Mall, University of British Columbia, Vancouver, BC, Canada V6T 1Z4. E-mail: [email protected]

Another challenge is ensuring a match between the assessment and the curriculum that students are expected to have mastered by the time the assessment is conducted. The effective use of the results from CRLSAs depends on clearly articulated curricula, specific learning outcomes, and widely accepted curricular exemplars. The curricula should specify clearly the knowledge that students are expected to master and when they are expected to have it mastered. It is essential to fully explore and gather curriculum exemplars of acceptable student work, student work that exceeds expectations, and work that falls short of the desired standard for the construction of assessment instruments. These pieces of work are also helpful to teachers and parents in understanding the relation between student performances and the standard. It is desirable that jurisdictions ensure the participation of classroom teachers in the curriculum development process and, in particular, the identification of relevant curriculum exemplars. Their experience with and knowledge of the student population for whom the curriculum is intended will help to ensure that the criteria developed are appropriate.

The involvement of classroom teachers is also an important part of successful assessment programs. In addition to the importance of such involvement for ensuring a match between the assessment and curriculum, the participation of teachers from the relevant jurisdiction at all stages of the assessment process (i.e., development, field-testing, administration, marking, standard-setting, interpretation) is critical to gaining trust and grass-roots support. Though many Canadian jurisdictions involve classroom teachers, many departments of education in the United States hire companies to develop, mark, and interpret the results of their tests. Involvement of local teachers in all aspects of the assessment process is beneficial for everyone. Teachers benefit professionally by learning about assessment practices and from their collaboration with other teachers from elsewhere in the jurisdiction. Ministries of education benefit from the development of valid and better-respected assessment instruments.

Teachers faced with centralized testing regimes feel that they must ensure that students have mastered the knowledge that will be assessed. If the time available for instruction is not sufficient to achieve the intended learning outcomes for all students, teachers confronted with CRLSAs may sacrifice other important aspects of the domain being assessed or “borrow” time from curricular areas that are not assessed. Thus, it is important that the required learning outcomes that will be formally assessed using CRLSAs are clearly indicated, are achievable in the time allocated for instruction, and can be achieved without sacrificing other important learning outcomes.

CRLSAs depend on teachers having appropriate instructional material. In addition to having sufficient quantities of instructional material, it is important that the instructional material available be closely linked to the curriculum as it is intended to be implemented and learned. If knowledge is of sufficient importance to ensure its mastery through CRLSAs, it is essential that the knowledge be addressed well in the instructional support material.

Credibility of the results can be an obstacle if the standards themselves are not widely accepted. Determination of standards of mastery must be made prior to assessment construction and should involve parents, teachers, administrators, and representatives of the broader public. Policy-makers must be prepared for the increased time that will be required for standard setting and preparation of the assessment as well as for the conflict that will likely ensue when expectations of different constituencies differ. It is likely that business people, university educators, teachers, and parents will hold different perspectives about standards. Though possible and desirable, obtaining agreement about standards with the involvement of a wider constituency will be more difficult than if only grade-level teachers or subject specialists are involved. Jurisdictions that do not involve a range of educational partners in the standard setting process risk that the standards will not be respected by all partners.

When standards are used in reporting the results of large-scale assessments, it is important to ensure that the standards remain relatively consistent across successive administrations. Jurisdictions that want to assess growth over time using different tests will need to employ proper test equating procedures (e.g., through use of a common set of anchor items and item response theory) so that the results of different tests in different years can be placed on a common metric and compared to a common standard. Use of these types of test equating procedures will also help address problems relating to the creation of inconsistent standards across different standard-setting panels. Test equating also eliminates the need for annual standard-setting activities because the same standards could be applied across a number of years of assessments, with periodic reviews every 4 or 5 years to ensure that the standards reflect current expectations. If the assessment addresses broadly accepted and important dimensions of learning, the standards should not change dramatically over time.
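The simplest form of this idea can be sketched in a few lines. The function below is illustrative only (the name and data are assumptions, not from the article) and applies mean-sigma linear equating: scores on a new test form are rescaled so that the common anchor items have the same mean and spread as in the reference year. An operational program would use item response theory, as noted above, but the goal is the same: placing results from different years on a common metric.

```python
from statistics import mean, stdev

def linear_equate(new_scores, anchor_new, anchor_ref):
    """Place scores from a new test form on the reference form's metric
    using the mean-sigma method on a common set of anchor items.

    anchor_new / anchor_ref: anchor-item totals observed in the new and
    reference administrations. The linear transformation matches the
    anchor mean and standard deviation across the two years.
    """
    slope = stdev(anchor_ref) / stdev(anchor_new)
    intercept = mean(anchor_ref) - slope * mean(anchor_new)
    # Apply the same transformation to every score on the new form.
    return [slope * s + intercept for s in new_scores]
```

If the new cohort scores two points lower on the anchors with the same spread, every new-form score is shifted up by two points onto the reference metric, so a standard set in the reference year can be applied unchanged.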

Policy-makers should be aware that some of the planned consequences resulting from centralized testing regimes might have a deleterious impact on teaching and learning. The decline of promotion of students from one grade to the next, teacher and administrator reassignment, and school closures are examples of “high stakes” consequences of assessment (cf. Firestone & Mayrowetz, 2000). The prospect of such consequences has produced a variety of undesirable behaviors, including coaching students while the assessment is being conducted or inappropriately excluding students whose performances are likely to negatively influence the results. Even when the stakes of assessments are seemingly nonexistent, there are unintended consequences that must be taken into account (cf. McNeil, 2000). Knowing that the results do not “count,” students may not put much effort into their responses. Policy-makers should avoid either extreme. Making the assessment part of the final grade students receive in the subject assessed should ensure that students take the assessment seriously.

The work of Coleman and his associates (1966) helped to sensitize educators to the close association between the socioeconomic background of youngsters and their achievements. The use of CRLSAs has been challenged on the grounds that the results from centralized testing regimes can normally be predicted from the background characteristics of the students whose performances are being assessed.

State and district monitoring systems usually report school mean test scores, or the percentage of students achieving some criteria, without regard to the characteristics of students entering the school, measurement and sampling error, or the wider social and economic factors that contribute to schooling outcomes. This practice has led to widespread misinterpretation of data on school achievement, because higher test scores tend to be mistakenly equated with more effective schooling. It has also created mistrust and scepticism among many educators, such that there is widespread resistance to any kind of performance monitoring. (Willms, 1998, p. 4; cf. Kohn, 2001)

The utility of CRLSAs for improving student achievement depends on the capacity for investigating relations among variables over which the system exercises control or is capable of exercising control. CRLSAs depend on the ability to use advanced statistical techniques such as hierarchical linear modelling to investigate relations (Hao, 1999; Lee, 2001). Provincial and state departments of education should be able to ensure the integration of the data from CRLSAs with other relevant data sets and to provide the analysis and interpretation necessary to isolate the factors over which schools have control. The ability to identify classroom and school attributes and relate them to results will be crucial to making full use of CRLSAs. In a sense, CRLSAs also depend on building research capacity and interest. Making data available to qualified researchers will encourage interest and stimulate capacity for analyzing large sets of data with appropriate statistical techniques.
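Hierarchical linear modelling itself requires specialized software, but its starting point can be illustrated with a minimal hand-rolled sketch (the function name and data layout are assumptions for illustration): partitioning score variance into between-school and within-school components. A large within-school share signals that most of the actionable variation lies inside schools rather than between them.

```python
from statistics import mean

def variance_components(schools):
    """One-way decomposition of test-score variance into between-school
    and within-school components -- the first step of a multilevel
    analysis of school effects.

    schools: list of lists, each inner list holding one school's scores.
    Returns (between, within); the two components sum to the overall
    (population) variance of all scores.
    """
    all_scores = [s for school in schools for s in school]
    grand = mean(all_scores)
    n = len(all_scores)
    # Variation of school means around the grand mean, weighted by size.
    between = sum(len(sch) * (mean(sch) - grand) ** 2 for sch in schools) / n
    # Variation of students around their own school's mean.
    within = sum((s - mean(sch)) ** 2 for sch in schools for s in sch) / n
    return between, within
```

The intraclass correlation, between / (between + within), indicates how much of the variation in scores lies between schools; a full hierarchical model would go on to explain each component with school- and student-level predictors.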

Those contemplating the use of CRLSAs must make provision for disentangling the influence of nonschool variables on the achievements of students: that is, discerning a school’s effects from other effects (e.g., socioeconomic background, ethnicity, and gender). Willms (1998) argued persuasively for the use of socioeconomic gradients, depicting the relation between the socioeconomic background of the students and the outcomes being assessed, and for multivariate analyses using sophisticated statistical techniques. Citing the advantages that accrue as a consequence of having schools or classrooms with students from advantaged backgrounds or who possess high initial ability, Willms (1998) argued that it is also necessary to take into account the school’s contextual effects, distinguishing between such factors as the socioeconomic profile of the school community and the effects specifically attributable to teaching (e.g., the use of strategies that maximize the achievement of all students) and school policies and practices such as educational leadership, academic expectations, curricular focus, disciplinary climate, and parental involvement (see, e.g., Ma & Klinger, 2000). From an educational accountability standpoint, it is the effects attributable to teaching and school policies that are of greatest interest. As difficult as it may be to ascertain the effects of teaching and school policies and practices, it is necessary to isolate them from effects of student background (e.g., socioeconomic status, gender, ethnic group membership, etc.) and aptitude (e.g., prior ability, attitude toward school, locus of control, etc.), as well as the contextual effects of attending a particular school (Raudenbush & Willms, 1995).
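The first of these tools, the socioeconomic gradient, reduces to a least-squares slope and can be sketched directly (the function name and data are illustrative assumptions). A steeper slope indicates greater inequality of outcomes across socioeconomic backgrounds; a full analysis would add the multilevel contextual terms discussed above.

```python
def ses_gradient(ses, scores):
    """Least-squares slope of assessment scores on a socioeconomic
    index: a simple version of the socioeconomic gradient in
    Willms's sense.

    ses:    socioeconomic index, one value per student.
    scores: assessment scores, aligned with ses.
    """
    n = len(ses)
    mx = sum(ses) / n
    my = sum(scores) / n
    # Slope = covariance(ses, scores) / variance(ses).
    cov = sum((x - mx) * (y - my) for x, y in zip(ses, scores))
    var = sum((x - mx) ** 2 for x in ses)
    return cov / var
```

Comparing gradients across schools or jurisdictions, rather than comparing raw mean scores, is what separates a school's contribution from the composition of its intake.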

Expressing the relationship between school attributes over which policy-makers exercise control and student outcomes in familiar terms is essential in presenting the data to provincial and local politicians. For example, it should be possible to make such statements as “one can reduce the proportion of students scoring below grade level by reducing class sizes at the primary level from X to Y” (cf. Cohen, Raudenbush, & Ball, 2000).

The maximum benefits of CRLSAs accrue by tracking the performance of specific students over time and computing the school’s impact on their growth trajectory (Willms & Audas, 2000). Cohort studies are of more limited use in determining student improvement than studies that track and measure the performance of students at various stages in their school careers. Growth trajectories can be computed from assessments conducted, for example, at Grades 4, 7, and 10 (cf. Bloom, 1999).
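As a sketch of what such tracking involves (the function name and data are illustrative assumptions), the code below fits a least-squares growth slope for each student from repeated assessments at, say, Grades 4, 7, and 10. School impacts would then be modelled on these per-student slopes rather than on cross-sectional cohort means.

```python
def growth_trajectories(records, grades=(4, 7, 10)):
    """Per-student growth rates (score points per grade) from repeated
    assessments of the same students.

    records: dict mapping student id -> sequence of scores, one score
    per assessment grade, in the order given by `grades`.
    Returns a dict mapping student id -> least-squares growth slope.
    """
    gm = sum(grades) / len(grades)
    var = sum((g - gm) ** 2 for g in grades)
    slopes = {}
    for sid, scores in records.items():
        sm = sum(scores) / len(scores)
        # Slope of this student's scores against grade level.
        cov = sum((g - gm) * (s - sm) for g, s in zip(grades, scores))
        slopes[sid] = cov / var
    return slopes
```

A student scoring 400, 430, and 460 at Grades 4, 7, and 10 has a growth slope of 10 points per grade; differences in such slopes across schools, net of background, are the growth-trajectory impacts the text describes.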

As Runte (1998) made clear, the potential for misusing the results of CRLSAs is great. Though intended primarily for professional use and for informing parents about the progress of their children, others—including education professionals, local and provincial politicians, and the media—may unintentionally or wilfully misuse the results of the assessments. Teachers are not likely to embrace the use of CRLSAs as an aid to instructional decision making if the environment is punishment centered. Comparisons among classrooms, schools, or districts without proper regard for and control of situational or other differences between them are likely to exacerbate teacher opposition to the use of such information. Administrators, school trustees, parents, and politicians who misuse the results as a measure of teacher performance or productivity are not likely to engender teacher confidence in the process. Invidious comparisons of classrooms, schools, and districts abound in popular media. Politicians and educators who should know better make inappropriate comparisons. Jurisdictions contemplating the use of CRLSAs must plan carefully for the dissemination of results, prepare professionals, journalists, and politicians for their appropriate use, and deliberately counter misuse of the data.

Effective use of the information provided by CRLSAs depends on a school environment that fosters a sense of collective responsibility for the educational success of students. Much of the variation in student achievement that is amenable to intervention occurs at the classroom level. Instruction is the principal, but not sole, mediating link between assessment and student achievement; other variables include the disciplinary climate of the school and the school’s emphasis on academic achievement. Effort and resources should be focused on reducing the variation among students in general and between identifiable groups of students in particular. CRLSAs cannot succeed in the face of organizational factors that impede the accomplishment of the required learning outcomes. These factors include inadequate time allocation (i.e., hours of instruction), inappropriate time distribution (i.e., timetabling of subjects), and insufficient time for teachers to review collaboratively the assessment data and to consider what, if any, modifications should be made to their instructional plans, to the methodology they employ in the classroom, or to school policies and practices.

Inadequate preparation of teachers is an obstacle to the effective use of CRLSAs. Faculties of education should ensure that those preparing to teach possess the knowledge they will need in the use of assessment information for instructional planning and implementation. Generalized measurement courses do not typically develop the knowledge one needs to use information from CRLSAs for instructional planning and implementation. Such knowledge should be developed in the context of curriculum and instruction courses and should complement the knowledge about teacher-designed assessment included in those courses.

Continuing professional education should be provided for experienced teachers in the use of assessment information for instructional planning and implementation. Responsibility for providing such continuing education falls to the jurisdiction that is responsible for the assessment. In Canada and the United States, provinces and states typically promulgate CRLSAs. These jurisdictions must also provide for the professional education that practitioners require.

The lack of school-based leadership can be an obstacle to the use of the information provided by CRLSAs. The importance of school-based leadership has been well established, though less well developed in practice. Insufficient attention has been given to instructional leadership ability in the recruitment, selection, and preparation of school-based administrators. As a consequence, school-based and district administrators are often ill-disposed and ill-equipped to lead the staff in carefully considering the results of CRLSAs and the implications of the results for school policies and practices and for instruction. If improving student success is to remain a central goal of education, administrators must be steeped in the intricacies of using information from CRLSAs. Possessing such capacity must become a prerequisite for holding an administrative position. Responsibility for providing continuing professional education for administrators currently employed falls to the jurisdiction responsible for promulgating such assessments.


The absence of close working relationships among the agencies and organizations responsible for the various dimensions of education and educational improvement will hamper the appropriate and most efficacious use of CRLSAs. The agencies responsible for certification of teachers and administrators must work closely with faculties of education and the other agencies providing for professional education. Provincial and state departments and ministries of education need to work with school boards, senior administrators, teacher organizations, and parent groups to facilitate understanding of and support for the appropriate use of CRLSAs. School boards must monitor the information provided by CRLSAs and take it into account in their performance plans. Ensuring school success for all students requires jurisdictions to distribute resources to ensure equality of outcomes, especially for those students for whom learning is more challenging because of poverty and discrimination.

CRLSAs do not by themselves improve the performance of students. Without making certain that a large number of other factors are present, it would be unrealistic and unsound to embark on such assessment. The use of large-scale assessments to inform changes at the school and classroom level must address the aforementioned obstacles and challenges. At a minimum, the following actions (after Willms, 1998; cf. Woessmann, 2001) appear to be essential:

• Establish broad agreement about what school outcomes are essential for all students.

• Ensure that these outcomes are clearly articulated in the curriculum and are supported with appropriate instructional material.

• Hold students, parents, and teachers accountable for those outcomes.

• Assess student progress in the areas of importance at different times over their school careers.

• Prepare teachers and encourage them to use teaching strategies that increase learning outcomes for all students.

• Encourage mixed-ability grouping and discourage grouping, tracking, or streaming students by socioeconomic background or in ways that increase differentiation among students of different ethnocultural backgrounds.

• Assess schools on the basis of student growth in learning outcomes, taking into account their individual socioeconomic backgrounds, the socioeconomic context of the school community, as well as school policies and practices known to influence the achievement of the valued outcomes.

• Examine rates of student progress as well as gradients in student progress associated with such background factors as socioeconomic standing, gender, and ethnicity.

• Ensure that teachers and administrators are well prepared for their responsibilities.

• Counter misuse of the results of CRLSAs in the media and elsewhere.

• Provide teachers with adequate time to individually and collectively interpret data for the purpose of improving instruction.

As I hope I have made clear, improving student success will depend as much on other factors as it does on the assessment system itself. Improving the capacity of teachers to employ teaching strategies that increase the performance of all students, developing the capacity of principals to provide instructional leadership, and helping district administrators to acquire the capacity to interpret and use the information provided by large-scale student assessments will require thoughtful and focused professional preparation and continuing education. Educating school district or provincial politicians about the impact of their comments on the results of such assessments, and countering erroneous or wilfully misleading representations of the results of such assessments by the media or others, will be challenging and will require significant attention.

Despite the challenges and obstacles to their effective use, if the data from CRLSAs are subjected to rigorous statistical analyses and interpretation, they can be very useful in improving the conditions influencing student achievement. Unless ministries of education are willing to analyze and interpret the data with care and rigor, I fear that large-scale student assessments will be used to bludgeon schools and teachers rather than to inform them about the changes that might be made to improve student success.

In the face of anxiety about the future of the next generation, and thus our own future, and in the absence of other publicly available standards, too many people—including ones who should know better—invest too much importance in the results of assessments and too little in the analyses and interpretation of those results. Policy-makers would do well to consider the differences between blunt and ideologically motivated analyses of data such as the Fraser Institute’s annual reports on secondary schooling (Cowley & Easton, 2001; Cowley & Shahabi-Azad, 2001a, 2001b) and the finer grained and more useful analyses of Ma and Klinger (2000) or Willms (1998), for example.

The analyses and interpretations of the data from large-scale measures of student achievement are best used on small-scale initiatives to improve student achievement. Consider how countries such as Canada and the United States use periodic reports on the fluctuations in their economies. They attach little importance to short-term fluctuations, reacting only to patterns that develop over time. They seldom respond to the data by making system-wide economic changes. Instead, they direct interventions at specific areas of the economy. The problem with the large-scale assessment programs I have seen is that the data reported prompt calls for radical reform of the system rather than thoughtful consideration of classroom and school practices. Policy-makers need to get beyond viewing large-scale student assessments as “accountability” measures and embrace them as one, and only one, source of information that can be used to improve student success in school.


ACKNOWLEDGMENTS

This article is a response to an invitation I received when I was Deputy Minister of Education for the Province of British Columbia. I was asked to address the issue of large-scale student assessment from a “policy perspective.” In responding to that invitation, I bring my background as a sociologist of education, my observations of the use of large-scale assessments in other jurisdictions, and my relatively recent experience as Deputy Minister.

I am grateful to Dean Goodman, Jerry Mussio, David Robitaille, and Thomas Stephens for helpful suggestions for the improvement of the article. They are not, of course, responsible for any failure to follow their advice!

REFERENCES

Bloom, H. S. (1999). Estimating program impacts on student achievement using short interrupted time series. New York: Manpower Demonstration Research Corporation.

Cohen, D. K., Raudenbush, S. W., & Ball, D. L. (2000). Resources, instruction, and research. Seattle, WA: University of Washington, Center for the Study of Teaching and Policy. (ERIC Document No. W-00–2)

Coleman, J. S., Campbell, E. Q., Hobson, C. F., McPartland, A. M., Mood, A. M., Weinfeld, F. D., et al. (1966). Equality of educational opportunity. Washington, DC: Department of Health, Education, and Welfare.

Cowley, P., & Easton, S. (2001). Report on British Columbia’s secondary schools: 2001 edition. Vancouver, BC, Canada: The Fraser Institute.

Cowley, P., & Shahabi-Azad, S. (2001a). Report on Alberta’s secondary schools: 2001 edition. Vancouver, BC, Canada: The Fraser Institute.

Cowley, P., & Shahabi-Azad, S. (2001b). Report on Ontario’s secondary schools: 2001 edition. Vancouver, BC, Canada: The Fraser Institute.

Firestone, W. A., & Mayrowetz, D. (2000). Rethinking “high stakes”: Lessons from the United States and England and Wales. Teachers College Record, 102, 724–749.

Hao, L. (1999). Nested heterogeneity and difference within differences. Washington, DC: Johns Hopkins University, National Institute of Child and Human Development.

Kohn, A. (2000). The case against standardized testing: Raising the scores, ruining the schools. Portsmouth, NH: Heinemann.

Kohn, A. (2001). Fighting the tests: A practical guide to rescuing our schools. Phi Delta Kappan, 82, 349–357.

Lee, Y. B. (2001). Measuring school effects on student achievement in the hierarchical school system: A comparison of multiple techniques. Columbus, OH: Ohio State University, School of Public Policy and Management.

Ma, X., & Klinger, D. A. (2000). Hierarchical linear modelling and student and school effects on academic achievement. Canadian Journal of Education, 25, 41–55.

McNeil, L. (2000). Creating new inequalities: Contradictions of reform. Phi Delta Kappan, 82, 729–734.

Raudenbush, S. W., & Willms, J. D. (1995). The estimation of school effects. Journal of Educational and Behavioral Statistics, 20, 307–335.


Runte, R. (1998). The impact of centralized examinations on teacher professionalism. Canadian Journal of Education, 23, 166–181.

Taylor, A. R., & Tubianosa, T. (2001). Student assessment in Canada: Improving the learning environment through effective evaluation. Kelowna, BC, Canada: Society for the Advancement of Excellence in Education.

Willms, J. D. (1998). Assessment strategies for Title I of the Improving America’s Schools Act. Paper prepared for the Committee on Title I and Assessment of the National Academy of Sciences, Washington, DC.

Willms, J. D., & Audas, R. (2000). The content of national reports for the Programme of Indicators of Student Achievement (PISA). Report prepared for the Organisation for Economic Cooperation and Development, Paris.

Woessmann, L. (2001). Schooling, resources, educational institutions, and student performance: The international evidence. Kiel, Germany: Kiel Institute of World Economics.
