
Evaluation and Program Planning, Vol. 20, No. 1, pp. 7-15, 1997
© 1997 Elsevier Science Ltd. All rights reserved. Printed in Great Britain
0149-7189/97 $17.00 + 0.00
PII: S0149-7189(96)00032-S

CHANGING ROLES: EVALUATOR AND TEACHER COLLABORATING IN SCHOOL CHANGE

GENEVIEVE LAU

Skyline College, San Bruno

PAUL LEMAHIEU

University of Delaware and Delaware Department of Public Instruction

ABSTRACT

Teachers are ground-level implementors of school change. An evaluation of their efforts in experimenting with new classroom practices must empower and not inhibit them from facilitating change. The evaluation of the HERALD project had teacher empowerment as its concern, and three movements (utilization focused evaluation, teacher research, and authentic assessment) informed its design. As teachers experimented with new practices in their classrooms, external evaluators held a mirror to these practices and coordinated teams of teachers, who served as internal evaluators, to document project effects on student outcomes in systematic yet authentic ways. Evaluator-teacher collaboration made evaluation more reflective of classroom realities and more useful than threatening to these implementors of change. © 1997 Elsevier Science Ltd

INTRODUCTION

One vital purpose of a formative evaluation design is to show stake-holders what is working and what is not, with a particular interest in improving the program. A design evaluating school change must empower teachers, as ground-level implementors, to facilitate change. Evaluation must not inhibit their efforts. The intent of this article is to discuss the theoretical underpinnings and the execution of an evaluation design that empowered teachers to reform classroom practice within the context of the HERALD Project (Phase I) in San Francisco. The objectives of the evaluation were to determine the effects of the project on teachers, students, and the organizational structure in order to inform change.

Individuals who have chosen teaching as their mission in life find fulfillment in seeing their students learn. They welcome opportunities to improve their teaching; they feel empowered when supported to unleash their creative energies in pushing their professional development to higher ground. However, they may be inhibited in their innovative ventures by limiting forces imposed upon them in the name of "accountability". Often teacher effectiveness is measured by their students' standardized test scores (Haas, Haladyna, & Nolen, 1989). Moreover, teachers feel the emotional burden of helping students "succeed" in published test results and become too narrowly focused in their teaching (Smith, Edelsky, Draper, Rottenberg, & Cherland, 1989; Smith, 1991).

In order to support teachers to do what they do best - teaching - and to develop what they most want to develop - students' learning - the measures to which they are held accountable must empower and not inhibit. Three movements played an important role in informing such an empowering evaluation design for the HERALD Project: utilization focused evaluation (Patton, 1986), teacher research (Goswami & Stillman, 1987; Lytle & Cochran-Smith, 1989), and authentic assessment (Mitchell, 1989; Wiggins, 1989).

Requests for reprints should be addressed to Genevieve Lau, Skyline College, 3300 College Drive, San Bruno, CA 94066, U.S.A.

Informing the Design

Utilization focused evaluation framed the objectives of the evaluation design; authentic assessment guided the exploration of new ways of measuring student outcomes promoted by the Project; and teacher research prompted the involvement of teachers as internal evaluators.

Utilization Focused Evaluation: Evaluators Working with Stake-holders

Utilization focused evaluation proposes that evaluators be active, reactive and adaptive in identifying, organizing and working with relevant decision makers and information users of the evaluation (Patton, 1986). This collaboration with stake-holders is particularly appropriate to programs of educational reform because of the nature of change.

Educational change is complex. It involves many interacting factors which contribute to the success or failure of the effort (Fullan, 1991). Therefore, to be useful to a school change project, the evaluation needs to capture as many of the interacting factors as possible by considering the context and following closely the process of change, and not to consider only the outcomes of change (particularly not to consider only those outcomes measured by preconceived criteria). Since project participants are the most knowledgeable of the context and the process, the design must provide opportunities for stake-holder input so that the study may evolve with the project, documenting the strengths and weaknesses at each step of the development and making the information available for mid-course changes.

The evaluator, therefore, is no longer a distant, disinterested gatherer of data using predetermined and invariant methods of collection, but a breathing, living human being in close contact with the progress of the program and its participants, using his/her judgment and expertise in response to the evolving evaluation needs. Evaluators should be "adapting, reacting... interacting with and sensitive to the changing nature of evaluation concerns - rather than being statically fixed on identified decisions" (Alkin, 1991, p. 14). The evaluator's responsiveness to the needs of the primary users plays an important role in the utilization of the evaluation (Cousins & Earl, 1992).

As teachers implement curriculum innovations, they are still held accountable for their students' learning, more often than not, by traditional ways of assessment. Teachers face the dilemma of trying to use new ways of teaching (which may produce new learning) to raise traditional test scores (which may not reflect the new learning). An example would be having students learn problem solving skills through group work in an academic content area and then assessing their learning through a multiple-choice test which is set up to test simple recall of content information. When teachers are thus held accountable, they tend to focus on the materials and skills the test covers, resulting in "a narrowing of possible curriculum and a reduction of teachers' ability to adapt, create, or diverge" (Smith, 1991, p. 10).

The above dilemma confronting teachers engaged in trying new ways is caused by demands of accountability favoring tests which have standardized procedures of administration to masses of students and have established reliability and validity in providing comparative reports of students, schools and districts to the public. Multiple-choice norm-referenced tests fulfill these demands (Frechtling, 1991). However, they are also found to be lacking in other areas, such as their dictating indirect assessment of many important skills (Powell, 1990), requiring only the ability to recognize the best answer (Hiebert & Calfee, 1989), and providing too much information about students' short-term recall and too little about the long-term functioning of the mind (Wiggins, 1989). While standardized tests serve some important purposes, limiting student learning assessment to one format is not adequate and discourages innovation.

Authentic Assessment: A Close Ally of Curriculum and Practice Innovations

As some educators recognize the close tie between testing and reforming practices, they push for alternative means of assessment, which more realistically measure what teachers want their students to learn and provide superior feedback to teachers about areas which need improvement (Calfee, Henry, & Funderburg, 1988; LeMahieu, Eresh, & Wallace, 1993; Valencia, 1990). Teachers are encouraged to use multiple and diverse ways to measure student learning in the natural teaching-learning contexts, such as with performance samples, observations, interviews, and portfolios (Hiebert & Calfee, 1989; Au, Scheu, Kawakami, & Herman, 1990). At the national level, three forms of assessment were developed: performance examinations, portfolios, and projects (Olson, 1990; Resnick & Tucker, 1991). In some instances, new forms of assessment were being developed with technical qualities adequate to the challenge of external accountability (LeMahieu, Gitomer, & Eresh, 1995). The primary criterion of these measures is that of authenticity: "is this what we want students to know and to be able to do?" (Mitchell, 1989, p. 3).

Authentic assessment encourages innovations in curriculum and practice because it reflects the new goals. In the traditional paradigm, assessment is more often than not the last step in the ladder of curriculum innovations or school change. For education to be meaningful to students and teachers, the order needs to be reversed. Education outcomes which reflect new goals for student achievement should dictate the curriculum, which in turn mandates certain changes in structural support (Newman, 1990). Assessment of student outcomes in a project promoting change in teacher practice must, therefore, be authentic to the new goals for student learning. These authentic measures will feed back to teachers how effective their ways of teaching are in helping their students achieve these goals. Seeing results helps fan their enthusiasm for discovering effective practice.

Teachers as Researchers/Evaluators: A Way to Strengthen the Change Process

The desire to do more with their students prompts teachers to explore new practices. Changing practice becomes possible when they have access to new ideas, and new practices are sustained when they see results in their students' learning.

Traditionally, however, teachers' access to new information from research has been problematic due to the separation of research from practice. School and university cultures seem to favor the distinction between teachers and researchers based on the norms organizing their practices and the perceptions of their status. While university researchers are pushed to "publish or perish" and often relate to teachers only as their research objects (Sirotnik & Clark, 1988), teachers do not benefit from such research due to their lack of time for reading and reflection (Livingston & Castle, 1989), inadequate administrative support to attend professional conferences, and the jargon and abstract propositions in research (Eisner, 1984). Furthermore, administrator attitudes show a higher regard for researchers than for practitioners (Florio-Ruane, 1990). As teachers are alienated from researchers and research practices, so are they from research findings.

Even in cases when innovations from research manage to reach teachers, they may run into implementation problems and slip back into their traditional practices if they do not have continued support. Inadequate feedback is also discouraging. Experimenting with new ideas generally upsets the tested and comfortable routine and creates problems teachers may not have had to face before. Without assistance in assessing students' new learning, teachers will not see results and give up making changes.

In the past decade, involving teachers in the change process (Cross, 1987; Tyack, 1990) has helped to bridge the gap between research and practice. Teacher involvement has taken the forms of researchers collaborating with teachers (Livingston & Castle, 1989; Florio-Ruane, 1990) and of training teachers to be researchers in their own classrooms (Heath & Branscombe, 1985; Heath, Branscombe, & Thomas, 1986; Low, 1993). A simple and effective example is teachers documenting their own practices as a way to facilitate reflection and change (Koziol & Burns, 1986). When change is perceived as meaningful, teachers as researchers will push change beyond the classroom to the school context (LeMahieu & Asher, 1987).

Having researchers collaborate with teachers facilitates change. The evaluation of a project of change should also build on this effective collaboration. Based on the model of teacher research, an external evaluator not only works with but also trains teachers to systematically reflect upon their practices and to use authentic means to measure their students' learning. Besides gathering day-to-day assessment information, teachers can work with external evaluators to aggregate, analyze, and interpret the results to inform further decisions and choices. In other words, they become internal evaluators. Furthermore, by identifying needed support for the changed practices, they may then push for structural reforms. Teacher-evaluator collaboration helps to counter the problem that teachers do not have the time to school themselves in research skills, and it lends objectivity to their self-evaluation process. In turn, teachers bring invaluable insight to the construction of authentic measures and richness to the interpretation.

TEACHER AS EVALUATOR IN A SCHOOL CHANGE PROJECT

School change can only be effective if it is meaningful to teachers and helpful to students. Having teachers and students as central figures does not mean including a few representatives in committee meetings which formulate a new structure, curriculum, or strategy, but rather having visions of change arise from daily interactions in the classroom. However, it is difficult for people who are immersed in a context to see beyond their situation and make necessary or effective changes. It is, therefore, important to have in place a process whereby teachers can try out new ideas in their classrooms and assess the impact of these ideas on their students' learning by looking at the familiar happenings in the classroom in a new and objective way to inform further changes. This process is cyclical, which means it can go on indefinitely, and one can start either with introducing new ideas into the classroom or with evaluating present practice.

By having utilization as the focus of the evaluation and by involving teachers as internal evaluators exploring authentic means to measure student learning, Phase I of the HERALD Project in San Francisco demonstrated the close tie that should exist between evaluation and practice.


The HERALD Project (Phase I, 1988-1991) and its Evaluation Design

Phase I of the HERALD project (Humanities Education, Research, and Language Development) in the San Francisco Unified School District was part of a nationwide network of programs working together to change teaching and learning about the arts and humanities, known as the Collaboratives for Humanities and Arts Teaching (CHART). The mission was to promote oral and written language of high school students in a racially diverse school district through the vehicles of teacher research and an enriched Humanities curriculum. Project participants grew from 27 teachers from five high schools in the first year (1988-89) to 74 teachers from twelve high schools in the third and final year (1990-91) of Phase I of the Project.

The evaluation design, framed by the tenets of utilization-focused evaluation, teacher research and authentic assessment, evolved with the Project - different evaluation strands were added or modified according to the needs of the developing Project. In the first year, the evaluation looked at the Project's impact on teachers' practices, which had to be in place before student learning could be affected. This phase of the evaluation focused on teachers' responses to and implementation of new ideas. In the second year, besides documenting the "treatment" or teacher practice, student outcome studies were added. External evaluators trained and supported teachers in exploring the use of multiple and authentic ways to measure student learning. Thus, the emphasis was not only on the Project's impact on student learning, but also on identifying useful assessment tools and necessary supports for change. In the third year, teachers and external evaluators collaborated in studying HERALD's contribution to and support from the institutional structure. Three reports give details of the evaluation of each year of Phase I of HERALD. Therefore, the following sections will only focus on how evaluation supported teachers in making changes and highlight the related effects on the participants.

Evaluator as Participant Observer (Year 1 Implementation of Evaluation)

As a teacher empowering project, HERALD drew together a group of teachers who were not afraid to try new practices. Also, as a Humanities project, teachers came from various disciplines - English, Social Studies, History, Music, Art, and Foreign Language. The Project supported these teachers in gathering new ideas by offering choices of professional development opportunities depending on the interest and discipline of the teachers. They participated in national, local, and project sponsored workshops, retreats, and symposia. They also collaborated with university researchers who acted as resource persons in some cases and as co-researchers in others. Their weekly site meetings and monthly Project meetings provided opportunities for collegial exchange.

The evaluation design for Year 1 was formative. Its intent was to hold a mirror for participants to develop and strengthen approaches that worked and to make changes in areas of weakness. The evaluator chose to be adapting, reacting, and interacting by taking the stance of an ethnographer in approaching this initial phase of the evaluation. She related to teachers as one who would make observations and provide feedback, not one who would make summary judgments of their success in meeting externally imposed standards. She participated in most of the program activities, observed, took field notes, and communicated patterns of development that helped or hindered the program's progress to the participants through the director and at meetings. To gain a better understanding of events, she related program activities to a larger context; for example, the political history surrounding the events, and the values and ways of looking at things in the teaching community. She also interviewed university partners, funders, and administrators, and conducted structured open-ended group and individual interviews with Project teachers. By providing these kinds of information to the participants at regular intervals, the evaluation helped them see what was happening from a more objective and broader perspective as teachers were introduced to and experimented with new ways of teaching.

A synthesis of the evaluation findings for the first year produced a more focused conceptual frame of the Project, identified strengths and weaknesses in its implementation, and produced suggestions for Year 2 planning. The Project started as a collaborative effort. Each partner in the Project proposal contributed to its mission and goals - the School District wanted to improve the language acquisition and performance of its students, CHART saw the Humanities as a vehicle for change, and university partners had expertise in teacher research. If these were not new ideas to Project participants, putting them all together to make new teaching work in the classroom was definitely new. Different teachers joined the project because they were attracted to one or another of the objectives - improving language teaching, revising the Humanities curriculum, or doing research (in the way they conceived research). However, initial excitement turned into bewilderment in the first months as participants not only encountered new ideas, but also discovered expectations of them beyond the one for which they had joined the Project.

Documentation of Project activities and teachers' reactions helped identify a number of sources of difficulty and facilitated the process of implementing change. First, ambiguities were inherent in interpretations of program intents and caused teacher frustration. The director, therefore, seized various occasions to make the objectives of the Project more explicit to the participants. Second, integrating all the new ideas into classroom practice was confusing to the teachers. The evaluator translated the mission and goals of the Project into a conceptual frame for classroom implementation and inquiry so that participants could see where all the pieces fit. Third, teachers realized their strength in experimenting with new practices and in learning from each other's successful teaching strategies. Collegial exchange was an effective source of support and encouragement. The school structure, however, was not conducive to opportunities for exchange, and the Project worked towards providing those opportunities, such as coring (the practice of having a group of students take several classes together, with the teachers of those classes collaborating on curriculum planning and student progress monitoring) and providing a site team with a common preparation and planning time. Fourth, doing research was unfamiliar to many teachers. They had little time to read because of their teaching load; they were more concerned about practical implementation than theory building as they had to satisfy state mandates and demands from parents and students; and they saw report writing as yet another burden on their busy schedule. Therefore, going into the second year and acting on feedback from key stake-holders, the Project planned not only to continue facilitating teacher exploration of language development strategies in the context of an integrated Humanities program, but also to give teachers much more coordinated support and reason for doing research.

Evaluation Teams (Year 2 Implementation of Evaluation)

In the second year, as the number of HERALD school teams increased (from five to eleven), so too did the number of external evaluators (from one to four). Informed by the previous year's evaluation, the original evaluator continued to examine the Project's effects on teachers, but also invited three researchers who had expertise in different approaches to studying student outcomes to join the evaluation team.

Studying the project's effects on teachers continued. Because of the larger number of teachers, data were gathered not only through group interviews and participant observation, but also with questionnaires, classroom practice inventories, and documentary analysis of Project archives. The evaluator emphasized her role in supporting teachers in systematic reflection on their practices. Teachers identified several changes from their previous practice. In a few cases, teachers felt HERALD validated what they did only timidly before; however, in most cases they recognized that HERALD gave them license to try new things by providing ideas and support. The data from classroom practice inventories showed teachers involving students in more speaking and writing activities, students taking a more active role in their learning, and more participation in group work. Moreover, teachers were comfortable about giving feedback to inform future evaluation.

Studying the project's effects on student learning was added. As teachers experimented with innovative ways, they needed a distancing device to assess the impact of their teaching on their students' learning in order to make informed decisions about which strategies to keep, which to modify, and which to discard. However, the assessment device used to measure student learning had to authentically reflect the desired outcomes and, at the same time, not pose a political threat to the teachers. Otherwise, teachers would simply teach whatever the evaluation measured and not address the desired outcomes.

The design to provide authentic assessment of student learning took three factors into consideration:

First, existing standardized tests measured only general language development at the recognition level but not oral communication or actual writing skills, not to speak of other valued outcomes, such as students' more active engagement in their learning. Thus, the student outcome measures for the Project had to examine actual student writing, student oral language, and ways students showed that they were taking more responsibility for their learning.

Second, the Project was neither a curriculum package nor a set strategy of intervention. It emphasized teacher innovations effecting change through collegial exploration and systematic reflection. Also, Project teachers were from different disciplines - social science, English, foreign languages, art, music, science, math, computer, and special education. The assessment of student outcomes could not rely on only one or two (discipline-based) measures across the whole spectrum of courses if such assessment was to be authentic and appropriate to the full scope of the project. There had to be multiple measures, and teachers needed the flexibility to experiment with different measures to find those most appropriate to their classes.

Third, research has shown many interactive variables affecting student learning, such as teacher and student cultural, socio-economic, and life experience backgrounds; their personalities and expectations; as well as more obvious factors such as the learning environment, teaching strategies, duration of exposure, and curriculum materials. It is impossible to control all variables; it would not be a fruitful endeavor even if it were possible. As argued by Cronbach (1977), the more controlled an experiment, the less its applicability to other situations. Thus, the evaluation would seek ways other than using control groups to explore the effectiveness of certain practices, and it would not try to make claims that the Project was the sole cause of changes in student language performance.

In view of the above challenges, teachers in the Project were in a better position than any others to be sensitive to significant factors affecting their students' learning and to recognize various authentic ways of assessing such. Therefore, teachers were invited to participate as internal evaluators as an optional fulfillment of the teacher research objective of the Project. However, they needed assistance to produce credible and valid documentation, as well as evidence of sufficient quality to justify the knowledge claims based upon it.

Involving teachers as internal evaluators strengthened the validity of the findings as measured against the criteria defined by Goetz and LeCompte (1984):

Validity is concerned with the accuracy of scientific findings. Establishing validity requires (1) determining the extent to which conclusions effectively represent empirical reality and (2) assessing whether constructs devised by researchers represent or measure the categories of human experience that occur (p. 210).

Several Project activities paved the way for teachers to become internal evaluators. During the pre-second-year summer retreat, a panel of expert researchers made presentations in the areas of using portfolios, conducting classroom observation, and holistically scoring writing. About half of the Project teachers who attended orientation workshops subsequently signed up to participate on one or more of three teams, each of which was led by an external researcher/evaluator. The portfolio team used portfolios to document student learning in oral presentations, written assignments and special projects; the classroom observation team examined issues in classroom interaction of interest to the participating teachers; and the holistic scoring team composed the prompts, then administered and holistically scored student essays to show progress in the students' writing skills. Although doing research was a Project requirement, teachers had a choice of the focus, context, and type of research. This was to accommodate their diverse interests and strengths and to minimize the feeling of imposition. Each external evaluator worked with a team of teachers throughout the year. Their charge was to document their students' learning and to find what worked and what did not while using these alternative means of assessment.

HERALD teachers, as internal evaluators, used a variety of measures and instruments and carefully documented the strengths and weaknesses of each by keeping reflective journals. These measures and instruments included narrative case studies; video and audio transcripts; checklists designed by groups or individuals; portfolios of oral, written, and non-verbal student work; student assessment surveys; observations by teachers, their colleagues, or students; journals; and holistic scoring of prompted and open-ended writing. Some teachers used pre- and post-writing samples to show change, and others linked observed student behaviors and products to certain classroom practices. Despite the diversity of contexts, students, subject areas, and assessment instruments, findings from the three teams corroborated one another. Students took more active roles in their learning and showed improvement in their oral and written language performance. Details of their findings are recorded in the HERALD Year II Evaluation: Final Report (Lau, Low, Sato, & Stack, 1990).

Teachers and students were closest to the realities of the classroom. When teachers were motivated by a genuine concern for assessing the impact of their practices on students' learning, their involvement in devising authentic assessment instruments pushed them to make explicit their teaching objectives, examine whether their practices actually reflected those objectives, and make sure the instruments were able to measure student learning in response to those teaching practices. The process not only helped teachers be more sensitive to their practice, but also made the assessment more reflective of "empirical reality". Also, keeping reflective journals of their inquiry studies helped teachers document the limitations of the various devices in their actual implementation. Teachers' interest was to find out what really worked, and they were in a position to make first-hand observations. Moreover, as teachers worked in a team with an external evaluator, the evaluator gave objectivity and technical support to uphold the quality of the study, while other team members also gave more objective input (than that of the person directly involved) to each other as persons with knowledge of different but often similar classroom realities. Team discussions helped members draw on each other's experience and find alternative ways to confront real-life problems in data gathering, and provided different angles to the analysis and interpretation of data.

However, several precautions and structural supports needed to be in place. Teachers had to feel comfortable in getting involved. At the beginning of HERALD, quite a few teachers felt imposed upon when they were asked to participate in doing research in the classroom. This sense of resentment was not found among internal evaluation team members because they voluntarily signed on for approaches (modeled in presentations) and activities of their choice. External evaluators were chosen for their expertise in a particular approach and their ability to work with people. Rather than being an imposing presence, they lent another experienced pair of eyes and ears to the teachers. Moreover, support from colleagues was important. As demonstrated by the Project, consistent opportunities for the exchange of ideas with the external evaluator and other teachers doing parallel inquiries were critical to maintaining a balance between subjective and objective perspectives.

Lastly, channels of communication needed to be kept open. In the second year, while each team worked by itself, communication among the three strands needed to be maintained. For the Project in general, triangulation of findings from the three strands as well as with the results of the classroom practice inventory strengthened the conclusions drawn about the year's progress.

Collaborating to Push Change from the Classroom to Beyond (Year 3 Implementation of Evaluation)

With the experience gained in Year 2, teachers as internal evaluators and their external evaluator partners refined their ways of looking at classroom practice, their students' learning, and the structural support needed beyond their classrooms.

First, teachers on the evaluation team integrated the ways of assessment that showed the most promise in Year 2 into a portfolio system which included evidence of language learning in oral, written, and nonverbal domains. At the end of the year, an analysis of portfolios showed student learning in a number of areas that might have escaped documentation in traditional ways of testing. Evidence of learning was found in the following areas: writing in various genres, informal and formal oral language, non-verbal display of knowledge, cooperative group projects and cross-curriculum thematic work, knowledge in the humanities and specific subject areas, self-reflection and assessment, and family and community related work (Low, 1991).

Second, teachers helped create their HERALD version of an instructional practices inventory. While Year 2 teachers found the teaching practice inventory adapted from another district helpful in their reflection, they wanted to make one more suited to their needs. The evaluator gave a short open-ended questionnaire titled "What is HERALD?" to teachers at a spring retreat. From the categorized data, a team of teachers compiled their Inventory-Questionnaire, which was later used by other teachers for reflection and for the evaluation study.

Third, some teachers, empowered by their ownership of the evaluation process, took a further step to affirm their teaching. These teachers worked with an external evaluator to adapt the Inventory-Questionnaire of HERALD instructional practices and presented it in a format more suited to student use. They had their students complete the inventory for their classes. They also solicited the participation of a colleague who taught the same or a similar course at the same level as their HERALD class. These collaborating teachers also gave the inventory to their students. Student reports supported the perception of HERALD teachers regarding their own practices, and the reports also showed differences between HERALD classrooms and non-HERALD classrooms in those areas promoted by the project.

Fourth, teachers began to look beyond their classrooms and their current circumstances. Ten teachers from five schools participated in meetings for a new evaluation strand on organizational structure, added in the third year. Working with an external evaluator, they drew on their observations of the Project's strengths and weaknesses and proposed strategies for increasing the Project's survivability and impact in the district.

Through being involved directly in the evaluation process and working closely with external evaluators, teachers no longer feared experimenting with new ways and being held accountable for outcomes. They had input into what was authentic in evaluating the new instructional practices, and evaluation became a non-threatening, objective device to inform change. One teacher's observation that for students, "Assessment becomes motivational rather than evaluative" (Lau, 1991, p. 15) also reflects the view of evaluation held by teachers as internal evaluators. Furthermore, other teachers looked at the risk of change as empowering: "I really like it - it reduces isolation between teachers and departments, empowers teachers through collaboration, and promotes on-site staff development" (Mills, 1991, p. 36).

CONCLUSION

As teachers experiment with new ways in their teaching, they can also distance themselves by using authentic ways to look at their students' learning in the research role of internal evaluators. Collaborating with external evaluators/researchers bridges the gap in information flow between researcher and practitioner. If the evaluation design is properly structured and relationships are appropriately defined, the external evaluators can provide an important safeguard, ensuring the requisite quality of evidence and the validity and credibility of the evaluation undertaking. Instead of being a threatening stranger coming into the classroom to gather data to be measured against external criteria, the external evaluator is a collaborator engaged in "substantive conversation" with teachers in examining teaching practices and creating authentic ways of assessing student learning. Evaluation aids rather than inhibits the construction of new knowledge within the project.

Not only did the utilization focused evaluation design, which involved teachers as internal evaluators using authentic assessment of student learning, make change in classroom practice more meaningful and useful to teachers, but it also helped teachers become agents of change beyond the classroom. From what they had learned about what aided or hindered the sustaining of effective practices, they pushed for consequent changes in the structure. They have indeed proven the importance of actively involving ground-level implementors in the school change process - and the power of a properly constructed evaluation team for facilitating and empowering that endeavor.

The formative and evolving evaluation design, besides helping teachers, also yields important data about school change and program effects. The findings include not just the results of certain changed practices, but also information concerning the process itself. Moreover, involving teachers as internal evaluators yields insights that an outsider might miss. These findings about the process are useful in informing the project in making mid-course changes and may be helpful to other projects attempting to make changes in schools through similar efforts.

While the importance of actively engaging teachers in the evaluation of the school change process is apparent, some may wonder if teacher involvement might not become a program element, further reinforcing the effect. The answer is affirmative. But if that effect is the objective of the project, isn't that the ultimate goal of a formative evaluation? Evaluation of HERALD started as an external process studying an intervention, but became an internal force driving further change. Evaluator and teacher collaboration, therefore, can effect change in the classroom and beyond.

REFERENCES

Alkin, M.C. (1991). Evaluation theory development: II. In M.W. McLaughlin & D.C. Phillips (Eds.), Evaluation and education: At quarter century. The 90th yearbook of the National Society for the Study of Education, Part II (pp. 91-112). Chicago: University of Chicago Press.

Au, K.H., Scheu, J.A., Kawakami, A.J., & Herman, P.A. (1990). Assessment and accountability in a whole literacy curriculum. The Reading Teacher, 43(8), 574-578.

Calfee, R.C., Henry, K.K., & Funderburg, A. (1988). A model for school change. In S.J. Samuels & P.D. Pearson (Eds.), Changing school reading programs (pp. 121-141). Newark, DE: IRA.

Cousins, J.B., & Earl, L.M. (1992). The case for participatory evaluation. Educational Evaluation and Policy Analysis, 14(4), 397-418.

Cronbach, L.J. (1977). Remarks to the new society. Evaluation Research Society Newsletter, 1(1), 2.

Cross, P.K. (1987). Education reform in wonderland: Implementing education reform. Phi Delta Kappan, 68, 496-502.

Eisner, E.W. (1984). Can educational research inform educational practice? Phi Delta Kappan, 65(7), 447-452.

Florio-Ruane, S. (1990). The Written Literacy Forum: An analysis of teacher/researcher collaboration. Journal of Curriculum Studies, 22(4), 313-328.

Frechtling, J. (1991). Performance assessment: Moonstruck or the real thing? Educational Measurement: Issues and Practice, 10(4), 23-25.

Fullan, M.G. (1991). The new meaning of educational change. New York: Teachers College Press.

Goetz, J., & LeCompte, M. (1984). Ethnography and qualitative design in educational research. Orlando, FL: Academic Press.

Goswami, D., & Stillman, P. (1987). Reclaiming the classroom: Teacher research as an agency for change. Upper Montclair, NJ: Boynton/Cook.

Haas, N.S., Haladyna, T.M., & Nolen, S.B. (1989). Standardized testing in Arizona: Interviews and written comments from teachers and administrators (Tech. Rep. No. 89-3). Phoenix, AZ: Arizona State University West Campus.

Heath, S.B., & Branscombe, A. (1985). Intelligent writing in an audience community: Teacher, students, and researcher. In S.W. Freedman (Ed.), The acquisition of written language: Revision and response (pp. 3-32). Norwood, NJ: Ablex.

Heath, S.B., Branscombe, A., & Thomas, C. (1986). The book as narrative prop in language acquisition. In B. Schieffelin & P. Gilmore (Eds.), The acquisition of literacy: Ethnographic perspectives (pp. 16-34). Norwood, NJ: Ablex.

Hiebert, E., & Calfee, R.C. (1989). Advancing academic literacy through teachers' assessments. Educational Leadership, 46(7), 50-54.

Koziol, S.M., & Burns, P. (1986). Teachers' accuracy in self-reporting about instructional practices using a focused self-report inventory. Journal of Educational Research, 79(4), 205-209.

Lau, G., Low, P., Sato, N., & Stack, J. (1990). HERALD Year II evaluation: Final report. (Available from San Francisco Education Fund, San Francisco).

Lau, G. (1991). HERALD project effects on teachers. In G. Lau, P. Low, S. Mills, & J. Stack (Eds.), HERALD Year III evaluation: Final report (pp. 11-27). (Available from San Francisco Education Fund, San Francisco).

LeMahieu, P.G., & Asher, C. (1987). Teachers as institutional researchers: The development of a reflective capacity within schools. New York: Lehman College, The Institute for Literacy Studies.

LeMahieu, P.G., Eresh, J.T., & Wallace, R.C., Jr. (1993). Using portfolios to support public accounting. The School Administrator: Journal of the American Association of School Administrators, 49(11), 8-15.

LeMahieu, P.G., Gitomer, D.H., & Eresh, J.T. (1995). Portfolios in large scale assessment: Difficult but not impossible. Educational Measurement: Issues and Practice: Journal of the National Council on Measurement in Education, 14(3), 11-28.

Livingston, C., & Castle, S. (1989). Teachers using research: What does it mean? In C. Livingston & S. Castle (Eds.), Teachers and research in action (pp. 13-28). Washington, DC: National Education Association.

Low, P. (1991). Focus on student learning - portfolio evaluation. In G. Lau, P. Low, S. Mills, & J. Stack (Eds.), HERALD Year III evaluation: Final report (pp. 41-65). (Available from San Francisco Education Fund, San Francisco).

Low, P. (1993). Literate learners: The evolution of teacher research at the Breadloaf School of English. Doctoral dissertation, Stanford University, Stanford, CA.

Lytle, S.L., & Cochran-Smith, M. (1989). Teacher research: Toward clarifying the concept. The Quarterly of the National Writing Project and the Center for the Study of Writing, 11(2), 1-27.

Mills, S. (1991). Organizational impact of the HERALD project. In G. Lau, P. Low, S. Mills, & J. Stack (Eds.), HERALD Year III evaluation: Final report (pp. 29-40). (Available from San Francisco Education Fund, San Francisco).

Mitchell, R. (1989). A sampler of authentic assessment: What it is and what it looks like. Paper prepared for the 1989 Curriculum/Assessment Alignment Conference, Sacramento and Long Beach, CA.

Newman, F.M. (1990). Linking restructuring to authentic student achievement. Paper presented at the Indiana University Annual Education Conference, Bloomington, IN.

Olson, L. (1990). McArthur awards $1.3 million for National Exams. Education Week, December 12, 5.

Patton, M.Q. (1986). Utilization-focused evaluation. Beverly Hills, CA: Sage.

Powell, M. (1990). Performance assessment: Panacea or Pandora's box. Rockville, MD: Montgomery County Public Schools.

Resnick, L., & Tucker, M. (1991). New standards development proposal: Assessment to support the thinking curriculum. Pittsburgh, PA: University of Pittsburgh, Learning Research and Development Center.

Sirotnik, K.A., & Clark, R.W. (1988). School-centered decision-making and renewal. Phi Delta Kappan, 69, 660-664.

Smith, M.L., Edelsky, C., Draper, K., Rottenberg, C., & Cherland, M. (1989). The role of testing in elementary schools. Los Angeles, CA: Center for Research on Educational Standards and Student Tests, Graduate School of Education, UCLA.

Smith, M.L. (1991). Put to the test: The effects of external testing on teachers. Educational Researcher, 20(5), 8-11.

Tyack, D. (1990). "Restructuring" in historical perspective: Tinkering toward Utopia. Teachers College Record, 92(2), 170-191.

Valencia, S.W. (1990). Alternative assessment: Separating the wheat from the chaff. The Reading Teacher, 44(1), 60-61.

Wiggins, G. (1989). A true test: Toward more authentic and equitable assessment. Phi Delta Kappan, 70(9), 703-713.