Guba, Egon: Fourth Generation Evaluation

Upload: tesalianow

Post on 14-Apr-2018


  • 7/30/2019 Guba, Egon_Fourth Generation

    1/17

    I know of no safe depository of the ultimate powers of the society but the people themselves; and if we think them not enlightened enough to exercise their control with a wholesome discretion, the remedy is not to take it from them, but to inform their discretion.

    - Thomas Jefferson, Letter to William Charles Jarvis, September 28, 1820

    Evaluation is an investment in people and in progress.

    Fourth Generation Evaluation
    Egon G. Guba
    Yvonna S. Lincoln


    SAGE PUBLICATIONS
    The International Professional Publishers
    Newbury Park  London  New Delhi


    Copyright © 1989 by Sage Publications, Inc. All rights reserved. No part of this book may be reproduced or utilized in any form or by any means, electronic or mechanical, including photocopying, recording, or by any information storage and retrieval system, without permission in writing from the publisher.

    For information address:

    SAGE Publications, Inc.
    2455 Teller Road
    Newbury Park, California 91320

    SAGE Publications Ltd.
    6 Bonhill Street
    London EC2A 4PU
    United Kingdom

    SAGE Publications India Pvt. Ltd.
    M-32 Market
    Greater Kailash I
    New Delhi 110048 India

    Printed in the United States of America

    Library of Congress Cataloging-in-Publication Data

    Guba, Egon G.
      Fourth generation evaluation / Egon G. Guba and Yvonna S. Lincoln.
        p. cm.
      Bibliography: p.
      Includes index.
      ISBN 0-80111-1215/1
      1. Evaluation - Methodology. 2. Negotiation. I. Lincoln, Yvonna S. II. Title. III. Title: 4th generation evaluation.

    89-10426
    CIP

    92 93 94 95 14 13 12 11 10 9 8 7 6 5

    Contents

    Foreword 7
    Acknowledgements 20
    1. The Coming of Age of Evaluation 21
    2. What Is Fourth Generation Evaluation? Why Should We Choose to Practice It? 50
    3. What Is This Constructivist Paradigm Anyway? 79
    4. Ethics and Politics: The Twin Failures of Positivist Science 117
    5. Constructions and Reconstructions of Realities 142
    6. Paradigms and Methodologies 156
    7. The Methodology of Fourth Generation Evaluation 184
    8. Judging the Quality of Fourth Generation Evaluation 228
    9. Putting It All Together so That It Spells E-V-A-L-U-A-T-I-O-N 252
    References 271
    Index 281
    About the Authors 293


    Acknowledgements

    We gratefully acknowledge permission to quote from the following sources:

    Joe David Bellamy, Ruth Bleier, J. R. Brown, Lee J. Cronbach, P. C. W. Davies, A. Michael Huberman, Joint Committee on Standards for Educational Evaluation, H. Earle Knowlton, Matthew B. Miles, Marion Namenwirth, Michael Q. Patton, Peter Reason, Shulamit Reinharz, John Rowan, Graham D. Rowles, J. Sanders, Thomas A. Schwandt, Thomas Skrtic, Robert E. Stake, Robert M. W. Travers, J. Wagner, David D. Williams, B. Worthen, Gary Zukav

    1. The Coming of Age of Evaluation

    If this is to be a book about evaluation, it would seem reasonable to begin with a definition of what we shall mean by that term. But to propose a definition at this point, aside from being arbitrary and preemptive, would be counterproductive to the book's central themes. For we will argue that there is no "right" way to define evaluation, a way that, if it could be found, would forever put an end to argumentation about how evaluation is to proceed and what its purposes are. We take definitions of evaluation to be human mental constructions, whose correspondence to some "reality" is not and cannot be an issue. There is no answer to the question, "But what is evaluation really?" and there is no point in asking it.

    AUTHORS' NOTE: Much of the material in this chapter is drawn from our paper "The Countenances of Fourth Generation Evaluation: Description, Judgment, and Negotiation" (pp. 70-80 in Evaluation Studies Review Annual, Volume 11, Newbury Park, CA: Sage, 1986). A slightly different version appears in The Politics of Evaluation (Dennis J. Palumbo, editor, Newbury Park, CA: Sage, 1988).



    22 FOURTH GENERATION EVALUATION

    Instead we will begin by sketching briefly the changed meanings that have been ascribed to evaluation for the past hundred years, ascriptions that have reflected the existing historical context, the purposes that people had in mind for doing evaluations, and the philosophic assumptions that evaluators, theoreticians, and practitioners alike have been willing to make. We will argue that, over time, the construction of evaluation has become more informed and sophisticated, until, at this present time, we are in a position to devise a radical new construction, which we characterize as fourth generation evaluation. There is, of course, no consensus about this form of evaluation, as there has not been on earlier forms - a state of affairs that a brief glance through any standard text on evaluation quickly verifies. But we offer it as a construction that we believe counters, or at least ameliorates, the imperfections, gaps, and naivete of earlier formulations.

    We do not believe that we have stumbled upon the ultimately correctformulation. We are prepared, however, to argue that the constructionthat we have labeled fourth generation evaluation is more informed andsophisticated than previous constructions have been. But like thoseearlier forms, this form will sooner or later also prove to be inadequate insome way, and will require revision, refinement, extension, and probablyeven complete replacement. Indeed, we take it to be our obligation toseek out aspects of evaluation that this form does not handle well, in acontinuing effort at reconstruction.

    On that note, we may begin.

    The First Generation: Measurement

    Evaluation as we know it did not simply appear one day; it is the result of a developmental process of construction and reconstruction that involves a number of interacting influences.

    Chief among the early influences is the measurement of various attributes of schoolchildren. School tests had been utilized for hundreds of years to determine whether students had "mastered" the content of the various courses or subjects to which they were exposed. Appropriate content was defined by reference to authority, whether Aristotle, the Bible, or, most recently, the findings of science. The major purpose of the school was to teach children what was known to be true; children demonstrated mastery of those "facts" by regurgitating them on what were essentially tests of memory. The earliest school tests were administered chiefly orally, one student at a time, and required, had they been written down, "essay-type" answers.

    It is thus not surprising that the first published example of educational research, "The Futility of the Spelling Grind" (Rice, 1897), depended on test scores for its data. Joseph Mayer Rice was appalled at the fact that all school time was devoted, at least in American schools, to what we would nowadays call the "basics." Rice felt that, if schools could be made more efficient, that is, if the same basic learning could occur in less time, the curriculum could be expanded to include art, music, and other subjects that he wanted to see included (the "frills"). After an abortive effort to attack the problem curriculumwide (which failed because he could not establish adequate field controls for so vast an enterprise), Rice focused on spelling as a prime example. He devised a spelling test that he himself administered in multiple schools widely scattered geographically, also collecting from each school data about the amount of time devoted to the teaching of spelling. His subsequent analysis indicated that there was virtually no relationship between time spent studying spelling and subsequent achievement on the test. The scores achieved by pupils were taken as concrete evidence of the degree of their achievement.
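    A modern reader can sketch the kind of relationship Rice was probing with the product-moment correlation coefficient mentioned in the next paragraph. The per-school figures below are invented for illustration only; Rice worked from tabulated school averages, not from this statistic in its modern form.

    ```python
    # Hypothetical per-school data: daily minutes of spelling drill vs. mean test score.
    minutes = [15, 20, 25, 30, 40, 50, 60]
    scores = [71, 74, 69, 73, 70, 72, 71]

    def pearson_r(xs, ys):
        """Pearson product-moment correlation of two equal-length sequences."""
        n = len(xs)
        mean_x, mean_y = sum(xs) / n, sum(ys) / n
        cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
        sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
        return cov / (sd_x * sd_y)

    # A coefficient near zero mirrors Rice's "virtually no relationship" finding.
    print(f"r = {pearson_r(minutes, scores):.2f}")
    ```

    With data like these, the coefficient hovers near zero: extra drill time buys essentially no gain in measured achievement.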

    Another application of testing that was to have major import occurred in France. The French national minister of education, harried by teachers demanding that he find some way to screen out mentally retarded youngsters, who, it was said, were making it impossible to teach "normal" children (a kind of reverse mainstreaming, as it were), asked a psychologist, Alfred Binet, to devise a test for that purpose. Binet tried first to utilize the psychometric measurement techniques that had been perfected in England by Francis Galton (whose staff member, Karl Pearson, had earlier invented the product-moment correlation coefficient as a means to analyze data) and in Germany by Wilhelm Wundt. When these techniques did not prove to be successful, Binet devised a new approach, based on the commonsense observation that mentally retarded youngsters would not be able to cope with simple life situations such as counting money or identifying household objects as well as their normal

  • 7/30/2019 Guba, Egon_Fourth Generation

    5/17


    counterparts. Binet was ultimately able to organize his tasks according to the age of subjects typically able to complete them, coining the term "mental age" in the bargain. By 1912 it had become commonplace to divide the achieved mental age by the subject's chronological age to determine the "intelligence quotient." The Binet test leaped across the Atlantic in 1910 via a translation by Henry Goddard; when, in 1916, Lewis Terman revised and renormed the Binet (now called the Stanford-Binet) for use with American children, the IQ test had become a permanent part of the American system.
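    The ratio just described reduces to simple arithmetic. The sketch below uses hypothetical ages, together with the conventional scaling by 100 that turned the raw quotient into the familiar IQ score:

    ```python
    def ratio_iq(mental_age: float, chronological_age: float) -> float:
        """Classic ratio IQ: (mental age / chronological age) * 100."""
        return 100 * mental_age / chronological_age

    # A hypothetical 8-year-old performing at the level of a typical 10-year-old:
    print(ratio_iq(10, 8))  # 125.0
    ```

    A child performing exactly at age level scores 100 by construction; performing above or below age level pushes the quotient correspondingly above or below 100.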

    The utility of tests for school purposes was well recognized by leadership personnel. The National Education Association appointed a committee in 1904 to study the use of tests in classifying children and determining their progress; the association appointed three additional committees by 1911. In 1912, the first school district Bureau of Research was established in New York City; its mandate was to conduct a continuing survey of the system, using the "new measurement techniques." Similar bureaus were soon established in other major cities; the directors of these bureaus, who often held titles such as assistant superintendent, began to meet annually in conjunction with the American Association of School Administrators; later they organized themselves more formally into the American Educational Research Association.

    Probably the single most important influence leading to the rapid advance and acceptance of mental tests was the need to screen personnel for induction into the armed services in World War I. Military leaders enlisted the support of the American Psychological Association to devise an appropriate instrument. The APA appointed a committee, chaired by Arthur Otis, to undertake this task, which they accomplished in remarkably short time as well as with distinction. This first group intelligence test, the Army Alpha, was successfully administered to more than 2 million men. Encouraged by this success, Otis undertook to revise and adapt the instrument for use in the schools.

    Several factors that appeared to be only indirectly related to testing were also destined to play major roles in the development of this first generation of evaluation. The first of these contextual factors was the legitimation provided by the phenomenal rise of social science. When John Stuart Mill called in 1843 for the application of "the science approach" to the study of human/social phenomena, a call based on the enormous successes of that approach in the physics and chemistry of the late eighteenth and early nineteenth centuries and on the lack of a systematic base for "human" studies, he could scarcely have foreseen the enthusiasm with which his suggestion would be greeted, or the far-reaching consequences that its adoption would have. The first major efforts to be "scientific" were stimulated by Darwin's thesis that even small differences in animal or plant structure could, when accumulated over long time periods, have very significant functional consequences for the species. If that is so, social scientists began to reason, then perhaps small differences in humans might also be a key to understanding major developmental patterns in humans. It was for this reason, among others, that the already mentioned psychometric laboratories were established by Galton, in 1873, and by Wundt, in 1879. Findings such as that individual differences in reaction times were typical of human subjects suggested that these investigators were on the right trail. Psychology in particular became wedded to the new scientific approach, attempting to emulate the physical sciences as closely as possible. Of course, this intent was well served by the apparently precise quantitative measurements that tests were yielding. When, by the mid-1920s, Ronald Fisher, working as statistician for the British cotton industry, had devised the basic analytic tools together with the mathematical tables needed to interpret their results, the social sciences, including education, were treading closely in the footsteps of their much-admired hard science counterparts.

    A second contextual factor stimulating testing was the emergence of the scientific management movement in business and industry. If human beings are the major element in the production of goods and services, the task of the manager is to make their work as effective and efficient as possible. Beginning before World War I but coming into full flower in the 1920s, the movement relied heavily on time and motion studies to determine the most productive methods of working and on piecework wage rates to make the workers willing to submit themselves to such an arduous and personally unrewarding discipline. By the time that view was critically challenged by the Hawthorne studies (Roethlisberger & Dickson, 1939), the ethos of scientific management had also penetrated the schools. Pupils were seen as "raw material" to be "processed" in the


    working as intended. It would, after all, not be a fair test if students failed in college not because the curricula were in principle inadequate but only because they were so in practice. By a serendipitous coincidence, Ralph W. Tyler, a member of the Bureau of Educational Research at Ohio State University, the campus on which the Eight Year Study was headquartered, had for several years been working with selected Ohio State faculty to develop tests that would measure whether or not the students learned what their professors had intended them to learn. These desired learning outcomes were labeled objectives. Tyler was engaged to carry out the same kind of work with the Eight Year Study secondary schools, but with one important variation from conventional evaluation (measurement): the purpose of the studies would be to refine the developing curricula and make sure they were working. Program evaluation was born.

    As the participant secondary schools began devising their new curricula, Tyler collected information about the extent of achievement of their defined objectives by the pupils in the programs. This information, together with an analysis of the strengths and weaknesses that thereby became apparent, was utilized to guide refinements and revisions - a process we today would call formative evaluation, except that the results were not available until after rather than during a trial. This process was reiterated over successive course offerings until the curriculum was found to produce an appropriate level of achievement.

    Thus there emerged what we now choose to call second generation evaluation, an approach characterized by description of patterns of strengths and weaknesses with respect to certain stated objectives. The role of the evaluator was that of describer, although the earlier technical aspects of that role were also retained. Measurement was no longer treated as the equivalent of evaluation but was redefined as one of several tools that might be used in its service. When the report of the Eight Year Study was published in 1942, the third volume, which described the evaluation activities of the project (Smith & Tyler, 1942), drew widespread attention. Like Lord Byron, Tyler awoke one morning to find himself famous. Later he was to be recognized as the "Father of Evaluation" (Joint Committee, 1981).


    The Third Generation: Judgment

    The objectives-oriented descriptive approach had some serious flaws, although they were not very noticeable until the post-Sputnik period (1957). Then it proved inadequate to the task of evaluating the federal government's response to the putative deficiencies of American education that had allowed the Russians to gain a march in space exploration: the course content improvement programs of the National Science Foundation (BSCS Biology, Project CHEM, PSSC Physics, and SMSG Mathematics) and of the then Office of Education (Project English and Project Social Studies). When evaluators appointed to these project staffs insisted that they could not begin working until they had project objectives in hand, they were dismissed by the program developers (who, it should be recalled, were practicing physicists, chemists, biologists, and mathematicians in the earlier NSF projects, not science educators or mathematics educators) as irrelevant. These developers feared to commit themselves to objectives until they had a clearer picture of what they were doing, and they did not want to state even provisional objectives that might later be found to have closed off their creativity prematurely. Moreover, they could not brook an evaluation strategy that would not produce results until after the program had been completely developed; if an evaluation then showed deficiencies, it was in many ways too late to do anything about them (recalling especially the sense of national crisis under which they worked). These problems are well documented in Cronbach's now classic "Course Improvement Through Evaluation" (1963).

    But the dissenter from what had by this time become the accepted mode of evaluation had another, even more important, criticism to level.

    Since it was essentially descriptive in nature, second generation Tylerian evaluation neglected what Robert Stake in his earlier-cited 1967 paper called the other countenance or face of evaluation: judgment. Stake noted:

    The countenance of evaluation beheld by the educator is not the same one beheld by the specialist in evaluation. The specialist sees himself


    as a "describer," one who describes aptitudes and environment and accomplishments. The teacher and the school administrator, on the other hand, expect an evaluator to grade something or someone as to merit. Moreover, they expect that he will judge things against external standards, on criteria perhaps little related to the local school's resources. Neither sees evaluation broadly enough. Both description and judgment are essential - in fact, they are the two basic acts of evaluation. (cited in Worthen & Sanders, 1973, p. 109)

    The call to include judgment in the act of evaluation marked the emergence of third generation evaluation, a generation in which evaluation was characterized by efforts to reach judgments, and in which the evaluator assumed the role of judge, while retaining the earlier technical and descriptive functions as well. This call, widely echoed in the profession, notably by Michael Scriven (1967), exposed several problems that had not been dealt with adequately in earlier generations. First, it required that the objectives themselves be taken as problematic; goals no less than performance were to be subject to evaluation. As a wag pointed out, something not worth doing at all is certainly not worth doing well. Further, judgment, as Stake pointed out, requires standards against which the judgment can be made. But the inclusion of standards that must, by definition of the genre, be value-laden into a scientific and putatively value-free enterprise such as evaluation was repugnant to most evaluators. Finally, if there is to be judgment, there must be a judge. Evaluators did not feel competent to act in that capacity, felt it presumptuous to do so, and feared the political vulnerability to which it exposed them. Nevertheless, they were urged to accept that obligation, largely on the ground that, among all possible judge candidates, the evaluators were without doubt the most objective (Scriven, 1967).

    In the final analysis the call to judgment could not be ignored, and evaluators soon rose to the challenge. A bevy of new evaluation models sprang up in 1967 and thereafter: neo-Tylerian models, including Stake's own Countenance Model (1967) and the Discrepancy Evaluation Model (Provus, 1971); decision-oriented models such as CIPP (Stufflebeam et al., 1971); effects-oriented models such as the Goal-Free Model (Scriven, 1973); neomeasurement models in the guise of social experimentation (Boruch, 1974; Campbell, 1969; Rivlin & Timpane, 1975; Rossi & Williams, 1972); and models that were directly judgmental, such as the Connoisseurship Model (Eisner, 1979). All of these post-1967 models agreed on one point, however: Judgment was an integral part of evaluation. All urged, more or less explicitly, that the evaluator be judge. There were differences in the extent to which evaluators were represented as appropriate judges, ranging from the tentativeness of decision-oriented models - whose proponents hesitated to advocate an aggressive judgmental role because that seemingly co-opted the very decisionmakers whom the evaluations were ostensibly to serve - through DEM advocates - who saw their role as helping the client determine standards for judgment - to the assertiveness of the advocates of judgmental models in which the evaluator was chosen precisely because of his or her connoisseurship qualities. Nevertheless, it seems fair to say that during the decade and more following 1967, judgment became the hallmark of third generation evaluators.

    Pervasive Problems of the First Three Generations

    Although the preceding discussion of the first three generations of evaluation has been brief, it is sufficient to demonstrate that each succeeding generation represented a step forward, both in the range of substance or content included in the construction held and in its level of sophistication. Collection of data from individuals was not systematically possible until the development of appropriate instruments of the sort that characterized the first generation. But evaluation would have stagnated at that level had not the second generation shown the way to evaluate the many nonhuman evaluands as well - the programs, materials, teaching strategies, organizational patterns, and "treatments" in general. The third generation required that evaluation lead to judgment, both about an evaluand's merit - its inner or intrinsic value - and about its worth - its extrinsic or contextual value (Guba & Lincoln, 1981). But all three generations, as a group, have suffered and continue to suffer from certain flaws or defects sufficiently serious to warrant raising the question whether additional refinements - or even a complete reconstruction - may not now be needed. We believe there are at least three such major flaws or defects: a tendency toward managerialism, a failure


    to accommodate value-pluralism, and overcommitment to the scientific paradigm of inquiry. We consider each briefly below; we shall return to these themes frequently throughout this book.

    A tendency toward managerialism. The term manager includes a variety of individuals, but most often it will denote the clients or sponsors who commission or fund an evaluation as well as the leadership personnel to whom the agents responsible for implementing the evaluand (not the evaluation) report. The latter category includes, for example, the school board members, the superintendent, and the principals (sometimes) in whose schools a curricular innovation is being tried; the administrator of a hospital and the director of nursing services in a hospital in which new care modes for oncology patients are being instituted; or the manager of a social agency in which several new programs intended to provide leisure opportunities for the disabled are being compared. It is the manager(s) with whom the evaluator typically contracts for an evaluation, to whom he or she defers in setting parameters and boundaries for the study, and to whom he or she reports. This traditional relationship between managers and evaluators is rarely challenged; yet it yields a number of highly undesirable consequences.

    First, given such an arrangement, the manager is effectively saved harmless. Insofar as the manager stands outside the evaluation, his or her managerial qualities and practices cannot be called into question, nor can the manager be held accountable for what the evaluand does or does not produce. If there is a failure, the evaluation will necessarily point the finger of blame elsewhere.

    Second, the typical manager/evaluator relationship is disempowering and unfair. The manager has the ultimate power to determine what questions the evaluation will pursue, how the answers will be collected and interpreted, and to whom the findings will be disseminated. Of course, these matters are often settled in consultation with the evaluator, but in case of disagreement, the final decision is the manager's. The only recourse the evaluator has is to refuse to conduct an evaluation under conditions not to his or her liking. This state of affairs in effect disempowers stakeholders who may have other questions to be answered, other ways of answering them, and other interpretations to make about them. It is difficult if not impossible to conduct an evaluation in any of the first three generation modes that is open to inputs from other stakeholder groups. The entire process is patently unfair to those other groups, whose potential inputs are neither solicited nor honored, while the manager is elevated to a position of greatest power.

    Third, the typical manager/evaluator relationship is disenfranchising. Frequently the manager retains the right, contractually, to determine if the evaluation findings are to be released, and, if so, to whom. It has not been uncommon for evaluators to trade information release power for the right to produce whatever report the evaluators see fit. It seemed a reasonable exchange: the evaluator protected his or her integrity by retaining editorial prerogatives, while the manager in turn decided on dissemination issues. But those stakeholders who remained ignorant of the findings were effectively prevented from taking whatever actions those findings might have suggested to them, including, and most important, the protection of their own interests. They were denied privilege of information and hence their rights.

    Finally, the typical manager/evaluator relationship is very likely to become a cozy one. To concede to the manager the right to determine the form of an evaluation is, in a very real sense, to enter into collusion with him or her. There are obvious advantages to both manager and evaluator to engage in such collusion. On the manager's side, an evaluation conducted in ways that save the manager harmless, while disempowering and disenfranchising possible rivals, is clearly preferable to one that holds the manager accountable and makes it possible for rivals to assume some modicum of power. On the evaluator's side, an evaluation done in ways that gain the manager's approval is likely to lead to other contracts and ensure a steady source of income. Henry M. Brickell once noted that an evaluator perforce engages in a delicate balancing act: "Biting the hand that feeds you while appearing only to be licking it." That balance is maintained more easily if the evaluator decides not to bite at all. While the vast majority of evaluators probably would hesitate to engage consciously in collusion, it was nevertheless all too easy to slip into such a state of affairs in any of the first three generations.

    Michael Scriven has written extensively on the problem of managerialism (1983); his solution to the problem is to engage in a form of evaluation that asks questions of putative interest to the consumer and


    that reports to that group. He projects evaluations that parallel the kinds of analyses found in Consumer Reports. That approach does represent an important step forward in that it recognizes some group other than managers as important. But there is no compelling reason to exclude managers simply to avoid the possibility of managerialism; there are better ways to deal with that problem, as we shall show. Consumerism does add one stakeholding audience to the mix; but others, including managers, remain that are not included in the consumerism approach.

    Failure to accommodate value-pluralism. It is common to believe that societies share values, that there is some value set that characterizes members of a society to which all members are acculturated and subscribe. The concept of the "great melting pot" that assimilated immigrants and somehow turned them into Americans is an example. Another is that schools are expected to teach "our heritage," a phrase that includes the idea that our heritage is shared. A third: It is commonly asserted that our moral system is based on the "Judeo-Christian ethic."

    It has been only during the past twenty years that we have come to understand that this society, our society, is essentially value-pluralistic. Lessons about value pluralism were brought home to all of us in the latter 1960s, which witnessed not only traditional rivalries as between political parties but also ethnic, gender, and, alas, even generational conflicts that seemingly could not be resolved.

    The call to judgment in evaluation first came at about the same time that an appreciation of value-pluralism emerged. Of course, values had been implicit in evaluation since its first use; indeed, the very term evaluation is linguistically rooted in the term value. But it was easy to overlook the fact that even the development of an "objective" instrument involved value judgments, or that the delineation of objectives implied value agreement, so long as the question of value differences was not raised. But once raised it could not be stuffed back into its container. The question of whose values would dominate in an evaluation, or, alternatively, how value differences might be negotiated, now emerged as the major problem.

    It had long been argued that, despite the existence of value differences, the findings of an evaluation could be trusted because the methodology used is scientific and science is demonstrably value-free. The whole point of demanding objectivity is to obviate the question of value influence. Of course, it was admitted, the evaluator has no control over how evaluation findings are used; if persons with different values choose to interpret the factual findings in different ways, the evaluator can hardly be held accountable. But, as we shall see, the assertion that science is value-free can be seriously challenged. If science is not value-free, then it is the case not only that findings are subject to different interpretations but that the "facts" themselves are determined in interaction with the value system the evaluator (probably unknowingly) brings to bear. Then every act of evaluation becomes a political act. Indeed, every act of inquiry, whether evaluation, research, or policy analysis, becomes a political act in this sense.

The assertion of value-freedom for evaluations is completely resonant (reinforcing) with the managerial tendency already described. If values make no difference, then the findings of an evaluation represent states of affairs as they really are; they must be accepted as objective truths. The fact that the manager sets the boundaries and parameters for the study would then be relatively inconsequential, as would be the fact that he or she controls the questions asked, the methodology, and the findings.

The claim of value-freedom is not tenable, as we shall show. And if that is the case, then the value-pluralism of our society is a crucial matter to be attended to in an evaluation. None of the evaluation approaches of the first three generations accommodates value differences in the slightest.

Overcommitment to the scientific paradigm of inquiry. As we have noted, practitioners of the social sciences have followed Mill's advice to emulate the methods of the physical sciences with conviction and enthusiasm. There are multiple reasons for this strong positive response, among them the spectacular successes that have been enjoyed in the physical sciences, the desire of social scientists to be rational and systematic, in the spirit of Descartes ("I think, therefore I am") and of positivism generally, and the need to achieve legitimation as a profession by following as rigorously as possible the methodology that characterized their hard-science counterparts.

The premises of the scientific method were themselves siren songs, since they seemed so self-evidently true. There is an objective reality "out there" that goes on about its business regardless of our interest in it; this reality operates according to certain immutable natural laws. It is the business of science to describe that reality and to uncover those laws. Once that is done, science can be used to exploit nature to humankind's advantage; we become able to predict and control at will. Each act of inquiry brings us closer to understanding ultimate reality; eventually we will be able to converge on it. But in order to understand reality and its laws fully, the investigator must be able to stand outside (a neutral distance from) the phenomenon being studied, so as not to influence it (which would keep us from seeing things as they really are and work) or be influenced in our judgment by it. We must also be mindful that nature is tricky and has many tactics to obfuscate the search for truth. The buzzing, bumbling confusion of nature-in-the-raw must be carefully controlled to avoid the confounding of results that will otherwise surely occur. Thus the investigator must control the phenomenon, either through manipulation, as in a laboratory, or statistically, as is the case in most human studies. To yield control is to ensure spurious results.

Virtually every first, second, or third generation evaluation model uses the scientific paradigm to guide its methodological work (an exception is the Eisner Connoisseurship Model, which purports to follow a humanistic paradigm). But this extreme dependence on the methods of science has had unfortunate results. First, it has led to what some have called "context-stripping," that is, assessing the evaluand as though it did not exist in a context but only under the carefully controlled conditions that are in force after a design is implemented. Such conditions are instituted in the hope that irrelevant local factors can be swept aside, and more generalizable results obtained (it unfortunately being the case that many evaluations are commissioned to determine the generalizable qualities of the evaluands). We shall argue later that the motivation for such context stripping is in any event mistaken, in that generalizations are not possible. But for now let us note that, when attention is paid only to general factors, the local situation, existing as it does in the original unstripped context, cannot be well served by the evaluation results. Moreover, the resources needed to institute and maintain controls offset other uses to which the inquiry might have been put; thus the range of available information is truncated. Surely this effort to derive general truths through context-stripping (control) is one of the reasons why evaluations are so often found to be irrelevant at the local level, leading to the much lamented nonuse of evaluation findings about which we, as a profession, seem so fond of complaining. No one of the first three generations deals with this problem.

Second, commitment to the scientific paradigm inevitably seems to lead to an overdependence on formal quantitative measurement. The rigor that that paradigm appears to promise rests on the "hardness" of the data that are fed into the process. Hard data implies quantifiable data, data that can be measured with precision and analyzed with powerful mathematical and statistical tools. Quantifiable data also ease the problems associated with prediction and control; they can easily be inserted in formulas especially designed for those tasks. After a time these measuring instruments take on a life of their own; while initially intended as "operationalizations" of scientific variables, they become, in the end, the variables themselves. It follows that what cannot be measured cannot be real. No one of the first three generations deals with this problem.

Third, since the methods of science promise to provide us with information about the way things really are, they claim a certain authority that is hard to resist. Hannah Arendt (1963) has noted this "coerciveness of truth." Truth is nonnegotiable. As evaluators using the scientific method, we can assure our clients that nature herself has provided us the data we in turn present; there is no arguing with them or denying them. None of our values, the client's, or anyone else's can have influenced the outcome. We as evaluators can take on the authority with which nature has clothed us as her lawful messengers. If the status quo reflects nature's own laws, it exists by a kind of divine right. It is easy, then, to see how the scientific method reinforces and supports the managerial tendencies we noted earlier. Anything being evaluated that is supported by positivistic (scientific) evaluation is locked in as the right thing to do. And we evaluators, as messengers, are not accountable for what nature has decreed. Both manager and evaluator are rendered unassailable. No one of the first three generations deals with this problem.

Fourth, use of the scientific method closes out alternative ways to think about the evaluand. Since science discloses the truth about things, any other alternatives must be in error. Evaluators, clients, and stakeholders alike are all forced to be "true believers," because science has that authority that comes with being able to discover how things really are and really work. The worst thing that can be said about any assertion in our culture is that there is no scientific evidence to support it; conversely, when there is scientific evidence, we must accept it at face value. Perfectly reasonable alternatives cannot in good conscience be entertained. There are no negotiations possible about what is true. No one of the first three generations deals with this problem.

Finally, because science is putatively value-free, adherence to the scientific paradigm relieves the evaluator of any moral responsibility for his or her actions. One cannot be faulted for just telling the truth, for giving the facts, for "callin' 'em as we sees 'em," or for "letting the chips fall where they may." It is easy to argue that the evaluator cannot control how evaluation findings are used. It is also easy to assert that the evaluator has no responsibility to follow up on evaluation; his or her role ends when the report is delivered. In any event, the evaluator (messenger) cannot be held responsible for findings (the message) that simply reflect what exists in nature. No one of the three generations holds the evaluator morally responsible for whatever emerges from the evaluation or for the uses to which the findings may be put.

An Alternative Approach

We hope to have made it clear that an alternative approach to evaluation (indeed, an alternative in the very meaning of the term) is desperately needed. We shall propose one, which we designate by the perhaps clumsy but nevertheless highly descriptive title of responsive constructivist evaluation. The term responsive is used to designate a different way of focusing an evaluation, that is, deciding on what we have been calling its parameters and boundaries. In the models included in the first three generations, parameters and boundaries have been established a priori; their specification, usually accomplished through negotiations between client and evaluator, is part of the design process. Robert Stake coined the term preordinate evaluation as a way of signaling this a priori quality. Responsive evaluation, also first proposed by Stake (1975), determines parameters and boundaries through an interactive, negotiated process that involves stakeholders and that consumes a considerable portion of the time and resources available. It is for this reason, among others, that the design of a responsive evaluation is said to be emergent.

The term constructivist is used to designate the methodology actually employed in doing an evaluation. It has its roots in an inquiry paradigm that is an alternative to the scientific paradigm; we choose to call it the constructivist paradigm, but it has many other names, including interpretive and hermeneutic. Each of these terms contributes some specific insight into the nature of this paradigm, and we shall use them all at different places in this book. In this section we introduce a few of the leading ideas that characterize, respectively, a responsive mode of focusing and a constructivist mode of doing. Each is explored in much greater detail as the book unfolds.

The responsive mode of focusing. The algorithm for any evaluation process must begin with a method for determining what questions are to be raised and what information is to be gathered. In the case of first generation evaluation, certain variables are identified, and the information to be gathered consists of individual scores on instruments that putatively measure those variables (frequently, in school settings, achievement scores). In the case of second generation evaluation, certain objectives are identified; the information to be collected consists of assessment of the congruence between pupil performance and the described objectives. In the case of third generation evaluation, various models call for different information; thus decision-oriented models such as CIPP require information that services the decisions to be made in a timely manner (usually that implies collecting comparable information for each of the decision alternatives); goal-free models call for information about experienced "effects"; the connoisseurship model calls for judgments in relation to certain critical guideposts internalized by connoisseur/critics through training and experience; and the like. These focusing elements (variables, objectives, decisions, and the like) may be called "advance organizers"; the organizer that an evaluator is using becomes apparent as soon as the evaluator raises such questions as "what are your objectives?" or "what decisions must this evaluation inform?" and the like.

Responsive evaluation has its advance organizer as well: the claims, concerns, and issues about the evaluand that are identified by stakeholders, that is, persons or groups that are put at some risk by the evaluation. A claim is any assertion that a stakeholder may introduce that is favorable to the evaluand, for example, that a particular mode of reading instruction will result in more than a year's gain in standard test reading scores for every year of classroom use, or that a particular mode of handling domestic disturbance calls by police will materially reduce recidivism in offenders. A concern is any assertion that a stakeholder may introduce that is unfavorable to the evaluand, for example, that instruction in the use of a computer materially reduces pupils' ability to do computations by hand, or that use of the evaluand will result in a great deal more "homework" time for teachers. An issue is any state of affairs about which reasonable persons may disagree, for example, the introduction of education about AIDS into the elementary schools, or the use of school property to conduct classes in religion. Different stakeholders will harbor different claims, concerns, and issues; it is the task of the evaluator to ferret these out and to address them in the evaluation.

There are always many different stakeholders. In Effective Evaluation (Guba & Lincoln, 1981), we identified three broad classes, each with some subtypes:

1. The agents, those persons involved in producing, using, and implementing the evaluand. These agents include:
   a. the developers of the evaluand
   b. the funders, local, regional, and national
   c. local needs assessors who identified the need that the evaluand will putatively ameliorate or remove
   d. decision makers who determined to utilize or develop the evaluand locally
   e. the providers of facilities, supplies, and materials
   f. the client for the evaluation itself (the contractor)
   g. the personnel engaged in implementing the evaluand, such as classroom teachers, halfway house staff, police officers, nurses, and the like

2. The beneficiaries, those persons who profit in some way from the use of the evaluand. These beneficiaries include:
   a. the direct beneficiaries, the "target group," the persons for whom the evaluand was designed
   b. indirect beneficiaries, persons whose relationship with the direct beneficiaries is mediated, eased, enhanced, or otherwise positively influenced
   c. persons who gain by the fact that the evaluand is in use, such as publishers of the materials, contractors who provide needed services, and the like

3. The victims, those persons who are negatively affected by the use of the evaluand (which may include, because of some failure in the evaluand, one or more putative beneficiary groups). These victims include:
   a. groups systematically excluded from the use of the evaluand, such as "normal" youngsters excluded from special programming for the gifted
   b. groups that suffer negative side effects, such as students, and their parents, who are bused to a distant school so that disadvantaged youngsters may occupy their places in the original school
   c. persons who are politically disadvantaged by the use of the evaluand, such as those suffering losses in power, influence, or prestige
   d. persons who suffer opportunity costs for forgone opportunities as a result of the use of the evaluand, such as persons who would have elected to devote the necessary resources to some other venture, or publishers of rival materials

Responsive evaluation is responsive not only because it seeks out different stakeholder views but also because it responds to those items in the subsequent collection of information. It is quite likely that different stakeholders will hold very different constructions with respect to any particular claim, concern, or issue. As we shall see, one of the major tasks of the evaluator is to conduct the evaluation in such a way that each group must confront and deal with the constructions of all the others, a process we shall refer to as a hermeneutic dialectic. In that process some, perhaps many, of the original claims, concerns, and issues may be settled without recourse to new information, that is, information that is not already available from one or more of the stakeholding groups themselves. As each group copes with the constructions posed by others, their own constructions alter by virtue of becoming better informed and more sophisticated. Ideally, responsive evaluation seeks to reach consensus on all claims, concerns, and issues at this point, but that is rarely if ever possible. Conflicts will remain whose resolution requires the introduction of outside information, which it becomes the evaluator's task to obtain. When this information (as much of it as is feasible) has become available, the evaluator prepares an agenda for negotiation, taking the leadership in setting up and moderating a negotiation session. Representatives of all relevant stakeholders join with the evaluator in a joint effort to resolve what remains on the table. The final conclusions and recommendations that emerge from such a negotiation (as well as those reached earlier in the hermeneutic dialectic) are thus arrived at jointly; they are never the unique or sole province of the evaluator or the client. Those agenda items that cannot be resolved remain as points of contention, of course, but at the very least each of the stakeholders will understand what the conflict is and where other groups stand in relation to it. The stage is set for recycling the evaluation. Such iteration and reiteration is typical of responsive evaluation; evaluations are never complete but are suspended for logistical reasons, such as the timing of a mandated decision, or because resources are exhausted.

Responsive evaluation has four phases, which may be reiterated and which may overlap. In the first phase, stakeholders are identified and are solicited for those claims, concerns, and issues that they may wish to introduce. In the second phase, the claims, concerns, and issues raised by each stakeholder group are introduced to all other groups for comment, refutation, agreement, or whatever reaction may please them. In this phase many of the original claims, concerns, and issues will be resolved. In the third phase, those claims, concerns, and issues that have not been resolved become the advance organizers for information collection by the evaluator. The precise form of information collection will depend on whether the bone of contention is a claim (information may be gathered to test the claim, for example), a concern (information may be gathered on the extent to which the concern is justified), or an issue (information supporting or refuting each side, and there may be more than two sides, may be gathered). The information may be quantitative or qualitative. Responsive evaluation does not rule out quantitative modes, as is mistakenly believed by many, but deals with whatever information is responsive to the unresolved claim, concern, or issue. In the fourth phase, negotiation among stakeholding groups, under the guidance of the evaluator and utilizing the evaluative information that has been collected, takes place, in an effort to reach consensus on each disputed item. Not all such items will be resolved; those that remain become the core for the next evaluation that may be undertaken when time, resources, and interest permit.

Constructivist methodology. Constructivist methodology is the approach that we propose as a replacement for the scientific mode that has characterized virtually all evaluation carried out in this century. It rests in a belief system that is virtually opposite to that of science; a kind of belief system that is often referred to as a paradigm. As Michael Quinn Patton (1978, p. 203) has put it,

A paradigm is a world view, a general perspective, a way of breaking down the complexity of the real world. As such, paradigms are deeply embedded in the socialization of adherents and practitioners: paradigms tell them what is important, legitimate, and reasonable. Paradigms are also normative, telling the practitioner what to do without the necessity of long existential or epistemological considerations. But it is this aspect of paradigms that constitutes both their strength and their weakness: their strength in that it makes action possible, their weakness in that the very reason for action is hidden in the unquestioned assumptions of the paradigm.

It is not possible to prove or disprove a paradigm in an absolute sense, as it is not possible, say, to prove the existence of a deity, or to prove the value of the adversarial system in use in the courts, or to prove the judgmental system that characterizes literary criticism. We shall, however, show that questions can be raised about the positivist paradigm that has characterized contemporary science that are so fundamental as to suggest that the positivist paradigm needs to be replaced.

We believe that the constructivist paradigm is appropriate to that task. It resembles science hardly at all, particularly in its basic assumptions, which are virtually polar to those of science. For ontologically, it denies the existence of an objective reality, asserting instead that realities are social constructions of the mind, and that there exist as many such constructions as there are individuals (although clearly many constructions will be shared). We argue that science itself is such a construction; we can admit it freely to the pantheon of constructions provided only that we are not asked to accept science as the right or true construction. And we note that if realities are constructions, then there cannot be, except by mental imputation, immutable natural laws governing the constructions, such as cause-effect laws. Epistemologically, the constructivist paradigm denies the possibility of subject-object dualism, suggesting instead that the findings of a study exist precisely because there is an interaction between observer and observed that literally creates what emerges from that inquiry. Methodologically, and in consequence of the ontological and epistemological assumptions already made, the naturalistic paradigm rejects the controlling, manipulative (experimental) approach that characterizes science and substitutes for it a hermeneutic/dialectic process that takes full advantage, and account, of the observer/observed interaction to create a constructed reality that is as informed and sophisticated as it can be made at a particular point in time.

The reader should not fail to note the resonance between an inquiry paradigm that proposes a hermeneutic/dialectic methodology and an evaluation model that depends exactly on such a process to substantiate its claim of responsiveness. Responsive focusing calls out for a constructivist methodology, and constructivist methodology fits exactly the inquiry process needs of responsive evaluation.

The consequences of utilizing constructivist methodology are startlingly different from those we have come to expect from scientific inquiry. We shall argue that both are forms of disciplined inquiry, in the sense of that term proposed by Cronbach and Suppes (1969), which is that within both methodologies it is possible to submit for public inspection and verification "both the raw materials entering into the argument and the logical processes by which they were compressed and rearranged to make the conclusions credible." Within that framework we shall argue for something very different than scientific assumptions would suggest, for example (Guba, 1987):

• "Truth" is a matter of consensus among informed and sophisticated constructors, not of correspondence with an objective reality.

• "Facts" have no meaning except within some value framework; hence there cannot be an "objective" assessment of any proposition.

• "Causes" and "effects" do not exist except by imputation; hence accountability is a relative matter and implicates all interacting parties (entities) equally.

• Phenomena can be understood only within the context in which they are studied; findings from one context cannot be generalized to another; neither problems nor their solutions can be generalized from one setting to another.

• Interventions are not stable; when they are introduced into a particular context they will be at least as much affected (changed) by that context as they are likely to affect the context.

• Change cannot be engineered; it is a nonlinear process that involves the introduction of new information, and increased sophistication in its use, into the constructions of the involved humans.

• Evaluation produces data in which facts and values are inextricably linked. Valuing is an essential part of the evaluation process, providing the basis for an attributed meaning.

• Accountability is a characteristic of a conglomerate of mutual and simultaneous shapers, no one of which nor one subset of which can be uniquely singled out for praise or blame.

• Evaluators are subjective partners with stakeholders in the literal creation of data.

• Evaluators are orchestrators of a negotiation process that attempts to culminate in consensus on better informed and more sophisticated constructions.

• Evaluation data derived from constructivist inquiry have neither special status nor legitimation; they represent simply another construction to be taken into account in the move toward consensus.

Assertions such as those above, and others we might have made and will make elsewhere in this book, at first glance seem so unreasonable as to be rejected out of hand. Yet there seems to be a powerful move in the direction proposed in this paradigm, often, and mistakenly, termed the qualitative paradigm. While perhaps not everyone, or even many, would agree that the naturalistic paradigm should be the paradigm of choice, it is surprising to note the variety of fields in which issue is being taken with scientific positivism and proposals for redirection are being made.¹ In that context our proposal to realign evaluation with a different paradigm than the scientific does not seem so unusual.


The Trade-Offs in Accepting Responsive Constructivist Evaluation

Acceptance of the responsive constructivist mode of conceptualizing and doing evaluation involves gains and losses, although what counts as a gain or a loss is a matter of the perspective from which you happen to be speaking.

Certainly proponents of more conventional forms of evaluation are likely to consider a shift to responsive constructivist evaluation as unfortunate, incurring many losses. For one thing, there is the implicit admission that there can be no certainty about states of affairs; there is no objective truth on which inquiries can converge. One cannot find out how things really are or how they really work. That level of ambiguity is almost too much to tolerate. If evaluations cannot ferret out the truth, what use can there be in doing them?

Shifting to responsive constructivist evaluation also implies giving up control over the process, given that stakeholders play equally definitive roles at all stages with the evaluator and the client. Such loss of control has both methodological and political consequences. On the one hand, if persons who are not typically expert in methodological issues become major decision makers, exercise of their prerogative may seriously threaten the technical adequacy of the study. Further, if these persons are given the power to make methodological decisions, they are simultaneously dealt a political hand as well; methodology may become the object of a tug-of-war between politically dissident groups.

Third, a commitment to responsive constructivist evaluation means abandoning the hope that interventions (treatments, programs, materials, strategies, and the like) can be found that, on evaluation, prove widely equal to whatever task they were intended to accomplish, such as ameliorating alcoholism, remediating underachievers, reducing recidivism, or whatever. If there were no basis for assuming generalizability of interventions, or of devising such interventions on the basis of well-established cause-effect relationships, there can be little reason to believe, conventionalists would aver, that society can finally cope with the many problems that patently beset it. To accept the basic premises undergirding responsive constructivist evaluation is virtually to abandon hope that solutions to social problems can ever be found. If everything must be tailored to specific mores and a specific context, culture, economic level, and so on, society will soon be overwhelmed by this impossibly large and difficult task.

But, answer the proponents of responsive constructivist evaluation, all these fears (about the loss of absolutes on which to pin our hopes, about intolerable ambiguity, about the loss of experimental and political control, about our inability to find widely useful solutions to our pressing problems) are themselves only constructions in which their constructors are trapped because of their rigid adherence to assumptions that have patently outlived their utility and their credibility. It is precisely because of our preoccupation with finding universal solutions that we fail to see how to devise solutions with local meaning and utility. It is precisely because of our preoccupation with control that we fail to empower the very people whom we are putatively trying to serve.

The replacement of the certainty that appears to be invested in

conventional methodology with the relativism characteristic of responsive constructivist evaluation does not lead to an "anything goes" posture. Instead, that change focuses special attention on the question of how one can compare one construction with another to determine which is to be preferred. Conventionally such a comparison is made on the basis of which construction better approximates reality. But when the possibility of an objective reality is denied, that standard disappears, and other more subtle and sophisticated distinctions are called for. The moral imperative of the responsive constructivist evaluator is continuously to be on the alert for (indeed, to seek out) challenges to the prevailing construction (however much it may be supported in consensus), and to stand ready to refine, change, or even reject that which is currently believed in favor of something else that, on examination, seems more reasonable and appropriate to those in the best position to make that judgment. If nothing else, commitment to responsive constructivist evaluation replaces the arrogance so easily assumed by the conventionalist, convinced that he or she has found out about "reality," with a humility appropriate to the insight that one can never know how things "really" are; that one's construction about how things are is created by the inquiry itself and is not determined by some mysterious "nature." To substitute relativity for certainty, empowerment for control, local understanding for generalized explanation, and humility for arrogance, seems to be a series of clear gains for the fourth generation evaluator.

You, the reader, will have to decide for yourself how you wish to count the gains and losses.

Notes

1. By a curious twist of fate, one of us had the privilege of directing this same bureau (which had been renamed the Bureau of Educational Research