
Journal of Science Education and Technology, Vol. 4, No. 1, 1995

Performance Assessment and Science Learning: Rationale for Computers

James R. Okey 1,2

Performance assessment in science focuses on measuring achievement of problem solving, higher-order thinking, and application of skills and knowledge. Descriptions of how performance assessment tasks are linked to science curriculum objectives are given, along with criteria for and examples of performance assessment tasks. Criteria for developing scoring procedures are considered. Throughout the discussion, the role that computers and related technology can play in problem-based performance assessment is described.

KEY WORDS: Performance assessment; science assessment; alternative assessment; computer assessment.

ASSESSMENT IN SCIENCE

The primary purpose of assessment in an instructional setting is to determine if learners have achieved the outcomes, goals, and objectives of the curriculum. Although this purpose seems straightforward, there are circumstances where the match between the assessment being done and the goals, objectives, and outcomes of the curriculum is not obvious. In other words, there may be a lack of fidelity between what students learn and what they are asked to demonstrate in an assessment. Assessments that focus on facts and comprehension may be incongruent with a science curriculum in which students learn to solve problems and conduct investigations. Similarly, assessment tasks that ask learners to organize and use knowledge in an applied setting may be at odds with a curriculum that provides scant opportunity to participate in science learning beyond a vocabulary level, and those kinds of assessment tasks are more readily created and carried out on computers.

Information obtained from assessments is used for a number of purposes--some for learners, some for teachers, and some for programs.

1Department of Instructional Technology, University of Georgia, Athens, Georgia 30602.

2Correspondence should be directed to Department of Instructional Technology, University of Georgia, Athens, Georgia 30602.

Learners want to know "How am I doing?", and an appropriate assessment can provide that knowledge. Teachers have multiple reasons for conducting assessments. They too want to know "How am I doing?", and an assessment can provide information on the value of the teaching activities and other curriculum events that are provided. Teachers also have responsibilities for documenting what students have learned and must report to those interested (parents, administrators, public agencies, and the public) how individual students are achieving curriculum objectives. In light of the importance of assessment, the fact that computers can widen its scope and improve its accuracy makes them increasingly relevant for the assessment of science learning.

Computer-based assessments can address the full range of outcomes to be learned in a science program. While the focus in this paper will be on performance assessment related to higher-order learning and problem solving, assessment of all kinds of outcomes is appropriate and important. It would be a mistake to single out certain kinds of assessment as the good kinds and others as the bad ones. It may be appropriate to learn plant names and characteristics in a science setting and have them assessed in time-honored ways using multiple-choice, completion, or similar kinds of test items. In the same setting it might be appropriate to have students do a biological census that relies on considerable knowledge of individual plants but incorporates as well the problem solving involved in planning, conducting, and reporting a survey based on careful sampling. The appropriateness of the assessments in each of these cases relates to their correspondence to the curriculum objectives.

An essential distinction to make during any consideration of assessments is their difference from learning activities. Although computer-based assessments may sometimes be done in settings that are hard to distinguish from the activities in which learning was accomplished, the two are different dimensions of instruction. Learning activities precede assessment and provide opportunities to achieve goals and objectives. Considerable coaching, assistance, advice, practice, modeling, and feedback may take place during the learning, but during assessment, full knowledge and performance should be demonstrated without teacher guidance and assistance. Although time constraints in real classrooms may not always allow adequate time for full and complete instruction followed by full and complete assessment, teachers should understand the compromises they are making and the consequences of them. Using a learning project (for example, the survey of a network of streams to identify sources and amounts of runoff) to simultaneously learn and assess may provide only limited evidence of students' ability to plan and conduct this type of survey, which depends on substantial knowledge about and procedures of science.

In the world of real school learning, where hours, days, and weeks for science instruction are limited, it is a value decision how much time to devote to learning and instruction and how much to testing or assessment. (Computer-based assessment could make testing more time and cost efficient.) Even though assessment provides important information for teachers and learners (and others), it may still be a wise decision to focus principally on instructional time and shortchange assessment that is time consuming, even when it provides the most accurate information about the knowledge and skills acquired by learners.

Performance assessment is a means of determining achievement of certain kinds of outcomes--that is, outcomes related to applying knowledge, solving problems, and using knowledge and skills in tasks that are or approximate real-world science problems--a function the computer serves well in assessment. Science programs may, however, have a variety of outcomes related to knowledge acquisition and comprehension-level tasks that are part of the curriculum as well and would be assessed in other ways. The point to be made clear is that performance assessment is not the only kind of assessment that will occur in a science program. It relates to one kind of outcome; other kinds of outcomes can be assessed with other means. Some may argue that it is best to assess knowledge acquisition (symbols for elements, classification of plants, types of bacteria) in an applied setting--that is, during its use in an applied problem or investigation--but assessing content knowledge in a problem setting may be an inefficient way to determine if certain key knowledge is remembered. Here again computers prove their value through the variety of assessment means they offer.

PERFORMANCE ASSESSMENT

Several of the terms associated with assessment will be described here. Their meanings differ somewhat depending on who uses them, and some of the terms overlap.

Performance assessment implies the construction or production of something to demonstrate knowledge and skills rather than the selection of responses.

Performance assessment by any name requires students to actively accomplish complex and significant tasks, while bringing to bear prior knowledge, recent learning, and relevant skills to solve realistic or authentic problems. Exhibitions, investigations, demonstrations, written or oral responses, journals, and portfolios are examples of the assessment alternatives . . . [Herman et al., 1992].

Performances used to assess learning may also be considered authentic. An authentic task is one that replicates the context, challenges, and standards of the field in which it is set (Wiggins, 1989). Thus, in woodworking, for example, if a student must construct a piece of furniture requiring joinery with mortise and tenon joints, it is an authentic performance task because it is what real woodworkers do. In science (as well as in other curriculum areas), it is possible to have performance tasks that are authentic and those that are not. Consider having students estimate the impact on landfill needs in a particular community if 75% of the paper and plastics were recycled. Contrast this with identifying the ingredients in household refuse that could be recycled. The first of these lends itself well to computer-based assessment and is considered authentic because it is a real-world task, while the second is information that could be assessed with a typical multiple-choice or short-answer test item. Or, think of the difference between a task that has a learner determine the amount of energy used to take a shower and a word problem that asks how much energy is used by a 3-amp computer operating for 2 hours. Again, the first of these poses a task that requires application of knowledge and skill in a real setting readily accessible by computers, while the second is a typical problem that students may be asked following study of a unit on energy use.

Alternative assessment is another term used to describe assessment that focuses on student performance. Worthen (1993) describes alternative assessment as an alternative to traditional testing (e.g., multiple-choice, short-answer, and standardized tests) and as "direct examination of student performance on significant tasks that are relevant to life outside of school." Computers have been found useful in such assessments (Helgeson and Kumar, 1993).

While any of these terms might appropriately be used to describe the kind of assessment discussed in this article, the term performance assessment has been chosen because it seems to capture best the idea of assessing student problem solving and higher-order thinking ability in a science program that allows performance through computers.

Factors Influencing the Rise of Performance Assessment

In some curriculum areas (for example, art, music, industrial arts, physical education), the idea of assessing the knowledge and skills of learners in a performance context is long standing. In science, there has been a long struggle to reconcile the attention to process learning (problem solving, procedures of learning) with a continued emphasis on assessment focused principally on memorized recall. However, attention to higher-order thinking, problem solving, and the ability to use and apply knowledge has confronted science educators with the inevitable dilemma of a curriculum in which the goals and objectives were at odds with the assessment procedures used to measure their achievement. Influential as well in changing assessment methods is the widespread philosophy that active, problem-based learning is a critical factor in learning and in a quality science program. If the curriculum emphasizes problem-based learning, then an assessment program should parallel it with assessment tasks that call for and allow demonstrations of knowledge and skill consistent with the curriculum (Shavelson et al., 1991). Computers provide a means of assessment using both conventional testing procedures and those that require problem solving and application of knowledge in science.

Influences of Performance Assessment

In conventional thinking, the curriculum precedes and drives the assessment procedure. When goals and objectives are set in the curriculum, then the methods appropriate to their assessment are devised. Certainly in science, an emphasis on problem solving, application, and processes resulted in the need to devise assessments that paralleled them, but the attention to active and problem-based assessment has had the reverse effect too. Because recent discussions have so strongly focused on how problem solving and problem-based thinking can be assessed, the emphasis has ultimately turned to curriculum activities that can lead to carrying them out successfully and to computers for assessing, through simulation, the activities that present problem-solving opportunities.

Types of Performances

What constitutes an appropriate performance assessment? In other words, in what ways can learners demonstrate their ability to use and apply knowledge in a productive, generative manner? The most common way in science is to plan, conduct, and report on investigations. The investigations can be related to air quality, impact of fertilizers and pesticides, plant growth, solar insolation, land-use patterns, or any other matter chosen by a student or teacher that provides an opportunity to study phenomena using knowledge about and procedures of science. The more the problem or issue is perceived as a real one, the more likely learners are to see it as an authentic task that has value. The tangible result of the performance assessment will be what is produced by the learner. Here you may find logs, field notes, recorded data, written plans and conclusions, pictures, video images, and so on. Assembling and displaying these in a coherent way is itself a considerable task that can be aided by appropriate use of computer and related technology.

Some writers, e.g., Feuer and Fulton (1993), list a variety of performances that are appropriate for an assessment--constructed responses, writing, oral discourse, exhibitions, experiments, and portfolios. Most performance assessments would seem to require learners to engage in combinations of these methods of response when they demonstrate their knowledge and skills in an area.

ROLE OF COMPUTERS IN PERFORMANCE ASSESSMENT

The use of computers in science instruction had its beginnings little more than a decade ago (with some earlier and uncommon exceptions that relied on a few mainframe computers). Computer applications in data acquisition, data processing, information access (of data bases), and reporting can be widely observed in schools at all levels, and they are increasingly useful in both science instruction and assessment.

Both students and teachers use computers in performance assessment. All phases of planning, conducting, and reporting in a performance assessment environment can be affected by computers. Consider the conduct of studies, investigations, or reports. Students may use computers for recording data, acquiring information through laboratory study, accessing information from data bases, and communicating with other investigators at remote sites concerning their data. When preparing reports or productions, again computers and related technologies are used to create print documents, record sounds and images of phenomena, and prepare multimedia formats (text, graphics, still images, sound, video). Portfolios that document performance over a period of time may provide examples of journals, tests, reports, descriptions of investigations, and multimedia reports that are created and stored on computers with the auxiliary aid of tape and video recorders, cameras, and appropriate software (Barrett, 1994).
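To make the data-recording role concrete, here is a minimal sketch (mine, not the article's) of how a student might log timestamped field observations to a file that can later feed charts, reports, or exchanges with remote collaborators. The file name, site labels, and quantities are invented for illustration.

    # Minimal sketch of computer-supported data recording during an
    # investigation: each observation is appended to a CSV log.
    # File and field names below are hypothetical examples.
    import csv
    from datetime import datetime

    def record_reading(filename, site, quantity, value, unit):
        """Append one timestamped field observation to a CSV log."""
        with open(filename, "a", newline="") as f:
            csv.writer(f).writerow(
                [datetime.now().isoformat(timespec="seconds"),
                 site, quantity, value, unit])

    # Example: logging stream pH and temperature during a runoff survey.
    record_reading("stream_survey.csv", "Site A", "pH", 6.8, "pH units")
    record_reading("stream_survey.csv", "Site A", "temperature", 14.2, "C")

A log in this form is easy to sort, chart, and paste into the written and visual reports the assessment tasks call for.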

Another type of assessment that may rely completely or to some degree on computers is simulation of science-related events (e.g., operating a nuclear reactor or regulating factors that influence production in fish farming). Computer-based simulations can provide problem situations that incorporate numerous variables, foreshortened time, and safe access to consequences of decisions and choices. It is through these characteristics (time, safety, complexity) that computers demonstrate their value.
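A sketch of what such a simulation might look like follows. It echoes the fish-farming example above, but the growth model and every coefficient in it are invented for illustration, not drawn from any real assessment package.

    # Sketch of an assessment simulation: students adjust management
    # variables for a fish farm and see a season's outcome in seconds
    # rather than months. All coefficients are invented.
    def simulate_fish_farm(stocking_density, feed_kg_per_day, weeks=26):
        """Return estimated harvest mass (kg) after the growing season."""
        fish_count = stocking_density * 100   # fish in a 100 m^3 pen
        mean_mass_kg = 0.05                   # fingerlings at stocking
        for _ in range(weeks):
            # Growth rises with feeding; crowding depresses it.
            growth = 2.0 * feed_kg_per_day / fish_count
            crowding_penalty = 0.0005 * stocking_density
            mean_mass_kg += max(growth - crowding_penalty, 0.0)
            # Heavy feeding relative to stock fouls the water and
            # kills a fraction of the fish that week.
            if feed_kg_per_day > 0.03 * fish_count:
                fish_count = int(fish_count * 0.98)
        return fish_count * mean_mass_kg

    # A student compares two strategies safely and instantly.
    print(simulate_fish_farm(stocking_density=20, feed_kg_per_day=40))
    print(simulate_fish_farm(stocking_density=60, feed_kg_per_day=40))

The point of the sketch is the three characteristics named above: time is foreshortened to a loop, consequences of poor choices are safe, and several interacting variables are in play at once.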

PERFORMANCE ASSESSMENT PROCEDURES

Three critical steps are involved in establishing a performance assessment program in science: (1) setting or posing performance tasks, (2) describing conditions for responding to performance tasks, and (3) establishing scoring criteria for performance tasks. Descriptions of and considerations for carrying out these three steps follow in this section; these complex steps can be considerably eased through computer use (Kumar and Helgeson, 1994).

Setting or Posing Performance Tasks

Attention to a variety of concerns is needed when selecting performance assessment tasks. Which performance tasks provide an opportunity to assess whether students have acquired the knowledge and skills intended in the curriculum (Linn et al., 1991)? What performance tasks are appropriate for science students at different ages and experience levels? Which tasks provide an opportunity to demonstrate skills and knowledge in solving problems that are similar to those in the real world? These are among the questions faced by teachers when they plan, select, and design performance assessment tasks for science students.

Herman et al. (1992) provide a list of criteria to consider when setting assessment tasks. Among their criteria are:

• matching the performance task to the intended outcomes and objectives. No task, no matter how real or original, is appropriate if it fails to match the objectives and goals of the curriculum.

• representing the content and skills in the curriculum. Performance assessment tasks should map curriculum goals and objectives at an appropriate degree of depth and breadth. A performance assessment task could match the curriculum objectives but do this so narrowly that dozens of such tasks would be required to provide adequate assessment. Similarly, the performance assessment task could be so easy as to inadequately test application of skills and knowledge or be so difficult as to render what was learned as inadequate to accomplish the task.


• providing a fair, equitable, and unbiased assessment. Performance tasks should be fair in terms of gender and race. They should be constructed so that students with different life experiences and backgrounds are not unfairly aided or hampered.

• appearing credible to both learners and outsiders. Authentic tasks have been described as those that truly represent what people in a field actually do. Some degree of authenticity is required in assessment tasks so that learners view them as credible evidence of knowledge and skill application and not just make-work. As well, for a school assessment program to be viewed as credible by outsiders (parents and other citizens), the performance tasks must not be viewed as trivial, shallow, or contrived and not representative of genuine intellectual application.

• allowing a feasible means of assessing performance. Consideration of such needs as equipment, time required, space, and access are all important in selecting performance assessments. A performance assessment task, no matter how credible, authentic, and faithful to curriculum goals, is inappropriate if it requires too much time, too much equipment and space, or access to distant places and people.

When students use computers during assessment, many of these performance assessment criteria are addressed. Representation of the curriculum content and skills within a computer environment is aided by the ready access to and revision of assessment items and procedures. Fair and equitable assessment in a computer environment includes ensuring that students have sufficient access so they are comfortable using computers. Credibility of the assessment process and tasks can be aided through computer presentation of tasks. In these and other ways that computers assist in assessment by providing complex and faithful renderings of science problems, the criteria for performance assessment can be met (see Helgeson and Kumar, 1993).

With criteria like those listed above in mind, teachers can identify and select tasks that are appropriate for their students as a way of demonstrating their knowledge and skills in an applied problem-solving setting. What might such tasks look like? Consider these possibilities:

• Create a report on ecological problems associated with increased urbanization in a particular area.

• Determine how much it costs in energy use to cook a meal, take a shower, get all students to school on one day, or operate a home for a day.

• Compare the growth and productivity of plants growing in hydroponic and soil conditions.

• Prepare a documentary (words, pictures, sounds) that explains in lay language the causes of and fluctuations in the tides.

• Determine the impact of a recycling plan on waste removal services, landfill use, and costs and benefits to local citizens.

• Conduct a survey of the plant life in a designated area. Devise a sampling plan, document data collection procedures, and organize a written and visual report.

Performance tasks such as the above spring from the minds of teachers and can benefit from group consideration, alteration, review, and extension. Brainstorming such tasks is a helpful practice, and students may have excellent ideas here as well. Good teaching activities are potentially good assessment tasks and vice versa. The important thing in inventing such tasks is to keep the criteria for performance assessment tasks in mind.

Describing Conditions for Responding to Performance Tasks

Identifying performance assessment tasks is only the first step in readying them for use. Imagine the kinds of questions (concerning scope, time, resources, assistance, and so on) that students would reasonably ask when faced with such a task, problem, or question. What the teacher needs to do, therefore, is have an accompanying description of these conditions to provide with each task (Herman et al., 1992). An example of what this might look like for a particular task is shown in Fig. 1 for the plant survey task listed above. Using computer assistance, teachers could make numerous variations of this example task for students of different abilities, in different classes, and of different ages; a sketch of what that assistance might look like follows Fig. 1.


Conduct a survey of the plant life in the Fulton Woods plot. The survey should be prepared for the Urban Land Use Commission which is making recommendations for zoning changes for this area. The report of the survey should be both written, with an executive summary, and in verbal and visual form for presentation to the Commission in a 15-minute session.

Restrictions and specifications for the survey include a description of sampling procedures followed and descriptive and tabular data presentations. Reports can be prepared in a team of four persons. Any materials available in the science department may be used. Computer word processing complete with visual images is expected in both the written document and the presentation. Library resources from our library and through internet access are appropriate. The report must be completed within eight days of starting the task. Any questions or concerns about conduct of the survey, scope of the report, resources available, and work allocation should be addressed to Ms. Felton.

Fig. 1. Specifications for a performance assessment task in science.
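As a hedged sketch of the computer assistance mentioned above, a teacher might keep the task specification as a template and fill its fields differently for different classes. The template text here condenses Fig. 1, and the alternative plot names and values are hypothetical.

    # Sketch of generating variations of the Fig. 1 task specification.
    # Template wording is abridged; field values are invented examples.
    TASK_TEMPLATE = (
        "Conduct a survey of the plant life in the {plot} plot. "
        "Reports can be prepared in a team of {team_size} persons. "
        "The report must be completed within {days} days of starting "
        "the task."
    )

    variations = [
        {"plot": "Fulton Woods", "team_size": 4, "days": 8},
        {"plot": "Campus Meadow", "team_size": 2, "days": 12},
    ]

    for v in variations:
        print(TASK_TEMPLATE.format(**v))
        print()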

Performance Criteria

Exemplary -- Provides a clear and coherent plan for addressing the problem. Communicates sampling procedure and data with appropriate words, charts, and diagrams. Written, verbal, and visual reports all succinctly and clearly convey the problem, procedures, data, and conclusions.

Competent -- Provides an adequate description of the problem, procedures, data, and conclusions. Some problems of clarity, thoroughness, and adequacy may exist in any or all of these phases. Reports (written, verbal, visual) all meet high quality standards.

Intermediate -- Descriptions of the problem, procedures, and data convey some difficulties with either intentions or results. Reports are adequate but lack key elements (e.g., in accuracy, style) of clear and coherent communication.

Novice -- Data collection plan inappropriate. Data inaccurate or incomplete. Report fails to convey the problem or the results accurately and coherently.

Fig. 2. Scoring criteria for a performance task.

Establishing Scoring Criteria for Performance Tasks

Beyond the selection of the problem used to assess performance and the specification of the conditions for completing the assessment is the development of criteria for judging the responses. Although these scoring criteria are used when the task is complete, they are developed before students engage in the assessment so that they may be used as a guide to performance expectations. Scoring criteria for performance assessment tasks (referred to as scoring rubrics) are typically laid out in scale form to show gradations of performance from poor to exemplary. The criteria may relate to the entire performance task or be developed for facets or dimensions of the task. Thus the survey task described above could be judged in its entirety (see the following scoring rubric) or be examined by considering dimensions of the task (such as adequacy of the sampling plan, quality of the verbal report, appropriateness of data displays, etc.) and then summed across these dimensions to produce an overall score.

A possible scoring rubric for the survey task is given in Fig. 2. This example uses four levels of performance, but more could be used, and the scoring criteria are based on the collective consideration of several elements or dimensions of the task. These elements could be separated and individually judged on scales developed for each. As with other elements of performance assessment, application of these scoring criteria could be computerized to aid the teacher in both record keeping and grading.
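As a sketch of what computerized application of the scoring criteria might look like, the following sums judgments on separate dimensions of the survey task into an overall score. The 1-4 scale mirrors the four levels of Fig. 2, but the particular dimension breakdown is an illustrative assumption, not the article's.

    # Sketch of computerized rubric scoring: each dimension is judged
    # on a 1 (novice) to 4 (exemplary) scale and summed to an overall
    # score. Dimension names are invented for illustration.
    RUBRIC_DIMENSIONS = ["sampling plan", "data displays",
                         "written report", "oral/visual report"]

    def overall_score(dimension_scores):
        """Sum per-dimension scores (1-4 each) into an overall score."""
        if set(dimension_scores) != set(RUBRIC_DIMENSIONS):
            raise ValueError("score every rubric dimension exactly once")
        return sum(dimension_scores.values())

    student = {"sampling plan": 4, "data displays": 3,
               "written report": 3, "oral/visual report": 2}
    print(overall_score(student))   # 12 of a possible 16

Keeping such records in one place also gives the teacher the consistent, explicit scoring trail that the reliability discussion below calls for.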

ISSUES IN PERFORMANCE ASSESSMENT

The central issue in performance assessment in science is that of validity. Does the assessment correspond to the objectives and outcomes of the science program? If problem solving, application of knowledge and skills, and the ability to engage in the procedures of science are at the heart of these objectives, then the assessment should provide credible evidence that students can engage in them. Problem-solving outcomes should be assessed with problem-solving tasks; the problem, of course, is cost and efficiency (Popham, 1993; Wiggins, 1989). A valid test of problem-solving ability is not a trivial matter for the teacher to construct, for the learner to engage in, or for the scorer to judge. The costs are primarily in time for teachers to prepare and score the tests and for students to complete the tasks. Valid performance tests will take longer than paper-and-pencil tests that use constructed and multiple-choice responses.

A second issue for any test is reliability. When a student takes a test, teachers want to be sure that the results are indicative of the knowledge and skill that the student has. A reliable test is one that provides results that can be believed in; that is, it is not capricious, giving high scores one time and low ones another. Reliability for performance tests is largely determined by the careful and consistent use of explicit scoring procedures. This means that the scoring criteria must be fully explicated and that the persons doing the scoring render consistent scoring results. Because classroom tests are usually scored by teachers, reliability in scoring comes from consistent use of the explicit scoring criteria across different students and different times.

When computers (and related technological products) are used in science assessment, equity of access is a potential problem. Schools need to be concerned with equity across and within classes and among various programs in the school. Beyond these equity concerns are those that relate to out-of-school use of computers. Performance assessments are often not short, within-the-hour measures. They may take place across days or even weeks. When they do, and when students are preparing reports or exhibits, accessing external data sources, and so on, use of computers in homes becomes a major concern. Some students will have this access and others will not. Access to technology resources should not be a factor in student success, so it is incumbent upon teachers to ensure access and familiarity.

A technological issue related to computer-based performance assessment has to do with storage capability. Because the use of computers is projected into multimedia use (including text, pictures, sound, and video), the multimedia records must be saved and be accessible. This can be a problem for electronic portfolios that include significant video and sound storage, since most personal computers do not have the capability to store more than a few seconds of such information. The increasing storage capacity of computers suggests this will be a problem of declining importance.
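Some back-of-the-envelope arithmetic (mine, not the article's) shows why video was the pressing constraint at the time of writing: uncompressed video at mid-1990s desktop resolutions consumes tens of megabytes per second, so a disk of a few hundred megabytes held only seconds of footage.

    # Illustrative arithmetic only; the 500 MB disk is an assumed
    # mid-1990s figure, not a value from the article.
    width, height, bytes_per_pixel, fps = 640, 480, 3, 30
    bytes_per_second = width * height * bytes_per_pixel * fps
    print(bytes_per_second / 1e6, "MB per second of uncompressed video")
    print(500e6 / bytes_per_second, "seconds fit on a 500 MB disk")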

REFERENCES

Barrett, H. (1994). Technology-supported assessment portfolios. The Computing Teacher 21(6): 9-12.

Feuer, M., and Fulton, K. (1993). The many faces of performance assessment. Phi Delta Kappan 74(6): 478.

Helgeson, S., and Kumar, D. (1993). A review of educational technology in science assessment. Journal of Computers in Mathematics and Science Teaching 12(3/4): 227-243.

Herman, J., Aschbacher, P., and Winters, L. (1992). A Practical Guide to Alternative Assessment, Association for Supervision and Curriculum Development, Alexandria, Virginia.

Kumar, D., and Helgeson, S. (1994). Computer applications in assessing science problem solving. In Lavoie, D. (Ed.), Toward a Cognitive Science Perspective for Scientific Problem Solving, National Association for Research in Science Teaching, Manhattan, Kansas (in press).

Linn, R., Baker, E., and Dunbar, S. (1991). Complex, performance-based assessment: Expectations and validation criteria. Educational Researcher 20(8): 15-23.

Popham, J. (1993). Circumventing the high costs of authentic assessment. Phi Delta Kappan 74(6): 470-473.

Shavelson, R., Baxter, G., and Pine, J. (1991). Performance assessment in science. Applied Measurement in Education 4(4): 347-362.

Wiggins, G. (1989). A true test: Toward more authentic and equitable assessment. Phi Delta Kappan 70(9): 703-713.

Worthen, B. (1993). Critical issues that will determine the future of alternative assessment. Phi Delta Kappan 74(6): 444-454.