Explanatory concepts in physics: Towards a principled evaluation of teaching materials


Computers Educ. Vol. 17, No. 1, pp. 73-80, 1991. Printed in Great Britain

0360-1315/91 $3.00 + 0.00 Pergamon Press plc

EXPLANATORY CONCEPTS IN PHYSICS: TOWARDS A PRINCIPLED EVALUATION OF TEACHING MATERIALS

CHRISTINE J. HOWE

Department of Psychology, University of Strathclyde, Alexander Turnbull Building, 155 George Street, Glasgow G1 1RD, Scotland

(Accepted 24 December 1990)

Abstract: The assumption underlying the present paper is that in the ideal world teaching materials should be evaluated with reference to general principles of learning. When such principles are not available, teaching materials should be evaluated using controlled empirical studies. However, the design of the studies should be such as to render the principles more certain. This said, two questions follow for any domain of knowledge: where in the domain uncertainties lie, and by what empirical methods they should be resolved. The present paper asks these questions in relation to the explanatory concepts of physics. It starts by considering what might, given recent research, be taken as a principle, namely that ‘Socratic dialogue’ between pupils facilitates learning. Subjecting the supportive research to methodological review, this paper not only debates the legitimacy of its claims. It also clarifies the methods by which some specific uncertainties might be further explored. In doing this, it draws attention to the long term nature of the evaluative strategy that its working assumption has led to, and asks whether the assumption should be suspended. Arguing that apparent short-cuts are illusory, the paper concludes that it should not.

INTRODUCTION

The brief for the present paper is to contribute to a debate concerning the methods by which the research community should contribute to the evaluation of teaching materials. In one sense, however, it can have little to say. There can surely be no doubt that if the teaching materials relate to a domain for which general learning principles are already established, the research community, and indeed anyone else, should evaluate them with reference to those principles. There would in other words be no need for empirical research. If the teaching materials relate to a domain for which there is still uncertainty, the research community has a responsibility to evaluate them using controlled empirical studies. However, since evaluation with reference to general learning principles is clearly desirable, the aim of the studies should not simply be to establish that the materials work but also to contribute to the reduction of the uncertainties. Therefore, the nature of the controls should be determined accordingly.

Accepting these points, the impossibility of drawing general conclusions about evaluation immediately becomes clear. Categorical statements about the assessment of workability are, for instance, precluded, for the options for assessment must depend on the controls that particular studies demand. Thus, there is no point in attempting anything apart from the focussed treatment of a specific domain. This said, there is some point in choosing a domain which unites several interests, and this is the rationale behind the selection in the present paper of explanatory concepts in physics. The union of interests stems from the fact that the teaching materials aimed at explanatory growth in physics are now assigning a crucial role to information technology. Bearing witness to this are the proceedings of a recent conference in the United States [1] whose contents of over 150 papers are mostly presentations of relevant software. Given the centrality of information technology, the paper should bear directly on both physics education and computer-assisted learning.

The paper will proceed by discussing two basic issues. The first is where in its domain the uncertainties lie, the point being to delimit the contexts for principled and empirical methods. The second is what the options for controlled investigation are, given the context that the uncertainties impose. However, the paper will not treat the issues separately for they are necessarily interwoven. After all, discussion of what can be taken for granted and what is uncertain will not only lay out the variables requiring study. Since it will necessitate a methodological critique of existing research, it will also cover the options for assessing the effects that the variables have. Thus there is a sense in which following the debating strategy to its logical conclusion will reduce the paper to a critical review of existing research.

THE VALUE OF SOCRATIC DIALOGUE

By way of introduction, it should be noted that during the past decade, there have been important changes in the assumptions about how the explanatory concepts of physics should be conveyed. It was previously thought that, despite lifetimes of operating with physical events, pupils’ conceptions of the explanatory factors were entirely implicit. Thus, with respect to the explicit conceptions that physics tries to teach, pupils could be regarded as ‘blank slates’. Thanks to such papers as Driver and Erickson [2] and Gilbert and Watts [3] this analysis is now seen as mistaken. It is widely accepted that by the time they come to physics, pupils will have evolved explicit preconceptions about a range of events. There is debate over the precise form the preconceptions will take, with positions ranging from all-encompassing ‘naive’ theories (e.g. [4]) to interweaving and fragmentary primitives (e.g. [5]). However, it is agreed that, induced from direct observation and/or ordinary discourse, explanatory preconceptions will be strongly held. Hence, if they are ignored, they will, it is thought, interact with teaching in damaging ways. Accepting this, the appropriate approach will, as Hewson and Hewson [6] point out, be one that views teaching as effecting conceptual change.

Acknowledging the approach in principle, there has been extensive discussion as to how it should be realised, with a favoured solution being a form of ‘Socratic dialogue’ where pupils with contrasting preconceptions debate their positions. Advocated by writers like Champagne et al. [7] and Clement [8], Socratic dialogue is being promoted in the National Curriculum for Science (DES, [9]) that is currently being implemented in England and Wales. Should Socratic dialogue prove of value, it would have important implications for computer-assisted learning. In the first place, it would make a virtue out of the necessity, due, as Light et al. [10] point out, to shortage of resources, for computer-based work to be group work. In the second, it might, for the present at least, provide an actual justification for computer-based work. This is because few of the teaching materials that are currently available have features specifically directed at Socratic dialogue. Few, for instance, demand the making of joint decisions. The teaching materials intended for computer presentation are certainly no exception, but where they differ is in requiring individual decisions of a highly constrained nature. Because of this, groups working with them should be relatively likely to negotiate their decisions, meaning that a context for Socratic dialogue could be provided even when it is not explicitly signalled.

Accepting then that approaches to teaching which deploy Socratic dialogue would have these consequences, the question has to be raised as to whether such approaches should in fact be taken, whether in other words they would concur with well-established principles. At first sight, the answer would seem to be affirmative. The use of Socratic dialogue would agree, for instance, with Vygotsky’s [11] proposal that conceptual change depends on the construction (in response to complementary information) of notions which, though superior to the original ones, remain in their ‘zone of proximal development’. It would also concur with Piaget’s (e.g. [12]) claim that changing understanding of ‘physical causality’ requires the resolution of notions which, though competing, are mutually ‘assimilable’ to existing ‘schemes’. Finally, it would be consistent with empirical results obtained in the United States by Champagne et al. [13], Forman [14], Forman and Kraker [15], Nussbaum and Novick [16] and Trowbridge [17]; in Australasia by Osborne and Freyberg [18] and Thorley and Treagust [19]; and in the U.K. by myself and my colleagues, our results derived from the four studies summarised in Howe et al. [20] and one recently completed follow-up on relative velocity.

On closer scrutiny, however, matters are less clearcut. Despite the wealth of seeming support, there have been negative results. Those reported by Tudge [21] are an example. Furthermore, even the supportive studies are open to criticism. As mentioned already, it can surely be taken for granted that, to establish principles of learning, it is necessary to conduct empirical investigations in controlled conditions. In the present context, appropriate controls would be groups of pupils working on identical teaching materials but some engaging in Socratic dialogue and some precluded. The studies that I was involved in attempted to establish such controls, but as far as I can see this was exceptional. However, perhaps my studies are sufficient. Certainly, they covered five topics in physics and included pupils whose ages ranged from eight to undergraduate level, preconceptions whose quality varied from irrelevant to more or less complete, groups whose sizes alternated between two and four and teaching materials whose presentation moved from illustrated workbooks to HyperCard simulations. Certainly also, the results were highly consistent, providing statistically significant support for Socratic dialogue in all five studies. However, there were many commonalities of method between the studies, raising the possibility that the consistent results may have resulted from common biases.

To see if this was likely, it should be noted that in all five studies, the basic method was to administer individual pre-tests to randomly sampled pupils, to use the pre-tests to establish groups where explanatory preconceptions were either substantially different or very similar and to allow the groups to work on teaching materials that were specially designed to elicit explanatory discussion. The teaching materials required the pupils to make independent predictions about relevant events, to compare their predictions and come to an agreement, to test the agreed predictions using the facilities provided and to evolve joint interpretations of what transpired. The rationale behind the method was that although all the pupils would engage in discussion by virtue of the materials, only the pupils in the groups where preconceptions differed could engage in Socratic dialogue. Hence, the focus for the subsequent analysis was progress as a function of different or similar group. However, in two of the studies, progress was also correlated with indices of Socratic dialogue. These were derived from videorecordings made while the groups worked on the materials, with other potential influences being partialled out. Thus, two forms of control were used, and it is hard to think what more could have been done.

However, perhaps it is not so much the procedural part of the method where fault is to be found, but rather the assessment of conceptual change, for here there was no attempt at a range of approaches. On the contrary, in all five studies, assessment of progress was based on change from the pre-tests to comparable post-tests administered some weeks after the teaching materials. Given that group discussion was such a central feature, assessment could in theory have been based on the on-task dialogue. Although pre- to post-test change is the favoured method in the literature as a whole, there is a precedent for the use of on-task dialogue in Nussbaum and Novick’s [16] work on the particulate theory of matter. Moreover, the use of on-task dialogue has been actively promoted by Forman [14] for a mixture of theoretical and practical reasons. Clearly, this is an issue of some importance that needs to be discussed at length.

THE ASSESSMENT OF CONCEPTUAL CHANGE

Forman’s [14] main theoretical reason for preferring on-task dialogue to pre- to post-test change is that, in her view, the latter is ‘decontextualized’ while the former is not. However, Forman does not explain what she means by decontextualization, let alone show it to be a problem. If she means isolation from the context of teaching, decontextualization would, assuming the concern to be generalizable learning, appear to be a strength. The same would apply if she means explicit rather than tacit knowledge, for the former is the stuff of physics. She may mean isolation from the context of problem solving, and this could be a problem. However, it can be minimized if pupils are asked not to state explanatory concepts explicitly but rather to display them in the course of problem solving. This was routine in the studies that I was involved in, for the pre- and post-tests invariably required pupils to make and justify predictions, the explanations offered by way of justification being the means by which knowledge was estimated.

Of course, there is a dilemma as to how contextualized the testing should be. For instance, instead of focussing on justifications, my colleagues and I could have used the predictions themselves, inferring the explanatory concepts from what we took to be their consequences. This would have been consistent with the approach of McCloskey [4] in work on freefall under horizontal motion, for there, predictions about the paths objects trace were the sole evidence for beliefs about underlying factors. Certainly had we done this, the results would have been different. One of the studies that we conducted was also concerned with freefall under horizontal motion. However, in contrast to McCloskey, both predictions and justifications were analysed. Predictions were scored from 0 to 5 depending on their approximation to a properly shaped path, a vertical path being scored 0 and an appropriate parabola 5. Explanations offered in justification were scored from 0 to 6 according to the number of relevant factors (from initial velocity, gravity and wind resistance) identified and used appropriately. Tolmie et al. [22] report the correlation between prediction and justification scores for 73 pupils aged 12-15. It was virtually zero (r = -0.01, NS).
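For readers who wish to see how a coefficient of this kind is obtained, the following is a minimal sketch in Python of a Pearson correlation between two sets of scores. The function name `pearson_r` and the six pairs of scores are hypothetical and purely illustrative; they are not data from Tolmie et al. [22], and the source does not state which correlation coefficient was actually used.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products and sums of squares about the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for six pupils (NOT data from the study).
predictions = [0, 5, 2, 3, 5, 1]      # path predictions, scored 0 to 5
justifications = [4, 2, 3, 1, 3, 5]   # explanations, scored 0 to 6
r = pearson_r(predictions, justifications)
```

A value of r near zero, as reported in the text, would mean that knowing a pupil's prediction score says almost nothing about that pupil's justification score.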

In view of this, there is a real problem in selecting an index. Against predictions, it might be argued that there can be no guarantee that explicit preconceptions would be tapped rather than something implicit. Against justifications, the possible confounding influence of linguistic skill has to be recognized. There is no clear answer, though it may be that with pupils of eight and over, as was the case in my own studies, the linguistic problem should not be overly serious. Moreover, with one-to-one interviewing, marked reticence can at worst be noted and at best be countered. Whatever the answer, it should be easy to see that exactly the same problem would occur with on-task dialogue. Thus, contrary to what Forman implies, the problem is not a theoretical objection to pre- and post-testing. It is a general dilemma for all methods of assessment. Noting this, it is interesting that, in her own work, Forman solved the dilemma for on-task dialogue in a fashion that parallels the solution that my colleagues and I adopted for pre- and post-tests. She looked at justifications for predictions and jointly constructed interpretations!

However, Forman’s objections are not simply theoretical, for she raises a further point addressed to the practice of pre- and post-testing in the context of group work. In particular, she suggests that to the extent the group work is successful, pupils will come to see the materials they are working with as inherently collaborative. Hence, when they are confronted with a non-collaborative post-test, their motivation will be undermined, and they will underperform accordingly. Against Forman, it might seem worth noting that in my studies as in the literature as a whole, pre- to post-test change was in a positive direction. Hence, there were still signs of learning even though its absolute level may have been underestimated. This, however, will not do. Forman’s point amounts to the claim that the success of group work will be inversely related to the success of post-test performance. Since the conclusions, in my studies at least, were drawn from comparisons of pre- to post-test change across two kinds of group, this, if true, would be exceedingly worrying.

However, two of the studies I was involved in suggest that it is most unlikely to be true. These studies (Howe et al.[23]) were both concerned with the explanation of floating and sinking, one focussing on the use of object density and the other on the use of fluid. In both studies, the pre-tests were clinical interviews to around 120 eight to twelve year old pupils, with the predictions relating both to physically present objects (e.g. a metal key in a tank of freshwater) and described examples (e.g. the key in Loch Lomond rather than the tank). The explanations offered in justification of the predictions were scored on a scale from 0 to 4 for their approximation to density. Using the explanations, around 75% of the pupils were placed in different and similar groups, with every group being a foursome. The groups came to work with their teaching materials a few weeks after the pre-tests, making their independent predictions on cards before moving to the subsequent phases which were described to them via workbooks. The post-tests were identical in form though different in content to the pre-tests, and were administered between two and five weeks after the teaching materials.

These studies are relevant because it was not simply the post-test explanations that were scored using the pre-test scale. It was also the explanations that the pupils offered on the teaching materials when deriving joint interpretations for the outcomes of tests. These explanations had been extracted from videotapes recorded during the group work. With pre-test, post-test and group scores, the latter being assessed in a fashion comparable to Forman, Forman’s objection can be addressed directly for it predicts a strong negative correlation between pre- to post-test change and pre-test to group change. As it turns out, the correlation in the object density study was negligible (r = 0.06, NS) and the correlation in the fluid density study strongly positive (r = 0.55, P < 0.002). Even worse, however, pre- to post-test change was, in both studies, greater than pre-test to group. Indeed, in the object density study, the pre-test mean score of 1.49 was no different from the group mean score of 1.45 while both differed significantly (P < 0.001) from the post-test mean score of 1.71. Moreover, although these figures are global ones computed regardless of grouping, separation into the same and different conditions does not affect the picture. Thus, it seems that far from being undermined by successful group work, post-test performance can be promoted by group work regardless of its success or failure.

Having reached this conclusion, it becomes tempting to claim not just that pre- to post-test change has been vindicated against Forman’s criticisms but also that in comparison with on-task assessment it is the preferable measure. However, this would be going too far, for the theoretical issues discussed already still remain unsolved. On the other hand, if it is agreed that justifications for predictions and interpretations for events should be the contexts for assessing explanatory knowledge, additional points could be advanced to support pre- to post-test change in preference to group score. In the first place, the correlations between pre-test to group change and pre-test score relative to other group members were -0.47 (P < 0.001) in the objects density study and -0.38 (P < 0.001) in the fluids. In other words, the greatest within-group progress came from the pupils who were relatively less advanced and indeed, given the mean group score, the most advanced pupils in the objects density study must have produced group performances that were actually regressive. This suggests that group scores reflect compromises to facilitate joint decisions rather than anything about learning, a suggestion that is not only consistent with the social psychological literature on group dynamics (e.g. Brown[24]) but also endorsed by a third study that I was involved in (Howe et al.[25]).
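The text does not say how 'pre-test score relative to other group members' was computed. One plausible reading, offered here only as an assumption, is each pupil's pre-test score minus the mean pre-test score of the other members of the group. The Python sketch below implements that reading; the function name `relative_standing` and the example foursome are hypothetical.

```python
def relative_standing(group_pre_scores):
    """For each pupil: own pre-test score minus the mean of the others' scores.

    This definition is an assumption made for illustration; the source
    does not specify the measure that was used.
    """
    n = len(group_pre_scores)
    if n < 2:
        raise ValueError("a group needs at least two members")
    total = sum(group_pre_scores)
    return [score - (total - score) / (n - 1) for score in group_pre_scores]


# Hypothetical foursome with pre-test explanations scored 0 to 4 (NOT study data).
foursome = [0, 1, 2, 4]
standings = relative_standing(foursome)
```

Negative values mark pupils who began below their group-mates. Under the negative correlations reported in the text, these are the pupils whose group scores rose most relative to their pre-test scores.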

This study was concerned with motion down an inclined plane. Like the floating and sinking studies, it began with clinical interview pre-tests to over 100 eight to twelve year old pupils. The focus of the pre-tests was predictions about the distances toy vehicles would roll from inclines under different combinations of surface friction, surface angle, surface length and vehicle weight. However, like the floating and sinking studies, predictions were also solicited about described instances. The pupils were asked to justify their predictions, and the explanations they offered in justification were scored from 1 to 4. Eighty four of the pupils were then grouped into same or different foursomes, and taken to work on teaching materials parallel in form to the ones used for floating and sinking. Once more, the explanations produced while constructing joint interpretations were extracted from videorecordings of the group work, and scored using the pre-test scale.

The main innovation in the study was the use of immediate and delayed post-tests. The immediate post-tests were administered within a day of the teaching materials to a randomly chosen 25% of the group participants. The delayed post-tests were administered to all the participants about five weeks later. Both post-tests were scored using the pre-test scale. For present purposes, it is permissible once more to ignore the division into different and similar groups. Doing so, three points emerge from analysis of the scores: firstly, the mean group score of 2.14 was significantly (P < 0.001) lower than the mean pre-test score of 2.35; secondly, the mean pre-test score was, in turn, significantly (P < 0.001) lower than the mean delayed post-test score of 2.68; and thirdly, the mean pre-test score was no different from the mean immediate post-test score of 2.42. Taken together, these points suggest not only that the group score is a poor index of learning, but also that the bulk of whatever boosted eventual performance took place once the groups were over.

THE OPTIMIZATION OF GROUP WORK

Clearly, establishing the validity of pre- to post-test change as a measure of progress is a crucial step towards establishing Socratic dialogue as helpful for physics. However, it is no less crucial in deciding how to resolve the issues that would be outstanding should Socratic dialogue be regarded as established. It is not difficult to see what some of the issues might be, for in thinking of teaching materials that would support Socratic dialogue, it is possible to envisage them with or without an empirical component and with or without some expert guidance. In the studies that I was involved in, the teaching materials did have an empirical component in that the pupils tested their predictions. However, expert guidance was studiously avoided. Yet there is no suggestion that this combination was optimal. On the contrary, given that observational evidence must have been adduced as part of the discussions, the empirical component may have been redundant or even distracting. Given that, although statistically significant, the delayed post-test progress was still relatively small, an element of guidance might have been of use.

There can be no doubt that clarification of these issues would be extremely useful for, once more, it would sharpen the implications for computer-assisted learning. After all, as O’Shea and Self [26] point out, many events relevant to physics are hard to observe in the world as it stands. There may be danger, the scale may be wrong or things may happen too quickly. Hence, should an empirical component be useful, computer simulations along the lines pioneered by diSessa [27] and White [28] could play a role. In addition, asking teachers to guide Socratic dialogue in a progressive direction may be asking too much. The time commitment may be too great. Moreover, at the primary school level where, in the U.K. at least, physics is being actively promoted, teachers freely admit to lacking the necessary knowledge (HMI [29]). Consequently, should expert guidance prove helpful, a role for ‘intelligent tutoring’ (Murray [30]) would be clearly marked out.

Thus, the issues of empirical evidence and expert support are highly important, and yet very little is known by way of resolution. It is true that something can be gleaned from the Socratic dialogue literature, for the combination adopted in my studies was not necessarily mirrored elsewhere. Thus, information is to be gained contrastively, and the information, by and large, is intriguing. For example, in my floating and sinking studies, the pupils precluded from Socratic dialogue showed no pre- to post-test change despite ample empirical support. The pupils in Nussbaum and Novick’s [16] study showed considerable on-task progress after extensive Socratic dialogue with no empirical support. By contrast, in Tudge’s [21] work with balance beams, empirical support was precluded while Socratic dialogue can be presumed. Yet, pre- to post-test change was found to be small. Moreover, in my study on freefall under horizontal motion, there were signs, reported in Howe et al. [31], that scrutinizing the results of testing was related to progress. Seeking to reconcile such results, it may be that, although never sufficient, empirical evidence is useful when the teaching materials contextualize it as a meaningful thing to seek. Certainly, Tudge’s work and my freefall study used materials which, through explicit variable manipulation and a strong predictive component, should have achieved such contextualization, and this is a marked point of contrast with the other research.

Be that as it may, such post hoc analysis can be merely suggestive. The studies under consideration varied along many dimensions in addition to the relevant ones, meaning that controlled conditions were not approached. Unfortunately, remedying the situation would be far from straightforward. Take, for instance, the imposition of expert guidance. If imposed abruptly, it might undermine what Socratic dialogue achieves. However, accepting gradual imposition, it is unclear how it should be timed and how much should be revealed. The options for timing would seem to be during the dialogue, at its completion or sometime later. In favour of the first would be evidence from the delayed post-test of the motion down an incline study, for in addition to the predictive questions, the pupils were asked about their quest for further information, via books, informants or direct observation, once the teaching materials were over. The 23% who responded affirmatively performed no better in terms of pre- to delayed post-test change than the remainder of the sample, suggesting that the crucial information was generated within the groups. In favour of the final option would be the evidence already presented that whatever its basis the learning may have been post-group. As regards the revelation of expert knowledge, it is clear from both the motion down an incline study and the floating and sinking that something only slightly more advanced would make little difference. This is because when, in those studies, the pupils worked with peers who were relatively advanced, they progressed no more than when they worked with peers who were more or less equivalent. Also, the more advanced pupils in the asymmetrical groups learned as much as the less. This said, the implications at higher levels of expertise remain unclear.

Issues like these would need consideration before research into empirical evidence or expert guidance could proceed. However, the complex design that consideration would probably lead to would nevertheless fail to suffice. The issues under consideration are general ones, calling for a programme of studies in which the variables that might interact with the focal ones were also contrasted. These variables would include the ages of the pupils and the quality of their preconceptions, the latter being defined against both the received wisdom of physics and the group the pupils were working with. The variables would also include the subject matter of the teaching materials and their mode of presentation. Should the role of empirical evidence be studied, computer- vs noncomputer-presentation would be an important dimension, for observations reported by Howe et al. [20] suggest that in the context of experimentation, simulated events are treated as more veridical than real ones!
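To give a sense of the scale involved, the crossing of variables just described can be sketched in a few lines of Python. This is a rough illustration only. The variables are those named above, but the levels assigned to each (for example the three age bands) are assumptions introduced purely for the sketch, as are the names `factors` and `design_cells`. The timing options for expert guidance and the three subject matters are taken from the studies discussed.

```python
from itertools import product

# Hypothetical levels for each variable. The text names the variables
# but not their levels, so every value below is an illustrative assumption.
factors = {
    "empirical_evidence": ["available", "withheld"],
    "expert_guidance": ["none", "during dialogue", "at completion", "later"],
    "pupil_age": ["8-9", "10-11", "12-13"],
    # Preconceptions are defined against received physics ...
    "preconception_level": ["low", "middle", "high"],
    # ... and against the group the pupil works with.
    "group_composition": ["similar", "different"],
    "subject_matter": [
        "floating and sinking",
        "motion down an incline",
        "freefall under horizontal motion",
    ],
    "presentation": ["computer", "non-computer"],
}


def design_cells(factors):
    """Return every condition in the full crossing of the given factors."""
    names = list(factors)
    return [dict(zip(names, combo)) for combo in product(*factors.values())]


cells = design_cells(factors)
print(len(cells))  # number of distinct conditions in the full crossing
```

Even with these modest assumed levels the full crossing runs to 864 conditions, which is one way of seeing why a programme of studies, rather than a single design, is called for.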

CONCLUSIONS

Clearly, a major undertaking has to be envisaged. Yet even if it occurred, it might do little more than bear on the principles relevant to a particular aspect of physics. Moreover, even in that context, it would relate only to a few specific questions. Nevertheless, there can be no alternative if the establishment of learning principles is the ultimate aim, and the present paper began by endorsing the aim as virtually self-evident. Now that the full complexities are beginning to be clear, it might seem appropriate to think again. It might be tempting to argue that even though the establishment of learning principles should, in an ideal world, be taken as the aim, the world is not ideal. Pupils need to be educated now and teachers need to select from the materials that already exist. In this imperfect situation, it might seem incumbent on researchers to cut their losses and ask quite simply which materials work.

The line of argument is seductive but it is nevertheless mistaken. Showing that something works also implies controlled investigation, and without a guiding question, there are endless controls that might be imposed. Pupils who are kept from the topic of teaching, pupils who work on the topic with different materials and pupils who work with the materials in different contexts are three of many options, and within each option there are countless possibilities. There are, for example, innumerable different materials on given topics with which pupils might work. This recognized, it should be clear that the assessment of workability would be no more straightforward than the specification of principles. Since only the latter holds any hope of longer term benefits, it has to be preferred.

Of course, the difficulty in assessing workability is not always recognized, either by the designers of teaching materials or by those who select them. Thus, materials appear which have supposedly been ‘evaluated’ but which in fact have been assessed in extremely limited contexts, and teachers are persuaded to use them expecting success. Not surprisingly, the results are often disappointing, leading to confusion and, increasingly in the U.K., national anxiety about educational attainment. With a clearer understanding of what evaluation means, it would still be acceptable to conduct modest assessments. Budgets might allow nothing more. However, such assessments could be planned in such a way as to bear on learning principles, thereby becoming part of the overall programme. Indeed, for teachers also, greater circumspection should eventually be helpful, for it would encourage them to hedge bets and choose materials that, on the one hand, reflect what is known and, on the other, represent the range of possibilities when matters are unclear.

In making this point, the paper is of course moving away from the evaluation of teaching materials within the research community, and acknowledging that parallel evaluation takes place daily in educational settings. Hopefully, the present paper carries an implicit message of support to those who conduct such evaluation, pinpointing the extent of the uncertainties within which they have to work. Hopefully also, the paper counsels patience to those who want immediate returns from the research community to lighten the practical task. If the paper achieves this, there can be some hope that when the research community conducts the empirical studies that true evaluation requires, it will be welcome in what must surely, regardless of underlying question, be the best context in which to operate: real classrooms with real teachers fully involved in the work.

REFERENCES

1. Redish E. F. and Risley J. S., Computers in Physics Instruction. The University of Chicago Press, Chicago (1990).
2. Driver R. and Erickson G., Theories in action: some theoretical and empirical issues in the study of students’ conceptual frameworks in science. Studies Sci. Educ. 10, 37-60 (1983).
3. Gilbert J. K. and Watts D. M., Concepts, misconceptions and alternative conceptions: changing perspectives in science education. Studies Sci. Educ. 10, 61-98 (1983).
4. McCloskey M., Naive theories of motion. In Mental Models (Edited by Gentner D. and Stevens A. L.). Erlbaum, N.J. (1983).
5. diSessa A., Knowledge in pieces. In Constructivism in the Computer Age (Edited by Forman G. and Pufall P. B.). Erlbaum, N.J. (1988).
6. Hewson M. G. and Hewson P. W., The effects of instruction using students’ prior knowledge and conceptual change strategies on science learning. J. Res. Sci. Teach. 20, 731-743 (1983).
7. Champagne A. B., Klopfer L. E. and Gunstone R., Cognitive research and the design of science instruction. Educ. Psychol. 17, 31-53 (1982).
8. Clement J., A conceptual model discussed by Galileo and used intuitively by physics students. In Mental Models (Edited by Gentner D. and Stevens A. L.). Erlbaum, Hillsdale, N.J. (1983).
9. Department of Education and Science. Science in the National Curriculum. HMSO, London (1989).
10. Light P., Foot T., Colbourn C. and McLelland I., Collaborative interactions at the microcomputer keyboard. Educ. Psychol. 7, 13-21 (1987).
11. Vygotsky L. S., Thought and Language. MIT Press, Cambridge, Mass. (1962).
12. Piaget J., The Equilibration of Cognitive Structures. University of Chicago Press, Chicago (1985).
13. Champagne A. B., Gunstone R. and Klopfer L. E., Effecting changes in cognitive structure amongst physics students. Paper presented to American Educational Research Association, Montreal (1983).
14. Forman E. A., The role of peer interaction in the social construction of mathematical knowledge. In Peer Interaction, Problem Solving and Cognition: Multi-disciplinary Perspectives (Edited by Webb N.). Pergamon Press, Oxford (1989).
15. Forman E. A. and Kraker M. J., The social origins of logic: the contributions of Piaget and Vygotsky. In Peer Conflict and Psychological Growth (Edited by Berkowitz M.). Jossey-Bass, San Francisco (1985).
16. Nussbaum J. and Novick S., Brainstorming in the classroom to invent a model: a case study. School Sci. Rev. 62, 771-778 (1981).
17. Trowbridge D., An investigation of groups working at the computer. In Applications of Cognitive Psychology: Problem Solving, Education and Computing (Edited by Berger D. E., Pezdek K. and Banks W. P.). Erlbaum, Hillsdale, N.J. (1987).
18. Osborne R. and Freyberg P., Learning in Science. Heinemann, Auckland (1985).
19. Thorley N. R. and Treagust D. F., Conflict within dyadic interactions as a stimulant for conceptual change in physics. Int. J. Sci. Educ. 9, 203-216 (1987).
20. Howe C. J., Tolmie A. and Mackenzie M., Computer support for the collaborative learning of physics concepts. In Computer Supported Collaborative Learning (Edited by O’Malley C.). (In press).
21. Tudge J., When collaboration leads to regression: some negative consequences of socio-cognitive conflict. Eur. J. Social Psychol. 19, 123-138 (1989).
22. Tolmie A., Howe C. J., Anderson A., Mayes J. T. and Mackenzie M., Peer interaction in the teaching of physics. Paper presented at the British Psychological Society Annual Conference, Swansea (1990).
23. Howe C. J., Rodgers C. and Tolmie A., Physics in the primary school: peer interaction and the understanding of floating and sinking. Eur. J. Sci. Educ. (In press).
24. Brown R., Group Processes. Blackwells, Oxford (1988).
25. Howe C. J., Tolmie A. and Rodgers C., Constructive interaction and science concepts: an investigation through primary school children’s understanding of motion down an incline. Paper presented at the British Psychological Society Annual Conference, St Andrews (1989).
26. O’Shea T. and Self J., Learning and Teaching with Computers: Artificial Intelligence in Education. Harvester Press, Brighton (1983).
27. diSessa A., Unlearning Aristotelian physics: a study of knowledge-based learning. Cognitive Sci. 6, 37-75 (1982).
28. White B. Y., Designing computer games to help physics students understand Newton’s Laws of Motion. Cognition Instruction 1, 69-108 (1984).
29. HMI. Science in Primary Schools. HMSO, London (1984).
30. Murray D. M., A survey of user cognitive modelling. NPL Report DITC 92/87 (1988).
31. Howe C. J., Tolmie A. and Anderson A., Information technology and group work in physics. J. Comp. Assist. Learn. (In press).