
This completely rewritten and updated fifth edition of the long-running bestseller, Research Methods in Education, covers the whole range of methods currently employed by educational researchers. It continues to be the standard text for students undertaking educational research whether at undergraduate or postgraduate level.

This new edition constitutes the largest reshaping of the text to date and includes new material on:

• qualitative, naturalistic and ethnographic research methods;
• curricular and evaluative research;
• critical theory and educational research;
• feminist perspectives on research;
• research and policy-making;
• planning educational research;
• critical action research;
• statistical analysis;
• sampling, reliability and validity;
• event-history analysis;
• meta-analysis and multi-level modelling;
• nominal group technique and Delphi techniques;
• case study planning;
• qualitative data analysis;
• questionnaire design and construction;
• focus groups;
• testing;
• test construction and item response theory;
• recent developments in educational research, including Internet usage, simulations, fuzzy logic, Geographical Information Systems, needs assessment and evidence-based education.

This user-friendly text provides both the theory that underpins research methodology and very practical guidelines for conducting educational research. It is essential reading for both the professional researcher and the consumer of research—the teacher, educational administrator, adviser, and all those concerned with educational research and practice.

Louis Cohen is Emeritus Professor of Education at Loughborough University. Lawrence Manion is former Principal Lecturer in Music at Didsbury School of Education, Manchester Metropolitan University. Keith Morrison is Senior Professor of Education at the Inter-University Institute of Macau.

Research Methods in Education

Fifth edition

Louis Cohen, Lawrence Manion and Keith Morrison

London and New York

First published 2000 by RoutledgeFalmer, 11 New Fetter Lane, London EC4P 4EE

Simultaneously published in the USA and Canada by RoutledgeFalmer, 29 West 35th Street, New York, NY 10001

This edition published in the Taylor & Francis e-Library, 2005.

“To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.”

RoutledgeFalmer is an imprint of the Taylor & Francis Group

© 2000 Louis Cohen, Lawrence Manion and Keith Morrison

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Cataloguing in Publication Data

Cohen, Louis
Research methods in education / Louis Cohen, Lawrence Manion, and Keith Morrison.—5th ed.
p. cm.
Includes bibliographical references (p. ) and index.
ISBN 0-415-19541-1 (pbk.)
1. Education—Research. 2. Education—Research—Great Britain. I. Manion, Lawrence. II. Morrison, Keith (Keith R.B.) III. Title.
LB 1028 .C572 2000
370’.7’2–dc21

ISBN 0-203-22434-5 Master e-book ISBN
ISBN 0-203-22446-9 (Adobe eReader Format)
ISBN 0-415-19541-1

‘To understand is hard. Once one understands, action is easy.’ (Sun Yat Sen, 1866–1925)

Contents

List of boxes
Acknowledgements
Introduction

Part 1 The context of educational research

1 The nature of inquiry
  Introduction
  The search for truth
  Two conceptions of social reality
  Positivism
  The assumptions and nature of science
  The tools of science
  The scientific method
  Criticisms of positivism and the scientific method
  Alternatives to positivistic social science: naturalistic approaches
  A question of terminology: the normative and interpretive paradigms
  Phenomenology, ethnomethodology and symbolic interactionism
  Criticisms of the naturalistic and interpretive approaches
  Critical theory and critical educational research
  Criticisms of approaches from critical theory
  Critical theory and curriculum research
  A summary of the three paradigms
  Feminist research
  Research and evaluation
  Research, politics and policy-making
  Methods and methodology

Part 2 Planning educational research

2 The ethics of educational and social research
  Introduction
  Informed consent
  Access and acceptance
  Ethics of social research
  Sources of tension
  Voices of experience
  Ethical dilemmas
  Ethics and research methods in education
  Ethics and teacher evaluation
  Research and regulation
  Conclusion

3 Research design issues: planning research
  Introduction
  A framework for planning research
  A planning matrix for research
  Managing the planning of research
  Conclusion

4 Sampling
  Introduction
  The sample size
  Sampling error
  The representativeness of the sample
  The access to the sample
  The sampling strategy to be used
  Conclusion

5 Validity and reliability
  Defining validity
  Triangulation
  Ensuring validity
  Defining reliability
  Validity and reliability in interviews
  Validity and reliability in experiments
  Validity and reliability in questionnaires
  Validity and reliability in observations
  Validity and reliability in tests
  Validity and reliability in life histories

Part 3 Styles of educational research

6 Naturalistic and ethnographic research
  Elements of naturalistic inquiry
  Planning naturalistic research
  Critical ethnography
  Computer usage
  Some problems with ethnographic and naturalistic approaches

7 Historical research
  Introduction
  Choice of subject
  Data collection
  Evaluation
  Writing the research report
  The use of quantitative methods
  Life histories

8 Surveys, longitudinal, cross-sectional and trend studies
  Some preliminary considerations
  Survey sampling
  Longitudinal, cross-sectional and trend studies
  Strengths and weaknesses of cohort and cross-sectional studies
  Event history analysis

9 Case studies
  Introduction
  What is a case study?
  Types of case study
  Why participant observation?
  Recording observations
  Planning a case study
  Conclusion

10 Correlational research
  Introduction
  Explaining correlation and significance
  Curvilinearity
  Co-efficients of correlation
  Characteristics of correlational studies
  Interpreting the correlation co-efficient
  Examples of correlational research

11 Ex post facto research
  Introduction
  Characteristics of ex post facto research
  Occasions when appropriate
  Advantages and disadvantages of ex post facto research
  Designing an ex post facto investigation
  Procedures in ex post facto research

12 Experiments, quasi-experiments and single-case research
  Introduction
  Designs in educational experimentation
  A pre-experimental design: the one group pretest-post-test
  A ‘true’ experimental design: the pretest-post-test control group design
  A quasi-experimental design: the non-equivalent control group design
  Procedures in conducting experimental research
  Examples from educational research
  Single-case research: ABAB design
  Meta-analysis in educational research

13 Action research
  Introduction
  Defining action research
  Principles and characteristics of action research
  Action research as critical praxis
  Procedures for action research
  Some practical and theoretical matters
  Conclusion

Part 4 Strategies for data collection and researching

14 Questionnaires
  Ethical issues
  Approaching the planning of a questionnaire
  Types of questionnaire items
  The layout of the questionnaire
  Piloting the questionnaire
  Postal questionnaires
  Processing questionnaire data

15 Interviews
  Introduction
  Conceptions of the interview
  Purposes of the interview
  Types of interview
  Planning interview-based research procedures
  Group interviewing
  Focus groups
  The non-directive interview and the focused interview
  Telephone interviewing
  Ethical issues in interviewing

16 Accounts
  Introduction
  The ethogenic approach
  Characteristics of accounts and episodes
  Procedures in eliciting, analysing and authenticating accounts
  Network analyses of qualitative data
  What makes a good network?
  Discourse analysis
  Analysing social episodes
  Account gathering in educational research: an example
  Problems in gathering and analysing accounts
  Strengths of the ethogenic approach
  A note on stories

17 Observation
  Structured observation
  Critical incidents
  Naturalistic observation
  Ethical considerations
  Conclusion

18 Tests
  Parametric and non-parametric tests
  Norm-referenced, criterion-referenced and domain-referenced tests
  Commercially produced tests and researcher-produced tests
  Constructing a test
  Devising a pretest and post-test
  Reliability and validity of tests
  Ethical issues in preparing for tests
  Computerized adaptive testing

19 Personal constructs
  Introduction
  Characteristics of the method
  ‘Elicited’ versus ‘provided’ constructs
  Allotting elements to constructs
  Laddering and pyramid constructions
  Grid administration and analysis
  Procedures in grid administration
  Procedures in grid analysis
  Strengths of repertory grid technique
  Difficulties in the use of repertory grid technique
  Some examples of the use of repertory grid in educational research
  Grid technique and audio/video lesson recording

20 Multi-dimensional measurement
  Introduction
  Elementary linkage analysis: an example
  Steps in elementary linkage analysis
  Cluster analysis: an example
  Factor analysis: an example
  Multi-dimensional tables
  Multi-dimensional data: some words on notation
  Degrees of freedom
  A note on multilevel modelling

21 Role-playing
  Introduction
  Role-playing versus deception: the argument
  Role-playing versus deception: the evidence
  Role-playing in educational settings
  The uses of role-playing
  Strengths and weaknesses of role-playing and other simulation exercises
  Role-playing in an educational setting: an example
  Evaluating role-playing and other simulation exercises

Part 5 Recent developments in educational research

22 Recent developments
  The Internet
  Simulations
  Fuzzy logic
  Geographical Information Systems
  Needs analysis
  Evidence-based education

Notes
Bibliography
Index

Boxes

1.1 The subjective—objective dimension
1.2 Alternative bases for interpreting social reality
1.3 The functions of science
1.4 The hypothesis
1.5 Stages in the development of a science
1.6 An eight-stage model of the scientific method
1.7 A classroom episode
1.8 Differing approaches to the study of behaviour
2.1 The costs/benefits ratio
2.2 Guidelines for reasonably informed consent
2.3 Close encounters of a researcher kind
2.4 Conditions and guarantees proffered for a school-based research project
2.5 Negotiating access checklist
2.6 Absolute ethical principles in social research
2.7 An extreme case of deception
2.8 Ethical principles for the guidance of action researchers
2.9 An ethical code: an illustration
3.1 Elements of research styles
3.2 Statistics available for different types of data
3.3 A matrix for planning research
3.4 A planning sequence for research
3.5 A planning matrix for research
4.1 Determining the size of a random sample
4.2 Sample size, confidence levels and sampling error
4.3 Distribution of sample means showing the spread of a selection of sample means around the population mean
5.1 Principal sources of bias in life history research
7.1 Some historical interrelations between men, movements and institutions
7.2 A typology of life histories and their modes of presentation
8.1 Stages in the planning of a survey
8.2 Types of developmental research
8.3 Advantages of cohort over cross-sectional designs
8.4 The characteristics, strengths and weaknesses of longitudinal, cross-sectional, trend analysis, and retrospective longitudinal studies
9.1 Possible advantages of case study
9.2 Nisbet and Watt’s (1984) strengths and weaknesses of case study
9.3 A typology of observation studies
9.4 The case study and problems of selection
9.5 Continua of data collection, types and analysis in case study research
10.1 Common measures of relationship
10.2 Correlation scatter diagrams
10.3 A line diagram to indicate curvilinearity
10.4 Visualization of correlation of 0.65 between reading grade and arithmetic grade
10.5 Correlations between the various estimates of academic rank
12.1 The effects of randomization
12.2 The ABAB design
12.3 An ABAB design in an educational setting
12.4 Class size and learning in well-controlled and poorly controlled studies
13.1 Action research in classroom and school
13.2 A model of emancipatory action research for organizational change
14.1 A flow chart technique for question planning
14.2 A flow chart for the planning of a postal survey
15.1 Attributes of ethnographers as interviewers
15.2 Summary of relative merits of interview versus questionnaire
15.3 Strengths and weaknesses of different types of interview
15.4 The selection of response mode
16.1 Principles in the ethogenic approach
16.2 Account gathering
16.3 Experience sampling method
16.4 Concepts in children’s talk
16.5 ‘Ain’t nobody can talk about things being about theirselves’
16.6 Parents and teachers: divergent viewpoints on children’s communicative competence
16.7 Justification of objective systematic observation in classroom settings
17.1 A structured observation schedule
18.1 A matrix of test items
18.2 Compiling elements of test items
19.1 Eliciting constructs and constructing a repertory grid
19.2 Allotting elements to constructs: three methods
19.3 Laddering
19.4 Elements
19.5 Difference score for constructs
19.6 Grid matrix
20.1 Rank ordering of ten children on seven constructs
20.2 Intercorrelations between seven personal constructs
20.3 The structuring of relationships among the seven personal constructs
20.4 Central profiles (percentage occurrence) at 12-cluster levels
20.5 Factor analysis of responsibility for stress items
20.6 Factor analysis of the occupational stress items
20.7 Factor analysis of the occupational satisfaction items
20.8 Correlations between (dependent) stress and (independent) satisfaction factors and canonical variates
20.9 Biographical data and stress factors
20.10 Students’ perceptions of social episodes
20.11 Perception of social episodes
20.12 Person concept coding system
20.13 Reliability co-efficients for peer descriptions
20.14 Sex, voting preference and social class: a three-way classification table
20.15 Sex, voting preference and social class: a three-way notational classification
20.16 Expected frequencies in sex, voting preference and social class
20.17 Expected frequencies assuming that sex is independent of social class and voting preference
20.18 Sex and voting preference: a two-way classification table
21.1 Dimensions of role-play methods
21.2 The Stanford Prison experiment
21.3 A flow chart for using role-play
21.4 Critical factors in a role-play: smoking and young people
21.5 Categorization of responses to the four video extracts
22.1 Geographical Information Systems in secondary schools
22.2 Location of home postcodes using Geographical Information Systems

Acknowledgements

Our thanks are due to the following publishers and authors for permission to include materials in the text:

Academic Press Inc., London, for Bannister, D. and Mair, J.M.M. (1968) The Evaluation of Personal Constructs, Box 19.2; words from LeCompte, M., Millroy, W.L. and Preissle, J. (eds) (1992) The Handbook of Qualitative Research in Education.

Addison-Wesley Publishing Co., Wokingham, for words from Boyd-Barrett, D. and Scanlon, E. (1991) Computers and Learning.

Allyn & Bacon, Boston, Massachusetts, for words from Bogdan, R.G. and Biklen, S.K. (1992) Qualitative Research for Education (second edition).

Associated Book Publishers Ltd., London, for Fransella, F. (1975) Need to Change?, Box 19.3.

Carfax Publishing Co., Abingdon, for McCormick, J. and Solman, R. (1992) Teachers’ attributions of responsibility for occupational stress and satisfaction: an organisational perspective, Educational Studies, 18 (2), 201–22, Box 20.5, Box 20.6, Box 20.7; Halpin, D. et al. (1990) Teachers’ perceptions of the effects of inservice education, British Educational Research Journal, 16 (2), 163–77, Box 10.5.

Cassell, London, for words from Tymms, P. (1996) Theories, models and simulations: school effectiveness at an impasse, in J. Gray, D. Reynolds, C.T. Fitz-Gibbon and D. Jesson (eds) Merging Traditions: the Future of Research on School Effectiveness and School Improvement, 121–35.

Centre for Applied Research in Education, Norwich, East Anglia, for words from MacDonald, B. (1987) Research and Action in the Context of Policing, paper commissioned by the Police Federation.

Corwin Press, Newbury Park, California, for words from Millman, J. and Darling-Hammond, L. (eds) (1991) The New Handbook of Teacher Evaluation.

Countryside Commission, Cheltenham, for Davidson, J. (1970) Outdoor Recreation Surveys: the Design and Use of Questionnaires for Site Surveys, Box 8.1.

Deakin University Press, Deakin, Australia, for words from Kemmis, S. and McTaggart, R. (1981) The Action Research Planner, and Kemmis, S. and McTaggart, R. (1992:8 and 21–8) The Action Research Planner (third edition).

Elsevier Science Ltd, for words from Haig, B.D. (1997) Feminist research methodology, in J.P. Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook, Oxford: Elsevier Science Ltd, 222–31.

Forgas, J.P. (1976) Journal of Personal and Social Psychology, 34 (2), 199–209, Box 20.10.

George Allen & Unwin, London, for Plummer, K. (1983) Documents of Life: Introduction to the Problems and Literature of a Humanistic Method, Box 2.8; Whyte, W.F. (1982) Interviewing in field research, in R.G. Burgess (ed.) Field Research: a Sourcebook and Field Manual, 111–22.

Harcourt Brace Jovanovich, Inc., New York, for Tuckman, B.W. (1972) Conducting Educational Research, Box 16.5 and Box 15.2.

Harper & Row Publishers, London, for Cohen, L. (1977) Educational Research in Classrooms and Schools, Box 20.1, Box 20.2 and Box 20.3.

Heinemann Educational Books Ltd, London, for Hoinville, G. and Jowell, R. (1978) Survey Research Practice, Box 8.1.

Hodder & Stoughton, Sevenoaks, for words from Frankfort-Nachmias, C. and Nachmias, D. (1992) Research Methods in the Social Sciences.

Jossey-Bass, San Francisco, for words from Reynolds, P.D. (1979) Ethical Dilemmas and Social Science Research.

Kogan Page, London, for words from Norris, N. (1990) Understanding Educational Evaluation.

Krejcie, R.V. and Morgan, D.W. (1970) Determining sample size for research activities, Educational and Psychological Measurement, 30, 609, copyright Sage Publications Inc., reprinted by permission of Sage Publications Inc.

Kvale, S. (1996) Interviews, pp. 30, 88, 145, copyright Sage Publications Inc., reprinted by permission of Sage Publications Inc.

Lincoln, Y.S. and Guba, E.G. (1985) words from Naturalistic Inquiry, copyright Sage Publications Inc., reprinted by permission of Sage Publications Inc.

Methuen & Co, London, for words from Shipman, M.D. (1974) Inside a Curriculum Project.

Mies, M. (1993) for words from Towards a methodology for feminist research, in M. Hammersley (ed.) Social Research: Philosophy, Politics and Practice, reprinted by permission of Sage Publications Inc.

Multilingual Matters Ltd, Clevedon, for figures from Parsons, E., Chalkley, B. and Jones, A. (1996) The role of Geographic Information Systems in the study of parental choice and secondary school catchments, Evaluation and Research in Education, 10 (1), 23–34; for words from Stronach, I. and Morris, B. (1994) Polemical notes on educational evaluation in an age of ‘policy hysteria’, Evaluation and Research in Education, 8 (1 & 2), 5–19.

Open Books, London and Shepton Mallet, for Bennett, S.N. (1976) Teaching Styles and Pupil Progress, Box 20.4.

Open University Press, Milton Keynes, for words from Bell, J. (1987) Doing Your Research Project; Hopkins, D. (1985) A Teacher’s Guide to Classroom Research; Pilliner, A. (1973) Experiment in Educational Research, E 341, Block 5; Rose, D. and Sullivan, O. (1993) Introducing Data Analysis for Social Scientists.

Parsons, E., Chalkley, B. and Jones, A. (1996) The role of Geographic Information Systems in the study of parental choice and secondary school catchments, Evaluation and Research in Education, 10 (1), 23–34, published by Multilingual Matters Ltd, Clevedon, Boxes 22.1 and 22.2.

Patton, M.Q. (1980) Qualitative Evaluation Methods, p. 206, copyright Sage Publications Inc., reprinted by permission of Sage Publications Inc.

Peter Francis Publishers, Dereham, Norfolk, for material from Morrison, K.R.B. (1993) Planning and Accomplishing School-Centred Evaluation.

Prentice-Hall, Englewood Cliffs, New Jersey, for words from Denzin, N.K. (1989) The Research Act (third edition).

Routledge, London, for words from Hitchcock, G. and Hughes, D. (1989) Research and the Teacher: a Qualitative Introduction to School-based Research, Box 7.2; words from Hitchcock, G. and Hughes, D. (1995) Research and the Teacher (second edition); words from Carspecken, P.F. (1996) Critical Ethnography in Educational Research; Wragg, E.C. (1994) An Introduction to Classroom Observation.

Taylor & Francis, London, for words and figure from Zuber-Skerritt, O. (ed.) (1996) New Directions in Action Research, published by Falmer, pp. 99 and 17; James, M. (1993) Evaluation for policy: rationality and political reality: the paradigm case of PRAISE, in R.G. Burgess (ed.) Educational Research and Evaluation for Policy and Practice, published by Falmer.

University of Chicago Press, Chicago, for words from Diener, E. and Crandall, R. (1978) Ethics in Social and Behavioral Research.

Introduction

It is six years since the fourth edition of Research Methods in Education was published and we are indebted to RoutledgeFalmer for the opportunity to produce a fifth edition. The book continues to be received very favourably worldwide and we should like to thank reviewers for their constructive comments which have helped in the production of this fifth edition. In particular, this has led to the substantial increase in the coverage of qualitative approaches to educational research, which has resulted in a fairer balance to the book. This new edition constitutes the largest reshaping of the book to date, and includes a reorganization of the material into five parts that catch the range of issues in planning educational research: (a) the context of educational research; (b) planning educational research; (c) styles of educational research; (d) strategies for data collection and researching; (e) recent developments in educational research. Much of the material from the previous editions has been relocated within these five parts to make them more accessible to the reader, and the careful titling of chapters is designed to increase this accessibility. Within these main parts the book includes considerable additional material to give this edition greater balance and coverage, and to provide examples and greater practical guidance for those who are planning and conducting educational research. This edition includes, also, guidance on data analysis within both qualitative and quantitative approaches, and issues in reporting research. In particular the following are included: Part One:

• additional material on interpretive, ethnographic, interactionist, phenomenological and qualitative perspectives;

• additional material on curricular and evaluative research;

• new material on critical perspectives on educational research, including ideology critique from Habermas and the Frankfurt School, and feminist perspectives;

• new material on research, politics and policy-making.

Part Two:

• an entirely new part that is designed to assist novice researchers to design and conduct educational research, from its earliest stages to its completion. It is envisaged that this part will be particularly useful for higher education students who are undertaking educational research as part of their course requirements.

Part Three:

• considerable new material on naturalistic, qualitative and ethnographic approaches, including critical ethnographies;

• additional material on action research, aligning it to the critical approaches set out in Part One;

• new material and chapters on sampling, reliability and validity, including qualitative approaches to educational research;

• additional explanations of frequently used concepts in quantitative educational research, for example statistical significance, correlations, regression, curvilinearity, and an indication of particular statistics to use for data analysis;

• new and additional material on event-history analysis, meta-analysis and multilevel modelling;


• an introduction to Nominal Group Technique and Delphi techniques;

• additional material on case study planning and implementation;

• additional material on data analysis for qualitative data, e.g. content analysis and coding, analysis of field notes, cognitive mapping, patterning, critical events and incidents, analytic induction and constant comparison.

Part Four:

• new material and chapters on questionnaire design and construction, interviews, focus groups, telephone interviewing, observation, the laddering and pyramid designs of personal constructs, speech acts, and stories, including analysis of data derived from these instruments for data collection;

• a new chapter on testing, test construction, item response theory, item analysis, item difficulty and discriminability and computer adaptive testing;

• additional material on contingency tables and statistical significance.

Part Five:

• a new chapter on recent developments in educational research, including material on Internet usage, simulations, fuzzy logic, Geographical Information Systems, needs analysis/assessment and evidence-based education.

By careful cross-referencing and the provision of explanations and examples we have attempted both to give considerable coherence to the book and to provide researchers with clear and deliberately practical guidance on all stages of the research process, from planning to operationalization, ethics, methodology, sampling, reliability and validity, instrumentation and data collection, data analysis and reporting. We have attempted to show throughout how practices derive from, and are located within, the contexts of educational research that are set out in Part One. The guidance that we provide is couched in a view of educational research as an ethical activity, and care has been taken to ensure that ethical issues, in addition to the specific chapter on ethics, are discussed throughout the book. The significance of the ethical dimension of educational research is underlined by the relocation of the chapter on ethics to very early on in this edition.

We have deliberately reduced the more extended discussion of published examples in response to feedback on previous editions from reviewers, but we have included detailed backup reference to these and additional references to updated examples for the reader to follow up and consult at will.

We are joined by Keith Morrison for the authorship of this new edition. We welcome the additions and amendments that he has made, in the firm knowledge that these will guarantee the book’s continuing success. Overall, this edition provides a balanced, structured and comprehensive introduction to educational research that sets out both its principles and practice for researchers in a user-friendly way, and which is guided by the principle of Occam’s razor: all things being equal, the simplest explanation is frequently the best, or, as Einstein put it, one should make matters as simple as possible but no simpler! Balancing simplicity and the inescapable complexity of educational research is a high-wire act; we hope to have provided a useful introduction to this in the fifth edition of Research Methods in Education.

Louis Cohen, Ph.D., D.Litt., is Emeritus Professor of Education at Loughborough University. Lawrence Manion, Ph.D., is former Principal Lecturer in Music in Didsbury School of Education, Manchester Metropolitan University. Keith Morrison, Ph.D., is Professor of Education at the Inter-University Institute of Macau.

Part 1
The context of educational research

This part locates the research enterprise in several contexts. It commences with positivist and scientific contexts of research and then proceeds to show the strengths and weaknesses of such traditions for educational research. As an alternative paradigm, the cluster of approaches that can loosely be termed interpretive, naturalistic, phenomenological, interactionist and ethnographic are brought together and their strengths and weaknesses for educational research are also examined. The rise of critical theory as a paradigm in which educational research is conducted has been meteoric and its implications for the research undertaking are addressed in several ways in this chapter, resonating with curriculum research and feminist research. Indeed critical theory links the conduct of educational research with politics and policy-making, and this is reflected in the discussions here of research and evaluation, arguing how much educational research has become evaluative in nature. That educational research serves a political agenda is seen in the later sections of this part, though the links between educational research and policy-making are typically far from straightforward. The intention in this section is to introduce the reader to different research traditions, and, rather than advocating slavish adherence to a single research paradigm, we suggest that ‘fitness for purpose’ must be the guiding principle: different research paradigms are suitable for different research purposes and questions. Different research traditions spawn different styles of research; researchers must make informed choices of research traditions, mindful of the political agendas that their research might serve.

1 The nature of inquiry

Introduction

This chapter explores the context of educational research. It sets out three significant lenses through which to examine the practice of research: (a) scientific and positivistic methodologies; (b) naturalistic and interpretive methodologies; (c) methodologies from critical theory. Our analysis takes as a starting point an important notion from Hitchcock and Hughes (1995:21) who suggest that ontological assumptions give rise to epistemological assumptions; these, in turn, give rise to methodological considerations; and these, in turn, give rise to issues of instrumentation and data collection. This view moves us beyond regarding research methods as simply a technical exercise; it recognizes that research is concerned with understanding the world and that this is informed by how we view our world(s), what we take understanding to be, and what we see as the purposes of understanding. The chapter outlines the ontological, epistemological and methodological premises of the three lenses and examines their strengths and weaknesses. In so doing it recognizes that education, educational research, politics and decision-making are inextricably intertwined, a view which the lens of critical theory, for example, brings sharply into focus in its discussions of curriculum decision-making. Hence this introductory chapter draws attention to the politics of educational research and the implications that this has for undertaking research (e.g. the move towards applied and evaluative research and away from ‘pure’ research).

The search for truth

People have long been concerned to come to grips with their environment and to understand the nature of the phenomena it presents to their senses. The means by which they set out to achieve these ends may be classified into three broad categories: experience, reasoning and research (Mouly, 1978). Far from being independent and mutually exclusive, however, these categories must be seen as complementary and overlapping, features most readily in evidence where solutions to complex modern problems are sought.

In our endeavours to come to terms with the problems of day-to-day living, we are heavily dependent upon experience and authority, and their value in this context should not be underestimated. Nor should their respective roles be overlooked in the specialist sphere of research, where they provide richly fertile sources of hypotheses and questions about the world, though, of course, it must be remembered that as tools for uncovering ultimate truth they have decided limitations. The limitations of personal experience in the form of common-sense knowing, for instance, can quickly be exposed when compared with features of the scientific approach to problem-solving. Consider, for example, the striking differences in the way in which theories are used. Laypeople base them on haphazard events and use them in a loose and uncritical manner. When they are required to test them, they do so in a selective fashion, often choosing only that evidence that is consistent with their hunches and ignoring that which is counter to them. Scientists, by contrast, construct their theories carefully and systematically. Whatever hypotheses they formulate have to be tested empirically so that their explanations have a firm basis in fact. And there is the concept of control distinguishing the layperson’s and the scientist’s attitude to experience. Laypeople generally make no attempt to control any extraneous sources of influence when trying to explain an occurrence. Scientists, on the other hand, only too conscious of the multiplicity of causes for a given occurrence, resort to definite techniques and procedures to isolate and test the effect of one or more of the alleged causes. Finally, there is the difference of attitude to the relationships among phenomena. Laypeople’s concerns with such relationships are loose, unsystematic and uncontrolled. The chance occurrence of two events in close proximity is sufficient reason to predicate a causal link between them. Scientists, however, display a much more serious professional concern with relationships and only as a result of rigorous experimentation will they postulate a relationship between two phenomena.

The second category by means of which people attempt to comprehend the world around them, namely, reasoning, consists of three types: deductive reasoning, inductive reasoning, and the combined inductive-deductive approach. Deductive reasoning is based on the syllogism, which was Aristotle’s great contribution to formal logic. In its simplest form the syllogism consists of a major premise based on an a priori or self-evident proposition, a minor premise providing a particular instance, and a conclusion. Thus:

All planets orbit the sun;
The earth is a planet;
Therefore the earth orbits the sun.
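Read formally, the deduction is nothing more than the application of the universal major premise to the particular instance named in the minor premise. A minimal sketch in Lean makes this explicit; the identifiers (Body, Planet, OrbitsSun, earth) are hypothetical names of our own, not notation from the text:

```lean
-- A minimal formalization of the syllogism above (Lean 4).
-- `Body`, `Planet`, `OrbitsSun` and `earth` are our own hypothetical names.
variable (Body : Type) (Planet OrbitsSun : Body → Prop) (earth : Body)

example
    (major : ∀ b, Planet b → OrbitsSun b)  -- all planets orbit the sun
    (minor : Planet earth)                 -- the earth is a planet
    : OrbitsSun earth :=                   -- therefore the earth orbits the sun
  major earth minor
```

Validity here is purely formal: the conclusion follows whatever ‘planet’ or ‘orbits’ happen to mean, which is precisely the limitation of the syllogism noted below.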

The assumption underlying the syllogism is that through a sequence of formal steps of logic, from the general to the particular, a valid conclusion can be deduced from a valid premise. Its chief limitation is that it can handle only certain kinds of statement. The syllogism formed the basis of systematic reasoning from the time of its inception until the Renaissance. Thereafter its effectiveness was diminished because it was no longer related to observation and experience and became merely a mental exercise. One of the consequences of this was that empirical evidence as the basis of proof was superseded by authority, and the more authorities one could quote, the stronger one’s position became. Naturally, with such abuse of its principal tool, science became sterile.

The history of reasoning was to undergo a dramatic change in the 1600s when Francis Bacon began to lay increasing stress on the observational basis of science. Being critical of the model of deductive reasoning on the grounds that its major premises were often preconceived notions which inevitably bias the conclusions, he proposed in its place the method of inductive reasoning, by means of which the study of a number of individual cases would lead to a hypothesis and eventually to a generalization. Mouly (1978) explains it like this: ‘His basic premise was that if one collected enough data without any preconceived notion about their significance and orientation—thus maintaining complete objectivity—inherent relationships pertaining to the general case would emerge to be seen by the alert observer.’ Bacon’s major contribution to science was thus that he was able to rescue it from the death-grip of the deductive method whose abuse had brought scientific progress to a standstill. He thus directed the attention of scientists to nature for solutions to people’s problems, demanding empirical evidence for verification. Logic and authority in themselves were no longer regarded as conclusive means of proof and instead became sources of hypotheses about the world and its phenomena.

Bacon’s inductive method was eventually followed by the inductive-deductive approach, which combines Aristotelian deduction with Baconian induction. In Mouly’s words, this consisted of:

a back-and-forth movement in which the investigator first operates inductively from observations to hypotheses, and then deductively from these hypotheses to their implications, in order to check their validity from the standpoint of compatibility with accepted knowledge. After revision, where necessary, these hypotheses are submitted to further test through the collection of data specifically designed to test their validity at the empirical level. This dual approach is the essence of the modern scientific method and marks the last stage of man’s progress toward empirical science, a path that took him through folklore and mysticism, dogma and tradition, casual observation, and finally to systematic observation.

(Mouly, 1978)

Although both deduction and induction have their weaknesses, their contributions to the development of science are enormous and fall into three categories: (1) the suggestion of hypotheses; (2) the logical development of these hypotheses; and (3) the clarification and interpretation of scientific findings and their synthesis into a conceptual framework.
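Mouly’s ‘back-and-forth movement’ can be pictured as a loop: induce a tentative generalization from the observations gathered so far, deduce a testable implication from it, check that implication against freshly collected data, and revise where necessary. The following minimal sketch in Python is only an illustration of that control flow; the function names and the toy ‘maximum value’ hypothesis are our own assumptions, not anything proposed in the text:

```python
# A toy sketch of the inductive-deductive cycle described above.
# All names (induce_hypothesis, deduce_prediction, the max-threshold rule)
# are our own illustrative assumptions, not a method from the text.

def induce_hypothesis(observations):
    """Induction: generalize from the cases seen so far to a tentative rule."""
    # Toy hypothesis: 'no value exceeds the largest value observed so far'.
    return max(observations)

def deduce_prediction(ceiling):
    """Deduction: derive a checkable implication of the hypothesis."""
    return lambda value: value <= ceiling

def inductive_deductive_cycle(initial_observations, new_data):
    """Alternate induction and deduction, revising when an empirical test fails."""
    observations = list(initial_observations)
    ceiling = induce_hypothesis(observations)
    for value in new_data:
        prediction = deduce_prediction(ceiling)
        if not prediction(value):                      # the empirical check fails...
            observations.append(value)
            ceiling = induce_hypothesis(observations)  # ...so revise and retest
    return ceiling

print(inductive_deductive_cycle([3, 5, 4], [6, 2, 9]))  # prints 9
```

The point is the shape of the loop rather than the toy rule: induction and deduction alternate, and it is the empirical check that forces revision.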

The third means by which we set out to discover truth is research. This has been defined by Kerlinger (1970) as the systematic, controlled, empirical and critical investigation of hypothetical propositions about the presumed relations among natural phenomena. Research has three characteristics in particular which distinguish it from the first means of problem-solving identified earlier, namely, experience. First, whereas experience deals with events occurring in a haphazard manner, research is systematic and controlled, basing its operations on the inductive-deductive model outlined above. Second, research is empirical. The scientist turns to experience for validation. As Kerlinger puts it, ‘subjective belief…must be checked against objective reality. Scientists must always subject their notions to the court of empirical inquiry and test’. And, third, research is self-correcting. Not only does the scientific method have built-in mechanisms to protect scientists from error as far as is humanly possible, but also their procedures and results are open to public scrutiny by fellow professionals. As Mouly says, ‘This self-corrective function is the most important single aspect of science, guaranteeing that incorrect results will in time be found to be incorrect and duly revised or discarded.’ Research is a combination of both experience and reasoning and must be regarded as the most successful approach to the discovery of truth, particularly as far as the natural sciences are concerned (Borg, 1963).²

Educational research has at the same time absorbed two competing views of the social sciences—the established, traditional view and a more recent interpretive view. The former holds that the social sciences are essentially the same as the natural sciences and are therefore concerned with discovering natural and universal laws regulating and determining individual and social behaviour; the latter view, however, while sharing the rigour of the natural sciences and the same concern of traditional social science to describe and explain human behaviour, emphasizes how people differ from inanimate natural phenomena and, indeed, from each other. These contending views—and also their corresponding reflections in educational research—stem in the first instance from different conceptions of social reality and of individual and social behaviour. It will help our understanding of the issues to be developed subsequently if we examine these in a little more detail.

Two conceptions of social reality

The two views of social science that we have just identified represent strikingly different ways of looking at social reality and are constructed on correspondingly different ways of interpreting it. We can perhaps most profitably approach these two conceptions of the social world by examining the explicit and implicit assumptions underpinning them. Our analysis is based on the work of Burrell and Morgan (1979), who identified four sets of such assumptions.

First, there are assumptions of an ontological kind—assumptions which concern the very nature or essence of the social phenomena being investigated. Thus, the authors ask, is social reality external to individuals—imposing itself on their consciousness from without—or is it the product of individual consciousness? Is reality of an objective nature, or the result of individual cognition? Is it a given ‘out there’ in the world, or is it created by one’s own mind? These questions spring directly from what is known in philosophy as the nominalist-realist debate. The former view holds that objects of thought are merely words and that there is no independently accessible thing constituting the meaning of a word. The realist position, however, contends that objects have an independent existence and are not dependent for it on the knower.

The second set of assumptions identified by Burrell and Morgan are of an epistemological kind. These concern the very bases of knowledge—its nature and forms, how it can be acquired, and how communicated to other human beings. The authors ask whether ‘it is possible to identify and communicate the nature of knowledge as being hard, real and capable of being transmitted in tangible form, or whether knowledge is of a softer, more subjective, spiritual or even transcendental kind, based on experience and insight of a unique and essentially personal nature. The epistemological assumptions in these instances determine extreme positions on the issues of whether knowledge is something which can be acquired on the one hand, or is something which has to be personally experienced on the other’ (Burrell and Morgan, 1979). How one aligns oneself in this particular debate profoundly affects how one will go about uncovering knowledge of social behaviour. The view that knowledge is hard, objective and tangible will demand of researchers an observer role, together with an allegiance to the methods of natural science; to see knowledge as personal, subjective and unique, however, imposes on researchers an involvement with their subjects and a rejection of the ways of the natural scientist. To subscribe to the former is to be positivist; to the latter, anti-positivist.

The third set of assumptions concern human nature and, in particular, the relationship between human beings and their environment. Since the human being is both its subject and object of study, the consequences for social science of assumptions of this kind are indeed far-reaching. Two images of human beings emerge from such assumptions—the one portrays them as responding mechanically to their environment; the other, as initiators of their own actions. Burrell and Morgan write lucidly on the distinction:

Thus, we can identify perspectives in social science which entail a view of human beings responding in a mechanistic or even deterministic fashion to the situations encountered in their external world. This view tends to be one in which human beings and their experiences are regarded as products of the environment; one in which humans are conditioned by their external circumstances. This extreme perspective can be contrasted with one which attributes to human beings a much more creative role: with a perspective where ‘free will’ occupies the centre of the stage; where man [sic] is regarded as the creator of his environment, the controller as opposed to the controlled, the master rather than the marionette. In these two extreme views of the relationship between human beings and their environment, we are identifying a great philosophical debate between the advocates of determinism on the one hand and voluntarism on the other. Whilst there are social theories which adhere to each of these extremes, the assumptions of many social scientists are pitched somewhere in the range between.

(Burrell and Morgan, 1979)

It would follow from what we have said so far that the three sets of assumptions identified above have direct implications for the methodological concerns of researchers, since the contrasting ontologies, epistemologies and models of human beings will in turn demand different research methods. Investigators adopting an objectivist (or positivist) approach to the social world, and who treat it like the world of natural phenomena as being hard, real and external to the individual, will choose from a range of traditional options—surveys, experiments, and the like. Others favouring the more subjectivist (or anti-positivist) approach, and who view the social world as being of a much softer, personal and humanly created kind, will select from a comparable range of recent and emerging techniques—accounts, participant observation and personal constructs, for example.

Where one subscribes to the view which treats the social world like the natural world—as if it were a hard, external and objective reality—then scientific investigation will be directed at analysing the relationships and regularities between selected factors in that world. It will be predominantly quantitative. ‘The concern’, say Burrell and Morgan, ‘is with the identification and definition of these elements and with the discovery of ways in which these relationships can be expressed. The methodological issues of importance are thus the concepts themselves, their measurement and the identification of underlying themes. This perspective expresses itself most forcefully in a search for universal laws which explain and govern the reality which is being observed’ (Burrell and Morgan, 1979). An approach characterized by procedures and methods designed to discover general laws may be referred to as nomothetic.

However, if one favours the alternative view of social reality which stresses the importance of the subjective experience of individuals in the creation of the social world, then the search for understanding focuses upon different issues and approaches them in different ways. The principal concern is with an understanding of the way in which the individual creates, modifies and interprets the world in which he or she finds himself or herself. The approach now takes on a qualitative as well as quantitative aspect. As Burrell and Morgan observe,

The emphasis in extreme cases tends to be placed upon the explanation and understanding of what is unique and particular to the individual rather than of what is general and universal. This approach questions whether there exists an external reality worthy of study. In methodological terms it is an approach which emphasizes the relativistic nature of the social world.

(Burrell and Morgan, 1979)

Such a view is echoed by Kirk and Miller (1986:14). In its emphasis on the particular and individual, this approach to understanding individual behaviour may be termed idiographic.

In this review of Burrell and Morgan’s analysis of the ontological, epistemological, human and methodological assumptions underlying two ways of conceiving social reality, we have laid the foundations for a more extended study of the two contrasting perspectives evident in the practices of researchers investigating human behaviour and, by adoption, educational problems. Box 1.1 summarizes these assumptions in graphic form along a subjective—objective dimension. It identifies the four sets of assumptions by using terms we have adopted in the text and by which they are known in the literature of social philosophy.

Box 1.1 The subjective—objective dimension (source: Burrell and Morgan, 1979) [figure not reproduced]

Each of the two perspectives on the study of human behaviour outlined above has profound implications for research in classrooms and schools. The choice of problem, the formulation of questions to be answered, the characterization of pupils and teachers, methodological concerns, the kinds of data sought and their mode of treatment—all will be influenced or determined by the viewpoint held. Some idea of the considerable practical implications of the contrasting views can be gained by examining Box 1.2, which compares them with respect to a number of critical issues within a broadly societal and organizational framework. Implications of the two perspectives for research into classrooms and schools will unfold in the course of the text. Because of its significance to the epistemological basis of social science and its consequences for educational research, we devote much of the rest of this chapter to the positivist and anti-positivist debate.

Positivism

Although positivism has been a recurrent theme in the history of western thought from the Ancient Greeks to the present day, it is historically associated with the nineteenth-century French philosopher, Auguste Comte, who was the first thinker to use the word for a philosophical position (Beck, 1979). Here explanation proceeds by way of scientific description (Acton, 1975). In his study of the history of the philosophy and methodology of science, Oldroyd (1986) says:

It was Comte who consciously ‘invented’ the new science of society and gave it the name to which we are accustomed. He thought that it would be possible to establish it on a ‘positive’ basis, just like the other sciences, which served as necessary preliminaries to it. For social phenomena were to be viewed in the light of physiological (or biological) laws and theories and investigated empirically, just like physical phenomena. Likewise, biological phenomena were to be viewed in the light of chemical laws and theories; and so on down the line.

(Oldroyd, 1986)

Comte’s position was to lead to a general doctrine of positivism which held that all genuine knowledge is based on sense experience and can only be advanced by means of observation and experiment. Following in the empiricist tradition, it limited inquiry and belief to what can be firmly established, and in thus abandoning metaphysical and speculative attempts to gain knowledge by reason alone, the movement developed what has been described as a ‘tough-minded orientation to facts and natural phenomena’ (Beck, 1979).

Since Comte, the term positivism has been used in such different ways by philosophers and social scientists that it is difficult to assign it a precise and consistent meaning. Moreover, the term has also been applied to the doctrine of a school of philosophy known as ‘logical positivism’.³ The central belief of the logical positivists is that the meaning of a statement is, or is given by, the method of its verification. It follows from this that unverifiable statements are held to be meaningless, the utterances of traditional metaphysics and theology being included in this class.

However the term positivism is used by philosophers and social scientists, a residual meaning is always present, and this derives from an acceptance of natural science as the paradigm of human knowledge (Duncan, 1968). This includes the following connected suppositions, which have been identified by Giddens (1975). First, the methodological procedures of natural science may be directly applied to the social sciences. Positivism here implies a particular stance concerning the social scientist as an observer of social reality. Second, the end-product of investigations by social scientists can be formulated in terms parallel to those of natural science. This means that their analyses must be expressed in laws or law-like generalizations of the same kind that have been established in relation to natural phenomena. Positivism here involves a definite view of social scientists as analysts or interpreters of their subject matter. Positivism may be characterized by its claim that science provides us with the clearest possible ideal of knowledge.

Where positivism is less successful, however, is in its application to the study of human behaviour, where the immense complexity of human nature and the elusive and intangible quality of social phenomena contrast strikingly with the order and regularity of the natural world. This point is nowhere more apparent than in the contexts of classroom and school, where the problems of teaching, learning and human interaction present the positivistic researcher with a mammoth challenge.

For further information on positivism within the history of the philosophy and methodology of science, see Oldroyd (1986). We now look more closely at some of its features.

Box 1.2 Alternative bases for interpreting social reality (source: adapted from Barr Greenfield, 1975)

Philosophical basis
  Objectivist: Realism: the world exists and is knowable as it really is. Organizations are real entities with a life of their own.
  Subjectivist: Idealism: the world exists but different people construe it in very different ways. Organizations are invented social reality.

The role of social science
  Objectivist: Discovering the universal laws of society and human conduct within it.
  Subjectivist: Discovering how different people interpret the world in which they live.

Basic units of social reality
  Objectivist: The collectivity: society or organizations.
  Subjectivist: Individuals acting singly or together.

Methods of understanding
  Objectivist: Identifying conditions or relationships which permit the collectivity to exist. Conceiving what these conditions and relationships are.
  Subjectivist: Interpretation of the subjective meanings which individuals place upon their action. Discovering the subjective rules for such action.

Theory
  Objectivist: A rational edifice built by scientists to explain human behaviour.
  Subjectivist: Sets of meanings which people use to make sense of their world and behaviour within it.

Research
  Objectivist: Experimental or quasi-experimental validation of theory.
  Subjectivist: The search for meaningful relationships and the discovery of their consequences for action.

Methodology
  Objectivist: Abstraction of reality, especially through mathematical models and quantitative analysis.
  Subjectivist: The representation of reality for purposes of comparison. Analysis of language and meaning.

Society
  Objectivist: Ordered. Governed by a uniform set of values and made possible only by those values.
  Subjectivist: Conflicted. Governed by the values of people with access to power.

Organizations
  Objectivist: Goal oriented. Independent of people. Instruments of order in society serving both society and the individual.
  Subjectivist: Dependent upon people and their goals. Instruments of power which some people control and can use to attain ends which seem good to them.

Organizational pathologies
  Objectivist: Organizations get out of kilter with social values and individual needs.
  Subjectivist: Given diverse human ends, there is always conflict among people acting to pursue them.

Prescription for change
  Objectivist: Change the structure of the organization to meet social values and individual needs.
  Subjectivist: Find out what values are embodied in organizational action and whose they are. Change the people or change their values if you can.

For further information on positivism within the history of the philosophy and methodology of science, see Oldroyd (1986). We now look more closely at some of its features.

The assumptions and nature of science

Since a number of the research methods we describe in this book draw heavily on the scientific method, either implicitly or explicitly, and can only be fully understood within the total framework of its principles and assumptions, we will here examine some of the characteristics of science a little more closely.

We begin with an examination of the tenets of scientific faith: the kinds of assumptions held by scientists, often implicitly, as they go about their daily work. First, there is the assumption of determinism. This means simply that events have causes, that events are determined by other circumstances; and science proceeds on the belief that these causal links can eventually be uncovered and understood, that the events are explicable in terms of their antecedents. Moreover, not only are events in the natural world determined by other circumstances, but there is regularity about the way they are determined: the universe does not behave capriciously. It is the ultimate aim of scientists to formulate laws to account for the happenings in the world around them, thus giving them a firm basis for prediction and control.

The second assumption is that of empiricism. We have already touched upon this viewpoint, which holds that certain kinds of reliable knowledge can only originate in experience. In practice, therefore, this means scientifically that the tenability of a theory or hypothesis depends on the nature of the empirical evidence for its support. Empirical here means that which is verifiable by observation; and evidence, data yielding proof or strong confirmation, in probability terms, of a theory or hypothesis in a research setting. The viewpoint has been summed up by Barratt, who writes, 'The decision for empiricism as an act of scientific faith signifies that the best way to acquire reliable knowledge is the way of evidence obtained by direct experience' (Barratt, 1971).

Mouly (1978) has identified five steps in the process of empirical science:

1 experience—the starting point of scientific endeavour at the most elementary level;
2 classification—the formal systematization of otherwise incomprehensible masses of data;
3 quantification—a more sophisticated stage where precision of measurement allows more adequate analysis of phenomena by mathematical means;
4 discovery of relationships—the identification and classification of functional relationships among phenomena;
5 approximation to the truth—science proceeds by gradual approximation to the truth.
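To make Mouly's middle steps concrete, here is a minimal sketch in Python. The pupils, the revision variable and the scores are invented for illustration only, and statistics.correlation requires Python 3.10 or later.

```python
# A hedged sketch (invented data) of Mouly's steps 2-4: classification,
# quantification and the discovery of relationships.
from statistics import mean, correlation  # correlation: Python 3.10+

# Steps 1-2: raw experience recorded, then classified by revision habit
observations = [
    {"pupil": "A", "revises": True,  "score": 72},
    {"pupil": "B", "revises": False, "score": 55},
    {"pupil": "C", "revises": True,  "score": 80},
    {"pupil": "D", "revises": False, "score": 61},
]

# Step 3: quantification, summarizing each class numerically
revisers = [o["score"] for o in observations if o["revises"]]
non_revisers = [o["score"] for o in observations if not o["revises"]]
print(f"mean score, revisers:     {mean(revisers):.1f}")
print(f"mean score, non-revisers: {mean(non_revisers):.1f}")

# Step 4: discovery of relationships, here a crude correlation between
# the classification and the quantified outcome
flags = [1 if o["revises"] else 0 for o in observations]
scores = [o["score"] for o in observations]
print(f"revision/score correlation: {correlation(flags, scores):.2f}")
```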

The third assumption underlying the work of the scientist is the principle of parsimony. The basic idea is that phenomena should be explained in the most economical way possible. The first historical statement of the principle was by William of Occam when he said that explanatory principles (entities) should not be needlessly multiplied. It may, of course, be interpreted in various ways: that it is preferable to account for a phenomenon by two concepts rather than three; that a simple theory is to be preferred to a complex one; or, as Lloyd Morgan said as a guide to the study of animal behaviour: 'In no case may we interpret an action as the outcome of the exercise of a higher psychical faculty, if it can be interpreted as the outcome of the exercise of one which stands lower in the psychological scale.'
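In quantitative work the principle has a direct analogue in model selection: an extra parameter must earn its keep. The sketch below, with invented data, uses Akaike's information criterion (one common yardstick, not the only one) to compare a mean-only model with a two-parameter straight line.

```python
# A hedged illustration (invented data) of parsimony in model choice:
# prefer the simpler model unless added complexity buys real fit.
import math

x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2]  # roughly y = 2x

def aic(rss, n, k):
    # Akaike's information criterion: fit term plus a complexity penalty
    return n * math.log(rss / n) + 2 * k

n = len(x)

# Model 1 (simplest): y is just its mean; one parameter
mean_y = sum(y) / n
rss1 = sum((yi - mean_y) ** 2 for yi in y)

# Model 2: least-squares straight line; two parameters
mean_x = sum(x) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x
rss2 = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))

print(f"AIC, mean-only model: {aic(rss1, n, 1):.1f}")
print(f"AIC, linear model:    {aic(rss2, n, 2):.1f}")  # lower is better
```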

The final assumption, that of generality, played an important part in both the deductive and inductive methods of reasoning. Indeed, historically speaking, it was the problematic relationship between the concrete particular and the abstract general that was to result in two competing theories of knowledge—the rational and the empirical. Beginning with observations of the particular, scientists set out to generalize their findings to the world at large. This is so because they are concerned ultimately with explanation. Of course, the concept of generality presents much less of a problem to natural scientists working chiefly with inanimate matter than to human scientists who, of necessity having to deal with samples of larger human populations, have to exercise great caution when generalizing their findings to the particular parent populations.

Having identified the basic assumptions of science, we come now to the core question: What is science? Kerlinger (1970) points out that in the scientific world itself two broad views of science may be found: the static and the dynamic. The static view, which has particular appeal for laypeople, is that science is an activity that contributes systematized information to the world. The work of the scientist is to uncover new facts and add them to the existing corpus of knowledge. Science is thus seen as an accumulated body of findings, the emphasis being chiefly on the present state of knowledge and adding to it.4 The dynamic view, by contrast, conceives science more as an activity, as something that scientists do. According to this conception it is important to have an accumulated body of knowledge, of course, but what really matters most are the discoveries that scientists make. The emphasis here, then, is more on the heuristic nature of science.

Contrasting views exist on the functions of science. We give a composite summary of these in Box 1.3. For professional scientists, however, science is seen as a way of comprehending the world; as a means of explanation and understanding, of prediction and control. For them the ultimate aim of science is theory. Theory has been defined by Kerlinger as 'a set of interrelated constructs [concepts], definitions, and propositions that presents a systematic view of phenomena by specifying relations among variables, with the purpose of explaining and predicting the phenomena' (Kerlinger, 1970). In a sense, theory gathers together all the isolated bits of empirical data into a coherent conceptual framework of wider applicability. Mouly expresses it thus: 'If nothing else, a theory is a convenience—a necessity, really—organizing a whole slough of unassorted facts, laws, concepts, constructs, principles, into a meaningful and manageable form. It constitutes an attempt to make sense out of what we know concerning a given phenomenon' (Mouly, 1978). More than this, however, theory is itself a potential source of further information and discoveries. It is in this way a source of new hypotheses and hitherto unasked questions; it identifies critical areas for further investigation; it discloses gaps in our knowledge; and enables a researcher to postulate the existence of previously unknown phenomena.

Clearly there are several different types of theory, and each type of theory defines its own kinds of 'proof'. For example, Morrison (1995a) identifies empirical theories, 'grand' theories and 'critical' theory. Empirical theories and critical theories are discussed below. 'Grand theory' is a metanarrative, defining an area of study, being speculative, clarifying conceptual structures and frameworks, and creatively enlarging the way we consider behaviour and organizations (Layder, 1994). It uses fundamental ontological and epistemological postulates which serve to define a field of inquiry (Hughes, 1976).

Box 1.3
The functions of science

1 Its problem-seeking, question-asking, hunch-encouraging, hypotheses-producing function.
2 Its testing, checking, certifying function; its trying out and testing of hypotheses; its repetition and checking of experiments; its piling up of facts.
3 Its organizing, theorizing, structuring function; its search for larger and larger generalizations.
4 Its history-collecting, scholarly function.
5 Its technological side; instruments, methods, techniques.
6 Its administrative, executive, and organizational side.
7 Its publicizing and educational functions.
8 Its applications to human use.
9 Its appreciation, enjoyment, celebration, and glorification.

Source: Maslow, 1954


Here empirical material tends to be used by way of illustration rather than 'proof'. This is the stuff of some sociological theories, for example Marxism, consensus theory and functionalism. Whilst sociologists may be excited by the totalizing and all-encompassing nature of such theories, they have been subject to considerable undermining for half a century. For example, Merton (1949), Coser and Rosenberg (1969), Doll (1993) and Layder (1994) contend that whilst they might possess the attraction of large philosophical systems of considerable—Byzantine—architectonic splendour and logical consistency, nevertheless they are scientifically sterile, irrelevant and out of touch with a postmodern world that is characterized by openness, fluidity, heterogeneity and fragmentation. This book does not endeavour to refer to this type of theory.

The status of theory varies quite considerably according to the discipline or area of knowledge in question. Some theories, as in the natural sciences, are characterized by a high degree of elegance and sophistication; others, like educational theory, are only at the early stages of formulation and are thus characterized by great unevenness. Popper (1968), Lakatos (1970),5 Mouly (1978), Laudan (1990) and Rasmussen (1990) identify the following characteristics of an effective empirical theory:

• A theoretical system must permit deductions and generate laws that can be tested empirically; that is, it must provide the means for its confirmation or rejection. One can test the validity of a theory only through the validity of the propositions (hypotheses) that can be derived from it. If repeated attempts to disconfirm its various hypotheses fail, then greater confidence can be placed in its validity. This can go on indefinitely, until possibly some hypothesis proves untenable. This would constitute indirect evidence of the inadequacy of the theory and could lead to its rejection (or more commonly to its replacement by a more adequate theory that can incorporate the exception).
• Theory must be compatible with both observation and previously validated theories. It must be grounded in empirical data that have been verified and must rest on sound postulates and hypotheses. The better the theory, the more adequately it can explain the phenomena under consideration, and the more facts it can incorporate into a meaningful structure of ever-greater generalizability. There should be internal consistency between these facts. It should clarify the precise terms in which it seeks to explain, predict and generalize about empirical phenomena.
• Theories must be stated in simple terms; that theory is best that explains the most in the simplest way. This is the law of parsimony. A theory must explain the data adequately and yet must not be so comprehensive as to be unwieldy. On the other hand, it must not overlook variables simply because they are difficult to explain.
• A theory should have considerable explanatory and predictive potential.
• A theory should be able to respond to observed anomalies.
• A theory should spawn a research enterprise (echoing Siegel's (1987) comment that one of the characteristics of an effective theory is its fertility).
• A theory should demonstrate precision and universality, and set the grounds for its own falsification and verification, identifying the nature and operation of a 'severe test' (Popper, 1968). An effective empirical theory is tested in contexts which are different from those that gave rise to the theory, i.e. it should move beyond simple corroboration and induction and towards 'testing' (Laudan, 1990). It should identify the type of evidence which is required to confirm or refute the theory.
• A theory must be operationalizable precisely.
• A test of the theory must be replicable.

Sometimes the word model is used instead of, or interchangeably with, theory. Both may be seen as explanatory devices or schemes having a broadly conceptual framework, though models are often characterized by the use of analogies to give a more graphic or visual representation of a particular phenomenon. Providing they are accurate and do not misrepresent the facts, models can be of great help in achieving clarity and focusing on key issues in the nature of phenomena.

Hitchcock and Hughes (1995:20–1) draw together the strands of the discussion so far when they describe a theory thus:

Theory is seen as being concerned with the development of systematic construction of knowledge of the social world. In doing this theory employs the use of concepts, systems, models, structures, beliefs and ideas, hypotheses (theories) in order to make statements about particular types of actions, events or activities, so as to make analyses of their causes, consequences and process. That is, to explain events in ways which are consistent with a particular philosophical rationale or, for example, a particular sociological or psychological perspective. Theories therefore aim to both propose and analyze sets of relations existing between a number of variables when certain regularities and continuities can be demonstrated via empirical inquiry.

(Hitchcock and Hughes, 1995:20–1)

Scientific theories must, by their very nature, be provisional. A theory can never be complete in the sense that it encompasses all that can be known or understood about the given phenomenon. As Mouly says,

Invariably, scientific theories are replaced by more sophisticated theories embodying more of the advanced state of the question, so that science widens its horizons to include more and more of the facts as they accumulate. No doubt, many of the things about which there is agreement today will be found inadequate by future standards. But we must begin where we are.

(Mouly, 1978)

We have already implied that the quality of a theory is determined by the state of development of the particular discipline. The early stages of a science must be dominated by empirical work, that is, the accumulation and classification of data. This is why, as we shall see, much of educational research is descriptive. Only as a discipline matures can an adequate body of theory be developed. Too premature a formulation of theory before the necessary empirical spadework has been done can lead to a slowing down of progress. Mouly optimistically suggests that some day a single theoretical system, unknown to us at the present time, will be used to explain the behaviour of molecules, animals and people.

In referring to theory and models, we have begun to touch upon the tools used by scientists in their work. We look now in more detail at two such tools which play a crucial role in science—the concept and the hypothesis.

The tools of science

Concepts express generalizations from particulars—anger, achievement, alienation, velocity, intelligence, democracy. Examining these examples more closely, we see that each is a word representing an idea: more accurately, a concept is the relationship between the word (or symbol) and an idea or conception. Whoever we are and whatever we do, we all make use of concepts. Naturally, some are shared and used by all groups of people within the same culture—child, love, justice, for example; others, however, have a restricted currency and are used only by certain groups, specialists, or members of professions—idioglossia, retroactive inhibition, anticipatory socialization.

Concepts enable us to impose some sort of meaning on the world; through them reality is given sense, order and coherence. They are the means by which we are able to come to terms with our experience. How we perceive the world, then, is highly dependent on the repertoire of concepts we can command. The more we have, the more sense data we can pick up and the surer will be our perceptual (and cognitive) grasp of whatever is 'out there'. If our perceptions of the world are determined by the concepts available to us, it follows that people with differing sets of concepts will tend to view the 'same' objective reality differently—a doctor diagnosing an illness will draw upon a vastly different range of concepts from, say, the restricted and simplistic notions of the layperson in that context; and a visitor to civilization from a distant primitive culture would be as confused by the frenetic bustle of urban life as would the mythical Martian.

So, you may ask, where is all this leading? Simply to this: that social scientists have likewise developed, or appropriated by giving precise meaning to, a set of concepts which enable them to shape their perceptions of the world in a particular way, to represent that slice of reality which is their special study. And collectively, these concepts form part of their wider meaning system which permits them to give accounts of that reality, accounts which are rooted and validated in the direct experience of everyday life. These points may be exemplified by the concept of social class. Hughes says that it offers 'a rule, a grid, even though vague at times, to use in talking about certain sorts of experience that have to do with economic position, life-style, life-chances, and so on. It serves to identify aspects of experience, and by relating the concept to other concepts we are able to construct theories about experience in a particular order or sphere' (Hughes, 1976:34).

There are two important points to stress when considering scientific concepts. The first is that they do not exist independently of us: they are indeed our inventions, enabling us to acquire some understanding at least of the apparent chaos of nature. The second is that they are limited in number and in this way contrast with the infinite number of phenomena they are required to explain.

A second tool of great importance to the scientist is the hypothesis. It is from this that much research proceeds, especially where cause-and-effect or concomitant relationships are being investigated. The hypothesis has been defined by Kerlinger (1970) as a conjectural statement of the relations between two or more variables. More simply, it has been termed 'an educated guess', though it is unlike an educated guess in that it is often the result of considerable study, reflective thinking and observation. Medawar (1972) writes incomparably of the hypothesis and its function in the following way:

All advances of scientific understanding, at every level, begin with a speculative adventure, an imaginative preconception of what might be true—a preconception which always, and necessarily, goes a little way (sometimes a long way) beyond anything which we have logical or factual authority to believe in. It is the invention of a possible world, or of a tiny fraction of that world. The conjecture is then exposed to criticism to find out whether or not that imagined world is anything like the real one. Scientific reasoning is therefore at all levels an interaction between two episodes of thought—a dialogue between two voices, the one imaginative and the other critical; a dialogue, if you like, between the possible and the actual, between proposal and disposal, conjecture and criticism, between what might be true and what is in fact the case.

(Medawar, 1972)

Kerlinger (1970) has identified two criteria for 'good' hypotheses. The first is that hypotheses are statements about the relations between variables; and second, that hypotheses carry clear implications for testing the stated relations. To these he adds two ancillary criteria: that hypotheses disclose compatibility with current knowledge; and that they are expressed as economically as possible. Thus if we conjecture that social class background determines academic achievement, we have a relationship between one variable, social class, and another, academic achievement. And since both can be measured, the primary criteria specified by Kerlinger can be met. Neither do they violate the ancillary criteria proposed by Kerlinger (see also Box 1.4).
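By way of illustration, such a conjecture can be put to a direct empirical test. The sketch below is hypothetical: the data are invented, social class is coded ordinally, and pearsonr is drawn from the SciPy library.

```python
# A hedged sketch of testing Kerlinger's example hypothesis: a relation
# between social class (coded 1 = lowest) and academic achievement.
from scipy.stats import pearsonr

social_class = [1, 1, 2, 2, 3, 3, 4, 4, 5, 5]
achievement = [48, 55, 52, 60, 61, 66, 70, 68, 75, 79]  # test scores

# Because the hypothesis relates two measurable variables, it carries a
# clear implication for testing: estimate the association and ask how
# surprising it would be if no relation existed.
r, p = pearsonr(social_class, achievement)
print(f"r = {r:.2f}, p = {p:.4f}")  # grounds to retain or reject
```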

He further identifies four reasons for the importance of hypotheses as tools of research. First, they organize the efforts of researchers. The relationship expressed in the hypothesis indicates what they should do. They enable them to understand the problem with greater clarity and provide them with a framework for collecting, analysing and interpreting their data. Second, they are, in Kerlinger's words, the working instruments of theory. They can be deduced from theory or from other hypotheses. Third, they can be tested, empirically or experimentally, thus resulting in confirmation or rejection. And there is always the possibility that a hypothesis, once confirmed and established, may become a law. And fourth, hypotheses are powerful tools for the advancement of knowledge because, as Kerlinger explains, they enable us to get outside ourselves.

Hypotheses and concepts play a crucial part in the scientific method and it is to this that we now turn our attention.

The scientific method

If the most distinctive feature of science is its empirical nature, the next most important characteristic is its set of procedures which show not only how findings have been arrived at, but are sufficiently clear for fellow-scientists to repeat them, i.e. to check them out with the same or other materials and thereby test the results. As Cuff and Payne (1979) say: 'A scientific approach necessarily involves standards and procedures for demonstrating the "empirical warrant" of its findings, showing the match or fit between its statements and what is happening or has happened in the world' (Cuff and Payne, 1979:4). These standards and procedures we will call for convenience 'the scientific method', though this can be somewhat misleading for the following reason: the combination of the definite article, adjective and singular noun conjures up in the minds of some people a single invariant approach to problem-solving, an approach frequently involving atoms or rats, and taking place within the confines of a laboratory peopled with stereotypical scientists wearing white coats and given to eccentric bouts of behaviour. Yet there is much more to it than this. The term in fact cloaks a number of methods which vary in their degree of sophistication depending on their function and the particular stage of development a science has reached. We refer you at this point to Box 1.5, which sets out the sequence of stages through which a science normally passes in its development or, perhaps more realistically, that are constantly present in its progress and on which scientists may draw depending on the kind of information they seek or the kind of problem confronting them. Of particular interest to us in our efforts to elucidate the term 'scientific method' are stages 2, 3 and 4. Stage 2 is a relatively uncomplicated point at which the researcher is content to observe and record facts and possibly arrive at some system of classification. Much research in the field of education, especially at classroom and school level, is conducted in this way, e.g. surveys and case studies. Stage 3 introduces a note of added sophistication as attempts are made to establish relationships between variables within a loose framework of inchoate theory. Stage 4 is the most sophisticated stage and often the one that many people equate exclusively with the scientific method.

Box 1.4
The hypothesis

Once he has a hypothesis to work on, the scientist is in business; the hypothesis will guide him to make some observations rather than others and will suggest experiments that might not otherwise have been performed. Scientists soon pick up by experience the characteristics that make a good hypothesis;…almost all laws and hypotheses can be read in such a way as to prohibit the occurrence of certain phenomena… Clearly, a hypothesis so permissive as to accommodate any phenomenon tells us precisely nothing; the more phenomena it prohibits, the more informative it is. Again, a good hypothesis must also have the character of logical immediacy, by which I mean that it must be rather specially an explanation of whatever it is that needs to be explained and not an explanation of a great many other phenomena besides… The great virtue of logical immediacy in a hypothesis is that it can be tested by comparatively direct and practicable means—that is, without the foundation of a new research institute or by making a journey into outer space. A large part of the art of the soluble is the art of devising hypotheses that can be tested by practicable experiments.

Source: Medawar, 1981


In order to arrive at causality, as distinct from mere measures of association, researchers here design experimental situations in which variables are manipulated to test their chosen hypotheses. Here is how one noted researcher describes the later stages:

First, there is a doubt, a barrier, an indeterminate situation crying out, so to speak, to be made determinate. The scientist experiences vague doubts, emotional disturbances, inchoate ideas. He struggles to formulate the problem, even if inadequately. He studies the literature, scans his own experience and the experience of others. Often he simply has to wait for an inventive leap of mind. Maybe it will occur; maybe not. With the problem formulated, with the basic question or questions properly asked, the rest is much easier. Then the hypothesis is constructed, after which its implications are deduced, mainly along experimental lines. In this process the original problem, and of course the original hypothesis, may be changed. It may be broadened or narrowed. It may even be abandoned. Lastly, but not finally, the relation expressed by the hypothesis is tested by observation and experimentation. On the basis of the research evidence, the hypothesis is accepted or rejected. This information is then fed back to the original problem and it is kept or altered as dictated by the evidence. Dewey finally pointed out that one phase of the process may be expanded and be of great importance, another may be skimped, and there may be fewer or more steps involved. These things are not important. What is important is the overall fundamental idea of scientific research as a controlled rational process of reflective inquiry, the interdependent nature of the parts of the process, and the paramount importance of the problem and its statement.

(Kerlinger, 1970)

With stages 3 and 4 of Box 1.5 in mind, we may say that the scientific method begins consciously and deliberately by selecting from the total number of elements in a given situation. More recently, Hitchcock and Hughes (1995:23) suggest an eight-stage model of the scientific method that echoes Kerlinger. This is represented in Box 1.6. The elements the researchers fasten on to will naturally be suitable for scientific formulation; this means simply that they will possess quantitative aspects. Their principal working tool will be the hypothesis which, as we have seen, is a statement indicating a relationship (or its absence) between two or more of the chosen elements and stated in such a way as to carry clear implications for testing. Researchers then choose the most appropriate method and put their hypotheses to the test.

Box 1.5
Stages in the development of a science

1 Definition of the science and identification of the phenomena that are to be subsumed under it.
2 Observational stage at which the relevant factors, variables or items are identified and labelled; and at which categories and taxonomies are developed.
3 Correlational research in which variables and parameters are related to one another and information is systematically integrated as theories begin to develop.
4 The systematic and controlled manipulation of variables to see if experiments will produce expected results, thus moving from correlation to causality.
5 The firm establishment of a body of theory as the outcomes of the earlier stages are accumulated. Depending on the nature of the phenomena under scrutiny, laws may be formulated and systematized.
6 The use of the established body of theory in the resolution of problems or as a source of further hypotheses.

Box 1.6
An eight-stage model of the scientific method

Stage 1: Hypotheses, hunches and guesses
Stage 2: Experiment designed; samples taken; variables isolated
Stage 3: Correlations observed; patterns identified
Stage 4: Hypotheses formed to explain regularities
Stage 5: Explanations and predictions tested; falsifiability
Stage 6: Laws developed or disconfirmation (hypothesis rejected)
Stage 7: Generalizations made
Stage 8: New theories


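The move from stage 3 to stage 4 of Box 1.5, from observed correlation to controlled manipulation, can be simulated in a few lines. This is a minimal, hypothetical sketch: the pupils, the tutoring condition and the built-in effect size are invented, and ttest_ind comes from the SciPy library.

```python
# A hedged sketch of stage 4 in Box 1.5: manipulating one variable under
# random assignment so that a difference bears a causal reading.
import random
from scipy.stats import ttest_ind

random.seed(1)

def score(tutored):
    # Hypothetical data-generating process: a built-in +5 effect of
    # tutoring on top of noise, standing in for the causal law sought.
    return 60 + (5 if tutored else 0) + random.gauss(0, 8)

# Random assignment of 40 pupils to conditions controls for confounding
# variables, unlike a purely correlational (stage 3) survey.
treated = [score(True) for _ in range(20)]
control = [score(False) for _ in range(20)]

t, p = ttest_ind(treated, control)
print(f"t = {t:.2f}, p = {p:.4f}")  # a reliable difference here supports
                                    # causal, not merely associational, talk
```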

Criticisms of positivism and the scientific method

In spite of the scientific enterprise's proven success—especially in the field of natural science—its ontological and epistemological bases have been the focus of sustained and sometimes vehement criticism from some quarters. Beginning in the second half of the nineteenth century, the revolt against positivism occurred on a broad front, attracting some of the best intellectuals in Europe—philosophers, scientists, social critics and creative artists. Essentially, it has been a reaction against the world picture projected by science which, it is contended, undermines life and mind. The precise target of the anti-positivists' attack has been science's mechanistic and reductionist view of nature which, by definition, excludes notions of choice, freedom, individuality, and moral responsibility.

One of the most sustained and consistent attacks in this respect came from the poet, William Blake, who perceived the universe not as a mechanism, but as a living organism:

Blake would have us understand that mechanistic science and the philosophy of materialism eliminate the concept of life itself. All they can do is to define life in terms of biochemistry, biophysics, vibrations, wavelengths, and so on; they reduce 'life' to conceivable measurement, but such a conception of life does not embrace the most evident element of all: that life can only be known by a living being, by 'inner' experience. No matter how exact measurement may be, it can never give us an experience of life, for life cannot be weighed and measured on a physical scale.

(Nesfield-Cookson, 1987)

Another challenge to the claims of positivism came from Søren Kierkegaard, the Danish philosopher, from whose work was to originate the movement that became known as Existentialism. Kierkegaard was concerned with individuals and their need to fulfil themselves to the highest level of development. This realization of a person's potential was for him the meaning of existence, which he saw as 'concrete and individual, unique and irreducible, not amenable to conceptualization' (Beck, 1979). Characteristic features of the age in which we live—democracy's trust in the crowd mentality, the ascendancy of reason, scientific and technological progress—all militate against the achievement of this end and contribute to the dehumanization of the individual. In his desire to free people from their illusions, the illusion Kierkegaard was most concerned about was that of objectivity. By this he meant the imposition of rules of behaviour and thought, and the making of a person into an observer set on discovering general laws governing human behaviour. The capacity for subjectivity, he argued, should be regained. This he regarded as the ability to consider one's own relationship to whatever constitutes the focus of inquiry. The contrast he made between objectivity and subjectivity is brought out in the following passage:

When the question of truth is raised in an objective manner, reflection is directed objectively to the truth as an object to which the knower is related. Reflection is not focused on the relationship, however, but upon the question of whether it is the truth to which the knower is related. If only the object to which he is related is the truth, the subject is accounted to be in the truth. When the question of truth is raised subjectively, reflection is directed subjectively to the nature of the individual's relationship; if only the mode of this relationship is in the truth, the individual is in the truth, even if he should happen to be thus related to what is not true.

(Kierkegaard, 1974)

For Kierkegaard, 'subjectivity and concreteness of truth are together the light. Anyone who is committed to science, or to rule-governed morality, is benighted, and needs to be rescued from his state of darkness' (Warnock, 1970).


Also concerned with the dehumanizing effects of the social sciences is Ions (1977). While acknowledging that they can take much credit for throwing light in dark corners, he expresses serious concern at the way in which quantification and computation, assisted by statistical theory and method, are used. On this point, he writes:

The argument begins when we quantify the process and interpret the human act. In this respect, behavioural science represents a form of collectivism which runs parallel to other developments this century. However high-minded the intention, the result is depersonalization, the effects of which can be felt at the level of the individual human being, not simply at the level of culture.

(Ions, 1977)

His objection is not directed at quantification per se, but at quantification when it becomes an end in itself—'a branch of mathematics rather than a humane study seeking to explore and elucidate the gritty circumstances of the human condition' (Ions, 1977). This echoes Horkheimer's (1972) powerful critique of positivism as the 'mathematization of nature'.

Another forceful critic of the objective consciousness has been Roszak. Writing of its alienating effect in contemporary life, he says:

While the art and literature of our time tell us with ever more desperation that the disease from which our age is dying is that of alienation, the sciences, in their relentless pursuit of objectivity, raise alienation to its apotheosis as our only means of achieving a valid relationship to reality. Objective consciousness is alienated life promoted to its most honorific status as the scientific method. Under its auspices we subordinate nature to our command only by estranging ourselves from more and more of what we experience, until the reality about which objectivity tells us so much finally becomes a universe of congealed alienation.

(Roszak, 1970)6

The justification for any intellectual activity lies in the effect it has on increasing our awareness and degree of consciousness. This increase, some claim, has been retarded in our time by the excessive influence the positivist paradigm has been allowed to exert on areas of our intellectual life. Holbrook, for example, affording consciousness a central position in human existence and deeply concerned with what happens to it, has written:

[O]ur approaches today to the study of man [sic] have yielded little, and are essentially dead, because they cling to positivism—that is, to an approach which demands that nothing must be regarded as real which cannot be found by empirical science and rational methods, by 'objectivity'. Since the whole problem…belongs to 'psychic reality', to man's 'inner world', to his moral being, and to the subjective life, there can be no debate unless we are prepared to recognize the bankruptcy of positivism, and the failure of 'objectivity' to give an adequate account of existence, and are prepared to find new modes of inquiry.

(Holbrook, 1977)

Other writers question the perspective adopted by positivist social science because it presents a misleading picture of the human being. Hampden-Turner (1970), for example, concludes that the social science view of human beings is biased in that it is conservative and ignores important qualities. This restricted image of humans, he contends, comes about because social scientists concentrate on the repetitive, predictable and invariant aspects of the person; on 'visible externalities' to the exclusion of the subjective world; and—at least as far as psychology is concerned—on the parts of the person in their endeavours to understand the whole. For a trenchant critique of science from the point of view of theology, see Philip Sherrard (1987), The Eclipse of Man and Nature.

Habermas (1972), in keeping with the Frankfurt School of critical theory (critical theory is discussed below), provides a corrosive critique of positivism, arguing that the scientific mentality has been elevated to an almost unassailable position—almost to the level of a religion (scientism)—as being the only epistemology of the west. In this view all knowledge becomes equated with scientific knowledge. This neglects hermeneutic, aesthetic, critical, moral, creative and other forms of knowledge. It reduces behaviour to technicism.

Positivism’s concern for control and, thereby,its appeal to the passivity of behaviourism andfor instrumental reason is a serious danger tothe more open-ended, creative, humanitarianaspects of social behaviour. Habermas (1972,1974) and Horkheimer (1972) are arguing thatscientism silences an important debate aboutvalues, informed opinion, moral judgements andbeliefs. Scientific explanation seems to be theonly means of explaining behaviour, and, forthem, this seriously diminishes the very charac-teristics that make humans human. It makes fora society without conscience. Positivism is un-able to answer questions about many interest-ing or important areas of life (Habermas,1972:300). Indeed this is an echo ofWittgenstein’s (1974) famous comment thatwhen all possible scientific questions have beenaddressed they have left untouched the mainproblems of life.

Other criticisms are commonly levelled at positivistic social science from within its own ranks. One is that it fails to take account of our unique ability to interpret our experiences and represent them to ourselves. We can, and do, construct theories about ourselves and our world; moreover, we act on these theories. In failing to recognize this, positivistic social science is said to ignore the profound differences between itself and the natural sciences. Social science, unlike natural science, 'stands in a subject-subject relation to its field of study, not a subject-object relation; it deals with a pre-interpreted world in which the meanings developed by active subjects enter the actual constitution or production of the world' (Giddens, 1976).

The difficulty in which positivism finds itself is that it regards human behaviour as passive, essentially determined and controlled, thereby ignoring intention, individualism and freedom. This approach suffers from the same difficulties that inhere in behaviourism, which has scarcely recovered from Chomsky's withering criticism in 1959, where he writes that a singular problem of behaviourism is our inability to infer causes from behaviour, to identify the stimulus that has brought about the response—the weakness of Skinner's stimulus-response theory. This problem with positivism also rehearses the familiar problem in social theory, viz. the tension between agency and structure (Layder, 1994): humans exercise agency—individual choice and intention—not necessarily in circumstances of their own choosing, but nevertheless they do not behave simply, deterministically, like puppets.

The findings of positivistic social science are often said to be so banal and trivial that they are of little consequence to those for whom they are intended, namely, teachers, social workers, counsellors, personnel managers, and the like. The more effort, it seems, that researchers put into their scientific experimentation in the laboratory by restricting, simplifying and controlling variables, the more likely they are to end up with a 'pruned, synthetic version of the whole, a constructed play of puppets in a restricted environment'.7

These are formidable criticisms; but what alternatives are proposed by the detractors of positivistic social science?

Alternatives to positivistic social science: naturalistic approaches

Although the opponents of positivism within social science itself subscribe to a variety of schools of thought, each with its own subtly different epistemological viewpoint, they are united by their common rejection of the belief that human behaviour is governed by general, universal laws and characterized by underlying regularities. Moreover, they would agree that the social world can only be understood from the standpoint of the individuals who are part of the ongoing action being investigated; and that their model of a person is an autonomous one, not the plastic version favoured by positivist researchers. In rejecting the viewpoint of the detached, objective observer—a mandatory feature of traditional research—anti-positivists would argue that individuals' behaviour can only be understood by the researcher sharing their frame of reference: understanding of individuals' interpretations of the world around them has to come from the inside, not the outside. Social science is thus seen as a subjective rather than an objective undertaking, as a means of dealing with the direct experience of people in specific contexts. The following extract nicely captures the spirit in which the anti-positivist social scientist would work:

[T]he purpose of social science is to understand social reality as different people see it and to demonstrate how their views shape the action which they take within that reality. Since the social sciences cannot penetrate to what lies behind social reality, they must work directly with man's definitions of reality and with the rules he devises for coping with it. While the social sciences do not reveal ultimate truth, they do help us to make sense of our world. What the social sciences offer is explanation, clarification and demystification of the social forms which man has created around himself.

(Beck, 1979)

The anti-positivist movement has so influenced those constituent areas of social science of most concern to us, namely, psychology, social psychology and sociology, that in each case a movement reflecting its mood has developed collaterally with mainstream trends. Whether this development is seen in competitive or complementary terms depends to some extent on one's personal viewpoint. It cannot be denied, however, that in some quarters proponents of the contrasting viewpoints have been prepared to lock horns on some of the more contentious issues.

In the case of psychology, for instance, a school of humanistic psychology has emerged alongside the co-existing behaviouristic and psychoanalytic schools. Arising as a response to the challenge to combat the growing feelings of dehumanization which characterize much of the current social and cultural milieu, it sets out to study and understand the person as a whole (Buhler and Allen, 1972). Humanistic psychologists present a model of people that is positive, active and purposive, and at the same time stresses their own involvement with the life experience itself. They do not stand apart, introspective, hypothesizing. Their interest is directed at the intentional and creative aspects of the human being. The perspective adopted by humanistic psychologists is naturally reflected in their methodology. They are dedicated to studying the individual in preference to the group, and consequently prefer idiographic approaches to nomothetic ones. The implications of the movement's philosophy for the education of the human being have been drawn by Carl Rogers.8

Comparable developments within social psychology may be perceived in the 'science of persons' movement. Its proponents contend that because of our self-awareness and powers of language, we must be seen as systems of a different order of complexity from any other existing system, whether natural, like an animal, or artificial, a computer, for instance. Because of this, no other system is capable of providing a sufficiently powerful model to advance our understanding of ourselves. It is argued, therefore, that we must use ourselves as a key to our understanding of others and, conversely, our understanding of others as a way of finding out about ourselves. What is called for is an anthropomorphic model of people. Since anthropomorphism means, literally, the attribution of human form and personality, the implied criticism is that social psychology as traditionally conceived has singularly failed, so far, to model people as they really are. As one wry commentator has pleaded, 'For scientific purposes, treat people as if they were human beings' (Harré and Secord, 1972).

This approach would entail working from a model of humans that takes account of the following uniquely human attributes:

We are entities who are capable of monitoring our own performance. Further, because we are aware of this self-monitoring and have the power of speech, we are able to provide commentaries on those performances and to plan ahead of them as well. Such entities, it is held, are much inclined to using rules, to devising plans, to developing strategies in getting things done the way they want them doing.

(Harré and Secord, 1972)

Social psychology’s task is to understand peo-ple in the light of this anthropomorphic model.But what specifically would this involve? Pro-ponents of this ‘science of persons’ approachplace great store on the systematic and pains-taking analysis of social episodes, i.e. behaviourin context. In Box 1.7 we give an example ofsuch an episode taken from a classroom study.Note how the particular incident would appearon an interaction analysis coding sheet of a re-searcher employing a positivistic approach.Note, too, how this slice of classroom life canonly be understood by knowledge of the spe-cific organizational background and context inwhich it is embedded.

The approach to analysing social episodes in terms of the 'actors' themselves is known as the 'ethogenic method'.9 Unlike positivistic social psychology, which ignores or presumes its subjects' interpretations of situations, ethogenic social psychology concentrates upon the ways in which persons construe their social world. By probing their own accounts of their actions, it endeavours to come up with an understanding of what those persons were doing in the particular episode.
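What the interaction analysis coding sheet captures, and what it misses, can be made concrete. The sketch below is hypothetical, built around the four FIAC category codes quoted in Box 1.7: the positivist record supports tallies and sequences, but the shared history behind 'strawberries' lies entirely outside the representation.

```python
# A hedged sketch: a FIAC-style record of the Box 1.7 episode reduces the
# exchange to a code string; the meaning that makes the laughter
# intelligible is invisible to this representation.
from collections import Counter

FIAC_LABELS = {
    "7": "teacher criticizes",
    "4": "teacher asks question",
    "9": "pupil irritation",
    "10": "silence or confusion",
}

episode = ["7", "4", "9", "10"]  # the coded record of the incident

# Tallying is all the coding sheet supports: frequencies and sequences.
for code, count in Counter(episode).items():
    print(f"category {code} ({FIAC_LABELS[code]}): {count}")
```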

As an alternative to positivist approaches, naturalistic, qualitative, interpretive approaches of various hue possess particular distinguishing features:

• people are deliberate and creative in their actions, they act intentionally and make meanings in and through their activities (Blumer, 1969);
• people actively construct their social world—they are not the 'cultural dopes' or passive dolls of positivism (Becker, 1970; Garfinkel, 1967);

Box 1.7
A classroom episode

Walker and Adelman describe an incident in the following manner:

In one lesson the teacher was listening to the boys read through short essays that they had written for homework on the subject of 'Prisons'. After one boy, Wilson, had finished reading out his rather obviously skimped piece of work the teacher sighed and said, rather crossly:

T: Wilson, we'll have to put you away if you don't change your ways, and do your homework. Is that all you've done?
P: Strawberries, strawberries. (Laughter)

Now at first glance this is meaningless. An observer coding with Flanders Interaction Analysis Categories (FIAC) would write down:

'7' (teacher criticizes) followed by a,
'4' (teacher asks question) followed by a,
'9' (pupil irritation) and finally a,
'10' (silence or confusion) to describe the laughter.

Such a string of codings, however reliable and valid, would not help anyone to understand why such an interruption was funny. Human curiosity makes us want to know why everyone laughs—and so, I would argue, the social scientist needs to know too. Walker and Adelman asked subsequently why 'strawberries' was a stimulus to laughter and were told that the teacher frequently said the pupils' work was 'like strawberries—good as far as it goes, but it doesn't last nearly long enough'. Here a casual comment made in the past has become an integral part of the shared meaning system of the class. It can only be comprehended by seeing the relationship as developing over time.

Source: Adapted from Delamont, 1976


• situations are fluid and changing rather than fixed and static; events and behaviour evolve over time and are richly affected by context—they are 'situated activities';
• events and individuals are unique and largely non-generalizable;
• a view that the social world should be studied in its natural state, without the intervention of, or manipulation by, the researcher (Hammersley and Atkinson, 1983);
• fidelity to the phenomena being studied is fundamental;
• people interpret events, contexts and situations, and act on the bases of those events (echoing Thomas's (1928) famous dictum that if people define their situations as real then they are real in their consequences—if I believe there is a mouse under the table, I will act as though there is a mouse under the table, whether there is or not (Morrison, 1998));
• there are multiple interpretations of, and perspectives on, single events and situations;
• reality is multi-layered and complex;
• many events are not reducible to simplistic interpretation, hence 'thick descriptions' (Geertz, 1973) are essential rather than reductionism;
• we need to examine situations through the eyes of participants rather than the researcher.

The anti-positivist movement in sociology is represented by three schools of thought—phenomenology, ethnomethodology and symbolic interactionism. A common thread running through the three schools is a concern with phenomena, that is, the things we directly apprehend through our senses as we go about our daily lives, together with a consequent emphasis on qualitative as opposed to quantitative methodology. The differences between them and the significant roles each plays in research in classrooms and schools are such as to warrant a more extended consideration of them in the discussion below (p. 23).

A question of terminology: the normative and interpretive paradigms

So far we have introduced and used a variety of terms to describe the numerous branches and schools of thought embraced by the positivist and anti-positivist viewpoints. We clarify at this point two generic terms conventionally used to describe these two perspectives and the categories subsumed under each, particularly as they refer to social psychology and sociology. The terms in question are 'normative' and 'interpretive'. The normative paradigm (or model) contains two major orienting ideas (Douglas, 1973): first, that human behaviour is essentially rule-governed; and second, that it should be investigated by the methods of natural science. The interpretive paradigm, in contrast to its normative counterpart, is characterized by a concern for the individual. Whereas normative studies are positivist, all theories constructed within the context of the interpretive paradigm tend to be anti-positivist.10 As we have seen, the central endeavour in the context of the interpretive paradigm is to understand the subjective world of human experience. To retain the integrity of the phenomena being investigated, efforts are made to get inside the person and to understand from within. The imposition of external form and structure is resisted, since this reflects the viewpoint of the observer as opposed to that of the actor directly involved.

Two further differences between the two paradigms may be identified at this stage: the first concerns the concepts of 'behaviour' and 'action'; the second, the different conceptions of 'theory'. A key concept within the normative paradigm, behaviour refers to responses either to external environmental stimuli (another person, or the demands of society, for instance) or to internal stimuli (hunger, or the need to achieve, for example). In either case, the cause of the behaviour lies in the past. Interpretive approaches, on the other hand, focus on action.


This may be thought of as behaviour-with-meaning; it is intentional behaviour and, as such, future oriented. Actions are only meaningful to us in so far as we are able to ascertain the intentions of actors to share their experiences. A large number of our everyday interactions with one another rely on such shared experiences.

As regards theory, normative researchers try to devise general theories of human behaviour and to validate them through the use of increasingly complex research methodologies which, some believe, push them further and further from the experience and understanding of the everyday world and into a world of abstraction. For them, the basic reality is the collectivity; it is external to the actor and manifest in society, its institutions and its organizations. The role of theory is to say how reality hangs together in these forms or how it might be changed so as to be more effective. The researcher's ultimate aim is to establish a comprehensive 'rational edifice', a universal theory, to account for human and social behaviour.

But what of the interpretive researchers? They begin with individuals and set out to understand their interpretations of the world around them. Theory is emergent and must arise from particular situations; it should be 'grounded' on data generated by the research act (Glaser and Strauss, 1967). Theory should not precede research but follow it.

Investigators work directly with experience and understanding to build their theory on them. The data thus yielded will be glossed with the meanings and purposes of those people who are their source. Further, the theory so generated must make sense to those to whom it applies. The aim of scientific investigation for the interpretive researcher is to understand how this glossing of reality goes on at one time and in one place and compare it with what goes on in different times and places. Thus theory becomes sets of meanings which yield insight and understanding of people's behaviour. These theories are likely to be as diverse as the sets of human meanings and understandings that they are to explain. From an interpretive perspective the hope of a universal theory which characterizes the normative outlook gives way to multifaceted images of human behaviour as varied as the situations and contexts supporting them.

Phenomenology, ethnomethodology and symbolic interactionism

There are many variants of qualitative, naturalistic approaches (Jacob, 1987; Hitchcock and Hughes, 1995). Here we focus on three significant 'traditions' in this style of research—phenomenology, ethnomethodology and symbolic interactionism. In its broadest meaning, phenomenology is a theoretical point of view that advocates the study of direct experience taken at face value; and one which sees behaviour as determined by the phenomena of experience rather than by external, objective and physically described reality (English and English, 1958). Although phenomenologists differ among themselves on particular issues, there is fairly general agreement on the following points identified by Curtis (1978), which can be taken as distinguishing features of their philosophical viewpoint:

• a belief in the importance, and in a sense the primacy, of subjective consciousness;
• an understanding of consciousness as active, as meaning bestowing; and
• a claim that there are certain essential structures to consciousness of which we gain direct knowledge by a certain kind of reflection. Exactly what these structures are is a point about which phenomenologists have differed.

Various strands of development may be traced in the phenomenological movement: we shall briefly examine two of them—the transcendental phenomenology of Husserl; and existential phenomenology, of which Schutz is perhaps the most characteristic representative.

Husserl, regarded by many as the founder of phenomenology, was concerned with investigating the source of the foundation of science and with questioning the commonsense, 'taken-for-granted' assumptions of everyday life (see Burrell and Morgan, 1979). To do this, he set about opening up a new direction in the analysis of consciousness. His catch-phrase was 'back to the things!', which for him meant finding out how things appear directly to us rather than through the media of cultural and symbolic structures. In other words, we are asked to look beyond the details of everyday life to the essences underlying them. To do this, Husserl exhorts us to 'put the world in brackets' or free ourselves from our usual ways of perceiving the world. What is left over from this reduction is our consciousness, of which there are three elements—the 'I' who thinks, the mental acts of this thinking subject, and the intentional objects of these mental acts. The aim, then, of this method of epoché, as Husserl called it, is the dismembering of the constitution of objects in such a way as to free us from all preconceptions about the world (see Warnock, 1970).

Schutz was concerned with relating Husserl's ideas to the issues of sociology and to the scientific study of social behaviour. Of central concern to him was the problem of understanding the meaning structure of the world of everyday life. The origins of meaning he thus sought in the 'stream of consciousness'—basically an unbroken stream of lived experiences which have no meaning in themselves. One can only impute meaning to them retrospectively, by the process of turning back on oneself and looking at what has been going on. In other words, meaning can be accounted for in this way by the concept of reflexivity. For Schutz, the attribution of meaning reflexively is dependent on the people identifying the purpose or goal they seek (see Burrell and Morgan, 1979).

According to Schutz, the way we understand the behaviour of others is dependent on a process of typification by means of which the observer makes use of concepts resembling 'ideal types' to make sense of what people do. These concepts are derived from our experience of everyday life and it is through them, claims Schutz, that we classify and organize our everyday world. As Burrell and Morgan observe, 'The typifications are learned through our biographical situation. They are handed to us according to our social context. Knowledge of everyday life is thus socially ordered. The notion of typification is thus…an inherent feature of our everyday world' (Burrell and Morgan, 1979).

The fund of everyday knowledge by means of which we are able to typify other people's behaviour and come to terms with social reality varies from situation to situation. We thus live in a world of multiple realities:

The social actor shifts between these provinces of meaning in the course of his everyday life. As he shifts from the world of work to that of home and leisure or to the world of religious experience, different ground rules are brought into play. While it is within the normal competence of the acting individual to shift from one sphere to another, to do so calls for a 'leap of consciousness' to overcome the differences between the different worlds.

(Burrell and Morgan, 1979)

Like phenomenology, ethnomethodology is concerned with the world of everyday life. In the words of its proponent, Harold Garfinkel, it sets out 'to treat practical activities, practical circumstances, and practical sociological reasonings as topics of empirical study, and by paying to the most commonplace activities of daily life the attention usually accorded extraordinary events, seeks to learn about them as phenomena in their own right' (Garfinkel, 1967). He maintains that students of the social world must doubt the reality of that world; and that in failing to view human behaviour more sceptically, sociologists have created an ordered social reality that bears little relationship to the real thing. He thereby challenges the basic sociological concept of order.

Ethnomethodology, then, is concerned with how people make sense of their everyday world. More especially, it is directed at the mechanisms by which participants achieve and sustain interaction in a social encounter—the assumptions they make, the conventions they utilize, and the practices they adopt. Ethnomethodology thus seeks to understand social accomplishments in their own terms; it is concerned to understand them from within (see Burrell and Morgan, 1979).

In identifying the 'taken-for-granted' assumptions characterizing any social situation and the ways in which the people involved make their activities rationally accountable, ethnomethodologists use notions like 'indexicality' and 'reflexivity'. Indexicality refers to the ways in which actions and statements are related to the social contexts producing them; and to the way their meanings are shared by the participants but not necessarily stated explicitly. Indexical expressions are thus the designations imputed to a particular social occasion by the participants in order to locate the event in the sphere of reality. Reflexivity, on the other hand, refers to the way in which all accounts of social settings—descriptions, analyses, criticisms, etc.—and the social settings occasioning them are mutually interdependent.

It is convenient to distinguish between two types of ethnomethodologists: linguistic and situational. The linguistic ethnomethodologists focus upon the use of language and the ways in which conversations in everyday life are structured. Their analyses make much use of the unstated 'taken-for-granted' meanings, the use of indexical expressions and the way in which conversations convey much more than is actually said. The situational ethnomethodologists cast their view over a wider range of social activity and seek to understand the ways in which people negotiate the social contexts in which they find themselves. They are concerned to understand how people make sense of and order their environment. As part of their empirical method, ethnomethodologists may consciously and deliberately disrupt or question the ordered 'taken-for-granted' elements in everyday situations in order to reveal the underlying processes at work.

The substance of ethnomethodology thus largely comprises a set of specific techniques and approaches to be used in the study of what Garfinkel has described as the 'awesome indexicality' of everyday life. It is geared to empirical study, and the stress which its practitioners place upon the uniqueness of the situation encountered projects its essentially relativist standpoint. A commitment to the development of methodology and field-work has occupied first place in the interests of its adherents, so that related issues of ontology, epistemology and the nature of human beings have received less attention than perhaps they deserve.

Essentially, the notion of symbolic interactionism derives from the work of G. H. Mead (1934). Although subsequently to be associated with such noted researchers as Blumer, Hughes, Becker and Goffman, the term does not represent a unified perspective in that it does not embrace a common set of assumptions and concepts accepted by all who subscribe to the approach. For our purposes, however, it is possible to identify three basic postulates. These have been set out by Woods (1979) as follows. First, human beings act towards things on the basis of the meanings they have for them. Humans inhabit two different worlds: the 'natural' world wherein they are organisms of drives and instincts and where the external world exists independently of them, and the social world where the existence of symbols, like language, enables them to give meaning to objects. This attribution of meanings, this interpreting, is what makes them distinctively human and social. Interactionists therefore focus on the world of subjective meanings and the symbols by which they are produced and represented. This means not making any prior assumptions about what is going on in an institution, and taking seriously, indeed giving priority to, inmates' own accounts. Thus, if pupils appear preoccupied for too much of the time—'being bored', 'mucking about', 'having a laugh', etc.—the interactionist is keen to explore the properties and dimensions of these processes. Second, this attribution of meaning to objects through symbols is a continuous process. Action is not simply a consequence of psychological attributes such as drives, attitudes, or personalities, or determined by external social facts such as social structure or roles, but results from a continuous process of meaning attribution which is always emerging, in a state of flux and subject to change. The individual constructs, modifies, pieces together, weighs up the pros and cons and bargains. Third, this process takes place in a social context. Individuals align their actions to those of others. They do this by 'taking the role of the other', by making indications to 'themselves' about the likely responses of 'others'. They construct how others wish or might act in certain circumstances, and how they themselves might act. They might try to 'manage' the impressions others have of them, put on a 'performance', try to influence others' 'definition of the situation'.

Instead of focusing on the individual, then, and his or her personality characteristics, or on how the social structure or social situation causes individual behaviour, symbolic interactionists direct their attention at the nature of interaction, the dynamic activities taking place between people. In focusing on the interaction itself as a unit of study, the symbolic interactionist creates a more active image of the human being and rejects the image of the passive, determined organism. Individuals interact; societies are made up of interacting individuals. People are constantly undergoing change in interaction and society is changing through interaction. Interaction implies human beings acting in relation to each other, taking each other into account, acting, perceiving, interpreting, acting again. Hence, a more dynamic and active human being emerges rather than an actor merely responding to others. Woods (1983:15–16) summarizes key emphases of symbolic interaction thus:

• individuals as constructors of their own actions;
• the various components of the self and how they interact; the indications made to self, meanings attributed, interpretive mechanisms, definitions of the situation; in short, the world of subjective meanings, and the symbols by which they are produced and represented;
• the process of negotiation, by which meanings are continually being constructed;
• the social context in which they occur and whence they derive;
• by taking the 'role of the other'—a dynamic concept involving the construction of how others wish to or might act in a certain circumstance, and how individuals themselves might act—individuals align their actions to those of others.

A characteristic common to the phenomenological, ethnomethodological and symbolic interactionist perspectives—and one which makes them singularly attractive to the would-be educational researcher—is the way they fit naturally to the kind of concentrated action found in classrooms and schools. Yet another shared characteristic is the manner in which they are able to preserve the integrity of the situation where they are employed. This is to say that the influence of the researcher in structuring, analysing and interpreting the situation is present to a much smaller degree than would be the case with a more traditionally oriented research approach.

Criticisms of the naturalistic and interpretive approaches

Critics have wasted little time in pointing out what they regard as weaknesses in these newer qualitative perspectives. They argue that while it is undeniable that our understanding of the actions of our fellow-beings necessarily requires knowledge of their intentions, this, surely, cannot be said to comprise the purpose of a social science. As Rex has observed:

Whilst patterns of social reactions and institutions may be the product of the actors' definitions of the situations there is also the possibility that those actors might be falsely conscious and that sociologists have an obligation to seek an objective perspective which is not necessarily that of any of the participating actors at all… We need not be confined purely and simply to that…social reality which is made available to us by participant actors themselves.

(Rex, 1974)

Giddens similarly argues against the likely relativism of this paradigm:

No specific person can possess detailed knowledge of anything more than the particular sector of society in which he participates, so that there still remains the task of making into an explicit and comprehensive body of knowledge that which is only known in a partial way by lay actors themselves.

(Giddens, 1976)

While these more recent perspectives have presented models of people that are more in keeping with common experience, their methodologies are by no means above reproof. Some argue that advocates of an anti-positivist stance have gone too far in abandoning scientific procedures of verification and in giving up hope of discovering useful generalizations about behaviour (see Mead, 1934). Are there not dangers, it is suggested, in rejecting the approach of physics in favour of methods more akin to literature, biography and journalism? Some specific criticisms of the methodologies used are well directed:

If the carefully controlled interviews used in social surveys are inaccurate, how about the uncontrolled interviews favoured by the (newer perspectives)? If sophisticated ethological studies of behaviour are not good enough, are participant observation studies any better?

(Argyle, 1978)

And what of the insistence of the interpretive methodologies on the use of verbal accounts to get at the meaning of events, rules and intentions? Are there not dangers? Subjective reports are sometimes incomplete and they are sometimes misleading.

(Bernstein, 1974)

Bernstein’s criticism is directed at the overrid-ing concern of phenomenologists andethnomethodologists with the meanings of situ-ations and the ways in which these meaningsare negotiated by the actors involved. What isoverlooked about such negotiated meanings,

observes Bernstein, is that they ‘presuppose astructure of meanings (and their history) widerthan the area of negotiation. Situated activitiespresuppose a situation; they presuppose relation-ships between situations; they presuppose setsof situations’ (Bernstein, 1974).

Bernstein’s point is that the very processwhereby one interprets and defines a situationis itself a product of the circumstances in whichone is placed. One important factor in such cir-cumstances that must be considered is thepower of others to impose their own definitionsof situations upon participants. Doctors’ con-sulting rooms and headteachers’ studies are lo-cations in which inequalities in power are regu-larly imposed upon unequal participants. Theability of certain individuals, groups, classesand authorities to persuade others to accepttheir definitions of situations demonstrates thatwhile—as ethnomethodologists insist—socialstructure is a consequence of the ways in whichwe perceive social relations, it is clearly morethan this. Conceiving of social structure as ex-ternal to ourselves helps us take its self-evidenteffects upon our daily lives into our under-standing of the social behaviour going on aboutus. Here is rehearsed the tension betweenagency and structure of social theorists (Layder,1994); the danger of interactionist and inter-pretive approaches is their relative neglect ofthe power of external-structural—forces toshape behaviour and events. There is a risk ininterpretive approaches that they becomehermetically sealed from the world outside theparticipants’ theatre of activity—they put arti-ficial boundaries around subjects’ behaviour.Just as positivistic theories can be criticized fortheir macro-sociological persuasion, so inter-pretive and qualitative can be criticized fortheir narrowly micro-sociological persuasion.

Critical theory and critical educational research

Positivist and interpretive paradigms are essentially concerned with understanding phenomena through two different lenses. Positivism strives for objectivity, measurability, predictability, controllability, patterning, the construction of laws and rules of behaviour, and the ascription of causality; the interpretive paradigms strive to understand and interpret the world in terms of its actors. In the former, observed phenomena are important; in the latter, meanings and interpretations are paramount. Habermas (1984:109–10) describes this latter as a 'double hermeneutic', where people strive to interpret and operate in an already interpreted world. By way of contrast, an emerging approach to educational research is the paradigm of critical educational research. This regards the two previous paradigms as presenting incomplete accounts of social behaviour by their neglect of the political and ideological contexts of much educational research. Positivistic and interpretive paradigms are seen as preoccupied with technical and hermeneutic knowledge respectively (Gage, 1989). The paradigm of critical educational research is heavily influenced by the early work of Habermas and, to a lesser extent, his predecessors in the Frankfurt School, most notably Adorno, Marcuse, Horkheimer and Fromm. Here the expressed intention is deliberately political—the emancipation of individuals and groups in an egalitarian society.

Critical theory is explicitly prescriptive and normative, entailing a view of what behaviour in a social democracy should entail (Fay, 1987; Morrison, 1995a). Its intention is not merely to give an account of society and behaviour but to realize a society that is based on equality and democracy for all its members. Its purpose is not merely to understand situations and phenomena but to change them. In particular it seeks to emancipate the disempowered, to redress inequality and to promote individual freedoms within a democratic society.

In this enterprise critical theory identifies the 'false' or 'fragmented' consciousness (Eagleton, 1991) that has brought an individual or social group to relative powerlessness or, indeed, power, and it questions the legitimacy of this. It holds up to the lights of legitimacy and equality issues of repression, voice, ideology, power, participation, representation, inclusion, and interests. It argues that much behaviour (including research behaviour) is the outcome of particular illegitimate, dominatory and repressive factors, illegitimate in the sense that they do not operate in the general interest—one person's or group's freedom and power is bought at the price of another's freedom and power. Hence critical theory seeks to uncover the interests at work in particular situations and to interrogate the legitimacy of those interests—identifying the extent to which they are legitimate in their service of equality and democracy. Its intention is transformative: to transform society and individuals to social democracy. In this respect the purpose of critical educational research is intensely practical—to bring about a more just, egalitarian society in which individual and collective freedoms are practised, and to eradicate the exercise and effects of illegitimate power. The pedigree of critical theory in Marxism, thus, is not difficult to discern. For critical theorists, researchers can no longer claim neutrality and ideological or political innocence.

Critical theory and critical educational research, then, have their substantive agenda—for example examining and interrogating: the relationships between school and society—how schools perpetuate or reduce inequality; the social construction of knowledge and curricula, who defines worthwhile knowledge, what ideological interests this serves, and how this reproduces inequality in society; how power is produced and reproduced through education; whose interests are served by education and how legitimate these are (e.g. the rich, white, middle-class males rather than poor, non-white, females).

The significance of critical theory for research is immense, for it suggests that much social research is comparatively trivial in that it accepts rather than questions given agendas for research. That this is compounded by the nature of funding for research underlines the political dimension of research sponsorship (discussed later) (Norris, 1990). Critical theorists would argue that the positivist and interpretive paradigms are essentially technicist, seeking to understand and render more efficient an existing situation, rather than to question or transform it.

Habermas (1972) offers a useful tripartite conceptualization of interests that catches the three paradigms of research in this chapter. He suggests that knowledge—and hence research knowledge—serves different interests. Interests, he argues, are socially constructed, and are 'knowledge-constitutive', because they shape and determine what counts as the objects and types of knowledge. Interests have an ideological function (Morrison, 1995a), for example a 'technical interest' (discussed below) can have the effect of keeping the empowered in their empowered position and the disempowered in their powerlessness—i.e. reinforcing and perpetuating the status quo. An 'emancipatory interest' (discussed below) threatens the status quo. In this view knowledge—and research knowledge—is not neutral (see also Mannheim, 1936). What counts as worthwhile knowledge is determined by the social and positional power of the advocates of that knowledge. The link here between objects of study and communities of scholars echoes Kuhn's (1962) notions of paradigms and paradigm shifts, where the field of knowledge or paradigm is seen to be only as good as the evidence and the respect in which it is held by 'authorities'. Knowledge and definitions of knowledge reflect the interests of the community of scholars who operate in particular paradigms. Habermas (1972) constructs the definition of worthwhile knowledge and modes of understanding around three cognitive interests:

1 prediction and control;
2 understanding and interpretation;
3 emancipation and freedom.

He names these the 'technical', 'practical' and 'emancipatory' interests respectively. The technical interest characterizes the scientific, positivist method outlined earlier, with its emphasis on laws, rules, prediction and control of behaviour, with passive research objects—instrumental knowledge. The 'practical' interest, an attenuation of the positivism of the scientific method, is exemplified in the hermeneutic, interpretive methodologies outlined in the qualitative approaches earlier (e.g. symbolic interactionism). Here research methodologies seek to clarify, understand and interpret the communications of 'speaking and acting subjects' (Habermas, 1974:8). Hermeneutics focuses on interaction and language; it seeks to understand situations through the eyes of the participants, echoing the verstehen approaches of Weber and premised on the view that reality is socially constructed (Berger and Luckmann, 1967). Indeed Habermas (1988:12) suggests that sociology must understand social facts in their cultural significance and as socially determined. Hermeneutics involves recapturing the meanings of interacting others, recovering and reconstructing the intentions of the other actors in a situation. Such an enterprise involves the analysis of meaning in a social context (Held, 1980). Gadamer (1975:273) argues that the hermeneutic sciences (e.g. qualitative approaches) involve the fusion of horizons between participants. Meanings rather than phenomena take on significance in this paradigm.

The emancipatory interest subsumes the previous two paradigms; it requires them but goes beyond them (Habermas, 1972:211). It is concerned with praxis—action that is informed by reflection with the aim to emancipate (Kincheloe, 1991:177). The twin intentions of this interest are to expose the operation of power and to bring about social justice, as domination and repression act to prevent the full existential realization of individual and social freedoms (Habermas, 1979:14). The task of this knowledge-constitutive interest, indeed of critical theory itself, is to restore to consciousness those suppressed, repressed and submerged determinants of unfree behaviour with a view to their dissolution (Habermas, 1984:194–5).

What we have in effect, then, in Habermas's early work is an attempt to conceptualize three research styles: the scientific, positivist style; the interpretive style; and the emancipatory, ideology critical style. Not only does critical theory have its own research agenda, but it also has its own research methodologies, in particular ideology critique and action research. With regard to ideology critique, a particular reading of ideology is being adopted here, as the suppression of generalizable interests (Habermas, 1976:113), where systems, groups and individuals operate in rationally indefensible ways because their power to act relies on the disempowering of other groups, i.e. that their principles of behaviour are not universalizable. Ideology—the values and practices emanating from particular dominant groups—is the means by which powerful groups promote and legitimate their particular—sectoral—interests at the expense of disempowered groups. Ideology critique exposes the operation of ideology in many spheres of education, the working out of vested interests under the mantle of the general good. The task of ideology critique is to uncover the vested interests at work which may be occurring consciously or subliminally, revealing to participants how they may be acting to perpetuate a system which keeps them either empowered or disempowered (Geuss, 1981), i.e. which suppresses a generalizable interest. Explanations for situations might be other than those 'natural', taken for granted, explanations that the participants might offer or accept. Situations are not natural but problematic (Carr and Kemmis, 1986). They are the outcomes or processes wherein interests and powers are protected and suppressed, and one task of ideology critique is to expose this (Grundy, 1987). The interests at work are uncovered by ideology critique, which, itself, is premised on reflective practice (Morrison, 1995a, 1995b, 1996a).

Habermas (1972:230) suggests that ideology critique through reflective practice can be addressed in four stages:

Stage 1 A description and interpretation of the existing situation—a hermeneutic exercise that identifies and attempts to make sense of the current situation (echoing the verstehen approaches of the interpretive paradigm).
Stage 2 A penetration of the reasons that brought the existing situation to the form that it takes—the causes and purposes of a situation and an evaluation of their legitimacy, involving an analysis of interests and ideologies at work in a situation, their power and legitimacy (both in micro- and macro-sociological terms). In his early work (1972) Habermas likens this to psychoanalysis as a means for bringing into consciousness of 'patients' those repressed, distorted and oppressive conditions, experiences and factors that have prevented them from a full, complete and accurate understanding of their conditions, situations and behaviour, and that, on such exposure and examination, will be liberatory and emancipatory. Critique here serves to reveal to individuals and groups how their views and practices might be ideological distortions that, in their effects, are perpetuating a social order or situation that works against their democratic freedoms, interests and empowerment (see also Carr and Kemmis, 1986:138–9).
Stage 3 An agenda for altering the situation—in order for moves to an egalitarian society to be furthered.
Stage 4 An evaluation of the achievement of the situation in practice.

In the world of education Habermas's stages are paralleled by Smyth (1989) who, too, denotes a four-stage process: description (what am I doing?); information (what does it mean?); confrontation (how did I come to be like this?); and reconstruction (how might I do things differently?). It can be seen that ideology critique here has both a reflective, theoretical and a practical side to it; without reflection it is hollow and without practice it is empty.

As ideology is not mere theory but impacts directly on practice (Eagleton, 1991) there is a strongly practical methodology implied by critical theory, which articulates with action research (Callewaert, 1999). Action research (discussed in Chapter 13), as its name suggests, is about research that impacts on, and focuses on, practice. In its espousal of practitioner research, for example teachers in schools, participant observers and curriculum developers, action research recognizes the significance of contexts for practice—locational, ideological, historical, managerial, social. Furthermore it accords power to those who are operating in those contexts, for they are both the engines of research and of practice. In that sense the claim is made that action research is strongly empowering and emancipatory in that it gives practitioners a 'voice' (Carr and Kemmis, 1986; Grundy, 1987), participation in decision-making, and control over their environment and professional lives. Whether the strength of the claims for empowerment is as great as their proponents would hold is another matter, for action research might be relatively powerless in the face of mandated changes in education. Here action research might be more concerned with intervening in existing practice to ensure that mandated change is addressed efficiently and effectively.

Morrison (1995a) suggests that critical theory, because it has a practical intent to transform and empower, can—and should—be examined and perhaps tested empirically. For example, critical theory claims to be empowering; that is a testable proposition. Indeed, in a departure from some of his earlier writing, in some of his later work Habermas (1990) acknowledges this; he argues for the need to find 'counter examples' (p. 6), for 'critical testing' (p. 7) and for empirical verification (p. 117). He acknowledges that his views have only 'hypothetical status' (p. 32) and need to be checked against specific cases (p. 9). One could suggest, for instance, that the effectiveness of his critical theory can be examined by charting the extent to which equality, freedom, democracy, emancipation and empowerment have been realized by dint of his theory; the extent to which transformative practices have been addressed or occurred as a result of his theory; the extent to which subscribers to his theory have been able to assert their agency; and the extent to which his theories have broken down the barriers of instrumental rationality. The operationalization and testing (or empirical investigation) of his theories clearly is a major undertaking, and one which Habermas has not done. In this respect critical theory, a theory that strives to improve practical living, runs the risk of becoming merely contemplative.

Criticisms of approaches from critical theory

There are several criticisms that have been voiced against critical approaches. Morrison (1995a) suggests that there is an artificial separation between Habermas's three interests—they are drawn far more sharply than is warranted (Hesse, 1982; Bernstein, 1976, 1983:33). For example, one has to bring hermeneutic knowledge to bear on positivist science and vice versa in order to make meaning of each other and in order to judge their own status. Further, the link between ideology critique and emancipation is neither clear nor proven, nor a logical necessity (Morrison, 1995a:67)—whether a person or society can become emancipated simply by the exercise of ideology critique or action research is an empirical rather than a logical matter (Morrison, 1995a; Wardekker and Miedama, 1997). Indeed one can become emancipated by means other than ideology critique; emancipated societies do not necessarily demonstrate or require an awareness of ideology critique. Moreover, it could be argued that the rationalistic appeal of ideology critique actually obstructs action designed to bring about emancipation. Roderick (1986:65), for example, questions whether the espousal of ideology critique is itself as ideological as the approaches that it proscribes. Habermas, in his allegiance to the view of the social construction of knowledge through 'interests', is inviting the charge of relativism.

Whilst the claim to there being three forms of knowledge has the epistemological attraction of simplicity, one has to question this very simplicity (e.g. Keat, 1981:67); there are a multitude of interests and ways of understanding the world and it is simply artificial to reduce these to three. Indeed it is unclear whether Habermas, in his three knowledge-constitutive interests, is dealing with a conceptual model, a political analysis, a set of generalities, a set of transhistorical principles, a set of temporally specific observations, or a set of loosely defined slogans (Morrison, 1995a:71) that survive only by dint of their ambiguity (Kolakowski, 1978).

Lakomski (1999) questions the acceptability of the consensus theory of truth on which Habermas's work is premised (pp. 179–82); she argues that Habermas's work is silent on social change, and is little more than speculation, a view echoed by Fendler's (1999) criticism of critical theory as inadequately problematizing subjectivity and ahistoricity.

More fundamental to a critique of this approach is the view that critical theory has a deliberate political agenda, and that the task of the researcher is not to be an ideologue or to have an agenda, but to be dispassionate, disinterested and objective (Morrison, 1995a). Of course, critical theorists would argue that the call for researchers to be ideologically neutral is itself ideologically saturated with laissez-faire values which allow the status quo to be reproduced, i.e. that the call for researchers to be neutral and disinterested is just as value laden as is the call for them to intrude their own perspectives. The rights of the researcher to move beyond disinterestedness are clearly contentious, though the safeguard here is that the researcher's is only one voice in the community of scholars (Kemmis, 1982). Critical theorists as researchers have been hoisted by their own petard, for if they are to become more than merely negative Jeremiahs and sceptics, berating a particular social order that is dominated by scientism and instrumental rationality (Eagleton, 1991; Wardekker and Miedama, 1997), then they have to generate a positive agenda, but in so doing they are violating the traditional objectivity of researchers. Because their focus is on an ideological agenda, they themselves cannot avoid acting ideologically (Morrison, 1995a).

Claims have been made for the power of action research to empower participants as researchers (e.g. Carr and Kemmis, 1986; Grundy, 1987). This might be over-optimistic in a world in which power is often exercised through statute; the reality of political power seldom extends to teachers. That teachers might be able to exercise some power in schools but that this has little effect on the workings of society at large was caught in Bernstein's famous comment (1970) that 'education cannot compensate for society'. Giving action researchers a small degree of power (to research their own situations) has little effect on the real locus of power and decision-making, which often lies outside the control of action researchers. Is action research genuinely and full-bloodedly empowering and emancipatory? Where is the evidence?

Critical theory and curriculum research

In terms of a book on research methods, the tenets of critical theory suggest their own substantive fields of inquiry and their own methods (e.g. ideology critique and action research). Beyond that the contribution to this text on empirical research methods is perhaps limited by the fact that the agenda of critical theory is highly particularistic, prescriptive and, as has been seen, problematical. Though it is an influential paradigm, it is influential in certain fields rather than in others. For example, its impact on curriculum research has been far-reaching.

It has been argued for many years that the most satisfactory account of the curriculum is given by a modernist, positivist reading of the development of education and society. This has its curricular expression in Tyler's (1949) famous and influential rationale for the curriculum in terms of four questions:

1 What educational purposes should the school seek to attain?
2 What educational experiences can be provided that are likely to attain these purposes?
3 How can these educational experiences be effectively organized?
4 How can we determine whether these purposes are being attained?

Underlying this rationale is a view that the curriculum is controlled (and controllable), ordered, pre-determined, uniform, predictable and largely behaviourist in outcome—all elements of the positivist mentality that critical theory eschews. Tyler's rationale resonates sympathetically with a modernist, scientific, managerialist mentality of society and education that regards ideology and power as unproblematic, indeed it claims the putative political neutrality and objectivity of positivism (Doll, 1993); it ignores the advances in psychology and psychopedagogy made by constructivism.

However, this view has been criticized for precisely these sympathies. Doll (1993) argues that it represents a closed system of planning and practice that sits uncomfortably with the notion of education as an opening process and with the view of postmodern society as open and diverse, multidimensional, fluid and with power less monolithic and more problematical. This view takes seriously the impact of chaos and complexity theory and derives from them some important features for contemporary curricula. These are incorporated into a view of curricula as being rich, relational, recursive and rigorous (Doll, 1993), with an emphasis on emergence, process epistemology and constructivist psychology.

Not all knowledge can be included in the curriculum; the curriculum is a selection of what is deemed to be worthwhile knowledge. The justification for that selection reveals the ideologies and power in decision-making in society and through the curriculum. Curriculum is an ideological selection from a range of possible knowledge. This resonates with a principle from Habermas (1972) that knowledge and its selection are neither neutral nor innocent.

Ideologies can be treated unpejoratively as sets of beliefs or, more sharply, as sets of beliefs emanating from powerful groups in society, designed to protect the interests of the dominant. If curricula are value-based then why is it that some values hold more sway than others? The link between values and power is strong. This theme asks not only what knowledge is important but whose knowledge is important in curricula, what and whose interests such knowledge serves, and how the curriculum and pedagogy serve (or do not serve) differing interests. Knowledge is not neutral (as was the tacit view in modernist curricula). The curriculum is ideologically contestable terrain.

The study of the sociology of knowledge indicates how the powerful might retain their power through curricula and how knowledge and power are legitimated in curricula. The study of the sociology of knowledge suggests that the curriculum should be both subject to ideology critique and itself promote ideology critique in students. A research agenda for critical theorists, then, is how the curriculum perpetuates the societal status quo and how it can (and should) promote equality in society.

The notion of ideology critique engages the early writings of Habermas (1972), in particular his theory of three knowledge-constitutive interests. His technical interest (in control and predictability) resonates with Tyler's model of the curriculum and reveals itself in technicist, instrumentalist and scientistic views of curricula that are to be 'delivered' to passive recipients—the curriculum is simply another commodity in a consumer society in which differential cultural capital is inevitable. Habermas's 'hermeneutic' interest (in understanding others' perspectives and views) resonates with a process view of the curriculum. His emancipatory interest (in promoting social emancipation, equality, democracy, freedoms and individual and collective empowerment) requires an exposure of the ideological interests at work in curricula in order that teachers and students can take control of their own lives for the collective, egalitarian good. Habermas's emancipatory interest denotes an inescapably political reading of the curriculum and the purposes of education—the movement away from authoritarianism and elitism and towards social democracy.

Habermas’s work underpins and informsmuch contemporary and recent curriculumtheory (e.g. Grundy, 1987; Apple, 1990;UNESCO, 1996) and is a useful heuristic devicefor understanding the motives behind the heavyprescription of curriculum content in, for ex-ample, the UK, New Zealand, the USA andFrance. For instance, one can argue that theNational Curriculum of England and Wales isheavy on the technical and hermeneutic inter-ests but very light on the emancipatory interest

CRITICAL THEORY AND CURRICULUM RESEARCH

THE NATURE OF INQUIRY34

(Morrison, 1995a), and that this (either deliber-ately or in its effects) supports—if not contrib-utes to—the reproduction of social inequality.As Bernstein (1971) argues: ‘how a society se-lects, classifies, distributes, transmits and evalu-ates the educational knowledge it considers tobe public, reflects both the distribution of powerand the principles of social control’ (p. 47).

Further, one can argue that the move towards modular and competence-based curricula reflects the commodification, measurability and trivialization of curricula, the technicist control of curricula, a move toward the behaviourism of positivism and a move away from the transformatory nature of education, a silencing of critique, and the imposition of a narrow ideology of instrumental utility on the curriculum.

Several writers on contemporary curriculum theory (e.g. McLaren, 1995; Leistyna, Woodrum and Sherblom, 1996) argue that power is a central, defining concept in matters of the curriculum. Here considerable importance is accorded to the political agenda of the curriculum, and the empowerment of individuals and societies is an inescapable consideration in the curriculum. One means of developing student and societal empowerment finds its expression in Habermas's (1972) emancipatory interest and critical pedagogy.

In the field of critical pedagogy the argument is advanced that educators must work with, and on, the lived experience that students bring to the pedagogical encounter rather than imposing a dominatory curriculum that reproduces social inequality. In this enterprise teachers are to transform the experience of domination in students and empower them to become 'emancipated' in a full democracy. Students' everyday experiences of oppression, of being 'silenced', of having their cultures and 'voices' excluded from curricula and decision-making are to be interrogated for the ideological messages that are contained in such acts. Raising awareness of such inequalities is an important step to overcoming them. Teachers and students together move forward in the progress towards 'individual autonomy within a just society' (Masschelein, 1991:97). In place of centrally prescribed and culturally biased curricula that students simply receive, critical pedagogy regards the curriculum as a form of cultural politics in which participants in (rather than recipients of) curricula question the cultural and dominatory messages contained in curricula and replace them with a 'language of possibility' and empowering, often community-related curricula. In this way curricula serve the 'socially critical' rather than the culturally and ideologically passive school.

One can discern a Utopian and generalized tenor in some of this work, and applying critical theory to education can be criticized for its limited comments on practice. Indeed Miedama and Wardekker (1999:68) go so far as to suggest that critical pedagogy has had its day, that it was a stillborn child and that critical theory is a philosophy of science without a science (p. 75)! Nevertheless it is an important field, for it recognizes and makes much of the fact that curricula and pedagogy are problematical and political.

A summary of the three paradigms

Box 1.8 summarizes some of the broad differences between the three approaches that we have discussed so far.

Feminist research

It is perhaps no mere coincidence that feminist research should surface as a serious issue at the same time as ideology-critical paradigms for research; they are closely connected. Usher (1996), although criticizing Habermas (p. 124) for his faith in family life as a haven from a heartless, exploitative world, nevertheless sets out several principles of feminist research that resonate with the ideology critique of the Frankfurt School:

1 The acknowledgement of the pervasive influence of gender as a category of analysis and organization.
2 The deconstruction of traditional commitments to truth, objectivity and neutrality.
3 The adoption of an approach to knowledge creation which recognizes that all theories are perspectival.
4 The utilization of a multiplicity of research methods.
5 The inter-disciplinary nature of feminist research.
6 Involvement of the researcher and the people being researched.
7 The deconstruction of the theory/practice relationship.

Her suggestions build on earlier recognition of the significance of addressing the 'power issue' in research ('whose research', 'research for whom', 'research in whose interests') and the need to address the emancipatory element of educational research—that research should be empowering to all participants. The paradigm of critical theory questioned the putative objective, neutral, value-free, positivist, 'scientific' paradigm for the splitting of theory and practice and for its reproduction of asymmetries of power (reproducing power differentials in the research community and for treating participants/respondents instrumentally—as objects).

Feminist research, too, challenges the legitimacy of research that does not empower oppressed and otherwise invisible groups—women. Positivist research served a given set of power relations, typically empowering the white, male-dominated research community at the expense of other groups whose voices were silenced. It had this latent, if not manifest or deliberate (Merton, 1967), function or outcome; it had this substantive effect (or maybe even agenda). Feminist research seeks to demolish and replace this with a different substantive agenda—of empowerment, voice, emancipation, equality and representation for oppressed groups. In doing so, it recognizes the necessity for foregrounding issues of power, silencing and voicing, ideology critique and a questioning of the legitimacy of research that does not emancipate hitherto disempowered groups.

Box 1.8 Differing approaches to the study of behaviour

Normative | Interpretive | Critical
Society and the social system | The individual | Societies, groups and individuals
Medium/large-scale research | Small-scale research | Small-scale research
Impersonal, anonymous forces regulating behaviour | Human actions continuously recreating social life | Political, ideological factors, power and interests shaping behaviour
Model of natural sciences | Non-statistical | Ideology critique and action research
'Objectivity' | 'Subjectivity' | Collectivity
Research conducted 'from the outside' | Personal involvement of the researcher | Participant researchers, researchers and facilitators
Generalizing from the specific | Interpreting the specific | Critiquing the specific
Explaining behaviour/seeking causes | Understanding actions/meanings rather than causes | Understanding, interrogating, critiquing, transforming actions and interests
Assuming the taken-for-granted | Investigating the taken-for-granted | Interrogating and critiquing the taken-for-granted
Macro-concepts: society, institutions, norms, positions, roles, expectations | Micro-concepts: individual perspective, personal constructs, negotiated meanings, definitions of situations | Macro- and micro-concepts: political and ideological interests, operations of power
Structuralists | Phenomenologists, symbolic interactionists, ethnomethodologists | Critical theorists, action researchers, practitioner researchers
Technical interest | Practical interest | Emancipatory interest

The issue of empowerment resonates with the work of Freire (1970) on 'conscientization', wherein oppressed groups—in his case the illiterate poor—are taught to read and write by focusing on their lived experiences, e.g. of power, poverty, oppression, such that a political agenda is raised in their learning. In feminist research, women's consciousness of oppression, exploitation and disempowerment becomes a focus for research—the paradigm of ideology critique.

Far from treating educational research as objective and value-free, feminists argue that this is merely a smokescreen that serves the existing, disempowering status quo, and that the subjective and value-laden nature of research must be surfaced, exposed and engaged (Haig, 1999:223). This entails taking seriously issues of reflexivity, the effects of the research on the researched and the researchers, the breakdown of the positivist paradigm, and the raising of consciousness of the purposes and effects of the research. Indeed Ribbens and Edwards (1997) suggest that it is important to ask how researchers can produce work with reference to theoretical perspectives and formal traditions and requirements of public, academic knowledge whilst still remaining faithful to the experiences and accounts of research participants. Denzin (1989), Mies (1993) and Haig (1999) argue for several principles in feminist research:

• The asymmetry of gender relations and representation must be studied reflexively as constituting a fundamental aspect of social life (which includes educational research).
• Women's issues, their history, biography and biology, feature as a substantive agenda/focus in research—moving beyond mere perspectival/methodological issues to setting a research agenda.
• The raising of consciousness of oppression, exploitation, empowerment, equality, voice and representation is a methodological tool.
• The acceptability and notion of objectivity and objective research must be challenged.
• The substantive, value-laden dimensions and purposes of feminist research must be paramount.
• Research must empower women.
• Research need not only be undertaken by academic experts.
• Collective research is necessary—women need to collectivize their own individual histories if they are to appropriate these histories for emancipation.
• There is a commitment to revealing core processes and recurring features of women's oppression.
• An insistence on the inseparability of theory and practice.
• An insistence on the connections between the private and the public, between the domestic and the political.
• A concern with the construction and reproduction of gender and sexual difference.
• A rejection of narrow disciplinary boundaries.
• A rejection of the artificial subject/researcher dualism.
• A rejection of positivism and objectivity as male mythology.
• The increased use of qualitative, introspective biographical research techniques.
• A recognition of the gendered nature of social research and the development of anti-sexist research strategies.
• A review of the research process as consciousness and awareness raising and as fundamentally participatory.
• The primacy of women's personal subjective experience.
• The rejection of hierarchies in social research.
• The vertical, hierarchical relationships of researchers/research community and research objects, in which the research itself can become an instrument of domination and the reproduction and legitimation of power elites, has to be replaced by research that promotes the interests of dominated, oppressed, exploited groups.
• The recognition of equal status and reciprocal relationships between subjects and researchers.
• There is a need to change the status quo, not merely to understand or interpret it.
• The research must be a process of conscientization, not research solely by experts for experts, but to empower oppressed participants.

Gender shapes research agendas, the choice of topics and foci, the choice of data collection techniques and the relationships between researchers and researched. Several methodological principles flow from a 'rationale' for feminist research (Denzin, 1989; Mies, 1993; Haig, 1997, 1999):

• The replacement of quantitative, positivist, objective research with qualitative, interpretive, ethnographic reflexive research.
• Collaborative, collectivist research undertaken by collectives—often of women—combining researchers and researched in order to break subject/object and hierarchical, non-reciprocal relationships.
• The appeal to alleged value-free, neutral, indifferent and impartial research has to be replaced by conscious, deliberate partiality—through researchers identifying with participants.
• The use of ideology-critical approaches and paradigms for research.
• The spectator theory or contemplative theory of knowledge in which researchers research from ivory towers has to be replaced by a participatory approach—perhaps action research—in which all participants (including researchers) engage in the struggle for women's emancipation—a liberatory methodology.
• The need to change the status quo is the starting point for social research—if we want to know something we change it. (Mies (1993) cites the Chinese saying that if you want to know a pear then you must chew it!)
• The extended use of triangulation and multiple methods (including visual techniques such as video, photograph and film).
• The use of linguistic techniques such as conversational analysis.
• The use of textual analysis such as deconstruction of documents and texts about women.
• The use of meta-analysis to synthesize findings from individual studies (see Chapter 12).
• A move away from numerical surveys and a critical evaluation of them, including a critique of question wording.

The drive towards collective, egalitarian and emancipatory qualitative research is seen as necessary if women are to avoid colluding in their own oppression by undertaking positivist, uninvolved, objective research. Mies (ibid.: 67) argues that for women to undertake this latter form of research puts them into a schizophrenic position of having to adopt methods which contribute to their own subjugation and repression by ignoring their experience (however vicarious) of oppression and by forcing them to abide by the 'rules of the game' of the competitive, male-dominated academic world. In this view, argue Roman and Apple (1990:59), it is not enough for women simply to embrace ethnographic forms of research, as this does not necessarily challenge the existing and constituting forces of oppression or asymmetries of power. Ethnographic research, they argue, has to be accompanied by ideology critique; indeed they argue that the transformative, empowering, emancipatory potential of a piece of research is a critical standard for evaluating that piece of research.

However, these views of feminist research and methodology are not unchallenged by other feminist researchers. For example Jayaratne (1993:109) argues for 'fitness for purpose', suggesting that exclusive focus on qualitative methodologies might not be appropriate either for the research purposes or, indeed, for advancing the feminist agenda. She refutes the argument that quantitative methods are unsuitable for feminists because they neglect the emotions of the people under study. Indeed she argues for beating quantitative research on its own grounds (p. 121), suggesting the need for feminist quantitative data and methodologies in order to counter sexist quantitative data in the social sciences. She suggests that feminist researchers can accomplish this without 'selling out' to the positivist, male-dominated academic research community.

An example of a feminist approach to research is the Girls Into Science and Technology (GIST) action research project. This took place over three years and involved 2,000 students and their teachers in ten co-educational, comprehensive schools in the Greater Manchester area of the UK, eight schools serving as the bases of the 'action', the remaining two acting as 'controls'. Several publications have documented the methodologies and findings of the GIST study (Whyte, 1986; Kelly, 1986, 1989a, 1989b; Kelly and Smail, 1986), described by its co-director as 'simultaneous-integrated action research' (Kelly, 1987) (i.e. integrating action and research).

Research and evaluation

The preceding discussion has suggested that research and politics are inextricably bound together. This can be taken further, as researchers in education would be well advised to give serious consideration to the politics of their research enterprise and the ways in which politics can steer research. For example one can detect a trend in educational research towards more evaluative research, where, for instance, a researcher's task is to evaluate the effectiveness (often of the implementation) of given policies and projects. This is particularly true in the case of 'categorically funded' and commissioned research—research which is funded by policy-makers (e.g. governments, fund-awarding bodies) under any number of different headings that those policy-makers devise (Burgess, 1993). On the one hand this is laudable, for it targets research directly towards policy; on the other hand it is dangerous in that it enables others to set the research agenda. Research ceases to be open-ended, pure research, and, instead, becomes the evaluation of given initiatives. Less politically charged, much research is evaluative, and indeed there are many similarities between research and evaluation. The two overlap but possess important differences. The problem of trying to identify differences between evaluation and research is compounded because not only do they share several of the same methodological characteristics but one branch of research is called evaluative research or applied research. This is often kept separate from 'blue skies' research in that the latter is open-ended, exploratory, contributes something original to the substantive field and extends the frontiers of knowledge and theory, whereas in the former the theory is given rather than interrogated or tested.

One can detect many similarities between the two in that they both use methodologies and methods of social science research generally, covering, for example:

• the need to clarify the purposes of the investigation;
• the need to operationalize purposes and areas of investigation;
• the need to address principles of research design that include:

(a) formulating operational questions;
(b) deciding appropriate methodologies;
(c) deciding which instruments to use for data collection;
(d) deciding on the sample for the investigation;
(e) addressing reliability and validity in the investigation and instrumentation;
(f) addressing ethical issues in conducting the investigation;
(g) deciding on data analysis techniques;
(h) deciding on reporting and interpreting results.

Indeed Norris (1990) argues that evaluation applies research methods to shed light on a problem of action (Norris, 1990:97); he suggests that evaluation can be viewed as an extension of research, because it shares its methodologies and methods, and because evaluators and researchers possess similar skills in conducting investigations. In many senses the eight features outlined above embrace many elements of the scientific method, which Smith and Glass (1987) set out thus:

Step 1 A theory about the phenomenon exists.
Step 2 A research problem within the theory is detected and a research question is devised.
Step 3 A research hypothesis is deduced (often about the relationship between constructs).
Step 4 A research design is developed, operationalizing the research question and stating the null hypothesis.
Step 5 The research is conducted.
Step 6 The null hypothesis is tested based on the data gathered.
Step 7 The original theory is revised or supported based on the results of the hypothesis testing.

Indeed, if steps 1 and 7 were removed then there would be nothing to distinguish between research and evaluation. Both researchers and evaluators pose questions and hypotheses, select samples, manipulate and measure variables, compute statistics and data, and state conclusions. Nevertheless several commentators suggest that there are important differences between evaluation and research that are not always obvious simply by looking at publications. Publications do not always make clear the background events that gave rise to the investigation, the uses to which the material they report will be put, or what the dissemination rights are and who holds them (Sanday, 1993).
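
Steps 3 to 6 can be made concrete with a minimal sketch. The example below is ours, not Smith and Glass's: it assumes the SciPy library is available and uses invented, purely illustrative scores from a hypothetical study comparing an intervention group with a control group, testing the null hypothesis of no difference between group means with an independent-samples t-test.

```python
from scipy import stats

# Hypothetical post-test scores (illustrative numbers only).
intervention = [68, 74, 71, 79, 66, 73, 77, 70]
control = [62, 70, 65, 69, 64, 71, 63, 67]

# Steps 3-4: the research hypothesis is that the intervention raises
# scores; the null hypothesis is that the group means do not differ.
# Step 6: test the null hypothesis on the data gathered.
t_stat, p_value = stats.ttest_ind(intervention, control)

# Step 7 would then revise or support the original theory in the
# light of this result.
if p_value < 0.05:
    print(f"Reject the null hypothesis (t={t_stat:.2f}, p={p_value:.3f})")
else:
    print(f"Retain the null hypothesis (t={t_stat:.2f}, p={p_value:.3f})")
```

The sketch deliberately stops short of steps 1, 2 and 7: the theory and the research problem live outside the computation, which is precisely the point of the distinction drawn in the next paragraph.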

Several commentators set out some of the differences between evaluation and research. For example, Smith and Glass (1987) offer eight main differences:

1 The intents and purposes of the investigation The researcher wants to advance the frontiers of knowledge of phenomena, to contribute to theory and to be able to make generalizations; the evaluator is less interested in contributing to theory or to a general body of knowledge. Evaluation is more parochial than universal (pp. 33–4).
2 The scope of the investigation Evaluation studies tend to be more comprehensive than research in the number and variety of aspects of a programme that are being studied (p. 34).
3 Values in the investigation Research aspires to value neutrality; evaluations must represent multiple sets of values and include data on these values.
4 The origins of the study Research has its origins and motivation in the researcher's curiosity and desire to know (p. 34). The researcher is answerable to colleagues and scientists (i.e. the research community), whereas the evaluator is answerable to the 'client'. The researcher is autonomous, whereas the evaluator is answerable to clients and stakeholders. The researcher is motivated by a search for knowledge, the evaluator by the need to solve problems, allocate resources and make decisions. Research studies are public, evaluations are for a restricted audience.
5 The uses of the study Research is used to further knowledge, evaluations are used to inform decisions.
6 The timeliness of the study Evaluations must be timely, research need not be. Evaluators' time scales are given, researchers' time scales need not be.
7 Criteria for judging the study Evaluations are judged by the criteria of utility and credibility; research is judged methodologically and by the contribution that it makes to the field (i.e. internal and external validity).
8 The agendas of the study An evaluator's agenda is given, a researcher's agenda is her own.

Norris (1990) reports an earlier piece of work by Glass and Worthen in which they identified important differences between evaluation and research:

• The motivation of the enquirer Research is pursued largely to satisfy curiosity; evaluation is undertaken to contribute to the solution of a problem.
• The objectives of the search Research and evaluation seek different ends. Research seeks conclusions; evaluation leads to decisions.
• Laws versus description Research is the quest for laws (nomothetic); evaluation merely seeks to describe a particular thing (idiographic).
• The role of explanation Proper and useful evaluation can be conducted without producing an explanation of why the product or project is good or bad, or of how it operates to produce its effects.
• The autonomy of the inquiry Evaluation is undertaken at the behest of a client, while researchers set their own problems.
• Properties of the phenomena that are assessed Evaluation seeks to assess social utility directly; research may yield evidence of social utility, but often only indirectly.
• Universality of the phenomena studied Researchers work with constructs having a currency and scope of application that make the objects of evaluation seem parochial by comparison.
• Salience of the value question In evaluation, value questions are central and usually determine what information is sought.
• Investigative techniques While there may be legitimate differences between research and evaluation methods, there are far more similarities than differences with regard to techniques and procedures for judging validity.
• Criteria for assessing the activity The two most important criteria for judging the adequacy of research are internal and external validity; for evaluation they are utility and credibility.
• Disciplinary base The researcher can afford to pursue inquiry within one discipline; the evaluator cannot.

A clue to some of the differences between evaluation and research can be seen in the definition of evaluation. Most definitions of evaluation include reference to several key features: (1) answering specific, given questions; (2) gathering information; (3) making judgements; (4) taking decisions; (5) addressing the politics of a situation (Morrison, 1993:2). Morrison provides one definition of evaluation as 'the provision of information about specified issues upon which judgements are based and from which decisions for action are taken' (ibid., p. 2). This view echoes MacDonald (1987) in his comments that the evaluator:

is faced with competing interest groups, with divergent definitions of the situation and conflicting informational needs… He has to decide which decision-makers he will serve, what information will be of most use, when it is needed and how it can be obtained. I am suggesting that the resolution of these issues commits the evaluator to a political stance, an attitude to the government of education. No such commitment is required of the researcher. He stands outside the political process, and values his detachment from it. For him the production of new knowledge and its social use are separated. The evaluator is embroiled in the action, built into a political process which concerns the distribution of power, i.e. the allocation of resources and the determination of goals, roles and tasks… When evaluation data influences power relationships the evaluator is compelled to weigh carefully the consequences of his task specification… The researcher is free to select his questions, and to seek answers to them. The evaluator, on the other hand, must never fall into the error of answering questions which no one but he is asking.

(MacDonald, 1987:42)

MacDonald argues that evaluation is an inherently political enterprise. His much-used threefold typification of evaluations as autocratic, bureaucratic and democratic is premised on a political reading of evaluation (a view echoed by Chelinsky and Mulhauser, 1993, who refer to 'the inescapability of politics' (p. 54) in the world of evaluation). MacDonald (1987), noting that 'educational research is becoming more evaluative in character' (p. 101), argues for research to be kept out of politics and for evaluation to square up to the political issues at stake:

The danger therefore of conceptualizing evaluation as a branch of research is that evaluators become trapped in the restrictive tentacles of research respectability. Purity may be substituted for utility, trivial proofs for clumsy attempts to grasp complex significance. How much more productive it would be to define research as a branch of evaluation, a branch whose task it is to solve the technological problems encountered by the evaluator.

(MacDonald, 1987:43)

However, these typifications are very much 'ideal types'; the truth of the matter is far more blurred than these distinctions suggest. Two principal causes of this blurring lie in the funding and the politics of both evaluation and research. For example, the view of research as uncontaminated by everyday life is naive and simplistic; Norris (1990) argues that such an antiseptic view of research

ignores the social context of educational inquiry, the hierarchies of research communities, the reward structure of universities, the role of central government in supporting certain projects and not others, and the long-established relationships between social research and reform. It is, in short, an asocial and ahistorical account.

(Norris, 1990:99)

The quotation from Norris (in particular the first three phrases) has a pedigree that reaches back to Kuhn (1962); after that his analysis becomes much more contemporaneous. Norris is making an important comment on the politics of research funding and research utilization. Since the early 1980s one can detect a massive rise in 'categorical' funding of projects, i.e. defined, given projects (often set by government or research sponsors) for which bids have to be placed. This may seem unsurprising in the case of research grants from the Department for Education and Employment in the UK, which are deliberately policy-oriented, though one can also detect among projects granted by non-governmental organizations (e.g. the Economic and Social Research Council in the UK) a move towards sponsoring policy-oriented projects rather than the 'blue-skies' research mentioned earlier. Indeed Burgess (1993) argues that 'researchers are little more than contract workers…research in education must become policy relevant…research must come closer to the requirement of practitioners' (Burgess, 1993:1).

This view is reinforced by several articles in the collection edited by Anderson and Biddle (1991), which show that research and politics go together uncomfortably: researchers have different agendas and longer time scales than politicians, and try to address the complexity of situations, whereas politicians, anxious for short-term survival, want telescoped time scales, simple remedies and research that will be consonant with their political agendas. Indeed James (1993) argues that

the power of research-based evaluation to provide evidence on which rational decisions can be expected to be made is quite limited. Policy-makers will always find reasons to ignore, or be highly selective of, evaluation findings if the information does not support the particular political agenda operating at the time when decisions have to be made.

(James, 1993:135)

The politicization of research has resulted in funding bodies awarding research grants for categorical research that specify time scales and terms of reference. Burgess's view also points to the constraints under which research is undertaken: if it is not concerned with policy issues, then research tends not to be funded. One could support Burgess's view that research must have some impact on policy-making.

Not only is research becoming a political issue, but this extends to the use being made of evaluation studies. It was argued above that evaluations are designed to provide useful data to inform decision-making. However, as evaluation has become more politicized, so its uses (or non-uses) have become more politicized. Indeed Norris (1990) shows how politics frequently overrides evaluation or research evidence. He writes:

When the national extension of the TVEI was announced, neither the Leeds nor NFER team had reported and it had appeared that the decision to extend the initiative had been taken irrespective of any evaluation findings.

(Norris, 1990:135)

This echoes James (1993) where she writes:

The classic definition of the role of evaluation as providing information for decision-makers…is a fiction if this is taken to mean that policy-makers who commission evaluations are expected to make rational decisions based on the best (valid and reliable) information available to them.

(James, 1993:119)

Where evaluations are commissioned and have heavily political implications, Stronach and Morris (1994) argue, the response is that evaluations become more 'conformative'. 'Conformative evaluations', they argue, have several characteristics:

• Short-term, taking project goals as given, and supporting their realization.
• Ignoring the evaluation of longer-term learning outcomes, or anticipated economic/social consequences of the programme.
• Giving undue weight to the perceptions of programme participants who are responsible for the successful development and implementation of the programme; as a result, tending to 'over-report' change.
• Neglecting and 'under-reporting' the views of classroom practitioners and programme critics.
• Adopting an atheoretical approach, and generally regarding the aggregation of opinion as the determination of overall significance.
• Involving a tight contractual relationship with the programme sponsors that either disbars public reporting, or encourages self-censorship in order to protect future funding prospects.
• Undertaking various forms of implicit advocacy for the programme in its reporting style.
• Creating and reinforcing a professional schizophrenia in the research and evaluation community, whereby individuals come to hold divergent public and private opinions, or offer criticisms in general rather than in particular, or quietly develop 'academic' critiques which are at variance with their contractual evaluation activities, alternating between 'critical' and 'conformative' selves.

The argument so far has been confined to large-scale projects that are influenced by, and may or may not influence, political decision-making. However, the argument need not remain there. Morrison (1993), for example, indicates how evaluations might influence the 'micro-politics of the school'. Hoyle (1986) asks whether evaluation data are used to bring resources into, or take resources out of, a department or faculty. The issue does not relate only to evaluations, for school-based research, far from the emancipatory claims made for it by action researchers (e.g. Carr and Kemmis, 1986; Grundy, 1987), is often concerned more with finding the most successful ways of organizing, planning, teaching and assessing a given agenda than with setting agendas and pursuing one's own research questions. This is problem-solving rather than problem-setting. That evaluation and research are being drawn together by politics at both macro- and micro-levels is evidence of a growing interventionism by politics into education, thus reinforcing the hegemony of the government in power. Several points have been made here:

• there is considerable overlap between evaluation and research;
• there are some conceptual differences between evaluation and research, though, in practice, there is considerable blurring of the edges of the differences between the two;
• the funding and control of research and research agendas reflect the persuasions of political decision-makers;
• evaluative research has increased in response to categorical funding of research projects;
• the attention being given to, and utilization of, evaluation varies according to the consonance between the findings and their political attractiveness to political decision-makers.

In this sense the views expressed earlier by MacDonald are now little more than an historical relic; there is very considerable blurring of the edges between evaluation and research because of the political intrusion into, and use of, these two types of study. One response to this can be seen in Burgess's (1993) view that a researcher needs to be able to meet the sponsor's requirements for evaluation whilst also generating research data (engaging the issues of the need to negotiate ownership of the data and intellectual property rights).

Research, politics and policy-making

The preceding discussion has suggested that there is an inescapable political dimension to educational research, in both the macro- and micro-political senses. In the macro-political sense this manifests itself in funding arrangements, where awards are made provided that the research is 'policy-related' (Burgess, 1993)—guiding policy decisions, improving quality in areas of concern identified by policy-makers, facilitating the implementation of policy decisions, evaluating the effects of the implementation of policy. Burgess notes a shift here from a situation where the researcher specifies the topic of research towards one where the sponsor specifies the focus of research. The issue of sponsoring research reaches beyond simply commissioning research towards the dissemination (or not) of research—who will receive or have access to the findings, and how the findings will be used and reported. This, in turn, raises the fundamental issue of who owns and controls data, and who controls the release of research findings. Unfavourable reports might be withheld for a time, suppressed or selectively released! Research can be brought into the service of wider educational purposes—the politics of a local education authority, or indeed the politics of government agencies.

On a micro-scale, Morrison (1993) suggests that research and evaluation are not politically innocent because they involve people. Research is brought into funding decisions in institutions—for example to provide money or to withhold it, to promote policies or to curtail them, to promote people or to reduce their status (Usher and Scott, 1996:177). Micro-politics, Usher and Scott argue, influence the commissioning of research, the kind of fieldwork and field relations that are possible, funding issues, and the control of dissemination. Morrison suggests that this is particularly the case in evaluative research, where an evaluation might influence prestige, status, promotion, credibility, or funding. For example, in a school a negative evaluation of one area of the curriculum might attract more funding into that department, or it might have the effect of closing the department down, with the loss of staff.

Though research and politics intertwine, the relationships between educational research, politics and policy-making are complex because research designs strive to address a complex social reality (Anderson and Biddle, 1991); a piece of research does not feed simplistically or directly into a specific piece of policy-making. Rather, research generates a range of different types of knowledge—concepts, propositions, explanations, theories, strategies, evidence, methodologies (Caplan, 1991). These feed subtly and often indirectly into the decision-making process, providing, for example, direct inputs, general guidance, a scientific gloss, orienting perspectives, generalizations and new insights. Basic and applied research have significant parts to play in this process.

The degree of influence exerted by research depends on careful dissemination; too little and its message is ignored, too much and data overload confounds decision-makers and makes them cynical—the syndrome of the boy who cried wolf (Knott and Wildavsky, 1991). Hence researchers must give more care to utilization by policy-makers (Weiss, 1991a), reduce jargon, provide summaries, and improve links between the two cultures of researchers and policy-makers (Cook, 1991) and, further, to the educational community. Researchers must cultivate ways of influencing policy, particularly when policy-makers can simply ignore research findings, commission their own research (Cohen and Garet, 1991) or underfund research into social problems (Coleman, 1991; Thomas, 1991). Researchers must recognize their links with the power groups who decide policy. Research utilization takes many forms depending on its location in the process of policy-making, e.g. in research and development, problem-solving, interactive and tactical models (Weiss, 1991b). Researchers will have to judge the most appropriate forms of utilization of their research (Alkin, Daillak and White, 1991).

The impact of research on policy-making depends on its degree of consonance with the political agendas of governments (Thomas, 1991) and of policy-makers anxious for their own political survival (Cook, 1991) and the promotion of their social programmes. Research is used if it is politically acceptable. That the impact of research on policy is intensely and inescapably political is a truism (Selleck, 1991; Kamin, 1991; Horowitz and Katz, 1991; Wineburg, 1991). Research too easily becomes simply an 'affirmatory text' which 'exonerates the system' (Wineburg, 1991) and is used by those who seek to hear in it only echoes of their own voices and wishes (Kogan and Atkin, 1991).

There is a significant tension between researchers and policy-makers. The two parties have different, and often conflicting, interests, agendas, audiences, time scales, terminology, and concern for topicality (Levin, 1991). These have huge implications for research styles. Policy-makers, anxious for the quick fix of superficial facts, short-term solutions and simple remedies for complex and generalized social problems (Cartwright, 1991; Cook, 1991)—the Simple Impact model (Biddle and Anderson, 1991; Weiss, 1991a, 1991b)—find positivist methodologies attractive, often debasing the data through illegitimate summary. Moreover, policy-makers find much research uncertain in its effects (Kerlinger, 1991; Cohen and Garet, 1991), dealing in a Weltanschauung rather than specifics, and being too complex in its designs and of limited applicability (Finn, 1991). This, reply the researchers, misrepresents the nature of their work (Shavelson and Berliner, 1991) and belies the complex reality which they are trying to investigate (Blalock, 1991). Capturing social complexity and serving political utility can run counter to each other.

The issue of the connection between research and politics—power and decision-making—is complex. On another dimension, the notion that research is inherently a political act because it is part of the political processes of society has not been lost on researchers. Usher and Scott (1996:176) argue that positivist research has allowed a traditional conception of society to be preserved relatively unchallenged—that of the white, male, middle-class researcher—to the relative exclusion of 'others' as legitimate knowers. That this reaches into epistemological debate is evidenced in the issues of who defines the 'traditions of knowledge' and the disciplines of knowledge; the social construction of knowledge has to take into account the differential power of groups to define what is worthwhile research knowledge, what constitutes acceptable focuses and methodologies of research, and how the findings will be used.

Methods and methodology

We return to our principal concern: methods and methodology in educational research. By methods, we mean that range of approaches used in educational research to gather data which are to be used as a basis for inference and interpretation, for explanation and prediction. Traditionally, the word refers to those techniques associated with the positivistic model—eliciting responses to predetermined questions, recording measurements, describing phenomena and performing experiments. For our purposes, we will extend the meaning to include not only the methods of normative research but also those associated with interpretive paradigms—participant observation, role-playing, non-directive interviewing, episodes and accounts. Although methods may also be taken to include the more specific features of the scientific enterprise such as forming concepts and hypotheses, building models and theories, and sampling procedures, we will limit ourselves principally to the more general techniques which researchers use.

If methods refer to techniques and procedures used in the process of data-gathering, the aim of methodology then is, in Kaplan's words:

to describe and analyze these methods, throwing light on their limitations and resources, clarifying their presuppositions and consequences, relating their potentialities to the twilight zone at the frontiers of knowledge. It is to venture generalizations from the success of particular techniques, suggesting new applications, and to unfold the specific bearings of logical and metaphysical principles on concrete problems, suggesting new formulations.

(Kaplan, 1973)

In summary, he suggests, the aim of methodology is to help us to understand, in the broadest possible terms, not the products of scientific inquiry but the process itself.

We, for our part, will attempt to present normative and interpretive perspectives in a complementary light and will try to lessen the tension that is sometimes generated between them. Merton and Kendall12 express the same sentiment when they say, 'Social scientists have come to abandon the spurious choice between qualitative and quantitative data: they are concerned rather with that combination of both which makes use of the most valuable features of each. The problem becomes one of determining at which points they should adopt the one, and at which the other, approach' (Merton and Kendall, 1946).

Our earlier remarks on the nature of research may best be summarized by quoting Mouly's definitive statement on the subject. He writes, 'Research is best conceived as the process of arriving at dependable solutions to problems through the planned and systematic collection, analysis, and interpretation of data. It is a most important tool for advancing knowledge, for promoting progress, and for enabling man [sic] to relate more effectively to his environment, to accomplish his purposes, and to resolve his conflicts' (Mouly, 1978).

The term research itself may take on a range of meanings and thereby be legitimately applied to a variety of contexts, from, say, an investigation into the techniques of Dutch painters of the seventeenth century to the problem of finding more efficient means of improving traffic flow in major city centres. For our purposes, however, we will restrict its usage to those activities and undertakings aimed at developing a science of behaviour, the word science itself implying both normative and interpretive perspectives. Accordingly, when we speak of social research, we have in mind the systematic and scholarly application of the principles of a science of behaviour to the problems of people within their social contexts; and when we use the term educational research, we likewise have in mind the application of these same principles to the problems of teaching and learning within the formal educational framework and to the clarification of issues having direct or indirect bearing on these concepts.

The particular value of scientific research in education is that it will enable educators to develop the kind of sound knowledge base that characterizes other professions and disciplines, and one that will ensure education a maturity and sense of progression it at present lacks.


Part two

Planning educational research

The planning of educational research is not an arbitrary matter, the research itself being an inescapably ethical enterprise. The research community and those using the findings of research have a right to expect that research be conducted rigorously, scrupulously and in an ethically defensible manner. All this necessitates careful planning, with thought being given particularly to the consequences of the research. It is no accident, therefore, that we place the chapter on ethical issues at an early point in the book, for such matters must be a touchstone of acceptable practice. In addition, the parameters of the research need to be considered and made explicit by researchers at this initial stage; subsequent chapters in this part indicate how this might be achieved. A new chapter on planning educational research is intended to give novice researchers an overview of, and introduction to, planning issues, the intention being deliberately practical, for educational research has to work!

In planning research and identifying its parameters, we need to consider the issues of sampling, reliability, and validity at the very outset. We regard these factors as so important that in this edition we have included new chapters to address them. All are complex in nature, for there is no singular or exclusive version of reliability, validity, or what constitutes an acceptable sample. However, we believe that it is essential for planners to embark on research 'with their eyes open' so that they can consider what avenues of approach are open to them. What follows in this part of the book sets out to do just this: to make clear the range of possibilities and interpretations in these respects so that the eventual selection of sampling procedures and versions of reliability and validity will be made on the basis of fitness for purpose rather than caprice. The intention is that by the end of this part of the book researchers, however inexperienced, will be able to make informed decisions about the parameters and conduct of the research.
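
As a small taster of the practical decisions at stake, the sketch below shows two computations of the kind the chapters in this part lead up to: drawing a simple random sample and estimating the internal-consistency reliability of a test with Cronbach's alpha. The data and function names are our own illustrative inventions, not material drawn from the chapters themselves.

```python
import random
import statistics

def simple_random_sample(population, n, seed=1):
    """Draw a simple random sample of size n, without replacement."""
    return random.Random(seed).sample(population, n)

def cronbach_alpha(item_scores):
    """Cronbach's alpha for internal-consistency reliability.

    item_scores: one list of respondent scores per test item.
    alpha = k/(k-1) * (1 - sum of item variances / variance of totals)
    """
    k = len(item_scores)
    item_vars = sum(statistics.variance(item) for item in item_scores)
    totals = [sum(scores) for scores in zip(*item_scores)]
    return (k / (k - 1)) * (1 - item_vars / statistics.variance(totals))

# Illustrative use: sample 5 pupils from a roll of 200, then estimate
# the reliability of a three-item attitude scale for four respondents.
pupils = list(range(1, 201))
print(simple_random_sample(pupils, 5))
print(round(cronbach_alpha([[4, 5, 3, 4], [3, 4, 3, 5], [4, 4, 2, 5]]), 2))
```

Neither computation is difficult; the difficulty, as the following chapters argue, lies in choosing the sampling procedure and the version of reliability that fit the purpose of the research.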

2 The ethics of educational and social research

Introduction

Developments in the field of social science in recent years have been accompanied by a growing awareness of the attendant moral issues implicit in the work of social researchers, and of their need to meet their obligations with respect to those involved in, or affected by, their investigations. This awareness, focusing chiefly, but by no means exclusively, on the subject matter and methods of research in so far as they affect the participants, is reflected in the growth of relevant literature and in the appearance of regulatory codes of research practice formulated by various agencies and professional bodies.1 Ethical concerns encountered in educational research in particular can be extremely complex and subtle and can frequently place researchers in moral predicaments which may appear quite unresolvable. One such dilemma is that which requires researchers to strike a balance between the demands placed on them as professional scientists in pursuit of truth, and their subjects' rights and values potentially threatened by the research. This is known as the 'costs/benefits ratio', the essence of which is outlined by Frankfort-Nachmias and Nachmias (1992) in Box 2.1; it is a concept we return to later in the chapter when we consider how ethical dilemmas arise from various sources of tension. It is a particularly thorny dilemma because, as Aronson et al. (1990) note, it cannot be shrugged off either by making pious statements about the inviolability of human dignity or by pledging glib allegiance to the cause of science. Most standard textbooks on ethics in social research would, in this case, advise researchers to proceed ethically without threatening the validity of the research endeavour in so far as it is possible to do so. Conventional wisdom of this kind is admirable in its way, but the problems for researchers can multiply surprisingly when the principle comes to be applied: when they move from the general to the particular, from the abstract to the concrete. Each research undertaking is different, and investigators may find that on one occasion their work proceeds smoothly without the Hydra-headed creature of ethical concern breaking surface. At another time, they may come to realize that, suddenly and without prior indication, they are in the middle of an ethical minefield, and that the residual problems of a technical and administrative nature that one expects as a matter of course when pursuing educational research are compounded by unforeseen moral questions.

Ethical issues may stem from the kinds of problems investigated by social scientists and the methods they use to obtain valid and reliable data. In theory at least, this means that each stage in the research sequence may be a potential source of ethical problems. Thus, they may arise from the nature of the research project itself (ethnic differences in intelligence, for example); the context for the research (a remand home); the procedures to be adopted (producing high levels of anxiety); methods of data collection (covert observation); the nature of the participants (emotionally disturbed adolescents); the type of data collected (highly personal information of a sensitive kind); and what is to be done with the data (publishing in a manner that causes the participants embarrassment).

Our initial observations would seem to indicate that the subject of ethics in social research is potentially a wide-ranging and challenging one. It is fitting, therefore, if in this chapter we present a conspectus of the main issues that may confront workers in the field. Although what follows offers advice and guidance in liberal amounts, drawn from the work of seasoned researchers and from a range of empirical studies, we do not intend to be unduly prescriptive or proscriptive. As we suggested in our opening comments, each research undertaking is an event sui generis, and the conduct of researchers cannot be, indeed should not be, forced into a procrustean system of ethics. When it comes to the resolution of a specific moral problem, each situation frequently offers a spectrum of possibilities. In what follows, we have indulged in a certain amount of repetition without, we hope, being repetitious. This has advantages, since some of the ideas discussed are multi-faceted and their reappearance in different contexts may assist greater understanding.

From what we have said so far, we hope that we will be seen as informants rather than arbiters, and that our counsels will be perceived as markers and signposts in what for many readers will be a largely unexplored terra incognita. It is in this spirit that we review seriatim the problems of access to the research setting; the nature of ethics in social research generally; sources of tension in the ethical debate; problems and dilemmas confronting the researcher, including matters of privacy, anonymity, confidentiality, betrayal and deception; ethical problems endemic in particular research methods; ethics and teacher evaluation; regulations affecting research; and a final word on personal codes of practice. Before this, however, we examine another fundamental concept which, along with the costs/benefits ratio, contributes to the bedrock of ethical procedure—that of informed consent.

Informed consent

Much social research necessitates obtaining the consent and co-operation of subjects who are to assist in investigations and of significant others in the institutions or organizations providing the research facilities. In some cultures, informed consent is absolutely essential whenever participants are exposed to substantial risks or asked to forfeit personal rights. Writing of the situation in the USA, for instance, Frankfort-Nachmias and Nachmias say:

When research participants are to be exposed to pain, physical or emotional injury, invasions of privacy, or physical or psychological stress, or when they are asked to surrender their autonomy temporarily (as, for example, in drug research), informed consent must be fully guaranteed. Participants should know that their involvement is voluntary at all times, and they should receive a thorough explanation beforehand of the benefits, rights, risks, and dangers involved as a consequence of their participation in the research project.

(Frankfort-Nachmias and Nachmias, 1992)

Box 2.1
The costs/benefits ratio

Source: Adapted from Frankfort-Nachmias and Nachmias, 1992

The costs/benefits ratio is a fundamental concept expressing the primary ethical dilemma in social research. In planning their proposed research, social scientists have to consider the likely social benefits of their endeavours against the personal costs to the individuals taking part. Possible benefits accruing from the research may take the form of crucial findings leading to significant advances in theoretical and applied knowledge. Failure to do the research may cost society the advantages of the research findings and ultimately the opportunity to improve the human condition. The costs to participants may include affronts to dignity, embarrassment, loss of trust in social relations, loss of autonomy and self-determination, and lowered self-esteem. On the other hand, the benefits to participants could take the form of satisfaction in having made a contribution to science and a greater personal understanding of the research area under scrutiny. The process of balancing benefits against possible costs is chiefly a subjective one and not at all easy. There are few or no absolutes, and researchers have to make decisions about research content and procedures in accordance with professional and personal values. This costs/benefits ratio is the basic dilemma residual in a great deal of social research.


The principle of informed consent arises from the subject's right to freedom and self-determination. Being free is a condition of living in a democracy, and when restrictions and limitations are placed on that freedom they must be justified and consented to, even in research proceedings. Consent thus protects and respects the right of self-determination and places some of the responsibility on the participant should anything go wrong in the research. Another aspect of the right to self-determination is that the subject has the right to refuse to take part, or to withdraw once the research has begun (see Frankfort-Nachmias and Nachmias, 1992). Thus informed consent implies informed refusal.

Informed consent has been defined by Diener and Crandall as 'the procedures in which individuals choose whether to participate in an investigation after being informed of facts that would be likely to influence their decisions' (Diener and Crandall, 1978). This definition involves four elements: competence, voluntarism, full information and comprehension. 'Competence' implies that responsible, mature individuals will make correct decisions if they are given the relevant information. It is incumbent on researchers to ensure they do not engage individuals incapable of making such decisions, either because of immaturity or some form of psychological impairment. 'Voluntarism' entails applying the principle of informed consent and thus ensuring that participants freely choose to take part (or not) in the research; it guarantees that exposure to risks is undertaken knowingly and voluntarily. This element can be problematical, especially in the field of medical research where unknowing patients are used as guinea-pigs. 'Full information' implies that consent is fully informed, though in practice it is often impossible for researchers to inform subjects on everything, e.g. on the statistical treatment of data, and, as we shall see below, on those occasions when the researchers themselves do not know everything about the investigation. In such circumstances, the strategy of reasonably informed consent has to be applied. Box 2.2 illustrates a set of guidelines used in the USA that are based on the idea of reasonably informed consent.2 'Comprehension' refers to the fact that participants fully understand the nature of the research project, even when procedures are complicated and entail risks. Suggestions have been made to ensure that subjects fully comprehend the situation they are putting themselves into, e.g. by using highly educated subjects, by engaging a consultant to explain difficulties, or by building into the research scheme a time lag between the request for participation and decision time. If these four elements are present, researchers can be assured that subjects' rights will have been given appropriate consideration. As Frankfort-Nachmias and Nachmias note, however:

The principle of informed consent should not…be made an absolute requirement of all social science research. Although usually desirable, it is not absolutely necessary to studies where no danger or risk is involved. The more serious the risk to research participants, the greater becomes the obligation to obtain informed consent.

(Frankfort-Nachmias and Nachmias, 1992)

Box 2.2
Guidelines for reasonably informed consent

1 A fair explanation of the procedures to be followed and their purposes.
2 A description of the attendant discomforts and risks reasonably to be expected.
3 A description of the benefits reasonably to be expected.
4 A disclosure of appropriate alternative procedures that might be advantageous to the participants.
5 An offer to answer any inquiries concerning the procedures.
6 An instruction that the person is free to withdraw consent and to discontinue participation in the project at any time without prejudice to the participant.

Source: United States Department of Health, Education and Welfare, Institutional Guide to DHEW Policy, 1971

It must also be remembered that there are some research methods where it is impossible to seek informed consent. Covert observation, for example, as used in Patrick's study of a Glasgow gang (Chapter 9), or experimental techniques involving deception, as in Milgram's Obedience-to-authority experiments (Chapter 21), would, by their very nature, rule out the option. And, of course, there may be occasions when problems arise even though consent has been obtained. Burgess (1989a), for example, cites his own research in which teachers had been informed that research was taking place but in which it was not possible to specify exactly what data would be collected or how they would be used. It could be said, in this particular case, that individuals were not fully informed, that consent had not been obtained, and that privacy had been violated. As a general rule, however, informed consent is an important principle to abide by, and the fact that moral philosophers have joined in the debate engendered by the concept is testimony to the seriousness with which it is viewed (Soble, 1978). It is this principle that will form the basis, so to speak, of an implicit contractual relationship between the researcher and the researched, and will serve as a foundation on which subsequent ethical considerations can be structured.

From our remarks and citations so far on this subject of informed consent, we may appear to be assuming relationships between peers—researcher and teachers, for example, or research professor and postgraduate students; and this assumption would seem to underpin many of the discussions of an ethical nature in the research literature generally. Readers will be aware, however, that much educational research involves children, who cannot be regarded as being on equal terms with the researcher, and it is important to keep this in mind at all stages in the research process, including the point where informed consent is sought. In this connection we refer to the important work of Fine and Sandstrom (1988), whose ethnographic and participant observational studies of children and young people focus, among other issues, on this asymmetry with respect to the problems of obtaining informed consent from their young subjects and explaining the research in a comprehensible fashion. As a guiding principle they advise that while it is desirable to lessen the power differential between children and adult researchers, the difference will remain, and its elimination may be ethically inadvisable.

It may be of some help to readers if we refer briefly to other aspects of the problem of informed consent (or refusal) in relation to young, or very young, children. Seeking informed consent with regard to minors involves two stages. First, researchers consult and seek permission from those adults responsible for the prospective subjects; and second, they approach the young people themselves. The adults in question will be, for example, parents, teachers, tutors, psychiatrists, youth leaders or team coaches, depending on the research context. The point of the research will be explained, questions invited and permission to proceed to the next stage sought. Objections, for whatever reason, will be duly respected. Obtaining approval from the relevant adults may be more difficult than in the case of the children, but at a time of increasing sensitivity to children's welfare it is vital that researchers secure such approval. It may be useful if, in seeking the consent of children, researchers bear in mind the provisory comments below.

While seeking children’s permission and co-operation is an automatic part of quantitativeresearch (a child cannot unknowingly completea simple questionnaire), the importance of in-formed consent in qualitative research is not al-ways recognized. Speaking of participant obser-vation, for example, Fine and Sandstrom say thatresearchers must provide a credible and mean-ingful explanation of their research intentions,especially in situations where they have littleauthority, and that children must be given a realand legitimate opportunity to say that they donot want to take part. The authors advise thatwhere subjects do refuse, they should not bequestioned, their actions should not be recorded,and they should not be included in any book orarticle (even under a pseudonym). Where theyform part of a group, they may be included aspart of a collectivity. Fine and Sandstromconsider that such rejections are sometimes a

Chapte

r 253

result of mistrust of the researcher. They sug-gest that at a later date, when the researcherhas been able to establish greater rapport withthe group, those who refused initially may beapproached again, perhaps in private.

Two particular groups of children require special mention: very young children, and those not capable of making a decision. Researchers intending to work with pre-school or nursery children may dismiss the idea of seeking informed consent from their would-be subjects because of their age, but Fine and Sandstrom would recommend otherwise. Even though such children would not understand what research was, the authors advise that the children be given some explanation. For example, one to the effect that an adult will be watching and playing with them might be sufficient to provide a measure of informed consent consistent with the children's understanding. As Fine and Sandstrom comment:

Our feeling is that children should be told as much as possible, even if some of them cannot understand the full explanation. Their age should not diminish their rights, although their level of understanding must be taken into account in the explanations that are shared with them.

(Fine and Sandstrom, 1988)

The second group consists of those children who are to be used in a research project and who may not meet Diener and Crandall's (1978) criterion of 'competence' (a group of psychologically impaired children, for example—the issue of 'advocacy' applies here). In such circumstances there may be LEA guidelines to follow. In the absence of these, the requirements of informed consent would be met by obtaining the permission of headteachers, who will be acting in loco parentis or who have had delegated to them the responsibility for providing informed consent by the parents.

Two final cautions: first, where an extreme form of research is planned, parents would have to be fully informed in advance and their consent obtained; and second, whatever the nature of the research and whoever is involved, should a child show signs of discomfort or stress, the research should be terminated immediately. For further discussion of the care that needs to be exercised in working with children we refer readers to Greig and Taylor (1998), Holmes (1998) and Graue and Walsh (1998).

Access and acceptance

The relevance of the principle of informed consent becomes apparent at the initial stage of the research project—that of access to the institution or organization where the research is to be conducted, and acceptance by those whose permission one needs before embarking on the task. We highlight this stage of access and acceptance in particular at this point because it offers the best opportunity for researchers to present their credentials as serious investigators and establish their own ethical position with respect to their proposed research.

Investigators cannot expect access to a nursery, school, college, or factory as a matter of right. They have to demonstrate that they are worthy, as researchers and human beings, of being accorded the facilities needed to carry out their investigations. The advice of Bell (1987) is particularly apposite in this connection:

Permission to carry out an investigation must always be sought at an early stage. As soon as you have an agreed project outline and have read enough to convince yourself that the topic is feasible, it is advisable to make a formal, written approach to the individuals and organization concerned, outlining your plans. Be honest. If you are carrying out an investigation in connection with a diploma or degree course, say that is what you are doing. If you feel the study will probably yield useful and/or interesting information, make a particular point of that fact—but be careful not to claim more than the investigation merits.

(Bell, 1987:42)

The first stage thus involves the gaining of official permission to undertake one's research in the target community. This will mean contacting, in person or in writing, an LEA official and/or the chairperson of the governors, if one is to work in a school, along with the headteacher or principal. At a later point, significant figures who will be responsible for, or assist in, the organization and administration of the research will also need to be contacted—the deputy head or senior teacher, for instance, and most certainly the classteacher if children are to be used in the research. Since the researcher's potential for intrusion and perhaps disruption is considerable, amicable relations with the classteacher in particular should be fostered as expeditiously as possible. If the investigation involves teachers as participants, propositions may have to be put to a full staff meeting and conditions negotiated. Where the research is to take place in another kind of institution—a youth club or detention centre, for example—the principle of approach will be similar, although the organizational structure will be different.

Achieving goodwill and co-operation is especially important where the proposed research extends over a period of time: days, perhaps, in the case of an ethnographic study; months (or perhaps years!) where longitudinal research is involved. Access does not present quite such a problem when, for example, a one-off survey requires respondents to give up half-an-hour of their time, or when a researcher is normally a member of the organization where the research is taking place (an insider), though in the latter case it is generally unwise to take co-operation for granted. Where research procedures are extensive and complicated, however, or where the design is developmental or longitudinal, or where researchers are not normally based in the target community, the problems of access are more involved and require greater preparation. Box 2.3 gives a flavour of the kinds of accessibility problems that can be experienced (Foster, 1989).

Box 2.3
Close encounters of a researcher kind

My first entry into a staffroom at the college was the occasion of some shuffling and shifting of books and chairs so that I could be given a comfortable seat whilst the tutor talked to me from a standing position. As time progressed my presence was almost taken for granted and later, when events threatened the security of the tutors, I was ignored. No one inquired as to whether they could assist me and my own inquiries were met with cursory answers and confused looks, followed by the immediate disappearance of the individuals concerned, bearing a pile of papers. I learned not to make too many inquiries. Unfortunately, when individuals feel insecure, when their world is threatened with change that is beyond their control, they are likely to respond in an unpredictable manner to persons within their midst whose role is unclear, and the role of the researcher is rarely understood by those not engaged in research.

Source: Foster, 1989

Having identified the official and significant figures whose permission must be sought, and before actually meeting them, researchers will need to clarify in their own minds the precise nature and scope of their research. In this respect researchers could, for instance, identify: the aims of the research; its practical applications, if any; the design, methods and procedures to be used; the nature and size of samples or groups; what tests are to be administered and how; what activities are to be observed; which subjects are to be interviewed; observational needs; the time involved; the degree of disruption envisaged; arrangements to guarantee confidentiality with respect to data (if this is necessary); the role of feedback and how findings can best be disseminated; the overall timetable within which the research is to be encompassed; and finally, whether assistance will be required in the organization and administration of the research. By such planning and foresight, both researchers and institutions will have a good idea of the demands likely to be made on both subjects (be they children or teachers) and organizations. It is also a good opportunity to anticipate and resolve likely problems, especially those of a practical kind. A long, complicated questionnaire, for example, may place undue demands on the comprehension skills and attention spans of a particular class of 13-year-olds; or a relatively inexperienced teacher could feel threatened by sustained research scrutiny. Once this kind of information has been sorted out and clarified, researchers will be in a strong position to discuss their proposed plans in an informed, open and frank manner (though not necessarily too open, as we shall see) and will thereby more readily gain permission, acceptance, and support. It must be remembered that hosts will have perceptions of researchers and their intentions, and that these need to be positive. Researchers can best influence such perceptions by presenting themselves as competent, trustworthy, and accommodating.
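
The list of points above amounts, in effect, to a structured proposal that a researcher can draw up before approaching gatekeepers. As a purely illustrative sketch (the class and field names below are our own, not a standard instrument), it might be organized like this:

```python
from dataclasses import dataclass, field

@dataclass
class AccessProposal:
    """Points to clarify before negotiating access (illustrative only)."""
    aims: str
    design_and_methods: str
    sample: str
    instruments: list[str] = field(default_factory=list)
    time_required: str = ""
    disruption_expected: str = ""
    confidentiality: str = ""
    feedback_and_dissemination: str = ""

    def brief(self) -> str:
        """Render a short summary suitable for an introductory letter."""
        lines = [f"Aims: {self.aims}",
                 f"Design and methods: {self.design_and_methods}",
                 f"Sample: {self.sample}"]
        if self.instruments:
            lines.append("Instruments: " + ", ".join(self.instruments))
        for label, value in [("Time required", self.time_required),
                             ("Likely disruption", self.disruption_expected),
                             ("Confidentiality", self.confidentiality),
                             ("Feedback", self.feedback_and_dissemination)]:
            if value:
                lines.append(f"{label}: {value}")
        return "\n".join(lines)

proposal = AccessProposal(
    aims="To explore pupils' attitudes to science",
    design_and_methods="Questionnaire survey with follow-up interviews",
    sample="Two Year 9 classes (about 60 pupils)",
    instruments=["attitude questionnaire", "interview schedule"],
    time_required="One lesson per class, plus six 15-minute interviews",
    confidentiality="No pupil or school will be named in any report")
print(proposal.brief())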

Once this preliminary information has been collected, researchers are duly prepared for the next stage: making actual contact in person, perhaps after an introductory letter, with appropriate people in the organization with a view to negotiating access. If the research is college-based, they will have the support of their college and course supervisors. Festinger and Katz (1966) consider that there is real economy in going to the very top of the organization or system in question to obtain assent and co-operation. This is particularly so where the structure is clearly hierarchical and where lower levels are always dependent on their superiors. They consider it likely that the nature of the research will be referred to the top of the organization sooner or later, and that there is a much better chance of a favourable decision if leaders are consulted at the outset. It may also be the case that heads will be more open-minded than those lower down, who, because of their insecurity, may be less co-operative. The authors also warn against using the easiest entrances into the organization when seeking permission. Researchers may perhaps seek to come in as allies of individuals or groups who have a special interest to exploit and who see research as a means to their ends. As Festinger and Katz put it, 'The researcher's aim should be to enter the situation in the common interests of all parties, and his findings should be equally available to all groups and individuals' (Festinger and Katz, 1966). Investigators should thus seek as broad a basis for their support as possible. Other potential problems may be circumvented by making use of accepted channels of communication in the institution or organization. In a school, for example, this may take the form of a staff forum. As Festinger and Katz say in this regard, 'If the information is limited to a single channel, the study may become identified with the interests associated with that channel' (Festinger and Katz, 1966).

Following contact, there will be a negotiation process. At this point researchers will give as much information about the aims, nature and procedures of the research as is appropriate. This is very important: information that may prejudice the results of the investigation should be withheld. Aronson and Carlsmith (1969), for instance, note that one cannot imagine researchers who are studying the effects of group pressure on conformity announcing their intentions in advance. On the other hand, researchers may find themselves on dangerous ground if they go to the extreme of maintaining a ‘conspiracy of silence’, because, as Festinger and Katz note, such a stance is hard to keep up if the research is extensive and lasts over several days or weeks. As they say in this respect, ‘An attempt to preserve secrecy merely increases the spread and wildness of the rumours’ (Festinger and Katz, 1966). If researchers do not want their potential hosts and/or subjects to know too much about specific hypotheses and objectives, then a simple way out is to present an explicit statement at a fairly general level with one or two examples of items that are not crucial to the study as a whole. As most research entails some risks, especially where field studies are concerned, and as the presence of an observer scrutinizing various aspects of community or school life may not be relished by all in the group, investigators must at all times manifest a sensitive appreciation of their hosts’ and subjects’ position and reassure anyone who feels threatened by the work. Such reassurance could take the form of a statement of conditions and guarantees given by researchers at this negotiation stage. By way of illustration, Box 2.4 contains conditions laid down by an Open University student for a school-based research project.


We conclude this section by reminding beginning researchers in particular that there will be times when ethical considerations will pervade much of their work and that these will be no more so than at the stage of access and acceptance, where appropriateness of topic, design, methods, guarantees of confidentiality, analysis and dissemination of findings must be negotiated with relative openness, sensitivity, honesty, accuracy and scientific impartiality. As we have indicated earlier, there can be no rigid rules in this context. It will be a case of formulating and abiding by one’s own situational ethics. These will determine what is acceptable and what is not acceptable. As Hitchcock and Hughes (1989) say in this regard:

Individual circumstances must be the final arbiter. As far as possible it is better if the teacher can discuss the research with all parties involved. On other occasions it may be better for the teacher to develop a pilot study and uncover some of the problems in advance of the research proper. If it appears that the research is going to come into conflict with aspects of school policy, management styles, or individual personalities, it is better to confront the issues head on, consult relevant parties, and make rearrangements in the research design where possible or necessary.

(Hitchcock and Hughes, 1989:198)

Where a pilot study is not feasible it may be possible to arrange one or two scouting forays to assess possible problems and risks. By way of summary, we refer the reader to Box 2.5.

Ethics of social research

Social scientists generally have a responsibility not only to their profession in its search for knowledge and quest for truth, but also to the subjects they depend on for their work. Whatever the specific nature of their work, social researchers must take into account the effects of the research on participants, and act in such a way as to preserve their dignity as human beings. Such is ethical behaviour. Indeed, ethics has been defined as:

a matter of principled sensitivity to the rights of others. Being ethical limits the choices we can make in the pursuit of truth. Ethics say that while truth is good, respect for human dignity is better, even if, in the extreme case, the respect of human nature leaves one ignorant of human nature.

(Cavan, 1977:810)

Kimmel (1988) has pointed out that when attempting to describe ethical issues, it is important to recognize that the distinction between ethical and unethical behaviour is not dichotomous, even though the normative code of prescribed (‘ought’) and proscribed (‘ought not’) behaviours, as represented by the ethical standards of a profession, seems to imply that it is. Judgements about whether behaviour conflicts with professional values lie on a continuum that ranges from the clearly ethical to the clearly unethical. The point to be borne in mind is that ethical principles are not absolute, generally speaking (though some maintain that they are, as we shall see shortly), but must be interpreted in the light of the research context and of other values at stake.

It is perhaps worthwhile at this point to pause

Box 2.4 Conditions and guarantees proffered for a school-based research project

Source Bell, 1987

1 All participants will be offered the opportunity to remain anonymous.
2 All information will be treated with the strictest confidentiality.
3 Interviewees will have the opportunity to verify statements when the research is in draft form.
4 Participants will receive a copy of the final report.
5 The research is to be assessed by the Open University for examination purposes only, but should the question of publication arise at a later date, permission will be sought from the participants.
6 The research will attempt to explore educational management in practice. It is hoped the final report may be of benefit to the school and to those who take part.


and remind ourselves that a considerable amount of research does not cause pain or indignity to the participants, that self-esteem is not necessarily undermined nor confidences betrayed, and that the social scientist may only infrequently be confronted with an unresolvable ethical dilemma. Where research is ethically sensitive, however, many factors may need to be taken into account and these may vary from situation to situation. By way of example, we identify a selection of such variables, the prior consideration of which will perhaps reduce the number of problems subsequently faced by the researcher. Thus, the age of those being researched; whether the subject matter of the research is a sensitive area; whether the aims of the research are in any way subversive (vis-à-vis subjects, teachers, or institution); the extent to which the researcher and researched can participate and collaborate in planning the research; how the data are to be processed, interpreted, and used (and Laing (1967:53) offers an interesting, cautionary view of data where he writes that they are ‘not so much given as taken out of a constantly elusive matrix of happenings. We should speak of capta rather than data’); the dissemination of results; and guarantees of confidentiality are just some of the parameters that can

Box 2.5 Negotiating access checklist

Source Adapted from Bell, 1991

1 Clear official channels by formally requesting permission to carry out your investigation as soon as you have an agreed project outline.
Some LEAs insist that requests to carry out research are channelled through the LEA office. Check what is required in your area.
2 Speak to the people who will be asked to co-operate.
Getting the LEA or head’s permission is one thing, but you need to have the support of the people who will be asked to give interviews or complete questionnaires.
3 Submit the project outline to the head, if you are carrying out a study in your or another educational institution.
List people you would like to interview or to whom you wish to send questionnaires and state conditions under which the study will be conducted.
4 Decide what you mean by anonymity and confidentiality.
Remember that if you are writing about ‘the head of English’ and there is only one head of English in the school, the person concerned is immediately recognizable.
5 Decide whether participants will receive a copy of the report and/or see drafts or interview transcripts.
There are cost and time implications. Think carefully before you make promises.
6 Inform participants what is to be done with the information they provide.
Your eyes and those of the examiner only? Shown to the head, the LEA etc.?
7 Prepare an outline of intentions and conditions under which the study will be carried out to hand to the participants.
Even if you explain the purpose of the study, the conditions and the guarantees, participants may forget.
8 Be honest about the purpose of the study and about the conditions of the research.
If you say an interview will last ten minutes, you will break faith if it lasts an hour. If you are conducting the investigation as part of a degree or diploma course, say so.
9 Remember that people who agree to help are doing you a favour.
Make sure you return papers and books in good order and on time. Letters of thanks should be sent, no matter how busy you are.
10 Never assume ‘it will be all right’. Negotiating access is an important stage in your investigation.
If you are an inside researcher, you will have to live with your mistakes, so take care.


form the basis of, to use Aronson and Carlsmith’s phrase, ‘a specification of democratic ethics’. Readers will no doubt be in a position to develop their own schema from the ideas and concepts expressed in this chapter as well as from their own widening experience as researchers.

Sources of tension

We noted earlier that the question of ethics in research is a highly complex subject. This complexity stems from numerous sources of tension. We consider two of the most important. The first, as expressed by Aronson and Carlsmith (1969), is the tension that exists between two sets of related values held by society: a belief in the value of free scientific inquiry in pursuit of truth and knowledge; and a belief in the dignity of individuals and their right to those considerations that follow from it. It is this polarity that we referred to earlier as the costs/benefits ratio and by which ‘greater consideration must be given to the risks to physical, psychological, humane, proprietary and cultural values than to the potential contribution of research to knowledge’ (Social Sciences and Humanities Research Council of Canada, 1981), i.e. the issue of ‘non-maleficence’ (where no harm befalls the subjects). When researchers are confronted with this dilemma (and it is likely to occur much less in education than in social psychology or medicine), it is generally considered that they resolve it in a manner that avoids the extremes of, on the one hand, giving up the idea of research and, on the other, ignoring the rights of the subjects. At all times, the welfare of subjects should be kept in mind (though this has not always been the case, as we shall see), even if it involves compromising the impact of the research. Researchers should never lose sight of the obligations they owe to those who are helping, and should constantly be on the alert for alternative techniques should the ones they are employing at the time prove controversial (see the penultimate paragraph in this chapter on personal codes of ethical practice). Indeed, this polarity between the research and the researched is reflected in the principles of the American Psychological Association who, as Zechmeister and Shaughnessy (1992) show, attempt to strike a balance between the rights of investigators to seek an understanding of human behaviour, and the rights and welfare of individuals who participate in the research. In the final reckoning, the decision to go ahead with a research project rests on a subjective evaluation of the costs both to the individual and society.

The second source of tension in this context is that generated by the competing absolutist and relativist positions. The absolutist view holds that clear, set principles should guide the researchers in their work and that these should determine what ought and what ought not to be done (see Box 2.6). To have taken a wholly absolutist stance, for example, in the case of the Stanford Prison Experiment (see Chapter 21), where the researchers studied interpersonal dynamics in a simulated prison, would have meant that the experiment should not have taken place at all or that it should have been terminated well before the sixth day. Zimbardo has stated the ethical position:

Box 2.6 Absolute ethical principles in social research

Source Zimbardo, 1984

Ethics embody individual and communal codes of conduct based upon adherence to a set of principles which may be explicit and codified or implicit and which may be abstract and impersonal or concrete and personal. For the sake of brevity, we may say that ethics can be dichotomized as ‘absolute’ and ‘relative’. When behaviour is guided by absolute ethical standards, a higher-order moral principle can be postulated which is invariant with regard to the conditions of its applicability—across time, situations, persons and expediency. Such principled ethics allow no degree of freedom for ends to justify means or for any positive consequences to qualify instances where the principle is suspended or applied in an altered, watered-down form. In the extreme, there are no extenuating circumstances to be considered or weighed as justifying an abrogation of the ethical standard.


To search for those conditions which justify experiments that induce human suffering is not an appropriate enterprise to anyone who believes in the absolute ethical principle that human life is sacred and must not in any way be knowingly demeaned physically or mentally by experimental interventions. From such a position it is even reasonable to maintain that no research should be conducted in psychology or medicine which violates the biological or psychological integrity of any human being regardless of the benefits that might, or even would definitely, accrue to the society at large.

(Zimbardo, 1984)

By this absolute principle, the Stanford Prison Experiment must be regarded as unethical because the participants suffered considerably.

Those who hold a relativist position, by contrast, would argue that there can be no absolute guidelines and that ethical considerations will arise from the very nature of the particular research being pursued at the time: situation determines behaviour. There are some contexts, however, where neither the absolutist nor the relativist position is clear cut. Writing of the application of the principle of informed consent with respect to life history studies, Plummer says:

Both sides have a weakness. If, for instance, as the absolutists usually insist, there should be informed consent, it may leave relatively privileged groups under-researched (since they will say ‘no’) and underprivileged groups over-researched (they have nothing to lose and say ‘yes’ in hope). If the individual conscience is the guide, as the relativists insist, the door is wide open for the unscrupulous—even immoral—researcher.

(Plummer, 1983)

He suggests that broad guidelines laid down by professional bodies which offer the researcher room for personal ethical choice are a way out of the problem. Raffe et al. (1989) have identified other sources of tension which arose in their own research: that between different ethical principles, for instance, and between groups and other individuals or groups. Before we consider the problems set by ethical dilemmas, we touch upon one or two other trip-wires disclosed by empirical research.

Voices of experience

Whatever the ethical stance one assumes and no matter what forethought one brings to bear on one’s work, there will always be unknown, unforeseen problems and difficulties lying in wait (Kimmel, 1988). It may therefore be of assistance to readers if we dip into the literature and identify some of these. Baumrind (1964), for example, warns of the possible failure on the researchers’ part to perceive a positive indebtedness to their subjects for their services, perhaps, she suggests, because the detachment which investigators bring to their task prevents appreciation of subjects as individuals. This kind of omission can be averted if the experimenters are prepared to spend a few minutes with subjects afterwards in order to thank them for their participation, answer their questions, reassure them that they did well, and generally talk to them for a time. If the research involves subjects in a failure experience, isolation, or loss of self-esteem, for example, researchers must ensure that the subjects do not leave the situation more humiliated, insecure, and alienated than when they arrived. From the subject’s point of view, procedures which involve loss of dignity, injury to self-esteem, or affect trust in rational authority are probably most harmful in the long run and may require the most carefully organized ways of recompensing the subject in some way if the researcher chooses to carry on with such methods. With particularly sensitive areas, participants need to be fully informed of the dangers of serious after-effects. There is reason to believe that at least some of the obedient subjects in Milgram’s (1963) experiments (see Chapter 21) came away from the experience with a lower self-esteem, having to live with the realization that they were willing to yield to destructive authority to the point of inflicting extreme pain on a fellow human being (Kelman, 1967). It follows that researchers need to reflect attitudes of compassion, respect, gratitude and common sense without being too effusive. Subjects clearly have a right to expect that the researchers with whom they are interacting have some concern for the welfare of participants. Further, the subject’s sensibilities need also to be taken into account when the researcher comes to write up the research. There have been notorious instances in the research literature when even experienced researchers have shown scant regard for subjects’ feelings at the report stage. A related and not insignificant issue concerns the formal recognition of those who have assisted in the investigation, if such be the case. This means that whatever form the written account takes, be it a report, article, chapter, or course thesis, and no matter the readership for which it is intended, its authors must acknowledge and thank all who helped in the research, even to the extent of identifying by name those whose contribution was significant. This can be done in a foreword, introduction or footnote. All this is really a question of common-sensical ethics, an approach that will go a long way in enabling researchers to overcome many of the challenges that beset them.

Ethical problems in educational research can often result from thoughtlessness, oversight, or taking matters for granted. For example, a researcher may be completely oblivious to attendant moral issues and perceive his or her work in an ethical void (not to be compared with the situation where a researcher knowingly treats moral issues as if they do not matter, with, as it were, ‘metaethical disdain’). Again, researchers engaged in sponsored research may feel they do not have to deal with ethical issues, believing their sponsors to have them in hand.

Likewise, each researcher in a collaborative venture may take it for granted, wrongly, that colleagues have the relevant ethical questions in mind; consequently, appropriate precautions go by default. A student whose research is part of a course requirement and who is motivated wholly by self-interest, or an academic researcher with professional advancement in mind, may overlook the ‘oughts’ and ‘ought nots’. There is nothing wrong with either motivation providing that ethical issues are borne in mind. Finally, researchers should beware of adopting modi operandi in which correct ethical procedure unwittingly becomes a victim of convenience.

Ethical dilemmas

At the beginning of this chapter, we spoke of the costs/benefits ratio. This has been explained by Frankfort-Nachmias and Nachmias as a conflict between two rights which they express as:

the right to research and acquire knowledge and the right of individual research participants to self-determination, privacy and dignity. A decision not to conduct a planned research project because it interferes with the participants’ welfare is a limit on the first of these rights. A decision to conduct research despite an ethically questionable practice…is a limit on the second right.

(Frankfort-Nachmias and Nachmias, 1992)

This constitutes the fundamental ethical dilemma of the social scientist for whom there are no absolute right or wrong answers. Which proposition is favoured, or how a balance between the two is struck, will depend very much on the background, experience, and personal values of the individual researcher. With this issue in mind, we now examine other dilemmas that may confront investigators once they have come to some accommodation with this fundamental dilemma and decided to proceed with their research.

Privacy

For the most part, individual ‘right to privacy’ is usually contrasted with public ‘right to know’ (Pring, 1984) and this has been defined in the Ethical Guidelines for the Institutional Review Committee for Research with Human Subjects as that which:

extends to all information relating to a person’s physical and mental condition, personal circumstances and social relationships which is not already in the public domain. It gives to the individual or collectivity the freedom to decide for themselves when and where, in what circumstances and to what extent their personal attitudes, opinions, habits, eccentricities, doubts and fears are to be communicated to or withheld from others.

(Social Sciences and Humanities Research Council of Canada, 1981)

In the context of research, therefore, ‘right to privacy’ may easily be violated during the course of an investigation or denied after it has been completed. At either point the participant is vulnerable.

Privacy has been considered from three different perspectives by Diener and Crandall (1978). These are: the sensitivity of the information being given, the setting being observed, and dissemination of information. Sensitivity of information refers to how personal or potentially threatening the information is that is being collected by the researcher. Certain kinds of information are more personal than others and may be more threatening. According to a report by the American Psychological Association, for example, ‘Religious preferences, sexual practices, income, racial prejudices, and other personal attributes such as intelligence, honesty, and courage are more sensitive items than “name, rank, and serial number”’ (American Psychological Association, 1973). Thus, the greater the sensitivity of the information, the more safeguards are called for to protect the privacy of the research participant. The setting being observed may vary from very private to completely public. The home, for example, is considered one of the most private settings, and intrusions into people’s homes without their consent are forbidden by law. Dissemination of information concerns the ability to match personal information with the identity of the research participants. Indeed, personal data are defined at law as those data which uniquely identify the individual providing them. When such information is publicized with names through the media, for example, privacy is seriously violated. The more people there are who can learn about the information, the more concern there must be about privacy (see Diener and Crandall, 1978).

As is the case with most rights, privacy can be voluntarily relinquished. Research participants may choose to give up their right to privacy by either allowing a researcher access to sensitive topics or settings or by agreeing that the research report may identify them by name. The latter case at least would be an occasion where informed consent would need to be sought.

Generally speaking, if researchers intend to probe into the private aspects or affairs of individuals, their intentions should be made clear and explicit and informed consent should be sought from those who are to be observed or scrutinized in private contexts. Other methods to protect participants are anonymity and confidentiality, and our examination of these follows.

Anonymity

As Frankfort-Nachmias and Nachmias say, ‘The obligation to protect the anonymity of research participants and to keep research data confidential is all-inclusive. It should be fulfilled at all costs unless arrangements to the contrary are made with the participants in advance’ (Frankfort-Nachmias and Nachmias, 1992).

The essence of anonymity is that information provided by participants should in no way reveal their identity. The obverse of this is, as we saw earlier, personal data that uniquely identify their supplier. A participant or subject is therefore considered anonymous when the researcher or another person cannot identify the participant or subject from the information provided. Where this situation holds, a participant’s privacy is guaranteed, no matter how personal or sensitive the information is. Thus a respondent completing a questionnaire that bears absolutely no identifying marks—names, addresses, occupational details, or coding symbols—is ensured complete and total anonymity. A subject agreeing to a face-to-face interview, on the other hand, can in no way expect anonymity. At most, the interviewer can promise confidentiality. Non-traceability is an important matter, and this extends in some cases to aggregating data, so that an individual’s response is not identifiable.

The principal means of ensuring anonymity, then, is not using the names of the participants or any other personal means of identification. Further ways of achieving anonymity have been listed by Frankfort-Nachmias and Nachmias as follows:

participants may be asked to use an alias of their own creation or to transfer well-remembered personal data (birthdays or National Insurance number, for instance). Anonymity may be enhanced if names and other identifiers are linked to the information by a code number. Once the data have been prepared for analysis, anonymity can be maintained by separating identifying information from the research data. Further safeguards include the prevention of duplication of records and passwords to control access to data.

(Frankfort-Nachmias and Nachmias, 1992)
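By way of illustration, the following minimal sketch in Python is our own and purely hypothetical: the function, the field names and the code format are illustrative assumptions, not drawn from Frankfort-Nachmias and Nachmias. It shows one way a code-number scheme might separate identifying information from research data.

import secrets

def pseudonymize(records, id_field="name"):
    # Split records into (research_data, linkage_table). The research data
    # carry only an arbitrary code number; the linkage table, which maps
    # codes back to identifiers, should be stored separately, for example
    # under password control, and destroyed when no longer needed.
    research_data, linkage_table = [], {}
    for record in records:
        code = secrets.token_hex(4)  # non-meaningful, hard-to-guess code
        linkage_table[code] = record[id_field]
        anonymized = {k: v for k, v in record.items() if k != id_field}
        anonymized["code"] = code
        research_data.append(anonymized)
    return research_data, linkage_table

interviews = [{"name": "R. Smith", "school": "A", "response": "agree"}]
data, links = pseudonymize(interviews)
# 'data' can circulate among the research team; 'links' stays locked away.

Separating the two outputs mirrors the safeguard described above: anonymity is preserved so long as the linkage table is inaccessible to anyone holding the research data.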

These directives may work satisfactorily in most situations, but as Raffe and his colleagues (1989) have shown, there is sometimes the difficulty of maintaining an assurance of anonymity when, for example, categorization of data may uniquely identify an individual or institution, or when there is access to incoming returns by support staff. Plummer (1983), likewise, refers to life studies in which names have been changed, places shifted, and fictional events added to prevent acquaintances of subjects discovering their identity. Although one can go a long way down this path, there is no absolute guarantee of total anonymity as far as life studies are concerned. Fortunately, in experimental social psychological research the experimenter is interested in ‘human’ behaviour rather than in the behaviour of specific individuals, as Aronson and Carlsmith (1969) note. Consequently the researcher has absolutely no interest in linking the person as a unique, named individual to actual behaviour, and the research data can be transferred to coded, unnamed data sheets. As they comment, ‘the very impersonality of the process is a great advantage ethically because it eliminates some of the negative consequences of the invasion of privacy’ (Aronson and Carlsmith, 1969:33).

Confidentiality

The second way of protecting a participant’s right to privacy is through the promise of confidentiality. This means that although researchers know who has provided the information or are able to identify participants from the information given, they will in no way make the connection known publicly; the boundaries surrounding the shared secret will be protected. The essence of the matter is the extent to which investigators keep faith with those who have helped them. It is generally at the access stage or at the point where researchers collect their data that they make their position clear to the hosts and/or subjects. They will thus be quite explicit in explaining to subjects what the meaning and limits of confidentiality are in relation to the particular research project. On the whole, the more sensitive, intimate, or discrediting the information, the greater is the obligation on the researcher’s part to make sure that guarantees of confidentiality are carried out in spirit and letter. Promises must be taken seriously.

In his account of confidentiality and the right to privacy, Kimmel (1988) notes that one general finding that emerges from the empirical literature is that some potential respondents in research on sensitive topics will refuse to co-operate when an assurance of confidentiality is weak, vague, not understood, or thought likely to be breached. He concludes that the usefulness of data in sensitive research areas may be seriously affected by the researcher’s inability to provide a credible promise of confidentiality. Assurances do not appear to affect co-operation rates in innocuous studies, perhaps because, as Kimmel suggests, there is an expectation on the part of most potential respondents that confidentiality will be protected.

A number of techniques have been developed to allow public access to data and information without confidentiality being betrayed. These have been listed by Frankfort-Nachmias and Nachmias (1992) as follows:

1 Deletion of identifiers (for example, deleting the names, addresses, or other means of identification from the data released on individuals).
2 Crude report categories (for example, releasing the year of birth rather than the specific date, profession but not the speciality within that profession, general information rather than specific).
3 Microaggregation (that is, the construction of ‘average persons’ from data on individuals and the release of these data, rather than data on individuals).
4 Error inoculation (deliberately introducing errors into individual records while leaving the aggregate data unchanged).

Such techniques ensure that the notion of non-traceability is upheld.
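Each of these four techniques is simple to express computationally. The sketch below is our own hypothetical illustration (the function names, field names, category width and noise level are assumptions, not part of Frankfort-Nachmias and Nachmias’s account):

import random
from statistics import mean

def delete_identifiers(record, identifiers=("name", "address")):
    # 1 Deletion of identifiers: drop the fields that name the individual.
    return {k: v for k, v in record.items() if k not in identifiers}

def crude_category(year_of_birth, width=10):
    # 2 Crude report categories: release the decade, not the exact year.
    return f"{(year_of_birth // width) * width}s"

def microaggregate(records, field, group_size=3):
    # 3 Microaggregation: publish 'average persons' for groups of records
    # rather than the individual values themselves.
    return [mean(r[field] for r in records[i:i + group_size])
            for i in range(0, len(records), group_size)]

def error_inoculate(values, noise=1.0):
    # 4 Error inoculation: perturb each record with zero-mean noise, so
    # that aggregate statistics are approximately preserved while any
    # single record is no longer an accurate statement about one person.
    return [v + random.gauss(0, noise) for v in values]

In each case the released data remain useful in aggregate while the route back to a named individual is cut off, which is precisely the notion of non-traceability described above.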

Betrayal

The term ‘betrayal’ is usually applied to those occasions where data disclosed in confidence are revealed publicly in such a way as to cause embarrassment, anxiety, or perhaps suffering to the subject or participant disclosing the information. It is a breach of trust, in contrast to confidentiality, and is often a consequence of selfish motives of either a personal or professional nature. Plummer comments, ‘in sociology, there is something slightly awry when a sociologist can enter a group and a person’s life for a lengthy period, learn their most closely guarded secrets, and then expose all in a critical light to the public’ (Plummer, 1983). One of the research methods we deal with in this book that is perhaps most vulnerable to betrayal is action research. As Kelly (1989a) notes, this can produce several ethical problems. She says that if we treat teachers as collaborators in our day-to-day interactions, it may seem like betrayal of trust if these interactions are recorded and used as evidence. This is particularly the case where the evidence is negative. One way out, Kelly suggests, could be to submit reports and evaluations of teachers’ reactions to the teachers involved for comment; to get them to assess their own changing attitudes. She warns, however, that this might work well with teachers who have become converts, but is more problematic where teachers remain indifferent or hostile to the aims of the research project. How does one write an honest but critical report of teachers’ attitudes, she asks, if one hopes to continue to work with those involved? As she concludes, ‘Our position lies uncomfortably between that of the internal evaluator whose main loyalty is to colleagues and the school, and the external researcher for whom informal comments and small incidents may provide the most revealing data’ (Kelly, 1989a).

Deception

The use of deception in social psychological and sociological research has attracted a certain amount of adverse publicity. In social psychological research, the term is applied to that kind of experimental situation where the researcher knowingly conceals the true purpose and conditions of the research, or else positively misinforms the subjects, or exposes them to unduly painful, stressful or embarrassing experiences, without the subjects having knowledge of what is going on. The deception lies in not telling the whole truth. Advocates of the method feel that if a deception experiment is the only way to discover something of real importance, the truth so discovered is worth the lies told in the process, so long as no harm comes to the subject (see Aronson et al., 1990). Objections to the technique, on the other hand, are listed in Chapter 21, where the approach is contrasted with role playing. The problem from the researcher’s point of view is: ‘What is the proper balance between the interests of science and the thoughtful, humane treatment of people who, innocently, provide the data?’ In other words, the problem again hinges on the costs/benefits ratio.

The pervasiveness of the problem of deception becomes even more apparent when we remember that it is even built into many of our measurement devices, since it is important to keep the respondent ignorant of the personality and attitude dimensions that we wish to investigate.

There are many problems that cannot be investigated without deception, and although there is some evidence that most subjects accept without resentment the fact of having been duped once they understand the necessity for it (see, for instance, Festinger and Katz, 1966), it is important to keep in the forefront of one’s mind the question of whether the amount and type of deception is justified by the significance of the study and the unavailability of alternative procedures.

Ethical considerations loom particularly large when second-order deception is involved; that is, letting persons believe they are acting as researchers or researchers’ accomplices when they are in fact serving as the subjects (i.e., as unknowing participants). Such procedures can undermine the relationship between the researcher and subject even more than simply misinforming them. The use of deception resulting in particularly harmful consequences would be another occasion where ethical considerations would need to be given priority. An example here would be the study by Campbell, Sanderson and Laverty (1964) which created extremely stressful conditions by using drugs to induce temporary interruption of breathing (see Box 2.7).

Kelman (1967) has suggested three ways of dealing with the problem of deception. First, it is important that we increase our active awareness that it exists as a problem. It is crucial that we always ask ourselves the question whether deception is necessary and justified. We must be wary of the tendency to dismiss the question as irrelevant and to accept deception as a matter of course. Active awareness is thus in itself part of the solution, for it makes the use of deception a focus for discussion, deliberation, investigation, and choice.

The second way of approaching the problem concerns counteracting and minimizing the negative effects of deception. For example, subjects must be selected in a way that will exclude individuals who are especially vulnerable; any potentially harmful manipulation must be kept to a moderate level of intensity; researchers must be sensitive to danger signals in the reactions of subjects and be prepared to deal with crises when they arise; and at the conclusion of the research, they must take time not only to reassure subjects, but also to help them work through their feelings about the experience to whatever degree may be required. The principle that subjects ought not to leave the research situation with greater anxiety or lower levels of self-esteem than they came with is a good one to follow (the issue of non-maleficence again). Desirably, subjects should be enriched by the experience and should leave it with the feeling that they have learned something.

The primary way of counteracting negative effects of research employing deception is to ensure that adequate feedback is provided at the end of the research or research session. Feedback must be kept inviolable and in no circumstances should subjects be given false feedback or be misled into thinking they are receiving feedback when the researcher is in fact introducing another experimental manipulation.

Even here, however, there are dangers. As Aronson and Carlsmith say:

Box 2.7 An extreme case of deception

Source Adapted from Kelman, 1967

In an experiment designed to study the establishment of a conditioned response in a situation that is traumatic but not painful, Campbell, Sanderson and Laverty induced—through the use of a drug—a temporary interruption of respiration in their subjects. The subjects’ reports confirmed that this was a ‘horrific’ experience for them. All the subjects thought they were dying. The subjects, male alcoholic patients who had volunteered for the experiment when they were told that it was connected with a possible therapy for alcoholism, were not warned in advance about the effect of the drug, since this information would have reduced the traumatic impact of the experience.


debriefing a subject is not simply a matter of exposing him to the truth. There is nothing magically curative about the truth; indeed…if harshly presented, the truth can be more harmful than no explanation at all. There are vast differences in how this is accomplished, and it is precisely these differences that are of crucial importance in determining whether or not a subject is uncomfortable when he leaves the experimental room.

(Aronson and Carlsmith, 1969:31)

They consider that the one essential aspect of the debriefing process is that researchers communicate their own sincerity as scientists seeking the truth and their own discomfort about the fact that they found it necessary to resort to deception in order to uncover the truth. As they say, ‘No amount of postexperimental gentleness is as effective in relieving a subject’s discomfort as an honest accounting of the experimenter’s own discomfort in the situation’ (Aronson and Carlsmith, 1969:31–2).

The third way of dealing with the problem of deception is to ensure that new procedures and novel techniques are developed. It is a question of tapping one’s own creativity in the quest for alternative methods. It has been suggested that role-playing, or ‘as-if’ experiments, could prove a worthwhile avenue to explore—the ‘role-playing versus deception’ debate we raise in Chapter 21. By this method, as we shall see, the subject is asked to behave as if he/she were a particular person in a particular situation. Whatever form they take, however, new approaches will involve a radically different set of assumptions about the role of the subject in this type of research. They require us to use subjects’ motivations rather than bypassing them. They may even call for increasing the sophistication of potential subjects, rather than maintaining their naivety.

Plummer (1983) informs us that even in an unlikely area like life history, deceptions of a lesser nature occur. Thus, for example, the general description given of research may leave out some key issues; indeed, to tell the subject what it is you are looking for may bias the outcome quite substantially. Further, different accounts of the research may have to be presented to different groups. He quotes an instance from his own research, a study of sexual minorities, which required various levels of release—for the subjects, for colleagues, for general inquiries, and for outside friends. None of these accounts actually lied; they merely emphasized a different aspect of the research.

In the social sciences, the dilemma of deception, as we have seen, has played an important part in experimental social psychology, where subjects are not told the true nature of the experiment. Another area where it has been increasingly used in recent years is that of sociology, where researchers conceal their identities and ‘con’ their way into alien groups—the overt/covert debate (Mitchell, 1993). Covert, or secret participation, then, refers to that kind of research where researchers spend an extended period of time in particular research settings, concealing the fact that they are researchers and pretending to play some other role. Bulmer (1982) notes that such methods have produced an extremely lively ongoing debate and that there are no simple and universally agreed answers to the ethical issues the method produces. Erikson (1967), for example, makes a number of points against covert research; among them, that sociologists have responsibilities to their subjects in general and that secret research can injure other people in ways that cannot be anticipated or compensated for afterwards; and that sociologists have responsibilities towards fellow-sociologists. Douglas (1976), by contrast, argues that covert observation is a necessary, useful and revealing method. And Bulmer (1982) concludes that the most compelling argument in favour of covert observation is that it has produced good social science which would not have been possible without the method. It would be churlish, he adds, not to recognize that the use of covert methods has advanced our understanding of society.

The final word on the subject of deception in general goes to Kimmel (1988), who claims that few researchers feel that they can do without deception entirely, since the adoption of an overly conservative approach could deem the study of important research hardly worth the effort. A study of racial prejudice, for example, accurately labelled as such would certainly affect the behaviour of the subjects taking part. Deception studies, he considers, differ so greatly that even the harshest critics would be hard pressed to state unequivocally that all deception has potentially harmful effects on participants or is otherwise wrong. We turn now to research methods used in educational settings and to some ethical issues associated with them.

Ethics and research methods in education

Ethical problems arising from research methods used in educational contexts occur passim in Burgess’s (1989a) edited collection of papers, The Ethics of Educational Research, and the book is recommended to readers for their perusal. Burgess himself considers ethical issues emerging from ethnographic research (1989b). Similar themes characterize Riddell’s paper, in which she examines feminist research in two rural comprehensive schools. Her work illustrates how feminist investigations raise questions about honesty, power relations, the responsibility of the researcher to the researched, and collaboration. Corresponding topics are broached for action researchers by Kelly (1989b), who was co-director of the ‘Girls into Science and Technology’ project, a study focusing on girls’ under-involvement in science and technology. A range of questions is considered—researcher power and values, the problem of informed consent, and the manner in which research data are presented to the participants in the project. Ethical issues with respect to empirical research are considered in the second part of the book.

Reflection on the articles in Burgess (1989a) will show that the issues thrown up by the complexities of research methods in educational institutions and their ethical consequences are probably among the least anticipated, particularly among the more inexperienced researchers. The latter need to be aware of those kinds of research which, by their nature, lead from one problem to another. Serial problems of this sort may arise in survey methods or ethnographic studies, for example, or in action research or the evaluation of developments. Indeed, the researcher will frequently find that methodological and ethical issues are inextricably interwoven in much of the research we have designated as qualitative or interpretive. As Hitchcock and Hughes note:

Doing participant observation or interviewing one’s peers raises ethical problems that are directly related to the nature of the research technique employed. The degree of openness or closure of the nature of the research and its aims is one that directly faces the teacher researcher.

(Hitchcock and Hughes, 1989:199)

They go on to pose the kinds of question that may arise in such a situation. ‘Where for the researcher does formal observation end and informal observation begin?’ ‘Is it justifiable to be open with some teachers and closed with others?’ ‘How much can the researcher tell the pupils about a particular piece of research?’ ‘When is a casual conversation part of the research data and when is it not?’ ‘Is gossip legitimate data and can the researcher ethically use material that has been passed on in confidence?’ As Hitchcock and Hughes conclude, the list of questions is endless, yet they can be related to the nature of both the research technique involved and the social organization of the setting being investigated. The key to the successful resolution of such questions lies in establishing good relations. This will involve the development of a sense of rapport between researchers and their subjects that will lead to feelings of trust and confidence. Mention must be made once again in this particular context of the work of Fine and Sandstrom (1988), who discuss in some detail the ethical and practical aspects of doing fieldwork with children. In particular they show how the ethical implications of participant observation research differ with the age of the children. Another feature of qualitative methods in this connection has been identified by Finch, who observes that:

there can be acute ethical and political dilemmas about how the material produced is used, both by the researcher her/himself, and by other people. Such questions are not absent in quantitative research, but greater distancing of the researcher from the research subjects may make them less personally agonizing. Further, in ethnographic work or depth interviewing, the researcher is very much in a position of trust in being accorded privileged access to information which is usually private or invisible. Working out how to ensure that such trust is not betrayed is no simple matter…Where qualitative research is targeted upon social policy issues, there is the special dilemma that findings could be used to worsen the situation of the target population in some way.

(Finch, 1985)

Kelly’s (1989a) paper would seem to suggest, as we have noted elsewhere in this chapter, that the area in qualitative research where one’s ethical antennae need to be especially sensitive is that of action research, and it is here that researchers, be they teachers or outsiders, must show particular awareness of the traps that lie in wait. These difficulties have been nowhere better summed up than in Hopkins when he says:

[The researchers’] actions are deeply embedded in an existing social organization and the failure to work within the general procedures of that organization may not only jeopardize the process of improvement but existing valuable work. Principles of procedures for action research accordingly go beyond the usual concerns for confidentiality and respect for persons who are the subjects of enquiry and define, in addition, appropriate ways of working with other participants in the social organization.

(Hopkins, 1985:135)

Box 2.8 presents a set of principles specially formulated for action researchers by Kemmis and McTaggart (1981) and quoted by Hopkins (1985).

We conclude by reminding readers who may become involved in action research that the problem of access is not resolved once one has been given permission to use the school or organization. The advice given by Hammersley and Atkinson with respect to ethnographic research is equally applicable to action research. As they say:

[having] gained entry to a setting…by no means guarantees access to all the data available within it. Not all parts of the setting will be equally open to observation, not everyone may be willing to talk, and even the most willing informant will not be prepared, or perhaps even able, to divulge all the information available to him or her. If the data required to develop and test the theory are to be acquired, negotiation of access is therefore likely to be a recurrent preoccupation for the ethnographer.

(Hammersley and Atkinson, 1983:76)

As the authors observe, different kinds of data will demand different roles, and these in turn result in varying ethical principles being applied to the various negotiating stances.

Ethics and teacher evaluation

After our brief excursus into the problems of ethics in relation to action research, an approach to classroom activities frequently concerned with the improvement of teacher performance and efficiency, it would seem logical to acknowledge the role and importance of ethics in teacher evaluation. The appraisal of teacher and headteacher performance will play an increasingly important part as accountability, teacher needs, and management efficiency assume greater significance, as governments introduce pedagogic and curricular changes, and as market forces exert pressure on the educational system generally. By thus throwing teacher appraisal into greater relief, it becomes very important that training appraisal programmes are planned and designed in such a way as to give due recognition to the ethical implications at both school and LEA levels. With this in mind, we briefly review some basic principles and concepts formulated in the USA that may sensitize all those involved in appraisal procedures to the concomitant ethical factors.

Strike (1990), in his paper on the ethics of educational evaluation, offers two broad principles which may form the basis of further considerations in the field of evaluation. These are the principle of benefit maximization and the principle of equal respect. The former, the principle of benefit maximization, holds that the best decision is the one that results in the greatest benefit for most people. It is pragmatic in the sense that it judges the rightness of our actions by their consequences or, as Strike says, the best action is the one with the best results. In British philosophical circles it is known as utilitarianism and requires us to identify the particular benefits we wish to maximize, to identify a suitable population for maximization, specify what is to count as maximization, and fully understand the consequences of our actions. The second principle, that of equal respect, demands that we respect the equal worth of all people. This requires us to treat people as ends rather than means; to regard them as free and rational; and to accept that they are entitled to the same basic rights as others.

Strike then goes on to list the following ethical principles which he regards as particularly important to teacher evaluation and which may be seen in the light of the two broad principles outlined above:

Box 2.8 Ethical principles for the guidance of action researchers

Source Adapted from Kemmis and McTaggart (1981) and quoted in Hopkins (1985)

Observe protocol: Take care to ensure that the relevant persons, committees, and authorities have been consulted and informed, and that the necessary permission and approval have been obtained.
Involve participants: Encourage others who have a stake in the improvement you envisage to shape and form the work.
Negotiate with those affected: Not everyone will want to be directly involved; your work should take account of the responsibilities and wishes of others.
Report progress: Keep the work visible and remain open to suggestions so that unforeseen and unseen ramifications can be taken account of; colleagues must have the opportunity to lodge a protest to you.
Obtain explicit authorizations: This applies where you wish to observe your professional colleagues, and where you wish to examine documentation.
Negotiate descriptions of people’s work: Always allow those described to challenge your accounts on the grounds of fairness, relevance and accuracy.
Negotiate accounts of others’ points of view (e.g. in accounts of communication): Always allow those involved in interviews, meetings and written exchanges to require amendments which enhance fairness, relevance and accuracy.
Obtain explicit authorization before using quotations: Verbatim transcripts, attributed observations, excerpts of audio and video recordings, judgements, conclusions or recommendations in reports (written or to meetings).
Negotiate reports for various levels of release: Remember that different audiences require different kinds of reports; what is appropriate for an informal verbal report to a faculty meeting may not be appropriate for a staff meeting, a report to council, a journal article, a newspaper, a newsletter to parents; be conservative if you cannot control distribution.
Accept responsibility for maintaining confidentiality.
Retain the right to report your work: Provided that those involved are satisfied with the fairness, accuracy and relevance of accounts which pertain to them, and that the accounts do not unnecessarily expose or embarrass those involved, then accounts should not be subject to veto or be sheltered by prohibitions of confidentiality.
Make your principles of procedure binding and known: All of the people involved in your action research project must agree to the principles before the work begins; others must be aware of their rights in the process.


1 Due process: Evaluative procedures must ensure that judgements are reasonable: that known and accepted standards are consistently applied from case to case, that evidence is reasonable and that there are systematic and reasonable procedures for collecting and testing evidence.
2 Privacy: This involves a right to control information about oneself, and protects people from unwarranted interference in their affairs. In evaluation, it requires that procedures are not overly intrusive and that such evaluation pertains only to those aspects of a teacher’s activity that are job related. It also protects the confidentiality of evaluation information.
3 Equality: In the context of evaluation, this can best be understood as a prohibition against making decisions on irrelevant grounds, such as race, religion, gender, ethnicity or sexual orientation.
4 Public perspicuity: This principle requires openness to the public concerning evaluative procedures, their purposes and their results.
5 Humaneness: This principle requires that consideration is shown to the feelings and sensitivities of those in evaluative contexts.
6 Client benefit: This principle requires that evaluative decisions are made in a way that respects the interests of students, parents and the public, in preference to those of educational institutions and their staff. This extends to treating participants as subjects rather than as ‘research fodder’.
7 Academic freedom: This requires that an atmosphere of intellectual openness is maintained in the classroom for both teachers and students. Evaluation should not be conducted in a way that chills this environment.
8 Respect for autonomy: Teachers are entitled to reasonable discretion in, and to exercise reasonable judgement about, their work. Evaluations should not be conducted so as to unreasonably restrict discretion and judgement.

Strike has developed these principles in a more extended and systematic form in his article. Finally, we note the three principles that Strike applies to the task of conflict resolution, to resolving the differences between teachers and the institutions in which they work as a result of the evaluation process. He recommends that where a conflict has to be resolved, remediation is to be preferred, where possible, to disciplinary action or termination; mediation is to be preferred, where possible, to more litigious forms and solutions; and that informal attempts to settle disputes should precede formal ones.

We have seen throughout this chapter andin this particular section how the codificationand regulation of ethical principles is proceed-ing apace in the USA; and that this is occur-ring at both a formal and informal level. Inthis next, penultimate, section we look a littlecloser at these matters and their implicationsfor the UK.

Research and regulation

A glance at any current American textbook in the social sciences will reveal the extent to which professional researchers in the USA are governed by laws and regulations (Zechmeister and Shaughnessy, 1992). These exist at several levels: federal and state legal statutes, ethics review committees to oversee research in universities and other institutions (these can constitute a major hurdle for those planning to undertake research), ethical codes of the professional bodies and associations, as well as the personal ethics of individual researchers, are all important regulatory mechanisms. All investigators, from undergraduates pursuing a course-based research project to professional researchers striving at the frontiers of knowledge, must take cognizance of the ethical codes and regulations governing their practice. Indeed, we have sampled some of the ethical research requirements of American investigators in this chapter. Failure to meet these responsibilities on the part of researchers is perceived as undermining the whole scientific process and may lead to legal and financial penalties for individuals and institutions.

If Britain has not yet gone as far as the USA down this path of regulation and litigation, it may only be a question of time. Even in the UK, however, professional societies have formulated working codes of practice which express the consensus of values within a particular group and which help individual researchers by indicating what is desirable and what is to be avoided. Of course, this does not solve all the problems, for there are few absolutes and in consequence ethical principles may be open to a wide range of interpretations. In addition, more informal codes of ethical principles have emerged as a result of individual initiative. The establishment of comprehensive regulatory mechanisms is thus well in hand in the UK, but it is perhaps in the field of information and data, how they are stored and the uses to which they are put, for example, that educational researchers are likely to find growing interest. This category would include, for instance, statistical data, data used as the basis for evaluation, curricular records, written records, transcripts, data sheets, personal documents, research data, computer files, and audio and video recordings.

As information technology establishes itself in a centre-stage position, and as society becomes increasingly dependent on information economically and functionally, so we realize just how important the concept of information is to us. It is important not only for what it is, but for what it can do. Numerous writers have pointed out the connection between information and power. Harris, Pearce and Johnstone (1992), for instance, say:

Information and power have a very close relationship… Power over individuals…relies on the control of personal information. Power of professionalism involves both submission of the client to the professional's better judgment and a network of professional and inter-professional relationships, and probably rivalries, buttressed by exclusive sharing of information. It is well to recognize that decisions about information-holding or access are, to an extent, always decisions about power.

(Harris, Pearce and Johnstone, 1992)

When we reflect on the extent to which two key concepts in the world of contemporary education, namely 'evaluation' (or appraisal) and 'accountability', depend wholly on information in one form or another, that it is their very life blood, we realize just how powerful it is. Its misuse, therefore, or disclosure at the wrong time or to the wrong client or organ, can result in the most unfortunate consequences for an individual, group, or institution. And matters are greatly exacerbated if it is the wrong information, or incomplete, or deliberately misleading.

In an increasingly information-rich world, it is essential that safeguards be established to protect it from misuse or abuse. The Data Protection Act (1984) was designed to achieve such an end. This covered the principles of data protection, the responsibilities of data users, and the rights of data subjects, and its broad aims are embodied in eight principles. However, data held for 'historical and research' purposes are exempted from the principle which gives individuals the right of access to personal data about themselves, provided the data are not made available in a form which identifies individuals. Research data also have partial exemption from two further principles, with the effect that such data may be held indefinitely and the use of the data for research purposes need not be disclosed at the time of data collection.

Of the two most important principles which do concern research data, one states that personal data (i.e. data that uniquely identify the person supplying them) shall be held only for specified and lawful purposes. The second states that appropriate security measures shall be taken against unauthorized access to, or alteration, disclosure or destruction of, personal data, and against accidental loss or destruction of personal data. For a study of the effects of the Data Protection Act on the work of the Centre for Educational Sociology, see Raffe, Bundell and Bibby (1989).
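In practice, the second of these principles is often met by separating direct identifiers from the research data themselves. As a minimal, purely illustrative sketch (the field names, salt and hash scheme are our assumptions, not requirements of the Act), pseudonymization might look like this in Python:

```python
import hashlib

SALT = "project-specific-secret"  # illustrative; keep it out of the dataset itself

def pseudonymize(record):
    """Replace direct identifiers with a one-way code so that stored
    research data cannot be traced back to a named individual."""
    code = hashlib.sha256((SALT + record["name"]).encode()).hexdigest()[:12]
    return {"id": code,
            "school": record["school"],
            "reading_score": record["reading_score"]}

raw = {"name": "Jane Pupil", "school": "St Mary's", "reading_score": 87}
print(pseudonymize(raw))  # the identifying name is no longer stored
```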


Conclusion

This book is concerned with the methods used in educational research, and in this chapter we have attempted to acquaint readers with some of the ethical difficulties they are likely to experience in the conduct of such research. To this end, we have drawn on key concepts and ideas from deliberations and investigations in the educational, psychological, social psychological and sociological domains in order to elucidate some of the more important dilemmas and issues that are an inevitable part of social research. In doing this we are well aware that it is not possible to identify all potential ethical questions or adjudicate on what is correct researcher behaviour.4

On the other hand, perhaps some of the things we have said will seem irrelevant to readers who are unlikely to be called upon to submit subjects to painful electric shocks, provoke aggression, embarrass them, have them tell lies, or eat grasshoppers, as Aronson and Carlsmith (1969) put it. Nevertheless, it is hoped that these few pages will have induced in readers a certain disposition that will enable them to approach their own more temperate projects with a greater awareness and fuller understanding of the ethical dilemmas and moral issues lurking in the interstices of the research process. However inexperienced in these matters researchers are, they will bring to the world of social research a sense of rightness5 on which they can construct a set of rational principles appropriate to their own circumstances and based on personal, professional, and societal values (we stress the word 'rational' since reason is a prime ingredient of ethical thinking, and it is the combination of reason and a sense of rightness that researchers must keep faith with if they are to bring a rich ethical quality to their work).6

Although no code of practice can anticipate or resolve all problems, there is a six-fold advantage in fashioning a personal code of ethical practice.7 First, such a code establishes one as a member of the wider scientific community having a shared interest in its values and concerns. Second, a code of ethical practice makes researchers aware of their obligations to their subjects and also to those problem areas where there is a general consensus about what is acceptable and what is not; in this sense it has a clarificatory value. Third, when one's professional behaviour is guided by a principled code of ethics, it is possible to consider that there may be alternative ways of doing the same thing, ways that are more ethical or less unethical, should one be confronted by a moral challenge. Fourth, a balanced code can be an important organizing factor in researchers' perceptions of the research situation, and as such may assist them in their need to anticipate and prepare. Fifth, a code of practice validated by their own sense of rightness will help researchers to develop an intuitive sensitivity that will be particularly helpful to them in dealing with the unknown and the unexpected, especially where the more fluid methods such as ethnography and participant observation are concerned. And sixth, a code of practice will bring discipline to researchers' awareness. Indeed, it should be their aim to strike a balance between discipline and awareness: discipline without awareness may result in largely mechanical behaviour, whereas awareness without discipline can produce inappropriate responses. Box 2.9 gives a short ethical code by way of example. It must be stressed, however, that bespoke items, i.e. ones designed to meet the needs of a specific project, are preferable to standard ones; the items in Box 2.9 are illustrative, and in no way exhaustive. Finally, we live in a relative universe and it has been said that relativity seeks adjustment; that adjustment is art; and that the art of life lies in a constant readjustment to one's surroundings (Okakura, 1991). What better precept for the art of the ethical researcher?

Box 2.9 An ethical code: an illustration

1 It is important for the researcher to reveal fully his or her identity and background.
2 The purpose and procedures of the research should be fully explained to the subjects at the outset.
3 The research and its ethical consequences should be seen from the subjects' and institution's point of view.
4 Ascertain whether the research benefits the subjects in any way (beneficence).
5 Where necessary, ensure the research does not harm the subjects in any way (non-maleficence).
6 Possible controversial findings need to be anticipated and, where they ensue, handled with great sensitivity.
7 The research should be as objective as possible. This will require careful thought being given to the design, conduct and reporting of research.
8 Informed consent should be sought from all participants. All agreements reached at this stage should be honoured.
9 Sometimes it is desirable to obtain informed consent in writing.
10 Subjects should have the option to refuse to take part, and know this; and the right to terminate their involvement at any time, and know this also.
11 Arrangements should be made during initial contacts to provide feedback for those requesting it. It may take the form of a written résumé of findings.
12 The dignity, privacy and interests of the participants should be respected. Subsequent privacy of the subjects after the research is completed should be guaranteed (non-traceability).
13 Deceit should only be used when absolutely necessary.
14 When ethical dilemmas arise, the researcher may need to consult other researchers or teachers.

Source: adapted from Reynolds, 1979

3 Research design issues: planning research

Introduction

There is no single blueprint for planning research. Research design is governed by the notion of 'fitness for purpose': the purposes of the research determine the methodology and design of the research. For example, if the purpose of the research is to map the field, or to make generalizable comments, then a survey approach might be desirable, using some form of stratified sample; if the effects of a specific intervention are to be evaluated then an experimental or action research model may be appropriate; if an in-depth study of a particular situation or group is important then an ethnographic model might be more appropriate.

That said, it is possible, nevertheless, to identify a set of issues that researchers need to address, regardless of the specifics of their research. It is this set of issues that this chapter addresses. It acts as a bridge between the theoretical discussions of the opening chapter and the subsequent chapters that cover: (a) specific styles of research (Part Three); (b) specific issues in planning a research design, e.g. sampling, validity, reliability, ethics (Part Two); (c) planning data collection (instrumentation, Part Four); and (d) data analysis. The intention here is to provide a set of issues that need to be addressed in practice so that an area of research interest can become practicable, feasible and capable of being undertaken. This chapter indicates how research might be operationalized, i.e. how a general set of research aims and purposes can be translated into a practical, researchable topic.

To change the 'rules of the game' in mid-stream, once the research has commenced, is a sure recipe for problems. The terms of the research and the mechanism of its operation must be ironed out in advance if it is to be credible, legitimate and practicable. Once they have been decided upon, the researcher is in a very positive position to undertake the research. The setting up of the research is a balancing act, for it requires the harmonizing of planned possibilities with workable, coherent practice, i.e. the resolution of the difference between idealism and reality, between what could be done and what will actually work, for at the end of the day research has to work. In planning research there are two phases: a divergent phase and a convergent phase. The divergent phase will open up a range of possible options facing the researcher, whilst the convergent phase will sift through these possibilities, see which ones are desirable, which ones are compatible with each other, which ones will actually work in the situation, and move towards an action plan that can realistically operate. This can be approached through the establishment of a framework of planning issues.

A framework for planning research

Though, clearly, the set of issues that constitute a framework for planning research will need to be interpreted differently for different styles of research, nevertheless it is useful to indicate what those issues might be. These include:

1 the general aims and purposes of the research;
2 how to operationalize research aims and purposes;
3 generating research questions;
4 identifying and setting in order the priorities for and constraints on the research;
5 approaching the research design;
6 focusing the research;
7 research methodology;
8 ethical issues;
9 audiences of the research;
10 instrumentation;
11 sampling;
12 time frames;
13 resources required;
14 validity and reliability;
15 data analysis;
16 verifying and validating the data;
17 reporting and writing up the research.

These can be arranged into four main areas (Morrison, 1993):

(i) orienting decisions;
(ii) research design and methodology;
(iii) data analysis;
(iv) presenting and reporting the results.

Orienting decisions are those decisions which will set the boundaries or the parameters of constraints on the research. For example, let us say that the overriding feature of the research is that it has to be completed within six months; this will exert an effect on the enterprise. On the one hand it will 'focus the mind', really requiring priorities to be settled and data to be provided in a relatively short time. On the other hand this may reduce the variety of possibilities available to the researcher. Hence questions of time scale will affect:

• the research questions which might be answered feasibly and fairly (for example, some research questions might require a long data collection period);
• the number of data collection instruments used (for example, there might be only enough time for a few instruments to be used);
• the sources (people) to whom the researcher might go (for example, there might only be enough time to interview a handful of people);
• the number of foci which can be covered in the time (for example, for some foci it will take a long time to gather relevant data);
• the size and nature of the reporting (there might only be time to produce one interim report).

By clarifying the time scale a valuable note of realism is injected into the research, which enables questions of practicability to be answered.

Let us take another example. Suppose the overriding feature of the research is that the costs in terms of time, people and materials for carrying it out are to be negligible. This, too, will exert an effect on the research. On the one hand it will inject a sense of realism into proposals, identifying what is and what is not manageable. On the other it will reduce, again, the variety of possibilities which are available to the researcher. Questions of cost will affect:

• the research questions which might be feasibly and fairly answered (for example, some research questions might require interviewing, which is costly in time both to administer and transcribe, or expensive commercially produced data collection instruments, e.g. tests, and costly computer services, which may include purchasing software);
• the number of data collection instruments used (for example, some data collection instruments, e.g. postal questionnaires, are costly for reprographics and postage);
• the people to whom the researcher might go (for example, if teachers are to be released from teaching in order to be interviewed then cover for their teaching may need to be found);
• the number of foci which can be covered in the time (for example, in uncovering relevant data, some foci might be costly in researcher's time);
• the size and nature of the reporting (for example, the number of written reports produced, the costs of convening meetings).


Certain time scales permit certain types of research: a short time scale permits answers to short-term issues, whilst long-term or large questions might require a long-term data collection period to cover a range of foci. Costs in terms of time, resources and people might affect the choice of data collection instruments. Time and cost will require the researcher to determine, for example, what will be the minimum representative sample of teachers or students in a school, for interviews are time-consuming and questionnaires are expensive to produce. These are only two examples of the real constraints on the research which must be addressed. Planning the research early on will enable the researcher to identify the boundaries within which the research must operate and what the constraints are on it.

With these preliminary comments, let us turn to the four main areas of the framework for planning research.

Orienting decisions

Decisions in this field are strategic; they set the general nature of the research, and the questions that researchers may need to consider are:

Who wants the research?
Who will receive the research/who is it for?
Who are the possible/likely audiences of the research?
What powers do the recipients of the research have?
What are the general aims and purposes of the research?
What are the main priorities for and constraints on the research?
What are the time scales and time frames of the research?
Who will own the research?
At what point will the ownership of the research pass from the participants to the researcher and from the researcher to the recipients of the research?
Who owns the data?
What ethical issues are to be faced in undertaking the research?
What resources (e.g. physical, material, temporal, human, administrative) are required for the research?

It can be seen that decisions here establish some key parameters of the research, including some political decisions (for example, on ownership and on the power of the recipients to take action on the basis of the research). At this stage the overall feasibility of the research will be addressed.

Research design and methodology

If the preceding orienting decisions are strategic, then decisions in this field are tactical; they establish the practicalities of the research, assuming that, generally, it is feasible (i.e. that the orienting decisions have been taken). Decisions here include addressing such questions as:

What are the specific purposes of the research?
How are the general research purposes and aims operationalized into specific research questions?
What are the specific research questions?
What needs to be the focus of the research in order to answer the research questions?
What is the main methodology of the research (e.g. a quantitative survey, qualitative research, an ethnographic study, an experiment, a case study, a piece of action research etc.)?
How will validity and reliability be addressed?
What kinds of data are required?
From whom will data be acquired (i.e. sampling)?
Where else will data be available (e.g. documentary sources)?
How will the data be gathered (i.e. instrumentation)?
Who will undertake the research?

The process of operationalization is critical for effective research. What is required here is translating a very general research aim or purpose into specific, concrete questions to which specific, concrete answers can be given. The process moves from the general to the particular, from the abstract to the concrete. Thus the researcher breaks down each general research purpose or general aim into more specific research purposes and constituent elements, continuing the process until specific, concrete questions have been reached to which specific answers can be provided. An example of this is provided below.

Let us imagine that the overall research aim is to ascertain the continuity between primary and secondary education (Morrison, 1993:31–3). This is very general, and needs to be translated into more specific terms. Hence the researcher might deconstruct the term 'continuity' into several components, for example experiences, syllabus content, teaching and learning styles, skills, concepts, organizational arrangements, aims and objectives, ethos, assessment. Given the vast scope of this, the decision is taken to focus on continuity of pedagogy. This is then broken down into its component areas:

the level of continuity of pedagogy;
the nature of continuity of pedagogy;
the degree of success of continuity of pedagogy;
the responsibility for continuity;
record keeping and documentation of continuity;
resources available to support continuity.

The researcher might take this further by investigating: the nature of the continuity (i.e. the provision of information about continuity); the degree of continuity (i.e. a measure against a given criterion); the level of success of the continuity (i.e. a judgement). An operationalized set of research questions, then, might be:

• How much continuity of pedagogy is occurring across the transition stages in each curriculum area? What kind of evidence is required to answer this question? On what criteria will the level of continuity be decided?
• What pedagogical styles operate in each curriculum area? What are the most frequent and most preferred? What is the balance of pedagogical styles? How is pedagogy influenced by resources? To what extent is continuity planned and recorded? On what criteria will the nature of continuity be decided? What kind of evidence is required to answer this question?
• On what aspects of pedagogy does planning take place? By what criteria will the level of success of continuity be judged? Over how many students/teachers/curriculum areas will the incidence of continuity have to occur for it to be judged successful? What kind of evidence is required to answer this question?
• Is continuity occurring by accident or design? How will the extent of planned and unplanned continuity be gauged? What kind of evidence is required to answer this question?
• Who has responsibility for continuity at the transition points? What is being undertaken by these people?
• How are records kept on continuity in the schools? Who keeps these records? What is recorded? How frequently are the records updated and reviewed? What kind of evidence is required to answer this question?
• What resources are there to support continuity at the point of transition? How adequate are these resources? What kind of evidence is required to answer this question?

It can be seen that these questions, several in number, have moved the research from simply an expression of interest (or a general aim) into a series of issues that lend themselves to being investigated in concrete terms. This is precisely what we mean by the process of operationalization. It is now possible not only to formulate the specific questions to be posed, but also to select appropriate instruments that will gather the data to answer them (e.g. semi-structured interviews, rating scales on questionnaires, or documentary analysis). By this process of operationalization we thus make a general purpose amenable to investigation, e.g. by measurement (Rose and Sullivan, 1993:6) or some other means.

In planning research it is important to clarify a distinction that needs to be made between methodology and methods, approaches and instruments, styles of research and ways of collecting data. Several of the later chapters of this book are devoted to specific instruments for collecting data, e.g.:

• interviews;
• questionnaires;
• observation;
• tests;
• accounts;
• biographies and case studies;
• role playing;
• simulations;
• personal constructs.

The decision on which instrument to use frequently follows from an important earlier decision on which kind of research to undertake, for example:

• a survey;
• an experiment;
• an in-depth ethnography;
• action research;
• case study research;
• testing and assessment.

Subsequent chapters of this book examine each of these research styles, their principles, rationales and purposes, and the instrumentation and data types that seem suitable for them. For conceptual clarity it is possible to set out some key features of these models (Box 3.1).1 It is intended that, when decisions have been reached on the stage of research design and methodology, a clear plan of action will have been prepared. To this end, considering models of research might be useful (Morrison, 1993).

Data analysis

The prepared researcher will need to consider the mode of data analysis to be employed. In some cases this is very important, as it has a specific bearing on the form of the instrumentation. For example, a researcher will need to plan the layout and structure of a questionnaire survey very carefully in order to assist data entry for computer reading and analysis; an inappropriate layout may obstruct data entry and subsequent analysis by computer. The planning of data analysis will need to consider:

What needs to be done with the data when they have been collected: how will they be processed and analysed?
How will the results of the analysis be verified, cross-checked and validated?

Decisions will need to be taken with regard to the statistical tests that will be used in data analysis, as this will affect the layout of research items (for example in a questionnaire), and the computer packages that are available for processing quantitative and qualitative data, e.g. SPSS and NUD.IST respectively.

For statistical processing the researcher will need to ascertain the level of data being processed: nominal, ordinal, interval or ratio (discussed in Chapter 10). Nominal and ordinal scales yield non-parametric data, i.e. data from populations where few or no assumptions are made about the distribution of the population or the characteristics of that population; the parameters of the population are unknown. Interval and ratio scales yield parametric data, on the basis of which assumptions are made about the characteristics and distribution of the wider population, i.e. the parameters of the population are known, and usually assume a normal, Gaussian curve of distribution, as in reading scores, for example. Non-parametric data are often derived from questionnaires and surveys (though these can also yield parametric data, see 'survey' in Box 3.1), whilst parametric data tend to be derived from experiments and tests.
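As a minimal sketch of how the level of measurement constrains the choice of statistic, the following Python fragment encodes a simplified version of the mapping set out in Box 3.2; the mapping is illustrative and not exhaustive:

```python
# Simplified mapping from level of measurement to commonly
# legitimate statistics (see Box 3.2 for the fuller picture).
LEGITIMATE = {
    "nominal": ["mode", "frequencies", "chi-square"],
    "ordinal": ["mode", "median", "frequencies", "chi-square",
                "Spearman rho", "Mann-Whitney U", "Kruskal-Wallis"],
    "interval": ["mean", "standard deviation", "z-scores",
                 "Pearson r", "t-test", "ANOVA"],
    "ratio":    ["mean", "standard deviation", "z-scores",
                 "Pearson r", "t-test", "ANOVA"],
}

def legitimate_statistics(level):
    """Return statistics commonly regarded as legitimate for a data level."""
    try:
        return LEGITIMATE[level]
    except KeyError:
        raise ValueError(f"unknown level of measurement: {level!r}")

print(legitimate_statistics("ordinal"))
```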

The choice of which statistics to employ is not arbitrary, and Box 3.2 sets out the commonly used statistics for different types of data (Siegel, 1956; Cohen and Holliday, 1996; Hopkins, Hopkins and Glass, 1996).


Box 3.1 Elements of research styles

Survey
Purposes: gathering large-scale data in order to make generalizations; generating statistically manipulable data; gathering context-free data.
Foci: opinions; scores; outcomes; conditions; ratings.
Key terms: measuring; testing; representativeness; generalizability.
Characteristics: describes and explains; represents a wide population; gathers numerical data; much use of questionnaires and assessment/test data.

Experiment
Purposes: comparing under controlled conditions; making generalizations about efficacy; objective measurement of treatment; establishing causality.
Foci: initial states, intervention and outcomes.
Key terms: pretest and post-test; control and experimental groups; identification, isolation and control of key variables; randomized controlled trials; causality.
Characteristics: treats situations like a laboratory; causes due to experimental intervention; generalizations; does not judge worth; simplistic.

Ethnography
Purposes: portrayal of events in subjects' terms; subjective reporting of multiple perspectives; description, understanding and explanation of a specific situation.
Foci: perceptions and views of participants; issues as they emerge over time.
Key terms: subjectivity; honesty, authenticity; non-generalizable; multiple perspectives; exploration and rich reporting of a specific context; emergent issues.
Characteristics: context-specific; formative and emergent; responsive to emerging features; allows room for judgements and multiple perspectives; wide data base gathered over a long period of time; time-consuming to process data.

Action research
Purposes: to plan, implement, review and evaluate an intervention designed to improve practice/solve a local problem; to empower participants through research involvement and ideology critique; to develop reflective practice; to promote equality and democracy; to link practice and research; to promote collaborative research.
Foci: everyday practices; outcomes of interventions; participant empowerment; reflective practice; social democracy and equality; decision-making.
Key terms: action; improvement; reflection; monitoring; evaluation; intervention; problem-solving; empowering; planning; reviewing.
Characteristics: context-specific; participants as researchers; reflection on practice; interventionist, leading to the solution of 'real' problems and meeting 'real' needs; empowering for participants; collaborative; promoting praxis and equality; stakeholder research.

Case study
Purposes: to portray, analyse and interpret the uniqueness of real individuals and situations through accessible accounts; to catch the complexity and situatedness of behaviour; to contribute to action and intervention; to present and represent reality, to give a sense of 'being there'.
Foci: individuals and local situations; unique instances; a single case; bounded phenomena and systems (individual, group, roles, organizations, community).
Key terms: individuality, uniqueness; in-depth analysis and portrayal; interpretive and inferential analysis; subjective; descriptive; analytical; understanding specific situations; sincerity; complexity; particularity.
Characteristics: in-depth, detailed data from a wide data source; participant and non-participant observation; non-interventionist; empathic; holistic treatment of phenomena; what can be learned from the particular case.

Testing and assessment
Purposes: to measure achievement and potential; to diagnose strengths and weaknesses; to assess performance and abilities.
Foci: academic and non-academic, cognitive, affective and psychomotor domains, from low order to high order; performance, achievement, potential, abilities; personality characteristics.
Key terms: reliability; validity; criterion-referencing; norm-referencing; domain-referencing; item-response; formative; summative; diagnostic; standardization; moderation.
Characteristics: materials designed to provide scores that can be aggregated; enables individuals and groups to be compared; in-depth diagnosis; measures performance.

Box 3.2 Statistics available for different types of data

Nominal
1 Mode (the score achieved by the greatest number of people). Is there a clear 'front runner' that receives the highest score, with low scoring on other categories, or is the modal score only narrowly leading the other categories? Are there two scores which are vying for the highest score (a bi-modal score)?
2 Frequencies. Which are the highest/lowest frequencies? Is the distribution even across categories?
3 Chi-square (χ²) (a statistic that charts the difference between statistically expected and actual scores). Are differences between scores caused by chance/accident, or are they statistically significant, i.e. not simply caused by chance?

Ordinal
1 Mode. Which score on a rating scale is the most frequent?
2 Median (the score gained by the middle person in a ranked group of people or, if there is an even number of cases, the score which is midway between the highest score obtained in the lower half of the cases and the lowest score obtained in the higher half of the cases). What is the score of the middle person in a list of scores?
3 Frequencies. Do responses tend to cluster around one or two categories of a rating scale? Are the responses skewed towards one end of a rating scale (e.g. 'strongly agree')? Do the responses pattern themselves consistently across the sample? Are the frequencies generally high or generally low (i.e. whether respondents tend to feel strongly about an issue)? Is there a clustering of responses around the central categories of a rating scale (the central tendency, respondents not wishing to appear to be too extreme)?
4 Chi-square (χ²). Are the frequencies of one set of nominal variables (e.g. sex) significantly related to a set of ordinal variables?
5 Spearman rank order correlation (a statistic to measure the degree of association between two ordinal variables). Do the results from one rating scale correlate with the results from another rating scale? Do the rank order positions for one variable correlate with the rank order positions for another variable?
6 Mann-Whitney U-test (a statistic to measure any significant difference between two independent samples). Is there a significant difference in the results of a rating scale for two independent samples (e.g. males and females)?
7 Kruskal-Wallis analysis of variance (a statistic to measure any significant differences between three or more independent samples). Is there a significant difference between three or more nominal variables (e.g. membership of political parties) and the results of a rating scale?

Interval and ratio
1 Mode.
2 Mean. What is the average score for this group?
3 Frequencies.
4 Median.
5 Chi-square (χ²).
6 Standard deviation (a measure of the dispersal of scores). Are the scores on a parametric test evenly distributed? Do scores cluster closely around the mean? Are scores widely spread around the mean? Are scores dispersed evenly? Are one or two extreme scores ('outliers') exerting a disproportionate influence on what are otherwise closely clustered scores?
7 z-scores (a statistic to convert scores from different scales, i.e. with different means and standard deviations, to a common scale, i.e. with the same mean and standard deviation, enabling different scores to be compared fairly). How do the scores obtained by students on a test which was marked out of 20 compare to the scores by the same students on a test which was marked out of 50?
8 Pearson product moment correlation (a statistic to measure the degree of association between two interval or ratio variables). Is there a correlation between one set of interval data (e.g. test scores for one examination) and another set of interval data (e.g. test scores on another examination)?
9 t-tests (a statistic to measure the difference between the means of one sample on two separate occasions, or between two samples on one occasion). Are the control and experimental groups matched in their mean scores on a parametric test? Is there a significant difference between the pretest and post-test scores of a sample group?
10 Analysis of variance (a statistic to ascertain whether two or more means differ significantly). Are the differences in the means between the test results of three groups statistically significant?
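To illustrate a few of the statistics in Box 3.2 in concrete form, the sketch below uses Python's SciPy library with invented scores (the data are not drawn from any study): a chi-square test on nominal counts, a Spearman correlation between two rating scales, and z-scores to compare marks from tests with different maxima:

```python
import numpy as np
from scipy import stats

# Nominal data: observed category counts against expected-equal counts.
observed = np.array([30, 14, 16])
chi2, p = stats.chisquare(observed)
print(f"chi-square = {chi2:.2f}, p = {p:.3f}")

# Ordinal data: do two rating scales correlate (Spearman rho)?
scale_a = [4, 5, 2, 3, 5, 1, 4]
scale_b = [3, 5, 1, 3, 4, 2, 4]
rho, p = stats.spearmanr(scale_a, scale_b)
print(f"Spearman rho = {rho:.2f}, p = {p:.3f}")

# Interval data: compare a mark out of 20 with a mark out of 50
# by converting both sets of scores to z-scores (common scale).
test20 = np.array([12, 15, 9, 18, 14])
test50 = np.array([30, 41, 22, 47, 35])
print("z-scores (test /20):", np.round(stats.zscore(test20), 2))
print("z-scores (test /50):", np.round(stats.zscore(test50), 2))
```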


For qualitative data analysis researchers have at their disposal a range of techniques, for example (Hammersley, 1979):

• coding of field notes (Miles and Huberman, 1984);
• content analysis of field notes or qualitative data (see Chapter 6);
• cognitive mapping (Jones, 1987; Morrison, 1993);
• seeking patterning of responses;
• looking for causal pathways and connections (Miles and Huberman, 1984);
• presenting cross-site analysis (ibid.);
• case studies;
• personal constructs;
• narrative accounts;
• action research analysis;
• analytic induction (Denzin, 1970);
• constant comparison (Glaser and Strauss, 1967);
• grounded theory (Glaser and Strauss, 1967);
• discourse analysis (Stillar, 1998);
• biographies and life histories (Atkinson, 1998).

The criteria for deciding which forms of data analysis to undertake are governed both by fitness for purpose and legitimacy: the form of data analysis must be appropriate for the kinds of data gathered. For example, it would be inappropriate to use certain statistics with certain kinds of numerical data (e.g. using means on nominal data), or to use causal pathways on unrelated cross-site analysis.
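As a minimal sketch of the first of these techniques, coding of field notes, the following Python fragment tallies coded segments so that patterns of response can be sought; the codes and notes are invented for illustration:

```python
from collections import Counter

# Field-note segments, each tagged with an analytic code.
coded_segments = [
    ("teacher talk dominates the lesson", "pedagogy"),
    ("pupils ask for help from peers", "collaboration"),
    ("worksheet identical to the primary school's", "continuity"),
    ("teacher refers back to a Year 6 topic", "continuity"),
]

# Count how often each code occurs, most frequent first.
code_counts = Counter(code for _, code in coded_segments)
for code, n in code_counts.most_common():
    print(f"{code}: {n} segment(s)")
```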

Presenting and reporting the results

As with the stage of planning data analysis, the prepared researcher will need to consider the form of the reporting of the research and its results, giving due attention to the needs of different audiences (for example, an academic audience may require different contents from a wider professional audience and, a fortiori, from a lay audience). Decisions here will need to consider:

How to write up and report the research?
When to write up and report the research (e.g. ongoing or summative)?
How to present the results in tabular and/or written-out form?
How to present the results in non-verbal forms?
To whom to report (the necessary and possible audiences of the research)?
How frequently to report?

A planning matrix for research

In planning a piece of research the range of questions to be addressed can be set into a matrix. Box 3.3 provides such a matrix, in the left-hand column of which are the questions which figure in the four main areas set out so far:

(i) orienting decisions;
(ii) research design and methodology;
(iii) data analysis;
(iv) presenting and reporting the results.

Questions 1–10 are the orienting decisions, questions 11–22 concern the research design and methodology, questions 23–24 cover data analysis, and questions 25–30 deal with presenting and reporting the results. Within each of the thirty questions there are several sub-questions which research planners may need to address. For example, within question 5 ('What are the purposes of the research?') the researcher would have to differentiate major and minor purposes, explicit and maybe implicit purposes, whose purposes are being served by the research, and whose interests are being served by it. An example of these sub-issues and problems is contained in the second column.

At this point the planner is still at the divergent phase of the research planning, dealing with planned possibilities (Morrison, 1993:19), opening up the research to all facets and interpretations. In the column headed 'decisions' the research planner is moving towards a convergent phase, where planned possibilities become visible within the terms of constraints available to the researcher. To do this the researcher has to move down the column marked 'decisions' to see how well the decision which is taken in regard to one issue/question fits in with the decisions in regard to other issues/questions. For one decision to fit with another, four factors must be present:

1 All of the cells in the 'decisions' column must be coherent: they must not contradict each other.
2 All of the cells in the 'decisions' column must be mutually supporting.
3 All of the cells in the 'decisions' column must be practicable when taken separately.
4 All of the cells in the 'decisions' column must be practicable when taken together.
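The first of these factors, coherence, lends itself to a simple mechanical check. The sketch below is an illustrative Python fragment in which the decision entries and the rule for detecting contradictions are invented assumptions; a real check would rest on the researcher's own judgement of what conflicts with what:

```python
from itertools import combinations

# Hypothetical entries from the 'decisions' column of the matrix.
decisions = {
    "time scale": "six months",
    "sample": "interview 200 teachers individually",
    "instrumentation": "semi-structured interviews",
}

# Invented rule: pairs of cells known (for this example) to conflict.
CONTRADICTIONS = {
    ("time scale", "sample"): "200 individual interviews cannot be "
                              "completed and transcribed in six months",
}

def check_coherence(decisions):
    """Factor 1: no two cells may contradict each other."""
    problems = []
    for a, b in combinations(decisions, 2):
        reason = CONTRADICTIONS.get((a, b)) or CONTRADICTIONS.get((b, a))
        if reason:
            problems.append(f"{a} vs {b}: {reason}")
    return problems

for problem in check_coherence(decisions):
    print("conflict:", problem)
```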


Box 3.3 A matrix for planning research

Orienting decisions

1 Who wants the research?
Sub-issues and problems: Is the research going to be useful? Who might wish to use the research? Are the data going to be public? What if different people want different things from the research? Can people refuse to participate?
Decisions: Find out the controls over the research which can be exercised by respondents. What are the scope and audiences of the research? Determine the reporting mechanisms.

2 Who will receive the research?
Sub-issues and problems: Will participants be able to veto the release of parts of the research to specified audiences? Will participants be able to give the research to whomsoever they wish? Will participants be told to whom the research will go?
Decisions: Determine the proposed internal and external audiences of the research. Determine the controls over the research which can be exercised by the participants. Determine the rights of the participants and the researcher to control the release of the research.

3 What powers do the recipients of the research have?
Sub-issues and problems: What use will be made of the research? How might the research be used for or against the participants? What might happen if the data fall into the 'wrong' hands? Will participants know in advance what use will and will not be made of the research?
Decisions: Determine the rights of recipients to do what they wish with the research. Determine the respondents' rights to protection as a result of the research.

4 What are the time scales of the research?
Sub-issues and problems: Is there enough time to do all the research? How to decide what is to be done within the time scale?
Decisions: Determine the time scales and timing of the research.

5 What are the purposes of the research?
Sub-issues and problems: What are the formal and hidden agendas here? Whose purposes are being served by the research? Who decides the purposes of the research? How will different purposes be served in the research?
Decisions: Determine all the possible uses of the research. Determine the powers of the respondents to control the uses made of the research. Decide on the form of reporting and the intended and possible audiences of the research.

6 What are the research questions?
Sub-issues and problems: Who decides what the questions will be? Do participants have rights to refuse to answer or take part? Can participants add their own questions?
Decisions: Determine the participants' rights and powers to participate in the planning, form and conduct of the research. Decide the balance of all interests in the research.

7 What must be the focus in order to answer the research questions?
Sub-issues and problems: Is sufficient time available to focus on all the necessary aspects of the research? How will the priority foci be decided? Who decides the foci?
Decisions: Determine all the aspects of the research, prioritize them, and agree on the minimum necessary areas of the research. Determine decision-making powers on the research.

8 What costs are there (human, material, physical, administrative, temporal)?
Sub-issues and problems: What support is available for the researcher? What materials are necessary?
Decisions: Cost out the research.

9 Who owns the research?
Sub-issues and problems: Who controls the release of the report? What protection can be given to participants? Will participants be identified and identifiable/traceable? Who has the ultimate decision on what data are included?
Decisions: Determine who controls the release of the report. Decide the rights and powers of the researcher. Decide the rights of veto. Decide how to protect those who may be identified/identifiable in the research.

10 At what point does the ownership pass from the respondent to the researcher and from the researcher to the recipients?
Sub-issues and problems: Who decides the ownership of the research? Can participants refuse to answer certain parts if they wish, or, if they have the option not to take part, must they opt out of everything? Can the researcher edit out certain responses?
Decisions: Determine the ownership of the research at all stages of its progress. Decide the options available to the participants. Decide the rights of different parties in the research, e.g. respondents, researcher, recipients.

Research design and methodology

11 What are the specific purposes of the research?
Sub-issues and problems: How do these purposes derive from the overall aims of the research? Will some areas of the broad aims be covered, or will the specific research purposes have to be selective? What priorities are there?
Decisions: Decide the specific research purposes and write them as concrete questions.

12 How are the general research purposes and aims operationalized into specific research questions?
Sub-issues and problems: Do the specific research questions together cover all the research purposes? Are the research questions sufficiently concrete as to suggest the kinds of answers and data required and the appropriate instrumentation and sampling? How to balance adequate coverage of research purposes with the risk of producing an unwieldy list of sub-questions?
Decisions: Ensure that each main research purpose is translated into specific, concrete questions that, together, address the scope of the original research questions. Ensure that the questions are sufficiently specific as to suggest the most appropriate data types, kinds of answers required, sampling, and instrumentation. Decide how to ensure that any selectivity still represents the main fields of the research questions.

13 What are the specific research questions?
Sub-issues and problems: Do the specific research questions demonstrate construct and content validity?
Decisions: Ensure that the coverage and operationalization of the specific questions address content and construct validity respectively.

14 What needs to be the focus of the research in order to answer the research questions?
Sub-issues and problems: How many foci are necessary? Are the foci clearly identifiable and operationalizable?
Decisions: Decide the number of foci of the research questions. Ensure that the foci are clear and can be operationalized.

15 What is the main methodology of the research?
Sub-issues and problems: How many methodologies are necessary? Are several methodologies compatible with each other? Will a single focus/research question require more than one methodology (e.g. for triangulation and concurrent validity)?
Decisions: Decide the number, type and purposes of the methodologies to be used. Decide whether one or more methodologies is necessary to gain answers to specific research questions. Ensure that the most appropriate form of methodology is employed.

16 How will validity and reliability be addressed?
Sub-issues and problems: Will there be the opportunity for cross-checking? Will the depth and breadth required for content validity be feasible within the constraints of the research (e.g. time constraints, instrumentation)? In what senses are the research questions valid (e.g. construct validity)? Are the questions fair? How does the researcher know if people are telling the truth? What kinds of validity and reliability are to be addressed? How will the researcher take back the research to respondents for them to check that the interpretations are fair and acceptable? How will data be gathered consistently over time? How to ensure that each respondent is given the same opportunity to respond?
Decisions: Determine the process of respondent validation of the data. Decide a necessary minimum of topics to be covered. Subject the plans to scrutiny by critical friends ('jury' validity). Pilot the research. Build in cross-checks on data. Address the appropriate forms of reliability and validity. Decide the questions to be asked and the methods used to ask them. Determine the balance of open and closed questions.

17 How will reflexivity be addressed?
Sub-issues and problems: How will reflexivity be recognized? Is reflexivity a problem? How can reflexivity be included in the research?
Decisions: Determine the need to address reflexivity and to make this public. Determine how to address reflexivity in the research.

18 What kinds of data are required?
Sub-issues and problems: Does the research need words, numbers or both? Does the research need opinions, facts or both? Does the research seek to compare responses and results or simply to illuminate an issue?
Decisions: Determine the most appropriate types of data for the foci and research questions. Balance objective and subjective data. Determine the purposes of collecting different types of data and the ways in which they can be processed.

19 From whom will data be acquired (i.e. sampling)?
Sub-issues and problems: Will there be adequate time to go to all the relevant parties? What kind of sample is required (e.g. probability/non-probability/random/stratified etc.)? How to achieve a representative sample (if required)?
Decisions: Determine the minimum and maximum sample. Decide on the criteria for sampling. Decide the kind of sample required. Decide the degree of representativeness of the sample. Decide how to follow up and not to follow up on the data gathered.

20 Where else will data be available?
Sub-issues and problems: What documents and other written sources of data can be used? How to access and use confidential material? What will be the positive or negative effects on individuals of using certain documents?
Decisions: Determine the necessary/desirable/possible documentary sources. Decide access and publication rights and protection of sensitive data.

21 How will the data be gathered (i.e. instrumentation)?
Sub-issues and problems: What methods of data gathering are available and appropriate to yield data to answer the research questions? What methods of data gathering will be used? How to construct interview schedules/questionnaires/tests/observation schedules? What will be the effects of observing participants? How many methods should be used (e.g. to ensure reliability and validity)? Is it necessary or desirable to use more than one method of data collection on the same issue? Will many methods yield more reliable data? Will some methods be unsuitable for some people or for some issues?
Decisions: Determine the most appropriate data collection instruments to gather data to answer the research questions. Pilot the instruments and refine them subsequently. Decide the strengths and weaknesses of different data collection instruments in the short and long term. Decide which methods are most suitable for which issues. Decide which issues will require more than one data collection instrument. Decide whether the same data collection methods will be used with all the participants.

22 Who will undertake the research?
Sub-issues and problems: Can different people plan and carry out different parts of the research?
Decisions: Decide who will carry out the data collection, processing and reporting.

Data analysis

23 How will the data be analysed?
Sub-issues and problems: Are the data to be processed numerically or verbally? What computer packages are available to assist data processing and analysis? What statistical tests will be needed? How to perform a content analysis of word data? How to summarize and present word data? How to process all the different responses to open-ended questions? Will the data be presented person by person, issue by issue, aggregated to groups, or a combination of these? Does the research seek to make generalizations? Who will process the data?
Decisions: Clarify the legitimate and illegitimate methods of data processing and analysis of quantitative and qualitative data. Decide which methods of data processing and analysis are most appropriate for which types of data and for which research questions. Check that the data processing and analysis will serve the research purposes. Determine the data protection issues if data are to be processed by 'outsiders' or particular 'insiders'.

24 How to verify and validate the data and their interpretation?
Sub-issues and problems: What opportunities will there be for respondents to check the researcher's interpretation? At what stages of the research is validation necessary? What will happen if respondents disagree with the researcher's interpretation?
Decisions: Determine the process of respondent validation during the research. Decide the reporting of multiple perspectives and interpretations. Decide respondents' rights to have their views expressed or to veto reporting.

Presenting and reporting the results

25 How to write up and report the research?
Sub-issues and problems: Who will write the report and for whom? How detailed must the report be? What must the report contain? What channels of dissemination of the research are to be used?
Decisions: Ensure that the most appropriate form of reporting is used for the audiences. Keep the report as short, clear and complete as possible. Provide summaries if possible/fair. Ensure that the report enables fair critique and evaluation to be undertaken.

26 When to write up and report the research (e.g. ongoing or summative)?
Sub-issues and problems: How many times are appropriate for reporting? For whom are interim reports compiled? Which reports are public?
Decisions: Decide the most appropriate timing, purposes and audiences of the reporting. Decide the status of the reporting (e.g. formal, informal, public, private).

27 How to present the results in tabular and/or written-out form?
Sub-issues and problems: How to ensure that everyone will understand the language or the statistics? How to respect the confidentiality of the participants? How to report multiple perspectives?
Decisions: Decide the most appropriate form of reporting. Decide whether to provide a glossary of terms. Decide the format(s) of the reports. Decide the number and timing of the reports. Decide the protection of the individual's rights, balancing this with the public's rights to know.

28 How to present the results in non-verbal forms?
Sub-issues and problems: Will different parties require different reports? How to respect the confidentiality of the participants? How to report multiple perspectives?
Decisions: Decide the most appropriate form of reporting. Decide the number and timing of the reports. Ensure that a written record is kept of oral reports. Decide the protection of the individual's rights, balancing this with the public's rights to know.

29 To whom to report (the necessary and possible audiences of the research)?
Sub-issues and problems: Do all participants receive a report? What will be the effects of not reporting to stakeholders?
Decisions: Identify the stakeholders. Determine the least and the most material to be made available to the stakeholders.

30 How frequently to report?
Sub-issues and problems: Is it necessary to provide interim reports? If interim reports are provided, how might this affect the future reports or the course of the research?
Decisions: Decide on the timing and frequency of the reporting. Determine the formative and summative nature of the reports.


Not all of the planned possibilities might be practicable when these four criteria are applied. It would be of very little use if the methods of data collection listed in the 'decisions' column of question 21 ('How will the data be gathered?') offered little opportunity to fulfil the needs of acquiring information to answer question 7 ('What must be the focus in order to answer the research questions?'), or if the methods of data collection were impracticable within the time scales available in question 4.

In the matrix of Box 3.3 the cells have been completed in a deliberately content-free way, i.e. the matrix as presented here does not deal with the specific, actual points which might emerge in a particular research proposal. If the matrix were to be used for planning an actual piece of research, then, instead of couching the wording of each cell in generalized terms, it would be more useful if specific, concrete responses were given which addressed particular issues and concerns in the research proposal in question.

Many of these questions concern rights, responsibilities and the political uses (and abuses) of the research. This underlines the view that research is an inherently political and moral activity; it is not politically or morally neutral. The researcher has to be concerned with the uses as well as the conduct of the research.

Managing the planning of research

The preceding discussion has revealed the complexity of planning a piece of research, yet it should not be assumed that research will always go according to plan! For example, the mortality of the sample might be a feature (participants leaving during the research), or a poor response rate to questionnaires might be encountered, rendering subsequent analysis, reporting and generalization problematical; administrative support might not be forthcoming, or there might be serious slippage in the timing. This is not to say that a plan for the research should not be made; rather it is to suggest that it is dangerous to put absolute faith in it!

To manage the complexity in planning outlined above, a simple four-stage model can be proposed:

Stage 1 Identify the purposes of the research.
Stage 2 Identify and give priority to the constraints under which the research will take place.
Stage 3 Plan the possibilities for the research within these constraints.
Stage 4 Decide the research design.

Each stage contains several operations. Box 3.4 clarifies this four-stage model, drawing out the various operations contained in each stage. It may be useful for research planners to consider which instruments will be used at which stage of the research and with which sectors of the sample population. Box 3.5 sets out a matrix of these for planning (see also Morrison, 1993:109), for example, of a small-scale piece of research.

Box 3.4 A planning sequence for research

A matrix approach such as this enables research planners to see at a glance their coverage of the sample and of the instruments used at particular points in time, making omissions clear, and promoting such questions as:

• Why are certain instruments used at certain times and not at others?
• Why are certain instruments used with certain people and not with others?
• Why do certain times in the research use more instruments than other times?
• Why is there such a heavy concentration of instruments at the end of the study?
• Why are certain groups involved in more instruments than other groups?
• Why are some groups apparently neglected (e.g. parents), e.g. is there a political dimension to the research?
• Why are questionnaires the main kinds of instrument to be used?


• Why are some instruments (e.g. observation, testing) not used at all?
• What makes the five stages separate?
• Are documents only held by certain parties (and, if so, might one suspect an 'institutional line' to be revealed in them)?
• Are some parties more difficult to contact than others (e.g. University teacher educators)?
• Are some parties more important to the research than others (e.g. the principals)?
• Why are some parties excluded from the sample (e.g. school governors, policy-makers, teachers' associations and unions)?
• What is the difference between the three groups of teachers?

Matrix planning is useful for exposing key features of the planning of research. Further matrices might be constructed to indicate other features of the research, for example:

• the timing of the identification of the sample;
• the timing of the release of interim reports;
• the timing of the release of the final report;
• the timing of pretests and post-tests (in an experimental style of research);
• the timing of intensive necessary resource support (e.g. reprographics);
• the timing of meetings of interested parties.

These examples cover timings only; other matrices might be developed to cover other combinations, for example: reporting by audiences; research team meetings by reporting; instrumentation by participants, etc. They are useful summary devices.

Conclusion

This chapter has suggested how a research plan can be formulated and operationalized, moving from general areas of interest, questions and purposes to very specific research questions which can be answered using appropriate sampling procedures, methodologies and instruments, and with the gathering of relevant data. The message from this chapter is that, while research may not always unfold according to plan, it is important to have thought out the several stages and elements of research so that coherence and practicability have been addressed within an ethically defensible context. Such planning can be usefully informed by models of research, and, indeed, these are addressed in several chapters of the book. The planning of research begins with the identification of purposes and constraints. With these in mind, the researcher can now decide on a research design and strategy that will provide him or her with answers to specific research questions. These in turn will serve more general research purposes and aims.

Box 3.5 A planning matrix for research

Time sample: Stage 1 (start) | Stage 2 (3 months) | Stage 3 (6 months) | Stage 4 (9 months) | Stage 5 (12 months)

Principal/headteacher: Documents, Interview, Questionnaire 1 | Interview | Documents, Questionnaire 2 | Interview | Documents, Interview, Questionnaire 3
Teacher group 1: Questionnaire 1 | — | Questionnaire 2 | — | Questionnaire 3
Teacher group 2: Questionnaire 1 | — | Questionnaire 2 | — | Questionnaire 3
Teacher group 3: Questionnaire 1 | — | Questionnaire 2 | — | Questionnaire 3
Students: — | — | Questionnaire 2 | — | Interview
Parents: Questionnaire 1 | — | Questionnaire 2 | — | Questionnaire 3
University teacher educators: Interview, Documents | — | — | — | Interview, Documents


Both the novice and the experienced researcher alike have to confront the necessity of having a clear plan of action if the research is to have momentum and purpose. The notion of 'fitness for purpose' reigns here; the research plan must suit the purposes of the research. If the reader is left feeling, at the end of this chapter, that the task of research is complex, then that is an important message, for rigour and thoughtful, thorough planning are necessary if the research is to be worthwhile and effective.


4 Sampling

Introduction

The quality of a piece of research not only stands or falls by the appropriateness of methodology and instrumentation but also by the suitability of the sampling strategy that has been adopted (see also Morrison, 1993:112–17). Questions of sampling arise directly out of the issue of defining the population on which the research will focus. Researchers must take sampling decisions early in the overall planning of a piece of research. Factors such as expense, time and accessibility frequently prevent researchers from gaining information from the whole population. Therefore they often need to be able to obtain data from a smaller group or subset of the total population in such a way that the knowledge gained is representative of the total population (however defined) under study. This smaller group or subset is the sample. Experienced researchers start with the total population and work down to the sample. By contrast, less experienced researchers often work from the bottom up; that is, they determine the minimum number of respondents needed to conduct the research (Bailey, 1978). However, unless they identify the total population in advance, it is virtually impossible for them to assess how representative the sample is that they have drawn.

Suppose that a class teacher has been released from her teaching commitments for one month in order to conduct some research into the abilities of 13-year-old students to undertake a set of science experiments; that the research is to draw on three secondary schools which contain 300 such students each, a total of 900 students; and that the method that the teacher has been asked to use for data collection is a semi-structured interview. Because of the time available to the teacher it would be impossible for her to interview all 900 students (the total population being all the cases). Therefore she has to be selective and to interview fewer than all 900 students. How will she decide that selection; how will she select which students to interview?

If she were to interview 200 of the students, would that be too many? If she were to interview just twenty of the students, would that be too few? If she were to interview just the males or just the females, would that give her a fair picture? If she were to interview just those students whom the science teachers had decided were 'good at science', would that yield a true picture of the total population of 900 students? Perhaps it would be better for her to interview those students who were experiencing difficulty in science and who did not enjoy science, as well as those who were 'good at science'. So she turns up on the days of the interviews only to find that those students who do not enjoy science have decided to absent themselves from the science lesson. How can she reach those students?

Decisions and problems such as these face researchers in deciding the sampling strategy to be used. Judgements have to be made about four key factors in sampling:

1 the sample size;
2 the representativeness and parameters of the sample;
3 access to the sample;
4 the sampling strategy to be used.

The decisions here will determine the sampling strategy to be used.


The sample size

A question that often plagues novice researchers is just how large their samples for the research should be. There is no clear-cut answer, for the correct sample size depends on the purpose of the study and the nature of the population under scrutiny. However, it is possible to give some advice on this matter. Thus, a sample size of thirty is held by many to be the minimum number of cases if researchers plan to use some form of statistical analysis on their data. Of more import to researchers is the need to think out in advance of any data collection the sorts of relationships that they wish to explore within subgroups of their eventual sample. The number of variables researchers set out to control in their analysis and the types of statistical tests that they wish to make must inform their decisions about sample size prior to the actual research undertaking.

As well as the requirement of a minimum number of cases in order to examine relationships between subgroups, researchers must obtain the minimum sample size that will accurately represent the population being targeted. With respect to size, will a large one guarantee representativeness? Surely not! In the example above the researcher could have interviewed a total sample of 450 females and still not have represented the male population. Will a small size guarantee representativeness? Again, surely not! The latter falls into the trap of saying that 50 per cent of those who expressed an opinion said that they enjoyed science, when the 50 per cent was only one student, the researcher having interviewed only two students in all. Furthermore, too large a sample might become unwieldy and too small a sample might be unrepresentative (e.g. in the first example, the researcher might have wished to interview 450 students but this would have been unworkable in practice, or the researcher might have interviewed only ten students, which would have been unrepresentative of the total population of 900 students).

Where simple random sampling is used, the sample size needed to reflect the population value of a particular variable depends both on the size of the population and the amount of heterogeneity in the population (Bailey, 1978). Generally, for populations of equal heterogeneity, the larger the population, the larger the sample that must be drawn. For populations of equal size, the greater the heterogeneity on a particular variable, the larger the sample that is needed. To the extent that a sample fails to represent accurately the population involved, there is sampling error, discussed below.

Sample size is also determined to some extent by the style of the research. For example, a survey style usually requires a large sample, particularly if inferential statistics are to be calculated. In an ethnographic or qualitative style of research it is more likely that the sample size will be small. Sample size might also be constrained by cost—in terms of time, money, stress, administrative support, the number of researchers, and resources. Borg and Gall (1979:194–5) suggest that correlational research requires a sample size of no fewer than thirty cases, that causal-comparative and experimental methodologies require a sample size of no fewer than fifteen cases, and that survey research should have no fewer than 100 cases in each major subgroup and twenty to fifty in each minor subgroup.

They advise (ibid.:186) that sample size has to begin with an estimation of the smallest number of cases in the smallest subgroup of the sample, and 'work up' from that, rather than vice versa. So, for example, if 5 per cent of the sample must be teenage boys, and this sub-sample must be thirty cases (e.g. for correlational research), then the total sample will be 30÷0.05=600; if 15 per cent of the sample must be teenage girls and the sub-sample must be forty-five cases, then the total sample must be 45÷0.15=300 cases.
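As a minimal sketch of this 'work up from the smallest subgroup' rule (in Python; the function name is ours, and the figures are simply the chapter's own worked examples):

```python
import math

def total_sample_size(subgroup_min: int, subgroup_proportion: float) -> int:
    """Total sample needed so that a subgroup forming the given proportion
    of the sample still contains at least subgroup_min cases."""
    return math.ceil(subgroup_min / subgroup_proportion)

print(total_sample_size(30, 0.05))  # teenage boys: 30 / 0.05 = 600
print(total_sample_size(45, 0.15))  # teenage girls: 45 / 0.15 = 300
```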

The size of a probability (random) sample can be determined in two ways: either by the researcher exercising prudence and ensuring that the sample represents the wider features of the population with the minimum number of cases, or by using a table which, from a mathematical formula, indicates the appropriate size of a random sample for a given number of the wider population (Morrison, 1993:117). One such example is provided by Krejcie and Morgan (1970) in Box 4.1. This suggests that if the researcher were devising a sample from a wider population of thirty or fewer (e.g. a class of students or a group of young children in a class) then she/he would be well advised to include the whole of the wider population as the sample.

The key point to note about the sample size in Box 4.1 is that the smaller the number of cases there are in the wider, whole population, the larger the proportion of that population must be which appears in the sample; the converse of this is true: the larger the number of cases there are in the wider, whole population, the smaller the proportion of that population can be which appears in the sample. Krejcie and Morgan (1970) note that 'as the population increases the sample size increases at a diminishing rate and remains constant at slightly more than 380 cases' (ibid.:610). Hence, for example, a piece of research involving all the children in a small primary or elementary school (up to 100 students in all) might require between 80 per cent and 100 per cent of the school to be included in the sample, whilst a large secondary school of 1,200 students might require a sample of 25 per cent of the school in order to achieve randomness. As a rough guide in a random sample, the larger the sample, the greater is its chance of being representative.
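Krejcie and Morgan's table is generated from a formula rather than listed arbitrarily; a minimal sketch (assuming the commonly cited chi-square form of their formula, with χ²=3.841 for 95 per cent confidence, a population proportion of 0.5 and a 5 per cent margin of error) reproduces the Box 4.1 values to within a case or so of rounding:

```python
import math

def required_sample(N: int, chi2: float = 3.841, P: float = 0.5, d: float = 0.05) -> int:
    """Sample size S for a population of size N (after Krejcie and Morgan, 1970)."""
    return math.ceil(chi2 * N * P * (1 - P) / (d**2 * (N - 1) + chi2 * P * (1 - P)))

for N in (30, 100, 1_000, 10_000, 100_000):
    print(N, required_sample(N))  # 28, 80, 278, 370, 383 -- cf. Box 4.1: 28, 80, 278, 370, 384
```

The same formula with these defaults also tracks the 5 per cent/95 per cent column of Box 4.2 below to within a case.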

Another approach to determining sample size for a probability sample is in relation to the confidence level and sampling error. For example, with confidence levels of 95 per cent and 99 per cent and sampling errors of 5 per cent and 1 per cent respectively, the following can be set as sample sizes (Box 4.2). As with the table from Krejcie and Morgan earlier, we can see that the size of the sample reduces at an increasing rate as the population size increases; generally (but, clearly, not always) the larger the population, the smaller the proportion of the probability sample can be.

Borg and Gall (1979:195) suggest that, as a general rule, sample sizes should be large where:

• there are many variables;
• only small differences or small relationships are expected or predicted;
• the sample will be broken down into subgroups;
• the sample is heterogeneous in terms of the variables under study;
• reliable measures of the dependent variable are unavailable.

Oppenheim (1992:44) adds to this the view that the nature of the scales to be used also exerts an influence on the sample size.

Box 4.1 Determining the size of a random sample

N S | N S | N S
10 10 | 220 140 | 1,200 291
15 14 | 230 144 | 1,300 297
20 19 | 240 148 | 1,400 302
25 24 | 250 152 | 1,500 306
30 28 | 260 155 | 1,600 310
35 32 | 270 159 | 1,700 313
40 36 | 280 162 | 1,800 317
45 40 | 290 165 | 1,900 320
50 44 | 300 169 | 2,000 322
55 48 | 320 175 | 2,200 327
60 52 | 340 181 | 2,400 331
65 56 | 360 186 | 2,600 335
70 59 | 380 191 | 2,800 338
75 63 | 400 196 | 3,000 341
80 66 | 420 201 | 3,500 346
85 70 | 440 205 | 4,000 351
90 73 | 460 210 | 4,500 354
95 76 | 480 214 | 5,000 357
100 80 | 500 217 | 6,000 361
110 86 | 550 226 | 7,000 364
120 92 | 600 234 | 8,000 367
130 97 | 650 242 | 9,000 368
140 103 | 700 248 | 10,000 370
150 108 | 750 254 | 15,000 375
160 113 | 800 260 | 20,000 377
170 118 | 850 265 | 30,000 379
180 123 | 900 269 | 40,000 380
190 127 | 950 274 | 50,000 381
200 132 | 1,000 278 | 75,000 382
210 136 | 1,100 285 | 100,000 384

Notes: N = population size; S = sample size.
Source: Krejcie and Morgan, 1970


For nominal data the sample sizes may well have to be larger than for interval and ratio data (i.e. a variant of the issue of the number of subgroups to be addressed: the greater the number of subgroups or possible categories, the larger the sample will have to be).

Borg and Gall (ibid.) set out a formula-driven approach to determining sample size (see also Moser and Kalton, 1977; Ross and Rust, 1997:427–38), and they also suggest using correlational tables for correlational studies—available in most texts on statistics—'in reverse', as it were, to determine sample size (p. 201), i.e. looking at the significance levels of correlation coefficients and then reading off the sample sizes usually required to demonstrate that level of significance. For example, a correlational significance level of 0.01 would require a sample size of 10 if the estimated coefficient of correlation is 0.65, a sample size of 20 if the estimated correlation coefficient is 0.45, and a sample size of 100 if the estimated correlation coefficient is 0.20. Again, an inverse proportion can be seen—the larger the sample size, the smaller the estimated correlation coefficient can be to be deemed significant.

With both qualitative and quantitative data, the essential requirement is that the sample is representative of the population from which it is drawn. In a dissertation concerned with a life history (i.e. n=1), the sample is the population!

Qualitative data

In a qualitative study of thirty highly able girls of similar socio-economic background following an A-level Biology course, a sample of five or six may suffice the researcher who is prepared to obtain additional corroborative data by way of validation.

Where there is heterogeneity in the population, then a larger sample must be selected on some basis that respects that heterogeneity. Thus, from a staff of sixty secondary school teachers differentiated by gender, age, subject specialism, management or classroom responsibility, etc., it would be insufficient to construct a sample consisting of ten female classroom teachers of Arts and Humanities subjects.

Quantitative data

For quantitative data, a precise sample number can be calculated according to the level of accuracy and the level of probability that the researcher requires in her work. She can then report in her study the rationale and the basis of her research decision (Blalock, 1979).

By way of example, suppose a teacher/researcher wishes to sample opinions among 1,000 secondary school students.

Box 4.2 Sample size, confidence levels and sampling error

Size of total population (N) | Sample size (S) at 5% sampling error, 95% confidence level | Sample size (S) at 1% sampling error, 99% confidence level
50 | 44 | 50
100 | 79 | 99
200 | 132 | 196
500 | 217 | 476
1,000 | 278 | 907
2,000 | 322 | 1,661
5,000 | 357 | 3,311
10,000 | 370 | 4,950
20,000 | 377 | 6,578
50,000 | 381 | 8,195
100,000 | 383 | 8,926
1,000,000 | 384 | 9,706


She intends to use a 10-point scale ranging from 1=totally unsatisfactory to 10=absolutely fabulous. She already has data from her own class of thirty students and suspects that the responses of other students will be broadly similar. Her own students rated the activity (an extra-curricular event) as follows: mean score=7.27; standard deviation=1.98. In other words, her students were pretty much 'bunched' about a warm, positive appraisal on the 10-point scale. How many of the 1,000 students does she need to sample in order to gain an accurate (i.e. reliable) assessment of what the whole school (n=1,000) thinks of the extra-curricular event? It all depends on what degree of accuracy and what level of probability she is willing to accept.

A simple calculation from a formula by Blalock (1979:215–18) shows that (a sketch of the calculation follows the list):

• if she is happy to be within + or - 0.5 of a scale point and accurate 19 times out of 20, then she requires a sample of 60 out of the 1,000;
• if she is happy to be within + or - 0.5 of a scale point and accurate 99 times out of 100, then she requires a sample of 104 out of the 1,000;
• if she is happy to be within + or - 0.5 of a scale point and accurate 999 times out of 1,000, then she requires a sample of 170 out of the 1,000;
• if she is a perfectionist and wishes to be within + or - 0.25 of a scale point and accurate 999 times out of 1,000, then she requires a sample of 679 out of the 1,000.
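These figures follow from the standard formula n = (z × SD / e)² for estimating a mean to within a margin e; a minimal sketch (assuming that Blalock's calculation reduces to this form; the z-values are the usual standard-normal points for 95, 99 and 99.9 per cent confidence):

```python
sd = 1.98  # standard deviation observed in the teacher's own class

def sample_size_for_mean(sd: float, margin: float, z: float) -> int:
    """Sample size to estimate a mean to within +/- margin at confidence z."""
    return round((z * sd / margin) ** 2)  # rounded to the nearest whole case

for margin, z, odds in [(0.5, 1.96, "19/20"), (0.5, 2.576, "99/100"),
                        (0.5, 3.291, "999/1000"), (0.25, 3.291, "999/1000")]:
    print(odds, margin, sample_size_for_mean(sd, margin, z))  # 60, 104, 170, 679
```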

Determining the size of the sample will also have to take account of attrition and respondent mortality, i.e. that some participants will leave the research or fail to return questionnaires. Hence it is advisable to overestimate rather than to underestimate the size of the sample required.

It is clear that sample size is a matter of judgement as well as mathematical precision; even formula-driven approaches make it clear that there are elements of prediction, standard error and human judgement involved in determining sample size.

Sampling error

If many samples are taken from the same population, it is unlikely that they will all have characteristics identical with each other or with the population; their means will be different. In brief, there will be sampling error (see Cohen and Holliday, 1979, 1996). Sampling error is often taken to be the difference between the sample mean and the population mean. Sampling error is not necessarily the result of mistakes made in sampling procedures. Rather, variations may occur due to the chance selection of different individuals. For example, if we take a large number of samples from the population and measure the mean value of each sample, then the sample means will not be identical. Some will be relatively high, some relatively low, and many will cluster around an average or mean value of the samples. We show this diagrammatically in Box 4.3.

Why should this occur? We can explain the phenomenon by reference to the Central Limit Theorem, which is derived from the laws of probability. This states that if random large samples of equal size are repeatedly drawn from any population, then the mean of those samples will be approximately normally distributed.

Box 4.3 Distribution of sample means showing the spread of a selection of sample means around the population mean (Source: Cohen and Holliday, 1979)


The distribution of sample means approaches the normal distribution as the size of the sample increases, regardless of the shape—normal or otherwise—of the parent population (Hopkins, Hopkins and Glass, 1996:159, 388). Moreover, the average or mean of the sample means will be approximately the same as the population mean. The authors demonstrate this (pp. 159–62) by reporting the use of computer simulation to examine the sampling distribution of means when computed 10,000 times (a method that we discuss in the final chapter of this book). Rose and Sullivan (1993:144) remind us that 95 per cent of all sample means fall between plus or minus 1.96 standard errors of the sample and population means, i.e. that we have a 95 per cent chance of having a single sample mean within these limits, that the sample mean will fall within the limits of the population mean.

By drawing a large number of samples of equal size from a population, we create a sampling distribution. We can calculate the error involved in such sampling. The standard deviation of the theoretical distribution of sample means is a measure of sampling error and is called the standard error of the mean (SEM). Thus:

SE = SD_S / √N

where SD_S = the standard deviation of the sample and N = the number in the sample.

Strictly speaking, the formula for the standard error of the mean is:

SE = SD_pop / √N

where SD_pop = the standard deviation of the population.

However, as we are usually unable to ascertain the SD of the total population, the standard deviation of the sample is used instead. The standard error of the mean provides the best estimate of the sampling error. Clearly, the sampling error depends on the variability (i.e. the heterogeneity) in the population, as measured by SD_pop, as well as on the sample size (N) (Rose and Sullivan, 1993:143). The smaller the SD_pop, the smaller the sampling error; the larger the N, the smaller the sampling error. Where the SD_pop is very large, then N needs to be very large to counteract it. Where SD_pop is very small, then N, too, can be small and still give a reasonably small sampling error. As the sample size increases the sampling error decreases. Hopkins, Hopkins and Glass (1996:159) suggest that, unless there are some very unusual distributions, samples of twenty-five or greater usually yield a normal sampling distribution of the mean. For further analysis of steps that can be taken to cope with the estimation of sampling in surveys we refer the reader to Ross and Wilson (1997).
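A simulation of the kind Hopkins, Hopkins and Glass report is easy to reproduce; a minimal sketch (in Python with numpy; the skewed parent population and the sample size of 30 are illustrative assumptions, not the authors' own data) draws 10,000 sample means and compares their spread with SD/√N:

```python
import numpy as np

rng = np.random.default_rng(42)
population = rng.exponential(scale=2.0, size=100_000)  # deliberately non-normal parent

N = 30  # size of each sample
means = np.array([rng.choice(population, size=N).mean() for _ in range(10_000)])

# The mean of the 10,000 sample means approximates the population mean...
print(means.mean(), population.mean())
# ...and their standard deviation approximates the standard error SD/sqrt(N).
print(means.std(ddof=1), population.std() / np.sqrt(N))
```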

The standard error of proportions

We said earlier that one answer to 'How big a sample must I obtain?' is 'How accurate do I want my results to be?' This is well illustrated in the following example:

A school principal finds that the 25 students she talks to at random are reasonably in favour of a proposed change in the lunch break hours, 66 per cent being in favour and 34 per cent being against. How can she be sure that these proportions are truly representative of the whole school of 1,000 students?

A simple calculation of the standard error of proportions provides the principal with her answer:

SE = √(P × Q / N)

where P = the percentage in favour, Q = 100 per cent − P, and N = the sample size.

The formula assumes that each sample is drawn on a simple random basis. A small correction factor called the finite population correction (fpc) is generally applied as follows:

SE of proportions = √((1 − f) × P × Q / N)

where f is the proportion of the population included in the sample. Where, for example, a sample is 100 out of 1,000, f is 0.1.

In our example above, of the school principal's interest in lunch break hours, with a sample of 25, the SE=9.4. In other words, the favourable vote can vary between 56.6 per cent and 75.4 per cent; likewise, the unfavourable vote can vary between 43.4 per cent and 24.6 per cent. Clearly, a voting possibility ranging from 56.6 per cent in favour to 43.4 per cent against is less decisive than 66 per cent as opposed to 34 per cent. Should the school principal enlarge her sample to include 100 students, then the SE becomes 4.5 and the variation in the range is reduced to 61.5 per cent–70.5 per cent in favour and 38.5 per cent–29.5 per cent against. Sampling the whole school's opinion (n=1,000) reduces the SE to 1.5 and the ranges to 64.5 per cent–67.5 per cent in favour and 35.5 per cent–32.5 per cent against. It is easy to see why political opinion surveys are often based upon sample sizes of 1,000 to 1,500 (Gardner, 1978).
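A minimal sketch of the principal's calculation (in Python; note that the figures for n=25 and n=100 follow with the correction applied, while the whole-school figure of 1.5 appears to omit the correction, under which a complete census would have no sampling error at all):

```python
import math

def se_proportion(p: float, n: int, population: int) -> float:
    """Standard error of a percentage p from a sample of n, with the fpc."""
    f = n / population  # proportion of the population included in the sample
    return math.sqrt((1 - f) * p * (100 - p) / n)

for n in (25, 100):
    se = se_proportion(66, n, 1_000)
    print(n, round(se, 1), f"in favour: {66 - se:.1f}%-{66 + se:.1f}%")
```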

The representativeness of the sample

The researcher will need to consider the extent to which it is important that the sample in fact represents the whole population in question (in the example above, the 1,000 students), if it is to be a valid sample. The researcher will need to be clear what it is that is being represented, i.e. to set the parameter characteristics of the wider population—the sampling frame—clearly and correctly. There is a popular example of how poor sampling may be unrepresentative and unhelpful for a researcher. A national newspaper reports that one person in every two suffers from backache; this headline stirs alarm in every doctor's surgery throughout the land. However, the newspaper fails to make clear the parameters of the study which gave rise to the headline. It turns out that the research took place (a) in a damp part of the country where the incidence of backache might be expected to be higher than elsewhere, (b) in a part of the country which contained a disproportionate number of elderly people, again who might be expected to have more backaches than a younger population, (c) in an area of heavy industry where the working population might be expected to have more backache than in an area of lighter industry or service industries, and (d) by using two doctors' records only, overlooking the fact that many backache sufferers did not go to those doctors' surgeries because the two doctors concerned were known to regard backache sufferers with suspicion—as shirkers from work.

These four variables—climate, age group, occupation and reported incidence—were seen to exert a disproportionate effect on the study, i.e. if the study were to have been carried out in an area where the climate, age group, occupation and reporting were different, then the results might have been different. The newspaper report sensationally generalized beyond the parameters of the data, thereby overlooking the limited representativeness of the study.

The access to the sample

Researchers will need to ensure not only that access is permitted, but that it is, in fact, practicable. For example, if a researcher were to conduct research into truancy and unauthorized absence from school, and she decided to interview a sample of truants, the research might never commence as the truants, by definition, would not be present! Similarly, access to sensitive areas might be not only difficult but also problematical, both legally and administratively, for example, access to child abuse victims, child abusers, disaffected students, drug addicts, school refusers, bullies and victims of bullying. In some sensitive areas access to a sample might be denied by the potential sample participants themselves; for example, an AIDS counsellor might be so seriously distressed by her work that she simply cannot face discussing with a researcher its traumatic subject matter: it is distressing enough to do the job without living through it again with a researcher. Access might also be denied by the potential sample participants themselves for very practical reasons, for example a doctor or a teacher simply might not have the time to spend with the researcher. Further, access might be denied by people who have something to protect, for example, a person who has made an important discovery or a new invention and who does not wish to disclose the secret of her success; the trade in intellectual property has rendered this a live issue for many researchers. There are very many reasons which might prevent access to the sample, and researchers cannot afford to neglect this potential source of difficulty in planning research.

Not only might access be problematic, but its corollary—release of information—might also be problematic. For example, a researcher might gain access to a wealth of sensitive information and appropriate people, but there might be a restriction on the release of the data collected: in the field of education in the UK, reports have been known to be suppressed, delayed or 'doctored'. It is not always enough to be able to 'get to' the sample; the problem might be to 'get the information out' to the wider public, particularly if it could be critical of powerful people.

The sampling strategy to be used

There are two main methods of sampling (Cohen and Holliday, 1979, 1982, 1996; Schofield, 1996). The researcher must decide whether to opt for a probability sample (also known as a random sample) or a non-probability sample (also known as a purposive sample). The difference between them is this: in a probability sample the chances of members of the wider population being selected for the sample are known, whereas in a non-probability sample the chances of members of the wider population being selected for the sample are unknown. In the former (probability sample) every member of the wider population has an equal chance of being included in the sample; inclusion or exclusion from the sample is a matter of chance and nothing else. In the latter (non-probability sample) some members of the wider population definitely will be excluded and others definitely included (i.e. every member of the wider population does not have an equal chance of being included in the sample). In this latter type the researcher has deliberately—purposely—selected a particular section of the wider population to include in or exclude from the sample.

Probability samples

A probability sample, because it draws randomly from the wider population, will be useful if the researcher wishes to be able to make generalizations, because it seeks representativeness of the wider population. This is a form of sampling that is popular in randomized controlled trials. On the other hand, a non-probability sample deliberately avoids representing the wider population; it seeks only to represent a particular group, a particular named section of the wider population, e.g. a class of students, a group of students who are taking a particular examination, a group of teachers.

A probability sample will have less risk of bias than a non-probability sample, whereas, by contrast, a non-probability sample, being unrepresentative of the whole population, may demonstrate skewness or bias. This is not to say that the former is bias-free; there is still likely to be sampling error in a probability sample (discussed above), a feature that has to be acknowledged, for example opinion polls usually declare their error factors, e.g. ±3 per cent.

There are several types of probability samples: simple random samples; systematic samples; stratified samples; cluster samples; stage samples; and multi-phase samples. They all have a measure of randomness built into them and therefore have a degree of generalizability.


Simple random sampling

In simple random sampling, each member of the population under study has an equal chance of being selected, and the probability of a member of the population being selected is unaffected by the selection of other members of the population, i.e. each selection is entirely independent of the next. The method involves selecting at random from a list of the population (a sampling frame) the required number of subjects for the sample. This can be done by drawing names out of a hat until the required number is reached, or by using a table of random numbers set out in matrix form (these are reproduced in many books on quantitative research methods and statistics), and allocating these random numbers to participants or cases (e.g. Hopkins, Hopkins and Glass, 1996:148–9). Because of probability and chance, the sample should contain subjects with characteristics similar to the population as a whole; some old, some young, some tall, some short, some fit, some unfit, some rich, some poor, etc. One problem associated with this particular sampling method is that a complete list of the population is needed and this is not always readily available.
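A computer can stand in for the hat or the table of random numbers; a minimal sketch (Python's standard library; the sampling frame of 900 students is simply the hypothetical example used earlier in the chapter):

```python
import random

# The sampling frame: a complete list of the population (here, 900 students).
sampling_frame = [f"student_{i}" for i in range(1, 901)]

# Draw 30 cases; every member of the frame has an equal chance of selection.
sample = random.sample(sampling_frame, k=30)
print(sample[:5])
```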

Systematic sampling

This method is a modified form of simple random sampling. It involves selecting subjects from a population list in a systematic rather than a random fashion. For example, if from a population of, say, 2,000, a sample of 100 is required, then every twentieth person can be selected. The starting point for the selection is chosen at random.

One can decide how frequently to make systematic sampling by a simple statistic—the total number of the wider population being represented divided by the sample size required:

f = N / sn

where f = the frequency interval, N = the total number of the wider population, and sn = the required number in the sample.

Let us say that the researcher is working with a school of 1,400 students; by looking at the table of sample size (Box 4.1) required for a random sample of these 1,400 students, we see that 302 students are required to be in the sample. Hence the frequency interval (f) is 1,400÷302=4.6, rounded to 5. Hence the researcher would pick out every fifth name on the list of cases.

Such a process, of course, assumes that the names on the list themselves have been listed in a random order. A list of females and males might list all the females first, before listing all the males; if there were 200 females on the list, the researcher might have reached the desired sample size before reaching that stage of the list which contained males, thereby distorting (skewing) the sample. Another example might be where the researcher decides to select every thirtieth person identified from a list of school students, but it happens that:

1 the school has approximately thirty students in each class;
2 each class is listed from high ability to low ability students;
3 the school listing identifies the students by class.

In this case, although the sample is drawn from each class, it is not fairly representing the whole school population since it is drawing almost exclusively on the higher ability students. This is the issue of periodicity (Calder, 1979). Not only is there the question of the order in which names are listed in systematic sampling, but there is also the issue that this process may violate one of the fundamental premises of probability sampling, namely that every person has an equal chance of being included in the sample. In the example above where every fifth name is selected, this guarantees that names 1–4, 6–9, etc. will not be selected, i.e. that everybody does not have an equal chance to be chosen. The ways to minimize this problem are to ensure that the initial listing is selected randomly and that the starting point for systematic sampling is similarly selected randomly.
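A minimal sketch of the procedure (standard-library Python; shuffling the frame and randomizing the start are the two safeguards the text recommends, and the school of 1,400 with a required sample of 302 is the chapter's own example):

```python
import random

def systematic_sample(frame: list, n: int) -> list:
    """Take every f-th name after a random start, from a randomly ordered list."""
    frame = list(frame)
    random.shuffle(frame)        # guard against periodicity in the listing
    f = max(1, len(frame) // n)  # frequency interval f = N / sn
    start = random.randrange(f)  # random starting point
    return frame[start::f][:n]

students = [f"student_{i}" for i in range(1, 1401)]
print(len(systematic_sample(students, 302)))  # 302
```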

Stratified sampling

Stratified sampling involves dividing the population into homogeneous groups, each group containing subjects with similar characteristics. For example, group A might contain males and group B, females. In order to obtain a sample representative of the whole population in terms of sex, a random selection of subjects from group A and group B must be taken. If needed, the exact proportion of males to females in the whole population can be reflected in the sample. The researcher will have to identify those characteristics of the wider population which must be included in the sample, i.e. to identify the parameters of the wider population. This is the essence of establishing the sampling frame.

To organize a stratified random sample is a simple two-stage process. First, identify those characteristics which appear in the wider population which must also appear in the sample, i.e. divide the wider population into homogeneous and, if possible, discrete groups (strata), for example males and females. Second, randomly sample within these groups, the size of each group being determined either by the judgement of the researcher or by reference to Boxes 4.1 or 4.2.

The decision on which characteristics to include should strive for simplicity as far as possible, as the more factors there are, not only the more complicated the sampling becomes, but often the larger the sample will have to be to include representatives of all strata of the wider population.

A stratified random sample is, therefore, a useful blend of randomization and categorization, thereby enabling both a quantitative and a qualitative piece of research to be undertaken. A quantitative piece of research will be able to use analytical and inferential statistics, whilst a qualitative piece of research will be able to target those groups in institutions or clusters of participants who will be able to be approached to participate in the research.
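A minimal sketch of proportionate stratified random sampling (standard-library Python; the two strata and their sizes are illustrative assumptions, not drawn from the text):

```python
import random

def stratified_sample(strata: dict, total_n: int) -> list:
    """Randomly sample within each stratum, in proportion to stratum size."""
    population = sum(len(members) for members in strata.values())
    sample = []
    for members in strata.values():
        k = round(total_n * len(members) / population)  # proportional allocation
        sample.extend(random.sample(members, k))
    return sample

strata = {"female": [f"f_{i}" for i in range(550)],
          "male": [f"m_{i}" for i in range(450)]}
print(len(stratified_sample(strata, 100)))  # 55 females + 45 males = 100
```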

Cluster sampling

When the population is large and widely dispersed, gathering a simple random sample poses administrative problems. Suppose we want to survey students' fitness levels in a particularly large community. It would be completely impractical to select students and spend an inordinate amount of time travelling about in order to test them. By cluster sampling, the researcher can select a specific number of schools and test all the students in those selected schools, i.e. a geographically close cluster is sampled.

Cluster samples are widely used in small-scale research. In a cluster sample the parameters of the wider population are often drawn very sharply; a researcher, therefore, would have to comment on the generalizability of the findings. The researcher may also need to stratify within this cluster sample if useful data, i.e. those which are focused and which demonstrate discriminability, are to be acquired.

Stage sampling

Stage sampling is an extension of cluster sampling. It involves selecting the sample in stages, that is, taking samples from samples. Using the large community example in cluster sampling, one type of stage sampling might be to select a number of schools at random, and from within each of these schools, select a number of classes at random, and from within those classes select a number of students.

Morrison (1993:121–2) provides an example of how to address stage sampling in practice. Let us say that a researcher wants to administer a questionnaire to all 16-year-olds in each of eleven secondary schools in one region. By contacting the eleven schools she finds that there are 2,000 16-year-olds on roll. Because of questions of confidentiality she is unable to find out the names of all the students, so it is impossible to draw their names out of a hat to achieve randomness (and even if she had the names, it would be a mind-numbing activity to write out 2,000 names to draw out of a hat!). From looking at Box 4.1 she finds that, for a random sample of the 2,000 students, the sample size is 322 students. How can she proceed?

The first stage is to list the eleven schools on a piece of paper and then to put the names of the eleven schools onto small cards and place each card in a hat. She draws out the first name of the school, puts a tally mark by the appropriate school on her list and returns the card to the hat. The process is repeated 321 times, bringing the total to 322. The final totals might appear thus:

School:                   1   2   3   4   5   6   7   8   9   10  11  Total
Required no. of students: 15  21  13  52  33  22  38  47  36  22  23  322

For the second stage she then approaches each of the eleven schools and asks them to select randomly the required number of students for each school. Randomness has been maintained in two stages and a large number (2,000) has been rendered manageable. The process at work here is to go from the general to the specific, the wide to the focused, the large to the small.
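The first stage is simply 322 draws with replacement from the eleven school names; a minimal sketch (standard-library Python; being random, each run gives tallies that vary around the fixed illustration above):

```python
import random
from collections import Counter

schools = [f"School {i}" for i in range(1, 12)]
draws = random.choices(schools, k=322)  # 322 draws with replacement, like the cards in the hat
tallies = Counter(draws)
print(dict(tallies), sum(tallies.values()))  # eleven tallies summing to 322
```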

Multi-phase sampling

In stage sampling there is a single unifying purpose throughout the sampling. In the previous example the purpose was to reach a particular group of students from a particular region. In a multi-phase sample the purposes change at each phase; for example, at phase one the selection of the sample might be based on the criterion of geography (e.g. students living in a particular region); phase two might be based on an economic criterion (e.g. schools whose budgets are administered in markedly different ways); phase three might be based on a political criterion (e.g. schools whose students are drawn from areas with a tradition of support for a particular political party), and so on. What is evident here is that the sample population will change at each phase of the research.

Non-probability samples

The selectivity which is built into a non-probability sample derives from the researcher targeting a particular group, in the full knowledge that it does not represent the wider population; it simply represents itself. This is frequently the case in small-scale research, for example, as with one or two schools, two or three groups of students, or a particular group of teachers, where no attempt to generalize is desired; this is frequently the case for some ethnographic research, action research or case study research.

Small-scale research often uses non-probability samples because, despite the disadvantages that arise from their non-representativeness, they are far less complicated to set up, are considerably less expensive, and can prove perfectly adequate where researchers do not intend to generalize their findings beyond the sample in question, or where they are simply piloting a questionnaire as a prelude to the main study.

Just as there are several types of probability sample, so there are several types of non-probability sample: convenience sampling, quota sampling, dimensional sampling, purposive sampling and snowball sampling. Each type of sample seeks only to represent itself or instances of itself in a similar population, rather than attempting to represent the whole, undifferentiated population.

Convenience sampling

Convenience sampling—or, as it is sometimes called, accidental or opportunity sampling—involves choosing the nearest individuals to serve as respondents and continuing that process until the required sample size has been obtained. Captive audiences such as students or student teachers often serve as respondents based on convenience sampling. The researcher simply chooses the sample from those to whom she has easy access. As it does not represent any group apart from itself, it does not seek to generalize about the wider population; for a convenience sample that is an irrelevance. The researcher, of course, must take pains to report this point—that the parameters of generalizability in this type of sample are negligible. A convenience sample may be the sampling strategy selected for a case study or a series of case studies.

Quota sampling

Quota sampling has been described as the non-probability equivalent of stratified sampling (Bailey, 1978). Like a stratified sample, a quota sample strives to represent significant characteristics (strata) of the wider population; unlike stratified sampling, it sets out to represent these in the proportions in which they can be found in the wider population. For example, suppose that the wider population (however defined) were composed of 55 per cent females and 45 per cent males; then the sample would have to contain 55 per cent females and 45 per cent males. If the population of a school contained 80 per cent of students up to and including the age of 16, and 20 per cent of students aged 17 and over, then the sample would have to contain 80 per cent of students up to the age of 16 and 20 per cent of students aged 17 and above. A quota sample, then, seeks to give proportional weighting to selected factors (strata), reflecting the weighting in which they can be found in the wider population. The researcher wishing to devise a quota sample can proceed in three stages:

Stage 1 Identify those characteristics (factors) which appear in the wider population which must also appear in the sample, i.e. divide the wider population into homogeneous and, if possible, discrete groups (strata), for example, males and females, Asian, Chinese and Afro-Caribbean.
Stage 2 Identify the proportions in which the selected characteristics appear in the wider population, expressed as a percentage.
Stage 3 Ensure that the percentaged proportions of the characteristics selected from the wider population appear in the sample.

Ensuring correct proportions in the sample may be difficult to achieve where the proportions in the wider community are unknown; sometimes a pilot survey might be necessary in order to establish those proportions (and even then sampling error or a poor response rate might render the pilot data problematical).

It is straightforward to determine the minimum number required in a quota sample. Let us say that the total number of students in a school is 1,700, made up thus:

Performing arts: 300 students
Natural sciences: 300 students
Humanities: 600 students
Business and social sciences: 500 students

The proportions being 3:3:6:5, a minimum of 17 students might be required (3+3+6+5) for the sample. Of course this would be a minimum only, and it might be desirable to go higher than this. The price of having too many characteristics (strata) in quota sampling is that the minimum number in the sample very rapidly could become very large; hence in quota sampling it is advisable to keep the number of strata to a minimum. The larger the number of strata, the larger the number in the sample will become, usually at a geometric rather than an arithmetic rate of progression.
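The minimum follows from reducing the strata sizes by their greatest common divisor; a minimal sketch (standard-library Python, using the school's figures above):

```python
from functools import reduce
from math import gcd

strata = {"Performing arts": 300, "Natural sciences": 300,
          "Humanities": 600, "Business and social sciences": 500}

g = reduce(gcd, strata.values())                # 100, so the proportions are 3:3:6:5
minimum = sum(n // g for n in strata.values())  # 3 + 3 + 6 + 5 = 17
print(g, minimum)
```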

Purposive sampling

In purposive sampling, researchers handpick the cases to be included in the sample on the basis of their judgement of their typicality. In this way, they build up a sample that is satisfactory to their specific needs. As its name suggests, the sample has been chosen for a specific purpose, for example: (a) a group of principals and senior managers of secondary schools is chosen as the research is studying the incidence of stress amongst senior managers; (b) a group of disaffected students has been chosen because they might indicate most distinctly the factors which contribute to students' disaffection (they are 'critical cases', akin to the 'critical events' discussed in Chapter 17); (c) one class of students has been selected to be tracked throughout a week in order to report on the curricular and pedagogic diet which is offered to them, so that other teachers in the school might compare their own teaching to that reported. Whilst it may satisfy the researcher's needs to take this type of sample, it does not pretend to represent the wider population; it is deliberately and unashamedly selective and biased.

Dimensional sampling

One way of reducing the problem of sample size in quota sampling is to opt for dimensional sampling, a further refinement of quota sampling. It involves identifying various factors of interest in a population and obtaining at least one respondent for every combination of those factors. Thus, in a study of race relations, for example, researchers may wish to distinguish first, second and third generation immigrants. Their sampling plan might take the form of a multi-dimensional table with 'ethnic group' across the top and 'generation' down the side. A second example might be of a researcher who is interested in studying disaffected students, girls and secondary-aged students, and who may find a single disaffected secondary female student, i.e. a respondent who is the bearer of all of the sought characteristics.

Snowball sampling

In snowball sampling researchers identify a small number of individuals who have the characteristics in which they are interested. These people are then used as informants to identify, or put the researchers in touch with, others who qualify for inclusion; and these, in turn, identify yet others—hence the term snowball sampling. This method is useful for sampling a population where access is difficult, maybe because it is a sensitive topic (e.g. teenage solvent abusers), or where communication networks are undeveloped (e.g. where a researcher wishes to interview stand-in 'supply' teachers—teachers who are brought in on an ad hoc basis to cover for absent regular members of a school's teaching staff—but finds it difficult to acquire a list of these stand-in teachers; or where a researcher wishes to contact curriculum co-ordinators who have attended a range of in-service courses and built up an informal network of inter-school communication). The task for the researcher is to establish who are the critical or key informants with whom initial contact must be made.

Conclusion

The message from this chapter is the same as for many of the others—that every element of the research should not be arbitrary but planned and deliberate, and that, as before, the criterion of planning must be fitness for purpose. The selection of a sampling strategy must be governed by the criterion of suitability. The choice of which strategy to adopt must be mindful of the purposes of the research, the time scales and constraints on the research, the methods of data collection, and the methodology of the research. The sampling chosen must be appropriate for all of these factors if validity is to be served.

5 Validity and reliability

The concepts of validity and reliability are multifaceted; there are many different types of validity and different types of reliability. Hence there will be several ways in which they can be addressed. It is unwise to think that threats to validity and reliability can ever be erased completely; rather, the effects of these threats can be attenuated by attention to validity and reliability throughout a piece of research.

This chapter discusses validity and reliability in quantitative and qualitative, naturalistic research. It suggests that both of these terms can be applied to these two types of research, though how validity and reliability are addressed in these two approaches varies. Finally, validity and reliability using different instruments for data collection are addressed. It is suggested that reliability is a necessary but insufficient condition for validity in research; reliability is a necessary precondition of validity. Brock-Utne (1996:612) contends that the widely held view that reliability is the sole preserve of quantitative research has to be exploded, and this chapter demonstrates the significance of her view.

Defining validity

Validity is an important key to effective research. If a piece of research is invalid then it is worthless. Validity is thus a requirement for both quantitative and qualitative/naturalistic research.

Whilst earlier versions of validity were based on the view that it was essentially a demonstration that a particular instrument in fact measures what it purports to measure, more recently validity has taken many forms. For example, in qualitative data validity might be addressed through the honesty, depth, richness and scope of the data achieved, the participants approached, the extent of triangulation and the disinterestedness or objectivity of the researcher. In quantitative data validity might be improved through careful sampling, appropriate instrumentation and appropriate statistical treatments of the data. It is impossible for research to be 100 per cent valid; that is the optimism of perfection. Quantitative research possesses a measure of standard error which is inbuilt and which has to be acknowledged. In qualitative data the subjectivity of respondents, their opinions, attitudes and perspectives together contribute to a degree of bias. Validity, then, should be seen as a matter of degree rather than as an absolute state (Gronlund, 1981). Hence at best we strive to minimize invalidity and maximize validity.

There are several different kinds of validity, for example:

• content validity;
• criterion-related validity;
• construct validity;
• internal validity;
• external validity;
• concurrent validity;
• face validity;
• jury validity;
• predictive validity;
• consequential validity;


• systemic validity;
• catalytic validity;
• ecological validity;
• cultural validity;
• descriptive validity;
• interpretive validity;
• theoretical validity;
• evaluative validity.

It is not our intention in this chapter to discuss all of these terms in depth. Rather, the main types of validity will be addressed. The argument will be made that, whilst some of these terms are more comfortably the preserve of quantitative methodologies, this is not exclusively the case. Indeed, validity is the touchstone of all types of educational research. That said, it is important that validity in different research traditions is faithful to those traditions; it would be absurd to declare a piece of research invalid if it were not striving to meet certain kinds of validity, e.g. generalizability, replicability, controllability. Hence the researcher will need to locate her discussions of validity within the research paradigm that is being used. This is not to suggest, however, that research should be paradigm-bound; that is a recipe for stagnation and conservatism. Nevertheless, validity must be faithful to its premises, and positivist research has to be faithful to positivist principles, e.g.:

• controllability;
• replicability;
• predictability;
• the derivation of laws and universal statements of behaviour;
• context-freedom;
• fragmentation and atomization of research;
• randomization of samples;
• observability.

By way of contrast, naturalistic research has several principles (Lincoln and Guba, 1985; Bogdan and Biklen, 1992):

• the natural setting is the principal source of data;
• context-boundedness and 'thick description';
• data are socially situated, and socially and culturally saturated;
• the researcher is part of the researched world;
• as we live in an already interpreted world, a doubly hermeneutic exercise (Giddens, 1979) is necessary to understand others' understandings of the world; the paradox here is that the most sufficiently complex instrument to understand human life is another human (Lave and Kvale, 1995:220), but that this risks human error in all its forms;
• holism in the research;
• the researcher—rather than a research tool—is the key instrument of research;
• the data are descriptive;
• there is a concern for processes rather than simply with outcomes;
• data are analysed inductively rather than using a priori categories;
• data are presented in terms of the respondents rather than researchers;
• seeing and reporting the situation through the eyes of participants—from the native's point of view (Geertz, 1974);
• respondent validation is important;
• catching meaning and intention are essential.

Indeed, Maxwell (1992) argues that qualitative researchers need to be cautious not to be working within the agenda of the positivists in arguing for the need for research to demonstrate concurrent, predictive, convergent, criterion-related, internal and external validity. The discussion below indicates that this need not be so. He argues, with Guba and Lincoln (1989), for the need to replace positivist notions of validity in qualitative research with the notion of authenticity. Maxwell, echoing Mishler (1990), suggests that 'understanding' is a more suitable term than 'validity' in qualitative research. We, as researchers, are part of the world that we are researching, and we cannot be completely objective about that; hence other people's perspectives are equally as valid as our own, and the task of research is to uncover these. Validity, then, attaches to accounts, not to data or methods (Hammersley and Atkinson, 1983); it is the meaning that subjects give to data and inferences drawn from the data that are important. 'Fidelity' (Blumenfeld-Jones, 1995) requires the researcher to be as honest as possible to the self-reporting of the researched.

The claim is made (Agar, 1993) that, in qualitative data collection, the intensive personal involvement and in-depth responses of individuals secure a sufficient level of validity and reliability. This claim is contested by Hammersley (1992:144) and Silverman (1993:153), who argue that these are insufficient grounds for validity and reliability, and that the individuals concerned have no privileged position on interpretation. (Of course, neither are actors ‘cultural dopes’ who need a sociologist or researcher to tell them what is ‘really’ happening!) Silverman argues that, whilst immediacy and authenticity make for interesting journalism, ethnography must have more rigorous notions of validity and reliability. This involves moving beyond selecting data simply to fit a preconceived or ideal conception of the phenomenon or because they are spectacularly interesting (Fielding and Fielding, 1986). Data selected must be representative of the sample, the whole data set, the field, i.e. they must address content, construct and concurrent validity.

Hammersley (1992:50–1) suggests that validity in qualitative research replaces certainty with confidence in our results, and that, as reality is independent of the claims made for it by researchers, our accounts will only be representations of that reality rather than reproductions of it.

Maxwell (1992) argues for five kinds of validity in qualitative methods that explore his notion of ‘understanding’:

• descriptive validity (the factual accuracy of the account, that it is not made up, selective, or distorted); in this respect validity subsumes reliability; it is akin to Blumenfeld-Jones’s (1995) notion of ‘truth’ in research—what actually happened (objectively factual);
• interpretive validity (the ability of the research to catch the meaning, interpretations, terms and intentions that situations and events, i.e. data, have for the participants/subjects themselves, in their terms); it is akin to Blumenfeld-Jones’s (1995) notion of ‘fidelity’—what it means to the researched person or group (subjectively meaningful); interpretive validity has no clear counterpart in experimental/positivist methodologies;
• theoretical validity (the theoretical constructions that the researcher brings to the research (including those of the researched)); theory here is regarded as explanation, so theoretical validity is the extent to which the research explains phenomena; in this respect it is akin to construct validity (discussed below), though in theoretical validity the constructs are those of all the participants;
• generalizability (the view that the theory generated may be useful in understanding other similar situations); generalizing here refers to generalizing validly within specific groups or communities, situations or circumstances and, beyond, to specific outsider communities, situations or circumstances (external validity); internal validity has greater significance here than external validity;
• evaluative validity (the application of an evaluative framework, judgemental of that which is being researched, rather than a descriptive, explanatory or interpretive one); clearly this resonates with critical-theoretical perspectives, in that the researchers’ own evaluative agenda might intrude.

Both qualitative and quantitative methods can address internal and external validity.

Internal validity

Internal validity seeks to demonstrate that the explanation of a particular event, issue or set of data which a piece of research provides can actually be sustained by the data. In some degree this concerns accuracy, which can be applied to quantitative and qualitative research. The findings must accurately describe the phenomena being researched.

This chapter sets out the conventional notions of validity as derived from quantitative methodologies. However, in ethnographic research internal validity can be addressed in several ways (LeCompte and Preissle, 1993:338):

• using low-inference descriptors;
• using multiple researchers;
• using participant researchers;
• using peer examination of data;
• using mechanical means to record, store and retrieve data.

In ethnographic, qualitative research there are several overriding kinds of internal validity (LeCompte and Preissle, 1993:323–4):

• confidence in the data;
• the authenticity of the data (the ability of the research to report a situation through the eyes of the participants);
• the cogency of the data;
• the soundness of the research design;
• the credibility of the data;
• the auditability of the data;
• the dependability of the data;
• the confirmability of the data.

The writers provide greater detail on the issue of authenticity, arguing for the following:

• fairness (that there should be a complete and balanced representation of the multiple realities in and constructions of a situation);
• ontological authenticity (the research should provide a fresh and more sophisticated understanding of a situation, e.g. making the familiar strange, a significant feature in reducing ‘cultural blindness’ in a researcher, a problem which might be encountered in moving from being a participant to being an observer (Brock-Utne, 1996:610));
• educative authenticity (the research should generate a new appreciation of these understandings);
• catalytic authenticity (the research gives rise to specific courses of action);
• tactical authenticity (the research should benefit all those involved—the ethical issue of ‘beneficence’).

Hammersley (1992:71) suggests that internal validity for qualitative data requires attention to:

• plausibility and credibility;
• the kinds and amounts of evidence required (such that the greater the claim that is being made, the more convincing the evidence has to be for that claim);
• clarity on the kinds of claim made from the research (e.g. definitional, descriptive, explanatory, theory generative).

Lincoln and Guba (1985:219, 301) suggest that credibility in naturalistic inquiry can be addressed by:

• prolonged engagement in the field;
• persistent observation (in order to establish the relevance of the characteristics for the focus);
• triangulation (of methods, sources, investigators and theories);
• peer debriefing (exposing oneself to a disinterested peer in a manner akin to cross-examination, in order to test honesty, working hypotheses and to identify the next steps in the research);
• negative case analysis (in order to establish a theory that fits every case, revising hypotheses retrospectively);
• member checking (respondent validation), to assess intentionality, to correct factual errors, to offer respondents the opportunity to add further information or to put information on record, to provide summaries and to check the adequacy of the analysis.

Whereas in positivist research history and maturation are viewed as threats to the validity of the research, ethnographic research simply assumes that this will happen; ethnographic research allows for change over time—it builds it in. Internal validity in ethnographic research is also addressed by the reduction of observer effects, by having the observers sample widely and stay in the situation long enough for their presence to be taken for granted. Further, by tracking and storing information, it is possible for the ethnographer to eliminate rival explanations of events and situations.


External validity

External validity refers to the degree to which the results can be generalized to the wider population, cases or situations. The issue of generalization is problematical. For positivist researchers generalizability is a sine qua non, whilst this is attenuated in naturalistic research. For one school of thought, generalizability through stripping out contextual variables is fundamental, whilst, for another, generalizations that say little about the context have little that is useful to say about human behaviour (Schofield, 1993). For positivists variables have to be isolated and controlled, and samples randomized, whilst for ethnographers human behaviour is infinitely complex, irreducible, socially situated and unique.

Generalizability in naturalistic research is interpreted as comparability and transferability (Lincoln and Guba, 1985; Eisenhart and Howe, 1992:647). These writers suggest that it is possible to assess the typicality of a situation—the participants and settings—to identify possible comparison groups, and to indicate how data might translate into different settings and cultures (see also LeCompte and Preissle, 1993:348). Schofield (1992:200) suggests that it is important in qualitative research to provide a clear, detailed and in-depth description so that others can decide the extent to which findings from one piece of research are generalizable to another situation, i.e. to address the twin issues of comparability and translatability. Indeed, qualitative research can be generalizable, the paper argues (p. 209), by studying the typical (for its applicability to other situations—the issue of transferability (LeCompte and Preissle, 1993:324)) and by performing multi-site studies (e.g. Miles and Huberman, 1984), though it could be argued that this is injecting (or infecting!) a degree of positivism into non-positivist research. Lincoln and Guba (1985:316) caution the naturalistic researcher against this; they argue that it is not the researcher’s task to provide an index of transferability; rather, they suggest, researchers should provide sufficiently rich data for the readers and users of research to determine whether transferability is possible. In this respect transferability requires thick description.

Bogdan and Biklen (1992:45) argue that generalizability, construed differently from its usage in positivist methodologies, can be addressed in qualitative research. Positivist researchers, they argue, are more concerned to derive universal statements of general social processes than to provide accounts of the degree of commonality between various social settings (e.g. schools and classrooms). Bogdan and Biklen are interested not in the issue of whether their findings are generalizable in the widest sense, but in the question of the settings, people and situations to which they might be generalizable.

In naturalistic research threats to external validity include (Lincoln and Guba, 1985:189, 300):

• selection effects (where constructs selected in fact are only relevant to a certain group);
• setting effects (where the results are largely a function of their context);
• history effects (where the situations have been arrived at by unique circumstances and, therefore, are not comparable);
• construct effects (where the constructs being used are peculiar to a certain group).

Content validity

To demonstrate this form of validity the instrument must show that it fairly and comprehensively covers the domain or items that it purports to cover. It is unlikely that each issue will be able to be addressed in its entirety, simply because of the time available or respondents’ motivation to complete, for example, a long questionnaire. If this is the case, then the researcher must ensure that the elements of the main issue to be covered in the research are both a fair representation of the wider issue under investigation (and its weighting) and that the elements chosen for the research sample are themselves addressed in depth and breadth. Careful sampling of items is required to ensure their representativeness. For example, if the researcher wished to see how well a group of students could spell 1,000 words in French but decided only to have a sample of fifty words for the spelling test, then that test would have to ensure that it represented the range of spellings in the 1,000 words—maybe by ensuring that the spelling rules had all been included or that possible spelling errors had been covered in the test in the proportions in which they occurred in the 1,000 words.
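A minimal sketch of such proportional item sampling, assuming the 1,000 words have already been grouped by the spelling rule they test (the rule names, pool sizes and placeholder words are invented for illustration):

import random

# Hypothetical pools of words, grouped by the spelling rule they test;
# together they stand in for the full 1,000-word domain.
word_pools = {
    "accents": [f"accent_word_{i}" for i in range(300)],
    "double_consonants": [f"double_consonant_word_{i}" for i in range(500)],
    "silent_endings": [f"silent_ending_word_{i}" for i in range(200)],
}

TEST_LENGTH = 50
domain_size = sum(len(pool) for pool in word_pools.values())

# Sample each rule in proportion to its share of the domain, so the
# fifty-item test mirrors the 1,000 words (here 15, 25 and 10 items).
test_items = []
for rule, pool in word_pools.items():
    n_items = round(TEST_LENGTH * len(pool) / domain_size)
    test_items.extend(random.sample(pool, n_items))

print(f"{len(test_items)} items sampled from a domain of {domain_size} words")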

Construct validity

A construct is an abstraction; this separates it from the previous types of validity, which dealt in actualities—defined content. In this type of validity agreement is sought on the ‘operationalized’ forms of a construct, clarifying what we mean when we use this construct. Hence in this form of validity the articulation of the construct is important: is my understanding of this construct similar to that which is generally accepted to be the construct? For example, let us say that I wished to assess a child’s intelligence (assuming, for the sake of this example, that it is a unitary quality). I could say that I construed intelligence to be demonstrated in the ability to sharpen a pencil. How acceptable a construction of intelligence is this? Is not intelligence something else (e.g. that which is demonstrated by a high result in an intelligence test)?

To establish construct validity I would need to be assured that my construction of a particular issue agreed with other constructions of the same underlying issue, e.g. intelligence, creativity, anxiety, motivation. This can be achieved through correlations with other measures of the issue or by rooting my construction in a wide literature search which teases out the meaning of a particular construct (i.e. a theory of what that construct is) and its constituent elements. Demonstrating construct validity means not only confirming the construction with that given in relevant literature, but looking for counter-examples which might falsify my construction. When I have balanced confirming and refuting evidence I am in a position to demonstrate construct validity, and am then in a position to stipulate what I take this construct to be. In the case of conflicting interpretations of a construct, I might have to acknowledge that conflict and then stipulate the interpretation that I shall use.

In qualitative/ethnographic research construct validity must demonstrate that the categories that the researchers are using are meaningful to the participants themselves (Eisenhart and Howe, 1992:648), i.e. that they reflect the way in which the participants actually experience and construe the situations in the research; that they see the situation through the actors’ eyes.

Campbell and Fiske (1959) and Brock-Utne (1996) suggest that convergent validity implies that different methods for researching the same construct should give a relatively high inter-correlation, whilst discriminant validity suggests that using similar methods for researching different constructs should yield relatively low inter-correlations.
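The logic can be illustrated with a minimal sketch: invented scores from two hypothetical methods (a written test and an observation schedule) applied to one construct, anxiety, and from the test method applied to a second construct, motivation:

import numpy as np

# Invented scores for ten students.
anxiety_test = np.array([12, 15, 9, 20, 14, 11, 18, 16, 10, 13])
anxiety_observed = np.array([11, 16, 10, 19, 13, 12, 17, 15, 9, 14])
motivation_test = np.array([25, 31, 22, 20, 35, 20, 30, 18, 27, 24])

def pearson(x, y):
    # Pearson product-moment correlation via numpy's correlation matrix.
    return np.corrcoef(x, y)[0, 1]

# Convergent validity: different methods, same construct -> expect high r.
print(f"anxiety, test vs. observation: r = {pearson(anxiety_test, anxiety_observed):.2f}")

# Discriminant validity: same method, different constructs -> expect low r.
print(f"anxiety vs. motivation, same method: r = {pearson(anxiety_test, motivation_test):.2f}")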

Ecological validity

In quantitative, positivist research variables are frequently isolated, controlled and manipulated in contrived settings. For qualitative, naturalistic research a fundamental premise is that the researcher deliberately does not try to manipulate variables or conditions; the situations in the research occur naturally. The intention here is to give accurate portrayals of the realities of social situations in their own terms, in their natural or conventional settings. In education, ecological validity is particularly important and useful in charting how policies are actually happening ‘at the chalk face’ (Brock-Utne, 1996:617). For ecological validity to be demonstrated it is important to include and address in the research as many characteristics in, and factors of, a given situation as possible. The difficulty for this is that the more characteristics are included and described, the more difficult it is to abide by central ethical tenets of much research—non-traceability, anonymity and non-identifiability.


A related type of validity is the emerging notion of cultural validity (Morgan, 1999). This is particularly an issue in cross-cultural, inter-cultural and comparative kinds of research, where the intention is to shape research so that it is appropriate to the culture of the researched. Cultural validity, Morgan (1999) suggests, applies at all stages of the research, and affects its planning, implementation and dissemination. It involves a degree of sensitivity to the participants, cultures and circumstances being studied.

Catalytic validity

Catalytic validity embraces the paradigm of critical theory discussed in Chapter 1. Put neutrally, catalytic validity simply strives to ensure that research leads to action. However, the story does not end there, for discussions of catalytic validity are substantive; like critical theory, catalytic validity suggests an agenda. Lather (1986, 1991) and Kincheloe and McLaren (1994) suggest that the agenda for catalytic validity is to help participants to understand their worlds in order to transform them. The agenda is explicitly political, for catalytic validity suggests the need to expose whose definitions of the situation are operating in the situation. Lincoln and Guba (1986) suggest that the criterion of ‘fairness’ should be applied to research, meaning that it should (a) augment and improve the participants’ experience of the world, and (b) improve the empowerment of the participants. In this respect the research might focus on what might be (the leading edge of innovations and future trends) and what could be (the ideal, possible futures) (Schofield, 1992:209).

Catalytic validity—a major feature in feminist research which, Usher (1996) suggests, needs to permeate all research—requires solidarity in the participants, an ability of the research to promote emancipation, autonomy and freedom within a just, egalitarian and democratic society (Masschelein, 1991), and to reveal the distortions, ideological deformations and limitations that reside in research, communication and social structures (see also LeCompte and Preissle, 1993). Validity, it is argued (Mishler, 1990; Scheurich, 1996), is no longer an ahistorical given, but contestable, suggesting that the definitions of valid research reside in the academic communities of the powerful. Lather (1986) calls for research to be emancipatory and to empower those who are being researched, suggesting that catalytic validity, akin to Freire’s notion of ‘conscientization’, should empower participants to understand and transform their oppressed situation.

Validity, it is proposed (Scheurich, 1996), is but a mask that in fact polices and sets boundaries to what is considered to be acceptable research by powerful research communities; discourses of validity, in fact, are discourses of power to define worthwhile knowledge. Valid research, if it is to meet the demands of catalytic validity, must demonstrate its ability to empower the researched as well as the researchers.

How defensible it is to suggest that researchers should have such ideological intents is, perhaps, a moot point, though not to address this area is to perpetuate inequality by omission and neglect. Catalytic validity reasserts the centrality of ethics in the research process, for it requires the researcher to interrogate her allegiances, responsibilities and self-interestedness (Burgess, 1989a).

Criterion-related validity

This form of validity endeavours to relate the results of one particular instrument to another external criterion. Within this type of validity there are two principal forms: predictive validity and concurrent validity.

Predictive validity is achieved if the data acquired at the first round of research correlate highly with data acquired at a future date. For example, if the results of examinations taken by 16-year-olds correlate highly with the examination results gained by the same students when aged 18, then we might wish to say that the first examination demonstrated strong predictive validity.
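A minimal sketch of such a check, using invented marks for ten students and scipy’s pearsonr for the correlation and its significance:

from scipy.stats import pearsonr

# Invented examination marks for the same ten students at ages 16 and 18.
marks_age_16 = [52, 67, 45, 78, 60, 71, 49, 83, 56, 64]
marks_age_18 = [55, 70, 48, 81, 58, 74, 50, 85, 59, 66]

# A high, statistically significant correlation between the two sets of
# results would support the predictive validity of the first examination.
r, p = pearsonr(marks_age_16, marks_age_18)
print(f"r = {r:.2f}, p = {p:.4f}")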

A variation on this theme is encountered in the notion of concurrent validity. To demonstrate this form of validity the data gathered from using one instrument must correlate highly with data gathered from using another instrument. For example, suppose I wished to research a student’s problem-solving ability. I might observe the student working on a problem, or I might talk to the student about how she is tackling the problem, or I might ask the student to write down how she tackled the problem. Here I have three different data-collecting instruments—observation, interview and documentation respectively. If the results all agreed—concurred—that, according to given criteria for problem-solving ability, the student demonstrated a good ability to solve a problem, then I would be able to say with greater confidence (validity) that the student was good at problem-solving than if I had arrived at that judgement simply from using one instrument.

Concurrent validity is very similar to its partner—predictive validity—in its core concept (i.e. agreement with a second measure); what differentiates concurrent and predictive validity is the absence of a time element in the former; concurrence can be demonstrated simultaneously with another instrument.

An important partner to concurrent validity, which is also a bridge into later discussions of reliability, is triangulation.

Triangulation

Triangulation may be defined as the use of two or more methods of data collection in the study of some aspect of human behaviour. It is a technique of research to which many subscribe in principle, but which only a minority use in practice. In its original and literal sense, triangulation is a technique of physical measurement: maritime navigators, military strategists and surveyors, for example, use (or used to use) several locational markers in their endeavours to pinpoint a single spot or objective. By analogy, triangular techniques in the social sciences attempt to map out, or explain more fully, the richness and complexity of human behaviour by studying it from more than one standpoint and, in so doing, by making use of both quantitative and qualitative data. Triangulation is a powerful way of demonstrating concurrent validity, particularly in qualitative research (Campbell and Fiske, 1959).

The advantages of the multimethod approach in social research are manifold, and we examine two of them. First, whereas the single observation in fields such as medicine, chemistry and physics normally yields sufficient and unambiguous information on selected phenomena, it provides only a limited view of the complexity of human behaviour and of situations in which human beings interact. It has been observed that, as research methods act as filters through which the environment is selectively experienced, they are never atheoretical or neutral in representing the world of experience (Smith, 1975). Exclusive reliance on one method, therefore, may bias or distort the researcher’s picture of the particular slice of reality she is investigating. She needs to be confident that the data generated are not simply artefacts of one specific method of collection (Lin, 1976). And this confidence can only be achieved, as far as normative research is concerned, when different methods of data collection yield substantially the same results. (Where triangulation is used in interpretive research to investigate different actors’ viewpoints, the same method, e.g. accounts, will naturally produce different sets of data.) Further, the more the methods contrast with each other, the greater the researcher’s confidence. If, for example, the outcomes of a questionnaire survey correspond to those of an observational study of the same phenomena, the more the researcher will be confident about the findings. Or, more extreme, where the results of a rigorous experimental investigation are replicated in, say, a role-playing exercise, the researcher will experience even greater assurance. If findings are artefacts of method, then the use of contrasting methods considerably reduces the chances that any consistent findings are attributable to similarities of method (Lin, 1976).

We come now to a second advantage: some theorists have been sharply critical of the limited use to which existing methods of inquiry in the social sciences have been put (Smith, 1975). The use of triangular techniques, it is argued, will help to overcome the problem of ‘method-boundedness’, as it has been termed. One of the earliest scientists to predict such a condition was Boring, who wrote:

as long as a new construct has only the single operational definition that it received at birth, it is just a construct. When it gets two alternative operational definitions, it is beginning to be validated. When the defining operations, because of proven correlations, are many, then it becomes reified.

(Boring, 1953)

In its use of multiple methods, triangulation may utilize either normative or interpretive techniques; or it may draw on methods from both these approaches and use them in combination.

Types of triangulation and their characteristics

We have just seen how triangulation is characterized by a multi-method approach to a problem in contrast to a single-method approach. Denzin (1970) has, however, extended this view of triangulation to take in several other types as well as the multi-method kind, which he terms ‘methodological triangulation’, including:

• time triangulation (expanded by Kirk and Miller (1986) to include diachronic reliability—stability over time—and synchronic reliability—similarity of data gathered at the same time);
• space triangulation;
• combined levels of triangulation (e.g. individual, group, organization, societal);
• theoretical triangulation (drawing on alternative theories);
• investigator triangulation (more than one observer);
• methodological triangulation (using the same method on different occasions or different methods on the same object of study).

The vast majority of studies in the social sciences are conducted at one point only in time, thereby ignoring the effects of social change and process. Time triangulation goes some way to rectifying these omissions by making use of cross-sectional and longitudinal approaches. Cross-sectional studies collect data concerned with time-related processes from different groups at one point in time; longitudinal studies collect data from the same group at different points in the time sequence. The use of panel studies and trend studies may also be mentioned in this connection. The former compare the same measurements for the same individuals in a sample at several different points in time; the latter examine selected processes continually over time. The weaknesses of each of these methods can be strengthened by using a combined approach to a given problem.

Space triangulation attempts to overcome the limitations of studies conducted within one culture or subculture. As one writer says, ‘Not only are the behavioural sciences culture-bound, they are sub-culture-bound. Yet many such scholarly works are written as if basic principles have been discovered which would hold true as tendencies in any society, anywhere, anytime’ (Smith, 1975). Cross-cultural studies may involve the testing of theories among different people, as in Piagetian and Freudian psychology; or they may measure differences between populations by using several different measuring instruments. Levine describes how he used this strategy of convergent validation in his comparative studies:

I have studied differences of achievement motivation among three Nigerian ethnic groups by the analysis of dream reports, written expressions of values, and public opinion survey data. The convergence of findings from the diverse set of data (and samples) strengthens my conviction…that the differences among the groups are not artifacts produced by measuring instruments.

(Levine, 1966)


Social scientists are concerned in their research with the individual, the group and society. These reflect the three levels of analysis adopted by researchers in their work. Those who are critical of much present-day research argue that some of it uses the wrong level of analysis, individual when it should be societal, for instance, or limits itself to one level only when a more meaningful picture would emerge by using more than one level. Smith extends this analysis and identifies seven possible levels: the aggregative or individual level, and six levels that are more global in that ‘they characterize the collective as a whole, and do not derive from an accumulation of individual characteristics’ (Smith, 1975). The six include:

• group analysis (the interaction patterns of individuals and groups);
• organizational units of analysis (units which have qualities not possessed by the individuals making them up);
• institutional analysis (relationships within and across the legal, political, economic and familial institutions of society);
• ecological analysis (concerned with spatial explanation);
• cultural analysis (concerned with the norms, values, practices, traditions and ideologies of a culture); and
• societal analysis (concerned with gross factors such as urbanization, industrialization, education, wealth, etc.).

Where possible, studies combining several levels of analysis are to be preferred.

Researchers are sometimes taken to task for their rigid adherence to one particular theory or theoretical orientation to the exclusion of competing theories. Thus, advocates of Piaget’s developmental theory of cognition rarely take into consideration Freud’s psychoanalytic theory of development in their work; and Gestaltists work without reference to S–R theorists. Few published works, as Smith (1975) points out, even go as far as to discuss alternative theories after a study in the light of methods used, much less consider alternatives prior to the research. As he recommends:

The investigator should be more active in designing his research so that competing theories can be tested. Research which tests competing theories will normally call for a wider range of research techniques than has historically been the case; this virtually assures more confidence in the data analysis since it is more oriented towards the testing of rival hypotheses.

(Smith, 1975)

Investigator triangulation refers to the use of more than one observer (or participant) in a research setting (Silverman, 1993:99). Observers and participants working on their own each have their own observational styles, and this is reflected in the resulting data. The careful use of two or more observers or participants independently, therefore, can lead to more valid and reliable data. Smith comments:

Perhaps the greatest use of investigator triangulation centres around validity rather than reliability checks. More to the point, investigators with differing perspectives or paradigmatic biases may be used to check out the extent of divergence in the data each collects. Under such conditions if data divergence is minimal then one may feel more confident in the data’s validity. On the other hand, if their data are significantly different, then one has an idea as to possible sources of biased measurement which should be further investigated.

(Smith, 1975)

In this respect the notion of triangulation bridges issues of reliability and validity. We have already considered methodological triangulation earlier. Denzin identifies two categories in his typology: ‘within methods’ triangulation and ‘between methods’ triangulation. Triangulation within methods concerns the replication of a study as a check on reliability and theory confirmation (see Smith, 1975). Triangulation between methods, as we have seen, involves the use of more than one method in the pursuit of a given objective. As a check on validity, the between-methods approach embraces the notion of convergence between independent measures of the same objective as has been defined by Campbell and Fiske (1959).

Of the six categories of triangulation in Denzin’s typology, four are frequently used in education. These are: time triangulation, with its longitudinal and cross-sectional studies; space triangulation, as on the occasions when a number of schools in an area or across the country are investigated in some way; investigator triangulation, as when two observers independently rate the same classroom phenomena; and methodological triangulation. Of these four, methodological triangulation is the one used most frequently and the one that possibly has the most to offer.

Triangular techniques are suitable when a more holistic view of educational outcomes is sought. An example of this can be found in Mortimore et al.’s (1988) search for school effectiveness.

Triangulation has special relevance where a complex phenomenon requires elucidation. Multiple methods are suitable where a controversial aspect of education needs to be evaluated more fully. Triangulation is useful when an established approach yields a limited and frequently distorted picture. Finally, triangulation can be a useful technique where a researcher is engaged in case study, a particular example of complex phenomena (Adelman et al., 1980). For an example of the use of triangular techniques in educational research we refer the reader to Blease and Cohen’s (1990) account of investigator triangulation and methodological triangulation.

Triangulation is not without its critics. For example, Silverman (1985) suggests that the very notion of triangulation is positivistic, and that this is exposed most clearly in data triangulation, as it is presumed that a multiple data source (concurrent validity) is superior to a single data source or instrument. The assumption that a single unit can always be measured more than once violates the interactionist principles of emergence, fluidity, uniqueness and specificity (Denzin, 1997:320). Further, Patton (1980) suggests that even having multiple data sources, particularly of qualitative data, does not ensure consistency or replication. Fielding and Fielding (1986) hold that methodological triangulation does not necessarily increase validity, reduce bias or bring objectivity to research.

With regard to investigator triangulation, Lincoln and Guba (1985:307) contend that it is erroneous to assume that one investigator will corroborate another, nor is this defensible, particularly in qualitative, reflexive inquiry. They extend their concern to include theory and methodological triangulation, arguing that the search for theory and methodological triangulation is epistemologically incoherent and empirically empty (see also Patton, 1980). No two theories, it is argued, will ever yield a sufficiently complete explanation of the phenomenon being researched.

These criticisms are trenchant, but they have been answered equally trenchantly by Denzin (1997).

Ensuring validity

It is very easy to slip into invalidity; it is both insidious and pernicious, as it can enter at every stage of a piece of research. The attempt to build out invalidity is essential if the researcher is to be able to have confidence in the elements of the research plan, data acquisition, data processing, analysis, interpretation and its ensuing judgement.

At the design stage threats to validity can be minimized by:

• choosing an appropriate time scale;
• ensuring that there are adequate resources for the required research to be undertaken;
• selecting an appropriate methodology for answering the research questions;
• selecting appropriate instrumentation for gathering the type of data required;
• using an appropriate sample (e.g. one which is representative, not too small or too large);
• demonstrating internal, external, content, concurrent and construct validity;
• ‘operationalizing’ the constructs fairly;
• ensuring reliability in terms of stability (consistency, equivalence, split-half analysis of test material);
• selecting appropriate foci to answer the research questions;
• devising and using appropriate instruments (for example, to catch accurate, representative, relevant and comprehensive data (King, Morris and Fitz-Gibbon, 1987)); ensuring that readability levels are appropriate; avoiding any ambiguity of instructions, terms and questions; using instruments that will catch the complexity of issues; avoiding leading questions; ensuring that the level of test is appropriate—e.g. neither too easy nor too difficult; avoiding test items with little discriminability; avoiding making the instruments too short or too long; avoiding too many or too few items for each issue;
• avoiding a biased choice of researcher or research team (e.g. insiders or outsiders as researchers).

There are several areas where invalidity or bias might creep into the research at the stage of data gathering; these can be minimized by:

• reducing the Hawthorne effect;
• minimizing reactivity effects (respondents behaving differently when subjected to scrutiny or being placed in new situations, for example, the interview situation—we distort people’s lives in the way we go about studying them (Lave and Kvale, 1995:226));
• trying to avoid dropout rates amongst respondents;
• taking steps to avoid non-return of questionnaires;
• avoiding having too long or too short an interval between pretests and post-tests;
• ensuring inter-rater reliability;
• matching control and experimental groups fairly;
• ensuring standardized procedures for gathering data or for administering tests;
• building on the motivations of the respondents;
• tailoring the instruments to the concentration span of the respondents and addressing other situational factors (e.g. health, environment, noise, distraction, threat);
• addressing factors concerning the researcher (particularly in an interview situation), for example the attitude, gender, race, age, personality, dress, comments, replies, questioning technique, behaviour, style and non-verbal communication of the researcher.

At the stage of data analysis there are several areas where invalidity lurks; these might be minimized by:

• using respondent validation;
• avoiding subjective interpretation of data (e.g. being too generous or too ungenerous in the award of marks), i.e. lack of standardization and moderation of results;
• reducing the halo effect, where the researcher’s knowledge of the person or knowledge of other data about the person or situation exerts an influence on subsequent judgements;
• using appropriate statistical treatments for the level of data (e.g. avoiding applying techniques from interval scaling to ordinal data or using incorrect statistics for the type, size, complexity, sensitivity of data);
• recognizing spurious correlations and extraneous factors which may be affecting the data (i.e. tunnel vision);
• avoiding poor coding of qualitative data;
• avoiding making inferences and generalizations beyond the capability of the data to support such statements;
• avoiding the equating of correlations and causes;
• avoiding selective use of data;
• avoiding unfair aggregation of data (particularly of frequency tables);
• avoiding unfair telescoping of data (degrading the data);
• avoiding Type I and/or Type II errors.

A Type I error is committed where the researcher rejects the null hypothesis when it is in fact true (akin to convicting an innocent person (Mitchell and Jolley, 1988:121)); this can be addressed by setting a more rigorous level of significance (e.g. p<0.01 rather than p<0.05). A Type II error is committed where the null hypothesis is accepted when it is in fact not true (akin to finding a guilty person innocent (Mitchell and Jolley, ibid.)). Boruch (1997:211) suggests that a Type II error may occur if: (a) the measurement of a response to the intervention is insufficiently valid; (b) the measurement of the intervention is insufficiently relevant; (c) the statistical power of the experiment is too low; (d) the wrong population was selected for the intervention.

A Type II error can be addressed by relaxing the level of significance (e.g. p<0.20 or p<0.30 rather than p<0.05). Of course, the more one reduces the chance of a Type I error the more chance there is of committing a Type II error, and vice versa. In qualitative data a Type I error is committed when a statement is believed when it is, in fact, not true, and a Type II error is committed when a statement is rejected when it is in fact true.
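The trade-off can be seen in a small simulation, assuming normally distributed test scores; the group sizes, means and significance levels below are illustrative only:

import numpy as np
from scipy.stats import ttest_ind

rng = np.random.default_rng(42)
N_TRIALS = 2000

def error_rates(alpha):
    # Estimate Type I and Type II error rates at a given significance level.
    type_1 = type_2 = 0
    for _ in range(N_TRIALS):
        # Null hypothesis true: both groups drawn from the same population.
        a = rng.normal(50, 10, 30)
        b = rng.normal(50, 10, 30)
        if ttest_ind(a, b).pvalue < alpha:
            type_1 += 1  # rejected a true null hypothesis
        # Null hypothesis false: the second group really does score higher.
        c = rng.normal(50, 10, 30)
        d = rng.normal(55, 10, 30)
        if ttest_ind(c, d).pvalue >= alpha:
            type_2 += 1  # failed to reject a false null hypothesis
    return type_1 / N_TRIALS, type_2 / N_TRIALS

# A stricter level cuts Type I errors but inflates Type II errors.
for alpha in (0.05, 0.01):
    t1, t2 = error_rates(alpha)
    print(f"alpha = {alpha}: Type I ~ {t1:.3f}, Type II ~ {t2:.3f}")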

At the stage of data reporting invalidity can show itself in several ways; the researcher must take steps to minimize this by, for example:

• avoiding using data very selectively and unrepresentatively (for example, accentuating the positive and neglecting or ignoring the negative);
• indicating the context and parameters of the research in the data collection and treatment, the degree of confidence which can be placed in the results, and the degree of context-freedom or context-boundedness of the data (i.e. the level to which the results can be generalized);
• presenting the data without misrepresenting their message;
• making claims which are sustainable by the data;
• avoiding inaccurate or wrong reporting of data (i.e. technical errors or orthographic errors);
• ensuring that the research questions are answered;
• releasing research results neither too soon nor too late.

Having identified the realms in which invalidity lurks, the researcher can take steps to ensure that, as far as possible, invalidity has been minimized in all areas of the research.1

Defining reliability

Reliability in quantitative research

Reliability is essentially a synonym for consistency and replicability over time, over instruments and over groups of respondents. It is concerned with precision and accuracy; some features, e.g. height, can be measured precisely, whilst others, e.g. musical ability, cannot. For research to be reliable it must demonstrate that if it were to be carried out on a similar group of respondents in a similar context (however defined), then similar results would be found. There are three principal types of reliability: stability, equivalence and internal consistency.

Reliability as stability

In this form reliability is a measure of consistency over time and over similar samples. A reliable instrument for a piece of research will yield similar data from similar respondents over time. A leaking tap which each day leaks one litre is leaking reliably, whereas a tap which leaks one litre some days and two litres on others is not. In the experimental and survey models of research this would mean that if a test and then a re-test were undertaken within an appropriate time span, then similar results would be obtained. The researcher has to decide what an appropriate length of time is: too short a time and respondents may remember what they said or did in the first test situation, too long a time and there may be extraneous effects operating to distort the data (for example, maturation in students, outside influences on the students). A researcher seeking to demonstrate this type of reliability will have to choose an appropriate time scale between the test and re-test. Correlation coefficients can be calculated for the reliability of pre- and post-tests, using formulae which are readily available in books on statistics and test construction.


In addition to stability over time, reliability as stability can also be stability over a similar sample. For example, we would assume that if we were to administer a test or a questionnaire simultaneously to two groups of students who were very closely matched on significant characteristics (e.g. age, gender, ability, etc.—whatever characteristics are deemed to have a significant bearing on the responses), then similar results (on a test) or responses (to a questionnaire) would be obtained. The correlation coefficient on this method can be calculated either for the whole test (e.g. by using the Pearson statistic) or for sections of the questionnaire (e.g. by using the Spearman or Pearson statistic as appropriate). The statistical significance of the correlation coefficient can be found; it should reach significance at the 0.05 level or better if reliability is to be assumed. This form of reliability over a sample is particularly useful in piloting tests and questionnaires.
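A minimal sketch of such a calculation with scipy, here applied to invented scores from a test and re-test of the same ten students (the same functions would serve for two matched groups):

from scipy.stats import pearsonr, spearmanr

# Invented whole-test scores from the same test administered twice,
# two weeks apart, to the same ten students.
test = [14, 18, 11, 20, 16, 9, 17, 13, 15, 19]
retest = [15, 17, 12, 19, 16, 10, 18, 12, 14, 20]

r, p = pearsonr(test, retest)         # Pearson, e.g. for the whole test
rho, p_rho = spearmanr(test, retest)  # Spearman, the rank-order alternative

print(f"Pearson r = {r:.2f} (p = {p:.4f})")
print(f"Spearman rho = {rho:.2f} (p = {p_rho:.4f})")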

Reliability as equivalence

Within this type of reliability there are two main sorts. Reliability may be achieved, first, through using equivalent forms (also known as alternative forms) of a test or data-gathering instrument. If an equivalent form of the test or instrument is devised and yields similar results, then the instrument can be said to demonstrate this form of reliability. For example, the pretest and post-test in the experimental model of evaluation are predicated on this type of reliability, being alternate forms of instrument to measure the same issues. This type of reliability might also be demonstrated if the equivalent forms of a test or other instrument yield consistent results if applied simultaneously to matched samples (e.g. a control and experimental group or two random stratified samples in a survey). Here reliability can be measured through a t-test, through the demonstration of a high correlation coefficient and through the demonstration of similar means and standard deviations between two groups; a short sketch of such a check follows the next paragraph.

Secondly, reliability as equivalence may be achieved through inter-rater reliability. If more than one researcher is taking part in a piece of research then, human judgement being fallible, agreement between all researchers must be achieved through ensuring that each researcher enters data in the same way. This would be particularly pertinent to a team of researchers gathering structured observational or semi-structured interview data, where each member of the team would have to agree on which data would be entered in which categories. For observational data, reliability is addressed in the training sessions for researchers, where they work on video material to ensure parity in how they enter the data.
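A minimal sketch of the equivalent-forms check signposted above, on invented scores from two groups of ten students, matched student-for-student, sitting Form A and Form B respectively:

import numpy as np
from scipy.stats import pearsonr, ttest_ind

# Invented scores on two equivalent forms of a test.
form_a = np.array([61, 74, 58, 80, 67, 72, 55, 78, 64, 70])
form_b = np.array([63, 72, 60, 79, 66, 74, 57, 76, 65, 69])

# Similar means and standard deviations support equivalence...
print(f"Form A: mean = {form_a.mean():.1f}, sd = {form_a.std(ddof=1):.1f}")
print(f"Form B: mean = {form_b.mean():.1f}, sd = {form_b.std(ddof=1):.1f}")

# ...as do a non-significant t-test between the two sets of scores
# and a high correlation between the matched pairs.
t, p = ttest_ind(form_a, form_b)
r, _ = pearsonr(form_a, form_b)
print(f"t = {t:.2f} (p = {p:.3f}), r = {r:.2f}")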

Reliability as internal consistency

Whereas the test/re-test method and the equivalent forms method of demonstrating reliability require the tests or instruments to be done twice, demonstrating internal consistency demands that the instrument or tests be run once only, through the split-half method.

Let us imagine that a test is to be administered to a group of students. Here the test items are divided into two halves, ensuring that each half is matched in terms of item difficulty and content. Each half is marked separately. If the test is to demonstrate split-half reliability, then the marks obtained on each half should be correlated highly with the other. Any student’s marks on the one half should match his or her marks on the other half. This can be calculated using the Spearman–Brown formula:

reliability = 2r / (1 + r)

where r = the actual correlation between the halves of the instrument.

This calculation requires a correlation coefficient to be calculated, e.g. a Spearman rank order correlation or a Pearson product moment correlation.

Let us say that the correlation coefficient between the two halves is 0.85; in this case the reliability is calculated thus:

reliability = (2 × 0.85) / (1 + 0.85) = 1.70 / 1.85 = 0.92


Given that the maximum value of the coefficient is 1.00, we can see that the reliability of this instrument, calculated for the split-half form of reliability, is very high indeed.

This type of reliability assumes that the test administered can be split into two matched halves; many tests have a gradient of difficulty or different items of content in each half. If this is the case and, for example, the test contains twenty items, then the researcher, instead of splitting the test into two by assigning items one to ten to one half and items eleven to twenty to the second half, may assign all the even-numbered items to one group and all the odd-numbered items to another. This would move towards the two halves being matched in terms of content and cumulative degrees of difficulty.
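A minimal sketch of this odd/even split and the Spearman–Brown correction, on an invented matrix of thirty students’ right/wrong scores over twenty items:

import numpy as np

rng = np.random.default_rng(7)

# Invented item-level data: 30 students, 20 items (1 = correct).
# Each student's chance of success rises with an underlying 'ability',
# so the two halves of the test should correlate positively.
ability = rng.uniform(0.2, 0.9, size=(30, 1))
scores = (rng.random((30, 20)) < ability).astype(int)

# Odd/even split: 1st, 3rd, 5th ... items vs. 2nd, 4th, 6th ...
odd_half = scores[:, 0::2].sum(axis=1)
even_half = scores[:, 1::2].sum(axis=1)

# Correlation between the two half-test totals, then the
# Spearman-Brown correction: reliability = 2r / (1 + r).
r = np.corrcoef(odd_half, even_half)[0, 1]
reliability = 2 * r / (1 + r)
print(f"half-test r = {r:.2f}, split-half reliability = {reliability:.2f}")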

Reliability, thus construed, makes several assumptions, for example that instrumentation, data and findings should be controllable, predictable, consistent and replicable. This presupposes a particular style of research, typically within the positivist paradigm.

Reliability in qualitative research

LeCompte and Preissle (1993:332) suggest that the canons of reliability for quantitative research may be simply unworkable for qualitative research. Quantitative research assumes the possibility of replication; if the same methods are used with the same sample then the results should be the same. Typically, quantitative methods require a degree of control and manipulation of phenomena. This distorts the natural occurrence of phenomena (see the earlier discussion of ecological validity). Indeed the premises of naturalistic studies include the uniqueness and idiosyncrasy of situations, such that the study cannot be replicated—that is their strength rather than their weakness.

On the other hand, this is not to say that qualitative research need not strive for replication in generating, refining, comparing and validating constructs. Indeed LeCompte and Preissle (ibid.:334) argue that such replication might include repeating:

• the status position of the researcher;
• the choice of informants/respondents;
• the social situations and conditions;
• the analytic constructs and premises that are used;
• the methods of data collection and analysis.

Further, Denzin and Lincoln (1994) suggest that reliability as replicability in qualitative research can be addressed in several ways:

• stability of observations (whether the researcher would have made the same observations and interpretations of these if they had been observed at a different time or in a different place);
• parallel forms (whether the researcher would have made the same observations and interpretations of what had been seen if she had paid attention to other phenomena during the observation);
• inter-rater reliability (whether another observer with the same theoretical framework and observing the same phenomena would have interpreted them in the same way).

Clearly this is a contentious issue, for it is seeking to apply to qualitative research the canons of reliability of quantitative research. Purists might argue against the legitimacy, relevance or need for this in qualitative studies.

In qualitative research reliability can be regarded as a fit between what researchers record as data and what actually occurs in the natural setting that is being researched, i.e. a degree of accuracy and comprehensiveness of coverage (Bogdan and Biklen, 1992:48). This is not to strive for uniformity; two researchers who are studying a single setting may come up with very different findings, but both sets of findings might be reliable. Indeed Kvale (1996:181) suggests that, in interviewing, there might be as many different interpretations of the qualitative data as there are researchers. A clear example of this is the study of the Nissan automobile factory in the UK, where Wickens (1987) found a ‘virtuous circle’ of work organization practices that demonstrated flexibility, teamwork and quality consciousness, whereas the same practices were investigated by Garrahan and Stewart (1992), who found a ‘vicious circle’ of exploitation, surveillance and control respectively. Both versions of the same reality co-exist because reality is multi-layered. What is being argued for here is the notion of reliability through an eclectic use of instruments, researchers, perspectives and interpretations (echoing the comments earlier about triangulation) (see also Eisenhart and Howe, 1992).

Brock-Utne (1996) argues that qualitative research, being holistic, strives to record the multiple interpretations of, intentions in and meanings given to situations and events. Here the notion of reliability is construed as dependability (Guba and Lincoln, 1985:108–9), recalling the earlier discussion on internal validity. For them, dependability involves member checks (respondent validation), debriefing by peers, triangulation, prolonged engagement in the field, persistent observations in the field, reflexive journals, and independent audits (identifying acceptable processes of conducting the inquiry so that the results are consistent with the data). Audit trails enable the research to address the issue of confirmability of results. These, argue the authors (ibid.:289), are a safeguard against the charge levelled against qualitative researchers, viz. that they respond only to the ‘loudest bangs or the brightest lights’.

Dependability raises the important issue of respondent validation (see also McCormick and James, 1988). Whilst dependability might suggest that researchers need to go back to respondents to check that their findings are dependable, researchers also need to be cautious in placing exclusive store on respondents, for, as Hammersley and Atkinson (1983) suggest, they are not in a privileged position to be sole commentators on their actions.

Bloor (1978) suggests three means by which respondent validation can be addressed:

• researchers attempt to predict what the participants’ classifications of situations will be;
• researchers prepare hypothetical cases and then predict respondents’ likely responses to them;
• researchers take back their research report to the respondents and record their reactions to that report.

The argument rehearses the paradigm wars discussed in the opening chapter: quantitative measures are criticized for combining sophistication and refinement of process with crudity of concept (Ruddock, 1981) and for failing to distinguish between educational and statistical significance (Eisner, 1985); qualitative methodologies, whilst possessing immediacy, flexibility, authenticity, richness and candour, are criticized for being impressionistic, biased, commonplace, insignificant, ungeneralizable, idiosyncratic, subjective and short-sighted (Ruddock, 1981). This is an arid debate; rather, the issue is one of fitness for purpose. For our purposes here we need to note that criteria of reliability in quantitative methodologies differ from those in qualitative methodologies. In qualitative methodologies reliability includes fidelity to real life, context- and situation-specificity, authenticity, comprehensiveness, detail, honesty, depth of response and meaningfulness to the respondents.

Validity and reliability in interviews

Studies reported by Cannell and Kahn (1968), in which the interview was used, seemed to indicate that validity was a persistent problem. In one such study, subjects interviewed on the existence and state of their bank accounts often presented a misleading picture: fewer accounts were reported than actually existed, and the amounts declared frequently differed from bank records, often in the direction of understating assets. The cause of invalidity, it is argued, is bias, defined as ‘a systematic or persistent tendency to make errors in the same direction, that is, to overstate or understate the “true value” of an attribute’ (Lansing, Ginsberg and Braaten, 1961).


The problem, it seems, is not limited to a narrow range of data but is widespread. One way of validating interview measures is to compare the interview measure with another measure that has already been shown to be valid. This kind of comparison is known as ‘convergent validity’. If the two measures agree, it can be assumed that the validity of the interview is comparable with the proven validity of the other measure.
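Convergent validity is usually quantified as a correlation between the two measures. The text does not prescribe a statistic, so the following is only a minimal sketch, computing a Pearson correlation between an interview-based measure and an already-validated measure; all scores below are wholly hypothetical.

```python
# Convergent validity as a correlation between an interview-based measure
# and an already-validated measure of the same attribute.
# A minimal sketch; the data are illustrative, not from the text.
import math

def pearson_r(x, y):
    """Pearson product-moment correlation between two paired lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Hypothetical paired scores for ten respondents.
interview_scores = [12, 15, 11, 18, 14, 20, 9, 16, 13, 17]
validated_scores = [14, 16, 10, 19, 15, 21, 8, 15, 12, 18]

r = pearson_r(interview_scores, validated_scores)
print(f"convergent validity (Pearson r) = {r:.2f}")  # a high r -> the measures agree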

Perhaps the most practical way of achieving greater validity is to minimize the amount of bias as much as possible. The sources of bias are the characteristics of the interviewer, the characteristics of the respondent, and the substantive content of the questions. More particularly, these will include:

• the attitudes, opinions, and expectations of the interviewer;
• a tendency for the interviewer to see the respondent in her own image;
• a tendency for the interviewer to seek answers that support her preconceived notions;
• misperceptions on the part of the interviewer of what the respondent is saying;
• misunderstandings on the part of the respondent of what is being asked.

Studies have also shown that race, religion, gender, sexual orientation, status, social class and age in certain contexts can be potent sources of bias, i.e. interviewer effects (Lee, 1993; Scheurich, 1995). Interviewers and interviewees alike bring their own, often unconscious, experiential and biographical baggage with them into the interview situation. Indeed Hitchcock and Hughes (1989) argue that because interviews are interpersonal, humans interacting with humans, it is inevitable that the researcher will have some influence on the interviewee and, thereby, on the data. Fielding and Fielding (1986:12) make the telling comment that even the most sophisticated surveys only manipulate data that at some time had to be gained by asking people! Interviewer neutrality is a chimera (Denscombe, 1995).

Lee (1993) indicates the problems of conducting interviews perhaps at their sharpest, where the researcher is researching sensitive subjects, i.e. research that might pose a significant threat to those involved (be they interviewers or interviewees). Here the interview might be seen as an intrusion into private worlds, or the interviewer might be regarded as someone who can impose sanctions on the interviewee, or as someone who can exploit the powerless; the interviewee is in the searchlight that is being held by the interviewer (see also Scheurich, 1995). The issues also embrace transference and countertransference, which have their basis in psychoanalysis. In transference the interviewees project onto the interviewer their feelings, fears, desires, needs and attitudes that derive from their own experiences (Scheurich, 1995). In countertransference the process is reversed.

One way of controlling for reliability is to have a highly structured interview, with the same format and sequence of words and questions for each respondent (Silverman, 1993), though Scheurich (1995:241–9) suggests that this is to misread the infinite complexity and open-endedness of social interaction: controlling the wording is no guarantee of controlling the interview. Oppenheim (1992:147) argues that wording is a particularly important factor in attitudinal questions rather than factual questions. He suggests that changes in wording, context and emphasis undermine reliability, because it ceases to be the same question for each respondent. Indeed he argues that error and bias can stem from alterations to wording, procedure, sequence, recording, rapport, and that training for interviewers is essential to minimize this. Silverman (1993) suggests that it is important for each interviewee to understand the question in the same way. He suggests that the reliability of interviews can be enhanced by: careful piloting of interview schedules; training of interviewers; inter-rater reliability in the coding of responses; and the extended use of closed questions.
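Inter-rater reliability in the coding of responses is commonly checked with a chance-corrected agreement statistic. The text does not name one; Cohen’s kappa is a standard choice, and the sketch below (with hypothetical codings from two coders) shows the calculation under that assumption.

```python
# Inter-rater reliability for coded interview responses using Cohen's kappa,
# which discounts the agreement two coders would reach by chance alone.
# A sketch only; the codings below are hypothetical.
from collections import Counter

def cohens_kappa(codes_a, codes_b):
    n = len(codes_a)
    # Observed agreement: proportion of responses given the same code.
    observed = sum(a == b for a, b in zip(codes_a, codes_b)) / n
    freq_a = Counter(codes_a)
    freq_b = Counter(codes_b)
    # Chance agreement: probability both coders pick each category independently.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

coder_1 = ["pos", "neg", "pos", "neutral", "pos", "neg", "neutral", "pos"]
coder_2 = ["pos", "neg", "neutral", "neutral", "pos", "neg", "pos", "pos"]

print(f"Cohen's kappa = {cohens_kappa(coder_1, coder_2):.2f}")  # 0.60 here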

On the other hand Silverman (1993) argues for the importance of open-ended interviews, as this enables respondents to demonstrate their unique way of looking at the world—their definition of the situation. It recognizes that what is a suitable sequence of questions for one respondent might be less suitable for another, and open-ended questions enable important but unanticipated issues to be raised.

Oppenheim (1992:96–7) suggests several causes of bias in interviewing:

• biased sampling (sometimes created by the researcher not adhering to sampling instructions);
• poor rapport between interviewer and interviewee;
• changes to question wording (e.g. in attitudinal and factual questions);
• poor prompting and biased probing;
• poor use and management of support materials (e.g. show cards);
• alterations to the sequence of questions;
• inconsistent coding of responses;
• selective or interpreted recording of data/transcripts;
• poor handling of difficult interviews.

There is also the issue of leading questions. A leading question is one which makes assumptions about interviewees or ‘puts words into their mouths’, i.e. where the question influences the answer perhaps illegitimately. For example (Morrison, 1993:66–7) the question ‘when did you stop complaining to the headteacher?’ assumes that the interviewee had been a frequent complainer, and the question ‘how satisfied are you with the new Mathematics scheme?’ assumes a degree of satisfaction with the scheme. The leading questions here might be rendered less leading by rephrasing, for example: ‘how frequently do you have conversations with the headteacher?’ and ‘what is your opinion of the new Mathematics scheme?’ respectively.

In discussing the issue of leading questions we are not necessarily suggesting that there is not a place for them. Indeed Kvale (1996:158) makes a powerful case for leading questions, arguing that they may be necessary in order to obtain information that the interviewer suspects the interviewee might be withholding. Here it might be important to put the ‘burden of denial’ onto the interviewee (e.g. ‘when did you last stop beating your wife?’). Leading questions, frequently used in police interviews, may be used for reliability checks with what the interviewee has already said, or may be deliberately used to elicit particular non-verbal behaviours that give an indication of the sensitivity of the interviewee’s remarks.

Hence reducing bias becomes more than simply: careful formulation of questions so that the meaning is crystal clear; thorough training procedures so that an interviewer is more aware of the possible problems; probability sampling of respondents; and sometimes matching interviewer characteristics with those of the sample being interviewed. Oppenheim (1992:148) argues, for example, that interviewers seeking attitudinal responses have to ensure that people with known characteristics are included in the sample—the criterion group. We need to recognize that the interview is a shared, negotiated and dynamic social moment.

The notion of power is significant in the interview situation, for the interview is not simply a data collection situation but a social and frequently a political situation. Power can reside with interviewer and interviewee alike, though Scheurich (1995:246) argues that, typically, more power resides with the interviewer: the interviewer generates the questions and the interviewee answers them; the interviewee is under scrutiny whilst the interviewer is not. This view is supported by Kvale (1996:126), who suggests that there are definite asymmetries of power as the interviewer tends to define the situation, the topics, and the course of the interview.

Cassell (in Lee, 1993) suggests that elites and powerful people might feel demeaned or insulted when being interviewed by those with a lower status or less power. Further, those with power, resources and expertise might be anxious to maintain their reputation, and so will be more guarded in what they say, wrapping this up in well-chosen, articulate phrases. Lee (1993) comments on the asymmetries of power in several interview situations, with one party having more power and control over the interview than the other. Interviewers need to be aware of the potentially distorting effects of power, a significant feature of critical theory, as discussed in the opening chapter.

Neal (1995) draws attention to the feelings of powerlessness and anxieties about physical presentation and status on the part of interviewers when interviewing powerful people. This is particularly so for frequently lone, low status research students interviewing powerful people; a low status female research student might find that an interview with a male in a position of power (e.g. a university vice-chancellor, a dean or a senior manager) might turn out to be very different from an interview with the same person if conducted by a male university professor, where it is perceived by the interviewee to be more of a dialogue between equals (see also Gewirtz and Ozga, 1993, 1994). Ball (1994b) comments that, when powerful people are being interviewed, interviews must be seen as an extension of the ‘play of power’—with its game-like connotations. He suggests that powerful people control the agenda and course of the interview, and are usually very adept at this because they have both a personal and professional investment in being interviewed (see also Batteson and Ball, 1995; Phillips, 1998).

The effect of power can be felt even before the interview commences, notes Neal (1995), where she instances being kept waiting, and subsequently being interrupted, being patronized, and being interviewed by the interviewee (see also Walford, 1994). Indeed Scheurich (1995) suggests that many powerful interviewees will rephrase or not answer the question. Connell, Ashenden, Kessler and Dowsett (in Limerick, Burgess-Limerick and Grace (1996)) argue that a working-class female talking with a multinational director will be very different from a middle-class professor talking to the same person. Limerick, Burgess-Limerick and Grace (1996) comment on occasions where interviewers have felt themselves to be passive, vulnerable, helpless and indeed manipulated. One way of overcoming this is to have two interviewers conducting each interview (Walford, 1994:227). On the other hand, Hitchcock and Hughes (1989) observe that if the researchers are known to the interviewees and they are peers, however powerful, then a degree of reciprocity might be taking place, with interviewees giving answers that they think the researchers might want to hear.

The issue of power has not been lost on feminist research; that is, research that emphasizes subjectivity, equality, reciprocity, collaboration, non-hierarchical relations and emancipatory potential (catalytic validity) (Neal, 1995), echoing the comments about research that is influenced by the paradigm of critical theory. Here feminist research addresses a dilemma of interviews that are constructed in the dominant, male paradigm of pitching questions that demand answers from a passive respondent. Limerick, Burgess-Limerick and Grace (1996) suggest that, in fact, it is wiser to regard the interview as a gift, as interviewees have the power to withhold information, to choose the location of the interview, to choose how seriously to attend to the interview, how long it will last, when it will take place, what will be discussed—and in what and whose terms—what knowledge is important, even how the data will be analysed and used. Echoing Foucault, they argue that power is fluid and is discursively constructed through the interview rather than being the province of either party.

Miller and Cannell (1997) identify some particular problems in conducting telephone interviews, where the reduction of the interview situation to just auditory sensory cues can be particularly problematical. There are sampling problems, as not everyone will have a telephone. Further, there are practical issues, for example, the interviewee can only retain a certain amount of information in her/his short-term memory, so bombarding the interviewee with too many choices (the non-written form of ‘show cards’ of possible responses) becomes unworkable. Hence the reliability of responses is subject to the memory capabilities of the interviewee—how many scale points and descriptors, for example, can an interviewee retain in her head about a single item? Further, the absence of non-verbal cues is significant, e.g. facial expression, gestures, posture, the significance of silences and pauses (Robinson, 1982), as interviewees may be unclear about the meaning behind words and statements. This problem is compounded if the interviewer is unknown to the interviewee.

Miller and Cannell present important research evidence to support the significance of the non-verbal mediation of verbal dialogue. As was discussed earlier, the interview is a social situation; in telephone interviews the absence of essential social elements could undermine the salient conduct of the interview, and hence its reliability and validity. Non-verbal paralinguistic cues affect the conduct, pacing, and relationships in the interview and the support, threat, and confidence felt by the interviewees. Telephone interviews can easily slide into becoming mechanical and cold.

Further, telephone interviewing is becoming increasingly used by general medical practitioners (the practice of ‘triaging’). Here the problem of loss of non-verbal cues is compounded by the asymmetries of power that often exist between doctor and patient. This contains a useful lesson for telephone interviews in educational research—the issue of power is itself a cogent mediating influence between researcher and researched; the interviewer will need to take immediate steps to address these issues (e.g. by putting interviewees at their ease).

On the other hand, Nias (1991) and Miller and Cannell (1997) suggest that the very fact that interviews are not face-to-face may strengthen their reliability, as the interviewee might disclose information that may not be so readily forthcoming in a face-to-face, more intimate situation. Hence, telephone interviews have their strengths and weaknesses, and their use should be governed by the criterion of fitness-for-purpose. They tend to be shorter, more focused and useful for contacting busy people (Harvey, 1988; Miller, 1995).

In his critique of the interview as a research tool, Kitwood draws attention to the conflict it generates between the traditional concepts of validity and reliability. Where increased reliability of the interview is brought about by greater control of its elements, this is achieved, he argues, at the cost of reduced validity. He explains:

In proportion to the extent to which ‘reliability’ is enhanced by rationalization, ‘validity’ would decrease. For the main purpose of using an interview in research is that it is believed that in an interpersonal encounter people are more likely to disclose aspects of themselves, their thoughts, their feelings and values, than they would in a less human situation. At least for some purposes, it is necessary to generate a kind of conversation in which the ‘respondent’ feels at ease. In other words, the distinctively human element in the interview is necessary to its ‘validity’.

(Kitwood, 1977)

Kitwood suggests that a solution to the problem of validity and reliability might lie in the direction of a ‘judicious compromise’.

A cluster of problems surround the person being interviewed. Tuckman (1972), for example, has observed that when formulating her questions an interviewer has to consider the extent to which a question might influence the respondent to show herself in a good light; or the extent to which a question might influence the respondent to be unduly helpful by attempting to anticipate what the interviewer wants to hear; or the extent to which a question might be asking for information about a respondent that she is not certain or likely to know herself. Further, interviewing procedures are based on the assumption that the person interviewed has insight into the cause of her behaviour. It has now come to be realized that insight of this kind is rarely achieved and that when it is, it is after long and difficult effort, usually in the context of repeated clinical interviews.

In educational circles interviewing might be a particular problem in working with children. Simons (1982) and McCormick and James (1988) comment on particular problems involved in interviewing children, for example:

• establishing trust;
• overcoming reticence;
• maintaining informality;
• avoiding assuming that children ‘know the answers’;
• overcoming the problems of inarticulate children;
• pitching the question at the right level;
• choice of vocabulary;
• non-verbal cues;
• moving beyond the institutional response or receiving what children think the interviewer wants to hear;
• avoiding the interviewer being seen as an authority, spy or plant;
• keeping to the point;
• breaking silences on taboo areas and those which are reinforced by peer-group pressure;
• children being seen as of lesser importance than adults (maybe in the sequence in which interviews are conducted, e.g. the headteacher, then the teaching staff, then the children).

These are not new matters. The studies by Labov in the 1970s showed how students reacted very strongly to contextual matters in an interview situation (Labov, 1969). The language of children varied according to the ethnicity of the interviewer, the friendliness of the surroundings, the opportunity for the children to be interviewed with friends, the ease with which the scene was set for the interview, the demeanour of the adult (e.g. whether the adult was standing or sitting), and the nature of the topics covered. The differences were significant, varying from monosyllabic responses by children in unfamiliar and uncongenial surroundings to extended responses in the more congenial and less threatening surroundings—more sympathetic to the children’s everyday world. The language, argot and jargon (Edwards, 1976), social and cultural factors of the interviewer and interviewee all exert a powerful influence on the interview situation.

The issue is also raised here (Lee, 1993) of whether there should be a single interview that maintains the detachment of the researcher (perhaps particularly useful in addressing sensitive topics), or whether there should be repeated interviews to gain depth and to show fidelity to the collaborative nature of research (a feature, as was noted above, which is significant for feminist research (Oakley, 1981)).

Kvale (1996:148–9) sets out a range of qualifications for an effective interviewer, that she should be:

• knowledgeable (of the subject matter so that an informed conversation can be held);
• structuring (making clear the purpose, conduct, completion of the interview);
• clear (in choice of language, in presentation of subject matter);
• gentle (enabling subjects to say what they want to say in its entirety and in their own time and way);
• sensitive (employing empathic, active listening, taking account of non-verbal communication and how something is said);
• open (sensitive to which aspects of the interview are significant for the interviewee);
• steering (keeping to the point);
• critical (questioning to check the reliability, consistency and validity of what is being said);
• remembering (recalling earlier statements and relating to them during the interview);
• interpreting (clarifying, confirming and disconfirming the interviewee’s statements with the interviewee).

Walford (1994:225) adds to this the need for the interviewer to have done her homework when interviewing powerful people, as such people could well interrogate the interviewer—they will assume up-to-dateness, competence and knowledge in the interviewer. Powerful interviewees are usually busy people and will expect the interviewer to have read the material that is in the public domain.

The issues of reliability do not reside solely in the preparations for and conduct of the interview; they extend to the ways in which interviews are analysed. For example, Lee (1993) and Kvale (1996:163) comment on the issue of ‘transcriber selectivity’. Here transcripts of interviews, however detailed and full they might be, remain selective, since they are interpretations of social situations. They become decontextualized, abstracted, even if they record silences, intonation, non-verbal behaviour etc. The issue, then, is how useful they are to researchers overall rather than whether they are completely reliable.

One of the problems that has to be considered when open-ended questions are used in the interview is that of developing a satisfactory method of recording replies. One way is to summarize responses in the course of the interview. This has the disadvantage of breaking the continuity of the interview and may result in bias because the interviewer may unconsciously emphasize responses that agree with her expectations and fail to note those that do not. It is sometimes possible to summarize an individual’s responses at the end of the interview. Although this preserves the continuity of the interview, it is likely to induce greater bias because the delay may lead to the interviewer forgetting some of the details. It is these forgotten details that are most likely to be the ones that disagree with her own expectations.

Validity and reliability in experiments

As we have seen, the fundamental purpose of experimental design is to impose control over conditions that would otherwise cloud the true effects of the independent variables upon the dependent variables.

Clouding conditions that threaten to jeopardize the validity of experiments have been identified by Campbell and Stanley (1963), Bracht and Glass (1968) and Lewis-Beck (1993), conditions incidentally that are of greater consequence to the validity of quasi-experiments (more typical in educational research) than to true experiments, in which random assignment to treatments occurs and where both treatment and measurement can be more adequately controlled by the researcher. The following summaries, adapted from Campbell and Stanley, Bracht and Glass, and Lewis-Beck, distinguish between ‘internal validity’ and ‘external validity’. Internal validity is concerned with the question: do the experimental treatments, in fact, make a difference in the specific experiments under scrutiny? External validity, on the other hand, asks the question: given these demonstrable effects, to what populations or settings can they be generalized?

Threats to internal validity

• History Frequently in educational research, events other than the experimental treatments occur during the time between pretest and post-test observations. Such events produce effects that can mistakenly be attributed to differences in treatment.

• Maturation Between any two observations subjects change in a variety of ways. Such changes can produce differences that are independent of the experimental treatments. The problem of maturation is more acute in protracted educational studies than in brief laboratory experiments.

• Statistical regression Like maturation effects, regression effects increase systematically with the time interval between pretests and post-tests. Statistical regression occurs in educational (and other) research due to the unreliability of measuring instruments and to extraneous factors unique to each experimental group. Regression means, simply, that subjects scoring highest on a pretest are likely to score relatively lower on a post-test; conversely, those scoring lowest on a pretest are likely to score relatively higher on a post-test. In short, in pretest-post-test situations, there is regression to the mean. Regression effects can lead the educational researcher mistakenly to attribute post-test gains and losses to low scoring and high scoring respectively (a small simulation after this list illustrates the effect).

• Testing Pretests at the beginning of experiments can produce effects other than those due to the experimental treatments. Such effects can include sensitizing subjects to the true purposes of the experiment and practice effects which produce higher scores on post-test measures.

• Instrumentation Unreliable tests or instruments can introduce serious errors into experiments. With human observers or judges, or changes in instrumentation and calibration, error can result from changes in their skills and levels of concentration over the course of the experiment.

• Selection Bias may be introduced as a result of differences in the selection of subjects for the comparison groups or when intact classes are employed as experimental or control groups. Selection bias, moreover, may interact with other factors (history, maturation, etc.) to cloud even further the effects of the comparative treatments.

• Experimental mortality The loss of subjects through dropout often occurs in long-running experiments and may result in confounding the effects of the experimental variables, for whereas initially the groups may have been randomly selected, the residue that stays the course is likely to be different from the unbiased sample that began it.

• Instrument reactivity The effects that the instruments of the study exert on the people in the study (see also Vulliamy, Lewin and Stephens, 1990).

• Selection-maturation interaction Where there is a confusion between the research design effects and the variables’ effects.
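The statistical regression threat above lends itself to a small demonstration. The following simulation is ours, not the authors’, and all parameters are arbitrary: it models observed scores as true ability plus measurement error and shows that the highest pretest scorers drift back towards the mean on a post-test even though no treatment, maturation or history effect has occurred.

```python
# Simulation of statistical regression: with unreliable measurement,
# the highest pretest scorers fall back towards the mean on a post-test
# even though nothing about the subjects has changed.
import random

random.seed(1)
N = 10_000
true_ability = [random.gauss(50, 10) for _ in range(N)]
pretest  = [t + random.gauss(0, 8) for t in true_ability]   # score = ability + error
posttest = [t + random.gauss(0, 8) for t in true_ability]   # fresh, independent error

# Select the top 10% on the pretest and compare their two means.
cut = sorted(pretest)[int(0.9 * N)]
top = [i for i in range(N) if pretest[i] >= cut]
pre_mean  = sum(pretest[i]  for i in top) / len(top)
post_mean = sum(posttest[i] for i in top) / len(top)
print(f"top decile: pretest mean {pre_mean:.1f}, post-test mean {post_mean:.1f}")
# The post-test mean is noticeably lower: regression to the mean, not decline.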

Threats to external validity

Threats to external validity are likely to limit the degree to which generalizations can be made from the particular experimental conditions to other populations or settings. Below, we summarize a number of factors (adapted from Campbell and Stanley, 1963; Bracht and Glass, 1968; Hammersley and Atkinson, 1983; Vulliamy, 1990; Lewis-Beck, 1993) that jeopardize external validity.

• Failure to describe independent variables explicitly Unless independent variables are adequately described by the researcher, future replications of the experimental conditions are virtually impossible.
• Lack of representativeness of available and target populations Whilst those participating in the experiment may be representative of an available population, they may not be representative of the population to which the experimenter seeks to generalize her findings, i.e. poor sampling and/or randomization.

• Hawthorne effect Medical research has long recognized the psychological effects that arise out of mere participation in drug experiments, and placebos and double-blind designs are commonly employed to counteract the biasing effects of participation. Similarly, so-called Hawthorne effects threaten to contaminate experimental treatments in educational research when subjects realize their role as guinea pigs.

• Inadequate operationalizing of dependent variables Dependent variables that the experimenter operationalizes must have validity in the non-experimental setting to which she wishes to generalize her findings. A paper and pencil questionnaire on career choice, for example, may have little validity in respect of the actual employment decisions made by undergraduates on leaving university.

• Sensitization/reactivity to experimental conditions As with threats to internal validity, pretests may cause changes in the subjects’ sensitivity to the experimental variables and thus cloud the true effects of the experimental treatment.

• Interaction effects of extraneous factors and experimental treatments All of the above threats to external validity represent interactions of various clouding factors with treatments. As well as these, interaction effects may also arise as a result of any or all of those factors identified under the section on ‘Threats to internal validity’.

• Invalidity or unreliability of instruments The use of instruments which yield data in which confidence cannot be placed (see below on tests).

• Ecological validity, and its partner, the extent to which behaviour observed in one context can be generalized to another. Hammersley and Atkinson (1983:10) comment on the serious problems that surround attempts to relate inferences from responses gained under experimental conditions, or from interviews, to everyday life.

By way of summary, we have seen that an experiment can be said to be internally valid to the extent that, within its own confines, its results are credible (Pilliner, 1973); but for those results to be useful, they must be generalizable beyond the confines of the particular experiment; in a word, they must be externally valid also. Pilliner points to a lopsided relationship between internal and external validity. Without internal validity an experiment cannot possibly be externally valid. But the converse does not necessarily follow; an internally valid experiment may or may not have external validity. Thus, the most carefully designed experiment involving a sample of Welsh-speaking children is not necessarily generalizable to a target population which includes non-Welsh-speaking subjects.

It follows, then, that the way to good experimentation in schools, or indeed any other organizational setting, lies in maximizing both internal and external validity.

Validity and reliability in questionnaires

Validity of postal questionnaires can be seen from two viewpoints, according to Belson (1986): first, whether respondents who complete questionnaires do so accurately, honestly and correctly; and second, whether those who fail to return their questionnaires would have given the same distribution of answers as did the returnees. The question of accuracy can be checked by means of the intensive interview method, a technique consisting of twelve principal tactics that include familiarization, temporal reconstruction, probing and challenging. The interested reader should consult Belson (1986:35–8).

The problem of non-response (the issue of ‘volunteer bias’, as Belson calls it) can, in part, be checked on and controlled for, particularly when the postal questionnaire is sent out on a continuous basis. It involves follow-up contact with non-respondents by means of interviewers trained to secure interviews with such people. A comparison is then made between the replies of respondents and non-respondents. Further, Hudson and Miller (1997) suggest several strategies for maximizing the response rate to postal questionnaires (and, thereby, to increase reliability). They involve:

• including stamped addressed envelopes;
• multiple rounds of follow-up to request returns;
• stressing the importance and benefits of the questionnaire;
• stressing the importance of, and benefits to, the client group being targeted (particularly if it is a minority group that is struggling to have a voice);
• providing interim data from returns to non-returners to involve and engage them in the research;
• checking addresses and changing them if necessary;
• following up questionnaires with a personal telephone call;
• tailoring follow-up requests to individuals (with indications to them that they are personally known and/or important to the research—including providing respondents with clues by giving some personal information to show that they are known) rather than blanket generalized letters;
• features of the questionnaire itself (ease of completion, time to be spent, sensitivity of the questions asked, length of the questionnaire);
• invitations to a follow-up interview (face to face or by telephone);
• encouragement to participate by a friendly third party;
• understanding the nature of the sample population in depth, so that effective targeting strategies can be used.

The advantages of the questionnaire over interviews, for instance, are: it tends to be more reliable; because it is anonymous, it encourages greater honesty (though, of course, dishonesty and falsification might not be able to be discovered in a questionnaire); it is more economical than the interview in terms of time and money; and there is the possibility that it may be mailed. Its disadvantages, on the other hand, are: there is often too low a percentage of returns; the interviewer is able to answer questions concerning both the purpose of the interview and any misunderstandings experienced by the interviewee, for it sometimes happens in the case of the latter that the same questions have different meanings for different people; if only closed items are used, the questionnaire may lack coverage or authenticity; if only open items are used, respondents may be unwilling to write their answers for one reason or another; questionnaires present problems to people of limited literacy; and an interview can be conducted at an appropriate speed whereas questionnaires are often filled in hurriedly. There is a need, therefore, to pilot questionnaires and refine their contents, wording, length, etc. as appropriate for the sample being targeted.

One central issue in considering the reliability and validity of questionnaire surveys is that of sampling. An unrepresentative, skewed sample, one that is too small or too large, can easily distort the data, and indeed, in the case of very small samples, prohibit statistical analysis (Morrison, 1993). The issue of sampling has been covered in the preceding chapter.

Validity and reliability in observations

There are questions about two types of validity in observation-based research. In effect, comments about the subjective and idiosyncratic nature of the participant observation study are to do with its external validity. How do we know that the results of this one piece of research are applicable to other situations? Fears that observers’ judgements will be affected by their close involvement in the group relate to the internal validity of the method. How do we know that the results of this one piece of research represent the real thing, the genuine product? In the preceding chapter on sampling, we refer to a number of techniques (quota sampling, snowball sampling, purposive sampling) that researchers employ as a way of checking on the representativeness of the events that they observe and of cross-checking their interpretations of the meanings of those events.

In addition to external validity, participant observation also has to be rigorous in its internal validity checks. There are several threats to validity and reliability here, for example:

• the researcher, in exploring the present, may be unaware of important antecedent events;
• informants may be unrepresentative of the sample in the study;
• the presence of the observer might bring about different behaviours (reactivity and ecological validity);
• the researcher might ‘go native’, becoming too attached to the group to see it sufficiently dispassionately.

To address this, Denzin suggests triangulation of data sources and methodologies. Chapter 6 discusses the principal ways of overcoming problems of reliability and validity in observational research in naturalistic inquiry. In essence it is suggested that the notion of ‘trustworthiness’ (Lincoln and Guba, 1985) replaces more conventional views of reliability and validity, and that this notion is devolved on issues of credibility, confirmability, transferability and dependability. Chapter 6 indicates how these areas could be addressed.

If observational research is much more structured in its nature, yielding quantitative data, then the conventions of intra- and inter-rater reliability apply. Here steps are taken to ensure that observers enter data into the appropriate categories consistently (i.e. intra- and inter-rater reliability) and accurately. Further, to ensure validity, a pilot must have been conducted to ensure that the observational categories themselves are appropriate, exhaustive, discrete, unambiguous and effectively operationalize the purposes of the research.


Validity and reliability in tests

The researcher will have to judge the place and significance of test data, not forgetting the problem of the Hawthorne effect operating negatively or positively on students who have to undertake the tests. There is a range of issues which might affect the reliability of the test—for example, the time of day, the time of the school year, the temperature in the test room, the perceived importance of the test, the degree of formality of the test situation, ‘examination nerves’, the amount of guessing of answers by the students (the calculation of the standard error which the tests demonstrate features here), the way that the test is administered, the way that the test is marked, the degree of closure or openness of test items. Hence the researcher who is considering using testing as a way of acquiring research data must ensure that it is appropriate, valid and reliable (Linn, 1993).

Wolf (1994) suggests four main factors that might affect reliability: the range of the group that is being tested, the group’s level of proficiency, the length of the measure (the longer the test the greater the chance of errors), and the way in which reliability is calculated. Fitz-Gibbon (1997:36) argues that, other things being equal, longer tests are more reliable than shorter tests (the sketch after the following list quantifies this point). Additionally there are several ways in which reliability might be compromised in tests. Feldt and Brennan (1993) suggest four types of threat to reliability:

• individuals (e.g. their motivation, concentration, forgetfulness, health, carelessness, guessing, their related skills (e.g. reading ability), how used they are to solving the type of problem set, the effects of practice);
• situational factors (e.g. the psychological and physical conditions for the test—the context);
• test marker factors (e.g. idiosyncrasy and subjectivity);
• instrument variables (e.g. poor domain sampling, errors in sampling tasks, the realism of the tasks and relatedness to the experience of the testees, poor question items, the assumption or extent of unidimensionality in item response theory, length of the test, mechanical errors, scoring errors, computer errors).
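Fitz-Gibbon’s point about test length is conventionally quantified by the Spearman-Brown prophecy formula, which predicts the reliability of a test lengthened by a factor k from its current reliability r. The formula is not given in the text; the sketch below applies it to a hypothetical starting reliability of 0.60.

```python
# Spearman-Brown prophecy formula: predicted reliability when a test is
# lengthened k-fold with parallel items, given its current reliability r.
# Illustrates (it is not from the text) why longer tests tend to be more
# reliable, other things being equal.

def spearman_brown(r: float, k: float) -> float:
    """Predicted reliability of a test lengthened k-fold."""
    return k * r / (1 + (k - 1) * r)

r = 0.60  # hypothetical reliability of a short test
for k in (1, 2, 3):
    print(f"test lengthened x{k}: predicted reliability = {spearman_brown(r, k):.2f}")
# x1 -> 0.60, x2 -> 0.75, x3 -> 0.82: adding parallel items raises reliability.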

There is also a range of particular problems in conducting reliable tests, for example:

• there might be a questionable assumption of transferability of knowledge and skills from one context to another (e.g. students might perform highly in a mathematics examination, but are unable to use the same algorithm in a physics examination);
• students whose motivation, self-esteem, and familiarity with the test situation are low might demonstrate less than their full abilities;
• language and readability exert a significant influence (e.g. whether testees are using their first or second language);
• tests might have a strong cultural bias;
• instructions might be unclear and ambiguous;
• difficulty levels might be too low or too high;
• the number of operations in a single test item might be unreasonable (e.g. students might be able to perform each separate item but might be unable to perform several operations in combination).

To address reliability there is a need for moderation procedures (before and after the administration of the test) to iron out inconsistencies between test markers (Harlen, 1994), including:

• statistical reference/scaling tests;
• inspection of samples (by post or by visit);
• group moderation of grades;
• post hoc adjustment of marks;
• accreditation of institutions;
• visits of verifiers;
• agreement panels;
• defining marking criteria;
• exemplification;
• group moderation meetings.

Whilst moderation procedures are essentially post hoc adjustments to scores, agreement trials and practice-marking can be undertaken before the administration of a test, which is particularly important if there are large numbers of scripts or several markers.

The issue here is that the results as well as the instruments should be reliable. Reliability is also addressed by:

• calculating coefficients of reliability, split-half techniques, the Kuder-Richardson formula, parallel/equivalent forms of a test, test/re-test methods (a sketch after this list shows one such calculation);
• calculating and controlling the standard error of measurement;
• increasing the sample size (to maximize the range and spread of scores in a norm-referenced test), though criterion-referenced tests recognize that scores may bunch around the high level (in mastery learning for example), i.e. that the range of scores might be limited, thereby lowering the correlation coefficients that can be calculated;
• increasing the number of observations made and items included in the test (in order to increase the range of scores);
• ensuring effective domain sampling of items in tests based on item response theory (a particular issue in Computer Adaptive Testing, introduced below (Thissen, 1990));
• ensuring effective levels of item discriminability and item difficulty.
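The coefficients named in the first bullet above can be illustrated. The following sketch computes KR-20 (the Kuder-Richardson formula for dichotomously scored items) and, from it, the standard error of measurement; the score matrix is a tiny hypothetical example, and a real analysis would use far more items and examinees.

```python
# KR-20 reliability for dichotomously scored (0/1) test items, plus the
# standard error of measurement (SEM) derived from it. A sketch only;
# the dataset below is hypothetical.
import math

# Rows = examinees, columns = items (1 = correct, 0 = incorrect).
scores = [
    [1, 1, 1, 0, 1],
    [1, 0, 1, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 0, 1, 0, 0],
    [1, 1, 0, 1, 1],
    [0, 1, 1, 0, 1],
]
n_items = len(scores[0])
totals = [sum(row) for row in scores]
mean_t = sum(totals) / len(totals)
var_total = sum((t - mean_t) ** 2 for t in totals) / (len(totals) - 1)

# Item difficulty p = proportion correct; item variance = p(1 - p).
p = [sum(row[i] for row in scores) / len(scores) for i in range(n_items)]
sum_item_var = sum(pi * (1 - pi) for pi in p)

kr20 = (n_items / (n_items - 1)) * (1 - sum_item_var / var_total)
sem = math.sqrt(var_total) * math.sqrt(1 - kr20)  # SEM = SD * sqrt(1 - reliability)

print(f"KR-20 reliability = {kr20:.2f}, SEM = {sem:.2f} score points")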

Reliability not only has to be achieved but be seen to be achieved, particularly in ‘high stakes’ testing (where a lot hangs on the results of the test, e.g. entrance to higher education or employment). Hence the procedures for ensuring reliability must be transparent. The difficulty here is that the more one moves towards reliability as defined above, the more the test will become objective, the more students will be measured as though they are inanimate objects, and the more the test will become decontextualized.

An alternative form of reliability, which is premissed on a more constructivist psychology, emphasizes the significance of context, the importance of subjectivity and the need to engage and involve the testee more fully than a simple test. This rehearses the tension between positivism and more interpretive approaches outlined in the first chapter of this book. Objective tests, as described in this chapter, lean strongly towards the positivist paradigm, whilst more phenomenological and interpretive paradigms of social science research will emphasize the importance of settings, of individual perceptions, of attitudes, in short, of ‘authentic’ testing (e.g. by using non-contrived, non-artificial forms of test data, for example portfolios, documents, course work, tasks that are stronger in realism and more ‘hands on’). Though this latter adopts a view which is closer to assessment rather than narrowly ‘testing’, nevertheless the two overlap; both can yield marks, grades and awards, both can be formative as well as summative, both can be criterion-referenced.

With regard to validity, it is important to note here that an effective test will ensure adequate:

• content validity (e.g. adequate and representative coverage of programme and test objectives in the test items, a key feature of domain sampling); content validity is achieved by ensuring that the content of the test fairly samples the class or fields of the situations or subject matter in question. Content validity is achieved by making professional judgements about the relevance and sampling of the contents of the test to a particular domain. It is concerned with coverage and representativeness rather than with patterns of response or scores. It is a matter of judgement rather than measurement (Kerlinger, 1986). Content validity will need to ensure several features of a test (Wolf, 1994): (a) test coverage (the extent to which the test covers the relevant field); (b) test relevance (the extent to which the test items are taught through, or are relevant to, a particular programme); (c) programme coverage (the extent to which the programme covers the overall field in question).
• criterion-related validity (where a high correlation coefficient exists between the scores on the test and the scores on other accepted tests of the same performance); criterion-related validity is achieved by comparing the scores on the test with one or more variables (criteria) from other measures or tests that are considered to measure the same factor. Wolf (1994) argues that a major problem facing test devisers addressing criterion-related validity is the selection of the suitable criterion measure. He cites the example of the difficulty of selecting a suitable criterion of academic achievement in a test of academic aptitude. The criterion must be: (a) relevant (and agreed to be relevant); (b) free from bias (i.e. where external factors that might contaminate the criterion are removed); (c) reliable—precise and accurate; (d) capable of being measured or achieved.
• construct validity (e.g. the clear relatedness of a test item to its proposed construct/unobservable quality or trait, demonstrated by both empirical data and logical analysis and debate, i.e. the extent to which particular constructs or concepts can give an account for performance on the test); construct validity is achieved by ensuring that performance on the test is fairly explained by particular appropriate constructs or concepts. As with content validity, it is not based on test scores, but is more a matter of whether the test items are indicators of the underlying, latent construct in question. In this respect construct validity also subsumes content and criterion-related validity. It is argued (Loevinger, 1957) that, in fact, construct validity is the queen of the types of validity because it is subsumptive and because it concerns constructs or explanations rather than methodological factors. Construct validity is threatened by (a) under-representation of the construct, i.e. the test is too narrow and neglects significant facets of a construct, (b) the inclusion of irrelevancies—excess reliable variance.
• concurrent validity (where the results of the test concur with results on other tests or instruments that are testing/assessing the same construct/performance—similar to predictive validity but without the time dimension; concurrent validity can occur simultaneously with another instrument rather than after some time has elapsed);
• face validity (that, superficially, the test appears—at face value—to test what it is designed to test);
• jury validity (an important element in construct validity, where it is important to agree on the conceptions and operationalization of an unobservable construct);
• predictive validity (where results on a test accurately predict subsequent performance—akin to criterion-related validity);
• consequential validity (where the inferences that can be made from a test are sound);
• systemic validity (Fredericksen and Collins, 1989) (where programme activities both enhance test performance and enhance performance of the construct that is being addressed in the objective). Cunningham (1998) gives an example of systemic validity where, if the test and the objective of vocabulary performance leads to testees increasing their vocabulary, then systemic validity has been addressed.

To ensure test validity, then, the test must demonstrate fitness for purpose as well as addressing the several types of validity outlined above. The most difficult for researchers to address, perhaps, is construct validity, for it argues for agreement on the definition and operationalization of an unseen, half-guessed-at construct or phenomenon. The community of scholars has a role to play here. For a full discussion of validity see Messick (1993). To conclude this chapter, we turn briefly to consider validity and reliability in life history accounts.

Validity and reliability in life histories

Three central issues underpin the quality of data generated by life history methodology. They are to do with representativeness, validity and reliability. Plummer (1983) draws attention to a frequent criticism of life history research, namely, that its cases are atypical rather than representative. To avoid this charge, he urges intending researchers to ‘work out and explicitly state the life history’s relationship to a wider population’ (Plummer, 1983) by way of appraising the subject on a continuum of representativeness and non-representativeness.

Reliability in life history research hinges upon the identification of sources of bias and the application of techniques to reduce them. Bias arises from the informant, the researcher, and the interactional encounter itself. Box 5.1, adapted from Plummer (1983), provides a checklist of some aspects of bias arising from these principal sources.

Several validity checks are available to intending researchers. Plummer identifies the following:

1 The subject of the life history may present an autocritique of it, having read the entire product.
2 A comparison may be made with similar written sources by way of identifying points of major divergence or similarity.
3 A comparison may be made with official records by way of imposing accuracy checks on the life history.
4 A comparison may be made by interviewing other informants.

Essentially, the validity of any life history lies in its ability to represent the informant’s subjective reality, that is to say, his or her definition of the situation.

Box 5.1
Principal sources of bias in life history research

Source: Informant
Is misinformation (unintended) given?
Has there been evasion?
Is there evidence of direct lying and deception?
Is a ‘front’ being presented?
What may the informant ‘take for granted’ and hence not reveal?
How far is the informant ‘pleasing you’?
How much has been forgotten?
How much may be self-deception?

Source: Researcher
Attitudes of researcher: age, gender, class, race, religion, politics etc.
Demeanour of researcher: dress, speech, body language etc.
Personality of researcher: anxiety, need for approval, hostility, warmth etc.
Scientific role of researcher: theory held (etc.), researcher expectancy

Source: The interaction
The encounter needs to be examined. Is bias coming from:
The prior interaction?
The physical setting—‘social space’?
Non-verbal communication?
Vocal behaviour?

Source: adapted from Plummer, 1983: Table 5.2, p. 103


Part Three

Styles of educational research

It is important to distinguish between matters of research design, methodology and instrumentation. Too often methods are confused with methodology, and methodology is confused with design. Part Two provided an introduction to design issues and this part examines different styles or kinds of research, separating them from methods—instruments to be used for data collection and analysis. We identify eight main styles of educational research in this section. Although we recognize that these are by no means exhaustive, we suggest that this fairly covers the major styles of research methodology. The gamut of research styles is vast and this part illustrates the scope of what is available, embracing quantitative and qualitative research, together with small scale and large scale approaches. These enable the researcher to address the notion of ‘fitness for purpose’ in deciding the most appropriate style of research for the task in hand.

This part deliberately returns to issues introduced in Part One, and suggests that, though styles of research can be located within particular research paradigms, this does not necessitate the researcher selecting a single paradigm only, nor does it advocate paradigm-driven research. Rather, the intention here is to shed light on styles of research from the paradigmatic contexts in which they are located. To do this we have introduced considerable new material into this part, for example on naturalistic and ethnographic research (including issues in data analysis), computer usage, action research as political praxis, the limits of statistical significance and the importance of effect size, the burgeoning scope of meta-analysis, event history analysis, Nominal Group Technique and Delphi techniques, recent developments in case study research, and issues in correlational research. The previous edition kept separate developmental research and surveys; this edition has brought them together as they are mutually informing.


6 Naturalistic and ethnographic research

Elements of naturalistic inquiry

Chapter 1 indicated that several approaches to educational research are contained in the paradigm of qualitative, naturalistic and ethnographic research. The characteristics of that paradigm (Boas, 1943; Blumer, 1969; Lincoln and Guba, 1985; Woods, 1992; LeCompte and Preissle, 1993) include:

• humans actively construct their own meanings of situations;
• meaning arises out of social situations and is handled through interpretive processes;
• behaviour and, thereby, data are socially situated, context-related, context-dependent and context-rich. To understand a situation researchers need to understand the context because situations affect behaviour and perspectives and vice versa;
• realities are multiple, constructed and holistic;
• knower and known are interactive, inseparable;
• only time- and context-bound working hypotheses (idiographic statements) are possible;
• all entities are in a state of mutual simultaneous shaping, so that it is impossible to distinguish causes from effects;
• inquiry is value-bound;
• inquiries are influenced by inquirer values as expressed in the choice of a problem, evaluand, or policy option, and in the framing, bounding, and focusing of that problem, evaluand or policy option;
• inquiry is influenced by the choice of the paradigm that guides the investigation into the problem;
• inquiry is influenced by the choice of the substantive theory utilized to guide the collection and analysis of data and in the interpretation of findings;
• inquiry is influenced by the values that inhere in the context;
• inquiry is either value-resonant (reinforcing or congruent) or value-dissonant (conflicting). Problem, evaluand, or policy option, paradigm, theory, and context must exhibit congruence (value-resonance) if the inquiry is to produce meaningful results;
• research must include ‘thick descriptions’ (Geertz, 1973) of the contextualized behaviour;
• the attribution of meaning is continuous and evolving over time;
• people are deliberate, intentional and creative in their actions;
• history and biography intersect—we create our own futures but not necessarily in situations of our own choosing;
• social research needs to examine situations through the eyes of the participants—the task of ethnographies, as Malinowski (1922:25) observed, is to grasp the point of view of the native [sic], his [sic] view of the world and relation to his life;
• researchers are the instruments of the research (Eisner, 1991);
• researchers generate rather than test hypotheses;
• researchers do not know in advance what they will see or what they will look for;
• humans are anticipatory beings;
• human phenomena seem to require even more conditional stipulations than do other kinds;
• meanings and understandings replace proof;
• generalizability is interpreted as generalizability to identifiable, specific settings and subjects rather than universally;
• situations are unique;
• the processes of research and behaviour are as important as the outcomes;
• people, situations, events and objects have meaning conferred upon them rather than possessing their own intrinsic meaning;
• social research should be conducted in natural, uncontrived, real world settings with as little intrusiveness as possible by the researcher;
• social reality, experiences and social phenomena are capable of multiple, sometimes contradictory interpretations and are available to us through social interaction;
• all factors, rather than a limited number of variables, have to be taken into account;
• data are analysed inductively, with constructs deriving from the data during the research;
• theory generation is derivative—grounded (Glaser and Strauss, 1967)—the data suggest the theory rather than vice versa.

Lincoln and Guba (1985:39–43) tease out the implications of these axioms:

• studies must be set in their natural settings as context is heavily implicated in meaning;
• humans are the research instrument;
• utilization of tacit knowledge is inescapable;
• qualitative methods sit more comfortably than quantitative methods with the notion of the human-as-instrument;
• purposive sampling enables the full scope of issues to be explored;
• data analysis is inductive rather than a priori and deductive;
• theory emerges rather than is pre-ordinate: a priori theory is replaced by grounded theory;
• research designs emerge over time (and as the sampling changes over time);
• the outcomes of the research are negotiated;
• the natural mode of reporting is the case study;
• nomothetic interpretation is replaced by idiographic interpretation;
• applications are tentative and pragmatic;
• the focus of the study determines its boundaries;
• trustworthiness and its components replace more conventional views of reliability and validity.

LeCompte and Preissle (1993) suggest that ethnographic research is a process involving methods of inquiry, an outcome and a resultant record of the inquiry. The intention of the research is to create as vivid a reconstruction as possible of the culture or groups being studied (p. 235). That said, there are several purposes of qualitative research, for example description and reporting, the creation of key concepts, theory generation and testing. LeCompte and Preissle (1993) indicate several key elements of ethnographic approaches:

• phenomenological data are elicited (p. 3);
• the world view of the participants is investigated and represented; their ‘definition of the situation’ (Thomas, 1923);
• meanings are accorded to phenomena by both the researcher and the participants; the process of research, therefore, is hermeneutic, uncovering meanings (LeCompte and Preissle, 1993:31–2);
• the constructs of the participants are used to structure the investigation;
• empirical data are gathered in their naturalistic setting (unlike laboratories or controlled settings, as in other forms of research where variables are manipulated);
• observational techniques are used extensively (both participant and non-participant) to acquire data on real-life settings;
• the research is holistic, that is, it seeks a description and interpretation of ‘total phenomena’;
• there is a move from description and data to inference, explanation, suggestions of causation, and theory generation;
• methods are ‘multimodal’ and the ethnographer is a ‘methodological omnivore’ (ibid.: 232).

Hitchcock and Hughes (1989:52–3) suggest that ethnographies involve:

• the production of descriptive cultural knowledge of a group;
• the description of activities in relation to a particular cultural context from the point of view of the members of that group themselves;
• the production of a list of features constitutive of membership in a group or culture;
• the description and analysis of patterns of social interaction;
• the provision as far as possible of ‘insider accounts’;
• the development of theory.

There are several key differences between this approach and that of the positivists to whom we made reference in Chapter 1. LeCompte and Preissle (ibid.: 39–44) suggest that ethnographic approaches are concerned more with description than prediction, induction rather than deduction, generation rather than verification of theory, construction rather than enumeration, and subjectivities rather than objective knowledge. With regard to the latter, the authors distinguish between emic approaches (as in the term ‘phonemic’, where the concern is to catch the subjective meanings placed on situations by participants) and etic approaches (as in the term ‘phonetic’, where the intention is to identify and understand the objective or researcher’s meaning and constructions of a situation) (p. 45).

That said, Woods (1992) argues that some differences between quantitative and qualitative research have been exaggerated. He proposes, for example (p. 381), that the 1970s witnessed an unproductive dichotomy between the two, the former being seen as strictly in the hypothetico-deductive mode (testing theories) and the latter being seen as the inductive method used for generating theory. He suggests that the epistemological contrast between the two is overstated, as qualitative techniques can be used both for generating and for testing theories.

Indeed, Dobbert and Kurth-Schai (1992) urge ethnographic approaches not only to become more systematic but also to study and address regularities in social behaviour and social structure (pp. 94–5). The task of ethnographers is to balance a commitment to catch the diversity, variability, creativity, individuality, uniqueness and spontaneity of social interactions (e.g. by ‘thick descriptions’ (Geertz, 1973)) with a commitment to the task of social science to seek regularities, order and patterns within such diversity (ibid.: 150). As Durkheim noted, there are ‘social facts’.

Following this line, it is possible, therefore, to suggest that ethnographic research can address issues of generalizability, a tenet of positivist research, interpreted as ‘comparability’ and ‘translatability’ (LeCompte and Preissle, 1992:47). For comparability the characteristics of the group that is being studied need to be made explicit so that readers can compare them with other similar or dissimilar groups. For translatability the analytic categories used in the research as well as the characteristics of the groups are made explicit so that meaningful comparisons can be made with other groups and disciplines.

Spindler and Spindler (1992:72–4) put forward several hallmarks of effective ethnographies:

• Observations have contextual relevance, both in the immediate setting in which behaviour is observed and in further contexts beyond.
• Hypotheses emerge in situ as the study develops in the observed setting.
• Observation is prolonged and often repetitive. Events and series of events are observed more than once to establish reliability in the observational data.
• Inferences from observation and various forms of ethnographic inquiry are used to address insiders’ views of reality.
• A major part of the ethnographic task is to elicit sociocultural knowledge from participants, rendering social behaviour comprehensible.
• Instruments, schedules, codes, agenda for interviews, questionnaires, etc. should be generated in situ, and should derive from observation and ethnographic inquiry.
• A transcultural, comparative perspective is usually present, although often it is an unstated assumption, and cultural variation (over space and time) is natural.
• Some sociocultural knowledge that affects behaviour and communication under study is tacit/implicit, and may not be known even to participants, or may be known ambiguously to others. It follows that one task for an ethnography is to make explicit to readers what is tacit/implicit to informants.
• The ethnographic interviewer should not frame or predetermine responses by the kinds of questions that are asked, because the informants themselves have the emic, native cultural knowledge.
• In order to collect as much live data as possible, any technical device may be used.
• The ethnographer’s presence should be declared and his or her personal, social and interactional position in the situation should be described.

With ‘mutual shaping and interaction’ between the researcher and participants taking place (Lincoln and Guba, 1985:155), the researcher becomes, as it were, the ‘human instrument’ in the research (ibid.: 187), building on her tacit knowledge in addition to her propositional knowledge, using methods that sit comfortably with human inquiry, e.g. observations, interviews, documentary analysis and ‘unobtrusive’ methods (ibid.: 187). The advantages of the ‘human instrument’ are her adaptability, responsiveness, knowledge, ability to handle sensitive matters, ability to see the whole picture, ability to clarify and summarize, to explore, to analyse, and to examine atypical or idiosyncratic responses (ibid.: 193–4).

Planning naturalistic research

In many ways the issues in naturalistic research are not exclusive; they apply to other forms of research as well, for example: identifying the problem and research purposes; deciding the focus of the study; selecting the research design and instrumentation; addressing validity and reliability; ethical issues; approaching data analysis and interpretation. These are common to all research. More specifically, Wolcott (1992:19) suggests that naturalistic researchers should address the stages of watching, asking and reviewing, or, as he puts it, experiencing, inquiring and examining. In naturalistic inquiry it is possible to formulate a more detailed set of stages that can be addressed (Hitchcock and Hughes, 1989:57–71; LeCompte and Preissle, 1993; Bogdan and Biklen, 1992):

Stage 1 Locating a field of study.
Stage 2 Addressing ethical issues.
Stage 3 Deciding the sampling.
Stage 4 Finding a role and managing entry into the context.
Stage 5 Finding informants.
Stage 6 Developing and maintaining relations in the field.
Stage 7 Data collection in situ.
Stage 8 Data collection outside the field.
Stage 9 Data analysis.
Stage 10 Leaving the field.
Stage 11 Writing the report.

These stages (addressed later in this chapter) are shot through with a range of issues that will affect the research, for example:

• personal issues (the disciplinary sympathies of the researcher, and researcher subjectivities and characteristics; Hitchcock and Hughes (1989:56) indicate that there are several serious strains in conducting fieldwork because the researcher’s own emotions, attitudes, beliefs, values and characteristics enter the research; indeed, the more this happens, the less will be the likelihood of gaining the participants’ perspectives and meanings);
• the kinds of participation that the researcher will undertake;
• issues of advocacy (where the researcher may be expected to identify with the same emotions, concerns and crises as the members of the group being studied and wishes to advance their cause, often a feature that arises at the beginning and the end of the research, when the researcher is considered to be a legitimate spokesperson for the group);
• role relationships;
• boundary maintenance in the research;
• the maintenance of the balance between distance and involvement;
• ethical issues;
• reflexivity.

Reflexivity recognizes that researchers are inescapably part of the social world that they are researching, and, indeed, that this social world is already an interpreted world, interpreted by the actors, which undermines the notion of objective reality. Researchers are in the world and of the world. They bring their own biographies to the research situation, and participants behave in particular ways in their presence. Reflexivity suggests that researchers should acknowledge and disclose their own selves in the research; they should hold themselves up to the light, echoing Cooley’s (1902) notion of the ‘looking-glass self’. Highly reflexive researchers will be acutely aware of the ways in which their selectivity, perception, background, inductive processes and paradigms shape the research. They are research instruments. McCormick and James (1988:191) argue that combating reactivity through reflexivity requires researchers to monitor closely and continually their own interactions with participants, their own reactions, roles, biases, and any other matters that might bias the research. This is addressed more fully in Chapter 5 on validity, encompassing issues of triangulation and respondent validation.

Lincoln and Guba (1985:226–47) set out ten elements in research design for naturalistic studies:

1 Determining a focus for the inquiry.
2 Determining the fit of the paradigm to the focus.
3 Determining the fit of the inquiry paradigm to the substantive theory selected to guide the inquiry.
4 Determining where and from whom data will be collected.
5 Determining successive phases of the inquiry.
6 Determining instrumentation.
7 Planning data collection and recording modes.
8 Planning data analysis procedures.
9 Planning the logistics:
• prior logistical considerations for the project as a whole;
• the logistics of field excursions prior to going into the field;
• the logistics of field excursions while in the field;
• the logistics of activities following field excursions;
• the logistics of closure and termination.
10 Planning for trustworthiness.

This can be set out as a sequential, staged approach to planning naturalistic research (see, for example, Schatzman and Strauss, 1973; Delamont, 1992). Spradley (1979) sets out the stages of: (a) selecting a problem; (b) collecting cultural data; (c) analysing cultural data; (d) formulating ethnographic hypotheses; (e) writing the ethnography. More fully, we suggest an eleven-stage model.

Stage 1: locating a field of study

Bogdan and Biklen (1992:2) suggest that research questions in qualitative research are not framed by simply operationalizing variables, as in the positivist paradigm. Rather, they propose that research questions are formulated in situ and in response to situations observed, i.e. that topics are investigated in all their complexity, in the naturalistic context.

Stage 2: addressing ethical issues

Deyle, Hess and LeCompte (1992:623) identify several critical ethical issues that need to be addressed in approaching the research:

How does one present oneself in the field? As whom does one present oneself? How ethically defensible is it to pretend to be somebody that you are not in order to: (a) gain knowledge that you would otherwise not be able to gain; (b) gain and preserve access to places to which you would otherwise be unable to gain or sustain access?

The issues here are several. Firstly, there is the issue of informed consent (to participate and for disclosure), and whether and how to gain participant assent (see also LeCompte and Preissle, 1993:66). This uncovers another consideration, namely covert or overt research. On the one hand there is a powerful argument for informed consent. However, the more participants know about the research, the less naturally they may behave (ibid.: 108), and naturalism is self-evidently a key criterion of the naturalistic paradigm.

Mitchell (1993) catches the dilemma for researchers in deciding whether to undertake overt or covert research. The issue of informed consent, he argues, can lead to the selection of particular forms of research (those where researchers can control the phenomena under investigation), thereby excluding other kinds of research where subjects behave in less controllable, predictable, prescribed ways, indeed where subjects may come in and out of the research over time.

He argues that in the real social world access to important areas of research is prohibited if informed consent has to be sought, for example in researching those on the margins of society or the disadvantaged. It is in the participants’ own interests that secrecy is maintained as, if secrecy is not upheld, important work may not be done and ‘weightier secrets’ (ibid., p. 54) may be kept which are of legitimate public interest and in the participants’ own interests. Mitchell makes a powerful case for secrecy, arguing that informed consent may excuse social scientists from the risk of confronting powerful, privileged and cohesive groups who wish to protect themselves from public scrutiny. Secrecy and informed consent are moot points. The researcher, then, has to consider her loyalties and responsibilities (LeCompte and Preissle, 1993:106), for example: what is the public’s right to know, and what is the individual’s right to privacy? (Morrison, 1993).

In addition to the issue of overt or covert research, LeCompte and Preissle (1993) indicate that the problems of risk and vulnerability to participants must be addressed; steps must be taken to prevent risk or harm to participants (non-maleficence, the principle of primum non nocere). Bogdan and Biklen (1992:54) extend this to include issues of embarrassment as well as harm to the participants. The question of vulnerability is present at its strongest when participants in the research have their freedom to choose limited, e.g. by dint of their age, their health, social constraints, their lifestyle (e.g. engaging in criminality), social acceptability, or their experience of being victims (e.g. of abuse or of violent crime) (p. 107). As the authors comment, participants rarely initiate research, so it is the responsibility of the researcher to protect participants. Relationships between researcher and researched are rarely symmetrical in terms of power; it is often the case that those with more power, information and resources research those with less.

A standard protection is often the guarantee of confidentiality, withholding participants’ real names and other identifying characteristics. The authors contrast this with anonymity, where identity is withheld because it is genuinely unknown (p. 106). This raises the issues of identifiability and traceability. Further, participants might be able to identify themselves in the research report even though others may not be able to identify them. A related factor here is the ownership of the data and the results, the control of the release of data (and to whom, and when), and what rights respondents have to veto the research results. Patrick (1973) indicates this point at its sharpest: as an ethnographer of a Glasgow gang, he was witness to a murder; the dilemma was clear, to report the matter (and thereby also to ‘blow his cover’, consequently endangering his own life) or to stay as a covert researcher.

Bogdan and Biklen (1992:54) add to this discussion the need to respect participants as subjects, not simply as research objects to be used and then discarded.

Stage 3: deciding the sampling

In an ideal world the researcher would be able to study a group in its entirety. This was the case in Goffman’s (1968) work on ‘total institutions’, e.g. hospitals, prisons and police forces. It was also the practice of anthropologists, who were able to study specific isolated communities or tribes. That is rarely possible nowadays because such groups are no longer isolated or insular. Hence the researcher is faced with the issue of sampling, that is, deciding which people it will be possible to select to represent the wider group (however defined). The researcher has to decide the groups for which the research questions are appropriate, the contexts which are important for the research, the time periods that will be needed, and the possible artefacts of interest to the researcher. In other words, decisions are necessary on the sampling of people, contexts, issues, time frames, artefacts and data sources. This takes the discussion beyond conventional notions of sampling, which are confined to issues of sampling populations.

In several forms of research sampling is fixed at the start of the study, though there may be attrition of the sample through ‘mortality’ (e.g. people leaving the study). Mortality is usually seen as problematic; ethnographic research regards it as natural rather than as a problem. People come into and go from the study. This impacts on the decision whether to have a synchronic investigation, occurring at a single point in time, or a diachronic study, where events and behaviour are monitored over time to allow for change, development and evolving situations. In ethnographic inquiry sampling is recursive and ad hoc rather than fixed at the outset; it changes and develops over time. Let us consider how this might happen.

LeCompte and Preissle (ibid.: 82–3) point out that ethnographic methods rule out statistical sampling, for a variety of reasons:

• the characteristics of the wider population are unknown;
• there are no straightforward boundary markers (categories or strata) in the group;
• generalizability, a goal of statistical methods, is not necessarily a goal of ethnography;
• characteristics of a sample may not be evenly distributed across the sample;
• only one or two subsets of a characteristic of a total sample may be important;
• researchers may not have access to the whole population;
• some members of a subset may not be drawn from the population from which the sampling is intended to be drawn.

Hence other types of sampling are required. A criterion-based selection requires the researcher to specify in advance a set of attributes, factors, characteristics or criteria that the study must address. The task then is to ensure that these appear in the sample selected (the equivalent of a stratified sample). There are other forms of sampling (discussed in Chapter 4) that are useful in ethnographic research (Bogdan and Biklen, 1992:70; LeCompte and Preissle, 1993:69–83), such as:

• convenience sampling (opportunistic sampling, selecting from whoever happens to be available);
• critical-case sampling (e.g. people who display the issue or set of characteristics in their entirety or in a way that is highly significant for their behaviour);
• extreme-case sampling (the norm of a characteristic is identified, then the extremes of that characteristic are located, and, finally, the bearers of that extreme characteristic are selected);
• typical-case sampling (where a profile of attributes or characteristics that are possessed by an ‘average’, typical person or case is identified, and the sample is selected from these typical people or cases);
• unique-case sampling (where cases that are rare, unique or unusual on one or more criteria are identified, and sampling takes place within these; here, whatever other characteristics or attributes a person might share with others, a particular attribute or characteristic sets that person apart);
• reputational-case sampling (a variant of extreme-case and unique-case sampling, where a researcher chooses a sample on the recommendation of experts in the field);
• snowball sampling (using the first interviewee to suggest or recommend other interviewees).

Patton (1980) identifies six types of sampling that are useful in naturalistic research, including:

• sampling extreme/deviant cases: this is done in order to gain information about unusual cases that may be particularly troublesome or enlightening;
• sampling typical cases: this is done in order to avoid rejecting information on the grounds that it has been gained from special or deviant cases;
• maximum variation sampling: this is done in order to document the range of unique changes that have emerged, often in response to the different conditions to which participants have had to adapt;
• sampling critical cases: this is done in order to permit maximum applicability to others; if the information holds true for critical cases (e.g. cases where all of the factors sought are present), then it is likely to hold true for others;
• sampling politically important or sensitive cases: this can be done to draw attention to the case;
• convenience sampling: this saves time and money and spares the researcher the effort of finding less amenable participants.

Lincoln and Guba (1985:201–2) suggest an important difference between conventional and naturalistic research designs. In the former the intention is to focus on similarities and to be able to make generalizations, whereas in the latter the objective is informational: to provide such a wealth of detail that the uniqueness and individuality of each case can be represented. To the charge that naturalistic inquiry thereby cannot yield generalizations because of sampling flaws, the writers argue that this is necessarily though trivially true; in a word, it is unimportant.

Stage 4: finding a role and managing entry into the context

This involves issues of access and permission, establishing a reason for being there, developing a role and a persona, and identifying the ‘gatekeepers’ who facilitate entry and access to the group being investigated (see LeCompte and Preissle, 1993:100 and 111). The issue here is complex, for the researcher will be both a member of the group and yet studying that group, so it is a delicate matter to negotiate a role that will enable the researcher to be both participant and observer. The authors comment (p. 112) that the most important elements in securing access are the willingness of researchers to be flexible and their sensitivity to nuances of behaviour and response in the participants.

A related issue is the timing of the point of entry, so that researchers can commence the research at appropriate junctures (e.g. before the start of a programme, at the start of a programme, during a programme, at the end of a programme, after the end of a programme). The issue goes further than this, for the ethnographer will need to ensure acceptance into the group, which will be a matter of her/his dress, demeanour, persona, age, colour, ethnicity, empathy and identification with the group, language, accent, argot and jargon, and willingness to become involved and to take on the group’s values and behaviour (see Patrick’s (1973) fascinating study of a Glasgow gang).

Lofland (1971) suggests that the field researcher should attempt to adopt the role of the ‘acceptable incompetent’, balancing intrusion with knowing when to remain apart.

Stage 5: finding informants

This involves identifying those people who have the knowledge about the society or group being studied. This places the researcher in a difficult position, for she has to be able to evaluate key informants, to decide:

• whose accounts are more important than others;
• which informants are competent to pass comment;
• which are reliable;
• what the statuses of the informants are;
• how representative the key informants are (of the range of people, of issues, of situations, of views, of status, of roles, of the group);
• how to see the informants in different settings;
• how knowledgeable informants actually are (whether they have intimate and expert understanding of the situation);
• how central to the organization or situation the informant is (e.g. marginal or central);
• how to meet and select informants;
• how critical the informants are as gatekeepers to other informants, opening up or restricting avenues of inquiry to people (Hammersley and Atkinson, 1983:73);
• the relationship between the informant and others in the group or situation being studied.

The selection of, and relationships with, informants is problematic; LeCompte and Preissle (1993:95), for example, suggest that the first informants whom an ethnographer meets might be self-selected people who are marginal to the group, have a low status, and who, therefore, might be seeking to enhance their own prestige by being involved with the research. Indeed, Lincoln and Guba (1985:252) argue that the researcher must be careful to use informants rather than informers, the latter possibly having ‘an axe to grind’. Researchers who are working with gatekeepers, they argue, will be engaged in a constant process of bargaining and negotiation.

Stage 6: developing and maintaining relations in the field

This involves addressing interpersonal and practical issues, for example:

• building participants’ confidence in the researcher;
• developing rapport, trust, sensitivity and discretion;
• handling people and issues with which the researcher disagrees or finds objectionable or repulsive;
• being attentive and empathizing;
• being discreet;
• deciding how long to stay. Spindler and Spindler (1992:65) suggest that ethnographic validity is attained by having the researcher in situ long enough to see things happening repeatedly rather than just once, that is to say, observing regularities.

LeCompte and Preissle (1993:89) suggest that fieldwork, particularly because it is conducted face-to-face, raises problems and questions that are less significant in research that is conducted at a distance, including: (a) how to communicate meaningfully with participants; (b) how they and the researcher might be affected by the emotions evoked in one another, and how to handle these; (c) differences and similarities between the researcher and the participants (e.g. personal characteristics, power, resources), and how these might affect relationships between parties and the course of the investigation; (d) the researcher’s responsibilities to the participants (qua researcher and member of their community), even if the period of residence in the community is short; (e) how to balance responsibilities to the community with responsibilities to other interested parties. The issue here is that the data collection process is itself socially situated; it is neither a clean, antiseptic activity nor always a straightforward negotiation.

Stage 7: data collection in situ

The qualitative researcher is able to use a variety of techniques for gathering information. There is no single prescription for which data collection instruments to use; rather, the issue here is one of ‘fitness for purpose’ because, as was mentioned earlier, the ethnographer is a methodological omnivore! That said, there are several types of data collection instruments that are used more widely in qualitative research than others. The researcher can use field notes, participant observation, journal notes, interviews, diaries, life histories, artefacts, documents, video recordings, audio recordings, etc. Several of these are discussed elsewhere in this book. Lincoln and Guba (1985:199) distinguish between ‘obtrusive’ methods (e.g. interviews, observation, non-verbal language) and ‘unobtrusive’ methods (e.g. documents and records), on the basis of whether another human typically is present at the point of data collection.

Field notes can be written both in situ and away from the situation. They contain the results of observations. The nature of observation in ethnographic research is discussed fully in Chapter 17. Accompanying observation techniques is the use of interview techniques, documentary analysis and life histories. These are discussed separately in Chapters 7, 15 and 16. The interview technique most popularly employed in qualitative interviewing is the semi-structured interview, where a schedule is prepared that is sufficiently open-ended to enable the contents to be reordered, digressions and expansions made, new avenues included, and further probing undertaken. Carspecken (1996:159–60) describes how such interviews can range from the interviewer giving bland encouragements, ‘non-leading’ leads, active listening and low-inference paraphrasing to medium- and high-inference paraphrasing. In interviews the researcher might wish to explore further some matters arising from the observations. In naturalistic research the canons of validity in interviews include: honesty, depth of response, richness of response, and commitment of the interviewee (Oppenheim, 1992).

Lincoln and Guba (1985:268–70) propose several purposes for interviewing, including: present constructions of events, feelings, persons, organizations, activities, motivations, concerns, claims, etc.; reconstructions of past experiences; projections into the future; and verifying, amending and extending data.

Further, Silverman (1993:92–3) adds that interviews in qualitative research are useful for: (a) gathering facts; (b) accessing beliefs about facts; (c) identifying feelings and motives; (d) commenting on the standards of actions (what could be done about situations); (e) exploring present or previous behaviour; (f) eliciting reasons and explanations.

Lincoln and Guba (1985) emphasize that the planning of the conduct of the interview is important, including the background preparation, the opening of the interview, its pacing and timing, keeping the conversation going and eliciting knowledge, and rounding off and ending the interview. Clearly, it is important that careful consideration be given to the several stages of the interview. For example, at the planning stage attention will need to be given to the number of interviews (per person), their duration, timing and frequency, the setting/location, the number of people in a single interview situation (e.g. individual or group interviews) and respondent styles (LeCompte and Preissle, 1993:177). At the implementation stage the conduct of the interview will be important, for example responding to interviewees, prompting, probing, supporting, empathizing, clarifying, crystallizing, exemplifying, summarizing, avoiding censure, accepting. At the analysis stage there will be several important considerations, for example (ibid.: 195): the ease and clarity of communication of meaning; the interest levels of the participants; the clarity of the question and the response; the precision (and communication of this) of the interviewer; how the interviewer handles questionable responses (e.g. fabrications, untruths, claims made).

The qualitative interview tends to move away from the pre-structured, standardized form and toward the open-ended or semi-structured interview (see Chapter 15), as this enables respondents to project their own ways of defining the world. It permits flexibility rather than fixity in the sequence of discussions, and it also enables participants to raise and pursue issues and matters that might not have been included in a pre-devised schedule (Denzin, 1970; Silverman, 1993).

In addition to interviews, Lincoln and Guba (1985) discuss data collection from non-human sources, including:

1 Documents and records (e.g. archival records, private records). These have the attraction of being always available, often at low cost, and being factual. On the other hand they may be unrepresentative, they may be selective, lack objectivity, be of unknown validity, and may possibly be deliberately deceptive (see Finnegan, 1996).
2 Unobtrusive informational residues. These include artefacts, physical traces, and a variety of other records. Whilst they frequently have face validity, and whilst they may be simple and direct, gained by non-interventional means (hence reducing the problems of reactivity), they may also be very heavily inferential, difficult to interpret, and may contain elements whose relevance is questionable.

Stage 8: data collection outside the field

In order to make comparisons and to suggest explanations for phenomena, researchers might find it useful to go beyond the confines of the groups in which they occur. That this is a thorny issue is indicated in the following example. Two students are arguing very violently and physically in a school. At one level it is simply a fight between two people. However, this is a common occurrence between these two students, as they are neighbours outside school and they do not enjoy positive, amicable relations because their families are frequently feuding. The two households have been placed next door to each other by the local authority because the authority has taken a decision to keep together families who are very poor at paying their local housing rent (i.e. a ‘sink’ estate). The local authority has taken this decision because of a government policy to keep together disadvantaged groups so that targeted action and interventions can be more effective, meeting the needs of whole communities as well as individuals.

The issue here is: how far out of a micro-situation does the researcher need to go to understand that micro-situation? This is an imprecise matter, but it is not insignificant in educational research. For example, it underpinned: (a) the celebrated work by Bowles and Gintis (1976) on schooling in capitalist America, in which the authors suggested that the hidden curricula of schools were preparing students for differential occupational futures that perpetuated an inegalitarian capitalist system; (b) research on the self-fulfilling prophecy (Hurn, 1978); (c) work by Pollard (1985:110) on the social world of the primary school, where everyday interactions in school were preparing students for the individualism, competition, achievement orientation, hierarchies and self-reliance that characterize mass private consumption in wider society; and (d) Delamont’s (1981) advocacy that educationists should study institutions similar to but different from schools (e.g. hospitals and other ‘total’ institutions) in order to make the familiar strange (see also Erickson, 1973).

Stage 9: data analysis

This involves organizing, accounting for, and explaining the data; in short, making sense of the data in terms of the participants’ definitions of the situation, noting patterns, themes, categories and regularities. Typically in qualitative research, data analysis commences during the data collection process. There are several reasons for this, and these are discussed below.

At a practical level, qualitative research rapidly amasses huge amounts of data, and early analysis reduces the problem of data overload by selecting out significant features for future focus. Miles and Huberman (1984) suggest that careful data display is an important element of data reduction and selection. ‘Progressive focussing’, according to Parlett and Hamilton (1976), starts with the researcher taking a wide-angle lens to gather data; then, by sifting, sorting, reviewing and reflecting on them, the salient features of the situation emerge. These are then used as the agenda for subsequent focusing. The process is akin to funnelling from the wide to the narrow.

At a theoretical level, a major feature of qualitative research is that analysis commences early in the data collection process so that theory generation can be undertaken (LeCompte and Preissle, 1993:238). The authors (pp. 237–53) advise that researchers should set out the main outlines of the phenomena that are under investigation. They then should assemble chunks or groups of data, putting them together to make a coherent whole (e.g. through writing summaries of what has been found). Then they should painstakingly take apart their field notes, matching, contrasting, aggregating, comparing and ordering the notes made. The intention is to move from description to explanation and theory generation.

Becker and Geer (1960) indicate how this might proceed:

• comparing different groups simultaneously and over time;
• matching the responses given in interviews to observed behaviour;
• analysing deviant and negative cases;
• calculating frequencies of occurrences and responses;
• assembling and providing sufficient data, in a way that keeps raw data separate from analysis.

For clarity, the process of data analysis can be portrayed in a sequence of seven steps:

Step 1 Establish units of analysis of the data, indicating how these units are similar to and different from each other.
Step 2 Create a ‘domain analysis’.
Step 3 Establish relationships and linkages between the domains.
Step 4 Making speculative inferences.
Step 5 Summarizing.
Step 6 Seeking negative and discrepant cases.
Step 7 Theory generation.

The following pages address each of these steps.

Step 1: establish units of analysis of the data, indicating how these units are similar to and different from each other

The criterion here is that each unit of analysis (category: conceptual, actual, classification element, cluster, issue) should be as discrete as possible whilst retaining fidelity to the integrity of the whole, i.e. each unit must be a fair rather than a distorted representation of the context and other data. The creation of units of analysis can be done by ascribing codes to the data (Miles and Huberman, 1984). This is akin to the process of ‘unitizing’ (Lincoln and Guba, 1985:203).

Codes define categories; they are astringent, pulling together a wealth of material into some order and structure. They keep words as words; they maintain context specificity.

At this stage the codes are essentially descriptive and might include (Bogdan and Biklen, 1992:167–72): situation codes; perspectives held by subjects; ways of thinking about people and objects; process codes; activity codes; event codes; strategy codes; relationship and social structure codes; and methods codes. However, to be faithful to the data, the codes themselves derive from the data responsively rather than being created pre-ordinately. Hence the researcher will go through the data, ascribing codes to each piece of datum. The code is a word or abbreviation that is sufficiently close to that which it is describing for the researcher to see at a glance what it means (in this respect it is unlike a number). For example, the code ‘trust’ might refer to a person’s trustworthiness; the code ‘power’ might refer to the status or power of the person in the group.

Miles and Huberman advise that codes should be kept as discrete as possible, and that coding should start earlier rather than later, as late coding enfeebles the analysis. It is possible, they suggest, for as many as ninety codes to be held in the working memory whilst going through data, though clearly there is a process of iteration and reiteration whereby some codes that are used in the early stages of coding might be modified subsequently, and vice versa, necessitating that the researcher go through a data set more than once to ensure consistency, refinement, modification and exhaustiveness of coding (some codes might become redundant, others might need to be broken down into finer codes). By coding up the data the researcher is able to detect frequencies (which codes are occurring most commonly) and patterns (which codes occur together).

Hammersley and Atkinson (1983:177–8) propose that the first activity here is to read and re-read the data to become thoroughly familiar with them, noting also any interesting patterns, any surprising, puzzling or unexpected features, and any apparent inconsistencies or contradictions (e.g. between groups, within and between individuals and groups, and between what people say and what they do).
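Detecting such frequencies and patterns is mechanical enough to be sketched in code. The fragment below (Python; the units, texts and codes are invented for illustration and are not drawn from the authors’ text) tallies how often each code occurs and which codes co-occur within the same unit:

```python
# Hypothetical sketch: units of analysis with researcher-ascribed codes.
from collections import Counter
from itertools import combinations

units = [
    {"text": "She checks the register before letting anyone speak.",
     "codes": ["power", "ritual"]},
    {"text": "Pupils hand in work only to the teacher they trust.",
     "codes": ["trust", "power"]},
    {"text": "The morning greeting is repeated word for word daily.",
     "codes": ["ritual"]},
]

# Frequencies: which codes occur most commonly.
freq = Counter(code for u in units for code in u["codes"])

# Patterns: which codes occur together within the same unit.
co_occurrence = Counter(
    pair for u in units
    for pair in combinations(sorted(set(u["codes"])), 2)
)

print(freq.most_common())
# [('power', 2), ('ritual', 2), ('trust', 1)]
print(co_occurrence.most_common())
# [(('power', 'ritual'), 1), (('power', 'trust'), 1)]
```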

Step 2: create a ‘domain analysis’

This involves grouping the units into domains, clusters, groups, patterns, themes and coherent sets. A domain is any symbolic category that includes other categories (Spradley, 1979:100). At this stage it might be useful for the researcher to recode the data into domain codes, or to review the codes used to see how they naturally fall into clusters, perhaps creating overarching codes for each cluster. Hammersley and Atkinson (1983) show how items can be assigned to more than one category, and, indeed, see this as desirable as it maintains the richness of the data. This is akin to the process of ‘categorization’ (Lincoln and Guba, 1985), organizing ‘unitized’ data to provide descriptive and inferential information.

Spradley (1979) suggests that establishing domains can be achieved by four analytic tasks: (a) selecting a sample of verbatim interview and field notes; (b) looking for the names of things; (c) identifying possible terms from the sample; (d) searching through additional notes for other items to include. He identifies six steps to achieve these tasks: (i) select a single semantic relationship; (ii) prepare a domain analysis sheet; (iii) select a sample of statements from respondents; (iv) search for possible cover terms and included terms that fit the semantic relationship identified; (v) formulate structural questions for each domain identified; (vi) list all the hypothesized domains. Domain analysis, then, strives to discover relationships between symbols (ibid.: 157).
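The move from descriptive codes to domains can likewise be sketched. In the fragment below (Python; the domains, the code-to-domain mapping and the unit identifiers are invented for illustration), descriptive codes are recoded into overarching domain codes so that coded units fall into clusters:

```python
# Hypothetical sketch: recoding descriptive codes into overarching domains.
# A domain is a symbolic category that includes other categories
# (Spradley, 1979), here modelled as a simple code-to-domain mapping.

domains = {
    "classroom control": ["power", "discipline", "ritual"],
    "relationships":     ["trust", "friendship"],
}

code_to_domain = {code: d for d, codes in domains.items() for code in codes}

coded_units = [("power", "unit-01"), ("ritual", "unit-03"), ("trust", "unit-02")]

# Review how coded units fall into domain clusters; in practice a unit may
# be assigned to more than one domain if it carries codes from several.
clusters = {}
for code, unit in coded_units:
    clusters.setdefault(code_to_domain[code], []).append(unit)

print(clusters)
# {'classroom control': ['unit-01', 'unit-03'], 'relationships': ['unit-02']}
```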

Step 3: establish relationships and linkages between the domains

This process ensures that the data, their richness and their ‘context-groundedness’ are retained. Linkages can be found by identifying confirming cases, and by seeking ‘underlying associations’ (LeCompte and Preissle, 1993:246) and connections between data subsets.

Step 4: making speculative inferences

This is an important stage, for it moves the research from description to inference. It requires the researcher, on the basis of the evidence, to posit some explanations for the situation, some key elements and possibly even their causes. It is the process of hypothesis generation, or the setting of working hypotheses, that feeds into theory generation.

Step 5: summarizing

By this stage the researcher will be in a position to write a summary of the main features of the situation that have been researched so far. The summary will identify key factors, key issues, key concepts and key areas for subsequent investigation. It is a watershed stage during the data collection, as it pinpoints major themes, issues and problems that have arisen from the data to date (responsively) and suggests avenues for further investigation. The concepts used will have been a combination of those derived from the data themselves and those inferred by the researcher (Hammersley and Atkinson, 1983:178).

By this stage the researcher will have gone through the preliminary stages of theory generation. Patton (1980) sets these out for qualitative data:

• finding a focus for the research and analysis;
• organizing, processing, ordering and checking data;
• writing a qualitative description or analysis;
• inductively developing categories, typologies, and labels;
• analysing the categories to identify where further clarification and cross-clarification are needed;
• expressing and typifying these categories through metaphors (see also Pitman and Maxwell, 1992:747);
• making inferences and speculations about relationships, causes and effects.

Bogdan and Biklen (1992:154–63) identify several important matters that researchers need to address at this stage, including: forcing yourself to take decisions that will focus and narrow the study, and deciding what kind of study it will be; developing analytical questions; using previous observational data to inform subsequent data collection; writing reflexive notes and memos about observations, ideas and what you are learning; trying out ideas with subjects; analysing relevant literature whilst you are conducting the field research; and generating concepts, metaphors, analogies and visual devices to clarify the research.

Step 6: seeking negative and discrepant cases

In theory generation it is important not only to seek confirming cases but also to weigh the significance of disconfirming cases. LeCompte and Preissle (1993:270) suggest that, because interpretations of the data are grounded in the data themselves, results that fail to support an original hypothesis are neither discarded nor discredited; rather, it is the hypotheses themselves that must be modified to accommodate these data. Indeed, Erickson (1992:208) identifies progressive problem-solving as one key aspect of ethnographic research and data analysis. LeCompte and Preissle (1993:250–1) define a negative case as an exemplar which disconfirms or refutes the working hypothesis, rule or explanation so far. It is the qualitative researcher’s equivalent of the positivist’s null hypothesis. The theory that is being developed becomes more robust if it addresses negative cases, for doing so sets the boundaries to the theory; it modifies the theory, it sets parameters to the applicability of the theory.

Discrepant cases are not so much exceptions to the rule (as in negative cases) as variants of the rule (ibid.: 251). The discrepant case leads to the modification or elaboration of the construct, rule or emerging hypothesis. Discrepant case analysis requires the researcher to seek out cases for which the rule, construct or explanation cannot account or with which they will not fit, i.e. they are neither exceptions nor contradictions; they are simply different!

Step 7: theory generation

Here the theory derives from the data: it is grounded in the data and emerges from them. As Lincoln and Guba (1985:205) argue, grounded theory must fit the situation that is being researched. By going through the previous steps, particularly the search for confirming, negative and discrepant cases, the researcher is able to keep a ‘running total’ of these cases for a particular theory. The researcher also generates alternative theories for the phenomena under investigation and performs the same count of confirming, negative and discrepant cases. Lincoln and Guba (ibid.: 253) argue that the theory with the greatest incidence of confirming cases and the lowest incidence of negative and discrepant cases is the most robust.
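This ‘running total’ lends itself to a simple bookkeeping sketch. In the fragment below (Python; the theories, the counts and the one-line robustness score are invented for illustration, the score being our own crude simplification of Lincoln and Guba’s criterion rather than anything the authors specify), candidate theories are ranked by their tallies of confirming, negative and discrepant cases:

```python
# Hypothetical sketch: running totals of case types for candidate theories.
# The counts and the scoring rule are invented for illustration only.

tallies = {
    "theory A": {"confirming": 14, "negative": 1, "discrepant": 2},
    "theory B": {"confirming": 9,  "negative": 4, "discrepant": 5},
}

def robustness(t):
    # A crude stand-in for 'most confirming, fewest negative/discrepant'.
    return t["confirming"] - (t["negative"] + t["discrepant"])

most_robust = max(tallies, key=lambda name: robustness(tallies[name]))
print(most_robust)  # theory A
```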

There are several procedural tools for analysing qualitative data. LeCompte and Preissle (ibid.: 253) see analytic induction, constant comparison, typological analysis and enumeration (discussed above) as valuable tools for the qualitative researcher to use in analysing data and generating theory.

Analytic induction is a term and process that was introduced by Znaniecki (1934) deliberately in opposition to statistical methods of data analysis. LeCompte and Preissle (1993:254) suggest that the process is akin to the several steps set out above, in that: (a) data are scanned to generate categories of phenomena; (b) relationships between these categories are sought; (c) working typologies and summaries are written on the basis of the data examined; (d) these are then refined by subsequent cases and analysis; (e) negative and discrepant cases are deliberately sought to modify, enlarge or restrict the original explanation/theory. Denzin (1970:192) uses the term ‘analytical induction’ to describe the broad strategy of participant observation that is set out below:

• A rough definition of the phenomenon to be explained is formulated.
• A hypothetical explanation of that phenomenon is formulated.
• One case is studied in the light of the hypothesis, with the object of determining whether or not the hypothesis fits the facts in that case.
• If the hypothesis does not fit the facts, either the hypothesis is reformulated or the phenomenon to be explained is redefined, so that the case is excluded.
• Practical certainty may be attained after a small number of cases has been examined, but the discovery of negative cases disproves the explanation and requires a reformulation.
• This procedure of examining cases, redefining the phenomenon, and reformulating the hypothesis is continued until a universal relationship is established, each negative case calling for a redefinition or a reformulation.

A more deliberate seeking of disconfirming cases is advocated by Bogdan and Biklen (1992:72), who enumerate five main stages in analytic induction:

Step 1 In the early stages of the research a rough definition and explanation of the particular phenomenon is developed.
Step 2 This definition and explanation is examined in the light of the data that are being collected during the research.
Step 3 If the definition and/or explanation that have been generated need modification in the light of new data (e.g. if the data do not fit the explanation or definition) then this is undertaken.
Step 4 A deliberate attempt is made to find cases that may not fit into the explanation or definition.
Step 5 The process of redefinition and reformulation is repeated until an explanation is reached that embraces all the data, and until a generalized relationship has been established, which will also embrace the negative cases.
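The iterative shape of this procedure can be sketched in a few lines. In the fragment below (Python; the cases, the hypotheses and their predicates are all invented for illustration), each formulation of the working hypothesis is expressed as a predicate over cases and is abandoned for a reformulation whenever a negative case disproves it:

```python
# Hypothetical sketch of the analytic induction loop: a working hypothesis
# is tested case by case and reformulated when a negative case disproves it.
# Cases and formulations are invented for illustration only.

cases = [
    {"id": 1, "truants": True,  "low_attainment": True},
    {"id": 2, "truants": True,  "low_attainment": True},
    {"id": 3, "truants": True,  "low_attainment": False},  # negative case
]

# Successive formulations of the working hypothesis, from rough to refined.
formulations = [
    ("all truants show low attainment",
     lambda c: (not c["truants"]) or c["low_attainment"]),
    ("truancy alone does not determine attainment",
     lambda c: True),  # placeholder refinement embracing the negative case
]

for description, fits in formulations:
    negatives = [c["id"] for c in cases if not fits(c)]
    if not negatives:
        print("retained:", description)
        break
    print("reformulating:", description, "- negative cases:", negatives)
```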

Constant comparison, LeCompte and Preissle (1993:256) opine, combines the elements of inductive category coding (see above) with simultaneously comparing these with the other events and social incidents that have been observed and coded over time and location. This enables social phenomena to be compared across categories, giving rise, where necessary, to new dimensions, codes and categories.

Glaser (1978) indicates that constant comparison can proceed from the moment of starting to collect data, to seeking key issues and categories, to discovering recurrent events or activities in the data that become categories of focus, to expanding the range of categories. This process can continue during the writing-up process (which should be continuous), so that a model or explanation of the phenomena can emerge that accounts for fundamental social processes and relationships.

In constant comparison data are compared across a range of situations, times, groups of people, and through a range of methods. The process resonates with the methodological notion of triangulation. Glaser and Strauss (1967:105–6) suggest that the constant comparison method involves four stages: (1) comparing incidents and data that are applicable to each category, comparing them with previous incidents in the same category and with other data that are in the same category; (2) integrating these categories and their properties; (3) bounding the theory; (4) setting out the theory.
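The first of these stages can be given a mechanical sketch. In the fragment below (Python; the incidents, the feature-set representation and the similarity rule are all invented for illustration and are only a loose analogue of the comparative judgments an ethnographer actually makes), each new incident is compared with the incidents already coded to a category and is either assigned to the best-matching category or used to open a new one:

```python
# Hypothetical sketch of comparing each new incident with previous
# incidents in existing categories. Incidents are modelled, very crudely,
# as sets of features; the similarity rule is invented for illustration.

categories = {}  # category label -> list of incidents (feature sets)

def similarity(a, b):
    # Crude overlap measure between two incidents' feature sets.
    return len(a & b) / len(a | b)

def compare_and_assign(incident, threshold=0.3):
    best_label, best_score = None, 0.0
    for label, incidents in categories.items():
        score = max(similarity(incident, prior) for prior in incidents)
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        categories[best_label].append(incident)
        return best_label
    label = f"category-{len(categories) + 1}"
    categories[label] = [incident]
    return label

for incident in [{"queue", "teacher"}, {"queue", "pupil"}, {"playground"}]:
    print(compare_and_assign(frozenset(incident)))
# category-1, category-1, category-2
```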

Typological analysis is essentially a classificatory process (LeCompte and Preissle, 1993:257) wherein data are put into groups, subsets or categories on the basis of some clear criterion (e.g. acts, behaviour, meanings, nature of participation, relationships, settings, activities). It is the process of secondary coding (Miles and Huberman, 1984), where descriptive codes are drawn together and put into subsets. Typologies are sets of phenomena that represent subtypes of a more general set or category (Lofland, 1970). Lazarsfeld and Barton (1951) suggest that a typology can be developed in terms of an underlying dimension or key characteristic. In creating typologies Lofland insists that the researcher must: (a) deliberately assemble all the data on how a participant addresses a particular issue (what strategies are being employed); (b) disaggregate and separate out the variations between the range of instances of strategies; (c) classify these into sets and subsets; (d) present them in an ordered, named and numbered way for the reader.
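Lofland’s assemble, disaggregate, classify and present sequence maps naturally onto a short sketch. In the fragment below (Python; the instances and strategy labels are invented for illustration), observed instances are classified into named subsets on one clear criterion and then presented in an ordered, numbered way:

```python
# Hypothetical sketch of typological analysis: instances are assembled,
# then classified into named subsets on the basis of one clear criterion
# (here, the strategy a pupil uses to gain the teacher's attention).

instances = [
    ("unit-04", "calls out"),
    ("unit-09", "raises hand"),
    ("unit-11", "approaches desk"),
    ("unit-15", "raises hand"),
]

typology = {}
for unit, strategy in instances:
    typology.setdefault(strategy, []).append(unit)

# Present the sets in an ordered, named and numbered way for the reader.
for number, (strategy, units) in enumerate(sorted(typology.items()), start=1):
    print(f"Type {number}: {strategy} - {units}")
```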

Lincoln and Guba (1985:354–5) urge the researcher to be mindful of several issues in analysing and interpreting the data, including: (a) data overload; (b) the problem of acting on first impressions only; (c) the availability of people and information (e.g. how representative these are, and how to know whether missing people and data might be important); (d) the dangers of seeking only confirming rather than disconfirming instances; (e) the reliability and consistency of the data and the confidence that can be placed in the results.

These are significant issues in addressing reliability, trustworthiness and validity in the research (see the discussions of reliability and validity in Chapter 5). The essence of this approach, that theory emerges from and is grounded in data, is not without its critics. For example, Silverman (1993:47) suggests that it fails to acknowledge the implicit theories which guide research in its early stages (i.e. data are not theory-neutral but theory-saturated) and that it might be strong on providing categorizations without necessarily having explanatory potential. These are caveats that should feed into the process of reflexivity in qualitative research, perhaps.

Stage 10: leaving the field

The issue here is how to terminate the research, how to terminate the roles adopted, how (and whether) to terminate the relationships that have built up over the course of the research, and how to disengage from the field in ways that bring as little disruption to the group or situation as possible (LeCompte and Preissle, 1993:101).

Stage 11: writing the report

Delamont (1998) notes the shift in emphasis in much research literature away from the conduct of the research and towards the reporting of the research. It is often the case that the main vehicle for writing naturalistic research is the case study (see Chapter 9), whose ‘trustworthiness’ (Lincoln and Guba, 1985:189) is defined in terms of credibility, transferability, dependability and confirmability, discussed in Chapter 5. Case studies are useful in that they can provide the thick descriptions that are useful in ethnographic research, and can catch and portray to the reader what it is like to be involved in the situation (ibid.: 214). As the writers comment (p. 359), the case study is the ideal instrument for ‘emic’ inquiry. It also builds in and builds on the tacit knowledge that the writer and reader bring to the report, and thereby takes seriously their notion of the ‘human instrument’ in research, indicating the interactions of researcher and participants.

Lincoln and Guba provide several guidelines for writing case studies (ibid.: 365–6):

• the writing should strive to be informal and to capture informality;
• as far as possible the writing should report facts, except in those sections where interpretation, evaluation and inference are made explicit;
• in drafting the report it is more advisable to opt for over-inclusion rather than under-inclusion;
• the ethical conventions of report writing must be honoured, e.g. anonymity, non-traceability;
• the case study writer should make clear the data that gave rise to the report, so the readers have a means of checking back for reliability, validity and inferences;
• a fixed completion date should be specified.

Spradley suggests nine practical steps that can be followed in writing an ethnography:

Step 1 Select the audience.
Step 2 Select the thesis.
Step 3 Make a list of topics and create an outline of the ethnography.
Step 4 Write a rough draft of each section of the ethnography.
Step 5 Revise the outline and create subheadings.
Step 6 Edit the draft.
Step 7 Write an introduction and a conclusion.
Step 8 Re-read the data and report to identify examples.
Step 9 Write the final version.

Clearly there are several other aspects of case study reporting that need to be addressed. These are set out in Chapter 9.

Critical ethnography

An emerging branch of ethnography that resonates with the critical paradigm outlined in Chapter 1 is the field of critical ethnography. Here not only is qualitative, anthropological, participant, observer-based research undertaken, but its theoretical basis lies in critical theory (Quantz, 1992:448; Carspecken, 1996). As was outlined in Chapter 1, this paradigm is concerned with the exposure of oppression and inequality in society with a view to emancipating individuals and groups towards collective empowerment. In this respect research is an inherently political enterprise. Carspecken (1996:4ff.) suggests several key premises of critical ethnography:

• research and thinking are mediated by power relations;
• these power relations are socially and historically located;
• facts and values are inseparable;
• relationships between objects and concepts are fluid and mediated by the social relations of production;
• language is central to perception;
• certain groups in society exert more power than others;
• inequality and oppression are inherent in capitalist relations of production and consumption;
• ideological domination is strongest when oppressed groups see their situation as inevitable, natural or necessary;
• forms of oppression mediate each other and must be considered together (e.g. race, gender, class).

Quantz (1992:473–4) argues that research is inescapably value-laden in that it serves some interests, and that in critical ethnography the task of researchers is to expose these interests and move participants towards emancipation and freedom. The focus and process of research are thus political at heart, concerning issues of power, domination, voice and empowerment. In critical ethnography the cultures, groups and individuals being studied are located in contexts of power and interests. These contexts have to be exposed, their legitimacy interrogated, and the value base of the research itself exposed. Reflexivity is high in critical ethnography. What separates critical ethnography from other forms of ethnography is that, in the former, questions of legitimacy, power, values in society, domination and oppression are foregrounded.


How does the critical ethnographer proceed?

Carspecken and Apple (1992:512–14) and Carspecken (1996:41–2) identify five stages in critical ethnography:

Stage 1: compiling the primary record through the collection of monological data

At this stage the researcher is comparatively passive and unobtrusive—a participant observer. The task here is to acquire objective data, and it is 'monological' in the sense that it concerns only the researcher writing her own notes to herself. Lincoln and Guba (1985) suggest that validity checks at this stage will include:

• using multiple devices for recording, together with multiple observers;
• using a flexible observation schedule in order to minimize biases;
• remaining in the situation for a long time in order to overcome the Hawthorne effect;
• using low-inference terminology and descriptions;
• using peer debriefing;
• using respondent validation.

Echoing Habermas's (1979, 1982, 1984) work on validity claims, validity here includes truth (the veracity of the utterance), legitimacy (rightness and appropriateness of the speaker), comprehensibility (that the utterance is comprehensible) and sincerity (of the speaker's intentions). Carspecken (1996:104–5) takes this further in suggesting several categories of reference in objective validity: (a) that the act is comprehensible, socially legitimate and appropriate; (b) that the actor has a particular identity and particular intentions or feelings when the action takes place; (c) that objective, contextual factors are acknowledged.

Stage 2: preliminary reconstructive analysis

Reconstructive analysis attempts to uncover the taken-for-granted components of meaning or abstractions that participants have of a situation. Such analysis is intended to identify the value systems, norms and key concepts that are guiding and underpinning situations. Carspecken (ibid.: 42) suggests that the researcher goes back over the primary record from stage one to examine patterns of interaction, power relations, roles, sequences of events, and meanings accorded to situations. He asserts that what distinguishes this stage as 'reconstructive' is that cultural themes, and social and system factors, that are not usually articulated by the participants themselves are, in fact, reconstructed and articulated, making the undiscursive into discourse. In moving to higher level abstractions this stage can utilize high level coding (see the discussion of coding in this chapter).

In critical ethnography Carspecken (ibid.: 141) delineates several ways of ensuring validity at this stage:

• Use interviews and group discussions with the subjects themselves.
• Conduct member checks on the reconstruction in order to equalize power relations.
• Use peer debriefing (a peer is asked to review the data to suggest if the researcher is being too selective, e.g. of individuals, of data, of inference) to check biases or absences in reconstructions.
• Employ prolonged engagement to heighten the researcher's capacity to assume the insider's perspective.
• Use 'strip analysis'—checking themes and segments of extracted data against the primary data, for consistency.
• Use negative case analysis.

Stage 3: dialogical data collection

Here data are generated by, and discussed with, the participants (Carspecken and Apple, 1992). The authors argue that this is not naturalistic, in that the participants are being asked to reflect on their own situations, circumstances and lives and to begin to theorize about their lives. This is a crucial stage because it enables the participants to have a voice, to democratize the research. It may be that this stage produces new data that challenge the preceding two stages.


In introducing greater subjectivity by participants into the research at this stage, Carspecken (1996:164–5) proffers several validity checks, for example: (a) consistency checks on interviews that have been recorded; (b) repeated interviews with participants; (c) matching observation with what participants say is happening or has happened; (d) avoiding leading questions at interview, reinforced by having peer debriefers check on this; (e) respondent validation; (f) asking participants to use their own terms in describing naturalistic contexts, and to explain these terms.

Stage 4: discovering system relations

This stage relates the group being studied to other factors that impinge on that group, for example, local community groups or local sites that produce cultural products. At this stage Carspecken (ibid.: 202) notes that validity checks will include: (a) maintaining the validity requirements of the earlier stages; (b) seeking a match between the researcher's analysis and the commentaries that are provided by the participants and other researchers; (c) using peer debriefers and respondent validation.

Stage 5: using system relations to explain findings

This stage seeks to examine and explain the findings in light of macro-social theories (ibid.: 202). In part this is a matching exercise, to fit the research findings within a social theory.

In critical ethnography, therefore, the move is from describing a situation, to understanding it, to questioning it, and to changing it. This parallels the stages of ideology critique set out in Chapter 1:

Step 1 A description of the existing situation—a hermeneutic exercise.
Step 2 A penetration of the reasons that brought the situation to the form that it takes.
Step 3 An agenda for altering the situation.
Step 4 An evaluation of the achievement of the new situation.

Computer usage

LeCompte and Preissle (1993) provide a summary of ways in which information technology can be utilized in supporting ethnographic research (see also Tesch, 1990). As can be seen from the list below, the uses of information technology are diverse; as data have to be processed, and as word data are laborious to process, and as several powerful packages for data analysis and processing exist, researchers will find it useful to make full use of computing facilities. These can be used as follows (LeCompte and Preissle, 1993:280–1):

• To store and check (e.g. proofread) data.
• To collate and segment data and to make numerous copies of data.
• To enable memoing to take place, together with details of the circumstances in which the memos were written.
• To conduct a search for words or phrases in the data and to retrieve text.
• To attach identification labels to units of text (e.g. questionnaire responses), so that subsequent sorting can be undertaken.
• To partition data into units that have been determined either by the researcher or in response to the natural language itself.
• To enable preliminary coding of data to be undertaken.
• To sort, re-sort, collate, classify and reclassify pieces of data to facilitate constant comparison and to refine schemas of classification.
• To code memos and bring them into the same schema of classification.
• To assemble, re-assemble and recall data into categories.
• To undertake frequency counts (e.g. of words, phrases, codes).
• To cross-check data to see if they can be coded into more than one category, enabling linkages between categories to be discovered.
• To establish the incidence of data that are contained in more than one category.
• To retrieve coded data segments from subsets (e.g. by sex) in order to compare and contrast data.
• To search for pieces of data that appear in a certain (e.g. chronological) sequence.
• To establish linkages between coding categories.
• To display relationships of categories (e.g. hierarchical, temporal, relational, subsumptive, superordinate).

• To quote data in the final report.

Kelle (1995) suggests that computers are particularly effective at coping with the often-encountered problem of data overload and retrieval in qualitative research. Computers, it is argued, enable the researcher to use codes, memos, hypertext systems, selective retrieval and co-occurring codes, and to perform quantitative counts of qualitative data types (see also Seidel and Kelle, 1995). In turn, these authors suggest, this enables linkages of elements to be undertaken, the building of networks and, ultimately, theory generation. Indeed Lonkila (1995) indicates how computers can assist in the generation of grounded theory through coding, constant comparison, linkages, memoing, use of diagrams, verification and, ultimately, theory building. In this process Kelle and Laurie (1995:27) suggest that computer-aided methods can enhance: (a) validity (by the management of samples); and (b) reliability (by retrieving all the data on a given topic, thereby ensuring trustworthiness of the data).
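To make the storage, retrieval and counting functions concrete, the sketch below shows in Python how coded segments might be held, retrieved and counted. It is a minimal illustration only: the field notes, code labels and function names are all invented for this example, and no particular software package named below is being reconstructed.

from collections import Counter

# Hypothetical coded field notes: each segment of text carries one or
# more researcher-assigned codes (all invented for illustration).
segments = [
    {"id": 1, "text": "Teacher praises pupil for effort.",   "codes": {"praise", "teacher-pupil"}},
    {"id": 2, "text": "Pupil asks to work alone.",           "codes": {"autonomy"}},
    {"id": 3, "text": "Teacher praises group presentation.", "codes": {"praise", "groupwork"}},
]

def retrieve(code):
    """Retrieve every segment tagged with a given code."""
    return [s for s in segments if code in s["codes"]]

# Frequency counts of codes across the whole data set.
code_counts = Counter(code for s in segments for code in s["codes"])

# Segments coded into more than one category, from which linkages
# between categories might be explored.
multi_coded = [s["id"] for s in segments if len(s["codes"]) > 1]

print(retrieve("praise"))  # the two 'praise' segments
print(code_counts)         # e.g. Counter({'praise': 2, 'teacher-pupil': 1, ...})
print(multi_coded)         # [1, 3]

Dedicated packages wrap operations of this kind in facilities for memoing, hierarchical code books and Boolean retrieval, but the underlying logic is broadly as sketched.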

A major feature of computer use is in the coding and compilation of data (see, for example, Kelle, 1995:62–104). Lonkila (1995) identifies several kinds of codes. Open coding generates categories and defines their properties and dimensions. Axial coding works within one category, making connections between subgroups of that category, and makes connections between one category and another. This might be in terms of the phenomena that are being studied, the causal conditions that lead to the phenomena, the context of the phenomena and their intervening conditions, and the actions and interactions of, and consequences for, the actors in situations. Selective coding identifies the core categories of text data. Seidel and Kelle (1995) suggest that codes can denote a text, passage or fact, and can be used to construct data networks.
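One hedged way of picturing the progression from open to axial to selective coding computationally is sketched below. The categories are invented, and the frequency heuristic in the final step is a crude stand-in for the analytic judgement by which a core category is actually chosen.

from collections import Counter
from itertools import combinations

# Open coding: first-pass categories assigned to segments (invented).
coded_segments = [
    {"praise", "teacher-pupil"},
    {"praise", "groupwork"},
    {"praise", "autonomy"},
    {"teacher-pupil", "autonomy"},
    {"praise", "teacher-pupil"},
]

# Axial coding: count co-occurrences of categories to suggest
# connections between one category and another.
links = Counter()
for codes in coded_segments:
    for pair in combinations(sorted(codes), 2):
        links[pair] += 1

# Selective coding: a crude frequency heuristic nominates a candidate
# core category; real selective coding rests on analytic judgement.
frequencies = Counter(code for codes in coded_segments for code in codes)
core_category = frequencies.most_common(1)[0][0]

print(links.most_common(1))  # [(('praise', 'teacher-pupil'), 2)]
print(core_category)         # 'praise'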

There are several computer packages for qualitative data (see Kelle, 1995), for example: AQUAD; ATLAS/ti; HyperQual2; HyperRESEARCH; Hypersoft; Kwalitan; Martin; MAX; WINMAX; NUD.IST; QUALPRO; Textbase Alpha; ETHNOGRAPH; Code-A-Text; Decision Explorer; Diction. Some of these are reviewed by Prein, Kelle and Bird (1995:190–209).

To conclude this chapter we identify a number of difficulties that arise in the implementation of ethnographic and naturalistic research programmes.

Some problems with ethnographic and naturalistic approaches

There are several difficulties in ethnographic and naturalistic approaches. These might affect the reliability and validity of the research, and include:

1 The definition of the situation—the participants are being asked for their definition of the situation, yet they have no monopoly on wisdom. They may be 'falsely conscious' (unaware of the 'real' situation), deliberately distorting or falsifying information, or highly selective. The issues of reliability and validity here are addressed in Chapter 5 (see the discussions of triangulation).

2 Reactivity (the Hawthorne effect)—the presence of the researcher alters the situation, as participants may wish to avoid, impress, direct, deny or influence the researcher. Again, this is discussed in Chapter 5. Typically the problem of reactivity is addressed by careful negotiation in the field, remaining in the field for a considerable time, and ensuring as far as possible a careful presentation of the researcher's self.

3 The halo effect—where existing or given information about the situation or participants might be used to be selective in subsequent data collection, or may bring about a particular reading of a subsequent situation (the research equivalent of the self-fulfilling prophecy). This is an issue of reliability, and can be addressed by the use of a wide, triangulated data base and the assistance of an external observer.

4 The implicit conservatism of the interpretive methodology—the kind of research described in this chapter, with the possible exception of critical ethnography, accepts the perspective of the participants and corroborates the status quo. It is focused on the past and the present rather than on the future.

5 There is the difficulty of focusing on the familiar—participants (and maybe researchers too) being so close to the situation that they neglect certain, often tacit, aspects of it. The task, therefore, is to make the familiar strange. Delamont (1981) suggests that this can be done by:

• studying unusual examples of the same issue (e.g. atypical classrooms, timetabling or organizations of schools);
• studying examples in other cultures;
• studying other situations that might have a bearing on the situation in hand (e.g. if studying schools it might be useful to look at other similar-but-different organizations, for instance hospitals or prisons);
• taking a significant issue and focusing on it deliberately, e.g. gendered behaviour.

6 The open-endedness and diversity of the situations studied. Hammersley and Atkinson (1983) counsel that the drive towards focusing on specific contexts and situations might overemphasize the difference between contexts and situations rather than their gross similarity, their routine features. Researchers, they argue, should be as aware of regularities as of differences.

7 The neglect of wider social contexts and constraints. Studies that emphasize how highly context-bound situations are risk neglecting broader currents and contexts—micro-level research risks drawing boundaries that exclude important macro-level factors. Wider, macro-level contexts cannot be ruled out of individual situations.

8 The issue of generalizability. If situations are unique and non-generalizable, as many naturalistic principles would suggest, how is the issue of generalizability going to be addressed? To which contexts will the findings apply, and what is the role and nature of replication studies?

9 How to write up multiple realities and explanations? How will a representative view be reached? What if the researcher sees things that are not seen by the participants?

10 Who owns the data, the report, and who has control over the release of the data?

Naturalistic and ethnographic research, then, are important but problematical research methods in education. Their widespread use signals their increasing acceptance as legitimate and important styles of research.


7 Historical research

Introduction

Mouly (1978) states that while historical research cannot meet some of the tests of the scientific method interpreted in the specific sense of its use in the physical sciences (it cannot depend, for instance, on direct observation or experimentation, but must make use of reports that cannot be repeated), it qualifies as a scientific endeavour from the standpoint of its subscription to the same principles and the same general scholarship that characterize all scientific research.1

Historical research has been defined as the systematic and objective location, evaluation and synthesis of evidence in order to establish facts and draw conclusions about past events (Borg, 1963). It is an act of reconstruction undertaken in a spirit of critical inquiry designed to achieve a faithful representation of a previous age. In seeking data from the personal experiences and observations of others, from documents and records, researchers often have to contend with inadequate information, so that their reconstructions tend to be sketches rather than portraits. Indeed, the difficulty of obtaining adequate data makes historical research one of the most taxing kinds of inquiry to conduct satisfactorily.2

Reconstruction implies a holistic perspective in that the method of inquiry characterizing historical research attempts to 'encompass and then explain the whole realm of man's past in a perspective that greatly accents his social, cultural, economic, and intellectual development' (Hill and Kerber, 1967).

Ultimately, historical research is concerned with a broad view of the conditions and not necessarily the specifics which bring them about, although such a synthesis is rarely achieved without intense debate or controversy, especially on matters of detail. The act of historical research involves the identification and limitation of a problem or an area of study; sometimes the formulation of a hypothesis (or set of questions); the collection, organization, verification, validation, analysis and selection of data; testing the hypothesis (or answering the questions) where appropriate; and writing a research report. This sequence leads to a new understanding of the past and its relevance to the present and future.

The values of historical research have been categorized by Hill and Kerber as follows:

• it enables solutions to contemporary problems to be sought in the past;
• it throws light on present and future trends;
• it stresses the relative importance and the effects of the various interactions that are to be found within all cultures;
• it allows for the revaluation of data in relation to selected hypotheses, theories and generalizations that are presently held about the past.

As the writers point out, the ability of history to employ the past to predict the future, and to use the present to explain the past, gives it a dual and unique quality which makes it especially useful for all sorts of scholarly study and research.3

The particular value of historical research in the field of education is unquestioned. It can, for example, yield insights into some educational problems that could not be achieved by any other means. Further, the historical study of an educational idea or institution can do much to help us understand how our present educational system has come about; and this kind of understanding can in turn help to establish a sound basis for further progress or change. Historical research in education can also show how and why educational theories and practices developed. It enables educationalists to use former practices to evaluate newer, emerging ones. Recurrent trends can be more easily identified and assessed from a historical standpoint—witness, for example, the various guises in which progressivism in education has appeared. And it can contribute to a fuller understanding of the relationship between politics and education, between school and society, between local and central government, and between teacher and pupil.4

Historical research in education may concern itself with an individual, a group, a movement, an idea or an institution. As Best (1970) points out, however, not one of these objects of historical interest and observation can be considered in isolation. No one person can be subjected to historical investigation without some consideration of his or her contribution to the ideas, movements or institutions of a particular time or place. These elements are always interrelated. The focus merely determines the point of emphasis towards which historical researchers direct their attention. Box 7.1 illustrates some of these relationships from the history of education. For example, no matter whether the historian chooses to study the Jesuit order, religious teaching orders, the Counter-Reformation or Ignatius Loyola, each of the other elements appears as a prominent influence or result, and an indispensable part of the narrative. For examples of historical research see Thomas (1992) and Gaukroger and Schwartz (1997).

Box 7.1 Some historical interrelations between men, movements and institutions
Source: Adapted from Best, 1970

Choice of subject

As with other methods we consider in this book, historical research may be structured by a flexible sequence of stages, beginning with the selection and evaluation of a problem or area of study. Then follows the definition of the problem in more precise terms, the selection of suitable sources of data, the collection, classification and processing of the data, and finally, the evaluation and synthesis of the data into a balanced and objective account of the subject under investigation. There are, however, some important differences between the method of historical research and other research methods used in education. The principal difference has been highlighted by Borg:

In historical research, it is especially important that the student carefully defines his problem and appraises its appropriateness before committing himself too fully. Many problems are not adaptable to historical research methods and cannot be adequately treated using this approach. Other problems have little or no chance of producing significant results either because of the lack of pertinent data or because the problem is a trivial one.

(Borg, 1963)

One can see from Borg's observations that the choice of a problem can sometimes be a daunting business for the potential researcher. Once a topic has been selected, however, and its potential and significance for historical research evaluated, the next stage is to define it more precisely, or, perhaps more pertinently, delimit it so that a more potent analysis will result. Too broad or too vague a statement can result in the final report lacking direction or impact. Best expresses it like this: 'The experienced historian realizes that research must be a penetrating analysis of a limited problem, rather than the superficial examination of a broad area. The weapon of research is the rifle not the shotgun' (Best, 1970). Various prescriptions exist for helping to define historical topics. Gottschalk (1951) recommends that four questions should be asked in identifying a topic:

• Where do the events take place?
• Who are the people involved?
• When do the events occur?
• What kinds of human activity are involved?

As Travers (1969) suggests, the scope of a topic can be modified by adjusting the focus of any one of the four categories: the geographical area involved can be increased or decreased; more or fewer people can be included in the topic; the time span involved can be increased or decreased; and the human activity category can be broadened or narrowed. It sometimes happens that a piece of historical research can only begin with a rough idea of what the topic involves, and that delimitation of it can only take place after the pertinent material has been assembled.

In hand with the careful specification of the problem goes the need, where this is appropriate, for an equally specific and testable hypothesis (sometimes a sequence of questions may be substituted). As in empirical research, the hypothesis gives direction and focus to data collection and analysis. It imposes a selection, a structure, on what would otherwise be an overwhelming mass of information. As Borg (1963) observes:

Without hypotheses, historical research often becomes little more than an aimless gathering of facts. In searching the materials that make up the sources of historical research data, unless the student's attention is aimed at information relating to specific questions or concerned with specific hypotheses, he [sic] has little chance of extracting a body of data from the available documents that can be synthesized to provide new knowledge or new understanding of the topic studied. Even after specific hypotheses have been established, the student must exercise strict self-control in his study of historical documents or he will find himself collecting much information that is interesting but is not related to his area of inquiry. If the student's hypotheses are not sufficiently delimited or specific, it is an easy matter for him to become distracted and led astray by information that is not really related to his field of investigation.

Hill and Kerber (1967) have pointed out that the evaluation and formulation of a problem associated with historical research often involve the personality of the researcher to a greater extent than do other basic types of research. They suggest that personal factors of the investigator such as interest, motivation, historical curiosity, and educational background for the interpretation of historical facts tend to influence the selection of the problem to a great extent.

Data collection

One of the principal differences between historical research and other forms of research is that historical research must deal with data that already exist. Hockett (1955) expresses it thus:

History is not a science of direct observation, like chemistry and physics. The historian like the geologist interprets past events by the traces they have left; he deals with the evidence of man's past acts and thoughts. But the historian, no less than the scientist, must utilize evidence resting on reliable observation. The difference in procedure is due to the fact that the historian usually does not make his own observations, and that those upon whose observations he must depend are, or were, often if not usually untrained observers. Historical method is, strictly speaking, a process supplementary to observations, a process by which the historian attempts to test the truthfulness of the reports of observations made by others. Like the scientist, he [sic] examines his data and formulates hypotheses, i.e. tentative conclusions. These conjectures he must test by seeking fresh evidence or re-examining the old, and this process he must continue until, in the light of all available evidence, the hypotheses are abandoned as untenable or modified until they are brought into conformity with the available evidence.

(Hockett, 1955)

Sources of data in historical research may be classified into two main groups: primary sources, which are the life blood of historical research; and secondary sources, which may be used in the absence of, or to supplement, primary data.

Primary sources of data have been described as those items that are original to the problem under study and may be thought of as being in two categories, thus:

1 The remains or relics of a given period. Although such remains and artefacts as skeletons, fossils, weapons, tools, utensils, buildings, pictures, furniture, coins and objets d'art were not meant to transmit information to subsequent eras, nevertheless they may be useful sources providing sound evidence about the past.

2 Those items that have had a direct physical relationship with the events being reconstructed. This category would include not only the written and oral testimony provided by actual participants in, or witnesses of, an event, but also the participants themselves. Documents considered as primary sources include manuscripts, charters, laws; archives of official minutes or records, files, letters, memoranda, memoirs, biography, official publications, wills, newspapers and magazines, maps, diagrams, catalogues, films, paintings, inscriptions, recordings, transcriptions, log books and research reports. All these are, intentionally or unintentionally, capable of transmitting a first-hand account of an event and are therefore considered as sources of primary data. Historical research in education draws chiefly on the kind of sources identified in this second category.

Secondary sources are those that do not bear a direct physical relationship to the event being studied. They are made up of data that cannot be described as original. A secondary source would thus be one in which the person describing the event was not actually present but obtained descriptions from another person or source. These may or may not have been primary sources. Other instances of secondary sources used in historical research include: quoted material, textbooks, encyclopedias, other reproductions of material or information, prints of paintings or replicas of art objects. Best (1970) points out that secondary sources of data are usually of limited worth because of the errors that result when information is passed on from one person to another.

Various commentators stress the importance of using primary sources of data where possible (Hill and Kerber, 1967). The value of secondary sources, too, should not be minimized. There are numerous occasions where a secondary source can contribute significantly to more valid and reliable historical research than would otherwise be the case.

One further point: the review of the literature in other forms of educational research is regarded as a preparatory stage to gathering data and serves to acquaint researchers with previous research on the topics they are studying (Travers, 1969). It thus enables them to continue in a tradition, to place their work in context, and to learn from earlier endeavours. The function of the review of the literature in historical research, however, is different in that it provides the data for research; the researchers' acceptance or otherwise of their hypotheses will depend on their selection of information from the review and the interpretation they put on it. Borg (1963) has identified other differences: one is that the historical researcher will have to peruse longer documents than the empirical researcher, who normally studies articles that are very much more succinct and precise. Further, documents required in historical research often date back much further than those in empirical research. And one final point: documents in education often consist of unpublished material and are therefore less accessible than reports of empirical studies in professional journals.

For a detailed consideration of the specific problems of documentary research, the reader is referred to the articles by Platt (1981), where she considers authenticity, availability of documents, sampling problems, inference and interpretation.

Evaluation

Because workers in the field of historical research gather much of their data and information from records and documents, these must be carefully evaluated so as to attest their worth for the purposes of the particular study. Evaluation of historical data and information is often referred to as historical criticism, and the reliable data yielded by the process are known as historical evidence. Historical evidence has thus been described as that body of validated facts and information which can be accepted as trustworthy, as a valid basis for the testing and interpretation of hypotheses. Historical criticism is usually undertaken in two stages: first, the authenticity of the source is appraised; and second, the accuracy or worth of the data is evaluated. The two processes are known as external and internal criticism respectively, and since they each present problems of evaluation they merit further inspection.

External criticism

External criticism is concerned with establishing the authenticity or genuineness of data. It is therefore aimed at the document (or other source) itself rather than the statements it contains; with analytic forms of the data rather than the interpretation or meaning of them in relation to the study. It therefore sets out to uncover frauds, forgeries, hoaxes, inventions or distortions. To this end, the tasks of establishing the age or authorship of a document may involve tests of factors such as signatures, handwriting, script, type, style, spelling and place-names. Further, was the knowledge it purports to transmit available at the time, and is it consistent with what is known about the author or period from another source? Increasingly sophisticated analyses of physical factors can also yield clues establishing authenticity or otherwise: physical and chemical tests of ink, paper, parchment, cloth and other materials, for example. Investigations in the field of educational history are less likely to encounter deliberate forgeries than in, say, political or social history, though it is possible to find that official documents, correspondence and autobiographies have been 'ghosted', that is, prepared by a person other than the alleged author or signer.

Internal criticism

Having established the authenticity of the document, the researcher's next task is to evaluate the accuracy and worth of the data contained therein. While they may be genuine, they may not necessarily disclose the most faithful picture. In their concern to establish the meaning and reliability of data, investigators are confronted with a more difficult problem than external criticism because they have to establish the credibility of the author of the documents. Travers (1969) has listed those characteristics commonly considered in making evaluations of writers. Were they trained or untrained observers of the events? In other words, how competent were they? What were their relationships to the events? To what extent were they under pressure, from fear or vanity, say, to distort or omit facts? What were the intents of the writers of the documents? To what extent were they experts at recording those particular events? Were the habits of the authors such that they might interfere with the accuracy of recordings? Were they too antagonistic or too sympathetic to give true pictures? How long after the event did they record their testimonies? And were they able to remember accurately? Finally, are they in agreement with other independent witnesses?

Many documents in the history of education tend to be neutral in character, though it is possible that some may be in error because of these kinds of observer characteristics. A particular problem arising from the questions posed by Travers is that of bias. This can be particularly acute where life histories are being studied. The chief concern here, as Plummer (1983) reminds us, resides in examining possible sources of bias which prevent researchers from finding out what is wanted, and in using techniques to minimize the possible sources of bias.

Researchers generally recognize three sources of bias: those arising from the subject being interviewed, those arising from themselves as researchers and those arising from the subject-researcher interaction (Travers, 1969).5

Writing the research report

Once the data have been gathered and subjected to external criticism for authenticity and to internal criticism for accuracy, the researcher is next confronted with the task of piecing together an account of the events embraced by the research problem. This stage is known as the process of synthesis. It is probably the most difficult phase in the project and calls for considerable imagination and resourcefulness. The resulting pattern is then applied to the testing of the hypothesis.

The writing of the final report is equally demanding and calls for creativity and high standards of objective and systematic analysis.

Best (1970) has listed the kinds of problems occurring in the various types of historical research projects submitted by students. These include:

• Defining the problem too broadly.
• The tendency to use easy-to-find secondary sources of data rather than sufficient primary sources, which are harder to locate but usually more trustworthy.
• Inadequate historical criticism of data, due to failure to establish authenticity of sources and trustworthiness of data. For example, there is often a tendency to accept a statement as necessarily true when several observers agree. It is possible that one may have influenced the others, or that all were influenced by the same inaccurate source of information.
• Poor logical analysis resulting from:
  • oversimplification—failure to recognize the fact that causes of events are more often multiple and complex than single and simple;
  • overgeneralization on the basis of insufficient evidence, and false reasoning by analogy, basing conclusions upon superficial similarities of situations;
  • failure to interpret words and expressions in the light of their accepted meaning in an earlier period;
  • failure to distinguish between significant facts in a situation and those that are irrelevant or unimportant.
• Expression of personal bias, as revealed by statements lifted out of context for purposes of persuasion, assuming too generous or uncritical an attitude towards a person or idea (or being too unfriendly or critical), excessive admiration for the past (sometimes known as the 'old oaken bucket' delusion), or an equally unrealistic admiration for the new or contemporary, assuming that all change represents progress.
• Poor reporting in a style that is dull and colourless, too flowery or flippant, too persuasive or of the 'soap-box' type, or lacking in proper usage.

Borg and Gall (1979:400) suggest several mistakes that can be made in conducting historical research:


• The selection of a topic for which historical sources are slight, inaccessible or non-existent.
• Over-reliance on secondary sources.
• Failure to subject the historical sources to internal or external validity/criticism checks.
• Lack of reflexivity, and the researcher's selectivity and bias in using sources.
• Importing concepts from other disciplines.
• Making illegitimate inferences of causality and monocausality.
• Generalizing beyond acceptable limits of the data.
• Listing facts without appropriate thematization.

In addition to these, Sutherland (1969) has brilliantly illustrated two further common errors among historians of education. These are, first, projecting current battles backwards onto a historical background, which leads to distortion; and second, 'description in a vacuum', which fails to illustrate the relationship of the educational system to the structure of society. To conclude on a more positive note, Mouly (1978) itemizes five basic criteria for evaluating historical research:

• Problem Has the problem been clearly defined? It is difficult enough to conduct historical research adequately without adding to the confusion by starting out with a nebulous problem. Is the problem capable of solution? Is it within the competence of the investigator?
• Data Are data of a primary nature available in sufficient completeness to provide a solution, or has there been an overdependence on secondary or unverifiable sources?
• Analysis Has the dependability of the data been adequately established? Has the relevance of the data been adequately explored?
• Interpretation Does the author display adequate mastery of his [sic] data and insight into their relative significance? Does he display adequate historical perspective? Does he maintain his objectivity or does he allow personal bias to distort the evidence? Are his hypotheses plausible? Have they been adequately tested? Does he take a sufficiently broad view of the total situation? Does he see the relationship between his data and other 'historical facts'?
• Presentation Does the style of writing attract as well as inform? Does the report make a contribution on the basis of newly discovered data or new interpretation, or is it simply 'uninspired hack-work'? Does it reflect scholarliness?

The use of quantitative methods

By far the greater part of research in historical studies is qualitative in nature. This is so because the proper subject-matter of historical research consists to a great extent of verbal and other symbolic material emanating from a society's or a culture's past. The basic skills required of the researcher to analyse this kind of qualitative or symbolic material involve collecting, classifying, ordering, synthesizing, evaluating and interpreting. At the basis of all these acts lies sound personal judgement. In the comparatively recent past, however, attempts have been made to apply the quantitative methods of the scientist to the solution of historical problems (Travers, 1969). Of these methods, the one having greatest relevance to historical research is that of content analysis, the basic goal of which is to take a verbal, non-quantitative document and transform it into quantitative data (Bailey, 1978).

Content analysis itself has been defined as 'a multipurpose research method developed specifically for investigating a broad spectrum of problems in which the content of communication serves as a basis of inference',6 from word counts (Travers, 1969) to categorization. Approaches to content analysis are careful to identify appropriate categories and units of analysis, both of which will reflect the nature of the document being analysed and the purpose of the research. Categories are normally determined after initial inspection of the document and will cover the main areas of content.
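As a concrete illustration of the word-count end of this spectrum, the following Python sketch counts the occurrences of category keywords in a passage. The document text, category names and keyword sets are all invented for this example; a real analysis would derive its categories from inspection of the documents themselves, as described above.

import re
from collections import Counter

# A hypothetical passage from an educational document, and two invented
# analysis categories, each defined by a small set of keywords.
document = """The inspector urged discipline and order in the school,
while the master pleaded for play, freedom and the child's own interests."""

categories = {
    "traditional": {"discipline", "order", "authority"},
    "progressive": {"play", "freedom", "interests"},
}

# The unit of analysis here is the single word; counts are case-insensitive.
words = re.findall(r"[a-z']+", document.lower())
word_counts = Counter(words)

# Category incidence: total occurrences of each category's keywords.
category_counts = {
    name: sum(word_counts[w] for w in keywords)
    for name, keywords in categories.items()
}

print(category_counts)  # {'traditional': 2, 'progressive': 3}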


We can readily see how the technique of content analysis may be applied to selected aspects of historical research in education. It could be used, for instance, in the analysis of educational documents. In addition to elucidating the content of the document, the method may throw additional light on the source of the communication, its author, and on its intended recipients, those to whom the message is directed. Further, an analysis of this kind would tell us more about the social context and the kinds of factors stressed or ignored, and of the influence of political factors, for instance. It follows from this that content analysis may form the basis of comparative or cross-cultural studies. The purposes of content analysis have been identified by Holsti (1968):

• To describe trends in communication content.
• To relate known characteristics of sources to the messages they produce.
• To audit communication content against standards.
• To analyse techniques of persuasion.
• To analyse style.
• To relate known attributes of the audience to messages produced for them.
• To describe patterns of communication.

Different examples of the use of content analysis in historical contexts are provided by Thomas and Znaniecki (1918)7 and Bradburn and Berlew (1961). A further example of content analysis in historical settings is McClelland et al.'s (1953) study of the relationship between the need to achieve (n'ach, for short) among members of a society and the economic growth of the particular society in question. Finally, for a more detailed and technical consideration of the use of quantitative methods in historical research, a study which looks at the classifying and arranging of historical data and reviews basic descriptive statistics, we refer the reader to Floud (1979).

Life histories

Thomas and Znaniecki's monumental study, The Polish Peasant in Europe and America (1918), serves as an appropriate introduction to this section, for their detailed account of the life and times of Wladek Wisniewski is commonly held to be the first sociological life history.

The life history, according to Plummer (1983), is frequently a full-length book about one person's life in his or her own words. Often, Plummer observes, it is gathered over a number of years, the researcher providing gentle guidance to the subject, encouraging him or her either to write down episodes of life or to tape-record them. And often as not, these materials will be backed up with intensive observations of the subject's life, with interviews of the subject's friends and acquaintances and with close scrutiny of relevant documents such as letters, diaries and photographs. Essentially, the life history is an 'interactive and co-operative technique directly involving the researcher' (Plummer, 1983).

Recent accounts of the perspectives and interpretations of people in a variety of educational settings are both significant and pertinent,8 for they provide valuable 'insights into the ways in which educational personnel come to terms with the constraints and conditions in which they work' (Goodson, 1983). Life histories, Goodson argues, 'have the potential to make a far-reaching contribution to the problem of understanding the links between "personal troubles" and "public issues", a task that lies at the very heart of the sociological enterprise'. Their importance, he asserts, 'is best confirmed by the fact that teachers continually, most often unsolicited, import life history data into their accounts of classroom events' (Goodson, 1983).

Miller (1999) demonstrates that biographical research is a distinctive way of conceptualizing social activity. He provides outlines of the three main approaches to analysis, that is to say:

• the realist, which is focused upon grounded-theory techniques;
• the neo-positivist, employing more structured interviews; and
• the narrative, with its emphasis on using the interplay between interviewer and interviewee to actively construct life histories.


Denzin (1999) suggests that there are several varieties of biographical research methods, including: biography, autobiography, story, discourse, narrative writing, personal history, oral history, case history, life history, personal experience, and case study. This is addressed further by Connelly and Clandinin (1999), who indicate several approaches to narrative inquiry:

• oral history;
• stories;
• annals and chronicles;
• photographs;
• memory boxes;
• interviews;
• journals;
• autobiography;
• letters;
• conversations;
• documents.

In exploring the appropriateness of life history techniques to a particular research project, and with ever-present constraints of time, facilities and finance in mind, it is useful to distinguish life histories both by type and mode of presentation, both factors bearing directly upon the scope and feasibility of the research endeavour. Box 7.2 draws on an outline by Hitchcock and Hughes (1989). Readers may wish to refer to the descriptions of types and modes of presentation contained in Box 7.2 in assessing the differing demands that are made on intending researchers as they gather, analyse and present their data. Whether retrospective or contemporaneous, a life history involves five broad research processes. These have been identified and described by Plummer (1983).

Preparation

This involves the researcher both in selecting an appropriate problem and devising relevant research techniques. Questions to be asked at this stage are, first, 'Who is to be the object of the study?'—the great person, the common person, the volunteer, the selected, the coerced? Second, 'What makes a good informant?' Plummer draws attention to key factors such as accessibility of place and availability of time, and the awareness of the potential informant of his/her particular cultural milieu. A good informant is able and willing to establish and maintain a close, intimate relationship with the researcher. It is axiomatic that common sympathies and mutual respect are prerequisites for the sustenance and success of a life history project. Third, 'What needs clarifying in the early stages of the research?' The motivations of the researcher need to be made explicit to the intended subject. So too, the question of remuneration for the subject's services should be clarified from the outset. The issue of anonymity must also be addressed, for unlike other research methodologies, life histories reveal intimate details (names, places, events) and provide scant cover from prying eyes. The earlier stages of the project also provide opportunities for discussing with the research subject the precise nature of the life history study, the logistics of interview situations and modes of data recording.

Box 7.2 A typology of life histories and their modes of presentation

Types

Retrospective life history: a reconstruction of past events from the present feelings and interpretations of the individual concerned.

Contemporaneous life history: a description of an individual's daily life in progress, here and now.

Modes of presentation

Naturalistic: a first-person life history in which the life story is largely in the words of the individual subject, supported by a brief introduction, commentary and conclusion on the part of the researcher.

Thematically-edited: the subject's words are retained intact but are presented by the researcher in terms of a series of themes, topics or headings, often in chapter-by-chapter format.

Interpreted and edited: the researcher's influence is most marked in his/her version of a subject's life story which the researcher has sifted, distilled, edited and interpreted.

Source: Adapted from Hitchcock and Hughes, 1989

Data collection

Central to the success of a life history is the researcher's ability to use a variety of interview techniques (see also Chapter 15). As the occasion demands, these may range from relatively structured interviews that serve as general guides from the outset of the study, to informal, unstructured interviews reminiscent of the non-directive counselling approaches espoused by Carl Rogers (1945) and his followers. In the case of the latter, Plummer (1983) draws attention to the importance of empathy and 'non-possessive warmth' on the part of the interviewer-researcher. A third interviewing strategy involves a judicious mixture of participant observation (see Chapter 17) and casual chatting, supplemented by note-taking.

Data storage

Typically, life histories generate enormous amounts of data. Intending researchers must make early decisions about the use of tape-recordings, the how, what and when of their transcription and editing, and the development of coding and filing devices if they are to avoid being totally swamped by the materials created. Readers are referred to the discussion in Chapter 9 and to Fiedler's (1978) extensive account of methods appropriate to field studies in natural settings.

Data analysis

Three central issues underpin the quality of data generated by life history methodology. They are to do with representativeness, reliability and validity (see also Chapters 5, 9 and 15).

Plummer draws attention to a frequent criticism of life history research, namely that its cases are atypical rather than representative. To avoid this charge, he urges intending researchers to 'work out and explicitly state the life history's relationship to a wider population' (Plummer, 1983) by way of appraising the subject on a continuum of representativeness and non-representativeness.

Reliability in life history research hinges upon the identification of sources of bias and the application of techniques to reduce them. Bias arises from the informant, the researcher, and the interactional encounter itself (Plummer, 1983); these were presented in Box 5.1. Several validity checks are available to intending researchers. Plummer identifies the following:

• The subject of the life history may present an autocritique of it, having read the entire product.
• A comparison may be made with similar written sources by way of identifying points of major divergence or similarity.
• A comparison may be made with official records by way of imposing accuracy checks on the life history.
• A comparison may be made by interviewing other informants.

Essentially, the validity of any life history lies in its ability to represent the informant's subjective reality, that is to say, his or her definition of the situation.

Data presentation

Plummer provides three points of direction for the researcher intent upon writing a life history. First, have a clear view of who you are writing for and what you wish to accomplish by writing the account. Are you aiming to produce a case history or a case study? Case histories 'tell a good story for its own sake' (Plummer, 1983). Case studies, by contrast, use personal documents for wider theoretical purposes such as the verification and/or the generation of theory. Second, having established the purpose of the life history, decide how far you should intrude upon your assembled data. Intrusion occurs both through editing and interpreting. Editing ('cutting', sequencing, disguising names, places etc.) is almost a sine qua non of any life history study. Paraphrasing Plummer, editing involves getting your subject's own words, grasping them from the inside and turning them into a structured and coherent statement that uses the subject's words in places and your own, as researcher, in others, but retains their authentic meaning at all times. Third, as far as the mechanics of writing a life history are concerned, practise writing regularly. Writing, Plummer observes, needs working at, and daily drafting, revising and redrafting is necessary. For an example of life history methodology and research see Evetts (1991).

8 Surveys, longitudinal, cross-sectional and trend studies

Many educational research methods are descriptive; that is, they set out to describe and to interpret what is. Descriptive research, according to Best, is concerned with:

conditions or relationships that exist; practices that prevail; beliefs, points of view, or attitudes that are held; processes that are going on; effects that are being felt; or trends that are developing. At times, descriptive research is concerned with how what is or what exists is related to some preceding event that has influenced or affected a present condition or event.

(Best, 1970)

Such studies look at individuals, groups, institutions, methods and materials in order to describe, compare, contrast, classify, analyse and interpret the entities and the events that constitute their various fields of inquiry.

This chapter deals with several types of descriptive survey research, including longitudinal, cross-sectional and trend or prediction studies. Collectively, longitudinal, cross-sectional and trend or prediction studies are sometimes termed developmental research because they are concerned both to describe what the present relationships are among variables in a given situation and to account for changes occurring in those relationships as a function of time. The term 'developmental' is primarily biological, having to do with the organization and life processes of living things. The concept has been appropriated and applied to diverse educational, historical, sociological and psychological phenomena. In education, for example, developmental studies often retain the original biological orientation of the term, having to do with the acquisition of motor and perceptual skills in young children. However, the designation 'developmental' has wider application in this field, for example, in connection with Piaget's studies of qualitative changes occurring in children's thinking, and Kohlberg's work on moral development.

Typically, surveys gather data at a particular point in time with the intention of describing the nature of existing conditions, or identifying standards against which existing conditions can be compared, or determining the relationships that exist between specific events. Thus, surveys may vary in their levels of complexity from those which provide simple frequency counts to those which present relational analysis.

Surveys may be further differentiated in terms of their scope. A study of contemporary developments in post-secondary education, for example, might encompass the whole of Western Europe; a study of subject choice, on the other hand, might be confined to one secondary school. The complexity and scope of surveys in education can be illustrated by reference to familiar examples. The surveys undertaken for the Plowden Committee on primary school children (Central Advisory Council on Education, 1967) collected a wealth of information on children, teachers and parents, and used sophisticated analytical techniques to predict pupil attainment. By contrast, the small-scale survey of Jackson and Marsden (1962) involved a detailed study of the backgrounds and values of 88 working-class adults who had achieved success through selective secondary education.


Box 8.1 Stages in the planning of a survey

Source Adapted from Davidson, 1970


Whether the survey is large scale and undertaken by some governmental bureau or small scale and carried out by the lone researcher, the collection of information typically involves one or more of the following data-gathering techniques: structured or semi-structured interviews, self-completion or postal questionnaires, standardized tests of attainment or performance, and attitude scales. Typically, too, surveys proceed through well-defined stages, though not every stage outlined in Box 8.1 is required for the successful completion of a survey.

A survey has several characteristics and several claimed attractions; typically it is used to scan a wide field of issues, populations, programmes etc. in order to measure or describe any generalized features. It is useful (Morrison, 1993:38–40) in that it usually:

• gathers data on a one-shot basis and hence is economical and efficient;
• represents a wide target population (hence there is a need for careful sampling, see Chapter 4);
• generates numerical data;
• provides descriptive, inferential and explanatory information;
• manipulates key factors and variables to derive frequencies (e.g. the numbers registering a particular opinion or test score);
• gathers standardized information (i.e. using the same instruments and questions for all participants);
• ascertains correlations (e.g. to find out if there is any relationship between gender and scores);
• presents material which is uncluttered by specific contextual factors;
• captures data from multiple choice, closed questions, test scores or observation schedules;
• supports or refutes hypotheses about the target population;
• generates accurate instruments through their piloting and revision;
• makes generalizations about, and observes patterns of response in, the targets of focus;
• gathers data which can be processed statistically;
• usually relies on large scale data gathering from a wide population in order to enable generalizations to be made about given factors or variables.

Examples of surveys1 are:

• opinion polls (which refute the notion that only opinion polls can catch opinions);
• test scores (e.g. the results of testing students nationally or locally);
• students' preferences for particular courses, e.g. humanities, sciences;
• reading surveys (e.g. Southgate et al.'s example of teaching practices in 1981 in the United Kingdom).

A researcher using these types of survey typically will be seeking to gather large scale data from as representative a sample population as possible in order to say with a measure of statistical confidence that certain observed characteristics occur with a degree of regularity, or that certain factors cluster together (see Chapter 20), or that they correlate with each other (correlation and covariance), or that they change over time and location (e.g. results of test scores used to ascertain the 'value-added' dimension of education, maybe using regression analysis and analysis of residuals to determine the difference between a predicted and an observed score), or that, through regression analysis, data from one variable can be used to predict an outcome on another variable.
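To make the 'value-added' idea concrete, a minimal sketch follows. The prior-attainment and outcome scores below are invented for illustration and come from no real survey; the sketch fits an ordinary least-squares regression and prints each residual (the difference between a predicted and an observed score), which is one simple way of expressing a value-added measure.

```python
# A minimal sketch of value-added analysis via regression residuals.
# All scores below are invented for illustration only.

def least_squares(x, y):
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    b = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) \
        / sum((xi - mean_x) ** 2 for xi in x)
    a = mean_y - b * mean_x
    return a, b

prior = [45, 52, 60, 48, 70, 55]   # hypothetical prior-attainment scores
final = [50, 58, 61, 57, 75, 54]   # hypothetical outcome scores

a, b = least_squares(prior, final)
for p, o in zip(prior, final):
    predicted = a + b * p
    residual = o - predicted        # positive residual: did better than predicted
    print(f"prior={p:2d} observed={o:2d} predicted={predicted:5.1f} residual={residual:+5.1f}")
```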

The attractions of a survey lie in its appeal to generalizability or universality within given parameters, its ability to make statements which are supported by large data banks and its ability to establish the degree of confidence which can be placed in a set of findings.

On the other hand, if a researcher is concerned to catch local, institutional or small scale factors and variables—to portray the specificity of a situation, its uniqueness and particular complexity, its interpersonal dynamics, and to provide explanations of why a situation occurred or why a person or group of people returned a particular set of results or behaved in a particular way in a situation, or how a programme changes and develops over time, then a survey approach is probably unsuitable. Its degree of explanatory potential or fine detail is limited; it is lost to broad brush generalizations which are free of temporal, spatial or local contexts, i.e. its appeal largely rests on the basis of positivism. The individual instance is sacrificed to the aggregated response (which has the attraction of anonymity, non-traceability and confidentiality for respondents).

Surveys typically rely on large scale data, e.g. from questionnaires, test scores, attendance rates, results of public examinations etc., all of which would enable comparisons to be made over time or between groups. This is not to say that surveys cannot be undertaken on a small scale basis, as indeed they can; rather, it is to say that the generalizability of such small scale data will be slight. In surveys the researcher is usually very clearly an outsider; indeed questions of reliability must attach themselves to researchers conducting survey research on their own subjects, e.g. participants in a course that they have been running.2 Further, it is critical that attention is paid to rigorous sampling, otherwise the basis of its applicability to wider contexts is seriously undermined. Non-probability samples tend to be avoided in surveys if generalizability is sought; probability sampling will tend to lead to generalizability of the data collected.

Some preliminary considerations

Three prerequisites to the design of any survey are: the specification of the exact purpose of the inquiry; the population on which it is to focus; and the resources that are available. Hoinville and Jowell's (1978) consideration of each of these key factors in survey planning can be illustrated in relation to the design of an educational inquiry.

The purpose of the inquiry

First, a survey's general purpose must be translated into a specific central aim. Thus, 'to explore teachers' views about in-service work' is somewhat nebulous, whereas 'to obtain a detailed description of primary and secondary teachers' priorities in the provision of in-service education courses' is reasonably specific.

Having decided upon and specified the primary objective of the survey, the second phase of the planning involves the identification and itemizing of subsidiary topics that relate to its central purpose. In our example, subsidiary issues might well include: the types of courses required; the content of courses; the location of courses; the timing of courses; the design of courses; and the financing of courses.

The third phase follows the identification and itemization of subsidiary topics and involves formulating specific information requirements relating to each of these issues. For example, with respect to the type of courses required, detailed information would be needed about the duration of courses (one meeting, several meetings, a week, a month, a term or a year), the status of courses (non-award bearing, award bearing, with certificate, diploma, degree granted by college or university), the orientation of courses (theoretically oriented involving lectures, readings, etc., or practically oriented involving workshops and the production of curriculum materials).

As these details unfold, note Hoinville and Jowell, consideration would have to be given to the most appropriate ways of collecting items of information (interviews with selected teachers, postal questionnaires to selected schools, etc.).

The population upon which the survey is focused

The second prerequisite to survey design, the specification of the population to which the inquiry is addressed, affects decisions that researchers must make both about sampling and resources. In our hypothetical survey of in-service requirements, for example, we might specify the population as 'those primary and secondary teachers employed in schools within a 30-mile radius of Loughborough University'. In this case, the population is readily identifiable and, given sufficient resources to contact every member of the designated group, sampling decisions do not arise. Things are rarely so straightforward, however. Often the criteria by which populations are specified ('severely challenged', 'under-achievers', 'intending teachers' or 'highly anxious') are difficult to operationalize. Populations, moreover, vary considerably in their accessibility; pupils and student teachers are relatively easy to survey, gypsy children and headteachers are more elusive. More importantly, in a large survey researchers usually draw a sample from the population to be studied; rarely do they attempt to contact every member. We deal with the question of sampling shortly.

The resources available

The third important factor in designing and planning a survey is the financial cost. Sample surveys are labour-intensive (see Davidson, 1970), the largest single expenditure being the fieldwork, where costs arise out of the interviewing time, travel time and transport claims of the interviewers themselves. There are additional demands on the survey budget. Training and supervising the panel of interviewers can often be as expensive as the costs incurred during the time that they actually spend in the field. Questionnaire construction, piloting, printing, posting, coding, together with computer programming—all eat into financial resources.

Proposals from intending education researchers seeking governmental or private funding are often weakest in the amount of time and thought devoted to a detailed planning of the financial implications of the projected inquiries. (In this chapter we confine ourselves from this point to a discussion of surveys based on self-completion questionnaires. A full account of the interview as a research technique is given in Chapter 15.)

From here it is possible to identify several stages in the conduct of a survey. Rosier (1997:154–62) suggests that the planning of a survey will need to include clarification of:

• the research questions to which answers need to be provided;
• the conceptual framework of the survey, specifying in precise terms the concepts that will be used and explored;
• operationalizing the research questions (e.g. into hypotheses);
• the instruments to be used for data collection, e.g.: to chart or measure background characteristics of the sample (often nominal data), academic achievements (e.g. examination results, degrees awarded), attitudes and opinions (often using ordinal data from rating scales) and behaviour (using observational techniques);
• sampling strategies and subgroups within the sample (unless the whole population is being surveyed, e.g. through census returns or nationally aggregated test scores etc.);
• pre-piloting the survey;
• piloting the survey;
• data collection practicalities and conduct (e.g. permissions, funding, ethical considerations, response rates);
• data preparation (e.g. coding, data entry for computer analysis, checking and verification);
• data analysis (e.g. statistical processes, construction of variables and factor analysis, inferential statistics);
• reporting the findings (answering the research questions).

It is important to pilot and pre-pilot a survey. The difference between the pre-pilot and the pilot is significant. Whereas the pre-pilot is usually a series of open-ended questions that are used to generate categories for closed, typically multiple choice questions, the pilot is used to test the actual survey instrument itself (see Chapter 14).

A rigorous survey, then, formulates clear, specific objectives and research questions, ensures that the instrumentation, sampling and data types are appropriate to yield answers to the research questions, and ensures that as high a level of sophistication of data analysis is undertaken as the data will sustain (but no more!).


Survey sampling

Because questions to do with sampling arise directly from the second of our preliminary considerations, that is, defining the population upon which the survey is to focus, researchers must take sampling decisions early in the overall planning of a survey (see Box 8.1). We have already seen that due to factors of expense, time and accessibility, it is not always possible or practical to obtain measures from a population. Researchers endeavour therefore to collect information from a smaller group or subset of the population in such a way that the knowledge gained is representative of the total population under study. This smaller group or subset is a 'sample'. Notice how competent researchers start with the total population and work down to the sample. By contrast, novices work from the bottom up, that is, they determine the minimum number of respondents needed to conduct a successful survey. However, unless they identify the total population in advance, it is virtually impossible for them to assess how representative the sample is that they have drawn. There are two methods of sampling. One yields probability samples in which, as the term implies, the probability of selection of each respondent is known. The other yields non-probability samples, in which the probability of selection is unknown. We refer the reader to Chapter 4 for a discussion of sampling.
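As a minimal illustration of the distinction, the sketch below draws a simple random sample, one kind of probability sample in which every member of the sampling frame has a known, equal chance of selection, and contrasts it with a convenience (non-probability) sample of whoever happens to come first in the frame. The frame of teacher identifiers is hypothetical.

```python
import random

# A hypothetical sampling frame: the full, enumerated target population.
frame = [f"teacher_{i:03d}" for i in range(1, 501)]  # 500 teachers

n = 50

# Probability sample: simple random sampling without replacement.
# Every member has a known, equal selection probability of n/len(frame).
probability_sample = random.sample(frame, n)

# Non-probability (convenience) sample: the first n in the frame,
# whose selection probabilities are unknown in any real setting.
convenience_sample = frame[:n]

print(f"Selection probability per teacher: {n / len(frame):.2f}")
print(probability_sample[:5])
```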

Longitudinal, cross-sectional and trend studies

The term 'longitudinal' is used to describe a variety of studies that are conducted over a period of time. Often, as we have seen, the word 'developmental' is employed in connection with longitudinal studies that deal specifically with aspects of human growth.

A clear distinction is drawn between longitudinal and cross-sectional studies.3 The longitudinal study gathers data over an extended period of time; a short-term investigation may take several weeks or months; a long-term study can extend over many years. Where successive measures are taken at different points in time from the same respondents, the term 'follow-up study' or 'cohort study' is used in the British literature, the equivalent term in the United States being the 'panel study'. A 'cohort' is a group of people with some common characteristic. A cohort study is sometimes differentiated from a panel study. In a cohort study a specific population is tracked over a specific period of time, but selective sampling within that population occurs (Borg and Gall, 1979:291). This means that some members of a cohort may not be included each time. By contrast, in a panel study the same individuals are tracked over time.

Where different respondents are studied at different points in time, the study is called 'cross-sectional'. Where a few selected factors are studied continuously over time, the term 'trend study' is employed. One example of regular or repeated cross-sectional social surveys is the General Household Survey, in which the same questions are asked every year though they are put to a different sample of the population each time. A well known example of a longitudinal (cohort) study is the National Child Development Study, which started in 1958; the most recent round of interviews took place in 1991. The British Household Panel Survey has interviewed individuals from a representative sample each year in the 1990s.

Cohort studies and trend studies are prospective longitudinal methods in that they are ongoing in their collection of information about individuals or their monitoring of specific events. Retrospective longitudinal studies, on the other hand, focus upon individuals who have reached some defined end-point or state. For example, a group of young people may be the researcher's particular interest (intending social workers, convicted drug offenders or university dropouts, for example), and the questions to which she will address herself are likely to include ones such as: 'Is there anything about the previous experience of these individuals that can account for their present situation?'

Retrospective analysis is not confined to longitudinal studies alone. For example, Rose and Sullivan (1993:185) suggest that cross-sectional studies can use retrospective factual questions, e.g. previous occupations, dates of birth within the family, dates of marriage, divorce, though the authors advise against collecting other types of retrospective data in cross-sectional studies, as the quality of the data diminishes the further back one asks respondents to recall previous states or even facts.

A cross-sectional study is one that produces a 'snapshot' of a population at a particular point in time. The epitome of the cross-sectional study is a national census in which a representative sample of the population, consisting of individuals of different ages, different occupations, different educational and income levels, and residing in different parts of the country, is interviewed on the same day. More typically in education, cross-sectional studies involve indirect measures of the nature and rate of changes in the physical and intellectual development of samples of children drawn from representative age levels. The single 'snapshot' of the cross-sectional study provides researchers with data for either a retrospective or a prospective inquiry.

Trend or prediction studies have an obvious importance to educational administrators or planners. Like cohort studies, they may be of relatively short or long duration. Essentially, the trend study examines recorded data to establish patterns of change that have already occurred in order to predict what will be likely to occur in the future. In trend studies two or more cross-sectional studies are undertaken with identical age groups at more than one point in time in order to make comparisons over time (e.g. the Scholastic Aptitude and Achievement tests in the United States (Keeves, 1997:141) and the National Assessment of Educational Progress results (Lietz and Keeves, 1997:122)). A major difficulty researchers face in conducting trend analyses is the intrusion of unpredictable factors that invalidate forecasts formulated on past data. For this reason, short-term trend studies tend to be more accurate than long-term analyses. The distinctions we have drawn between the various terms used in developmental research are illustrated in Box 8.2.
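As a minimal sketch of the logic of trend extrapolation, and of why unpredictable factors make long-range forecasts fragile, the following fits a straight line to a short series of invented annual mean test scores and projects it forward. A real trend study would also examine the adequacy of the fit and the plausibility of the assumption that past change simply continues.

```python
# A minimal sketch of trend extrapolation: fit a line to past annual
# mean test scores (invented figures) and project it forward.

years  = [1994, 1995, 1996, 1997, 1998]
scores = [48.2, 49.1, 49.8, 50.9, 51.5]   # hypothetical annual means

n = len(years)
mx = sum(years) / n
my = sum(scores) / n
slope = sum((x - mx) * (y - my) for x, y in zip(years, scores)) \
        / sum((x - mx) ** 2 for x in years)
intercept = my - slope * mx

for future in (1999, 2000):
    # The projection assumes past change continues unchanged,
    # exactly the assumption that unpredictable factors can invalidate.
    print(f"{future}: predicted mean score {intercept + slope * future:.1f}")
```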

Box 8.2 Types of developmental research


Strengths and weaknesses of cohort and cross-sectional studies

Longitudinal studies of the cohort analysis type have an important place in the research armoury of educational investigators. Cohort studies of human growth and development conducted on representative samples of populations are uniquely able to identify typical patterns of development and to reveal factors operating on those samples which elude other research designs. They permit researchers to examine individual variations in characteristics or traits, and to produce individual growth curves. Cohort studies, too, are particularly appropriate when investigators attempt to establish causal relationships, for this task involves identifying changes in certain characteristics that result in changes in others. Cross-sectional designs are inappropriate in causal research. Cohort analysis is especially useful in sociological research because it can show how changing properties of individuals fit together into changing properties of social systems as a whole. For example, the study of staff morale and its association with the emerging organizational climate of a newly opened school would lend itself to this type of developmental research. A further strength of cohort studies in schools is that they provide longitudinal records whose value derives in part from the known fallibility of any single test or assessment (see Davie, 1972). Finally, time, always a limiting factor in experimental and interview settings, is generally more readily available in cohort studies, allowing the researcher greater opportunity to observe trends and to distinguish 'real' changes from chance occurrences (see Bailey, 1978).

Longitudinal studies suffer several disadvantages (though the gravity of these weaknesses is challenged by supporters of cohort analysis). The disadvantages are, first, that they are time-consuming and expensive, because the researcher is obliged to wait for growth data to accumulate. Second, there is the difficulty of sample mortality. Inevitably during the course of a long-term cohort study, subjects drop out, are lost or refuse further co-operation. Such attrition makes it unlikely that those who remain in the study are as representative of the population as the sample that was originally drawn. Sometimes attempts are made to lessen the effects of sample mortality by introducing aspects of cross-sectional study design, that is, 'topping up' the original cohort sample size at each time of retesting with the same number of respondents drawn from the same population. The problem here is that differences arising in the data from one survey to the next may then be accounted for by differences in the persons surveyed rather than by genuine changes or trends. A third difficulty has been termed 'control effect' (sometimes referred to as 'measurement effect'). Often, repeated interviewing results in an undesired and confusing effect on the actions or attitudes under study, influencing the behaviour of subjects, sensitizing them to matters that have hitherto passed unnoticed, or stimulating them to communication with others on unwanted topics (see Riley, 1963). Fourth, cohort studies can suffer from the interaction of biological, environmental and intervention influences. Finally, cohort studies in education pose considerable problems of organization due to the continuous changes that occur in pupils, staff, teaching methods and the like. Such changes make it highly unlikely that a study will be completed in the way that it was originally planned.

Cohort studies, as we have seen, are particularly appropriate in research on human growth and development. Why then are so many studies in this area cross-sectional in design? The reason is that they have a number of advantages over cohort studies: they are less expensive; they produce findings more quickly; they are less likely to suffer from control effects; and they are more likely to secure the co-operation of respondents on a 'one-off' basis. Generally, cross-sectional designs are able to include more subjects than are cohort designs.

The strengths of cohort analysis are the weaknesses of the cross-sectional design. The cross-sectional study is a less effective method for the researcher who is concerned to identify individual variations in growth or to establish causal relationships between variables. Sampling in the cross-sectional study is complicated because different subjects are involved at each age level and may not be comparable. Further problems arising out of selection effects and the obscuring of irregularities in growth weaken the cross-sectional study so much that one observer dismisses the method as a highly unsatisfactory way of obtaining developmental data except for the crudest purposes. Douglas (1976),4 who pioneered the first national cohort study to be undertaken in any country, makes a spirited defence of the method against the common criticisms that are levelled against it—that it is expensive and time-consuming. His account of the advantages of cohort analysis over cross-sectional designs is summarized in Box 8.3.

The comparative strengths and weaknesses of longitudinal studies (including retrospective studies), cross-sectional analysis and trend studies are summarized in Box 8.4 (see also Rose and Sullivan (1993:184–8)). Several of the strengths and weaknesses of retrospective longitudinal studies share the same characteristics as those of ex post facto research, discussed in Chapter 11.

Event history analysis

Recent developments in longitudinal studies include the use of 'event history analysis' (e.g. von Eye, 1990; Rose and Sullivan, 1993:189–90; Plewis, 1997). This is a set of statistical techniques whose key concepts include: a risk set (a set of participants who have yet to experience a particular event or situation); a survivor function or survivor curve (the decline in the size of the risk set over time); and the hazard or hazard rate (the rate at which particular events occur, or the risk of a particular event occurring at a particular time). Event-history analysis suggests that it is possible to consider the dependent variable (e.g. marriage, employment changes, redundancy, further and higher education, moving house, death) as predictable within certain time frames for individuals. The rationale for this derives from life-table analysis, used by demographers to calculate survival and mortality rates in a given population over time. For example, if x members of the population are alive at time t, then it may be possible to predict the survival rate of that population at time t+1. In a sense it is akin to a prediction study. Life-table studies are straightforward in that they are concerned with specific, non-repeatable events (e.g. death); in this case the calculation of life expectancy does not rely on distinguishing various causes of death (Rose and Sullivan, 1993:189).

Box 8.3 Advantages of cohort over cross-sectional designs

1 Some types of information, for example, on attitudes or assessment of potential ability, are only meaningful if collected contemporaneously. Other types are more complete or more accurate if collected during the course of a longitudinal survey, though they are likely to have some value even if collected retrospectively, for example, length of schooling, job history, geographical movement.

2 In cohort studies, no duplication of information occurs, whereas in cross-sectional studies the same type of background information has to be collected on each occasion. This increases the interviewing costs.

3 The omission of even a single variable, later found to be important, from a cross-sectional study is a disaster, whereas it is usually possible in a cohort study to fill the gap, even if only partially, in a subsequent interview.

4 A cohort study allows the accumulation of a much larger number of variables, extending over a much wider area of knowledge than would be possible in a cross-sectional study. This is of course because the collection can be spread over many interviews. Moreover, information may be obtained at the most appropriate time, for example, information on job entry may be obtained when it occurs even if this varies from one member of the sample to another.

5 Starting with a birth cohort removes later problems of sampling and allows the extensive use of subsamples. It also eases problems of estimating bias and reliability.

6 Longitudinal studies are free of one of the major obstacles to causal analysis, namely, the re-interpretation of remembered information so that it conforms with conventional views on causation. They also provide the means to assess the direction of effect.

Source Adapted from Douglas, 1976


Box 8.4 The characteristics, strengths and weaknesses of longitudinal, cross-sectional, trend analysis, and retrospective longitudinal studies

Longitudinal studies (cohort/panel studies)

Features:
1 Single sample over extended period of time.
2 Enables the same individuals to be compared over time (diachronic analysis).
3 Micro-level analysis.

Strengths:
1 Useful for establishing causal relationships and for making reliable inferences.
2 Shows how changing properties of individuals fit into systemic change.
3 Operates within the known limits of instrumentation employed.
4 Separates real trends from chance occurrence.
5 Brings the benefits of extended time frames.
6 Useful for charting growth and development.
7 Gathers data contemporaneously rather than retrospectively, thereby avoiding the problems of selective or false memory.
8 Economical in that a picture of the sample is built up over time.
9 In-depth and comprehensive coverage of a wide range of variables, both initial and emergent—individual specific effects and population heterogeneity.
10 Enables change to be analysed at the individual/micro level.
11 Enables the dynamics of change to be caught, the flows into and out of particular states and the transitions between states.
12 Individual level data are more accurate than macro-level, cross-sectional data.
13 Sampling error reduced as the study remains with the same sample over time.
14 Enables clear recommendations for intervention to be made.

Weaknesses:
1 Time-consuming—it takes a long time for the studies to be conducted and the results to emerge.
2 Problems of sample mortality heighten over time and diminish initial representativeness.
3 Control effects—repeated interviewing of the same sample influences their behaviour.
4 Intervening effects attenuate the initial research plan.
5 Problem of securing participation as it involves repeated contact.
6 Data, being rich at an individual level, are typically complex to analyse.

Cross-sectional studies

Features:
1 Snapshot of different samples at one or more points in time (synchronic analysis).
2 Large-scale and representative sampling.
3 Macro-level analysis.
4 Enables different groups to be compared.
5 Can be retrospective and/or prospective.

Strengths:
1 Comparatively quick to conduct.
2 Comparatively cheap to administer.
3 Limited control effects as subjects only participate once.
4 Stronger likelihood of participation as it is for a single time.
5 Charts aggregated patterns.
6 Useful for charting population-wide features at one or more single points in time.
7 Enables researchers to identify the proportions of people in particular groups or states.
8 Large samples enable inferential statistics to be used, e.g. to compare subgroups within the sample.

Weaknesses:
1 Do not permit analysis of causal relationships.
2 Unable to chart individual variations in development or changes, and their significance.
3 Sampling not entirely comparable at each round of data collection as different samples are used.
4 Can be time-consuming as background details of each sample have to be collected each time.
5 Omission of a single variable can undermine the results significantly.
6 Unable to chart changing social processes over time.
7 They only permit analysis of overall, net change at the macro-level through aggregated data.

Trend analysis

Features:
1 Selected factors studied continuously over time.
2 Uses recorded data to predict future trends.

Strengths:
1 Maintains clarity of focus throughout the duration of the study.
2 Enables prediction and projection on the basis of identified and monitored variables and assumptions.

Weaknesses:
1 Neglects influence of unpredicted factors.
2 Past trends are not always a good predictor of future trends.
3 Formula-driven, i.e. could be too conservative or initial assumptions might be erroneous.
4 Neglects the implications of chaos and complexity theory, e.g. that long-range forecasting is dangerous.
5 The criteria for prediction may be imprecise.


However, in event-history analysis the parameters become much more complex as multiple factors come into the equation, requiring some form of multivariate analysis to be undertaken.

In event-history analysis the task is to calculate the 'hazard rate'—the probability of a dependent variable occurring to an individual within a specified time frame. The approach is mathematical, using log-linear analysis to compute the relative size of each of several factors (independent variables), e.g. by calculating coefficients in cross-tabulations, that will have an effect on the hazard rate, the likelihood of an event occurring to an individual within a specific time frame (Rose and Sullivan, 1993:190).5
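To make the survivor function and hazard rate concrete, a minimal sketch follows. It works from a small, invented life-table-style series of risk-set counts (how many members of a hypothetical cohort have yet to experience the event at the start of each year); the survivor function is the proportion still at risk, and the hazard rate for a year is the proportion of those at risk at its start who experience the event during it. This is a simplified discrete-time illustration, not the log-linear modelling described above.

```python
# A simplified discrete-time illustration of survivor and hazard functions.
# at_risk[t] = invented count of people who have not yet experienced the
# event (e.g. leaving teaching) at the start of year t.

at_risk = [1000, 940, 850, 790, 760]   # hypothetical risk-set sizes, years 0-4

initial = at_risk[0]
for t in range(len(at_risk) - 1):
    events = at_risk[t] - at_risk[t + 1]      # events occurring during year t
    survivor = at_risk[t + 1] / initial       # survivor function after year t
    hazard = events / at_risk[t]              # hazard rate: risk of the event
                                              # in year t, given survival so far
    print(f"year {t}: hazard {hazard:.3f}, survivor function {survivor:.3f}")
```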

Event-history analysis also addresses the problem of attrition, as members leave a study over time. Plewis (1997:117) suggests that many longitudinal studies suffer from sample loss over time, and attempts to address the issue of censoring—the adjustments necessary in a study in order to take account of the accretion of missing data. Right censoring occurs when we know when a particular event commences but not when it finishes; left censoring occurs when we know of the existence of a particular event or situation, but not when it began. Plewis (ibid.: 118) suggests that censored events and episodes (where attrition has taken place) last longer than uncensored events and episodes, and, hence, hazard rates that are based on uncensored observations will usually be too high. Event-history analysis is a valuable, and increasingly used, technique for research.
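Plewis's point about censoring can be illustrated with another small sketch, again using invented figures. If right-censored episodes (those still in progress when observation stopped) are simply ignored, the estimated event rate is computed from the shorter, completed episodes only and comes out too high; counting the censored episodes' exposure time, without counting them as events, lowers the estimate. This is a crude person-time calculation rather than a full event-history model.

```python
# Invented episodes: (observed duration in years, right_censored?).
# A right-censored episode was still in progress when the study ended.
episodes = [(2, False), (3, False), (4, False), (5, True), (6, True), (8, True)]

completed = [d for d, censored in episodes if not censored]

# Naive rate: treat only uncensored episodes as the whole picture.
naive_rate = len(completed) / sum(completed)

# Adjusted rate: events divided by total exposure time, censored included.
adjusted_rate = len(completed) / sum(d for d, _ in episodes)

print(f"naive rate:    {naive_rate:.3f} events per person-year")
print(f"adjusted rate: {adjusted_rate:.3f} events per person-year (lower)")
```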

Box 8.4 continued

Retrospective longitudinal studies

Features:
1 Retrospective analysis of the history of a sample.
2 Individual- and micro-level data.

Strengths:
1 Useful for establishing causal relationships.
2 Clear focus (e.g. how did this particular end state or set of circumstances come to be?).
3 Enables data to be assembled that are not susceptible to experimental analysis.

Weaknesses:
1 Remembered information might be faulty, selective and inaccurate.
2 People might forget, suppress or fail to remember certain factors.
3 Individuals might interpret their own past behaviour in the light of subsequent events, i.e. the interpretations are not contemporaneous with the actual events.
4 The roots and causes of the end state may be multiple, diverse, complex, unidentified and unstraightforward to unravel.
5 Simple causality is unlikely.
6 A cause may be an effect and vice versa.
7 It is difficult to separate real from perceived or putative causes.
8 It is seldom easily falsifiable or confirmable.

9 Case studies

Introduction

How can knowledge of the ways in which children learn and the means by which schools achieve their goals be verified, built upon and extended? This is a central question for educational research. The problem of verification and cumulation of educational knowledge is implicit in our discussion of the nature of educational inquiry in the opening chapter of the book. There, we outline three broad approaches to educational research. The first, based on the 'scientific' paradigm, rests upon the creation of theoretical frameworks that can be tested by experimentation, replication and refinement. The second approach seeks to understand and interpret the world in terms of its actors and consequently may be described as interpretive and subjective. A third, emerging, approach that takes account of the political and ideological contexts of much educational research is that of critical educational research.

The paradigm most naturally suited to case study research, the subject of this chapter, is the second one, with its emphasis on the interpretive and subjective dimensions. The first paradigm, the 'scientific', is reflected in our examples of quantitative case study research. The use of critical theory in case study research is at a comparatively embryonic stage but offers rich potential. Our broad treatment of case study techniques follows directly from a typology of observation studies that we develop shortly. We begin with a brief description of the case study itself.

What is a case study?

A case study is a specific instance that is frequently designed to illustrate a more general principle (Nisbet and Watt, 1984:72); it is 'the study of an instance in action' (Adelman et al., 1980). The single instance is of a bounded system, for example a child, a clique, a class, a school, a community. It provides a unique example of real people in real situations, enabling readers to understand ideas more clearly than simply by presenting them with abstract theories or principles. Indeed a case study can enable readers to understand how ideas and abstract principles can fit together (ibid.: 72–3). Case studies can penetrate situations in ways that are not always susceptible to numerical analysis.

Case studies can establish cause and effect; indeed one of their strengths is that they observe effects in real contexts, recognizing that context is a powerful determinant of both causes and effects. As Nisbet and Watt remark (p. 78), the whole is more than the sum of its parts. Sturman (1999:103) argues that a distinguishing feature of case studies is that human systems have a wholeness or integrity to them rather than being a loose connection of traits, necessitating in-depth investigation. Further, contexts are unique and dynamic, hence case studies investigate and report the complex, dynamic and unfolding interactions of events, human relationships and other factors in a unique instance. Hitchcock and Hughes (1995:316) suggest that case studies are distinguished less by the methodologies that they employ than by the subjects/objects of their inquiry (though, as indicated below, there is frequently a resonance between case studies and interpretive methodologies). Hitchcock and Hughes (1995:322) further suggest that the case study approach is particularly valuable when the researcher has little control over events. They consider (p. 317) that a case study has several hallmarks:

• It is concerned with a rich and vivid description of events relevant to the case.
• It provides a chronological narrative of events relevant to the case.
• It blends a description of events with the analysis of them.
• It focuses on individual actors or groups of actors, and seeks to understand their perceptions of events.
• It highlights specific events that are relevant to the case.
• The researcher is integrally involved in the case.
• An attempt is made to portray the richness of the case in writing up the report.

Case studies, they suggest (ibid.: 319): (a) are set in temporal, geographical, organizational, institutional and other contexts that enable boundaries to be drawn around the case; (b) can be defined with reference to characteristics defined by individuals and groups involved; and (c) can be defined by participants' roles and functions in the case. They also point out that case studies:

• will have temporal characteristics which help to define their nature;
• will have geographical parameters allowing for their definition;
• will have boundaries which allow for definition;
• may be defined by an individual in a particular context, at a point in time;
• may be defined by the characteristics of the group;
• may be defined by role or function;
• may be shaped by organizational or institutional arrangements.

Case studies strive to portray 'what it is like' to be in a particular situation, to catch the close-up reality and 'thick description' (Geertz, 1973) of participants' lived experiences of, thoughts about and feelings for, a situation. Hence it is important for events and situations to be allowed to speak for themselves rather than to be largely interpreted, evaluated or judged by the researcher. In this respect the case study is akin to the television documentary.

This is not to say that case studies are unsystematic or merely illustrative; case study data are gathered systematically and rigorously. Indeed Nisbet and Watt (ibid.: 91) specifically counsel case study researchers to avoid:

• journalism (picking out more striking features of the case, thereby distorting the full account in order to emphasize these more sensational aspects);
• selective reporting (selecting only that evidence which will support a particular conclusion, thereby misrepresenting the whole case);
• an anecdotal style (degenerating into an endless series of low-level, banal and tedious illustrations that take over from in-depth, rigorous analysis); one is reminded of Stake's (1978) wry comment that 'our scrapbooks are full of enlargements of enlargements', alluding to the tendency of some case studies to over-emphasize detail to the detriment of seeing the whole picture;
• pomposity (striving to derive or generate profound theories from low-level data, or by wrapping up accounts in high-sounding verbiage);
• blandness (unquestioningly accepting only the respondents' views, or only including those aspects of the case study on which people agree rather than areas on which they might disagree).

Case studies can make theoretical statements, but, like other forms of research and human sciences, these must be supported by the evidence presented. This requires the nature of generalization in case study to be clarified. Generalization can take various forms, for example:

• from the single instance to the class of instances that it represents (for example, a single-sex selective school might act as a case study to catch significant features of other single-sex selective schools);
• from features of the single case to a multiplicity of classes with the same features;
• from the single features of part of the case to the whole of that case.

More recently Simons (1996) has argued that case study needs to address six paradoxes; it needs to:

• reject the subject–object dichotomy, regarding all participants equally;
• recognize the contribution that a genuine creative encounter can make to new forms of understanding education;
• regard different ways of seeing as new ways of knowing;
• approximate the ways of the artist;
• free the mind of traditional analysis;
• embrace these paradoxes, with an overriding interest in people.

There are several types of case study. Yin (1984) identifies three such types in terms of their outcomes: (a) exploratory (as a pilot to other studies or research questions); (b) descriptive (providing narrative accounts); (c) explanatory (testing theories). Exploratory case studies that act as a pilot can be used to generate hypotheses that are tested in larger scale surveys, experiments or other forms of research, e.g. observational. However, Adelman et al. (1980) caution against using case studies solely as preliminaries to other studies, e.g. as pre-experimental or pre-survey; rather, they argue, case studies exist in their own right as a significant and legitimate research method.

Yin’s (1984) classification accords withMerriam (1988) who identifies three types: (a)descriptive (narrative accounts); (b) interpreta-tive (developing conceptual categories induc-tively in order to examine initial assumptions);(c) evaluative (explaining and judging). Merriamalso categorizes four common domains or kindsof case study: ethnographic, historical,

psychological and sociological. Sturman(1999:107), echoing Stenhouse (1985), identi-fies four kinds of case study: (a) an ethnographiccase study—single in-depth study; (b) action re-search case study; (c) evaluative case study; and(d) educational case study. Stake (1994) identi-fies three main types of case study: (a) intrinsiccase studies (studies that are undertaken in or-der to understand the particular case in ques-tion); (b) instrumental case studies (examininga particular case in order to gain insight into anissue or a theory); (c) collective case studies(groups of individual studies that are undertakento gain a fuller picture). Because case studiesprovide fine grain detail they can also be usedto complement other, more coarsely grained—often large scale—kinds of research. Case studymaterial in this sense can provide powerful hu-man-scale data on macro-political decision-mak-ing, fusing theory and practice, for example thework of Ball (1990), Bowe et al. (1992) and Ball(1994a) on the impact of government policy onspecific schools.

Case studies have several claimed strengths and weaknesses. These are summarized in Box 9.1 (Adelman et al., 1980) and Box 9.2 (Nisbet and Watt, 1984). From the preceding analysis it is becoming clear that case studies frequently follow the interpretive tradition of research—seeing the situation through the eyes of participants—rather than the quantitative paradigm, though this need not always be the case. Its sympathy to the interpretive paradigm has rendered case study an object of criticism, treating peculiarities rather than regularities (Smith, 1991:375). Smith (1991:375) suggests that:

The case study method…is the logically weakest method of knowing. The study of individual careers, communities, nations, and so on has become essentially passé. Recurrent patterns are the main product of the enterprise of historic scholarship.

This is prejudice and ideology rather than critique, but it signifies the problem of respectability and legitimacy that case study has to conquer amongst certain academics. Like other research methods, case study has to demonstrate reliability and validity. This can be difficult, for given the uniqueness of situations, they may be, by definition, inconsistent with other case studies or unable to demonstrate this positivist view of reliability.

Box 9.1 Possible advantages of case study

Case studies have a number of advantages that make them attractive to educational evaluators or researchers. Thus:

1 Case study data, paradoxically, are 'strong in reality' but difficult to organize. In contrast, other research data are often 'weak in reality' but susceptible to ready organization. This strength in reality is because case studies are down-to-earth and attention-holding, in harmony with the reader's own experience, and thus provide a 'natural' basis for generalization.

2 Case studies allow generalizations either about an instance or from an instance to a class. Their peculiar strength lies in their attention to the subtlety and complexity of the case in its own right.

3 Case studies recognize the complexity and 'embeddedness' of social truths. By carefully attending to social situations, case studies can represent something of the discrepancies or conflicts between the viewpoints held by participants. The best case studies are capable of offering some support to alternative interpretations.

4 Case studies, considered as products, may form an archive of descriptive material sufficiently rich to admit subsequent reinterpretation. Given the variety and complexity of educational purposes and environments, there is an obvious value in having a data source for researchers and users whose purposes may be different from our own.

5 Case studies are 'a step to action'. They begin in a world of action and contribute to it. Their insights may be directly interpreted and put to use: for staff or individual self-development, for within-institutional feedback, for formative evaluation, and in educational policy making.

6 Case studies present research or evaluation data in a more publicly accessible form than other kinds of research report, although this virtue is to some extent bought at the expense of their length. The language and the form of the presentation are hopefully less esoteric and less dependent on specialized interpretation than conventional research reports. The case study is capable of serving multiple audiences. It reduces the dependence of the reader upon unstated implicit assumptions…and makes the research process itself accessible. Case studies, therefore, may contribute towards the 'democratization' of decision-making (and knowledge itself). At their best, they allow readers to judge the implications of a study for themselves.

Source Adapted from Adelman et al., 1980

Box 9.2 Nisbet and Watt's (1984) strengths and weaknesses of case study

Strengths:
1 The results are more easily understood by a wide audience (including non-academics) as they are frequently written in everyday, non-professional language.
2 They are immediately intelligible; they speak for themselves.
3 They catch unique features that may otherwise be lost in larger scale data (e.g. surveys); these unique features might hold the key to understanding the situation.
4 They are strong on reality.
5 They provide insights into other, similar situations and cases, thereby assisting interpretation of other similar cases.
6 They can be undertaken by a single researcher without needing a full research team.
7 They can embrace and build in unanticipated events and uncontrolled variables.

Weaknesses:
1 The results may not be generalizable except where other readers/researchers see their application.
2 They are not easily open to cross-checking, hence they may be selective, biased, personal and subjective.
3 They are prone to problems of observer bias, despite attempts made to address reflexivity.

Even though case studies do not have to demonstrate this form of reliability, nevertheless there are important questions to be faced in undertaking case studies, for example (Adelman et al., 1980; Nisbet and Watt, 1984; Hitchcock and Hughes, 1995):

What exactly is a case?
How are cases identified and selected?
What kind of case study is this (what is its purpose)?
What is reliable evidence?
What is objective evidence?
What is an appropriate selection to include from the wealth of generated data?
What is a fair and accurate account?
Under what circumstances is it fair to take an exceptional case (or a critical event—see the discussion of observation in Chapter 17)?
What kind of sampling is most appropriate?
To what extent is triangulation required and how will this be addressed?
What is the nature of the validation process in case studies?
How will the balance be struck between uniqueness and generalization?
What is the most appropriate form of writing up and reporting the case study?
What ethical issues are exposed in undertaking a case study?

A key issue in case study research is the selection of information. Though it is frequently useful to record typical, representative occurrences, the researcher need not always adhere to criteria of representativeness. It may be that infrequent, unrepresentative but critical incidents or events occur that are crucial to the understanding of the case. For example, a subject might only demonstrate a particular behaviour once, but it is so important as not to be ruled out simply because it occurred once; sometimes a single event might occur which sheds a hugely important insight into a person or situation (see the discussion of critical incidents in the chapter on observation); it can be a key to understanding a situation (Flanagan, 1949).

For example, it may be that a psychological case study might happen upon a single instance of child abuse earlier in an adult's life, but the effects of this were so profound as to constitute a turning point in understanding that adult. A child might suddenly pass a single comment that indicates complete frustration with or complete fear of a teacher, yet it is too important to overlook. Case studies, in not having to seek frequencies of occurrences, can replace quantity with quality and intensity, separating the significant few from the insignificant many instances of behaviour. Significance rather than frequency is a hallmark of case studies, offering the researcher an insight into the real dynamics of situations and people.

Types of case study

Unlike the experimenter who manipulates variables to determine their causal significance or the surveyor who asks standardized questions of large, representative samples of individuals, the case study researcher typically observes the characteristics of an individual unit—a child, a clique, a class, a school or a community. The purpose of such observation is to probe deeply and to analyse intensively the multifarious phenomena that constitute the life cycle of the unit, with a view to establishing generalizations about the wider population to which that unit belongs.

Antipathy among researchers towards the statistical-experimental paradigm has created something of a boom industry in case study research. Delinquents (Patrick, 1973), dropouts (Parker, 1974) and drug-users (Young, 1971), to say nothing of studies of all types of schools (King, 1979),1 attest to the wide use of the case study in contemporary social science and educational research. Such wide use is marked by an equally diverse range of techniques employed in the collection and analysis of both qualitative and quantitative data. Whatever the problem or the approach, at the heart of every case study lies a method of observation. Box 9.3 sets out a typology of observation studies.

Acker's (1990) study is an ethnographic account that is based on several hundred hours of participant observational material, whilst Boulton's (1992) work, by contrast, is based on highly structured, non-participant observation conducted over five years. The study by Wild, Scivier and Richardson (1992) used participant observation and loosely structured interviews that yielded simple frequency counts. Blease and Cohen's (1990) study of coping with computers used highly structured observation schedules, undertaken by non-participant observers, with the express intention of obtaining precise, quantitative data on the classroom use of a computer programme. This was part of a longitudinal study in primary classrooms, and yielded typical profiles of individual behaviour and group interaction in students' usage of the computer programme. Antonsen's (1988) study was of a single child undergoing psychotherapy at a Child Psychiatric Unit; it uses unstructured observation within the artificial setting of a psychiatric clinic and is a record of the therapist's non-directive approach. Finally, Houghton's (1991) study uses data from structured sets of test materials together with focused interviews with those with whom this international student had contact. Together these case studies provide a valuable insight into the range and types of case study.

There are two principal kinds of observation in case study—participant observation and non-participant observation. In the former, observers engage in the very activities they set out to observe. Often, their 'cover' is so complete that, as far as the other participants are concerned, they are simply one of the group. In the case of Patrick, for example, born and bred in Glasgow, his researcher role remained hidden from the members of the Glasgow gang in whose activities he participated for a period of four months (see Patrick, 1973). Such complete anonymity is not always possible, however. Thus in Parker's study of downtown Liverpool adolescents, it was generally known that the researcher was waiting to take up a post at the university. In the meantime, 'knocking around' during the day with the lads and frequenting their pub at night rapidly established that he was 'OK':

I was a drinker, a hanger-arounder, and had been tested in illegal 'business' matters and could be relied on to say nothing since I 'knew the score'.

(Parker, 1974)

Cover is not necessarily a prerequisite of participant observation. In an intensive study of a small group of working-class boys during their last two years at school and their first months in employment, Willis (1977) attended all the different subject classes at school—'not as a teacher, but as a member of the class'—and worked alongside each boy in industry for a short period.

Box 9.3 A typology of observation studies

Source Adapted from Bailey, 1978

Non-participant observers, on the other hand, stand aloof from the group activities they are investigating and eschew group membership—no great difficulty for King (1979), an adult observer in infant classrooms. Listen to him recounting how he firmly established his non-participant status with young children:

I rapidly learnt that children in infants' classrooms define any adult as another teacher or teacher surrogate. To avoid being engaged in conversation, being asked to spell words or admire pictures, I evolved the following technique.

To begin with, I kept standing so that physical height created social distance… Next, I did not show immediate interest in what the children were doing, or talk to them. When I was talked to I smiled politely and if necessary I referred the child asking a question to the teacher. Most importantly, I avoided eye contact: if you do not look you will not be seen.

(King, 1979)

The best illustration of the non-participant observer role is perhaps the case of the researcher sitting at the back of a classroom coding up every three seconds the verbal exchanges between teacher and pupils by means of a structured set of observational categories.

It is frequently the case that the type of observation undertaken by the researcher is associated with the type of setting in which the research takes place. In Box 9.3 we identify a continuum of settings ranging from the 'artificial' environments of the counsellor's and the therapist's clinics (cells 5 and 6) to the 'natural' environments of school classrooms, staffrooms and playgrounds (cells 1 and 2). Because our continuum is crude and arbitrary we are at liberty to locate studies of an information technology audit and computer usage (cells 3 and 4) somewhere between the 'artificial' and the 'natural' poles.

Although in theory each of the six examples of case studies in Box 9.3 could have been undertaken either as a participant or as a non-participant observation study, a number of factors intrude to make one or other of the observational strategies the dominant mode of inquiry

in a particular type of setting. Bailey explains as follows:

In a natural setting it is difficult for the researcher who wishes to be covert not to act as a participant. If the researcher does not participate, there is little to explain his presence, as he is very obvious to the actual participants… Most studies in a natural setting are unstructured participant observation studies… Much the opposite is true in an artificial environment. Since there is no natural setting, in a sense none of the persons being studied are really participants of long standing, and thus may accept a non-participant observer more readily… Laboratory settings also enable a non-participant observer to use sophisticated equipment such as videotape and tape recordings… Thus most studies in an artificial laboratory setting will be structured and will be non-participant studies.

(Bailey, 1978)

What we are saying is that the unstructured, ethnographic account of teachers' work (cell 1) is the most typical method of observation in the natural surroundings of the school in which that study was conducted. Similarly, the structured inventories of study habits and personality employed in the study of Mr Chong (cell 6) reflect a common approach in the artificial setting of a counsellor's office.

Why participant observation?

The natural scientist, Schutz (1962) points out, explores a field that means nothing to the molecules, atoms and electrons therein. By contrast, the subject matter of the world in which the educational researcher is interested is composed of people and is essentially meaningful to them. That world is subjectively structured, possessing particular meanings for its inhabitants. The task of the educational investigator is very often to explain the means by which an orderly social world is established and maintained in terms of its shared meanings. How do participant observation techniques assist the researcher in this task? Bailey (1978) identifies some


inherent advantages in the participant observation approach:

• Observation studies are superior to experiments and surveys when data are being collected on non-verbal behaviour.
• In observation studies, investigators are able to discern ongoing behaviour as it occurs and are able to make appropriate notes about its salient features.
• Because case study observations take place over an extended period of time, researchers can develop more intimate and informal relationships with those they are observing, generally in more natural environments than those in which experiments and surveys are conducted.
• Case study observations are less reactive than other types of data-gathering methods. For example, in laboratory-based experiments and in surveys that depend upon verbal responses to structured questions, bias can be introduced in the very data that researchers are attempting to study.

Recording observations

I filled thirty-two notebooks with about half a million words of notes made during nearly six hundred hours [of observation].

(King, 1979)

The recording of observations is a frequent source of concern to inexperienced case study researchers. How much ought to be recorded? In what form should the recordings be made? What does one do with the mass of recorded data? Lofland (1971) gives a number of useful suggestions about collecting field notes:

• Record the notes as quickly as possible after observation, since the quantity of information forgotten is very slight over a short period of time but accelerates quickly as more time passes.
• Discipline yourself to write notes quickly and reconcile yourself to the fact that, although it may seem ironic, recording field notes can be expected to take as long as is spent in actual observation.
• Dictating rather than writing is acceptable if one can afford it, but writing has the advantage of stimulating thought.
• Typing field notes is vastly preferable to handwriting because it is faster and easier to read, especially when making multiple copies.
• It is advisable to make at least two copies of field notes and preferable to type on a master for reproduction. One original copy is retained for reference and other copies can be used as rough drafts to be cut up, reorganized and rewritten.
• The notes ought to be full enough adequately to summon up for one again, months later, a reasonably vivid picture of any described event. This probably means that one ought to be writing up, at the very minimum, at least a couple of single-space typed pages for every hour of observation.2

The sort of note-taking recommended by Lofland and actually undertaken by King (1979) and Wolcott (1973)3 in their ethnographic accounts grows out of the nature of the unstructured observation study. Note-taking, confessed Wolcott, helped him fight the acute boredom that he sometimes felt when observing the interminable meetings that are the daily lot of the school principal. Occasionally, however, a series of events would occur so quickly that Wolcott had time only to make cursory notes which he supplemented later with fuller accounts. One useful tip from this experienced ethnographer is worth noting: never resume your observations until the notes from the preceding observation are complete. Until your observations and impressions from one visit are a matter of record, there is little point in returning to the classroom or school and reducing the impact of one set of events by superimposing another and more recent set. Indeed, when to record one's data is but one of a number of practical problems identified by Walker, which are listed in Box 9.4 (Walker, 1980).


Planning a case study

In planning a case study there are several issues that researchers may find useful to consider (e.g. Adelman et al., 1980):

• The particular circumstances of the case, including: (a) the possible disruption to individual participants that participation might entail; (b) negotiating access to people; (c) negotiating ownership of the data; (d) negotiating release of the data.
• The conduct of the study, including: (a) the use of primary and secondary sources; (b) the opportunities to check data; (c) triangulation (including peer examination of the findings, respondent validation and reflexivity); (d) data collection methods—in the interpretive paradigm case studies tend to use certain data collection methods, e.g. semi-structured and open interviews, observation, narrative accounts, documents and diaries, maybe also tests, rather than other methods, e.g. surveys and experiments. Nisbet and Watt (1984) suggest that, in conducting interviews, it may be wiser to interview senior people later rather than earlier so that the most effective use of discussion time can be made, the interviewee having been put fully into the picture before the interview; (e) data analysis and interpretation and, where appropriate, theory generation; (f) the writing of the report—Nisbet and Watt (ibid.) suggest that it is important to separate conclusions from the evidence, with the essential evidence included in the main text, and to balance illustration with analysis and generalization.
• The consequences of the research (for participants). This might include anonymizing the research in order to protect participants, though such anonymization might suggest that a primary goal of case study is generalization rather than the portrayal of a unique case, i.e. it might go against a central feature of case study. Anonymizing reports might render them anodyne, and Adelman et al. suggest that the distortion involved in such anonymization—rendering cases unrecognizable—might be too high a price to pay for going public.

Nisbet and Watt (1984:78) suggest three main stages in undertaking a case study. Because case studies catch the dynamics of unfolding situations, it is advisable to commence with a very wide field of focus, an open phase, without selectivity or prejudgement. Thereafter, progressive focusing enables a narrower field of focus to be established, identifying key foci for subsequent study and data collection. At the third stage a draft interpretation is prepared, which needs to be checked with respondents before appearing in the final form. Nisbet and Watt (ibid.: 79) advise against the generation of hypotheses too early in a case study; rather, they suggest, it is important to gather data openly. Respondent validation can be particularly

Box 9.4 The case study and problems of selection

Among the issues confronting the researcher at the outset of his case study are the problems of selection. The following questions indicate some of the obstacles in this respect:

1 How do you get from the initial idea to the working design (from the idea to a specification, to usable data)?
2 What do you lose in the process?
3 What unwanted concerns do you take on board as a result?
4 How do you find a site which provides the best location for the design?
5 How do you locate, identify and approach key informants?
6 How they see you creates a context within which you see them. How can you handle such social complexities?
7 How do you record evidence? When? How much?
8 How do you file and categorize it?
9 How much time do you give to thinking and reflecting about what you are doing?
10 At what points do you show your subjects what you are doing?
11 At what points do you give them control over who sees what?
12 Who sees the reports first?

Source Adapted from Walker, 1980


useful, as respondents might suggest a better way of expressing the issue or may wish to add or qualify points.

There is a risk in respondent validation, however: respondents may disagree with an interpretation. Nisbet and Watt (ibid.: 81) indicate the need to have negotiated rights to veto. They also recommend that researchers: (a) promise that respondents can see those sections of the report that refer to them (subject to controls for confidentiality, e.g. of others in the case study); (b) take full account of suggestions and responses made by respondents and, where possible, modify the account; (c) in the case of disagreement between researchers and respondents, promise to publish respondents' comments and criticisms alongside the researchers' report.

Sturman (1997) places on a set of continua the nature of data collection, types of data and analysis techniques in case study research. These are presented in summary form (Box 9.5). At one pole we have unstructured, typically qualitative data, whilst at the other we have structured, typically quantitative data. Researchers using case study approaches will need to decide which methods of data collection, which type of data and which techniques of analysis to employ.

Conclusion

The different strategies we have illustrated in our six examples of case studies in a variety of educational settings suggest that participant observation is best thought of as a generic term that describes a methodological approach rather than one specific method.4 What our examples have shown is that the representativeness of a particular sample often relates to the observational strategy open to the researcher. Generally speaking, the larger the sample, the more representative it is, and the more likely that the observer's role is of a participant nature.

Box 9.5 Continua of data collection, types and analysis in case study research
Source Adapted from Sturman, 1997

10 Correlational research

Introduction

Human behaviour, at both the individual and social level, is characterized by great complexity, a complexity about which we understand comparatively little, given the present state of social research. One approach to a fuller understanding of human behaviour is to begin by teasing out simple relationships between those factors and elements deemed to have some bearing on the phenomena in question. The value of correlational research is that it is able to achieve this end.

Much of social research in general, and educational research more particularly, is concerned at our present stage of development with the first step in this sequence—establishing interrelationships among variables. We may wish to know, for example, how delinquency is related to social class background; or whether an association exists between the number of years spent in full-time education and subsequent annual income; or whether there is a link between personality and achievement. Numerous techniques have been devised to provide us with numerical representations of such relationships, and they are known as 'measures of association'. We list the principal ones in Box 10.1. The interested reader is referred to Cohen and Holliday (1982, 1996), texts containing worked examples of the appropriate use (and limitations) of the correlational techniques outlined in Box 10.1, together with other measures of association such as Kruskal's gamma, Somers' d and Guttman's lambda.

Look at the words used at the top of the Box to explain the nature of variables in connection with the measure called the Pearson product moment, r. The variables, we learn, are 'continuous' and at the 'interval' or the 'ratio' scale of measurement. A continuous variable is one that, theoretically at least, can take any value between two points on a scale. Weight, for example, is a continuous variable; so too is time; so also is height. Weight, time and height can take on any number of possible values between nought and infinity, the feasibility of measuring them across such a range being limited only by the sensitivity of suitable measuring instruments.

A ratio scale includes an absolute zero and provides equal intervals. Using weight as our example, we can say that no mass at all is a zero measure and that 1,000 grams is 400 grams heavier than 600 grams and twice as heavy as 500 grams. In our discussion of correlational research that follows, we refer to a relationship as a 'correlation' rather than an 'association' whenever that relationship can be further specified in terms of an increase or a decrease of a certain number of units in the one variable (IQ, for example) producing an increase or a decrease of a related number of units of the other (e.g. mathematical ability).

Turning again to Box 10.1, we read in connection with the second measure shown there (rank order or Kendall's tau) that the two continuous variables are at the 'ordinal' scale of measurement. An ordinal scale is used to indicate rank order; that is to say, it arranges individuals or objects in a series ranging from the highest to the lowest according to the particular characteristic being measured. In contrast to the interval scale discussed earlier, ordinal numbers assigned to such a series do not indicate absolute quantities, nor can one assume that the


intervals between the numbers are equal. For example, in a class of children rated by a teacher on the degree of their co-operativeness and ranked from highest to lowest according to that attribute, it cannot be assumed that the difference in the degree of co-operativeness between subjects ranked 1 and 2 is the same as that obtaining between subjects 9 and 10; nor can it be taken that subject 1 possesses ten times the quantity of co-operativeness of subject 10.

The variables involved in connection with the phi co-efficient measure of association (halfway down Box 10.1) are described as 'true dichotomies' and at the 'nominal' scale of measurement. Truly dichotomous variables (such as sex or driving test result) can take only two values (male or female; pass or fail). The nominal scale is the most elementary scale of measurement. It does no more than identify the categories into which individuals, objects or events may be classified.

Box 10.1 Common measures of relationship

Measure                                Nature of variables                             Comment
Pearson product moment, r              Two continuous variables;                       Relationship linear
                                       interval or ratio scale
Rank order or Kendall's tau            Two continuous variables;
                                       ordinal scale
Correlation ratio, η (eta)             One variable continuous, other either           Relationship non-linear
                                       continuous or discrete
Intraclass                             One variable continuous, other discrete;        Purpose: to determine within-group
                                       interval or ratio scale                         similarity
Biserial, r_bis;                       One variable continuous, other (a)              Index of item discrimination
point biserial, r_pt bis               continuous but dichotomized (r_bis) or          (used in item analysis)
                                       (b) a true dichotomy (r_pt bis)
Phi co-efficient, Φ                    Two true dichotomies;
                                       nominal or ordinal series
Partial correlation, r_12.3            Three or more continuous variables              Purpose: to determine the relationship
                                                                                       between two variables, with the effect
                                                                                       of a third held constant
Multiple correlation, r_1.234          Three or more continuous variables              Purpose: to predict one variable from a
                                                                                       linear weighted combination of two or
                                                                                       more independent variables
Kendall's co-efficient of              Three or more continuous variables;             Purpose: to determine the degree of
concordance, W                         ordinal series                                  (say, interrater) agreement

Source Mouly, 1978


Those categories have to be mutually exclusive, of course, and a nominal scale should also be complete; that is to say, it should include all possible classifications of a particular type.

To conclude our explanation of terminology, readers should note the use of the phrase 'discrete variable' in the description of the third measure, the correlation ratio (eta), in Box 10.1. We said earlier that a continuous variable can take on any value between two points on a scale. A discrete variable, however, can only take on numerals or values that are specific points on a scale. The number of players in a football team is a discrete variable. It is usually 11; it could be fewer than 11, but it could never be 7¼!

Explaining correlation and significance

Correlational techniques are generally intended to answer three questions about two variables or two sets of data. First, 'Is there a relationship between the two variables (or sets of data)?' If the answer to this question is 'yes', then two other questions follow: 'What is the direction of the relationship?' and 'What is the magnitude?'

Relationship in this context refers to any tendency for the two variables (or sets of data) to vary consistently. Pearson's product moment coefficient of correlation, one of the best-known measures of association, is a statistical value ranging from -1.0 to +1.0 and expresses this relationship in quantitative form. The coefficient is represented by the symbol r.

Where the two variables (or sets of data) fluctuate in the same direction, i.e. as one increases so does the other, or as one decreases so does the other, a positive relationship is said to exist. Correlations reflecting this pattern are prefaced with a plus sign to indicate the positive nature of the relationship. Thus, +1.0 would indicate perfect positive correlation between two factors, as with the radius and diameter of a circle, and +0.80 a high positive correlation, as between academic achievement and intelligence, for example. Where the sign has been omitted, a plus sign is assumed.

A negative correlation or relationship, on the other hand, is to be found when an increase in one variable is accompanied by a decrease in the other variable. Negative correlations are prefaced with a minus sign. Thus, -1.0 would represent perfect negative correlation, as between the number of errors children make on a spelling test and their score on the test, and -0.30 a low negative correlation, as between absenteeism and intelligence, say.

Generally speaking, researchers tend to be more interested in the magnitude of an obtained correlation than they are in its direction. Correlational procedures have been developed so that no relationship whatever between two variables is represented by zero (or 0.00), as between body weight and intelligence, possibly. This means that a person's performance on one variable is totally unrelated to her performance on a second variable. If she is high on one, for example, she is just as likely to be high or low on the other. Perfect correlations of +1.00 or -1.00 are rarely found. The correlation co-efficient may be seen, then, as an indication of the predictability of one variable given the other: it is an indication of covariation. The relationship between two variables can be examined visually by plotting the paired measurements on graph paper, with each pair of observations being represented by a point. The resulting arrangement of points is known as a 'scatter diagram' and enables us to assess graphically the degree of relationship between the characteristics being measured. Box 10.2 gives some examples of scatter diagrams in the field of educational research.
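For readers who wish to try this computationally, the short Python sketch below is our own illustration, not part of the original text; the paired scores are invented for the purpose. It computes r for two hypothetical variables and prints its sign and size.

```python
# A minimal sketch of computing a Pearson product moment
# correlation; the paired scores below are invented.
import numpy as np

achievement = np.array([52, 61, 58, 70, 75, 66, 80, 85])
intelligence = np.array([95, 100, 98, 108, 112, 105, 118, 121])

# np.corrcoef returns the 2 x 2 correlation matrix;
# the off-diagonal element is r for the two variables.
r = np.corrcoef(achievement, intelligence)[0, 1]
print(f"r = {r:+.2f}")  # the sign gives the direction, the size the magnitude
```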

Let us imagine we observe that many people with large hands also have large feet and that people with small hands also have small feet (see Morrison, 1993:136–40). We decide to conduct an investigation to see if there is any correlation or degree of association between the size of feet and the size of hands, or whether it was just chance that led some people to have large hands and large feet. We measure the hands and the feet of 100 people and observe that 99 times out of 100 those people with large feet also have large hands. That seems to be more than mere


coincidence; it would seem that we could say with some certainty that if a person has large hands then she will also have large feet. How do we know when we can make that assertion? When do we know that we can have confidence in this prediction?

For statistical purposes, if we can observe this relationship occurring 95 times out of 100, i.e. if chance accounted for only 5 per cent of the difference, then we could say with some confidence that there seems to be a high degree of association between the two variables, hands and feet; it would


fail to occur in only 5 people in every 100, reported as the 0.05 level of significance (0.05 being five-hundredths). If we can observe this relationship occurring 99 times out of every 100 (as in the example of hands and feet), i.e. if chance accounted for only 1 per cent of the difference, then we could say with even greater confidence that there seems to be a very high degree of association between the two variables; it would fail to occur only once in every hundred, reported as the 0.01 level of significance (0.01 being one-hundredth). We begin with a null hypothesis, which states that there is no relationship between the size of hands and the size of feet. The task is to disprove or reject the hypothesis—the burden of responsibility is to reject the null hypothesis. If we can show that the hypothesis is untrue for 95 per cent or 99 per cent of the population, then we have demonstrated that there is a statistically significant relationship between the size of hands and the size of feet at the 0.05 and 0.01 levels of significance respectively. These two levels of significance—the 0.05 and 0.01 levels—are the levels at which statistical significance is frequently taken to have been demonstrated. The researcher would say that the null hypothesis (that there is no significant relationship between the two variables) had been rejected and that the level of significance observed (p) was either at the 0.05 or the 0.01 level.

Box 10.2 Correlation scatter diagrams
Source Tuckman, 1972

Let us take a second example. Let us say that we have devised a scale of 1–8 which can be used to measure the sizes of hands and feet. Using the scale we make the following calculations for eight people, and set out the results thus:

            Hand size   Foot size
Subject A       1           1
Subject B       2           2
Subject C       3           3
Subject D       4           4
Subject E       5           5
Subject F       6           6
Subject G       7           7
Subject H       8           8

We can observe a perfect correlation between the size of the hands and the size of the feet, from

the person who has a size 1 hand and a size 1 foot to the person who has a size 8 hand and also a size 8 foot. There is a perfect positive correlation (as one variable increases, e.g. hand size, so the other variable—foot size—increases, and as one variable decreases so does the other). Using a mathematical formula (a correlation statistic, available in most statistics books) we would calculate that this perfect correlation yields an index of association—a co-efficient of correlation—of +1.00.

Suppose that this time we carried out the investigation on a second group of eight people and reported the following results:

            Hand size   Foot size
Subject A       1           8
Subject B       2           7
Subject C       3           6
Subject D       4           5
Subject E       5           4
Subject F       6           3
Subject G       7           2
Subject H       8           1

This time the person with a size 1 hand has a size 8 foot and the person with the size 8 hand has a size 1 foot. There is a perfect negative correlation (as one variable increases, e.g. hand size, the other variable—foot size—decreases, and as one variable decreases, the other increases). Using the same mathematical formula we would calculate that this perfect negative correlation yields an index of association—a co-efficient of correlation—of -1.00.
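These two extreme cases are easily verified in software. The following Python sketch (our illustration, reproducing the two hypothetical groups above) confirms the co-efficients of +1.00 and -1.00:

```python
import numpy as np

hand = np.array([1, 2, 3, 4, 5, 6, 7, 8])
foot_first_group = np.array([1, 2, 3, 4, 5, 6, 7, 8])    # identical ordering
foot_second_group = np.array([8, 7, 6, 5, 4, 3, 2, 1])   # reversed ordering

print(np.corrcoef(hand, foot_first_group)[0, 1])   # +1.0: perfect positive
print(np.corrcoef(hand, foot_second_group)[0, 1])  # -1.0: perfect negative
```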

Now, clearly it is very rare to find a perfect positive or a perfect negative correlation; the truth of the matter is that looking for correlations will yield co-efficients of correlation which lie somewhere between -1.00 and +1.00. How do we know whether the co-efficients of correlation are significant or not?

Let us say that we take a third sample of eight people and undertake an investigation into their hand and foot size. We enter the data case by case (Subject A to Subject H), indicating their rank order for hand size and then for foot size. This


time the relationship is less clear because the rank ordering is more mixed; for example, Subject A has a hand size of 2 and a foot size of 1, Subject B has a hand size of 1 and a foot size of 2, etc.:

            Hand size   Foot size
Subject A       2           1
Subject B       1           2
Subject C       3           3
Subject D       5           4
Subject E       4           5
Subject F       7           6
Subject G       6           7
Subject H       8           8

Using the mathematical formula for calculating the correlation statistic, we find that the co-efficient of correlation for the eight people is 0.7857. Is it statistically significant? From a table of significance (commonly printed in appendices to books on statistics or research methods), we read off whether the co-efficient is statistically significant or not for a specific number of cases, for example:

Number in sample      0.05 level      0.01 level
       8                 0.78           0.875
      10                 0.65
      30                 0.36

We see that for eight cases in an investigation the correlation co-efficient has to be 0.78 or higher if it is to be significant at the 0.05 level, and 0.875 or higher if it is to be significant at the 0.01 level of significance. As the correlation co-efficient in the example of the third experiment with eight subjects is 0.7857, we can see that it is higher than that required for significance at the 0.05 level (0.78) but not as high as that required for significance at the 0.01 level (0.875). We are safe, then, in stating that the degree of association between the hand and foot sizes rejects the null hypothesis and demonstrates statistical significance at the 0.05 level.

The first example above of hands and feet (see p. 193) is very neat because it has 100 people in the sample. If we have more or fewer than 100 people, how do we know if a relationship between two factors is significant? Let us say that we have data on 30 people; in this case, because our sample size is so small, we might hesitate to say that there is a strong association between the size of hands and the size of feet if we observe it in only 27 people (i.e. 90 per cent of the sample). On the other hand, let us say that we have a sample of 1,000 people and we observe the association in 700 of them. In this case, even though only 70 per cent of the sample demonstrate the association of hand and foot size, we might say that, because the sample size is so large, we can have greater confidence in the data than in the small sample.

Statistical significance varies according to the size of the sample (as can be seen also in the section of the table of significance reproduced above). In order to be able to determine significance we need to have two facts in our possession: the size of the sample and the co-efficient of correlation. As the selection from the table of significance reproduced above shows, the co-efficient of correlation can decrease and still be statistically significant as long as the sample size increases. (This resonates with Krejcie and Morgan's (1970) principles for sampling, observed in Chapter 4, viz. as the population increases the sample size increases at a diminishing rate in addressing randomness.)

To ascertain significance from a table, then, it is simply a matter of reading off the significance level from a table of significance according to the sample size, or processing data on a computer programme to yield the appropriate statistic. In the selection from the table of significance for the third example above concerning hand and foot size, the first column indicates the number of people in the sample and the other two columns indicate significance at the two levels. Hence, if we have 30 people in the sample then, for the correlation to be significant at the 0.05 level, we would need a correlation co-efficient of 0.36, whereas, if there were only 10 people in the


sample, we would need a correlation co-efficient of 0.65 for the correlation to be significant at the same 0.05 level.
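Such a check can also be carried out directly in software rather than from a printed table. The sketch below is our own illustration and assumes the SciPy library is available; the two rankings are invented for the purpose and are not the sample discussed above. spearmanr returns both the rank order co-efficient and its significance level:

```python
# A minimal sketch, assuming SciPy is installed, of obtaining a rank
# order correlation and its significance level; the data are invented.
from scipy.stats import spearmanr

x_ranks = [1, 2, 3, 4, 5, 6, 7, 8]
y_ranks = [2, 1, 3, 5, 4, 8, 6, 7]

rho, p = spearmanr(x_ranks, y_ranks)
print(f"rho = {rho:.3f}, p = {p:.3f}")
# Reject the null hypothesis at the 0.05 level only if p < 0.05.
```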

In addition to the types of purpose set out above, the calculation of a correlation co-efficient is also used in determining the item discriminability of a test (see Chapter 18), e.g. using a point biserial calculation, and in determining split-half reliability in test items (see Chapter 5) using the Spearman rank order correlation statistic.

More recently, statistical significance on its own has been seen as an unacceptable index of effect (Thompson, 1994, 1996, 1998; Thompson and Snyder, 1997; Rozeboom, 1997:335; Fitz-Gibbon, 1997:43) because it depends on sample size. What is also required to accompany significance is information about effect size (American Psychological Association, 1994:18). Indeed, effect size is seen as much more important than significance (see also Chapter 12). Statistical significance is seen as arbitrary and unhelpful—a 'corrupt form of the scientific method' (Carver, 1978)—being an obstacle rather than a facilitator in educational research. It commands slavish adherence rather than addressing the subtle, sensitive and helpful notion of effect size (see Fitz-Gibbon, 1997:118). Indeed, common sense should tell the researcher that a differential measure of effect size is more useful than the blunt edge of statistical significance.

Whilst correlations are widely used in research, and they are straightforward to calculate and to interpret, the researcher must be aware of four caveats in undertaking correlational analysis:

1 Do not assume that correlations imply causal relationships (Mouly, 1978) (i.e. simply because having large hands appears to correlate with having large feet does not imply that having large hands causes one to have large feet).
2 There is a need to be alert to a Type I error—rejecting the null hypothesis when it is in fact true (a particular problem as the sample increases, as the chances of finding a significant association increase, irrespective of whether a true association exists (Rose and Sullivan, 1993:168), requiring the researcher, therefore, to set a more stringent significance level (e.g. 0.01 or 0.001)).
3 There is a need to be alert to a Type II error—accepting the null hypothesis when it is in fact not true (often the case if the levels of significance are set too stringently, requiring the researcher to relax the level of significance required (e.g. to 0.1 or 0.2)).
4 Statistical significance must be accompanied by an indication of effect size.

The identification and resolution of issues (2) and (3) are addressed in Chapter 5.

Curvilinearity

The correlations discussed so far have assumed linearity, that is, the more we have of one property, the more (or less) we have of another property, in a direct positive or negative relationship. A straight line can be drawn through the points on the scatter diagrams (scatterplots). However, linearity cannot always be assumed. Consider the case, for example, of stress: a little stress might enhance performance ('setting the adrenalin running') positively, whereas too much stress might lead to a downturn in performance. Where stress enhances performance there is a positive correlation, but when stress debilitates performance there is a negative correlation. The result is not a straight line of correlation (indicating linearity) but a curved line (indicating curvilinearity). This can be shown graphically (Box 10.3). It is assumed here, for the purposes of the example, that muscular strength can be measured on a single scale. It is clear from the graph that muscular strength increases from birth until 50 years, and thereafter it declines as muscles degenerate. There is a positive correlation between age and muscular strength on the left-hand side of the graph and a negative correlation on the right-hand side of the graph, i.e. a curvilinear correlation can be observed.

Box 10.3 A line diagram to indicate curvilinearity

Hopkins, Hopkins and Glass (1996:92) provide


another example of curvilinearity: room temperature and comfort. Raising the temperature a little can make for greater comfort—a positive correlation—whilst raising it too greatly can make for discomfort—a negative correlation. Many correlational statistics assume linearity (e.g. the Pearson product moment correlation). However, rather than using correlational statistics arbitrarily or blindly, the researcher will need to consider whether, in fact, linearity is a reasonable assumption to make, or whether a curvilinear relationship is more appropriate (in which case more sophisticated statistics will be needed, e.g. η ('eta') (Glass and Hopkins, 1996, section 8.7; Cohen and Holliday, 1996:84), or mathematical procedures will need to be applied to transform non-linear relations into linear relations). Examples of curvilinear relationships might include:

• pressure from the principal and teacher performance;
• pressure from the teacher and student achievement;
• degree of challenge and student achievement;
• assertiveness and success;
• age and muscular strength;
• age and physical control;
• age and concentration;
• age and sociability;
• age and cognitive abilities.

Hopkins, Hopkins and Glass (ibid.) suggest that the variable 'age' frequently has a curvilinear relationship with other variables. The authors also point out (p. 92) that poorly constructed tests can give the appearance of curvilinearity if the test is too easy (a 'ceiling effect' where most students score highly) or if it is too difficult; but this curvilinearity is, in fact, spurious, as the test does not demonstrate sufficient item difficulty or discriminability.
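The practical danger is easily demonstrated: a linear statistic applied to a curvilinear relationship can report almost no association at all. The following Python sketch (our own illustration, with invented data following an inverted-U pattern) makes the point:

```python
import numpy as np

# Invented inverted-U data: performance rises with stress, then falls.
stress = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9])
performance = np.array([2, 4, 6, 8, 9, 8, 6, 4, 2])

r = np.corrcoef(stress, performance)[0, 1]
print(f"Pearson r = {r:.2f}")
# r is (almost exactly) zero here, yet the two variables are strongly,
# though curvilinearly, related; a scatterplot would reveal the curve.
```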

In planning correlational research, then, attention will need to be given to whether linearity or curvilinearity is to be assumed.

Co-efficients of correlation

The co-efficient of correlation, then, tells us something about the relations between two variables. Other measures exist, however, which allow us to specify relationships when more than two variables are involved. These are known as measures of 'multiple correlation' and 'partial correlation'.

Multiple correlation measures indicate the degree of association between three or more variables simultaneously. We may want to know, for example, the degree of association between delinquency, social class background and leisure facilities. Or we may be interested in finding out the relationship between academic achievement, intelligence and neuroticism. Multiple correlation, or 'regression' as it is sometimes called, indicates the degree of association between n variables. It is related not only to the correlations of the independent variables with the dependent variable, but also to the intercorrelations between the independent variables.

Partial correlation aims at establishing the degree of association between two variables after the influence of a third has been controlled or partialled out. Studies involving complex relationships utilize multiple and partial correlations in order to provide a clearer picture of the relationships being investigated. Guilford and Fruchter (1973) define a partial correlation between two variables as:

one that nullifies the effects of a third variable (or a number of variables) upon both the variables


being correlated. The correlation between height and weight of boys in a group where age is permitted to vary would be higher than the correlation between height and weight in a group at constant age. The reason is obvious. Because certain boys are older, they are both heavier and taller. Age is a factor that enhances the strength of correspondence between height and weight. With age held constant, the correlation would still be positive and significant because at any age, taller boys tend to be heavier.

(Guilford and Fruchter, 1973)

Correlational research is particularly useful in tackling problems in education and the social sciences because it allows for the measurement of a number of variables and their relationships simultaneously. The experimental approach, by contrast, is characterized by the manipulation of a single variable, and is thus appropriate for dealing with problems where simple causal relationships exist. In educational and behavioural research, it is invariably the case that a number of variables contribute to a particular outcome. Experimental research thus introduces a note of unreality into research, whereas correlational approaches, while less rigorous, allow for the study of behaviour in more realistic settings. Where an element of control is required, however, partial correlation achieves this without changing the context in which the study takes place. Nonetheless, correlational research is less rigorous than the experimental approach because it exercises less control over the independent variables; it is prone to identify spurious patterns of relationship; it adopts an atomistic approach; and the correlation index is relatively imprecise, being limited by the unreliability of the measurements of the variables.

Characteristics of correlational studies

Correlational studies may be broadly classified as either 'relational studies' or 'prediction studies'. We now look at each a little more closely.

In the case of the first of these two categories, correlational research is mainly concerned with achieving a fuller understanding of the complexity of phenomena or, in the matter of behavioural and educational research, behavioural patterns, by studying the relationships between the variables which the researcher hypothesizes as being related. As a method, it is particularly useful in exploratory studies in fields where little or no previous research has been undertaken. It is often a shot in the dark aimed at verifying hunches a researcher has about a presumed relationship between characteristics or variables. Take a complex notion like 'teacher effectiveness', for example. This is dependent upon a number of less complex factors operating singly or in combination. Factors such as intelligence, motivation, person perception, verbal skills and empathy come to mind as possibly having an effect on teaching outcomes. A review of the research literature will confirm or reject these possibilities. Once an appropriate number of factors have been identified in this way, suitable measures may then be chosen or developed to assess them. They are then given to a representative sample and the scores obtained are correlated with a measure of the complex factor being investigated, namely, teacher effectiveness. As it is an exploratory undertaking, the analysis will consist of correlation co-efficients only, though if it is designed carefully, we will begin to achieve some understanding of the particular behaviour being studied. The investigation and its outcomes may then be used as a basis for further research or as a source of additional hypotheses.

Exploratory relationship studies may also employ partial correlational techniques. Partial correlation is a particularly suitable approach when a researcher wishes to nullify the influence of one or more important factors upon behaviour in order to bring the effect of less important factors into greater prominence. If, for example, we wanted to understand more fully the determinants of academic achievement in a comprehensive school, we might begin by acknowledging the importance of the factor of intelligence and establishing a relationship between intelligence and academic achievement.


The intelligence factor could then be held constant by partial correlation, thus enabling the investigator to clarify other, lesser factors such as motivation, parental encouragement or vocational aspiration. Clearly, motivation is related to academic achievement, but if a pupil's motivation score is correlated with academic achievement without controlling the intelligence factor, it will be difficult to assess the true effect of motivation on achievement because the pupil with high intelligence but low motivation may possibly achieve more than pupils with lower intelligence but higher motivation. Once intelligence has been nullified, it is possible to see more clearly the relationship between motivation and achievement. The next stage might be to control the effects of both intelligence and motivation and then to seek a clearer idea of the effects of other selected factors—parental encouragement or vocational aspiration, for instance. Finally, exploratory relationship studies may employ sophisticated, multivariate techniques in teasing out associations between dependent and independent variables.
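The arithmetic of a first-order partial correlation is straightforward: r12.3 = (r12 - r13 r23) / sqrt[(1 - r13^2)(1 - r23^2)]. The Python sketch below is our own illustration of this standard formula; the three zero-order co-efficients are invented for the purpose.

```python
import math

# Invented zero-order correlations for illustration:
r_ma = 0.55  # motivation (1) with achievement (2)
r_mi = 0.40  # motivation (1) with intelligence (3)
r_ai = 0.70  # achievement (2) with intelligence (3)

# First-order partial correlation: motivation with achievement,
# intelligence held constant (partialled out).
r_ma_given_i = (r_ma - r_mi * r_ai) / math.sqrt((1 - r_mi**2) * (1 - r_ai**2))
print(f"r(motivation, achievement | intelligence) = {r_ma_given_i:.2f}")
```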

In contrast to exploratory relationship studies, prediction studies are usually undertaken in areas having a firmer and more secure knowledge base. Prediction through the use of correlational techniques is based on the assumption that at least some of the factors that will lead to the behaviour to be predicted are present and measurable at the time the prediction is made (see Borg, 1963). If, for example, we wanted to predict the probable success of a group of salespeople on an intensive training course, we would start with variables that have been found in previous research to be related to later success in sales work. These might include enterprise, verbal ability, achievement motivation, emotional maturity, sociability and so on. The extent to which these predictors correlate with the particular behaviour we wish to predict, namely, successful selling, will determine the accuracy of our prediction. Clearly, variables crucial to success cannot be predicted if they are not present at the time of making the prediction. A salesperson's ability to fit in with a team of his or her fellows cannot be predicted where these future colleagues are unknown.

In order to be valuable in prediction, the magnitude of association between two variables must be substantial; and the greater the association, the more accurate the prediction it permits. In practice, this means that anything less than perfect correlation will permit errors in predicting one variable from a knowledge of the other.

Borg recalls that much prediction research in the United States has been carried out in the field of scholastic success. Some studies in this connection have been aimed at short-term prediction of students' performance in specific courses of study, while other studies have been directed at long-term prediction of general academic success. Sometimes, short-term academic prediction is based upon a single predictor variable. Most efforts to predict future behaviours, however, are based upon scores on a number of predictor variables, each of which is useful in predicting a specific aspect of future behaviour. In the prediction of college success, for example, a single variable such as academic achievement is less effective as a predictor than a combination of variables such as academic achievement together with, say, motivation, intelligence, study habits, etc. More complex studies of this kind, therefore, generally make use of multiple correlation and multiple regression equations.
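By way of illustration, a multiple regression equation of this kind can be fitted by ordinary least squares. The sketch below is our own example, not taken from Borg; the predictor and outcome scores are invented.

```python
import numpy as np

# Invented scores for six students: two predictors and an outcome.
achievement = np.array([60.0, 65, 70, 75, 80, 85])
motivation = np.array([50.0, 70, 55, 75, 60, 80])
college_success = np.array([58.0, 68, 66, 78, 74, 86])

# Design matrix with a constant term, fitted by ordinary least squares.
X = np.column_stack([np.ones_like(achievement), achievement, motivation])
coefficients, *_ = np.linalg.lstsq(X, college_success, rcond=None)
predicted = X @ coefficients

# The multiple correlation R is the correlation between the observed
# outcome and the linear weighted combination of the predictors.
R = np.corrcoef(college_success, predicted)[0, 1]
print(f"weights = {coefficients.round(2)}, multiple R = {R:.2f}")
```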

Predicting behaviours or events likely to occur in the near future is easier and less hazardous than predicting behaviours likely to occur in the more distant future. The reason is that in short-term prediction, more of the factors leading to success in the predicted behaviour are likely to be present. In addition, short-term prediction allows less time for important predictor variables to change or for individuals to gain experience that would tend to change their likelihood of success in the predicted behaviour.

One further point: correlation, as Mouly and Borg observe, is a group concept, a generalized measure that is useful basically in predicting group performance. Whereas, for instance, it can be predicted that gifted children as a group will succeed at school, it cannot be predicted with


certainty that one particular gifted child will excel. Further, low co-efficients will have little predictive value, and only a high correlation can be regarded as valid for individual prediction.

Interpreting the correlation co-efficient

Once a correlation co-efficient has been computed, there remains the problem of interpreting it. A question often asked in this connection is how large the co-efficient should be for it to be meaningful. The question may be approached in three ways: by examining the strength of the relationship; by examining the statistical significance of the relationship (discussed earlier); and by examining the square of the correlation co-efficient.

Inspection of the numerical value of a correlation co-efficient will yield a clear indication of the strength of the relationship between the variables in question. Low or near-zero values indicate weak relationships, while those nearer to +1 or -1 suggest stronger relationships. Imagine, for instance, that a measure of a teacher's success in the classroom after five years in the profession is correlated with her final school experience grade as a student and that it was found that r=+0.19. Suppose now that her score on classroom success is correlated with a measure of need for professional achievement and that this yielded a correlation of 0.65. It could be concluded that there is a stronger relationship between success and professional achievement scores than between success and final student grade.

Exploratory relationship studies are generally interpreted with reference to their statistical significance, whereas prediction studies depend for their efficacy on the strength of the correlation co-efficients. These need to be considerably higher than those found in exploratory relationship studies, and for this reason they rarely invoke the concept of significance.

The third approach to interpreting a co-efficient is provided by examining the square of the co-efficient of correlation, r². This shows the proportion of variance in one variable that can be attributed to its linear relationship with the second variable. In other words, it indicates the amount the two variables have in common. If, for example, two variables A and B have a correlation of 0.50, then (0.50)², or 0.25, of the variation shown by the B scores can be attributed to the tendency of B to vary linearly with A. Box 10.4 shows graphically the common variance between reading grade and arithmetic grade having a correlation of 0.65.
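To take the Box 10.4 example: a correlation of 0.65 gives r² = (0.65)² = 0.4225, i.e. roughly 42 per cent of the variance in reading grade is common to arithmetic grade.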

There are three cautions to be borne in mind when one is interpreting a correlation co-efficient. First, a co-efficient is a simple number and must not be interpreted as a percentage. A correlation of 0.50, for instance, does not mean a 50 per cent relationship between the variables. Further, a correlation of 0.50 does not indicate twice as much relationship as that shown by a correlation of 0.25. A correlation of 0.50 actually indicates more than twice the relationship shown by a correlation of 0.25. In fact, as co-efficients approach +1 or -1, a difference in the absolute values of the co-efficients becomes more important than the same numerical difference between lower correlations would be.

Second, a correlation does not necessarily imply a cause-and-effect relationship between two factors, as we have previously indicated. Third, a correlation co-efficient is not to be interpreted in any absolute sense. A correlational value for a given sample of a population may not necessarily be the same as that found in another sample from the same population.

Box 10.4 Visualization of correlation of 0.65 between reading grade and arithmetic grade
Source Fox, 1969


Many factors influence the value of a given correlation coefficient and, if researchers wish to extrapolate to the populations from which they drew their samples, they will then have to test the significance of the correlation.

We now offer some general guidelines for interpreting correlation co-efficients. They are based on Borg's (1963) analysis and assume that the correlations relate to a hundred or more subjects.

Correlations ranging from 0.20 to 0.35

Correlations within this range show only a very slight relationship between variables, although they may be statistically significant. A correlation of 0.20 shows that only 4 per cent of the variance is common to the two measures. Whereas correlations at this level may have limited meaning in exploratory relationship research, they are of no value in either individual or group prediction studies.

Correlations ranging from 0.35 to 0.65

Within this range, correlations are statistically significant beyond the 1 per cent level. When correlations are around 0.40, crude group prediction may be possible. As Borg notes, correlations within this range are useful, however, when combined with other correlations in a multiple regression equation. Combining several correlations in this range can in some cases yield individual predictions that are correct within an acceptable margin of error. Correlations at this level used singly are of little use for individual prediction because they yield only a few more correct predictions than could be accomplished by guessing or by using some chance selection procedure.

Correlations ranging from 0.65 to 0.85

Correlations within this range make possible group predictions that are accurate enough for most purposes. Nearer the top of the range, group predictions can be made very accurately, usually predicting the proportion of successful candidates in selection problems within a very small margin of error. Near the top of this correlation range, individual predictions can be made that are considerably more accurate than would occur if no such selection procedures were used.

Correlations over 0.85

Correlations as high as this indicate a close relationship between the two variables correlated. A correlation of 0.85 indicates that the measure used for prediction has about 72 per cent variance in common with the performance being predicted. Prediction studies in education very rarely yield correlations this high. When correlations at this level are obtained, however, they are very useful for either individual or group prediction.

Examples of correlational research

To conclude this chapter, we illustrate the use of correlation co-efficients in a small-scale study of young children's attainments and self-images and, by contrast, we report some of the findings of a very large-scale, longitudinal survey of the outcomes of truancy that uses special techniques for controlling intruding variables in looking at the association between truancy and occupational prospects. Finally, we show how partial correlational techniques can clarify the strength and direction of associations between variables.

Small-scale study of attainment and self-image

A study by Crocker and Cheeseman (1988) investigated young children's ability to assess the academic worth of themselves and others. Specifically, the study posed the following three questions:

1 Can children in their first years at school assess their own academic rank relative to their peers?
2 What level of match exists between self-estimate, peer-estimate and teacher-estimate of academic rank?
3 What criteria do these children use when making these judgements?

Using three infant schools in the Midlands whose age range was from 5 to 7 years, the researchers selected a sample of 141 children from 5 classes. Observations took place on 20 half-day visits to each class, and the observer was able to interact with individual children. Notes on interactions were taken. Subsequently, each child was given pieces of paper with the names of all his or her classmates on them and was then asked to arrange them in two piles—those the child thought were 'better than me' at school work and those the child thought were 'not as good as me'. No child suggested that the task was one which he or she could not do. The relative self-rankings were converted to a percentage of children seen to be 'better than me' in each class.

Correspondingly, each teacher was asked to rank all the children in her class without using any standardized test. Spearman's rank order correlations were calculated between self-teacher, self-peer and peer-teacher rankings. The correlations, shown in Box 10.5, indicate that there was a high degree of agreement between self-estimates of rank position, peer estimates and teacher estimates, and appeared to confirm earlier researches in which there was broad agreement between self, peer and teacher ratings.

Box 10.5 Correlations between the various estimates of academic rank
Note All correlations are significant beyond the 0.01 level
Source Crocker and Cheeseman, 1988

The researchers conclude that the youngest schoolchildren quickly acquire a knowledge of those academic criteria that teachers use to evaluate pupils. The study disclosed a high degree of agreement between self, peers and teacher as to the rank order of children in a particular classroom. It seemed that only the youngest used non-academic measures to any great extent, and that this had largely disappeared by the time the children were 6 years old.

Large-scale study of truancy

Drawing on the huge database of the National Child Development Study (a longitudinal survey of all people in Great Britain born in the week 3–9 March 1958), Hibbett and her associates (1990) were able to explore the association between reported truancy at school, based on information obtained during the school years, and occupational, financial and educational progress, family formation and health, based on interview at the age of 23 years. We report here on some occupational outcomes of truancy.

Whereas initial analyses demonstrated a consistent relationship between truancy and dropping out of secondary education, less skilled employment, increased risk of unemployment and a reduced chance of being in a job involving further training, these associations were derived from comparisons between truants and all other members of the 1958 cohort. In brief, they failed to take account of the fact that truants and non-truants differed in respect of such vital factors as family size, father's occupation, and educational ability and attainment before truancy commenced. Using sophisticated statistical techniques, the investigators went on to control for these initial differences, thus enabling them to test whether or not the outcomes for truants differed once they were being compared with people who were similar in these respects. The multivariate techniques used in the analyses need not concern us here. Suffice it to say that, by and large, the differences that were noted before controlling for the intruding variables persisted even when those controls were



introduced. That is to say, truancy was found to correlate with:

• an unstable job history;
• a shorter mean length of jobs;
• a higher total number of jobs;
• greater frequency of unemployment;
• greater mean length of unemployment spells;
• lower family income.

Thus, by sequentially controlling for such variables as family size, father's occupation, measured ability and attainment at 11 years, etc., the researchers were able to ascertain how much each of these independent variables contributed to the relationship between truancy and the outcome variables that we identify above.

The investigators report their findings in terms such as:

• truants were 2.4 times more likely than non-truants to be unemployed rather than in work;
• truants were 1.4 times more likely than non-truants to be out of the labour force;
• truants experienced, on average, 4.2 months more unemployment than non-truants;
• truants were considerably less well off than non-truants in net family income per week.

The researchers conclude that their study challenges a commonly held belief that truants simply outgrow school and are ready for the world of work. On the contrary, truancy is often a sign of more general and long-term difficulties and a predictor of unemployment problems of a more severe kind than will be the experience of others who share the disadvantaged backgrounds and the low attainments that typify truants.

Partial correlation and associations between variables

The ability of partial correlational techniques to clarify the strength and direction of associations between variables is demonstrated in a study by Halpin, Croll and Redman (1990). In an exploration of teachers' perceptions of the effects of in-service education, the authors report correlations between Teaching (T), Organization and Policy (OP), Attitudes and Knowledge (AK) and the dependent variable, Pupil Attainment (PA).

The strength of these associations suggests that there is a strong tendency (r=0.68) for teachers who claim a higher level of 'INSET effect' on the Teaching dimension to claim also a higher level of effect on Pupil Attainment, and vice versa. The correlations between Organization and Policy (OP) and Pupil Attainment (PA), and Attitudes and Knowledge (AK) and Pupil Attainment (PA), however, are much weaker (r=0.27 and r=0.23 respectively). When the researchers calculated the partial correlation between Teaching and Pupil Attainment, controlling for (a) Organization and Policy and (b) Attitudes and Knowledge, the results showed little difference in respect of Teaching and Pupil Attainment (r=0.66 as opposed to 0.68 above). However, there was a noticeably reduced association with regard to Pupil Attainment and Organization and Policy (0.14 as opposed to 0.27 above) and Attitudes and Knowledge (0.09 as opposed to 0.23 above) when the association between Teaching and Pupil Attainment is partialled out. The authors conclude that improved teaching is seen as improving Pupil Attainment, regardless of any positive effects on Organization and Policy and Attitudes and Knowledge.
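A first-order partial correlation of this kind can be computed directly from the three zero-order correlations. The sketch below implements the standard formula in Python; the two reported values (0.68 and 0.27) are taken from the study, but the correlation between Teaching and the control variable is not reported above, so the figure used for it is an invented placeholder.

    # First-order partial correlation from three zero-order correlations.
    from math import sqrt

    def partial_corr(r_xy, r_xz, r_yz):
        # Correlation of x and y with z partialled out.
        return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz**2) * (1 - r_yz**2))

    r_xy = 0.68   # Teaching (T) with Pupil Attainment (PA): reported
    r_yz = 0.27   # PA with the control variable (OP): reported
    r_xz = 0.30   # T with OP: NOT reported above; placeholder assumption

    print(round(partial_corr(r_xy, r_xz, r_yz), 2))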

11 Ex post facto research

Introduction

When translated literally, ex post facto means 'from what is done afterwards'. In the context of social and educational research the phrase means 'after the fact' or 'retrospectively' and refers to those studies which investigate possible cause-and-effect relationships by observing an existing condition or state of affairs and searching back in time for plausible causal factors. In effect, researchers ask themselves what factors seem to be associated with certain occurrences, or conditions, or aspects of behaviour. Ex post facto research, then, is a method of teasing out possible antecedents of events that have happened and cannot, therefore, be engineered or manipulated by the investigator. The following example will illustrate the basic idea. Imagine a situation in which there has been a dramatic increase in the number of fatal road accidents in a particular locality. An expert is called in to investigate. Naturally, there is no way in which she can study the actual accidents because they have happened; nor can she turn to technology for a video replay of the incidents. What she can do, however, is attempt a reconstruction by studying the statistics, examining the accident spots, and taking note of the statements given by victims and witnesses. In this way the expert will be in a position to identify possible determinants of the accidents. These may include excessive speed, poor road conditions, careless driving, frustration, inefficient vehicles, the effects of drugs or alcohol and so on. On the basis of her examination, she can formulate hypotheses as to the likely causes and submit them to the appropriate authority in the form of recommendations. These may include improving road conditions, or lowering the speed limit, or increasing police surveillance, for instance. The point of interest to us is that in identifying the causes retrospectively, the expert adopts an ex post facto perspective.

Kerlinger (1970) has defined ex post facto research more formally as that in which the independent variable or variables have already occurred and in which the researcher starts with the observation of a dependent variable or variables. She then studies the independent variable or variables in retrospect for their possible relationship to, and effects on, the dependent variable or variables. The researcher is thus examining retrospectively the effects of a naturally occurring event on a subsequent outcome with a view to establishing a causal link between them. Interestingly, some instances of ex post facto designs correspond to experimental research in reverse, for instead of taking groups that are equivalent and subjecting them to different treatments so as to bring about differences in the dependent variables to be measured, an ex post facto experiment begins with groups that are already different in some respect and searches in retrospect for the factor that brought about the difference. Indeed, Spector (1993:42) suggests that ex post facto research is a procedure that is intended to transform a non-experimental research design into a pseudo-experimental form.

Two kinds of design may be identified in ex post facto research—the co-relational study and the criterion group study. The former is sometimes termed 'causal research' and the latter, 'causal-comparative research'. A co-relational (or causal) study is concerned with identifying the antecedents of a present condition.


As its name suggests, it involves the collection of two sets of data, one of which will be retrospective, with a view to determining the relationship between them. The basic design of such an experiment may be represented thus:1

[design diagram not reproduced here]

An example of this kind of design can be seen in the study by Borkowsky (1970). Where a strong relationship is found between the independent and dependent variables, three possible interpretations are open to the researcher:

1 that the variable X has caused O;
2 that the variable O has caused X; or
3 that some third unidentified, and therefore unmeasured, variable has caused X and O.

It is often the case that a researcher cannot tell which of these is correct.

The value of co-relational or causal studies lies chiefly in their exploratory or suggestive character for, as we have seen, while they are not always adequate in themselves for establishing causal relationships among variables, they are a useful first step in this direction in that they do yield measures of association.

In the criterion-group (or causal-comparative) approach, the investigator sets out to discover possible causes for a phenomenon being studied by comparing the subjects in whom the variable is present with similar subjects in whom it is absent. The basic design in this kind of study may be represented thus:

[design diagram not reproduced here]

If, for example, a researcher chose such a design to investigate factors contributing to teacher effectiveness, the criterion group O1, the effective teachers, and its counterpart O2, a group not showing the characteristics of the criterion group, are identified by measuring the differential effects of the groups on classes of children. The researcher may then examine X, some variable or event, such as the background, training, skills and personality of the groups, to discover what might 'cause' only some teachers to be effective.

Criterion-group or causal-comparative studies may be seen as bridging the gap between descriptive research methods on the one hand and true experimental research on the other.

Characteristics of ex post facto research

In ex post facto research the researcher takes the effect (or dependent variable) and examines the data retrospectively to establish causes, relationships or associations, and their meanings.

Other characteristics of ex post facto research become apparent when it is contrasted with true experimental research. Kerlinger (1970) describes the modus operandi of the experimental researcher. ('If x, then y' in Kerlinger's usage. We have substituted X for x and O for y to fit in with Campbell and Stanley's (1963) conventions throughout the chapter.) Kerlinger hypothesizes: if X, then O; if frustration, then aggression. Depending on circumstances and his own predilections in research design, he uses some method to manipulate X. He then observes O to see if concomitant variation, the variation expected or predicted from the variation in X, occurs. If it does, this is evidence for the validity of the proposition X–O, meaning 'If X, then O'. Note that the scientist here predicts from a controlled X to O. To help him achieve control, he can use the principle of randomization and active manipulation of X, and can assume, other things being equal, that O is varying as a result of the manipulation of X.

In ex post facto designs, on the other hand, O is observed. Then a retrospective search for X ensues. An X is found that is plausible and agrees with the hypothesis. Due to lack of control of X and other possible Xs, the truth of the hypothesized relation between X and O cannot be asserted with the confidence of the experimental researcher. Basically, then, ex post facto investigations have, so to speak, a built-in weakness: lack of control of the independent variable or variables. As Spector (1993:43) suggests, it is impossible to isolate and control every possible variable, or to know with absolute certainty which are the most crucial variables.

This brief comparison highlights the most important difference between the two designs—control. In the experimental situation, investigators at least have manipulative control; they have as a minimum one active variable. If an experiment is a 'true' experiment, they can also exercise control by randomization. They can assign subjects to groups randomly; or, at the very least, they can assign treatments to groups at random. In the ex post facto research situation, this control of the independent variable is not possible, and, what is perhaps more important, neither is randomization. Investigators must take things as they are and try to disentangle them, though having said this, we must point out that they can make use of selected procedures that will give them an element of control in this research. These we shall touch upon shortly.

By their very nature, ex post facto experiments can provide support for any number of different, perhaps even contradictory, hypotheses; they are so completely flexible that it is largely a matter of postulating hypotheses according to one's personal preference. The investigator begins with certain data and looks for an interpretation consistent with them; often, however, a number of interpretations may be at hand. Consider again the hypothetical increase in road accidents in a given town. A retrospective search for causes will disclose half a dozen plausible ones. Experimental studies, by contrast, begin with a specific interpretation and then determine whether it is congruent with externally derived data. Frequently, causal relationships seem to be established on nothing more substantial than the premise that any related event occurring prior to the phenomenon under study is assumed to be its cause—the classical post hoc, ergo propter hoc fallacy.2 Overlooked is the fact that even when we do find a relationship between two variables, we must recognize the possibility that both are individual results of a common third factor rather than the first being necessarily the cause of the second. And as we have seen earlier, there is also the real possibility of reverse causation, e.g. that a heart condition promotes obesity rather than the other way around, or that they encourage each other. The point is that the evidence simply illustrates the hypothesis; it does not test it, since hypotheses cannot be tested on the same data from which they were derived. The relationship noted may actually exist, but it is not necessarily the only relationship, or perhaps the crucial one. Before we can accept that smoking is the primary cause of lung cancer, we have to rule out alternative hypotheses.

We must not conclude from what has just been said that ex post facto studies are of little value; many of our important investigations in education and psychology are ex post facto designs. There is often no choice in the matter: an investigator cannot cause one group to become failures, delinquent, suicidal, brain-damaged or dropouts. Research must of necessity rely on existing groups. On the other hand, the inability of ex post facto designs to incorporate the basic need for control (e.g. through manipulation or randomization) makes them vulnerable from a scientific point of view, and the possibility of their being misleading should be clearly acknowledged. Ex post facto designs are probably better conceived more circumspectly, not as experiments with the greater certainty that these denote, but more as surveys, useful as sources of hypotheses to be tested by more conventional experimental means at a later date.

Occasions when appropriate

It would follow from what we have said in the preceding section that ex post facto designs are appropriate in circumstances where the more powerful experimental method is not possible. These would arise when, for example, it is not possible to select, control and manipulate the factors necessary to study cause-and-effect relationships directly; or when the control of all variables except a single independent variable may be unrealistic and artificial, preventing the normal interaction with other influential variables; or when laboratory controls for many research purposes would be impractical, costly or ethically undesirable.

Ex post facto research is particularly suitable in social, educational and—to a lesser extent—psychological contexts where the independent variable or variables lie outside the researcher's control. Examples of the method abound in these areas: the research on cigarette smoking and lung cancer, for instance; or studies of teacher characteristics; or studies examining the relationship between political and religious affiliation and attitudes; or investigations into the relationship between school achievement and independent variables such as social class, race, sex and intelligence. Many of these may be divided into large-scale or small-scale ex post facto studies, for example Stables's (1990) large-scale study of differences between pupils from mixed and single-sex schools and Arnold and Atkins's (1991) small-scale study of the social and emotional adjustment of hearing-impaired children.3

Advantages and disadvantages of ex post facto research

Among the advantages of the approach we may identify the following:

• Ex post facto research meets an important need of the researcher where the more rigorous experimental approach is not possible. In the case of the alleged relationship between smoking and lung cancer, for instance, this cannot be tested experimentally (at least as far as human beings are concerned).
• The method yields useful information concerning the nature of phenomena—what goes with what and under what conditions. In this way, ex post facto research is a valuable exploratory tool.
• Improvements in statistical techniques and general methodology have made ex post facto designs more defensible.
• In some ways and in certain situations the method is more useful than the experimental method, especially where the setting up of the latter would introduce a note of artificiality into research proceedings.
• Ex post facto research is particularly appropriate when simple cause-and-effect relationships are being explored.
• The method can give a sense of direction and provide a fruitful source of hypotheses that can subsequently be tested by the more rigorous experimental method.

Among the limitations and weaknesses of ex post facto designs the following may be mentioned:

• There is the problem of lack of control in that the researcher is unable to manipulate the independent variable or to randomize her subjects.
• One cannot know for certain whether the causative factor has been included or even identified.
• It may be that no single factor is the cause.
• A particular outcome may result from different causes on different occasions.
• When a relationship has been discovered, there is the problem of deciding which is the cause and which the effect; the possibility of reverse causation has to be considered.
• The relationship of two factors does not establish cause and effect.
• Classifying into dichotomous groups can be problematic.
• There is the difficulty of interpretation and the danger of the post hoc assumption being made, that is, believing that because X precedes O, X causes O.
• It often bases its conclusions on too limited a sample or number of occurrences.
• It frequently fails to single out the really significant factor or factors, and fails to recognize that events have multiple rather than single causes.
• As a method it is regarded by some as too flexible.
• It lacks nullifiability and confirmation.
• The sample size might shrink massively with multiple matchings (Spector, 1993:43).

Designing an ex post facto investigation

We earlier referred to the two basic designs embraced by ex post facto research—the co-relational (or causal) model and the criterion group (or causal-comparative) model. We return to them again here in order to consider designing both types of investigation. As we saw, the causal model attempts to identify the antecedent of a present condition and may be represented thus:

[design diagram not reproduced here]

Although one variable in an ex post facto study cannot be confidently said to depend upon the other as would be the case in a truly experimental investigation, it is nevertheless usual to designate one of the variables as independent (X) and the other as dependent (O). The left to right dimension indicates the temporal order, though having established this, we must not overlook the possibility of reverse causality.

The second model, the causal-comparative, may be represented schematically as:

[design diagram not reproduced here]

Using this model, the investigator hypothesizes the independent variable and then compares two groups, an experimental group (E) which has been exposed to the presumed independent variable X and a control group (C) which has not. (The dashed line in the model shows that the comparison groups E and C are not equated by random assignment.) Alternatively, she may examine two groups that are different in some way or ways and then try to account for the difference or differences by investigating possible antecedents. These two examples reflect two types of approach to causal-comparative research: the 'cause-to-effect' kind and the 'effect-to-cause' kind.

The basic design of causal-comparative investigations is similar to an experimentally designed study. The chief difference resides in the nature of the independent variable, X. In a truly experimental situation, this will be under the control of the investigator and may therefore be described as manipulable. In the causal-comparative model (and also the causal model), however, the independent variable is beyond her control, having already occurred. It may therefore be described in this design as non-manipulable.

Procedures in ex post facto research

We now examine the steps involved in implementing a piece of ex post facto research. We may begin by identifying the problem area to be investigated. This stage will be followed by a clear and precise statement of the hypothesis to be tested or questions to be answered. The next step will be to make explicit the assumptions on which the hypothesis and subsequent procedures will be based. A review of the research literature will follow. This will enable the investigator to ascertain the kinds of issues, problems, obstacles and findings disclosed by previous studies in the area. There will then follow the planning of the actual investigation, and this will consist of three broad stages—identification of the population and samples; the selection and construction of techniques for collecting data; and the establishment of categories for classifying the data. The final stage will involve the description, analysis and interpretation of the findings.

It was noted earlier that the principal weakness of ex post facto research is the absence of control over the independent variable influencing the dependent variable in the case of causal designs, or affecting observed differences between dependent variables in the case of causal-comparative designs. (We take up the question of control in experimental research in greater detail in the next chapter.) Although the ex post facto researcher is denied not only this kind of control but also the principle of randomization, she can nevertheless utilize procedures that will give her some measure of control in her investigation. And it is to some of these that we now turn.

One of the commonest means of introducing control into this type of research is that of matching the subjects in the experimental and control groups where the design is causal-comparative. One group of writers explain it thus:

The matching is usually done on a subject-to-subject basis to form matched pairs. For example, if one were interested in the relationship between scouting experiences and delinquency, he could locate two groups of boys classified as delinquent and non-delinquent according to specified criteria. It would be wise in such a study to select pairs from these groups matched on the basis of socio-economic status, family structure, and other variables known to be related to both scouting experience and delinquency. Analysis of the data from the matched samples could be made to determine whether or not scouting characterized the non-delinquent and was absent in the background of the delinquent.

(Ary et al., 1972)

There are difficulties with this procedure, however, for it assumes that the investigator knows what the relevant factors are, that is, the factors that may be related to the dependent variable. Further, there is the possibility of losing those subjects who cannot be matched, thus reducing one's sample.

As an alternative procedure for introducing a degree of control into ex post facto research, Ary and his colleagues suggest building the extraneous independent variables into the design and using an analysis of variance technique. They explain:

Assume that intelligence is a relevant extraneous variable and it is not feasible to control it through matching or other means. In this case, intelligence could be added to the design as another independent variable and the subjects of the study classified in terms of intelligence levels. The dependent variable measures would then be analyzed through an analysis of variance and the main and interaction effects of intelligence might be determined. Such a procedure would reveal any significant differences among the groups on the dependent variable, but no causal relationship between intelligence and the dependent variable could be assumed. Other extraneous variables could be operating to produce both the main effect and any interaction effects.

(Ary et al., 1972)
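In current software this strategy amounts to an analysis of variance with the extraneous variable entered as a second factor. A minimal sketch, assuming the pandas and statsmodels libraries and an invented data file with columns 'score', 'group' and 'iq_level':

    # Two-way ANOVA with the extraneous variable built into the design.
    # File and column names are invented for illustration.
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    df = pd.read_csv("expostfacto.csv")

    # Main effects of group and intelligence level, plus their interaction.
    model = smf.ols("score ~ C(group) * C(iq_level)", data=df).fit()
    print(anova_lm(model, typ=2))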

Yet another procedure which may be adopted for introducing a measure of control into ex post facto design is that of selecting samples that are as homogeneous as possible on a given variable. The writers quoted above illustrate the procedure with the following example.

If intelligence were a relevant extraneous variable, its effects could be controlled by using subjects from only one intelligence level. This procedure serves the purpose of disentangling the independent variable in which the investigator may be interested from other variables with which it is commonly associated, so that any effects that are found can justifiably be associated with the independent variable.

(Ary et al., 1972)

Finally, control may be introduced into an ex post facto investigation by stating and testing any alternative hypotheses that might be plausible explanations for the empirical outcomes of the study. A researcher has thus to beware of accepting the first likely explanation of relationships in an ex post facto study as necessarily the only or final one. A well-known instance to which reference has already been made is the presumed relationship between cigarette smoking and lung cancer. Government health officials have been quick to seize on the explanation that smoking causes lung cancer. Tobacco firms, however, have put forward an alternative hypothesis—that both smoking and lung cancer are possibly the result of a third, as yet unspecified, factor. In other words, the possibility that both the independent and dependent variables are simply two separate results of a single common cause cannot be ignored.

12 Experiments, quasi-experiments and single-case research

Introduction

The issue of causality and, hence, predictability has exercised the minds of researchers considerably (Smith, 1991:177). One response to the problem has been in qualitative research that defines causality in the terms of the participants (Chapter 6). Another response has been in the operation of control, and it finds its apotheosis in the experimental design. If rival causes or explanations can be eliminated from a study then, it is argued, clear causality can be established and the model can explain outcomes. Smith (1991:177) claims the high ground for the experimental approach, arguing that it is the only method that directly concerns itself with causality; this, clearly, is contestable, as we show in Chapters 6–9 and 13 of this book.

In Chapter 11, we described ex post facto research as experimentation in reverse in that ex post facto studies start with groups that are already different with regard to certain characteristics and then proceed to search, in retrospect, for the factors that brought about those differences. We then went on to cite Kerlinger's description of the experimental researcher's approach:

If x, then y; if frustration, then aggression…the researcher uses some method to measure x and then observes y to see if concomitant variation occurs.

(Kerlinger, 1970)

The essential feature of experimental research is that investigators deliberately control and manipulate the conditions which determine the events in which they are interested. At its simplest, an experiment involves making a change in the value of one variable—called the independent variable—and observing the effect of that change on another variable—called the dependent variable.

Imagine that we have been transported to a laboratory to investigate the properties of a new wonder-fertilizer that farmers could use on their cereal crops, let us say wheat (Morrison, 1993:44–5). The scientist would take the bag of wheat seed and randomly split it into two equal parts. One part would be grown under normal existing conditions—controlled and measured amounts of soil, warmth, water and light, and no other factors. This would be called the control group. The other part would be grown under the same conditions—the same controlled and measured amounts of soil, warmth, water and light as the control group—but, additionally, the new wonder-fertilizer. Then, four months later, the two groups are examined and their growth measured. The control group has grown half a metre and each ear of wheat is in place, but the seeds are small. The experimental group, by contrast, has grown half a metre as well but has significantly more seeds on each ear, and the seeds are larger, fuller and more robust.

The scientist concludes that, because both groups came into contact with nothing other than measured amounts of soil, warmth, water and light, it could not have been anything else but the new wonder-fertilizer that caused the experimental group to flourish so well. The key factors in the experiment were:

• the random allocation of the whole bag of wheat into two matched groups (the control and the experimental group), involving the initial measurement of the size of the wheat to ensure that it was the same for both groups (i.e. the pretest);
• the identification of key variables (soil, warmth, water and light);
• the control of the key variables (the same amounts to each group);
• the exclusion of any other variables;
• the giving of the special treatment (the intervention) to the experimental group whilst holding every other variable constant for the two groups;
• the final measurement of yield and growth (the post-test);
• the comparison of one group with another;
• the stage of generalization—that this new wonder-fertilizer improves yield and growth under a given set of conditions.

This model, premised on notions of isolation and control of variables in order to establish causality, may be appropriate for a laboratory, though whether, in fact, a social situation either ever could become the antiseptic, artificial world of the laboratory or should become such a world is both an empirical and a moral question respectively. Further, the ethical dilemmas of treating humans as manipulable, controllable and inanimate are considerable (see Chapter 2). However, let us pursue the experimental model further.

Frequently in learning experiments in classroom settings the independent variable is a stimulus of some kind, a new method in arithmetical computation for example, and the dependent variable is a response, the time taken to do twenty problems using the new method. Most empirical studies in educational settings, however, are quasi-experimental rather than experimental. The single most important difference between the quasi-experiment and the true experiment is that in the former case the researcher undertakes his study with groups that are intact, that is to say, the groups have been constituted by means other than random selection. We begin by identifying the essential features of pre-experimental, true experimental and quasi-experimental designs, our intention being to introduce the reader to the meaning and purpose of control in educational experimentation.

Designs in educational experimentation

In the outline of research designs that follows we use symbols and conventions from Campbell and Stanley (1963):

1 X represents the exposure of a group to an experimental variable or event, the effects of which are to be measured.
2 O refers to the process of observation or measurement.
3 Xs and Os in a given row are applied to the same persons.
4 Left to right order indicates temporal sequence.
5 Xs and Os vertical to one another are simultaneous.
6 R indicates random assignment to separate treatment groups.
7 Parallel rows unseparated by dashes represent comparison groups equated by randomization, while those separated by a dashed line represent groups not equated by random assignment.

A pre-experimental design: the one group pretest-post-test

Very often, reports about the value of a new teaching method or interest aroused by some curriculum innovation or other reveal that a researcher has measured a group on a dependent variable (O1), for example, attitudes towards minority groups, and then introduced an experimental manipulation (X), perhaps a ten-week curriculum project designed to increase tolerance of ethnic minorities. Following the experimental treatment, the researcher has again measured group attitudes (O2) and proceeded to account for differences between pretest and post-test scores by reference to the effects of X.

The one group pretest-post-test design can be represented as:

O1   X   O2

Suppose that just such a project has been undertaken and that the researcher finds that O2 scores indicate greater tolerance of ethnic minorities than O1 scores. How justified is she in attributing the cause of O1–O2 differences to the experimental treatment (X), that is, the term's project work? At first glance the assumption of causality seems reasonable enough. The situation is not that simple, however. Compare for a moment the circumstances represented in our hypothetical educational example with those which typically obtain in experiments in the physical sciences. A physicist who applies heat to a metal bar can confidently attribute the observed expansion to the rise in temperature that she has introduced because within the confines of her laboratory she has excluded (i.e. controlled) all other extraneous sources of variation (this example is suggested by Pilliner, 1973).

The same degree of control can never be attained in educational experimentation. At this point readers may care to reflect upon some possible influences other than the ten-week curriculum project that might account for the O1–O2 differences in our hypothetical educational example.

They may conclude that factors to do with the pupils, the teacher, the school, the classroom organization, the curriculum materials and their presentation, the way that the subjects' attitudes were measured, to say nothing of the thousand and one other events that occurred in and about the school during the course of the term's work, might all have exerted some influence upon the observed differences in attitude. These kinds of extraneous variables, which are outside the experimenters' control in one-group pretest-post-test designs, threaten to invalidate their research efforts. We identify a number of such threats to the validity of educational experimentation in Chapter 5.

A 'true' experimental design: the pretest-post-test control group design

A complete exposition of experimental designs is beyond the scope of this chapter. In the brief outline that follows, we have selected one design from the comprehensive treatment of the subject by Campbell and Stanley (1963) in order to identify the essential features of what they term a 'true experimental' and what Kerlinger (1970) refers to as a 'good' design. Along with its variants, the chosen design is commonly used in educational experimentation.

The pretest-post-test control group design can be represented as:

RO1   X   O2
RO3        O4

It differs from the pre-experimental design that we have just described in that it involves the use of two groups which have been constituted by randomization. As Kerlinger observes, in theory, random assignment to E and C conditions controls all possible independent variables. In practice, of course, it is only when enough subjects are included in the experiment that the principle of randomization has a chance to operate as a powerful control. However, the effects of randomization even with a small number of subjects are well illustrated in Box 12.1.

Randomization, then, ensures the greater likelihood of equivalence, that is, the apportioning1 out between the experimental and control groups of any other factors or characteristics of the subjects which might conceivably affect the experimental variables in which the researcher is interested. It is, as Kerlinger (1970) notes, the addition of the control group in our present example and the random assignment of subjects to E and C groups that radically alters the situation from that which obtains in the pre-experimental design outlined earlier. For if the groups are made equivalent, then any so-called 'clouding' effects should be present in both groups.


If the mental ages of the children of the experimental group increase, so should the mental ages of the children of the control group… If something happens to affect the experimental subjects between the pretest and the post-test, this something should also affect the subjects of the control groups.

(Kerlinger, 1970)

So strong is this simple and elegant true experimental design that all the threats to internal validity identified by Campbell and Stanley (1963) are controlled in the pretest-post-test control group design.
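By way of illustration, data from such a design are often analysed by comparing pretest-to-post-test gains in the two randomly formed groups. The sketch below, assuming Python's scipy library and wholly invented scores, shows one such comparison; it is an illustration, not a prescribed analysis.

    # Comparing E and C gain scores in a pretest-post-test
    # control group design. All scores are invented.
    import numpy as np
    from scipy.stats import ttest_ind

    e_pre = np.array([42, 38, 45, 40, 37, 44])
    e_post = np.array([55, 49, 58, 51, 47, 57])
    c_pre = np.array([41, 39, 44, 38, 40, 43])
    c_post = np.array([45, 41, 47, 40, 43, 46])

    t, p = ttest_ind(e_post - e_pre, c_post - c_pre)
    print(f"t = {t:.2f}, p = {p:.3f}")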

One problem that has been identified with this particular experimental design is the interaction effect of testing. Good (1963) explains that whereas the various threats to the validity of the experiments listed in Chapter 5 can be thought of as main effects, manifesting themselves in mean differences independently of the presence of other variables, interaction effects, as their name implies, are joint effects and may occur even when no main effects are present.2 For example, an interaction effect may occur as a result of the pretest measure sensitizing the subjects to the experimental variable.3 Interaction effects can be controlled for by adding to the pretest-post-test control group design two more groups that do not experience the pretest measures. The result is a four-group design, as suggested by Solomon. Later in the chapter, we describe an educational study which built into a pretest-post-test group design a further control group to take account of the possibility of pretest sensitization.

A quasi-experimental design: the non-equivalent control group design

Often in educational research, it is simply not possible for investigators to undertake true experiments. At best, they may be able to employ something approaching a true experimental design in which they have control over what Campbell and Stanley (1963) refer to as 'the who and to whom of measurement' but lack control over 'the when and to whom of exposure', or the randomization of exposures—essential if true experimentation is to take place. These situations are quasi-experimental and the methodologies employed by researchers are termed quasi-experimental designs. (Kerlinger (1970) refers to quasi-experimental situations as 'compromise designs', an apt description when applied to much educational research where the random selection or random assignment of schools and classrooms is quite impracticable.)

One of the most commonly used quasi-experimental designs in educational research can be represented as:

O1   X   O2
---------------
O3        O4

Select twenty cards from a pack, ten red and ten black. Shuffle and deal into two ten-card piles. Now count the number of red cards and black cards in either pile and record the results. Repeat the whole sequence many times, recording the results each time.

You will soon convince yourself that the most likely distribution of reds and blacks in a pile is five in each; the next most likely, six red (or black) and four black (or red); and so on. You will be lucky (or unlucky for the purposes of the demonstration!) to achieve one pile entirely of red and the other entirely of black cards. The probability of this happening is 1 in 92,378! On the other hand, the probability of obtaining a 'mix' of not more than 6 of one colour and 4 of the other is about 82 in 100.

If you now imagine the red cards to stand for the 'better' ten children and the black cards for the 'poorer' ten children in a class of twenty, you will conclude that the operation of the laws of chance alone will most probably give you closely equivalent 'mixes' of 'better' and 'poorer' children in the experimental and control groups.

Box 12.1 The effects of randomization
Source: Adapted from Pilliner, 1973
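Readers with access to a computer can repeat Pilliner's demonstration many thousands of times in a few lines of code. The following Python sketch shuffles the twenty cards repeatedly and tallies the red/black split in one pile; the relative frequencies printed should approach the exact probabilities quoted in Box 12.1.

    # Simulating the card demonstration in Box 12.1.
    import random
    from collections import Counter

    TRIALS = 100_000
    deck = ["red"] * 10 + ["black"] * 10
    splits = Counter()

    for _ in range(TRIALS):
        random.shuffle(deck)
        pile = deck[:10]                 # deal one ten-card pile
        splits[pile.count("red")] += 1   # record the reds in that pile

    for reds in sorted(splits):
        print(f"{reds} red / {10 - reds} black: {splits[reds] / TRIALS:.4f}")

    # A 5-5 split appears about 34 per cent of the time, a mix no more
    # extreme than 6-4 about 82 per cent, and an all-red pile about
    # 1 in 184,756 (1 in 92,378 for either pile being all one colour).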


The dashed line separating the parallel rows in the diagram of the non-equivalent control group indicates that the experimental and control groups have not been equated by randomization—hence the term 'non-equivalent'. The addition of a control group makes the present design a decided improvement over the one group pretest-post-test design, for to the degree that experimenters can make E and C groups as equivalent as possible, they can avoid the equivocality of interpretations that plague the pre-experimental design discussed earlier. The equivalence of groups can be strengthened by matching, followed by random assignment to E and C treatments.

Where matching is not possible, the researcher is advised to use samples from the same population or samples that are as alike as possible (Kerlinger, 1970). Where intact groups differ substantially, however, matching is unsatisfactory due to regression effects which lead to different group means on post-test measures. Campbell and Stanley put it this way:

If [in the non-equivalent control group design] the means of the groups are substantially different, then the process of matching not only fails to provide the intended equation but in addition insures the occurrence of unwanted regression effects. It becomes predictably certain that the two groups will differ on their post-test scores altogether independently of any effects of X, and that this difference will vary directly with the difference between the total populations from which the selection was made and inversely with the test-retest correlation.

(Campbell and Stanley, 1963)

Procedures in conducting experimental research

In Chapter 11, we identified a sequence of steps in carrying out an ex post facto study. An experimental investigation must also follow a set of logical procedures. Those that we now enumerate, however, should be treated with some circumspection. It is extraordinarily difficult (and indeed, foolhardy) to lay down clear-cut rules as guides to experimental research. At best, we can identify an ideal route to be followed, knowing full well that educational research rarely proceeds in such a systematic fashion. (For a detailed discussion of the practical issues in educational experimentation, see Evans (1978), Chapter 4, 'Planning experimental work', Riecken and Boruch (1974), and Bennett and Lumsdaine (1975).)

First, the researcher must identify and define the research problem as precisely as possible, always supposing that the problem is amenable to experimental methods.

Second, she must formulate hypotheses that she wishes to test. This involves making predictions about relationships between specific variables and at the same time making decisions about other variables that are to be excluded from the experiment by means of controls. Variables, remember, must have two properties. First, they must be measurable. Physical fitness, for example, is not directly measurable until it has been operationally defined. Making the variable 'physical fitness' operational means simply defining it by letting something else that is measurable stand for it—a gymnastics test, perhaps. Second, the proxy variable must be a valid indicator of the hypothetical variable in which one is interested. That is to say, a gymnastics test probably is a reasonable proxy for physical fitness; height, on the other hand, most certainly is not.

Third, the researcher must select appropriate levels at which to test the independent variables in order for differences to be observed. The experimenter will vary the stimuli at such levels as are of practical interest in the real-life situation. For example, comparing reading periods of forty-four minutes, or forty-six minutes, with timetabled reading lessons of forty-five minutes is scarcely likely to result in observable differences in attainment.

Fourth, in planning the design of the experiment, the researcher must take account of the population to which she wishes to generalize her results. This involves her in decisions over sample sizes and sampling methods.

Fifth, with problems of validity in mind, the researcher must select instruments, choose tests and decide upon appropriate methods of analysis.

Sixth, before embarking upon the actual experiment, the researcher must pilot test the experimental procedures to identify possible snags in connection with any aspect of the investigation (Simon, 1978).

Seventh, during the experiment itself, the researcher must endeavour to follow tested and agreed-on procedures to the letter. The standardization of instructions, the exact timing of experimental sequences, the meticulous recording and checking of observations—these are the hallmarks of the competent researcher.

With her data collected, the researcher faces the most important part of the whole enterprise. Processing data, analysing results and drafting reports are all extremely demanding activities, both in intellectual effort and time.

Borg and Gall (1979:547) set out a useful series of steps in the planning and conduct of an experiment:

Step 1 Carry out a measure of the dependent variable.
Step 2 Assign participants to matched pairs, based on the scores and measures established from Step 1.
Step 3 Randomly assign one person from each pair to the control group and the other to the experimental group.
Step 4 Administer the experimental treatment/intervention to the experimental group and, if appropriate, a placebo to the control group. Ensure that the control group is not subject to the intervention.
Step 5 Carry out a measure of the dependent variable with both groups and compare/measure them in order to determine the effect and its size on the dependent variable.

Borg and Gall indicate that difficulties arise in the close matching of the sample of the control and experimental groups. This involves careful identification of the variables on which the matching must take place. They suggest (p. 547) that matching on a number of variables that correlate with the dependent variable is more likely to reduce errors than matching on a single variable. The problem, of course, is that the greater the number of variables that have to be matched, the harder it is actually to find the sample of people who are matched. Hence the balance must be struck between having too few variables, such that error can occur, and having so many variables that it is impossible to draw a sample.

Further, the authors draw attention to the need to specify the degree of exactitude (or variance) of the match. For example, if the subjects were to be matched on, say, linguistic ability as measured in a standardized test, it is important to define the limits of variability that will be used to define the matching (e.g. ±3 points). As before, the greater the degree of precision in the matching here, the closer will be the match, but the greater the degree of precision the harder it will be to find an exactly matched sample.

One way of addressing this issue is to place all the subjects in rank order on the basis of the scores or measures of the dependent variable. Then the first two subjects become one matched pair (which one is allocated to the control group and which to the experimental group is done randomly, e.g. by tossing a coin), subjects three and four become the next matched pair, subjects five and six become the next matched pair, and so on until the sample is drawn. Here the loss of precision is counterbalanced by the avoidance of the loss of subjects.
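A minimal sketch of this rank-order pairing procedure in Python, with invented pretest scores and a random 'coin toss' within each pair:

    # Rank-order matched pairs with random assignment within pairs.
    # The subjects and their scores are invented for illustration.
    import random

    scores = {"s1": 88, "s2": 84, "s3": 83, "s4": 79, "s5": 75, "s6": 71}

    ranked = sorted(scores, key=scores.get, reverse=True)
    experimental, control = [], []

    # Pair adjacent subjects in the ranking, then 'toss a coin'
    # to decide which member of each pair joins which group.
    for a, b in zip(ranked[::2], ranked[1::2]):
        first, second = random.sample([a, b], 2)
        experimental.append(first)
        control.append(second)

    print("E:", experimental)
    print("C:", control)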

The alternative to matching that has been discussed earlier in the chapter is randomization. Smith (1991:215) suggests that matching is most widely used in quasi-experimental and non-experimental research, and is a far inferior means of ruling out alternative causal explanations than randomization. Randomization, he argues, produces equivalence over a whole range of variables, whereas matching produces equivalence over only a few named variables. The use of randomized controlled trials (RCTs), a method used in medicine, is a putative way of establishing causality and generalizability (though, in medicine, the sample sizes for some RCTs are necessarily so small—there being limited sufferers from a particular complaint—that randomization is seriously compromised).


A powerful advocacy of RCTs for planning and evaluation is provided by Boruch (1997). Indeed, he argues (p. 69) that the problem of poor experimental controls has led to highly questionable claims being made about the success of programmes. Examples of the use of RCTs can be seen in Maynard and Chalmers (1997).

Mitchell and Jolley (1988:103) pose three important questions that researchers need to consider when comparing two groups:

• Are the two groups equal at the commencement of the experiment?
• Would the two groups have grown apart naturally, regardless of the intervention?
• To what extent has initial measurement error of the two groups been a contributory factor in differences between scores?

Examples from educational research

Example 1: a pre-experimental design

A pre-experimental design was used in a study involving the 1991–2 Postgraduate Diploma in Education group following a course of training to equip them to teach social studies in senior secondary schools in Botswana. The researcher wished to find out whether the programme of studies he had devised would effect changes in the students' orientations towards social studies teaching. To that end, he employed a research instrument, the Barth/Shermis Social Studies Preference Scale (BSSPS).

The BSSPS provides measures of what purport to be three social studies traditions or philosophical orientations, the oldest of which, Citizenship Transmission, involves indoctrination of the young in the basic values of a society. The second orientation, called the Social Science, is held to relate to the acquisition of knowledge-gathering skills based on the mastery of social science concepts and processes. The third tradition, Reflective Inquiry, emphasizes the process of inquiry. Forty-eight Postgraduate Diploma students were administered the BSSPS during the first session of their one-year course of study. At the end of the programme, the BSSPS was again completed in order to determine whether changes had occurred in students' philosophical orientations. Briefly, the 'preferred orientation' in the pretest and post-test was the criterion measure, the two orientations least preferred being ignored. Broadly speaking, students tended to move from a majority holding a Citizenship Transmission orientation at the beginning of the course to a greater affirmation of the Social Science and the Reflective Inquiry traditions. Using the symbols and conventions adopted earlier to represent research designs, we can illustrate the Botswana study as:

O1   X   O2

The briefest consideration reveals inadequacies in the design. Indeed, Campbell and Stanley describe the one group pretest-post-test design as 'a "bad example" to illustrate several of the confounded extraneous variables that can jeopardize internal validity. These variables offer plausible hypotheses explaining an O1–O2 difference, rival to the hypothesis that X caused the difference' (Campbell and Stanley, 1963). The investigator is rightly cautious in his conclusions: 'it is possible to say that the social studies course might be responsible for this phenomenon, although other extraneous variables might be operating' (Adeyemi, 1992, emphasis added). Somewhat ingenuously he puts his finger on one potential explanation, that the changes could have occurred among his intending teachers because the shift from 'inculcation to rational decision-making was in line with the recommendation of the Nine Year Social Studies Syllabus issued by the Botswana Ministry of Education in 1989' (Adeyemi, 1992).

Example 2: a quasi-experimental design

Mason, Mason and Quayle's longitudinal study took place between 1984 and 1992. Its principal aim was to test whether the explicit teaching of linguistic features of GCSE textbooks, coursework and examinations would produce an improvement in performance across the secondary curriculum. The design adopted in the study may be represented as:

O1   X   O2
---------------
O3        O4

This is, of course, the non-equivalent control group design outlined earlier in this chapter, in which parallel rows separated by dashed lines represent groups that have not been equated by random assignment.

In brief, the researchers adopted a methodology akin to teaching English as a foreign language and applied this to Years 7–9 in the study school and two neighbouring schools, monitoring the pupils at every stage and comparing their performance with control groups drawn from the three schools. Inevitably, because experimental and control groups were not randomly allocated, there were significant differences in the performance of some groups on pre-treatment measures such as the York Language Aptitude Test. Moreover, because no standardized reading tests of sufficient difficulty were available as post-treatment measures, tests had to be devised by the researchers, who provide no details as to their validity or reliability. These difficulties notwithstanding, pupils in the experimental groups taking public examinations in 1990 and 1991 showed substantial gains in respect of the percentage increases of those obtaining GCSE Grades A–C. The researchers note that during the three years 1989 to 1991, 'no other significant change in the policy, teaching staff or organization of the school took place which could account for this dramatic improvement of 50 per cent' (Mason et al., 1992).

Although the researchers attempted to control extraneous variables, readers may well ask whether threats to internal and external validity (see Chapter 5) were sufficiently met as to allow such a categorical conclusion as 'the pupils…achieved greater success in public examinations as a result of taking part in the project' (Mason et al., 1992).

Example 3: a 'true' experimental design

Another investigation (Bhadwal and Panda, 1991), concerned with effecting improvements in students' performance as a consequence of changing teaching strategies, used a more robust experimental design. In rural India, the researchers drew a sample of seventy-eight pupils, matched by socio-economic backgrounds and non-verbal IQs, from three primary schools that were themselves matched by location, physical facilities, teachers' qualifications and skills, school evaluation procedures and degree of parental involvement. Twenty-six pupils were randomly selected to comprise the experimental group, the remaining fifty-two being equally divided into two control groups. Before the introduction of the changed teaching strategies to the experimental group, all three groups completed questionnaires on their study habits and attitudes. These instruments were specifically designed for use with younger children and were subjected to the usual item analyses, test-retest and split-half reliability inspections. Bhadwal and Panda's research design can be represented as:

[design diagram not reproduced here]

Recalling Kerlinger's discussion of a 'good' experimental design, the version of the pretest-post-test control group design employed here (unlike the design used in Example 2 above) resorted to randomization which, in theory, controls all possible independent variables. Kerlinger adds, however, 'in practice, it is only when enough subjects are included in the experiment that the principle of randomization has a chance to operate as a powerful control' (Kerlinger, 1970). It is doubtful whether twenty-six pupils in each of the three groups in Bhadwal and Panda's study constituted 'enough subjects'.

In addition to the matching procedures in drawing up the sample, and the random allocation of pupils to experimental and control groups, the researchers also used analysis of covariance as a further means of controlling for initial differences between E and C groups on their pretest mean scores on the independent variables, study habits and attitudes.

The experimental programme4 involved improving teaching skills, classroom organization, teaching aids, pupil participation, remedial help, peer-tutoring and continuous evaluation. In addition, provision was also made in the experimental group for ensuring parental involvement and extra reading materials. It would be startling if such a package of teaching aids and curriculum strategies did not effect significant changes in their recipients, and such was the case in the experimental results. The Experimental Group made highly significant gains in respect of its level of study habits as compared with Control Group 2, where students did not show a marked change. What did surprise the investigators, we suspect, was the significant increase in levels of study habits in Control Group 1. Maybe, they opine, this unexpected result occurred because Control Group 1 pupils were tested immediately prior to the beginning of their annual examinations. On the other hand, they concede, some unaccountable variables might have been operating. There is, surely, a lesson here for all researchers!
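Analysis of covariance of the kind used here can be expressed as a regression of post-test scores on group membership with the pretest entered as a covariate. A minimal sketch, assuming the pandas and statsmodels libraries and an invented data file (this is not Bhadwal and Panda's dataset or code):

    # ANCOVA: post-test scores by group, adjusting for pretest scores.
    # File and column names are invented for illustration.
    import pandas as pd
    import statsmodels.formula.api as smf
    from statsmodels.stats.anova import anova_lm

    df = pd.read_csv("study_habits.csv")   # columns: post, pre, group

    model = smf.ols("post ~ pre + C(group)", data=df).fit()
    print(anova_lm(model, typ=2))          # group effect net of pretest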

Single-case research: ABAB design

Increasingly, in recent years, single-case research as an experimental methodology has extended to such diverse fields as clinical psychology, medicine, education, social work, psychiatry and counselling. Most of the single-case studies carried out in these (and other) areas share the following characteristics:

• they involve the continuous assessment of some aspect of human behaviour over a period of time, requiring on the part of the researcher the administration of measures on multiple occasions within separate phases of a study;
• they involve 'intervention effects' which are replicated in the same subject(s) over time.

Continuous assessment measures are used as abasis for drawing inferences about the effective-ness of intervention procedures.

The characteristics of single-case research studies are discussed by Kazdin (1982) in terms of ABAB designs, the basic experimental format in most single-case research. ABAB designs, Kazdin observes, consist of a family of procedures in which observations of performance are made over time for a given client or group of clients. Over the course of the investigation, changes are made in the experimental conditions to which the client is exposed. The basic rationale of the ABAB design is illustrated in Box 12.2. What it does is this: it examines the effects of an intervention by alternating the baseline condition (the A phase), when no intervention is in effect, with the intervention condition (the B phase). The A and B phases are then repeated to complete the four phases. As Kazdin says, the effects of the intervention are clear if performance improves during the first intervention phase, reverts to or approaches original baseline levels of performance when the treatment is withdrawn, and improves again when treatment is recommenced in the second intervention phase.

Box 12.2 The ABAB design
Source: adapted from Kazdin, 1982
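The phase logic can be expressed very simply in code. A minimal sketch follows, using an invented series of daily disruption counts (not Kazdin’s data): the intervention effect shows itself if the B-phase means fall relative to the adjacent A-phase means.

```python
# Summarizing continuous-assessment data by ABAB phase: the intervention
# effect is visible if the B-phase means drop relative to the A-phase means.
import numpy as np

phases = {
    "A1 (baseline)":      [9, 8, 10, 9, 11],
    "B1 (intervention)":  [4, 3, 2, 3, 2],
    "A2 (withdrawal)":    [7, 8, 9, 8, 9],
    "B2 (reinstatement)": [3, 2, 2, 1, 2],
}

for name, series in phases.items():
    data = np.array(series)
    # The mean describes the level in each phase; the average first difference
    # flags any baseline trend, an ambiguity noted later in this section.
    print(f"{name}: mean = {data.mean():.1f}, trend = {np.diff(data).mean():+.2f}")
```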

An example of the application of the ABAB design in an educational setting is provided by Dietz (1977)5, whose single-case study sought to measure the effect that a teacher could have upon the disruptive behaviour of an adolescent boy whose persistent talking disturbed his fellow classmates in a special education class.

In order to decrease the unwelcome behaviour, a reinforcement programme was devised in which the boy could earn extra time with the teacher by decreasing the number of times he called out. The boy was told that when he made three (or fewer) interruptions during any 55-minute class period, the teacher would spend extra time working with him. In the technical language of behaviour modification theory, the pupil would receive reinforcing consequences when he was able to show a low rate of disruptive behaviour (in Box 12.3 this is referred to as ‘differential reinforcement of low rates’, or DRL).

When the boy was able to desist from talking aloud on fewer than three occasions during any timetabled period, he was rewarded by the teacher spending fifteen minutes with him, helping him with his learning tasks. The pattern of results displayed in Box 12.3 shows the considerable changes that occurred in the boy’s behaviour when the intervention procedures were carried out, and the substantial increases in disruptions towards baseline levels when the teacher’s rewarding strategies were withdrawn. Finally, when the intervention was reinstated, the boy’s behaviour improved once again.

Box 12.3 An ABAB design in an educational setting
Source: Kazdin, 1982

By way of conclusion, the single-case research design is uniquely able to provide an experimental technique for evaluating interventions for the individual subject. Moreover, such interventions can be directed towards the particular subject or group and replicated over time or across behaviours, situations or persons. Single-case research offers an alternative strategy to the more usual methodologies based on between-group designs. There are, however, a number of problems that arise in connection with the use of single-case designs, having to do with ambiguities introduced by trends and variations in baseline phase data, and with the generality of results from single-case research. The interested reader is directed to Kazdin (1982), Borg (1981) and Vasta (1979).6

Meta-analysis in educational research

The study by Bhadwal and Panda (1991) is typical of research undertaken to explore the effectiveness of classroom methods. As often as not, such studies fail to see the light of day, particularly when they form part of the research requirements for a higher degree. Meta-analysis is, simply, the analysis of other analyses. It involves aggregating the results of other studies into a coherent account. Among the advantages of using meta-analysis, Fitz-Gibbon cites the following:

– Humble, small-scale reports which have simply been gathering dust may now become useful.
– Small-scale research conducted by individual students and lecturers will be valuable, since meta-analysis provides a way of co-ordinating results drawn from many studies without having to co-ordinate the studies themselves.
– For historians, a whole new genre of studies is created—the study of how effect sizes vary over time, relating this to historical changes.

(Fitz-Gibbon, 1985:46)

McGaw (1997:371) suggests that quantitative meta-analysis replaces intuition, which is frequently reported narratively (Wood, 1995:389), as a means of synthesizing different research studies transparently and explicitly (a desideratum in many synthetic studies (Jackson, 1980)), particularly when they differ very substantially. Narrative reviews, suggest Jackson (1980), Cook et al. (1992:13) and Wood (1995:390), are prone to:

• lack of comprehensiveness, being selective and going only to subsets of studies;
• misrepresentation and crude representation of research findings;
• over-reliance on significance tests as a means of supporting hypotheses, thereby overlooking the point that sample size exerts a major effect on significance levels, and overlooking effect size;
• reviewers’ failure to recognize that random sampling error can play a part in creating variations in findings amongst studies;
• overlooking differing and conflicting research findings;
• reviewers’ failure to examine critically the evidence, methods and conclusions of previous reviews;
• overlooking the extent to which findings from research are mediated by the characteristics of the sample;
• overlooking the importance of intervening variables in research;
• unreplicability, because the procedures for integrating the research findings have not been made explicit.

Over the past few years a quantitative method for synthesizing research results has been developed by Glass et al. (1978; 1981) and others (e.g. Hedges and Olkin, 1985; Hedges, 1990; Rosenthal, 1991) to supersede narrative intuition. Meta-analysis, essentially the ‘analysis of analyses’, is a means of quantitatively (a) identifying generalizations from a range of separate and disparate studies, and (b) discovering inadequacies in existing research, such that new emphases for future research can be proposed. It is simple to use and easy to understand, though the statistical treatment that underpins it is somewhat complex. It involves the quantification and synthesis of findings from separate studies on some common measure, usually an aggregate of effect size estimates, together with an analysis of the relationship between effect size and other features of the studies being synthesized. Statistical treatments are applied to attenuate the effects of contaminating factors, e.g. sampling error, measurement error and range restriction. Research findings are coded into substantive categories for generalizations to be made (Glass et al., 1981), such that a consistency of findings is discovered which, through the traditional means of intuition and narrative review, would have been missed.

Fitz-Gibbon (1985:45) explains the technique by suggesting that in meta-analysis the effects of variables are examined in terms of their effect size, that is to say, in terms of how much difference they make rather than only in terms of whether or not the effects are statistically significant at some arbitrary level such as 5 per cent. Because, with effect sizes, it becomes easier to concentrate on the educational significance of a finding rather than trying to assess its importance by its statistical significance, we may finally see statistical significance kept in its place as just one of many possible threats to internal validity. The move towards elevating effect size over significance levels is hugely important (see also Chapter 10), and signals an emphasis on ‘fitness for purpose’ (the size of the effect having to be suitable for the researcher’s purposes) over arbitrary cut-off points in significance levels as determinants of utility.
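The point is easily demonstrated. In the sketch below (synthetic data, not drawn from any of the studies cited), a trivial difference reaches statistical significance in a very large sample, while a much larger difference fails to do so in a small one:

```python
# Statistical significance tracks sample size as much as effect size: a
# trivial effect in a huge sample "succeeds" where a substantial effect
# in a tiny sample "fails".
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

big_a = rng.normal(0.00, 1, 20_000)   # trivial effect (d = 0.05), n = 20,000
big_b = rng.normal(0.05, 1, 20_000)
print(stats.ttest_ind(big_a, big_b).pvalue)      # typically p < .05

small_a = rng.normal(0.0, 1, 10)      # substantial effect (d = 0.5), n = 10
small_b = rng.normal(0.5, 1, 10)
print(stats.ttest_ind(small_a, small_b).pvalue)  # often p > .05
```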

The term ‘meta-analysis’ originated in 1976 (Glass, 1976), and early forms of meta-analysis used calculations of combined probabilities and the frequencies with which results fell into defined categories (e.g. statistically significant at given levels), though problems of different sample sizes confounded rigour (e.g. large samples would yield significance for trivial effects, whilst important data from small samples would be overlooked because they failed to reach statistical significance) (Light and Smith, 1971; Glass et al., 1981; McGaw, 1997:371). Glass (1976) and Glass et al. (1981) suggested three levels of analysis: (a) primary analysis of the data; (b) secondary analysis, a re-analysis using different statistics; (c) meta-analysis, analysing the results of several studies statistically in order to integrate the findings. Glass et al. (1981) and Hunter et al. (1982) suggest several stages in the procedure:

Step 1 Identify the variables for focus (independent and dependent).
Step 2 Identify all the studies which feature the variables in which the researcher is interested.
Step 3 Code each study for those characteristics that might be predictors of outcomes and effect sizes (e.g. age of participants, gender, ethnicity, duration of the intervention).
Step 4 Estimate the effect sizes through calculation for each pair of variables (dependent and independent) (see Glass, 1977), weighting the effect size by the sample size.
Step 5 Calculate the mean and the standard deviation of effect sizes across the studies, i.e. the variance across the studies.
Step 6 Determine the effects of sampling errors, measurement errors and range restriction.
Step 7 If a large proportion of the variance is attributable to the issues in Step 6, then the average effect size can be considered an accurate estimate of the relationships between variables.
Step 8 If a large proportion of the variance is not attributable to the issues in Step 6, then review those characteristics of interest which correlate with the study effects.
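Steps 4 and 5 can be illustrated with a short sketch. The study summaries below are invented; the effect size used is Glass’s delta (the difference between the experimental and control means divided by the control-group standard deviation), weighted by sample size:

```python
# Steps 4-5: compute a per-study effect size, then the sample-size-weighted
# mean and variance of effect sizes across studies.
import numpy as np

# (mean_E, mean_C, sd_C, n) for a handful of hypothetical studies
studies = [
    (54.0, 50.0, 10.0, 40),
    (62.0, 57.0, 12.0, 120),
    (48.5, 48.0, 9.0, 60),
    (71.0, 66.0, 11.0, 30),
]

deltas = np.array([(m_e - m_c) / sd for m_e, m_c, sd, _ in studies])
weights = np.array([n for *_, n in studies], dtype=float)

mean_es = np.average(deltas, weights=weights)
var_es = np.average((deltas - mean_es) ** 2, weights=weights)
print(f"weighted mean effect size = {mean_es:.2f}, variance = {var_es:.3f}")
```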

Cook et al. (1992:7–12) set out a five-stage model for an integrative review as a research process, covering:

• problem formulation (where a high-quality meta-analysis must be rigorous in its attention to the design, conduct and analysis of the review);
• data collection (where the sampling of studies for review has to demonstrate fitness for purpose);
• data retrieval and analysis (where threats to validity in non-experimental research—of which integrative review is an example—are addressed; validity here must demonstrate fitness for purpose, reliability in coding, and attention to the methodological rigour of the original pieces of research);
• analysis and interpretation (where the accumulated findings of several pieces of research should be regarded as complex data points that have to be interpreted by meticulous statistical analysis).

Fitz-Gibbon (1984:141–2) sets out four steps in conducting a meta-analysis:

Step 1 Finding studies (e.g. published, unpublished, reviews) from which effect sizes can be computed.
Step 2 Coding the study characteristics (e.g. date, publication status, design characteristics, quality of design, status of researcher).
Step 3 Measuring the effect sizes (e.g. locating the experimental group as a z-score in the control group distribution) so that outcomes can be measured on a common scale, controlling for ‘lumpy data’ (non-independent data from a large data set).
Step 4 Correlating effect sizes with context variables (e.g. to identify differences between well-controlled and poorly controlled studies).
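Step 4 lends itself to an equally brief sketch, again with invented numbers: each study’s effect size is paired with a coded context variable (here, a hypothetical 0/1 rating of whether the study was well controlled), and the two are correlated:

```python
# Step 4: correlate effect sizes with a coded context variable to see
# whether, e.g., design quality predicts the size of the reported effect.
import numpy as np

effect_sizes    = np.array([0.9, 0.7, 0.2, 0.1, 0.8, 0.3])
well_controlled = np.array([1,   1,   0,   0,   1,   0])   # coded per study

r = np.corrcoef(effect_sizes, well_controlled)[0, 1]
print(f"r(quality, effect size) = {r:.2f}")
```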

Wood (1995:393) suggests that effect size can be calculated by dividing the significance level by the sample size. Glass et al. (1981:29, 102) calculate the effect size as:

effect size = (mean of experimental group − mean of control group) / standard deviation of control group

Hedges (1981) and Hunter et al. (1982) suggest alternative equations to take account of differential weightings due to sample size variations. The two most frequently used indices of effect size are standardized mean differences and correlations (ibid.:373), though non-parametric statistics, e.g. the median, can be used. Lipsey (1992:93–100) sets out a series of statistical tests for working on effect sizes, effect size means and homogeneity. It is clear from this that Glass and others assume that meta-analysis can only be undertaken for a particular kind of research—the experimental type—rather than for all types of research; this might limit its applicability.

Glass et al. (1981) suggest that meta-analysis is particularly useful when it uses unpublished dissertations, as these often contain weaker correlations than those reported in published research, and hence act as a brake on misleading, more spectacular generalizations. Meta-analysis, it is claimed (Cooper and Rosenthal, 1980), is a means of avoiding Type II errors (discussed in Chapter 5—failing to find effects that really exist), of synthesizing research findings more rigorously and systematically, and of generating hypotheses for future research. However, Hedges and Olkin (1980) and Cook et al. (1992:297) show that Type II errors become more likely as the number of studies included in the sample increases.

Further, Rosenthal (1991) has indicated a method for avoiding Type I errors (finding an effect that, in fact, does not exist) that is based on establishing how many unpublished studies averaging a null result would need to have been undertaken to offset the group of published, statistically significant studies. For one example he shows a ratio of 277:1 of unpublished to published research, thereby indicating the limited bias in published research.
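Rosenthal’s ‘file-drawer’ check can be sketched as follows. The formula assumed here is his fail-safe N for a one-tailed .05 criterion; the Z-values are invented, not those behind his 277:1 example:

```python
# Fail-safe N: how many averaged-null, unpublished studies would be needed
# to pull the combined result of k published studies below significance.
# Assumed formula (Rosenthal): N_fs = (sum of Z)^2 / 1.645^2 - k.
import math

z_values = [2.1, 1.8, 2.5, 1.6, 2.9, 2.2]   # one Z per published study
k = len(z_values)

fail_safe_n = (sum(z_values) ** 2) / (1.645 ** 2) - k  # 1.645: one-tailed .05
print(f"fail-safe N = {math.ceil(fail_safe_n)} hypothetical null studies")
```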

Meta-analysis is not without its critics. Since so much depends upon the quality of the results that are to be synthesized, there is the danger that adherents may simply multiply the inadequacies of the data base and the limits of the sample (e.g. trying to compare the incomparable). Hunter et al. (1982) suggest that sampling error and the influence of other factors have to be addressed, and that these should account for less than 75 per cent of the variance in observed effect sizes if the results are to be acceptable and able to be coded into categories. The issue is clear here: coding categories have to declare their level of precision, their reliability (e.g. inter-coder reliability—the equivalent of inter-rater reliability, see Chapter 5) and validity (McGaw, 1997:376–7).

To the charge that selection bias will be as strong in meta-analysis—which embraces both published and unpublished research—as in solely published research, Glass et al. (1981:226–9) argue that it is necessary to counter the gross claims made in published research with the more cautious claims found in unpublished research.

Because the quantitative mode of (many) studies demands only a few common variables to be measured in each case, argues Tripp (1985),7 cumulation of the studies tends to increase sample size much more than it increases the complexity of the data in terms of the number of variables. Meta-analysis risks attempting to synthesize studies which are insufficiently similar to each other to permit this with any legitimacy (Glass et al., 1981:22; McGaw, 1997:372), other than at an unhelpful level of generality. The analogy here might be to try to keep together oil and water as ‘liquids’; meta-analysts would argue that differences between studies and their relationships to findings can be coded and addressed in meta-analysis. Eysenck (1978) suggests that early meta-evaluation studies mixed apples with oranges! Though Glass et al. (1981:218–20) refute this charge, it remains the case (McGaw, 1997) that there is a risk in meta-analysis of dealing indiscriminately with a large and sometimes incoherent body of research literature.
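One standard defence against pooling the unpoolable is a homogeneity check before aggregation. A minimal sketch follows, assuming Hedges and Olkin’s Q statistic (a technique the authors above allude to rather than spell out), with invented effect sizes and variances: if Q is large relative to a chi-square on k − 1 degrees of freedom, the studies are too heterogeneous to summarize with a single mean effect.

```python
# Homogeneity test: Q = sum of w_i * (d_i - d_bar)^2, with inverse-variance
# weights w_i = 1/v_i; a small p-value suggests "apples and oranges".
import numpy as np
from scipy import stats

d = np.array([0.10, 0.45, 0.30, 0.90, 0.05])  # invented study effect sizes
v = np.array([0.02, 0.05, 0.03, 0.04, 0.02])  # their estimated variances

w = 1.0 / v
d_bar = np.sum(w * d) / np.sum(w)             # pooled (fixed-effect) mean
Q = np.sum(w * (d - d_bar) ** 2)
p = stats.chi2.sf(Q, df=len(d) - 1)
print(f"Q = {Q:.2f}, p = {p:.4f}")
```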

It is unclear, too, how meta-analysis differentiates between ‘good’ and ‘bad’ research—e.g. between methodologically rigorous and poorly constructed research (Cook et al., 1992:297). Smith and Glass (1977) suggest that it is possible to use study findings regardless of their methodological quality, though Glass and Smith (1978) and Slavin (1984a, 1984b), in a study of the effects of class size, indicate that methodological quality does make a difference. Glass et al. (1981:220–6) effectively address the charge of using data from ‘poor’ studies, arguing, amongst other points, that many weak studies can add up to a strong conclusion (p. 221) and that the differences in the size of experimental effects between high-validity and low-validity studies are surprisingly small (p. 226).

Further, Wood (1995:296) suggests that meta-analysis oversimplifies results by concentrating on overall effects to the neglect of the interaction of intervening variables. To the charge that meta-analyses are unreliable because they are frequently conducted on large data sets in which multiple results derive from the same study (i.e. the data are non-independent), Glass et al. (1981:153–216) indicate how this can be addressed by using sophisticated data analysis techniques. Finally, a practical concern is the time required not only to use the easily discoverable studies (typically large-scale published studies) but to include the smaller-scale unpublished studies; the effect of neglecting the latter might be to build bias into the meta-analysis.

It is the traditional pursuit of generalizations from each quantitative study which has most hampered the development of a data base adequate to reflect the complexity of the social nature of education. The cumulative effects of ‘good’ and ‘bad’ experimental studies are graphically illustrated in Box 12.4.

An example of meta-analysis in educational research

Glass and Smith (1978) and Glass et al. (1981:35–44) identified seventy-seven empirical studies of the relationship between class size and pupil learning.8 These studies yielded 725 comparisons of the achievements of smaller and larger classes, the comparisons resting on data accumulated from nearly 900,000 pupils of all ages and aptitudes studying all manner of school subjects. Using regression analysis, the 725 comparisons were integrated into a single curve showing the relationship between class size and achievement in general. This curve revealed a definite inverse relationship between class size and pupil learning. When the researchers derived similar curves for a variety of circumstances that they hypothesized would alter the basic relationship (for example, grade level, subject taught, pupil ability, etc.), virtually none of these special circumstances altered the basic relationship. Only one factor substantially affected the curve—whether the original study controlled adequately, in the experimental sense, for initial differences among pupils and teachers in smaller and larger classes. Adequate and inadequate control curves are set out in Box 12.4.9

Box 12.4 Class size and learning in well-controlled and poorly controlled studies
Source: adapted from Glass and Smith, 1978


13 Action research

Introduction

One of the founding figures of action research, Kurt Lewin (1948), remarked that research which produced nothing but books is inadequate. The task, as Marx suggests in his Theses on Feuerbach, is not merely to understand and interpret the world but to change it. Action research is a powerful tool for change and improvement at the local level. Indeed, Lewin’s own work was deliberately intended to change the life chances of disadvantaged groups in terms of housing, employment, prejudice, socialization and training. Its combination of action and research has contributed to its attraction to researchers, teachers and the academic and educational community alike, demolishing Hodgkinson’s (1957) corrosive criticism of action research as easy hobby games for little engineers!

The scope of action research as a method is impressive. Action research may be used in almost any setting where a problem involving people, tasks and procedures cries out for solution, or where some change of feature results in a more desirable outcome. It can be undertaken by the individual teacher, a group of teachers working co-operatively within one school, or a teacher or teachers working alongside a researcher or researchers in a sustained relationship, possibly with other interested parties like advisers, university departments and sponsors on the periphery (Holly and Whitehead, 1986). Action research can be used in a variety of areas, for example:

• teaching methods—replacing a traditional method by a discovery method;
• learning strategies—adopting an integrated approach to learning in preference to a single-subject style of teaching and learning;
• evaluative procedures—improving one’s methods of continuous assessment;
• attitudes and values—encouraging more positive attitudes to work, or modifying pupils’ value systems with regard to some aspect of life;
• continuing professional development of teachers—improving teaching skills, developing new methods of learning, increasing powers of analysis, heightening self-awareness;
• management and control—the gradual introduction of the techniques of behaviour modification;
• administration—increasing the efficiency of some aspect of the administrative side of school life.

These examples do not mean, however, that action research can be typified straightforwardly; to do so would be to distort its complex and multifaceted nature. Indeed, Kemmis (1997) suggests that there are several schools of action research.1

Defining action research

The different conceptions of action research can be revealed in some typical definitions. For example, Hopkins (1985:32) and Ebbutt (1985:156) suggest that the combination of action and research renders that action a form of disciplined inquiry, in which a personal attempt is made to understand, improve and reform practice. Cohen and Manion (1994:186) define it as ‘a small-scale intervention in the functioning of the real world and a close examination of the effects of such an intervention’. The rigour of action research is attested by Corey (1953:6), who argues that it is a process in which practitioners study problems scientifically (our italics) so that they can evaluate, improve and steer decision-making and practice. Indeed, Kemmis and McTaggart (1992:10) argue that ‘to do action research is to plan, act, observe and reflect more carefully, more systematically, and more rigorously than one usually does in everyday life’.

A more philosophical stance on action research, one that echoes the work of Habermas, is taken by Carr and Kemmis (1986:162), who regard it as a form of ‘self-reflective inquiry’ by participants, undertaken in order to improve understanding of their practices in context, with a view to maximizing social justice. Grundy (1987:142) regards action research as concerned with improving the ‘social conditions of existence’. Kemmis and McTaggart (1992) suggest that:

Action research is concerned equally with changing individuals, on the one hand, and, on the other, the culture of the groups, institutions and societies to which they belong. The culture of a group can be defined in terms of the characteristic substance and forms of the language and discourses, activities and practices, and social relationships and organization which constitute the interactions of the group.

(Kemmis and McTaggart, 1992:16)

It can be seen that action research is designed to bridge the gap between research and practice (Somekh, 1995:340), thereby striving to overcome the perceived persistent failure of research to impact on, or improve, practice (see also Rapoport, 1970:499; and McCormick and James, 1988:339). Stenhouse (1979) suggests that action research should contribute not only to practice but to a theory of education and teaching which is accessible to other teachers, making educational practice more reflective (Elliott, 1991:54).

Action research combines diagnosis with reflection, focusing on practical issues that have been identified by participants and which are somehow both problematic yet capable of being changed (Elliott, 1978:355–6; 1991:49). Zuber-Skerritt (1996b:83) suggests that ‘the aims of any action research project or program are to bring about practical improvement, innovation, change or development of social practice, and the practitioners’ better understanding of their practices’.

The several strands of action research are drawn together by Kemmis and McTaggart (1988) in their all-encompassing definition:

Action research is a form of collective self-reflective inquiry undertaken by participants in social situations in order to improve the rationality and justice of their own social or educational practices, as well as their understanding of these practices and the situations in which these practices are carried out… The approach is only action research when it is collaborative, though it is important to realize that the action research of the group is achieved through the critically examined action of individual group members.

(Kemmis and McTaggart, 1988:5)

Kemmis and McTaggart (1992) distinguish action research from the everyday actions of teachers:

• It is not the usual thinking teachers do when they think about their teaching. Action research is more systematic and collaborative in collecting evidence on which to base rigorous group reflection.
• It is not simply problem-solving. Action research involves problem-posing, not just problem-solving. It does not start from a view of ‘problems’ as pathologies. It is motivated by a quest to improve and understand the world by changing it and learning how to improve it from the effects of the changes made.
• It is not research done on other people. Action research is research by particular people on their own work, to help them improve what they do, including how they work with and for others…
• Action research is not ‘the scientific method’ applied to teaching. There is not just one view of ‘the scientific method’; there are many.

(Kemmis and McTaggart, 1992:21–2)

Noffke and Zeichner (1987) make several claims for action research with teachers, viz. that it:

• brings about changes in their definitions of their professional skills and roles;
• increases their feelings of self-worth and confidence;
• increases their awareness of classroom issues;
• improves their dispositions toward reflection;
• changes their values and beliefs;
• improves the congruence between practical theories and practices;
• broadens their views on teaching, schooling and society.

A significant feature here is that action research lays claim to the professional development of teachers; action research for professional development is a frequently heard maxim (e.g. Nixon, 1981; Oja and Smulyan, 1989; Somekh, 1995:343; Winter, 1996). It is ‘situated learning’: learning in the workplace and about the workplace (Collins and Duguid, 1989). The claims for action research, then, are several. Arising from these claims and definitions are several principles.

Principles and characteristics of action research

Hult and Lennung (1980:241–50) and McKernan (1991:32–3) suggest that action research:

• makes for practical problem-solving as well as expanding scientific knowledge;
• enhances the competencies of participants;
• is collaborative;
• is undertaken directly in situ;
• uses feedback from data in an ongoing cyclical process;
• seeks to understand particular complex social situations;
• seeks to understand the processes of change within social systems;
• is undertaken within an agreed framework of ethics;
• seeks to improve the quality of human actions;
• focuses on those problems that are of immediate concern to practitioners;
• is participatory;
• frequently uses case study;
• tends to avoid the paradigm of research that isolates and controls variables;
• is formative, such that the definition of the problem, the aims and the methodology may alter during the process of action research;
• includes evaluation and reflection;
• is methodologically eclectic;
• contributes to a science of education;
• strives to render the research usable and shareable by participants;
• is dialogical and celebrates discourse;
• has a critical purpose in some forms;
• strives to be emancipatory.

Zuber-Skerritt (1996b:85) suggests that action research is:

critical (and self-critical) collaborative inquiry by
reflective practitioners being
accountable and making results of their inquiry public,
self-evaluating their practice and engaged in
participatory problem-solving and continuing professional development.

This latter view is echoed in Winter’s (1996:13–14) six key principles of action research:

• reflexive critique, which is the process of becoming aware of our own perceptual biases;
• dialectical critique, which is a way of understanding the relationships between the elements that make up various phenomena in our context;
• collaboration, which is intended to mean that everyone’s view is taken as a contribution to understanding the situation;
• risking disturbance, which is an understanding of our own taken-for-granted processes and a willingness to submit them to critique;
• creating plural structures, which involves developing various accounts and critiques, rather than a single authoritative interpretation;
• theory and practice internalized, which is seeing theory and practice as two interdependent yet complementary phases of the change process.

The several features that the definitions at the start of this chapter have in common suggest that action research has key principles. These are summarized by Kemmis and McTaggart (1992:22–5):

• Action research is an approach to improving education by changing it and learning from the consequences of changes.
• Action research is participatory: it is research through which people work towards the improvement of their own practices (and only secondarily on other people’s practices).
• Action research develops through the self-reflective spiral: a spiral of cycles of planning, acting (implementing plans), observing (systematically), reflecting…and then replanning, further implementation, observing and reflecting.
• Action research is collaborative: it involves those responsible for action in improving it.
• Action research establishes self-critical communities of people participating and collaborating in all phases of the research process: the planning, the action, the observation and the reflection; it aims to build communities of people committed to enlightening themselves about the relationship between circumstance, action and consequence in their own situation, and emancipating themselves from the institutional and personal constraints which limit their power to live their own legitimate educational and social values.
• Action research is a systematic learning process in which people act deliberately, though remaining open to surprises and responsive to opportunities.
• Action research involves people in theorizing about their practices—being inquisitive about circumstances, action and consequences and coming to understand the relationships between circumstances, actions and consequences in their own lives.
• Action research requires that people put their practices, ideas and assumptions about institutions to the test by gathering compelling evidence which could convince them that their previous practices, ideas and assumptions were wrong or wrong-headed.
• Action research is open-minded about what counts as evidence (or data)—it involves not only keeping records which describe what is happening as accurately as possible…but also collecting and analyzing our own judgements, reactions and impressions about what is going on.
• Action research involves keeping a personal journal in which we record our progress and our reflections about two parallel sets of learning: our learnings about the practices we are studying…and our learnings about the process (the practice) of studying them.
• Action research is a political process because it involves us in making changes that will affect others.
• Action research involves people in making critical analyses of the situations (classrooms, schools, systems) in which they work: these situations are structured institutionally.
• Action research starts small, by working through changes which even a single person (myself) can try, and works towards extensive changes—even critiques of ideas or institutions which in turn might lead to more general reforms of classroom, school or system-wide policies and practices.
• Action research starts with small cycles of planning, acting, observing and reflecting which can help to define issues, ideas and assumptions more clearly so that those involved can define more powerful questions for themselves as their work progresses.
• Action research starts with small groups of collaborators, but widens the community of participating action researchers so that it gradually includes more and more of those involved and affected by the practices in question.
• Action research allows us to build records of our improvements: (a) records of our changing activities and practices; (b) records of the changes in the language and discourse in which we describe, explain and justify our practices; (c) records of the changes in the social relationships and forms of organization which characterize and constrain our practices; and (d) records of the development in our mastery of action research.
• Action research allows us to give a reasoned justification of our educational work to others, because we can show how the evidence we have gathered and the critical reflection we have done have helped us to create a developed, tested and critically examined rationale for what we are doing.

Though these principles find widespread support in the literature on action research, they require some comment. For example, there is a strong emphasis in these principles on action research as a co-operative, collaborative activity (e.g. Hill and Kerber, 1967). Kemmis and McTaggart locate this in the work of Lewin himself, commenting on his commitment to group decision-making (p. 6). They argue, for example, that ‘those affected by planned changes have the primary responsibility for deciding on courses of critically informed action which seem likely to lead to improvement, and for evaluating the results of strategies tried out in practice’. Action research is a group activity (p. 6); it is not individualistic, for to lapse into individualism is to destroy the critical dynamic of the group (p. 15, italics in original).

The view of action research solely as a group activity, however, might be too restricting. It is possible for action research to be an individualistic matter as well, relating action research to the ‘teacher-as-researcher’ movement (Stenhouse, 1975). Whitehead (1985:98) explicitly writes about action research in individualistic terms, and we can take this to suggest that a teacher can ask herself or himself: ‘What do I see as my problem?’ ‘What do I see as a possible solution?’ ‘How can I direct the solution?’ ‘How can I evaluate the outcomes and take subsequent action?’

The adherence to action research as a group activity derives from several sources. Pragmatically, Oja and Smulyan (1989:14), in arguing for collaborative action research, suggest that teachers are more likely to change their behaviours and attitudes if they have been involved in research that demonstrates not only the need for such change but also that it can be done—the issue of ‘ownership’ and ‘involvement’ that finds its parallel in the management literature which suggests that those closest to the problem are in the best position to identify it and work towards its solution (e.g. Morrison, 1998).

Ideologically, there is a view that those experiencing the issue should be involved in decision-making, itself hardly surprising given Lewin’s own work with disadvantaged and marginalized groups, i.e. groups with little voice. The coupling of the ideological and the political here has been brought more up to date in the work of Freire (1970) and Torres (1992:56) in Latin America, the latter setting out several principles of participatory action research:

• It commences with explicit social and political intentions that articulate with the dominated and poor classes and groups in society.
• It must involve popular participation in the research process, i.e. it must have a social basis.
• It regards knowledge as an agent of social transformation as a whole, thereby constituting a powerful critique of those views of knowledge (theory) as somehow separate from practice.
• Its epistemological base is rooted in critical theory and its critique of the subject/object relations in research.
• It must raise the consciousness of individuals, groups and nations.

Participatory action research recognizes a role for the researcher as facilitator, guide, formulator and summarizer of knowledge, and raiser of issues (e.g. the possible consequences of actions, the awareness of structural conditions) (Weiskopf and Laske, 1996:132–3).

What is being argued here is that action research is a democratic activity (Grundy, 1987:142). This form of democracy is participatory (rather than, for example, representative), a key feature of critical theory (discussed below; see also Aronowitz and Giroux, 1986; Giroux, 1989). Action research is seen as an empowering activity. Elliott (1991:54) argues that such empowerment has to be at a collective rather than an individual level, as individuals do not operate in isolation from each other, but are shaped by organizational and structural forces.

The issue is important, for it begins to separate action research into different camps (Kemmis, 1997:177). On the one hand are long-time advocates of action research such as Elliott (e.g. 1978; 1991), who are in the tradition of Schwab and Schön and who emphasize reflective practice; this is a particularly powerful field of curriculum research, with its notions of the ‘teacher-as-researcher’ (Stenhouse, 1975) and the reflective practitioner (Schön, 1983, 1987). On the other are advocates of the ‘critical’ action research model, e.g. Carr and Kemmis (1986).

Action research as critical praxis

Much of the writing in this field of action research draws on the Frankfurt School of critical theory (discussed in Chapter 1), in particular the work of Habermas. Indeed, Weiskopf and Laske (1996:123) locate action research, in the German tradition, squarely as a ‘critical social science’. Using Habermas’s early writing on knowledge-constitutive interests (1972, 1974), a threefold typification of action research can be constructed; the classification was set out in Chapter 1.

Grundy (1987:154) argues that ‘technical’ action research is designed to render an existing situation more efficient and effective. In this respect it is akin to Argyris’s notion of ‘single-loop learning’ (Argyris, 1990), being functional, often short-term and technical. It is akin to Schön’s (1987) notion of ‘reflection-in-action’ (Morrison, 1995a). Elliott (1991:55) suggests that this view is limiting for action research, since it is too individualistic and neglects wider curriculum structures, regarding teachers in isolation from wider factors.

By contrast, ‘practical’ action research is designed to promote teachers’ professionalism by drawing on their informed judgement (Grundy, 1987:154). It is akin to Schön’s ‘reflection-on-action’ and is a hermeneutic activity of understanding and interpreting social situations with a view to their improvement. Grundy suggests (p. 148) that it is this style that characterizes much action research in the UK.

Emancipatory action research has an explicit agenda which is as political as it is educational. Grundy (1987) provides a useful introduction to this view. She argues (pp. 146–7) that emancipatory action research seeks to develop in participants their understandings of the illegitimate structural and interpersonal constraints that are preventing the exercise of their autonomy and freedom. These constraints, she argues, are based on illegitimate repression, domination and control. When participants develop a consciousness of these constraints, she suggests, they begin to move from unfreedom and constraint to freedom, autonomy and social justice.

Action research, then, aims to empower individuals and social groups to take control over their lives within a framework of the promotion, rather than the suppression, of generalizable interests (Habermas, 1976). It commences with a challenge to the illegitimate operation of power; hence in some respects (albeit more politicized, because it embraces the dimension of power) it is akin to Argyris’s (1990) notion of ‘double-loop learning’, in that it requires participants to question and challenge given value systems. For Grundy, praxis fuses theory and practice within an egalitarian social order, and action research is designed with the political agenda of improvement towards a more just, egalitarian society. This accords to some extent with Lewin’s view that action research leads to equality and co-operation, an end to exploitation and the furtherance of democracy (see also Hopkins, 1985:32; Carr and Kemmis, 1986:163). Zuber-Skerritt (1996a) suggests that:

emancipatory action research…is collaborative, critical and self-critical inquiry by practitioners…into a major problem or issue or concern in their own practice. They own the problem and feel responsible and accountable for solving it through teamwork and through following a cyclical process of:
1 strategic planning;
2 action, i.e. implementing the plan;
3 observation, evaluation and self-evaluation;
4 critical and self-critical reflection on the results of points 1–3 and making decisions for the next cycle of action research.

(Zuber-Skerritt, 1996a:3)

Action research, she argues (p. 5), is emancipatory when it aims not only at technical and practical improvement and the participants’ better understanding, along with transformation and change within the existing boundaries and conditions, but also at changing the system itself, or those conditions which impede desired improvement in the system/organization…There is no hierarchy, but open and ‘symmetrical communication’.

The emancipatory interest is based on the notion of action researchers as participants in a community of equals. This, in turn, is premised on Habermas’s notion of the ‘ideal speech situation’, which can be summarized thus (Morrison, 1996b:171):

• orientation to a common interest ascertained without deception;
• freedom to enter a discourse and equal opportunity for discussion;
• freedom to check questionable claims and evaluate explanations;
• freedom to modify a given conceptual framework;
• freedom to reflect on the nature of knowledge;
• freedom to allow commands or prohibitions to enter discourse when they can no longer be taken for granted;
• freedom to assess justifications;
• freedom to alter norms;
• freedom to reflect on the nature of political will;
• mutual understanding between participants;
• recognition of the legitimacy of each subject to participate in the dialogue as an autonomous and equal partner;
• discussion to be free from domination and distorting or deforming influences;
• the consensus resulting from discussion derives from the force of the better argument alone, and not from the positional power of the participants;
• all motives except the co-operative search for truth are excluded;
• the speech act validity claims of truth, legitimacy, sincerity and comprehensibility are all addressed.

This formidable list, characterized, perhaps, by the opacity of Habermas’s language itself (see Morrison, 1995b), is problematical, though this will not be discussed in this volume (for a full analysis see Morrison, 1995b). What is important to note, perhaps, is that:

• action research here is construed as reflective practice with a political agenda;
• all participants (and action research is participatory) are equal ‘players’;
• action research, in this vein, is necessarily dialogical—interpersonal—rather than monological (individual);
• communication is an intrinsic element, with communication being amongst the community of equals (Grundy and Kemmis, 1988:87, term this ‘symmetrical communication’);
• because it is a community of equals, action research is necessarily democratic and promotes democracy;
• the search is for consensus (and consensus requires more than one participant), hence it requires collaboration and participation.

In this sense emancipatory action research fulfils the requirements of action research set out by Kemmis and McTaggart above; indeed, it could be argued that only emancipatory action research (in the threefold typology) has the potential to do this.

Kemmis (1997:177) suggests that the distinction between the two camps (the reflective practitioners and the critical theorists) lies in their interpretation of action research. For the former, action research is an improvement to professional practice at the local, perhaps classroom, level, within the capacities of individuals and the situations in which they are working; for the latter, action research is part of a broader agenda of changing education, changing schooling and changing society.

A key term in action research is ‘empowerment’. For the former camp, empowerment is largely a matter of the professional sphere of operations, achieving professional autonomy through professional development; for the latter, empowerment concerns taking control over one’s life within a just, egalitarian, democratic society. Whether the latter is realizable or utopian is a matter for critique of this view. Where is the evidence that critical action research either empowers groups or alters the macro-structures of society? Is critical action research socially transformative? At best the jury is out; at worst the jury has simply gone away, as capitalism overrides egalitarianism worldwide. The point at issue here is the extent to which the notion of emancipatory action research has attempted to hijack the action research agenda and whether, in so doing (if it has), it has wrested action research away from practitioners and into the hands of theorists and the academic research community only.

More specifically, several criticisms have been levelled at this interpretation of emancipatory action research (Gibson, 1985; Morrison, 1995a, 1995b; Somekh, 1995; Melrose, 1996; Grundy, 1996; Weiskopf and Laske, 1996; Webb, 1996; McTaggart, 1996; Kemmis, 1997), including the views that it:

• is utopian and unrealizable;
• is too controlling and prescriptive, seeking to capture and contain action research within a particular mould—it moves towards conformity;
• adopts a narrow and particularistic view of emancipation and action research, and of how to undertake the latter;
• undermines the significance of the individual teacher-as-researcher in favour of self-critical communities (Kemmis and McTaggart (1992:152) pose the question ‘why must action research consist of a group process?’);
• rests on an untenable threefold typification of action research;
• assumes that rational consensus is achievable and that rational debate will empower all participants (i.e. it understates the issue of power, wherein the most informed are already the most powerful—Grundy (1996:111) argues that the better argument derives from the one with the most evidence and reasons, and that these are more available to the powerful, thereby rendering the conditions of equality suspect);
• overstates the desirability of consensus-oriented research (which neglects the complexity of power);
• overlooks the fact that power cannot be dispersed or rearranged simply by rationality;
• as critical theory, reduces its practical impact and confines it to the commodification of knowledge in the academy;
• is uncritical and self-contradicting;
• will promote conformity through slavish adherence to its orthodoxies;
• is naïve in its understanding of groups, and celebrates groups over individuals, particularly the ‘in-groups’ rather than the ‘out-groups’;
• privileges its own view of science (rejecting objectivity) and lacks modesty;
• privileges the authority of critical theory;
• is elitist whilst purporting to serve egalitarianism;
• assumes an undifferentiated view of action research;
• is attempting to colonize and redirect action research.


This seemingly devastating critique serves to remind the reader that critical action research, even though it has caught the high ground of recent coverage, is highly problematical. It is just as controlling as the controlling agendas that it seeks to attack (Morrison, 1995b). Indeed, Melrose (1996:52) suggests that, because critical research is itself value-laden, it abandons neutrality; it has an explicit social agenda that, under the guise of examining the values, ethics, morals and politics operating in a particular situation, is actually aimed at transforming the status quo.

Procedures for action research

Nixon offers several principles for considering action research in schools (Box 13.1). There are several ways in which the steps of action research have been analysed. Blum (National Education Association of the United States, 1959) casts action research into two simple stages: a diagnostic stage, in which the problems are analysed and the hypotheses developed; and a therapeutic stage, in which the hypotheses are tested by a consciously directed intervention or experiment in situ. Lewin (1946, 1948) codified the action research process into four main stages: planning, acting, observing and reflecting.

He suggests that action research commences with a general idea, and data are sought about the presenting situation. The successful outcome of this examination is the production of a plan of action to reach an identified objective, together with a decision on the first steps to be taken. Lewin acknowledges that this might involve modifying the original plan or idea. The next stage of implementation is accompanied by ongoing fact-finding to monitor and evaluate the intervention, i.e. to act as a formative evaluation. This feeds forward into a revised plan and set of procedures for implementation, themselves accompanied by monitoring and evaluation. Lewin (1948:205) suggests that such ‘rational social management’ can be conceived of as a spiral of planning, action and fact-finding about the outcomes of the actions taken.

Box 13.1 Action research in classroom and school

1 All teachers possess certain skills which can contribute to the research task. The important thing is to clarify and define one’s own particular set of skills. Some teachers, for example, are able to collect and interpret statistical data; others to record in retrospective accounts the key moments of a lesson. One teacher may know something about questionnaire design; another may have a natural flair for interviewing. It is essential that teachers work from their own particular strengths when developing the research.

2 The situations within which teachers work impose different kinds of constraints. Some schools, for example, are equipped with the most up-to-date audio-visual equipment; others cannot even boast a cassette tape-recorder. Some have spare rooms in which interviews could be carried out; others hardly have enough space to implement the existing timetable. Action research must be designed in such a way as to be easily implemented within the pattern of constraints existing within the school.

3 Any initial definition of the research problem will almost certainly be modified as the research proceeds. Nevertheless, this definition is important because it helps to set limits to the inquiry. If, for example, a teacher sets out to explore through action research the problem of how to start a lesson effectively, the research will tend to focus upon the first few minutes of the lesson. The question of what data to collect is very largely answered by a clear definition of the research problem.

Source: Nixon, 1981

The legacy of Lewin’s work, though contested (e.g. Elliott, 1978, 1991; McTaggart, 1996:248), is powerful in the steps of action research set out by Kemmis and McTaggart (1981:2):

In practice, the process begins with a general idea that some kind of improvement or change is desirable. In deciding just where to begin in making improvements, one decides on a field of action…where the battle (not the whole war) should be fought. It is a decision on where it is possible to have an impact. The general idea prompts a ‘reconnaissance’ of the circumstances of the field, and fact-finding about them. Having decided on the field and made a preliminary reconnaissance, the action researcher decides on a general plan of action. Breaking the general plan down into achievable steps, the action researcher settles on the first action step. Before taking this first step the action researcher becomes more circumspect, and devises a way of monitoring the effects of the first action step. When it is possible to maintain fact-finding by monitoring the action, the first step is taken. As the step is implemented, new data start coming in and the effect of the action can be described and evaluated. The general plan is then revised in the light of the new information about the field of action and the second action step can be planned along with appropriate monitoring procedures. The second step is then implemented, monitored and evaluated; and the spiral of action, monitoring, evaluation and replanning continues.

McKernan (1991:17) suggests that Lewin’s model of action research is a series of spirals, each of which incorporates a cycle of analysis, reconnaissance, reconceptualization of the problem, planning of the intervention, implementation of the plan, and evaluation of the effectiveness of the intervention. Ebbutt (1985) adds to this the view that feedback within and between each cycle is important, facilitating reflection (see also McNiff, 1988). This is reinforced in the model of action research by Altricher and Gstettner (1993), where, though they have four steps (p. 343)—(a) finding a starting point, (b) clarifying the situation, (c) developing action strategies and putting them into practice, (d) making teachers’ knowledge public—they suggest that steps (b) and (c) need not be sequential, thereby avoiding the artificial divide that might exist between data collection, analysis and interpretation.

Zuber-Skerritt (1996b:84) sets emancipatory (critical) action research into a cyclical process of: ‘(1) strategic planning, (2) implementing the plan (action), (3) observation, evaluation and self-evaluation, (4) critical and self-critical reflection on the results of (1)–(3) and making decisions for the next cycle of research’. In an imaginative application of action research to organizational change theory, she takes the famous work of Lewin (1952) on forcefield analysis and change theory (unfreezing → moving → refreezing) and the work of Beer et al. (1990) on task alignment, and sets them into an action research sequence that clarifies the steps of action research very usefully (Box 13.2).

In our earlier editions we set out an eight-stage process of action research that attempts to draw together the several strands and steps of the action research undertaking. The first stage will involve the identification, evaluation and formulation of the problem perceived as critical in an everyday teaching situation. ‘Problem’ should be interpreted loosely here, so that it could refer to the need to introduce innovation into some aspect of a school’s established programme.

The second stage involves preliminary discussion and negotiations among the interested parties—teachers, researchers, advisers, sponsors, possibly—which may culminate in a draft proposal. This may include a statement of the questions to be answered (e.g. ‘Under what conditions can curriculum change be best effected?’ ‘What are the limiting factors in bringing about effective curriculum change?’ ‘What strong points of action research can be employed to bring about curriculum change?’). The researchers in their capacity as consultants (or sometimes as programme initiators) may draw upon their expertise to bring the problem more into focus, possibly determining causal factors or recommending alternative lines of approach to established ones. This is often the crucial stage, for, unless the objectives, purposes and assumptions are made perfectly clear to all concerned, and unless the role of key concepts is stressed (e.g. feedback), the enterprise can easily miscarry.

The third stage may involve a review of the research literature to find out what can be learned from comparable studies, their objectives, procedures and problems encountered.

The fourth stage may involve a modification or redefinition of the initial statement of the problem at stage one. It may now emerge in the form of a testable hypothesis, or as a set of guiding objectives. Sometimes change agents deliberately decide against the use of objectives on the grounds that they have a constraining effect on the process itself. It is also at this stage that assumptions underlying the project are made explicit (e.g. in order to effect curriculum changes, the attitudes, values, skills and objectives of the teachers involved must be changed).

The fifth stage may be concerned with the selection of research procedures—sampling, administration, choice of materials, methods of teaching and learning, allocation of resources and tasks, deployment of staff and so on.

The sixth stage will be concerned with the choice of the evaluation procedures to be used, and will need to take into consideration that evaluation in this context will be continuous.

The seventh stage embraces the implementation of the project itself (over varying periods of time). It will include the conditions and methods of data collection (e.g. fortnightly meetings, the keeping of records, interim reports, final reports, the submission of self-evaluation and group-evaluation reports, etc.); the monitoring of tasks and the transmission of feedback to the research team; and the classification and analysis of data.

The eighth and final stage will involve the interpretation of the data; inferences to be drawn; and overall evaluation of the project (see Woods, 1989). Discussions on the findings will take place in the light of previously agreed evaluative criteria. Errors, mistakes and problems will be considered. A general summing-up may follow this, in which the outcomes of the project are reviewed, recommendations made, and arrangements for dissemination of results to interested parties decided.

Box 13.2 A model of emancipatory action research for organizational change
Source: Zuber-Skerritt, 1996b: 99

As we stressed, this is a basic framework; much activity of an incidental and possibly ad hoc nature will take place in and around it. This may comprise discussions among teachers, researchers and pupils; regular meetings among teachers or schools to discuss progress and problems, and to exchange information; possibly regional conferences; and related activities, all enhanced by the range of current hardware—tapes, video recordings and transcripts.

Hopkins (1985), McNiff (1988), Edwards (1990) and McNiff, Lomax and Whitehead (1996) offer much practical advice on the conduct of action research, including 'getting started', operationalization, planning, monitoring and documenting the intervention, collecting data and making sense of them, using case studies, evaluating the action research, ethical issues and reporting. We urge readers to go to these helpful sources. These are essentially both introductory sources and manuals for practice.

Kemmis and McTaggart (1992:25–7) offer a useful series of observations for beginning action research:

• Get an action research group together and participate yourself—be a model learner about action research.
• Be content to start to work with a small group.
• Get organized.
• Start small.
• Establish a time line.
• Arrange for supportive work-in-progress discussions in the action research group.
• Be tolerant and supportive—expect people to learn from experience.
• Be persistent about monitoring.
• Plan for a long haul on the bigger issues of changing classroom practices and school structures.
• Work to involve (in the research process) those who are involved (in the action), so that they share responsibility for the whole action research process.
• Remember that how you think about things—the language and understandings that shape your action—may need changing just as much as the specifics of what you do.
• Register progress not only with the participant group but also with the whole staff and other interested people.
• If necessary, arrange legitimizing rituals—involving consultants or other outsiders.
• Make time to write throughout your project.
• Be explicit about what you have achieved by reporting progress.
• Throughout, keep in mind the distinction between education and schooling.
• Throughout, ask yourself whether your action research project is helping you (and those with whom you work) to improve the extent to which you are living your educational values (italics in original).

It is clear from this list that action research is a blend of practical and theoretical concerns; it is both action and research.

In conducting action research the participants can be both methodologically eclectic and can use a variety of instruments for data collection: questionnaires, diaries, interviews, case studies, observational data, experimental design, field notes, photography, audio and video recording, sociometry, rating scales, biographies and accounts, documents and records—in short, the full gamut of techniques (for a discussion of these, see Hopkins, 1985; McKernan, 1991; and the chapters in our own book here).

Additionally, a useful way of gaining a focus within a group of action researchers is through the use of the Nominal Group Technique (Morrison, 1993). Its administration is straightforward, and it is useful for gathering information in a single instance. In this approach one member of the group provides the group with a series of questions, statements or issues. A four-stage model can be adopted:

Stage 1 A short time is provided for individuals to write down, without interruption or discussion with anybody else, their own answers, views, reflections and opinions in response to the questions/statements/issues provided by the group leader (e.g. problems of teaching or organizing such-and-such, or an identification of issues in the organization of a piece of the curriculum).

Stage 2 The responses are entered onto a sheet of paper which is then displayed for others to view. The leader invites individual comments on the displayed responses to the questions/statements/issues, but no group discussion, i.e. the data collection is still at an individual level, and then notes these comments on the display sheet on which the responses have been collected. The process of inviting individual comments/contributions which are then displayed for everyone to see is repeated until no more comments are received.

Stage 3 At this point the leader asks the respondents to identify clusters of displayed comments and responses, i.e. to put some structure, order and priority into the displayed items. It is here that control of proceedings moves from the leader to the participants. A group discussion takes place, since a process of clarifying meanings and organizing issues and responses into coherent and cohesive bundles is required, which then moves to the identification of priorities.

Stage 4 Finally, the leader invites any further group discussion about the material and its organization.

The process of the Nominal Group Technique enables individual responses to be included within a group response, i.e. the individual's contribution to the group delineation of significant issues is maintained. This technique is very useful in gathering data from individuals and putting them into some order which is shared by the group (and action research is largely, though not exclusively, a group matter), e.g. of priority, of similarity and difference, of generality and specificity. It also enables individual disagreements to be registered and built into the group responses, and allows the identification of significant issues to emerge. Further, it gives equal status to all respondents in the situation: for example, the voice of the new entrant to the teaching profession is given equal consideration to the voice of the headteacher of several years' experience. The attraction of this process is that it balances writing with discussion, a divergent phase with a convergent phase, space for individual comments with group interaction. It is a useful device for developing collegiality. All participants have a voice and are heard.

The written partner to the Nominal Group Technique is the Delphi technique. This has the advantage that it does not require participants to meet together as a whole group. This is particularly useful in institutions where time is precious and where it is difficult to arrange a whole-group meeting. The process of data collection resembles that of the Nominal Group Technique in many respects: it can be set out in a three-stage process:

Stage 1 The leader asks participants to respond to a series of questions and statements in writing. This may be done on an individual basis or on a small-group basis, which enables it to be used flexibly, e.g. within a department or within an age phase.

Stage 2 The leader collects the written responses and collates them into clusters of issues and responses (maybe providing some numerical data on frequency of response). This analysis is then passed back to the respondents for comment, further discussion and identification of issues, responses and priorities. At this stage the respondents are presented with a group response (which may reflect similarities or record differences) and are asked to react to it. By adopting this procedure the individual has the opportunity to agree with the group response (i.e. to move from a possibly small private individual disagreement to a general group agreement) or to indicate a more substantial disagreement with the group response.

Stage 3 This process is repeated as many times as is necessary. In saying this, however, the leader will need to identify the most appropriate place to stop the re-circulation of responses. This might be done at a group meeting which, it is envisaged, will be the plenary session for the participants, i.e. an endpoint of data collection will be in a whole-group forum.

By presenting the group response back to the participants, there is a general progression in the technique towards a polarizing of responses, i.e. a clear identification of areas of consensus and dissensus (and emancipatory action research strives for consensus). The Delphi technique brings advantages of clarity, privacy, voice and collegiality. In doing so it engages the issues of confidentiality, anonymity and disclosure of relevant information whilst protecting participants' rights to privacy. It is a very useful means of undertaking behind-the-scenes data collection which can then be brought to a whole-group meeting; the price that this exacts is that the leader has much more work to do in collecting, synthesizing, collating, summarizing, prioritizing and re-circulating data than in the Nominal Group Technique, which is immediate. As participatory techniques, both the Nominal Group Technique and the Delphi technique are valuable for data collection and analysis in action research. A fully worked example of the use of Delphi techniques for an international study is Cogan and Derricott (1998), a study of citizenship education.
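The collation work in Stage 2 of the Delphi technique lends itself to simple tooling. As a minimal sketch (not from the book, and assuming the leader has already coded each written return into short issue labels; the respondents and labels below are invented), the frequency count fed back to participants after one round might look like this:

```python
# A minimal illustrative sketch: collating one Delphi round.
# Assumes the leader has coded each participant's written return
# into issue labels; the data below are invented.
from collections import Counter

round_1 = {
    "respondent_1": ["timing of courses", "staff development"],
    "respondent_2": ["staff development", "material resources"],
    "respondent_3": ["staff development", "timing of courses"],
    "respondent_4": ["material resources"],
}

# Tally how many respondents raised each issue, most frequent first;
# this ordered list is the 'group response' re-circulated for comment.
tally = Counter(label for labels in round_1.values() for label in labels)
for issue, count in tally.most_common():
    print(f"{issue}: raised by {count} respondent(s)")
```

Repeating the count over successive rounds shows whether responses are converging on consensus or polarizing.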

Reflexivity in action research

The analysis so far has made much of the issue of reflection, be it reflection-in-action, reflection-on-action, or critical reflection (Morrison, 1995a). Reflection, it has been argued, occurs at every stage of action research. Beyond this, the notion of reflexivity is central to action research, because the researchers are also the participants and practitioners in the action research—they are part of the social world that they are studying (Hammersley and Atkinson, 1983:14). Hall (1996:29) suggests that reflexivity is an integral element and epistemological basis of emancipatory action research because it takes as its premiss a view of the construction of knowledge in which: (a) data are authentic and reflect the experiences of all participants; (b) democratic relations exist between all participants in the research; and (c) the researcher's views (which may be theory-laden) do not hold precedence over the views of participants.

What is being required in the notion of reflexivity is a self-conscious awareness of the effects that the participants-as-practitioners-and-researchers are having on the research process, of how their values, attitudes, perceptions, opinions, actions, feelings etc. are feeding into the situation being studied (akin, perhaps, to the notion of counter-transference in counselling). The participants-as-practitioners-and-researchers need to apply to themselves the same critical scrutiny that they are applying to others and to the research. This issue is discussed in Chapter 5.

Some practical and theoretical matters

Much has been made in this chapter of the democratic principles that underpin a considerable amount of action research. The ramifications of this are several. For example, there must be a free flow of information between participants and communication must be extensive (Elliott, 1978:356) and, echoing the notion of the ideal speech situation discussed earlier, communication must be open, unconstrained and unconstraining—the force of the better argument. That this might be problematic in some organizations has been noted by Holly (1984:100), as action research and schools are often structured differently, schools being hierarchical, formal and bureaucratic whilst action research is collegial, informal, open, collaborative and crosses formal boundaries. In turn this suggests that, for action research to be successful, the conditions of collegiality have to be present, for example (Morrison, 1998:157–8):

• participatory approaches to decision-making;
• democratic and consensual decision-making;
• shared values, beliefs and goals;
• equal rights of participation in discussion;
• equal rights to determine policy;
• equal voting rights on decisions;
• the deployment of sub-groups who are accountable to the whole group;
• shared responsibility and open accountability;
• an extended view of expertise;
• judgements and decisions based on the power of the argument rather than the positional power of the advocates;
• shared ownership of decisions and practices.

It is interesting, perhaps, that these features, derived from management theory, can apply so well to action research—action research nests comfortably within certain management styles. Indeed Zuber-Skerritt (1996b:90) suggests that the main barriers to emancipatory action research are: (a) single-loop learning (rather than double-loop learning (Argyris, 1990)); (b) over-dependence on experts or seniors to the extent that independent thought and expression are stifled; (c) an orientation to efficiency rather than to research and development (one might add here 'rather than to reflection and problem posing'); and (d) a preoccupation with operational rather than strategic thinking and practice.

Zuber-Skerritt (1996a:17) suggests four practical problems that action researchers might face:

• How can we formulate a method of work which is sufficiently economical, as regards the amount of data gathering and data processing, for a practitioner to undertake it alongside a normal workload, over a limited time scale?
• How can action research techniques be sufficiently specific that they enable a small-scale investigation by a practitioner to lead to genuinely new insights, and avoid being accused of being either too minimal to be valid, or too elaborate to be feasible?
• How can these methods, given the above, be readily available and accessible to anyone who wishes to practise them, building on the competencies which practitioners already possess?
• How can these methods contribute a genuine improvement of understanding and skill, beyond prior competence, in return for the time and energy expended—that is, a more rigorous process than that which characterizes positivist research?

She also suggests that the issue of the audience of action research reports is problematic:

The answer to the question 'who are action research reports written for?' is that there are three audiences—each of equal importance. One audience comprises those colleagues with whom we have collaborated in carrying out the research reported… It is important to give equal importance to the second audience. These are interested colleagues in other institutions, or in other areas of the same institution, for whom the underlying structure of the work presented may be similar to situations in which they work… But the third, and perhaps most important audience, is ourselves. The process of writing involves clarifying and exploring ideas and interpretations (p. 26).

Action research reports, argues Somekh (1995:347), unlike many 'academic' papers, are typically written in the first person; indeed, she argues, not to do so is hard to defend (given, perhaps, the significance of participation, collaboration, reflexivity and individuality). They have to be written in the everyday, commonsense language of the participants (Elliott, 1978:356).

We have already seen that the participants in a change situation may be either a teacher, a group of teachers working internally, or else teachers and researchers working on a collaborative basis. It is this last category, where action research brings together two professional bodies each with its own objectives and values, that we shall consider further at this point because of its inherently problematic nature. Both parties share the same interest in an educational problem, yet their respective orientations to it differ. It has been observed (Halsey, 1972, for instance) that research values precision, control, replication and attempts to generalize from specific events. Teaching, on the other hand, is concerned with action, with doing things, and translates generalizations into specific acts. The incompatibility between action and research in these respects, therefore, can be a source of problems (Marris and Rein, 1967).

Another issue of some consequence concerns headteachers' and teachers' attitudes to the possibility of change as a result of action research. Hutchinson and Whitehouse (1986), for example, having monitored teachers' efforts to form collaborative groups within their schools, discovered one source of difficulty to be resistance not only from heads but also, and in their view more importantly, from some teachers themselves to the action researcher's efforts to have them scrutinize individual and social practice, possibly with a view to changing it, e.g. in line with the headteacher's policies.

Finally, Winter draws attention to the problem of interpreting data in action research. He writes:

The action research/case study tradition does have a methodology for the creation of data, but not (as yet) for the interpretation of data. We are shown how the descriptive journal, the observer's field notes, and the open-ended interview are utilized to create accounts of events which will confront the practitioner's current pragmatic assumptions and definitions; we are shown the potential value of this process (in terms of increasing teachers' sensitivity) and the problem it poses for individual and collective professional equilibrium. What we are not shown is how the teacher can or should handle the data thus collected.

(Winter, 1982)

The problem for Winter is how to carry out an interpretive analysis of restricted data, that is, data which can make no claim to be generally representative. In other words, the problem of validity cannot be side-stepped by arguing that the contexts are unique.

Conclusion

Action research is an expanding field which is commanding significant educational attention and which has its own centres (e.g. at the Universities of Cambridge and East Anglia in the UK and Deakin University in Australia) and its own journals (e.g. Educational Action Research). It has been seen as a significant vehicle for empowering teachers, though this chapter has questioned the extent of this. As a research device it combines six notions:

1 a straightforward cycle of identifying a problem, planning an intervention, implementing the intervention, and evaluating the outcome;
2 reflective practice;
3 political emancipation;
4 critical theory;
5 professional development; and
6 participatory practitioner research.

It is a flexible, situationally responsive methodology that offers rigour, authenticity and voice. That said, this chapter has tried to expose both the attractions and the problematic areas of action research. In its thrust towards integrating action and research, one has to question whether this is an optimistic way of ensuring that research impacts on practice for improvement, or whether it is a recessive hybrid.


Part four

Strategies for data collection and researching

This section moves to a closer-grained account of instruments for collecting data, how they can be used, and how they can be constructed. We identify eight kinds of instrument for data collection in what follows, and have expanded on the previous edition of the book by new chapters on testing (including recent developments in item response theory and computer-adaptive testing), questionnaire design and observation, together with material on focus groups, statistical significance, multilevel modelling, laddering in personal constructs, telephone interviewing, and speech act theory (echoing elements of critical theory that were introduced in Part One).

The intention of this part is to enable researchers to decide on the most appropriate instruments for data collection, and to design such instruments. The strengths and weaknesses of these instruments are set out, so that decisions on their suitability avoid being arbitrary and the criterion of fitness for purpose is held high. To that end, the intention is to introduce underlying issues of principle in instrumentation as well as to ensure that practical guidelines are provided for researchers. For each instrument the purpose is to ensure that researchers can devise appropriate data collection instruments for themselves, and are aware of the capabilities of such instruments to provide useful and usable data.

14 Questionnaires

The field of questionnaire design is vast, and this chapter is intended to provide a straightforward introduction to its key elements, indicating the main issues to be addressed, some important problematical considerations and how they can be resolved. The chapter follows a sequence in designing a questionnaire that, it is hoped, will be useful for researchers. The sequence is:

• ethical issues;
• approaching the planning of a questionnaire;
• operationalizing the questionnaire;
• structured, semi-structured and unstructured questionnaires;
• avoiding pitfalls in question writing;
• dichotomous questions;
• multiple choice questions;
• rank ordering;
• rating scales;
• open-ended questions;
• asking sensitive questions;
• sequencing the questions;
• questionnaires containing few verbal items;
• the layout of the questionnaire;
• covering letters/sheets and follow-up letters;
• piloting the questionnaire;
• practical considerations in questionnaire design;
• postal questionnaires;
• processing questionnaire data.

It is suggested that the researcher may find it useful to work through these issues in sequence, though, clearly, a degree of recursion is desirable.

The questionnaire is a widely used and useful instrument for collecting survey information, providing structured, often numerical data, being able to be administered without the presence of the researcher, and often being comparatively straightforward to analyze (Wilson and McLean, 1994).1 These attractions have to be counterbalanced by the time taken to develop, pilot and refine the questionnaire, by the possible unsophistication and limited scope of the data that are collected, and by the likely limited flexibility of response, though, as Wilson and McLean (ibid.: 3) observe, this can frequently be an attraction. The researcher will have to judge the appropriateness of using a questionnaire for data collection and, if so, what kind of questionnaire it will be.

Ethical issues

The questionnaire will always be an intrusion into the life of the respondent, be it in terms of time taken to complete the questionnaire, the level of threat or sensitivity of the questions, or the possible invasion of privacy. Questionnaire respondents are not passive data providers for researchers; they are subjects, not objects, of research. There are several sequiturs that flow from this.

Respondents cannot be coerced into completing a questionnaire. They might be strongly encouraged, but the decision whether to become involved and when to withdraw from the research is entirely theirs. Their involvement in the research is likely to be a function of:

• their informed consent (see Chapter 2 on the ethics of educational research);
• their rights to withdraw at any stage or not to complete particular items in the questionnaire;
• the potential of the research to improve their situation (the issue of beneficence);
• the guarantees that the research will not harm them (the issue of non-maleficence);
• the guarantees of confidentiality, anonymity and non-traceability in the research;
• the degree of threat or sensitivity of the questions, which may lead to respondents' over-reporting or under-reporting (Sudman and Bradburn, 1982:32 and Chapter 3);
• factors in the questionnaire itself (e.g. its coverage of issues, its ability to catch what respondents want to say rather than to promote the researcher's agenda), i.e. the avoidance of bias and the assurance of validity and reliability in the questionnaire—the issues of methodological rigour and fairness. Methodological rigour is an ethical, not simply a technical, matter (Morrison, 1996c), and respondents have a right to expect reliability and validity;
• the reactions of the respondent; for example, respondents will react if they consider an item to be offensive, intrusive, misleading, biased, misguided, irritating, inconsiderate, impertinent or abstruse.

These factors impact on every stage of the use of a questionnaire, suggesting that attention has to be given to the questionnaire itself, the approaches that are made to the respondents, the explanations that are given to the respondents, the data analysis and the data reporting.

Approaching the planning of a questionnaire

At this preliminary stage of design, it can sometimes be helpful to use a flow chart technique to plan the sequencing of questions. In this way, researchers are able to anticipate the type and range of responses that their questions are likely to elicit. In Box 14.1 we illustrate a flow chart employed in a commercial survey based upon an interview schedule, though the application of the method to a self-completion questionnaire is self-evident.

Operationalizing the questionnaire

The process of operationalizing a questionnaire is to take a general purpose or set of purposes and turn these into concrete, researchable fields about which actual data can be gathered. Firstly, a questionnaire's general purposes must be clarified and then translated into a specific, concrete aim or set of aims. Thus, 'to explore teachers' views about in-service work' is somewhat nebulous, whereas 'to obtain a detailed description of primary and secondary teachers' priorities in the provision of in-service education courses' is reasonably specific.

Having decided upon and specified the primary objective of the questionnaire, the second phase of the planning involves the identification and itemizing of subsidiary topics that relate to its central purpose. In our example, subsidiary issues might well include: the types of courses required; the content of courses; the location of courses; the timing of courses; the design of courses; and the financing of courses.

The third phase follows the identification and itemization of subsidiary topics and involves formulating specific information requirements relating to each of these issues. For example, with respect to the type of courses required, detailed information would be needed about the duration of courses (one meeting, several meetings, a week, a month, a term or a year), the status of courses (non-award-bearing or award-bearing, with certificate, diploma or degree granted by college or university), and the orientation of courses (theoretically oriented, involving lectures, readings, etc., or practically oriented, involving workshops and the production of curriculum materials).

What we have in the example, then, is a move from a generalized area of interest or purpose to a very specific set of features about which direct data can be gathered. Wilson and McLean (ibid.: 8–9) suggest an alternative approach, which is to identify the research problem, then to clarify the relevant concepts or constructs, then to identify what kinds of measures (if appropriate) or empirical indicators there are of these, i.e. the kinds of data required to give the researcher relevant evidence about the concepts or constructs, e.g. their presence, their intensity, their main features and dimensions, their key elements, etc.

What unites these two approaches is their recognition of the need to ensure that the questionnaire: (a) is clear on its purposes; (b) is clear on what needs to be included or covered in the questionnaire in order to meet the purposes; (c) is exhaustive in its coverage of the elements of inclusion; (d) asks the most appropriate kinds of question (discussed below); (e) elicits the most appropriate kinds of data to answer the research purposes and sub-questions; and (f) asks for empirical data.

Structured, semi-structured and unstructured questionnaires

Though there is a large range of types of questionnaire, there is a simple rule of thumb: the larger the size of the sample, the more structured, closed and numerical the questionnaire may have to be; the smaller the size of the sample, the less structured, more open and word-based the questionnaire may be. Highly structured, closed questions are useful in that they can generate frequencies of response amenable to statistical treatment and analysis. They also enable comparisons to be made across groups in the sample (Oppenheim, 1992:115). Indeed it would be almost impossible, as well as unnecessary, to try to process vast quantities of word-based data in a short time frame. If a site-specific case study is required, then qualitative, less structured, word-based and open-ended questionnaires may be more appropriate as they can capture the specificity of a particular situation. Where measurement is sought, then a quantitative approach is required; where rich and personal data are sought, then a word-based qualitative approach might be more suitable.

Box 14.1 A flow chart technique for question planning
Source: Social and Community Planning Research, 1972

The researcher can select several types of questionnaire, from highly structured to unstructured. If a closed and structured questionnaire is used, enabling patterns to be observed and comparisons to be made, then the questionnaire will need to be piloted and refined so that the final version contains as full a range of possible responses as can be reasonably foreseen. Such a questionnaire is heavy on time early in the research; however, once the questionnaire has been 'set up' then the mode of analysis might be comparatively rapid. For example, it may take two or three months to devise a survey questionnaire, pilot it, refine it and set it out in a format that will enable the data to be processed and statistics to be calculated. However, the 'trade-off' from this is that the data analysis can be undertaken fairly rapidly—we already know the response categories, the nature of the data and the statistics to be used; it is simply a matter of processing the data, often using computer analysis. Indeed there are several computer packages available for paperless, on-line questionnaire completion, e.g. Results for Research™, SphinxSurvey.

It is perhaps misleading to describe a questionnaire as being 'unstructured', as the whole devising of a questionnaire requires respondents to adhere to some form of given structure. That said, between a completely open questionnaire that is akin to an open invitation to 'write what one wants' and a totally closed, completely structured questionnaire, there is the powerful tool of the semi-structured questionnaire. Here a series of questions, statements or items is presented and the respondent is asked to answer, respond to or comment on them in a way that she or he thinks best. There is a clear structure, sequence and focus, but the format is open-ended, enabling the respondent to respond in her/his own terms. The semi-structured questionnaire sets the agenda but does not presuppose the nature of the response.

Types of questionnaire items

There are several kinds of question and response modes in questionnaires, including, for example: dichotomous questions; multiple choice questions; rating scales; and open-ended questions. These are considered below (see also Wilson, 1996). Closed questions prescribe the range of responses from which the respondent may choose. In general, closed questions (dichotomous, multiple choice and rating scales) are quick to complete and straightforward to code (e.g. for computer analysis), and do not discriminate unduly on the basis of how articulate the respondents are (Wilson and McLean, 1994:21). On the other hand, they do not enable respondents to add any remarks, qualifications and explanations to the categories, and there is a risk that the categories might not be exhaustive and that there might be bias in them (Oppenheim, 1992:115).

Open questions, on the other hand, enable respondents to write a free response in their own terms, to explain and qualify their responses, and to avoid the limitations of pre-set categories of response. On the other hand, the responses are difficult to code and to classify. The issue for researchers is one of 'fitness for purpose'.

Avoiding pitfalls in question writing

Though there are several kinds of questions that can be used (discussed below), there are several caveats about the framing of questions in a questionnaire:

1 Avoid leading questions, that is, questions which are worded (or their response categories presented) in such a way as to suggest to respondents that there is only one acceptable answer, and that other responses might or might not gain approval or disapproval respectively. For example:

Do you prefer abstract, academic-type courses, or down-to-earth, practical courses that have some pay-off in your day-to-day teaching?

The guidance here is to check the 'loadedness' or possible pejorative overtones of terms or verbs.

2 Avoid highbrow questions, even with sophisticated respondents. For example:

What particular aspects of the current positivistic/interpretive debate would you like to see reflected in a course of developmental psychology aimed at a teacher audience?

Where the sample being surveyed is representative of the whole adult population, misunderstandings of what researchers take to be clear, unambiguous language are commonplace.

3 Avoid complex questions. For example:

Would you prefer a short, non-award-bearing course (3, 4 or 5 sessions) with part-day release (e.g. Wednesday afternoons) and one evening per week attendance with financial reimbursement for travel, or a longer, non-award-bearing course (6, 7 or 8 sessions) with full-day release, or the whole course designed on part-day release without evening attendance?

4 Avoid irritating questions or instructions. For example:

Have you ever attended an in-service course of any kind during your entire teaching career?

If you are over forty, and have never attended an in-service course, put one tick in the box marked NEVER and another in the box marked OLD.

5 Avoid questions that use negatives and double negatives (Oppenheim, 1992:128). For example:

How strongly do you feel that no teacher should enrol on the in-service, award-bearing course who has not completed at least two years' full-time teaching?

6 Avoid too many open-ended questions on self-completion questionnaires. Because self-completion questionnaires cannot probe respondents to find out just what they mean by particular responses, open-ended questions are a less satisfactory way of eliciting information. (This caution does not hold in the interview situation, however.) Open-ended questions, moreover, are too demanding of most respondents' time. Nothing can be more off-putting than the following format which might appear in a questionnaire:

Use pages 5, 6 and 7 respectively to respond to each of the questions about your attitudes to in-service courses in general and your beliefs about their value in the professional life of the serving teacher.

The problem of ambiguity in words is intractable; at best it can be minimized rather than eliminated altogether. The most innocent of questions is replete with ambiguity (Youngman, 1984:158–9; Morrison, 1993:71–2). Take the following examples:

Does your child regularly do homework?

What does 'regularly' mean—once a day; once a year; once a term; once a week?

How many students are there in the school?

What does this mean: on roll; on roll but absent; marked as present but out of school on a field trip; at this precise moment or this week (there being a difference in attendance between a Monday and a Friday), or between the first term of an academic year and the last term of the academic year for secondary school students, as some of them will have left school to go into employment and others will be at home revising for examinations or have completed them?

How many computers do you have in school?

What does this mean: present but broken; including those out of school being repaired; the property of the school, or staff's and students' own computers; on average, or exactly in school today?


Have you had a French lesson this week?

What constitutes a 'week': the start of the school week (i.e. from Monday to Friday), since last Sunday (or Saturday, depending on one's religion), or, if the question were put on a Wednesday, since last Wednesday? And how representative of all weeks is this week—there being public examinations in the school for some of the week?

How far do you agree with the view that without a Parent–Teacher Association you cannot talk about the progress of your children?

The double negative ('without' and 'cannot') makes this question a difficult one to answer. If I wanted to say that I believe that Parent–Teacher Associations are necessary for adequate consultation between parents and teachers, do I answer with a 'yes' or a 'no'?

How old are you?
15–20
20–30
30–40
40–50
50–60

The categories are not discrete; will an old-looking 40-year-old flatter himself and put himself in the 30–40 category, or will an immature 20-year-old seek the maturity of being put into the 20–30 category? The rule in questionnaire design is to avoid any overlap of categories.

Vocational education is only available to the lower ability students but it should be open to every student.

This is, in fact, a double question. What does the respondent do who agrees with the first part of the sentence—'vocational education is only available to the lower ability students'—but disagrees with the latter part of the sentence, or vice versa? The rule in questionnaire design is to ask only one question at a time.

Though it is impossible to legislate for the respondents' interpretation of wording, the researcher, of course, has to adopt a commonsense approach to this, recognizing the inherent ambiguity but nevertheless still feeling that it is possible to live with this ambiguity.

An ideal questionnaire possesses the same properties as a good law:

It is clear, unambiguous and uniformly workable. Its design must minimize potential errors from respondents…and coders. And since people's participation in surveys is voluntary, a questionnaire has to help in engaging their interest, encouraging their co-operation, and eliciting answers as close as possible to the truth.

(Davidson, 1970)

Dichotomous questions

A highly structured questionnaire will ask closed questions. These can take several forms. Dichotomous questions require a 'yes'/'no' response, e.g. 'Have you ever had to appear in court?', 'Do you prefer didactic methods to child-centred methods?' The dichotomous question is useful, for it compels respondents to 'come off the fence' on an issue. Further, it is possible to code responses quickly, there being only two categories of response. A dichotomous question is also useful as a funnelling or sorting device for subsequent questions, for example: 'If you answered "yes" to question X, please go to question Y; if you answered "no" to question X, please go to question Z'. Sudman and Bradburn (1982:89) suggest that if dichotomous questions are being used, then it is desirable to use several to gain data on the same topic, in order to reduce the problems of respondents' 'guessing' answers.

On the other hand, the researcher must ask, for instance, whether a 'yes'/'no' response actually provides any useful information. Requiring respondents to make a 'yes'/'no' decision may be inappropriate; it might be more appropriate to have a range of responses, for example in a rating scale. There may be comparatively few complex or subtle questions which can be answered with a simple 'yes' or 'no'. A 'yes' or a 'no' may be inappropriate for a situation whose complexity is better served by a series of questions which catch that complexity. Further, Youngman (1984:163) suggests that it is a natural human tendency to agree with a statement rather than to disagree with it; this suggests that a simple dichotomous question might build in respondent bias.

In addition to dichotomous questions ('yes'/'no' questions), a piece of research might ask for information about dichotomous variables, for example gender (male/female), type of school (elementary/secondary), or type of course (vocational/non-vocational). In these cases only one of two responses can be selected. This enables nominal data to be gathered, which can then be processed using the chi-square statistic, the binomial test, the G-test, and cross-tabulations (see Cohen and Holliday (1996) for examples).
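As a minimal illustration (not from the book), a cross-tabulation of two hypothetical dichotomous variables—gender and type of school—can be tested for association with the chi-square statistic; the counts below are invented:

```python
# A minimal illustrative sketch: chi-square test on a 2 x 2
# cross-tabulation of two dichotomous variables (invented counts).
from scipy.stats import chi2_contingency

# Rows: male, female; columns: elementary, secondary.
observed = [
    [30, 20],
    [25, 35],
]

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.3f}")
```

A small p value would suggest that the two variables are not independent; the binomial and G-tests mentioned above would be applied in a similar spirit.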

Multiple choice questions

To try to gain some purchase on complexity, the researcher can move towards multiple choice questions, where the range of choices is designed to capture the likely range of responses to given statements. For example, the researcher might ask a series of questions about a new Chemistry scheme in the school; a statement precedes a set of responses thus:

The New Intermediate Chemistry Education (NICE) is:

(a) a waste of time;
(b) an extra burden on teachers;
(c) not appropriate to our school;
(d) a useful complementary scheme;
(e) a useful core scheme throughout the school;
(f) well-presented and practicable.

The categories would have to be discrete (i.e. having no overlap and being mutually exclusive) and would have to exhaust the possible range of responses. Guidance would have to be given on the completion of the multiple choice item, clarifying, for example, whether respondents are able to tick only one response (a single answer mode) or several responses (multiple answer mode) from the list. Like dichotomous questions, multiple choice questions can be quickly coded and quickly aggregated to give frequencies of response. If that is appropriate for the research, then this might be a useful instrument.
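As a minimal illustration (not from the book), aggregating multiple choice responses into frequencies is a simple counting exercise; the response codes refer to the hypothetical NICE item above and the data are invented:

```python
# A minimal illustrative sketch: aggregating multiple choice responses
# (single answer mode) into frequencies. Data are invented.
from collections import Counter

responses = ["d", "b", "d", "e", "a", "d", "f", "e", "d", "b"]

frequencies = Counter(responses)
for option in "abcdef":
    print(f"({option}): {frequencies.get(option, 0)} response(s)")
```

In multiple answer mode each respondent would contribute a list of codes rather than a single code, but the counting proceeds in the same way.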

Just as dichotomous questions have their parallel in dichotomous variables, so multiple choice questions have their parallel in multiple elements of a variable. For example, the researcher may be asking to which form a student belongs (there being up to, say, forty forms in a large school), or which post-16 course a student is following (e.g. academic, vocational, manual, non-manual). In these cases only one response may be selected. As with the dichotomous variable, the listing of several categories or elements of a variable (e.g. form membership and course followed) enables nominal data to be collected and processed using the chi-square statistic, the G-test, and cross-tabulations (Cohen and Holliday, 1996).

The multiple choice questionnaire seldom gives more than a crude statistic, for words are inherently ambiguous. In the example above the notion of 'useful' is unclear, as are 'appropriate', 'practicable' and 'burden'. Respondents could interpret these words differently in their own contexts, thereby rendering the data ambiguous. One respondent might see the utility of the chemistry scheme in one area and thereby say that it is useful—ticking (d). Another respondent might see the same utility in that same one area but, because it is only useful in that single area, may see this as a flaw and therefore not tick category (d). With an anonymous questionnaire this difference would be impossible to detect.

This is the heart of the problem of questionnaires—that different respondents interpret the same words differently. 'Anchor statements' can be provided to allow a degree of discrimination in response (e.g. 'strongly agree', 'agree', etc.) but there is no guarantee that respondents will always interpret them in the way that was intended. In the example above this might not be a problem, as the researcher might only be seeking an index of utility—without wishing to know the areas of utility or the reasons for that utility. The evaluator might only be wishing for a crude statistic (which might be very useful statistically in making a decisive judgement about a programme), in which case this rough and ready statistic might be perfectly acceptable.

What one can see in the example above is not only ambiguity in the wording but a very incomplete set of response categories which is hardly capable of representing all aspects of the chemistry scheme. That this might be politically expedient cannot be overlooked, for if the choice of responses is limited, then those responses might enable bias to be built into the research. For example, if the responses were limited to statements about the utility of the chemistry scheme, then the evaluator would have little difficulty in establishing that the scheme was useful. By avoiding the inclusion of negative statements or the opportunity to record a negative response, the research will surely be biased. The issue of the wording of questions has been discussed earlier.

Rank ordering

The rank order question is akin to the multiple choice question in that it identifies options from which respondents can choose, yet it moves beyond multiple choice items in that it asks respondents to identify priorities. This enables a relative degree of preference, priority, intensity, etc. to be charted.

In the rank ordering exercise a list of factors is set out and the respondent is required to place them in a rank order, for example:

Please indicate your priorities by placing numbers in the boxes to indicate the ordering of your views, 1 = the highest priority, 2 = the second highest, and so on.

The proposed amendments to the mathematics scheme might be successful if the following factors are addressed:

• the appropriate material resources are in school [ ]
• the amendments are made clear to all teachers [ ]
• the amendments are supported by the mathematics team [ ]
• the necessary staff development is assured [ ]
• there are subsequent improvements to student achievement [ ]
• the proposals have the agreement of all teachers [ ]
• they improve student motivation [ ]
• parents approve of the amendments [ ]
• they will raise the achievements of the brighter students [ ]
• the work becomes more geared to problem-solving [ ]

In this example ten items are listed. Whilst this might be enticing for the researcher, enabling fine distinctions possibly to be made in priorities, it might be asking too much of the respondents to make such distinctions. They genuinely might not be able to differentiate their responses, or they simply might not feel strongly enough to make such distinctions. The inclusion of too long a list might be overwhelming. Indeed, Wilson and McLean (1994:26) suggest that it is unrealistic to ask respondents to arrange priorities where more than five ranks have been requested. In the case of the list of ten points above, the researcher might approach this problem in one of two ways. The list in the questionnaire item can be reduced to five items only, in which case the range and comprehensiveness of responses that fairly catches what the respondent feels is significantly reduced. Alternatively, the list of ten items can be retained, but the respondents can be asked to rank only their first five priorities, in which case the range is retained and the task is not overwhelming (though the problem of sorting the data for analysis is increased).
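Where respondents rank only their top five priorities, a simple aggregation is to record, for each item, how often it was ranked and its mean rank when ranked. As a minimal illustration (not from the book; the item names abbreviate the mathematics-scheme list above and the rankings are invented):

```python
# A minimal illustrative sketch: aggregating partial rank-order data
# (each respondent ranks only their top five priorities; 1 = highest).
items = ["resources", "clarity", "team support", "staff development",
         "achievement", "teacher agreement", "motivation",
         "parental approval", "brighter students", "problem-solving"]

# Each dict maps an item to the rank a respondent gave it; unranked
# items are simply absent. Data are invented.
rankings = [
    {"resources": 1, "clarity": 2, "team support": 3, "staff development": 4, "motivation": 5},
    {"clarity": 1, "resources": 2, "staff development": 3, "teacher agreement": 4, "problem-solving": 5},
    {"staff development": 1, "resources": 2, "clarity": 3, "motivation": 4, "team support": 5},
]

for item in items:
    ranks = [r[item] for r in rankings if item in r]
    if ranks:
        print(f"{item}: ranked by {len(ranks)}, mean rank {sum(ranks) / len(ranks):.1f}")
    else:
        print(f"{item}: not ranked")
```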

Rankings are useful in indicating degrees of response. In this respect they are like rating scales, discussed below.


Rating scales

One way in which degrees of response and intensity of response have been managed, moving away from dichotomous questions, can be seen in the notion of rating scales—Likert scales, semantic differential scales, Thurstone scales and Guttman scaling. These are very useful devices for the researcher, as they build in a degree of sensitivity and differentiation of response whilst still generating numbers. This chapter will focus on the first two of these, though readers will find the others discussed in Oppenheim (1992).

A Likert scale (named after its deviser, Rensis Likert, 1932) provides a range of responses to a given question or statement, for example:

How important do you consider work placements to be for secondary school students?

1 = not at all
2 = very little
3 = a little
4 = a lot
5 = a very great deal

All students should have access to free higher education.

1 = strongly disagree
2 = disagree
3 = neither agree nor disagree
4 = agree
5 = strongly agree

In these examples the categories need to be discrete and to exhaust the range of possible responses which respondents may wish to give. Notwithstanding the problems of interpretation which arise as in the previous example—one respondent's 'agree' may be another's 'strongly agree', one respondent's 'very little' might be another's 'a little'—the greater subtlety of response which is built into a rating scale renders this a very attractive and widely used instrument in research.
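Because each response is recorded as a number, frequencies and simple summaries follow directly. As a minimal illustration (not from the book), responses to the five-point 'free higher education' item above might be tallied as follows; the data are invented, and the median is used rather than the mean because, as noted in the cautions below, equal intervals between scale points cannot be assumed:

```python
# A minimal illustrative sketch: tallying a five-point Likert item.
# Response data are invented.
from collections import Counter
from statistics import median_low

labels = {1: "strongly disagree", 2: "disagree",
          3: "neither agree nor disagree", 4: "agree", 5: "strongly agree"}
responses = [4, 5, 3, 4, 2, 4, 5, 1, 3, 4, 4, 2]

tally = Counter(responses)
for code in range(1, 6):
    print(f"{code} ({labels[code]}): {tally.get(code, 0)}")

# median_low keeps the summary on the scale itself (an actual data point).
mid = median_low(responses)
print("median response:", mid, f"({labels[mid]})")
```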

These two examples both indicate an important feature of an attitude scaling instrument, viz. the assumption of unidimensionality in the scale: the scale should only be measuring one thing at a time (Oppenheim, 1992:187–8). Indeed this is a cornerstone of Likert's own thinking (1932).

It is a very straightforward matter to convert a dichotomous question into a multiple choice question. For example, instead of asking the 'do you?', 'have you?', 'are you?', 'can you?' type of question in a dichotomous format, a simple addition to the wording will convert it into a much more subtle rating scale, by substituting the words 'to what extent?', 'how far?', 'how much?', etc.

A semantic differential is a variation of a rating scale which operates by putting an adjective at one end of a scale and its opposite at the other, for example:

How informative do you consider the new set of history textbooks to be?

       1  2  3  4  5  6  7
useful —  —  —  —  —  —  — useless

The respondent indicates on the scale, by circling or putting a mark, that position which most represents what she or he feels.

Osgood et al. (1957), the pioneers of this technique, suggest that semantic differential scales are useful in three contexts: evaluative (e.g. valuable–valueless, useful–useless, good–bad); potency (e.g. large–small, weak–strong, light–heavy); and activity (e.g. quick–slow, active–passive, dynamic–lethargic).

Rating scales are widely used in research, and rightly so, for they combine the opportunity for a flexible response with the ability to determine frequencies, correlations and other forms of quantitative analysis. They afford the researcher the freedom to fuse measurement with opinion, quantity and quality.

Though rating scales are powerful and useful in research, the researcher nevertheless needs to be aware of their limitations. For example, the researcher may be tempted to infer a degree of sensitivity and subtlety from the data that they cannot bear. There are other cautionary factors about rating scales, be they Likert scales or semantic differential scales:


• There is no assumption of equal intervals between the categories; hence a rating of 4 indicates neither that it is twice as powerful as 2 nor that it is twice as strongly felt. One cannot infer that the intensity of feeling in the Likert scale between 'strongly disagree' and 'disagree' somehow matches the intensity of feeling between 'strongly agree' and 'agree'. These are illegitimate inferences. The problem of equal intervals has been addressed in Thurstone scales (Thurstone and Chave, 1929; Oppenheim, 1992:190–5).

• We have no check on whether the respondents are telling the truth. Some respondents may be deliberately falsifying their replies.

• We have no way of knowing if the respondent might have wished to add any other comments about the issue under investigation. It might have been the case that there was something far more pressing about the issue than the rating scale included, but which was condemned to silence for want of a category. A straightforward way to circumvent this issue is to run a pilot and also to include a category entitled 'other (please state)'.

• Most of us would not wish to be called extremists; we often prefer to appear like each other in many respects. For rating scales this means that we might wish to avoid the two extreme poles at each end of the continuum of the rating scale, reducing the number of positions in the scale to a choice of three (in a five-point scale). That means that in fact there could be very little choice for us. The way round this is to create a larger scale than a five-point scale, for example a seven-point scale. To go beyond a seven-point scale is to invite a degree of detail and precision which might be inappropriate for the item in question, particularly if the argument set out above is accepted, viz. that one respondent's scale point three might be another's scale point four.

• On the scales so far there have been mid-points; on the five-point scale it is category three, and on the seven-point scale it is category four. The use of an odd number of points on a scale enables this to occur. However, choosing an even number of scale points, for example a six-point scale, might require a decision on rating to be indicated.

For example, suppose a new staffing structure has been introduced into a school and the headteacher is seeking some guidance on its effectiveness. A six-point rating scale might ask respondents to indicate their response to the statement:

The new staffing structure in the school has enabled teamwork to be managed within a clear model of line management.

(Circle one number)

                1  2  3  4  5  6
strongly agree  —  —  —  —  —  —  strongly disagree

Let us say that one member of staff circled 1, eight staff circled 2, twelve staff circled 3, nine staff circled 4, two staff circled 5, and seven staff circled 6. There being no mid-point on this continuum, the researcher could infer that those respondents who circled 1, 2 or 3 were in some measure of agreement, whilst those respondents who circled 4, 5 or 6 were in some measure of disagreement. That would be very useful for, say, a headteacher in publicly displaying agreement, there being twenty-one staff (1+8+12) agreeing with the statement and eighteen (9+2+7) displaying a measure of disagreement. However, one could point out that the measure of 'strongly disagree' attracted seven staff—a very strong feeling—which was not true for the 'strongly agree' category, which only attracted one member of staff. The extremity of the voting has been lost in a crude aggregation.

Further, if the researcher were to aggregate thescoring around the two mid-point categories (3and 4) there would be twenty-one members ofstaff represented, leaving nine (1+8) from catego-ries 1 and 2 and nine (2+7) from categories 5 and6; adding together categories 1, 2, 5 and 6, a to-tal of 18 is reached, which is less than the twenty-one total of the two categories 3 and 4. It seemson this scenario that it is far from clear that therewas agreement with the statement from the staff;


indeed, taking the high incidence of 'strongly disagree', it could be argued that those staff who were perhaps ambivalent (categories 3 and 4), coupled with those who registered a 'strongly disagree', indicate not agreement but disagreement with the statement.
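To make the arithmetic of these rival aggregations concrete, here is a minimal sketch in Python; the counts are those of the hypothetical staffing example above, and the variable names are ours rather than part of any survey package.

from collections import Counter

# Hypothetical ratings from the staffing example: 1 = strongly agree ... 6 = strongly disagree
ratings = [1]*1 + [2]*8 + [3]*12 + [4]*9 + [5]*2 + [6]*7

tally = Counter(ratings)
agree = sum(tally[r] for r in (1, 2, 3))           # 21 staff in some measure of agreement
disagree = sum(tally[r] for r in (4, 5, 6))        # 18 staff in some measure of disagreement
mid = tally[3] + tally[4]                          # 21 staff around the two mid-point categories
outer = tally[1] + tally[2] + tally[5] + tally[6]  # 18 staff at the outer categories

print(agree, disagree)     # 21 18: the crude aggregation suggests agreement
print(mid, outer)          # 21 18: mid-point clustering outweighs the outer categories
print(tally[1], tally[6])  # 1 7: the extremity hidden by the crude aggregation

The same thirty-nine responses support opposite readings depending on how the categories are grouped, which is precisely the interpretive danger noted above.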

The interpretation of data has to be handled very carefully; ordering them to suit a researcher's own purposes might be very alluring but illegitimate. The golden rule here is that crude data can only yield crude interpretation; subtle statistics require subtle data. The interpretation of data must not distort the data unfairly.

It has been suggested that the attraction of rating scales is that they provide more opportunity than dichotomous questions for rendering data more sensitive and responsive to respondents. This makes rating scales particularly useful for tapping attitudes, perceptions and opinions of respondents. The need for a pilot to devise and refine categories, making them exhaustive and discrete, has been suggested as a necessary part of this type of data collection.

Questionnaires that are going to yield numerical or word-based data can be analyzed using computer programmes (for example SPSS, SphinxSurvey or Ethnograph respectively). If the researcher intends to process the data using a computer package it is essential that the layout and coding system of the questionnaire are appropriate for the computer package. Instructions for layout in order to facilitate data entry are contained in the manuals that accompany such packages.
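As a minimal illustration of what a machine-readable layout implies, the sketch below uses Python and the pandas library rather than any of the packages named above; the file name, column names and coding conventions are hypothetical.

import pandas as pd

# Hypothetical flat file: one row per respondent, one precoded column per item,
# e.g. sex coded 0/1 and q1 a 1-6 rating, with cells left blank for missing answers.
df = pd.read_csv("responses.csv")

print(df["q1"].value_counts().sort_index())  # frequency table for rating item q1
print(df.isna().sum())                       # count of missing answers per item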

Rating scales are more sensitive instruments than dichotomous scales. Nevertheless they are limited in their usefulness to researchers by their fixity of response, caused by the need to select from a given choice. A questionnaire might be tailored even more to respondents by including open-ended questions to which respondents can reply in their own terms and with their own opinions, and these we now consider.

Open-ended questions

The open-ended question is a very attractive device for smaller scale research or for those sections of a questionnaire that invite an honest, personal comment from the respondents in addition to ticking numbers and boxes. The questionnaire simply puts the open-ended questions and leaves a space (or draws lines) for a free response. It is the open-ended responses that might contain the 'gems' of information that otherwise might not have been caught in the questionnaire. Further, it puts the responsibility for and ownership of the data much more firmly into the respondents' hands.

This is not to say that the open-ended question might well not frame the answer, just as the stem of a rating scale question might frame the response given. However, an open-ended question can catch the authenticity, richness, depth of response, honesty and candour which, as is argued elsewhere in this book, are the hallmarks of qualitative data.

Oppenheim (1992:56–7) suggests that a sentence-completion item is a useful adjunct to an open-ended question, for example:

Please complete the following sentence in your own words:

An effective teacher…

or

The main things that I find annoying with disruptive students are…

Open-endedness also carries problems of data handling. For example, if one tries to convert opinions into numbers (e.g. so many people indicated some degree of satisfaction with the new principal's management plan), then it could be argued that the questionnaire should have used rating scales in the first place. Further, it might well be that the researcher is in danger of violating one principle of word-based data, which is that they are not validly susceptible to aggregation, i.e. that it is trying to bring to word-based data the principles of numerical data, borrowing from one paradigm (quantitative methodology) to inform another paradigm (qualitative methodology).

Further, if a genuinely open-ended question is being asked, it is perhaps unlikely that responses will bear such a degree of similarity to each other as to enable them to be aggregated too tightly. Open-ended questions make it difficult for the researcher to make comparisons between respondents, as there may be little in common to compare. Moreover, to complete an open-ended questionnaire takes much longer than placing a tick in a rating scale response box; not only will time be a constraint here, but there is an assumption that respondents will be sufficiently or equally capable of articulating their thoughts and committing them to paper.

Despite these cautions, the space provided for an open-ended response is a window of opportunity for the respondent to shed light on an issue or course. Thus, an open-ended questionnaire has much to recommend it.

Asking sensitive questions

Sudman and Bradburn (1982: Chapter 3) draw attention to the important issue of including sensitive items in a questionnaire. Whilst the anonymity of a questionnaire and, frequently, the lack of face-to-face contact between the researcher and the respondents in a questionnaire might facilitate responses to sensitive material, the issues of sensitivity and threat cannot be avoided, as they might lead to under-reporting and over-reporting by participants. Sudman and Bradburn (1982:55–6) identify several important considerations in addressing potentially threatening or sensitive issues, for example socially undesirable behaviour (e.g. drug abuse, sexual offences, violent behaviour, criminality, illnesses, employment and unemployment, physical features, sexual activity, behaviour and sexuality, gambling, drinking, family details, political beliefs, social taboos). They suggest that:

• Open rather than closed questions might be more suitable to elicit information about socially undesirable behaviour, particularly frequencies.

• Long rather than short questions might be more suitable for eliciting information about socially undesirable behaviour, particularly frequencies.

• Using familiar words might increase the number of reported frequencies of socially undesirable behaviour.

• Using data gathered from informants, where possible, can enhance the likelihood of obtaining reports of threatening behaviour.

• Deliberately loading the question so that overstatements of socially desirable behaviour and understatements of socially undesirable behaviour are reduced might be a useful means of eliciting information.

• With regard to socially undesirable behaviour, it might be advisable, firstly, to ask whether the respondent has engaged in that behaviour previously, and then move to asking about his or her current behaviour. By contrast, when asking about socially acceptable behaviour the reverse might be true, i.e. asking about current behaviour before asking about everyday behaviour.

• In order to defuse threat, it might be useful to locate the sensitive topic within a discussion of other more or less sensitive matters, in order to suggest to respondents that this issue might not be too important.

• Use alternative ways of asking standard questions, for example sorting cards, putting questions in sealed envelopes, or repeating questions over time (this has to be handled sensitively, so that respondents do not feel that they are being 'checked'), in order to increase reliability.

• Ask respondents to keep diaries in order to increase validity and reliability.

• At the end of an interview ask respondents their views on the sensitivity of the topics that have been discussed.

• If possible find ways of validating the data.

Indeed the authors suggest (ibid.: 86) that, as the questions become more threatening and sensitive, it is wise to expect greater bias and unreliability. They draw attention to the fact (ibid.: 208) that several nominal, demographic details might be considered threatening by respondents. This has implications for their location within the questionnaire (discussed below). The issue here is that sensitivity and threat are to be viewed through the eyes of respondents rather than of the questionnaire designer; what might appear innocuous to the researcher might be highly sensitive or offensive to the respondent.

Sequencing the questions

The order of the questions in a questionnaire is, to some extent, a function of the target sample (e.g. how they will react to certain questions), the purposes of the questionnaire (e.g. to gather facts or opinions), the sensitivity of the research (e.g. how personal and potentially disturbing the issues are that will be addressed), and the overall balance of the questionnaire (e.g. where best to place sensitive questions in relation to less threatening questions, and how many of each to include).

The ordering of the questionnaire is important, for early questions may set the tone of, or the mind-set of the respondent to, the later questions. For example, a questionnaire that makes a respondent irritated or angry early on is unlikely to have managed to enable that respondent's irritation or anger to subside by the end of the questionnaire. As Oppenheim remarks (1992:121), one covert purpose of each question is to ensure that the respondent will continue to co-operate.

Further, a respondent might 'read the signs' in the questionnaire, seeking similarities and resonances between statements, so that responses to early statements will affect responses to later statements and vice versa. Whilst multiple items may act as a cross-check, this very process might be irritating for some respondents.

The key principle, perhaps, is to avoid creating a mood-set or a mind-set early on in the questionnaire. For this reason it is important to commence the questionnaire with non-threatening questions that respondents can readily answer. After that it might be possible to move towards more personalized questions.

Completing a questionnaire can be seen as a learning process in which respondents become more at home with the task as they proceed. Initial questions should therefore be simple, have high interest value, and encourage participation. This will build up the confidence and motivation of the respondent. The middle section of the questionnaire should contain the difficult questions; the last few questions should be of high interest in order to encourage respondents to return the completed schedule.

A common sequence of a questionnaire is:

1 to commence with unthreatening factual questions (that, perhaps, will give the researcher some nominal data about the sample, e.g. age group, sex, occupation, years in post, qualifications etc.);

2 to move to closed questions (e.g. dichotomous, multiple choice, rating scales) about given statements or questions, eliciting responses that require opinions, attitudes, perceptions, views;

3 to move to more open-ended questions (or, maybe, to intersperse these with more closed questions) that seek responses on opinions, attitudes, perceptions and views, together with reasons for the responses given. These responses and reasons might include sensitive or more personal data.

The move is from objective facts to subjective attitudes and opinions, through justifications, and on to sensitive, personalized data. Clearly the ordering is neither as discrete nor as straightforward as this. For example, an apparently innocuous question about age might be offensive to some respondents, a question about income is unlikely to go down well with somebody who has just become unemployed, and a question about religious belief might be seen as an unwarranted intrusion into private matters.

The issue here is that the questionnaire designer has to anticipate the sensitivity of the topics in terms of the respondents, and this has a large socio-cultural dimension. What is being argued here is that the logical ordering of a questionnaire has to be mediated by its psychological ordering. The instrument has to be viewed through the eyes of the respondent as well as the designer.

In addition to the overall sequencing of the questionnaire, Oppenheim (1992: Chapter 7) suggests that the sequence within sections of the questionnaire is important. He indicates that the questionnaire designer can use funnels and filters within the questionnaire. A funnelling process moves from the general to the specific, asking questions about the general context or issues and then moving toward specific points within that. A filter is used to include and exclude certain respondents, i.e. to decide if certain questions are relevant or irrelevant to them, and to instruct respondents about how to proceed (e.g. which items to jump to or proceed to). For example, if a respondent indicates a 'yes' or a 'no' to a certain question, then this might exempt her/him from certain other questions in that section or subsequently.
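A filter is, in effect, a routing rule, and the routing can be stated mechanically. The sketch below is a minimal illustration in Python; the question numbers echo the hypothetical 'question 10/question 15' instruction discussed under layout below, and the function is ours, not drawn from any survey package.

def next_question(current, answer):
    # Filter: respondents answering 'yes' to question 10 jump to question 15;
    # everyone else continues with the next question in sequence.
    if current == 10 and answer == "yes":
        return 15
    return current + 1

print(next_question(10, "yes"))  # 15
print(next_question(10, "no"))   # 11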

Questionnaires containing few verbal items

The discussion so far has assumed that questionnaires are entirely word-based. This might be off-putting for many respondents, particularly children. In these circumstances a questionnaire might include visual information and ask participants to respond to this (e.g. pictures, cartoons, diagrams), or might include some projective visual techniques (e.g. to draw a picture or diagram, to join two related pictures with a line, to write the words of what someone is saying or thinking in a 'bubble' picture, to tell the story of a sequence of pictures together with personal reactions to it). The issue here is that, in tailoring the format of the questionnaire to the characteristics of the sample, a very wide embrace might be necessary to take in non-word-based techniques. This is not only a matter of appeal to respondents but, perhaps more significantly, a matter of the accessibility of the questionnaire to the respondents, i.e. a matter of reliability and validity.

The layout of the questionnaire

The appearance of the questionnaire is vitally important. It must look easy, attractive and interesting rather than complicated, unclear, forbidding and boring. A compressed layout is uninviting and clutters everything together; a larger questionnaire with plenty of space for questions and answers is more encouraging to respondents. Verma and Mallick (1999:120) also suggest the use of high quality paper if funding permits.

It is important, perhaps, for respondents to be introduced to the purposes of each section of a questionnaire, so that they can become involved in it and maybe identify with it. If space permits, it is useful to tell the respondent the purposes and foci of the sections/of the questionnaire, and the reasons for the inclusion of the items.

Clarity of wording and simplicity of design are essential. Clear instructions should guide respondents: 'Put a tick', for example, invites participation, whereas complicated instructions and complex procedures intimidate respondents. Putting ticks in boxes by way of answering a questionnaire is familiar to most respondents, whereas requests to circle precoded numbers at the right-hand side of the questionnaire can be a source of confusion and error. In some cases it might also be useful to include an example of how to fill in the questionnaire (e.g. ticking a box, circling a statement), though, clearly, care must be exercised to avoid leading the respondents to answer questions in a particular way by dint of the example provided (e.g. by suggesting what might be a desired answer to the subsequent questions). Verma and Mallick (1999:121) suggest the use of emboldening to draw the respondent's attention to significant features.

Ensure that short, clear instructions accompany each section of the questionnaire. Repeating instructions as often as necessary is good practice in a postal questionnaire. Since everything hinges on respondents knowing exactly what is required of them, clear, unambiguous instructions, boldly and attractively displayed, are essential.

Clarity and presentation also impact on the numbering of the questions. For example, a four-page questionnaire might contain sixty questions, broken down into four sections. It might be off-putting to respondents to number each question (1–60), as the list will seem interminably long, whereas numbering each section (1–4) makes the questionnaire look manageable.

Hence it is useful, in the interests of clarity and logic, to break down the questionnaire into subsections with section headings. This will also indicate the overall logic and coherence of the questionnaire to the respondents, enabling them to 'find their way' through the questionnaire. It might be useful to preface each subsection with a brief introduction that tells them the purpose of that section.

The practice of sectionalizing and sublettering questions (e.g. Q9 (a) (b) (c)…) is a useful technique for grouping together questions to do with a specific issue. It is also a way of making the questionnaire look smaller than it actually is!

This previous point also requires the questionnaire designer to make it clear if respondents are exempted from completing certain questions or sections of the questionnaire (discussed earlier in the section on filters). If so, then it is vital that the sections or questions are numbered so that the respondent knows exactly where to move to next. Here the instruction might be, for example: 'if you have answered "yes" to question 10 please go to question 15, otherwise continue with question 11', or: 'if you are the school principal please answer this section, otherwise proceed to section three'.

Arrange the contents of the questionnaire in such a way as to maximize co-operation. For example, include questions that are likely to be of general interest. Make sure that questions which appear early in the format do not suggest to respondents that the inquiry is not intended for them. Intersperse attitude questions throughout the schedule to allow respondents to air their views rather than merely describe their behaviour. Such questions relieve boredom and frustration as well as providing valuable information in the process.

Coloured pages can help to clarify the overall structure of the questionnaire, and the use of different colours for instructions can assist respondents.

It is important to include in the questionnaire, perhaps at the beginning, assurances of confidentiality, anonymity and non-traceability, for example by indicating that respondents need not give their name, that the data will be aggregated, and that individuals will not be able to be identified through the use of categories or details of their location etc. (i.e. that it will not be possible to put together a traceable picture of the respondents through the compiling of nominal, descriptive data about them). In some cases, however, the questionnaire might ask respondents to put their name so that they can be traced for follow-up interviews in the research (Verma and Mallick, 1999:121); here the guarantee of eventual anonymity and non-traceability will still need to be given.

Finally, a brief note at the very end of the questionnaire can: (a) ask respondents to check that no answer has been inadvertently missed out; (b) solicit an early return of the completed schedule; (c) thank respondents for their participation and co-operation, and offer to send a short abstract of the major findings when the analysis is completed.

Covering letters/sheets and follow-up letters

The purpose of the covering letter/sheet is to indicate the aim of the research, to convey to respondents its importance, to assure them of confidentiality, and to encourage their replies. The covering letter/sheet should:

• provide a title to the research;
• introduce the researcher, her/his name, address, organization, contact telephone/fax/e-mail address, together with an invitation to feel free to contact the researcher for further clarification or details;
• indicate the purposes of the research;
• indicate the importance and benefits of the research;
• indicate any professional backing, endorsement, or sponsorship of, or permission for, the research (e.g. professional associations, government departments);
• set out how to return the questionnaire (e.g. in the accompanying stamped, addressed envelope, in a collection box in a particular institution, to a named person; whether the questionnaire will be collected—and when, where and by whom);
• indicate the address to which to return the questionnaire;
• indicate what to do if questions or uncertainties arise;
• indicate a return-by date;
• indicate any incentives for completing the questionnaire;
• provide assurances of confidentiality, anonymity and non-traceability;
• thank respondents in advance for their co-operation.

Verma and Mallick (1999:122) also suggest that, where possible, it is useful to personalize the letter, avoiding 'Dear colleague', 'Dear Madam/Ms/Sir' etc., and replacing these with exact names.

With these intentions in mind, the following practices are to be recommended:

• The appeal in the covering letter must be tailored to suit the particular audience. Thus, a survey of teachers might stress the importance of the study to the profession as a whole.

• Neither the use of prestigious signatories, nor appeals to altruism, nor the addition of handwritten postscripts affects response levels to postal questionnaires.

• The name of the sponsor or the organization conducting the survey should appear on the letterhead as well as in the body of the covering letter.

• A direct reference should be made to the confidentiality of respondents' answers, and the purposes of any serial numbers and codings should be explained.

• A pre-survey letter advising respondents of the forthcoming questionnaire has been shown to have a substantial effect on response rates.

• A short covering letter is most effective; aim at no more than one page.

Piloting the questionnaire

It bears repeating that the wording of questionnaires is of paramount importance and that pre-testing is crucial to their success. A pilot has several functions, principally to increase the reliability, validity and practicability of the questionnaire (Oppenheim, 1992; Morrison, 1993; Wilson and McLean, 1994:47); it thus serves:

• to check the clarity of the questionnaire items, instructions and layout;
• to gain feedback on the validity of the questionnaire items, the operationalization of the constructs and the purposes of the research;
• to eliminate ambiguities or difficulties in wording;
• to gain feedback on the type of question and its format (e.g. rating scale, multiple choice, open, closed etc.);
• to gain feedback on response categories for closed questions, and on the appropriateness of specific questions or stems of questions;
• to gain feedback on the attractiveness and appearance of the questionnaire;
• to gain feedback on the layout, sectionalizing, numbering and itemization of the questionnaire;
• to check the time taken to complete the questionnaire;
• to check whether the questionnaire is too long or too short, too easy or too difficult, too unengaging, too threatening, too intrusive, too offensive;
• to generate categories from open-ended responses to use as categories for closed response-modes (e.g. rating scale items);
• to identify redundant questions (e.g. those questions which consistently gain a total 'yes' or 'no' response (Youngman, 1984:172)), i.e. those questions with little discriminability;
• to identify commonly misunderstood or non-completed items (e.g. by studying common patterns of unexpected response and non-response (Verma and Mallick, 1999:120));
• to try out the coding/classification system for data analysis.

In short, as Oppenheim (1992:48) remarks, everything about the questionnaire should be piloted; nothing should be excluded, not even the typeface or the quality of the paper!

Practical considerations in questionnaire design

Taking the issues discussed so far in questionnaire design, a range of practical implications for designing a questionnaire can be highlighted:

• Operationalize the purposes of the questionnaire carefully.
• Decide on the most appropriate type of question—dichotomous, multiple choice, rank orderings, rating scales, closed, open.
• Ensure that every issue has been explored exhaustively and comprehensively; decide on the content and explore it in depth and breadth.
• Ensure that the data acquired will answer the research questions.
• Ask, for ease of analysis (particularly of a large sample), more closed than open questions.
• Balance comprehensiveness and exhaustive coverage of issues with the demotivating factor of having respondents complete several pages of a questionnaire.
• Ask only one thing at a time in a question.
• Strive to be unambiguous and clear in the wording.
• Be simple, clear and brief wherever possible.
• Balance brevity with politeness (Oppenheim, 1992:122). It might be advantageous to replace a staccato phrase like 'marital status' with a gentler 'please indicate whether you are married, living with a partner, or single…' or 'I would be grateful if you would tell me if you are married, living with a partner, or single'.
• Ensure a balance of questions which ask for facts and opinions (this is especially true if statistical correlations and cross-tabulations are required).
• Avoid leading questions.
• Try to avoid threatening questions.
• Do not assume that respondents know the answer, or have information to answer the questions, or will always tell the truth (wittingly or not). Therefore include 'don't know', 'not applicable', 'unsure', 'neither agree nor disagree' and 'not relevant' categories.
• Avoid making the questions too hard.
• Consider the readability levels of the questionnaire and the reading and writing abilities of the respondents (which may lead the researcher to conduct the questionnaire as a structured interview).
• Put sensitive questions later in the questionnaire in order to avoid creating a mental set in the mind of respondents, but not so late in the questionnaire that boredom and lack of concentration have occurred.
• Be very clear on the layout of the questionnaire so that it is clear and attractive (this is particularly the case if a computer program is going to be used for data analysis).
• Avoid, where possible, splitting an item over more than one page, as the respondent may think that the item from the previous page is finished.
• Ensure that the respondent knows how to enter a response to each question, e.g. by underlining, circling, ticking, writing; provide the instructions for introducing, completing and returning (or collection of) the questionnaire (provide a stamped addressed envelope if it is to be a postal questionnaire).
• Pilot the questionnaire, using a group of respondents who are drawn from the possible sample but who will not receive the final, refined version.


• Decide how to avoid falsification of responses (e.g. introduce a checking mechanism into the questionnaire: responses to another question on the same topic or issue).

• Be satisfied if you receive a 50 per cent response to the questionnaire; decide what you will do with missing data and what the significance of the missing data is (that might have implications for the strata of a stratified sample targeted in the questionnaire), and why the questionnaires have not been completed and returned (e.g. were the questions too threatening? was the questionnaire too long?—this might have been signalled in the pilot).

• Include a covering explanation, giving thanks for anticipated co-operation, indicating the purposes of the research, how anonymity and confidentiality will be addressed, who you are and what position you hold, and who will be party to the final report.

• If the questionnaire is going to be administered by someone other than the researcher, ensure that instructions for administration are provided and that they are clear.

A key issue that runs right through this lengthy list is for the reader to pay considerable attention to respondents: to see the questionnaire through their eyes, and to anticipate how they will regard it (e.g. from hostility to suspicion to apathy to grudging compliance to welcome; from easy to difficult; from motivating to boring; from straightforward to complex etc.).

Postal questionnaires

Frequently, the postal questionnaire is the best form of survey in an educational inquiry. Take, for example, the researcher intent on investigating the adoption and use made of a new curriculum series in secondary schools. An interview survey based upon some sampling of the population of schools would be both expensive and time-consuming. A postal questionnaire, on the other hand, would have several distinct advantages. Moreover, given the usual constraints over finance and resources, it might well prove the only viable way of carrying through such an inquiry.

What evidence we have about the advantages and disadvantages of postal surveys derives from settings other than educational. Many of the findings, however, have relevance to the educational researcher. Here, we focus upon some of the ways in which educational researchers can maximize the response level that they obtain when using postal surveys.

Research shows that a number of myths about postal questionnaires are not borne out by the evidence (see Hoinville and Jowell, 1978). Response levels to postal surveys are not invariably less than those obtained by interview procedures; frequently they equal, and in some cases surpass, those achieved in interviews. Nor does the questionnaire necessarily have to be short in order to obtain a satisfactory response level. With sophisticated respondents, for example, a short questionnaire might appear to trivialize complex issues with which they are familiar. Hoinville and Jowell identify a number of factors in securing a good response rate to a postal questionnaire.

Initial mailing

• Use good-quality envelopes, typed and addressed to a named person wherever possible.

• Use first-class—rapid—postage services, with stamped rather than franked envelopes wherever possible.

• Enclose a stamped envelope for the respondent's reply.

• In surveys of the general population, Thursday is the best day for mailing out; in surveys of organizations, Monday or Tuesday are recommended.

• Avoid at all costs a December survey (questionnaires will be lost in the welter of Christmas postings in the western world).


Follow-up letter

Of the four factors that Hoinville and Jowell discuss in connection with maximizing response levels, the follow-up letter has been shown to be the most productive. The following points should be borne in mind in preparing reminder letters:

• All of the rules that apply to the covering letter apply even more strongly to the follow-up letter.

• The follow-up should re-emphasize the importance of the study and the value of the respondents' participation.

• The use of the second person singular, the conveying of an air of disappointment at non-response, and some surprise at non-cooperation have been shown to be effective ploys.

• Nowhere should the follow-up give the impression that non-response is normal or that numerous non-responses have occurred in the particular study.

• The follow-up letter must be accompanied by a further copy of the questionnaire, together with a stamped addressed envelope for its return.

• Second and third reminder letters suffer from the law of diminishing returns, so how many follow-ups are recommended and what success rates do they achieve? It is difficult to generalize, but the following points are worth bearing in mind. A well-planned postal survey should obtain at least a 40 per cent response rate and, with the judicious use of reminders, a 70 per cent to 80 per cent response level should be possible. A preliminary pilot survey is invaluable in that it can indicate the general level of response to be expected. The main survey should generally achieve at least as high as, and normally a higher, level of return than the pilot inquiry. The Government Social Survey (now the Office of Population Censuses and Surveys) recommends the use of three reminders which, they say, can increase the original return by as much as 30 per cent in surveys of the general public. A typical pattern of responses to the three follow-ups is as follows:

Original despatch     40 per cent
First follow-up      +20 per cent
Second follow-up     +10 per cent
Third follow-up       +5 per cent
Total                 75 per cent

Incentives

An important factor in maximizing response rates is the use of incentives. Although such usage is comparatively rare in British surveys, it can substantially reduce non-response rates, particularly when the chosen incentives accompany the initial mailing rather than being mailed subsequently as rewards for the return of completed schedules. The explanation of the effectiveness of this particular ploy appears to lie in the sense of obligation that is created in the recipient. Care is needed in selecting the most appropriate type of incentive. It should clearly be seen as a token rather than a payment for the respondent's efforts and, according to Hoinville and Jowell, should be as neutral as possible. In this respect, they suggest that books of postage stamps or ballpoint pens are cheap, easily packaged in the questionnaire envelopes, and appropriate to the task required of the respondent.

The preparation of a flow chart can help the researcher to plan the timing and the sequencing of the various parts of a postal survey. One such flow chart suggested by Hoinville and Jowell (1978) is shown in Box 14.2. The researcher might wish to add a chronological chart alongside it to help plan the exact timing of the events shown here.

Validity

Our discussion so far has concentrated on ways of increasing the response rate of postal questionnaires; we have said nothing yet about the validity of this particular technique.

Validity of postal questionnaires can be seen from two viewpoints, according to Belson (1986): first, whether respondents who complete questionnaires do so accurately and, second, whether those who fail to return their questionnaires would have given the same distribution of answers as did the returnees.

The question of accuracy can be checked by means of the intensive interview method, a technique consisting of twelve principal tactics that include familiarization, temporal reconstruction, probing and challenging. The interested reader should consult Belson (1986:35–8).

The problem of non-response (the issue of 'volunteer bias', as Belson calls it) can, in part, be checked on and controlled for, particularly when the postal questionnaire is sent out on a continuous basis. It involves follow-up contact with non-respondents by means of interviewers trained to secure interviews with such people. A comparison is then made between the replies of respondents and non-respondents.

Box 14.2
A flow chart for the planning of a postal survey
Source Hoinville and Jowell, 1978

Processing questionnaire data

Let us assume that researchers have followed the advice we have given about the planning of postal questionnaires and have secured a high response rate to their surveys. Their task is now to reduce the mass of data they have obtained to a form suitable for analysis. 'Data reduction', as the process is called, generally consists of coding data in preparation for analysis—by hand in the case of small surveys, by computer when numbers are larger. First, however, prior to coding, the questionnaires have to be checked. This task is referred to as editing.

Editing questionnaires is intended to identify and eliminate errors made by respondents. (In addition to the clerical editing that we discuss in this section, editing checks are also performed by computer, e.g. SphinxSurvey, HyperRESEARCH, Results for Research™. For an account of computer-run structure checks and valid coding range checks, see also Hoinville and Jowell (1978) pp. 150–5.) Moser and Kalton (1977) point to three central tasks in editing:

1 Completeness A check is made that there is an answer to every question. In most surveys, interviewers are required to record an answer to every question (a 'not applicable' category always being available). Missing answers can sometimes be cross-checked from other sections of the survey. At worst, respondents can be contacted again to supply the missing information.

2 Accuracy As far as is possible a check is made that all questions are answered accurately. Inaccuracies arise out of carelessness on the part of either interviewers or respondents. Sometimes a deliberate attempt is made to mislead. A tick in the wrong box, a ring round the wrong code, an error in simple arithmetic—all can reduce the validity of the data unless they are picked up in the editing process.

3 Uniformity A check is made that interviewers have interpreted instructions and questions uniformly. Sometimes the failure to give explicit instructions over the interpretation of respondents' replies leads to interviewers recording the same answer in a variety of answer codes instead of one. A check on uniformity can help eradicate this source of error.
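By way of illustration, here is a minimal sketch in Python of how the completeness check might be automated; the item names, the respondent records and the use of None to mark a missing answer are hypothetical, not drawn from any of the packages mentioned in this chapter.

def completeness_report(responses, items):
    # Return, for each respondent, the list of items left unanswered.
    missing = {}
    for respondent_id, answers in responses.items():
        gaps = [item for item in items if answers.get(item) is None]
        if gaps:
            missing[respondent_id] = gaps
    return missing

items = ["age_group", "sex", "years_in_post", "q1", "q2"]
responses = {
    1: {"age_group": 3, "sex": 0, "years_in_post": 2, "q1": 4, "q2": 5},
    2: {"age_group": 2, "sex": 1, "years_in_post": None, "q1": 3, "q2": None},
}
print(completeness_report(responses, items))  # {2: ['years_in_post', 'q2']}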

The primary task of data reduction is coding, that is, assigning a code number to each answer to a survey question. Of course, not all answers to survey questions can be reduced to code numbers. Many open-ended questions, for example, are not reducible in this way for computer analysis. Coding can be built into the construction of the questionnaire itself. In this case, we talk of precoded answers. Where coding is developed after the questionnaire has been administered and answered by respondents, we refer to postcoded answers. Precoding is appropriate for closed-ended questions—male 1, female 0, for example; or single 0, married 1, separated 2, divorced 3. For questions such as those whose answer categories are known in advance, a coding frame is generally developed before the interviewing commences so that it can be printed into the questionnaire itself. For open-ended questions (Why did you choose this particular in-service course rather than XYZ?), a coding frame has to be devised after the completion of the questionnaire. This is best done by taking a random sample of the questionnaires (10 per cent or more, time permitting) and generating a frequency tally of the range of responses as a preliminary to coding classification. Having devised the coding frame, the researcher can make a further check on its validity by using it to code up a further sample of the questionnaires. It is vital to get coding frames right from the outset—extending them or making alterations at a later point in the study is both expensive and wearisome.
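A minimal sketch of devising a postcoding frame along these lines follows; the open-ended answers, the sampling fraction and the category labels are hypothetical illustrations.

import random
from collections import Counter

# Hypothetical open-ended answers to: 'Why did you choose this particular
# in-service course rather than XYZ?'
answers = [
    "it was close to my school", "convenient location",
    "the topic matched my teaching", "relevant to my subject",
    "my head of department recommended it", "a colleague suggested it",
]

# Take a random sample of the returned questionnaires (10 per cent or more,
# time permitting; here half, given so small a set) and tally the range of
# responses as a preliminary to the coding classification.
sample = random.sample(answers, k=len(answers) // 2)
print(Counter(sample))

# A coding frame devised from that tally; the categories and code numbers are
# hypothetical, and its validity would be checked by coding a further sample.
coding_frame = {"location/convenience": 1, "subject relevance": 2, "personal recommendation": 3}
print(coding_frame)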


There are several computer packages that will process questionnaire survey data. At the time of writing one such is SphinxSurvey. This package, like others of its type, assists researchers in the design, administration and processing of questionnaires, either for paper-based or for on-screen administration. Responses can be entered rapidly, and data can be examined automatically, producing graphs and tables, as well as a wide range of statistics. (The Plus2 edition offers lexical analysis of open-ended text, and the Lexica Edition has additional functions for qualitative data analysis.) A website for previewing a demonstration of this program can be found at http://www.scolari.co.uk and is typical of several of its kind.

Whilst coding is usually undertaken by the researcher, Sudman and Bradburn (1982:149) also make the case for coding by the respondents themselves, to increase validity. This is particularly valuable for open-ended questionnaire items, though, of course, it assumes not only the willingness of respondents to become involved post hoc but also that the researcher can identify and trace the respondents, which, as was indicated earlier, is an ethical matter.

15 Interviews

Introduction

The use of the interview in research marks a move away from seeing human subjects as simply manipulable, and data as somehow external to individuals, and towards regarding knowledge as generated between humans, often through conversations (Kvale, 1996:11). To regard an interview, as Kvale (ibid.: 14) remarks, as an 'inter view', an interchange of views between two or more people on a topic of mutual interest, is to see the centrality of human interaction for knowledge production, and to emphasize the social situatedness of research data. As we suggested in Chapter 2, knowledge should be seen as constructed between participants, generating data rather than capta (Laing, 1967:53). As such, the interview is neither exclusively subjective nor objective; it is intersubjective (ibid.: 66). Interviews enable participants—be they interviewers or interviewees—to discuss their interpretations of the world in which they live, and to express how they regard situations from their own point of view. In these senses the interview is not simply concerned with collecting data about life: it is part of life itself; its human embeddedness is inescapable.

Conceptions of the interview

Kitwood lucidly contrasts three conceptions of the interview. The first conception is that of a potential means of pure information transfer and collection. A second conception of the interview is that of a transaction which inevitably has bias, which is to be recognized and controlled. According to this viewpoint, Kitwood explains that 'each participant in an interview will define the situation in a particular way. This fact can be best handled by building controls into the research design, for example by having a range of interviewers with different biases.' The interview is best understood in terms of a theory of motivation which recognizes a range of non-rational factors governing human behaviour, like emotions, unconscious needs and interpersonal influences. Kitwood points out that both these views of the interview regard the inherent features of interpersonal transactions as if they were 'potential obstacles to sound research, and therefore to be removed, controlled, or at least harnessed in some way'.

The third conception of the interview sees it as an encounter necessarily sharing many of the features of everyday life (see, for example, Box 15.1). What is required, according to this view, is not a technique for dealing with bias, but a theory of everyday life that takes account of the relevant features of interviews. These may include role-playing, stereotyping, perception and understanding. One of the strongest advocates of this viewpoint is Cicourel (1964), who lists five unavoidable features of the interview situation that would normally be regarded as problematic:

1 There are many factors which inevitably differ from one interview to another, such as mutual trust, social distance and the interviewer's control.

2 The respondent may well feel uneasy and adopt avoidance tactics if the questioning is too deep.


3 Both interviewer and respondent are bound to hold back part of what it is in their power to state.

4 Many of the meanings which are clear to one will be relatively opaque to the other, even when the intention is genuine communication.

5 It is impossible, just as in everyday life, to bring every aspect of the encounter within rational control.

The message here is that no matter how hard an interviewer may try to be systematic and objective, the constraints of everyday life will be a part of whatever interpersonal transactions she initiates. Barker and Johnson (1998:230) argue that the interview is a particular medium for enacting or displaying people's knowledge of cultural forms, as questions, far from being neutral, are couched in the cultural repertoires of all participants, indicating how people make sense of their social world and of each other.1

Purposes of the interview

The purposes of the interview are many and varied, for example:

• to evaluate or assess a person in some respect;
• to select or promote an employee;
• to effect therapeutic change, as in the psychiatric interview;
• to test or develop hypotheses;
• to gather data, as in surveys or experimental situations;
• to sample respondents' opinions, as in doorstep interviews.

Although in each of these situations the respective roles of the interviewer and interviewee may vary and the motives for taking part may differ, a common denominator is the transaction that takes place between seeking information on the part of one and supplying information on the part of the other.

The research interview may serve three purposes. First, it may be used as the principal means of gathering information having direct bearing on the research objectives. As Tuckman describes it, 'By providing access to what is "inside a person's head", [it] makes it possible to measure what a person knows (knowledge or information), what a person likes or dislikes (values and preferences), and what a person thinks (attitudes and beliefs)' (Tuckman, 1972). Second, it may be used to test hypotheses or to suggest new ones; or as an explanatory device to help identify variables and relationships. And third, the interview may be used in conjunction with other methods in a research undertaking. In this connection, Kerlinger (1970) suggests that it might be used to follow up unexpected results, for example, or to validate other methods, or to go deeper into the motivations of respondents and their reasons for responding as they do.

We limit ourselves here to the use of the interview as a specific research tool. Interviews in this sense range from the formal interview in which set questions are asked and the answers recorded on a standardized schedule; through less formal interviews in which the interviewer is free to modify the sequence of questions, change the wording, explain them or add to them; to the completely informal interview where the interviewer may have a number of key issues which she raises in conversational style. Beyond this point is located the non-directive interview in which the interviewer takes on a subordinate role.

Box 15.1
Attributes of ethnographers as interviewers

Trust There would have to be a relationship between the interviewer and interviewee that transcended the research, that promoted a bond of friendship, a feeling of togetherness and joint pursuit of a common mission rising above personal egos.

Curiosity There would have to be a desire to know, to learn people's views and perceptions of the facts, to hear their stories, discover their feelings. This is the motive force, and it has to be a burning one, that drives researchers to tackle and overcome the many difficulties involved in setting up and conducting successful interviews.

Naturalness As with observation one endeavours to be unobtrusive in order to witness events as they are, untainted by one's presence and actions, so in interviews the aim is to secure what is within the minds of interviewees, uncoloured and unaffected by the interviewer.

Source Adapted from Woods, 1986

The research interview has been defined as 'a two-person conversation initiated by the interviewer for the specific purpose of obtaining research-relevant information, and focused by him [sic] on content specified by research objectives of systematic description, prediction, or explanation' (Cannell and Kahn, 1968:527). It involves the gathering of data through direct verbal interaction between individuals. In this sense it differs from the questionnaire, where the respondent is required to record in some way her responses to set questions.

As the interview has some things in common with the self-administered questionnaire, it is frequently compared with it. Each has advantages over the other in certain respects. The advantages of the questionnaire, for instance, are: it tends to be more reliable; because it is anonymous, it encourages greater honesty; it is more economical than the interview in terms of time and money; and there is the possibility that it may be mailed. Its disadvantages, on the other hand, are: there is often too low a percentage of returns; the interviewer is able to answer questions concerning both the purpose of the interview and any misunderstandings experienced by the interviewee, for it sometimes happens in the case of the latter that the same questions have different meanings for different people; if only closed items are used, the questionnaire will be subject to the weaknesses already discussed; if only open items are used, respondents may be unwilling to write their answers for one reason or another; questionnaires present problems to people of limited literacy; and an interview can be conducted at an appropriate speed whereas questionnaires are often filled in hurriedly.

By way of interest, we illustrate the relative merits of the interview and the questionnaire in Box 15.2. It has been pointed out that the direct interaction of the interview is the source of both its advantages and disadvantages as a research technique (Borg, 1963). One advantage, for example, is that it allows for greater depth than is the case with other methods of data collection. A disadvantage, on the other hand, is that it is prone to subjectivity and bias on the part of the interviewer. Oppenheim (1992:81–2) suggests that interviews have a higher response rate than questionnaires because respondents become more involved and, hence, motivated; they enable more to be said about the research than is usually mentioned in a covering letter to a questionnaire, and they are better than questionnaires for handling more difficult and open-ended questions.

Box 15.2
Summary of relative merits of interview versus questionnaire

Consideration                               Interview                                   Questionnaire
1  Personal need to collect data            Requires interviewers                       Requires a secretary
2  Major expense                            Payment to interviewers                     Postage and printing
3  Opportunities for response-keying
   (personalization)                        Extensive                                   Limited
4  Opportunities for asking                 Extensive                                   Limited
5  Opportunities for probing                Possible                                    Difficult
6  Relative magnitude of data reduction     Great (because of coding)                   Mainly limited to rostering
7  Typically, the number of respondents
   who can be reached                       Limited                                     Extensive
8  Rate of return                           Good                                        Poor
9  Sources of error                         Interviewer, instrument, coding, sample     Limited to instrument and sample
10 Overall reliability                      Quite limited                               Fair
11 Emphasis on writing skill                Limited                                     Extensive

Source Tuckman, 1972

Types of interview

The number of types of interview given is frequently a function of the sources one reads! For example, LeCompte and Preissle (1993) give six types: (a) standardized interviews; (b) in-depth interviews; (c) ethnographic interviews; (d) elite interviews; (e) life history interviews; (f) focus groups. Bogdan and Biklen (1992) add to this: (g) semi-structured interviews; (h) group interviews. Lincoln and Guba (1985) add: (i) structured interviews; and Oppenheim (1992:65) adds to this: (j) exploratory interviews. Patton (1980:206) outlines four types: (k) informal conversational interviews; (l) interview guide approaches; (m) standardized open-ended interviews; (n) closed quantitative interviews. Patton sets these out clearly (Box 15.3).

How is the researcher to comprehend the range of these various types? Kvale (1996:126–7) sets the several types of interview along a series of continua, arguing that interviews differ in the openness of their purpose, their degree of structure, the extent to which they are exploratory or hypothesis-testing, whether they seek description or interpretation, and whether they are largely cognitive-focused or emotion-focused. A major difference lies in the degree of structure in the interview, which itself reflects the purposes of the interview, for example, to generate numbers of respondents' feelings about a given issue or to indicate unique, alternative feelings about a particular matter. Lincoln and Guba (1985:269) suggest that the structured interview is useful when the researcher is aware of what she does not know and therefore is in a position to frame questions that will supply the knowledge required, whereas the unstructured interview is useful when the researcher is not aware of what she does not know, and therefore relies on the respondents to tell her!

The issue here is one of 'fitness for purpose': the more one wishes to gain comparable data—across people, across sites—the more standardized and quantitative one's interview tends to become; the more one wishes to acquire unique, non-standardized, personalized information about how individuals view the world, the more one veers towards qualitative, open-ended, unstructured interviewing. Indeed this is true not simply of interviews but of their written counterpart, questionnaires. Oppenheim (1992:86) indicates that standardization should refer to stimulus equivalence, i.e. that every respondent should understand the interview question in the same way, rather than to replicating the exact wording, as some respondents might have difficulty with, or interpret very differently, and perhaps irrelevantly, particular questions. (He also adds that, as soon as the wording of a question is altered, however minimally, it becomes, in effect, a different question!)

Exploratory interviews (Oppenheim, 1992:65) are designed to be essentially heuristic and seek to develop hypotheses rather than to collect facts and numbers. As these frequently cover emotionally loaded topics, they require skill on the part of the interviewer in handling the interview situation, enabling respondents to talk freely and emotionally and with candour, richness, depth, authenticity and honesty about their experiences.

Morrison (1993:34–6) sets out five continua of different ways of conceptualizing interviews. At one end of the first continuum are numbers, statistics, objective facts, quantitative data; at the other end are transcripts of conversations, comments, subjective accounts, essentially word-based qualitative data.

At one end of the second continuum are closed questions, multiple choice questions where respondents have to select from a given, predetermined range of responses that particular response which most accurately represents what they wish to have recorded for them; at the other end of the continuum are open-ended questions which do not require the selection from a given range of responses—respondents can answer the questions in their own way and in their own words, i.e. the research is responsive to participants' own frames of reference.


At one end of the third continuum is a desire to measure responses, to compare one set of responses with another, to correlate responses, to see how many people said this, how many rated a particular item as such-and-such; at the other end of the continuum is a desire to capture the uniqueness of a particular situation, person, or programme—what makes it different from others, i.e. to record the quality of a situation or response.

Box 15.3
Strengths and weaknesses of different types of interview

1 Informal conversational interview
Characteristics: Questions emerge from the immediate context and are asked in the natural course of things; there is no predetermination of question topics or wording.
Strengths: Increases the salience and relevance of questions; interviews are built on and emerge from observations; the interview can be matched to individuals and circumstances.
Weaknesses: Different information collected from different people with different questions. Less systematic and comprehensive if certain questions don't arise 'naturally'. Data organization and analysis can be quite difficult.

2 Interview guide approach
Characteristics: Topics and issues to be covered are specified in advance, in outline form; the interviewer decides the sequence and wording of questions in the course of the interview.
Strengths: The outline increases the comprehensiveness of the data and makes data collection somewhat systematic for each respondent. Logical gaps in data can be anticipated and closed. Interviews remain fairly conversational and situational.
Weaknesses: Important and salient topics may be inadvertently omitted. Interviewer flexibility in sequencing and wording questions can result in substantially different responses, thus reducing the comparability of responses.

3 Standardized open-ended interviews
Characteristics: The exact wording and sequence of questions are determined in advance. All interviewees are asked the same basic questions in the same order.
Strengths: Respondents answer the same questions, thus increasing the comparability of responses; data are complete for each person on the topics addressed in the interview. Reduces interviewer effects and bias when several interviewers are used. Permits decision-makers to see and review the instrumentation used in the evaluation. Facilitates organization and analysis of the data.
Weaknesses: Little flexibility in relating the interview to particular individuals and circumstances; standardized wording of questions may constrain and limit the naturalness and relevance of questions and answers.

4 Closed quantitative interviews
Characteristics: Questions and response categories are determined in advance. Responses are fixed; the respondent chooses from among these fixed responses.
Strengths: Data analysis is simple; responses can be directly compared and easily aggregated; many short questions can be asked in a short time.
Weaknesses: Respondents must fit their experiences and feelings into the researcher's categories; may be perceived as impersonal, irrelevant and mechanistic. Can distort what respondents really mean or have experienced by so completely limiting their response choices.

Source Patton, 1980:206

At one end of the fourth continuum is a desire for formality and the precision of numbers and prescribed categories of response, where the researcher knows in advance what is being sought; at the other end is a more responsive, informal intent, where what is being sought is more uncertain and less pre-determined: the researcher goes into the situation and responds to what emerges.

At one end of the fifth continuum is the attempt to find regularities—of response, opinions etc.—in order to begin to make generalizations from the data, to describe what is happening; at the other end is the attempt to portray and catch uniqueness, the quality of a response, the complexity of a situation, to understand why respondents say what they say, and all of this in their own terms.

One can cluster the sets of poles of the five continua thus:

Quantitative approaches          Qualitative approaches
numbers                          words
predetermined, given             open-ended, responsive
measuring                        capturing uniqueness
short-term, intermittent         long-term, continuous
comparing                        capturing particularity
correlating                      valuing quality
frequencies                      individuality
formality                        informality
looking at regularities          looking for uniqueness
description                      explanation
objective facts                  subjective facts
describing                       interpreting
looking in from the outside      looking from the inside
structured                       unstructured
statistical                      ethnographic, illuminative

The left-hand column is much more formal and pre-planned to a high level of detail, whilst the right-hand column is far less formal and the fine detail only emerges once the researcher is in situ. Interviews in the left-hand column are front-loaded, that is, they require all the categories and multiple choice questions to be worked out in advance. This usually requires a pilot to try out the material and refine it. Once the detail of this planning is completed, the analysis of the data is relatively straightforward because the categories for analysing the data have been worked out in advance; hence data analysis is rapid.

The right-hand column is much more end-loaded, that is, it is quicker to commence and gather data because the categories do not have to be worked out in advance; they emerge once the data have been collected. However, in order to discover the issues that emerge and to organize the data presentation, the analysis of the data takes considerably longer.

Kvale (1996:30) sets out key characteristics of qualitative research interviews:

• Life world The topic of the qualitative research interview is the lived world of the subjects and their relation to it.
• Meaning The interview seeks to interpret the meaning of central themes in the life world of the subject. The interviewer registers and interprets the meaning of what is said as well as how it is said.
• Qualitative The interview seeks qualitative knowledge expressed in normal language; it does not aim at quantification.
• Descriptive The interview attempts to obtain open, nuanced descriptions of different aspects of the subjects' life worlds.
• Specificity Descriptions of specific situations and action sequences are elicited, not general opinions.
• Deliberate naiveté The interviewer exhibits an openness to new and unexpected phenomena, rather than having ready-made categories and schemes of interpretation.
• Focused The interview is focused on particular themes; it is neither strictly structured with standardized questions, nor entirely 'non-directive'.
• Ambiguity Interviewee statements can sometimes be ambiguous, reflecting contradictions in the world the subject lives in.
• Change The process of being interviewed may produce new insights and awareness, and the subject may in the course of the interview come to change his or her descriptions of and meanings about a theme.
• Sensitivity Different interviewers can produce different statements on the same themes, depending on their sensitivity to and knowledge of the interview topic.
• Interpersonal relations The knowledge obtained is produced through the interpersonal interaction in the interview.
• Positive experience A well carried-out research interview can be a rare and enriching experience for the interviewee, who may obtain new insights into his or her life situation.

There are four main kinds of interview that we discuss here that may be used specifically as research tools: (a) the structured interview; (b) the unstructured interview; (c) the non-directive interview; and (d) the focused interview. The structured interview is one in which the content and procedures are organized in advance. This means that the sequence and wording of the questions are determined by means of a schedule and the interviewer is left little freedom to make modifications. Where some leeway is granted her, it too is specified in advance. It is therefore characterized by being a closed situation. In contrast to this, the unstructured interview is an open situation, having greater flexibility and freedom. As Kerlinger (1970) notes, although the research purposes govern the questions asked, their content, sequence and wording are entirely in the hands of the interviewer. This does not mean, however, that the unstructured interview is a more casual affair, for in its own way it also has to be carefully planned.

The non-directive interview as a research technique derives from the therapeutic or psychiatric interview. The principal features of it are the minimal direction or control exhibited by the interviewer and the freedom the respondent has to express her subjective feelings as fully and as spontaneously as she chooses or is able. As Moser and Kalton (1977) put it:

The informant is encouraged to talk about the subject under investigation (usually himself) and the course of the interview is mainly guided by him. There are no set questions, and usually no predetermined framework for recorded answers. The interviewer confines himself to elucidating doubtful points, to rephrasing the respondent's answers and to probing generally. It is an approach especially to be recommended when complex attitudes are involved and when one's knowledge of them is still in a vague and unstructured form.
(Moser and Kalton, 1977)

The need to introduce rather more interviewer control into the non-directive situation led to the development of the focused interview. The distinctive feature of this type is that it focuses on a respondent's subjective responses to a known situation in which she has been involved and which has been analysed by the interviewer prior to the interview. She is thereby able to use the data from the interview to substantiate or reject previously formulated hypotheses. As Merton and Kendall (1946) explain:

In the usual depth interview, one can urge informants to reminisce on their experiences. In the focused interview, however, the interviewer can, when expedient, play a more active role: he can introduce more explicit verbal cues to the stimulus pattern or even represent it. In either case this usually activates a concrete report of responses by informants.
(Merton and Kendall, 1946)

We examine the non-directive interview and the focused interview in more detail later in the chapter.

Planning interview-based research procedures

Kvale (1996:88) sets out seven stages of an interview investigation that can be used to plan this type of research:


• Thematizing Formulate the purpose of an investigation and describe the concept of the topic to be investigated before the interviews start. The why and what of the investigation should be clarified before the question of how—method—is posed.
• Designing Plan the design of the study, taking into consideration all seven stages of the investigation, before the interviewing starts.
• Interviewing Conduct the interviews based on an interview guide and with a reflective approach to the knowledge sought and the interpersonal relation of the interview situation.
• Transcribing Prepare the interview material for analysis, which commonly includes a transcription from oral speech to written text.
• Analysing Decide, on the basis of the purpose and topic of the investigation, and on the nature of the interview material, which methods of analysis are appropriate for the interviews.
• Verifying Ascertain the generalizability, reliability and validity of the interview findings.
• Reporting Communicate the findings of the study and the methods applied in a form that lives up to scientific criteria, takes the ethical aspects of the investigation into consideration, and results in a readable product.

We use these to structure our comments here about the planning of interview-based research.

Thematizing

The preliminary stage of an interview study will be the point where the purpose of the research is decided. It may begin by outlining the theoretical basis of the study, its broad aims, its practical value and the reasons why the interview approach was chosen. There may then follow the translation of the general goals of the research into more detailed and specific objectives. This is the most important step, for only careful formulation of objectives at this point will eventually produce the right kind of data necessary for satisfactory answers to the research problem.

Designing

There follows the preparation of the interview schedule itself. This involves translating the research objectives into the questions that will make up the main body of the schedule. This needs to be done in such a way that the questions adequately reflect what it is the researcher is trying to find out. It is quite usual to begin this task by writing down the variables to be dealt with in the study. As one commentator says, 'The first step in constructing interview questions is to specify your variables by name. Your variables are what you are trying to measure. They tell you where to begin' (Tuckman, 1972).

Before the actual interview items are prepared, it is desirable to give some thought to the question format and the response mode. The choice of question format, for instance, depends on a consideration of one or more of the following factors:

• the objectives of the interview;
• the nature of the subject matter;
• whether the interviewer is dealing in facts, opinions or attitudes;
• whether specificity or depth is sought;
• the respondent's level of education;
• the kind of information she can be expected to have;
• whether or not her thought needs to be structured;
• some assessment of her motivational level;
• the extent of the interviewer's own insight into the respondent's situation;
• the kind of relationship the interviewer can expect to develop with the respondent.

Having given prior thought to these matters, the researcher is in a position to decide whether to use open and/or closed questions, direct and/or indirect questions, specific and/or non-specific questions, and so on.


Construction of schedules

Three kinds of items are used in the construction of schedules used in research interviews (see Kerlinger, 1970). First, 'fixed-alternative' items allow the respondent to choose from two or more alternatives. The most frequently used is the dichotomous item which offers two alternatives only: 'yes-no' or 'agree-disagree', for instance. Sometimes a third alternative such as 'undecided' or 'don't know' is also offered.

Example: Do you feel it is against the interests of a school to have to make public its examination results?
Yes
No
Don't know

Kerlinger has identified the chief advantages and disadvantages of fixed-alternative items. They have, for example, the advantage of achieving greater uniformity of measurement and therefore greater reliability; of making the respondents answer in a manner fitting the response category; and of being more easily coded.

Disadvantages include their superficiality; the possibility of irritating respondents who find none of the alternatives suitable; and the possibility of forcing responses that are inappropriate, either because the alternative chosen conceals ignorance on the part of the respondent or because she may choose an alternative that does not accurately represent the true facts. These weaknesses can be overcome, however, if the items are written with care, mixed with open-ended ones, and used in conjunction with probes on the part of the interviewer.

Second, 'open-ended items' have been succinctly defined by Kerlinger as 'those that supply a frame of reference for respondents' answers, but put a minimum of restraint on the answers and their expression' (Kerlinger, 1970). Other than the subject of the question, which is determined by the nature of the problem under investigation, there are no other restrictions on either the content or the manner of the interviewee's reply.

Example: What kind of television programmes do you most prefer to watch?

Open-ended questions have a number of advantages: they are flexible; they allow the interviewer to probe so that she may go into more depth if she chooses, or to clear up any misunderstandings; they enable the interviewer to test the limits of the respondent's knowledge; they encourage co-operation and help establish rapport; and they allow the interviewer to make a truer assessment of what the respondent really believes. Open-ended situations can also result in unexpected or unanticipated answers which may suggest hitherto unthought-of relationships or hypotheses. A particular kind of open-ended question is the 'funnel', to which reference has been made earlier. This starts, the reader will recall, with a broad question or statement and then narrows down to more specific ones. Kerlinger (1970) quotes an example from the study by Sears, Maccoby and Levin (1957):

All babies cry, of course. Some mothers feel that if you pick up a baby every time it cries, you will spoil it. Others think you should never let a baby cry for very long. How do you feel about this? What did you do about it? How about the middle of the night?
(Sears, Maccoby and Levin, 1957)

Third, the 'scale' is, as we have already seen, a set of verbal items to each of which the interviewee responds by indicating degrees of agreement or disagreement. The individual's response is thus located on a scale of fixed alternatives. The use of this technique along with open-ended questions is a comparatively recent development and means that scale scores can be checked against data elicited by the open-ended questions.

Example: Attendance at school after the age of 14 should be voluntary:
Strongly agree   Agree   Undecided   Disagree   Strongly disagree

It is possible to use one of a number of scales in this context: attitude scales, rank-order scales, rating scales, and so on. We touch upon this subject again subsequently.

Question formats

We now look at the kinds of questions and modes of response associated with interviewing. First, the matter of question format: how is a question to be phrased or organized? (see Wilson, 1996). Tuckman (1972) has listed four such formats that an interviewer may draw upon. Questions may, for example, take a direct or indirect form. Thus an interviewer could ask a teacher whether she likes teaching: this would be a direct question. Or else she could adopt an indirect approach by asking for the respondent's views on education in general and the ways schools function. From the answers proffered, the interviewer could make inferences about the teacher's opinions concerning her own job. Tuckman suggests that by making the purpose of questions less obvious, the indirect approach is more likely to produce frank and open responses.

There are also those kinds of questions which deal with either a general or specific issue. To ask a child what she thought of the teaching methods of the staff as a whole would be a general or non-specific question. To ask her what she thought of her teacher as a teacher would be a specific question. There is also the sequence of questions designated the funnel, in which the movement is from the general and non-specific to the more specific. Tuckman comments, 'Specific questions, like direct ones, may cause a respondent to become cautious or guarded and give less-than-honest answers. Non-specific questions may lead circuitously to the desired information but with less alarm by the respondents' (Tuckman, 1972).

A further distinction is that between questions inviting factual answers and those inviting opinions. Both fact and opinion questions can yield less than the truth, however: the former do not always produce factual answers; nor do the latter necessarily elicit honest opinions. In both instances, inaccuracy and bias may be minimized by careful structuring of the questions.

There are several ways of categorizing questions, for example (Spradley, 1979; Patton, 1980):

• descriptive questions;
• experience questions;
• behaviour questions;
• knowledge questions;
• construct-forming questions;
• contrast questions (asking respondents to contrast one thing with another);
• feeling questions;
• sensory questions;
• background questions;
• demographic questions.

These concern the substance of the question. Kvale (1996:133–5) adds to these what might be termed process questions, i.e. questions that:

• introduce a topic or interview;
• follow up on a topic or idea;
• probe for further information or response;
• ask respondents to specify and provide examples;
• directly ask for information;
• indirectly ask for information;
• interpret respondents' replies.

We may also note that an interviewee may be presented with either a question or a statement. In the case of the latter she will be asked for her response to it in one form or another.

Example question: Do you think homework should be compulsory for all children between 11 and 16?
Example statement: Homework should be compulsory for all children between 11 and 16 years old.
Agree   Disagree   Don't know

Response modes

If there are varied ways of asking questions, it follows there will be several ways in which they may be answered. It is to the different response modes that we now turn. In all, Tuckman (1972) lists seven such modes.

The first of these is the 'unstructured response'. This allows the respondent to give her answer in whatever way she chooses.


Example: Why did you not go to university?

A 'structured response', by contrast, would limit her in some way.

Example: Can you give me two reasons for not going to university?

Although the interviewer has little control over the unstructured response, it does ensure that the respondent has the freedom to give her own answer as fully as she chooses rather than being constrained in some way by the nature of the question. The chief disadvantage of the unstructured response concerns the matter of quantification. Data yielded in the unstructured response are more difficult to code and quantify than data in the structured response.

A 'fill-in response' mode requires the respondent to supply rather than choose a response, though the response is often limited to a word or phrase.

Example: What is your present occupation? or How long have you lived at your present address?

The difference between the fill-in response and the unstructured response is one of degree.

A 'tabular response' is similar to a fill-in response though more structured. It may demand words, figures or phrases, set out in the cells of a table, and it is thus a convenient and short-hand way of recording complex information.

A 'scaled response' is one structured by means of a series of gradations. The respondent is required to record her response to a given statement by selecting from a number of alternatives.

Example: What are your chances of reaching a top managerial position within the next five years?
Excellent   Good   Fair   Poor   Very poor

Tuckman draws our attention to the fact that, unlike an unstructured response which has to be coded to be useful as data, a scaled response is collected in the form of usable and analysable data.
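Because scaled responses arrive in directly analysable form, converting them to scores is a simple mapping exercise. The sketch below is a minimal illustration of this point rather than a procedure prescribed by Tuckman; the integer values assigned to the labels and the sample responses are our own assumptions.

# Minimal sketch: scoring a five-point scaled response by mapping
# each verbal label to an assumed integer value (5 = Excellent).
SCALE = {'Excellent': 5, 'Good': 4, 'Fair': 3, 'Poor': 2, 'Very poor': 1}

responses = ['Good', 'Fair', 'Good', 'Very poor', 'Excellent']  # hypothetical data
scores = [SCALE[r] for r in responses]

print(scores)                     # [4, 3, 4, 1, 5]
print(sum(scores) / len(scores))  # 3.4 -- mean scale score

Whether a mean is a defensible summary depends on treating the scale as interval data, a point summarized in Box 15.4 below.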

A 'ranking response' is one in which a respondent is required to rank-order a series of words, phrases or statements according to a particular criterion.

Example: Rank order the following people in terms of their usefulness to you as sources of advice and guidance on problems you have encountered in the classroom. Use numbers 1 to 5, with 1 representing the person most useful.
Education tutor
Subject tutor
Class teacher
Headteacher
Other student

Ranked data can be analysed by adding up the rank of each response across the respondents, thus resulting in an overall rank order of alternatives.
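This rank-summing can be expressed in a few lines of code. The sketch below is again only illustrative: the two respondents and their rankings are invented, and the five advice sources are reused from the example above.

# Minimal sketch: aggregating ranking responses by summing each
# alternative's ranks across respondents (1 = most useful), so the
# lowest total identifies the most useful alternative overall.
rankings = [  # hypothetical data: one dict per respondent
    {'Education tutor': 2, 'Subject tutor': 1, 'Class teacher': 3,
     'Headteacher': 5, 'Other student': 4},
    {'Education tutor': 1, 'Subject tutor': 3, 'Class teacher': 2,
     'Headteacher': 4, 'Other student': 5},
]

totals = {}
for respondent in rankings:
    for alternative, rank in respondent.items():
        totals[alternative] = totals.get(alternative, 0) + rank

for alternative in sorted(totals, key=totals.get):
    print(alternative, totals[alternative])
# Education tutor 3, Subject tutor 4, Class teacher 5, ...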

A 'checklist response' requires that the respondent selects one of the alternatives presented to her. In that they do not represent points on a continuum, they are nominal categories.

Example: I get most satisfaction in college from:
the social life
studying on my own
attending lectures
college societies
giving a paper at a seminar

This kind of response tends to yield less information than the other kinds considered.

Finally, the 'categorical response' mode is similar to the checklist but simpler in that it offers respondents only two possibilities.

Example: Material progress results in greater happiness for people.
True   False

or: In the event of another war, would you be prepared to fight for your country?
Yes   No

Summing the numbers of respondents with the same responses yields a nominal measure.
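For completeness, a tally of this kind is a one-line operation; the responses below are, of course, invented.

from collections import Counter

# Minimal sketch: tallying categorical responses to yield a
# nominal measure (hypothetical data).
responses = ['True', 'False', 'True', 'True', 'False']
print(Counter(responses))  # Counter({'True': 3, 'False': 2})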

As a general rule, the kind of information sought and the means of its acquisition will determine the choice of response mode. Data analysis, then, ought properly to be considered alongside the choice of response mode so that the interviewer can be confident that the data will serve her purposes and analysis of them can be duly prepared. Box 15.4 summarizes the relationship between response mode and type of data.

Once the variables to be measured or studied have been identified, questions can be constructed so as to reflect them. It is important to bear in mind that more than one question format and more than one response mode may be employed when building up a schedule. The final mixture will depend on the kinds of factors mentioned earlier—the objectives of the research, and so on.

Where an interview schedule is to be used by a number of trained interviewers, it will of course be necessary to include in it appropriate instructions for both interviewer and interviewees.

The framing of questions for a semi-structured interview will also need to consider prompts and probes (Morrison, 1993:66). Prompts enable the interviewer to clarify topics or questions, whilst probes enable the interviewer to ask respondents to extend, elaborate, add to, provide detail for, clarify or qualify their response, thereby addressing richness, depth of response, comprehensiveness and honesty—some of the hallmarks of successful interviewing (see also Patton, 1980:238).

Hence an interview schedule for a semi-structured interview (i.e. where topics and open-ended questions are written but the exact sequence and wording does not have to be followed with each respondent) might include:

• the topic to be discussed;
• the specific possible questions to be put for each topic;
• the issues within each topic to be discussed, together with possible questions for each issue;
• a series of prompts and probes for each topic, issue and question.

'How many interviews do I need to conduct?' is a frequent question of novice researchers, asking both about the number of people and the number of interviews with each person. The advice here echoes that of Kvale (1996:101), that one conducts interviews with as many people as necessary in order to gain the information sought. There is no simple rule of thumb, as this depends on the purpose of the interview, for example whether it is to make generalizations, to provide in-depth, individual data, or to gain a range of responses. Though the reader is directed to the chapter on sampling for fuller treatment of these matters, the issue here is that the interviewer must ensure that the interviewees selected will be able to furnish the researcher with the information required.

Box 15.4 The selection of response mode (source: Tuckman, 1972)

Fill-in
Type of data: Nominal
Chief advantages: Less biasing; greater response flexibility
Chief disadvantages: More difficult to score

Scaled
Type of data: Interval
Chief advantages: Easy to score
Chief disadvantages: Time consuming; can be biasing

Ranking
Type of data: Ordinal
Chief advantages: Easy to score; forces discrimination
Chief disadvantages: Difficult to complete

Checklist or categorical
Type of data: Nominal (may be interval when totalled)
Chief advantages: Easy to score; easy to respond
Chief disadvantages: Provides less data and fewer options


Interviewing

Setting up and conducting the interview will make up the next stage in the procedure. Where the interviewer is initiating the research herself, she will clearly select her own respondents; where she is engaged by another agent, then she will probably be given a list of people to contact. Tuckman (1972) has succinctly reviewed the procedures to adopt at the interview itself. He writes:

At the meeting, the interviewer should brief the respondent as to the nature or purpose of the interview (being as candid as possible without biasing responses) and attempt to make the respondent feel at ease. He should explain the manner in which he will be recording responses, and if he plans to tape record, he should get the respondent's assent. At all times, an interviewer must remember that he is a data collection instrument and try not to let his own biases, opinions, or curiosity affect his behaviour. It is important that the interviewer should not deviate from his format and interview schedule although many schedules will permit some flexibility in choice of questions. The respondent should be kept from rambling away from the essence of a question, but not at the sacrifice of courtesy.
(Tuckman, 1972)

It is crucial to keep uppermost in one's mind the fact that the interview is a social, interpersonal encounter, not merely a data collection exercise. Indeed Kvale (1996:125) suggests that an interview follows an unwritten script for interactions, the rules for which only surface when they are transgressed. Hence the interviewer must be at pains to conduct the interview carefully and sensitively. Kvale (1996:147) adds that, as the researcher is the research instrument, the effective interviewer is not only knowledgeable about the subject matter but is also an expert in interaction and communication. The interviewer will need to establish an appropriate atmosphere such that the participant can feel secure to talk freely. This operates at several levels.

For example, there is the need to address the cognitive aspect of the interview, ensuring that the interviewer is sufficiently knowledgeable about the subject matter that she or he can conduct the interview in an informed manner, and that the interviewee does not feel threatened by lack of knowledge. That this is a particular problem when interviewing children has been documented by Simons (1982) and Lewis (1992), who indicate that children will tend to say anything rather than nothing at all, thereby limiting the possible reliability of the data.

Further, the ethical dimension of the interview needs to be borne in mind, ensuring, for example, informed consent, guarantees of confidentiality, beneficence and non-maleficence (i.e. that the interview may be to the advantage of the respondent and will not harm her). The issue of ethics also needs to take account of what is to count as data; for example, it is often after the cassette recorder or video camera has been switched off that the 'gems' of the interview are revealed, or people may wish to say something 'off the record'; the status of this kind of information needs to be clarified before the interview commences. The ethical aspects of interviewing are more fully discussed later in the chapter.

Then there is a need to address the interpersonal, interactional, communicative and emotional aspects of the interview. For example, the interviewer and interviewee communicate non-verbally, by facial and bodily expression. Something as slight as a shift in position in a chair might convey whether the researcher is interested, angry, bored, agreeing, disagreeing and so on. Here the interviewer has to be adept at 'active listening'.

The interviewer is also responsible for considering the dynamics of the situation, for example, how to keep the conversation going, how to motivate participants to discuss their thoughts, feelings and experiences, and how to overcome the problems of the likely asymmetries of power in the interview (where the interviewer typically defines the situation, the topic, the conduct, the introduction, the course of the interview, and the closing of the interview) (Kvale, 1996:126).


As Kvale suggests, the interview is not usually a reciprocal interaction between two equal participants. It is important to keep the interview moving forward, and how to achieve this needs to be anticipated by the interviewer, for example by being clear on what one wishes to find out, asking those questions that will elicit the kinds of data sought, and giving appropriate verbal and non-verbal feedback to the respondent during the interview. It extends even to considering when the interviewer should keep silent (ibid.: 135).

The 'directiveness' of the interviewer has been scaled by Whyte (1982), where a six-point scale was devised (1 = the least directive, and 6 = the most directive):

1 Making encouraging noises.
2 Reflecting on remarks made by the informant.
3 Probing on the last remark made by the informant.
4 Probing an idea preceding the last remark by the informant.
5 Probing an idea expressed earlier in the interview.
6 Introducing a new topic.

This is not to say that the interviewer should avoid being too directive or not directive enough; indeed on occasions a confrontational style might yield much more useful data than a non-confrontational style. Further, it may be in the interests of the research if the interview is sometimes quite tightly controlled, as this might facilitate the subsequent analysis of the data. For example, if the subsequent analysis will seek to categorize and classify the responses, then it might be useful for the interviewer to clarify meaning and even suggest classifications during the interview (see Kvale, 1996:130).

Patton (1980:210) suggests that it is important to maintain the interviewee's motivation, hence the interviewer must keep boredom at bay, for example by keeping demographic and background questions to a minimum. The issue of the interpersonal and interactional elements reaches further, for the language of all speakers has to be considered, for example, translating the academic language of the researcher into the everyday, more easy-going and colloquial language of the interviewee, in order to generate rich descriptions and authentic data. Patton (1980:225) goes on to underline the importance of clarity in questioning, and suggests that this entails the interviewer finding out what terms the interviewees use about the matter in hand, what terms they use amongst themselves, and avoiding the use of academic jargon. The issue here is not only that the language of the interviewer must be understandable to interviewees but that it must be part of their frame of reference, such that they feel comfortable with it.

This can be pursued even further, suggesting that the age, gender, race, class, dress and language of the interviewers and interviewees will all exert an influence on the interview itself. This is discussed fully in Chapter 5 on reliability and validity.

The sequence and framing of the interview questions will also need to be considered, for example ensuring that easier, less threatening, non-controversial questions are addressed earlier in the interview in order to put respondents at their ease (see Patton, 1980:210–11). This might mean that the 'what' questions precede the more searching and difficult 'how' and 'why' questions (though, as Patton reminds us (ibid.: 211), knowledge questions—'what'-type questions—can be threatening). The interviewer's questions should be straightforward and brief, even though the responses need not be (Kvale, 1996:132). The researcher will also need to consider the kinds of questions to be put to interviewees, discussed earlier.

There are several problems in the actual conduct of an interview that can be anticipated and, possibly, prevented, ensuring that the interview proceeds comfortably, for example (see Field and Morse, 1989):

• avoiding interruptions from outside (e.g. telephone calls, people knocking on the door);
• minimizing distractions;
• minimizing the risk of 'stage fright' in interviewees and interviewers;
• avoiding asking embarrassing or awkward questions;
• avoiding jumping from one topic to another;
• avoiding giving advice or opinions (rather than active listening);
• avoiding summarizing too early or closing off an interview too soon;
• avoiding being too superficial;
• handling sensitive matters (e.g. legal matters, personal matters, emotional matters) with care.

There is also the issue of how to record the interview as it proceeds. For example, an audiotape recorder might be unobtrusive but might constrain the respondent; a videotape might yield more accurate data but might be even more constraining, with its connotation of surveillance. Merton et al. (1956) comment on the tendency of taping to 'cool things down'. It might be less threatening not to have any mechanical means of recording the interview, in which case the reliability of the data might rely on the memory of the interviewer. An alternative might be to have the interviewer make notes during the interview, but this could be highly off-putting for some respondents. The issue here is that there is a trade-off between the need to catch as much data as possible and the need to avoid so threatening an environment that it impedes the potential of the interview situation.

What is being suggested here is that the interview, as a social encounter, has to take account of, and plan for, the whole range of other, possibly non-cognitive, factors that form part of everyday conduct. The 'ideal' interview, then, meets several 'quality criteria' (Kvale, 1996:145):

• The extent of spontaneous, rich, specific and relevant answers from the interviewee.
• The shorter the interviewer's questions and the longer the subject's answers, the better.
• The degree to which the interviewer follows up and clarifies the meanings of the relevant aspects of the answers.
• The ideal interview is to a large extent interpreted throughout the interview.
• The interviewer attempts to verify his or her interpretations of the subject's answers in the course of the interview.
• The interview is 'self-communicating'—it is a story contained in itself that hardly requires much extra description and explanation.

Transcribing

This is a crucial step, for there is the potential for massive data loss, distortion and the reduction of complexity. We have suggested throughout that the interview is a social encounter, not merely a data collection exercise; the problem with much transcription is that it becomes solely a record of data rather than a record of a social encounter. Indeed this problem might have begun at the data collection stage; for example, an audiotape is selective: it filters out important contextual factors, neglecting the visual and non-verbal aspects of the interview (Mishler, 1986). Indeed it is frequently the non-verbal communication that gives more information than the verbal communication. Morrison (1993:63) recounts the incident of an autocratic headteacher extolling the virtues of collegiality and democratic decision-making whilst shaking her head vigorously from side to side and pressing the flat of her hand in a downwards motion away from herself as if to silence discussion! To replace audio recording with video recording might make for richer data and catch non-verbal communication, but this then becomes very time-consuming to analyse.

Transcriptions inevitably lose data from the original encounter. This problem is compounded, for a transcription represents the translation from one set of rule systems (oral and interpersonal) to another very remote rule system (written language). As Kvale (1996:166) suggests, the prefix trans indicates a change of state or form; transcription is selective transformation. Therefore it is unrealistic to pretend that the data on transcripts are anything but already interpreted data. As Kvale (ibid.: 167) remarks, the transcript can become an opaque screen between the researcher and the original live interview situation.


There can be no single 'correct' transcription; rather the issue becomes whether, to what extent, and how a transcription is useful for the research. Transcriptions are decontextualized, abstracted from time and space, from the dynamics of the situation, from the live form, and from the social, interactive, dynamic and fluid dimensions of their source; they are frozen.

The words in transcripts are not necessarily as solid as they were in the social setting of the interview. Scheurich (1995:240) suggests that even conventional procedures for achieving reliability are inadequate here, for holding constant the questions, the interviewer, the interviewee, the time and place does not guarantee stable, unambiguous data. Indeed Mishler (1991:260) suggests that data and the relationship between meaning and language are contextually situated; they are unstable, changing and capable of endless reinterpretation.

We are not arguing against transcriptions; rather, we are cautioning against the researcher believing that they tell everything that took place in the interview. This might require the researcher to ensure that different kinds of data are recorded in the transcript of the audiotape (one possible record structure is sketched after the list below), for example:

• what was being said;
• the tone of voice of the speaker(s) (e.g. harsh, kindly, encouraging);
• the inflection of the voice (e.g. rising or falling, a question or a statement, a cadence or a pause, a summarizing or exploratory tone, opening or closing a line of inquiry);
• emphases placed by the speaker;
• pauses (short to long) and silences (short to long);
• interruptions;
• the mood of the speaker(s) (e.g. excited, angry, resigned, bored, enthusiastic, committed, happy, grudging);
• the speed of the talk (fast to slow, hurried or unhurried, hesitant to confident);
• how many people were speaking simultaneously;
• whether a speaker was speaking continuously or in short phrases;
• who is speaking to whom;
• indecipherable speech;
• any other events that were taking place at the same time that the researcher can recall.
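To make the point concrete, the sketch below shows one possible way of structuring such a record so that the non-verbal features travel with the words. It is an illustration only, not a standard scheme: the field names and the sample values are invented.

from dataclasses import dataclass, field

# Minimal sketch (hypothetical structure): a transcript segment that
# records non-verbal and paralinguistic features alongside the words,
# keeping the transcript a record of a social encounter rather than
# of words alone.
@dataclass
class Segment:
    speaker: str
    text: str
    tone: str = ''                               # e.g. 'harsh', 'kindly'
    emphases: list = field(default_factory=list) # emphasized words
    pause_after_secs: float = 0.0                # silence following the turn
    interrupted: bool = False
    mood: str = ''                               # e.g. 'excited', 'resigned'
    notes: str = ''                              # other simultaneous events

example = Segment(speaker='Headteacher',
                  text='Of course we value collegial decision-making.',
                  tone='emphatic', mood='guarded', pause_after_secs=2.5,
                  notes='shakes head from side to side while speaking')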

If the transcript is of videotape, then this enables the researcher to comment on all of the non-verbal communication that was taking place in addition to the features noted from the audiotape. The issue here is that it is often inadequate to transcribe only spoken words; other data are important. Of course, as soon as other data are noted, this becomes a matter of interpretation (what is a long pause, what is a short pause, was the respondent happy or was it just a 'front', what gave rise to such-and-such a question or response, why did the speaker suddenly burst into tears?). As Kvale (1996:183) notes, interviewees' statements are not simply collected by the interviewer; they are, in reality, co-authored.

Analysing

Once data from the interview have been collected, the next stage involves analysing them, often by some form of coding or scoring. With qualitative data the analysis is almost inevitably interpretive, hence it is less a completely accurate representation (as in the numerical, positivist tradition) and more a reflexive, reactive interaction between the researcher and the decontextualized data, which are already interpretations of a social encounter. The great tension in data analysis is between maintaining a sense of the holism of the interview and the tendency for analysis to atomize and fragment the data—to separate them into constituent elements, thereby losing the synergy of the whole; and in interviews often the whole is greater than the sum of the parts. There are several stages in analysis, for example:

• generating natural units of meaning;
• classifying, categorizing and ordering these units of meaning;
• structuring narratives to describe the interview contents;

• interpreting the interview data.

These are comparatively generalized stages. Miles and Huberman (1994) suggest thirteen tactics for generating meaning from transcribed and interview data:

• counting frequencies of occurrence (of ideas, themes, pieces of data, words);
• noting patterns and themes (Gestalts), which may stem from repeated themes and causes or explanations or constructs;
• seeing plausibility—trying to make good sense of data, using informed intuition to reach a conclusion;
• clustering—setting items into categories, types, behaviours and classifications;
• making metaphors—using figurative and connotative language rather than literal and denotative language, bringing data to life, thereby reducing data, making patterns, decentring the data, and connecting data with theory;
• making contrasts and comparisons between items of data;
• splitting variables to elaborate, differentiate and 'unpack' ideas, i.e. to move away from the drive towards integration and the blurring of data;
• subsuming particulars into the general (akin to Glaser's (1978) notion of 'constant comparison'—see Chapter 6 in this book)—a move towards clarifying key concepts;
• factoring—bringing a large number of variables under a smaller number of (frequently) unobserved hypothetical variables;
• identifying and noting relations between variables;
• finding intervening variables—looking for other variables that appear to be 'getting in the way' of accounting for what one would expect to be strong relationships between variables;
• building a logical chain of evidence—noting causality and making inferences;
• making conceptual/theoretical coherence—moving from metaphors to constructs to theories to explain the phenomena.

This progression, though perhaps positivist in its tone, is a useful way of moving from the specific to the general in data analysis. Miles and Huberman (1994) attach much importance to the coding of interview responses, partially as a way of reducing what is typically data overload from qualitative data.

Coding has been defined by Kerlinger (1970) as the translation of question responses and respondent information to specific categories for the purpose of analysis. As we have seen, many questions are precoded, that is, each response can be immediately and directly converted into a score in an objective way. Rating scales and checklists are examples of precoded questions. Coding is the ascription of a category label to a piece of data, with the category label either decided in advance or in response to the data that have been collected.

In coding a piece of transcription the researcher systematically goes through the data, typically line by line, and writes a descriptive code by the side of each piece of datum, for example:

Text                                            Code
The students will undertake
problem-solving in science                      PROB
I prefer to teach mixed ability classes         MIXABIL

One can see here that the codes are frequently abbreviations, enabling the researcher to understand immediately the issue that they are describing because they resemble that issue (rather than, for example, ascribing a number as a code for each piece of datum, where the number provides no clue as to what the datum or category concerns). Miles and Huberman (1994) suggest that the coding label should bear sufficient resemblance to the original data that the researcher can know, by looking at the code, what the original piece of datum concerned. There are several computer packages that can help the coder here (e.g. Ethnograph, NUD.IST), though they require the original transcript to be entered onto the computer. One such, Code-A-Text, is particularly useful for analysing dialogues both quantitatively and qualitatively (the system also accepts sound and video input).

Having performed the first round of coding, the researcher is able to detect patterns and themes and begin to make generalizations (e.g. by counting the frequencies of codes). The researcher can also group codes into more general clusters, each with its own code, i.e. begin the move towards factoring the data.
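In computational terms this first pass amounts to little more than attaching labels and counting them. The sketch below is a deliberately simple illustration of that step: the first two transcript lines and their codes (PROB, MIXABIL) come from the example above, while the third line is invented.

from collections import Counter

# Minimal sketch: coded transcript lines as (text, code) pairs,
# followed by a frequency count of the codes as a first step
# towards detecting patterns and themes.
coded_lines = [
    ('The students will undertake problem-solving in science', 'PROB'),
    ('I prefer to teach mixed ability classes', 'MIXABIL'),
    ('Problem-solving suits my older classes', 'PROB'),  # hypothetical line
]

frequencies = Counter(code for _, code in coded_lines)
print(frequencies)  # Counter({'PROB': 2, 'MIXABIL': 1})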

Miles and Huberman suggest that it is possible to keep as many as ninety codes in working memory at any one time, though they indicate that data might be recoded on a second or third reading, as codes that were used early on might have to be refined in the light of codes that are used later, either to make the codes more discriminating or to conflate codes that are unnecessarily specific. There is also the danger that early codes might influence too strongly the later codes. Codes, they argue, should be kept as discrete as possible, and they should enable the researcher to catch the complexity and comprehensiveness of the data. They recommend earlier rather than later coding, as late coding, they suggest, enfeebles the analysis.

Perhaps the biggest problem concerns the coding and scoring of open-ended questions. Two solutions are possible here. Even though a response is open-ended, the interviewer may precode her interview schedule so that, while an interviewee is responding freely, the interviewer is assigning the content of her responses, or parts of it, to predetermined coding categories. Classifications of this kind may be developed during pilot studies.

Example:
Q What is it that you like least about your job?
A Mostly the way the place is run—and the long hours; and the prospects aren't too good.

Coding:
colleagues
organization         X
the work
conditions           X
other
future prospects     X

Alternatively, data may be postcoded. Having recorded the interviewee's response, either by summarizing it during or after the interview itself, or verbatim by tape recorder, the researcher may subject it to content analysis and submit it to one of the available scoring procedures—scaling, scoring, rank scoring, response counting, etc.

Content analysis involves reading and judgement; Brenner et al. (1985) set out several steps in undertaking a content analysis of open-ended data:

Step 1 Briefing (understanding the problem and its context in detail).
Step 2 Sampling (of people, including the types of sample sought, see Chapter 4).
Step 3 Associating (with other work that has been done).
Step 4 Hypothesis development.
Step 5 Hypothesis testing.
Step 6 Immersion (in the data collected, to pick up all the clues).
Step 7 Categorizing (in which the categories and their labels must: (a) reflect the purpose of the research; (b) be exhaustive; (c) be mutually exclusive).
Step 8 Incubation (e.g. reflecting on data and developing interpretations and meanings).
Step 9 Synthesis (involving a review of the rationale for coding and an identification of the emerging patterns and themes).
Step 10 Culling (condensing, excising and even reinterpreting the data so that they can be written up intelligibly).
Step 11 Interpretation (making meaning of the data).
Step 12 Writing, including (pp. 140–3): giving clear guidance on the incidence of occurrence; providing an indication of direction and intentionality of feelings; being aware of what is not said as well as what is said—silences; indicating salience (to the readers and respondents).
Step 13 Rethinking.

This process, the authors suggest (ibid.: 144), requires researchers to address several factors:

• Understand the research brief thoroughly.
• Evaluate the relevance of the sample for the research project.
• Associate their own experiences with the problem, looking for clues from the past.
• Develop testable hypotheses as the basis for the content analysis (the authors name this the 'Concept Book').
• Test the hypotheses throughout the interviewing and analysis process.
• Stay immersed in the data throughout the study.
• Categorize the data in the Concept Book, creating labels and codes.
• Incubate the data before writing up.
• Synthesize the data in the Concept Book, looking for key concepts.
• Cull the data; being selective is important because it is impossible to report everything that happened.
• Interpret the data, identifying their meaning and implications.
• Write up the report.
• Rethink and rewrite: have the research objectives been met?

Hycner (1985) sets out procedures that can be followed when phenomenologically analysing interview data. In summary, the guidelines are as follows:

• Transcription Having the interview tape transcribed, noting not only the literal statements but also non-verbal and paralinguistic communication.

• Bracketing and phenomenological reduction For Hycner this means 'suspending (bracketing) as much as possible the researcher's meaning and interpretations and entering into the world of the unique individual who was interviewed' (Hycner, 1985). The researcher thus sets out to understand what the interviewee is saying rather than what she expects that person to say.
• Listening to the interview for a sense of the whole This involves listening to the entire tape several times and reading the transcription a number of times in order to provide a context for the emergence of specific units of meaning and themes later on.
• Delineating units of general meaning This entails a thorough scrutiny of both verbal and non-verbal gestures to elicit the participant's meaning. Hycner says, 'It is a crystallization and condensation of what the participant has said, still using as much as possible the literal words of the participant' (Hycner, 1985).
• Delineating units of meaning relevant to the research question Once the units of general meaning have been noted, they are then reduced to units of meaning relevant to the research question.
• Training independent judges to verify the units of relevant meaning Findings can be verified by using other researchers to carry out the above procedures.
• Eliminating redundancies At this stage, the researcher checks the lists of relevant meaning and eliminates those clearly redundant to others previously listed.
• Clustering units of relevant meaning The researcher now tries to determine whether any of the units of relevant meaning naturally cluster together; whether there seems to be some common theme or essence that unites several discrete units of relevant meaning.
• Determining themes from clusters of meaning The researcher examines all the clusters of meaning to determine whether there is one (or more) central theme(s) which expresses the essence of these clusters.
• Writing a summary of each individual interview It is useful at this point, the author suggests, to go back to the interview transcription and write up a summary of the interview incorporating the themes that have been elicited from the data.
• Return to the participant with the summary and themes, conducting a second interview This is a check to see whether the essence of the first interview has been accurately and fully captured.
• Modifying themes and summary With the new data from the second interview, the researcher looks at all the data as a whole and modifies or adds themes as necessary.
• Identifying general and unique themes for all the interviews The researcher now looks for the themes common to most or all of the interviews as well as the individual variations. The first step is to note if there are themes common to all or most of the interviews. The second step is to note when there are themes that are unique to a single interview or a minority of the interviews.
• Contextualization of themes At this point it is helpful to place these themes back within the overall contexts or horizons from which they emerged.
• Composite summary The author considers it useful to write up a composite summary of all the interviews which would accurately capture the essence of the phenomenon being investigated. The author concludes, 'Such a composite summary describes the "world" in general, as experienced by the participants. At the end of such a summary the researcher might want to note significant individual differences' (Hycner, 1985).

Verifying

Chapter 5 has discussed at length the issues of reliability, validity and generalizability of the data from interviews, and so these issues will not be repeated here. The reader is advised to explore not only that section of Chapter 5 but, indeed, the whole chapter. Kvale (1996:237) makes the point that validation must take place at all seven stages of the interview-based investigation:

Stage 1 Thematizing. The theoretical underpinnings of the research must be sound and the link between theory and research questions must be logical.
Stage 2 Designing. The research design must be adequate and sound in terms of methodology, operationalization, sampling and ethical defensibility.
Stage 3 Interviewing. The data must be trustworthy and the interview must be conducted to the highest standards, with validity and reliability checks being made as it unfolds.
Stage 4 Transcribing. The translation from oral and social media to a written medium should be faithful to key features of the original media.
Stage 5 Analysing. The methods of analysis and interpretations of the data must be faithful to the data.
Stage 6 Validating. Decisions are reached on the most appropriate forms of validity for the study, and on who the validators might be.
Stage 7 Reporting. The report fairly reflects the study and can be seen to be fair by the readers.

One main issue here is that there is no single canon of validity; rather, the notion of fitness for purpose within an ethically defensible framework should be adopted, giving rise to different kinds of validity for different kinds of interview-based research (e.g. structured to unstructured, qualitative to quantitative, nomothetic to idiographic, generalizable to unique, descriptive to explanatory, positivist to ethnographic, preordinate to responsive).

Reporting

The nature of the reporting will be decided to some extent by the nature of the interviewing. For example, a standardized, structured interview may yield numerical data that may be reported succinctly in tables and graphs, whilst a qualitative, word-based, open-ended interview will yield word-based accounts that take up considerably more space.

Kvale (1996:263–6) suggests several elements of a report: (a) an introduction that includes the main themes and contents; (b) an outline of the methodology and methods (from designing to interviewing, transcription and analysis); (c) the results (the data analysis, interpretation and verification); and (d) a discussion.

If the report is largely numerical, then figures and tables might be appropriate; if the interview is more faithfully represented in words rather than numbers, then this presents the researcher with the issue of how to present particular quotations. Here Kvale (ibid.: 266) suggests that direct quotations should: (a) illuminate and relate to the general text whilst maintaining a balance with the main text; (b) be contextualized and be accompanied by a commentary and interpretation; (c) be particularly clear, useful, and the 'best' of the data (the 'gems'!); (d) include an indication of how they have been edited; and (e) be incorporated into the natural written style of the report.

Group interviewing

Group interviewing is a useful way of conducting interviews. Watts and Ebbutt (1987) set out the advantages and disadvantages of group interviewing as a means of collecting data in educational research. The advantages include the potential for discussions to develop, thus yielding a wide range of responses. They explain, 'such interviews are useful…where a group of people have been working together for some time or a common purpose, or where it is seen as important that everyone concerned is aware of what others in the group are saying' (Watts and Ebbutt, 1987). For example, Lewis (1992) found that 10-year-olds' understanding of severe learning difficulties was enhanced in group interview situations, the children challenging and extending each other's ideas and introducing new ideas into the discussion. The group interview, the paper argues, can generate a wider range of responses than individual interviews. Bogdan and Biklen (1992:100) add that group interviews might be useful for gaining an insight into what might be pursued in subsequent individual interviews. There are practical and organizational advantages, too. Group interviews are often quicker than individual interviews and hence are time-saving and involve minimal disruption. The group interview can also bring together people with varied opinions, or as representatives of different collectivities. Group interviews of children might also be less intimidating for them than individual interviews. Simons (1982) and Lewis (1992) chart some difficulties in interviewing children, for example how to:

• overcome children being easily distracted;
• avoid the researcher being seen as an authority figure;
• keep the interview relevant;
• interview inarticulate, hesitant and nervous children;
• get the children's teacher away from the children;
• respond to the child who says something then immediately wishes she hadn't said it;
• elicit genuine responses from children rather than simply responses to the interview situation;
• get beyond the institutional, headteacher's, or 'expected' response;
• keep children to the point;
• avoid children being too extreme or destructive of each other's views;
• pitch language at the appropriate level;
• avoid the interview being an arduous bore;
• overcome children's poor memories;
• avoid children being too focused on particular features or situations;
• overcome the problem that some children will say anything rather than feel they do not have 'the answer';
• overcome the problem that some children dominate the conversation;
• avoid the problem of children feeling very exposed in front of their friends;
• avoid children feeling uncomfortable or threatened (addressed, perhaps, by placing children with their friends);
• avoid children telling lies.

Clearly these problems are not exclusive to children; they apply equally well to some adult group interviews. Group interviews require skilful chairing and attention to the physical layout of the room so that everyone can see everyone else. Group size is also an issue: too few and it can put pressure on individuals; too large and the group fragments and loses focus. Lewis (1992) summarizes research to indicate that a group of around six or seven is an optimum size, though it can be smaller for younger children.

As regards the disadvantages of group interviews, Watts and Ebbutt note that they are of little use in allowing personal matters to emerge, or where the researcher has to aim a series of follow-up questions at one specific member of the group. As they explain, ‘the dynamic of a group denies access to this sort of data’ (Watts and Ebbutt, 1987). Further, Lewis (1992) comments on the problem of coding up the responses of group interviews. For further guidance on this topic and the procedures involved, we refer the reader to Simons (1982), Watts and Ebbutt (1987), Hedges (1985), Breakwell (1990), Spencer and Flin (1990) and Lewis (1992).

Focus groups

As an adjunct to group interviews, the use of focus groups is growing in educational research, albeit more slowly than, for instance, in business and political circles. Focus groups are a form of group interview, though not in the sense of a backwards and forwards between interviewer and group. Rather, the reliance is on the interaction within the group who discuss a topic supplied by the researcher (Morgan, 1988:9). Hence the participants interact with each other rather than with the interviewer, such that the views of the participants can emerge: the participants’ rather than the researcher’s agenda can predominate. It is from the interaction of the group that the data emerge. Focus groups are contrived settings, bringing together a specifically chosen sector of the population to discuss a particular given theme or topic, where the interaction with the group leads to data and outcomes. Their contrived nature is both their strength and their weakness: they are unnatural settings yet they are very focused on a particular issue and, therefore, will yield insights that might not otherwise have been available in a straightforward interview; they are economical on time, producing a large amount of data in a short period of time, but they tend to produce less data than interviews with the same number of individuals on a one-to-one basis (ibid.: 19).

Focus groups (Morgan, 1988; Krueger, 1988) are useful for:

• orientation to a particular field of focus;
• developing themes, topics and schedules for subsequent interviews and/or questionnaires;
• generating hypotheses that derive from the insights and data from the group;
• generating and evaluating data from different sub-groups of a population;
• gathering feedback from previous studies.

Focus groups might be useful to triangulate with more traditional forms of interviewing, questionnaire, observation etc. There are several issues to be addressed in running focus groups, for example (Morgan, 1988:41–8):

• deciding the number of focus groups for a single topic (one group is insufficient, as the researcher will be unable to know whether the outcome is unique to the behaviour of the group);
• deciding the size of the group (too small and intra-group dynamics exert a disproportionate effect; too large and the group becomes unwieldy and hard to manage: it fragments). Morgan (ibid.: 43) suggests between four and twelve people per group;
• how to allow for people not ‘turning up’ on the day. Morgan (ibid.: 44) suggests the need to over-recruit by as much as 20 per cent;
• taking extreme care with the sampling, so that every participant is the bearer of the particular characteristic required or that the group has homogeneity of background in the required area, otherwise the discussion will lose focus or become unrepresentative. Sampling is a major key to the success of focus groups;
• ensuring that participants have something to say and feel comfortable enough to say it;
• chairing the meeting so that a balance is struck between being too directive and veering off the point, i.e. keeping the meeting open-ended but to the point.

Unlike group interviewing with children, discussed above, focus groups operate more successfully if they are composed of relative strangers rather than friends, unless friendship, of course, is an important criterion for the focus (e.g. that the group will discuss something that is usually only discussed amongst friends).

Although its potential is considerable, the focus group, as a particular kind of group interviewing, still has to find its way into educational circles to the extent that it has in other areas of life.

The non-directive interview and the focused interview

Originating from psychiatric and therapeutic fields, the non-directive interview is characterized by a situation in which the respondent is responsible for initiating and directing the course of the encounter and for the attitudes she expresses in it (in contrast to the structured or research interview we have already considered, where the dominating role assumed by the interviewer results in, to use Kitwood’s phrase, ‘an asymmetry of commitment’ (Kitwood, 1977)). It is a particularly valuable technique in that it gets at the deeper attitudes and perceptions of the person being interviewed in such a way as to leave them free from interviewer bias. We shall examine briefly the characteristics of the therapeutic interview and then consider its usefulness as a research tool in the social and educational sciences.

The non-directive interview as it is currently understood grew out of the pioneering work of Freud and subsequent modifications to his approach by later analysts. His basic discovery was that if one can arrange a special set of conditions and have a patient talk about his/her difficulties in a certain way, behaviour changes of many kinds can be accomplished. The technique developed was used to elicit highly personal data from patients in such a way as to increase their self-awareness and improve their skills in self-analysis (Madge, 1965). By these means they became better able to help themselves.

The present-day therapeutic interview has its most persuasive advocate in Carl Rogers. Basing his analysis on his own clinical studies, he has identified a sequence of characteristic stages in the therapeutic process, beginning with the client’s decision to seek help. He/she is met by a counsellor who is friendly and receptive, but not didactic. The next stage is signalled when the client begins to give vent to hostile, critical and destructive feelings, which the counsellor accepts, recognizes and clarifies. Subsequently, and invariably, these antagonistic impulses are used up and give way to the first expressions of positive feeling. The counsellor likewise accepts these until suddenly and spontaneously ‘insight and self-understanding come bubbling through’ (Rogers, 1942). With this insight comes the realization of possible courses of action and also the power to make decisions. It is in translating these into practical terms that clients free themselves from dependence on the counsellor.

Rogers (1945) subsequently identified a number of qualities in the interviewer which he deemed essential: that she bases her work on attitudes of acceptance and permissiveness; that she respects the client’s responsibility for his own situation; that she permits the client to explain his problem in his own way; and that she does nothing that would in any way arouse the client’s defences.

There are a number of features of the therapeutic interview which are peculiar to it and may well be inappropriate in other settings: for example, as we have seen, the interview is initiated by the respondent; his/her motivation is to obtain relief from a particular symptom; the interviewer is primarily a source of help, not a procurer of information; the actual interview is part of the therapeutic experience; the purpose of the interview is to change the behaviour and inner life of the person and its success is defined in these terms; and there is no restriction on the topics discussed.

A researcher has a different order of priorities, however, e.g. focus, economics of time; what appear as advantages in a therapeutic context may be decided limitations when the technique is used for research purposes, even though she may be sympathetic to the spirit of the non-directive interview (Madge, 1965).


One attempt to meet this need is reported by Merton and Kendall (1946), in which the focused interview was developed. While seeking to follow closely the principle of non-direction, the method did introduce rather more interviewer control in the kinds of questions used and sought also to limit the discussion to certain parts of the respondent’s experience.

The focused interview differs from other types of research interview in certain respects (Merton and Kendall, 1946):

• The persons interviewed are known to have been involved in a particular situation: they may, for example, have watched a TV programme; or seen a film; or read a book or article; or have been a participant in a social situation.
• By means of the techniques of content analysis, elements in the situation which the researcher deems significant have previously been analysed by her. She has thus arrived at a set of hypotheses relating to the meaning and effects of the specified elements.
• Using her analysis as a basis, the investigator constructs an interview guide. This identifies the major areas of inquiry and the hypotheses which determine the relevant data to be obtained in the interview.
• The actual interview is focused on the subjective experiences of the people who have been exposed to the situation. Their responses enable the researcher both to test the validity of her hypotheses, and to ascertain unanticipated responses to the situation, thus giving rise to further hypotheses.

From this it can be seen that the distinctive feature of the focused interview is the prior analysis by the researcher of the situation in which subjects have been involved. The advantages of this procedure have been cogently explained by Merton and Kendall:

Fore-knowledge of the situation obviously reduces the task confronting the investigator, since the interview need not be devoted to discovering the objective nature of the situation. Equipped in advance with a content analysis, the interviewer can readily distinguish the objective facts of the case from the subjective definitions of the situation. He [sic] thus becomes alert to the entire field of ‘selective response’. When the interviewer, through his familiarity with the objective situation, is able to recognize symbolic or functional silences, ‘distortions’, avoidances, or blockings, he is the more prepared to explore their implications.

(Merton and Kendall, 1946)

In the quest for what Merton and Kendall term ‘significant data’, the interviewer must develop the ability to evaluate continuously the interview while it is in progress. To this end, they established a set of criteria by which productive and unproductive interview material can be distinguished. Briefly, these are:

• Non-direction Interviewer guidance should be minimal.
• Specificity Respondents’ definitions of the situation should find full and specific expression.
• Range The interview should maximize the range of evocative stimuli and responses reported by the subject.
• Depth and personal context The interview should bring out the affective and value-laden implications of the subjects’ responses, to determine whether the experience had central or peripheral significance. It should elicit the relevant personal context, the idiosyncratic associations, beliefs and ideas.

By way of example of productive interview material, Ashton (1994) used focused interviews to ascertain the strengths of beliefs and the personal reactions of principals of further education colleges to various changes being pressed upon them by central government and local agencies.

Telephone interviewing

Telephone interviewing is an important method of data collection and is common practice in survey research.2 Dicker and Gilbert (1988), Nias (1991), Oppenheim (1992) and Borg and Gall (1996) suggest several attractions to telephone interviewing:

• It is sometimes cheaper than face-to-face interviewing.
• It enables researchers to select respondents from a much more dispersed population than if they have to travel to meet the interviewees.
• It is useful for gaining rapid responses to a structured questionnaire.
• Monitoring and quality control are undertaken more easily, since interviews are undertaken and administered centrally; indeed there are greater guarantees that the researcher actually carries out the interview as required.
• Call-back costs are so slight as to make frequent call-backs possible, enhancing reliability and contact.
• Many groups, particularly of busy people, can be reached at times more convenient to them than if a visit were to be made.
• They are safer to undertake than, for example, having to visit dangerous neighbourhoods.
• They can be used to collect sensitive data, as the possible feeling of threat from face-to-face questions about awkward, embarrassing or difficult matters is absent.
• The response rate is higher than for, for example, questionnaires.

Clearly this issue is not as cut-and-dried as the claims made for it, as there are several potential problems with telephone interviewing, for example (see also Chapter 5):

• It is very easy for respondents simply to hang up on the caller.
• There is a chance of skewed sampling, as not all of the population have a telephone (often those lower-income households, perhaps the very people that the researcher wishes to target) or can hear (e.g. the old and second-language speakers, in addition to those with hearing difficulties).
• There is a lower response rate at weekends.
• Some people have a deep dislike of telephones, that sometimes extends to a phobia, and this inhibits their responses or willingness to participate.
• Respondents may not disclose information because of uncertainty about actual (even though promised) confidentiality.
• Many respondents (up to 25 per cent, Oppenheim, 1992:97) will be ‘ex-directory’ and so their numbers will not be available in telephone directories.
• Respondents may withhold important information or tell lies, as the non-verbal behaviour that frequently accompanies this is not witnessed by the interviewer.
• It is often more difficult for complete strangers to communicate by telephone than face-to-face, particularly as non-verbal cues are absent.
• Respondents are naturally suspicious (e.g. of the caller trying to sell a product).
• One telephone might be shared by several people.
• Responses are difficult to write down or record during the interview.

That said, Sykes and Hoinville (1985) and also Borg and Gall (1996) suggest that telephone interviewing reaches nearly the same proportion of many target populations as ‘standard’ interviews, that it obtains nearly the same rate of response, and produces comparable information to ‘standard’ interviews, sometimes at a fraction of the cost.

Harvey (1988), Oppenheim (1992) and Miller (1995) consider that: (a) telephone interviews need careful arrangements for timing and duration (typically they are shorter and quicker than face-to-face interviews); a preliminary call may be necessary to fix a time when a longer call is to be made; (b) the interviewer will need to have ready careful prompts and probes, including more than the usual closed questions and less complex questions, in case the respondent ‘dries up’ on the telephone; (c) both interviewer and interviewee need to be prepared in advance of the interview if its potential is to be realized; and (d) sampling requires careful consideration, using, for example, random numbers or some form of stratified sample. In general, however, many of the issues from ‘standard’ forms of interviewing apply equally well to telephone interviewing (see also Chapter 4).
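To make point (d) concrete, the following minimal sketch (ours, not the cited authors’) draws a proportionate stratified random sample from a frame of telephone numbers; the strata, the numbers and the sample size are all invented for illustration.

    import random

    # Hypothetical sampling frame: telephone numbers grouped by stratum.
    frame = {
        "urban":    ["0171-555-%04d" % i for i in range(200)],
        "suburban": ["0181-555-%04d" % i for i in range(120)],
        "rural":    ["01999-55-%04d" % i for i in range(80)],
    }

    def stratified_sample(frame, total_n, seed=42):
        """Draw a proportionate stratified random sample from the frame."""
        rng = random.Random(seed)
        frame_size = sum(len(numbers) for numbers in frame.values())
        sample = {}
        for stratum, numbers in frame.items():
            # Allocate interviews to each stratum in proportion to its size.
            n = round(total_n * len(numbers) / frame_size)
            sample[stratum] = rng.sample(numbers, n)
        return sample

    for stratum, numbers in stratified_sample(frame, total_n=40).items():
        print(stratum, len(numbers), numbers[:3])

In practice one might also over-recruit each stratum slightly, as a simple guard against refusals and unobtainable numbers.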

Ethical issues in interviewing

Interviews have an ethical dimension; they concern interpersonal interaction and produce information about the human condition. One can identify three main areas of ethical issues here: informed consent, confidentiality, and the consequences of the interviews; each is problematic (Kvale, 1996:111–20). For instance, who should give the informed consent (e.g. participants, their superiors), and for whom and what? How much information should be given, and to whom? What is legitimate private and public knowledge? How might the research help or harm the interviewees? Does the interviewer have a duty to point out the possible harmful consequences of the research data or will this illegitimately steer the interview?

It is difficult to lay down hard and fast ethical rules, as, by definition, ethical matters are contestable. Nevertheless, it is possible to raise some ethical questions to which answers need to be given before the interviews commence:

• Has the informed consent of the interviewees been gained?
• Has this been obtained in writing or orally?
• How much information should be given in advance of the study?
• How can adequate information be provided if the study is exploratory?
• Have the possible consequences of the research been made clear to the participants?
• Has care been taken to prevent any harmful effects of the research to the participants (and to others)?
• To what extent do any potential benefits outweigh the potential harm done by the research, and how justifiable is this for conducting the research?
• How will the research benefit the participants?
• Who will benefit from the research?
• To what extent is there reciprocity between what participants give to and receive from the research?
• Have confidentiality, anonymity, non-identifiability and non-traceability been guaranteed? Should participants’ identities be disguised?
• How does the Data Protection Act (1984) operate in interview situations?
• Who will have access to the data?
• What has been done to ensure that the interview is conducted in an appropriate, non-stressful, non-threatening manner?
• How will the data and transcriptions be verified, and by whom?
• Who will see the results of the research? Will some parts be withheld? Who owns the data? At what stage does ownership of the data pass from interviewees to interviewers? Are there rights of veto for what appears? To whom should sensitive data be made available (e.g. should interview data on child abuse or drug-taking be made available, with or without consent, to parents and the police)?
• How far should the researcher’s own agenda and views predominate? What if the researcher makes a different interpretation from the interviewee? Should the interviewees be told, even if they have not asked for these interpretations?

These issues, by no means an exhaustive list, are not exclusive to the research interview, though they are highly applicable here. For further reading on ethical issues we refer readers to Chapter 2.

16 Accounts

The rationale of much of this chapter is located in the interpretive, ethnographic paradigm which strives to view situations through the eyes of participants, to catch their intentionality and their interpretations of frequently complex situations, their meaning systems and the dynamics of the interaction as it unfolds. This is akin to the notion of ‘thick description’ from Geertz (1973) and his predecessor Ryle (1949). The chapter proceeds in several stages: firstly, we set out the characteristics of the ethogenic approach; secondly, we set out procedures in eliciting, analysing and authenticating accounts; thirdly, we provide an introduction to handling qualitative accounts and their related fields of: (a) network analysis; (b) discourse analysis; fourthly, we provide an introduction to accounts; finally, we review the strengths and weaknesses of ethogenic approaches. We recognize that the field of language and language use is vast, and to try to do justice to it here is the ‘optimism of ignorance’ (Edwards, 1976). Rather, we attempt to indicate some important ways in which researchers can use accounts in collecting data for their research.

The field also owes a considerable amount to the communication theory and speech act theory of Austin (1962), Searle (1969) and, more recently, Habermas (e.g. 1979, 1984). In particular, the notion that there are three kinds of speech act (locutionary: saying something; illocutionary: doing something whilst saying something; and perlocutionary: achieving something by saying something) might commend itself for further study.

Introduction

Although each of us sees the world from our own point of view, we have a way of speaking about our experiences which we share with those around us. Explaining our behaviour towards one another can be thought of as accounting for our actions in order to make them intelligible and justifiable to our fellows. Thus, saying ‘I’m terribly sorry, I didn’t mean to bump into you’, is a simple case of the explication of social meaning, for by locating the bump outside any planned sequence and neutralizing it by making it intelligible in such a way that it is not warrantable, it ceases to be offensive in that situation (Harré, 1978).

Accounting for actions in those larger slices of life called social episodes is the central concern of a participatory psychology which focuses upon actors’ intentions, their beliefs about what sorts of behaviour will enable them to reach their goals, and their awareness of the rules that govern those behaviours. Studies carried out within this framework have been termed ‘ethogenic’, an adjective which expresses a view of the human being as a person, that is, a plan-making, self-monitoring agent, aware of goals and deliberately considering the best ways to achieve them. Ethogenic studies represent another approach to the study of social behaviour and their methods stand in bold contrast to those commonly employed in much of the educational research which we describe in Chapter 12. Before discussing the elicitation and analysis of accounts we need to outline the ethogenic approach in more detail. This we do by reference to the work of one of its foremost exponents, Rom Harré (1974, 1976, 1977a, 1977b, 1978).

The ethogenic approach

Harré (1978) identifies five main principles in the ethogenic approach. They are set out in Box 16.1.

Characteristics of accounts and episodes

The discussion of accounts and episodes that now follows develops some of the ideas contained in the principles of the ethogenic approach outlined in Box 16.1.

We have already noted that accounts must be seen within the context of social episodes. The idea of an episode is a fairly general one. The concept itself may be defined as any coherent fragment of social life. Being a natural division of life, an episode will often have a recognizable beginning and end, and the sequence of actions that constitute it will have some meaning for the participants. Episodes may thus vary in duration and reflect innumerable aspects of life. A pupil entering primary school at seven and leaving at eleven would be an extended episode. A two-minute television interview with a political celebrity would be another. The contents of an episode which interest the ethogenic researcher include not only the perceived behaviour such as gesture and speech, but also the thoughts, the feelings and the intentions of those taking part. And the ‘speech’ that accounts for those thoughts, feelings and intentions must be conceived of in the widest connotation of the word. Thus, accounts may be personal records of the events we experience in our day-to-day lives, our conversations with neighbours, our letters to friends, our entries in diaries. Accounts serve to explain our past, present and future oriented actions.

Providing that accounts are authentic, it is argued, there is no reason why they should not be used as scientific tools in explaining people’s actions.

Procedures in eliciting, analysing and authenticating accounts

The account-gathering method proposed by Brown and Sime (1977) is summarized in Box 16.2. It involves attention to informants, the account-gathering situation, the transformation of accounts, and researchers’ accounts, and sets out control procedures for each of these elements.

Box 16.1 Principles in the ethogenic approach
Source Adapted from Harré, 1978

1 An explicit distinction is drawn between synchronic analysis, that is, the analysis of social practices and institutions as they exist at any one time, and diachronic analysis, the study of the stages and the processes by which social practices and institutions are created and abandoned, change and are changed. Neither type of analysis can be expected to lead directly to the discovery of universal social psychological principles or laws.
2 In social interactions, it is assumed that action takes place through endowing intersubjective entities with meaning; the ethogenic approach therefore concentrates upon the meaning system, that is, the whole sequence by which a social act is achieved in an episode. Consider, for example, the action of a kiss in the particular episodes of (a) leaving a friend’s house; (b) the passing-out parade at St Cyr; and (c) the meeting in the garden of Gethsemane.
3 The ethogenic approach is concerned with speech which accompanies action. That speech is intended to make the action intelligible and justifiable in occurring at the time and the place it did in the whole sequence of unfolding and co-ordinated action. Such speech is accounting. In so far as accounts are socially meaningful, it is possible to derive accounts of accounts.
4 The ethogenic approach is founded upon the belief that a human being tends to be the kind of person his language, his traditions, his tacit and explicit knowledge tell him he is.
5 The skills that are employed in ethogenic studies therefore make use of commonsense understandings of the social world. As such the activities of the poet and the playwright offer the ethogenic researcher a better model than those of the physical scientist.


Problems of eliciting, analysing and authenticating accounts are further illustrated in the following outlines of two educational studies. The first is concerned with valuing among older boys and girls; the second is to do with the activities of pupils and teachers in using computers in primary classrooms. In a study of adolescent values, Kitwood (1977) developed an experience-sampling method, that is, a qualitative technique for gathering and analysing accounts based upon tape-recorded interviews that were themselves prompted by the fifteen situations listed in Box 16.3.

Because the experience-sampling method avoids interrogation, the material which emerges is less organized than that obtained from a tightly structured interview. Successful handling of individual accounts therefore requires the researcher to know the interview content extremely well and to work toward the gradual emergence of tentative interpretive schemata which she then modifies, confirms or falsifies as the research continues. Kitwood identifies eight methods for dealing with the tape-recorded accounts. Methods 1–4 are fairly close to the approach adopted in handling questionnaires; methods 5–8 are more in tune with the ethogenic principles that we identified earlier:

1 The total pattern of choice The frequency of choice of various items permits some surface generalizations about the participants, taken as a group. The most revealing analyses may be those of the least and most popular items (a short illustrative sketch of this method follows the list).
2 Similarities and differences Using the same technique as in method 1, it is possible to investigate similarities and differences within the total sample of accounts according to some characteristic(s) of the participants such as age, sex, level of educational attainment, etc.
3 Grouping items together It may be convenient for some purposes to fuse together categories that cover similar subject matter. For example, items 1, 5 and 14 in Box 16.3 relate to conflict; items 4, 7 and 15, to personal growth and change.

Box 16.2 Account gathering
Source Brown and Sime, 1981

Research strategy / Control procedure

1 Informants
Definition of episode and role groups representing domain of interest / Rationale for choice of episode and role groups
Identification of exemplars / Degree of involvement of potential informants
Selection of individual informants / Contact with individuals to establish motive for participation, competence and performance

2 Account gathering situation
Establishing venue / Contextual effects of venue
Recording the account / Appropriateness and accuracy in documenting account
Controlling relevance of account / Accounts agenda
Authenticating account / Negotiation and internal consistency
Establishing role of interviewer and interviewee / Degree of direction
Post account authentication / Corroboration

3 Transformation of accounts
Provision of working documents / Transcription reliability; coder reliability
Data reduction techniques / Appropriateness of statistical and content analyses

4 Researchers’ accounts
Account of the account (summary, overview, interpretation) / Description of research operations, explanatory scheme and theoretical background


4 Categorization of content The content of a particular item is inspected for the total sample and an attempt is then made to develop some categories into which all the material will fit. The analysis is most effective when two or more researchers work in collaboration, each initially proposing a category system independently and then exchanging views to negotiate a final category system.
5 Tracing a theme This type of analysis transcends the rather artificial boundaries which the items themselves imply. It aims to collect as much data as possible relevant to a particular topic regardless of where it occurs in the interview material. The method is exacting because it requires very detailed knowledge of content and may entail going through taped interviews several times. Data so collected may be further analysed along the lines suggested in method 4 above.
6 The study of omissions The researcher may well have expectations about the kind of issues likely to occur in the interviews. When some of these are absent, that fact may be highly significant. The absence of an anticipated topic should be explored to discover the correct explanation of its omission.
7 Reconstruction of a social life-world This method can be applied to the accounts of a number of people who have part of their lives in common, for example, a group of friends who go around together. The aim is to attempt some kind of reconstruction of the world which the participants share in analysing the fragmentary material obtained in an interview. The researcher seeks to understand the dominant modes of orienting to reality, the conceptions of purpose and the limits to what is perceived.

Box 16.3 Experience sampling method
Source Adapted from Kitwood, 1977

Below are listed fifteen types of situation which most people have been in at some time. Try to think of something that has happened in your life in the last year or so, or perhaps something that keeps on happening, which fits into each of the descriptions. Then choose the ten of them which deal with the things that seem to you to be most important, which cover your main interests and concerns, and the different parts of your life. When we meet we will talk together about the situations you have chosen. Try beforehand to remember as clearly as you can what happened, what you and others did, and how you yourself felt and thought. Be as definite as you can. If you like, write a few notes to help you keep the situation in mind.

1 When there was a misunderstanding between you and someone else (or several others)…
2 When you got on really well with people…
3 When you had to make an important decision…
4 When you discovered something new about yourself…
5 When you felt angry, annoyed or resentful…
6 When you did what was expected of you…
7 When your life changed direction in some way…
8 When you felt you had done something well…
9 When you were right on your own, with hardly anyone taking your side…
10 When you ‘got away with it’, or were not found out…
11 When you made a serious mistake…
12 When you felt afterwards that you had done right…
13 When you were disappointed with yourself…
14 When you had a serious clash or disagreement with another person…
15 When you began to take seriously something that had not mattered much to you before…


8 Generating and testing hypotheses New hypotheses may occur to the researcher during the analysis of the tape-recordings. It is possible to do more than simply advance these as a result of tentative impressions; one can loosely apply the hypothetico-deductive method to the data. This involves putting the hypothesis forward as clearly as possible, working out what the verifiable inferences from it would logically be, and testing these against the account data. Where these data are too fragmentary, the researcher may then consider what kind of evidence and method of obtaining it would be necessary for more thorough hypothesis testing. Subsequent sets of interviews forming part of the same piece of research might then be used to obtain relevant data.
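By way of illustration of methods 1 and 3 above, here is a minimal sketch (ours, not Kitwood’s) in Python. The participants and their choices are invented; each participant chooses ten of the fifteen situations in Box 16.3, and the two groupings are the examples given under method 3.

    from collections import Counter

    # Invented data: each participant's ten chosen items (numbered 1-15).
    choices = {
        "P1": [1, 2, 3, 5, 7, 8, 11, 13, 14, 15],
        "P2": [2, 3, 4, 5, 6, 7, 9, 12, 14, 15],
        "P3": [1, 2, 4, 5, 8, 10, 11, 13, 14, 15],
    }

    # Method 1: the total pattern of choice across the group.
    frequencies = Counter(item for chosen in choices.values() for item in chosen)
    for item, count in frequencies.most_common():
        print(f"item {item:2d}: chosen by {count} participant(s)")

    # Method 3: grouping items that cover similar subject matter.
    groups = {"conflict": {1, 5, 14}, "personal growth and change": {4, 7, 15}}
    for label, items in groups.items():
        print(label, sum(frequencies[i] for i in items), "choices in total")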

In the light of the weaknesses in account gathering and analysis (discussed later), Kitwood’s suggestions of safeguards are worth mentioning. First, he calls for cross-checking between researchers as a precaution against consistent but unrecognized bias in the interviews themselves. Second, he recommends member tests, that is, taking hypotheses and unresolved problems back to the participants themselves, or to people in similar situations to them, for their comments. Only in this way can researchers be sure that they understand the participants’ own grounds for action. Since there is always the possibility that an obliging participant will readily confirm the researcher’s own speculations, every effort should be made to convey to the participant that one wants to know the truth as he or she sees it, and that one is as glad to be proved wrong as right.

A study by Blease and Cohen (1990) used cross-checking as a way of validating the classroom observation records of co-researchers, and member tests to authenticate both quantitative and qualitative data derived from teacher and pupil informants. Thus, in the case of cross-checking, the classroom observation schedules of research assistants and researchers were compared and discussed, to arrive at definitive accounts of the range and duration of specific computer activities occurring within observation sessions. Member tests arose when interpretations of interview data were taken back to participating teachers for their comments. Similarly, pupils’ scores on certain self-concept scales were discussed individually with respondents in order to ascertain why children awarded themselves high or low marks in respect of a range of skills in using computer programmes.

Network analyses of qualitative data

Another technique that has been successfully employed in the analysis of qualitative data is described by its originators as ‘systematic network analysis’ (Bliss, Monk and Ogborn, 1983). Drawing upon developments in artificial intelligence, Bliss and her colleagues employed the concept of ‘relational network’ to represent the content and structuring of a person’s knowledge of a particular domain.

Essentially, network analysis involves the development of an elaborate system of categories by way of classifying qualitative data and preserving the essential complexity and subtlety of the materials under investigation. A notational technique is employed to generate network-like structures that show the inter-dependencies of the categories as they are developed. Network mapping is akin to cognitive mapping,1 an example of which can be seen in the work of Bliss et al. (1983).
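To make the idea concrete, here is a minimal sketch (ours, not Bliss et al.’s actual notation) of how such a network of categories might be held and enumerated in Python. The node types echo the spirit of systemic network notation, an ‘or’ for mutually exclusive options and an ‘and’ for co-occurring aspects, and all the category names are invented.

    # Each node is ("and"|"or", name, children); a bare string is a terminal category.
    network = ("and", "utterance", [
        ("or", "speaker", ["teacher", "pupil"]),
        ("or", "function", ["informs", "questions", "evaluates"]),
    ])

    def paths(node, prefix=""):
        """List every terminal category with its full path through the network."""
        kind, name, children = node if isinstance(node, tuple) else ("leaf", node, [])
        label = prefix + "/" + name
        if kind == "leaf":
            return [label]
        result = []
        for child in children:
            result.extend(paths(child, label))
        return result

    for p in paths(network):
        print(p)   # e.g. /utterance/speaker/teacher

A coder using such a network would, for each fragment of an account, make one selection under every ‘or’ and record all aspects under an ‘and’, so that each coding is itself a path (or bundle of paths) through the network.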

What makes a good network?

Bliss et al. (1983) point out that there cannot be one overall account of criteria for judging the merits of a particular network. They do, however, attempt to identify a number of factors that ought to feature in any discussion of the standards by which a network might fairly be judged as adequate.

First, any system of description needs to be valid and reliable: valid in the sense that it is appropriate in kind and, within that kind, sufficiently complete and faithful; reliable in the sense that there exists an acceptable level of agreement between people as to how to use the network system to describe data.

Second, there are properties that a network description should possess such as clarity, completeness and self-consistency. These relate to a further criterion of ‘network utility’, the sufficiency of detail contained in a particular network. A third property that a network should possess is termed ‘learnability’. Communicating the terms of the analysis to others, say the authors, is of central importance. It follows therefore that much hinges on whether networks are relatively easy or hard to teach to others. A fourth aspect of network acceptability has to do with its ‘testability’. Bliss et al. identify two forms of testability, the first having to do with testing a network as a ‘theory’ against data, the second with testing data against a ‘theory’ or expectation via a network.

Finally, the terms ‘expressiveness’ and ‘persuasiveness’ refer to qualities of language used in developing the network structure. And here, the authors proffer the following advice: ‘Helpful as the choice of an expressive coding mood or neat use of indentation or brackets may be, the code actually says no more than the network distinguishes’ (our italics).

To conclude, network analysis would seem to have a useful role to play in educational research by providing a technique for dealing with the bulk and the complexity of the accounts that are typically generated in qualitative studies.

Discourse analysis

Discourse researchers explore the organization of ordinary talk and everyday explanations and the social actions performed in them. Collecting, transcribing and analysing discourse data constitutes a kind of psychological ‘natural history’ of the phenomena in which discourse analysts are interested (Edwards and Potter, 1993). Discourses can be regarded as sets of linguistic material that are coherent in organization and content and enable people to construct meaning in social contexts (Coyle, 1995:245). The emphasis on the construction of meaning indicates the action perspective of discourse analysis (ibid.) and this resonates with the notion of speech acts mentioned at the start of this chapter: locutions, illocutions and perlocutions.

Further, the focus on discourse and speech acts links this style of research to Habermas’s critical theory set out at the start of this book. Habermas argues that utterances are never simply sentences (Habermas, 1970:368) that are disembodied from context; rather, their meaning derives from the intersubjective contexts in which they are set. A speech situation has a double structure: the propositional content (the locutionary aspect, what is being said) and the performatory content (the illocutionary and perlocutionary aspect, what is being done or achieved through the utterance). For Habermas (1979, 1984) each utterance has to abide by the criteria of legitimacy, truth, rightness, sincerity and comprehensibility. His concept of the ‘ideal speech situation’ argues that speech, and, for our purposes here, discourse, should seek to be empowering and not subject to repression or ideological distortion. His ideal speech situation is governed by several principles, not the least of which are: mutual understanding between participants, freedom to enter a discourse, an equal opportunity to use speech acts, discussion to be free from domination, and the movement towards consensus resulting from the discussion alone and the force of the argument alone (rather than the position power of speakers). For Habermas, then, discourse analysis would seek to uncover, through ideology critique (see Chapter 1), the repressive forces which ‘systematically distort’ communication. For our purposes, we can take from Habermas the need to expose and interrogate the dominatory influences that not only thread through the discourses which researchers are studying, but also the discourses that the research itself produces.

Recent developments in discourse analysis have made important contributions to our understanding of children’s thinking, challenging views (still common in educational circles) of ‘the child as a lone organism, constructing a succession of general models of the world as each new stage is mastered’ (Edwards, 1991). Rather than treating children’s language as representative of an inner cognitive world to be explored experimentally by controlling for a host of intruding variables, discourse analysts treat that language as action, as ‘situated discursive practice’.2

By way of example, Edwards (1993) explores discourse data emanating from a visit to a greenhouse by 5-year-old pupils and their teacher, to see plants being propagated and grown. His analysis shows how children take understandings of adults’ meanings from the words they hear and the situations in which those words are used. And in turn, adults (in this case, the teacher) take from pupils’ talk not only what they might mean but also what they could and should mean. What Edwards describes as ‘the discursive appropriation of ideas’ (Edwards, 1991) is illustrated in Box 16.4.

Discourse analysis requires a careful reading and interpretation of textual material, with interpretation being supported by the linguistic evidence. The inferential and interactional aspects of discourse and discourse analysis suggest the need for the researcher to be highly sensitive to the nuances of language (Coyle, 1995:247). In discourse analysis, as in qualitative data analysis generally (Miles and Huberman, 1984), the researcher can use coding at an early stage of analysis, assigning codes to the textual material being studied (Parker, 1992; Potter and Wetherell, 1987). This enables the researcher to discover patterns and broad areas in the discourse; computer programmes such as Code-A-Text and Ethnograph can assist here. With this achieved, the researcher can then re-examine the text to discover intentions, functions and consequences of the discourse (examining the speech act functions of the discourse, e.g. to impart information, to persuade, to accuse, to censure, to encourage etc.). By seeking alternative explanations and the degree of variability in the discourse, it is possible to rule out rival interpretations and arrive at a fair reading of what was actually taking place in the discourse in its social context.
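As a crude illustration of that first, mechanical coding pass (ours; it is no substitute for the interpretive work described above), the sketch below assigns speech-act codes to invented transcript turns by simple surface rules.

    import re

    # Invented transcript turns; a real study would work from full transcripts.
    turns = [
        ("Teacher", "What is a cutting?"),
        ("Sam", "It's when you cut off a piece of a plant."),
        ("Teacher", "Exactly, and what do you then do with it to make it grow?"),
        ("Sally", "I know another way."),
    ]

    def code_turn(utterance):
        # Purely illustrative rules: a question mark codes the turn as eliciting,
        # praise words as encouraging, anything else as informing.
        if "?" in utterance:
            return "elicits"
        if re.search(r"\b(exactly|good|right|well done)\b", utterance, re.I):
            return "encourages"
        return "informs"

    for speaker, text in turns:
        print(f"{speaker:8s} [{code_turn(text)}] {text}")

The point of such a pass is only to surface patterns and broad areas for the re-examination that follows; the codes themselves remain the researcher’s interpretive responsibility.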

Box 16.4 Concepts in children’s talk
Source Edwards, 1993

81 Sally Cuttings can grow to plants.
82 Teacher [writing] ‘Cuttings can grow—,’ instead of saying ‘to
83 plants’ you can say ‘grow, to plants.’
84 Sally =You wrote Chris
85 Teacher Oops. Thank you. I’ll do this again. ‘Cuttings can
86 grow into plants’. That’s also good. What is a cutting,
87 Christina?
88 Christina A cutting is, umm, I don’t know.
89 Teacher Who knows what a cutting is besides Sally? Sam.
90 Sam It’s when you cut off a -, it’s when you cut off a piece
91 of a plant.
92 Teacher Exactly, and when you cut off a piece of a plant, what do
93 you then do with it to make it grow? If you leave
94
95 X
96 Teacher Well, sometimes you can put it in soil.
97 Y And
98 Teacher wait, what else could you put it in?
99 Sam Put it in a pot?
100 Teacher Pot, with soil, or . . . ? There’s another way.
101 Sally I know another way.=
102 Teacher =Wait. Sam, do you know? No?=
103 Sam =Dirt.
104 Teacher No, it doesn’t have to do with s -, it’s not a solid, it’s
105 a liquid. What
106 Meredith
107 Teacher Right. [. . .]


The application of discourse analysis to our understanding of classroom learning processes is well exemplified in a study by Edwards and Mercer (1987). Rather than taking the classroom talk as evidence of children’s thought processes, the researchers explore it as ‘contextualized dialogue with the teacher. The discourse itself is the educational reality and the issue becomes that of examining how teacher and children construct a shared account, a common interpretative framework for curriculum knowledge and for what happens in the classroom’ (Edwards, 1991).

Overriding asymmetries between teachers and pupils, Edwards concludes, both cognitive (in terms of knowledge) and interactive (in terms of power), impose different discursive patterns and functions. Indeed, Edwards (1980) suggests that teachers control classroom talk very effectively, reproducing asymmetries of power in the classroom by telling the students when to talk, what to talk about, and how well they have talked.

Discourse analysis has been criticized for its lack of systematicity (Coyle, 1995:256), for its emphasis on the linguistic construction of a social reality, and for the impact of the analysis in shifting attention away from what is being analysed and towards the analysis itself, i.e. the risk of losing the independence of the phenomena. Discourse analysis risks reifying discourse. One must not lose sight of the fact that the discourse analysis itself is a text, a discourse that in turn can be analysed for its meaning and inferences, rendering the need for reflexivity to be high (Ashmore, 1989).3

Edwards and Westgate (1987) show what substantial strides have been made in recent years in the development of approaches to the investigation of classroom dialogue. Some methods encourage participants to talk; others wait for talk to emerge, and sophisticated audio/video techniques record the result by whatever method it is achieved. Thus captured, dialogue is reviewed, discussed and reflected upon; moreover, that reviewing, discussing and reflecting is usually undertaken by researchers. It is they, generally, who read ‘between the lines’ and ‘within the gaps’ of classroom talk by way of interpreting the intentionality of the participating discussants.4

Analysing social episodes

A major problem in the investigation of that natural unit of social behaviour, the ‘social episode’, has been the ambiguity that surrounds the concept itself and the lack of an acceptable taxonomy by which to classify an interaction sequence on the basis of empirically quantifiable characteristics. Several quantitative studies have been undertaken in this field. For example, Magnusson (1971) and Ekehammer and Magnusson (1973) use factor analysis, and McQuitty (1957) linkage analysis, whilst Forgas (1976, 1978), Peevers and Secord (1973) and Secord and Peevers (1974) use multidimensional scaling and cluster analysis.
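For readers unfamiliar with the last of these techniques, here is a minimal sketch (ours, unrelated to the cited studies’ actual data or procedures) of agglomerative single-linkage cluster analysis over an invented matrix of dissimilarities between five social episodes.

    # Invented, symmetric dissimilarities between five social episodes.
    episodes = ["E1", "E2", "E3", "E4", "E5"]
    d = {
        ("E1", "E2"): 0.2, ("E1", "E3"): 0.7, ("E1", "E4"): 0.9, ("E1", "E5"): 0.8,
        ("E2", "E3"): 0.6, ("E2", "E4"): 0.8, ("E2", "E5"): 0.9,
        ("E3", "E4"): 0.3, ("E3", "E5"): 0.4, ("E4", "E5"): 0.2,
    }

    def dist(a, b):
        return d.get((a, b)) or d.get((b, a))

    # Single linkage: repeatedly merge the two clusters whose closest members
    # are nearest, until only two clusters of similar episodes remain.
    clusters = [{e} for e in episodes]
    while len(clusters) > 2:
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: min(dist(a, b) for a in clusters[ij[0]] for b in clusters[ij[1]]),
        )
        clusters[i] |= clusters[j]
        del clusters[j]

    print(clusters)   # two broad groupings of similar episodes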

Account gathering in educationalresearch: an example

The ‘free commentary’ method that Secord and Peevers (1974) recommend as a way of probing for explanations of people’s behaviour lies at the very heart of the ethnographer’s skills. In the example of ethnographic research that follows, one can detect the researcher’s attempts to get below the surface data and to search for the deeper, hidden patterns that are only revealed when attention is directed to the ways that group members interpret the flow of events in their lives.


Heath: ‘Questioning at home and at school’ (1982)

Heath’s study of misunderstandings existing between black children and their white teachers in classrooms in the south of the United States brought to light teachers’ assumptions that pupils would respond to language routines and the uses of language in building knowledge and skills just as other children (including their own) did (Heath, 1982).5 Specifically, she sought to understand why these particular children did not respond just as others did. Her research involved eliciting explanations from both the children’s parents and teachers. ‘We don’t talk to our children like you folks do’, the parents observed when questioned about their children’s behaviour. Those children, it seemed to Heath, were not regarded as information-givers or as appropriate conversational partners for adults. That is not to say that the children were excluded from language participation. They did, in fact, participate in a language that Heath describes as rich in styles, speakers and topics. Rather, it seemed to the researcher that the teachers’ characteristic mode of questioning was ‘to pull attributes of things out of context, particularly out of the context of books and name them—queens, elves, police, red apples’ (Heath, 1982). The parents did not ask these kinds of questions of their children, and the children themselves had their own ways of deflecting such questions, as the example in Box 16.5 well illustrates.

Heath elicited both parents’ and teachers’ accounts of the children’s behaviour and their apparent communication ‘problems’ (see Box 16.6). Her account of accounts arose out of periods of participation and observation in classrooms and in some of the teachers’ homes. In particular, she focused upon the ways in which ‘the children learned to use language to satisfy their needs, ask questions, transmit information, and convince those around them that they were competent communicators’ (Heath, 1982). This involved her in a much wider and more intensive study of the total fabric of life in Trackton, the southern community in which the research was located.

Over five years… I was able to collect data across a wide range of situations and to follow some children longitudinally as they acquired communicative competence in Trackton. Likewise, at various periods during these years, I observed Trackton adults in public service encounters and on their jobs… The context of language use, including setting, topic, and participants (both those directly involved in the talk and those who only listened) determined in large part how community members, teachers, public service personnel, and fellow workers judged the communicative competence of Trackton residents.

(Heath, 1982)6

Box 16.5 ‘Ain’t nobody can talk about things being about theirselves’
Source Adapted from Spindler, 1982

This comment by a 9-year-old boy was directed to his teacher when she persisted in interrogating him about the story he had just completed in his reading group.

Teacher: What is the story about?
Children: (silence)
Teacher: Uh…Let’s… Who is it the story talks about?
Children: (silence)
Teacher: Who is the main character?…Um… What kind of story is it?
Child: Ain’t nobody can talk about things being about theirselves.

The boy was saying ‘There’s no way anybody can talk (and ask) about things being about themselves’.


Problems in gathering and analysing accounts

The importance of the meaning of events and actions to those who are involved in them is now generally recognized in social research. The implications of the ethogenic stance in terms of actual research techniques, however, remain problematic. Menzel (1978)7 discusses a number of ambiguities and shortcomings in the ethogenic approach, arising out of the multiplicity of meanings that may be held for the same behaviour. Most behaviour, Menzel observes, can be assigned meanings and more than one of these may very well be valid simultaneously. It is fallacious therefore, he argues, to insist upon determining ‘the’ meaning of an act. Nor can it be said that the task of interpreting an act is done when one has identified one meaning of it, or the one meaning that the researcher is pleased to designate as the true one.

A second problem that Menzel raises is to do with actors’ meanings as sources of bias. How central a place, he asks, ought to be given to actors’ meanings in formulating explanations of events? Should the researcher exclusively and invariably be guided by these considerations? To do so would be to ignore a whole range of potential explanations which few researchers would wish to see excluded from consideration.

These are far-reaching, difficult issues, though by no means intractable. What solutions does Menzel propose? First, we must specify ‘to whom’ when asking what acts and situations mean. Second, researchers must make choices and take responsibility in the assignment of meanings to acts; moreover, problem formulations must respect the meaning of the act to us, the researchers. And third, explanations should respect the meanings of acts to the actors themselves but need not invariably be centred on these meanings.

Menzel’s plea is for the usefulness of an outside observer’s account of a social episode alongside the explanations that participants themselves may give of that event. A similar argument is implicit in McIntyre and McLeod’s (1978) justification of objective, systematic observation in classroom settings. Their case is set out in Box 16.7.

Box 16.7 Justification of objective systematic observation in classroom settings
Source McIntyre and McLeod, in McAleese and Hamilton, 1978

When Smith looks at Jones and says, ‘Jones, why does the blue substance spread through the liquid?’ (probably with a particular kind of voice inflection), and then silently looks at Jones (probably with a particular kind of facial expression), the observer can unambiguously categorize the event as ‘Smith asks Jones a question seeking an explanation of diffusion in a liquid.’ Now Smith might describe the event as ‘giving Jones a chance to show he knows something’, and Jones might describe the event as ‘Smith trying to get at me’; but if either of them denied the validity of the observer’s description, they would be simply wrong, because the observer would be describing at least part of what the behaviour which occurred means in English in Britain. No assumptions are made here about the effectiveness of classroom communication; but the assumption is made that…communication is dependent on the system of conventional meanings available within the wider culture. More fundamentally, this interpretation implies that the systematic observer is concerned with an objective reality (or, if one prefers, a shared intersubjective reality) of classroom events. This is not to suggest that the subjective meanings of events to participants are not important, but only that these are not accessible to the observer and that there is an objective reality to classroom activity which does not depend on these meanings [our emphasis].

Strengths of the ethogenic approach

The advantages of the ethogenic approach to the educational researcher lie in the distinctive insights that are made available to her through the analysis of accounts of social episodes.

Box 16.6 Parents and teachers: divergent viewpoints on children’s communicative competence
Source Adapted from Spindler, 1982

Parents
The teachers won’t listen. My kid, he too scared to talk, ’cause nobody play by the rules he know. At home, I can’t shut ’im up.

Miss Davis, she complain ’bout Ned not answerin’ back. He say she asks dumb questions she already know ’bout.

Teachers
They don’t seem to be able to answer even the simplest questions.

I would almost think some of them have a hearing problem; it is as though they don’t hear me ask a question. I get blank stares to my questions. Yet when I am making statements or telling stories which interest them, they always seem to hear me.

The simplest questions are the ones they can’t answer in the classroom; yet on the playground, they can explain a rule for a ballgame or describe a particular kind of bait with no problem. Therefore, I know they can’t be as dumb as they seem in my class.

I sometimes feel that when I look at them and ask a question I’m staring at a wall I can’t break through. There’s something there; yet in spite of all the questions I ask, I’m never sure I’ve gotten through to what’s inside that wall.


The benefits to be derived from the exploration of accounts are best seen by contrasting8 the ethogenic approach with a more traditional educational technique such as the survey which we discussed in Chapter 8.

There is a good deal of truth in the assertion of the ethogenically oriented researcher that approaches which employ survey techniques such as the questionnaire take for granted the very things that should be treated as problematic in an educational study. Too often, the phenomena that ought to be the focus of attention are taken as given, that is, they are treated as the starting point of the research rather than becoming the centre of the researcher’s interest and effort to discover how the phenomena arose or came to be important in the first place. Numerous educational studies, for example, have identified the incidence and the duration of disciplinary infractions in school; only relatively recently, however, has the meaning of classroom disorder, as opposed to its frequency and type, been subjected to intensive investigation.9 Unlike the survey, which is a cross-sectional technique that takes its data at a single point in time, the ethogenic study employs an ongoing observational approach that focuses upon processes rather than products. Thus it is the process of becoming deviant in school which would capture the attention of the ethogenic researcher rather than the frequency and type of misbehaviour among k types of ability in children located in n kinds of school.

A note on stories

A comparatively neglected area in educational research is the field of stories and storytelling. Bauman (1986:3) suggests that stories are oral literature whose meanings, forms and functions are situationally rooted in cultural contexts, scenes and events which give meaning to action. This recalls Bruner (1986) who, echoing the interpretive mode of educational research, regards much action as ‘storied text’, with actors making meaning of their situations through narrative. Stories have a legitimate place as an inquiry method in educational research (Parsons and Lyons, 1979), and, indeed, Jones (1990), Crow (1992), Dunning (1993) and Thody (1997) place them on a par with interviews as sources of evidence for research. Thody (1997:331) suggests that, as an extension to interviews, stories—like biographies—are rich in authentic, live data; they are, she avers, an ‘unparalleled method of reaching practitioners’ mindsets’. She provides a fascinating report on stories as data sources for educational management research as well as for gathering data from young children (pp. 333–4).

Thody indicates (p. 331) how stories can be analysed, using, for example, conventional techniques such as categorizing and coding of content, thematization and concept building.

Box 16.7 Justification of objective, systematic observation in classroom settings

Source: McIntyre and McLeod, in McAleese and Hamilton, 1978

When Smith looks at Jones and says, ‘Jones, why does the blue substance spread through the liquid?’ (probably with a particular kind of voice inflection), and then silently looks at Jones (probably with a particular kind of facial expression), the observer can unambiguously categorize the event as ‘Smith asks Jones a question seeking an explanation of diffusion in a liquid.’ Now Smith might describe the event as ‘giving Jones a chance to show he knows something’, and Jones might describe the event as ‘Smith trying to get at me’; but if either of them denied the validity of the observer’s description, they would be simply wrong, because the observer would be describing at least part of what the behaviour which occurred means in English in Britain. No assumptions are made here about the effectiveness of classroom communication; but the assumption is made that…communication is dependent on the system of conventional meanings available within the wider culture. More fundamentally, this interpretation implies that the systematic observer is concerned with an objective reality (or, if one prefers, a shared intersubjective reality) of classroom events. This is not to suggest that the subjective meanings of events to participants are not important, but only that these are not accessible to the observer and that there is an objective reality to classroom activity which does not depend on these meanings [our emphasis].


In this respect stories have their place alongside other sources of primary and secondary documentary evidence (e.g. case studies, biographies). They can be used in ex post facto research, historical research, as accounts or in action research; in short, they are part of the everyday battery of research instruments that are available to the researcher. The rise in the use of oral history as a legitimate research technique in social research can be seen here to apply to educational research. Though they might be problematic in that verification is difficult (unless other people were present to verify events reported), stories, being rich in the subjective involvement of the storyteller, offer an opportunity for the researcher to gather authentic, rich and ‘respectable’ data (Bauman, 1986).

17 Observation

Observational data are attractive as they afford the researcher the opportunity to gather ‘live’ data from ‘live’ situations. The researcher is given the opportunity to look at what is taking place in situ rather than at second hand (Patton, 1990:203–5). This enables researchers to understand the context of programmes, to be open-ended and inductive, to see things that might otherwise be unconsciously missed, to discover things that participants might not freely talk about in interview situations, to move beyond perception-based data (e.g. opinions in interviews), and to access personal knowledge. Because observed incidents are less predictable there is a certain freshness to this form of data collection that is often denied in other forms, e.g. a questionnaire or a test.

Observations, it is argued (Morrison, 1993:80), enable the researcher to gather data on:

• the physical setting (e.g. the physical environment and its organization);
• the human setting (e.g. the organization of people, the characteristics and make-up of the groups or individuals being observed, for instance gender, class);
• the interactional setting (e.g. the interactions that are taking place: formal, informal, planned, unplanned, verbal, non-verbal etc.);
• the programme setting (e.g. the resources and their organization, pedagogic styles, curricula and their organization).

Patton (1990:202) suggests that observational data should enable the researcher to enter and understand the situation that is being described.

The kinds of observation available to the researcher lie on a continuum from unstructured to structured, responsive to pre-ordinate. A highly structured observation will know in advance what it is looking for (i.e. pre-ordinate observation) and will have its observation categories worked out in advance. A semi-structured observation will have an agenda of issues but will gather data to illuminate these issues in a far less pre-determined or systematic manner. An unstructured observation will be far less clear on what it is looking for and will therefore have to go into a situation and observe what is taking place before deciding on its significance for the research. In a nutshell, a structured observation will already have its hypotheses decided and will use the observational data to confirm or refute these hypotheses. On the other hand, a semi-structured and, more particularly, an unstructured observation will be hypothesis-generating rather than hypothesis-testing. The semi-structured and unstructured observations will review observational data before suggesting an explanation for the phenomena being observed.

Though it is possible to argue that all research is some form of participant observation, since we cannot study the world without being part of it (Adler and Adler, 1994), Gold (1958) nevertheless offers a well-known classification of researcher roles in observation that lie on a continuum. At one end is the complete participant, moving to the participant-as-observer, thence to the observer-as-participant, and finally to the complete observer. The move is from complete participation to complete detachment. The mid-points of this continuum strive to balance involvement with detachment, closeness with distance, familiarity with strangeness. The role of the complete observer is typified in the one-way mirror, the video cassette, the audio-cassette and the photograph, whilst complete participation involves researchers taking on membership roles (overt or covert).

Traditionally observation has been characterized as non-interventionist (Adler and Adler, 1994:378), where researchers do not seek to manipulate the situation or subjects, do not pose questions for the subjects, nor deliberately create ‘new provocations’ (ibid.: 378). Quantitative research tends to have a small field of focus, fragmenting the observed into minute chunks that can subsequently be aggregated into a variable. Qualitative research, on the other hand, draws the researcher into the phenomenological complexity of participants’ worlds; here situations unfold, and connections, causes and correlations can be observed as they occur over time. The qualitative researcher seeks to catch the dynamic nature of events, to seek intentionality, and to seek large trends and patterns over time.

If we know in advance what we wish to observe, i.e. if the observation is concerned to chart the incidence, presence and frequency of elements of the four settings referred to earlier (Morrison, 1993:80), and maybe we wish to compare one situation with another, then it may be more efficient in terms of time to go into a situation with an already designed observation schedule. If, on the other hand, we want to go into a situation and let the elements of the situation speak for themselves, perhaps with no concern for how one situation compares with another, then it may be more appropriate to opt for a less structured observation.

The former, structured observation, takes much time to prepare but the data analysis is fairly rapid, the categories having already been established, whilst the latter, less structured approach, is quicker to prepare but the data take much longer to analyse. The former approach operates within the agenda of the researcher and hence might neglect aspects of the four settings above if they do not appear on the observation schedule, i.e. it looks selectively at situations. On the other hand, the latter operates within the agenda of the participants, i.e. it is responsive to what it finds and therefore, by definition, is honest to the situation which it finds. Here selectivity derives from the situation rather than from the researcher, in the sense that key issues emerge from, and follow, the observation, rather than the researcher knowing in advance what those key issues will be.

Structured observation

A structured observation is very systematic and enables the researcher to generate numerical data from the observations. Numerical data, in turn, facilitate the making of comparisons between settings and situations, and enable frequencies, patterns and trends to be noted or calculated. The observer adopts a passive, non-intrusive role, merely noting down the incidence of the factors being studied. Observations are entered on an observational schedule. An example is shown in Box 17.1: a schedule used to monitor student and teacher conversations over a ten-minute period. The upper seven categories indicate who is speaking to whom, whilst the lower four categories indicate the nature of the talk. Looking at the example of the observation schedule, several points can be noted:

• The categories for the observation are discrete, i.e. there is no overlap between them. For this to be the case, a pilot must have been developed and tested in order to iron out any problems of overlap between categories.
• Each column represents a thirty-second time interval, i.e. the movement from left to right represents the chronology of the sequence, and the researcher has to enter data in the appropriate cell of the matrix every thirty seconds (see below: instantaneous sampling).
• Because there are so many categories which have to be scanned at speed (every thirty seconds), the researcher will need to practise completing the schedule until he or she becomes proficient and consistent in entering data (i.e. the observed behaviours, settings etc. are entered into the same categories consistently), achieving reliability. This can be done either through practising with video material or through practising in a live situation with participants who will not subsequently be included in the research. If there is to be more than one researcher then it may be necessary to provide training sessions so that the team of researchers proficiently, efficiently and consistently enter the same sort of data in the same categories, i.e. that there is inter-rater reliability (a simple agreement check is sketched after this list).
• The researcher will need to decide what entry is to be made in the appropriate category, for example: a tick (✓), a forward slash (/), a backward slash (\), a figure (1, 2, 3 etc.),4 a letter (a, b, c etc.), a tally mark (|). Whatever code or set of codes is used, it must be understood by all the researchers (if there is a team) and must be simple and quick to enter (i.e. symbols rather than words). Bearing in mind that every thirty seconds one or more entries must be made in each column, the researcher will need to become proficient in fast and accurate data entry of the appropriate codes.1
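The degree of inter-rater agreement can be checked quite simply. The following is a minimal sketch in Python (ours, not the authors’), assuming two observers have coded the same sequence of intervals; the category labels and codings are invented for illustration, and simple percent agreement is used here rather than a chance-corrected index such as Cohen’s kappa.

def percent_agreement(coder_a, coder_b):
    """Proportion of intervals coded identically by two observers."""
    if len(coder_a) != len(coder_b):
        raise ValueError("both coders must rate the same intervals")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return matches / len(coder_a)

# Hypothetical codings of six thirty-second intervals by two observers.
coder_a = ["t-s", "t-s", "s-s", "t-class", "s-s", "t-s"]
coder_b = ["t-s", "s-s", "s-s", "t-class", "s-s", "t-s"]

print(f"agreement: {percent_agreement(coder_a, coder_b):.0%}")  # agreement: 83%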

The need to pilot a structured observation schedule, as in the example, cannot be overemphasized. Categories must be mutually exclusive and must be comprehensive. The researcher, then, will need to decide:

1 the foci of the observation (e.g. people as well as events);
2 the frequency of the observations (e.g. every thirty seconds, every minute, every two minutes);
3 the length of the observation period (e.g. one hour, twenty minutes);
4 the nature of the entry (the coding system).

The criterion of ‘fitness for purpose’ is used for making decisions on these four matters. Structured observation will take much time in preparation but the analysis of the data should be rapid as the categories for analysis will have been built into the schedule itself. So, for example, if close, detailed scrutiny is required then the time intervals will be very short, and if less detail is required then the intervals may be longer.

Box 17.1 A structured observation schedule

[Schedule not reproduced: a matrix whose rows are the talk categories and whose columns are successive thirty-second intervals. Notes to the box key one symbol to the participants in the conversation and a second to the nature of the conversation.]



There are four principal ways of entering data onto a structured observation schedule: event sampling, instantaneous sampling, interval recording, and rating scales.

Event sampling

Event sampling, also known as a sign system, requires a tally mark to be entered against each statement each time it is observed, for example:

teacher shouts at the child     /////
child shouts at the teacher     ///
parent shouts at the teacher    //
teacher shouts at the parent    //

The researcher will need to devise statements that yield the data that answer the research questions. This method is useful for finding out the frequencies or incidence of observed situations or behaviours, so that comparisons can be made; we can tell that the teacher does most shouting and that the parent shouts least of all. However, whilst these data enable us to chart the incidence of observed situations or behaviours, the difficulty with them is that we are unable to determine the chronological order in which they occurred. For example, two different stories could be told from these data if the sequence of events were known. If the data were presented in a chronology, one story could be seen as follows, where the numbers 1–7 are the different periods over time (e.g. every thirty seconds):

                              1    2    3    4    5    6    7
teacher shouts at the child        /    /    /    /    /
child shouts at the teacher   /    /                   /
parent shouts at the teacher            /    /
teacher shouts at the parent                      /         /

Imagine the scene: a parent and his child arrive late for school one morning and the child slips into the classroom; an event quickly occurs which prompts the child to shout at the teacher; the exasperated teacher is very cross when thus provoked by the child; the teacher shouts at the child, who then brings in the parent (who has not yet left the premises); the parent shouts at the teacher for unreasonable behaviour and the teacher shouts back at the child. It seems in this version that the teacher only shouts when provoked by the child or parent.

If the same number of tally marks were distributed in a different order, a very different story might emerge, for example:

                              1    2    3    4    5    6    7
teacher shouts at the child   /    /    /         /         /
child shouts at the teacher        /    /                   /
parent shouts at the teacher                 /    /
teacher shouts at the parent                 /         /

In this scene it is the teacher who is the instigator of the shouting, shouting at the child and then at the parent; the child and the parent only shout back when they have been provoked!
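The point that sign-system tallies discard sequence can be shown in a few lines of code. This is a minimal sketch (ours, not part of the original text): the two event streams below are invented to mirror the two stories, yet their tallies are indistinguishable.

from collections import Counter

# Story one: the child shouts first and the teacher reacts.
story_one = ["child-teacher", "child-teacher", "teacher-child",
             "teacher-child", "parent-teacher", "parent-teacher",
             "teacher-child", "teacher-child", "teacher-parent",
             "child-teacher", "teacher-child", "teacher-parent"]

# Story two: the teacher shouts first and the others react.
story_two = ["teacher-child", "teacher-child", "child-teacher",
             "teacher-child", "child-teacher", "teacher-parent",
             "parent-teacher", "teacher-child", "parent-teacher",
             "teacher-parent", "teacher-child", "child-teacher"]

# Tallying collapses each stream to frequencies, so the two
# different chronologies yield identical sign-system totals.
print(Counter(story_one) == Counter(story_two))   # True
print(Counter(story_one))
# Counter({'teacher-child': 5, 'child-teacher': 3,
#          'parent-teacher': 2, 'teacher-parent': 2})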

Instantaneous sampling

If it is important to know the chronology of events, then it is necessary to use instantaneous sampling, sometimes called time sampling. Here the researcher enters what she observes at standard intervals of time, for example every twenty seconds or every minute. On the stroke of that interval she notes what is happening at that precise moment and enters it into the appropriate category on the schedule. For example, imagine that the sampling will take place every thirty seconds; numbers 1–7 represent each thirty-second interval thus:

                              1    2    3    4    5    6    7
teacher smiles at the child   /         /         /         /
child smiles at the teacher        /         /         /    /
teacher smiles at the parent  /         /         /    /
parent smiles at the teacher       /         /         /    /

In this scene the researcher notes down what is happening on the thirty-second point and notices from these precise moments that the teacher initiates the smiling, but that all parties seem to be doing quite a lot of smiling, with the parent and the child doing the same amount of smiling each! Instantaneous sampling involves recording what is happening on the instant and entering it on the appropriate category. The chronology of events is thus preserved.
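As a minimal sketch (ours, not from the original text), instantaneous sampling can be simulated as follows: an invented timeline of events is probed at each thirty-second mark, and only whatever is in force at that precise moment is recorded, so the chronology of the sampled observations is retained.

# Invented timeline: (seconds elapsed, observed category).
timeline = [(5, "teacher smiles at child"), (20, "child smiles at teacher"),
            (42, "teacher smiles at parent"), (65, "parent smiles at teacher"),
            (95, "teacher smiles at child"), (110, "child writes silently")]

def instantaneous_sample(timeline, tick=30, duration=120):
    """Record the category in force at each tick (30 s, 60 s, ...)."""
    samples = []
    for t in range(tick, duration + 1, tick):
        # The category in force is the most recent event at or before t
        # (assumes at least one event precedes the first tick).
        current = max((e for e in timeline if e[0] <= t), key=lambda e: e[0])
        samples.append((t, current[1]))
    return samples

for t, category in instantaneous_sample(timeline):
    print(f"{t:>3}s: {category}")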

Interval recording

This method charts the chronology of events to some extent and, like instantaneous sampling, requires the data to be entered in the appropriate category at fixed intervals. However, instead of charting what is happening on the instant, it charts what has happened during the preceding interval. So, for example, if recording were to take place every thirty seconds, then the researcher would note down in the appropriate category what had happened during the preceding thirty seconds. Whilst this enables frequencies to be calculated, simple patterns to be observed and an approximate sequence of events to be noted, because it charts what has taken place in the preceding interval of time, some elements of the chronology might be lost. For example, if three events took place in the preceding thirty seconds of the example, then the order of the three events would be lost; we would know simply that they had occurred.
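A companion sketch (again invented, not from the original) illustrates interval recording: everything observed during the preceding interval is noted at the interval’s end, so occurrences survive but the order within each interval does not.

# Invented timeline: (seconds elapsed, observed category).
timeline = [(4, "teacher shouts at child"), (12, "child shouts at teacher"),
            (25, "teacher shouts at child"), (40, "parent shouts at teacher")]

def interval_record(timeline, tick=30, duration=60):
    """Map each interval to the categories observed within it."""
    record = {}
    for end in range(tick, duration + 1, tick):
        # Sorting keeps every occurrence (so frequencies can still be
        # calculated) but destroys the within-interval chronology.
        seen = sorted(cat for t, cat in timeline if end - tick < t <= end)
        record[f"{end - tick}-{end}s"] = seen
    return record

print(interval_record(timeline))
# {'0-30s': ['child shouts at teacher', 'teacher shouts at child',
#            'teacher shouts at child'],
#  '30-60s': ['parent shouts at teacher']}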

Rating scales

In this method the researcher is asked to make some judgement about the events being observed, and to enter responses onto a rating scale. For example, Wragg (1994) suggests that observed teaching behaviour might be entered onto rating scales by placing the observed behaviour onto a continuum:

              1    2    3    4    5
Warm          —    —    —    —    —    Aloof
Stimulating   —    —    —    —    —    Dull
Businesslike  —    —    —    —    —    Slipshod

An observer might wish to enter a rating according to a five-point scale of observed behaviour, for example:

1 = not at all; 2 = very little; 3 = a little; 4 = a lot; 5 = a very great deal

What is required here is for the researcher to move from low inference (simply reporting observations) to a higher degree of inference (making judgements about events observed). This might introduce a degree of unreliability into the observation, for example through: (a) the halo effect; (b) central tendency, wherein observers will avoid extreme categories; (c) recency, where observers are influenced more by recent events than by less recent events. That said, this might be a helpful summary way of gathering observational data.
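A minimal sketch (ours, not from Wragg or the original text) of how such ratings might be captured and validated; the scale names and the entered judgement are invented.

# Wragg-style bipolar scales, each rated from 1 (first pole) to 5 (second pole).
SCALES = {
    "warm-aloof": ("Warm", "Aloof"),
    "stimulating-dull": ("Stimulating", "Dull"),
    "businesslike-slipshod": ("Businesslike", "Slipshod"),
}

def record_rating(scale, value):
    """Validate and store a single 1-5 judgement on a bipolar scale."""
    if scale not in SCALES:
        raise ValueError(f"unknown scale: {scale}")
    if value not in range(1, 6):
        raise ValueError("rating must be between 1 and 5")
    return {"scale": scale, "rating": value, "anchors": SCALES[scale]}

print(record_rating("warm-aloof", 2))
# {'scale': 'warm-aloof', 'rating': 2, 'anchors': ('Warm', 'Aloof')}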

Whilst structured observation can provide useful numerical data (e.g. Bennett et al., 1984; Galton et al., 1980), there are several concerns which must be addressed in this form of observation, for example:

• the method is behaviourist, excluding any mention of the intentions or motivations of the people being observed;
• the individual’s subjectivity is lost to an aggregated score;
• there is an assumption that the observed behaviour provides evidence of underlying feelings, i.e. that concepts or constructs can be crudely measured in observed occurrences.

This latter point is important, for it goes to the very heart of the notion of validity, since it requires researchers to satisfy themselves that it is valid to infer that a particular behaviour indicates a particular state of mind or a particular intention or motivation. The thirst to operationalize concepts and constructs can easily lead researchers to provide simple indicators of complex concepts.

Further, structured observation neglects the significance of contexts—temporal and spatial—thereby overlooking the fact that behaviours may be context-specific. In their concern for the overt and the observable, researchers may overlook unintended outcomes which may have significance; they may be unable to show how significant the behaviours of the participants being observed are in their own terms. If we accept that behaviour is developmental, that interactions evolve over time and, therefore, are by definition fluid, then the three methods of structured observation outlined above (event sampling, instantaneous sampling and interval recording) appear to take a series of ‘freeze-frame’ snapshots of behaviour, thereby violating the principle of fluidity of action. Captured for an instant in time, it is difficult to infer a particular meaning from one or more events (Stubbs and Delamont, 1976), just as it is impossible to say with any certainty what is taking place when we study a single photograph or a set of photographs of a particular event. Put simply, if structured observation is to hold water, then the researcher may need to gather additional data from other sources to inform the interpretation of observational data.

This latter point is a matter not only for structured observation but, equally, for unstructured observation, for what is being suggested here is the notion that triangulation (of methods, of observers, of time and space) can assist the researcher to generate reliable evidence. There is a risk that observations will be selective, and the effects of this can be attenuated by triangulation. One way of gathering more reliable data (for example about a particular student or group of students) is by tracking them through the course of a day or a week, following them from place to place, event to event. It is part of teaching folklore that students will behave very differently for one teacher than for another, and a full picture of students’ behaviour might require the observer to see the students in different contexts.

Critical incidents

There will be times when reliability, in the sense of consistency in observations, is not necessary. For example, a student might demonstrate a particular behaviour only once, but it is so important as not to be ruled out simply because it occurred once. One only has to commit a single murder to be branded a murderer! Sometimes one event can occur which reveals an extremely important insight into a person or situation. Critical incidents (Flanagan, 1949) and critical events (Wragg, 1994) are particular events or occurrences that might typify or illuminate very starkly a particular feature of a teacher’s behaviour or teaching style, for example. Wragg (1994:64) writes that these are events that appear to the observer to have more interest than other ones, and therefore warrant greater detail and recording than other events; they have an important insight to offer. For example, a child might unexpectedly behave very aggressively when asked to work with another child—that might reveal an insight into the child’s social tolerance; a teacher might suddenly overreact when a student produces a substandard piece of work—the straw that breaks the camel’s back—and that might indicate a level of frustration tolerance or intolerance and the effects of that threshold of tolerance being reached. These events are critical in that they may be non-routine but very revealing; they offer the researcher an insight that would not be available by routine observation. They are frequently unusual events.2

Naturalistic observation

There are degrees of participation in observation (LeCompte and Preissle, 1993:93–4). The ‘complete participant’ is a researcher who takes on an insider role in the group being studied, and maybe does not even declare that she is a researcher (echoing the comments above about the ethics of covert research). The ‘participant-as-observer’, as its name suggests, is part of the social life of participants and documents and records what is happening for research purposes. The ‘observer-as-participant’, like the participant-as-observer, is known as a researcher to the group, but maybe has less extensive contact with the group. With the ‘complete observer’ participants do not realize that they are being observed (e.g. using a one-way mirror), hence this is another form of covert research. Hammersley and Atkinson (1983:93–5) suggest that comparative involvement may come in the forms of the complete participant and the participant-as-observer, with a degree of subjectivity and sympathy, whilst comparative detachment may come in the forms of the observer-as-participant and the complete observer, where objectivity and distance are key characteristics. Both complete participation and complete detachment are as limiting as each other. As a complete participant the researcher dare not go outside the confines of the group for fear of revealing her identity (in covert research), and as a complete observer there is no contact with the observed, so inference is dangerous. That said, both complete participation and complete detachment minimize reactivity, though in the former there is the risk of ‘going native’, where the researcher adopts the values, norms and behaviours of the group as her own, i.e. ceases to be a researcher and becomes a member of the group.

In participant observational studies the researcher stays with the participants for a substantial period of time to reduce reactivity effects (the effects of the researcher on the researched), recording what is happening whilst taking a role in that situation. In schools this might mean taking on some particular activities, sharing supervisions, participating in school life, and recording impressions, conversations, observations, comments, behaviour, events and activities, and the views of all participants in a situation. Participant observation is often combined with other forms of data collection that, together, elicit the participants’ definitions of the situation and their organizing constructs in accounting for situations and behaviour. By staying in a situation over a long period the researcher is also able to see how events evolve over time, catching the dynamics of situations, the people, personalities, contexts, resources, roles etc.

Morrison (1993:88) argues that by ‘being immersed in a particular context over time not only will the salient features of the situation emerge and present themselves but a more holistic view will be gathered of the interrelationships of factors’. Such immersion facilitates the generation of ‘thick descriptions’ which lend themselves to accurate explanation and interpretation of events rather than relying on the researcher’s own inferences.

Components of ‘thick descriptions’ involve (Carspecken, 1996:47), for example, recording: speech acts; non-verbal communication; descriptions in low-inference vocabulary; careful and frequent recording of the time and timing of events; the observer’s comments, placed into categories; detailed contextual data.
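As a minimal sketch (ours, not drawn from Carspecken or the original text), one might structure a single thick-description record along these lines; all field names and sample values are invented.

from dataclasses import dataclass, field

@dataclass
class ThickDescriptionRecord:
    timestamp: str                          # careful, frequent timing of events
    speech_acts: list = field(default_factory=list)
    non_verbal: list = field(default_factory=list)
    low_inference_description: str = ""     # what was seen, not what it 'meant'
    observer_comment: str = ""              # kept separate from description
    observer_comment_category: str = ""     # comments are placed into categories
    context: str = ""                       # detailed contextual data

note = ThickDescriptionRecord(
    timestamp="09:42:30",
    speech_acts=["T: 'Open your books at page 12.'"],
    non_verbal=["T points to the board"],
    low_inference_description="Most students open books within ~10 seconds.",
    observer_comment="Transition unusually quick today.",
    observer_comment_category="classroom management",
    context="Year 5 numeracy lesson, 28 students present.")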

Observations are recorded in field notes; these can be written at several levels. At the level of description they might include, for example (Spradley, 1980; Bogdan and Biklen, 1992:120–1; LeCompte and Preissle, 1993:224):

• quick, fragmentary jottings of key words/symbols;
• transcriptions and more detailed observations written out fully;
• descriptions that, when assembled and written out, form a comprehensive and comprehensible account of what has happened;
• pen portraits of participants;
• reconstructions of conversations;
• descriptions of the physical settings of events;
• descriptions of events, behaviour and activities;
• descriptions of the researcher’s activities and behaviour.

Lincoln and Guba (1985:273) suggest a variety of elements or types of observations that include:

• ongoing notes, either verbatim or categorized in situ;
• logs or diaries of field experiences (similar to field notes though usually written after some time has elapsed since the observations were made);
• notes that are made on specific, predetermined themes (e.g. that have arisen from grounded theory);
• ‘chronologs’, where each separate behavioural episode is noted, together with the time at which it occurred, or recording an observation at regular time intervals, e.g. every two or three minutes;
• context maps—maps, sketches, diagrams or some graphic display of the context (usually physical) within which the observation takes place, such graphics enabling movements to be charted;
• entries on predetermined schedules (including rating scales, checklists and structured observation charts), using taxonomic or categoric systems, where the categories derive from previous observational or interview data;
• sociometric diagrams indicating social relationships, e.g. isolates (whom nobody chooses), stars (whom everyone chooses) and dyads (who choose each other) (a short sketch of such an analysis follows this list);
• debriefing questionnaires from respondents that are devised for, and by, the observer only, to be used for reminding the observer of main types of information and events once she or he has left the scene;
• data from debriefing sessions with other researchers, again as an aide-memoire.
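The sociometric categories above lend themselves to simple computation. This is a minimal sketch with invented choice data, not part of the original text; here a ‘star’ is arbitrarily taken as anyone chosen by at least three others.

# Who chooses whom (invented data): each person lists those they choose.
choices = {"Ann": ["Ben", "Cal"], "Ben": ["Ann"],
           "Cal": ["Ann"], "Dee": ["Ann"], "Eli": []}

# Invert the choices to find who is chosen by whom.
chosen_by = {person: set() for person in choices}
for chooser, chosen in choices.items():
    for person in chosen:
        chosen_by[person].add(chooser)

isolates = [p for p, by in chosen_by.items() if not by]        # chosen by nobody
stars = [p for p, by in chosen_by.items() if len(by) >= 3]     # chosen by many
dyads = {tuple(sorted((a, b))) for a in choices for b in choices[a]
         if a in choices.get(b, [])}                           # mutual choices

print("isolates:", isolates)    # isolates: ['Dee', 'Eli']
print("stars:", stars)          # stars: ['Ann']
print("dyads:", sorted(dyads))  # dyads: [('Ann', 'Ben'), ('Ann', 'Cal')]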

LeCompte and Preissle (1993:199–200) provide a useful set of guidelines for directing observations of specific activities, events or scenes, suggesting that they should include answers to the following questions:

• who is in the group/scene/activity—who is taking part?
• how many people are there, and what are their identities and characteristics?
• how do participants come to be members of the group/event/activity?
• what is taking place?
• how routine, regular, patterned, irregular and repetitive are the behaviours observed?
• what resources are being used in the scene?
• how are activities being described, justified, explained, organized, labelled?
• how do different participants behave towards each other?
• what are the statuses and roles of the participants?
• who is making decisions, and for whom?
• what is being said, and by whom?
• what is being discussed frequently/infrequently?
• what appear to be the significant issues that are being discussed?
• what non-verbal communication is taking place?
• who is talking and who is listening?
• where does the event take place?
• when does the event take place?
• how long does the event take?
• how is time used in the event?
• how are the individual elements of the event connected?
• how are change and stability managed?
• what rules govern the social organization of, and behaviour in, the event?
• why is this event occurring, and occurring in the way that it is?
• what meanings are participants attributing to what is happening?
• what are the history, goals and values of the group in question?

That this list is long (and by no means exhaustive) reflects the complexity of even the apparently most mundane activity!

Spradley (1980) suggests a checklist of the content of field notes:

• Space: the physical setting;
• Actors: the people in the situation;
• Activities: the sets of related acts that are taking place;
• Objects: the artifacts and physical things that are there;
• Acts: the specific actions that participants are doing;
• Events: the sets of activities that are taking place;
• Time: the sequence of acts, activities and events;
• Goals: what people are trying to achieve;
• Feelings: what people feel and how they express this.


At the level of reflection, field notes might include (Bogdan and Biklen, 1992:122):

• reflections on the descriptions and analyses that have been done;
• reflections on the methods used in the observations and data collection and analysis;
• ethical issues, tensions, problems and dilemmas;
• the reactions of the observer to what has been observed and recorded—attitude, emotion, analysis etc.;
• points of clarification that have been and/or need to be made;
• possible lines of further inquiry.

Lincoln and Guba (1985:327) indicate three main types of item that might be included in a journal:

1 a daily schedule, including practical matters, e.g. logistics;
2 a personal diary, for reflection, speculation and catharsis;
3 notes on and a log of methodology.

For the level of analysis see the discussion of Stage 9 below.

What is being suggested through these comments is that the data should be comprehensive enough to enable the reader to reproduce the analysis that was performed. They should focus on the observable and make explicit the inferential; the construction of abstractions and generalizations might commence early but should not starve the researcher of novel channels of inquiry (Sacks, 1992).

Observations include both oral and visual data. In addition to the observer writing down details in field notes, audio-visual recording is a powerful recording device (Erickson, 1992:209–10). Comprehensive audio-visual recording can overcome the partialness of the observer’s view of a single event and can overcome the tendency towards recording only the frequently occurring events. Audio-visual data collection has the capacity for completeness of analysis and comprehensiveness of material, reducing both the dependence on prior interpretations by the researcher and the possibility, again, of recording only events which happen frequently. Of course, one has to be cautious here, for installing video cameras might bring the problem of reactivity. If fixed, they might be as selective as participant observers; if movable, they might still be highly selective (Morrison, 1993:91).

The context of observation is important (Silverman, 1993:146). Indeed Spradley (1979) and Kirk and Miller (1986) suggest that observers should keep four sets of observational data:

• notes made in situ;
• expanded notes that are made as soon as possible after the initial observations;
• journal notes to record issues, ideas, difficulties etc. that arise during the fieldwork;
• a developing, tentative running record of ongoing analysis and interpretation.

The intention here is to introduce some systematization into observations in order to increase their reliability. In this respect Silverman (1993) reminds us of the important distinction between etic and emic analysis. Etic analysis uses the conceptual framework of the researcher, whilst emic approaches use the conceptual frameworks of those being researched. Structured observation uses etic approaches, with predefined frameworks that are adhered to unswervingly, whilst emic approaches sit comfortably within qualitative approaches, where the definitions of the situations are captured through the eyes of the observed.

Participant observation studies are not without their critics. The accounts that typically emerge from participant observations echo the criticisms of qualitative data outlined earlier, being described as subjective, biased, impressionistic, idiosyncratic and lacking in the precise quantifiable measures that are the hallmark of survey research and experimentation. Whilst it is probably true that nothing can give better insight into the life of a gang of juvenile delinquents than going to live with them for an extended period of time, critics of participant observation studies will point to the dangers of ‘going native’ as a result of playing a role within such a group. How do we know that observers do not lose their perspective and become blind to the peculiarities that they are supposed to be investigating?

Adler and Adler (1994:380) suggest several stages in an observation. Commencing with the selection of a setting on which to focus, the observer then seeks a means of gaining entry to the situation (for example, taking on a role in it). Having gained entry, the observer can then commence the observation proper, be it structured or unstructured, focused or unfocused. If quantitative observation is being used then data are gathered to be analysed post hoc; if more ethnographic techniques are being used then progressive focusing requires the observer to undertake analysis during the period of observation itself (discussed earlier).

The questions that researchers frequently ask are ‘how much observation should I do?’ and ‘when do I stop observing?’. Of course, there is no hard and fast rule here, though it may be appropriate to stop when ‘theoretical saturation’ has been reached (Adler and Adler, 1994:380), i.e. when the situations that are being observed appear to be repeating data that have already been collected. That said, it may be important to carry on collecting data at this point, to indicate overall frequencies of observed behaviour, enabling the researcher to order behaviours observed over time from the most to the least common. Further, the greater the number of observations, the greater the reliability of the data might be, enabling emergent categories to be verified. What is being addressed here is the reliability of the observations (see the earlier discussion of triangulation).
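One crude way to operationalize ‘theoretical saturation’ is to track how many genuinely new categories each successive observation session contributes, and to stop when recent sessions contribute none. The sketch below is ours and invented for illustration; it is not drawn from Adler and Adler.

# Invented category sets emerging from four successive observation sessions.
sessions = [{"question", "rebuke", "praise"},
            {"question", "praise", "off-task"},
            {"question", "rebuke"},
            {"praise", "off-task"}]

def saturated(sessions, window=2):
    """True if the last `window` sessions added no new categories."""
    seen = set()
    new_counts = []
    for session in sessions:
        new_counts.append(len(session - seen))   # categories not seen before
        seen |= session
    return all(n == 0 for n in new_counts[-window:])

print(saturated(sessions))  # True: sessions 3 and 4 added nothing new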

Ethical considerations

Though observation frequently claims neutrality by being non-interventionist, there are several ethical considerations that surround it. There is a well-documented literature on the dilemma surrounding overt and covert observation. Whereas in overt research the subjects know that they are being observed, in covert research they do not. On the one hand, this latter form of research appears to violate the principle of informed consent, invades the privacy of subjects and private space, treats the participants instrumentally—as research objects—and places the researcher in a position of misrepresenting her/his role (Mitchell, 1993), or rather, of denying it. On the other hand, it is argued (ibid.) that there are some forms of knowledge that are legitimately in the public domain but access to which is only available to the covert researcher (see, for example, the fascinating account of the lookout ‘watch queen’ in the homosexual community (Humphreys, 1975)). Covert research might be necessary to gain access to marginalized and stigmatized groups, or groups who would not willingly accede to the requests of a researcher to become involved in research. This might include groups in sensitive positions, for example drug users and suppliers, HIV sufferers, political activists, child abusers, police informants, and racially motivated attackers. Mitchell makes a powerful case for covert research, arguing that not to undertake covert research is to deny access to powerful groups who operate under the protection of silence, to neglect research on sensitive but important topics, and to reduce research to mealy-mouthed avoidance of difficult but strongly held issues and beliefs, i.e. to capitulate when the going gets rough! In a series of examples from research undertaken covertly, he makes the case that not to have undertaken this kind of research would be to deny the public access to areas of legitimate concern, the agendas of the powerful (who can manipulate silence and denial of access to their advantage), and the public knowledge of poorly understood groups or situations.

That covert research can be threatening is well documented, from Patrick’s (1973) study of a Glasgow gang, where the researcher had to take extreme care not to ‘blow his cover’ when witness to a murder, to Mitchell’s (1993) account of the careful negotiation of role required to undertake covert research into a group of ‘millennialists’—ultra-right-wing armed political groups in America who were bound by codes of secrecy, and to his research on mountaineers, where membership of the group involved initiation into the rigours and pains of mountaineering (the researcher had to become a fully fledged mountaineer himself to gain acceptance by the group).

The ethical dilemmas are numerous, charting the tension between invasion and protection of privacy and the public’s legitimate ‘right to know’, between informed consent and its violation in the interests of a wider public, and between observation as a superficial, perhaps titillating, spectator sport and as important social research. At issue is the dilemma that arises between protecting the individual and protecting the wider public, posing the question ‘whose beneficence?’ Whom does the research serve, and whom does it protect? Is the greater good the protection and interests of the individual or the protection and interests of the wider public? Will the research harm already damaged or vulnerable people, or will it improve their lot? Will the research have to treat the researched instrumentally in the interests of gathering otherwise unobtainable yet valuable research data? The researcher has inescapable moral obligations to consider and, whilst codes of ethical conduct abound, each case might have to be judged on its own merits.

Further, the issue of non-intervention is itself problematical. Whilst the claim for observation as being non-interventionist was made at the start of this chapter, the issue is not as clean as this, for researchers inhabit the world that they are researching, and their influence may not be neutral (the Hawthorne and halo effects discussed in Chapter 5). This is clearly an issue in, for example, school inspections, where the presence of an inspector in the classroom exerts a powerful influence on what takes place; it is disingenuous to pretend otherwise. Observer effects can be considerable.

Moreover, the non-interventionist observer has to consider her/his position very closely. In the example of Patrick’s witnessing of a murder above, should the researcher have ‘blown his cover’ and reported the murder? What if not acting on the witnessed murder might have yielded access to further sensitive data? Should a researcher investigating drug or child abuse report the first incident, or ‘hang back’ in order to gain access to further, more sensitive data? Should a witness to abuse simply report it or take action about it? If I see an incident of racial abuse, or bullying, do I maintain my non-interventionist position? Is the observer merely a journalist, providing data for others to judge? When does non-intervention become morally reprehensible? These are issues for which one cannot turn to codes of conduct for a clear adjudication.

Conclusion

Observation methods are powerful tools for gaining insight into situations.3 As with other data collection techniques, they are beset by issues of validity and reliability. Even low-inference observation, perhaps the safest form of observation, is itself highly selective, just as perception is selective. Higher forms of inference, whilst moving towards establishing causality, rely on greater levels of interpretation by the observer, wherein the observer makes judgements about intentionality and motivation. In this respect it has been suggested that additional methods of gathering data might be employed, to provide corroboration and triangulation; in short, to ensure that reliable inferences are derived from reliable data.

This chapter has outlined several different types of observation and the premises that underlie them, the selection of the method to be used depending on ‘fitness for purpose’. Overriding the issue of which specific method of observation to use, this chapter has suggested that observation places the observer in the moral domain, and that it is inadequate simply to describe observation as a non-intrusive, non-interventionist technique and thereby to abrogate responsibility for the participants involved. Like other forms of data collection in the human sciences, observation is not a morally neutral enterprise. Observers, like other researchers, have obligations to participants as well as to the research community.

18 Tests

Tests and testing have a long and venerable history. Since the spelling test of Rice (1897), the fatigue test of Ebbinghaus (1897) and the intelligence scale of Binet (1905), the growth of tests has proceeded at an extraordinary pace in terms of volume, variety, scope and sophistication. The field of testing is extensive, so extensive in fact that the comments that follow must needs be of an introductory nature, and the reader seeking a deeper understanding will have to refer to specialist texts and sources on the subject. Limitations of space permit no more than a brief outline of a small number of key issues to do with tests and testing. Readers wishing to undertake studies in greater depth will need to pursue their interests elsewhere.

In tests, researchers have at their disposal a powerful method of data collection, an impressive array of tests for gathering data of a numerical rather than verbal kind. In considering testing for gathering research data, several issues need to be borne in mind:

• Are we dealing with parametric or non-parametric tests?
• Are they tests of achievement or tests of potential, i.e. aptitude tests?
• Are they norm-referenced or criterion-referenced?
• Are they available commercially for researchers to use or will researchers have to develop home-produced tests?
• Do the test scores derive from a pretest and post-test in the experimental method?
• Are they group or individual tests?

Let us unpack some of these issues.

Parametric and non-parametric tests

Parametric tests are designed to represent the wide population—e.g. of a country or age group. They make assumptions about the wider population and the characteristics of that wider population, i.e. the parameters of abilities are known. They assume (Morrison, 1993):

• that there is a normal curve of distribution of scores in the population (the bell-shaped symmetry of the Gaussian curve of distribution seen, for example, in standardized scores of IQ or the measurement of people’s height or the distribution of achievement on reading tests in the population as a whole);
• that there are continuous and equal intervals between the test scores (so that, for example, a score of 80 per cent could be said to be double that of 40 per cent; this differs from the ordinal scaling of rating scales discussed earlier in connection with questionnaire design, where equal intervals between each score could not be assumed).

Parametric tests will usually be published as standardized tests which are commercially available and which have been piloted on a large and representative sample of the whole population. They usually arrive complete with back-up data on sampling, reliability and validity statistics which have been computed in the devising of the tests. Working with these tests enables the researcher to use statistics applicable to interval and ratio levels of data.

On the other hand, non-parametric tests make few or no assumptions about the distribution of the population (the parameters of the scores) or the characteristics of that population. The tests do not assume a regular bell-shaped curve of distribution in the wider population; indeed the wider population is perhaps irrelevant as these tests are designed for a given specific population—a class in school, a chemistry group, a primary school year group. Because they make no assumptions about the wider population, the researcher is confined to working with non-parametric statistics appropriate to nominal and ordinal levels of data.

The attraction of non-parametric statistics is their utility for small samples, because they do not make any assumptions about how normal, even and regular the distributions of scores will be. Furthermore, computation of statistics for non-parametric tests is less complicated than that for parametric tests. It is perhaps safe to assume that a home-devised test (like a home-devised questionnaire) will probably be non-parametric unless it deliberately contains interval and ratio data. Non-parametric tests are the stock-in-trade of classroom teachers—the spelling test, the mathematics test, the end-of-year examination, the mock examination. They have the advantage of being tailored to particular institutional, departmental and individual circumstances. They offer teachers a valuable opportunity for quick, relevant and focused feedback on student performance.

Parametric tests are more powerful than non-parametric tests because they not only derive from standardized scores but also enable the researcher to compare sub-populations with a whole population (e.g. to compare the results of one school or local education authority with the whole country, for instance in comparing students’ performance in norm-referenced or criterion-referenced tests against a national average score in that same test). They enable the researcher to use powerful statistics in data processing (e.g. means, standard deviations, t-tests, Pearson product-moment correlations, factor analysis, analysis of variance), and to make inferences about the results. Because non-parametric tests make no assumptions about the wider population, a different set of statistics is available to the researcher (e.g. modal scores, rankings, the chi-square statistic, a Spearman correlation). These can be used in very specific situations—one class of students, one year group, one style of teaching, one curriculum area—and hence are valuable to teachers.
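A minimal sketch (ours, not from the original text) of this division of labour, using Python’s scipy library; the scores are invented. Pearson’s correlation treats the scores as interval data from roughly normal populations, whereas Spearman’s correlation uses only their rank order.

from scipy import stats

# Invented scores for the same seven students on two classroom tests.
maths = [54, 61, 58, 72, 66, 59, 63]
reading = [49, 57, 52, 68, 60, 58, 55]

# Parametric: Pearson's r assumes interval data and normality.
r, p_r = stats.pearsonr(maths, reading)

# Non-parametric: Spearman's rho uses only the rank order of the scores.
rho, p_rho = stats.spearmanr(maths, reading)

print(f"Pearson r = {r:.2f} (p = {p_r:.3f}); "
      f"Spearman rho = {rho:.2f} (p = {p_rho:.3f})")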

Norm-referenced, criterion-referenced and domain-referenced tests

A norm-referenced test compares students’ achievements relative to other students’ achievements (e.g. a national test of mathematical performance, or a test of intelligence which has been standardized on a large and representative sample of students between the ages of six and sixteen). A criterion-referenced test does not compare student with student but, rather, requires the student to fulfil a given set of criteria, a predefined and absolute standard or outcome (Cunningham, 1998). For example, a driving test is usually criterion-referenced, since passing it requires the ability to meet certain test items (reversing round a corner, undertaking an emergency stop, avoiding a crash, etc.), regardless of how many others have or have not passed the driving test. Similarly, many tests of playing a musical instrument require specified performances, e.g. the ability to play a particular scale or arpeggio, or the ability to play a Bach fugue without hesitation or technical error. If the student meets the criteria, then he or she passes the examination.

The link between criterion-referenced tests and mastery learning is strong, for both emphasize the achievement of objectives per se rather than achievement in comparison to other students. Both place an emphasis on learning outcomes. Further, Cunningham (1998) has indicated the link between criterion-referencing, minimum competency testing and measurement-driven instruction in the US; all of them share the concern for measuring predetermined and specific outcomes and objectives. Though this use of criterion-referencing declined in the closing decade of the twentieth century, the use of criterion-referencing to set standards burgeoned in the same period. What we have, then, is a move away from criterion-referencing as measurement of the achievement of detailed and specific behavioural objectives and towards a testing of what a student has achieved that is not so specifically framed.

A criterion-referenced test provides the researcher with information about exactly what a student has learned and what she can do, whereas a norm-referenced test can only provide the researcher with information on how well one student has achieved in comparison with another, enabling rank orderings of performance and achievement to be constructed. Hence a major feature of the norm-referenced test is its ability to discriminate between students and their achievements—a well-constructed norm-referenced test enables differences in achievement to be measured acutely, i.e. to provide variability or a great range of scores. For a criterion-referenced test this is less of a problem; the intention here is to indicate whether students have achieved a set of given criteria, regardless of how many others might or might not have achieved them, hence variability or range is less important.

The question of the politics in the use of data from criterion-referenced examination results arises when such data are used in a norm-referenced way to compare student with student, school with school, local authority with local authority, region with region (as has been done in the United Kingdom with the publication of ‘league tables’ of local authorities’ successes in the achievement of their students when tested at the age of seven—a process which is envisaged to develop into the publication of achievements at several ages and school by school).

More recently, an outgrowth of criterion-referenced testing has been the rise of domain-referenced tests (Gipps, 1994:81). Here considerable significance is accorded to the careful and detailed specification of the content, or the domain, which will be assessed. The domain is the particular field or area of the subject that is being tested, for example: light in science, two-part counterpoint in music, parts of speech in English language. The domain is set out very clearly and very fully, such that the full depth and breadth of the content are established. Test items are then selected from this very full field, with careful attention to sampling procedures so that representativeness of the wider field is ensured in the test items. The student’s achievements on that test are computed to yield a proportion of the maximum score possible, and this, in turn, is used as an index of the proportion of the overall domain that she has grasped. So, for example, if a domain has 1,000 items and the test has 50 items, and the student scores 30 marks out of the possible 50, then it is inferred that she has grasped 60 per cent ((30 ÷ 50) × 100) of the domain of 1,000 items. Here inferences are being made from a limited number of items to the student’s achievements in the whole domain; this requires careful and representative sampling procedures for test items.
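The inference amounts to a two-line calculation. The sketch below (ours) simply restates the worked example; the function name is invented.

def estimated_domain_mastery(score, test_items, domain_items):
    """Infer domain coverage from performance on a sampled test."""
    proportion = score / test_items
    return 100 * proportion, proportion * domain_items

pct, items = estimated_domain_mastery(30, 50, 1000)
print(f"{pct:.0f}% of the domain, i.e. roughly {items:.0f} of 1,000 items grasped")
# 60% of the domain, i.e. roughly 600 of 1,000 items grasped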

Commercially produced tests and researcher-produced tests

There is a battery of tests in the public domain which cover a vast range of topics and which can be used for evaluative purposes. Most schools will have used published tests at one time or another: diagnostic tests, aptitude tests, achievement tests, norm-referenced tests, readiness tests, subject-specific tests, skills tests, criterion-referenced tests, reading tests, verbal reasoning tests, non-verbal reasoning tests, tests of social adjustment, tests of intelligence, tests of critical thinking; the list is colossal.

There are several attractions to using published tests:

• They are objective;
• They have been piloted and refined;
• They have been standardized across a named population (e.g. a region of the country, the whole country, a particular age group or various age groups) so that they represent a wide population;
• They declare how reliable and valid they are (mentioned in the statistical details which are usually contained in the manual of instructions for administering the test);
• They tend to be parametric tests, hence enabling sophisticated statistics to be calculated;
• They come complete with instructions for administration;
• They are often straightforward and quick to administer and to mark;
• Guides to the interpretation of the data are usually included in the manual;
• Researchers are spared the task of having to devise, pilot and refine their own test.

Several commercially produced tests have restricted release or availability, hence the researcher might have to register with a particular association before being given clearance to use the test or before being given copies of it. For example, the Psychological Corporation Ltd and McGraw-Hill publishers not only hold the rights to a world-wide battery of tests of all kinds but require registration before releasing tests. In this example the Psychological Corporation also has different levels of clearance, so that certain parties or researchers may not be eligible to have a test released to them because they do not fulfil particular criteria for eligibility.

Published tests by definition are not tailored to institutional or local contexts or needs; indeed their claim to objectivity is made on the grounds that they are deliberately supra-institutional. The researcher wishing to use published tests must be certain that the purposes, objectives and content of the published tests match the purposes, objectives and content of the evaluation. For example, a published diagnostic test might not fit the needs of an evaluation that requires an achievement test; a test of achievement might not have the predictive quality which the researcher seeks in an aptitude test; a published reading test might not address the areas of reading that the researcher wishes to cover; a verbal reading test written in English might contain language which is difficult for a student whose first language is not English. These are important considerations. A much-cited text on evaluating the utility for researchers of commercially available tests is produced by the American Psychological Association (1974) in the Standards for Educational and Psychological Testing.

The golden rule for deciding to use a published test is that it must demonstrate fitness for purpose. If it fails to demonstrate this, then tests will have to be devised by the researcher. The attraction of the latter is that such a 'home-grown' test will be tailored to the local and institutional context very tightly, i.e. the purposes, objectives and content of the test will be deliberately fitted to the specific needs of the researcher in a specific, given context. In discussing 'fitness for purpose', Cronbach (1949) and Gronlund and Linn (1990) set out a range of criteria against which a commercially produced test can be evaluated for its suitability for specific research purposes.

Against these advantages there are, of course, several important considerations in devising a 'home-grown' test. Not only might it be time-consuming to devise, pilot, refine and then administer the test but, because much of it will probably be non-parametric, there will be a more limited range of statistics which may be applied to the data than in the case of parametric tests.

The scope of tests and testing is far-reaching; it is as if no area of educational activity is untouched by them. Achievement tests, largely summative in nature, measure achieved performance in a given content area. Aptitude tests are intended to predict capability, achievement potential, learning potential and future achievements. However, the assumption that these two constructs—achievement and aptitude—are separate has to be questioned (Cunningham, 1998); indeed it is often the case that aptitude for, say, geography at a particular age or stage will be measured by using an achievement test at that age or stage. Cunningham (1998) has suggested that an achievement test might include more straightforward measures of basic skills whereas aptitude tests might put these in combination, e.g. combining reasoning (often abstract) and particular knowledge, i.e. that achievement and aptitude tests differ according to what they are testing. Not only do the tests differ according to what they measure but, since both can be used predictively, they differ according to what they might be able to predict. For example, because an achievement test is more specific and often tied to a specific content area, it will be useful as a predictor of future performance in that content area but will be largely unable to predict future performance outside that content area. An aptitude test tends to test more generalized abilities (e.g. aspects of 'intelligence', skills and abilities that are common to several areas of knowledge or curricula), hence it can be used as a more generalized predictor of achievement. Achievement tests, Gronlund (1985) suggests, are more linked to school experiences whereas aptitude tests encompass out-of-school learning and wider experiences and abilities. However Cunningham (1998), in arguing that there is a considerable overlap between the two types, suggests that the difference is largely cosmetic. An achievement test tends to be much more specific and linked to instructional programmes and cognate areas than an aptitude test, which looks for more general aptitudes (Hanna, 1993), e.g. intelligence or intelligences (Gardner, 1993).1

Constructing a test

The opportunity to devise a test is exciting and challenging, and in doing so the researcher will have to consider:

• the purposes of the test (for answering evaluation questions and ensuring that it tests what it is supposed to be testing, e.g. the achievement of the objectives of a piece of the curriculum);
• the type of test (e.g. diagnostic, achievement, aptitude, criterion-referenced, norm-referenced);
• the objectives of the test (cast in very specific terms so that the content of the test items can be seen to relate to specific objectives of a programme or curriculum);
• the content of the test;
• the construction of the test, involving item analysis in order to clarify the item discriminability and item difficulty of the test (see below);
• the format of the test—its layout, instructions, method of working and of completion (e.g. oral instructions to clarify what students will need to write, or a written set of instructions to introduce a practical piece of work);
• the nature of the piloting of the test;
• the validity and reliability of the test;
• the provision of a manual of instructions for the administration, marking and data treatment of the test (this is particularly important if the test is not to be administered by the researcher or if the test is to be administered by several different people, so that reliability is ensured by having a standard procedure).

In planning a test the researcher can proceed thus:

1 Identify the purposes of the test

The purposes of a test are several, for example: to diagnose a student's strengths, weaknesses and difficulties; to measure achievement; to measure aptitude and potential; to identify readiness for a programme. Gronlund and Linn (1990) term this last 'placement testing'; it is usually a form of pretest, normally designed to discover whether students have the essential prerequisites to begin a programme (e.g. in terms of knowledge, skills, understandings). These types of tests occur at different stages. For example, the placement test is conducted prior to the commencement of a programme, and will identify starting abilities and achievements—the initial or 'entry' abilities in a student. If the placement test is designed to assign students to tracks, sets or teaching groups (i.e. to place them into administrative or teaching groupings), then the entry test might be criterion-referenced or norm-referenced; if it is designed to measure detailed starting points, knowledge, abilities and skills then the test might be more criterion-referenced, as this requires a high level of detail. The placement test has its equivalent in 'baseline assessment' and is an important feature if one is to measure the 'value-added' component of teaching and learning: one can only assess how much a set of educational experiences has added value to the student if one knows that student's starting point and starting abilities and achievements.

• Formative testing is undertaken during a programme, and is designed to monitor students' progress during that programme, to measure achievement of sections of the programme, and to diagnose strengths and weaknesses. It is typically criterion-referenced.
• Diagnostic testing is an in-depth test to discover particular strengths, weaknesses and difficulties that a student is experiencing, and is designed to expose causes and specific areas of weakness or strength. This often requires the test to include several items about the same feature, so that, for example, several types of difficulty in a student's understanding will be exposed; the diagnostic test will need to construct test items that will focus on each of a range of very specific difficulties that students might be experiencing, in order to identify the exact problems that they are having from a range of possible problems. Clearly this type of test is criterion-referenced.
• Summative testing is the test given at the end of the programme, and is designed to measure achievement, outcomes, or 'mastery'. This might be criterion-referenced or norm-referenced, depending to some extent on the use to which the results will be put (e.g. to award certificates or grades, to identify achievement of specific objectives).

2 Identify the test specifications

The test specifications include:

• which programme objectives and student learning outcomes will be addressed;
• which content areas will be addressed;
• the relative weightings, balance and coverage of items;
• the total number of items in the test;
• the number of questions required to address a particular element of a programme or learning outcomes;
• the exact items in the test.

To ensure validity in a test it is essential to ensure that the objectives of the test are fairly addressed in the test items. Objectives, it is argued (Mager, 1962; Wiles and Bondi, 1984), should: (a) be specific and be expressed with an appropriate degree of precision; (b) represent intended learning outcomes; (c) identify the actual and observable behaviour which will demonstrate achievement; (d) include an active verb; (e) be unitary (focusing on one item per objective).

One way of ensuring that the objectives are fairly addressed in the test items is through a matrix frame that indicates the coverage of content areas, the coverage of objectives of the programme, and the relative weighting of the items on the test. Such a matrix is set out in Box 18.1, taking an example from a secondary school history syllabus.

Box 18.1 indicates the main areas of the programme to be covered in the test (content areas); then it indicates which objectives or detailed content areas will be covered (1a–3c)—these numbers refer to the identified specifications in the syllabus; then it indicates the marks/percentages to be awarded for each area. This indicates several points:

• the least emphasis is given to the build-up to and end of the war (10 marks each in the 'total' column);
• the greatest emphasis is given to the invasion of France (35 marks in the 'total' column);
• there is fairly even coverage of the objectives specified (the figures in the 'total' row only vary from 9–13);
• greatest coverage is given to objectives 2a and 3a, and least coverage is given to objective 1c;
• some content areas are not covered in the test items (the blanks in the matrix).


Hence we have here a test scheme that indicates relative weightings, coverage of objectives and content, and the relation between these two latter elements. Gronlund and Linn (1990) suggest that relative weightings should be addressed by firstly assigning percentages at the foot of each column, then by assigning percentages at the end of each row, and then completing each cell of the matrix within these specifications. This ensures that appropriate sampling and coverage of the items are achieved. The example of the matrix refers to specific objectives as column headings; of course these could be replaced by factual knowledge, conceptual knowledge and principles, and skills for each of the column headings. Alternatively they could be replaced with specific aspects of an activity, for example (Cohen, Manion and Morrison, 1996:416): designing a crane, making the crane, testing the crane, evaluating the results, improving the design. Indeed these latter could become content (row) headings, as shown in Box 18.2. Here one can see that practical skills will carry fewer marks than recording skills (the column totals), and that making and evaluating carry equal marks (the row totals).

This exercise also enables some indication to be gained of the number of items to be included in the test; for instance, in the example of the history test above the matrix is 5×9=45 possible items, and in the 'crane' activity below the matrix is 5×4=20 possible items. Of course, there could be considerable variation in this; for example, more test items could be inserted if it were deemed desirable to test one cell of the matrix with more than one item (possibly for cross-checking), or indeed there could be fewer items if it were possible to have a single test item that serves more than one cell of the matrix. The difficulty in matrix construction is that it can easily become a runaway activity, generating very many test items and, hence, leading to an unworkably long test—typically, the greater the degree of specificity required, the greater the number of test items there will be.

Box 18.1
A matrix of test items

Content areas: aspects of          Objectives/areas of programme content
the 1939–45 war                  1a   1b   1c   2a   2b   2c   3a   3b   3c   Total
The build-up to the 1939–45 war   1    2    –    2    1    1    1    1    1     10
The invasion of Poland            2    1    1    3    2    2    3    3    3     20
The invasion of France            3    4    5    4    4    3    4    4    4     35
The allied invasion               3    2    3    3    4    3    3    2    2     25
The end of the conflict           2    1    –    1    1    1    2    2    –     10
Total                            11   10    9   13   12   10   13   12   10    100

(– indicates a cell not covered by any test item.)

Box 18.2
Compiling elements of test items

Content area             Identifying key concepts   Practical   Evaluative   Recording   Total
                         and principles             skills      skills       results
Designing a crane                 2                     1            1            3          7
Making the crane                  2                     5            2            3         12
Testing the crane                 3                     3            1            4         11
Evaluating the results            3                     –            5            4         12
Improving the design              2                     2            3            1          8
Total                            12                    11           12           15         50


One skill in test construction is to be able to have a single test item that provides valid and reliable data for more than a single factor.

Having undertaken the test specifications, the researcher should have achieved clarity on: (a) the exact test items that test certain aspects of achievement of objectives, programmes, contents etc.; (b) the coverage and balance of coverage of the test items; and (c) the relative weightings of the test items.
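To make the weighting procedure above concrete, here is a minimal Python sketch using the marginal totals of Box 18.1; the per-cell figures it produces are only a proportional starting allocation, to be adjusted by judgement, and the function name is ours:

    # A minimal sketch: assign row (content) and column (objective-group)
    # weightings first, then apportion each cell proportionally, as
    # Gronlund and Linn suggest. Marginals follow Box 18.1.
    row_weights = {
        "Build-up to the war": 10, "Invasion of Poland": 20,
        "Invasion of France": 35, "Allied invasion": 25,
        "End of the conflict": 10,
    }
    col_weights = {"objectives 1a-1c": 30, "objectives 2a-2c": 35,
                   "objectives 3a-3c": 35}

    def starting_allocation(rows, cols, total=100):
        # Each cell receives its share of the total in proportion to
        # its row and column weightings (rounded to whole marks).
        return {(r, c): round(total * (rw / 100) * (cw / 100))
                for r, rw in rows.items() for c, cw in cols.items()}

    for cell, marks in starting_allocation(row_weights, col_weights).items():
        print(cell, marks)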

3 Select the contents of the test

Here the test is subject to item analysis. Gronlund and Linn (1990) suggest that an item analysis will need to consider:

• the suitability of the format of each item for the (learning) objective (appropriateness);
• the ability of each item to enable students to demonstrate their performance of the (learning) objective (relevance);
• the clarity of the task for each item;
• the straightforwardness of the task;
• the unambiguity of the outcome of each item, and agreement on what that outcome should be;
• the cultural fairness of each item;
• the independence of each item (i.e. where the influence of other items of the test is minimal and where successful completion of one item is not dependent on successful completion of another);
• the adequacy of coverage of each (learning) objective by the items of the test.

In moving to test construction the researcher will need to consider how each element to be tested will be operationalized: (a) what indicators and kinds of evidence of achievement of the objective will be required; (b) what indicators of high, moderate and low achievement there will be; (c) what the students will be doing when they are working on each element of the test; (d) what the outcome of the test will be (e.g. a written response, a tick in a box of multiple choice items, an essay, a diagram, a computation). Indeed the Task Group on Assessment and Testing in the UK (1988) took from the work of the UK's Assessment of Performance Unit the suggestion that attention will have to be given to the presentation, operation and response modes of a test: (a) how the task will be introduced (e.g. oral, written, pictorial, computer, practical demonstration); (b) what the students will be doing when they are working on the test (e.g. mental computation, practical work, oral work, written); and (c) what the outcome will be—how they will show achievement and present the outcomes (e.g. choosing one item from a multiple choice question, writing a short response, open-ended writing, oral, practical outcome, computer output). Operationalizing a test from objectives can proceed by stages:

• identify the objectives/outcomes/elements to be covered;
• break down the objectives/outcomes/elements into constituent components or elements;
• select the components that will feature in the test, such that, if possible, they will represent the larger field (i.e. domain referencing, if required);
• recast the components in terms of specific, practical, observable behaviours, activities and practices that fairly represent and cover that component;
• specify the kinds of data required to provide information on the achievement of the criteria;
• specify the success criteria (performance indicators) in practical terms, working out marks and grades to be awarded and how weightings will be addressed;
• write each item of the test;
• conduct a pilot to refine the language/readability and presentation of the items, to gauge item discriminability, item difficulty and distractors (discussed below), and to address validity and reliability.

Item analysis, Gronlund and Linn aver (p. 255), is designed to ensure that: (a) the items function as they are intended, for example, that criterion-referenced items fairly cover the fields and criteria and that norm-referenced items demonstrate item discriminability (discussed below); (b) the level of difficulty of the items is appropriate (see below: item difficulty); (c) the test is reliable (free of distractors—unnecessary information and irrelevant cues; see below: distractors) (see Millmann and Greene, 1993). An item analysis will consider the accuracy levels available in the answer, the item difficulty, the importance of the knowledge or skill being tested, the match of the item to the programme, and the number of items to be included.

The basis of item analysis can be seen in item response theory (see Hambleton, 1993). Item response theory (IRT) is based on the principle that it is possible to measure single, specific latent traits, abilities and attributes that, themselves, are not observable, i.e. to determine observable quantities of unobservable quantities. The model assumes a relationship between a person's possession or level of a particular attribute, trait or ability and his/her response to a test item. IRT is also based on the view that it is possible:

• to identify objective levels of difficulty of an item, e.g. the Rasch model (Wainer and Mislevy, 1990); a minimal sketch of the Rasch model follows this list;
• to devise items that will be able to discriminate effectively between individuals;
• to describe an item independently of any particular sample of people who might be responding to it, i.e. it is not group dependent (i.e. the item difficulty and item discriminability are independent of the sample);
• to describe a testee's proficiency in terms of his or her achievement on an item of a known difficulty level;
• to describe a person independently of any sample of items that has been administered to that person (i.e. a testee's ability does not depend on the particular sample of test items);
• to specify and predict the properties of a test before it has been administered;
• for traits to be unidimensional (single traits are specifiable, e.g. verbal ability, mathematical proficiency) and to account for test outcomes and performance;
• for a set of items to measure a common trait or ability;
• for a testee's response to any one test item not to affect his or her response to another test item;
• that the probability of the correct response to an item does not depend on the number of testees who might be at the same level of ability;
• that it is possible to identify objective levels of difficulty of an item;
• that a statistic can be calculated that indicates the precision of the measured ability for each testee, and that this statistic depends on the ability of the testee and the number and properties of the test items.
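As an illustration of the first of these points, a minimal Python sketch of the one-parameter (Rasch) model is given below; the ability and difficulty values are invented for illustration, and this is a sketch of the general model rather than of any published implementation:

    import math

    def rasch_probability(theta, b):
        """One-parameter (Rasch) IRT model: the probability of a correct
        response depends only on the gap between the testee's ability
        (theta) and the item's difficulty (b), both on a logit scale."""
        return 1.0 / (1.0 + math.exp(-(theta - b)))

    # Illustrative values: an average-ability testee (theta = 0.0)
    # meeting an easy, a matched and a hard item.
    for difficulty in (-1.0, 0.0, 1.0):
        p = rasch_probability(0.0, difficulty)
        print(f"item difficulty {difficulty:+.1f}: P(correct) = {p:.2f}")
    # -> 0.73, 0.50, 0.27: harder items are less likely to be answered correctly.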

In constructing a test the researcher will need to undertake an item analysis to clarify the item discriminability and item difficulty of each item of the test. Item discriminability refers to the potential of the item in question to be answered correctly by those students who have a lot of the particular quality that the item is designed to measure, and to be answered incorrectly by those students who have less of that particular quality. In other words, how effective is the test item in showing up differences between a group of students? Does the item enable us to discriminate between students' abilities in a given field? An item with high discriminability will enable the researcher to see a potentially wide variety of scores on that item; an item with low discriminability will show scores on that item poorly differentiated. Clearly a high measure of discriminability is desirable.

Suppose the researcher wishes to construct a test of mathematics for eventual use with thirty students in a particular school (or with class A in a particular school). The researcher devises a test and pilots it in a different school or class B respectively, administering the test to thirty students of the same age (i.e. she matches the sample of the pilot school or class to the sample in the school or class which eventually will be used). The scores of the thirty pilot children are then split into three groups of ten students each (high, medium and low scores). It would be reasonable to assume that there will be more correct answers to a particular item amongst the high scorers than amongst the low scorers. One can also examine the effectiveness of distractors, on the premise that an effective distractor should attract more students from a low scoring group than from a high scoring group. Consider the following example, where low and high scoring groups are identified and, for each item/option (A, B and C), the number of students in each group selecting it is computed:

Example               A     B     C
Top 10 students      10     0     2
Bottom 10 students    8     0    10

In example A, the item discriminates positively in that it attracts more correct responses (10) from the top ten students than from the bottom ten (8), and hence is a poor distractor; here, also, the discriminability index is 0.20 (see the formula below), hence it is a poor discriminator as well as a poor distractor. Example B is an ineffective distractor because nobody from either group selected it. Example C is an effective distractor because it attracts far more students from the bottom ten (10) than from the top ten (2). However, in this case any ambiguities must be ruled out before the discriminating power can be improved.

Distractors are the stuff of multiple choice items, where incorrect alternatives are offered and students have to select the correct alternative(s). Here a simple frequency count of the number of times a particular alternative is selected will provide information on the effectiveness of the distractor: if it is selected many times then it is working effectively; if it is seldom or never selected then it is not working effectively and it should be replaced.

If we wished to calculate the item discriminability of an item of a test, we could use the following formula:

index of discriminability = (A − B) ÷ ½N

where

A = the number of correct scores from the high scoring group;
B = the number of correct scores from the low scoring group;
N = the total number of students in the two groups.

Suppose all ten students from the high scoring group answered the item correctly and two students from the low scoring group answered the item correctly. The formula would work out thus:

(10 − 2) ÷ (½ × 20) = 8 ÷ 10 = 0.80

The maximum index of discriminability is 1.00. Any item whose index of discriminability is less than 0.67, i.e. which is too undiscriminating, should be reviewed, firstly to find out whether this is due to ambiguity or possible clues in the wording. If this is not the case, then whether the researcher uses an item with an index lower than 0.67 is a matter of judgement. It would appear, then, that the item in the example (at 0.80) would be appropriate to use in a test. For a further discussion of item discriminability see Linn (1993).
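A minimal Python sketch of this calculation (the function and variable names are ours):

    def discriminability_index(high_correct, low_correct, n_total):
        """Index of discriminability = (A - B) / (N/2), where A and B are
        the numbers of correct answers in the high and low scoring groups
        and N is the total number of students in the two groups."""
        return (high_correct - low_correct) / (n_total / 2)

    # The worked example from the text: 10 of the high group and 2 of
    # the low group answer correctly, 20 students in all.
    print(discriminability_index(10, 2, 20))   # -> 0.8, above the 0.67 threshold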

If we wished to calculate the item difficulty of an item of a test, we could use the following formula:

index of difficulty = (A ÷ N) × 100

where

A = the number of students who answered the item correctly;
N = the total number of students who attempted the item.

Hence if 12 students out of a class of 20 answered the item correctly, then the formula would work out thus:

(12 ÷ 20) × 100 = 60 per cent

The maximum index of difficulty is 100 per cent. Items falling below 33 per cent and above 67 per cent are likely to be too difficult and too easy


respectively. It would appear, then, that this item (at 60 per cent) would be appropriate to use in a test. Here, again, whether the researcher uses an item with an index of difficulty below or above the cut-off points is a matter of judgement. In a norm-referenced test the item difficulty should be around 50 per cent (Frisbie, 1981). For further discussion of item difficulty see Linn (1993) and Hanna (1993).
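The same worked example in a minimal Python sketch (again, names are ours):

    def difficulty_index(n_correct, n_attempted):
        """Index of difficulty = (A / N) x 100: the percentage of testees
        answering the item correctly (so low values mean hard items)."""
        return 100 * n_correct / n_attempted

    # The worked example from the text: 12 correct answers in a class of 20.
    index = difficulty_index(12, 20)    # -> 60.0
    acceptable = 33 <= index <= 67      # outside this band: too hard or too easy
    print(index, acceptable)            # -> 60.0 True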

Given that the researcher can only know the degree of item discriminability and difficulty once the test has been undertaken, there is an unavoidable need to pilot home-grown tests. Items with limited discriminability and limited difficulty must be weeded out and replaced; those items with the greatest discriminability and the most appropriate degrees of difficulty can be retained. This can only be undertaken once data from a pilot have been analysed.

Item discriminability and item difficulty take on differential significance in norm-referenced and criterion-referenced tests. In a norm-referenced test we wish to compare students with each other, hence item discriminability is very important. In a criterion-referenced test, on the other hand, it is not important per se to be able to compare or discriminate between students' performance. For example, it may be the case that we wish to discover whether a group of students has learnt a particular body of knowledge—that is the objective—rather than, say, finding out how many have learned it better than others. Hence it may be that a criterion-referenced test has very low discriminability, if all the students achieve very well or achieve very poorly, but the discriminability is less important than the fact that the students have or have not learnt the material. A norm-referenced test would regard such a poorly discriminating item as unsuitable for inclusion, whereas a criterion-referenced test would regard such an item as providing useful information (on success or failure).

With regard to item difficulty, in a criterion-referenced test the level of difficulty is that which is appropriate to the task or objective. Hence if an objective is easily achieved then the test item should be easily achieved; if the objective is difficult then the test item should be correspondingly difficult. This means that, unlike a norm-referenced test, where an item might be reworked in order to increase its discriminability index, this is less of an issue in criterion-referencing. Of course, this is not to deny the value of undertaking an item difficulty analysis; rather, it is to question the centrality of such a concern. Gronlund and Linn (1990:265) suggest that where instruction has been effective the item difficulty index of a criterion-referenced test will be high.

In addressing the item discriminability, item difficulty and distractor effect of particular test items, it is advisable, of course, to pilot these tests and to be cautious about placing too great a store on indices of difficulty and discriminability that are computed from small samples.

In constructing a test with item analysis, item discriminability, item difficulty and distractor effects in mind, it is important also to consider the actual requirements of the test (Nuttall, 1987; Cresswell and Houston, 1991), for example:

• are all the items in the test equally difficult and, if not, what makes some items more difficult than the rest?;
• which items are easy, moderately hard, hard, very hard?;
• what kind of task is each item addressing (e.g. is it (a) a practice item—repeating known knowledge; (b) an application item—applying known knowledge; (c) a synthesis item—bringing together and integrating diverse areas of knowledge)?;
• whether the items are sufficiently within the experience of the students;
• how motivated students will be by the contents of each item (i.e. how relevant they perceive the item to be, how interesting it is).

The contents of the test will also need to take account of the notion of fitness for purpose, for example in the types of test items. Here the researcher will need to consider whether the kinds of data to demonstrate ability, understanding and achievement will be best demonstrated in, for example (Lewis, 1974; Cohen, Manion and Morrison, 1996):


• an open essay;
• a factual and heavily directed essay;
• short answer questions;
• divergent thinking items;
• completion items;
• multiple choice items (with one correct answer or more than one correct answer);
• matching pairs of items or statements;
• inserting missing words;
• incomplete sentences or incomplete, unlabelled diagrams;
• true/false statements;
• open-ended questions where students are given guidance on how much to write (e.g. 300 words, a sentence, a paragraph);
• closed questions.

These items can test recall, knowledge, comprehension, application, analysis, synthesis and evaluation, i.e. different orders of thinking. They take their rationale from Bloom (1956) on hierarchies of thinking—from low order (comprehension, application), through middle order thinking (analysis, synthesis), to higher order thinking (evaluation, judgement, criticism). Clearly the selection of the form of the test item will be based on the principle of gaining the maximum amount of information in the most economical way. More recently this is evidenced in the explosive rise of machine-scorable multiple choice completion tests, where optical mark readers and scanners can enter and process large-scale data rapidly.

4 Consider the form of the test

Much of the discussion in this chapter assumes that the test is of the pen-and-paper variety. Clearly this need not be the case: tests can be written, oral, practical, interactive, computer-based, dramatic, diagrammatic, pictorial, photographic, presentational, role-play or simulation based, or involve the use of audio and video material. This does not negate the issues discussed in this chapter, for the form of the test will still need to consider, for example, reliability and validity, difficulty, discriminability, marking and grading, item analysis and timing. Indeed several of these factors take on an added significance in non-written forms of testing; for example: (a) reliability is a major issue in judging live musical performance or the performance of a gymnastics routine—where a 'one-off' event is likely; (b) reliability and validity are significant issues in group performance or group exercises—where group dynamics may prevent a testee's true abilities from being demonstrated. Clearly the researcher will need to consider whether the test will be undertaken individually or in a group, and what form it will take.

5 Write the test item

The test will need to address the intended and unintended clues and cues that might be provided in it, for example (Morris et al., 1987):

• the number of blanks might indicate the number of words required;
• the number of dots might indicate the number of letters required;
• the length of blanks might indicate the length of response required;
• the space left for completion will give cues about how much to write;
• blanks in different parts of a sentence will be assisted by the reader having read the other parts of the sentence (anaphoric and cataphoric reading cues).

Hanna (1993:139–41) and Cunningham (1998) provide several guidelines for constructing short-answer items to overcome some of these problems:

• make the blanks close to the end of the sentence;
• keep the blanks the same length;
• ensure that there can be only a single correct answer;
• avoid putting several blanks close to each other (in a sentence or paragraph) such that the overall meaning is obscured;
• only make blanks of key words or concepts, rather than of trivial words;
• avoid addressing only trivial matters;
• ensure that students know exactly the kind and specificity of the answer required;
• specify the units in which a numerical answer is to be given;
• use short answers for testing knowledge recall.

With regard to multiple choice items there are several potential problems:

• the number of choices in a single multiple choice item (and whether there is one or more right answer(s));
• the number and realism of the distractors in a multiple choice item (e.g. there might be many distractors but many of them are too obvious to be chosen—there may be several redundant items);
• the sequence of items and their effects on each other;
• the location of the correct response(s) in a multiple choice item.

Gronlund and Linn (1990), Hanna (1993:161–75) and Cunningham (1998) set out several suggestions for constructing effective multiple choice test items:

• ensure that they catch significant knowledge and learning rather than low-level recall of facts;
• frame the nature of the issue in the stem of the item, ensuring that the stem is meaningful in itself (e.g. replace the general 'sheep: (a) are graminivorous, (b) are cloven footed, (c) usually give birth to one or two calves at a time' with 'how many lambs are normally born to a sheep at one time?');
• ensure that the stem includes as much of the item as possible, with no irrelevancies;
• avoid negative stems to the item;
• keep the readability levels low;
• ensure clarity and unambiguity;
• ensure that all the options are plausible so that guessing of the only possible option is avoided;
• avoid the possibility of students making the correct choice through incorrect reasoning;
• include some novelty to the item if it is being used to measure understanding;
• ensure that there can only be a single correct option (if a single answer is required) and that it is unambiguously the right response;
• avoid syntactical and grammatical clues by making all options syntactically and grammatically parallel and by avoiding matching the phrasing of a stem with similar phrasing in the response;
• avoid including in the stem clues as to which may be the correct response;
• ensure that the length of each response item is the same (e.g. to prevent one long correct answer from standing out);
• keep each option separate, avoiding options which are included in each other;
• ensure that the correct option is positioned differently for each item (e.g. so that it is not always option 2);
• avoid using options like 'all of the above' or 'none of the above';
• avoid answers from one item being used to cue answers to another item—keep items separate.

Morris et al. (1987:161), Gronlund and Linn (1990), Hanna (1993:147) and Cunningham (1998) also indicate particular problems in true-false questions:

• ambiguity of meaning;
• some items might be partly true or partly false;
• items that polarize—being too easy or too hard;
• most items might be true or false under certain conditions;
• it may not be clear to the student whether facts or opinions are being sought;
• as this is dichotomous, students have an even chance of guessing the correct answer;
• an imbalance of true to false statements;
• some items might contain 'absolutes' which give powerful clues, e.g. 'always', 'never', 'all', 'none'.

To overcome these problems the authors suggest several points that can be addressed:


• avoid generalized statements (as they are usually false);
• avoid trivial questions;
• avoid negatives and double negatives in statements;
• avoid over-long and over-complex statements;
• ensure that items are rooted in facts;
• ensure that statements can be only true or false;
• write statements in everyday language;
• decide where it is appropriate to use 'degrees'—'generally', 'usually', 'often'—as these are capable of interpretation;
• avoid ambiguities;
• ensure that each statement contains only one idea;
• if an opinion is to be sought then ensure that it is attributable to a named source;
• ensure that true statements and false statements are equal in length and number.

Morris et al. (1987), Hanna (1993:150–2) and Cunningham (1998) also indicate particular potential difficulties in matching items:

• it might be very clear to a student which items in a list simply cannot be matched to items in the other list (e.g. by dint of content, grammar, concepts), thereby enabling the student to complete the matching by elimination rather than understanding;
• one item in one list might be able to be matched to several items in the other;
• the lists might contain unequal numbers of items, thereby introducing distractors—rendering the selection as much a multiple choice item as a matching exercise.

The authors suggest that difficulties in matching items can be addressed thus:

• ensure that the items for matching are homogeneous—similar—over the whole test (to render guessing more difficult);
• avoid constructing matching items whose answers can be worked out by elimination (e.g. by ensuring that: (a) there are different numbers of items in each column so that there are more options to be matched than there are items; (b) students cannot reduce the field of options as they increase the number of items that they have matched; (c) the same option may be used more than once);
• decide whether to mix the two columns of matched items (i.e. ensure, if desired, that each column includes both items and options);
• sequence the options for matching so that they are logical and easy to follow (e.g. by number, by chronology);
• avoid over-long columns and keep the columns on a single page;
• make the statements in the options columns as brief as possible;
• avoid ambiguity by ensuring that there is a clearly suitable option that stands out from its rivals;
• make it clear what the nature of the relationship should be between the item and the option (on what terms they relate to each other);
• number the items and letter the options.

With regard to essay questions, there are several advantages that can be claimed. For example, an essay, as an open form of testing, enables complex learning outcomes to be measured; it enables the student to integrate, apply and synthesize knowledge, to demonstrate the ability for expression and self-expression, and to demonstrate higher order and divergent cognitive processes. Further, it is comparatively easy to construct an essay title. On the other hand, essays have been criticized for yielding unreliable data (Gronlund and Linn, 1990; Cunningham, 1998), for being prone to unreliable (inconsistent and variable) scoring, neglectful of intended learning outcomes, and prone to marker bias and preference (being too intuitive, subjective, holistic, and time-consuming to mark). To overcome these difficulties the authors suggest that:

• the essay question must be restricted to those learning outcomes that are unable to be measured more objectively;

331Chapte

r 18

• the essay question must be clearly linked to desired learning outcomes, making clear what behaviours the students must demonstrate;
• the essay question must indicate the field and tasks very clearly (e.g. 'compare', 'justify', 'critique', 'summarize', 'classify', 'analyse', 'clarify', 'examine', 'apply', 'evaluate', 'synthesize', 'contrast', 'explain', 'illustrate');
• time limits are set for each essay;
• options are avoided, or, if options are to be given, ensure that, if students have a list of titles from which to choose, each title is equally difficult and equally capable of enabling the student to demonstrate achievement, understanding etc.;
• marking criteria are prepared and are explicit, indicating what must be included in the answers and the points to be awarded for such inclusions, or the ratings to be scored for the extent to which certain criteria have been met;
• decisions are agreed on how to address and score irrelevancies, inaccuracies, poor grammar and spelling;
• the work is double marked, blind, and, where appropriate, without the marker knowing (the name of) the essay writer.

Clearly these are issues of reliability (see Chapter 5 on reliability and validity). A further issue is the layout of the test, which can exert a profound effect on it.

6 Consider the layout of the test

This will include (Gronlund and Linn, 1990; Hanna, 1993; Linn, 1993; Cunningham, 1998):

• the nature, length and clarity of the instructions (e.g. what to do, how long to take, how much to do, how many items to attempt, what kind of response is required (e.g. a single word, a sentence, a paragraph, a formula, a number, a statement etc.), how and where to enter the response, where to show the 'working out' of a problem, where to start new answers (e.g. in a separate booklet), and whether one answer only is required to a multiple choice item, or more than one answer);
• the spreading of the instructions through the test, avoiding overloading students with too much information at first, and providing instructions for each section as they come to it;
• what marks are to be awarded for which parts of the test;
• minimizing ambiguity and taking care over the readability of the items;
• the progression from the easy to the more difficult items of the test (i.e. the location and sequence of items);
• the visual layout of the page, for example, avoiding overloading students with visual material or words;
• the grouping of items—keeping together items that have the same contents or the same format;
• the setting out of the answer sheets/locations so that they can be entered onto computers and read by optical mark readers and scanners (if appropriate).

The layout of the test should support its completion as efficiently and as effectively as possible for the student.

7 Consider the timing of the test

This refers to two areas: (a) when the test will take place (the day of the week, month, time of day) and (b) the time allowances to be given to the test and its component items. With regard to the former, this is in part a matter of reliability, for the time of day, week etc. might influence how alert, motivated or capable a student might be. With regard to the latter, the researcher will need to decide what time restrictions are being imposed and why (for example, is the pressure of a time constraint desirable—to show what a student can do under time pressure—or an unnecessary impediment, putting a time boundary around something that need not be bounded—was Van Gogh put under time pressure to produce his painting of sunflowers?).


Though it is vital that the student knows what the overall time allowance is for the test, clearly it might be helpful to a student to indicate notional time allowances for different elements of the test; if these are aligned to the relative weightings of the test (see the discussions of weighting and scoring) they enable a student to decide where to place emphasis in the test—she may want to concentrate her time on the high scoring elements of the test. Further, if the items of the test have exact time allowances, this enables a degree of standardization to be built into the test, and this may be useful if the results are going to be used to compare individuals or groups.

8 Plan the scoring of the test

The awarding of scores for different items of the test is a clear indication of the relative significance of each item—the weightings of each item are addressed in their scoring. It is important to ensure that easier parts of the test attract fewer marks than more difficult parts of it, otherwise a student's results might be artificially inflated by answering many easy questions and fewer more difficult questions (Gronlund and Linn, 1990). Additionally, there are several attractions to making the scoring of tests as detailed and specific as possible (Cresswell and Houston, 1991; Gipps, 1994), awarding specific points for each item and sub-item, for example:

• it enables partial completion of the task to be recognized—students gain marks in proportion to how much of the task they have completed successfully (an important feature of domain referencing);
• it enables a student to compensate for doing badly in some parts of a test by doing well in other parts of the test;
• it enables weightings to be made explicit to the students;
• it enables the rewards for successful completion of parts of a test to reflect considerations such as the length of the item, the time required to complete it, its level of difficulty, its level of importance;
• it facilitates moderation because it is clear and specific;
• it enables comparisons to be made across groups by item;
• it enables reliability indices to be calculated (see discussions of reliability);
• scores can be aggregated and converted into grades straightforwardly.

Ebel (1979) argues that the more marks that are available to indicate different levels of achievement (e.g. for the awarding of grades), the greater the reliability of the grades will be, though clearly this could make the test longer. Scoring will also need to be prepared to handle issues of poor spelling, grammar and punctuation—are they to be penalized, and how will consistency be assured here? Further, how will issues of omission be treated, e.g. if a student omits the units of measurement (miles per hour, dollars or pounds, metres or centimetres)?

Related to the scoring of the test is the issue of reporting the results. If the scoring of a test is specific then this enables variety in reporting to be addressed; for example, results may be reported item by item, section by section, or whole test by whole test. This degree of flexibility might be useful for the researcher, as it will enable particular strengths and weaknesses in groups of students to be exposed.

The desirability of some of the above points is open to question. For example, it could be argued that the strength of criterion-referencing is precisely its specificity, and that to aggregate data (e.g. to assign grades) is to lose the very purpose of the criterion-referencing (Gipps, 1994:85). For example, if I am awarded a grade E for spelling in English and a grade A for imaginative writing, this could be aggregated into a C grade as an overall grade of my English language competence, but what does this C grade mean? It is meaningless; it has no frame of reference or clear criteria; it loses the useful specificity of the A and E grades; it is a compromise that actually tells us nothing. Further, aggregating such grades assumes equal levels of difficulty of all items.

Of course, raw scores are still open to interpretation—which is a matter of judgement rather than exactitude or precision (Wiliam, 1996). For example, if a test is designed to assess 'mastery' of a subject, then the researcher is faced with the issue of deciding what constitutes 'mastery'—is it an absolute (i.e. a very high score), or are there gradations, and if the latter, where do these gradations fall? For published tests the scoring is standardized and already made clear, as are the conversions of scores into, for example, percentiles and grades.

Underpinning the discussion of scoring is the need to make it unequivocally clear exactly what the marking criteria are—what will and will not score points. This requires a clarification of whether there is a 'checklist' of features that must be present in a student's answer.

Clearly criterion-referenced tests will have to declare their lowest boundary—a cut-off point—below which the student is deemed to have failed to meet the criteria. A compromise can be seen in those criterion-referenced tests which award different grades for different levels of performance of the same task, necessitating the clarification of different cut-off points in the examination. A common example of this can be seen in the GCSE examinations for secondary school pupils in the United Kingdom, where students can achieve a grade between A and F in a criterion-related examination.

The determination of cut-off points has been addressed by Nedelsky (1954), Angoff (1971), Ebel (1972) and Linn (1993). Angoff (1971) suggests a method for dichotomously scored items. Here judges are asked to identify the proportion of minimally acceptable persons who would answer each item correctly. The sum of these proportions is then taken to represent the minimally acceptable score.
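A minimal Python sketch of the Angoff procedure (the judges and their estimates are invented for illustration):

    # Angoff (1971): for each dichotomously scored item, each judge
    # estimates the proportion of minimally acceptable persons who
    # would answer it correctly; the sum of a judge's proportions is
    # that judge's minimally acceptable score.
    judges = {
        "judge_1": [0.9, 0.7, 0.5, 0.8],   # one proportion per test item
        "judge_2": [0.8, 0.6, 0.6, 0.7],
    }

    cutoffs = {name: sum(props) for name, props in judges.items()}
    # Averaging across judges is one common way of combining them.
    overall = sum(cutoffs.values()) / len(cutoffs)
    print({k: round(v, 2) for k, v in cutoffs.items()}, round(overall, 2))
    # -> {'judge_1': 2.9, 'judge_2': 2.7} 2.8 (out of 4 items)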

An elaborated version of this principle comes from Ebel (1972). Here a difficulty-by-relevance matrix is constructed for all the items. Difficulty might be assigned three levels (e.g. easy, medium and hard), and relevance might be assigned three levels (e.g. highly relevant, moderately relevant, barely relevant). When each and every test item has been assigned to the cells of the matrix, the judges estimate the proportion of items in each cell that minimally acceptable persons would answer correctly, with the standard for each judge being the weighted average of the proportions in each cell (the weights being determined by the number of items in each cell). In this method judges have to consider two factors—relevance and difficulty (unlike Angoff's method, where only difficulty featured). What characterizes these approaches is the trust that they place in experts making judgements about levels (e.g. of difficulty, or relevance, or proportions of successful achievement), i.e. they are based on fallible human subjectivity.
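A minimal sketch of Ebel's weighted average for one judge (the cell counts and proportions are invented for illustration):

    # Ebel (1972): items are sorted into difficulty-by-relevance cells;
    # for each cell a judge estimates the proportion of items that
    # minimally acceptable persons would answer correctly. The judge's
    # standard is the average of those proportions weighted by the
    # number of items in each cell.
    cells = [
        # (number of items in cell, judged proportion correct)
        (10, 0.9),   # easy / highly relevant
        (6,  0.6),   # medium / moderately relevant
        (4,  0.3),   # hard / barely relevant
    ]

    total_items = sum(n for n, _ in cells)
    standard = sum(n * p for n, p in cells) / total_items
    print(round(standard, 2))   # -> 0.69, i.e. a cut-off of 69 per cent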

Ebel (1979) argues that one principle in the assignation of grades is that they should represent equal intervals on the score scale. Reference is made to median scores and standard deviations: median scores because it is meaningless to assume an absolute zero on scoring, and standard deviations as the unit of convenient size for inclusion of scores for each grade (see also Cohen and Holliday, 1996). One procedure is thus:

Step 1 Calculate the median and standard deviation of the scores.
Step 2 Determine the lower score limits of the mark intervals, using the median as the reference point and the standard deviation as the unit of size for each grade.
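A minimal Python sketch of this two-step procedure; the scores and the five-grade scheme (A–E, each interval one standard deviation wide) are illustrative assumptions, since the procedure leaves the number of grades to the researcher:

    import statistics

    scores = [34, 41, 45, 48, 50, 53, 55, 58, 62, 70]   # invented pilot scores

    # Step 1: calculate the median and standard deviation of the scores.
    median = statistics.median(scores)
    sd = statistics.stdev(scores)

    # Step 2: use the median as the reference point and the standard
    # deviation as the width of each grade interval.
    lower_limits = {
        "A": median + 1.5 * sd,
        "B": median + 0.5 * sd,
        "C": median - 0.5 * sd,
        "D": median - 1.5 * sd,   # anything below the D limit is graded E
    }
    for grade, limit in lower_limits.items():
        print(f"grade {grade}: scores from {limit:.1f} upwards")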

However, the issue of cut-off scores is complicated by the fact that they may vary according to the different purposes and uses of scores (e.g. for diagnosis, for certification, for selection, for programme evaluation), as these purposes will affect the number of cut-off points and grades, and the precision of detail required. For a full analysis of determining cut-off grades see Linn (1993).

The issue of scoring takes in a range of factors, for example: grade norms, age norms, percentile norms and standard score norms (e.g. z-scores, T-scores, stanine scores, percentiles).


These are beyond the scope of this book to discuss, but readers are referred to Cronbach (1970), Gronlund and Linn (1990), Cohen and Holliday (1996) and Hopkins et al. (1996).

Devising a pretest and post-test

The construction and administration of tests is an essential part of the experimental model of research, where a pretest and a post-test have to be devised for the control and experimental groups. The pretest and post-test must adhere to several guidelines:

• The pretest may have questions which differ in form or wording from the post-test, though the two tests must test the same content, i.e. they will be alternate forms of a test for the same groups.
• The pretest must be the same for the control and experimental groups.
• The post-test must be the same for both groups.
• Care must be taken in the construction of a post-test to avoid making the test easier to complete by one group than another.
• The level of difficulty must be the same in both tests.

Test data feature centrally in the experimental model of research; additionally, they may feature as part of a questionnaire, interview and documentary material.

Reliability and validity of tests

Chapter 5 covers issues of reliability and validity. Suffice it here to say that reliability concerns the degree of confidence that can be placed in the results and the data, which is often a matter of statistical calculation and subsequent test redesigning. Validity, on the other hand, concerns the extent to which the test tests what it is supposed to test! This devolves on content, construct, face, criterion-related and concurrent validity.

Ethical issues in preparing for tests

A major source of unreliability of test data derives from the extent and ways in which students have been prepared for the test. These can be located on a continuum from direct and specific preparation, through indirect and general preparation, to no preparation at all. With the growing demand for test data (e.g. for selection, for certification, for grading, for employment, for tracking, for entry to higher education, for accountability, for judging schools and teachers) there is a perhaps understandable pressure to prepare students for tests. This is the 'high-stakes' aspect of testing (Harlen, 1994), where much hinges on the test results. At one level this can be seen in the backwash effect of examinations on curricula and syllabuses; at another level it can lead to the direct preparation of students for specific examinations. Preparation can take many forms (Mehrens and Kaminski, 1989; Gipps, 1994):

• ensuring coverage, amongst other programme contents and objectives, of the objectives and programme that will be tested;
• restricting the coverage of the programme content and objectives to those only that will be tested;
• preparing students with 'exam technique';
• practice with past/similar papers;
• directly matching the teaching to specific test items, where each piece of teaching and contents is the same as each test item;
• practice on an exactly parallel form of the test;
• telling students in advance what will appear on the test;
• practice on, and preparation of, the identical test itself (e.g. giving out test papers in advance) without teacher input;
• practice on, and preparation of, the identical test itself (e.g. giving out the test papers in advance), with the teacher working through the items, maybe providing sample answers.

How ethical it would be to undertake the final four of these is perhaps questionable, as indeed is any apart from the first on the list. Are they cheating or legitimate test preparation? Should one teach to a test? Is not doing so a dereliction of duty (e.g. in criterion- and domain-referenced tests), or is doing so giving students an unfair advantage and thus reducing the reliability of the test as a true and fair measure of ability or achievement? In high-stakes assessment (e.g. for public accountability and to compare schools and teachers) there is even the issue of not entering for tests those students whose performance will be low (see, for example, Haladyna, Nolen and Hass, 1991). There is a risk of a correlation between the 'stakes' and the degree of unethical practice—the greater the stakes, the greater the incidence of unethical practice. Unethical practice, observes Gipps (1994), occurs where scores are inflated but reliable inference on performance or achievement is not, and where different groups of students are prepared differentially for tests, i.e. giving some students an unfair advantage over others. To overcome such problems, she suggests, it is ethical and legitimate for teachers to teach to a broader domain than the test, teachers should not teach directly to the test, and only better instruction rather than test preparation is acceptable (Cunningham, 1998).

One can add to this list of considerations (Cronbach, 1970; Hanna, 1993; Cunningham, 1998) the view that:

• tests must be valid and reliable (see the chapter on reliability and validity);
• the administration, marking and use of the test should only be undertaken by suitably competent/qualified people (i.e. people and projects should be vetted);
• access to test materials should be controlled, for instance: test items should not be reproduced apart from selections in professional publications, and tests should only be released to suitably qualified professionals in connection with specific professionally acceptable projects;
• tests should benefit the testee (beneficence);
• clear marking and grading protocols should exist (the issue of transparency is discussed in the chapter on reliability and validity);
• test results should only be reported in a way that cannot be misinterpreted;
• the privacy and dignity of individuals should be respected (e.g. confidentiality, anonymity, non-traceability);
• individuals should not be harmed by the test or its results (non-maleficence);
• informed consent to participate in the test should be sought.

Computerized adaptive testing

A recent trend in testing is towards computerized adaptive testing (Wainer, 1990). This is particularly useful for large-scale testing, where a wide range of ability can be expected. Here a test must be devised that enables the tester to cover this wide range of ability; hence it must include some easy to some difficult items—too easy and it does not enable a range of high ability to be charted (testees simply get all the answers right), too difficult and it does not enable a range of low ability to be charted (testees simply get all the answers wrong). We find out very little about a testee if we ask a battery of questions which are too easy or too difficult for her. Further, it is more efficient and reliable if a test can avoid the problem, for high ability testees, of having to work through a mass of easy items in order to reach the more difficult items and, for low ability testees, of having to try to guess the answers to more difficult items. Hence it is useful to have a test that is flexible and that can be adapted to the testees. For example, if a testee found an item too hard the next item could adapt to this and be easier; conversely, if a testee was successful on an item the next item could be harder.

Wainer indicates that in an adaptive test the first item is pitched in the middle of the assumed ability range; if the testee answers it correctly then it is followed by a more difficult item, and if the testee answers it incorrectly then it is followed by an easier item. Computers here provide an ideal opportunity to address the flexibility, discriminability and efficiency of testing. Testees can work at their own pace; they need not be discouraged but can be challenged; the test is scored instantly to provide feedback to the testee; a greater range of items can be included in the test; and a greater degree of precision and reliability of measurement can be achieved. Indeed, test security can be increased, and the problem of understanding answer sheets is avoided.

Clearly the use of computer adaptive testing has several putative attractions. On the other hand it requires different skills from traditional tests, and these might compromise the reliability of the test, for example:

• the mental processes required to work with a computer screen and computer programme differ from those required for a pen and paper test;
• motivation and anxiety levels increase or decrease when testees work with computers;
• the physical environment might exert a significant difference, e.g. lighting, glare from the screen, noise from machines, loading and running the software;
• reliability shifts from an index of the variability of the test to an index of the standard error of the testee's performance (see the note after this list). The usual formula for calculating standard error assumes that error variance is the same for all scores, whereas in item response theory it is assumed that error variance depends on each testee's ability. The conventional statistic of error variance calculates a single average variance of summed scores, whereas in item response theory this is at best very crude and at worst misleading, as variation is a function of ability rather than of test variation and cannot fairly be summed (see Thissen, 1990, for an analysis of how to address this issue);
• having so many test items increases the chance of inclusion of poor items.
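The contrast drawn in the fourth point can be stated more formally. In classical test theory the standard error of measurement is a single figure for the whole test, whereas in item response theory the standard error varies with the estimated ability, being smallest where the test carries most information. These are standard textbook formulations rather than expressions taken from the sources cited above:

\[
\text{SEM} = \sigma_X\sqrt{1 - r_{XX'}}
\qquad\text{versus}\qquad
\operatorname{SE}(\hat\theta) = \frac{1}{\sqrt{I(\hat\theta)}},
\]

where \(\sigma_X\) is the standard deviation of test scores, \(r_{XX'}\) is the reliability co-efficient, and \(I(\hat\theta)\) is the test information function at the testee's estimated ability \(\hat\theta\).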

Computer adaptive testing requires a large item pool to be developed for each area of the content domain (Flaugher, 1990), with sufficient numbers, variety and spread of difficulty. The items have to be pretested and validated, their difficulty and discriminability calculated, the effect of distractors reduced, the capability of the test to address unidimensionality and/or multidimensionality clarified, and the rules for selecting items enacted.

19 Personal constructs

Introduction

One of the most interesting theories of personality to have emerged this century, and one that has had an increasing impact on educational research, is 'personal construct theory'. Personal constructs are the basic units of analysis in a complete and formally stated theory of personality proposed by George Kelly in a book entitled The Psychology of Personal Constructs (1955). Kelly's own experiences were intimately related to the development of his imaginative theory. He began his career as a school psychologist dealing with problem children referred to him by teachers. As his experience widened, instead of merely corroborating a teacher's complaint about a pupil, Kelly tried to understand the complaint in the way the teacher construed it. This change of perspective constituted a significant reformulation of the problem. In practical terms it resulted in an analysis of the teacher making the complaint as well as of the problem pupil. By viewing the problem from a wider perspective, Kelly was able to envisage a wider range of solutions.

The insights George Kelly gained from his clinical work led him to the view that there is no objective, absolute truth, and that events are only meaningful in relation to the ways they are construed by individuals. Kelly's primary focus is upon the way individuals perceive their environment, the way they interpret what they perceive in terms of their existing mental structure, and the way in which, as a consequence, they behave towards it. In The Psychology of Personal Constructs, Kelly proposes a view of people actively engaged in making sense of and extending their experience of the world. Personal constructs are the dimensions that we use to conceptualize aspects of our day-to-day world.

The constructs that we create are used by us to forecast events and to rehearse situations before their actual occurrence. According to Kelly, we take on the role of scientist, seeking to predict and control the course of events in which we are caught up. For Kelly, the ultimate explanation of human behaviour 'lies in scanning man's undertakings, the questions he asks, the lines of inquiry he initiates and the strategies he employs' (Kelly, 1969). Education, in Kelly's view, is necessarily experimental. Its ultimate goal is individual fulfilment and the maximizing of individual potential. In emphasizing the need of each individual to question and explore, construct theory implies a view of education that capitalizes upon the child's natural motivation to engage in spontaneous learning activities. It follows that the teacher's task is to facilitate children's ongoing exploration of the world rather than to impose adult perspectives upon them. Kelly's ideas have much in common with those to be found in Rousseau's Emile.

The central tenets of Kelly's theory are set out in terms of a fundamental postulate and a number of corollaries. It is not proposed here to undertake a detailed discussion of his theoretical propositions. Good commentaries are available in Bannister (1970) and Ryle (1975). Instead, we look at the method suggested by Kelly for eliciting constructs and assessing the mathematical relationships between them, that is, repertory grid technique.

Characteristics of the method

Kelly proposes that each person has access to a limited number of 'constructs' by means of which she evaluates the phenomena that constitute her world. These phenomena—people, events, objects, ideas, institutions and so on—are known as 'elements'. He further suggests that the constructs that each of us employs may be thought of as bi-polar, that is, capable of being defined in terms of polar adjectives (good-bad) or polar phrases (makes me feel happy-makes me feel sad).

A number of different forms of repertory grid technique have been developed since Kelly's first formulation. All have in common the two essential characteristics that we have already identified: constructs—the dimensions used by a person in conceptualizing aspects of her world; and elements—the stimulus objects that a person evaluates in terms of the constructs she employs. In Box 19.1 we illustrate the empirical technique suggested by Kelly for eliciting constructs and identifying their relationship with elements in the form of a repertory grid.

Since Kelly's original account of what he called 'The Role Construct Repertory Grid Test', several variations of repertory grid have been developed and used in different areas of research. It is the flexibility and adaptability of repertory grid technique that has made it such an attractive tool for researchers in psychiatric, counselling and, more recently, educational settings. We now review a number of developments in the form and the use of the technique. Alban-Metcalf (1997:318) suggests that the use of repertory grids is largely twofold: in their 'static' form they elicit perceptions that people hold of others at a single point in time; in their 'dynamic' form, repeated application of the method indicates changes in perception over time; the latter is useful for charting development and change.

‘Elicited’ versus ‘provided’ constructs

Box 19.1 Eliciting constructs and constructing a repertory grid
Source: Adapted from Kelly, 1969

A person is asked to name a number of people who are significant to him. These might be, for example, mother, father, wife, friend, employer, priest. These constitute the elements in the repertory grid.

The subject is then asked to arrange the elements into groups of three in such a manner that two are similar in some way but at the same time different from the third. The ways in which the elements may be alike or different are the constructs, generally expressed in bi-polar form (quiet—talkative; mean—generous; warm—cold). The way in which two of the elements are similar is called the similarity pole of the construct; the way in which two of the elements are different from the third, the contrast pole of the construct.

A grid can now be constructed by asking the subject to place each element at either the similarity or the contrast pole of each construct. Let x = one pole of the construct, and blank = the other. The result can be set out as a grid of elements (columns) against constructs (rows); the grid itself is not reproduced here.

It is now possible to derive different kinds of information from the grid. By studying each row, for example, we can get some idea of how a person defines each construct in terms of significant people in his life. From each column, we have a personality profile of each of the significant people in terms of the constructs selected by the subject. More sophisticated treatments of grid data are discussed in examples presented in the text.
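A small sketch may help to fix the idea of rows and columns in such a grid. The people, constructs and allocations below are invented, not Kelly's:

```python
# A toy dichotomous repertory grid: for each bi-polar construct we
# record the set of elements placed at the similarity pole (a cross in
# Kelly's layout). All names and allocations are invented.

elements = ["mother", "father", "wife", "friend", "employer", "priest"]

grid = {
    ("quiet", "talkative"): {"mother", "friend", "priest"},
    ("mean", "generous"):   {"employer"},
    ("warm", "cold"):       {"mother", "wife", "friend"},
}

# A row: how the person defines one construct across the elements.
for (similarity, contrast), at_pole in grid.items():
    print(f"{similarity}: {sorted(at_pole)} | "
          f"{contrast}: {sorted(set(elements) - at_pole)}")

# A column: a profile of one element ('mother') across the constructs.
print({sim: ("x" if "mother" in at_pole else " ")
       for (sim, _), at_pole in grid.items()})
```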


A central assumption of this 'standard' form of repertory grid is that it enables the researcher to elicit constructs that subjects customarily use in interpreting and predicting the behaviour of those people who are important in their lives. Kelly's method of eliciting personal constructs required the subject to complete a number of cards, 'each showing the name of a person in [his/her] life'. Similarly, in identifying elements, the subject was asked, 'Is there an important way in which two of [the elements]—any two—differ from the third?', i.e. triadic elicitation (see, for example, Nash, 1976). This insistence upon important persons and important ways in which they are alike or differ, where both constructs and elements are nominated by the subjects themselves, is central to personal construct theory. Kelly gives it precise expression in his Individuality Corollary—'Persons differ from each other in their construction of events.'

Several forms of repertory grid technique now in common use represent a significant departure from Kelly's individuality corollary in that they provide constructs to subjects rather than eliciting constructs from them.

One justification for the use of provided constructs is implicit in Ryle's commentary on the individuality corollary. 'Kelly paid rather little attention to developmental and social processes', Ryle observes; 'his own concern was with the personal and not the social'. Ryle believes that the individuality corollary would be strengthened by the additional statement that 'persons resemble each other in their construction of events' (Ryle, 1975).

Can the practice of providing constructs to subjects be reconciled with the assumptions of the individuality corollary? A review of a substantial body of research suggests a qualified 'yes':

[While] it seems clear in the light of research that individuals prefer to use their own elicited constructs rather than provided dimensions to describe themselves and others…the results of several studies suggest that normal subjects, at least, exhibit approximately the same degree of differentiation in using carefully selected supplied lists of adjectives as when they employ their own elicited personal constructs.

(Adams-Webber, 1970)

However, see Fransella and Bannister (1977) on elicited versus supplied constructs as a 'grid-generated' problem.

Bannister and Mair (1968) support the use of supplied constructs in experiments where hypotheses have been formulated and in those involving group comparisons. The use of elicited constructs alongside supplied ones can serve as a useful check on the meaningfulness of those that are provided, substantially lower inter-correlations between elicited and supplied constructs suggesting, perhaps, the lack of relevance of those provided by the researcher. The danger with supplied constructs, Bannister and Mair argue, is that the researcher may assume that the polar adjectives or phrases she provides are the verbal equivalents of the psychological dimensions in which she is interested.

Allotting elements to constructs

When a subject is allowed to classify as many or as few elements as she wishes at the similarity or the contrast pole, the result is often a very lopsided construct, with consequent dangers of distortion in the estimation of construct relationships. Bannister and Mair (1968) suggest two methods for dealing with this problem, which we illustrate in Box 19.2. The first, the 'split-half form', requires the subject to place half the elements at the similarity pole of each construct, by instructing her to decide which elements most markedly show the characteristics specified by each of the constructs. Those elements that are left are allocated to the contrast pole. As Bannister observes, this technique may result in the discarding of constructs (for example, male-female) which cannot be so summarily allocated. A second method, the 'rank order form', as its name suggests, requires the subject to rank the elements from the one which most markedly exhibits the particular characteristic (shown by the similarity pole description) to the one which least exhibits it. As the second example in Box 19.2 shows, a rank order correlation co-efficient can be used to estimate the extent to which there is similarity in the allotment of elements on any two constructs.


Box 19.2 Allotting elements to constructs: three methods
Source: Adapted from Bannister and Mair, 1968 (the three example grids are not reproduced here)

Following Bannister, a 'construct relationship' score can be calculated by squaring the correlation co-efficient and multiplying by 100. (Because correlations are not linearly related, they cannot be used as scores.) The construct relationship score gives an estimate of the percentage variance that the two constructs share in common in terms of the rankings on the two grids.


A third method of allotting elements is the 'rating form'. Here, the subject is required to judge each element on a 7-point or a 5-point scale, for example, from absolutely beautiful (7) to absolutely ugly (1). Commenting on the advantages of the rating form, Bannister and Mair (1968) note that it offers the subject greater latitude in distinguishing between elements than that provided for in the original form proposed by Kelly. At the same time, the degree of differentiation asked of the subject may not be as great as that demanded in the ranking method. As with the rank order method, the rating form also allows the use of most correlation techniques. The rating form is the third example illustrated in Box 19.2.

Alban-Metcalf (1997:317) suggests that there are two principles that govern the selection of elements in repertory grid technique. The first is that the elements must be relevant to that part of the construct system being investigated, and the second is that the selected elements must be representative. The greater the number of elements (typically between 10 and 25) or constructs that are elicited, the greater is the chance of representativeness. Constructs can be psychological (e.g. anxious), physical (e.g. tall), situational (e.g. from this neighbourhood) and behavioural (e.g. is good at sport).

Laddering and pyramid constructions

The technique known as laddering arises out of Hinkle's (1965) important revision of the theory of personal constructs and the method employed in his research. Hinkle's concern was with the location of any construct within an individual's construct system, arguing that a construct has differential implications within a given hierarchical context. Here a construct is selected by the interviewer, and the respondent is asked which pole applies to a particular, given element (Alban-Metcalf, 1997:316). The constructs that are elicited form a sequence that has a logic for the individual and that can be arranged in a hierarchical manner of subordinate and superordinate constructs (ibid.:317). That is 'laddering up', where there is a progression from subordinate to superordinate constructs. The reverse process (superordinate to subordinate) is 'laddering down', asking, for example, how the respondent knows that such and such a construct applies to a particular person.

Hinkle (1965) went on to develop an Implication Grid or Impgrid, in which the subject is required to compare each of his constructs with every other to see which implies the other. The question 'why?' is asked over and over again to identify the position of any construct in an individual's hierarchical construct system. Box 19.3 illustrates Hinkle's laddering technique with an example from educational research reported by Fransella (1975).

In pyramid construction, respondents are asked to think of a particular 'element', a person, and then to specify an attribute which is characteristic of that person. The respondent is then asked to identify a person who displays the opposite characteristic. This sets out the two poles of the construct. Finally, laddering down of each of the opposite poles is undertaken, thereby constructing a pyramid of relationships between the constructs (Alban-Metcalf, 1997:317).

Grid administration and analysis

The example of grid administration and analysis outlined below employs the split-half method of allocating elements to constructs and a form of 'anchor analysis' devised by Bannister. We assume that 16 elements and 15 constructs have already been elicited by means of a technique such as the one illustrated in Box 19.1.

Procedures in grid administration

Draw up a grid measuring 16 (elements) by 15 (constructs) as in Box 19.1, writing along the top the names of the elements, but first inserting the additional element, 'self'. Alongside the rows, write in the construct poles.

You now have a grid in which each intersection or cell is defined by a particular column (element) and a particular row (construct). The administration takes the form of allocating every element on every construct. If, for example, your first construct is 'kind—cruel', allocate each element in turn on that dimension, putting a cross in the appropriate box if you consider that person (element) kind, or leaving it blank if you consider that person cruel. Make sure that half of the elements are designated kind and half cruel.

Proceed in this way for each construct in turn, always placing a cross where the construct pole to the left of the grid applies, and leaving the box blank if the construct pole to the right is applicable. Every element must be allocated in this way, and half of the elements must always be allocated to the left-hand pole.

Procedures in grid analysis

The grid may be regarded as a reflection of conceptual structure in which constructs are linked by virtue of their being applied to the same persons (elements). This linkage is measured by a process of matching construct rows.

To estimate the linkage between constructs 1 and 2 in Box 19.4, for example, count the number of matches between corresponding boxes in each row. A match is counted where the same element has been designated with a cross (or a blank) on both constructs. So, for constructs 1 and 2 in Box 19.4, we count six such matches. By chance we would expect 8 (out of 16) matches, and we may subtract this expected value from the observed value to arrive at an estimate of the deviation from chance.

Box 19.3 Laddering
Source: Adapted from Fransella, 1975

A matrix of rankings for a repertory grid with teachers as elements (not reproduced here)

You may decide to stop when you have elicited seven or eight constructs from the teacher elements, but you could go on to 'ladder' two or three of them. This process of laddering is in effect asking yourself (or someone else) to abstract from one conceptual level to another. You could ladder from man-woman, but it might be easier to start off with serious-light-hearted. Ask yourself which you would prefer to be—serious or light-hearted. You might reply light-hearted. Now pose the question 'why'. Why would you rather be a light-hearted person than a serious person? Perhaps the answer would be that light-hearted people get on better with others than do serious people. Ask yourself 'why' again. Why do you want to be the sort of person who gets on better with others? Perhaps it transpires that you think that people who do not get on well with others are lonely. In this way you elicit more constructs, but ones that stand on the shoulders of those previously elicited. Whatever constructs you have obtained can be put into the grid.


Constructs   Matches   Difference score
1–2          6         6 − 8 = −2

By matching construct 1 against all the remaining constructs (3…15), we get a score for each comparison. Beginning then with construct 2, and comparing this with every other construct (3…15), and so on, every construct on the grid is matched with every other one and a difference score for each obtained. This is recorded in matrix form, with the reflected half of the table also filled in (see the difference score for constructs 1–2 in Box 19.5). The sign of the difference score is retained: it indicates the direction of the linkage. A positive sign shows that the constructs are positively associated, a negative sign that they are negatively associated.

Now add up (ignoring sign) the difference scores for each column (construct) in the matrix. The construct with the largest total difference score is the one which, statistically, accounts for the greatest amount of variance in the grid. Note this down. Now look in the body of the matrix for the construct which has the largest non-significant association with the one you have just noted (in the case of a 16-element grid, as in Box 19.4, this will be a difference score of ±3 or less). This second construct can be regarded as a dimension which is orthogonal to the first, and together they may form the axes for mapping the person's psychological space.

If we imagine the construct with the highest difference score to be 'kind-cruel' and the highest non-significantly associated construct to be 'confident-unsure', then every other construct in the grid may be plotted with reference to these two axes. The co-ordinates for the map are provided by the difference scores relating to the matching of each construct with the two used to form the axes of the graph. In this way a pictorial representation of the individual's 'personal construct space' can be obtained, and inferences made from the spatial relationships between plotted constructs (see Box 19.6).
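The whole of this anchor analysis is easily expressed in code. The sketch below uses randomly invented allocations rather than real grid data, and follows the steps described above under the stated assumptions (16 elements, 15 constructs, significance taken as a difference score beyond ±3):

```python
# Anchor analysis on a dichotomous 16 x 15 grid: match construct rows,
# subtract the chance expectation of 8 matches, total the unsigned
# difference scores, and pick the two axes. Data are randomly invented.

import itertools
import random

random.seed(1)
n_elements, n_constructs = 16, 15
grid = [[random.random() < 0.5 for _ in range(n_elements)]
        for _ in range(n_constructs)]          # True = left-hand pole

def difference_score(row_a, row_b):
    matches = sum(a == b for a, b in zip(row_a, row_b))
    return matches - len(row_a) // 2           # deviation from chance

scores = {(i, j): difference_score(grid[i], grid[j])
          for i, j in itertools.combinations(range(n_constructs), 2)}

def pair_score(a, b):
    return scores[(a, b) if a < b else (b, a)]

def total(c):                                  # column sum, signs ignored
    return sum(abs(s) for (i, j), s in scores.items() if c in (i, j))

# First axis: the construct accounting for most variance in the grid.
first = max(range(n_constructs), key=total)
# Second axis: the largest association with the first that is still
# non-significant (a difference score of +/-3 or less on 16 elements).
candidates = [c for c in range(n_constructs)
              if c != first and abs(pair_score(c, first)) <= 3]
second = max(candidates, key=lambda c: abs(pair_score(c, first)))
print("axes for the construct map:", first, second)
```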

Box 19.4 Elements (grid not reproduced here)

Box 19.5 Difference score for constructs (matrix not reproduced here)


By rotating the original grid 90 degrees and carrying out the same matching procedure on the columns (figures), a similar map may be obtained for the people (figures) included in the grid. Grid matrices can be subjected to analyses of varying degrees of complexity. We have illustrated one of the simplest ways of calculating relationships between constructs in Box 19.5. For the statistically minded researcher, a variety of programmes exist in GAP, the Grid Analysis Package developed by Slater and described by Chetwynd (1974).1 GAP programmes analyse the single grid, pairs of grids and grids in groups. Grids may be aligned by construct, by element or by both. A fuller discussion of metric factor analysis is given in Fransella and Bannister (1977:73–81) and Pope and Keen (1981:77–91).

Non-metric methods of grid analysis make no assumptions about the linearity of relationships between the variables and the factors. Moreover, where the researcher is primarily interested in the relationships between elements, multidimensional scaling may prove a more useful approach to the data than principal components analysis.

The choice of one method rather than another must ultimately rest both upon what is statistically correct and upon what is psychologically desirable. The danger in the use of advanced computer programmes, as Fransella and Bannister point out, is being caught up in the numbers game. Their plea is that grid users should have at least an intuitive grasp of the processes being so competently executed by their computers!

Strengths of repertory grid technique

It is in the application of interpretive perspectives in social research, where the investigator seeks to understand the meaning of events to those participating, that repertory grid technique offers exciting possibilities. It is particularly able to provide the researcher with an abundance and a richness of interpretable material. Repertory grid is, of course, especially suitable for the exploration of relationships between an individual's personal constructs, as the studies of Foster (1992)2 and Neimeyer (1992), for example, show. Foster employed a Grids Review and Organizing Workbook (GROW), a structured exercise based on personal construct theory, to help a 16-year-old boy articulate constructs relevant to his career goals. Neimeyer's career counselling used a Vocational Reptest with a 19-year-old female student who compared and contrasted various vocational elements (occupations), laddering techniques being employed to determine construct hierarchies. Repertory grid is equally adaptable to the problem of identifying changes in individuals that occur as a result of some educational experience. By way of example, Burke, Noller and Caird (1992)3 identified changes in the constructs of a cohort of technical teacher trainees during the course of their two-year studies leading to qualified status.

In modified formats (the 'dyad' and the 'double dyad'), repertory grid has employed relationships between people as elements, rather than people themselves, and has demonstrated the increased sensitivity of this type of grid in identifying problems of adjustment in such diverse fields as family counselling (Alexander and Neimeyer, 1989) and sports psychology (Feixas, Marti and Villegas, 1989).

Finally, repertory grid can be used in studying the changing nature of construing and the patterning of relationships between constructs in groups of children from relatively young ages, as the work of Epting et al. (1971), Salmon (1969) and Applebee (1976) has shown.

Box 19.6 Grid matrix (not reproduced here)


Difficulties in the use of repertory grid technique

Fransella and Bannister (1977) point to a number of difficulties in the development and use of grid technique, the most important of which is, perhaps, the widening gulf between technical advances in grid forms and analyses and the theoretical basis from which these are derived. There is, it seems, a rapidly expanding grid industry. Small wonder, then, as Fransella and Bannister wryly observe, that studies such as a one-off analysis of the attitudes of a group of people to asparagus, which bears little or no relation to personal construct theory, are on the increase.

A second difficulty relates to the question of bi-polarity in those forms of the grid in which customarily only one pole of the construct is used. Researchers may make unwarranted inferences about constructs' polar opposites. Yorke's illustration of the possibility of the researcher obtaining 'bent' constructs suggests the usefulness of the opposite method (Epting et al., 1971) in ensuring the bi-polarity of elicited constructs.

A third caution is urged with respect to the elicitation and laddering of constructs. Laddering, note Fransella and Bannister, is an art, not a science. Great care must be taken not to impose constructs. Above all, the researcher must learn to listen to her subject(s).

A number of practical problems commonly experienced in rating grids are identified by Yorke.4 These are:

• variable perception of elements of low personal relevance;
• varying the context in which the elements are perceived during the administration of the grid;
• halo effect intruding into the ratings where the subject sees the grid matrix building up;
• accidental reversal of the rating scale (mentally switching from 5=high to 1=high, perhaps because 'five points' and 'first' are both ways of describing high quality). This can happen both within and between constructs, and is particularly likely where a negative or implicitly negative property is ascribed to the pair during triadic elicitation;
• failure to follow the rules of the rating procedure. For example, where the pair has had to be rated at the high end of a 5-point scale, triads have been found in a single grid rated as 5, 4, 4; 1, 1, 2; and 1, 2, 4, which must call into question the constructs and their relationship with the elements.

More fundamental criticism of the repertory grid, however, argues that it exhibits a nomothetic positivism that is discordant with the very theory on which it is based. Whatever the method of rating, ranking or dichotomous allocation of elements on constructs, is there not an implicit assumption, asks Yorke, that the construct is stable across all of the elements being rated? As with scales of measurement in the physical sciences, elements are assigned to positions on a fixed scale of meaning as though the researcher were dealing with length or weight. But meaning, Yorke reminds us, is 'anchored in the shifting sands of semantics'. This he ably demonstrates by means of a hypothetical problem of rating four people on the construct 'generous-mean'. Yorke shows that it would require a finely wrought grid of enormous proportions to do justice to the nuances of meaning that could be elicited in respect of the chosen construct. The charge that the rating of elements on constructs and the subsequent statistical analyses retain a positivistic core in what purports to be a non-positivistic methodology is difficult to refute.

Finally, increasing sophistication in computer-based analyses of repertory grid forms leads inevitably to a burgeoning number of concepts by which to describe the complexity of what can be found within matrices. It would be ironic, would it not, Fransella and Bannister ask, if repertory grid technique were to become absorbed into the traditions of psychological testing and employed in terms of the assumptions which underpin such testing. From measures to traits is but a short step, they warn.


Some examples of the use of repertory grid in educational research

Our first two examples of the use of personal constructs in education have to do with course evaluation, albeit one less directly than the other. The first study employs the triadic sorting procedure that Kelly originally suggested; the second illustrates the use of sophisticated interactive software in the elicitation and analysis of personal constructs. Kremer-Hayon's (1991) study sought to answer two questions: first, 'What are the personal constructs by which headteachers relate to their staff?' and second, 'To what extent can those constructs be made more "professional"?' The subjects of her research were thirty junior school headteachers participating in an in-service university programme on school organization and management, educational leadership and curriculum development. The broad aim of the course was to improve the professional functioning of its participants. Headteachers' personal constructs were elicited through the triadic sorting procedure in the following way:

1 Participants were provided with ten cards, which they numbered 1 to 10. On each card they wrote the name of a member of staff with whom they worked at school.
2 They were then required to arrange the cards in threes, according to arbitrarily selected numbers provided by the researcher.
3 Finally, they were asked to suggest one way in which two of the three named teachers in any one triad were similar and one way in which the third member was different.

During the course of the two-year in-service programme, the triadic sorting procedure was undertaken on three occasions: Phase 1 at the beginning of the first year, Phase 2 at the beginning of the second year, and Phase 3 two months later, after participants had engaged in a workshop aimed at enriching and broadening their perspectives as a result of analysing the personal constructs elicited during Phases 1 and 2.

The analysis of the personal construct data generated categories derived directly from the headteachers' sortings. Categories were counted separately for each headteacher and for all headteachers, thus yielding personal and group profiles. This part of the analysis was undertaken by two judges working independently, who had previously attained 85 per cent agreement on equivalent data. In classifying categories as 'professional', Kremer-Hayon (1991) drew on a research literature which included the following attributes of a profession: 'a specific body of knowledge and expertise, teaching skill, theory and research, accountability, commitment, code of ethics, solidarity and autonomy'. Descriptors were further differentiated as 'cognitive' and 'affective'. By way of example, the first three attributes of professionalism listed above (specific body of knowledge, teaching skills, and theory and research) were taken to connote cognitive aspects; the next four, affective. Thus, the data were classified into the following categories:

professional features (cognitive and affective)
general features (cognitive and affective)
background data (professional and non-professional)
miscellaneous

At the outset of the in-service programme, the group of headteachers related to their teaching staff by general rather than professional descriptors, and by affective rather than cognitive descriptors. The overall group profile at Phase 1 appeared to be non-professional and affective. This patterning changed at the start of the second year when, as far as professional descriptors were concerned, a more balanced picture emerged. Upon the completion of the workshop (Phase 3), there was a substantial change in a professional direction.

Kremer-Hayon concludes that the growth in the number of descriptors pertaining to professional features bears some promise for professional staff development.

The research report of Fisher et al. (1991) arose out of an evaluation of a two-year diploma course in a college of further and higher education. Repertory grid was chosen as a particularly suitable means of helping students chart their way through the course of study and of revealing to them aspects of their personal and professional growth. At the same time, it was felt that repertory grid would provide tutors and course directors with important feedback about teaching, examining and the general management of the course as a whole.

'Flexigrid', the interactive software used in the study, was chosen to overcome what the authors identify as the major problem of grid production and subsequent exploration of emerging issues—the factor of time. During the diploma course, five three-hour sessions were set aside for training and the elicitation of grids. Students were issued with a booklet containing exact instructions on using the computer. They were asked to identify six items they felt to be important in connection with their diploma course. These six elements, along with the constructs arising from the triads selected by the software, were entered into the computer. Students worked singly using the software and then discussed their individual findings in pairs, having already been trained how to interpret the 'maps' that appeared on the printouts. Individuals' and partners' interpretations were then entered in the students' booklets. Tape-recorders were made available for recording conversations between pairs. The analysis of the data in the research report derives from a series of computer printouts accompanied by detailed student commentaries, together with field notes made by the researchers and two sets of taped discussions.

From a scrutiny of all the diploma student grids and commentaries, Fisher, Russell and McSweeney drew the following conclusions about students' changing reactions to their studies as the course progressed.

1 The over-riding student concerns were to do with anxiety and stress connected with the completion of assignments; such concerns, moreover, linked directly to the role of assessors.
2 Extrinsic factors took over from intrinsic ones; that is to say, finishing the course became more important than its intrinsic value.
3 Tutorial support was seen to provide a cushion against excessive stress and fear of failure. There was some evidence that tutors had not been particularly successful at defusing problems to do with external gradings.

The researchers were satisfied with the potential of 'Flexigrid' as a tool for course evaluation. Particularly pleasing was the high level of internal validity shown by the congruence of results from the focused grids and the content analysis of students' commentaries.

For further examples of repertory grid technique we refer the reader to: (a) Harré and Rosser's (1975) account of ethogenically oriented research into the rules governing disorderly behaviour among secondary school leavers, which parallels both the spirit and the approach of an extension of repertory grid described by Ravenette (1977); and (b) a study of student teachers' perceptions of the teaching practice situation (Osborne, 1977), which uses 13×13 matrices to elicit elements (significant role incumbents) and provides an example of Smith and Leach's (1972) use of hierarchical structures in repertory grids.

Grid technique and audio/video lesson recording

Parsons et al. (1983) show how grid technique and audio/video recordings of teachers' work in classrooms can be used to make explicit the 'implicit models' that teachers have of how children learn.

Fourteen children were randomly selected and, on the basis of individual photographs, triadic comparisons were made to elicit constructs concerning one teacher's ideas about the similarities and differences in the manner in which these children learned. In addition, extensive observations of the teacher's classroom behaviour were undertaken under naturalistic conditions, and verbatim recordings (audio and video) were made for future review and discussion between the teacher and the researchers at the end of each recording session.

What very soon became evident in these ongoing complementary analyses was the clear distinction that Mrs C (the teacher) drew between high and low achievers. The analysis of the children in class as shown in the videotapes revealed that not only did high and low achievers sit in separate groups, but the teacher's whole approach to these two groupings differed:

With high achievers, Mrs C would often adopt a 'working with' approach, i.e. verbalizing what children had done, with their help. When confronted with low achievers, Mrs C would more often ask 'why' they had tackled problems in a certain manner, and wait for an answer.

(Parsons et al., 1983)

Focused grids, non-verbal grids, exchange grids and sociogrids

A number of developments have been reported in the use of computer programmes in repertory grid research.5 We briefly identify these as follows:

1 Focusing a grid assists in the interpretation of raw grid data. Each element is compared with every other element, and the ordering of elements in the grid is changed so that those most alike are clustered most closely together. A similar rearrangement is made in respect of each construct.
2 Physical objects can be used as elements, and grid elicitation is then carried out in non-verbal terms. Thomas (1978) claims that this approach enhances the exploration of sensory and perceptual experiences.
3 Exchange grids are procedures developed to enhance the quality of conversational exchanges. Basically, one person's construing provides the format for an empty grid which is offered to another person for completion. The empty grid consists of the first person's verbal descriptions from which his ratings have been deleted. The second person is then invited to test his comprehension of the first person's point of view by filling in the grid as he believes the other has already completed it (a toy illustration is sketched after this list). Various computer programmes ('Pairs', 'Cores' and 'Difference') are available to assist analysis of the processes of negotiation elicited in exchange grids.
4 In the 'Pairs' analysis, all constructs in one grid are compared with all constructs in the other grid and a measure of commonality in construing is determined. 'Pairs' analysis leads on to 'Sociogrids', in which the pattern of relationships between the grids of one group can be identified. In turn, 'Sociogrids' can provide a mode grid for the whole group or a number of mode grids identifying cliques. 'Socionets', which reveal the pattern of shared construing, can also be derived.

With these brief examples, the reader will catch something of the flavour of what can be achieved using the various manifestations of repertory grid technique in the field of educational research.6

20 Multi-dimensional measurement

Introduction

However limited our knowledge of astronomy, most of us have learned to pick out certain clusterings of stars from the infinity of those that crowd the northern skies and to name them as the familiar Plough, Orion and the Great Bear. Few of us would identify constellations in the southern hemisphere that are instantly recognizable by those in Australia.

Our predilection for reducing the complexity of the elements that constitute our lives to a simpler order doesn't stop at star gazing. In numerous ways, each and every one of us attempts to discern patterns or shapes in seemingly unconnected events in order better to grasp their significance for us in the conduct of our daily lives. The educational researcher is no exception.

As research into a particular aspect of human activity progresses, the variables being explored frequently turn out to be more complex than was first realized. Investigation into the relationship between teaching styles and pupil achievement is a case in point. Global distinctions between behaviour identified as progressive or traditional, informal or formal, are vague and woolly, and have led inevitably to research findings that are at worst inconsistent and at best inconclusive. In reality, epithets such as informal or formal in the context of teaching and learning relate to 'multi-dimensional concepts', that is, concepts made up of a number of variables. 'Multi-dimensional scaling', on the other hand, is a way of analysing judgements of similarity between such variables in order that the dimensionality of those judgements can be assessed (Bennett and Bowers, 1977). As regards research into teaching styles and pupil achievement, it has been suggested that multi-dimensional typologies of teacher behaviour should be developed. Such typologies, it is believed, would enable the researcher to group together similarities in teachers' judgements about specific aspects of their classroom organization and management, and their ways of motivating, assessing and instructing pupils.
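For readers who want to see what 'assessing the dimensionality of judgements' amounts to in practice, here is a minimal classical multi-dimensional scaling sketch; the dissimilarity figures are invented, and this is only one of several MDS variants:

```python
# Classical (Torgerson) multi-dimensional scaling: recover coordinates
# whose distances approximate a matrix of dissimilarity judgements, so
# that their dimensionality can be inspected. Dissimilarities invented.

import numpy as np

d = np.array([[0.0, 1.0, 4.0, 4.2],     # four variables judged pairwise
              [1.0, 0.0, 4.1, 4.0],
              [4.0, 4.1, 0.0, 1.1],
              [4.2, 4.0, 1.1, 0.0]])

n = d.shape[0]
centre = np.eye(n) - np.ones((n, n)) / n
b = -0.5 * centre @ (d ** 2) @ centre    # double-centred squared distances
eigenvalues, eigenvectors = np.linalg.eigh(b)
order = np.argsort(eigenvalues)[::-1]    # largest eigenvalues first

# Two-dimensional coordinates; a sharp drop in the eigenvalues suggests
# the judgements are essentially low-dimensional.
coords = eigenvectors[:, order[:2]] * np.sqrt(np.abs(eigenvalues[order[:2]]))
print(np.round(coords, 2))
```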

Techniques for grouping such judgements are many and various. What they all have in common is that they are methods for 'determining the number and nature of the underlying variables among a large number of measures', a definition which Kerlinger (1970) uses to describe one of the best-known grouping techniques, 'factor analysis'. We begin the chapter by illustrating a number of methods of grouping or clustering variables, ranging from elementary linkage analysis, which can be undertaken by hand, to factor analysis, which is best left to the computer. We then outline one way of analysing data cast into multi-dimensional tables. Finally, we append a brief note on a recent, sophisticated technique for exploring multivariate data.

Elementary linkage analysis: an example

Seven constructs were elicited from an infant school teacher who was invited to discuss the ways in which she saw the children in her class (see Chapter 19). She identified favourable and unfavourable constructs as follows: 'intelligent' (+), 'sociable' (+), 'verbally good' (+), 'well-behaved' (+), 'aggressive' (−), 'noisy' (−) and 'clumsy' (−).


Four boys and six girls were then selected at random from the class register, and the teacher was asked to place each child in rank order under each of the seven constructs, using rank position 1 to indicate the child most like the particular construct, and rank position 10 the child least like it. The teacher's rank ordering is set out in Box 20.1. Notice that on three constructs the rankings have been reversed, in order to maintain the consistency of favourable 1, unfavourable 10.

Elementary linkage analysis (McQuitty, 1957) is one way of exploring the relationships between the teacher's personal constructs, that is, of assessing the dimensionality of the judgements that she makes about her pupils. It seeks to identify and define the clusterings of certain variables within a set of variables. Like factor analysis, which we shortly illustrate, elementary linkage analysis searches for interrelated groups of correlation co-efficients. The objective of the search is to identify 'types'. By type, McQuitty refers to 'a category of people or other objects (personal constructs in our example) such that the members are internally self-contained in being like one another'. Box 20.2 sets out the intercorrelations between the seven personal construct ratings shown in Box 20.1 (Spearman's rho is the method of correlation used in this example).

Steps in elementary linkage analysis

1 In Box 20.2, underline the strongest, that is the highest, correlation co-efficient in each column of the matrix. Ignore negative signs.
2 Identify the highest correlation co-efficient in the entire matrix. The two variables having this correlation constitute the first two members of Cluster 1.
3 Now identify all those variables which are most like the variables in Cluster 1. To do this, read along the rows of the variables which emerged in Step 2, selecting any of the co-efficients which are underlined in the rows. Box 20.3 illustrates diagrammatically the ways in which these new cluster members are related to the original pair which initially constituted Cluster 1.
4 Now identify any variables which are most like the variables elicited in Step 3. Repeat this procedure until no further variables are identified.

Box 20.1 Rank ordering of ten children on seven constructs
Source: Cohen, 1977

INTELLIGENT: (favourable) 1 Heather, 2 Richard, 3 Caroline, 4 Tim, 5 Patrick, 6 Sharon, 7 Janice, 8 Jane, 9 Alex, 10 Karen (unfavourable)

SOCIABLE: (favourable) 1 Caroline, 2 Richard, 3 Sharon, 4 Jane, 5 Tim, 6 Janice, 7 Heather, 8 Patrick, 9 Karen, 10 Alex (unfavourable)

AGGRESSIVE (rankings reversed): (favourable) 1 Janice, 2 Sharon, 3 Jane, 4 Heather, 5 Caroline, 6 Richard, 7 Karen, 8 Tim, 9 Patrick, 10 Alex (unfavourable)

NOISY (rankings reversed): (favourable) 1 Jane, 2 Sharon, 3 Janice, 4 Heather, 5 Richard, 6 Caroline, 7 Tim, 8 Karen, 9 Patrick, 10 Alex (unfavourable)

VERBALLY-GOOD: (favourable) 1 Richard, 2 Caroline, 3 Heather, 4 Janice, 5 Patrick, 6 Tim, 7 Alex, 8 Sharon, 9 Jane, 10 Karen (unfavourable)

CLUMSY (rankings reversed): (favourable) 1 Heather, 2 Caroline, 3 Janice, 4 Jane, 5 Sharon, 6 Richard, 7 Tim, 8 Karen, 9 Patrick, 10 Alex (unfavourable)

WELL-BEHAVED: (favourable) 1 Janice, 2 Jane, 3 Sharon, 4 Caroline, 5 Heather, 6 Richard, 7 Tim, 8 Karen, 9 Patrick, 10 Alex (unfavourable)


5 Excluding all those variables which belong within Cluster 1, repeat Steps 2 to 4 until all the variables have been accounted for. (A sketch of these steps in code is given below.)
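Under the assumption that reciprocal 'most like' links define cluster membership, the five steps can be sketched in code using the intercorrelations of Box 20.2, with signs ignored when comparing sizes, as in Step 1:

```python
# Elementary linkage analysis (McQuitty, 1957) applied to the Box 20.2
# matrix, decimal points restored. This is a sketch of the five steps
# above, not McQuitty's own program.

labels = ["intelligent", "sociable", "aggressive", "noisy",
          "verbally-good", "clumsy", "well-behaved"]
r = [[None,  .53, -.10, -.16,  .83, -.52,  .13],
     [ .53, None, -.50, -.59,  .44, -.56,  .61],
     [-.10, -.50, None,  .91, -.07,  .79, -.96],
     [-.16, -.59,  .91, None, -.01,  .73, -.93],
     [ .83,  .44, -.07, -.01, None, -.43,  .12],
     [-.52, -.56,  .79,  .73, -.43, None, -.81],
     [ .13,  .61, -.96, -.93,  .12, -.81, None]]

def strongest(i, pool):
    """The variable in pool most like i (largest absolute r): Step 1."""
    return max((j for j in pool if j != i), key=lambda j: abs(r[i][j]))

unclustered = set(range(len(labels)))
while unclustered:
    if len(unclustered) == 1:                     # lone leftover variable
        print([labels[unclustered.pop()]])
        break
    # Step 2: the pair with the highest coefficient starts the cluster.
    i, j = max(((a, strongest(a, unclustered)) for a in unclustered),
               key=lambda pair: abs(r[pair[0]][pair[1]]))
    cluster = {i, j}
    added = True
    while added:                                  # Steps 3 and 4
        added = False
        for k in list(unclustered - cluster):
            if strongest(k, unclustered) in cluster:
                cluster.add(k)
                added = True
    print(sorted(labels[c] for c in cluster))
    unclustered -= cluster                        # Step 5: repeat
```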

Cluster analysis: an example1

Elementary linkage analysis is one method of grouping or clustering together correlation co-efficients which show similarities among a set of variables. We now illustrate another method of clustering, which was used by Bennett (1976)2 in his study of teaching styles and pupil progress. His starting point was a disaffection for global descriptions such as 'progressive' and 'traditional' as applied to teaching styles in junior school classrooms. A more adequate theoretical and experimental conceptualization of the elements constituting teaching styles was attempted through the construction of a questionnaire containing twenty-eight statements illustrating six major areas of teacher classroom behaviour: classroom management and control; teacher control and sanctions; curriculum content and planning; instructional strategies; motivational techniques; and assessment procedures.

Bennett constructed a typology of teaching styles from the responses of 468 top-junior-school class teachers to the questionnaire. His cluster analysis of their responses involved calculating co-efficients of similarity between subjects across all the variables that constituted the final version of the questionnaire. This technique involves specifying the number of clusters of subjects to which the researcher wishes the data to be reduced. Examination of the central profiles of all solutions from twenty-two down to three clusters showed that at the twelve-cluster solution level, between-cluster differences were maximized in relation to within-cluster error (see Bennett, 1976). An essential prerequisite to the clustering technique employed in this study was the use of factor analysis to ensure that the variables were relatively independent of one another and that groups of variables were not over-weighted in the analysis. Principal components analysis followed by varimax rotation reduced the twenty-eight variables in Bennett's original questionnaire to the nineteen shown in Box 20.4. For purposes of exposition, Bennett ordered the types of teaching style shown in Box 20.4 from the most progressive cluster (Type 1) to the most traditional cluster (Type 12), noting, however, that whilst the extreme types could be described in these terms, the remaining types all contained elements of both progressive and traditional teaching styles. The figures in heavy typeface in the original table show percentage response levels that were considered significantly different from the total population distribution.

Box 20.2 Intercorrelations between seven personal constructs
Source: Cohen, 1977

                    (1)   (2)   (3)   (4)   (5)   (6)   (7)
Intelligent    (1)    —    53   –10   –16    83   –52    13
Sociable       (2)   53     —   –50   –59    44   –56    61
Aggressive     (3)  –10   –50     —    91   –07    79   –96
Noisy          (4)  –16   –59    91     —   –01    73   –93
Verbally-good  (5)   83    44   –07   –01     —   –43    12
Clumsy         (6)  –52   –56    79    73   –43     —   –81
Well-behaved   (7)   13    61   –96   –93    12   –81     —

(decimal points omitted)

Box 20.3 The structuring of relationships among the seven personal constructs (diagram not reproduced here)
Source: Cohen, 1977


Box 20.4 Central profiles (percentage occurrence) at the 12-cluster level
Source: Bennett, 1976

Item                                                          Type
                                                  1    2    3    4    5    6    7    8    9   10   11   12
1  Pupils have choice in where to sit            63   66   17   46   50   18    7   17    3    7   77    0
2  Pupils allocated to seating by ability        14   16   25    0   12   45   20    7   81   58    3   50
3  Pupils not allowed freedom of movement
   in the classroom                              49   38   83   76  100   84   87  100   86   97   97  100
4  Teacher expects pupils to be quiet            31   34   92   61   23   55   56   90   81   74   90  100
5  Pupils taken out of school regularly as
   normal teaching activity                      51   50   83   49   81   45   17   47   31   19   26   42
6  Pupils given homework regularly                9   22    8   27   65    3   13   43   36   29   21   56
Teaching emphasis
7  (i) Above average teacher talks to
   whole class                                   29   16   79   58   30   74   83   73   33   94   85   70
8  (ii) Above average pupils working in
   groups on teacher tasks                       46   13   83   12   77   92    3    3   22   68   10    8
9  (iii) Above average pupils working in
   groups of own choice                          89    3   29   94   19   32   13    3    0   23    0    0
10 (iv) Above average pupils working
   individually on teacher tasks                  9   97    0    3   42    0   73   83  100    0   72   92
11 (v) Above average pupils working
   individually on work of own choice            94    9   42   85   42   18   57   57    8    3    8   28
12 Pupils' work marked and graded                 3    3   13   15   31   16   33   33    8   32   31   97
13 Stars given to pupils who produce
   best work                                      9   31   38   55    8   18   17   73   17   87   69   75
14 Arithmetic tests given at least once
   a week                                         9    9   71   88  100    8   10   70   50   94   56   81
15 Spelling tests given at least once a week     23   19   67   94   92   18    7   73   92   94   87   92
16 Teacher smacks for persistent disruptive
   behaviour                                     34   34   96   24   31   45   80   93   42   68   64   58
17 Teacher sends pupil out of room for
   persistent disruptive behaviour               11   25   13    6    8    3    7   10   25    0   33   11
18 Allocation of teaching time: (i) above
   average separate subject teaching             20   31    4   82   81   95  100   47   81  100  100   92
19 (ii) Above average integrated subject
   teaching                                      97   91  100   24   65    8   10   93   14    7    0    0
N in cluster                                     35   32   24   33   26   38   30   30   36   31   39   36

Bennett described the twelve types of teaching style as follows:

1 These teachers favour integration of subject matter and, unlike most other groups, allow pupils choice of work, whether undertaken individually or in groups. Most allow pupils choice of seating. Less than half curb movement and talk. Assessment in all its forms—tests, grading and homework—appears to be discouraged. Intrinsic motivation is favoured.

2 These teachers also prefer integration of subject matter. Teacher control appears to be low, but less pupil choice of work is offered. However, most allow pupils choice of seating, and only one-third curb movement and talk. Few test or grade work.

3 The main teaching mode of this group is class teaching and group work. Integration of subject matter is preferred, associated with taking pupils out of school. They appear to be strict, most curbing movement and talk, and offenders are smacked. The amount of testing is average, but the amount of grading and homework is below average.

4 These teachers prefer separate subject teaching, but a high proportion allow pupil choice both in group and in individual work. None seat their pupils by ability. They test and grade more than average.

5 A mixture of separate subject and integrated subject teaching is characteristic of this group. The main teaching mode is pupils working in groups of their own choice on tasks set by the teacher. Teacher talk is lower than average. Control is high with regard to movement but not to talk. Most give tests every week and many give homework regularly. Stars are rarely used, and pupils are taken out of school regularly.

6 These teachers prefer to teach subjects separately, with emphasis on groups working on teacher-specified tasks. The amount of individual work is small. These teachers appear to be fairly low on control, and are below average on assessment and the use of extrinsic motivation.

7 This group is separate-subject oriented, with a high level of class teaching together with individual work. Teacher control appears to be tight; few allow movement or choice of seating, and offenders are smacked. Assessment, however, is low.

8 This group of teachers has very similar characteristics to those in Type 3, the difference being that these prefer to organize the work on an individual rather than a group basis. Freedom of movement is restricted, and most expect pupils to be quiet.

9 These teachers favour separate subject teaching, the predominant teaching mode being individuals working on tasks set by the teacher. Teacher control appears to be high; most curb movement and talk, and seat by ability. Pupil choice is minimal. Regular spelling tests are given, but few mark or grade work or use stars.

10 All these teachers favour separate subject teaching. The teaching mode favoured is teacher talk to the whole class, with pupils working in groups determined by the teacher on tasks set by the teacher. Most curb movement and talk, and over two-thirds smack for disruptive behaviour. There is regular testing, and most give stars for good work.

11 All members of this group stress separate subject teaching by way of class teaching and individual work. Pupil choice of work is minimal, although most teachers allow choice of seating. Movement and talk are curbed, and offenders smacked.

12 This is an extreme group in a number of respects. None favour an integrated approach. Subjects are taught separately by class teaching and individual work. None allow pupils choice of seating, and every teacher curbs movement and talk. These teachers are above average on all assessment procedures, and extrinsic motivation predominates.

Bennett's typology of teaching styles and his analysis of pupil performance based on the typology aroused considerable debate. Readers may care to follow up critical comments on the cluster analysis procedures we have outlined here.3 It is important to note, perhaps, how times have changed since this study was undertaken—many of the practices that Bennett describes would be considered illegal today!
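Although Bennett's own procedure involved purpose-built software, the underlying idea of forcing profiles into a pre-specified number of clusters and comparing solutions can be conveyed with a small k-means-style sketch on invented data:

```python
# A k-means-style sketch of profile clustering: invented teachers'
# binary questionnaire profiles are grouped into k clusters, and
# solutions with different k are compared on within-cluster error.

import random

random.seed(0)
teachers = [[random.randint(0, 1) for _ in range(19)]   # 19 items,
            for _ in range(60)]                         # 60 teachers

def kmeans(data, k, iterations=20):
    centres = random.sample(data, k)
    clusters = []
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for row in data:                       # assign to nearest centre
            nearest = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(row, centres[c])))
            clusters[nearest].append(row)
        centres = [[sum(col) / len(cl) for col in zip(*cl)] if cl else c
                   for cl, c in zip(clusters, centres)]
    return sum(sum((a - b) ** 2 for a, b in zip(row, centres[i]))
               for i, cl in enumerate(clusters) for row in cl)

for k in (3, 6, 12):     # compare solutions, as Bennett compared 22 to 3
    print(k, "clusters: within-cluster error =", round(kmeans(teachers, k), 1))
```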


Factor analysis: an example

Factor analysis, we said earlier, is a way of determining the nature of underlying patterns among a large number of variables. It is particularly appropriate in research where investigators aim to impose an 'orderly simplification' (Child, 1970) upon a number of interrelated measures. We illustrate the use of factor analysis in a study of occupational stress among teachers (McCormick and Solman, 1992).
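Before turning to the study itself, a minimal principal-components sketch shows what such an 'orderly simplification' looks like; the data are invented, and principal components stand in here for whatever factoring method the authors actually used:

```python
# Principal-components factor extraction on invented Likert-type data:
# six items built around two latent influences are reduced to two
# factors; the loadings are item-factor correlations.

import numpy as np

rng = np.random.default_rng(0)
factor_a, factor_b = rng.normal(size=(2, 200))    # 200 invented respondents
items = np.column_stack(
    [factor_a + rng.normal(scale=0.6, size=200) for _ in range(3)] +
    [factor_b + rng.normal(scale=0.6, size=200) for _ in range(3)])

corr = np.corrcoef(items, rowvar=False)           # 6 x 6 correlations
eigenvalues, eigenvectors = np.linalg.eigh(corr)
order = np.argsort(eigenvalues)[::-1]             # largest first
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

keep = eigenvalues > 1                            # Kaiser criterion
loadings = eigenvectors[:, keep] * np.sqrt(eigenvalues[keep])
print("factors retained:", int(keep.sum()))
print(np.round(loadings, 2))                      # two clear factors
```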

Despite a decade or so of sustained research, the concept of occupational stress still causes difficulties for researchers intent upon obtaining objective measures in such fields as the physiological and the behavioural, because of the wide range of individual differences. Moreover, subjective measures such as self-reports, by their very nature, raise questions about the external validation of respondents' revelations. This latter difficulty notwithstanding, McCormick and Solman (1992) chose the methodology of self-report as the way into the problem, dichotomizing it into, first, the teacher's view of self, and second, the external world as it is seen to impinge upon the occupation of teaching. Stress, according to the researchers, is considered as 'an unpleasant and unwelcome emotion' whose negative effect for many is 'associated with illness of varying degree' (McCormick and Solman, 1992). They began their study on the basis of the following premisses:

1 Occupational stress is an undesirable and negative response to occupational experiences.
2 To be responsible for one's own occupational stress can indicate a personal failing.

Drawing on attribution theory, McCormick and Solman consider that the idea of blame is a key element in a framework for the exploration of occupational stress. The notion of blame for occupational stress, they assert, fits in well with tenets of attribution theory, particularly in terms of attribution of responsibility having a self-serving bias.4 Taken in concert with organizational facets of schools, the researchers hypothesized that teachers would 'externalize responsibility for their stress increasingly to increasingly distant and identifiable domains' (McCormick and Solman, 1992). Their selection of dependent and independent variables in the research followed directly from this major hypothesis.

McCormick and Solman developed a questionnaire instrument that included thirty-two items to do with occupational satisfaction. These were scored on a continuum ranging from 'strongly disagree' to 'strongly agree'. Thirty-eight further items had to do with possible sources of occupational stress. Here, respondents rated the intensity of the stress they experienced when exposed to each source. Stress items were judged on a scale ranging from 'no stress' to 'extreme stress'. In yet another section of the questionnaire, respondents rated how responsible they felt certain nominated persons or institutions were for the occupational stress that they, the respondents, experienced. These entities included self, pupils, superiors, the Department of Education, the Government and society itself. Finally, the teacher-participants were asked to complete a fourteen-item Locus of Control scale, giving a measure of internality/externality. 'Internals' are people who see outcomes as a function of what they themselves do; 'externals' see outcomes as a result of forces beyond their control. The items included in this lengthy questionnaire arose partly from statements about teacher stress used in earlier investigations, but mainly as a result of hunches about blame for occupational stress that the researchers derived from attribution theory. As Child (1970) observes:

In most instances, the factor analysis is preceded by a hunch as to the factors that might emerge. In fact, it would be difficult to conceive of a manageable analysis which started in an empty-headed fashion… Even the 'let's see what happens' approach is pretty sure to have a hunch at the back of it somewhere. It is this testing and the generation of hypotheses which forms the principal concern of most factor analysts.

(Child, 1970)


The 90-plus-item inventory was completed by 387 teachers. Separate correlation matrices composed of the inter-correlations of the 32 items on the satisfaction scale, the 8 items in the persons/institutions responsibility measure and the 38 items on the stress scale were factor analysed.

The technical details of factor analysis are beyond the scope of this book. Briefly, however, the procedures followed by McCormick and Solman involved a method called Principal Components, by means of which factors or groupings are extracted. These are rotated to produce a more meaningful interpretation of the underlying structure than that provided by the Principal Components method. (Readable accounts of factor analysis may be found in Kerlinger (1970) and Child (1970).)
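As an illustration of this extract-then-rotate sequence, here is a minimal sketch in Python. The respondents and items are simulated stand-ins (not McCormick and Solman's data or software), and the varimax routine is one standard formulation of Kaiser's rotation, shown only to make the two steps concrete.

```python
# A minimal sketch: principal components extracted from a correlation
# matrix, then varimax-rotated. All data below are simulated.
import numpy as np

def varimax(loadings, max_iter=100, tol=1e-6):
    """Kaiser's varimax rotation of a loading matrix (standard algorithm)."""
    p, k = loadings.shape
    rotation = np.eye(k)
    variance = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        u, s, vt = np.linalg.svd(
            loadings.T @ (rotated ** 3
                          - (rotated / p) @ np.diag((rotated ** 2).sum(axis=0)))
        )
        rotation = u @ vt
        if s.sum() - variance < tol:
            break
        variance = s.sum()
    return loadings @ rotation

rng = np.random.default_rng(0)
scores = rng.normal(size=(387, 8))        # 387 respondents, 8 items (simulated)

corr = np.corrcoef(scores, rowvar=False)  # inter-item correlation matrix
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]         # sort components by eigenvalue
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

keep = eigvals > 1.0                      # Kaiser criterion for extraction
loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])

print(np.round(varimax(loadings), 2))     # rotated factor loadings
```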

In the factor analysis of the eight-item responsibility for stress measure, the researchers identified three factors. Box 20.5 shows those three factors with what are called their 'factor loadings'. These are like correlation co-efficients, ranging from -1.0 to +1.0, and are interpreted similarly. That is to say, they indicate the correlation between the person/institution responsibility items shown in Box 20.5 and the factors.

Looking at Factor 1, 'School structure', for example, it can be seen that the three items loading on it are, in descending order of weight, superiors (0.85), school organization (0.78) and peers (0.77). 'School structure' as a factor, the authors suggest, is easily identified and readily explained. But what of Factor 3, 'Teacher-student relationships', which includes the variables students, society and yourself? McCormick and Solman (1992) proffer the following tentative interpretation:

An explanation for the inclusion of the variable 'yourself' in this factor is not readily at hand. Clearly, the difference between the variable 'yourself' and the 'students' and 'society' variables is that only 20% of these teachers rated themselves as very or extremely responsible for their own stress, compared to 45% and 60% respectively for the latter two. Possibly the degree of responsibility which teachers attribute to themselves for their occupational stress is associated with their perceptions of their part in controlling student behaviour. This would seem a reasonable explanation, but requiring further investigation.

(McCormick and Solman, 1992)

Box 20.6 shows the factors derived from the analysis of the thirty-eight occupational stress items. Five factors were extracted. They were named: 'Student domain', 'External (to school) domain', 'Time demands', 'School domain' and 'Personal domain'. Whilst a detailed discussion of the factors and their loadings is inappropriate here, we draw readers' attention to one or two interesting findings. Notice, for example, how the second factor, 'External (to school) domain', is consistent with the factoring of the responsibility for stress items reported in Box 20.5. That is to say, the variables to do with the Government and the Department of Education have loaded on the same factor. The researchers venture this further elaboration of the point:

when a teacher attributes occupational stress to the Department of Education, it is not as a member of the Department of Education, although such, in fact, is the case. In this context, the Department of Education is outside 'the system to which the teacher belongs', namely the school. A similar argument can be posed for the nebulous concept of Society. The Government is clearly a discrete political structure.

(McCormick and Solman, 1992)

Box 20.5: Factor analysis of responsibility for stress items

Factor groupings of responsibility items with factor loadings and (rounded) percentages of teachers responding in the two most extreme categories of much stress and extreme stress.

                                          Loading   Percentage
Factor 1: School structure
  Superiors                               0.85      29
  School organization                     0.78      31
  Peers                                   0.77      13
Factor 2: Bureaucratic authority
  Department of Education                 0.89      70
  Government                              0.88      66
Factor 3: Teacher-student relationships
  Students                                0.85      45
  Society                                 0.60      60
  Yourself                                0.50      20

Source: McCormick and Solman, 1992


Box 20.6: Factor analysis of the occupational stress items

Factor groupings of stress items with factor loadings and (rounded) percentages of teachers responding to the two extremes of much stress and extreme stress.

                                                                Loading   Percentage
Factor 1: Student domain
  Poor work attitudes of students                               0.79      49
  Difficulty in motivating students                             0.75      44
  Having to deal with students who constantly misbehave         0.73      57
  Inadequate discipline in the school                           0.70      47
  Maintaining discipline with difficult classes                 0.64      55
  Difficulty in setting and maintaining standards               0.63      26
  Verbal abuse by students                                      0.62      39
  Students coming to school without necessary equipment         0.56      23
  Deterioration of society's control over children              0.49      55
Factor 2: External (to school) domain
  The Government's education policies                           0.82      63
  The relationship which the Department of Education has
    with its schools                                            0.80      55
  Unrealistic demands from the Department of Education          0.78      63
  The conviction that the education system is getting worse    0.66      49
  Media criticism of teachers                                   0.64      52
  Lack of respect in society for teachers                       0.63      56
  Having to implement Departmental policies                     0.59      38
  Feeling of powerlessness                                      0.55      44
Factor 3: Time demands
  Insufficient time for personal matters                        0.74      43
  Just not enough time in the school day                        0.74      51
  Difficulty of doing a good job in the classroom because of
    other delegated responsibilities                            0.73      43
  Insufficient time for lesson preparation and marking          0.69      50
  Excessive curriculum demands                                  0.67      49
  Difficulty in covering the syllabus in the time available     0.61      37
  Demanding nature of the job                                   0.58      64
Factor 4: School domain
  Lack of support from the principal                            0.83      21
  Not being appreciated by the principal                        0.83      14
  Principal's reluctance to make tough decisions                0.77      30
  Lack of opportunity to participate in school decision-making  0.74      16
  Lack of support from other colleagues                         0.57      11
  Lack of a supportive and friendly atmosphere                  0.55      17
  Things happen at school over which you have no control        0.41      36
Factor 5: Personal domain
  Personal failings                                             0.76      13
  Feeling of not being suited to teaching                       0.72      10
  Having to teach a subject for which you are not trained       0.64      23

Source: McCormick and Solman, 1992


'School domain', Factor 4 in Box 20.6, consists of items concerned with support from the school principal and colleagues as well as the general nurturing atmosphere of the school. Of particular interest here is that teachers report relatively low levels of stress for these items. Box 20.7 reports the factor analysis of the thirty-two items to do with occupational satisfaction. Five factors were extracted and named as 'Supervision', 'Income', 'External demands', 'Advancement' and 'School culture'. Again, space precludes a full outline of the results set out in Box 20.7. Notice, however, an apparent anomaly in the first factor, 'Supervision'. Responses to items to do with teachers' supervisors and recognition seem to indicate that, in general, teachers are satisfied with their supervisors, but feel that they receive too little recognition. Box 20.7 shows that 21 per cent of teacher-respondents agree or strongly agree that they receive too little recognition, yet 52 per cent agree or strongly agree that they do receive recognition from their immediate supervisors. McCormick and Solman offer the following explanation:

The difference can be explained, in the first instance, by the degree or amount of recognition given. That is, immediate supervisors give recognition, but not enough. Another interpretation is that superiors other than the immediate supervisor do not give sufficient recognition for their work.

(McCormick and Solman, 1992)

Here is a clear case for some form of respondent validation (see Chapter 5).

Having identified the underlying structures of occupational stress and occupational satisfaction, the researchers then went on to explore the relationships between stress and satisfaction by using a technique called 'canonical correlation analysis'. The technical details of this procedure are beyond the scope of this book. Interested readers are referred to Levine, who suggests that 'the most acceptable approach to interpretation of canonical variates is the examination of the correlations of the original variables with the canonical variate' (Levine, 1984). This is the procedure adopted by McCormick and Solman.

From Box 20.8 we see that factors having high correlations with Canonical Variate 1 are Stress: Student domain (-0.82) and Satisfaction: External demands (0.72). The researchers offer the following interpretation of this finding:

[This] indicates that teachers perceive that 'non-teachers' or outsiders expect too much of them (External demands) and that stress results from poor student attitudes and behaviour (Student domain). One interpretation might be that for these teachers, high levels of stress attributable to the Student domain are associated with low levels of satisfaction in the context of demands from outside the school, and vice versa. It may well be that, for some teachers, high demand in one of these is perceived as affecting their capacity to cope or deal with the demands of the other. Certainly, the teacher who is experiencing the urgency of a struggle with student behaviour in the classroom is unlikely to think of the requirements of persons and agencies outside the school as important.

(McCormick and Solman, 1992)

The outcomes of their factor analyses frequently puzzle researchers. Take, for example, one of the loadings on the third canonical variate. There, we see that the stress factor 'Time demands' correlates negatively (-0.52). One might have supposed, the authors say, that stress attributable to the external domain would have correlated with the variate in the same direction. But this is not so. It correlates positively at 0.80. One possible explanation, they suggest, is that an increase in stress experienced because of time demands coincides with a lowering of stress attributable to the external domain, as time is expended in meeting demands from the external domain. The researchers concede, however, that this explanation would need close examination before it could be accepted.


Box 20.7: Factor analysis of the occupational satisfaction items

Factor groupings of satisfaction items with factor loadings and (rounded) percentages of teacher responses in the two positive extremes of 'strongly agree' and 'agree' for positive statements, or 'strongly disagree' and 'disagree' for statements of a negative nature; the latter items were reversed for analysis and are indicated by *

                                                                Loading   Percentage
Factor 1: Supervision
  My immediate supervisor does not back me up*                  0.83      70
  I receive recognition from my immediate supervisor            0.80      52
  My immediate supervisor is not willing to listen*             0.78      68
  My immediate supervisor makes me feel uncomfortable*          0.78      66
  My immediate supervisor treats everyone equitably             0.68      62
  My superiors do not appreciate what a good job I do*          0.66      39
  I receive too little recognition*                             0.51      21
Factor 2: Income
  My income is less than I deserve*                             0.80      10
  I am well paid in proportion to my ability                    0.78       8
  My income from teaching is adequate                           0.78      19
  My pay compares well with other non-teaching jobs             0.66       6
  Teachers' income is barely enough to live on*                 0.56      24
Factor 3: External demands
  Teachers have an excessive workload*                          0.72       5
  Teachers are expected to do too many non-teaching tasks*      0.66       4
  People expect too much of teachers*                           0.56       4
  There are too many changes in education*                      0.53      10
  I am satisfied with the Department of Education as an
    employer                                                    0.44      12
  People who aren't teachers do not understand the realities
    in schools*                                                 0.34       1
Factor 4: Advancement
  Teaching provides me with an opportunity to advance
    professionally                                              0.76      37
  I am not getting ahead in my present position*                0.67      16
  The Government is striving for a better education system      0.54      22
  The Department of Education is concerned for teachers'
    welfare                                                     0.52       6
Factor 5: School culture
  I am happy to be working at this particular school            0.77      73
  Working conditions in my school are good                      0.64      46
  Teaching is very interesting work                             0.60      78

Source: McCormick and Solman, 1992


McCormick and Solman's questionnaire also elicited biographical data from the teacher-respondents in respect of sex, number of years teaching, type and location of school and position held in school. By rescoring the stress items on a scale ranging from 'No stress' (1) to 'Extreme stress' (5) and using the means of the factor scores, the researchers were able to explore associations between the degree of perceived occupational stress and the biographical data supplied by participants. Space precludes a full account of McCormick and Solman's findings. We illustrate two or three significant results in Box 20.9. In the School domain more stress was reported by secondary school teachers than by their colleagues teaching younger pupils, not really a very surprising result, the researchers observe, given that infant/primary schools are generally much smaller than their secondary counterparts and that teachers are more likely to be part of a smaller, supportive group. In the domain of Time demands, females experienced more stress than males, a finding consistent with that of other research. In the Personal domain, a significant difference was found in respect of the school's location, the level of occupational stress increasing from the rural setting, through the country/city to the metropolitan area. To conclude, factor analysis techniques are ideally suited to studies such as that of McCormick and Solman in which lengthy questionnaire-type data are elicited from a large number of participants and where researchers are concerned to explore underlying structures and relationships between dependent and independent variables. Inevitably, such tentative explorations raise as many questions as they answer.

Box 20.8: Correlations between (dependent) stress and (independent) satisfaction factors and canonical variates

                                   Canonical variates
                                   1         2         3
Stress factors
  Student domain                  -0.82     -0.47      0.05
  External (to school) domain     -0.15     -0.04      0.80
  Time                            -0.43      0.17     -0.52
  School domain                   -0.34      0.86      0.16
  Personal domain                  0.09     -0.05     -0.25
Satisfaction factors
  Supervision                      0.23     -0.91      0.32
  Income                           0.45      0.13      0.12
  External demands                 0.72      0.33      0.28
  Advancement                      0.48     -0.04     -0.71

Source: adapted from McCormick and Solman, 1992

Box 20.9: Biographical data and stress factors

Source: McCormick and Solman, 1992

Examples of studies using factor analysis and linkage analysis

The use of factor analysis and linkage analysis in studies of children's judgements of educational situations is illustrated in the work of Magnusson (1971) and Ekehammar and Magnusson (1973).

In the latter study, pupils were required to rate descriptions of various educational episodes on a scale of perceived similarity ranging from '0 = not at all similar' to '4 = identical'. Twenty different situations were presented, two at a time, in the same randomized order for all subjects. For example, 'listening to a lecture but do not understand a thing' would be judged against 'sitting at home writing an essay'. Product moment correlation co-efficients between pairs of similarity matrices calculated for all subjects varied between 0.57 and 0.79, with a median value of 0.71. No individual matrix deviated markedly from any of the others. A factor analysis of the total correlation matrix showed that the descriptions of situations had very clear structures for the children involved. Moreover, judgements of perceived similarity between situations had a considerable degree of consistency over time. Ekehammar and Magnusson (1973) compared their dimensional analysis with a categorical approach to the data using elementary linkage analysis (McQuitty, 1957). They reported that this latter approach gave a result which was entirely in agreement with the result of the dimensional analysis. Five categories of situations were obtained, with the same situations distributed in categories in the same way as they were distributed in factors in the dimensional analysis.
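For readers curious about the mechanics, a rough sketch of elementary linkage analysis follows. It is a plain reading of McQuitty's rule (reciprocal pairs of mutually highest correlates seed the clusters; remaining variables join the cluster of their highest correlate), not a reproduction of the original procedure, and the correlation matrix is randomly generated purely for illustration.

```python
# A rough sketch of McQuitty-style elementary linkage analysis on a
# random, purely illustrative correlation matrix.
import numpy as np

rng = np.random.default_rng(2)
data = rng.normal(size=(100, 6))
r = np.abs(np.corrcoef(data, rowvar=False))
np.fill_diagonal(r, 0.0)                 # ignore self-correlations

highest = r.argmax(axis=1)               # each variable's highest correlate

cluster_of = {}
clusters = []
# Reciprocal pairs (i's highest is j and j's highest is i) seed clusters.
for i in range(len(highest)):
    j = highest[i]
    if highest[j] == i and i < j:
        clusters.append({i, j})
        cluster_of[i] = cluster_of[j] = len(clusters) - 1

# Attach every remaining variable to the cluster of its highest correlate.
changed = True
while changed:
    changed = False
    for i in range(len(highest)):
        if i not in cluster_of and highest[i] in cluster_of:
            cluster_of[i] = cluster_of[highest[i]]
            clusters[cluster_of[i]].add(i)
            changed = True

print(clusters)
```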

Examples of studies using multi-dimensional scaling and cluster analysis

Forgas (1976) studied housewives' and students' perceptions of typical social episodes in their lives, the episodes having been elicited from the respective groups by means of a diary technique. Subjects were required to supply two adjectives to describe each of the social episodes they had recorded as having occurred during the previous twenty-four hours. From a pool of some 146 adjectives thus generated, ten (together with their antonyms) were selected on the basis of their salience, their diversity of usage and their independence of one another. Two more scales from speculative taxonomies were added to give twelve unidimensional scales purporting to describe the underlying episode structures. These scales were used in the second part of the study to rate twenty-five social episodes in each group, the episodes being chosen as follows. An 'index of relatedness' was computed on the basis of the number of times a pair of episodes was placed in the same category by respective housewife and student judges. Data were aggregated over the total number of subjects in each of the two groups. The twenty-five 'top' social episodes in each group were retained. Forgas's analysis is based upon the ratings of twenty-six housewives and twenty-five students of their respective twenty-five episodes on each of the twelve unidimensional scales. Box 20.10 shows a three-dimensional configuration of twenty-five social episodes rated by the student group on three of the scales. For illustrative purposes some of the social episodes numbered in Box 20.10 are identified by specific content.

In another study, Forgas examined the social environment of a university department consisting of tutors, students and secretarial staff, all of whom had interacted both inside and outside the department for at least six months prior to the research and thought of themselves as an intensive and cohesive social unit. Forgas's interest was in the relationship between two aspects of the social environment of the department—the perceived structure of the group and the perceptions that were held of specific social episodes. Participants were required to rate the similarity between each possible pairing of group members on a scale ranging from '1 = extremely similar' to '9 = extremely dissimilar'. An individual differences multi-dimensional scaling procedure (INDSCAL) produced an optimal three-dimensional configuration of group structure accounting for 68 per cent of the variance, group members being differentiated along the dimensions of sociability, creativity and competence.

A semi-structured procedure requiring participants to list typical and characteristic interaction situations was used to identify a number of social episodes. These in turn were validated


by participant observation of the ongoing activities of the department. The most commonly occurring social episodes (those mentioned by nine or more members) served as the stimuli in the second stage of the study. Bi-polar scales similar to those reported by Forgas (1976) and elicited in like manner were used to obtain group members' judgements of social episodes.

An interesting finding reported by Forgas was that formal status differences exercised no significant effect upon the perception of the group by its members, the absence of differences being attributed to the strength of the department's cohesiveness and intimacy.

In Forgas's analysis of the group's perceptions of social episodes, the INDSCAL scaling procedure produced an optimal four-dimensional solution accounting for 62 per cent of the variance, group members perceiving social episodes in terms of anxiety, involvement, evaluation and social-emotional versus task orientation. Box 20.11 illustrates how an average group member would see the characteristics of various social episodes in terms of the dimensions by which the group commonly judged them.
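A flavour of the scaling step can be had from the short sketch below, which applies plain metric multi-dimensional scaling from scikit-learn to a simulated dissimilarity matrix. INDSCAL itself, the individual-differences procedure Forgas used, is not part of that library; the sketch only conveys the idea of recovering a low-dimensional configuration from (dis)similarity judgements.

```python
# An illustrative sketch of metric multi-dimensional scaling on a
# simulated matrix of dissimilarities between social episodes.
import numpy as np
from sklearn.manifold import MDS

rng = np.random.default_rng(3)
n_episodes = 25

# Hypothetical averaged dissimilarities (similarity ratings reversed,
# so that 0 = identical and 4 = not at all similar).
d = rng.uniform(0, 4, size=(n_episodes, n_episodes))
d = (d + d.T) / 2                       # symmetrize the matrix
np.fill_diagonal(d, 0.0)                # an episode is identical to itself

mds = MDS(n_components=3, dissimilarity="precomputed", random_state=0)
config = mds.fit_transform(d)           # 25 episodes placed in 3 dimensions
print(config.shape, round(mds.stress_, 2))
```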

Finally, we outline a classificatory system that has been developed to process materials elicited

Box 20.10: Students' perceptions of social episodes

Source: adapted from Forgas, 1976


in a rather structured form of account gathering. Peevers and Secord's study of developmental changes in children's use of descriptive concepts of persons illustrates the application of quantitative techniques to the analysis of one form of account.

In individual interviews, children of varying ages were asked to describe three friends and one person whom they disliked, all four people being of the same sex as the interviewee. Interviews were tape-recorded and transcribed. A person-concept coding system was developed, the categories of which are illustrated in Box 20.12. Each person-description was divided into items, each item consisting of one discrete piece of information. Each item was then coded on each of four major dimensions. Detailed coding procedures are set out in Peevers and Secord (1973).

Tests of interjudge agreement on descriptiveness, personal involvement and evaluative consistency, in which two judges worked independently on the interview transcripts of twenty-one boys and girls aged between five and sixteen years, resulted in interjudge agreement on those three dimensions of 87 per cent, 79 per cent and 97 per cent respectively.

Peevers and Secord also obtained evidence of the degree to which the participants themselves were consistent from one session to another in their use of concepts to describe other people.

Box 20.11: Perception of social episodes

Source: adapted from Forgas, 1978


Children were reinterviewed between one week and one month after the first session on the pretext of problems with the original recordings. Indices of test-retest reliability were computed for each of the major coding dimensions. Separate correlation co-efficients (eta) were obtained for younger and older children in respect of their descriptive concepts of liked and disliked peers.

Reliability co-efficients are as set out in Box 20.13. Secord and Peevers (1974) conclude that their approach offers the possibility of an exciting line of inquiry into the depth of insight that individuals have into the personalities of their acquaintances. Their 'free commentary' method is a modification of the more structured interview, requiring the interviewer to probe for explanations of why a person behaves the way he or she does or why a person is the kind of person he or she is. Peevers and Secord found that older children in their sample readily volunteered this sort of information. Harré (1977b) observes that this approach could also be extended to elicit commentary upon children's friends and enemies and the ritual actions associated with the creation and maintenance of these categories.

Box 20.12: Person concept coding system

Descriptiveness (levels of descriptiveness):
  1 Undifferentiating (person not differentiated from his environment)
  2 Simple differentiating (person differentiated in simple global terms)
  3 Differentiating (person differentiated in specific characteristics)
  4 Dispositional (person differentiated in terms of traits)

Personal involvement (degrees of involvement):
  1 Egocentric (other person described in self-oriented terms)
  2 Mutual (other person described in terms of his relationship to perceiver)
  3 Other-oriented (no personal involvement expressed by perceiver)

Evaluative consistency (amount of consistency):
  1 Consistent (nothing favourable about 'disliked', nothing unfavourable about 'liked')
  2 Inconsistent (some mixture of favourableness and unfavourableness)

Depth (levels of depth):
  Level 1 (includes all undifferentiated and simple differentiated descriptions)
  Level 2 (includes differentiated and some dispositional descriptions)
  Level 3 (includes explanation-type differentiated and dispositional descriptions)

Source: adapted from Peevers and Secord, 1973

Box 20.13: Reliability co-efficients for peer descriptions

                            Liked peers                    Disliked peers
Dimension                   Younger     Older              Younger     Older
Descriptiveness             0.83        0.91               0.80        0.84
Personal involvement        0.76        0.80               0.84        0.77
Depth                       0.65        0.71               0.65        0.75
Evaluative consistency      0.69        0.92               0.76        0.69

Source: Peevers and Secord, 1973





Multi-dimensional tables

A frequently used statistic for a 2×2 contingency table is the chi-square (χ²) statistic. The chi-square statistic measures the difference between a statistically generated expected result and an actual result to see if there is a significant difference between them, i.e. to see if the frequencies observed are significant; it is a measure of 'goodness of fit' between an expected and an actual result or set of results. The expected result is based on a statistical process discussed below. The chi-square statistic addresses the notion of statistical significance, itself based on notions of probability.

For a chi-square statistic, data are set into a contingency table, an example of which can be seen below: a 2×3 contingency table, i.e. two horizontal rows and three columns (contingency tables may contain more than this number of variables). The example presents data concerning sixty students' entry into science, arts and humanities in a college, and whether the students were male or female (Morrison, 1993:132–4). The lower of the two figures in each cell is the number of actual students who have opted for the particular subjects (sciences, arts, humanities). The upper of the two figures in each cell is what might be expected purely by chance to be the number of students opting for each of the particular subjects. The figure is arrived at by statistical computation, hence the decimal fractions for the figures. What is of interest to the researcher is whether the actual distribution of subject choice by males and females differs significantly from that which could occur by chance variation in the population of college entrants.

The researcher begins with the hypothesis that there is no significant difference between the actual results noted and what might be expected to occur by chance in the wider population (the null hypothesis). When the chi-square statistic is calculated, if the observed, actual distribution differs from that which might be expected to occur by chance alone, then the researcher has to determine whether that difference is statistically significant, i.e. whether to reject the null hypothesis.

Our example using sixty students, using a chi-square formula (available in most books on statistics), yields a final chi-square figure of 14.64; this is the figure computed from the sample of 60 college entrants. The researcher then refers to tables of the distribution of chi-square (given in most books on social science statistics) and looks up the figure to see if it indicates a statistically significant difference from that occurring by chance. Part of the chi-square distribution table is shown here:

Degrees of freedom      Level of significance
                        0.05        0.01
3                       7.81        11.34
4                       9.49        13.28
5                       11.07       15.09
6                       12.59       16.81

The researcher will see that the 'degrees of freedom' (a mathematical construct that is related to the number of restrictions that have been placed on the data) have to be identified. In many cases, to establish the degrees of freedom, one simply takes 1 away from the total number of rows of


the contingency table and 1 away from the total number of columns and adds them; in this case it is (2-1)+(3-1) = 3 degrees of freedom. Degrees of freedom are discussed later in this chapter. (Other formulae for ascertaining degrees of freedom hold that the number is the total number of cells minus one, the method set out later in this chapter; the most widely used formula for an r × c contingency table is (r-1) × (c-1), which gives 2 degrees of freedom here. The conclusion below is unaffected, since 14.64 exceeds the 0.01 critical value on any of these reckonings.) In our example above, the researcher looks along the table from the entry for the three degrees of freedom and notes that the figure calculated—of 14.64—is statistically significant at the 0.01 level, i.e. is higher than the required 11.34, indicating that the results obtained—the distributions of the actual data—could not have occurred simply by chance. The null hypothesis is rejected at the 0.01 level of significance. Interpreting the specific figures of the contingency table in educational rather than statistical terms, noting (a) the low incidence of females in the science subjects and the high incidence of females in the arts and humanities subjects, and (b) the high incidence of males in the science subjects and low incidence of males in the arts and humanities, the researcher would say that this distribution is significant—suggesting, perhaps, that the college needs to consider action, possibly to encourage females into science subjects and males into arts and humanities.

There are numerous statistical packages available for computer use that will process the calculations for most researchers; they will simply need to enter the raw data and the computer will process the data and indicate the level of statistical significance of the distributions. A much-used package is the Statistical Package for the Social Sciences (SPSS), which will process these data using the CROSSTABS command sequence.
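For readers without SPSS, the same computation can be sketched in Python with scipy. The 2×3 table below is invented for illustration and does not reproduce the college figures discussed above; note also that scipy reports the (r-1) × (c-1) degrees of freedom.

```python
# A sketch of the chi-square computation with scipy; the counts are
# hypothetical, not the book's actual student data.
from scipy.stats import chi2_contingency

observed = [[20,  5,  5],    # males choosing sciences, arts, humanities
            [ 5, 15, 10]]    # females choosing sciences, arts, humanities

chi2, p, dof, expected = chi2_contingency(observed)
print(f"chi-square = {chi2:.2f}, df = {dof}, p = {p:.4f}")
print(expected)              # the frequencies expected by chance alone
```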

The chi-square test requires at least 80 per cent of the cells of a contingency table to contain at least five cases if confidence is to be placed in the results. This means that it may not be feasible to calculate the chi-square statistic if only a small sample is being used. Hence the researcher would tend to use this statistic for larger-scale survey data. Other approaches could be used if the problem of low cell frequencies obtains (Cohen and Holliday, 1996).
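One such alternative, sketched below with invented counts, is Fisher's exact test, which scipy provides for 2×2 tables.

```python
# Fisher's exact test for a small 2x2 table; the counts are invented.
from scipy.stats import fisher_exact

table = [[3, 9],
         [8, 2]]
odds_ratio, p = fisher_exact(table)
print(f"odds ratio = {odds_ratio:.2f}, exact p = {p:.4f}")
```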

Methods of analyzing data cast into 2×2 contingency tables by means of the chi-square test are generally well covered in research methods books. Increasingly, however, educational data are classified in multiple rather than two-dimensional formats. Everitt (1977) provides a useful account of methods for analyzing multi-dimensional tables and has shown, incidentally, the erroneous conclusions that can result from the practice of analyzing multi-dimensional data by summing over variables to reduce them to two-dimensional formats. In this section we too illustrate the misleading conclusions that can arise when the researcher employs bivariate rather than multivariate analysis. The outline that follows draws closely on an exposition by Whiteley (1983).5

Multi-dimensional data: some words on notation

The hypothetical data in Box 20.14 refer to a survey of voting behaviour in a sample of men and women in Britain:

the row variable (sex) is represented by i;
the column variable (voting preference) is represented by j;
the layer variable (social class) is represented by k.

The number in any one cell in Box 20.14 can be represented by the symbol n_{ijk}, that is to say, the score in row category i, column category j, and layer category k, where:

i = 1 (men), 2 (women)
j = 1 (Conservative), 2 (Labour)
k = 1 (middle-class), 2 (working-class)

Box 20.14: Sex, voting preference and social class: a three-way classification

              Middle-class                 Working-class
              Conservative    Labour       Conservative    Labour
Men           80              30           40              130
Women         100             20           40              110

Source: adapted from Whiteley, 1983


It follows therefore that the numbers in Box 20.14 can also be represented as in Box 20.15. Thus, n_{111} = 80, n_{121} = 30, n_{112} = 40, n_{122} = 130, n_{211} = 100, n_{221} = 20, n_{212} = 40 and n_{222} = 110.

Box 20.15: Sex, voting preference and social class: a three-way notational classification

              Middle-class                 Working-class
              Conservative    Labour       Conservative    Labour
Men           n_{111}         n_{121}      n_{112}         n_{122}
Women         n_{211}         n_{221}      n_{212}         n_{222}

Using the chi-square test in a three-way classification table

Whiteley (1983) shows how easy it is to extend the 2×2 chi-square test to the three-way case.

Three types of marginals can be obtained from Box 20.15 by:

1 Summing over two variables to give the marginal totals for the third. Thus:
  n_{++k} = Σ_{ij} n_{ijk}, summing over sex and voting preference to give social class (for example, n_{++1} = 80 + 30 + 100 + 20 = 230);
  n_{+j+} = Σ_{ik} n_{ijk}, summing over sex and social class to give voting preference;
  n_{i++} = Σ_{jk} n_{ijk}, summing over voting preference and social class to give sex.

2 Summing over one variable to give the marginal totals for the second and third variables. Thus:
  n_{+jk} = Σ_{i} n_{ijk}, n_{i+k} = Σ_{j} n_{ijk} and n_{ij+} = Σ_{k} n_{ijk}.

3 Summing over all three variables to give the grand total. Thus:
  n_{+++} = Σ_{ijk} n_{ijk} = N.

The probability that an individual taken from the sample at random in Box 20.14 will be a woman is:

p_{2++} = n_{2++}/n_{+++} = 270/550 = 0.49

and the probability that a respondent's voting preference will be Labour is:

p_{+2+} = n_{+2+}/n_{+++} = 290/550 = 0.53

and the probability that a respondent will be working-class is:

p_{++2} = n_{++2}/n_{+++} = 320/550 = 0.58

To determine the expected probability of an individual being a woman, Labour supporter and working-class, we assume that these variables are statistically independent (that is to say, there is no relationship between them) and simply apply the multiplication rule of probability theory:

p_{222} = p_{2++} × p_{+2+} × p_{++2}

This can be expressed in terms of the expected frequency in cell n_{222} as:

E_{222} = n_{+++} × p_{222} = (n_{2++} × n_{+2+} × n_{++2})/n_{+++}² = (270 × 290 × 320)/550² = 82.8

Similarly, the expected frequency in cell n_{112} is:

E_{112} = (n_{1++} × n_{+1+} × n_{++2})/n_{+++}²

where n_{1++} = 280, n_{+1+} = 260 and n_{++2} = 320. Thus:

E_{112} = (280 × 260 × 320)/550² = 77.0


Box 20.16 gives the expected frequencies for the data shown in Box 20.14.

Box 20.16: Expected frequencies in sex, voting preference and social class

              Middle-class                 Working-class
              Conservative    Labour       Conservative    Labour
Men           55.4            61.7         77.0            85.9
Women         53.4            59.5         74.3            82.8

Source: adapted from Whiteley, 1983

With the observed frequencies and the expected frequencies to hand, chi-square is calculated in the usual way. Thus, where r = rows, c = columns and l = layers:

χ² = Σ_{i=1}^{r} Σ_{j=1}^{c} Σ_{k=1}^{l} (n_{ijk} − E_{ijk})²/E_{ijk}

Degrees of freedom

As Whiteley observes, degrees of freedom in a three-way contingency table are more complex than in a 2×2 classification. Essentially, however, degrees of freedom refer to the freedom with which the researcher is able to assign values to the cells, given fixed marginal totals. This can be computed by first determining the degrees of freedom for the marginals.

Each of the variables in our example (sex, voting preference, and social class) contains two categories. It follows therefore that we have (2 − 1) degrees of freedom for each of them, given that the marginal for each variable is fixed. Since the grand total of all the marginals (i.e. the sample size) is also fixed, it follows that one more degree of freedom is also lost. We subtract these fixed numbers from the total number of cells in our contingency table. In general, therefore:

degrees of freedom (df) = the number of cells in the table − 1 (for N) − the number of cells fixed by the hypothesis being tested

so that

df = rcl − (r − 1) − (c − 1) − (l − 1) − 1

when we are testing the hypothesis of the mutual independence of the three variables.

In our example:

df = (2 × 2 × 2) − (2 − 1) − (2 − 1) − (2 − 1) − 1 = 8 − 1 − 1 − 1 − 1 = 4

From chi-square tables we see that the critical value of χ² with four degrees of freedom is 9.49 at p = 0.05. Our obtained value (approximately 159 for the data in Boxes 20.14 and 20.16) greatly exceeds that number. We reject the null hypothesis and conclude that sex, voting preference, and social class are significantly interrelated.

Having rejected the null hypothesis with respect to the mutual independence of the three variables, the researcher's task now is to identify which variables cause the null hypothesis to be rejected. We cannot simply assume that because our chi-square test has given a significant result, it therefore follows that there are significant associations between all three variables. It may be the case, for example, that an association exists between two of the variables whilst the third is completely independent. What we need now is a test of 'partial independence'. Whiteley shows the following three such possible tests in respect of the data in Box 20.14. First, that sex is independent of social class and voting preference:

p_{ijk} = p_{i++} × p_{+jk}    (1)

Second, that voting preference is independent of sex and social class:

p_{ijk} = p_{+j+} × p_{i+k}    (2)

And third, that social class is independent of sex and voting preference:

p_{ijk} = p_{++k} × p_{ij+}    (3)

The following example shows how to construct the expected frequencies for the first hypothesis.


We can determine the probability of an individual being, say, a woman, Labour, and working-class, assuming hypothesis (1), as follows:

p_{222} = p_{2++} × p_{+22} = (270/550) × (240/550) = 0.214

and the expected frequency in cell n_{222} is:

E_{222} = n_{+++} × p_{222} = 550 × 0.214 = 117.8

That is to say, assuming that sex is independent of social class and voting preference, the expected number of female, working-class Labour supporters is 117.8. When we calculate the expected frequencies for each of the cells in our contingency table in respect of our first hypothesis, p_{ijk} = p_{i++} × p_{+jk}, we obtain the results shown in Box 20.17.

Box 20.17: Expected frequencies assuming that sex is independent of social class and voting preference

              Middle-class                 Working-class
              Conservative    Labour       Conservative    Labour
Men           91.6            25.5         40.7            122.2
Women         88.4            24.5         39.3            117.8

Source: adapted from Whiteley, 1983

Degrees of freedom are given by:

df = rcl − (r − 1) − (cl − 1) − 1 = 8 − 1 − 3 − 1 = 3

Whiteley observes:

Note that we are assuming c and l are interrelated, so that once, say, p_{+11} is calculated, then p_{+12}, p_{+21} and p_{+22} are determined, so we have only 1 degree of freedom; that is to say, we lose (cl − 1) degrees of freedom in calculating that relationship.

(Whiteley, 1983)

When we compute chi-square from the observed frequencies in Box 20.14 and the expected frequencies in Box 20.17, our obtained value is approximately 5.7. From chi-square tables we see that the critical value of χ² with three degrees of freedom is 7.81 at p = 0.05. Our obtained value is less than this. We therefore accept the null hypothesis and conclude that there is no relationship between sex on the one hand and voting preference and social class on the other.

Suppose now that instead of casting our data into a three-way classification as shown in Box 20.14, we had simply used a 2×2 contingency table and that we had sought to test the null hypothesis that there is no relationship between sex and voting preference. The data are shown in Box 20.18.

Box 20.18: Sex and voting preference: a two-way classification table

              Conservative    Labour
Men           120             160
Women         140             130

Source: adapted from Whiteley, 1983

When we compute chi-square from these data our obtained value is approximately 4.5. Degrees of freedom are given by df = (r − 1)(c − 1) = 1. From chi-square tables we see that the critical value of χ² with 1 degree of freedom is 3.84 at p = 0.05. Our obtained value exceeds this. We reject the null hypothesis and conclude that sex is significantly associated with voting preference.

But how can we explain the differing conclusions that we have arrived at in respect of the data in Boxes 20.14 and 20.18? These examples illustrate an important and general point, Whiteley observes. In the bivariate analysis (Box 20.18) we concluded that there was a significant relationship between sex and voting preference. In the multivariate analysis (Box 20.14) that relationship was found to be non-significant when we controlled for social class. The lesson is plain: use a multivariate approach to the analysis of contingency tables wherever the data allow.5
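The whole of Whiteley's worked example can be reproduced in a few lines of numpy, as the sketch below shows. The observed frequencies are those of Box 20.14; the code itself is our own illustration rather than Whiteley's.

```python
# A numpy sketch of the three-way analysis: expected frequencies under
# mutual independence and under 'sex independent of class and vote',
# each compared with the observed table from Box 20.14.
import numpy as np
from scipy.stats import chi2

# observed[i, j, k]: sex (men, women) x vote (Con, Lab) x class (MC, WC)
observed = np.array([[[80, 40], [30, 130]],
                     [[100, 40], [20, 110]]])
N = observed.sum()

n_i = observed.sum(axis=(1, 2))   # sex marginals: 280, 270
n_j = observed.sum(axis=(0, 2))   # vote marginals: 260, 290
n_k = observed.sum(axis=(0, 1))   # class marginals: 230, 320

# Mutual independence: E_ijk = n_i++ * n_+j+ * n_++k / N^2  (4 df)
e_mutual = np.einsum("i,j,k->ijk", n_i, n_j, n_k) / N**2
chi_mutual = ((observed - e_mutual) ** 2 / e_mutual).sum()

# Partial independence of sex: E_ijk = n_i++ * n_+jk / N  (3 df)
n_jk = observed.sum(axis=0)
e_partial = np.einsum("i,jk->ijk", n_i, n_jk) / N
chi_partial = ((observed - e_partial) ** 2 / e_partial).sum()

print(f"mutual independence: chi2 = {chi_mutual:.2f}, "
      f"critical (df=4, p=.05) = {chi2.ppf(0.95, 4):.2f}")
print(f"sex independent of vote/class: chi2 = {chi_partial:.2f}, "
      f"critical (df=3, p=.05) = {chi2.ppf(0.95, 3):.2f}")
```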


A note on multilevel modelling

Multilevel modelling (also known as multilevel regression) is a statistical method that recognizes that it is uncommon to be able to assign students in schools randomly to control and experimental groups, or indeed to conduct an experiment that requires an intervention with one group whilst maintaining a control group (Keeves and Sellin, 1997:394).

Typically, in most schools, students are brought together in particular groupings for specified purposes, and each group of students has its own different characteristics which render it different from other groups. Multilevel modelling addresses the fact that, unless it can be shown that different groups of students are, in fact, alike, it is generally inappropriate to aggregate groups of students or data for the purposes of analysis. Indeed, multilevel modelling provides a striking critique of Bennett's (1976) research on teaching styles that we report earlier in this chapter (Aitken, Anderson and Hinde, 1981). Multilevel models avoid the pitfalls of aggregation and the ecological fallacy (Plewis, 1997:35), i.e. making inferences about individual students and behaviour from aggregated data.

Data and variables exist at individual and group levels; indeed, Keeves and Sellin (1997) break down analysis further into three main levels: (a) between students over all groups; (b) between groups; and (c) between students within groups. One could extend the notion of levels, of course, to include individual, group, class, school, local, regional, national and international levels (Paterson and Goldstein, 1991). This has been done using multilevel regression and hierarchical linear modelling. Multilevel models enable researchers to ask questions hitherto unanswered, e.g. about variability between and within schools, teachers and curricula (Plewis, 1997:34–5), in short about the processes of teaching and learning.6 Useful overviews of multilevel modelling can be found in Goldstein (1987), Fitz-Gibbon (1997) and Keeves and Sellin (1997).

Multilevel analysis avoids statistical treatments associated with experimental methods (e.g. analysis of variance and covariance); rather, it uses regression analysis and, in particular, multilevel regression. Regression analysis, argues Plewis (1997:28), assumes homoscedasticity (where the residuals demonstrate equal scatter), that the residuals are independent of each other, and, finally, that the residuals are normally distributed.

The whole field of multilevel modelling proliferated rapidly in the 1990s, and is the basis of much research that is being undertaken on the 'value added' component of education and the comparison of schools in public 'league tables' of results (Fitz-Gibbon, 1991, 1997). However, Fitz-Gibbon (1997:42–4) provides important evidence to question the value of some forms of multilevel modelling. She demonstrates that residual gain analysis provides answers to questions about the value-added dimension of education which differ insubstantially from those answers that are given by multilevel modelling (the lowest correlation co-efficient being 0.93, and 71.4 per cent of the correlations computed lying between 0.98 and 1). The important point here is that residual gain analysis is a much more straightforward technique than multilevel modelling. Her work strikes at the heart of the need to use complex multilevel modelling to assess the 'value-added' component of education. In her work (Fitz-Gibbon, 1997:5) the value-added score—the difference between a statistically predicted performance and the actual performance—can be computed using residual gain analysis rather than multilevel modelling. Nonetheless, multilevel modelling now attracts worldwide interest. Whereas ordinary regression models do not make allowances, for example, for different schools (Paterson and Goldstein, 1991), multilevel regression can include school differences and, indeed, other variables, for example: socio-economic status (Willms, 1992), single-sex and co-educational schools (Daly, 1996; Daly and Shuttleworth, 1997), location (Garner and Raudenbush, 1991), size of school (Paterson, 1991) and teaching styles (Zuzovsky and Aitken, 1991). Indeed, Plewis (1991a) indicates how multilevel modelling can be used in longitudinal studies, linking educational progress with curriculum coverage.
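To give a concrete flavour, here is a hedged sketch of a simple two-level, random-intercept model (students nested within schools) using the statsmodels library. The variable names and simulated data are hypothetical, not drawn from any of the studies cited above.

```python
# A minimal sketch of a two-level model: pupils within schools, with a
# random intercept per school. All data and names are hypothetical.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(4)
n_schools, n_pupils = 30, 40

school = np.repeat(np.arange(n_schools), n_pupils)
school_effect = rng.normal(0, 2, n_schools)[school]  # between-school variance
prior = rng.normal(50, 10, n_schools * n_pupils)     # pupil-level intake score
attainment = 10 + 0.8 * prior + school_effect + rng.normal(0, 5, len(prior))

df = pd.DataFrame({"school": school, "prior": prior, "attainment": attainment})

# Random-intercept model: each school gets its own baseline, so school
# differences are modelled rather than ignored.
model = smf.mixedlm("attainment ~ prior", df, groups=df["school"])
result = model.fit()
print(result.summary())
```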


21 Role-playing

Introduction

Much current discussion of role-playing has occurred within the context of a protracted debate over the use of deception in experimental social psychology. Inevitably, therefore, the following account of role-playing as a research tool involves some detailed comment on the 'deception' versus 'honesty' controversy. But role-playing has a much longer history of use in the social sciences than as a substitute for deceit. It has been employed for decades in assessing personality, in business training and in psychotherapy (Ginsburg, 1978).1 In this latter connection, role-playing was introduced to the United States as a therapeutic procedure by Moreno in the 1930s. His group therapy sessions were called 'psycho-drama', and in various forms they spread to the group dynamics movement which was developing in America in the 1950s. Current interest in encounter sessions and sensitivity training can be traced back to the impact of Moreno's pioneering work in role-taking and role-enactment.

The focus of this chapter is on the use of role-playing as a technique of educational research. Role-playing is defined as participation in simulated social situations that are intended to throw light upon the role/rule contexts governing 'real' life social episodes. The present discussion aims to extend some of the ideas set out in Chapter 16, which dealt with account gathering and analysis. We begin by itemizing a number of role-playing methods that have been reported in the literature.

Various role-play methods have been identified by Hamilton (1976) and differentiated in terms of a passive-active distinction. Thus, an individual may role-play merely by reading a description of a social episode and filling in a questionnaire about it; on the other hand, a person may role-play by being required to improvise a characterization and perform it in front of an audience. This passive-active continuum, Hamilton notes, glosses over three important analytical distinctions.

First, the individual may be asked simply to imagine a situation or actually to perform it. Hamilton terms this an 'imaginary-performed' distinction. Second, in connection with performed role-play, he distinguishes between structured and unstructured activities, the difference depending upon whether the individual is restricted by the experimenter to preset forms or lines. This Hamilton calls a 'scripted-improvised' distinction. And third, the participant's activities may be verbal responses, usually of the paper-and-pencil variety, or behavioural, involving something much more akin to acting. This distinction is termed 'verbal-behavioural'. Turning next to the content of role-play, Hamilton distinguishes between relatively involving or uninvolving contents, that is, where a subject is required to act or to imagine herself in a situation or, alternatively, to react as she believes another person would in those circumstances, the basic issue here being what person the subject is supposed to portray. Furthermore, in connection with the role in which the person is placed, Hamilton differentiates between studies that assign the individual to the role of laboratory subject and those that place her in any other role. Finally, the content of the role-play is seen to include the context of the acted or the


imagined performance, that is, the elaborateness of the scenario, the involvement of other actors, and the presence or absence of an audience. The various dimensions of role-play methods identified by Hamilton are set out in Box 21.1.

To illustrate the extremes of the range in the role-playing methods identified in Box 21.1 we have selected two studies, the first of which is passive, imaginary and verbal, typical of the way in which role-playing is often introduced to pupils; the second is active, performed and behavioural, involving an elaborate scenario and the participation of numerous other actors.

In a lesson designed to develop empathizing skills (Rogers and Atwood, 1974), a number of magazine pictures were selected. The pictures included easily observed clues that served as the basis for inferring an emotion or a situation. Some pictures showed only the face of an individual, others depicted one or more persons in a particular social setting. The pictures exhibited a variety of emotions such as anger, fear, compassion, anxiety and joy. Pupils were asked to look carefully at a particular picture and then to respond to questions like:

• How do you think the individual(s) is (are) feeling?
• Why do you think this is? (Encourage students to be specific about observations from which they infer emotions. Distinguish between observations and inferences.)
• Might the person(s) be feeling a different emotion than the one you inferred? Give an example.
• Have you ever felt this way? Why?
• What do you think might happen next to this person?
• If you inferred an unpleasant emotion, what possible action might the person(s) take in order to feel better?

The second example of a role-playing study is the well-known Stanford Prison experiment carried out by Haney et al. (1973), a brief overview of which is given in Box 21.2. Enthusiasts of role-playing as a research methodology cite experiments such as the Stanford Prison study to support their claim that where realism and spontaneity can be introduced into role-play, then such experimental conditions do, in fact, simulate, both symbolically and phenomenologically, the real-life analogues that they purport to represent. Advocates of role-play would concur with the conclusions of Haney and his associates that the simulated prison developed into a psychologically compelling prison environment, and they, too, would infer that the dramatic differences in the behaviour of prisoners and guards arose out of their location in different positions within the institutional structure of the prison and the social psychological conditions that prevailed there, rather than from personality differences between the two groups of subjects (see Banuazizi and Movahedi, 1975).

On the other hand, the passive, imaginary role-play required of subjects taking part in the lesson cited in the first example has been the focus of much of the criticism levelled at role-playing as a research technique. Ginsburg (1978) summarizes the argument against role-playing as a device for generating scientific knowledge:

• Role-playing is unreal with respect to the variables under study in that the subject reports what she would do, and that is taken as though she did do it.2
• The behaviour displayed is not spontaneous, even in the more active forms of role-playing.
• The verbal reports in role-playing are very susceptible to artefactual influence such as social desirability.
• Role-playing procedures are not sensitive to complex interactions whereas deception designs are.

Box 21.1: Dimensions of role-play methods

FORM                                         CONTENT
Set: imaginary v performed                   Person: self v other
Action: scripted v improvised                Role: subject v another role
Dependent variables: verbal v behavioural    Context: scenario, other actors, audience

Source: adapted from Hamilton, 1976



In general, Ginsburg concludes, critics of role-playing view science as involving the discovery of natural truths, and they contend that role-playing simply cannot substitute for deception—a sad but unavoidable state of affairs.

Role-playing versus deception: the argument

As we shall shortly see, those who support role-playing as a legitimate scientific technique for systematic research into human social behaviour reject such criticisms by offering role-playing alternatives to deception studies of phenomena such as destructive obedience to authority, and to conventional research in, for example, the area of attitude formation and change.

The objections to the use of deception in experimental research are articulated as follows:

• Lying, cheating and deceiving contradict the norms that we typically try to apply in our everyday social interactions. The use of deception in the study of interpersonal relations is equally reprehensible. In a word, deception is unethical.
• The use of deception is epistemologically unsound because it rests upon the acceptance of a less than adequate model of the subject as a person. Deception studies generally try to exclude the human capacities of the subject for choice and self-presentation. They tend therefore to focus upon 'incidental' social behaviour, that is, behaviours that are outside of the subject's field of choice, intention and self-presentation that typically constitute the main focus of social activity among human actors (see Forward et al., 1976).
• The use of deception is methodologically unsound. Deception research depends upon a continuing supply of subjects who are naive to the intentions of the researchers. But word soon gets round and potential subjects come to expect that they will be deceived. It is a fair guess that most subjects are suspicious and distrustful of psychological research despite the best intentions of deception researchers.

Box 21.2: The Stanford Prison experiment

The study was conducted in the summer of 1971 in a mock prison constructed in the basement of the psychology building at Stanford University. The subjects were selected from a pool of 75 respondents to a newspaper advertisement asking for paid volunteers to participate in a psychological study of prison life. On a random basis, half of the subjects were assigned to the role of guard and half to the role of prisoner. Prior to the experiment, subjects were asked to sign a form, agreeing to play either the prisoner or the guard role for a maximum of two weeks. Those assigned to the prisoner role should expect to be under surveillance, to be harassed, but not to be physically abused. In return, subjects would be adequately fed, clothed and housed and would receive 15 dollars per day for the duration of the experiment.

The outcome of the study was quite dramatic. In less than two days after the initiation of the experiment, violence and rebellion broke out. The prisoners ripped off their clothing and their identification numbers and barricaded themselves inside the cells while shouting and cursing at the guards. The guards, in turn, began to harass, humiliate and intimidate the prisoners. They used sophisticated psychological techniques to break the solidarity among the inmates and to create a sense of distrust among them. In less than 36 hours, one of the prisoners showed severe symptoms of emotional disturbance, uncontrollable crying and screaming, and was released. On the third day, a rumour developed about a mass escape plot. The guards increased their harassment, intimidation and brutality towards the prisoners. On the fourth day, two prisoners showed symptoms of severe emotional disturbance and were released. On the fifth day, the prisoners showed symptoms of individual and group disintegration. They had become mostly passive and docile, suffering from an acute loss of contact with reality. The guards, on the other hand, had kept up their harassment, some behaving sadistically. Because of the unexpectedly intense reactions generated by the mock prison experience, the experimenters terminated the study at the end of the sixth day.

Source: adapted from Banuazizi and Movahedi, 1975

Cha

pte

r 21

373

is a fair guess that most subjects are suspi-cious and distrustful of psychological researchdespite the best intentions of deception re-searchers.

Finally, advocates of role-playing methods deplore the common practice of comparing the outcomes of role-playing replications against the standard of their deception study equivalents as a means of evaluating the relative validity of the two methods. The results of role-playing and deception, it is argued, are not directly comparable since role-playing introduces a far wider range of human behaviour into experiments (see Forward et al., 1976). If comparisons are to be made, then role-playing results should provide the yardstick against which deception study data are measured, and not the other way round as is generally the case. We invite readers to follow this last piece of advice and to judge the well-known experiments of Milgram (1974) on destructive obedience to authority against their role-playing replications by Mixon (1972; 1974). A more sustained discussion of ethical problems involved in deception is given in Chapter 2.

Role-playing versus deception: the evidence

Milgram’s obedience-to-authority experiments

In a series of studies from 1963 to 1974, Milgram carried out numerous variations on a basic obedience experiment which involved individuals acting, one at a time, as ‘teachers’ of another subject (who was, in reality, a confederate of the experimenter). Teachers were required to administer electric shocks of increasing severity every time the learner failed to make a correct response to a verbal learning task. Over the years, Milgram involved more than 1,000 subjects in the experiment—subjects, incidentally, who were drawn from all walks of life rather than from undergraduate psychology classes. Summarizing his findings, Milgram (1974) reported that typically some 67 per cent of his teachers delivered the maximum electric shock to the learner, despite the fact that such a degree of severity was clearly labelled as highly dangerous to the physical well-being of the person on the receiving end. Milgram’s explanation of destructive obedience to authority is summarized by Brown and Herrnstein (1975).

Mixon’s starting point was a disaffection with the deceit that played such an important part in generating emotional stress in Milgram’s subjects, and a desire to explore alternative approaches to the study of destructive obedience to authority. Since Milgram’s dependent variable was a rule-governed action, Mixon reasoned (Mixon, 1974), the rule-governed behaviour of Milgram’s subjects should have been uniform or predictable. But it was not. Why, then, did some of Milgram’s subjects obey and some defy the experimenter’s instructions? The situation, Mixon notes, seemed perfectly clear to most commentators: the command to administer an electric shock appeared to be obviously immoral, and all subjects should therefore have disobeyed the experimenter. If defiance was so obviously called for when looking at the experiment from the outside, why, asks Mixon, was it not obvious to those taking part on the inside?

Mixon found a complete script of Milgram’s experiment and proceeded to transform it into an active role-playing exercise. He writes:

Previous interpretations [of the Milgram data] have rested on the assumption that obedient subjects helplessly performed an obviously immoral act. From the outside the situation seemed clear. It was otherwise to the actors. The actors in my role playing version could not understand why the experimenter behaved as if feedback from the ‘victim’ was unimportant. The feedback suggested that something serious had occurred, that something had gone badly wrong with the experiment. The experimenter behaved as if nothing serious had or could happen. The experimenter in effect contradicted the evidence that otherwise seemed so clearly to suggest that the ‘victim’ was in serious trouble…. Using the ‘all-or-none’ method I found that when it became perfectly clear that the experimenter believed the ‘victim’ was being seriously harmed all actors indicated defiance to experimental commands. Briefly summarized, the ‘all-or-none’ analysis suggests that people will obey seemingly inhumane experimental commands so long as there is no good reason to think experimental safeguards have broken down; people will defy seemingly inhumane experimental commands when it becomes clear that safeguards have broken down—when consequences may indeed be what they appear to be. When the experimental situation is confusing and mystifying as in Milgram’s study, some people will obey and some will defy experimental commands.

(Mixon, 1974, emphasis added)

We leave readers to compare Mixon’s explanations with Milgram’s account set out by Brown and Herrnstein (1975).

In summary, sophisticated role-playing methods such as those used by Mixon offer exciting possibilities to the educational researcher. They avoid the disadvantages of deception designs yet are able to incorporate many of the standard features of experiments, such as constructing experimental conditions across factors of interest (in the Mixon studies, for example, using scripts that vary the states of given role/rule contexts), randomly assigning actors to conditions as a way of randomizing out individual differences, using repeated-measures designs, and standardizing scripts and procedures to allow for replication of studies (Forward et al., 1976).

Despite what has just been said about the possibilities of incorporating experimental role-playing methodologies in exploratory rather than experimental settings, Harré and Secord (1972) distinguish between ‘exploration’ and ‘experiment’ as follows. Whereas the experiment is employed to test the authenticity of what is known, exploration serves quite a different purpose:

In exploratory studies, a scientist has no very clear idea of what will happen, and aims to find out. He has a feeling for the direction in which to go…but no clear expectations of what to expect. He is not confirming or refuting hypotheses.

(Harré and Secord, 1972)

Increasingly, exploratory (as opposed to experimental) research into human social behaviour is turning to role-playing methodologies. The reason is plain enough. Where the primary objective of such research is the identification and elucidation of the role/rule frameworks governing social interaction, informed rather than deceived subjects are essential if the necessary data on how they genuinely think and feel are to be made available to the researcher. Contrast the position of the fully participating, informed subject in such research with that of the deceived subject under the more usual experimental conditions.

It can be argued that many of the more pressing social problems that society faces today arise out of our current ignorance of the role/rule frameworks governing human interactions in diverse social settings. If this is the case, then role-playing techniques could offer the possibility of a greater insight into the natural episodes of human behaviour that they seek to elucidate than the burgeoning amount of experimental data already at hand. The danger may lie in too much being expected of role-playing as a key to such knowledge. Ginsburg (1978) offers a timely warning. Role-playing, he urges, should be seen as a complement to conventional experiments, survey research and field observations. That is, it is an important addition to our investigative armamentarium, not a replacement.

Role-playing in educational settings

Role-playing, gaming and machine or computer simulation are three strands of development in simulation studies that have found their way into British classrooms (Taylor and Walford, 1972). Their discovery and introduction into primary and secondary schools as late as the 1960s is somewhat surprising in view of the unqualified support that distinguished educational theorists from Plato onwards have accorded to the value of play and games in education (Megarry, 1978).

The distinction between these three types of simulation—role-playing, games and machines/computers—is by no means clear-cut; for example, simulation games often contain role-playing activities and may be designed with computer back-up services to expedite their procedures (see Taylor and Walford, 1972).

In this section we focus particularly upon role-playing aspects of simulation, beginning with some brief observations on the purposes of role-playing in classroom settings and some practical suggestions directed towards the less experienced practitioners of role-playing methods.

The uses of role-playing

The uses of role-playing are classified by van Ments (1978) as:

• Developing sensitivity and awareness. The definitions of positions such as mother, teacher, policeman and priest, for example, explicitly or implicitly incorporate various role characteristics which often lead to the stereotyping of position occupants. Role-playing provides a means of exploring such stereotypes and developing a deeper understanding of the point of view and feelings of someone who finds herself in a particular role.

• Experiencing the pressures which create roles. Role-playing provides study material for group members on the ways in which roles are created in, for example, a committee. It enables subjects to explore the interactions of formal structure and individual personalities in role taking.

• Testing out for oneself possible modes of behaviour. In effect, this is the rehearsal syndrome: the trying out in one’s mind in advance of some new situation that one has to face. Role-playing can be used for a wide variety of situations where the subject, for one reason or another, needs to learn to cope with the rituals and conventions of social intercourse and to practise them so that they can be repeated under stress.

• Simulating a situation for others (and possibly oneself) to learn from. Here, the role-player provides materials for others to use and work upon. In the simplest situation, there is just one role-player acting out a specific role. In more complex situations, such as the Stanford Prison study discussed in Box 21.2, role-playing is used to provide an environment structured on the interactions of numerous role incumbents.

Suggestions for running role-play sessions are set out in Box 21.3. They are particularly appropriate to teachers intent upon using role-play in classroom settings.

Setting objectives

The first observation made by van Ments is that teachers must begin by asking themselves what exactly their intentions are in teaching by means of role-play. Is it, for example, to teach facts, or concepts, or skills, or awareness, or sensitivity? Depending on the specific nature of the teacher’s objective, role-play can be fitted into the timetable in several ways. Van Ments identifies the following:

• as an introduction to the subject;
• as a means of supplementing or following on from a point that is being explored;
• as the focal point of a course or a unit of work;
• as a break from the routine of the classroom or the workshop;
• as a way of summarizing or integrating diverse subject matter;
• as a way of reviewing or revising a topic;
• as a means of assessing work.

Box 21.3
A flow chart for using role-play
Source van Ments, 1983

Determining external constraints

Role-play can be extremely time consuming. It is vital therefore that, from the outset, teachers should be aware of the following factors that may inhibit or even preclude the running of a role-play (see van Ments, 1978):

• a suitable room or space (size, layout, furniture, etc.);
• sufficient time for warm up, running the actual role-play and debriefing;
• availability of assistance to help run the session.

Critical factors

The teacher, van Ments advises, must look at the critical issues involved in the problem area encompassed by the role-play and decide who has the power to influence those issues as well as who is affected by the decisions to be taken. By way of example, Box 21.4 identifies some of the principal protagonists in a role-play session to do with young people smoking.

Choosing or writing the role-play

The choice lies with teachers either to buy or borrow a ready-made role-play or to write their own. In practice, van Ments observes, most role-plays are written for specific needs and with the intention of fitting into a particular course programme. Existing role-plays can, of course, be adapted by teachers to their own particular circumstances and needs. On balance, it is probably better to write the role-play oneself in order to ensure that the background is familiar to the intended participants; they can then see its relevance to the specific problem that concerns them.

Running the role-play

The counsel of perfection, van Ments reminds us, is always to pilot test the role-play material that one is going to use, preferably with a similar audience. In reality, pilot testing can be as time consuming as the play itself and may therefore be totally impracticable given timetable pressures. But however difficult the circumstances, any form of piloting, says van Ments, is better than none at all, even if it is simply a matter of talking procedures through with one or two colleagues.

Once the materials are prepared, then the role-play follows its own sequence: introduction, warm up, running, and ending. One final word of caution. It is particularly important to time the ending of the role-play in such a way as to fit into the whole programme. One method of ensuring this is to write the mechanism for ending into the role-play itself. Thus: ‘You must have reached agreement on all five points before 11.30 a.m. when you have to attend a meeting of the board of directors.’

Box 21.4
Critical factors in a role-play: smoking and young people

Roles involved: young people, parents, teachers, doctors, youth leaders, shopkeeper, cigarette manufacturer.
Critical issues: responsibility for health, cost of illness, freedom of action, taxation revenue, advertising, effects on others.
Key communication channels: advertisements, school contacts, family, friends.

Source Adapted from van Ments, 1983

Debriefing

Debriefing is more than simply checking that the right lesson has been learnt and feeding this information back to the teacher. Rather, van Ments reminds us, it is a two-way process during which the consequences of actions arising in the role-play can be analysed and conclusions drawn. It is at this point in the role-play sequence that mistakes and misunderstandings can be rectified. Most important of all, it is from well-conducted debriefing sessions that the teacher can draw out the implications of what the pupils have been experiencing and can then plan the continuation of their learning about the topic at hand.

Follow-up

To conclude, van Ments notes the importance of the follow-up session in the teacher’s planning of the ways in which the role-play exercise will lead naturally into the next learning activity. Thus, when the role-play session has attempted to teach a skill or rehearse a novel situation, then it may be logical to repeat it until the requisite degree of competence has been reached. Conversely, if the purpose of the exercise has been to raise questions, then a follow-up session should be arranged to answer them. ‘Whatever the objectives of using role play’, van Ments advises, ‘one must always consider the connection between it and the next learning activity’ (van Ments, 1983). Above all else, avoid leaving the role-play activity in a vacuum.3

Strengths and weaknesses of role-playing and other simulation exercises

Taylor and Walford (1972) identify two prominent themes in their discussion of some of the possible advantages and disadvantages in the use of classroom simulation exercises. They are, first, the claimed enhancement of pupil motivation and, second, the role of simulation in the provision of relevant learning materials. The motivational advantages of simulation are said to include:

• a heightened interest and excitement in learning;
• a sustained level of freshness and novelty arising out of the dynamic nature of simulation tasks;
• a transformation in the traditional pupil-teacher, subordinate-superordinate relationship;
• the fact that simulation is a universal behavioural mode.

As to the learning gains arising out of the use of simulation, the authors identify:

• the learning that is afforded at diverse levels (cognitive, social and emotional);
• the decision-making experiences that participants acquire;
• an increased role awareness;
• the ability of simulation to provide a vehicle for free interdisciplinary communication;
• the success with which the concrete approach afforded by simulation exercises bridges the gap between ‘schoolwork’ and ‘the real world’.

What reservations are there in connection with simulation exercises? Taylor and Walford (1972) identify the following:

• Simulations, however interesting and attractive, are time-demanding activities and ought therefore to justify fully the restricted timetabling allotted to competing educational approaches.
• Many simulation exercises are in the form of game kits and these can be quite expensive.
• Simulation materials may pose problems of logistics, operation and general acceptance as legitimate educational techniques, particularly by parent associations.

Our discussion of the strengths and weaknesses of role-playing has focused upon its application in pupil groups. To illustrate Taylor and Walford’s point that simulation is a universal behavioural mode, Robson and Coller’s (1991) example of a role-play with students in further education deserves attention.

Role-playing in an educational setting: an example

Our example of role-play in an educational setting illustrates the fourth use of this approach that van Ments identifies, namely, simulating a situation from which others may learn. As part of a study of secondary school pupils’ perceptions of teacher racism, Naylor (1995) produced four five-minute video presentations of actual classroom events reconstructed for the purposes of the research. The films were scripted and role-played by twenty-one comprehensive school pupils, each video focusing on the behaviour of a white, female teacher towards pupils of visible ethnic minority groups. A gifted teacher of drama elicited performances from the pupils and faithfully interpreted their directions in her portrayal of their devised teachers’ roles. The four parts she played consisted of a supply teacher of Geography, a teacher of French, a teacher of English and a Mathematics teacher.

In an opportunity sample drawn throughout England, Naylor (1995) showed the videos to over 1,000 adolescents differentiated by age, sex, ability and ethnicity. Pupils’ written responses to the four videos were scored 0 to 5 on the Kohlberg-type scale set out in Box 21.5. Inter alia, the analysis of scripts from a stratified sample of some 480 pupils suggested that older, high-ability girls of visible ethnic minority group membership were most perceptive of teacher racism and younger, low-ability boys of indigenous white group membership, least perceptive. For further examples of role play in an educational setting see Bolton and Heathcote (1999).

Box 21.5
Categorization of responses to the four video extracts

Level (score) and description:

0  No response, or nothing which is intelligibly about the ‘ways in which people treat one another’ in the extract. Alternatively, this level of response may be wrong in terms of fact and/or in interpretation.
1  No reference to racism (i.e. unfairness towards visible ethnic minority pupils) either by the teacher or by pupils, either implicitly or explicitly.
2  Either some reference to pupils’ racism (see level 1 above) but not to the teacher’s, or reference to racism is left unspecified as to its perpetrator. Such reference is likely to be implied and may relate to one or more examples drawn from the extract without any generalization or synthesizing statement(s). The account is at a superficial level of analysis, understanding and explanation.
3  There is some reference to the teacher’s racist behaviour and actions. Such reference is, however, implied rather than openly stated. There may also be implied condemnation of the teacher’s racist behaviour/actions. There will not be any generalized statement(s) about the teacher’s racism supported with examples drawn from the extract.
4  At this level the account will explicitly discuss and illustrate the teacher’s racism, but the analysis will show a superficial knowledge and understanding of the deeper issues.
5  At this level the account will explicitly discuss the teacher’s racism as a generalization and this will be well illustrated with examples drawn from the extract. One or more of these examples may well be of the less obvious and more subtle types of racist behaviour/action portrayed in the extract.

Source Naylor (unpublished)


Evaluating role-playing and other simulation exercises

Because the use of simulation methods in classroom settings is growing, there is increasing need to evaluate claims concerning the advantages and effectiveness of these newer approaches against more traditional methods. Yet here lies a major problem. To date, as Megarry observes, a high proportion of evaluation effort has been directed towards the comparative experiment involving empirical comparisons between simulation-type exercises and more traditional teaching techniques in terms of specified learning pay-offs. One objection to this approach to evaluation has been detailed earlier but is worth repeating here:

the limitations [of the classical, experimental method] as applied to evaluating classroom simulation and games are obvious: not only are the inputs multiple, complex, and only partly known, but the outputs are disputed, difficult to isolate, detect or measure, and the interaction among participants is considerable. Interacting forms, in some views, a major part of what simulation and gaming is about; it is not merely a source of ‘noise’ or experimental error.

(Megarry, 1978)

What alternatives are there to the traditional type of evaluative effort? Megarry lists the following promising approaches to simulation evaluation:

• using narrative reports;
• using checklists gathered from students’ recollections of outstanding positive and negative learning experiences;
• encouraging players to relate ideas and concepts learned in games to other areas of their lives;
• using the instructional interview, a form of tutorial carried out earlier with an individual learner or a small group, in which materials and methods are tested by an instructor who is versed not only in the use of the materials, but also in the ways in which pupils learn.

(See also Percival’s (1978) discussion of observational and self-reporting techniques.) Notice how each of the above evaluative techniques is primarily concerned with the process rather than the product of simulation.

By way of summary, simulation methods provide a means of alleviating a number of problems inherent in laboratory experiments. At the same time, they permit the retention of some of their virtues. Simulations, notes Palys (1978), share with the laboratory experiment the characteristic that the experimenter has complete manipulative control over every aspect of the situation. At the same time, the subjects’ humanity is left intact in that they are given a realistic situation in which to act in whatever way they think appropriate. The inclusion of the time dimension is another important contribution of the simulation, allowing the subject to take an active role in interacting with the environment, and the experimenter the opportunity of observing a social system in action with its feedback loops, multidirectional causal connections and so forth. Finally, Palys observes, the high involvement normally associated with participation in simulations means that the self-consciousness usually associated with the laboratory experiment is more easily dissipated.


Part five
Recent developments in educational research

With respect to the fifth edition, the book so far has brought the ‘story’ of educational research up to date on very many issues, and in the concluding part that follows we outline some important developments which, we suggest, will feature prominently over the coming years. Although what we say is speculative, these initiatives, we believe, will become fruitful avenues of approach; nevertheless, the message that educational research is developing and metamorphosing is one that cannot be ignored.

It is notable that none of the developments that we include here began life in the world of education, but elsewhere. The Internet had its origins in military intelligence, whilst simulations and fuzzy logic have their origins largely in the natural sciences and mathematics. Simulations have spilled over into all walks of life, from economic forecasting to navigating ships; and fuzzy logic is prevalent in the manufacture of white goods and controlling traffic flow. Geographical Information Systems, another line of development we consider, have been brought into education, being already established in social welfare analysis and health provision. And needs analysis derives from social policy formation, housing and welfare reforms. Although it has featured in education for some time, it is emerging from recent relative neglect to assume an important role, not least because, with the impact of the introduction of industrial management systems into education, it is premised on the belief, redolent of Japanese business practice, that the best people to identify a problem are the ones who are closest to it! Finally, evidence-based education, building on the subject of meta-analysis that we discussed in Part Three, has been prominent in the world of medicine for many years, and the worldwide Cochrane Collaboration—a group that collates the results of stringent experimental testing of treatments, typically through randomized controlled trials, preparing, maintaining and promoting the accessibility of systematic reviews of the effects of health care interventions—testifies to this.

This mixed pedigree of emerging developments signals that educational research is eclectic in its paradigms, traditions, methodologies, instrumentation and data analysis. Further, it is important to recognize that educational research is integrative: it steps over the traditional boundaries of different disciplines, its epistemological basis being, in part, derivative, and suggestive of a need to cross such boundaries and protected territories. Educational research is both modern and postmodern! Just as new knowledge crosses traditional epistemological boundaries, is at the frontiers of traditional disciplines and creates new ones, so research, in its endeavour to create new knowledge, need not be hidebound by tradition. Education opens minds; educational research should be open to new developments.

22 Recent developments

As we saw in the introduction to this fifth part of the book, what can be observed in recent developments in educational research is the importation of ideas and methods from spheres outside education, furthering the notion that interdisciplinary inquiry is both a developing trend and, indeed, the way forward at the cutting edge of research. The frontiers of new knowledge are no longer hidebound by disciplines (for example, in the discussions of needs analysis and needs assessment in this chapter). This trend can be coupled with the use of information technology for research activity (for example, the discussions of simulations and modelling, as we shall see). In this chapter these examples, in turn, draw on the disciplines of mathematics (e.g. chaos theory, complexity theory and fuzzy logic) and geography (Geographical Information Systems). The role of information technology has enabled researchers to break out of disciplinary boundaries and move forward with speed and success. Previous chapters have indicated the role that information technology software can play at all stages of research. This chapter discusses four such applications: the Internet, simulations, fuzzy logic, and Geographical Information Systems (GIS).

The Internet

The storage and retrieval of research data on the Internet play an important role not only in keeping researchers abreast of developments across the world, but also in providing access to data which can inform literature searches to establish construct and content validity in their own research. Indeed some kinds of research are essentially large-scale literature searches (e.g. the research papers published in the journal Review of Educational Research). On-line journals, abstracts and titles enable researchers to identify the cutting edge of research and to initiate a literature search of relevant material on their chosen topic. Websites and e-mail correspondence enable networks and information to be shared. For example, researchers wishing to gain instantaneous global access to literature and recent developments in research associations can reach Australia, East Asia, the UK and America in a matter of seconds through such websites as:

http://www.aera.net (the website of the American Educational Research Association);
http://www.laic.k12.ca.us/catalog/providers/185.html (also the website of the American Educational Research Association);
http://www.acer.edu.au/index2.html (the website of the Australian Council for Educational Research);
http://www.bera.ac.uk (the website of the British Educational Research Association);
http://scre.ac.uk (the website of the Scottish Council for Research in Education);
http://www.eera.ac.uk/index.html (the website of the European Educational Research Association);
http://www.cem.dur.ac.uk (the website of the Curriculum Evaluation and Management Centre, probably the largest monitoring centre of its kind in the world);
http://www.nfer.ac.uk (the website of the National Foundation for Educational Research in the UK);
http://www.fed.cuhk.edu.hk/~hkera (the website of the Hong Kong Educational Research Association);
http://www2.hawai.edu.hera (the website of the Hawaii Educational Research Association);
http://www.wera-web.org/index.html (the website of the Washington Educational Research Association);
http://www.ttu.edu/~edupsy/regis.html (the website of the Chinese American Educational Research Association);
http://www.msstate.edu/org/msera/msera.html (the website of the Mid-South Educational Research Association, a very large regional association in the USA);
http://www.esrc.ac.uk (the website of the Economic and Social Research Council in the UK);
http://www.asanet.org (the website of the American Sociological Association).

Researchers wishing to access on-line journal indices and references for published research results (rather than specific research associations, as in the websites above) have a variety of websites which they can visit, for example:

http://www.leeds.ac.uk/bei (to gain access to the British Education Index);
http://www.routledge.com:9996/routledge/journal/er.html (the website of an international publisher that provides information on all its research articles);
http://www.carfax.co.uk (a service provided by a UK publisher to gain access to the Scholarly Articles Research Alerting network in the UK);
http://ericir.syr.edu/Eric/ (a service to access the international ERIC educational research index);
http://ericir.syr.edu/Eric/index.html (the index to the ERIC database);
http://ericir.syr.edu/ (a further website for searching ERIC);
http://www.tandf.co.uk/era (the website for educational research abstracts);
http://www.leeds.ac.uk/educol/index.html (the website for Education-line, a service for electronic texts in education);
http://bubl.ac.uk (a national information service in the UK, provided for the higher education community);
http://www.sosig.ac.uk (the Social Science Information Gateway, providing access to worldwide resources and information);
http://www.carfax.co.uk/ber-ad.htm (the website of the British Educational Research Journal);
http://wos.mimas.ac.uk (the website of the Web of Science, which, amongst other functions, provides access to the Social Science Citation Index, the Science Citation Index and the Arts and Humanities Citation Index);
http://pinkerton.bham.ac.uk/era/main.htm (the website of Educational Research Abstracts Online);
http://www.socresonline.org.uk (the website of Sociological Research Online).

Researchers who do not possess website addresses have at their disposal a variety of search engines to locate them. At the time of writing some widely used engines are: Alta Vista; EuroFerret; Excite; GoTo; HotBot; InfoSeek NetSearch; Infoseek Ultra; Lycos; Magellan; OpenTextIndex; PlanetSearch; Webcrawler; What-U-Seek; WWW Worm; Yahoo; Yahoo UK. All of these search engines enable researchers to conduct searches by keywords, and some of them (e.g. Excite; Magellan) also enable searches to be undertaken by concepts. Whilst all of these are single search engines, there are also several parallel search engines (which will search several single search engines at a time), and file search engines (which will search files across the world).

Finding research information, where not available from databases and indices on CD-ROMs, is often done through the Internet by trial-and-error and serendipity, identifying the key words to unlock the doors to websites. For example, keying in such terms as ‘educational research uk’, ‘educational research usa’, ‘American educational research association’, or ‘British educational research association’ to a search engine will reveal a plethora of useful websites. The system of ‘bookmarking’ websites enables rapid retrieval of these websites for future reference; this is perhaps essential, as some Internet connections are slow, and a vast amount of material on the Internet is, at best, unhelpful! We provide some websites and keywords that may be helpful in researching the subsequent topics in this chapter.

Simulations

The advent of computer technology has opened up powerful new vistas for research. Virtual technology, as used, for example, in air flight simulations (e.g. training new pilots) and ship piloting simulations, seeks to ensure high reliability of performance and the avoidance of failure or system breakdown. In the field of education this has spawned research into schools as high reliability organizations (Morrison, 1998:76–8)—institutions where failure is avoided for fear of disastrous consequences, for example nuclear power plants, air traffic control, electricity supply companies (Reynolds, 1995; Stringfield, 1997:152–7). Outside the world of education the practice of simulation is used extensively in order to identify problems and weaknesses so that action can be taken (i.e. the focus is on the ‘trailing edge’ of weaknesses rather than on the successful aspects of the organization and its operation). The practice proceeds on the premise that, unchecked, minor flaws and errors could escalate into huge failures at a systems level (the view of chaos theory discussed below). Simulations have two main components: a system in which the researcher is interested and that lends itself to be modelled or simulated, and a model of that system (Wilcox, 1997). The system comprises any set of interrelated features, whilst the model, that is, the analogue of the system, is usually mathematical.

Wilcox (1997) has indicated two forms of simulation. In deterministic simulations all the mathematical and logical relationships between the components of a system are known and fixed. In stochastic simulations, typically the main type used in educational research, at least one variable is random.

The use of simulations has grown considerably with the increase in mathematical modelling. Computers can handle very rapidly data that would take humans several years to process. Simulations based on mathematical modelling (e.g. multiple iterations of the same formula) provide researchers with a way of imitating behaviours and systems, and extrapolating what might happen if the system runs over time or if the same mathematical calculations are repeated over and over again, where data are fed back—formatively—into the next round of calculation of the same formula. Hopkins, Hopkins and Glass (1996:159–62) report such a case in proving the Central Limit Theorem (discussed in Chapter 4), where the process of calculation of means was repeated 10,000 times.
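The kind of stochastic simulation just described can be sketched in a few lines of code. The sketch below is our own minimal illustration, not the procedure reported by Hopkins, Hopkins and Glass: it draws 10,000 repeated samples from a deliberately non-normal population and shows that the distribution of the sample means nonetheless approximates the normal curve, as the Central Limit Theorem predicts.

```python
import random
import statistics

# A minimal sketch of a stochastic simulation: the only random element is the
# sampling itself; everything else is the repeated calculation of one formula
# (the mean), iterated 10,000 times.

random.seed(42)                                               # for replication
population = [random.uniform(0, 100) for _ in range(100000)]  # flat, non-normal

sample_means = []
for _ in range(10000):                   # the calculation is repeated 10,000 times
    sample = random.sample(population, 30)
    sample_means.append(statistics.mean(sample))

print(statistics.mean(sample_means))    # close to the population mean (about 50)
print(statistics.stdev(sample_means))   # close to sigma / sqrt(30), about 5.3
```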

For Laplace and Newton, the universe was rationalistic, deterministic and of clockwork order; effects were functions of causes, small causes (minimal initial conditions) produced small effects (minimal and predictable) and large causes (multiple initial conditions) produced large (multiple) effects. Predictability, causality, patterning, universality and ‘grand’ overarching theories, linearity, continuity, stability, objectivity, all contributed to the view of the universe as an ordered and internally harmonistic mechanism in an albeit complex equilibrium, a rational, closed and deterministic system susceptible to comparatively straightforward scientific discovery and laws.

From the 1960s this view has been increasingly challenged with the rise of theories of chaos and complexity. Central to chaos theory are several principles (e.g. Gleick, 1987; Morrison, 1998):

• Small-scale changes in initial conditions can produce massive and unpredictable changes in outcome (e.g. a butterfly’s wing beat in the Caribbean can produce a hurricane in America).
• Very similar conditions can produce very dissimilar outcomes (e.g. using simple mathematical equations (Stewart, 1990)); see the sketch following this list.
• Regularity and conformity break down to irregularity and diversity.
• Even if differential equations are very simple, the behaviour of the system that they are modelling may not be simple.
• Effects are not straightforward continuous functions of causes.
• The universe is largely unpredictable.
• If something works once there is no guarantee that it will work in the same way a second time.
• Determinism is replaced by indeterminism; deterministic, linear and stable systems are replaced by ‘dynamical’, changing, evolving systems and non-linear explanations of phenomena.
• Continuity is replaced by discontinuity, turbulence and irreversible transformation.
• Grand, universal, all-encompassing theories and large-scale explanations provide inadequate accounts of localized and specific phenomena.
• Long-term prediction is impossible.
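The first two principles can be made concrete with the logistic map, a standard illustration in the chaos literature (our own example, not drawn directly from Gleick or Stewart). It is a very simple deterministic equation, x(n+1) = r * x(n) * (1 - x(n)), yet with r = 4 two starting values differing by one part in two million soon follow utterly different trajectories:

```python
# A minimal illustration of sensitive dependence on initial conditions using
# the logistic map. Every step is fully deterministic; the unpredictability
# arises from the iteration itself.

def logistic_trajectory(x0, r=4.0, steps=40):
    """Iterate the logistic map x -> r * x * (1 - x) from the initial value x0."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

a = logistic_trajectory(0.2)          # two starting points differing by
b = logistic_trajectory(0.2000001)    # one part in two million

for n in (0, 10, 20, 30, 40):
    print(n, round(a[n], 6), round(b[n], 6))
# By around step 30 the two trajectories bear no resemblance to each other:
# a tiny difference in initial conditions has produced a massive difference
# in outcome.
```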

More recently theories of chaos have been extended to complexity theory—‘the edge of chaos’ (Waldrop, 1992; Lewin, 1993)—in analysing systems, with components at one level acting as the building blocks for components at another. A complex system comprises independent elements which, themselves, might be made up of complex systems. These interact and give rise to patterned behaviour in the system as a whole. Order is not totally predetermined and fixed, but the universe (however defined) is creative, emergent (through iteration, learning and recursion), evolutionary and changing, transformative and turbulent. Order emerges in complex systems that are founded on simple rules (perhaps formulae) for interacting organisms (Kauffmann, 1995:24).

Through feedback, recursion, perturbance, autocatalysis, connectedness and self-organization, higher and greater levels of complexity are differentiated, new forms arise from lower levels of complexity and existing forms. These complex forms derive from often comparatively simple sets of rules—local rules and behaviours generating complex global order and diversity (Waldrop, 1992:16–17; Lewin, 1993:38). Dynamical systems (Peak and Frame, 1994:122) are a product of initial conditions and often simple rules for change. General laws can govern adaptive, dynamical processes (Kauffmann, 1995:27). There are laws of emergent order, and complex behaviours and systems do not need to have complex roots (Waldrop, 1992:270). Importantly, given these simple rules, behaviour and systems can be modelled in computer simulations.

Simulations are an emerging field in educational research, though they have been used in the natural sciences and economic forecasting for several decades. For example, Lewin (1993) and Waldrop (1992), in the study of the rise and fall of species and their behaviour, indicate how the consecutive iteration—repeated calculation—of simple formulae to express the iteration of a limited number of variables (initial conditions), wherein the data from one round of calculations are used in the next round of calculation of the same formula and so on (i.e. building in continuous feedback), can give rise to a huge diversity of outcomes (e.g. of species, of behaviour) such that it beggars simple prediction or simple cause-and-effect relationships. Waldrop (1992:241–2) provides a fascinating example of this in the computer programme Boids, where just three initial conditions are built into a mathematical formula that catches the actuality of the diverse patterns of flight of a flock of birds. These are: (a) the boids (birds) strive to keep a minimum distance from other objects (including other boids); (b) the boids strive to keep to the same speed as other boids; (c) each boid strives to move towards the centre of the flock.
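The following sketch shows how compactly those three rules can be coded. It is our own illustrative reconstruction, not Reynolds’s original Boids program, and the weighting constants are arbitrary; the point is simply that a handful of local rules, iterated step after step, is all the ‘theory’ the simulation contains.

```python
import math

# An illustrative reconstruction of one update step for a two-dimensional
# flock. Positions and velocities are (x, y) tuples; the weights w_sep,
# w_align and w_coh are invented illustrative values.

def step(positions, velocities, min_dist=2.0, w_sep=0.05, w_align=0.05, w_coh=0.01):
    """Apply separation, alignment and cohesion; return the new velocities."""
    n = len(positions)
    centre = (sum(p[0] for p in positions) / n, sum(p[1] for p in positions) / n)
    mean_v = (sum(v[0] for v in velocities) / n, sum(v[1] for v in velocities) / n)
    new_velocities = []
    for (px, py), (vx, vy) in zip(positions, velocities):
        sep_x = sep_y = 0.0
        for qx, qy in positions:
            dx, dy = px - qx, py - qy
            dist = math.hypot(dx, dy)
            if 0 < dist < min_dist:        # rule (a): keep a minimum distance
                sep_x += dx / dist
                sep_y += dy / dist
        # rule (b): move towards the flock's average velocity;
        # rule (c): move towards the centre of the flock
        vx += w_sep * sep_x + w_align * (mean_v[0] - vx) + w_coh * (centre[0] - px)
        vy += w_sep * sep_y + w_align * (mean_v[1] - vy) + w_coh * (centre[1] - py)
        new_velocities.append((vx, vy))
    return new_velocities
```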

The key features of simulations are:

• The computer can model and imitate the behaviour of systems and their major attributes.
• Computer use can help us to understand the system that is being imitated by testing the simulation in a range of simulated, imitated environments (e.g. enabling researchers to see ‘what happens if’ the system is allowed to run its course or if variables are manipulated, i.e. to be able to predict).
• The mathematical formula models and interprets—represents and processes—key features of the reality rather than catching and manipulating the fine grain of reality.
• Mathematical relationships are assumed to be acting over and over again deterministically in controlled, bounded and clearly defined situations, on occasions giving rise to unanticipated, emergent and unexpected, wide-ranging outcomes (Tymms, 1996:124).
• Feedback and multiple, continuous iteration are acceptable procedures for understanding the emergence of phenomena and behaviours.
• Complex and wide-ranging phenomena and behaviours derive from the repeated interplay of initial conditions/variables.
• Deterministic laws (the repeated calculation of a formula) lead to unpredictable outcomes (chaos).

In the field of education what is being suggested is that schools and classrooms, whilst being complex, non-linear, dynamical systems, can be understood in terms of the working out of simple mathematical modelling. This may be at the level of analogy only, but, as Tymms (1996:130) remarks, if the analogue fits the reality then researchers have a powerful tool for understanding such complexity in terms of the interplay of key variables or initial conditions and a set of simple rules. Further, if the construct validity of such initial conditions or key variables can be demonstrated then researchers have a powerful means of predicting what might happen over time.

Three immediate applications of simulations have been in the fields of educational change (Ridgway, 1998), inspections (Tymms, 1997) and school effectiveness (Tymms, 1996). In the first, Ridgway argues that the complexity of the change process might be best understood as a complex, emergent system (see also Fullan, 1999). In the second, Tymms exposes some major flaws in the inspection process. In the third, he indicates the limitations of linear (input and output) or multi-level modelling in understanding or explaining why schools are effective or why there is such a range of variation between and within schools. He puts forward the case for using simulations based on mathematical modelling to account for such diversity and variation between schools, as he argues in his provocative statement: ‘the world is too complicated for words’ (ibid.: 131) (of course, similarly, for qualitative researchers the world may be even too complicated for numbers!).

Tymms indicates the limitations of existing school effectiveness research that is based on linear premises, however sophisticated. Instead, pouring cold water on much present school effectiveness research, he argues (pp. 132–3) that ‘simulation models would suggest that even if it were possible to arrange for exactly the same classes to have exactly the same teacher for two years in the same classroom living through the same two years that the outcomes would not be the same’. For him, it is little surprise that school effectiveness research has failed to account effectively for variance between schools, because such research is based on the wrong principles. Rather, he argues, such variance is the natural outcome of the interplay of key—common—variables.

There are several potential concerns about and criticisms of simulations. To the charges that they artificially represent the world and that they are a reductio ad absurdum, it can be stated that researchers, like theorists, strive to construct the best fit with reality, to provide the most comprehensive explanation, and that the closer the analogy—the simulation—fits reality, the better (Tymms, 1996:130). That is an argument for refining rather than abandoning simulations. We only need to know key elements to be able to construct an abstraction; we do not need complete, fine-grain detail.

To the charges that a simulation can never tell us anything that we do not already know, that it is no better than the assumptions on which it is built, and that a computer can only do what it is programmed to do (rendering human agency and freedom insignificant), it can be stated that: (a) simulations can reveal behaviours that occur ‘behind the backs’ of social actors—there are social facts (Durkheim, 1956) and patterns; (b) simulations can tell us what we do not know (Simon, 1996)—we may know premises and starting points but not where they might lead or what they imply; (c) we do not need to know all the workings of the system to be able to explain it, only those parts that are essential for the model.

Other concerns can be voiced about simulations, for example:

• complexity and chaos theory, which underpin many mathematical simulations, might explain diverse, variable outcomes (as in school effectiveness research), but how do they enable developers to intervene to promote improvement, e.g. in schools? Explanation here is retrospective rather than prospective;
• how does one ascertain the key initial conditions to build into the simulation (i.e. construct validity), and how do simulations from these lead to prescriptions for practice?
• how acceptable is it to regard systems as the recurring iteration and reiteration of the same formula/model?
• in understanding chaotic complexity (in the scientific sense), how can researchers work back from this to identify the first principles or elements or initial conditions that are important? The complex outcomes might be due to the interaction of completely different sets of initial conditions. This is akin to Chomsky’s (1959) withering critique of Skinner’s behaviourism—it is impossible to infer a particular stimulus from an observation of behaviour; we cannot infer a cause from an observation or putative effect;
• simulations work out and assume only the interplay of initial conditions, thereby neglecting the introduction of additional factors ‘on the way’, i.e. the process is too deterministic;
• what is being argued here is common sense, viz. that the interaction of people produces unpredicted and unpredictable behaviour. That is also its greatest attraction—it celebrates agency;
• planned interventions might work at first but ultimately do not work (a reiteration, perhaps, of the Hawthorne effect); all we can predict is that we cannot predict;
• manipulating human variables is technicist;
• there is more to behaviour than the repeated iteration of the same mathematical model;
• whilst they may enable us to understand why there are variations in effects, simulations do not help us to establish causes or interventions;
• there will always be a world of difference between the real world and the simulated world other than at an unhelpfully simplistic level;
• as with other numerical approaches, simulations might combine refinement of process with crudity of concept (Ruddock, 1981:49);
• reducing the world to numbers, however sophisticated, is quite simply wrong-headed; the world is too complicated for numbers.

These criticisms are serious, and indicate that this emergent new field of research has much to do to gain legitimacy. This is not to dismiss this important and growing area; rather it is to seek its advance. These reservations—at conceptual and practical levels—do not argue against simulations but, rather, for their development and refinement. They promise much, and in areas of the sciences apart from education they have already yielded much of value. For further information on complexity theory and simulations we suggest that readers visit Internet websites such as:

http://www.santafe.edu/ (the website of the Santa Fe Institute—a major institute for the study of complexity theory);
http://www.brint.com/Systems.htm (a website that provides an index of material on complexity theory);
http://journals.wiley.com/1076-2787/tocs/ (the website of the journal Complexity);
http://life.csu.edu.au/vl_complex/all.html (a website that provides access to the listings on complexity theory on the World-Wide Web Virtual Library).

Further, simply by keying in ‘complexity theory’, ‘education simulations’, or ‘Santa Fe Institute’ to a search engine on the Internet the reader will be able to access a wealth of references and information about the topics in this section.

Fuzzy logic

Computer simulations can be extended to include the developing field of ‘fuzzy logic’. Here the researcher sets out to ascertain the extent to which a particular measure conforms to a semantic ideal (Fourali, 1997). Fuzzy logic recognizes that properties (e.g. fast, slow, tall, low, high, moderate, adequate, mature, developed, competent) have continuously varying values, and that we partition these values comparatively and arbitrarily into semantic categories or sections (e.g. on a rating scale). Within each category there is variation. Fuzzy logic enables us to gain a more precise measurement of the variance within and between these semantic categories; it recognizes that imprecision, rather than bivalence (either something is or is not the case), is a characteristic of many phenomena. Fuzzy logic opts for shades of greyness rather than black-or-white (Kosko, 1994:102)! In the field of education Fourali (1997) has shown how fuzzy logic is particularly useful in assessment. Fuzzy logic builds in feedback: systems constantly modify themselves in response to feedback, resonating with complexity theory. Kosko (1994:63) illustrates this with the example of a washing machine whose sensors adjust the machine to the weight of washing, the amount of dirt, the texture of the washing, etc. For a fuller analysis of the principles and practice of fuzzy logic see Smithson (1988), Cox (1994), Kosko (1994) and Fourali (1997).
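The central idea of continuously varying membership can be sketched very simply. The example below is our own illustration, not Fourali’s procedure, and the category boundaries are hypothetical: each semantic category is represented by a membership function returning a degree of membership between 0 and 1, so that a single assessment score can belong partly to several categories at once.

```python
# A minimal sketch of fuzzy membership: instead of forcing a mark into one
# bivalent category, each semantic category is a function of degree.

def triangular(a, b, c):
    """Build a triangular membership function rising from a, peaking at b,
    falling to c."""
    def mu(x):
        if x <= a or x >= c:
            return 0.0
        return (x - a) / (b - a) if x <= b else (c - x) / (c - b)
    return mu

# Hypothetical semantic categories for an assessment score out of 100.
categories = {
    "weak": triangular(0, 25, 50),
    "adequate": triangular(30, 50, 70),
    "competent": triangular(55, 75, 90),
    "excellent": triangular(80, 100, 120),
}

score = 68.0
for label, mu in categories.items():
    print(f"{label}: {mu(score):.2f}")
# A score of 68 is partly 'adequate' (0.10) and largely 'competent' (0.65):
# shades of grey rather than black-or-white.
```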

Readers wishing to research fuzzy logic in education will find a huge amount of material on the Internet, accessed by keying in ‘education fuzzy logic’ or ‘fuzzy logic’ to a search engine. Two useful websites are:

http://www.ang-physik.uni-kiel.de/~hoefi./fuzzy.www.english.html (a website that provides a world wide server about fuzzy logic);
http://www.fuzzytech.com/e_uni.htm (a website that provides other addresses for information).

Geographical Information Systems

The role of computer technology for educational research purposes has extended the boundaries of discipline-based research. An example of this is the use of Geographical Information Systems, which are being used in the health services as well as in education.

Educational policy frequently has geographical implications and dimensions, e.g. catchment areas, school closures, open enrolment and school choice, the distribution of resources and financial expenditure, the distribution of assessment scores and examination results. A Geographical Information System (GIS) is a computer-based system for capturing, storing, validating, analysing and displaying spatial data, both large scale and small scale, integrating several types of data from different sources (Worrall, 1990; Parsons, Chalkley and Jones, 1996). This is useful for teasing out the implications and outcomes of policy initiatives, for example: ‘What is the effect of parental choice on school catchments?’; ‘What is the spread of examination scores in a particular region?’; ‘How effective is the provision of secondary schools for a given population?’; ‘How can a transport system be made more effective for taking students to and from school?’; ‘What is the evidence for the creation of “magnet” and “sink” schools in a particular city?’. Examples of the data presented here are given in Boxes 22.1 and 22.2.
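At the heart of such questions lies the spatial query: relating point data (e.g. pupils’ home postcodes) to area data (e.g. a school’s catchment boundary). The following is a minimal, hypothetical sketch using the shapely geometry library, with invented coordinates; a full GIS adds data capture, validation and map display on top of queries of this kind.

```python
from shapely.geometry import Point, Polygon

# Hypothetical catchment boundary and home locations (invented coordinates),
# illustrating the basic point-in-polygon query that underlies GIS analyses
# of catchments, school choice and pupil movements.
catchment = Polygon([(0, 0), (4, 0), (4, 3), (0, 3)])

homes = {
    "pupil_a": Point(1.0, 1.0),   # inside the catchment
    "pupil_b": Point(5.0, 2.0),   # outside the catchment
    "pupil_c": Point(3.5, 2.5),   # inside, near the boundary
}

for pupil, location in homes.items():
    print(pupil, "lives inside the catchment:", catchment.contains(location))
```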

Clearly the political sensitivity and significance of these kinds of data are immense, indicating how research can inform policy-making and its effects very directly. Parsons, Chalkley and Jones (1996) provide a straightforward, fully-referenced introduction to this field of research in education, and they present case studies of catchment areas and examination performance, the redistribution of school catchments, and the pattern of movements in catchments.1

Readers wishing to research Geographical Information Systems on the Internet can access several sites by keying in ‘education research Geographical Information Systems’ on a search engine for the Internet or by visiting the following website:

http://geo.ifaran.ru/resources/giswww.html (a GIS World-Wide Web resource list).

Needs analysis

The notion of needs analysis (also called needs assessment) has existed in the world of education for over a decade, coming from social welfare (e.g. housing, employment, crime prevention and poverty reduction programmes), health programmes and social policy research. Its pedigree in education is rooted in evaluation studies and research (Suarez, 1994). Needs analysis can be used, for example, to:

• identify students’ instructional needs;
• identify programme provision needs (and gaps in present provision);
• ascertain weaknesses in students’ achievements or provision;
• provide information on in-service needs;
• determine where deficits exist so that they can be addressed;
• identify areas for expenditure and educational development.

Box 22.1
Geographical Information Systems in secondary schools
Source Parsons, Chalkley and Jones, 1996


It can be argued that needs assessment has similarities to the cause-and-effect quality models used in industry (Morrison, 1998), where the intention is to find the ‘real causes’ of problems rather than, for example, putative sources, so that causes rather than symptoms of problems, deficits and weaknesses can be addressed in subsequent planning.

Much hangs on the definition of ‘needs’ that is adopted. For example, a need can be defined in several ways (Scriven and Roth, 1978; Lund and McGechan, 1981; Stufflebeam et al., 1985; Rossi and Freeman, 1993; Suarez, 1994):

• a discrepancy or underachievement (a difference between what is and what should be the case);
• wants and preferences (e.g. for future planning), reflecting values;
• anticipated requirements for the future;
• anticipated problems for the future;
• a deficit (where the absence of a feature under review is harmful).

Here the concept of a need swings between, on the one hand, deficit or shortfall, and, on the other hand, future planning and programming.

The first is essentially reactive—a measure of achievement (or underachievement) which is useful in accountability studies—whilst the second is more proactive and linked to future developments: the first concerns remediation whilst the second concerns forward planning and forecasting (for example by using trend analysis, discussed in Chapter 8). Both, however, concern the process of diagnosis for subsequent planning; both are strongly in the vein of evaluative research (discussed in Chapter 1); and both are concerned with gathering information on the problem, for problem and need definition. Needs analysis is research designed to render decision-making informed rather than conjectural and speculative.

There are several components of a needs analysis. In relation to the operationalization of the term it is necessary to address key issues:

• the definition of need that is being used, e.g. the operationalization of the problem or need;
• the nature of the actual problem or need;
• the indicators of the need or problem;
• the size of the need or problem;
• the type of need or problem;
• the scope, complexity and range of the problem or need;
• the sub-elements of the need or problem;
• the priorities of aspects of the need;
• the severity or intensity of the need or problem;
• the causes of the need;
• forecasting needs;
• the consequences if the need is not addressed;
• the consequences of addressing the need.

In terms of the population concerned it is necessary to address several factors:

• the target population for an intervention;
• the number of people affected or concerned (e.g. the proportion of a total population);
• the location of the need or problem;
• the clarification of whose problem it is;
• the density and distribution of the problem;
• the incidence of the problem or need;
• how widespread is the need.

In terms of a proposed intervention there are several important factors to address:

• the identification of the exact conditions, problems and needs that the intervention is designed to address;
• the appropriateness of the programme intended to address the need;
• the purposes of the proposed intervention;
• the boundaries of the target population for an intervention (i.e. the criteria to be used for defining the target population);
• the present and estimated future size of the target population;
• the feasibility of the proposed intervention;
• responsibility for interventions (which people have to take action).

The intention is to ensure that interventions are appropriately matched to perceived problems or needs, indeed that competing and alternative proposed interventions are evaluated.

Box 22.2 Location of home postcodes using Geographical Information Systems
Source: Parsons, Chalkley and Jones, 1996

The data required for needs analysis can be derived from several sources, for example:

1 quantitative data from: structured surveys; 'key person' (informants) surveys; structured interviews (Rossi and Freeman, 1993); data from official public sources and documents (e.g. census returns, test and examination data, and other surveys); simulations and prediction analyses; test, assessment and examination data; application, attendance, retention, withdrawal and success rates;
2 qualitative data from: semi-structured interviews with individuals and groups; focus groups; case studies; critical incidents and events; public meetings; nominal group technique and Delphi techniques (Morrison, 1993); Ishikawa cause-and-effect diagrams (Morrison, 1998).

Clearly, the success of needs analysis could depend on the careful and appropriate sampling and targeting of the parties concerned. Rossi and Freeman (1993:84) suggest that qualitative data are useful for determining the nature of the need, whilst quantitative data are necessary for determining the extent of the need. The issue of sampling and targeting is not unproblematic, for issues of inclusion in, and exclusion from, the sample or target population might be highly sensitive, for example: a needs analysis of children at risk of abuse or poor parenting, or the criteria to be used for defining students who are developmentally delayed.

Further, it is possible that a need or problem will be reconceptualized as more data are obtained, the sample is widened, the number of stakeholders increases, and further evaluation studies are undertaken. For example, a problem which might be perceived initially as the incidence of noisy students in a school corridor might turn out to be a manifestation of a deeper set of problems (e.g. poor timetabling that causes all students to have to move around the school simultaneously; poor layout of corridors which leads to inevitable congestion; poor quality facilities that have noisy floors instead of carpeted floors; and poor organizational matters that result in many students having to move rooms or walk great distances).


A needs analysis, then, identifies the problem or need and then proceeds to identify the aims, content, implementation, target population and outcome of an intervention. In this respect it is akin to planning action research. Suarez (1994) suggests that needs analysis for the purpose of future planning and development will tend to focus on aims and goals, whilst needs analysis that is undertaken to identify discrepancies will tend to focus on content, implementation and outcome.

It is important, then, for the researcher to be clear on the purposes of the needs analysis being undertaken, for this will determine the focus, methodology and outcome of the assessment. Consequent to this, it is necessary for the needs analysis to be clear on its remit, focus, sampling, methodology, data collection, and prescription for intervention.

The issue of prioritization of the problems, needs and aspects of the intervention is also critical, particularly because budgetary constraints will affect the conduct of the needs assessment and its subsequent recommendations. Witkin (1984) identifies several quantitative methods for identifying priorities (including, for example, ratings, and the amount of discrepancy between actual and intended practices or incidence). Lund and McGechan (1981) suggest that the process of prioritization will need to focus on such issues as: (a) the consequences of not meeting the need; (b) the number of people affected; (c) the meeting of the need by the parties identified (e.g. whether the problem is solely a matter for educationists or whether it involves other service sectors); (d) the criticality and severity of the needs; (e) the sequencing of the need (the order in which the needs must be addressed, and whether the addressing of some needs logically and empirically precedes the addressing of others); (f) the resource implications of meeting the needs (e.g. people, financial and budgetary, time, materials and equipment, administrative support); (g) the scope of the outcome and the utility of the intervention.
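By way of a hedged illustration of such quantitative prioritization, the following sketch (in Python, with invented needs, ratings and weights; the scoring rule is illustrative and is not Witkin's own formula) combines the discrepancy between intended and actual practice with severity and the proportion of people affected in order to rank needs:

```python
# A hypothetical sketch of discrepancy-based prioritization: rate what is
# and what should be, weight the gap by severity and by the proportion of
# people affected, then rank. All figures are invented for illustration.

# (need, actual rating, intended rating, severity 1-5, proportion affected)
needs = [
    ("reading support",     2.1, 4.5, 5, 0.30),
    ("in-service training", 3.0, 4.0, 3, 0.80),
    ("library provision",   3.8, 4.2, 2, 1.00),
]

def priority(actual, intended, severity, affected):
    """Weight the discrepancy (intended - actual) by severity and reach."""
    return (intended - actual) * severity * affected

ranked = sorted(needs, key=lambda n: priority(*n[1:]), reverse=True)
for name, *rest in ranked:
    print(f"{name}: priority score {priority(*rest):.2f}")
```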

Suarez (1994) underlines the importance in needs assessment of planning the dissemination of the research findings and of making that dissemination extensive. The critical factor of a needs assessment, like many evaluation studies, is the utility of the findings. The outcomes of needs assessment should feed into decision-making and policy-formation. Hence all stakeholders need to be involved in and informed of the research.

In planning a needs analysis, then, four main steps can be followed:

Step 1 Decide the purposes of the needs analysis and the definitions of needs that are to be used.
Step 2 Identify the focus of the needs analysis.
Step 3 Decide the methodology, sampling, instrumentation, data collection and analysis procedures and criteria to be used to judge the size, scope, extent, severity etc. of the need.
Step 4 Decide the reporting and dissemination of the results.

It can be seen that the planning of a needs analysis follows a typical plan of an evaluation or of evaluative research. For an extended example of a needs analysis see Kshir's (1999) analysis of in-service needs for staff development and curriculum change. Internet material on needs analysis in education can be found by keying in 'education needs analysis' or 'education needs evaluation' on a search engine. However, it must be stated that the overwhelming amount of Internet material here concerns business and financial needs assessment, management training needs, and IT needs assessment, though there are isolated instances of educational materials, often of an advertising nature and often concerning special educational needs. Two examples of educational entries can be found at:

http://www.metagifted.org/ (the website of the Metagifted Organisation);

http://www.metagifted.org/topics/MultipleIntelligences/ (the section of the Metagifted Organisation that focuses on the work of Howard Gardner).

The amount of research data on the Internet about needs assessment in education, at the time of writing, appears limited.


Evidence-based education

This is a term that has been coined to cover the growth in interest in particular types of data and research in education. The need for practice and decision-making to be informed by the best evidence available is undeniable. In evidence-based education the evidence in question is of a particular nature or type, viz. that acquired from well-controlled experimental trials which indicate the effects and effect sizes of an intervention. In this respect the move towards evidence-based education resonates with the use of meta-analysis, which was discussed in Chapter 12, and with the importance of examining effect size, which was discussed in Chapters 10 and 12. More specifically, it is suggested that the evidence is strongest when it derives from randomized controlled trials (RCTs).

The roots of evidence-based practice lie in medicine, where the advocacy by Cochrane (1972) of randomized controlled trials, together with their systematic review and documentation, led to the foundation of the Cochrane Collaboration (Maynard and Chalmers, 1997), which is now worldwide. The careful, quantitatively based research studies that can contribute to the accretion of an evidential base are seen as a powerful counter to the often untried and under-tested schemes that are injected into practice.

More recently evidence-based education has entered the worlds of social policy, social work (MacDonald, 1997) and education (Fitz-Gibbon, 1997). At the forefront of educational research in this area are Fitz-Gibbon (1996; 1997; 1999) and Tymms (1996), who, at the Curriculum, Evaluation and Management Centre at the University of Durham, have established one of the world's largest monitoring centres in education. Fitz-Gibbon's work is critical of multilevel modelling and, instead, suggests how indicator systems can be used with experimental methods to provide clear evidence of causality and a ready answer to her own question: how do we know what works? (Fitz-Gibbon, 1999:33).

Echoing Anderson and Biddle (1991), Fitz-Gibbon suggests that policy makers shun evidence in the development of policy and that practitioners, in the hurly-burly of everyday activity, call upon tacit knowledge rather than the knowledge which is derived from RCTs. However, in a compelling argument (1997:35–6), she suggests that evidence-based approaches are necessary in order to: (a) challenge the imposition of unproven practices; (b) solve problems and avoid harmful procedures; (c) create improvement that leads to more effective learning. Further, such evidence, she contends, should examine effect sizes rather than statistical significance.
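To illustrate what reporting an effect size involves, the following sketch computes Cohen's d, one common effect size measure (the data are invented, and this is not necessarily the statistic used in the studies cited), for the outcomes of an experimental and a control group:

```python
# A minimal sketch of an effect size (Cohen's d) for a two-group trial.
# Scores are invented; d expresses the group difference in pooled
# standard-deviation units, independent of sample size.
from statistics import mean, stdev

experimental = [68, 72, 75, 70, 74, 69, 73]
control      = [64, 66, 70, 65, 68, 63, 67]

def cohens_d(group_a, group_b):
    """Difference in means divided by the pooled standard deviation."""
    na, nb = len(group_a), len(group_b)
    sa, sb = stdev(group_a), stdev(group_b)
    pooled = (((na - 1) * sa**2 + (nb - 1) * sb**2) / (na + nb - 2)) ** 0.5
    return (mean(group_a) - mean(group_b)) / pooled

print(f"effect size d = {cohens_d(experimental, control):.2f}")
```

A d of, say, 0.5 states that the average experimental score exceeds the control mean by half a pooled standard deviation, a claim about the magnitude of an effect which a significance test alone does not make.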

Whilst the nature of evidence in evidence-based education might be contested by researchers whose sympathies (for whatever reason) lie outside randomized controlled trials, the message from Fitz-Gibbon will not go away: the educational community needs evidence on which to base its judgements and actions. The development of indicator systems worldwide attests to the importance of this, be it through assessment and examination data, inspection findings, national and international comparisons of achievement, or target setting. Rather than being a shot in the dark, evidence-based education suggests that policy formation should be informed, and policy decision-making should be based on the best information to date rather than on hunch, ideology or political will. It is bordering on the unethical to implement untried and untested recommendations in educational practice, just as it is unethical to use untested products and procedures on hospital patients without their consent.

The Internet material on evidence-based education is largely concerned with medical education at the time of writing, and this can be accessed through the keywords 'education evidence based' on a search engine. However, the following website provides some useful material for researchers:

http://www.dur.ac.uk/edeuk/ (this contains a manifesto for evidence-based education and a listing of other websites about the topic).


We close this chapter and end the book by returning to the earlier Chinese nostrum that doing is harder than learning, i.e.: to learn is easy; to put into practice is hard. To this extent we echo the opening words of Zen Master Sun Yat Sen that preface this fifth edition: 'To understand is hard; once one understands, action is easy.' When he made this comment he was reacting to a key issue: that research is necessary to inform practice, and in doing so needs to be rigorous, circumspect and self-aware. This book has tried to make the learning easier and the doing more informed.


Notes

1 THE NATURE OF INQUIRY

1 Parts of this chapter are taken from Cohen, L. and Manion, L. (1981) Perspectives on Classrooms and Schools with permission from Holt, Rinehart & Winston.

2 We are not here recommending, nor would we wish to encourage, exclusive dependence on rationally derived and scientifically provable knowledge for the conduct of education—even if this were possible. There is a rich fund of traditional and cultural wisdom in teaching (as in other spheres of life) which we would ignore to our detriment. What we are suggesting, however, is that total dependence on the latter has tended in the past to lead to an impasse: and that for further development and greater understanding to be achieved education must needs resort to the methods of science and research.

3 Primarily associated with the Vienna Circle of the 1920s whose most famous members included Schlick, Carnap, Neurath and Waismann.

4 A classic statement opposing this particular view of science is that of Kuhn, T.S. (1962) The Structure of Scientific Revolutions, Chicago: University of Chicago Press. Kuhn's book, acknowledged as an intellectual tour de force, makes the point that science is not the systematic accumulation of knowledge as presented in text books; that it is a far less rational exercise than generally imagined. In effect, it is 'a series of peaceful interludes punctuated by intellectually violent revolutions…in each of which one conceptual world view is replaced by another.'

5 For a straightforward overview of the discussions here see Chalmers, A.F. (1982) What Is This Thing Called Science? (second edition), Milton Keynes: Open University Press.

6 For a later study that examines the influence of science and objectivity on the secularization of consciousness, see the same author's Where the Wasteland Ends, London: Faber & Faber, 1972.

7 The formulation of scientific method outlined earlier has come in for strong and sustained criticism. Mishler, for example, describes it as a 'storybook image of science', out of tune with the actual practices of working scientists who turn out to resemble craftpersons rather than logicians. By craftpersons, Mishler is at pains to stress that competence depends upon 'apprenticeship training, continued practice and experienced-based, contextual knowledge of the specific methods applicable to a phenomenon of interest rather than an abstract "logic of discovery" and application of formal "rules"'. The knowledge base of scientific research, Mishler contends, is largely tacit and unexplicated; moreover, scientists learn it through a process of socialization into a 'particular form of life'. The discovery, testing and validation of findings is embedded in cultural and linguistic practices and experimental scientists proceed in pragmatic ways, learning from their errors and failures, adapting procedures to their local contexts, making decisions on the basis of their accumulated experiences. See, for example, Mishler, E.G. (1990) Validation in inquiry-guided research: the role of exemplars in narrative studies, Harvard Educational Review, 60 (4): 415–42.

8 See, for example, Rogers, C.R. (1969) Freedomto Learn, Columbus, OH: Merrill Pub. Co.; andalso Rogers, C.R. and Stevens, B. (1967) Personto Person: the Problem of Being Human, Lon-don: Souvenir Press.

9 Investigating social episodes involves analysing the accounts of what is happening from the points of view of the actors and the participant spectator(s)/investigator(s). This is said to yield three main kinds of interlocking material: images of the self and others, definitions of situations, and rules for the proper development of the action. See Harré, R. (1976) The constructive role of models, in L. Collins (ed.), The Use of Models in the Social Sciences, London: Tavistock Publications.

10 It may seem paradoxical to some readers that, although we have just described interpretive theories as anti-positivist, they are nevertheless conventionally regarded as 'scientific' (and hence part of 'social science') in that they are concerned ultimately with describing and explaining human behaviour by means of methods that are in their own way every bit as rigorous as the ones used in positivist research.

11 It is not our intention here to outline philosophical challenges to paradigm theory enunciated by coherence theorists who argue for the epistemological unity of educational research. One version of unity theory is succinctly articulated by Walker and Evers (1988:28–36).

12 See also Verma, G.K. and Beard, R.M. (1981) What is Educational Research? Aldershot: Gower, for further information on the nature of educational research and also a historical perspective on the subject.

2 THE ETHICS OF EDUCATIONAL AND SOCIAL RESEARCH

1 For example, American Psychological Association (1982); American Sociological Association (1971); British Sociological Association (1982); Social Research Association (1986); and the British Educational Research Association (1989). Comparable developments may be found in other fields of endeavour. For an examination of key ethical issues in medicine, business and journalism together with reviews of common ethical themes across these areas, see Serafini, A. (ed.) (1989) Ethics and Social Concern, New York: Paragon House. The book also contains an account of principal ethical theories from Socrates to R.M. Hare.

2 US Dept of Health, Education and Welfare, Public Health Service and National Institute of Health (1971) The Institutional Guide to D.H.E.W. Policy on Protecting Human Subjects, DHEW Publication (NIH): December 2, 72–102.

3 See also Reynolds, P.D. (1979) Ethical Dilemmas and Social Science Research, San Francisco: Jossey-Bass.

4 As regards judging researchers' behaviour, perhaps the only area of educational research where the term ethical absolute can be unequivocally applied and where subsequent judgement is unquestionable is that concerning researchers' relationship with their data. Should they choose to abuse their data for whatever reason, the behaviour is categorically wrong; no place here for moral relativism. For once, a clear dichotomy is relevant: if there is such a thing as clearly ethical behaviour, such abuse is clearly unethical. It can take the form of, first, falsifying data to support a preconceived, often favoured, hypothesis; second, manipulating data, often statistically, for the same reason (or manipulating techniques used—deliberately including leading questions, for example); third, using data selectively, that is, ignoring or excluding the bits that don't fit one's hypothesis; and fourth, going beyond the data, in other words, arriving at conclusions not warranted by them (or over-interpreting them). But even malpractice as serious as these examples cannot be controlled by fiat: ethical injunctions would hardly be appropriate in this context, let alone enforceable. The only answer (in the absence of professional monitoring) is for the researcher to have a moral code that is 'rationally derived and intelligently applied', to use the words of the philosopher, R.S. Peters, and to be guided by it consistently. Moral competence, like other competencies, can be learned. One way of acquiring it is to bring interrogative reflection to bear on one's own code and practice, e.g. did I provide suitable feedback, in the right amounts, to the right audiences, at the right time? In sum, ethical behaviour depends on the concurrence of ethical thinking which in turn is based on fundamentally thought-out principles. Readers wishing to take the subject of data abuse further should read Peter Medawar's (1991) elegant and amusing essay, 'Scientific fraud', in D. Pike (ed.) The Threat and the Glory: Reflections on Science and Scientists, Oxford: Oxford University Press; and also Broad, W. and Wade, N. (1983) Betrayers of Truth: Fraud and Deceit in the Halls of Science, New York: Century.

5 We would see the term 'a sense of rightness' as approximately equivalent to the word 'conscience' as used in the religious tradition, or to Carl Rogers' term 'internal locus of evaluation' as used in a humanistic context. Some writers (e.g. Benstead and Constantine, 1998) distinguish between acquired conscience and true conscience. The former conforms to social ideas as to what is right and wrong and is acquired through social conditioning; the latter, true conscience, is a latent, innate 'sense of rightness' made manifest by heightened awareness, the consequence of which is that people know the difference between right and wrong for themselves. Therefore, as awareness is heightened, as sensitivity is refined, so conscience develops. Ethical problems can arise through lack of such sensitivity, and, a fortiori, from the ego taking over the conscience and using it for its own ends. Indeed, the competing demands of ego and conscience are often at the heart of ethical or moral dilemmas. The reader can tease out for himself or herself the implications of all this for educational research. It may be, for example, that a code of conduct ultimately becomes unnecessary: a researcher will know intuitively whether a course of action is appropriate or not.

6 This idea of a personal code of practice may be complemented by a distinctive view from the east. For eastern teachers, 'right doing' is a consequence of 'right being'. As Guy Claxton (1981) says: 'what makes an action good is the quality of the doer, not the objective nature of the act'. This echoes Pirsig's (1976:318) famous view in his Zen and the Art of Motorcycle Maintenance that to paint a perfect painting is easy: first one makes oneself perfect and then one just paints naturally.

7 Readers seeking guidance on this matter are referred to Reynolds (1979), where the author has assembled a composite code of ethics based on statements appearing in twenty-four codes related to the conduct of social science research in the United States. The seventy-eight statements listed by him cover general issues related to the code of ethics: the decision to conduct the research; the actual conduct of the research; informed consent; protection of rights and welfare of participants; deception; confidentiality and anonymity; benefits to participants; effects on aggregates or communities; and the interpretation and reporting of the results of the research. The composite code is reprinted in Frankfort-Nachmias and Nachmias (1992) and a selection of items from it may be found in Box 2.9. As we pointed out in the text, codes of practice are not a universal panacea, in spite of our advocacy, and their efficacy will vary with method and context. Some researchers, for example, have reported difficulties in working with codes of practice when doing field work. For the appropriate references to these cases, see Burgess, R.G. (1989b) Grey areas: ethical dilemmas in educational ethnography, in R.G. Burgess (ed.) (1989a) The Ethics of Educational Research, Lewes: Falmer Press.

3 RESEARCH DESIGN ISSUES: PLANNING RESEARCH

1 For a discussion of the nature, strengths and weaknesses of models see Morrison (1993:37–8). He suggests that, whilst models usefully reduce the world to manageable proportions, simplifying for the sake of clarity, care has to be taken not to oversimplify complexity to the point of reductionism ad absurdum.

4 SAMPLING

1 This table is also reproduced in Dunham, R.B. and Smith, F.J. (1979) Organizational Surveys: an Internal Assessment of Organizational Health, Glenview, Ill.: Scott, Foresman and Co., 68.

5 VALIDITY AND RELIABILITY

1 For a critique of a survey, from conceptualization to reporting, see Morrison (1997).

7 HISTORICAL RESEARCH

1 See also the opening chapters in Gardiner, P. (1961) The Nature of Historical Explanation, Oxford: Oxford University Press, reprinted 1978.

2 By contrast, the historian of the modern period, i.e. the nineteenth and twentieth centuries, is more often faced in the initial stages with the problem of selecting from too much material, both at the stage of analysis and writing. Here the two most common criteria for such selection are (1) the degree of significance to be attached to data, and (2) the extent to which a specific detail may be considered typical of the whole.

3 However, historians themselves usually reject such a direct application of their work and rarely indulge in it on the grounds that no two events or contextual circumstances, separated geographically and temporally, can possibly be equated. As the popular sayings go, 'History never repeats itself' and so, 'The only thing we can learn from History is that we can learn nothing from History.'

4 The status of the history of education as an academic discipline is well summarized and illustrated in Sutherland, G. (1969) The study of the history of education, History, 54 (180).

5 See also the Social Science Research Council's (1971) Research in Economic and Social History, London: Heinemann, Chapters 2 and 3.

6 Holsti, O.R. (1968) Content analysis, in G. Lindzey and E. Aronson (eds), The Handbook of Social Psychology. Vol. 2: Research Methods, Reading, MA: Addison-Wesley. For a detailed account of the methods and problems involved in establishing the reliability of the content analysis of written data, see: Everett, M. (1984) The Scottish comprehensive school: its function and the roles of its teachers with special reference to the opinions of pupils and student teachers, Unpublished Ph.D. dissertation, School of Education, University of Durham.

7 Thomas, W.I. and Znaniecki, F. (1918) The Polish Peasant in Europe and America, Chicago: University of Chicago Press. For a fuller discussion of the monumental work of Thomas and Znaniecki, the reader is referred to Plummer, K. (1983) Documents of Life: an Introduction to the Problems and Literature of a Humanistic Method, London: George Allen & Unwin, especially Chapter 3, The making of a method; and to Madge, J. (1963) The Origin of Scientific Sociology, London: Tavistock. For a critique of Thomas and Znaniecki, see Riley, M.W. (1963) Sociological Research: a Case Approach, New York: Harcourt, Brace & World, Inc.

8 Sikes, P., Measor, L. and Woods, P. (1985) Teacher Careers, Lewes: Falmer Press; see also: Acker, S. (1989) Teachers, Gender and Careers, Lewes: Falmer Press; Blease, D. and Cohen, L. (1990) Coping with Computers: an Ethnographic Study in Primary Classrooms, London: Paul Chapman Publishers; Evetts, J. (1990) Women in Primary Teaching, London: Unwin Hyman; Evetts, J. (1991) The experience of secondary headship selection: continuity and change, Educational Studies, 17 (3), 285–94; Goodson, I. (1990) The Making of Curriculum, Lewes: Falmer Press; Smith, L.M. (1987) Kensington Revisited, Lewes: Falmer Press; Goodson, I. and Walker, R. (1988) Putting life into educational research, in R.R. Sherman and R.B. Webb (eds) Qualitative Research in Education: Focus and Methods, Lewes: Falmer Press; Sikes, P. and Troyna, B. (1991) True stories: a case study in the use of life histories in teacher education, Educational Review, 43 (1), 3–16; Winkley, D. (1995) Diplomats and Detectives: LEA Advisers and Work, London: Robert Royce.

8 SURVEYS, LONGITUDINAL, CROSS-SECTIONAL AND TREND STUDIES

1 There are several examples of surveys, including: Borg, M.G. (1998) Secondary school teachers' perceptions of pupils' undesirable behaviours, British Journal of Educational Psychology, 68, 67–79; Boulton, M.J. (1997) Teachers' views on bullying: definitions, attitudes and abilities to cope, British Journal of Educational Psychology, 67, 223–33; Cline, T. and Ertubney, C. (1997) The impact of gender on primary teachers' evaluations of children's difficulties in school, British Journal of Educational Psychology, 67, 447–56; Dosanjh, J.S. and Ghuman, P.A.S. (1997) Asian parents and English education—20 years on: a study of two generations, Educational Studies, 23 (3), 459–472; Foskett, N.H. and Hesketh, A.J. (1997) Constructing choice in continuous and parallel markets: institutional and school leavers' responses to the new post-16 marketplace, Oxford Review of Education, 23 (3), 299–319; Gallagher, T., McEwen, A. and Knip, D. (1997) Science education policy: a survey of the participation of sixth-form pupils in science and the subjects over a 10-year period, 1985–95, Research Papers in Education, 12 (2), 121–42; Hall, K. and Nuttall, W. (1999) The relative importance of class size to infant teachers in England, British Educational Research Journal, 25 (2), 245–58; Jules, V. and Kutnick, P. (1997) Student perceptions of a good teacher: the gender perspective, British Journal of Educational Psychology, 67, 497–511; Millan, R., Gallagher, M. and Ellis, R. (1993) Surveying adolescent worries: development of the 'Things I Worry About' scale, Pastoral Care in Education, 11 (1), 43–57; Papasolomoutos, C. and Christie, T. (1998) Using national surveys: a review of secondary analyses with special reference to schools, Educational Research, 40 (3), 295–310; Rigby, K. (1999) Peer victimisation at school and the health of secondary school students, British Journal of Educational Psychology, 69, 95–104; Strand, S. (1999) Ethnic group, sex and economic disadvantage: associations with pupils' educational progress from Baseline to the end of Key Stage 1, British Educational Research Journal, 25 (2), 179–202; Tatar, M. (1998) Teachers as significant others: gender differences in secondary school pupils' perceptions, British Journal of Educational Psychology, 68, 255–68; Terry, A.A. (1998) Teachers as targets of bullying by their pupils: a study to investigate incidence, British Journal of Educational Psychology, 68, 255–68.

Examples of different kinds of survey studies are as follows: (a) Francis's (1992) 'true cohort' study of patterns of reading development, following a group of 54 young children for two years at six-monthly intervals; (b) Blatchford's 1992 cohort/cross-sectional study of 133–175 children (two samples) and their attitudes to work at 11 years of age; (c) a large-scale/cross-sectional study by Munn, Johnstone and Holligan (1990) into pupils' perceptions of effective disciplinarians, with a sample size of 543; (d) a trend/prediction study of school building requirements by a government department (Department of Education and Science, 1977), identifying building and improvement needs based on estimated pupil populations from births during the decade 1976–86; (e) a survey study by Belson (1975) of 1,425 teenage boys' theft behaviour; (f) a survey by Hannan and Newby (1992) of 787 student teachers (with a 46 per cent response rate) and their views on government proposals to increase the amount of time spent in schools during the training period.


2 For a critique of a survey conducted by course leaders see Morrison (1997).

3 Examples of longitudinal and cross-sectional studies include: Busato, V.V., Prins, F.J., Elshout, J.J. and Hamaker, C. (1998) Learning styles: a cross-sectional and longitudinal study in higher education, British Journal of Educational Psychology, 68, 427–41; Davenport, E.C. Jr, Davison, M.L., Kuang, H., Ding, S., Kin, S.-K. and Kwak, N. (1998) High school mathematics course-taking by gender and ethnicity, American Educational Research Journal, 35 (3), 497–514; Davies, J. and Brember, I. (1997) Monitoring reading standards in year 6: a 7-year cross-sectional study, British Educational Research Journal, 23 (5), 615–22; Davies, J. and Brember, I. (1998) Standards in reading at key stage 1—a cross-sectional study, Educational Research, 40 (2), 153–60; Galton, M., Hargreaves, L., Comber, C., Wall, D. and Pell, T. (1999) Changes in patterns in teacher interaction in primary classrooms, 1976–1996, British Educational Research Journal, 25 (1), 23–37; Marsh, H.W. and Yeung, A.S. (1998) Longitudinal structural equation models of academic self-concept and achievement: gender differences in the development of math and English constructs, American Educational Research Journal, 35 (4), 705–38; Noack, P. (1998) School achievement and adolescents' interactions with their fathers, mothers, and friends, European Journal of Psychology of Education, 13 (4), 503–13; Preisler, G.M. and Ahlström, M. (1997) Sign language for hard of hearing children—a hindrance or a benefit for their development? European Journal of Psychology of Education, 12 (4), 465–77.

4 For an account of the National Child Development Study of a cohort of 15,000 children (now adults), see Fogelman, K. (ed.) (1983) Growing Up in Great Britain: Papers from the National Child Development Study, London: Macmillan. The third national cohort study (the 1970 cohort) is a detailed account of the health and behaviour of Britain's 5-year-olds. See Butler, N.R. and Golding, J. (1986) From Birth to Five, Oxford: Pergamon Press.

5 For further information on event history analysis and hazard rates we refer readers to Allison, 1984; Hakim, 1987; Plewis, 1985; von Eye, 1990; Rose and Sullivan, 1993.

9 CASE STUDIES

1 King, R. (1979) All Things Bright and Beautiful? Chichester: John Wiley; King's study as a whole is based upon unstructured observations in infant classrooms. For a more structured inquiry into the activities of young children, see Dunn, S. and Morgan, V. (1987) Nursery and infant school play patterns: sex-related differences, British Educational Research Journal, 13 (3), 271–81. An earlier study that raised questions about the so-called progressive practices in primary education is provided by Sharp, R. and Green, A. (1975) Education and Social Control: a Study in Progressive Primary Education, London: Routledge & Kegan Paul.

2 For a text dealing with techniques of observation, see Croll, P. (1985) Systematic Observation, Lewes: Falmer Press. For analysing case records (indexing, structuring, restructuring, sequencing, classification and cross-classification, coordinating and reducing) see Bromley, D.B. (1986) The Case Study Method in Psychology and Related Disciplines, Chichester: John Wiley.

3 For a British study employing ethnographic techniques and looking, inter alia, at the leadership of the head teacher, see Burgess, R.G. (1983) Experiencing Comprehensive Education, London: Methuen. For other case studies of schools the reader is referred to Ball, S.J. (1981) Beachside Comprehensive, Cambridge: Cambridge University Press; Ball, S.J. (1985) School politics, teachers' careers and educational change: a case study of becoming a comprehensive school, in L. Barton and S. Walker (eds) Education and Social Change, Beckenham: Croom Helm; Beynon, J. (1985a) Career histories in a comprehensive school, in S.J. Ball and I.F. Goodson (eds) Teachers' Lives and Careers, Lewes: Falmer Press; Beynon, J. (1985b) Initial Encounters in the Secondary School, Lewes: Falmer Press; and Davies, L. (1984) Pupil Power: Deviance and Gender in School, Lewes: Falmer Press.

4 For further examples of case studies see: Bates, I. and Dutson, J. (1995) A Bermuda triangle? A case study of the disappearance of competence-based vocational training policy in the context of practice, British Journal of Education and Work, 8 (2), 41–59; Jacklin, A. and Lacey, C. (1997) Gender integration in the infant classroom: a case study, British Educational Research Journal, 23 (5), 623–40; Woods, P. (1993) Managing marginality: teacher development through grounded life history, British Educational Research Journal, 19 (5), 447–88.

11 EX POST FACTO RESEARCH

1 In Chapters 11 and 12 we adopt the symbols and conventions used in Campbell, D.T. and Stanley, J.C. (1963) Experimental and quasi-experimental designs for research on teaching, in N.L. Gage (ed.) Handbook of Research on Teaching, Chicago: Rand McNally. These are presented fully in Chapter 12.

2 For further information on logical fallacies, see Cohen, M.R. and Nagel, E. (1961) An Introduction to Logic and Scientific Method, London: Routledge and Kegan Paul. The example of the post hoc, ergo propter hoc fallacy given by the authors concerns sleeplessness, which may follow drinking coffee, but sleeplessness may not occur because coffee was drunk.

3 Stables's ex post facto design separated more than 2,300 pupils by type of school (mixed or single-sex), and then compared their perceptions of the importance of all their school subjects by means of a specially designed questionnaire. At the same time, participants were given an 'Attitudes to Physics, Chemistry and Biology' scale consisting of 64 statements to which they responded on a continuum ranging from 'strongly agree' to 'strongly disagree'. Stables's results showed that boys' and girls' attitudes in mixed schools were more strongly polarized than in single-sex schools. Drama, Biology and Languages were significantly more highly rated by boys in single-sex schools than by their fellows in mixed establishments. On the other hand, boys in mixed schools recorded greater support for Physics and Physical Sciences than boys in single-sex schools. As far as girls were concerned, Physics was better liked in single-sex schools than in mixed. Overall, the effect of being educated in a single-sex or a mixed school seemed to have greater effect on pupils' feelings towards Sciences, Modern Languages, Craft, Drama and Music. The most consistent finding in Stables's investigation was in connection with the 'Attitude to Physics, Chemistry and Biology' scale. Stables reports, 'On every section of the scale the sex difference was greater among co-educated pupils.' He concludes, 'The danger is that subject interest and specialisation may be guided to a greater extent by a desire to conform to a received sexual stereotype in mixed schools than in single-sex schools, thus effectively narrowing career choice for co-educated pupils.'

Arnold and Atkins's study consisted of twenty-three hearing-impaired children and twenty-three normally hearing pupils acting as controls. The causal-comparative design was used to ask the following questions: 'Are hearing-impaired children more maladjusted than non-hearing-impaired children?', and if so, 'Are they differently maladjusted as revealed by two widely-used measures of maladjustment?' Their research used the 'Bristol Social Adjustment Guide' and 'Rutter's Children's Behaviour Questionnaire' to obtain ratings of their sample. They report that the hearing-impaired were no more maladjusted than the age-matched hearing controls, although there were high levels of maladjustment in both groups.

For further examples of ex post facto research we refer the reader to three studies. Ben-Peretz and Kremer-Hayon (1990) studied the context and content of professional dilemmas using in-depth and open-ended interviews that were transcribed for subsequent analysis, i.e. an example of a qualitative study. Pierce and Molloy (1990) used a quantitative methodology in studying psychological and biographical differences in teachers who were experiencing burnout. McLaughlin et al. (1992) studied the schoolchild as a health educator, using quantitative and qualitative data; this study is a very useful example of the practice of coding addressed in Chapter 6 (Miles and Huberman, 1984; Strauss, 1987).

12 EXPERIMENTS, QUASI-EXPERIMENTS AND SINGLE-CASE RESEARCH

1 Randomization is one way of apportioning out or controlling for extraneous variables (see Riecken and Boruch, 1974; Bennett and Lumsdaine, 1975; Boruch, 1997). Alternatively, the experimenter may use matched cases, that is, subjects are matched in pairs in terms of some other variable thought likely to affect scores on the dependent variable and pairs are then allocated randomly to E and C conditions in such a way that the means and variances of the two groups are as nearly equal as possible. Finally, analysis of covariance is a powerful statistical procedure which uses pretest mean scores as covariates to control for initial differences between E and C groups on a number of independent variables.

2 See also the discussion of validity and reliability in educational research, in Hammersley, M. (1987) Some notes on the terms 'validity' and 'reliability', British Educational Research Journal, 13 (1), 73–81.

3 Questions have been raised about the authenticity of both definitions and explanations of the Hawthorne effect. See Diaper, G. (1990) The Hawthorne Effect: a fresh examination, Educational Studies, 16 (3), 261–7.

4 Such gross differentiation in educational provision to impoverished pupils is a matter of ethical concern.

5 The interested reader is referred to the following studies that draw upon single-case designs in British schools: Gersch, I. (1984) Behaviour modification and systems analysis in a secondary school: combining two approaches, Behavioural Approaches with Children, 8, 83–91; McNamara, E. (1986) The effectiveness of incentive and sanction systems used in secondary schools: a behavioural analysis, Durham and Newcastle Research Review, 10, 285–90; Merrett, F., Wilkins, J., Houghton, S. and Wheldall, K. (1988) Rules, sanctions and rewards in secondary schools, Educational Studies, 14 (2), 139–49; Sharpe, P. (1985) Behaviour modification in the secondary school: a survey of students' attitudes to rewards and praise, Behavioural Approaches with Children, 9, 109–12; and Wheldall, K. and Panagopoulou-Stamatelatou, A. (1991) The effects of pupil self-recording of on-task behaviour in primary school children, British Educational Research Journal, 17 (2), 113–27.

6 Examples of experimental research can be seen in: Alfassi, M. (1998) Reading for meaning: the efficacy of reciprocal teaching in fostering reading comprehension in high school students in remedial reading classes, American Educational Research Journal, 35 (2), 309–22; Bijstra, J.O. and Jackson, S. (1998) Social skills training with early adolescents: effects on social skills, wellbeing, self-esteem and coping, European Journal of Psychology of Education, 13 (4), 569–83; Bryant, P., Devine, M., Ledward, A. and Nunes, T. (1997) Spelling with apostrophes and understanding possession, British Journal of Educational Psychology, 67, 91–110; Cline, T., Proto, A., Raval, P.D. and Paolo, T. (1998) The effects of brief exposure and of classroom teaching on attitudes children express towards facial disfigurement in peers, Educational Research, 40 (1), 55–68; Didierjean, A. and Cauzinille-Marmèche, E. (1998) Reasoning by analogy: is it schema-mediated or case-based? European Journal of Psychology of Education, 13 (3), 385–98; Dugard, P. and Todman, J. (1995) Analysis of pretest and post-test control group designs in educational research, Educational Psychology, 15 (2), 181–98; Hall, E., Hall, C. and Abaci, R. (1997) The effects of human relations training on reported teacher stress, pupil control ideology and locus of control, British Journal of Educational Psychology, 67, 483–96; Littleton, K., Ashman, H., Light, P., Artis, J., Roberts, T. and Oosterwegel, A. (1999) Gender, task contexts, and children's performance on a computer-based task, European Journal of Psychology of Education, 14 (1), 129–39; Marcinkiewicz, H.R. and Clariana, R.B. (1997) The performance effects of headings within multi-choice tests, British Journal of Educational Psychology, 67, 111–17; Overett, S. and Donald, D. (1998) Paired reading: effects of a parental involvement programme in a disadvantaged community in South Africa, British Journal of Educational Psychology, 68, 347–56; Sainsbury, M., Whetton, C., Mason, K. and Schagen, I. (1998) Fallback in attainment on transfer at age 11: evidence from the summer literacy schools evaluation, Educational Research, 40 (1), 73–81; Tones, K. (1997) Beyond the randomized controlled trial: a case for 'judicial review', Health Education Research, 12 (2), i–iv.

7 See also Hamilton, D. (1981) Generalisation in the educational sciences: problems and purposes, in T.S. Popkewitz and R.S. Tabachnick (eds) The Study of Schooling, New York: Praeger.

8 Criteria for selecting from a larger pool of studies those deemed to be well-controlled are set out in Cohen, P.A., Kulik, J.A. and Kulik, C.L. (1982) Educational outcomes of tutoring: a meta-analysis of findings, American Educational Research Journal, 19 (2), 237–48. See also Kumar, D.D. (1991) A meta-analysis of the relationship between science instruction and student engagement, Educational Review, 43 (1), 49–56.

9 An example of meta-analysis in educational research can be seen in Severiens, S. and ten Dam, G. (1998) A multilevel meta-analysis of gender differences in learning orientations, British Journal of Educational Psychology, 68, 595–618. The use of meta-analysis is widespread; indeed the Cochrane Collaboration is a pioneer in this field, focusing on meta-analyses of randomized controlled trials; see Maynard and Chalmers (1997).

13 ACTION RESEARCH

1 Examples of action research include: McFee, G. (1993) Reflections on the nature of action research, Cambridge Journal of Education, 23 (2), 173–83; Postlethwaite, K. and Haggarty, L. (1998) Towards effective and transferable learning in secondary school: the development of an approach based on mastery learning, British Educational Research Journal, 24 (3), 333–53.

14 QUESTIONNAIRES

1 Examples of questionnaires in educational research include: Hannan and Newby (1992); Black, D.R. and Scott, W.A.H. (1997) Factors affecting the employment of teachers returning to the United Kingdom after teaching abroad, Educational Research, 39 (1), 37–63; Pithers, R.T. and Soden, R. (1999) Person-environment fit and teacher stress, Educational Research, 41, 51–61.

15 INTERVIEWS

1 Examples of interviews in educational research include: Carroll, S. and Walford, G. (1997) Parents' responses to the school quasi-market, Research Papers in Education, 12 (1), 3–26; Cicognani, C. (1998) Parents' educational styles and adolescent autonomy, European Journal of Psychology of Education, 13 (4), 485–502; Cullen, K. (1997) Headteacher appraisal: a view from the inside, Research Papers in Education, 12 (2), 177–204; Ferris, J. and Gerber, R. (1996) Mature-age students' feelings of enjoying learning in a further education context, European Journal of Psychology of Education, 11 (1), 79–96; Robinson, P. and Smithers, A. (1999) Should the sexes be separated for secondary education—comparisons of single-sex and co-educational schools? Research Papers in Education, 14 (1), 23–49; Van Etten, S., Pressley, M., Freebern, G. and Echevarria, M. (1998) An interview study of college freshmen's beliefs about their academic motivation, European Journal of Psychology of Education, 13 (1), 105–30.

2 Examples of telephone interviews include: Jones, J.L. (1998) Managing the induction of newly appointed governors, Educational Research, 40 (3), 329–51.

16 ACCOUNTS

1 For an example of concept mapping in educational research see: Lawless, L., Smee, P. and O'Shea, T. (1998) Using concept sorting and concept mapping in business and public administration, and education: an overview, Educational Research, 40 (2), 219–35.

2 See also: Edwards, D. and Mercer, N.M. (1989) Reconstructing context: the conventionalization of classroom knowledge, Discourse Processes, 12, 91–104; Potter, J. and Wetherall, M. (1987) Discourse and Social Psychology: Beyond Attitudes and Behaviour, London: Sage; Walkerdine, V. (1988) The Mastery of Reason: Cognitive Development and the Production of Rationality, London: Routledge.

3 For further examples of discourse analysis see: Butzkamm, W. (1998) Code-switching in a bilingual history lesson: the mother tongue as a conversational lubricant, Bilingual Education and Bilingualism, 1 (2), 81–99; Mercer, N., Wegerif, R. and Dawes, L. (1999) Children's talk and the development of reasoning in the classroom, British Educational Research Journal, 25 (1), 95–111; Ramsden, C. and Reason, D. (1997) Conversation—discourse analysis in library and information services, Education for Information, 15 (4), 283–95.

4 Our account draws on the outline contained in O'Neill, B. and McMahon, H. (1990) Opening New Windows with Bubble Dialogue, Language Development and Hypermedia Research Group, Faculty of Education, University of Ulster at Coleraine. See also the taxonomies for the analysis of social episodes by Windisch, U. (1990) Speech and Reasoning in Everyday Life, Cambridge: Cambridge University Press; Schonbach, P. (1990) Account Episodes: the Management or Escalation of Conflict, Cambridge: Cambridge University Press; Semin, G.R. and Manstead, A.S.R. (1983) The Accountability of Conduct: a Social Psychological Analysis, London: Academic Press. Bubble dialogue was born out of children's comic strips. Cunningham et al. (1991) have extended and powerfully transformed the comic strip in their computer-based application. Four icons, representing a speech bubble and a thought bubble per character, are presented alongside two characters on the screen. Clicking on an icon brings up an empty 'say' or 'think' bubble for the chosen character. The comic genre is so well established, the authors opine, that even very young children, when presented with empty bubbles, feel compelled to speak for the characters, playing out their roles. Sometimes the authors write in a first speech or thought (an 'opener' as they call it) to get a dialogue started. When pupils are more familiar with the tool, they readily create their own scenes and openers. In bubble dialogue, characters are set against a backdrop, the presence of which is considered crucial. A prologue helps set the scene in which the dialogue takes place. From the researcher's vantage point, bubble dialogue permits perceived relationships to be varied (by backdrop, prologue, openers). Bubble dialogue, its creators conclude, 'is a powerful methodology for users to make public those perceptions of context, content and interaction which might otherwise remain unformed and unsaid as well as unwritten' (O'Neill and McMahon, 1990). Bubble dialogue (in comic script format rather than computer-based application) has been used by Cohen (1993) to explore perceptions and interpretations of racist behaviour in secondary school classrooms.

5 Heath, S.B. (1982) Questioning at home and at school: a comparative study, in G. Spindler (ed.) Doing the Ethnography of Schooling, New York: Holt, Rinehart & Winston. Interesting ethnographic studies of children in classrooms and playgrounds appear in the Routledge & Kegan Paul series, Social Worlds of Childhood: Davies, B. (1982) Life in the Classroom and Playground: the Accounts of Primary School Children, London: Routledge & Kegan Paul; and Sluckin, A. (1981) Growing up in the Playground: the Social Development of Children, London: Routledge & Kegan Paul. See also: Troyna, B. and Hatcher, R. (1992) Racism in Children's Lives: a Study in Mainly-White Primary Schools, London: Routledge; Woods, P. and Hammersley, M. (1993) Gender and Ethnicity in Schools: Ethnographic Accounts, London: Routledge.

6 For further similar examples see: Bates, I. and Dutson, J. (1995) A Bermuda triangle? A case study of the disappearance of competence-based vocational training policy in the context of practice, British Journal of Education and Work, 8 (2), 41–59; Ziegahn, L. and Hinchman, K.A. (1999) Liberation or reproduction: explaining meaning in college tutors' adult literacy tutoring, International Journal of Qualitative Studies in Education, 12 (1), 85–101.

7 Menzel, H. (1978) Meaning—who needs it? in M. Brenner, P. Marsh and M. Brenner (eds) The Social Context of Method, London: Croom Helm. For a further discussion of the problem, see Gilbert, G.N. (1983) Accounts and those accounts called actions, in G.N. Gilbert and P. Abell, Accounts and Action, Aldershot: Gower.

8 The discussion at this point draws on that in Bailey, K.D. (1978) Methods of Social Research, London: Collier-Macmillan, 261.

9 See, for example: Hargreaves, D.H., Hester, S.K. and Mellor, F.J. (1975) Deviance in Classrooms, London: Routledge & Kegan Paul; Marsh, P., Rosser, E. and Harré, R. (1978) The Rules of Disorder, London: Routledge & Kegan Paul.

17 OBSERVATION

1 For an example of time-sampling see: Childs, G. (1997) A concurrent validity study of teachers' ratings for nominated 'problem' children, British Journal of Educational Psychology, 67, 457–74.

2 For an example of critical incidents see: Tripp, D. (1994) Teachers' lives, critical incidents and professional practice, International Journal of Qualitative Studies in Education, 7 (1), 65–72.

3 For an example of an observational study see: Sideris, G. (1998) Direct classroom observation, Research in Education, 59, 19–28.

18 TESTS

1 For an example of a test-based piece of research see: Bielinski, J. and Davison, M.L. (1998) Gender differences by item difficulty interactions in multiple-choice mathematics items, American Educational Research Journal, 35 (3), 455–76.

19 PERSONAL CONSTRUCTS

1 See also University of Manchester Regional Computing Centre (UMRCC) (1981) GAP: Grid Analysis Package, Manchester: University of Manchester Regional Computing Centre.

2 See also: Slater, P. (1977) The Measurement of Interpersonal Space, Vol. 2, Chichester: Wiley.

3 See also the following applications of personal construct theory to research on teachers and teacher groups: Cole, A.L. (1991) Personal theories of teaching: development in the formative years, Alberta Journal of Educational Research, 37 (2), 119–32; Corporal, A.H. (1991) Repertory grid research into cognitions of prospective primary school teachers, Teaching and Teacher Education, 36, 315–29; Lehrer, R. and Franke, M.L. (1992) Applying personal construct psychology to the study of teachers' knowledge of fractions, Journal for Research in Mathematical Education, 23 (3), 223–41; Shapiro, B.L. (1990) A collaborative approach to help novice science teachers reflect on changes in their construction of the role of the science teacher, Alberta Journal of Educational Research, 36 (3), 203–22; Shaw, E.L. (1992) The influence of methods instruction on the beliefs of preservice elementary and secondary science teachers: preliminary comparative analyses, School Science and Mathematics, 92, 14–22.

4 See also Yorke, D.M. (1985) Indexes of stability in repertory grids: a small-scale comparison, British Educational Research Journal, 11 (3), 221–5.

5 See also Pope, M.L. and Keen, T.R. (1981) Personal Construct Psychology and Education, London: Academic Press, especially Chapters 8 and 9; Shaw, M.L.G. (ed.) (1981) Recent Advances in Personal Construct Technology, London: Academic Press; Thomas, L.F. and Harri-Augstein, E.S. (1992) Self-organized Learning, London: Routledge.

6 For an example of personal constructs in educational research see: Derry, S.J. and Potts, M.K. (1998) How tutors model students: a study of personal constructs in adaptive tutoring, American Educational Research Journal, 35 (1), 65–99.


20 MULTI-DIMENSIONAL MEASUREMENT

1 For a fuller discussion of clustering methods, see Everitt, B.S. (1974) Cluster Analysis, London: Heinemann Educational Books.

2 See also Bennett, S.N. and Jordan, J. (1975) A typology of teaching styles in primary schools, British Journal of Educational Psychology, 45, 20–8. Powerful cluster programmes such as SAS, Version 5 Edition (1985), SAS Institute Inc., Cary, NC, USA, can throw new light on data and reveal dimensions previously obscure. Using SAS in an analysis of the perceptions of some 686 teachers about their working lives, Poppleton and Riseborough (1990) identified four clusters of teachers with distinctively different orientations towards the pursuit of a career. See Poppleton, P. and Riseborough, G. (1990) Teaching in the mid-1980s: the centrality of work in secondary teachers' lives, British Educational Research Journal, 16 (2), 105–24.

3 Gray, J. and Satterly, D. (1976) A chapter of errors: teaching styles and pupil progress in retrospect, Educational Research, 19, 45–56; Aitken, M., Bennett, S.N. and Hesketh, J. (1981) Teaching styles and pupil progress: a reanalysis, British Journal of Educational Psychology, 51 (2), 170–86; Aitken, M., Anderson, D. and Hinde, J. (1981) Statistical modelling of data on teaching styles, Journal of the Royal Statistical Society, 144 (4), 419–61; Frais, S.J. (1983) Formal and informal teaching: a further reconsideration of Professor Bennett's statistics, Journal of the Royal Statistical Society, 146 (2), 163–9; Chatfield, C. (1985) The initial examination of data, Journal of the Royal Statistical Society, 148 (3), 214–53.

4 Self-serving bias refers to our propensity to ac-cept responsibility for our successes, but to denyresponsibility for our failures.

5 For examples of research conducted using factor analysis see: Andrews, P. and Hatch, G. (1999) A new look at secondary teachers' conception of mathematics and its teaching, British Educational Research Journal, 25 (2), 203–23; McEneaney, J.E. and Sheridan, E.M. (1996) A survey-based component for programme assessment in undergraduate pre-service teacher education, Research in Education, 55, 49–61; Prosser, M. and Trigwell, K. (1997) Relations between perceptions of the teaching environment and approaches to teaching, British Journal of Educational Psychology, 67, 25–35; Valanides, N. (1999) Formal reasoning performance of higher secondary school students: theoretical and educational implications, European Journal of Psychology of Education, 14 (1), 109–17; Vermunt, J.D. (1998) The regulation of constructive learning processes, British Journal of Educational Psychology, 68, 149–71. For an example of research using cluster analysis see: Seifert, T.L. (1997) Academic goals and emotions: results of a structural equation model and a cluster analysis, British Journal of Educational Psychology, 67, 323–38. For examples of research using correlation co-efficients see: Goossens, L., Marcoen, A., van Hees, S. and van de Woestlijne, O. (1998) Attachment style and loneliness in adolescence, European Journal of Psychology of Education, 13 (4), 529–42; Lamb, S., Bibby, P., Wood, D. and Leyden, G. (1997) Communication skills, educational achievement and biographic characteristics of children with moderate learning difficulties, European Journal of Psychology of Education, 12 (4), 401–14; Okagaki, L. and Frensch, P.A. (1998) Parenting and school achievement: a multiethnic perspective, American Educational Research Journal, 35 (1), 123–44.

6 Examples of multilevel modelling in educational research can be seen in: Bell, J.F. (1996) Question choice in English literature examination, Oxford Review of Education, 23 (4), 447–58; Croxford, L. (1997) Participation in science subjects: the effect of the Scottish curriculum framework, Research Papers in Education, 12 (1), 69–89; Fitz-Gibbon, C.T. (1991) Multilevel modelling in an indicator system, in S.W.Raudenbush and J.D.Willms (eds) Schools, Classrooms and Pupils. International Studies of Schooling from a Multilevel Perspective, San Diego, CA: Academic Press Inc.; Hill, P.W. and Rowe, K.J. (1996) Multilevel modelling in school effectiveness research, School Effectiveness and School Improvement, 7 (1), 1–34; Kivulu, J.M. and Rogers, W.T. (1998) A multilevel analysis of cultural experience and gender influences on causal attributions to perceived performance in mathematics, British Journal of Educational Psychology, 68, 25–37; McNiece, R. and Jolliffe, F. (1998) An investigation into regional differences in educational performance in the National Child Development Study, Educational Research, 40 (1), 13–30; Mooij, T. (1998) Pupil–class determinants of aggressive and victim behaviour in pupils, British Journal of Educational Psychology, 68, 373–85; Musch, J. and Bröder, A. (1999) Test anxiety versus academic skills: a comparison of two alternative models for predicting performance in a statistics exam, British Journal of Educational Psychology, 69, 105–16; Schagen, I. and Sainsbury, M. (1996) Multilevel analysis of the key stage 1 national curriculum data in 1995, Oxford Review of Education, 22 (3), 265–72; Thomas, S., Sammons, P., Mortimore, P. and Smees, R. (1997) Differential secondary school effectiveness: comparing the performance of different pupil groups, British Educational Research Journal, 23 (4), 351–69.

21 ROLE-PLAYING

1 For a recent account of a wide range of role-play applications in psychotherapy, see Holmes, P. and Karp, M. (1991) Psychodrama: Inspiration and Technique, London: Routledge.

2 However, this is not what advocates of role-play as an alternative to deception generally mean by role-play. See Hamilton (1976) and Forward, Canter and Kirsch (1976) for a fuller discussion.

3 For further sound advice see also Bolton, G. and Heathcote, D. (1999) So You Want to Use Role-Play? A New Approach to Planning, Stoke: Trentham Books.

22 RECENT DEVELOPMENTS

1 For examples of research using Geographical Information Systems see: Higgs, G., Webster, C.J. and White, S.D. (1997) The use of geographical information systems in assessing spatial and socio-economic impacts of parental choice, Research Papers in Education, 12 (1), 27–48; Jones, D. and Vann, P. (1994) Informing School Decisions: GIS in Education, Luton: Local Government Management Board.

Bibliography

Acker, S. (1989) Teachers, Gender and Careers. Lewes: Falmer Press.

Acker, S. (1990) Teachers' culture in an English primary school: continuity and change. British Journal of Sociology of Education, 11 (3), 257–73.

Acton, H.B. (1975) Positivism. In J.O.Urmson (ed.) The Concise Encyclopedia of Western Philosophy. London: Hutchinson, 253–6.

Adams-Webber, J.R. (1970) Elicited versus provided constructs in repertory grid technique: a review. British Journal of Medical Psychology, 43, 349–54.

Adelman, C., Kemmis, S. and Jenkins, D. (1980) Rethinking case study: notes from the Second Cambridge Conference. In H.Simons (ed.) Towards a Science of the Singular. Centre for Applied Research in Education, University of East Anglia, 45–61.

Adeyemi, M.B. (1992) The effects of a social studies course on the philosophical orientations of history and geography graduate students in Botswana. Educational Studies, 18 (2), 235–44.

Adler, P.A. and Adler, P. (1994) Observational techniques. In N.K.Denzin and Y.S.Lincoln (eds) Handbook of Qualitative Research. London: Sage, 377–92.

Agar, M. (1993) Speaking of ethnography. Cited in D.Silverman, Interpreting Qualitative Data. London: Sage Publications.

Aitkin, M., Anderson, D. and Hinde, J. (1981) Statistical modelling of data on teaching styles. Journal of the Royal Statistical Society, 144 (4), 419–61.

Aitkin, M., Bennett, S.N. and Hesketh, J. (1981) Teaching styles and pupil progress: a reanalysis. British Journal of Educational Psychology, 51 (2), 170–86.

Alban-Metcalf, R.J. (1997) Repertory Grid Technique. In J.P.Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 315–18.

Alexander, P.C. and Neimeyer, G.J. (1989) Constructivism and family therapy. International Journal of Personal Construct Psychology, 2 (2), 111–21.

Alfassi, M. (1998) Reading for meaning: the efficacy of reciprocal teaching in fostering reading comprehension in high school students in remedial reading classes. American Educational Research Journal, 35 (2), 309–22.

Alkin, M.C., Daillak, R. and White, P. (1991) Does evaluation make a difference? In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 268–75.

Allison, P. (1984) Event history analysis: regression for longitudinal event data. Sage University Papers: Quantitative Applications in the Social Sciences. No. 46. Beverly Hills: Sage Publications.

Altrichter, H. and Gstettner, P. (1993) Action research: a closed chapter in the history of German social science? Educational Action Research, 1 (3), 329–60.

American Psychological Association (1973) Ethical Principles in the Conduct of Research with Human Subjects. Ad Hoc Committee on Ethical Standards in Psychological Research. Washington, DC.

American Psychological Association (1974) Standards for Educational and Psychological Testing. Washington, DC: American Psychological Association.

American Psychological Association (1994) Publication Manual of the American Psychological Association (fourth edition). Washington, DC: American Psychological Association.

Anderson, D.S. and Biddle, B.J. (eds) (1991) Knowledge for Policy: Improving Education through Research. London: Falmer.

Andrews, P. and Hatch, G. (1999) A new look at secondary teachers' conception of mathematics and its teaching. British Educational Research Journal, 25 (2), 203–23.

Angoff, W.H. (1971) Scales, norms, and equivalent scores. In R.L.Thorndike (ed.) Educational Measurement (second edition). Washington, DC: American Council on Education, 508–600.

Antonsen, E.A. (1988) Treatment of a boy of twelve: help with handwriting, play therapy and discussion of problems. Journal of Education Therapy, 2 (1), 2–32.

Apple, M. (1990) Ideology and Curriculum (second edition). London: Routledge & Kegan Paul.

Applebee, A.N. (1976) The development of children's responses to repertory grids. British Journal of Social and Clinical Psychology, 15, 101–2.

Argyle, M. (1978) Discussion chapter: an appraisal of the new approach to the study of social behaviour. In M.Brenner, P.Marsh and M.Brenner (eds) The Social Contexts of Method. London: Croom Helm, 237–55.

Argyris, C. (1975) Dangers in applying results from experimental social psychology. American Psychologist, 30, 469–85.

Argyris, C. (1990) Overcoming Organisational Defenses: Facilitating Organisational Learning. Boston: Allyn and Bacon.

Arnold, P. and Atkins, J. (1991) The social and emotional adjustment of hearing-impaired children integrated in primary schools. Educational Research, 33 (3), 223–8.

Aronowitz, S. and Giroux, H. (1986) Education Under Siege. London: Routledge & Kegan Paul.

Aronson, E. and Carlsmith, J.M. (1969) Experimentation in social psychology. In G.Lindzey and E.Aronson (eds) The Handbook of Social Psychology, Vol. 2. Reading, MA: Addison-Wesley, 1–79.

Aronson, E., Ellsworth, P.C., Carlsmith, J.M. and Gonzalez, M.H. (1990) Methods of Research in Social Psychology. New York: McGraw-Hill.

Ary, D., Jacobs, L.C. and Razavieh, A. (1972) Introduction to Research in Education. New York: Holt, Rinehart & Winston.

Ashmore, M. (1989) The Reflexive Thesis: Wrighting Sociology of Scientific Knowledge. Chicago: University of Chicago Press.

Ashton, E. (1994) Managing change in Further Education: some perspectives of college principals. Unpublished Ph.D. dissertation, Loughborough University of Technology.

Atkinson, R. (1998) The Life Story Interview. London: Sage Publications.

Austin, J.L. (1962) How to Do Things with Words. Oxford: Oxford University Press.

Bailey, K.D. (1978) Methods of Social Research. Basingstoke: Collier-Macmillan.

Ball, S.J. (1981) Beachside Comprehensive. Cambridge: Cambridge University Press.

Ball, S.J. (1985) School politics, teachers' careers and educational change: a case study of becoming a comprehensive school. In L.Barton and S.Walker (eds) Education and Social Change. Beckenham: Croom Helm, 29–61.

Ball, S.J. (1990) Politics and Policy-Making in Education. London: Routledge.

Ball, S.J. (1994a) Education Reform: a Critical and Post-structuralist Approach. Buckingham: Open University Press.

Ball, S.J. (1994b) Political interviews and the politics of interviewing. In G.Walford (ed.) Researching the Powerful in Education. London: UCL Press, 96–115.

Bannister, D. (ed.) (1970) Perspectives in Personal Construct Theory. London: Academic Press.

Bannister, D. and Mair, J.M.M. (1968) The Evaluation of Personal Constructs. London: Academic Press.

Banuazizi, A. and Movahedi, A. (1975) Interpersonal dynamics in a simulated prison: a methodological analysis. American Psychologist, 30, 152–60.

Barker, C.D. and Johnson, G. (1998) Interview talk as professional practice. Language and Education, 12 (4), 229–42.

Barratt, P.E.H. (1971) Bases of Psychological Methods. Queensland, Australia: Wiley & Sons Australasia Pty. Ltd.

Barr Greenfield, T. (1975) Theory about organisations: a new perspective and its implications for schools. In M.G.Hughes (ed.) Administering Education: International Challenge. London: Athlone Press, 71–99.

Bates, I. and Dutson, J. (1995) A Bermuda triangle? A case study of the disappearance of competence-based vocational training policy in the context of practice. British Journal of Education and Work, 8 (2), 41–59.

Batteson, C. and Ball, S.J. (1995) Autobiographies and interviews as means of 'access' to elite policy making in education. British Journal of Educational Studies, 43 (2), 201–16.

Bauman, R. (1986) Story, Performance and Event. Cambridge: Cambridge University Press.

Baumrind, D. (1964) Some thoughts on ethics of research. American Psychologist, 19, 421–3.

Beck, R.N. (1979) Handbook in Social Philosophy. New York: Macmillan.

Becker, H. (1970) Sociological Work. Chicago: Aldine.

Becker, H. and Geer, B. (1960) Participant observation: the analysis of qualitative field data. In R.Adams and J.Preiss (eds) Human Organization Research: Field Relations and Techniques. Homewood, IL: Dorsey.

Beer, M., Eisenstat, R.A. and Spector, B. (1990) Why change programs don't produce change. Harvard Business Review, Nov.–Dec. (68), 158–66.

Belbin, R.M. (1981) Management Teams: Why They Succeed or Fail. London: Heinemann.

Bell, J. (1991) Doing Your Research Project. Milton Keynes: Open University Press.

Bell, J. (1996) Question choice in English literature examination. Oxford Review of Education, 23 (4), 447–58.

Belson, W.A. (1975) Juvenile Theft: Causal Factors. London: Harper & Row.

Belson, W.A. (1986) Validity in Survey Research. Aldershot: Gower.

Bennett, C.A. and Lumsdaine, A.A. (1975) Evaluation and Experimentation. New York: Academic Press Inc.

Bennett, S.N. (1976) Teaching Styles and Pupil Progress. Shepton Mallet: Open Books.

Bennett, S.N. and Jordan, J. (1975) A typology of teaching styles in primary schools. British Journal of Educational Psychology, 45, 20–8.

Bennett, S. and Bowers, D. (1977) An Introduction to Multivariate Techniques for Social and Behavioural Sciences. London: Macmillan.

Bennett, S.N., Desforges, C. and Wilkinson, E. (1984) The Quality of Pupil Learning Experiences. London: Lawrence Erlbaum Associates.

Ben-Peretz, M. and Kremer-Hayon, L. (1990) The content and context of professional dilemmas encountered by novice and senior teachers. Educational Review, 42 (1), 31–40.

Benstead, D. and Constantine, S. (1998) The Inward Revolution. London: Little, Brown & Co.

Berger, P.L. and Luckmann, T. (1967) The Social Construction of Reality. Harmondsworth: Penguin.

Bernstein, B. (1970) Education cannot compensate for society. New Society, February, 387, 344–57.

Bernstein, B. (1971) On the classification and framing of educational knowledge. In M.F.D.Young (ed.) Knowledge and Control. Basingstoke: Collier-Macmillan, 47–69.

Bernstein, B. (1974) Sociology and the sociology of education: a brief account. In J.Rex (ed.) Approaches to Sociology: an Introduction to Major Trends in British Sociology. London: Routledge & Kegan Paul, 145–59.

Bernstein, R.J. (1976) The Restructuring of Social and Political Theory. Oxford: Blackwell.

Bernstein, R.J. (1983) Beyond Objectivism and Relativism. Oxford: Blackwell.

Best, J.W. (1970) Research in Education. Englewood Cliffs, NJ: Prentice-Hall.

Beynon, J. (1985a) Historical change and career histories in a comprehensive school. In S.J.Ball and I.F.Goodson (eds) Teachers' Lives and Careers. Lewes: Falmer Press, 158–79.

Beynon, J. (1985b) Initial Encounters in the Secondary School. Lewes: Falmer Press.

Bhadwal, S.C. and Panda, P.K. (1991) The effects of a package of some curricular strategies on the study habits of rural primary school students: a year long study. Educational Studies, 17 (3), 261–72.

Biddle, B.J. and Anderson, D.S. (1991) Social research and educational change. In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 1–20.

Bielinski, J. and Davison, M.L. (1998) Gender differences by item difficulty interactions in multiple choice mathematics items. American Educational Research Journal, 35 (3), 455–76.

Bijstra, J.O. and Jackson, S. (1998) Social skills training with early adolescents: effects on social skills, well-being, self-esteem and coping. European Journal of Psychology of Education, 13 (4), 569–83.

Binet, A. (1905) Méthode nouvelle pour le diagnostic de l'intelligence des anormaux. Cited in G.de Landsheere (1997) History of educational research. In J.P.Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 8–16.

Black, D.R. and Scott, W.A.H. (1997) Factors affecting the employment of teachers returning to the United Kingdom after teaching abroad. Educational Research, 39 (1), 37–63.

Blalock, H.M. (1979) Social Statistics (second edition). Tokyo: McGraw-Hill-Kogakusha.

Blalock, H.M. (1991) Dilemmas of social research. In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 60–9.

Blatchford, P. (1992) Children's attitudes to work at 11 years. Educational Studies, 18 (1), 107–18.

Blease, D. and Cohen, L. (1990) Coping with Computers: An Ethnographic Study in Primary Classrooms. London: Paul Chapman Publishing.

Bliss, J., Monk, M. and Ogborn, J. (1983) Qualitative Data Analysis for Educational Research. London: Croom Helm.

Bloom, B. (ed.) (1956) Taxonomy of Educational Objectives: Handbook 1: Cognitive Domain. London: Longman.

Bloor, M. (1978) On the analysis of observational data: a discussion of the worth and uses of induction techniques and respondent validation. Sociology, 12 (3), 545–52.

Blumenfeld-Jones, D. (1995) Fidelity as a criterion for practising and evaluating narrative inquiry. International Journal of Qualitative Studies in Education, 8 (1), 25–33.

Blumer, H. (1969) Symbolic Interactionism. Englewood Cliffs, NJ: Prentice-Hall.

Boas, F. (1943) Recent anthropology. Science, 98, 311–14.

Bogdan, R.G. and Biklen, S.K. (1992) Qualitative Research for Education (second edition). Boston, MA: Allyn & Bacon.

Bolton, G. and Heathcote, D. (1999) So You Want to Use Role-Play? A New Approach to Planning. Stoke: Trentham Books.

Borg, M.G. (1998) Secondary school teachers' perceptions of pupils' undesirable behaviours. British Journal of Educational Psychology, 68, 67–79.

Borg, W.R. (1963) Educational Research: an Introduction. London: Longman.

Borg, W.R. (1981) Applying Educational Research: a Practical Guide for Teachers. New York: Longman.

Borg, W.R. and Gall, M.D. (1979) Educational Research: an Introduction (third edition). London: Longman.

Borg, W.R. and Gall, M.D. (1996) Educational Research: an Introduction (sixth edition). New York: Longman.

Boring, E.G. (1953) The role of theory in experimental psychology. American Journal of Psychology, 66, 169–84.

Borkowsky, F.T. (1970) The relationship of work quality in undergraduate music curricula to effectiveness in instrumental music teaching in the public schools. Journal of Experimental Education, 39, 14–19.

Boruch, R.F. (1997) Randomized Experiments for Planning and Evaluation. Applied social research methods series, Vol. 44. Thousand Oaks, CA: Sage Publications.

Boulton, M.J. (1992) Participation in playground activities at middle school. Educational Research, 34 (3), 167–82.

Boulton, M.J. (1997) Teachers' views on bullying: definitions, attitudes and abilities to cope. British Journal of Educational Psychology, 67, 223–33.

Bowe, R., Ball, S.J. and Gold, A. (1992) Reforming Education and Changing Schools. London: Routledge.

Bowles, S. and Gintis, H. (1976) Schooling in Capitalist America. London: Routledge & Kegan Paul.

Boyd-Barrett, O. and Scanlon, E. (1991) Computers and Learning. Wokingham: Addison-Wesley Publishing Co.

Bracht, G.H. and Glass, G.V. (1968) The external validity of experiments. American Educational Research Journal, 5 (4), 437–74.

Bradburn, N.M. and Berlew, D.E. (1961) Economic Development and Cultural Change. Chicago: University of Chicago Press.

Breakwell, G. (1990) Interviewing. London: Routledge/British Psychological Society.

Brenner, M., Brown, J. and Canter, D. (1985) The Research Interview. London: Academic Press Inc.

Broad, W. and Wade, N. (1983) Betrayers of Truth: Fraud and Deceit in the Halls of Science. New York: Century.

Brock-Utne, B. (1996) Reliability and validity in qualitative research within education in Africa. International Review of Education, 42 (6), 605–21.

Bromley, D.B. (1986) The Case Study Method in Psychology and Related Disciplines. Chichester: John Wiley.

Brown, J. and Sime, J.D. (1977) Accounts as general methodology. Paper presented to the British Psychological Society Conference, University of Exeter.

Brown, J. and Sime, J.D. (1981) A methodology of accounts. In M.Brenner (ed.) Social Method and Social Life. London: Academic Press, 159–88.

Brown, J., Collins, A. and Duguid, P. (1989) Situated cognition and the culture of learning. Educational Researcher, 32, 32–42.

Brown, R. and Herrnstein, R.J. (1975) Psychology. London: Methuen.

Bruner, J. (1986) Actual Minds, Possible Worlds. Cambridge, MA: Harvard University Press.

Bryant, P., Devine, M., Ledward, A. and Nunes, T. (1997) Spelling with apostrophes and understanding possession. British Journal of Educational Psychology, 67, 91–110.

Buhler, C. and Allen, M. (1972) Introduction to Humanistic Psychology. Monterey, California: Brooks/Cole.

Bulmer, M. (ed.) (1982) Social Research Ethics. London and Basingstoke: Macmillan.

Burgess, R.G. (ed.) (1982) Field Research: A Sourcebook and Field Manual. London: George Allen & Unwin.

Burgess, R.G. (1983) Experiencing Comprehensive Education. London: Methuen.

Burgess, R.G. (1985a) Conversations with a purpose? The ethnographic interview in educational research. Paper presented at the British Educational Research Association Annual Conference, Sheffield.

Burgess, R.G. (1985b) Issues in Educational Research: Qualitative Methods. Lewes: Falmer.

Burgess, R.G. (ed.) (1989a) The Ethics of Educational Research. Lewes: Falmer Press.

Burgess, R.G. (1989b) Grey areas: ethical dilemmas in educational ethnography. In R.G.Burgess (ed.) The Ethics of Educational Research. Lewes: Falmer Press, 60–76.

Burgess, R.G. (1993) Biting the hand that feeds you? Educational research for policy and practice. In R.G.Burgess (ed.) Educational Research and Evaluation for Policy and Practice. London: Falmer, 1–18.

Burke, M., Noller, P. and Caird, D. (1992) Transition from probationer to educator: a repertory grid analysis. International Journal of Personal Construct Psychology, 5 (2), 159–82.

Burrell, G. and Morgan, G. (1979) Sociological Paradigms and Organisational Analysis. London: Heinemann Educational Books.

Busato, V.V., Prins, F.J., Elshout, J.J. and Hamaker, C. (1998) Learning styles: a cross-sectional and longitudinal study in higher education. British Journal of Educational Psychology, 68, 427–41.

Butler, N.R. and Golding, J. (1986) From Birth to Five. Oxford: Pergamon Press.

Butzkamm, W. (1998) Code-switching in a bilingual history lesson: the mother tongue as a conversational lubricant. Bilingual Education and Bilingualism, 1 (2), 81–99.

Calder, J. (1979) Introduction to applied sampling. Research Methods in Education and the Social Sciences, Block 3, Part 4, DE304. Milton Keynes: Open University Press.

Callawaert, S. (1999) Philosophy of Education, Frankfurt critical theory, and the sociology of Pierre Bourdieu. In T.Popkewitz and L.Fendler (eds) Critical Theories in Education: Changing Terrains of Knowledge and Politics. London: Routledge, 117–44.

Campbell, D.T. and Fiske, D.W. (1959) Convergent and discriminant validation by the multitrait-multimethod matrix. Psychological Bulletin, 56, 81–105.

Campbell, D.T. and Stanley, J. (1963) Experimental and quasi-experimental designs for research on teaching. In N.Gage (ed.) Handbook of Research on Teaching. Chicago: Rand McNally, 171–246.

Campbell, D.T., Sanderson, R.E. and Laverty, S.G. (1964) Characteristics of a conditioned response in human subjects during extinction trials following a single traumatic conditioning trial. Journal of Abnormal and Social Psychology, 68, 627–39.

Cannell, C.F. and Kahn, R.L. (1968) Interviewing. In G.Lindzey and A.Aronson (eds) The Handbook of Social Psychology, Vol. 2: Research Methods. New York: Addison-Wesley, 526–95.

Caplan, N. (1991) The use of social research knowledge at the national level. In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 193–202.

Carr, W. and Kemmis, S. (1986) Becoming Critical. Lewes: Falmer.

Carroll, S. and Walford, G. (1997) Parents' responses to the school quasi-market. Research Papers in Education, 12 (1), 3–26.

Carspecken, P.F. (1996) Critical Ethnography in Educational Research. London: Routledge.

Carspecken, P.F. and Apple, M. (1992) Critical qualitative research: theory, methodology, and practice. In M.LeCompte, W.L.Millroy and J.Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 507–53.

Cartwright, D. (1991) Basic and applied social psychology. In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 23–31.

Carver, R.P. (1978) The case against significance testing. Harvard Educational Review, 48 (3), 378–99.

Cassell, J. (1993) The relationship of observer to observed when studying up. Cited in R.M.Lee, Doing Research on Sensitive Topics. London: Sage Publications.

Cavan, S. (1977) Review of J.D.Douglas's (1976) 'Investigative Social Research: Individual and Team Field Research'. The American Journal of Sociology, 83 (3), 809–11.

Central Advisory Council for Education (1967) Children and their Primary Schools. London: HMSO.

Chalmers, A.F. (1982) What Is This Thing Called Science? (second edition). Milton Keynes: Open University Press.

Chapin, F.S. (1947) Experimental Designs in Sociological Research. New York: Harper & Row.

Chatfield, C. (1985) The initial examination of data. Journal of the Royal Statistical Society, 148 (3), 214–53.

Chelimsky, E. and Mulhauser, F. (1993) Educational evaluations for the US Congress: some reflections on recent experience. In R.G.Burgess (ed.) Educational Research and Evaluation for Policy and Practice. London: Falmer, 44–60.

Chetwynd, S.J. (1974) Outline of the analyses available with G.A.P., the Grid Analysis Package. London: St George's Hospital.

Child, D. (1970) The Essentials of Factor Analysis. New York: Holt, Rinehart & Winston.

Childs, G. (1997) A concurrent validity study of teachers' ratings for nominated 'problem' children. British Journal of Educational Psychology, 67, 457–74.

Chomsky, N. (1959) Review of Skinner's Verbal Behaviour. Language, 35 (1), 26–58.

Cicognani, C. (1998) Parents' educational styles and adolescent autonomy. European Journal of Psychology of Education, 13 (4), 485–502.

Cicourel, A.V. (1964) Method and Measurement in Sociology. New York: The Free Press.

Claxton, G. (1981) Wholly Human. London: Routledge & Kegan Paul.

Cline, T. and Ertubney, C. (1997) The impact of gender on primary teachers' evaluations of children's difficulties in school. British Journal of Educational Psychology, 67, 447–56.

Cline, T., Proto, A., Raval, P.D. and Paolo, T. (1998) The effects of brief exposure and of classroom teaching on attitudes children express towards facial disfigurement in peers. Educational Research, 40 (1), 55–68.

Cochrane, A.L. (1972) Effectiveness and Efficiency: Random Reflections on Health Services. London: Nuffield Provincial Hospitals Trust.

Cogan, J.J. and Derricot, R. (eds) (1998) Citizenship for the 21st Century. London: Kogan Page.

Cohen, D.K. and Garet, M.S. (1991) Reforming educational policy with applied social research. In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 123–40.

Cohen, L. (1977) Educational Research in Classrooms and Schools: a Manual of Materials and Methods. London: Harper & Row.

Cohen, L. (1993) Racism Awareness Materials in Initial Teacher Training. Report to the Leverhulme Trust, 11–19 New Fetter Lane, London, EC4A 1NR.

Cohen, L. and Holliday, M. (1979) Statistics for Education and Physical Education. London: Harper & Row.

Cohen, L. and Holliday, M. (1982) Statistics for Social Scientists. London: Harper & Row.

Cohen, L. and Holliday, M. (1996) Practical Statistics for Students. London: Paul Chapman Publishing Ltd.

Cohen, L. and Manion, L. (1981) Perspectives on Classrooms and Schools. Eastbourne: Holt, Rinehart & Winston.

Cohen, L. and Manion, L. (1994) Research Methods in Education (fourth edition). London: Routledge.

Cohen, L., Manion, L. and Morrison, K.R.B. (1996) A Guide to Teaching Practice (fourth edition). London: Routledge.

Cohen, M.R. and Nagel, E. (1961) An Introduction to Logic and Scientific Method. London: Routledge & Kegan Paul.

Cohen, P.A., Kulik, J.A. and Kulik, C.L. (1982) Educational outcomes of tutoring: a meta-analysis of findings. American Educational Research Journal, 19 (2), 237–48.

Cole, A.L. (1991) Personal theories of teaching: development in the formative years. Alberta Journal of Educational Research, 37 (2), 119–32.

Coleman, J.S. (1991) Social policy research and societal decision-making. In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 113–22.

Collins, J.S. and Duguid, P. (1989) Situated cognition and the culture of learning. Educational Researcher, 32, 32–42.

Collins, L. (ed.) (1976) The Use of Models in the Social Sciences. London: Tavistock Publications.

Connell, R.W., Ashenden, D.J., Kessler, S. and Dowsett, G.W. (1996) Making the difference: schools, families and social division. Cited in B.Limerick, T.Burgess-Limerick and M.Grace, The politics of interviewing: power relations and accepting the gift. International Journal of Qualitative Studies in Education, 9 (4), 449–60.

Connelly, F.M. and Clandinin, D.J. (1999) Narrative inquiry. In J.P.Keeves and G.Lakomski (eds) Issues in Educational Research (second edition). Oxford: Elsevier Science Ltd., 132–40.

Cook, T.D. (1991) Postpositivist criticisms, reform associations, and uncertainties about social research. In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 43–59.

Cook, T.D., Cooper, H., Cordray, D.S., Hartmann, H., Hedges, L.V., Light, R.J., Louis, T.A. and Mosteller, F. (1992) Meta-analysis for Explanation. New York: Russell Sage Foundation.

Cooley, C.H. (1902) Human Nature and the Social Order. New York: Charles Scribner.

Cooper, H.M. and Rosenthal, R. (1980) Statistical versus traditional procedures for summarizing research findings. Psychological Bulletin, 87, 442–9.

Corey, S.M. (1949) Action research, fundamental research and educational practices. Teachers College Record, 50, 509–14.

Corey, S.M. (1953) Action Research to Improve School Practice. New York: Teachers College, Columbia University.

Corporal, A.H. (1991) Repertory grid research into cognitions of prospective primary school teachers. Teaching and Teacher Education, 36, 315–29.

Coser, L.A. and Rosenberg, B. (1969) Sociological Theory: a Book of Readings (third edition). New York: Macmillan.

Cox, E. (1994) The Fuzzy Systems Handbook: a Practitioner's Guide to Building, Using and Maintaining Fuzzy Systems. London: Academic Press.

Coyle, A. (1995) Discourse analysis. In G.M.Breakwell, S.Hammond and C.Fife-Schaw (eds) Research Methods in Psychology. London: Sage Publications, 243–58.

Cresswell, M.J. and Houston, J.G. (1991) Assessment of the national curriculum—some fundamental considerations. Educational Review, 43 (1), 63–78.

Crocker, A.C. and Cheeseman, R.G. (1988) The ability of young children to rank themselves for academic ability. Educational Studies, 14 (1), 105–10.

Croll, P. (1985) Systematic Observation. Lewes: Falmer Press.

Cronbach, L.J. (1949) Essentials of Psychological Testing (first edition). New York: Harper & Row.

Cronbach, L.J. (1970) Essentials of Psychological Testing (third edition). New York: Harper & Row.

Crow, G.M. (1992) The principalship as a career: in need of a theory. Educational Management and Administration, 21 (2), 80–7.

Croxford, L. (1997) Participation in science subjects: the effect of the Scottish curriculum framework. Research Papers in Education, 12 (1), 69–89.

Cuff, E.G. and Payne, G.C.F. (eds) (1979) Perspectives in Sociology. London: George Allen & Unwin.

Cullen, K. (1997) Headteacher appraisal: a view from the inside. Research Papers in Education, 12 (2), 177–204.

Cunningham, G.K. (1998) Assessment in the Classroom. London: Falmer Press.

Cunningham, D., McMahon, H. and O'Neill, B. (1991) Bubble Dialogue: a New Tool for Instruction and Assessment. Language Development and Hypermedia Research Group, Faculty of Education, University of Ulster at Coleraine.

Curtis, B. (1978) Introduction to Curtis, B. and Mays, W. (eds) Phenomenology and Education. London: Methuen.

Daly, P. (1996) The effects of single-sex and co-educational secondary schooling on girls' achievement. Research Papers in Education, 11 (3), 289–306.

Daly, P. and Shuttleworth, I. (1997) Determinants of public examination entry and attainment in mathematics: evidence of gender and gender-type of school from the 1980s and 1990s in Northern Ireland. Evaluation and Research in Education, 11 (2), 91–101.

Data Protection Act 1984. London: HMSO.

Davenport, E.C. Jr., Davison, M.L., Kuang, H., Ding, S., Kim, S.-K. and Kwak, N. (1998) High school mathematics course-taking by gender and ethnicity. American Educational Research Journal, 35 (3), 497–514.

Davidson, J. (1970) Outdoor Recreation Surveys: the Design and Use of Questionnaires for Site Surveys. London: Countryside Commission.

Davie, R. (1972) The longitudinal approach. Trends in Education, 28, 8–13.

Davies, B. (1982) Life in the Classroom and Playground: the Accounts of Primary School Children. London: Routledge & Kegan Paul.

Davies, J. and Brember, I. (1997) Monitoring reading standards in year 6: a 7-year cross-sectional study. British Educational Research Journal, 23 (5), 615–22.

Davies, J. and Brember, I. (1998) Standards in reading at key stage 1—a cross-sectional study. Educational Research, 40 (2), 153–60.

Davies, L. (1984) Pupil Power: Deviance and Gender in School. Lewes: Falmer Press.

Delamont, S. (1976) Interaction in the Classroom. London: Methuen.

Delamont, S. (1981) All too familiar? A decade of classroom research. Educational Analysis, 3 (1), 69–83.

Delamont, S. (1992) Fieldwork in Educational Settings: Methods, Pitfalls and Perspectives. London: Falmer.

Delamont, S. (1998) Review of 'Understanding Social Research'. Evaluation and Research in Education, 12 (3), 176–7.

de Landsheere, G. (1997) History of educational research. In J.P.Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 8–16.

Denscombe, M. (1995) Explorations in group interviews: an evaluation of a reflexive and partisan approach. British Educational Research Journal, 21 (2), 131–48.

Denzin, N.K. (1970) The Research Act in Sociology: a Theoretical Introduction to Sociological Methods. London: Butterworth.

Denzin, N.K. (1989) The Research Act (third edition). Englewood Cliffs, NJ: Prentice-Hall.

Denzin, N.K. (1997) Triangulation in educational research. In J.P.Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 318–22.

Denzin, N.K. (1999) Biographical research methods. In J.P.Keeves and G.Lakomski (eds) Issues in Educational Research. Oxford: Elsevier Science Ltd., 92–102.

Denzin, N.K. and Lincoln, Y.S. (eds) (1994) Handbook of Qualitative Research. Thousand Oaks, California: Sage Publications Inc.

Department of Education and Science (1977) A Study of School Buildings. Annex 1. London: HMSO.

Derry, S.J. and Potts, M.K. (1998) How tutors model students: a study of personal constructs in adaptive tutoring. American Educational Research Journal, 35 (1), 65–99.

Deyle, D.L., Hess, G. and LeCompte, M.L. (1992) Approaching ethical issues for qualitative researchers in education. In M.LeCompte, W.L.Millroy and J.Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 597–642.

Diaper, G. (1990) The Hawthorne Effect: a fresh examination. Educational Studies, 16 (3), 261–7.

Dicker, R. and Gilbert, J. (1988) The role of the telephone in educational research. British Educational Research Journal, 14 (1), 65–72.

Didierjean, A. and Cauzinille-Marmèche, E. (1998) Reasoning by analogy: is it schema-mediated or case-based? European Journal of Psychology of Education, 13 (3), 385–98.

Diener, E. and Crandall, R. (1978) Ethics in Social and Behavioral Research. Chicago: University of Chicago Press.

Dietz, S.M. (1977) An analysis of programming DRL schedules in educational settings. Behaviour Research and Therapy, 15, 103–11.

Dobbert, M.L. and Kurth-Schai, R. (1992) Systematic ethnography: toward an evolutionary science of education and culture. In M.LeCompte, W.L.Millroy and J.Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 93–160.

Doll, W.E. (1993) A Post-modern Perspective on Curriculum. New York: Teachers College Press.

Dollard, J. (1949) Criteria for the Life History. New Haven, CT: Yale University Press.

Dosanjh, J.S. and Ghuman, P.A.S. (1997) Asian parents and English education—20 years on: a study of two generations. Educational Studies, 23 (3), 459–72.

Douglas, J.D. (1973) Understanding Everyday Life. London: Routledge & Kegan Paul.

Douglas, J.D. (1976) Investigative Social Research. Beverly Hills: Sage Publications.

Douglas, J.W.B. (1976) The use and abuse of national cohorts. In M.D.Shipman (ed.) The Organization and Impact of Social Research. London: Routledge & Kegan Paul, 3–21.

Dugard, P. and Todman, J. (1995) Analysis of pretest and post-test control group designs in educational research. Educational Psychology, 15 (2), 181–98.

Duncan Mitchell, G. (1968) A Dictionary of Sociology. London: Routledge & Kegan Paul.

Dunham, R.B. and Smith, F.J. (1979) Organisational Surveys: an Internal Assessment of Organizational Health. Glenview, IL: Scott, Foresman and Co.

Dunn, S. and Morgan, V. (1987) Nursery and infant school play patterns: sex-related differences. British Educational Research Journal, 13 (3), 271–81.

Dunning, G. (1993) Managing the small primary school: the problem role of the teaching head. Educational Management and Administration, 21 (2), 79–89.

Durkheim, E. (1956) Education and Sociology. Glencoe, IL: Free Press.

Eagleton, T. (1991) Ideology. London: Verso.

Ebbinghaus, H. (1897) Über eine neue Methode zur Prüfung geistiger Fähigkeiten. Cited in G.de Landsheere (1997) History of educational research. In J.P.Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 8–16.

Ebbutt, D. (1985) Educational action research: some general concerns and specific quibbles. In R.G.Burgess (ed.) (1985b) Issues in Educational Research: Qualitative Methods. Lewes: Falmer, 152–74.

Ebel, R.L. (1972) Essentials of Educational Measurement. Englewood Cliffs, NJ: Prentice-Hall.

Ebel, R.L. (1979) Essentials of Educational Measurement (third edition). Englewood Cliffs, NJ: Prentice-Hall.

Edwards, A. (1990) Practitioner Research. Study Guide No. 1. Lancaster: St Martin's College.

Edwards, A.D. (1976) Language in Culture and Class. London: Heinemann.

Edwards, A.D. (1980) Patterns of power and authority in classroom talk. In P.Woods (ed.) Teacher Strategies: Explorations in the Sociology of the School. London: Croom Helm, 237–53.

Edwards, A.D. and Westgate, D.P.G. (1987) Investigating Classroom Talk. Lewes: Falmer Press.

Edwards, D. (1991) Discourse and the development of understanding in the classroom. In O.Boyd-Barrett and E.Scanlon (eds) Computers and Learning. Wokingham: Addison-Wesley Publishing Co., 186–204.

Edwards, D. (1993) Concepts, memory and the organisation of pedagogic discourse: a case study. International Journal of Educational Research, 19 (3), 205–25.

Edwards, D. and Mercer, N.M. (1987) Common Knowledge: The Development of Understanding in the Classroom. London: Routledge & Kegan Paul.

Edwards, D. and Mercer, N.M. (1989) Reconstructing context: the conventionalization of classroom knowledge. Discourse Processes, 12, 91–104.

Edwards, D. and Potter, J. (1993) Language and causation: a discursive action model of description and attribution. Psychological Review, 100 (1), 23–41.

Eisenhart, M.A. and Howe, K.R. (1992) Validity in educational research. In M.D.LeCompte, W.L.Millroy and J.Preissle (eds) The Handbook of Qualitative Studies in Education. New York: Academic Press, 643–80.

Eisner, E. (1985) The Art of Educational Evaluation. Lewes: Falmer.

Eisner, E. (1991) The Enlightened Eye: Qualitative Inquiry and the Enhancement of Educational Practice. New York: Macmillan.

Ekehammar, B. and Magnusson, D. (1973) A method to study stressful situations. Journal of Personality and Social Psychology, 27 (2), 176–9.

Elliott, J. (1978) What is action-research in schools? Journal of Curriculum Studies, 10 (4), 355–7.

Elliott, J. (1991) Action Research for Educational Change. Buckingham: Open University Press.

English, H.B. and English, A.C. (1958) A Comprehensive Dictionary of Psychological and Psychoanalytic Terms. London: Longman.

Entwistle, N.J. and Ramsden, P. (1983) Understanding Student Learning. Beckenham: Croom Helm.

Epting, F.R. (1988) Journeying into the personal constructs of children. International Journal of Personal Construct Psychology, 1 (1), 53–61.

Epting, F.R., Suchman, D.I. and Nickeson, K.J. (1971) An evaluation of elicitation procedures for personal constructs. British Journal of Psychology, 62, 513–17.

Erickson, F. (1973) What makes school ethnography 'ethnographic'? Anthropology and Education Quarterly, 4 (2), 10–19.

Erickson, F.E. (1992) Ethnographic microanalysis of interaction. In M.LeCompte, W.L.Millroy and J.Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 201–26.

Erikson, K.T. (1967) A comment on disguised observation in sociology. Social Problems, 14, 366–73.

Evans, K.M. (1978) Planning Small Scale Research. Windsor: NFER Publishing Co.

Everett, M. (1984) The Scottish comprehensive school: its function and the roles of its teachers with special reference to the opinions of pupils and student teachers. Unpublished Ph.D. dissertation, School of Education, University of Durham.

Everitt, B.S. (1974) Cluster Analysis. London: Heinemann Educational Books.

Everitt, B.S. (1977) The Analysis of Contingency Tables. London: Chapman & Hall.

Evetts, J. (1990) Women in Primary Teaching. London: Unwin Hyman.

Evetts, J. (1991) The experience of secondary headship selection: continuity and change. Educational Studies, 17 (3), 285–94.

Eysenck, H. (1978) An exercise in mega-silliness. American Psychologist, 33, 517.

Fay, B. (1987) Critical Social Science. New York: Cornell University Press.

Feixas, G., Marti, J. and Villegas, M. (1989) Personal construct assessment of sports teams. International Journal of Personal Construct Psychology, 2 (1), 49–54.

Feldt, L.S. and Brennan, R.L. (1993) Reliability. In R.Linn (ed.) Educational Measurement. New York: Macmillan Publishing Co., 105–46.

Fendler, L. (1999) Making trouble: prediction, agency, critical intellectuals. In T.S.Popkewitz and L.Fendler (eds) Critical Theories in Education: Changing Terrains of Knowledge and Politics. London: Routledge, 169–88.

Ferris, J. and Gerber, R. (1996) Mature-age students' feelings of enjoying learning in a further education context. European Journal of Psychology of Education, 11 (1), 79–96.

Festinger, L. and Katz, D. (1966) Research Methods in the Behavioral Sciences. New York: Holt, Rinehart & Winston.

Fiedler, J. (1978) Field Research: a Manual for Logistics and Management of Scientific Studies in Natural Settings. London: Jossey-Bass.

Field, P.A. and Morse, J.M. (1989) Nursing Research: the Application of Qualitative Methods. London: Chapman and Hall.

Fielding, N.G. and Fielding, J.L. (1986) Linking Data. Beverly Hills, CA: Sage Publications.

Finch, J. (1985) Social policy and education: problems and possibilities of using qualitative research. In R.G.Burgess (ed.) Issues in Educational Research. Lewes: Falmer Press, 109–28.

Finch, J. (1986) Research and Policy: the Uses of Qualitative Methods in Social and Educational Research. Lewes: Falmer Press.

Fine, G.A. and Sandstrom, K.L. (1988) Knowing Children: Participant Observation with Minors. Qualitative Research Methods Series 15. Beverly Hills: Sage Publications.

Finn, C.E. (1991) What ails education research? In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 39–42.

Finnegan, R. (1996) Using documents. In R.Sapsford and V.Jupp (eds) Data Collection and Analysis. London: Sage Publications and the Open University Press, 138–51.

Fisher, B., Russell, T. and McSweeney, P. (1991) Using personal constructs for course evaluation. Journal of Further and Higher Education, 15 (1), 44–57.

Fitz-Gibbon, C.T. (1984) Meta-analysis: an explanation. British Educational Research Journal, 10 (2), 135–44.

Fitz-Gibbon, C.T. (1985) The implications of meta-analysis for educational research. British Educational Research Journal, 11 (1), 45–9.

Fitz-Gibbon, C.T. (1991) Multilevel modelling in an indicator system. In S.W.Raudenbush and J.D.Willms (eds) Schools, Classrooms and Pupils. International Studies of Schooling from a Multilevel Perspective. San Diego, CA: Academic Press Inc., 67–83.

Fitz-Gibbon, C.T. (1996) Monitoring Education: Indicators, Quality and Effectiveness. London: Cassell.

Fitz-Gibbon, C.T. (1997) The Value Added National Project. Final Report. London: School Curriculum and Assessment Authority.

Fitz-Gibbon, C.T. (1999) Education: high potential not yet realized. Public Money and Management, Jan.–March, 19 (1), 33–9.

Flanagan, J. (1949) Critical requirements: a new approach to employee evaluation. Cited in E.C.Wragg, An Introduction to Classroom Observation. London: Routledge.

Flanders, N. (1970) Analyzing Teacher Behaviour. Reading, MA: Addison-Wesley.

Flaugher, R. (1990) Item pools. In H.Wainer (ed.) Computerized Adaptive Testing: a Primer. New Jersey: Lawrence Erlbaum, 41–64.

Floud, R. (1979) An Introduction to Quantitative Methods for Historians (second edition). London: Methuen.

Fogelman, K. (ed.) (1983) Growing Up in Great Britain: Papers from the National Child Development Study. London: Macmillan.

Forgas, J.P. (1976) The perception of social episodes: categoric and dimensional representations in two different social milieux. Journal of Personality and Social Psychology, 34 (2), 199–209.

Forgas, J.P. (1978) Social episodes and social structure in an academic setting: the social environment of an intact group. Journal of Experimental Social Psychology, 14, 434–48.

Forgas, J.P. (1981) Social Episodes: The Study of Interaction Routines. London: Academic Press.

Forward, J., Canter, R. and Kirsch, N. (1976) Role-enactment and deception methodologies. American Psychologist, 35, 595–604.

Foskett, N.H. and Hesketh, A.J. (1997) Constructing choice in continuous and parallel markets: institutional and school leavers' responses to the new post-16 marketplace. Oxford Review of Education, 23 (3), 299–319.

Foster, J.R. (1992) Eliciting personal constructs and articulating goals. Journal of Career Development, 18 (3), 175–85.

Foster, P. (1989) Change and adjustment in a Further Education College. In R.G.Burgess (ed.) The Ethics of Educational Research. Lewes: Falmer Press, 188–204.

Fourali, C. (1997) Using fuzzy logic in educational measurement: the case of portfolio assessment. Evaluation and Research in Education, 11 (3), 129–48.

Fox, D.J. (1969) The Research Process in Education. New York: Holt, Rinehart & Winston.

Francis, H. (1992) Patterns of reading development in the first school. British Journal of Educational Psychology, 62, 225–32.

Frankfort-Nachmias, C. and Nachmias, D. (1992) Research Methods in the Social Sciences. London: Edward Arnold.

Fransella, F. (1975) Need to Change? London: Methuen.

Fransella, F. and Bannister, D. (1977) A Manual for Repertory Grid Technique. London: Academic Press.

Fredericksen, J.R. and Collins, A. (1989) A systems approach to educational testing. Educational Researcher, 18 (9), 27–32.

Freire, P. (1970) The Pedagogy of the Oppressed. Harmondsworth: Penguin.

Frisbie, D. (1981) The relative difficulty ratio—a test and item index. Educational and Psychological Measurement, 41 (2), 333–9.

Fullan, M. (1999) Change Forces (second edition). London: Falmer.

Gadamer, H.G. (1975) Truth and Method. New York: Polity Press.

Gage, N.L. (1989) The paradigm wars and their aftermath. Teachers College Record, 91 (2), 135–50.

Gallagher, T., McEwen, A. and Knip, D. (1997) Science education policy: a survey of the participation of sixth-form pupils in science and the subjects over a 10-year period, 1985–95. Research Papers in Education, 12 (2), 121–42.

Galton, M., Simon, B. and Croll, P. (1980) Inside the Primary Classroom. London: Routledge & Kegan Paul.

Galton, M., Hargreaves, L., Comber, C., Wall, D. and Pell, T. (1999) Changes in patterns in teacher interaction in primary classrooms, 1976–1996. British Educational Research Journal, 25 (1), 23–37.

Gardiner, P. (1961) The Nature of Historical Explanation. Oxford: Oxford University Press.

Gardner, G. (1978) Social Surveys for Social Planners. Milton Keynes: Open University Press.

Gardner, H. (1993) Multiple Intelligences: the Theory in Practice. New York: Basic Books.

Garfinkel, H. (1967) Studies in Ethnomethodology. Englewood Cliffs, NJ: Prentice-Hall.

Garner, C. and Raudenbush, S.W. (1991) Neighbourhood effects in educational attainment: a multilevel analysis. Sociology of Education, 64 (4), 251–62.

Garrahan, P. and Stewart, P. (1992) The Nissan Enigma: Flexibility at Work in a Local Economy. London: Mansell.

Gaukroger, A. and Schwartz, L. (1997) A university and its region: student recruitment to Birmingham, 1945–75. Oxford Review of Education, 23 (2), 185–202.

Geertz, C. (1973) Thick description: towards an interpretive theory of culture. In C.Geertz (ed.) The Interpretation of Cultures. New York: Basic Books.

Geertz, C. (ed.) (1973) The Interpretation of Cultures. New York: Basic Books.

Geertz, C. (1974) From the native's point of view: on the nature of anthropological understanding. Bulletin of the American Academy of Arts and Sciences, 28 (1), 26–45.

Gersch, I. (1984) Behaviour modification and systems analysis in a secondary school: combining two approaches. Behavioural Approaches with Children, 8, 83–91.

Geuss, R. (1981) The Idea of a Critical Theory. London: Cambridge University Press.

Gewirtz, S. and Ozga, J. (1993) Sex, lies and audiotape: interviewing the education policy elite. Paper presented to the Economic and Social Research Council 1988 Education Reform Act Research seminar, University of Warwick.

Gewirtz, S. and Ozga, J. (1994) Interviewing the education policy elite. In G.Walford (ed.) Researching the Powerful in Education. London: UCL Press, 186–203.

Gibson, R. (1985) Critical times for action research. Cambridge Journal of Education, 15 (1), 59–64.

Giddens, A. (ed.) (1975) Positivism and Sociology. London: Heinemann Educational Books.

Giddens, A. (1976) New Rules of Sociological Method: a Positive Critique of Interpretative Sociologies. London: Hutchinson.

Giddens, A. (1979) Central Problems in Social Theory. London: Macmillan.

Gilbert, G.N. (1983) Accounts and those accounts called actions. In G.N.Gilbert and P.Abell (eds) Accounts and Action. Aldershot: Gower, 183–7.

Ginsburg, G.P. (1978) Role playing and role performance in social psychological research. In M.Brenner, P.Marsh and M.Brenner (eds) The Social Contexts of Method. London: Croom Helm, 91–121.

Gipps, C. (1994) Beyond Testing: towards a Theory of Educational Assessment. London: Falmer.

Giroux, H. (1989) Schooling for Democracy. London: Routledge.

Glaser, B.G. (1978) Theoretical Sensitivity: Advances in the Methodology of Grounded Theory. Mill Valley, CA: Sociology Press.

Glaser, B.G. and Strauss, A.L. (1967) The Discovery of Grounded Theory. Chicago: Aldine.

Glass, G.V. (1976) Primary, secondary and meta-analysis. Educational Researcher, 5, 3–8.

Glass, G.V. (1977) Integrating findings: the meta-analysis of research. Review of Research in Education, 5, 351–79.

Glass, G.V. and Hopkins, K.D. (1996) Statistical Methods in Education and Psychology (third edition). Boston: Allyn & Bacon.

Glass, G.V. and Smith, M.L. (1978) Meta-analysis of Research on the Relationship of Class-size and Achievement. San Francisco: Farwest Laboratory.

Glass, G.V., McGaw, B. and Smith, M.L. (1981) Meta-analysis in Social Research. Beverly Hills: Sage Publications.

Glass, G.V., Cahen, L.S., Smith, M.L. and Filby, N.N. (1982) School Class Size: Research and Policy. Beverly Hills: Sage Publications.

Gleick, J. (1987) Chaos. London: Abacus.

Goffman, E. (1968) Asylums. Harmondsworth: Penguin.

Goffman, E. (1969) The Presentation of Self in Everyday Life. Harmondsworth: Penguin.

Gold, R.L. (1958) Roles in sociological field observations. Social Forces, 36, 217–23.

Goldstein, H. (1987) Multilevel Modelling in Educational and Social Research. London: Charles Griffin and Co. Ltd.

Good, C.V. (1963) Introduction to Educational Research. New York: Appleton-Century-Crofts.

Goodson, I. (1983) The use of life histories in the study of teaching. In M.Hammersley (ed.) The Ethnography of Schooling. Driffield: Nafferton Books, 129–54.

Goodson, I. (1990) The Making of Curriculum. Lewes: Falmer Press.

Goodson, I. and Walker, R. (1988) Putting life into educational research. In R.R.Sherman and R.B.Webb (eds) Qualitative Research in Education: Focus and Methods. Lewes: Falmer Press, 110–22.

Goossens, L., Marcoen, A., van Hees, S. and van de Woestlijne, O. (1998) Attachment style and loneliness in adolescence. European Journal of Psychology of Education, 13 (4), 529–42.

Gottschalk, L. (1951) Understanding History. New York: Alfred A.Knopf.

Graue, M.E. and Walsh, D.J. (1998) Studying Children in Context: Theories, Methods and Ethics. London: Sage Publications.

Gray, J. and Satterly, D. (1976) A chapter of errors: teaching styles and pupil progress in retrospect. Educational Research, 19, 45–56.

Gray, J., Reynolds, D., Fitz-Gibbon, C.T. and Jesson, D. (eds) (1996) Merging Traditions: The Future of Research on School Effectiveness and School Improvement. London: Cassell.

Greig, A.D. and Taylor, J. (1998) Doing Research with Children. London: Sage Publications.

Gronlund, N.E. (1981) Measurement and Evaluation in Teaching (fourth edition). New York: Collier-Macmillan.

Gronlund, N.E. (1985) Stating Objectives for Classroom Instruction (third edition). New York: Macmillan.

Gronlund, N.E. and Linn, R.L. (1990) Measurement and Evaluation in Teaching (sixth edition). New York: Macmillan.

Grundy, S. (1987) Curriculum: Product or Praxis. Lewes: Falmer.

Grundy, S. (1996) Towards empowering leadership: the importance of imagining. In O.Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer, 106–20.

Grundy, S. and Kemmis, S. (1988) Educational action research in Australia: the state of the art (an overview). In S.Kemmis and R.McTaggart (eds) The Action Research Reader (second edition). Victoria: Deakin University Press, 83–97.

Guba, E.G. and Lincoln, Y.S. (1989) Fourth Generation Evaluation. Beverly Hills: Sage Publications.

Guba, E.G. and Lincoln, Y.S. (1991) What is the constructivist paradigm? In D.S.Anderson and B.J.Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 158–70.

Guba, E.G. and Lincoln, Y.S. (1994) Competing paradigms in qualitative research. In N.K.Denzin and Y.S.Lincoln (eds) Handbook of Qualitative Research. Beverly Hills: Sage Publications, 105–17.

Guilford, J.P. and Fruchter, B. (1973) Fundamental Statistics in Psychology and Education. New York: McGraw-Hill.

Habermas, J. (1970) Toward a theory of communicative competence. Inquiry, 13, 360–75.

Habermas, J. (1972) Knowledge and Human Interests (trans. J.Shapiro). London: Heinemann.

Habermas, J. (1974) Theory and Practice (trans. J.Viertel). London: Heinemann.

Habermas, J. (1976) Legitimation Crisis (trans. T.McCarthy). London: Heinemann.

Habermas, J. (1979) Communication and the Evolution of Society. London: Heinemann.

Habermas, J. (1982) A reply to my critics. In J.Thompson and D.Held (eds) Habermas: Critical Debates. London: Macmillan, 219–83.

Habermas, J. (1984) The Theory of Communicative Action. Vol. 1: Reason and the Rationalization of Society (trans. T.McCarthy). Boston: Beacon Press.

Habermas, J. (1988) On the Logic of the Social Sciences (trans. S.Nicholsen and J.Stark). Oxford: Polity Press in association with Basil Blackwell.

Habermas, J. (1990) Moral Consciousness and Communicative Action (trans. C.Lenhardt and S.Nicholsen). Cambridge: Polity Press in association with Basil Blackwell.

Haig, B.D. (1997) Feminist research methodology. In J.P.Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 180–5.

Haig, B.D. (1999) Feminist research methodology. In J.P.Keeves and G.Lakomski (eds) Issues in Educational Research. Oxford: Elsevier Science Ltd., 222–31.

Hakim, C. (1987) Research Design, ContemporarySocial Research: 13. London: Allen & Unwin.

Haladyna, T., Nolen, S. and Haas, N. (1991) Raisingstandardised achievement test scores and the ori-gins of test score pollution. Educational Re-searcher, 20 (5), 2–7.

Hall, E., Hall, C. and Abaci, R. (1997) The effects ofhuman relations training on reported teacherstress, pupil control ideology and locus of con-trol. British Journal of Educational Psychology,67, 483–96.

Hall, K. and Nuttall, W. (1999) The relative importanceof class size to infant teachers in England. BritishEducational Research Journal, 25 (2), 245–58.

Hall, S. (1996) Reflexivity in emancipatory actionresearch: illustrating the researcher’sconstitutiveness. In O.Zuber-Skerritt (ed.) NewDirections in Action Research. London: Falmer.

Halpin, D., Croll, P. and Redman, K. (1990) Teach-ers’ perceptions of the effects of inservice educa-tion. British Educational Research Journal, 16 (2),163–77.

Halsey, A.H. (ed.) (1972) Educational Priority: Vol1: E.P.A. Problems and Policies. London:HMSO.

Hambleton, R.K. (1993), Principles and selected ap-plication of item response theory, in R.Linn (1993)(ed.) Educational Measurement (third edition).American Council on Education and the OryxPress, Phoenix, Arizona, 147–200.

Hamilton, D. (1981) Generalisation in the educational sciences: problems and purposes. In T.S. Popkewitz and R.S. Tabachnick (eds) The Study of Schooling. New York: Praeger.

Hamilton, V.L. (1976) Role play and deception: a re-examination of the controversy. Journal for the Theory of Social Behaviour, 6, 233–50.

Hammersley, M. (1979) Analysing ethnographic data. Unit 1, Block 6, DE304 Research Methods in Education and the Social Sciences. Milton Keynes: Open University Press.

Hammersley, M. (1987) Some notes on the terms 'validity' and 'reliability'. British Educational Research Journal, 13 (1), 73–81.

Hammersley, M. (1992) What's Wrong with Ethnography? London: Routledge.

Hammersley, M. (ed.) (1993) Social Research: Philosophy, Politics and Practice. London: Sage Publications in association with the Open University Press.

Hammersley, M. and Atkinson, P. (1983) Ethnography: Principles in Practice. London: Methuen.

Hampden-Turner, C. (1970) Radical Man. Cambridge, MA: Schenkman.

Haney, C., Banks, C. and Zimbardo, P. (1973) Interpersonal dynamics in a simulated prison. International Journal of Criminology and Penology, 1, 69–97.

Hanna, G.S. (1993) Better Teaching through Better Measurement. Fort Worth: Harcourt Brace Jovanovich Inc.

Hannan, A. and Newby, M. (1992) Student teacher and headteacher views on current provision and proposals for the future of Initial Teacher Education for primary schools (mimeo). Rolle Faculty of Education, University of Plymouth.

Hargreaves, D.H., Hester, S.K. and Mellor, F.J. (1975) Deviance in Classrooms. London: Routledge & Kegan Paul.

Harlen, W. (ed.) (1994) Enhancing Quality in Assessment. London: Paul Chapman Publishing.

Harré, R. (1974) Some remarks on 'rule' as a scientific concept. In T. Mischel (ed.) On Understanding Persons. Oxford: Basil Blackwell, 143–84.

Harré, R. (1976) The constructive role of models. In L. Collins (ed.) The Use of Models in the Social Sciences. London: Tavistock Publications, 16–43.

Harré, R. (1977a) The ethogenic approach: theory and practice. In L. Berkowitz (ed.) Advances in Experimental Social Psychology, Vol. 10. New York: Academic Press, 284–314.

Harré, R. (1977b) Friendship as an accomplishment. In S. Duck (ed.) Theory and Practice in Interpersonal Attraction. London: Academic Press, 339–54.

Harré, R. (1978) Accounts, actions and meanings: the practice of participatory psychology. In M. Brenner, P. Marsh and M. Brenner (eds) The Social Contexts of Method. London: Croom Helm, 44–66.

Harré, R. and Rosser, E. (1975) The rules of disorder. The Times Educational Supplement, 25 July.

Harré, R. and Secord, P. (1972) The Explanation of Social Behaviour. Oxford: Basil Blackwell.

Harris, N., Pearce, P. and Johnstone, S. (1992) The Legal Context of Teaching. London: Longman.

Harvey, C.D.H. (1988) Telephone survey techniques. Canadian Home Economics Journal, 38 (1), 30–5.

Heath, S.B. (1982) Questioning at home and at school: a comparative study. In G. Spindler (ed.) Doing the Ethnography of Schooling. New York: Holt, Rinehart & Winston, 102–31.

Hedges, A. (1985) Group interviewing. In R. Walker (ed.) Applied Qualitative Research. Aldershot: Gower, 71–91.

Hedges, L. (1981) Distribution theory for Glass's estimator of effect size and related estimators. Journal of Educational Statistics, 6, 107–28.

Hedges, L. (1990) Directions for future methodology. In K.W. Wachter and M.L. Straf (eds) The Future of Meta-analysis. New York: Russell Sage Foundation. Cited in B. McGaw, Meta-analysis. In J.P. Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 371–80.

Hedges, L. and Olkin, I. (1980) Vote-counting methods in research synthesis. Psychological Bulletin, 88 (2), 359–69.

Hedges, L. and Olkin, I. (1985) Statistical Methods for Meta-analysis. New York: Academic Press.

Held, D. (1980) Introduction to Critical Theory. Los Angeles: University of California Press.

Hesse, M. (1982) Science and objectivity. In J. Thompson and D. Held (eds) Habermas: Critical Debates. London: Macmillan, 98–115.

Hibbett, A., Fogelman, K. and Manor, O. (1990) Occupational outcomes of truancy. British Journal of Educational Psychology, 60, 23–36.

Higgs, G., Webster, C.J. and White, S.D. (1997) The use of geographical information systems in assessing spatial and socio-economic impacts of parental choice. Research Papers in Education, 12 (1), 27–48.

Hill, J.E. and Kerber, A. (1967) Models, Methods and Analytical Procedures in Educational Research. Detroit: Wayne State University Press.

Hill, P.W. and Rowe, K.J. (1996) Multilevel modelling in school effectiveness research. School Effectiveness and School Improvement, 7 (1), 1–34.

Hinkle, D.N. (1965) The change of personal constructs from the viewpoint of a theory of implications. Unpublished Ph.D. thesis, Ohio State University.

Hitchcock, G. and Hughes, D. (1989) Research and the Teacher. London: Routledge.

Hitchcock, G. and Hughes, D. (1995) Research and the Teacher (second edition). London: Routledge.

Hockett, H.C. (1955) The Critical Method in Historical Research and Writing. London: Macmillan.

Hodgkinson, H.L. (1957) Action research—a critique. Journal of Educational Sociology, 31 (4), 137–53.

Hoinville, G. and Jowell, R. (1978) Survey Research Practice. London: Heinemann.

Holbrook, D. (1977) Education, Nihilism and Survival. London: Darton, Longman & Todd.

Holland, J.L. (1985) Making Vocational Choices: a Theory of Vocational Personalities and Work Environments. Englewood Cliffs, NJ: Prentice-Hall.

Holly, P. (1984) Action Research: a Cautionary Note. Classroom Action Research Network, Bulletin No. 6. Cambridge Institute of Education.

Holly, P. and Whitehead, D. (1986) Action Research in Schools: Getting It into Perspective. Classroom Action Research Network.

Holmes, P. and Karp, M. (1991) Psychodrama: Inspiration and Technique. London: Routledge.

Holmes, R.M. (1998) Fieldwork with Children. London: Sage Publications.

Holsti, O.R. (1968) Content analysis. In G. Lindzey and E. Aronson (eds) The Handbook of Social Psychology. Vol. 2: Research Methods. Reading, MA: Addison-Wesley, 596–692.

Hopkins, D. (1985) A Teacher's Guide to Classroom Research. Milton Keynes: Open University Press.

Hopkins, K.D., Hopkins, B.R. and Glass, G.V. (1996) Basic Statistics for the Behavioral Sciences (third edition). Boston: Allyn & Bacon.

Horkheimer, M. (1972) Critical Theory: Selected Essays (trans. M. Connell et al.). New York: Herder & Herder.

Horowitz, I.L. and Katz, J.E. (1991) Brown vs Board of Education. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 237–44.

Houghton, D. (1991) Mr Chong: a case study of a dependent learner of English for academic purposes. System, 19 (1–2), 75–90.

House, E.R. (1979) Technology versus craft: a ten-year perspective on innovation. Journal of Curriculum Studies, 11 (1), 1–15.

Hoyle, E. (1986) The Politics of School Management. Sevenoaks: Hodder & Stoughton.

Hudson, P. and Miller, C. (1997) The treasure hunt: strategies for obtaining maximum response to a postal survey. Evaluation and Research in Education, 11 (2), 102–12.

Hughes, J.A. (1976) Sociological Analysis: Methods of Discovery. Sunbury-on-Thames: Nelson & Sons.

Hull, C. (1985) Between the lines: the analysis of interview data as an exact art. British Educational Research Journal, 11 (1), 27–31.

Hult, M. and Lennung, S. (1980) Towards a definition of action-research: a note and bibliography. Journal of Management Studies, 17 (2), 241–50.

Humphreys, L. (1975) Tearoom Trade: Impersonal Sex in Public Places (enlarged edition). New York: Aldine.

Hunter, J.E., Schmidt, F.L. and Jackson, G.B. (1982) Meta-analysis: Cumulating Research Findings across Studies. Beverly Hills: Sage Publications.

Hurn, C.J. (1978) The Limits and Possibilities of Schooling. Boston, MA: Allyn & Bacon.

Hutchinson, B. and Whitehouse, P. (1986) Action research, professional competence and school organization. British Educational Research Journal, 12 (1), 85–94.

Hycner, R.H. (1985) Some guidelines for the phenomenological analysis of interview data. Human Studies, 8, 279–303.

Ions, E. (1977) Against Behaviouralism: a Critique of Behavioural Science. Oxford: Basil Blackwell.

Jacklin, A. and Lacey, C. (1997) Gender integration in the infant classroom: a case study. British Educational Research Journal, 23 (5), 623–40.

Jackson, D. and Marsden, D. (1962) Education and the Working Class. London: Routledge & Kegan Paul.


Jackson, G.B. (1980) Methods for integrative reviews. Review of Educational Research, 50 (3), 438–60.

Jacob, E. (1987) Qualitative research traditions: a review. Review of Educational Research, 57 (1), 1–50.

James, M. (1993) Evaluation for policy: rationality and political reality: the paradigm case of PRAISE? In R.G. Burgess (ed.) Educational Research and Evaluation for Policy and Practice. London: Falmer.

Jayaratne, T.E. (1993) The value of quantitative methodology for feminist research. In M. Hammersley (ed.) Social Research: Philosophy, Politics and Practice. London: Sage Publications, in association with the Open University Press, 109–23.

Jones, D. and Vann, P. (1994) Informing School Decisions: GIS in Education. Luton: Local Government Management Board.

Jones, J. (1990) The role of the headteacher in staff development. Educational Management and Administration, 18 (1), 27–36.

Jones, J.L. (1998) Managing the induction of newly appointed governors. Educational Research, 40 (3), 329–51.

Jones, S. (1987) The analysis of depth interviews. In R. Murphy and H. Torrance (eds) Evaluating Education: Issues and Methods. London: Paul Chapman Publishing, 263–77.

Jules, V. and Kutnick, P. (1997) Student perceptions of a good teacher: the gender perspective. British Journal of Educational Psychology, 67, 497–511.

Kamin, L. (1991) Some historical facts about IQ testing. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 259–67.

Kaplan, A. (1973) The Conduct of Inquiry. Aylesbury: Intertext Books.

Kauffmann, S. (1995) At Home in the Universe. Harmondsworth: Penguin.

Kazdin, A.E. (1982) Single-case Research Designs. New York: Oxford University Press.

Keat, R. (1981) The Politics of Social Theory. Oxford: Basil Blackwell.

Keeves, J.P. (ed.) (1997) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd.

Keeves, J.P. and Lakomski, G. (eds) (1999) Issues in Educational Research. Oxford: Elsevier Science Ltd.

Keeves, J.P. and Sellin, N. (1997) Multilevel analysis. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook. Oxford: Elsevier Science Ltd., 394–403.

Kelle, U. (ed.) (1995) Computer-Aided Qualitative Data Analysis. London: Sage Publications.

Kelle, U. and Laurie, H. (1995) Computer use in qualitative research and issues of validity. In U. Kelle (ed.) Computer-Aided Qualitative Data Analysis. London: Sage Publications, 19–28.

Kelly, A. (1986) The development of children's attitudes to science: a longitudinal study. European Journal of Science Education, 8, 399–412.

Kelly, A. (1987) Science for Girls? Milton Keynes: Open University Press.

Kelly, A. (1989a) Education or indoctrination? The ethics of school-based action research. In R.G. Burgess (ed.) The Ethics of Educational Research. Lewes: Falmer, 100–13.

Kelly, A. (1989b) Getting the GIST: a qualitative study of the effects of the Girls Into Science and Technology Project. Manchester Sociology Occasional Papers, No. 22. Manchester: University of Manchester.

Kelly, A. and Smail, B. (1986) Sex stereotypes and attitudes to science among eleven-year-old children. British Journal of Educational Psychology, 56, 158–68.

Kelly, G.A. (1955) The Psychology of Personal Constructs. New York: Norton.

Kelly, G.A. (1969) Clinical Psychology and Personality: the Selected Papers of George Kelly, edited by B.A. Maher. New York: Wiley.

Kelman, H.C. (1967) Human use of human subjects. Psychological Bulletin, 67 (1), 1–11.

Kemmis, S. (1982) Seven principles for programme evaluation in curriculum development and innovation. Journal of Curriculum Studies, 14 (3), 221–40.

Kemmis, S. (1997) Action research. In J.P. Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 173–9.

Kemmis, S. (1999) Action research. In J.P. Keeves and G. Lakomski (eds) Issues in Educational Research. Oxford: Elsevier Science Ltd., 150–60.

Kemmis, S. and McTaggart, R. (eds) (1981) The Action Research Planner (first edition). Geelong, Victoria: Deakin University Press.


Kemmis, S. and McTaggart, R. (eds) (1988) The Action Research Planner (second edition). Geelong, Victoria: Deakin University Press.

Kemmis, S. and McTaggart, R. (eds) (1992) The Action Research Planner (third edition). Geelong, Victoria, Australia: Deakin University Press.

Kerlinger, F.N. (1970) Foundations of Behavioral Research. New York: Holt, Rinehart & Winston.

Kerlinger, F.N. (1986) Foundations of Behavioral Research (third edition). New York: Holt, Rinehart & Winston.

Kerlinger, F.N. (1991) Science and behavioural research. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 87–102.

Kierkegaard, S. (1974) Concluding Unscientific Postscript. Princeton: Princeton University Press.

Kimmel, A.J. (1988) Ethics and Values in Applied Social Research. Beverly Hills: Sage Publications.

Kincheloe, J.L. (1991) Teachers as Researchers: Qualitative Inquiry as a Path to Empowerment. London: Falmer.

Kincheloe, J. and McLaren, P. (1994) Rethinking critical theory and qualitative research. In N.K. Denzin and Y.S. Lincoln (eds) Handbook of Qualitative Research. Beverly Hills: Sage Publications, 105–17.

King, J.A., Morris, L.L. and Fitz-Gibbon, C.T. (1987) How to Assess Programme Implementation. Beverly Hills: Sage Publications.

King, R. (1979) All Things Bright and Beautiful? Chichester: John Wiley.

Kirk, J. and Miller, M.L. (1986) Reliability and Validity in Qualitative Research. Qualitative Research Methods Series No. 1. Beverly Hills: Sage Publications.

Kitwood, T.M. (1977) Values in adolescent life: towards a critical description. Unpublished Ph.D. dissertation, School of Education, University of Bradford.

Kivulu, J.M. and Rogers, W.T. (1998) A multilevel analysis of cultural experience and gender influences on causal attributions to perceived performance in mathematics. British Journal of Educational Psychology, 68, 25–37.

Knott, J. and Wildavsky, A. (1991) If dissemination is the solution, what is the problem? In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 214–24.

Kogan, M. and Atkin, J.M. (1991) Special commissions and educational policy in the U.S.A. and U.K. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 245–58.

Kolakowski, L. (1978) Main Currents of Marxism, Volume Three: the Breakdown (trans. P.S. Falla). Oxford: Clarendon Press.

Kosko, B. (1994) Fuzzy Thinking. London: Flamingo.

Krejcie, R.V. and Morgan, D.W. (1970) Determining sample size for research activities. Educational and Psychological Measurement, 30, 607–10.

Kremer-Hayon, L. (1991) Personal constructs of elementary school principals in relation to teachers. Research in Education, 43, 15–21.

Krueger, R.A. (1988) Focus Groups: a Practical Guide for Applied Research. Beverly Hills: Sage Publications.

Kshir, M.A.M. (1999) An evaluative survey of the role of INSET in managing innovations in schools in Libya. Unpublished Ph.D. thesis, School of Education, University of Durham.

Kuhn, T.S. (1962) The Structure of Scientific Revolutions. Chicago: University of Chicago Press.

Kumar, D.D. (1991) A meta-analysis of the relationship between science instruction and student engagement. Educational Review, 43 (1), 49–56.

Kvale, S. (1996) Interviews. London: Sage Publications.

Labov, W. (1969) The logic of non-standard English. In N. Keddie (ed.) Tinker, Tailor…the Myth of Cultural Deprivation. Harmondsworth: Penguin, 21–66.

Laing, R.D. (1967) The Politics of Experience and the Bird of Paradise. Harmondsworth: Penguin.

Lakatos, I. (1970) Falsification and the methodology of scientific research programmes. In I. Lakatos and A. Musgrave (eds) Criticism and the Growth of Knowledge. London: Cambridge University Press, 91–195.

Lakomski, G. (1999) Critical theory. In J.P. Keeves and G. Lakomski (eds) Issues in Educational Research. Oxford: Elsevier Science Ltd., 174–83.

Lamb, S., Bibby, P., Wood, D. and Leyden, G. (1997) Communication skills, educational achievement and biographic characteristics of children with moderate learning difficulties. European Journal of Psychology of Education, 12 (4), 401–14.

Lansing, J.B., Ginsberg, G.P. and Braaten, K. (1961) An Investigation of Response Error. Bureau of Economic and Business Research, University of Illinois.

Lather, P. (1986) Research as praxis. Harvard Educational Review, 56, 257–77.

Lather, P. (1991) Getting Smart: Feminist Research and Pedagogy within the Post Modern. New York: Routledge.

Laudan, L. (1990) Science and Relativism. Chicago: University of Chicago Press Ltd.

Lave, J. and Kvale, S. (1995) What is anthropological research? An interview with Jean Lave by Steiner Kvale. International Journal of Qualitative Studies in Education, 8 (3), 219–28.

Lawless, L., Smee, P. and O'Shea, T. (1998) Using concept sorting and concept mapping in business and public administration, and education: an overview. Educational Research, 40 (2), 219–35.

Layder, D. (1994) Understanding Social Theory. London: Sage Publications.

Lazarsfeld, P.F. and Barton, A. (1951) Qualitative measurement in the social sciences: classification, typologies and indices. In D.P. Lerner and H.D. Lasswell (eds) The Policy Sciences. Stanford: Stanford University Press, 155–92.

LeCompte, M. and Preissle, J. (1993) Ethnography and Qualitative Design in Educational Research (second edition). London: Academic Press Ltd.

LeCompte, M., Millroy, W.L. and Preissle, J. (eds) (1992) The Handbook of Qualitative Research in Education. London: Academic Press Ltd.

Lee, R.M. (1993) Doing Research on Sensitive Topics. London: Sage Publications.

Lehrer, R. and Franke, M.L. (1992) Applying personal construct psychology to the study of teachers' knowledge of fractions. Journal for Research in Mathematical Education, 23 (3), 223–41.

Leistyna, P., Woodrum, A. and Sherblom, S.A. (1996) Breaking Free. Cambridge, MA: Harvard Educational Review.

Levin, H.M. (1991) Why isn't educational research more useful? In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 70–8.

Levine, M. (1984) Canonical Analysis and Factor Comparisons. Beverly Hills: Sage Publications.

Levine, R.A. (1966) Towards a psychology of populations: the cross-cultural study of personality. Human Development, 3, 30–6.

Lewin, K. (1946) Action research and minority problems. Journal of Social Issues, 2 (4), 34–46.

Lewin, K. (1948) Resolving Social Conflicts. New York: Harper.

Lewin, K. (1952) Field Theory in Social Science. London: Tavistock Publications.

Lewin, R. (1993) Complexity: Life on the Edge of Chaos. London: Phoenix.

Lewis, A. (1992) Group child interviews as a research tool. British Educational Research Journal, 18 (4), 413–21.

Lewis, D. (1974) Assessment in Education. London: University of London Press.

Lewis-Beck, M.S. (ed.) (1993) Experimental Design and Methods. London: Toppan Co., with the co-operation of Sage Publications, Inc.

Lietz, P. and Keeves, J.P. (1997) Cross-sectional research methods. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 138–49.

Light, R.J. and Smith, P.V. (1971) Accumulating evidence: procedures for resolving contradictions among different research studies. Harvard Educational Review, 41, 429–71.

Likert, R. (1932) A Technique for the Measurement of Attitudes. New York: Columbia University Press.

Limerick, B., Burgess-Limerick, T. and Grace, M. (1996) The politics of interviewing: power relations and accepting the gift. International Journal of Qualitative Studies in Education, 9 (4), 449–60.

Lin, N. (1976) Foundations of Social Research. New York: McGraw-Hill.

Lincoln, Y.S. and Guba, E.G. (1985) Naturalistic Inquiry. Beverly Hills: Sage Publications.

Lincoln, Y.S. and Guba, E.G. (1986) But is it rigorous? Trustworthiness and authenticity in naturalistic inquiry. In D.D. Williams (ed.) Naturalistic Evaluation. San Francisco: Jossey-Bass, 73–84.

Linn, R.L. (ed.) (1993) Educational Measurement (third edition). Phoenix, AZ: American Council on Education and the Oryx Press.

Lipsey, M.W. (1992) Juvenile delinquency treatment: a meta-analytic inquiry into the variability of effects. In T.D. Cook, H. Cooper, D.S. Cordray, H. Hartmann, L.V. Hedges, R.J. Light, T.A. Louis and F. Mosteller (eds) Meta-analysis for Explanation. New York: Russell Sage Foundation.

Littleton, K., Ashman, H., Light, P., Artis, J., Roberts, T. and Oosterwegel, A. (1999) Gender, task contexts, and children's performance on a computer-based task. European Journal of Psychology of Education, 14 (1), 129–39.

Loevinger, J. (1957) Objective tests as instruments of psychological theory. Psychological Review, 72, 143–55.

Lofland, J. (1970) Interactionist imagery and analytic interruptus. In T. Shibutani (ed.) Human Nature and Collective Behaviour: Papers in Honour of Herbert Blumer. Englewood Cliffs, NJ: Prentice-Hall, 35–45.

Lofland, J. (1971) Analyzing Social Settings. Belmont, CA: Wadsworth.

Lonkila, M. (1995) Grounded theory as an emerging paradigm for computer-assisted qualitative data analysis. In U. Kelle (ed.) Computer-Aided Qualitative Data Analysis. London: Sage Publications, 41–51.

Lund, B. and McGechan, S. (1981) CE Programmer's Manual. Victoria, Australia: Ministry of Education, Continuing Education Division.

McClelland, D.C., Atkinson, J.W., Clark, R.A. and Lowell, E.L. (1953) The Achievement Motive. New York: Appleton-Century-Crofts.

McCormick, J. and Solman, R. (1992) Teachers' attributions of responsibility for occupational stress and satisfaction: an organisational perspective. Educational Studies, 18 (2), 201–22.

McCormick, R. and James, M. (1988) Curriculum Evaluation in Schools (second edition). London: Croom Helm.

MacDonald, B. (1987) Research and Action in the Context of Policing. Paper commissioned by the Police Federation. Norwich: Centre for Applied Research in Education, University of East Anglia.

MacDonald, G. (1997) Social work: beyond control? In A. Maynard and I. Chalmers (eds) Non-random Reflections on Health Service Research. London: BMJ Publishing Group, 122–46.

McEneaney, J.E. and Sheridan, E.M. (1996) A survey-based component for programme assessment in undergraduate pre-service teacher education. Research in Education, 55, 49–61.

McFee, G. (1993) Reflections on the nature of action research. Cambridge Journal of Education, 23 (2), 173–83.

McGaw, B. (1997) Meta-analysis. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 371–80.

McIntyre, D. and MacLeod, G. (1978) The characteristics and uses of systematic classroom observation. In R. McAleese and D. Hamilton (eds) Understanding Classroom Life. Windsor: NFER, 102–31.

McKernan, J. (1991) Curriculum Action Research. London: Kogan Page.

McLaren, P. (1995) Critical Pedagogy and Predatory Culture. London: Routledge.

McLaughlin, J.F., Owen, S.L., Fors, S.W. and Levinson, R.M. (1992) The schoolchild as health educator: diffusion of hypertension information from sixth-grade children to their parents. Qualitative Studies in Education, 5 (2), 147–65.

McNamara, E. (1986) The effectiveness of incentive and sanction systems used in secondary schools: a behavioural analysis. Durham and Newcastle Research Review, 10, 285–90.

McNiece, R. and Jolliffe, F. (1998) An investigation into regional differences in educational performance in the National Child Development Study. Educational Research, 40 (1), 13–30.

McNiff, J. (1988) Action Research: Principles and Practice. Basingstoke: Macmillan.

McNiff, J., Lomax, P. and Whitehead, J. (1996) You and Your Action Research Project. London: Routledge, in association with Hyde Publications, Bournemouth.

McQuitty, L.L. (1957) Elementary linkage analysis for isolating orthogonal and oblique types and typal relevancies. Educational and Psychological Measurement, 17, 207–29.

McTaggart, R. (1996) Issues for participatory action researchers. In O. Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer, 243–55.

Madge, J. (1963) The Origin of Scientific Sociology. London: Tavistock.

Madge, J. (1965) The Tools of Social Science. London: Longman.

Mager, R.F. (1962) Preparing Instructional Objectives. Belmont, CA: Fearon Publishers.

Magnusson, D. (1971) An analysis of situational dimensions. Perceptual and Motor Skills, 32, 851–67.

Malinowski, B. (1922) Argonauts of the Western Pacific: an Account of Native Enterprise and Adventure in the Archipelagoes of Melanesian New Guinea. New York: Dutton.

Mannheim, K. (1936) Ideology and Utopia. London: Routledge & Kegan Paul.

Marcinkiewicz, H.R. and Clariana, R.B. (1997) The performance effects of headings within multi-choice tests. British Journal of Educational Psychology, 67, 111–17.

Marris, P. and Rein, M. (1967) Dilemmas of Social Reform: Poverty and Community Action in the United States. London: Routledge & Kegan Paul.

Marsh, H.W. and Yeung, A.S. (1998) Longitudinal structural equation models of academic self-concept and achievement: gender differences in the development of math and English constructs. American Educational Research Journal, 35 (4), 705–38.

Marsh, P., Rosser, E. and Harré, R. (1978) The Rules of Disorder. London: Routledge & Kegan Paul.

Maslow, A.H. (1954) Motivation and Personality. New York: Harper & Row.

Mason, M., Mason, B. and Quayle, T. (1992) Illuminating English: how explicit language teaching improved public examination results in a comprehensive school. Educational Studies, 18 (3), 341–54.

Masschelein, J. (1991) The relevance of Habermas's communicative turn. Studies in Philosophy and Education, 11 (2), 95–111.

Maxwell, J.A. (1992) Understanding and validity in qualitative research. Harvard Educational Review, 62 (3), 279–300.

Maynard, A. and Chalmers, I. (eds) (1997) Non-random Reflections on Health Service Research. London: BMJ Publishing Group.

Mead, G.H. (1934) Mind, Self and Society (ed. Charles Morris). Chicago: University of Chicago Press.

Medawar, P.B. (1972) The Hope of Progress. London: Methuen.

Medawar, P.B. (1981) Advice to a Young Scientist. London: Pan Books.

Medawar, P.B. (1991) Scientific fraud. In D. Pike (ed.) The Threat and the Glory: Reflections on Science and Scientists. Oxford: Oxford University Press, 64–70.

Megarry, J. (1978) Retrospect and prospect. In R. McAleese (ed.) Perspectives on Academic Gaming and Simulation 3: Training and Professional Education. London: Kogan Page, 187–207.

Mehrens, W. and Kaminski, J. (1989) Methods for improving standardised test scores: fruitful, fruitless or fraudulent? Educational Measurement: Issues and Practice, Spring, 14–22.

Melrose, M.J. (1996) Got a philosophical match? Does it matter? In O. Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer.

Menzel, H. (1978) Meaning—who needs it? In M. Brenner, P. Marsh and M. Brenner (eds) The Social Contexts of Method. London: Croom Helm, 140–71.

Mercer, N., Wegerif, R. and Dawes, L. (1999) Children's talk and the development of reasoning in the classroom. British Educational Research Journal, 25 (1), 95–111.

Merrett, F., Wilkins, J., Houghton, S. and Wheldall, K. (1988) Rules, sanctions and rewards in secondary schools. Educational Studies, 14 (2), 139–49.

Merriam, S.B. (1988) Case Study Research in Education. San Francisco: Jossey-Bass.

Merton, R.K. (1949) Social Theory and Social Structure. Illinois: Glencoe Press.

Merton, R.K. (1967) On Theoretical Sociology. New York: The Free Press.

Merton, R.K. and Kendall, P.L. (1946) The focused interview. American Journal of Sociology, 51, 541–57.

Merton, R.K., Fiske, M. and Kendall, P.L. (1956) The Focused Interview. Glencoe, IL: Free Press.

Messick, S. (1993) Validity. In R. Linn (ed.) Educational Measurement (third edition). Phoenix, AZ: American Council on Education and the Oryx Press, 13–103.

Miedama, S. and Wardekker, W.L. (1999) Emergent identity versus consistent identity: possibilities for a postmodern repoliticization of critical pedagogy. In T. Popkewitz and L. Fendler (eds) Critical Theories in Education: Changing Terrains of Knowledge and Politics. London: Routledge, 67–83.

Mies, M. (1993) Towards a methodology for feminist research. In M. Hammersley (ed.) Social Research: Philosophy, Politics and Practice. London: Sage Publications, in association with the Open University Press, 64–82.

Miles, M. and Huberman, M.A. (1984) Qualitative Data Analysis. Beverly Hills: Sage Publications.

Miles, M. and Huberman, M.A. (1994) Qualitative Data Analysis (second edition). Beverly Hills: Sage Publications.

Milgram, S. (1963) Behavioral study of obedience. Journal of Abnormal and Social Psychology, 67, 371–8.

Milgram, S. (1974) Obedience to Authority. New York: Harper & Row.

Millan, R., Gallagher, M. and Ellis, R. (1993) Surveying adolescent worries: development of the 'Things I Worry About' scale. Pastoral Care in Education, 11 (1), 43–57.


Miller, C. (1995) In-depth interviewing by telephone: some practical considerations. Evaluation and Research in Education, 9 (1), 29–38.

Miller, P.V. and Cannell, C.F. (1997) Interviewing for social research. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 361–70.

Miller, R.L. (1999) Researching Life Stories and Family Histories. London: Sage Publications.

Millman, J. and Darling-Hammond, L. (eds) (1991) The New Handbook of Teacher Evaluation. Newbury Park, CA: Corwin Press.

Millman, J. and Greene, J. (1993) The specification and development of tests of achievement and ability. In R. Linn (ed.) Educational Measurement (third edition). Phoenix, AZ: American Council on Education and the Oryx Press, 147–200.

Mishler, E.G. (1986) Research Interviewing: Context and Narrative. Cambridge, MA: Harvard University Press.

Mishler, E.G. (1990) Validation in inquiry-guided research: the role of exemplars in narrative studies. Harvard Educational Review, 60 (4), 415–42.

Mishler, E.G. (1991) Representing discourse: the rhetoric of transcription. Journal of Narrative and Life History, 1, 225–80.

Mitchell, M. and Jolley, J. (1988) Research Design Explained. New York: Holt, Rinehart & Winston.

Mitchell, R.G. (1993) Secrecy and Fieldwork. London: Sage Publications.

Mixon, D. (1972) Instead of deception. Journal for the Theory of Social Behaviour, 2, 146–77.

Mixon, D. (1974) If you won't deceive, what can you do? In N. Armistead (ed.) Reconstructing Social Psychology. Harmondsworth: Penguin, 72–85.

Mooij, T. (1998) Pupil-class determinants of aggressive and victim behaviour in pupils. British Journal of Educational Psychology, 68, 373–85.

Morgan, C. (1999) Personal communication with one of the authors. University of Bath, Department of Education.

Morgan, D.L. (1988) Focus Groups as Qualitative Research. Beverly Hills: Sage Publications.

Morris, L.L., Fitz-Gibbon, C.T. and Lindheim, E. (1987) How to Measure Performance and Use Tests. Beverly Hills: Sage Publications.

Morrison, K.R.B. (1993) Planning and Accomplishing School-centred Evaluation. Norfolk: Peter Francis Publishers.

Morrison, K.R.B. (1995a) Habermas and the school curriculum. Unpublished Ph.D. thesis, School of Education, University of Durham.

Morrison, K.R.B. (1995b) Dewey, Habermas and reflective practice. Curriculum, 16 (2), 82–94.

Morrison, K.R.B. (1996a) Developing reflective practice in higher degree students through a learning journal. Studies in Higher Education, 21 (3), 317–32.

Morrison, K.R.B. (1996b) Structuralism, post-modernity and the discourses of control. Curriculum, 17 (3), 164–77.

Morrison, K.R.B. (1996c) Why present school inspections are unethical. Forum, 38 (3), 79–80.

Morrison, K.R.B. (1997) Researching the need for multicultural perspectives in counsellor training: a critique of Bimrose and Bayne. British Journal of Guidance and Counselling, 25 (1), 95–102.

Morrison, K.R.B. (1998) Management Theories for Educational Change. London: Paul Chapman Publishing.

Mortimore, P., Sammons, P., Stoll, L., Lewis, D. and Ecob, R. (1988) School Matters: the Junior Years. London: Open Books.

Moser, C. and Kalton, G. (1977) Survey Methods in Social Investigation. London: Heinemann.

Mouly, G.J. (1978) Educational Research: the Art and Science of Investigation. Boston: Allyn & Bacon.

Mullen, B. (1993) Meta-analysis. Thornfield Journal, 16, 36–41.

Munn, P., Johnstone, M. and Holligan, C. (1990) Pupils' perceptions of effective disciplinarians. British Educational Research Journal, 16 (2), 191–8.

Musch, J. and Bröder, A. (1999) Test anxiety versus academic skills: a comparison of two alternative models for predicting performance in a statistics exam. British Journal of Educational Psychology, 69, 105–16.

Nash, R. (1976) Classrooms Observed. London: Routledge & Kegan Paul.

National Education Association of the United States: Association for Supervision and Curriculum Development (1959) Learning about Learning from Action Research. Washington: National Education Association of the United States.

Naylor, P. (1995) Pupils' perceptions of teacher racism. Unpublished Ph.D. dissertation, Loughborough University of Technology.

Neal, S. (1995) Researching powerful people from a feminist and anti-racist perspective: a note on gender, collusion and marginality. British Educational Research Journal, 21 (4), 517–31.

Nedelsky, L. (1954) Absolute grading standards for objective tests. Educational and Psychological Measurement, 14, 3–19.

Neimeyer, G.J. (1992) Personal constructs in career counselling and development. Journal of Career Development, 18 (3), 163–73.

Nesfield-Cookson, B. (1987) William Blake: Prophet of Universal Brotherhood. London: Crucible.

Nias, J. (1991) Primary teachers talking: a reflexive account of longitudinal research. In G. Walford (ed.) Doing Educational Research. London: Routledge, 147–65.

Nisbet, J. and Watt, J. (1984) Case study. In J. Bell, T. Bush, A. Fox, J. Goodey and S. Goulding (eds) Conducting Small-scale Investigations in Educational Management. London: Harper & Row, 79–92.

Nixon, J. (ed.) (1981) A Teacher's Guide to Action Research. London: Grant McIntyre.

Noack, P. (1998) School achievement and adolescents' interactions with their fathers, mothers, and friends. European Journal of Psychology of Education, 13 (4), 503–13.

Noffke, S.E. and Zeichner, K.M. (1987) Action research and teacher thinking. Paper presented at the annual meeting of the American Educational Research Association, Washington, DC.

Norris, N. (1990) Understanding Educational Evaluation. London: Kogan Page.

Nuttall, D. (1987) The validity of assessments. European Journal of Psychology of Education, 11 (2), 109–18.

Oakley, A. (1981) Interviewing women: a contradiction in terms. In H. Roberts (ed.) Doing Feminist Research. London: Routledge & Kegan Paul, 30–61.

Oja, S.N. and Smulyan, L. (1989) Collaborative Action Research: a Developmental Approach. Lewes: Falmer.

Okagaki, L. and Frensch, P.A. (1998) Parenting and school achievement: a multiethnic perspective. American Educational Research Journal, 35 (1), 123–44.

Okakura, K. (1991) The Book of Tea. Tokyo: Kodansha International.

Oldroyd, D. (1986) The Arch of Knowledge: an Introductory Study of the History of the Philosophy and Methodology of Science. New York: Methuen.

O'Neill, B. and McMahon, H. (1990) Opening New Windows with Bubble Dialogue. Language Development and Hypermedia Research Group, Faculty of Education, University of Ulster at Coleraine.

Oppenheim, A.N. (1966) Questionnaire Design and Attitude Measurement. London: Heinemann.

Oppenheim, A.N. (1992) Questionnaire Design, Interviewing and Attitude Measurement. London: Pinter Publishers Ltd.

Osborne, J.I. (1977) College of education students' perceptions of the teaching situation. Unpublished M.Ed. dissertation, University of Liverpool.

Osgood, C.E., Suci, G.S. and Tannenbaum, P.H. (1957) The Measurement of Meaning. Urbana: University of Illinois.

Overett, S. and Donald, D. (1998) Paired reading: effects of a parental involvement programme in a disadvantaged community in South Africa. British Journal of Educational Psychology, 68, 347–56.

Page, R.N. (1997) Teaching about validity. International Journal of Qualitative Studies in Education, 10 (2), 145–55.

Palmer, M. (1976) An experimental study of the use of archive materials in the secondary school history curriculum. Unpublished Ph.D. dissertation, University of Leicester.

Palys, T.S. (1978) Simulation methods and social psychology. Journal for the Theory of Social Behaviour, 8, 341–68.

Papasolomoutos, C. and Christie, T. (1998) Using national surveys: a review of secondary analyses with special reference to schools. Educational Research, 40 (3), 295–310.

Parker, H.J. (1974) View from the Boys. Newton Abbot: David & Charles.

Parker, I. (1992) Discourse Dynamics: Critical Analysis for Social and Individual Psychology. London: Routledge.

Parlett, M. and Hamilton, D. (1976) Evaluation as illumination. In D. Tawney (ed.) Curriculum Evaluation Today: Trends and Implications. London: Macmillan, 84–101.

Parsons, E., Chalkley, B. and Jones, A. (1996) The role of Geographic Information Systems in the study of parental choice and secondary school catchments. Evaluation and Research in Education, 10 (1), 23–34.

Parsons, J.M., Graham, N. and Honess, T. (1983) A teacher's implicit model of how children learn. British Educational Research Journal, 9 (1), 91–101.

Parsons, M. and Lyons, G. (1979) An alternative approach to inquiry in educational management. Educational Management and Administration, 8 (1), 75–84.

Paterson, L. (1991) An introduction to multilevel modelling. In S.W. Raudenbush and J. Willms (eds) Schools, Classrooms and Pupils: International Studies of Schooling from a Multilevel Perspective. San Diego: Academic Press, 13–24.

Paterson, L. and Goldstein, H. (1991) New statistical methods for analysing social structures: an introduction to multilevel models. British Educational Research Journal, 17 (4), 387–93.

Patrick, J. (1973) A Glasgow Gang Observed. London: Eyre Methuen.

Patton, M.Q. (1980) Qualitative Evaluation Methods. Beverly Hills: Sage Publications.

Patton, M.Q. (1990) Qualitative Evaluation and Research Methods (second edition). London: Sage Publications.

Peak, D. and Frame, M. (1994) Chaos under Control: the Art and Science of Complexity. New York: W.H. Freeman.

Peevers, B.H. and Secord, P.F. (1973) Developmental changes in attribution of descriptive concepts to persons. Journal of Personality and Social Psychology, 27 (1), 120–8.

Percival, F. (1978) Evaluation procedures for simulation gaming exercises. In R. McAleese (ed.) Perspectives on Academic Gaming and Simulation 3: Training and Professional Education. London: Kogan Page, 163–9.

Phillips, R. (1998) Some methodological and ethical dilemmas in élite-based research. British Educational Research Journal, 24 (1), 5–20.

Pierce, C.M.B. and Molloy, G.N. (1990) Psychological and biographical differences between secondary school teachers experiencing high and low levels of burnout. British Journal of Educational Psychology, 60, 37–51.

Pilliner, A. (1973) Experiment in Educational Research. E 341. Milton Keynes: Open University Press.

Pirsig, R. (1976) Zen and the Art of Motorcycle Maintenance. London: Corgi.

Pithers, R.T. and Soden, R. (1999) Person-environment fit and teacher stress. Educational Research, 41, 51–61.

Pitman, M.A. and Maxwell, J.A. (1992) Qualitative approaches to evaluation: models and methods. In M. LeCompte, W.L. Millroy and J. Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 729–70.

Platt, J. (1981) Evidence and proof in documentary research: some specific problems of documentary research. Sociological Review, 29 (1), 31–52.

Plewis, I. (1985) Analysing Change: Measurement and Explanation Using Longitudinal Data. Chichester: John Wiley.

Plewis, I. (1991a) Using multilevel models to link educational progress with curriculum coverage. In S.W. Raudenbush and J. Willms (eds) Schools, Classrooms and Pupils: International Studies of Schooling from a Multilevel Perspective. San Diego: Academic Press, 53–65.

Plewis, I. (1991b) Pupils' progress in reading and maths during primary school: associations with ethnic group and sex. Educational Research, 33, 133–40.

Plewis, I. (1997) Statistics in Education. London: Arnold.

Plummer, K. (1983) Documents of Life: an Introduction to the Problems and Literature of a Humanistic Method. London: Allen & Unwin.

Pollard, A. (1985) The Social World of the Primary School. Eastbourne: Holt, Rinehart & Winston.

Pope, M.L. and Keen, T.R. (1981) Personal Construct Psychology and Education. London: Academic Press.

Popper, K. (1968) The Logic of Scientific Discovery (second edition). London: Hutchinson.

Poppleton, P. and Riseborough, G. (1990) Teaching in the mid-1980s: the centrality of work in secondary teachers' lives. British Educational Research Journal, 16 (2), 105–24.

Postlethwaite, K. and Haggarty, L. (1998) Towards effective and transferable learning in secondary school: the development of an approach based on mastery learning. British Educational Research Journal, 24 (3), 333–53.

Potter, J. and Wetherall, M. (1987) Discourse and Social Psychology: Beyond Attitudes and Behaviour. London: Sage Publications.

Powney, J. and Watts, M. (1987) Interviewing in Educational Research. London: Routledge & Kegan Paul.

Prais, S.J. (1983) Formal and informal teaching: a further reconsideration of Professor Bennett's statistics. Journal of the Royal Statistical Society, 146 (2), 163–9.


Prein, G., Kelle, U. and Bird, K. (1995) An overview of software. In U. Kelle (ed.) Computer-Aided Qualitative Data Analysis. London: Sage Publications.

Preisler, G.M. and Ahström, M. (1997) Sign language for hard of hearing children—a hindrance or a benefit for their development? European Journal of Psychology of Education, 12 (4), 465–77.

Pring, R. (1984) The problems of confidentiality. In M. Skilbeck (ed.) Evaluating the Curriculum in the Eighties. Sevenoaks: Hodder & Stoughton, 38–44.

Prosser, M. and Trigwell, K. (1997) Relations between perceptions of the teaching environment and approaches to teaching. British Journal of Educational Psychology, 67, 25–35.

Quantz, R.A. (1992) On critical ethnography (with some postmodern considerations). In M. LeCompte, W.L. Millroy and J. Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 447–506.

Raffe, D., Bundell, I. and Bibby, J. (1989) Ethics and tactics: issues arising from an educational survey. In R.G. Burgess (ed.) The Ethics of Educational Research. Lewes: Falmer, 13–30.

Ramsden, C. and Reason, D. (1997) Conversation - discourse analysis in library and information services. Education for Information, 15 (4), 283–95.

Rapoport, R.N. (1970) Three dilemmas in action research. Human Relations, 23 (6), 499–513.

Rasmussen, D.M. (1990) Reading Habermas. Oxford: Basil Blackwell Ltd.

Ravenette, A.T. (1977) Personal construct theory: an approach to the psychological investigation of children and young people. In D. Bannister (ed.) New Perspectives in Personal Construct Theory. London: Academic Press, 251–80.

Reid, S. (1987) Working with Statistics: an Introduction to Quantitative Methods for Social Scientists. Cambridge: Polity Press.

Rex, J. (ed.) (1974) Approaches to Sociology: an Introduction to Major Trends in British Sociology. London: Routledge & Kegan Paul.

Reynolds, D. (1995) The effective school: an inaugural lecture. Evaluation and Research in Education, 9 (2), 57–73.

Reynolds, P.D. (1979) Ethical Dilemmas and Social Science Research. San Francisco: Jossey-Bass.

Ribbens, J. and Edwards, R. (1997) Feminist Dilemmas in Qualitative Research: Public Knowledge and Private Lives. London: Sage Publications.

Rice, J.M. (1897) The futility of the spelling grind. Cited in G. de Landsheere (1997) History of educational research. In J.P. Keeves (ed.) Educational Research, Methodology, and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 8–16.

Ridgway, J. (1998) The Modeling of Systems and Macro-Systemic Change: Lessons for Evaluation from Epidemiology and Ecology. Research Monograph 8. University of Wisconsin-Madison: National Institute for Science Education.

Riecken, H.W. and Boruch, R.F. (1974) Social Experimentation: a Method for Planning and Evaluating Social Intervention. New York: Academic Press Inc.

Rigby, K. (1999) Peer victimisation at school and the health of secondary school students. British Journal of Educational Psychology, 69, 95–104.

Riley, M.W. (1963) Sociological Research 1: a Case Approach. New York: Harcourt, Brace & World Inc.

Riseborough, G.F. (1992) The Cream Team: an ethnography of BTEC National Diploma (Catering and Hotel Management) students in a tertiary college. British Journal of Sociology and Education, 13 (2), 215–45.

Roberts, H. (1981) Doing Feminist Research. London: Routledge.

Robinson, B. (1982) Tutoring by Telephone: a Handbook. Milton Keynes: Open University Press.

Robinson, P. and Smithers, A. (1999) Should the sexes be separated for secondary education—comparisons of single-sex and co-educational schools. Research Papers in Education, 14 (1), 23–49.

Robson, J. and Collier, K. (1991) Designing 'Sugar 'n' Spice': an anti-sexist simulation. Simulation/Games for Learning, 21 (3), 213–19.

Roderick, R. (1986) Habermas and the Foundations of Critical Theory. Basingstoke: Macmillan.

Rogers, C.R. (1942) Counselling and Psychotherapy. Boston: Houghton Mifflin.

Rogers, C.R. (1945) The non-directive method as a technique for social research. American Journal of Sociology, 50, 279–83.

Rogers, C.R. (1969) Freedom to Learn. Columbus, OH: Merrill Pub. Co.

Rogers, C.R. and Stevens, B. (1967) Person to Person: The Problem of Being Human. London: Souvenir Press.

Rogers, V.M. and Atwood, R.K. (1974) Can we put ourselves in their place? Yearbook of the National Council for Social Studies, 44, 80–111.


Roman, L.G. and Apple, M. (1990) Is naturalism a move away from positivism? Materialist and feminist approaches to subjectivity in ethnographic research. In E. Eisner and A. Peshkin (eds) Qualitative Inquiry in Education: the Continuing Debate. New York: Teachers College Press, 38–73.

Rose, D. and Sullivan, O. (1993) Introducing Data Analysis for Social Scientists. Buckingham: Open University Press.

Rosenthal, R. (1991) Meta-analysis Procedures for Social Research. Beverly Hills: Sage Publications.

Rosier, M.J. (1997) Survey research methods. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 154–62.

Ross, K.N. and Rust, K. (1997) Sampling in survey research. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 427–38.

Ross, K.N. and Wilson, M. (1997) Sampling error in survey research. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 663–70.

Rossi, P.H. and Freeman, H.E. (1993) Evaluation: a Systematic Approach. London: Sage Publications.

Roszak, T. (1970) The Making of a Counter Culture. London: Faber & Faber.

Roszak, T. (1972) Where the Wasteland Ends. London: Faber & Faber.

Rozeboom, W.W. (1997) Good science is abductive, not hypothetico-deductive. In L.L. Harlow, S.A. Muliak and J.H. Steiger (eds) What if There Were No Significance Tests? Mahwah, NJ: Erlbaum, 335–92.

Ruddock, J. (1981) Evaluation: a Consideration of Principles and Methods. Manchester Monographs 10. Manchester: University of Manchester.

Ryle, A. (1975) Frames and Cages: the Repertory Grid Approach to Human Understanding. Brighton: Sussex University Press.

Ryle, G. (1949) The Concept of Mind. Harmondsworth: Penguin.

Sacks, H. (1992) Lectures on Conversation (ed. G. Jefferson). Oxford: Basil Blackwell.

Sainsbury, M., Whetton, C., Mason, K. and Schagen, I. (1998) Fallback in attainment on transfer at age 11: evidence from the summer literacy schools evaluation. Educational Research, 40 (1), 73–81.

Salmon, P. (1969) Differential conforming of the developmental process. British Journal of Social and Clinical Psychology, 8, 22–31.

Sanday, A. (1993) The relationship between educational research and evaluation and the role of the Local Education Authority. In R.G. Burgess (ed.) Educational Research and Evaluation for Policy and Practice. London: Falmer, 32–43.

Schagen, I. and Sainsbury, M. (1996) Multilevel analysis of the key stage 1 national curriculum data in 1995. Oxford Review of Education, 22 (3), 265–72.

Schatzman, L. and Strauss, A.L. (1973) Field Research: Strategies for a Natural Sociology. Englewood Cliffs, NJ: Prentice-Hall.

Scheurich, J.J. (1995) A postmodernist critique of research interviewing. International Journal of Qualitative Studies in Education, 8 (3), 239–52.

Scheurich, J.J. (1996) The masks of validity: a deconstructive investigation. International Journal of Qualitative Studies in Education, 9 (1), 49–60.

Schofield, J.W. (1993) Increasing the generalizability of qualitative research. In M. Hammersley (ed.) Social Research: Philosophy, Politics and Practice. London: Sage Publications in association with the Open University Press, 200–25.

Schofield, W. (1996) Survey sampling. In R. Sapsford and V. Jupp (eds) Data Collection and Analysis. London: Sage Publications and the Open University Press, 25–55.

Schön, D. (1983) The Reflective Practitioner: How Professionals Think in Action. London: Temple Smith.

Schön, D. (1987) Educating the Reflective Practitioner. San Francisco: Jossey-Bass.

Schonbach, P. (1990) Account Episodes: the Management or Escalation of Conflict. Cambridge: Cambridge University Press.

Schutz, A. (1962) Collected Papers. The Hague: Nijhoff.

Scriven, M. and Roth, J. (1978) Needs assessment: concept and practice. New Directions for Program Evaluation, 1, 1–11.

Searle, J. (1969) Speech Acts. London: Cambridge University Press.

Sears, R., Maccoby, E. and Levin, H. (1957) Patterns of Child Rearing. New York: Harper & Row.

Secord, P.F. and Peevers, B.H. (1974) The development and attribution of person concepts. In T. Mischel (ed.) On Understanding Persons. Oxford: Basil Blackwell.


Seidel, J. and Kelle, U (1995) Different functions ofcoding in the analysis of textual data. In U.Kelle(1995) (ed.) Computer-Aided Qualitative DataAnalysis. London: Sage Publications.

Seifert, T.L. (1997) Academic goals and emotions:results of a structural equation model and a clus-ter analysis. British Journal of Educational Psy-chology, 67, 323–38.

Selleck, R.W.J. (1991) The Manchester StatisticalSociety and the foundation of social science re-search. In D.S.Anderson and B.J.Biddle (eds)Knowledge for Policy: Improving Educationthrough Research. London: Falmer, 291–304.

Sellitz, C., Wrightsman, L.S. and Cook, S.W. (1976)Research Methods in Social Relations. New York:Holt, Rinehart & Winston.

Semin, G.R. and Manstead, A.S.R. (1983) The Ac-countability of Conduct: a Social PsychologicalAnalysis. London: Academic Press.

Serafini, A. (ed.) (1989) Ethics and Social Concern.New York: Paragon House.

Severiens, S. and ten Dam, G. (1998) A multilevelmeta-analysis of gender differences in learningorientations. British Journal of Educational Psy-chology, 68, 595–618.

Shapiro, B.L. (1990) A collaborative approach to helpnovice science teachers reflect on changes in theirconstruction of the role of the science teacher. Al-berta Journal of Educational Research, 36 (3),203–22.

Sharp, R. and Green, A. (1975) Education and So-cial Control: a Study in Progressive Primary Edu-cation. London: Routledge & Kegan Paul.

Sharpe, P. (1985) Behaviour modification in the sec-ondary school: a survey of students’ attitudes torewards and praise. Behavioural Approaches withChildren, 9, 109–12.

Shavelson, R.J. and Berliner, D.C. (1991) Erosion ofthe education research infrastructure: a reply toFinn. In D.S.Anderson and B.J.Biddle (eds) Knowl-edge for Policy: Improving Education throughResearch. London: Falmer, 79–84.

Shaw, E.L. (1992) The influence of methods instruc-tion on the beliefs of preservice elementary andsecondary science teachers: preliminary compara-tive analyses. School Science and Mathematics, 92,14–22.

Shaw, M.L.G. (ed.) (1981) Recent Advances in PersonalConstruct Technology. London: Academic Press.

Sherrard, P. (1987) The Eclipse of Man and Nature.London: Inner Traditions, Lindisfarne Press.

Shipman, M.D. (1972) The Limitations of SocialResearch. London: Longman.

Shipman, M.D. (1974) Inside a Curriculum Project. London: Methuen & Co.

Sideris, G. (1998) Direct classroom observation. Research in Education, 59, 19–28.

Siegel, H. (1987) Relativism Refuted. Netherlands: D. Reidel Publishing Ltd.

Siegel, S. (1956) Nonparametric Statistics for the Behavioral Sciences. New York: McGraw-Hill.

Sikes, P. and Troyna, B. (1991) True stories: a case study in the use of life histories in teacher education. Educational Review, 43 (1), 3–16.

Sikes, P., Measor, L. and Woods, P. (1985) Teacher Careers. Lewes: Falmer Press.

Silverman, D. (1985) Qualitative Methodology and Sociology: Describing the Social World. Brookfield, VT: Gower.

Silverman, D. (1993) Interpreting Qualitative Data. London: Sage Publications.

Simon, H.A. (1996) The Sciences of the Artificial (third edition). Cambridge, MA: The MIT Press.

Simon, J.L. (1978) Basic Research Methods in Social Science. New York: Random House.

Simons, H. (1982) Conversation piece: the practice of interviewing in case study research. In R. McCormick (ed.) Calling Education to Account. London: Heinemann, 239–46.

Simons, H. (1996) The paradox of case study. Cambridge Journal of Education, 26 (2), 225–40.

Slater, P. (1977) The Measurement of Interpersonal Space, Vol. 2. Chichester: Wiley.

Slavin, R.E. (1984a) Meta-analysis in education: how has it been used? Educational Researcher, American Educational Research Association, 13 (8), 6–15.

Slavin, R.E. (1984b) A rejoinder to Carlberg et al. Educational Researcher, American Educational Research Association, 13 (8), 24–7.

Sluckin, A. (1981) Growing up in the Playground: The Social Development of Children. London: Routledge & Kegan Paul.

Smith, H.W. (1991) Strategies of Social Research (third edition). Orlando, FL: Holt, Rinehart & Winston.

Smith, L.M. (1987) Kensington Revisited. Lewes: Falmer Press.

Smith, M.L. and Glass, G.V. (1977) Meta-analysis of psychotherapy outcome studies. American Psychologist, 32, 752–60.

Smith, M.L. and Glass, G.V. (1987) Research and Evaluation in Education and the Social Sciences. Englewood Cliffs, NJ: Prentice-Hall.

Smith, R.W. (1975) Strategies of Social Research: the Methodological Imagination. London: Prentice-Hall.

Smith, S. and Leach, C. (1972) A hierarchical measure of cognitive complexity. British Journal of Psychology, 63 (4), 561–8.

Smithson, M. (1988) Possibility theory, fuzzy logic, and psychological explanation. In B. Zetenyi (ed.) Fuzzy Sets in Psychology. North Holland: Elsevier Science Publishers.

Smyth, J. (1989) Developing and sustaining critical reflection in teacher education. Journal of Teacher Education, 40 (2), 2–9.

Soble, A. (1978) Deception in social science research: is informed consent possible? Hastings Center Report, 8, 40–6.

Social and Community Planning Research (1972) Questionnaire Design Manual No. 5. London: 16 Duncan Terrace, N1 8BZ.

Social Science Research Council (1971) Research in Economic and Social History. London: Heinemann.

Social Sciences and Humanities Research Council of Canada (1981) Ethical Guidelines for the Institutional Review Committee for Research with Human Subjects.

Solomon, R.L. (1949) An extension of control group design. Psychological Bulletin, 46, 137–50.

Somekh, B. (1995) The contribution of action research to development in social endeavours: a position paper on action research methodology. British Educational Research Journal, 21 (3), 339–55.

Southgate, V., Arnold, H. and Johnson, S. (1981) Extending Beginning Reading. London: Heinemann Educational Books for the Schools Council.

Spector, P.E. (1993) Research designs. In M.L. Lewis-Beck (ed.) Experimental Design and Methods. International Handbook of Quantitative Applications in the Social Sciences, Vol. 3. London: Sage Publications, 1–74.

Spencer, J.R. and Flin, R. (1990) The Evidence of Children. London: Blackstone.

Spindler, G. (ed.) (1982) Doing the Ethnography of Schooling. New York: Holt, Rinehart & Winston.

Spindler, G. and Spindler, L. (1992) Cultural process and ethnography: an anthropological perspective. In M. LeCompte, W.L. Millroy and J. Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 53–92.

Spradley, J.P. (1979) The Ethnographic Interview. New York: Holt, Rinehart & Winston.

Spradley, J.P. (1980) Participant Observation. New York: Holt, Rinehart & Winston.

Stables, A. (1990) Differences between pupils from mixed and single-sex schools in their enjoyment of school subjects and in their attitude to Science in school. Educational Review, 42 (3), 221–30.

Stake, R.E. (1978) The case study method in social inquiry. Educational Researcher, 7 (2), 5–8.

Stake, R.E. (1994) Case studies. In N.K. Denzin and Y.S. Lincoln (eds) Handbook of Qualitative Research. London: Sage Publications, 236–47.

Steiner, G. (1978) Heidegger. Glasgow: Fontana/Collins.

Stenhouse, L. (1975) An Introduction to Curriculum Research and Development. London: Heinemann.

Stenhouse, L. (1979) What is action research? (mimeo). Norwich: Classroom Action Research Network.

Stenhouse, L. (1985) Case study methods. In T. Husen and T.N. Postlethwaite (eds) International Encyclopaedia of Education (first edition). Oxford: Pergamon, 640–6.

Stewart, I. (1990) Does God Play Dice? Harmondsworth: Penguin.

Stillar, G.F. (1998) Analysing Everyday Texts: Discourses, Rhetoric, and Social Perspectives. London: Sage Publications.

Strand, S. (1999) Ethnic group, sex and economic disadvantage: associations with pupils' educational progress from Baseline to the end of Key Stage 1. British Educational Research Journal, 25 (2), 179–202.

Strauss, A.M. (1987) Qualitative Analysis for Social Scientists. Cambridge: Cambridge University Press.

Strike, K.A. (1990) The ethics of educational evaluation. In J. Millman and L. Darling-Hammond (eds) A New Handbook of Teacher Evaluation. Beverly Hills: Corwin Press, Sage Publications, 356–73.

Stringfield, S. (1997) Underlying the chaos: factors explaining exemplary US elementary schools and the case for high-reliability organisations. In T. Townsend (ed.) Restructuring and Quality: Issues for Tomorrow's Schools. London: Routledge, 143–60.

Stronach, I. and Morris, B. (1994) Polemical notes on educational evaluation in an age of 'policy hysteria'. Evaluation and Research in Education, 8 (1 & 2), 5–19.

Stubbs, M. and Delamont, S. (eds) (1976) Explorations in Classroom Observation. Chichester: John Wiley.

Stufflebeam, D.L., McCormick, C.H., Brinkerhoff, R.O. and Nelson, C.O. (1985) Conducting Educational Needs Assessment. Boston, MA: Kluwer Nijhoff.

Sturman, A. (1997) Case study methods. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 61–6.

Sturman, A. (1999) Case study methods. In J.P. Keeves and G. Lakomski (eds) Issues in Educational Research. Oxford: Elsevier Science Ltd., 103–12.

Suarez, T.M. (1994) Needs assessment. In T. Husen and T.N. Postlethwaite (eds) The International Encyclopedia of Education, Volume 7 (second edition). Oxford: Pergamon, 4,056–60.

Sudman, S. and Bradburn, N.M. (1982) Asking Questions: a Practical Guide to Questionnaire Design. San Francisco, CA: Jossey-Bass Inc.

Sutherland, G. (1969) The study of the history of education. History, 54 (180).

Sykes, W. and Hoinville, G. (1985) Telephone Interviewing on a Survey of Social Attitudes. London: Social and Community Planning Research.

Task Group on Assessment and Testing (1988) National Curriculum: Testing and Assessment: a Report. London: HMSO.

Tatar, M. (1998) Teachers as significant others: gender differences in secondary school pupils' perceptions. British Journal of Educational Psychology, 68, 217–27.

Taylor, J.L. and Walford, R. (1972) Simulation in the Classroom. Harmondsworth: Penguin Books.

Terry, A.A. (1998) Teachers as targets of bullying by their pupils: a study to investigate incidence. British Journal of Educational Psychology, 68, 255–68.

Tesch, R. (1990) Qualitative Research: Analysis Types and Software Tools. London: Falmer.

Thissen, D. (1990) Reliability and measurement precision. In H. Wainer (ed.) Computerized Adaptive Testing: a Primer. Hillsdale, NJ: Lawrence Erlbaum Associates, 161–86.

Thody, A. (1997) Lies, damned lies—and storytelling. Educational Management and Administration, 25 (3), 325–38.

Thomas, J.B. (1992) Birmingham University and teacher training: day training college to department of education. History of Education, 21 (3), 307–21.

Thomas, L.F. (1978) A personal construct approach to learning in education, training and therapy. In F. Fransella (ed.) Personal Construct Psychology. London: Academic Press, 45–58.

Thomas, L.F. and Harri-Augstein, E.S. (1992) Self-organized Learning. London: Routledge.

Thomas, P. (1991) Research models: insiders, gadflies, limestone. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 225–33.

Thomas, S., Sammons, P., Mortimore, P. and Smees, R. (1997) Differential secondary school effectiveness: comparing the performance of different pupil groups. British Educational Research Journal, 23 (4), 351–69.

Thomas, W.I. (1923) The Unadjusted Girl. Boston: Little, Brown.

Thomas, W.I. (1928) The Child in America. New York: Knopf.

Thomas, W.I. and Znaniecki, F. (1918) The Polish Peasant in Europe and America. Chicago: University of Chicago Press.

Thompson, B. (1994) Guidelines for authors. Educational and Psychological Measurement, 54, 837–47.

Thompson, B. (1996) AERA editorial policies regarding statistical significance testing: three suggested reforms. Educational Researcher, 25 (2), 26–30.

Thompson, B. (1998) In praise of brilliance: where that praise really belongs. American Psychologist, 53, 799–800.

Thompson, B. and Snyder, P.A. (1997) Statistical significance testing practices in the Journal of Experimental Education. Journal of Experimental Education, 66, 75–83.

Thurstone, L.L. and Chave, E.J. (1929) The Measurement of Attitudes. Chicago: University of Chicago Press.

Tones, K. (1997) Beyond the randomized controlled trial: a case for 'judicial review'. Health Education Research, 12 (2), i–iv.

Torres, C.A. (1992) Participatory action research and popular education in Latin America. International Journal of Qualitative Studies in Education, 5 (1), 51–62.

Travers, R.M.W. (1969) An Introduction to Educational Research. London: Collier-Macmillan.

Tripp, D.H. (1985) Case study generalisation: an agenda for action. British Educational Research Journal, 11 (1), 33–43.

Tripp, D. (1994) Teachers' lives, critical incidents and professional practice. International Journal of Qualitative Studies in Education, 7 (1), 65–72.

Troyna, B. and Hatcher, R. (1992) Racism in Children's Lives: a Study in Mainly-White Primary Schools. London: Routledge.

Tuckman, B.W. (1972) Conducting Educational Research. New York: Harcourt Brace Jovanovich.

Tyler, R. (1949) Basic Principles of Curriculum and Instruction. Chicago: University of Chicago Press.

Tymms, P.B. (1996) Theories, models and simulations: school effectiveness at an impasse. In J. Gray, D. Reynolds, C.T. Fitz-Gibbon and D. Jesson (eds) Merging Traditions: the Future of Research on School Effectiveness and School Improvement. London: Cassell, 121–35.

Tymms, P.B. (1997) The Security of Inspection of Initial Teacher Training. http://www.leeds.ac.uk/educol/index.html.

UNESCO (1996) Learning: the Treasure Within. Paris: UNESCO.

United States Department of Health, Education and Welfare (1971) The Institutional Guide to D.H.E.W. Policy on Protecting Human Subjects. DHEW Publication (NIH), December 2, 1971, 72–102.

University of Manchester Regional Computing Centre (UMRCC) (1981) GAP: Grid Analysis Package. Manchester: University of Manchester Regional Computing Centre.

Usher, P. (1996) Feminist approaches to research. In D. Scott and R. Usher (eds) Understanding Educational Research. London: Routledge, 120–42.

Usher, R. and Scott, D. (1996) Afterword: the politics of educational research. In D. Scott and R. Usher (eds) Understanding Educational Research. London: Routledge, 175–80.

Valadines, N. (1999) Formal reasoning performance of higher secondary school students: theoretical and educational implications. European Journal of Psychology of Education, 14 (1), 109–17.

Van Etten, S., Pressley, M., Freebern, G. and Echevarria, M. (1998) An interview study of college freshmen's beliefs about their academic motivation. European Journal of Psychology of Education, 13 (1), 105–30.

van Ments, M. (1978) Role playing: playing a part or a mirror to meaning? Sagset Journal, 8 (3), 83–92.

van Ments, M. (1983) The Effective Use of Role-Play: a Handbook for Teachers and Trainers. London: Croom Helm.

Vasta, R. (1979) Studying Children: an Introduction to Research Methods. San Francisco: W.H. Freeman.

Verma, G.K. and Beard, R.M. (1981) What is Educational Research? Aldershot: Gower.

Verma, G.K. and Mallick, K. (1999) Researching Education: Perspectives and Techniques. London: Falmer.

Vermunt, J.D. (1998) The regulation of constructive learning processes. British Journal of Educational Psychology, 68, 149–71.

Von Eye, A. (1990) Statistical Methods in Longitudinal Research. New York: Academic Press Inc.

Vulliamy, G. (1990) The potential of qualitative educational research in developing countries. In G. Vulliamy, K. Lewin and D. Stephens (1990) Doing Educational Research in Developing Countries: Qualitative Strategies. London: Falmer, 7–25.

Vulliamy, G., Lewin, K. and Stephens, D. (1990) Doing Educational Research in Developing Countries: Qualitative Strategies. London: Falmer.

Wainer, H. (ed.) (1990) Computerized Adaptive Testing: a Primer. New Jersey: Lawrence Erlbaum Associates.

Wainer, H. and Mislevy, R.J. (1990) Item response theory, item calibration and proficiency estimation. In H. Wainer (ed.) Computerized Adaptive Testing: a Primer. New Jersey: Lawrence Erlbaum Associates, 65–102.

Waldrop, M.M. (1992) Complexity. Harmondsworth: Penguin.

Walford, G. (ed.) (1994) Researching the Powerful in Education. London: UCL Press.

Walker, J.C. and Evers, C.W. (1988) The epistemological unity of educational research. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook. Oxford: Pergamon Press, 28–36.

Walker, R. (1980) Making sense and losing meaning: problems of selection in doing case study. In H. Simons (ed.) Towards a Science of the Singular. University of East Anglia: Centre for Applied Research in Education, 222–35.

Walkerdine, V. (1988) The Mastery of Reason: Cognitive Development and the Production of Rationality. London: Routledge.

Wardekker, W.L. and Miedama, S. (1997) Critical pedagogy: an evaluation and a direction for reformulation. Curriculum Inquiry, 27 (1), 45–61.

Warnock, M. (1970) Existentialism. London: Oxford University Press.

Watts, M. and Ebbutt, D. (1987) More than the sum of the parts: research methods in group interviewing. British Educational Research Journal, 13 (1), 25–34.

Webb, G. (1996) Becoming critical of action research for development. In O. Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer, 137–61.

Weiskopf, R. and Laske, S. (1996) Emancipatory action research: a critical alternative to personnel development or a new way of patronising people? In O. Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer, 121–36.

Weiss, C. (1991a) The many meanings of research utilization. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 173–82.

Weiss, C. (1991b) Knowledge creep and decision accretion. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 183–92.

Wheldall, K. and Panagopoulou-Stamatelatou, A. (1991) The effects of pupil self-recording of on-task behaviour in primary school children. British Educational Research Journal, 17 (2), 113–27.

Whitehead, J. (1985) An analysis of an individual's educational development: the basis for personally oriented action research. In M. Shipman (ed.) Educational Research: Principles, Policies and Practices. Lewes: Falmer, 97–108.

Whiteley, P. (1983) The analysis of contingency tables. In D. McKay, N. Schofield and P. Whiteley (eds) (1983) Data Analysis and the Social Sciences. London: Frances Pinter, 72–119.

Whyte, J. (1986) Girls into Science and Technology: the Story of a Project. London: Routledge & Kegan Paul.

Whyte, W.F. (1982) Interviewing in field research. In R. Burgess (ed.) Field Research: a Sourcebook and Field Manual. London: Allen & Unwin, 111–22.

Wickens, P. (1987) The Road to Nissan: Flexibility, Quality, Teamwork. Basingstoke: Macmillan.

Wilcox, R.R. (1997) Simulation as a research technique. In J.P. Keeves (ed.) Educational Research, Methodology and Measurement: an International Handbook (second edition). Oxford: Elsevier Science Ltd., 150–4.

Wild, P., Scivier, J.E. and Richardson, S.J. (1992) Evaluating information technology-supported local management of schools: the user acceptability audit. Educational Management and Administration, 20 (1), 40–8.

Wiles, J. and Bondi, J.C. (1984) Curriculum Development: a Guide to Practice (second edition). Columbus, OH: Charles E. Merrill Publishing.

Wiliam, D. (1996) Standards in examinations: a matter of trust. The Curriculum Journal, 7 (3), 293–306.

Willis, P.E. (1977) Learning to Labour. Farnborough: Saxon House.

Willms, J.D. (1992) Pride or prejudice? Opportunity structure and the effects of Catholic schools in Scotland. In A. Yogev (ed.) International Perspectives on Education and Society: a Research and Policy Annual (Vol. 2). Greenwich, CT: JAI Press.

Wilson, M. (1996) Asking questions. In R. Sapsford and V. Jupp (eds) (1996) Data Collection and Analysis. London: Sage Publications and the Open University Press, 94–120.

Wilson, N. and McLean, S. (1994) Questionnaire Design: a Practical Introduction. Newtownabbey, Co. Antrim: University of Ulster Press.

Windisch, V. (1990) Speech and Reasoning in Everyday Life. Cambridge: Cambridge University Press.

Wineburg, S.S. (1991) The self-fulfilment of the self-fulfilling prophecy. In D.S. Anderson and B.J. Biddle (eds) Knowledge for Policy: Improving Education through Research. London: Falmer, 276–90.

Winter, R. (1982) Dilemma analysis: a contribution to methodology for action research. Cambridge Journal of Education, 12 (3), 161–74.

Winter, R. (1996) Some principles and procedures for the conduct of action research. In O. Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer.

Witkin, B.R. (1984) Assessing Needs in Educational and Social Programs. San Francisco: Jossey-Bass.

Wittgenstein, L. (1974) Tractatus Logico-Philosophicus (trans. D. Pears and B. McGuinness). London: Routledge & Kegan Paul.

Wolcott, H.F. (1973) The Man in the Principal's Office. New York: Holt, Rinehart & Winston.

Wolcott, H.F. (1992) Posturing in qualitative research. In M. LeCompte, W.L. Millroy and J. Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 3–52.

Wolf, R.M. (1994) The validity and reliability of outcome measures. In A.C. Tuijnman and T.N. Postlethwaite (eds) Monitoring the Standards of Education. Oxford: Pergamon, 121–32.

Wood, P. (1995) Meta-analysis. In G.M. Breakwell, S. Hammond and C. Fife-Shaw (eds) Research Methods in Psychology. London: Sage Publications, 396–9.

Woods, P. (1979) The Divided School. London: Routledge & Kegan Paul.

Woods, P. (1983) Sociology and the School. London: Routledge & Kegan Paul.

Woods, P. (1986) Inside Schools: Ethnography in Educational Research. London: Routledge & Kegan Paul.

Woods, P. (1989) Working for Teacher Development. Dereham, Norfolk: Peter Francis Publishers.

Woods, P. (1992) Symbolic interactionism: theory and method. In M. LeCompte, W.L. Millroy and J. Preissle (eds) The Handbook of Qualitative Research in Education. London: Academic Press Ltd., 337–404.

Woods, P. (1993) Managing marginality: teacher development through grounded life history. British Educational Research Journal, 19 (5), 447–88.

Woods, P. and Hammersley, M. (1993) Gender and Ethnicity in Schools: Ethnographic Accounts. London: Routledge.

Worrall, L. (ed.) (1990) Geographic Information Systems: Developments and Applications. London: Belhaven Press.

Wragg, E.C. (1994) An Introduction to Classroom Observation. London: Routledge.

Yin, R.K. (1984) Case Study Research: Design and Methods. Beverly Hills: Sage Publications.

Yorke, D.M. (1978) Repertory grids in educational research: some methodological considerations. British Educational Research Journal, 4 (2), 63–74.

Yorke, D.M. (1985) Indexes of stability in repertory grids: a small-scale comparison. British Educational Research Journal, 11 (3), 221–5.

Young, J. (1971) The Drugtakers. London: Paladin.

Youngman, M.B. (1984) Designing questionnaires. In J. Bell, T. Bush, A. Fox, J. Goodey and S. Goulding (eds) Conducting Small-scale Investigations in Educational Management. London: Harper & Row, 156–76.

Zechmeister, E.B. and Shaughnessy, J.J. (1992) A Practical Introduction to Research Methods in Psychology. New York: McGraw-Hill.

Ziegahn, L. and Hinchman, K.A. (1999) Liberation or reproduction: explaining meaning in college tutors' adult literacy tutoring. International Journal of Qualitative Studies in Education, 12 (1), 85–101.

Zimbardo, P.G. (1984) On the ethics of intervention in human psychological research with specific reference to the 'Stanford Prison Experiment'. In J. Murphy, M. John and H. Brown (eds) Dialogues and Debates in Social Psychology. London: Lawrence Erlbaum Associates in association with the Open University Press.

Znaniecki, F. (1934) The Method of Sociology. New York: Farrar & Rinehart.

Zuber-Skerritt, O. (1996a) Introduction. In O. Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer, 3–9.

Zuber-Skerritt, O. (1996b) Emancipatory action research for organisational change and management development. In O. Zuber-Skerritt (ed.) New Directions in Action Research. London: Falmer, 83–105.

Zuzovsky, R. and Aitkin, M. (1991) Curriculum change and science achievement in Israel elementary schools. In S.W. Raudenbush and J. Willms (eds) Schools, Classrooms and Pupils: International Studies of Schooling from a Multilevel Perspective. San Diego: Academic Press, 25–36.

Index

absolutism 58–9
academic freedom 38–44, 69
access 53–7, 67, 98–9, 140, 144, 314
account gathering 295, 300–2
accounts 80, 145, 237, 293–304, 403–4
  bubble dialogue in 403 (note 4)
  characteristics of 294
  examples of 300–2
  problems with 302
  procedures in 294–7
  see also discourse analysis; ethogenic method; network analysis; speech acts; stories
accuracy 93–6, 117, 128, 264–5
achievement tests, see tests, kinds of
action research 73, 79, 80, 226–41, 402
  as critical praxis 30–2, 228, 231–4, 235
  emancipatory 30–2, 228, 231–4, 235, 240, 241
  ethics in 63, 67–8
  nature of 67, 226–30
  planning of 234–9
  reflexivity in 239
  stages of 234–9
  see also collaborative action research; critical action research; Delphi techniques; Nominal Group Technique
advocacy 52–3, 140–1
agency 19, 27, 31
analysis of variance 81, 318
analytic induction 80, 150–1
anonymity 57, 61–2, 142, 186, 246, 251, 256, 259, 269, 292, 335
anthropological methods, see ethnographic methods
anti-positivist movement 6, 9, 17–20, 22, 27
applied research 38
aptitude tests, see tests, kinds of
AQUAD 156
assessment 80, 131, 334, see also tests
association, concept of 191–7, see also correlational research
ATLAS/ti 156
audiences of research 38–44, 57, 61–3, 66–7, 87, 89
audio-visual recording 237, 279, 281–2, 295–6, 313, 347–8
authenticity in research 104–5, 108, 120, 131, 162, 239, 255, 294

Becker, H. 22, 25
behaviour 5, 9, 22, 27, 137, 191
behaviourism 19, 29, 32–4, 309
beneficence 64, 69, 71, 246, 279, 292, 315, 335, see also ethics
Bennett, S.N. 351–3, 369, 405 (notes 2 and 3)
Bernstein, B. 27, 32, 34
betrayal 63, see also ethics
bias
  in ethogenic research 302
  in experimental research 126–8, 224
  in historical research 163–4, 166–8
  in interviewing 120–6, 293
  in life history research 65, 132–3, 161–8
  in questionnaires 128–9
  in sampling, see also reliability and validity; sampling, representativeness of
biographies 24, 26, 80, 165–8
Blumer, H. 21, 25
Boids 386–8
bubble dialogue 258, 403 (note 4)

case studies 15, 79, 80, 138, 152–3, 167–8, 181–90, 228, 247
  defined 181–5
  examples of 185–7, 400 (ch. 9 note 4)
  observation in 187–9
  planning 189–90
  types of 183–7
catalytic validity, see validity, kinds of
causal research 10, 14, 16, 19, 30, 150, 178, 179, 197, 201, 205–10, 211–25, 385
causal-comparative research 93, 205–6, 209, 210
'cause-to-effect' studies 80, 181, 205–10
Central Limit Theorem 96–7, 385
chaos theory 33, 383, 385–9
children 52–3, 66, 124–5, 258, 287–8, 300
chi-square 80, 81, 251, 318, 364–8
cluster analysis 171, 351–3, 360–4, 405 (note 5)
cluster sampling, see sampling, kinds of
Cochrane Collaboration 394, 402 (note 9)
Code-A-Text 156, 283, 300
codes of practice 69–72, 397 (note 1)
coding 21, 77, 116, 121, 148–9, 152, 155–6, 167, 173, 223, 248, 260, 265–6, 282–6, 295–6, 299–300
co-efficients of correlation, see correlation co-efficients
cognitive mapping 80, 297
cohort studies 174–80, 399–400
collaborative action research 37, 227–30, 233, 237, 240–1
collaborative research 37, 60, 123, 229–30, 237, 239–41
complexity theory 33, 383, 385–9
computerized adaptive testing 335–6
computers, see AQUAD; ATLAS/ti; Boids; Code-A-Text; computerized adaptive testing; CROSSTABS; Ethnograph; ethnographic methods, computer usage in; fuzzy logic; Geographical Information Systems; HyperQuad2; HyperRESEARCH; Hypersoft; Internet; meta-analysis; multiple iteration; NUD.IST; QUALPRO; Results for Research™; simulations; SphinxSurvey; SPSS; Textbase Alpha
Comte, A. 8
conceptions of social reality 5–9, 23–7
concepts 13–15, 110, 148, 299, 309, 403 (ch. 16 note 1)
concurrent validity, see validity, kinds of
confidence levels 94–5, 107, 207, see also correlation, co-efficients of
confidentiality 54, 56–7, 62–3, 68, 142, 190, 246, 259, 279, 292, 335, see also ethics
confirmability of research 12, 108, 129
consequential validity, see validity, kinds of
consistency, see reliability
constant comparison 80, 150–2, 283
constructs, see concepts
constructs, personal, see personal constructs
construct validity, see validity, kinds of
content analysis 77, 164–5, 284–5
content validity, see validity, kinds of
context 21, 26, 30, 120–6, 130, 131, 137–40, 148, 171–2, 181–2, 189, 223, 286, 300, 310–14
contingency tables 364–8
continuous variables 191–3
control 8–16, 27–31, 207–10, 211, 240
control effect 176, 178–9
control groups 81, 116, 208–10, 211–25
convenience sampling, see sampling, kinds of
convergent validity, see validity, kinds of
Cooley, J. 141
correlation 16, 81, 116, 117, 171, 223, 357–9
  co-efficients 117–19, 131, 193–7, 198–9, 201–2, 350, 369, 405 (note 5)
  explained 192–7
  see also curvilinearity; effect size; Pearson product moment correlation; Spearman rank-order correlation; statistical significance; Type I errors; Type II errors
correlational research 93, 191–204, 205–6
cost/benefits ratio 49–50, 63
countertransference 121
covert research 49, 51–2, 55, 63–6, 142, 186, 310, 314–15, see also ethics
credibility 108, 128, 129, 152
criterion group studies 205–6
criterion-referenced tests, see tests
criterion-related validity, see validity, kinds of
critical action research 229, 231–4, 235
critical case sampling, see sampling, kinds of
critical educational research 27–38, 181, 231–4
critical ethnographies 153–5
critical events 103
critical incidents 310
critical pedagogy 34
critical theory 3, 11, 18, 27–38, 153–5, 181, 230–4, 235, 241
cross-sectional studies 113, 169, 174–80, 399–400
cross-site analysis 80, 109, 270
CROSSTABS 251, 364–8
cultural validity, see validity, kinds of
curriculum research 32–5
curvilinearity 197–8

data analysis 77–80, 82, 86, 116, 140, 147–52, 155–6, 163–8, 190, 265–6, 281–6, 295–6, 349–70
data collection 86, 89, 107, 112–16, 160–3, 165, 187–90, 223, see Part 4
data reporting, see reporting
Data Protection Act 70, 292
data types 75, 85
data validation 87, 89
deception 63–6, 372–4, see also ethics
deduction 4, 5, 10, 12
definition of the situation 26, 111, 122, 138, 156, 311
degrees of freedom 364–8
Delphi techniques 238–9
democracy 27–30, 111, 231
dependability of data 108, 120, 129, 152, 164
dependent variable, see variables
descriptive research 30
descriptive validity, see validity, kinds of
determinism 6, 10
developmental research 169, 174–80
Dewey, J. 159
diachronic analysis 178, 294
diagnostic tests, see tests
dichotomous questions, see questions, types of
dimensional sampling, see sampling, kinds of
discourse analysis 80, 298–300, 403 (note 3), see also speech acts
discrete variables 192–3
dissemination 39, 54, 56, 61
distribution of sample means 96–8
documentary research 147, 161–3, 237
domain analysis 109, 148–9, 319
domain referencing 130, 131, 318–19, see also tests

ecological validity, see validity, kinds of
'effect-to-cause' studies 206–9
effect size 197, 221–5, 394
emancipation 29–30, 111, 123, 153–5, 231–4, 240–1
emancipatory interest 29–34, 231–2
emic approaches 139, 152, 313
empiricism 4, 8, 10, 11, 12, 15
empowerment 30–8, 111, 153–5, 231–4, 298
episodes 21, 293–4, 300, see also social episodes
epistemology 3, 6, 7, 17, 19, 44, 139, 230, 239
equality 27–30, 32–8, 123, 153–5, 231
errors, see Type I errors; Type II errors
ethics 49–72, 75, 89, 141–3, 212, 245–6, 279, 292, 310, 314–15, 334–5, 371–4, 398
  in action research 63, 67–8
  in ethnographies 142–3
  in experiments 63, 212
  in interviews 61, 279, 292
  in observation 66, 310, 314–15
  in questionnaires 61, 245–6
  in role play 58–9, 65, 371–4
  in testing 334–5
  see also covert research
Ethnograph 156, 255, 283, 300
ethnographers as interviewers 140, 146–7, 268, 270
ethnographic methods 23–7, 37, 54, 73, 78, 93, 106–8, 137–57, 272, 293, 310–14, 400 (ch. 9 note 3)
  characteristics of 106, 137–40
  computer usage in 155–6
  critical 153–5
  data analysis in 147–52
  data collection in 145–7
  ethics in 62–7, 141–3
  planning 140–53
  problems in 156–7
  reliability and validity in 106–10, 119–20, 145, 152, 154
  sampling in 93, 143–5
  see also naturalistic research
ethnomethodology 23–7
ethogenic method 21, 293–4, 302–3
etic approaches 139, 313
evaluative research 3, 38–43, 67–9
evaluative validity, see validity, kinds of
event history analysis 177–80, 401 (note 5)
event sampling 308
evidence-based education 394–5, see also Cochrane Collaboration; randomized controlled trials
existential phenomenology 17, 29
experience 3–6, 10, 23
experience-sampling method 295
experiments and quasi-experiments 11, 16, 73, 78, 81, 93, 117, 118, 185, 189, 199, 205, 211–20, 317, 369, 394, 400–2
  and ex post facto research 205, 209, 211, 400–1
  and meta-analysis 220–5
  characteristics of 211–14
  conduct of 116, 211–20, 334
  designs of 212–20
  examples of 217–20, 401–2
  reliability and validity in 126–8
  single-case research 219–20
  see also Cochrane Collaboration; meta-analysis; pretest; post-test; randomized controlled trials; sampling
ex post facto research 177, 205–10, 211, 304, 400–1
  characteristics of 206–7
  kinds of 205–10
  planning 209–10
external validity, see validity, kinds of

face validity, see validity, kinds of
factor analysis 171, 300, 318, 349, 354–60, 405 (note 5)
fairness 108, 111
false consciousness 28, 156
feminist research 34–8, 111, 123
fidelity 106–7
field notes 77, 141, 145–7, 190, 311–14, see also accounts; case studies; computer usage; documents; ethnographic methods; observation
fitness for purpose 11, 37, 47, 73, 80, 91, 104, 132, 135, 146, 222–3, 243, 248, 270, 307, 315, 320, 327
Flanders Interaction Analysis Categories 21
focus groups 260, 288–9
focused interview, see interviews
formal interview, see interviews
Frankfurt School 18, 28, 34, 231
freedom 29–38
freedom, degrees of, see degrees of freedom
frequencies 80, 81, 272, 283, 364–8
fusion of horizons 29
fuzzy logic 389

Gadamer, H.G. 29
Garfinkel, H. 22, 24, 25
Gaussian curve 77, 317–18
gender 34–8, 123, 153
generalizability 10, 11, 16, 19, 22, 27, 30, 40, 99, 101, 102, 107, 109, 120, 127, 138, 143, 144, 157, 171–2, 182–3, 189, 220–5, 240, 278
Geographical Information Systems 389–91, 406 (ch. 22 note 1)
Girls into Science and Technology 38, 66
Goffman, E. 25, 143
grid technique, see personal constructs
grounded theory 23, 80, 138, 150–2
group interviewing 287–8, see also focus groups
Guttman's lambda 191–2
Guttman scales 252

Habermas, J. 18, 28–36, 154, 227, 231–2, 293, 298
halo effect 116, 156, 309, 315, 345
Harré, R. 294
Hawthorne effect 116, 127, 130, 156, 315, 388, 401 (ch. 12 note 3), see also reactivity
hermeneutic interest 29–34, 231
hermeneutics 28, 104, 138, 155
historical research 158–68, 399 (note 8), see also biographies; documents; life histories; records; stories
  content analysis in 164–5
  data collection in 160–3
  evaluation 162–8
  quantitative methods in 164–5
human behaviour 6, 9, 18, 191
humanistic psychology 20
Husserl, E. 23–4
HyperQuad2 156
HyperRESEARCH 156, 265
Hypersoft 156
hypothesis 3–5, 12, 14–16, 39, 55, 137, 139, 149, 151, 158–9, 173, 183, 207, 215, 235, 268, 284, 297, 305

ideology 27–38, 153–5, 298
ideology critique 27–38, 111, 155, 231–4
idiographic approach to human behaviour 7, 20, 137, 138
illuminative approaches, see ethnographic methods
independent variable, see variables
indexicality 25
induction 4, 5, 10, 104, 138, 151
inductive-deductive reasoning 4, 5, 10
inference 101, 148, 149, 150–2, 179, 309, 311, 315, 318
informal interview, see interview, types of
informants 144–5, 166, 189, see also ethnographic methods
information technology, see computers
informed consent 50–3, 59, 71, 142, 245, 279, 292, 314–15, 335, see also ethics
instantaneous sampling 306–10
instrumentation 74, 75, 86, 89, 90, 116, 126–7, 130, 135, 145–7, see also Part 4
intentions 20–2, 104, 108, 293, 300
interactionism 23–7, see also ethnographic methods; naturalistic research
interests 27–38, 55, 231–2, see also knowledge-constitutive interests
internal validity, see validity, kinds of
Internet 383–5, 389, 390, 393, 394
interpretation 19, 22, 25, 29
interpretive paradigm 3, 5, 19–29, 131, 157, 181, 183, 293, 303
interpretive validity, see validity, kinds of
inter-rater reliability, see reliability, kinds of
interval recording 308, 309
interval scale 77, 81, 95, 116, 191–2, 317–18, 333
interviewees, characteristics of 122–5, 279–81
interviewers, characteristics of 122–5, 133, 279–81
interviews 267–92, 403
  analysis of 147–52, 281–6
  and children 124–5, 287–8
  compared to questionnaires 128–9
  conduct of 122–5, 133, 146, 279–81
  ethical issues 66–7, 279, 292
  ethnographic 140, 146–7
  focused 273, 290
  group 287–8
  kinds of 121, 267–73
  merits of 269–70
  non-directive 273, 289–90
  open-ended 121, 146, 189, 268, 270, 275
  piloting of 121
  planning of 273–5
  power in 123–5
  purposes of 268–70, 274
  qualitative 270–2
  quantitative 270–2
  questions in 120–6
  reliability and validity in 120–6, 146
  response modes in 276–9
  semi-structured 146–7, 171, 189, 268, 270–1, 278
  structured 121, 146, 171, 189, 268, 270–3, 277, 286
  telephone 123–4, 290–2
  transcription of 125–6
  unstructured 268, 270–1, 273, 276–7
  see also coding; focus groups; questions, types of; reporting; sensitive research; transcription; triangulation
item analysis, see tests
item difficulty, see tests
item discriminability, see tests
item response theory 130, 325

jury validity, see validity, kinds of

Kelly, G. 337–41, 346
Kendall's tau 191–2
Kerlinger, F.N. 5, 11, 14–16, 205, 206, 211, 213–14, 218, 268, 273, 275, 283, 349
knowledge-constitutive interests 29–34, 231–2
Kruskal's gamma 191–2
Kruskal-Wallis analysis of variance 81
Kuder-Richardson formula for reliability 131

laddering, see personal constructs
leading questions 122, 248, 261
legitimacy 27–38, 232
levels of data 77, 80–1, 94–5, 116, 190, 191–3, 317–18
Lewin, K. 226, 230, 231, 234–5
life histories 59, 65, 80, 95, 132–3, 146, 165–8, 260
Likert scales 252–5
linkage analysis 349–51, 359–60
logical positivism 8–10, 396 (note 3)
longitudinal studies 54, 113, 169, 174–80, 369, 400

Mann-Whitney U-test 81
Marxism 12, 28
mastery learning 318–19, 322, 333, see also tests, criterion-referenced
matrix planning 80–90, 322–3
Mead, G.H. 25, 27
mean 81, 222–5, 318
meanings 23–5, 27, 29, 104, 137–8, 294, 298, 302, 310
measurement effect 176, 178–9
Medawar, P.B. 14–15
median scores 80, 81, 333
meta-analysis 37, 220–5, 394, 402 (note 9)
Milgram, S. and role-playing 52, 59, 373–4
Mixon, D. and role-playing 373–4
modal scores 80, 81, 318
models 12–13, 398 (note on ch. 3)
moderation 130
Moreno, J.L. 370
multidimensional measurement 349–70, 405–6
  cluster analysis 351–3
  elementary linkage analysis 349–51, 359–60
  examples of 351–3, 354–60, 360–4
  factor analysis 354–60
  multidimensional scaling 360–4
multidimensional tables 364–6
multilevel modelling 369, 394, 405 (note 6)
multi-phase sampling, see sampling, kinds of
multiple choice questions, see questions, types of
multiple correlation 192, 198–9
multiple iteration 385–8
multiple realities 22–4, 120, 137, 157

narrative 165–8, 190, 221
naturalistic observation 145–6, 305–6, 310–14
naturalistic research 3, 19–27, 106, 108, 109, 119, 129, 137–57, 310–14
  characteristics of 137–40
  data analysis in 147–52, 155–6
  data collection in 145–7
  ethics in 141–3
  interviews in 146–7
  planning of 140–53
  problems in 156–7
  sampling in 143–4
  see also ethnographic methods
nature of science 8–19
needs analysis 390–3
negative correlation 193–7
negotiation 26, 27, 55, 68, 138
network analysis 293, 297–8
Nominal Group Technique 237–9
nominal scale 77, 80, 94, 173, 191–3, 251, 318
nominalism 6, 7
nomothetic approach to human behaviour 7, 20, 39, 138, 345
non-directive interview, see interviews
non-maleficence 57, 58, 59, 64, 71, 142, 246, 279, 335, see also ethics
non-parametric tests 77, 317–18
non-participant observation 138, 185–7, 303, 305–10
non-probability sampling, see sampling, kinds of
non-response 128, 262–5
normative paradigm 22, 28
norm-referenced tests, see tests
NUD.IST 77, 156, 283
null hypothesis 39, 195–7, 364–5, 367–8

objectivity 6–7, 17, 18, 32, 35, 36, 139, 233, 257, 303
observation 10, 16, 118, 138, 139, 181, 185–8, 303, 305–16, 400 (note 2), 404
  critical incidents 310
  ethics of 66, 310, 324–5
  kinds of 305–6
  naturalistic 146, 310–14
  recording 306–14
  reliability and validity in 118, 129, 307, 309, 310
  semi-structured 305, 310–14
  structured 129, 171, 305, 306–10, 312, 313
  unstructured 305, 306, 310–14
  see also accounts; case studies; covert research; documents; field notes; participant observation; non-participant observation
Occam, William of xvi, 10
ontology 3, 5, 6, 7, 17
open-ended interviews, see interviews
operationalization 12, 31, 73, 75–6, 110, 127, 129, 237, 246–7, 324
opportunity sampling, see sampling, kinds of
ordinal scale 77, 80, 116, 173, 191–2, 318
ownership of research 75, 142, 157, 292

panel studies 113, 174–80
paradigms in educational research 3–45, 135, 137, 396 (notes 4 and 5)
parametric tests 77, 81, 317–18
parsimony, principle of 10, 12
partial correlation 192, 198–9
participant observation 27, 30, 52, 66, 114, 129, 138, 146, 154, 185–90, 305, 310–14, see also case studies; covert research; ethnographic methods; field notes; naturalistic research
participatory research 26, 35–8, 56, 58–60, 228–31, 237, 239, 241, see also collaborative action research; collaborative research
patterning 80, 283
Pearson's product moment correlation co-efficient 81, 118, 191–3, 198, 318
periodicity 100
permission 50–6, see also ethics
personal constructs 80, 337–48, 404
  characteristics of 337–8
  elicited and provided 338–9, 346
  examples of 339–40, 344, 346–8, 404
  grid analysis of 342–4
  laddering 341–2, 344
  procedures for 339–41
  pyramid constructions in 341
  repertory grids 339–48
phenomenology 23–7, 131, 138, 181, see also definition of the situation; ethnographic methods; interactionism; interpretive paradigm; naturalistic research
Piaget, J. 114, 169
piloting 56, 121, 129, 171, 173, 216, 248, 260–1, 263, 284, 307, 319, 325
planning educational research 73–91
Plowden Committee 169
policy-making and research 38–44
politics 31–44, 88, 111, 153–5, 229, 241, 252, 315
population, see sampling
positive correlation 193–7
positivism 3, 6, 8–19, 22, 27–9, 35, 36, 106, 108, 109, 119, 131, 139
post hoc, ergo propter hoc fallacy 207, 401
postal questionnaire 128–9, 171, 262–5
post-test 81, 117, 118, 213–17, 317, 337
power 27, 28–38, 52, 55, 70, 75, 83, 89, 121, 122–5, 142, 153–5, 231, 233, 300, 314
practical interest 29–34, see also hermeneutic interest
prediction 11, 12, 16, 28, 29, 179, 211, 320, 385–6
prediction studies 169, 174–80, 199–200
predictive validity, see validity, kinds of
pre-test 81, 117, 118, 126, 127, 213–17, 317, 334
primary source of data 161–3, 189
principal components analysis 351–3, 355
privacy 58, 60–3, 69, 71, 142, 292, 314–15, see also ethics
probability 95, 193–202, 325, see also statistical significance
probability sampling, see sampling, kinds of
progressive focussing 147–8, 189
prospective studies 174–80
purposive sampling, see sampling, kinds of

qualitative research 7, 19–27, 37, 66–7, 77, 93, 95, 101, 105, 106–8, 117, 119–20, 137–57, 159–68, 180–90, 255, 270–2, 283, 297, 305, 310–14, 387, 392
QUALPRO 156
quantitative research 7, 18, 37, 95–6, 101, 105, 116–19, 120, 169–80, 191–225, 270–2, 283, 305, 306–10, 349–70, 387–8, 392, 394
quasi-experimental designs 212, 214–15, 217–18
questionnaires 116, 245–66, 269, 312, 402–3
  compared to interviews 128–9
  construction and design 246–8, 258–60
  data analysis in 77, 265–6
  degrees of structure 247–8
  ethical issues in 61, 245–6
  layout 255, 258–60, 261
  operationalization 246–7
  piloting 248, 260–1, 263
  planning 77, 246–8
  postal 128, 171, 262–5
  practical considerations in 54, 128–9, 248–9, 256, 258–9, 261–2
  reliability and validity in 109, 128–9, 256, 258, 260, 263–5
  sequence 257–8
  see also interviews; questions; sensitive research; surveys
questions, types of
  closed 121, 129, 247–8, 257, 260, 261, 270, 277
  dichotomous 248, 250–1, 253, 255, 257, 261, 277
  Guttman scales 253
  Likert scales 252–5
  multiple choice 171, 248, 251–2, 253, 257, 260, 261, 270, 275, 277, 324, 328, 329
  open-ended 126, 129, 247–9, 255–6, 257, 260, 270, 284, 328
  rank ordering 252, 261, 275, 277
  rating scales 248, 251–5, 257, 260, 261, 275, 277, 283
  semantic differential scales 252–5
  sensitive 256–8, 261
  sequencing 257–8, 261, 280
  Thurstone scales 252
  true/false 329–30
  wording 248–56, 258
  see also interviews; sensitive research
quota sampling, see sampling, kinds of

randomization 100, 206, 211, 213–16, 218, 401 (note 1)
randomized controlled trials 99, 216–17, 394
random sampling, see sampling, kinds of
ranking response 252, 318
rating scales 80, 81, 248, 251, 260–1, 275, 277, 283, 308, 309–10, 312, 343, 345
ratio scale 77, 81, 95, 191–2, 317, 333
reactivity 116, 127, 129, 141, 156, 246, 311, 313, see also Hawthorne effect
records 147, 160–5, 237, 313, see also documents; field notes; historical research
reflective practice 30, 227, 228–9, 231, 233, 235, 239, 241, 313
reflexivity 24, 25, 36, 85, 120, 140, 141, 153, 164, 189, 228, 239, 300, 313
regression 171, 198, 224–5, 369
regulation 69–70
relationships 6, 10, 14, 16, 33, 36, 52, 59, 64, 145, 155, 274, see also correlational research
relativism 58–9
reliability 85–9, 112–33, 363, see also triangulation
reliability, kinds of
  consistency 117
  equivalence 117, 118, 131
  internal consistency 117
  inter-rater 116, 118, 119, 121, 129, 130, 307
  parallel forms 118, 119
  split-half 118–19, 197
  stability 117–18, 119
reliability
  in experiments 126–8
  in interviews 119–20, 120–6
  in life histories 132–3, 166–8
  in network analysis 297–8
  in observation 129, 307, 309, 310
  in qualitative research 119–20, 152, 162–3, 185
  in quantitative research 117–19, 176–80
  in questionnaires 128–9, 256, 258, 260, 263–5
  in tests 117–18, 130–2, 317–20, 325, 330–1, 334–6
repertory grid, see personal constructs
replication 12, 117, 119
reporting 68, 80, 82, 87, 89, 117, 140, 152–3, 163–4, 173, 182, 240, 286–7, 332
reputational case sampling, see sampling, kinds of
research 5, 56, 73–91
  design 54, 75–7, 84, 115–16, 135, 170–4, 274, see also Part 3
  foci 74, 75, 83–4, 89, 116
  instrumentation, see Part 4
  methodology 75–7, 85, 89, 115, 135
  models 78–80, 398 (note on ch. 3)
  operationalization 73, 75–6, 84, 115, 127, 172–3, 237, 246–7, 261, 324
  ownership of 84, 89, 157, 292, see also ownership of research
  purposes 54, 75, 83–4, 89, 172, 261
  reporting of 80, 89, 117, 140, 152–3, 163–4, 173, 182, 240, 286–7, 332
  questions 74, 83–5, 89, 115, 117, 173
  sponsorship 38–44
  timing of 54, 74, 83, 89, 90, 115
respondent validation 57, 63, 87, 104, 108, 116, 120, 154, 189–90, 285, 357
responses in interview/questionnaire construction 248–60, 269
Results for Research™ 265
retrospective studies 174–80
right to know 58, 60–3, 142, 292, 314–15, see also ethics
right to privacy 58, 60–3, 69, 71, 142, 292, 314–15, see also ethics
Rogers, C.R. 20, 167, 289, 397 (note 5)
role 152, 182, see also access; ethnographic methods; naturalistic research
role-playing 58–9, 65, 370–9, 406
  ethics in 371–4
  examples of 371
  in educational settings 374–9
  procedures for 374–9
  versus deception 372–4
  see also Stanford Prison Experiment

sample, see sampling; sampling, kinds of
sample, access to 92, 98–9
sample mortality 96, 116, 127, 143, 176, 178, 180
sample size 92, 93–6, 131, 196–7, 216, 224, 247, 318
sampling 86, 89, 90, 92–104, 143–5, 171, 174, 185, 288, 291, 392
sampling distribution 97
sampling error 94–5, 96–8, 99, 222–3, 291
sampling frame 98
sampling in ethnographies 143–5
sampling, kinds of 99–104, 129
  accidental 102–3
  cluster 101
  convenience 102–3, 143, 144
  critical case 103–4, 143, 144
  dimensional 104
  event 308
  extreme case 144
  instantaneous 309–10
  matched 118
  multi-phase 102
  non-probability sample 99, 102–4, 172
  opportunity 102–3, 143
  probability sample 93, 99–102, 122, 174
  purposive 99, 103–4, 138
  quota 103
  random 93, 99, 100, 104, 118, 216
  reputational case 144
  snowball 104, 144
  stage 102
  stratified 73, 101, 103, 118, 143, 291
  systematic 100
  typical case 143, 144
  unique case 143–4
  see also case studies
sampling, representativeness of 92, 93, 94, 98, 99, 123, 127, 145, 167, 175, 179
scatter diagram 193–4, 197
Schutz, A. 23–4, 187
science, assumptions and nature of, see nature of science
‘science of persons’ 21scientific method 5, 15–19, 35, 38–9, 181, 227–8, 396

(note 7)see also nature of science

secondary sources of data 161–3, 189semantic differential scales 252–5sensitive research 57, 61, 62, 98–9, 121, 125, 246,

256–8, 261, 314–15significance, levels of, see statistical significancesignificance, statistical, see statistical significance

see also effect sizesimulations 97, 374–5, 379, 385–9single-case research 219–20skewed data 80snowball sampling, see sampling, kinds ofsocial episodes 21, 293–4, 300, 396 (note 9)social psychology 20–2social reality, see conceptions of social realitysocial sciences 14, 18, 19, 20, 24sociology of knowledge 33–4Somer’s d 191–2Spearman-Brown formula for reliability 118–19Spearman rank-order correlation 81, 118, 192, 197,

318, 350speech acts 232, 293, 298, 300, 311SphinxSurvey 255, 265–6split-half reliability, see reliability, kinds ofSPSS 77, 255, 365stability, see reliability, kinds ofstage sampling, see sampling, kinds ofstandard deviation 81, 97, 222–5, 318, 333standard error of mean 96–7standard error of proportions 97–8Stanford Prison Experiment 58–9, 371–2statistical significance 81, 116–17, 118, 120, 192–7,

202–3, 221–31, 364–5, 394see also effect size

statistics 77, 80, 116, 129, 179, 191–225, 251, 255,270, 317–18, 319–20, 344, 349–70

stories 303–4stratified sampling, see sampling, kinds ofstructured interview, see interview, kinds ofstructured observation, see observationsubjectivity 6, 7, 17, 19, 22, 23–7, 29, 35–6, 116,

121–2, 123, 137, 139, 155, 257, 269, 309surveys 15, 54, 73, 78, 117, 169–74, 189, 262, 268,

398 (note on ch. 5), 399planning 170–4sampling 73, 93, 172–4see also questionnaires; questions; sampling

syllogism 4symbolic interactionism 23–7, see also ethnographic

methodssynchronie analysis 294systematic sampling, see sampling, kinds of

systemic validity, see validity, kinds of

tape-recorder, use of, see audio-visual recordingteacher as researcher 230–1teacher evaluation 67–9technical interest 29–34, 231technicism 28–30telephone interviewing 123–4, 290–2test/revest 117, 118, 131tests 80, 81, 116, 131, 172, 317–36, 404

achievement 317, 319–29, 335aptitude 317, 319–21commercially produced 317, 319–21computerized adaptive testing in 335–6construction of 321–34criterion-referenced 131, 317–19, 321–2, 327, 332,

335diagnostic 319–22domain-referenced 318–19, 335ethics of 334–5format 328item analysis 192, 319, 322–3, 324–7, 335–6item difficulty 198, 325–7, 335–6item discriminability 192, 197, 198, 325–7, 335–6item response theory 325, 335–6item writing 328–31layout of 331non-parametric 317–18norm-referenced 317–19, 321, 327operationalization 324parametric 317–18, 320piloting of 319, 325pretest 81, 117, 118, 126, 213–17, 317, 334post-test 81, 117, 118, 213–17, 317, 334purposes of 321–2reliability and validity in 130–2, 319–21, 322, 325,

330, 331, 334, 335, 336scoring 321–4, 330, 332–4, 335timing of 331–2see also questions

Textbase Alpha 156textual analysis 37, see also content analysistheoretical validity, see validity, kinds oftheory 11, 12, 16, 22, 23, 35, 36, 137

grounded 23, 80, 138, 150–2theory generation 138, 148, 150–2, 167

thick description 22, 104, 137, 139, 152, 182, 293,311

Thurstone scales 252timing of research 54, 74, 83, 89, 90, 115transcription 125–6, 270, 281–6, 295, see also case

studies; ethnographic methods; field notes;interviews; naturalistic research

transference 121transformative research 28–38

INDEX446

trend studies 113, 169, 174–80
triadic elicitation 339, 346
triangulation 108, 112–16, 120, 151–2, 156, 185, 288, 310
  defined 112–13
  types of 37, 113–15
truth, search for 3, 56
t-test 81, 318
Tyler, R. 32–4
Type I errors 116–17, 197, 221
Type II errors 116–17, 197, 223
typical case sampling, see sampling, kinds of
typological analysis 150–2

understanding 7, 20, 21, 23–4, 27, 29, 102, see also ethnographic methods; hermeneutics; meaning; naturalistic methods
unique case sampling, see sampling, kinds of
unstructured interview, see interview, kinds of
unstructured observation, see observation
utilitarianism 68

validity 4, 85, 89, 105–17, 120–33, 232, 298
  and postal questionnaires 128–9, 262–5
  and triangulation 112–16, 120, 310
  in data analysis 116–17
  in data collection 87, 105, 116
  in data reporting 117
  in experiments 126–8
  in interviews 120–6
  in life histories 132–3, 166–8
  in network analysis 297–8
  in observation 129, 309
  in qualitative data 106–8, 109, 110, 145, 152, 154, 162–3, 185
  in quantitative data 108, 176–80
  in questionnaires 128–9, 256, 260, 263–5
  in research design 115–16
  in tests 130–2, 317–22, 334–6
validity, kinds of 105–17
  catalytic 34–8, 105, 108, 111, 123
  concurrent 105, 112, 132
  consequential 105, 132
  construct 105, 110, 132, 383
  content 105, 109–10, 131, 383
  convergent 110, 121
  criterion-related 105, 111–12, 131
  cultural 106, 111
  descriptive 106, 107
  discriminant 110
  ecological 105, 110, 119, 127, 129, 369
  evaluative 106, 107
  external 105, 109, 127–8, 129, 164
  face 105, 132
  internal 105, 107–8, 120, 126–7, 129, 164
  interpretive 106, 107
  jury 105, 132
  predictive 105, 111–12, 132
  systemic 105, 132
  theoretical 106, 107
value-added 171, 322, 369
variables 27, 97, 177, 192, 205–10, 211–25, 251, 274, 283, 354
verification 8, 12, 27, 181, 285–6, 292, 297
verstehen approaches 29, see also understanding
video, see audio-visual recording
voluntarism 51, 67
vulnerability 52–3, 64, 142, 315, see also ethics

William of Occam xvi, 10
writing research reports, see reporting

z-scores 81, 223, 333