Experimental Design and Data Analysis for Biologists


  • Gerry Quinn is in the School of Biological Sciences at Monash University, with research interests in marine and freshwater ecology, especially river floodplains and their associated wetlands.

    Michael Keough is in the Department of Zoology at the University of Melbourne, with research interests in marine ecology, environmental science and conservation biology.

    Both authors have extensive experience teaching experimental design and analysis courses and have provided advice on the design and analysis of sampling and experimental programs in ecology and environmental monitoring to a wide range of environmental consultants, university and government scientists.

    Experimental Design and Data Analysis for Biologists

    An essential textbook for any student or researcher in biology needing to design experiments, sampling programs or analyze the resulting data. The text begins with a revision of estimation and hypothesis testing methods, covering both classical and Bayesian philosophies, before advancing to the analysis of linear and generalized linear models. Topics covered include linear and logistic regression, simple and complex ANOVA models (for factorial, nested, block, split-plot and repeated measures and covariance designs), and log-linear models. Multivariate techniques, including classification and ordination, are then introduced. Special emphasis is placed on checking assumptions, exploratory data analysis and presentation of results. The main analyses are illustrated with many examples from published papers and there is an extensive reference list to both the statistical and biological literature. The book is supported by a website that provides all data sets, questions for each chapter and links to software.

  • Experimental Design and Data Analysis for Biologists

    Gerry P. Quinn, Monash University

    Michael J. Keough, University of Melbourne

  • Cambridge, New York, Melbourne, Madrid, Cape Town, Singapore, São Paulo

    Cambridge University Press, The Edinburgh Building, Cambridge, United Kingdom

    First published in print format


    © G. Quinn & M. Keough 2002

    2002

    Information on this title: www.cambridge.org/9780521811286

    This book is in copyright. Subject to statutory exception and to the provision of relevant collective licensing agreements, no reproduction of any part may take place without the written permission of Cambridge University Press.


    Cambridge University Press has no responsibility for the persistence or accuracy of URLs for external or third-party internet websites referred to in this book, and does not guarantee that any content on such websites is, or will remain, accurate or appropriate.

    Published in the United States of America by Cambridge University Press, New York

    www.cambridge.org


  • Contents

    Preface page xv

    1 Introduction 1

    1.1 Scientific method 1
    1.1.1 Pattern description 2

    1.1.2 Models 2

    1.1.3 Hypotheses and tests 3

    1.1.4 Alternatives to falsification 4

    1.1.5 Role of statistical analysis 5

    1.2 Experiments and other tests 5
    1.3 Data, observations and variables 7
    1.4 Probability 7
    1.5 Probability distributions 9

    1.5.1 Distributions for variables 10

    1.5.2 Distributions for statistics 12

    2 Estimation 14

    2.1 Samples and populations 14
    2.2 Common parameters and statistics 15

    2.2.1 Center (location) of distribution 15

    2.2.2 Spread or variability 16

    2.3 Standard errors and confidence intervals for the mean 17
    2.3.1 Normal distributions and the Central Limit Theorem 17

    2.3.2 Standard error of the sample mean 18

    2.3.3 Confidence intervals for population mean 19

    2.3.4 Interpretation of confidence intervals for population mean 20

    2.3.5 Standard errors for other statistics 20

    2.4 Methods for estimating parameters 23
    2.4.1 Maximum likelihood (ML) 23

    2.4.2 Ordinary least squares (OLS) 24

    2.4.3 ML vs OLS estimation 25

    2.5 Resampling methods for estimation 25
    2.5.1 Bootstrap 25

    2.5.2 Jackknife 26

    2.6 Bayesian inference - estimation 27
    2.6.1 Bayesian estimation 27

    2.6.2 Prior knowledge and probability 28

    2.6.3 Likelihood function 28

    2.6.4 Posterior probability 28

    2.6.5 Examples 29

    2.6.6 Other comments 29

  • 3 Hypothesis testing 32

    3.1 Statistical hypothesis testing 32
    3.1.1 Classical statistical hypothesis testing 32

    3.1.2 Associated probability and Type I error 34

    3.1.3 Hypothesis tests for a single population 35

    3.1.4 One- and two-tailed tests 37

    3.1.5 Hypotheses for two populations 37

    3.1.6 Parametric tests and their assumptions 39

    3.2 Decision errors 42
    3.2.1 Type I and II errors 42

    3.2.2 Asymmetry and scalable decision criteria 44

    3.3 Other testing methods 45
    3.3.1 Robust parametric tests 45

    3.3.2 Randomization (permutation) tests 45

    3.3.3 Rank-based non-parametric tests 46

    3.4 Multiple testing 48
    3.4.1 The problem 48

    3.4.2 Adjusting significance levels and/or P values 49

    3.5 Combining results from statistical tests 50
    3.5.1 Combining P values 50

    3.5.2 Meta-analysis 50

    3.6 Critique of statistical hypothesis testing 51
    3.6.1 Dependence on sample size and stopping rules 51

    3.6.2 Sample space - relevance of data not observed 52

    3.6.3 P values as measure of evidence 53

    3.6.4 Null hypothesis always false 53

    3.6.5 Arbitrary significance levels 53

    3.6.6 Alternatives to statistical hypothesis testing 53

    3.7 Bayesian hypothesis testing 54

    4 Graphical exploration of data 58

    4.1 Exploratory data analysis 58
    4.1.1 Exploring samples 58

    4.2 Analysis with graphs 62
    4.2.1 Assumptions of parametric linear models 62

    4.3 Transforming data 64
    4.3.1 Transformations and distributional assumptions 65

    4.3.2 Transformations and linearity 67

    4.3.3 Transformations and additivity 67

    4.4 Standardizations 67
    4.5 Outliers 68
    4.6 Censored and missing data 68

    4.6.1 Missing data 68

    4.6.2 Censored (truncated) data 69

    4.7 General issues and hints for analysis 71
    4.7.1 General issues 71


  • 5 Correlation and regression 72

    5.1 Correlation analysis 72
    5.1.1 Parametric correlation model 72

    5.1.2 Robust correlation 76

    5.1.3 Parametric and non-parametric confidence regions 76

    5.2 Linear models 77
    5.3 Linear regression analysis 78

    5.3.1 Simple (bivariate) linear regression 78

    5.3.2 Linear model for regression 80

    5.3.3 Estimating model parameters 85

    5.3.4 Analysis of variance 88

    5.3.5 Null hypotheses in regression 89

    5.3.6 Comparing regression models 90

    5.3.7 Variance explained 91

    5.3.8 Assumptions of regression analysis 92

    5.3.9 Regression diagnostics 94

    5.3.10 Diagnostic graphics 96

    5.3.11 Transformations 98

    5.3.12 Regression through the origin 98

    5.3.13 Weighted least squares 99

    5.3.14 X random (Model II regression) 100

    5.3.15 Robust regression 104

    5.4 Relationship between regression and correlation 106
    5.5 Smoothing 107

    5.5.1 Running means 107

    5.5.2 LO(W)ESS 107

    5.5.3 Splines 108

    5.5.4 Kernels 108

    5.5.5 Other issues 109

    5.6 Power of tests in correlation and regression 109
    5.7 General issues and hints for analysis 110

    5.7.1 General issues 110

    5.7.2 Hints for analysis 110

    6 Multiple and complex regression 111

    6.1 Multiple linear regression analysis 111
    6.1.1 Multiple linear regression model 114

    6.1.2 Estimating model parameters 119

    6.1.3 Analysis of variance 119

    6.1.4 Null hypotheses and model comparisons 121

    6.1.5 Variance explained 122

    6.1.6 Which predictors are important? 122

    6.1.7 Assumptions of multiple regression 124

    6.1.8 Regression diagnostics 125

    6.1.9 Diagnostic graphics 125

    6.1.10 Transformations 127

    6.1.11 Collinearity 127


  • 6.1.12 Interactions in multiple regression 130

    6.1.13 Polynomial regression 133

    6.1.14 Indicator (dummy) variables 135

    6.1.15 Finding the best regression model 137

    6.1.16 Hierarchical partitioning 141

    6.1.17 Other issues in multiple linear regression 142

    6.2 Regression trees 143
    6.3 Path analysis and structural equation modeling 145
    6.4 Nonlinear models 150
    6.5 Smoothing and response surfaces 152
    6.6 General issues and hints for analysis 153

    6.6.1 General issues 153

    6.6.2 Hints for analysis 154

    7 Design and power analysis 155

    7.1 Sampling 155
    7.1.1 Sampling designs 155

    7.1.2 Size of sample 157

    7.2 Experimental design 157
    7.2.1 Replication 158

    7.2.2 Controls 160

    7.2.3 Randomization 161

    7.2.4 Independence 163

    7.2.5 Reducing unexplained variance 164

    7.3 Power analysis 164
    7.3.1 Using power to plan experiments (a priori power analysis) 166

    7.3.2 Post hoc power calculation 168

    7.3.3 The effect size 168

    7.3.4 Using power analyses 170

    7.4 General issues and hints for analysis 171
    7.4.1 General issues 171

    7.4.2 Hints for analysis 172

    8 Comparing groups or treatments - analysis of variance 173

    8.1 Single factor (one way) designs 173
    8.1.1 Types of predictor variables (factors) 176

    8.1.2 Linear model for single factor analyses 178

    8.1.3 Analysis of variance 184

    8.1.4 Null hypotheses 186

    8.1.5 Comparing ANOVA models 187

    8.1.6 Unequal sample sizes (unbalanced designs) 187

    8.2 Factor effects 188
    8.2.1 Random effects: variance components 188

    8.2.2 Fixed effects 190

    8.3 Assumptions 191
    8.3.1 Normality 192

    8.3.2 Variance homogeneity 193

    8.3.3 Independence 193


  • 8.4 ANOVA diagnostics 194
    8.5 Robust ANOVA 195

    8.5.1 Tests with heterogeneous variances 195

    8.5.2 Rank-based (non-parametric) tests 195

    8.5.3 Randomization tests 196

    8.6 Specific comparisons of means 196
    8.6.1 Planned comparisons or contrasts 197

    8.6.2 Unplanned pairwise comparisons 199

    8.6.3 Specific contrasts versus unplanned pairwise comparisons 201

    8.7 Tests for trends 202
    8.8 Testing equality of group variances 203
    8.9 Power of single factor ANOVA 204

    8.10 General issues and hints for analysis 206
    8.10.1 General issues 206

    8.10.2 Hints for analysis 206

    9 Multifactor analysis of variance 208

    9.1 Nested (hierarchical) designs 208
    9.1.1 Linear models for nested analyses 210

    9.1.2 Analysis of variance 214

    9.1.3 Null hypotheses 215

    9.1.4 Unequal sample sizes (unbalanced designs) 216

    9.1.5 Comparing ANOVA models 216

    9.1.6 Factor effects in nested models 216

    9.1.7 Assumptions for nested models 218

    9.1.8 Specific comparisons for nested designs 219

    9.1.9 More complex designs 219

    9.1.10 Design and power 219

    9.2 Factorial designs 221
    9.2.1 Linear models for factorial designs 225

    9.2.2 Analysis of variance 230

    9.2.3 Null hypotheses 232

    9.2.4 What are main effects and interactions really measuring? 237

    9.2.5 Comparing ANOVA models 241

    9.2.6 Unbalanced designs 241

    9.2.7 Factor effects 247

    9.2.8 Assumptions 249

    9.2.9 Robust factorial ANOVAs 250

    9.2.10 Specific comparisons on main effects 250

    9.2.11 Interpreting interactions 251

    9.2.12 More complex designs 255

    9.2.13 Power and design in factorial ANOVA 259

    9.3 Pooling in multifactor designs 260
    9.4 Relationship between factorial and nested designs 261
    9.5 General issues and hints for analysis 261

    9.5.1 General issues 261

    9.5.2 Hints for analysis 261


  • 10 Randomized blocks and simple repeated measures: unreplicated two factor designs 262

    10.1 Unreplicated two factor experimental designs 262
    10.1.1 Randomized complete block (RCB) designs 262

    10.1.2 Repeated measures (RM) designs 265

    10.2 Analyzing RCB and RM designs 268
    10.2.1 Linear models for RCB and RM analyses 268

    10.2.2 Analysis of variance 272

    10.2.3 Null hypotheses 273

    10.2.4 Comparing ANOVA models 274

    10.3 Interactions in RCB and RM models 274
    10.3.1 Importance of treatment by block interactions 274

    10.3.2 Checks for interaction in unreplicated designs 277

    10.4 Assumptions 280
    10.4.1 Normality, independence of errors 280

    10.4.2 Variances and covariances - sphericity 280

    10.4.3 Recommended strategy 284

    10.5 Robust RCB and RM analyses 284
    10.6 Specific comparisons 285
    10.7 Efficiency of blocking (to block or not to block?) 285
    10.8 Time as a blocking factor 287
    10.9 Analysis of unbalanced RCB designs 287

    10.10 Power of RCB or simple RM designs 289
    10.11 More complex block designs 290

    10.11.1 Factorial randomized block designs 290

    10.11.2 Incomplete block designs 292

    10.11.3 Latin square designs 292

    10.11.4 Crossover designs 296

    10.12 Generalized randomized block designs 298
    10.13 RCB and RM designs and statistical software 298
    10.14 General issues and hints for analysis 299

    10.14.1 General issues 299

    10.14.2 Hints for analysis 300

    11 Split-plot and repeated measures designs: partly nested analyses of variance 301

    11.1 Partly nested designs 301
    11.1.1 Split-plot designs 301

    11.1.2 Repeated measures designs 305

    11.1.3 Reasons for using these designs 309

    11.2 Analyzing partly nested designs 309
    11.2.1 Linear models for partly nested analyses 310

    11.2.2 Analysis of variance 313

    11.2.3 Null hypotheses 315

    11.2.4 Comparing ANOVA models 318

    11.3 Assumptions 318
    11.3.1 Between plots/subjects 318

    11.3.2 Within plots/subjects and multisample sphericity 318


  • 11.4 Robust partly nested analyses 320
    11.5 Specific comparisons 320

    11.5.1 Main effects 320

    11.5.2 Interactions 321

    11.5.3 Profile (i.e. trend) analysis 321

    11.6 Analysis of unbalanced partly nested designs 322
    11.7 Power for partly nested designs 323
    11.8 More complex designs 323

    11.8.1 Additional between-plots/subjects factors 324

    11.8.2 Additional within-plots/subjects factors 329

    11.8.3 Additional between-plots/subjects and within-plots/subjects factors 332

    11.8.4 General comments about complex designs 335

    11.9 Partly nested designs and statistical software 335
    11.10 General issues and hints for analysis 337

    11.10.1 General issues 337

    11.10.2 Hints for individual analyses 337

    12 Analyses of covariance 339

    12.1 Single factor analysis of covariance (ANCOVA) 339
    12.1.1 Linear models for analysis of covariance 342

    12.1.2 Analysis of (co)variance 347

    12.1.3 Null hypotheses 347

    12.1.4 Comparing ANCOVA models 348

    12.2 Assumptions of ANCOVA 348
    12.2.1 Linearity 348

    12.2.2 Covariate values similar across groups 349

    12.2.3 Fixed covariate (X ) 349

    12.3 Homogeneous slopes 349
    12.3.1 Testing for homogeneous within-group regression slopes 349

    12.3.2 Dealing with heterogeneous within-group regression slopes 350

    12.3.3 Comparing regression lines 352

    12.4 Robust ANCOVA 352
    12.5 Unequal sample sizes (unbalanced designs) 353
    12.6 Specific comparisons of adjusted means 353

    12.6.1 Planned contrasts 353

    12.6.2 Unplanned comparisons 353

    12.7 More complex designs 353
    12.7.1 Designs with two or more covariates 353

    12.7.2 Factorial designs 354

    12.7.3 Nested designs with one covariate 355

    12.7.4 Partly nested models with one covariate 356

    12.8 General issues and hints for analysis 357
    12.8.1 General issues 357

    12.8.2 Hints for analysis 358


  • 13 Generalized linear models and logistic regression 359

    13.1 Generalized linear models 359
    13.2 Logistic regression 360

    13.2.1 Simple logistic regression 360

    13.2.2 Multiple logistic regression 365

    13.2.3 Categorical predictors 368

    13.2.4 Assumptions of logistic regression 368

    13.2.5 Goodness-of-fit and residuals 368

    13.2.6 Model diagnostics 370

    13.2.7 Model selection 370

    13.2.8 Software for logistic regression 371

    13.3 Poisson regression 371
    13.4 Generalized additive models 372
    13.5 Models for correlated data 375

    13.5.1 Multi-level (random effects) models 376

    13.5.2 Generalized estimating equations 377

    13.6 General issues and hints for analysis 378
    13.6.1 General issues 378

    13.6.2 Hints for analysis 379

    14 Analyzing frequencies 380

    14.1 Single variable goodness-of-fit tests 381
    14.2 Contingency tables 381

    14.2.1 Two way tables 381

    14.2.2 Three way tables 388

    14.3 Log-linear models 393
    14.3.1 Two way tables 394

    14.3.2 Log-linear models for three way tables 395

    14.3.3 More complex tables 400

    14.4 General issues and hints for analysis 400
    14.4.1 General issues 400

    14.4.2 Hints for analysis 400

    15 Introduction to multivariate analyses 401

    15.1 Multivariate data 401
    15.2 Distributions and associations 402
    15.3 Linear combinations, eigenvectors and eigenvalues 405

    15.3.1 Linear combinations of variables 405

    15.3.2 Eigenvalues 405

    15.3.3 Eigenvectors 406

    15.3.4 Derivation of components 409

    15.4 Multivariate distance and dissimilarity measures 409
    15.4.1 Dissimilarity measures for continuous variables 412

    15.4.2 Dissimilarity measures for dichotomous (binary) variables 413

    15.4.3 General dissimilarity measures for mixed variables 413

    15.4.4 Comparison of dissimilarity measures 414

    15.5 Comparing distance and/or dissimilarity matrices 414


  • 15.6 Data standardization 415
    15.7 Standardization, association and dissimilarity 417
    15.8 Multivariate graphics 417
    15.9 Screening multivariate data sets 418

    15.9.1 Multivariate outliers 419

    15.9.2 Missing observations 419

    15.10 General issues and hints for analysis 423
    15.10.1 General issues 423

    15.10.2 Hints for analysis 424

    16 Multivariate analysis of variance and discriminant analysis 425

    16.1 Multivariate analysis of variance (MANOVA) 425
    16.1.1 Single factor MANOVA 426

    16.1.2 Specific comparisons 432

    16.1.3 Relative importance of each response variable 432

    16.1.4 Assumptions of MANOVA 433

    16.1.5 Robust MANOVA 434

    16.1.6 More complex designs 434

    16.2 Discriminant function analysis 435
    16.2.1 Description and hypothesis testing 437

    16.2.2 Classification and prediction 439

    16.2.3 Assumptions of discriminant function analysis 441

    16.2.4 More complex designs 441

    16.3 MANOVA vs discriminant function analysis 441
    16.4 General issues and hints for analysis 441

    16.4.1 General issues 441

    16.4.2 Hints for analysis 441

    17 Principal components and correspondence analysis 443

    17.1 Principal components analysis 443
    17.1.1 Deriving components 447

    17.1.2 Which association matrix to use? 450

    17.1.3 Interpreting the components 451

    17.1.4 Rotation of components 451

    17.1.5 How many components to retain? 452

    17.1.6 Assumptions 453

    17.1.7 Robust PCA 454

    17.1.8 Graphical representations 454

    17.1.9 Other uses of components 456

    17.2 Factor analysis 458
    17.3 Correspondence analysis 459

    17.3.1 Mechanics 459

    17.3.2 Scaling and joint plots 461

    17.3.3 Reciprocal averaging 462

    17.3.4 Use of CA with ecological data 462

    17.3.5 Detrending 463

    17.4 Canonical correlation analysis 463


  • 17.5 Redundancy analysis 466
    17.6 Canonical correspondence analysis 467
    17.7 Constrained and partial ordination 468
    17.8 General issues and hints for analysis 471

    17.8.1 General issues 471

    17.8.2 Hints for analysis 471

    18 Multidimensional scaling and cluster analysis 473

    18.1 Multidimensional scaling 473
    18.1.1 Classical scaling - principal coordinates analysis (PCoA) 474

    18.1.2 Enhanced multidimensional scaling 476

    18.1.3 Dissimilarities and testing hypotheses about groups of objects 482

    18.1.4 Relating MDS to original variables 487

    18.1.5 Relating MDS to covariates 487

    18.2 Classification 488
    18.2.1 Cluster analysis 488

    18.3 Scaling (ordination) and clustering for biological data 491
    18.4 General issues and hints for analysis 493

    18.4.1 General issues 493

    18.4.2 Hints for analysis 493

    19 Presentation of results 494

    19.1 Presentation of analyses 494
    19.1.1 Linear models 494

    19.1.2 Other analyses 497

    19.2 Layout of tables 497
    19.3 Displaying summaries of the data 498

    19.3.1 Bar graph 500

    19.3.2 Line graph (category plot) 502

    19.3.3 Scatterplots 502

    19.3.4 Pie charts 503

    19.4 Error bars 504
    19.4.1 Alternative approaches 506

    19.5 Oral presentations 507
    19.5.1 Slides, computers, or overheads? 507

    19.5.2 Graphics packages 508

    19.5.3 Working with color 508

    19.5.4 Scanned images 509

    19.5.5 Information content 509

    19.6 General issues and hints 510

    References 511

    Index 527


  • Preface

    Statistical analysis is at the core of most modern biology, and many biological hypotheses, even deceptively simple ones, are matched by complex statistical models. Prior to the development of modern desktop computers, determining whether the data fit these complex models was the province of professional statisticians. Many biologists instead opted for simpler models whose structure had been simplified quite arbitrarily. Now, with immensely powerful statistical software available to most of us, these complex models can be fitted, creating a new set of demands and problems for biologists.

    We need to:

    know the pitfalls and assumptions of particular statistical models,

    be able to identify the type of model appropriate for the sampling design and kind of data that we plan to collect,

    be able to interpret the output of analyses using these models, and

    be able to design experiments and sampling programs optimally, i.e. with the best possible use of our limited time and resources.

    The analysis may be done by professional statisticians, rather than statistically trained biologists, especially in large research groups or multidisciplinary teams. In these situations, we need to be able to speak a common language:

    frame our questions in such a way as to get a sensible answer,

    be aware of biological considerations that may cause statistical problems; we can not expect a statistician to be aware of the biological idiosyncrasies of our particular study, but if he or she lacks that information, we may get misleading or incorrect advice, and

    understand the advice or analyses that we receive, and be able to translate that back into biology.

    This book aims to place biologists in a better position to do these things. It arose from our involvement in designing and analyzing our own data, but also providing advice to students and colleagues, and teaching classes in design and analysis. As part of these activities, we became aware, first of our limitations, prompting us to read more widely in the primary statistical literature, and second, and more importantly, of the complexity of the statistical models underlying much biological research. In particular, we continually encountered experimental designs that were not described comprehensively in many of our favorite texts. This book describes many of the common designs used in biological research, and we present the statistical models underlying those designs, with enough information to highlight their benefits and pitfalls.

    Our emphasis here is on dealing with biological data - how to design sampling programs that represent the best use of our resources, how to avoid mistakes that make analyzing our data difficult, and how to analyze the data when they are collected. We emphasize the problems associated with real world biological situations.

    In this book

    Our approach is to encourage readers to understand the models underlying the most common experimental designs. We describe the models that are appropriate for various kinds of biological data - continuous and categorical response variables, continuous and categorical predictor or independent variables. Our emphasis is on general linear models, and we begin with the simplest situations - single, continuous variables - describing those models in detail. We use these models as building blocks to understanding a wide range of other kinds of data - all of the common statistical analyses, rather than being distinctly different kinds of analyses, are variations on a common theme of statistical modeling - constructing a model for the data and then determining whether observed data fit this particular model. Our aim is to show how a broad understanding of the models allows us to deal with a wide range of more complex situations.

    We have illustrated this approach of fitting models primarily with parametric statistics. Most biological data are still analyzed with linear models that assume underlying normal distributions. However, we introduce readers to a range of more general approaches, and stress that, once you understand the general modeling approach for normally distributed data, you can use that information to begin modeling data with nonlinear relationships, variables that follow other statistical distributions, etc.

    Learning by example

    One of our strongest beliefs is that we understand statistical principles much better when we see how they are applied to situations in our own discipline. Examples let us make the link between statistical models and formal statistical terms (blocks, plots, etc.) or papers written in other disciplines, and the biological situations that we are dealing with. For example, how is our analysis and interpretation of an experiment repeated several times helped by reading a literature about blocks of agricultural land? How does literature developed for psychological research let us deal with measuring changes in physiological responses of plants?

    Throughout this book, we illustrate all of the statistical techniques with examples from the current biological literature. We describe why (we think) the authors chose to do an experiment in a particular way, and how to analyze the data, including assessing assumptions and interpreting statistical output. These examples appear as boxes through each chapter, and we are delighted that authors of most of these studies have made their raw data available to us. We provide those raw data files on a website (http://www.zoology.unimelb.edu.au/qkstats), allowing readers to run these analyses using their particular software package.

    The other value of published examples is that we can see how particular analyses can be described and reported. When fitting complex statistical models, it is easy to allow the biology to be submerged by a mass of statistical output. We hope that the examples, together with our own thoughts on this subject, presented in the final chapter, will help prevent this happening.

    This book is a bridge

    It is not possible to produce a book that introduces a reader to biological statistics and takes them far enough to understand complex models, at least while having a book that is small enough to transport. We therefore assume that readers are familiar with basic statistical concepts, such as would result from a one or two semester introductory course, or have read one of the excellent basic texts (e.g. Sokal & Rohlf 1995). We take the reader from these texts into more complex areas, explaining the principles, assumptions, and pitfalls, and encourage a reader to read the excellent detailed treatments (e.g., for analysis of variance, Winer et al. 1991 or Underwood 1997).

    Biological data are often messy, and many readers will find that their research questions require more complex models than we describe here. Ways of dealing with messy data or solutions to complex problems are often provided in the primary statistical literature. We try to point the way to key pieces of that statistical literature, providing the reader with the basic tools to be able to deal with that literature, or to be able to seek professional (statistical) help when things become too complex.

    We must always remember that, for biologists, statistics is a tool that we use to illuminate and clarify biological problems. Our aim is to be able to use these tools efficiently, without losing sight of the biology that is the motivation for most of us entering this field.

    Some acknowledgments

    Our biggest debt is to the range of colleagues who have read, commented upon, and corrected various versions of these chapters. Many of these colleagues have their own research groups, who they enlisted in this exercise. These altruistic and diligent souls include (alphabetically) Jacqui Brooks, Andrew Constable, Barb Downes, Peter Fairweather, Ivor Growns, Murray Logan, Ralph Mac Nally, Richard Marchant, Pete Raimondi, Wayne Robinson, Suvaluck Satumanatpan and Sabine Schreiber. Perhaps the most innocent victims were the graduate students who have been part of our research groups over the period we produced this book. We greatly appreciate their willingness to trade the chance of some illumination for reading and highlighting our obfuscations.

    We also wish to thank the various researchers whose data we used as examples throughout. Most of them willingly gave of their raw data, trusting that we would neither criticize nor find flaws in their published work (we didn't!), or were public-spirited enough to have published their raw data.


  • Chapter 1

    Introduction

    Biologists and environmental scientists today must contend with the demands of keeping up with their primary field of specialization, and at the same time ensuring that their set of professional tools is current. Those tools may include topics as diverse as molecular genetics, sediment chemistry, and small-scale hydrodynamics, but one tool that is common and central to most of us is an understanding of experimental design and data analysis, and the decisions that we make as a result of our data analysis determine our future research directions or environmental management. With the advent of powerful desktop computers, we can now do complex analyses that in previous years were available only to those with an initiation into the wonders of early mainframe statistical programs, or computer programming languages, or those with the time for laborious hand calculations. In past years, those statistical tools determined the range of sampling programs and analyses that we were willing to attempt. Now that we can do much more complex analyses, we can examine data in more sophisticated ways. This power comes at a cost because we now collect data with complex underlying statistical models, and, therefore, we need to be familiar with the potential and limitations of a much greater range of statistical approaches.

    With any field of science, there are particular approaches that are more common than others. Texts written for one field will not necessarily cover the most common needs of another field, and we felt that the needs of most common biologists and environmental scientists of our acquaintance were not covered by any one particular text.

    A fundamental step in becoming familiar with data collection and analysis is to understand the philosophical viewpoint and basic tools that underlie what we do. We begin by describing our approach to scientific method. Because our aim is to cover some complex techniques, we do not describe introductory statistical methods in much detail. That task is a separate one, and has been done very well by a wide range of authors. We therefore provide only an overview or refresher of some basic philosophical and statistical concepts. We strongly urge you to read the first few chapters of a good introductory statistics or biostatistics book (you can't do much better than Sokal & Rohlf 1995) before working through this chapter.

    1.1 Scientific method

    An appreciation of the philosophical bases for the way we do our scientific research is an important prelude to the rest of this book (see Chalmers 1999, Gower 1997, O'Hear 1989). There are many valuable discussions of scientific philosophy from a biological context and we particularly recommend Ford (2000), James & McCulloch (1985), Loehle (1987) and Underwood (1990, 1991). Maxwell & Delaney (1990) provide an overview from a behavioral sciences viewpoint and the first two chapters of Hilborn & Mangel (1997) emphasize alternatives to the Popperian approach in situations where experimental tests of hypotheses are simply not possible.

  • Early attempts to develop a philosophy of scientific logic, mainly due to Francis Bacon and John Stuart Mill, were based around the principle of induction, whereby sufficient numbers of confirmatory observations and no contradictory observations allow us to conclude that a theory or law is true (Gower 1997). The logical problems with inductive reasoning are discussed in every text on the philosophy of science, in particular that no amount of confirmatory observations can ever prove a theory. An alternative approach, and also the most commonly used scientific method in modern biological sciences literature, employs deductive reasoning, the process of deriving explanations or predictions from laws or theories. Karl Popper (1968, 1969) formalized this as the hypothetico-deductive approach, based around the principle of falsificationism, the doctrine whereby theories (or hypotheses derived from them) are disproved because proof is logically impossible. An hypothesis is falsifiable if there exists a logically possible observation that is inconsistent with it. Note that in many scientific investigations, a description of pattern and inductive reasoning, to develop models and hypotheses (Mentis 1988), is followed by a deductive process in which we critically test our hypotheses.

    Underwood (1990, 1991) outlined the steps involved in a falsificationist test. We will illustrate these steps with an example from the ecological literature, a study of bioluminescence in dinoflagellates by Abrahams & Townsend (1993).

    1.1.1 Pattern description
    The process starts with observation(s) of a pattern or departure from a pattern in nature. Underwood (1990) also called these puzzles or problems. The quantitative and robust description of patterns is, therefore, a crucial part of the scientific process and is sometimes termed an observational study (Manly 1992). While we strongly advocate experimental methods in biology, experimental tests of hypotheses derived from poorly collected and interpreted observational data will be of little use.

    In our example, Abrahams & Townsend (1993) observed that dinoflagellates bioluminesce when the water they are in is disturbed. The next step is to explain these observations.

    1.1.2 Models
    The explanation of an observed pattern is referred to as a model or theory (Ford 2000), which is a series of statements (or formulae) that explains why the observations have occurred. Model development is also what Peters (1991) referred to as the synthetic or private phase of the scientific method, where the perceived problem interacts with insight, existing theory, belief and previous observations to produce a set of competing models. This phase is clearly inductive and involves developing theories from observations (Chalmers 1999), the exploratory process of hypothesis formulation.

    James & McCulloch (1985), while emphasizing the importance of formulating models in science, distinguished different types of models. Verbal models are non-mathematical explanations of how nature works. Most biologists have some idea of how a process or system under investigation operates and this idea drives the investigation. It is often useful to formalize that idea as a conceptual verbal model, as this might identify important components of a system that need to be included in the model. Verbal models can be quantified in mathematical terms as either empiric models or theoretic models. These models usually relate a response or dependent variable to one or more predictor or independent variables. We can envisage from our biological understanding of a process that the response variable might depend on, or be affected by, the predictor variables.

    Empiric models are mathematical descriptions of relationships resulting from processes rather than the processes themselves, e.g. equations describing the relationship between metabolism (response) and body mass (predictor) or species number (response) and island area (first predictor) and island age (second predictor). Empiric models are usually statistical models (Hilborn & Mangel 1997) and are used to describe a relationship between response and predictor variables. Much of this book is based on fitting statistical models to observed data.
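    A small sketch may make the idea of fitting such an empiric model concrete. The data below are invented purely for illustration (they are not from any study cited here), and the example assumes Python with NumPy available; it estimates the slope and intercept of a simple linear model relating log metabolic rate to log body mass by ordinary least squares.

        import numpy as np

        # Hypothetical (invented) data: body mass (g) and metabolic rate (arbitrary units).
        mass = np.array([10, 25, 50, 120, 300, 750, 1500], dtype=float)
        metabolism = np.array([3.1, 6.0, 9.8, 18.5, 36.0, 70.2, 118.0])

        # Fit the empiric model log(metabolism) = intercept + slope * log(mass)
        # by ordinary least squares (numpy.polyfit with degree 1).
        slope, intercept = np.polyfit(np.log(mass), np.log(metabolism), deg=1)

        print(f"Estimated slope: {slope:.2f}")        # scaling exponent
        print(f"Estimated intercept: {intercept:.2f}")

    The fitted coefficients describe the relationship between the two variables; they say nothing by themselves about the process that produced it, which is the distinction drawn in the text between empiric and theoretic models.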

    Theoretic models, in contrast, are used to study processes, e.g. spatial variation in abundance of intertidal snails is caused by variations in settlement of larvae, or each outbreak of Mediterranean fruit fly in California is caused by a new colonization event (Hilborn & Mangel 1997). In many cases, we will have a theoretic, or scientific, model that we can re-express as a statistical model. For example, island biogeography theory suggests that the number of species on an island is related to its area. We might express this scientific model as a linear statistical relationship between species number and island area and evaluate it based on data from a range of islands of different sizes. Both empirical and theoretic models can be used for prediction, although the generality of predictions will usually be greater for theoretic models.

    The scientific model proposed to explain bioluminescence in dinoflagellates was the burglar alarm model, whereby dinoflagellates bioluminesce to attract predators of copepods, which eat the dinoflagellates. The remaining steps in the process are designed to test or evaluate a particular model.

    1.1.3 Hypotheses and tests
    We can make a prediction or predictions deduced from our model or theory; these predictions are called research (or logical) hypotheses. If a particular model is correct, we would predict specific observations under a new set of circumstances. This is what Peters (1991) termed the analytic, public or Popperian phase of the scientific method, where we use critical or formal tests to evaluate models by falsifying hypotheses. Ford (2000) distinguished three meanings of the term hypothesis. We will use it in Ford's (2000) sense of a statement that is tested by investigation, experimentally if possible, in contrast to a model or theory and also in contrast to a postulate, a new or unexplored idea.

    One of the difficulties with this stage in the process is deciding which models (and subsequent hypotheses) should be given research priority. There will often be many competing models and, with limited budgets and time, the choice of which models to evaluate is an important one. Popper originally suggested that scientists should test those hypotheses that are most easily falsified by appropriate tests. Tests of theories or models using hypotheses with high empirical content and which make improbable predictions are what Popper called severe tests, although that term has been redefined by Mayo (1996) as a test that is likely to reveal a specific error if it exists (e.g. decision errors in statistical hypothesis testing - see Chapter 3). Underwood (1990, 1991) argued that it is usually difficult to decide which hypotheses are most easily refuted and proposed that competing models are best separated when their hypotheses are the most distinctive, i.e. they predict very different results under similar conditions. There are other ways of deciding which hypothesis to test, more related to the sociology of science. Some hypotheses may be relatively trivial, or you may have a good idea what the results can be. Testing that hypothesis may be most likely to produce a statistically significant (see Chapter 3), and, unfortunately therefore, a publishable result. Alternatively, a hypothesis may be novel or require a complex mechanism that you think unlikely. That result might be more exciting to the general scientific community, and you might decide that, although the hypothesis is harder to test, you're willing to gamble on the fame, money, or personal satisfaction that would result from such a result.

    Philosophers have long recognized that proof of a theory or its derived hypothesis is logically impossible, because all observations related to the hypothesis must be made. Chalmers (1999; see also Underwood 1991) provided the clever example of the long history of observations in Europe that swans were white. Only by observing all swans everywhere could we prove that all swans are white. The fact that a single observation contrary to the hypothesis could disprove it was clearly illustrated by the discovery of black swans in Australia.

    The need for disproof dictates the next step in the process of a falsificationist test. We specify a null hypothesis that includes all possibilities except the prediction in the hypothesis. It is much simpler logically to disprove a null hypothesis. The null hypothesis in the dinoflagellate example was that bioluminescence by dinoflagellates would have no effect on, or would decrease, the mortality rate of copepods grazing on dinoflagellates. Note that this null hypothesis includes all possibilities except the one specified in the hypothesis.


  • So, the final phase in the process is the experimental test of the hypothesis. If the null hypothesis is rejected, the logical (or research) hypothesis, and therefore the model, is supported. The model should then be refined and improved, perhaps making it predict outcomes for different spatial or temporal scales, other species or other new situations. If the null hypothesis is not rejected, then it should be retained and the hypothesis, and the model from which it is derived, are incorrect. We then start the process again, although the statistical decision not to reject a null hypothesis is more problematic (Chapter 3).

    The hypothesis in the study by Abrahams & Townsend (1993) was that bioluminescence would increase the mortality rate of copepods grazing on dinoflagellates. Abrahams & Townsend (1993) tested their hypothesis by comparing the mortality rate of copepods in jars containing bioluminescing dinoflagellates, copepods and one fish (copepod predator) with control jars containing non-bioluminescing dinoflagellates, copepods and one fish. The result was that the mortality rate of copepods was greater when feeding on bioluminescing dinoflagellates than when feeding on non-bioluminescing dinoflagellates. Therefore the null hypothesis was rejected and the logical hypothesis and burglar alarm model was supported.

    1.1.4 Alternatives to falsification
    While the Popperian philosophy of falsificationist tests has been very influential on the scientific method, especially in biology, at least two other viewpoints need to be considered. First, Thomas Kuhn (1970) argued that much of science is carried out within an accepted paradigm or framework in which scientists refine the theories but do not really challenge the paradigm. Falsified hypotheses do not usually result in rejection of the over-arching paradigm but simply its enhancement. This normal science is punctuated by occasional scientific revolutions that have as much to do with psychology and sociology as empirical information that is counter to the prevailing paradigm (O'Hear 1989). These scientific revolutions result in (and from) changes in methods, objectives and personnel (Ford 2000). Kuhn's arguments have been described as relativistic because there are often no objective criteria by which existing paradigms and theories are toppled and replaced by alternatives.

    Second, Imre Lakatos (1978) was not convinced that Popper's ideas of falsification and severe tests really reflected the practical application of science and that individual decisions about falsifying hypotheses were risky and arbitrary (Mayo 1996). Lakatos suggested we should develop scientific research programs that consist of two components: a hard core of theories that are rarely challenged and a protective belt of auxiliary theories that are often tested and replaced if alternatives are better at predicting outcomes (Mayo 1996). One of the contrasts between the ideas of Popper and Lakatos that is important from the statistical perspective is the latter's ability to deal with multiple competing hypotheses more elegantly than Popper's severe tests of individual hypotheses (Hilborn & Mangel 1997).

    An important issue for the Popperian philosophy is corroboration. The falsificationist test makes it clear what to do when an hypothesis is rejected after a severe test but it is less clear what the next step should be when an hypothesis passes a severe test. Popper argued that a theory, and its derived hypothesis, that has passed repeated severe testing has been corroborated. However, because of his difficulties with inductive thinking, he viewed corroboration as simply a measure of the past performance of a model, rather than an indication of how well it might predict in other circumstances (Mayo 1996, O'Hear 1989). This is frustrating because we clearly want to be able to use models that have passed testing to make predictions under new circumstances (Peters 1991). While detailed discussion of the problem of corroboration is beyond the scope of this book (see Mayo 1996), the issue suggests two further areas of debate. First, there appears to be a role for both induction and deduction in the scientific method, as both have obvious strengths and weaknesses and most biological research cannot help but use both in practice. Second, formal corroboration of hypotheses may require each to be allocated some measure of the probability that each is true or false, i.e. some measure of evidence in favor or against each hypothesis. This goes to the heart of one of the most long-standing and vigorous debates in statistics, that between frequentists and Bayesians (Section 1.4 and Chapter 3).

    Ford (2000) provides a provocative and thorough evaluation of the Kuhnian, Lakatosian and Popperian approaches to the scientific method, with examples from the ecological sciences.

    1.1.5 Role of statistical analysis
    The application of statistics is important throughout the process just described. First, the description and detection of patterns must be done in a rigorous manner. We want to be able to detect gradients in space and time and develop models that explain these patterns. We also want to be confident in our estimates of the parameters in these statistical models. Second, the design and analysis of experimental tests of hypotheses are crucial. It is important to remember at this stage that the research hypothesis (and its complement, the null hypothesis) derived from a model is not the same as the statistical hypothesis (James & McCulloch 1985); indeed, Underwood (1990) has pointed out the logical problems that arise when the research hypothesis is identical to the statistical hypothesis. Statistical hypotheses are framed in terms of population parameters and represent tests of the predictions of the research hypotheses (James & McCulloch 1985). We will discuss the process of testing statistical hypotheses in Chapter 3. Finally, we need to present our results, from both the descriptive sampling and from tests of hypotheses, in an informative and concise manner. This will include graphical methods, which can also be important for exploring data and checking assumptions of statistical procedures.

    Because science is done by real people, there are aspects of human psychology that can influence the way science proceeds. Ford (2000) and Loehle (1987) have summarized many of these in an ecological context, including confirmation bias (the tendency for scientists to confirm their own theories or ignore contradictory evidence) and theory tenacity (a strong commitment to basic assumptions because of some emotional or personal investment in the underlying ideas). These psychological aspects can produce biases in a given discipline that have important implications for our subsequent discussions on research design and data analysis. For example, there is a tendency in biology (and most sciences) to only publish positive (or statistically significant) results, raising issues about statistical hypothesis testing and meta-analysis (Chapter 3) and power of tests (Chapter 7). In addition, successful tests of hypotheses rely on well-designed experiments and we will consider issues such as confounding and replication in Chapter 7.

    1.2 Experiments and other tests

    Platt (1964) emphasized the importance of experiments that critically distinguish between alternative models and their derived hypotheses when he described the process of strong inference:

    devise alternative hypotheses,

    devise a crucial experiment (or several experiments) each of which will exclude one or more of the hypotheses,

    carry out the experiment(s) carefully to obtain a clean result, and

    recycle the procedure with new hypotheses to refine the possibilities (i.e. hypotheses) that remain.

    Crucial to Platt's (1964) approach was the idea of multiple competing hypotheses and tests to distinguish between these. What nature should these tests take?

    In the dinoflagellate example above, the crucial test of the hypothesis involved a manipulative experiment based on sound principles of experimental design (Chapter 7). Such manipulations provide the strongest inference about our hypotheses and models because we can assess the effects of causal factors on our response variable separately from other factors. James & McCulloch (1985) emphasized that testing biological models, and their subsequent hypotheses, does not occur by simply seeing if their predictions are met in an observational context, although such results offer support for an hypothesis. Along with James & McCulloch (1985), Scheiner (1993), Underwood (1990), Werner (1998), and many others, we argue strongly that manipulative experiments are the best way to properly distinguish between biological models.


  • There are at least two costs to this strong inference from manipulative experiments. First, experiments nearly always involve some artificial manipulation of nature. The most extreme form of this is when experiments testing some natural process are conducted in the laboratory. Even field experiments will often use artificial structures or mechanisms to implement the manipulation. For example, mesocosms (moderate sized enclosures) are often used to investigate processes happening in large water bodies, although there is evidence from work on lakes that issues related to the small-scale of mesocosms may restrict generalization to whole lakes (Carpenter 1996; see also Resetarits & Fauth 1998). Second, the larger the spatial and temporal scales of the process being investigated, the more difficult it is to meet the guidelines for good experimental design. For example, manipulations of entire ecosystems are crucial for our understanding of the role of natural and anthropogenic disturbances to these systems, especially since natural resource agencies have to manage such systems at this large spatial scale (Carpenter et al. 1995). Replication and randomization (two characteristics regarded as important for sensible interpretation of experiments - see Chapter 7) are usually not possible at large scales and novel approaches have been developed to interpret such experiments (Carpenter 1990). The problems of scale and the generality of conclusions from smaller-scale manipulative experiments are challenging issues for experimental biologists (Dunham & Beaupre 1998).

    The testing approach on which the methods in this book are based relies on making predictions from our hypothesis and seeing if those predictions apply when observed in a new setting, i.e. with data that were not used to derive the model originally. Ideally, this new setting is experimental at scales relevant for the hypothesis, but this is not always possible. Clearly, there must be additional ways of testing between competing models and their derived hypotheses. Otherwise, disciplines in which experimental manipulation is difficult for practical or ethical reasons, such as meteorology, evolutionary biology, fisheries ecology, etc., could make no scientific progress. The alternative is to predict from our models/hypotheses in new settings that are not experimentally derived. Hilborn & Mangel (1997), while arguing for experimental studies in ecology where possible, emphasize the approach of confronting competing models (or hypotheses) with observational data by assessing how well the data meet the predictions of the model.

    Often, the new setting in which we test the predictions of our model may provide us with a contrast of some factor, similar to what we may have set up had we been able to do a manipulative experiment. For example, we may never be able to (nor want to!) test the hypothesis that wildfire in old-growth forests affects populations of forest birds with a manipulative experiment at a realistic spatial scale. However, comparisons of bird populations in forests that have burnt naturally with those that haven't provide a test of the hypothesis. Unfortunately, a test based on such a natural experiment (sensu Underwood 1990) is weaker inference than a real manipulative experiment because we can never separate the effects of fire from other pre-existing differences between the forests that might also affect bird populations. Assessments of effects of human activities (environmental impact assessment) are often comparisons of this kind because we can rarely set up a human impact in a truly experimental manner (Downes et al. 2001). Well-designed observational (sampling) programs can provide a refutationist test of a null hypothesis (Underwood 1991) by evaluating whether predictions hold, although they cannot demonstrate causality.

    While our bias in favor of manipulative experiments is obvious, we hope that we do not appear too dogmatic. Experiments potentially provide the strongest inference about competing hypotheses, but their generality may also be constrained by their artificial nature and limitations of spatial and temporal scale. Testing hypotheses against new observational data provides weaker distinctions between competing hypotheses and the inferential strength of such methods can be improved by combining them with other forms of evidence (anecdotal, mathematical modeling, correlations etc. - see Downes et al. 2001, Hilborn & Mangel 1997, McArdle 1996). In practice, most biological investigations will include both observational and experimental approaches. Rigorous and sensible statistical analyses will be relevant at all stages of the investigation.

    1.3 Data, observations and variables

    In biology, data usually consist of a collection of observations or objects. These observations are usually sampling units (e.g. quadrats) or experimental units (e.g. individual organisms, aquaria, etc.) and a set of these observations should represent a sample from a clearly defined population (all possible observations in which we are interested). The actual property measured by the individual observations (Sokal & Rohlf 1995, p. 9), e.g. length, number of individuals, pH, etc., is called a variable. A random variable (which we will denote as Y, with y being any value of Y) is simply a variable whose values are not known for certain before a sample is taken, i.e. the observed values of a random variable are the results of a random experiment (the sampling process). The set of all possible outcomes of the experiment, e.g. all the possible values of a random variable, is called the sample space. Most variables we deal with in biology are random variables, although predictor variables in models might be fixed in advance and therefore not random. There are two broad categories of random variables: (i) discrete random variables can only take certain, usually integer, values, e.g. the number of cells in a tissue section or number of plants in a forest plot, and (ii) continuous random variables, which take any value, e.g. measurements like length, weight, salinity, blood pressure etc. Kleinbaum et al. (1997) distinguish these in terms of gappiness - discrete variables have gaps between observations and continuous variables have no gaps between observations.

    The distinction between discrete and continuous variables is not always a clear dichotomy; the number of organisms in a sample of mud from a local estuary can take a very large range of values but, of course, must be an integer so is actually a discrete variable. Nonetheless, the distinction between discrete and continuous variables is important, especially when trying to measure uncertainty and probability.
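    A short sketch makes this distinction concrete (Python with NumPy assumed; the Poisson and normal parameters are invented purely for illustration): it draws a discrete variable, counts of organisms per sample, and a continuous one, individual lengths, so the gappiness of the former is easy to see.

        import numpy as np

        rng = np.random.default_rng(42)

        # Discrete random variable: counts of organisms per sample of mud
        # (integer values only, with gaps between possible observations).
        counts = rng.poisson(lam=4, size=10)

        # Continuous random variable: lengths of individual organisms in mm
        # (any value within a range, no gaps between possible observations).
        lengths = rng.normal(loc=25.0, scale=3.0, size=10)

        print("Discrete (counts):", counts)
        print("Continuous (lengths):", np.round(lengths, 2))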

    1.4 Probability

The single most important characteristic of biological data is their uncertainty. For example, if we take two samples, each consisting of the same number of observations, from a population and estimate the mean for some variable, the two means will almost certainly be different, despite the samples coming from the same population. Hilborn & Mangel (1997) proposed two general causes why the two means might be different, i.e. two causes of uncertainty in the expected value of the population. Process uncertainty results from the true population mean being different when the second sample was taken compared with the first. Such temporal changes in biotic variables, even over very short time scales, are common in ecological systems. Observation uncertainty results from sampling error; the mean value in a sample is simply an imperfect estimate of the mean value in the population (all the possible observations) and, because of natural variability between observations, different samples will nearly always produce different means. Observation uncertainty can also result from measurement error, where the measuring device we are using is imperfect. For many biological variables, natural variability is so great that we rarely worry about measurement error, although this might not be the case when the variable is measured using some complex piece of equipment prone to large malfunctions.

In most statistical analyses, we view uncertainty in terms of probabilities and understanding probability is crucial to understanding modern applied statistics. We will only briefly introduce probability here, particularly as it is very important for how we interpret statistical tests of hypotheses. Very readable introductions can be found in Antelman (1997), Barnett (1999), Harrison & Tamaschke (1984) and Hays (1994); from a biological viewpoint in Sokal & Rohlf (1995) and Hilborn & Mangel (1997); and from a philosophical perspective in Mayo (1996).

We usually talk about probabilities in terms of events; the probability of event A occurring is written P(A). Probabilities can be between zero and one; if P(A) equals zero, then the event is impossible; if P(A) equals one, then the event is certain. As a simple example, and one that is used in nearly every introductory statistics book, imagine the toss of a coin. Most of us would state that the probability of heads is 0.5, but what do we really mean by that statement? The classical interpretation of probability is that it is the relative frequency of an event that we would expect in the long run, or in a long sequence of identical trials. In the coin tossing example, the probability of heads being 0.5 is interpreted as the expected proportion of heads in a long sequence of tosses. Problems with this long-run frequency interpretation of probability include defining what is meant by identical trials and the many situations in which uncertainty has no sensible long-run frequency interpretation, e.g. probability of a horse winning a particular race, probability of it raining tomorrow (Antelman 1997). The long-run frequency interpretation is actually the classical statistical interpretation of probabilities (termed the frequentist approach) and is the interpretation we must place on confidence intervals (Chapter 2) and P values from statistical tests (Chapter 3).

The alternative way of interpreting probabilities is much more subjective and is based on a degree of belief about whether an event will occur. It is basically an attempt at quantification of an opinion and includes two slightly different approaches: logical probability developed by Carnap and Jeffreys, and subjective probability pioneered by Savage, the latter being a measure of probability specific to the person deriving it. The opinion on which the measure of probability is based may be derived from previous observations, theoretical considerations, knowledge of the particular event under consideration, etc. This approach to probability has been criticized because of its subjective nature but it has been widely applied in the development of prior probabilities in the Bayesian approach to statistical analysis (see below and Chapters 2 and 3).

We will introduce some of the basic rules of probability using a simple biological example with a dichotomous outcome: eutrophication in lakes (e.g. Carpenter et al. 1998). Let P(A) be the probability that a lake will go eutrophic. Then P(Ā) equals 1 − P(A), i.e. the probability of not A is one minus the probability of A. In our example, the probability that the lake will not go eutrophic is one minus the probability that it will go eutrophic.

Now consider P(B), the probability that there will be an increase in nutrient input into the lake. The probability that either A or B occurs is:

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)    (1.1)

i.e. the probability that A or B occur [P(A ∪ B)] is the probability of A plus the probability of B minus the probability of A and B both occurring [P(A ∩ B)]. In our example, the probability that the lake will go eutrophic or that there will be an increase in nutrient input equals the probability that the lake will go eutrophic plus the probability that the lake will receive increased nutrients minus the probability that the lake will go eutrophic and receive increased nutrients.

These simple rules lead on to conditional probabilities, which are very important in practice. The conditional probability of A, given B, is:

P(A|B) = P(A ∩ B)/P(B)    (1.2)

i.e. the probability that A occurs, given that B occurs, equals the probability of A and B both occurring divided by the probability of B occurring. In our example, the probability that the lake will go eutrophic given that it receives increased nutrient input equals the probability that it goes eutrophic and receives increased nutrients divided by the probability that it receives increased nutrients.
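
To make these rules concrete, here is a minimal Python sketch. The probabilities assigned to eutrophication (A) and increased nutrient input (B) are invented purely for illustration and are not taken from Carpenter et al. (1998) or any other study.

    # Hypothetical probabilities, for illustration only.
    p_A = 0.10        # P(A): lake goes eutrophic
    p_B = 0.25        # P(B): nutrient input increases
    p_A_and_B = 0.08  # P(A and B): both events occur

    # Addition rule (Equation 1.1): P(A or B) = P(A) + P(B) - P(A and B)
    p_A_or_B = p_A + p_B - p_A_and_B

    # Conditional probability (Equation 1.2): P(A | B) = P(A and B) / P(B)
    p_A_given_B = p_A_and_B / p_B

    print(f"P(A or B) = {p_A_or_B:.3f}")    # 0.270
    print(f"P(A | B)  = {p_A_given_B:.3f}") # 0.320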

We can combine these rules to develop another way of expressing conditional probability: Bayes Theorem (named after the eighteenth-century English mathematician, Thomas Bayes):

P(A|B) = P(B|A)P(A) / [P(B|A)P(A) + P(B|Ā)P(Ā)]    (1.3)

This formula allows us to assess the probability of an event A in the light of new information, B. Let's define some terms and then show how this somewhat daunting formula can be useful in practice. P(A) is termed the prior probability of A; it is the probability of A prior to any new information (about B). In our example, it is our probability of a lake going eutrophic, calculated before knowing anything about nutrient inputs, possibly determined from previous studies on eutrophication in lakes. P(B|A) is the likelihood of B being observed, given that A did occur [a similar interpretation exists for P(B|Ā)]. The likelihood of a model or hypothesis or event is simply the probability of observing some data assuming the model or hypothesis is true or assuming the event occurs. In our example, P(B|A) is the likelihood of seeing a raised level of nutrients, given that the lake has gone eutrophic (A). Finally, P(A|B) is the posterior probability of A, the probability of A after making the observations about B, the probability of a lake going eutrophic after incorporating the information about nutrient input. This is what we are after with a Bayesian analysis, the modification of prior information to posterior information based on a likelihood (Ellison 1996).

Bayes Theorem tells us how probabilities might change based on previous evidence. It also relates two forms of conditional probabilities: the probability of A given B to the probability of B given A. Berry (1996) described this as relating inverse probabilities. Note that, although our simple example used an event (A) that had only two possible outcomes, Bayes formula can also be used for events that have multiple possible outcomes.
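
The following sketch applies Equation 1.3 to the eutrophication example; all of the numbers are hypothetical, chosen only to show the mechanics of updating a prior probability with a likelihood.

    # Hypothetical inputs, for illustration only.
    p_A = 0.10             # prior: P(lake goes eutrophic)
    p_B_given_A = 0.90     # likelihood: P(raised nutrients | eutrophic)
    p_B_given_notA = 0.20  # likelihood: P(raised nutrients | not eutrophic)

    # Unconditional probability of observing raised nutrients, P(B)
    p_notA = 1 - p_A
    p_B = p_B_given_A * p_A + p_B_given_notA * p_notA

    # Bayes Theorem (Equation 1.3): posterior probability of eutrophication
    p_A_given_B = p_B_given_A * p_A / p_B
    print(f"P(B) = {p_B:.3f}; posterior P(A | B) = {p_A_given_B:.3f}")
    # P(B) = 0.270; posterior P(A | B) = 0.333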

In practice, Bayes Theorem is used for estimating parameters of populations and testing hypotheses about those parameters. Equation 1.3 can be simplified considerably (Berry & Stangl 1996, Ellison 1996):

P(θ|data) = P(data|θ)P(θ) / P(data)    (1.4)

where θ is a parameter to be estimated or an hypothesis to be evaluated, P(θ) is the unconditional prior probability of θ being a particular value, P(data|θ) is the likelihood of observing the data if θ is that value, P(data) is the unconditional probability of observing the data and is used to ensure the area under the probability distribution of θ equals one (termed normalization), and P(θ|data) is the posterior probability of θ conditional on the data being observed. This formula can be re-expressed in English as:

posterior probability ∝ likelihood × prior probability    (1.5)
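
As a sketch of how Equation 1.4 works for a parameter rather than an event, the code below updates a discrete prior over a small grid of candidate values of θ (here taken, for illustration, to be a probability of success in a series of trials) and normalizes by P(data). The data (7 successes in 10 trials) and the flat prior are invented, and the grid approximation is just one simple way of doing the calculation.

    import numpy as np
    from scipy import stats

    # Candidate values of the parameter theta (a probability of success)
    theta = np.linspace(0.01, 0.99, 99)
    prior = np.ones_like(theta) / theta.size  # flat (uninformative) prior

    # Hypothetical data: 7 successes out of 10 independent trials
    likelihood = stats.binom.pmf(7, n=10, p=theta)  # P(data | theta)

    # Equation 1.4: posterior = likelihood * prior / P(data)
    unnormalized = likelihood * prior
    posterior = unnormalized / unnormalized.sum()   # P(data) = sum over theta

    print(f"posterior mean of theta = {np.sum(theta * posterior):.3f}")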

While we don't advocate a Bayesian philosophy in this book, it is important for biologists to be aware of the approach and to consider it as an alternative way of dealing with conditional probabilities. We will consider the Bayesian approach to estimation in Chapter 2 and to hypothesis testing in Chapter 3.

    1.5 Probability distributions

A random variable will have an associated probability distribution where different values of the variable are on the horizontal axis and the relative probabilities of the possible values of the variable (the sample space) are on the vertical axis. For discrete variables, the probability distribution will comprise a measurable probability for each outcome, e.g. 0.5 for heads and 0.5 for tails in a coin toss, 0.167 for each one of the six sides of a fair die. The sum of these individual probabilities for independent events equals one. Continuous variables are not restricted to integers or any specific values so there are an infinite number of possible outcomes. The probability distribution of a continuous variable (Figure 1.1) is often termed a probability density function (pdf) where the vertical axis is the probability density of the variable [f(y)], a rate measuring the probability per unit of the variable at any particular value of the variable (Antelman 1997). We usually talk about the probability associated with a range of values, represented by the area under the probability distribution curve between the two extremes of the range. This area is determined from the integral of the probability density from the lower to the upper value, with the distribution usually normalized so that the total probability under the curve equals one. Note that the probability of any particular value of a continuous random variable is zero because the area under the curve for a single value is zero (Kleinbaum et al. 1997); this is important when we consider the interpretation of probability distributions in statistical hypothesis testing (Chapter 3).
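
A short sketch of this "probability as area" idea: for a normal distribution with an arbitrarily chosen mean and standard deviation, the probability that Y falls between two values is the integral of the density over that range, which can also be obtained from the cumulative distribution function. The specific values used here are illustrative only.

    from scipy import stats
    from scipy.integrate import quad

    # An arbitrary normal distribution (mean 10, standard deviation 2)
    dist = stats.norm(loc=10, scale=2)

    # P(8 <= Y <= 12) as the area under the density between 8 and 12
    area, _ = quad(dist.pdf, 8, 12)       # numerical integration of f(y)
    via_cdf = dist.cdf(12) - dist.cdf(8)  # same area via the cdf

    print(f"area = {area:.4f}, via cdf = {via_cdf:.4f}")  # both about 0.6827
    # The probability of any single exact value is zero: the area above a point is zero.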

In many of the statistical analyses described in this book, we are dealing with two or more variables and our statistical models will often have more than one parameter. Then we need to switch from single probability distributions to joint probability distributions, where probabilities are measured, not as areas under a single curve, but volumes under a more complex distribution. A common joint pdf is the bivariate normal distribution, to be introduced in Chapter 5.

Probability distributions nearly always refer to the distribution of variables in one or more populations. The expected value of a random variable [E(Y)] is simply the mean (μ) of its probability distribution. The expected value is an important concept in applied statistics: most modeling procedures are trying to model the expected value of a random response variable. The mean is a measure of the center of a distribution; other measures include the median (the middle value) and the mode (the most common value). It is also important to be able to measure the spread of a distribution and the most common measures are based on deviations from the center, e.g. the variance is measured as the sum of squared deviations from the mean. We will discuss means and variances, and other measures of the center and spread of distributions, in more detail in Chapter 2.

1.5.1 Distributions for variables

Most statistical procedures rely on knowing the probability distribution of the variable (or the error terms from a statistical model) we are analyzing. There are many probability distributions that we can define mathematically (Evans et al. 2000) and some of these adequately describe the distributions of variables in biology. Let's consider continuous variables first.

The normal (also termed Gaussian) distribution is a symmetrical probability distribution with a characteristic bell-shape (Figure 1.1). It is defined as:

f(y) = [1/√(2πσ²)] e^(−(y−μ)²/(2σ²))    (1.6)

where f(y) is the probability density of any value y of Y. Note that the normal distribution can be defined simply by the mean (μ) and the variance (σ²), which are independent of each other. All other terms in the equation are constants. A normal distribution is often abbreviated to N(Y: μ, σ). Since there are infinitely many possible combinations of mean and variance, there is an infinite number of possible normal distributions. The standard normal distribution (z distribution) is a normal distribution with a mean of zero and a variance of one. The normal distribution is the most important probability distribution for data analysis; most commonly used statistical procedures in biology (e.g. linear regression, analysis of variance) assume that the variables being analyzed (or the deviations from a fitted model) follow a normal distribution.
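
The following sketch evaluates Equation 1.6 directly and checks it against scipy's built-in normal density; the mean and variance used are arbitrary.

    import math
    from scipy import stats

    def normal_pdf(y, mu, sigma2):
        """Probability density of a normal distribution, Equation 1.6."""
        return (1.0 / math.sqrt(2 * math.pi * sigma2)) * math.exp(-(y - mu) ** 2 / (2 * sigma2))

    mu, sigma2 = 5.0, 4.0  # arbitrary mean and variance
    y = 6.5
    print(normal_pdf(y, mu, sigma2))                           # density from Equation 1.6
    print(stats.norm.pdf(y, loc=mu, scale=math.sqrt(sigma2)))  # same value from scipy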

The normal distribution is a symmetrical probability distribution, but continuous variables can have non-symmetrical distributions. Biological variables commonly have a positively skewed distribution, i.e. one with a long right tail (Figure 1.1). One skewed distribution is the lognormal distribution, which means that the logarithm of the variable is normally distributed (suggesting a simple transformation to normality; see Chapter 4). Measurement variables in biology that cannot be less than zero (e.g. length, weight, etc.) often follow lognormal distributions. In skewed distributions like the lognormal, there is a positive relationship between the mean and the variance.

Figure 1.1. Probability distributions for random variables following four common distributions. For the Poisson distribution, we show the distribution for a rare event and a common one, showing the shift of the distribution from skewed to approximately symmetrical.

There are some other probability distributions for continuous variables that are occasionally used in specific circumstances. The exponential distribution (Figure 1.1) is another skewed distribution that often applies when the variable is the time to the first occurrence of an event (Fox 1993, Harrison & Tamaschke 1984), such as in failure time analysis. This is a single parameter (λ) distribution with the following probability density function:

f(y) = λe^(−λy)    (1.7)

where 1/λ is the mean time to first occurrence. Fox (1993) provided some ecological examples.
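
A brief sketch of Equation 1.7, using an arbitrary rate λ chosen only for illustration: the mean time to first occurrence is 1/λ, and scipy parameterizes the exponential by that mean (the scale).

    import math
    from scipy import stats

    lam = 0.5                          # arbitrary rate parameter (lambda)
    dist = stats.expon(scale=1 / lam)  # scipy uses scale = 1/lambda

    y = 3.0
    density = lam * math.exp(-lam * y)  # f(y) from Equation 1.7
    print(density, dist.pdf(y))         # the two values agree
    print(dist.mean())                  # mean time to first occurrence = 1/lambda = 2.0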

The exponential and normal distributions are members of the larger family of exponential distributions that can be used as error distributions for a variety of linear models (Chapter 13). Other members of this family include the gamma distribution for continuous variables and the binomial and Poisson (see below) for discrete variables.

Two other probability distributions for continuous variables are also encountered (albeit rarely) in biology. The two-parameter Weibull distribution varies between positively skewed and symmetrical depending on parameter values, although versions with three or more parameters are described (Evans et al. 2000). This distribution is mainly used for modeling failure rates and times. The beta distribution has two parameters and its shape can range from U to J to symmetrical. The beta distribution is commonly used as a prior probability distribution for dichotomous variables in Bayesian analyses (Evans et al. 2000).

There are also probability distributions for discrete variables. If we toss a coin, there are two possible outcomes: heads or tails. Processes with only two possible outcomes are common in biology, e.g. animals in an experiment can either live or die, a particular species of tree can be either present or absent from samples from a forest. A process that can only have one of two outcomes is sometimes called a Bernoulli trial and we often call the two possible outcomes success and failure. We will only consider a stationary Bernoulli trial, which is one where the probability of success is the same for each trial, i.e. the trials are independent.

The probability distribution of the number of successes in n independent Bernoulli trials is called the binomial distribution, a very important probability distribution in biology:

P(y = r) = [n!/(r!(n − r)!)] π^r (1 − π)^(n−r)    (1.8)

where P(y = r) is the probability of a particular value (y) of the random variable (Y) being r successes out of n trials, n is the number of trials and π is the probability of a success. Note that n, the number of trials, is fixed, and therefore the value of a binomial random variable cannot exceed n. The binomial distribution can be used to calculate probabilities for different numbers of successes out of n trials, given a known probability of success on any individual trial. It is also important as an error distribution for modeling variables with binary outcomes using logistic regression (Chapter 13). A generalization of the binomial distribution to when there are more than two possible outcomes is the multinomial distribution, which is the joint probability distribution of multiple outcomes from n fixed trials.

Another very important probability distribution for discrete variables is the Poisson distribution, which usually describes variables representing the number of (usually rare) occurrences of a particular event in an interval of time or space, i.e. counts. For example, the number of organisms in a plot, the number of cells in a microscope field of view, the number of seeds taken by a bird per minute. The probability distribution of a Poisson variable is:

P(y = r) = (e^(−μ) μ^r)/r!    (1.9)

where P(y = r) is the probability that the number of occurrences of an event (y) equals an integer value (r = 0, 1, 2 . . .), and μ is the mean (and variance) of the number of occurrences. A Poisson variable can take any integer value between zero and infinity because the number of trials, in contrast to the binomial and the multinomial, is not fixed. One of the characteristics of a Poisson distribution is that the mean (μ) equals the variance (σ²). For small values of μ, the Poisson distribution is positively skewed but once μ is greater than about five, the distribution is symmetrical (Figure 1.1).

The Poisson distribution has a wide range of applications in biology. It actually describes the occurrence of random events in space (or time) and has been used to examine whether organisms have random distributions in nature (Ludwig & Reynolds 1988). It also has wide application in many applied statistical procedures, e.g. counts in cells in contingency tables are often assumed to be Poisson random variables and therefore a Poisson probability distribution is used for the error terms in log-linear modeling of contingency tables (Chapter 14).

A simple example might help in understanding the difference between the binomial and the Poisson distributions. If we know the average number of seedlings of mountain ash trees (Eucalyptus regnans) per plot in some habitat, we can use the Poisson distribution to model the probability of different numbers of seedlings per plot, assuming independent sampling. The binomial distribution would be used if we wished to model the number of plots with seedlings out of a fixed number of plots, knowing the probability of a plot having a seedling.
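
The sketch below puts hypothetical numbers on the seedling example (the mean of 2 seedlings per plot, the 0.6 probability of a plot containing at least one seedling, and the 20 plots are all invented): the Poisson models counts of seedlings per plot, while the binomial models the number of occupied plots out of a fixed total.

    from scipy import stats

    # Hypothetical values, for illustration only.
    mean_per_plot = 2.0  # average number of seedlings per plot (Poisson mu)
    p_occupied = 0.6     # probability a plot contains at least one seedling
    n_plots = 20         # fixed number of plots surveyed

    # Poisson: probability of observing exactly 0, 1, ..., 4 seedlings in a plot
    for r in range(5):
        print(f"P({r} seedlings) = {stats.poisson.pmf(r, mu=mean_per_plot):.3f}")

    # Binomial: probability that exactly 15 of the 20 plots contain seedlings
    print(stats.binom.pmf(15, n=n_plots, p=p_occupied))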

Another useful probability distribution for counts is the negative binomial (White & Bennetts 1996). It is defined by two parameters, the mean and a dispersion parameter, which measures the degree of clumping in the distribution. White & Bennetts (1996) pointed out that the negative binomial has two potential advantages over the Poisson for representing skewed distributions of counts of organisms: (i) the mean does not have to equal the variance, and (ii) independence of trials (samples) is not required (see also Chapter 13).

These probability distributions are very important in data analysis. We can test whether a particular variable follows one of these distributions by calculating the expected frequencies and comparing them to observed frequencies with a goodness-of-fit test (Chapter 14). More importantly, we can model the expected value of a response variable [E(Y)] against a range of predictor (independent) variables if we know the probability distribution of our response variable.

1.5.2 Distributions for statistics

The remaining theoretical distributions to examine are those used for determining probabilities of sample statistics, or modifications thereof. These distributions are used extensively for estimation and hypothesis testing. Four particularly important ones are as follows.

1. The z or normal distribution represents the probability distribution of a random variable that is the ratio of the difference between a sample statistic and its population value to the standard deviation of the population statistic (Figure 1.2).

2. Student's t distribution (Figure 1.2) represents the probability distribution of a random variable that is the ratio of the difference between a sample statistic and its population value to the standard deviation of the distribution of the sample statistic. The t distribution is a symmetrical distribution very similar to a normal distribution, bounded by infinity in both directions. Its shape becomes more similar to a normal distribution with increasing sample size (Figure 1.2). We can convert a single sample statistic to a t value and use the t distribution to determine the probability of obtaining that t value (or one smaller or larger) for a specified value of the population parameter (Chapters 2 and 3).

3. χ² (chi-square) distribution (Figure 1.2) represents the probability distribution of a variable that is the square of values from a standard normal distribution (Section 1.5). Values from a χ² distribution are bounded by zero and infinity. Variances have a χ² distribution so this distribution is used for interval estimation of population variances (Chapter 2). We can also use the χ² distribution to determine the probability of obtaining a sample difference (or one smaller or larger) between observed values and those predicted by a model (Chapters 13 and 14).

4. F distribution (Figure 1.2) represents the probability distribution of a variable that is the ratio of two independent χ² variables, each divided by its df (degrees of freedom) (Hays 1994). Because variances are distributed as χ², the F distribution is used for testing hypotheses about ratios of variances. Values from the F distribution are bounded by zero and infinity. We can use the F distribution to determine the probability of obtaining a sample variance ratio (or one larger) for a specified value of the true ratio between variances (Chapters 5 onwards).

All four distributions have mathematical derivations that are too complex to be of much interest to biologists (see Evans et al. 2000). However, these distributions are tabled in many textbooks and programmed into most statistical software, so probabilities of obtaining values from each, within a specific range, can be determined. These distributions are used to represent the probability distributions of the sample statistics (z, t, χ² or F) that we would expect from repeated random sampling from a population or populations. Different versions of each distribution are used depending on the degrees of freedom associated with the sample or samples (see Box 2.1 and Figure 1.2).


Figure 1.2. Probability distributions for four common statistics. For the t, χ², and F distributions, we show distributions for three or four different degrees of freedom (a to d, in increasing order), to show how the shapes of these distributions change.

Chapter 2

    Estimation

    2.1 Samples and populations

Biologists usually wish to make inferences (draw conclusions) about a population, which is defined as the collection of all the possible observations of interest. Note that this is a statistical population, not a biological population (see below). The collection of observations we take from the population is called a sample and the number of observations in the sample is called the sample size (usually given the symbol n). Measured characteristics of the sample are called statistics (e.g. sample mean) and characteristics of the population are called parameters (e.g. population mean).

The basic method of collecting the observations in a sample is called simple random sampling. This is where any observation has the same probability of being collected, e.g. giving every rat in a holding pen a number and choosing a sample of rats to use in an experiment with a random number table. We rarely sample truly randomly in biology, often relying on haphazard sampling for practical reasons. The aim is always to sample in a manner that doesn't create a bias in favour of any observation being selected. Other types of sampling that take into account heterogeneity in the population (e.g. stratified sampling) are described in Chapter 7. Nearly all applied statistical procedures that are concerned with using samples to make inferences (i.e. draw conclusions) about populations assume some form of random sampling. If the sampling is not random, then we are never sure quite what population is represented by our sample. When random sampling from clearly defined populations is not possible, then interpretation of standard methods of estimation becomes more difficult.

Populations must be defined at the start of any study and this definition should include the spatial and temporal limits to the population and hence the spatial and temporal limits to our inference. Our formal statistical inference is restricted to these limits. For example, if we sample from a population of animals at a certain location in December 1996, then our inference is restricted to that location in December 1996. We cannot infer what the population might be like at any other time or in any other place, although we can speculate or make predictions.

One of the reasons why classical statistics has such an important role in the biological sciences, particularly agriculture, botany, ecology, zoology, etc., is that we can often define a population about which we wish to make inferences and from which we can sample randomly (or at least haphazardly). Sometimes the statistical population is also a biological population (a group of individuals of the same species). The reality of random sampling makes biology a little different from other disciplines that use statistical analyses for inference. For example, it is often difficult for psychologists or epidemiologists to sample randomly because they have to deal with whatever subjects or patients are available (or volunteer!).

The main reason for sampling randomly from a clearly defined population is to use sample statistics (e.g. sample mean or variance) to estimate population parameters of interest (e.g. population mean or variance). The population parameters cannot be measured directly because the populations are usually too large, i.e. they contain too many observations for practical measurement. It is important to remember that population parameters are usually considered to be fixed, but unknown, values so they are not random variables and do not have probability distributions. Note that this contrasts with the Bayesian approach where population parameters are viewed as random variables (Section 2.6). Sample statistics are random variables, because their values depend on the outcome of the sampling experiment, and therefore they do have probability distributions, called sampling distributions.

What are we after when we estimate population parameters? A good estimator of a population parameter should have the following characteristics (Harrison & Tamaschke 1984, Hays 1994).

It should be unbiased, meaning that the expected value of the sample statistic (the mean of its probability distribution) should equal the parameter. Repeated samples should produce estimates which do not consistently under- or over-estimate the population parameter.

It should be consistent, so that as the sample size increases, the estimator gets closer to the population parameter. Once the sample includes the whole population, the sample statistic will obviously equal the population parameter, by definition.

It should be efficient, meaning it has the lowest variance among all competing estimators. For example, the sample mean is a more efficient estimator of the population mean of a variable with a normal probability distribution than the sample median, despite the two statistics being numerically equivalent (a small simulation illustrating this appears below).
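
A small simulation sketch of the efficiency point for a normal population (all settings, including the mean of 50, standard deviation of 10, sample size and number of repeated samples, are arbitrary): repeated samples are drawn, and the sample medians vary more from sample to sample than the sample means.

    import numpy as np

    rng = np.random.default_rng(1)  # arbitrary seed for reproducibility
    n_samples, n = 5000, 30         # number of repeated samples and sample size

    # Draw repeated samples from a normal population with mean 50, sd 10
    samples = rng.normal(loc=50, scale=10, size=(n_samples, n))

    means = samples.mean(axis=1)
    medians = np.median(samples, axis=1)

    # Both are centered near 50, but the means have the smaller variance
    print(f"variance of sample means:   {means.var():.3f}")
    print(f"variance of sample medians: {medians.var():.3f}")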

    There are two broad types of estimation:

1. point estimates provide a single value which estimates a population parameter, and

2. interval estimates provide a range of values that might include the parameter with a known probability, e.g. confidence intervals.

Later in this chapter we discuss different methods of estimating parameters, but, for now, let's consider some common population parameters and their point estimates.

2.2 Common parameters and statistics

Consider a population of observations of the variable Y measured on all N sampling units in the population. We take a random sample of n observations (y1, y2, y3, . . . yi, . . . yn) from the population. We usually would like information about two aspects of the population, some measure of location or central tendency (i.e. where is the middle of the population?) and some measure of the spread (i.e. how different are the observations in the population?). Common estimates of parameters of location and spread are given in Table 2.1 and illustrated in Box 2.2.

2.2.1 Center (location) of distribution

Estimators for the center of a distribution can be classified into three general classes, or broad types (Huber 1981, Jackson 1986). First are L-estimators, based on the sample data being ordered from smallest to largest (order statistics) and then forming a linear combination of weighted order statistics. The sample mean (ȳ), which is an unbiased estimator of the population mean (μ), is an L-estimator where each observation is weighted by 1/n (Table 2.1). Other common L-estimators include the following.

The median is the middle measurement of a set of data. Arrange the data in order of magnitude (i.e. ranks) and weight all observations except the middle one by zero. The median is an unbiased estimator of the population mean for normal distributions, is a better estimator of the center of skewed distributions and is more resistant to outliers (extreme values very different to the rest of the sample; see Chapter 4).

The trimmed mean is the mean calculated after omitting a proportion (commonly 5%) of the highest (and lowest) observations, usually to deal with outliers.

The Winsorized mean is determined as for trimmed means except the omitted observations are replaced by the nearest remaining value (these L-estimators are illustrated in the short sketch following this list).
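
The sketch below computes these L-estimators for a small invented sample containing one outlier; scipy's trim_mean and mstats.winsorize functions are used for the trimmed and Winsorized means, and the 10% trimming proportion is an arbitrary choice rather than a recommendation.

    import numpy as np
    from scipy import stats
    from scipy.stats import mstats

    # A small hypothetical sample with one extreme value (outlier)
    y = np.array([3.1, 3.4, 3.6, 3.8, 4.0, 4.2, 4.5, 4.7, 5.0, 19.0])

    print(f"mean:            {y.mean():.2f}")
    print(f"median:          {np.median(y):.2f}")
    print(f"trimmed mean:    {stats.trim_mean(y, proportiontocut=0.1):.2f}")  # drop highest and lowest 10%

    # Winsorized mean: replace the most extreme 10% at each end with the nearest remaining value
    y_wins = mstats.winsorize(y, limits=(0.1, 0.1))
    print(f"Winsorized mean: {y_wins.mean():.2f}")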

Second are M-estimators, where the weightings given to the different observations change gradually from the middle of the sample and incorporate a measure of variability in the estimation procedure. They include the Huber M-estimator and the Hampel M-estimator, which use different functions to weight the observations. They are tedious to calculate, requiring iterative procedures, but may be useful when outliers are present because they downweight extreme values. They are not commonly used but do have a role in robust regression and ANOVA techniques for analyzin