sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/notes/notes...

25
11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Sections 11.1, 11.2, and 11.4 Shiwen Shen Department of Statistics University of South Carolina Elementary Statistics for the Biological and Life Sciences (STAT 205) 1 / 24

Upload: others

Post on 27-Sep-2020

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Sections 11.1, 11.2, and 11.4

Shiwen Shen

Department of StatisticsUniversity of South Carolina

Elementary Statistics for the Biological and Life Sciences

(STAT 205)

1 / 24

Page 2: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Chapter 7

2 / 24

Page 3: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Chapter 11

3 / 24

Page 4: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Comparing more than two means

In Chapter 7 we had two groups and tested H0 : µ1 = µ2.

In Chapter 11 we will have I groups and testH0 : µ1 = µ2 = · · · = µI .

We are still interested in whether the population means arethe same across groups, there’s just more than two.

The alternative hypothesis is HA : one or more ofµ1, µ2, . . . , µI are different.

Let’s look at an example where I = 5.

4 / 24

Page 5: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Example 11.1.1: Sweet Corn

When growing sweet corn, can organic methods be usedsuccessfully to control harmful insects and limit their effect on thecorn? In a study of this question researchers compared the weightsof ears of corn under five conditions in an experiment in whichsweet corn was grown using organic methods. In one plot of corn abeneficial soil nematode was introduced. In a second plot aparasitic wasp was used. A third plot was treated with both thenematode and the wasp. In a fourth plot a bacterium was used.Finally, a fifth plot of corn acted as a control; no special treatmentwas applied here. Thus, the treatments were

Treatment 1: Nematodes

Treatment 2: Wasps

Treatment 3: Nematodes and wasps

Treatment 4: Bacteria

Treatment 5: Control5 / 24

Page 6: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Section 11.1 Background

6 / 24

Page 7: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Weights of ears of corn receiving five treatments

7 / 24

Page 8: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Comparing four population means requires six comparisons

The PROBLEM of MULTIPLE COMPARISONS occurs whenmultiple hypotheses are considered simultaneously.

H0 : µ1 = µ2 H0 : µ1 = µ3 H0 : µ1 = µ4

H0 : µ2 = µ3 H0 : µ2 = µ4 H0 : µ3 = µ4

8 / 24

Page 9: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Type I error

Its natural to ask: why not now just compare all possible pairsµ1 − µ2, µ1 − µ3, µ2 − µ3, etc., each with 5% level?Question: With 4 samples (6 pairs), is the type I error6 × 5% = 30%?

Answer: NO! The calculation of type I error is morecomplicated!

9 / 24

Page 10: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Type I error

Its natural to ask: why not now just compare all possible pairsµ1 − µ2, µ1 − µ3, µ2 − µ3, etc., each with 5% level?Question: With 4 samples (6 pairs), is the type I error6 × 5% = 30%?Answer: NO! The calculation of type I error is morecomplicated!

9 / 24

Page 11: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Signal to Noise

H0 : µ1 = µ2 = µ3 = µ4

Ha : one or more of µ1, µ2, µ3, µ4 are different

Which study has more pronounced differences between groups?(a) or (b)?

10 / 24

Page 12: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Analysis of Variance (ANOVA)

We will test H0 : µ1 = µ2 = · · · = µI via analysis of variance(ANOVA).

ANOVA compares how variable the sample meansy1, y2, . . . , yI are to how variable observations are around eachmean.

Assumptions: Observations in each group are independentlynormally distributed with the same variance σ2. (You are notnecessary to know the value of σ2.) The data in differentgroups are also independent.

11 / 24

Page 13: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Building Test Statistic: Between Group Difference

Y1 is the sample mean for group 1.¯Y is the sample mean for all group.Between group difference:

n1 × (Y1 − ¯Y ) + n2 × (Y2 − ¯Y ) + n3 × (Y3 − ¯Y )

12 / 24

Page 14: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Building Test Statistic: Between Group Variation

Between group variation:

n1 × (Y1 − ¯Y )2 + n2 × (Y2 − ¯Y )2 + n3 × (Y3 − ¯Y )2

= Σ3i=1ni (Yi − ¯Y )2

It is between group sums of squares.

13 / 24

Page 15: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Building Test Statistic: Within Group Variation

Y1i are the observations from group 1, i = 1, 2, ..., n1.Y1 is the sample mean for group 1.Within group variation:

Σi (Y1i − Y1)2 + Σi (Y2i − Y2)2 + Σi (Y3i − Y3)2

14 / 24

Page 16: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Building Test Statistic: Within Group Variation

By definition, sample standard deviation of group 1 is

s21 =Σi (Y1i − Y1)2

n1 − 1

Within group variation:

Σi (Y1i − Y1)2 + Σi (Y2i − Y2)2 + Σi (Y3i − Y3)2

= (n1 − 1) × s21 + (n2 − 1) × s22 + (n3 − 1) × s23

= Σi (ni − 1) × s2i

It is weighted sum of sample standard deviations, called withingroup sums of squares.

15 / 24

Page 17: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Definition of Sums of Squares

Sums of Squares (SS) are sums of squared deviations from acentral value.

Between Group SS.

Within Group SS

Total SS =∑I

i=1

∑nij=1(Yij − ¯Y )2

Relation:

TOTAL SS = Between Group SS + Within Group SS

Mean Square (MS) is the average of the squared deviationsfrom a central value. It is a Sum of Squares (SS) divided bythe number of informative values in the SS, called “degrees offreedom”, or df

16 / 24

Page 18: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

ANOVA formulae

17 / 24

Page 19: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

Example 11.2.1: Weight Gain of Lambs

The table shows the weight gains (in 2 weeks) of young lambson three different diets.

The total number of observations is n. = 3 + 5 + 4 = 12

18 / 24

Page 20: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

ANOVA table

estimate of the unknown variance:

σ2 = s2pool = MS(Within),

which is 23.33 in the Lamb data.

19 / 24

Page 21: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

F test

H0 : µ1 = µ2 = · · · = µI

Test statistic:

F =MS(Between)

MS(Within),

which has an F (dfbetween, dfwithin) distribution if H0 is true.

The F distribution with ν1 and ν2 degrees of freedom is thedistribution of the ratio of two independent mean squares.Notation: F ∼ F (ν1, ν2).

In R, Within Group SS is called Residual SS

Obtain P-value from R using aov() command.

20 / 24

Page 22: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

R code for lamb diet data

weight <- c(8,16, 9, 9,16,21,11,18,15,10,17, 6)

diet <- c( 1, 1, 1, 2, 2, 2, 2, 2, 3, 3, 3, 3)

diet <- factor(diet)

fit <- aov(weight~diet)

summary(fit)

Df Sum Sq Mean Sq F value Pr(>F)

diet 2 36 18.00 0.771 0.491

Residuals 9 210 23.33

With p-value 0.491 greater than 5% level, we accept the nullhypothesis and claim the mean weight gain is the same on all threediets.

21 / 24

Page 23: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

R code for corn growth data (Example 11.1.1)

weight=c(16.5,11.0, 8.5,16.0,13.0,15.0,15.0,13.0,14.5,10.5,

11.5, 9.0,12.0,15.0,11.0,12.0, 9.0,10.0, 9.0,10.0,

12.5,11.5,12.5,10.5,14.0, 9.0,11.0, 8.5,14.0,12.0,

16.0, 9.0, 9.5,12.5,11.0, 6.5,10.0, 7.0, 9.0, 9.5,

8.0, 9.0,10.5, 9.0,18.5,14.5, 8.0,10.5, 9.0,17.0,

7.0, 8.0,13.0, 6.5,10.0,10.5, 5.0, 9.0, 8.5,11.0)

treat=c(1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,

1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,1,2,3,4,5,

1,2,3,4,5,1,2,3,4,5)

treat=factor(treat)

fit=aov(weight~treat)

summary(fit)

Df Sum Sq Mean Sq F value Pr(>F)

treat 4 52.31 13.0771 1.6461 0.1758

Residuals 55 436.94 7.9443

With p-value 0.1758 greater than 5% level, we accept the nullhypothesis. 22 / 24

Page 24: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

MAO activity in schizophrenia

23 / 24

Page 25: Sections 11.1, 11.2, and 11people.stat.sc.edu/sshen/courses/17sstat205/course materials/Notes/Notes 20.pdf11.1 Introduction 11.2 One-Way ANOVA 11.4 The Global F Test Example 11.1.1:

11.1 Introduction11.2 One-Way ANOVA11.4 The Global F Test

MAO activity (Fig. 1.1.2)

url="http://people.stat.sc.edu/sshen/courses/17sstat205/data/mao.txt"

x = read.table(url,header=FALSE)

y = x[,1]

diagnosis = factor(x[,2])

fit = aov(y~diagnosis)

summary(fit)

Df Sum Sq Mean Sq F value Pr(>F)

diagnosis 2 136.1 68.06 6.346 0.00411 **

Residuals 39 418.3 10.72

With p-value 0.00411 less than 5% level, we reject the nullhypothesis and claim that at least one of the disease diagnosis isdifferent in MAO activity.

24 / 24