+ Chapter 12: Analysis of Variance Lecture PowerPoint Slides Discovering Statistics 2nd Edition Daniel T. Larose

Upload: randell-collins

Post on 24-Dec-2015

Chapter 12: Analysis of Variance

Lecture PowerPoint Slides

Discovering Statistics

2nd Edition Daniel T. Larose

+ Chapter 12 Overview

12.1 One-Way Analysis of Variance (ANOVA)

12.2 Multiple Comparisons

12.3 Randomized Block Design

12.4 Two-Way ANOVA

+ The Big Picture

Where we are coming from and where we are headed…

In Chapters 8–10, we learned statistical inference for continuous random variables and in Chapter 11 we learned hypothesis tests for categorical variables.

Here in Chapter 12, we are introduced to analysis of variance, a way to compare the population means of several different groups, and determine whether significant differences exist between the means.

In the final two chapters, we will learn about inference for regression and nonparametric statistics.

+ 12.1: One-Way ANOVA

Objectives:

Explain how ANOVA works.

Perform ANOVA.

How ANOVA Works

Analysis of variance (ANOVA) is a hypothesis test for determining whether three or more means of different populations are equal. ANOVA works by comparing the variability between the samples to the variability within the samples.

Requirements for Performing ANOVA

1. Each of the k populations is normally distributed.

2. The variances (σ²) of the populations are all equal.

3. The samples are independently drawn.

Procedure for Verifying the Requirements for ANOVA

1. Normality: Check that the data from each group are normally distributed, using normal probability plots.

2. Equal Variances: Compute the sample standard deviation for each group to verify that the largest standard deviation is not larger than twice the smallest standard deviation.

3. Independence: Verify that the samples are independently drawn.
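
The equal-variances check in step 2 is simple enough to sketch in a few lines of Python. This is only an illustration: the function name `equal_variance_check` is made up for this sketch, and the three standard deviations are the ones that appear in the dormitory GPA example in these slides.

```python
def equal_variance_check(std_devs):
    """Rule of thumb: the largest sample sd should not exceed twice the smallest."""
    largest, smallest = max(std_devs), min(std_devs)
    ratio = largest / smallest
    return ratio, largest <= 2 * smallest

# Sample standard deviations for dorms A, B, and C (from the GPA example).
ratio, ok = equal_variance_check([1.133460777, 1.030857248, 0.9370284])
print(f"largest/smallest = {ratio:.3f}, requirement met: {ok}")
```

The normality and independence requirements are not covered here; they need plots and knowledge of how the samples were drawn.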

Measuring Variabilities

We use the following statistics to measure the variabilities between and within the samples.

The mean square treatment (MSTR) measures the variability in the sample means. MSTR is the sample variance of the sample means, weighted by the sample size.

The mean square error (MSE) measures the variability within the samples. MSE is the mean of the sample variances, weighted by sample size.

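$$\mathrm{MSTR} = \frac{\sum n_i(\bar{x}_i - \bar{\bar{x}})^2}{k - 1}$$

$$\mathrm{MSE} = \frac{\sum (n_i - 1)s_i^2}{n_t - k}$$

$$F_{\text{data}} = \frac{\mathrm{MSTR}}{\mathrm{MSE}}$$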
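
As a rough sketch of how MSTR, MSE, and Fdata could be computed from raw samples, the Python below uses only the standard library. The function name `one_way_anova_f` and the three small groups are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
from statistics import mean, variance

def one_way_anova_f(samples):
    """Return (MSTR, MSE, F_data) for a list of samples, one list per group."""
    k = len(samples)                               # number of groups
    n_t = sum(len(s) for s in samples)             # total sample size
    grand_mean = sum(sum(s) for s in samples) / n_t
    # Between-sample variability: squared deviations of the group means,
    # weighted by group size.
    sstr = sum(len(s) * (mean(s) - grand_mean) ** 2 for s in samples)
    # Within-sample variability: sample variances weighted by (n_i - 1).
    sse = sum((len(s) - 1) * variance(s) for s in samples)
    mstr = sstr / (k - 1)
    mse = sse / (n_t - k)
    return mstr, mse, mstr / mse

# Hypothetical data, not taken from the slides.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
mstr, mse, f_data = one_way_anova_f(groups)
print(mstr, mse, f_data)
```

Note that `statistics.variance` is the sample variance (divisor n − 1), which is what the MSE formula requires.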

Example

Consider the summary statistics for GPAs for dorms A, B, and C. Calculate the SSTR, SSE, SST, MSTR, MSE, and Fdata.

k = 3 dormitories, and total sample size nt = 10 + 10 + 10 = 30

• a.

$$\mathrm{SSTR} = \sum n_i(\bar{x}_i - \bar{\bar{x}})^2 = 10(2.2 - 2.5)^2 + 10(2.5 - 2.5)^2 + 10(2.8 - 2.5)^2 = 10\left[(-0.3)^2 + 0^2 + (0.3)^2\right] = 1.8$$

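Example

$$\mathrm{SSE} \approx (10 - 1)1.133460777^2 + (10 - 1)1.030857248^2 + (10 - 1)0.9370284^2 \approx 29.0288$$

SST = SSTR + SSE = 1.8 + 29.0288 = 30.8288

$$\mathrm{MSTR} = \frac{\mathrm{SSTR}}{k - 1} = \frac{1.8}{3 - 1} = 0.9$$

$$\mathrm{MSE} = \frac{\mathrm{SSE}}{n_t - k} = \frac{29.0288}{30 - 3} \approx 1.0751407407$$

$$F_{\text{data}} = \frac{\mathrm{MSTR}}{\mathrm{MSE}} = \frac{0.9}{1.0751407407} \approx 0.8370997079 \approx 0.84$$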
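
The dormitory calculation can be checked from the summary statistics alone. The sketch below is one way to do that in Python; the function name `anova_from_summary` is an assumption of this sketch, while the sample sizes, means, and standard deviations are the ones used in the example.

```python
def anova_from_summary(ns, means, sds):
    """One-way ANOVA quantities from group sizes, sample means, and sample sds."""
    k = len(ns)
    n_t = sum(ns)
    # Overall mean: group means weighted by group size.
    grand_mean = sum(n * m for n, m in zip(ns, means)) / n_t
    sstr = sum(n * (m - grand_mean) ** 2 for n, m in zip(ns, means))
    sse = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))
    mstr = sstr / (k - 1)
    mse = sse / (n_t - k)
    return {"SSTR": sstr, "SSE": sse, "SST": sstr + sse,
            "MSTR": mstr, "MSE": mse, "F": mstr / mse}

result = anova_from_summary(
    ns=[10, 10, 10],
    means=[2.2, 2.5, 2.8],
    sds=[1.133460777, 1.030857248, 0.9370284],
)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```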

Performing One-Way ANOVA

One-Way Analysis of Variance

We have taken random samples from k populations and want to test whether the population means are all equal. Conditions:

1. Each of the k populations is normally distributed.

2. The variances (σ²) of the populations are all equal.

3. The samples are independently drawn.

Step 1: State the hypotheses and rejection rule.

Step 2: Calculate Fdata:

$$F_{\text{data}} = \frac{\mathrm{MSTR}}{\mathrm{MSE}}$$

where Fdata follows an F distribution with df1 = k – 1 and df2 = nt – k.

Step 3: Find the p-value.

Step 4: State the conclusion and the interpretation.

Example

Consider the summary statistics for GPAs for dorms A, B, and C. Test whether the population mean GPAs differ among the students in the three dormitories. Use α = 0.05.

H0: μA = μB = μC

Ha: not all the population means are equal

μi represents the population mean GPA of students from dormitory i.

Rejection rule: Reject H0 if p-value < α.

Example

The conditions are checked in Example 12.1.

Also, in a previous example, we calculated MSTR = 0.9 and MSE = 1.0751407407.

$$F_{\text{data}} = \frac{\mathrm{MSTR}}{\mathrm{MSE}} = \frac{0.9}{1.0751407407} \approx 0.8370997079$$

Fdata follows an F distribution with df1 = k – 1 = 3 – 1 = 2 and df2 = nt – k = 30 – 3 = 27

We find the p-value to be P(F > 0.8370997079) = 0.4438929572 ≈ 0.4439.

Since the p-value ≈ 0.4439 is greater than α = 0.05, we do not reject the null hypothesis. There is insufficient evidence to conclude that the population mean GPAs of the three dormitories are not all equal.
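
The p-value in this example can be reproduced without statistical tables, under one important assumption: when the numerator degrees of freedom df1 equal 2 (as they do here), the upper-tail area of the F distribution has the closed form P(F > f) = (1 + 2f/df2)^(−df2/2). The sketch below relies on that special case only; for other values of df1 a table or a statistics library would be needed. The function name is hypothetical.

```python
def f_upper_tail_df1_is_2(f_value, df2):
    """P(F > f_value) for an F distribution with df1 = 2 and the given df2.

    Valid only for df1 = 2; this is not a general F-distribution routine.
    """
    return (1 + 2 * f_value / df2) ** (-df2 / 2)

alpha = 0.05
f_data = 0.8370997079      # from the dormitory example
df2 = 30 - 3               # n_t - k

p_value = f_upper_tail_df1_is_2(f_data, df2)
reject_h0 = p_value < alpha
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```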

+ 12.2: Multiple Comparisons

Objectives:

Perform multiple comparisons tests using the Bonferroni method.

Use Tukey’s test to perform multiple comparisons.

Use confidence intervals to perform multiple comparisons for Tukey’s test.

Multiple Comparisons

When we perform one-way ANOVA, we may conclude that not all population means are the same. However, the ANOVA itself does not tell us which pairs of population means are significantly different.

Multiple Comparisons

Once an ANOVA result has been found significant (the null hypothesis is rejected), multiple comparisons procedures seek to determine which pairs of population means are significantly different. Multiple comparisons are not performed if the ANOVA null hypothesis has not been rejected.

We shall learn three multiple comparisons procedures:

• The Bonferroni Method

• Tukey’s Test

• Tukey’s Test using Confidence Intervals

The Bonferroni Method

To determine which pairs of population means are significantly different, we test each pair of means using a slightly different test statistic t and apply the Bonferroni adjustment to the p-value.

The Bonferroni Adjustment

When performing multiple comparisons, the experimentwise error rate αEW is the probability of making at least one Type I error in the set of hypothesis tests.

• αEW is always greater than the comparison level of significance α by a factor approximately equal to the number of comparisons being made.

• Thus, the Bonferroni adjustment corrects for the experimentwise error rate by multiplying the p-value of each pairwise hypothesis test by the number of comparisons being made. If the Bonferroni-adjusted p-value is greater than 1, then set it equal to 1.
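
The adjustment itself is a one-line calculation, sketched below in Python. The function name and the raw p-values are hypothetical and serve only to show the multiplication and the cap at 1; with k = 3 groups there are 3 pairwise comparisons.

```python
from math import comb

def bonferroni_adjust(p_values):
    """Multiply each p-value by the number of comparisons, capping at 1."""
    c = len(p_values)
    return [min(1.0, c * p) for p in p_values]

k = 3
c = comb(k, 2)   # number of pairwise comparisons among k groups

# Hypothetical unadjusted p-values for the c pairwise tests.
raw = {"A vs B": 0.012, "A vs C": 0.30, "B vs C": 0.45}
adjusted = dict(zip(raw, bonferroni_adjust(list(raw.values()))))

for pair, p in adjusted.items():
    print(f"{pair}: adjusted p-value = {p:.3f}")
```

Each adjusted p-value is then compared with α in the usual way.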

Tukey’s Test

Tukey’s Test for Multiple Comparisons

Tukey’s Method requires that the conditions for ANOVA have been met and that the null hypothesis of equal means has been rejected.

Step 1: For each of the c hypothesis tests, state the hypotheses.

Step 2: Find the Tukey critical value and state the rejection rule.

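Step 3: Calculate the Tukey test statistic q for each hypothesis test:

$$q_{\text{data}} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\mathrm{MSE}}{2}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$$

Step 4: For each hypothesis test, state the conclusion and the interpretation.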
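
As an illustration of Step 3, the sketch below computes the Tukey statistic for one pair of groups, assuming it has the usual form q = (x̄1 − x̄2) / sqrt((MSE/2)(1/n1 + 1/n2)). The function name and all input values are hypothetical. The Tukey critical value from Step 2 comes from a table and is not computed here.

```python
from math import sqrt

def tukey_q(mean_1, mean_2, n_1, n_2, mse):
    """Tukey test statistic for comparing two group means."""
    return (mean_1 - mean_2) / sqrt((mse / 2) * (1 / n_1 + 1 / n_2))

# Hypothetical sample means, sample sizes, and MSE from a significant ANOVA.
q = tukey_q(mean_1=12.0, mean_2=9.0, n_1=8, n_2=8, mse=4.0)
print(f"q_data = {q:.4f}")
```

The absolute value of q is what would be compared with the critical value for each of the c pairwise tests.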

Multiple Comparisons

If a 100(1 – α)% confidence interval for µ1 – µ2 contains zero, then at level of significance α we do not reject the null hypothesis H0: µ1 = µ2. If the interval does not contain zero, then we do reject the null hypothesis.

+ 12.3: Randomized Block Design

Objective:

Explain the power of the randomized block design and perform a randomized block design ANOVA.

Randomized Block Design

In the appropriate circumstances, we can use the randomized block design to improve the ability of the ANOVA to find significant differences among the treatment means.

A blocking factor, or block, is a variable that is not of primary interest to the researcher but is included in the ANOVA in order to improve the ability of the ANOVA to find significant differences among the treatment means. In a randomized block design ANOVA, we test for differences among the treatment means, while accounting for the variability among the levels in the blocking factor.

Randomized Block Design

Note the following facts about the ANOVA table for randomized block design:

• SSTR, its df, k – 1, and MSTR are all the same quantities as in the one-way ANOVA table.

• SSERBD is denoted simply as SSE.

• Quantities in the Mean Square column equal the quantities in the Sum of Squares column divided by their respective degrees of freedom.

• We have SST = SSTR + SSB + SSE, and the three df sum to nt – 1.

• Since we are not interested in the blocks, and thus not in the mean square blocks MSB, there is no F statistic for blocks.

• In RBD, the error df is broken down into the df for SSB, b – 1, and the df for the new SSE, (k – 1)(b – 1).
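
The facts above can be checked numerically with a small sketch. It assumes one observation per treatment-block combination and the standard sum-of-squares formulas for blocks and treatments, which these slides do not spell out; the function name and the data are hypothetical.

```python
def rbd_anova(table):
    """Randomized block design ANOVA.

    table[j][i] is the single observation for block j and treatment i.
    """
    b = len(table)           # number of blocks
    k = len(table[0])        # number of treatments
    n_t = b * k
    grand = sum(sum(row) for row in table) / n_t
    treatment_means = [sum(row[i] for row in table) / b for i in range(k)]
    block_means = [sum(row) / k for row in table]

    sst = sum((x - grand) ** 2 for row in table for x in row)
    sstr = b * sum((m - grand) ** 2 for m in treatment_means)
    ssb = k * sum((m - grand) ** 2 for m in block_means)
    sse = sst - sstr - ssb   # uses SST = SSTR + SSB + SSE

    df = {"treatments": k - 1, "blocks": b - 1, "error": (k - 1) * (b - 1)}
    mstr = sstr / df["treatments"]
    mse = sse / df["error"]
    return {"SSTR": sstr, "SSB": ssb, "SSE": sse, "SST": sst,
            "df": df, "MSTR": mstr, "MSE": mse, "F": mstr / mse}

# Hypothetical data: 3 blocks (rows) by 3 treatments (columns).
data = [[10, 12, 14],
        [20, 23, 23],
        [30, 31, 35]]
out = rbd_anova(data)
print(out)
```

In this made-up data set most of the total variability sits in the blocks, so removing SSB from the error term leaves a much smaller MSE than a one-way ANOVA on the same numbers would give.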

+ 12.4: Two-Way ANOVA

Objectives:

Construct and interpret an interaction graph.

Perform a two-way ANOVA.

Interaction Graph

It is important when performing two-way ANOVA to check for the presence of interaction between the factors.

Interaction exists between two factors when the effect of one factor depends on the level of the other factor.

An interaction plot is a graphical representation of the cell means for each cell in the contingency table. To construct an interaction plot:

1. Compute the cell means for all cells.

2. Construct an x – y plot (Cartesian plane). Label the horizontal axis for each level of Factor A. The vertical axis represents the response variable.

3. For the first level of Factor A, insert a point at a height representing the cell means for the response variable for each level of Factor B. Then do this for the other levels of Factor A.

4. Connect the points that have a common Factor B level.
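
Steps 1 to 4 can be sketched as a small data-preparation routine: it computes the cell means and groups them into one line per Factor B level, ready to be drawn with any plotting tool. The function name, factor labels, and observations are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

def interaction_lines(observations):
    """Group cell means into one line per Factor B level.

    observations: iterable of (level_a, level_b, response) tuples.
    Returns {level_b: [(level_a, cell_mean), ...]} with Factor A levels sorted.
    """
    cells = defaultdict(list)
    for level_a, level_b, response in observations:
        cells[(level_a, level_b)].append(response)
    # Step 1: cell means for all cells.
    cell_means = {cell: mean(values) for cell, values in cells.items()}
    # Steps 2-4: for each Factor B level, the points across Factor A levels.
    lines = defaultdict(list)
    for (level_a, level_b), m in sorted(cell_means.items()):
        lines[level_b].append((level_a, m))
    return dict(lines)

# Hypothetical observations: two levels of each factor, two replicates per cell.
obs = [("a1", "b1", 10), ("a1", "b1", 12), ("a2", "b1", 20), ("a2", "b1", 22),
       ("a1", "b2", 15), ("a1", "b2", 17), ("a2", "b2", 13), ("a2", "b2", 15)]
lines = interaction_lines(obs)
for level_b, points in lines.items():
    print(level_b, points)
```

In this made-up example the b1 line rises while the b2 line falls. Lines that are far from parallel, like these, are the usual visual hint that interaction may be present.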

Two-Way ANOVA

The requirements for performing two-way ANOVA are the same as for one-way ANOVA:

1. Each of the k populations is normally distributed.

2. The variances (σ²) of the populations are all equal.

3. The samples are independently drawn.

Warning!

If there is interaction between the factors, then we cannot draw conclusions about the main effects. If the test for interaction produces evidence that interaction is present, then do not perform the test for either Factor A or B.

Two-way ANOVA involves a series of three hypothesis tests:

1. Test for interaction between the factors.

2. Test for Factor A effect.

3. Test for Factor B effect.
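
The order of the three tests, together with the warning about interaction, amounts to a simple decision flow. The sketch below assumes the three p-values have already been obtained from a two-way ANOVA table; the function name and the p-values are hypothetical.

```python
def two_way_anova_decisions(p_interaction, p_factor_a, p_factor_b, alpha=0.05):
    """Apply the three tests in order, stopping if interaction is present."""
    if p_interaction < alpha:
        # Evidence of interaction: the main-effect tests are not performed.
        return ["Interaction is significant; do not test Factor A or Factor B."]
    conclusions = ["No evidence of interaction."]
    for name, p in (("Factor A", p_factor_a), ("Factor B", p_factor_b)):
        verdict = "significant" if p < alpha else "not significant"
        conclusions.append(f"{name} effect is {verdict}.")
    return conclusions

# Hypothetical p-values.
with_interaction = two_way_anova_decisions(0.01, 0.20, 0.03)
without_interaction = two_way_anova_decisions(0.40, 0.20, 0.03)
print(with_interaction)
print(without_interaction)
```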

+ Chapter 12 Overview

12.1 One-Way Analysis of Variance (ANOVA)

12.2 Multiple Comparisons

12.3 Randomized Block Design

12.4 Two-Way ANOVA
