Copyright © 2005 Brooks/Cole, a division of Thomson Learning, Inc.
Chapter 15
The Analysis of Variance
A study was done on the survival time of patients with advanced cancer of the stomach, bronchus, colon, ovary or breast when treated with ascorbate.¹ In this study, the authors wanted to determine whether the survival times differ based on the affected organ.
¹ Cameron, E. and Pauling, L. (1978). Supplemental ascorbate in the supportive treatment of cancer: re-evaluation of prolongation of survival time in terminal human cancer. Proceedings of the National Academy of Sciences, USA, 75, 4538-4542.
A Problem
A comparative dotplot of the survival times is shown below.
A Problem
[Comparative dotplot of survival time (0 to 3000 days) by cancer type: breast, bronchus, colon, ovary and stomach]
H0: µstomach = µbronchus = µcolon = µovary = µbreast
Ha: At least two of the µ’s are different
A Problem
The hypotheses used to answer the question of interest are
The question is similar to ones encountered in Chapter 11, where we looked at tests for the difference between the means of two populations. In this case we are interested in comparing more than two means.
A single-factor analysis of variance (ANOVA) problem involves a comparison of k population or treatment means, µ1, µ2, … , µk.
The objective is to test the hypotheses:
H0: µ1 = µ2 = µ3 = … = µk
Ha: At least two of the µ’s are different
Single-factor Analysis of Variance (ANOVA)
The analysis is based on k independently selected samples, one from each population or for each treatment.
In the case of populations, a random sample from each population is selected independently of that from any other population.
When comparing treatments, the experimental units (subjects or objects) that receive any particular treatment are chosen at random from those available for the experiment.
Single-factor Analysis of Variance (ANOVA)
A comparison of treatments based on independently selected experimental units is often referred to as a completely randomized design.
Single-factor Analysis of Variance (ANOVA)
[Comparative dotplots of yield (40 to 70) by fertilizer type for Types 1, 2 and 3, with group means indicated by lines]
Notice that in the above comparative dotplot, the differences among the treatment means are large relative to the variability within the samples.
Single-factor Analysis of Variance (ANOVA)
[Comparative dotplots of price (65 to 85) by subject for Statistics, Psychology, Economics and Business, with group means indicated by lines]
Notice that in the above comparative dotplot, the differences among the treatment means are hard to judge relative to the sample variability. ANOVA techniques will allow us to determine whether those differences are significant.
Single-factor Analysis of Variance (ANOVA)
ANOVA Notation
k = number of populations or treatments being compared
Population or treatment            1      2      …     k
Population or treatment mean       µ1     µ2     …     µk
Sample mean                        x̄1     x̄2     …     x̄k
Population or treatment variance   σ1²    σ2²    …     σk²
Sample variance                    s1²    s2²    …     sk²
Sample size                        n1     n2     …     nk
N = n1 + n2 + … + nk (Total number of observations in the data set)
ANOVA Notation
T = grand total = sum of all N observations = n1x̄1 + n2x̄2 + … + nkx̄k

x̄ = grand mean = T / N
Assumptions for ANOVA
1. For each of the k populations or treatments, the response distribution is normal.
2. σ1 = σ2 = … = σk (The k normal distributions have identical standard deviations.)
3. The observations in the sample from any particular one of the k populations or treatments are independent of one another.
4. When comparing population means, k random samples are selected independently of one another. When comparing treatment means, treatments are assigned at random to subjects or objects.
Definitions
A measure of disparity among the sample means is the treatment sum of squares, denoted by SSTr and given by

    SSTr = n1(x̄1 − x̄)² + n2(x̄2 − x̄)² + … + nk(x̄k − x̄)²

A measure of variation within the k samples, called the error sum of squares and denoted by SSE, is given by

    SSE = (n1 − 1)s1² + (n2 − 1)s2² + … + (nk − 1)sk²
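The two definitions above translate directly into code. The following is an illustrative sketch (not from the text) that computes SSTr and SSE from per-sample sizes, means and standard deviations; the toy inputs at the bottom are made up for the check.

```python
# Illustrative sketch of the two definitions above (not from the text).
def sstr(ns, means):
    """Treatment sum of squares: sum of n_i * (xbar_i - grand mean)^2."""
    grand = sum(n * m for n, m in zip(ns, means)) / sum(ns)
    return sum(n * (m - grand) ** 2 for n, m in zip(ns, means))

def sse(ns, sds):
    """Error sum of squares: sum of (n_i - 1) * s_i^2."""
    return sum((n - 1) * s ** 2 for n, s in zip(ns, sds))

# Toy check: identical sample means give SSTr = 0, while SSE pools
# the within-sample variation regardless of the means.
print(sstr([5, 5, 5], [10.0, 10.0, 10.0]))   # 0.0
print(sse([5, 5, 5], [2.0, 2.0, 2.0]))       # 3 * (5 - 1) * 2^2 = 48.0
```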
Definitions
A mean square is a sum of squares divided by its df. In particular,
The error df comes from adding the df’s associated with each of the sample variances:
(n1 - 1) + (n2 - 1) + …+ (nk - 1)
= n1 + n2 … + nk - 1 - 1 - … - 1 = N - k
mean square for treatments = MSTr = SSTr / (k − 1)

mean square for error = MSE = SSE / (N − k)
Example

Three filling machines are used by a bottler to fill 12 oz cans of soda. In an attempt to determine whether the three machines are filling the cans to the same (mean) level, independent samples of cans filled by each machine were selected and the amounts of soda in the cans measured. The samples are given below.
Machine 1: 12.033 11.985 12.009 12.009 12.033 12.025 12.054 12.050
Machine 2: 12.031 11.985 11.998 11.992 11.985 12.027 11.987
Machine 3: 12.034 12.021 12.038 12.058 12.001 12.020 12.029 12.011 12.021
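As a quick check of the summary statistics used on the next slides, the sample sizes, means and standard deviations can be reproduced with Python's standard library (a sketch, not part of the original example):

```python
# Sketch: summary statistics for the three filling machines.
from statistics import mean, stdev

machine1 = [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050]
machine2 = [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987]
machine3 = [12.034, 12.021, 12.038, 12.058, 12.001, 12.020, 12.029, 12.011, 12.021]

samples = [machine1, machine2, machine3]
for i, s in enumerate(samples, start=1):
    print(f"Machine {i}: n = {len(s)}, mean = {mean(s):.4f}, s = {stdev(s):.5f}")

# Grand mean: all N = 24 observations pooled together.
grand_mean = mean([x for s in samples for x in s])
print(f"Grand mean = {grand_mean:.6f}")
```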
Example
n1 = 8, x̄1 = 12.0248, s1 = 0.02301
n2 = 7, x̄2 = 12.0007, s2 = 0.01989
n3 = 9, x̄3 = 12.0259, s3 = 0.01650
x̄ = 12.018167

SSTr = n1(x̄1 − x̄)² + n2(x̄2 − x̄)² + n3(x̄3 − x̄)²
     = 8(0.0065833)² + 7(−0.0174524)² + 9(0.0077222)²
     = 0.00034672 + 0.00213210 + 0.00053669
     = 0.00301552
Example
SSE = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3²
    = 7(0.0230078)² + 6(0.0198890)² + 8(0.0164958)²
    = 0.0037055 + 0.0023734 + 0.0021769
    = 0.00825582
Example

mean square for treatments = MSTr = SSTr / (k − 1) = 0.00301552 / (3 − 1) = 0.0015078

mean square for error = MSE = SSE / (N − k) = 0.00825582 / (24 − 3) = 0.00039313
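These values can be reproduced from the summary statistics. The sketch below uses the slide's rounded means, so SSTr and MSTr agree with the slide only to about four significant figures:

```python
# Sketch: MSTr and MSE from the summary statistics in this example.
ns    = [8, 7, 9]
means = [12.0248, 12.0007, 12.0259]       # rounded slide values
sds   = [0.0230078, 0.0198890, 0.0164958]

N, k = sum(ns), len(ns)
grand = sum(n * m for n, m in zip(ns, means)) / N

sstr = sum(n * (m - grand) ** 2 for n, m in zip(ns, means))
sse  = sum((n - 1) * s ** 2 for n, s in zip(ns, sds))

mstr = sstr / (k - 1)    # treatment df = k - 1 = 2
mse  = sse / (N - k)     # error df = N - k = 21
print(mstr, mse)
```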
Comments
Both MSTr and MSE are quantities that are calculated from sample data.
As such, both MSTr and MSE are statistics and have sampling distributions.
More specifically, when H0 is true, µMSTr = µMSE.
However, when H0 is false, µMSTr > µMSE, and the greater the differences among the µ's, the larger µMSTr will be relative to µMSE.
The Single-Factor ANOVA F Test
Null hypothesis: H0: µ1 = µ2 = µ3 = … = µk
Alternate hypothesis: At least two of the µ’s are different
Test statistic: F = MSTr / MSE
The Single-Factor ANOVA F Test
When H0 is true and the ANOVA assumptions are reasonable, F has an F distribution with df1 = k - 1 and df2 = N - k.
Values of F more contradictory to H0 than what was calculated are values even farther out in the upper tail, so the P-value is the area captured in the upper tail of the corresponding F curve.
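The whole test statistic can be sketched as a small function over raw samples (an illustration, not the book's code); applied to the filling-machine data it reproduces the F value computed on the following slides:

```python
# Sketch: a generic single-factor ANOVA F statistic from raw samples,
# following the SSTr/SSE/MSTr/MSE formulas in this chapter.
from statistics import mean, variance

def anova_f(samples):
    """Return (F, df1, df2) for a single-factor ANOVA."""
    k = len(samples)
    ns = [len(s) for s in samples]
    N = sum(ns)
    grand = sum(x for s in samples for x in s) / N
    sstr = sum(n * (mean(s) - grand) ** 2 for n, s in zip(ns, samples))
    sse = sum((n - 1) * variance(s) for n, s in zip(ns, samples))
    return (sstr / (k - 1)) / (sse / (N - k)), k - 1, N - k

# The filling-machine data from the earlier example:
f, df1, df2 = anova_f([
    [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    [12.034, 12.021, 12.038, 12.058, 12.001, 12.020, 12.029, 12.011, 12.021],
])
print(round(f, 3), df1, df2)   # F close to 3.84 with df1 = 2, df2 = 21
```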
Example
Consider the earlier example involving the three filling machines.
Machine 1: 12.033 11.985 12.009 12.009 12.033 12.025 12.054 12.050
Machine 2: 12.031 11.985 11.998 11.992 11.985 12.027 11.987
Machine 3: 12.034 12.021 12.038 12.058 12.001 12.020 12.029 12.011 12.021
Example
SSTr = 0.00301552    SSE = 0.00825582
MSTr = 0.0015078     MSE = 0.00039313

n1 = 8, x̄1 = 12.0248, s1 = 0.02301
n2 = 7, x̄2 = 12.0007, s2 = 0.01989
n3 = 9, x̄3 = 12.0259, s3 = 0.01650
x̄ = 12.018167
Example
1. Let µ1, µ2 and µ3 denote the true mean amount of soda in the cans filled by machines 1, 2 and 3, respectively.
2. H0: µ1 = µ2 = µ3
3. Ha: At least two among µ1, µ2 and µ3 are different
4. Significance level: α = 0.01
5. Test statistic: F = MSTr / MSE
Example
6. Looking at the comparative dotplot, it seems reasonable to assume that the distributions have the same σ's. We shall look at the normality assumption on the next slide.*
[Comparative dotplot of fill (11.99 to 12.06 oz) by machine for Machines 1, 2 and 3]
*When the sample sizes are large, we can make judgments about both the equality of the standard deviations and the normality of the underlying populations with a comparative boxplot.
Example
6. Looking at normal probability plots for the samples, it certainly appears reasonable to assume that the samples from Machines 1 and 3 come from normal distributions. Unfortunately, the normal plot for the sample from Machine 2 does not look like one from a normal population. So as to have a computational example, we shall continue and finish the test, treating the result with a "grain of salt."
[Normal probability plots with Anderson-Darling normality tests:
Machine 1: N = 8, average = 12.0248, StDev = 0.0230078, A² = 0.235, P-value = 0.692
Machine 2: N = 7, average = 12.0007, StDev = 0.0198890, A² = 0.729, P-value = 0.031
Machine 3: N = 9, average = 12.0259, StDev = 0.0164958, A² = 0.237, P-value = 0.702]
Example
7. Computation:

N = n1 + n2 + n3 = 8 + 7 + 9 = 24, k = 3

F = MSTr / MSE = 0.0015078 / 0.00039313 = 3.835
df1 = treatment df = k − 1 = 3 − 1 = 2
df2 = error df = N − k = 24 − 3 = 21
Example
8. P-value:

From the F table with numerator df1 = 2 and denominator df2 = 21:

    Upper-tail area    0.100   0.050   0.025   0.010   0.001
    F critical value   2.57    3.47    4.42    5.78    9.77

Since F = 3.835 falls between 3.47 and 4.42,

    0.025 < P-value < 0.050

(Minitab reports this value to be 0.038.)

Recall: F = MSTr / MSE = 0.0015078 / 0.00039313 = 3.835, with df1 = k − 1 = 2 and df2 = N − k = 21.
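As an aside (not on the slides): when the numerator df is 2, the upper-tail area of the F distribution has a closed form, so the bracketed P-value can be checked exactly, reproducing Minitab's 0.038:

```python
# Closed-form upper-tail F area for numerator df = 2 (an aside, not
# from the slides): P(F > f) = (df2 / (df2 + 2f)) ** (df2 / 2).
f_stat, df2 = 3.835, 21
p_value = (df2 / (df2 + 2 * f_stat)) ** (df2 / 2)
print(round(p_value, 3))   # 0.038, matching the Minitab value
```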
Example
9. Conclusion:
Since P-value > α = 0.01, we fail to reject H0. The data do not provide evidence of a difference among the mean fills of the three machines at the 1% level of significance.
Total Sum of Squares
Total sum of squares, denoted by SSTo, is given by

    SSTo = Σ(x − x̄)²   (the sum taken over all N observations)

with associated df = N − 1.

The relationship between the three sums of squares is

    SSTo = SSTr + SSE

which is often called the fundamental identity for single-factor ANOVA. Informally, this relation is expressed as

    Total variation = Explained variation + Unexplained variation
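The fundamental identity can be verified numerically on the filling-machine data (a sketch, not from the text):

```python
# Sketch: checking SSTo = SSTr + SSE on the filling-machine data.
from statistics import mean, variance

samples = [
    [12.033, 11.985, 12.009, 12.009, 12.033, 12.025, 12.054, 12.050],
    [12.031, 11.985, 11.998, 11.992, 11.985, 12.027, 11.987],
    [12.034, 12.021, 12.038, 12.058, 12.001, 12.020, 12.029, 12.011, 12.021],
]
all_obs = [x for s in samples for x in s]
grand = mean(all_obs)

ssto = sum((x - grand) ** 2 for x in all_obs)
sstr = sum(len(s) * (mean(s) - grand) ** 2 for s in samples)
sse  = sum((len(s) - 1) * variance(s) for s in samples)

print(ssto, sstr + sse)   # the two agree (up to floating point)
```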
Single-factor ANOVA Table
The following is a fairly standard way of presenting the important calculations from a single-factor ANOVA. The output from most statistical packages will contain an additional column giving the P-value.
Single-factor ANOVA Table
The ANOVA table supplied by Minitab
One-way ANOVA: Fills versus Machine
Analysis of Variance for Fills
Source     DF        SS        MS     F      P
Machine     2  0.003016  0.001508  3.84  0.038
Error      21  0.008256  0.000393
Total      23  0.011271
Another Example
A food company produces 4 different brands of salsa. In order to determine if the four brands had the same sodium levels, 10 bottles of each Brand were randomly (and independently) obtained and the sodium content in milligrams (mg) per tablespoon serving was measured.
The sample data are given on the next slide.
Use the data to perform an appropriate hypothesis test at the 0.05 level of significance.
Another Example
Brand A: 43.85 44.30 45.69 47.13 43.35 45.59 45.92 44.89 43.69 44.59
Brand B: 42.50 45.63 44.98 43.74 44.95 42.99 44.95 45.93 45.54 44.70
Brand C: 45.84 48.74 49.25 47.30 46.41 46.35 46.31 46.93 48.30 45.13
Brand D: 43.81 44.77 43.52 44.63 44.84 46.30 46.68 47.55 44.24 45.46
Another Example
1. Let µ1, µ2 , µ3 and µ4 denote the true mean sodium content per tablespoon in each of the brands respectively.
2. H0: µ1 = µ2 = µ3 = µ4
3. Ha: At least two among µ1, µ2, µ3 and µ4 are different
4. Significance level: α = 0.05
5. Test statistic: F = MSTr / MSE
6. Looking at the following comparative boxplot, it seems reasonable to assume that the distributions have equal σ's and that the samples come from normal distributions.
Another Example
[Comparative boxplots of sodium content (42 to 49 mg) for Brands A, B, C and D, with means indicated by solid circles]
Example
7. Computation:

Brand       n     x̄        s
Brand A    10   44.900   1.180
Brand B    10   44.591   1.148
Brand C    10   47.056   1.331
Brand D    10   45.180   1.304

x̄ = 45.432

Treatment df = k − 1 = 4 − 1 = 3

SSTr = n1(x̄1 − x̄)² + n2(x̄2 − x̄)² + n3(x̄3 − x̄)² + n4(x̄4 − x̄)²
     = 10(44.900 − 45.432)² + 10(44.591 − 45.432)² + 10(47.056 − 45.432)² + 10(45.180 − 45.432)²
     = 36.912
Example
7. Computation (continued):

SSE = (n1 − 1)s1² + (n2 − 1)s2² + (n3 − 1)s3² + (n4 − 1)s4²
    = 9(1.180)² + 9(1.148)² + 9(1.331)² + 9(1.304)²
    = 55.627

Error df = N − k = 40 − 4 = 36

MSTr = SSTr / df = 36.912 / 3 = 12.304
MSE = SSE / df = 55.627 / 36 = 1.5452
F = MSTr / MSE = 12.304 / 1.5452 = 7.963
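The same computation can be reproduced from the raw salsa data (a sketch, not from the text):

```python
# Sketch: SSTr, SSE and F for the salsa example, from the raw data.
from statistics import mean, variance

brands = {
    "A": [43.85, 44.30, 45.69, 47.13, 43.35, 45.59, 45.92, 44.89, 43.69, 44.59],
    "B": [42.50, 45.63, 44.98, 43.74, 44.95, 42.99, 44.95, 45.93, 45.54, 44.70],
    "C": [45.84, 48.74, 49.25, 47.30, 46.41, 46.35, 46.31, 46.93, 48.30, 45.13],
    "D": [43.81, 44.77, 43.52, 44.63, 44.84, 46.30, 46.68, 47.55, 44.24, 45.46],
}
k = len(brands)
N = sum(len(s) for s in brands.values())
grand = mean(x for s in brands.values() for x in s)

sstr = sum(len(s) * (mean(s) - grand) ** 2 for s in brands.values())
sse  = sum((len(s) - 1) * variance(s) for s in brands.values())
f = (sstr / (k - 1)) / (sse / (N - k))
print(round(sstr, 3), round(sse, 3), round(f, 2))
```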
Example
8. P-value:

F = 7.96 with df numerator = 3 and df denominator = 36.

Using df denominator = 30 (the nearest tabled value), we find P-value < 0.001.
Example
9. Conclusion:
Since P-value < 0.001 < α = 0.05, we reject H0. We can conclude that the mean sodium content is different for at least two of the brands.
We need to learn how to interpret the results and will spend some time on developing techniques to describe the differences among the µ’s.
Multiple Comparisons
A multiple comparison procedure is a method for identifying differences among the µ’s once the hypothesis of overall equality (H0) has been rejected.
The technique we will present is based on computing confidence intervals for difference of means for the pairs.
Specifically, if k populations or treatments are studied, we would create k(k-1)/2 differences. (i.e., with 3 treatments one would generate confidence intervals for µ1 - µ2, µ1 - µ3 and µ2 - µ3.) Notice that it is only necessary to look at a confidence interval for µ1 - µ2 to see if µ1 and µ2 differ.
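The k(k−1)/2 pairs can be enumerated directly; a trivial sketch using the salsa example's k = 4 brands:

```python
# Sketch: enumerating the k(k-1)/2 pairwise differences for k = 4 brands.
from itertools import combinations

brands = ["A", "B", "C", "D"]
pairs = list(combinations(brands, 2))
print(len(pairs), pairs)   # 6 pairs: A-B, A-C, A-D, B-C, B-D, C-D
```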
The Tukey-Kramer Multiple Comparison Procedure
When there are k populations or treatments being compared, k(k-1)/2 confidence intervals must be computed. If we denote the relevant Studentized range critical value by q, the intervals are as follows:
For µi − µj:

    (x̄i − x̄j) ± q · sqrt( (MSE/2)(1/ni + 1/nj) )

Two means are judged to differ significantly if the corresponding interval does not include zero.
The Tukey-Kramer Multiple Comparison Procedure
When all of the sample sizes are the same, we denote n by n = n1 = n2 = n3 = … = nk, and the confidence intervals (for µi - µj) simplify to
    (x̄i − x̄j) ± q · sqrt( MSE / n )
Example (continued)
Continuing with the example dealing with the sodium content of the four brands of salsa, we shall compute the 95% Tukey-Kramer confidence intervals for µA − µB, µA − µC, µA − µD, µB − µC, µB − µD and µC − µD.
MSE = 55.627 / 36 = 1.5452,  nA = nB = nC = nD = n = 10

Interpolating from the table, q ≈ 3.81 (60% of the way from 3.85 to 3.79, since the error df of 36 lies between the tabled df values).

    q · sqrt(MSE / n) = 3.81 · sqrt(1.5452 / 10) = 1.498
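The six intervals on the next slide can be generated in one pass (a sketch using the slide's interpolated q = 3.81):

```python
# Sketch: the six Tukey-Kramer intervals for the salsa example,
# using the slide's interpolated critical value q = 3.81.
from itertools import combinations
from math import sqrt

means = {"A": 44.900, "B": 44.591, "C": 47.056, "D": 45.180}
mse, n, q = 1.5452, 10, 3.81
half_width = q * sqrt(mse / n)   # about 1.498, the +/- term on every interval

for i, j in combinations(means, 2):
    d = means[i] - means[j]
    lo, hi = d - half_width, d + half_width
    verdict = "differ" if lo > 0 or hi < 0 else "no significant difference"
    print(f"{i} - {j}: ({lo:.3f}, {hi:.3f})  {verdict}")
```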
Example (continued)
Difference    95% Confidence Limits    95% Confidence Interval
A − B          0.309 ± 1.498           (−1.189, 1.807)
A − C         −2.156 ± 1.498           (−3.654, −0.658)
A − D         −0.280 ± 1.498           (−1.778, 1.218)
B − C         −2.465 ± 1.498           (−3.963, −0.967)
B − D         −0.589 ± 1.498           (−2.087, 0.909)
C − D          1.876 ± 1.498           (0.378, 3.374)
Notice that the confidence intervals for µA − µC, µB − µC and µC − µD do not contain 0, so we can infer that the mean sodium content for Brand C is different from those for Brands A, B and D.
Example (continued)
We also illustrate the differences with the following listing of the sample means in increasing order with lines underneath those blocks of means that are indistinguishable.
Brand B Brand A Brand D Brand C
44.591 44.900 45.180 47.056
Notice that the confidence intervals for µA − µC, µB − µC and µC − µD do not contain 0, so we can infer that the mean sodium content for Brand C differs from that of every other brand.
Minitab Output for Example
One-way ANOVA: Sodium versus Brand
Analysis of Variance for Sodium
Source    DF      SS      MS      F      P
Brand      3   36.91   12.30   7.96  0.000
Error     36   55.63    1.55
Total     39   92.54

Individual 95% CIs for mean, based on pooled StDev:
Level      N     Mean    StDev
Brand A   10   44.900    1.180
Brand B   10   44.591    1.148
Brand C   10   47.056    1.331
Brand D   10   45.180    1.304
Pooled StDev = 1.243
[Graphical display of the individual 95% confidence intervals (44.4 to 48.0) omitted]
Minitab Output for Example
Tukey's pairwise comparisons
Family error rate = 0.0500Individual error rate = 0.0107
Critical value = 3.81
Intervals for (column level mean) - (row level mean)
             Brand A    Brand B    Brand C
Brand B      -1.189
              1.807
Brand C      -3.654     -3.963
             -0.658     -0.967
Brand D      -1.778     -2.087      0.378
              1.218      0.909      3.374
Simultaneous Confidence Level
The Tukey-Kramer intervals are created in a manner that controls the simultaneous confidence level.
For example, at the 95% level: if the procedure is used repeatedly on many different data sets, then in the long run, only about 5% of the time would at least one of the intervals fail to include the value it is estimating.

We then speak of a family error rate of 5%: the maximum probability that one or more of the confidence intervals for the differences in means fails to contain the true difference.