TRANSCRIPT
Analysis of Variance
Chapter 14
Introduction

Analysis of variance helps compare two or more populations of quantitative data. Specifically, we are interested in the relationships among the population means (are they equal or not). The procedure works by analyzing the sample variance.
14.1 One-Way Analysis of Variance

The analysis of variance is a procedure that tests to determine whether differences exist among two or more population means.
To do this, the technique analyzes the sample variances.
One-Way Analysis of Variance: Example 1

- An apple juice manufacturer is planning to develop a new product: a liquid concentrate.
- The marketing manager has to decide how to market the new product.
- Three strategies are considered:
  - Emphasize convenience of using the product.
  - Emphasize the quality of the product.
  - Emphasize the product's low price.
One-Way Analysis of Variance: Example 1 - continued

- An experiment was conducted as follows:
  - In three cities an advertisement campaign was launched.
  - In each city only one of the three characteristics (convenience, quality, and price) was emphasized.
  - The weekly sales were recorded for twenty weeks following the beginning of the campaigns.
One-Way Analysis of Variance: The Data

Weekly sales (see file Xm1.xls):

Convnce  Quality  Price
529      804      672
658      630      531
793      774      443
514      717      596
663      679      602
719      604      502
711      620      659
606      697      689
461      706      675
529      615      512
498      492      691
663      719      733
604      787      698
495      699      776
485      572      561
557      523      572
353      584      469
557      634      581
542      580      679
614      624      532
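The deck analyzes these data with Excel later on; the same one-way ANOVA can be sketched in Python (assuming SciPy is available). The three lists are transcribed from the table above:

```python
# One-way ANOVA on the weekly-sales data (sketch using scipy.stats.f_oneway)
from scipy.stats import f_oneway

convenience = [529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
               498, 663, 604, 495, 485, 557, 353, 557, 542, 614]
quality     = [804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
               492, 719, 787, 699, 572, 523, 584, 634, 580, 624]
price       = [672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
               691, 733, 698, 776, 561, 572, 469, 581, 679, 532]

f_stat, p_value = f_oneway(convenience, quality, price)
print(f"F = {f_stat:.2f}, p-value = {p_value:.4f}")  # F = 3.23, p-value = 0.0468
```

The F statistic and p-value match the Excel "Anova: Single Factor" output shown later in the chapter.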
One-Way Analysis of Variance: Solution

- The data are quantitative.
- Our problem objective is to compare sales in three cities.
- We hypothesize on the relationships among the three mean weekly sales:
Defining the Hypotheses

- Solution

H0: μ1 = μ2 = μ3
H1: At least two means differ

To build the statistic needed to test the hypotheses, use the following notation:
Notation

Independent samples are drawn from k populations (treatments):

Sample 1: x11, x12, ..., x(n1,1); sample size n1; sample mean x̄1
Sample 2: x21, x22, ..., x(n2,2); sample size n2; sample mean x̄2
...
Sample k: xk1, xk2, ..., x(nk,k); sample size nk; sample mean x̄k

Here x11 is the first observation of the first sample, x22 is the second observation of the second sample, and so on.
X is the "response variable"; its values are called "responses".
Terminology

In the context of this problem...
- Response variable - weekly sales
- Responses - actual sale values
- Experimental unit - weeks in the three cities when we record sales figures
- Factor - the criterion by which we classify the populations (the treatments). In this problem the factor is the marketing strategy.
- Factor levels - the population (treatment) names. In this problem the factor levels are the marketing strategies.
The Rationale of the Test Statistic

Two types of variability are employed when testing for the equality of the population means.

Graphical demonstration: employing two types of variability.
[Figure: two panels of dot plots for Treatments 1-3. In both panels the sample means are x̄1 = 10, x̄2 = 15, x̄3 = 20; the panels differ only in the spread of the observations around those means.]

A small variability within the samples makes it easier to draw a conclusion about the population means.
The sample means are the same as before, but the larger within-sample variability makes it harder to draw a conclusion about the population means.
The Rationale Behind the Test Statistic - I

If the null hypothesis is true, we would expect all the sample means to be close to one another (and, as a result, to the grand mean).
If the alternative hypothesis is true, at least some of the sample means would reside away from one another.
Thus, we measure variability among sample means.
Variability Among Sample Means

The variability among the sample means is measured as the sum of squared distances between each mean and the grand mean.
This sum is called the Sum of Squares for Treatments - SST. In our example, treatments are represented by the different advertising strategies.
Sum of Squares for Treatments (SST)

SST = Σ_{j=1}^{k} n_j (x̄_j - x̄)²

where there are k treatments, n_j is the size of sample j, x̄_j is the mean of sample j, and x̄ is the grand mean.

Note: When the sample means are close to one another, their distance from the grand mean is small, leading to a small SST. Thus, a large SST indicates large variation among sample means, which supports H1.
Sum of Squares for Treatments (SST)

Solution - continued. Calculate SST.

The sample means are x̄1 = 577.55, x̄2 = 653.00, x̄3 = 608.65.

The grand mean is calculated by

x̄ = (n1·x̄1 + n2·x̄2 + ... + nk·x̄k) / (n1 + n2 + ... + nk) = 613.07

SST = Σ_{j=1}^{3} n_j (x̄_j - x̄)²
    = 20(577.55 - 613.07)² + 20(653.00 - 613.07)² + 20(608.65 - 613.07)²
    = 57,512.23
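The grand-mean and SST arithmetic above can be sketched directly from the sample sizes and sample means:

```python
# SST computed from the sample means (values taken from the example)
n = [20, 20, 20]                  # sample sizes
xbar = [577.55, 653.00, 608.65]   # sample means

# Grand mean: weighted average of the sample means
grand = sum(nj * xj for nj, xj in zip(n, xbar)) / sum(n)

# SST = sum over treatments of n_j * (xbar_j - grand)^2
sst = sum(nj * (xj - grand) ** 2 for nj, xj in zip(n, xbar))
print(f"grand mean = {grand:.2f}, SST = {sst:.2f}")  # 613.07, 57512.23
```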
Sum of Squares for Treatments (SST)

Is SST = 57,512.23 large enough to favor H1? See next.
The Rationale Behind the Test Statistic - II

Large variability within the samples weakens the "ability" of the sample means to represent their corresponding population means. Therefore, even though sample means may markedly differ from one another, a large SST must be judged relative to the "within samples variability".
Within-Samples Variability

The variability within samples is measured by adding all the squared distances between observations and their sample means.
This sum is called the Sum of Squares for Error - SSE. In our example this is the sum of all squared differences between sales in city j and the sample mean of city j (over all three cities).
Sum of Squares for Errors (SSE)

Solution - continued. Calculate SSE.

SSE = Σ_{j=1}^{k} Σ_{i=1}^{n_j} (x_ij - x̄_j)²

The sample variances are s1² = 10,775.00, s2² = 7,238.11, s3² = 8,670.24, so

SSE = (n1 - 1)s1² + (n2 - 1)s2² + (n3 - 1)s3²
    = (20 - 1)(10,775.00) + (20 - 1)(7,238.11) + (20 - 1)(8,670.24)
    ≈ 506,983.50
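SSE can be checked numerically in two equivalent ways: from the raw deviations, and from the sample variances as above (data transcribed from the table earlier in the chapter):

```python
# SSE as the pooled within-sample sum of squares (sketch)
import numpy as np

groups = [
    np.array([529, 658, 793, 514, 663, 719, 711, 606, 461, 529,
              498, 663, 604, 495, 485, 557, 353, 557, 542, 614]),  # convenience
    np.array([804, 630, 774, 717, 679, 604, 620, 697, 706, 615,
              492, 719, 787, 699, 572, 523, 584, 634, 580, 624]),  # quality
    np.array([672, 531, 443, 596, 602, 502, 659, 689, 675, 512,
              691, 733, 698, 776, 561, 572, 469, 581, 679, 532]),  # price
]

# Direct definition: squared deviations from each group mean
sse = sum(((g - g.mean()) ** 2).sum() for g in groups)

# Equivalent form: (n_j - 1) * s_j^2 with the sample variances
sse_alt = sum((len(g) - 1) * g.var(ddof=1) for g in groups)

print(f"SSE = {sse:.2f}")  # ~506,983.50
```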
Sum of Squares for Errors (SSE)

- Note: If SST is small relative to SSE, we can't infer that treatments are the cause of different average performance.
- Is SST = 57,512.23 large enough relative to SSE = 506,983.50 to argue that the means ARE different?
The Mean Sum of Squares

To perform the test we need to calculate the mean sums of squares as follows:

Calculation of MST - Mean Square for Treatments:
MST = SST / (k - 1) = 57,512.23 / (3 - 1) = 28,756.12

Calculation of MSE - Mean Square for Error:
MSE = SSE / (n - k) = 506,983.50 / (60 - 3) = 8,894.45
Calculation of the Test Statistic

F = MST / MSE = 28,756.12 / 8,894.45 = 3.23

with the following degrees of freedom: v1 = k - 1 and v2 = n - k.

We assume:
1. The populations tested are normally distributed.
2. The variances of all the populations tested are equal.

(For honors class: testing equal variances; testing normality.)
The F Test Rejection Region

And finally the hypothesis test:

H0: μ1 = μ2 = ... = μk
Ha: At least two means differ

Test statistic: F = MST / MSE
Rejection region: F > F_{α, k-1, n-k}
The F Test

H0: μ1 = μ2 = μ3
H1: At least two means differ

Test statistic: F = MST / MSE = 28,756.12 / 8,894.45 = 3.23
Rejection region: F > F_{α, k-1, n-k} = F_{.05, 3-1, 60-3} ≈ 3.15

Since 3.23 > 3.15, there is sufficient evidence to reject H0 in favor of H1 and argue that at least one of the mean sales is different from the others.
The F Test p-Value

[Figure: F density with 2 and 57 degrees of freedom; the area to the right of 3.23 is shaded.]

p-value = P(F > 3.23) = .0467

Use Excel to find the p-value: =FDIST(3.23,2,57) = .0467
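Excel's FDIST is the upper-tail probability of the F distribution; a sketch of the same calculation with SciPy:

```python
# p-value and critical value for the F test (sketch with scipy.stats.f)
from scipy.stats import f

f_stat, df1, df2 = 3.23, 2, 57
p_value = f.sf(f_stat, df1, df2)   # survival function = P(F > 3.23)
critical = f.ppf(0.95, df1, df2)   # F_{.05,2,57} for the rejection region
print(f"p-value = {p_value:.4f}, F crit = {critical:.2f}")
```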
Excel Single Factor Printout (see file Xm1.xls)

Anova: Single Factor

SUMMARY
Groups   Count  Sum    Average  Variance
Convnce  20     11551  577.55   10774.997
Quality  20     13060  653      7238.1053
Price    20     12173  608.65   8670.2395

ANOVA
Source of Variation  SS         df  MS         F          P-value   F crit
Between Groups       57512.233  2   28756.117  3.2330414  0.046773  3.1588456
Within Groups        506983.5   57  8894.4474
Total                564495.73  59

SS(Total) = SST + SSE
14.2 Multiple Comparisons

If the single-factor ANOVA leads us to conclude at least two means differ, we often want to know which ones. Two means are considered different if the difference between the corresponding sample means is larger than a critical number. The larger sample mean is believed to be associated with a larger population mean.
Fisher's Least Significant Difference

Fisher's Least Significant Difference (LSD) method is one procedure designed to determine which mean difference is significant. The hypotheses are:

H0: |μi - μj| = 0
Ha: |μi - μj| ≠ 0

The statistic: x̄i - x̄j

This method builds on the equal-variance t-test of the difference between two means. The test statistic is improved by using MSE rather than s_p².
Fisher's Least Significant Difference

We can conclude that μi and μj differ (at the α% significance level) if |x̄i - x̄j| > LSD, where

LSD = t_{α/2} √( MSE (1/n_i + 1/n_j) ),   d.f. = n - k
Experimentwise Type I Error Rate (α_E) (the effective Type I error)

Fisher's method may result in an increased probability of committing a Type I error. The probability of committing at least one Type I error in a series of C hypothesis tests, each at the α level of significance, increases too. This probability is called the experimentwise Type I error rate (α_E). It is calculated by

α_E = 1 - (1 - α)^C

where C is the number of pairwise comparisons (C = k(k-1)/2, and k is the number of treatments). The Bonferroni adjustment determines the required Type I error probability per pairwise comparison (α) to secure a pre-determined overall α_E.
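For the three treatments in Example 1, the experimentwise error-rate formula works out as follows:

```python
# Experimentwise Type I error rate for k = 3 treatments, alpha = .05 per test
alpha, k = 0.05, 3
C = k * (k - 1) // 2                 # number of pairwise comparisons
alpha_E = 1 - (1 - alpha) ** C       # P(at least one Type I error in C tests)
print(f"C = {C}, alpha_E = {alpha_E:.4f}")  # C = 3, alpha_E = 0.1426
```

So testing all three pairs at α = .05 each pushes the effective error rate above 14%, which is what motivates the Bonferroni adjustment below.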
The Bonferroni Adjustment

The procedure:
- Compute the number of pairwise comparisons C [C = k(k-1)/2], where k is the number of populations/treatments.
- Set α = α_E/C, where the value of α_E is predetermined.
- We can conclude that μi and μj differ (at the α_E/C significance level) if

|x̄i - x̄j| > t_{α_E/(2C)} √( MSE (1/n_i + 1/n_j) ),   d.f. = n - k
The Fisher and Bonferroni Methods

Example 1 - continued
- Rank the effectiveness of the marketing strategies (based on mean weekly sales).
- Use Fisher's method and the Bonferroni adjustment method.

Solution (Fisher's method)
- The sample mean sales were 577.55, 653.0, 608.65.
- Then,

|x̄1 - x̄2| = |577.55 - 653.0| = 75.45
|x̄1 - x̄3| = |577.55 - 608.65| = 31.10
|x̄2 - x̄3| = |653.0 - 608.65| = 44.35

LSD = t_{.05/2} √( MSE (1/n1 + 1/n2) ) = 2.002 √( 8894 (1/20 + 1/20) ) ≈ 59.72

The significant difference is between μ1 and μ2.
The Fisher and Bonferroni Methods

Solution (the Bonferroni adjustment)
- We calculate C = k(k-1)/2 = 3(2)/2 = 3.
- We set α = .05/3 = .0167, thus t_{.0167/2, 60-3} = 2.467 (Excel).

t √( MSE (1/n_i + 1/n_j) ) = 2.467 √( 8894 (1/20 + 1/20) ) ≈ 73.54

|x̄1 - x̄2| = 75.45, |x̄1 - x̄3| = 31.10, |x̄2 - x̄3| = 44.35

Again, the significant difference is between μ1 and μ2.
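Both critical values above can be reproduced from the t distribution (a sketch using SciPy's quantile function in place of Excel's TINV):

```python
# Fisher LSD and Bonferroni-adjusted critical values (sketch)
from math import sqrt
from scipy.stats import t

mse, n_i, n_j, df = 8894.447, 20, 20, 57   # MSE and d.f. = n - k from the ANOVA
se = sqrt(mse * (1 / n_i + 1 / n_j))       # common standard-error term

lsd = t.ppf(1 - 0.05 / 2, df) * se         # Fisher's LSD at alpha = .05
C = 3                                      # pairwise comparisons
bonf = t.ppf(1 - (0.05 / C) / 2, df) * se  # Bonferroni-adjusted critical value

print(f"LSD = {lsd:.2f}, Bonferroni = {bonf:.2f}")
# only |577.55 - 653.0| = 75.45 exceeds both values
```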
The Tukey Multiple Comparisons

The test procedure:
- Find a critical number ω as follows:

ω = q_α(k, ν) √( MSE / n_g )

where
k = the number of samples
ν = degrees of freedom = n - k
n_g = number of observations per sample (recall, all the sample sizes are the same)
α = significance level
q_α(k, ν) = a critical value obtained from the studentized range table

If the sample sizes are not the same but do not differ much from one another, we can use the harmonic mean of the sample sizes for n_g:

n_g = k / (1/n1 + 1/n2 + ... + 1/nk)
The Tukey Multiple Comparisons

- Select a pair of means. Calculate the difference between the larger and the smaller mean, x̄max - x̄min.
- If x̄max - x̄min > ω, there is sufficient evidence to conclude that μmax > μmin.
- Repeat this procedure for each pair of samples. Rank the means if possible.
The Tukey Multiple Comparisons

Example 1 - continued. We had three populations (three marketing strategies): k = 3.
Sample sizes were equal: n1 = n2 = n3 = 20; ν = n - k = 60 - 3 = 57; MSE = 8894.

ω = q_.05(k, ν) √( MSE / n_g ) = q_.05(3, 57) √( 8894 / 20 ) ≈ 3.40 √444.7 = 71.70

(Take q_.05(3, 60) from the table.)

Population       Mean
Sales - City 1   577.55
Sales - City 2   653
Sales - City 3   608.65

City 1 vs. City 2: 653 - 577.55 = 75.45
City 1 vs. City 3: 608.65 - 577.55 = 31.10
City 2 vs. City 3: 653 - 608.65 = 44.35

Only the City 1 vs. City 2 difference (75.45) exceeds ω = 71.70.
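The studentized-range critical value q can also be obtained in software instead of a printed table; a sketch using SciPy's `studentized_range` distribution (available in SciPy 1.7+):

```python
# Tukey's omega via the studentized-range distribution (sketch)
from math import sqrt
from scipy.stats import studentized_range

k, nu, mse, n_g = 3, 57, 8894.447, 20
q = studentized_range.ppf(0.95, k, nu)   # q_.05(3, 57)
omega = q * sqrt(mse / n_g)
print(f"q = {q:.3f}, omega = {omega:.2f}")
# only the City 1 vs. City 2 difference (75.45) exceeds omega
```

The exact q for ν = 57 differs slightly from the tabulated q_.05(3, 60) = 3.40, so ω lands near the 71.70 used in the slides.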
Excel - Tukey and Fisher LSD Method (Xm15-1.xls)

Multiple Comparisons (Tukey): Omega = 71.7007033950796

Fisher's LSD (α = .05):
Variable  Variable  Difference  LSD
1         2         -75.45      59.72067
1         3         -31.1       59.72067
2         3         44.35       59.72067

Bonferroni adjustment (α = .05/3 = .0167):
Variable  Variable  Difference  LSD
1         2         -75.45      73.54176
1         3         -31.1       73.54176
2         3         44.35       73.54176
14.3 Randomized Blocks Design

The purpose of designing a randomized block experiment is to reduce the within-treatments variation, thus increasing the relative amount of among-treatment variation.
This helps in detecting differences among the treatment means more easily.
Randomized Blocks

[Figure: experimental units arranged in blocks (dark blues, bluish purples, greyish pinks), with each block receiving Treatments 1-4.]
Partitioning the Total Variability

The sum of squares total is partitioned into three sources of variation:
- Treatments
- Blocks
- Within samples (error)

SS(Total) = SST + SSB + SSE

where SST is the sum of squares for treatments, SSB is the sum of squares for blocks, and SSE is the sum of squares for error.

Recall: for the independent samples design we have SS(Total) = SST + SSE.
The Mean Sum of Squares

To perform hypothesis tests for treatments and blocks we need:
- Mean square for treatments: MST = SST / (k - 1)
- Mean square for blocks: MSB = SSB / (b - 1)
- Mean square for error: MSE = SSE / ((k - 1)(b - 1))
The test statistics for the randomized block design ANOVA:

Test statistic for treatments: F = MST / MSE
Test statistic for blocks: F = MSB / MSE
The F Test Rejection Region

Testing the mean responses for treatments: F > F_{α, k-1, (k-1)(b-1)}
Testing the mean responses for blocks: F > F_{α, b-1, (k-1)(b-1)}
Randomized Blocks ANOVA - Example

Example 2
- Are there differences in the effectiveness of cholesterol reduction drugs?
- To answer this question the following experiment was organized:
  - 25 groups of men with high cholesterol were matched by age and weight. Each group consisted of 4 men.
  - Each person in a group received a different drug.
  - The cholesterol level reduction in two months was recorded.
- Can we infer from the data in Xm2.xls that there are differences in mean cholesterol reduction among the four drugs?
Randomized Blocks ANOVA - Example

Solution
- Each drug can be considered a treatment.
- Each group of 4 records can be blocked, because they are matched by age and weight.
- This procedure eliminates the variability in cholesterol reduction related to different combinations of age and weight.
- This helps detect differences in the mean cholesterol reduction attributed to the different drugs.
Randomized Blocks ANOVA - Example

ANOVA
Source of Variation         SS        df  MS        F         P-value   F crit
Rows (blocks, b-1)          3848.657  24  160.3607  10.10537  9.7E-15   1.669456
Columns (treatments, k-1)   195.9547  3   65.31823  4.116127  0.009418  2.731809
Error                       1142.558  72  15.86886
Total                       5187.169  99

The F statistics are MST/MSE for treatments ("Columns") and MSB/MSE for blocks ("Rows").

Conclusion: At the 5% significance level there is sufficient evidence to infer that the mean cholesterol reduction gained by at least two drugs is different.
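The F statistics and critical values in the table above can be reproduced from its SS and df entries (a sketch; the raw Xm2.xls data are not shown here):

```python
# Randomized-blocks F statistics rebuilt from the printed SS and df values
from scipy.stats import f

ss_blocks, df_blocks = 3848.657, 24    # rows (blocks): b - 1 = 24
ss_treat,  df_treat  = 195.9547, 3     # columns (treatments): k - 1 = 3
ss_error,  df_error  = 1142.558, 72    # (k - 1)(b - 1) = 72

msb = ss_blocks / df_blocks            # mean square for blocks
mst = ss_treat / df_treat              # mean square for treatments
mse = ss_error / df_error              # mean square for error

f_treat = mst / mse                    # compare to F_{.05,3,72} = 2.73
f_blocks = msb / mse                   # compare to F_{.05,24,72} = 1.67
print(f"F(treatments) = {f_treat:.2f}, crit = {f.ppf(0.95, df_treat, df_error):.2f}")
print(f"F(blocks) = {f_blocks:.2f}, crit = {f.ppf(0.95, df_blocks, df_error):.2f}")
```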
14.2 Multiple Comparisons

The rejection region:

|x̄i - x̄j| > t_{α/2, n-k} √( MSE (1/n_i + 1/n_j) )
Testing the Differences - Example Continued

Calculating LSD: MSE = 8894.44; n1 = n2 = n3 = 20; t_{.05/2, 60-3} = TINV(.05, 57) = 2.002
LSD = (2.002)[8894.44(1/20 + 1/20)]^.5 = 59.72

|x̄1 - x̄2| = 75.45 > 59.72
|x̄1 - x̄3| = 31.10 < 59.72
|x̄2 - x̄3| = 44.35 < 59.72