©2003 Thomson/South-Western 1
Chapter 11 – Analysis of Variance
Slides prepared by Jeff Heyl, Lincoln University
©2003 South-Western/Thomson Learning™
Introduction to Business Statistics, 6e
Kvanli, Pavur, Keeling
©2003 Thomson/South-Western 2
Analysis of Variance
Analysis of Variance (ANOVA) determines if a factor has a significant effect on the variable being measured
Examine variation within samples and variation between samples
©2003 Thomson/South-Western 3
Measuring Variation
SS(factor) measures between-sample variation [SS(between)]
SS(error) measures within-sample variation [SS(within)]
SS(total) measures the total variation in the sample [SS(total) = SS(factor) + SS(error)]
©2003 Thomson/South-Western 4
Determining Sum of Squares
\[ \text{SS(factor)} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} - \frac{T^2}{n} \]
\[ \text{SS(total)} = \sum x^2 - \frac{(\sum x)^2}{n} = \sum x^2 - \frac{T^2}{n} \]
\[ \text{SS(error)} = \sum x^2 - \left[ \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} \right] \]
or
\[ \text{SS(error)} = \text{SS(total)} - \text{SS(factor)} \]
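The three sums of squares can be checked numerically. The following is a minimal Python sketch of the totals-based formulas on this slide; the function name `sums_of_squares` and the two small samples are made up for illustration and do not come from the textbook.

```python
def sums_of_squares(groups):
    """Return (SS(factor), SS(error), SS(total)) for a list of samples."""
    all_values = [x for g in groups for x in g]
    n = len(all_values)                      # total number of observations
    grand_total = sum(all_values)            # T
    correction = grand_total ** 2 / n        # T^2 / n

    # SS(total) = sum of x^2 - T^2 / n
    ss_total = sum(x * x for x in all_values) - correction
    # SS(factor) = T_1^2/n_1 + T_2^2/n_2 + ... - T^2 / n
    ss_factor = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    # SS(error) = SS(total) - SS(factor)
    ss_error = ss_total - ss_factor
    return ss_factor, ss_error, ss_total


# Hypothetical data: two samples of three observations each.
sample_1 = [1, 2, 3]
sample_2 = [3, 4, 5]
ss_factor, ss_error, ss_total = sums_of_squares([sample_1, sample_2])
print(ss_factor, ss_error, ss_total)
```

For these values T = 18 and n = 6, so SS(total) = 64 - 54 = 10, SS(factor) = 12 + 48 - 54 = 6, and SS(error) = 4, which is also the sum of squared deviations of each observation from its own sample mean.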
©2003 Thomson/South-Western 5
ANOVA Test for $H_0: \mu_1 = \mu_2$ Versus $H_a: \mu_1 \neq \mu_2$
\[ \text{MS(factor)} = \frac{\text{SS(factor)}}{\text{df for factor}} = \frac{\text{SS(factor)}}{1} \]
\[ \text{MS(error)} = \frac{\text{SS(error)}}{\text{df for error}} = \frac{\text{SS(error)}}{n_1 + n_2 - 2} \]
\[ F = \frac{\text{estimated population variance based on the variation among the sample means}}{\text{estimated population variance based on the variation within each of the samples}} = \frac{\text{MS(factor)}}{\text{MS(error)}} \]
©2003 Thomson/South-Western 6
Defining the Rejection Region
Figure 11.1 (horizontal axis: F)
©2003 Thomson/South-Western 7
p-Values for Battery Lifetime Example
Figure 11.2
t curve with 8 df: t = 4.20, p-value = 2 (shaded area) = .0030
F curve with 1 and 8 df: F = 17.64, p-value = shaded area = .0030
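Figure 11.2 reports the same p-value for t = 4.20 and F = 17.64, and 4.20 squared is 17.64. With two samples, the ANOVA F statistic equals the square of the pooled-variance two-sample t statistic. The sketch below illustrates this in Python, assuming the t statistic in the figure is the pooled-variance t; the function name `pooled_t_and_f` and the data are made up and are not the battery lifetimes.

```python
import math


def pooled_t_and_f(sample_1, sample_2):
    """Return the pooled-variance t statistic and the two-sample ANOVA F."""
    n1, n2 = len(sample_1), len(sample_2)
    mean1 = sum(sample_1) / n1
    mean2 = sum(sample_2) / n2

    # Pooled variance: within-sample variation over n1 + n2 - 2 df.
    # This is the same quantity as MS(error).
    within = (sum((x - mean1) ** 2 for x in sample_1)
              + sum((x - mean2) ** 2 for x in sample_2))
    pooled_var = within / (n1 + n2 - 2)

    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))

    # MS(factor): between-sample variation with 1 df.
    grand_mean = (sum(sample_1) + sum(sample_2)) / (n1 + n2)
    ms_factor = n1 * (mean1 - grand_mean) ** 2 + n2 * (mean2 - grand_mean) ** 2

    return t, ms_factor / pooled_var


# Hypothetical data, three observations per sample.
t_value, f_value = pooled_t_and_f([1, 2, 3], [3, 4, 5])
print(t_value ** 2, f_value)   # the two values agree up to rounding
```

Because the F test uses only the upper tail while the t test is two-tailed, the two p-values coincide, as the figure shows.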
©2003 Thomson/South-Western 8
Dot Array Diagram
Figure 11.3 and Figure 11.4: dot array diagrams of samples A and B (five observations each); horizontal axis: number of cartons, 25 to 50
©2003 Thomson/South-Western 9
Assumptions
The replicates are obtained independently and randomly from each of the populations
The replicates from each population follow an (approximately) normal distribution
The normal populations all have a common variance
©2003 Thomson/South-Western 10
Deriving the Sum of Squares
\[ \text{SS(factor)} = \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \dots + \frac{T_k^2}{n_k} - \frac{T^2}{n} \]
\[ \text{SS(total)} = \sum x^2 - \frac{T^2}{n} \]
\[ \text{SS(error)} = \sum x^2 - \left[ \frac{T_1^2}{n_1} + \frac{T_2^2}{n_2} + \dots + \frac{T_k^2}{n_k} \right] = \text{SS(total)} - \text{SS(factor)} \]
©2003 Thomson/South-Western 11
The ANOVA Table
Source   df      SS           MS           F
Factor   k - 1   SS(factor)   MS(factor)   MS(factor)/MS(error)
Error    n - k   SS(error)    MS(error)
Total    n - 1   SS(total)
\[ \text{MS(factor)} = \frac{\text{SS(factor)}}{k - 1} \]
\[ \text{MS(error)} = \frac{\text{SS(error)}}{n - k} \]
\[ F = \frac{\text{MS(factor)}}{\text{MS(error)}} \]
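The table can be filled in mechanically from the k samples. The following is a minimal Python sketch, assuming the error degrees of freedom are n - k as in the MS(error) formula; the function name `one_factor_anova`, the dictionary layout, and the data are illustrative choices, not part of the textbook.

```python
def one_factor_anova(groups):
    """Build the one-factor ANOVA table for a list of k samples."""
    k = len(groups)                              # number of factor levels
    n = sum(len(g) for g in groups)              # total observations
    grand_total = sum(sum(g) for g in groups)    # T
    correction = grand_total ** 2 / n            # T^2 / n

    ss_total = sum(x * x for g in groups for x in g) - correction
    ss_factor = sum(sum(g) ** 2 / len(g) for g in groups) - correction
    ss_error = ss_total - ss_factor

    ms_factor = ss_factor / (k - 1)
    ms_error = ss_error / (n - k)

    return {
        "df": {"factor": k - 1, "error": n - k, "total": n - 1},
        "SS": {"factor": ss_factor, "error": ss_error, "total": ss_total},
        "MS": {"factor": ms_factor, "error": ms_error},
        "F": ms_factor / ms_error,
    }


# Hypothetical data: k = 3 levels, three replicates each.
table = one_factor_anova([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
for key, value in table.items():
    print(key, value)
```

Here SS(factor) = 42 on 2 df and SS(error) = 6 on 6 df, giving MS(factor) = 21, MS(error) = 1, and F = 21. The computed F would then be compared with the tabled F value for k - 1 and n - k degrees of freedom.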
©2003 Thomson/South-Western 12
Test for Equal Variances
$H_0: \sigma_1^2 = \sigma_2^2 = \dots = \sigma_k^2$
$H_a$: at least 2 variances are unequal
\[ H = \frac{\text{maximum } s^2}{\text{minimum } s^2} \]
reject $H_0$ if $H > H_{\text{Table A.14}}$
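The H statistic is the ratio of the largest sample variance to the smallest. A minimal Python sketch follows; the function name `h_statistic` and the data are made up, and the critical value still has to be read from Table A.14, so none is supplied here.

```python
from statistics import variance


def h_statistic(groups):
    """Return H = (maximum sample variance) / (minimum sample variance)."""
    sample_variances = [variance(g) for g in groups]   # s^2 with n - 1 divisor
    return max(sample_variances) / min(sample_variances)


# Hypothetical data: three samples of equal size.
h_value = h_statistic([[1, 2, 3], [2, 4, 6], [1, 3, 5]])
print(h_value)
```

The sample variances here are 1, 4, and 4, so H = 4. The null hypothesis of equal variances is rejected only if this value exceeds the tabled H.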
©2003 Thomson/South-Western 13
Confidence Intervals in One-Factor ANOVA
\[ \bar{X}_i - t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i}} \quad \text{to} \quad \bar{X}_i + t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i}} \]
where
k = number of populations (levels)
n_i = number of replicates in the ith sample
n = total number of observations
\[ s_p = \sqrt{\text{MS(error)}} \]
©2003 Thomson/South-Western 14
Confidence Intervals in One-Factor ANOVA
The $(1 - \alpha) \cdot 100\%$ confidence interval for $\mu_i - \mu_j$ is
\[ (\bar{X}_i - \bar{X}_j) - t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} \quad \text{to} \quad (\bar{X}_i - \bar{X}_j) + t_{\alpha/2,\,n-k}\, s_p \sqrt{\frac{1}{n_i} + \frac{1}{n_j}} \]
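Both one-factor intervals, for a single mean and for a difference of two means, share the same pieces: a t table value with n - k degrees of freedom and the pooled standard deviation, taken here as the square root of MS(error). The sketch below is a minimal Python version under that assumption. The function names are hypothetical, the t value is passed in by the caller because it comes from a t table, and the numbers in the example are made up.

```python
import math


def mean_interval(mean_i, n_i, ms_error, t_crit):
    """Confidence interval for a single population mean."""
    s_p = math.sqrt(ms_error)                       # pooled standard deviation
    half_width = t_crit * s_p * math.sqrt(1 / n_i)
    return mean_i - half_width, mean_i + half_width


def difference_interval(mean_i, mean_j, n_i, n_j, ms_error, t_crit):
    """Confidence interval for the difference of two population means."""
    s_p = math.sqrt(ms_error)
    half_width = t_crit * s_p * math.sqrt(1 / n_i + 1 / n_j)
    diff = mean_i - mean_j
    return diff - half_width, diff + half_width


# Hypothetical inputs: sample means 7 and 2, three replicates each,
# MS(error) = 1 with n - k = 6 df, and t = 2.447 read from a t table
# for a 95% interval with 6 df.
low, high = difference_interval(7.0, 2.0, 3, 3, 1.0, 2.447)
print(round(low, 3), round(high, 3))
```

An interval for a difference that excludes zero, as this one does, indicates that the two population means differ at the chosen confidence level.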
©2003 Thomson/South-Western 15
Multiple Comparisons Procedure
The multiple comparisons procedure compares all possible pairs of means in such a way that the probability of making one or more Type I errors is $\alpha$
Tukey’s Test
\[ Q = \frac{\text{maximum}(\bar{X}_i) - \text{minimum}(\bar{X}_i)}{\sqrt{\text{MS(error)}/n_r}} \]
©2003 Thomson/South-Western 16
Multiple Comparisons Procedure
1. Find $Q_{\alpha,k,v}$ using Table A.16
2. Determine $D = Q_{\alpha,k,v} \cdot \sqrt{\dfrac{\text{MS(error)}}{n_r}}$
3. Place the sample means in order, from smallest to largest
4. If two means differ by more than D, the conclusion is that the corresponding population means are unequal
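The four steps can be sketched in a few lines of Python. This is an illustrative outline only: the function name `tukey_pairs` is hypothetical, the Q value must be looked up in Table A.16 and is passed in by the caller, and the value 4.34 together with the group means below are placeholder inputs rather than textbook results.

```python
import math
from itertools import combinations


def tukey_pairs(means, ms_error, n_r, q_crit):
    """Return D and the pairs of groups whose sample means differ by more than D.

    means   : dict mapping group label -> sample mean
    n_r     : number of replicates in each sample
    q_crit  : Q value for (alpha, k, v), supplied from a table (step 1)
    """
    # Step 2: D = Q * sqrt(MS(error) / n_r)
    d = q_crit * math.sqrt(ms_error / n_r)
    # Step 3: order the sample means from smallest to largest
    ordered = sorted(means.items(), key=lambda item: item[1])
    # Step 4: flag every pair whose means differ by more than D
    different = [
        (label_a, label_b)
        for (label_a, mean_a), (label_b, mean_b) in combinations(ordered, 2)
        if mean_b - mean_a > d
    ]
    return d, different


# Placeholder inputs: three groups, three replicates each, MS(error) = 1.
d, different = tukey_pairs({"A": 2.0, "B": 3.0, "C": 7.0}, 1.0, 3, 4.34)
print(round(d, 3), different)
```

With these inputs D is about 2.506, so A and B are not declared different (their means differ by 1), while C differs from both A and B.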
©2003 Thomson/South-Western 17
Plot of Group Means
Figure 11.5: Nylon Breaking Strength; horizontal axis: Group (1 to 5); vertical axis: Group Means (21 to 26)
©2003 Thomson/South-Western 18
Figure 11.6
Nylon Breaking Strength
©2003 Thomson/South-Western 19
Figure 11.7
Nylon Breaking Strength
©2003 Thomson/South-Western 20
Figure 11.7
Nylon Breaking Strength
©2003 Thomson/South-Western 21
One-Factor ANOVA Procedure
Requirements
1. The replicates are obtained independently and randomly from each of the populations
2. The observations from each population follow (approximately) a normal distribution
3. The populations all have a common variance
©2003 Thomson/South-Western 22
One-Factor ANOVA Procedure
Hypotheses
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_a$: not all µ’s are equal
Source   df      SS           MS           F
Factor   k - 1   SS(factor)   MS(factor)   MS(factor)/MS(error)
Error    n - k   SS(error)    MS(error)
Total    n - 1   SS(total)
reject $H_0$ if $F^* > F_{\alpha,\,k-1,\,n-k}$
©2003 Thomson/South-Western 23
Completely Randomized Design
Replicates are obtained in a completely random manner from each population
Null hypothesis is
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
©2003 Thomson/South-Western 24
Randomized Block Design
The samples are not independent; the data are grouped (blocked) by another variable
The difference between the randomized block design and the completely randomized design is that here we use a blocking strategy rather than independent samples to obtain a more precise test for examining differences in the factor level means
©2003 Thomson/South-Western 25
Randomized Block Design
\[ \text{SS(factor)} = \frac{1}{b}\left[ T_1^2 + T_2^2 + \dots + T_k^2 \right] - \frac{T^2}{bk} \]
where
k = number of factor levels in the design
b = number of blocks in the design
n = number of observations = bk
$T_1, T_2, \dots, T_k$ represent the totals for the k factor levels
$S_1, S_2, \dots, S_b$ are the totals for the b blocks
$T = T_1 + T_2 + \dots + T_k = S_1 + S_2 + \dots + S_b$ = total of all observations
©2003 Thomson/South-Western 26
Randomized Block Design
\[ \text{SS(blocks)} = \frac{1}{k}\left[ S_1^2 + S_2^2 + \dots + S_b^2 \right] - \frac{T^2}{bk} \]
\[ \text{SS(total)} = \sum x^2 - \frac{T^2}{bk} \]
\[ \text{SS(error)} = \text{SS(total)} - \text{SS(factor)} - \text{SS(blocks)} \]
df for factor = k - 1
df for blocks = b - 1
df for error = (k - 1)(b - 1)
df for total = bk - 1
©2003 Thomson/South-Western 27
Randomized Block Design
Source   df               SS           MS           F
Factor   k - 1            SS(factor)   MS(factor)   F_1
Blocks   b - 1            SS(blocks)   MS(blocks)   F_2
Error    (k - 1)(b - 1)   SS(error)    MS(error)
Total    bk - 1           SS(total)
\[ F_1 = \frac{\text{MS(factor)}}{\text{MS(error)}} \qquad F_2 = \frac{\text{MS(blocks)}}{\text{MS(error)}} \]
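The randomized block table can be computed directly from a b-by-k grid of observations. The following is a minimal Python sketch of the block-design formulas; the function name `randomized_block_anova`, the row-per-block data layout, and the numbers are assumptions made for illustration.

```python
def randomized_block_anova(data):
    """data[i][j] is the observation for block i and factor level j."""
    b = len(data)            # number of blocks
    k = len(data[0])         # number of factor levels

    level_totals = [sum(row[j] for row in data) for j in range(k)]   # T_1..T_k
    block_totals = [sum(row) for row in data]                        # S_1..S_b
    grand_total = sum(block_totals)                                  # T
    correction = grand_total ** 2 / (b * k)                          # T^2 / bk

    ss_factor = sum(t * t for t in level_totals) / b - correction
    ss_blocks = sum(s * s for s in block_totals) / k - correction
    ss_total = sum(x * x for row in data for x in row) - correction
    ss_error = ss_total - ss_factor - ss_blocks

    ms_factor = ss_factor / (k - 1)
    ms_blocks = ss_blocks / (b - 1)
    ms_error = ss_error / ((k - 1) * (b - 1))

    return {
        "SS": (ss_factor, ss_blocks, ss_error, ss_total),
        "F1": ms_factor / ms_error,   # test for the factor
        "F2": ms_blocks / ms_error,   # test for the blocks
    }


# Hypothetical data: b = 3 blocks (rows), k = 2 factor levels (columns).
result = randomized_block_anova([[1, 4], [2, 6], [4, 7]])
print(result)
```

For this grid SS(factor) = 50/3, SS(blocks) = 9, SS(total) = 26, and SS(error) = 1/3 on 2 df, so F_1 is 100 and F_2 is 27 up to floating-point rounding.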
©2003 Thomson/South-Western 28
Factor Hypothesis Test
$H_0: \mu_1 = \mu_2 = \dots = \mu_k$
$H_a$: not all µ’s are equal
\[ F_1 = \frac{\text{MS(factor)}}{\text{MS(error)}} \]
reject $H_0$ if $F_1 > F_{\alpha,\,k-1,\,(k-1)(b-1)}$
©2003 Thomson/South-Western 29
Block Hypothesis Test
$H_0: \mu_1 = \mu_2 = \dots = \mu_b$
$H_a$: not all µ’s are equal
\[ F_2 = \frac{\text{MS(blocks)}}{\text{MS(error)}} \]
reject $H_0$ if $F_2 > F_{\alpha,\,b-1,\,(k-1)(b-1)}$
©2003 Thomson/South-Western 30
Hardness Test Data Analysis
Figure 11.10
©2003 Thomson/South-Western 31
Hardness Test Data Analysis
Figure 11.11
©2003 Thomson/South-Western 32
Confidence Interval: Difference Between Two Means
Randomized Block $(1 - \alpha) \cdot 100\%$ confidence interval
\[ (\bar{X}_i - \bar{X}_j) - t_{\alpha/2,\,df} \cdot s \cdot \sqrt{\frac{1}{b} + \frac{1}{b}} \quad \text{to} \quad (\bar{X}_i - \bar{X}_j) + t_{\alpha/2,\,df} \cdot s \cdot \sqrt{\frac{1}{b} + \frac{1}{b}} \]
©2003 Thomson/South-Western 33
Dental Claim Data Analysis
Figure 11.12
©2003 Thomson/South-Western 34
Multiple Comparisons Procedure: Randomized Block
\[ D = Q_{\alpha,\,k,\,(k-1)(b-1)} \sqrt{\frac{\text{MS(error)}}{b}} \]
Two population means are concluded to be unequal if $|\bar{X}_i - \bar{X}_j| > D$
©2003 Thomson/South-Western 35
Machine Choice Example
Figure 11.13
©2003 Thomson/South-Western 36
Two-Way Factorial Design
Figure 11.14
         Single   Married
Male     Low      High
Female   High     Low
©2003 Thomson/South-Western 37
Two-Way Factorial Design
Figure 11.15
                Factor B
Factor A   1     2     ...   b
1          x     x     ...   x
2          x     x     ...   x
...
a          x     x     ...   x
©2003 Thomson/South-Western 38
Two-Way Factorial Design
Figure 11.16
                                     Factor B
Factor A   1                       2                       ...   b                       Totals
1          x, x (total = R_11)     x, x (total = R_12)     ...   x, x (total = R_1b)     T_1
2          x, x (total = R_21)     x, x (total = R_22)     ...   x, x (total = R_2b)     T_2
...
a          x, x (total = R_a1)     x, x (total = R_a2)     ...   x, x (total = R_ab)     T_a
Totals     S_1                     S_2                     ...   S_b
©2003 Thomson/South-Western 39
Two-Way Factorial Design
\[ \text{Factor A: } \text{SSA} = \frac{1}{br}\left[ T_1^2 + T_2^2 + \dots + T_a^2 \right] - \frac{T^2}{abr} \]
\[ \text{Factor B: } \text{SSB} = \frac{1}{ar}\left[ S_1^2 + S_2^2 + \dots + S_b^2 \right] - \frac{T^2}{abr} \]
\[ \text{Interaction: } \text{SSAB} = \frac{1}{r}\left[ \sum R^2 \right] - \text{SSA} - \text{SSB} - \frac{T^2}{abr} \]
\[ \text{Total: } \text{SS(total)} = \sum x^2 - \frac{T^2}{abr} \]
©2003 Thomson/South-Western 40
Two-Way Factorial Design
\[ \text{SS(error)} = \text{SS(total)} - \text{SSA} - \text{SSB} - \text{SSAB} \]
\[ \text{MSA} = \frac{\text{SSA}}{a - 1} \qquad \text{MSB} = \frac{\text{SSB}}{b - 1} \]
\[ \text{MSAB} = \frac{\text{SSAB}}{(a - 1)(b - 1)} \]
\[ \text{MS(error)} = \frac{\text{SS(error)}}{ab(r - 1)} \]
©2003 Thomson/South-Western 41
Two-Way Factorial Design
Source        df               SS          MS          F
Factor A      a - 1            SSA         MSA         F_1
Factor B      b - 1            SSB         MSB         F_2
Interaction   (a - 1)(b - 1)   SSAB        MSAB        F_3
Error         ab(r - 1)        SS(error)   MS(error)
Total         abr - 1          SS(total)
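The two-way table follows from the cell totals R, the Factor A totals T_i, and the Factor B totals S_j. The following is a minimal Python sketch of that computation, assuming r replicates in every cell; the function name `two_way_anova`, the nested-list layout, and the data are made up for illustration.

```python
def two_way_anova(cells):
    """cells[i][j] is the list of r replicates for level i of A and level j of B."""
    a = len(cells)               # levels of Factor A
    b = len(cells[0])            # levels of Factor B
    r = len(cells[0][0])         # replicates per cell

    cell_totals = [[sum(cell) for cell in row] for row in cells]           # R_ij
    a_totals = [sum(row) for row in cell_totals]                           # T_i
    b_totals = [sum(row[j] for row in cell_totals) for j in range(b)]      # S_j
    grand_total = sum(a_totals)                                            # T
    correction = grand_total ** 2 / (a * b * r)                            # T^2 / abr

    ssa = sum(t * t for t in a_totals) / (b * r) - correction
    ssb = sum(s * s for s in b_totals) / (a * r) - correction
    ssab = (sum(x * x for row in cell_totals for x in row) / r
            - ssa - ssb - correction)
    ss_total = sum(x * x for row in cells for cell in row for x in cell) - correction
    ss_error = ss_total - ssa - ssb - ssab

    ms_error = ss_error / (a * b * (r - 1))
    return {
        "SSA": ssa, "SSB": ssb, "SSAB": ssab,
        "SS(error)": ss_error, "SS(total)": ss_total,
        "F1": (ssa / (a - 1)) / ms_error,
        "F2": (ssb / (b - 1)) / ms_error,
        "F3": (ssab / ((a - 1) * (b - 1))) / ms_error,
    }


# Hypothetical 2 x 2 design with r = 2 replicates per cell, arranged so
# that the low and high cells cross (a strong interaction).
anova = two_way_anova([[[1, 3], [5, 7]],
                       [[6, 8], [2, 4]]])
print(anova)
```

In this example SSA = 2, SSB = 0, SSAB = 32, and SS(error) = 8 on 4 df, so F_1 = 1, F_2 = 0, and F_3 = 16: almost all of the variation is in the interaction, which is the crossing pattern that the low/high layout in a two-by-two table describes.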
©2003 Thomson/South-Western 42
Hypothesis Test - Factor A
$H_0$: Factor A is not significant ($\mu_M = \mu_F$)
$H_a$: Factor A is significant ($\mu_M \neq \mu_F$)
\[ F_1 = \frac{\text{MSA}}{\text{MS(error)}} \]
reject $H_{0,A}$ if $F_1 > F_{\alpha,\,v_1,\,v_2}$
©2003 Thomson/South-Western 43
Hypothesis Test - Factor B
$H_0$: Factor B is not significant ($\mu_1 = \mu_2 = \mu_3$)
$H_a$: Factor B is significant (not all µ’s are equal)
\[ F_2 = \frac{\text{MSB}}{\text{MS(error)}} \]
reject $H_{0,B}$ if $F_2 > F_{\alpha,\,v_1,\,v_2}$
©2003 Thomson/South-Western 44
Hypothesis Test - Interaction
$H_0$: Interaction is not significant
$H_a$: Interaction is significant
\[ F_3 = \frac{\text{MSAB}}{\text{MS(error)}} \]
reject $H_{0,AB}$ if $F_3 > F_{\alpha,\,v_1,\,v_2}$
©2003 Thomson/South-Western 45
Multiple Comparisons Procedure: Two-Way Factorial Design
\[ D = Q_{\alpha,k,v} \cdot \sqrt{\frac{\text{MS(error)}}{n_r}} \]
v = df for error
n_r = number of replicates in each sample
©2003 Thomson/South-Western 46
Interaction Effect
Figure 11.17, panel A: annual amount claimed on dental insurance (vertical axis, 100 to 300) by employee classification (Category 1 to Category 4), with separate lines for Male and Female
©2003 Thomson/South-Western 47
Interaction Effect
Figure 11.17, panel B: annual amount claimed on dental insurance (vertical axis, 100 to 300) by employee classification (Category 1 to Category 4), with separate lines for Male and Female
©2003 Thomson/South-Western 48
Gender Factor Analysis
Figure 11.18
©2003 Thomson/South-Western 49
Gender Factor Analysis
Figure 11.19