two factor designs
1
Two Factor Designs

Consider studying the impact of two factors on the yield (response):

                              BRAND
DEVICE        1             2             3             4
  1       17.9, 18.1    17.8, 17.8    18.1, 18.2    17.8, 17.9
  2       18.0, 18.2    18.0, 18.3    18.4, 18.1    18.1, 18.5
  3       18.0, 17.8    17.8, 18.0    18.1, 18.3    18.1, 17.9

Here we have R = 3 rows (levels of the row factor), C = 4 columns (levels of the column factor), and n = 2 replicates per cell
[nij for the (i,j)th cell if not all equal]

NOTE: The “1”, “2”, etc. mean Level 1, Level 2, etc., NOT metric values
2
MODEL:

\[ Y_{ijk} = \mu + \rho_i + \tau_j + I_{ij} + \varepsilon_{ijk} \]

i = 1, ..., R;  j = 1, ..., C;  k = 1, ..., n

In general, n observations per cell, R • C cells.
3
$\mu$ = the grand mean
$\rho_i$ = the difference between the i-th row mean and the grand mean
$\tau_j$ = the difference between the j-th column mean and the grand mean
$I_{ij}$ = the interaction associated with the i-th row and the j-th column
4
Yijk = Y••• + (Yi•• - Y•••) + (Y•j• - Y•••)
     + (Yij• - Yi•• - Y•j• + Y•••)
     + (Yijk - Yij•)

Where Y••• = Grand mean
      Yi•• = Mean of row i
      Y•j• = Mean of column j
      Yij• = Mean of cell (i,j)

[All the terms are somewhat “intuitive”, except for (Yij• - Yi•• - Y•j• + Y•••)]
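The identity above can be checked numerically on the device-by-brand data. The following is a minimal sketch in plain Python; the names `data`, `mean` and `decompose` are illustrative choices, not part of the original slides.

```python
# Yield data from the example: rows = device (3 levels),
# columns = brand (4 levels), n = 2 replicates per cell.
data = [
    [[17.9, 18.1], [17.8, 17.8], [18.1, 18.2], [17.8, 17.9]],
    [[18.0, 18.2], [18.0, 18.3], [18.4, 18.1], [18.1, 18.5]],
    [[18.0, 17.8], [17.8, 18.0], [18.1, 18.3], [18.1, 17.9]],
]


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


R, C, n = len(data), len(data[0]), len(data[0][0])

grand = mean(y for row in data for cell in row for y in cell)
row_mean = [mean(y for cell in data[i] for y in cell) for i in range(R)]
col_mean = [mean(y for i in range(R) for y in data[i][j]) for j in range(C)]
cell_mean = [[mean(data[i][j]) for j in range(C)] for i in range(R)]


def decompose(i, j, k):
    """Return the five pieces of Y_ijk in the order used in the identity."""
    return (
        grand,                                                 # Y...
        row_mean[i] - grand,                                   # row effect
        col_mean[j] - grand,                                   # column effect
        cell_mean[i][j] - row_mean[i] - col_mean[j] + grand,   # interaction
        data[i][j][k] - cell_mean[i][j],                       # within-cell error
    )


# The pieces for the first observation add back up to the observation itself.
parts = decompose(0, 0, 0)
print([round(p, 4) for p in parts], round(sum(parts), 4))
```

The same check holds for every (i, j, k), since the decomposition is an algebraic identity rather than an approximation.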
5
The term (Yij• - Yi•• - Y•j• + Y•••) is more intuitively written as:

(Yij• - Y•••) - (Yi•• - Y•••) - (Y•j• - Y•••)

where
(Yij• - Y•••) = how a cell mean differs from the grand mean
(Yi•• - Y•••) = adjustment for “row membership”
(Y•j• - Y•••) = adjustment for “column membership”

We can, without loss of generality, assume (for a moment) that there is no error (random part); why then might the above be non-zero?
6
ANSWER: “INTERACTION”

Two basic ways to look at interaction:

1)
        BL    BH
  AL     5     8
  AH    10     ?

- When B goes from BL → BH, yield goes up by 3 (5 → 8).
- When A goes from AL → AH, yield goes up by 5 (5 → 10).
- When both changes of level occur, does yield go up by the sum, 3 + 5 = 8?

If AHBH = 13, no interaction
If AHBH > 13, + interaction
If AHBH < 13, - interaction

Interaction = degree of difference from sum of separate effects
7
2)
        BL    BH
  AL     5     8
  AH    10    17

- Holding BL, what happens as A goes from AL → AH? +5
- Holding BH, what happens as A goes from AL → AH? +9

If the effect of one factor (i.e., the impact of changing its level) is DIFFERENT for different levels of another factor, then INTERACTION exists between the two factors.

NOTE:
- Holding AL, BL → BH has impact +3
- Holding AH, BL → BH has impact +7

(AB) = (BA) or (9-5) = (7-3).
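Both views of interaction reduce to the same arithmetic on the four cell values. A small sketch, with the function name `interaction_2x2` chosen here purely for illustration:

```python
def interaction_2x2(al_bl, al_bh, ah_bl, ah_bh):
    """Difference between the effect of A at BH and the effect of A at BL."""
    effect_a_at_bl = ah_bl - al_bl
    effect_a_at_bh = ah_bh - al_bh
    return effect_a_at_bh - effect_a_at_bl


def interaction_2x2_via_b(al_bl, al_bh, ah_bl, ah_bh):
    """Same quantity, computed as effect of B at AH minus effect of B at AL."""
    return (ah_bh - ah_bl) - (al_bh - al_bl)


print(interaction_2x2(5, 8, 10, 13))        # 0: yield rises by exactly 3 + 5
print(interaction_2x2(5, 8, 10, 17))        # 4: (9 - 5)
print(interaction_2x2_via_b(5, 8, 10, 17))  # 4: (7 - 3), so (AB) = (BA)
```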
8
Going back to the (model) equation on page 4, and bringing Y••• to the other side of the equation, we get

(Yijk - Y•••) = (Yi•• - Y•••) + (Y•j• - Y•••)
              + [(Yij• - Yi••) - (Y•j• - Y•••)]
              + (Yijk - Yij•)

where, inside the square brackets, (Yij• - Yi••) is the effect of column j at row i and (Y•j• - Y•••) is the effect of column j.

If we then square both sides and triple sum both sides over i, j, and k, we get (after noting that all cross-product terms cancel):
9
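TSS = SSBRows + SSBCols + SSIR,C + SSWError

and, in terms of degrees of freedom,

R•C•n - 1 = (R-1) + (C-1) + (R-1)(C-1) + R•C•(n-1); DF of Interaction = (RC-1) - (R-1) - (C-1) = (R-1)(C-1).

OR,

\[
\sum_i \sum_j \sum_k (Y_{ijk} - Y_{\bullet\bullet\bullet})^2
= n\,C \sum_i (Y_{i\bullet\bullet} - Y_{\bullet\bullet\bullet})^2
+ n\,R \sum_j (Y_{\bullet j\bullet} - Y_{\bullet\bullet\bullet})^2
+ n \sum_i \sum_j (Y_{ij\bullet} - Y_{i\bullet\bullet} - Y_{\bullet j\bullet} + Y_{\bullet\bullet\bullet})^2
+ \sum_i \sum_j \sum_k (Y_{ijk} - Y_{ij\bullet})^2
\]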
10
In our example:

                                BRAND
DEVICE          1             2             3             4         Row mean
  1        17.9, 18.1    17.8, 17.8    18.1, 18.2    17.8, 17.9
            (18.0)        (17.8)        (18.15)       (17.85)        17.95
  2        18.2, 18.0    18.0, 18.3    18.4, 18.1    18.1, 18.5
            (18.1)        (18.15)       (18.25)       (18.3)         18.20
  3        18.0, 17.8    17.8, 18.0    18.1, 18.3    18.1, 17.9
            (17.9)        (17.9)        (18.2)        (18.0)         18.00
Col. mean   18.00         17.95         18.20         18.05          18.05

[Cell means are shown in parentheses; 18.05 is the grand mean.]
11
\[ SSB_{rows} = 2 \cdot 4\,[(17.95-18.05)^2 + (18.20-18.05)^2 + (18.00-18.05)^2] = 8\,(.01 + .0225 + .0025) = .28 \]

\[ SSB_{col} = 2 \cdot 3\,[(18.00-18.05)^2 + (17.95-18.05)^2 + (18.20-18.05)^2 + (18.05-18.05)^2] = 6\,(.0025 + .01 + .0225 + 0) = .21 \]

\[ SSI_{R,C} = 2\,[(18-17.95-18+18.05)^2 + (17.8-17.95-17.95+18.05)^2 + \cdots + (18-18-18.05+18.05)^2] = 2\,[.055] = .11 \]

\[ SSW = (17.9-18.0)^2 + (18.1-18.0)^2 + (17.8-17.8)^2 + (17.8-17.8)^2 + \cdots + (18.1-18.0)^2 + (17.9-18.0)^2 = .30 \]

\[ TSS = .28 + .21 + .11 + .30 = .90 \]
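The four sums of squares for the device-by-brand example can be reproduced directly from the raw data. This is a sketch in plain Python; the variable names (`ss_rows`, `ss_cols`, `ss_int`, `ss_within`, `tss`) are chosen here for illustration.

```python
# Rows = device (R = 3), columns = brand (C = 4), n = 2 replicates per cell.
data = [
    [[17.9, 18.1], [17.8, 17.8], [18.1, 18.2], [17.8, 17.9]],
    [[18.2, 18.0], [18.0, 18.3], [18.4, 18.1], [18.1, 18.5]],
    [[18.0, 17.8], [17.8, 18.0], [18.1, 18.3], [18.1, 17.9]],
]


def mean(values):
    values = list(values)
    return sum(values) / len(values)


R, C, n = len(data), len(data[0]), len(data[0][0])
cells = [(i, j) for i in range(R) for j in range(C)]

grand = mean(y for i, j in cells for y in data[i][j])
row_mean = [mean(y for j in range(C) for y in data[i][j]) for i in range(R)]
col_mean = [mean(y for i in range(R) for y in data[i][j]) for j in range(C)]
cell_mean = {(i, j): mean(data[i][j]) for i, j in cells}

ss_rows = n * C * sum((m - grand) ** 2 for m in row_mean)
ss_cols = n * R * sum((m - grand) ** 2 for m in col_mean)
ss_int = n * sum(
    (cell_mean[i, j] - row_mean[i] - col_mean[j] + grand) ** 2 for i, j in cells
)
ss_within = sum((y - cell_mean[i, j]) ** 2 for i, j in cells for y in data[i][j])
tss = sum((y - grand) ** 2 for i, j in cells for y in data[i][j])

for name, value in [("rows", ss_rows), ("cols", ss_cols), ("int'n", ss_int),
                    ("within", ss_within), ("total", tss)]:
    print(f"{name:7s} {value:.2f}")
```

Computing `tss` separately, rather than as the sum of the other four, gives an independent check that the partition adds up.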
12
ANOVA (α = .05)

SOURCE    SSQ    df    M.S.     Fcalc    FTV                    Decision
Rows      .28     2    .14      5.6      FTV (2, 12) = 3.89     Reject Ho
Col       .21     3    .07      2.8      FTV (3, 12) = 3.49     Accept Ho
Int’n     .11     6    .0183    .73      FTV (6, 12) = 3.00     Accept Ho
Error     .30    12    .025

1) Ho: All Row Means Equal
   H1: Not all Row Means Equal
2) Ho: All Col. Means Equal
   H1: Not All Col. Means Equal
3) Ho: No Int’n between factors
   H1: There is int’n between factors
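The mean squares, F ratios and decisions in this table follow mechanically from the SSQ and df columns. A short sketch, using the tabled critical values quoted in the slides rather than computing them (the dictionary names are illustrative):

```python
# (SSQ, df) for each tested source, copied from the ANOVA table.
sources = {"Rows": (0.28, 2), "Col": (0.21, 3), "Int'n": (0.11, 6)}
ss_error, df_error = 0.30, 12

# Tabled F values at alpha = .05 with 12 denominator df, as quoted in the slides.
f_table = {"Rows": 3.89, "Col": 3.49, "Int'n": 3.00}

ms_error = ss_error / df_error

results = {}
for source, (ss, df) in sources.items():
    ms = ss / df
    f_calc = ms / ms_error
    reject = f_calc > f_table[source]
    results[source] = (ms, f_calc, reject)
    decision = "Reject Ho" if reject else "Accept Ho"
    print(f"{source:6s} MS={ms:.4f}  Fcalc={f_calc:.2f}  "
          f"FTV={f_table[source]:.2f}  {decision}")
```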
13
An issue to think about:

We have:

\[ E(MSI) = \sigma^2 + V_{int'n} \]
\[ E(MSW) = \sigma^2 \]

Since Vint’n cannot be negative, and MSI = .0183 < MSW = .025, some argue that this is “strong” evidence that Vint’n is not > 0.

If this is true, E(MSI) = $\sigma^2$, and we should combine (i.e., “pool”) the MSI and MSW estimates. This gives:

          SSQ    df    MS                      SSQ    df    MS
Int.      .11     6    .0183      to   Error   .41    18    .0228
Error     .30    12    .025

(Some stat packages suggest what you should do).
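Pooling amounts to adding the two sums of squares and the two degrees of freedom, then re-forming the F ratios against the pooled error. A minimal sketch; the re-computed F values are not given in the slides and are shown only to illustrate the effect of pooling:

```python
# Interaction and error lines from the ANOVA table.
ss_int, df_int = 0.11, 6
ss_err, df_err = 0.30, 12

# Pool: add sums of squares and degrees of freedom.
ss_pooled = ss_int + ss_err
df_pooled = df_int + df_err
ms_pooled = ss_pooled / df_pooled
print(f"pooled error: SSQ={ss_pooled:.2f}, df={df_pooled}, MS={ms_pooled:.4f}")

# Main-effect F ratios against the pooled error (now with 18 denominator df).
f_rows = (0.28 / 2) / ms_pooled
f_cols = (0.21 / 3) / ms_pooled
print(f"Fcalc rows = {f_rows:.2f}, Fcalc cols = {f_cols:.2f}")
```

Because the pooled mean square is smaller than the original MSW and carries more degrees of freedom, both F ratios rise slightly; they must then be compared against table values with 18 rather than 12 denominator degrees of freedom.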
14
Another issue: The table of 2 pages ago assumes what is called a “Fixed Model”. There is also what is called a “Random Model” (and a “Mixed Model”).

MEAN SQUARE EXPECTATIONS

                                                  Mixed
              Fixed        Random                 (col = fixed, row = random)
MSBrows       σ² + VR      σ² + VI + VR           σ² + VR
MSBcol        σ² + VC      σ² + VI + VC           σ² + VI + VC
MSBInt’n      σ² + VI      σ² + VI                σ² + VI
MSWerror      σ²           σ²                     σ²

Reference: Design and Analysis of Experiments by D.C. Montgomery, 4th edition, Chapter 11.
15
Fixed: Specific levels chosen by the experimenter
Random: Levels chosen randomly from a large number of possibilities

Fixed: All levels about which inferences are to be made are included in the experiment
Random: Levels are some of a large number possible

Fixed: A definite number of qualitatively distinguishable levels, and we plan to study them all, or a continuous set of quantitative settings, but we choose a suitable, definite subset in a limited region and confine inferences to that subset
Random: Levels are a random sample from an infinite (or large) population
16
“In a great number of cases the investigator may argue either way, depending on his mood and his handling of the subject matter. In other words, it is more a matter of assumption than of reality.”
Some authors say that if in doubt, assume fixed model. Others say things like “I think in most experimental situations the random model is applicable.” [The latter quote is from a person whose experiments are in the field of biology].
17
My own feeling is that in most areas of management, a majority of experiments involve the fixed model [e.g., specific promotional campaigns, two specific ways of handling an issue on an income statement, etc.]. Many cases involve neither a “pure” fixed nor a “pure” random situation [e.g., selecting 3 prices from 6 “practical” possibilities].
Note that the issue sometimes becomes irrelevant in a practical sense when (certain) interactions are not present. Also note that each assumption may yield you the same “answer” in terms of practical application, in which case the distinction may not be an important one.
18
Interesting Example:* Brand Name Appeal for Men & Women

Design: gender (M, F) by brand name (“Frontiersman”, “April”), 50 people per cell

Mean Scores

Dependent             “Frontiersman”   “April”    “Frontiersman”   “April”
Variable              males            males      females          females
                      (n=50)           (n=50)     (n=50)           (n=50)
Intent-to-purchase    4.44             3.50       2.04             4.52

(*) Decision Sciences, Vol. 9, p. 470, 1978
19
[Figure: Interaction Plot - Data Means for y. Mean of y (axis ticks 2 to 4) plotted against gender and brand, each at levels 1 and 2.]
20
ANOVA Results

Dependent Variable                  Source            d.f.    MS        F
Intent-to-purchase (7 pt. scale)    Sex (A)             1     23.80     5.61*
                                    Brand name (B)      1     29.64     6.99**
                                    A x B               1    146.21    34.48***
                                    Error             196      4.24

*p<.05   **p<.01   ***p<.001
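The three effect mean squares in this table can be recovered from the four cell means alone. The sketch below relies on a standard result for a balanced 2 x 2 design with n observations per cell: each 1-df sum of squares equals n times the squared contrast of cell means (coefficients +1 and -1) divided by 4. The error mean square (4.24) cannot be recovered from the means and is taken from the table; all names are illustrative.

```python
n = 50  # people per cell

# Cell means for intent-to-purchase, from the "Mean Scores" table.
m_front, m_april = 4.44, 3.50   # males
f_front, f_april = 2.04, 4.52   # females

ms_error = 4.24                 # from the ANOVA table (196 d.f.)


def one_df_ss(contrast):
    """SS for a +1/-1 contrast of four cell means, n observations per cell."""
    return n * contrast ** 2 / 4


sex = (m_front + m_april) - (f_front + f_april)
brand = (m_front + f_front) - (m_april + f_april)
interaction = (m_front - m_april) - (f_front - f_april)

# With 1 d.f. each, the mean square equals the sum of squares.
ms = {"Sex (A)": one_df_ss(sex),
      "Brand name (B)": one_df_ss(brand),
      "A x B": one_df_ss(interaction)}

for source, value in ms.items():
    print(f"{source:15s} MS={value:7.2f}  F={value / ms_error:6.2f}")
```

The interaction contrast is by far the largest of the three: "Frontiersman" scores higher than "April" among men, and the ordering reverses among women.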
21
Two-Way ANOVA in Minitab

Stat >> ANOVA >> General Linear Model:

Model: device brand device*brand
Random factors: device
Results: Tick “Display expected mean squares and variance components”
Factor plots: Main effects plots & Interactions plots
Graphs: Use standardized residuals for plots
22
EXCEL

        a       b       c       d
x     17.9    17.8    18.1    17.8
      18.1    17.8    18.2    17.9
      18.2    18.0    18.4    18.1
      18.0    18.3    18.1    18.5
      18.0    17.8    18.1    18.1
      17.8    18.0    18.3    17.9

[Each pair of rows is one level of the row factor (device); columns a to d are the brands.]

SUMMARY       a        b        c        d        Total
x
Count         2        2        2        2        8
Sum           36       35.6     36.3     35.7     143.6
Average       18       17.8     18.15    17.85    17.95
Variance      0.02     0        0.005    0.005    0.0257

Count         2        2        2        2        8
Sum           36.2     36.3     36.5     36.6     145.6
Average       18.1     18.15    18.25    18.3     18.2
Variance      0.02     0.045    0.045    0.08     0.0343

Count         2        2        2        2        8
Sum           35.8     35.8     36.4     36.0     144.0
Average       17.9     17.9     18.2     18.0     18.0
Variance      0.02     0.02     0.02     0.02     0.0286

Total
Count         6        6        6        6
Sum           108      107.7    109.2    108.3
Average       18       17.95    18.2     18.05
Variance      0.02     0.039    0.016    0.063

ANOVA
Source of Variation    SS     df    MS       F       P-value    F crit
Sample                 .28     2    .14      5.6     0.019      3.885
Columns                .21     3    .07      2.8     0.085      3.490
Interaction            .11     6    .0183    0.73    0.632      2.996
Within                 .30    12    .025
Total                  .90    23
23
SPSS

Time    Device    Brand
17.9    1.00      1.00
18.1    1.00      1.00
18.2    2.00      1.00
18.0    2.00      1.00
18.0    3.00      1.00
17.8    3.00      1.00
17.8    1.00      2.00
17.8    1.00      2.00
18.0    2.00      2.00
18.3    2.00      2.00
17.8    3.00      2.00
18.0    3.00      2.00
18.1    1.00      3.00
18.2    1.00      3.00
18.4    2.00      3.00
18.1    2.00      3.00
18.1    3.00      3.00
18.3    3.00      3.00
17.8    1.00      4.00
17.9    1.00      4.00
18.1    2.00      4.00
18.5    2.00      4.00
18.1    3.00      4.00
17.9    3.00      4.00
24
* * * A N A L Y S I S   O F   V A R I A N C E * * *
Time by Device Brand

                        Sum of            Mean              Sig
Source of Variation     Squares     DF    Square     F      of F
Main Effects            .49000       5    .09800    3.920   .024
   Device               .28000       2    .14000    5.600   .019
   Brand                .21000       3    .07000    2.800   .085
2-Way Interactions      .11000       6    .01833     .733   .633
   Device Brand         .11000       6    .01833     .733   .633
Explained               .60000      11    .05455    2.182   .098
Residual                .30000      12    .02500
Total                   .90000      23    .03913
25
Two Factors with No Replication

When there’s no replication, there is no “pure” way to estimate ERROR. Error is measured by considering more than one observation (i.e., replication) at the same “treatment combination” (i.e., experimental conditions).

            A
B       1     2     3
1       7     3     4
2      10     6     8
3       6     2     5
4       9     5     7
26
Our model for analysis is “technically”:

\[ Y_{ij} = \mu + \rho_i + \tau_j + I_{ij} \]

i = 1, ..., R
j = 1, ..., C

We can write:

Yij = Y•• + (Yi• - Y••) + (Y•j - Y••)
    + (Yij - Yi• - Y•j + Y••)
27
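After bringing Y•• to the other side of the equation, squaring both sides, and double summing over i and j, we find:

\[
\sum_{i=1}^{R} \sum_{j=1}^{C} (Y_{ij} - Y_{\bullet\bullet})^2
= C \sum_{i=1}^{R} (Y_{i\bullet} - Y_{\bullet\bullet})^2
+ R \sum_{j=1}^{C} (Y_{\bullet j} - Y_{\bullet\bullet})^2
+ \sum_{i=1}^{R} \sum_{j=1}^{C} (Y_{ij} - Y_{i\bullet} - Y_{\bullet j} + Y_{\bullet\bullet})^2
\]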
28
TSS = SSBROWS + SSBCol + SSIR,C

Degrees of Freedom: R•C - 1 = (R - 1) + (C - 1) + (R - 1)(C - 1)

We know, E(MSInt.) = $\sigma^2$ + VInt.

If we assume VInt. = 0, E(MSInt.) = $\sigma^2$, and we can call SSIR,C “SSW” and MSInt “MSW”.
29
And, our model may be rewritten:

\[ Y_{ij} = \mu + \rho_i + \tau_j + \varepsilon_{ij}, \]

and the “labels” would become:

TSS = SSBROWS + SSBCol + SSWError

In our problem: SSBrows = 28.67
                SSBcol = 32
                SSW = 1.33
30
ANOVA

Source    SSQ      df    MSQ      Fcalc
rows      28.67     3     9.55    43
col       32.00     2    16.00    72
Error      1.33     6     0.22
TSS       62       11

and: at α = .01, FTV (3,6) = 9.78 and FTV (2,6) = 10.93
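The whole no-replication analysis can be reproduced from the 4 x 3 table, treating the interaction sum of squares as the error term, as assumed above. A sketch in plain Python with illustrative names:

```python
# One observation per cell: 4 rows (R) by 3 columns (C).
table = [
    [7, 3, 4],
    [10, 6, 8],
    [6, 2, 5],
    [9, 5, 7],
]

R, C = len(table), len(table[0])
grand = sum(sum(row) for row in table) / (R * C)
row_mean = [sum(row) / C for row in table]
col_mean = [sum(table[i][j] for i in range(R)) / R for j in range(C)]

ss_rows = C * sum((m - grand) ** 2 for m in row_mean)
ss_cols = R * sum((m - grand) ** 2 for m in col_mean)
# Interaction SS, relabelled as "error" under the no-interaction assumption.
ss_error = sum(
    (table[i][j] - row_mean[i] - col_mean[j] + grand) ** 2
    for i in range(R) for j in range(C)
)
tss = sum((y - grand) ** 2 for row in table for y in row)

df_rows, df_cols = R - 1, C - 1
df_error = df_rows * df_cols

ms_error = ss_error / df_error
f_rows = (ss_rows / df_rows) / ms_error
f_cols = (ss_cols / df_cols) / ms_error

print(f"rows  SSQ={ss_rows:.2f} df={df_rows} Fcalc={f_rows:.0f}")
print(f"col   SSQ={ss_cols:.2f} df={df_cols} Fcalc={f_cols:.0f}")
print(f"error SSQ={ss_error:.2f} df={df_error}")
print(f"TSS   {tss:.0f}")
```

Both F ratios are far above the tabled values quoted at α = .01, so both null hypotheses are rejected.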
31
What if we’re wrong about there being no interaction?

If we “think” our ratio is, in expectation (say, for ROWS),

\[ \frac{\sigma^2 + V_{ROWS}}{\sigma^2}, \]

and it really is (because there’s interaction)

\[ \frac{\sigma^2 + V_{ROWS}}{\sigma^2 + V_{int'n}}, \]

being wrong can lead only to giving us an underestimated Fcalc.
32
Thus, if we’ve REJECTED Ho, we can feel confident of our conclusion, even if there’s interaction.

If we’ve ACCEPTED Ho, only then could the no interaction assumption be CRITICAL.
33
Blocking

• We will add a factor even if it is not of interest so that the study of the prime factors is under more homogeneous conditions. This factor is called “block”. Most of the time, the block does not interact with prime factors.
• Popular factors are “location”, “gender” and so on.
• A two-factor design with one block factor is called a “randomized block design”.
34
For example, suppose that we are studying worker absenteeism as a function of the age of the worker, and have different levels of ages: 25-30, 40-55, and 55-60. However, a worker’s gender may also affect his/her amount of absenteeism. Even though we are not particularly concerned with the impact of gender, we want to ensure that the gender factor does not pollute our conclusions about the effect of age. Moreover, it seems unlikely that “gender” interacts with “ages”. We include “gender” as a block factor.