two factor designs


Upload: zeno

Post on 05-Jan-2016


TRANSCRIPT

Page 1: Two Factor Designs

Two Factor Designs

Consider studying the impact of two factors on the yield (response):

                         BRAND
DEVICE       1            2            3            4
1            17.9, 18.1   17.8, 17.8   18.1, 18.2   17.8, 17.9
2            18.0, 18.2   18.0, 18.3   18.4, 18.1   18.1, 18.5
3            18.0, 17.8   17.8, 18.0   18.1, 18.3   18.1, 17.9

Here we have R = 3 rows (levels of the row factor), C = 4 columns (levels of the column factor), and n = 2 replicates per cell [$n_{ij}$ for the (i,j)th cell if not all equal].

NOTE: The “1”, “2”, etc. mean Level 1, Level 2, etc., NOT metric values.

Page 2: Two Factor Designs

MODEL:

$$Y_{ijk} = \mu + \rho_i + \tau_j + I_{ij} + \varepsilon_{ijk}$$

i = 1, ..., R
j = 1, ..., C
k = 1, ..., n

In general, n observations per cell, R • C cells.

Page 3: Two Factor Designs

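$\mu$ = the grand mean

$\rho_i$ = the difference between the i-th row mean and the grand mean

$\tau_j$ = the difference between the j-th column mean and the grand mean

$I_{ij}$ = the interaction associated with the i-th row and the j-th column, i.e., $I_{ij} = \mu_{ij} - \mu - \rho_i - \tau_j$, where $\mu_{ij}$ is the mean of cell (i,j)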

Page 4: Two Factor Designs

$$Y_{ijk} = \bar{Y}_{\cdot\cdot\cdot} + (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot}) + (Y_{ijk} - \bar{Y}_{ij\cdot})$$

Where

$\bar{Y}_{\cdot\cdot\cdot}$ = Grand mean
$\bar{Y}_{i\cdot\cdot}$ = Mean of row i
$\bar{Y}_{\cdot j\cdot}$ = Mean of column j
$\bar{Y}_{ij\cdot}$ = Mean of cell (i,j)

[All the terms are somewhat “intuitive”, except for $(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})$]

Page 5: Two Factor Designs

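The term $(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})$ is more intuitively written as:

$$(\bar{Y}_{ij\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) - (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})$$

- First term: how a cell mean differs from the grand mean
- Second term: adjustment for “row membership”
- Third term: adjustment for “column membership”

We can, without loss of generality, assume (for a moment) that there is no error (random part); why then might the above be non-zero?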

Page 6: Two Factor Designs

ANSWER: “INTERACTION”

Two basic ways to look at interaction:

1)

         BL    BH
AL       5     8
AH       10    ?

If AHBH = 13, no interaction
If AHBH > 13, + interaction
If AHBH < 13, - interaction

- When B goes from BL → BH, yield goes up by 3 (5 → 8).
- When A goes from AL → AH, yield goes up by 5 (5 → 10).
- When both changes of level occur, does yield go up by the sum, 3 + 5 = 8?

Interaction = degree of difference from sum of separate effects

Page 7: Two Factor Designs

2)

         BL    BH
AL       5     8
AH       10    17

- Holding BL, what happens as A goes from AL → AH? +5
- Holding BH, what happens as A goes from AL → AH? +9

If the effect of one factor (i.e., the impact of changing its level) is DIFFERENT for different levels of another factor, then INTERACTION exists between the two factors.

NOTE:
- Holding AL, BL → BH has impact +3
- Holding AH, BL → BH has impact +7

(AB) = (BA), or (9 - 5) = (7 - 3).
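The two views of interaction can be checked numerically. The following is a minimal Python sketch using the 2 x 2 yields from the table with AHBH = 17; the function and variable names are illustrative only and do not come from the slides.

```python
# Yields for the 2x2 illustration, keyed by (level of A, level of B).
yields = {("AL", "BL"): 5, ("AL", "BH"): 8, ("AH", "BL"): 10, ("AH", "BH"): 17}


def interaction_views(y):
    """Return the interaction computed in the two equivalent ways."""
    effect_a_at_bl = y[("AH", "BL")] - y[("AL", "BL")]  # effect of A, holding BL
    effect_a_at_bh = y[("AH", "BH")] - y[("AL", "BH")]  # effect of A, holding BH
    effect_b_at_al = y[("AL", "BH")] - y[("AL", "BL")]  # effect of B, holding AL

    # View 1: observed AHBH yield minus the "sum of separate effects" prediction.
    additive_prediction = y[("AL", "BL")] + effect_a_at_bl + effect_b_at_al
    view1 = y[("AH", "BH")] - additive_prediction

    # View 2: change in the effect of A across the two levels of B.
    view2 = effect_a_at_bh - effect_a_at_bl
    return view1, view2


view1, view2 = interaction_views(yields)
print(view1, view2)
```

Both views give the same number for this table, which is the point of the (AB) = (BA) remark.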

Page 8: Two Factor Designs

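Going back to the (model) equation on page 4, and bringing $\bar{Y}_{\cdot\cdot\cdot}$ to the other side of the equation, we get

$$(Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot}) = (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot}) + [(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot}) - (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})] + (Y_{ijk} - \bar{Y}_{ij\cdot})$$

In the bracketed term, $(\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot})$ is the effect of column j at row i, and $(\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})$ is the effect of column j.

If we then square both sides and triple sum both sides over i, j, and k, we get (after noting that all cross-product terms cancel):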

Page 9: Two Factor Designs

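$$\sum_{i}\sum_{j}\sum_{k} (Y_{ijk} - \bar{Y}_{\cdot\cdot\cdot})^2 = n \cdot C \sum_{i} (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + n \cdot R \sum_{j} (\bar{Y}_{\cdot j\cdot} - \bar{Y}_{\cdot\cdot\cdot})^2 + n \sum_{i}\sum_{j} (\bar{Y}_{ij\cdot} - \bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot j\cdot} + \bar{Y}_{\cdot\cdot\cdot})^2 + \sum_{i}\sum_{j}\sum_{k} (Y_{ijk} - \bar{Y}_{ij\cdot})^2$$

OR,

$$TSS = SSB_{Rows} + SSB_{Cols} + SSI_{R,C} + SSW_{Error}$$

and, in terms of degrees of freedom,

R•C•n - 1 = (R - 1) + (C - 1) + (R - 1)(C - 1) + R•C•(n - 1);
DF of Interaction = (RC - 1) - (R - 1) - (C - 1) = (R - 1)(C - 1).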
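To see why the cross-product terms drop out in the squaring step, it may help to write one of them out. The sketch below takes the product of the row term and the within-cell term. It is a standard identity offered as an illustration, not something shown on the slides; the other cross products vanish by similar arguments when every cell has the same n.

```latex
\sum_{i=1}^{R}\sum_{j=1}^{C}\sum_{k=1}^{n}
  (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})\,(Y_{ijk} - \bar{Y}_{ij\cdot})
= \sum_{i=1}^{R}\sum_{j=1}^{C}
  (\bar{Y}_{i\cdot\cdot} - \bar{Y}_{\cdot\cdot\cdot})
  \underbrace{\sum_{k=1}^{n} (Y_{ijk} - \bar{Y}_{ij\cdot})}_{=\,0}
= 0
```

The inner sum is zero because deviations of the observations in a cell from their own cell mean always add to zero.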

Page 10: Two Factor Designs

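In our example (each cell shows its two observations, with the cell mean in parentheses beneath):

                         BRAND
DEVICE       1            2            3            4            Row mean
1            17.9, 18.1   17.8, 17.8   18.1, 18.2   17.8, 17.9
             (18.0)       (17.8)       (18.15)      (17.85)      17.95
2            18.2, 18.0   18.0, 18.3   18.4, 18.1   18.1, 18.5
             (18.1)       (18.15)      (18.25)      (18.3)       18.20
3            18.0, 17.8   17.8, 18.0   18.1, 18.3   18.1, 17.9
             (17.9)       (17.9)       (18.2)       (18.0)       18.00
Col. mean    18.00        17.95        18.20        18.05        18.05

The bottom-right entry, 18.05, is the grand mean.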

Page 11: Two Factor Designs

$$SSB_{rows} = 2 \cdot 4\,[(17.95-18.05)^2 + (18.20-18.05)^2 + (18.00-18.05)^2] = 8\,(.01 + .0225 + .0025) = .28$$

$$SSB_{col} = 2 \cdot 3\,[(18.00-18.05)^2 + (17.95-18.05)^2 + (18.20-18.05)^2 + (18.05-18.05)^2] = 6\,(.0025 + .01 + .0225 + 0) = .21$$

$$SSI_{R,C} = 2\,[(18-17.95-18+18.05)^2 + (17.8-17.95-17.95+18.05)^2 + \dots + (18-18-18.05+18.05)^2] = 2\,[.055] = .11$$

$$SSW = (17.9-18.0)^2 + (18.1-18.0)^2 + (17.8-17.8)^2 + (17.8-17.8)^2 + \dots + (18.1-18.0)^2 + (17.9-18.0)^2 = .30$$

$$TSS = .28 + .21 + .11 + .30 = .90$$
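These hand calculations can be reproduced with a few lines of code. The following is a minimal pure-Python sketch that assumes the balanced 3 x 4 layout with n = 2 from the example; the function name `two_way_sums_of_squares` and the dictionary keys are illustrative choices, not part of the original material.

```python
# Replicates per cell: data[i][j] holds the n = 2 observations
# for device (row) i+1 and brand (column) j+1.
data = [
    [[17.9, 18.1], [17.8, 17.8], [18.1, 18.2], [17.8, 17.9]],
    [[18.2, 18.0], [18.0, 18.3], [18.4, 18.1], [18.1, 18.5]],
    [[18.0, 17.8], [17.8, 18.0], [18.1, 18.3], [18.1, 17.9]],
]


def mean(values):
    """Arithmetic mean of any iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


def two_way_sums_of_squares(data):
    """Sums of squares for a balanced two-factor design with replication."""
    R, C, n = len(data), len(data[0]), len(data[0][0])
    cell = [[mean(data[i][j]) for j in range(C)] for i in range(R)]
    # Averaging cell means gives row/column means only because n is equal in every cell.
    row = [mean(cell[i]) for i in range(R)]
    col = [mean(cell[i][j] for i in range(R)) for j in range(C)]
    grand = mean(row)

    cells = [(i, j) for i in range(R) for j in range(C)]
    return {
        "rows": n * C * sum((r - grand) ** 2 for r in row),
        "cols": n * R * sum((c - grand) ** 2 for c in col),
        "interaction": n * sum(
            (cell[i][j] - row[i] - col[j] + grand) ** 2 for i, j in cells
        ),
        "within": sum((y - cell[i][j]) ** 2 for i, j in cells for y in data[i][j]),
        "total": sum((y - grand) ** 2 for i, j in cells for y in data[i][j]),
    }


ss = two_way_sums_of_squares(data)
for name, value in ss.items():
    print(f"{name:12s} {value:.2f}")
```

The "total" entry is computed directly from the raw observations, so comparing it with the sum of the other four entries is a quick check that the decomposition holds.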

Page 12: Two Factor Designs

ANOVA ($\alpha$ = .05)

SOURCE    SSQ    df    M.S.     Fcalc
Rows      .28    2     .14      5.6
Col       .21    3     .07      2.8
Int’n     .11    6     .0183    .73
Error     .30    12    .025

1) Ho: All Row Means Equal
   H1: Not all Row Means Equal
   FTV (2, 12) = 3.89, so Reject Ho

2) Ho: All Col. Means Equal
   H1: Not All Col. Means Equal
   FTV (3, 12) = 3.49, so Accept Ho

3) Ho: No Int’n between factors
   H1: There is int’n between factors
   FTV (6, 12) = 3.00, so Accept Ho
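The step from the ANOVA table to the three decisions can be sketched in code. This minimal Python example takes the sums of squares, degrees of freedom, and tabled F values exactly as quoted above and only recomputes the mean squares and F ratios; the dictionary layout and names are illustrative.

```python
# (sum of squares, degrees of freedom, tabled F value at alpha = .05),
# all copied from the ANOVA table and the FTV entries above.
effects = {
    "Rows": (0.28, 2, 3.89),
    "Columns": (0.21, 3, 3.49),
    "Interaction": (0.11, 6, 3.00),
}
ss_error, df_error = 0.30, 12
ms_error = ss_error / df_error  # mean square within (error)

decisions = {}
for name, (ss_effect, df_effect, f_table) in effects.items():
    ms_effect = ss_effect / df_effect
    f_calc = ms_effect / ms_error
    decisions[name] = f_calc > f_table  # True means "reject Ho"
    print(f"{name:12s} MS={ms_effect:.4f}  Fcalc={f_calc:.2f}  "
          f"FTV={f_table:.2f}  reject Ho: {decisions[name]}")
```

Only the row factor exceeds its tabled value, which matches the decisions listed with the hypotheses.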

Page 13: Two Factor Designs

An issue to think about:

We have:

$$E(MSI) = \sigma^2 + V_{\text{int'n}}$$

$$E(MSW) = \sigma^2$$

Since $V_{\text{int'n}}$ cannot be negative, and MSI = .0183 < MSW = .025, some argue that this is “strong” evidence that $V_{\text{int'n}}$ is not > 0.

If this is true, $E(MSI) = \sigma^2$, and we should combine (i.e., “pool”) the MSI and MSW estimates. This changes

         SSQ    df    MS
Int.     .11    6     .0183
Error    .30    12    .025

to

         SSQ    df    MS
Error    .41    18    .0228

(Some stat packages suggest what you should do).
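The pooling arithmetic is simple enough to sketch in code. The example below, in Python, adds the interaction and error sums of squares and degrees of freedom, then recomputes the F ratios for rows and columns against the pooled error. The recomputed F ratios are not given on the slides; they are shown only to illustrate the effect of pooling, and the variable names are illustrative.

```python
# Interaction and error lines from the ANOVA table.
ss_int, df_int = 0.11, 6
ss_err, df_err = 0.30, 12

# Pool: add sums of squares and degrees of freedom, then divide.
pooled_ss = ss_int + ss_err
pooled_df = df_int + df_err
pooled_ms = pooled_ss / pooled_df

# Main-effect mean squares from the same table.
ms_rows = 0.28 / 2
ms_cols = 0.21 / 3

f_rows = ms_rows / pooled_ms
f_cols = ms_cols / pooled_ms
print(f"pooled error: SSQ={pooled_ss:.2f}, df={pooled_df}, MS={pooled_ms:.4f}")
print(f"F(rows)={f_rows:.2f} on (2, {pooled_df}) df, "
      f"F(cols)={f_cols:.2f} on (3, {pooled_df}) df")
```

Because the pooled mean square is smaller than the original MSW and has more degrees of freedom, both F ratios come out larger than before; they would need to be compared with tabled values for 18 error degrees of freedom.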

Page 14: Two Factor Designs

Another issue: The table of 2 pages ago assumes what is called a “Fixed Model”. There is also what is called a “Random Model” (and a “Mixed Model”).

MEAN SQUARE EXPECTATIONS

               Fixed               Random                    Mixed (col = fixed, row = random)
MSB rows       $\sigma^2 + V_R$    $\sigma^2 + V_I + V_R$    $\sigma^2 + V_R$
MSB col        $\sigma^2 + V_C$    $\sigma^2 + V_I + V_C$    $\sigma^2 + V_I + V_C$
MSB Int’n      $\sigma^2 + V_I$    $\sigma^2 + V_I$          $\sigma^2 + V_I$
MSW error      $\sigma^2$          $\sigma^2$                $\sigma^2$

Reference: Design and Analysis of Experiments by D.C. Montgomery, 4th edition, Chapter 11.

Page 15: Two Factor Designs

Fixed: Specific levels chosen by the experimenter
Random: Levels chosen randomly from a large number of possibilities

Fixed: All levels about which inferences are to be made are included in the experiment
Random: Levels are some of a large number possible

Fixed: A definite number of qualitatively distinguishable levels, and we plan to study them all, or a continuous set of quantitative settings, but we choose a suitable, definite subset in a limited region and confine inferences to that subset
Random: Levels are a random sample from an infinite (or large) population

Page 16: Two Factor Designs


“In a great number of cases the investigator may argue either way, depending on his mood and his handling of the subject matter. In other words, it is more a matter of assumption than of reality.”

Some authors say that if in doubt, assume fixed model. Others say things like “I think in most experimental situations the random model is applicable.” [The latter quote is from a person whose experiments are in the field of biology].

Page 17: Two Factor Designs

My own feeling is that in most areas of management, a majority of experiments involve the fixed model [e.g., specific promotional campaigns, two specific ways of handling an issue on an income statement, etc.]. Many cases involve neither a “pure” fixed nor a “pure” random situation [e.g., selecting 3 prices from 6 “practical” possibilities].

Note that the issue sometimes becomes irrelevant in a practical sense when (certain) interactions are not present. Also note that each assumption may yield you the same “answer” in terms of practical application, in which case the distinction may not be an important one.

Page 18: Two Factor Designs

Interesting Example:* Brand Name Appeal for Men & Women

Design: gender (M, F) by brand name (“Frontiersman”, “April”), 50 people per cell.

Mean Scores

Dependent variable: Intent-to-purchase

Group                              Mean score
“Frontiersman” males (n=50)        4.44
“April” males (n=50)               3.50
“Frontiersman” females (n=50)      2.04
“April” females (n=50)             4.52

(*) “Decision Sciences”, Vol. 9, p. 470, 1978

Page 19: Two Factor Designs

[Figure: Interaction Plot - Data Means for y. Factors: gender (levels 1, 2) and brand (levels 1, 2); vertical axis: Mean.]

Page 20: Two Factor Designs

ANOVA Results

Dependent variable: Intent-to-purchase (7 pt. scale)

Source            d.f.    MS        F
Sex (A)           1       23.80     5.61*
Brand name (B)    1       29.64     6.99**
A x B             1       146.21    34.48***
Error             196     4.24

*p<.05
**p<.01
***p<.001
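The mean squares in this table can be recovered, up to rounding, from the four cell means reported for the brand-name example. The following Python sketch relies on the fact that in a balanced 2 x 2 design with n observations per cell, each one-degree-of-freedom sum of squares equals n times the squared contrast of cell means divided by 4. The error mean square of 4.24 is taken from the table, and the variable names are illustrative.

```python
n = 50      # people per cell
mse = 4.24  # error mean square, copied from the ANOVA table

# Mean intent-to-purchase scores, keyed by (gender, brand name).
m = {
    ("M", "Frontiersman"): 4.44, ("M", "April"): 3.50,
    ("F", "Frontiersman"): 2.04, ("F", "April"): 4.52,
}

# Contrasts of cell means (each effect has 1 df, so MS equals SS).
contrasts = {
    "Sex (A)": (m["M", "Frontiersman"] + m["M", "April"])
               - (m["F", "Frontiersman"] + m["F", "April"]),
    "Brand name (B)": (m["M", "Frontiersman"] + m["F", "Frontiersman"])
                      - (m["M", "April"] + m["F", "April"]),
    "A x B": (m["M", "Frontiersman"] - m["M", "April"])
             - (m["F", "Frontiersman"] - m["F", "April"]),
}

ms = {name: n * c ** 2 / 4 for name, c in contrasts.items()}
f_ratios = {name: value / mse for name, value in ms.items()}

for name in ms:
    print(f"{name:15s} MS={ms[name]:.2f}  F={f_ratios[name]:.2f}")
```

The interaction contrast is by far the largest of the three, which is consistent with the crossing pattern the cell means suggest: each brand name appeals mainly to one gender.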

Page 21: Two Factor Designs

Two-Way ANOVA in Minitab

Stat >> Anova >> General Linear Model:

- Model: device brand device*brand
- Random factors: device
- Results: Tick “Display expected mean squares and variance components”
- Factor plots: Main effects plots & Interactions plots
- Graphs: Use standardized residuals for plots

Page 22: Two Factor Designs

EXCEL

         a       b       c       d
x        17.9    17.8    18.1    17.8
         18.1    17.8    18.2    17.9
         18.2    18.0    18.4    18.1
         18.0    18.3    18.1    18.5
         18.0    17.8    18.1    18.1
         17.8    18.0    18.3    17.9

SUMMARY (variances are shown in units of 10^-2)

                a        b        c        d        Total
x
Count           2        2        2        2        8
Sum             36       35.6     36.3     35.7     143.6
Average         18       17.8     18.15    17.85    17.95
Variance        2        0        0.5      0.5      2.57

(second sample)
Count           2        2        2        2        8
Sum             36.2     36.3     36.5     36.6     145.6
Average         18.1     18.15    18.25    18.3     18.2
Variance        2        4.5      4.5      8        3.43

(third sample)
Count           2        2        2        2        8
Sum             35.8     35.8     36.4     36.0     144.0
Average         17.9     17.9     18.2     18.0     18.0
Variance        2        2        2        2        2.86

Total
Count           6        6        6        6
Sum             108      107.7    109.2    108.3
Average         18       17.95    18.2     18.05
Variance        2        3.9      1.6      6.3

ANOVA

Source of Variation    SS     df    MS       F       P-value    F crit
Sample                 .28    2     .14      5.6     0.019      3.885
Columns                .21    3     .07      2.8     0.085      3.490
Interaction            .11    6     .0183    0.73    0.632      2.996
Within                 .30    12    .025
Total                  .90    23

Page 23: Two Factor Designs

SPSS

Time    Device    Brand
17.9    1.00      1.00
18.1    1.00      1.00
18.2    2.00      1.00
18.0    2.00      1.00
18.0    3.00      1.00
17.8    3.00      1.00
17.8    1.00      2.00
17.8    1.00      2.00
18.0    2.00      2.00
18.3    2.00      2.00
17.8    3.00      2.00
18.0    3.00      2.00
18.1    1.00      3.00
18.2    1.00      3.00
18.4    2.00      3.00
18.1    2.00      3.00
18.1    3.00      3.00
18.3    3.00      3.00
17.8    1.00      4.00
17.9    1.00      4.00
18.1    2.00      4.00
18.5    2.00      4.00
18.1    3.00      4.00
17.9    3.00      4.00

Page 24: Two Factor Designs

* * * A N A L Y S I S   O F   V A R I A N C E * * *

Time by Device Brand

Source of Variation     Sum of Squares    DF    Mean Square    F        Sig of F
Main Effects            .49000            5     .09800         3.920    .024
  Device                .28000            2     .14000         5.600    .019
  Brand                 .21000            3     .07000         2.800    .085
2-Way Interactions      .11000            6     .01833         .733     .633
  Device Brand          .11000            6     .01833         .733     .633
Explained               .60000            11    .05455         2.182    .098
Residual                .30000            12    .02500
Total                   .90000            23    .03913

Page 25: Two Factor Designs

Two Factors with No Replication

When there’s no replication, there is no “pure” way to estimate ERROR. Error is measured by considering more than one observation (i.e., replication) at the same “treatment combination” (i.e., experimental conditions).

             A
B       1     2     3
1       7     3     4
2       10    6     8
3       6     2     5
4       9     5     7

Page 26: Two Factor Designs

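Our model for analysis is “technically”:

$$Y_{ij} = \mu + \rho_i + \tau_j + I_{ij}$$

i = 1, ..., R
j = 1, ..., C

We can write:

$$Y_{ij} = \bar{Y}_{\cdot\cdot} + (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot}) + (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot}) + (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})$$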

Page 27: Two Factor Designs

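After bringing $\bar{Y}_{\cdot\cdot}$ to the other side of the equation, squaring both sides, and double summing over i and j, we find:

$$\sum_{i=1}^{R}\sum_{j=1}^{C} (Y_{ij} - \bar{Y}_{\cdot\cdot})^2 = C \sum_{i=1}^{R} (\bar{Y}_{i\cdot} - \bar{Y}_{\cdot\cdot})^2 + R \sum_{j=1}^{C} (\bar{Y}_{\cdot j} - \bar{Y}_{\cdot\cdot})^2 + \sum_{i=1}^{R}\sum_{j=1}^{C} (Y_{ij} - \bar{Y}_{i\cdot} - \bar{Y}_{\cdot j} + \bar{Y}_{\cdot\cdot})^2$$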

Page 28: Two Factor Designs

$$TSS = SSB_{Rows} + SSB_{Col} + SSI_{R,C}$$

Degrees of freedom: R•C - 1 = (R - 1) + (C - 1) + (R - 1)(C - 1)

We know $E(MS_{Int.}) = \sigma^2 + V_{Int.}$

If we assume $V_{Int.} = 0$, then $E(MS_{Int.}) = \sigma^2$, and we can relabel:

$SSI_{R,C}$ → SSW
$MS_{Int}$ → MSW

Page 29: Two Factor Designs

And, our model may be rewritten:

$$Y_{ij} = \mu + \rho_i + \tau_j + \varepsilon_{ij},$$

and the “labels” would become:

$$TSS = SSB_{Rows} + SSB_{Col} + SSW_{Error}$$

In our problem:

SSB rows = 28.67
SSB col = 32
SSW = 1.33

Page 30: Two Factor Designs

ANOVA

Source    SSQ      df    MSQ      Fcalc
rows      28.67    3     9.55     43
col       32.00    2     16.00    72
Error     1.33     6     0.22
TSS       62       11

and, at $\alpha$ = .01:

FTV (3,6) = 9.78
FTV (2,6) = 10.93
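The no-replication analysis can be reproduced from the 4 x 3 table of single observations. The following is a minimal pure-Python sketch in which the interaction sum of squares is relabeled as error, as described above; the function name `anova_no_replication` and the returned keys are illustrative.

```python
# One observation per cell: 4 rows by 3 columns, as in the example table.
table = [
    [7, 3, 4],
    [10, 6, 8],
    [6, 2, 5],
    [9, 5, 7],
]


def anova_no_replication(table):
    """Two-factor ANOVA with one observation per cell (no interaction assumed)."""
    R, C = len(table), len(table[0])
    grand = sum(sum(r) for r in table) / (R * C)
    row_means = [sum(r) / C for r in table]
    col_means = [sum(table[i][j] for i in range(R)) / R for j in range(C)]

    ss_rows = C * sum((m - grand) ** 2 for m in row_means)
    ss_cols = R * sum((m - grand) ** 2 for m in col_means)
    # What would be the interaction SS is used as the error SS.
    ss_error = sum(
        (table[i][j] - row_means[i] - col_means[j] + grand) ** 2
        for i in range(R) for j in range(C)
    )
    df_rows, df_cols = R - 1, C - 1
    df_error = df_rows * df_cols
    ms_error = ss_error / df_error
    return {
        "ss_rows": ss_rows, "ss_cols": ss_cols, "ss_error": ss_error,
        "f_rows": (ss_rows / df_rows) / ms_error,
        "f_cols": (ss_cols / df_cols) / ms_error,
    }


result = anova_no_replication(table)
for name, value in result.items():
    print(f"{name:9s} {value:.2f}")
```

Both F ratios can then be compared with the tabled values quoted for the corresponding degrees of freedom.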

Page 31: Two Factor Designs

What if we’re wrong about there being no interaction?

If we “think” our ratio is, in expectation (say, for ROWS),

$$\frac{\sigma^2 + V_{ROWS}}{\sigma^2}$$

and it really is (because there’s interaction)

$$\frac{\sigma^2 + V_{ROWS}}{\sigma^2 + V_{\text{int'n}}},$$

being wrong can lead only to giving us an underestimated Fcalc.

Page 32: Two Factor Designs

Thus, if we’ve REJECTED Ho, we can feel confident of our conclusion, even if there’s interaction.

If we’ve ACCEPTED Ho, only then could the no-interaction assumption be CRITICAL.

Page 33: Two Factor Designs

Blocking

• We will add a factor even if it is not of interest so that the study of the prime factors is under more homogeneous conditions. This factor is called a “block”. Most of the time, the block does not interact with the prime factors.

• Popular factors are “location”, “gender” and so on.

• A two-factor design with one block factor is called a “randomized block design”.

Page 34: Two Factor Designs


For example, suppose that we are studying worker absenteeism as a function of the age of the worker, and have different levels of ages: 25-30, 40-55, and 55-60. However, a worker’s gender may also affect his/her amount of absenteeism. Even though we are not particularly concerned with the impact of gender, we want to ensure that the gender factor does not pollute our conclusions about the effect of age. Moreover, it seems unlikely that “gender” interacts with “ages”. We include “gender” as a block factor.