topic 27: strategies of analysis
DESCRIPTION
Topic 27: Strategies of Analysis. Outline. Strategy for analysis of two-way studies Interaction is not significant Interaction is significant What if levels of factor or factors is quantitative? What if cell size is n=1?. An analytical strategy. - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/1.jpg)
Topic 27: Strategies of Analysis
![Page 2: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/2.jpg)
Outline
• Strategy for analysis of two-way studies
– Interaction is not significant
– Interaction is significant
• What if levels of factor or factors is quantitative?
• What if cell size is n=1?
![Page 3: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/3.jpg)
An analytical strategy
• Run the model with main effects and the two-way interaction
• Plot the data, the means and look at the normal quantile plot
• Check the significance test for the interaction
![Page 4: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/4.jpg)
AB interaction not sig• Possibly rerun the analysis without the
interaction. Only consider if dfe is small
• For a main effect that is not significant–No evidence to conclude that the levels
of this factor are associated with different means of the response variable–Possibly rerun without this factor giving
a one-way anova. Watch out for Type II errors that inflate MSE
![Page 5: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/5.jpg)
AB interaction not sig• If both main effects are not significant–Model could be run as Y=; –Basically assuming a one population
model• For a main effect with more than two
levels that is significant, use the means or lsmeans statement with the tukey multiple comparison procedure
• Contrasts and linear combinations can also be examined using the contrast and estimate statements
![Page 6: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/6.jpg)
AB interaction sig but not important
• Plots and a careful examination of the cell means may indicate that the interaction is not very important even though it is statistically significant
• Use the marginal means for each significant main effect to describe the important results
• You may need to qualify these results using the interaction
![Page 7: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/7.jpg)
AB interaction sig but not important
• Use the methods described for the situation where the interaction is not significant but keep the interaction in the model
• Carefully interpret the marginal means as averages over the levels of the other factor
![Page 8: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/8.jpg)
AB interaction is sig and important
• Options include–Treat as a one-way with ab levels; use
tukey to compare means; contrasts and estimate can also be useful–Report that the interaction is significant;
plot the means and describe the pattern–Analyze the levels of A for each level of
B (use a BY statement) or vice versa
![Page 9: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/9.jpg)
Specification of contrast and estimate statements
• When using the contrast statement always double check your results with the estimate statement
• The order of factors is determined by the order in the class statement, not the order in the model statement
• Contrast should come a priori from research questions
![Page 10: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/10.jpg)
KNNL Example• KNNL p 833• Y is the number of cases of bread sold• A is the height of the shelf display, a=3
levels: bottom, middle, top• B is the width of the shelf display, b=2:
regular, wide• n=2 stores for each of the 3x2
treatment combinations
![Page 11: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/11.jpg)
Contrast/Estimate Example
• Suppose we want to compare the average of the two height = middle cells with the average of the other four cells
• With this approach, the contrast should correspond to a research question formulated before examining the data
![Page 12: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/12.jpg)
Contrast/Estimate
• First, formulate question as a contrast using the cell means model– (μ21 + μ22)/2 = (μ11 + μ12 + μ31 + μ32)/4
– -.25μ11-.25μ12+.5μ21+.5μ22-.25μ31-.25μ32 =0
• Then translate the contrast into the factor effects model using
μij = μ + αi + βj + (αβ)ij
![Page 13: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/13.jpg)
Contrast/Estimate• -.25μ11 = -.25(μ + α1 + β1 + (αβ)11)• -.25μ12 = -.25(μ + α1 + β2 + (αβ)12)• +.5μ21 = +.5 (μ + α2 + β1 + (αβ)21)• +.5μ22 = +.5 (μ + α2 + β2 + (αβ)22)• -.25μ31 = -.25(μ + α3 + β1 + (αβ)31)• -.25μ32 = -.25(μ + α3 + β2 + (αβ)32)• C = (-.5α1 + α2 - .5α3) +
(-.25(αβ)11-.25(αβ)12+.5(αβ)21+.5(αβ)22-.25(αβ)31-.25(αβ)32)
![Page 14: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/14.jpg)
Proc glm
proc glm data=a1; class height width; model sales=height width height*width;
![Page 15: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/15.jpg)
Contrast and estimate
contrast 'middle two vs all others' height -.5 1 -.5 height*width -.25 -.25 .5 .5 -.25 -.25;estimate 'middle two vs all others' height -.5 1 -.5 height*width -.25 -.25 .5 .5 -.25 -.25;means height*width;run;
![Page 16: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/16.jpg)
Output
Cont middle two vs all othersDF SS MS F Pr > F1 1536 1536 148.65 <.0001
Par middle two vs all others Estimate StErr t Pr > |t| 24 1.96 12.19 <.0001
![Page 17: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/17.jpg)
Check with means
1 1 451 2 432 1 652 2 693 1 403 2 44(65+69)/2 – (45+43+40+44)/4 = 67 – 43 = 24
![Page 18: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/18.jpg)
One Quantitative factor and one categorical
• Plot the means vs the quantitative factor for each level of the categorical factor
• Consider linear and quadratic terms for the quantitative factor (regression)
• Consider different relationships for the different levels of the categorical factor
• Lack of fit analysis can be useful
![Page 19: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/19.jpg)
Two Quantitative factors
• Plot the means
–vs A for each level B
–vs B for each level A
• Consider linear and quadratic terms
• Consider products to allow for interaction
• Lack of fit analysis can be useful
![Page 20: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/20.jpg)
One observation per cell
• For Yijk we use
– i to denote the level of the factor A
– j to denote the level of the factor B
–k to denote the kth observation in cell (i,j)
• i = 1, . . . , a levels of factor A
• j = 1, . . . , b levels of factor B
• n = 1 observation in cell (i,j)
![Page 21: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/21.jpg)
Factor effects model
• μij = μ + αi + βj
• μ is the overall mean• αi is the main effect of A • βj is the main effect of B• Because we have only one observation per
cell, we do not have enough information to estimate the interaction in the usual way…it is confounded with error
![Page 22: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/22.jpg)
Constraints
• Σi αi= 0• Σjβj = 0• SAS GLM Constraints
–αa= 0 (1 constraint)–βb = 0 (1 constraint)
![Page 23: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/23.jpg)
ANOVA Table
Source df SS MS F A a-1 SSA MSA MSA/MSE B b-1 SSB MSB MSB/MSE Error (a-1)(b-1) SSE MSE _ Total ab-1 SST MST
Would normally be df for interaction
![Page 24: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/24.jpg)
Expected Mean Squares
• Based on model which has no interaction
• E(MSE) = σ2
• E(MSA) = σ2 + b(Σiαi2)/(a-1)
• E(MSB) = σ2 + a(Σjβj2)/(b-1)
• Here, αi and βj, are defined with the usual factor effects constraints
![Page 25: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/25.jpg)
KNNL Example• KNNL p 882
• Y is the premium for auto insurance
• A is the size of the city, a=3 levels: small, medium and large
• B is the region, b=2: East, West
• n=1 the response is the premium charged by a particular company
![Page 26: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/26.jpg)
The data
data a2; infile 'c:\...\CH20TA02.txt'; input premium size region;if size=1 then sizea='1_small ';if size=2 then sizea='2_medium';if size=3 then sizea='3_large ';proc print data=a1; run;
![Page 27: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/27.jpg)
Output
Obs premium size region sizea 1 140 1 1 1_small 2 100 1 2 1_small 3 210 2 1 2_medium 4 180 2 2 2_medium 5 220 3 1 3_large 6 200 3 2 3_large
![Page 28: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/28.jpg)
Proc glm
proc glm data=a2; class sizea region; model premium= sizea region/solution; means sizea region / tukey; output out=a3 p=muhat;run;
![Page 29: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/29.jpg)
Output
Class Level InformationClass Levels Valuessizea 3 1_small 2_medium 3_largeregion 2 1 2Number of observations 6
![Page 30: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/30.jpg)
Output
Source DF MS F PModel 3 3550 71.00 0.0139Error 2 50Total 5
![Page 31: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/31.jpg)
Output
Source DF MS F Psizea 2 4650 93.00 0.0106region 1 1350 27.00 0.0351
![Page 32: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/32.jpg)
Output solution
Parameter EstimateInt 195 B sizea 1_small -90 B sizea 2_medium -15 B sizea 3_large 0 B region 1 30 B region 2 0 B
![Page 33: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/33.jpg)
Check vs predicted values (muhat)
reg sizea muhat1 1_s 135=195-90+30 2 1_s 105=195-901 2_m 210=195-15+30 2 2_m 180=195-151 3_l 225=195 +30 2 3_l 195=195
![Page 34: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/34.jpg)
Multiple comparisons Size
Mean N sizea
A 210 2 3_largeAA 195 2 2_medium
B 120 2 1_small
![Page 35: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/35.jpg)
Multiple comparisons Region
Mean N region
A 190 3 1
B 160 3 2
The anova results told us thatthese were different. No need for multiple comparison adjustment
![Page 36: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/36.jpg)
Plot the data
symbol1 v='E' i=join c=blue;symbol2 v='W' i=join c=green;title1 'Plot of the data';proc gplot data=a3; plot premium*sizea= region;run;
![Page 37: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/37.jpg)
The plot
Possible interaction?
![Page 38: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/38.jpg)
Plot the estimated model
symbol1 v='E' i=join c=blue;symbol2 v='W' i=join c=green;title1 'Plot of the model estimates';proc gplot data=a2; plot muhat*sizea=region;run;
![Page 39: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/39.jpg)
The plot
![Page 40: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/40.jpg)
Tukey test for additivity
• Can’t look at usual interaction but can add one additional term to the model (θ) to look at specific type of interaction
• μij = μ + αi + βj + θαiβj
• We use one degree of freedom to estimate θ
• There are other variations, eg θiβj
![Page 41: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/41.jpg)
Step 1: Get muhat (grand mean)
proc glm data=a2; model premium=; output out=aall p=muhat;proc print data=aall;run;
![Page 42: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/42.jpg)
Output
Obs premium size region muhat 1 140 1 1 175 2 100 1 2 175 3 210 2 1 175 4 180 2 2 175 5 220 3 1 175 6 200 3 2 175
![Page 43: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/43.jpg)
Step 2: Get muhatA (factor A means)
proc glm data=a2; class size; model premium=size; output out=aA p=muhatA;proc print data=aA;run;
![Page 44: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/44.jpg)
Output
Obs premium size region muhatA 1 140 1 1 120 2 100 1 2 120 3 210 2 1 195 4 180 2 2 195 5 220 3 1 210 6 200 3 2 210
![Page 45: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/45.jpg)
Step 3: Find muhatB (factor B means)
proc glm data=a2; class region; model premium=region; output out=aB p=muhatB;proc print data=aB;run;
![Page 46: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/46.jpg)
Output
Obs remium size region muhatB 1 140 1 1 190 2 100 1 2 160 3 210 2 1 190 4 180 2 2 160 5 220 3 1 190 6 200 3 2 160
![Page 47: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/47.jpg)
Step 4: Combine and compute
data a4; merge aall aA aB; alpha=muhatA-muhat; beta=muhatB-muhat; atimesb=alpha*beta;proc print data=a4; var size region atimesb;run;
![Page 48: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/48.jpg)
Output
Obs size region atimesb 1 1 1 -825 2 1 2 825 3 2 1 300 4 2 2 -300 5 3 1 525 6 3 2 -525
![Page 49: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/49.jpg)
Step 5: Run glm
proc glm data=a4; class size region; model premium=size region atimesb/solution; output out=a5 p=tukmuhat;run;
![Page 50: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/50.jpg)
Output
Source DF F Value Pr > Fsize 2 360.37 0.0372region 1 104.62 0.0620atimesb 1 6.75 0.2339
Not significant…not enough evidence of this type of
interaction but really small study (1 df error)
![Page 51: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/51.jpg)
Output
Par Est SE t PInt 195 B 2.9 66.49 0.0096size1 -90 B 3.5 -25.05 0.0254size2 -15 B 3.5 -4.18 0.1496size3 0 B . . .region1 30 B 2.9 10.23 0.0620region2 0 B . . .atimesb -0.006 0.002 -2.60 0.2339
![Page 52: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/52.jpg)
Does this look more like the means plot?
![Page 53: Topic 27: Strategies of Analysis](https://reader034.vdocuments.site/reader034/viewer/2022051418/568150be550346895dbedf56/html5/thumbnails/53.jpg)
Last slide
• Read KNNL Chapters 19-20
• We used program topic27.sas to generate the output for today