checking model adequacy can we trust our analytic results?bacraig/notes514/topic5a.pdf ·...
TRANSCRIPT
Checking Model Adequacy
Can we trust our analytic results?
Bruce A Craig
Department of StatisticsPurdue University
STAT 514 Topic 5 1
Model Checking Diagnostics
Theoretical justification of standard analysis assumes
1 Errors independent2 Errors normally distributed3 Errors have constant variance
Need to check if these conditions reasonably met
yij = (y .. + (y i . − y ..)) + (yij − y i .)yij = yij + ǫij
observed = predicted + residual
Diagnostics primarily use predicted and residual values
Residuals are our “estimates” of unobservable errors
STAT 514 Topic 5 2
Common Diagnostics
Normally distributed errorsHistogram of residualsNormal probability plot / QQ plot of residuals
Shapiro-Wilks/Kolmogorov-Smirnov test of residuals
Errors have constant variancePlot residuals ǫij vs predicted values yij (residual plot)
Modified Levene’s test
Errors are independentPlot residuals ǫij vs time/space variableDurbin-Watson test
Plot residuals ǫij vs other variable of interest
STAT 514 Topic 5 3
Normally distributed errors
Histogram of residuals
Is histogram approximately bell-shaped?
QQ Plot of residuals
Does the relationship look approximately linear?
Created by plotting:Ordered residuals ǫ(t) vs associated quantile values ztwhere zt are such that P(Z ≤ zt) = Pt
with Pt = (t − .5)/N for t = 1, 2, . . . ,N
STAT 514 Topic 5 4
Constant Variance
Often the variance of a response is linked to its mean
Can investigate visually through residual plot
Plot ǫij vs yijIs the range of ǫij constant for different levels of yij
Modified Levene’s (Brown-Forsythe) test
Compute |yij −mi | where mi is median of group i
Compare average “deviations” using F-testAbsolute value a more robust measure of deviationMedian a more robust measure of centerOther tests more sensitive to Normal assumption
STAT 514 Topic 5 5
Independence
A well-planned experiment often goes a long way towardssatisfying this assumption. Randomization protectsagainst unknown factors
Plot of the residuals over time (or space)
Is there a drift or pattern over time/space?
Plot residuals versus other relevant variables
Often variables omitted from analysisExperimental conditions (e.g., temp)May result in inclusion of new factor in next experiment
STAT 514 Topic 5 6
Outliers/Unusual Observations
May observe “unusual” responses in diagnostic plots
Results in large |ǫij |Do not simply remove. Investigate!
Helpful to have detailed lab/experimental notesDon’t look only for excuses or typos
How important are they? Do conclusions change with andwithout them in the analysis?
Formal tests (e.g. standardized residuals) can be used butonly confirm whether they are “unusual.”
STAT 514 Topic 5 7
Diagnostics Example
data one;
infile ’c:\saswork\data\tensile.dat’;
input percent strength time;
proc glm data=one;
class percent; model strength=percent;
means percent / hovtest=bf; ****Modified Levene’s test;
output out=diag p=pred r=res; run;
proc sort; by pred;
symbol1 v=circle i=sm50; title1 ’Residual Plot’;
proc gplot; plot res*pred/frame; run;
proc univariate data=diag noprint;
var res; qqplot res / normal (L=1 mu=est sigma=est);
histogram res / normal; run;
proc sort; by time;
symbol1 v=circle i=sm75;
proc gplot; plot res*time / vref=0 vaxis=-6 to 6 by 1;
symbol1 v=circle i=sm50;
proc gplot; plot res*time / vref=0 vaxis=-6 to 6 by 1;
STAT 514 Topic 5 8
The GLM Procedure
Dependent Variable: strength
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 4 475.7600000 118.9400000 14.76 <.0001
Error 20 161.2000000 8.0600000
Corrected Total 24 636.9600000
R-Square Coeff Var Root MSE strength Mean
0.746923 18.87642 2.839014 15.04000
Source DF Type I SS Mean Square F Value Pr > F
percent 4 475.7600000 118.9400000 14.76 <.0001
Source DF Type III SS Mean Square F Value Pr > F
percent 4 475.7600000 118.9400000 14.76 <.0001
Brown and Forsythe’s Test for Homogeneity of strength Variance
ANOVA of Absolute Deviations from Group Medians
Sum of Mean
Source DF Squares Square F Value Pr > F
percent 4 4.9600 1.2400 0.32 0.8626
Error 20 78.0000 3.9000
STAT 514 Topic 5 9
STAT 514 Topic 5 10
STAT 514 Topic 5 11
STAT 514 Topic 5 12
STAT 514 Topic 5 13
Non-constant Variance
Often response variable’s mean and variance linked
F test robust to violations (especially if balanced)
Why concern?
Comparison of treatments depends on MSEIncorrect confidence interval lengthsMore Type I and Type II errors
Variance-Stabilizing TransformationsCommon transformations√
y , log(y), 1/y , arcsin(√y), and 1/
√y
Box-Cox transformations
uses maximum likelihood procedurescan approximate using relationship σi = θµβ
i
transformation is y1−β
Distribution often more “Normal” after transformation
STAT 514 Topic 5 14
Transformations
Consider an RV X with E(X )=µx and V(X )=σ2x
Define Y = f (X ); What is the mean and variance of Y ?
Delta Method Approximation
First-order Taylor Series Expansion about µX
Assume f (X ) such that f ′(µx ) 6= 0
Then f (X ) ≈ f (µx ) + (X − µx )f ′(µx )
E(Y )=E(f (X ))≈ E(f (µx)) + E((X − µx)f′(µx))= f (µx)
V(Y )=V(f (X )) ≈ [f ′(µx)]2V(X ) = [f ′(µx)]
2σ2x
STAT 514 Topic 5 15
Variance-Stabilizing Transformation
Suppose response y is such that σ2y = g(µy )
Want to find f (y) such that V(f (y))≈ c
Using Delta method : V(f (y))≈ [f ′(µy )]2σ2
y
Therefore want to choose f such that [f ′(µy )]2g(µy ) ≈ c
Examples
g(µ) = µ (Poisson) f (y) =∫
1√µdµ → f (y) =
√y
g(µ) = µ(1− µ) (Binomial) f (y) =∫
1√µ(1−µ)
dµ → f (X ) = asin(√y)
g(µ) = µ2β (Box-Cox) f (y) =∫
µ−βdµ → f (y) = y1−β
g(µ) = µ2 (Box-Cox) f (y) =∫
1µdµ → f (y) = log y
STAT 514 Topic 5 16
Box-Cox Transformation
Perform analysis of variance on transformed y
yλ =
yλ−1
λyλ−1 λ 6= 0
y logy λ = 0
y is the geometric mean of the observations
y =
a∏
i=1
ni∏
j=1
yij
1/N
Find λ which minimizes SSE
Need to rescale yλ for direct comparison
STAT 514 Topic 5 17
Box-Cox λ Approximation
Box-Cox is assuming σ2y = Cµ2β
y
The appropriate transform is 1− β
Consider estimating β using the fact that
log(σy ) = log(C )/2 + β log(µy )
Regress sample estimates, log(Si) versus log(yi), and usethe slope as the estimate of β
STAT 514 Topic 5 18
trans.sas
title1 ’Increasing Variance Example’;
data one;
infile ’c:\saswork\data\boxcox.dat’; input trt resp;
proc glm data=one; class trt;
model resp=trt; output out=diag p=pred r=res;
title1 ’Residual Plot’;
symbol1 v=circle i=none; proc sort; by pred;
proc gplot data=diag; plot res*pred /frame;
proc univariate data=one noprint;
var resp; by trt; output out=two mean=mu std=sigma;
data three; set two;
logmu = log(mu); logsig = log(sigma);
proc reg; model logsig = logmu;
title1 ’Mean vs Std Dev’; symbol1 v=circle i=rl;
proc gplot; plot logsig*logmu / regeqn;
proc transreg data=one;
model boxcox(resp / lambda=-2 to 2 by .2) = class(trt);
run;
STAT 514 Topic 5 19
STAT 514 Topic 5 20
STAT 514 Topic 5 21
STAT 514 Topic 5 22
STAT 514 Topic 5 23
What if no transformation works?
Can be situations when there is not a power relationshipbetween the group variances and means
If residuals appear Normal, consider weighted ANOVA orlinear mixed model
If residuals skewed, consider a generalized linear modelwhere you assume an alternative distribution for y
Seek assistance if uncertain what to do or how to do it
STAT 514 Topic 5 24
Example: Soldering Study
Experiment designed to assess the strength of five typesof flux used in soldering wire boards
Forty units were used in the experiment
Units randomly and equally assigned to flux type (ni = 8)
Y strength
X type of flux
STAT 514 Topic 5 25
SAS Commands
data a1; infile ’u:\.www\datasets525\CH18TA02.txt’;
input strength type;
proc gplot; ***scatterplot;
plot strength*type;
proc glm;
class type;
model strength=type;
means type / hovtest=bf; ***Modified Levene test;
lsmeans type / stderr; ***Alternative to means;
run;
proc transreg;
model boxcox(strength / lambda=-2 to 2 by .2) = class(type);
run;
STAT 514 Topic 5 26
Scatterplot
STAT 514 Topic 5 27
Output
Source DF Sum of Squares Mean Square F Value Pr > F
Model 4 353.6120850 88.4030213 41.93 <.0001
Error 35 73.7988250 2.1085379
Cor Total 39 427.4109100
Brown and Forsythe’s Test for Homogeneity of strength Variance
ANOVA of Absolute Deviations from Group Medians
Sum of Mean
Source DF Squares Square F Value Pr > F
type 4 9.3477 2.3369 2.94 0.0341
Error 35 27.8606 0.7960
strength Standard
type LSMEAN Error Pr > |t|
1 15.4200000 0.5133880 <.0001
2 18.5275000 0.5133880 <.0001
3 15.0037500 0.5133880 <.0001
4 9.7412500 0.5133880 <.0001
5 12.3400000 0.5133880 <.0001
STAT 514 Topic 5 28
Log Transform Suggested?
STAT 514 Topic 5 29
SAS Commands
proc means data=a1; ***Obtain sample variances for weights;
var strength;
by type;
output out=a2 var=s2;
data a2; set a2; wt=1/s2;
data a3; merge a1 a2; by type;
proc glm data=a3; ***Weighted ANOVA;
class type;
model strength=type;
weight wt;
lsmeans type / stderr cl; ***Must use lsmeans over means here;
proc mixed data=a1; ***Diff variances for all types;
class type;
model strength=type / ddfm=kr;
repeated / group=type;
STAT 514 Topic 5 30
GLM Output
Sum of
Source DF Squares Mean Square F Value Pr > F
Model 4 324.2130988 81.0532747 81.05 <.0001
Error 35 35.0000000 1.0000000
Corrected Total 39 359.2130988
Least Squares Means
strength Standard
type LSMEAN Error Pr > |t| 95% Confidence Limits
1 15.4200000 0.4373949 <.0001 14.532041 16.307959
2 18.5275000 0.4429921 <.0001 17.628178 19.426822
3 15.0037500 0.8791614 <.0001 13.218957 16.788543
4 9.7412500 0.2887129 <.0001 9.155132 10.327368
5 12.3400000 0.2720294 <.0001 11.787751 12.892249
STAT 514 Topic 5 31
MIXED Output
Cov Parm Group Estimate
Residual type 1 1.5305
Residual type 2 1.5699
Residual type 3 6.1834
Residual type 4 0.6668
Residual type 5 0.5920
Fit Statistics
-2 Res Log Likelihood 122.1
AIC (smaller is better) 132.1
AICC (smaller is better) 134.2
BIC (smaller is better) 140.6
Null Model Likelihood Ratio Test
DF Chi-Square Pr > ChiSq
4 13.73 0.0082
STAT 514 Topic 5 32
MIXED Output
Type 3 Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
type 4 14.8 71.78 <.0001
Least Squares Means
Standard
Effect type Estimate Error DF t Value Pr > |t|
type 1 15.4200 0.4374 7 35.25 <.0001
type 2 18.5275 0.4430 7 41.82 <.0001
type 3 15.0038 0.8792 7 17.07 <.0001
type 4 9.7413 0.2887 7 33.74 <.0001
type 5 12.3400 0.2720 7 45.36 <.0001
STAT 514 Topic 5 33
Summary
GLM (weighted ANOVA) and MIXED analysis provide thesame factor level estimates and standard errors.
Without ddfm=kr, F test and df are also the same.
With ddfm=kr, F test similar to Welch F test and factorlevel df more reasonable.
Can consider groups of factor levels with similar variances
Group1=1 : Type 1 and 2Group1=2 : Type 3Group1=3 : Type 4 and 5
STAT 514 Topic 5 34
SAS Commands
data a1; ***Defining the groups;
set a1;
group1 = 3;
if type=1 | type=2 then group1=1;
if type=3 then group1=2;
proc mixed data=a1; ***Diff variances for each group1;
all types;
class type;
model strength=type / ddfm=kr;
repeated / group=group1;
STAT 514 Topic 5 35
MIXED Output
Cov Parm Group Estimate
Residual Group 1 1.5502
Residual Group 2 6.1834
Residual Group 3 0.6294
Fit Statistics
-2 Res Log Likelihood 122.1
AIC (smaller is better) 128.1
AICC (smaller is better) 128.9 ***Much better fit
BIC (smaller is better) 133.2
Null Model Likelihood Ratio Test
DF Chi-Square Pr > ChiSq
2 13.70 0.0011
STAT 514 Topic 5 36
MIXED Output
Type 3 Tests of Fixed Effects
Num Den
Effect DF DF F Value Pr > F
type 4 19.8 77.68 <.0001
Least Squares Means
Standard
Effect type Estimate Error DF t Value Pr > |t|
type 1 15.4200 0.4402 14 35.03 <.0001
type 2 18.5275 0.4402 14 42.09 <.0001
type 3 15.0038 0.8792 7 17.07 <.0001
type 4 9.7413 0.2805 14 34.73 <.0001
type 5 12.3400 0.2805 14 43.99 <.0001
STAT 514 Topic 5 37
Near-Zero/Truncated Values
Can have heavily skewed dist near zero
Concentration of rare contaminantNumber of defects in assembly lineNumber of birds at a given site
Measurements may also be truncated or censored
Concentration of contaminant (below detectable level)Lifetime of component (does not fail during study)
Transformations sometimes successful
log(x+.001),√x + .001
Should likely consider more advanced models
Assume non-Normal errors (generalized linear models)Zero-inflated or two-part models (e.g., ZIP model)
STAT 514 Topic 5 38
Background Reading
Normality assumption: Montgomery 3.4.1
Residual plots: Montgomery 3.4.2 - 3.4.4
Variance-stabilizing transformation: Montgomery 3.4.3
Box-Cox transformation: Montgomery 15.1.1
STAT 514 Topic 5 39