Overview of Robust Methods Analysis
Jinxia Ma
November 7, 2013
Contents
• What are robust methods
• Why robust methods
• How to conduct the robust methods analysis
• Apply robust analysis to your data
What are “robust methods”?
• Robust statistics are statistics that perform well for data drawn from a wide range of probability distributions, especially distributions that are not normal.
– Outliers
– Departures from parametric distributions
Why robust methods?
• What is the problem with standard methods?
– Example: Linear regression assumptions
• Linearity
• Independence of errors
• Errors are normally distributed
• Homoscedasticity
– Example: Comparing groups (ANOVA F-test)
• Errors have a common variance, are normally distributed, and are independent
Why robust methods?
– Example: Detecting differences among groups
• Problem 1: Heavy-tailed distributions
Figure 1: Despite the obvious similarity between the standard normal and contaminated normal distributions, the standard normal has variance 1 and the contaminated normal has variance 10.9.
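The gap between the two variances in Figure 1 can be checked numerically. The sketch below assumes the contaminated normal is the mixture often used for this kind of illustration: draw from N(0, 1) with probability 0.9 and from N(0, 10²) with probability 0.1. The mixture weights, the scale of 10, and all variable names are assumptions, not taken from the slide; they are chosen because they reproduce the stated variance of 10.9.

```r
# Assumed mixture: 90% N(0, 1) and 10% N(0, 10^2); not stated on the slide.
eps <- 0.1   # contamination probability (assumption)
k   <- 10    # standard deviation of the contaminating component (assumption)

# Both components have mean 0, so the mixture variance is the weighted
# average of the two component variances.
mix_var <- (1 - eps) * 1^2 + eps * k^2

# Simulation check: multiply each standard normal draw by 1 or by k.
set.seed(1)
n    <- 100000
z    <- rnorm(n)
sd_i <- ifelse(runif(n) < eps, k, 1)
x    <- z * sd_i

cat("theoretical variance:", mix_var, "\n")
cat("sample variance:     ", var(x), "\n")
```

Only one observation in ten comes from the wide component, which is why the density looks almost normal while the variance is roughly eleven times larger.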
Why robust methods?
– Example: Detecting differences among groups
• Problem 1: Heavy-tailed distributions
Figure 2: Left panel, power = 0.96. Right panel, power = 0.28 (n = 25 per group, Student’s T test).
Why robust methods?
– Example: Detecting differences among groups
• Problem 1: Heavy-tailed distributions
Figure 3: Left panel, a bivariate normal distribution, corr = .8. Middle panel, a bivariate normal distribution, corr = .2. Right panel, one marginal distribution is normal but the other is a contaminated normal, corr = .2.
Why robust methods?
– Example: Detecting differences among groups
• Problem 2: Assuming normality via the central limit theorem
Figure 4: The distribution of Student’s T, n = 25, when sampling from a (standard) lognormal distribution. The dashed line is the distribution under normality.
For Student’s T under normality: P(T ≤ −2.086) = P(T ≥ 2.086) = .025, E(T) = 0.
For “Lognormal T”: P(T ≤ −2.086) = .12, P(T ≥ 2.086) = .001, E(T) = −.54.
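The asymmetry described for Figure 4 can be explored by simulation. The following is a minimal sketch, assuming the T in the figure is the one-sample statistic centered at the true lognormal mean exp(0.5); the number of replications and the variable names are assumptions. The cutoff 2.086 and n = 25 are the values quoted in the caption. Simulated probabilities will vary with the seed and need not match the slide exactly.

```r
set.seed(1)
n    <- 25         # sample size from the caption
reps <- 20000      # number of simulated samples (assumption)
crit <- 2.086      # cutoff quoted in the caption
mu   <- exp(0.5)   # true mean of the standard lognormal distribution

# One-sample Student's T for each simulated lognormal sample.
t_stat <- replicate(reps, {
  x <- rlnorm(n)                      # meanlog = 0, sdlog = 1
  sqrt(n) * (mean(x) - mu) / sd(x)
})

left  <- mean(t_stat <= -crit)        # estimated lower-tail probability
right <- mean(t_stat >=  crit)        # estimated upper-tail probability

cat("P(T <= -crit):", left, "\n")
cat("P(T >=  crit):", right, "\n")
cat("mean of T:    ", mean(t_stat), "\n")
```

Under normality the two tail probabilities would be equal and the mean of T would be 0; with lognormal data the lower tail is much heavier than the upper tail and the mean of T is negative.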
Why robust methods?
– Example: Detecting differences among groups
• Problem 3: Heteroscedasticity
– The third fundamental insight is that violating the usual homoscedasticity assumption (i.e., the assumption that all groups have a common variance) is much more serious than once thought. Both relatively poor power and inaccurate confidence intervals can result.
How to test/compare robust methods?
– Example: Comparing dependent groups with missing values: an approach based on a robust method
• 1: Simulation
• 2: Bootstrap
How to test/compare robust methods?
– Example: Comparing dependent groups with missing values: an approach based on a robust method
• 1: Simulation
– g-and-h distribution
– Let Z be a random variable generated from a standard normal distribution. Then W has a g-and-h distribution, where
$$W = \frac{\exp(gZ) - 1}{g}\,\exp\!\left(\frac{hZ^2}{2}\right) \ \text{if } g > 0, \qquad W = Z\,\exp\!\left(\frac{hZ^2}{2}\right) \ \text{if } g = 0.$$
How to test/compare robust methods?
– Example: Comparing dependent groups with missing values: an approach based on a robust method
• 1: Simulation
– g-and-h distribution
» g = h = 0: standard normal
» g > 0: skewed; the bigger the value of g, the more skewed
» h > 0: heavy-tailed; the bigger the value of h, the more heavy-tailed
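Generating g-and-h data for a simulation takes only a few lines. The sketch below uses the standard definition of the transformation, W = (exp(gZ) − 1)/g · exp(hZ²/2) for g > 0 and W = Z · exp(hZ²/2) for g = 0, with Z standard normal. The function name `rgh`, the sample size, and the particular g and h values are hypothetical choices for illustration.

```r
# Draw n values from a g-and-h distribution by transforming
# standard normal draws. `rgh` is a hypothetical helper name.
rgh <- function(n, g = 0, h = 0) {
  z <- rnorm(n)
  core <- if (g == 0) z else (exp(g * z) - 1) / g   # g controls skewness
  core * exp(h * z^2 / 2)                           # h controls tail weight
}

set.seed(1)
normal_like <- rgh(10000)            # g = h = 0: standard normal
skewed      <- rgh(10000, g = 0.5)   # skewed, light tails
heavy       <- rgh(10000, h = 0.2)   # symmetric, heavy tails

cat("skewed: mean", mean(skewed), "median", median(skewed), "\n")
cat("variance, normal:", var(normal_like), " heavy-tailed:", var(heavy), "\n")
```

For the skewed sample the mean sits above the median, and the heavy-tailed sample has a clearly larger variance than the normal one, matching the roles of g and h listed above.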
How to test/compare robust methods?
• 1: Simulation
– g-and-h distribution
How to test/compare robust methods?
• 2: Bootstrap (B = 2000)
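The slide gives only the number of bootstrap samples, B = 2000. As one illustration of how a bootstrap is used with a robust estimator, the sketch below computes a percentile bootstrap confidence interval for a 20% trimmed mean. The choice of statistic, the function name `boot_ci_tmean`, and the data are assumptions; this is not necessarily the procedure used in the study.

```r
# Percentile bootstrap confidence interval for a trimmed mean.
# `boot_ci_tmean` is a hypothetical helper name.
boot_ci_tmean <- function(x, B = 2000, tr = 0.2, alpha = 0.05) {
  # Resample with replacement B times and recompute the trimmed mean.
  boot <- replicate(B, mean(sample(x, replace = TRUE), trim = tr))
  # The interval is the middle 1 - alpha portion of the bootstrap values.
  quantile(boot, c(alpha / 2, 1 - alpha / 2), names = FALSE)
}

set.seed(1)
x  <- c(rnorm(30, mean = 5), 40, 55)   # made-up data with two outliers
ci <- boot_ci_tmean(x)

cat("20% trimmed mean:", mean(x, trim = 0.2), "\n")
cat("95% percentile bootstrap CI:", ci[1], "to", ci[2], "\n")
```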
Robust solutions
– Alternate measures of location
• One way of dealing with outliers is to replace the mean with alternative measures of location
– Median
– Trimmed mean
– Winsorized mean
– M-estimator
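The first three measures in this list can be computed in base R. The following sketch uses a made-up sample with one outlier; the helper `winsorized_mean` is a hypothetical name written for this example, and the M-estimator is left out because it needs an iterative fit.

```r
# Winsorized mean: pull the g smallest and g largest values in to the
# nearest retained value, then average. g = floor(tr * n).
winsorized_mean <- function(x, tr = 0.2) {
  y <- sort(x)
  n <- length(y)
  g <- floor(tr * n)
  if (g > 0) {
    y[1:g] <- y[g + 1]
    y[(n - g + 1):n] <- y[n - g]
  }
  mean(y)
}

x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 100)   # made-up sample with one outlier

loc <- c(
  mean       = mean(x),
  median     = median(x),
  trimmed    = mean(x, trim = 0.2),   # drops the 2 lowest and 2 highest values
  winsorized = winsorized_mean(x, 0.2)
)
print(loc)
```

For this sample the mean is 14.5, pulled upward by the single value of 100, while the median, 20% trimmed mean, and 20% Winsorized mean are all 5.5.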
Robust solutions
– Transformations
• A simple way of dealing with skewness is to transform the data.
– Logarithms
– Simple transformations do not deal effectively with outliers
– The resulting distributions can remain highly skewed
Robust solutions
– Nonparametric regression
• Sometimes called smoothers.
• Imagine that in a regression situation the goal is to estimate the mean of Y, given that X = 6, based on n pairs of observations. The strategy is to focus on the observed X values close to 6 and use the corresponding Y values to estimate the mean of Y. Typically, smoothers give more weight to Y values for which the corresponding X values are close to 6. For pairs of points for which the X value is far from 6, the corresponding Y values are ignored.
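The idea of weighting nearby points and ignoring distant ones can be sketched as a locally weighted mean. The tricube weight function, the span of 2, the helper name `local_mean`, and the data below are all assumptions made for illustration; actual smoothers differ in how they choose weights and spans.

```r
# Estimate the mean of Y at X = x0 using only points with |X - x0| < span.
# Closer points get larger (tricube) weights; farther points get weight 0.
# `local_mean` is a hypothetical helper name.
local_mean <- function(x, y, x0, span) {
  d <- abs(x - x0) / span
  w <- ifelse(d < 1, (1 - d^3)^3, 0)
  if (sum(w) == 0) return(NA_real_)   # no observations near x0
  weighted.mean(y, w)
}

x <- c(1, 5, 6, 7, 20)          # made-up predictor values
y <- c(100, 9, 10, 11, -50)     # made-up responses

est <- local_mean(x, y, x0 = 6, span = 2)
print(est)
```

The points at X = 1 and X = 20 receive zero weight, so their extreme Y values have no effect on the estimate at X = 6.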
Robust solutions
– Robust measures of association
• Use some analog of Pearson’s correlation that removes or down-weights outliers
• Fit a regression line and measure the strength of the association based on this fit.
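One analog of Pearson's correlation that down-weights outliers is the Winsorized correlation: Winsorize each variable separately, then compute Pearson's correlation on the Winsorized values. A minimal base R sketch follows; the helper names `winsorize` and `winsorized_cor` and the data are assumptions for illustration.

```r
# Winsorize: values below the (g + 1)-th smallest are raised to it, and
# values above the (n - g)-th smallest are lowered to it, g = floor(tr * n).
winsorize <- function(x, tr = 0.2) {
  y <- sort(x)
  n <- length(x)
  g <- floor(tr * n)
  pmin(pmax(x, y[g + 1]), y[n - g])
}

# Pearson's correlation computed on the Winsorized values.
winsorized_cor <- function(x, y, tr = 0.2) {
  cor(winsorize(x, tr), winsorize(y, tr))
}

x <- 1:10
y <- c(1:9, -100)   # made-up data: perfectly linear except for one outlier

cat("Pearson's r:            ", cor(x, y), "\n")
cat("20% Winsorized correlation:", winsorized_cor(x, y), "\n")
```

A single outlier is enough to make Pearson's r negative here, while the Winsorized correlation stays positive, in line with the trend of the other nine points.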
Practical Illustration of Robust Methods
– Analysis of a lifestyle intervention for older adults
• N = 364
• This trial was conducted to compare a six-month lifestyle intervention to a no-treatment control condition
• Outcome variables: (a) eight indices of health-related quality of life; (b) depression; (c) life satisfaction.
• Boxplots in a preliminary analysis revealed outliers in all outcome variables.
Practical Illustration of Robust Methods
– Analysis of a lifestyle intervention for older adults
Figure 5: The median regression line for predicting physical function based on the number of session hours (R function: qsmcobs).
– Pearson’s r = .178 (p = .001). However, the association appears to be nonlinear.
Practical Illustration of Robust Methods
– Analysis of a lifestyle intervention for older adults
Figure 6: The median regression line for predicting physical composite based on the number of session hours (R function: qsmcobs).
– For 0 to 5 hours, r = −.071 (p = .257).
– For 5 hours or more, r = .25 (p = .045).
Practical Illustration of Robust Methods
– Analysis of a lifestyle intervention for older adults
Table 1: Measures of association between hours of treatment and the variables listed in column 1 (n = 364).

| Variable | Pearson’s r | p | rw * | p | re ** |
|---|---|---|---|---|---|
| PHYSICAL FUNCTION | 0.178 | 0.001 | 0.135 | 0.016 | 0.048 |
| BODILY PAIN | 0.170 | 0.002 | 0.156 | 0.005 | 0.198 |
| GENERAL HEALTH | 0.209 | 0.0001 | 0.130 | 0.012 | 0.111 |
| VITALITY | 0.099 | 0.075 | 0.139 | 0.012 | 0.241 |
| SOCIAL FUNCTION | 0.112 | 0.043 | 0.157 | 0.005 | 0.228 |
| MENTAL HEALTH | 0.141 | 0.011 | 0.167 | 0.003 | 0.071 |
| PHYSICAL COMPOSITE | 0.200 | 0.0002 | 0.136 | 0.015 | 0.255 |
| MENTAL COMPOSITE | 0.095 | 0.087 | 0.149 | 0.007 | 0.028 |
| DEPRESSION | -0.022 | 0.694 | -0.132 | 0.018 | 0.134 |
| LIFE SATISFACTION | 0.086 | 0.125 | 0.118 | 0.035 | 0.119 |

rw * = 20% Winsorized correlation
Practical Illustration of Robust Methods
– Analysis of a lifestyle intervention for older adults
Table 2: P-values when comparing patients in an ethnically matched group to a non-matched group.

| Outcome | Welch’s test: p-value | Yuen’s test: p-value | d | dt | ξ |
|---|---|---|---|---|---|
| Physical Function | 0.1445 | 0.0469 | 0.212 | 0.310 | 0.252 |
| Bodily Pain | .01397 | <.0001 | 0.591 | 0.666 | 0.501 |
| Physical Composite | <.0001 | 0.0002 | 0.420 | 0.503 | 0.391 |
| Cognition | 0.0332 | 0.0091 | 0.415 | 0.408 | 0.308 |

Welch’s test: dealing with heteroscedasticity
Yuen’s test: based on trimmed means
No single method is always best.
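The two tests compared in this table can be run side by side in base R. Welch's test is available as `t.test` with `var.equal = FALSE`. Yuen's test is sketched here by hand from its usual definition (trimmed means, Winsorized variances, and Welch-style degrees of freedom); the helper names `winsorize` and `yuen_test` and the data are assumptions for illustration and are not the study's code or data.

```r
# Winsorize a sample: clamp the g lowest and g highest values, g = floor(tr * n).
winsorize <- function(x, tr = 0.2) {
  y <- sort(x)
  n <- length(x)
  g <- floor(tr * n)
  pmin(pmax(x, y[g + 1]), y[n - g])
}

# Yuen's two-sample test for equal trimmed means (hypothetical helper).
yuen_test <- function(x, y, tr = 0.2) {
  part <- function(v) {
    n <- length(v)
    h <- n - 2 * floor(tr * n)                        # observations left after trimming
    d <- (n - 1) * var(winsorize(v, tr)) / (h * (h - 1))
    list(tmean = mean(v, trim = tr), d = d, h = h)
  }
  a <- part(x)
  b <- part(y)
  stat <- (a$tmean - b$tmean) / sqrt(a$d + b$d)
  df   <- (a$d + b$d)^2 / (a$d^2 / (a$h - 1) + b$d^2 / (b$h - 1))
  list(statistic = stat, df = df, p.value = 2 * pt(-abs(stat), df))
}

set.seed(1)
x <- c(rnorm(28, mean = 0), 12, 15)   # made-up group with two outliers
y <- rnorm(30, mean = 1)              # made-up comparison group

welch <- t.test(x, y, var.equal = FALSE)
yuen  <- yuen_test(x, y, tr = 0.2)

cat("Welch's test p-value:", welch$p.value, "\n")
cat("Yuen's test p-value: ", yuen$p.value, "\n")
```

With `tr = 0` no trimming takes place and `yuen_test` reduces to Welch's test, which gives a quick consistency check on the sketch.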
Software
– R: www.r-project.org
– www.rcf.usc.edu/~rwilcox
– Example: comparing two groups
• > x1=read.table(file=" ")
• > x2=read.table(file=" ")
• > x<-list(x1,x2)
• > lincon(x,tr=0.2,alpha=0.05)
• lincon is a heteroscedastic test of d linear contrasts using trimmed means.
No single method is always best.
Thank you!