Generalized Estimating Equations (GEE): A Modern Love Story

April 18, 2011. DαSAL. Brandi Stupica. Data for today on the H: drive in the DaSAL folder: GEE Talk Data_041811.sav

Upload: rex

Post on 31-Jan-2016


TRANSCRIPT

Page 1: Generalized Estimating Equations (GEE): A Modern Love Story

April 18, 2011, DαSAL

Brandi Stupica

Data for today on the H: drive in the DaSAL folder: GEE Talk Data_041811.sav

Page 2: PART I.

• What are generalized estimating equations?
• Applications
• Why you should love GEEs

Page 3: What are Generalized Estimating Equations (GEE)?

• Extension of the Generalized Linear Model (GZLM), which is an extension of the General Linear Model (GLM)
  – GLM analyzes models with normally distributed DVs that are linearly linked to predictors
  – GZLM extends GLM to analyze non-normally distributed DVs that may be non-linearly linked to predictors
    • Easily handles interactions between discrete and continuous IVs
    • Cannot analyze correlated, non-independent, clustered, nested, repeated measures, within-subjects data
  – GEE extends GZLM and analyzes correlated data with
    • Normal and non-normal DVs
    • DVs that are linearly or non-linearly linked to IVs
    • Full factorial models with any combo of discrete and continuous IVs
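For reference, the "estimating equations" in the name can be written out. This is the standard textbook form rather than anything shown on the slide, and every symbol below is an assumption introduced here: y_i is the vector of repeated responses for subject i, μ_i its model-implied mean, D_i the derivative of μ_i with respect to the coefficients β, A_i the diagonal matrix of variance functions, φ a scale parameter, and R(α) the working correlation matrix.

```latex
\[
U(\beta) \;=\; \sum_{i=1}^{N} D_i^{\top} V_i^{-1}\bigl(y_i - \mu_i(\beta)\bigr) \;=\; 0,
\qquad
V_i \;=\; \phi\, A_i^{1/2}\, R(\alpha)\, A_i^{1/2},
\qquad
D_i \;=\; \frac{\partial \mu_i}{\partial \beta^{\top}} .
\]
```

With R(α) set to the identity matrix and one observation per subject, these reduce to the usual GZLM score equations, which is the sense in which GEE extends GZLM.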

Page 4: Application of GEE

• Nested data
  – Dyadic relationships
  – Family studies
  – School and organizational studies
• Repeated measures
  – Longitudinal data analysis
• Within subjects designs
  – Pre/post designs

Page 5: Why You Should Love GEEs for Correlated Data

• Compared to rANOVA
  – Doesn’t assume DV is normal or that it is linearly linked to predictors
    • Can model DVs that are binomial, multinomial, Poisson, negative binomial, and more!
  – Can model interactions between factors and covariates with ease
• Compared to Linear Mixed Models
  – Doesn’t require that repeated responses have multivariate normal distribution
    • Unlikely to meet this assumption when DV is binary or count data
• Rather than combining multiple assessments, analyze with improved power by including within Ss factor
• Uses all available data as default rather than complete cases only
• Extraordinary flexibility can streamline results sections

Page 6: PART II. Conducting a GEE

Page 7: Conducting a GEE: First Step

• Arrange your data in “long form”

A. How data usually look
B. How data need to look for GEE
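The move from layout A (one row per subject) to layout B (one row per subject per occasion) can be sketched in a few lines. The talk does this through the SPSS restructuring dialogs; the Python below is only an illustration, and the column names (`id`, `group`, `score_t1`, ...) and values are hypothetical, not taken from the talk's data file.

```python
# Hypothetical wide-form data: one row per subject, one column per occasion.
wide_rows = [
    {"id": 1, "group": "a", "score_t1": 3, "score_t2": 5, "score_t3": 4},
    {"id": 2, "group": "b", "score_t1": 2, "score_t2": None, "score_t3": 6},
]


def wide_to_long(rows, id_vars, prefix, index_name="time", value_name="score"):
    """Turn one row per subject into one row per subject per occasion."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if not column.startswith(prefix):
                continue  # not a repeated-measure column
            occasion = int(column[len(prefix):])  # "score_t2" -> 2
            long_row = {name: row[name] for name in id_vars}
            long_row[index_name] = occasion
            long_row[value_name] = value
            long_rows.append(long_row)
    # Keep each subject's occasions together and in time order.
    long_rows.sort(key=lambda r: (r[id_vars[0]], r[index_name]))
    return long_rows


long_rows = wide_to_long(wide_rows, id_vars=["id", "group"], prefix="score_t")
for r in long_rows:
    print(r)
```

Subject 2's missing second occasion stays in the long file as a row with no score; the subject's other two occasions remain usable, which is the point of the "all available data" bullet earlier in the talk.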

Page 8: Getting from A to B: Restructuring Your Data

Page 9: Restructuring Your Data

Page 10: Restructuring Your Data

Page 11: Restructuring Your Data

Page 12: Restructuring Your Data

Page 13: Restructured Data

Page 14: Conducting a GEE Analysis

Page 15: Selecting the Model Type

• Dozens of model combinations with GEE
  – DV can be discrete, any of several distributions, and nonlinearly linked to IVs
• Must select distribution of DV and link function
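To make "distribution plus link function" concrete, here is a minimal sketch of three common pairings (normal with identity, binomial with logit, Poisson with log). The link maps the mean of the DV onto the scale of the linear predictor, and the inverse link maps it back. The function and dictionary names are made up for this illustration and are not SPSS options.

```python
import math

# Common distribution/link pairings (illustrative, not an exhaustive list).
DEFAULT_LINK = {"normal": "identity", "binomial": "logit", "poisson": "log"}

# Inverse links: turn the linear predictor (eta) back into a mean of the DV.
INVERSE_LINK = {
    "identity": lambda eta: eta,
    "logit": lambda eta: 1.0 / (1.0 + math.exp(-eta)),
    "log": lambda eta: math.exp(eta),
}


def predicted_mean(distribution, intercept, slope, x):
    """Predicted mean of the DV for one predictor value x."""
    eta = intercept + slope * x  # linear predictor
    return INVERSE_LINK[DEFAULT_LINK[distribution]](eta)


print(predicted_mean("normal", 1.0, 2.0, 3.0))    # mean on the raw scale
print(predicted_mean("binomial", 0.0, 1.0, 0.0))  # a probability
print(predicted_mean("poisson", 0.0, 1.0, 0.0))   # an expected count
```

The same coefficients mean different things under different links: a slope is a change in the mean under identity, a change in log-odds under logit, and a change in the log of the expected count under log.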

Page 16: Response Variable

• Also known as outcome variable, DV
• Category order is for multinomial DVs
• For binary outcomes, can specify reference category

Page 17: Predictors

• Options for factors allow specification of the reference category and how to handle missing data

Page 18: Model

• Full factorial is a few clicks away

Page 19: Estimation and Statistics

Page 20: EM Means

• Several options for controlling for family-wise error
• Several options for contrasts, including
  – Simple
  – Pairwise
  – Deviation
  – Difference
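As an illustration of what a family-wise error adjustment does to pairwise comparisons, here is a sketch of two widely used corrections, Bonferroni and the sequential (Holm) Bonferroni. The slide does not say which adjustments the talk used, and the p-values below are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of comparisons, capped at 1."""
    k = len(p_values)
    return [min(1.0, p * k) for p in p_values]


def sequential_bonferroni(p_values):
    """Holm step-down adjustment: the smallest p is multiplied by k, the
    next by k - 1, and so on, keeping adjusted values non-decreasing."""
    k = len(p_values)
    order = sorted(range(k), key=lambda i: p_values[i])
    adjusted = [0.0] * k
    running_max = 0.0
    for rank, i in enumerate(order):
        candidate = min(1.0, (k - rank) * p_values[i])
        running_max = max(running_max, candidate)
        adjusted[i] = running_max
    return adjusted


# Hypothetical unadjusted p-values for three pairwise comparisons.
raw = [0.01, 0.04, 0.03]
print(bonferroni(raw))
print(sequential_bonferroni(raw))
```

The sequential version is never more conservative than plain Bonferroni, which is why it is often preferred when there are many pairwise comparisons.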

Page 21: Save, Export, and Cross Your Fingers

Page 22: Results: Descriptive Information

Page 23: Detour to Explain the Relevance of Goodness of Fit

Page 24: Working Correlation Matrix

• What is a working correlation matrix?
  – Correlated data could be correlated many ways
  – Specify in the beginning the assumptions that should be made about how correlated data are correlated
  – “Working” comes from the structure being re-estimated at each iteration
• GEE is robust to misspecification of the working correlation structure
• Then, why bother picking the best one?
  – Small gain in efficiency by selecting correct underlying structure
• In the “Repeated” tab I picked Unstructured correlation matrix
  – Why?

Page 25: Working Correlation Matrix Options

• Unstructured
  – No assumption about relative magnitude of the correlation between any two pairs of observations
  – Must estimate many parameters
  – Most efficient and conservative but can lead to poor estimates with small samples
• Independent
  – Assumes measurements for the repeated measure are uncorrelated
  – Default in SPSS
  – 1’s on the diagonal and 0’s off the diagonal
  – Signifies variables correlated with themselves at any given time but not correlated with measurements at other times
  – Illogical assumption and often wrong given that data are correlated and non-independent by nature!!!
  – Thus, I always start with something other than independent, and choose unstructured because most conservative, efficient, and makes no assumptions
• AR(1): Auto-regressive, order 1
  – Correlation diminishes exponentially over time
  – Assumes equal time intervals
  – 1’s on the diagonal; alpha for observations one apart; alpha-squared for two apart; alpha-cubed for three apart, and so on
• Exchangeable
  – Correlation does not change with time
  – Correlations for within-subjects variables homogeneous
  – 1’s on the diagonal and equal correlation for all off-diagonal elements
• M-dependent
  – Observations up to M time points apart are correlated; beyond M the correlation drops to zero
  – 1’s on the diagonal, a common correlation for all pairs the same number of time points apart (up to M), and 0 for pairs more than M time points apart
  – Researcher specifies M
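The structures above are easier to compare when written out as matrices. Here is a minimal sketch for a subject measured at n equally spaced occasions. The function names are made up for this illustration. The M-dependent version follows the common software convention of one correlation per lag up to M and zero beyond, which is an assumption here. Unstructured has no formula, so the sketch only counts how many correlations it must estimate.

```python
def independent(n):
    """1's on the diagonal, 0's everywhere else."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def exchangeable(n, alpha):
    """The same correlation alpha for every pair of occasions."""
    return [[1.0 if i == j else alpha for j in range(n)] for i in range(n)]


def ar1(n, alpha):
    """alpha for occasions one apart, alpha**2 for two apart, and so on."""
    return [[alpha ** abs(i - j) for j in range(n)] for i in range(n)]


def m_dependent(n, lag_alphas):
    """lag_alphas[k - 1] is the correlation at lag k; M = len(lag_alphas).
    Pairs more than M occasions apart are treated as uncorrelated."""
    m = len(lag_alphas)

    def cell(i, j):
        lag = abs(i - j)
        if lag == 0:
            return 1.0
        return lag_alphas[lag - 1] if lag <= m else 0.0

    return [[cell(i, j) for j in range(n)] for i in range(n)]


def n_unstructured_parameters(n):
    """Unstructured estimates a separate correlation for every pair."""
    return n * (n - 1) // 2


for row in ar1(4, 0.5):
    print(row)
print(n_unstructured_parameters(4))
```

The parameter counts explain the small-sample warning for unstructured: with n occasions it estimates n(n - 1)/2 correlations, while exchangeable and AR(1) each estimate one.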

Page 26: Choosing the Best Fitting Working Correlation Matrix

• Run model for different working correlation structure assumptions, choose the one assumption with the lowest QIC value
• But, wait…What is the QICC?

Page 27: Bonus! Choosing the Best Subset of Predictors

• QICC used for choosing best subset of predictors
• Penalizes for model complexity
• Run a model and a nested model dropping one of the predictors, then compare QICC values
• Lower QICC indicates better fit
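The slides do not give formulas for the two criteria. As a hedged sketch, the usual definitions (due to Pan, 2001) are shown below. The notation is an assumption introduced here: Q is the quasi-likelihood evaluated under the independence structure at the coefficients estimated with working correlation R, Ω̂_I is the inverse of the model-based covariance estimate under independence, V̂_R is the robust covariance estimate under R, and p is the number of regression parameters.

```latex
\[
\mathrm{QIC}(R) \;=\; -2\,Q\bigl(\hat\beta_R;\, I\bigr) \;+\; 2\,\operatorname{tr}\bigl(\hat\Omega_I\, \hat V_R\bigr),
\qquad
\mathrm{QICC}(R) \;=\; -2\,Q\bigl(\hat\beta_R;\, I\bigr) \;+\; 2p .
\]
```

Read this way, QIC's penalty depends on the working correlation structure, which is why it is used to compare structures, while QICC's penalty is simply twice the number of predictors, which is why it is used to compare subsets of predictors.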

Page 28: Test of Model Effects

Page 29: Parameter Estimates

Page 30: Estimated Marginal Means and Pairwise Comparisons

Page 31: Continuous Normal DV

• Example if time

Page 32: More Information on GEEs

• Hardin, J. W., & Hilbe, J. M. (2003). Generalized estimating equations. Boca Raton, FL: Chapman and Hall/CRC Press.
• Norusis, M. (2011). IBM SPSS Statistics 19 Advanced Statistical Procedures Companion. Upper Saddle River, NJ: Pearson.
• http://faculty.chass.ncsu.edu/garson/PA765/gzlm_gee.htm

Page 33: Estimated Marginal Means

Page 34: EM Means