
Causal Models for Regression Modeling Strategies

Drew Griffin Levy

Regression Modeling Strategies Short Course, May 2020

Takeaways: Reasons to consider causal models for regression modeling in observational studies

1. Alternative approaches to variable selection

2. Deeper insight re. how causal inferences from associational models can be questionable

3. Identifying the minimum (and various) set of adjustments necessary for unbiased estimation of effects

4. Risk of inducing bias with statistical adjustment (collider stratification bias)

5. Clearly and explicitly communicating assumptions about justifications for model specification

Resources

• DAGitty - drawing and analyzing causal diagrams (DAGs): http://www.dagitty.net/

• Judea Pearl
  1. Causal Inference in Statistics: A Primer, 2016: https://www.wiley.com/en-us/Causal+Inference+in+Statistics%253A+A+Primer-p-9781119186847
  2. Causality: Models, Reasoning and Inference, 2009: http://bayes.cs.ucla.edu/BOOK-2K/
  3. The Book of Why: The New Science of Cause and Effect, 2018: http://bayes.cs.ucla.edu/WHY/

• Miguel Hernan
  1. The Causal Inference Book: https://www.hsph.harvard.edu/miguel-hernan/causal-inference-book/
  2. edX MOOC: Causal Diagrams: Draw Your Assumptions Before Your Conclusions: https://www.edx.org/course/causal-diagrams-draw-your-assumptions-before-your-conclusions

• Modern Epidemiology, 3rd Ed. Rothman, Greenland, Lash: Chapter 12–Causal Diagrams

• Causal Diagrams for Epidemiologic Research. S. Greenland, J. Pearl, J. Robins. Epidemiology 1999;10:37-48. https://pdfs.semanticscholar.org/975a/68eda501a91f5c53793a9916c255b9b30145.pdf

• Developing a Protocol for Observational Comparative Effectiveness Research: A User's Guide: Supplement 2, Use of Directed Acyclic Graphs: https://www.ncbi.nlm.nih.gov/books/NBK126190/pdf/Bookshelf_NBK126190.pdf

The Epistemological Arc

[Diagram: The Epistemological Arc, with labels NATURE, POPULATION, SAMPLE, DATA, ANALYSIS, INFERENCE, DECISIONS & ACTION]

“What we observe is not nature itself, but nature exposed to our method of questioning.” -Werner Heisenberg

Analytic bias
• Model selection
  – E(β̂ | β̂ “significant”) ≠ β_true
• Model misspecification
• Over-fitting
• Residual confounding
• Arbitrary categorization
• Collider bias

Conventional statistical methods

• Risk of selection bias; confounding by indication
• Importance of study / experimental design
• Omitted variables
• Missing data
• Measurement issues
• Information bias

Likelihood: P(data | θ)

Uncertainties
• Model specification
• Model selection
• Assumptions re. distributions
• Cognition/psychology
• Intentions
• Motivations

Association vs. Causation

Belief ~ Evidence

P(θ | data)
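The model-selection item listed under analytic bias, E(β̂ | β̂ “significant”) ≠ β_true, can be illustrated with a small simulation. This is only a sketch: the true slope, sample size, number of simulated studies, and the |t| > 2 cutoff are all hypothetical choices made for illustration, not values from the course.

```python
import numpy as np

rng = np.random.default_rng(0)

beta_true = 0.2        # hypothetical true slope
n, n_sims = 50, 5000   # hypothetical sample size and number of simulated studies

estimates = np.empty(n_sims)
significant = np.empty(n_sims, dtype=bool)

for i in range(n_sims):
    x = rng.normal(size=n)
    y = beta_true * x + rng.normal(size=n)
    # Simple linear regression of y on x, computed by hand
    xc, yc = x - x.mean(), y - y.mean()
    beta_hat = (xc @ yc) / (xc @ xc)
    resid = yc - beta_hat * xc
    se = np.sqrt(resid @ resid / (n - 2) / (xc @ xc))
    estimates[i] = beta_hat
    significant[i] = abs(beta_hat / se) > 2.0  # rough two-sided 5% cutoff

mean_all = estimates.mean()               # close to beta_true
mean_sig = estimates[significant].mean()  # inflated away from beta_true

print(f"true slope                    : {beta_true}")
print(f"mean of all estimates         : {mean_all:.3f}")
print(f"mean of 'significant' only    : {mean_sig:.3f}")
```

Averaged over all simulated studies the estimator is unbiased, but the subset that passes the significance filter overstates the effect, which is the sense in which selecting on significance biases the retained coefficients.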

We can & will be fooled by data!

“The data are profoundly dumb!” -Judea Pearl, Book of Why

• Data help to describe reality, albeit imperfectly
• It is a prevalent mistake to believe that “all the answers [information] are in the data”
• Observations are not objective; Nature is indifferent to furnishing noise vs. signal; the computer cannot divine causes; good-faith science requires humility
• Relying on statistical approaches to identify variables for adjustment and control of confounding can be problematic

Alternative PoV: how to identify variables for unbiased estimation

1. How to estimate a primary effect (e.g., treatment, Tx) without bias
   • Confounding is a causal phenomenon
   • Confounding: P(Y|X) ≠ P(Y|do(X))

2. Identifying the set(s) of adjustments necessary for unbiased estimation of specific effects

3. Causal models also elucidate
   • Adjustments that induce bias!
   • Selection bias
   • Much else
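The statement that confounding means P(Y|X) ≠ P(Y|do(X)) can be made concrete with a toy simulation. The sketch below assumes a hypothetical model with a binary common cause Z (Z → X, Z → Y, X → Y); all probabilities are invented for illustration. Observing X = 1 and setting X = 1 by intervention give different answers because the intervention removes the arrow Z → X.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

def simulate(do_x=None):
    """Draw (z, x, y) from a hypothetical model Z -> X, Z -> Y, X -> Y.

    If do_x is given, X is set by intervention, which cuts the arrow Z -> X.
    """
    z = rng.random(n) < 0.5
    if do_x is None:
        # Observational regime: Z influences who receives X
        x = rng.random(n) < np.where(z, 0.8, 0.2)
    else:
        # Interventional regime: everyone gets the same X regardless of Z
        x = np.full(n, bool(do_x))
    p_y = 0.1 + 0.2 * x + 0.5 * z
    y = rng.random(n) < p_y
    return z, x, y

# Observational: P(Y=1 | X=1), analytically 0.3 + 0.5 * P(Z=1 | X=1) = 0.70
z, x, y = simulate()
p_y_given_x1 = y[x].mean()

# Interventional: P(Y=1 | do(X=1)), analytically 0.3 + 0.5 * P(Z=1) = 0.55
_, _, y_do = simulate(do_x=1)
p_y_do_x1 = y_do.mean()

print(f"P(Y=1 | X=1)     ~ {p_y_given_x1:.3f}")
print(f"P(Y=1 | do(X=1)) ~ {p_y_do_x1:.3f}")
```

The gap between the two quantities is the confounding induced by Z in this made-up model; it is a property of the causal structure, not of the sample size.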

“What causes say about data”

• Causal diagrams show how causal relations are expected to translate into associations & independencies

1. Initially, associations & independencies derived from subject matter knowledge are posited in a DAG

2. Then, given the posited model, associations & independencies observed in the data are computed

• A credible causal model will reconcile the associations & independencies observed with the constraints provided by the posited causal model
• Subject to further criticism: revision, qualification, elaboration, updating, refinement

Basic structures in causal models

1. Causal relationship
2. Chains
3. Mediation
4. Confounder
5. Collider

Cause-effect

DAGs are both causal models and statistical models (i.e., models that represent associations and independencies)

Causal effects imply associations: e.g., P(Y|X) ≠ P(Y)

Lack of causal effects implies independencies: e.g., P(Y|X) = P(Y)

*Figures, examples and propositions appropriated from Hernan’s Causal Diagrams: Draw Your Assumptions Before Your Conclusions


Causal structures: Chains, Junctions and Paths

• Mediation

• Direct vs. indirect effects
• Total effect

• Conditional independence:
  • In general, independence means: Pr(Y=y|X=x) = Pr(Y=y)
  • Conditional on B, A carries no further information about Y: Pr(Y=y|A=a, B=b) = Pr(Y=y|B=b)
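The conditional independence Pr(Y=y|A=a, B=b) = Pr(Y=y|B=b) can be checked numerically. The sketch below assumes a linear Gaussian chain A → B → Y with made-up coefficients; in that special case, a zero regression coefficient for A after including B corresponds to the conditional independence. The `ols` helper is a hypothetical convenience function, not part of any package used in the course.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100_000

# Hypothetical chain A -> B -> Y: A affects Y only through B
a = rng.normal(size=n)
b = 0.8 * a + rng.normal(size=n)
y = 0.5 * b + rng.normal(size=n)

def ols(outcome, *predictors):
    """Least-squares slopes of outcome on the predictors (intercept dropped)."""
    design = np.column_stack([np.ones(len(outcome)), *predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]

# Marginally, A and Y are associated (total effect 0.8 * 0.5 = 0.4)
(coef_a_marginal,) = ols(y, a)

# Conditional on the mediator B, A adds nothing about Y
coef_a_given_b, coef_b = ols(y, a, b)

print(f"slope of A, unadjusted     : {coef_a_marginal:.3f}")
print(f"slope of A, adjusted for B : {coef_a_given_b:.3f}")
```

Note the practical consequence: adjusting for a mediator removes the indirect effect, so the adjusted coefficient estimates the direct effect (here zero), not the total effect.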



Confounders

• Causal structure with common causes

• Bias: A and Y are not expected to be independent

• Bias: estimation of magnitude of association of A and Y
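A minimal sketch of how a common cause distorts the magnitude of an estimated association, assuming a hypothetical linear model in which L causes both A and Y and the true effect of A on Y is 0.3. The variable names, coefficients, and the `slopes` helper are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000

# Hypothetical structure: L -> A, L -> Y, and A -> Y with true effect 0.3
l = rng.normal(size=n)
a = 0.7 * l + rng.normal(size=n)
y = 0.3 * a + 1.0 * l + rng.normal(size=n)

def slopes(outcome, *predictors):
    """Least-squares slopes of outcome on the predictors (intercept dropped)."""
    design = np.column_stack([np.ones(len(outcome)), *predictors])
    coef, *_ = np.linalg.lstsq(design, outcome, rcond=None)
    return coef[1:]

# Crude estimate mixes the effect of A with the back-door path through L
(crude,) = slopes(y, a)

# Adjusting for the common cause L recovers the true effect in this model
adjusted, _ = slopes(y, a, l)

print(f"true effect of A   : 0.300")
print(f"crude estimate     : {crude:.3f}")     # about 0.77 analytically
print(f"adjusted for L     : {adjusted:.3f}")  # about 0.30
```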



Colliders & Collider-stratification bias

• Paths with convergent arrows

• When colliders are not conditioned on, they block pathways

• When colliders are conditioned on, they open pathways

• Thus adjustment can inadvertently induce bias!

• The prevalence of these collider structures is likely underappreciated.

Stratifying on a collider is a major culprit in systematic bias

Selection Bias and collider-stratification bias

• Common effects do not create an association, unless conditioned on.

• When there is a component of the association due to selecting a subset of the population, we say that there is selection bias.
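Both points, that a common effect creates no association until it is conditioned on and that selecting a subset can induce one, can be seen in a short simulation. This sketch assumes a hypothetical collider C = A + Y + noise with A and Y generated independently; the selection rule C > 1 is an arbitrary illustration of restricting the analysis to a selected subset.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# A and Y are generated independently: no causal effect, no common cause
a = rng.normal(size=n)
y = rng.normal(size=n)

# C is a common effect (collider) of A and Y
c = a + y + rng.normal(size=n)

# In the full population the collider path is blocked: correlation near zero
corr_all = np.corrcoef(a, y)[0, 1]

# Selecting on C (e.g., only units with C > 1 enter the study) opens the path
selected = c > 1.0
corr_selected = np.corrcoef(a[selected], y[selected])[0, 1]

print(f"corr(A, Y), everyone      : {corr_all:.3f}")
print(f"corr(A, Y), selected C > 1: {corr_selected:.3f}")
```

Within the selected subset A and Y become negatively associated even though neither affects the other, which is the selection-bias component of the association described above.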



Deconfounding → P(Y|do(X))

• Distinguish concepts: confounding, confounder, and “deconfounding”
• “d-separation”: for any given pattern of paths in the causal model, tells us what pattern of dependencies and independencies we should expect in the data
• “Back-door criterion” for bias evaluation indicates possible sets of variables for unbiased estimation
• Identify the set of adjustments necessary for unbiased estimation of effects
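The path-blocking rules behind d-separation and the back-door criterion can be sketched in a few lines of plain Python. This is a brute-force illustration for small graphs under stated assumptions, not the algorithm DAGitty uses: the example DAG (Z a common cause, M a mediator, C a collider) and all function names are hypothetical.

```python
from itertools import combinations

# Hypothetical DAG as parent -> list of children
DAG = {
    "Z": ["X", "Y"],   # Z is a common cause of X and Y
    "X": ["M", "C"],   # X affects Y through M; C is a common effect
    "M": ["Y"],
    "Y": ["C"],
    "C": [],
}

def descendants(dag, node):
    """All nodes reachable from node by following arrows."""
    seen, stack = set(), [node]
    while stack:
        for child in dag[stack.pop()]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

def paths(dag, start, end):
    """All simple paths between start and end, ignoring arrow direction."""
    nbrs = {v: set(children) for v, children in dag.items()}
    for v, children in dag.items():
        for child in children:
            nbrs[child].add(v)

    def walk(path):
        if path[-1] == end:
            yield path
            return
        for nxt in nbrs[path[-1]]:
            if nxt not in path:
                yield from walk(path + [nxt])

    yield from walk([start])

def blocked(dag, path, z):
    """True if conditioning on the set z blocks this path."""
    for prev, node, nxt in zip(path, path[1:], path[2:]):
        is_collider = node in dag[prev] and node in dag[nxt]
        if is_collider:
            # A collider blocks unless it or one of its descendants is in z
            if node not in z and not (descendants(dag, node) & z):
                return True
        elif node in z:
            # A chain or fork node blocks when it is conditioned on
            return True
    return False

def satisfies_backdoor(dag, x, y, z):
    """Back-door criterion: no descendants of x in z, all back-door paths blocked."""
    z = set(z)
    if z & descendants(dag, x):
        return False
    backdoor = [p for p in paths(dag, x, y) if x in dag[p[1]]]  # starts with arrow into x
    return all(blocked(dag, p, z) for p in backdoor)

# Enumerate every candidate adjustment set for the effect of X on Y
candidates = [v for v in DAG if v not in ("X", "Y")]
valid = [set(s) for r in range(len(candidates) + 1)
         for s in combinations(candidates, r)
         if satisfies_backdoor(DAG, "X", "Y", s)]
print(valid)
```

In this toy graph only {Z} qualifies: the empty set leaves the back-door path through Z open, while any set containing the mediator M or the collider C includes a descendant of X and is excluded.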

DAGitty - drawing and analyzing causal diagrams (DAGs) (www.dagitty.net/)

Staplin N, Herrington WG, Judge PK, Reith CA, Haynes R, Landray MJ, Baigent C, Emberson J. Use of Causal Diagrams to Inform the Design and Interpretation of Observational Studies: An Example from the Study of Heart and Renal Protection (SHARP). Clin J Am


https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5338700/

“Draw your assumptions before your conclusions.” —M. Hernan

• Causal diagrams help us summarize what we know about a problem and communicate our assumptions about its causal structure.
• Causal diagrams help us diagnose biases in causal inference
• Causal diagrams help you organize your expert knowledge visually; and therefore, they help you draw your assumptions before your conclusions.


Proposed process for using SCMs and DAGs

1. Think hard about the research question and problem of effect identification

2. Develop DAGs based on subject matter knowledge without looking at data: do not contort the DAG based on data availability

3. Do the causal calculus in DAGitty to identify the minimal set of adjustments necessary for unbiased effect estimation

4. Do analysis and reconcile observations with causal model (this is science)

5. Publish the DAG with the research report.

Takeaways: Reasons to consider causal models for regression modeling in non-randomized studies

1. Better approaches to variable selection

2. Deeper insight re. how causal inferences from associational models can be questionable

3. Identifying the minimum set of adjustments necessary for unbiased (unconfounded) estimation of effects

4. Risk of collider stratification bias

5. Clearly and explicitly communicating assumptions about justifications for model specification.

