Identification of Molecular Mechanisms that Drive Interindividual Variability Using
Mediation Analysis
Joshua Millstein, Assistant Professor
Biostatistics, Preventive Medicine
University of Southern California
Example: Food Allergy
rs7192, rs9275596
HLA-DRB1, HLA-DQB1 DNA Methylation
Other Exposures?
Peanuts IgE reactivity
HLA-DRB1, HLA-DQB1 Gene Expression?
Other Exposures?
Hong, et al. 2015.
Example: Food Allergy
Hong, et al. 2015.
DNA : Outcome (GWAS)
DNA : Meth, Meth : Outcome
Causal Inference Test (CIT)
Example: Personalized Diabetes Therapy
rs552668
ADRA2A Expression
Yohimbine
Metformin T2D Control
Tang, et al. 2014.
Other Exposures?
Example: Personalized Diabetes Therapy
Tang, et al. 2014.
DNA : Expr, DNA : Outcome, Expr : Outcome
CIT
Yohimbine
Example: Immune Response to Influenza Vaccination
Mult. SNPs
Expr 20 Genes
Other Exposures?
Infl. Vaccine Antibody Response
Franco, et al. 2013.
Example: Immune Response to Influenza Vaccination
Franco, et al. 2013.
Vaccine : mRNA, DNA : mRNA
CIT
mRNA : Antibody Response
Causality Conditions
1) L causes T: L → T
2) L explains variation in G not explained by T: L ~ G | T
3) G explains variation in T not explained by L: G ~ T | L
4) The predictive power of L on T is explained by G: L ⊥ T | G
Millstein, et al. 2009. BMC Genetics
Component Tests
A working mathematical definition of ‘causal’ is described by a set of conditions within a linear modeling framework:
$T = \alpha_1 + \beta_1 L + \varepsilon_1$
$G = \alpha_2 + \beta_2 L + \beta_3 T + \varepsilon_2$
$T = \alpha_3 + \beta_4 G + \beta_5 L + \varepsilon_3$
Four component hypothesis tests:
1) $H_0: \beta_1 = 0$, $H_1: \beta_1 \neq 0$
2) $H_0: \beta_2 = 0$, $H_1: \beta_2 \neq 0$
3) $H_0: \beta_4 = 0$, $H_1: \beta_4 \neq 0$
4) $H_0: \beta_5 \neq 0$, $H_1: \beta_5 = 0$
Standard F-tests (partial F-tests) can be used for tests 1-3; test 4, however, is an equivalence testing problem.
Equivalence testing requires defining boundaries within which the parameter is sufficiently close to the target.
Non-significance does not equate to significant equivalence.
Millstein, et al. 2009. BMC Genetics
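The component tests above can be illustrated with a minimal sketch in plain Python. It simulates a causal chain L → G → T and computes the partial F statistic for each coefficient of interest. The helper names `rss` and `partial_f`, the simulated effect sizes, and the sample size are assumptions made for illustration; this is not the code of the ‘cit’ package. The sketch stops at F statistics: p-values for tests 1-3 would come from the F distribution, and the statistic for $\beta_5$ is only an input to test 4, which needs an equivalence test rather than a standard F-test.

```python
import random

def rss(y, columns):
    """Residual sum of squares from an OLS fit of y on an intercept plus `columns`."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(X[0])
    # Normal equations (X'X) b = X'y, stored as an augmented matrix.
    A = [[sum(X[i][r] * X[i][c] for i in range(n)) for c in range(k)]
         + [sum(X[i][r] * y[i] for i in range(n))] for r in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    beta = [A[r][k] / A[r][r] for r in range(k)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2 for i in range(n))

def partial_f(y, reduced, full):
    """Partial F statistic for nested models; `full` adds columns to `reduced`."""
    n = len(y)
    rss_reduced, rss_full = rss(y, reduced), rss(y, full)
    df1 = len(full) - len(reduced)   # number of coefficients being tested
    df2 = n - (len(full) + 1)        # residual degrees of freedom of the full model
    return ((rss_reduced - rss_full) / df1) / (rss_full / df2)

# Simulated data under an assumed causal model L -> G -> T.
random.seed(1)
n = 500
L = [float(random.choice([0, 1, 2])) for _ in range(n)]  # genotype-like instrument
G = [l + random.gauss(0, 1) for l in L]                  # potential mediator
T = [g + random.gauss(0, 1) for g in G]                  # outcome

f1 = partial_f(T, [], [L])      # test 1: beta_1 in T ~ L
f2 = partial_f(G, [T], [T, L])  # test 2: beta_2 in G ~ L + T
f3 = partial_f(T, [L], [L, G])  # test 3: beta_4 in T ~ G + L
f5 = partial_f(T, [G], [G, L])  # beta_5 in T ~ G + L (input to test 4 only)
```

Under this simulated model, `f1`, `f2`, and `f3` should be large, while `f5` should be small because L carries no information about T once G is known.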
Causal Inference Test (CIT)
Intersection/union test:
• Union of the acceptance regions
• Intersection of the rejection regions
• Conservative test of the union of null hypotheses
• P-value is the max of component test p-values
Software freely available from CRAN:
• R package, ‘cit’
• https://cran.r-project.org/web/packages/cit/index.html
• Continuous outcome
• Continuous potential mediator
• Single instrumental variable with values, {0, 1, 2}
$p_{CIT} = \max(p_1, p_2, p_3, p_4)$
Millstein, et al. 2009. BMC Genetics
Casella & Berger, 2002.
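The intersection/union rule on this slide can be sketched in a few lines: the combined p-value is the largest component p-value, so the CIT rejects only when all four component tests reject. The function name `p_cit` and the example p-values are hypothetical, chosen for illustration only.

```python
def p_cit(p1, p2, p3, p4):
    """Intersection/union p-value: the largest of the four component p-values."""
    ps = (p1, p2, p3, p4)
    if any(not 0.0 <= p <= 1.0 for p in ps):
        raise ValueError("p-values must lie in [0, 1]")
    return max(ps)

# Hypothetical component p-values, for illustration only.
example = p_cit(0.001, 0.004, 0.020, 0.030)
significant = example < 0.05  # all four component tests must pass at this level
```

A single weak component is enough to block the causal call: `p_cit(0.001, 0.001, 0.001, 0.2)` returns 0.2.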
CIT False Discovery Rate (qCIT)
Remaining issues:
• How to adjust for multiple testing?
• What if parametric assumptions fail?
• Not obvious how to estimate empirical null distribution using permutation
Solution: novel intersection/union type test with FDR
• Let P[TD1] denote P[ true discovery for component test 1 | FDR = q1 ]
• True discovery union = P[ all TDs ] = P[ TD1 ] * P[ TD2 ] *…
• P[ TD1 ] = 1 – P[ FD1 ] = 1 – q1
• qCIT = P[ any false ] = 1 – P[ all TDs ]
$q_{CIT} = 1 - (1 - q_1)(1 - q_2)(1 - q_3)(1 - q_4) \geq \max(q_1, q_2, q_3, q_4)$
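As a small worked sketch of the qCIT formula, the function below multiplies the four true-discovery probabilities and subtracts the product from one. Multiplying the probabilities treats the component discoveries as independent, which the product on the slide implies. The name `q_cit` and the input values are hypothetical.

```python
def q_cit(q1, q2, q3, q4):
    """Combine four component FDR values into the omnibus qCIT value."""
    prob_all_true = 1.0
    for q in (q1, q2, q3, q4):
        if not 0.0 <= q <= 1.0:
            raise ValueError("FDR values must lie in [0, 1]")
        prob_all_true *= 1.0 - q  # P[TD_i] = 1 - q_i
    return 1.0 - prob_all_true    # P[any false discovery]

# Hypothetical component FDR values, for illustration only.
q_example = q_cit(0.01, 0.02, 0.03, 0.04)
```

The result is never smaller than the largest component value, which is the inequality stated on the slide.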
Millstein FDR estimators
Permutation-Based (non-parametric):
$\widehat{FDR} = \frac{S^*}{S} \cdot \frac{1 - S/m}{1 - S^*/m}$
Parametric:
$\widehat{FDR} = \frac{m\alpha}{S} \cdot \frac{1 - S/m}{1 - \alpha}$
m: total number of tests conducted
α: significance level
S: number of tests with p-value < α
S*: number of tests with p-value < α in a permuted replicate dataset
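The two estimators can be written out directly from the definitions of m, S, S*, and the significance level. The sketch below is an illustration under those definitions, using a single permuted replicate; the function names and the example counts are hypothetical and are not the interface of the ‘fdrci’ package.

```python
def fdr_permutation(S, S_star, m):
    """Permutation-based estimate: (S*/S) * (1 - S/m) / (1 - S*/m)."""
    if S <= 0 or S_star >= m:
        raise ValueError("need S > 0 and S* < m")
    return (S_star / S) * (1 - S / m) / (1 - S_star / m)

def fdr_parametric(S, alpha, m):
    """Parametric estimate: (m*alpha/S) * (1 - S/m) / (1 - alpha)."""
    if S <= 0 or not 0.0 < alpha < 1.0:
        raise ValueError("need S > 0 and 0 < alpha < 1")
    return (m * alpha / S) * (1 - S / m) / (1 - alpha)

# Hypothetical counts: 1000 tests, 200 with p < 0.05 in the observed data,
# 50 with p < 0.05 in one permuted replicate.
perm_est = fdr_permutation(S=200, S_star=50, m=1000)
param_est = fdr_parametric(S=200, alpha=0.05, m=1000)
```

In this example the permuted count S* equals m times the significance level, the count expected under the null, so the two estimates coincide.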
Advantages:
1. Non-parametric
2. Powerful
3. Confidence Intervals for FDR
4. R package, ‘fdrci’
5. https://cran.r-project.org/web/packages/fdrci/index.html
Millstein & Volfson, 2013. Frontiers in Genetics
Permutation-Based FDR Example
Extensions to the CIT software:
1. Binary as well as continuous outcomes
2. Multiple binary and/or continuous instrumental variables
3. Permutation-based FDR
4. Parametric FDR
5. Adjustment covariates
Millstein & Volfson, 2013. Frontiers in Genetics
Causal Inference Test (CIT)
[Figure: three candidate models relating L, G, and T, labeled Causal, Reactive, and Independent. Underlying Model: Causal]