directed acyclic graphs and the use of linear mixed models ... · bayes 2012, aachen siem...

Directed Acyclic Graphs and the use of Linear Mixed Models
Prediction and Marker Selection

Siem Heisterkamp 1,2
1 Grünenthal, department CDB-Biometrics, Aachen, Germany
2 Groningen Bioinformatics Centre (GBIC), Groningen, The Netherlands

Bayes 2012, Aachen

Outline:
- Liver Toxicity in rat
- Causal models
- Distributions living on a DAG
- Estimation
- Some results
- How to search Graphical models?
- Conclusions
- For Further Reading
- References
- Acknowledgements




Acknowledgements

- Joanna in 't Hout (University of Nijmegen, dept. Medical Statistics and Epidemiology)
- Herman Vreuls (MSD, dept. BARDS)
- Jan Polman (formerly MSD, dept. MDI)
- Susanne Bauerschmidt (formerly MSD, dept. MDI)
- Geny Groothuis (University of Groningen, dept. Pharmacy)
- Marieke Elferink (University of Groningen, dept. Pharmacy)
- Peter Olinga (University of Groningen, dept. Pharmacy)
- Elisa van Leeuwen (University of Groningen, dept. Pharmacy)


Liver Toxicity in rat

- Part of a large study comparing gene expression in the liver of humans, rats and HEP cell lines (collaboration between the former Organon and the University of Groningen), Elferink et al. (2011)

In this presentation:
- Rats exposed to a range of liver-toxic compounds (among others paracetamol), using liver slices
- Gene-expression arrays applied on liver tissue
- Q: How to find biologically meaningful associations between treatment and gene expression?


Usual approach

- Unstructured hunting of SNPs or biomarkers, either univariate or multi-variable
- Selection of variables is completely data-driven
- Subject of this presentation:
  - Causal models formulated from hypotheses using findings from databases (Ingenuity, GRAIL, etc.)
  - Test these by relatively simple means using linear causal models
  - Find smaller sub-graphs


Part of Ingenuity Pathway Analysis


Some DAG-Theory

Directed Acyclic Graph

[Figure: three panels on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y, titled "Full model with common edges", "DAG" and "moralized DAG"]


Some DAG-Theory

[Figure: "Full DAG model" on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y]

Adjacency matrix

        Lbp     Tnfr.1  Tnfr.2  Jun.1   Jun.2   Apoc4   Y
Lbp     0       1       1       0       0       0       0
Tnfr.1  0       0       1       1       0       0       0
Tnfr.2  0       0       0       1       1       0       0
Jun.1   0       0       0       0       1       1       0
Jun.2   0       0       0       0       0       1       0
Apoc4   0       0       0       0       0       0       1

Din     0       1       2       2       2       2       1
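The in-degree row and the moralization step can be checked with a short sketch. This is only an illustration in Python with NumPy, not code from the talk; the function names `in_degree` and `moralize` are hypothetical, and the all-zero row for Y (which the slide omits) is added so that the matrix is square.

```python
import numpy as np

nodes = ["Lbp", "Tnfr.1", "Tnfr.2", "Jun.1", "Jun.2", "Apoc4", "Y"]

# Adjacency matrix from the slide: A[i, j] = 1 means an edge i -> j.
# Assumption: Y has no children, so its (omitted) row is all zeros.
A = np.array([
    [0, 1, 1, 0, 0, 0, 0],  # Lbp
    [0, 0, 1, 1, 0, 0, 0],  # Tnfr.1
    [0, 0, 0, 1, 1, 0, 0],  # Tnfr.2
    [0, 0, 0, 0, 1, 1, 0],  # Jun.1
    [0, 0, 0, 0, 0, 1, 0],  # Jun.2
    [0, 0, 0, 0, 0, 0, 1],  # Apoc4
    [0, 0, 0, 0, 0, 0, 0],  # Y
])


def in_degree(adj):
    """Number of parents of each node (column sums)."""
    return adj.sum(axis=0)


def moralize(adj):
    """Symmetric adjacency matrix of the moral graph of a DAG."""
    skeleton = (adj + adj.T) > 0      # drop the edge directions
    married = (adj @ adj.T) > 0       # pairs of nodes sharing a child
    moral = (skeleton | married).astype(int)
    np.fill_diagonal(moral, 0)        # no self-loops
    return moral


D_in = in_degree(A)
A_m = moralize(A)
```

For this particular DAG every pair of parents of a common child is already joined by an edge, so moralizing adds no new edges and `A_m` is just the undirected skeleton.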


Some DAG-Theory

Markov equivalence of different DAGs
Example: DAGs with 3 nodes

[Figure: four DAGs on the nodes a, b and c, labelled model 1 to model 4]


Some DAG-Theory

Markov equivalence of different DAGs
Associated moralized graphs

[Figure: the moralized graphs of the four DAGs on the nodes a, b and c, labelled moralized model 1 to moralized model 4]


Some DAG-Theory

Seemingly different DAGs may be equivalent

- Two DAGs are Markov equivalent iff
  1. their skeleton graphs are equal
  2. the 'amoralities' (unmarried parents of a common child) are the same
- In other words: the moralized graphs must be the same
- Causality can only be established in case of 'colliding' arrows
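The two conditions can be checked mechanically. The sketch below is an illustration in plain Python, not the author's code; the helper names are hypothetical, and the four three-node graphs are generic examples (chain, reversed chain, fork, collider) that are not claimed to match the numbering of models 1 to 4 in the figure.

```python
from itertools import combinations


def skeleton(edges):
    """Undirected version of a DAG given as (parent, child) pairs."""
    return {frozenset(edge) for edge in edges}


def v_structures(edges):
    """Colliders a -> c <- b whose parents a and b are not adjacent."""
    skel = skeleton(edges)
    parents = {}
    for parent, child in edges:
        parents.setdefault(child, set()).add(parent)
    found = set()
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            if frozenset((a, b)) not in skel:
                found.add((frozenset((a, b)), child))
    return found


def markov_equivalent(edges1, edges2):
    """Same skeleton and same unmarried-parent colliders."""
    return (skeleton(edges1) == skeleton(edges2)
            and v_structures(edges1) == v_structures(edges2))


chain = [("a", "b"), ("b", "c")]            # a -> b -> c
reversed_chain = [("c", "b"), ("b", "a")]   # a <- b <- c
fork = [("b", "a"), ("b", "c")]             # a <- b -> c
collider = [("a", "b"), ("c", "b")]         # a -> b <- c

same_class = markov_equivalent(chain, fork)         # True
different_class = markov_equivalent(chain, collider)  # False
```

The chain, the reversed chain and the fork share one skeleton and have no colliders, so they cannot be told apart from data alone; only the collider stands in a class of its own.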


Directed Acyclic Graph
Finding the distributions

- $\log \mathrm{lik}(\mathrm{DAG}) = \sum_{\nu} \log P(\mathrm{Child} \mid \nu)$
- Linear model; algorithm by Cox & Wermuth (1996)
  1. Moralize the graph
  2. Trace back from $Y$: $E[y \mid \nu_y] = \sum_j a_{jy}\, \beta_{y \mid \nu_j}$, with $\bigl(\sum_j a_{jy} - d_y\bigr) \neq 0$
  3. Conditional logLik: $\propto \lambda\, (y - E[y \mid \nu_y])^2$, and precision $\lambda \bigl(d_y - \sum_j a_{jy}\bigr)$
  4. Repeat 2 until last married grand-parents...
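The factorization in the first bullet can be illustrated for linear Gaussian nodes. The sketch below is not the Cox & Wermuth trace-back itself; it only assumes that each node is regressed on its parents by ordinary least squares and that the node-wise maximised log-likelihoods are summed. The function names and the simulated data are hypothetical.

```python
import numpy as np


def node_loglik(y, parents):
    """Maximised Gaussian log-likelihood of one node given its parents."""
    n = len(y)
    design = np.column_stack([np.ones(n)] + list(parents))
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    sigma2 = resid @ resid / n        # ML estimate of the residual variance
    return -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)


def dag_loglik(data, parent_sets):
    """Sum of node-wise log-likelihoods; parent_sets maps node -> parents."""
    return sum(
        node_loglik(data[child], [data[p] for p in parents])
        for child, parents in parent_sets.items()
    )


# Simulated data from a chain a -> b -> c (illustrative only).
rng = np.random.default_rng(0)
n = 200
a = rng.normal(size=n)
b = 0.8 * a + rng.normal(size=n)
c = -0.5 * b + rng.normal(size=n)
data = {"a": a, "b": b, "c": c}

chain = {"a": [], "b": ["a"], "c": ["b"]}
empty = {"a": [], "b": [], "c": []}
full = {"a": [], "b": ["a"], "c": ["a", "b"]}

ll_chain = dag_loglik(data, chain)
ll_empty = dag_loglik(data, empty)
ll_full = dag_loglik(data, full)
```

For the complete DAG the summed node-wise terms coincide with the maximised log-likelihood of an unrestricted multivariate normal, which is a useful sanity check on any implementation.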


Directed Acyclic Graph
Finding the distributions

- Algorithm by Cox & Wermuth (1996)
  1. Let $A$ be the adjacency matrix of a DAG
  2. Construct the moralized $A_m$; strip $Y$ and other barren variables
  3. Construct the Laplacian: $L_A = I - D_m^{-1/2} A_m D_m^{-1/2}$
- With $D_m$ the diagonal matrix of degrees for each node
- $L_A$ is the inverse of the prior covariance matrix (precision or concentration)
- May be weighted and of deficient rank
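The three steps can be sketched for the seven-node example DAG. This is a minimal illustration in Python with NumPy, not the author's implementation. Two points are assumptions: only Y is stripped (it is the only barren node in the example), and the degrees in $D_m$ are computed after stripping.

```python
import numpy as np

nodes = ["Lbp", "Tnfr.1", "Tnfr.2", "Jun.1", "Jun.2", "Apoc4", "Y"]

# A[i, j] = 1 means an edge i -> j (adjacency matrix of the example DAG).
A = np.array([
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
])


def moralize(adj):
    """Symmetric adjacency matrix of the moral graph."""
    moral = (((adj + adj.T) > 0) | ((adj @ adj.T) > 0)).astype(int)
    np.fill_diagonal(moral, 0)
    return moral


def normalized_laplacian(adj_sym):
    """L = I - D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix."""
    degrees = adj_sym.sum(axis=1).astype(float)
    inv_sqrt = np.zeros_like(degrees)
    positive = degrees > 0
    inv_sqrt[positive] = 1.0 / np.sqrt(degrees[positive])
    d = np.diag(inv_sqrt)
    return np.eye(len(degrees)) - d @ adj_sym @ d


# Step 2: moralize, then strip the barren response node Y.
keep = [i for i, name in enumerate(nodes) if name != "Y"]
A_m = moralize(A)[np.ix_(keep, keep)]

# Step 3: Laplacian, used as prior precision of the random effects.
L_A = normalized_laplacian(A_m)
rank = np.linalg.matrix_rank(L_A)
```

Because the stripped moral graph is connected, the Laplacian has exactly one zero eigenvalue, so its rank is one less than the number of remaining nodes. This is the rank deficiency mentioned in the last bullet.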


Directed Acyclic Graph
Causality and Distributions

According to DAG theory:
- Causality can only be established in case of 'colliding' arrows

Corollary from the normal graph theory, Cox & Wermuth (1996):
- A node cannot be causal with respect to other nodes if the correlation between the latter conditional on the former is ZERO
- Equivalent to a zero entry in the inverse of the covariance matrix
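The link between a zero in the inverse covariance matrix and a vanishing conditional correlation can be seen in a small numerical sketch. The 3 x 3 precision matrix below is a made-up example, not taken from the liver data, and `partial_correlation` is a hypothetical helper that conditions on all remaining variables.

```python
import numpy as np

# Hypothetical precision (inverse covariance) matrix for nodes 0 - 1 - 2.
# The zero at position (0, 2) means: no direct edge between node 0 and node 2.
K = np.array([
    [2.0, -1.0, 0.0],
    [-1.0, 2.0, -1.0],
    [0.0, -1.0, 2.0],
])
Sigma = np.linalg.inv(K)


def partial_correlation(cov, i, j):
    """Correlation of i and j conditional on all remaining variables."""
    rest = [k for k in range(cov.shape[0]) if k not in (i, j)]
    pair = [i, j]
    s_pp = cov[np.ix_(pair, pair)]
    s_pr = cov[np.ix_(pair, rest)]
    s_rr = cov[np.ix_(rest, rest)]
    # Conditional covariance of the pair given the rest (Schur complement).
    cond = s_pp - s_pr @ np.linalg.solve(s_rr, s_pr.T)
    return cond[0, 1] / np.sqrt(cond[0, 0] * cond[1, 1])


marginal = Sigma[0, 2] / np.sqrt(Sigma[0, 0] * Sigma[2, 2])
partial = partial_correlation(Sigma, 0, 2)
```

Nodes 0 and 2 are marginally correlated, yet their correlation given node 1 vanishes, matching the zero entry in `K`.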


DAG and the linear mixed model

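- $X$ represents observations on nodes (for simplicity: no other variables)

\[
Y \mid \beta \sim \mathrm{Nor}\bigl(X (I + A_D)\,\alpha + I\beta,\; \sigma^2 \Sigma_y\bigr)
\]
\[
\beta \sim \mathrm{Nor}\bigl(0,\; \lambda^{-1} L_A^{-1}\bigr)
\]

- $\sigma^2 \Sigma_y$: the covariance matrix of the observations
- $\lambda L_A$: the precision
- $L_A$ of rank $r \le p$ ($\beta$ random effects)
- $A_D$: adjacency matrix of the dual graph of edges ($\alpha$ fixed effects)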


Linear Causal model

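\[
\log \mathrm{lik}\Bigl(\beta \Bigm| Y, A, X, \lambda, \alpha, \overbrace{\ldots}^{\text{usual}}\Bigr)
= \underbrace{\log \mathrm{lik}\Bigl(Y \Bigm| \beta, A, X, \alpha, \overbrace{\ldots}^{\text{usual}}\Bigr)}_{\text{usual log lik}}
+ \underbrace{\Bigl(-0.5\, n \bigl(\lambda\, \beta^{t} L_A\, \beta - r_{L_A} \log(\lambda) - \log(\det(L_A))\bigr)\Bigr)}_{\text{prior log lik}}
\]

- $r_{L_A}$: the rank $r$ of $L_A$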
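The prior term can be evaluated numerically once a Laplacian is given. The sketch below copies the expression as it is written on the slide, signs included. One assumption goes beyond the slide: for a rank-deficient $L_A$ the ordinary determinant is zero, so the determinant is replaced here by the product of the non-zero eigenvalues (a pseudo-determinant). The function name and the three-node path-graph Laplacian are hypothetical.

```python
import numpy as np


def prior_loglik(beta, L_A, lam, n, tol=1e-10):
    """-0.5 * n * (lam * beta' L_A beta - r * log(lam) - log det(L_A)).

    Assumption: det(L_A) is taken as the product of the non-zero
    eigenvalues, because a rank-deficient L_A has determinant zero.
    """
    eigvals = np.linalg.eigvalsh(L_A)
    nonzero = eigvals[eigvals > tol]
    r = len(nonzero)                       # rank of L_A
    log_pdet = np.sum(np.log(nonzero))     # log pseudo-determinant
    quad = beta @ L_A @ beta
    return -0.5 * n * (lam * quad - r * np.log(lam) - log_pdet)


# Normalized Laplacian of a path 0 - 1 - 2 (illustrative only).
s = 1.0 / np.sqrt(2.0)
L_path = np.array([
    [1.0, -s, 0.0],
    [-s, 1.0, -s],
    [0.0, -s, 1.0],
])

value_zero = prior_loglik(np.zeros(3), L_path, lam=1.0, n=10)
value_contrast = prior_loglik(np.array([1.0, 0.0, -1.0]), L_path, lam=1.0, n=10)
```

The eigenvalues of `L_path` are 0, 1 and 2, so the rank is 2 and the log pseudo-determinant is log 2. With that, `value_zero` and `value_contrast` can be verified by hand.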


Similarities with Linear mixed model

Similarities
- Identifiable (like in mixed models) for fixed $\lambda \cdot \sigma^2$; usually a suitable $\hat{\sigma}^2$ is plugged in, or $\sigma = 1$
- Standard software (e.g. lme in S-Plus or R) with user-defined variance structure
- Different DAGs for different random levels, e.g. gender etc.
- Common causes may be modeled as well


Differences with Linear mixed model

Differences
- Changing adjacency $A$ may change the Laplacian $L_A$ as well
- Consequences for estimation and testing
  1. Use of the Wald test for individual interactions is not possible, as $L_A$ depends on $H_0$
  2. ML estimation must be used with general criteria (AIC, AICc, BIC, gBIC)
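For reference, the three textbook criteria can be computed from a maximised log-likelihood as sketched below. These are the standard definitions, in which smaller values are preferred. The slides do not give the formula for gBIC, so it is left out, and the sign and scale used for the gBIC values reported in the talk may differ from these conventions. The function names are hypothetical.

```python
import math


def aic(loglik, k):
    """Akaike information criterion for k estimated parameters."""
    return -2.0 * loglik + 2.0 * k


def aicc(loglik, k, n):
    """Small-sample corrected AIC; requires n > k + 1."""
    if n - k - 1 <= 0:
        raise ValueError("AICc needs n > k + 1")
    return aic(loglik, k) + 2.0 * k * (k + 1) / (n - k - 1)


def bic(loglik, k, n):
    """Bayesian (Schwarz) information criterion."""
    return -2.0 * loglik + k * math.log(n)


# Illustrative numbers only: log-likelihood -100, 3 parameters, 20 observations.
example = {
    "AIC": aic(-100.0, 3),
    "AICc": aicc(-100.0, 3, 20),
    "BIC": bic(-100.0, 3, 20),
}
```

Because each candidate DAG implies its own Laplacian, the models are not nested in the usual sense, which is why the comparison relies on such criteria rather than on Wald tests.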


4 best DAGs using gBIC

[Figure: four panels showing the best DAGs on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y; the panel for DAG(4) lists no Jun.2 node]

DAG(1) gBIC -28071.45
DAG(2) gBIC -14310.42
DAG(3) gBIC -10965.17
DAG(4) gBIC -10788.49


4 best moralized DAGs

[Figure: four panels, moralized (1) to moralized (4), showing the moral graphs of the four best DAGs on the nodes Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4 and y; the panel for moralized (4) lists no Jun.2 node]


In-vivo liver toxicity in rat
Validation by leave one out

Table: Prediction for 4 models

          DAG(1)      DAG(2)      DAG(3)      DAG(4)
gBIC      -28071      -14310      -10965      -10788
          T     C     T     C     T     C     T     C
pred. T   6     0     6     0     6     0     6     0
pred. C   4     10    4     10    4     10    4     10
Total     10    10    10    10    10    10    10    10

Note: None of the models are Markov-equivalent
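The leave-one-out table can be summarised with a little arithmetic. The sketch below only re-uses the counts printed in the table, which are identical for all four DAGs. It assumes that the columns are the true classes (the column totals of 10 and 10 support this) and keeps the class labels T and C exactly as the table gives them. The function name is hypothetical.

```python
# Counts from the table: key = (predicted class, true class).
confusion = {
    ("T", "T"): 6, ("T", "C"): 0,
    ("C", "T"): 4, ("C", "C"): 10,
}


def summary_rates(conf):
    """Overall and per-class fraction of correct leave-one-out predictions."""
    total = sum(conf.values())
    true_t = conf[("T", "T")] + conf[("C", "T")]
    true_c = conf[("T", "C")] + conf[("C", "C")]
    return {
        "accuracy": (conf[("T", "T")] + conf[("C", "C")]) / total,
        "correct_T": conf[("T", "T")] / true_t,
        "correct_C": conf[("C", "C")] / true_c,
    }


rates = summary_rates(confusion)
```

All four models classify 16 of the 20 animals correctly: 6 of 10 in class T and 10 of 10 in class C. The leave-one-out predictions therefore do not discriminate between the models, even though their gBIC values differ widely.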


How to search for plausible models?

- The number of models to look for is too large
- With lasso or graphical lasso, edges can be deleted automatically (e.g. glasso by Friedman)
- But: this results in disconnected networks, or in networks that are biologically hard to explain (?)
- Proposal: extra penalty function with L1 and L0-norm on connected edges, acting on the dual graph


How to search plausible models (2)?

- Double exponential prior on $\alpha \in$ Edges (Lasso, L1)
- Poisson prior on the number of edges (L0)
- Constraint for at least one connected path between Grandparent(s) and Final node(s)
- $\Rightarrow$ 1 - prob(no pathway of any length exists)
- Extra priors added (penalty functions):

\[
\underbrace{-0.5\, n\, \lambda_1 \Bigl(\sum_{s \in E} |\alpha_s|\Bigr)}_{\text{prior log lik, } L_1 \text{ norm}}
+ \underbrace{n \log\Bigl(1 - e^{-\sum_k \sum_{f \in F,\, g \in G} \sum_s \pi(\lambda_1, \lambda_2, \alpha_s)\, \cdot\, w^{k}_{s,f,g}}\Bigr)}_{\text{prior log lik, } L_0 \text{ norm}}
\]
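Two ingredients of the proposal can be sketched directly: the L1 term of the penalty, and the check that at least one directed path links a grandparent node to a final node. The sketch is an illustration, not the author's implementation. It does not attempt the L0 term, because the slides do not define $\pi$ or the weights $w^{k}_{s,f,g}$. The function names are hypothetical, and the adjacency matrix is the seven-node example DAG.

```python
import numpy as np


def l1_prior_loglik(alpha, lam1, n):
    """First term of the penalty: -0.5 * n * lambda_1 * sum_s |alpha_s|."""
    return -0.5 * n * lam1 * np.sum(np.abs(alpha))


def has_path(adj, sources, targets):
    """True if some directed path of any length leads from a source to a target."""
    p = adj.shape[0]
    reach = np.eye(p, dtype=bool) | (adj > 0)
    for _ in range(p):
        # Extend the reachable set by composing known paths.
        reach = reach | ((reach.astype(int) @ reach.astype(int)) > 0)
    return bool(reach[np.ix_(sources, targets)].any())


# Example DAG: Lbp, Tnfr.1, Tnfr.2, Jun.1, Jun.2, Apoc4, Y (edge i -> j).
A = np.array([
    [0, 1, 1, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0, 0],
    [0, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0],
])

connected = has_path(A, sources=[0], targets=[6])   # Lbp to Y

# Deleting the edge Apoc4 -> Y, as an unconstrained lasso might, cuts Y off.
A_pruned = A.copy()
A_pruned[5, 6] = 0
still_connected = has_path(A_pruned, sources=[0], targets=[6])

penalty = l1_prior_loglik(np.array([1.0, -2.0, 0.0]), lam1=0.5, n=10)
```

The pruned graph illustrates the concern raised on the previous slide: shrinking a single edge to zero can disconnect the response from every upstream node, which the connectivity constraint is meant to prevent.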


What have we learned?

- Unstructured univariate or multi-variable hunting for differentiating genes, SNPs, markers
- Selection of variables: no guarantee for biologically meaningful models
- Extract hypotheses from knowledge bases (Ingenuity etc.)
- Fit with causal models
- Search smaller models
- Some problems to be solved practically:
  - How to compare alternative models?


For Further Reading I

Cox, D.R. and Wermuth, N. (1996) Multivariate Dependencies: Models, Analysis and Interpretation. London: Chapman & Hall.

Pearl, J. (2000) Causality: Models, Reasoning and Inference. Cambridge: Cambridge University Press.

Raftery, A.E. (1996) Hypothesis testing and model selection. In Gilks, W.R., Richardson, S., Spiegelhalter, D.J. (eds.) Markov Chain Monte Carlo in Practice. London: Chapman and Hall.

Elferink, M.G.L., Olinga, P., van Leeuwen, E.M., Bauerschmidt, S., Polman, J., Schoonen, W.G., Heisterkamp, S.H. and Groothuis, G.M.M. (2011) Gene expression analysis of precision-cut human liver slices indicate stable expression of ADME-Tox related genes. Toxicol Appl Pharmacol. PMID 21420995.


For Further Reading II

Dawid, A.P. (2004) Probability, Causality and the Empirical World: A Bayes-de Finetti-Popper-Borel Synthesis. Statistical Science, 19, 44-57.

Dawid, A.P. (2007) Fundamentals of Statistical Causality. Research Report no. 279, Department of Statistical Sciences, University College London, September 2007.
