dr. parag singla dr. sumeet agarwal advisors causal ... · dr. parag singla. reintroducing the grn...
Post on 03-Aug-2020
10 Views
Preview:
TRANSCRIPT
Causal Computational Models for
Gene Regulatory Networks
Sahil LoombaParul Jain
AdvisorsDr. Sumeet Agarwal
Dr. Parag Singla
Reintroducing the GRN Problem
Target GeneRegulatory Gene
DNA
Cell Membrane
Factor 1 Factor 4
Factor 2 Factor 5
Factor 3
GeneExpression
Level
2
● Correlation
● Granger Causality
● Mutual Information
● Transfer Entropy
Parameters: Size, quantisation, time, lag
Asides: Grid Search, Laplace Smoothing
Where BTP1 finished…
TE ~ MI > GC > CO 3
… is where BTP2 picks up
Correlation Mutual Information
Granger Causality
Transfer Entropy
Convergent Cross Mapping
As Random Variable As Dynamical System
Linear Non Linear
Non
Pr
edic
tive
Pred
ictiv
e
4
Convergent Cross Mapping [Sugihara et al. 2012]Diffeomorphism across Shadow Manifolds
● For weakly coupled dynamical systems
● New notion of causality: belonging to same dynamical system
● Library size ∝ Time series length● Parameters:
○ Dimensions: M, E where E ≥ M○ Lag:
5
Convergent Cross MappingSome Results
Network Size 50, Time Series Length = 500 Network Size 100, Time Series Length = 500 6
Normalisation should be the norm!
Network Size 50, Time Series Length = 1000 Normalised Time Series data 7
High degrees are degrading!
Network Size 50, Average Degree = 7 8
High degrees are degrading!
Network Size 50, Average Degree = 2 9
Convergent Cross MappingSome Results
Network Size 50, Time Series Length = 1000 Network Size 100, Time Series Length = 1000 10
Transfer EntropySome Results
Network Size 100, Time Series Length = 1000Quantization Level = 10
Network Size 50, Time Series Length = 1000 Quantization Level = 10 11
Mutual InformationSome Results
Network Size 100, Time Series Length = 1000Quantization Level = 10
Network Size 50, Time Series Length = 1000 Quantization Level = 10
12
Granger CausalitySome Results
Network Size 100, Time Series Length = 1000Network Size 50, Time Series Length = 1000 13
Network Size 100, Time Series Length = 1000Network Size 50, Time Series Length = 1000
CorrelationSome Results
14
ARACNE [Margilin et al. 2006]Graph Structure Estimation
Network Size 100, Time Series Length = 1000Network Size 50, Time Series Length = 1000 15
Pairwise Metrics of CausalityA Summary
CCM and Correlation *seem* to work the best at a pairwise level16
Naive Edge Selection
1. (Since all pairwise metrics are directly proportional to the strength of
causality,) sort nC2 edges by metric value in decreasing order.
2. Choose top-k edges and output as graph G.
Smart Edge SelectionFuture Work
17
Use a “sophisticated” algorithm which selects top-k edges, by making use of graph
connectivity and other constraints information.
Intrinsic Graph Structure Estimation [Hino et al. 2015]Adjacency Matrix to Observation Matrix
f
(Pairwise Metrics used here)
Θ
Ξ
18
Intrinsic Graph Structure Estimation [Hino et al. 2015]The Random Walk Model
DigraphLaplacian
β/τ β/τ
19
Intrinsic Graph Structure Estimation [Hino et al. 2015]Parameter Estimation Algorithm
In an EM style algorithm, iterating over k (number of edges in graph):
20
Intrinsic Graph Structure EstimationMoving towards Multi-attribute Data
g
(Multiple Pairwise Metrics used here)
Θ
21
Intrinsic Graph Structure EstimationMoving towards Multi-attribute Data
Ideally, Θ should be exactly mapped by every qf -1(Ξ)
22
Intrinsic Graph Structure EstimationMore Considerations & Postprocessing
● Imposing extra constraints on Θ
● Explore different ways to map g to Θ (hard versus soft constraints)
● For reduced computational complexity, use a small window of k● Further postprocessing: Data Processing Inequality
23
Thank YouQuestions and Feedback
24
top related