dr. parag singla dr. sumeet agarwal advisors causal ... · dr. parag singla. reintroducing the grn...

Post on 03-Aug-2020

10 Views

Category:

Documents

0 Downloads

Preview:

Click to see full reader

TRANSCRIPT

Causal Computational Models for

Gene Regulatory Networks

Sahil LoombaParul Jain

AdvisorsDr. Sumeet Agarwal

Dr. Parag Singla

Reintroducing the GRN Problem

Target GeneRegulatory Gene

DNA

Cell Membrane

Factor 1 Factor 4

Factor 2 Factor 5

Factor 3

GeneExpression

Level

2

● Correlation

● Granger Causality

● Mutual Information

● Transfer Entropy

Parameters: Size, quantisation, time, lag

Asides: Grid Search, Laplace Smoothing

Where BTP1 finished…

TE ~ MI > GC > CO 3

… is where BTP2 picks up

Correlation Mutual Information

Granger Causality

Transfer Entropy

Convergent Cross Mapping

As Random Variable As Dynamical System

Linear Non Linear

Non

Pr

edic

tive

Pred

ictiv

e

4

Convergent Cross Mapping [Sugihara et al. 2012]Diffeomorphism across Shadow Manifolds

● For weakly coupled dynamical systems

● New notion of causality: belonging to same dynamical system

● Library size ∝ Time series length● Parameters:

○ Dimensions: M, E where E ≥ M○ Lag:

5

Convergent Cross MappingSome Results

Network Size 50, Time Series Length = 500 Network Size 100, Time Series Length = 500 6

Normalisation should be the norm!

Network Size 50, Time Series Length = 1000 Normalised Time Series data 7

High degrees are degrading!

Network Size 50, Average Degree = 7 8

High degrees are degrading!

Network Size 50, Average Degree = 2 9

Convergent Cross MappingSome Results

Network Size 50, Time Series Length = 1000 Network Size 100, Time Series Length = 1000 10

Transfer EntropySome Results

Network Size 100, Time Series Length = 1000Quantization Level = 10

Network Size 50, Time Series Length = 1000 Quantization Level = 10 11

Mutual InformationSome Results

Network Size 100, Time Series Length = 1000Quantization Level = 10

Network Size 50, Time Series Length = 1000 Quantization Level = 10

12

Granger CausalitySome Results

Network Size 100, Time Series Length = 1000Network Size 50, Time Series Length = 1000 13

Network Size 100, Time Series Length = 1000Network Size 50, Time Series Length = 1000

CorrelationSome Results

14

ARACNE [Margilin et al. 2006]Graph Structure Estimation

Network Size 100, Time Series Length = 1000Network Size 50, Time Series Length = 1000 15

Pairwise Metrics of CausalityA Summary

CCM and Correlation *seem* to work the best at a pairwise level16

Naive Edge Selection

1. (Since all pairwise metrics are directly proportional to the strength of

causality,) sort nC2 edges by metric value in decreasing order.

2. Choose top-k edges and output as graph G.

Smart Edge SelectionFuture Work

17

Use a “sophisticated” algorithm which selects top-k edges, by making use of graph

connectivity and other constraints information.

Intrinsic Graph Structure Estimation [Hino et al. 2015]Adjacency Matrix to Observation Matrix

f

(Pairwise Metrics used here)

Θ

Ξ

18

Intrinsic Graph Structure Estimation [Hino et al. 2015]The Random Walk Model

DigraphLaplacian

β/τ β/τ

19

Intrinsic Graph Structure Estimation [Hino et al. 2015]Parameter Estimation Algorithm

In an EM style algorithm, iterating over k (number of edges in graph):

20

Intrinsic Graph Structure EstimationMoving towards Multi-attribute Data

g

(Multiple Pairwise Metrics used here)

Θ

21

Intrinsic Graph Structure EstimationMoving towards Multi-attribute Data

Ideally, Θ should be exactly mapped by every qf -1(Ξ)

22

Intrinsic Graph Structure EstimationMore Considerations & Postprocessing

● Imposing extra constraints on Θ

● Explore different ways to map g to Θ (hard versus soft constraints)

● For reduced computational complexity, use a small window of k● Further postprocessing: Data Processing Inequality

23

Thank YouQuestions and Feedback

24

top related