Causal Computational Models for Gene Regulatory Networks Sahil Loomba Parul Jain Advisors Dr. Sumeet Agarwal Dr. Parag Singla


Post on 03-Aug-2020


Page 1: Causal Computational Models for Gene Regulatory Networks

Sahil Loomba, Parul Jain

Advisors: Dr. Sumeet Agarwal, Dr. Parag Singla

Page 2: Reintroducing the GRN Problem

[Figure labels: Regulatory Gene, Target Gene, DNA, Cell Membrane, Factor 1 to Factor 5, Gene Expression Level]

Page 3: Where BTP1 finished…

● Correlation (CO)
● Granger Causality (GC)
● Mutual Information (MI)
● Transfer Entropy (TE)

Parameters: Size, quantisation, time, lag

Asides: Grid Search, Laplace Smoothing

TE ~ MI > GC > CO
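
The quantisation, lag and Laplace smoothing parameters listed above can be illustrated with a small mutual information estimator. This is only a sketch under assumptions not stated in the slides: equal-width bins, a single integer lag, and additive smoothing with pseudo-count `alpha`. The names `quantise` and `mutual_information` are hypothetical.

```python
import numpy as np

def quantise(x, levels):
    """Map a real-valued series to integer bins 0..levels-1 (equal-width bins, an assumption)."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), levels + 1)[1:-1]  # interior bin edges only
    return np.digitize(x, edges)

def mutual_information(x, y, levels=10, lag=0, alpha=1.0):
    """MI in bits between x[t] and y[t + lag], with Laplace (additive) smoothing alpha > 0."""
    qx, qy = quantise(x, levels), quantise(y, levels)
    if lag > 0:
        qx, qy = qx[:-lag], qy[lag:]
    counts = np.full((levels, levels), alpha)   # every cell starts at the pseudo-count
    np.add.at(counts, (qx, qy), 1.0)            # accumulate the joint histogram
    p = counts / counts.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    return float((p * np.log2(p / (px * py))).sum())
```

Smoothing keeps every cell probability positive, so the logarithm is always defined; the price is a bias towards zero when the number of cells is large relative to the series length.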

Page 4: … is where BTP2 picks up

As Random Variable:

                  Linear               Non Linear
Non Predictive    Correlation          Mutual Information
Predictive        Granger Causality    Transfer Entropy

As Dynamical System: Convergent Cross Mapping

Page 5: Convergent Cross Mapping [Sugihara et al. 2012]

Diffeomorphism across Shadow Manifolds

● For weakly coupled dynamical systems
● New notion of causality: belonging to same dynamical system
● Library size ∝ Time series length
● Parameters:
  ○ Dimensions: M, E where E ≥ M
  ○ Lag:
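
A minimal sketch of cross mapping between shadow manifolds, assuming a delay embedding with dimension `E` and lag `tau`, `E + 1` nearest neighbours, and exponential distance weights. The function names and the `lib` argument (library size) are hypothetical, and this is not the code behind the results in these slides.

```python
import numpy as np

def delay_embed(x, E, tau):
    """Rows are lagged vectors (x[t], x[t - tau], ..., x[t - (E - 1) * tau])."""
    start = (E - 1) * tau
    return np.column_stack([x[start - j * tau: len(x) - j * tau] for j in range(E)])

def cross_map_skill(x, y, E=2, tau=1, lib=None):
    """Estimate y from the shadow manifold of x; return the correlation with the true y."""
    Mx = delay_embed(np.asarray(x, dtype=float), E, tau)[:lib]
    y_al = np.asarray(y, dtype=float)[(E - 1) * tau:][:lib]   # align y with the rows of Mx
    d = np.linalg.norm(Mx[:, None, :] - Mx[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)                      # a point is not its own neighbour
    idx = np.argsort(d, axis=1)[:, :E + 1]           # E + 1 nearest neighbours on M_x
    nd = np.take_along_axis(d, idx, axis=1)
    w = np.exp(-nd / np.maximum(nd[:, :1], 1e-12))   # weights relative to the closest one
    w /= w.sum(axis=1, keepdims=True)
    y_hat = (w * y_al[idx]).sum(axis=1)
    return float(np.corrcoef(y_al, y_hat)[0, 1])

# Illustration on two coupled logistic maps (parameters chosen for illustration only).
n = 400
x, y = np.empty(n), np.empty(n)
x[0], y[0] = 0.4, 0.2
for t in range(n - 1):
    x[t + 1] = x[t] * (3.8 - 3.8 * x[t] - 0.02 * y[t])
    y[t + 1] = y[t] * (3.5 - 3.5 * y[t] - 0.1 * x[t])
for lib in (50, 100, 400):
    print(lib, cross_map_skill(y, x, lib=lib))       # estimate x from the manifold of y
```

Under the CCM convention, skill in estimating y from the manifold of x that improves as the library grows is read as evidence that y influences x, which is why the library size is tied to the time series length.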

Page 6: Convergent Cross Mapping

Some Results

[Plot panels: Network Size 50, Time Series Length = 500; Network Size 100, Time Series Length = 500]

Page 7: Normalisation should be the norm!

[Plot: Network Size 50, Time Series Length = 1000, Normalised Time Series data]
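
The slides do not say which normalisation was applied. As one plausible reading, the sketch below z-scores each gene's time series independently; the function name `zscore` and the (genes, time) array layout are assumptions.

```python
import numpy as np

def zscore(series):
    """series has shape (genes, time); each row is rescaled to zero mean and unit variance."""
    series = np.asarray(series, dtype=float)
    mu = series.mean(axis=1, keepdims=True)
    sd = series.std(axis=1, keepdims=True)
    # Constant rows have sd == 0; divide by 1 instead so they map to all zeros.
    return (series - mu) / np.where(sd > 0, sd, 1.0)
```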

Page 8: High degrees are degrading!

[Plot: Network Size 50, Average Degree = 7]

Page 9: High degrees are degrading!

[Plot: Network Size 50, Average Degree = 2]

Page 10: Convergent Cross Mapping

Some Results

[Plot panels: Network Size 50, Time Series Length = 1000; Network Size 100, Time Series Length = 1000]

Page 11: Transfer Entropy

Some Results

[Plot panels: Network Size 100, Time Series Length = 1000, Quantization Level = 10; Network Size 50, Time Series Length = 1000, Quantization Level = 10]
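
For reference, transfer entropy on quantised series can be sketched as the conditional mutual information between the next value of y and the past of x, given the past of y. The sketch assumes history length 1, equal-width bins and additive smoothing `alpha`; none of these choices is confirmed by the slides, and the function names are hypothetical.

```python
import numpy as np

def quantise(x, levels):
    """Map a real-valued series to integer bins 0..levels-1 (equal-width bins, an assumption)."""
    x = np.asarray(x, dtype=float)
    edges = np.linspace(x.min(), x.max(), levels + 1)[1:-1]
    return np.digitize(x, edges)

def transfer_entropy(x, y, levels=10, lag=1, alpha=1.0):
    """TE in bits from x to y: I(y[t + lag]; x[t] | y[t]), with history length 1."""
    qx, qy = quantise(x, levels), quantise(y, levels)
    y_next, y_past, x_past = qy[lag:], qy[:-lag], qx[:-lag]
    counts = np.full((levels, levels, levels), alpha)    # smoothed joint histogram
    np.add.at(counts, (y_next, y_past, x_past), 1.0)
    p = counts / counts.sum()                    # p(y', y, x)
    p_yx = p.sum(axis=0, keepdims=True)          # p(y, x)
    p_y1y = p.sum(axis=2, keepdims=True)         # p(y', y)
    p_y = p.sum(axis=(0, 2), keepdims=True)      # p(y)
    # p(y'|y,x) / p(y'|y) = p(y',y,x) p(y) / (p(y,x) p(y',y))
    return float((p * np.log2(p * p_y / (p_yx * p_y1y))).sum())
```

With 10 quantisation levels the joint table has 1000 cells, so for a series of length 1000 the pseudo-counts carry about as much weight as the data; a smaller `alpha` or fewer levels may be worth considering.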

Page 12: Mutual Information

Some Results

[Plot panels: Network Size 100, Time Series Length = 1000, Quantization Level = 10; Network Size 50, Time Series Length = 1000, Quantization Level = 10]

Page 13: Granger Causality

Some Results

[Plot panels: Network Size 100, Time Series Length = 1000; Network Size 50, Time Series Length = 1000]
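
Pairwise Granger causality is commonly scored by comparing two linear autoregressions of y: one on its own past, and one that also includes the past of x. The sketch below returns the resulting F statistic; the lag order `p` and the function names are assumptions, not details taken from the slides.

```python
import numpy as np

def lagged(series, p):
    """Columns are series[t - 1], ..., series[t - p] for t = p .. n - 1."""
    n = len(series)
    return np.column_stack([series[p - k: n - k] for k in range(1, p + 1)])

def residual_ss(design, target):
    """Residual sum of squares of the least-squares fit of target on design."""
    coef = np.linalg.lstsq(design, target, rcond=None)[0]
    return float(np.sum((target - design @ coef) ** 2))

def granger_f(x, y, p=1):
    """F statistic for 'x Granger-causes y' using p lags of each series."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    target = y[p:]
    restricted = np.hstack([np.ones((len(target), 1)), lagged(y, p)])   # own past only
    full = np.hstack([restricted, lagged(x, p)])                        # plus the past of x
    rss_r, rss_f = residual_ss(restricted, target), residual_ss(full, target)
    dof = len(target) - full.shape[1]
    return ((rss_r - rss_f) / p) / (rss_f / dof)
```

A larger value means the past of x reduces the prediction error for y by more than chance would suggest, so the statistic can be used directly as a pairwise score.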

Page 14: Correlation

Some Results

[Plot panels: Network Size 100, Time Series Length = 1000; Network Size 50, Time Series Length = 1000]

Page 15: ARACNE [Margolin et al. 2006]

Graph Structure Estimation

[Plot panels: Network Size 100, Time Series Length = 1000; Network Size 50, Time Series Length = 1000]
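
The pruning step of ARACNE rests on the Data Processing Inequality: in every triplet of mutually connected genes, the edge with the lowest mutual information is treated as indirect and removed. A minimal sketch follows, assuming a symmetric MI matrix in which zero means "no edge"; the function name `dpi_prune` and the tolerance argument `eps` as written here are hypothetical.

```python
import numpy as np
from itertools import combinations

def dpi_prune(mi, eps=0.0):
    """Zero out the weakest edge of every fully connected triplet in a symmetric MI matrix."""
    mi = np.asarray(mi, dtype=float)
    n = mi.shape[0]
    keep = mi > 0
    np.fill_diagonal(keep, False)
    remove = np.zeros_like(keep)                 # edges marked for removal
    for i, j, k in combinations(range(n), 3):
        edges = [(i, j), (i, k), (j, k)]
        if not all(mi[e] > 0 for e in edges):
            continue                             # the triplet is not fully connected
        weakest = min(edges, key=lambda e: mi[e])
        others = [mi[e] for e in edges if e != weakest]
        # Decisions use the original MI values, so the order of triplets does not matter.
        if mi[weakest] < min(others) * (1 - eps):
            a, b = weakest
            remove[a, b] = remove[b, a] = True
    return np.where(keep & ~remove, mi, 0.0)
```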

Page 16: Pairwise Metrics of Causality

A Summary

CCM and Correlation *seem* to work the best at a pairwise level

Page 17: Naive Edge Selection

1. (Since all pairwise metrics are directly proportional to the strength of causality,) sort the nC2 edges by metric value in decreasing order.
2. Choose the top-k edges and output them as graph G.

Smart Edge Selection (Future Work)

Use a “sophisticated” algorithm that selects the top-k edges by making use of graph connectivity and other constraint information.
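
The two naive edge selection steps can be sketched as follows. The score matrix layout and the function name `top_k_edges` are assumptions; directed scores are collapsed to one value per unordered pair by taking the larger direction, which is a choice made here for illustration and not something the slides specify.

```python
import numpy as np

def top_k_edges(scores, k):
    """Return the k highest-scoring unordered gene pairs (i, j) with i < j."""
    scores = np.asarray(scores, dtype=float)
    sym = np.maximum(scores, scores.T)           # assumption: one score per unordered pair
    iu, ju = np.triu_indices(scores.shape[0], k=1)   # all nC2 candidate edges
    order = np.argsort(sym[iu, ju])[::-1][:k]        # decreasing metric value, top k
    return [(int(iu[t]), int(ju[t])) for t in order]
```

Choosing k remains the open question; any sweep over k would reuse the same sorted order.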

Page 18: Intrinsic Graph Structure Estimation [Hino et al. 2015]

Adjacency Matrix to Observation Matrix

[Diagram: f maps Θ to Ξ. Annotation: (Pairwise Metrics used here)]

Page 19: Intrinsic Graph Structure Estimation [Hino et al. 2015]

The Random Walk Model

[Diagram labels: Digraph Laplacian; β/τ, β/τ]

Page 20: Intrinsic Graph Structure Estimation [Hino et al. 2015]

Parameter Estimation Algorithm

In an EM style algorithm, iterating over k (number of edges in graph):

Page 21: Intrinsic Graph Structure Estimation

Moving towards Multi-attribute Data

[Diagram labels: g, Θ. Annotation: (Multiple Pairwise Metrics used here)]

Page 22: Intrinsic Graph Structure Estimation

Moving towards Multi-attribute Data

Ideally, Θ should be exactly mapped by every f_q^{-1}(Ξ)

Page 23: Intrinsic Graph Structure Estimation

More Considerations & Postprocessing

● Imposing extra constraints on Θ
● Explore different ways to map g to Θ (hard versus soft constraints)
● For reduced computational complexity, use a small window of k
● Further postprocessing: Data Processing Inequality

Page 24: Thank You

Questions and Feedback