The Measure of Synergy as a Tool in Systems Biology, D. Anastassiou

The Measure of Synergy as a Tool in Systems Biology. D. Anastassiou. C2B2/MAGNet Center Third Annual Retreat, 4/11/2008.


Page 1: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

The Measure of Synergy as a Tool in Systems Biology

D. Anastassiou

C2B2/MAGNet Center Third Annual Retreat, 4/11/2008

Page 2: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Synergy

Definition: “The interaction of two or more agents or forces so that their combined effect is greater than the sum of their individual effects” (American Heritage Dictionary)

Natural application in systems biology (holistic as opposed to reductionist paradigm): We wish to analyze multiple interacting factors in terms of the purely cooperative nature of their contributions towards an outcome.

D. Anastassiou, "Computational Analysis of the Synergy Among Multiple Interacting Genes" (Review Article), Molecular Systems Biology, Vol. 3, No. 83, February 2007.

Page 3: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Information-theoretic definition

Synergy of two factors $G_i$, $G_j$ with respect to an outcome $C$:

$$\underbrace{I(G_i, G_j; C)}_{\text{whole}} - \underbrace{\left[ I(G_i; C) + I(G_j; C) \right]}_{\text{sum of parts}}$$

Synergy can be positive or negative (redundancy) and extended to more than two factors.
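As an illustration of the definition, here is a minimal Python sketch that estimates synergy from discrete samples with a simple plug-in (empirical frequency) estimator. The function names and the toy data are assumptions made for this example and are not taken from the slides; real expression data would first need to be discretized or handled with a continuous estimator.

```python
from collections import Counter
from math import log2

def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_information(x, y):
    """I(X; Y) = H(X) + H(Y) - H(X, Y), estimated from paired samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def synergy(gi, gj, c):
    """I(Gi, Gj; C) - [I(Gi; C) + I(Gj; C)] for discrete samples."""
    whole = mutual_information(list(zip(gi, gj)), c)
    sum_of_parts = mutual_information(gi, c) + mutual_information(gj, c)
    return whole - sum_of_parts

# Toy case 1 (hypothetical data): the outcome is the XOR of two binary factors.
# Neither factor alone says anything about C, but together they determine it.
gi = [0, 0, 1, 1]
gj = [0, 1, 0, 1]
c_xor = [a ^ b for a, b in zip(gi, gj)]
xor_synergy = synergy(gi, gj, c_xor)
print(xor_synergy)        # 1.0 bit: purely cooperative

# Toy case 2 (hypothetical data): both factors are copies of the outcome.
g = [0, 1, 0, 1]
redundant_synergy = synergy(g, g, g)
print(redundant_synergy)  # -1.0 bit: redundancy
```

The two toy cases mark the extremes: positive synergy when the information exists only in the pair, negative synergy when each factor repeats what the other already provides.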

Page 4: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Example: Synergy of two genes with respect to a phenotype

Given a large set of gene expression data in both presence and absence of a phenotype such as cancer, we can estimate the information $I(G_i; C)$ that any gene $G_i$ provides about cancer $C$, as well as the information $I(G_i, G_j; C)$ that any pair of genes $(G_i, G_j)$ jointly provides about cancer $C$.

[Figure: gene expression matrix with rows labeled GENES and columns labeled CONDITIONS, the conditions grouped into HEALTH and CANCER]

Page 5: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Best gene pairs for classification

Extension of “gene ranking” based on $I(G_i; C)$ to “gene-pair ranking” based on $I(G_i, G_j; C)$

Observation: Sometimes high-ranked gene pairs do not include any of the high-ranked single genes, suggesting that the correlation of the gene pair with cancer is due to a purely cooperative effect of the two genes.

V. Varadan and D. Anastassiou, “Inference of Disease-Related Molecular Logic from Systems-Based Microarray Analysis,” PLoS Computational Biology, Vol. 2, Issue 6, June 2006, pp. 585-597.

This purely cooperative effect can be quantified!

Page 6: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

What is the “cancer interactome”?

High correlation: $I(G_i, G_j; C) \gg 0$ implies that the two genes can be jointly used for classification.

High synergy: $I(G_i, G_j; C) \gg I(G_i; C) + I(G_j; C) \geq 0$ further implies that the two genes $G_i$ and $G_j$ “interact” with respect to cancer, and can be used to construct a “synergy network,” a graph in which nodes represent genes and edges connect significantly high-synergy gene pairs.

J. Watkinson, X. Wang, T. Zheng, D. Anastassiou, “Identification of gene interactions associated with disease from gene expression data using synergy networks,” BMC Systems Biology, February 2008
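The construction of a synergy network can be sketched in Python as follows. This is only an illustration under stated assumptions: expression values are already discretized, the gene names and data are hypothetical, and a fixed `threshold` stands in for whatever significance test is used to decide that a pair has "significantly high" synergy, which the slides do not specify.

```python
from collections import Counter
from itertools import combinations
from math import log2

def entropy(samples):
    """Plug-in Shannon entropy (bits) of a sequence of hashable symbols."""
    n = len(samples)
    return -sum((c / n) * log2(c / n) for c in Counter(samples).values())

def mutual_information(x, y):
    """I(X; Y) estimated from paired discrete samples."""
    return entropy(x) + entropy(y) - entropy(list(zip(x, y)))

def synergy(gi, gj, c):
    """I(Gi, Gj; C) - [I(Gi; C) + I(Gj; C)]."""
    whole = mutual_information(list(zip(gi, gj)), c)
    return whole - (mutual_information(gi, c) + mutual_information(gj, c))

def synergy_network(expr, phenotype, threshold):
    """Return edges (gene_a, gene_b, synergy) for pairs above the threshold.

    expr: dict mapping gene name -> list of discretized levels, one per sample.
    threshold: hypothetical cutoff standing in for a significance test.
    """
    edges = []
    for a, b in combinations(sorted(expr), 2):
        s = synergy(expr[a], expr[b], phenotype)
        if s > threshold:
            edges.append((a, b, s))
    # Strongest cooperative pairs first.
    return sorted(edges, key=lambda e: e[2], reverse=True)

# Hypothetical toy data: all 8 combinations of three binary genes.
samples = [(a, b, d) for a in (0, 1) for b in (0, 1) for d in (0, 1)]
expr = {
    "geneA": [s[0] for s in samples],
    "geneB": [s[1] for s in samples],
    "geneD": [s[2] for s in samples],  # unrelated to the phenotype
}
# Phenotype present only when geneA is low AND geneB is high.
phenotype = [int(a == 0 and b == 1) for a, b, _ in samples]

edges = synergy_network(expr, phenotype, threshold=0.1)
print(edges)
```

In this toy data only the pair (geneA, geneB) produces an edge; pairs involving the unrelated gene have synergy of zero up to rounding and fall below the cutoff.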

Page 7: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Example (prostate cancer)

Page 8: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Example of scatter plot for highest-synergy gene pair from prostate cancer data

50 green (healthy) and 52 red (cancerous) dots

Cancer = (Low RBP1) AND (High EEF1B2)

Page 9: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Using synergy for inference of gene regulatory interactions

The “phenotype” can be the expression level of a third gene

Page 10: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Application to “Challenge 5” of the DREAM2 conference

Given: A “blinded” compendium of 300 normalized Affymetrix microarray experiments from E. coli, involving 3,456 genes, of which 120 are (also blinded) transcription factors.

Challenge: Reconstruct a genome-scale transcriptional network (identify TF-target interactions).

Score: based on known “ground truth” from chromatin precipitation and otherwise experimentally verified Transcription Factor (TF)-target interactions (from RegulonDB).

Page 11: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

“Three-way” mutual information (common to three genes)

$$I(G_1; G_2; G_3) = E\left[ \log \frac{p_{12}\, p_{23}\, p_{13}}{p_1\, p_2\, p_3\, p_{123}} \right]$$

Can be estimated from continuous data

Page 12: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Three-way mutual information is the opposite of synergy!

It turns out that $-I(G_1; G_2; G_3)$ is equal to

$$I(G_1, G_2; G_3) - [I(G_1; G_3) + I(G_2; G_3)]$$
$$= I(G_2, G_3; G_1) - [I(G_2; G_1) + I(G_3; G_1)]$$
$$= I(G_1, G_3; G_2) - [I(G_1; G_2) + I(G_3; G_2)]$$

the synergy of two of the genes with respect to the third.

$I(G_1; G_2; G_3)$ can be negative, in which case there is no Venn diagram possible.
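A short derivation may help show why the identity holds. It assumes the standard conditional form of the three-way mutual information, $I(G_1; G_2; G_3) = I(G_1; G_3) - I(G_1; G_3 \mid G_2)$, and the chain rule for mutual information; neither step is spelled out on the slide.

```latex
\begin{align*}
I(G_1, G_2; G_3)
  &= I(G_2; G_3) + I(G_1; G_3 \mid G_2)
  && \text{(chain rule)} \\
I(G_1, G_2; G_3) - \bigl[ I(G_1; G_3) + I(G_2; G_3) \bigr]
  &= I(G_1; G_3 \mid G_2) - I(G_1; G_3) \\
  &= -\bigl[ I(G_1; G_3) - I(G_1; G_3 \mid G_2) \bigr] \\
  &= -I(G_1; G_2; G_3)
\end{align*}
```

Because the three-way mutual information is symmetric in its arguments, the same steps apply with any of the three genes in the role of the "outcome," which is why all three synergy expressions coincide.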

Page 13: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Synergistic “entanglement” of three genes

If $I(G_1; G_2; G_3) \ll 0$, this suggests that there is some interaction mechanism connecting the three genes, and the positive quantity $-I(G_1; G_2; G_3)$ can be seen as measuring their synergistic “entanglement.”

In that case, one likely scenario is that one of the three genes is, at least partly or indirectly, synergistically regulated by the other two.

Page 14: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Most-likely regulated gene in a synergistically entangled triplet

The gene $G_j$ is taken as the most likely regulated gene of the triplet when

$$I(G_i, G_k; G_j) \geq \max \{ I(G_i, G_j; G_k),\ I(G_k, G_j; G_i) \}$$

Or, as it turns out, equivalently:

$$I(G_i; G_k) \leq \min \{ I(G_i; G_j),\ I(G_k; G_j) \}$$
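The equivalence of the two conditions can be checked in a few lines. The sketch below writes $\sigma$ for the common synergy $-I(G_i; G_j; G_k)$ of any two genes of the triplet with respect to the third; the symbol $\sigma$ is introduced here only for the derivation.

```latex
\begin{align*}
I(G_i, G_k; G_j) &= \sigma + I(G_i; G_j) + I(G_k; G_j) \\
I(G_i, G_j; G_k) &= \sigma + I(G_i; G_k) + I(G_j; G_k) \\
I(G_k, G_j; G_i) &= \sigma + I(G_k; G_i) + I(G_j; G_i) \\[4pt]
I(G_i, G_k; G_j) \geq I(G_i, G_j; G_k)
  &\iff I(G_i; G_j) \geq I(G_i; G_k) \\
I(G_i, G_k; G_j) \geq I(G_k, G_j; G_i)
  &\iff I(G_k; G_j) \geq I(G_i; G_k)
\end{align*}
```

Both inequalities together say that $I(G_i; G_k)$ is the smallest of the three pairwise mutual informations, that is, the two candidate regulators share less information with each other than either shares with the regulated gene.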

Page 15: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Synergistic regulation index

$$S(i, j) = \max_k \left[ -I(G_i; G_j; G_k) \right]$$

where $k$ ranges over genes satisfying $I(G_i; G_k) < I(G_i; G_j)$ and $I(G_i; G_k) < I(G_k; G_j)$

Measures the degree of confidence that gene $G_i$ cooperatively regulates gene $G_j$

It also identifies $G_k$ as the best synergistic partner of $G_i$ for the regulation of $G_j$

Can be used to augment the traditional MI measure:

$$M(i, j) = I(G_i; G_j)$$

Page 16: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Final score for $G_i \rightarrow G_j$ regulation

$$\text{Score} = M(i, j) + S(i, j)$$

computed from 2-way and 3-way MI values.

Turns out that it is equal to:

$$\max_k I(G_i; G_j \mid G_k)$$

where $k$ ranges over genes satisfying $I(G_i; G_k) < I(G_i; G_j)$ and $I(G_i; G_k) < I(G_k; G_j)$
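The equality of the two forms of the score follows from $I(G_i; G_j \mid G_k) = I(G_i; G_j) - I(G_i; G_j; G_k)$, and it can be checked numerically. The Python sketch below does so on a tiny discrete example. All names and data are hypothetical, the plug-in estimator is a stand-in for whatever estimator is applied to continuous expression data, and returning `None` when no partner gene satisfies the constraints is an assumption, since the slides do not say what happens in that case.

```python
from collections import Counter
from math import log2

def entropy(*cols):
    """Plug-in joint entropy (bits) of one or more aligned discrete columns."""
    rows = list(zip(*cols))
    n = len(rows)
    return -sum((c / n) * log2(c / n) for c in Counter(rows).values())

def mi(x, y):
    """Two-way mutual information I(X; Y)."""
    return entropy(x) + entropy(y) - entropy(x, y)

def cmi(x, y, z):
    """Conditional mutual information I(X; Y | Z)."""
    return entropy(x, z) + entropy(y, z) - entropy(x, y, z) - entropy(z)

def three_way_mi(x, y, z):
    """Three-way mutual information I(X; Y; Z) = I(X; Y) - I(X; Y | Z)."""
    return mi(x, y) - cmi(x, y, z)

def admissible_partners(data, i, j):
    """Genes k with I(Gi;Gk) < I(Gi;Gj) and I(Gi;Gk) < I(Gk;Gj)."""
    gi, gj = data[i], data[j]
    return [k for k in data if k not in (i, j)
            and mi(gi, data[k]) < mi(gi, gj)
            and mi(gi, data[k]) < mi(data[k], gj)]

def score_m_plus_s(data, i, j):
    """Score as M(i, j) + S(i, j); None if no admissible k (assumption)."""
    ks = admissible_partners(data, i, j)
    if not ks:
        return None
    s = max(-three_way_mi(data[i], data[j], data[k]) for k in ks)
    return mi(data[i], data[j]) + s

def score_conditional(data, i, j):
    """Score as max_k I(Gi; Gj | Gk); None if no admissible k (assumption)."""
    ks = admissible_partners(data, i, j)
    if not ks:
        return None
    return max(cmi(data[i], data[j], data[k]) for k in ks)

# Hypothetical toy data: "target" is on only when both regulators are on.
data = {
    "reg1":   [0, 0, 1, 1],
    "reg2":   [0, 1, 0, 1],
    "target": [0, 0, 0, 1],
}
score_a = score_m_plus_s(data, "reg1", "target")
score_b = score_conditional(data, "reg1", "target")
print(score_a, score_b)  # both approximately 0.5 bits
```

In this toy case the pairwise MI between `reg1` and `target` is about 0.31 bits, and the synergy with partner `reg2` adds about 0.19 bits, which matches the 0.5 bits of conditional mutual information.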

Page 17: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Results

GISL: Among top 150 predictions, 106 were in “ground truth.”

TEAM        SCORE: Combined log10(P value)
GISL        40.5
Team 121    25.2
Team 73     24.1
Team 41     18.7
Team 58     10.0

Page 18: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Potential for biological discovery using synergy: Large number of statistically significant entangled triplets

Page 19: The Measure of Synergy  as a Tool in Systems Biology D. Anastassiou

Conclusions and acknowledgements

Synergy-based methodologies have the potential to contribute towards empowering systems biology to achieve genuine biological discovery by identifying multiple interacting contributing factors, such as genes, SNPs and CNVs.

Co-authors:

Prof. Tian Zheng, Statistics, Columbia University
Prof. Xiaodong Wang, EE, Columbia University

Ph.D. students: John Watkinson, Kuo-ching Liang