
Upload: maximilian-gibbs

Post on 01-Jan-2016


Systems Biology

Today’s lecture will cover the following three topics:

1. Introduction to transcriptional networks
2. Regulation of the expression of the Lac operon
3. Finding Biclusters in Bipartite Graphs

Transcriptional networks

Unlike protein-protein interaction networks, transcriptional networks are directed networks.

The term "transcriptional network" generally refers to a gene regulatory network.


transcriptional networks: Basic mechanism of gene regulation


Most genes are regulated at the transcription level, and it is assumed that 5-10% of protein-coding genes encode regulatory proteins.

Some regulatory proteins play a targeted role, i.e. they take part in the regulation of only a few genes.

Some regulatory proteins play a more general role in initiating transcription (for example, the eukaryotic transcription factors of type II, or RNA polymerase itself, which is essential for the transcription of all genes).

Dedicated regulatory proteins are considered to be those that affect up to 5% of the genes of a genome.

However, the boundary between generalist and dedicated regulatory proteins is blurred.

Experiments and methods used to determine regulatory relations

1. Complementary DNA microarrays

2. Oligonucleotide chips

3. Reverse transcription polymerase chain reaction

4. Serial analysis of gene expression

5. Chromatin Immunoprecipitation

6. Bioinformatics, e.g. by identifying binding sites

Transcriptional Networks: Case study 1

An extended transcriptional regulatory network of Escherichia coli and analysis of its hierarchical structure and network motifs

Hong-Wu Ma, Bharani Kumar, Uta Ditges, Florian Gunzer, Jan Buer and An-Ping Zeng

Nucleic Acids Research, 2004, Vol. 32, No. 22 6643–6649

This work combined data sets from 3 different sources:

1. RegulonDB (version 4.0, http://www.cifn.unam.mx/Computational_Genomics/regulondb/)

2. Ecocyc (version 8.0, www.ecocyc.org)

3. Shen-Orr,S.S., Milo,R., Mangan,S. and Alon,U. (2002) Network motifs in the transcriptional regulation network of Escherichia coli. Nature Genet., 31, 64–68.

Comparison of the transcriptional regulatory network (TRN) of E. coli from the three data sources: (A) by number of genes; (B) by number of regulatory interactions.

A combined network that includes all the 2624 interactions from the three data sets has been produced.

In addition, the authors extended this network with 23 further genes and around 100 regulatory relationships found through a literature survey.

The final TRN altogether includes 1278 genes and 2724 interactions.


This work discovered a hierarchical structure in the TRN.

The hierarchical structure was identified as follows:

(1) Genes that do not code for transcription factors (TFs), or that code for a TF which only regulates its own expression (auto-regulatory loop), were assigned to layer 1 (the lowest layer).

(2) All genes in layer 1 were then removed, and TFs in the remaining network that do not regulate other genes were identified; the corresponding genes were assigned to layer 2.

(3) Step 2 was repeated, removing nodes already assigned to a layer and identifying a new layer, until all genes were assigned to a layer. As a result, a nine-layer hierarchical structure was uncovered.
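The layering steps can be sketched in a few lines of Python. This is only an illustration of the procedure as described on the slide, not the authors' code: the function name `assign_layers`, the edge-list input format and the toy genes A, B, C are assumptions made for the example.

```python
def assign_layers(edges):
    """Peel a directed regulatory network into layers, bottom-up.

    edges: iterable of (regulator, target) pairs.
    Returns (layer, unassigned): a dict gene -> layer number, and the set of
    genes that could not be layered because they sit in a multi-gene loop.
    """
    targets = {}
    nodes = set()
    for regulator, target in edges:
        nodes.update((regulator, target))
        if regulator != target:  # auto-regulatory loops are ignored
            targets.setdefault(regulator, set()).add(target)

    layer = {}
    remaining = set(nodes)
    level = 1
    while remaining:
        # Genes that regulate no other gene still present in the network.
        current = {g for g in remaining if not (targets.get(g, set()) & remaining)}
        if not current:  # only loops are left, so peeling cannot continue
            break
        for gene in current:
            layer[gene] = level
        remaining -= current
        level += 1
    return layer, remaining


# Toy network (hypothetical genes): A regulates B and C, B regulates C,
# and C regulates only itself.
example_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "C")]
layers, unassigned = assign_layers(example_edges)
```

Here C lands in layer 1, B in layer 2 and A in layer 3. Returning the unassigned genes separately is a design choice of this sketch, so that a two-gene loop is reported rather than silently placed in a layer.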

From BMC Bioinformatics 2004, 5:199, by related authors.

The hierarchical structure implies the absence of feedback loops (cycles) in the network, although auto-regulatory and inter-regulatory loops exist.

As the network is not complete, we cannot rule out that feedback loops will be found in the future; however, it seems that there would not be many.

A possible biological explanation for the existence of this hierarchical structure is that the interactions in this particular TRN are between proteins and genes, without involving metabolites.

Only after a regulating gene has been transcribed, translated and eventually further modified by cofactors or other proteins can it regulate the target gene.

Feedback from the regulated gene at the transcriptional level may delay the target gene in reaching a desired expression level in a new environment.

Feedback control may act mainly through other interactions (e.g. metabolite-protein interactions) at the post-transcriptional level rather than through transcriptional interactions between proteins and genes. For example, a gene at the bottom layer may code for a metabolic enzyme, the product of which can bind to a regulator which in turn regulates the expression of that gene. In this case, the feedback acts through a metabolite–protein interaction that changes the activity of the transcription factor and thereby affects the expression of the regulated gene.

Therefore, to fully understand the regulation of gene expression, an integrated network that includes different kinds of interactions is needed.

To calculate network motifs in the E. coli TRN, the authors removed all loops in the network (including the auto-regulatory loops and the two-gene regulatory loops). They then used the program Mfinder, developed by Kashtan et al., to generate the motif profiles.

The first four types are the so-called coherent feed-forward loops (FFLs), in which the direct effect of the up regulator is consistent with its indirect effect through the mid regulator. In contrast, the last four types of FFLs are incoherent because the direct effect of the up regulator contradicts its indirect effect.
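The coherent/incoherent distinction reduces to a sign comparison, which the following sketch makes explicit. It assumes each regulatory edge is encoded as +1 (activation) or -1 (repression); the encoding and the function name `ffl_is_coherent` are illustrative choices, not taken from the paper or from Mfinder.

```python
from itertools import product


def ffl_is_coherent(up_to_target, up_to_mid, mid_to_target):
    """Return True if a feed-forward loop is coherent.

    Each argument is the sign of one edge: +1 for activation, -1 for repression.
    The indirect effect of the up regulator is the product of the two signs
    along the path through the mid regulator.
    """
    indirect_effect = up_to_mid * mid_to_target
    return up_to_target == indirect_effect


# Enumerate all 2^3 sign combinations of the three FFL edges.
ffl_types = list(product((+1, -1), repeat=3))
coherent = [signs for signs in ffl_types if ffl_is_coherent(*signs)]
incoherent = [signs for signs in ffl_types if not ffl_is_coherent(*signs)]
```

Enumerating the eight sign combinations gives four coherent and four incoherent types, matching the split described above.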

(A) Gene gadA is regulated by six FFLs. (B) Gene lpd is regulated by five FFLs. (C) Gene slp is regulated by 17 regulators.

Transcriptional Network: Case study 2

Topological and causal structure of the yeast transcriptional regulatory network

Nabil Guelzim, Samuele Bottani, Paul Bourgine and François Képès

Nature Genetics, volume 31, May 2002

In this work, the yeast transcriptional network was constructed by manual inspection of the websites of MIPS, SwissProt, the Yeast Protein Database, the S. cerevisiae Promoter Database and the Saccharomyces Genome Database.

The network consists of 491 genes and 909 regulatory relations.

Bold type indicates self-activation, bold italics indicates self-inhibition and borders indicate essential genes. Thick lines represent activation, thin lines represent inhibition and the dashed gray line represents dual regulation.

The indegree distribution of this yeast transcriptional network is exponential.

[Figure: typical exponential distribution on a normal (linear) scale]

[Figure: indegree distribution of the transcriptional network on a semi-log scale]

Open squares, full line: all 402 regulated genes (367 non-regulatory and 35 inter-regulatory genes), 909 connections; $p(k) = 157\,e^{-0.45k}$; R = 0.99.

Filled circles, broken line: the subset of 35 inter-regulatory genes, 72 connections; $p(k) = 15\,e^{-0.43k}$; R = 0.94.

The outdegree distribution of this yeast transcriptional network follows a power law.

[Figure: typical power-law distribution on a normal (linear) scale]

[Figure: outdegree distribution of the transcriptional network on a log-log scale]

Open squares, full line: all 124 regulating proteins, 909 connections; $P(k) = 23\,k^{-0.87}$; R = 0.95.

Filled circles, broken line: the 37 regulating proteins that control regulatory genes, 72 connections; $P(k) = 19\,k^{-1.14}$; R = 0.99.
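Both kinds of fit are straight-line regressions after a change of scale: an exponential is linear on a semi-log plot, and a power law is linear on a log-log plot. The sketch below shows this with an ordinary least-squares fit. The data points are synthetic, generated from the fitted curves quoted for the full network; they are not the paper's raw degree counts, and the helper names are assumptions.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def fit_exponential(ks, ps):
    """Fit p(k) = a * exp(-b * k) via a line through (k, log p)."""
    slope, intercept = linear_fit(ks, [math.log(p) for p in ps])
    return math.exp(intercept), -slope


def fit_power_law(ks, ps):
    """Fit P(k) = a * k**(-g) via a line through (log k, log P)."""
    slope, intercept = linear_fit(
        [math.log(k) for k in ks], [math.log(p) for p in ps]
    )
    return math.exp(intercept), -slope


# Synthetic points generated from the quoted curves (not measured data).
ks = [1, 2, 3, 4, 5, 6, 7, 8]
a_exp, b_exp = fit_exponential(ks, [157 * math.exp(-0.45 * k) for k in ks])
a_pow, g_pow = fit_power_law(ks, [23 * k ** -0.87 for k in ks])
```

On noise-free synthetic points the fits simply recover the parameters they were generated from. With real degree counts, the quality of the straight-line fit on each scale is what distinguishes the two distributions.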


The operon

An operon is a functioning unit of genomic material containing a cluster of genes under the control of a single regulatory signal or promoter.

The genes are transcribed together into an mRNA strand and are either translated together in the cytoplasm or undergo trans-splicing to create monocistronic mRNAs that are translated separately.

As a result, the genes contained in the operon are either expressed together or not at all.

Operons were originally thought to exist solely in prokaryotes, but since the discovery of the first operons in eukaryotes in the early 1990s, more evidence has arisen to suggest that they are more common than previously assumed.

The Lac operon

The lac operon of E. coli consists of three genes: LacZ, LacY and LacA.

They code for enzymes needed for processing lactose.

LacI is an adjacent gene that acts as a regulator (transcriptional repressor) of the Lac operon.

Besides the promoter-operator region, there is a region where a complex called CAP binds, which affects transcription positively.

LacZ codes for the enzyme β-galactosidase, and LacY codes for lactose permease, which facilitates the flux of lactose through the cell membrane.

LacA is not directly involved in processing lactose.

Source: Models of cellular regulation by Baltazar D. Aguda and Avner Friedman

Static model of the regulation of the expression of the Lac operon

The LacI tetramer binds at the promoter-operator region and stops transcription.

The CAP complex binds the CAP region and enhances the binding of RNA polymerase.

cAMP binds and LacI is suppressed by Allolactose

cAMP cannot bind and repressor protein LacI binds

cAMP binds and repressor protein LacI binds

cAMP cannot bind and LacI is suppressed by Allolactose

Summary in Table
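The four cases above can be written as a small truth table. The sketch below assumes the usual qualitative reading of the static model: a bound repressor switches the operon off, and once the repressor is released by allolactose, the bound cAMP (CAP complex) raises expression from a low to a high level. The labels "off", "low" and "high" and the function name are illustrative, not taken from the source table.

```python
def lac_operon_expression(camp_bound, laci_suppressed):
    """Qualitative expression level of the Lac operon.

    camp_bound: True if cAMP binds, so the CAP complex can assist RNA polymerase.
    laci_suppressed: True if allolactose keeps the LacI repressor off the DNA.
    """
    if not laci_suppressed:
        return "off"  # repressor bound: transcription is blocked
    return "high" if camp_bound else "low"


# Print the four combinations listed above.
for camp_bound in (True, False):
    for laci_suppressed in (True, False):
        level = lac_operon_expression(camp_bound, laci_suppressed)
        print(f"cAMP bound={camp_bound}, LacI suppressed={laci_suppressed}: {level}")
```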


The technique of finding biclusters can be used to determine co-expressed gene groups


Definition of a bicluster

Given an n × p data matrix X, where n is the number of objects (e.g. genes) and p is the number of conditions (e.g. arrays), a bicluster is defined as a submatrix X_IJ of X within which a subset of objects I exhibits similar behavior across a subset of conditions J.

An n × p data matrix X can easily be converted to a bipartite graph, for example by applying a threshold to its entries.

Finding biclusters (densely connected regions) in a bipartite graph is a similar problem.

A Graph G=(V,E) is bipartite if its vertex set V can be partitioned into two subsets V1, V2 such that each edge of E has one end vertex in V1 and another in V2.
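The definition can be checked mechanically by trying to two-colour the graph with a breadth-first search; the two colour classes are then V1 and V2. This is a generic textbook check, not part of the lecture material, and the function name `bipartition` is an assumption.

```python
from collections import deque


def bipartition(vertices, edges):
    """Return (V1, V2) if the undirected graph is bipartite, otherwise None."""
    adjacency = {v: set() for v in vertices}
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    side = {}
    for start in vertices:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for w in adjacency[u]:
                if w not in side:
                    side[w] = 1 - side[u]  # neighbours go to the other side
                    queue.append(w)
                elif side[w] == side[u]:
                    return None  # an edge inside one side: not bipartite
    v1 = {v for v in vertices if side[v] == 0}
    v2 = {v for v in vertices if side[v] == 1}
    return v1, v2


# Genes in upper case, conditions in lower case (toy example).
parts = bipartition(["A", "B", "a", "b"], [("A", "a"), ("A", "b"), ("B", "a")])
```

For a graph with several connected components the split is not unique; the sketch puts the first vertex of each component into V1.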

[Figure: a bipartite graph with vertex sets V1 and V2]

Biclusters are densely connected regions in a bipartite graph

C-d  A-a  G-g  I-f  K-k
D-c  A-b  G-h  I-g  L-i
D-d  B-a  H-e  I-h  L-j
E-c  B-b  H-f  J-f  L-k
E-d  C-a  H-g  J-g  M-l
F-c  C-b  H-h  K-h  M-m
F-d  D-a  I-e  G-f  N-l
G-d  D-b  K-i  C-c  N-m
K-j

Gene expression data can be represented as bipartite graphs

gene/cond.   cond0   cond1   cond2   cond3   cond4
YAL005C       2.85    3.34    0       0       0
YAL012W       0.21    0.03    0.18   -0.27   -0.32
YAL014C      -0.03   -0.07    0.28    0.32   -0.27
YAL015C      -0.25    0.58    0.77    0.28    0.32
YAL016W       0.11    0.04    0.75    0.82    0.21
YAL017W       0.24    0.31    0.95    0.12    0.18
YAL021C      -0.3     0.22    0.02   -0.64    0.06

By transforming the highest 5% of values to 1:

gene/cond.   cond0   cond1   cond2   cond3   cond4
YAL005C       1       1       0       0       0
YAL012W       0       0       0       0       0
YAL014C       0       0       0       0       0
YAL015C       0       0       0       0       0
YAL016W       0       0       0       1       0
YAL017W       0       0       1       0       0
YAL021C       0       0       0       0       0

Before transforming, the data can be normalized.

Biclusters in gene expression data represent transcription modules/co-expressed gene groups.
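The conversion from expression values to bipartite edges can be sketched as follows. The slide derives the cutoff from the top 5% of values, presumably over the full data set rather than this seven-gene excerpt. Since that cutoff is not stated, the value 0.8 used here is a hypothetical choice that happens to reproduce the binary matrix shown; the function names are also assumptions.

```python
genes = ["YAL005C", "YAL012W", "YAL014C", "YAL015C", "YAL016W", "YAL017W", "YAL021C"]
conditions = ["cond0", "cond1", "cond2", "cond3", "cond4"]
expression = [
    [2.85, 3.34, 0, 0, 0],
    [0.21, 0.03, 0.18, -0.27, -0.32],
    [-0.03, -0.07, 0.28, 0.32, -0.27],
    [-0.25, 0.58, 0.77, 0.28, 0.32],
    [0.11, 0.04, 0.75, 0.82, 0.21],
    [0.24, 0.31, 0.95, 0.12, 0.18],
    [-0.3, 0.22, 0.02, -0.64, 0.06],
]

CUTOFF = 0.8  # hypothetical; in practice taken from the top 5% of the full data


def binarize(matrix, cutoff):
    """Set entries at or above the cutoff to 1 and all others to 0."""
    return [[1 if value >= cutoff else 0 for value in row] for row in matrix]


def bipartite_edges(binary, row_names, col_names):
    """List one (gene, condition) edge per 1-entry of the binary matrix."""
    return [
        (row_names[i], col_names[j])
        for i, row in enumerate(binary)
        for j, bit in enumerate(row)
        if bit
    ]


binary = binarize(expression, CUTOFF)
edges = bipartite_edges(binary, genes, conditions)
```

Genes form one vertex set and conditions the other, so every edge joins the two sets and the resulting graph is bipartite by construction.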


•Tanay,A. et al. (2002) Discovering statistically significant biclusters in gene expression data. Bioinformatics, 18 (Suppl. 1), S136–S144.

•Ihmels,J. et al. (2002) Revealing modular organization in the yeast transcriptional network. Nat. Genet., 31, 370–377.

•Ben-Dor,A., Chor,B., Karp,R. and Yakhini,Z. (2002) Discovering local structure in gene expression data: the order-preserving sub-matrix problem. In Proceedings of the 6th Annual International Conference on Computational Biology, ACM Press, New York, NY, USA, pp. 49–57.

•Cheng,Y. and Church,G. (2000) Biclustering of expression data. Proc. Int. Conf. Intell. Syst. Mol. Biol. pp. 93–103.

•Murali,T.M. and Kasif,S. (2003) Extracting conserved gene expression motifs from gene expression data. Pac. Symp. Biocomput., 8, 77–88.


We propose a biclustering method incorporating DPClus

An example bipartite graph and its corresponding matrix:

G/E  a b c d e f g h i j k l m
A    1 1 0 0 0 0 0 0 0 0 0 0 0
B    1 1 0 0 0 0 0 0 0 0 0 0 0
C    1 1 1 1 0 0 0 0 0 0 0 0 0
D    1 1 1 1 0 0 0 0 0 0 0 0 0
E    0 0 1 1 0 0 0 0 0 0 0 0 0
F    0 0 1 1 0 0 0 0 0 0 0 0 0
G    0 0 0 1 1 1 1 0 0 0 0 0 0
H    0 0 0 0 1 1 1 1 0 0 0 0 0
I    0 0 0 0 1 1 1 1 0 0 0 0 0
J    0 0 0 0 1 1 0 0 0 0 0 0 0
K    0 0 0 0 0 0 0 1 1 1 1 0 0
L    0 0 0 0 0 0 0 0 1 1 1 0 0
M    0 0 0 0 0 0 0 0 0 0 0 1 1
N    0 0 0 0 0 0 0 0 0 0 0 1 1

$$(M_{CN})_{ik} = \sum_{j=0}^{|E|-1} (M_{BG})_{ij} \times (M_{BG})_{kj} \quad (\text{for } i \neq k)$$
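The common neighbor count is the number of columns in which two rows of the bipartite matrix both contain a 1. The sketch below computes it for the example matrix above. For brevity each row is written as the string of column letters that hold a 1; this encoding and the function name are assumptions of the example.

```python
# Rows of the example bipartite matrix, written as the columns holding a 1.
bipartite_rows = {
    "A": "ab", "B": "ab", "C": "abcd", "D": "abcd", "E": "cd", "F": "cd",
    "G": "defg", "H": "efgh", "I": "efgh", "J": "ef",
    "K": "hijk", "L": "ijk", "M": "lm", "N": "lm",
}
columns = "abcdefghijklm"
names = list(bipartite_rows)
m_bg = [[1 if c in bipartite_rows[g] else 0 for c in columns] for g in names]


def common_neighbor_matrix(m_bg):
    """Entry (i, k) counts the columns shared by rows i and k; diagonal stays 0."""
    n = len(m_bg)
    m_cn = [[0] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if i != k:
                m_cn[i][k] = sum(a * b for a, b in zip(m_bg[i], m_bg[k]))
    return m_cn


m_cn = common_neighbor_matrix(m_bg)
```

For instance, rows C and D share the four columns a, b, c and d, so their entry is 4, while G and K share none.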


BiClus: Biclustering method incorporating DPClus

For each row i (i = 0 to |G|−1) of $M_{CN}$, we calculate

$$\mathrm{threshold}_i = \mathrm{avg}_i + (\mathrm{max}_i - \mathrm{avg}_i) \times G_{margin}$$

and set $(M_{SG})_{ik} = (M_{SG})_{ki} = 1$ if $(M_{CN})_{ik} \geq \mathrm{threshold}_i$ and $\mathrm{threshold}_i$ is not an indeterminate number (for k = 0 to |G|−1).

Here, $\mathrm{avg}_i = \mathrm{SUM}_i / n_i$, where $n_i$ is the number of non-zero entries in row i of $M_{CN}$, and $\mathrm{max}_i$ is the maximum value of the entries in row i of $M_{CN}$.

Gmargin is a user-defined value.

Common neighbor matrix of the bipartite graph:

   A B C D E F G H I J K L M N
A  0 2 2 2 0 0 0 0 0 0 0 0 0 0
B  2 0 2 2 0 0 0 0 0 0 0 0 0 0
C  2 2 0 4 2 2 1 0 0 0 0 0 0 0
D  2 2 4 0 2 2 1 0 0 0 0 0 0 0
E  0 0 2 2 0 2 1 0 0 0 0 0 0 0
F  0 0 2 2 2 0 1 0 0 0 0 0 0 0
G  0 0 1 1 1 1 0 3 3 2 0 0 0 0
H  0 0 0 0 0 0 3 0 4 2 1 0 0 0
I  0 0 0 0 0 0 3 4 0 2 1 0 0 0
J  0 0 0 0 0 0 2 2 2 0 0 0 0 0
K  0 0 0 0 0 0 0 1 1 0 0 3 0 0
L  0 0 0 0 0 0 0 0 0 0 3 0 0 0
M  0 0 0 0 0 0 0 0 0 0 0 0 0 2
N  0 0 0 0 0 0 0 0 0 0 0 0 2 0

   A B C D E F G H I J K L M N
A  0 1 1 1 0 0 0 0 0 0 0 0 0 0
B  1 0 1 1 0 0 0 0 0 0 0 0 0 0
C  1 1 0 1 1 1 0 0 0 0 0 0 0 0
D  1 1 1 0 1 1 0 0 0 0 0 0 0 0
E  0 0 1 1 0 1 1 0 0 0 0 0 0 0
F  0 0 1 1 1 0 0 0 0 0 0 0 0 0
G  0 0 0 0 1 0 0 1 1 1 0 0 0 0
H  0 0 0 0 0 0 1 0 1 1 0 0 0 0
I  0 0 0 0 0 0 1 1 0 1 0 0 0 0
J  0 0 0 0 0 0 1 1 1 0 0 0 0 0
K  0 0 0 0 0 0 0 0 0 0 0 1 0 0
L  0 0 0 0 0 0 0 0 0 0 1 0 0 0
M  0 0 0 0 0 0 0 0 0 0 0 0 0 1
N  0 0 0 0 0 0 0 0 0 0 0 0 1 0

This matrix represents a simple graph
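The row-wise thresholding that turns the common neighbor matrix into this simple graph can be sketched as below. The sketch makes several assumptions that the slides do not state: SUM_i is taken to be the sum of the entries of row i, the comparison is "greater than or equal", and Gmargin is set to a hypothetical 0.5. It is therefore not guaranteed to reproduce every entry of the matrix shown above.

```python
# Common neighbor matrix from the earlier slide, rows A..N.
M_CN_ROWS = [
    "0 2 2 2 0 0 0 0 0 0 0 0 0 0",
    "2 0 2 2 0 0 0 0 0 0 0 0 0 0",
    "2 2 0 4 2 2 1 0 0 0 0 0 0 0",
    "2 2 4 0 2 2 1 0 0 0 0 0 0 0",
    "0 0 2 2 0 2 1 0 0 0 0 0 0 0",
    "0 0 2 2 2 0 1 0 0 0 0 0 0 0",
    "0 0 1 1 1 1 0 3 3 2 0 0 0 0",
    "0 0 0 0 0 0 3 0 4 2 1 0 0 0",
    "0 0 0 0 0 0 3 4 0 2 1 0 0 0",
    "0 0 0 0 0 0 2 2 2 0 0 0 0 0",
    "0 0 0 0 0 0 0 1 1 0 0 3 0 0",
    "0 0 0 0 0 0 0 0 0 0 3 0 0 0",
    "0 0 0 0 0 0 0 0 0 0 0 0 0 2",
    "0 0 0 0 0 0 0 0 0 0 0 0 2 0",
]
m_cn = [[int(x) for x in row.split()] for row in M_CN_ROWS]


def simple_graph_matrix(m_cn, g_margin):
    """Threshold each row of the common neighbor matrix into a symmetric 0/1 matrix."""
    n = len(m_cn)
    m_sg = [[0] * n for _ in range(n)]
    for i, row in enumerate(m_cn):
        nonzero = [v for v in row if v != 0]
        if not nonzero:  # avg_i would be 0/0 (indeterminate): skip this row
            continue
        avg = sum(nonzero) / len(nonzero)
        threshold = avg + (max(row) - avg) * g_margin
        for k, value in enumerate(row):
            if k != i and value >= threshold:
                m_sg[i][k] = m_sg[k][i] = 1  # set both directions
    return m_sg


m_sg = simple_graph_matrix(m_cn, g_margin=0.5)  # 0.5 is a hypothetical Gmargin
```

Because an edge is set in both directions whenever either row's threshold is met, a pair such as G and J is linked through J's row (all entries equal, so the threshold equals that common value) even though the same entry falls below the threshold of G's row.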


Simple graph derived from the common neighbor matrix.

We can use DPClus to find clusters in the simple graph.

[Figures: clustering by DPClus]

[Figure: finally determined biclusters]

Evaluation of BiClus

- Using synthetic data
- Using real data

Synthetic data: artificially embedded biclusters with noise

Synthetic data: artificially embedded biclusters with overlap

Evaluation of BiClus

Let M1, M2 be two sets of biclusters. The gene match score of M1 with respect to M2 is given by the function

$$S_G^{*}(M_1, M_2) = \frac{1}{|M_1|} \sum_{(G_1, C_1) \in M_1} \max_{(G_2, C_2) \in M_2} \frac{|G_1 \cap G_2|}{|G_1 \cup G_2|}$$

A systematic comparison and evaluation of biclustering methods for gene expression data

Amela Prelić, Stefan Bleuler, Philip Zimmermann, Anja Wille, Peter Bühlmann, Wilhelm Gruissem, Lars Hennig, Lothar Thiele and Eckart Zitzler

Bioinformatics, Vol. 22 no. 9 2006, pages 1122–1129
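The gene match score averages, over the biclusters of M1, the best Jaccard overlap between their gene set and the gene set of any bicluster in M2. A minimal sketch follows, assuming each bicluster is stored as a (genes, conditions) pair of non-empty Python sets; the representation, the function name and the toy biclusters are illustrative.

```python
def gene_match_score(m1, m2):
    """Gene match score of bicluster set m1 with respect to m2.

    m1, m2: non-empty lists of biclusters, each a (genes, conditions) pair of sets.
    Only the gene sets enter the score; the condition sets are ignored.
    """
    if not m1 or not m2:
        raise ValueError("both bicluster sets must be non-empty")
    total = 0.0
    for genes1, _ in m1:
        # Best Jaccard index between this gene set and any gene set in m2.
        total += max(len(genes1 & genes2) / len(genes1 | genes2) for genes2, _ in m2)
    return total / len(m1)


# Toy bicluster sets (hypothetical).
found = [({"A", "B"}, {"a"}), ({"C", "D"}, {"b"})]
reference = [({"A", "B", "C"}, {"a"})]
score = gene_match_score(found, reference)
```

The score is not symmetric: here the score of `found` with respect to `reference` is (2/3 + 1/4) / 2, whereas the score of `reference` with respect to `found` is 2/3.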

[Figure: "effect of relevance of BCs". Average matching score (y-axis) against noise level (x-axis, 0 to 0.3) for SAMBA and BiClus, on synthetic data with artificially embedded biclusters with noise.]

[Figure: "regulatory complexity: relevance of BCs". Average matching score (y-axis) against overlap degree (x-axis, 0 to 9) for SAMBA and BiClus, on synthetic data with artificially embedded biclusters with overlap.]

Gasch,A.P. et al. (2000) Genomic expression programs in the response of yeast cells to environmental changes. Mol. Biol. Cell, 11, 4241–4257.

Gene expression data collected from the above work


Evaluation of BiClus: real gene expression data of yeast

[Figure: modules found in the yeast data, labelled with P-values 0.001, 0.01, 0.003 and 0.002]

P-values represent the statistical significance of the functional richness of the modules.

P-values were calculated using FuncAssociate: The Gene Set Functionator, from http://llama.med.harvard.edu/cgi/func/funcassociate