Chapter 8: The Topology of Biological Networks: Overview
1
Prof. Yechiam Yemini (YY)
Computer Science Department, Columbia University
Chapter 8: The Topology of Biological Networks
8.1 Introduction
2
Overview
  A gallery of networks
  Scale-free network models
3
A Gallery of Networks
4
Introduction
Network abstractions
  Node: biological object
  Edge: interaction between nodes
Regulatory networks
  Node: gene; edge: regulatory interaction
Metabolic networks
  Node: metabolite; edge: reaction
Protein networks
  Node: protein; edge: interaction
  Node: module; edge: interaction
  Node: complex; edge: sharing a protein
  Node: residue; edge: folding neighbors
What Can Network Abstractions Teach Us About Biological Systems?
5
The E. coli Regulatory Network
Node = TFs; Edge = regulatory interaction
Hierarchical structure and modules in the Escherichia coli regulatory network.
Hong-Wu Ma, Jan Buer, and An-Ping Zeng
http://www.biomedcentral.com/1471-2105/5/199
MODULAR VIEW
HIERARCHICAL VIEW
UNORGANIZED VIEW
6
Regulatory Network of E. coli
E. coli: 105 TFs affect 749 genes
7 TFs regulate more than half (>0.5) of the genes
Connectivity distribution
  Egress: follows a power law
  Ingress: follows an exponential (Shen-Orr)
Martinez-Antonio, Collado-Vides, Curr Opin Microbiol 6, 482 (2003)
7
Yeast Regulation
The colour scheme depicts functional category: orange, mitotic cell cycle; pink, budding and filament formation; green, amino acid metabolism; yellow, nitrogen and sulphur utilization; blue, C-compound and carbohydrate utilization; red, TFs; grey, unspecific or several functional categories.
http://www.biochemj.org/bj/381/0001/bj3810001.htm
Node = TFs; Edge = regulatory interaction
Charting gene regulatory networks: strategies, challenges and perspectives
Gong-Hong Wei, De-Pei Liu and Chih-Chuan Liang; Biochem J. 2004 (381)
8
Yeast Regulatory Network
Sergei Maslov
http://www.cmth.bnl.gov/~maslov/rockefeller_2002_networks.ppt
9
Metabolic Network
Node = metabolites; Edge = reaction
Ravasz et al., Science Vol 297, 2002
E-Coli
Human
10
Signaling Networks
http://www.cs.tau.ac.il/~spike/
www.bioscience.org/1998/v3/d/malumbre/fig2.jpg
MAPK signaling pathway
11
Yeast P2P Interaction Network
http://www.macdevcenter.com/pub/a/mac/2004/08/20/bioinformatics.html
http://www.imb-jena.de/tsb/yeast.html
Node = proteins; Edge = interaction
12
Yeast P2P Domain Interaction Network
(A) Yeast SH3 domain protein-protein network; proteins are colored according to their k-core value (6-core = black, 5-core = cyan, 4-core = blue, 3-core = red, 2-core = green, 1-core = yellow), identifying subnets in which each protein has at least k interactions. By definition, lower core numbers encompass all higher core numbers (e.g. the 4-core subgraph includes the 4-core, 5-core and 6-core). The 6-core subgraph is highlighted in red and depicted in (B).
http://www.utoronto.ca/boonelab/proteomics.htm
Node = domain; Edge = interaction
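The k-core idea in the caption (subnets in which each protein has at least k interactions) can be sketched by repeatedly pruning low-degree nodes. This is a minimal illustration, not the method used for the figure; the function name `k_core` and the toy graph are assumptions made for this example.

```python
def k_core(adj, k):
    """Return the node set of the k-core of an undirected graph.

    adj maps each node to the set of its neighbours.
    """
    alive = set(adj)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            # Count only neighbours that are still in the candidate subgraph.
            if len(adj[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


# Hypothetical toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c"},
    "c": {"a", "b", "d"},
    "d": {"c"},
}
print(sorted(k_core(toy, 1)))  # all four nodes
print(sorted(k_core(toy, 2)))  # the triangle only
print(sorted(k_core(toy, 3)))  # empty
```

The nesting stated in the caption shows up directly: the 2-core is a subset of the 1-core.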
13
A Network of Protein Complexes
Red, cell cycle; dark green, signalling; dark blue, transcription, DNA maintenance, chromatin structure; pink, protein and RNA transport; orange, RNA metabolism; light green, protein synthesis; brown, cell polarity and structure; violet, intermediate and energy metabolism; light blue, membrane biogenesis and traffic.
Lower panel is an example of a complex (yeast TAP-C212) linked to two other complexes (yeast TAP-C77 and TAP-C110) by shared components.
http://www.genomenewsnetwork.org/articles/01_02/Yeast_proteins_image1.shtml
Node = complex; Edge = shared proteins; Color = role
14
Key Question: Are Biological Networks Random?
Or do they reflect hidden organizational principles?
First answer: biological networks are random
  They are organized through scale-free random evolution
  Barabási group: Jeong et al., Nature 407, 651-654 (2000)
Second answer: regulatory networks are not random
  They are organized from statistically significant motifs
  Uri Alon’s group: Shen-Orr et al., Nature Gen. 31, 64 (2002)
Both can be correct
  The contradiction is only apparent
15
Statistical Topology Features
16
Random Networks (Erdos Renyi, 1959)
G(n,p): a graph on n nodes in which each possible edge is present with probability p
  Toss a coin with probability p to select each edge
  Average degree: $d = p(n-1) \approx pn$
  Probability that a node has k edges (each node has n-1 possible edges; the whole graph has n(n-1)/2): $p(k) = \binom{n-1}{k} p^k (1-p)^{n-1-k} \approx \frac{d^k}{k!} e^{-d}$
G(n,p(n)) has a property F if $\Pr[G(n,p(n)) \in F] \to 1$ as $n \to \infty$
Main result: many properties F have threshold behavior
  There exists $p^*(n)$ such that
  if $p(n)/p^*(n) > 1$ then $\Pr[G(n,p(n)) \in F] \to 1$, and if $p(n)/p^*(n) < 1$ then $\Pr[G(n,p(n)) \in F] \to 0$
Example: F = connectivity, $p^*(n) = \ln(n)/n$
  As p(n) increases towards p*(n), the graph grows a giant component
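The coin-tossing construction of G(n,p) can be sketched in a few lines, which also gives a quick check that the observed average degree is close to p(n-1). This is a minimal sketch; the function names and the parameter values are assumptions chosen for illustration.

```python
import random
from itertools import combinations


def erdos_renyi(n, p, seed=None):
    """Return the edge list of one G(n, p) sample on nodes 0..n-1."""
    rng = random.Random(seed)
    # Toss a biased coin once for each of the n(n-1)/2 possible edges.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]


def average_degree(n, edges):
    """Each edge contributes to the degree of two nodes."""
    return 2 * len(edges) / n


n, p = 500, 0.02  # illustrative values, not taken from the slides
edges = erdos_renyi(n, p, seed=1)
print("expected d = p(n-1):", p * (n - 1))
print("observed average degree:", average_degree(n, edges))
```

The two printed numbers should be close but not identical, since the sample is random.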
17
Topology Measures of ER Randomness
Degree distribution: Poisson, $p(k) \sim d^k/k!$
Clustering: C = p, where C(k) = fraction of clique filled (share of the possible links among a node's k neighbours that exist)
Path length: $L \sim \ln N$
18
ER Does Not Model Many Real-World Networks
Watts-Strogatz (98): many real networks have
  (A) a high degree of clustering (cliquishness), and
  (B) a short average path length (small-world separation)
Network     C         Crand    L         N
WWW         0.1078    0.00023  3.1       153127
Internet    0.18-0.3  0.001    3.7-3.76  3015-6209
Actors      0.79      0.00027  3.65      225226
Coauthors   0.43      0.00018  5.9       52909
Metabolic   0.32      0.026    2.9       282
Foodweb     0.22      0.06     2.43      134
C. elegans  0.28      0.05     2.65      282
19
Watts-Strogatz Small World Networks
Start with a deterministic k-regular ring
Rewire each connection with probability p
As p increases, the network converges to a random network
[Figure: three panels in order of increasing p: L=100, d=49.51, C=0.67; L=14, d=11.1, C=0.63; L=5, d=4.46, C=0.01]
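The two steps of the construction (build a k-regular ring, then rewire each connection with probability p) can be sketched as follows. This is one simple variant written for illustration: the function names are assumptions, and the rule of keeping endpoint u and redrawing the other endpoint uniformly is a design choice of this sketch rather than something the slide specifies.

```python
import random


def ring_lattice(n, k):
    """Deterministic k-regular ring: each node links to k/2 neighbours per side."""
    edges = set()
    for u in range(n):
        for step in range(1, k // 2 + 1):
            v = (u + step) % n
            edges.add((min(u, v), max(u, v)))
    return edges


def rewire(n, edges, p, seed=None):
    """Rewire each edge with probability p, keeping the edge count fixed."""
    rng = random.Random(seed)
    result = set(edges)
    for u, v in sorted(edges):
        if rng.random() < p:
            # Candidate new endpoints: no self-loop, no duplicate edge.
            choices = [w for w in range(n)
                       if w != u and (min(u, w), max(u, w)) not in result]
            if choices:
                w = rng.choice(choices)
                result.remove((u, v))
                result.add((min(u, w), max(u, w)))
    return result


lattice = ring_lattice(20, 4)
small_world = rewire(20, lattice, p=0.1, seed=1)
print(len(lattice), len(small_world))  # edge count is unchanged by rewiring
```

With p = 0 the ring is returned untouched; with p = 1 every edge is redrawn, which approaches a random graph with the same number of edges.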
20
But Many Real Networks Have Power-Law Degree Distribution
$P(k) \sim k^{-\gamma}$
  A: actors, γ = 2.3
  B: WWW, γ = 2.67
  C: power grid, γ = 4
Faloutsos & Faloutsos 99, Internet graph
  AS graph: γ = 2.1
  Routers: γ = 2.48
21
Scale Free Networks
Scale free = power-law degree distribution: $p(k) \sim k^{-\gamma}$
Why is a power law “scale free”?
  If k is scaled by a factor α, then $p(\alpha k)/p(k) = \alpha^{-\gamma}$ regardless of k
  Contrast with ER: $p(k) \sim d^k/k!$ gives $p(\alpha k)/p(k) = d^{(\alpha-1)k}\, k!/(\alpha k)!$, which depends on k
Topological features of SF nets
  γ = 2: hub-and-spoke topology
  2 < γ < 3: small number of hubs
  γ > 3: network is dispersed
Topology measures:
  $L \sim \ln(\ln N)$ for 2 < γ < 3
  C(k) constant
  $P(k) \sim k^{-3}$
A.-L.Barabási, R. Albert, Science 286, 509 (1999)
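The scale-free property on this slide can be checked numerically: for a power law the ratio p(αk)/p(k) is the same at every k, while for the Poisson degree distribution of an ER graph it changes with k. The values of α, γ and the Poisson mean d below are arbitrary choices for illustration.

```python
import math


def power_law(k, gamma):
    """Unnormalised power-law weight k^(-gamma)."""
    return k ** (-gamma)


def poisson(k, d):
    """Poisson probability of k with mean d."""
    return d ** k * math.exp(-d) / math.factorial(k)


alpha, gamma, d = 2, 2.5, 4.0  # illustrative values
for k in (1, 5, 10, 20):
    pl_ratio = power_law(alpha * k, gamma) / power_law(k, gamma)
    po_ratio = poisson(alpha * k, d) / poisson(k, d)
    print(k, round(pl_ratio, 6), po_ratio)
```

The middle column stays at α^(-γ) for every k; the last column shrinks rapidly as k grows.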
22
Scale Free vs. Random
Poisson distribution
Exponential Network
Power-law distribution
Scale-free Network
23
Scale Free Network Examples (Barabasi 01)
24
How Do Scale Free Networks Arise?
Evolution through preferential attachment:
  A new node connects to node i with probability $\pi(k_i) = k_i / \sum_j k_j$, where $k_i$ is the degree of node i
A.-L.Barabási, R. Albert, Science 286, 509 (1999)
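Preferential attachment can be sketched with a list that holds every node once per incident edge, so that a uniform draw from the list selects node i with probability k_i / Σ_j k_j. This is a minimal sketch under stated assumptions: the seed graph (a complete graph on m + 1 nodes), the parameter m (edges added per new node) and the function name are choices made here, and redrawing until the m targets are distinct makes the draw only approximately proportional to degree.

```python
import random


def preferential_attachment(n, m, seed=None):
    """Grow a graph to n nodes; each new node adds m edges to distinct targets."""
    rng = random.Random(seed)
    # Seed graph (an assumption of this sketch): complete graph on m + 1 nodes.
    edges = [(u, v) for u in range(m + 1) for v in range(u + 1, m + 1)]
    # Each node appears here once per incident edge, so a uniform draw
    # picks node i with probability k_i / sum_j k_j.
    endpoints = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(rng.choice(endpoints))
        for t in sorted(targets):
            edges.append((t, new))
            endpoints.extend((t, new))
    return edges


edges = preferential_attachment(n=50, m=2, seed=1)
degree = {}
for u, v in edges:
    degree[u] = degree.get(u, 0) + 1
    degree[v] = degree.get(v, 0) + 1
print("edges:", len(edges), "max degree:", max(degree.values()))
```

Early nodes tend to collect the most links, which is the "rich get richer" effect behind the hubs.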
25
Global Topology Features
26
Characterizing Metabolic Networks
Jeong et al, "The large-scale organization of metabolic networks", Nature 407 651 (2000)
27
Metabolic Nets Have Power Law Distribution
H. Jeong et al, Nature, 407 651 (2000)
28
Clustering
Metabolic networks Protein networks
E. Ravasz et al., Science, 2002
29
The P53 Tumor Suppressor Network
Vogelstein et al, Nature 2000
30
Robustness: SF Nets Are Robust WRT Failures
Maintain connectivity and topological features through loss
S: fraction of nodes in the largest connected component
[Figure: S (0 to 1) versus fraction of removed nodes f (0 to 1) under node failure, with f_c marked on the f axis]
Albert, Jeong, Barabási Nature 406, 378 (2000)
31
How About Targeted Attacks?
SF networks are sensitive to attacks on hubs
[Figure: S (0 to 1) versus fraction of removed nodes f (0 to 1) under attack, with f_c marked on the f axis]
Disease analysis; drug design…
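The contrast between random failures and attacks on hubs can be illustrated by computing S, the fraction of nodes in the largest connected component, after removing different nodes. The star graph and the function name are assumptions made for this sketch; S is measured here relative to the original number of nodes.

```python
from collections import deque


def largest_component_fraction(adj, removed=()):
    """S: share of the original nodes in the largest connected component
    that remains after deleting the nodes in `removed`."""
    removed = set(removed)
    seen, best = set(), 0
    for start in adj:
        if start in removed or start in seen:
            continue
        seen.add(start)
        queue, size = deque([start]), 0
        while queue:  # breadth-first search over surviving nodes
            node = queue.popleft()
            size += 1
            for nb in adj[node]:
                if nb not in removed and nb not in seen:
                    seen.add(nb)
                    queue.append(nb)
        best = max(best, size)
    return best / len(adj)


# Hypothetical hub-and-spoke graph: hub 0 linked to leaves 1..9.
star = {0: set(range(1, 10)), **{leaf: {0} for leaf in range(1, 10)}}
print(largest_component_fraction(star))               # 1.0
print(largest_component_fraction(star, removed=[5]))  # 0.9: one leaf fails
print(largest_component_fraction(star, removed=[0]))  # 0.1: the hub is attacked
```

Losing a random leaf barely changes S, while removing the single hub shatters the graph into isolated nodes.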
32
Robustness of The Yeast Protein Network
Node failure:
  Red: lethal
  Green: robust
  Yellow: unknown
H. Jeong et al., Nature, 2001
33
Robustness of The Yeast Protein Network
Highly connected proteins are more essential (lethal)...
34
Biological Networks
Evolution through duplication may explain γ<2
Network                        Approx. exponent γ
BIOLOGICAL
Yeast protein-protein net      1.5, 1.6, 1.7, 2.5
E. coli metabolic net          1.7, 2.2
Yeast gene expression net      1.4-1.7
Gene functional interactions   1.6
NON-BIOLOGICAL
Internet                       2.1 (in), 2.5 (out)
Citations                      3
Actors                         2.3
Power grid                     4
Phone calls                    2.1-2.3
Chung, Dewey, Lu, Galas, D.J., Journal of Computational Biology (2003)
35
Gene Duplication Networks
[Figure: log-log plot of P(k) versus k; k axis from 1 to 100, P(k) axis from 0.001 to 1]
Scale Free + Small world
Pastor-Satorras, Smith & R. V. Sole, “Evolving protein interaction networks through gene duplication”, Santa Fe Institute Working Paper 02-02-008, 2002
36
Is The Metabolic Network of E-Coli “Small”?
Masanori Arita. PNAS 101 (6): 1543-1547
Fell, D. A. & Wagner, A. (2000) Nat. Biotechnol. 18.
Wagner, A. & Fell, D. A. (2001) Proc. R. Soc. London Ser. B 268.
Ma, H.-W. & Zeng, A.-P. (2003) Bioinformatics 19, 270–277.
37
Is The Metabolic Network “Small”?
Filled bars: the direction of reactions is considered, AL = 8.4
Open bars: all reactions are considered reversible, AL = 8.0
L ≈ 8, much larger than that of a random graph
Masanori Arita. PNAS 101 (6): 1543-1547
Considered a more detailed structural model
Focus on carbon metabolism
38
Yeast Regulatory Network
Luscombe NM, Babu MM, Yu H, Snyder M, Teichmann SA & Gerstein M (2004) Genomic analysis of regulatory network dynamics reveals large topological changes. Nature 431: 308-312.
39
Comprehensive Dataset Available
Very complex network
  3420 genes, 142 TFs
  7074 regulatory interactions
Simplify using graph-theoretic statistics:
  Global topological measures
  Local network motifs
[Figure labels: Transcription Factors; Target Genes]
40
Global Topology Measures
Connectivity:
  Ingress degree: 2.1 (each gene is regulated by ~2 TFs)
  Egress degree: 49.8 (each TF targets ~50 genes)
Degree distribution: power law (scale free)
Clustering coefficient: 0.11 (low local density)
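The two connectivity figures are consistent with the dataset counts quoted for this network (3420 genes, 142 TFs, 7074 regulatory interactions), assuming each average is simply the number of interactions divided by the number of genes or TFs. That reading is an assumption of this sketch, not something the slides state.

```python
genes, tfs, interactions = 3420, 142, 7074

# Every interaction is one incoming edge at a gene and one outgoing edge at a TF.
mean_in = interactions / genes   # average number of TFs regulating a gene
mean_out = interactions / tfs    # average number of genes targeted by a TF

print(round(mean_in, 1), round(mean_out, 1))  # 2.1 49.8
```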
Clustering coefficient example: a node with 4 neighbours has 6 possible links among them; 1 exists, so C = 1/6 = 0.17
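The worked example (4 neighbours, 1 existing link out of 6 possible) can be reproduced with a small function. The graph below is a hypothetical one built to match those counts, and returning 0 for nodes with fewer than two neighbours is a convention assumed here.

```python
from itertools import combinations


def clustering_coefficient(adj, node):
    """Fraction of possible links among a node's neighbours that exist."""
    neighbours = adj[node]
    possible = len(neighbours) * (len(neighbours) - 1) // 2
    if possible == 0:
        return 0.0  # convention assumed here for nodes with < 2 neighbours
    existing = sum(1 for u, v in combinations(neighbours, 2) if v in adj[u])
    return existing / possible


# Hypothetical graph matching the example: node "x" has 4 neighbours,
# and only one pair of them (a, b) is linked.
adj = {
    "x": {"a", "b", "c", "d"},
    "a": {"x", "b"},
    "b": {"x", "a"},
    "c": {"x"},
    "d": {"x"},
}
print(round(clustering_coefficient(adj, "x"), 2))  # 0.17
```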
41
Partition Network Into Activity Subnets
Active subnet computations:
  Start with active genes
  Compute TFs that influence them
  Compute closure of TFs that influence current graph
Cellular activity   No. genes
Cell cycle          437
Sporulation         876    (cell transforms into spores)
Diauxic shift       1,876  (switching from glucose to lactose)
DNA damage          1,715
Stress response     1,385
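The active subnet computation on this slide (start with active genes, add the TFs that influence them, then keep adding TFs that influence the current graph) is a closure over the regulator relation. The sketch below is one way to express it; the data layout, the function name and the toy network are assumptions made for illustration.

```python
def active_subnet(regulators, active_genes):
    """Return the nodes and edges of the sub-network behind a set of active genes.

    regulators maps each gene (or TF) to the set of TFs that regulate it.
    """
    nodes = set(active_genes)
    frontier = set(active_genes)
    edges = set()
    while frontier:
        next_frontier = set()
        for target in frontier:
            for tf in regulators.get(target, ()):
                edges.add((tf, target))
                if tf not in nodes:  # newly reached TF: expand it next round
                    nodes.add(tf)
                    next_frontier.add(tf)
        frontier = next_frontier
    return nodes, edges


# Hypothetical toy network: tf2 -> tf1, tf1 -> g1, tf1 -> g2, tf3 -> g3.
regulators = {"g1": {"tf1"}, "g2": {"tf1"}, "tf1": {"tf2"}, "g3": {"tf3"}}
nodes, edges = active_subnet(regulators, {"g1"})
print(sorted(nodes))  # ['g1', 'tf1', 'tf2']
print(sorted(edges))  # [('tf1', 'g1'), ('tf2', 'tf1')]
```

The loop stops when a round adds no new TF, so cycles among TFs do not cause repeated work.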
42
Activity Subnets
Cell cycle; Sporulation; Diauxic shift; DNA damage; Stress
Binary state
Multi-stage activities
43
Do Global Topology Features Vary By Activity?
Literature: network topologies are perceived to be invariant
  Scale-free, small-world, and clustered
  Across different biological networks and genomes
Random expectation: sample sub-networks of different sizes from the complete network and calculate topological measures
[Figure panels: path length; clustering coefficient; outgoing degree; incoming degree; x axis: random network size]
44
Outgoing degree
“Binary conditions”: greater connectivity
“Multi-stage conditions”: lower connectivity
Binary: quick, large-scale turnover of genes
Multi-stage: controlled, ticking over of genes at different stages
45
Incoming degree
“Binary conditions”: smaller connectivity, less complex TF combinations
“Multi-stage conditions”: larger connectivity, more complex TF combinations
[Figure labels: Binary; Multi-stage]
46
Path length
“Binary conditions”: shorter path length; “faster”, direct action
“Multi-stage conditions”: longer path length; “slower”, indirect action
[Figure labels: Binary; Multi-stage]
47
Clustering coefficient
“Binary conditions”: smaller coefficients, less TF-TF inter-regulation
“Multi-stage conditions”: larger coefficients, more TF-TF inter-regulation
[Figure labels: Binary; Multi-stage]
48
Do Local Topology Features Vary By Activity?
Literature: motif usage is well conserved for regulatory networks across different organisms [Alon]
Random expectation: sample sub-nets for motif occurrence
[Figure panels: single input motif; multiple input motif; feed-forward loop; x axis: random network size]
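Counting motif occurrences in a sub-net can be illustrated with the feed-forward loop, taken here to mean three nodes with edges X->Y, Y->Z and X->Z. This is a minimal sketch; the data layout, the function name and the toy network are assumptions, and it is not the counting procedure used in the cited study.

```python
def count_feed_forward_loops(targets):
    """Count triples (x, y, z) with edges x->y, y->z and x->z.

    targets maps each regulator to the set of nodes it regulates.
    """
    count = 0
    for x, x_targets in targets.items():
        for y in x_targets:
            if y == x:
                continue  # ignore self-regulation
            for z in targets.get(y, ()):
                if z != x and z != y and z in x_targets:
                    count += 1
    return count


# Hypothetical toy network with exactly one feed-forward loop: X->Y, Y->Z, X->Z.
toy = {"X": {"Y", "Z", "W"}, "Y": {"Z"}}
print(count_feed_forward_loops(toy))  # 1
```

Running the same count on randomly sampled sub-nets of matching size would give the random expectation against which an observed count is compared.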
49
Final Notes
50
Challenges & Opportunities
Improved understanding of network evolution
  Evolutionary models, selection…
Modularity
Network-sequence relationships
Network-structure (folding) relationships
Applications: drug design…