ppi network alignment
![Page 1: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/1.jpg)
PPI Network Alignment
陳琨、朱安強、林晏禕、翁翊鐘、陳縕儂、呂哲安、楊孟翰
![Page 2: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/2.jpg)
PROTEIN-PROTEIN INTERACTION NETWORK ALIGNMENT
![Page 3: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/3.jpg)
Protein Biosynthesis
![Page 4: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/4.jpg)
From DNA to life
![Page 5: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/5.jpg)
Biology Technology
How do we measure protein interaction?
◦ Two-hybrid screens
◦ Co-immunoprecipitation
![Page 6: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/6.jpg)
Two-hybrid screens
A. Regular transcription of the reporter gene
[Figure labels: UAS; reporter gene (LacZ)]
![Page 7: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/7.jpg)
Two-hybrid screens
B. One fusion protein only (Gal4-BD + Bait) – no transcription
[Figure labels: UAS; reporter gene (LacZ); no transcription]
![Page 8: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/8.jpg)
Two-hybrid screens
C. One fusion protein only (Gal4-AD + Prey) – no transcription
[Figure labels: UAS; reporter gene (LacZ); no transcription]
![Page 9: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/9.jpg)
Two-hybrid screens
D. Two fusion proteins with interacting Bait and Prey
[Figure labels: UAS; reporter gene (LacZ)]
![Page 10: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/10.jpg)
Co-immunoprecipitation
[Figure labels: known viral protein; Protein A; antibody; unknown protein; X; Y]
![Page 11: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/11.jpg)
Protein-Protein Interaction Networks?
Proteins are nodes
Interactions are edges
[Figure: yeast PPI network]
![Page 12: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/12.jpg)
Network comparisons
Query for a module
Predict functions of a module
Predict protein functions
Validate protein interactions
Predict protein interactions
![Page 13: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/13.jpg)
Random network
Connect each pair of nodes with probability p
The expected number of edges is pN(N-1)/2
Node degrees follow a Poisson distribution
◦ Nodes with high degree are rare
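The expected edge count can be checked with a small simulation. This is a minimal sketch, assuming the model described on this slide (every pair of N nodes is linked independently with probability p); the function names `expected_edges`, `random_network`, and `degrees` are illustrative and do not come from the slides.

```python
import itertools
import random


def expected_edges(n, p):
    """Expected number of edges when each of the n(n-1)/2 pairs is linked with probability p."""
    return p * n * (n - 1) / 2


def random_network(n, p, seed=None):
    """Return the edge list of one random network on nodes 0..n-1."""
    rng = random.Random(seed)
    return [pair for pair in itertools.combinations(range(n), 2) if rng.random() < p]


def degrees(n, edges):
    """Count how many edges touch each node."""
    deg = [0] * n
    for u, v in edges:
        deg[u] += 1
        deg[v] += 1
    return deg
```

Averaging `len(random_network(n, p))` over many seeds should approach `expected_edges(n, p)`, and a histogram of `degrees(...)` for large n and small p should show very few high-degree nodes.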
![Page 14: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/14.jpg)
Scale-free network
Power-law degree distribution
Hubs and nodes
When a node is added to the network, it prefers to link to hubs
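The "prefers to link to hubs" rule can be illustrated with a toy growth process. This is only a sketch, assuming each new node attaches with a single edge to an existing node chosen with probability proportional to its current degree; the slide does not specify the growth model in this detail, and `preferential_attachment` is an illustrative name.

```python
import random


def preferential_attachment(n, seed=None):
    """Grow a network node by node (n >= 2); each new node links to one existing
    node chosen with probability proportional to that node's current degree."""
    rng = random.Random(seed)
    edges = [(0, 1)]        # start from a single edge
    endpoints = [0, 1]      # each node appears once per incident edge
    for new in range(2, n):
        target = rng.choice(endpoints)   # degree-proportional choice
        edges.append((new, target))
        endpoints.extend([new, target])
    return edges
```

Because a node appears in `endpoints` once per incident edge, well-connected nodes are picked more often, which is what tends to produce a few hubs and many low-degree nodes.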
![Page 15: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/15.jpg)
The Network Alignment Problem
Given k different protein interaction networks belonging to different species, we wish to find conserved sub-networks within these networks
Conserved in terms of protein sequence similarity (node similarity) and interaction similarity (network topology similarity)
![Page 16: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/16.jpg)
General Framework For Network Alignment Algorithms
![Page 17: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/17.jpg)
PATHBLAST
Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Brian P. Kelley, Roded Sharan, Richard M. Karp, Taylor Sittler, David E. Root, Brent R. Stockwell, and Trey Ideker (2003)
![Page 18: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/18.jpg)
Protein Similarity
Homologous proteins: two proteins that have common ancestry.
Orthologous proteins: two proteins from different species that diverged after a speciation event.
Paralogous proteins: two proteins from the same species that diverged after a duplication event.
Source: Roded Sharan, Protein-protein Interaction: Network Alignment Lecture Note
![Page 19: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/19.jpg)
PathBLAST
PathBLAST is a strategy for aligning two protein interaction networks to elucidate their conserved pathways.
This method identifies pairs of interaction paths, drawn from the networks of different species or from different processes within a species, where proteins at equivalent path positions share strong sequence homology.
Source: Conserved pathways within bacteria and yeast as revealed by global protein network alignment. PNAS, 2003.
![Page 20: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/20.jpg)
Alignment Graph
Vertical solid line: protein-protein interactions.
Horizontal dotted line: significant sequence similarity.
Node: a homologous protein pair.
Link: protein interaction relations of three types: direct, gap, and mismatch.
Source: Conserved pathways within bacteria and yeast as revealed by global protein network alignment. PNAS, 2003.
![Page 21: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/21.jpg)
Yeast & Bacteria PPI Alignment Graph
The yeast and bacteria global alignment graph vs. randomized networks obtained by permuting the protein names.
This suggests that both species share conserved interaction pathways.
“Direct interactions” are rare. “Mismatches” and “gaps” were permitted, which helps overcome false negatives.
Source: Roded Sharan, Protein-protein Interaction: Network Alignment Lecture Note
![Page 22: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/22.jpg)
Scoring Function
p(v) is the probability of true homology within the protein pair represented by v.
q(e) is the probability that the protein-protein interactions represented by e are real.
The background probabilities are the expected values of p(v) and q(e) over the global alignment graph.
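One way to combine these quantities is a log-odds sum over the vertices and edges of a path. The sketch below assumes that form (base-10 logarithms, one ratio per vertex and one per edge); the slide's own formula is not reproduced in this transcript, so treat the exact shape as an assumption. The names `path_score`, `p_background`, and `q_background` are illustrative.

```python
import math


def path_score(p_values, q_values, p_background, q_background):
    """Log-odds score of a path: each vertex contributes log10(p(v) / p_background)
    and each edge contributes log10(q(e) / q_background)."""
    vertex_part = sum(math.log10(p / p_background) for p in p_values)
    edge_part = sum(math.log10(q / q_background) for q in q_values)
    return vertex_part + edge_part
```

Under this reading, a path scores above zero only when its homology and interaction probabilities are, on balance, higher than what is expected across the whole alignment graph.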
![Page 23: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/23.jpg)
Pathways & Protein Complexes
PathBLAST is used to find conserved paths, and then overlapping paths are merged into complexes.
Source: Roded Sharan, Protein-protein Interaction: Network Alignment Lecture Note
![Page 24: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/24.jpg)
Yeast vs. Bacteria
Orthologous Pathways
Select the 150 highest-scoring pathways of length four from the alignment graph.
After combining overlapping pathways, these were found to fall into 5 network regions.
The figure on the right involves the union of 6 paths with similar function.
Solid link: direct interactions; dotted link: gaps or mismatches.
Source: Conserved pathways within bacteria and yeast as revealed by global protein network alignment. PNAS, 2003.
![Page 25: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/25.jpg)
Yeast vs. Yeast
Paralogous Pathways
Proteins were not allowed to pair with themselves or their neighbors.
Analyzed the 150 highest-scoring pathway alignments of length 4 from the alignment graph.
The alignments are distinct but homologous in function.
Source: Conserved pathways within bacteria and yeast as revealed by global protein network alignment. PNAS, 2003.
![Page 26: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/26.jpg)
Pathway Queries
PATHBLAST identified two other well-known MAPK pathways as the highest-scoring hits, indicating that the algorithm was sufficiently sensitive and specific to identify known paralogous pathways.
Source: Conserved pathways within bacteria and yeast as revealed by global protein network alignment. PNAS, 2003.
![Page 27: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/27.jpg)
Identification of Protein Complexes
Roded Sharan, Trey Ideker, Brian P. Kelley, Ron Shamir, Richard M. Karp:
Identification of Protein Complexes by Comparative Analysis of Yeast and Bacterial Protein Interaction Data.
Journal of Computational Biology 12(6): 835-846 (2005)
![Page 28: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/28.jpg)
State-of-The-Art
![Page 29: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/29.jpg)
Flashback
[Input] The alignment graph of 2 PPI networks.
We can already handle the problem of finding conserved linear pathways.
This is not the end: how can we go a step further?
![Page 30: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/30.jpg)
Motivation
Finding more complex conserved structures is of practical interest.
![Page 31: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/31.jpg)
Motivation
Finding more complex conserved structures is of practical interest.
[Reduction] Now we can merge overlapping paths into complexes.
![Page 32: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/32.jpg)
Motivation
Finding more complex conserved structures is of practical interest.
[Reduction] Now we can merge overlapping paths into complexes.
Or we can develop another model to identify conserved complexes.
![Page 33: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/33.jpg)
A New Model: The Main Idea
How do you recognize protein complexes?
◦ Dense Subgraphs
◦ Comparative Analysis
"When I use a word," Humpty Dumpty said in rather a scornful tone, "it means just what I choose it to mean -- neither more nor less."
Lewis Carroll, Through the Looking-Glass
![Page 34: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/34.jpg)
Dense Subgraph: Likelihood
Likelihood Formula 0.1: given an induced subgraph,
◦ L(C) = |Ec| / { ½ * |Vc| * ( |Vc| - 1 ) }
It makes sense: graphs with more edges have higher likelihood.
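Formula 0.1 is simply the fraction of possible vertex pairs inside C that are actually connected. A minimal sketch, assuming the network is given as an undirected edge list and C has at least two vertices; `induced_density` is an illustrative name.

```python
def induced_density(nodes, edges):
    """Formula 0.1: |Ec| divided by the |Vc|(|Vc|-1)/2 possible edges of the induced subgraph."""
    nodes = set(nodes)
    # Ec: edges of the network with both endpoints inside the chosen node set
    inside = sum(1 for u, v in edges if u in nodes and v in nodes)
    possible = len(nodes) * (len(nodes) - 1) / 2
    return inside / possible
```

For example, a triangle has density 1, and adding a fourth vertex attached by a single edge lowers the value to 4/6.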
![Page 35: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/35.jpg)
Dense Subgraph: Likelihood (Cont.)
Likelihood Formula 0.1: given an induced subgraph,
◦ L(C) = |Ec| / { ½ * |Vc| * ( |Vc| - 1 ) }
It makes sense: graphs with more edges have higher likelihood.
We only consider the structure of graphs.
Problems of link analysis are often data-dependent.
![Page 36: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/36.jpg)
Dense Subgraph: Likelihood (Cont.)
Likelihood Formula 0.1: given an induced subgraph,
◦ L(C) = |Ec| / { ½ * |Vc| * ( |Vc| - 1 ) }
Likelihood Formula 0.2: given an induced subgraph,
What the hell is it?
![Page 37: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/37.jpg)
Dense Subgraph: Likelihood (Cont.)
What do you expect about the behavior of the revised formulas?
Higher likelihood: the scores of dense graphs are higher.
Adjustment: the weakest link
◦ Bonus: an interaction with low probability happens.
![Page 38: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/38.jpg)
Dense Subgraph: Likelihood (Cont.)
Higher likelihood: the scores of dense graphs are higher.
We assume that every two proteins in a complex interact with some probability p (0.8 is used in this work).
We can use the model as a baseline for comparing density.
![Page 39: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/39.jpg)
Dense Subgraph: Likelihood (Cont.)
Adjustment: the weakest link!
p(u,v) is defined to be the fraction of graphs in FG that include this edge.
◦ FG: the family of graphs with vertex set V and the same degree sequence.
Edges incident on vertices with higher degrees have higher probability.
![Page 40: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/40.jpg)
Dense Subgraph: Likelihood (Cont.)
Likelihood Formula 0.2: given an induced subgraph,
What the hell is it?
◦ For p(u,v) = 0.2, we have 4 and ¼ on the two sides.
◦ For p(u,v) = 0.6, we have 4/3 and 1/2 on the two sides.
◦ It makes sense! We emphasize the weakest link.
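The numbers on this slide are consistent with a per-pair likelihood ratio in which a present edge contributes 0.8 / p(u,v) and an absent edge contributes (1 - 0.8) / (1 - p(u,v)), where 0.8 is the in-complex interaction probability quoted on the earlier slide. That reading is an assumption, since Formula 0.2 itself is not reproduced in this transcript; `BETA` and `pair_factors` are illustrative names.

```python
from fractions import Fraction

BETA = Fraction(4, 5)  # assumed in-complex interaction probability (0.8 on the earlier slide)


def pair_factors(p_uv, beta=BETA):
    """Return (factor if the edge u-v is present, factor if it is absent)."""
    p_uv = Fraction(p_uv)
    return beta / p_uv, (1 - beta) / (1 - p_uv)
```

With exact fractions, `pair_factors("1/5")` gives 4 and 1/4, and `pair_factors("3/5")` gives 4/3 and 1/2, matching the slide: the less likely an edge is under the background model, the larger the bonus for observing it.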
![Page 41: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/41.jpg)
Dense Subgraph: Likelihood (Cont.)
Likelihood Formula 0.1: given an induced subgraph,
◦ L(C) = |Ec| / { ½ * |Vc| * ( |Vc| - 1 ) }
Likelihood Formula 0.2: given an induced subgraph,
Likelihood Formula 0.3: given an induced subgraph,
![Page 42: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/42.jpg)
The Main Idea Revisited
How do you recognize protein complexes?
◦ Dense Subgraphs
We have some revised formulas for density in a PPI network.
◦ Comparative Analysis
![Page 43: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/43.jpg)
Comparative Analysis
Idea: If some structure occurs in different species, it is likely to be a meaningful structure.
How do you define dense substructures on alignment graphs?
![Page 44: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/44.jpg)
Comparative Analysis (Cont.)
Consider two subsets U1 = {u1,..., uk} and V2 = {v1,..., vk}, where Θ: U1 → V2 is a many-to-many correspondence.
Since you already have
You may derive the formula 1.1 as follows:
Does it make sense?
![Page 45: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/45.jpg)
Comparative Analysis (Cont.)
Θ is useful information:
You have the formula 1.2:
{ A/(A+B) }/ {X/(X+Y)}
![Page 46: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/46.jpg)
The Main Idea Revisited
How do you recognize protein complexes?
◦ Dense Subgraphs
We have some revised formulas for density in a PPI network.
◦ Comparative Analysis
We have some revised formulas for density in an alignment network.
![Page 47: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/47.jpg)
Search the Complexes
Now we only need to find heavy subgraphs in the alignment graph.
The problem is NP-hard.
![Page 48: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/48.jpg)
Search the Complexes (Cont.)
[Seed] Compute a seed around each node v.
[Refined Seed] Enumerate all subsets of the seed that have size 3 and contain v.
[Local Search] Iteratively modify the refined seed.
[Output Heavy Subgraphs] For each node, we record at most k heaviest subgraphs.
![Page 49: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/49.jpg)
Search the Complexes (Cont.)
[Seed] Compute a seed around each node v.
[Restrict the Size] Keep seeds small!
[Refined Seed] Enumerate all subsets of the seed that have size 3 and contain v.
[Local Search] Iteratively modify the refined seed.
[Output Heavy Subgraphs] For each node, we record at most k heaviest subgraphs.
[Filtering overlapping ones] A greedy method is used!
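The refined-seed and local-search steps can be sketched as follows. This is a simplified illustration, not the authors' implementation: it assumes the weight of a subgraph is the sum of pairwise weights between its nodes, it takes the seed as given, and it omits the final overlap filtering. The names `subgraph_weight`, `local_search`, `heavy_subgraphs`, `max_size`, and `top_k` are hypothetical.

```python
import itertools


def subgraph_weight(nodes, weight):
    """Sum of pairwise weights inside the node set (missing pairs count as 0)."""
    return sum(weight.get(frozenset(pair), 0.0)
               for pair in itertools.combinations(sorted(nodes), 2))


def local_search(start, all_nodes, weight, max_size=15):
    """Repeatedly apply the single add/remove move that most improves the weight."""
    current = set(start)
    best = subgraph_weight(current, weight)
    improved = True
    while improved:
        improved = False
        candidates = []
        for n in all_nodes:
            if n in current and len(current) > 3:
                candidates.append(current - {n})      # try removing n
            elif n not in current and len(current) < max_size:
                candidates.append(current | {n})      # try adding n
        for cand in candidates:
            w = subgraph_weight(cand, weight)
            if w > best:
                best, best_set, improved = w, cand, True
        if improved:
            current = best_set
    return current, best


def heavy_subgraphs(seed, v, all_nodes, weight, top_k=3):
    """Refine a seed: start from every size-3 subset containing v, run the
    local search from each, and keep the top_k heaviest results."""
    others = [n for n in seed if n != v]
    results = [local_search({v, *pair}, all_nodes, weight)
               for pair in itertools.combinations(others, 2)]
    return sorted(results, key=lambda r: r[1], reverse=True)[:top_k]
```

Keeping seeds small matters because the number of size-3 starting subsets grows quadratically with the seed size, and each one triggers its own local search.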
![Page 50: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/50.jpg)
The Main Idea Revisited
How do you recognize protein complexes?
◦ Dense Subgraphs
We have some revised formulas for density in a PPI network.
◦ Comparative Analysis
We have some revised formulas for density in an alignment network.
Finally, we have a practical method to search for complexes!
![Page 51: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/51.jpg)
PATH QUERIES
![Page 52: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/52.jpg)
Path Queries
Problem definition
Input
◦ A target network represented as an undirected weighted graph G(V, E), with a weight function on the edges w: E→R
◦ A path query Q=(q1,…,qk)
Scoring function of node similarity H: Q×V→R
![Page 53: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/53.jpg)
Output: a set of best matching pathways P=(p1,…,pl) in G, where a good match is measured in two respects:
1. The matched nodes are similar by scoring function H.
2. The reliability of edges in the matched pathway is high.
![Page 54: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/54.jpg)
![Page 55: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/55.jpg)
Algorithm
1. Introduce a mapping M from Q to P∪{0}, where deleted query nodes are mapped to 0 by M.
2. Path scoring: the sum of an interaction score and a sequence score

$$\sum_{i=1}^{l-1} w(p_i, p_{i+1}) \;+\; \sum_{\substack{i=1 \\ M(q_i) \neq 0}}^{k} H(q_i, M(q_i))$$
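The path score can be computed directly from a matched path and the mapping M. A minimal sketch, assuming edge weights and similarity scores are stored in dictionaries and that a deleted query node is represented by `None` (the slide writes this as M(q_i) = 0); `alignment_score` and its argument layout are illustrative.

```python
def alignment_score(path, mapping, w, H):
    """Score of matching a query to `path`.

    path    -- matched vertices (p_1, ..., p_l) in the target network
    mapping -- list where mapping[i] is the vertex matched to query node i,
               or None if that query node is deleted
    w       -- dict: frozenset({u, v}) -> edge weight
    H       -- dict: (query index, vertex) -> similarity score
    """
    # Interaction score: weights of consecutive edges along the matched path
    interaction = sum(w[frozenset((path[i], path[i + 1]))] for i in range(len(path) - 1))
    # Sequence score: similarity of every query node that was not deleted
    sequence = sum(H[(i, v)] for i, v in enumerate(mapping) if v is not None)
    return interaction + sequence
```

Deleted query nodes contribute nothing to the sequence term, so allowing deletions trades a lost similarity score against the chance of following more reliable edges.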
![Page 56: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/56.jpg)
Interaction score
◦ Edge weights represent the logarithm of the reliability of the interaction between two proteins.
Sequence score
◦ BLAST E-value for the two proteins, normalized by the maximal E-value over all pairs of proteins from the two networks.
![Page 57: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/57.jpg)
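Algorithm
Avoiding cycles
◦ N. Alon, R. Yuster, and U. Zwick: Color-coding. J. ACM, 1995.
Finding the best matching paths:

$$W(i, j, S, \theta_{del}) = \max_{m \in V} \begin{cases} W(i-1, m, S - c(j), \theta_{del}) + w(m, j) + H(q_i, j), & (m, j) \in E \\ W(i, m, S - c(j), \theta_{del}) + w(m, j), & (m, j) \in E \\ W(i-1, m, S, \theta_{del} - 1), & \theta_{del} \le N_{del} \end{cases}$$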
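The role of color coding can be seen in a stripped-down setting. The sketch below ignores the query, insertions, and deletions, and only finds a maximum-weight simple path on k vertices: vertices get random colors from k colors, a dynamic program over (end vertex, set of used colors) finds the best path with all colors distinct, and the random coloring is repeated. It is an illustration of the color-coding idea of Alon, Yuster, and Zwick, not the recurrence on this slide; all function and parameter names are hypothetical.

```python
import random


def best_colorful_path(edges, k, coloring):
    """Maximum-weight path on k vertices whose colors are all distinct.

    edges    -- dict: (u, v) -> weight, undirected (one orientation suffices)
    coloring -- dict: vertex -> color in range(k)
    Returns (weight, path) or None if no colorful path exists.
    """
    adj = {}
    for (u, v), wt in edges.items():
        adj.setdefault(u, []).append((v, wt))
        adj.setdefault(v, []).append((u, wt))
    # table[(end vertex, frozenset of colors used)] = (weight, path)
    table = {(v, frozenset([coloring[v]])): (0.0, [v]) for v in adj}
    for _ in range(k - 1):
        nxt = {}
        for (u, used), (wt, path) in table.items():
            for v, e_wt in adj[u]:
                if coloring[v] in used:
                    continue  # color already used, so v might be a repeated vertex
                key = (v, used | {coloring[v]})
                cand = (wt + e_wt, path + [v])
                if key not in nxt or cand[0] > nxt[key][0]:
                    nxt[key] = cand
        table = nxt
    return max(table.values(), key=lambda t: t[0]) if table else None


def color_coding(edges, k, trials=200, seed=0):
    """Repeat random colorings and keep the best colorful path found."""
    rng = random.Random(seed)
    vertices = {x for e in edges for x in e}
    best = None
    for _ in range(trials):
        coloring = {v: rng.randrange(k) for v in vertices}
        found = best_colorful_path(edges, k, coloring)
        if found and (best is None or found[0] > best[0]):
            best = found
    return best
```

Tracking only the set of used colors, rather than the set of visited vertices, is what keeps the table small: a path with distinct colors cannot revisit a vertex, and the optimal path is colorful in any single trial with probability k!/k^k, so repeated trials find it with high probability.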
![Page 58: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/58.jpg)
Dataset and Results
Yeast and fly PPI networks
◦ The yeast (S. cerevisiae) PPI network contains 4,726 proteins and 15,166 known interactions between them.
◦ The fly (D. melanogaster) PPI network contains 7,028 proteins and 22,837 interactions.
271 pathways were discovered in the yeast PPI network that scored better than 99% of randomly chosen pathways; these were then used as queries for the fly PPI network.
![Page 59: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/59.jpg)
Results
![Page 60: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/60.jpg)
APPLICATION OF PPI NETWORK ALIGNMENT: ORTHOLOGY MAPPING
S. Bandyopadhyay, R. Sharan, and T. Ideker. Systematic identification of functional orthologs based on protein network comparison. Genome Research, 16(3):428–435, 2006
![Page 61: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/61.jpg)
Introduction
Annotating protein function across species is often complicated by the presence of paralogous proteins
Most methods for dealing with this problem are sequence-based: protein sequences from different species are compared to find a group of proteins that have the same functional annotation
A protein and its functional ortholog are likely to interact with proteins in their respective networks that are themselves functional orthologs
This work introduces a strategy for identifying functionally related proteins that supplements sequence-based comparisons with information on conserved protein-protein interactions
![Page 62: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/62.jpg)
Introduction (cont’d)
[Figure labels: proteins a and b; proteins a’ and b’]
![Page 63: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/63.jpg)
Functional orthology
When the protein in question is similar to not one but many paralogous proteins, it is harder to distinguish which of these is the true ortholog, the protein that is directly inherited from a common ancestor
Definite functional orthologs are defined as proteins that are functionally equivalent as a result of direct ancestry
![Page 64: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/64.jpg)
Model review
The protein interaction networks of two species are aligned by assigning proteins to sequence homology groups using the Inparanoid algorithm
Networks are aligned into a merged graph representation
Probabilistic inference is performed on the aligned networks to identify pairs of proteins, one from each species, that are likely to retain the same function based on conservation of their interacting partners
A logistic function is used to compute the probability of functional orthology for a protein pair i given the states of functional orthology for its network neighbors
The previous probability is updated for each pair over successive iterations of Gibbs sampling
![Page 65: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/65.jpg)
Model review (cont’d)
![Page 66: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/66.jpg)
Conservation index
Consider an alignment graph G
◦ Nodes represent sequence-similar protein pairs
◦ Edges link nodes (a, b) and (a’, b’) if one of the pairs (a, a’) or (b, b’) interacts directly and the other interacts through a neighbor that is directly connected to both of its proteins
◦An edge is strongly conserved if its endpoints are true functional orthologs
![Page 67: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/67.jpg)
Conservation index (cont’d)

$$c(i) = \frac{2\,d(i)}{d(a) + d(b)}$$

c(i): the conservation index of a node i
d(i): the number of strongly conserved links involving node i
d(a): the degree of protein a in its network
d(b): the degree of protein b in its network
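Reading the definitions on this slide as c(i) = 2 d(i) / (d(a) + d(b)), the index is a normalized count of conserved links. A minimal sketch with illustrative argument names:

```python
def conservation_index(d_i, d_a, d_b):
    """c(i) = 2 d(i) / (d(a) + d(b)).

    d_i -- number of strongly conserved links involving node i = (a, b)
    d_a -- degree of protein a in its own network
    d_b -- degree of protein b in its own network
    """
    return 2 * d_i / (d_a + d_b)
```

The value reaches 1 when every interaction of both proteins is strongly conserved (d(i) = d(a) = d(b)) and falls toward 0 as fewer of their interactions are conserved.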
![Page 68: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/68.jpg)
Probabilistic model
The probability of functional orthology for a pair of proteins is influenced by the probabilities of functional orthology for their network neighbors, which in turn depend on their network neighbors, and so on
This type of probabilistic model is known as a Markov random field
![Page 69: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/69.jpg)
Probabilistic model (cont’d)
Positive training examples: the definite functional orthologs having at least one conserved interaction
Negative training examples: protein pairs in which a protein is paired with its best BLAST E-value match, where the two are not in the same cluster according to the Inparanoid algorithm

$$P(z_i \mid Z_{N(i)}) = \frac{1}{1 + \exp\{-\alpha - \beta\, c(i)\}}$$

z_i: the state of node i
N(i): the set of neighbors of node i
Z_N(i): the set of all z_j such that j ∈ N(i)
Parameters α and β are optimized by maximizing the product of P(z_i | Z_N(i)) over all positive training examples and (1 − P(z_i | Z_N(i))) over all negative training examples
![Page 70: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/70.jpg)
Orthology inference
The above model was used to estimate the final posterior probabilities P(zi) using Gibbs sampling
Nodes representing ambiguous functional orthologs are each assigned a temporary state z=0 or z=1, initially at random
At each iteration, a node i is sampled (with replacement) and its value zi is updated given the states of its neighbors, ZN(i). The new value of zi is set to 0 or 1 with probability P(zi|ZN(i))
Over all iterations, the nodes designated as definite functional orthologs and non-orthologs are forced to states of 1 and 0, respectively
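The sampling loop described on this slide can be sketched as follows. Several details are assumptions rather than facts from the slides: the logistic link is written as 1 / (1 + exp(-(alpha + beta * c(i)))), d(i) is taken to be the number of neighbors currently in state 1, and the posterior is estimated as the fraction of a node's updates that ended in state 1, with no burn-in. The names `p_ortholog`, `gibbs`, `degree_sum`, and `fixed` are illustrative.

```python
import math
import random


def p_ortholog(c_i, alpha, beta):
    """Logistic link: probability that z_i = 1 given the conservation index c(i)."""
    return 1.0 / (1.0 + math.exp(-(alpha + beta * c_i)))


def gibbs(neighbors, degree_sum, fixed, alpha, beta, iterations=10000, seed=0):
    """Estimate P(z_i = 1) for each ambiguous node of the alignment graph.

    neighbors  -- dict: node -> list of neighboring nodes
    degree_sum -- dict: node -> d(a) + d(b) for the protein pair (a, b)
    fixed      -- dict: node -> 1 (definite ortholog) or 0 (non-ortholog)
    """
    rng = random.Random(seed)
    free = [n for n in neighbors if n not in fixed]
    if not free:
        return {}
    state = dict(fixed)                                   # fixed nodes never change
    state.update({n: rng.randint(0, 1) for n in free})    # random initial states
    ones = {n: 0 for n in free}
    visits = {n: 0 for n in free}
    for _ in range(iterations):
        i = rng.choice(free)                              # sample with replacement
        d_i = sum(state[j] for j in neighbors[i])         # conserved links at i
        c_i = 2.0 * d_i / degree_sum[i]
        state[i] = 1 if rng.random() < p_ortholog(c_i, alpha, beta) else 0
        ones[i] += state[i]
        visits[i] += 1
    return {n: ones[n] / visits[n] for n in free if visits[n]}
```

Because each update depends only on the current states of a node's neighbors, information from the definite orthologs spreads through the graph over successive iterations.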
![Page 71: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/71.jpg)
Experimental results
A total of 2244 clusters were generated by the Inparanoid algorithm, covering 2834 proteins in yeast and 3881 proteins in fly
Of these, 1552 clusters contained only a single yeast and fly protein pair and were assumed to represent definite functional orthologs
They applied the above method to resolve the remaining 692 clusters, which were assumed to represent ambiguous functional orthologs, and found that 121 contained at least one protein pair with conserved interactions between networks
In 60 of these, the highest probability was assigned to the protein pair that was also the most sequence-similar via BLAST
![Page 72: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/72.jpg)
Experimental results (cont’d)
![Page 73: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/73.jpg)
Conclusion
These findings confirm that yeast/fly proteins classified as definite functional orthologs are more likely to have equivalent functional roles in the protein network
The conserved network context could be used to help discriminate functional orthology from general sequence similarity
![Page 74: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/74.jpg)
MULTIPLE NETWORK ALIGNMENT
R. Sharan, S. Suthram, R.M. Kelley, T. Kuhn, S. McCuine, P. Uetz, T. Sittler, R.M. Karp, and T. Ideker. Conserved patterns of protein interaction in multiple species. PNAS, 102(6):1974–1979, 2005
![Page 75: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/75.jpg)
The alignment graph
Each node in this graph consists of a group of sequence-similar proteins, one from each species
Each link between a pair of nodes in the alignment graph represents conserved protein interactions between the corresponding protein groups
A search over the alignment graph is performed to identify:
1. Short linear paths of interacting proteins, which model signal transduction pathways
2. Dense clusters of interactions, which model protein complexes
![Page 76: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/76.jpg)
The alignment graph (cont’d)
![Page 77: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/77.jpg)
Experimental results
They applied the multiple network alignment framework to three PPI networks:
◦ Yeast: 14319 interactions among 4389 proteins
◦ Worm: 3926 interactions among 2718 proteins
◦ Fly: 20720 interactions among 7038 proteins
It identified 183 protein clusters and 240 paths conserved at a significance level of P < 0.01; groups of conserved clusters overlap to define 71 distinct network regions
![Page 78: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/78.jpg)
Experimental results (cont’d)
![Page 79: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/79.jpg)
Experimental results (cont’d)
![Page 80: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/80.jpg)
Prediction of protein function
Whenever the set of proteins in a conserved cluster or path (over all species) was significantly enriched for a particular GO annotation and at least half of the proteins in the cluster or path had that annotation, all remaining proteins in the sub-network were predicted to have that annotation
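The prediction rule combines an enrichment test with a majority condition. The sketch below assumes a one-sided hypergeometric test and a 0.01 threshold for "significantly enriched"; the slide does not say which test or threshold was used, so both are placeholders, and all names are illustrative.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n proteins from N, of which K carry the annotation."""
    return sum(comb(K, x) * comb(N - K, n - x) for x in range(k, min(n, K) + 1)) / comb(N, n)


def predict_annotation(members, annotated, background_size, background_annotated, alpha=0.01):
    """Return the members predicted to gain the annotation, or an empty set.

    members              -- proteins in the conserved cluster or path
    annotated            -- proteins known to carry the GO annotation
    background_size      -- total number of proteins considered
    background_annotated -- number of those proteins carrying the annotation
    """
    members = set(members)
    have = members & set(annotated)
    p_value = hypergeom_tail(len(have), len(members), background_annotated, background_size)
    # Both conditions from the slide: enrichment and "at least half"
    if p_value < alpha and 2 * len(have) >= len(members):
        return members - have
    return set()
```

For a cluster of four proteins in which three share an annotation carried by only 10 of 1000 background proteins, both conditions hold and the fourth protein receives the prediction.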
![Page 81: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/81.jpg)
Fast and accurate alignment of multiple PPI networks
By Maxim Kalaev, Vineet Bafna, and Roded Sharan, 2007
Drawback of the alignment graph: exponential growth of the graph with the number of species
They introduced a new algorithm avoiding the explicit representation of every set of potentially orthologous proteins, thereby reducing time and memory requirements
![Page 82: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/82.jpg)
The layered alignment graph (1/3)
Given k PPI networks (for k species respectively)
A layered alignment graph: each layer corresponds to a species and contains the corresponding network. Additional edges connect proteins from different layers if they are sequence similar
A k-spine: a sub-graph of size k which includes a vertex from each of the layers. A k-spine corresponds to a set of truly orthologous proteins
A collection of connected k-spines induces a candidate conserved sub-network
![Page 83: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/83.jpg)
The layered alignment graph (2/3)
[Figure: layers for Species 1, Species 2, …, Species k with vertex sets U1, U2, U3, …, Uk; a k-spine U[3]; inter-layer edges and PPI edges]
![Page 84: PPI Network Alignment](https://reader036.vdocuments.site/reader036/viewer/2022062519/568152ce550346895dc0e9b1/html5/thumbnails/84.jpg)
The layered alignment graph (3/3)
Consider every k-spine to be a node in a graph
An m-subnet: a collection U of k multi-sets Ui = {ui[1],…, ui[m]}
◦ For all 1 ≤ i ≤ k and 1 ≤ j ≤ m, ui[j] belongs to Vi
◦ For all 1 ≤ j ≤ m, the set U[j] = {u1[j], u2[j],…, uk[j]} is a k-spine
The task is to look for high-scoring m-subnets, for a fixed m