![Page 1: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/1.jpg)
PPI Network Alignment
02-‐715 Advanced Topics in Computa8onal Genomics
![Page 2: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/2.jpg)
PPI Network Alignment
• Compara8ve analysis of PPI networks across different species by aligning the PPI networks – Find func8onal orthologs of proteins in PPI network of different
species
– Discover conserved subnetwork mo8fs in the PPI network
• Global vs. local alignment – Most of the previous work was focused on local alignment – Global alignment can beJer capture the global picture of how
conserved subnetwork mo8fs are organized – but this is more challenging
![Page 3: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/3.jpg)
PPI Network Alignment
• Challenges – How can we align mul$ple PPI networks?: pair-‐wise alignment is an
easier problem
– How can we use both sequence conserva8on informa8on and local network topology during the alignment?
• Conserved subnetworks across species have proteins with conserved sequences as well as conserved interac8ons with other proteins
• Most of the previous work was focused on finding orthologs based on the sequence similari8es
![Page 4: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/4.jpg)
IsoRank and IsoRank-Nibble
• Mul8ple PPI network alignment for mul8ple species
• Global alignment
• Alignment based on both sequence and local connec8vity conserva8ons
• Based on Google PageRank
![Page 5: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/5.jpg)
PageRank Overview
• Developed by Larry Page and used in Google search engine
• Algorithm for ranking hyperlinked webpages in the network of webpages – Node is each webpage – Directed edge from a linking page to the hyperlinked page
• Pages with higher PageRank are returned as search hits
![Page 6: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/6.jpg)
PageRank Overview
• PageRank models the user behavior
• PageRank for each page is the probability that a websurfer who starts at a random page and takes a random walk on this network of webpages end up at that page – With probability d (damping factor), the websurfer jumps to a different
randomly selected webpage and starts a random walk
– Without the damping factor, only the webpages with no outgoing edges will get non-‐zero PageRanks
![Page 7: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/7.jpg)
PageRank
• The webpages with a greater number of pages linked to it are ranked higher
• If a webpage has mul8ple hyperlinks, and the vote of each outgoing edge is divided by the number of hyperlinks
• The vote of each hyperlink depends on the PageRank of the linking webpage – Recursive defini8on of PageRanks
![Page 8: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/8.jpg)
PageRank Illustration
![Page 9: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/9.jpg)
PageRank
• PageRank pi of page i is given as
– d: damping factor, it ensures each page gets at least (1-‐d) PageRank
– N: the number of webpages
– Lij=1 if page j points to page i, and 0 otherwise –
![Page 10: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/10.jpg)
PageRank
• Using matrix nota8on
• p: the vector of length N • e: the vector of N ones • : diagonal elements are ci • L: NxN matrix of Lij’s
• Introduce normaliza8on so that average PageRank is 1
![Page 11: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/11.jpg)
PageRank
• p/N is the sta8onary distribu8on of a Markov chain over the N webpages
• In order to find p, we use power method – Ini8alize – Iterate to find fixed point p
![Page 12: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/12.jpg)
IsoRank
• Stage 1: Given two networks G1 and G2, compute the similarity scores Rij for a pair of protein for node i in vertex set V1 in G1 and protein for node j in vertex set V2 in G2 – Use PageRank algorithm
• Stage 2: Given the matrix R of Rij, find the global alignment using a greedy algorithm
![Page 13: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/13.jpg)
From PageRank to IsoRank
• PageRank ranks webpages, whereas IsoRank ranks the pairs of proteins from the two networks to be aligned.
• PageRank uses the hyperlink informa8on from neighboring nodes to recursively compute the ranks, whereas IsoRank uses the sequence similarity and network connec8vity with other neighboring nodes to define the ranks.
![Page 14: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/14.jpg)
IsoRank
• Similarly to PageRank, pairwise similarity score Rij is recursively defined as
– N(i) : the set of neighbors of node u within the graph of u
• Using matrix nota8on
• A is a large but sparse matrix
![Page 15: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/15.jpg)
IsoRank Example
![Page 16: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/16.jpg)
IsoRank
• When the network edges are weighted
• Power method can be used to compute Rij’s
![Page 17: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/17.jpg)
IsoRank
• Incorpora8ng sequence similarity informa8on E
– α = 0: only sequence similarity informa8on is used but no network informa8on is used.
– α = 1: only network informa8on is used
![Page 18: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/18.jpg)
IsoRank: Stage 2
• Extrac8ng node-‐mapping informa8on for global alignment given pairwise similarity scores Rij – One-‐to-‐one mapping
• Any node is mapped to at most one node in the network from other species
• Efficient computa8on • Ignores gene duplica8on
– Many-‐to-‐many mapping • Finds clusters of orthologous genes across networks from different species
– Mapping criterion: iden8fy pairs of nodes that have high Rij scores, while ensuring the mapping obeys transi8ve closures – if the mapping contains (a,b) and (b,c), it should contain (a,c)
![Page 19: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/19.jpg)
IsoRank: Stage 2
• One-‐to-‐one mapping – Greedy approach – Select the highest scoring pair
![Page 20: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/20.jpg)
IsoRank: Stage 2
• Many-‐to-‐many mapping – Greedy approach – Form a k-‐par8te graph with k graphs – Iterate un8l k-‐par8te graph has no edges
• Finding seed pair: – select the edge (i,j) with the highest score Rij (i,j are from two different graphs G1 and G2)
• Extend the seed: – In (G3, …, Gk), find a node l, such that 1) Rlj and Rli are the highest scores between l and any node in G1 and G2, and 2) Rli and Rlj exceed a certain threshold
• Remove from k-‐par8te graph the match set
![Page 21: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/21.jpg)
Results
• Alignment PPI networks from five species – S. cerevisiae, D. Melanogaster, C. elegans, M. musculus, H. sapiens
– The common subgraph supported by the global alignment contains • 1,663 edges supported by at least two PPI networks • 157 edges supported by at least three networks
– The alignment by sequence-‐only (no network) method contains
• 509 edges with support in two or more species
• 40 edges supported by at least three networks
![Page 22: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/22.jpg)
Results
• Subgraphs selected from yeast-‐fly PPI network alignment
![Page 23: PPI Network Alignmentsssykim/teaching/s13/slides/Lecture_PPI.pdf · 2013-04-17 · PPI Network Alignment • Comparave’analysis’of’PPInetworks’across’differentspecies’](https://reader033.vdocuments.site/reader033/viewer/2022042709/5f4b7d20ce2d1d2ed812564d/html5/thumbnails/23.jpg)
Summary
• IsoRank uses both sequence similarity and network informa8on to align mul8ple PPI networks from different species
• IsoRank adopts PageRank algorithm to solve the problem of global PPI network alignment