BNFO 602 Phylogenetics
Usman Roshan
Summary of last time
• Models of evolution
• Distance-based tree reconstruction
  – Neighbor joining
  – UPGMA
Why phylogenetics?
• Study of evolution
  – Origin and migration of humans
  – Origin and spread of disease
• Many applications in comparative bioinformatics
  – Sequence alignment
  – Motif detection (phylogenetic motifs, evolutionary trace, phylogenetic footprinting)
  – Correlated mutation (useful for structural contact prediction)
  – Protein interaction
  – Gene networks
  – Vaccine development
  – And many more…
Maximum Parsimony
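• Character-based method
• NP-hard (it is a Steiner tree problem on sequences under Hamming distance)
• Widely used in phylogenetics
• Typically slower than NJ but often more accurate
• Typically faster than maximum likelihood (ML)
• Assumes sites are i.i.d. (independent and identically distributed)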
Maximum Parsimony
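• Input: Set S of n aligned sequences of length k
• Output: A phylogenetic tree T
  – leaf-labeled by the sequences in S
  – with additional sequences of length k labeling the internal nodes of T
  such that $\sum_{(i,j) \in E(T)} H(i,j)$ is minimized, where E(T) is the edge set of T and H(i,j) is the Hamming distance between the sequences labeling nodes i and j.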
Maximum parsimony (example)
• Input: Four sequences
  – ACT
  – ACA
  – GTT
  – GTA
• Question: which of the three trees has the best MP score?
Maximum Parsimony
[Figure: the three possible unrooted tree topologies on the leaves ACT, ACA, GTT, GTA]
Maximum Parsimony
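[Figure: the three trees with internal node labels and per-edge Hamming distances]
• Tree 1: ((ACT, GTT), (ACA, GTA)), internal nodes GTT and GTA; nonzero edge costs ACT-GTT: 2, GTT-GTA: 1, ACA-GTA: 2; MP score = 5
• Tree 2: ((ACA, GTT), (ACT, GTA)), internal nodes ACA and ACT; nonzero edge costs GTT-ACA: 3, ACA-ACT: 1, GTA-ACT: 3; MP score = 7 under the labeling shown (labeling both internal nodes ACA lowers this topology's score to 6)
• Tree 3: ((ACT, ACA), (GTT, GTA)), internal nodes ACA and GTA; nonzero edge costs ACT-ACA: 1, ACA-GTA: 2, GTT-GTA: 1; MP score = 4
• Tree 3 is the optimal MP tree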
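The three scores on this slide can be checked by summing Hamming distances over the edges of each labeled tree. The sketch below is illustrative only: the helper names (`hamming`, `tree_length`, `quartet_edges`) are hypothetical, and the placement of leaves and internal labels is an assumption reconstructed from the edge costs shown on the slide.

```python
def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    assert len(a) == len(b)
    return sum(x != y for x, y in zip(a, b))


def tree_length(edges):
    """Sum of Hamming distances over the edges of a fully labeled tree."""
    return sum(hamming(u, v) for u, v in edges)


def quartet_edges(left, right, internal_left, internal_right):
    """Edges of a four-leaf tree: two cherries joined by one internal edge."""
    edges = [(leaf, internal_left) for leaf in left]
    edges += [(leaf, internal_right) for leaf in right]
    edges.append((internal_left, internal_right))
    return edges


# Leaf pairs and internal labels as assumed from the slide's edge costs.
tree1 = quartet_edges(("ACT", "GTT"), ("ACA", "GTA"), "GTT", "GTA")
tree2 = quartet_edges(("ACA", "GTT"), ("ACT", "GTA"), "ACA", "ACT")
tree3 = quartet_edges(("ACT", "ACA"), ("GTT", "GTA"), "ACA", "GTA")

scores = [tree_length(t) for t in (tree1, tree2, tree3)]
print(scores)  # [5, 7, 4]
```

Note that this only scores a given labeling; it does not search for the best internal labels of a topology.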
Maximum Parsimony: computational complexity
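[Figure: the optimal tree ((ACT, ACA), (GTT, GTA)) with internal nodes ACA and GTA; edge costs 1, 2, 1; MP score = 4]
Finding the optimal MP tree is NP-hard
For a fixed tree, an optimal labeling of the internal nodes can be computed in linear time, O(nk)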
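One standard way to get the linear-time bound for a fixed tree is Fitch's algorithm, which the slides do not name; the sketch below is offered under that assumption. It performs only the bottom-up pass, so it returns the optimal score rather than the labels themselves (a top-down pass would pick one state per set). Trees are written as nested pairs of leaf sequences, a representation chosen here for brevity; rooting the quartet on its internal edge does not change the score.

```python
def fitch_score(tree):
    """Fitch parsimony score of a rooted binary tree.

    A tree is either a leaf (a sequence string) or a pair (left, right).
    Each node is visited once and does O(k) work, so the total is O(nk)
    for a constant-size alphabet.
    """
    def visit(node):
        if isinstance(node, str):
            return [{ch} for ch in node], 0
        (left_sets, left_cost) = visit(node[0])
        (right_sets, right_cost) = visit(node[1])
        sets, cost = [], left_cost + right_cost
        for x, y in zip(left_sets, right_sets):
            if x & y:
                sets.append(x & y)   # children agree on some state
            else:
                sets.append(x | y)   # disagreement costs one change
                cost += 1
        return sets, cost

    return visit(tree)[1]


topologies = [
    (("ACT", "GTT"), ("ACA", "GTA")),
    (("ACA", "GTT"), ("ACT", "GTA")),
    (("ACT", "ACA"), ("GTT", "GTA")),
]
fitch_scores = [fitch_score(t) for t in topologies]
print(fitch_scores)  # [5, 6, 4]
```

The middle topology scores 6 here because the algorithm finds its best labeling, whereas the labeled tree drawn on the example slide has length 7.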
Local search strategies
[Figure: cost plotted over the space of phylogenetic trees, with a local optimum and the global optimum marked]
Local search for MP
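• Determine a candidate solution s
• While s is not a local minimum
  – Find a neighbor s' of s such that MP(s') < MP(s)
  – If found, set s = s'
  – Else return s and exit
• Time complexity: no useful bound is known. The search always terminates, because every move strictly lowers the MP score and there are finitely many trees, but it may run very long or end quickly depending on the starting tree and the local move
• Need to specify how to construct the starting tree and which local move to use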
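The loop above can be sketched generically, with the neighborhood and the scoring function passed in as parameters. This is a minimal illustration, not the course's implementation: the four-taxon demo, the nested-pair tree representation, and the names `local_search`, `quartet_topologies`, and `neighbors` are all assumptions made for the example. On four taxa the neighbors of a topology are simply the other two topologies.

```python
def fitch_score(tree):
    """Fitch parsimony score of a rooted binary tree of nested pairs."""
    def visit(node):
        if isinstance(node, str):
            return [{ch} for ch in node], 0
        (ls, lc), (rs, rc) = visit(node[0]), visit(node[1])
        sets, cost = [], lc + rc
        for x, y in zip(ls, rs):
            if x & y:
                sets.append(x & y)
            else:
                sets.append(x | y)
                cost += 1
        return sets, cost
    return visit(tree)[1]


def local_search(start, neighbors, score):
    """Move to the first improving neighbor until none exists."""
    s = start
    while True:
        for t in neighbors(s):
            if score(t) < score(s):
                s = t
                break
        else:
            return s  # no neighbor improves: s is a local minimum


LEAVES = ("ACT", "ACA", "GTT", "GTA")


def quartet_topologies(leaves):
    """The three unrooted topologies on four leaves."""
    a, b, c, d = leaves
    return [((a, b), (c, d)), ((a, c), (b, d)), ((a, d), (b, c))]


def neighbors(tree):
    """On four taxa, every other topology is one move away."""
    return [t for t in quartet_topologies(LEAVES) if t != tree]


start = (("ACT", "GTA"), ("ACA", "GTT"))   # score 6
result = local_search(start, neighbors, fitch_score)
print(result, fitch_score(result))
```

Taking the first improving neighbor matches the slide's wording; choosing the best neighbor instead (steepest descent) is an equally valid variant.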
Starting tree for MP
• Random phylogeny: O(n) time
• Greedy-MP
Greedy-MP
Greedy-MP takes O(n^2k^2) time
Local moves for MP: NNI
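• NNI (nearest neighbor interchange): for each internal edge we get two different topologies
• Neighborhood size is 2n-6: an unrooted binary tree on n leaves has n-3 internal edges, each giving 2 neighbors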
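The NNI neighborhood can be enumerated directly on an adjacency-set representation of an unrooted binary tree. The sketch below is one possible way to do it; the function name `nni_neighbors`, the node names, and the dictionary-of-sets representation are assumptions for illustration. For each internal edge (u, v) it swaps one subtree attached to u with each of the two subtrees attached to v.

```python
import copy


def nni_neighbors(adj):
    """Return all trees one NNI move away from an unrooted binary tree.

    adj maps each node name to the set of its neighbors; leaves have
    degree 1 and internal nodes have degree 3.
    """
    internal = {v for v, nbrs in adj.items() if len(nbrs) == 3}
    result = []
    for u in sorted(internal):
        for v in sorted(adj[u] & internal):
            if u >= v:
                continue  # visit each internal edge once
            b = sorted(adj[u] - {v})[1]        # one subtree hanging off u
            for c in sorted(adj[v] - {u}):     # each subtree hanging off v
                new = copy.deepcopy(adj)
                # Swap subtree b (at u) with subtree c (at v).
                new[u].remove(b)
                new[u].add(c)
                new[v].remove(c)
                new[v].add(b)
                new[b].remove(u)
                new[b].add(v)
                new[c].remove(v)
                new[c].add(u)
                result.append(new)
    return result


# Caterpillar tree on n = 5 leaves (a..e) with internal nodes x, y, z.
tree = {
    "a": {"x"}, "b": {"x"}, "c": {"y"}, "d": {"z"}, "e": {"z"},
    "x": {"a", "b", "y"}, "y": {"x", "c", "z"}, "z": {"y", "d", "e"},
}
moves = nni_neighbors(tree)
print(len(moves))  # 4
```

With n = 5 there are n-3 = 2 internal edges, so the function returns 2n-6 = 4 neighbors.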
Local moves for MP: SPR
• SPR (subtree pruning and regrafting): neighborhood size is quadratic in the number of taxa
• Computing the minimum number of SPR moves between two rooted phylogenies is NP-hard
Local moves for MP: TBR
• TBR (tree bisection and reconnection): neighborhood size is cubic in the number of taxa
• Computing the minimum number of TBR moves between two unrooted phylogenies is NP-hard
Local optima are a problem
[Figure: plot for TNT with x-axis ticks from 1 to 336 and y-axis ticks from 0 to 0.08; axis labels not recoverable]
Iterated local search: escape local optima by perturbation
[Figure: local search descends to a local optimum]
[Figure: a perturbation moves the search away from the local optimum; local search is then run again from the output of the perturbation]
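The perturb-then-search cycle can be sketched on a toy one-dimensional landscape. Everything concrete here is an assumption for illustration: the integer positions stand in for trees, the `LANDSCAPE` values are made-up costs standing in for MP scores, and the perturbation is a bounded random jump. Real MP heuristics perturb trees rather than integers.

```python
import random


def local_search(x, cost, lo, hi):
    """Move to the better adjacent position until neither one improves."""
    while True:
        better = [y for y in (x - 1, x + 1)
                  if lo <= y <= hi and cost(y) < cost(x)]
        if not better:
            return x
        x = min(better, key=cost)


def iterated_local_search(start, cost, lo, hi, rounds=50, jump=5, seed=0):
    """Alternate perturbation and local search, keeping only improvements."""
    rng = random.Random(seed)
    best = local_search(start, cost, lo, hi)
    for _ in range(rounds):
        # Perturbation: a random jump away from the current local optimum.
        kicked = min(hi, max(lo, best + rng.randint(-jump, jump)))
        candidate = local_search(kicked, cost, lo, hi)
        if cost(candidate) < cost(best):
            best = candidate
    return best


# Hypothetical costs over a toy search space with several local optima.
LANDSCAPE = [9, 7, 8, 6, 5, 6, 7, 3, 4, 6, 2, 1, 3, 5, 4, 6]
cost = LANDSCAPE.__getitem__
top = len(LANDSCAPE) - 1

plain = local_search(0, cost, 0, top)
ils = iterated_local_search(0, cost, 0, top)
print(plain, cost(plain), ils, cost(ils))
```

Plain local search from position 0 stops at the first local optimum it reaches, while the iterated version can only match or improve on that cost because a perturbed result is accepted only when it is strictly better.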
ILS for MP
• Ratchet
• Iterative-DCM3
• TNT