Toni Gabaldón
Centre for Genomic Regulation (CRG), Barcelona
([email protected], @toni_Gabaldon)
Introduction to phylogenetics
Phylogenetic corals and corals of life
[top] Let dots represent Genera ???
[note at far left] If then had all given descendants then their w.[would] have been a great series.
[note at base] Parents of Marsupials and Placentals
[note within] Rodents
[notes at right] no form intermediate
Rodents
Marsupials
A phylogenetic tree
A branching diagram (bipartite graph) showing the inferred evolutionary relationships among biological species or other entities (e.g. sequences), based on similarities and differences in their physical and/or genetic characteristics.
Why care about (phylogenetic) trees?
“Nothing in biology makes sense except in the light of evolution” (Theodosius Dobzhansky)
Biological systems are the result of the evolutionary process. Trees represent this process.
Statistical tests usually assume “independence of data”, but no data in comparative biology are truly “independent”, as all organisms are related. Phylogenetic trees are the first step to detect and remove the effect of evolutionary relatedness.
How do we reconstruct phylogenetic trees?
Phylogenetic trees can be based on anything that tells us about similarities and differences (e.g. RFLPs, phenotypic characters, sequences).
Molecular phylogenetics is nowadays the most widely used approach.
Sequences are compared by means of alignments and this constitutes the information used to make the tree.
→ quality of the tree depends on the quality of the underlying alignment
Therefore it is important to have good alignments and to consider removing unreliable parts (alignment trimming, see http://trimal.cgenomics.org )
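To make the idea of alignment trimming concrete, here is a minimal sketch that drops columns containing too many gaps. It is only an illustration of one simple criterion: the function name `trim_gappy_columns`, the `max_gap_fraction` parameter and the toy alignment are assumptions made for this example, and this is not the algorithm implemented in trimAl.

```python
def trim_gappy_columns(alignment, max_gap_fraction=0.5):
    """Drop alignment columns whose fraction of gaps exceeds the threshold.

    `alignment` maps sequence names to aligned strings of equal length.
    Hypothetical helper for illustration; not trimAl's method.
    """
    if not alignment:
        return {}
    names = list(alignment)
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all aligned sequences must have the same length")

    kept_columns = []
    for col in range(length):
        gaps = sum(1 for name in names if alignment[name][col] == "-")
        if gaps / len(names) <= max_gap_fraction:
            kept_columns.append(col)

    # Rebuild every sequence from the columns that were kept.
    return {
        name: "".join(alignment[name][col] for col in kept_columns)
        for name in names
    }


# Toy alignment (made up): the fourth column is 75% gaps and gets removed.
toy_alignment = {
    "seqA": "ATG-CA",
    "seqB": "ATG-CA",
    "seqC": "ATGTCA",
    "seqD": "A-G-CT",
}
trimmed = trim_gappy_columns(toy_alignment, max_gap_fraction=0.5)
for name, seq in trimmed.items():
    print(name, seq)
```

Real trimming tools combine several criteria (gaps, column similarity, consistency across alignments), so a gap threshold alone should be seen as a starting point.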
So, how to find the best tree?
Exhaustive search: make ALL trees first, and then see which one best fits the data (you need an optimality criterion).
Heuristic search: try to find an optimal tree (hopefully the best) without testing them all. You also need an optimality criterion, and you are not guaranteed to find the best tree, but you save time.
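A quick calculation shows why exhaustive search stops being practical very early. The number of distinct unrooted, fully bifurcating trees for n labelled taxa is (2n-5)!!, i.e. the product 1 x 3 x 5 x ... x (2n-5). The sketch below just evaluates that formula; the function name is chosen for this example.

```python
def count_unrooted_trees(n_taxa):
    """Number of unrooted bifurcating tree topologies for n labelled taxa."""
    if n_taxa < 3:
        raise ValueError("need at least 3 taxa")
    count = 1
    # Multiply the odd numbers 3, 5, ..., 2n - 5 (the leading factor 1 is implicit).
    for k in range(3, 2 * n_taxa - 4, 2):
        count *= k
    return count


for n in (4, 5, 10, 20, 50):
    print(n, count_unrooted_trees(n))
```

With only 10 taxa there are already about two million topologies to score, which is the practical motivation for heuristic searches.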
Phylogenetic approaches:
Distance methods (NJ, UPGMA)
Maximum Parsimony
Probabilistic Methods (Maximum Likelihood and Bayesian Inference)
Distance-based methods: if there are no errors in the distances, then the correct tree can be obtained in polynomial time. Otherwise, the optimization problems are NP-hard.
Maximum Parsimony, Probabilistic Methods: NP-hard
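One common way to read "no errors in the distances" is that the distance matrix is additive, meaning it fits some tree exactly. That reading is an assumption made here, not something stated in the slides. Additivity can be tested with the four-point condition: for every four taxa, of the three possible sums of paired distances, the two largest must be equal. A minimal sketch, with hypothetical helper names and made-up distances:

```python
from itertools import combinations


def make_matrix(pairs):
    """Build a symmetric dict-of-dicts distance matrix from {(a, b): distance}."""
    matrix = {}
    for (a, b), value in pairs.items():
        matrix.setdefault(a, {})[b] = value
        matrix.setdefault(b, {})[a] = value
    return matrix


def is_additive(dist, tol=1e-9):
    """Check the four-point condition on every quartet of taxa."""
    for a, b, c, d in combinations(list(dist), 4):
        sums = sorted([
            dist[a][b] + dist[c][d],
            dist[a][c] + dist[b][d],
            dist[a][d] + dist[b][c],
        ])
        # The two largest sums must coincide for tree-like distances.
        if abs(sums[2] - sums[1]) > tol:
            return False
    return True


# Distances read off a tree: A and B on one side, C and D on the other.
tree_like = make_matrix({
    ("A", "B"): 3, ("A", "C"): 5, ("A", "D"): 5,
    ("B", "C"): 6, ("B", "D"): 6, ("C", "D"): 2,
})
# The same matrix with one distance perturbed (an "error").
noisy = make_matrix({
    ("A", "B"): 3, ("A", "C"): 4, ("A", "D"): 5,
    ("B", "C"): 6, ("B", "D"): 6, ("C", "D"): 2,
})
print(is_additive(tree_like), is_additive(noisy))
```

Real distance estimates are almost never exactly additive, which is why the choice of method and optimality criterion matters in practice.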
Confidence values: bootstrapping
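The resampling step behind bootstrapping can be sketched in a few lines: alignment columns are drawn with replacement until a pseudo-alignment of the original length is obtained. A tree is then built from each replicate, and the support of a branch is the fraction of replicate trees that contain it. The sketch covers only the resampling; the function name and the toy alignment are assumptions for this example, and tree building is left to whichever method is being used.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: columns resampled with replacement."""
    names = list(alignment)
    length = len(alignment[names[0]])
    sampled_columns = [rng.randrange(length) for _ in range(length)]
    return {
        name: "".join(alignment[name][col] for col in sampled_columns)
        for name in names
    }


# Toy alignment (made up for illustration).
toy_alignment = {
    "sp1": "ACGTACGTAA",
    "sp2": "ACGTACGTTA",
    "sp3": "ACGAACGTTC",
    "sp4": "TCGAACCTTC",
}

rng = random.Random(42)  # fixed seed so the run is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(100)]
print(replicates[0])
```

Whole columns are resampled, not individual characters, so the relationship between sequences at each site is preserved in every replicate.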
Maximum Parsimony
Finding the tree that implies the minimal number of changes along its branches.
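To score a given tree under parsimony, one standard option is Fitch's algorithm, which counts the minimum number of changes needed at each alignment site. The slides do not name a specific algorithm, so this choice is an assumption. The sketch represents a tree as nested 2-tuples of taxon names, rooted arbitrarily (the Fitch count does not depend on where the root is placed); the function names and toy data are made up for illustration.

```python
def fitch_site(tree, states):
    """Return (possible_states, changes) for one site on a rooted binary tree."""
    if isinstance(tree, str):  # leaf: its observed state, no changes
        return {states[tree]}, 0
    left, right = tree
    left_set, left_changes = fitch_site(left, states)
    right_set, right_changes = fitch_site(right, states)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_changes + right_changes
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_changes + right_changes + 1


def parsimony_length(tree, alignment):
    """Sum the Fitch change counts over all alignment sites."""
    names = list(alignment)
    length = len(alignment[names[0]])
    total = 0
    for site in range(length):
        site_states = {name: alignment[name][site] for name in names}
        total += fitch_site(tree, site_states)[1]
    return total


toy_alignment = {"A": "AAG", "B": "AAG", "C": "CAT", "D": "CAT"}
tree_ab_cd = (("A", "B"), ("C", "D"))
tree_ac_bd = (("A", "C"), ("B", "D"))
print(parsimony_length(tree_ab_cd, toy_alignment))
print(parsimony_length(tree_ac_bd, toy_alignment))
```

The tree grouping A with B needs fewer changes than the one grouping A with C, so parsimony prefers it. Searching over all candidate trees is the part that makes the problem hard.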
Neighbor Joining
1. Based on the current distance matrix, calculate the matrix Q.
2. Find the pair of taxa in Q with the lowest value. Create a node on the tree that joins these closest neighbors.
3. Calculate the distance of each of the taxa in the pair to this new node.
4. Calculate the distance of all taxa outside of this pair to the new node.
5. Start the algorithm again, considering the pair of joined neighbors as a single taxon and using the distances calculated in the previous step.
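The five steps can be sketched compactly in Python. The slides do not give the formulas, so the sketch assumes the standard Neighbor Joining definitions: Q(i,j) = (n-2) d(i,j) - sum_k d(i,k) - sum_k d(j,k); the distance from i to the new node u is d(i,j)/2 + (sum_k d(i,k) - sum_k d(j,k)) / (2(n-2)); and d(u,k) = (d(i,k) + d(j,k) - d(i,j)) / 2. The function names, the `nodeN` labels and the example matrix are illustrative choices.

```python
from itertools import combinations


def make_matrix(pairs):
    """Build a symmetric dict-of-dicts distance matrix from {(a, b): distance}."""
    matrix = {}
    for (a, b), value in pairs.items():
        matrix.setdefault(a, {})[b] = value
        matrix.setdefault(b, {})[a] = value
    return matrix


def neighbor_joining(dist):
    """Return the unrooted tree as a list of (node, neighbor, branch_length) edges."""
    d = {a: dict(row) for a, row in dist.items()}  # work on a copy
    edges = []
    new_nodes = 0
    while len(d) > 2:
        n = len(d)
        totals = {a: sum(d[a][b] for b in d if b != a) for a in d}
        # Steps 1-2: evaluate Q for every pair and pick the lowest value.
        i, j = min(
            combinations(d, 2),
            key=lambda p: (n - 2) * d[p[0]][p[1]] - totals[p[0]] - totals[p[1]],
        )
        new = f"node{new_nodes}"
        new_nodes += 1
        # Step 3: branch lengths from the joined pair to the new node.
        d_i = 0.5 * d[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
        d_j = d[i][j] - d_i
        edges.append((i, new, d_i))
        edges.append((j, new, d_j))
        # Step 4: distances from every other taxon to the new node.
        new_row = {
            k: 0.5 * (d[i][k] + d[j][k] - d[i][j]) for k in d if k not in (i, j)
        }
        # Step 5: replace the pair by the new node and iterate.
        del d[i], d[j]
        for k, row in d.items():
            del row[i], row[j]
            row[new] = new_row[k]
        d[new] = new_row
    a, b = d  # two nodes remain: connect them directly
    edges.append((a, b, d[a][b]))
    return edges


example = make_matrix({
    ("a", "b"): 5, ("a", "c"): 9, ("a", "d"): 9, ("a", "e"): 8,
    ("b", "c"): 10, ("b", "d"): 10, ("b", "e"): 9,
    ("c", "d"): 8, ("c", "e"): 7, ("d", "e"): 3,
})
for node, neighbor, length in neighbor_joining(example):
    print(f"{node} -- {neighbor}: {length}")
```

Ties in Q are broken here by taking the first minimal pair, which is an arbitrary choice. For additive distances the resulting topology and branch lengths do not depend on it.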