phylogenetics
![Page 1: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/1.jpg)
Phylogenetics: Reconstructing the Tree of Life
![Page 2: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/2.jpg)
Tree of Life Web Project
http://www.tolweb.org/tree/
![Page 3: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/3.jpg)
Fig. 26-21: The tree of life, with three domains descending from the COMMON ANCESTOR OF ALL LIFE
EUKARYA: Land plants, Green algae, Red algae, Forams, Ciliates, Dinoflagellates, Diatoms, Cellular slime molds, Amoebas, Animals, Fungi, Euglena, Trypanosomes, Leishmania
ARCHAEA: Sulfolobus, Thermophiles, Halophiles, Methanobacterium
BACTERIA: Green nonsulfur bacteria, (Mitochondrion), Spirochetes, Chlamydia, Green sulfur bacteria, Cyanobacteria, (Plastids, including chloroplasts)
![Page 4: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/4.jpg)
Outline
1. What is a phylogeny?
2. How do you construct a phylogeny?
   The Molecular Clock
   Statistical Methods
![Page 5: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/5.jpg)
Are genetic distances and the fossil record roughly congruent?
Think about relationships among the major lineages of life and when they appeared in the fossil record
![Page 6: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/6.jpg)
Fossil Record vs Molecular Clock
• Molecular clock and fossil record are not always congruent
– Fossil record is incomplete, and soft-bodied species are usually not preserved
– Mutation rates can vary among species (depending on generation time, replication error, mismatch repair)
• But they provide complementary information
– Fossil record contains extinct species, while molecular data are based on extant taxa
– Major events in the fossil record could be used to calibrate the molecular clock
![Page 7: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/7.jpg)
Evolutionary History of HIV
Source: Freeman & Herron, Evolutionary Analysis, 2004 (figure axis: Time)
HIV evolved multiple times from SIV (Simian Immunodeficiency Virus)
![Page 8: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/8.jpg)
Charles Darwin (1809 -1882)
On the Origin of Species (1859)
– Living species are related by common ancestry
– Change through time occurs at the population not the organism level
– The main cause of adaptive evolution is natural selection
![Page 9: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/9.jpg)
Darwin envisaged evolution as a tree
The affinities of all the beings of the same class have sometimes been represented by a great tree. I believe this simile largely speaks the truth... The green and budding twigs may represent existing species; and those produced during former years may represent the long succession of extinct species... the great Tree of Life... covers the earth with ever-branching and beautiful ramifications
Charles Darwin, On the Origin of Species; pages 131-132
![Page 10: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/10.jpg)
Reconstructing the Tree of Life
![Page 11: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/11.jpg)
The only figure in The Origin of Species
![Page 12: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/12.jpg)
Lamarck proposed a ladder of life
What did people believe before Darwin?
Past Future
![Page 13: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/13.jpg)
Jean-Baptiste Lamarck
• French Naturalist (1744-1829)
• “Professor of Worms and Insects” in Paris
• The first scientific theory of evolution (inheritance of acquired traits)
![Page 14: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/14.jpg)
Lamarck’s View of Evolution
• Continuum between physical and biological world (followed Aristotle)
• Scala Naturae (“Ladder of Life” or “Great Chain of Being”)
Figure: the Great Chain of Being as a ladder, from top to bottom: God, Angels, Demons, Man, Animals, Plants, Minerals (regions labeled Being, Realm of Being, Realm of Becoming, Non-Being)
![Page 15: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/15.jpg)
What is wrong with a ladder?
• Evolution is not linear but branching
• Living organisms are not ancestors of one another
• The ladder implies progress
![Page 16: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/16.jpg)
What is right with the tree?
• Evolution is a branching process
• If a mutation occurs, one species does not turn into another; rather, there is a split, and both lineages continue to evolve
• So, evolution is not progressive - all living taxa are equally “successful”
• Phylogenies (Trees) reflect the hierarchical structuring of relationships
![Page 17: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/17.jpg)
The only figure in The Origin of Species
![Page 18: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/18.jpg)
The Tree of Life is a Fractal
http://tolweb.org/tree/phylogeny.html
![Page 19: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/19.jpg)
Genealogical structures
• Phylogeny
– A depiction of the ancestry relations between species (it includes speciation events)
– Tree-like (divergent)
• Pedigree
– A depiction of the ancestry relations within populations
– Net-like (reticulating)
![Page 20: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/20.jpg)
Four butterflies connected to their parents
offspring
parents
![Page 21: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/21.jpg)
Figure: individuals within a population through time (labels: Population, Individuals; axis: past to future)
![Page 22: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/22.jpg)
Figure: populations forming lineages through time (labels: Population, Lineage/Species, Phylogeny)
What happened here?
Lineage-branching: Speciation
![Page 23: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/23.jpg)
What happened here?
Extinction
![Page 24: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/24.jpg)
A B C
The True History
A B C
A simplified representation
Representation of phylogenies?
![Page 25: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/25.jpg)
Some terms used to describe a phylogenetic tree
Taxon (taxa)
Tip
Internal branch
Internode
Node (Speciation event)
Root
![Page 26: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/26.jpg)
Outline
1. What is a phylogeny?
2. How do you construct a phylogeny?
   The Molecular Clock
   Statistical Methods
![Page 27: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/27.jpg)
• A phylogenetic tree represents a hypothesis about evolutionary relationships
• Each branch point represents the divergence of two taxa (e.g. species)
• Sister taxa are groups that share an immediate common ancestor
What is a Phylogeny?
![Page 28: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/28.jpg)
Molecular Clock
• Mutations
• On average, mutations occur at a given rate
Example: Mitochondria: ~2.2% sequence divergence per million years.
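The arithmetic behind a molecular clock can be sketched in a few lines. This is only an illustration under stated assumptions: the clock is strictly linear, the ~2.2% figure is read as divergence between two sequences per million years, and the function name and the 11% input are hypothetical.

```python
def divergence_time_myr(pairwise_divergence, rate_per_myr=0.022):
    """Estimate time since two sequences split, in millions of years.

    pairwise_divergence: proportion of sites that differ (0.11 means 11%).
    rate_per_myr: assumed divergence rate per million years (0.022 = 2.2%).
    Assumes a strict, constant-rate clock and no saturation.
    """
    if rate_per_myr <= 0:
        raise ValueError("rate_per_myr must be positive")
    return pairwise_divergence / rate_per_myr


# Hypothetical input: two mitochondrial sequences that differ at 11% of sites.
estimate = divergence_time_myr(0.11)
print(f"Estimated split: about {estimate:.1f} million years ago")
```

If the true rate differs among lineages, as the next slide discusses, the estimate shifts in direct proportion to the assumed rate.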
![Page 29: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/29.jpg)
Molecular Clock
The clock runs faster if
• Mutation rate is higher:
– Shorter generation time (greater number of meiosis or mitosis events in a given time)
– Replication error (e.g. sloppy DNA or RNA polymerase, inefficient mismatch repair)
![Page 30: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/30.jpg)
![Page 31: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/31.jpg)
Phylogenetic Trees with Proportional Branch Lengths
• In some trees, the length of a branch can reflect the number of genetic changes that have taken place in a particular DNA sequence in that lineage
• So longer branches = greater evolutionary distance
![Page 32: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/32.jpg)
Phylogenetically Informative Characters (mutations)
• Neutral mutations:
– Mutations that are not subjected to selection
– Better for constructing phylogenies because selection could make unrelated taxa appear more similar or related taxa more different
– Examples: noncoding regions of DNA, 3rd codon positions in protein-coding genes, introns, microsatellites (“junk DNA”)
![Page 33: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/33.jpg)
Codon Bias
In the case of amino acids (protein-coding sequences):
Mutations in codon positions 1 and 2 usually change the amino acid
Mutations in position 3 often do not change the amino acid (synonymous changes)
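The claim about codon positions can be checked against the standard genetic code. The sketch below counts, for each codon position, what fraction of single-base changes in sense codons leave the amino acid unchanged. The helper name is hypothetical, and changes that create a stop codon are counted as non-synonymous, which is an assumption of this sketch.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ("*" = stop).
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def synonymous_fraction(position):
    """Fraction of single-base changes at `position` (0, 1, 2) that keep the amino acid."""
    same = total = 0
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid == "*":
            continue  # only mutate sense codons
        for base in BASES:
            if base == codon[position]:
                continue
            mutant = codon[:position] + base + codon[position + 1:]
            total += 1
            same += CODON_TABLE[mutant] == amino_acid
    return same / total


# Keys are 1-based codon positions, matching the slide.
fractions = {pos + 1: synonymous_fraction(pos) for pos in range(3)}
for pos, frac in fractions.items():
    print(f"position {pos}: {frac:.1%} of changes are synonymous")
```

Position 3 is not fully silent, so "don't matter" is best read as "often do not change the protein".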
![Page 34: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/34.jpg)
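Figure: hierarchical classification (Order, Family, Genus, Species)
Order Carnivora
– Family Felidae: Genus Panthera: Panthera pardus
– Family Mustelidae: Genus Taxidea: Taxidea taxus; Genus Lutra: Lutra lutra
– Family Canidae: Genus Canis: Canis latrans, Canis lupus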
![Page 35: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/35.jpg)
Figure: phylogenetic tree of Taxon A to Taxon F, labeled with: ancestral lineage; common ancestor of taxa A–F; branch point (node); sister taxa; polytomy (unresolved branching point)
![Page 36: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/36.jpg)
Figure: the same tree of taxa A to G shown three times, with a different group highlighted in each panel
(a) Monophyletic group (clade): Group I
(b) Paraphyletic group: Group II
(c) Polyphyletic group: Group III
A monophyletic group (clade) consists of an ancestral taxon and all its descendants
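The definition of a monophyletic group can be turned into a small check: a set of taxa is monophyletic exactly when it matches the full set of tips below some node. The sketch below assumes a tree written as nested tuples; the example topology for taxa A to G is hypothetical and is not the tree from the figure.

```python
def leaf_set(node):
    """All tip names below a node; a tip is a string, an internal node a tuple."""
    if isinstance(node, str):
        return frozenset([node])
    return frozenset().union(*(leaf_set(child) for child in node))


def clades(node):
    """Yield the tip set of every node in the tree (every possible clade)."""
    yield leaf_set(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if `group` is exactly an ancestor's complete set of descendants."""
    return frozenset(group) in set(clades(tree))


# Hypothetical tree for illustration only.
tree = ((("A", "B"), "C"), (("D", "E"), ("F", "G")))

print(is_monophyletic(tree, {"A", "B", "C"}))  # ancestor plus all descendants
print(is_monophyletic(tree, {"B", "C"}))       # leaves out A, a descendant of the same ancestor
print(is_monophyletic(tree, {"A", "B", "D"}))  # members drawn from separate branches
```

The check only tells monophyletic groups from the rest. Separating paraphyletic from polyphyletic groups would also need the position of each group's most recent common ancestor.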
![Page 37: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/37.jpg)
Examples of Paraphyletic Groups
![Page 38: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/38.jpg)
Figure: tree of taxa A to G with Group I highlighted
(a) Monophyletic group (clade)
(In the lecture on species concepts we discussed that the “smallest” monophyletic group is a “phylogenetic species”)
![Page 39: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/39.jpg)
![Page 40: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/40.jpg)
![Page 41: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/41.jpg)
Synapomorphies
• Synapomorphies are shared derived homologous traits
• They can be DNA nucleotides or other heritable traits
• They are used to group taxa that are more closely related to one another
![Page 42: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/42.jpg)
![Page 43: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/43.jpg)
![Page 44: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/44.jpg)
![Page 45: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/45.jpg)
synapomorphies
![Page 46: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/46.jpg)
![Page 47: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/47.jpg)
![Page 48: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/48.jpg)
![Page 49: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/49.jpg)
Sometimes similar looking traits are not homologous, and are not synapomorphies, but are the result of convergent evolution
![Page 50: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/50.jpg)
![Page 51: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/51.jpg)
![Page 52: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/52.jpg)
How do we construct Phylogenies?
![Page 53: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/53.jpg)
Phylogenetic Methods
• Parsimony: Minimize # steps
• Distance Matrix: minimize pairwise genetic distances
• Maximum Likelihood: Probability of the data given the tree
• Bayesian: Probability of the tree given the data
![Page 54: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/54.jpg)
Parsimony Uses Discrete Characters (like mutations, or some heritable trait)
Select the tree with the minimum number of character-state transitions summed across all characters
![Page 55: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/55.jpg)
Parsimony: Example 1
Fig. 26-15-1: Three phylogenetic hypotheses for Species I, Species II, and Species III (three candidate trees)
![Page 56: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/56.jpg)
Fig. 26-15-2
Table: bases (A, C, G, T) at sites 1 to 4 for Species I, Species II, Species III, and the ancestral sequence
Trees: the three hypotheses for Species I, II, and III, marked wherever site 1 changes to C (1/C)
![Page 57: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/57.jpg)
Fig. 26-15-3
Table: bases (A, C, G, T) at sites 1 to 4 for Species I, Species II, Species III, and the ancestral sequence
Trees: the three hypotheses for Species I, II, and III, marked with the changes at each site (1/C, 2/T, 3/A, 4/C)
![Page 58: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/58.jpg)
Fig. 26-15-4
Table: bases (A, C, G, T) at sites 1 to 4 for Species I, Species II, Species III, and the ancestral sequence
Trees: the three hypotheses for Species I, II, and III, marked with the changes at each site (1/C, 2/T, 3/A, 4/C)
Events required by the three trees: 7 events, 7 events, 6 events
The tree requiring 6 events is the most parsimonious
![Page 59: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/59.jpg)
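Parsimony: Example 2
Three possible trees for taxa A, B, and C with outgroup O
Tree 1 (tips drawn in the order C, B, A, O)
Tree 2 (tips drawn in the order A, B, C, O)
Tree 3 (tips drawn in the order B, A, C, O)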
![Page 60: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/60.jpg)
Map the characters (mutations) onto tree 1
Character:  1 2 3 4 5
A:          G C G A A
B:          G C A A A
C:          G C A C T
O:          T G G A A
Tree 1 (tips C, B, A, O), with characters 1 and 2 mapped
![Page 61: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/61.jpg)
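Map the characters (mutations) onto tree 1
Character:  1 2 3 4 5
A:          G C G A A
B:          G C A A A
C:          G C A C T
O:          T G G A A
Tree 1 (tips C, B, A, O), with characters 1, 2, 4, and 5 mapped once and character 3 mapped twice
Total number of steps = 6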
![Page 62: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/62.jpg)
Actually, there is more than one way to map character 3
Character 3: A = G, B = A, C = A, O = G
Two alternative mappings of character 3 onto tree 1 (tips C, B, A, O), each with two changes
Either way the character contributes 2 steps to the overall tree length
![Page 63: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/63.jpg)
Map the characters onto tree 2
Character:  1 2 3 4 5
A:          G C G A A
B:          G C A A A
C:          G C A C T
O:          T G G A A
Tree 2 (tips A, B, C, O), with characters 1, 2, 3, 4, and 5 each mapped once
Number of steps = 5
![Page 64: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/64.jpg)
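Tree 3
Character:  1 2 3 4 5
A:          G C G A A
B:          G C A A A
C:          G C A C T
O:          T G G A A
Tree 3 (tips B, A, C, O), with characters 1, 2, 4, and 5 mapped once and character 3 mapped twice
Length = 6 steps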
![Page 65: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/65.jpg)
Most parsimonious tree
Which tree has the shortest length (fewest steps) and is therefore the most parsimonious?
Tree 1 (tips C, B, A, O): length = 6
Tree 2 (tips A, B, C, O): length = 5
Tree 3 (tips B, A, C, O): length = 6
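The step counting in Example 2 can be sketched with Fitch's small-parsimony algorithm, which counts the minimum number of changes a character needs on a given tree. Two things here are assumptions, not taken from the slides: the character matrix is read from the flattened slide table with rows in the order A, B, C, O, and the three topologies are chosen so that the stated lengths (6, 5, 6) come out, because the tree drawings did not survive extraction.

```python
def fitch(node, states):
    """Fitch pass for one character: return (possible states, minimum steps)."""
    if isinstance(node, str):            # a tip: its observed state, no steps
        return {states[node]}, 0
    left, right = node
    left_set, left_steps = fitch(left, states)
    right_set, right_steps = fitch(right, states)
    shared = left_set & right_set
    if shared:                           # children agree: no change needed here
        return shared, left_steps + right_steps
    return left_set | right_set, left_steps + right_steps + 1


def tree_length(tree, alignment):
    """Sum the minimum steps over all characters (columns)."""
    n_chars = len(next(iter(alignment.values())))
    return sum(
        fitch(tree, {taxon: seq[i] for taxon, seq in alignment.items()})[1]
        for i in range(n_chars)
    )


# Assumed reading of the slide's character matrix (characters 1 to 5).
alignment = {"A": "GCGAA", "B": "GCAAA", "C": "GCACT", "O": "TGGAA"}

# Assumed topologies; O is the outgroup in each.
trees = {
    "Tree 1": ((("A", "B"), "C"), "O"),
    "Tree 2": ((("B", "C"), "A"), "O"),
    "Tree 3": ((("A", "C"), "B"), "O"),
}

lengths = {name: tree_length(tree, alignment) for name, tree in trees.items()}
best = min(lengths, key=lengths.get)
print(lengths, "most parsimonious:", best)
```

With only three ingroup taxa every tree can be scored. For larger data sets the number of possible trees grows too fast for that, so programs search heuristically.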
![Page 66: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/66.jpg)
Example from Freeman & Herron, Fig. 4.8
Where do the Whales belong?
![Page 67: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/67.jpg)
Freeman & Herron, Fig. 4.9: Using maximum parsimony, looks like the whales cluster with the hippos (and cows)
![Page 68: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/68.jpg)
Parsimony
• Simplest and fastest method of phylogenetic reconstruction
• Can give misleading results if rates of evolution (rates at which mutations occur) differ among lineages
• Tends to become less accurate as genetic distances get greater
• Could be misled by reversals and homoplasy: with only 4 nucleotides, after a while the same mutations occur repeatedly at a given site (called “saturation”)
![Page 69: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/69.jpg)
Distance Matrix
Continuous or Discrete Characters
![Page 70: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/70.jpg)
Distance Matrix
• Calculate pairwise distances between taxa
• Choose the tree that minimizes overall distances between taxa
Proportion sequence distance at 2 genes (hypothetical data)
          mouse   cat    dog    dolphin   seal
Mouse     1
Cat       0.05    1
Dog       0.03    0.02   1
Dolphin   0.08    0.15   0.03   1
Seal      0.09    0.23   0.01   0.02      1
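One simple way to turn such a matrix into a tree is UPGMA clustering, sketched below with the off-diagonal values from the hypothetical table above. UPGMA is a stand-in chosen for brevity. The slides do not name a specific distance algorithm, and other distance methods such as neighbor joining work differently. The function name and data layout are illustrative.

```python
from itertools import combinations


def upgma(names, distances):
    """Cluster taxa by repeatedly merging the closest pair (UPGMA).

    distances: {(name1, name2): value} for every unordered pair.
    Returns (tree as nested tuples, list of (cluster1, cluster2, distance) merges).
    """
    sizes = {name: 1 for name in names}                 # cluster -> number of tips
    d = {frozenset(pair): value for pair, value in distances.items()}
    merges = []
    while len(sizes) > 1:
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        merges.append((a, b, d[frozenset((a, b))]))
        size_a, size_b = sizes.pop(a), sizes.pop(b)
        for other in sizes:
            # Distance to the new cluster is the size-weighted average.
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        sizes[merged] = size_a + size_b
    return merged, merges


names = ["mouse", "cat", "dog", "dolphin", "seal"]
distances = {
    ("mouse", "cat"): 0.05, ("mouse", "dog"): 0.03, ("cat", "dog"): 0.02,
    ("mouse", "dolphin"): 0.08, ("cat", "dolphin"): 0.15, ("dog", "dolphin"): 0.03,
    ("mouse", "seal"): 0.09, ("cat", "seal"): 0.23, ("dog", "seal"): 0.01,
    ("dolphin", "seal"): 0.02,
}

tree, merges = upgma(names, distances)
for first, second, distance in merges:
    print(f"merge {first} + {second} at distance {distance:.4f}")
```

UPGMA assumes a constant rate of change in all lineages, the same molecular-clock assumption discussed earlier, so it can mislead when rates differ.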
![Page 71: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/71.jpg)
Freeman & Herron, Fig. 4.10: Using genetic distances, looks like the whales again cluster with the hippos (and cows)
![Page 72: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/72.jpg)
Distance Matrix
• Generally more accurate than parsimony
• Like parsimony, it tends to be computationally fast
![Page 73: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/73.jpg)
Maximum Likelihood (R.A. Fisher)
• Probability of the data given the tree
• This is a “Frequentist” method: it assumes one true answer (one true tree)
• Use the data (the probability distribution of the DNA sequence data) to find the true tree
• Choose the tree (x, y axes) that maximizes the probability of the observed data (z axis)
Figure axes: x, y = tree space; z = probability of the data
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17(6):368-376.
![Page 74: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/74.jpg)
Maximum Likelihood (R.A. Fisher)
• Probability of the data given the tree
• The aim of maximum likelihood estimation is to find the parameter value(s) that make the observed data most likely
• For example: finding a mean. If you want a single number that describes data such as human height, you could find the mean
Figure axes: x, y = tree space; z = probability of the data
Felsenstein, J. 1981. Evolutionary trees from DNA sequences: a maximum likelihood approach. Journal of Molecular Evolution 17(6):368-376.
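The mean example can be made concrete. The sketch below scores candidate values of the mean by the log-likelihood of the data under a normal model and picks the best one, which coincides with the sample mean. The heights, the fixed standard deviation, and the candidate grid are all hypothetical choices for illustration.

```python
import math


def log_likelihood(mu, data, sigma):
    """Log-probability of the data under a normal model with mean mu."""
    return sum(
        -0.5 * math.log(2 * math.pi * sigma ** 2) - (x - mu) ** 2 / (2 * sigma ** 2)
        for x in data
    )


# Hypothetical heights in cm.
heights = [160.0, 165.0, 170.0, 175.0, 180.0]

# Candidate parameter values: 150.0, 150.5, ..., 190.0.
candidates = [150 + 0.5 * i for i in range(81)]

# Maximum likelihood: keep the candidate under which the data are most probable.
best_mu = max(candidates, key=lambda mu: log_likelihood(mu, heights, sigma=7.0))
sample_mean = sum(heights) / len(heights)
print(best_mu, sample_mean)
```

In phylogenetics the candidates are trees (with branch lengths and a substitution model) instead of numbers, but the logic of picking the candidate that makes the data most probable is the same.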
![Page 75: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/75.jpg)
Maximum Likelihood (R.A. Fisher)
• Often yields a more accurate tree than parsimony or distance
• Relies on an accurate model of molecular evolution, i.e. an accurate assumption about which mutations are more probable (for example, does A->G occur more often than A->T or A->C?)
• Computationally intensive
![Page 76: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/76.jpg)
Bayesian Inference
Reverend Thomas Bayes (1702-1760)
• Probability of a tree given the data
• Uses prior information on the tree
• Does not assume that there is one correct tree
• Will modify estimate based on additional information
• Uses Bayes’ Theorem
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
![Page 77: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/77.jpg)
Bayesian Inference
Reverend Thomas Bayes (1702-1760)
• Probability of a tree given the data:
• Will modify estimate based on additional information: so as you get more data, you update your hypothesis for the tree
• Uses prior information on the tree: this is where you start
• The sequential use of the Bayes' formula (recursive): when more data become available after calculating a posterior distribution, the posterior becomes the next prior
• Does not assume that there is one correct tree
![Page 78: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/78.jpg)
Bayesian Inference
Reverend Thomas Bayes (1702-1760)
• Uses Bayes’ Theorem
$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$
P(A) = prior probability
P(A|B) = posterior probability (here, the probability of the tree given the data)
P(B|A) = the probability of observing B given A, also known as the likelihood. It indicates the compatibility of the evidence with the given hypothesis.
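Bayes' theorem applied to a handful of candidate trees can be sketched as follows. The tree names, the equal priors, and the likelihood values are hypothetical. Real Bayesian phylogenetics cannot list every tree and instead samples tree space, typically with Markov chain Monte Carlo. The second step shows the sequential use described on the previous slide, where the posterior becomes the next prior.

```python
def posterior(priors, likelihoods):
    """Bayes' theorem over a finite set of hypotheses (here, candidate trees)."""
    evidence = sum(priors[t] * likelihoods[t] for t in priors)   # P(B)
    return {t: priors[t] * likelihoods[t] / evidence for t in priors}


# Hypothetical equal priors P(A) and likelihoods P(B|A) for gene 1.
priors = {"tree1": 1 / 3, "tree2": 1 / 3, "tree3": 1 / 3}
likelihood_gene1 = {"tree1": 0.002, "tree2": 0.006, "tree3": 0.002}

after_gene1 = posterior(priors, likelihood_gene1)

# More data arrive: the old posterior is used as the new prior.
likelihood_gene2 = {"tree1": 0.001, "tree2": 0.003, "tree3": 0.004}
after_gene2 = posterior(after_gene1, likelihood_gene2)

for name in priors:
    print(name, round(after_gene1[name], 3), round(after_gene2[name], 3))
```

The output is a probability for every tree rather than a single winner, which is the sense in which the method does not assume one correct tree.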
![Page 79: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/79.jpg)
Bayesian Inference
• Like Likelihood, often yields more accurate tree than parsimony or distance
• Computationally more intensive than parsimony or distance matrix, but less intensive than likelihood
• Needs a prior probability for the tree and model
![Page 80: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/80.jpg)
Potential problems of Phylogenetic Reconstruction
• Sufficient Amount of Data:
– With enough data most statistical methods usually yield the same tree
– Insufficient data would yield a tree that lacks resolution (lacks statistical power)
• Gene trees vs species trees
– Evolutionary histories of individual genes are not necessarily the same
– Should try to get data from many genes, or the whole genome
![Page 81: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/81.jpg)
Challenges of Phylogenetic Reconstructions
• Different parts of the genome might have different evolutionary histories (different gene genealogies, horizontal gene transfers, allopolyploidy, etc)
• So, there might not be one true tree for a group of taxa, and relationships might be difficult to resolve because they are inherently complex
![Page 82: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/82.jpg)
• Current trend is to use whole genome data to reconstruct phylogenies
• Gain a comprehensive picture of the evolutionary relationships among taxa for the whole genome
![Page 83: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/83.jpg)
Neutral data are better for capturing genetic distances (the molecular clock) than genes that might be under selection
• Why?
![Page 84: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/84.jpg)
Phylogenetic Reconstructions
• Typically, evolutionary biologists will use a variety of methods to reconstruct a phylogeny.
• Maximum likelihood and Bayesian methods are considered more robust.
• Tree is only as good as the data. Having many homoplastic characters (due to convergent evolution, reversals, etc.) will make the reconstruction less robust
• Standard to use Bootstrapping to assess the validity of the tree
• Understanding statistics is fundamental to understanding evolution
• Much of statistics was in fact developed in order to model evolutionary processes (such as ANOVA, analysis of variance)
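Bootstrapping, mentioned above, resamples the sites (columns) of an alignment with replacement and asks how often a grouping reappears. The sketch below is a simplified illustration: the alignment is hypothetical, and "the closest pair by number of differences" stands in for a full tree reconstruction on each replicate.

```python
import random
from itertools import combinations


def hamming(seq1, seq2):
    """Number of sites at which two aligned sequences differ."""
    return sum(a != b for a, b in zip(seq1, seq2))


def closest_pair(alignment):
    """The pair of taxa with the fewest differences (ties go to the first pair)."""
    return min(
        combinations(sorted(alignment), 2),
        key=lambda pair: hamming(alignment[pair[0]], alignment[pair[1]]),
    )


def bootstrap_support(alignment, pair, replicates=1000, seed=0):
    """Fraction of resampled alignments in which `pair` is the closest pair."""
    rng = random.Random(seed)                       # fixed seed: repeatable runs
    n_sites = len(next(iter(alignment.values())))
    hits = 0
    for _ in range(replicates):
        columns = [rng.randrange(n_sites) for _ in range(n_sites)]
        resampled = {
            name: "".join(seq[i] for i in columns) for name, seq in alignment.items()
        }
        if set(closest_pair(resampled)) == set(pair):
            hits += 1
    return hits / replicates


# Hypothetical alignment of three taxa, ten sites.
alignment = {"A": "GCGAAGTTCA", "B": "GCAAAGTACA", "C": "GCACAGTACA"}

support = bootstrap_support(alignment, ("B", "C"))
print(f"B+C recovered in {support:.0%} of replicates")
```

Low support means the grouping depends on only a few sites, which ties back to the point that a tree is only as good as the data.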
![Page 85: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/85.jpg)
1. Sometimes the Molecular Clock (based on genetic data) conflicts with the Geological Record. Why would this happen?
(A) Sometimes there are gaps in the geological record, because fossils do not form everywhere, and mutation rate might vary between different species
(B) Radiometric dating relies on chance events in the preservation of isotopes, making the timing events in the geological time scale less accurate than the molecular clock
(C) Mutation rates slow down as you go back in time, making estimation of timing of events less accurate as you go back in time
(D) The molecular clock is calculated from radioisotopes, while the geological record is obtained from fossil data. The two can conflict when fossils end up displaced from their original sedimentary layer
![Page 86: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/86.jpg)
2. You are a medical researcher working on HIV. A novel strain has appeared in Madison, Wisconsin. To determine which drugs would be most effective in treating this new strain (because different strains are resistant to different drugs), you need to determine its recent evolutionary history. You decide to reconstruct the evolutionary history of HIV by using a phylogenetic approach. Thus, you collect samples from patients in various geographic locations and sequence a fragment of RNA. Using parsimony, which is the correct phylogeny for HIV-1 based on the data below?
HIV-1, Uganda, Africa: ACAUG
HIV-1, San Francisco, USA: UGAUG
HIV-1, Madison, USA: UAAGG
HIV-1, New York, USA: UAAAG
HIV-1, Paris: ACAUC
HIV-2, Africa (ancestral outgroup): ACCUG
![Page 87: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/87.jpg)
3. Which of the following is most TRUE regarding phylogenetic reconstructions?
(A) Phylogenetic reconstruction based on any gene would yield the same tree
(B) Parsimony is the most accurate method for reconstructing phylogenies
(C) Some DNA sequence data is better for phylogenetic reconstruction than others, such as those that tend to be less subjected to selection (3rd codon, introns)
(D) Maximum likelihood relies on maximizing distances among taxa
![Page 88: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/88.jpg)
4. Which of the following types of data would be most optimal for constructing a phylogeny?
(a) Non-coding and regulatory sequences
(b) Non-coding and non-functional sequences
(c) Paralogous genes
(d) Genes that have undergone purifying selection
(e) Intron sequences within rapidly evolving genes
![Page 89: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/89.jpg)
5. Which of the following reasons is FALSE on why the type of data chosen in the question above would be optimal for constructing a phylogeny?
(a) Because selection might make taxa seem more closely related due to convergent evolution
(b) Because selection might make taxa seem more distantly related due to disruptive evolution
(c) Because selection might make taxa seem more closely related due to purifying selection
(d) Because non-coding regulatory sequences are likely to be neutral
(e) Because coding sequences are likely to be under selection
![Page 90: Phylogenetics](https://reader035.vdocuments.site/reader035/viewer/2022062520/56816585550346895dd8335d/html5/thumbnails/90.jpg)
Answers
• 1A
• 2C
• 3C
• 4B
• 5D