assessing phylogenetic hypotheses and phylogenetic data · bootstrapping (non-parametric) •...


Post on 21-Feb-2021


Page 1: Assessing Phylogenetic Hypotheses and Phylogenetic Data · Bootstrapping (non-parametric) • Characters are resampled with replacement to create many bootstrap replicate data sets

Assessing Phylogenetic Hypotheses and Phylogenetic Data

• We use numerical phylogenetic methods because most data includes potentially misleading evidence of relationships

• We should not be content with constructing phylogenetic hypotheses but should also assess what ‘confidence’ we can place in our hypotheses

• This is not always simple! (but do not despair!)


Assessing Data Quality

• We expect (or hope) our data will be well structured and contain strong phylogenetic signal

• We can test this using randomization tests of explicit null hypotheses

• The behaviour or some measure of the quality of our real data is contrasted with that of comparable but phylogenetically uninformative data determined by randomization of the data


Random Permutation

• Random permutation reduces any correlation among characters to that expected by chance alone

• It preserves the number of taxa, characters and character states in each character (and the theoretical maximum and minimum tree lengths)

Original structured data with strong correlations among characters

‘TAXA’   ‘CHARACTERS’
         1 2 3 4 5 6 7 8
R-P      R P R P R P R P
A-E      A E A E A E A E
N-R      N R N R N R N R
D-M      D M D M D M D M
O-U      O U O U O U O U
M-T      M T M T M T M T
L-E      L E L E L E L E
Y-D      Y D Y D Y D Y D

Randomly permuted data with any correlation among characters due to chance

‘TAXA’   ‘CHARACTERS’
         1 2 3 4 5 6 7 8
R-P      N U D E R T O U
A-E      R E A P L E A D
N-R      M R M M A D N P
D-M      L T R E Y M D R
O-U      D E Y U D E Y M
M-T      O M O T O U L T
L-E      Y D N D M P M E
Y-D      A P L R N R R E
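The permutation step can be sketched in a few lines of Python. This is a minimal illustration, not the procedure of any particular program: it assumes the matrix is held as one string per taxon, and the function name `permute_characters` is hypothetical. States are shuffled among taxa independently within each character, so the state counts of every character are preserved.

```python
import random

def permute_characters(matrix, rng=None):
    """Shuffle the states of each character (column) independently among taxa.

    `matrix` is a list of equal-length strings, one per taxon (an assumed layout).
    """
    rng = rng or random.Random()
    n_taxa = len(matrix)
    n_chars = len(matrix[0])
    columns = []
    for j in range(n_chars):
        column = [matrix[i][j] for i in range(n_taxa)]
        rng.shuffle(column)          # destroys correlation with other columns
        columns.append(column)
    # Re-assemble the shuffled columns into one row per taxon.
    return ["".join(columns[j][i] for j in range(n_chars)) for i in range(n_taxa)]

# The structured example matrix: characters alternate R,A,N,D,O,M,L,Y and P,E,R,M,U,T,E,D.
structured = ["RPRPRPRP", "AEAEAEAE", "NRNRNRNR", "DMDMDMDM",
              "OUOUOUOU", "MTMTMTMT", "LELELELE", "YDYDYDYD"]
permuted = permute_characters(structured, random.Random(1))
```

Each column of `permuted` contains exactly the same states as the corresponding column of `structured`, only assigned to different taxa.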


Matrix Randomization Tests

• Compare some measure of data quality/hierarchical structure for the real and many randomly permuted data sets

• This allows us to define a test statistic for the null hypothesis that the real data are no better structured than randomly permuted and phylogenetically uninformative data

• A permutation tail probability (PTP) is the proportion of data sets with as good or better measure of quality than the real data
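As a rough sketch of the PTP calculation, the function below takes the measure for the real data and the measures for the permuted data sets. The function name and the example values are hypothetical, and counting the real data set as one of the data sets is an assumption about convention rather than something stated here.

```python
def permutation_tail_probability(real_value, permuted_values, smaller_is_better=True):
    """Proportion of data sets whose measure is as good as or better than the real data."""
    if smaller_is_better:            # e.g. tree length, number of incompatibilities
        as_good = sum(1 for v in permuted_values if v <= real_value)
    else:
        as_good = sum(1 for v in permuted_values if v >= real_value)
    # Assumption: the real data set is itself counted among the data sets.
    return (as_good + 1) / (len(permuted_values) + 1)

# Hypothetical tree lengths for 99 permuted data sets, all longer than the real tree.
permuted_lengths = [780 + (i % 30) for i in range(99)]
ptp = permutation_tail_probability(618, permuted_lengths)
```

With 99 permutations and none as short as the real tree, the smallest attainable value under this convention is 1/100.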


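Structure of Randomization Tests

• Reject the null hypothesis if, for example, fewer than 5% of random permutations have as good or better a measure than the real data

[Figure: frequency distribution of the measure of data quality (e.g. tree length, ML, pairwise incompatibilities) for the randomly permuted data sets, running from GOOD to BAD, with a 95% cutoff. Real data beyond the cutoff on the GOOD side PASS TEST (reject null hypothesis); otherwise FAIL TEST]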


Matrix Randomization Tests

• Measures of data quality include:

1. Tree length for most parsimonious trees - the shorter the tree length the better the data (PAUP*)

2. Numbers of pairwise incompatibilities between characters (pairs of incongruent characters) - the fewer character conflicts the better the data

3. Skewness of the distribution of tree lengths (PAUP)
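The second measure can be illustrated for the simplest case. The sketch below assumes binary characters coded 0/1 and treats two characters as incompatible when all four state combinations (00, 01, 10, 11) occur among the taxa; multistate characters need a more general test, and the function names are hypothetical.

```python
from itertools import combinations

def incompatible(char_a, char_b):
    """Two binary characters conflict if all four state combinations are observed."""
    return len(set(zip(char_a, char_b))) == 4

def count_incompatibilities(matrix):
    """Count incompatible character pairs in a taxa-by-characters matrix of 0/1 strings."""
    columns = list(zip(*matrix))     # one tuple of states per character
    return sum(1 for a, b in combinations(columns, 2) if incompatible(a, b))

compatible_data = ["110", "110", "001", "000"]   # characters nest or are disjoint
conflicting_data = ["11", "10", "01", "00"]      # the two characters conflict
n_clean = count_incompatibilities(compatible_data)
n_conflict = count_incompatibilities(conflicting_data)
```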


Matrix Randomization Tests

Ciliate SSUrDNA

          Real data                   Randomly permuted
MPTs      1                           3
L         618                         792
CI        0.696                       0.543
RI        0.714                       0.272
PTP       0.01                        0.68
PC-PTP    0.001                       0.737
Result    Significantly non random    Not significantly different from random

Min = 430, Max = 927

[Figure: trees for the real data and the randomly permuted data, one labelled "Strict consensus". Leaf orders shown: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Gruberia, Euplotes, Tetrahymena; and Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Tracheloraphis, Spirostomum, Euplotes, Gruberia]


Skewness of Tree Length Distributions

• Studies with random (and phylogenetically uninformative) data showed that the distribution of tree lengths tends to be normal

• In contrast, phylogenetically informative data is expected to have a strongly skewed distribution with few shortest trees and few trees nearly as short

[Figure: two frequency distributions, NUMBER OF TREES against Tree length, each with the shortest tree marked]


Skewness of Tree Length Distributions

• Skewness of tree length distributions can be used as a measure of data quality in randomization tests

• It is measured with the G1 statistic in PAUP

• Significance cut-offs for data sets of up to eight taxa have been published based on randomly generated data (rather than randomly permuted data)

• PAUP does not perform the more direct randomization test
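A skewness statistic of this kind can be computed directly from a list of tree lengths. The sketch below assumes the standard moment coefficient of skewness (third central moment divided by the 3/2 power of the second); whether PAUP applies any small-sample correction is not stated here, so treat the exact definition as an assumption.

```python
def g1_skewness(lengths):
    """Moment coefficient of skewness: m3 / m2**1.5 (assumed definition of g1)."""
    n = len(lengths)
    mean = sum(lengths) / n
    m2 = sum((x - mean) ** 2 for x in lengths) / n   # second central moment
    m3 = sum((x - mean) ** 3 for x in lengths) / n   # third central moment
    return m3 / m2 ** 1.5

# Hypothetical tree-length samples.
symmetric = [10, 11, 11, 12, 12, 12, 13, 13, 14]
left_skewed = [10, 14, 15, 15, 16, 16, 16]          # few short trees, many long ones
g1_symmetric = g1_skewness(symmetric)
g1_left = g1_skewness(left_skewed)
```

A symmetric distribution gives a value near zero, while a distribution with a long tail of short trees gives a negative value.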


Skewness - example

Frequency distribution of tree lengths

RANDOMLY PERMUTED DATA g1=-0.100478
[Histogram: tree lengths 792 to 867; most frequent length 830 (5130 trees); 3 trees of length 792; nearly symmetric]

REAL DATA Ciliate SSUrDNA g1=-0.951947
[Histogram: tree lengths 722 to 900; most frequent length 853 (1861 trees); long tail towards shorter trees]


Matrix Randomization Tests - use and limitations

• Can detect very poor data - that provides no good basis for phylogenetic inferences (throw it away!)

• However, only very little may be needed to reject the null hypothesis (passing test ≠ great data)

• Doesn’t indicate location of this structure (more discerning tests are possible)

• In the skewness test, significance levels for G1 have been determined for small numbers of taxa only so that this test remains of limited use


Assessing Phylogenetic Hypotheses - groups on trees

Several methods have been proposed that attach numerical values to internal branches in trees that are intended to provide some measure of the strength of support for those branches and the corresponding groups

These methods include:

- character resampling methods - the bootstrap and jackknife

- decay analyses

- additional randomization tests


Bootstrapping (non-parametric)

• Bootstrapping is a modern statistical technique that uses computer intensive random resampling of data to determine sampling error or confidence intervals for some estimated parameter


Bootstrapping (non-parametric)

• Characters are resampled with replacement to create many bootstrap replicate data sets

• Each bootstrap replicate data set is analysed (e.g. with parsimony, distance, ML)

• Agreement among the resulting trees is summarized with a majority-rule consensus tree

• Frequency of occurrence of groups, bootstrap proportions (BPs), is a measure of support for those groups

• Additional information is given in partition tables
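The resampling and tallying steps can be sketched as follows. This is an illustration only: the tree search applied to each replicate is not shown, trees are assumed to be represented as sets of groups (each group a `frozenset` of taxon names), and the replicate results in the example are hypothetical.

```python
import random
from collections import Counter

def bootstrap_replicate(matrix, rng):
    """Resample characters (columns) with replacement; same size as the original."""
    n_chars = len(matrix[0])
    picks = [rng.randrange(n_chars) for _ in range(n_chars)]
    return ["".join(row[j] for j in picks) for row in matrix]

def bootstrap_proportions(replicate_trees):
    """Percentage of replicate trees in which each group occurs."""
    counts = Counter()
    for groups in replicate_trees:
        counts.update(set(groups))
    n = len(replicate_trees)
    return {group: 100.0 * c / n for group, c in counts.items()}

matrix = ["RRYYYYYY", "RRYYYYYY", "YYYYYRRR", "YYRRRRRR", "RRRRRRRR"]
replicate = bootstrap_replicate(matrix, random.Random(0))

# Hypothetical groups recovered from four replicate analyses.
trees = [{frozenset("AB"), frozenset("ABC")},
         {frozenset("AB"), frozenset("ABC")},
         {frozenset("AB"), frozenset("CD")},
         {frozenset("AB"), frozenset("ABC")}]
bps = bootstrap_proportions(trees)
majority_rule = {g for g, bp in bps.items() if bp > 50}
```

Groups with a proportion above 50% are those that would appear in a majority-rule consensus tree; the full dictionary corresponds to a partition table.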


Bootstrapping

Original data matrix

         Characters
Taxa     1 2 3 4 5 6 7 8
A        R R Y Y Y Y Y Y
B        R R Y Y Y Y Y Y
C        Y Y Y Y Y R R R
D        Y Y R R R R R R
Outgp    R R R R R R R R

Resampled data matrix

         Characters
Taxa     1 2 2 5 5 6 6 8
A        R R R Y Y Y Y Y
B        R R R Y Y Y Y Y
C        Y Y Y Y Y R R R
D        Y Y Y R R R R R
Outgp    R R R R R R R R

[Figure: trees of A, B, C, D and Outgroup for the original and resampled matrices, with character numbers marked on the branches]

Randomly resample characters from the original data with replacement to build many bootstrap replicate data sets of the same size as the original - analyse each replicate data set

Summarise the results of multiple analyses with a majority-rule consensus tree

Bootstrap proportions (BPs) are the frequencies with which groups are encountered in analyses of replicate data sets

[Figure: majority-rule consensus tree of A, B, C, D and Outgroup with bootstrap proportions of 96% and 66%]


Bootstrapping - an example

Ciliate SSUrDNA - parsimony bootstrap

Partition Table

123456789    Freq
-----------------
.**......  100.00
...**....  100.00
.....**..  100.00
...****..  100.00
...******   95.50
.......**   84.33
...****.*   11.83
...*****.    3.83
.*******.    2.50
.**....*.    1.00
.**.....*    1.00

[Figure: majority-rule consensus tree. Leaf order: Ochromonas (1), Symbiodinium (2), Prorocentrum (3), Euplotes (8), Tetrahymena (9), Loxodes (4), Tracheloraphis (5), Spirostomum (6), Gruberia (7). Bootstrap proportions on branches: 100, 96, 84, 100, 100, 100]


Bootstrapping - random data

Randomly permuted data - parsimony bootstrap

Partition Table

123456789    Freq
-----------------
.*****.**   71.17
..**.....   58.87
....*..*.   26.43
.*......*   25.67
.***.*.**   23.83
...*...*.   21.00
.*..**.**   18.50
.....*..*   16.00
.*...*..*   15.67
.***....*   13.17
....**.**   12.67
....**.*.   12.00
..*...*..   12.00
.**..*..*   11.00
.*...*...   10.80
.....*.**   10.50
.***.....   10.00

[Figure: majority-rule consensus (with minority components). Leaf order: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Spirostomum, Tetrahymena, Euplotes, Tracheloraphis, Gruberia. Values on branches: 71, 26, 16, 59, 16, 21]

[Figure: majority-rule consensus. Leaf order: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Euplotes, Tetrahymena, Gruberia. Values on branches: 71, 59]


Bootstrap - interpretation

• Bootstrapping was introduced as a way of establishing confidence intervals for phylogenies

• This interpretation of bootstrap proportions (BPs) depends on the assumption that the original data is a random sample from a much larger set of independent and identically distributed data

• However, several things complicate this interpretation

- Perhaps the assumptions are unreasonable - making any statistical interpretation of BPs invalid

- Some theoretical work indicates that BPs are very conservative, and may underestimate confidence - the problem increases with numbers of taxa

- BPs can be high for incongruent relationships in separate analyses - and can therefore be misleading (misleading data -> misleading BPs)

- With parsimony, BPs may be highly affected by inclusion or exclusion of only a few characters


Bootstrap - interpretation

• Bootstrapping is a very valuable and widely used technique - it (or some suitable alternative) is demanded by some journals, but it may require a pragmatic interpretation:

• BPs depend on two aspects of the support for a group - the numbers of characters supporting a group and the level of support for incongruent groups

• BPs thus provide an index of the relative support for groups provided by a set of data under whatever interpretation of the data (method of analysis) is used


Bootstrap - interpretation

• High BPs (e.g. > 85%) are indicative of strong ‘signal’ in the data

• Provided we have no evidence of strong misleading signal (e.g. base composition biases, great differences in branch lengths) high BPs are likely to reflect strong phylogenetic signal

• Low BPs need not mean the relationship is false, only that it is poorly supported

• Bootstrapping can be viewed as a way of exploring the robustness of phylogenetic inferences to perturbations in the balance of supporting and conflicting evidence for groups


Jackknifing

• Jackknifing is very similar to bootstrapping and differs only in the character resampling strategy

• Some proportion of characters (e.g. 50%) is randomly selected and deleted

• Replicate data sets are analysed and the results summarised with a majority-rule consensus tree

• Jackknifing and bootstrapping tend to produce broadly similar results and have similar interpretations
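The only step that differs from bootstrapping is the resampling, sketched below. The function name is hypothetical, the matrix layout (one string per taxon) is an assumption, and the 50% deletion fraction is just the example value mentioned above.

```python
import random

def jackknife_replicate(matrix, rng, delete_fraction=0.5):
    """Delete a random fraction of characters; each surviving character appears once."""
    n_chars = len(matrix[0])
    n_keep = n_chars - int(round(delete_fraction * n_chars))
    kept = sorted(rng.sample(range(n_chars), n_keep))   # sampling without replacement
    rows = ["".join(row[j] for j in kept) for row in matrix]
    return kept, rows

matrix = ["RRYYYYYY", "RRYYYYYY", "YYYYYRRR", "YYRRRRRR", "RRRRRRRR"]
kept, replicate = jackknife_replicate(matrix, random.Random(0))
```

Unlike a bootstrap replicate, a jackknife replicate is smaller than the original matrix and never contains the same character twice.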


Decay analysis

• In parsimony analysis, a way to assess support for a group is to see if the group occurs in slightly less parsimonious trees also

• The length difference between the shortest trees including the group and the shortest trees that exclude the group (the extra steps required to overturn a group) is the decay index or Bremer support

• Total support (for a tree) is the sum of all clade decay indices - this has been advocated as a measure for an as yet unavailable matrix randomization test
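The arithmetic behind these two definitions is simple enough to sketch. The tree lengths below are hypothetical; in practice each would come from a separate (constrained or unconstrained) parsimony search, which is not shown.

```python
def decay_index(shortest_with_group, shortest_without_group):
    """Extra steps needed to overturn a group (decay index / Bremer support)."""
    return shortest_without_group - shortest_with_group

def total_support(decay_indices):
    """Sum of the decay indices of all clades on a tree."""
    return sum(decay_indices.values())

# Hypothetical lengths: shortest tree overall, and shortest trees lacking each clade.
shortest = 100
shortest_lacking = {"AB": 105, "ABC": 102, "DE": 101}
indices = {clade: decay_index(shortest, length)
           for clade, length in shortest_lacking.items()}
tree_total = total_support(indices)
```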


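Decay analysis - example

[Figure: trees for the Ciliate SSUrDNA data and the randomly permuted data with decay indices on the branches. Leaf orders shown: Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tracheloraphis, Spirostomum, Gruberia, Euplotes, Tetrahymena; and Ochromonas, Symbiodinium, Prorocentrum, Loxodes, Tetrahymena, Tracheloraphis, Spirostomum, Euplotes, Gruberia. Decay indices shown: +27, +15, +8, +3, +1, +1, +45, +7, +10]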


Decay analyses - in practice

Decay indices for each clade can be determined by:

- saving increasingly less parsimonious trees and producing corresponding strict component consensus trees until the consensus is completely unresolved

- analyses using reverse topological constraints to determine shortest trees that lack each clade

- with the Autodecay or TreeRot programs (in conjunction with PAUP)


Decay indices - interpretation

• Generally, the higher the decay index the better the relative support for a group

• Like BPs, decay indices may be misleading if the data is misleading

• Unlike BPs, decay indices are not scaled (0-100) and it is less clear what is an acceptable decay index

• Magnitudes of decay indices and BPs are generally correlated (i.e. they tend to agree)

• Only groups found in all most parsimonious trees have decay indices > zero


Trees are typically complex - they can be thought of as sets of less complex relationships

[Figure: tree of A B C D E]

Clades: AB, ABC, DE

Resolved triplets: (AB)C, (AC)D, (DE)A, (AB)D, (AC)E, (DE)B, (AB)E, (BC)D, (DE)C, (BC)E

Resolved quartets: ABCD, ACDE, ABDE, BCDE, ABCE
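The decomposition of a tree into resolved triplets can be checked mechanically. The sketch below assumes a rooted tree written as nested tuples of leaf names, here `((("A", "B"), "C"), ("D", "E"))`, which has the clades AB, ABC and DE; the representation and function names are illustrative choices.

```python
from itertools import combinations

def leaves(node):
    """Set of leaf names below a node (a leaf is a string, an internal node a tuple)."""
    if isinstance(node, str):
        return {node}
    found = set()
    for child in node:
        found |= leaves(child)
    return found

def clades(node):
    """Leaf sets of all internal nodes at or below this node."""
    if isinstance(node, str):
        return []
    result = [frozenset(leaves(node))]
    for child in node:
        result.extend(clades(child))
    return result

def resolved_triplets(tree):
    """Triplets (ab)c such that some clade contains a and b but not c."""
    all_clades = clades(tree)
    result = set()
    for trio in combinations(sorted(leaves(tree)), 3):
        for a, b in combinations(trio, 2):
            c = next(x for x in trio if x not in (a, b))
            if any(a in cl and b in cl and c not in cl for cl in all_clades):
                result.add(f"({a}{b}){c}")
    return result

tree = ((("A", "B"), "C"), ("D", "E"))
triplets = resolved_triplets(tree)
```

For a fully resolved tree of five leaves this yields one resolved triplet for each of the ten possible sets of three leaves.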


Extending Support Measures

• The same measures (BP, JP & DI) that are used for clades/splits can also be determined for triplets and quartets

• This provides a lot more information because there are more triplets/quartets than there are clades

• Furthermore....


The Decay Theorem

• The DI of an hypothesis of relationships is equal to the lowest DI of the resolved triplets that the hypothesis entails

• This applies equally to BPs and JPs as well as DIs

• Thus a phylogenetic chain is no stronger than its weakest link!

• and, measures of clade support may give a very incomplete picture of the distribution of support
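Stated as code, the theorem says that the support for a hypothesis is the minimum over its entailed triplets. The sketch assumes triplet supports are already available in a dictionary; the decay indices below are hypothetical values for the six triplets entailed by a clade ABC on a tree of A-E.

```python
def hypothesis_support(entailed_triplets, triplet_support):
    """Support for a hypothesis equals the lowest support among its resolved triplets."""
    return min(triplet_support[t] for t in entailed_triplets)

# Hypothetical decay indices for the triplets entailed by clade ABC.
triplet_di = {"(AB)D": 5, "(AC)D": 3, "(BC)D": 3,
              "(AB)E": 6, "(AC)E": 2, "(BC)E": 4}
clade_abc_di = hypothesis_support(list(triplet_di), triplet_di)
```

Here a single weakly supported triplet, (AC)E, sets the support for the whole clade even though other triplets are better supported.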


Extensions

• Double decay analysis is the determination of decay indices for all relationships - gives a more comprehensive but potentially very complicated summary of support

• Majority-rule reduced consensus provides a similarly more comprehensive/complicated summary of bootstrap/jackknife proportions

• Leaf stability provides support values for the phylogenetic position of particular leaves


Bootstrapping with Reduced Consensus

Data matrix

A  1111100000
B  0111100000
C  0011100000
D  0001100000
E  0000100000
F  0000010000
G  0000011000
H  0000011100
I  0000011110
J  0000011111
X  1111111111

[Figure: trees of taxa A-J with and without taxon X. Branch support values shown: 50.5 (five branches), 100 (four), 99 (two) and 98 (two)]


Bootstrapping

[Figure: trees of taxa A-J, one including taxon X. Branch support values shown: 50.5 (five branches), 100 (four), 99 (two) and 98 (two)]


Leaf Stability

• Leaf stability is the average of supports of the triplets/quartets containing the leaf

[Figure: tree of Acanthostega, Ichthyostega, Greererpeton, Crassigyrinus, Eucritta, Whatcheeria, Gephyrostegus, Balanerpeton, Dendrerpeton, Proterogyrinus, Pholiderpeton, Megalocephalus, Loxomma and Baphetes. Values on branches: 94, 100, 59, 84, 100, 95. Values in parentheses beside leaves: (98), (98), (69), (53), (54), (58), (49), (64), (64), (66), (66), (67), (67), (67)]
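The averaging in the definition of leaf stability can be sketched as below. It assumes the support for each triplet (for example a bootstrap proportion) has already been computed and is stored under a tuple of its three leaves; the function name and the support values are hypothetical.

```python
def leaf_stability(triplet_support):
    """Average support of all triplets that contain each leaf."""
    totals, counts = {}, {}
    for trio, support in triplet_support.items():
        for leaf in trio:
            totals[leaf] = totals.get(leaf, 0.0) + support
            counts[leaf] = counts.get(leaf, 0) + 1
    return {leaf: totals[leaf] / counts[leaf] for leaf in totals}

# Hypothetical triplet supports for four leaves; X is the unstable leaf.
supports = {("A", "B", "C"): 100.0,
            ("A", "B", "X"): 60.0,
            ("A", "C", "X"): 50.0,
            ("B", "C", "X"): 40.0}
stability = leaf_stability(supports)
```

A leaf whose position is uncertain drags down the support of every triplet it belongs to, so it receives the lowest stability.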


PTP tests of groups

• A number of randomization tests have been proposed for evaluating particular groups rather than entire data matrices by testing null hypotheses regarding the level of support they receive from the data

• Randomisation can be of the data or the group

• These methods have not become widely used both because they are not readily performed and because their properties are still under investigation

• One type, the topology-dependent PTP tests, is included in PAUP* but has serious problems


Comparing competing phylogenetic hypotheses - tests of two trees

• Particularly useful techniques are those designed to allow evaluation of alternative phylogenetic hypotheses

• Several such tests allow us to determine if one tree is statistically significantly worse than another:

Winning sites test, Templeton test, Kishino-Hasegawa test, Shimodaira-Hasegawa test, parametric bootstrapping


Tests of two trees

• All these tests are of the null hypothesis that the differences between two trees (A and B) are no greater than expected from sampling error

• The simplest ‘winning sites’ test sums the number of sites supporting tree A over tree B and vice versa (those having fewer steps on, and better fit to, one of the trees)

• Under the null hypothesis characters are equally likely to support tree A or tree B and a binomial distribution gives the probability of the observed difference in numbers of winning sites
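The binomial calculation for the winning sites test can be sketched as follows. It assumes that sites fitting both trees equally well are ignored and that a two-sided probability is wanted; both are assumptions of this illustration, and the counts are hypothetical.

```python
from math import comb

def winning_sites_p_value(wins_a, wins_b):
    """Two-sided binomial probability of a split at least this uneven when p = 0.5."""
    n = wins_a + wins_b              # sites favouring one tree or the other
    k = max(wins_a, wins_b)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)

# Hypothetical counts: 15 sites favour tree A, 5 favour tree B.
p_uneven = winning_sites_p_value(15, 5)
p_even = winning_sites_p_value(10, 10)
```

With these counts the probability is a little over 0.04, so the null hypothesis would be rejected at the 5% level, whereas an even split gives no evidence against it.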


The Templeton test

Templeton’s test is a non-parametric Wilcoxon signed ranks test of the differences in fits of characters to two trees

It is like the ‘winning sites’ test but also takes into account the magnitudes of differences in the support of characters for the two trees
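A signed ranks calculation of this kind can be sketched in plain Python. The version below takes the number of steps each character requires on the two trees, drops characters with no difference, averages tied ranks, and uses a normal approximation without tie or continuity corrections. Those simplifications are assumptions of this sketch; small samples would normally be referred to exact tables, and the step counts are hypothetical.

```python
from math import erf, sqrt

def wilcoxon_signed_rank(steps_a, steps_b):
    """Return (W+, W-, approximate two-sided p) for per-character step differences."""
    diffs = [a - b for a, b in zip(steps_a, steps_b) if a != b]
    n = len(diffs)
    if n == 0:
        return 0.0, 0.0, 1.0
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:                     # assign average ranks to tied magnitudes
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mean) / sd      # z <= 0 by construction
    p = 1 + erf(z / sqrt(2))                    # equals 2 * Phi(z)
    return w_plus, w_minus, p

# Hypothetical steps per character on tree A and tree B.
steps_a = [1, 2, 1, 3, 1, 1, 2, 1, 1, 2]
steps_b = [1, 1, 1, 1, 2, 1, 1, 1, 1, 1]
w_plus, w_minus, p_value = wilcoxon_signed_rank(steps_a, steps_b)
```

Unlike the winning sites count (four characters against one here), the ranks give extra weight to the character that differs by two steps.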


Templeton’s test - an example

[Figure: tree of Seymouriadae, Diadectomorpha, Synapsida, Parareptilia, Captorhinidae, Paleothyris, Claudiosaurus, Younginiformes, Archosauromorpha, Lepidosauriformes, Placodus, Eosauropterygia and Araeoscelidia, with two positions labelled 1 and 2]

Recent studies of the relationships of turtles using morphological data have produced very different results, with turtles grouping either within the parareptiles (H1) or within the diapsids (H2), the result depending on the morphologist

This suggests there may be:
- problems with the data
- special problems with turtles
- weak support for turtle relationships

Parsimony analysis of the most recent data favoured H2
However, analyses constrained by H1 produced trees that required only 3 extra steps (<1% tree length)

The Templeton test was used to evaluate the trees and showed that the slightly longer H1 tree found in the constrained analyses was not significantly worse than the unconstrained H2 tree
The morphological data do not allow choice between H1 and H2


Kishino-Hasegawa test

• The Kishino-Hasegawa test is similar in using differences in the support provided by individual sites for two trees to determine if the overall differences between the trees are significantly greater than expected from random sampling error

• It is a parametric test that depends on assumptions that the characters are independent and identically distributed (the same assumptions underlying the statistical interpretation of bootstrapping)

• It can be used with parsimony and maximum likelihood - implemented in PHYLIP and PAUP*


Kishino-Hasegawa test

Under the null hypothesis the mean of the differences in parsimony steps or likelihoods for each site is expected to be zero, and the distribution normal

From observed differences we calculate a standard deviation

If the difference between trees (tree lengths or likelihoods) is attributable to sampling error, then characters will randomly support tree A or B and the total difference will be close to zero

The observed difference is significantly greater than zero if it is greater than 1.96 standard deviations

This allows us to reject the null hypothesis and declare the sub-optimal tree significantly worse than the optimal tree (p < 0.05)

[Figure: distribution of step/likelihood differences at each site, centred on an expected mean of 0, with sites favouring tree A on one side and sites favouring tree B on the other]
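The calculation described on this slide can be sketched as below. This is an illustration under stated assumptions, not the PHYLIP or PAUP* implementation: the function name and the per-site scores are hypothetical, and the standard deviation of the total is taken as the square root of (number of sites × per-site sample variance).

```python
from math import erf, sqrt

def kh_test(site_scores_a, site_scores_b):
    """Normal-approximation KH test on per-site scores (steps or log-likelihoods)."""
    diffs = [a - b for a, b in zip(site_scores_a, site_scores_b)]
    n = len(diffs)
    total = sum(diffs)                      # observed difference between the two trees
    mean = total / n
    variance = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # per-site sample variance
    sd_total = sqrt(n * variance)           # standard deviation of the summed difference
    if sd_total == 0:
        raise ValueError("site differences are constant; the test is undefined")
    z = total / sd_total
    p = 1 - erf(abs(z) / sqrt(2))           # two-sided p-value from the standard normal
    return total, sd_total, z, p

# Hypothetical parsimony steps at ten sites on trees A and B
steps_a = [1, 1, 2, 1, 1, 1, 2, 1, 1, 1]
steps_b = [2, 1, 2, 2, 1, 2, 3, 1, 2, 1]
total, sd_total, z, p = kh_test(steps_a, steps_b)
print(total, round(sd_total, 3), round(z, 3), round(p, 4))
```

A value of |z| above 1.96 corresponds to rejecting the null hypothesis at p < 0.05.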


Kishino-Hasegawa test - an example

Ciliate SSUrDNA

[Figure: maximum likelihood tree of Ochromonas, Symbiodinium, Prorocentrum, Sarcocystis, Theileria, Plagiopyla n, Plagiopyla f, Trimyema c, Trimyema s, Cyclidium p, Cyclidium g, Cyclidium l, Glaucoma, Colpodinium, Tetrahymena, Paramecium, Discophrya, Trithigmostoma, Opisthonecta, Colpoda, Dasytrichia, Entodinium, Spathidium, Loxophylum, Homalozoon, Metopus c, Metopus p, Stylonychia, Onychodromous, Oxytrichia, Loxodes, Tracheloraphis, Spirostomum, Gruberia and Blepharisma, with a legend marking anaerobic ciliates with hydrogenosomes]

Parsimonious character optimization of the presence and absence of hydrogenosomes suggests four separate origins of hydrogenosomes within the ciliates

Questions
- how reliable is this result?
- in particular how well supported is the idea of multiple origins?
- how many origins can we confidently infer?


Kishino-Hasegawa test - an example

Ciliate SSUrDNA data

Parsimony analysis yields a very similar tree
- in particular, parsimonious character optimization indicates four separate origins of hydrogenosomes within ciliates

Decay indices and BPs for parsimony and distance analyses indicate relative support for clades
Differences between the ML, MP and distance trees generally reflect the less well supported relationships

[Figure: most parsimonious tree of Ochromonas, Symbiodinium, Prorocentrum, Sarcocystis, Theileria, Plagiopyla n, Plagiopyla f, Trimyema c, Trimyema s, Glaucoma, Colpodinium, Tetrahymena, Paramecium, Cyclidium p, Cyclidium g, Cyclidium l, Discophrya, Trithigmostoma, Opisthonecta, Dasytrichia, Entodinium, Spathidium, Homalozoon, Loxophylum, Metopus c, Metopus p, Stylonychia, Onychodromous, Oxytrichia, Colpoda, Loxodes, Tracheloraphis, Spirostomum, Gruberia and Blepharisma, with branches labelled with decay indices and BPs]


Kishino-Hasegawa test - example

Parsimony analyses with topological constraints were used to find the shortest trees that forced hydrogenosomal ciliate lineages together and thereby reduced the number of separate origins of hydrogenosomes

[Figure: two examples of the topological constraint trees for the ciliate taxa]

Each of the constrained parsimony trees was compared to the ML tree and the Kishino-Hasegawa test used to determine which of these trees were significantly worse than the ML tree


Kishino-Hasegawa test

Test summary and results - origins of ciliate hydrogenosomes (simplified)

No. Origins   Constraint tree    Extra Steps   Difference and SD   Significantly worse?
4             ML                 +10           -                   -
4             MP                 -             -13 ± 18            No
3             (cp,pt)            +13           -21 ± 22            No
3             (cp,rc)            +113          -337 ± 40           Yes
3             (cp,m)             +47           -147 ± 36           Yes
3             (pt,rc)            +96           -279 ± 38           Yes
3             (pt,m)             +22           -68 ± 29            Yes
3             (rc,m)             +63           -190 ± 34           Yes
2             (pt,cp,rc)         +123          -432 ± 40           Yes
2             (pt,rc,m)          +100          -353 ± 43           Yes
2             (pt,cp,m)          +40           -140 ± 37           Yes
2             (cp,rc,m)          +124          -466 ± 49           Yes
2             (pt,cp)(rc,m)      +77           -222 ± 39           Yes
2             (pt,m)(rc,cp)      +131          -442 ± 48           Yes
2             (pt,rc)(cp,m)      +140          -414 ± 50           Yes
1             (pt,cp,m,rc)       +131          -515 ± 49           Yes

Constrained analyses used to find most parsimonious trees with fewer than four separate origins of hydrogenosomes

Tested against ML tree

Trees with 2 or 1 origin are all significantly worse than the ML tree
We can confidently conclude that there have been at least three separate origins of hydrogenosomes within the sampled ciliates
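The ‘Significantly worse?’ column can be checked against the differences and SDs in the table. The sketch below assumes the verdicts follow the two-sided normal criterion (difference greater than 1.96 SD); that criterion and the helper name are assumptions here, while the numbers are transcribed from the table.

```python
# (origins, constraint tree, difference, SD) transcribed from the table above
rows = [
    (4, "MP", -13, 18),
    (3, "(cp,pt)", -21, 22),
    (3, "(cp,rc)", -337, 40),
    (3, "(cp,m)", -147, 36),
    (3, "(pt,rc)", -279, 38),
    (3, "(pt,m)", -68, 29),
    (3, "(rc,m)", -190, 34),
    (2, "(pt,cp,rc)", -432, 40),
    (2, "(pt,rc,m)", -353, 43),
    (2, "(pt,cp,m)", -140, 37),
    (2, "(cp,rc,m)", -466, 49),
    (2, "(pt,cp)(rc,m)", -222, 39),
    (2, "(pt,m)(rc,cp)", -442, 48),
    (2, "(pt,rc)(cp,m)", -414, 50),
    (1, "(pt,cp,m,rc)", -515, 49),
]

def significantly_worse(difference, sd, critical=1.96):
    """True when the difference exceeds the critical number of standard deviations."""
    return abs(difference) > critical * sd

verdicts = {constraint: significantly_worse(diff, sd) for _, constraint, diff, sd in rows}

for origins, constraint, diff, sd in rows:
    ratio = abs(diff) / sd
    print(f"{origins}  {constraint:<15} {ratio:5.2f} SD  {'Yes' if verdicts[constraint] else 'No'}")
```

Under this criterion only the MP tree and the (cp,pt) constraint tree fall within 1.96 SD of the ML tree.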


Shimodaira-Hasegawa Test

• To be statistically valid, the Kishino-Hasegawa test should be of trees that are selected a priori

• However, most applications have used trees selected a posteriori on the basis of the phylogenetic analysis

• Where we test the ‘best’ tree against some other tree the KH test will be biased towards rejection of the null hypothesis

• The SH test is a similar but more statistically correct technique in these circumstances and should be preferred
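One common way to carry out an SH-style test is sketched below. It is a simplified illustration, not the procedure of any particular program: it assumes per-site log-likelihoods are already available for every candidate tree and resamples them directly (RELL-style) instead of re-estimating parameters, and the function name and data are hypothetical.

```python
import random

def sh_test(site_lnl, n_boot=500, seed=0):
    """SH-style p-values from a trees x sites table of per-site log-likelihoods."""
    rng = random.Random(seed)
    n_trees, n_sites = len(site_lnl), len(site_lnl[0])
    totals = [sum(row) for row in site_lnl]
    observed = [max(totals) - t for t in totals]  # how much worse each tree is than the best
    # Bootstrap the total log-likelihood of every tree over resampled sites
    boot = []
    for _ in range(n_boot):
        sites = [rng.randrange(n_sites) for _ in range(n_sites)]
        boot.append([sum(row[s] for s in sites) for row in site_lnl])
    # Centre each tree's bootstrap totals so all trees are equally good under the null
    means = [sum(rep[i] for rep in boot) / n_boot for i in range(n_trees)]
    counts = [0] * n_trees
    for rep in boot:
        centred = [rep[i] - means[i] for i in range(n_trees)]
        best = max(centred)  # the best tree is re-chosen in every replicate
        for i in range(n_trees):
            if best - centred[i] >= observed[i]:
                counts[i] += 1
    return [c / n_boot for c in counts]

# Hypothetical per-site log-likelihoods for three trees at 200 sites
base = [-2.0 - 0.01 * (s % 7) for s in range(200)]
site_lnl = [
    base,                                                       # best tree
    [b + 0.05 * ((s % 3) - 1) for s, b in enumerate(base)],     # nearly as good
    [b - 0.5 + 0.05 * ((s % 5) - 2) for s, b in enumerate(base)],  # clearly worse
]
p_values = sh_test(site_lnl)
print(p_values)
```

Re-choosing the best tree within every replicate is what accounts for the a posteriori selection that biases the KH test.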


Taxonomic Congruence

• Trees inferred from different data sets (different genes, morphology) should agree if they are accurate

• Congruence between trees is best explained by their accuracy

• Congruence can be investigated using consensus (and supertree) methods

• Incongruence requires further work to explain or resolve disagreements
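As a small illustration of the consensus idea, the clades shared by two rooted trees (the groups a strict consensus would retain) can be found by comparing sets of taxa. The nested-tuple tree representation, the function names and the taxa A to E are all hypothetical.

```python
def clades(tree):
    """Return the set of clades (frozensets of taxon names) in a nested-tuple rooted tree."""
    found = set()

    def leaves(node):
        if isinstance(node, str):          # a tip
            return frozenset([node])
        members = frozenset().union(*(leaves(child) for child in node))
        found.add(members)                 # every internal node defines one clade
        return members

    leaves(tree)
    return found

def shared_clades(tree_a, tree_b):
    """Clades present in both trees; these are the groups kept by a strict consensus."""
    return clades(tree_a) & clades(tree_b)

# Two hypothetical trees for the same five taxa, e.g. from two different genes
tree_a = ((("A", "B"), "C"), ("D", "E"))
tree_b = ((("A", "B"), "D"), ("C", "E"))
shared = shared_clades(tree_a, tree_b)
print(sorted(sorted(c) for c in shared))
```

Here only the group (A, B) and the group containing all five taxa are congruent between the two trees.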


Reliability of Phylogenetic Methods

• Phylogenetic methods (e.g. parsimony, distance, ML) can also be evaluated in terms of their general performance, particularly their:

consistency - approach the truth with more data
efficiency - how quickly (how much data)
robustness - how sensitive to violations of assumptions

• Studies of these properties can be analytical or by simulation


Reliability of Phylogenetic Methods

• There have been many arguments that ML methods are best because they have desirable statistical properties, such as consistency

• However, ML does not always have these properties
– if the model is wrong/inadequate (fortunately this is testable to some extent)
– properties not yet demonstrated for complex inference problems such as phylogenetic trees


Reliability of Phylogenetic Methods

• “Simulations show that ML methods generally outperform distance and parsimony methods over a broad range of realistic conditions”

Whelan et al. 2001 Trends in Genetics 17:262-272

• Most simulations are very (unrealistically) simple
– few taxa (typically just four)
– few parameters (standard models - JC, K2P etc)


Reliability of Phylogenetic Methods

• Simulations with four taxa have shown:
- Model based methods - distance and maximum likelihood - perform well when the model is accurate (not surprising!)
- Violations of assumptions can lead to inconsistency for all methods (a Felsenstein zone) when branch lengths or rates are highly unequal
- Maximum likelihood methods are quite robust to violations of model assumptions
- Weighting can improve the performance of parsimony (reduce the size of the Felsenstein zone)
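The Felsenstein zone can be illustrated for unweighted parsimony with an exact four-taxon calculation. The sketch assumes a symmetric two-state model on the true tree ((A,B),(C,D)); the function name and the per-branch change probabilities are hypothetical values chosen for illustration.

```python
from itertools import product

def pattern_probabilities(p_a, p_b, p_c, p_d, p_internal):
    """Exact probabilities of the three informative site patterns on tree ((A,B),(C,D)).

    Each argument is the probability of a state change along that branch
    under a symmetric two-state model.
    """
    probs = {"AB|CD": 0.0, "AC|BD": 0.0, "AD|BC": 0.0}
    branch_p = (p_a, p_b, p_c, p_d, p_internal)
    for flips in product((0, 1), repeat=5):   # change / no change on each of the five branches
        weight = 1.0
        for flip, p in zip(flips, branch_p):
            weight *= p if flip else 1 - p
        fa, fb, fc, fd, fi = flips
        a, b, c, d = fa, fb, fi ^ fc, fi ^ fd  # tip states relative to the ancestor of A and B
        if a == b and c == d and a != c:
            probs["AB|CD"] += weight           # supports the true tree
        elif a == c and b == d and a != b:
            probs["AC|BD"] += weight           # groups the two long branches
        elif a == d and b == c and a != b:
            probs["AD|BC"] += weight
    return probs

# Long branches to A and C, short branches elsewhere (hypothetical values)
zone = pattern_probabilities(0.4, 0.05, 0.4, 0.05, 0.05)
# All branches equal and short
equal = pattern_probabilities(0.1, 0.1, 0.1, 0.1, 0.1)
print({k: round(v, 4) for k, v in zone.items()})
print({k: round(v, 4) for k, v in equal.items()})
```

With four taxa, parsimony chooses the tree supported by the most informative sites, so when the pattern grouping the long branches is the most probable, adding more data makes the wrong tree more certain, not less.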


Reliability of Phylogenetic Methods

• However:
- Generalising from four taxon simulations may be dangerous as conclusions may not hold for more complex cases
- A few large scale simulations (many taxa) have suggested that parsimony can be very accurate and efficient
- Most methods are accurate in correctly recovering known phylogenies produced in laboratory studies

• More study of methods, using more realistic simulations, is needed to help in choice of method