influence function for robust phylogenetic reconstruction
TRANSCRIPT
Influence Function for Robust PhylogeneticReconstruction
Mahendra Mariadassou
Unité MIGINRA Jouy-en-Josas
JOBIM 2011Institut Pasteur
M. Mariadassou (INRA MIG) June 11 1 / 25
Outline
1 Motivation
2 Outlier Sites
3 Taxon Influence Index
M. Mariadassou (INRA MIG) June 11 2 / 25
Goal of Phylogenetic Reconstruction
Chimpanzee Bonobo Human Gorilla Orangutan
Pictures: National Geographic
Last common ancestor of bonobo
and chimpanzee
M. Mariadassou (INRA MIG) June 11 3 / 25
Applications in Many Domains
Including but not limited to:
Studies of an Evolutionary Group:Estimate Time to Most Recent Common Ancestor (TMRCA);Find genes under positive/purifying selection;Identifying Horizontal Gene Transfer;Testing evolutionary hypothesis.
Systematics:Reconstruct Tree of X, phylogeny of all living X;DNA barcoding: easily identify the species of a new organism;Natural way to measure biodiversity.
Most of these applications require “good” trees.
M. Mariadassou (INRA MIG) June 11 4 / 25
Applications in Many Domains
Including but not limited to:
Studies of an Evolutionary Group:Estimate Time to Most Recent Common Ancestor (TMRCA);Find genes under positive/purifying selection;Identifying Horizontal Gene Transfer;Testing evolutionary hypothesis.
Systematics:Reconstruct Tree of X, phylogeny of all living X;DNA barcoding: easily identify the species of a new organism;Natural way to measure biodiversity.
Most of these applications require “good” trees.
M. Mariadassou (INRA MIG) June 11 4 / 25
Applications in Many Domains
Including but not limited to:
Studies of an Evolutionary Group:Estimate Time to Most Recent Common Ancestor (TMRCA);Find genes under positive/purifying selection;Identifying Horizontal Gene Transfer;Testing evolutionary hypothesis.
Systematics:Reconstruct Tree of X, phylogeny of all living X;DNA barcoding: easily identify the species of a new organism;Natural way to measure biodiversity.
Most of these applications require “good” trees.
M. Mariadassou (INRA MIG) June 11 4 / 25
Reconstruction Methods
Three main families:Distance based: Neighbor-Joining (NJ),
Parsimony based: Maximum Parcimony(MP),
Likelihood based: Maximum Likelihood(ML), Bayesian Inference (BI).
Alignment −→ Phylogenetic Tree
A C C T TB G G A AC G G A C
NJ, MP,ML, BI,
. . .A B C
M. Mariadassou (INRA MIG) June 11 5 / 25
Validating the Tree
Inference Problems:Compare inferred tree to true tree to assess how good it is,
Inferred True
A B C A B C
But the true tree is not available!
Confidence Issue:How confident are we on the inferred tree ?
Which parts of the tree are reliable/not reliable ?How robust is the tree to small changes in the data and outliers ?
M. Mariadassou (INRA MIG) June 11 6 / 25
Validating the Tree
Inference Problems:Compare inferred tree to true tree to assess how good it is,
Inferred True
A B C
?But the true tree is not available!
Confidence Issue:How confident are we on the inferred tree ?
Which parts of the tree are reliable/not reliable ?How robust is the tree to small changes in the data and outliers ?
M. Mariadassou (INRA MIG) June 11 6 / 25
Validating the Tree
Inference Problems:Compare inferred tree to true tree to assess how good it is,
Inferred True
A B C
?But the true tree is not available!
Confidence Issue:How confident are we on the inferred tree ?
Which parts of the tree are reliable/not reliable ?How robust is the tree to small changes in the data and outliers ?
M. Mariadassou (INRA MIG) June 11 6 / 25
Bootstrap Values: the Theory
Original Dataset:
Alignment Phylogenetic treeA A C T TB G G A TC G G C C
A B C
66
Bootstrap Datasets:A A C T CB G G A GC G G C G
A C A T AB G G A GC G G C G
A T T T TB A T A TC C C C C
A B C A B C A B C
M. Mariadassou (INRA MIG) June 11 7 / 25
Bootstrap Values: A Robustness Index ?
Potential Causes for Uncertainty:Finite sequence lengths (sampling errors);Poor alignment quality (influent sites);Poor taxon sampling (rogue taxa);Model misspecification,. . .
Different Tools to Assess Them:Bootstrap: deals with global sources of uncertainty;Unable to pinpoint local sources of uncertainty;Need for other indexes to detect outliers
M. Mariadassou (INRA MIG) June 11 8 / 25
Bootstrap Values: A Robustness Index ?
Potential Causes for Uncertainty:Finite sequence lengths (sampling errors);Poor alignment quality (influent sites);Poor taxon sampling (rogue taxa);Model misspecification,. . .
Different Tools to Assess Them:Bootstrap: deals with global sources of uncertainty;Unable to pinpoint local sources of uncertainty;Need for other indexes to detect outliers
M. Mariadassou (INRA MIG) June 11 8 / 25
Outlier Sites: Motivation and Goal
Motivation: Filter DataIdentifying/filter out outliers corresponding to:
Sequencing errors;Alignment errors;Presence of an atypical DNA segment;. . .
ProcedureQuantify influence of site i by
Removing site i from alignment;Computing new tree T−i from smaller (jackknife) alignment;Comparing new tree to original tree;
Compute phylogeny from “not too influent sites”.
M. Mariadassou (INRA MIG) June 11 9 / 25
Outlier Sites: Motivation and Goal
Motivation: Filter DataIdentifying/filter out outliers corresponding to:
Sequencing errors;Alignment errors;Presence of an atypical DNA segment;. . .
ProcedureQuantify influence of site i by
Removing site i from alignment;Computing new tree T−i from smaller (jackknife) alignment;Comparing new tree to original tree;
Compute phylogeny from “not too influent sites”.
M. Mariadassou (INRA MIG) June 11 9 / 25
For Phylogenies...
DefinitionFor alignment of size n, the influence value of site i is:
IF(site i) = (n− 1)(LogLik(T−i)
n− 1− LogLik(T)
n
)Difference between support of alignments for their ML trees;Negative / Positive value: enhanced / weakened support (whenadding the site);Expect most sites with small negative values and a few with largepositive values.
Strategy towards greater stabilityFocus on outliers: sites with IF(site i) > 0;Rank them in decreasing IF(site i);Remove them one at the time until a stable tree is found.
M. Mariadassou (INRA MIG) June 11 10 / 25
Data: Zygomycetes & Chytridiomycetes
Lower mushrooms;
Biology: unknown!
Enough signal toresolve the topology;
1026 sites, 158 taxa,GTR model.
sBasidiomycota
Ascomycota
Zygomicota
Glomeromycota
Chytridiomycota
M. Mariadassou (INRA MIG) June 11 11 / 25
Information About Sites
0 200 400 600 800 1000
510
1520
25
Sites
num
ber o
f int
erna
l nod
es d
iffer
ent f
rom
ML
tree
0 200 400 600 800 1000
05
01
00
Sites
Influ
en
ce
va
lue
M. Mariadassou (INRA MIG) June 11 12 / 25
Distance Between Trees
0
2
4
6
8
10
1 2
14
10
2 0
30
4 05 0
1 02 0
3 04 0
5 0
Pen
ny-H
endy
dis
tanc
e
Tree w
ithout i
site
s i=1,..
.50
T ree w ithou t i s ites i=1 ,...50
D is ta n ce b e tw e e n tre e s
d(T0,Ti) ≈ 18 and d(Ti,Tj) ≤ 2 for i, j = 1..45
M. Mariadassou (INRA MIG) June 11 13 / 25
Observations
Strongest Outlier (position 142): Highly variable site located on amed loop (5nt) located on a conserved hairpin;
Removing most influent sites leads to Increased bootstrap valuesand loss of 20% of inner nodes;
Confirms monophyly of phyla Glomeromycota
Reinforces polyphyletic status of phyla Chytridiomycota andZygomycota
M. Mariadassou (INRA MIG) June 11 14 / 25
Taxon Influence Index: Motivation and Goal
Motivation: Filter DataStudy the robustness of the tree with respect to the speciesIdentify rogue taxa.
ProcedureQuantify influence of taxon i by:
Removing site i from alignment;Computing new tree T−i from smaller (jackknife) alignment;Comparing new tree to original tree;
Compute phylogeny from “not too influent sites”.
M. Mariadassou (INRA MIG) June 11 15 / 25
Taxon Influence Index: Motivation and Goal
Motivation: Filter DataStudy the robustness of the tree with respect to the speciesIdentify rogue taxa.
ProcedureQuantify influence of taxon i by:
Removing site i from alignment;Computing new tree T−i from smaller (jackknife) alignment;Comparing new tree to original tree;
Compute phylogeny from “not too influent sites”.
M. Mariadassou (INRA MIG) June 11 15 / 25
Method
Pruning
Inference
Inference
Ti T (−i)
T
i
TII(i) = 2
Species set (whole)
Species set (partial)
M. Mariadassou (INRA MIG) June 11 16 / 25
Data: Placental Mammal Phylogeny
Mitochondrial genome of 68 mammals;Amino Acids sequences;Sequences are 3658 AA long;MtMam + I + Γ4 model;Phylogeny published in Nikaido et al. in 2003.
M. Mariadassou (INRA MIG) June 11 17 / 25
Taxon Influence Index
BDS RF
00.
050.
10.
15
02
46
810
12
Eur. red squirrel
Guinea pig
RabbitEur. red squirrel
Guinea Pig
Rabbit
BSD: Branch-score distance / RF: Robinson-Foulds distancemean edge length: 0.05
M. Mariadassou (INRA MIG) June 11 18 / 25
Complete Phylogeny
American.pika
Australian.echidna
baboonbarbary.macaque
blue.whale
bonobo
Bornean.orangutan
cape.golden.mole
cat
chimpanzee
common.gibbon
common.wombat
donkey
dugong
elephant.shrew
fat.dormouse
formosan.lesser.horseshoe.bat
formosan.shrew
gorilla
greater.cane.rat
grey.sealharbor.seal
hippopotamus
horsehorseshoe.bat
human
Japanese.mole
Japanese.pipistrelle
little.red.flying.fox
long−eared.hedgehog
mouse
nine−banded.armadillo
North.American.opossumnorthern.brown.bandicoot
pig
polar.bear
rat
Ryukyu.flying.fox
sheep
silver−gray.brushtail.possum
sperm.whale
Sumatran.orangutan
vole
wallaroo
white.rhinoceros
78
93
91
97
83
76
99
79
97
69
74
85
87
76
85
9999
9984
89
98/75
94/28
92/14
97/54
98/5534/11
40/54
94/23
94/30
Eurasian.red.squirrel (8)
guinea.pig (12)
rabbit (8)
aardvark
African.elephant
alpaca
cape.hyrax
cow
dog
European.mole
fin.whale
greater.Japanese.shrew−mole
greater.moonrathedgehog
Indian.rhinoceros
Jamaican.fruit−eating.batlong−clawed.shrew
longtailed.bat
northern.tree.shrew
platypus
slow.loris
tenrec1
white−fronted.capchin
●●●●●
●●
●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●●
●
●
●
●●
●
●●●●●●●●●●●
●
●●●●●●●●●●
0 10 20 30 40 50 60
4060
8010
0
Nodes
BS
sco
res
(%)
●
●●
● ●
●
●
● ●
M. Mariadassou (INRA MIG) June 11 19 / 25
Influential Species (mostly in Afrotheria)American.pika
●
●
1
1
Eurasian.red.squirrel
●
●●
●
1
11
1
fat.dormouse
●
●
1
1
greater.cane.rat
●
●
●
1
1
1
guinea.pig
●
●
●
●
●
●
1
1
1
1
1
1
mouse
●●
●
11
1
nine−banded.armadillo
●
●
1
1
rabbit
●
●●
●
1
11
1
M. Mariadassou (INRA MIG) June 11 20 / 25
Data: Bilaterian Transcription Factor T-Box
T-box TF of 164 metazoans, involved in gastrulation;
Amino Acids sequences;
Sequences are 296 AA long;
LG + I + Γ4 model;
Ancient family with 8 subfamiliesBrachyuryTbx1/10Tbx15/18/22Tbx20Tbx2/3Tbx4/5Tbx6/VegTEomes/Tbr1/Tbx21
Interest lies in the position of TF OlTbx present in a Oscarellalobularis.
M. Mariadassou (INRA MIG) June 11 21 / 25
AaTbx6a
AaTbx6b
AgXP31467
AgXP31592
AgXP32060
AgXP32179
AgXP55771
AmphiEomes
AmphiTbx1
AmphiTbx2
AmphiTbx6
AmqTbx115
AmqTbx45
AmqTbxA
AmqTbxBAmqTbxC
AmqTbxE
ApBra
ApTbrain
AtH15−1
AtH15−2
AtOptomot
AtTbx1
Avtbx115
BfTbx1518
Cap1241000
Cap190074
Cap1960038
Cap2380027
Cap2960003
Cap7420011
Cape gw1 7
CiBra
CiTbx1
CiTbx1518CiTbx20
CiTbx23
CiTbx6aCiTbx6b
CiTbxA
Dmbi
Dmbyn
DmCG6634P
DmDoc1DmDoc3
DmH15
Dmorg1
DrEomesod
Drnotail
Drsimilar
DrT−box15
DrT−box18
DrT−box24
Drtbx1
Drtbx16
Drtbx2
Drtbx20
Drtbx4
Drtbx5
Drtbx6
GgTbx6
HeBraHmBra
HmTbxAHmTbxB
HmTbxC
HpBra
HrTbxA
HrTbxB
HrTbxCHrTbxD
HrTbxG
HrTbxH
HrTbxI
HrTbxJ
HrTbxK
HrTbxL
HrTbxM
HrTbxN
HrTbxO
HrTbxP
HybraB
LgTbr
LgTbxA
LgTbxBLgTbxC
LgTbxD
LgTbxE
LgTbxF
LgTbxG
LgTbxH
LgTbxI
LgTbxJ
LvBra
LvTbx23
Mlbra
MlTbx1
MltbxC
MltbxD
MltbxE
MmEomes
MmT
MmTbr1
MmTbx1MmTbx10
MmTbx15
MmTbx18
MmTbx19
MmTbx2
MmTbx20
MmTbx21
MmTbx22
MmTbx3
MmTbx4
MmTbx5
MmTbx6
MtTbx6
NvBra
Nvtbx1
NvTbx15b
NvTbx20
NvTbx23
NvTbxA
NvTbxB
NvTbxC
NvTbxD
NvTbxE
OlTbx
OmBra
PcBra
PcTbx45
Pdbra
Plbra
PpBra
PpTbx23
PvBra
SdBra
SdTbx2
SgTbx6aSgTbx6b
SpTBR1
SpTbx
SpTbx2
SpTbx2b
SrBra
TaBra
TaTbx23
TaTbxB
TaTbxDTaTbxE
Tcdorsocr
TcTbox20
XBRA
Xleomes
Xltbx1
Xltbx2
Xltbx20
Xltbx3
Xltbx5
Xltbx6
XlTbxAXlTbxB
XlTbxC
XlVEGT
880.96
920.98
981.00
710.93
840.76
810.99
680.98
880.98
750.99
880.98
520.78
851.00
620.97 99
1.00
790.98
871.00
780.97
540.92
730.99
1001.00
991.00 61
0.66
920.99
940.84
910.6
720.92
550.99
620.94
990.99
880.99
961.00
740.97
660.75
920.99
530.9279
0.99
780.99
1001.00
1001.00
1001.00
830.99
1000.64 89
0.98
1001.00
1001.00
1000.99
850.74
901.00
971.00
981.00
800.81
780.97
750.98
540.83
930.99
991.00
991.00
740.69
360.73
330.5
510.8
820.99
981.00
760.94
470.46
951.00
921.00
971.00
460.94
981.00
640.83
330.57
1001.00
390.53
770.85
981.00
540.95
1001.00
Tbr1/E
omes
Brachyury
Tbx2/3
Tbx6/D
oc Tbx6/V
egT
Tbx4/5
Tbx20
Tbx15/18/22
Tbx1
0.5
AaTbx6aAaTbx6b
AgX
P31
467
AgXP31592
AgXP32060
AgXP32179
AgX
P55
771
AmphiEomes
AmphiTbx1
AmphiTbx2
AmphiTbx6
AmqTbx115
Am
qTbx45
Am
qTbx
AA
mqT
bxB
Am
qTbx
CA
mqT
bxE
ApBra
ApTbrain
AtH
15−1
AtH
15−2
AtOptomot
AtTbx1 Avtbx115
BfTb
x151
8
Cap1241000
Cap190074
Cap1960038
Cap2380027
Cap2960003
Cap
7420
011
Cape gw
1 7
CiBra
CiTbx1
CiT
bx15
18
CiT
bx20
CiTbx23
CiTbx6aCiTbx6b
CiTbxA
Dmbi
Dm
byn
Dm
CG
6634
P
DmDoc1DmDoc3
Dm
H15
Dmorg1
DrEomesod
Drnotail
Drsim
ilar
DrT−b
ox15
DrT−box
18
DrT−box24Drtbx1
Drtbx16
Drtbx2
Drtbx20
Drtbx4
Drtbx5
Drtbx6
GgTbx6
HeB
raH
mB
ra
HmTbxAHmTbxB
HmTbxC
HpBra
HrTbx
A
HrTbxB
HrT
bxC
HrT
bxD
HrTbx
G
HrTbxH
HrT
bxI
HrT
bxJ
HrT
bxK
HrTbxL
HrTbxM
HrTbxN
HrTbxOHrTbxP
HybraB
LgTbr
LgTbxA
LgTbx
B
LgTbx
C
LgTbxD
LgT
bxE
LgTbxF
LgTbxG
LgTbxH
LgT
bxI
LgT
bxJ
LvBra
LvTbx2
3
Mlbra
MlTbx1
Mltb
xC
MltbxD
MltbxE
MmEomes
Mm
T
MmTbr1
MmTbx1MmTbx10
Mm
Tbx1
5
MmTbx18
Mm
Tbx19
MmTbx2
Mm
Tbx
20
MmTbx21
MmTbx22
MmTbx3
Mm
Tbx4M
mT
bx5
Mm
Tbx6
MtTbx6
NvB
ra
Nvtbx1
NvTbx1
5b
NvT
bx20
NvT
bx23
NvTbxA
NvT
bxB
NvT
bxC
NvT
bxD
NvTbxE
OlTbx
Om
Bra
PcBra
PcT
bx45
PdbraPlbra
PpB
ra
PpTb
x23
PvBra
SdB
ra
SdT
bx2
SgT
bx6a
SgT
bx6b
SpTBR1
SpTbx
SpT
bx2
SpTbx2b
SrB
ra
TaBra
TaTb
x23
TaTbxB
TaTbxD
TaT
bxE
Tcdorsocr
TcTb
ox20
XB
RA
Xleomes
Xltbx1
Xltbx2
Xltbx20
Xltbx3
Xltbx5
Xltbx6
XlTbx
A
XlTbxB
XlTbxCXlVEGT
M. Mariadassou (INRA MIG) June 11 22 / 25
AgXP32060
AgXP32179
AgXP55771
AmphiEomes
AmphiTbx1
AmphiTbx2
AmphiTbx6
AmqTbx45
ApBra
ApTbrain
AtH15−1AtH15−2
AtOptomot
AtTbx1
BfTbx1518
Cap1241000
Cap190074
Cap1960038
Cap2380027
Cap2960003
Cap7420011
CiBra
CiTbx1
CiTbx1518CiTbx20
CiTbx23
Dmbi
Dmbyn
DmCG6634P
DmH15
Dmorg1
DrEomesod
DrnotailDrsimilar
DrT−box15DrT−box18
DrT−box24
Drtbx1
Drtbx16
Drtbx2
Drtbx20
Drtbx4
Drtbx5
Drtbx6
GgTbx6
HmBra
HpBra
HrTbxAHrTbxB
HrTbxGHrTbxH
HrTbxJHrTbxK
HrTbxN
HrTbxO
HrTbxP
HybraB
LgTbxA
LgTbxBLgTbxC
LgTbxD
LgTbxE
LgTbxF
LgTbxH
LgTbxI
Mlbra
MlTbx1
MltbxC
MltbxD
MmEomes
MmT
MmTbr1
MmTbx1MmTbx10
MmTbx15
MmTbx18
MmTbx19
MmTbx2
MmTbx20
MmTbx21
MmTbx22
MmTbx3
MmTbx4
MmTbx5
MmTbx6
NvBra
Nvtbx1
NvTbx15b
NvTbx20
NvTbx23
OlTbx
OmBra
PcBra
PcTbx45
Pdbra
Plbra
PvBra
SdTbx2
SpTBR1
SpTbx
SpTbx2b
TaBra
TaTbx23
TcTbox20
XBRA
Xleomes
Xltbx1
Xltbx2
Xltbx20
Xltbx3
Xltbx5
Xltbx6
XlTbxA
XlTbxB
XlTbxC
XlVEGT
990.52
740.5
46
42*
500.98
730.48
720.27
96*
91
93
760.28
660.8 85
0.99
740.5
540.56
621.00
6668
561.00
910.4
4858*
1000.97
84
740.92
100
55
94
1000.64
85
700.66
7882
0.46
530.6
10086
43 46
72*
76
431.00
420.87
42
44*
880.83
100*
98
99100
54
5696
980.97
100
5646
0.57
98
99
74
46
42
5073
0.95
720.6
960.54
910.72
930.99
76
660.64 85
0.78
741.00
540.72
621.00
6668
560.81
91
480.98 58
1000.99
840.98
740.94
1001.00 55
94
1000.69
24*
52*
980.49
41*
660.76
96
980.98
94
341.00
590.99
790.94
100
100
24*
520.44
98
410.38
66
960.44
980.99
940.92
980.73
96
98
Tbr1/E
omes
Brachyury
Tbx2/3
Tbx6/V
egTT
bx4/5
Tbx20
Tbx15/18/22
Tbx1
0.5
AgX
P32060
AgXP32
179
AgX
P55771
AmphiEomes
AmphiTbx1
Am
phiT
bx2
AmphiTbx6
AmqTbx45
ApB
ra
ApTbrain
AtH
15−1
AtH
15−2
AtOpt
omot
AtTbx1
BfT
bx15
18
Cap
1241
000
Cap190074
Cap1960038
Cap2380027
Cap
2960
003
Cap7420011
CiB
ra
CiTbx1
CiT
bx15
18
CiT
bx20
CiT
bx23
Dmbi
Dm
byn
Dm
CG
6634P
Dm
H15
Dmorg1
DrEomesod
Drnotail
Drsimilar
DrT
−box
15
DrT
−box
18
DrT−box24
Drtbx1
Drtbx16
Drtbx2
Drt
bx20
Drtbx4
Drtbx5
Drtbx6GgTbx6
HmBra
HpB
ra
HrT
bxA
HrT
bxB
HrT
bxG
HrT
bxH
HrT
bxJH
rTbxK
HrTbxN
HrTbxO
HrTbxP
HybraB
LgTbxA
LgTb
xBLg
TbxC
LgTb
xD
LgTbxE
LgTbxF
LgTbxH
LgTbxI
Mlbra
MlTbx1
MltbxCMltb
xD
MmEomes
Mm
T
MmTbr1
MmTbx1
MmTbx10
Mm
Tbx1
5
Mm
Tbx1
8
MmTbx19
MmTbx2
Mm
Tbx
20
MmTbx21
Mm
Tbx2
2
MmTbx3
MmTbx4
Mm
Tbx5
MmTbx6
NvBra
Nvtbx1
NvTbx
15b
NvT
bx20
NvTbx23
OlTbx
Om
Bra
PcB
ra
PcTbx45
Pdb
ra
Plbra
PvB
ra
SdTbx2
SpTBR1
SpTbx
SpTbx2b
TaBra
TaTbx23
TcT
box20
XBRA
Xleomes
Xltbx1
Xltbx2
Xltb
x20
Xltbx3
Xltbx5
Xltbx6
XlTbx
AX
lTbx
B
XlTbxC
XlVEGT
M. Mariadassou (INRA MIG) June 11 23 / 25
T-Box TF Phylogeny
Branch Depth
Boo
tstr
ap V
alue
0
20
40
60
80
100
0 20 40 60
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●●●
●
●
●
●●
●
●
●●
●
●
●●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
127 Tbox
mean 164
loess 164
mean 127
loess 127
0 20 40 60
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●●●
●
●
●
●●
●
●
●●
●
●
●●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
123 Tbox
mean 164
loess 164
mean 123
loess 123
0 20 40 60
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●●●
●
●
●
●●
●
●
●●
●
●
●●
●
●●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
● ●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●
●●
●
●
●
●
●●
●
●
●
●
●
●
●●
●
●
●●●
●
●
●
●
●
●
●
●
●
116 Tbox
mean 164
loess 164
mean 116
loess 116
Number of Tbx in tree 164 127 123 116BP of OlTbx in clade TBX2/3 19 32 34 59
No sponge (yet) identified as a member of the Tbx2/3 subfamily: newevolutionary hypothesis.
M. Mariadassou (INRA MIG) June 11 24 / 25
Summary
Potential Sources of Instability and Indexes to Detect Them1 Data sampling: Bootstrap;
2 Outliers: Influence function for sites;
3 Rogue species: Taxon Influence Index.
Pros and Cons1 Assess uncertainty coming from different sources;
2 Highlight potential outliers;
3 No rigorous statistical threshold for inclusion (p-values,...)
M. Mariadassou (INRA MIG) June 11 25 / 25