Estimating the effective sample size of phylogenetic tree topologies from Bayesian MCMC analyses

Estimating the Effective Sample Size of tree topologies from Bayesian MCMC analyses. Rob Lanfear, ANU & NESCent, [email protected], @roblanfear

Upload: rob-lanfear

Post on 13-Jul-2015


TRANSCRIPT

Estimating the Effective Sample Size
of tree topologies from
Bayesian MCMC analyses

Rob Lanfear

ANU & NESCent

[email protected]

@roblanfear

1.  Effective Sample Size (ESS)

2.  ESS for tree topologies, in principle

3.  ESS for tree topologies, in practice

4.  An example using hox genes

1.  Effective Sample Size (ESS)

[Figure: two trace plots of y against x (x from 0 to 1000), labeled ESS = 544.6 and ESS = 7.5.]

$$\mathrm{ESS} = \frac{n}{1 + 2\sum_{k=1}^{\infty} \rho_k}$$

ESS > 200

Drummond AJ, Ho SYW, Phillips MJ, Rambaut A (2006) Relaxed Phylogenetics and Dating with Confidence. PLoS Biol 4(5): e88.

?

$$\mathrm{ESS} = \frac{n}{1 + 2\sum_{k=1}^{\infty} \rho_k}$$
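The ESS formula above can be sketched for an ordinary numeric trace as follows. This is only an illustration, not the code behind the slides: the function names are made up for this example, and the rule of truncating the sum at the first non-positive autocorrelation is an assumption (one common convention among several).

```python
import random

def autocorrelation(xs, k):
    """Sample autocorrelation of the sequence xs at lag k."""
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs)
    cov = sum((xs[i] - mean) * (xs[i + k] - mean) for i in range(n - k))
    return cov / var

def effective_sample_size(xs, max_lag=None):
    """ESS = n / (1 + 2 * sum of rho_k).

    Assumption: the sum stops at the first non-positive autocorrelation,
    so noisy high-lag terms do not enter the estimate.
    """
    n = len(xs)
    if max_lag is None:
        max_lag = n // 2
    total = 0.0
    for k in range(1, max_lag + 1):
        rho = autocorrelation(xs, k)
        if rho <= 0:
            break
        total += rho
    return n / (1 + 2 * total)

# Two hypothetical chains of equal length: independent draws, and a
# strongly autocorrelated AR(1) process.
rng = random.Random(42)
independent = [rng.gauss(0, 1) for _ in range(1000)]
correlated = [0.0]
for _ in range(999):
    correlated.append(0.9 * correlated[-1] + rng.gauss(0, 1))

print("independent chain ESS:", round(effective_sample_size(independent), 1))
print("correlated chain ESS: ", round(effective_sample_size(correlated), 1))
```

The autocorrelated chain should report a much smaller ESS than the independent one, even though both contain 1000 samples. The formula needs a mean and a variance, which is why it does not carry over directly to tree topologies.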

2.  ESS for tree topologies, in principle

[Figure: two four-taxon trees with tips labeled A, D, B, C.]

ESS = 544.6

ESS = 7.5

[Figure: "Raw data" trace plot, y against x (x from 0 to 10000), ESS = 549.5.]

[Figure: "Distances between sequential samples", y against x (x from 0 to 5000), ESS = 7.5.]

[Figure: density of distance (distance from 0 to 12), one curve per gap: 1, 5, 10.]
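The idea in this section, that autocorrelation shows up as small distances between samples that are close together in the chain, can be illustrated on a numeric trace. The sketch below is an assumption-laden toy: `distances_at_gap` and `abs_diff` are hypothetical helper names, and the AR(1) chain stands in for real MCMC output. The only thing the approach needs is a distance function, which is what makes it usable for tree topologies.

```python
import random
from statistics import median

def distances_at_gap(samples, gap, dist):
    """Distances between every pair of samples that sit `gap` steps apart."""
    return [dist(samples[i], samples[i + gap])
            for i in range(len(samples) - gap)]

def abs_diff(a, b):
    """Distance between two numeric samples (a tree distance would go here)."""
    return abs(a - b)

# Hypothetical autocorrelated chain standing in for the raw data.
rng = random.Random(1)
chain = [0.0]
for _ in range(4999):
    chain.append(0.95 * chain[-1] + rng.gauss(0, 1))

# Gaps 1, 5 and 10 match the gaps shown in the density plot.
for gap in (1, 5, 10):
    d = distances_at_gap(chain, gap, abs_diff)
    print("gap", gap, "median distance", round(median(d), 2))
```

For an autocorrelated chain the median distance should grow as the gap grows, and level off once samples are far enough apart to be effectively independent.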

3.  ESS for tree topologies, in practice

[Figure: "Raw data" trace plot, y against x (x from 0 to 10000), ESS = 549.5, shown with a plot of median distance against gap size (gap sizes from 0 to 50).]

Median distance between random pairs

Lower CI of median (from bootstrapping)

[Figure: the same plot of median distance against gap size, labeled with the value 18.]

Approximate ESS = 10000/18 = 555.6
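A minimal sketch of the procedure these slides walk through, under stated assumptions: the threshold is taken as a bootstrap lower confidence bound on the median distance between random pairs of samples, and the approximate ESS is the chain length divided by the first gap size whose median distance reaches that threshold (as in 10000/18 above). The function names, the number of pairs, the number of bootstrap replicates and the 95% level are all choices made for this example, not details taken from the slides or from RWTY.

```python
import random
from statistics import median

def lower_ci_of_random_pair_median(samples, dist, rng,
                                   n_pairs=1000, n_boot=200, alpha=0.05):
    """Bootstrap lower bound on the median distance between random pairs."""
    n = len(samples)
    pair_dists = []
    for _ in range(n_pairs):
        i, j = rng.sample(range(n), 2)  # two distinct positions in the chain
        pair_dists.append(dist(samples[i], samples[j]))
    boot_medians = sorted(median(rng.choices(pair_dists, k=n_pairs))
                          for _ in range(n_boot))
    return boot_medians[int((alpha / 2) * n_boot)]

def approximate_ess(samples, dist, rng, max_gap=500):
    """Chain length divided by the first gap whose median distance reaches
    the threshold; None if no gap up to max_gap does."""
    threshold = lower_ci_of_random_pair_median(samples, dist, rng)
    n = len(samples)
    for gap in range(1, min(max_gap, n - 1) + 1):
        gap_median = median(dist(samples[i], samples[i + gap])
                            for i in range(n - gap))
        if gap_median >= threshold:
            return n / gap
    return None

def abs_diff(a, b):
    return abs(a - b)

# Hypothetical numeric chain; for trees, `dist` would be a tree distance.
rng = random.Random(7)
chain = [0.0]
for _ in range(4999):
    chain.append(0.9 * chain[-1] + rng.gauss(0, 1))

print("approximate ESS:", approximate_ess(chain, abs_diff, rng))
```

Because only a pairwise distance is required, the same function can be applied to a list of sampled trees by passing a tree distance as `dist`.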

[Figure: "Raw data" trace plot (ESS = 549.5) with a histogram of the difference between approximate and actual ESS (N=500); x axis from -200 to 400, y axis: count.]

256 reps > true ESS

244 reps < true ESS

Sign test: p = 0.6228
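The sign test reported above can be checked with an exact two-sided binomial calculation. The helper below is a sketch written for this purpose (the name `sign_test_p` is hypothetical); it asks how surprising a 256 to 244 split would be if overestimates and underestimates were equally likely.

```python
from math import comb

def sign_test_p(n_pos, n_neg):
    """Exact two-sided sign test p-value under a 50:50 null hypothesis."""
    n = n_pos + n_neg
    k = max(n_pos, n_neg)
    # Probability of a split at least this lopsided in one direction.
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    # The null distribution is symmetric, so double the tail.
    return min(1.0, 2 * tail)

p = sign_test_p(256, 244)
print("sign test p-value:", round(p, 4))
```

A p-value this far from conventional significance thresholds gives no evidence that the approximation is biased upward or downward in these replicates.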

4.  An example using hox genes

[Figure: tree of Hox genes, with tips labeled by gene name (for example Bfl_Xlox, Nvi_Lox5, Mmu_HoxA10, Bfl_Hox1).]

68 Hox genes

MCMC run using MrBayes

59K samples after burnin

[Figure: median distance against gap size (gap sizes from 0 to 300), one panel per tree distance.]

Robinson-Foulds distance: Approximate ESS = 746.9

Branch Score Difference: Approximate ESS = 694.1

Path Difference: Approximate ESS = 880.6
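Of the three tree distances above, the Robinson-Foulds distance is the simplest to sketch: it counts the bipartitions (splits) found in one tree but not the other. The example below is a toy under stated assumptions: trees are written as nested tuples of tip names, the function names are invented for this sketch, and the four-taxon trees are hypothetical rather than taken from the Hox data set.

```python
def leaf_set(tree):
    """All tip names in a tree written as nested tuples of strings."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))

def splits(tree):
    """Non-trivial bipartitions of the tips, treating the tree as unrooted."""
    all_leaves = leaf_set(tree)
    found = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - below
        if len(below) > 1 and len(other) > 1:
            # Store both sides together so orientation does not matter.
            found.add(frozenset([below, other]))
        return below

    visit(tree)
    return found

def robinson_foulds(tree_a, tree_b):
    """Number of splits present in exactly one of the two trees."""
    return len(splits(tree_a) ^ splits(tree_b))

# Hypothetical four-taxon trees.
tree_1 = (("A", "B"), ("C", "D"))
tree_2 = (("A", "C"), ("B", "D"))
tree_3 = (("B", "A"), ("D", "C"))  # same topology as tree_1, drawn differently

print(robinson_foulds(tree_1, tree_2))
print(robinson_foulds(tree_1, tree_3))
```

A function like `robinson_foulds` could serve as the distance in the gap-size procedure. Branch Score Difference and Path Difference would need branch lengths or path lengths, which this tuple representation does not carry.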

Thanks to Dan Warren

Code available at
github.com/danlwarren/RWTY

Slides will be on SlideShare

Comments? [email protected]