Community phylogeography of a carnivorous plant and its arthropod and microbe symbionts: New methods of data collection enable an expansion of phylogeographic investigations. Bryan Carstens Department of Biological Science

Post on 27-Sep-2020


  • Community phylogeography of a carnivorous plant and its arthropod and microbe symbionts: New methods of data collection enable an expansion of phylogeographic investigations.

    Bryan Carstens, Department of Biological Science

  • The Pale Pitcher plant Sarracenia alata

    • carnivorous

    • inhabits pine savannahs and bogs

    • long-lived, clonal perennial with a patchy distribution

    • extrafloral nectaries, down-pointing hairs, modified leaves, digestive fluids

    [Map; label: Gulf of Mexico]

    Community phylogeography of Sarracenia alata and its symbionts

  • No previous genetic investigations into S. alata, apart from its inclusion in two phylogenetic studies (Bayer et al. 1996; Oard 1997)

    environmental differentiation (mainly E-W)

    Mississippi River discontinuity (Soltis et al. 2007)

  • The Pale Pitcher plant Sarracenia alata

    • Mississippi River / Atchafalaya Swamp are the dominant geographic barriers

    [Map; label: Gulf of Mexico]

  • [Figure: Gulf Coast Habitat Loss. Percentage (y-axis, 0 to 45) for Longleaf Pine and Wetlands at Presettlement, 1900, and 1986; redrawn from Noss 1989.]

    Pine savannah habitat is severely reduced and fragmented . . .

  • Sarracenia alata - data from 86 plants across five populations

    • rps16-trnK region of the cpDNA

    • 8 microsatellites (MSATs) developed in the lab (Koopman et al. 2009)

  • Population structure

    • STRUCTURE (Pritchard et al. 2000; Evanno et al. 2005)

    k = 2 partitions (East-West)

    • STRUCTURAMA

    k = 4 partitions, largely corresponding to sampled populations (AS + T)

    Koopman & Carstens 2010

  • Selection - cpDNA (Tajima’s D; Tajima 1989)

    Selection - MSATs (BOTTLENECK; Cornuet & Luikart 1996)

    Selection - MSATs (Lewontin-Krakauer test; Lewontin & Krakauer 1973)

    -0.1067; P > 0.10

    Koopman & Carstens 2010
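
For readers unfamiliar with the first statistic on this slide, here is a minimal sketch of Tajima's D as defined in Tajima (1989). It is not the code used in the study: it assumes a gap-free alignment given as equal-length strings, and the function name `tajimas_d` and the toy sequences are illustrative only.

```python
from itertools import combinations
from math import sqrt

def tajimas_d(seqs):
    """Tajima's D for aligned, equal-length sequences with no gaps."""
    n = len(seqs)
    length = len(seqs[0])
    # S: number of segregating (variable) sites in the alignment
    seg = sum(1 for i in range(length) if len({s[i] for s in seqs}) > 1)
    if seg == 0:
        raise ValueError("D is undefined without segregating sites")
    # pi: mean number of pairwise differences
    pairs = list(combinations(seqs, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)
    # Constants from Tajima (1989)
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (pi - seg / a1) / sqrt(e1 * seg + e2 * seg * (seg - 1))

# Toy alignment (made up): three singleton mutations, so rare variants dominate
toy = ["AAAA", "AAAT", "AATA", "ATAA"]
d_toy = tajimas_d(toy)  # negative, because pi is smaller than S / a1
```

A negative D reflects an excess of rare variants and a positive D an excess of intermediate-frequency variants; values near zero, like the one reported on the slide, are what a neutral, constant-size model predicts.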


  • There is no compelling evidence for adaptive differentiation . . .


  • M + θ (LAMARC; Kuhner 2006)

    M + θ + g (LAMARC; Kuhner 2006)

    M + θ (MIGRATE-N; Beerli & Felsenstein 2001)

    Koopman & Carstens 2010

    Estimates of migration are non-zero and dependent on model assumptions . . .

  • Goal: Identify the factors that contribute to evolution of S. alata.

    Is local adaptation important?

    Is there evidence of gene flow?

    Is there evidence of population expansion?


    How do we identify these factors?

  • Summarize genetic variation with statistics

    • FST, Tajima’s D, Fu & Li’s F

    Estimate parameters using some model

    • Nm with Wright’s Island model

    • genealogies using a phylogenetic model

    • migration rates with a coalescent model

    Use these statistics or estimates to understand or infer the evolutionary history that produced the genetic variation.

    How do we analyze genetic data in evolutionary biology?
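
As a worked illustration of the "Nm with Wright's island model" bullet, the sketch below uses the textbook relationship FST ≈ 1 / (4Nm + 1) for a diploid nuclear locus. The function names and input values are made up for illustration, and the approximation assumes an idealized island model at equilibrium.

```python
def fst_from_heterozygosity(h_s, h_t):
    """FST = (HT - HS) / HT, from mean within-population and total heterozygosity."""
    if h_t <= 0:
        raise ValueError("total heterozygosity must be positive")
    return (h_t - h_s) / h_t

def nm_from_fst(fst):
    """Effective migrants per generation, Nm = (1 / FST - 1) / 4."""
    if not 0 < fst < 1:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0

# Hypothetical values, not estimates from S. alata
fst = fst_from_heterozygosity(h_s=0.4, h_t=0.5)   # about 0.2
nm = nm_from_fst(fst)                              # about 1 migrant per generation
```

The estimate is only as good as the island model behind it, which is the slide's broader point: the number is formally generated, but what it means depends on the assumed model.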


  • Summaries and estimates are formally generated, but interpreted by researchers in a qualitative manner.

    • over-interpretation – more detailed historical scenarios are proposed than the data support (Knowles & Maddison 2002)

    • confirmation bias – novel information is interpreted in a manner consistent with preconceived ideas (Nickerson 1998)

    The most egregious examples are found in phylogeography . . .

  • M + θ + t (IMa; Hey & Nielsen 2007)

    M + θ (LAMARC; Kuhner 2006)

    M + θ + g (LAMARC; Kuhner 2006)

    M + θ (MIGRATE-N; Beerli & Felsenstein 2001)

  • How do we move past this qualitative approach to data analysis?

    Hypothesis testing?

    Prob (data | null model is true) is calculated, but because genetic data are not independent and identically distributed, simulations are used to construct the test distribution.

    We reject or fail to reject the hypothesis.

    Knowles & Carstens 2007
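
The simulation step described on this slide can be sketched as a Monte Carlo test: simulate the statistic many times under the null model, then ask how often the simulated value is at least as extreme as the observed one. The null simulator below is a toy stand-in (a hypothetical coin-flip model), not a coalescent simulation, and all names are illustrative.

```python
import random

def monte_carlo_p(observed, simulate_null, n_sims=999, seed=1):
    """One-sided p-value from a simulated null distribution."""
    rng = random.Random(seed)
    null = [simulate_null(rng) for _ in range(n_sims)]
    extreme = sum(1 for value in null if value >= observed)
    # The +1 terms count the observed value as one draw from the null
    return (extreme + 1) / (n_sims + 1)

def toy_null(rng):
    """Hypothetical null model: number of successes in 20 fair coin flips."""
    return sum(rng.random() < 0.5 for _ in range(20))

p_extreme = monte_carlo_p(19, toy_null)  # observed value far in the tail
p_typical = monte_carlo_p(0, toy_null)   # every simulated value is >= 0
```

In a phylogeographic setting `simulate_null` would wrap a coalescent simulator run under the null population model, which is where the assumptions listed on the next slides enter.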


  • Phylogeographic hypothesis testing is flawed because the biological realism of any historical model is difficult to assess.

  • Assumptions

    • accuracy of θi, other values

    • adequacy of sampling strategy

    • timing of population model

    • topology of population model

    • adequacy of summary statistics

    Prob (data | null model is true)

    Knowles & Carstens 2007

  • Is hypothesis testing the best way to move beyond qualitative data analysis?

    • rejecting an unrealistic hypothesis does not help us understand anything about the demographic history

    • it may promote false confidence regarding our understanding of the system

    • we are not able to differentiate among hypotheses that cannot be rejected

  • In order to identify the historical forces that generate biodiversity, we must understand the historical demography of the species.

    • We cannot replicate evolutionary history.

    • We do not have experimental controls.

    Evolutionary genetics is a historical discipline . . .

    . . . that uses statistical tools developed for experimental research.

  • How should we proceed?

    1. Propose a set of possible hypotheses, where each hypothesis represents a plausible historical scenario (Chamberlin 1890).

    2. Calculate the probability of each hypothesis given the data.

    3. Rank the hypotheses; evaluate the relative support for each.

  • • Information theory is a statistical framework developed for quantifying the loss of information that occurs when a model is used to describe reality (K-L distance; Kullback & Leibler 1951).

    • Akaike (1973) linked K-L distance and maximum likelihood.

    • Following Chamberlin, we can calculate Prob (Hj | data) for j hypotheses and rank them using AIC.

    An information theoretic approach to phylogeography is statistically rigorous (like hypothesis testing) but more broadly applicable because it does not depend on the adequacy of model assumptions.
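
The ranking step can be made concrete with the standard AIC and Akaike-weight formulas: AIC = 2k - 2 ln L, and w_i = exp(-Δ_i / 2) / Σ_j exp(-Δ_j / 2), where Δ_i is the difference between model i's AIC and the lowest AIC in the set. The sketch below is generic; the log-likelihoods and parameter counts are invented for illustration and do not come from the talk.

```python
from math import exp

def aic(log_likelihood, n_params):
    """Akaike information criterion; lower values indicate better support."""
    return 2 * n_params - 2 * log_likelihood

def akaike_weights(aic_values):
    """Model probabilities w_i computed from AIC differences."""
    best = min(aic_values)
    relative = [exp(-0.5 * (value - best)) for value in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

# Hypothetical (log-likelihood, number of parameters) for three models
candidates = {"isolation": (-100.0, 3), "migration": (-99.5, 5), "expansion": (-104.0, 4)}
scores = {name: aic(lnl, k) for name, (lnl, k) in candidates.items()}
weights = dict(zip(scores, akaike_weights(list(scores.values()))))
ranked = sorted(weights, key=weights.get, reverse=True)
```

Because the weights sum to one across the candidate set, they can be read as the relative support for each hypothesis given the data and the set of models considered.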


  • • IMa (Hey & Nielsen 2007)

    • approach described by Carstens et al. 2009

    Information theoretic approach to evolutionary genetics


  • • little support in the data for demographic models with migration

    • parameter estimates are interdependent

    • parameter estimates weighted by model probabilities (wi)

    • compare to estimates from normal IMa run


    12.695 2.79 4.303 0.003 0.0002 0.371

  • Phylogeography as an exercise in demographic model selection . . .

    • assess the statistical fit of a wide range of demographic models to the data

    • rank these models using information theory

    • estimate parameters using model averaging
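
Model averaging, the last bullet above, weights each model's parameter estimate by that model's probability w_i. A minimal sketch follows, assuming that a model which fixes a parameter at zero (for example, no migration) contributes an estimate of zero; the numbers are made up.

```python
def model_average(estimates, weights):
    """Weighted mean of per-model estimates of one parameter."""
    if len(estimates) != len(weights):
        raise ValueError("need one weight per estimate")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(e * w for e, w in zip(estimates, weights)) / total

# Hypothetical migration-rate estimates from three models and their w_i;
# the first model has no migration parameter, so it contributes 0.0
migration_estimates = [0.0, 2.0, 1.0]
model_probabilities = [0.7, 0.2, 0.1]
averaged_m = model_average(migration_estimates, model_probabilities)  # about 0.5
```

The averaged value shrinks toward zero when models without migration carry most of the weight, which is how model averaging accounts for model uncertainty in the estimate.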


  • The IMa model is not flexible, but simulation-based approaches are:

    • Approximate Bayesian Computation

    • Approximate Likelihoods

    . . . but we need better data than Koopman & Carstens 2010

    Second-generation sequencing (Roche 454 Titanium)

  • Second-generation sequencing methods, such as Roche 454, enable expansion along each of the phylogeographic sampling axes:

    • number of samples / species

    • number of loci

    • number of species (comparative studies)

    Molecular Ecology Resources (2008)

  • Second-generation sequencing for phylogeography

    (Margaret Koopman & John McCormack)

    • restriction digest / reduced representation libraries

    • amplify parallel portions of the genome in multiple individuals

    • size selection / gel extraction

    • PCR used to add individual-identifying barcodes and linkers

    • sequence using Roche 454 (Titanium chemistry)

    • bioinformatics processing (Sarah Hird)

    Koopman et al. (wet lab); Hird et al. (software) in review

  • • 8-10 individuals from each of 10 fragmented populations

    • 2 trial runs on 1/8th of a sequencing plate

    • 1 full sequencing plate

    • > 1,000,000 sequencing reads averaging > 350 bp

    • 522 variable and 674 non-variable loci

    • Average length 378 bp; ~4.6 variable sites per locus

    • ~450 kb of data!

    • ~0.016% of the S. alata genome


    Second-generation sequencing in S. alata
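
The figures on this slide can be cross-checked with simple arithmetic. The sketch below uses only the numbers shown above; the genome size it prints is merely what the "~0.016%" figure implies, not a value stated in the talk.

```python
variable_loci = 522
invariant_loci = 674
mean_locus_length_bp = 378
fraction_of_genome = 0.016 / 100  # "~0.016% of the S. alata genome"

total_loci = variable_loci + invariant_loci        # 1196 loci
total_bp = total_loci * mean_locus_length_bp       # 452,088 bp, i.e. ~450 kb
implied_genome_bp = total_bp / fraction_of_genome  # roughly 2.8 billion bp

print(f"{total_loci} loci, {total_bp / 1000:.0f} kb sequenced")
print(f"implied genome size: {implied_genome_bp / 1e9:.1f} Gb")
```

The "~450 kb" bullet is therefore consistent with 1,196 loci averaging 378 bp each.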


  • If we treat phylogeography as an exercise in demographic model selection, we need to assess the statistical fit of a wide range (100s) of demographic models to the data:

    • relative posterior probabilities (Approximate Bayesian Computation)

    • model likelihoods (Approximate Likelihoods)

  • Approximate Bayesian Computation

    • Compute the joint posterior distribution of any number of models | data by simulating from the prior distribution under a set of models {Mi}, rejection filtering, and calculation of the contribution of each model to the posterior distribution.

    • MS (Hudson 2002) used to simulate under coalescent models

    • Perl script to generate prior distributions for models (12,500,000 draws)

    • MSBAYES (Hickerson et al. 2007) to perform rejection step
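
The rejection scheme described above can be sketched in a few lines. This toy version is not the MS / Perl / MSBAYES pipeline from the talk: each "model" is a hypothetical function that draws a parameter from its prior and returns one simulated summary statistic, and the model prior is assumed to be uniform.

```python
import random
from collections import Counter

def abc_model_choice(observed, simulators, n_draws=20000, tolerance=0.5, seed=1):
    """Relative posterior probability of each model by rejection sampling."""
    rng = random.Random(seed)
    names = list(simulators)
    accepted = Counter()
    for _ in range(n_draws):
        name = rng.choice(names)            # uniform prior over models
        simulated = simulators[name](rng)   # prior draw + simulated statistic
        if abs(simulated - observed) <= tolerance:  # rejection filter
            accepted[name] += 1
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; increase the tolerance")
    return {name: accepted[name] / total for name in names}

def no_migration(rng):
    """Toy stand-in for a coalescent simulation without migration."""
    theta = rng.uniform(0.0, 2.0)           # prior on a nuisance parameter
    return rng.gauss(5.0 + theta, 1.0)

def with_migration(rng):
    """Toy stand-in for a coalescent simulation with migration."""
    theta = rng.uniform(0.0, 2.0)
    return rng.gauss(1.0 + theta, 1.0)

posterior = abc_model_choice(
    6.0, {"no_migration": no_migration, "with_migration": with_migration}
)
```

Each model's share of the accepted simulations is its contribution to the posterior. With many models and a fixed number of draws, each model receives few accepted simulations, which is one reason resolution can degrade as the model set grows.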


  • Approximate Bayesian Computation

    • ABC has been used to compare a small number of models (Fagundes et al. 2007).

  • Approximate Bayesian Computation

    Our goal is an even parameterization of potential model space.

  • Parameterization of {Mi} for S. alata

    • divergence model, samples partitioned E-W

    • used only 50 loci with best representation across individuals

    • vectors of summary statistics were calculated from each quartile of the data (π)

    • parameterization of the models as follows:

    (A=1=2, A≠1=2, 1≠A=2, 2≠A=1, A≠1≠2)

    migration (0, M1=2, M1≠2, M1, M2)

    population expansion (0, 1≠2, 1=2, 1, 2)
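
Crossing the three five-level factors listed above gives 5 × 5 × 5 = 125 candidate models, which matches the "125 Models" in the figure that follows. The enumeration can be sketched as below. Two things are assumptions: the first factor is unlabeled on the slide and is treated here as relative population size (ancestral A, populations 1 and 2), and `!=` marks places where a "not equal" relation is assumed.

```python
from itertools import product

# Factor levels transcribed from the slide; labels are illustrative
POPULATION_SIZE = ["A=1=2", "A!=1=2", "1!=A=2", "2!=A=1", "A!=1!=2"]
MIGRATION = ["0", "M1=2", "M1!=2", "M1", "M2"]
EXPANSION = ["0", "1!=2", "1=2", "1", "2"]

models = [
    {"size": size, "migration": mig, "expansion": exp}
    for size, mig, exp in product(POPULATION_SIZE, MIGRATION, EXPANSION)
]

print(len(models))   # 125
print(models[0])     # the simplest model: equal sizes, no migration, no expansion
```

Enumerating the full cross gives every combination equal representation, in line with the stated goal of an even parameterization of model space.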


  • [Figure: Relative Posterior Probability of 125 Models; y-axis: relative posterior probability, 0 to 0.18]

    ABC does not differentiate large numbers of models; experimentation with multiple combinations of summary statistics did not improve the resolution.

  • Approximate likelihoods

    . . . approach in development with Brian O’Meara

    • simulate genealogies under {Mi} that match those used in ABC

    • calculate the proportion of times that a gene tree from the empirical data is found in the set of simulated genealogies

    • this proportion approximates Probability (GT | Mi)

    • compute AIC values and information theoretic metrics such as model probabilities

    • gene trees have long been used in phylogeography . . .
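
The gene-tree matching idea above can be illustrated with a small sketch. It is not the authors' implementation, which the slide describes as still in development. Assumptions here: topologies are given as nested tuples of tip labels, loci are treated as independent so log-probabilities are summed, and a small floor replaces a zero match proportion so the logarithm stays defined.

```python
from math import log

def canonical(tree):
    """Order-independent string for a rooted topology given as nested tuples."""
    if isinstance(tree, str):
        return tree
    return "(" + ",".join(sorted(canonical(child) for child in tree)) + ")"

def match_proportion(empirical_tree, simulated_trees):
    """Share of simulated genealogies whose topology equals the empirical one."""
    target = canonical(empirical_tree)
    hits = sum(1 for tree in simulated_trees if canonical(tree) == target)
    return hits / len(simulated_trees)

def approx_log_likelihood(empirical_trees, simulated_trees):
    """Sum over loci of log P(GT | model), with P approximated by matching."""
    floor = 1.0 / (len(simulated_trees) + 1)  # assumption: avoids log(0)
    return sum(
        log(max(match_proportion(tree, simulated_trees), floor))
        for tree in empirical_trees
    )

def aic(log_likelihood, n_params):
    return 2 * n_params - 2 * log_likelihood

# Toy input: two simulated topologies for one model, one empirical gene tree
simulated = [(("a", "b"), "c"), (("a", "c"), "b")]
empirical = [("c", ("b", "a"))]
lnl = approx_log_likelihood(empirical, simulated)  # log(0.5)
```

With many tips, an exact topology match becomes rare, so the number of simulated genealogies needed to estimate the proportion grows quickly.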

  • [Figure: Exhaustive Model Selection; y-axis: model probability wi, 0 to 0.18]

    The three best-supported models (highest wi, i.e. lowest AIC) exclude migration and allow population size change . . .

    GT matching approximates P(GT | Model)

  • Expansion of the Approximate Likelihood approach to 10 populations

    • > 10,000 models for this scenario

    • include ‘species’ delimitation, parameter estimation using model averaging, heuristic exploration of parameter space?

    In a single-species investigation, 2nd generation sequencing data and demographic model selection allow us to identify the forces that influence the population genetic structure of the species . . .

    . . . but the benefit of NGS data is felt in multi-species comparisons.

    Community Genetics of Sarracenia alata and its symbionts


  • Before our work, nothing was known regarding the microbial fauna in Sarracenia alata . . .

  • In Sarracenia purpurea, microbes dominate the pitcher fluid and play an important role in prey digestion (Plummer & Jackson 1963; Harvey & Miller 1996), including mineralization of the majority of nutrients that plants derive from prey (Butler et al. 2008)

  • In the plant, eastern and western populations are geographically and genetically isolated and do not exchange migrants.

    • Mississippi River predates Sarracenia

    • tiny seeds (~2mm)

    • no ornamentation (flight, elaiosomes)

    • short seed dispersal (~5cm)

    • absent from flood plains

  • Did the entire community move simultaneously?

    • Seeds dispersed by floods with idiosyncratic colonization by symbiotic species; predicts many divergence events

    • Community divergence following Atchafalaya Embayment (~10,000 ybp), or community pattern of Mississippi River discontinuity; predicts a single divergence event

  • Metagenomic Community Phylogeography

    • genomic DNA extracted from pitcher fluid

    5 populations, 7 pitchers from four time points

    • PCR amplification of COI, 12S rRNA using barcoding primers

    • Roche 454 sequencing to isolate unique haplotypes

    • BLAST search to determine sequence identity

    [Photos: Habrotrocha rosa, Sarraceniopus gibsoni; photos: R. Kitko]

    >30 eukaryotic lineages identified from Eastern & Western populations (yeasts, algae, dipterans, mites, rotifers, ants)

  • MSBAYES: Hierarchical ABC used to test for simultaneous divergence (Hickerson et al. 2006, 2007)

    yeasts, algae, dipterans, mites, rotifers, ants, S. alata (eukaryotes)

  • Simultaneous Divergence?

    [Figure: posterior probability (y-axis) by number of divergence events (x-axis)]

    Eukaryotes appear to diverge across the Mississippi River in an idiosyncratic manner . . .

  • Roche 454 sequencing of bacterial DNA extracted from pitcher fluid and surrounding environment

    383,660 16S rRNA bacterial sequences from 73 S. alata pitchers

  • Bacterial communities in the pitcher fluid are distinct from those in the soil . . .

  • Bacterial diversity peaks in July and bacterial communities become more similar as the season progresses . . .

  • Phylogenetic community structure analysis using UniFrac (Lozupone & Knight 2005) produces a rooted summary of the dominant phylogenetic pattern exhibited by the community.

    Koopman & Carstens in review

    15 possible rooted 4-taxon topologies

    Long odds that this happened by chance (p = 0.0044)
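
The count of 15 follows from the standard formula for rooted, bifurcating, labeled trees: (2n - 3)!! for n taxa. A small sketch (the function name is illustrative):

```python
def n_rooted_topologies(n_taxa):
    """Number of rooted bifurcating labeled topologies, (2n - 3)!!, for n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    count = 1
    for odd in range(3, 2 * n_taxa - 2, 2):  # 3 * 5 * ... * (2n - 3)
        count *= odd
    return count

topologies = n_rooted_topologies(4)   # 15
chance_of_one_match = 1 / topologies  # about 0.067
```

The slide does not say how p = 0.0044 was derived. As an observation only, (1/15)² ≈ 0.0044, which would fit two independent topology matches; that reading is a guess, not something stated in the talk.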


  • Enterobacteria dominate the pitcher fluid communities . . .

    . . . Enterobacteria are commonly isolated from animal guts.

    [Figure panels: all sequences; ubiquitous; idiosyncratic; Taxonomy (ubiquitous)]

    Koopman & Carstens in review

  • Phylogenetic Community Structure Analysis (Webb et al. 2002; Cavender-Bares et al. 2009)

    H0: bacterial assemblages in each pitcher are evenly distributed throughout the phylogeny, indicating assembly as a result of neutral processes (Hubbell 2001).

    Ha: phylogenies can be overly-clustered (habitat filtering)

    Ha: phylogenies over-dispersed (competitive species interactions)

    (Cavender-Bares et al. 2009)

  • Phylogenetic Community Structure Analysis

    (Webb et al. 2002; Cavender-Bares et al. 2009)

    • 15 pitchers (3 from each of 5 sites)

    • 102 community structure tests using PICANTE (Kembel et al. 2010)

    • analyses conducted at family level

    • data from 8 most diverse families used
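
One common form of community structure test compares the mean pairwise phylogenetic distance (MPD) among the taxa in a community with a null distribution obtained by drawing the same number of taxa at random from the pool. The sketch below is a generic Python illustration of that idea, not PICANTE (an R package) and not necessarily the exact test used here; the taxon names and distance matrix are invented.

```python
import random
from itertools import combinations
from statistics import mean, pstdev

def mpd(taxa, dist):
    """Mean pairwise phylogenetic distance among the taxa in one community."""
    return mean(dist[a][b] for a, b in combinations(taxa, 2))

def ses_mpd(community, pool, dist, n_null=999, seed=1):
    """Standardized effect size of MPD and a one-sided p-value for clustering."""
    rng = random.Random(seed)
    observed = mpd(community, dist)
    # Null model: same richness, taxa drawn at random from the pool
    null = [mpd(rng.sample(pool, len(community)), dist) for _ in range(n_null)]
    z = (observed - mean(null)) / pstdev(null)
    p_clustered = (sum(v <= observed for v in null) + 1) / (n_null + 1)
    return z, p_clustered

# Hypothetical pool: two clades (a, b); distance 1 within a clade, 10 between
POOL = ["a1", "a2", "a3", "a4", "b1", "b2", "b3", "b4"]
DIST = {
    x: {y: 0 if x == y else (1 if x[0] == y[0] else 10) for y in POOL}
    for x in POOL
}

z_score, p_value = ses_mpd(["a1", "a2", "a3"], POOL, DIST)  # all from one clade
```

A negative standardized effect size means the community's members are more closely related than random draws (clustering); a positive one means over-dispersion.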


  • Phylogenetic Community Structure Analysis (Webb et al. 2002; Cavender-Bares et al. 2009)

    • 95 of 102 tests were clustered, 53 were significant at P = 0.05

    • 7 of 102 were over-dispersed (none significant)

    • (Enterobacteriaceae) 13/15 pitchers were clustered, 12 significant at P = 0.05.

    • If ecological function is correlated with phylogeny, the standard interpretation is that there are significant ecological differences conducive to the colonization and growth of different groups of bacteria among all pitchers (Fine et al. 2006).

  • Phylogenetic Community Structure Analysis (Webb et al. 2002; Cavender-Bares et al. 2009)

    Do habitat differences produce corresponding differences in the pitcher-fluid environment?

    Clustering could result from the colonization of pitchers by non-random packages of bacteria . . .

    via symbiotic arthropods?

    via prey items?

  • Ants comprise the majority of Sarracenia prey items (Ellison & Gotelli 2009) and constitute most of S. alata’s prey (~ 80%).

    • collected pitcher fluid from 10 plants in the Abita Springs population, extracted bacterial DNA

    • collected all ants within 1 m² of sampled plants

    • bacterial DNA was isolated from the digestive tracts of the ants and extracted

    • bacterial community fingerprints of both pitcher fluid and ant guts were generated using ARISA

  • H0: no significant difference between the bacterial communities in the ant guts and pitcher fluid

    Analysis of Similarity

    R statistics from ANOSIM analyses for the ARISA data partitioned by:

    • fluid(total) versus ant(total) (R value 0.217, P = 0.03)

    • individual plot (fluidi, anti) (R value 0.018, P = 0.291)


  • There are no significant differences between the bacterial communities of pitcher fluid and ant guts.

    Ant microbiomes have evolved to facilitate the digestion of arthropod prey (Hölldobler & Wilson 2000).

    Do carnivorous plants co-opt the digestive microbiomes of their dominant prey items?

  • Acknowledgements

    • LSU Faculty Research Program

    • NSF EPSCoR Pfund

    • Louisiana Board Of Regents Research Competitiveness Subprogram

    • Chancellors' Future Leaders in Research

    • Howard Hughes Medical Institute

    • NSF (DEB-0918212)

    • NSF (DEB-0956069)

    Sarah Hird

    Noah Reid

    John McVay

    Tara Pelletier

    Lowell Urbatch, Gary King, Brent Christner

    Danielle Fuselier

    Hannah Fullerton

    Joey Charboneau

    Dan Ence

    Jen Carstens

    Margaret Koopman

    Yi-Hsin Erica Tsai

    Amanda Zellmer