TRANSCRIPT
-
Community phylogeography of a carnivorous plant and its
arthropod and microbe symbionts: New methods of data collection
enable an expansion of phylogeographic investigations.
Bryan Carstens
Department of Biological Science
-
The Pale Pitcher plant Sarracenia alata
• carnivorous
• inhabits pine savannahs and bogs
• long lived, clonal perennial with a patchy distribution
• extrafloral nectaries, down-pointing hairs, modified leaves, digestive fluids
Gulf of Mexico
Community phylogeography of Sarracenia alata and its symbionts
-
No previous genetic investigations into S. alata, apart from its
inclusion in two phylogenetic studies (Bayer et al 1996; Oard
1997)
environmental differentiation
(mainly E-W)
Mississippi River discontinuity
(Soltis et al. 2007)
-
The Pale Pitcher plant Sarracenia alata
• Mississippi River / Atchafalaya Swamp are the dominant geographic barriers
Gulf of Mexico
-
[Figure: Gulf Coast Habitat Loss. Percentage (y-axis, 0 to 45) for Longleaf Pine and Wetlands at Presettlement, 1900, and 1986; redrawn from Noss 1989]
Pine Savannah habitat is severely reduced and fragmented . . .
-
Sarracenia alata - data from 86 plants across five populations
• rps16-trnK region of the cpDNA
• 8 MSATs developed in lab (Koopman et al. 2009)
-
Population structure
• STRUCTURE (Pritchard et al. 2000;
Evanno et al. 2005)
k = 2 partitions (East—West)
• STRUCTURAMA
k = 4 partitions, largely corresponding
to sampled populations (AS + T)
Koopman & Carstens 2010
-
Selection - cpDNA (Tajima’s D)
Tajima 1989
Selection - MSATs (BOTTLENECK)
Cornuet & Luikart 1996
Selection - MSATs (Lewontin-Krakauer test)
Lewontin & Krakauer 1973
-0.1067; P > 0.10
Koopman & Carstens 2010
-
There is no compelling evidence for adaptive differentiation . . .
-
M + θ (LAMARC) Kuhner 2006
M + θ + g (LAMARC) Kuhner 2006
M + θ (MIGRATE-N) Beerli & Felsenstein 2001
Koopman & Carstens 2010
Estimates of migration are non-zero and dependent on model assumptions . . .
-
Goal: Identify the factors that contribute to evolution of S. alata.
Is local adaptation important?
Is there evidence of gene flow?
Is there evidence of population expansion?
How do we identify these factors?
-
Summarize genetic variation with statistics
• FST, Tajima’s D, Fu & Li’s F
Estimate parameters using some model
• Nm with Wright’s Island model
• genealogies using a phylogenetic model
• migration rates with a coalescent-model
Use these statistics or estimates to understand or infer the
evolutionary history that produced the genetic variation.
How do we analyze genetic data in evolutionary biology?
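Two of the summaries named on this slide can be sketched in a few lines. The block below is a minimal illustration, not the analysis used in the talk: `tajimas_d` follows Tajima's (1989) standard formula, and `nm_from_fst` uses the island-model approximation FST ≈ 1 / (4Nm + 1). The function names and the example inputs are illustrative assumptions.

```python
import math


def tajimas_d(n, seg_sites, pi):
    """Tajima's D from sample size n, segregating sites S, and mean pairwise differences pi."""
    if n < 4:
        raise ValueError("the variance term is zero for n < 4, so D is undefined")
    if seg_sites <= 0:
        raise ValueError("D is undefined without segregating sites")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    # Difference between the two estimators of theta, scaled by its standard deviation.
    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / math.sqrt(variance)


def nm_from_fst(fst):
    """Number of migrants per generation under Wright's island model (an equilibrium approximation)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to exercise the functions.
    print(tajimas_d(n=10, seg_sites=16, pi=3.9))
    print(nm_from_fst(0.2))
```

Both numbers are summaries only; the slides that follow argue that interpreting them is where the qualitative judgment enters.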
-
Summaries and estimates are formally generated, but
interpreted by researchers in a qualitative manner.
• over-interpretation – more detailed historical scenarios are
proposed than the data support (Knowles & Maddison 2002)
• confirmation bias – novel information is interpreted in a
manner consistent with preconceived ideas (Nickerson
1998)
The most egregious examples are found in phylogeography . . .
-
M + θ + t (IMa) Hey & Nielsen 2007
M + θ (LAMARC) Kuhner 2006
M + θ + g (LAMARC) Kuhner 2006
M + θ (MIGRATE-N) Beerli & Felsenstein 2001
-
How do we move past this qualitative approach to data analysis?
Hypothesis testing?
Prob (data|null model is true) is calculated, but because genetic
data are not independent and identically distributed, simulations
are used to construct the test distribution.
We reject or fail to reject the hypothesis.
Knowles & Carstens 2007
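The simulation step described on this slide can be sketched as follows. This is a toy null model (a single constant-size neutral population), not the models tested in the cited work: the number of segregating sites is simulated under the standard coalescent, and the observed value is compared with that simulated distribution. The function names, the choice of statistic, and the example values are assumptions made for illustration.

```python
import random


def simulate_segregating_sites(n, theta, rng):
    """Draw S for n sequences under a constant-size neutral coalescent (time in units of 2N generations)."""
    total_length = 0.0
    for k in range(n, 1, -1):
        # While k lineages remain, the waiting time to coalescence has rate k(k-1)/2.
        total_length += k * rng.expovariate(k * (k - 1) / 2.0)
    # Mutations fall on the tree as a Poisson process with mean theta/2 * total branch length.
    mean = theta / 2.0 * total_length
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def monte_carlo_p_value(observed, n, theta, n_sims, rng):
    """One-sided P(S >= observed | null model), estimated from n_sims simulated data sets."""
    hits = sum(
        1 for _ in range(n_sims)
        if simulate_segregating_sites(n, theta, rng) >= observed
    )
    # The +1 terms count the observed data set as one draw from the null.
    return (hits + 1) / (n_sims + 1)


if __name__ == "__main__":
    rng = random.Random(1)
    print(monte_carlo_p_value(observed=25, n=10, theta=5.0, n_sims=999, rng=rng))
```

The result is a single reject or fail-to-reject decision about one model, which is the limitation the next slides discuss.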
-
How do we move past this qualitative approach to data analysis?
Prob (data|null model is true) is calculated, but because genetic
data are not independent and identically distributed, simulations
are used to construct the test distribution.
Phylogeographic hypothesis testing is
flawed because the biological realism of
any historical model is difficult to assess.
-
Assumptions
• accuracy of θi, other values
• adequacy of sampling strategy
• timing of population model
• topology of population model
• adequacy of summary statistics
Prob (data | null model is true)
Knowles & Carstens 2007
-
Is Hypothesis-testing the best way to move beyond
qualitative data analysis?
• rejecting an unrealistic hypothesis does not help us understand
anything about the demographic history
• it may promote false confidence regarding our understanding of
the system
• we are not able to differentiate among hypotheses that cannot
be rejected
-
In order to identify the historical forces that generate biodiversity,
we must understand the historical demography of the species.
• We cannot replicate evolutionary history.
• We do not have experimental controls.
Evolutionary genetics is a historical discipline . . .
. . . that uses statistical tools developed for experimental
research.
-
How should we proceed?
1. Propose a set of possible hypotheses, where each hypothesis
represents a plausible historical scenario (Chamberlin 1890).
2. Calculate the probability of each hypothesis given the data.
3. Rank the hypotheses; evaluate the relative support for each.
-
• Information theory is a statistical framework developed for
quantifying the loss of information that occurs when a model is
used to describe reality (K-L distance; Kullback & Leibler 1951).
• Akaike (1973) linked K-L distance and maximum likelihood.
• Following Chamberlin, we can calculate Prob (Hj | data) for j
hypotheses and rank them using AIC.
An information theoretic approach to phylogeography is
statistically rigorous (like hypothesis testing) but more broadly
applicable because it does not depend on the adequacy of model
assumptions.
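Ranking hypotheses with AIC can be sketched as below. AIC is computed as 2k − 2 ln L for each model, and the Akaike weights rescale the AIC differences so that they sum to one across the candidate set; these weights are what is commonly read as the model probabilities. The model names and log-likelihoods in the example are hypothetical.

```python
import math


def aic(log_likelihood, n_params):
    """Akaike Information Criterion: lower values indicate less information lost."""
    return 2.0 * n_params - 2.0 * log_likelihood


def akaike_weights(aic_scores):
    """Convert a {model: AIC} mapping into {model: weight}, with weights summing to 1."""
    best = min(aic_scores.values())
    relative = {name: math.exp(-0.5 * (score - best)) for name, score in aic_scores.items()}
    total = sum(relative.values())
    return {name: value / total for name, value in relative.items()}


if __name__ == "__main__":
    # Hypothetical log-likelihoods and parameter counts for two demographic models.
    scores = {
        "isolation": aic(log_likelihood=-100.0, n_params=3),
        "isolation_with_migration": aic(log_likelihood=-99.5, n_params=5),
    }
    weights = akaike_weights(scores)
    for name in sorted(weights, key=weights.get, reverse=True):
        print(name, round(scores[name], 2), round(weights[name], 3))
```

Because the weights are relative to the candidate set, adding or removing a model changes every weight.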
-
• IMa Hey & Nielsen 2007
• approach described by Carstens et al 2009
Information theoretic approach to evolutionary genetics
-
• little support in the data for demographic models with migration
• parameter estimates are interdependent
• parameter estimates weighted by model probabilities (wi)
• compare to estimates from normal IMa run
12.695 2.79 4.303 0.003 0.0002 0.371
-
Phylogeography as an exercise in
demographic model selection . . .
• assess the statistical fit of a wide range of demographic
models to the data
• rank these models using information theory
• estimate parameters using model averaging
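Model averaging, the last step listed above, weights each model's parameter estimate by that model's probability. The sketch below assumes the weights (wi) are already computed and sum to one, and it treats a parameter that is absent from a model (for example, migration in a no-migration model) as zero. That convention and all names and values are assumptions for illustration.

```python
def model_averaged_estimates(weights, estimates):
    """Return {parameter: sum_i w_i * estimate_i}, treating absent parameters as 0."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("model weights must sum to 1")
    parameters = sorted({p for per_model in estimates.values() for p in per_model})
    averaged = {}
    for parameter in parameters:
        averaged[parameter] = sum(
            weights[model] * estimates[model].get(parameter, 0.0)
            for model in weights
        )
    return averaged


if __name__ == "__main__":
    # Hypothetical weights and per-model estimates.
    weights = {"isolation": 0.8, "isolation_with_migration": 0.2}
    estimates = {
        "isolation": {"theta": 4.0},
        "isolation_with_migration": {"theta": 5.0, "m": 0.5},
    }
    print(model_averaged_estimates(weights, estimates))
```

With most of the weight on models without migration, the averaged migration estimate shrinks toward zero, which is the behaviour that makes the averaged estimates differ from a single full-model run.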
-
The IMa model is not flexible, but simulation-based approaches are:
• Approximate Bayesian Computation
• Approximate Likelihoods
. . . but we need better data than Koopman & Carstens 2010
Second-generation sequencing (Roche 454 Titanium)
-
Second-generation sequencing methods, such as Roche 454, enable
expansion along each of the phylogeographic sampling axes:
• number of samples / species
• number of loci
• number of species (comparative studies)
Molecular Ecology Resources (2008)
-
Second-generation sequencing for phylogeography
(Margaret Koopman & John McCormack)
• restriction digest / reduced representation libraries
• amplify parallel portions of the genome in multiple
individuals
• size selection / gel extraction
• PCR used to add individual-identifying barcodes and linkers
• sequence using ROCHE 454 (Titanium Chemistry)
• bioinformatics processing (Sarah Hird)
Koopman et al. (wet lab); Hird et al. (software) in review
-
• 8-10 individuals from each of 10 fragmented populations
• 2 trial runs on 1/8th of a sequencing plate
• 1 full sequencing plate
• > 1,000,000 sequencing reads averaging > 350 bp
• 522 variable and 674 non-variable loci
• Average length 378 bp; ~4.6 variable sites per locus
• ~450 kb of data!
• ~0.016% of the S. alata genome
Second-generation sequencing in S. alata
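The totals on this slide can be checked with simple arithmetic. The sketch below multiplies the number of loci by the average locus length, then works backwards from the quoted genome fraction to the genome size it implies. That implied size is a derived figure, not one stated on the slide.

```python
variable_loci = 522
non_variable_loci = 674
mean_locus_length_bp = 378
genome_fraction = 0.016 / 100  # "~0.016%" expressed as a proportion

total_loci = variable_loci + non_variable_loci
total_bp = total_loci * mean_locus_length_bp

# Genome size implied by the two rounded figures above (an inference, not a reported value).
implied_genome_bp = total_bp / genome_fraction

print(total_loci, "loci")
print(round(total_bp / 1000), "kb of sequence")
print(round(implied_genome_bp / 1e9, 1), "Gb implied genome size")
```

The product comes to roughly 452 kb, consistent with the "~450 kb" quoted above.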
-
If we treat phylogeography as an exercise in demographic model
selection, we need to assess the statistical fit of a wide range
(100s) of demographic models to the data:
relative posterior probabilities
(Approximate Bayesian Computation)
model likelihoods
(Approximate Likelihoods)
-
Approximate Bayesian Computation
• Compute the joint posterior distribution of any number of
models | data by simulation of prior distribution under a set
of models {Mi}, rejection filtering, and calculation of the
contribution of each model to the posterior distribution.
• MS (Hudson 2002) used to simulate under coalescent
models
• PERL script to generate prior distributions for models
(12,500,000 draws)
• MSBAYES (Hickerson et al. 2007) to perform rejection step
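The rejection step for model choice can be sketched with a toy example. This is not the MS / MSBAYES pipeline named above: each "model" here is a stand-in simulator that draws a parameter from its prior and returns one summary statistic, and the relative posterior probability of a model is the share of accepted draws that came from it. All names, priors, and values are hypothetical.

```python
import random


def abc_model_choice(observed, simulators, n_draws, tolerance, rng):
    """Rejection ABC over several models with a uniform prior on model identity."""
    names = sorted(simulators)
    accepted = {name: 0 for name in names}
    for _ in range(n_draws):
        name = rng.choice(names)          # draw a model from the prior
        statistic = simulators[name](rng)  # draw parameters and simulate a summary
        if abs(statistic - observed) <= tolerance:
            accepted[name] += 1           # keep draws that land close to the data
    total = sum(accepted.values())
    if total == 0:
        raise ValueError("no simulations accepted; increase n_draws or tolerance")
    return {name: count / total for name, count in accepted.items()}


def low_model(rng):
    """Toy model: parameter ~ U(0, 1), statistic ~ Normal(parameter, 0.1)."""
    return rng.gauss(rng.uniform(0.0, 1.0), 0.1)


def high_model(rng):
    """Toy model: parameter ~ U(2, 3), statistic ~ Normal(parameter, 0.1)."""
    return rng.gauss(rng.uniform(2.0, 3.0), 0.1)


if __name__ == "__main__":
    rng = random.Random(42)
    posterior = abc_model_choice(
        observed=0.5,
        simulators={"low": low_model, "high": high_model},
        n_draws=5000,
        tolerance=0.2,
        rng=rng,
    )
    print(posterior)
```

In a real analysis the statistic would be a vector of summary statistics and the distance a norm over that vector.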
-
Approximate Bayesian Computation
• ABC has been used to compare a small number of models.
Fagundes et al. 2007
-
Approximate Bayesian Computation
Our goal is an even parameterization of potential model space.
-
Parameterization of {Mi} for S. alata
• divergence model, samples partitioned E-W
• used only 50 loci with best representation across individuals
• vectors of summary statistics were calculated from each
quartile of the data (π)
• parameterization of the models as follows:
θ (θA = θ1 = θ2, θA ≠ θ1 = θ2, θ1 ≠ θA = θ2, θ2 ≠ θA = θ1, θA ≠ θ1 ≠ θ2)
migration (0, M1 = M2, M1 ≠ M2, M1, M2)
population expansion (0, g1 ≠ g2, g1 = g2, g1, g2)
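Each of the three parameter classes above has five settings, so crossing them gives 5 × 5 × 5 = 125 models, the number shown in the posterior probability figure. The enumeration can be sketched as below; the text labels are shorthand assumed here for the slide's options, not identifiers from the original scripts.

```python
from itertools import product

# Shorthand labels for the five settings in each parameter class (assumed naming).
THETA_OPTIONS = ["A=1=2", "A!=1=2", "1!=A=2", "2!=A=1", "A!=1!=2"]
MIGRATION_OPTIONS = ["none", "M1=M2", "M1!=M2", "M1 only", "M2 only"]
EXPANSION_OPTIONS = ["none", "g1!=g2", "g1=g2", "pop 1 only", "pop 2 only"]

# One model per combination of settings.
models = [
    {"theta": theta, "migration": migration, "expansion": expansion}
    for theta, migration, expansion in product(
        THETA_OPTIONS, MIGRATION_OPTIONS, EXPANSION_OPTIONS
    )
]

print(len(models))
print(models[0])
```

Enumerating the full cross is what gives an even parameterization of model space, rather than a hand-picked subset.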
-
[Figure: Relative Posterior Probability of 125 Models; y-axis: Relative Posterior Probability, 0 to 0.18]
-
ABC does not differentiate large numbers of models; experimentation with
multiple combinations of summary statistics did not improve the resolution.
-
Approximate likelihoods
. . . approach in development with Brian O’Meara
• simulate genealogies under {Mi} that match those used in
ABC
• calculate the proportion of times that a gene tree from the
empirical data is found in the set of simulated genealogies
• this proportion approximates Probability (GT | Mi)
• compute AIC values and information theoretic metrics such
as model probabilities
• gene trees have long been used in phylogeography . . .
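The gene-tree matching idea in the bullets above can be sketched as follows. Topologies are represented as canonical strings (it is assumed that identical topologies always produce identical strings), the match proportion stands in for P(GT | Mi), and loci are treated as independent so that log-probabilities add. This is an illustration only, not the approach under development; every name and value is hypothetical.

```python
import math
from collections import Counter


def approx_log_likelihood(empirical_trees, simulated_trees):
    """Sum over loci of log(proportion of simulated gene trees matching the empirical tree)."""
    counts = Counter(simulated_trees)
    n_simulated = len(simulated_trees)
    log_likelihood = 0.0
    for tree in empirical_trees:
        proportion = counts[tree] / n_simulated
        if proportion == 0.0:
            # The topology was never simulated under this model.
            return float("-inf")
        log_likelihood += math.log(proportion)
    return log_likelihood


def aic(log_likelihood, n_params):
    """AIC = 2k - 2 ln L, using the approximate likelihood."""
    return 2.0 * n_params - 2.0 * log_likelihood


if __name__ == "__main__":
    # Ten hypothetical genealogies simulated under one model.
    simulated = ["((a,b),c)"] * 6 + ["((a,c),b)"] * 3 + ["((b,c),a)"]
    # Empirical gene trees from two hypothetical loci.
    empirical = ["((a,b),c)", "((a,c),b)"]
    ln_l = approx_log_likelihood(empirical, simulated)
    print(ln_l, aic(ln_l, n_params=3))
```

A practical caveat of this sketch: with many tips the space of topologies is large, so far more simulations are needed before any match is observed.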
-
[Figure: Exhaustive Model Selection; y-axis: model probability wi, 0 to 0.18]
The three models with the highest AIC model probabilities (wi) exclude
migration and allow population size change . . .
Gene-tree (GT) matching approximates P(GT | Model)
-
Expansion of the Approximate Likelihood approach to 10
populations
• > 10,000 models for this scenario
• include ‘species’ delimitation, parameter estimation using
model averaging, heuristic exploration of parameter space?
In a single-species investigation, second-generation sequencing
data and demographic model selection allow us to identify the
forces that influence the population genetic structure of the
species . . .
. . . but the benefit of NGS data is felt with multi-species
comparisons.
Community Genetics of Sarracenia alata and its symbionts
-
Before our work, nothing was known regarding the microbial fauna in Sarracenia alata . . .
-
In Sarracenia purpurea, microbes dominate
the pitcher fluid and play an important role
in prey digestion (Plummer & Jackson
1963; Harvey & Miller 1996), including
mineralization of the majority of nutrients
that plants derive from prey (Butler et al.
2008)
-
In the plant, Eastern - Western populations are geographically
and genetically isolated and do not exchange migrants.
• Mississippi River predates Sarracenia
• tiny seeds (~2mm)
• no ornamentation (flight, elaiosomes)
• short seed dispersal (~5cm)
• absent from flood plains
-
Did the entire community move simultaneously?
• Seeds dispersed by floods with idiosyncratic colonization by
symbiotic species; predicts many divergence events
• Community divergence following Atchafalaya Embayment
(~10,000 ybp), or community pattern of Mississippi River
discontinuity; predicts a single divergence event
-
Metagenomic Community Phylogeography
• genomic DNA extracted from pitcher fluid
5 populations, 7 pitchers from four time points
• PCR amplification of COI and 12S rRNA using barcoding
primers
• Roche 454 sequencing to isolate unique haplotypes
• BLAST search to determine sequence identity
[Photos: R. Kitko. Habrotrocha rosa; Sarraceniopus gibsoni]
>30 eukaryotic lineages identified from
Eastern & Western populations
(yeasts, algae, dipterans, mites, rotifers, ants)
-
MSBAYES: Hierarchical ABC used to test for simultaneous
divergence
(Hickerson et al. 2006, 2007)
yeasts, algae, dipterans, mites, rotifers, ants, S. alata (eukaryotes)
-
Simultaneous Divergence?
[Figure: Posterior Probability (y-axis) by Number of Divergence events (x-axis)]
Eukaryotes appear to diverge across the Mississippi River in an idiosyncratic manner . . .
-
Roche 454 sequencing of bacterial DNA extracted from pitcher
fluid and surrounding environment
383,660 16S rRNA bacterial sequences from 73 S. alata pitchers
-
Bacterial communities in the pitcher fluid are distinct from those in the soil . . .
-
Bacterial diversity peaks in July and bacterial communities become
more similar as the season progresses . . .
-
Koopman & Carstens in review
Phylogenetic community structure
analysis using UniFrac (Lozupone &
Knight 2005) produces a rooted
summary of the dominant phylogenetic
pattern exhibited by the community.
15 possible rooted 4-taxon topologies
Long odds that this happened by
chance
(p = 0.0044)
?
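The count of 15 follows from the standard formula for rooted, bifurcating, labeled trees: (2n − 3)!! topologies for n taxa. The sketch below computes that count. It also shows that the quoted p = 0.0044 equals (1/15)² to the precision given; reading it as the chance that two independent analyses agree on one topology is an assumption here, not something the slide states.

```python
def rooted_topologies(n_taxa):
    """Number of rooted bifurcating labeled topologies: (2n - 3)!! for n >= 2."""
    if n_taxa < 2:
        raise ValueError("need at least two taxa")
    count = 1
    for factor in range(3, 2 * n_taxa - 2, 2):  # 3, 5, ..., 2n - 3
        count *= factor
    return count


if __name__ == "__main__":
    n_topologies = rooted_topologies(4)
    print(n_topologies)
    # Probability that two independent draws both hit one specified topology (assumed reading).
    print(round((1.0 / n_topologies) ** 2, 4))
```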
-
Enterobacteria dominate the pitcher fluid communities . . .
. . . Enterobacteria are commonly isolated from animal
guts.
[Figure panels: all sequences; ubiquitous; idiosyncratic; Taxonomy (ubiquitous)]
Koopman & Carstens in review
-
Phylogenetic Community Structure Analysis
(Webb et al. 2002; Cavender-Bares et al. 2009)
H0: bacterial assemblages in each pitcher are evenly distributed throughout the
phylogeny, indicating assembly as a result of neutral processes (Hubbell
2001).
Ha: phylogenies can be overly-clustered
(habitat filtering)
Ha: phylogenies over-dispersed
(competitive species interactions)
(Cavender-Bares et al. 2009)
-
Phylogenetic Community Structure Analysis
(Webb et al. 2002; Cavender-Bares et al. 2009)
• 15 pitchers (3 from each of 5 sites)
• 102 community structure tests using PICANTE (Kembel et al 2010)
• analyses conducted at family level
• data from 8 most diverse families used
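A community structure test of this kind compares an observed phylogenetic distance with a null distribution. The sketch below is a plain-Python illustration of one common form, the standardized effect size of mean pairwise distance (MPD) against random draws from the taxon pool. It is not PICANTE itself, which is an R package, and the toy taxa, distances, and null model are assumptions.

```python
import itertools
import random
import statistics


def mean_pairwise_distance(taxa, dist):
    """Mean phylogenetic distance over all pairs of taxa present in a community."""
    pairs = list(itertools.combinations(sorted(taxa), 2))
    return sum(dist[frozenset(pair)] for pair in pairs) / len(pairs)


def ses_mpd(community, pool, dist, n_null, rng):
    """(observed MPD - mean null MPD) / sd of null MPD; negative values indicate clustering."""
    observed = mean_pairwise_distance(community, dist)
    null = [
        mean_pairwise_distance(rng.sample(pool, len(community)), dist)
        for _ in range(n_null)
    ]
    return (observed - statistics.mean(null)) / statistics.stdev(null)


if __name__ == "__main__":
    # Toy pool: two clades of three taxa; distance 1 within a clade, 10 between clades.
    pool = ["a1", "a2", "a3", "b1", "b2", "b3"]
    dist = {
        frozenset((x, y)): (1.0 if x[0] == y[0] else 10.0)
        for x, y in itertools.combinations(pool, 2)
    }
    rng = random.Random(0)
    print(ses_mpd(["a1", "a2", "a3"], pool, dist, n_null=999, rng=rng))
```

A community drawn entirely from one clade gives a strongly negative value, the clustered pattern associated with habitat filtering in the hypotheses above.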
-
Phylogenetic Community Structure Analysis
(Webb et al. 2002; Cavender-Bares et al. 2009)
• 95 of 102 tests were clustered, 53 were significant at P = 0.05
• 7 of 102 were over dispersed (none significant)
• (Enterobacteriaceae) 13/15 pitchers were clustered, 12 significant at P =
0.05.
• If ecological function is correlated with phylogeny, the standard interpretation
is that there are significant ecological differences conducive to the
colonization and growth of different groups of bacteria among all pitchers
(Fine et al. 2006).
-
Phylogenetic Community Structure Analysis
(Webb et al. 2002; Cavender-Bares et al. 2009)
Do habitat differences produce correspondingly similar differences
in the pitcher-fluid environment?
Clustering could result from the colonization of pitchers by non-
random packages of bacteria . . .
via symbiotic arthropods?
via prey items?
-
Ants comprise the majority of Sarracenia prey items (Ellison
& Gotelli 2009) and constitute most of S. alata’s prey (~ 80%).
• collected pitcher fluid from 10 plants in the Abita Springs
population, extracted bacterial DNA
• collected all ants within 1 m² of sampled plants
• bacterial DNA was isolated from the digestive tracts of the ants
and extracted
• bacterial community fingerprints of both pitcher fluid and ant
guts were generated using ARISA
-
H0: no significant difference between the bacterial
communities in the ant guts and pitcher fluid
Analysis of Similarity
R statistics from ANOSIM analyses for the ARISA data, partitioned:
• by fluid (total) versus ant (total) (R = 0.217, P = 0.03)
• by individual plot (fluid_i, ant_i) (R = 0.018, P = 0.291)
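The ANOSIM R statistic compares ranked dissimilarities between groups with those within groups: R = (mean between-group rank − mean within-group rank) / (M / 2), where M is the number of sample pairs. The sketch below is a minimal version with a label-permutation P value. It is not the software used for the ARISA data, and the toy distance matrix is hypothetical.

```python
import itertools
import random


def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for position in range(start, end + 1):
            ranks[order[position]] = (start + end) / 2.0 + 1.0
        start = end + 1
    return ranks


def anosim_r(dist, groups):
    """ANOSIM R for a square distance matrix and one group label per sample."""
    pairs = list(itertools.combinations(range(len(groups)), 2))
    ranks = average_ranks([dist[i][j] for i, j in pairs])
    within = [r for r, (i, j) in zip(ranks, pairs) if groups[i] == groups[j]]
    between = [r for r, (i, j) in zip(ranks, pairs) if groups[i] != groups[j]]
    mean_within = sum(within) / len(within)
    mean_between = sum(between) / len(between)
    return (mean_between - mean_within) / (len(pairs) / 2.0)


def anosim_test(dist, groups, n_perm, rng):
    """Return (R, P), with P estimated by shuffling the group labels."""
    observed = anosim_r(dist, groups)
    labels = list(groups)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(labels)
        if anosim_r(dist, labels) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical dissimilarities: samples 0-1 are "fluid", samples 2-3 are "ant".
    dist = [
        [0.0, 0.1, 0.6, 0.7],
        [0.1, 0.0, 0.8, 0.9],
        [0.6, 0.8, 0.0, 0.2],
        [0.7, 0.9, 0.2, 0.0],
    ]
    print(anosim_test(dist, ["fluid", "fluid", "ant", "ant"], n_perm=199, rng=random.Random(5)))
```

R near 1 means the groups are well separated and R near 0 means they overlap, so the magnitude of R matters alongside the P value.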
-
There are no significant differences between the
bacterial communities of pitcher fluid and ant guts.
Ant microbiomes have evolved to facilitate the digestion of
arthropod prey (Hölldobler & Wilson 2000).
Do carnivorous plants co-opt the digestive microbiomes
of their dominant prey items?
-
• LSU Faculty Research Program
• NSF EPSCoR Pfund
• Louisiana Board Of Regents Research Competitiveness Subprogram
• Chancellors' Future Leaders in Research
• Howard Hughes Medical Institute
• NSF (DEB-0918212)
• NSF (DEB-0956069)
Acknowledgements
Sarah Hird
Noah Reid
John McVay
Tara Pelletier
Lowell Urbatch, Gary King, Brent Christner
Danielle Fuselier
Hannah Fullerton
Joey Charboneau
Dan Ence
Jen Carstens
Margaret Koopman
Yi-Hsin Erica Tsai
Amanda Zellmer