molecular characterization of local adaptation of natural flowering dogwood populations (c. florida)...

Post on 14-Apr-2017


Molecular Characterization of Local Adaptation of Natural Flowering Dogwood Populations (C. florida) to Fungal Pathogens and Environmental Stress

Andrew Pais

Advisor: Dr. Qiuyun Jenny Xiang

Department of Plant & Microbial Biology

North Carolina State University

Insect-disease risk

Why the flowering dogwood tree?

• State flower of North Carolina, state tree of Virginia

• Economically valuable in horticulture

• Sales ~ $70 million per year

• Ecologically important calcium pump

Fig. 3. Conceptual model of calcium (Ca) cycling in an eastern U.S. hardwood forest with and without dogwood (Cornus florida) present. Arrow thickness indicates amount of Ca movement and box size indicates size of available Ca pool. Loss of dogwood would resul...

Eric J. Holzmueller, Shibu Jose, Michael A. Jenkins

Ecological consequences of an exotic fungal disease in eastern U.S. hardwood forests

Forest Ecology and Management, Volume 259, Issue 8, 2010, 1347–1353

http://dx.doi.org/10.1016/j.foreco.2010.01.014

Calcium cycle

Study System: Cornus florida L.

Threats

• Dogwood anthracnose (Discula destructiva)

• Powdery mildew (Erysiphe pulchra)

• Drought stress + sun scorch

[Figure: panels labeled "Dogwood disease", dated 1980, with "?" markers]

Adapted from Ennos 2015

Co-evolution under fluctuating environments

Co-evolution involving R and AVR loci in natural populations

Co-evolution involving quantitative genetic resistance to tree pathogens

Pathogen development following initial establishment

Tolerance to tissue invasion

Probability of initial establishment

Pathogen pressure

Co-evolutionary dynamics of trees and pathogens in natural populations

Factors affecting impact of pathogens on individual trees

Disease and genetic consequences

Prior assessments of genetic diversity

• SSR: Hadziabdic et al 2010 and 2012

• Chloroplast: Call et al 2015

Objectives

• I. NC pilot study of adaptive variation

• II. Range-wide study of adaptive variation

• III. Secondary chemical diversity

I. Objectives

• Genetic analyses using GBS to identify genetic markers associated with environmental variables and disease severity.

I. Questions

• Has the species evolved local adaptation as a consequence of environmentally heterogeneous ecological pressures?

• Which SNPs are likely to be candidates under selection?

• Which environmental gradients are most important to genetic divergence and local adaptation of C. florida populations if any?

• What genetic predisposition does C. florida possess to adapt to ongoing climate change in North Carolina?

I. Questions (cont.)

• And how does repeated GBS experimentation influence final results?

Genetic Data - GBS

• Double-digestion of libraries using PstI + MspI (Peterson et al 2012)

• Two libraries of Illumina Hi-Seq 2000 – 100bp paired-end

• 96+85 samples (library one and two)

Study design

Sampling (six populations; 181 individuals)

[Map labels: regions Mountain, Piedmont, Coast; populations SM, PI, DK, UM, NW, CF]

“Whoever the men were who designed the geographical biscuit cutter which sliced out the Old North State, they succeeded so well botanically that one might think of them as possessed with less political sense than vegetational acumen…”

“In a very real sense North Carolina, though lying at right angles to the north-south longitudinal lines, unites Canada and Florida within a little over two-thirds of her length.”

Bertram Whittier Wells (1932), The Natural Gardens of North Carolina

Study design

Rainfall (legend: Wetter, Drier; Lighter, Darker)

Length of Growing Period (LGP) (legend: Longer, Bluer; Shorter, Greener)

Soil Type: Histosols, Ultisols, Inceptisols

Dogwood Anthracnose: occurrence by county

Study design

Data Collection (One Man Army)

• Environmental variables/traits

• Plant phenotype
  o Disease severity
  o Osmotic leaf potential (drought resistance)

• Multi-locus genotyping with GBS approach and Illumina sequencing

Study design

Environmental Data

Field Measurements

• Elevation and coordinates

• Diameter by height

• Canopy coverage over each tree

• Nutrients of soil surface core: P, K, Ca, Mg, S, Na, Mn, Cu, Zn

• %HM, CEC, Ac, pH, W/V, %BS

Climate

• Mean Temp

• Mean Rainfall

• Frost free/growing period

Soil classifications

• Histosols

• Ultisols

• Inceptisols

Phenotype Data

o Disease severity (plant health) - % leaf blotting and branch dieback (Mielke and Langdon 1986)

Score scale: 1 2 3 4 5 (end labels: Healthy, Diseased)

Phenotype Data

o Osmotic leaf potential per tree

• Branch cuttings from field osmotically adjusted in water

• Recorded with Osmometer

• mmol/kg [solute/water] was representative of leaf osmotic potential
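The slides do not say whether osmometer readings were converted to pressure units. As a minimal sketch, assuming the standard van't Hoff relation and a dilute solution (1 mmol/kg of water treated as 1 mol/m^3), an osmolality reading could be turned into an osmotic potential like this. The function name and the 25 °C default are illustrative assumptions, not the author's procedure.

```python
GAS_CONSTANT = 8.314462618  # J mol^-1 K^-1


def osmotic_potential_mpa(osmolality_mmol_per_kg: float, temp_c: float = 25.0) -> float:
    """Approximate osmotic potential (MPa) from osmolality via van't Hoff.

    Assumption: dilute sap, so mmol/kg of water is treated as mol/m^3.
    """
    if osmolality_mmol_per_kg < 0:
        raise ValueError("osmolality must be non-negative")
    temp_k = temp_c + 273.15
    # Psi = -R * T * c, in pascals when c is in mol/m^3
    pascals = -GAS_CONSTANT * temp_k * osmolality_mmol_per_kg
    return pascals / 1e6  # Pa -> MPa


if __name__ == "__main__":
    for reading in (300.0, 500.0, 800.0):  # hypothetical osmometer readings
        print(reading, round(osmotic_potential_mpa(reading), 3))
```

More negative values correspond to more concentrated sap, which is why the raw mmol/kg reading can stand in for osmotic potential when all samples are measured at the same temperature.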

Analytical methods

Gradient forest analysis (GF)

• Utilizes presence-absence of alleles per sample

• Cumulative importance of allele along ecological gradients

• Ellis et al 2012; Fitzpatrick et al 2015

Fst outlier analyses

• Hierarchical Fst-Het. model (Excoffier et al 2010)

• Island model (Foll and Gaggiotti 2008)
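The two cited outlier methods fit explicit demographic models, which are not reproduced here. As a much simpler illustration of the statistic they build on, the sketch below computes a per-locus Fst for a biallelic SNP from population allele frequencies, using expected heterozygosity with equal population weights. The locus names and frequencies are hypothetical.

```python
from typing import Dict, List


def fst_biallelic(freqs: List[float]) -> float:
    """Fst = (Ht - Hs) / Ht for one biallelic locus.

    freqs holds the frequency of one allele in each population.
    Populations are weighted equally (an assumption of this sketch).
    """
    if len(freqs) < 2:
        raise ValueError("need at least two populations")
    p_bar = sum(freqs) / len(freqs)
    h_total = 2 * p_bar * (1 - p_bar)  # expected heterozygosity, pooled
    if h_total == 0:
        return 0.0  # locus is monomorphic everywhere
    h_within = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    return (h_total - h_within) / h_total


def rank_loci(table: Dict[str, List[float]]) -> List[str]:
    """Return locus names sorted from highest to lowest Fst."""
    return sorted(table, key=lambda name: fst_biallelic(table[name]), reverse=True)


if __name__ == "__main__":
    example = {  # hypothetical loci, three populations each
        "locus_a": [0.10, 0.50, 0.90],
        "locus_b": [0.45, 0.50, 0.55],
    }
    print(rank_loci(example))
```

Ranking by raw Fst only orders loci; deciding which ones are outliers is what the model-based tests add, by comparing each locus against a neutral expectation.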

[Figure: cumulative importance of alleles (labeled B195_77, B1219_ 7, B244_51, B977_86) along mean July precipitation (prec7), axis 120 to 170]

Latent factor mixed modeling

• Frichot et al 2013

• Model terms labeled on slide: allele freq. matrix, fixed env. effect, locus effect, K latent factor
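The four labels on this slide match the terms of the latent factor mixed model as it is usually written following Frichot et al 2013. The notation below is an assumption, since the slide's own equation did not survive extraction: G is the allele frequency (genotype) matrix for individual i and locus ℓ, μ the locus effect, β the fixed environmental effect on covariates X, and U and V the scores and loadings of the K latent factors that absorb population structure.

```latex
\[
G_{i\ell} = \mu_\ell + \beta_\ell^{\top} X_i + U_i^{\top} V_\ell + \epsilon_{i\ell},
\qquad U_i, V_\ell \in \mathbb{R}^{K}
\]
```

A locus is flagged as environment-associated when its β term is significantly different from zero after the latent factors have been accounted for.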

Data Analysis

• GBS

• Filtering low quality

• Read assembly

• Filtering missing data

• Identify outlier loci

• Correlation analysis

• General population structure

Data Analysis (Library One and Library Two)

• Identify outlier loci and correlation analysis (x2)

• General population structure

• Gradient forest analysis

• Validation: candidate SNPs + neutral SNPs

Results : Population Structure

Library   Loci   Coverage
One       2983   30.0x
Two       2764   34.6x

Cross-validating outlier results reduces false positives

[Venn diagram: outlier loci from Bayescan, LFMM, and Arlequin for Library 1 and Library 2; annotation: 2+1+7+9+7+1+7+3+2+7]

54 candidate loci (three overlaps)
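The idea of cross-validating outlier calls can be sketched as set arithmetic: a locus is kept as a candidate only if more than one method-by-library run flags it. The exact rule used for the 54 candidates is not stated on the slide, so the `min_support` threshold, the run labels, and the locus IDs below are all illustrative assumptions.

```python
from typing import Dict, Set


def cross_validated_candidates(
    calls: Dict[str, Set[str]], min_support: int = 2
) -> Dict[str, Set[str]]:
    """Keep loci flagged by at least `min_support` runs.

    calls maps a run label (method plus library) to the loci it flagged.
    Returns each retained locus with the set of runs supporting it.
    """
    support: Dict[str, Set[str]] = {}
    for run_label, loci in calls.items():
        for locus in loci:
            support.setdefault(locus, set()).add(run_label)
    return {
        locus: runs for locus, runs in support.items() if len(runs) >= min_support
    }


if __name__ == "__main__":
    calls = {  # hypothetical outlier calls
        "bayescan_lib1": {"L1", "L2"},
        "lfmm_lib1": {"L2", "L3"},
        "arlequin_lib2": {"L2", "L4"},
        "lfmm_lib2": {"L3"},
    }
    for locus, runs in sorted(cross_validated_candidates(calls).items()):
        print(locus, sorted(runs))
```

Requiring agreement between runs trades sensitivity for specificity: loci flagged by a single method in a single library are the ones most likely to be false positives.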

Library one dataset: 54 candidate SNPs + 1171 neutral SNPs

e.g. associating SNPs with PCA scores derived from environmental traits (envPC1, envPC2)

SNP turnover along ecological gradients

Importance of ecological gradients

Temperature Covariates

Library 1: 2983 SNPs

Latent factor mixed modeling shows high numbers of SNPs associated to temperature

I. Conclusions

• Detected divergence in genetic structure corresponding to regionally unique selective pressures

• Identified 54 candidate loci likely to be under selection

• Temp. cov. and soil comp. contributed most to adaptive divergence

• Trees to conserve Alleles to conserve

• Cross-validation of consistent GBS results identified neutral and candidate SNPs while lowering false positives

I. Limitations

• Not all variability of adaptive landscape examined

• Adaptive significance of de novo GBS reads speculative (i.e. function of larger contigs?)

[Figure: PCA of bioclim predictors (Hijmans et al 2005); axes Comp.1 and Comp.2; groups Broader_study and Pilot_study]

I. Limitations cont.

• Disease pressure in NC mountains confounded with abiotic factors

Dogwood anthracnose?

• How to define disease in first place?
  • Prevalence
  • Incidence
  • Occurrence

• No gen. div. decline in diseased sites
  • (SSR- Hadziabdic et al 2010)
  • (Chloroplast- Call et al 2015)

• Disease estimates based on visual observation

• Allelic patterns of resistance genes not observed

II. Objectives

• Characterize the ecological and genomic relationships to disease, controlling for genetic structure.

II. Questions

• Do biogeographic patterns revealed from analysis of GBS markers support patterns from previous population studies of C. florida and related pathogens?

• Has disease in the past three decades influenced the genetic diversity of natural populations of C. florida?

• Where are changes in allele frequencies most abrupt and what ecological gradients are they corresponding to?

Estimating disease occurrence with GBS data

Population Genetics of C. florida

Cross validate to draft genome of C. florida

Draft genome alignment

Related Work

[Figure: powdery mildew plotted against Canopy Health and Frost Free Period; reported fits r2=0.09887, p=2.02E-05 and r2=0.1114, p=3.14E-06]

MLR Model: mean annual prec., latitude, elevation, days after bud break

Foliar microbe alpha diversity (derived from ITS results)

[Figure: powdery mildew operational taxonomic units across NC sites; labels: abundant powdery mildew, prec. mean]

Another estimate of disease occurrence

[Figure: proportion of dogwood anthracnose sequences (axis 0 to 0.0025) for diseased sites vs. healthy sites (*visual categorization)]

Three estimates of disease occurrence

Disease spotted at site?

Disease sequences > 0.1%?

County occurrence of disease?
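The three yes/no questions above can be read as three binary indicators per site. The sketch below encodes them under stated assumptions: the 0.1% threshold comes from the slide, while the record fields, the use of host reads as the denominator, and the strict "greater than" comparison are illustrative choices rather than the author's documented procedure.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SiteEvidence:
    seen_at_site: bool      # disease spotted during field visit
    pathogen_reads: int     # reads assigned to D. destructiva (hypothetical field)
    host_reads: int         # reads assigned to C. florida (hypothetical field)
    county_reported: bool   # county-level occurrence record


def disease_indicators(
    site: SiteEvidence, read_threshold: float = 0.001
) -> Tuple[bool, bool, bool]:
    """Return (visual, sequence-based, county-based) evidence of disease."""
    ratio = site.pathogen_reads / site.host_reads if site.host_reads else 0.0
    return (site.seen_at_site, ratio > read_threshold, site.county_reported)


if __name__ == "__main__":
    site = SiteEvidence(seen_at_site=False, pathogen_reads=30,
                        host_reads=10_000, county_reported=True)
    print(disease_indicators(site))
```

Keeping the three indicators separate, rather than collapsing them immediately, is what allows them to be combined later into a single gradient of evidence.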

Reduce disease categories into disease incidence gradient

Disease spotted at site?

Disease sequences > 0.1%?

County occurrence of disease?

MCA incidence gradient: low evidence of disease to high evidence of disease

Data Analysis

• Inputs: C. florida SNPs; three disease categories; MCA incidence gradient; temp-prec. of month collected; Bioclim data

• General population structure: AMOVA, Fst, discriminant analysis

• Genetic diversity

• GWAS: EMMAX, LFMM

• Gradient forest

Results: Genetic Structure

[Figure: fastSTRUCTURE and DAPC clustering at K2, K3, K4, K5, shown with disease occurrence (FHTET diseased county map) and USFS ecological divisions (labels: Hot Continental, W of MS)]

Results: Genetic Diversity

Few differences in allelic richness

[Figure: rarefied allelic richness + 95% CI; sites in non-diseased county vs. sites in diseased county]

Results: Genetic Diversity

Genetic diversity same between sites with evidence for absence-presence of disease

[Figure: nucleotide diversity (axis 0.116 to 0.134) for the general population and for sites with no evidence vs. evidence of disease occurrence, under three criteria: dogwood anthracnose county occurrence; contaminant sequence threshold for dogwood anthracnose; visual observation of disease at site]

Results: SNPs correlated to disease categories

[Figure: three panels (visual absence-presence of disease at site; D. destructiva : C. florida < or > 0.1%; absence-occurrence of dogwood anthracnose by county), each showing density along discriminant function 1 and a loading plot of contribution by genome position; label: Z>4]

SNP27429_9

Diseased / Healthy

DAPC of disease occurrence-absence

GBS tag 27429_9

Capsicum: subtilisin-like protease

Genome contig 009582F

activating downstream immune signaling processes

Figueiredo et al 2014

STACKS locus-SNP: 27429_9

Tests detected by: [DAPC DDES test], [K2 LFMM]

STACKS locus blastn hit: ref|XM_016719275.1| PREDICTED: Capsicum annuum subtilisin-like protease SBT1.1

Genome scaffold, base-pair position: 009582F, 18512

Proximate blastn hit (+ 3kbp) to genome alignment: ref|XM_010663313.1| PREDICTED: Vitis vinifera subtilisin-like protease

Candidate gene: subtilisin-like protease

Results: Latent factor mixed modelling

Disease associations (y = MCA1 dim)

Cluster model   Significant SNP associations to disease probability gradient (median Z>4)
K2              12
K3              7
K4              2
K5              0

SNP27429_9

Results: Controlling for abiotic covariates

GWA to MCA disease gradient (EMMAX model) controlling for:

No covariate

Precipitation and mean temperature (month of collection)

EnvPC1 and envPC2

Grand Total

1 1 1 61 1 1 61 1 51 1 5

1 41 4

Results: Allelic transitions (Gradient Forest)

[Figure: cumulative importance along longitude (−95 to −75), latitude (30 to 42), and the MCA1 disease gradient (0.0 to 1.5), with numbered markers]

Candidate SNPs for disease resilience-susceptibility

Conclusions

1. Diseased vs. non-affected areas: few differences in genetic diversity

2. Some select SNPs consistently associated with disease occurrence (e.g. subtilisin locus)

3. Contour between hot-continental/disease-affected region and rest of flowering dogwood's range reflected by allele frequency changes along spatial gradients

4. At several loci of interest, certain alleles are fixed in areas of disease occurrence

II. Conundrum

• How will demographic changes affect the species in future?

• i.e. smaller and more isolated populations

• Increasing homozygosity within individual trees

• Less functional precursors available for adaptive potential (screening hypothesis)

II. Conundrum

• Do patterns of phenotypic diversity correlate to disease resistance-susceptibility?

• Do patterns of genetic variation correlate with patterns of phenotypic variation (i.e. secondary metabolites)?

Firn and Jones 2000

Why study natural diversity of plant secondary metabolism (PSM)?

• Small genetic diversity, large diversity in PSM

• Closely tied to ecological functions

• Specialized deterrents to herbivory and disease

• Few studies characterize secondary metabolite diversity in natural populations

Kampranis et al 2007

[Figure labels: Asn, Ile]

Objectives

• I. NC pilot study of adaptive variation

• II. Range-wide study of adaptive variation

• III. Secondary chemical diversity (NC pilot study)

Questions

• Is dogwood disease constrained primarily by abiotic factors?

• Is there evidence from candidate SNPs and biomarkers for local adaptation?

• What is the relationship between genetic diversity and chemodiversity?

• Do healthy plants exhibit greater chemodiversity?

• If so, is higher chemodiversity a product of induced responses to the environment or environment-driven selection?

Untargeted Chemical Profiling

• 50% methanol extracts

• LCMS

• XCMS pipeline

• Filtering/remove isotopic peaks

• Missing data treatment

• Standardization/transformation

Chemical Profiling: PCA by Population (2785 peaks x 171 samples)

Data Analysis

• Random forest

• General genetic-chemical-environmental relationships: group comparison, diversity indices, ordination, discriminant analysis (DAPC), regression

• Biomarker identification: DAPC, support vector machines, DA of partial least squares, GWAS: EMMAX, logistic mixed modelling

• Chemical diversity: regression, logistic mixed modelling, Gaussian graphical modelling

Estimating Chemical Diversity

• Calculated from all features including isotopic peaks (2785)

• Recalculated from predicted network of glucosides

• Indices used as response-predictor

Shannon Index

Evenness Index

Berger-Parker Index

Simpson’s Indices

Health scores rose with indices of chemical diversity of all 2785 peaks

Shannon’s H’: Uncertainty that standardized unit of chromatogram space for leaf sample falls under a particular chromatogram peak among other peaks passing pre-processing thresholds
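The indices named above have standard textbook definitions, sketched below for a single sample's vector of peak intensities. The slides do not say which evenness or Simpson variants were used, so Pielou's J for evenness and both the Simpson dominance and Gini-Simpson forms are assumptions here, as are the function name and the example intensities.

```python
import math
from typing import Dict, Sequence


def diversity_indices(intensities: Sequence[float]) -> Dict[str, float]:
    """Diversity indices from peak intensities of one sample.

    Peaks with zero or negative intensity are treated as absent.
    """
    present = [x for x in intensities if x > 0]
    if not present:
        raise ValueError("sample has no positive peak intensities")
    total = sum(present)
    p = [x / total for x in present]  # relative abundance of each peak
    shannon = -sum(pi * math.log(pi) for pi in p)
    richness = len(p)
    # Pielou's J: observed Shannon entropy relative to its maximum, ln(richness)
    evenness = shannon / math.log(richness) if richness > 1 else 0.0
    dominance = sum(pi * pi for pi in p)  # Simpson's D
    return {
        "shannon": shannon,
        "evenness": evenness,
        "berger_parker": max(p),  # share of the single largest peak
        "simpson_dominance": dominance,
        "gini_simpson": 1.0 - dominance,
    }


if __name__ == "__main__":
    sample = [120.0, 80.0, 40.0, 10.0, 0.0]  # hypothetical peak intensities
    for name, value in diversity_indices(sample).items():
        print(f"{name}: {value:.3f}")
```

A sample dominated by one peak scores high on Berger-Parker and Simpson dominance and low on Shannon and evenness, so the indices capture the richness and evenness contrast that the health comparison relies on.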

Population genetic clustering congruent with population metabolomic clustering

[Figure: DAPC of genetic data and DAPC of 377 chemicals; group labels as extracted: A, CF, PI1, PI2, SM2, SM1, UM, DK, TNC; regions: Piedmont, Mountain, Coastal Plains]

Temp. at collection and prec. of driest month strongly correlated with chemical ordination space

Identification of SNP-chemical association after controlling for abiotic covariates

SNP association to metabolite M435T576

Metabolite M435T576 is important biomarker

Hypothesized glucoside network

Mass properties reveal similar predicted structures

Induced dominance of M435T576 expression increases odds of being healthy

III. Conclusions

• Prec. and temp. covariates strongly correlated with chemical expression

• After controlling for environment, identified SNP-biomarker network likely involved with synthesis/regulation of glucosides

• While chemical data more variable, both clustering patterns of genetic and chemical data similar

• Discovered general trend that increase in chemical richness and evenness linked to healthier trees

• For hypothesized network, upregulation of M435T576 relative to other compounds in network was associated with higher odds of being healthy

Summary

• I. NC pilot study of adaptive variation

• II. Range-wide study of adaptive variation

• III. Secondary chemical diversity

Acknowledgements

Advisor: Dr. Jenny Xiang

Collaborators: Dr. Phil Wadl and Dr. James Leebens-Mack

Lab mates: Xiang Liu, Shihori Obata, Juliet Lindo, Ashley Yow, Will Kohlaway, Andres Qi, Jason Lattier

Committee members and collaborators: Dr. Ross Whetten, Dr. William Hoffman, Dr. Sirius Li, and Dr. Jean Ristaino

Others: Genome Science Laboratories staff, Forest Health Technology Enterprise Team, Shang Xue, Shuping Ruan, North Carolina Forest Services- Health Branch staff (Brian Heath), Dr. Ning Zhang, Dr. Guohong Cai, and Dr. Alex Harkess

Funding sources: Dogwood Genome Project: NSF #: 1444567

Questions?
