

Genome wide association mapping of Sclerotinia sclerotiorum resistance in soybean

with a genotyping by sequencing approach

Maxime Bastien, Humira Sonah and François Belzile*

Département de Phytologie and Institut de Biologie Intégrative et des Systèmes (IBIS),

Université Laval, Quebec City, Quebec, Canada G1V 0A6

Received 4 October 2013.

*Corresponding author: [email protected]


The Plant Genome: Posted 20 Dec. 2013; doi: 10.3835/plantgenome2013.10.0030


Abstract

Sclerotinia stem rot (SSR) is one of the most important diseases in cool soybean growing

regions of the Northeastern United States and Canada. However, the intensity of

infestations varies considerably from year to year according to weather conditions, thus

making it difficult for breeders to select under uniform disease pressure. Selection for

resistance to SSR would be greatly facilitated by the use of molecular markers. In this

work, a collection of 130 lines was inoculated using the cotton pad method and was

genetically characterized using a genotyping-by-sequencing protocol optimized for

soybean. Genome-wide association mapping and linkage disequilibrium (LD) analyses

were performed with 7,864 single nucleotide polymorphisms (SNPs). LD varied

considerably over physical distance, reaching an r² value of 0.2 after 8.5 Mb in the

pericentromeric region and 0.5 Mb in the telomeric region. The mixed linear model

performed very well in accounting for population structure and relatedness, as only 5.5%

of the observed p-values were < 0.05. The strongest association was found on

chromosome Gm15 (p-value = 1.38 × 10⁻⁶; q-value = 0.011). Two additional SNP markers

in the vicinity had a q-value < 0.1. The lead marker was validated in the progeny of a

biparental cross, where F4:6 lines carrying the susceptibility allele developed lesions 17.6

mm longer than lines carrying the resistance allele. Interestingly, other genes

contributing to resistance to pathogens have been reported in this region of Gm15.

Three other association peaks having a q-value < 0.1 were detected on chromosomes

Gm01, Gm19 and Gm20.


Abbreviations: AM, association mapping; FDR, false discovery rate; GBS, genotyping-

by-sequencing; GLM, general linear model; GWAM, genome-wide association mapping;

LD, linkage disequilibrium; LRR, leucine-rich repeat; LSD, least significant difference;

MAF, minor allele frequency; MAS, marker-assisted selection; MLM, mixed linear model;

PC, principal component; PCA, principal component analysis; QTL, quantitative trait loci;

RAD, restriction site associated DNA; RIL, recombinant inbred lines; RRL, reduced

representation libraries; SLAF, specific-locus amplified fragment; SNP, single nucleotide

polymorphism; SSR, sclerotinia stem rot.


Sclerotinia stem rot (SSR) of soybean (Glycine max (L.) Merr.), caused by Sclerotinia

sclerotiorum (Lib.) de Bary, is one of the most important diseases in the northern United

States and Canada. In the United States, it was among the top ten diseases that

affected soybean yield in three of four years between 2006 and 2009, even ranking

second in 2009 (Koenning and Wrather, 2010). The development of SSR is highly

sensitive to fluctuations in humidity and temperature (Boland and Hall, 1987; Phillips,

1994; Workneh and Yang, 2000; Mila and Yang, 2008). Therefore, the disease can

cause considerable damage in soybean one year and be almost absent the following. In

Eastern Canada, and particularly in the province of Quebec, it is the most important

disease in soybean and resistance/susceptibility is assessed in official varietal

registration trials.

Chemical control of SSR is difficult to achieve because several preventive and systemic

treatments are required (Mueller et al., 2004). Biological control agents including

Coniothyrium minitans, Streptomyces lydicus and Trichoderma harzianum are all

effective in reducing the number of sclerotia, with C. minitans being the most

effective (Zeng et al., 2012). However, these products need to be applied yearly to

obtain the best efficacy and will be unnecessary in the years where climatic conditions

are not favorable for disease development. Cultural practices such as crop rotation

(Kurle et al., 2001; Rousseau et al., 2007), reduced tillage (Sutton and Peng, 1993;

Gracia-Garza et al., 2002; Mueller et al., 2002), and wide row spacing (Kurle et al.,

2001; Mila and Yang, 2008) have been reported to reduce the impact of SSR in soybean

fields. The rationale behind these practices is to decrease inoculum and/or maintain

unfavorable conditions for fungal development.


Although no complete resistance to S. sclerotiorum has been described in soybean,

important differences in susceptibility to the pathogen have been reported (Kim et al.,

1999; Hartman et al., 2000; Hoffman et al., 2002; Chen and Wang, 2005; Bastien et al.,

2012). Moreover, in the field, true physiological resistance may be confounded with

escape or avoidance mechanisms (Kim and Diers, 2000; Rousseau et al., 2004). Overall,

the development of cultivars exhibiting enhanced resistance is one of the most effective

means to manage SSR (Grau, 1988; Kurle et al., 2001) and is an important objective of

breeding programs targeted at northern soybean growing areas.

To date, quantitative trait loci (QTLs) for white mold resistance in soybean have been

reported by Kim and Diers (2000), Arahana et al. (2001), Guo et al. (2008), Han et al.

(2008), Vuong et al. (2008), Huynh et al. (2010), Li et al. (2010) and Sebastian et al.

(2010). The mapping populations were generated from crosses between a resistant and

a susceptible parent. In all but one study, it has proven challenging to detect the same

QTLs in different trials. The exception, and the most reproducible work, identified three QTLs on

chromosomes 06 and 20, in three out of four different field trials (Huynh et al., 2010).

Selective phenotyping confirmed the impact of these putative QTLs in four additional

trials. The cotton pad method used in this study measures the length of lesions on the

main stem following inoculation with mycelium. It provides a reproducible measure of a

component of physiological resistance to SSR, namely resistance to infection via floral buds, in

both field and controlled conditions (Bastien et al., 2012).

Such QTL mapping in biparental crosses is limited in terms of the diversity sampled (two

parents per population) and the resolution provided by the low number of recombination


events incurred during population development. An alternative mapping approach,

genome-wide association mapping (GWAM), is gaining popularity to identify genes of

interest in plants. Compared to linkage mapping, it enables the study of many genotypes

at once and, if a sufficient number of markers is used, generates more precise QTL

positions. GWAM has been shown to have the potential to dissect the genetic basis of

complex traits in Arabidopsis (Atwell et al., 2010) and rice (Huang et al., 2012). However,

population structure and kinship present within the association mapping (AM) population

must be taken into consideration to avoid detecting spurious associations. In addition,

because a statistical test is performed between each marker and the trait, traditional p-

value cutoffs of 0.01 or 0.05 have to be made stricter to avoid an abundance of false

positive results.
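
The scale of the multiple-testing problem can be illustrated with a short calculation. The sketch below uses the 7,864 SNPs tested in this study and assumes, for illustration only, that every null hypothesis is true. The function names are hypothetical, and the Bonferroni bound is shown only as a familiar reference point, not as the criterion adopted here.

```python
def expected_false_positives(n_tests, alpha):
    """Expected count of tests with p < alpha if every null hypothesis is true."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha):
    """Per-test cutoff that keeps the family-wise error rate at or below alpha."""
    return alpha / n_tests


n_markers = 7864  # number of SNPs tested in this study

for alpha in (0.05, 0.01):
    print(
        f"alpha={alpha}: about {expected_false_positives(n_markers, alpha):.0f} "
        f"false positives expected; Bonferroni cutoff "
        f"{bonferroni_threshold(n_markers, alpha):.2e}"
    )
```

With thousands of markers, an unadjusted cutoff of 0.05 would be expected to flag several hundred markers by chance alone, which is why a stricter criterion is needed.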

In soybean, GWAM studies using fewer than 200 microsatellite markers have reported

associations to various quantitative traits (Hou et al., 2011; Li et al., 2011; Korir et al.,

2013; Niu et al., 2013; Zuo et al., 2013). Such low marker density inevitably leads to a

low power of detection of QTLs and an imprecise localization of the QTL, thus reducing

their usefulness for marker-assisted selection (MAS). Three other GWAM studies have

been conducted using the GoldenGate assay, a high-throughput analysis method

capable of genotyping 1,536 SNPs on sets of 96 samples (Hyten et al., 2008; 2010b). In

the first study, two populations of advanced soybean breeding lines were screened for

iron deficiency chlorosis tolerance (Mamidi et al., 2011). Using the best model to control

for population structure, 15.5% and 18.7% of marker-trait association p-values were

below 0.05. This obvious inflation of observed p-values relative to the expected p-

values could have led to the reporting of many false positives. The two other GWAM


studies were conducted to identify QTLs associated with chlorophyll, chlorophyll

fluorescence parameters, yield and yield components in soybean landraces (Hao et al.,

2012a; b). Here, no correction for multiple testing was used, potentially leading to the

reporting of many false positive associations.

All of these GWAM studies have relied on relatively small numbers of markers

(hundreds to a thousand). Recent developments in high throughput next-generation

sequencing technologies now offer the opportunity to considerably increase the number

of SNPs in such studies. Although re-sequencing still remains too expensive to routinely

carry out on a large set of genotypes, several methods have been developed that

involve sequencing only a small fraction of the entire genome. Four main complexity

reduction methods have been described to date: Restriction site Associated DNA (RAD)

sequencing, Reduced Representation Libraries (RRL), Specific-Locus Amplified

Fragment (SLAF) sequencing, and Genotyping By Sequencing (GBS). Initially

developed in animals and fungi (Miller et al., 2007; Baird et al., 2008), RAD sequencing

has been applied in barley (Chutimanitsakun et al., 2011) and rapeseed (Bus et al.,

2012) but not yet in soybean. Using the RRL approach in soybean, 1,682, 7,947, 25,047

and 14,550 SNPs were identified in four independent studies (Deschamps et al., 2010;

Wu et al., 2010; Hyten et al., 2010a; Varala et al., 2011). Although highly promising as a

SNP discovery tool, the RRL approach required between 6 and 50 µg of DNA per

sample, a quantity that is not very practical for genotyping on a large number of lines.

The SLAF sequencing approach proposed in soybean (Sun et al., 2013) requires two


digestions, PCR amplification and purification steps, in addition to size selection on gels.

With such complexity, this approach is unlikely to gain much popularity in soybean.

The GBS method described by Elshire et al. (2011) in maize and barley involves a

greatly simplified library production procedure more amenable to use on large numbers

of individuals/lines. There is no size selection step of the digested DNAs, enabling it to

be carried out using small amounts of DNA (100 ng). When adapted to soybean with the

ApeKI enzyme, a total of 10,120 high quality SNPs was discovered among eight diverse

soybean lines (Sonah et al., 2013). Their distribution mirrored closely the distribution of

gene-rich regions in the soybean genome, thus making GBS an attractive approach to

rapidly and efficiently genotype a large number of soybean lines with thousands of SNP

markers.

This paper reports the use of the GBS approach for the identification of QTLs

contributing to SSR resistance in soybean via a GWAM approach. The results suggest

that AM is a useful strategy for dissecting complex traits in soybean, thus providing a

valuable tool to assist in plant breeding. The high number of SNPs also enabled an

evaluation of LD decay over physical distance.


Materials and Methods

Soybean Lines

A panel of 130 soybean lines representative of the diversity present in a private breeding

program (Semences Prograin Inc.) in Eastern Canada covering maturity groups (MG)

000-II was used with the sole exception of Williams 82 (MG III) (see list in Supplemental

Table 1). Among these, three varieties were known to show good levels of resistance to

SSR (Karlo RR, Maple Donovan and S19-90), while three were moderately (OAC

Bayfield and Williams 82) or highly susceptible (Nattosan) (Bastien et al., 2012). Maple

Donovan and Nattosan are commercial cultivars from the Eastern Cereal and Oilseed

Research Centre (Agriculture and Agri-Food Canada, Ottawa, Canada), while S19-90 is

a commercial cultivar from Syngenta Seeds. Williams 82 was obtained from the

American Germplasm Resources Information Network. Karlo RR is a cultivar from

Semences Prograin (St-Césaire, QC, Canada). Seeds of the other lines were obtained

from Semences Prograin.

Validation Populations

Two populations of 192 F4:5 lines segregating for the SNP marker most highly

associated with SSR resistance were used for validation purposes. Population 1 was

generated from the cross PR918827 x PR935401. Both genotypes are advanced

breeding lines from Semences Prograin. PR918827 carries the resistance allele at this

locus while PR935401 carries the susceptibility allele. Population 2 was generated from

the cross PR918827 x Toma. PR918827 carries the resistance allele at this locus while

Toma carries the susceptibility allele.


Phenotypic Evaluation

The association mapping (AM) panel was sown in the greenhouse at Université Laval in

a randomized complete block design comprising four blocks separated in time. Planting

dates were 25 Sept 2009, 6 Nov 2009, 18 Dec 2009 and 29 Jan 2010. The validation

experiments were sown in the greenhouse on 21 Sept 2012 (population 1) and on 2 Nov

2012 (population 2). A randomized complete block design comprising three blocks was

used. For all experiments, experimental units consisted of a total of six plants grown in

pairs in three 6-L pots. The potting mix was made of 50% black earth, 30% perlite and

20% Promix (Premier Tech Horticulture, Rivière-du-Loup, QC, Canada). Seeds were

inoculated with RhizoStick® inoculant (Becker Underwood, Ames, IA) at sowing. Plants

were grown under natural light supplemented with 600 W high-pressure sodium lamps

(P.L. Light Systems, Beamsville, ON, Canada) to provide a 16-h photoperiod. During the

growing period prior to inoculations the day/night temperature was 26/22°C.

Inoculum was prepared from strain NB-5 (provided by Dr. S. Rioux of CEROM, Quebec

City, QC, Canada). Inoculations were performed over several days because of

differences in flowering date. A pot was inoculated when both plants had reached the R1

growth stage. The cotton pad method described in Bastien et al. (2012) was used.

Briefly, S. sclerotiorum grown in potato dextrose broth was homogenized for 30 s in a

Waring blender (New Hartford, CT). Pieces (2.7 x 5.5 cm) of cotton pad [U.S. Cotton

(Canada) Co., Montreal, QC, Canada] were then soaked in the suspension. The

inoculum was applied on the petiole of the lowest node bearing flowers. After inoculation,


plants were transferred to a different greenhouse where day/night temperatures were

22°C/18°C. Humidity was controlled based on water pressure deficit (maintained at 2.5

g/m³ with a fogging system). Lesion length was measured 7 d after inoculation.

DNA Extraction, Library Preparation and Sequencing

DNA was extracted from 50 mg fresh young leaves using the DNeasy 96 Plant kit

(Qiagen, cat. no. 69181) following the manufacturer’s protocol. DNA was quantified

using a Thermo Scientific Nanodrop 8000 spectrophotometer (Wilmington, DE). DNA

concentrations were normalized to 10 ng/µl and subsequently used for library

preparation. Three ApeKI libraries (48-plex) were prepared according to the GBS

protocol described by Elshire et al. (2011). Fourteen genotypes unrelated to this work

were included in one of the three GBS libraries for a total of 144 DNA samples. Single-

end sequencing was performed on three lanes of an Illumina HiSeq2000 (at the McGill

University-Génome Québec Innovation Center in Montreal, QC, Canada).

Processing of Illumina Raw Sequence Read Data and SNP Calling

The pipeline described by Sonah et al. (2013) was used for the processing of Illumina

108-bp reads. A maximum of one third of missing data per marker was tolerated. Finally,

missing information in the filtered set of high-quality SNPs was imputed using

fastPHASE (Scheet and Stevens, 2006).
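
The missing-data criterion can be sketched as a simple filter applied before imputation. The layout below (a dictionary mapping marker names to per-line calls, with None for a missing call) and the function names are assumptions made for illustration. The cited pipeline may store and filter genotypes differently, and treating the one-third limit as inclusive is an interpretation of the text.

```python
def missing_fraction(calls):
    """Fraction of lines with no genotype call (None) at one marker."""
    return sum(call is None for call in calls) / len(calls)


def filter_by_missingness(markers, max_missing=1 / 3):
    """Keep markers whose missing fraction does not exceed max_missing.

    markers: dict of marker name -> list of calls, one per line.
    """
    return {
        name: calls
        for name, calls in markers.items()
        if missing_fraction(calls) <= max_missing
    }


# Hypothetical toy data: three lines, two markers.
toy = {
    "snp_a": ["A", None, "G"],   # 1/3 missing: retained
    "snp_b": [None, None, "T"],  # 2/3 missing: dropped
}
print(sorted(filter_by_missingness(toy)))
```

Imputation (performed with fastPHASE in this study) would then be applied only to the retained markers.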


Genotypic Data Analysis

SNP marker data were imported into TASSEL version 3.0 standalone (Bradbury et al.,

2007). Indels, markers having a minor allele frequency (MAF) below 5% and markers

located on unanchored scaffolds were eliminated from the dataset. The remaining SNP

markers were used for analysis of population structure, linkage disequilibrium and

marker-trait associations.
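
The MAF filter can be sketched as follows. This is not the TASSEL implementation. It assumes homozygous inbred lines scored with one allele per line, biallelic markers, and None for missing calls, and the function names are illustrative.

```python
from collections import Counter


def minor_allele_frequency(calls):
    """Frequency of the second most common allele among non-missing calls.

    Returns 0.0 for monomorphic markers or markers with no calls.
    """
    counts = Counter(call for call in calls if call is not None)
    if len(counts) < 2:
        return 0.0
    ordered = sorted(counts.values(), reverse=True)
    return ordered[1] / sum(ordered)


def passes_maf(calls, min_maf=0.05):
    """True when the marker meets the MAF >= 5% criterion used here."""
    return minor_allele_frequency(calls) >= min_maf


rare = ["A"] * 96 + ["G"] * 4    # MAF 0.04: discarded
common = ["A"] * 95 + ["G"] * 5  # MAF 0.05: retained
print(passes_maf(rare), passes_maf(common))
```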

Linkage Disequilibrium Analysis

Decay of LD between marker loci was assessed using the squared allele frequency

correlation (r²) between pairs of loci located on the same chromosome (Hill and

Robertson, 1968). To take into account the variability of recombination over the genome,

chromosomes were divided into telomeric (low LD) and pericentromeric (high LD)

regions. Borders between these regions were determined based on the mean r² value

for markers located in a 1-Mb sliding window (0.1 Mb increments). Windows with fewer

than 10 marker pairs (i.e. <5 markers) were almost exclusively found in the repeat-rich

pericentromeric regions and were assigned a value of r² = 1. Starting from each end of a

chromosome, a series of 10 consecutive windows with r² = 1 was considered to mark the

beginning of the pericentromeric region. In a few cases, the first high LD zone

encountered was clearly separate from the large body of pericentric heterochromatin

(extended zone of high LD) reported in Schmutz et al. (2010). In such cases the second

“high LD” region was chosen to define the border. The precise boundary between

telomeric and pericentromeric regions was taken to be the midpoint of the first 1-Mb


window. LD plots of all chromosomes are found in Supplementary Fig. 1. A summary of

regions and SNP coverage is shown in Table 1.
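
The r² statistic and the sliding-window rule described above can be sketched as follows. The code assumes biallelic markers coded 0/1 in homozygous inbred lines with no missing calls. The function names are illustrative and this is not the code used to produce Table 1.

```python
from itertools import combinations


def r_squared(locus_a, locus_b):
    """Squared allele-frequency correlation between two biallelic loci."""
    n = len(locus_a)
    p_a = sum(locus_a) / n
    p_b = sum(locus_b) / n
    p_ab = sum(x * y for x, y in zip(locus_a, locus_b)) / n
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both loci must be polymorphic")
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / denom


def window_mean_r2(positions, loci, start, width=1_000_000, min_pairs=10):
    """Mean r² over all marker pairs in [start, start + width).

    Windows with fewer than min_pairs marker pairs are assigned r² = 1,
    following the rule stated in the text.
    """
    in_window = [i for i, pos in enumerate(positions) if start <= pos < start + width]
    pairs = list(combinations(in_window, 2))
    if len(pairs) < min_pairs:
        return 1.0
    return sum(r_squared(loci[i], loci[j]) for i, j in pairs) / len(pairs)


def sliding_window_starts(chrom_length, width=1_000_000, step=100_000):
    """Start coordinates of 1-Mb windows advanced in 0.1-Mb increments."""
    return range(0, max(chrom_length - width, 0) + 1, step)
```

Five markers yield exactly ten pairs, which is why a window with fewer than ten marker pairs is equivalent to one with fewer than five markers.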

LD decay with physical distance was plotted both for the entire dataset and separately

for the telomeric and pericentromeric regions described above. A regression of r²

against distance was performed for all marker pairs using the R script LDit

(http://www.rilab.org/code/files/LDit.html, page verified 25 Sept 2013).
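
One way to read a decay distance off a fitted curve is sketched below. It uses the Hill and Weir (1988) expectation of r² under drift-recombination equilibrium, a form commonly used for LD decay curves. Whether LDit fits exactly this expression is an assumption, as are the parameter values (rho_per_bp, n) and the function names. In practice rho would be estimated by nonlinear regression, a step omitted here.

```python
def expected_r2(distance_bp, rho_per_bp, n):
    """Hill-Weir expected r² at a given distance.

    rho_per_bp: population recombination parameter per bp (assumed already fitted).
    n: number of sampled chromosomes (here taken as the number of inbred lines).
    """
    c = rho_per_bp * distance_bp
    base = (10 + c) / ((2 + c) * (11 + c))
    correction = 1 + ((3 + c) * (12 + 12 * c + c * c)) / (n * (2 + c) * (11 + c))
    return base * correction


def distance_at_threshold(rho_per_bp, n, threshold=0.2, max_bp=1e9, tol_bp=1.0):
    """Distance (bp) at which the expected r² curve crosses the threshold (bisection)."""
    lo, hi = 0.0, max_bp
    if expected_r2(lo, rho_per_bp, n) <= threshold:
        return 0.0
    if expected_r2(hi, rho_per_bp, n) > threshold:
        raise ValueError("curve does not reach the threshold within max_bp")
    while hi - lo > tol_bp:
        mid = (lo + hi) / 2
        if expected_r2(mid, rho_per_bp, n) > threshold:
            lo = mid
        else:
            hi = mid
    return hi


# Hypothetical parameter values, for illustration only.
print(distance_at_threshold(rho_per_bp=1e-5, n=130))
```

Under this model the crossing distance scales inversely with rho, so a tenfold lower recombination parameter implies a tenfold longer decay distance.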

Association Analysis

All marker-trait association tests were run in TASSEL version 3.0 standalone (Bradbury

et al., 2007). A principal component analysis (PCA) was conducted to assess population

structure and a kinship (K) matrix was calculated to estimate familial relatedness

between lines. On the basis of the Scree plot, the first 16 PCs were used to capture

population structure in the association analyses. Four different models were tested: a) a

naïve analysis using only the general linear model (GLM); b) a GLM analysis with

principal components (P) as a cofactor; c) a mixed linear model (MLM) analysis using

the kinship matrix (K) as a cofactor; and d) a MLM analysis in which both population

structure and relatedness (P+K) were used as cofactors (Zhang et al., 2010). Quantile-

quantile plots were produced to assess the extent to which the analysis produced more

significant results than expected by chance.
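
The coordinates of a quantile-quantile plot, and the share of p-values falling below 0.05, can be computed as sketched below. The expected quantiles assume p-values that are uniformly distributed under the null hypothesis. The function names are illustrative and plotting is left out.

```python
import math


def qq_points(p_values):
    """Return (expected, observed) pairs of -log10(p) for a Q-Q plot.

    The k-th smallest of n uniform p-values has expectation k / (n + 1).
    All p-values must be greater than zero.
    """
    n = len(p_values)
    points = []
    for rank, p in enumerate(sorted(p_values), start=1):
        expected = rank / (n + 1)
        points.append((-math.log10(expected), -math.log10(p)))
    return points


def fraction_below(p_values, alpha=0.05):
    """Share of tests with p < alpha; close to alpha when there is no inflation."""
    return sum(p < alpha for p in p_values) / len(p_values)


toy = [0.5, 0.25, 0.75]  # hypothetical p-values lying exactly on the diagonal
print(qq_points(toy), fraction_below([0.01, 0.2, 0.5, 0.9]))
```

Points lying above the diagonal indicate more small p-values than expected by chance, while points below it suggest an overly conservative model.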

The critical values for assessing the significance of marker-trait associations were

calculated using QVALUE (Storey and Tibshirani, 2003). The q-value is a measure of


significance in terms of the false discovery rate, much as the p-value relates to the

false positive rate. This approach limits the number of false positive results, while

offering a more liberal criterion than the Bonferroni correction factor. Marker-trait

associations having a q-value below 0.1 were declared significant.
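
To illustrate how a false discovery rate criterion differs from a Bonferroni cutoff, the sketch below computes Benjamini-Hochberg adjusted p-values. This is a simpler stand-in, not the QVALUE algorithm. The two coincide only when the proportion of true null hypotheses is fixed at 1, whereas QVALUE estimates that proportion from the data. The function name is illustrative.

```python
def bh_adjusted(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity can be enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


toy = [0.01, 0.04, 0.03, 0.5]  # hypothetical p-values
print(bh_adjusted(toy))
```

For the top-ranked marker the adjustment is bounded by p × m. With p = 1.38 × 10⁻⁶ and m = 7,864 tests this gives about 0.0109, in line with the reported q-value of 0.011 if the estimated proportion of true nulls was close to 1. The QVALUE output itself was not recomputed here.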

QTL Validation Experiments

A codominant cleaved amplified polymorphic sequence marker was developed and used

to genotype the candidate SNP on Gm15 in the crosses segregating for this marker. The

locus was amplified with specific primers (5’-TACCAAAATAACTTGTCTTGCAGCTTGG-3’ and

5’-GCGGAGGAGCAAGCAGCTTATATGG-3’) and the resulting amplicon was

digested with ApeKI, with digestion of the amplicon signaling the resistance allele. For

each population, three F4:5 plants per row were genotyped and a line was considered to

have reached fixation at the marker when all three plants were homozygous for the

same allele. Based on this information, 24 F4:5 plants homozygous for the resistance

allele and 24 homozygous for the susceptibility allele were selected, and F4:6 progeny of these 48 plants

were tested for their reaction to SSR as described above. Resistant genotypes Karlo RR

and S19-90 as well as susceptible genotypes Nattosan and OAC Bayfield were used as

checks in these experiments.
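
The scoring logic of the marker assay can be sketched in two parts: locating ApeKI recognition sites (GCWGC, where W is A or T) in a sequence, and applying the three-plant fixation rule. The genotype codes ("R" for a fully digested amplicon, "S" for an undigested amplicon, "H" for a heterozygote showing both patterns), the example sequence and the function names are assumptions made for illustration.

```python
import re

# ApeKI recognition sequence GCWGC; the lookahead also reports overlapping sites.
APEKI_SITE = re.compile(r"(?=GC[AT]GC)")


def apeki_sites(sequence):
    """0-based start positions of ApeKI recognition sites in a DNA sequence."""
    return [match.start() for match in APEKI_SITE.finditer(sequence.upper())]


def fixed_allele(plant_genotypes):
    """Return 'R' or 'S' if all three plants are homozygous for the same allele.

    Returns None when the line is still segregating or the calls disagree.
    """
    calls = set(plant_genotypes)
    if len(plant_genotypes) == 3 and len(calls) == 1 and calls <= {"R", "S"}:
        return plant_genotypes[0]
    return None


print(apeki_sites("ttGCAGCaaGCTGCtt"))  # hypothetical sequence with two sites
print(fixed_allele(["R", "R", "R"]), fixed_allele(["R", "H", "R"]))
```

Requiring all three plants to agree is a conservative rule: any heterozygous plant excludes the line from the validation set.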

Statistical Analysis of Phenotypic Data

An analysis of variance was performed on the phenotypic data using PROC GLM of

SAS (Version 9.3, SAS Institute, Cary, NC) for a randomized complete block design.

Analysis of residuals and PROC UNIVARIATE confirmed the assumptions that


experimental errors were normally distributed around a zero mean, and had a common

variance. Fisher’s protected least significant difference (LSD) at alpha = 0.05 was used

to test the differences among genotypes. Analyses of variance for the QTL validation

experiments were performed using PROC MIXED (SAS Release Version 9.3, SAS

Institute, Cary, NC). Alleles were analyzed as fixed effects, while genotypes and

replications were analyzed as random effects. Genotypes were nested within alleles.


Results

Evaluation of SSR Resistance in a Panel of 130 Soybean Lines

The distribution of lesion lengths among lines in the AM panel is shown in Fig. 1 and the

data for each line are provided in Supplemental Table 1. Lesion lengths covered a very

broad range (29 mm to 192 mm, mean of 114 mm), and the distribution was bell-shaped

with few highly resistant or highly susceptible lines and a large majority exhibiting

intermediate reactions. Resistant checks Karlo RR, S19-90 and Maple Donovan all

developed shorter lesions than the average (29, 44 and 81 mm, respectively). One of

the moderately susceptible checks, Williams 82, ranked near the average (118 mm),

while another, OAC Bayfield, developed longer lesions (160 mm). The highly susceptible

cultivar Nattosan developed the longest lesions among all checks (177 mm).

Marker Distribution

The pooled GBS libraries were sequenced in three lanes of one flow cell, generating

285.9 million reads for the 144 samples pooled in three 48-plex libraries. The 130 lines

belonging to the association panel yielded 2.1 million reads per line on average, for a

total of 266.7 million reads. The number of reads per line varied between 0.5 and 5.8

million. A total of 12,193 SNPs was initially called for this set of lines. A large portion

(35.3%) of these was found to have a MAF < 5% and these SNPs were discarded,

leaving 7,893 SNPs with a MAF ≥ 5% (Fig. 2). Of these, 29 SNPs mapped to scaffolds

that are currently unassigned to a chromosome and the 7,864 that mapped onto one of

the 20 soybean chromosomes were used for the ensuing analyses. In total, these


markers cover 945.5 Mb of the genome. The largest number of SNPs was found on

chromosome 18 (626 SNPs), followed by chromosome 15 (516 SNPs), and the lowest

number of SNPs was observed on chromosomes 11 (173 SNPs) and 12 (248 SNPs).

The distribution of SNPs on each chromosome is presented in Table 1. For the purpose

of refining the analysis of LD, the chromosomes were divided into telomeric and

pericentromeric regions (as described in the Materials and Methods) to properly reflect

the marked differences in marker coverage, recombination and LD in these regions.

Summed over all chromosomes, the telomeric (“low LD”) regions span a total of 400.9

Mb and 5,579 SNPs were detected in this zone, for a coverage of one SNP every 71.9 kb.

In contrast, the pericentromeric (“high LD”) regions span 544.7 Mb and 2,285 SNPs

were detected in this zone, for a coverage of one SNP every 238.4 kb.
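
The marker counts and densities reported above can be re-derived with a few lines of arithmetic. The sketch below uses only figures stated in the text, and the variable and function names are illustrative.

```python
def kb_per_snp(region_mb, n_snps):
    """Average spacing between SNPs, in kb, over a region given in Mb."""
    return region_mb * 1000.0 / n_snps


called, kept_after_maf, unanchored = 12193, 7893, 29

discarded_fraction = (called - kept_after_maf) / called  # share with MAF < 5%
anchored = kept_after_maf - unanchored                   # SNPs on the 20 chromosomes

telomeric = kb_per_snp(400.9, 5579)
pericentromeric = kb_per_snp(544.7, 2285)

print(f"{discarded_fraction:.1%} discarded, {anchored} anchored SNPs")
print(f"one SNP every {telomeric:.1f} kb (telomeric), "
      f"{pericentromeric:.1f} kb (pericentromeric)")
```

Marker density is thus roughly three times higher in the telomeric regions than in the pericentromeric regions.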

Linkage Disequilibrium Analysis

LD analysis was performed using 7,864 SNPs. Decay of LD over physical distance is

presented in Fig. 3. For the entire data set, the regression curve fitted to the LD plot falls

below r² = 0.2 at ~0.85 Mb (Fig. 3a). Within the telomeric regions of the chromosomes,

the regression curve crosses this threshold value at ~0.5 Mb (Fig. 3b). In stark contrast,

in the pericentromeric region, the LD curve falls below r² = 0.2 at ~8.5 Mb (Fig. 3c). As

the mean distance between markers was inflated due to a relatively small number of

marker pairs that were exceptionally distant, the median distance between SNPs was

deemed more representative. Within the telomeric region, the median distance between

markers is 26.7 kb and it is 65.3 kb in the pericentromeric region.


To examine how tightly the alleles at two loci are associated in these two distinct regions

of the genome, we then estimated the median r² between marker pairs separated by

<100 kb within either the pericentromeric or telomeric regions. As can be seen in Fig. 4a,

within the pericentromeric regions, markers remain tightly associated (r² > 0.8) at

distances as high as 21.1 kb, and even at large distances (~100 kb), marker pairs retain

a high level of association (r² > 0.67). In contrast, within the telomeric regions (Fig. 4b),

median r² values fall below 0.8 at only 3.6 kb and below 0.5 at 24.5 kb.
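
A short-range summary of this kind can be produced by binning marker pairs by distance and taking the median r² in each bin. The 10-kb bin width, the input layout (a list of distance and r² pairs) and the function name are assumptions. The binning actually used for Fig. 4 is not stated in the text.

```python
from statistics import median


def median_r2_by_bin(pairs, bin_kb=10.0, max_kb=100.0):
    """Median r² per distance bin for marker pairs closer than max_kb.

    pairs: iterable of (distance_kb, r2) tuples.
    Returns a dict mapping each bin's lower bound (kb) to its median r².
    """
    bins = {}
    for distance_kb, r2 in pairs:
        if distance_kb >= max_kb:
            continue  # only pairs separated by < max_kb are summarized
        bins.setdefault(int(distance_kb // bin_kb), []).append(r2)
    return {index * bin_kb: median(values) for index, values in sorted(bins.items())}


# Hypothetical marker pairs: (distance in kb, r²).
toy_pairs = [(1, 1.0), (5, 0.8), (9, 0.9), (15, 0.4), (18, 0.6), (150, 0.1)]
print(median_r2_by_bin(toy_pairs))
```

The median is used rather than the mean because pairwise r² values are strongly skewed at short distances.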

Population Structure Analysis

A principal component analysis (PCA) was performed on the 130 lines of the AM panel.

Principal component 1 (PC1) explained 5.9% of the variation in the data, while PC2 and

PC3 explained 4.7% and 4.5% of the variation, respectively. When observing these first

three axes of the PCA (Supplementary Fig. 2) we saw no clear grouping among lines,

indicating a low level of population structure. The first 16 PCs used in the association

analyses (as determined based on the Scree plot) captured 42.8% of the variability.
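
The proportion of variation captured by successive principal components follows directly from the eigenvalues of the PCA. The sketch below shows the calculation on hypothetical eigenvalues. The eigenvalues obtained in this study are not given in the text, and the function name is illustrative.

```python
from itertools import accumulate


def explained_variance(eigenvalues):
    """Per-component and cumulative proportions of variance, largest component first."""
    total = sum(eigenvalues)
    proportions = [value / total for value in sorted(eigenvalues, reverse=True)]
    return proportions, list(accumulate(proportions))


# Hypothetical eigenvalues, for illustration only.
proportions, cumulative = explained_variance([4.0, 3.0, 2.0, 1.0])
print(proportions, cumulative)
```

Applied to the percentages reported above, the first three components together account for 5.9 + 4.7 + 4.5 = 15.1% of the variation, so thirteen further components are needed to reach the 42.8% captured by the first 16.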

Association Mapping

The number of significant associations between SNPs and lesion length varied between

the statistical methods tested. The naïve GLM model detected the largest number of

significant associations at q-value < 0.1 (428). This method does not account for any

possible confounding effects that could lead to false positives, which led to an inflation of

the cumulative distribution of observed p-values relative to the expected p-values (Fig. 5). The

GLM model taking into account population structure (PCA) detected 13 significant

associations but still led to an inflation of the cumulative distribution of observed p-values

relative to the expected distribution. The MLM model taking into account familial relatedness (K) did

not detect any significant association. The fact that this model provided fewer significant

results than expected by chance suggests that it may be overly conservative. The mixed

model correcting for both population structure and familial relatedness (PCA+K) yielded

the best fit to the theoretical distribution. In total, 5.5% of marker-trait associations had a

p-value below 5% and this model detected a total of 10 significant marker-trait

associations (q-value < 0.1) defining four genomic regions (Fig. 6 and Table 2).
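
Two quantities in this paragraph lend themselves to a short sketch: the share of tests with a p-value below 0.05 (expected to be close to 5% when the model is well calibrated) and a false-discovery-rate adjustment. This section does not say how the q-values were computed, so the Benjamini-Hochberg adjustment below is used only as a familiar stand-in, and the p-values are hypothetical.

```python
def fraction_below(p_values, alpha=0.05):
    """Share of tests with a p-value below alpha."""
    return sum(p < alpha for p in p_values) / len(p_values)

def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical p-values; not results from the study.
toy_p = [0.01, 0.04, 0.03, 0.5]
print(fraction_below(toy_p), benjamini_hochberg(toy_p))
```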

The most significant region comprises three SNPs covering 312 kb on Gm15. The

marker with the strongest association with lesion length (q-value = 0.011) is located at

position 13,651,235 and explains 14.5% of the variation for SSR resistance. It is the sole

marker that was detected in both the PCA and PCA+K analyses. Genotypes carrying the

minor allele (A) at this locus developed lesions 15.1 mm shorter than genotypes carrying

the major allele (G). The second significant region comprises five consecutive SNPs on

Gm01 between positions 29,185,984 and 31,164,344. They share an identical q-value of

0.040 and explain 7.3% of the variation. Genotypes carrying the minor allele at these

loci developed lesions 5.4 mm shorter than genotypes carrying the major allele. The

next significant region is marked by a single SNP located on Gm20 at position

39,698,515 (q-value = 0.094). It explained 6.3% of the variation for resistance to SSR

and genotypes carrying the minor allele (A) at this locus developed lesions 8.0 mm

shorter than genotypes carrying the major allele (G). The last significant region is also

marked by a single SNP located on Gm19 at position 50,557,054 (q-value = 0.094). It

explained 7.2% of the variation for resistance to SSR. Genotypes carrying the minor

allele (A) at the locus on Gm19 developed lesions 9.9 mm longer than genotypes

carrying the major allele (C). Combined together, these four most significant markers

explained 35.3% of the phenotypic variation.

Validation Experiments

To validate the candidate markers associated with SSR resistance on Gm15, two

populations of F4:5 lines derived from parents contrasted for the peak marker on Gm15

were identified. The phenotypic contrast between parents of population 1 (PR918827 x

PR935401) was large (96.4 mm). In contrast, the parents of population 2 (PR918827 x

Toma) were both considered partially resistant and the phenotypic contrast between

them was small (15.8 mm). For each population, an equal number of F4:6 lines

homozygous either for the resistance or the susceptibility allele (24 each) were

evaluated for SSR resistance under greenhouse conditions (Fig. 7). With the sole

exception of OAC Bayfield, which showed much smaller lesions than expected in the

Population 1 trial, all checks developed lesions of the expected length in both trials.

In Population 1, among the group of lines homozygous for the resistance allele, lesion

length averaged 70.6 mm, whereas it averaged 82.9 mm among those homozygous for

the susceptibility allele. The contrast between the two groups of lines was not significant

(p = 0.170). In population 2, among the group of lines homozygous for the resistance

allele, lesion length averaged 38.6 mm, whereas it was 56.2 mm among those

homozygous for the susceptibility allele. The contrast between the two groups of lines

was significant (p = 0.027).
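
The comparison of two genotypic classes can be illustrated with a simple two-sided permutation test on the difference in mean lesion length. This is a swapped-in technique chosen because it needs only the standard library; the statistical test used by the authors is not described in this section, and the lesion lengths below are hypothetical.

```python
import random

def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation p-value for a difference in group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)  # random reassignment of lines to the two classes
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        if abs(mean_a - mean_b) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_permutations + 1)

# Hypothetical lesion lengths (mm) for two genotypic classes.
resistant = [30, 35, 40, 38, 42, 36]
susceptible = [60, 55, 58, 62, 57, 65]
print(permutation_p_value(resistant, susceptible))
```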

Genomic Landscape Near the Peak SNP on Gm15

To have a broad view of the genomic landscape in the vicinity of the peak SNP on

Gm15, we examined the interval defined by SNP markers having a q-value < 0.2. Six

SNPs have a q-value under this threshold, defining an interval that spans 590 kb

between positions 13,339,206 and 13,929,317. Annotation using the Blast2GO software

(Conesa et al., 2005) revealed that this region harbors 28 predicted genes, including

twelve that have a predicted function related to disease resistance (Fig. 8).
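
The interval size and the share of defense-related genes follow directly from the figures quoted above; the short check below simply redoes that arithmetic (variable names are illustrative).

```python
# Interval bounds on Gm15 and gene counts as quoted in the text.
start_bp, end_bp = 13_339_206, 13_929_317
span_kb = (end_bp - start_bp) / 1000

defense_genes, total_genes = 12, 28
defense_share_pct = 100 * defense_genes / total_genes

print(round(span_kb), round(defense_share_pct, 1))
```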

Discussion

Number of Markers

In this work we performed GWAM with 7,864 SNPs. This is considerably more than in

previous work in soybean, which employed between 55 and 186 microsatellite markers

(Hou et al., 2011; Li et al., 2011; Korir et al., 2013; Niu et al., 2013; Zuo et al., 2013). It is

also a significant increase compared to three GWAM studies where genotyping was

conducted with the GoldenGate Assay. One used 858 and 868 SNPs in two different AM

populations (Mamidi et al., 2011), while the other two studies were conducted with 1,142

SNPs (Hao et al., 2012a; b).

Linkage Disequilibrium

Assessing the decay of LD in an association mapping panel provides an estimate of the

number of markers required to detect QTLs. The method most widely used to describe

LD decay consists of regressing r2 against physical or genetic distance and

finding the intersection of the regression with a set threshold. In this work, we found that

applying this approach to physical distance did not provide an accurate portrayal of LD,

as LD varies considerably across the genome (Comadran et al., 2009; Lee et al., 2013).

We devised a criterion, based on the mean level of LD in sliding windows, to define

borders between the telomeric (low LD) and the pericentromeric (high LD) regions of the

genome. This is similar to the annotation found in the soybean genome browser in

SoyBase in which pericentromeric regions are defined as ones having “near-zero rates

of recombination”, except that the zones defined here relate directly to the situation

encountered in this collection of lines. When we compared the pericentromeric regions

defined in this work with those in SoyBase, on average, there was a ~5% difference in

the portion of the chromosome that was labeled pericentromeric.
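
The regression-and-threshold description of LD decay mentioned in this paragraph can be sketched as follows, assuming a logarithmic model of the form r2 = a + b ln(distance). The function name and the synthetic data are assumptions for illustration; the exact fitting procedure used in the study may differ.

```python
import numpy as np

def ld_decay_distance(distances_kb, r2_values, threshold=0.2):
    """Fit r2 = a + b*ln(distance) by least squares and return the distance
    (in kb) at which the fitted curve crosses the threshold."""
    slope, intercept = np.polyfit(np.log(distances_kb), r2_values, 1)
    if slope >= 0:
        raise ValueError("fitted r2 does not decay with distance")
    return float(np.exp((threshold - intercept) / slope))

# Synthetic points lying exactly on r2 = 0.9 - 0.1*ln(d); not study data.
toy_distances = np.array([1.0, 10.0, 100.0, 1000.0])
toy_r2 = 0.9 - 0.1 * np.log(toy_distances)
print(ld_decay_distance(toy_distances, toy_r2, threshold=0.2))
```

Fitting the model separately within each class of region, rather than genome-wide, is what allows the two very different decay distances to be reported.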

As expected, LD decay varied considerably between the two regions, falling below

r2=0.2 after only 500 kb in the telomeric regions and extending up to 8.5 Mb in the

pericentromeric regions. The genome-wide figure (0.85 Mb) thus provides a relatively

poor description of what are two very distinct situations. Nonetheless, as previous

studies have used such global figures of LD decay, we will use this number to perform

comparisons on a similar basis. In the first work conducted on the question, 74

sequence tagged sites were used to study LD decay in three chromosomal regions

(Hyten et al., 2007). Among elite cultivars, LD reached r2=0.1 after 574 kb in one region

but never reached this threshold in the two other regions spanning 513 and 336 kb. In

another study conducted by Mamidi et al. (2011), it was reported that r2 fell below 0.1 at

7.0 Mb in one collection of lines using 858 SNPs and extended to 5.9 Mb in another

collection of lines using 868 SNPs. These two AM panels were respectively composed

of 141 or 143 advanced breeding lines adapted to the north central states of the United

States. Here, the AM panel comprised 130 cultivars and advanced breeding lines

representing the extent of diversity present within a single private breeding program.

Over all loci, r2 dropped below 0.1 at ~2.8 Mb. This less extensive LD is likely a

reflection of the broader scope of our panel, as it comprised genetically modified,

conventional and food-type soybeans belonging to maturity groups 000 to II. Lastly,

using 1,142 SNPs, Hao et al. (2012b) found that r2 reached 0.1 at only 500 kb in an AM

panel composed of 191 landraces from different geographic origins and with phenotypic

variations. It is not surprising to find such low levels of LD in a very diverse collection of

lines.

Although the number of SNP markers used in this study is much higher than in

previously published work, is this coverage sufficient to successfully detect QTLs?

Based on the study of LD in three small euchromatic chromosomal regions (336 to 574

kb), Hyten et al. (2007) estimated that the number of tag SNPs required to obtain

sufficient SNP coverage at r2=0.8 in elite material ranged between 9,600 and 29,400

markers. In this work, close to 8,000 SNP markers covering the entire genome were

available to examine this question on a larger scale. If we assume that the most

challenging location for detection of a QTL is midway between two flanking markers, we

need to estimate the degree of correlation between these flanking markers and a

hypothetical QTL midway between them. To do this, we estimated the median r2 value

between markers located at half the median distance between marker pairs in our

dataset. For the telomeric regions, the median distance between marker pairs was 26.7

kb and thus, for loci situated 13.3 kb apart, the logarithmic regression shown in Fig. 4

provides an estimate of r2 = 0.60. In other words, given the number and distribution of

SNPs described in this work, we estimate that the degree of correlation between any

QTL and a flanking SNP marker is expected to be greater than 0.6 in telomeric

(euchromatic) regions. Similarly, for the pericentromeric regions (in which the median

distance between markers was found to be 65.3 kb), the median r2 value between loci at

half this distance is estimated to be 0.76. If one uses r2 = 0.8 as a threshold above which

an association analysis would have high power to detect QTLs (Hyten et al., 2007), we

find that the coverage achieved in the telomeric regions is probably sufficient to detect

many QTLs either of moderate to large effect or located in more favorable positions

relative to flanking markers, but that it is still inadequate to provide high QTL detection

power throughout the soybean genome in a panel of this size and composition.

The pericentromeric region spans 544.7 Mb and is covered by 2,285 SNP markers. The

median r2 between a SNP and a QTL remains above 0.8 only at distances below 21.1 kb (Fig. 4a). Thus,

to ensure an even coverage at such high LD levels, a SNP every 42.2 kb, or 12,900

SNPs, would be needed in the pericentromeric region. Similarly, the telomeric region

spans 400.9 Mb and is covered by 5,579 SNP markers. Here, the median r2 between a

SNP and a QTL remains above 0.8 only at distances below 3.6 kb (Fig. 4b). Therefore, on average, SNPs

must be present every 7.2 kb, for a total of 55,700 SNPs over the region. For the entire

genome, a total of 68,600 well-distributed SNPs would thus be needed. However, it is

important to underline that this number is based on the worst-case scenario of a

hypothetical QTL midway between two SNPs. This study has demonstrated that GWAM

can detect QTLs with far fewer SNPs.
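
The marker requirement derived above can be reproduced with a few lines of arithmetic. The sketch assumes, as in the text, that a QTL midway between two SNPs must lie within the distance at which r2 stays above 0.8, so SNP spacing is twice that distance; the function name is illustrative.

```python
def snps_needed(region_mb, max_snp_to_qtl_kb):
    """Number of evenly spaced SNPs needed so that any locus lies within
    max_snp_to_qtl_kb of a marker (spacing = twice that distance)."""
    spacing_kb = 2 * max_snp_to_qtl_kb
    return region_mb * 1000 / spacing_kb

# Region sizes and distances as quoted in the text.
pericentromeric = snps_needed(544.7, 21.1)   # one SNP every 42.2 kb
telomeric = snps_needed(400.9, 3.6)          # one SNP every 7.2 kb

# Round to the nearest hundred, as the text does.
print(round(pericentromeric, -2), round(telomeric, -2),
      round(pericentromeric, -2) + round(telomeric, -2))
```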

Validation of the QTL on Gm15

Although a number of studies have reported QTLs detected via association analyses in

soybean, validation of such candidate QTLs is rare. Here, we assessed the phenotypic

contrast between lines segregating for the peak SNP marker associated with SSR

resistance on Gm15. In the two segregating populations, the magnitude of the difference

in lesion length between the two genotypic classes was similar (12.3 mm in population 1

and 17.6 mm in population 2) and consistent with the allelic effect estimated in the

association analysis (15.1 mm, Table 2). In population 1, this contrast was not

statistically significant (p = 0.170). A plausible explanation for this is the presence of

other segregating QTLs in this cross (R x S; 96.4 mm difference in lesion length) that

could have blurred the allelic contrast at a single locus. In contrast, both parents of

population 2 were partially resistant to SSR (15.8 mm difference in lesion length) and

the observed contrast between genotypic classes (17.6 mm) was significant (p = 0.027)

and fully accounts for the contrast between the parents of this cross. To our

knowledge this is the first report of validation of QTLs detected for SSR resistance in

soybean.

Magnitude of SSR Resistance QTLs and Relevance to Breeding

The quantitative nature of SSR resistance in soybean is clearly supported in the present

study, as the four genomic regions identified each explained between 6.3 and 14.5% of

the variation. With the exception of one QTL on chromosome 6, which explained

between 18.9 and 23.6% of phenotypic variation (Huynh et al., 2010), all other reported

QTLs accounted for 16% or less of the variation in SSR resistance (Kim and Diers, 2000; Arahana et al.,

2001; Guo et al., 2008; Han et al., 2008; Vuong et al., 2008; Huynh et al., 2010;

Sebastian et al., 2010). Combined together, the four candidate QTLs accounted for

35.3% of the variation in SSR resistance, leaving a large proportion of the variation

unexplained. In simulations, QTL detection power was demonstrated to increase with

both heritability and population size (Bradbury et al., 2011). As our association panel

(130 lines) is at the lower end of tested population sizes (100-300 lines), we would

expect this to constitute a limitation. Finally, the R2 value is influenced by the LD

between the marker and the QTL and by the marker allele frequency. R2 is greatest

when LD is near 1 and the allele frequency is 0.5. Both conditions are likely to be fulfilled

in a bi-parental population but neither is likely to be the case in an AM population.
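
The dependence of R2 on LD and allele frequency can be made concrete with a simple single-QTL model. The sketch assumes fully homozygous lines, one biallelic QTL with a fixed difference between the two homozygous classes, and a marker that affects the phenotype only through its LD (r2) with that QTL; under these assumptions the marker captures r2 times the QTL's share of the phenotypic variance. All numeric values are hypothetical.

```python
def marker_r2(ld_r2, allele_freq, allelic_effect, residual_variance):
    """Expected share of phenotypic variance captured by a marker in LD with
    a single biallelic QTL, for fully homozygous (inbred) lines."""
    # Variance contributed by the QTL: p(1 - p) times the squared difference
    # between the two homozygous classes.
    qtl_variance = allele_freq * (1.0 - allele_freq) * allelic_effect ** 2
    return ld_r2 * qtl_variance / (qtl_variance + residual_variance)

# Hypothetical effect (10 mm) and residual variance (75 mm^2).
print(marker_r2(1.0, 0.5, 10.0, 75.0))   # complete LD, balanced alleles
print(marker_r2(0.5, 0.5, 10.0, 75.0))   # weaker LD
print(marker_r2(1.0, 0.1, 10.0, 75.0))   # rare allele
```

In this model the captured variance shrinks both when LD is incomplete and when the allele is rare, which is the situation expected in an association panel.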

Although the number of markers used here is greater than in most AM studies

published in soybean to this day, several thousand more SNP markers would be

required to ensure complete genome coverage. The impact of marker coverage was

examined in a recent work where the genomes of 226 accessions of the model legume

Medicago truncatula were sequenced, generating over 6 million SNPs that were used

to perform GWAM for several traits (Stanton-Geddes et al., 2013). In parallel, GWAM for

these traits was conducted with an in silico 250K SNP array. The comparison of AM

results revealed that candidates identified using the in silico arrays were often distant

from the top sequence-based candidates and highly biased towards common variants.

This implies that some QTLs for SSR resistance carried by rare variants may have gone

undetected because of insufficient marker coverage, and that the exact location of

candidate QTL will still need to be refined. However, from a breeder’s standpoint, the

mapping precision of QTLs achieved in this study is more than sufficient to allow their

exploitation in marker-assisted breeding.

Pyramiding QTLs identified in this study could potentially result in increased resistance,

as the most resistant lines among the association panel were not fixed for the resistance

alleles at all QTLs. For instance, Karlo RR, the most highly resistant line in the panel,

possesses the resistance allele for QTLs on Gm01 and Gm19 but the susceptibility

allele at QTLs on Gm15 and Gm20. Thus, through the crossing of complementary lines

and the use of marker-assisted selection, the identification of breeding lines combining

increased levels of resistance to SSR should be possible. However, the simultaneous

handling of many small-effect QTLs for several characters within a breeding program via

marker-assisted selection could prove challenging. The development of genomic selection

schemes could provide an alternative way to achieve this goal (Poland et al., 2012).
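
Choosing complementary parents for pyramiding amounts to finding pairs of lines whose resistance alleles jointly cover all four QTLs. In the minimal sketch below, the entry for Karlo RR follows the text, while "Line X" and "Line Y" and their genotypes are hypothetical placeholders.

```python
from itertools import combinations

QTLS = ("Gm01", "Gm15", "Gm19", "Gm20")

def complementary_pairs(resistance_alleles):
    """Return pairs of lines that together carry the resistance allele
    at every QTL in QTLS."""
    pairs = []
    names = sorted(resistance_alleles)
    for first, second in combinations(names, 2):
        combined = set(resistance_alleles[first]) | set(resistance_alleles[second])
        if set(QTLS) <= combined:
            pairs.append((first, second))
    return pairs

lines = {
    "Karlo RR": {"Gm01", "Gm19"},   # as described in the text
    "Line X": {"Gm15", "Gm20"},     # hypothetical line
    "Line Y": {"Gm01", "Gm15"},     # hypothetical line
}
print(complementary_pairs(lines))
```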

Comparison with Previously Reported QTLs for SSR Resistance

Some of the candidate QTLs found in this work are located on chromosomes where

other SSR resistance QTLs have been found. The three QTLs previously reported on

Gm19 (Arahana et al., 2001; Han et al., 2008; Sebastian et al., 2010), however, map to

regions distinct from that reported here. QTLs on chromosomes 1 and 20 were also

reported (Arahana et al., 2001; Li et al., 2010), but these again map to intervals distinct

from those found in this study. Three QTLs for SSR resistance were described on chromosome 15. The

first encompasses a 44.7 Mb interval that includes our significant peak (Guo et al., 2008).

The second is located between microsatellite satt231 (position 50.5 Mb) and a restriction

fragment length polymorphism 18 cM upstream (Han et al., 2008). Both intervals are too

large to state if they detected the same QTL as ours. The third QTL was found near a

RAPD marker (OP_m12b), which is located between Satt720 (position 4.1 Mb) and

BARC-014271-01299 (position 14.0 Mb) (Arahana et al., 2001). This interval includes

the significant peak detected here. The resistance allele came from the susceptible

parent Williams 82 and the susceptibility allele from the resistant parent S19-90. In our

work the resistance allele is carried by S19-90, which suggests that the two QTLs are

distinct. Overall, this is the first report of association with SSR resistance in soybean in

these regions of chromosomes 1, 19 and 20.

Genomic Landscape Near the Association Peak on Gm15

Markers having a q-value < 0.2 on Gm15 define a 590 kb zone between positions

13,339,206 and 13,929,317 where twelve of the 28 predicted genes (42.9%) have a

predicted function related to disease resistance (Fig. 8). Within this zone there is a 260

kb gap between the peak SNP at position 13,651,235 and another SNP at position

13,911,191 (Fig. 9). This gap could have been caused by the presence of eight

consecutive predicted serine/threonine protein kinase genes spanning 83.5 kb. In

soybean, gene-rich regions that harbor clustered multigene families such as nucleotide-

binding and receptor-like protein classes have been shown to be the most enriched for

structural variation (McHale et al., 2012). Furthermore, evidence of SNP associations in

or adjacent to serine/threonine protein kinase genes with quantitative resistance

against fungal pathogens has been presented in maize (Kump et al., 2011; Poland et al.,

2011; Wang et al., 2012b). In soybean, genes encoding serine/threonine protein kinases

were shown to be involved in the reaction to infection with soybean rust caused by

Phakopsora pachyrhizi (Tremblay et al., 2011) and were suggested to be associated

with the higher level of partial resistance to Phytophthora sojae (Wang et al., 2012a).

Two other predicted proteins contain leucine-rich repeat (LRR) domains, a class of

genes involved in partial resistance to fungal pathogens in maize (Kump et al., 2011;

Poland et al., 2011). The recognition ability of polygalacturonase-inhibiting proteins,

extracellular plant proteins capable of inhibiting fungal endopolygalacturonases, resides

in their LRR structure, where solvent-exposed residues in the β-strand/β-turn motifs of

the LRRs are determinants of specificity (De Lorenzo et al., 2001). In the model legume

Medicago truncatula, LRR genes are significantly overrepresented in high-recombination

regions (Paape et al., 2012). Here the genes coding for the two putative LRR

proteins flank the eight consecutive predicted serine/threonine protein kinase genes. In

soybean, LRR-containing genes tend to co-localize with disease resistance QTLs (Hayes

et al., 2004; Valdes-Lopez et al., 2011; Kang et al., 2012), notably in a QTL linked to

partial resistance to Phytophthora sojae (Jeong et al., 2001; Wang et al., 2010; Wang et

al., 2012a). Finally, two predicted genes contain ankyrin repeat domains, a feature of a

large family whose members are involved in a number of physiological and

developmental functions that include responses to biotic and abiotic stresses (Cao et al.,

1997; Yan et al., 2002; Yang et al., 2012). Ankyrin-containing proteins have been

reported to be associated with quantitative resistance to various pathogens in maize

(Kump et al., 2011) and rice (Mou et al., 2013). The overexpression of the rice gene

OsBIANK1 in Arabidopsis plants was shown to increase disease resistance to Botrytis

cinerea, a pathogen closely related to S. sclerotiorum (Li et al., 2013). Taken together,

these lines of evidence strongly support the presence of a disease resistance gene cluster

near the association peak on Gm15.

Conclusions

The dense marker coverage obtained through GBS enabled us to more finely describe

LD over the entire genome and to conduct GWAM at an unprecedented resolution in

soybean. We have found that LD persists over much longer distances in the

pericentromeric region than in the telomeric region and that this information must be

considered when assessing marker coverage adequacy. We identified four regions

associated with SSR resistance and one of these was validated in a bi-parental

population. These QTLs can potentially be pyramided in new soybean cultivars to

achieve improved SSR resistance through marker-assisted or genomic selection.

Acknowledgments

We thank Semences Prograin Inc. and the Natural Sciences and Engineering Research

Council of Canada for providing the financial support for this research. We would like to

extend our gratitude to Denis Marois and Martin Lacroix for their technical assistance.

References

Arahana, V.S., G.L. Graef, J.E. Specht, J.R. Steadman and K.M. Eskridge. 2001.

Identification of QTLs for resistance to Sclerotinia sclerotiorum in soybean. Crop Sci.

41:180-188. doi:10.2135/cropsci2001.411180x

Atwell, S., Y.S. Huang, B.J. Vilhjalmsson, G. Willems, M. Horton, et al. 2010. Genome-

wide association study of 107 phenotypes in Arabidopsis thaliana inbred lines. Nature

465(7298):627-631. doi:10.1038/nature08800

Baird, N.A., P.D. Etter, T.S. Atwood, M.C. Currey, A.L. Shiver, et al. 2008. Rapid SNP

discovery and genetic mapping using sequenced RAD markers. PLoS One 3(10):e3376.

doi:10.1371/journal.pone.0003376

Bastien, M., T.T. Huynh, G. Giroux, E. Iquira, S. Rioux, and F. Belzile. 2012. A

reproducible assay for measuring partial resistance to Sclerotinia sclerotiorum in

soybean. Can. J. Plant Sci. 92:279-288. doi:10.4141/CJPS2011-101

Boland, G.J., and R. Hall. 1987. Evaluating soybean cultivars for resistance to

Sclerotinia sclerotiorum under field conditions. Plant Dis. 71:934-936.

Bradbury, P.J., Z. Zhang, D.E. Kroon, T.M. Casstevens, Y. Ramdoss, and E.S. Buckler.

2007. TASSEL: software for association mapping of complex traits in diverse samples.

Bioinformatics 23(19):2633-2635. doi:10.1093/bioinformatics/btm308

Bradbury, P., T. Parker, M. Hamblin, and J.-L. Jannink. 2011. Assessment of power and

false discovery rate in genome-wide association studies using the BarleyCAP germplasm.

Crop Sci. 51:52-59. doi:10.2135/cropsci2010.02.0064

Bus, A., J. Hecht, B. Huettel, R. Reinhardt, and B. Stich. 2012. High-throughput

polymorphism detection and genotyping in Brassica napus using next-generation RAD

sequencing. BMC Genomics 13:281. doi:10.1186/1471-2164-13-281

Cao, H., J. Glazebrook, J.D. Clarke, S. Volko, and X. Dong. 1997. The Arabidopsis

NPR1 gene that controls systemic acquired resistance encodes a novel protein

containing ankyrin repeats. Cell 88(1):57-63. doi:10.1016/S0092-8674(00)81858-9

Chen, Y., and D. Wang. 2005. Two convenient methods to evaluate soybean for

resistance to Sclerotinia sclerotiorum. Plant Dis. 89:1268-1272. doi:10.1094/PD-89-1268

Chutimanitsakun, Y., R.W. Nipper, A. Cuesta-Marcos, L. Cistué, A. Corey, et al. 2011.

Construction and application for QTL analysis of a Restriction site Associated DNA

(RAD) linkage map in barley. BMC Genomics 12:4. doi:10.1186/1471-2164-12-4

Comadran, J., W. Thomas, F. van Eeuwijk, S. Ceccarelli, S. Grando, et al. 2009.

Patterns of genetic diversity and linkage disequilibrium in a highly structured Hordeum

vulgare association-mapping population for the Mediterranean basin. Theor. Appl. Genet.

119(1):175-187. doi:10.1007/s00122-009-1027-0

Conesa, A., S. Götz, J.M. Garcia-Gomez, J. Terol, M. Talon, and M. Robles. 2005.

Blast2GO: a universal tool for annotation, visualization and analysis in functional

genomics research. Bioinformatics 21:3674-3676. doi:10.1093/bioinformatics/bti610

De Lorenzo, G., R. D’Ovidio, and F. Cervone. 2001. The role of polygalacturonase-

inhibiting proteins (PGIPs) in defense against pathogenic fungi. Annu. Rev. Phytopathol.

39:313-335. doi:10.1146/annurev.phyto.39.1.313

Deschamps, S., M. la Rota, J.P. Ratashak, P. Biddle, D. Thureen, et al. 2010. Rapid

genome-wide single nucleotide polymorphism discovery in soybean and rice via deep

resequencing of reduced representation libraries with the Illumina genome analyzer.

Plant Gen. 3(1):53-68. doi:10.3835/plantgenome2009.09.0026

Elshire, R.J., J.C. Glaubitz, Q. Sun, J.A. Poland, K. Kawamoto, et al. 2011. A robust,

simple Genotyping-by-Sequencing (GBS) approach for high diversity species. PLoS

ONE 6(5):e19379. doi:10.1371/journal.pone.0019379

Gracia-Garza, J.A., S. Neumann, T.J. Vyn, and G.J. Boland. 2002. Influence of crop

rotation and tillage on production of apothecia by Sclerotinia sclerotiorum. Can. J. Plant

Pathol. 24:137-143. doi:10.1080/07060660309506988

Grau, C.R. 1988. Sclerotinia stem rot of soybean. In: T.D. Wyllie and D.H. Scott editors,

Soybean diseases of the North Central region. APS, St. Paul, MN. p. 56-149.

Guo, X., D. Wang, S.G. Gordon, E. Helliwell, T. Smith, et al. 2008. Genetic mapping of

QTLs underlying partial resistance to Sclerotinia sclerotiorum in soybean PI 391589A

and PI 391589B. Crop Sci. 48:1129-1139. doi:10.2135/cropsci2007.04.0198

Han, F., M. Katt, W. Schuh, and D.M. Webb. 2008. QTL controlling Sclerotinia stem rot

resistance in soybean. U.S. Patent 7,250,552. Date issued: 18 September.

Hao, D., M. Chao, Z. Yin, and D. Yu. 2012a. Genome-wide association analysis

detecting significant single nucleotide polymorphisms for chlorophyll and chlorophyll

fluorescence parameters in soybean (Glycine max) landraces. Euphytica 186:919-931.

doi:10.1007/s10681-012-0697-x

Hao, D., H. Cheng, Z. Yin, S. Cui, D. Zhang, et al. 2012b. Identification of single

nucleotide polymorphisms and haplotypes associated with yield and yield components in

soybean (Glycine max) landraces across multiple environments. Theor. Appl. Genet.

124:447-458. doi:10.1007/s00122-011-1719-0

Hartman, G.L., M.E. Gardner, T. Hymowitz, and G.C. Naidoo. 2000. Evaluation of

perennial Glycine species for resistance to soybean fungal pathogens that cause

Sclerotinia stem rot and sudden death syndrome. Crop Sci. 40:545-549.

doi:10.2135/cropsci2000.402545x

Hayes, A.J., S.C. Jeong, M.A. Gore, Y.G. Yu, G.R. Buss, et al. 2004. Recombination

within a nucleotide-binding-site/leucine-rich-repeat gene cluster produces new variants

conditioning resistance to soybean mosaic virus in soybeans. Genetics 166(1):493-503.

doi:10.1534/genetics.166.1.493

Hill, W.G., and A. Robertson. 1968. Linkage disequilibrium in finite populations. Theor.

Appl. Genet. 38:226-231.

Hoffman, D.D., B.W. Diers, G.L. Hartman, C.D. Nickell, R.L. Nelson, et al. 2002.

Selected soybean plant introductions with partial resistance to Sclerotinia sclerotiorum.

Plant Dis. 86:971-980. doi:10.1094/PDIS.2002.86.9.971

Hou, J., C. Wang, X. Hong, J. Zhao, C. Xue, et al. 2011. Association analysis of

vegetable soybean quality traits with SSR markers. Plant Breed. 130(4):444-449.

doi:10.1111/j.1439-0523.2011.01852.x

Huang, X., Y. Zhao, X. Wei, C. Li, A. Wang, et al. 2012. Genome-wide association study

of flowering time and grain yield traits in a worldwide collection of rice germplasm.

Nature Genet. 44(1):32-39. doi:10.1038/ng.1018


Huynh, T.T., M. Bastien, E. Iquira, P. Turcotte, and F. Belzile. 2010. Identification of

QTLs associated with partial resistance to white mold in soybean using field-based

inoculation. Crop Sci. 50(3):969-979. doi:10.2135/cropsci2009.06.0311

Hyten, D.L., I.Y. Choi, Q.J. Song, R.C. Shoemaker, R.L. Nelson, et al. 2007. Highly

variable patterns of linkage disequilibrium in multiple soybean populations. Genetics

175:1937-1944. doi:10.1534/genetics.106.069740

Hyten, D.L., Q. Song, I.-Y. Choi, M.-S. Yoon, J.E. Specht, et al. 2008. High-throughput

genotyping with the GoldenGate assay in the complex genome of soybean. Theor. Appl.

Genet. 116(7):945-952. doi:10.1007/s00122-008-0726-2

Hyten, D.L., S.B. Cannon, Q. Song, N. Weeks, E.W. Fickus, et al. 2010a. High-

throughput SNP discovery through deep resequencing of a reduced representation

library to anchor and orient scaffolds in the soybean whole genome sequence. BMC

Genomics 11:38. doi:10.1186/1471-2164-11-38

Hyten, D.L., I.Y. Choi, Q. Song, J.E. Specht, T.E. Carter, et al. 2010b. A high density

integrated genetic linkage map of soybean and the development of a 1536 Universal

Soy Linkage Panel for quantitative trait locus mapping. Crop Sci. 50:960-968.

doi:10.2135/cropsci2009.06.0360


Jeong, S.C., A.J. Hayes, R.M. Biyashev, and M.A. Saghai Maroof. 2001. Diversity and

evolution of a non-TIR-NBS sequence family that clusters to a chromosomal “hotspot”

for disease resistance genes in soybean. Theor. Appl. Genet. 103:406-414.

doi:10.1007/s001220100567

Kang, Y.J., K.H. Kim, S. Shim, M.Y. Yoon, S. Sun, et al. 2012. Genome-wide mapping

of NBS-LRR genes and their association with disease resistance in soybean. BMC Plant

Biol. 12:139. doi:10.1186/1471-2229-12-139

Kim, H.S., C.H. Sneller, and B.W. Diers. 1999. Evaluation of soybean cultivars for

resistance to Sclerotinia stem rot in field environments. Crop Sci. 39:64-68.

doi:10.2135/cropsci1999.0011183X003900010010x

Kim, H.S., and B.W. Diers. 2000. Inheritance of partial resistance to Sclerotinia stem rot

in soybean. Crop Sci. 40:55-61. doi:10.2135/cropsci2000.40155x

Koenning, S.R., and J.A. Wrather. 2010. Suppression of soybean yield potential in the

continental United States by plant diseases from 2006 to 2009. Plant Health Progress.

doi:10.1094/PHP-2010-1122-01-RS

Korir, P.C., J. Zhang, K. Wu, T. Zhao, and J. Gai. 2013. Association mapping combined

with linkage analysis for aluminum tolerance among soybean cultivars released in

Yellow and Changjiang river valleys in China. Theor. Appl. Genet. 126:1659–1675.

doi:10.1007/s00122-013-2082-0


Kump, K.L., P.J. Bradbury, R.J. Wisser, E.S. Buckler, A.R. Belcher, et al. 2011.

Genome-wide association study of quantitative resistance to southern leaf blight in the

maize nested association mapping population. Nature Genet. 43(2):163-169.

doi:10.1038/ng.747

Kurle, J.E., C.R. Grau, E.S. Oplinger, and A. Mengistu. 2001. Tillage, crop sequence,

and cultivar effects on Sclerotinia stem rot incidence and yield in soybean. Agron. J.

93:973-982. doi:10.2134/agronj2001.935973x

Lee, W.K., N. Kim, J. Kim, J.-K. Moon, N. Jeong, et al. 2013. Dynamic genetic features of

chromosomes revealed by comparison of soybean genetic and sequence-based

physical maps. Theor. Appl. Genet. 126:1103-1119. doi:10.1007/s00122-012-2039-8

Li, D., M. Sun, Y. Han, W. Teng, and W. Li. 2010. Identification of QTL underlying

soluble pigment content in soybean stems related to resistance to soybean white mold

(Sclerotinia sclerotiorum). Euphytica 172(1):49-57. doi:10.1007/s10681-009-0036-z

Li, D., F. Wang, B. Liu, Y. Zhang, L. Huang, et al. 2013. Ectopic expression of rice

OsBIANK1, encoding an ankyrin repeat-containing protein, in Arabidopsis confers

enhanced disease resistance to Botrytis cinerea and Pseudomonas syringae. J.

Phytopathol. 161:27-34. doi:10.1111/jph.12023


Li, Y.-H., M.J.M. Smulders, R.-Z. Chang, and L.-J. Qiu. 2011. Genetic diversity and

association mapping in a collection of selected Chinese soybean accessions based on

SSR marker analysis. Conserv. Genet. 12:1145-1157. doi:10.1007/s10592-011-0216-y

Mamidi, S., S. Chikara, R.J. Goos, D.L. Hyten, D. Annam, et al. 2011. Genome-wide

association analysis identifies candidate genes associated with iron deficiency chlorosis

in soybean. Plant Gen. 4(3):154-164. doi:10.3835/plantgenome2011.04.0011

McHale, L.K., W.J. Haun, W.W. Xu, P.B. Bhaskar, J.E. Anderson, et al. 2012. Structural

variants in the soybean genome localize to clusters of biotic stress-response genes.

Plant Physiol. 159(4):1295-1308. doi:10.1104/pp.112.194605

Mila, A.L., and X.B. Yang. 2008. Effects of fluctuating soil temperature and water

potential on sclerotia germination and apothecial production of Sclerotinia sclerotiorum.

Plant Dis. 92:78-82. doi:10.1094/PDIS-92-1-0078

Miller, M.R., J.P. Dunham, A. Amores, W.A. Cresko, and E.A. Johnson. 2007. Rapid and

cost-effective polymorphism identification and genotyping using restriction site

associated DNA (RAD) markers. Genome Res. 17(2):240-248. doi:10.1101/gr.5681207

Mou, S., Z. Liu, D. Guan, A. Qiu, Y. Lai, and S. He. 2013. Functional analysis and

expressional characterization of rice ankyrin repeat-containing protein, OsPIANK1, in

basal defense against Magnaporthe oryzae attack. PLoS ONE 8(3):e59699.

doi:10.1371/journal.pone.0059699


Mueller, D.S., A.E. Dorrance, R.C. Derksen, E. Ozkan, J.E. Kurle, et al. 2002. Efficacy of

fungicides on Sclerotinia sclerotiorum and their potential for control of Sclerotinia stem

rot on soybean. Plant Dis. 86:26-31. doi:10.1094/PDIS.2002.86.1.26

Mueller, D.S., C.A. Bradley, C.R. Grau, J.M. Gaska, J.E. Kurle, and W.L. Pedersen.

2004. Application of thiophanate-methyl at different host growth stages for management

of Sclerotinia stem rot in soybean. Crop Prot. 23:983-988.

doi:10.1016/j.cropro.2004.02.013

Niu, Y., Y. Xu, X.F. Liu, S.X. Yang, S.P. Wei, et al. 2013. Association mapping for seed

size and shape traits in soybean cultivars. Mol. Breeding 31:785–794.

doi:10.1007/s11032-012-9833-5

Paape, T., P. Zhou, A. Branca, R. Briskine, N. Young, and P. Tiffin. 2012. Fine-scale

population recombination rates, hotspots, and correlates of recombination in the

Medicago truncatula genome. Genome Biol. Evol. 4(5):726–737.

doi:10.1093/gbe/evs046

Phillips, A.J.L. 1994. Influence of fluctuating temperatures and interrupted periods of

plant surface wetness on infection of bean leaves by ascospores of Sclerotinia

sclerotiorum. Ann. Appl. Biol. 124:413-427.


Poland, J.A., P.J. Bradbury, E.S. Buckler, and R.J. Nelson. 2011. Genome-wide nested

association mapping of quantitative resistance to northern leaf blight in maize. P. Natl.

Acad. Sci. USA 108(17):6893-6898. doi:10.1073/pnas.1010894108

Poland, J., J. Endelman, J. Dawson, J. Rutkoski, S. Wu, et al. 2012. Genomic selection

in wheat breeding using genotyping-by-sequencing. Plant Gen. 5(3):103-113.

doi:10.3835/plantgenome2012.06.0006

Rousseau, G., T.T. Huynh, D. Dostaler, and S. Rioux. 2004. Greenhouse and field

assessments of resistance in soybean inoculated with sclerotia, mycelium, and

ascospores of Sclerotinia sclerotiorum. Can. J. Plant Sci. 84:615-623. doi:10.4141/P03-

003

Rousseau, G.X., D. Dostaler, and S. Rioux. 2007. Effect of crop rotation and soil

amendments on Sclerotinia stem rot on soybean in two soils. Can. J. Plant Sci. 87:605-

614. doi:10.4141/P05-137

Scheet, P., and M. Stephens. 2006. A fast and flexible statistical model for large-scale

population genotype data: Applications to inferring missing genotypes and haplotypic

phase. Am. J. Hum. Genet. 78:629-644. doi:0002-9297/2006/7804-0010

Schmutz, J., S.B. Cannon, J. Schlueter, J. Ma, T. Mitros, et al. 2010. Genome sequence

of the palaeopolyploid soybean. Nature 463(7278):178-183. doi:10.1038/nature08670


Sebastian, S.A., H. Lu, F. Han, D. Kyle, B.R. Hedges, et al. 2010. Genetic loci

associated with Sclerotinia tolerance in soybean. U.S. Patent 7,790,949 B2. Date issued:

7 September.

Sonah, H., M. Bastien, E. Iquira, A. Tardivel, G. Légaré, et al. 2013. An improved

genotyping by sequencing (GBS) approach offering increased versatility and efficiency

of SNP discovery and genotyping. PLoS ONE 8(1):e54603.

doi:10.1371/journal.pone.0054603

Stanton-Geddes, J., T. Paape, B. Epstein, R. Briskine, J. Yoder, et al. 2013. Candidate

genes and genetic architecture of symbiotic and agronomic traits revealed by whole-

genome, sequence-based association genetics in Medicago truncatula. PLoS ONE

8(5):e65688. doi:10.1371/journal.pone.0065688

Storey, J.D., and R. Tibshirani. 2003. Statistical significance for genomewide studies. P.

Natl. Acad. Sci. USA 100(16):9440-9445. doi:10.1073/pnas.1530509100

Sun, X., D. Liu, X. Zhang, W. Li, H. Liu, et al. 2013. SLAF-seq: An efficient method of

large-scale de novo SNP discovery and genotyping using high-throughput sequencing.

PLoS ONE 8(3):e58700. doi:10.1371/journal.pone.0058700

Sutton, J.C., and G. Peng. 1993. Manipulation and vectoring of biocontrol organisms to

manage foliage and fruit disease in cropping systems. Annu. Rev. Phytopathol. 31:473-

493.


Tremblay, A., P. Hosseini, N.W. Alkharouf, S. Li, and B.F. Matthews. 2011. Gene

expression in leaves of susceptible Glycine max during infection with Phakopsora

pachyrhizi using next generation sequencing. Sequencing 2011.

doi:10.1155/2011/827250

Valdes-Lopez, O., S. Thibivilliers, J. Qiu, W.W. Xu, T.H.N. Nguyen, et al. 2011.

Identification of quantitative trait loci controlling gene expression during the innate

immunity response of soybean. Plant Physiol. 157(4):1975-1986.

doi:10.1104/pp.111.183327

Varala, K., K. Swaminathan, Y. Li, and M.E. Hudson. 2011. Rapid genotyping of

soybean cultivars using high throughput sequencing. PLoS ONE 6(9):e24811.

doi:10.1371/journal.pone.0024811

Vuong, T.D., B.W. Diers, and G.L. Hartman. 2008. Identification of QTL for resistance to

Sclerotinia stem rot in soybean plant introduction 194639. Crop Sci. 48(6):2209-2214.

doi:10.2135/cropsci2008.01.0019

Wang, H., L. Waller, S. Tripathy, S.K. St. Martin, L. Zhou, et al. 2010. Analysis of genes

underlying soybean quantitative trait loci conferring partial resistance to Phytophthora

sojae. Plant Gen. 3(1):23-40. doi:10.3835/plantgenome2009.12.0029


Wang, H., A. Wijeratne, S. Wijeratne, S. Lee, C.G. Taylor, et al. 2012a. Dissection of two

soybean QTL conferring partial resistance to Phytophthora sojae through sequence and

gene expression analysis. BMC Genomics 13:428. doi:10.1186/1471-2164-13-428

Wang, M., J. Yan, J. Zhao, X. Zhang, and Y. Xiao. 2012b. Genome-wide association

study (GWAS) of resistance to head smut in maize. Plant Sci. 196:125-131.

doi:10.1016/j.plantsci.2012.08.004

Workneh, F., and X.B. Yang. 2000. Prevalence of Sclerotinia stem rot of soybeans in the

north-central United States in relation to tillage, climate, and latitudinal positions.

Phytopathology 90:1375-1382. doi:10.1094/PHYTO.2000.90.12.1375

Wu, X., C. Ren, T. Joshi, T. Vuong, D. Xu, and H. Nguyen. 2010. SNP discovery by

high-throughput sequencing in soybean. BMC Genomics 11(1):469. doi:10.1186/1471-

2164-11-469

Yan, J., J. Wang, and H. Zhang. 2002. An ankyrin repeat-containing protein plays a role

in both disease resistance and antioxidation metabolism. Plant J. 29(2):193–202.

doi:10.1046/j.0960-7412.2001.01205.x

Yang, Y., Y. Zhang, P. Ding, K. Johnson, X. Li, and Y. Zhang. 2012. The ankyrin-repeat

transmembrane protein BDA1 functions downstream of the receptor-like protein SNC2

to regulate plant immunity. Plant Physiol. 159(4):1857–1865.

doi:10.1104/pp.112.197152


Zeng, W., W. Kirk, and J. Hao. 2012. Field management of Sclerotinia stem rot of

soybean using biological control agents. Biol. Control 60:141–147.

doi:10.1016/j.biocontrol.2011.09.012

Zhang, Z., E. Ersoz, C.-H. Lai, R.J. Todhunter, H.K. Tiwari, et al. 2010. Mixed linear

model approach adapted for genome-wide association studies. Nat. Genet. 42(4):355-

360. doi:10.1038/ng.546

Zuo, Q., J. Hou, B. Zhou, Z. Wen, S. Zhang, et al. 2013. Identification of QTLs for

growth period traits in soybean using association analysis and linkage mapping. Plant

Breed. 132(3):317-323. doi:10.1111/pbr.12060


Figures

Fig. 1. Distribution of disease susceptibility among 130 soybean lines.

Fig. 2. Distribution of SNPs according to their minor allele frequency (MAF). SNPs with

MAF < 5% were excluded from the analysis.
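
The MAF cutoff described in the Fig. 2 caption can be illustrated with a minimal Python sketch. It assumes biallelic SNPs scored as 0/1 in homozygous lines, with None marking a missing call; the function names and this coding are hypothetical and are not taken from the authors' pipeline.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency (MAF) for one biallelic SNP.

    genotypes: allele calls coded 0 or 1 (one per inbred line);
    None marks a missing call. This coding is an assumption.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    freq_alt = sum(called) / len(called)
    return min(freq_alt, 1.0 - freq_alt)


def filter_by_maf(snps, threshold=0.05):
    """Keep SNPs whose MAF is at least `threshold` (MAF < 5% is excluded)."""
    return {name: calls for name, calls in snps.items()
            if minor_allele_frequency(calls) >= threshold}


# Toy data: one rare SNP (1 of 40 lines) and one common SNP (7 of 20 lines).
snps = {"rare": [0] * 39 + [1], "common": [0] * 13 + [1] * 7}
kept = filter_by_maf(snps)
```

With this toy input only the "common" SNP is retained, because the "rare" SNP has a MAF of 0.025.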

Fig. 3. Intrachromosomal linkage disequilibrium decay over physical distance. (a) Whole

chromosome; (b) Telomeric regions; (c) Pericentromeric regions.

Fig. 4. Plot of median r² over physical distance. (a) Pericentromeric regions; (b) Telomeric

regions.
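
As an illustration of the r² statistic plotted in Figs. 3 and 4, the following sketch computes pairwise r² in the sense of Hill and Robertson (1968) for two biallelic loci. It assumes fully homozygous lines coded 0/1 with no missing data, so that each line contributes one haplotype; the function name is hypothetical, and the software actually used to produce the figures is not specified in this section.

```python
def r_squared(locus_a, locus_b):
    """Squared allelic correlation (r^2) between two biallelic loci.

    locus_a, locus_b: equal-length lists of 0/1 allele calls, one per
    inbred line (assumed coding; no missing data handled here).
    """
    n = len(locus_a)
    if n == 0 or n != len(locus_b):
        raise ValueError("loci must be non-empty and of equal length")
    p_a = sum(locus_a) / n                       # frequency of allele 1 at A
    p_b = sum(locus_b) / n                       # frequency of allele 1 at B
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n  # 1-1 haplotype
    d = p_ab - p_a * p_b                         # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined for a monomorphic locus")
    return d * d / denom


complete_ld = r_squared([0, 0, 1, 1], [0, 0, 1, 1])   # identical patterns
no_ld = r_squared([0, 0, 1, 1], [0, 1, 0, 1])         # independent patterns
```

Median or mean r² per distance class, as in the figures, would then be obtained by grouping SNP pairs by their physical separation.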

Fig. 5. Cumulative distribution of p-values from genome-wide association tests using

four statistical models. Cumulative p-values are equivalent to observed p-value under an

expected model taking into account all background effects.

Fig. 6. Genome-wide association scan for Sclerotinia stem rot (SSR) resistance. The

−log10(P) values from the genome-wide scan are plotted against the SNP positions on the

physical map of each chromosome. The significance threshold (q = 0.1) is indicated by

the horizontal line.
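
The transformation used on the y-axis of Fig. 6 can be sketched as follows, using the p- and q-values reported in Table 2. The variable names are illustrative, and the q-values are taken as given rather than recomputed.

```python
import math

# (chromosome, position, p-value, q-value) as reported in Table 2
peaks = [
    (15, 13_651_235, 1.38e-6, 0.011),
    (1, 29_185_984, 3.09e-5, 0.040),
    (20, 39_698_515, 1.19e-4, 0.094),
    (19, 50_557_054, 1.21e-4, 0.094),
]

Q_THRESHOLD = 0.1  # significance threshold drawn in Fig. 6


def neg_log10(p):
    """Return -log10(p), the quantity plotted on the y-axis."""
    return -math.log10(p)


# Keep the SNPs at or below the q-value threshold, with their plotted height.
significant = [(chrom, pos, round(neg_log10(p), 2))
               for chrom, pos, p, q in peaks if q <= Q_THRESHOLD]
```

All four SNPs listed in Table 2 pass this threshold; the peak on Gm15 plots at roughly 5.86 on the −log10 scale.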

Fig. 7. Lesion length among recombinant inbred line classes contrasted for the SNP on

Gm15 in two validation populations.

Fig. 8. Predicted genes in the vicinity of the peak SNP at position 13,651,235 on Gm15.

Genes involved in disease resistance are colored in yellow, other genes are colored in

purple.

Fig. 9. Heat map of r² values between SNP markers located in an interval containing

SNP markers associated with Sclerotinia stem rot resistance (at q < 0.2) on Gm15.


Supplementary Fig. 1. Average r² values over physical distance for the 20 soybean

chromosomes. The arrows show the approximate extent of the pericentromeric region.

Supplementary Fig. 2. Projection of the 130 genotypes on the planes of the first three

eigenvectors of the principal component analysis.


Tables

Table 1. Marker distribution among chromosomal regions.

       Total   SNP position (Mb)   Borders† (Mb)    Telomeric region    Pericentromeric region
Chr     SNPs     First     Last     Left    Right   Width (Mb)   SNPs     Width (Mb)   SNPs
1        287      0.29    55.90      4.6     46.0      14.21      170        41.4       117
2        479      0.11    51.55     12.1     41.2      22.34      349        29.1       130
3        328      0.21    47.74      8.6     35.8      20.33      254        27.2        74
4        421      0.01    49.08     12.4     41.2      20.27      312        28.8       109
5        357      0.02    41.92      9.4     26.9      24.4       351        17.5         6
6        420      0.05    50.49      8.4     43.2      15.64      184        34.8       236
7        324      0.11    44.54     11.4     34.8      21.03      227        23.4        97
8        358      0.66    46.90     15.3     42.1      19.44      215        26.8       143
9        418      0.30    46.83      3.3     32.9      16.93      242        29.6       176
10       419      0.13    50.95      6.6     36.6      20.82      250        30.0       169
11       173      0.25    39.10     10.9     35.9      13.85      124        25.0        49
12       248      0.02    40.10      7.8     33.9      13.98      203        26.1        45
13       490      0.01    44.34      5.1     23.9      25.53      335        18.8       155
14       392      0.06    49.68     11.1     45.9      14.82      260        34.8       132
15       516      0.14    50.89     14.2     39.3      25.65      348        25.1       168
16       452      0.05    37.13      8.4     22.8      22.68      364        14.4        88
17       355      0.35    41.84     15.2     31.6      25.09      301        16.4        54
18       626      0.01    62.19     13.5     50.9      24.78      477        37.4       149
19       435      0.03    50.56      5.9     34.9      21.53      292        29.0       143
20       366      0.11    46.75      3.6     32.7      17.54      321        29.1        45
Total  7,864                                           400.9     5,579       544.7     2,285

† Borders between the pericentromeric and telomeric regions
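
The widths in Table 1 appear to follow directly from the SNP positions and the region borders. The sketch below states that relationship as an assumption (it is inferred from the tabulated values, not stated in the table) and checks it against three rows; the function name is hypothetical.

```python
def region_widths(first, last, left, right):
    """Return (telomeric width, pericentromeric width) in Mb.

    Assumed relationship, inferred from Table 1:
      telomeric       = (left border - first SNP) + (last SNP - right border)
      pericentromeric = right border - left border
    """
    telomeric = (left - first) + (last - right)
    pericentromeric = right - left
    return round(telomeric, 2), round(pericentromeric, 1)


# (first, last, left, right) for chromosomes 1, 2 and 5, copied from Table 1
chr1 = region_widths(0.29, 55.90, 4.6, 46.0)
chr2 = region_widths(0.11, 51.55, 12.1, 41.2)
chr5 = region_widths(0.02, 41.92, 9.4, 26.9)

# The SNP counts of the two regions add up to the overall total.
total_snps = 5579 + 2285
```

Under this reading, the three rows reproduce the tabulated widths (14.21 and 41.4 Mb for chromosome 1, for example), and the region counts sum to the 7,864 SNPs in the Total row.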


Table 2. Most significant single nucleotide polymorphisms (SNPs) associated with resistance.

Chr†   Position‡     p-value        q-value   MAF    MA effect§    R²¶     LLR#    LLS††   a‡‡
15     13,651,235    1.38 × 10⁻⁶    0.011     0.32   Resistant     0.145   103.7   118.8   15.1
1      29,185,984    3.09 × 10⁻⁵    0.040     0.39   Resistant     0.073   110.7   116.1    5.4
20     39,698,515    1.19 × 10⁻⁴    0.094     0.32   Resistant     0.063   108.6   116.5    8.0
19     50,557,054    1.21 × 10⁻⁴    0.094     0.05   Susceptible   0.072   113.4   123.3    9.9

† Chromosome number

‡ Position of peak marker on the physical map

§ Indicates whether the minor allele (MA) provides increased resistance or susceptibility

¶ Indicates the proportion of total phenotypic variation accounted for by the marker

# Mean lesion length in mm of genotypes carrying the resistance allele

†† Mean lesion length in mm of genotypes carrying the susceptibility allele

‡‡ Average change in lesion length following allele substitution
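
Footnote ‡‡ can be read as the difference between the two class means. The check below assumes a = LLS − LLR, which is inferred from the tabulated values rather than stated in the table; a tolerance of 0.1 mm allows for rounding of the reported means.

```python
# chromosome: (LLR, LLS, reported a), all in mm, copied from Table 2
rows = {
    15: (103.7, 118.8, 15.1),
    1: (110.7, 116.1, 5.4),
    20: (108.6, 116.5, 8.0),
    19: (113.4, 123.3, 9.9),
}


def substitution_effect(llr, lls):
    """Assumed definition: susceptible-class mean minus resistant-class mean."""
    return lls - llr


# Absolute gap between the recomputed and the reported effect, per chromosome.
gaps = {chrom: abs(substitution_effect(llr, lls) - a)
        for chrom, (llr, lls, a) in rows.items()}
```

The largest gap is on chromosome 20 (7.9 mm recomputed versus 8.0 mm reported), which is what rounding of the class means would be expected to produce.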


Supplementary Table 1. Lesion length (LL, mm) among the 130 soybean lines.

Genotype         LL   Genotype         LL   Genotype         LL   Genotype         LL
Karlo RR       28.6   15             88.5   40            116.6   2             138.0
PRO 275        37.6   Tundra         90.8   70            116.6   37            139.8
S19-90         43.5   90             92.2   Azur          117.9   S04273.09     140.1
Toma           49.2   46             92.4   Williams 82   118.0   PR9031LL      140.3
93             49.5   124            93.2   122           118.8   99            140.4
21             50.8   72             93.5   Accent        119.5   Oria          141.5
Majesta        56.9   S04273.19      93.8   12            119.6   Saska         143.8
Prius RR       57.7   31             94.0   Acora         119.8   9             149.8
S04297.18      60.8   24             97.2   Victoria      120.0   Nova          150.1
Maple Arrow    61.0   110            97.3   36            120.7   109           150.8
PR918827       64.9   23            101.2   PR938626      121.0   118           151.3
64             65.8   Jutra         101.3   35            122.9   PR935413      152.0
S04280.44      65.8   27            101.5   PRO 25-53     123.0   6             152.7
45             72.0   97            101.8   102           123.3   Bixi LL       152.7
116            72.7   44            102.0   123           123.6   28            153.7
30             74.5   Venus         103.8   4             125.1   Amasa         153.8
52             75.8   80            104.6   PR939402      126.5   25            155.0
34             79.0   13            105.3   38            126.9   107           155.0
100            80.1   119           106.0   20            128.2   29            156.1
22             80.5   Korus         106.7   92            130.2   112           156.3
Maple Donovan  81.3   PR9368B25     107.5   111           130.9   1             157.8
121            82.0   5             109.0   2601R         131.2   OAC Bayfield  159.9
A x N-1-55     82.3   Delta         109.3   49            133.7   56            161.3
65             82.8   101           109.8   Lotus         133.8   PR935401      161.3
69             83.1   Kolia         110.4   95            134.2   117           163.2
17             83.3   108           112.3   62            134.6   26            169.0
61             84.5   115           112.7   67            135.0   Supra         169.5
66             84.6   77            113.8   Naya          135.5   7             173.6
PRD 419        84.9   PR9423B31     113.8   94            136.8   10            173.7
125            86.3   41            114.5   113           136.8   8             176.1
Bakara         86.7   51            114.7   19            137.7   Nattosan      176.6
39             87.4   63            115.4   98            137.8   96            192.4
33             87.9   Aquita        116.0


[Fig. 1 (image not reproduced). Histogram; x-axis: lesion length (mm) in classes 21-36, 37-52, 53-68, 69-84, 85-100, 101-116, 117-132, 133-148, 149-164, 165-180, 181-196; y-axis: number of lines (0-25). Labeled lines: Karlo RR (R), S19-90 (R), Maple Donovan (R), Williams 82 (S), OAC Bayfield (S), Nattosan (S).]


[Fig. 2 (image not reproduced). Bar chart; x-axis: minor allele frequency in classes <0.05, 0.05-0.1, 0.1-0.2, 0.2-0.3, 0.3-0.4, 0.4-0.5; y-axis: % of SNPs (0-40).]


[Fig. 3 (image not reproduced). Panels a, b and c; x-axis: distance (Mb); y-axis: r².]


[Fig. 4 (image not reproduced). Panels a and b; x-axis: distance (kb), 0-100; y-axis: r², 0-1.]


[Fig. 5 (image not reproduced). x-axis: observed P (0-1); y-axis: cumulative P (0-1); series: expected distribution, Naive, PCA, K, PCA + K.]


[Fig. 6 (image not reproduced). x-axis: chromosome (1-20); y-axis: -Log(p-value), 0-6.]


[Fig. 7 (image not reproduced). x-axis: Population 1 and Population 2; y-axis: lesion length (mm), 35-85; series: susceptible allele, resistant allele.]


[Fig. 8 (image not reproduced). Labels recovered from the figure, in the order extracted.

SNP positions: 13,651,235; 13,339,206; 13,339,218; 13,911,191; 13,911,995; 13,929,317.

Gene labels: Glyma15g17230.1, Glyma15g17240.1 (Ankyrin repeats); Glyma15g17310.1 (TIR domain, Leucine-rich repeats); Glyma15g17360.1, Glyma15g17370.1, Glyma15g17390.1, Glyma15g17410.1, Glyma15g17420.1, Glyma15g17430.1, Glyma15g17450.1, Glyma15g17460.1 (Serine/threonine protein kinase); Glyma15g17540.1 (TIR domain, Leucine-rich repeats).]


[Fig. 9 (image not reproduced). Heat map over SNP positions 13,339,206; 13,339,218; 13,516,378; 13,525,925; 13,529,976; 13,651,175; 13,651,235; 13,911,191; 13,911,995; 13,929,317. Upper: r² (scale 0.1 to 1). Lower: p-values (classes > 0.01, < 0.01, < 0.001, < 0.0001).]


[Supplementary Fig. 1 (image not reproduced). One panel per chromosome (Gm01 to Gm20); x-axis: physical position (Mb); y-axis: r², 0-1.]


[Supplementary Fig. 2 (image not reproduced). Two scatter plots: PC2 vs PC1 and PC3 vs PC1.]
