
Upload: roger-marsh

Post on 03-Jan-2016


Page 1: The use of complex populations in breeding with markers SBC “Breeding with molecular markers” David Francis Contact: francis.77@osu.edu

The use of complex populations in breeding with markers

SBC “Breeding with molecular markers”

David Francis

Contact: francis.77@osu.edu

Page 2

Breeding programs tend to have complex population structures consisting of many independent crosses.

Genetic studies tend to focus on bi-parental crosses with defined structure.

Page 3

Jargon:
QTL
LD
SNP
Structure
Mixed Model
Analysis of Variance
Identity by descent

Please stop me and ask when a definition will help clarify.

Page 4

Objectives

Understand the diversity of populations that are being used to test marker-trait associations (linkage).

Understand the difference between the discovery of linkage and use of markers for selection.

Use this information to facilitate interaction with colleagues from other disciplines (field, marker support, analysis, etc…).

Use information to design and implement discovery and selection projects.

Page 5

Background

Introduction to Populations

Case study

Discovery Populations

Selection Populations

Association Mapping

Single Marker analysis of variance

Changes to the model used for analysis:

Account for population structure

Haplotypes to gain information

Page 6

Standard populations for inbred species (line crosses)

F2
RIL (recombinant inbred lines)
BC (back cross)
AB (Advanced Back Cross) *
IBC (Inbred Back cross) *

Emerging populations for association mapping

Natural populations
Unstructured populations
Family-based *
Nested Association Mapping (NAM; a variation of RIL)

Page 7

Standard populations for inbred species (line crosses)

F2: few meioses, population not fixed
RIL: few meioses, population fixed (can replicate)
BC: few meioses, population not fixed
AB: few meioses, population not fixed
IBC: few meioses, population fixed

Emerging populations for association mapping

Natural populations: sample all meioses in the history of the species; population often fixed
Unstructured populations: sample all meioses since the population was established
Family-based: samples all meioses in the pedigree
NAM: see RIL; meioses increased due to the size of the population and multiple crosses

Page 8

Populations
• Early generation (F2, BC1)
– Strong theoretical basis
– Balanced designs
– Tools for interval mapping (point of analysis)
– Most breeding programs do not collect extensive data on early generation populations
– Retain too much “donor” parent
• AB and IBC populations
– Reduce donor parent, isolate genetic factors, allow detection
– Unbalanced design may limit power
• Unstructured (natural populations)
– More like populations that breeders use

Page 9

Review: effect of inbreeding

[Figure: Frequency of heterozygotes (Cc) and homozygotes (CC+cc) in each generation of selfing a hybrid (F1). x-axis: Generation (F1, F2, F3, F4, F5, F6, F∞); y-axis: Frequency (0 to 1); series: Cc and CC+cc.]

Freq(CC) = p^2 + pqF
Freq(Cc) = 2pq(1 - F)
Freq(cc) = q^2 + pqF
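The genotype frequency expressions above can be tabulated for successive selfing generations. This is a minimal sketch, assuming a fully heterozygous F1 (p = q = 0.5) and an inbreeding coefficient of F = 1 - (1/2)^t after t generations of selfing beyond the F2; the function names are illustrative only.

```python
def genotype_freqs(p, F):
    """Return (CC, Cc, cc) frequencies for allele frequency p and inbreeding coefficient F."""
    q = 1.0 - p
    return (p * p + p * q * F, 2 * p * q * (1.0 - F), q * q + p * q * F)


def selfing_series(generations=6, p=0.5):
    """Heterozygote and homozygote frequencies for F2 .. F<generations>."""
    rows = []
    for t in range(generations - 1):      # t = selfing generations after the F2
        F = 1.0 - 0.5 ** t                # F = 0 in the F2, 0.5 in the F3, ...
        freq_CC, freq_Cc, freq_cc = genotype_freqs(p, F)
        rows.append((f"F{t + 2}", freq_Cc, freq_CC + freq_cc))
    return rows


if __name__ == "__main__":
    for label, het, hom in selfing_series():
        print(f"{label}: Cc = {het:.4f}, CC+cc = {hom:.4f}")
```

Under these assumptions heterozygosity halves with each generation of selfing, which is why RIL and IBC populations can be treated as fixed.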

Page 10

Advanced Backcross and Inbred Backcross Populations

Parent 1 x Parent 2 (Donor)

F1 x Parent 1

BC1 (n lines)

BC1-1 x Parent 1    BC2-1S0 ⊗ . . . BC2-1S5
BC1-2 x Parent 1    BC2-2S0 ⊗ . . . BC2-2S5
.
.
BC1-n x Parent 1    BC2-nS0 ⊗ . . . BC2-nS5

AB IBC

Page 11

Statistical considerations with AB, IBC, and association populations

[Figure: trait value (y-axis, 0 to 10) by genotypic class (x-axis).]

Unequal sample size/unbalanced data
Donor class is underrepresented
Need to adjust Df for F-test
Proper F-test {Mj/Gk(Mj)}
These considerations affect power and whether the significance level is accurately estimated
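The F-test named above tests the marker mean square against the mean square for genotypes nested within marker class, rather than against plot-level error. The sketch below assumes every genotype has the same number of plot replicates, so the only imbalance is in the number of genotypes per marker class; the function name and the toy data in the usage note are hypothetical.

```python
def nested_f_test(data):
    """Marker / Gen(Marker) F-test.

    data: {marker_class: {genotype: [plot values]}}, equal replicates per genotype.
    Returns (F, df_marker, df_gen_within_marker).
    """
    reps = {len(values) for gens in data.values() for values in gens.values()}
    if len(reps) != 1:
        raise ValueError("sketch assumes the same number of replicates for every genotype")
    r = reps.pop()

    # Genotype means, grouped by marker class.
    gen_means = {m: [sum(v) / r for v in gens.values()] for m, gens in data.items()}
    all_means = [x for means in gen_means.values() for x in means]
    grand_mean = sum(all_means) / len(all_means)

    ss_marker = 0.0
    ss_gen = 0.0
    for means in gen_means.values():
        class_mean = sum(means) / len(means)
        ss_marker += r * len(means) * (class_mean - grand_mean) ** 2
        ss_gen += r * sum((x - class_mean) ** 2 for x in means)

    df_marker = len(gen_means) - 1
    df_gen = len(all_means) - len(gen_means)
    return (ss_marker / df_marker) / (ss_gen / df_gen), df_marker, df_gen
```

For example, with two donor-class genotypes and four recurrent-class genotypes (invented values), `nested_f_test({"donor": {"g1": [3.0, 4.0], "g2": [4.0, 5.0]}, "recurrent": {"g3": [6.0, 7.0], "g4": [7.0, 8.0], "g5": [8.0, 9.0], "g6": [7.0, 8.0]}})` tests the marker on 1 and 4 degrees of freedom, not on the larger plot-level error degrees of freedom.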

Page 12

Take home messages:

A) Genotyping throughput and reagent packaging favor working with very large populations (~480)

B) Measuring traits (Phenotyping) is the limiting factor

C) For elite populations, marker number and the ability to distinguish descent (IBD) from state (IBS) are limitations (this is a function of linkage phase and LD)

D) Incorporating pedigree data or population structure data into analysis improves detection of trait associations (QTL) and the efficiency of MAS (defined as relative efficiency of selection).

E) We can detect some known QTL, but not all known QTL in complex populations. Power goes up with population size and marker number.

F) Phenotypic selection is effective.

Page 13

Case study: mapping and selection of bacterial spot resistance in tomato populations.

David Francis, Sung-Chur Sim, Hui Wang, Matt Robbins, Wencai Yang.

Page 14

Bacterial Spot is a disease complex caused by ~4 species of Xanthomonas bacteria. There are physiological races.

Sources of resistance are mostly close relatives of cultivated tomato Solanum lycopersicum or Solanum pimpinellifolium.

Hawaii 7998 (T1)

Hawaii 7981 (T3)

PI128216 (T3)

PI 114490 (T1, T2, T3, T4)

Page 15

Field rating based on the Horsfall-Barratt quantitative scale (1-12)
en.wikipedia.org/wiki/Horsfall-Barratt_scale

Distribution approaches normal (ANOVA, regression, mixed models)

GH rating based on HR
Scored 0 or 1 (non-parametric)
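Horsfall-Barratt grades are ordinal classes that each stand for a range of percent disease severity, so ratings are sometimes converted to interval midpoints before analysis. The sketch below assumes the commonly published 12-grade interval boundaries; the slide does not list them, so treat the table as an assumption to check against the reference above. The names `HB_INTERVALS` and `hb_midpoint` are illustrative.

```python
# Assumed percent-severity interval (lower, upper) for each Horsfall-Barratt grade.
HB_INTERVALS = {
    1: (0, 0), 2: (0, 3), 3: (3, 6), 4: (6, 12),
    5: (12, 25), 6: (25, 50), 7: (50, 75), 8: (75, 88),
    9: (88, 94), 10: (94, 97), 11: (97, 100), 12: (100, 100),
}


def hb_midpoint(grade):
    """Midpoint percent severity for an integer Horsfall-Barratt grade."""
    lower, upper = HB_INTERVALS[grade]
    return (lower + upper) / 2
```

The intervals are narrow at both ends of the scale and wide in the middle, which is one reason averaged grades and averaged midpoints can rank plots differently.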

Page 16

Bacterial spot QTL discovery in IBC populations

Ohio: T2 & T1 (2000-2004)
FL: T3 and T4 (2002-2004)
Brazil: T3 (2002-2004)

Page 17

                SSR111    LEGTOM5c  CosOH51   SSR601    SSR320    TOM59     TOM196    TOM144    SSR637a   SSR637b   SSR637c   SSR637d   SSR637e   CosOH57   I2B
GEN             3         3         3         3         3         3         11        11        11        11        11        11        11        11        11

PI114490  1.5   1         1         1         1         1         1         1         1         0         0         1         0         0         1         1
Fla7600   6     2         2         2         2         2         1         1         2         0         1         0         1         0         2         2
OH9242    7     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1

6142      3     3         1         2         2         2         2         1         2         0         1         0         1         0         1         1
6148      4     3         1         1         1         2         2         12        23        1         1         0         1         1         2         -
6149      4.5   13        2         1         2         2         12        1         2         0         1         0         1         0         2         2
6027      5     1         1         2         2         2         2         1         2         0         1         0         1         0         2         2
6053      5     1         1         1         2         2         2         12        23        1         0         0         0         1         2         1
6082      5     3         2         2         2         2         2         1         2         0         1         0         1         0         2         2
6110      5     3         2         2         2         2         2         1         1         0         0         1         0         0         2         1

OH5949    5     1         2         1         2         2         1         1         2         0         1         0         1         0         2         2
OH5882    5     3         2         2         2         2         2         1         2         0         1         0         1         0         2         2
OH6027    5     1         1         1         2         2         2         1         2         0         1         0         1         0         2         2

6021      5.5   3         2         1         2         2         2         1         2         0         1         0         1         0         2         1

6076      8     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1
6088      8     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1
6118      8     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1
6120      8     2         2         2         2         2         1         2         3         1         0         0         0         1         2         1
6125      8     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1
6127      8     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1
6133      8     3         2         2         2         2         2         2         3         1         0         0         0         1         12        1
6135      8     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1
6158      8     3         2         -         2         2         2         2         3         1         0         0         0         1         2         1
6161      8     3         2         2         2         2         2         2         3         1         0         0         0         1         2         1

Page 18

Results of discovery studies:

Three IBC populations
[[OH88119 x Ha7998] x (OH88119)] x (OH88119)
[[OH88119 x PI128216] x (OH88119)] x (OH88119)
[[FL7600 x PI114490] x (OH9242)] x (OH9242)

Multiple F2 populations
IBC x elite parent
OH7870 x Ha7981

Results:

Hawaii 7998 (T1): Rx-1, Rx-2, Rx-3, Chr11 QTL

Hawaii 7981 (T3): R-Xv3

PI128216 (T3): Rx-4, Chr11

PI 114490 (T1, T2, T3, T4): QTL Chr 11, Chr3, Chr4

Page 19

We have IBC lines and IBC x elite derived lines that “look good” and we want to integrate them with the elite breeding program. Strategy:

1) Develop populations to combine loci for resistance to multiple races

2) Validate Marker-QTL associations in order to assess feasibility of MAS

3) Conduct simultaneous phenotypic selection and MAS.

Page 20

Parents and genes (chromosome in parentheses):
OH75: Rx-3 (5)
FL82: Rx-4 (11)
K64: QTL11
OH86: QTL11
OH74: ?
MR13: ?

F1 crosses (rows by columns OH75, FL82, K64, OH74, MR13; column positions not preserved):
OH75: F1-1, F1-2, F1-3, F1-4
FL82: F1-5, F1-6
K64: F1-7, F1-8
OH74: F1-9

Crosses among F1s (columns F1-1 to F1-7; X marks a cross, column positions not preserved):
F1-1: X X X
F1-2: X X X X
F1-3: X X X X

“Population” consisting of 11 independent crosses; progeny segregate.

Page 21

First segregating generation: grow ~100 plants per cross in the field (total population size 1,100) and select plants from each extreme (n = 110).

Page 22

Following year: Evaluate plots

RCB, two replicates, rating based on a plot (not single plant), scale 1-12.

Page 23

Phenotypic evaluation (Focus on T1). Selection conducted in 2007 was predictive of plot performance in 2008 based on both nonparametric analysis and analysis of variance (p < 0.0001).

Heritability estimated from the parent-offspring regression suggests a narrow sense heritability of 0.32.

Plants rated as resistant in 2007 produced plots with an average disease rating of 4.02 in 2008; plants rated as susceptible produced plots with an average disease rating of 5.16 in 2008 (LSD 0.39).

Realized gain under selection ~13% decrease in disease

OH75 rated 3.5; OH88119 rated 9.0 
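The heritability estimate above comes from regressing offspring performance on parent performance. A minimal sketch of the regression slope follows; the ratings in the usage note are invented for illustration, and how the slope converts to narrow sense heritability depends on the mating design, which the slide does not spell out.

```python
def regression_slope(parent, offspring):
    """Least-squares slope of offspring values regressed on parent values."""
    if len(parent) != len(offspring) or len(parent) < 2:
        raise ValueError("need paired observations")
    n = len(parent)
    mean_x = sum(parent) / n
    mean_y = sum(offspring) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(parent, offspring))
    s_xx = sum((x - mean_x) ** 2 for x in parent)
    return s_xy / s_xx
```

With hypothetical single-plant ratings `[2, 3, 4, 6, 8, 9]` for the selected parents and plot means `[4.0, 4.2, 4.6, 5.0, 5.6, 6.0]` for their progeny, the slope is about 0.28: progeny differ by roughly a quarter of a rating unit for each unit of difference among parents.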

Page 24

Y = μ + REPy + Qw + Markerα + Zv + Error

Sequence variation linked to traits

Marker analysis using The Unified Mixed Model

Buckler Lab, TASSEL
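For reference, the model on this slide can be written out term by term. This is a sketch of how the terms are usually read in the unified mixed-model literature, taking μ as an intercept; the distributional assumptions for v and the error are not stated on the slide and are assumptions here, as is the symbol K for a kinship matrix.

```latex
\[
  Y \;=\; \mu \;+\; \mathrm{REP}\,y \;+\; Q\,w \;+\; \mathrm{Marker}\,\alpha \;+\; Z\,v \;+\; \mathrm{Error}
\]
\[
  \begin{aligned}
    \mathrm{REP}\,y &: \text{replicate (block) effects} \\
    Q\,w &: \text{population structure or pedigree covariates (fixed)} \\
    \mathrm{Marker}\,\alpha &: \text{effect of the marker being tested (fixed)} \\
    Z\,v &: \text{polygenic background of each genotype (random)},\quad \operatorname{Var}(v) \propto K \\
    \mathrm{Error} &: \text{residual},\quad \operatorname{Var}(\mathrm{Error}) = \sigma^2_e I
  \end{aligned}
\]
```

The test of interest is whether α differs from zero once structure (Qw) and background relatedness (Zv) are accounted for.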

Page 25

%macro Mol(mark);
  /* Single-marker model: marker as fixed effect; genotype and replicate as random effects */
  proc mixed data = three;
    class &mark gen rep;
    model T1 = &mark / solution;
    random gen rep;
  run;
%mend;

%Mol(TOM144);
%Mol(CT10737I);
%Mol(CT20244I);
%Mol(pto);
%Mol(SL10526);
%Mol(rx3);

Markerα

Page 26

[Figure: single-point analysis; x-axis 30 to 90, y-axis 0.00 to 8.00; label: Rx-3.]

Page 27

Y = μ + REPy + Qw + Markerα + Zv + Error

Adding a matrix of population structure can correct for background effects and can add insight into which crosses, pedigrees, or subpopulations have the highest breeding value.

Page 28

Page 29

[Figure: r2 values (scale 0.00 to 1.00) and P values (>0.05, <0.05, <0.01, <0.001, <0.0001) by chromosome (1 to 12) and combined.]
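The r2 values in the figure measure linkage disequilibrium between pairs of markers. A minimal sketch of the standard calculation for two biallelic markers follows, assuming inbred (homozygous) lines so that each line contributes one haplotype coded 0/1 at each marker; the function name is illustrative.

```python
def ld_r2(haplotypes):
    """r^2 between two biallelic markers from a list of (allele_a, allele_b) pairs coded 0/1."""
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                      # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both markers must be polymorphic")
    return d * d / denom
```

Markers that always co-occur give r2 = 1, while markers whose alleles combine at random give r2 = 0; values in between indicate how well one marker predicts the other.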

Page 30

Qw

Pedigree information

Proportion of genome from a parent (pedigree)

Designation of cross (0/1)

Q matrix from Structure:

gen      subpop1  subpop2  subpop3  subpop4  subpop5  subpop6
6111R1   0.129    0.128    0.016    0.696    0.016    0.015
6111R2   0.671    0.088    0.016    0.184    0.015    0.026
6111R3   0.934    0.013    0.011    0.015    0.007    0.019
6111S1   0.88     0.051    0.009    0.019    0.009    0.032
6111S2   0.456    0.213    0.048    0.22     0.014    0.049
6115S3   0.077    0.018    0.53     0.027    0.008    0.341
6115S4   0.018    0.016    0.908    0.024    0.008    0.026
6117R1   0.86     0.01     0.012    0.1      0.006    0.012
6117R2   0.392    0.011    0.264    0.055    0.011    0.267
6117S1   0.205    0.016    0.481    0.227    0.008    0.063
6117S2   0.156    0.035    0.193    0.426    0.011    0.179
6117S3   0.016    0.009    0.922    0.029    0.014    0.011
6117S4   0.227    0.015    0.317    0.28     0.009    0.152
6124R1   0.015    0.079    0.766    0.063    0.008    0.069
6124R2   0.016    0.033    0.526    0.4      0.01     0.014
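Each row of a Q matrix gives the estimated membership proportions of one line across subpopulations, so a row should sum to about one. The sketch below checks that for a few rows taken from the table on this slide and reports the dominant subpopulation; the helper name and the rounding tolerance are assumptions.

```python
# Four rows copied from the Q matrix on this slide (subpop1 .. subpop6).
Q_ROWS = {
    "6111R1": [0.129, 0.128, 0.016, 0.696, 0.016, 0.015],
    "6111R3": [0.934, 0.013, 0.011, 0.015, 0.007, 0.019],
    "6115S4": [0.018, 0.016, 0.908, 0.024, 0.008, 0.026],
    "6117R2": [0.392, 0.011, 0.264, 0.055, 0.011, 0.267],
}


def check_and_assign(q_rows, tol=0.005):
    """Verify each row sums to ~1 and return {gen: (dominant subpopulation, proportion)}."""
    assigned = {}
    for gen, props in q_rows.items():
        if abs(sum(props) - 1.0) > tol:
            raise ValueError(f"{gen}: proportions sum to {sum(props):.3f}")
        best = max(range(len(props)), key=props.__getitem__)
        assigned[gen] = (f"subpop{best + 1}", props[best])
    return assigned
```

Because the proportions in a row sum to one, analyses that use them as covariates typically leave one column out to avoid collinearity with the intercept. Lines such as 6117R2, with no subpopulation above 0.5, are admixed, and forcing them into a single group discards information that the proportions retain.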

Page 31

%macro Mol(mark);
  /* Same single-marker model with pedigree covariates (Qw) fitted ahead of the marker */
  proc mixed data = three;
    class &mark gen rep;
    model T1 = OH75 FL82 K64 OH86 OH74 &mark / solution;
    random gen rep;
  run;
%mend;

%Mol(TOM144);
%Mol(CT10737I);
%Mol(CT20244I);
%Mol(pto);
%Mol(SL10526);
%Mol(rx3);

Markerα + Qw

Page 32

[Figure: single-point analysis and single-point analysis corrected for population structure; x-axis 30 to 90, y-axis 0.00 to 8.00; label: Rx-3.]

Page 33

[Diagram: markers M1 and M2 linked to the Rx-3 (resistant) / rx-3 (susceptible) locus.]

Parental genotypes (M1, resistance, M2):

OH75: 1, R, 1

OH86: 0, S, 1

FL82: 1, S, 0

OH75 x OH86: M1 can be used for selection, M2 cannot

OH75 x FL82: M2 can be used for selection, M1 cannot

What happens when the breeding material is a combination of progeny from both crosses?
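Which marker is usable in a given cross follows directly from the parental genotypes listed on this slide: a marker can only track the resistance allele in a cross where the two parents carry different alleles at that marker. A minimal sketch, with the parental scores taken from the slide and an illustrative function name:

```python
# Parental marker scores and resistance phenotype as listed on the slide.
PARENTS = {
    "OH75": {"M1": 1, "Rx-3": "R", "M2": 1},
    "OH86": {"M1": 0, "Rx-3": "S", "M2": 1},
    "FL82": {"M1": 1, "Rx-3": "S", "M2": 0},
}


def informative_markers(parent_1, parent_2, markers=("M1", "M2")):
    """Markers that are polymorphic between the two parents of a cross."""
    return [m for m in markers if PARENTS[parent_1][m] != PARENTS[parent_2][m]]
```

`informative_markers("OH75", "OH86")` returns only M1 and `informative_markers("OH75", "FL82")` returns only M2, so neither marker alone is reliable once progeny from both crosses are pooled.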

Page 34

[Diagram: markers M1 and M2 linked to the Rx-3 (resistant) / rx-3 (susceptible) locus.]

Parental genotypes (M1, resistance, M2):

OH75: 1, R, 1

OH86: 0, S, 1

FL82: 1, S, 0

Reality check: Markers are identical by state but not by descent (presumably because of LD decay). A potential solution is to use haplotypes.

Page 35

proc mixed data = three;
  class mark1 mark2 gen rep;
  /* The mark1*mark2 interaction classes act as two-marker haplotypes */
  model T1 = mark1*mark2 OH75 FL82 K64 OH86 OH74 / solution;
  random gen rep;
run;

M1 M2 M3 M4 M5 M6

M1*M2, M2*M3, M3*M4, M5*M6

The interaction term defines haplotypes
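The reason a two-marker class can succeed where single markers fail can be seen with the three parental genotypes from the earlier slides (OH75: 1, R, 1; OH86: 0, S, 1; FL82: 1, S, 0). The sketch below groups the parents by single-marker score and by the M1/M2 combination; it uses parents only, not progeny data, and the helper name is illustrative.

```python
from collections import defaultdict

# (M1 score, M2 score, resistance phenotype) for each parent, from the slides.
PARENTS = {"OH75": (1, 1, "R"), "OH86": (0, 1, "S"), "FL82": (1, 0, "S")}


def phenotypes_by_class(parents, key):
    """Group resistance phenotypes by a marker-derived class."""
    classes = defaultdict(set)
    for m1, m2, phenotype in parents.values():
        classes[key(m1, m2)].add(phenotype)
    return dict(classes)


by_m1 = phenotypes_by_class(PARENTS, lambda m1, m2: m1)
by_m2 = phenotypes_by_class(PARENTS, lambda m1, m2: m2)
by_haplotype = phenotypes_by_class(PARENTS, lambda m1, m2: (m1, m2))
```

Allele 1 at either marker alone is shared by a resistant and a susceptible parent, while the (1, 1) combination is carried only by the resistant parent. Fitting the interaction classes in the model lets that combination receive its own estimate.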

Page 36

[Figure: x-axis 30 to 90, y-axis 0.00 to 8.00; label: Rx-3. Series: single-point analysis; single-point analysis corrected for population structure; haplotype analysis; haplotype analysis corrected for population structure.]

Page 37

Interval      Pto  SL10526  Estimate  SD     DF  t value  Pr > |t|
Pto*SL10526   0    0        3.76      0.531  96  7.09     <.0001
Pto*SL10526   0    2        3.99      0.624  96  6.41     <.0001
Pto*SL10526   2    0        3.22      0.375  96  8.59     <.0001
Pto*SL10526   2    2        6.14      0.501  96  12.26    <.0001
Pto*SL10526   1    0        4.35      0.395  96  11.01    <.0001
Pto*SL10526   1    2        5.48      0.470  96  11.65    <.0001
Pto*SL10526   1    1        7.39      0.975  96  7.59     <.0001

Page 38

Genome-Wide Scan

Chr.  Marker    F value  Pr > F
1     SL10945   2.48     0.0894
2     SL10649   0.03     0.974
2     SL10771   0.14     0.869
3     SL10910   0.37     0.6908
3     SL10736   1.33     0.2515
3     SL10494   0.22     0.6385
3     SL10425   1.17     0.3161
3     SSR601    0.05     0.828
4     SL10322   6.03     0.0034
4     SL10888   1.29     0.2803
6     SL10401   0.18     0.8362
6     SL10187   0.11     0.8935
7     SL20017   0.62     0.5377
9     SL10024   1.02     0.3651
9     LEOH8.4   0.25     0.779

Page 39

We can detect resistance conferred by the Rx-3 locus on chromosome 5

We can detect resistance conferred by Rx-4 on chromosome 11

We cannot detect QTL on chromosome 11

We can detect a strong interaction between loci on 11 and 5 (data not shown)

What needs to happen to improve prospects for “whole genome” discovery and/or selection?

More markers

Larger populations

F = Gen/Error (non-replicated)

F = Gen/Gen(Marker) (replicated)

Worst (genetic pop)

Worst (breeding pop)

Best

Page 40

Population sizes

• F-test
– Marker/Gen(Marker)
– Larger F from greater marker effect (strength of locus or closely linked to the causal gene)
– Larger F by decreasing error
– For marker studies it will nearly always be more powerful to increase the number of genotypes rather than increasing replicates of genotypes
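The last point can be made concrete. When the marker is tested against Gen(Marker), the precision of a marker-class mean depends on both the genotype-within-marker variance and the plot error variance, and only the number of genotypes reduces the first. A minimal sketch, assuming a balanced layout; the variance components in the usage note are hypothetical.

```python
def class_mean_variance(var_gen, var_error, n_genotypes, n_reps):
    """Variance of a marker-class mean built from n_genotypes, each replicated n_reps times."""
    return var_gen / n_genotypes + var_error / (n_genotypes * n_reps)
```

With 200 plots available for a marker class, a genotype variance of 1.0 and an error variance of 2.0, using 100 genotypes in 2 replicates gives a variance of 0.02 for the class mean, while 50 genotypes in 4 replicates gives 0.03. Extra replicates only shrink the error term, which is already divided by the total number of plots.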

Page 41

Sample size power estimates

                   α = 0.05 (false +), β = 0.10 (false -)    α = 0.01 (false +), β = 0.05 (false -)
N for r2 = 0.10    101                                       171
N for r2 = 0.05    206                                       349
N for r2 = 0.01    1047                                      1774

r2 = proportion of σ2P
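The slide does not say how these sample sizes were derived. One common approximation, shown here as an assumption rather than as the author's method, treats the marker as explaining a proportion r2 of the phenotypic variance and uses the Fisher z transformation of r = sqrt(r2) with a two-sided false-positive rate α and a false-negative rate β. The function name is illustrative.

```python
import math
from statistics import NormalDist


def n_for_r2(r2, alpha, beta):
    """Approximate N needed to detect a marker explaining r2 of the phenotypic variance."""
    z = NormalDist().inv_cdf
    z_total = z(1 - alpha / 2) + z(1 - beta)   # two-sided alpha, power = 1 - beta
    fisher_z = math.atanh(math.sqrt(r2))       # Fisher transformation of r
    return round((z_total / fisher_z) ** 2 + 3)
```

With α = 0.05 and β = 0.10 this approximation gives 101, 206, and 1047 for r2 = 0.10, 0.05, and 0.01. With α = 0.01 and β = 0.05 it gives values a few units below the larger set of numbers in the table, so the original calculation may have used a slightly different formula. Either way, halving the variance explained roughly doubles the population needed.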

Page 42

[Figure: two panels, each with x-axis 1 to 12 and y-axis 0 to 7.]

Discovery populations:

Magnitude of difference between R and S is large

Gen(Marker) variation moderate

Breeding populations

Difference between R and S is moderate

Gen(Marker) variation is moderate

Detecting significant marker trait associations is more difficult when magnitude of difference between genotypic classes is reduced

Page 43

Population sizes can be increased by decreasing plot replication.

“Augmented designs” with a few checks highly replicated

Checks provide “error” to assess significance of differences between un-replicated genotypes

Checks can be used to normalize data (nearest check, flanking checks, etc…)
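One simple way to normalize with checks is to shift each unreplicated entry by the amount its block's checks deviate from the overall check mean. This is a sketch of that idea only; nearest-check or flanking-check schemes would replace the block mean with a local one. The data layout and function name are assumptions.

```python
def adjust_by_checks(entries, checks):
    """Adjust unreplicated entries for block effects estimated from replicated checks.

    entries: list of (name, block, value); checks: {block: [check values in that block]}.
    """
    all_checks = [v for values in checks.values() for v in values]
    overall_mean = sum(all_checks) / len(all_checks)
    adjusted = {}
    for name, block, value in entries:
        block_mean = sum(checks[block]) / len(checks[block])
        adjusted[name] = value - (block_mean - overall_mean)   # remove the block effect
    return adjusted
```

For instance, if checks average 5.0 in block 1 and 7.0 in block 2, an entry rated 5.0 in block 1 and an entry rated 7.0 in block 2 both adjust to 6.0: their raw difference is attributed entirely to the block.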

Page 44

Take home messages:

A) Genotyping throughput and reagent packaging favor working with very large populations (~480) (effective MAS implementation will require larger populations)

B) Measuring traits (Phenotyping) is the limiting factor. (scoring larger populations will minimize Gen(Marker) error)

C) For elite populations, marker number and the ability to distinguish descent (IBD) from state (IBS) are limitations (this is a function of linkage phase and LD) (haplotypes)

D) Incorporating pedigree data or population structure data into analysis improves detection of trait associations (QTL) and the efficiency of MAS (defined as relative efficiency of selection). (corrects for structure; avoids false positives)

E) We can detect some known QTL, but not all known QTL in complex populations. Power goes up with population size and marker number. (Marker analysis is still more descriptive than predictive)

F) Phenotypic selection is effective.

Page 45

Acknowledgments

Francis Group: Matt Robbins, Sung-Chur Sim, Troy Aldrich

Collaborators, OSU: Esther van der Knaap, Bert Bishop, Tea Meulia, Sally Miller, Melanie Lewis Ivey

Collaborators, UCD: Allen Van Deynze, Kevin Stoffel, Alex Kozic

Collaborators, CAU: Hui Wang, Wencai Yang

Collaborators, UFL: Jay Scott, Sam Hutton

Funding: USDA/AFRI; OARDC RECGP matching funds grant; MAFPA

Page 46