single marker analysis and interval mapping · save computation time at one or multiple genomic...

Single Marker Analysis and Interval Mapping Jiankang Wang, CIMMYT China and CAAS E-mail: [email protected]; [email protected] Web: http://www.isbreeding.net 1


Post on 19-Jul-2020


Page 1

Single Marker Analysis and Interval Mapping

Jiankang Wang, CIMMYT China and CAAS

E-mail: [email protected]; [email protected]

Web: http://www.isbreeding.net

1

Page 2

Comparison of Estimated Recombination Frequency in Bi-Parental Genetic Populations

Sun Z., H. Li*, L. Zhang, J. Wang. 2012. Estimation of recombination frequency in biparental genetic populations. Genetics Research 94: 163-177

Page 3

Populations handled in QTL IciMapping

[Diagram: crossing scheme from Parent P1 × Parent P2 (hybridization) to the F1, followed by backcrossing, selfing, repeated selfing, or doubled haploid production. The twenty numbered populations in the diagram are listed below.]

1. P1BC1F1; 2. P2BC1F1; 3. F1DH; 4. F1RIL; 5. P1BC1RIL; 6. P2BC1RIL; 7. F2; 8. F3; 9. P1BC2F1; 10. P2BC2F1; 11. P1BC2RIL; 12. P2BC2RIL; 13. P1BC1F2; 14. P2BC1F2; 15. P1BC2F2; 16. P2BC2F2; 17. P1BC1DH; 18. P2BC1DH; 19. P1BC2DH; 20. P2BC2DH

Further backcrossing (BC3F1, BC4F1, etc.) with marker-assisted selection: CSS lines or introgression lines

One NAM population: P1 × CP, P2 × CP, P3 × CP, …, Pn × CP (CP = common parent), giving RIL family 1, RIL family 2, RIL family 3, …, RIL family i, …, RIL family n

Page 4

The genetic analysis can be very complicated even with biparental populations!

• F1-derived populations – F2, F3, DH, RIL: p=0.5, q=0.5 at each locus

• P1BC1-derived population – F2, F3, DH, RIL: p=0.75, q=0.25 at each locus

• P2BC1-derived population – F2, F3, DH, RIL: p=0.25, q=0.75 at each locus

• P1BC2-derived population – F2, F3, DH, RIL: p=0.875, q=0.125 at each locus

• P2BC2-derived population – F2, F3, DH, RIL: p=0.125, q=0.875 at each locus
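The allele frequencies listed above follow from a simple recursion: each backcross to P1 averages the current P1-allele frequency with 1, and selfing or doubled haploid production leaves the expected frequency unchanged. The sketch below assumes no selection or segregation distortion; the function name is illustrative and is not part of QTL IciMapping.

```python
def p1_allele_frequency(n_backcrosses_to_p1):
    """Expected frequency p of the P1 allele after backcrossing the F1 to P1."""
    p = 0.5  # F1: one allele from each parent
    for _ in range(n_backcrosses_to_p1):
        # The recurrent parent P1 contributes only its own allele.
        p = (p + 1.0) / 2.0
    return p


for t in range(3):
    p = p1_allele_frequency(t)
    print(f"{t} backcross(es) to P1: p = {p}, q = {1.0 - p}")
```

By symmetry, the P2BC1- and P2BC2-derived populations swap p and q.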

4

Page 5

Twenty biparental populations

• Allele frequencies can differ among populations

• Genotypes and their frequencies can differ substantially

• Are they equally good for estimating the recombination frequency between two linked loci?

Page 6

LOD scores from different populations: true r = 0.05 (upper), true r = 0.2 (lower)

[Figure: two charts of LOD (y-axis) by population type (x-axis: F2, F3, F1DH, F1RIL, BC1F1, BC1F2, BC1DH, BC1RIL, BC2F1, BC2F2, BC2DH, BC2RIL), with series for PopSize = 50, 100, and 200. Upper panel: true r = 0.05; lower panel: true r = 0.2.]

Page 7

Deviations from the true value (upper) and standard errors (lower) of estimated recombination frequency

[Figure: two charts by population type (x-axis: F2, F3, F1DH, F1RIL, BC1F1, BC1F2, BC1DH, BC1RIL, BC2F1, BC2F2, BC2DH, BC2RIL), with series for PopSize = 50, 100, and 200. Upper panel: deviation, true r = 0.3; lower panel: standard error, true r = 0.3.]

Page 8

Observations

• When the two alleles at each locus have an equal frequency of 0.5, the recombination frequency is estimated better.

• When a population has more genotypes, the recombination frequency is estimated better.

• For F2 and F3 to be efficient, co-dominant markers are needed.

Page 9

Minimum population size to have at least one recombinant

Pop.        r=0.01   r=0.02   r=0.03   r=0.05   r=0.1   r=0.2   r=0.3
F2 (C, C)   150      75       50       30       15      8       5
F2 (C, D)   299      149      99       60       31      16      11
F2 (C, R)   299      149      99       60       31      16      11
F2 (D, D)   299      149      99       61       31      16      11
F2 (D, R)   149786   29956    13616    4754     1197    299     132
F2 (R, R)   299      149      99       61       31      16      11
DH          299      149      99       59       29      14      9
RIL         152      77       52       32       17      9       7

In the first column, C for co-dominant marker; D for dominant marker; R for recessive marker
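The slide does not state the criterion behind these numbers. As a hedged illustration, the DH row can be reproduced by assuming that "at least one recombinant" means a 95% probability of seeing one or more recombinant lines, with each DH line being recombinant with probability r. The function name and the 0.95 level are assumptions of this sketch, not taken from the source.

```python
import math


def min_dh_population_size(r, prob=0.95):
    """Smallest n with P(at least one recombinant among n DH lines) >= prob."""
    # P(no recombinant among n lines) = (1 - r) ** n, so we need
    # (1 - r) ** n <= 1 - prob, i.e. n >= ln(1 - prob) / ln(1 - r).
    return math.ceil(math.log(1.0 - prob) / math.log(1.0 - r))


sizes = [min_dh_population_size(r) for r in (0.01, 0.02, 0.03, 0.05, 0.1, 0.2, 0.3)]
print(sizes)
```

Under this assumption the computed sizes agree with the DH row of the table; the F2 rows need genotype-specific probabilities of an observable recombinant and are not covered here.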

Page 10

Outlines

• Quantitative Traits and QTL Mapping

• Single Marker Analysis

• The Conventional (Simple) Interval Mapping

10

Page 11

Quantitative Traits and QTL Mapping

11

Page 12

Quantitative traits in genetics

[Figures: five histograms of counts against midpoint group value. Panels: distribution of height (inches) among 4995 British women; ear length (cm) of one maize inbred line (P1); ear length (cm) of another maize inbred line (P2); ear length (cm) of their F1 hybrids; ear length (cm) of their F2.]

Page 13

Quantitative traits

• Continuous phenotypic variation

• Affected by many genes

• Affected by environment

• Epistasis

• Polygene (or multi-factorial) hypothesis

• Classical quantitative genetics

13

Page 14

A quantitative trait does not have to be “continuous”

• Categorical traits: traits in which the phenotype corresponds to any one of a number of discrete categories

– Number of skin ridges forming the fingerprints

– Number of kernels on an ear of corn

– Number of puppies in a litter

• Threshold traits: traits that have only two, or a few, phenotypic classes, but whose inheritance is determined by the effects of multiple genes acting together with the environment

– Liability to express the trait, which is not directly observable.

– When liability is high enough (above a “threshold”), the trait will be expressed; otherwise, the trait is not expressed.

Page 15

What is QTL Mapping?

• The procedure to map individual genetic factors with small effects on the quantitative traits, to specific chromosomal segments in the genome

• The key questions in QTL mapping studies are:

– How many QTL are there?

– Where are they located in the marker map?

– How large an influence does each of them have on the trait of interest?

– Are they interacting with each other?

– Are they stably expressed across environments?

15

Page 16

Dataset of QTL mapping

• Mapping population

• Marker data of each individual in the mapping population

• Linkage map

• Phenotypic data

16

Page 17

Example: 10 RIL of Rice (linkage map of Chr. 5)

Marker         C263  R830  R3166  XNpb387  R569  R1553  C128  C1402  XNpb81  C246  R2953  C1447  Grain width (mm)
Position (cM)  0.0   3.5   8.5    19.5     32.0  66.6   74.1  78.6   81.8    91.9  92.7   96.8
RIL1           0     0     0      0        0     0      0     0      0       0     0      0      2.33
RIL2           2     2     2      2        2     0      0     0      0       2     2      2      1.99
RIL3           0     2     2      2        2     2      2     2      2       2     2      2      2.24
RIL4           0     0     0      0        0     0      2     2      2       2     2      2      1.94
RIL5           0     0     0      0        0     2      2     0      0       0     0      0      2.76
RIL6           0     0     0      2        2     2      2     2      2       2     2      2      2.32
RIL7           0     0     0      0        0     0      0     0      0       0     0      0      2.32
RIL8           2     2     0      2        2     0      0     0      0       2     2      2      2.08
RIL9           0     0     0      0        2     2      0     0      0       0     0      0      2.24
RIL10          0     0     0      0        2     2      0     0      0       0     0      0      2.45

Page 18

Classification of mapping populations

• Bi-parental mapping populations (linkage mapping)

– Temporary population: F2 and BC

– Permanent population: RIL, DH, CSSL

– Secondary population

• Association mapping

– Natural populations: human and animals

Page 19

Overview on QTL mapping methods

• Single marker analysis (Sax 1923; Soller et al. 1976)

– Single marker analysis identifies QTLs from the difference between the mean phenotypes of different marker groups, but cannot separate the estimate of recombination fraction from that of QTL effect.

• Interval mapping (IM) (Lander and Botstein 1989)

– IM is based on maximum likelihood parameter estimation and provides a likelihood ratio test for QTL position and effect. Its major disadvantage is that the estimates of QTL locations and effects may be biased when QTLs are linked.

• Regression interval mapping (RIM) (Haley and Knott 1992; Martinez and Curnow 1992)

– RIM was proposed to approximate maximum likelihood interval mapping and so save computation time at one or multiple genomic positions.

• Composite interval mapping (CIM) (Zeng 1994)

– CIM combines IM with multiple marker regression analysis, which controls for the effects of QTLs in other intervals or on other chromosomes when the current interval is tested, and thus increases the precision of QTL detection.

Page 20

Overview on QTL mapping methods

• Multiple interval mapping (MIM) (Kao et al. 1999)

– MIM is a state-of-the-art gene mapping procedure. But implementation of the multiple-QTL model is difficult, since the number of QTL defines the dimension of the model and is itself an unknown parameter of interest.

• Bayesian model (Sillanpää and Corander 2002)

– In any Bayesian model, a prior distribution has to be specified. From the prior, Bayesian statistics derives the posterior and then conducts inference based on the posterior distribution. However, Bayesian models have not been widely used in practice, partly because of the complexity of computation and the lack of user-friendly software.

• Inclusive Composite Interval Mapping (Li et al. 2006)

– In the first step, stepwise regression is applied to identify the most significant regression variables, both for additive mapping and for epistasis mapping, but with different probability levels for entering and removing variables. In the second step, one-dimensional scanning (interval mapping) is conducted to map additive QTL, and two-dimensional scanning is conducted to map digenic epistasis.

Page 21

Single Marker Analysis

21

Page 22

Evidence for marker and QTL association

• Three marker types, MM, Mm, and mm, at one marker locus

• When the marker is linked with a QTL, the three marker types will have unequal means.

[Figure: two panels of phenotypic distributions for marker types mm, Mm, and MM.]

Page 23

Backcrosses (P1BC1 and P2BC1) of P1: MMQQ and P2: mmqq

P1BC1F1                                    P2BC1F1
Genotype  Frequency  Genotypic value       Genotype  Frequency  Genotypic value
MMQQ      (1-r)/2    m+a                   MmQq      (1-r)/2    m+d
MMQq      r/2        m+d                   Mmqq      r/2        m-a
MmQQ      r/2        m+a                   mmQq      r/2        m+d
MmQq      (1-r)/2    m+d                   mmqq      (1-r)/2    m-a

Page 24

Difference between the two marker types (P1BC1 as example)

• Two marker types:

$$\mu_{MM} = (1-r)\,\mu_{MMQQ} + r\,\mu_{MMQq} = (1-r)(m+a) + r(m+d) = m + (1-r)a + rd$$

$$\mu_{Mm} = r\,\mu_{MmQQ} + (1-r)\,\mu_{MmQq} = r(m+a) + (1-r)(m+d) = m + ra + (1-r)d$$

• Difference in phenotype between the two types:

$$\mu_{MM} - \mu_{Mm} = (1-2r)(a-d)$$
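The difference between the two marker means depends only on the product of (1 − 2r) and (a − d), so a tightly linked QTL with a small effect and a loosely linked QTL with a large effect can look identical at the marker. A minimal Python sketch of this confounding, using the genotype frequencies and values from the P1BC1 table; the function names and the numeric values are illustrative.

```python
def marker_means_p1bc1(m, a, d, r):
    """Expected phenotypic means of marker types MM and Mm in P1BC1."""
    mean_MM = (1 - r) * (m + a) + r * (m + d)
    mean_Mm = r * (m + a) + (1 - r) * (m + d)
    return mean_MM, mean_Mm


def marker_difference(m, a, d, r):
    """Difference between marker means; algebraically (1 - 2r)(a - d)."""
    mean_MM, mean_Mm = marker_means_p1bc1(m, a, d, r)
    return mean_MM - mean_Mm


# QTL at the marker with a small effect ...
close_small = marker_difference(m=10.0, a=1.0, d=0.0, r=0.0)
# ... versus a QTL at r = 0.25 with twice the effect.
far_large = marker_difference(m=10.0, a=2.0, d=0.0, r=0.25)
print(close_small, far_large)
```

Both scenarios give the same marker difference, which is why single marker analysis cannot separate the QTL effect from the marker-QTL distance.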

Page 25

A barley DH population

[Figure: two panels showing the frequency distribution of kernel weight (x-axis from 36 to 50) for marker types 0 and 2. First panel: marker locus Act8A; second panel: marker locus Act8B.]

Page 26

Significance test on phenotypic means of marker types

Parameter            Marker Act8A         Marker Act8B
                     Type 0    Type 2     Type 0    Type 2
Sample size          70        74         58        69
Degrees of freedom   69        73         57        68
Mean                 42.23     42.79      43.89     41.25
Variance             4.45      5.32       3.53      2.79
Standard deviation   2.11      2.31       1.88      1.67
Combined variance    4.90                 3.13
t-test               1.51 (P=0.1342)      8.37 (P=1.00E-13)
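The combined variances and t statistics in this table can be checked with a pooled two-sample t-test computed from the summary statistics alone. The sketch below uses only the sample sizes, means, and variances from the table; the function name is illustrative, and the P values are not recomputed because that would need a t distribution routine.

```python
import math


def pooled_t(n1, mean1, var1, n2, mean2, var2):
    """Pooled variance and absolute t statistic for two marker types."""
    pooled = ((n1 - 1) * var1 + (n2 - 1) * var2) / (n1 + n2 - 2)
    t = abs(mean1 - mean2) / math.sqrt(pooled * (1.0 / n1 + 1.0 / n2))
    return pooled, t


# Marker Act8A: type 0 versus type 2
var_a, t_a = pooled_t(70, 42.23, 4.45, 74, 42.79, 5.32)
# Marker Act8B: type 0 versus type 2
var_b, t_b = pooled_t(58, 43.89, 3.53, 69, 41.25, 2.79)
print(round(var_a, 2), round(t_a, 2))
print(round(var_b, 2), round(t_b, 2))
```

The results agree with the table up to rounding of the reported means and variances.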

Page 27

A soybean F2 population

[Figure: two panels showing the frequency distribution of chlorophyll content (x-axis from 0 to 50) for marker types 0, 1, and 2. First panel: marker locus *Satt339; second panel: marker locus *Sat_033.]

Page 28

Significance test on phenotypic means of marker types

Parameter            Marker Act8A                   Marker Act8B
                     Type 0   Type 1   Type 2       Type 0   Type 1   Type 2
Sample size          65       111      39           56       90       62
Mean                 35.16    32.76    14.22        30.72    30.47    29.20
Variance             47.71    40.42    65.52        92.13    97.24    133.28
Standard deviation   6.91     6.65     8.09         9.60     9.86     11.54
t-test of additive   15.06 (P=1.43E-33)             0.80 (P=0.4264)
t-test of dominance  8.47 (P=5.89E-13)              0.35 (P=0.7270)
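The slide does not give the formulas behind the additive and dominance tests. One reconstruction that reproduces the reported t values treats them as linear contrasts of the three marker-type means, with a variance pooled over the three types: additive = (mean0 − mean2)/2 and dominance = mean1 − (mean0 + mean2)/2. The sketch below is written under that assumption; the function name and the sign convention are illustrative.

```python
import math


def f2_marker_t_tests(n, mean, var):
    """Additive and dominance contrasts for marker types 0, 1, 2 in an F2."""
    df = sum(n) - 3
    pooled = sum((ni - 1) * vi for ni, vi in zip(n, var)) / df
    additive = (mean[0] - mean[2]) / 2.0
    dominance = mean[1] - (mean[0] + mean[2]) / 2.0
    # Standard errors of the two contrasts under a common variance.
    se_add = math.sqrt(pooled * (1.0 / n[0] + 1.0 / n[2]) / 4.0)
    se_dom = math.sqrt(pooled * (1.0 / n[1] + 1.0 / (4 * n[0]) + 1.0 / (4 * n[2])))
    return additive, dominance, abs(additive) / se_add, abs(dominance) / se_dom


# First marker in the table
first = f2_marker_t_tests([65, 111, 39], [35.16, 32.76, 14.22], [47.71, 40.42, 65.52])
# Second marker in the table
second = f2_marker_t_tests([56, 90, 62], [30.72, 30.47, 29.20], [92.13, 97.24, 133.28])
print([round(x, 2) for x in first])
print([round(x, 2) for x in second])
```

The last two entries of each result are the t statistics for the additive and dominance contrasts.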

Page 29

Problems with the Single Marker Analysis

• Cannot separate QTL effect and the marker-QTL distance

• Low detection power

• Does not take advantage of the genetic linkage map

Page 30

Conventional Interval Mapping

30

Page 31

Interval mapping (IM) (Lander and Botstein 1989; a milestone in QTL mapping methodology and applications)

• Linear model (j = 1, 2, …, n):

$$y_j = b_0 + b^{*} x_j^{*} + e_j$$

• $b^{*}$ represents the QTL effect; $x_j^{*}$ is the indicator variable (0 or 1) for the QTL genotype

• Likelihood profile

• Support interval: one-LOD interval

Page 32

QTL genotypes under each marker type in P1BC1 (three loci need to be considered simultaneously; double crossover is not considered on this slide)

[Diagram: P1 (Mi Q Mi+1 / Mi Q Mi+1) × P2 (mi q mi+1 / mi q mi+1) gives the F1 (Mi Q Mi+1 / mi q mi+1), which is backcrossed to P1. The P1BC1 progeny fall into marker classes I, II, III, and IV according to their genotypes at the flanking markers Mi and Mi+1.]

• Marker class I (gamete Mi Mi+1 from the F1): QTL genotype QQ

• Marker class II (gamete Mi mi+1): QTL genotype QQ or Qq

• Marker class III (gamete mi Mi+1): QTL genotype QQ or Qq

• Marker class IV (gamete mi mi+1): QTL genotype Qq

Page 33

Marker types and QTL types in DH populations (double crossover is considered on this slide)

[Diagram: the two parental haplotypes (left marker, QTL, right marker) and the gametes produced by the F1 under each crossover pattern.]

Crossover pattern                             Gamete frequency (pair)   Frequency of each DH genotype
No crossover                                  (1-rL)(1-rR)              ½ (1-rL)(1-rR)
One crossover between left marker and QTL     rL(1-rR)                  ½ rL(1-rR)
One crossover between QTL and right marker    (1-rL)rR                  ½ (1-rL)rR
Two crossovers between the two markers        rLrR                      ½ rLrR

Page 34

QTL types under each marker class

Marker class                   Frequency of QQ      Frequency of qq
I (left P1 type, right P1)     ½ (1-rL)(1-rR)       ½ rLrR
II (left P1 type, right P2)    ½ (1-rL)rR           ½ rL(1-rR)
III (left P2 type, right P1)   ½ rL(1-rR)           ½ (1-rL)rR
IV (left P2 type, right P2)    ½ rLrR               ½ (1-rL)(1-rR)

Page 35

Two QTL genotypes in 4 marker classes in DH population

[Figure: four panels of probability density against the quantitative trait (x-axis from 0 to 8), one per marker class (AABB, AAbb, aaBB, aabb), each showing the component distributions of QTL genotypes QQ and qq.]

Proportion of QTL genotypes depends on QTL position and the marker interval

Page 36

Three QTL genotypes in 9 marker classes in F2 population

[Figure: nine panels of probability density against the quantitative trait (x-axis from 0 to 8), one per marker class (AABB, AABb, AAbb, AaBB, AaBb, Aabb, aaBB, aaBb, aabb), each showing the component distributions of QTL genotypes QQ, Qq, and qq.]

Proportion of QTL genotypes depends on QTL position and the marker interval

Page 37

Frequency of QTL genotypes in each marker class in DH population

Marker interval    Sample size   Frequency    QTL genotype
Left    Right                                 QQ                   qq
AA      BB         n1            ½ (1-r)      ½ (1-rL-rR+rLrR)     ½ rLrR
AA      bb         n2            ½ r          ½ (1-rL)rR           ½ rL(1-rR)
aa      BB         n3            ½ r          ½ rL(1-rR)           ½ (1-rL)rR
aa      bb         n4            ½ (1-r)      ½ rLrR               ½ (1-rL-rR+rLrR)

$$r = r_L + r_R - 2 r_L r_R$$
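Dividing each QTL genotype frequency in the table by the frequency of its marker class gives the conditional probability of QQ or qq given the flanking markers, which is the quantity interval mapping works with at each scanning position. A minimal Python sketch; the function name and the example values of rL and rR are illustrative.

```python
def dh_qtl_genotype_probs(r_left, r_right):
    """P(QQ | marker class) and P(qq | marker class) in a DH population."""
    # Recombination frequency between the two flanking markers.
    r = r_left + r_right - 2.0 * r_left * r_right
    # Joint frequencies (QQ, qq) for each marker class, as in the table.
    joint = {
        ("AA", "BB"): (0.5 * (1 - r_left) * (1 - r_right), 0.5 * r_left * r_right),
        ("AA", "bb"): (0.5 * (1 - r_left) * r_right, 0.5 * r_left * (1 - r_right)),
        ("aa", "BB"): (0.5 * r_left * (1 - r_right), 0.5 * (1 - r_left) * r_right),
        ("aa", "bb"): (0.5 * r_left * r_right, 0.5 * (1 - r_left) * (1 - r_right)),
    }
    conditional = {}
    for marker_class, (p_QQ, p_qq) in joint.items():
        total = p_QQ + p_qq  # frequency of the marker class
        conditional[marker_class] = (p_QQ / total, p_qq / total)
    return r, conditional


r, probs = dh_qtl_genotype_probs(r_left=0.1, r_right=0.2)
print(r)
for marker_class, (p_QQ, p_qq) in probs.items():
    print(marker_class, round(p_QQ, 4), round(p_qq, 4))
```

The class totals equal ½(1 − r) for the parental classes and ½ r for the recombinant classes, matching the frequency column.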

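Page 38

MLEs of means of QTL genotypes

The phenotype of line j in marker class i follows a mixture of the q QTL genotype distributions, where $\pi_{ik}$ is the proportion of QTL genotype k in marker class i:

$$Y_{ij} \sim \sum_{k=1}^{q} \pi_{ik}\, N(\mu_k, \sigma^2)$$

$$\ln L(\mu_1, \ldots, \mu_q, \sigma^2 \mid \mathbf{Y} = \mathbf{y}) = \sum_{i=1}^{m} \sum_{j=1}^{n_i} \ln \left[ \sum_{k=1}^{q} \pi_{ik}\, f(y_{ij} \mid \mu_k, \sigma^2) \right]$$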

Page 39

EM algorithm for calculating MLE

• The Expectation step, given initial values $\mu_k^{(0)}$ and $\sigma^{2(0)}$:

$$w_{ijk} = \frac{\pi_{ik}\, f(y_{ij} \mid \mu_k^{(0)}, \sigma^{2(0)})}{\sum_{k'=1}^{q} \pi_{ik'}\, f(y_{ij} \mid \mu_{k'}^{(0)}, \sigma^{2(0)})}$$

• $w_{ijk}$ is the probability that DH line j in marker class i has QTL genotype k, given its marker class and phenotype

Page 40

EM algorithm for calculating MLE

• The Maximization step, treating the QTL genotypes as known through the weights $w_{ijk}$ from the E-step:

$$\ln L(\mu_1, \ldots, \mu_q, \sigma^2 \mid \mathbf{X} = \mathbf{x}) = \sum_{i=1}^{m} \sum_{j=1}^{n_i} \sum_{k=1}^{q} w_{ijk} \ln f(y_{ij} \mid \mu_k, \sigma^2)$$

$$\mu_k^{(1)} = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n_i} w_{ijk}\, y_{ij}}{\sum_{i=1}^{m} \sum_{j=1}^{n_i} w_{ijk}}$$

$$\sigma^{2(1)} = \frac{\sum_{i=1}^{m} \sum_{j=1}^{n_i} \sum_{k=1}^{q} w_{ijk}\, (y_{ij} - \mu_k^{(1)})^2}{\sum_{i=1}^{m} \sum_{j=1}^{n_i} \sum_{k=1}^{q} w_{ijk}}$$
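The E-step and M-step can be sketched for a DH population at a single scanning position. The code below is an illustration under stated assumptions, not the QTL IciMapping implementation: the data are simulated, marker classes are coded as (left, right) pairs with 1 for the P1 type and 0 for the P2 type, the recombination frequencies rL and rR are taken as known, and all function names are hypothetical.

```python
import math
import random


def normal_pdf(y, mu, var):
    return math.exp(-(y - mu) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def dh_priors(r_left, r_right):
    """P(QQ), P(qq) given the marker class (left, right); 1 = P1 type."""
    nn = (1 - r_left) * (1 - r_right)  # no crossover
    lc = r_left * (1 - r_right)        # crossover between left marker and QTL
    rc = (1 - r_left) * r_right        # crossover between QTL and right marker
    dc = r_left * r_right              # crossovers on both sides
    joint = {(1, 1): (nn, dc), (1, 0): (rc, lc), (0, 1): (lc, rc), (0, 0): (dc, nn)}
    return {cls: (a / (a + b), b / (a + b)) for cls, (a, b) in joint.items()}


def simulate_dh(n, r_left, r_right, means, sigma, seed):
    """Simulated DH lines; means = (mean of QQ, mean of qq)."""
    rng = random.Random(seed)
    classes, y = [], []
    for _ in range(n):
        qtl = rng.randint(0, 1)  # 1 = QQ, 0 = qq
        left = qtl if rng.random() >= r_left else 1 - qtl
        right = qtl if rng.random() >= r_right else 1 - qtl
        classes.append((left, right))
        y.append(rng.gauss(means[0] if qtl == 1 else means[1], sigma))
    return classes, y


def em_scan_point(classes, y, r_left, r_right, max_iter=500, tol=1e-10):
    """EM estimates of (mu_QQ, mu_qq), sigma^2, and the LOD score."""
    prior = dh_priors(r_left, r_right)
    n = len(y)
    mean0 = sum(y) / n
    var0 = sum((v - mean0) ** 2 for v in y) / n
    loglik0 = sum(math.log(normal_pdf(v, mean0, var0)) for v in y)  # H0
    mu = [mean0 + 0.5 * math.sqrt(var0), mean0 - 0.5 * math.sqrt(var0)]
    var = var0
    previous = -math.inf
    for _ in range(max_iter):
        # E-step: posterior weight of each QTL genotype for each line.
        weights, loglik = [], 0.0
        for cls, v in zip(classes, y):
            parts = [prior[cls][k] * normal_pdf(v, mu[k], var) for k in range(2)]
            total = sum(parts)
            loglik += math.log(total)
            weights.append([p / total for p in parts])
        if loglik - previous < tol:
            break  # loglik belongs to the current mu and var
        previous = loglik
        # M-step: weighted means and a common weighted variance.
        for k in range(2):
            wk = sum(w[k] for w in weights)
            mu[k] = sum(w[k] * v for w, v in zip(weights, y)) / wk
        var = sum(w[k] * (v - mu[k]) ** 2
                  for w, v in zip(weights, y) for k in range(2)) / n
    lod = (loglik - loglik0) / math.log(10.0)
    return mu, var, lod


classes, y = simulate_dh(200, 0.05, 0.05, means=(12.0, 10.0), sigma=1.0, seed=1)
mu_hat, var_hat, lod = em_scan_point(classes, y, 0.05, 0.05)
print(mu_hat, var_hat, lod)
```

In a genome scan this calculation would be repeated at every testing position, with rL and rR recomputed from the distances between the position and its flanking markers.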

Page 41

Test the existence of QTL

$$H_0: \mu_1 = \cdots = \mu_q = \mu_0$$

$$H_A: \mu_1, \ldots, \mu_q \text{ are not equal}$$

Likelihood under H0:

$$L(\mu_0, \sigma_0^2 \mid \mathbf{Y} = \mathbf{y}) = \prod_{i=1}^{m} \prod_{j=1}^{n_i} f(y_{ij} \mid \mu_0, \sigma_0^2)$$

Likelihood ratio test:

$$LRT = -2 \ln \frac{\max L(H_0)}{\max L(H_A)} \sim \chi^2(df = q - 1)$$

Logarithm of odds (LOD):

$$LOD = \log_{10} \frac{\max L(H_A)}{\max L(H_0)}$$
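Because both statistics are built from the same likelihood ratio, they differ only by a constant: LRT = 2 ln(10) × LOD, roughly 4.605 × LOD. A small Python sketch of the conversion; the function names are illustrative.

```python
import math


def lod_from_lrt(lrt):
    """Convert a likelihood ratio test statistic to a LOD score."""
    return lrt / (2.0 * math.log(10.0))


def lrt_from_lod(lod):
    """Convert a LOD score to a likelihood ratio test statistic."""
    return 2.0 * math.log(10.0) * lod


def lod_from_loglik(loglik_alt, loglik_null):
    """LOD from the maximized natural-log likelihoods under HA and H0."""
    return (loglik_alt - loglik_null) / math.log(10.0)


print(lrt_from_lod(1.0))
```

This is handy when comparing thresholds reported on the LOD scale with chi-square critical values on the LRT scale.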

Page 42

Estimation of genetic effects

DH populations: 1 for QQ, 2 for qq

$$\mu_1 = m + a, \quad \mu_2 = m - a$$

$$\hat{m} = \tfrac{1}{2}(\hat{\mu}_1 + \hat{\mu}_2), \quad \hat{a} = \tfrac{1}{2}(\hat{\mu}_1 - \hat{\mu}_2)$$

F2 populations: 1 for QQ, 2 for Qq, 3 for qq

$$\mu_1 = m + a, \quad \mu_2 = m + d, \quad \mu_3 = m - a$$

$$\hat{m} = \tfrac{1}{2}(\hat{\mu}_1 + \hat{\mu}_3), \quad \hat{a} = \tfrac{1}{2}(\hat{\mu}_1 - \hat{\mu}_3), \quad \hat{d} = \hat{\mu}_2 - \tfrac{1}{2}(\hat{\mu}_1 + \hat{\mu}_3)$$
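As a small worked illustration, the genetic effects follow directly from the estimated genotype means. The function names and the numeric means below are hypothetical and are not taken from the barley or soybean examples.

```python
def dh_effects(mu_QQ, mu_qq):
    """Mid-parent value m and additive effect a in a DH population."""
    m = (mu_QQ + mu_qq) / 2.0
    a = (mu_QQ - mu_qq) / 2.0
    return m, a


def f2_effects(mu_QQ, mu_Qq, mu_qq):
    """Mid-parent value m, additive effect a, dominance effect d in an F2."""
    m = (mu_QQ + mu_qq) / 2.0
    a = (mu_QQ - mu_qq) / 2.0
    d = mu_Qq - m
    return m, a, d


print(dh_effects(12.0, 8.0))
print(f2_effects(12.0, 11.5, 8.0))
```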

Page 43

Contribution of a QTL

$$PVE = \frac{V_G}{V_P} \times 100\%$$

No distortion

$$V_{G(DH)} = \hat{a}^2$$

$$V_{G(F2)} = \tfrac{1}{2}\hat{a}^2 + \tfrac{1}{4}\hat{d}^2$$

With distortion

$$V_{G(DH)} = 4 f_{QQ} f_{qq}\, a^2$$

$$V_{G(F2)} = [f_{QQ} + f_{qq} - (f_{QQ} - f_{qq})^2]\, a^2 - 2 f_{Qq} (f_{QQ} - f_{qq})\, a d + (f_{Qq} - f_{Qq}^2)\, d^2$$
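All four variance formulas are special cases of the variance of the genotypic values a, d, and −a under genotype frequencies f_QQ, f_Qq, and f_qq. The sketch below computes that variance directly and checks the special cases numerically; the function names and the example numbers are illustrative.

```python
def genetic_variance(f_QQ, f_Qq, f_qq, a, d=0.0):
    """Variance of genotypic values (a, d, -a) under the given frequencies."""
    mean = (f_QQ - f_qq) * a + f_Qq * d
    second_moment = (f_QQ + f_qq) * a * a + f_Qq * d * d
    return second_moment - mean * mean


def pve(v_g, v_p):
    """Phenotypic variation explained by the QTL, in percent."""
    return 100.0 * v_g / v_p


a, d = 2.0, 1.5
print(genetic_variance(0.5, 0.0, 0.5, a))        # DH, no distortion
print(genetic_variance(0.25, 0.5, 0.25, a, d))   # F2, no distortion
print(genetic_variance(0.7, 0.0, 0.3, a))        # DH, with distortion
print(pve(genetic_variance(0.5, 0.0, 0.5, a), v_p=10.0))
```

With distorted frequencies the cross term in a and d no longer vanishes in the F2, which is why the longer formula is needed.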

Page 44

Interval mapping in a barley DH population

[Figure: three profiles from one-dimensional scanning on the seven barley chromosomes, step = 1 cM. Panels: LOD score; additive effect; PVE (%). The x-axis runs across chromosomes 1 to 7.]

Page 45

QTL identified in the barley DH population

Chromo.  Position (cM)  Left marker  Right marker  LOD    PVE (%)  Additive
5        3              ABA306B      Act88         13.15  34.55    -1.31
7        0              dRpg1        iPgd1A        2.55   7.79     -0.62
7        98             VAtp57A      MWG571D       5.36   15.77    -0.89

Page 46

Interval mapping in a soybean F2 population

[Figure: two profiles from scanning on one chromosome (x-axis: position from 0 to 160). Panels: LOD; genetic effect, with separate curves for additive and dominance effects.]

Page 47

QTL identified in the soybean F2 population

Position (cM)  Left marker  Right marker  LOD    PVE (%)  Additive  Dominance
39             *Satt285     *Sat_239      27.89  73.02    -10.96    9.48
78             *Sat_239     *Satt255      36.00  68.49    -11.09    9.39
91             *Satt255     *Satt339      38.23  58.21    -10.66    8.90
131            *Satt521     *Sat_033      18.24  69.56    -11.07    9.68

Page 48

Problems with Simple Interval Mapping

• Multiple peaks when QTLs are unlinked

[Figure: LOD profile from scanning on 6 chromosomes, each of 120 cM. Step = 1 cM.]

Page 49

Problems with Simple Interval Mapping

• Ghost QTL when two QTLs are linked

[Figure: LOD profile from scanning on 6 chromosomes, each of 120 cM. Step = 1 cM. One peak is labelled "Ghost QTL".]

Page 50

Problems with Simple Interval Mapping

• Biased estimation of QTL effects

[Figure: additive effect profile from scanning on 6 chromosomes, each of 120 cM. Step = 1 cM.]

Page 51

Practical 5: QTL IciMapping software (I) (last week)

• Linkage map construction (MAP functionality)

– Grouping: assign markers to groups and set anchor information

– Ordering: order the markers

– Rippling: fine-tune the map

– Drawing linkage maps

– Input and output

• Integration of multiple maps (CMP functionality)

• Tool for estimating the recombination frequency between two loci (2pointREC)

• Analysis of variance tool (ANOVA)

Page 52

Practical 5: QTL IciMapping software (II)

• QTL mapping in biparental populations (BIP functionality)

– Choose the mapping method

– Set the mapping parameters

– Analyze the results

• Power analysis of QTL mapping (BIP functionality)

– Specify the genetic model

– Power analysis