
Post on 14-Dec-2015



VISG Polyploids Project

Rod Ball & Phillip Wilcox, Scion

Sammie Yilin Jia, Plant and Food Research

Gail Timmerman-Vaughan, Plant and Food Research

Nihal DeSilva, Plant and Food Research


Abstract

A number of important horticultural, crop, and forage species are polyploids. Existing QTL mapping in polyploids is limited to specific marker types and segregation patterns, and inference is limited. The VISG polyploids project is developing methods for QTL mapping in polyploids which make full use of available marker information and enable multi-locus Bayesian inference of the genetic architecture. We review progress and outline the modelling approach and future plans for development.

For QTL mapping in diploid plants, with moderate to large full-sib families, it is usually possible to infer the segregation pattern and linkage phase for each marker. Markers are typically fully informative for one or both parents.

However, polyploids have 2 or more sub-genomes, resulting in (e.g. for an allo-tetraploid) 4 or more alleles at each locus, each of which could have been inherited from one of 8 grand-parental chromosomes. Markers are rarely fully informative. Moreover, the number of copies of each allele is not observed with current marker technology, and it is necessary to distinguish marker genotypes from marker phenotypes. For example, for a marker with dominant allele M, genotypes Mmmm, MMmm, MMMm, MMMM and permutations have the same phenotype.
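The distinction between marker genotype and marker phenotype can be illustrated with a minimal Python sketch. It assumes a single dominant marker scored as present or absent in a tetraploid; the function name `marker_phenotype` and the string encoding of genotypes are illustrative choices, not part of the project's software.

```python
from itertools import product

def marker_phenotype(genotype):
    """Dominant marker: 'M' is observed if at least one copy of M is present."""
    return "M" if "M" in genotype else "m"

# All ordered tetraploid genotypes at one locus, e.g. 'Mmmm', 'mMmm', ...
genotypes = ["".join(g) for g in product("Mm", repeat=4)]

# Group the genotypes by the phenotype that would actually be observed.
by_phenotype = {}
for g in genotypes:
    by_phenotype.setdefault(marker_phenotype(g), []).append(g)

for pheno, members in by_phenotype.items():
    print(pheno, len(members), members)
```

Under this encoding, every genotype carrying one to four copies of M falls into the same observed class, which is the missing-information problem the abstract describes.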


Abstract (ctd.)

The statistical method needs to contend with considerable and variable amounts of missing information. Previous methods consider possible segregation patterns for special cases of certain types of flanking markers, and give a LOD score for putative QTL. However, this is ad hoc, and flanking markers may not be fully informative, or even informative at a given locus on a given sub-genome. We are developing a general method based on peeling, for sampling from the joint distribution of a set of fully informative virtual markers.

Peeling and conditional peeling have been implemented for allo-polyploids, and initial testing has started on one diploid wheat dataset.

This year we will be integrating the peeling method with the BIC method for QTL mapping (Ball, Genetics 2001; utilising multiple imputation of missing information) and/or reversible jump MCMC sampling, extending Bayesian multi-locus inference of the genetic architecture to polyploids.


Polyploids - goals

• Extend Bayesian multilocus QTL mapping methods to allo-polyploids.

• QTL mapping: marker-trait association in families or pedigrees.

• Allo-polyploids: multiple sub-genomes retaining their identity, e.g. AABB, AABBCC.

• In diploid plant families, can generally infer segregation pattern and linkage phase.

• In polyploids, up to e.g. 8 alleles in allo-tetraploids could have been inherited at a given locus, so there is missing information.

• Previously, interval mapping applied on an ad hoc basis to pairs of markers, in certain cases.


Polyploids – Example 1

Example 1. Marker on a polyploid chromosome.

A ---------M---------------------------

A ---------m---------------------------

B ---------m---------------------------

B ---------m---------------------------

Marker phenotype: M.


Polyploids – Example 2

Example 2. Counter-example: correlation does not imply causation (linkage).

Example marker on a polyploid chromosome.

A ---------M1--------M2----------m3---------

A ---------m1--------m2----------m3---------

B ---------m1--------M2----------M3---------

B ---------m1--------m2----------m3---------

M1 correlated with M2 on sub-genome A.

M2 correlated with M3 on sub-genome B.

Spurious linkage induced between M1 and M3.


Polyploids – Example 2

Another possible model representing the same data.

A ---------M1-----------M2--------------M3---------

A ---------m1-----------m2--------------m3---------

B ---------m1-----------m2--------------m3---------

B ---------m1-----------m2--------------m3---------

Same expected correlation between M1 and M2, and between M2 and M3, but further apparent distance. M3 on the wrong subgenome.
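The first configuration in Example 2 can be checked numerically with a small sketch. It rests on assumptions that are not in the slides: disomic inheritance (one homologue from each sub-genome pair per gamete), a testcross to a parent carrying only null alleles, and hypothetical recombination fractions `r12` and `r23` shared by both sub-genomes. The code enumerates gametes exactly and computes covariances between the dominant marker phenotypes.

```python
from itertools import product

def homologue_path_probs(r12, r23):
    """Probability that a gamete follows homologue h1, h2, h3 at loci 1, 2, 3."""
    probs = {}
    for h1, h2, h3 in product((0, 1), repeat=3):
        p = 0.5
        p *= r12 if h2 != h1 else 1.0 - r12
        p *= r23 if h3 != h2 else 1.0 - r23
        probs[(h1, h2, h3)] = p
    return probs

# True marks a dominant allele; rows are the two homologues of each sub-genome.
A = [(True, True, False), (False, False, False)]   # M1 M2 m3 / m1 m2 m3
B = [(False, True, True), (False, False, False)]   # m1 M2 M3 / m1 m2 m3

def phenotype_joint(r12, r23):
    """Joint distribution of the three marker phenotypes in testcross progeny."""
    paths = homologue_path_probs(r12, r23)
    joint = {}
    # Sub-genomes A and B segregate independently of each other.
    for (pa, qa), (pb, qb) in product(paths.items(), repeat=2):
        pheno = tuple(A[pa[k]][k] or B[pb[k]][k] for k in range(3))
        joint[pheno] = joint.get(pheno, 0.0) + qa * qb
    return joint

def covariance(joint, i, j):
    ei = sum(q for ph, q in joint.items() if ph[i])
    ej = sum(q for ph, q in joint.items() if ph[j])
    eij = sum(q for ph, q in joint.items() if ph[i] and ph[j])
    return eij - ei * ej

joint = phenotype_joint(r12=0.1, r23=0.1)
print("cov(M1, M2):", covariance(joint, 0, 1))
print("cov(M2, M3):", covariance(joint, 1, 2))
print("cov(M1, M3):", covariance(joint, 0, 2))
```

Under these assumptions M1 and M2 covary, as do M2 and M3, while M1 and M3 do not. A map built by chaining pairwise associations through M2 would therefore place M1 and M3 in the same linkage group even though they lie on different sub-genomes.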


Polyploids QTL mapping previous work

Mostly ad hoc specific cases, probability calculations for specific marker types.

• Consider only single markers (e.g. Doerge and Craig 2000) or pairs of markers (e.g. Cao et al 2005).

• Single locus, i.e. test for a single QTL versus no QTL separately at each locus (e.g. Cao et al extend interval mapping).

• Lack the benefits of the Bayesian multilocus approach.

• Flanking markers may not be informative.


VISG Polyploids Modelling

• Bayesian model selection approach

• BIC method and/or RJMCMC

• Multiple imputation to handle missing information.

• Peeling and conditional peeling along chromosomes to sample from the missing information.


BIC method (Ball; Genetics 2001)

• A Bayesian model selection method

• A non-MCMC, multi-locus QTL mapping method

• Considers multiple models representing alternate QTL genetic architectures according to their probabilities

• QTL architectures represented to within the resolution of the marker map by linear regression on a subset of selected markers.
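The idea of weighting alternative marker subsets by approximate posterior probability can be sketched as follows. This is an illustration only: it uses the standard BIC with equal prior probability for every model, which is not necessarily the exact criterion or prior of Ball (2001), and the toy data and function names are hypothetical.

```python
import math
from itertools import combinations

def residual_ss(columns, y):
    """Residual sum of squares from regressing y on an intercept plus columns."""
    n = len(y)
    X = [[1.0] + [col[i] for col in columns] for i in range(n)]
    p = len(X[0])
    # Normal equations (X'X | X'y), solved by Gauss-Jordan with partial pivoting.
    A = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    for c in range(p):
        piv = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[piv] = A[piv], A[c]
        for r in range(p):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [x - f * z for x, z in zip(A[r], A[c])]
    beta = [A[a][p] / A[a][a] for a in range(p)]
    return sum((y[i] - sum(b * x for b, x in zip(beta, X[i]))) ** 2
               for i in range(n))

def model_probabilities(markers, y):
    """Approximate posterior probability of every marker subset via BIC."""
    n = len(y)
    names = sorted(markers)
    bic = {}
    for k in range(len(names) + 1):
        for subset in combinations(names, k):
            rss = residual_ss([markers[m] for m in subset], y)
            bic[subset] = n * math.log(rss / n) + (k + 1) * math.log(n)
    best = min(bic.values())
    weights = {s: math.exp(-0.5 * (b - best)) for s, b in bic.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def inclusion_probability(probs, name):
    """Posterior probability that a marker appears in the model."""
    return sum(p for subset, p in probs.items() if name in subset)

# Hypothetical toy data: 12 progeny, three 0/1 markers, trait driven by m1 only.
markers = {
    "m1": [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1],
    "m2": [0, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1],
    "m3": [0, 1, 0, 1, 0, 1, 0, 1, 0, 1, 0, 1],
}
noise = [0.3, -0.2, 0.1, -0.4, 0.2, 0.0, -0.1, 0.4, -0.3, 0.2, -0.2, 0.0]
trait = [2.0 * g + e for g, e in zip(markers["m1"], noise)]

probs = model_probabilities(markers, trait)
for name in sorted(markers):
    print(name, round(inclusion_probability(probs, name), 3))
```

Summing model probabilities over all subsets that contain a marker gives a posterior probability for a QTL in the vicinity of that marker, without committing to a single selected model.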


BIC method

• Avoids the selection bias (Miller 1990, Beavis 1994) that arises when the same data are used both to select loci and to estimate the size of their effects: loci whose effects happen to be over-estimated are more likely to be selected.

• Missing values estimated by multiple imputation.


BIC method example:


Bayesian Model selection/BIC method

Note inference of genetic architecture (to within the resolution of the marker map):

• posterior probability for number of QTL in a region

• posterior probability for QTL in the vicinity of a marker

• unbiased estimates of QTL effects: avoid selection bias by considering all models, not just models where the effect is selected or ‘significant’ (unlike e.g. interval mapping).


Multiple imputation

• Jointly analyse multiple copies of the data with independent randomly sampled values for the missing data, adjusting the likelihood appropriately.

• Need to sample from the distribution of missing marker information.

• Will sample from a set of fully informative ‘virtual markers’ using a variant of ‘peeling’.


Peeling (Elston and Stewart 1971)

• Missing information problem for diploid human pedigrees because of small family size.

• Peeling exhaustively evaluates joint or marginal probabilities in a pedigree.

• Feasible for several markers simultaneously.

• A special case of graphical models methods (e.g. Lauritzen and Spiegelhalter 1988; Thomas et al 2000).


Peeling

Process:

• Summation over progeny in parent-progeny triples.

• Remove the progeny from the graph (it is peeled away).

• Repeat for all progeny.

- End result is a function on the remaining node, or a value (likelihood) for the model in terms of any parameters θ that are conditioned on in the above process.

- Reverse the steps (reverse peeling), obtaining a random sample from the distribution.
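The peeling and reverse-peeling steps above can be sketched for a single locus. This is not the project's implementation or its peeling equations; it is a toy under stated assumptions: one allo-tetraploid parent of unknown genotype, disomic inheritance, a testcross to an all-null parent, a dominant marker, and a flat prior over the 16 ordered parental genotypes. All names are hypothetical.

```python
import random
from itertools import product

# Parent genotype: ((a1, a2), (b1, b2)), one homologous pair per sub-genome.
PARENT_GENOTYPES = [((a1, a2), (b1, b2))
                    for a1, a2, b1, b2 in product("Mm", repeat=4)]
# A gamete takes one homologue from each pair; the 4 choices are equally likely.
GAMETE_CHOICES = list(product((0, 1), repeat=2))

def phenotype(parent, choice):
    """Dominant marker phenotype of a testcross progeny with this gamete."""
    alleles = (parent[0][choice[0]], parent[1][choice[1]])
    return "M" if "M" in alleles else "m"

def peel_progeny(parent, observed):
    """Sum over one progeny's unobserved gamete: P(observed | parent genotype)."""
    return sum(0.25 for c in GAMETE_CHOICES if phenotype(parent, c) == observed)

def parent_posterior(progeny_phenotypes):
    """Peel every progeny in turn, leaving a function on the parent node."""
    weights = {}
    for parent in PARENT_GENOTYPES:
        w = 1.0
        for obs in progeny_phenotypes:
            w *= peel_progeny(parent, obs)
        weights[parent] = w
    total = sum(weights.values())
    return {g: w / total for g, w in weights.items() if w > 0}

def reverse_peel(progeny_phenotypes, rng):
    """Sample the parent genotype, then each progeny's inherited homologues."""
    post = parent_posterior(progeny_phenotypes)
    parent = rng.choices(list(post), weights=list(post.values()))[0]
    choices = []
    for obs in progeny_phenotypes:
        compatible = [c for c in GAMETE_CHOICES if phenotype(parent, c) == obs]
        choices.append(rng.choice(compatible))
    return parent, choices

observed = list("MMmMmmMm")
parent, inherited = reverse_peel(observed, random.Random(1))
print(parent, inherited)
```

The sampled homologue indicators play the role of a fully informative virtual marker: for each progeny they record which parental chromosome was inherited in each sub-genome, consistent with the observed phenotypes.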


VISG peeling

• Peeling to sample from the virtual marker genotypes at one locus.

• Conditional peeling. Peel sequentially along the genome, conditional on previously sampled values and recombination rates.

• Complexities (not yet being addressed):

– recombination rates may vary between parents (male or female) and sub-genomes

– recombination distances between markers may also need to be estimated

– recombination rates not known ⇒ need pairwise peelings
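The conditioning step in conditional peeling can be illustrated with a deliberately simplified sketch. It assumes a single homologous pair with known parental phase, known recombination fractions, and a testcross in which the observed marker state equals the inherited allele. It conditions only on the previously sampled locus and the current observation, so it ignores information from markers further along the chromosome. The function name and data are hypothetical.

```python
import random

def conditional_peel(parent_haplotypes, observations, recomb, rng):
    """Sample the inherited homologue (0 or 1) at each locus, left to right."""
    sampled = []
    for k, obs in enumerate(observations):
        weights = []
        for h in (0, 1):
            if k == 0:
                trans = 0.5  # either homologue is equally likely at the first locus
            else:
                r = recomb[k - 1]
                trans = r if h != sampled[-1] else 1.0 - r
            # obs is None when the marker was not scored for this progeny.
            fits = obs is None or parent_haplotypes[h][k] == obs
            weights.append(trans if fits else 0.0)
        # Raises ValueError if the data are incompatible with both homologues.
        sampled.append(rng.choices((0, 1), weights=weights)[0])
    return sampled

# Homologue 0 carries M at loci 0 and 2; locus 1 is uninformative (m on both).
haplotypes = ("MmM", "mmm")
path = conditional_peel(haplotypes, ["M", "m", "M"], [0.1, 0.1], random.Random(0))
print(path)
```

At the uninformative middle locus the sample is driven entirely by the recombination fractions, which is where unequal or unknown recombination rates (the complexities listed above) would enter.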


Hierarchical model for allo-tetraploids


Peeling equations for an allo-tetraploid


Peeling equations for an allo-tetraploid (ctd.)

• Continue, peel back to marginal distribution [mp1], and sample from that distribution.

• Reverse the process, sampling from each value in turn.


Progress

• Three year project work plan written and accepted by the Project Governance Group.

• Peeling equations derived.

• R functions for peeling and conditional peeling implemented by Sammie.

• Testing peeling.


Next steps

• Integrating peeling with the BIC method.

• Testing and application to more datasets.


References

Ball, R.D. 2001: Bayesian methods for quantitative trait loci mapping based on model selection: approximate analysis using the Bayesian Information Criterion. Genetics 159: 1351–1364.

Beavis, W.D., 1994: The power and deceit of QTL experiments: lessons from comparative QTL studies. Proc 49th Ann. Corn and Sorghum Indus. Res. Conf.

Cao, D., Craig, B. A., and Doerge, R.W. 2005: A model selection based interval mapping method for autopolyploids. Genetics 169: 2371–2382. Note: the model selection referred to here is for the parental segregation pattern for a pair of loci, extending interval mapping.


References (ctd.)

De Silva, H. N. and Ball, R.D. 2007: “Linkage disequilibrium mapping concepts”, Chapter 7, In: Association mapping in plants. N. C. Oraguzie et al. editors, Springer Verlag, ISBN 0387358447. (31pp)

Doerge, R.W. and Craig, B. A. 2000: Model selection for quantitative trait locus analysis in polyploids. PNAS 97(14): 7951–7956. Note: the model selection referred to here is for the parental segregation pattern for a single locus.

Elston R.C. and Stewart J. 1971: A general model for the genetic analysis of pedigree data. Human Heredity 21: 523–542.

Miller, A.J. 1990: Subset selection in regression, Monographs on Statistics and Applied Probability 40, Chapman and Hall, London.

Thomas, A., Gutin, A., Abkevich, V., and Bansal, A. 2000: Multilocus linkage analysis by blocked Gibbs sampling. Statistics and Computing 10: 259–269.