Genotyping in Breeding Programs
Melaku Gedil
Presented at Implementation of Crop Improvement Strategy of IITA, September 8-10, 2015
Genetic Variation
• Selection of best parents for crossing
– Germplasm collections and introductions of exotics
– Heterotic grouping
– Gene pools
• Introgression of novel genes from wild relatives
– When cultivars do not have sources
• Induction of variation
– Chemically induced point mutations
– Radiation-induced mutations
Strategies for Crop Breeding
• Recombine genes among genotypes
– Recurrent selection, backcrossing, multi-parent crosses, bi-parental crosses, etc.
– Genetic stocks (populations, fixed lines, hybrids, clones, ...)
• Selection of superior genotypes
– Gene actions (additive vs non-additive)
– Selection intensity and population size
– Efficiency (genetic gain, shorter breeding cycle time)
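The relationship between selection intensity, heritability and cycle time in the last bullet is commonly summarized by the breeder's equation. The sketch below assumes the standard per-year form, gain = i x h2 x sigma_p / L. The function name and all numbers are illustrative and do not come from the presentation.

```python
def genetic_gain_per_year(i, h2, sigma_p, cycle_years):
    """Expected response to selection per year (breeder's equation).

    i           -- selection intensity (standardized selection differential)
    h2          -- narrow-sense heritability (0..1)
    sigma_p     -- phenotypic standard deviation of the trait
    cycle_years -- length of one breeding cycle in years
    """
    if cycle_years <= 0:
        raise ValueError("cycle_years must be positive")
    return i * h2 * sigma_p / cycle_years


# Illustrative values only: same population, cycle shortened from 4 to 2 years.
slow = genetic_gain_per_year(i=1.4, h2=0.3, sigma_p=2.0, cycle_years=4)
fast = genetic_gain_per_year(i=1.4, h2=0.3, sigma_p=2.0, cycle_years=2)
print(f"gain/year with 4-year cycle: {slow:.3f}")
print(f"gain/year with 2-year cycle: {fast:.3f}")
```

Under these assumptions, halving the cycle time doubles the expected gain per year, which is why a shorter breeding cycle is listed alongside genetic gain.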
Marker Assisted Selection (MAS)
What is MAS?
• Molecular marker-assisted selection (MAS) is the use of DNA sequences and/or banding patterns that are associated with a desired trait as a substitute for, or an aid to, phenotypic screening.
• By determining the allele of a DNA marker, plants that carry particular genes or quantitative trait loci (QTLs) can be identified from their genotype rather than their phenotype.
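A minimal sketch of the idea in the second bullet: plants are kept or discarded according to the allele they carry at a marker, not according to a phenotypic score. The plant IDs, allele labels and the function `select_carriers` are hypothetical and only serve as an illustration.

```python
# Hypothetical genotype calls at one marker linked to a target gene.
# "R" is assumed to be the favourable (e.g. resistance) allele.
genotypes = {
    "plant_01": ("R", "R"),
    "plant_02": ("R", "S"),
    "plant_03": ("S", "S"),
    "plant_04": ("S", "R"),
}


def select_carriers(calls, favourable, homozygous_only=False):
    """Return sorted plant IDs that carry the favourable allele at the marker."""
    selected = []
    for plant, alleles in calls.items():
        copies = alleles.count(favourable)
        if copies == 2 or (copies == 1 and not homozygous_only):
            selected.append(plant)
    return sorted(selected)


print(select_carriers(genotypes, "R"))
print(select_carriers(genotypes, "R", homozygous_only=True))
```

The `homozygous_only` switch reflects one practical point: a co-dominant marker can separate heterozygotes from homozygotes, which phenotypic screening for a dominant trait cannot do.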
Advantages of MAS
• Marker-assisted selection may greatly increase the efficiency and effectiveness of breeding
• Simpler compared to phenotypic screening
• Selection may be carried out at seedling stage
• Single plants may be selected with high reliability
• Leads to accelerated line development in breeding programs
Application of MAS
1. Marker-assisted selection of simple (R) and complex traits (QTL)
2. Improvement of recalcitrant traits
3. Introgression through backcrossing
– Minimizes linkage drag
4. Pyramiding of genes
5. Pre-emptive breeding
6. Allele mining in genetic resources
7. Comparative genomics in less studied but related species
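For item 3, it helps to see how slowly the recurrent parent genome is recovered when backcrossing is done without markers. The sketch below uses the textbook expectation 1 - (1/2)^(n+1) for generation BCn, assuming no selection and no linkage. The function name is illustrative.

```python
def expected_recurrent_genome(n_backcrosses):
    """Expected recurrent-parent genome fraction in BCn without marker selection."""
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be >= 0")
    return 1 - 0.5 ** (n_backcrosses + 1)


for n in range(1, 7):
    print(f"BC{n}: {expected_recurrent_genome(n):.4f}")
```

These values are population averages. Individual plants vary around them, and marker-based background selection picks plants above the average in each generation, which is one way MAS can reduce the number of backcross generations and the donor segment carried along with the target gene.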
CONVENTIONAL PLANT BREEDING
[Figure: crossing scheme P1 x P2 (donor and recipient) → F1 → F2, giving large populations consisting of thousands of plants. PHENOTYPIC SELECTION in field trials and glasshouse trials. Photo panels: salinity screening in phytotron; bacterial blight screening; phosphorus deficiency plot.]
Conditions under which MAS is valuable
MARKER-ASSISTED BREEDING
[Figure: crossing scheme P1 x P2 (resistant and susceptible) → F1 → F2, giving large populations consisting of thousands of plants. MARKER-ASSISTED SELECTION (MAS): selection is based on DNA markers.]
Evaluation at the Seedling Stage
Marker-assisted selection allows identification of favorable genotypes at the seedling stage.
What is genotyping?
• Genotyping is the process of determining differences in the genetic make-up (genotype) of individual plant samples by examining their DNA sequence or banding patterns using a variety of molecular biology assays.
– It can involve comparing individuals with each other or with a reference sample.
• An array of techniques is available
– Restriction fragment analysis
– PCR amplification and visualization
– Hybridization
– Sequencing
– Expression profiling: mRNA profile
• A molecular marker (also called a genetic marker) is a fragment of DNA associated with a certain location in the genome, used to 'flag' the position of a particular gene or the inheritance of a particular characteristic.
Application of Marker Technology - I
• Construction of genetic linkage map
• Gene tagging
• QTL analysis
• Diversity analysis (similarity/distance)
• Marker-assisted selection (MAS)
• Map-based cloning
• Transformation/genetic engineering
Application of Marker Technology - II
Other applications
• Variety identification/tracking
• Germplasm conservation
• Comparative mapping
• Genome structure and organization
• Evolutionary studies
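For the diversity analysis (similarity/distance) application, one of the simplest measures is the proportion of markers at which two samples differ. The sketch below is a minimal version of such a mismatch distance. The sample names, the calls and the use of "N" for a missing call are assumptions made for the example.

```python
def mismatch_distance(a, b, missing="N"):
    """Proportion of markers at which two samples differ, ignoring missing calls."""
    if len(a) != len(b):
        raise ValueError("samples must be scored at the same markers")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == missing or y == missing:
            continue  # skip markers not scored in both samples
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no markers scored in both samples")
    return differing / compared


# Hypothetical SNP calls for three lines at six markers.
line_a = "TCTGAA"
line_b = "TCTGAA"
line_c = "CATGAN"
print(mismatch_distance(line_a, line_b))
print(mismatch_distance(line_a, line_c))
```

A matrix of such pairwise distances is the usual input for clustering or for checking whether two samples are the same variety, as in the variety identification/tracking application.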
Key issues in implementation of MAB
1. Availability of genomic resources
• Whole genome sequence; Transcriptome; markers, maps, phylogeny
2. Cost-effective genotyping systems
• Flexible, uniplex SNP assays
• Fixed array, genome-wide assays
3. Multi-environment phenotyping (GxE, epistasis)
• Precision phenotyping (standardized, trait ontology)
• High throughput phenotyping (digital, uav, optics?)
4. Accurate Marker-trait association methods (LD, QTL)
• GWAS (LD); QTL mapping
• Begin with less complex traits, high heritability
• Databases and analysis pipelines
Dr. Melaku Gedil, Molecular Breeder, IITA, Ibadan, Nigeria
Genomics Resources 4 Crop Improvement
[Figure: workflow from whole genome sequence → assembly/annotation → gene/marker discovery → assay/validation → application (MAB, transgenics), with functional genomics (transcriptome, epigenome, etc.) and bioinformatics (in silico analysis & data mining).]
Current status of whole genome sequences of IITA mandate crops
Species | Subspecies/genotype | Family | Genome size (Mbp) | No. of predicted genes | Chrom. no. (2n) | Reference
Maize | Zea mays ssp. mays B73 | Poaceae | 2,300 | 39,656 | 10 | [15]
Soybean | Glycine max, variety Williams | Fabaceae | 1,115 | 46,430 | 20 | [16]
Cowpea | Vigna unguiculata | Fabaceae | 620 | 5,888 GSRs | 22 | [17]
Cassava | Manihot esculenta | Euphorbiaceae | 770 | 30,666 | 18 | [18,19]
Banana | Musa acuminata (ssp. malaccensis) | Musaceae | 523 | 36,542 | 22 | [20]
Yam* | Dioscorea rotundata | Dioscoreaceae | 594 | 21,882 | 20 | [21]
Cacao | Theobroma cacao cv. Matina | Malvaceae | 430 | 28,798 | 20 | [22]
SSR Genotyping Platforms
Cheaper but low throughput – gel electrophoresis
• High resolution agarose or PAGE
• Multiplex based on size
• Tandem loading
Semi-automated capillary electrophoresis
• Multiplexing multiple dye labeled primers or co-loading
• Universal tailing
Key issues in implementation of MB
2. Cost-effective genotyping systems
Declining cost of genotyping and a choice of genotyping platforms
i. First generation: SSR semi-automated fragment analysis (also AFLP, DArT)
ii. Fixed array SNP platforms (multiplex options covering a wide range of samples x SNPs)
a. Microarray: GeneChip by Affymetrix, Agilent; 24 samples x 3k, 96 x 650k, 96 x 384; assays specific to human, plants
b. Bead array: Infinium BeadChips, GoldenGate assay or BeadXpress of Illumina (24 x 3456), e.g. 96 x 1536 SNPs in soy, maize, cassava
c. Flexible (uniplex) SNP genotyping: KASP (LGC Genomics) or TaqMan; any number of samples x any number of SNPs ($0.09 to $0.12/data point)
d. Miniaturized versions such as Array Tape by Douglas Scientific (150k/day); OpenArray system from Life Technologies; Dynamic Array by Fluidigm (96 x 96)
e. Sequenom MassARRAY iPLEX system (MALDI-TOF mass spectrometry)
iii. GBS (genotyping-by-sequencing) and RAD (restriction site associated DNA)
a. Maize ~500k data points, $40/sample (~$9/sample, soon)
b. Cassava ~80K SNP data, 5K quality data
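The cost figures above are easier to compare once they are on the same scale. The sketch below takes the KASP range ($0.09 to $0.12 per data point) and the maize GBS figures ($40 per sample, about 500k data points) from the slide. The project size of 1,000 plants and 10 SNPs is a hypothetical example, and the function names are illustrative.

```python
KASP_COST_PER_POINT = (0.09, 0.12)   # USD per data point, range quoted on the slide
GBS_COST_PER_SAMPLE = 40.0           # USD per sample, maize figure from the slide
GBS_POINTS_PER_SAMPLE = 500_000      # approximate data points per maize sample


def kasp_project_cost(n_samples, n_snps):
    """Return (low, high) total cost in USD for a uniplex KASP project."""
    points = n_samples * n_snps
    low, high = KASP_COST_PER_POINT
    return points * low, points * high


def gbs_project_cost(n_samples):
    """Total GBS cost in USD; it does not depend on how many SNPs are used."""
    return n_samples * GBS_COST_PER_SAMPLE


gbs_per_point = GBS_COST_PER_SAMPLE / GBS_POINTS_PER_SAMPLE
print(f"GBS cost per data point: ${gbs_per_point:.5f}")

# Hypothetical project: 1,000 plants screened at 10 trait-linked SNPs.
low, high = kasp_project_cost(1000, 10)
print(f"KASP: ${low:,.0f} to ${high:,.0f}")
print(f"GBS:  ${gbs_project_cost(1000):,.0f}")
```

The comparison illustrates the trade-off behind the platform choice: GBS is far cheaper per data point, while a uniplex assay can be cheaper per project when only a handful of markers are needed on many plants.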
Source: Gut. 2001. Automation in Genotyping of Single Nucleotide Polymorphisms. Human Mutation 17:475-492
First Generation Detection/genotyping methods
• Allele-specific hybridization (ASH/ASO) – dot blot
• PCR-RFLP (CAPS)
• Allele-Specific PCR (AS-PCR)
• SSCP/DGGE/DHPLC (conformation based)
• Primer extension/single base extension (SBE)/minisequencing
• Oligonucleotide ligation assay (OLA)
• TaqMan assay
• Invasive cleavage (FRET based assay)
• Heteroduplex formation based (TILLING, HPLC-TMHA)
• Genotyping by sequencing (GBS)
• Microarray/Genechips
• MALDI-TOF
KASP and GBS protocols/assays developed/optimized
Crop GBS KASP Map Diversity GS
Maize ✔ ✔ WIP/WIP ✔/✔ ✔/*
Soybean ✔* ✔ ✔
Cowpea ✔ ✔ WIP WIP/WIP
Cassava ✔ ✔ ✔/✔ ✔/✔ WIP/na
Banana ✔ ✔
Yam* ✔ ✔ ✔/WIP ✔/✔
Cacao ? ✔ ?/? x
Assignment
• What is a SNP?
• Why is it advantageous to use SNP markers these days?
• What are the different ways of SNP discovery?
• What are the options for SNP genotyping?
• What are the applications of SNP markers?
Discovery methods
In vitro and in silico
• Direct sequencing
• Sanger sequencing (partial sequences/fragments)
• Next-generation sequencing
• Whole genome sequencing
• RRS sequencing
• mRNA sequencing (transcriptomics)
• BAC clone sequencing
• Mining from EST databases
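Whatever the sequence source, in silico discovery comes down to aligning sequences from several accessions and looking for columns with more than one base. The sketch below shows that step on a toy alignment. The accession names and sequences are invented, and a real pipeline would also filter on read depth and base quality, which is omitted here.

```python
def find_snps(aligned_seqs):
    """Return {1-based position: sorted alleles} for polymorphic alignment columns.

    Gaps ('-') and unknown bases ('N') are ignored when counting alleles.
    """
    lengths = {len(s) for s in aligned_seqs.values()}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to the same length")
    snps = {}
    for pos, column in enumerate(zip(*aligned_seqs.values()), start=1):
        alleles = {base.upper() for base in column} - {"-", "N"}
        if len(alleles) > 1:
            snps[pos] = sorted(alleles)
    return snps


# Hypothetical aligned fragments from three accessions.
alignment = {
    "acc1": "ATGCTAGC",
    "acc2": "ATGTTAGC",
    "acc3": "ATGCTAGA",
}
print(find_snps(alignment))
```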
SNP Typing
Key issues in implementation of MB
Thomson. 2014. Plant Breeding and Biotechnology 2:195-212
Example of SNPs: yam MatK chloroplast gene
Multiple sequence alignment featuring C/T and A/C SNP polymorphism
Haplotype of a few accessions with polymorphisms obtained at different positions of the gene
Position                        549   572   744   783   819   870   999   1218
Consensus                       Y     M     Y     R     W     M     A     W
cassava4.1_008121m              T     C     T     G     A     A     A     A
I000107_WPSY2R_2013-10-15_F08   T     C     T     G     A     A     A     A
I000211_WPSY2R_2013-10-15_C08   T     C     T     A/G   A     A     A     A
I000338_WPSY2R_2013-10-15_A11   T     C     T     G     A     A     A     A
I000351_WPSY2R_2013-10-15_B11   T     C     T     G     A     A     A     A
I000388_WPSY2R_2013-10-15_D08   T     C     T     G     A     A     A
I030006B_WPSY2R_2013-10-15_E05  T     C     T     G     A     A     A
I030264_WPSY2R_2013-10-15_B06   T     C     T     G     A     A     A     A
I051652_YPSY2R_2013-10-15_B04   C     A     T     G     A     A/C   A     T
I051653_YPSY2R_2013-10-15_A05   C     A     T     G     A     A/C   A     T
I051654_YPSY2R_2013-10-15_G03   C     A     T     G     A     A/C   A     T
I070481_YPSY2R_2013-10-15_G02   C     A     T     G     A     A/C   A     T
I070498_YPSY2R_2013-10-15_F02   C     A     T     G     A     A/C   A     T
I070539_YPSY2R_2013-10-15_F01   C     A     T     G     A     C     A     T
I070576_YPSY2R_2013-10-15_D01   C     A     T     G     A     A/C   A     T
I070808_YPSY2R_2013-10-15_D03   C     A     T     G     A     A/C   A     T
I070874_YPSY2R_2013-10-15_B03   C     A     T     G     A     A/C   A     T
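The consensus row uses IUPAC ambiguity codes, where one letter stands for the two bases seen at a polymorphic site. The sketch below decodes that row. The positions and the consensus string are copied from the table, while the dictionary covers only the single-base and two-base IUPAC codes.

```python
# IUPAC nucleotide codes for single bases and two-base ambiguities.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "A/G", "Y": "C/T", "M": "A/C", "K": "G/T",
    "S": "C/G", "W": "A/T",
}

positions = [549, 572, 744, 783, 819, 870, 999, 1218]
consensus = "YMYRWMAW"  # consensus row from the table

# Map each alignment position to the alleles its consensus code represents.
decoded = {pos: IUPAC[code] for pos, code in zip(positions, consensus)}

for pos in positions:
    print(pos, decoded[pos])
```

Read this way, the Y at position 549 is the C/T SNP and the M at position 572 is the A/C SNP that separate the two groups of accessions in the table.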
Key issues in implementation of MAB
3. Precision and HTP phenotyping lead to accelerated genetic gain by increasing heritability, mainly through reducing environmental variation. They also help to dissect the genetics of quantitative traits.
• Rate-limiting step in breeding pipelines
• Accuracy, for reliable marker-trait association
• High throughput, to match genotypic data
• Multi-environment, to detect GxE interaction
• Metadata: weather, georeference
• Robust and standardized screening protocols
• Establishment of phenotyping hubs and hotspots for abiotic and biotic stresses is a key element
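The claim that precise phenotyping raises heritability can be made concrete with the usual entry-mean formula for multi-environment trials, H2 = var_g / (var_g + var_ge / e + var_e / (e x r)). The sketch below assumes that standard form. The variance components and trial sizes are invented for illustration.

```python
def entry_mean_heritability(var_g, var_ge, var_e, n_env, n_rep):
    """Broad-sense heritability on an entry-mean basis.

    var_g  -- genotypic variance
    var_ge -- genotype-by-environment interaction variance
    var_e  -- residual (error) variance
    n_env  -- number of environments
    n_rep  -- number of replicates per environment
    """
    if n_env < 1 or n_rep < 1:
        raise ValueError("n_env and n_rep must be at least 1")
    return var_g / (var_g + var_ge / n_env + var_e / (n_env * n_rep))


# Illustrative values: the same trial with noisy vs more precise phenotyping.
noisy = entry_mean_heritability(var_g=1.0, var_ge=0.5, var_e=4.0, n_env=2, n_rep=2)
precise = entry_mean_heritability(var_g=1.0, var_ge=0.5, var_e=2.0, n_env=2, n_rep=2)
print(f"noisy phenotyping:   H2 = {noisy:.3f}")
print(f"precise phenotyping: H2 = {precise:.3f}")
```

In this example, halving the residual variance raises heritability without adding plots. Adding environments or replicates acts on the same denominator, which is why multi-environment testing appears in the list above.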
Approaches to increase throughput and quality of phenotyping
• Automated and mechanized field experiment management
• Digital data capture
• Improved sample tracking methods using hand-held electronic devices
• Deployment of ground-based and aerial advanced technologies in imaging
• Remote sensing – UAV
• Integrated databases – easy access, shared, visualization
• Analysis pipelines
Key issues in implementation of MB
4. Accurate marker-trait association methods (LD, QTL)
Take-home message:
i. Reliable predictive markers associated with traits of interest
ii. Begin with less complex traits
iii. Analysis pipelines and data management tools
Source: Varshney et al. 2014. PLoS Biology, v. 12, no. 6.
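At its simplest, a marker-trait association is a regression of the trait on the number of copies of an allele each plant carries. The sketch below shows that single-marker test on made-up data. It is not a full GWAS or QTL analysis, since it ignores population structure, kinship and multiple testing, and the function name and data are illustrative.

```python
def marker_regression(dosages, phenotypes):
    """Least-squares regression of phenotype on allele dosage (0, 1 or 2).

    Returns (slope, r_squared): the estimated effect per copy of the allele
    and the share of phenotypic variance the marker explains.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need matching lists with at least 3 plants")
    mean_x = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        raise ValueError("marker or trait shows no variation")
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, r_squared


# Hypothetical data: allele dosage and a trait score for eight plants.
dosage = [0, 0, 1, 1, 1, 2, 2, 2]
score = [3.1, 2.9, 4.0, 4.2, 3.8, 5.1, 4.9, 5.0]
slope, r2 = marker_regression(dosage, score)
print(f"effect per allele copy: {slope:.2f}, R^2: {r2:.2f}")
```

A marker that explains a large share of the variance for a simple, highly heritable trait is the kind of reliable predictive marker the first take-home point refers to.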
Target Traits for Molecular Breeding
Cassava
• CMD
• CBSD
• Carotenoids
• Starch
• Acyanogenesis
• PPD
Maize
• MSV
• PVAC
• Aflatoxin (Tang et al. 2015)
• Stem borer (Samayoa et al. 2015)
• MLN (WIP)
• Striga – QTL (WIP)
Target Traits for Molecular Breeding
Cowpea
• Striga
• Drought
• Aphid tolerance (Huynh et al. 2015)
Soybean
• SCN, soybean cyst nematode
• Rust
• Plant height, lodging, maturity (Lee et al. 2015)
Yam
• Anthracnose (WIP)
• YMV (WIP)
• Flowering or floral (WIP)
Genotyping, data management and analytical services at our disposal
• CGIAR shared high-throughput, low-cost genotyping facility at ICRISAT
• Integrated Breeding Platform (IBP) – Breeding Management System (BMS)
• The Global Open Breeding Informatics Initiative (GOBII)
• Integrated Genotyping Service and Support unit at BecA
• African Orphan Crop Consortium (AOCC)
Databases and Breeders’ tool kits
Constraints for Genotyping
[Figure labels: sample tracking; lyophilization; DNA extraction; DNA quantification; turn-around time; data analysis pipeline; barcoding; tissue sampling; fluidics; fluidics & plate analyzer; library – in-house; BMS, informatics.]
Thank you