NEXT GENERATION SEQUENCING TECHNOLOGIES AND THEIR APPLICATION IN ORNAMENTAL CROPS

K. RAVINDRA KUMAR, Ph.D. 1st Year, R.No. 10461

Division of Floriculture and Landscaping

Indian Agricultural Research Institute, New Delhi

Post on 14-Aug-2015


Evolution of DNA Revolution

DNA Sequencing

Refers to determining the order of nucleotides (G, A, T and C) in a stretch of DNA.

Useful in biotechnology research and discovery, diagnostics, and forensics.

Genome Sequencing

[Figure: genome sequencing workflow. Genome → short fragments of DNA → short DNA sequences (e.g. ACGTGGTAA, CGTATACAC, TAGGCCATA, GTAATGGCG, CACCCTTAG, TGGCGTATA, CATA…) → sequenced genome (e.g. ACGTGGTAATGGCGTATACACCCTTAGGCCATA)]
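The idea behind the figure, that short overlapping reads are merged back into one longer sequence, can be sketched in a few lines of Python. This is only a toy illustration: it assumes the reads are error-free and already listed in genomic order, which real assemblers cannot assume, and the function names `overlap` and `merge_reads` are made up for this sketch. The reads are the six full-length fragments shown on the slide.

```python
def overlap(left: str, right: str) -> int:
    """Length of the longest suffix of `left` that is also a prefix of `right`."""
    for k in range(min(len(left), len(right)), 0, -1):
        if left.endswith(right[:k]):
            return k
    return 0


def merge_reads(ordered_reads: list[str]) -> str:
    """Merge reads that are assumed to be in genomic order (toy assumption)."""
    assembled = ordered_reads[0]
    for read in ordered_reads[1:]:
        k = overlap(assembled, read)
        assembled += read[k:]  # append only the part not already covered
    return assembled


# Fragments from the slide, reordered by position for this sketch.
reads = ["ACGTGGTAA", "GTAATGGCG", "TGGCGTATA",
         "CGTATACAC", "CACCCTTAG", "TAGGCCATA"]
genome = merge_reads(reads)
print(genome)
```

With these reads the merge reproduces the 33-base sequence shown on the slide. Real data additionally require finding the read order, handling sequencing errors and resolving repeats.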

Conventional Genome Sequencing Methods

Sanger Sequencing

Slab gel based sequencer: ABI PRISM 377

Capillary based sequencer

Array view

Next Generation DNA Sequencing

Very early in applications

Allelic discrimination by sequencing

Thousands of individual mini-sequencing reactions on a single plate

Get millions of base pairs of sequence per run

Sequencing of genes of interest possible

Pro:

Comprehensive analysis of each gene in full

Works for SNP discovery

Less time required for sequencing

Con:

Expensive instruments

Expensive reagents

Low sample throughput

Early phase of technology development

Instruments not readily available

Next Generation Sequencing Platforms

Sequencing By Synthesis – Roche/454/GS FLX+

Pyrosequencing

454 Life Sciences

Sequencing By Synthesis – Illumina/Solexa/HiSeq 2000

Solexa - Cambridge scientists Shankar Balasubramanian and David Klenerman - 2005 - Sequencing of the whole bacteriophage phiX-174 genome

Sequencing By Synthesis – Ion Torrent

 DNA Electronics Ltd. February, 2010

Sequencing By Ligation – Life/AB SOLiD 5500 series

Complementary strand elongation: DNA ligase

5 reading frames, each position is read twice

Life Technologies, 2010
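The note that each position is read twice reflects the two-base ("colour space") encoding used by SOLiD: every colour call reports on a pair of adjacent bases, so each base contributes to two consecutive calls. Below is a minimal sketch of that encoding, assuming the standard four-colour scheme in which the colour is the XOR of two 2-bit base codes. The helper names `encode` and `decode` are hypothetical and not part of any vendor software.

```python
# 2-bit base codes; the colour of a dinucleotide is the XOR of its two codes.
BASE_CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
CODE_BASE = {code: base for base, code in BASE_CODE.items()}


def encode(seq: str) -> list[int]:
    """Translate a base sequence into colour calls, one per adjacent base pair."""
    return [BASE_CODE[a] ^ BASE_CODE[b] for a, b in zip(seq, seq[1:])]


def decode(first_base: str, colours: list[int]) -> str:
    """Recover the bases from the known first base and the colour calls."""
    bases = [first_base]
    for colour in colours:
        bases.append(CODE_BASE[BASE_CODE[bases[-1]] ^ colour])
    return "".join(bases)


colours = encode("ATGGCA")
print(colours, decode("A", colours))
```

Because every base is covered by two colour calls, a true single-base change alters two adjacent colours, whereas an isolated colour change is more likely a measurement error.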

Single Molecule Sequencing: HeliScope/Helicos


Pacific Biosciences, Life/Visigen, LI-COR Biosciences

Sequencing the Human Genome

[Figure: log10(price) plotted against year, 2000 to 2015]

2001: Human Genome Project, 2.7G$, 11 years

2001: Celera, 100M$, 3 years

2007: 454, 1M$, 3 months

2008: ABI SOLiD, 60K$, 2 weeks

2009: Illumina, Helicos, 40-50K$

2010: 5K$, a few days

2015: 1000$, <24 hrs?
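The plot uses a log10 price axis because the prices span several orders of magnitude. The short sketch below recomputes that axis value for the price points quoted on the slide. The 2009 range and the 2015 projection are left out because they are not single confirmed figures, and the label "unspecified platform" is a placeholder because the slide names no platform for 2010.

```python
import math

# (year, project or platform, approximate price in US dollars) from the slide
costs = [
    (2001, "Human Genome Project", 2.7e9),
    (2001, "Celera", 100e6),
    (2007, "454", 1e6),
    (2008, "ABI SOLiD", 60e3),
    (2010, "unspecified platform", 5e3),
]

for year, label, price in costs:
    print(f"{year}  {label:<22} log10(price) = {math.log10(price):.2f}")

# Fold reduction between the first and the last price point in the list
fold_drop = costs[0][2] / costs[-1][2]
print(f"fold reduction 2001 -> 2010: {fold_drop:,.0f}")
```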

Genomic research studies using next-generation sequencing technology in ornamentals

Source: Masafumi Yagi, 2015

Applications

De novo sequencing of genome.

Resequencing of genome.

Whole genome analysis.

Transcriptome analysis.

Marker development and association studies.

Marker assisted selection.

Genetic diversity

Maintenance of large gene bank collections.

Conserved syntenic segment (CSS)

Marker assisted breeding

mRNA sequencing

Transcriptome analysis

miRNA/small RNA sequencing

Genetic diversity

1750 gene banks worldwide conserve 7 million accessions of advanced cultivars, landraces, and wild species.

Large-scale characterization, use and management are possible through NGS technologies.

Legal constraints on the ownership of genetic resources.

Correct identification of accessions, tracking of seed lots, identification of varieties, identification and elimination of duplicate accessions, justification for adding new accessions to the collection, and core sampling are possible through NGS technologies.
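One of the gene bank tasks listed above, spotting duplicate accessions, can be illustrated with a small comparison of SNP profiles. This is a hedged sketch rather than an established gene bank pipeline: the accession IDs, the genotype strings and the identity threshold are all invented for illustration.

```python
from itertools import combinations

# Hypothetical accessions genotyped at the same eight SNP positions.
genotypes = {
    "ACC001": "AGTCCGTA",
    "ACC002": "AGTCCGTA",  # identical profile to ACC001
    "ACC003": "AGTTCGAA",
    "ACC004": "CGTCAGTT",
}


def identity(a: str, b: str) -> float:
    """Fraction of SNP positions at which two profiles carry the same call."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def candidate_duplicates(profiles: dict[str, str], threshold: float = 1.0):
    """Return accession pairs whose identity reaches the (assumed) threshold."""
    return [(p, q) for p, q in combinations(sorted(profiles), 2)
            if identity(profiles[p], profiles[q]) >= threshold]


duplicates = candidate_duplicates(genotypes)
print(duplicates)
```

In practice the threshold would sit slightly below 1.0 to tolerate genotyping errors, and flagged pairs would still be checked against passport data before any accession is removed.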

Case study-1

Objectives:

To generate a high-quality whole-genome sequence of carnation.

To understand the genetic systems of carnation and to perform a structural analysis of its whole genome.

Introduction

Carnation (Dianthus caryophyllus L.) is one of the major flower crops worldwide.

More than 300 Dianthus species have been recorded and distributed in Europe and Asia.

Most of the carnation cultivars are diploid, with a chromosome number of 2n=2x=30. The estimated genome size of carnation is 622 Mb.

Many new carnations have been bred for attractive characteristics.

The plant pigments of species belonging to the families of Caryophyllales are betalains. Carnation is the exception, having anthocyanins and chalcone derivatives instead of betalains, which makes it attractive material for studying the evolution of genetic systems for pigment synthesis.

Carnation flowers are highly sensitive to ethylene. Vase life of the flower is a polygenic trait controlled by several genes involved in ethylene production and ethylene sensitivity.

Genetic linkage maps of the carnation genome have been constructed and used to identify QTL responsible for resistance to carnation bacterial wilt.

Genomes – Total Size

Carnation: 622 Mb

R. hybrida: 1.1 Gb

Petunia: 1.6 Gb

Chrysanthemum: 9.4 Gb

Tulip: 26 Gb

Lilium: 36 Gb

Yagi M et al., 2014

Materials and Methods

Plant materials: Francesco – Red Mediterranean standard-type cultivar

Karen Rouge – Cultivar with bacterial wilt resistance derived from D. capitatus ssp. andrzejowskianus.

Construction of BAC libraries and BAC DNA sequencing:

BAC libraries were constructed from nuclear DNA prepared from young leaves. The DNA was partially digested with HindIII and size-selected, and 100-180 kb DNA was ligated to the BAC vector pIndigoBAC5 and introduced into E. coli DH10B cells by electroporation.

For shotgun sequencing of BAC clones, BAC DNAs were barcoded with a GS Titanium Rapid Library MID Adaptors Kit and pooled for sequencing on the GS Titanium platform (Roche Diagnostics).

Whole-genome shotgun sequencing was performed using both HiSeq 1000 (Illumina) and GS FLX+ (Roche).

Insert size in HiSeq 1000: PE insert size 500 bp and OF insert size 180 bp

Insert size in GS FLX+: 4 kb

Strategy for sequencing and data assembly

A non-redundant cDNA data set was developed by removing redundant cDNA sequences with the CD-HIT tool.

Repeat sequences, including transposable elements, were detected with RepeatMasker and TransposonPSI.

Genes for tRNAs were assigned using the tRNAscan-SE programme.

The rRNA genes were identified based on sequence similarity with A. thaliana.

Genes for small nucleolar RNA (snoRNA) were predicted using snoscan.

miRNA genes were searched against a miRBase library (MapMi programme).

Protein-encoding genes were identified with the PASA and Augustus programmes, based on cDNA alignment and gene prediction.
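The first step in the list above, building a non-redundant cDNA set, can be pictured as greedy clustering: sequences are visited longest first and kept only if no already-kept representative covers them. The sketch below is a simplified stand-in and not CD-HIT itself. It uses exact containment where CD-HIT applies a sequence identity threshold, and the cDNA names and sequences are hypothetical.

```python
def remove_redundant(seqs: dict[str, str]) -> dict[str, str]:
    """Keep the longest sequences; drop any that is contained in a kept one."""
    representatives: dict[str, str] = {}
    # Longest first, so fragments are compared against full-length entries.
    for name, seq in sorted(seqs.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        if not any(seq in rep for rep in representatives.values()):
            representatives[name] = seq
    return representatives


cdna = {  # hypothetical cDNA names and sequences
    "cdna_1": "ATGGCTTCAGGTCCATTGAA",
    "cdna_2": "GCTTCAGGTCC",           # fragment of cdna_1
    "cdna_3": "ATGGCTTCAGGTCCATTGAA",  # exact duplicate of cdna_1
    "cdna_4": "ATGCCGTTAGACTGA",
}
non_redundant = remove_redundant(cdna)
print(list(non_redundant))
```

Here the fragment and the exact duplicate are absorbed by `cdna_1`, leaving two representatives. A real run would tolerate mismatches and would be done with the CD-HIT program rather than hand-written code.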

Comparison of metabolic pathways:

Beta vulgaris, A. thaliana and Oryza sativa were chosen for comparison.

For B. vulgaris, EST sequences were obtained from dbEST of the NCBI database, as this species has a large number of registered genes within the order.

Results

Sequencing the carnation genome:

In the HiSeq 1000 system, a total of 1277.4 M, 1526.5 M, 442.6 M and 475.3 M reads, corresponding to 127.7, 152.6, 44.3 and 47.5 Gb of sequence data, were collected from the PE, OF, 3 kb MP and 5 kb MP libraries, respectively.

The carnation genome size was 622 Mb (670 Mb). The total redundancy of the obtained sequence data (376.6 Gb) was equivalent to 604-fold.

The total length of the resulting genomic assemblies was 568.9 Mb, equivalent to 91% of the estimated genome size, containing 69 Mb of gaps.

96% of the core genes were covered.
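The summary figures above follow from simple ratios, which the sketch below recomputes from the rounded values on the slide. Variable names are chosen for this example only. With these rounded inputs the redundancy comes out at roughly 605-fold, close to the 604-fold quoted; the small difference presumably reflects rounding of the inputs.

```python
GENOME_SIZE_MB = 622.0  # estimated carnation genome size
TOTAL_DATA_GB = 376.6   # all sequence data collected
ASSEMBLY_MB = 568.9     # total length of the genomic assemblies
GAPS_MB = 69.0          # gaps contained in the assemblies

redundancy = TOTAL_DATA_GB * 1000 / GENOME_SIZE_MB  # fold coverage of the genome
assembled_fraction = ASSEMBLY_MB / GENOME_SIZE_MB   # share of the genome assembled
gap_fraction = GAPS_MB / ASSEMBLY_MB                # share of the assembly that is gap

print(f"redundancy:         {redundancy:.1f}-fold")
print(f"assembled fraction: {assembled_fraction:.1%}")
print(f"gap fraction:       {gap_fraction:.1%}")
```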

Correlation of the genomic sequences with a genetic linkage map:

A genetic linkage map was developed in carnation with 412 SSR loci over a total length of 969.6 cM.

All primer sequences and flanking regions were successfully mapped to the assembled genome sequences.

Single corresponding scaffolds could be identified for 378 (91.7%) of the 412 SSR loci.

Mapping populations: 85-11 x Pretty Favvare (85p population); Carnation nou No.1 x Pretty Favvare (NP population)

The genes for enzymes involved in anthocyanin biosynthesis were identified (Tic 104 TE).

13 genes for rRNAs and 1050 intact genes for tRNAs were identified.

56137 protein-encoding genes were identified.

Of the 3 enzymes (DOPA, DOD, CYP76AD) involved in the synthesis of betalains, one copy of DOD (Dca8668) was found in the carnation genome (conserved region).

Phenylpropanoid pathway

Enzymes such as GSA, HEMB, HEMC, HEME, CHLD, CHLM, CRD, PORA and DVR are responsible for chlorophyll synthesis in carnation. By contrast, all of the enzymes are likely to be encoded by a single gene, i.e. STAY-GREEN (SGR).

Nucleotide-binding site-leucine-rich repeat (NBS-LRR) genes were identified, and 217 NBS-containing potential resistance (R) genes were assigned.

The 217 R genes were assigned to 125 scaffolds, of which 87 contain single NBS genes, while the other scaffolds contain multiple NBS genes.

With respect to ethylene biosynthesis, 3 ACC synthase genes (DcACS1, 2 and 3) and one ACC oxidase gene (DcACO1) were identified.

Genes involved in carbohydrate metabolism:

Pinitol, one of the rare sugars, is responsible for flower opening, as it acts as a substrate for respiration and cell wall synthesis in carnation. This sugar is also responsible for salinity tolerance.

Gene Dca24344 was identified as encoding myo-inositol methyltransferase (IMT), which catalyzes the conversion of myo-inositol to pinitol.

Genes related to floral scent:

Methyl benzoate, a major scent component of modern carnation cultivars, is derived from the methylation of benzoic acid.

A similarity search against the carnation genome sequences detected 11 genes in the SABATH family (DcSABATH1-11), which are candidate genes for benzoic acid methyltransferase in carnation.

Case study 2

Objectives:

To obtain a high-quality rose transcriptome and identify novel genes responsible for traits of interest.

For rose to become a model plant for woody perennials, genetic and genomic tools have to be developed.

Introduction:

Among cut flowers, rose is the most economically important crop, with a 30% market share.

Rose is also an important source of perfume and natural oils.

Within the Rosaceae family (apple, peach, strawberry), rose can become a model for woody ornamental and fruit crops.

Small genome size (approx. 560 Mbp).

It can be genetically transformed (Debener and Hibrand-Saint Oyant, 2009).

Among woody plants, rose has the shortest life cycle: about one year from seed to flower.

Rose is an ideal model for:

Recurrent blooming (Iwata et al., 2012)

Flower morphogenesis (Dubois et al., 2010)

Scent biosynthesis (Scalliet et al., 2008)

(Scent biosynthesis pathways are unique in rose and have not yet been identified in other model species.)

Rose Genome Sequencing - Challenges

The ploidy level varies within the genus from diploid to decaploid.

The majority of cultivated roses are diploid or tetraploid.

Rose is highly heterozygous, with heterozygosity varying from 36 to 87% (Soules, 2009).

Materials and Methods

Selection of genotype: R. chinensis var. spontanea x R. odorata var. gigantea

R. chinensis ‘Old Blush’

Historical genotype, introduced during the year 1760.

Contributed to the introduction of important ornamental traits like continuous flowering and tea scent.

Different genomic resources are available for this genotype:

BAC library (Hess et al., 2007)

EST (Dubois et al., 2012)

F1 progeny (Byrne et al., 2007)

Genetic transformation protocol (Vergne 2010)

Rose Genomic Tools:

EST and microarray studies: mostly a transcriptomic approach. 5000 unigenes out of 30,000 genes were identified. ESTs have been obtained from floral tissues during the floral transition, development and scent production.

Using microarrays, gene expression was compared during petal development and between perfume and non-perfume cultivars.

Genes identified for flower induction: APETALA1 and SUPPRESSOR OF OVEREXPRESSION OF CONSTANS1.

Genes involved in hormone signaling are also regulated, suggesting that ethylene and auxin may be involved in floral induction.

RNA sequencing

A French consortium combined 454 and Illumina sequencing technologies to identify new genes from R. chinensis ‘Old Blush’.

Using OrthoMCL, 14000 protein families were identified.

Among these, 50% are common between rose, strawberry, Prunus and Arabidopsis.

3500 proteins are specific to rose.

The contigs were compared with already known Rosaceae genome sequences such as apple, peach and strawberry.

Unigene discovery in Rose using NGS technologies

Discovery of Micro RNA

miRNAs are short (20-24 nt) non-protein-coding RNAs that play important roles in regulating plant growth and development.

miRNAs regulate the expression of target mRNAs post-transcriptionally through cleavage of the targeted mRNA.

Using Illumina, miRNA libraries were prepared from flower tissues or from petals treated with ethylene.

The reads were compared with known miRNAs for identification.
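The two filtering ideas in this section, the 20-24 nt size window and the comparison with known miRNAs, can be sketched as follows. The reference names and all sequences are illustrative placeholders, neither taken from the slides nor checked against miRBase, and the function `classify` is a hypothetical helper. Real pipelines also trim adapters and allow mismatches.

```python
from collections import Counter

# Illustrative reference set standing in for a database of known miRNAs.
known_mirnas = {
    "known_miR_a": "UGACAGAAGAGAGUGAGCAC",
    "known_miR_b": "AGAAUCUUGAUGAUGCUGCAU",
}

# Illustrative adapter-trimmed small RNA reads.
reads = [
    "UGACAGAAGAGAGUGAGCAC",
    "UGACAGAAGAGAGUGAGCAC",
    "AGAAUCUUGAUGAUGCUGCAU",
    "GGCUAGCUAGGAUCCGAUAGCUAGCUAGGA",  # 30 nt, outside the size window
    "UUGGACUGAAGGGAGCUCCC",            # 20 nt, no match in the reference set
    "ACGU",                            # too short
]


def classify(reads, known, min_len=20, max_len=24):
    """Count reads matching known miRNAs; list size-selected reads that do not."""
    sized = [r for r in reads if min_len <= len(r) <= max_len]
    by_seq = {seq: name for name, seq in known.items()}
    counts = Counter(sized)
    known_hits = {by_seq[s]: n for s, n in counts.items() if s in by_seq}
    unmatched = [s for s in counts if s not in by_seq]
    return known_hits, unmatched


known_hits, unmatched = classify(reads, known_mirnas)
print(known_hits, unmatched)
```

Reads in the size window that match nothing in the reference set are the ones that would be examined further as candidate novel miRNAs.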

Solving the high degree of heterozygosity in Rose

High genetic density map

Production of haploids

Old Blush x R.wichuriana

300 F1 progeny

F2 population

Segregating for traits of interest such as:

Mode of flowering (continuous vs once)

Type of flower (single vs double)

Flower colour (pink vs white)

Architecture (bushy vs ground cover)

Susceptibility to powdery mildew (PM) or black spot

SSR markers help anchor this new genetic map to previous maps as the integrated consensus map

SNP (68,893) genotyping

INRA (Angers, France)

Angers and Lyon, French groups

Haploid material from ‘Old Blush’ is under development.

Rose Genome Sequence Initiative

Presently ‘Old Blush’ is being sequenced at the Genoscope (Evry, France).

For rose annotation, synteny between rose and strawberry, for which a genome sequence is available (http://www.rosaceae.org), can be used.

Almost all rose genetic markers of LG1 are located on strawberry chromosome 7. However, a few rearrangements exist between the two genomes.

Regarding microsynteny, the Rdr1 locus corresponds to a cluster of TIR-NBS-LRR genes. This cluster is also conserved between rose and strawberry.

Conclusion

NGS technologies are paving the way to a new era of scientific discovery.

As genome sequencing becomes easier, more accessible, and more cost effective, genomics will become an integral part of every branch of the life sciences.

Genome sequencing will enable whole genome analysis, transcriptome analysis, studies of genetic diversity and genome evolution, marker development and marker assisted breeding.

With marker development and MAS, adult traits can be identified at the seedling stage, greatly accelerating the process of plant selection.

Collaborative research across different areas of expertise is essential for handling NGS technologies, especially in ornamental crops, as applying the technologies and operating specialized equipment are very difficult for individual researchers and institutions.