high throughput dna sequencing and the deployment of...

45
WHAT ARE THE MOLECULAR BASES OF SPECIFICITY BETWEEN HOST AND PATHOGEN? WHAT IS THE GENETIC BASIS FOR VARIATION IN SUCH SPECIFICITY? HOW CAN WE ACHIEVE DURABLE RESISTANCE? WHAT NEW OPPORTUNITIES ARE PROVIDED BY HIGH-THROUGHPUT DNA SEQUENCING AND MARKER TECHNOLOGIES? LONG-STANDING QUESTIONS IN PLANT-PATHOGEN INTERACTIONS: High Throughput DNA Sequencing and the Deployment of Resistance Genes Richard Michelmore The Genome Center & Dept. Plant Sciences University of California, Davis http://michelmorelab.ucdavis.edu

Upload: others

Post on 19-Aug-2020

10 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

WHAT ARE THE MOLECULAR BASES OF

SPECIFICITY BETWEEN HOST AND PATHOGEN? WHAT IS THE GENETIC BASIS FOR VARIATION

IN SUCH SPECIFICITY? HOW CAN WE ACHIEVE DURABLE RESISTANCE? WHAT NEW OPPORTUNITIES ARE PROVIDED BY

HIGH-THROUGHPUT DNA SEQUENCING AND MARKER TECHNOLOGIES?

LONG-STANDING QUESTIONS IN

PLANT-PATHOGEN INTERACTIONS:

High Throughput DNA Sequencing and the Deployment of Resistance Genes

Richard Michelmore The Genome Center & Dept. Plant Sciences

University of California, Davis

http://michelmorelab.ucdavis.edu

Page 2: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Lettuce genetics, genomics & breeding

The Compositae Genome Project

Molecular basis of disease resistance: CHARGE, Niblrrs

OVERLAP AND INTEGRATION OF PROJECTS IN MICHELMORE LAB

Bremia genomics

Technologies: Molecular markers High-throughput sequencing Gene silencing, RNAi Bioinformatics

Bremia lactucae

Page 3: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Gary Shroth (Illumina): “One HiSeq will be able to generate as much sequence as was in GenBank in 2009, every four days”.

Decreasing cost of sequencing (1990 – 2010)

DNA sequence becoming an inexpensive commodity. New paradigms as to how DNA sequence is generated, handled and valued.

The Economist, August 12th, 2010

Page 4: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Clone by clone, Sanger sequencing Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina Genome Analyzer & HiSeq 2000 ABI Solid Helicos (Complete Genomics) Imminent arrivals: Ion Torrent Pacific BioSciences Oxford Nanopore

The Revolution in DNA Sequencing: Identification of Variation

Page 5: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Illumina Sequencing Platforms

Feature GAiix HiSeq2000

Flowcells x Surface Imaging 1 x 1 2 x 2

Read length 2 x 150 2 x 100

Yield per run (PF data) 50 Gb 200 to 350 Gb

Data Rate 5 Gb / day 25 to 40 Gb / day

Human Genomes (30x) per run 0.5 >2

G. Shroth, Illumina

Page 6: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

DNA (0.1-1.0 ug)

Sample preparation: Fragmentation &

addition of primers Cluster growth in flow cell

5’

5’ 3’

G

T

C

A

G

T

C

A

G

T

C

A

C

A

G

T C

A

T

C

A

C

C

T A G

C G

T A

G T

1 2 3 7 8 9 4 5 6

Image acquisition Base calling

T G C T A C G A T …

Sequencing

Illumina Sequencing of Clusters of DNA Molecules Reversible Terminator Chemistry

Cycles of synthesis, imaging, & washing

G. Shroth, Illumina

Page 7: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Illumina HiSeq 2000

Dual surface imaging Fast scanning and imaging Two flow cells Initially capable of 200 Gb

per run -> 350 Gb Run time 7 to 8 days for

2x100 bp 25 Gb/day 2 billion paired-end reads <$10,000 per human

genome <$200 per transcriptome

G. Shroth, Illumina

Page 8: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Low instrument and reagent costs. Silicon chip technology. Single-use microchips Uses natural dNTPs and DNA polymerase Real-time detection of base incorporation No optics or enzymic amplification cascade Proprietary H+ ion sensitive layer

www.iontorrent.com

Page 9: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

www.iontorrent.com

Sequencing determined by measuring H+ release following incorporation of nucleotide. Chip interrogated with cycles of A, T, C, G.

Interrogated with C

Interrogated with G (or A)

Interrogated with T

Page 10: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Simple kits and operation Benchtop convenience, small footprint

Fast. Single reads

Up to millions of reads 10s to 100s Mb output

Not single molecule sequencing

Compatible with preps for other libraries

www.iontorrent.com

Page 11: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Single Molecule Real Time (SMRT™) sequencing Recording natural DNA synthesis by DNA polymerase as it occurs Single molecule resolution Simple amplification-free sample prep Long reads, average read over 1kb Fast, 1 to 3 bases incorporated per second, Sample prep to data analysis in less than a day Low overall costs 80,000 Zero Mode Waveguides (ZMWs) monitored simultaneously, 2 sets per SMRT cell ~33% of ZMWs have only one polymerase Not for counting large numbers of tags

http://pacificbiosciences.com

Page 12: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Processive DNA Synthesis with Phospholinked Nucleotides

Incorporation of labeled nucleotide creates flash of light, captured by optics system and converted into base call with quality metrics

Base being incorporated held in detection volume for tens of milliseconds, producing a bright flash of light. Phosphate chain cleaved, releasing the attached dye molecule. Process repeats at 1 to 3 bases per second.

Page 13: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Distributed Reads: SMRT Strobe Sequencing

Standard 3.2kb Direct Read:

6.4kb Footprint: 19 x 170 bases Strobed Read (50% Duty Cycle)

• Converts long reads into even longer scaffolds

• Flexibility in gap lengths and strobe on-times

• No need to create multiple paired-end/mate-pair libraries

• Produces linked, single-molecule, kilobase observation windows of genomic structure

Strobe Sequencing:

Page 14: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Detection of DNA Methylation by SMRT Sequencing

70.5 71.0 71.5 72.0 72.5 73.0 73.5 74.0 74.50

100

200

300

400

Fluo

resc

ence

inte

nsity

(a.u

.)

Time (s)

104.5 105.0 105.5 106.0 106.5 107.0 107.5 108.0 108.50

100

200

300

400

Fluo

resc

ence

inte

nsity

(a.u

.)

Time (s)

C

T G A T C G T A C

mA A G T C T A A

G C C A A A A

Kinetic detection of methylated bases during sequencing E.g.: N6-methyladenosine (mA)

Page 15: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

www.nanoporetech.com

Page 16: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

www.nanoporetech.com

Exonuclease attached at entrance of α-hemolysin nanopore delivers individual DNA bases in order into the nanopore. Cyclodextrin attached to inside surface of nanopore acts as a binding site for individual DNA bases and allows accurate monitoring of their passage through the nanopore.

Page 17: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

$1,000 ($100?) human genome coming => $1,000 genome for many animals and plants $100 genome for fungi $10 genome for bacteria en masse Not just genomic DNA sequence:

DNA modifications epigenomics & copy number variation (CNV) expression analysis

Enormous amounts of sequence data Need for major data handling capabilities Vital role for bioinformatics

The Challenge and Opportunity: How to utilize the deluge of sequence data?

In near future: DNA sequence = an inexpensive commodity generated on a variety of platforms

Page 18: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

High-throughput sequencing Sequence assembly

Annotation Acquisition of other relevant data

Display in genome browser

UC Davis Sequencing & Gene Expression Service Cores 2011 ->

Tissue, DNA or RNA samples brought to Genome Center

Researcher queries samples versus existing information over web

Integrated activities of DNA Technology & Bioinformatics Cores

Page 19: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Genomic sequencing De novo Microbial, animal and plant diversity Novel & unculturable organisms Biomes (bacteria = 100x human) Novel genespace Re-sequencing SNP and CNV discovery, TILLING Gene cloning, novel allelic diversity Genome Wide Association Studies (GWAS) High resolution population genetics Mapping BSAseq

Gene regulation Transcriptome sequencing for gene models and splicing RNAseq for expression analysis Small and non-coding RNAs Ribosome profiling CHIPseq for DNA binding sites DNA modifications and epigenomics

Page 20: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Gene discovery from non-model organisms

E.g. Cloning of 10 genes from wheat rust, Puccinia striiformis f.sp. tritici (collaboration with J. Dubcovsky)

Whole genome sequencing (<100Mb genome) Two lanes Illumina GA to ~60x.

Quicker, cheaper, more informative than gene-by-gene. Permanent resource for other studies.

Page 21: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Genomic Encyclopedia of Bacteria & Archea www.jgi.doe.gov/programs/GEBA Using phyogenetic approach to sample diversity for genome sequencing.

1000 (Human) Genomes Project

http://www.1000genomes.org Extending hapmap. 2,000 anonymous individuals from 20 populations. 4x coverage originally planned; deeper, 30x now feasible Pilot studies: 15 M SNPs, 1 M small indels, 20 K structural variants

25,000 human genomes sequenced by end of 2011 by this & other projects 1001 Arabidopsis Genomes Project

http://www.1001genomes.org 1001 extensively characterized accessions July 2010: 158 finished, 94 released, 98 underway

1000 Plant & Animal Reference Genomes Project

http://ldl.genomics.org.cn BGI 1,000 economically and scientifically important species in two years

Analyses of Biological Diversity

Page 22: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Genome-Wide Association Studies (GWAS) Case and control comparisons to identify phenotype-marker associations Requires sufficient density of markers & numbers of individuals Currently based on SNP marker analysis Will be replaced by sequencing, at least in non-model species

False-positives due to multiple comparison issues Power issues due to loci of small effect

E.g. Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls. The Wellcome Trust Case Control Consortium. (2007) Nature 447:661-678.

Page 23: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Within L. sativa types: Crisphead x Crisphead Romaine x Romaine Leafy x Leafy + Butterhead

Between cultivated lettuce types: L. sativa x L. sativa

Between & within primary genepool: L. sativa x L. serriola L. serriola x L. serriola

Between & within secondary genepool: L. sativa x L. saligna L. saligna x L. saligna

For each population: Intercross 20 genotypes for 3 generations to breakdown LD & population substructure. Self to produce 200 F7:8 RILs. Phenotype & sequence F7:8 RILs.

PLANNED ASSOCIATION MAPPING POPULATIONS IN LETTUCE

Cultivated lettuce

Primary genepool

Secondary genepool

Page 24: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Old paradigm (slow and inflexible): One-by-one marker development Utilization of core set of reference markers

Current paradigm (faster but specific to populations): Sequence transcriptome of parents to id. 1,000s of SNPs Develop informative SNP panel for specific sets of crosses Run SNPs on segregating individuals

Future paradigm (fast, flexible,& highly informative): Sequence transcriptome or genespace of segregating individuals

Rate limiting steps:

Informative populations Accurate phenotyping Library preparation Sequencing not limiting Data analysis?

Genetic Mapping

Page 25: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Parent-independent genotyping for constructing an ultrahigh-density linkage map based on population sequencing. Xie et al., 2010. PNAS 2010.

238 rice RILs each sequenced to 0.055x, 13x in aggregate. Barcoded and multiplexed. 2x 36 nt paired-end reads, 20.6 Mb total single run. Genotypes inferred from RILs using maximum parsimony of recombination & HMM. New capabilities => any species tractable in a single run.

Page 26: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

E.g. BSAseq: High-throughput sequencing of the 52b2 mutant of Arabidopsis to identify the causal mutation using sequence-based mapping of a bulked segregating population. Cuperus et al. (2010) PNAS 107:466-471.

©2010 by National Academy of Sciences

Page 27: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Transcriptome Profiling using Illumina RNAseq

•  Rapidly replacing chip hybridizations •  Analysis of unknown genomes/genes, no a priori knowledge required •  No limitation on number of genes assayed •  Digital, unambiguous readout •  High sensitivity, specificity, dynamic range •  Opportunity to progressively sequence to greater resolution •  Detection few mRNA per cell possible •  Low cost (vs. SAGE, MPSS, = chips) •  Mulitplex samples 12 per lane, 192 libraries per run on HiSeq2000 •  Quantification of splice variants •  Quantification of allele specific expression •  Currently limited by speed (& cost) of library preparation ->

automation

Page 28: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Human Body Map 2.0 Project

16 Tissues: •  Adrenal •  Adipose •  Brain •  Breast •  Colon •  Heart •  Kidney •  Liver •  Lung •  Lymph Node •  Ovary •  Prostate •  Skeletal Muscle •  Testis •  Thyroid •  White Blood Cells

Illumina RNAseq whole transcriptome, HiSeq2000 5 billion reads, 300 Gb, 14 days total run time

G. Shroth, Illumina

Page 29: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Illumina RNA-Seq Protocols

Conventional mRNA-Seq Purified Total RNA

Poly-A Selection

RNA Fragmentation cDNA Synthesis

Adapter Ligation & PCR

Total RNA-Seq + DSN Purified Total RNA

RNA Fragmentation cDNA Synthesis

Adapter Ligation & PCR

DSN Normalization:

Denaturation @ 98oC

Renaturation @ 68oC

Cleavage with DSN

PCR of intact fragments (Duplex Specific Nuclease

from Evrogen)

www.illumina.com

mRNA seq

mRNAseq + DSN

Total RNAseq + DSN

DSN treatment does not affect relative abundance of rare transcripts & reveals non-poly A snoRNAs

Page 30: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

THE ‘BOOM-AND-BUST’ CYCLE OF DISEASE CONTROL The Agricultural Consequences of Pathogen Evolution

INTRODUCTION OF RESISTANT VARIETY

INCREASE IN ACREAGE

PATHOGEN CHANGES TO RENDER RESISTANCE INEFFECTIVE

DECREASE IN ACREAGE DEVELOPMENT OF

RESISTANT VARIETY (Suneson, 1960)

Page 31: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Aim: Durable Disease Resistance

“Resistance that remains effective over long periods of widespread agricultural use”

Pragmatic but elusive goal

Success can only be judged retrospectively

Strategies to maximize likelihood that will be durable

Diversify selection pressures on pathogen

Roy Johnson (1984). A critical analysis of durable resistance. Ann. Rev. Phytopathol. 22:309-30.

Page 32: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

MOLECULAR IDENTIFICATION & USE OF DISEASE RESISTANCE GENES

PHENOTYPIC LEVEL Resistant accessions Lab & field screens

MOLECULAR LEVEL Candidate genes from ESTs

Ultra-dense map from GeneChip

GENETIC ANALYSES Coincident position on chromosome?

Targeted association studies of diversity panel Fine mapping with high-resolution recombinants

FUNCTIONAL ANALYSES VIGS / RNAi: Gene family

EMS mutation,transgenic complementation: Specific gene

BREEDING Marker-assisted selection Introgression & pyramiding

BASIC STUDIES Molecular basis of specificity

Genetic basis of evolution of specificity

Page 33: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Resistance candidate genes:

Pathogen recognition

Resistance signaling pathway

Defense response

Susceptibility factor

Quantitative trait loci

Resistance phenotypes: Downy mildew

Virus

Anthracnose

Root aphid

Plasmopora lactucae-radicis

Corky root

Fusarium wilt

Verticillium wilt

Effector HR

CLX_S3_7961 CLX_S3_9182

CLX_S3_790 CLX_S3_6420

CLX_S3_12147 CLX_S3_6892

QGD8I16 QGB24A18

QGG16O14 CLX_S3_8436 CLX_S3_1125

RGC12E QGD13P19

RGC11B CLX_S3_15206 CLX_S3_13702

CLX_S3_6762 QGC18F20

CLX_S3_6208 CLX_S3_2669 CLX_S3_2959

QGF9F09 CLX_S3_15211

QGG16P08 CLX_S3_427

QGG19L12

0

20

40

60

80

100

120

R39

RB

Q3

AvrPto-HR

9 6 CLSX2101

CLX_S3_6010 CLX_S3_10810 CLX_S3_12870 CLX_S3_14206 CLX_S3_12264

CLX_S3_8454 CLX_S3_5158

CLX_S3_10491 CLX_S3_12430

QGJ5F12 CLSY5163

CLX_S3_6910 CLX_S3_6962 CLX_S3_3891

QGF18C23 CLX_S3_9497

CLX_S3_14773 QGF16L07 QGF17L05

CLX_S3_13566

CLX_S3_374 .

0

20

40

60

80

100

120

QGD5O22 CLSY3114

CLX_S3_14061 CLX_S3_12044 CLX_S3_14050 CLX_S3_12906 CLX_S3_1517 CLX_S3_1173

QGG19N22 CLX_S3_1404

QGJ12P21 QGG21B18

CLX_S3_4405 RGC15A

CLX_S3_1446 QGG13J04

CLX_S3_13603 QGG10J06

CLX_S3_14099 QGA13C08

CLX_S3_357 RGC4O

QGH8M10 RGC4E RGC4T RGC4V RGC4F RGC4X RGC4G RGC4C RGC4R

QGG13M01 LserNBS16

RGC4S RGC4Q

RGC20A RGC20C RGC20B

CLX_S3_8418 CLX_S3_6808

RGC5A LsatNBS11

CLX_S3_11489 CLX_S3_821

CLX_S3_12882 CLRX4778

CLX_S3_5632

0

20

40

60

80

100

AvrPpiC-H

R

AvrRps4-H

R

8

AN

T3

7

QGH3f04

CLX_S3_10156

QGG13J02 CLX_S3_1659

CLX_S3_2349 QGF6A06

CLSM11140 CLX_S3_13351

CLX_S3_2031 CLX_S3_15074 CLX_S3_12667 CLX_S3_13025

QGF25M24 RGC4K

0

20

40

60

80

100

RB

Q1

FUS3

QGF3e10 CLX_S3_10033 CLX_S3_1274

QGF24O17 CLSM3058

CLX_S3_5128 CLX_S3_6332

QGC16I05 QGC20G02

QGG8P13 CLX_S3_1736

CLRY2600 CLX_S3_2780 CLX_S3_8398

QGF7C21 CLX_S3_1042 CLX_S3_1828

QGG11K24 CLX_S3_579

CLRX8688 CLX_S3_15606 CLX_S3_10397 CLX_S3_9834 CLX_S3_3539

RGC16E QGF17I06

CLX_S3_3224 CLX_S3_8409

CLSM741 QGD7B12

LserNBS03 QGG8K23

RGC16B QGC12K15 CLSY4446

RGC16A QGE3a07

CLRX1526 RGC16F

CLRX9010 CLX_S3_6708

QGF20G21 CLSX3769

RGC6A CLX_S3_3390 CLX_S3_1499 CLX_S3_9641

0

20

40

60

80

100 Tu M

o2 D

m10

plr R

BQ

2 Dm

5/8

Dm

17 D

m43

AvrB - HR

AvrRpt2 - HR

AvrR

pm1 - HR

1 QGI9H24 CLX_S3_15048 CLX_S3_5843

CLX_S3_14085 QGB19M14

CLX_S3_15781 CLSS12467

CLX_S3_7798 CLX_S3_1242 CLX_S3_9203

QGG4d04 CLX_S3_11134 CLX_S3_3740 CLX_S3_535

QGG14D23 CLSX5517 CLRY4117

CLX_S3_2774 QGI13M08

CLX_S3_15772

CLSY483 RGC1P

LserNBS10 RGC4A

LserNBS24 RGC1O

CLSX9287 CLRX8858

RGC1Q CLX_S3_2706 CLX_S3_2010

CLX_S3_11722 CLX_S3_11849

0

20

40

Dm

13

3b

QGH6P20

CLX_S3_6776 CLX_S3_4579

CLX_S3_15049

CLX_S3_4740

QGC5C13 QGA17D17

QGF6A22 CLRY8019

CLX_S3_2939 CLX_S3_826

CLX_S3_13487 CLX_S3_2221

QGB7L15

CLX_S3_1356 QGF5E16

CLX_S3_15192 QGJ7D09

QGG21F05

CLX_S3_12280 QGB4d05

CLX_S3_15257 CLX_S3_4305

CLX_S3_13459

0

20

40

60

80

100

120

140

160

180

5

CLX_S3_2225

0

20

40

60

80

100

3a

Dm

7

4

cor

CLSX3763 CLSX3674

CLX_S3_11945 QGG26B23

CLX_S3_13611 CLSY4807 CLRY3021

RGC11A RGC12G

CLX_S3_2930 CLX_S3_7341 CLX_S3_8995 CLX_S3_9470 CLX_S3_7346 CLX_S3_7363 CLX_S3_4627

QGC10F01 CLX_S3_15389

CLRY544 CLX_S3_14751 CLX_S3_3791

CLRY9160 CLX_S3_1701 CLX_S3_6039 CLX_S3_1415

QGF13H12 QGI4E15

QGA12L23 QGG11A07

CLX_S3_12210 CLSM14946

CLX_S3_9579 CLX_S3_6780

0

20

40

60

80

100

120

140

160

180

200

Dm

44 D

m4

Dm

11

mo1

FUS1

VRT1

Ra D

m1

AN

T2

2 Dm

2 Dm

6 Dm

14 Dm

15 Dm

16 Dm

18

FUS2

RGC2F RGC2A

AY153845 RGC2X

QGB28O08 RGC2J

RGC2M RGC2B RGC2S

CLSM16923 RGC2Y RGC2U RGC4U

CLX_S3_4455 QGD10N11

CLX_S3_14182 CLX_S3_1771 CLX_S3_9978

LsatNBS05 CLX_S3_13590 CLX_S3_11288 CLX_S3_12930

CLX_S3_3045 CLX_S3_4550 CLX_S3_3943

CLX_S3_15473 CLX_S3_6671

QGI11K10 CLX_S3_7649 CLX_S3_4292

CLX_S3_15231 QGC22B19

CLX_S3_4116 CLX_S3_1216

CLX_S3_12591 CLX_S3_4001 CLX_S3_7749 CLX_S3_8485

CLX_S3_891 CLX_S3_5919

QGF20E14 CLX_S3_7847

RGC1B CLX_S3_6313

0

20

40

60

80

100

120

140

Dm

3 Tvr

Dm

45

Dm

36

52 PHENOTYPES MAPPED RELATIVE TO 289 RESISTANCE GENE CANDIDATES

Nearly all resistance phenotypes segregate with a cluster of NBS-LRR encoding genes. 100s of resistance genes in all genotypes.

Complex clusters of phenotypes & candidate genes

Leah McHale et al., 2009. Theor. Appl. Genet. 118:1223-4. Maria Truco, Oswaldo Ochoa, Kirsten Lahre, et al. unpublished

Page 34: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Cf-2 Cf-4 Cf-5 Cf-9 RPP27

Xa21 FLS2

Pto L6 M N RPS4 RPP1 RPP5 RRS1

RPM1 RPS2 Prf I2 Mi Dm3 RPP8 Bs2

Nucleotide Binding Site Leucine-Rich Repeat Protein Kinase Toll-Interleukin Homology Coiled-coil domain C terminal domain

RPP13 RPS5 Xa1 Pi-B Rx1 Rx2 Rp1 Gpa2

MAJOR PROTEIN MOTIFS SHARED BETWEEN PLANT DISEASE

RESISTANCE GENE PRODUCTS ACTIVE AGAINST DIVERSE

PATHOGENS & PESTS

RPW8

Rpg1

Page 35: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

35S   ocs  

RGC2B  =  Dm3  

~500bp pdk  intron  ~400bp  GUS  

•   Generate  stable  transgenic  plants  with  RNAi  construct.  •   Test  for  silencing  of  GUS  using  Agrobacterium-­‐mediated          transient  assays  with  35S::GUS.  •   Test  plants  exhibiCng  silencing  for  disease  resistance          to  isolates  expressing  avirulence  to  mulCple  Dm  genes  in  cluster.  

LG2

FUNCTIONAL ANALYSIS OF CANDIDATE GENES USING RNAi: SILENCING RGC2 GENE FAMILY WITH GUS REPORTER FRAGMENT FOR SILENCING

Wroblewski et al. 2007. Plant J. 51:803-818

Page 36: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

RNAi using the NBS region of RGC12G silences the linked Dm7, Dm4 and Dm11 genes

LSE57/15 21 F1s 15 F1s CG R4T57 27 F1s 23 F1s CG Capitan 25 F1s 24 F1s CG

R = Resistance S = Susceptible

GUS

Avr7

Silencing

Dm7

CG w/ RNAi x LSE57/15 R4T57 x CG w/ RNAi Capitan x CG w/ RNAi (T-DNA/-)(dm0) (Dm7) (Dm4) (T-DNA/-)(dm0) (Dm11) (T-DNA/-)(dm0)

+ + - - R R S S no no yes yes

+ + - - R R S S no no yes yes

GUS

Avr4

Silencing

Dm4

GUS

Avr11

Silencing

Dm11

+ + - - R R S S no no yes yes

Marilena Christopoulou

Conclusion: Dm4, Dm7 and Dm11 are encoded by RGC12G or related sequence(s)

On-going: making RNAi lines for most candidate resistance genes

= excellent markers for breeding resistance

Page 37: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

COMBINING RESISTANCES TO FUSARIUM AND DIEBACK IN CRISPHEAD AND ROMAINE

In collaboration with Ivan Simko, USDA, Salinas

Ra

Dm

1

AN

T2

Dm

2 Dm

6 Dm

14 Dm

15 Dm

16 Dm

18 FUS2

20

40

60

80

100

120

140

2 RGC2F

RGC2A

AY153845

RGC2X

QGB28O08

RGC2J

RGC2M

RGC2B

RGC2S

CLSM16923

RGC2Y

RGC2U

RGC4U

CLX_S3_4455

QGD10N11

CLX_S3_14182

CLX_S3_1771

CLX_S3_9978

LsatNBS05

CLX_S3_13590

CLX_S3_11288

CLX_S3_12930

CLX_S3_3045

CLX_S3_4550

CLX_S3_3943

CLX_S3_15473

CLX_S3_6671

QGI11K10

CLX_S3_7649

CLX_S3_4292

CLX_S3_15231

QGC22B19

CLX_S3_4116

CLX_S3_1216

CLX_S3_12591

CLX_S3_4001

CLX_S3_7749

CLX_S3_8485

CLX_S3_891

CLX_S3_5919

QGF20E14

CLX_S3_7847

RGC1B

CLX_S3_6313

0 Dm

3 Tvr

Dieback resistance from crisphead

Fusarium resistance from romaine

Danger that transfer of resistance to one disease will introduce susceptibility to the other.

Broke linkage. Have lines resistant to both diseases.

QTL for Fusarium resistance close to resistance to dieback

Chr 2

Large cluster of NBS-LRR encoding genes and resistance phenotypes

Page 38: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

CONSEQUENCES OF RESISTANCE GENE GENETICS & MOLECULAR BIOLOGY TO PLANT IMPROVEMENT

All genotypes have large numbers (100s) of resistance genes. Backcrossing often introgresses clusters not single resistance genes. Introgression will replace resistance genes in recurrent parent. Clustering implies pyramiding some combinations of genes will be very difficult. Large number and wide variety of recognition specificities possible. Only finite number of specificities against a particular pathogen?? Possibility for repeated introgression of same resistance specificity. Some non-host resistance may be pyramids of specific genes. Quantitative resistance can be conferred by single NBS-LRR genes. Divergent selection using multiple/different resistances optimal. Opportunities for transfer across sexual compatibility barriers.

Page 39: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

HOW CAN WE ACHIEVE DURABLE RESISTANCE?

Strategies to enhance durability of resistance

Use knowledge of pathogen variability to inform strategies for resistance gene deployment. New opportunities from new

technologies for rapid and comprehensive genotyping

Pyramid/stack multiple genes – increase evolutionary hurdle

Diversify selection pressure on pathogen

Utilize ‘durable’ resistance – empirical rather than mechanistic?

Match life expectancy of resistance and cultivar

Page 40: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

DIRECT OR INDIRECT PATHOGEN PRODUCTS

SIGNAL CASCADE, RESISTANCE RESPONSE & RAPID CELL DEATH

SELECTION PRESSURE ON PATHOGEN FOR LACK OF ACTIVITY OF MULTIPLE AVIRULENCE FACTORS

PYRAMIDING RESISTANCE GENES USING MARKER-ASSISTED SELECTION OF CAUSAL GENES

X

MULTIPLE RECOGNITION EVENTS

Page 41: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Diversify selection pressure on pathogen

Heterogeneity of resistance genes in space and/or time:

diversify resistance sources,

pipeline with different resistance genes from other programs,

multilines and cultivar mixtures,

syn- and allo-patric gene deployment,

Roy Johnson (1984). A critical analysis of durable resistance. Ann. Rev. Phytopathol. 22:309-30.

Page 42: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

IMPACT OF HIGH THROUGHPUT SEQUENCING & MASSIVELY PARALLEL GENOTYPING ON RESISTANCE STRATEGIES

•  Global rather then gene-by-gene analysis •  Saturation of identification of candidate genes Recognition, signal transduction, response SNPs in causal genes •  Characterization of germplasm Full genome resequencing of 10 – 100 genotypes Gobal genotyping of 1000 - 10,000 genotypes Natural variation in 1o, 2o, & 3o genepools Vast numbers of resistance genes available §  Characterization of pathogens (A)virulence factors Pathogen variability §  Gene deployment Marker assisted selection of causal genes Pyramids of multiple genes, conventional & transgenic Heterogeneity between genotypes in space & time Fragment selection pressure on pathogen populations Manage pathogen evolution

Page 43: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

PATHOGEN POPULATION GENETICS SHOULD DRIVE DEPLOYMENT OF RESISTANCE GENES

Influenza paradigm

Continual sampling of pathogen Virulence phenotyping Gene-space sequencing (10s) SNP genotyping (1000s)

Resistance gene discovery pipeline Germplasm screens Mapping, molecular markers Molecular characterization

Deployment of effective resistance genes Pyramiding, MAS or effector-driven selection Allo- and sympatric diversity Temporal adjustment of resistance genes deployed Transgenic approaches for novel resistance strategies

+

Page 44: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

Potential Challenges to Implementation of the Influenza Paradigm for Resistance Gene Deployment

Collection of pathogen samples from diverse, low-tech locations. Global coordination of pathogen sampling.

Data processing and interpretation.

Data-driven consensus building and decision making.

Sufficient numbers of genes for pipeline (conventional &/or transgenic).

Persuading breeders (public and private) to participate and coordinate.

Providing agronomically acceptable options to famers and consumers.

Revision/adaption of regulatory/registration requirements to accommodate

agronomically equivalent cultivars with different resistance genes.

Page 45: High Throughput DNA Sequencing and the Deployment of ...gcp21.org/Bellagio/17-RichardMichelmore.pdf · Long, high accuracy reads, ABI 3730 Massively parallel, shorter reads 454 Illumina

SUMMARY DNA sequence becoming an inexpensive commodity generated on a variety of platforms. New paradigms as to how DNA sequence is generated, handled and valued. Enormous amounts of sequence data imminent. Need for major bioinformatics capabilities: data into knowledge. Unprecedented opportunities for discovery and manipulation of variation in plants and pathogens. Durable resistance: old ideas, new opportunities. Diversify sources and types of resistance. Pyramid to increase evolutionary hurdle for pathogen. Fragment/diversify selection pressure on pathogen. Pathogen population genetics should drive deployment of resistance genes: the influenza paradigm.