1
© 2013 Illumina, Inc. All rights reserved. Illumina, IlluminaDx, BaseSpace, BeadArray, BeadXpress, cBot, CSPro, DASL, DesignStudio, Eco, GAIIx, Genetic Energy, Genome Analyzer, GenomeStudio, GoldenGate, HiScan, HiSeq, Infinium, iSelect, MiSeq, Nextera, NuPCR, SeqMonitor, Solexa, TruSeq, TruSight, VeraCode, the pumpkin orange color, and the Genetic Energy streaming bases design are trademarks or registered trademarks of Illumina, Inc. All other brands and names contained herein are the property of their respective owners.
Developing Nextera tools for microbial sequencing
Niall Gormley 17th May 2013
2
No. 1 guiding principle of sample prep is… Simplicity
• No. of steps • Tube transfers • Volumes
• Avoid gels • Avoid PCR
• Compromise by application
3
TruSeq DNA: Shear, End polish, A-tail, Ligate adaptors, Gel, PCR, plus 5 clean-ups; ~1 day; 1 µg input
TruSeq DNA PCR-Free: Shear, End polish, Bead size select, A-tail, Ligate adaptors, plus 3 clean-ups; ½ day; 1 µg input
Nextera: Tagment, PCR, plus 2 clean-ups; 1.5 hr; 50 ng input
Nextera XT: Tagment, PCR, plus 1 clean-up; 1.5 hr; 1 ng input
4
Nextera PCR free?: Tagment, Clean up; 10 min?; ? ng input
Nextera XT: Tagment, Clean up, PCR; 1.5 hr; 1 ng input
[Diagram: strand orientation (5’ / 3’) of the tagmented fragment, with Rd1 SP and Rd2 SP’ sites]
Challenges:
• More steps
• Transposome
• Indexing
5
The next-gen ‘Agar plate’:
Bacterial colonies (10^2-10^3 per plate)
• Very cheap • Simple
vs
Clusters (1 million / mm²)
• Turn-key solution / multiple pathogens • ‘See it all’ • Far greater detail • Info at the ultimate resolution
Nextera prep
6
Some examples of ultimate resolution
7
Both: smear +ve for tuberculosis 14 days apart
Culture +ve sputum samples tested at national reference lab
2 students overlap briefly at Cambridge language school
Resistance to: • Isoniazid • Prothionamide • Streptomycin
Strain ID: • Beijing • MIRU-VNTR
50 ng DNA, Nextera, MiSeq, 150 bp reads:
- Two isolates genetically identical
- No SNP differences in mapped genes
Analyzed 35 genes/promoters associated with resistance to antibiotics:
- C-15T mutation in inhA (shared target for isoniazid and prothionamide)
- V77G mutation in gidB (putative 16S rRNA methyltransferase; confers low-level resistance to streptomycin)
8
Outbreak commences in neonatal ward
Initial antibiograms suggest a difference from ST22 (EMRSA-15)
[Figure legend: Outbreak ward; Same ward before outbreak; Rest of hospital]
9
Sequenced 14 isolates
All strains on outbreak ward are ST22
4 strains have a different ST
[Figure legend: Outbreak ward; Same ward before outbreak; Rest of hospital]
10
Sequencing reveals 2 distinct groups
[Figure: maximum likelihood tree rooted on the reference; labels: Outbreak ward, Hypermutator]
Separated by 102 SNPs
Identified hypermutator with frameshift in mutS
Single A insertion in hemG causing truncation of HemG
Small colonies
100% concordance between resistance phenotype and mutations identified in the sequenced genomes
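A minimal sketch of how a SNP distance such as the "102 SNPs" above can be counted, assuming each isolate has already been reduced to an equal-length string of base calls at the same reference positions. The isolate names and sequences are hypothetical toy data, not values from the study, and the function is illustrative rather than the pipeline that was actually used.

```python
from itertools import combinations

def snp_distance(seq_a, seq_b):
    """Count positions where two aligned sequences carry different bases.

    Positions where either sequence has a non-ACGT symbol (e.g. N or a gap)
    are skipped, so uncalled bases are not counted as SNPs.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a, seq_b)
        if a != b and a in valid and b in valid
    )

# Hypothetical toy isolates (assumption: already aligned to one reference).
isolates = {
    "isolate_1": "ACGTACGTAC",
    "isolate_2": "ACGTACGTAC",   # identical to isolate_1
    "isolate_3": "ACGAACGTTC",   # differs at two positions
}

pairwise = {
    (x, y): snp_distance(isolates[x], isolates[y])
    for x, y in combinations(isolates, 2)
}
for pair, dist in pairwise.items():
    print(pair, dist)
```

Pairs with a distance of zero correspond to the "genetically identical" case described for the two tuberculosis isolates; large distances separate groups on a tree.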
11
Single-read run, 36 bases, yielding 27 million reads
81%: 170x coverage of E. coli genome
10%: 37,000x coverage of pUC19
9%: F plasmid and pUC19 insert
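The coverage figures above follow from simple arithmetic: reads x read length x fraction mapping, divided by target size. The sketch below reproduces them under two assumptions that are not on the slide: an E. coli genome of about 4.64 Mb and the standard pUC19 length of 2,686 bp.

```python
def fold_coverage(total_reads, read_length, fraction, target_size_bp):
    """Mean fold coverage of a target, given the share of reads mapping to it."""
    return total_reads * read_length * fraction / target_size_bp

TOTAL_READS = 27_000_000     # from the slide
READ_LENGTH = 36             # from the slide (single read, 36 bases)
ECOLI_GENOME_BP = 4_640_000  # assumption: approximate E. coli genome size
PUC19_BP = 2_686             # assumption: standard pUC19 vector length

ecoli_cov = fold_coverage(TOTAL_READS, READ_LENGTH, 0.81, ECOLI_GENOME_BP)
puc19_cov = fold_coverage(TOTAL_READS, READ_LENGTH, 0.10, PUC19_BP)

print(f"E. coli: ~{ecoli_cov:.0f}x")
print(f"pUC19:   ~{puc19_cov:,.0f}x")
```

With these assumed sizes the result is about 170x for E. coli and about 36,000x for pUC19, in line with the slide given that its percentages are rounded.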
12
Extreme sample prep =
13
Mate-pairs
[Diagram: fragment ends labelled with Biotin]
14
[Diagram: three earlier mate-pair library prep workflows, labelled 2006, 2009 and 2011; step order is not preserved in this transcript]
2006: Shear, End polish, Cap linker, Methylation, Circularize, Exonuclease, EcoP15I digest, Biotin enrich, End polish/A-tail, Ligate adaptors, PCR; plus gels, clean-ups, EtOH ppt and bead washes
2009: Shear, End polish, Hairpin linker, Methylation, Fill in, Circularize, Exonuclease, Cleave hairpin, Shear, End polish, Ligate adaptors, emPCR; plus clean-ups and a bead size cut
2011: Shear, End polish, Cre-Lox linker, Fill in, Circularize, Exonuclease, EcoP15I digest, Biotin enrich, End polish, A-tail, Ligate adaptors, PCR; plus a gel, clean-ups, bead size cuts and bead washes
Challenges:
• Manipulation of very large DNA molecules
• Basic short insert preps: gels, clean-ups, ligations
15
[Diagram: adaptor ligation test. Biotinylated (Bio) adaptors are ligated, the product is digested with XmnI, and ligation is scored by streptavidin (Strep) gel shift, with lanes run - / + Strep]
65% ligation efficiency at short end => 42% efficiency both ends
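The 42% figure is the per-end efficiency multiplied by itself, which holds if the two ends ligate independently (an assumption; the slide does not state it). A small sketch of that arithmetic, with a hypothetical helper name:

```python
def both_ends_efficiency(per_end_efficiency, ends=2):
    """Fraction of molecules ligated at every end, assuming the ends
    ligate independently with the same per-end efficiency."""
    return per_end_efficiency ** ends

# 65% per end (from the slide) gives roughly 42% for both ends.
print(f"{both_ends_efficiency(0.65):.0%}")
```

The same compounding explains why every extra ligation in a workflow is costly: each one multiplies the overall yield by another factor below one.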
16
[Diagram: the 2006, 2009 and 2011 workflows repeated, now alongside a 2008 workflow; step order is not preserved in this transcript]
2008: Shear, Biotin end polish (Bio-dNTPs), Gel, Circularize, Exonuclease, Shear, Biotin enrich, End polish, A-tail, Ligate adaptors, PCR, Gel; plus clean-ups and bead washes
17
[Diagram: 2013 workflow alongside the 2008 workflow; labels: Biotin (2013); Biotin, Bio-dNTPs (2008)]
2013: Tagment, Strand displace, Bead size cut, Clean up, Circularize, Exonuclease, Shear, Biotin enrich, End polish, A-tail, Ligate adaptors, PCR, Clean up; plus bead washes; no gel
2008: Shear, Biotin end polish, Gel, Circularize, Exonuclease, Shear, Biotin enrich, End polish, A-tail, Ligate adaptors, PCR, Gel; plus clean-ups and bead washes
18
Bacterial de novo assembly with Gel-Free Mate-Pair
E. coli de novo assembly metrics
Total size [bp]: 4,664,930 (15)*
N50 scaffold [bp]: 4,635,607 (1)*
Max scaffold [bp]: 4,635,607
* Number of contigs in data set
Complete de novo assembly
E. coli mate-pair data: 2 x 100 bp (1 of 4 indexed bacterial libraries on a single MiSeq run)
Median size: 3,971 bp
Total reads: 1,262,185
% mate-pairs: 98.2%
% unique: 99.8%
Diversity: 338,674,000
Mate-Pair Protocol + MiSeq = DNA to de novo in 3 days!
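For readers unfamiliar with the N50 metric quoted above, here is a minimal sketch of the usual definition: sort scaffolds from longest to shortest and report the length at which the running total first reaches half of the assembly size. The scaffold lengths below are hypothetical toy values, not the data behind the slide.

```python
def n50(lengths):
    """Return (N50, L50) for a list of contig or scaffold lengths.

    N50 is the length of the scaffold at which the cumulative length of
    the longest scaffolds first reaches half of the total assembly size;
    L50 is how many scaffolds it took to get there.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            return length, count

# Hypothetical assembly dominated by one chromosome-sized scaffold.
toy_scaffolds = [4_600_000, 20_000, 10_000, 5_000]
print(n50(toy_scaffolds))

# Hypothetical fragmented assembly for contrast.
print(n50([8, 8, 4, 3, 3, 2, 2, 2]))
```

When a single scaffold spans most of the genome, N50 equals the maximum scaffold length, which is the pattern in the mate-pair metrics above.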
19
Organism        Size   GC   Library    N50            Max scaffold  Total size       Genome coverage  Input coverage
E. coli DH10B   4.7 M  50%  TruSeq     109,035 (13)   330,951       4,544,106 (136)  95%              59
E. coli DH10B   4.7 M  50%  Mate-Pair  3,690,672 (1)  3,690,672     4,762,657 (68)   96%              57
R. sphaeroides  4.6 M  67%  TruSeq     279,376 (6)    551,032       4,161,361 (73)   82%              47
R. sphaeroides  4.6 M  67%  Mate-Pair  2,590,288 (1)  2,590,288     4,591,572 (64)   94%              31
B. cereus       5.4 M  35%  TruSeq     485,654 (3)    1,197,538     5,371,324 (70)   92%              66
B. cereus       5.4 M  35%  Mate-Pair  3,611,328 (1)  3,611,328     5,444,030 (39)   99%              40
20
Organism                            Mate pairs + WGS paired ends  Size selected   Mate pair input  Contigs  Scaffolds  N50 contig  N50 scaffold
Cellulomonas flavigena (74% GC)     Ilmn + JGI                    Gel free        1 µg             27       4          408.4 kb    3,493 kb
Cellulomonas flavigena (74% GC)     Ilmn + JGI                    ~6 kb (gel)     4 µg             29       6          275.1 kb    3,692 kb
Cellulomonas flavigena (74% GC)     JGI + JGI                     ~10 kb (gel)    >10 µg           48       8          188.2 kb    4,060 kb
Conexibacter woesei (73% GC)        Ilmn + JGI                    Gel free        1 µg             8        1          744.3 kb    6,328 kb
Conexibacter woesei (73% GC)        JGI + JGI                     ~10 kb (gel)    >10 µg           7        1          1,190.2 kb  6,355 kb
[Plots: number of reads vs fragment size (kb, 2 to 20). C. flavigena: JGI, Ilmn 1 µg, Ilmn 4 µg. C. woesei: JGI, Ilmn 1 µg]
Data courtesy of Feng Chen @JGI
21
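[Diagram: the 2013 workflow alongside a shortened variant; labels: Biotin, Biotin]
2013: Tagment, Strand displace, Bead size cut, Clean up, Circularize, Exonuclease, Shear, Biotin enrich, End polish, A-tail, Ligate adaptors, PCR, Clean up; plus bead washes
Shortened variant: Tagment, Strand displace, Bead size cut, Clean up, Circularize, Exonuclease, Tagment, Biotin enrich, PCR, Clean up (a second Tagment in place of Shear; no End polish, A-tail or adaptor ligation)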
22
[Diagram: Shear vs Tagment comparison using E. coli (Ec) and human (Hu) DNA, with the samples combined and split between fragmentation steps]
23
Conditions compared: Shear / Shear, Tagment / Shear, Shear / Tagment, Tagment / Tagment
E. coli
% Paired ends: 3.4, 29.1, 1.7, 61.5
% Mate pairs: 95.1, 69.6, 95.8, 35.3
Diversity: 159 M, 49 M, 5,346 M, 127 M
Human
% Paired ends: 4.4, 34.4, 0.7, 47
% Mate pairs: 90, 60, 94, 45
Diversity: 4 M, 50 M, 2,947 M, 218 M
24
[Plot: axes 0 to 14,000 and 0 to 15,000; one trace per input amount: 1000 ng, 500 ng, 250 ng, 125 ng, 62.5 ng]
E. coli MG1655
Input    N50            Max scaffold  Total size      Genome coverage  Input coverage
1000 ng  4,639,989 (1)  4,639,989     4,672,244 (78)  99.4%            180
500 ng   4,617,536 (1)  4,617,536     4,652,083 (84)  99.1%            163
250 ng   4,631,394 (1)  4,631,394     4,661,881 (67)  99.5%            140
125 ng   4,644,837 (1)  4,644,837     4,689,356 (86)  99.6%            158
62.5 ng  4,648,614 (1)  4,648,614     4,684,742 (65)  99.5%            153
Lastly: in the hands of R&D
25
Thanks!