Next Generation Sequencing Workflow and Applications for Mycobacterium tuberculosis

Jamie Posey, PhD
Applied Research Team Lead

June 8, 2015

National Center for HIV/AIDS, Viral Hepatitis, STD, and TB Prevention
Division of Tuberculosis Elimination

Next Generation Sequencing (NGS) Workflow

Sample Preparation

Library Preparation

Sequence

Analyze

Sample Preparation for Whole Genome Sequencing (WGS)

Isolation of DNA
  • Chemical lysis (CTAB)
  • Mechanical lysis (FastPrep-24)
  • Purify DNA

Shear genomic DNA
  • Physical
  • Enzymatic

Library Preparation for Illumina Platforms

Sequence Libraries

https://www.youtube.com/embed/HMyCqWhwB8E?iframe&rel=0&autoplay=1

https://www.youtube.com/watch?v=v8p4ph2MAvI

https://www.youtube.com/watch?v=WYBzbxIfuKs

https://www.youtube.com/watch?t=10&v=L_jAtDSB8kA

Ion Torrent PGM by Life Technologies

PacBio SMRT

Illumina

Genome Assembly

http://www.jigsawplanet.com/?rc=createpuzzle

Bioinformatic Tools

gatc-biotech.com

Simple Variant Call Pipeline

https://www.ebi.ac.uk/training/sites/ebi.ac.uk.training/files/materials/2014/140217_AgriOmics/dan_bolser_snp_calling.pdf
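The idea behind a simple variant call step can be sketched as follows, assuming reads are already aligned to a reference without gaps. This is a toy illustration, not the pipeline shown on the slide: the function name `call_snps`, the `(start, sequence)` read format, and the thresholds `min_depth` and `min_fraction` are all hypothetical.

```python
from collections import Counter, defaultdict

def call_snps(reference, aligned_reads, min_depth=3, min_fraction=0.8):
    """Call SNPs from ungapped alignments given as (0-based start, sequence) pairs.

    Hypothetical thresholds: a site is called only if at least `min_depth`
    reads cover it and the majority base accounts for `min_fraction` of them.
    """
    # Build a pileup: for every reference position, count the bases seen there.
    pileup = defaultdict(Counter)
    for start, seq in aligned_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if pos < len(reference):
                pileup[pos][base] += 1

    snps = []
    for pos in sorted(pileup):
        counts = pileup[pos]
        depth = sum(counts.values())
        base, n = counts.most_common(1)[0]
        if depth >= min_depth and base != reference[pos] and n / depth >= min_fraction:
            # Report 1-based positions, as in labels such as C761147T.
            snps.append((pos + 1, reference[pos], base))
    return snps

reference = "ACGTACGTAC"
reads = [(0, "ACGTA"), (2, "GTATGT"), (3, "TATGTA"), (4, "ATGTAC")]
print(call_snps(reference, reads))  # [(6, 'C', 'T')]
```

Production pipelines add steps this sketch omits, such as base-quality filtering and handling of insertions and deletions.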

Examples of Commercial Products

The companies and products depicted here are not endorsed by CDC

Reference-Guided (Mapped) Assembly

Reference Sequence/Genome

Low sequence coverage

UNMAPPED READS
  1. Sequences not present in the reference
  2. Plasmids or other extrachromosomal DNA
  3. Structural variation/rearrangement

ADVANTAGES: Relatively fast, well-suited to highly-conserved genomes

DISADVANTAGES: Issues with high diversity, mobile elements

[Figure: coverage plot (axis labeled Coverage, 1X to 18X) over Contig 1 and Contig 2]
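To make the notions of coverage and low-coverage regions concrete, here is a minimal sketch that computes per-position depth from read placements and reports regions falling below a threshold. The function names and the `(start, length)` read representation are assumptions made for illustration.

```python
def coverage_profile(genome_length, mapped_reads):
    """Per-position depth from (0-based start, length) read placements."""
    depth = [0] * genome_length
    for start, length in mapped_reads:
        for pos in range(start, min(start + length, genome_length)):
            depth[pos] += 1
    return depth

def low_coverage_regions(depth, threshold):
    """Return half-open (start, end) intervals where depth < threshold."""
    regions, start = [], None
    for pos, d in enumerate(depth):
        if d < threshold and start is None:
            start = pos                      # a low-coverage run begins
        elif d >= threshold and start is not None:
            regions.append((start, pos))     # the run ends before pos
            start = None
    if start is not None:
        regions.append((start, len(depth)))  # run extends to the end
    return regions

reads = [(0, 5), (2, 5), (3, 5), (12, 5)]
depth = coverage_profile(20, reads)
print(max(depth))                             # 3
print(low_coverage_regions(depth, threshold=1))  # [(8, 12), (17, 20)]
```

Regions with zero depth split the mapped assembly into separate contigs, which is the situation the coverage figure depicts.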

De-Novo Assembly

[Figure: reads assembled into Contig 1 through Contig 7; not to scale]

ADVANTAGES: Reference agnostic, assembles all the reads it can into contigs

DISADVANTAGES: Doesn't always get things right (repeat sequences, etc.)
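As a rough illustration of reference-free assembly, the sketch below greedily merges the pair of sequences with the longest suffix-prefix overlap until no overlaps remain. It is a toy greedy overlap approach chosen for brevity, not the algorithm of any particular assembler; `greedy_assemble` and `min_len` are hypothetical names.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a equal to a prefix of b (>= min_len), else 0."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the best-overlapping pair; return the remaining contigs."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)] + [merged]

reads = ["ATGGCG", "GCGTGC", "GTGCA", "TTTTAAAA"]
print(greedy_assemble(reads))  # ['TTTTAAAA', 'ATGGCGTGCA']
```

Because the merge decision depends only on overlap length, a repeated sequence can pull unrelated reads together, which is one way an assembly "doesn't always get things right".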

Whole Genome SNP Typing

[Figure: isolates 1, 2, and 3 aligned to the Reference Sequence/Core Genome, with variable bases highlighted; concatenated SNP strings shown: ACTAGA, ACTAGT, TCTACT]
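Whole genome SNP typing reduces each isolate to a string of bases at the variable positions, and pairwise differences between those strings feed tree-building methods such as neighbor joining. The sketch below computes the pairwise SNP distances for the three strings on the slide; assigning them to isolates 1, 2, and 3 in slide order is an assumption, and `snp_distance` is a hypothetical helper name.

```python
from itertools import combinations

def snp_distance(a, b):
    """Number of positions at which two equal-length SNP strings differ."""
    if len(a) != len(b):
        raise ValueError("SNP strings must have equal length")
    return sum(x != y for x, y in zip(a, b))

# Assumed pairing of slide strings to isolates (by order of appearance).
profiles = {"1": "ACTAGA", "2": "ACTAGT", "3": "TCTACT"}

distances = {
    (p, q): snp_distance(profiles[p], profiles[q])
    for p, q in combinations(profiles, 2)
}
for pair, d in distances.items():
    print(pair, d)
# ('1', '2') 1
# ('1', '3') 3
# ('2', '3') 2
```

A distance matrix of this kind is the usual input to a neighbor-joining tree.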

NJ tree visualization

Turnaround Time

DNA Prep: 0–21 days
Sequencing: 2–3 days
Analysis: Few hours to days

MOLECULAR EPIDEMIOLOGY

Cluster 1

~100 Patients

Cluster 2

Cluster 3

[Figure: cluster diagram with node labels 29, 44, 26, 19, 36, 63, 46, 44 and MIRU-VNTR pattern labels B (six nodes), A, and C]

Spoligotype: 777776777760601

MIRU-VNTR
  A: 224325143323244234423337
  B: 224325153323444234423337
  C: 22432515332343-234422333
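The three MIRU-VNTR patterns differ at only a few loci, which is easier to see programmatically. The sketch below lists the 1-based loci at which two patterns differ. It assumes each character encodes one locus and counts a missing result (`-`) as a difference; both are simplifying assumptions, and `differing_loci` is a hypothetical helper.

```python
patterns = {
    "A": "224325143323244234423337",
    "B": "224325153323444234423337",
    "C": "22432515332343-234422333",
}

def differing_loci(p, q):
    """1-based positions at which two equal-length MIRU-VNTR strings differ."""
    if len(p) != len(q):
        raise ValueError("patterns must have the same number of loci")
    return [i + 1 for i, (x, y) in enumerate(zip(p, q)) if x != y]

print(differing_loci(patterns["A"], patterns["B"]))  # [8, 13]
print(differing_loci(patterns["B"], patterns["C"]))  # [14, 15, 21, 24]
```

Isolates sharing a spoligotype and differing at one or two MIRU-VNTR loci are the cases where conventional genotyping has limited resolution and whole genome SNP comparison adds information.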

DRUG RESISTANCE

Applications of WGS for Drug Resistance

Surveillance

Clinical management

Identify new mechanisms


SNP          Locus
C31998G      58 bp upstream Rv0029
C51403T      Rv0047c
C118832T     Rv0102
C247984T     Rv0207c
T362962C     PE_PGRS5
C477188G     Rv0398c
C480678A     mmpL1
A649974G     ubiE
C761147T     rpoB
G765719C     rpoC
C799139G     Rv0698
G905686A     Rv0811c
C926861A     PE_PGRS13
C1023436G    betP
T1093459G    PE_PGRS17
G1114491A    Rv0997
C1208858T    Rv1084
C1231660T    Rv1104
A1246730C    bpoB
C1266797T    Rv1139c
C1309314T    fdxC
A1320356C    papA3
G1353888A    tagA
G1421085A    Rv1272c
C1663856T    acn
G1674048A    fabG1
T1877958C    pks7
C1888075G    pks9
T2087076C    171 bp upstream of Rv1838c
C2372126T    Rv2112c
G2402463C    Rv2142c
T2614547C    46 bp upstream Rv2339
G2751471A    Rv2449c
G2958534A    Rv2631
G3126489A    Rv2819c
G3137406A    echA16
C3213150T    lepB
A3377940C    PPE46
A3380380C    PPE47
A3416480G    Rv3055
C3455434T    Rv3088
G3608047C    Rv3230c
C3764285T    PPE56
G3765280A    PPE56
C3777772A    spoU
A4026439G    5 bp upstream Rv3585
C4037284A    PE_PGRS59
T4072484A    Rv3633
G4084482C    topA
A4314271G    bfrB
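SNP labels such as C761147T pack the reference base, the genome position, and the variant base into one token. A minimal sketch of parsing that notation and classifying each change as a transition or transversion follows; the function names are hypothetical, and the three example labels are taken from the slide.

```python
import re

SNP_PATTERN = re.compile(r"^([ACGT])(\d+)([ACGT])$")
PURINES = {"A", "G"}

def parse_snp(label):
    """Split a label like 'C761147T' into reference base, position, variant base."""
    match = SNP_PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a SNP label: {label!r}")
    ref, pos, alt = match.groups()
    return {"ref": ref, "pos": int(pos), "alt": alt}

def is_transition(ref, alt):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return (ref in PURINES) == (alt in PURINES)

for label in ["C761147T", "G765719C", "G1674048A"]:
    snp = parse_snp(label)
    kind = "transition" if is_transition(snp["ref"], snp["alt"]) else "transversion"
    print(label, snp["pos"], kind)
# C761147T 761147 transition
# G765719C 765719 transversion
# G1674048A 1674048 transition
```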

Identify New Mechanisms of Resistance

fabG1 mabA inhA

L203L

A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis. Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T.

METAGENOMICS

Starting Material

Sputum Subculture

Dx Culture

AMD Metagenomics Project Overview

[Diagram labels]
  • Clinical specimen
  • Sample extraction/processing
  • High throughput sequencing
  • RNA/DNA content sequence
  • Subtractive host sequence database
  • Comparative pathogen sequence database
  • Identification & characterization
  • Clutter Mitigation
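The role of a subtractive host sequence database can be sketched as a filter that discards reads resembling host sequence before pathogen identification. The toy below uses shared k-mers for that decision; this is an illustrative stand-in, not the project's method, and `subtract_host`, `k`, and `max_host_fraction` are hypothetical names and values.

```python
def kmers(seq, k):
    """Set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def subtract_host(reads, host_sequences, k=8, max_host_fraction=0.5):
    """Keep reads whose fraction of k-mers found in the host index is small."""
    host_index = set()
    for host_seq in host_sequences:
        host_index |= kmers(host_seq, k)

    kept = []
    for read in reads:
        read_kmers = kmers(read, k)
        if not read_kmers:
            continue  # read shorter than k: cannot be classified, drop it
        shared = len(read_kmers & host_index) / len(read_kmers)
        if shared <= max_host_fraction:
            kept.append(read)
    return kept

host = ["ACGTACGTTAGCTAGCTAGGATCCGATCGA"]
reads = ["TAGCTAGCTAGGATCC", "GGGGCCCCGGGGCCCCAAAA"]
print(subtract_host(reads, host))  # ['GGGGCCCCGGGGCCCCAAAA']
```

In practice this step is typically done by aligning reads to a host reference with a read mapper and discarding the reads that map.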

Clinical Specimen Sets

Human whole blood (EDTA), 2 liters
  • Acquired through normal channels

Human sputum, 2 liters
  • Multiple donors
  • Took over 6 months to accumulate

Human stool, 2 liters
  • Obtained from single donor
  • “Genome diet”

Test Sputum for Mtb

Metagenomics

Sputum Background Sample: Raw Read Processing

Read counts (% of total): 35,819,020 (10%); 336,491,752 (89%); 35,886 (0%); 4,844,864 (1%)

Categories: Quality filter failed (Q30 trim-filter); Human mapped (BWA -defaults); TB Mapped (BWA -defaults); Other
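The read counts for the sputum background sample can be turned into fractions of the total with a few lines of arithmetic. The pairing of counts to categories below follows the order in which they appear in the extracted slide text and is an assumption; the slide itself reports rounded whole percentages.

```python
# Assumed pairing of counts to categories (by order of appearance on the slide).
counts = {
    "Quality filter failed": 35_819_020,
    "Human mapped": 336_491_752,
    "TB mapped": 35_886,
    "Other": 4_844_864,
}

total = sum(counts.values())
percentages = {name: 100 * n / total for name, n in counts.items()}

print(f"Total reads: {total:,}")
for name, pct in percentages.items():
    print(f"{name}: {counts[name]:,} ({pct:.3f}%)")
```

Under this pairing, reads mapping to TB make up well under a hundredth of a percent of the total, which illustrates why enrichment or host depletion is attractive before sequencing Mtb directly from sputum.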

MTb gDNA + hu gDNA at different ratios
MTb:huDNA Ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001

MTb gDNA + Sputum gDNA at different ratios
MTb:Sputum Ratio (%): 20, 10, 1, 0.1, 0.01, 0.001

Transposon-Based Library Prep

Sequence Capture

MiSeq Sequencing and Analysis

All mixtures 25 ng µL⁻¹

Project Workflow

Agilent SureSelect

Mtb SureSelect Enrichment

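[Figure: Mtb SureSelect enrichment results. Ratio labels as extracted: 01, 1, 20; 20, 1, 10; 001. Read counts: 2654689, 117325; 10133910, 11929612, 67210; 11052. Coverage labels: (825 45x) (994 77x); (989 251x) (989 270x) (72 3x)]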

Recent Publication

Conclusions

Some infrastructure is required for NGS
  • Laboratory work
  • Bioinformatics
  • IT and data management

Developed SOPs and optimized analysis
  • Surveillance
  • Drug resistance

Work in progress
  • Universal genotyping using WGS
  • Metagenomics
  • Fund at least two state laboratories for WGS

Acknowledgements

Laboratory Branch, Applied Research Team: Lauren Cowan, Melisa Willby, Paige Gupton, Kartee Johnson

Core Facility: Mike Frace, Mili Sheth, Jamie Davis

AMD Metagenomics Team: Chris Hopkins, Eishita Tyagi, Scott Burns

Reference-Guided (Mapped) Assembly

Reference Sequence/Genome

Low sequence coverage

UNMAPPED READS
1. Sequences not present in the reference
2. Plasmids or other extrachromosomal DNA
3. Structural variation/rearrangement

ADVANTAGES: Relatively fast; well-suited to highly conserved genomes
DISADVANTAGES: Issues with high diversity, mobile elements

[Figure: reads mapped along the reference; coverage axis from 1X to 18X; mapped reads form Contig 1 and Contig 2]
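The low-coverage regions in a mapped assembly can be located from per-position read depth. The following is a minimal sketch, not the presenter's pipeline: the function name, the depth values, and the `min_depth` threshold are all illustrative assumptions.

```python
def low_coverage_regions(depths, min_depth=1):
    """Return half-open (start, end) intervals where depth < min_depth.

    `depths` is a list of per-position read depths along the reference
    (hypothetical input; a real pipeline would read these from a depth report).
    """
    regions = []
    start = None
    for pos, depth in enumerate(depths):
        if depth < min_depth:
            if start is None:
                start = pos          # a low-coverage run begins here
        elif start is not None:
            regions.append((start, pos))  # the run ended at the previous position
            start = None
    if start is not None:
        regions.append((start, len(depths)))  # run extends to the end
    return regions


# Toy depth profile: two gaps separate the well-covered stretches.
example_depths = [18, 18, 12, 0, 0, 3, 18, 0]
print(low_coverage_regions(example_depths))  # [(3, 5), (7, 8)]
```

Gaps like these are where a reference-guided assembly breaks into separate contigs.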

De-Novo Assembly

[Figure, not to scale: reads assembled into Contig 1 through Contig 7]

ADVANTAGES: Reference agnostic; assembles all the reads it can into contigs
DISADVANTAGES: Doesn't always get things right (repeat sequences, etc.)

Whole Genome SNP Typing

[Figure: SNP alleles at six positions along the reference sequence/core genome for three isolates, concatenated into one SNP string per isolate]

1  ACTAGA
2  ACTAGT
3  TCTACT

NJ tree visualization
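A neighbor-joining tree is built from a matrix of pairwise distances. As a hedged illustration, the sketch below counts pairwise SNP differences between the three concatenated SNP strings from the typing slide. The function names are illustrative, and the tree-building step itself is not shown.

```python
from itertools import combinations


def snp_distance(a, b):
    """Number of positions at which two equal-length SNP strings differ."""
    if len(a) != len(b):
        raise ValueError("SNP strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


def pairwise_distances(profiles):
    """Map each isolate pair to its SNP distance."""
    return {
        (i, j): snp_distance(profiles[i], profiles[j])
        for i, j in combinations(sorted(profiles), 2)
    }


# SNP strings for isolates 1-3 as shown on the slide.
isolates = {"1": "ACTAGA", "2": "ACTAGT", "3": "TCTACT"}
print(pairwise_distances(isolates))
# {('1', '2'): 1, ('1', '3'): 3, ('2', '3'): 2}
```

Isolates 1 and 2 differ by a single SNP, so a distance-based tree would place them closest together.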

Turnaround Time

DNA Prep: 0–21 days
Sequencing: 2–3 days
Analysis: Few hours to days

MOLECULAR EPIDEMIOLOGY

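Cluster 1

~100 Patients

Cluster 2

Cluster 3

[Figure: tree of cluster isolates with numeric labels 29, 44, 26, 19, 36, 63, 46, 44; isolates marked by MIRU-VNTR pattern B, B, B, B, B, B, A, C]

Spoligotype: 777776777760601
MIRU-VNTR:
A  224325143323244234423337
B  224325153323444234423337
C  22432515332343-234422333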
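The three MIRU-VNTR patterns above are nearly identical, which is easier to see by listing the loci at which they differ. This is a small sketch under two assumptions: each character is one locus (24 loci), and "-" marks a locus with no result, which is skipped. The function name is illustrative.

```python
def differing_loci(pattern_a, pattern_b, missing="-"):
    """Return 1-based locus positions where two MIRU-VNTR patterns differ.

    Loci where either pattern has the `missing` marker are ignored.
    """
    if len(pattern_a) != len(pattern_b):
        raise ValueError("patterns must cover the same number of loci")
    return [
        i
        for i, (a, b) in enumerate(zip(pattern_a, pattern_b), start=1)
        if a != b and missing not in (a, b)
    ]


miru = {
    "A": "224325143323244234423337",
    "B": "224325153323444234423337",
    "C": "22432515332343-234422333",
}
print(differing_loci(miru["A"], miru["B"]))  # [8, 13]
print(differing_loci(miru["B"], miru["C"]))  # [14, 21, 24]
```

Isolates this similar by conventional genotyping are the ones WGS is used to tell apart.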

DRUG RESISTANCE

Applications of WGS for Drug Resistance

Surveillance

Clinical management

Identify new mechanisms

SNP          Gene / location
C31998G      58 bp upstream Rv0029
C51403T      Rv0047c
C118832T     Rv0102
C247984T     Rv0207c
T362962C     PE_PGRS5
C477188G     Rv0398c
C480678A     mmpL1
A649974G     ubiE
C761147T     rpoB
G765719C     rpoC
C799139G     Rv0698
G905686A     Rv0811c
C926861A     PE_PGRS13
C1023436G    betP
T1093459G    PE_PGRS17
G1114491A    Rv0997
C1208858T    Rv1084
C1231660T    Rv1104
A1246730C    bpoB
C1266797T    Rv1139c
C1309314T    fdxC
A1320356C    papA3
G1353888A    tagA
G1421085A    Rv1272c
C1663856T    acn
G1674048A    fabG1
T1877958C    pks7
C1888075G    pks9
T2087076C    171 bp upstream of Rv1838c
C2372126T    Rv2112c
G2402463C    Rv2142c
T2614547C    46 bp upstream Rv2339
G2751471A    Rv2449c
G2958534A    Rv2631
G3126489A    Rv2819c
G3137406A    echA16
C3213150T    lepB
A3377940C    PPE46
A3380380C    PPE47
A3416480G    Rv3055
C3455434T    Rv3088
G3608047C    Rv3230c
C3764285T    PPE56
G3765280A    PPE56
C3777772A    spoU
A4026439G    5 bp upstream Rv3585
C4037284A    PE_PGRS59
T4072484A    Rv3633
G4084482C    topA
A4314271G    bfrB
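Each SNP label above packs a reference base, a genome position, and the observed base into one string (for example, C761147T in rpoB). A minimal parsing sketch follows. The `Snp` type and `parse_snp` function are illustrative names, not part of any tool named in the slides.

```python
import re
from typing import NamedTuple


class Snp(NamedTuple):
    ref: str       # base in the reference
    position: int  # genome coordinate
    alt: str       # base observed in the isolate


_SNP_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_snp(label):
    """Split a label such as 'C761147T' into (ref, position, alt)."""
    match = _SNP_RE.match(label)
    if match is None:
        raise ValueError(f"not a SNP label: {label!r}")
    ref, position, alt = match.groups()
    return Snp(ref, int(position), alt)


# A few labels from the table, paired with their gene annotations.
table = {"G765719C": "rpoC", "C761147T": "rpoB", "G1674048A": "fabG1"}
for label in sorted(table, key=lambda s: parse_snp(s).position):
    print(parse_snp(label), table[label])
```

Sorting by the parsed position rather than by the label string keeps the entries in genome order.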

Identify New Mechanisms of Resistance

fabG1 mabA inhA

L203L

Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T. A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis.

METAGENOMICS

Starting Material

Sputum Subculture

Dx Culture

AMD Metagenomics Project Overview

  • Clinical specimen
  • Sample extraction/processing
  • High throughput sequencing
  • RNA/DNA content sequence
  • Subtractive host sequence database
  • Comparative pathogen sequence database
  • Identification & characterization

Clutter Mitigation

Clinical Specimen Sets

Human whole blood (EDTA), 2 liters
  • Acquired through normal channels

Human sputum, 2 liters
  • Multiple donors
  • Took over 6 months to accumulate

Human stool, 2 liters
  • Obtained from single donor
  • "Genome diet"

Test Sputum for Mtb

Metagenomics

Sputum Background Sample: Raw Read Processing

[Chart: read counts with percent of total]
35819020 (10%)
336491752 (89%)
35886 (0%)
4844864 (1%)

Legend:
  • Quality filter failed (Q30 trim-filter)
  • Human mapped (BWA, defaults)
  • TB mapped (BWA, defaults)
  • Other
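The share of reads in each category can be recomputed from the raw counts. In this sketch, pairing the counts with the legend entries in the order they appear is an assumption, since the extracted slide text does not tie them together explicitly. The first share computes to roughly 9.5%, so the slide's 10% may reflect different rounding.

```python
# Read counts from the slide; the category pairing is assumed from list order.
read_counts = {
    "Quality filter failed": 35_819_020,
    "Human mapped": 336_491_752,
    "TB mapped": 35_886,
    "Other": 4_844_864,
}


def read_fractions(counts):
    """Return each category's share of the total, as a percentage."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


total_reads = sum(read_counts.values())
print(f"total reads: {total_reads}")
for name, pct in read_fractions(read_counts).items():
    print(f"{name}: {pct:.2f}%")
```

Under this pairing, reads mapping to M. tuberculosis are about 0.01% of the total, which motivates the enrichment step that follows in the deck.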

MTb gDNA + hu gDNA at different ratios
MTb:huDNA Ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001

MTb gDNA + Sputum gDNA at different ratios
MTb:Sputum Ratio (%): 20, 10, 1, 0.1, 0.01, 0.001

Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis

All mixtures 25 ng uL-1

Project Workflow

Agilent SureSelect

Mtb SureSelect Enrichment

[Chart: ratio labels 0.1, 1, 20 and 20, 1, 10, 0.01; read counts 2654689, 117325, 10133910, 11929612, 67210, 11052; annotations (82.5, 45x), (99.4, 77x), (98.9, 251x), (98.9, 270x), (72, 3x)]

Recent Publication

Conclusions

Some infrastructure is required for NGS
  • Laboratory work
  • Bioinformatics
  • IT and data management

Developed SOPs and optimized analysis
  • Surveillance
  • Drug resistance

Work in progress
  • Universal genotyping using WGS
  • Metagenomics
  • Fund at least two state laboratories for WGS

Acknowledgements

Laboratory Branch, Applied Research Team: Lauren Cowan, Melisa Willby, Paige Gupton, Kartee Johnson

Core Facility: Mike Frace, Mili Sheth, Jamie Davis

AMD Metagenomics Team: Chris Hopkins, Eishita Tyagi, Scott Burns

  • Next Generation Sequencing Workflow and Applications for Mycobacterium tuberculosis
  • Slide Number 2
  • Next Generation Sequencing (NGS) Workflow
  • Sample Preparation for Whole Genome Sequencing (WGS)
  • Library Preparation for Illumina Platforms
  • Sequence Libraries
  • Slide Number 7
  • Slide Number 8
  • Genome Assembly
  • Bioinformatic Tools
  • Simple Variant Call Pipeline
  • Examples of Commercial Products
  • Reference-Guided (Mapped) Assembly
  • De-Novo Assembly
  • Whole Genome SNP Typing
  • NJ tree visualization
  • Turnaround Time
  • Molecular Epidemiology
  • Cluster 1
  • Slide Number 22
  • Cluster 3
  • Slide Number 24
  • Drug resistance
  • Applications of WGS for Drug Resistance
  • Slide Number 29
  • Slide Number 30
  • Identify New Mechanisms of Resistance
  • Metagenomics
  • Starting Material
  • AMD Metagenomics Project Overview
  • Clinical Specimen Sets
  • Test Sputum for Mtb
  • Metagenomics
  • Project Workflow
  • Agilent SureSelect
  • Mtb SureSelect Enrichment
  • Recent Publication
  • Conclusions
  • Acknowledgements
Page 11: Next Generation Sequencing Workflow and Applications for ...Next Generation Sequencing Workflow and Applications for Mycobacterium tuberculosis Jamie Posey, PhD Applied Research Team

Reference-Guided (Mapped) Assembly

Reference Sequence/Genome

Low sequence coverage

UNMAPPED READS
1. Sequences not present in the reference
2. Plasmids or other extrachromosomal DNA
3. Structural variation/rearrangement

ADVANTAGES: Relatively fast; well-suited to highly conserved genomes
DISADVANTAGES: Issues with high diversity, mobile elements

[Figure: coverage plot along the reference (axis: Coverage, 18X to 1X) showing Contig 1 and Contig 2]
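
The coverage idea on this slide can be illustrated with a minimal Python sketch. It computes per-position depth from the intervals of mapped reads and reports stretches that fall below a threshold. The function names, read coordinates, reference length, and threshold are all made up for illustration; none of them come from the presentation.

```python
def per_base_coverage(ref_len, reads):
    """Return the read depth at each reference position.

    reads: list of (start, end) half-open intervals of mapped reads
    (hypothetical coordinates, 0-based).
    """
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (ref_len + 1)
    for start, end in reads:
        diff[start] += 1
        diff[end] -= 1
    depth, running = [], 0
    for i in range(ref_len):
        running += diff[i]
        depth.append(running)
    return depth


def low_coverage_regions(depth, min_depth):
    """Return half-open (start, end) runs where depth < min_depth."""
    regions, run_start = [], None
    for pos, d in enumerate(depth):
        if d < min_depth and run_start is None:
            run_start = pos
        elif d >= min_depth and run_start is not None:
            regions.append((run_start, pos))
            run_start = None
    if run_start is not None:
        regions.append((run_start, len(depth)))
    return regions


# Hypothetical example: a 20 bp reference and four mapped reads.
reads = [(0, 8), (2, 10), (4, 12), (15, 20)]
depth = per_base_coverage(20, reads)
gaps = low_coverage_regions(depth, min_depth=1)
print(depth)
print(gaps)
```

In a mapped assembly, a zero-depth gap like the one found here is what splits the consensus into separate contigs.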

De-Novo Assembly

[Figure, not to scale: reads assembled into Contig 1 through Contig 7]

ADVANTAGES: Reference agnostic; assembles all the reads it can into contigs
DISADVANTAGES: Doesn't always get things right (repeat sequences, etc.)
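
To make the contig idea concrete, here is a toy greedy overlap assembler in Python. It is only a sketch under simplifying assumptions (error-free reads, one strand, exact suffix/prefix overlaps) and is not the assembler used in the presentation. The reads and the `min_len` threshold are invented for illustration.

```python
def overlap(a, b, min_len):
    """Length of the longest suffix of a that equals a prefix of b.

    Returns 0 when no overlap of at least min_len exists.
    """
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of sequences with the longest overlap."""
    contigs = list(dict.fromkeys(reads))  # drop exact duplicate reads
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left to merge
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)


# Hypothetical reads: three overlap into one contig, one stands alone.
reads = ["ACTAGAT", "AGATTCC", "TTCCGGA", "GGGTTTA"]
contigs = greedy_assemble(reads)
print(contigs)
```

Because the merge step trusts any sufficiently long overlap, two copies of a repeat can be joined incorrectly, which is the kind of failure the DISADVANTAGES line refers to.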

Whole Genome SNP Typing

[Figure: bases at six variable positions in isolates 1, 2, and 3, aligned against the Reference Sequence/Core Genome and concatenated into the SNP strings ACTAGA, ACTAGT, and TCTACT]
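
A short Python sketch of how concatenated SNP strings can be compared. It counts pairwise differences between the three strings on this slide. Assigning the strings to isolates 1, 2, and 3 in the order shown is an assumption, and the function name is hypothetical. A pairwise distance matrix of this kind is a common input to neighbor-joining tree building.

```python
from itertools import combinations

# SNP strings from the slide; the isolate labels are assumed.
snp_strings = {"1": "ACTAGA", "2": "ACTAGT", "3": "TCTACT"}


def snp_distance(a, b):
    """Number of positions at which two equal-length SNP strings differ."""
    if len(a) != len(b):
        raise ValueError("SNP strings must have the same length")
    return sum(x != y for x, y in zip(a, b))


distances = {
    (p, q): snp_distance(snp_strings[p], snp_strings[q])
    for p, q in combinations(sorted(snp_strings), 2)
}

for (p, q), d in distances.items():
    print(f"isolate {p} vs isolate {q}: {d} SNP(s)")
```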

NJ tree visualization

Turnaround Time

DNA Prep: 0–21 days
Sequencing: 2–3 days
Analysis: Few hours to days

MOLECULAR EPIDEMIOLOGY

Cluster 1

~ 100 Patients

Cluster 2

Cluster 3

[Figure: Cluster 3 tree. Node labels: 29, 44, 26, 19, 36, 63, 46, 44. Tip labels: MIRU-VNTR pattern B (six times), A, C]

Spoligotype: 777776777760601

MIRU-VNTR:
A  224325143323244234423337
B  224325153323444234423337
C  22432515332343-234422333
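
The three MIRU-VNTR patterns can be compared locus by locus. The Python sketch below assumes each character is one locus of a 24-locus pattern and that "-" marks a locus with no result. Both assumptions and the function name are illustrative, not taken from the slides.

```python
# MIRU-VNTR patterns A, B, and C as printed on the slide.
patterns = {
    "A": "224325143323244234423337",
    "B": "224325153323444234423337",
    "C": "22432515332343-234422333",
}


def differing_loci(p, q, missing="-"):
    """Return 1-based loci where both patterns have a call and the calls differ."""
    if len(p) != len(q):
        raise ValueError("patterns must have the same number of loci")
    return [
        i
        for i, (x, y) in enumerate(zip(p, q), start=1)
        if missing not in (x, y) and x != y
    ]


for first, second in [("A", "B"), ("B", "C"), ("A", "C")]:
    loci = differing_loci(patterns[first], patterns[second])
    print(f"{first} vs {second}: {len(loci)} differing loci at {loci}")
```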

DRUG RESISTANCE

Applications of WGS for Drug Resistance

Surveillance

Clinical management

Identify new mechanisms

[Figure: Cluster 3 tree, repeated from the Cluster 3 slide, with the same spoligotype and MIRU-VNTR patterns A, B, and C]

C  G  C31998G    58 bp upstream Rv0029
C  T  C51403T    Rv0047c
C  T  C118832T   Rv0102
C  T  C247984T   Rv0207c
T  C  T362962C   PE_PGRS5
C  G  C477188G   Rv0398c
C  A  C480678A   mmpL1
A  G  A649974G   ubiE
C  T  C761147T   rpoB
G  C  G765719C   rpoC
C  G  C799139G   Rv0698
G  A  G905686A   Rv0811c
C  A  C926861A   PE_PGRS13
C  G  C1023436G  betP
T  G  T1093459G  PE_PGRS17
G  A  G1114491A  Rv0997
C  T  C1208858T  Rv1084
C  T  C1231660T  Rv1104
A  C  A1246730C  bpoB
C  T  C1266797T  Rv1139c
C  T  C1309314T  fdxC
A  C  A1320356C  papA3
G  A  G1353888A  tagA
G  A  G1421085A  Rv1272c
C  T  C1663856T  acn
G  A  G1674048A  fabG1
T  C  T1877958C  pks7
C  G  C1888075G  pks9
T  C  T2087076C  171 bp upstream of Rv1838c
C  T  C2372126T  Rv2112c
G  C  G2402463C  Rv2142c
T  C  T2614547C  46 bp upstream Rv2339
G  A  G2751471A  Rv2449c
G  A  G2958534A  Rv2631
G  A  G3126489A  Rv2819c
G  A  G3137406A  echA16
C  T  C3213150T  lepB
A  C  A3377940C  PPE46
A  C  A3380380C  PPE47
A  G  A3416480G  Rv3055
C  T  C3455434T  Rv3088
G  C  G3608047C  Rv3230c
C  T  C3764285T  PPE56
G  A  G3765280A  PPE56
C  A  C3777772A  spoU
A  G  A4026439G  5 bp upstream Rv3585
C  A  C4037284A  PE_PGRS59
T  A  T4072484A  Rv3633
G  C  G4084482C  topA
A  G  A4314271G  bfrB
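
SNP labels such as C761147T are compact and easy to parse. The Python sketch below assumes the usual reading (first base, genome position, second base) and classifies each change as a transition or transversion. That reading, the function names, and the four labels picked from the table are illustrative choices, not part of the presentation.

```python
import re
from collections import Counter

SNP_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")
PURINES = {"A", "G"}


def parse_snp(label):
    """Split a label like 'C761147T' into (first base, position, second base)."""
    match = SNP_RE.match(label)
    if match is None:
        raise ValueError(f"unrecognized SNP label: {label!r}")
    return match.group(1), int(match.group(2)), match.group(3)


def substitution_class(ref, alt):
    """Transition: purine<->purine or pyrimidine<->pyrimidine; else transversion."""
    same_ring_type = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_ring_type else "transversion"


# A few labels taken from the table on this slide.
labels = ["C761147T", "G765719C", "G1674048A", "A4314271G"]
parsed = [parse_snp(label) for label in labels]
counts = Counter(substitution_class(ref, alt) for ref, _, alt in parsed)

for label, (ref, pos, alt) in zip(labels, parsed):
    print(label, pos, substitution_class(ref, alt))
print(dict(counts))
```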

Identify New Mechanisms of Resistance

[Figure: gene diagram labeled fabG1, mabA, inhA, with the mutation L203L marked]

Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T. A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis.

METAGENOMICS

Starting Material

Sputum Subculture

Dx Culture

AMD Metagenomics Project Overview

[Figure: project overview diagram with the following labels]

Clinical specimen
Sample extraction/processing
High throughput sequencing
RNA/DNA content sequence
Clutter Mitigation: subtractive host sequence database
Comparative pathogen sequence database
Identification & characterization
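
The "subtractive host sequence database" step can be sketched in Python as a filter that discards reads resembling host sequence. This toy version uses exact k-mer matching in place of read alignment, which is a swapped-in simplification and not the project's method. The sequences, `k`, and function names are hypothetical.

```python
def kmers(seq, k):
    """Return the set of all substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def subtract_host(reads, host_seqs, k=8, max_shared=0):
    """Keep reads that share at most max_shared k-mers with the host set."""
    host_index = set()
    for seq in host_seqs:
        host_index |= kmers(seq, k)
    kept = []
    for read in reads:
        shared = len(kmers(read, k) & host_index)
        if shared <= max_shared:
            kept.append(read)
    return kept


# Hypothetical host sequence and two reads: the first read is a piece of
# the host sequence, the second is unrelated.
host = ["ACGTACGTTAGCCGATAGGCTA"]
reads = ["CGTTAGCCGATA", "GGGCCCAAATTT"]
non_host = subtract_host(reads, host)
print(non_host)
```

Reads that survive the subtraction are the ones carried forward for comparison against pathogen sequences.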

Clinical Specimen Sets

Human whole blood (EDTA), 2 liters
  • Acquired through normal channels

Human sputum, 2 liters
  • Multiple donors
  • Took over 6 months to accumulate

Human stool, 2 liters
  • Obtained from single donor
  • "Genome diet"

Test Sputum for Mtb

Metagenomics

Sputum Background Sample: Raw Read Processing

Category                                  Reads        Percent
Quality filter failed (Q30 trim-filter)   35819020     10
Human mapped (BWA, defaults)              336491752    89
TB mapped (BWA, defaults)                 35886        0
Other                                     4844864      1

Project Workflow

MTb gDNA + hu gDNA at different ratios
MTb/huDNA ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001

MTb gDNA + sputum gDNA at different ratios
MTb/sputum ratio (%): 20, 10, 1, 0.1, 0.01, 0.001

All mixtures 25 ng uL-1

Transposon-Based Library Prep
Sequence Capture
MiSeq Sequencing and Analysis

Agilent SureSelect

Mtb SureSelect Enrichment

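[Figure: SureSelect enrichment charts by mixture ratio. Axis ratios: 0.1, 1, 20 and 20, 1, 10, plus 0.01. Bar values: 2654689, 117325, 10133910, 11929612, 67210, 11052. Annotations as extracted: (825 45x), (994 77x), (989 251x), (989 270x), (72 3x)]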

Recent Publication

Conclusions

Some infrastructure is required for NGS
  • Laboratory work
  • Bioinformatics
  • IT and data management

Developed SOPs and optimized analysis
  • Surveillance
  • Drug resistance

Work in progress
  • Universal genotyping using WGS
  • Metagenomics
  • Fund at least two state laboratories for WGS

Acknowledgements

Laboratory Branch, Applied Research Team: Lauren Cowan, Melisa Willby, Paige Gupton, Kartee Johnson

Core Facility: Mike Frace, Mili Sheth, Jamie Davis

AMD Metagenomics Team: Chris Hopkins, Eishita Tyagi, Scott Burns

  • Next Generation Sequencing Workflow and Applications for Mycobacterium tuberculosis
  • Slide Number 2
  • Next Generation Sequencing (NGS) Workflow
  • Sample Preparation for Whole Genome Sequencing (WGS)
  • Library Preparation for Illumina Platforms
  • Sequence Libraries
  • Slide Number 7
  • Slide Number 8
  • Genome Assembly
  • Bioinformatic Tools
  • Simple Variant Call Pipeline
  • Examples of Commercial Products
  • Reference-Guided (Mapped) Assembly
  • De-Novo Assembly
  • Whole Genome SNP Typing
  • NJ tree visualization
  • Turnaround Time
  • Molecular Epidemiology
  • Cluster 1
  • Slide Number 22
  • Cluster 3
  • Slide Number 24
  • Drug resistance
  • Applications of WGS for Drug Resistance
  • Slide Number 29
  • Slide Number 30
  • Identify New Mechanisms of Resistance
  • MEtagenomics
  • Starting Material
  • AMD Metagenomics Project Overview
  • Clinical Specimen Sets
  • Test Sputum for Mtb
  • Metagenomics
  • Project Workflow
  • Agilent SureSelect
  • Mtb SureSelect Enrichment
  • Recent Publication
  • Conclusions
  • Acknowledgements
Page 17: Next Generation Sequencing Workflow and Applications for ...Next Generation Sequencing Workflow and Applications for Mycobacterium tuberculosis Jamie Posey, PhD Applied Research Team

Cluster 1

~ 100 Patients

Cluster 2

Cluster 3

[Figure: isolates labelled by MIRU-VNTR pattern (B, B, B, B, B, B, A, C); numeric labels 29, 44, 26, 19, 36, 63, 46, 44]

Spoligotype: 777776777760601

MIRU-VNTR
  A: 224325143323244234423337
  B: 224325153323444234423337
  C: 22432515332343-234422333
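The MIRU-VNTR patterns A, B and C listed above can be compared locus by locus. The following is a minimal Python sketch, not part of the original slides. It assumes each character is the result for one locus and that "-" marks a locus with no result; the function name differing_loci is hypothetical.

```python
# MIRU-VNTR patterns as printed on the slide.
# Assumption: one character per locus, "-" means no result at that locus.
PATTERNS = {
    "A": "224325143323244234423337",
    "B": "224325153323444234423337",
    "C": "22432515332343-234422333",
}


def differing_loci(p, q, missing="-"):
    """Return 1-based positions where two patterns disagree.

    Positions where either pattern has no result are skipped.
    """
    if len(p) != len(q):
        raise ValueError("patterns must have the same number of loci")
    return [
        i
        for i, (a, b) in enumerate(zip(p, q), start=1)
        if a != b and missing not in (a, b)
    ]


for x, y in [("A", "B"), ("A", "C"), ("B", "C")]:
    print(x, y, differing_loci(PATTERNS[x], PATTERNS[y]))
```

Under these assumptions, A and B differ at two loci, while C differs from B at three loci and has one locus without a result.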

DRUG RESISTANCE

Applications of WGS for Drug Resistance

Surveillance

Clinical management

Identify new mechanisms

From  To  SNP         Gene or location
C     G   C31998G     58 bp upstream Rv0029
C     T   C51403T     Rv0047c
C     T   C118832T    Rv0102
C     T   C247984T    Rv0207c
T     C   T362962C    PE_PGRS5
C     G   C477188G    Rv0398c
C     A   C480678A    mmpL1
A     G   A649974G    ubiE
C     T   C761147T    rpoB
G     C   G765719C    rpoC
C     G   C799139G    Rv0698
G     A   G905686A    Rv0811c
C     A   C926861A    PE_PGRS13
C     G   C1023436G   betP
T     G   T1093459G   PE_PGRS17
G     A   G1114491A   Rv0997
C     T   C1208858T   Rv1084
C     T   C1231660T   Rv1104
A     C   A1246730C   bpoB
C     T   C1266797T   Rv1139c
C     T   C1309314T   fdxC
A     C   A1320356C   papA3
G     A   G1353888A   tagA
G     A   G1421085A   Rv1272c
C     T   C1663856T   acn
G     A   G1674048A   fabG1
T     C   T1877958C   pks7
C     G   C1888075G   pks9
T     C   T2087076C   171 bp upstream of Rv1838c
C     T   C2372126T   Rv2112c
G     C   G2402463C   Rv2142c
T     C   T2614547C   46 bp upstream Rv2339
G     A   G2751471A   Rv2449c
G     A   G2958534A   Rv2631
G     A   G3126489A   Rv2819c
G     A   G3137406A   echA16
C     T   C3213150T   lepB
A     C   A3377940C   PPE46
A     C   A3380380C   PPE47
A     G   A3416480G   Rv3055
C     T   C3455434T   Rv3088
G     C   G3608047C   Rv3230c
C     T   C3764285T   PPE56
G     A   G3765280A   PPE56
C     A   C3777772A   spoU
A     G   A4026439G   5 bp upstream Rv3585
C     A   C4037284A   PE_PGRS59
T     A   T4072484A   Rv3633
G     C   G4084482C   topA
A     G   A4314271G   bfrB
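The SNP labels above combine a base, a genome position and a second base (for example C761147T). The following is a minimal Python sketch, not from the slides, that splits such a label and classifies the change as a transition or a transversion. It assumes the first letter is the reference base and the last letter is the variant base; parse_snp and is_transition are hypothetical helper names.

```python
import re

# One base, a run of digits (genome position), one base.
SNP_RE = re.compile(r"^([ACGT])(\d+)([ACGT])$")


def parse_snp(label):
    """Split a label such as 'C761147T' into (first base, position, second base)."""
    m = SNP_RE.match(label)
    if m is None:
        raise ValueError(f"not a SNP label: {label!r}")
    ref, pos, alt = m.groups()
    return ref, int(pos), alt


def is_transition(ref, alt):
    """True for purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T) changes."""
    return {ref, alt} in ({"A", "G"}, {"C", "T"})


for label in ["C761147T", "G1674048A", "C31998G"]:
    ref, pos, alt = parse_snp(label)
    kind = "transition" if is_transition(ref, alt) else "transversion"
    print(label, ref, pos, alt, kind)
```

Sorting the parsed tuples by position gives the genome order used in the listing.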

Identify New Mechanisms of Resistance

fabG1 mabA inhA

L203L

A silent mutation in mabA confers isoniazid resistance on Mycobacterium tuberculosis. Ando H, Miyoshi-Akiyama T, Watanabe S, Kirikae T.

METAGENOMICS

Starting Material

Sputum Subculture

Dx Culture

AMD Metagenomics Project Overview

Clinical specimen

Sample extraction/processing

High throughput sequencing

RNA/DNA content sequence

Subtractive host sequence database

Comparative pathogen sequence database

Identification & characterization

Clutter Mitigation

Clinical Specimen Sets

Human whole blood (EDTA), 2 liters
  • Acquired through normal channels

Human sputum, 2 liters
  • Multiple donors
  • Took over 6 months to accumulate

Human stool, 2 liters
  • Obtained from single donor
  • "Genome diet"

Test Sputum for Mtb

Metagenomics

Sputum Background Sample: Raw Read Processing

Category                                   Reads        Share
Quality filter failed (Q30 trim-filter)    35819020     10%
Human mapped (BWA -defaults)               336491752    89%
TB Mapped (BWA -defaults)                  35886        0%
Other                                      4844864      1%
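The four read counts on this slide can be converted into shares of the total. The following is a minimal Python sketch, not from the slides. Pairing each count with a category is an assumption based on the order of the chart legend, and read_shares is a hypothetical helper name.

```python
# Read counts as they appear on the slide.
# Assumption: counts are paired with categories in legend order.
READ_COUNTS = {
    "Quality filter failed": 35_819_020,
    "Human mapped": 336_491_752,
    "TB mapped": 35_886,
    "Other": 4_844_864,
}


def read_shares(counts):
    """Return each category's percentage of the total read count."""
    total = sum(counts.values())
    return {name: 100.0 * n / total for name, n in counts.items()}


shares = read_shares(READ_COUNTS)
for name, pct in shares.items():
    print(f"{name:<22} {READ_COUNTS[name]:>12,d} {pct:8.3f}%")
```

Under this pairing, reads mapped to TB are roughly 0.01% of the total, which is consistent with the rounded 0 shown on the slide.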

MTb gDNA + hu gDNA at different ratios
MTb:huDNA Ratio (%): 20, 1, 0.1, 0.01, 0.001, 0.0001

MTb gDNA + Sputum gDNA at different ratios
MTb:Sputum Ratio (%): 20, 10, 1, 0.1, 0.01, 0.001

Transposon-Based Library Prep

Sequence Capture

MiSeq Sequencing and Analysis

All mixtures 25 ng uL-1

Project Workflow

Agilent SureSelect

- (

Mtb SureSelect Enrichment

01 1 20

20 1 10

2654689 117325

10133910 11929612 67210

001

11052

(825 45x) (994 77x)

(989 251x) (989 270x) (72 3x)

Recent Publication

Conclusions

Some infrastructure is required for NGS Laboratory work Bioinformatics IT and data management

Developed SOPs and optimized analysis

Surveillance Drug resistance

Work in progress

Universal genotyping using WGS Metagenomics Fund at least two state laboratories for WGS

Acknowledgements

Laboratory Branch Applied Research Team Lauren Cowan Melisa Willby Paige Gupton Kartee Johnson

Core Facility

Mike Frace Mili Sheth Jamie Davis

AMD Metagenomics Team

Chris Hopkins Eishita Tyagi Scott Burns
