
Whole genome sequencing for outbreak analysis and pathogen typing

Challenges and Opportunities

Alan Tsang

Scientific Officer (Medical)

Microbiology Division, PHLSB

23 Dec 2019

Agenda

• Overview of typing

• WGS-based typing

• Examples

• Challenges and advantages of WGS

Typing

• Allows differentiation of microbes beyond the species and subspecies level

– To relate individual cases to an outbreak of infectious disease

– To establish an association between an outbreak of food poisoning and a specific food vehicle

– To trace the source of contaminants within a manufacturing process

Typing

• Phenotypic

– Characterization of bacteria based on expressed traits

• Serotyping

• Genotyping

– Characterization of bacteria based on genetic content

• Pulsed-field gel electrophoresis (PFGE)

• Multi-locus sequence typing (MLST)

• Variable-number tandem repeat (VNTR) typing

Drawbacks

• Low resolution

– Only rough idea of relationship between isolates

• Labour intensive

– Lots of tedious lab work

• Relatively expensive

– In time and consumables

3 years ago…

Systems Comparison

15 Gb output

• For E. coli, ~5 Mb

• 80x coverage depth

• ~0.4 Gb

• ~3% of a MiSeq run
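The arithmetic behind these figures can be sketched as follows. This is a minimal illustration only: the function and variable names are assumptions made for the example, and the 15 Gb value is read from the slide as the output of one MiSeq run.

```python
def required_output_gb(genome_size_mb: float, coverage_depth: float) -> float:
    """Sequence data needed for one genome, in Gb (genome size x coverage depth)."""
    return genome_size_mb * coverage_depth / 1000.0  # convert Mb to Gb


def fraction_of_run(required_gb: float, run_output_gb: float) -> float:
    """Share of one sequencing run consumed by a single genome."""
    return required_gb / run_output_gb


# Figures from the slide: ~5 Mb E. coli genome, 80x depth, 15 Gb run output.
ecoli_gb = required_output_gb(genome_size_mb=5, coverage_depth=80)
share = fraction_of_run(ecoli_gb, run_output_gb=15)
print(f"{ecoli_gb:.1f} Gb per isolate, {share:.1%} of the run")
```

On these numbers the share works out to just under 3%, which matches the rounded figure on the slide. It ignores uneven pooling and reads lost to quality filtering.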

Whole genome sequencing workflow

Kwong JC et al. Whole genome sequencing in clinical and public health microbiology. Pathology. 2015

Next-Gen Sequencing Library preparation

How does Illumina sequencing work

Better libraries, better runs, better data

Basic genome informatics

• Millions of DNA sequences

– Reads

• Typically 50-300 bp each

• Includes quality information

• File size ~ 1 gigabyte
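As a rough illustration of "reads with quality information", the sketch below decodes the quality line of a single read. It assumes the reads are stored as FASTQ with Phred+33 encoding, which the slide does not state, and the record itself is invented for the example.

```python
def phred33_scores(quality_line: str) -> list[int]:
    """Decode a Phred+33 quality string into per-base quality scores."""
    return [ord(ch) - 33 for ch in quality_line]


def error_probability(q: int) -> float:
    """Phred definition: Q = -10 * log10(P_error)."""
    return 10 ** (-q / 10)


# Invented four-line FASTQ record: identifier, bases, separator, qualities.
record = ["@read_1", "GATTACA", "+", "IIIIH#5"]
name, bases, _, quals = record

scores = phred33_scores(quals)
for base, q in zip(bases, scores):
    # One row per base: the call, its quality score, and the implied error rate.
    print(base, q, f"{error_probability(q):.4f}")
```

Each base carries its own score, so downstream tools can trim or ignore low-confidence calls before typing.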

Two main approaches

• Gene-by-gene comparisons

• Single Nucleotide Polymorphism (SNP) analysis

Gene-by-gene comparisons

• Compare at the gene level

• Multi-Locus Sequence Typing (cgMLST/wgMLST)

• Can be standardized between laboratories

• Databases:

• Ridom SeqSphere+ (Commercial Software)

• BIGSdb

cgMLST database

Genomes and Loci

[Figure: Strains 1 to 6 drawn as rows of loci, L1 to L2850. L1, L3, L4, L5 and L2850 appear in every row; L2, L6, L7 and L8 are each missing from at least one row.]

cgMLST

[Figure: Strains 1 to 6 shown with only the loci that appear in every row: L1, L3, L4, L5, ….., L2850.]

Genomes and Loci

          L1  L3  L4  L5  …..  L2850
Strain 1  1   1   1   1   …..  1
Strain 2  1   1   1   1   …..  1
Strain 3  2   2   1   1   …..  2
Strain 4  3   3   2   2   …..  3
Strain 5  2   1   1   1   …..  2
Strain 6  1   1   1   1   …..  1
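A gene-by-gene comparison reduces each strain to a profile of allele numbers and then counts the loci at which two profiles differ. The sketch below applies that idea to the allele numbers on this slide. It is an illustration under stated assumptions: only the five loci actually shown (L1, L3, L4, L5, L2850) are used, the elided loci are left out, and the function name is hypothetical, so the distances are not real cgMLST distances.

```python
from itertools import combinations

# Loci and allele numbers as shown on the slide; elided loci are omitted.
LOCI = ["L1", "L3", "L4", "L5", "L2850"]
PROFILES = {
    "Strain 1": [1, 1, 1, 1, 1],
    "Strain 2": [1, 1, 1, 1, 1],
    "Strain 3": [2, 2, 1, 1, 2],
    "Strain 4": [3, 3, 2, 2, 3],
    "Strain 5": [2, 1, 1, 1, 2],
    "Strain 6": [1, 1, 1, 1, 1],
}


def allele_distance(a: list[int], b: list[int]) -> int:
    """Number of loci at which two allele profiles carry different alleles."""
    return sum(x != y for x, y in zip(a, b))


# Pairwise distances, keyed by (earlier strain, later strain).
distances = {
    (s, t): allele_distance(PROFILES[s], PROFILES[t])
    for s, t in combinations(PROFILES, 2)
}

for (s, t), d in distances.items():
    print(f"{s} vs {t}: {d} of {len(LOCI)} shown loci differ")
```

Because only allele numbers are compared, two laboratories using the same scheme and allele database should get the same distances, which is the standardization advantage noted earlier.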

Single Nucleotide Polymorphism (SNP) analysis

• Provides even higher resolution than cgMLST

• A SNP is a difference between DNA sequences in the identity of a single nucleotide (an A, T, G, or C)

• Has the advantage of including intergenic regions

Read mapping

What a SNP looks like

[Figure: read mapping against the Reference, with a SNP (A=>G) marked]

SNP-based typing

Ref      GGCAGCAGTGTCTTGCCCGATTGCAGGATGAGTTACCAGCCACAGAATT
Strain A GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain B GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain C GGCAGCAGTGTCATGCCCGATTGCAGGATGAGTTACCAGCCACAGAATT
Strain D GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain E GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain F GCCACCAGAGTCTTACCGGATAGCAGCATGAGATACCTGCCACACAATT

SNP-based typing

Ref      GGCAGCAGTGTCTTGCCCGATTGCAGGATGAGTTACCAGCCACAGAATT
Strain A GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain B GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain C GGCAGCAGTGTCATGCCCGATTGCAGGATGAGTTACCAGCCACAGAATT
Strain D GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain E GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT
Strain F GCCACCAGAGTCTTACCGGATAGCAGCATGAGATACCTGCCACACAATT

SNP matrix

     A    B    C    D    E
A
B    0
C    1    1
D    0    0    1
E    0    0    1    0
F    12   12   11   12   12

Phylogenetic tree

[Figure: tree with tips A, B, D, E, C and F; scale bar: 1 SNP]

Concatenated SNPs from the SNP matrix are used to construct a phylogenetic tree
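The SNP matrix can be reproduced from the aligned sequences by counting mismatched positions for every pair of strains. The sketch below is a minimal illustration with hypothetical names; it assumes the sequences are already aligned and of equal length, and it skips everything a real pipeline does beforehand (read mapping, variant calling, filtering).

```python
# Aligned sequences copied from the slide. Strains A, B, D and E are identical.
SEQ_ABDE = "GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT"
SEQS = {
    "A": SEQ_ABDE,
    "B": SEQ_ABDE,
    "C": "GGCAGCAGTGTCATGCCCGATTGCAGGATGAGTTACCAGCCACAGAATT",
    "D": SEQ_ABDE,
    "E": SEQ_ABDE,
    "F": "GCCACCAGAGTCTTACCGGATAGCAGCATGAGATACCTGCCACACAATT",
}


def snp_distance(a: str, b: str) -> int:
    """Count positions at which two aligned sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


# Print the lower triangle, one row per strain, in the layout used on the slide.
names = list(SEQS)
for i, row in enumerate(names):
    cells = [str(snp_distance(SEQS[row], SEQS[col])) for col in names[:i]]
    print(row, *cells)
```

The last row comes out as `F 12 12 11 12 12`, matching the slide: F is distant from every other strain, while A, B, D and E are indistinguishable and C sits one SNP away from them.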

SNP-based typing

Ref      GGTTGCTGGTAG
Strain A GGTAGCTCGTAG
Strain B GGTAGCTCGTAG
Strain C GGTAGCTGGTAG
Strain D GGTAGCTCGTAG
Strain E GGTAGCTCGTAG
Strain F CCATAGAGCATC

SNP matrix

     A    B    C    D    E
A
B    0
C    1    1
D    0    0    1
E    0    0    1    0
F    12   12   11   12   12

Phylogenetic tree

[Figure: tree with tips A, B, D, E, C and F; scale bar: 1 SNP]

Concatenated SNPs from the SNP matrix are used to construct a phylogenetic tree
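The 12-base strings on this slide are the columns of the earlier 49-base alignment in which at least one sequence differs from the others. A minimal sketch of that extraction step follows; the function names are hypothetical and the input is the alignment from the preceding slides, reference included.

```python
# Full alignment from the preceding slides. Strains A, B, D and E are identical.
SEQ_ABDE = "GGCAGCAGTGTCATGCCCGATTCCAGGATGAGTTACCAGCCACAGAATT"
ALIGNMENT = {
    "Ref": "GGCAGCAGTGTCTTGCCCGATTGCAGGATGAGTTACCAGCCACAGAATT",
    "Strain A": SEQ_ABDE,
    "Strain B": SEQ_ABDE,
    "Strain C": "GGCAGCAGTGTCATGCCCGATTGCAGGATGAGTTACCAGCCACAGAATT",
    "Strain D": SEQ_ABDE,
    "Strain E": SEQ_ABDE,
    "Strain F": "GCCACCAGAGTCTTACCGGATAGCAGCATGAGATACCTGCCACACAATT",
}


def variable_columns(alignment: dict[str, str]) -> list[int]:
    """Indices of alignment columns that contain more than one base."""
    seqs = list(alignment.values())
    return [i for i, column in enumerate(zip(*seqs)) if len(set(column)) > 1]


def concatenate_snps(alignment: dict[str, str]) -> dict[str, str]:
    """Keep only the variable columns of each sequence, in original order."""
    cols = variable_columns(alignment)
    return {name: "".join(seq[i] for i in cols) for name, seq in alignment.items()}


snps = concatenate_snps(ALIGNMENT)
for name, seq in snps.items():
    print(f"{name:<9}{seq}")
```

Dropping the invariant columns does not change any pairwise SNP count, which is why tree-building can work from the much shorter concatenated strings.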

Example – outbreak investigation

• In 2019, a cluster of Candida auris colonization occurred in a public hospital in Hong Kong and affected 15 patients over a period of approximately one month. This occurrence marked the first ever detection of C. auris in Hong Kong.

• Whole-genome sequencing for the isolates was performed as part of the outbreak investigation.

Major clades of Candida auris

Strains were:

• Very different across clades

• Highly related within clade

SNP numbers will vary…

using SNP calling pipeline A

using SNP calling pipeline B

SNP analysis

• Many academic researchers have developed pipelines for similar analysis, some of which are publicly available

– outputs vary

• Many variables affect the number of measured SNPs between isolates

– tools employed

– SNP-calling filters / parameters

– species (nucleotide mutation rates vary between pathogens)

– reference sequence

– number and diversity of isolates analyzed

– time between samples

• Interpret genomic data in parallel with local epidemiological data

• No SNP database or nomenclature is available
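To make the point about SNP-calling filters concrete, the toy sketch below applies two filter settings to the same candidate sites. Everything in it is an assumption made for illustration: the candidate sites are invented, the two thresholds (read depth and fraction of reads supporting the variant) are only examples of typical filter parameters, and no real pipeline is being reproduced.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateSite:
    position: int        # coordinate on the reference
    depth: int           # reads covering the site
    alt_fraction: float  # fraction of reads supporting the variant base


# Invented values for illustration only; not taken from any real isolate.
CANDIDATES = [
    CandidateSite(1201, 45, 0.98),
    CandidateSite(8744, 12, 0.92),
    CandidateSite(15630, 60, 0.71),
    CandidateSite(20981, 8, 1.00),
    CandidateSite(31007, 33, 0.95),
]


def call_snps(sites, min_depth: int, min_alt_fraction: float) -> list[int]:
    """Return positions that pass both the depth and the allele-fraction filter."""
    return [
        s.position
        for s in sites
        if s.depth >= min_depth and s.alt_fraction >= min_alt_fraction
    ]


lenient = call_snps(CANDIDATES, min_depth=5, min_alt_fraction=0.7)
strict = call_snps(CANDIDATES, min_depth=20, min_alt_fraction=0.9)
print(len(lenient), "SNPs with lenient filters;", len(strict), "with strict filters")
```

The same reads yield different SNP counts under the two settings, so a SNP threshold for "same outbreak" only makes sense alongside the pipeline and parameters that produced it.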

Schürch AC et al. Clin Microbiol Infect. 2018

Hatherell HA et al. BMC Med. 2016

Example – serovar prediction

• Traditional serology and the Kauffmann-White Scheme (KWS) have been the gold standard for Salmonella serotyping

– maintained by the World Health Organization (WHO) Collaborating Centre for Reference and Research on Salmonella, located at the Pasteur Institute in Paris, France

– The current (9th) edition, issued in 2007, comprises antigenic variants that had been validated as of January 1, 2007

• Evaluate the potential use of WGS to serve as a method for the routine serotyping of Salmonella isolates

Salmonella Serotyping Using WGS

Strain  Traditional Serotyping  Tool A            Tool B v1    Tool B v2
1       Derby                   Derby             N/A          Derby
2       Bovismorbificans        Bovismorbificans  N/A          Bovismorbificans
3       Wandsworth              Wandsworth        Wandsworth   N/A
4       Typhimurium             I 4,[5],12:i:-    Typhimurium  Typhimurium
5       Chailey                 Breda             Chailey      Chailey
6       Virchow                 Virchow           N/A          Virchow
7       Urbana                  Johannesburg      N/A          Urbana
8       Crewe                   Crewe|Poitiers    N/A          Crewe
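The table can be summarised by counting, for each tool, how many predictions match traditional serotyping exactly. The sketch below does only that. It treats "N/A" as no prediction reported, which is an assumption because the slide does not define it. It also counts anything that is not an exact string match as "other", including the ambiguous call "Crewe|Poitiers" and the antigenic formula "I 4,[5],12:i:-". The function and variable names are hypothetical.

```python
TOOLS = ["Tool A", "Tool B v1", "Tool B v2"]

# (traditional serotype, [prediction per tool]) copied from the table.
ROWS = [
    ("Derby", ["Derby", "N/A", "Derby"]),
    ("Bovismorbificans", ["Bovismorbificans", "N/A", "Bovismorbificans"]),
    ("Wandsworth", ["Wandsworth", "Wandsworth", "N/A"]),
    ("Typhimurium", ["I 4,[5],12:i:-", "Typhimurium", "Typhimurium"]),
    ("Chailey", ["Breda", "Chailey", "Chailey"]),
    ("Virchow", ["Virchow", "N/A", "Virchow"]),
    ("Urbana", ["Johannesburg", "N/A", "Urbana"]),
    ("Crewe", ["Crewe|Poitiers", "N/A", "Crewe"]),
]


def summarise(rows, tools):
    """Count exact matches, missing calls and other results for each tool."""
    summary = {tool: {"match": 0, "no_call": 0, "other": 0} for tool in tools}
    for traditional, predictions in rows:
        for tool, predicted in zip(tools, predictions):
            if predicted == "N/A":
                key = "no_call"   # assumption: N/A means no prediction reported
            elif predicted == traditional:
                key = "match"
            else:
                key = "other"     # different, ambiguous or formula-only result
            summary[tool][key] += 1
    return summary


summary = summarise(ROWS, TOOLS)
for tool, counts in summary.items():
    print(tool, counts)
```

Exact string matching is deliberately strict. A laboratory might reasonably score a monophasic Typhimurium formula or a two-candidate call differently, which would change the counts for Tool A.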

New edition of the scheme - 2020

Challenges

• Different pipelines

– different results

• Different versions of same pipeline

– different results

Drawbacks

• Interpretation of WGS data

• A set of standardized tools and guidelines is not defined yet

• Cost?

• Data storage

– WGS generates large amounts of data

– requires both physical space and virtual space

• Internet connection/speed

– The large amounts of data generated by WGS need to be transferred through the Internet to be available and of benefit to the global community

Benefits of WGS

• Performance

– a far superior resolution

– provides more information on pathogens

• Ease of sharing

– can be easily exchanged electronically around the globe

– can be stored in repositories (e.g. NCBI, EBI)

– the genomic data can be reanalyzed locally at any time

– local pathogens can easily be compared with other sequences in publicly available international databases, allowing the local outbreak to be interpreted in an international context

• Universality

– universal across all pathogens

– no species-specific primer required

– no species-specific enzyme required

Thank you

For Your Attention