Post on 18-Jan-2017


Page 1: Interpreting genomic variation and phylogenetic trees to understand disease transmission (ASM Microbe 2016 Workshop)

INTERPRETING GENOMIC VARIATION AND PHYLOGENETIC TREES TO UNDERSTAND DISEASE TRANSMISSION

Jennifer Gardy
Canada Research Chair in Public Health Genomics
University of British Columbia and BC Centre for Disease Control
@jennifergardy


http://www.slideshare.net/jennifergardy

TOPICS TO BE COVERED

• A case study from my own research

• The importance of high-quality WGS data

• Building a phylogeny 101

• Inferring transmission: manually

• Inferring transmission: with math


Part 1: A case study from my own research


BCCDC is responsible for communicable disease diagnosis, surveillance, epidemiology, and prevention in British Columbia, Canada.

BC has about 250 TB cases per year. ~30% of these are part of outbreaks.

By studying outbreaks to UNDERSTAND TB TRANSMISSION we can design & deliver better interventions and end outbreaks quickly.


SURVEILLANCE IDENTIFIES TB CASES


MOLECULAR EPIDEMIOLOGY IDENTIFIES POTENTIALLY RELATED CASES

MOLECULAR TYPING OF M. TUBERCULOSIS

• SPOLIGOTYPING
  • 43 oligonucleotide spacers between conserved direct repeats
  • Hybridisation assay: is spacer present or not? Binary 0 or 1
  • 43-digit binary string converted to 15-digit string using octal transformation

• IS6110-RFLP
  • Restriction enzyme digest followed by electrophoresis
  • Probe these ladders for IS6110 insertion element
  • Final pattern is just the bands with IS6110

• MIRU-VNTR
  • PCR amplification of 12-24 MIRU (Mycobacterial Interspersed Repetitive Unit) VNTR regions
  • Size of amplified product indicates number of repeats
  • Final fingerprint is a 12 or 24-digit number
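The slide does not spell out the octal transformation used in spoligotyping. A minimal sketch, assuming the usual convention in which the first 42 spacers are read as 14 triplets (one octal digit each) and the 43rd spacer is appended as a single 0 or 1; the function name is made up for illustration:

```python
def spoligotype_to_octal(binary: str) -> str:
    """Convert a 43-character binary spoligotype into a 15-digit octal code."""
    if len(binary) != 43 or set(binary) - {"0", "1"}:
        raise ValueError("expected exactly 43 characters, each 0 or 1")
    # Spacers 1-42: 14 triplets, each read as one octal digit (0-7).
    digits = [str(int(binary[i:i + 3], 2)) for i in range(0, 42, 3)]
    # Spacer 43 is left over and is appended unchanged.
    digits.append(binary[42])
    return "".join(digits)


# Two made-up hybridisation patterns.
all_present = "1" * 43
first_34_absent = "0" * 34 + "1" * 9
print(spoligotype_to_octal(all_present))      # 777777777777771
print(spoligotype_to_octal(first_34_absent))  # 000000000003771
```

The 15-digit string is easier to compare and store than the raw 43-digit pattern, but it carries exactly the same information.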


CONTACT TRACING SUGGESTS TRANSMISSIONS

LIMITATIONS OF CURRENT METHODS

• Genotyping methods only tell you a cluster of cases exists, not the order/direction of transmission

• Size/membership of the cluster varies with the molecular typing method(s) used

• Epidemiological investigation is required to derive the links between cases, and may not be available or of sufficient quality


ge·no·mic ep·i·de·mi·ol·o·gy (jēˈnōmik ˌepiˌdēmēˈäləjē/) n. reading whole genome sequences from outbreak isolates to track person-to-person spread of an infectious disease.

[Figure, built up over four slides: an ancestral sequence AAAAAA is passed from host to host and picks up single-nucleotide changes along the way, giving variants such as AACAAA, GACAAA, AAAATA, AACTAA and AACAAG.]

TELEPHONE

[Image: art by DeviantArt user scummy]

TB LABORATORY INVESTIGATION

• Multiple reports of suspected false-positive TB diagnoses, suspected errors in processing on four occasions

• Typing showed 11 isolates belonging to four MIRU-VNTR clusters, but MIRU patterns were associated with large outbreaks

• Were these truly due to a lab error (most likely) or were some/all true positives and part of the outbreaks (less likely, but not impossible)?

• Hypothesis: if lab error, all isolates involved in splashover should be 100% identical after WGS


1. Sequenced all isolates on the MiSeq

2. Aligned against MTB H37Rv reference genome

3. Identified high-quality variants

4. Compared all genomes to each other at only the variant positions
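Step 4 amounts to counting, for every pair of isolates, the variant positions at which they differ. A minimal sketch, assuming each isolate has already been reduced to a string of bases at the shared variant positions; the isolate names and sequences are invented for illustration and are not the study data:

```python
from itertools import combinations


def snv_distance(seq_a: str, seq_b: str) -> int:
    """Number of variant positions at which two isolates differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must cover the same variant positions")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def pairwise_distances(variants):
    """Map every pair of isolate names to its SNV distance."""
    return {
        (x, y): snv_distance(variants[x], variants[y])
        for x, y in combinations(sorted(variants), 2)
    }


# Hypothetical isolates: two from one suspected lab event, one unrelated case.
variants = {
    "event1_sampleA": "ACGCTT",
    "event1_sampleB": "ACGCTT",
    "outbreak_case": "ACGCTA",
}
for pair, distance in pairwise_distances(variants).items():
    print(pair, distance)
```

Under the lab-error hypothesis, pairs from the same processing event should come out at a distance of 0.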


0 variants between isolates in each of the 4 contamination events supports the hypothesis that a spillover occurred.

Part 2: The importance of high-quality data


Garbage in, garbage out


SEQUENCING CONSIDERATIONS

• What platform should I use?
  • Sequencing chemistry? Sequencer model?

• How much can I multiplex?
  • Need at least 30x, ideally 50x, we aim for 100x

• Include 1+ control non-outbreak samples, especially when using an external sequencing service

• Do I have nucleic acid from all of my isolates?
  • Am I sequencing from culture or from specimen?
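The multiplexing question is mostly arithmetic: mean depth is roughly total sequenced bases divided by genome size. A rough sketch, assuming an M. tuberculosis genome of about 4.4 Mb (the H37Rv reference length), perfectly even pooling, and a made-up run yield of 5 Gb; real runs lose reads to QC, contamination and uneven pooling, so treat the result as an upper bound:

```python
import math

H37RV_GENOME_BP = 4_411_532  # approximate length of the H37Rv reference genome


def max_samples_per_run(run_yield_bases: float, target_depth: float,
                        genome_bp: int = H37RV_GENOME_BP) -> int:
    """Largest number of evenly pooled samples that still reach target_depth."""
    bases_per_sample = target_depth * genome_bp
    return math.floor(run_yield_bases / bases_per_sample)


run_yield = 5e9  # hypothetical usable yield of one run, in bases
for depth in (30, 50, 100):
    print(f"{depth}x: up to {max_samples_per_run(run_yield, depth)} samples")
```

Leaving room for the non-outbreak control samples lowers the number of outbreak isolates per run accordingly.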

BIOINFORMATICS ADVICE

If you know your bug inside out and are familiar with stringing various command-line software packages together into an analytical pipeline, go for it. If at least one of these is not true, DO NOT GO FOR IT! Use a pipeline tuned to your bug.


The DIY method

MY USUAL PIPELINE

• Read QC with FastQC

• Map against reference with BWA-MEM

• Call SNVs with samtools mpileup

• Output a VCF file with SNVs only - no indels

• Remove all SNVs in repetitive regions using bedtools subtract

• Custom Python script to filter out SNVs common to all sequenced isolates and format remainder as a table

• High coverage dataset makes SNV calling based on qual score thresholds easy - examine scores in context

• Manually inspect each SNV using a BAM viewer tool
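The custom filtering script itself is not shown in the slides. A hedged sketch of what such a step could look like, assuming the SNV calls have already been parsed into a dictionary of isolate -> {reference position: alternate base}; the function names and the toy calls are illustrative only:

```python
def informative_positions(calls):
    """Positions where the isolates do not all carry the same alternate base."""
    isolates = sorted(calls)
    positions = sorted({pos for snvs in calls.values() for pos in snvs})
    keep = []
    for pos in positions:
        # None stands for "no SNV called here", i.e. the reference base.
        alleles = {calls[iso].get(pos) for iso in isolates}
        if len(alleles) > 1:
            keep.append(pos)
    return keep


def as_table(calls, positions):
    """Tab-separated table: one row per position, one column per isolate."""
    isolates = sorted(calls)
    lines = ["position\t" + "\t".join(isolates)]
    for pos in positions:
        row = [calls[iso].get(pos, ".") for iso in isolates]
        lines.append(f"{pos}\t" + "\t".join(row))
    return "\n".join(lines)


# Toy data: position 100 differs from the reference in every isolate,
# so it says nothing about relationships within the outbreak.
calls = {
    "iso1": {100: "T", 2500: "G"},
    "iso2": {100: "T"},
    "iso3": {100: "T", 9000: "A"},
}
print(as_table(calls, informative_positions(calls)))
```

SNVs shared by every isolate only separate the outbreak strain from the reference, which is why they are dropped before comparing isolates to each other.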


Organism-specific pipelines


https://gph.niid.go.jp/tgs-tb/index_tb.html


http://www.wgsa.net


http://conferences.asm.org/images/ngsfinalprogram.pdf


LOOK AT YOUR DATA


63bp deletion

OTHER CONSIDERATIONS

• Are you seeing the expected number of SNVs?

• Is there over-representation of SNVs in annotated repetitive genes? These may be false positives.

• You may be sequencing one population or many - do you see heterogeneity at any positions?

• Indels may also act as markers of transmission but are harder to reliably call, especially on certain NGS platforms
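One way to look for heterogeneity is to check the read support for the second most common base at each position. A minimal sketch, assuming per-position base counts have already been extracted from the alignment; the thresholds (at least 5 reads and 10% of the depth) are arbitrary placeholders to tune for your own data, not recommended values:

```python
def minor_allele(base_counts):
    """Return (reads supporting the second most common base, total reads)."""
    total = sum(base_counts.values())
    ranked = sorted(base_counts.values(), reverse=True)
    second = ranked[1] if len(ranked) > 1 else 0
    return second, total


def looks_mixed(base_counts, min_fraction=0.10, min_reads=5):
    """Flag a position where a second base has non-trivial read support."""
    minor_reads, total = minor_allele(base_counts)
    if total == 0:
        return False
    return minor_reads >= min_reads and minor_reads / total >= min_fraction


# Made-up pileup counts at two positions.
print(looks_mixed({"A": 98, "C": 2}))   # False: consistent with sequencing error
print(looks_mixed({"A": 60, "C": 40}))  # True: worth inspecting in a BAM viewer
```

Flagged positions are candidates for manual inspection, not confirmed mixed populations.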


THE FINAL OUTPUT - A FASTA FILE OF CONCATENATED VARIANTS.
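A sketch of how such a file could be assembled, assuming you hold the SNV calls per isolate plus the reference base at every variable position; the data structures and names are invented for illustration:

```python
def concatenated_variant_fasta(calls, ref_bases):
    """Build FASTA text with one record per isolate.

    calls:     isolate -> {position: alternate base}
    ref_bases: position -> reference base, for every variable position
    """
    positions = sorted(ref_bases)
    records = []
    for isolate in sorted(calls):
        # Use the isolate's own base where a SNV was called, else the reference.
        seq = "".join(calls[isolate].get(pos, ref_bases[pos]) for pos in positions)
        records.append(f">{isolate}\n{seq}")
    return "\n".join(records) + "\n"


ref_bases = {2500: "A", 9000: "C"}
calls = {"iso1": {2500: "G"}, "iso2": {}, "iso3": {9000: "A"}}
print(concatenated_variant_fasta(calls, ref_bases), end="")
```

Every record has the same length, so the file can be used directly as an alignment for tree building.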

Part 3: Phylogenies 101


Who has constructed a phylogeny before?

PHYLOGENY BASICS

• You can make a tree very quickly using Neighbour-Joining (NJ) methods

• Maximum-likelihood methods are better: RAxML is popular, as is FastTree for larger datasets

• You will usually need to select an evolutionary model; jModelTest can help

• Bootstrapping or other support calculations are important for understanding how robust your tree is
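Bootstrapping works by resampling alignment columns with replacement, rebuilding the tree for each pseudo-replicate, and reporting how often each clade reappears. The sketch below shows only the resampling step, on a toy alignment; in practice the tree-building program does all of this for you:

```python
import random


def bootstrap_alignment(alignment, rng):
    """Resample alignment columns with replacement (one bootstrap replicate)."""
    names = list(alignment)
    if not names:
        raise ValueError("alignment is empty")
    length = len(alignment[names[0]])
    if any(len(seq) != length for seq in alignment.values()):
        raise ValueError("all sequences must have the same length")
    # Draw as many column indices as the alignment has columns.
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(alignment[name][c] for c in columns) for name in names}


# Toy alignment of concatenated variants.
alignment = {"A": "AAAAAA", "B": "AACAAA", "C": "AACTAA", "D": "AACAAG"}
rng = random.Random(42)  # fixed seed so the replicates are reproducible
replicates = [bootstrap_alignment(alignment, rng) for _ in range(100)]
print(replicates[0])
```

With only a handful of variable sites, as in many outbreak datasets, support values can be low simply because so few columns separate the isolates.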

PHYLOGENY TOP TIPS

• Before aligning your sequences and making a tree, ensure you have informative names/tip labels

• Use FigTree to interact with and create nice visual displays of your tree

• Before working with your phylogeny, read this, from the excellent Andrew Rambaut: http://epidemic.bio.ed.ac.uk/how_to_read_a_phylogeny
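For the tip-label advice, one option is to build labels from your metadata before aligning. A small sketch, assuming three made-up metadata fields joined with "|" and with characters that are awkward in tree files replaced by underscores; the field choice and the separator are illustrative conventions, not a requirement of any tool:

```python
def tip_label(sample_id: str, location: str, collection_date: str) -> str:
    """Join metadata fields into one label that is safe to use in a tree file."""
    def clean(field: str) -> str:
        # Keep letters, digits, '-' and '.'; replace spaces, commas, colons, etc.
        return "".join(ch if ch.isalnum() or ch in "-." else "_" for ch in field.strip())

    return "|".join(clean(part) for part in (sample_id, location, collection_date))


print(tip_label("MT-0042", "Kelowna", "2010-11-03"))     # MT-0042|Kelowna|2010-11-03
print(tip_label("MT-0107", "Salmon Arm", "2011-02-14"))  # MT-0107|Salmon_Arm|2011-02-14
```

Carrying the sampling date in the label is convenient later if you build a time-labelled phylogeny.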


http://www.beast2.org


Part 3: Inferring transmission manually

TELEPHONE

[Image: art by DeviantArt user scummy]

REAL-WORLD PATTERNS OF SPREAD AREN’T AS SIMPLE

Genomic data provides a higher resolution view of a cluster, but SNVs alone do not often suggest obvious person-to-person transmission

DETERMINING THE ORDER OF TRANSMISSION

• Duration of infectious period:
  • Date of symptom onset
  • Date of diagnosis
  • Date put on treatment

• Infectiousness

• Hospitalizations

• Social contacts, locations
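The dates above can be turned into a simple plausibility check on who could have infected whom. A deliberately crude sketch, assuming a case is infectious from symptom onset until treatment starts and that a recipient must have been infected no later than their own symptom onset; both assumptions are simplifications, and the cases and dates are invented:

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Case:
    name: str
    symptom_onset: date
    treatment_start: date


def possible_transmission_window(source: Case, recipient: Case):
    """Dates on which source could have infected recipient, or None if impossible."""
    start = source.symptom_onset
    end = min(source.treatment_start, recipient.symptom_onset)
    return (start, end) if start <= end else None


a = Case("A", date(2010, 1, 10), date(2010, 3, 1))
b = Case("B", date(2010, 2, 15), date(2010, 3, 20))
print(possible_transmission_window(a, b))  # window from 10 Jan to 15 Feb 2010
print(possible_transmission_window(b, a))  # None: B fell ill after A
```

For an infection like TB, with long and variable latency, a window like this can rule links out but rarely pins one down.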


REMEMBER: IDENTICAL SEQUENCES DON’T NECESSARILY MEAN PERSON-TO-PERSON TRANSMISSION


[Figure: samples A, B, C, D and E]

1. Group the samples according to mutation pattern

[Figure: samples reordered as A, B, D, C, E]

2. Figure out all possible transmissions based on patterns of mutations and on who was sick first
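Steps 1 and 2 can be sketched in a few lines. This assumes a simple rule: X is a candidate source for Y if Y carries every SNV that X carries and X fell ill first. The rule ignores within-host diversity, and the SNV patterns and onset dates below are invented for illustration:

```python
from collections import defaultdict
from datetime import date


def group_by_pattern(snvs):
    """Step 1: group samples that share exactly the same set of SNVs."""
    groups = defaultdict(list)
    for sample, pattern in snvs.items():
        groups[frozenset(pattern)].append(sample)
    return {pattern: sorted(members) for pattern, members in groups.items()}


def candidate_transmissions(snvs, onset):
    """Step 2: list (source, recipient) pairs allowed by mutations and timing."""
    pairs = []
    for src in sorted(snvs):
        for dst in sorted(snvs):
            if src == dst:
                continue
            carries_all = set(snvs[src]) <= set(snvs[dst])
            sick_first = onset[src] < onset[dst]
            if carries_all and sick_first:
                pairs.append((src, dst))
    return pairs


# Hypothetical data: A has the ancestral pattern, B/D share one SNV, C/E another.
snvs = {"A": set(), "B": {"snv1"}, "D": {"snv1"}, "C": {"snv2"}, "E": {"snv2"}}
onset = {"A": date(2010, 1, 1), "B": date(2010, 2, 1), "D": date(2010, 2, 3),
         "C": date(2010, 2, 10), "E": date(2010, 3, 1)}
print(group_by_pattern(snvs))
print(candidate_transmissions(snvs, onset))
```

The output is a list of possibilities, not an answer; choosing among them takes the epidemiological data.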

[Figure: candidate transmission scenarios among A, B, C, D and E]

How did A infect the B/D groups and the C/E groups?

CONSIDER WITHIN-HOST DIVERSITY WHEN DEALING WITH CHRONIC INFECTIONS, INFECTIONS WITH LATENT OR CARRIAGE PERIODS, OR DISSEMINATED INFECTIONS

4. ASK WHICH SCENARIO IS MOST LIKELY GIVEN THE EPI DATA

• A was the index patient

• A, B, and D work together

• B has a non-infectious form of the disease

• D fell ill within two days of B

• C was in a ward of Hospital X at the same time as A

• E was admitted to the ward after A and C had been discharged


[Figure: inferred transmission network among A, B, C, D and E, with links labelled WORK, WORK, ADMITTED TO WARD, and INFECTED VIA FOMITE?]


Part 4: Inferring transmission with math


http://www.whoinfectedwhom.org

TRANSPHYLO INTERPRETS A BAYESIAN PHYLOGENY IN THE CONTEXT OF WITHIN-HOST GENETIC DIVERSITY.

with Xavier Didelot & Caroline Colijn (Imperial College London)


Can we infer a transmission tree T given a phylogenetic tree G?

[Figure: a phylogenetic tree G and a transmission tree T for hosts A, B, C and D]

1. Build a time-labelled phylogeny using BEAST


2. Assign each host a colour


3. Colour the tree according to when a lineage transmitted from one host to another


4. Do this over many, many trees.


5. Use an MCMC approach to infer most probable transmissions over all phylogenies
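The slides do not show how the sampled results are summarised. One simple, hedged illustration (not TransPhylo's actual MCMC, which is distributed as an R package): given a set of sampled transmission trees, each stored as infectee -> infector, count how often each who-infected-whom pair appears. The three trees below are invented:

```python
from collections import Counter


def transmission_support(sampled_trees):
    """Fraction of sampled transmission trees containing each (infector, infectee) pair."""
    if not sampled_trees:
        raise ValueError("need at least one sampled tree")
    counts = Counter()
    for tree in sampled_trees:
        for infectee, infector in tree.items():
            if infector is not None:  # None marks the index case
                counts[(infector, infectee)] += 1
    return {pair: n / len(sampled_trees) for pair, n in counts.items()}


# Three hypothetical transmission trees for hosts A-D.
samples = [
    {"A": None, "B": "A", "C": "A", "D": "B"},
    {"A": None, "B": "A", "C": "B", "D": "B"},
    {"A": None, "B": "A", "C": "A", "D": "C"},
]
for pair, support in sorted(transmission_support(samples).items()):
    print(pair, round(support, 2))
```

Pairs that appear in most samples are the best-supported transmissions; pairs that appear rarely flag links the data cannot resolve.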


Hatherell et al., 2016. Microbial Genomics.

An updated model to better infer time of infection


MEMO

Kelowna Health Centre, 1340 Ellis Street, Kelowna, BC V1Y 9N1
Bus: (250) 868-7818 Fax: (250) 868-7826 Email: [email protected] www.interiorhealth.ca

Quality • Integrity • Respect • Trust

To: CIHS Promotion & Prevention; Infection Control, Workplace Health & Safety, KGH Administrators, PRH Administrators, Senior Executive Team, CD Unit

From: Dr. Sue Pollock, Medical Health Officer & Medical Director, Communicable Disease

Date: February 4, 2015

RE: Central Okanagan TB Outbreak Declared Over

In 2008, an outbreak of Mycobacterium tuberculosis (TB) was declared after a higher-than-expected number of TB cases were identified in the Central Okanagan. Between 2008 and 2014, 52 outbreak-related active TB cases were identified. Most cases were homeless and/or street-involved persons in Kelowna, with a small linked cluster in Penticton and several cases in Salmon Arm. Interior Health’s TB Outbreak Management Team, in partnership with community organizations and the BC Centre for Disease Control, has used numerous strategies to identify and treat new cases and to minimize the public health risk. Epidemiological and genomics (genetic fingerprinting) data demonstrate that the peak of the outbreak occurred in late 2010/early 2011. There is currently no evidence of ongoing transmission, and incidence of new TB cases has returned to baseline (pre-outbreak) levels.

The Central Okanagan TB outbreak is declared over as of January 29, 2015. We expect to see sporadic new TB diagnoses connected to the outbreak in the coming years; early detection of these cases will be critical to preventing another outbreak. The CD Unit will disseminate further information about next steps as the outbreak response is de-escalated.

Outbreaks of TB among homeless persons are strongly related to social determinants of health such as employment, income, safe housing, and access to health care. Preventing and controlling future outbreaks requires continued attention to these inequities through comprehensive policies and programs that aim to reduce health disparities in our community.

On behalf of the Office of the Medical Health Officers, we thank each of you for your hard work and collaboration in controlling this outbreak and for your continued dedication to TB prevention and control. If you have any questions, please contact the Communicable Disease Unit at 1-866-778-7736 or by email [email protected].

RECAP

• Doing careful sequencing and bioinformatics can reveal mutations that can help you infer who infected whom (and when!), but you need to know your bug!

• Phylogenetic trees can help you to explore this data, and can feed into automated methods for transmission inference. Nothing in biology makes sense except in the light of evolution!

• These automated methods are no replacement for good field epidemiology data, and are likely not required for a small cluster of cases