
Supplementary Figure 1

Terminal branch lengths of H58 versus non-H58 isolates.

The number of SNPs between each isolate and its last common ancestor was determined from the phylogenetic tree in Figure 1. The frequency of each terminal branch distance was calculated and adjusted for the number of isolates in each of the lineages. All branch lengths are shown in the main panel, and those with lengths of less than 25 SNPs are shown in the inset.
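
The frequency adjustment described in this legend can be sketched as follows. This is a minimal illustration that assumes "adjusted for the number of isolates" means dividing each count by the size of the lineage; the function name and the branch lengths are hypothetical and are not data from the study.

```python
from collections import Counter

def normalized_branch_frequencies(branch_lengths):
    """Map each terminal branch length (in SNPs) to the fraction of
    isolates in the lineage that have that length."""
    counts = Counter(branch_lengths)
    total = len(branch_lengths)
    return {length: n / total for length, n in sorted(counts.items())}

# Hypothetical terminal branch lengths, for illustration only.
h58_lengths = [0, 0, 1, 1, 1, 2, 3, 5]
non_h58_lengths = [2, 7, 7, 15, 40]

h58_freq = normalized_branch_frequencies(h58_lengths)
non_h58_freq = normalized_branch_frequencies(non_h58_lengths)

# Inset view: keep only branch lengths of less than 25 SNPs.
non_h58_inset = {k: v for k, v in non_h58_freq.items() if k < 25}
```

Dividing by lineage size puts the two distributions on a common scale, so a lineage with more isolates does not dominate the plot simply because it was sampled more heavily.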

Nature Genetics: doi:10.1038/ng.3281


Supplementary Figure 2

Temporal analysis.

(a) Time-dependent accumulation of SNPs in the whole genomes of S. Typhi isolates. Root-to-tip branch lengths extracted from the maximum-likelihood tree of S. Typhi are plotted against the year of isolation. Points representing H58 isolates are colored red. Lines indicate linear regression of branch lengths on isolation dates, for H58 (red), all S. Typhi (black) and all S. Typhi isolated since 1992 (dashed). (b) Changes in the effective population size of the H58 lineage over time. The central black line indicates the median estimates, and shaded areas represent confidence limits expressed as 95% highest posterior probability densities (HPDs). The dashed red vertical line corresponds to the year in which the H58 lineage appeared to disseminate to multiple geographical locations in the corresponding H58 BEAST analysis.
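
The regression in panel (a) can be illustrated with an ordinary least-squares fit of root-to-tip distance on isolation year. This is only a sketch: the legend does not state which software or fitting procedure was used, and the function name and data points below are hypothetical.

```python
def fit_line(years, distances):
    """Ordinary least-squares fit of distance = slope * year + intercept."""
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(distances) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, distances))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical isolation years and root-to-tip branch lengths.
years = [1992, 1998, 2004, 2010]
root_to_tip = [10.0, 13.0, 16.0, 19.0]

slope, intercept = fit_line(years, root_to_tip)
# The slope is the estimated change in root-to-tip distance per year.
```

A positive slope indicates that more recently isolated genomes sit further from the root, which is the temporal signal this kind of plot is used to assess.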


Supplementary Figure 3

Association of S. Typhi H58 and multidrug resistance.

The frequency of H58 among MDR and non-MDR isolates and their associated country of origin are displayed (odds ratio and P values were calculated). All countries shown have ≥2 MDR isolates. OR, odds ratio; Inf, infinite; CAR, Central African Republic; DRC, Democratic Republic of the Congo; S. Africa, South Africa. *P < 0.01.
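
For readers unfamiliar with the "Inf" entries, the odds ratio for a 2 x 2 table of lineage (H58 or other) against phenotype (MDR or non-MDR) can be sketched as below. The counts are hypothetical, the function name is illustrative, and the statistical test used for the P values is not specified in this legend, so none is shown.

```python
import math

def odds_ratio(h58_mdr, h58_non_mdr, other_mdr, other_non_mdr):
    """Odds of MDR among H58 isolates divided by odds of MDR among others.

    Returns math.inf when the denominator is zero (for example, when every
    MDR isolate in a country is H58), and math.nan when both the numerator
    and the denominator are zero.
    """
    numerator = h58_mdr * other_non_mdr
    denominator = h58_non_mdr * other_mdr
    if denominator == 0:
        return math.inf if numerator > 0 else math.nan
    return numerator / denominator

# Hypothetical counts for two countries.
finite_or = odds_ratio(h58_mdr=30, h58_non_mdr=10, other_mdr=5, other_non_mdr=20)
infinite_or = odds_ratio(h58_mdr=12, h58_non_mdr=3, other_mdr=0, other_non_mdr=9)
```

An infinite odds ratio therefore reflects an empty cell in the table rather than a numerical error.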


Supplementary Figure 4

Phylogenetic distribution of acquired resistance genes and DNA gyrase and topoisomerase IV mutations found in the 1,832 S. Typhi isolates.

The phylogeny of 1,832 S. Typhi isolates constructed using 22,145 SNPs is depicted in the center and surrounded by colored band circles representing (1) the geographical region from which the isolate originated and the number of (2) resistance genes, (3) gyrA mutations, (4) gyrB mutations, (5) parC mutations and (6) parE mutations present in the isolate. A red arc represents the H58 lineage, and the phylogenetic position of the CT18 (R) reference (AL513382) is indicated. Branch lengths are indicative of the estimated substitution rate per variable site. A, alanine; R, arginine; N, asparagine; D, aspartic acid; Q, glutamine; E, glutamic acid; G, glycine; I, isoleucine; L, leucine; K, lysine; F, phenylalanine; S, serine; Y, tyrosine. *Rare SNP.


Supplementary Figure 5

Antimicrobial resistance trends of H58 S. Typhi isolates.

Numbers of H58 S. Typhi that were MDR on genotyping and/or harbored at least one gyrA mutation conferring nalidixic acid resistance and reduced fluoroquinolone susceptibility, among isolates from (a) Southeast Asia, (b) South Asia and (c) Africa.


Supplementary Figure 6

Phylogenetic distribution of novel phage regions identified in the S. Typhi H58 lineage.

The maximum-likelihood phylogeny of 853 S. Typhi H58 isolates constructed using 1,534 SNPs is depicted in the center, rooted using an S. Typhi isolate from the nearest neighboring cluster of non-H58 isolates as an outgroup (black circle; isolate 10060_5_62_Fij107364_2012) and surrounded by colored band circles representing (1) country of isolation and (2) phage regions. Each of the phage regions is detailed in Supplementary Table 6. Branch lengths are indicative of the estimated substitution rate per variable site.


Supplementary Table 2. Public reference strains used in this study. Key: * = S. Typhi isolates used in the study by Holt, K. E. et al. (2008) (ref. 1); R = reference strain used in chromosomal phylogenetic analyses.

Public reference strains of S. Typhi

| Isolate name | Tree name | Accession number | Year of isolation | Continent | Region within continent | Country |
|---|---|---|---|---|---|---|
| BL196 | BL196_2005 | SalTypBL196v1 | 2005 | Asia | Southeast Asia | Malaysia |
| CR0044 | CR0044_2007 | SalTypSTCR0044v1 | 2007 | Asia | Southeast Asia | Malaysia |
| CR0063 | CR0063_2007 | SalTypCR0063v1 | 2007 | Asia | Southeast Asia | Malaysia |
| P-stx-12 | P-stx-12_2012 | ASM24553v1 | 2012 | Asia | South Asia | India |
| ST0208 | ST0208_2008 | YKP860805.1 | 2008 | Asia | Southeast Asia | Malaysia |
| UJ308A | UJ308A_2012 | SalTypUJ308Av1 | 2012 | Australia & Oceania | Oceania | Papua New Guinea |
| UJ816A | UJ816A_2012 | SalTypUJ816Av1 | 2012 | Australia & Oceania | Oceania | Papua New Guinea |
| E98_3139* | E98_3139_1998 | ASM18037v1 | 1998 | North America | North America | Mexico |
| J185SM* | J185SM_1985 | ASM18031v1 | 1985 | Asia | Southeast Asia | Indonesia |
| CT18* (R) | CT18_S.Typhi_CT18_1993 | AL513382.1 | 1993 | Asia | Southeast Asia | Vietnam |
| Ty2* | 10349_1_84_RusTy2_1916 | ERR343332 | 1916 | Europe | Eastern Europe | Russia |
| 404ty* | 10349_1_90_Indo404ty_1983 | ERR343338 | 1983 | Asia | Southeast Asia | Indonesia |
| E00-7866* | 10349_1_88_MorE00-7866_2000 | ERR343336 | 2000 | Africa | North Africa | Morocco |
| E02-1180* | 10349_1_89_IndE02-1180_2002 | ERR343337 | 2002 | Asia | South Asia | India |
| E98-0664* | 10349_1_86_KenE98-0664_1998 | ERR343334 | 1998 | Africa | East Africa | Kenya |
| E98-2068* | 10349_1_87_BanE98-2068_1998 | ERR343335 | 1998 | Asia | South Asia | Bangladesh |
| M223* | 10425_1_10_UnkM223_1939 | ERR349340 | 1939 | Unknown | Unknown | Unknown |
| 150(98)S* | 10561_2_47_Vie150_98_S_1998 | ERR357622 | 1998 | Asia | Southeast Asia | Vietnam |
| 8(04)N* | 10349_1_95_Vie8_04_N_2004 | ERR343343 | 2004 | Asia | Southeast Asia | Vietnam |
| E02-2759* | 10349_1_91_IndE02-2759_2002 | ERR343339 | 2002 | Asia | South Asia | India |
| E03-4983* | 10540_1_4_IndoE03-4983_2003 | ERR352601 | 2003 | Asia | Southeast Asia | Indonesia |

Public reference plasmids and phages

| IncHI1 plasmids | Tree name | Accession number |
|---|---|---|
| R27 | AF250878_R27 | AF250878 |
| pHCM1 | AL513383_pHCM1 | AL513383 |
| pAKU1 | pAKU1 | AM412236 |

| Other plasmids | Tree name | Accession number |
|---|---|---|
| pHCM2 | pHCM2 | NC_003385.1 |


Supplementary Table 3. Plasmids identified in S. Typhi H58 isolates. A summary of the known and novel plasmids identified in 853 H58 isolates. The number of isolates that contain the plasmid and their geographical origins are described. The phylogenetic distribution of each of the plasmids is shown in Figure 6.

In the tables below, isolate counts and countries are listed in the order in which they appear in the source.

Tn2670-like composite transposon (catA1, sul2, dfrA7, blaTEM-1, strAB, sul2)

| Location | Number of isolates | Countries of isolation |
|---|---|---|
| Plasmid – IncHI1 PST6 | 185, 140, 43, 15, 9, 6, 4, 1, 1 | Cambodia, Vietnam, Kenya, Laos, Pakistan, India, Tanzania, Sri Lanka, Unknown |
| Chromosome – yidA | 13, 10, 3, 1, 1 | Bangladesh, Iraq, Pakistan, Palestine, India |
| Chromosome – cya | 72, 19, 4, 6, 3, 3, 1, 1, 1, 1 | Malawi, India, South Africa, Tanzania, Bangladesh, Cambodia, Afghanistan, Africa, Australia, Nepal |
| Chromosome – STY4438 | 11 | Fiji |
| Chromosome – fbp | 1 | India |

Other acquired resistance genes

| Location (genes) | Number of isolates | Countries of isolation |
|---|---|---|
| IncN plasmid (sul1, aadA1, dfrA15) | 1 | India |
| IncN plasmid (Tn6029: blaTEM-1, strAB, sul2; sul1, aadA1) | 1 | India |
| IncFIB(K) plasmid (blaTEM-1, sul2, qnrS1) | 5, 1, 1 | Bangladesh, South Africa, Unknown |
| IncFIB(K) + IncN plasmids (Tn6029: blaTEM-1, strAB, sul2; dfrA5, blaCARB-6, catB, aac3) | 13 | Tanzania |
| Novel plasmid (strAB, sul1, blaOXA-23) | 1 | Bangladesh |


Supplementary Table 4. Isolates sequenced using the PacBio RS II platform. S. Typhi H58 isolates selected for sequencing on the PacBio RS II platform (Pacific Biosciences, CA, USA).

| Name of isolate (tree name) | Laboratory number | Accession number | Year of isolation | Continent | Region within continent | Country |
|---|---|---|---|---|---|---|
| 10349_1_74_Ind12148_2012 | ERL12148 | ERR343322 | 2012 | Asia | South Asia | India |
| 10060_6_83_Tan129-0238-M_2008 | 129-0238-M | ERR331380 | 2008 | Africa | East Africa | Tanzania |
| 10349_1_79_Ind12960_2012 | ERL12960 | ERR343327 | 2012 | Asia | South Asia | India |
| 9475_6_19_Mal1016889_2011 | 1016889 | ERR279116 | 2011 | Africa | Southern Africa | Malawi |
| 10607_2_36_Mal1036491_2012 | 1036491 | ERR360828 | 2012 | Africa | Southern Africa | Malawi |


Supplementary Table 5. Amino acid substitutions in the quinolone resistance-determining regions (QRDRs) of DNA gyrase and topoisomerase IV genes. (a) Combinations of coding changes detected in the QRDR and their frequency amongst H58 and non-H58 lineages. (b) Nucleotide substitutions resulting in QRDR coding changes. Key to derived base: A = adenine; C = cytosine; G = guanine; T = thymine. Amino acid abbreviations: Ala = alanine; Arg = arginine; Asn = asparagine; Asp = aspartic acid; Gln = glutamine; Glu = glutamic acid; Gly = glycine; Ile = isoleucine; Leu = leucine; Lys = lysine; Phe = phenylalanine; Ser = serine; Tyr = tyrosine. *Rare SNP (see b).

(a) Combinations of coding changes detected in the QRDR and their frequency amongst H58 and non-H58 lineages.

| gyrA | gyrB | parC | parE | H58 | Other |
|---|---|---|---|---|---|
| Ser83Phe | - | - | - | 199 (23%) | 57 (6%) |
| Ser83Phe | - | Glu84Gly | - | 6 (1%) | 1 (0.1%) |
| Ser83Phe | - | Ser80Ile | - | 1 (0.1%) | 0 |
| Ser83Phe | - | - | Leu416Phe | 1 (0.1%) | 6 (0.6%) |
| Ser83Phe | - | - | Asp420Asn | 153 (18%) | 1 (0.1%) |
| Ser83Phe | Gln465Arg | - | Asp420Asn | 6 (0.7%) | 0 |
| Ser83Tyr | - | - | - | 74 (9%) | 8 (0.8%) |
| Ser83Tyr | - | - | Asp420Asn | 1 (0.1%) | 0 |
| Asp87Asn | - | - | - | 0 | 5 (0.5%) |
| Asp87Tyr | - | - | - | 8 (0.9%) | 12 (1%) |
| Asp87Tyr* | - | - | - | 2 (0.2%) | 1 (0.1%) |
| Ser83Phe, Asp87Tyr | - | Glu84Lys | - | 1 (0.1%) | 0 |
| Ser83Phe, Asp87Tyr | - | Ser80Ile | - | 19 (2%) | 0 |
| Ser83Phe, Asp87Tyr | - | - | Asp420Asn | 1 (0.1%) | 0 |
| Ser83Tyr, Asp87Tyr | - | Ser80Ile | - | 1 (0.1%) | 0 |
| - | Gln465Leu | - | - | 0 | 1 (0.1%) |
| - | Ser464Phe | - | - | 27 (3%) | 5 (0.5%) |
| - | Ser464Tyr | - | - | 0 | 2 (0.2%) |
| - | Ser464Phe, Gln465Leu | - | - | 0 | 1 (0.1%) |
| - | - | - | - | 353 (41%) | 879 (88%) |

(b) Nucleotide substitutions resulting in QRDR coding changes.

| Gene | Nucleotide change | Amino acid change |
|---|---|---|
| gyrA | C248T | Ser83Phe |
| gyrA | C248A | Ser83Tyr |
| gyrA | G259A | Asp87Asn |
| gyrA | A260G | Asp87Tyr |
| gyrA | G259T | Asp87Tyr* |
| gyrB | C1391T | Ser464Phe |
| gyrB | C1391A | Ser464Tyr |
| gyrB | A1394T | Gln465Leu |
| gyrB | A1394G | Gln465Arg |
| parC | G239T | Ser80Ile |
| parC | G250A | Glu84Lys |
| parC | A251G | Glu84Gly |
| parE | C1246T | Leu416Phe |
| parE | G1258A | Asp420Asn |
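
The relationship between the nucleotide positions and the codon numbers in part (b) can be checked with a short sketch. It assumes that nucleotide positions are 1-based and counted from the first base of each gene's coding sequence, which the table does not state explicitly; the function names are illustrative. Only positions are compared, not the identity of the encoded amino acids.

```python
import re

# Nucleotide and amino acid changes as listed in part (b); the asterisk
# that marks the rare SNP is omitted here.
QRDR_CHANGES = {
    "gyrA": [("C248T", "Ser83Phe"), ("C248A", "Ser83Tyr"), ("G259A", "Asp87Asn"),
             ("A260G", "Asp87Tyr"), ("G259T", "Asp87Tyr")],
    "gyrB": [("C1391T", "Ser464Phe"), ("C1391A", "Ser464Tyr"),
             ("A1394T", "Gln465Leu"), ("A1394G", "Gln465Arg")],
    "parC": [("G239T", "Ser80Ile"), ("G250A", "Glu84Lys"), ("A251G", "Glu84Gly")],
    "parE": [("C1246T", "Leu416Phe"), ("G1258A", "Asp420Asn")],
}

def codon_number(nt_position):
    """Codon index (1-based) that contains a 1-based nucleotide position."""
    return (nt_position - 1) // 3 + 1

def positions_agree(nt_change, aa_change):
    """True if the nucleotide position falls in the codon named by aa_change."""
    nt_pos = int(re.fullmatch(r"[ACGT](\d+)[ACGT]", nt_change).group(1))
    aa_pos = int(re.fullmatch(r"[A-Za-z]{3}(\d+)[A-Za-z]{3}", aa_change).group(1))
    return codon_number(nt_pos) == aa_pos

# Collect any rows whose nucleotide position and codon number disagree.
mismatches = [
    (gene, nt, aa)
    for gene, changes in QRDR_CHANGES.items()
    for nt, aa in changes
    if not positions_agree(nt, aa)
]
```

Under that assumption every listed nucleotide position falls within the codon named in the amino acid column, so `mismatches` is empty.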


Supplementary Table 6. Novel phage sequences identified in S. Typhi H58 isolates. A summary of the intact prophage regions, and the corresponding closest sequenced phage reference genomes, that were identified in 853 H58 isolates using PHAST (ref. 2). The number of isolates that contain the phage sequence and their geographical origins are described. The phylogenetic distribution of each of the phages is shown in Supplementary Figure 6. Isolate counts and countries are listed in the order in which they appear in the source.

| Novel phage region | Number of genes | Region size (kb) | Closely related phage family | Number of isolates | Countries of isolation |
|---|---|---|---|---|---|
| 1 | 24 | 27.8 | P4-like phage (refs. 3,4) | 1, 1 | Laos, Malawi |
| 2 | 80 | 68.4 | SP4 (P2-like phage) (ref. 5) | 1 | Malawi |
| 3 | 58 | 52.8 | SE-OLF-10058 (P2-like phage) (ref. 6) | 24 | India |
| 4 | 44 | 35.9 | EC026_P13 (P2-like phage) (ref. 7) | 1 | India |
| 5 | 79 | 50.1 | P1-like phage (cryptic plasmid) (refs. 8,9) | 54, 8, 1 | Vietnam, Cambodia, Pakistan |


Supplementary Table 7. Non-synonymous SNPs that define the H58 lineage. The non-synonymous SNPs that are present in 99% of H58 isolates and not in the non-H58 isolates. Gene annotation was performed using the S. Typhi CT18 database at NCBI (http://www.ncbi.nlm.nih.gov/; accession number AL513382). Functional categories are as annotated in the S. Typhi CT18 genome (accession number AL513382). Key for derived base: A = adenine; C = cytosine; G = guanine; T = thymine. Key for ancestral AA (amino acid) and derived AA: A = alanine; R = arginine; N = asparagine; D = aspartic acid; C = cysteine; E = glutamic acid; Q = glutamine; G = glycine; H = histidine; O = hydroxyproline; I = isoleucine; L = leucine; K = lysine; M = methionine; F = phenylalanine; P = proline; S = serine; T = threonine; W = tryptophan; Y = tyrosine; V = valine; * = stop codon.

| Coordinate in CT18 | Base in H58 | Base in non-H58 | Amino acid substitution | Gene name | Functional category | Function |
|---|---|---|---|---|---|---|
| 2750755 | A | G | A2421T | STY2875 | Pathogenicity/adaptation/chaperones | Large repetitive protein |
| 1629304 | A | G | A25V | ssaP | Pathogenicity/adaptation/chaperones | SPI2 type III secretion system protein SsaP |
| 2875160 | A | G | Q185* | sptP | Pathogenicity/adaptation/chaperones | Pathogenicity effector tyrosine phosphatase protein SptP |
| 4192687 | T | C | A32V | gph | Degradation of small molecules | Phosphoglycolate phosphatase |
| 3843665 | T | C | G204D | dsdA | Degradation of small molecules | D-serine dehydratase |
| 89102 | G | A | K201R | etfB | Degradation of small molecules | Protein fixA |
| 3152879 | A | G | M286I | uxuA | Degradation of small molecules | D-mannonate dehydrolase |
| 3004181 | C | T | E337G | recB | Degradation of macromolecules | Exonuclease V subunit |
| 1360939 | T | C | Q30* | dbpA | Pseudogenes | Putative ATP-dependent RNA helicase |
| 3360344 | C | T | H97R | nanE2 | Pseudogenes | Putative N-acetylmannosamine-6-phosphate 2-epimerase 2 |
| 4196909 | A | G | A56V | bigA | Pseudogenes | Putative surface-exposed virulence protein |
| 3824631 | G | T | D60E | torC | Pseudogenes | Cytochrome c-type protein |
| 1286044 | C | T | D372G | trpE | Central/intermediary metabolism | Anthranilate synthase component 1 |
| 3144053 | C | A | V203G | puuB | Central/intermediary metabolism | Putative oxidoreductase |
| 693560 | T | C | M100I | rlpB | Central/intermediary metabolism | Rare lipoprotein B precursor |
| 4273783 | A | C | R1019S | metH | Central/intermediary metabolism | Putative B12-dependent methionine synthase |
| 2401233 | A | G | R116C | yfbT | Central/intermediary metabolism | Putative phosphoglycolate phosphatase |
| 40159 | A | G | G11E | betC | Central/intermediary metabolism | Putative secreted sulfatase |
| 880083 | G | A | T201A | iaaA | Central/intermediary metabolism | Putative L-asparaginase |
| 387595 | T | C | T204I | rtn | Central/intermediary metabolism | Putative rtn protein |
| 2972433 | T | C | M51I | csrB | stable RNA | CsrB regulator |
| 4665891 | A | G | A245V | arcA | Regulators | Arginine deiminase |
| 2002943 | A | G | L63F | sirA | Regulators | Invasion response-regulator |
| 3398551 | A | G | A620V | yhdA | Membrane/surface structures | Putative lipoprotein |
| 3659647 | T | C | G251E | lsrC | Membrane/surface structures | Putative ABC transporter permease protein |
| 529155 | G | A | K590E | kefA | Membrane/surface structures | Integral membrane protein AefA |
| 1270888 | A | G | R252Q | kcsA | Membrane/surface structures | Putative membrane transport protein (voltage-gated potassium channel) |
| 2202853 | T | C | R334C | yegT | Membrane/surface structures | Putative nucleoside permease |
| 461438 | A | G | R4C | yajI | Membrane/surface structures | Putative lipoprotein |
| 4020211 | T | C | T44M | lip1 | Membrane/surface structures | Putative membrane protein |
| 2388057 | A | G | T530I | nuoG | Information transfer | Putative NADH dehydrogenase I chain G |
| 1810914 | A | G | A99T | SBOV18161 | Information transfer | Hydrogenase-1 operon protein HyaE |
| 3484294 | T | C | V213I | wecF | Conserved hypothetical | Putative 4-alpha-L-fucosyl transferase |
| 4775254 | A | C | P237Q | yjjV | Conserved hypothetical | Putative deoxyribonuclease |
| 4253640 | A | G | A315T | dprA | Conserved hypothetical | Hypothetical protein |
| 387082 | A | G | Intergenic | | | |
| 1055966 | T | C | Intergenic | | | |
| 2288504 | T | C | Intergenic | | | |
| 2348633 | A | G | Intergenic | | | |
| 2662406 | G | A | Intergenic | | | |
| 3182059 | G | A | Intergenic | | | |
| 3693688 | T | C | Intergenic | | | |
| 3863384 | G | T | Intergenic | | | |
| 4214165 | T | C | Intergenic | | | |


REFERENCES

1. Holt, K.E. et al. High-throughput sequencing provides insights into genome variation and evolution in Salmonella Typhi. Nat Genet 40, 987-93 (2008).
2. Zhou, Y., Liang, Y., Lynch, K.H., Dennis, J.J. & Wishart, D.S. PHAST: a fast phage search tool. Nucleic Acids Res 39, W347-52 (2011).
3. Ghisotti, D. et al. Multiple regulatory mechanisms controlling phage-plasmid P4 propagation. FEMS Microbiol Rev 17, 127-34 (1995).
4. Thomson, N. et al. The role of prophage-like elements in the diversity of Salmonella enterica serovars. J Mol Biol 339, 279-300 (2004).
5. Moreno Switt, A.I. et al. Genomic characterization provides new insight into Salmonella phage diversity. BMC Genomics 14, 481 (2013).
6. Ogunremi, D. et al. High resolution assembly and characterization of genomes of Canadian isolates of Salmonella Enteritidis. BMC Genomics 15, 713 (2014).
7. Ogura, Y. et al. Comparative genomics reveal the mechanism of the parallel evolution of O157 and non-O157 enterohemorrhagic Escherichia coli. Proc Natl Acad Sci U S A 106, 17939-44 (2009).
8. Lobocka, M.B. et al. Genome of bacteriophage P1. J Bacteriol 186, 7032-68 (2004).
9. Billard-Pomares, T. et al. Characterization of a P1-like bacteriophage carrying an SHV-2 extended-spectrum beta-lactamase from an Escherichia coli strain. Antimicrob Agents Chemother 58, 6550-7 (2014).
