The Global Evolution and Adaptation of Vibrio cholerae Across Multiple Niche Dimensions
Rob Edwards
Flinders 2015
How to annotate a couple of hundred genomes
Rob Edwards
Flinders 2015
Annotation of microbial genomes and comparison across differences
Cholerae
Haiti
Genome Sequencing
ORF Calling
Annotation
Global evolution
Niche dimensions
Cholera is caused by Vibrio cholerae
About 3-5 million cases per year
About 100,000-200,000 deaths worldwide per year
Notable deaths: Tchaikovsky, Polk (11th President of the USA)
A worldwide pandemic
About 75% of patients have no symptoms
25-50 PINTS of diarrhea per DAY
Severe symptoms are caused by dehydration
Treatment
Clean water
Electrolytes
Vaccine
Not antibiotics
Symptoms
1st – 1817 to 1823 Started at the Ganges, spread by colonialists
2nd – 1829 to 1849 Worldwide spread via immigrants
3rd – 1852 to 1859 John Snow first epidemiologist
Multiple Pandemics
John Snow
Portrait painted in 1847 when he was 34 years old.
First epidemiological study
John Snow
Cholera outbreak in Soho, London 1854
Plotted all cases on a map
Found big cluster around water well
First epidemiological study
John Snow’s Map
First epidemiological study
On the Mode of Communication of Cholera
1854
Cholera caused by bacteria
Outbreaks of cholera
1st – 1817 to 1823 Started at the Ganges, spread by colonialists
2nd – 1829 to 1849 Worldwide spread via immigrants
3rd – 1852 to 1859 John Snow first epidemiologist
4th – 1863 to 1879 Originated in Mecca
5th – 1881 to 1896 First cholera vaccine (1892)
6th – 1899 to 1923 Killed 800,000 people
7th – 1961 to present 1991: South America killed > 100,000 people
Multiple Pandemics
Earthquake Jan 12th, 2010
No cholera in Haiti for > 50 years
First case, October 22nd, 2010
By February, 2011 250,000 cases and ~5,000 deaths
What was the original source?
Haitian Outbreak
http://www.ph.ucla.edu/
Haitian cholera outbreaks
Source: Final Report of the Independent Panel of Experts on the Cholera Outbreak in Haiti
Cases by day – Mirebalais Hospital
Cases by Age – St Marc Hospital
On October 20th, 2010
Two hypotheses:
Endemic, waterborne strain that has been in Haiti but not caused disease for 50 years
Imported from another country
Haitian Outbreak
“They have been fortunate in Haiti that for 50 years the conditions have been such that they haven’t had an intense increase in cholera bacterial populations. ... But they’ve had an earthquake, they’ve had destruction, they’ve had a hurricane ... I think it’s very unfortunate to look for a scapegoat. It is an environmental phenomenon that is involved”
Rita Colwell, Johns Hopkins School of Public Health
The environmental hypothesis
“The organism that is causing the disease is very uncharacteristic of (Haiti and the Caribbean), and is quite characteristic of the region from where the soldiers in the base came. ... I don’t see there is any way to avoid the conclusion that an unfortunate and presumably accidental introduction of the organism occurred.”
John Mekalanos, Harvard Medical School
The human hypothesis
Source: Final Report of the Independent Panel of Experts on the Cholera Outbreak in Haiti
Conditions favor human hypothesis
Global evolution of Vibrio
Can we use genomics to identify the global evolution of Vibrio?
Which gene(s) are important for temporal/spatial variation?
Prototype Vibrio cholerae sequence
TIGR
Nature 406, 477-483 (3 August 2000)
Sequenced genomes
2011 – 32 Vibrio strains sequenced
Fabiano Thompson's Lab @ UFRJ
Fundação Oswaldo Cruz
Ion quality scores
Sequenced genomes
2011 – 32 Vibrio strains sequenced
2011 – 171 Vibrio strains sequenced
How do you analyze 250+ genomes?
The steps in genome sequencing
Generate genome sequence
Assembly
ORF calling
tRNA identification
rRNA identification
Functional annotation
www.sigmaaldrich.com
Putative protein
Open Reading Frame (ORF): A stretch of codons with no stop codon
Coding Sequence (CDS): An ORF that could encode a protein
Protein encoding gene (PEG): An ORF that could encode a protein
Hypothetical protein = putative protein: Something that has not been experimentally shown
Polypeptide: Short stretch of ~50 amino acids. Often a domain
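The ORF definition above can be sketched in a few lines of Python. This is only an illustration, not the ORF caller used for these genomes: the function name `find_orfs`, the `min_codons` cutoff, and the example sequence are assumptions, and only the three forward-strand reading frames are scanned.

```python
# Minimal sketch: report stop-free stretches (ORFs) on the forward strand.
# The stop codons are the standard TAA, TAG and TGA.
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(dna, min_codons=3):
    """Return (frame, start, end) for each stop-free stretch of at least
    min_codons codons. Coordinates are 0-based; end is exclusive."""
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = frame
        for pos in range(frame, len(dna) - 2, 3):
            if dna[pos:pos + 3] in STOP_CODONS:
                # A stop codon closes the current stretch.
                if (pos - start) // 3 >= min_codons:
                    orfs.append((frame, start, pos))
                start = pos + 3
        # Stretch that runs to the end of the sequence without a stop.
        end = frame + ((len(dna) - frame) // 3) * 3
        if (end - start) // 3 >= min_codons:
            orfs.append((frame, start, end))
    return orfs

# Made-up example: two short ORFs in frame 0, each ended by a stop codon.
example = "ATGAAACCCTAAATGGGGTTTCCCTGA"
for frame, start, end in find_orfs(example):
    print(frame, start, end, example[start:end])
```

An ORF that also begins with a start codon (for example ATG) is a candidate CDS in the sense used above; real ORF callers add further evidence such as codon usage and ribosome binding sites.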
Reads per chromosome (Chr. I)
Reads per chromosome (Chr. II)
Cholera Toxin Phage
Assembly
ORF Calling
Annotation
Annotated Vibrio using RAST
Single nucleotide polymorphisms
ATCATCGATCAGCATGCATCAGCATCGATCAGCATCATCGATCAGCATGCATCAGCATCGATCAGCATCATCGATCAGCATGCATCAGCCTCGATCAGCATCATCGATCAGCATGCATCAGCCTCGATCAGCATCATCGATCAGCAAGCATCAGCCTCGATCAGCATCATCGATCAGCAAGCATCAGCCTCGATCAGCATCATCGATCAGCAAGCATCAGCCTCGATCAGCATCATCGATCAGCAAGCATCAGCCTCGAGCAGCATCATCGATCAGCAAGCATCAGCCTCGAGCAGC
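Read as nine aligned 33-base sequences, the example above differs at only a few columns. A minimal Python sketch of how such SNP columns could be picked out follows; the function names `snp_columns` and `snp_profile` are assumptions made for illustration, and real SNP calling works from read mappings rather than a ready-made alignment.

```python
# Toy alignment from the slide: nine aligned sequences of 33 bases each.
ALIGNMENT = [
    "ATCATCGATCAGCATGCATCAGCATCGATCAGC",
    "ATCATCGATCAGCATGCATCAGCATCGATCAGC",
    "ATCATCGATCAGCATGCATCAGCCTCGATCAGC",
    "ATCATCGATCAGCATGCATCAGCCTCGATCAGC",
    "ATCATCGATCAGCAAGCATCAGCCTCGATCAGC",
    "ATCATCGATCAGCAAGCATCAGCCTCGATCAGC",
    "ATCATCGATCAGCAAGCATCAGCCTCGATCAGC",
    "ATCATCGATCAGCAAGCATCAGCCTCGAGCAGC",
    "ATCATCGATCAGCAAGCATCAGCCTCGAGCAGC",
]

def snp_columns(alignment):
    """Return {position: set of bases} for every column that holds more
    than one base. Positions are 0-based."""
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all sequences must have the same length")
    snps = {}
    for pos in range(length):
        bases = {seq[pos] for seq in alignment}
        if len(bases) > 1:
            snps[pos] = bases
    return snps

def snp_profile(alignment):
    """Reduce each sequence to its bases at the variable positions."""
    positions = sorted(snp_columns(alignment))
    return ["".join(seq[p] for p in positions) for seq in alignment]

print(sorted(snp_columns(ALIGNMENT)))  # variable columns
print(snp_profile(ALIGNMENT))          # one short SNP string per sequence
```

The per-sequence profiles change one position at a time from the top of the alignment to the bottom, which is the kind of stepwise signal used to order strains in time.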
Global evolution
Mutreja et al 2011
Mutreja et al 2011
Waves of spread of cholera
Mutreja et al 2011
Different evolution for each wave
On the source of Haitian cholera
Cholerae
Mimicus
Parahaemolyticus
Harveyi
Vibrio cholerae from Bangladesh in 1994
Vibrio cholerae from Haiti in 2010
Vibrio cholerae from Bangladesh in 2002
Vibrio cholerae from Haiti in 2010
Vibrio cholerae from Haiti in 2010
Outbreak in Kathmandu, Nepal before the soldiers left
Outbreaks downstream (not upstream) along the river from the Nepalese UN camp
But that could have come from river trade. Ships used to fly the yellow flag when they were quarantined for cholera.
Nepalese soldiers?
http://www.ph.ucla.edu/
Haitian cholera outbreaks
Conservation of the ~120 kb superintegron region across 210 strains
Evolution not only by SNPs
Mother
Daughter Daughter
SNPs HGT
Horizontal gene transfer versus vertical evolution
210 Vibrio genomes
Reassembled
Reannotated
Find interesting genes!
Year
Continent
Country
Lat/Lon Coordinates
Clinical or Environmental
Source
Serogroup
Serotype
Biotype
Mutreja wave
Niche dimensions
Vibrio cholerae
Non-cholera toxin: No disease
Cholera toxin: Epidemics
Serogroup: O1, O139
Biotype: Classical, El Tor
Serotype: Ogawa, Inaba
V. cholerae classification
15,000 genes in the pangenome
933 subsystems (pathways) present in at least one genome
SNPs (after Mutreja)
Response variables
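A pangenome in the sense above is every gene seen in at least one genome, and the presence or absence of each gene per strain is what can later be tested against the niche dimensions. The sketch below is a minimal illustration only: the strain names and gene contents are made up, and the function names `pangenome` and `presence_matrix` are assumptions.

```python
# Hypothetical gene content for three made-up strains (illustration only;
# these sets are not taken from the 210 genomes in the talk).
GENOMES = {
    "strain_A": {"recA", "umuC", "umuD", "ctxA", "ctxB"},
    "strain_B": {"recA", "umuC", "umuD"},
    "strain_C": {"recA", "umuC", "ctxA", "ctxB"},
}

def pangenome(genomes):
    """Every gene present in at least one genome."""
    return set().union(*genomes.values())

def presence_matrix(genomes):
    """Return (genes, rows): the sorted pangenome genes and, for each
    genome, a list of 0/1 flags in the same gene order."""
    genes = sorted(pangenome(genomes))
    rows = {
        name: [int(gene in content) for gene in genes]
        for name, content in genomes.items()
    }
    return genes, rows

genes, rows = presence_matrix(GENOMES)
print(len(pangenome(GENOMES)), "genes in the toy pangenome")
print(genes)
for name, flags in rows.items():
    print(name, flags)
```

Each row of the matrix is one strain and each column one candidate predictor; the same layout works for subsystem counts or SNPs.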
Recreate evolution of the Vibrios
What are the important genes for each niche dimension?
Who, what, when, where!
Use random forests to identify important variables
Analysis
O-antigen   Exopolysaccharide   Capsule   Sialic Acid   DNA recomb.
O1          10                  20        5             10
O1          10                  20        5             10
O1          10                  20        5             10
O139        100                 1         8             10
O139        100                 1         8             10
O139        100                 1         8             10
O139        100                 1         8             10
Random Forest
O139 O1
Exopolysaccharide
<50
Random Forest
O1 O139
DNA-recombination
10
O1 O139
Capsule
<10
Random Forest
Each tree votes on the importance of each variable.
Typically, run 10,000 trees
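The voting idea can be sketched with the toy O1/O139 table from the slides. This is a simplified stand-in, not the analysis actually run: each "tree" here is a single split (a stump) fitted to a bootstrap sample using a random subset of variables, and a vote goes to the variable that gives the purest split. The names `gini`, `best_split` and `forest_votes`, the number of candidate variables, and the tie-breaking rule are all assumptions.

```python
import random
from collections import Counter

FEATURES = ["exopolysaccharide", "capsule", "sialic_acid", "dna_recombination"]
# Toy table from the slides: serogroup label plus one count per variable.
ROWS = [("O1", [10, 20, 5, 10])] * 3 + [("O139", [100, 1, 8, 10])] * 4

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure group)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(rows, feature_idx):
    """Best 'value < threshold' split for one variable, returned as
    (weighted impurity, threshold), or None if the variable is constant."""
    values = sorted({vals[feature_idx] for _, vals in rows})
    best = None
    for low, high in zip(values, values[1:]):
        threshold = (low + high) / 2
        left = [lab for lab, vals in rows if vals[feature_idx] < threshold]
        right = [lab for lab, vals in rows if vals[feature_idx] >= threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
        if best is None or score < best[0]:
            best = (score, threshold)
    return best

def forest_votes(rows, n_trees=1000, n_candidates=2, seed=0):
    """Count how often each variable wins the split across many stumps."""
    rng = random.Random(seed)
    votes = Counter()
    for _ in range(n_trees):
        sample = [rng.choice(rows) for _ in rows]  # bootstrap sample
        candidates = rng.sample(range(len(FEATURES)), n_candidates)
        scored = []
        for idx in candidates:
            split = best_split(sample, idx)
            if split is not None:
                scored.append((split[0], idx))
        if scored:
            # Ties go to the variable listed first (an arbitrary choice).
            _, winner = min(scored)
            votes[FEATURES[winner]] += 1
    return votes

print(best_split(ROWS, 0))  # exopolysaccharide separates O1 from O139
print(best_split(ROWS, 3))  # DNA recombination is constant: None
print(forest_votes(ROWS))
```

In this toy table DNA recombination is identical in every strain, so it never receives a vote, while exopolysaccharide, capsule and sialic acid each separate O1 from O139 cleanly. A full random forest grows deep trees and scores importance more carefully, but the principle of many randomized trees voting is the same.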
Response variables and niche dimensions
Genes important for who? (serogroup)
Genes important for what? (clinical, environmental, ...)
Genes important for where? (continent)
Separation of functions by continent
Genes important for when? (year)
DNA Repair
DNA repair & phages
Normal DNA repair (134 strains): umuC umuD
Additional DNA repair (4 strains; not O1): umuC umuD
Phage-borne DNA repair (72 strains): umuC prophage
Mutreja et al 2011
Waves 2 & 3 have phage-interrupted repair
Different evolution for each wave
Conclusions
Unraveling evolution and spread of new pathogens
Mining genomes and niche dimensions
Don't get scooped!
Multi-genome projects??
Organism                              Number   Organism                                      Number
S. pyogenes                           3,615    Mycobacterium tuberculosis                    390
S. pneumoniae                         3,085    Salmonella in cattle and humans               373
Rice (Oryza sativa)                   3,000    Vibrio                                        274
C. elegans                            2,007    Shigella sonnei                               263
Clostridium difficile                 1,250    Mycobacterium tuberculosis                    259
The thousand (human) genome project   1,092    Streptococcus pneumoniae                      240
Mycobacterium tuberculosis            1,000    Methicillin-resistant Staphylococcus aureus   193
Plasmodium falciparum                 825      Campylobacter jejuni                          192
Streptococcus pneumoniae              616      Mycobacterium abscessus in CF                 170
Nick Loman: http://lab.loman.net/
Current multigenome projects