Post on 20-Dec-2015
1. Pathogenomics Project
2. Cross-Domain Horizontal Gene Transfer Analysis
3. Horizontal Gene Transfer: Identifying Pathogenicity Islands
Pathogenomics
Goal:
Identify previously unrecognized mechanisms of microbial pathogenicity using a combination of informatics, evolutionary biology, microbiology and genetics.
Explosion of data
26 of the 36 publicly available bacterial genome sequences are for pathogens
Approximately 24,000 pathogen genes with no known function!
~177 bacterial genome projects in progress …
Data as of June, 2001
Bacterial Pathogenicity
Processes of microbial pathogenicity at the molecular level are still minimally understood
Pathogen proteins identified that manipulate host cells by interacting with, or mimicking, host proteins
Yersinia Type III secretion system
Approach
Idea: Could we identify novel virulence factors by identifying bacterial pathogen genes more similar to host genes than you would expect based on phylogeny?
• Search pathogen genes against databases. Identify those with eukaryotic similarity.
• Evolutionary significance.
– Horizontal transfer? Similar by chance?
• Prioritize for biological study.
– Previously studied in the laboratory?
– Can UBC microbiologists study it?
– C. elegans homolog?
• Modify screening method/algorithm
Approach
Genome data for…
Anthrax, Cat scratch disease, Chancroid, Chlamydia, Cholera, Dental caries, Diarrhea (E. coli etc.), Diphtheria, Epidemic typhus, Mediterranean fever, Gastroenteritis, Gonorrhea, Legionnaires' disease, Leprosy, Leptospirosis, Listeriosis, Lyme disease, Melioidosis, Meningitis, Necrotizing fasciitis, Paratyphoid/enteric fever, Peptic ulcers and gastritis, Periodontal disease, Plague, Pneumonia, Salmonellosis, Scarlet fever, Shigellosis, Strep throat, Syphilis, Toxic shock syndrome, Tuberculosis, Tularemia, Typhoid fever, Urethritis, Urinary Tract Infections, Whooping cough
+ Hospital-acquired infections
Bacterial Pathogens
• Chlamydophila psittaci: Respiratory disease, primarily in birds
• Mycoplasma mycoides: Contagious bovine pleuropneumonia
• Mycoplasma hyopneumoniae: Pneumonia in pigs
• Pasteurella haemolytica: Cattle shipping fever
• Pasteurella multocida: Cattle septicemia, pig rhinitis
• Ralstonia solanacearum: Plant bacterial wilt
• Xanthomonas citri: Citrus canker
• Xylella fastidiosa: Pierce’s disease of grapevines
Bacterial wilt
World Research Community
Approach
Prioritized candidates
Study function of homolog in model host (C. elegans)
Study function of gene in bacterium.
Infection of mutant in model host
C. elegans
DATABASE
Collaborations with others
Informatics/Bioinformatics
• BC Genome Sequence Centre
• Centre for Molecular Medicine and Therapeutics
Evolutionary Theory
• Dept of Zoology
• Dept of Botany
• Canadian Institute for Advanced Research
Pathogen Functions
• Dept. Microbiology
• Biotechnology Laboratory
• Dept. Medicine
• BC Centre for Disease Control
Host Functions
• Dept. Medical Genetics
• C. elegans Reverse Genetics Facility
• Dept. Biological Sciences SFU
Interdisciplinary group
Coordinator
• For each complete bacterial and eukaryote genome: BLASTP (and MSP Crunch) of all deduced proteins against non-redundant SWALL database
• Overlay NCBI taxonomy information to form an ACEDB database
• Query database for bacterial proteins whose top-scoring hit is eukaryotic (and eukaryotic proteins whose top hit is bacterial)
• Perform similar query, but filtering different taxonomic groups from the analysis
Development of first database: Sequence similarity-based approach
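The query described on this slide can be illustrated with a minimal Python sketch. It assumes the BLASTP results have already been parsed and overlaid with taxonomy; the `Hit` record, its field names, the taxon labels and the example scores are all hypothetical and are not taken from the BAE-watch implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hit:
    """One similarity-search hit, already annotated with taxonomy (hypothetical layout)."""
    query_id: str        # protein used as the query
    subject_id: str      # database protein that was hit
    subject_domain: str  # "Bacteria", "Eukaryota" or "Archaea"
    subject_taxon: str   # finer taxonomic group, used for filtering
    score: float         # similarity score; higher is better


def top_hit(hits, excluded_taxa=frozenset()):
    """Return the best-scoring hit after dropping hits from excluded taxa."""
    kept = [h for h in hits if h.subject_taxon not in excluded_taxa]
    return max(kept, key=lambda h: h.score, default=None)


def cross_domain_queries(hits_by_query, target_domain="Eukaryota",
                         excluded_taxa=frozenset()):
    """Map each query whose top remaining hit lies in target_domain to that hit."""
    result = {}
    for query_id, hits in hits_by_query.items():
        best = top_hit(hits, excluded_taxa)
        if best is not None and best.subject_domain == target_domain:
            result[query_id] = best
    return result


# Made-up example data: two bacterial query proteins and their hits.
example_hits = {
    "protA": [
        Hit("protA", "b1", "Bacteria", "Pasteurellaceae", 900.0),
        Hit("protA", "e1", "Eukaryota", "Trichomonas", 800.0),
        Hit("protA", "b2", "Bacteria", "Enterobacteriaceae", 300.0),
    ],
    "protB": [
        Hit("protB", "b3", "Bacteria", "Enterobacteriaceae", 500.0),
        Hit("protB", "e2", "Eukaryota", "Fungi", 100.0),
    ],
}

unfiltered = cross_domain_queries(example_hits)
filtered = cross_domain_queries(example_hits,
                                excluded_taxa=frozenset({"Pasteurellaceae"}))
```

In this made-up data, `protA` is reported only once its close bacterial relatives are filtered out, which is the situation the last bullet (filtering different taxonomic groups) is meant to expose.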
BAE-watch Database: Bacterial proteins with unusual similarity with Eukaryotic proteins
Problem: Proteins highly conserved in the three domains of life
Top hit to a protein from another domain may occur by chance.
“StepRatio” score helps detect these.
Example: Glucose-6-Phosphate Reductase
Example of a case with a high StepRatio:
Enoyl ACP reductase
BAE-watch Database: Bacterial proteins with unusual similarity with Eukaryotic proteins
Haemophilus influenzae Rd-KW20 proteins most strongly matching eukaryotic proteins
PhyloBLAST – a tool for analysis
Brinkman et al. (2001) Bioinformatics. 17:385-387.
Trends in this Sequence-based Analysis
• Identifies the strongest cases of lateral gene transfer between bacteria and eukaryotes
• Most common “cross-domain” horizontal transfers:
Bacteria Unicellular Eukaryote
• Identifies nuclear genes with potential organelle origins
• A control: Method identifies all previously reported Chlamydia trachomatis “plant-like” genes.
First case: Bacterium → Eukaryote Lateral Transfer
Figure: phylogenetic tree (scale bar: 0.1) including Bacillus subtilis, Escherichia coli, Salmonella typhimurium, Staphylococcus aureus, Clostridium perfringens, Clostridium difficile, Trichomonas vaginalis, Haemophilus influenzae, Actinobacillus actinomycetemcomitans and Pasteurella multocida.
N-acetylneuraminate lyase (NanA) of the protozoan Trichomonas vaginalis is 92-95% similar to NanA of Pasteurellaceae bacteria.
de Koning et al. (2000) Mol. Biol. Evol. 17:1769-1773
N-acetylneuraminate lyase – role in pathogenicity?
Pasteurellaceae
• Mucosal pathogens of the respiratory tract
T. vaginalis
• Mucosal pathogen, causative agent of the STD trichomoniasis
N-acetylneuraminate lyase (sialic acid lyase, NanA)
Involved in sialic acid metabolism
Role in Bacteria: Proposed to parasitize the mucous membranes of animals for nutritional purposes
Role in Trichomonas: ?
Glycoproteins, glycolipids
→ Sialidase: hydrolysis of glycosidic linkages of terminal sialic residues
Free sialic acid
→ Transporter
Free sialic acid
→ NanA
N-acetyl-D-mannosamine + pyruvate
Another case: A Sensor Histidine Kinase for a Two-component Regulation System
Signal Transduction
Histidine kinases common in bacteria
Ser/Thr/Tyr kinases common in eukaryotes
However, a histidine kinase was recently identified in fungi, including pathogens Fusarium solani and Candida albicans
How did it get there?
Figure: phylogenetic tree of histidine kinases (scale bar: 0.1; branch support values omitted).
Fungi: Neurospora crassa NIK-1, Fusarium solani FIK1, Fusarium solani FIK2, Candida albicans CaNIK1
Bacteria: Streptomyces coelicolor SC4G10.06c, Streptomyces coelicolor SC7C7.03, Escherichia coli RcsC, Erwinia carotovora RpfA / ExpS, Escherichia coli BarA, Salmonella typhimurium BarA, Pseudomonas aeruginosa GacS, Pseudomonas fluorescens GacS / ApdA, Pseudomonas tolaasii RtpA / PheN, Pseudomonas syringae GacS / LemA, Pseudomonas viridiflava RepA, Azotobacter vinelandii GacS, Xanthomonas campestris RpfC, Vibrio cholerae TorS, Escherichia coli TorS, Pseudomonas aeruginosa PhoQ
Streptomyces Histidine Kinase. The Missing Link?
Legend: virulence factor; possible virulence factor (?)
Brinkman et al. (2001) Infection and Immunity. In Press.
“Plant-like” genes in Chlamydia
• Chlamydiaceae: Obligate intracellular pathogens of humans
• Proteins: Unusually high number most similar to plant proteins
• Previous proposal: Obtained genes from a plant-like amoebal host? (a relative of Chlamydiaceae infects Acanthamoeba)
“Plant-like” genes in Chlamydia
NCBI GI   Protein description                           Subcellular localization in plants
4377270   Glycyl tRNA Synthetase                        Chloroplast
4376626   ADP/ATP Translocase                           Chloroplast
4376667   Glycogen Hydrolase                            Chloroplast
4377189   GTP Cyclohydrolase & DHBP Synthase            Chloroplast
4377237   Beta-Ketoacyl-ACP Synthase                    Chloroplast
4376686   Enoyl-Acyl-Carrier Reductase                  Chloroplast
4376591   Thioredoxin Reductase                         Chloroplast
4377185   Metal Transport P-type ATPase                 Chloroplast
4377346   Similar to Na+/H+ Antiporter                  Chloroplast
4376650   Phosphate Permease                            Chloroplast
4376637   GcpE protein                                  Chloroplast
4376637   Tyrosyl tRNA Synthetase                       Chloroplast
4377360   Malate Dehydrogenase                          Chloroplast
4376763   GTP Binding protein                           Chloroplast
4376911   ADP/ATP Translocase                           Chloroplast
3329179   Phosphoglycerate Mutase                       Chloroplast
4377281   Glycerol-3-Phosphate Acyltransferase          Chloroplast
4376993   ABC Transporter ATPase                        Chloroplast
4376509   Deoxyoctulonosic Acid Synthetase              Chloroplast
4376872   Sugar Nucleotide Phosphorylase                Chloroplast
6578112   rRNA Methyltransferase                        Chloroplast
3329217   HSP60                                         Chloroplast
3328745   Phosphoribosylanthranilate Isomerase          Chloroplast
6578104   Aspartate Aminotransferase                    Chloroplast
4377328   Polyribonucleotide Nucleotidyltransferase     Chloroplast
4377362   Putative D-Amino Acid Dehydrogenase           Chloroplast
4377331   Cytosine Deaminase                            Chloroplast?
4376915   Lipoate-Protein Ligase A                      Mitochondrial
4377272   Glycogen Synthase                             N/A
4377065   Dihydropteroate Synthase                      N/A
4377239   Inorganic Pyrophosphatase                     N/A
4376904   Uridine 5’-Monophosphate Synthase             N/A
4377173   UDP-Glucose Pyrophosphorylase                 N/A
4376815   GutQ/KpsF Family Sugar-Phosphate Isomerase    Mitochondrial?
Chlamydiaceae share an ancestral relationship with Cyanobacteria and Chloroplast
Figure: phylogenetic tree (scale bar: 0.1; branch support values omitted).
Chlamydiaceae: Chlamydophila pneumoniae, Chlamydophila psittaci, Chlamydia muridarum, Chlamydia trachomatis
Chloroplasts: Chlamydomonas reinhardtii, Klebsormidium flaccidum, Zea mays, Nicotiana tabacum
Cyanobacteria: Synechococcus PCC6301, Synechocystis PCC6803, Microcystis viridis
Other taxa: Pyrococcus furiosus (Archaea), Thermotoga maritima, Aquifex pyrophilus, Bacillus subtilis, Escherichia coli, Zea mays mitochondrion, Rickettsia prowazekii, Caulobacter crescentus
Chlamydiaceae share an ancestral relationship with Cyanobacteria and Chloroplast
Figure: ribosomal proteins compared across Escherichia, Bacillus, Thermotoga, Synechocystis and Chlamydia.
Proteins shown: L3, L4, L23, L2, S19, L22, S3, L16, L29, S17, L14, L24, L5, S14, S8, L6, L18, S5, L30, L15, S10
Unique shared-derived characters unite Chlamydiaceae and Synechocystis
Chlamydiaceae “plant-like” genes reflect an ancestral relationship with Cyanobacteria and Chloroplast
• Chlamydia do not appear to be exchanging DNA with their hosts
• Existing knowledge of Cyanobacteria may stimulate ideas about the function and control of pathogenic Chlamydia?
Non-unique shared characters include a multistage developmental lifecycle, storage of glucose primarily as glycogen, and non-flagellar motility
Expanding the Cross-Domain Analysis
• Identify cross-domain lateral gene transfer between bacteria, archaea and eukaryotes
• No obvious correlation seen with protein functional classification
• Most cases: no obvious correlation seen between “organisms involved” in potential lateral transfer
Exceptions:
– Unicellular eukaryotes
– “Organelle-functioning” proteins in Rickettsia, Synechocystis, and Chlamydiaceae
Horizontal Gene Transfer and Bacterial Pathogenicity
Transposons:
– ST enterotoxin genes in E. coli
Prophages:
– Shiga-like toxins in EHEC
– Diphtheria toxin gene, Cholera toxin
– Botulinum toxins
Plasmids:
– Shigella, Salmonella, Yersinia
Pathogenicity Islands:
– Uro/Entero-pathogenic E. coli
– Salmonella typhimurium
– Yersinia spp.
– Helicobacter pylori
– Vibrio cholerae
Pathogenicity Islands
Associated with
– Atypical %G+C
– tRNA sequences
– Transposases, Integrases and other mobility genes
– Flanking repeats
IslandPath: Identifying Pathogenicity Islands
Yellow circle = high %G+C
Pink circle = low %G+C
tRNA gene lies between the two dots
rRNA gene lies between the two dots
Both tRNA and rRNA lie between the two dots
Dot is named a transposase
Dot is named an integrase
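The annotation-based markers in this legend could be derived along the following lines. This is only a rough sketch: the keyword matching, the function names and the coordinate convention are assumptions made for illustration, not IslandPath's actual rules.

```python
import re

# Assumed keyword patterns; a real annotation pipeline may use a curated list.
MOBILITY_PATTERNS = {
    "transposase": re.compile(r"transposase", re.IGNORECASE),
    "integrase": re.compile(r"integrase", re.IGNORECASE),
}


def mobility_flags(product):
    """Return the set of mobility-gene labels whose keyword occurs in a product name."""
    return {name for name, pattern in MOBILITY_PATTERNS.items()
            if pattern.search(product)}


def rna_between(prev_end, next_start, rna_positions):
    """True if any tRNA/rRNA gene position falls strictly between two neighbouring genes."""
    return any(prev_end < pos < next_start for pos in rna_positions)
```

For example, `mobility_flags("transposase, IS30 family")` yields `{"transposase"}`, and `rna_between(100, 200, [150])` is true because position 150 lies in the gap between the two genes.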
Neisseria meningitidis serogroup B strain MC58 Mean %G+C: 51.37 STD DEV: 7.57
%G+C   SD  Location          Strand  Product
39.95  -1  1834676..1835113  +       virulence associated pro. homolog
51.96      1835110..1835211  -       cryptic plasmid A-related
39.13  -1  1835357..1835701  +       hypothetical
40.00  -1  1836009..1836203  +       hypothetical
42.86  -1  1836558..1836788  +       hypothetical
34.74  -2  1837037..1837249  +       hypothetical
43.96      1837432..1838796  +       conserved hypothetical
40.83  -1  1839157..1839663  +       conserved hypothetical
42.34  -1  1839826..1841079  +       conserved hypothetical
47.99      1841404..1843191  -       put. hemolysin activ. HecB
45.32      1843246..1843704  -       put. toxin-activating
37.14  -1  1843870..1844184  -       hypothetical
31.67  -2  1844196..1844495  -       hypothetical
37.57  -1  1844476..1845489  -       hypothetical
20.38  -2  1845558..1845974  -       hypothetical
45.69      1845978..1853522  -       hemagglutinin/hemolysin-rel.
51.35      1854101..1855066  +       transposase, IS30 family
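The SD column above appears to record how far a gene's %G+C lies from the genome mean, in whole standard deviations capped at 2. A small Python sketch of that reading follows. The capping rule, the strict thresholds and the symmetric treatment of high-%G+C genes are inferred from the listed values and are assumptions; the function names are hypothetical.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def sd_class(gc, genome_mean, genome_sd):
    """Classify a gene's %G+C as -2, -1, 0, 1 or 2 standard deviations from the mean.

    Assumed rule: more than 2 SD away gives +/-2, more than 1 SD gives +/-1,
    anything closer gives 0 (shown as blank in the table).
    """
    z = (gc - genome_mean) / genome_sd
    if z < -2:
        return -2
    if z < -1:
        return -1
    if z > 2:
        return 2
    if z > 1:
        return 1
    return 0


# Genome-wide values quoted on the slide for N. meningitidis MC58.
MEAN_GC = 51.37
SD_GC = 7.57

# A few rows from the table: (%G+C, SD class as listed; blank treated as 0).
listed = [(39.95, -1), (51.96, 0), (34.74, -2), (43.96, 0), (20.38, -2), (37.57, -1)]
computed = [sd_class(gc, MEAN_GC, SD_GC) for gc, _ in listed]
```

Under these assumptions the computed classes agree with the listed ones for the sampled rows; for instance 20.38 is more than four standard deviations below the mean but is still reported as -2.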
Variance of the Mean %G+C for all Genes in a Genome: Correlation with bacteria’s clonal nature
non-clonal clonal
Pathogenomics Project: Future Developments
• Identify eukaryotic motifs and domains in pathogen genes
• Threader: Detect proteins with similar tertiary structure
• Identify more motifs associated with
– Pathogenicity islands
– Virulence determinants
• Functional tests for new predicted virulence factors
• Expand analysis to include viral genomes
• Jeff Blanchard (National Centre for Genome Resources, New Mexico)
• Olof Emanuelsson (Stockholm Bioinformatics Center)
• Genome Sequence Centre, BC Cancer Agency
Acknowledgements
Pathogenomics group: Ann M. Rose, Yossef Av-Gay, David L. Baillie, Fiona S. L. Brinkman, Robert Brunham, Artem Cherkasov, Rachel C. Fernandez, B. Brett Finlay, Hans Greberg, Robert E.W. Hancock, Steven J. Jones, Patrick Keeling, Audrey de Koning, Don G. Moerman, Sarah P. Otto, B. Francis Ouellette, Nancy Price, Ivan Wan.
www.pathogenomics.bc.ca