Beiko ANL Soil Metagenomics presentation


Post on 15-Apr-2017







Soil, lateral gene transfer, and hybrid genomes
Robert Beiko
20 October 2015

Lateral gene transfer
[Figure label: A. microbe]

In the gut
  Butyrate synthesis in Lachnospiraceae and other organisms

Meehan and Beiko (2014) Genome Biol Evol
[Figure label: Lachnospiraceae]

LGT across habitats

Smillie et al. (2011) Nature
  Within-site transfer rates are highest in host-associated (i.e., human) habitats

Antibiotic resistance (AR) genes are frequently transferred BETWEEN habitats

Villegas-Torres et al. (2011) International Biodeterioration & Biodegradation

Forsberg et al. (2012) Science

Sorangium cellulosum So157-2
  14.8 Mbp; 11,599 coding sequences
  >1200 putative LGT acquisitions

Han et al. (2013) Sci Rep

An ecological view of genomes
  Genes as individuals, genomes as communities

Key concept mappings:

  Diversity: counts of genes / distribution across functional categories
  Community: set of genes and their interactions
  Migration: lateral gene transfer
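The "diversity" mapping above can be illustrated with a small sketch. It treats each gene in a genome as an individual and each functional category as a species, then computes richness and the Shannon index. The category labels and the example genome are hypothetical, and the choice of the Shannon index is an assumption; the slides do not say which diversity measure was used.

```python
import math
from collections import Counter

def shannon_diversity(categories):
    """Shannon index H = -sum(p_i * ln p_i) over functional-category frequencies."""
    counts = Counter(categories)
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Hypothetical genome: one made-up functional category label per gene.
genome = ["transport", "transport", "metabolism", "metabolism",
          "mobile_element", "regulation"]

richness = len(set(genome))          # number of distinct categories
diversity = shannon_diversity(genome)  # evenness-aware diversity of the gene "community"
```

A genome whose genes all fall in one category has a Shannon index of zero; the index rises as genes spread more evenly across more categories.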

Metacommunity (Leibold et al., Ecol Lett, 2004)
  A set of local communities that are linked by dispersal

[Figure: Species A and Species B; pattern and intensity of migration for species A]

Genome Metacommunity Hypothesis
Since
  Genes are agents whose trajectories are not bound to their host organisms
  Genes can evolve and take on new functional roles in concert with other genes

A genome can be viewed as a community of genes
Related sets of genomes comprise a metacommunity of genes

Genome Metacommunities (Boon et al., FEMS Microbiol Rev, 2014)
  A set of genomes that are linked by LGT

[Figure: Gene A and Gene B; pattern and intensity of LGT for gene A]

Genome Metacommunities (Boon et al., FEMS Microbiol Rev, 2014)
  Related to the pan-genome, but not restricted to specific taxonomic groups

Why is a given gene present in a given genome at a given time?

How are functional roles partitioned across a community?

Soil thinking
How important is LGT in soil communities?

Does it make sense to think of gene metacommunities in the soil context?

  Lots of LGT: YES
  Minimal LGT: NO

The procedure
In the absence of a coherent set of known genomes from a given habitat:
  Identify an interesting sample
  Select genomes with very high marker-gene (i.e., 16S) similarity to sequences in the sample (gOTUs)
  Mine genomes for evidence of LGT, examine patterns of connectivity
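The genome-selection step of the procedure can be sketched as a simple filter over 16S search results. This is a minimal illustration, not the pipeline used in the talk: the tuple layout, the sequence and genome names, and the function `build_gotus` are hypothetical, and applying one e-value cutoff at every identity threshold is an assumption.

```python
from collections import defaultdict

def build_gotus(hits, min_identity=97.0, max_evalue=1e-20):
    """Group reference genomes by the sample 16S sequence they match.

    hits: iterable of (sample_seq_id, genome_id, percent_identity, evalue).
    Returns {sample_seq_id: set of genome_ids passing both thresholds}.
    """
    gotus = defaultdict(set)
    for sample_seq, genome, identity, evalue in hits:
        if identity >= min_identity and evalue < max_evalue:
            gotus[sample_seq].add(genome)
    return dict(gotus)

# Hypothetical search results; none of these values come from the talk.
hits = [
    ("seq_1", "genome_A", 99.4, 1e-80),
    ("seq_1", "genome_B", 97.6, 1e-70),
    ("seq_1", "genome_C", 93.0, 1e-50),   # below the identity threshold
    ("seq_2", "genome_D", 99.1, 1e-5),    # fails the e-value cutoff
]

gotus_97 = build_gotus(hits, min_identity=97.0)
gotus_99 = build_gotus(hits, min_identity=99.0)
```

Sample sequences with no genome passing the thresholds are simply absent from the result, which corresponds to the "No match" counts reported for the sample.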

Conclusions
  Positive relationship between abundance, diversity and pH
  Specific relationships between different bacterial (notably Acidobacteria) and fungal groups vs. pH
  Fungal OTUs appear to tolerate wider pH ranges
(1) Chosen sample: (pH = 4.1)

Meet the Sample (MG-RAST)

1277 rRNA gene sequences

Meet the Sample (Matching genomes)

99% 16S identity (e-value < 1e-20):

  No match: 1211
  Bradyrhizobiaceae: 61
  Pseudomonas: 2
  Nocardioides: 1
  Acidithiobacillus: 1
  Cyanobium: 1
  Total: 18 genomes covering 8 genera

97% 16S identity:

  No match: 1100
  Bradyrhizobiaceae: 77
  Pseudoxanthomonas / Cycloclasticus: 25
  Acidobacteria: 20
  Other Proteobacteria: 48
  Other: 10
  Total: 114 genomes covering 74 genera

114 genomes covering 74 genera

1277 rRNA gene sequences


gOTUs
16S sequence from sample: 141824_31298
Identity rings shown in the figure: 99% 16S identity, 97% 16S identity
Genomes in the gOTU:
  Rhodopseudomonas palustris TIE 1
  Rhodopseudomonas palustris DX 1
  Rhodopseudomonas palustris CGA009
  Bradyrhizobium japonicum USDA 6
  Rhodopseudomonas palustris HaA2
  Rhodopseudomonas palustris BisA53
  Bradyrhizobium BTAi1
  Nitrobacter winogradskyi
  Oligotropha carboxidovorans
  Agromonas oligotrophica
  Bradyrhizobium ORS278

Weird gOTUs
16S sequence from sample: 141824_229613
  Bordetella pertussis
  Bordetella bronchiseptica
  Bordetella parapertussis

Gross et al., 2008

Homology search
Compare proxy genomes against nr database

Identify interesting patterns:
  Unusual best matches (e.g., best nonself match is to a completely different group)
  Patchy distributions, phylogenetic trees
  Linked sets of genes: co-transfer?
  Implicated biological processes?
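The "unusual best match" pattern can be sketched as follows: for each gene, discard hits to the genome's own genus, take the top-scoring remaining hit, and flag it when it falls outside the expected higher-level group. The record layout, the names `GenusSelf`, `GroupX` and so on, and the use of bit score for ranking are all assumptions made for illustration.

```python
def best_nonself_match(hits, self_genus):
    """Return the highest-scoring hit whose genus differs from self_genus, or None."""
    nonself = [hit for hit in hits if hit["genus"] != self_genus]
    if not nonself:
        return None
    return max(nonself, key=lambda hit: hit["bitscore"])

def is_unusual(hit, expected_group):
    """Flag a best nonself match that lies outside the expected taxonomic group."""
    return hit is not None and hit["group"] != expected_group

# Hypothetical hits for one gene; genera, groups and scores are illustrative only.
hits = [
    {"genus": "GenusSelf", "group": "GroupX", "bitscore": 900.0},
    {"genus": "GenusNear", "group": "GroupX", "bitscore": 410.0},
    {"genus": "GenusFar", "group": "GroupY", "bitscore": 650.0},
]

best = best_nonself_match(hits, self_genus="GenusSelf")
flagged = is_unusual(best, expected_group="GroupX")
```

A flagged gene is only a candidate for LGT; as the slide notes, patchy distributions and phylogenetic trees are needed to support the inference.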

Acidithiobacillus ferrooxidans
A refugee from genus Thiobacillus (a group shattered by 16S rRNA gene sequencing)

Loves long walks on the beach, pH < 2.0, oxidizes iron, sulphur, thiosulphate

Also loves to share genes

Beiko (2011) Biol Direct
  504 gene trees in which A. ferrooxidans has a unique genus as partner
  Not shown: 795 genes w/multiple partners
  Also not shown: 333 other trees with less frequent, unique partner genera

Split by 16S; reunited by genome sequencing?

Genome 1: Acidithiobacillus ferrivorans (renaming of A. ferrooxidans)
3093 predicted proteins / 3035 with homology matches
Observed / Predicted capabilities:
  Facultatively anaerobic
  Psychrotolerant
  Optimal pH = 2.5
  Oxidation of iron and inorganic sulfur
  Carbon fixation, nitrate reduction
  Trehalose synthesis
  Bioleaching
Liljeqvist et al. (2011) J Bacteriol

Genome 1: Acidithiobacillus ferrivorans
Best nonself match is to (273 non-Acidithiobacillus)

Mobile element signatures dominate
  14 x restriction system-associated
  8 x transposase
  8 x transcriptional regulators (incl CopG, TetR)
  Other resistance (LacZ, bleomycin, ...)
  Integrase, reverse transcriptase, toxin/antitoxin, bacteriocin, ...
  Nitrate reductase & related
  >90 unknown

1877 found in other Acidithiobacillus + other genera
Best non-Acidithiobacillus match is to

(only 11 Acidobacteria!)

Acidobacterial connections
  Short-chain dehydrogenase/reductase SDR
  HNH endonuclease
  Glycoside hydrolase family 8 (x3)
  RES domain protein
  Transposase x 5

Phylogenetic profiles
  # of similar genes (e-value < 1e-50)
  Min 30 connections
[Figure labels: Proteobacteria, Actinobacteria, Cyanobacteria, Planctomycetes, Acidobacteria, Bacteroidetes, Acidithiobacillus]

Key observations
  Connections to many other groups, mostly Proteobacteria (not surprising)
  No between-group connections outside Proteobacteria at this threshold
  Acidithiobacillus as hub rather than part of gene-exchange community?
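One plausible reading of the thresholds on the phylogenetic-profile slide is that two groups are drawn as connected when at least 30 gene pairs between them match below the e-value cutoff. The sketch below implements that reading only; the input layout, the group names, and the function `group_connections` are hypothetical, and the talk may have built its profiles differently.

```python
from collections import Counter

def group_connections(hits, max_evalue=1e-50, min_connections=30):
    """Count strong gene matches between pairs of groups and keep well-supported pairs.

    hits: iterable of (group_a, group_b, evalue), one entry per matching gene pair.
    Returns {frozenset({group_a, group_b}): count} for counts >= min_connections.
    """
    counts = Counter()
    for group_a, group_b, evalue in hits:
        if evalue < max_evalue and group_a != group_b:
            counts[frozenset((group_a, group_b))] += 1
    return {pair: n for pair, n in counts.items() if n >= min_connections}

# Synthetic matches; group names and e-values are illustrative only.
hits = (
    [("GroupA", "GroupB", 1e-60)] * 30    # strong and frequent: kept
    + [("GroupA", "GroupC", 1e-60)] * 5   # strong but too few: dropped
    + [("GroupA", "GroupD", 1e-10)] * 40  # frequent but too weak: dropped
)

edges = group_connections(hits)
```

Using an unordered pair as the key means a match from GroupA to GroupB and one from GroupB to GroupA count toward the same edge.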


Mutual information-based network
(do groups co-occur > random?)

[Figure labels: Acidithiobacillus, Gammaproteobacteria, Alpha/Betaproteobacteria]
Key observations
  Connections mostly predictable by phylogeny
  Again, no interesting partners outside of Proteobacteria
  However, many connections between Alpha/Betaproteobacteria
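The co-occurrence question behind the mutual-information network can be made concrete with a small sketch. Each group is represented by a presence/absence vector over the same set of genes (1 if the gene has a match in that group), and the mutual information between two vectors measures how far their co-occurrence departs from independence. The vectors below are invented, and the talk does not state how its network was actually computed or thresholded.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete vectors."""
    n = len(x)
    joint = Counter(zip(x, y))
    px = Counter(x)
    py = Counter(y)
    mi = 0.0
    for (a, b), count in joint.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((px[a] / n) * (py[b] / n)))
    return mi

# Hypothetical presence/absence profiles over four genes.
group_1 = [1, 1, 0, 0]
group_2 = [1, 1, 0, 0]   # always co-occurs with group_1
group_3 = [1, 0, 1, 0]   # independent of group_1

mi_linked = mutual_information(group_1, group_2)
mi_independent = mutual_information(group_1, group_3)
```

In practice the observed value would be compared against values from shuffled profiles to decide whether two groups co-occur more often than expected at random.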

Phosphate ABC transporters
  gi 343775109 periplasmic (e-value < 1e-100)
  gi 343775110 inner membrane subunit PstC (e-value < 1e-50)
  gi 343775111 inner membrane subunit PstA (e-value < 1e-50)


Recurrent grouping of
  Acidithiobacillus (Gamma)
  Acidobacterium
  Thiobacillus (Beta)
  Defluviimonas (Alpha)
  Salinisphaera (Gamma)

Genome 2: Terriglobus roseus
Eichorst et al. (2007) IJSEM
  Group 1 acidobacterium
  Preferred pH: ~6
  Aerobic
  Catalase, carotenoids for defense against reactive oxygen
  Oligotrophic; can grow on a wide range of carbon sources

4245 protein-coding genes (2735 with nr matches, 558 species-specific)

Rousk et al.

Eichorst et al.

Best nonself matches (183 non-Acidobacteria)
  Multidrug resistance / cation efflux / prophage

Best matches outside Terriglobus

Phylogenetic profiles
  Profiles are wider and more diverse for Terriglobus than for Acidithiobacillus

LPS O-antigen biosynthesis
  gi 390412425 CDP-glucose 4,6-dehydratase
  gi 390412426 glucose-1-phosphate cytidylyltransferase
  gi 390412427 LPS biosynthesis protein

Distribution:
  Acidithiobacillus
  Acidobacteria
  Other Proteobacteria
  Other



Contrasting Acidithiobacillus vs Terriglobus relationships:
  Same partners, different dance
  Compare profiles vs Streptomycetaceae (five strains found in sample gOTU)

[Figure: comparison with regions labeled Acidithiobacillus, Common, Terriglobus. Items shown:]
  Polyphosphate kinase
  Glucose-6-phosphate 1-dehydrogenase
  Carbon monoxide dehydrogenase
  More glycolytic enzymes
  Heavy-metal resistance / export
  Multidrug resistance
  Ammonium transporter
  Catalase / peroxidase
  Exopolysaccharide

Conclusions
Different layers of LGT:
  Very recent: mostly mobile elements (proxies unsuitable)
  Less recent (outside species / genus) (proxies potentially more justifiable)
  Taxonomy is a pain

What's the story with gene metacommunities?
  Lots of LGT!
  Recurrent patterns of sharing among groups not evident
  Metacommunities at the pan-genome level?
  Need many isolate genomes from single samples

Technical impacts of LGT and gene metacommunities
Metagenomic read assignment
  Recently acquired genes will still look like they belong in the donor
  These are some of the most interesting genes!!

Functional prediction (e.g., PICRUSt)
  Phylogeny will fail to accurately predict the distribution of these genes. Be very careful with extreme or poorly characterized samples!

Phylogenetic beta diversity may be misleading

Key questions in LGT and gene metacommunities
Are gene-sharing networks:
  Random?
  Driven by shared location / habitat?
  Constrained by phylogenetic relatedness?

Are shared genes:
  Neutral or adaptive?
  Driven by specific types of mobile element?