microbial phylogenomics (eve161) class 17: genomes from uncultured

71

Upload: jonathan-eisen

Post on 16-Jan-2017

207 views

Category:

Science


0 download

TRANSCRIPT

Page 1: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 2: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 3: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

What was Sampled?

Page 4: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

What was Sampled?

Page 5: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 6: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

How to Sample?

• Sample “natural” system or modified system?

Page 7: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 8: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 9: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Sampling?

Page 10: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

What Obtained from Samples?

• What did they collect from the samples and why?

Page 11: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

What Obtained from Samples?

Page 12: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on What Done with Samples?

Page 13: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Binning

• Why bin metagenomies?

• How did they bin metagenomes?

• How did they test their bins?

Page 14: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Binning

Page 15: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 16: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

How do ESOMs work?

Page 17: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 18: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 19: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 20: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Self Organized MapsTo analyze the distribution of genome signatures among and between populations, all contigs and assembled genomes were fragmented into 5-kb pieces, then pooled and clustered by self-organizing map (SOM) [58] based on tetranucleotide frequency distributions (Figure 1; see Materials and methods for details). The SOM is an unsupervised neural network algorithm that clusters multidimensional data and represents it on a two-dimensional map. SOMs of tetranucleotide frequencies have been used previously to successfully bin sequence fragments from isolate genomes [33, 59] and some environmental samples [46, 48, 52]. We utilized an implementation of the SOM, emergent SOM (ESOM), which is distinguished by its use of large borderless maps (for example, thousands of neurons) and visualization of underlying distance structure with background topography [60]. This visualization, where map 'elevation' represents the distance in tetranucleotide frequency between data points, is referred to as the U-Matrix [60]. Thus, genomic clusters were visualized not only by the cohesive clustering of fragments from each genome, but also by distance structure whereby barriers between clusters represent the large differences in genome signatures between genomes relative to those within genomes (Figure 3). This visualization of genomic clustering was used to evaluate the accuracy of the binning based on assembled genomes and to identify novel regions of sequence signature space.

Page 21: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

ESOMsContigs were clustered by tetranucleotide frequency utilizing Databionics ESOM Tools [94]. The input for tetra-ESOM was a 136-dimensional vector (representing the frequencies of the 136 unique reverse complement tetranucleotide pairs, normalized for contig length) for each contig/window. These raw frequencies were transformed with the 'Robust ZT' option built into Databionics ESOM Tools, which normalizes the data using robust estimates of mean and variance. Data were permuted before each run to avoid errors due to sampling order. Maps were toroidal (borderless) with Euclidean grid distance and dimensions scaled from the default map size (50 × 82) as a function of the number of data points, to a ratio of approximately 5.5 map nodes per data point. For example, a typical clustering with approximately 7,500 data points was run on map with dimensions 155 × 255. Training was conducted with the K-Batch algorithm (k = 0.15%) for 20 training epochs. The standard best match search method was used with local best match search radius of 8. Other training parameters were as follows: Gaussian weight initialization method; Euclidean data space function; starting value for training radius of 50 with linear cooling to 1; starting value for learning rate of 0.5 with linear cooling to 0.1; Gaussian kernel function.

Page 22: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Binning?

Page 23: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Binning

Page 24: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

CPR

Page 25: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

How did they name new phyla?

Page 26: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

CPR

Page 27: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

CPR

Page 28: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 29: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

CPR Genomes

Page 30: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Phyla and Genomes?

Page 31: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

CPR Size

• Why does size matter here?

Page 32: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

CPR are Small

Page 33: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Size?

Page 34: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Unusual rRNAs 1

• Why so much focus on unusual rRNAs?

Page 35: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Unusual rRNAs 1

Page 36: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Insertions Diverse

Page 37: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 38: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Unusual rRNAs 1

Page 39: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Insertion Sites

Page 40: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Unusual rRNAs 2

Page 41: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Insertion Locations by Phyla

Page 42: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 43: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Unusual rRNAs 2

Page 44: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

rRNA Copy = 1 in CPR

Page 45: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on rRNAs?

Page 46: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Metatranscriptomes and rRNAs

• What is metatranscriptomics?

• Why do it?

Page 47: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Metatranscriptomes and rRNAs

Page 48: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured
Page 49: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Metatranscriptomes?

Page 50: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

rRNA insertions

Page 51: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

23S rRNA insertions

Page 52: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Metagenomics and rRNA

• Why examine rRNAs in metagenomic data not PCR data?

• What can you learn from this?

Page 53: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Metagenomics and rRNA

Page 54: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Universal Primers?a, b, PrimerProspector was used to assess the ability of primers 515F and 806R to bind a non-redundant set of assembled near-complete 16S rRNA gene sequences (clustered at 97% sequence identity). The percentage of sequences that would be amplified by these primers is shown on the left axis, the total number of sequences analysed is on the top of each bar, and the number of sequences these primers would not bind to is indicated by the shading. Many assembled groundwater-associated 16S rRNA gene sequences would evade amplification by PCR primers 515F and 806R. Results of the analysis are shown at the domain (a) and superphylum or phylum (b) levels.

Page 55: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Primers?

Page 56: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Phylogeny

• How infer phylogeny of CPR?

Page 57: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Phylogeny

Page 58: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

CPR

Page 59: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Phylogeny?

Page 60: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Phyla

• How many phyla are there?

• Why does it matter?

• How determine # of phyla

Page 61: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Phyla

Page 62: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Phyla?

Page 63: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Novel Ribosomes

• Why examine ribosomes (beyond rRNA) in these organisms?

• What do the findings mean?

Page 64: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Novel Ribosomes

Page 65: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Assembled genomes were analysed using ggKbase (Supplementary Data 4). Shown here is a non-redundant set of complete and near-complete genomes (≥75% of single copy genes, ≤1.125 copies) organized based on a subset of a maximum-likelihood 16S rRNA gene phylogeny (Supplementary Fig. 1). CPR organisms have partial tricarboxylic acid (TCA) cycles and lack electron transport chain (ETC) complexes. In addition, they have incomplete biosynthetic pathways for nucleotides and amino acids. The Peregrinibacteria are a notable exception to some of these limitations. Several Parcubacteria exhibit a complete ubiquinol (cytochrome bo) oxidase operon, as previously seen in Saccharibacteria3. However, lack of NADH dehydrogenase and other ETC components suggests that this enzyme is involved in oxygen scavenging/detoxification rather than energy production. AA Syn., amino acid synthesis; PP, pentose phosphate pathway.

Page 66: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Novel Ribosomes 2

Page 67: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Ribosomes?

Page 68: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Novel Metabolism

• Why examine metabolism in these organisms?

Page 69: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Novel Metabolism

Page 70: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Assembled genomes were analysed using ggKbase (Supplementary Data 4). Shown here is a non-redundant set of complete and near-complete genomes (≥75% of single copy genes, ≤1.125 copies) organized based on a subset of a maximum-likelihood 16S rRNA gene phylogeny (Supplementary Fig. 1). CPR organisms have partial tricarboxylic acid (TCA) cycles and lack electron transport chain (ETC) complexes. In addition, they have incomplete biosynthetic pathways for nucleotides and amino acids. The Peregrinibacteria are a notable exception to some of these limitations. Several Parcubacteria exhibit a complete ubiquinol (cytochrome bo) oxidase operon, as previously seen in Saccharibacteria3. However, lack of NADH dehydrogenase and other ETC components suggests that this enzyme is involved in oxygen scavenging/detoxification rather than energy production. AA Syn., amino acid synthesis; PP, pentose phosphate pathway.

Page 71: Microbial Phylogenomics (EVE161) Class 17: Genomes from Uncultured

Comments on Metabolism?