esa 2014 qiime
TRANSCRIPT
![Page 1: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/1.jpg)
Community Profiling via QIIME
Dorota Porazinska and Zech Xu
University of Colorado Boulder, CO
![Page 2: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/2.jpg)
File Download
• View slides at:
– http://goo.gl/4duXII
• Raw files:
– https://app.box.com/s/kwzjd1go2g8cmic59xcd
– Extract it: tar zxf crawford_mice.tar.gz
• View IPython Notebook
– http://nbviewer.ipython.org/gist/RNAer/d8e7cbd7b68a273d2269
– Also inside the downloaded files (requires IPython to open it)
• Processed file:
– https://app.box.com/s/3a6gvuyn8crjamx7uqte
– Run: mv output.tar.gz crawford_mice and then tar zxf output.tar.gz
![Page 3: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/3.jpg)
Sequencing cost getting cheaper
http://goo.gl/rWW1Ay
![Page 4: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/4.jpg)
Tsunami of sequence data
???
![Page 5: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/5.jpg)
1st vs. NGS technologies
http://www.patrickwardphd.com/wp-content/uploads/2012/05/sprinkler-kids-l.jpg
http://1000awesomethings.com/2011/06/21/218-drinking-from-the-hose/
![Page 6: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/6.jpg)
A classic microbial ecology study
![Page 11: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/11.jpg)
A classic microbial ecology study
Bacterial Community Variation in Human Body Habitats Across Space and Time, Costello et al., Science 2009
Modified from Hamady et al. Genome Research. 2009
![Page 12: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/12.jpg)
Datasets with billions of sequences:
• Human Microbiome Project: Largest characterization of the microbiome of healthy individuals
– NIH sponsored, $185 million project
– Samples from 300 adults and 18 body sites
– Raw data: ~232 GB
![Page 13: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/13.jpg)
Earth Microbiome Project
![Page 14: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/14.jpg)
![Page 15: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/15.jpg)
Coursera Course
https://www.coursera.org/course/microbiome
![Page 16: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/16.jpg)
… accumulating data
Healthy individual traveling from the US to Bangladesh
![Page 17: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/17.jpg)
Healthy individual traveling from the US to Bangladesh
Relatives of Crohn's disease patients
![Page 18: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/18.jpg)
Healthy individual traveling from the US to Bangladesh
Relatives of Crohn's disease patients
Patients
![Page 19: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/19.jpg)
… so what can we tell from all this work?
Healthy individual traveling from the US to Bangladesh
Relatives of Crohn's disease patients
Patients
USA Global gut
![Page 20: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/20.jpg)
Healthy individual traveling from the US to Bangladesh
Relatives of Crohn's disease patients
Patients
USA Venezuela Malawi
Global gut
![Page 21: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/21.jpg)
Healthy individual traveling from the US to Bangladesh
Relatives of Crohn's disease patients
Patients
USA Venezuela Malawi
Global gut
HMP
![Page 22: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/22.jpg)
… so what can we tell from all this work?
Healthy individual traveling from the US to Bangladesh
Relatives of Crohn's disease patients
Patients
USA Venezuela Malawi
Global gut
HMP
![Page 24: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/24.jpg)
http://qiime.org http://forum.qiime.org http://blog.qiime.org
![Page 25: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/25.jpg)
Graphical User Interface
![Page 26: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/26.jpg)
Command line
![Page 27: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/27.jpg)
Perform identical operations
![Page 28: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/28.jpg)
Paths (absolute)
/Users/yoshiki/evident-data/hmp-v13_arare/alpha_div
$HOME/evident-data/hmp-v13_arare/alpha_div
~/evident-data/hmp-v13/alpha_div
A slash at the beginning of a path marks it as an absolute path, i.e. one that starts from the root of the filesystem. $HOME and ~ expand to the absolute path of your home directory.
![Page 29: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/29.jpg)
Paths (relative)
evident-data/hmp-v13_arare/alpha_div
Relative paths, on the other hand, are not preceded by a slash; they are resolved starting from the current working directory.
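The difference between the two kinds of path can be checked with a few lines of Python. This is only a sketch: it assumes a POSIX-style system (macOS or Linux), and the directory names are the illustrative ones from the slides, not paths that need to exist.

```python
import os.path

# Illustrative paths modeled on the slide examples; they do not need to exist.
absolute = "/Users/yoshiki/evident-data/hmp-v13_arare/alpha_div"
relative = "evident-data/hmp-v13_arare/alpha_div"

print(os.path.isabs(absolute))  # leading slash: absolute
print(os.path.isabs(relative))  # no leading slash: relative

# A relative path only becomes a full location once it is joined to a
# starting directory (normally the current working directory).
resolved = os.path.normpath(os.path.join("/Users/yoshiki", relative))
print(resolved)

# "~" is shorthand that is expanded to the absolute home directory.
print(os.path.expanduser("~/evident-data"))
```

The same relative path can therefore point at different places depending on the directory a command is run from, which is a common source of "file not found" errors.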
![Page 30: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/30.jpg)
QIIME
![Page 31: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/31.jpg)
QIIME Structure
● Integrates other software
● Set of scripts to perform certain functions
● Allows an easy workflow
● Keys, wallet, phone: print_qiime_config.py
![Page 32: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/32.jpg)
QIIME software dependencies
[data-lanemask] [data-core] [python] [setuptools] [MySQL-python] [SQLAlchemy] [pycogent] [pynast] [numpy] [matplotlib] [mpi4py] [lxml] [sphinx] [raxml] [fasttree] [cdbtools] [chimeraslayer] [cdhit] [rdpclassifier] [blast] [muscle] [infernal] [cytoscape] [clearcut] [mothur] [uclust] [r] [ampliconnoise] [vienna] [pprospector]
![Page 33: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/33.jpg)
Script types
Single Task
• One step
• Most of them
Workflows
• Multiple scripts in one
• Uses a log file
• Indicated in the script description
![Page 34: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/34.jpg)
QIIME commands
Get help with the index site: http://qiime.org/genindex.html
Get help with the -h option: pick_otus.py -h
Command names are self-explanatory
• Filtering: filter_fasta.py, filter_otus_by_sample.py, filter_distance_matrix.py
• Sorting: sort_otu_table.py
![Page 35: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/35.jpg)
Getting help
http://qiime.org/genindex.html
![Page 36: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/36.jpg)
These options are required; without them the script will not run.
These options are optional: you can use them or leave them out. Their default values are explained here.
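The required/optional distinction in the help output can be illustrated with a small stand-alone script. This is a sketch using Python's standard argparse module, not QIIME's own option-handling code; the option names -i/--input_fp and -o/--output_dir are assumptions chosen to resemble typical QIIME options.

```python
import argparse


def build_parser():
    """Build a toy command-line parser with one required and one optional option."""
    parser = argparse.ArgumentParser(description="Toy script, not a real QIIME script")
    # Required: the parser exits with an error message if this is missing.
    parser.add_argument("-i", "--input_fp", required=True,
                        help="input file path [REQUIRED]")
    # Optional: falls back to the default when not given.
    parser.add_argument("-o", "--output_dir", default="out",
                        help="output directory [default: %(default)s]")
    return parser


# Simulate running: toy_script.py -i seqs.fna
args = build_parser().parse_args(["-i", "seqs.fna"])
print(args.input_fp, args.output_dir)
```

Running such a script with -h prints both groups of options together with their defaults, which is the same information the QIIME help screens show.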
![Page 37: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/37.jpg)
QIIME
• The code is tested (properly)
• The documentation is updated constantly based on users' suggestions
• The help in the QIIME forum has a collaborative spirit (developers & users sharing their research experiences)
![Page 38: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/38.jpg)
print_qiime_config.py
![Page 39: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/39.jpg)
QIIME
![Page 40: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/40.jpg)
QIIME workflow overview (www.QIIME.org)
Upstream analyses (currently supported for marker-gene data only)
• Inputs
– Sequencing output (454, Illumina, Sanger): fastq, fasta, qual, or sff/trace files
– Metadata: mapping file
• Pre-processing: e.g., remove primer(s), demultiplex, quality filter
• Denoise 454 data: PyroNoise, Denoiser
• Pick OTUs and representative sequences
– Reference based: BLAST, UCLUST, USEARCH
– De novo: e.g., UCLUST, CD-HIT, MOTHUR, USEARCH
• Assign taxonomy: BLAST, RDP Classifier
• Align sequences: e.g., PyNAST, INFERNAL, MUSCLE, MAFFT
• Build phylogenetic tree: e.g., FastTree, RAxML, ClearCut
• Build 'OTU table', i.e., sample by observation matrix
• Database submission (in development)
Downstream analyses (currently supported for general sample by observation data)
• Inputs
– OTU (or other sample by observation) table
– Phylogenetic tree: evolutionary relationship between OTUs
• α-diversity and rarefaction: e.g., Phylogenetic Diversity, Chao1, Observed Species
• β-diversity and rarefaction: e.g., Weighted and unweighted UniFrac, Bray-Curtis, Jaccard
• Interactive visualizations: e.g., PCoA plots, distance histograms, taxonomy charts, rarefaction plots, network visualization, jackknifed hierarchical clustering
Legend: required step or input; optional step or input
![Page 41: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/41.jpg)
QC and split libraries
![Page 42: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/42.jpg)
Building an OTU table
![Page 43: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/43.jpg)
Alpha and Beta diversity
![Page 44: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/44.jpg)
Visualizations
![Page 45: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/45.jpg)
QC and split libraries
![Page 46: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/46.jpg)
Data
Sequences are in FASTA format
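As a rough illustration of the FASTA layout (a header line starting with ">" followed by one or more sequence lines), here is a minimal parser sketch. It is not QIIME's parser, and the example records use shortened sequences under the sample identifiers shown elsewhere in these slides.

```python
def parse_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA-formatted lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            # Sequence data may be wrapped over several lines.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


# Shortened, illustrative records.
example = [
    ">PC.634_1 FLP3FBN01ELBSX",
    "CTGGGCCGTGTCTCAG",
    "TCCCAATGTGGCCGTT",
    ">PC.354_3 FLP3FBN01EEWKD",
    "TTGGACCGTGTCTCAG",
]
records = list(parse_fasta(example))
for name, seq in records:
    print(name, len(seq))
```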
![Page 47: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/47.jpg)
Data
• Quality scores are in the .qual file, similar to FASTA
![Page 48: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/48.jpg)
Metadata (mapping file)
![Page 49: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/49.jpg)
validate_mapping_file.py
![Page 50: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/50.jpg)
Split libraries
• Demultiplex
• Quality trim
• Quality filter
split_libraries.py http://qiime.org/scripts/split_libraries.html
Output files:
• seqs.fna – demultiplexed sequences
• histograms.txt – histogram of read lengths
• split_library_log.txt – detailed information about the demultiplexing and quality of reads
![Page 51: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/51.jpg)
Error-correcting codes allow multiplex sequencing
Micah Hamady, et al., Nature Methods, 2008. Error-correcting barcodes for pyrosequencing hundreds of samples in multiplex.
>GCACCTGAGGACAGGCATGAGGAA…
>GCACCTGAGGACAGGGGAGGAGGA…
>TCACATGAACCTAGGCAGGACGAA…
>CTACCGGAGGACAGGCATGAGGAT…
>TCACATGAACCTAGGCAGGAGGAA…
>GCACCTGAGGACACGCAGGACGAC…
>CTACCGGAGGACAGGCAGGAGGAA…
>CTACCGGAGGACACACAGGAGGAA…
>GAACCTTCACATAGGCAGGAGGAT…
>TCACATGAACCTAGGGGCAAGGAA…
>GCACCTGAGGACAGGCAGGAGGAA…
>PC.634_1 FLP3FBN01ELBSX
CTGGGCCGTGTCTCAGTCCCAATGTGGCCGTTTACCCTCTCAGGCCGGCTACGCATCATCGCCTTGGTGGGCCGTTACCTCACCAACTAGCTAATGCGCCGCAGGTCCATCCATGTTCACGCCTTGATGGGCGCTTTAATATACTGAGCATGCGCTCTGTATACCTATCCGGTTTTAGCTACCGTTTCCAGCAGTTATCCCGGACACATGGGCTAGG
>PC.354_3 FLP3FBN01EEWKD
TTGGACCGTGTCTCAGTTCCAATGTGGGGGCCTTCCTCTCAGAACCCCTATCCATCGAAGGCTTGGTGGGCCGTTACCCCGCCAACAACCTAATGGAACGCATCCCCATCGATGACCGAAGTTCTTTAATAGTTCTACCATGCGGAAGAACTATGCCATCGGGTATTAATCTTTCTTTCGAAAGGCTATCCCCGAGTCATCGGCAGGTTGGATACGTGTTACTCACCCGTGCGCCGGT
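The idea behind barcode-based demultiplexing can be sketched as follows: each read starts with a barcode, and the read is assigned to the sample whose barcode is closest, tolerating a small number of sequencing errors. This is a simplified nearest-barcode lookup, not the decoding scheme of Hamady et al. and not what split_libraries.py does internally; the barcodes, the sample mapping, and the max_errors parameter are all hypothetical.

```python
def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(read, barcodes, max_errors=1):
    """Return (sample_id, read_without_barcode), or (None, read) if unassignable.

    barcodes maps barcode sequence -> sample id; all barcodes share one length.
    """
    barcode_length = len(next(iter(barcodes)))
    prefix = read[:barcode_length]
    if len(prefix) < barcode_length:
        return None, read
    scored = sorted((hamming(prefix, bc), bc) for bc in barcodes)
    best_distance, best_barcode = scored[0]
    # A tie between two barcodes cannot be resolved, so the read is rejected.
    ambiguous = len(scored) > 1 and scored[1][0] == best_distance
    if best_distance > max_errors or ambiguous:
        return None, read
    return barcodes[best_barcode], read[barcode_length:]


# Hypothetical barcodes that differ from each other at every position.
barcodes = {"AACCGG": "PC.634", "TTGGCC": "PC.354"}
one_error = assign_sample("AACCGTCTGGGCC", barcodes)
no_match = assign_sample("GGGGGGACGT", barcodes)
print(one_error, no_match)
```

Barcodes designed to be far apart from one another are what make this tolerance safe: a single error still leaves the read much closer to its true barcode than to any other.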
![Page 52: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/52.jpg)
split_libraries.py
• seqs.fna – demultiplexed sequences
![Page 53: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/53.jpg)
Building an OTU table
![Page 54: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/54.jpg)
OTU Picking - “de-novo”
• Pros
– Vast majority of reads are clustered
– No reference database bias
• Cons
– Speed; not easily parallelizable
– Erroneous reads get clustered
Figure: Experimental Sequences are passed to a Clustering Algorithm, which produces Clustered Sequences grouped into OTUs (OTU1, OTU2, OTU3).
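De novo picking can be sketched as a greedy clustering loop: each read joins the first existing cluster whose seed it matches closely enough, and otherwise founds a new OTU. This is an illustrative simplification rather than the UCLUST algorithm; the position-by-position identity function and the 0.97 threshold are assumptions, and the reads are the short example sequences from the slide.

```python
def identity(a, b):
    """Fraction of matching positions (no alignment; a deliberate simplification)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def cluster_de_novo(seqs, threshold=0.97):
    """Greedy clustering: return a list of OTUs, each a list of member reads."""
    centroids = []  # the first read of each OTU acts as its seed
    otus = []
    for seq in seqs:
        for index, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                otus[index].append(seq)
                break
        else:
            # No existing OTU is similar enough: start a new one.
            centroids.append(seq)
            otus.append([seq])
    return otus


reads = [
    "CTGGGCCGTGTCTCAGTCCCAA",
    "TTGGAAGATGTCTCAGTTCCAG",
    "TTGGGCCGTATGTCAGTCCCTA",
] * 2
otus = cluster_de_novo(reads)
print(len(otus), [len(members) for members in otus])
```

Because every read is compared against the growing list of seeds, the work cannot be split cleanly across processors, which is the speed drawback listed above. An erroneous read that matches nothing also becomes its own OTU.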
![Page 55: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/55.jpg)
OTU Picking - “closed-reference”
• Pros
– Reference database is a quality filter
– Speed; easily parallelizable
• Cons
– No new OTUs can be observed
– Reference database bias
Figure: Experimental Sequences are searched against Reference Sequences. Sequences that hit a reference form the OTUs; sequences that failed to hit are left out.
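The closed-reference strategy can be sketched in the same simplified terms: every read is compared only against the reference sequences, and reads without a good enough hit are set aside. The identity function, the threshold, and the one-entry reference collection with the id ref1 are assumptions for illustration only.

```python
def identity(a, b):
    """Fraction of matching positions (no alignment; a deliberate simplification)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def pick_closed_reference(seqs, reference, threshold=0.97):
    """Return (otu_map, failures); otu_map maps reference id -> list of reads."""
    otu_map = {}
    failures = []
    for seq in seqs:
        best_id, best_score = None, 0.0
        for ref_id, ref_seq in reference.items():
            score = identity(seq, ref_seq)
            if score > best_score:
                best_id, best_score = ref_id, score
        if best_id is not None and best_score >= threshold:
            otu_map.setdefault(best_id, []).append(seq)
        else:
            failures.append(seq)  # no new OTU is created for these reads
    return otu_map, failures


reference = {"ref1": "CTGGGCCGTGTCTCAGTCCCAA"}  # hypothetical reference collection
reads = [
    "CTGGGCCGTGTCTCAGTCCCAA",
    "TTGGAAGATGTCTCAGTTCCAG",
    "TTGGGCCGTATGTCAGTCCCTA",
]
otu_map, failures = pick_closed_reference(reads, reference)
print(otu_map, failures)
```

Each read is handled independently of the others, so the reads can be split across many processors, which is why this strategy parallelizes easily.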
![Page 56: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/56.jpg)
Reference database
![Page 57: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/57.jpg)
Percentage of reads that do not hit the reference collection, by environment type.
![Page 58: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/58.jpg)
Other databases
• http://www.arb-silva.de http://qiime.org/home_static/dataFiles.html
• http://ssu-rrna.org
![Page 59: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/59.jpg)
OTU Picking - “open-reference”
• Pros
– Best of both worlds
• Cons
– Downsides of de-novo
Figure: Experimental Sequences are searched against Reference Sequences. Sequences that hit a reference form OTUs; sequences that failed to hit are passed to a Clustering Algorithm and form additional OTUs (OTU1 to OTU6 in the figure).
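Open-reference picking combines the two strategies: reads are first matched against the reference, and the reads that fail are clustered among themselves. The sketch below does this in a single pass by checking reference sequences first and then the seeds of newly created OTUs. It is a simplification of what pick_open_reference_otus.py does; the identity function, the threshold, and the ids ref1 and New.N are hypothetical.

```python
def identity(a, b):
    """Fraction of matching positions (no alignment; a deliberate simplification)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def pick_open_reference(seqs, reference, threshold=0.97):
    """Return {otu_id: [reads]} using reference OTUs first, then new OTUs."""
    centroids = dict(reference)  # reference entries come first in the search order
    otus = {}
    new_count = 0
    for seq in seqs:
        for otu_id, centroid in centroids.items():
            if identity(seq, centroid) >= threshold:
                otus.setdefault(otu_id, []).append(seq)
                break
        else:
            # The read hit neither the reference nor an existing new OTU.
            new_count += 1
            otu_id = "New.%d" % new_count
            centroids[otu_id] = seq
            otus[otu_id] = [seq]
    return otus


reference = {"ref1": "CTGGGCCGTGTCTCAGTCCCAA"}  # hypothetical reference collection
reads = [
    "CTGGGCCGTGTCTCAGTCCCAA",
    "TTGGAAGATGTCTCAGTTCCAG",
    "TTGGGCCGTATGTCAGTCCCTA",
] * 2
open_otus = pick_open_reference(reads, reference)
print({otu_id: len(members) for otu_id, members in open_otus.items()})
```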
![Page 60: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/60.jpg)
pick_open_reference_otus.py
• http://qiime.org/scripts/pick_open_reference_otus.html
• Workflow script, performs all steps through building an OTU table (see the log file)
– pick_otus.py: determine the OTU clusters
– pick_rep_set.py: pick the representative sequence for each OTU cluster
– align_seqs.py: align the sequences to a template or other reference alignment
– assign_taxonomy.py: assign a taxonomy to the representative sequences
– filter_alignment.py: remove positions that are not phylogenetically informative
– make_phylogeny.py: construct a phylogeny from an alignment
– make_otu_table.py: construct the actual OTU table object
![Page 61: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/61.jpg)
QIIME parameters
• http://qiime.org/documentation/qiime_parameters_files.html
• Modify the default behavior of a workflow script.
• Blank lines and those starting with '#' are ignored
• Format
– script:parameter value
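To make the format concrete, the following sketch reads lines of the form script:parameter value into a nested dictionary, skipping blank lines and comments. It is not QIIME's own parser, and the two parameter lines are illustrative entries rather than recommended settings.

```python
def parse_parameters(lines):
    """Parse 'script:parameter value' lines into {script: {parameter: value}}."""
    params = {}
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are ignored
        fields = line.split(None, 1)  # split on the first run of whitespace
        script, _, parameter = fields[0].partition(":")
        value = fields[1] if len(fields) > 1 else None
        params.setdefault(script, {})[parameter] = value
    return params


# Illustrative parameters file content.
example = [
    "# OTU picking",
    "pick_otus:similarity 0.97",
    "",
    "beta_diversity:metrics bray_curtis,unweighted_unifrac",
]
params = parse_parameters(example)
print(params)
```

Each workflow step only looks up the entries stored under its own script name, so one parameters file can configure several steps at once.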
![Page 62: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/62.jpg)
OTU Table in BIOM format
• Optimized and efficient data abstraction
• Can be used with many types of data, but to make it Excel 'readable' use: biom convert
![Page 63: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/63.jpg)
biom convert
• http://biom-format.org
• Converts the BIOM format OTU table to an Excel readable (tab-delimited) format
• biom convert -i otu_table_mc2_w_tax.biom -o otu_table.txt -b
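What the conversion produces can be sketched with a hand-written table. The example assumes the JSON flavor of BIOM with a sparse matrix stored as [row, column, value] triples, and it includes only the handful of fields the sketch needs; a real BIOM file carries more metadata, so use biom convert for actual data. The sample and OTU names are illustrative.

```python
import json

# A trimmed-down, illustrative sparse table (not a complete BIOM file).
biom_text = json.dumps({
    "rows": [{"id": "OTU1"}, {"id": "OTU2"}],
    "columns": [{"id": "orange1"}, {"id": "blue1"}],
    "matrix_type": "sparse",
    "shape": [2, 2],
    "data": [[0, 0, 4], [1, 1, 7]],  # [row index, column index, count]
})


def biom_to_tsv(text):
    """Expand a sparse JSON table into tab-delimited text."""
    table = json.loads(text)
    n_rows, n_cols = table["shape"]
    dense = [[0] * n_cols for _ in range(n_rows)]
    for row, col, value in table["data"]:
        dense[row][col] = value
    header = ["#OTU ID"] + [column["id"] for column in table["columns"]]
    lines = ["\t".join(header)]
    for row_meta, counts in zip(table["rows"], dense):
        lines.append("\t".join([row_meta["id"]] + [str(c) for c in counts]))
    return "\n".join(lines)


tsv = biom_to_tsv(biom_text)
print(tsv)
```

Only non-zero counts are stored in the sparse form, which is why BIOM files stay compact for OTU tables that are mostly zeros.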
![Page 64: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/64.jpg)
OTU table sample identifiers
![Page 65: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/65.jpg)
Taxonomic Assignment
• Kingdom
• Phylum
• Class
• Order
• Family
• Genus
• Species
Sequence 16S gene and compare to 16S database with taxonomic assignments
![Page 66: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/66.jpg)
Taxonomic Assignment using e.g. UCLUST
Figure: Experimental Sequences are compared against Reference Sequences.
![Page 67: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/67.jpg)
Biom summary
• Basic statistics on the OTU table
– Number of samples, OTUs, sequences in OTUs
– Sequences per sample
– Useful to determine values to use in downstream analyses
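The statistics listed above are simple sums over the table. The sketch below computes them for a small table stored as a nested dictionary; the data layout and the function name are assumptions for illustration, and the counts are the toy values from the beta diversity slide.

```python
def summarize(table):
    """table maps OTU id -> {sample id: count}; returns basic summary statistics."""
    per_sample = {}
    for counts in table.values():
        for sample, count in counts.items():
            per_sample[sample] = per_sample.get(sample, 0) + count
    return {
        "num_samples": len(per_sample),
        "num_otus": len(table),
        "total_count": sum(per_sample.values()),
        "min_per_sample": min(per_sample.values()),
        "counts_per_sample": per_sample,
    }


table = {
    "OTU1": {"orange1": 4, "orange2": 4, "blue1": 0},
    "OTU2": {"orange1": 4, "orange2": 4, "blue1": 0},
    "OTU3": {"orange1": 0, "orange2": 1, "blue1": 7},
    "OTU4": {"orange1": 0, "orange2": 0, "blue1": 7},
}
summary = summarize(table)
print(summary)
```

The smallest per-sample count is the value most often needed downstream, since it bounds the even sampling depth that keeps every sample in the analysis.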
![Page 68: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/68.jpg)
Alpha and Beta diversity
![Page 69: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/69.jpg)
How do we describe and compare diversity?
• α Diversity:
– “How many species (taxa) are in a sample?”
• e.g. 6 colors in A and 6 in B
• Are polluted environments less diverse than pristine ones?
• β Diversity:
– “How many species are shared between samples?”
• e.g. 2 shared colors between A and B
• Do the microbiota differ among different disease states?
Figure: example communities A and B.
![Page 70: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/70.jpg)
Qualitative vs. Quantitative measures
• Qualitative: Considers presence/absence
– α: How many species are in a sample?
• e.g.: 6 species (colors) in both A and B.
– β: How many species are shared between samples?
• e.g.: A and B are identical because the same colors are present in both.
• Quantitative: Considers abundance
– α: Accounts for distribution:
• e.g. in B, 6 species are evenly distributed, so the community is more diverse than A, where 1 species dominates the other 5.
– β: Samples are considered more similar if their species abundance distributions are similar.
• e.g. B and A no longer look identical because of differences in abundance.
Figure: example communities A and B.
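The contrast can be made concrete with numbers. In the sketch below, the counts for A and B are hypothetical values chosen to match the description (six species each, one dominant species in A, an even spread in B). Observed species and Jaccard distance are qualitative measures; the Shannon index (natural logarithm here, which is a choice of this sketch) and Bray-Curtis dissimilarity are quantitative ones.

```python
import math

# Hypothetical counts: same six species, different abundance distributions.
community_a = [20, 2, 2, 2, 2, 2]
community_b = [5, 5, 5, 5, 5, 5]


def observed_species(counts):
    """Qualitative alpha diversity: number of species present."""
    return sum(1 for c in counts if c > 0)


def shannon(counts):
    """Quantitative alpha diversity: Shannon index (natural log)."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def jaccard_distance(a, b):
    """Qualitative beta diversity on presence/absence."""
    present_a = {i for i, c in enumerate(a) if c > 0}
    present_b = {i for i, c in enumerate(b) if c > 0}
    return 1 - len(present_a & present_b) / len(present_a | present_b)


def bray_curtis(a, b):
    """Quantitative beta diversity on abundances."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


print(observed_species(community_a), observed_species(community_b))
print(shannon(community_a), shannon(community_b))
print(jaccard_distance(community_a, community_b), bray_curtis(community_a, community_b))
```

With these counts the qualitative measures cannot tell the two communities apart, while the quantitative measures rate B as more diverse and the pair as clearly different.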
![Page 71: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/71.jpg)
What is a phylogenetic diversity measure?
• α Diversity:
– Taxon: “How many species are in a sample?”
– Phylogenetic: “How much phylogenetic divergence is in a sample?”
• e.g. B is more diverse than A: its colors are more divergent
• β Diversity:
– Taxon: “How many species are shared between samples?”
– Phylogenetic: “How much phylogenetic distance is shared between samples?”
• e.g. only related colors from B are in A
Figure: example communities A and B.
![Page 72: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/72.jpg)
UniFrac distance matrix
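Unweighted UniFrac measures the fraction of branch length in the tree that leads to OTUs from only one of the two samples. The sketch below assumes a toy tree with unit branch lengths, ((OTU1, OTU2), (OTU3, OTU4)), written out as a list of branches with the tips beneath each one. The tree, the sample contents, and this data layout are all illustrative assumptions, not QIIME's implementation.

```python
def unweighted_unifrac(branches, sample_a, sample_b):
    """Unique branch length divided by total branch length covered by either sample.

    branches: list of (length, set of tip OTU ids below that branch).
    sample_a, sample_b: sets of OTU ids present in each sample (not both empty).
    """
    unique = shared = 0.0
    for length, tips in branches:
        in_a = bool(tips & sample_a)
        in_b = bool(tips & sample_b)
        if in_a and in_b:
            shared += length
        elif in_a or in_b:
            unique += length
    return unique / (unique + shared)


# Toy tree ((OTU1, OTU2), (OTU3, OTU4)) with every branch of length 1.
branches = [
    (1.0, {"OTU1"}), (1.0, {"OTU2"}), (1.0, {"OTU1", "OTU2"}),
    (1.0, {"OTU3"}), (1.0, {"OTU4"}), (1.0, {"OTU3", "OTU4"}),
]
samples = {
    "orange1": {"OTU1", "OTU2"},
    "orange2": {"OTU1", "OTU2", "OTU3"},
    "blue1": {"OTU3", "OTU4"},
}
names = list(samples)
distance_matrix = {
    (p, q): unweighted_unifrac(branches, samples[p], samples[q])
    for p in names for q in names
}
for p in names:
    print(p, [round(distance_matrix[(p, q)], 3) for q in names])
```

The result is a symmetric sample-by-sample matrix with zeros on the diagonal, which is the input that principal coordinates analysis works from.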
![Page 73: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/73.jpg)
core_diversity_analyses.py
• Workflow script
– filter_samples_from_otu_table.py: filter samples with low sequence count from the table
– single_rarefaction.py: sample the table at a specified sequencing depth
– beta_diversity.py: use the sampled table for beta diversity calculation
– principal_coordinates.py: perform PCoA analysis
– make_emperor.py: make plots for principal coordinates
– multiple_rarefactions.py: make multiple subsamplings/rarefactions on an OTU table at various sequencing depths
– alpha_diversity.py and collate_alpha.py: calculate alpha diversities at those depths and collate them
– make_rarefaction_plots.py: plot the rarefaction curves
– summarize_taxa.py and plot_taxa_summary.py: summarize taxa and plot them
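The rarefaction step in this workflow amounts to drawing the same number of sequences at random, without replacement, from every sample so that samples become comparable. The sketch below shows that idea for one sample; the function name, the fixed random seed, and the counts are assumptions for illustration and do not reproduce single_rarefaction.py itself.

```python
import random


def rarefy(counts, depth, seed=0):
    """Subsample {otu: count} to exactly `depth` sequences without replacement.

    Returns None when the sample has fewer than `depth` sequences, mirroring
    the way samples with low sequence counts drop out of the analysis.
    """
    pool = [otu for otu, n in counts.items() for _ in range(n)]
    if len(pool) < depth:
        return None
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rarefied = {}
    for otu in rng.sample(pool, depth):
        rarefied[otu] = rarefied.get(otu, 0) + 1
    return rarefied


blue1 = {"OTU3": 7, "OTU4": 7}
rarefied = rarefy(blue1, 8)
too_deep = rarefy(blue1, 20)
print(rarefied, too_deep)
```

Repeating the draw at several depths and computing alpha diversity each time gives the points of a rarefaction curve.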
![Page 74: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/74.jpg)
Alpha diversity
Basic alpha diversity measure: count the number of OTUs. Other measures can be:
• phylogenetic (PD)
• estimators (chao1)
• other statistics (evenness)
• …
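As an example of an estimator, Chao1 adds to the observed OTU count a correction based on how many OTUs were seen exactly once (singletons) or exactly twice (doubletons). The sketch uses the bias-corrected form of the formula, which is an assumption about which variant is wanted, and the counts are made up.

```python
def chao1(counts):
    """Bias-corrected Chao1 richness estimate from a list of OTU counts."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    # Many singletons relative to doubletons suggest many unseen OTUs.
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))


print(chao1([10, 5, 2, 2, 1, 1, 1]))  # 7 observed, 3 singletons, 2 doubletons
print(chao1([4, 4, 4]))               # no rare OTUs: estimate equals observed
```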
![Page 75: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/75.jpg)
Beta diversity
        orange1  orange2  blue1
OTU1    4        4        0
OTU2    4        4        0
OTU3    0        1        7
OTU4    0        0        7
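One way to turn this table into beta diversity values is a pairwise dissimilarity between sample columns. The sketch below uses Bray-Curtis, one of the non-phylogenetic metrics named in the workflow overview; the choice of metric here is ours for illustration and is not stated on the slide.

```python
# Sample columns from the table above, in OTU1..OTU4 order.
samples = {
    "orange1": [4, 4, 0, 0],
    "orange2": [4, 4, 1, 0],
    "blue1": [0, 0, 7, 7],
}


def bray_curtis(a, b):
    """Sum of absolute count differences divided by the total count of both samples."""
    return sum(abs(x - y) for x, y in zip(a, b)) / (sum(a) + sum(b))


names = list(samples)
matrix = {
    (p, q): bray_curtis(samples[p], samples[q])
    for p in names for q in names
}
for p in names:
    print(p, [round(matrix[(p, q)], 3) for q in names])
```

The two orange samples come out nearly identical, while either of them is far from blue1, which matches what the table suggests by eye.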
![Page 76: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/76.jpg)
Summarize Taxa
• Calculates proportion of taxa per sample, at different taxonomic levels
• summarize_taxa_through_plots.py
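The calculation behind a taxa summary can be sketched as: truncate each OTU's taxonomy to the chosen level, add up the counts per taxon and sample, and divide by the sample total. The table, the taxonomy strings, and the function signature below are hypothetical and are not the interface of summarize_taxa.py.

```python
def summarize_taxa(table, taxonomy, level):
    """Relative abundance of each taxon per sample at a given taxonomic depth.

    table: {otu: {sample: count}}; taxonomy: {otu: [rank names]};
    level: number of ranks to keep (1 = kingdom, 2 = phylum, ...).
    """
    totals = {}
    summed = {}
    for otu, counts in table.items():
        taxon = ";".join(taxonomy[otu][:level])
        for sample, count in counts.items():
            totals[sample] = totals.get(sample, 0) + count
            by_sample = summed.setdefault(taxon, {})
            by_sample[sample] = by_sample.get(sample, 0) + count
    return {
        taxon: {sample: count / totals[sample] for sample, count in by_sample.items()}
        for taxon, by_sample in summed.items()
    }


# Hypothetical table and taxonomy assignments.
table = {
    "OTU1": {"s1": 4, "s2": 1},
    "OTU2": {"s1": 4, "s2": 1},
    "OTU3": {"s1": 2, "s2": 8},
}
taxonomy = {
    "OTU1": ["k__Bacteria", "p__Firmicutes"],
    "OTU2": ["k__Bacteria", "p__Firmicutes"],
    "OTU3": ["k__Bacteria", "p__Bacteroidetes"],
}
phylum_summary = summarize_taxa(table, taxonomy, level=2)
print(phylum_summary)
```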
![Page 77: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/77.jpg)
![Page 78: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/78.jpg)
Taxa Summarized by Category
![Page 79: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/79.jpg)
Procrustes Analysis
http://qiime.org/tutorials/procrustes_analysis.html
transform_coordinate_matrices.py
compare_3d_plots.py
Muegge, B. D. et al. Science 332, 970–974 (2011).
![Page 80: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/80.jpg)
![Page 81: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/81.jpg)
Statistically Different?
• group_significance.py
• Parametric
– G-test
– ANOVA
– T-test
• Non-parametric
– Kruskal-Wallis
– Mann-Whitney-U
– Bootstrap Mann-Whitney-U
– Bootstrap T-test
• compare_categories.py
• make_distance_boxplots.py
• …
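To give a feel for how a rank-based (non-parametric) test works, the sketch below computes the Mann-Whitney U statistic for two groups of values by counting, over all pairs, how often a value from the first group exceeds one from the second. It stops at the statistic and does not compute a p-value; the relative abundances are made up, and this is not the code used by group_significance.py.

```python
def mann_whitney_u(x, y):
    """U statistic for x: pairs (x_i, y_j) with x_i > y_j, ties counted as 0.5."""
    u = 0.0
    for a in x:
        for b in y:
            if a > b:
                u += 1.0
            elif a == b:
                u += 0.5
    return u


# Hypothetical relative abundances of one OTU in two groups of samples.
group1 = [0.10, 0.15, 0.20]
group2 = [0.30, 0.35, 0.40]
u1 = mann_whitney_u(group1, group2)
u2 = mann_whitney_u(group2, group1)
print(u1, u2)
```

The two U values always add up to the number of pairs, and a value at either extreme means the groups barely overlap. Because only the ordering of the values matters, the test makes no assumption about the shape of the distribution.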
![Page 82: Esa 2014 qiime](https://reader034.vdocuments.site/reader034/viewer/2022042715/558573b8d8b42a4c2c8b4ce9/html5/thumbnails/82.jpg)
Acknowledgements
Rob Knight, Antonio Gonzalez, Meg Pirrung, Adam Robbins-Pianka, Luke Ursell, Tony Walters, Doug Wendel, Daniel McDonald, Yoshiki Vázquez Baeza, Will Van Treuren, Laura Wegener Parfrey, Kris Mayer
Merete Eggesbo, Jessica Metcalf, Ulla Westermann, Zhenjiang Zech Xu, Jose Navas, Chris Lauber, Matt Gebert, Greg C Humphrey, Hongwei Zhou
Rick Stevens (Argonne), Jack Gilbert (Argonne), Folker Meyer (Argonne), Janet Jansson (LBNL), Jed Fuhrman (USC), Jonathan Eisen (UC Davis), many, many sample donors.
Other collaborators: Noah Fierer (CU, EEB), Jeff Gordon (Wash U), Ruth Ley (Cornell), Peter Turnbaugh (Harvard), Maria Gloria Dominguez (UPR), Catherine Lozupone (CU) ...