20120907 microbiome-intro

Upload: leo-lahti

Post on 03-Jun-2015

DESCRIPTION

Introduction to the microbiome R package

TRANSCRIPT

Page 1: 20120907 microbiome-intro

Code sharing for microbiomics
Leo, Wageningen 7.9.2012

Page 2: 20120907 microbiome-intro
Page 3: 20120907 microbiome-intro

Challenges with computer code:
- Analyses not standardized -> confusion & non-optimal choices
- Poor documentation -> poor reproducibility & waste of time
- Reinventing the wheel -> waste of time & resources

Solution:
- Harmonized software libraries (e.g. R packages)
- Easier to share tools (GitHub)

-> more reliable & reproducible
-> more standardized
-> avoid repetitive coding
-> added value for publications
-> distributed version control; all changes automatically tracked
-> facilitates Helsinki-Wageningen collaboration

Page 4: 20120907 microbiome-intro
Page 5: 20120907 microbiome-intro
Page 6: 20120907 microbiome-intro

Wiki: various example analyses already implemented

- retrieve data from MySQL (H/M/PITChip)

- preprocessing (profiling & HITChip Atlas)

- analysis routines (diversities, tables, Wilcoxon tests etc.)

- visualization

-> improving through time

Page 7: 20120907 microbiome-intro

Step-by-step examples with source code and simulated data

Page 8: 20120907 microbiome-intro

Common core microbiota: effect of analysis depth and prevalence

"Blanket analysis"
github.com/microbiome

Estimate the frequency of belonging to the core for each phylotype; confidence intervals with bootstrap

[Figure axes: Core size, Abundance, Prevalence]

Salonen A, et al. (2012) The adult intestinal core microbiota is determined by analysis depth and health status, Clinical Microbiology and Infection 18:16–20.
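
The core estimate on this slide can be sketched roughly as follows. This is an illustrative Python sketch on simulated data, not the code of the microbiome R package: the function names (`core_members`, `bootstrap_core_frequency`), the thresholds, and the data are all assumptions. A phylotype is counted as core when its abundance exceeds a detection threshold in at least a given fraction of samples (prevalence); resampling the samples with replacement gives the frequency with which each phylotype ends up in the core.

```python
import random

def core_members(samples, detection, prevalence):
    """Return the set of phylotypes whose abundance exceeds `detection`
    in at least a fraction `prevalence` of the samples."""
    n = len(samples)
    return {
        p for p in samples[0]
        if sum(s[p] > detection for s in samples) / n >= prevalence
    }

def bootstrap_core_frequency(samples, detection, prevalence, n_boot=200, seed=0):
    """Resample the samples with replacement and return, per phylotype,
    the fraction of bootstrap rounds in which it belonged to the core."""
    rng = random.Random(seed)
    counts = {p: 0 for p in samples[0]}
    for _ in range(n_boot):
        resampled = [rng.choice(samples) for _ in samples]
        for p in core_members(resampled, detection, prevalence):
            counts[p] += 1
    return {p: c / n_boot for p, c in counts.items()}

# Simulated data (hypothetical): 20 samples, 3 phylotypes.
rng = random.Random(1)
samples = [
    {
        "always_high": 5.0 + rng.random(),     # above the threshold in every sample
        "never_high": 0.5 * rng.random(),      # below the threshold in every sample
        "sometimes_high": 4.0 * rng.random(),  # above the threshold in some samples
    }
    for _ in range(20)
]

freq = bootstrap_core_frequency(samples, detection=2.0, prevalence=0.5)
for name, f in sorted(freq.items()):
    print(f"{name}: core in {f:.0%} of bootstrap rounds")
```

Repeating the call over a grid of `detection` and `prevalence` values would show how the core size depends on both, which is the dependence the slide title refers to.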

Page 9: 20120907 microbiome-intro

Compatible with HITChip Atlas of Human Gut Microbiota (>3200 samples)

45 studies - Standardized Platform

>1000 phylotypes

>3000 samples

-> Compare your own data to HITChip data collections?

Page 10: 20120907 microbiome-intro

Differences to the old profiling script?
-> Separate preprocessing from analysis
-> Support modularity
-> removed outdated options & outputs from profiling script

1. Preprocessing: minimal output from profiling script:
- preprocessed data matrices (oligo/L1/L2/species/absolute scale) with NMF/RPA/SUM
- preprocessing log (parameter values etc.)
- quality control plots (heatmap)

2. Analysis & visualization routines
- based on profiling script output & done afterwards -> modular
- used when needed, not run by default
-> keeping it simple & saving disk space

Page 11: 20120907 microbiome-intro

Summary: code development & sharing through GitHub

In-house sharing infrastructure for code
-> distributed package maintenance
-> avoid bugs; facilitate transparency & reproducibility
-> additional visibility & citations?

Avoid extra work and focus on the essential
-> check for ready-made examples from the wiki!
-> ask for help
-> let's add examples to the wiki!

Manage and share your own code?
-> GitHub and microbiome R package
-> Version control

microbiome.github.com

Page 12: 20120907 microbiome-intro

To discuss

Do you have R code which could be useful for others?
-> let's polish, document & add it to the package!

Which tools to include?
- diversity/richness/evenness calculations
- PCA, hierarchical clustering, RDA etc.
- Wilcoxon tests
- Association (Spearman) tables phylotypes vs. phenotypes
- Relative contributions from background variables

-> ideally, only standard things should be standardized; for rare analyses just use basic R & other packages

Page 13: 20120907 microbiome-intro
Page 14: 20120907 microbiome-intro

HITChip preprocessing steps

- Spatial correction

- Between array normalization

- Background correction

- Oligo summarization

Page 15: 20120907 microbiome-intro

1. Spatial correction

Page 16: 20120907 microbiome-intro

2. Between-array normalization: minmax vs. quantiles?
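
The two normalization options can be illustrated with a small Python sketch. The slides do not define either method in detail, so this assumes the common textbook versions: min-max rescales each array linearly to a shared range, while quantile normalization replaces the k-th smallest value of every array by the mean of the k-th smallest values over all arrays. The function names and data are hypothetical, and this is not the HITChip pipeline code.

```python
def minmax_normalize(arrays, lo=0.0, hi=1.0):
    """Linearly rescale each array so its minimum maps to `lo` and its
    maximum to `hi`. Assumes no array is constant."""
    out = []
    for a in arrays:
        a_min, a_max = min(a), max(a)
        span = a_max - a_min
        out.append([lo + (x - a_min) * (hi - lo) / span for x in a])
    return out

def quantile_normalize(arrays):
    """Give every array the same distribution: the k-th smallest value of
    each array is replaced by the mean of the k-th smallest values over
    all arrays. Ties are broken by position, which is a simplification."""
    n = len(arrays[0])
    sorted_arrays = [sorted(a) for a in arrays]
    reference = [sum(s[k] for s in sorted_arrays) / len(arrays) for k in range(n)]
    out = []
    for a in arrays:
        order = sorted(range(n), key=lambda i: a[i])  # indices, smallest value first
        normalized = [0.0] * n
        for rank, i in enumerate(order):
            normalized[i] = reference[rank]
        out.append(normalized)
    return out

# Hypothetical signals from two arrays with four probes each.
arrays = [
    [1.0, 2.0, 3.0, 10.0],
    [2.0, 4.0, 6.0, 8.0],
]
mm = minmax_normalize(arrays)
qn = quantile_normalize(arrays)
print("minmax:  ", mm)
print("quantile:", qn)
```

Min-max keeps the shape of each array's distribution, whereas quantile normalization forces all arrays onto one shared distribution, which is a strong assumption when samples differ substantially.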

Page 17: 20120907 microbiome-intro

3. Background correction: skip!

Page 18: 20120907 microbiome-intro

4. Oligo summarization

NMF

RPA

SUM

AVE
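
Of the four methods, SUM and AVE are simple enough to sketch in a few lines. The Python sketch below assumes that SUM adds up, and AVE averages, the signals of all oligos that target the same phylotype; the slides do not state the signal scale or the exact definitions, and the function name, oligo names, and mapping are hypothetical. NMF and RPA are model-based and are not shown.

```python
from collections import defaultdict

def summarize(oligo_signal, oligo_to_phylotype, method="sum"):
    """Collapse oligo-level signals to phylotype level by summing or
    averaging the signals of the oligos that target each phylotype."""
    groups = defaultdict(list)
    for oligo, signal in oligo_signal.items():
        groups[oligo_to_phylotype[oligo]].append(signal)
    if method == "sum":
        return {p: sum(v) for p, v in groups.items()}
    if method == "ave":
        return {p: sum(v) / len(v) for p, v in groups.items()}
    raise ValueError(f"unknown method: {method}")

# Hypothetical oligo signals and oligo-to-phylotype mapping.
oligo_signal = {"oligo1": 2.0, "oligo2": 4.0, "oligo3": 3.0}
oligo_to_phylotype = {"oligo1": "A", "oligo2": "A", "oligo3": "B"}

sum_level = summarize(oligo_signal, oligo_to_phylotype, "sum")
ave_level = summarize(oligo_signal, oligo_to_phylotype, "ave")
print("SUM:", sum_level)
print("AVE:", ave_level)
```

In this toy case phylotype A outranks B under SUM but ties with it under AVE, because SUM grows with the number of oligos per phylotype.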

Page 19: 20120907 microbiome-intro

Preprocessing: recommendations

* Normalization:
- minmax: use by default
- quantile: use if samples have 'similar' microbiota

* Background correction
-> ignore

* Oligo summarization
-> NMF: for L0/L1/L2 levels
-> RPA: if species level is also included
-> (SUM: for comparison)
-> AVE: deprecated

=> The defaults are already implemented in the pipeline

Page 20: 20120907 microbiome-intro

Diversity analysis
Richness, evenness, diversity

Shannon vs. Inverse Simpson?

Detection threshold?
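
To make the Shannon vs. inverse Simpson question concrete, here is a minimal Python sketch using the standard definitions, H' = -sum(p_i ln p_i) and 1 / sum(p_i^2), on two made-up communities. The function names and counts are hypothetical and this is not the package's implementation.

```python
import math

def relative_abundances(counts):
    """Convert counts to proportions, dropping zero entries."""
    total = sum(counts)
    return [c / total for c in counts if c > 0]

def shannon(counts):
    """Shannon index H' = -sum(p_i * ln(p_i))."""
    return -sum(p * math.log(p) for p in relative_abundances(counts))

def inverse_simpson(counts):
    """Inverse Simpson index 1 / sum(p_i^2)."""
    return 1.0 / sum(p * p for p in relative_abundances(counts))

# Two hypothetical communities with the same number of taxa.
even = [25, 25, 25, 25]
dominated = [97, 1, 1, 1]

for name, community in [("even", even), ("dominated", dominated)]:
    print(name, round(shannon(community), 3), round(inverse_simpson(community), 3))
```

Both indices drop for the dominated community; inverse Simpson is driven mostly by the dominant taxa, while Shannon gives relatively more weight to rare ones.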

Page 21: 20120907 microbiome-intro

Richness with various indices and thresholds

Page 22: 20120907 microbiome-intro

Recommendation:
- oligo level
- Shannon diversity
- richness as species count with 80% quantile detection threshold
- evenness with Pielou's index
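
The recommended richness and evenness measures can be sketched in Python as below. The slide does not say over which values the 80% quantile is taken; this sketch assumes the threshold is the 80% quantile of all signals pooled over samples, and it uses Pielou's J = H' / ln(S) with S the number of features with positive signal. The function names and data are hypothetical and this is not the package's own code.

```python
import math
import statistics

def shannon(abundances):
    """Shannon index H' = -sum(p_i * ln(p_i)) over features with positive abundance."""
    total = sum(abundances)
    return -sum((a / total) * math.log(a / total) for a in abundances if a > 0)

def pielou_evenness(abundances):
    """Pielou's evenness J = H' / ln(S), with S the number of features
    with positive abundance; undefined (NaN) when S < 2."""
    s = sum(1 for a in abundances if a > 0)
    return shannon(abundances) / math.log(s) if s > 1 else float("nan")

def richness(abundances, threshold):
    """Number of features whose signal exceeds the detection threshold."""
    return sum(1 for a in abundances if a > threshold)

# Hypothetical signals: rows are samples, columns are features (e.g. oligos).
data = [
    [12.0, 11.0, 3.0, 2.0, 1.0],
    [2.0, 2.0, 2.0, 2.0, 2.0],
]

# Assumption: detection threshold = 80% quantile of all signals pooled over samples.
pooled = [x for row in data for x in row]
threshold = statistics.quantiles(pooled, n=5, method="inclusive")[3]

for row in data:
    print(richness(row, threshold), round(shannon(row), 3), round(pielou_evenness(row), 3))
```

With a pooled quantile threshold, richness differs between samples; a per-sample quantile would give every sample the same count, which is why the pooled reading is assumed here.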

Page 23: 20120907 microbiome-intro

Further analysis tools

Page 24: 20120907 microbiome-intro

microbiome.github.com