Post on 05-Jan-2017


Quantitative Metagenomics

Lea Benedicte Skov Hansen, PhD
NGS Course, 13th of June 2016

13/06/2016 Quantitative Metagenomics, DTU Systems Biology, Technical University of Denmark

Exercise 2

•  Metagenome assembly
   –  Preassembled with two methods:
      •  SOAP
      •  MetaVelvet
   –  Contig coverage
   –  Assembly statistics

•  Gene prediction
   –  Prodigal
   –  Gene clustering based on similarity
   –  Gene catalogue

•  Gene abundance matrix
   –  Align reads to gene catalogue with bwa
   –  Count number of reads mapping to a gene (samtools)

•  Taxonomic annotation of gene catalogue
   –  BLAST gene catalogue against:
      •  NCBI Bacterial Genomes
      •  373 additional genomes
   –  Rearranging gene abundance to taxonomic abundance
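The last step of the outline, rearranging gene abundance into taxonomic abundance, could be sketched roughly as below. This is only an illustration, not the course's actual script: the gene names, counts, taxon labels and the function name `taxon_abundance` are all hypothetical, and the gene-to-taxon mapping is assumed to come from the best BLAST hit of each gene.

```python
from collections import defaultdict

# Hypothetical read counts per gene for one sample (one column of a gene abundance matrix).
gene_counts = {"gene_1": 10, "gene_2": 5, "gene_3": 7, "gene_4": 3}

# Hypothetical taxonomic annotation per gene, e.g. taken from the best BLAST hit.
gene_to_taxon = {"gene_1": "Taxon A", "gene_2": "Taxon A", "gene_3": "Taxon B"}


def taxon_abundance(gene_counts, gene_to_taxon, unassigned="Unassigned"):
    """Sum gene counts per taxon; genes without an annotation are pooled together."""
    totals = defaultdict(int)
    for gene, count in gene_counts.items():
        totals[gene_to_taxon.get(gene, unassigned)] += count
    return dict(totals)


taxa = taxon_abundance(gene_counts, gene_to_taxon)
print(taxa)
```

Pooling unannotated genes into a separate group keeps the column total unchanged, which matters if the counts are normalized afterwards.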

Ecology

Ecology is the scientific study of interactions among organisms and their environment, including the interactions organisms have with each other and with their abiotic environment.

Nothing new – except the technology

Classical measures
•  Abundance
•  Diversity
•  Richness

Abundance (Count)

Species        Count
Lion           64
Zebra          128
Giraffe        64
Leopard        64
Rhinoceros     64
Hippopotamus   128
Gazelle        128
Elephant       64
Monkey         9

Richness

Species        Count
Lion           64
Zebra          128
Giraffe        64
Leopard        64
Rhinoceros     64
Hippopotamus   128
Gazelle        128
Elephant       64
Monkey         9

9 observed species
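As a minimal sketch, richness is simply the number of species with a non-zero count. The counts are copied from the slide; the function name `richness` is only illustrative.

```python
# Counts copied from the slide.
counts = {
    "Lion": 64, "Zebra": 128, "Giraffe": 64, "Leopard": 64, "Rhinoceros": 64,
    "Hippopotamus": 128, "Gazelle": 128, "Elephant": 64, "Monkey": 9,
}


def richness(counts):
    """Number of species observed at least once."""
    return sum(1 for n in counts.values() if n > 0)


observed = richness(counts)
print(observed)  # 9
```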

Rarefaction curves
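A rarefaction curve plots the number of species expected in a random subsample against the subsample size. One way to sketch it, assuming sampling without replacement, is the classical hypergeometric expectation: a species with count c is missed by a subsample of size n with probability C(N - c, n) / C(N, n), where N is the total count. The function name `expected_richness` and the chosen depths are illustrative only; the counts are those of the savanna example.

```python
from math import comb

# Counts from the savanna example (Lion, Zebra, ..., Monkey).
counts = [64, 128, 64, 64, 64, 128, 128, 64, 9]


def expected_richness(counts, n):
    """Expected number of species in a random subsample of n individuals."""
    total = sum(counts)
    if not 0 <= n <= total:
        raise ValueError("subsample size must be between 0 and the total count")
    denom = comb(total, n)
    # For each species: 1 minus the probability that none of its individuals is drawn.
    return sum(1 - comb(total - c, n) / denom for c in counts if c > 0)


depths = [1, 10, 50, 100, 300, 713]
curve = [(n, expected_richness(counts, n)) for n in depths]
for n, s in curve:
    print(n, round(s, 2))
```

The curve flattens once additional sampling rarely turns up new species, which is the usual visual check for whether a sample has been sequenced deeply enough.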

Species        Count
Lion           1
Zebra          2
Giraffe        1
Leopard        1
Rhinoceros     1
Hippopotamus   2
Gazelle        2
Elephant       1
Monkey         0

8 observed species

Species richness estimator, Chao1 index:

$S_{Chao1} = S_{obs} + \frac{f_1^2}{2 f_2}$

S_obs = number of observed species
f_1 = number of species observed exactly once
f_2 = number of species observed exactly twice

Here S_obs = 8, f_1 = 5 and f_2 = 3, so

$S_{Chao1} = 8 + \frac{5^2}{2 \cdot 3} = 12.17$
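The Chao1 calculation can be reproduced with a few lines of code. This is a minimal sketch of the formula as given on the slide; the function name `chao1` is illustrative, and raising an error when no species is observed exactly twice is an assumption made here to avoid a division by zero, not something the slide specifies.

```python
def chao1(counts):
    """Chao1 estimate: S_obs + f1^2 / (2 * f2)."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)
    f1 = observed.count(1)  # species seen exactly once
    f2 = observed.count(2)  # species seen exactly twice
    if f2 == 0:
        raise ValueError("no species observed exactly twice; this form is undefined")
    return s_obs + f1 ** 2 / (2 * f2)


# Counts from the slide (Lion, Zebra, ..., Monkey).
sample = [1, 2, 1, 1, 1, 2, 2, 1, 0]
estimate = chao1(sample)
print(round(estimate, 2))  # 12.17
```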

Evenness

Species        s1   s2
Lion           1    1
Zebra          2    1
Giraffe        1    8
Leopard        1    1
Rhinoceros     1    1
Hippopotamus   2    1
Gazelle        2    1
Elephant       1    1
Monkey         0    0

Alpha Diversity

Richness: s1 = s2
Evenness: s1 ≠ s2

Shannon index

$H = -\sum_{i=1}^{R} p_i \ln p_i$

H = Shannon index
p_i = count of species i / total counts
R = number of observed species

$H_{s1} = 2.02$, $H_{s2} = 1.60$

H is largest when all species are equally abundant, $p_1 = p_2 = \dots = p_R$, in which case

$H = \ln(R) = \ln(8) = 2.08$
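The Shannon values for the two samples can be checked with a short sketch. The counts are those of samples s1 and s2 in the evenness example; the function name `shannon` is illustrative, and species with a zero count are skipped because they contribute nothing to the sum.

```python
from math import log


def shannon(counts):
    """Shannon index H = -sum(p_i * ln(p_i)) over species with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * log(c / total) for c in counts if c > 0)


# Counts per species (Lion, Zebra, ..., Monkey) for the two samples.
s1 = [1, 2, 1, 1, 1, 2, 2, 1, 0]
s2 = [1, 1, 8, 1, 1, 1, 1, 1, 0]

h1 = shannon(s1)
h2 = shannon(s2)
h_max = shannon([1] * 8)  # eight equally abundant species

print(f"{h1:.2f} {h2:.2f} {h_max:.2f}")  # 2.02 1.60 2.08
```

Both samples have the same richness, so the lower value for s2 reflects only its lower evenness.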

Sample Sizes

Accounting for different sample sizes:

•  Normalize to sample size

•  Rarefy samples

•  Statistical model of sample variance

Species        Large library   Small library
Lion           64              1
Zebra          128             2
Giraffe        64              1
Leopard        64              1
Rhinoceros     64              1
Hippopotamus   128             2
Gazelle        128             2
Elephant       64              1
Monkey         9               0
Total          713             11

Normalize to library size (shown as a percentage): $Norm_i = 100 \cdot n_i / n_{tot}$

Species        Large library (%)   Small library (%)
Lion           8.98                9.09
Zebra          17.95               18.18
Giraffe        8.98                9.09
Leopard        8.98                9.09
Rhinoceros     8.98                9.09
Hippopotamus   17.95               18.18
Gazelle        17.95               18.18
Elephant       8.98                9.09
Monkey         1.26                0
Total          100                 100
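A minimal sketch of the normalization, using the counts from the two libraries above. The function name `to_percent` is illustrative; the factor of 100 is there because the normalized table is expressed in percent.

```python
SPECIES = ["Lion", "Zebra", "Giraffe", "Leopard", "Rhinoceros",
           "Hippopotamus", "Gazelle", "Elephant", "Monkey"]

large = dict(zip(SPECIES, [64, 128, 64, 64, 64, 128, 128, 64, 9]))
small = dict(zip(SPECIES, [1, 2, 1, 1, 1, 2, 2, 1, 0]))


def to_percent(counts):
    """Divide each count by the library size and express it as a percentage."""
    total = sum(counts.values())
    return {species: 100 * n / total for species, n in counts.items()}


large_pct = to_percent(large)
small_pct = to_percent(small)
for species in SPECIES:
    print(f"{species:<13}{large_pct[species]:>7.2f}{small_pct[species]:>7.2f}")
```

After normalization the two libraries look almost identical, except that the monkey is missing from the small one: a species at about 1% is easily lost when only 11 individuals are counted.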

Rarefying to smaller library size:

Species        Large library   Small library
Lion           64              1
Zebra          128             2
Giraffe        64              1
Leopard        64              1
Rhinoceros     64              1
Hippopotamus   128             2
Gazelle        128             2
Elephant       64              1
Monkey         9               0
Total          713             11

After rarefying both libraries to 10 counts:

Species        Large library   Small library
Lion           2               1
Zebra          3               2
Giraffe        0               1
Leopard        1               1
Rhinoceros     0               1
Hippopotamus   3               2
Gazelle        1               2
Elephant       0               0
Monkey         0               0
Total          10              10
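Rarefying can be sketched as drawing individuals at random, without replacement, until the target depth is reached. The function name `rarefy` and the fixed seed are assumptions for illustration; because the draw is random, the resulting counts will generally differ from the ones shown on the slide.

```python
import random

SPECIES = ["Lion", "Zebra", "Giraffe", "Leopard", "Rhinoceros",
           "Hippopotamus", "Gazelle", "Elephant", "Monkey"]
large = dict(zip(SPECIES, [64, 128, 64, 64, 64, 128, 128, 64, 9]))


def rarefy(counts, depth, seed=0):
    """Subsample `depth` individuals without replacement and recount per species."""
    # Expand the counts into one entry per individual.
    pool = [species for species, n in counts.items() for _ in range(n)]
    if depth > len(pool):
        raise ValueError("depth exceeds the library size")
    drawn = random.Random(seed).sample(pool, depth)
    rarefied = {species: 0 for species in counts}
    for species in drawn:
        rarefied[species] += 1
    return rarefied


rarefied = rarefy(large, depth=10)
print(rarefied)
```

Expanding the counts into a list of individuals is fine for toy data; for real sequencing libraries a vectorized draw would be preferable.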

Normalization and downsizing do not account for heteroscedasticity!

Statistically modeled variance:
•  DESeq2
•  edgeR

Beta-Diversity

Diversity between communities!

Species        s1   s2
Lion           0    2
Zebra          3    2
Giraffe        0    4
Leopard        0    2
Rhinoceros     1    2
Hippopotamus   4    0
Gazelle        0    1
Elephant       1    0
Total          9    13

Bray-Curtis dissimilarity metric

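$B_{ij} = 1 - \frac{2 C_{ij}}{S_i + S_j}$

C_ij = sum of the lower of the two counts for each species found in both samples
S_i = total count of sample i

For s1 and s2: C = 2 (Zebra) + 1 (Rhinoceros) = 3 and S_s1 + S_s2 = 9 + 13 = 22, so

$B_{s1 s2} = 1 - \frac{2 \cdot 3}{22} = 0.73$ (dissimilar)

0 ≤ B ≤ 1, where 0 means identical samples and 1 means no shared species.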
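The worked Bray-Curtis example can be reproduced with a short sketch. The counts are those of the two samples in the beta-diversity table; the function name `bray_curtis` is illustrative.

```python
SPECIES = ["Lion", "Zebra", "Giraffe", "Leopard", "Rhinoceros",
           "Hippopotamus", "Gazelle", "Elephant"]
s1 = dict(zip(SPECIES, [0, 3, 0, 0, 1, 4, 0, 1]))
s2 = dict(zip(SPECIES, [2, 2, 4, 2, 2, 0, 1, 0]))


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity: 1 - 2C / (S_a + S_b)."""
    species = set(a) | set(b)
    # C: for every species, the lower of the two counts (zero if absent from either sample).
    shared = sum(min(a.get(s, 0), b.get(s, 0)) for s in species)
    return 1 - 2 * shared / (sum(a.values()) + sum(b.values()))


b12 = bray_curtis(s1, s2)
print(round(b12, 2))  # 0.73
```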

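Other dissimilarity metrics

•  Euclidean distance: $d(x, y) = \sqrt{\sum_i (x_i - y_i)^2}$

•  Jensen-Shannon distance: $JSD(x, y) = \sqrt{\frac{1}{2} D(x \| M) + \frac{1}{2} D(y \| M)}$, where $M = (x + y)/2$ and D is the Kullback-Leibler divergence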
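Both metrics can be sketched in a few lines. The assumptions here are that the Euclidean distance is taken on the raw count vectors, that the Jensen-Shannon distance is taken on relative abundances with the natural logarithm, and that it is defined as the square root of the Jensen-Shannon divergence with M = (x + y)/2. The function names are illustrative; the two vectors are the s1 and s2 counts from the Bray-Curtis example.

```python
from math import log, sqrt


def euclidean(x, y):
    """Euclidean distance between two equally long count vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q); terms with p_i = 0 contribute nothing."""
    return sum(a * log(a / b) for a, b in zip(p, q) if a > 0)


def jensen_shannon_distance(x, y):
    """Square root of the Jensen-Shannon divergence of the relative abundances."""
    px = [v / sum(x) for v in x]
    py = [v / sum(y) for v in y]
    m = [(a + b) / 2 for a, b in zip(px, py)]
    divergence = 0.5 * kl_divergence(px, m) + 0.5 * kl_divergence(py, m)
    return sqrt(max(0.0, divergence))  # guard against tiny negative rounding errors


s1 = [0, 3, 0, 0, 1, 4, 0, 1]
s2 = [2, 2, 4, 2, 2, 0, 1, 0]
d_euclid = euclidean(s1, s2)
d_js = jensen_shannon_distance(s1, s2)
print(round(d_euclid, 2), round(d_js, 2))
```

With the natural logarithm the Jensen-Shannon distance is bounded by the square root of ln 2, reached when the two samples share no species.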

Distance matrix
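A distance matrix holds the pairwise dissimilarity between every pair of samples. The sketch below uses Bray-Curtis as the metric. Samples s1 and s2 are the count vectors from the Bray-Curtis example, while s3 is a made-up third sample added only so that the matrix is larger than 2 x 2.

```python
def bray_curtis(a, b):
    """Bray-Curtis dissimilarity for two count vectors with species in the same order."""
    shared = sum(min(x, y) for x, y in zip(a, b))
    return 1 - 2 * shared / (sum(a) + sum(b))


samples = {
    "s1": [0, 3, 0, 0, 1, 4, 0, 1],
    "s2": [2, 2, 4, 2, 2, 0, 1, 0],
    "s3": [1, 2, 1, 0, 1, 3, 0, 1],  # hypothetical sample, not from the slides
}

names = list(samples)
# Symmetric matrix with zeros on the diagonal.
matrix = [[bray_curtis(samples[r], samples[c]) for c in names] for r in names]

print("    " + "  ".join(f"{n:>4}" for n in names))
for name, row in zip(names, matrix):
    print(f"{name:<4}" + "  ".join(f"{v:4.2f}" for v in row))
```

Such a matrix is the usual input for ordination or clustering of samples.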

Diversity - example

Hands on!
