Using short-read SOLiD sequencing technology to characterize the microbial community in 500ml of seawater. FISH 546: Bioinformatics Winter 2013 Rachel Lange


Page 1: FISH 546: Bioinformatics Winter 2013 Rachel Lange

Using short-read SOLiD sequencing technology to characterize the microbial community in 500ml of seawater.

FISH 546: Bioinformatics Winter 2013 Rachel Lange

Page 2

Challenges of characterizing microbial communities

Overwhelming abundance
◦ Nearly impossible to detect each member in a sample

Page 3

Challenges of characterizing microbial communities

Majority of taxa are uncultured or unknown.
◦ NCBI has 2601 complete prokaryotic genomes
◦ Potentially >120,000 bacterial species in the ocean (Zinger et al. 2011).

Page 4

Challenges of characterizing microbial communities

Majority of genes, even within complete genomes, have unknown functions.

Page 5

The solution:

Use next-generation sequencing technology to sequence anything and everything!

This is getting a little out of hand…

Page 6

Main objective

Can we fully characterize the microbial community in a small water sample from Golden Gardens (GG), a human-disturbed environment?

Approach:
◦ Obtained an assembled metagenome from a single water sample.
◦ See who’s there and what they’re doing.

Page 7

1. Who’s there?

Approach
◦ Used a web server, MG-RAST, to assign taxonomy to each contig from the GG metagenome.
◦ Compared with a local blastn search against the NCBI database.

[Figure: two charts of domain-level assignments (Archaea, Viruses, Bacteria, Eukarya), one with 24,365 total and one with 109,408 total.]
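A domain-level breakdown like the one charted on this slide can be tallied from per-contig taxonomy assignments. The following is a minimal sketch, assuming the assignments have already been reduced to (contig ID, domain) pairs; the contig IDs and the input layout are hypothetical and are not MG-RAST's or blastn's actual output format.

```python
from collections import Counter

# The four groups shown on the slide's charts.
DOMAINS = ("Archaea", "Viruses", "Bacteria", "Eukarya")


def tally_domains(assignments):
    """Count contigs per domain and return {domain: (count, percent)}.

    `assignments` is an iterable of (contig_id, domain) pairs.
    Pairs whose domain is not one of DOMAINS are ignored.
    """
    counts = Counter(domain for _contig_id, domain in assignments
                     if domain in DOMAINS)
    total = sum(counts.values())
    return {
        domain: (counts[domain],
                 100.0 * counts[domain] / total if total else 0.0)
        for domain in DOMAINS
    }


# Hypothetical example input, for illustration only.
example_assignments = [
    ("contig_1", "Bacteria"),
    ("contig_2", "Bacteria"),
    ("contig_3", "Archaea"),
    ("contig_4", "Eukarya"),
    ("contig_5", "unassigned"),
]

if __name__ == "__main__":
    for domain, (count, percent) in tally_domains(example_assignments).items():
        print(f"{domain}: {count} contigs ({percent:.1f}%)")
```

Running the same tally on both sets of assignments (MG-RAST and local blastn) would give directly comparable percentages even though the two totals differ.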

Page 8

1. Who’s there?

Page 9

What are the abundant types doing?

Candidatus Pelagibacter ubique (SAR11)
◦ One of the most common marine bacteria
◦ Accounted for 6.095% of sequences in GG

Page 10

[Figure: bar chart of the number of sequences per functional category (x-axis: Number of sequences, 0 to 1800). Categories as listed on the chart: Metabolism of terpenoids and polyketides; Xenobiotics biodegradation and metabolism; Neurodegenerative diseases; Cell communication; Transcription; Signaling molecules and interaction; Glycan biosynthesis and metabolism; Nucleotide metabolism; Lipid metabolism; Transport and catabolism; Cell growth and death; Folding, sorting and degradation; Replication and repair; Energy metabolism; Metabolism of cofactors and vitamins; Membrane transport; Translation; Signal transduction; Carbohydrate metabolism; Amino acid metabolism.]

Page 11

What are the abundant types doing?

Nitrosopumilus maritimus
◦ One of the most common marine archaea
◦ Accounted for 2.081% of sequences in GG
◦ Capable of ammonia oxidation in mesophilic, aerobic environments like GG.

Page 12

[Figure: bar chart of the number of sequences per functional category (x-axis: Number of sequences, 0 to 450). Categories as listed on the chart: Cell growth and death; Lipid metabolism; Cell motility; Xenobiotics biodegradation and metabolism; Cell communication; Transport and catabolism; Glycan biosynthesis and metabolism; Transcription; Neurodegenerative diseases; Replication and repair; Signaling molecules and interaction; Folding, sorting and degradation; Nucleotide metabolism; Membrane transport; Signal transduction; Energy metabolism; Carbohydrate metabolism; Metabolism of cofactors and vitamins; Translation; Amino acid metabolism.]

Page 13

Do bacterial or archaeal ammonia oxidizers dominate the community?

Approach:

1. Assemble a FASTA file of reference amoA gene sequences from published bacterial and archaeal studies.
2. Make a BLAST database from the GG metagenome.
3. Run blastn to identify all bacterial and archaeal amoA genes in the GG metagenome.
4. Compare the abundance, diversity, and phylogeny of each domain.

Tools used: NCBI, blastn, Galaxy, Clustal, MrBayes, iPlant.
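Steps 2 and 3 of the approach can be sketched with the BLAST+ command-line tools. This is a minimal sketch, not the exact commands used in the project: the file names, the database name, and the e-value cutoff are assumptions, and the commands are only built as argument lists here rather than executed. The helper at the end collects the metagenome contigs hit by any reference sequence from tabular (`-outfmt 6`) output, where the second column is the subject sequence ID.

```python
import csv
import io


def makeblastdb_cmd(metagenome_fasta, db_name):
    """Step 2: build a nucleotide BLAST database from the metagenome."""
    return ["makeblastdb", "-in", metagenome_fasta,
            "-dbtype", "nucl", "-out", db_name]


def blastn_cmd(query_fasta, db_name, out_tsv, evalue="1e-5"):
    """Step 3: search reference amoA sequences against the metagenome.

    The e-value cutoff is an assumed placeholder, not the project's value.
    """
    return ["blastn", "-query", query_fasta, "-db", db_name,
            "-evalue", evalue, "-outfmt", "6", "-out", out_tsv]


def hit_contigs(tabular_text):
    """Return the set of subject (contig) IDs in BLAST tabular output."""
    reader = csv.reader(io.StringIO(tabular_text), delimiter="\t")
    return {row[1] for row in reader if len(row) >= 2}


# Hypothetical tabular output: qseqid, sseqid, pident, length, mismatch,
# gapopen, qstart, qend, sstart, send, evalue, bitscore.
example_hits = (
    "ref_amoA_1\tcontig_12\t91.2\t150\t13\t0\t1\t150\t30\t179\t1e-50\t200\n"
    "ref_amoA_2\tcontig_12\t88.0\t140\t17\t0\t5\t144\t35\t174\t1e-40\t170\n"
    "ref_amoA_2\tcontig_87\t85.5\t120\t17\t1\t1\t120\t10\t129\t1e-30\t140\n"
)

if __name__ == "__main__":
    print(" ".join(makeblastdb_cmd("GG_metagenome.fasta", "GG_db")))
    print(" ".join(blastn_cmd("archaeal_amoA_refs.fasta", "GG_db",
                              "archaeal_hits.tsv")))
    print(sorted(hit_contigs(example_hits)))
```

Passing each argument list to `subprocess.run` would execute the commands on a machine with BLAST+ installed. Counting unique contigs rather than raw hit lines avoids counting one contig several times when it matches more than one reference sequence.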

Page 14

Abundance of amoA genes

                   Bacterial   Archaeal
Reference amoA            46         59
GG amoA                  309        855
Total sequences        95994       6287
Ratio amoA/Total       0.003      0.136

Archaeal amoA are more than 2x as abundant as bacterial amoA.

The proportion of amoA/Total is also considerably larger for archaea!
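The ratios in the table follow directly from the counts. As a quick check, the short sketch below recomputes them from the slide's numbers and also expresses the two comparisons as fold differences; the variable and function names are illustrative only.

```python
# Counts taken from the table on this slide.
counts = {
    "Bacterial": {"gg_amoa": 309, "total": 95994},
    "Archaeal": {"gg_amoa": 855, "total": 6287},
}


def amoa_ratio(gg_amoa, total):
    """Fraction of a domain's sequences that are amoA hits."""
    return gg_amoa / total


ratios = {domain: amoa_ratio(**c) for domain, c in counts.items()}

# How many times more archaeal than bacterial amoA sequences were found.
fold_abundance = counts["Archaeal"]["gg_amoa"] / counts["Bacterial"]["gg_amoa"]

# How many times larger the archaeal amoA/Total proportion is.
fold_proportion = ratios["Archaeal"] / ratios["Bacterial"]

if __name__ == "__main__":
    for domain, ratio in ratios.items():
        print(f"{domain}: amoA/Total = {ratio:.3f}")
    print(f"Archaeal/Bacterial amoA counts: {fold_abundance:.2f}x")
    print(f"Archaeal/Bacterial amoA proportion: {fold_proportion:.1f}x")
```

The raw count comparison (855 vs. 309) gives the "more than 2x" figure, while normalizing by each domain's total sequences shows a much larger gap in proportion.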

Page 15

Diversity and phylogeny of amoA?

The amoA sequences obtained do not align with the reference amoA sequences. They are very divergent, and when I blast them individually, they do not match any reference or environmental amoA sequences.

Page 16

Conclusions

Short-read sequences from SOLiD sequencing technology can be used to characterize complex microbial communities.

MG-RAST is an efficient web server for analyzing the taxonomic and functional diversity within a metagenome.

We need more cultured representatives in order to correctly annotate metagenomes.
◦ Or use metagenomes to assemble representatives (Iverson, V. et al. Science 335, 587-590 (2012)).