Metagenomics and its applications

Post on 16-Apr-2017


Metagenomics COMPUTATIONAL BIOLOGY

Metagenomics

Metagenomics is the study of genetic material recovered directly from environmental samples.

While traditional microbiology, microbial genome sequencing, and genomics rely upon cultivated clonal cultures, early environmental gene sequencing cloned specific genes to produce a profile of diversity in a natural sample.


TWO APPROACHES FOR METAGENOMICS

In the first approach, known as ‘sequence-driven metagenomics’, DNA from the environment of interest is sequenced and subjected to computational analysis.

The metagenomic sequences are compared to sequences deposited in publicly available databases such as GENBANK.

The genes are then collected into groups of similar predicted function, and the distribution of various functions and types of proteins that conduct those functions can be assessed.
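The grouping step described above can be sketched as follows. This is only an illustration: the gene identifiers and function labels in `best_hits` are invented, and a real analysis would obtain them from a database search rather than a hand-written dictionary.

```python
from collections import Counter

# Hypothetical best database hit per gene: gene id -> predicted function.
# None marks a gene with no similar sequence in the database.
best_hits = {
    "gene_001": "lipase",
    "gene_002": "amylase",
    "gene_003": "lipase",
    "gene_004": "transporter",
    "gene_005": None,
}

def function_profile(hits):
    """Return the fraction of annotated genes assigned to each predicted function."""
    annotated = [function for function in hits.values() if function is not None]
    if not annotated:
        return {}
    counts = Counter(annotated)
    return {function: n / len(annotated) for function, n in counts.items()}

profile = function_profile(best_hits)
for function, fraction in sorted(profile.items()):
    print(f"{function}: {fraction:.2f}")
```

Genes without a database match are left out of the profile, which mirrors the limitation of the sequence-driven approach: nothing is learned about them from sequence alone.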


In the second approach, known as ‘function-driven metagenomics’, the DNA extracted from the environment is also captured and stored in a surrogate host, but instead of sequencing it, scientists screen the captured fragments of DNA, or ‘clones’, for a certain function.

The function must be absent in the surrogate host so that acquisition of the function can be attributed to the metagenomic DNA.

LIMITATIONS OF TWO APPROACHES

The sequence-driven approach is limited by existing knowledge: if a metagenomic gene does not look like a gene of known function deposited in the databases, then little can be learned about the gene or its product from sequence alone.

The function-driven approach is limited by expression: most genes from organisms in wild communities cannot be expressed easily by a given surrogate host.

How it is used in bioinformatics:

Sequence pre-filtering

The first step of metagenomic data analysis requires the execution of certain pre-filtering steps, including the removal of redundant sequences, low-quality sequences, and sequences of probable eukaryotic origin.

The methods available for the removal of contaminating eukaryotic genomic DNA sequences include Eu-Detect and DeconSeq.
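A minimal sketch of the first two pre-filtering steps (redundant and low-quality reads) is shown here. The read tuples, the function names, and the mean-quality cutoff of 20 are assumptions made for illustration; screening for eukaryotic sequences is not attempted and would be left to dedicated tools such as those named above.

```python
def mean_quality(qualities):
    """Average of the per-base quality scores of one read."""
    return sum(qualities) / len(qualities)

def prefilter(reads, min_mean_quality=20):
    """Drop low-quality reads and exact duplicate sequences, keeping input order."""
    seen = set()
    kept = []
    for name, sequence, qualities in reads:
        # Skip reads with no quality data or a mean quality below the cutoff.
        if not qualities or mean_quality(qualities) < min_mean_quality:
            continue
        # Skip reads whose sequence has already been kept.
        if sequence in seen:
            continue
        seen.add(sequence)
        kept.append((name, sequence, qualities))
    return kept

# Hypothetical reads: (name, sequence, per-base quality scores).
reads = [
    ("read1", "ACGTACGT", [30] * 8),
    ("read2", "ACGTACGT", [32] * 8),  # exact duplicate of read1
    ("read3", "TTGGCCAA", [10] * 8),  # low quality
    ("read4", "GGGTTTAA", [25] * 8),
]
filtered = prefilter(reads)
print([name for name, _, _ in filtered])
```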

Comparative metagenomics

Comparative analyses between metagenomes can provide additional insight into the function of complex microbial communities and their role in host health.

Pairwise or multiple comparisons between metagenomes can be made at the level of sequence composition (comparing GC-content or genome size), taxonomic diversity, or functional complement.


Consequently, metadata on the environmental context of the metagenomic sample is especially important in comparative analyses, as it provides researchers with the ability to study the effect of habitat upon community structure and function.
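As a small illustration of a comparison at the level of sequence composition, the sketch below computes the overall GC-content of two samples and their difference. The sample names and sequences are made up; real metagenomes would be read from sequence files.

```python
def gc_content(sequences):
    """Fraction of G and C bases over all sequences in a sample."""
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for seq in sequences)
    total = sum(len(seq) for seq in sequences)
    return gc / total if total else 0.0

# Hypothetical reads from two environmental samples.
sample_a = ["GGCC", "GCAT"]
sample_b = ["ATAT", "ATGC"]

gc_a = gc_content(sample_a)
gc_b = gc_content(sample_b)
gc_difference = gc_a - gc_b
print(f"sample A: {gc_a:.2f}, sample B: {gc_b:.2f}, difference: {gc_difference:.2f}")
```

A difference like this only becomes interpretable alongside the metadata for each sample, such as the habitat it was drawn from.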

Metagenomics

Bioinformatics for Whole-Genome Shotgun Sequencing

AUTHORS:

1. KEVIN CHEN 2. LIOR PACHTER

PUBLISHED: JULY 12, 2005

CITATIONS: 126

Shotgun Sequencing

Shotgun sequencing involves randomly breaking up DNA sequences into lots of small pieces and then reassembling the sequence by looking for regions of overlap.

Large mammalian genomes are complex and difficult to clone. Clone-by-clone sequencing, although reliable and methodical, is time-consuming.

Shotgun sequencing was used by Fred Sanger and his colleagues to sequence small genomes such as those of viruses and bacteria.

Fragments are often of varying sizes, ranging from 2-20 kilobases to 200-300 kilobases.
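The idea of reassembling a sequence by looking for regions of overlap can be sketched with a toy greedy assembler. This is a simplified illustration, not how production assemblers work: it assumes error-free fragments from one strand, and the function names and the minimum overlap of 3 bases are arbitrary choices.

```python
def overlap(a, b, min_length=3):
    """Length of the longest suffix of a that equals a prefix of b, or 0 if shorter than min_length."""
    for length in range(min(len(a), len(b)), min_length - 1, -1):
        if a.endswith(b[:length]):
            return length
    return 0

def greedy_assemble(fragments, min_length=3):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates, keep order
    while len(frags) > 1:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    length = overlap(a, b, min_length)
                    if length > best_len:
                        best_len, best_i, best_j = length, i, j
        if best_len == 0:
            break  # no remaining overlaps: leave the fragments as separate contigs
        merged = frags[best_i] + frags[best_j][best_len:]
        frags = [f for k, f in enumerate(frags) if k not in (best_i, best_j)] + [merged]
    return frags

# Three overlapping fragments of the made-up sequence ATGGCGTACGTTAGC.
fragments = ["ATGGCGTA", "GCGTACGT", "ACGTTAGC"]
contigs = greedy_assemble(fragments)
print(contigs)
```

Because the greedy choice depends only on overlap length, repeated sequence can mislead it, which is one reason repetitive genomes are harder to assemble.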

Advantages of shotgun sequencing:

By removing the mapping stages, it is a much faster process than clone-by-clone sequencing.

Uses a fraction of the DNA that clone-by-clone sequencing needs.

Efficient if there is an existing reference sequence.

Easier to assemble the genome sequence by aligning it to an existing reference genome.

Faster and less expensive than methods requiring a genetic map.

Disadvantages of shotgun sequencing

Vast amounts of computing power and sophisticated software are required to assemble shotgun sequences together.

Errors in assembly are more likely to be made because a genetic map is not used, although these errors are generally easier to resolve than in other methods and are minimized if a reference genome can be used.

It is typically carried out when a reference genome is already available; otherwise assembly is very difficult without an existing genome to match it to.

Repetitive genomes and sequences can be more difficult to assemble.

Assembling Communities

The assembly of communities has strong similarities to the assembly of highly polymorphic diploid eukaryotes, such as Ciona savignyi and Candida albicans, if we view prokaryotic strains as analogous to eukaryotic haplotypes.

The main difference is that in a microbial community, the number of strains is unknown and potentially large, and their relative abundance is also unknown and potentially skewed, while in most eukaryotes we know a priori the number of haplotypes and their relative abundance.

This disadvantage is mitigated somewhat by the small size and relative lack of repetitive sequence in prokaryotic and viral genomes, so that the issue of distinguishing alleles from paralogs and polymorphism from repetitive sequence is less acute.

We performed similar calculations for the three whale fall communities. In addition, we considered the problem of assembling all genomes in these communities.

Since the 16S survey indicated that three dominant species constitute approximately half the total abundance and all other species have roughly equal abundance, the Lander–Waterman model implies that the expected coverage should be distributed as the mixture of two Poissons with equal weight.

The results of these calculations are summarized. Similar results were obtained by Venter et al. and Breitbart et al., and bioinformaticians use different software for this purpose.
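The mixture of two Poissons mentioned above can be written out as a short calculation. Under the Lander–Waterman model, the number of reads covering a base is approximately Poisson with mean equal to the coverage depth (reads times read length divided by genome length). The two depths used below (6.0 for the dominant species, 0.5 for the rest) are invented numbers for illustration and are not taken from the whale fall data.

```python
import math

def coverage_depth(num_reads, read_length, genome_length):
    """Expected coverage depth: total sequenced bases divided by genome length."""
    return num_reads * read_length / genome_length

def poisson_pmf(k, mean):
    """Probability that a base is covered by exactly k reads."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def mixture_pmf(k, mean_dominant, mean_rare, weight=0.5):
    """Equal-weight (by default) mixture of the dominant and rare coverage distributions."""
    return weight * poisson_pmf(k, mean_dominant) + (1 - weight) * poisson_pmf(k, mean_rare)

# Hypothetical coverage depths for the two groups of species.
dominant_depth = 6.0
rare_depth = 0.5

# Fraction of sequence expected to be missed entirely (covered by zero reads).
uncovered = mixture_pmf(0, dominant_depth, rare_depth)
print(f"expected uncovered fraction: {uncovered:.3f}")
```

With these example depths, almost all of the uncovered fraction comes from the low-abundance group, which is the practical consequence of a skewed community.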

Whole genome shotgun sequencing guided by bioinformatics pipelines—an optimized approach for an established technique

Shotgun metagenomics sequencing allows researchers to comprehensively sample all genes in all organisms present in a given complex sample. The method enables microbiologists to evaluate bacterial diversity and detect the abundance of microbes in various environments. Shotgun metagenomics also provides a means to study unculturable microorganisms that are otherwise difficult or impossible to analyze.

Phylogeny and Community Diversity

With regard to community diversity, one of the advantages of the WGS approach is that it is less biased than PCR, which is known to suffer from a host of problems.

Community modeling based on analysis of assembly data within the Lander–Waterman model is beginning to show that species abundance curves are not lognormal as previously thought.

New methods that take into account these naturally occurring distributions are needed.

Conclusion

The number of new community shotgun sequencing projects continues to grow, promising to provide vast quantities of sequence data for analysis.

Samples are being drawn from macroscopic environments such as the sea and air, as well as from more contained communities such as the human mouth.

Exciting advances in our understanding of ecosystems, environments, and communities will require creative solutions to numerous new bioinformatics problems.

We have briefly mentioned some of these: assembly (can co-assembly techniques be used to assemble polymorphic genomes and complex communities?), binning (what is the best way to combine diverse sources of information to bin scaffolds?), gene finding (how should gene finding programs, which were designed for complete genes and genomes, be adapted for low-coverage sequence?), fingerprinting (which clustering techniques are best suited for discovering novel pathways and functional groups that allow communities to adapt to their environments?), and MSA and phylogeny (how can we best construct trees and alignments from fragmented data?).

Countless more challenges will likely emerge as WGS sequencing approaches are used to tackle increasingly complex communities.

The reward for computational biologists who work on these problems will be the satisfaction of contributing to the grand enterprise of understanding the total diversity of life on our planet. 

A5-miseq

Produces high-quality microbial genome assemblies on a laptop computer without any parameter tuning. A5-miseq does this by automating the process of adapter trimming, quality filtering, and the subsequent assembly steps.

Orione

A Galaxy-based framework consisting of publicly available research software and specifically designed pipelines to build complex, reproducible workflows for next-generation sequencing microbiology data analysis.

Enabling microbiology researchers to conduct their own custom analysis and data manipulation without software installation or programming, Orione provides new opportunities for data-intensive computational analyses in microbiology and metagenomics.

METAGENOMICS APPLICATIONS

• Successful products

• Antibiotics

• Antibiotic resistance pathways

• Anti-cancer drugs

• Degradation pathways: lipases, amylases, nucleases, hemolytic


• Transport proteins

• Ecology and Environment

• Energy

• Bioremediation

• Biotechnology

• Agriculture

• Biodefence

Applications

● Global Impacts. The role of microbes is critical in maintaining atmospheric balances, as they are the main photosynthetic agents, are responsible for the generation and consumption of greenhouse gases, and are involved at all levels in ecosystems and trophic chains.


● Bioremediation. Cleaning up environmental contamination, such as

● the waste from water treatment facilities

● gasoline leaks on lands or oil spills in the oceans

● toxic chemicals


● Bioenergy. We are harnessing microbial power in order to produce

● ethanol (from cellulose), hydrogen, methane, butanol...

● Smart Farming. Microbes help our crops by

● the “suppressive soil” phenomenon (buffer effect against disease-causing organisms)

● soil enrichment and regeneration


● The World Within. Studying the human microbiome may lead to valuable new tools and guidelines in

● Human and animal nutrition

● Better understanding of complex diseases (obesity, cancer, asthma...)

● Drug discovery

● Preventative medicine


Mapping the human microbiome

Tools

QIIME

QIIME is an open-source bioinformatics pipeline for performing microbiome analysis from raw DNA sequencing data.

QIIME is designed to take users from raw sequencing data generated on the Illumina or other platforms through publication quality graphics and statistics.

QIIME has been applied to studies based on billions of sequences from tens of thousands of samples.

Mothur

The mothur project aims to develop a single piece of open-source, expandable software to fill the bioinformatics needs of the microbial ecology community.

Screening, processing, aligning, and clustering of Sanger, 454, or Illumina (16S rRNA) amplicons

Generating a high-quality, effectively ‘normalized’ shared file (i.e. counts of OTUs per sample)

Gaining general taxonomic information about the OTUs in your study system (RDP Taxonomic Classifier)
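To make the idea of a shared file concrete, the sketch below builds a table of OTU counts per sample from per-read assignments in plain Python. It does not use or reproduce mothur itself: the sample names, OTU labels, and function names are hypothetical, and conversion to relative abundance is used only as a simple stand-in for normalization.

```python
from collections import Counter, defaultdict

def build_shared_table(assignments):
    """Count reads per OTU within each sample from (sample, otu) pairs."""
    table = defaultdict(Counter)
    for sample, otu in assignments:
        table[sample][otu] += 1
    return {sample: dict(counts) for sample, counts in table.items()}

def relative_abundance(table):
    """Convert raw counts to per-sample fractions so samples of different depth are comparable."""
    result = {}
    for sample, counts in table.items():
        total = sum(counts.values())
        result[sample] = {otu: n / total for otu, n in counts.items()}
    return result

# Hypothetical per-read OTU assignments: (sample, OTU label).
assignments = [
    ("gut_1", "Otu001"), ("gut_1", "Otu001"), ("gut_1", "Otu002"), ("gut_1", "Otu001"),
    ("soil_1", "Otu002"), ("soil_1", "Otu003"),
]
shared = build_shared_table(assignments)
fractions = relative_abundance(shared)
print(shared)
print(fractions)
```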

MEGAN

 

In metagenomics, the aim is to understand the composition and operation of complex microbial consortia in environmental samples through sequencing and analysis of their DNA. 

MG-RAST

FUTURE OF METAGENOMICS

• To identify new enzymes & antibiotics

• To assess the effects of age, diet, and pathologic states (e.g., inflammatory bowel diseases, obesity, and cancer) on the distal gut microbiome of humans living in different environments

• Study of more exotic habitats

• Study antibiotic resistance in soil microbes

• Improved bioinformatics will quicken analysis for library profiling


• Investigating ancient DNA remnants

• Discoveries such as phylogenetic tags (rRNA genes, etc.) will give momentum to the growing field

• Learning novel pathways will yield knowledge about currently nonculturable bacteria, which may then make it possible to culture them
