functional annotation

Functional Annotation: Background + Strategy
The Group, 27th Feb 2012
Lavanya Rishishwar, Artika Nath, Lu Wang, Haozheng Tian, Shengyun Peng, Ashwath Kumar, Hamidreza Hassanzadeh


Page 1: Functional Annotation

Functional Annotation

Background + Strategy

The Group

27th Feb 2012

Lavanya Rishishwar, Artika Nath, Lu Wang, Haozheng Tian, Shengyun Peng, Ashwath Kumar, Hamidreza Hassanzadeh

Page 2: Functional Annotation

Outline

• What is Functional Annotation
• The Importance of Functional Annotation
• The Biology of H. haemolyticus
• Background for Functional Annotation
• Pros/Cons of Available Approaches
• Planned Approach
  – Breadth
  – Depth


Page 4: Functional Annotation

THE ‘WHAT?’

Functional Annotation

Page 5: Functional Annotation

Genome Assembly

Assemble the Pieces Right


Page 6: Functional Annotation

Gene Prediction

When on board HMS Beagle, as naturalist, I was much struck with certain facts in the distribution of the inhabitants of South America, and in the geological relations of the present to the past inhabitants of that continent. These facts seemed to me to throw some light on the origin of species - that mystery of mysteries, as it has been called by one of our greatest philosophers .

Identify the words

When on board HMS Beagle, as naturalist, I was much struck with certain facts in the distribution of the inhabitants of South America, and in the geological relations of the present to the past inhabitants of that continent. These facts seemed to me to throw some light on the origin of species - that mystery of mysteries, as it has been called by one of our greatest philosophers .


Page 7: Functional Annotation

Functional Annotation

When on board HMS Beagle, as naturalist, I was much struck with certain facts in the distribution of the inhabitants of South America, and in the geological relations of the present to the past inhabitants of that continent. These facts seemed to me to throw some light on the origin of species - that mystery of mysteries, as it has been called by one of our greatest philosophers .

nat·u·ral·ist [nach-er-uh-list, nach-ruh-] noun
1. a person who studies or is an expert in natural history, especially a zoologist or botanist.
2. an adherent of naturalism in literature or art.
Origin: 1580–90; natural + -ist

Origin of Species, The noun
(On the Origin of Species by Means of Natural Selection, or the Preservation of Favoured Races in the Struggle for Life) a treatise (1859) by Charles Darwin setting forth his theory of evolution.

Identify the function (i.e., meaning) of each word

DATABASES

PROFILES


Page 9: Functional Annotation

THE GRAVITY OF THE ANNOTATION PROCESS

Not just Newtonian

Page 10: Functional Annotation

“Ultimately, one wishes to determine how genes—and the

proteins they encode—function in the intact organism.”

Alberts B, et al. (2002) Molecular Biology of the Cell. New York: Garland Science.

Page 11: Functional Annotation

Function? What is it?

• To a cell biologist, function might refer to the network of interactions in which the protein participates or to its localization to a certain cellular compartment.

• To a biochemist, function refers to the metabolic process in which a protein is involved or to the reaction catalyzed by an enzyme.

Page 12: Functional Annotation

Functional Annotation

Functional annotation consists of attaching biological information to genomic elements.
• Biochemical function
• Biological function
• Involved regulation and interactions
• Expression

Page 13: Functional Annotation

Whatever happened to wet-lab?

“Experimentally annotating one complete bacterial genome varies from organism to organism. Roughly speaking, it could take as much as $25,000 and a period of 6-12 months for completing the process”

- Alejandro Caro


Page 14: Functional Annotation

The Naked Truth

[Figure: KEGG Genome, release update of Jan 2012. Number of genomes in KEGG, plotted half-yearly from 7/98 to 1/12; the vertical axis runs from 0 to 2000.]

Page 15: Functional Annotation

How Do Genes Perform Their Function? The Operon

• Operon: several genes with related functions that are regulated together, because one piece of mRNA codes for several related proteins.

• Polycistronic mRNA, i.e. mRNA coding for more than one polypeptide, is found mainly in prokaryotes.

Page 16: Functional Annotation

Coding and Non-coding RNAs

Protein coding:
• Enzymes
• Structural
• Regulatory
• Signal transduction
• Receptors
• Toxins
• Virulence factors
• Membrane/Transmembrane

Non-coding:
• Riboswitches
• CRISPRs
• sRNAs

Pathway Prediction

Page 17: Functional Annotation

Domain/Motif

• Domain: a discrete structural unit that is assumed to fold independently of the rest of the protein and to have its own function (~20-100 aa).

• Motif: short, conserved regions that are frequently the most conserved regions of domains. Motifs are critical for the domain to function.


Page 19: Functional Annotation

Understanding the Target


Haemophilus haemolyticus - The Biography

Page 20: Functional Annotation

Haemophilus haemolyticus

• Gram-negative
• Facultative anaerobe
• Known to colonize the human respiratory tract
• Of the 8 Haemophilus species found to colonize the respiratory tract, H. influenzae and H. haemolyticus are the most prevalent.
• H. haemolyticus is an emerging pathogen
  – 5 cases of invasive disease were reported between 2009 and 2010.

Page 21: Functional Annotation

Strains of H. haemolyticus

fucK: encodes fuculose kinase; fucK deletion has been observed in some H. influenzae (Hi) isolates.
Hpd: encodes lipoprotein protein D.

Strain  Species          Disease State  State Isolated  Hemolysis  Hpd  fucK
M19107  H. haemolyticus  Asymptomatic   Minnesota       Y          -    -
M19501  H. haemolyticus  Asymptomatic   Minnesota       N          +    -
M21127  H. haemolyticus  Pathogenic     Georgia         Y          -    -
M21621  H. haemolyticus  Pathogenic     Texas           Y          -    -
M21639  H. haemolyticus  Pathogenic     Illinois        N          -    -
M21709  H. influenzae    Pathogenic     NY              N          -    +

Page 22: Functional Annotation

Phylogeny

Nørskov-Lauritsen, N., et al. (2005). Multilocus sequence phylogenetic study of the genus Haemophilus with description of Haemophilus pittmaniae sp. nov. International Journal of Systematic and Evolutionary Microbiology, 55, 449–456.


Page 24: Functional Annotation

View from 300 ft
and a brief time travel

Page 25: Functional Annotation

Ontology

• An ontology is a “formal, explicit specification of a shared conceptualization”

• Two major formal ontology schemes:
  – EC: Enzyme Commission number
  – GO: Gene Ontology

Page 26: Functional Annotation

Enzyme Commission (EC)

• A large-scale, comprehensive attempt to organize and classify enzymes according to their function
• For inclusion in the list, direct experimental evidence must be provided for the claimed activity
• Organizes the list of enzymes in four levels of hierarchy, starting with 6 top-level classes:
  1. Oxidoreductases
  2. Transferases
  3. Hydrolases
  4. Lyases
  5. Isomerases
  6. Ligases
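The four-level EC hierarchy can be made concrete with a small sketch. The Python below is illustrative only: the function names `parse_ec` and `ec_lineage` are hypothetical, and the class table covers just the six top-level classes named on this slide.

```python
# Top-level EC classes as listed on the slide (illustrative lookup table).
EC_CLASSES = {
    1: "Oxidoreductases",
    2: "Transferases",
    3: "Hydrolases",
    4: "Lyases",
    5: "Isomerases",
    6: "Ligases",
}


def parse_ec(ec):
    """Split an EC number such as 'EC 3.4.21.4' into its four levels."""
    text = ec.strip()
    if text.upper().startswith("EC"):
        text = text[2:].strip()
    levels = text.split(".")
    if len(levels) != 4:
        raise ValueError(f"expected four levels, got {ec!r}")
    top = int(levels[0])
    return {"class": EC_CLASSES.get(top, "unknown class"), "levels": levels}


def ec_lineage(ec):
    """Return the chain of parents, from the top-level class down to the full number."""
    levels = parse_ec(ec)["levels"]
    return [".".join(levels[:i] + ["-"] * (4 - i)) for i in range(1, 5)]


print(parse_ec("EC 1.1.1.1")["class"])
print(ec_lineage("3.4.21.4"))
```

Each entry in the lineage has exactly one parent, which is the strict parent-to-child structure of the EC scheme.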

Page 27: Functional Annotation

Chronology: Enzyme Commission (EC)

• Cons of EC:
  – The hierarchy only provides parent-to-child relationships
  – Specific to enzymes only (does not cover all proteins)

Page 28: Functional Annotation

Chronology: Gene Ontology (GO)

Or, in other words, "give this protein a name and stick to it!!"

Page 29: Functional Annotation

What is the GO?

• Molecular Function
• Biological Process
• Cellular Component
• Relations between the terms
  – ‘is_a’
  – ‘part_of’, ‘has_part’
  – ‘regulates’
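Because GO terms are linked by typed relations, an annotation to a specific term also implies its ancestors. The sketch below walks a toy graph along ‘is_a’ and ‘part_of’ edges. The term names and the `RELATIONS` list are a hand-made illustration, not an excerpt from the real ontology, and `ancestors` is a hypothetical helper.

```python
# Toy (child, relation, parent) triples; illustrative only, not real GO content.
RELATIONS = [
    ("mitochondrial inner membrane", "part_of", "mitochondrion"),
    ("mitochondrial inner membrane", "is_a", "membrane"),
    ("mitochondrion", "is_a", "organelle"),
    ("organelle", "is_a", "cellular component"),
    ("membrane", "is_a", "cellular component"),
]


def ancestors(term, relations=RELATIONS, follow=("is_a", "part_of")):
    """Collect every term reachable from `term` along the chosen relation types."""
    parents = {}
    for child, relation, parent in relations:
        if relation in follow:
            parents.setdefault(child, set()).add(parent)

    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


print(sorted(ancestors("mitochondrial inner membrane")))
print(sorted(ancestors("mitochondrial inner membrane", follow=("is_a",))))
```

Unlike the EC hierarchy, a term here can have several parents, so the structure is a graph rather than a tree.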

Page 30: Functional Annotation

Structure of GO

du Plessis L, Skunca N, Dessimoz C (2011). The what, where, how and why of gene ontology – a primer for bioinformaticians. Brief Bioinform. doi: 10.1093/bib/bbr002

Page 31: Functional Annotation

General Rule To Apply Evidence Code


Page 32: Functional Annotation

Where Do Annotations Come From?

• Inferred from experiment
  – Most reliable
  – Basis for computational methods

• Inferred from computational method
  – Sequence similarity, structural similarity, etc.

• Inferred from author statement
• Curator statement and obsolete evidence codes

Page 33: Functional Annotation

Why use the GO?

• The ‘GO Consortium’ consists of a number of large databases working together to define standardized ontologies and provide annotations to the GO.

• Search for interacting genes

• Reason across the relations

• Analyze the results of high-throughput experiments

• Infer the function of un-annotated genes and infer protein-protein interactions


Page 35: Functional Annotation

CAUTION!

PROS AND CONS OF CONVENTIONAL APPROACHES

Choosing The Right Function Prediction Tool

Page 36: Functional Annotation

“Perutz et al. showed in 1960 that myoglobin and hemoglobin, the first two protein structures to be solved at atomic resolution using X-ray crystallography, have similar structures even though their sequences differ.”


Page 37: Functional Annotation

Pros and Cons: There are no free lunches!

• Homology: useful, but different from “same” function
  – It simply implies common ancestry

Page 38: Functional Annotation

Pros and Cons: There are no free lunches!


Page 39: Functional Annotation

Pros and Cons: There are no free lunches!

• The quality of a prediction is only as good as the quality of annotation of the underlying database

• A function predictor built for eukaryotes cannot be used for prokaryotes, and vice versa


Page 41: Functional Annotation


BREADTH AND DEPTH OF THE ANALYSIS

A Snapshot of the Iceberg Named Functional Annotation


Page 42: Functional Annotation

BREADTH

Spectrum of Methods Selected

Page 43: Functional Annotation

Criteria for selecting methods

1. Currently being maintained
2. Applicable to prokaryotic sequences
3. Can be installed locally (supporting batch jobs if GUI-based)
   OR
   Can be included in a pipeline, i.e., has a command-line interface

Page 44: Functional Annotation

Categories of Approaches

• Sequence similarity-based
• Phylogenomics-based
• Domain/pattern/profile-based
  – Domain-based
  – Pattern-based
  – Profile-based
• Sequence clustering-based
• Machine learning-based
• Network-based

Page 45: Functional Annotation

Breadth: Options

Dead GUI Proprietary Eukaryotic Model External Servers InterPro Web-based Servers

Approach Resource Approach Resource

Sequence similarity based

GOtcha

Sequence clustering based

ProtoNet PFP CluSTr

GOsling eggNOG OntoBlast COGs

GOblet InParanoid Blast2GO MultiParanoid

Phylogenomics based SIFTER OrthoMCL AFAWE

Machine learning based

ProtFun RIO GOPET

OrthoStrapper SVM-Prot

Domain/pattern/profile based

InterProScan ffPred TMHMM EzyPred HMMTOP

Network based

MCODE HMMER AGeS

Pfam SAMBA SUPERFAMILY RNSC

PROSITE PRODISTIN PRINTS Cytoscape SMART STRING Gene3D VisANT

PANTHER VIRGO TIGRFAMs

Pipelines

RAST SCOP MultiParanoid CATH AGMIAL

CatFam MicroScope PIRSF

PRODOM EFICAz PRIAM

Page 46: Functional Annotation

Flowchart


Page 47: Functional Annotation

DEPTH

Description of Selected Methods

Page 48: Functional Annotation

Level 1

The building blocks!

Page 49: Functional Annotation

Pan-genome Analysis

• The pan-genome is the full complement of genes in a species.

• It includes the core genome (genes present in all strains), the dispensable genome (genes present in two or more strains, but not all) and unique genes (genes specific to a single strain).

• In this case, we will be using the pan-genome of Haemophilus influenzae.

• This database will be used as the reference database in BLAST.

• This method gives high-confidence annotations, since the strains selected are very closely related to the organism in question.
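The split of a pan-genome into core, dispensable and unique genes can be sketched with plain set operations. This is a minimal illustration assuming gene presence per strain is already known; the strain and gene names are made up, and `partition_pangenome` is a hypothetical helper, not part of any tool named in these slides.

```python
def partition_pangenome(strain_genes):
    """Split the pan-genome of two or more strains into core, dispensable and unique genes.

    `strain_genes` maps a strain name to the set of gene families found in it.
    """
    gene_sets = list(strain_genes.values())
    pan = set().union(*gene_sets)
    # Count in how many strains each gene family occurs.
    counts = {gene: sum(gene in genes for genes in gene_sets) for gene in pan}

    core = {g for g, c in counts.items() if c == len(gene_sets)}
    unique = {g for g, c in counts.items() if c == 1}
    dispensable = pan - core - unique  # in at least two strains, but not all
    return core, dispensable, unique


# Made-up presence/absence data for three hypothetical strains.
example = {
    "strain1": {"geneA", "geneB", "geneC"},
    "strain2": {"geneA", "geneB", "geneD"},
    "strain3": {"geneA", "geneC", "geneE"},
}
core, dispensable, unique = partition_pangenome(example)
print(sorted(core), sorted(dispensable), sorted(unique))
```

In practice, deciding that two genes from different strains are "the same gene" requires clustering by sequence similarity first; the sketch assumes that step has been done.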

Page 50: Functional Annotation

BLAST: How does it work?

1. Divide the query sequence into short chunks called words
2. Look for exact matches of these words in the database
3. In case of a hit, try extending the alignment
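The three steps can be sketched in a few lines of Python. This is a deliberately simplified illustration, not BLAST's actual algorithm: it extends a seed only while characters match exactly, whereas the real program scores the extension with a substitution matrix and a drop-off threshold. The names `find_seeds` and `extend_seed` and the word size of 3 are assumptions made for the example.

```python
def find_seeds(query, subject, word_size=3):
    """Steps 1 and 2: cut the query into words and look up exact matches in the subject."""
    index = {}
    for j in range(len(subject) - word_size + 1):
        index.setdefault(subject[j:j + word_size], []).append(j)
    return [
        (i, j)
        for i in range(len(query) - word_size + 1)
        for j in index.get(query[i:i + word_size], [])
    ]


def extend_seed(query, subject, i, j, word_size=3):
    """Step 3: grow a seed at (i, j) in both directions while the characters agree."""
    left = 0
    while i - left - 1 >= 0 and j - left - 1 >= 0 and query[i - left - 1] == subject[j - left - 1]:
        left += 1
    right = word_size
    while i + right < len(query) and j + right < len(subject) and query[i + right] == subject[j + right]:
        right += 1
    return query[i - left:i + right]


query, subject = "GATTACA", "TTGATTACATT"
seeds = find_seeds(query, subject)
print(seeds)
print(extend_seed(query, subject, *seeds[0]))
```

Indexing the subject once and looking words up in a dictionary is what makes the seeding step fast compared with aligning every position against every other.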

Page 51: Functional Annotation

Statistical assessment

E-value: $E = m \times n \times P$, where
• $m$ = total number of residues in the database
• $n$ = number of residues in the query sequence
• $P$ = probability that an HSP alignment is a result of random chance
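As a worked illustration of the E-value as the product of database size, query length and chance probability, the snippet below multiplies the three quantities. The numbers are invented for the example and the function name `e_value` is hypothetical. Note also that this is the simplified textbook form; BLAST itself derives E-values from alignment scores using Karlin-Altschul statistics.

```python
def e_value(db_residues, query_residues, p_random):
    """Simplified E-value: database size x query length x chance probability of the HSP."""
    return db_residues * query_residues * p_random


# Invented numbers: a 250-residue query against a database of 10^9 residues.
e = e_value(db_residues=10**9, query_residues=250, p_random=1e-15)
print(f"E = {e:.1e}")
```

The same hit probability gives a larger E-value in a larger database, which is why E-values from searches against different databases are not directly comparable.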

Page 52: Functional Annotation

Different flavors!

• BLASTN
  – Queries nucleotide vs. nucleotide sequences
• BLASTP
  – Queries protein vs. protein sequences
• BLASTX
  – Queries the 6 possible reading frames of a nucleotide sequence vs. protein sequences
• TBLASTN
  – Reciprocal of BLASTX: queries a protein sequence vs. the 6 possible reading frames of the nucleotide sequences in the database
• TBLASTX
  – Queries the 6 possible reading frames of a nucleotide sequence vs. the 6 possible reading frames of the nucleotide sequences in the database
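The "6 possible frames" used by BLASTX, TBLASTN and TBLASTX are the three reading frames of each strand. The sketch below generates them with the standard genetic code. It is an illustration written for these notes, not code taken from BLAST, and the helper names are hypothetical.

```python
# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X', stops become '*'."""
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Return the translations of frames +1..+3 and -1..-3."""
    seq = seq.upper()
    frames = {}
    for strand, template in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(template[offset:])
    return frames


for name, protein in six_frames("ATGGCCTAA").items():
    print(name, protein)
```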

Page 53: Functional Annotation

InterPro

• Combines protein signatures from a number of member databases into a single searchable resource

• Capitalizes on their individual strengths to produce an integrated database and diagnostic tool.

"InterPro provides functional analysis of proteins by classifying them into families and predicting domains and important sites."

Page 54: Functional Annotation

Current release: 36.0 (23 February 2012)

New features:

• An update to Pfam (26.0) and PIRSF (2.78).

• The integration of 755 new methods from the GENE3D, PANTHER, PIRSF, Pfam and SUPERFAMILY databases.

Page 55: Functional Annotation

Member database information

Signature Database  Version  Signatures*  Integrated Signatures**
GENE3D              3.3.0    2386         1441
HAMAP               140911   1702         1686
PANTHER             7        69566        2392
PIRSF               2.78     2983         2983
PRINTS              41.1     2050         2001
PROSITE patterns    20.72    1308         1291
PROSITE profiles    20.72    922          897
Pfam                26       13672        12672
PfamB               26       20000        0
ProDom              2006.1   1894         1105
SMART               6.2      1008         1002
SUPERFAMILY         1.73     1774         1208
TIGRFAMs            10.1     4023         4002

* Some signatures may not have matches to UniProtKB proteins.
** Not all signatures of a member database may be integrated at the time of an InterPro release.

Page 56: Functional Annotation


Gene3D: “The Gene3D database is a large collection of CATH (Class, Architecture, Topology, Homologous superfamily) protein domain assignments for ENSEMBL genomes and UniProt sequences.”

HAMAP: High-quality Automated and Manual Annotation of microbial Proteomes

PANTHER: Protein ANalysis THrough Evolutionary Relationships

PIRSF: Evolutionary relationships of proteins from super- to sub-families

PRINTS: “PRINTS is a database of protein family ‘fingerprints’ offering a diagnostic resource for newly-determined sequences.”

PROSITE: Database of protein domains, families and functional sites

ProDom

SMART: Simple Modular Architecture Research Tool

SUPERFAMILY: “SUPERFAMILY is a database of structural and functional annotation for all proteins and genomes.”

TIGRFAMs

Page 57: Functional Annotation

Integration into InterPro

Page 58: Functional Annotation

Features of Member Databases

• ProDom: provider of sequence clusters built from UniProtKB using PSI-BLAST.

• PROSITE patterns: provider of simple regular expressions.

• PROSITE and HAMAP profiles: providers of sequence matrices.

• PRINTS: provider of fingerprints, which are groups of aligned, un-weighted Position Specific Sequence Matrices (PSSMs).

• PANTHER, PIRSF, Pfam, SMART, TIGRFAMs, Gene3D and SUPERFAMILY: providers of hidden Markov models (HMMs).
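To show what "simple regular expressions" means for PROSITE patterns, the sketch below converts the PROSITE pattern notation into a Python regular expression. It is a minimal illustration that handles only the common elements (`x`, `[...]`, `{...}`, repeat counts and the `<` / `>` anchors at the ends of an element); the function name `prosite_to_regex` is hypothetical and this is not the scanner InterProScan uses.

```python
import re


def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern such as 'N-{P}-[ST]-{P}.' into a regex string."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = "^" if element.startswith("<") else ""   # N-terminal anchor
        suffix = "$" if element.endswith(">") else ""     # C-terminal anchor
        element = element.strip("<>")

        # Separate the residue specification from an optional (n) or (n,m) repeat.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()

        if core == "x":                 # any residue
            core = "."
        elif core.startswith("{"):      # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        # '[ABC]' and single letters are already valid regex syntax.

        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


# N-glycosylation site pattern (PROSITE PS00001).
regex = prosite_to_regex("N-{P}-[ST]-{P}.")
print(regex)
print(re.search(regex, "MKNGSAL"))
```

Patterns of this kind are exact-match rules, so they are fast and easy to read but less sensitive than the profile and HMM methods in the same list.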

Page 59: Functional Annotation

Querying with InterProScan

“Sequence-based queries are performed using InterProScan, a tool that combines the different protein signature recognition methods native to the InterPro member databases into one resource.”

[Figure: a query sequence submitted to InterProScan]

Page 60: Functional Annotation

Querying with InterProScan

• Web version
• Stand-alone version
  – A wrapper of sequence analysis applications
  – Database and output file scanning
  – Bulk data processing

Page 61: Functional Annotation

Member Databases & Scanning Methods

The TMHMM and SignalP prediction search algorithms are provided through the web interface at EBI. However, they are not integrated into InterPro.

Member Database                   Scanning Method    Software Package
PROSITE patterns                  pfscan             Pftools
PROSITE profiles, HAMAP profiles  pfscan             Pftools
PRINTS                            FingerPRINTScan
PFAM                              hmmscan            HMMER3.0b3
PRODOM                            ProDomBlast
SMART                             hmmpfam            HMMER2.3.2
TIGRFAMs                          hmmscan            HMMER3.0b3
PIR SuperFamily                   hmmpfam            HMMER2.3.2
SUPERFAMILY                       hmmpfam/hmmsearch  HMMER2.3.2
GENE3D                            hmmpfam            HMMER2.3.2

Page 62: Functional Annotation

Blast2GO

• B2G has been designed to (1) allow automatic and high-throughput sequence annotation and (2) integrate functionality for annotation-based data mining.

Page 63: Functional Annotation

Why Blast2GO?

• Blast2GO is designed for high-throughput sequence annotation.

• Strong data mining and visualization capabilities

• Good at utilizing annotated sequences already deposited in public databases.

Page 64: Functional Annotation

How does Blast2GO work?

• Basically, Blast2GO uses local or remote BLAST searches to find sequences similar to one or several input sequences.
• The program extracts the GO terms associated with each of the obtained hits and returns an evaluated GO annotation for the query sequence(s).

• Enzyme codes are obtained by mapping from equivalent GOs, while InterPro motifs are directly queried at the InterProScan web service.

• GO annotation can be visualized by reconstructing the structure of the Gene Ontology relationships, and ECs are highlighted on KEGG maps.

Page 65: Functional Annotation

How does Blast2GO work?

• OBTAINING GO TERMS
  – The first step is to find sequences similar to a query set by BLAST searching. The homology search can be run against public databases, or against custom databases when a local BLAST installation is available.

  – Using the BLAST hit gene identifiers (gi) and gene accessions, B2G retrieves all GO annotations for the hit sequences, together with their evidence codes (abbreviated EC here, not to be confused with Enzyme Commission numbers).

Page 66: Functional Annotation

How does Blast2GO work?

• ANNOTATION ASSIGNMENT
  – annotation score (AS), direct term (DT)

Page 67: Functional Annotation

How does Blast2GO work?

• STATISTICS
  – Statistical assessment of GO term enrichment in a group of interesting genes when compared with a reference group (Blüthgen et al., 2004).

  – Gossip computes Fisher’s Exact Test, applying a robust FDR (false discovery rate) correction for multiple testing, and returns a list of significant GO terms ranked by their corrected or one-test P-values.

• VISUALIZATION
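The enrichment test for a single GO term can be sketched as a one-sided Fisher's exact test, which is the upper tail of a hypergeometric distribution. The code below is an illustration written for these notes, not Gossip's implementation: it covers one term only and leaves out the FDR correction. The function name and the counts in the example are assumptions.

```python
from math import comb


def enrichment_p_value(hits_in_group, group_size, hits_in_total, total_genes):
    """One-sided Fisher's exact test for over-representation of a GO term.

    Probability of seeing at least `hits_in_group` annotated genes when
    `group_size` genes are drawn at random from `total_genes`, of which
    `hits_in_total` carry the term.
    """
    upper = min(group_size, hits_in_total)
    favourable = sum(
        comb(hits_in_total, i) * comb(total_genes - hits_in_total, group_size - i)
        for i in range(hits_in_group, upper + 1)
    )
    return favourable / comb(total_genes, group_size)


# Invented counts: 3 of 5 interesting genes carry a term that 5 of all 20 genes carry.
p = enrichment_p_value(hits_in_group=3, group_size=5, hits_in_total=5, total_genes=20)
print(f"p = {p:.4f}")
```

Because one such test is run per GO term, the raw p-values need a multiple-testing correction before terms are called significant, which is the role of the FDR step mentioned above.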

Page 68: Functional Annotation

Systems for Functional Annotation

• Clusters of Orthologous Groups (COGs)
• euKaryote Orthologous Groups (KOGs)
• Gene Ontology (GO)
• Enzyme Commission no. (EC)

Page 69: Functional Annotation

Clusters of Orthologous Groups of Genes (KOGs, COGs)

– Why?
• Orthologs typically retain the same function during evolution and hence have a critical role in functional annotation. COGs provide a framework for functional analysis.

• They are also important for phylogenetic and evolutionary analysis of genomes. Interpretable phylogenetic trees generally can be constructed only within sets of orthologs.

Page 70: Functional Annotation

How to find orthologous genes?

• Naive approach: for a query gene and a target genome, the highest similarity score indicates an orthologous relationship
  – Gives good results for species that are not too distant
  – What about larger phylogenetic distances?

• Gene duplications suggest that a many-to-many relationship is required

• What if several hits emerge, none with a very high score? A stringent threshold may lead to false negatives

• COG approach: any two genes inside a COG are either orthologous genes or belong to orthologous groups of paralogs

Page 71: Functional Annotation

How to create COGs

1. Take all 2-permutations of the available genes and perform pairwise comparisons between genes from different clades (in this case 5 clades)
2. Recognize the best hits (BeTs) in other organisms
3. Build the graph of consistent relations (this does not depend on an absolute threshold level)
4. The simplest case is a triangle: a gene has best hits in two other genomes, and those two hits are also best hits of each other, as expected if all three genes are orthologs
5. Merge all triangles with a common side

Genes (n)  2-permutations, n(n-1)
2          2
10         90
3000       ~8.9e6
17967      ~3.2e8

Page 72: Functional Annotation

How to create COGs - continued

6. Due to the existence of paralogs, BeTs are not necessarily symmetrical (cf. RBBH [Reciprocal Best BLAST Hits])

Tatusov, Koonin & Lipman, Science 278, 631 (1997)
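The triangle-merging idea can be sketched as follows. This is a brute-force illustration under stated assumptions, not the published COG procedure: gene identifiers are assumed to look like `genome:gene`, the best hits are supplied as a ready-made dictionary (in practice they would come from all-against-all BLAST), an edge is drawn when either gene is a best hit of the other, and `build_cogs` is a hypothetical name.

```python
from itertools import combinations


def genome_of(gene):
    """Gene identifiers are assumed to look like 'genome:gene'."""
    return gene.split(":")[0]


def build_cogs(best_hits):
    """Group genes by merging triangles of best hits that share a side."""
    # An edge links two genes from different genomes if either is a BeT of the other.
    edges = set()
    for gene, hits in best_hits.items():
        for hit in hits:
            if genome_of(gene) != genome_of(hit):
                edges.add(frozenset((gene, hit)))

    # A triangle is three genes from three genomes with all three edges present.
    genes = sorted({g for edge in edges for g in edge})
    triangles = [
        trio for trio in combinations(genes, 3)
        if len({genome_of(g) for g in trio}) == 3
        and all(frozenset(pair) in edges for pair in combinations(trio, 2))
    ]

    # Merge triangles that share a side; each cluster keeps its genes and its sides.
    clusters = []
    for trio in triangles:
        members = set(trio)
        sides = {frozenset(pair) for pair in combinations(trio, 2)}
        untouched = []
        for c_members, c_sides in clusters:
            if c_sides & sides:
                members |= c_members
                sides |= c_sides
            else:
                untouched.append((c_members, c_sides))
        clusters = untouched + [(members, sides)]
    return [members for members, _ in clusters]


# Made-up best hits for four hypothetical genomes A to D.
best_hits = {
    "A:1": {"B:1", "C:1"},
    "B:1": {"A:1", "C:1"},
    "C:1": {"A:1"},
    "D:1": {"A:1", "B:1"},
    "A:2": {"B:2"},
}
print(build_cogs(best_hits))
```

In the example, the pair `A:2` / `B:2` forms no triangle and therefore joins no cluster, which mirrors the requirement that a COG spans at least three lineages. Enumerating all gene triples is only workable for toy inputs.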

Page 73: Functional Annotation

Facing challenges when creating COGs

• The clusters, however, are subject to ambiguity:
  – Proteins with distinct regions (multi-domain proteins), each belonging to a different conserved family.
    • Solution: further inspection of domains

  – When one gene in a pair of paralogs is lost in one lineage (but not in the other), two COGs may be artificially merged.
    • Solution: similarity measures

Page 74: Functional Annotation

COGs vs. Gene Function

• Each COG includes proteins from at least 3 major clades, with divergence times estimated at over a billion years. Hence they are ancient conserved families with important (if not essential) functions.

• Accordingly, the proteins belonging to mysterious (uncharacterized) COGs are good candidates for further analysis.

• Also, if someone experimentally verifies a gene's function, it can be confidently applied to fellow COG members. Similarly, when a new gene is added to a COG (by COGNITOR), its function is derived from the COG.

• For most free-living prokaryotes, ~80% of the genes belong to COGs. Up to 10% of proteins in genomes are estimated to be fast-evolving, poorly conserved proteins, and hence the COG coverage of most genomes is approaching saturation.

Page 75: Functional Annotation

http://www.ncbi.nlm.nih.gov/COG/

Clusters of Orthologous Groups (COGs)


Page 76: Functional Annotation

Classification of COGs by functional categories

INFORMATION STORAGE AND PROCESSING
[J] Translation, ribosomal structure and biogenesis
[A] RNA processing and modification
[K] Transcription
[L] Replication, recombination and repair
[B] Chromatin structure and dynamics

CELLULAR PROCESSES AND SIGNALING
[D] Cell cycle control, cell division, chromosome partitioning
[Y] Nuclear structure
[V] Defense mechanisms
[T] Signal transduction mechanisms
[M] Cell wall/membrane/envelope biogenesis
[N] Cell motility
[Z] Cytoskeleton
[W] Extracellular structures
[U] Intracellular trafficking, secretion, and vesicular transport
[O] Posttranslational modification, protein turnover, chaperones

METABOLISM
[C] Energy production and conversion
[G] Carbohydrate transport and metabolism
[E] Amino acid transport and metabolism
[F] Nucleotide transport and metabolism
[H] Coenzyme transport and metabolism
[I] Lipid transport and metabolism
[P] Inorganic ion transport and metabolism
[Q] Secondary metabolites biosynthesis, transport and catabolism

POORLY CHARACTERIZED
[R] General function prediction only
[S] Function unknown

Page 77: Functional Annotation

LipoP

• A tool used mainly to predict lipoprotein signal peptides.

• It is most suitable for Gram-negative bacteria but has been shown to have considerable accuracy for Gram-positive bacteria as well.

• It uses hidden Markov models to distinguish between lipoproteins (SPaseII-cleaved proteins), SPaseI-cleaved proteins, cytoplasmic proteins, and transmembrane proteins.

Page 78: Functional Annotation

Thank You!


To be continued…