Computational Proteomics
Proteomics Tools
Basic Tools
Primary Structure Analysis Tools
Secondary Structure Analysis Tools
Post-translation Modification Prediction Tools
Topology Prediction Tools
Pattern and Profile Searches Tools
Similarity Searches Tools
RNA Structure Prediction Tools
Protein Identification and Characterization Tools
Basic Tools
Translate Tool
Translate Tool Output:
Reverse Translate Tool
Transcription and Translation Tool
Primary Structure Analysis Tools
ProtParam Tool
ProtParam is a tool which allows the computation of various physical and chemical
parameters for a given protein stored in Swiss-Prot or TrEMBL or for a user entered
sequence.
The computed parameters include the molecular weight, theoretical pI, amino acid
composition, atomic composition, extinction coefficient, estimated half-life, instability
index, aliphatic index and grand average of hydropathicity (GRAVY).
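Two of the parameters listed above can be sketched in a few lines. This is a minimal illustration, not ProtParam's own code: it assumes the Kyte-Doolittle hydropathy scale for GRAVY and the commonly cited aliphatic index formula (mole percent of Ala + 2.9 × Val + 3.9 × (Ile + Leu)). The function names are hypothetical.

```python
from collections import Counter

# Kyte-Doolittle hydropathy values (assumed scale for GRAVY).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Return the fraction of each amino acid present in seq."""
    seq = seq.upper()
    counts = Counter(seq)
    return {aa: counts[aa] / len(seq) for aa in sorted(counts)}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    seq = seq.upper()
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index from the mole percent of Ala, Val, Ile and Leu."""
    comp = composition(seq)

    def pct(aa):
        return 100.0 * comp.get(aa, 0.0)

    return pct("A") + 2.9 * pct("V") + 3.9 * (pct("I") + pct("L"))
```

For example, `gravy("AR")` averages the values for alanine and arginine, and `aliphatic_index("AV")` combines 50% Ala with 50% Val.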
ProtParam Output:
ScanSite pI/MW Tool
ScanSite pI/MW Output:
RADAR (Rapid Automatic Detection and Alignments of Repeats in Protein Sequence)
RADAR Output:
COILS - Prediction of Coiled Coil Regions in Proteins
2ZIP Server
2ZIP Server Output:
ePESTfind
epestfind allows rapid and objective identification of PEST motifs in protein target sequences. Briefly, the PEST hypothesis was based on a literature survey that combined information on protein stability with protein primary sequence information. Initially, the study relied on 12 short-lived proteins with well-known properties [1], and it was extended later [2,3]. The initial group of proteins included E1A, c-myc, p53, c-fos, v-myb, P730 phytochrome, heat shock protein 70 (HSP 70), HMG-CoA reductase, tyrosine aminotransferase (TAT), ornithine decarboxylase (ODC), alpha-Casein and beta-Casein. Although these proteins have different cellular functions, it became apparent that they share high local concentrations of the amino acids proline (P), glutamic acid (E), serine (S), threonine (T) and, to a lesser extent, aspartic acid (D). From this it was concluded that PEST motifs dramatically reduce the half-lives of proteins and hence target them for proteolytic degradation.
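The idea of a "high local concentration" of P, E, S, T and D can be illustrated with a sliding window. This sketch is not the epestfind scoring scheme. It only reports the fraction of those five residues in each window, and the window length of 12 and the function names are assumptions chosen for illustration.

```python
# Residues whose local enrichment the PEST hypothesis is concerned with.
PEST_RESIDUES = set("PESTD")


def pest_enrichment(seq, window=12):
    """Return (start, fraction) for every window; fraction counts P/E/S/T/D."""
    seq = seq.upper()
    if len(seq) < window:
        return []
    scores = []
    for start in range(len(seq) - window + 1):
        chunk = seq[start:start + window]
        fraction = sum(aa in PEST_RESIDUES for aa in chunk) / window
        scores.append((start, fraction))
    return scores


def richest_window(seq, window=12):
    """Return the first window with the highest P/E/S/T/D fraction, or None."""
    scores = pest_enrichment(seq, window)
    return max(scores, key=lambda item: item[1]) if scores else None
```

A call such as `richest_window(sequence)` gives the 0-based start of the most enriched window, which could then be inspected by hand or passed to a real PEST predictor.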
ePESTfind Output:
ProtScale
ProtScale Output:
Secondary Structure Analysis Tools
GOR IV A Secondary Structure Predicting Server
GOR IV Output:
SSP (Protein Secondary Structure)
SSP (Protein Secondary Structure) Output:
PDISORDER prediction
PDISORDER V. 1.0 is a program for predicting ordered and disordered regions in protein sequences.
The minimum required sequence length is 40.
It is increasingly evident that intrinsically unstructured protein regions play key roles in cell signaling,
regulation and cancer (Iakoucheva et al., J. Mol. Biol. (2002) 323, 573–584), which makes them extremely
useful for the discovery of anticancer drugs. Intrinsic structural disorder has been shown to be required for many
protein functions; see, for instance, Dunker et al., Biochemistry (2002) 41 (21), 6573–6582.
PDISORDER prediction Output:
Post-translation Modification Prediction Tools
PATS - Prediction of apicoplast targeted sequences
PATS Output:
NetOGlyc - Prediction of O-GalNAc (mucin type) glycosylation sites in mammalian proteins
YinOYang - O-beta-GlcNAc attachment sites in eukaryotic protein sequences
Myristoylator
Myristoylator Output:
Sulfinator
Sulfinator Output:
ProP
ProP Output:
SignalP
The SignalP 3.0 server predicts the presence and location of signal peptide cleavage sites in amino acid sequences from different organisms: Gram-positive prokaryotes, Gram-negative prokaryotes, and eukaryotes. The method combines a cleavage site prediction with a signal peptide/non-signal peptide prediction, both based on several artificial neural networks and hidden Markov models.
SignalP Output:
NetAcet Server
The NetAcet 1.0 server predicts substrates of N-acetyltransferase A (NatA). The method was
trained on yeast data but, as noted in the article describing the method, it achieves
similar performance on mammalian substrates acetylated by NatA orthologs.
NetAcet Server Output:
NetPhosYeast
The NetPhosYeast 1.0 server predicts serine and threonine phosphorylation sites in yeast proteins. This service is closely related to NetPhos and NetPhosK.
NetPhosYeast Output:
DictyOGlyc1.1 Server
The DictyOGlyc server produces neural network predictions for GlcNAc O-glycosylation sites in Dictyostelium discoideum proteins.
DictyOGlyc1.1 Server Output:
NetCGlyc1.0 Server
NetCGlyc 1.0 produces neural network predictions of C-mannosylation sites in mammalian proteins.
NetCGlyc1.0 Server Output:
NetOGlyc3.1 Server
The NetOGlyc server produces neural network predictions of mucin-type GalNAc O-glycosylation sites in mammalian proteins.
NetOGlyc3.1 Server Output:
NetGlycate1.0 Server
The NetGlycate 1.0 server predicts glycation of ε-amino groups of lysines in mammalian proteins.
NetGlycate1.0 Server Output:
NetPicoRNA1.0 Server
The NetPicoRNA 1.0 server produces neural network predictions of cleavage sites of picornaviral proteases.
NetPicoRNA1.0 Server Output:
NetCorona1.0 Server
NetCorona predicts coronavirus 3C-like proteinase (or protease) cleavage sites in amino acid sequences using artificial neural networks. Every potential site is scored, and the server returns a list of sites together with a graphical representation. Refer to the publication for more detailed information and performance values.
NetCorona1.0 Server Output:
Topology Prediction Tools
NetNES
NetNES Output:
TargetP
TargetP Output:
TMHMM Server
TMHMM Server Output:
Pattern and Profile Searches Tools
InterProScan
InterProScan Output:
ProDom
ProDom Output:
ScanProsite
ScanProsite Output:
Similarity Searches Tools
BLAST
WU-BLAST
WU-BLAST Output:
Fasta3
Fasta3 Output:
PropSearch
Common protein sequence alignment programs cannot at present detect functional and/or structural homologs if the sequence identity is below the significance threshold of about 25%. PROPSEARCH was designed to find the putative protein family when querying a new sequence with alignment methods has failed. PROPSEARCH neglects the order of amino acid residues in a sequence and uses the amino acid composition instead. In addition, other properties such as molecular weight, content of bulky residues, content of small residues, average hydrophobicity, average charge and so on, as well as the content of selected dipeptide groups, are calculated from the sequence. 144 such properties are weighted individually and used as the query vector. The weights have been trained on a set of protein families with known structures, using a genetic algorithm. Sequences in the database are transformed into vectors as well, and the Euclidean distance between the query and each database sequence is calculated. Distances are rank-ordered, and the sequences with the lowest distance are reported at the top.
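The composition-vector approach described above can be sketched as follows. This is a reduced illustration, not PROPSEARCH itself: it uses only the 20 amino acid fractions instead of the 144 properties, and the default weights of 1.0 stand in for the trained weights, which are not given here. All function names are hypothetical.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def composition_vector(seq):
    """Map a sequence to its 20 amino acid fractions, ignoring residue order."""
    seq = seq.upper()
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def weighted_distance(u, v, weights=None):
    """Weighted Euclidean distance between two property vectors."""
    if weights is None:
        weights = [1.0] * len(u)  # assumption: untrained, uniform weights
    return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, u, v)))


def rank_database(query, database, weights=None):
    """Return (distance, name) pairs sorted so the closest sequence is first."""
    query_vec = composition_vector(query)
    scored = [
        (weighted_distance(query_vec, composition_vector(seq), weights), name)
        for name, seq in database.items()
    ]
    return sorted(scored)
```

Because residue order is ignored, a shuffled copy of the query has distance 0, which shows both the strength of the method (no alignment needed) and its main limitation.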
PropSearch Output:
Structure Assignment With Text Description (SAWTED)
SAWTED stands for Structure Assignment With Text Description. It is a method to improve the detection of remote homologues of known structure by sequence searches (e.g. PSI-BLAST) and fold recognition programs. When sequence database searches return only hits with scores worse than an accepted reliability threshold, the user will often compare what is known about the function of the query sequence with what is known about the poor-scoring hits. Some hits may appear more sensible than others and deserve closer inspection. In SAWTED this comparison is made automatically, using an algorithm that compares the text of the SWISS-PROT annotations of the query with those of the poor-scoring hits. A single E-value is given so that the user can assess the similarity of function. SAWTED is currently implemented to enhance PSI-BLAST searches against the PDB, and as part of the 3D-PSSM fold recognition server. It uses the words from the comments (CC) and keywords (KW) sections of the SWISS-PROT entry. The score between two SWISS-PROT entries is calculated using the vector cosine model of text retrieval.
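The vector cosine model mentioned above can be sketched with plain word counts. This is an assumption-laden illustration: SAWTED's actual term weighting and its conversion of scores to E-values are not described here, so the sketch uses raw counts and returns only the cosine score. The function names are hypothetical.

```python
import math
import re
from collections import Counter


def word_vector(text):
    """Count lowercase alphanumeric words in an annotation text."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors (0.0 if either is empty)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(count * count for count in a.values()))
    norm_b = math.sqrt(sum(count * count for count in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

In use, the CC and KW text of the query entry and of each poor-scoring hit would be passed through `word_vector`, and hits would be compared by their `cosine_similarity` to the query.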
RNA Structure Prediction Tools
RNA Secondary Structure Prediction Tool
RNA Secondary Structure Prediction Tool Output:
pknotsRG
pknotsRG is a tool for folding RNA secondary structures, including the class of simple recursive
pseudoknots. The program runs in O(n^4) time and O(n^2) space, so its use here
on the BiBiServ is limited to sequences of up to 800 bases. The energy parameters for
structures containing no pseudoknots are the same as in the current mfold 3.1. The energy of
pseudoknots is computed with a model similar to that used by Rivas & Eddy in pknots. The
folding temperature is fixed at 37 °C.
Currently, three modes are available for an input sequence s:
• pknotsRG-mfe: computes the structure of s (knotted or not) of minimal free energy.
• pknotsRG-enf: enforces a pseudoknot: it delivers the energetically best complete structure of
s that includes at least one pseudoknot somewhere.
• pknotsRG-loc: delivers the best local pseudoknot that can be an element of any structure of s,
where "best" is defined by free energy per nucleotide. The rest of s remains unfolded.
pknotsRG Output:
Protein Identification and Characterization Tools
FindMod
FindMod is a tool that can predict potential protein post-translational modifications (PTM) and find potential single amino acid substitutions in peptides. The experimentally measured peptide masses are compared with the theoretical peptide masses calculated from a specified Swiss-Prot/TrEMBL entry or from a user-entered sequence, and the mass differences are used to better characterise the protein of interest.
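The mass-difference comparison described above can be sketched as a lookup against known modification masses. This is only an illustration, not FindMod's rule set: the table holds three common modifications with their monoisotopic mass shifts, and the tolerance of 0.01 Da and the function name are assumptions.

```python
# Monoisotopic mass shifts in Da for a few common modifications (illustrative subset).
PTM_DELTAS = {
    "phosphorylation": 79.96633,
    "acetylation": 42.01057,
    "methylation": 14.01565,
}


def candidate_modifications(measured_mass, theoretical_mass, tolerance=0.01):
    """Return modifications whose mass shift explains measured - theoretical."""
    delta = measured_mass - theoretical_mass
    return sorted(
        name
        for name, shift in PTM_DELTAS.items()
        if abs(delta - shift) <= tolerance
    )
```

A measured peptide that is about 79.97 Da heavier than its theoretical mass would be reported as a phosphorylation candidate, while a peptide with no mass difference yields an empty list.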
MASCOT
Mascot is a powerful search engine which uses mass spectrometry data to identify proteins from primary sequence databases. While a number of similar programs are available, Mascot is unique in that it integrates all of the proven methods of searching. These search methods can be categorised as follows:
• Peptide Mass Fingerprint, in which the only experimental data are peptide mass values.
• Sequence Query, in which peptide mass data are combined with amino acid sequence and composition information. This is a super-set of a sequence tag query.
• MS/MS Ion Search, using uninterpreted MS/MS data from one or more peptides.
The general approach for all types of search is to take a small sample of the protein of interest and digest it with a proteolytic enzyme, such as trypsin. The resulting digest mixture is analysed by mass spectrometry. Different types of mass spectrometer have different capabilities. A simple instrument will measure a set of molecular weights for the intact mixture of peptides. An instrument with MS/MS capability can additionally provide structural information by recording the fragment ion spectrum of a peptide. Usually, the digest mixture is separated by chromatography prior to MS/MS analysis, so that MS/MS spectra from individual peptides can be measured.
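The peptide mass fingerprint idea (digest with trypsin, then compare peptide masses) can be sketched as follows. This is a simplified illustration and not Mascot's scoring: it assumes the common trypsin rule of cleaving after K or R unless the next residue is P, allows no missed cleavages, and simply counts measured masses that fall within a tolerance of a theoretical peptide mass. The function names and the 0.01 Da tolerance are assumptions.

```python
# Monoisotopic residue masses in Da; one water is added per peptide.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def tryptic_digest(seq):
    """Cleave after K or R, except before P; no missed cleavages."""
    seq = seq.upper()
    peptides, start = [], 0
    for i, aa in enumerate(seq):
        is_last = i == len(seq) - 1
        if aa in "KR" and not is_last and seq[i + 1] != "P":
            peptides.append(seq[start:i + 1])
            start = i + 1
    if start < len(seq):
        peptides.append(seq[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def count_matches(measured, seq, tolerance=0.01):
    """Count measured masses that match any theoretical tryptic peptide of seq."""
    theoretical = [peptide_mass(p) for p in tryptic_digest(seq)]
    return sum(
        any(abs(m - t) <= tolerance for t in theoretical) for m in measured
    )
```

Running `count_matches` against every sequence in a database and ranking by the count gives a crude fingerprint search. Real search engines score matches probabilistically rather than by a simple count.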
MASCOT Results:
pepMAPPER
pepMAPPER is the UMIST protein search tool, which uses mass spectrometry data produced by the digestion of a protein to identify a matching protein in a database.