Post on 18-Feb-2016
Protein Analysis Tools
2nd April, 2012
Ansuman Chattopadhyay, PhD
Head, Molecular Biology Information Service
Health Sciences Library System
University of Pittsburgh
ansuman@pitt.edu
http://www.hsls.pitt.edu/guides/genetics
What we’ll do:
Brief overview of CLC Main Workbench
find genomic context of a protein sequence
search for the presence of conserved domains
create a multiple sequence alignment plot
analyze primary structure, such as hydrophobicity, hydrophilicity, antigenicity, repeat sequence detection, etc.
predict secondary structure
predict post-translational modifications, such as phosphorylation, glycosylation, ...
search for interacting partners
predict domain driven protein-protein interactions
Workshop Resources: http://www.hsls.pitt.edu/molbio/tutorials
Why CLC Main?
Windows, Mac, Linux
DNA, RNA, Protein, Microarray Data Analysis
Regular Update
HSLS Licensed
CLC Main Access
HSLS CLC Main Registration Link: http://www.hsls.pitt.edu/molbio/clcmain
Access via Pitt - Network Connect Instruction video: http://goo.gl/JNjMt
CLC Main –getting started (basic navigation steps): http://media.hsls.pitt.edu/media/molbiovideos/clc-navigation-ac0312.swf
CLC Main Workbench Walkthrough (Part1): http://media.hsls.pitt.edu/media/molbiovideos/clcmain-walkthrough-part1-ac0112.swf
CLC Main Workbench Walkthrough (Part2): http://media.hsls.pitt.edu/media/molbiovideos/clcmain-walkthrough-part2-ac0112.swf
Videos
Protein Sequence
Human PLCg1
RefSeq no: NP_002651
UniProt Accession Number: P19174
FASTA file
Raw sequence
CLC features:
Search, Import, Create new sequence
Import a DNA/Protein sequence into CLC Main (Part 1): http://media.hsls.pitt.edu/media/molbiovideos/clc-import-part1-ac0112.swf
Import a DNA/Protein sequence into CLC Main (Part 2): http://media.hsls.pitt.edu/media/molbiovideos/clc-import-part2-ac0112.swf
Videos
PLCg1 orthologous sequences:
Human: NP_002651
Mouse: NP_067255
Rat: NP_037319
Cow: NP_776850
Dog: XP_542998
Zebrafish: NP_919388
Accession list: NP_067255,NP_037319,NP_776850,XP_542998,NP_919388,NP_002651
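As an alternative to pasting the accession list into a search box, the same list can be turned into a batch download request. The sketch below only builds the request URL; it assumes the NCBI E-utilities `efetch` endpoint with `db=protein` and FASTA output, and the function name `build_efetch_url` is made up for illustration.

```python
from urllib.parse import urlencode

# NCBI E-utilities efetch endpoint (assumed here as the download route).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"

# RefSeq protein accessions for PLCg1 and its orthologs, as listed on the slide.
PLCG1_ORTHOLOGS = {
    "Human": "NP_002651",
    "Mouse": "NP_067255",
    "Rat": "NP_037319",
    "Cow": "NP_776850",
    "Dog": "XP_542998",
    "Zebrafish": "NP_919388",
}


def build_efetch_url(accessions):
    """Return an efetch URL requesting all accessions as one FASTA file."""
    query = urlencode(
        {
            "db": "protein",
            "id": ",".join(accessions),
            "rettype": "fasta",
            "retmode": "text",
        }
    )
    return f"{EFETCH_URL}?{query}"


url = build_efetch_url(PLCG1_ORTHOLOGS.values())
print(url)
```

Opening the printed URL in a browser should return a multi-FASTA file that can then be imported into CLC Main for alignment.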
Create a multiple sequence alignment plot using CLC (Part 1): http://media.hsls.pitt.edu/media/molbiovideos/msf-clcmain-ac0212 part1.swf
Create a multiple sequence alignment plot using CLC (Part 2): http://media.hsls.pitt.edu/media/molbiovideos/msf-clcmain-ac0212-part2.swf
Create a multiple sequence alignment plot: http://media.hsls.pitt.edu/media/clres2705/msa.swf
Compare two peptide sequences: http://media.hsls.pitt.edu/media/clres2705/blast2.swf
Videos
Starting with a short peptide sequence, find:
the whole protein sequence
orthologs in other species (nematode)
Tools: UCSC BLAT; NCBI BLAST against SwissProt
Place a mRNA or peptide sequence into the human genome (BLAT):
http://www.hsls.pitt.edu/molbio/videos/play?v=12e
Find homologous sequences: http://media.hsls.pitt.edu/media/clres2705/blast.swf
Videos
Sequence Manipulation & Format Conversion
Sequence Manipulation Suite: http://bioinformatics.org/sms2/
Readseq: http://thr.cit.nih.gov/molbio/readseq/
Formats: GenPept, FASTA
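For simple cases, the FASTA-to-raw-sequence conversion that these tools perform can be sketched in a few lines. This is a minimal illustration, not how Readseq or the Sequence Manipulation Suite are implemented; the function names are hypothetical.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples.

    Whitespace inside sequence lines is dropped, so sequences that were
    broken by copy and paste are joined back together.
    """
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append("".join(line.split()))
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def to_fasta(header, sequence, width=60):
    """Format one record as FASTA, wrapping the sequence at `width` residues."""
    lines = [">" + header]
    lines += [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return "\n".join(lines) + "\n"


example = ">demo peptide\nMKT LLV\nGG\n"
for name, raw in parse_fasta(example):
    print(name, raw)
```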
Hands-On
Retrieve the amino acid sequence between positions 25 and 45 in Sequence A (MS Word Doc)
Identify the rat gene that encodes this peptide fragment and retrieve its whole protein sequence
Find the fruit fly homolog of this protein
What % identity does the fruit fly protein share with its rat homolog?
Predict potential MAPK phosphorylation sites present in the fruit fly protein
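Two of the steps in this exercise are simple enough to check by hand or in a few lines of code: cutting out residues 25 to 45, and computing % identity from a pairwise alignment. The sketch below assumes 1-based, inclusive positions and an alignment that is already gapped with `-`. Note that it counts identity over aligned (non-gap) columns only, which can differ slightly from the figure BLAST reports over the full alignment length.

```python
def subsequence(seq, start, end):
    """Return residues start..end using 1-based, inclusive positions."""
    if not 1 <= start <= end <= len(seq):
        raise ValueError("positions must satisfy 1 <= start <= end <= len(seq)")
    return seq[start - 1:end]


def percent_identity(aligned_a, aligned_b):
    """Percent identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip gap columns
        compared += 1
        if a == b:
            identical += 1
    return 100 * identical / compared if compared else 0.0


# Toy data only; Sequence A from the workshop document is not reproduced here.
print(subsequence("ABCDEFGHIJ", 3, 5))
print(percent_identity("MKT-LV", "MRTALV"))
```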
Protein Domain Search: InterPro Scan
InterPro is a database of protein families, domains, regions, repeats and sites in which identifiable features found in known proteins can be applied to new protein sequences.
>gi|72198189|ref|NP_000624.2| B-cell lymphoma protein 2 alpha isoform
MAHAGRTGYDNREIVMKYIHYKLSQRGYEWDAGDVGAAPPGAAPAPGIFSSQPGHTPHPAASRDPVARTS
PLQTPAAPGAAAGPALSPVPPVVHLTLRQAGDDFSRRYRRDFAEMSSQLHLTPFTARGRFATVVEELFRD
GVNWGRIVAFFEFGGVMCVESVNREMSPLVDNIALWMTEYLNRHLHTWIQDNGGWDAFVELYGPSMRPLF
DFSWLSLKTLLSLALVGACITLGAYLGHK
Videos:
Find protein domains, PTMs, secondary structure, etc.: http://media.hsls.pitt.edu/media/clres2705/uniprot.swf
Start with a protein pattern and find which proteins possess that domain: http://media.hsls.pitt.edu/media/clres2705/scanprosite.swf
Search for protein domains,repeats and sites: http://media.hsls.pitt.edu/media/clres2705/interpro.swf
Protein Domain Search: ScanProsite
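Pattern Search: [AC]-x-V-x(4)-{ED}
This pattern is translated as: [Ala or Cys]-any-Val-any-any-any-any-{any but Glu or Asp}
A second example: F-[GSTV]-P-R-L-[G>]
This pattern is translated as: Phe-[Gly, Ser, Thr or Val]-Pro-Arg-Leu-[Gly or the C-terminal end of the sequence]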
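To make the pattern syntax concrete, here is a minimal sketch that translates a PROSITE-style pattern into a Python regular expression and reports overlapping matches. It covers only the elements shown on this slide plus `x(n,m)` ranges and the `<` and `>` anchors, and it is not the ScanProsite implementation; the function names are hypothetical.

```python
import re

# One pattern element: optional '<', a core (x, a residue, [allowed] or
# {forbidden}), an optional repeat such as (4) or (2,4), and optional '>'.
_ELEMENT = re.compile(
    r"^(<)?(x|[A-Z]|\[[A-Z>]+\]|\{[A-Z]+\})(?:\((\d+)(?:,(\d+))?\))?(>)?$"
)


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regex string."""
    parts = []
    for element in pattern.strip().rstrip(".").split("-"):
        m = _ELEMENT.match(element)
        if m is None:
            raise ValueError(f"unsupported pattern element: {element!r}")
        n_anchor, core, low, high, c_anchor = m.groups()
        if core == "x":
            token = "."                      # any residue
        elif core.startswith("{"):
            token = "[^" + core[1:-1] + "]"  # any residue except these
        elif core.startswith("[") and ">" in core:
            # '>' inside brackets: one of the residues OR the sequence end
            token = "(?:[" + core[1:-1].replace(">", "") + "]|$)"
        else:
            token = core                     # literal residue or [allowed set]
        if low:
            token += "{" + low + ("," + high if high else "") + "}"
        parts.append(("^" if n_anchor else "") + token + ("$" if c_anchor else ""))
    return "".join(parts)


def find_matches(pattern, seq):
    """Return (1-based start, matched text) for all matches, overlaps included."""
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.group(1)) for m in regex.finditer(seq)]


print(prosite_to_regex("[AC]-x-V-x(4)-{ED}"))
print(find_matches("[AC]-x-V-x(4)-{ED}", "AAVAAAAK"))
```

The same `find_matches` call can be run against the BCL2 sequence used in this workshop to compare with the ScanProsite output.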
Protein Primary Structure Analysis
Tool: ExPASy from SIB
Calculated molecular weight
Theoretical pI
Extinction coefficients
Estimated half-life
Hydropathicity plot: Kyte & Doolittle
Hydrophilicity plot: Hopp T.P., Woods K.R.
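The hydropathicity plot and the extinction coefficient are both simple sums over the sequence, which the sketch below illustrates. It uses the published Kyte & Doolittle scale and the commonly used 280 nm formula (5500 per Trp, 1490 per Tyr, 125 per cystine). The window size and the number of cystines are inputs you choose, and the function names are hypothetical rather than part of any ExPASy tool.

```python
# Kyte & Doolittle hydropathy values per residue.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq, window=9):
    """Mean hydropathy in each sliding window (one value per window position)."""
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


def gravy(seq):
    """Grand average of hydropathy over the whole sequence."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq.upper()) / len(seq)


def extinction_coefficient(seq, cystines=0):
    """Molar extinction coefficient at 280 nm (M^-1 cm^-1).

    `cystines` is the number of disulfide-bonded Cys pairs, which the
    sequence alone does not tell you; it is an input assumption.
    """
    seq = seq.upper()
    return 5500 * seq.count("W") + 1490 * seq.count("Y") + 125 * cystines


print(hydropathy_profile("IIIIIKKKKK", window=5))
print(extinction_coefficient("WYCC", cystines=1))
```

Plotting the profile against window position gives the familiar hydropathy curve; sustained positive stretches are candidates for membrane-spanning segments.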
Antigenic Site Prediction
Tool: EMBOSS antigenic
EMBOSS antigenic
The antigenic program predicts potentially antigenic regions of a protein sequence, using the method of Kolaskar and Tongaonkar.
Analysis of data from experimentally determined antigenic sites on proteins has revealed that the hydrophobic residues Cys, Leu and Val, if they occur on the surface of a protein, are more likely to be part of antigenic sites. A semi-empirical method which makes use of physicochemical properties of amino acid residues and their frequencies of occurrence in experimentally known segmental epitopes was developed by Kolaskar and Tongaonkar to predict antigenic determinants on proteins. Application of this method to a large number of proteins has shown that it can predict antigenic determinants with about 75% accuracy, which is better than most of the known methods. The method is based on a single parameter and is thus very simple to use.
Transmembrane Site Prediction
Tool: TMHMM Server
Protein Secondary Structure
Protein-Protein Interactions Prediction
Tool: STRING
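The interaction partner lookup can also be scripted instead of run through the web form. The sketch below only builds a request URL; it assumes STRING's public REST API exposes an `interaction_partners` method under `https://string-db.org/api`, taking `identifiers`, `species` (NCBI taxonomy ID, 9606 for human) and `limit`. Check the current STRING API documentation before relying on it. The function name is hypothetical.

```python
from urllib.parse import urlencode

# Base URL of the STRING REST API (assumption; verify against STRING's docs).
STRING_API = "https://string-db.org/api"


def build_partners_url(identifier, species=9606, limit=10, output_format="tsv"):
    """Build a URL asking STRING for the top interaction partners of a protein."""
    query = urlencode(
        {"identifiers": identifier, "species": species, "limit": limit}
    )
    return f"{STRING_API}/{output_format}/interaction_partners?{query}"


partners_url = build_partners_url("BCL2")
print(partners_url)
```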
Hands-on
Take the human BCL2 protein sequence and:
Find its domain architecture
Predict the topology of its transmembrane region
Design a suitable antigenic site for antibody generation
What is its calculated molecular weight and extinction coefficient?
Predict its secondary structure
What % of this protein possesses alpha helical structure?
Predict its potential interacting partners
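The alpha-helix percentage can be read off a secondary structure prediction by counting states. The sketch below assumes the prediction tool returns a per-residue string in the common three-state notation (H = helix, E = strand, C = coil); the exact letters vary between tools, so adjust accordingly. The example string is made up, not a real BCL2 prediction.

```python
from collections import Counter


def state_percentages(ss_string):
    """Return the percentage of residues in each secondary structure state."""
    if not ss_string:
        raise ValueError("empty secondary structure string")
    counts = Counter(ss_string.upper())
    total = len(ss_string)
    return {state: 100 * n / total for state, n in counts.items()}


# Hypothetical prediction string for an 8-residue peptide.
percentages = state_percentages("HHHHCCEE")
print(percentages["H"])
```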
Hands-on
Prediction of potential phosphorylation sites present in a protein sequence.
Sequence: human BCL2
Phosphorylation Site Prediction:
Tool: NetPhos
Phosphorylation Site Prediction:
Tool: GPS
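Before running a predictor, it can help to list the residues it will be scoring. The sketch below enumerates every Ser, Thr and Tyr in BCL2 and flags Ser/Thr residues followed by Pro, the minimal motif for proline-directed kinases such as MAPKs. This is plain motif matching for orientation only; it is not how NetPhos or GPS score sites, and the function names are hypothetical.

```python
# Human BCL2 alpha isoform (NP_000624.2), as used throughout this workshop.
BCL2 = (
    "MAHAGRTGYDNREIVMKYIHYKLSQRGYEWDAGDVGAAPPGAAPAPGIFSSQPGHTPHPAASRDPVARTS"
    "PLQTPAAPGAAAGPALSPVPPVVHLTLRQAGDDFSRRYRRDFAEMSSQLHLTPFTARGRFATVVEELFRD"
    "GVNWGRIVAFFEFGGVMCVESVNREMSPLVDNIALWMTEYLNRHLHTWIQDNGGWDAFVELYGPSMRPLF"
    "DFSWLSLKTLLSLALVGACITLGAYLGHK"
)


def candidate_sites(seq):
    """All Ser/Thr/Tyr residues as (1-based position, residue)."""
    return [(i + 1, aa) for i, aa in enumerate(seq) if aa in "STY"]


def proline_directed_sites(seq):
    """Ser/Thr residues immediately followed by Pro ([ST]-P motif)."""
    return [
        (i + 1, seq[i])
        for i in range(len(seq) - 1)
        if seq[i] in "ST" and seq[i + 1] == "P"
    ]


print(len(candidate_sites(BCL2)), "Ser/Thr/Tyr residues")
for position, residue in proline_directed_sites(BCL2):
    print(f"{residue}{position}")
```

Comparing this list with the NetPhos and GPS output shows which candidate residues the predictors actually rank highly.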
Thank you! Any questions?
Carrie Iwema: iwema@pitt.edu, 412-383-6887
Ansuman Chattopadhyay: ansuman@pitt.edu, 412-648-1297
http://www.hsls.pitt.edu/guides/genetics