Proteomics: jen, mona & krishna
Introduction
• What is the proteome? The proteome is the entire complement of proteins, including the modifications made to a particular set of proteins, produced by an organism or system at a particular time and under particular conditions. It varies with time and with the distinct requirements, or stresses, that a cell or organism undergoes.
• What is proteomics? Proteomics is the large-scale study of proteins, particularly their functions and structures.
A short list of protein modifications that might be studied under proteomics includes:
1. Phosphorylation
2. Ubiquitination
3. Methylation
4. Acetylation
5. Glycosylation
6. Oxidation
7. Nitrosylation
Why proteomics?
• It gives a better understanding of an organism than genomics does.
• Limitations of genomics that make proteomics a better approach:
1. The level of transcription of a gene gives only a rough estimate of its level of expression into a protein.
2. Many transcripts give rise to more than one protein, through alternative splicing or alternative post-translational modifications.
3. Many proteins form complexes with other proteins or RNA molecules, and only function in the presence of these other molecules.
4. Proteins experience post-translational modifications that profoundly affect their activities.
5. Protein degradation rate plays an important role in protein content.
Any cell may make different sets of proteins at different times or under different conditions. Furthermore, any one protein can undergo a wide range of post-translational modifications, so a proteomics study can be complex.
Therefore, proteomics is the more informative approach, but also the more complex one.
Branches of proteomics
• Proteomics analysis: determining which proteins are post-translationally modified
• Expression proteomics: profiling of expressed proteins using quantitative methods
• Cell mapping proteomics: identification of protein complexes
Methods
1. Gel-based proteomics (2DE):
◦ Older approach
◦ Separates proteins according to charge (isoelectric point) in the first dimension and according to size in the second dimension.
◦ Commonly separated using polyacrylamide gel electrophoresis (PAGE).
◦ Identifies individual proteins in complex samples or multiple proteins in a single sample.
2. Mass spectrometry based proteomics:
◦ Highly accurate, even for particles of extremely low mass.
◦ Proteins are cleaved into peptides with a protease, and the peptide masses are detected with a mass spectrometer (e.g. TOF).
◦ The mass spectrum of the peptides is obtained and converted to a list of peptide masses, which is searched against sequence databases derived from the genome.
◦ Since each protein has an essentially unique peptide mass fingerprint, the peptide masses can identify the protein in the database.
3. Protein arrays:
◦ The idea is similar to cDNA arrays.
◦ Substrate is bound on the surface of the array
◦ Sample is introduced, binding takes place
◦ Detection and analysis.
◦ Analysis of protein-protein, protein-DNA or protein-RNA interactions can be done.
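Returning to the mass spectrometry approach (method 2), the peptide mass fingerprint lookup can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the protease is taken to be trypsin (cleaving after K or R, but not before P), missed cleavages and modifications are ignored, the two database entries are toy sequences with made-up names, and the scoring is a plain count of matching masses rather than the statistics a real search engine would use.

```python
import re

# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini

# Toy "database": names and sequences are illustrative only.
TOY_DATABASE = {
    "protein_a": "MKWVTFISLLLLFSSAYSRGVFRRDTHKSEIAHRFK",
    "protein_b": "GLSDGEWQLVLNVWGKVEADIPGHGQEVLIRLFK",
}


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (no missed cleavages)."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def fingerprint(sequence, min_length=4):
    """Theoretical peptide masses, skipping very short peptides."""
    return [peptide_mass(p) for p in tryptic_digest(sequence) if len(p) >= min_length]


def score(observed, theoretical, tol=0.02):
    """Number of observed masses that match a theoretical mass within tol Da."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def identify(observed, database, tol=0.02):
    """Return the database entry whose fingerprint matches the most observed masses."""
    return max(database, key=lambda name: score(observed, fingerprint(database[name]), tol))


# Simulate a measurement of protein_a with a small mass error, then look it up.
observed_masses = [m + 0.005 for m in fingerprint(TOY_DATABASE["protein_a"])]
print(identify(observed_masses, TOY_DATABASE))
```

The tolerance `tol` and the `min_length` filter are arbitrary choices for this sketch; in practice they would depend on the instrument and the search tool.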
Applications
• Identification of potential new drugs for the treatment of diseases. This relies on genome and proteome information to identify proteins associated with a disease, which computer software can then use as targets for new drugs.
Biomarkers
• A number of techniques make it possible to test for proteins produced during a particular disease, which helps to diagnose the disease quickly.
Examples of biomarkers
• Alzheimer's disease: In Alzheimer's disease, elevations in beta secretase create amyloid/beta-protein; targeting this enzyme decreases the amyloid/beta-protein and slows the progression of the disease.
• Heart disease: Standard protein biomarkers for cardiovascular disease (CVD) include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins.
Introduction – Current State
• Many different informational protein databases are available online
• Most databases are focused on protein identification
◦ The research community provides the data that drives the database contents
◦ Validation of mass spec data
• Single vs. multiple species support
Overview of Databases
• NCBI – Protein / Peptidome
• Human Gene and Protein Database (HGPD)
• Human Proteinpedia / Human Protein Reference Database (HPRD)
• Dynamic Proteomics
• Open Proteomics Database
• Global Proteome Machine Database
• Peptide Atlas
• Proteomics Identifications Database (PRIDE)
• UniProt Knowledgebase
NCBI – Protein / Peptidome
• Two databases contained in the Entrez suite
• Multi-species result sets
• Protein
◦ Provides gene information pertaining to the expressed protein queried
• Peptidome
◦ Mass spec based protein identification database
◦ Experiment-based result sets
Human Gene and Protein Database (HGPD)
• Several cDNA contributors, spanning the globe
• Gateway Expression System
◦ Allows for a reproducible clone library. Clones are available for purchase.
• Wheat germ cell-free protein synthesis
◦ Protein expression portion of the database. Allows for visualization of the SDS-PAGE results.
Human Proteinpedia / Human Protein Reference Database (HPRD)
• Modeled after Wikipedia
◦ Users submit and edit the data in the database
◦ Differences: the original submitter is expected to provide experimental evidence for the data, and only the original submitter can edit that specific data later.
• Allows several protein features to be annotated
◦ Post-translational modification
◦ Tissue expression
◦ Cell line expression
◦ Subcellular localization
◦ Enzyme substrates
◦ Protein-protein interactions
Human Proteinpedia / Human Protein Reference Database (HPRD)
• No visual protein expression data
• Protein amino acid sequence given
• Raw and processed mass spec files are available as experimental evidence
• Provides links to the protein in other databases
Dynamic Proteomics
• A different type of database, focusing on the dynamics of proteins in cells treated with an anti-cancer drug
• Shows different uses for data repositories for proteomics
◦ Not just an all-encompassing data source with generic data.
◦ Using simple databases and web front ends to make more specific types of data available to the community.
• Also provides links to other databases
• Can compare multiple sequences at once to search the cDNA library.
Dynamic Proteomics
• Time-lapse microscopy movies that illustrate the protein dynamics in individual living human cancer cells in response to an anti-cancer drug
Time Lapse Video
Open Proteomics Database
• University of Texas
• Multi-species results
• Smaller pool of data submitted for query
Global Proteome Machine Database
• Private industry involvement
• Mass spec validation
• Protein identification
• Utilizes data from other databases
◦ Differs from the scheme of just linking to other protein databases
Peptide Atlas
• Seattle Proteome Center
• Focused on a subset of human proteins
◦ Heart, lung, blood
• Funded by NIH
• Part of the Trans-Proteomic Pipeline software suite
Proteomics Identifications Database (PRIDE)
• One of the earlier proteomic databases
• European Bioinformatics Institute
• Larger selection of species-specific data
• Java based, available for local deployment
UniProt Knowledgebase
• Swiss Institute of Bioinformatics
• Also curated by the European Bioinformatics Institute
• Funded by NIH
◦ Forced the conversion of earlier non-public versions to become free and open
ExPASy Proteomics Server
• Swiss Institute of Bioinformatics tool suite
• Protein ID by amino acid sequence
• Isoelectric point computation
• Prediction of post-translational modifications and amino acid substitutions
• Predicts protein cleavage sites
• Protein identification by molecular weight
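To make the isoelectric point computation concrete, here is a minimal Python sketch. It is not the ExPASy implementation: it assumes a simple Henderson-Hasselbalch charge model and one illustrative set of pKa values (published tables differ, and real tools use their own calibrated sets), and it finds the pH of zero net charge by bisection.

```python
# Illustrative pKa values (an assumption; published tables vary).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}            # protonated form carries +1
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}   # deprotonated form carries -1
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence, ph):
    """Approximate net charge of a peptide at a given pH."""
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for aa in sequence:
        if aa in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[aa]))
        elif aa in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[aa] - ph))
    return positive - negative


def isoelectric_point(sequence, tol=1e-4):
    """Bisection on pH 0..14; net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged, so the pI lies at higher pH
        else:
            high = mid
    return (low + high) / 2


# A lysine-rich peptide should come out basic, an aspartate-rich one acidic.
print(round(isoelectric_point("AKKKKA"), 2))
print(round(isoelectric_point("ADDDDA"), 2))
```

The same value is what the first dimension of a 2DE gel separates on, which is why a computed pI together with the molecular weight gives a rough expected position for a protein on the gel.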
Future Considerations
• Selection of a few 'primary' data repositories
• Consolidation of multiple redundant efforts being funded by the same agency
◦ Particularly NIH
• Data standards to streamline the submission of results into multiple data sources.
◦ Reduction of the need to perform many searches to find information about a protein
◦ mzXML is a start, but only covers mass spec data
Database References
• NCBI
◦ Protein: http://www.ncbi.nlm.nih.gov/protein/
◦ Peptidome: http://www.ncbi.nlm.nih.gov/pepdome
• Human Gene and Protein Database (HGPD)
◦ http://riodb.ibase.aist.go.jp/hgpd/cgi-bin/index.cgi
• Human Proteinpedia
◦ http://www.humanproteinpedia.org/index_html
• Human Protein Reference Database (HPRD)
◦ http://www.hprd.org/
• Dynamic Proteomics
◦ http://alon-serv.weizmann.ac.il/dynamprotb/seqsrch
• Open Proteomics Database
◦ http://bioinformatics.icmb.utexas.edu/OPD/
• Global Proteome Machine Database
◦ http://thegpm.org
• Peptide Atlas
◦ http://www.peptideatlas.org/
• Proteomics Identifications Database (PRIDE)
◦ http://www.ebi.ac.uk/pride/
• UniProt Knowledgebase
◦ http://www.uniprot.org/
Tool References
• ExPASy Proteomics Server
◦ http://www.expasy.ch/
• Trans-Proteomic Pipeline
◦ http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP
Discovery of protein biomarkers
A biomarker can be defined as any laboratory measurement or physical sign used as a substitute for a clinically meaningful end point that measures directly how a patient feels, functions or survives. As applied to proteomics, a biomarker is an identified protein (or set of proteins) that is unique to a particular disease state.
Biomarkers of drug efficacy and toxicity are becoming a key need in the drug development process.
Mass spectral-based proteomic technologies are ideally suited for the discovery of protein biomarkers in the absence of any prior knowledge of quantitative changes in protein levels.
The success of any biomarker discovery effort will depend upon the quality of samples analysed, the ability to generate quantitative information on relative protein levels and the ability to readily interpret the data generated.
Study of Tumor Metastasis and Cancers
The identification of protein molecules whose expression correlates with the metastatic process helps to explain the mechanisms of metastasis and thus facilitates the development of strategies for therapeutic intervention and the clinical management of cancer.
Information contained within proteomic patterns has been demonstrated to detect ovarian, breast and prostate cancers with sensitivities and specificities greater than 90%.
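For readers unfamiliar with the two figures quoted above, the short Python sketch below shows how sensitivity and specificity are computed from a classifier's calls. The sample labels are hypothetical and are not taken from any of the studies referred to.

```python
def confusion_counts(truth, predicted):
    """Count true/false positives and negatives from paired boolean labels."""
    tp = sum(1 for t, p in zip(truth, predicted) if t and p)
    fn = sum(1 for t, p in zip(truth, predicted) if t and not p)
    tn = sum(1 for t, p in zip(truth, predicted) if not t and not p)
    fp = sum(1 for t, p in zip(truth, predicted) if not t and p)
    return tp, fn, tn, fp


def sensitivity(tp, fn):
    """Fraction of diseased samples that the test correctly flags."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of healthy samples that the test correctly clears."""
    return tn / (tn + fp)


# Hypothetical example: 10 cancer samples and 10 controls, one error in each group.
truth = [True] * 10 + [False] * 10
predicted = [True] * 9 + [False] + [False] * 9 + [True]

tp, fn, tn, fp = confusion_counts(truth, predicted)
print(sensitivity(tp, fn), specificity(tn, fp))
```

A proteomic pattern with both values above 90% therefore misses fewer than one in ten true cases and raises a false alarm for fewer than one in ten healthy samples.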
Field of Neurotrauma
Neurotrauma results in complex alterations to the biological systems within the nervous system, and these changes evolve over time.
Near-completion of the Human Genome Project has stimulated scientists to begin looking for the next step in unraveling normal and abnormal functions within biological systems. Consequently, there is new focus on the role of proteins in these processes.
Proteomics is a burgeoning field that may provide a valuable approach to evaluating the post-traumatic central nervous system (CNS). However, the sensitivity of the tissue and the detection of potential biomarkers are major concerns.
Renal disease diagnosis
Proteomics has also found significant application in studying the effects of chemical insults on the kidney, particularly as a result of environmental toxins, drugs and other bioactive agents.
Combining classic analytical techniques, such as two-dimensional gel electrophoresis, with more sophisticated techniques, such as MS and liquid chromatography, has enabled considerable progress to be made in cataloguing and quantifying the proteins present in urine and various kidney tissue compartments in both normal and diseased physiological states.
Two critical developmental tasks still need to be accomplished. The first is to define completely the proteome in the various biological compartments (e.g. tissues, serum and urine) in both health and disease, which presents a major challenge given the dynamic range and complexity of such proteomes. The second is to achieve the routine ability to quantify proteomic expression profiles accurately and reproducibly, and to develop diagnostic platforms.
Neurology
In neurology and neuroscience, many applications of proteomics have involved neurotoxicology and neurometabolism, as well as the determination of specific proteomic aspects of individual brain areas and body fluids in neurodegeneration.
Investigation of brain protein groups in neurodegeneration, such as enzymes, cytoskeleton proteins, chaperones, synaptosomal proteins and antioxidant proteins, is in progress as phenotype related proteomics.
The concomitant detection of several hundred proteins on a gel provides sufficiently comprehensive data to determine a pathophysiological protein network and its peripheral representatives. An additional advantage is that hitherto unknown proteins have been identified as brain proteins.
Autoantibody profiling
Proteomics technologies enable profiling of autoantibody responses using biological fluids derived from patients with autoimmune disease.
They provide a powerful tool to characterize autoreactive B-cell responses in diseases including rheumatoid arthritis, multiple sclerosis, autoimmune diabetes, and systemic lupus erythematosus.
Autoantibody profiling may serve purposes including classification of individual patients and subsets of patients based on their 'autoantibody fingerprint', examination of epitope spreading and antibody isotype usage, discovery and characterization of candidate autoantigens, and tailoring antigen-specific therapy.
Alzheimer's disease
In Alzheimer's disease, elevations in beta secretase create amyloid/beta-protein, which causes plaque to build up in the patient's brain. This plaque is thought to play a role in dementia.
Targeting this enzyme decreases the amyloid/beta-protein and so slows the progression of the disease.
One procedure to test for the increase in amyloid/beta-protein is immunohistochemical staining, in which antibodies bind to specific antigens, here amyloid/beta-protein, in biological tissue.
Heart disease
Heart disease is commonly assessed using several key protein-based biomarkers. Standard protein biomarkers for CVD include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins.
Cardiac troponin I (cTnI) increases in concentration within 3 to 12 hours of the initial cardiac injury and can remain elevated for days after an acute myocardial infarction (MI).
A number of commercial antibody-based assays, as well as other methods, are used in hospitals as primary tests for acute MI.
Future Challenges
There is a need for biomarkers with more accurate diagnostic capability, particularly for early-stage disease.
Other needs include adding a quality control sample on each chip array and normalizing spectral data through commercially available or in-house generated computer programs.
Another challenge that proteomics techniques face lies largely in the application of bioinformatics, i.e. spectral data management and analysis. The vast amount of spectral data generated demands the implementation of advanced data management and analysis strategies.
Finally, the obvious challenge, as stated by many investigators, is the identification of the important proteins and peptides that contribute to the proteomic analysis.
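As an illustration of the spectral normalization mentioned among the challenges above, the following Python sketch scales each spectrum by its total ion current so that spectra from different chips or runs become comparable. The slides do not say which normalization the commercial or in-house programs apply; total ion current scaling is assumed here only because it is one common and simple choice, and the intensity values are made up.

```python
def tic_normalize(intensities):
    """Divide each peak intensity by the spectrum's total ion current (sum of intensities)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no signal to normalize")
    return [value / total for value in intensities]


# Two hypothetical spectra of the same sample, one acquired at twice the overall signal.
spectrum_run1 = [120.0, 30.0, 50.0]
spectrum_run2 = [240.0, 60.0, 100.0]

# After normalization the two runs give the same relative peak heights.
print(tic_normalize(spectrum_run1))
print(tic_normalize(spectrum_run2))
```

Running the same function on the quality control sample from each chip gives a quick check of whether the normalized profiles agree across chips.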