
Proteomics

Jen, Mona & Krishna

Introduction

• What is the proteome? The proteome is the entire complement of proteins, including the modifications made to a particular set of proteins, produced by an organism or system at a particular time and under particular conditions.
◦ It varies with time and with the distinct requirements, or stresses, that a cell or organism undergoes.

• What is proteomics? Proteomics is the large-scale study of proteins, particularly their functions and structures.

A short list of protein modifications that might be studied under proteomics includes:

1. phosphorylation
2. ubiquitination
3. methylation
4. acetylation
5. glycosylation
6. oxidation
7. nitrosylation, etc.

Why proteomics?

• Gives a better understanding of an organism than genomics.
• Limitations of genomics that made proteomics a better approach:
1. The level of transcription of a gene gives only a rough estimate of its level of expression into a protein.
2. Many transcripts give rise to more than one protein, through alternative splicing or alternative post-translational modifications.
3. Many proteins form complexes with other proteins or RNA molecules, and only function in the presence of these other molecules.
4. Proteins experience post-translational modifications that profoundly affect their activities.
5. Protein degradation rate plays an important role in protein content.

Any cell may make different sets of proteins at different times, or under different conditions. Furthermore, any one protein can undergo a wide range of post-translational modifications. As a result, proteomic studies can be complex.

Proteomics is therefore a more informative approach than genomics, but also a more complex one.

Branches of proteomics

• Proteomics analysis: determining which proteins are post-translationally modified
• Expression proteomics: profiling of expressed proteins using quantitative methods
• Cell mapping proteomics: identification of protein complexes

Methods

1. Gel-based proteomics (2DE):
◦ Older approach.
◦ Separates proteins according to charge (isoelectric point) in the first dimension and according to size in the second dimension.
◦ Proteins are commonly separated using polyacrylamide gel electrophoresis (PAGE).
◦ Identifies individual proteins in complex samples or multiple proteins in a single sample.

2. Mass spectrometry-based proteomics:
◦ Highly accurate, even for particles of extremely low mass.
◦ Proteins are cleaved into peptides with a protease, and the peptide masses are measured with a mass spectrometer (e.g. TOF).
◦ The mass spectrum of the peptides is obtained and converted to a list of peptide masses, which is searched against protein sequence databases (often derived from genome sequences).
◦ Because each protein yields a characteristic peptide mass fingerprint, the set of peptide masses can identify the protein in the database.

3. Protein arrays:
◦ The idea is similar to cDNA arrays.
◦ A substrate is bound on the surface of the array.
◦ The sample is introduced and binding takes place.
◦ Detection and analysis follow.
◦ Protein-protein, protein-DNA or protein-RNA interactions can be analysed.
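The peptide mass fingerprinting idea from the mass spectrometry method above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: trypsin is assumed as the protease (the slides do not name one), no missed cleavages or modifications are considered, the two "database" sequences are made up, and the 0.2 Da tolerance is arbitrary. Real search engines score matches far more carefully.

```python
# Monoisotopic residue masses in daltons (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def digest_trypsin(sequence):
    """Assumed protease rule: cleave after K or R unless the next residue is P."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[residue] for residue in peptide) + WATER


def fingerprint(sequence):
    """Theoretical peptide mass list for one protein sequence."""
    return sorted(peptide_mass(p) for p in digest_trypsin(sequence))


def count_matches(observed, theoretical, tol=0.2):
    """Number of observed masses within tol daltons of any theoretical mass."""
    return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)


def identify(observed, database, tol=0.2):
    """Return the best-matching entry name and the match count per entry."""
    scores = {name: count_matches(observed, fingerprint(seq), tol)
              for name, seq in database.items()}
    return max(scores, key=scores.get), scores


# Hypothetical two-entry database and a simulated measurement of protA.
TOY_DATABASE = {
    "protA": "MAGTLKSEEDLRPWFGHKAAVLLER",
    "protB": "MSTNPKPQRGGDEVLKYYCWHR",
}
observed_masses = [m + 0.05 for m in fingerprint(TOY_DATABASE["protA"])]
best_hit, match_counts = identify(observed_masses, TOY_DATABASE)
print(best_hit, match_counts)
```

Counting matches within a tolerance is the simplest possible score. It is used here only to show why a list of peptide masses is enough to point at one database entry.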

Applications

• Identification of potential new drugs for the treatment of diseases. This relies on genome and proteome information to identify proteins associated with a disease, which computer software can then use as targets for new drugs.

• Biomarkers: a number of techniques make it possible to test for proteins produced during a particular disease, which helps to diagnose the disease quickly.

Examples of biomarkers

• Alzheimer's disease: elevations in beta-secretase create amyloid/beta-protein. Targeting this enzyme decreases the amyloid/beta-protein and is thought to slow the progression of the disease.
• Heart disease: standard protein biomarkers for CVD include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins.


BIOINFORMATICS & DATABASE TOOLS

Introduction – Current State

• Many different informational protein databases are available online
• Most databases are focused on protein identification
◦ The research community provides the data that drives the database contents
◦ Validation of mass spec data
• Single vs. multiple species support

Overview of Databases

• NCBI – Protein / Peptidome
• Human Gene and Protein Database (HGPD)
• Human Proteinpedia / Human Protein Reference Database (HPRD)
• Dynamic Proteomics
• Open Proteomics Database
• Global Proteome Machine Database
• Peptide Atlas
• Proteomics Identifications Database (PRIDE)
• UniProt Knowledgebase

NCBI – Protein / Peptidome

• Two databases contained in the Entrez suite
• Multi-species result sets
• Protein
◦ Provides gene information pertaining to the expressed protein queried
• Peptidome
◦ Mass spec based protein identification database
◦ Experiment-based result sets

Human Gene and Protein Database (HGPD)

• Several cDNA contributors, spanning the globe
• Gateway Expression System
◦ Allows for a reproducible clone library. Clones are available for purchase.
• Wheat germ cell-free protein synthesis
◦ Protein expression portion of the database. Allows for visualization of the SDS-PAGE results.

Human Proteinpedia / Human Protein Reference Database (HPRD)

• Modeled after Wikipedia
◦ Users submit and edit the data in the database
◦ Differences from Wikipedia: the original submitter is expected to provide experimental evidence for the data, and only the original submitter can edit that specific data later
• Allows several protein features to be annotated
◦ Post-translational modification
◦ Tissue expression
◦ Cell line expression
◦ Subcellular localization
◦ Enzyme substrates
◦ Protein-protein interactions

Human Proteinpedia / Human Protein Reference Database (HPRD)

• No visual protein expression data
• Protein amino acid sequence given
• Raw and processed mass spec files are available as experimental evidence
• Provides links to the protein in other databases

Dynamic Proteomics

• A different type of database, focusing on the dynamics of proteins in cells treated with an anti-cancer drug
• Shows different uses for proteomics data repositories
◦ Not just an all-encompassing data source with generic data
◦ Uses simple databases and web front ends to make more specific types of data available to the community
• Also provides links to other databases
• Can compare multiple sequences at once to search the cDNA library

Dynamic Proteomics

• Time-lapse microscopy movies illustrate the protein dynamics in individual living human cancer cells in response to an anti-cancer drug
• Time-lapse video: http://www.weizmann.ac.il/mcb/UriAlon/clone_pics/movies/170407pl3D11_1.mpg

Open Proteomics Database

• University of Texas
• Multi-species results
• Smaller pool of data submitted for query

Global Proteome Machine Database

• Private industry involvement
• Mass spec validation
• Protein identification
• Utilizes data from other databases
◦ Differs from the scheme of just linking to other protein databases

Peptide Atlas

• Seattle Proteome Center
• Focused on a subset of human proteins
◦ Heart, lung, blood
• Funded by NIH
• Part of the Trans-Proteomic Pipeline software suite

Proteomics Identifications Database (PRIDE)

• One of the earlier proteomic databases
• European Bioinformatics Institute
• Larger selection of species-specific data
• Java based, available for local deployment

UniProt Knowledgebase

• Swiss Institute of Bioinformatics
• Also curated by the European Bioinformatics Institute
• Funded by NIH
◦ Forced the conversion of earlier non-public versions to become free and open

Overview of Tools

• ExPASy Proteomics Server
• Trans-Proteomic Pipeline

ExPASy Proteomics Server

• Swiss Institute of Bioinformatics tool suite
• Protein identification by amino acid sequence
• Isoelectric point computation
• Prediction of post-translational modifications and amino acid substitutions
• Predicts protein cleavage sites
• Protein identification by molecular weight
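To make the isoelectric point computation listed above concrete, here is a minimal Python sketch of how a theoretical pI can be estimated from a sequence. It is not the server's implementation: the pKa values below are one illustrative set chosen for this example (published scales differ slightly), and the function names are hypothetical. The net charge is computed with the Henderson-Hasselbalch equation, and the pH at which it crosses zero is found by bisection.

```python
# Illustrative pKa values (an assumption; different tools use different sets).
PKA_POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
PKA_NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
PKA_N_TERM = 8.6
PKA_C_TERM = 3.6


def net_charge(sequence, ph):
    """Net charge of an unmodified peptide at a given pH."""
    # Fraction of each basic group that is protonated (charge +1).
    positive = 1.0 / (1.0 + 10 ** (ph - PKA_N_TERM))
    # Fraction of each acidic group that is deprotonated (charge -1).
    negative = 1.0 / (1.0 + 10 ** (PKA_C_TERM - ph))
    for residue in sequence:
        if residue in PKA_POSITIVE:
            positive += 1.0 / (1.0 + 10 ** (ph - PKA_POSITIVE[residue]))
        elif residue in PKA_NEGATIVE:
            negative += 1.0 / (1.0 + 10 ** (PKA_NEGATIVE[residue] - ph))
    return positive - negative


def isoelectric_point(sequence, tol=1e-4):
    """Bisection on pH 0-14; net charge decreases monotonically with pH."""
    low, high = 0.0, 14.0
    while high - low > tol:
        mid = (low + high) / 2
        if net_charge(sequence, mid) > 0:
            low = mid   # still positively charged, so pI lies at higher pH
        else:
            high = mid  # neutral or negative, so pI lies at lower pH
    return (low + high) / 2


# Acidic and basic toy peptides end up at opposite ends of the pH scale.
print(round(isoelectric_point("DDDD"), 2), round(isoelectric_point("KKKK"), 2))
```

The same number is what places a protein along the first (charge) dimension of a 2DE gel, which is why a predicted pI together with molecular weight is useful for matching gel spots to sequences.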

Trans-Proteomic Pipeline

• Seattle Proteome Center

Challenges

• Large number of data sources
• Parallel efforts
• Validation of mass spec data

Future Considerations

• Selection of a few ‘primary’ data repositories
• Consolidation of multiple redundant efforts being funded by the same agency
◦ Particularly NIH
• Data standards to streamline the submission of results into multiple data sources
◦ Reduction of the need to perform many searches to find information about a protein
◦ mzXML is a start, but only covers mass spec data

Database References

• NCBI
◦ Protein: http://www.ncbi.nlm.nih.gov/protein/
◦ Peptidome: http://www.ncbi.nlm.nih.gov/pepdome
• Human Gene and Protein Database (HGPD)
◦ http://riodb.ibase.aist.go.jp/hgpd/cgi-bin/index.cgi
• Human Proteinpedia
◦ http://www.humanproteinpedia.org/index_html
• Human Protein Reference Database (HPRD)
◦ http://www.hprd.org/
• Dynamic Proteomics
◦ http://alon-serv.weizmann.ac.il/dynamprotb/seqsrch
• Open Proteomics Database
◦ http://bioinformatics.icmb.utexas.edu/OPD/
• Global Proteome Machine Database
◦ http://thegpm.org
• Peptide Atlas
◦ http://www.peptideatlas.org/
• Proteomics Identifications Database (PRIDE)
◦ http://www.ebi.ac.uk/pride/
• UniProt Knowledgebase
◦ http://www.uniprot.org/


Tool References

• ExPASy Proteomics Server
◦ http://www.expasy.ch/
• Trans-Proteomic Pipeline
◦ http://tools.proteomecenter.org/wiki/index.php?title=Software:TPP

Applications of Proteomics

Mona Motwani

Discovery of protein biomarkers

A biomarker can be defined as any laboratory measurement or physical sign used as a substitute for a clinically meaningful end point that measures directly how a patient feels, functions or survives. As applied to proteomics, a biomarker is an identified protein (or set of proteins) that is unique to a particular disease state.

Biomarkers of drug efficacy and toxicity are becoming a key need in the drug development process.

Mass spectral-based proteomic technologies are ideally suited for the discovery of protein biomarkers in the absence of any prior knowledge of quantitative changes in protein levels.

The success of any biomarker discovery effort will depend upon the quality of samples analysed, the ability to generate quantitative information on relative protein levels and the ability to readily interpret the data generated.

Study of Tumor Metastasis and Cancers

The identification of proteins whose expression correlates with the metastatic process helps to understand the metastatic mechanisms and thus facilitates the development of strategies for therapeutic intervention and clinical management of cancer.

Information contained within proteomic patterns has been reported to detect ovarian, breast and prostate cancers with sensitivities and specificities greater than 90%.
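For readers less familiar with the two figures quoted above, the short Python sketch below shows how sensitivity and specificity are calculated from a confusion table. The patient counts are hypothetical and chosen only to make the arithmetic easy to follow. They are not taken from any study.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased patients that the test correctly flags."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of healthy subjects that the test correctly clears."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical validation set: 50 cancer patients and 100 healthy controls.
tp, fn = 46, 4   # cancer patients classified as cancer / as healthy
tn, fp = 93, 7   # healthy controls classified as healthy / as cancer

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
```

Both values have to be reported together: a pattern that labels every sample as cancer would reach a sensitivity of 100% with a specificity of 0%.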

Field of Neurotrauma

Neurotrauma results in complex alterations to the biological systems within the nervous system, and these changes evolve over time.

Near-completion of the Human Genome Project has stimulated scientists to begin looking for the next step in unraveling normal and abnormal functions within biological systems. Consequently, there is new focus on the role of proteins in these processes.

Proteomics is a burgeoning field that may provide a valuable approach to evaluate the post-traumatic central nervous system (CNS). However, the sensitivity of the tissue and the detection of potential biomarkers are major concerns.

Renal disease diagnosis

Proteomics has also found significant application in studying the effects of chemical insults on the kidney, particularly as a result of environmental toxins, drugs and other bioactive agents.

Combining classic analytical techniques, such as two-dimensional gel electrophoresis, with more sophisticated techniques, such as MS and liquid chromatography, has enabled considerable progress to be made in cataloguing and quantifying proteins present in urine and various kidney tissue compartments in both normal and diseased physiological states.

Two critical developmental tasks still need to be accomplished. The first is completely defining the proteome in the various biological compartments (e.g. tissues, serum and urine) in both health and disease, which presents a major challenge given the dynamic range and complexity of such proteomes. The second is achieving the routine ability to accurately and reproducibly quantify proteomic expression profiles and develop diagnostic platforms.

Neurology

In neurology and neuroscience, many applications of proteomics have involved neurotoxicology and neurometabolism, as well as the determination of specific proteomic aspects of individual brain areas and body fluids in neurodegeneration.

Investigation of brain protein groups in neurodegeneration, such as enzymes, cytoskeleton proteins, chaperones, synaptosomal proteins and antioxidant proteins, is in progress as phenotype related proteomics.

The concomitant detection of several hundred proteins on a gel provides sufficiently comprehensive data to determine a pathophysiological protein network and its peripheral representatives. An additional advantage is that hitherto unknown proteins have been identified as brain proteins.

Autoantibody profiling

Proteomics technologies enable profiling of autoantibody responses using biological fluids derived from patients with autoimmune disease.

They provide a powerful tool to characterize autoreactive B-cell responses in diseases including rheumatoid arthritis, multiple sclerosis, autoimmune diabetes, and systemic lupus erythematosus.

Autoantibody profiling may serve purposes including classification of individual patients and subsets of patients based on their 'autoantibody fingerprint', examination of epitope spreading and antibody isotype usage, discovery and characterization of candidate autoantigens, and tailoring antigen-specific therapy.

Alzheimer's disease

In Alzheimer’s disease, elevations in beta-secretase create amyloid/beta-protein, which causes plaque to build up in the patient's brain. This build-up is thought to play a role in dementia.

Targeting this enzyme decreases the amyloid/beta-protein and so is thought to slow the progression of the disease.

One procedure to test for the increase in amyloid/beta-protein is immunohistochemical staining, in which antibodies bind to specific antigens of amyloid/beta-protein in biological tissue.

Heart disease

Heart disease is commonly assessed using several key protein based biomarkers. Standard protein biomarkers for CVD include interleukin-6, interleukin-8, serum amyloid A protein, fibrinogen, and troponins.

Cardiac troponin I (cTnI) increases in concentration within 3 to 12 hours of initial cardiac injury and can remain elevated for days after an acute myocardial infarction (MI).

A number of commercial antibody-based assays, as well as other methods, are used in hospitals as primary tests for acute MI.

Future Challenges

There is a need for biomarkers with more accurate diagnostic capability, particularly for early-stage disease.

Also needed are a quality control sample on each chip array and normalization of spectral data through commercially available or in-house computer programs.

Another challenge that proteomics techniques face lies largely in the application of bioinformatics, i.e. spectral data management and analysis. The vast amount of spectral data generated demands the implementation of advanced data management and analysis strategies.

Finally, the obvious challenge, as stated by many investigators, is the identification of the important proteins and peptides that contribute to the proteomic analysis.
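The quality control and normalization step mentioned among the challenges above can be illustrated with a small Python sketch. It assumes total ion current (TIC) normalization, which is only one of several schemes an in-house program might use, and the intensities for the QC sample on three chips are hypothetical. The coefficient of variation (CV) of one peak across chips is used as a simple reproducibility check.

```python
import statistics


def normalize_tic(intensities):
    """Scale a spectrum so that its intensities sum to 1 (assumed TIC scheme)."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("spectrum has no signal to normalize")
    return [value / total for value in intensities]


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean."""
    return statistics.stdev(values) / statistics.mean(values)


# Hypothetical peak intensities of the same QC sample run on three chips.
qc_spectra = [
    [100.0, 200.0, 700.0],
    [110.0, 220.0, 770.0],
    [90.0, 180.0, 630.0],
]

# Compare chip-to-chip variation of the first peak before and after scaling.
raw_cv = coefficient_of_variation([s[0] for s in qc_spectra])
normalized_cv = coefficient_of_variation(
    [normalize_tic(s)[0] for s in qc_spectra]
)
print(f"CV of first peak: raw={raw_cv:.3f}, normalized={normalized_cv:.3f}")
```

In this toy case the chips differ only by an overall scaling factor, so normalization removes the variation entirely. Real spectra also differ in noise and baseline, which is why the QC sample is needed to check how much variation remains.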
