
Proteomics

RM Twyman, University of Warwick, Coventry, UK

© 2012 Elsevier Inc. All rights reserved.

Introduction

Proteomics is the systematic, large-scale analysis of proteins. It is based on the concept of the proteome as a complete set of proteins produced by a given cell or organism under a defined set of conditions. Proteins are directly involved in almost every biological process, so comprehensive analysis of the proteins in the cell provides a unique global perspective on how these molecules interact and cooperate to create and maintain a working biological system. The cell responds to internal and external changes by regulating the level and activity of its proteins, so changes in the proteome, either qualitative or quantitative, provide a snapshot of this regulatory network in action.

The proteome is a complex and dynamic entity that can be defined in terms of the sequence, structure, abundance, localization, modification, interaction, and biochemical function of each of its components, providing a rich and varied source of data. The study of the proteome raises a number of potential ethical issues, such as those concerning the ownership, storage, and use of human tissues; the storage and use of data arising from proteomic research (especially if this affects donor privacy or could lead to discrimination); the extent to which informed consent is required; and questions regarding intellectual property and the use of human samples for proteomic research that later results in a commercial product. The analysis of the diverse properties of the proteome requires an equally diverse range of technologies as well as methods for data integration and mining, which further clouds the issue of ownership and intellectual property. Proteomics provides a much more robust and representative picture of the functioning cell than do other forms of large-scale biology, such as genome sequencing or the global analysis of gene expression; therefore, the potential ethical risks associated with sample and data misuse are greater.

The Nature of Proteomic Data

Data Provided by Protein Separation

The analysis of proteins, whether on a small or large scale, requires methods for the separation of protein mixtures into their individual components. Protein separation methods can be placed on a sliding scale from fully selective to fully nonselective. Selective methods aim to isolate individual proteins from a mixture, usually by exploiting very specific properties such as their binding specificity or biochemical function. In contrast, nonselective separation methods aim to take a complex protein mixture and fractionate it in such a manner that all the individual proteins, or at least a substantial subfraction, are available for further analysis. Such methods lie at the heart of proteomics and exploit very general properties of proteins, such as their mass or net charge. The ethical issues raised by the separation of proteins reflect the fact that such methods could, and indeed do, provide molecular fingerprints that can be used to identify individuals, ethnic groups, and, in a clinical setting, groups of individuals with or susceptible to specific diseases.

Many techniques can be used to separate complex protein mixtures in what at least approaches a nonselective manner, but not all of these techniques are suitable for proteomics. One major requirement is high resolution. The separation technique should produce fractions that comprise very simple mixtures of proteins, and ideally each fraction should contain an individual protein. This essentially rules out one-dimensional techniques – that is, those that exploit a single chemical or physical property as the basis for separation – because they lack sufficient resolving power. Proteomic techniques are therefore multidimensional; that is, two or more different fractionation principles are employed one after another. The other major requirement in proteomics is high throughput. The separation technique should resolve all the proteins in one experiment and should ideally be easy to automate. The most suitable methods for automation are those that rely on differential rates of migration to produce fractions that can be displayed or collected, a process generally described as separative transport. A final requirement is that the fractionation procedure should be compatible with downstream analysis by mass spectrometry (MS) because this is the major technology platform for high-throughput protein identification. The two groups of techniques that have come to dominate proteomics are two-dimensional gel electrophoresis (2DGE) and multidimensional liquid chromatography.

Two-dimensional gel electrophoresis

In 2DGE, proteins are separated by two rounds of orthogonal electrophoresis using different separative principles (Figure 1). The first separation in 2DGE is usually isoelectric focusing (IEF), in which proteins are separated based on their net charge irrespective of their mass.


Figure 1 Two-dimensional electrophoresis using a tube gel for IEF and a slab gel for SDS-PAGE. The proteins are separated in the first dimension on the basis of charge and in the second dimension on the basis of molecular mass. The circles represent proteins, with shading indicating protein pI values and diameters representing molecular mass. The dotted line shows the direction of separation.


The underlying principle is that electrophoresis is carried out in a pH gradient, allowing each protein to migrate to its isoelectric point – that is, the point at which its pI value is equivalent to the surrounding pH and its net charge is zero. Proteins with different pI values therefore focus at different positions in the pH gradient. The second separation is usually standard sodium dodecylsulfate polyacrylamide gel electrophoresis (SDS-PAGE), in which proteins are separated according to molecular mass irrespective of charge. The basis of the technique is the exposure of denatured proteins to the detergent SDS, which binds stoichiometrically to the polypeptide backbone and carries a large negative charge. The presence of tens or hundreds of SDS molecules on each polypeptide dwarfs any intrinsic charge carried by the proteins, and stoichiometric binding means that larger proteins bind more SDS than do smaller proteins. This has two important consequences that ensure separation on the basis of mass alone. First, all protein–SDS complexes have essentially the same charge density. Second, the relative differences in mass between proteins are maintained in the protein–SDS complexes. The gel enhances the size-dependent separation by sieving the proteins as they migrate. SDS-PAGE is generally carried out at right angles to the first IEF step to generate a spread of protein spots in a 2D field that constitutes the molecular fingerprint mentioned previously.

The data produced by 2DGE experiments are visual in nature, so downstream analysis involves capturing the images from 2D gels stained with Coomassie, silver, or more sensitive reagents such as the SYPRO range of dyes and then isolating particular spots for further processing and MS. This process is difficult to automate and therefore constitutes the most significant bottleneck in proteomic research. Until quite recently, manual analysis and spot picking from gels were very common. However, there are now various software packages available that produce high-quality digitized gel images and incorporate methods to evaluate quantitative differences between spots on different gels. These can be integrated with spot excision robots that use plastic or steel picking tips to transfer gel slices to microtiter plates for automated digestion, cleanup, concentration, and transfer to the mass spectrometer. Several commercially available systems can fully automate the analysis and processing of 2D gels and can handle 200–300 protein spots per hour.
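
The focusing principle in IEF lends itself to a simple worked example: a protein's net charge at a given pH can be estimated by summing Henderson–Hasselbalch terms for its ionizable groups, and the pH at which that sum crosses zero is the predicted pI. The following Python sketch illustrates the idea; the pKa values are one common textbook set (an assumption – published scales such as those of Bjellqvist or EMBOSS differ slightly), so this illustrates the principle rather than serving as a validated prediction tool.

    # Sketch: estimating a peptide's isoelectric point (pI) by bisection.
    # pKa values are illustrative textbook numbers, not a specific published scale.
    PKA_POS = {'K': 10.5, 'R': 12.5, 'H': 6.0, 'nterm': 9.0}           # +1 when protonated
    PKA_NEG = {'D': 3.9, 'E': 4.1, 'C': 8.3, 'Y': 10.1, 'cterm': 2.0}  # -1 when deprotonated

    def net_charge(seq, ph):
        """Net charge of a polypeptide at a given pH (Henderson-Hasselbalch)."""
        charge = 1.0 / (1.0 + 10 ** (ph - PKA_POS['nterm']))      # free N-terminus
        charge -= 1.0 / (1.0 + 10 ** (PKA_NEG['cterm'] - ph))     # free C-terminus
        for aa in seq:
            if aa in PKA_POS:
                charge += 1.0 / (1.0 + 10 ** (ph - PKA_POS[aa]))
            elif aa in PKA_NEG:
                charge -= 1.0 / (1.0 + 10 ** (PKA_NEG[aa] - ph))
        return charge

    def isoelectric_point(seq, lo=0.0, hi=14.0, tol=0.001):
        """Bisection: net charge falls monotonically with pH, so find the zero crossing."""
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if net_charge(seq, mid) > 0:
                lo = mid    # still positive at this pH; the pI lies higher
            else:
                hi = mid
        return round((lo + hi) / 2, 2)

    print(isoelectric_point("GISLANWMCLAK"))  # peptide from Figure 2, for illustration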

Multidimensional liquid chromatography

Liquid chromatography (LC) techniques have several advantages over 2DGE for protein separation, including versatility, sensitivity, and the ease with which they can be automated and integrated with downstream analysis by MS. Unlike gel electrophoresis, LC is suitable for the separation of both proteins and peptides, and it can therefore be applied upstream of 2DGE to prefractionate the sample, downstream of 2DGE to separate the peptide mixtures from single excised spots, or instead of 2DGE as the separation technology of choice. Alternative LC methods can exploit different separation principles, such as size, charge, hydrophobicity, and affinity for particular ligands. As is the case for electrophoresis, the highest resolution separations are achieved when two or more orthogonal separation principles are applied.

Protein Identification

The techniques described previously allow protein mixtures to be separated into their components but do not allow those components to be identified. Indeed, the individual fractions produced by such methods are usually anonymous, which from an ethical point of view is advantageous. Each spot on a 2D gel and each fraction emerging from a high-performance liquid chromatography column looks very much like any other. In the case of 2DGE, even differences in spot size and distribution provide only vague clues about protein identity. The next stage in proteomic analysis is therefore to characterize the fractions and thus determine which proteins are actually present. Here again, the privacy of sample donors becomes an issue because the identification of proteins, particularly variants associated with ethnic or clinical status, raises concerns that protein fingerprints could be used to discriminate against individuals and groups.

Proteins can often be characterized using probes – typically antibodies – that recognize unique structural features known as epitopes. This is a very powerful way to isolate and identify individual proteins, but it is difficult to apply on a proteomic scale. Nevertheless, large-scale Western blot procedures have been developed based on this approach, and protein chips, discussed in more detail later, have allowed it to be miniaturized and automated.

The gold standard method for identifying a protein is sequencing by Edman degradation, which involves the stepwise chemical removal of single amino acid residues, allowing short sequences to be determined that can be used to search sequence databases. However, Edman degradation is slow and laborious, and many proteins are blocked to this technique because the N-terminal amino acid is chemically modified.

In the early 1990s, the identification of proteins was revolutionized by simultaneous developments in two areas. First, in MS, techniques became available for the soft ionization of macromolecules, preventing the ions from fragmenting indiscriminately. The two techniques used most widely for ionization in proteomics today are matrix-assisted laser desorption/ionization (MALDI) and electrospray ionization (ESI).

In MALDI, the analyte is mixed with an aromatic matrix compound that can absorb energy from a laser (e.g., α-cyano-4-hydroxycinnamic acid can absorb the energy from a nitrogen UV laser at 337 nm). The analyte and matrix are dissolved in an organic solvent and placed on a holder that can handle multiple samples. The solvent evaporates, leaving matrix crystals in which the analyte is embedded. The holder is placed in the vacuum chamber of the mass spectrometer and a high voltage is applied. At the same time, the crystals are targeted with a short laser pulse. The laser energy is absorbed by the crystals and emitted (desorbed) as heat, resulting in rapid sublimation that converts the analyte into gas-phase ions. These accelerate away from the probe through the analyzer toward the detector. MALDI is used predominantly for the analysis of simple peptide mixtures, such as the peptides derived from a single spot on a 2D gel.

In ESI, the analyte is dissolved and forced through a narrow needle held at a high voltage. A fine spray of charged droplets emerges from the needle and is directed into the vacuum chamber of the mass spectrometer through a small orifice. As they enter the mass spectrometer, the droplets are dried using a stream of inert gas, resulting in gas-phase ions that are accelerated through the analyzer toward the detector. Because ESI produces gas-phase ions from solution, it is readily integrated with upstream protein separation by liquid-phase methods, particularly capillary electrophoresis and LC. Whereas MALDI MS is used to analyze simple peptide mixtures, LC-ESI-MS is more suited to the analysis of complex samples.

These ionization techniques can be combined with a variety of instruments for the sensitive determination of peptide masses. Widely used instruments include triple quadrupole (Q), time of flight (TOF), hybrid Q-TOF, TOF-TOF, Q-ion trap, and Fourier transform ion cyclotron resonance.
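
To make the time-of-flight principle concrete: an ion accelerated through a potential difference V acquires kinetic energy zeV, so its drift time over a flight tube of length L is t = L·sqrt(m/(2zeV)), which scales with the square root of m/z. The sketch below uses illustrative instrument parameters (the 20 kV potential and 1 m tube are assumptions, not the specification of any particular instrument).

    # Sketch: why heavier ions arrive later in a TOF analyzer.
    import math

    E_CHARGE = 1.602176634e-19   # elementary charge (C)
    DALTON = 1.66053907e-27      # 1 Da in kg

    def flight_time(mass_da, charge, voltage=20_000.0, length=1.0):
        """Drift time (s) for an ion of given mass (Da) and charge state;
        kinetic energy zeV = 0.5*m*v**2 after acceleration."""
        m = mass_da * DALTON
        v = math.sqrt(2 * charge * E_CHARGE * voltage / m)
        return length / v

    # A 2 kDa singly charged peptide takes ~sqrt(2) times longer than a 1 kDa one.
    for mass in (1000.0, 2000.0):
        print(f"{mass:.0f} Da -> {flight_time(mass, 1) * 1e6:.1f} microseconds")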

Second, in bioinformatics, algorithms were developed that could be used to search sequence databases with MS data. There are two general approaches, which are compared in Figure 2:

• The analysis of intact peptide ions: This allows the masses of intact peptides to be calculated, and these masses can be used to identify proteins in a sample by correlative database searching.

• The analysis of fragmented ions: Intact peptide ions are fragmented randomly, generally by collision with a stream of inert gas (collision-induced dissociation (CID)). This allows the masses of peptide fragments to be determined, and the resulting CID spectrum can be used either for correlative database searching or to derive de novo sequences (a minimal interpretation of such a fragment ladder is sketched below). In the latter case, the derived sequences can be used as standard queries in similarity search algorithms such as BLAST and FASTA.
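
For illustration, de novo interpretation of a fragment ladder (see Figure 2) reduces to reading off mass differences: the gap between successive fragment ions equals the residue mass of the amino acid extending the chain. The following is a minimal sketch using standard monoisotopic residue masses; note that leucine and isoleucine are isobaric and cannot be distinguished this way, and the tolerance value is an arbitrary assumption.

    # Sketch: calling residues from successive fragment-ion mass differences.
    RESIDUE = {  # monoisotopic residue masses (Da)
        'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
        'V': 99.06841, 'T': 101.04768, 'C': 103.00919, 'L': 113.08406,
        'I': 113.08406, 'N': 114.04293, 'D': 115.02694, 'Q': 128.05858,
        'K': 128.09496, 'E': 129.04259, 'M': 131.04049, 'H': 137.05891,
        'F': 147.06841, 'R': 156.10111, 'Y': 163.06333, 'W': 186.07931,
    }

    def call_residues(ladder, tol=0.02):
        """Translate sorted ladder-peak masses into a partial sequence."""
        seq = []
        for lighter, heavier in zip(ladder, ladder[1:]):
            delta = heavier - lighter
            matches = [aa for aa, m in RESIDUE.items() if abs(m - delta) <= tol]
            seq.append(matches[0] if matches else '?')  # unassignable gaps become '?'
        return ''.join(seq)

    # Illustrative peaks: differences of 113.084 (L/I), 71.037 (A), 128.095 (K).
    print(call_residues([500.000, 613.084, 684.121, 812.216]))  # -> 'LAK'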

The use of intact ion masses to identify proteins is known as peptide mapping or peptide mass fingerprinting (PMF). The principle of the technique is that each protein can be uniquely identified by the masses of its constituent peptides, with this unique signature being known as the PMF. PMF involves the following steps:

• The sample of interest should comprise a single protein or a simple mixture – for example, an individual spot from a 2D gel or a single LC fraction. The sample is digested with a specific cleavage reagent, usually trypsin.

• The masses of the peptides are determined, for example, by MALDI-TOF-MS.

• The experimenter chooses one or more protein sequence databases to be used for correlative searching.

• The algorithm carries out a virtual digest of each protein in the sequence database using the same cleavage specificity as trypsin and then calculates theoretical peptide masses for each protein.

• The algorithm attempts to correlate the theoretical peptide masses with the experimentally determined ones.

• Proteins in the database are ranked in order of best correlation, usually with a significance threshold based on a minimum number of peptides matched. (A minimal sketch of this workflow follows the figure below.)

Figure 2 Protein identification by MS. In a typical strategy, digested peptides are analyzed by MALDI-TOF-MS to determine the masses of intact peptides. These masses can be used in correlative database searches to identify exact matches. If this approach fails, ESI-MS/MS analysis can be used to generate peptide fragment ions. These can be used to search less robust data sources and to produce de novo peptide sequences.
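
To make the correlative search concrete, the following minimal sketch uses a toy in-memory dictionary in place of a real repository such as SWISS-PROT, applies trypsin's cleavage rule (after K or R, but not before P) in silico, and ranks proteins by the number of experimental masses they explain. Real search engines add significance statistics, modification handling, and ppm-scale mass tolerances, none of which is modeled here.

    # Sketch: peptide mass fingerprinting against a toy database.
    import re

    RESIDUE = {'G': 57.02146, 'A': 71.03711, 'S': 87.03203, 'P': 97.05276,
               'V': 99.06841, 'T': 101.04768, 'C': 103.00919, 'L': 113.08406,
               'I': 113.08406, 'N': 114.04293, 'D': 115.02694, 'Q': 128.05858,
               'K': 128.09496, 'E': 129.04259, 'M': 131.04049, 'H': 137.05891,
               'F': 147.06841, 'R': 156.10111, 'Y': 163.06333, 'W': 186.07931}
    WATER = 18.01056  # mass of water added across the hydrolyzed peptide (Da)

    def tryptic_digest(seq):
        """Virtual digest: cleave after K/R unless the next residue is P."""
        return [p for p in re.split(r'(?<=[KR])(?!P)', seq) if p]

    def peptide_mass(pep):
        """Monoisotopic mass of an unmodified peptide."""
        return sum(RESIDUE[aa] for aa in pep) + WATER

    def pmf_score(observed, protein, tol=0.5):
        """Count observed masses explained by the protein's theoretical digest."""
        theoretical = [peptide_mass(p) for p in tryptic_digest(protein)]
        return sum(any(abs(o - t) <= tol for t in theoretical) for o in observed)

    def rank_database(observed, database):
        """Rank database entries by matched peptides, best correlation first."""
        return sorted(database, key=lambda name: -pmf_score(observed, database[name]))

    database = {  # hypothetical sequences standing in for a real database
        'protein_A': 'MKGISLANWMCLAKWESGYNTRFQINSK',
        'protein_B': 'MAAAGGGKLLLPRSSSTTTK',
    }
    observed = [peptide_mass(p) for p in tryptic_digest(database['protein_A'])]
    print(rank_database(observed, database))  # protein_A should rank first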

PMF may not work for several reasons, including the absence of the sequence from the database, insufficient instrument sensitivity, nonspecific cleavage of the protein, the existence of several polymorphic variants of the protein with different masses, the presence of unanticipated posttranslational modifications (either natural or artifacts), or the presence of contaminants. Where PMF fails to identify any proteins matching those present in a given sample, the CID spectrum of one or more individual peptides may provide important additional information. The data can be used in two ways. First, the uninterpreted fragment ion masses can be used in correlative database searching to identify proteins whose peptides would likely yield similar CID spectra under the same fragmentation conditions. Second, the peaks of the mass spectrum can be interpreted, either manually or automatically, to derive partial de novo peptide sequences that can be used as standard database search queries. The advantage of both these approaches is that correlative searching is not limited to databases of full protein sequences.

None of the previously discussed techniques presents an ethical problem per se, but there is a possibility that the ability to identify protein sequences in a high-throughput manner could be used in the same way that single nucleotide polymorphisms are envisaged in pharmacogenomics – to identify individual-specific proteomic characteristics that provide information regarding susceptibility to disease or sensitivity to drugs. Where such data are used solely for the clinical benefit of the patient, this would be acceptable. The danger lies in any possibility that proteomic analysis would be used to predict complex medical or even psychological outcomes and deny individuals with specific proteomic profiles insurance or employment or cause invasions of privacy.

Quantitative Proteomics and the Analysis of Posttranslational Variants

The objective in many proteomic experiments is to identify proteins whose abundance differs across two or more related samples. This may include variations in absolute protein levels or variations in the stoichiometry of different forms of modification, such as phosphorylation. Quantitative proteomics perhaps has the most important impact on ethics because many disease-related changes are quantitative rather than qualitative. In other words, the progression of a disease often involves a particular molecule becoming more or less abundant rather than the appearance of an entirely new molecule (a disease-specific biomarker). Protein quantitation in proteomics relies primarily on the use of general labeling or staining or on the selective labeling or staining of particular classes of proteins. The chosen strategy depends largely on how the protein samples are prepared and fractionated, and the strategies can be divided into two broad categories: those based on the image analysis of 2D gels (Figure 1) and those based on differential labeling of samples for separation by LC followed by MS (Figure 3).

Figure 3 Overview of MS-based strategies for quantitative proteomics. Depending on the point at which the label is introduced, most procedures are classified as (a) in vivo labeling, (b) predigestion labeling in vitro, or (c) postdigestion labeling in vitro.
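
As a concrete illustration of the labeling strategies in Figure 3, relative quantitation reduces to pairing each 'light' peptide peak with its 'heavy' partner at a known mass offset and comparing their intensities. The sketch below assumes a hypothetical 8 Da label shift and a toy peak list; it is not modeled on any specific commercial reagent.

    # Sketch: heavy/light ratios from paired peaks in a labeling experiment.
    def relative_ratios(peaks, shift=8.0, tol=0.05):
        """For each light peak (m/z, intensity), find the heavy partner at
        m/z + shift and report the heavy/light intensity ratio."""
        ratios = {}
        for mz, intensity in peaks:
            partners = [i for m, i in peaks if abs(m - (mz + shift)) <= tol]
            if partners:
                ratios[mz] = partners[0] / intensity  # >1: up in the heavy state
        return ratios

    # Example: the peptide at m/z 512.3 doubles in the labeled condition,
    # whereas the peptide at m/z 611.4 is essentially unchanged.
    peaks = [(512.3, 1.0e5), (520.3, 2.0e5), (611.4, 3.0e5), (619.4, 2.9e5)]
    print(relative_ratios(peaks))  # {512.3: 2.0, 611.4: ~0.97}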

Other Proteomics Technologies

Sequence and structural proteomics

Although proteomics as we understand it today would not have been possible without advances in DNA sequencing, protein sequencing by Edman degradation for many years provided a crucial link between the activity of a protein and the genetic basis of a particular phenotype. It was not until the mid-1980s that it first became commonplace to predict protein sequences from genes rather than to use protein sequences for gene isolation.

The increasing numbers of stored protein and nucleic acid sequences, and the recognition that functionally related proteins often had similar sequences, catalyzed the development of statistical techniques for sequence comparison that underlie many of the core bioinformatic methods used in proteomics today. Nucleic acid sequences are stored in three primary sequence databases – GenBank, the EMBL nucleotide sequence database, and the DNA Data Bank of Japan – which exchange data every day. These databases also contain protein sequences that have been translated from DNA sequences. A dedicated protein sequence database, SWISS-PROT, was founded in 1986 and contains highly curated data for more than 70 000 proteins. A related database, TrEMBL, contains automatic translations of the nucleotide sequences in the EMBL database and is not manually curated. There are also many proprietary databases owned by research companies focusing on specific biomedical fields. The ownership of such data, the associated intellectual property rights, and the permission to distribute the data for public or private use are controversial issues.

Because similar sequences give rise to similar structures, it is clear that protein sequence, structure, and function are often intimately linked. The study of 3D protein structure is underpinned by technologies such as x-ray crystallography and nuclear magnetic resonance spectroscopy, and it has given rise to another branch of bioinformatics concerned with the storage, presentation, comparison, and prediction of structures. The Protein Data Bank was the first protein structure database and now contains more than 10 000 structures. Technological developments in structural proteomics have centered on increasing the throughput of structural determination and the initiation of systematic projects for proteome-wide structural analysis.

Interaction proteomics

This branch of proteomics considers the genetic and physical interactions among proteins as well as interactions between proteins and nucleic acids or small molecules. The analysis of protein interactions can provide information not only about the function of individual proteins but also about how proteins function in pathways, networks, and complexes. It is a field that relies on many different technology platforms to provide diverse information, and it is closely linked with functional proteomics and the large-scale analysis of protein localization. Conceptually, the most ambitious aspect of interaction proteomics is the creation of proteome linkage maps based on binary interactions between individual proteins and higher order interactions determined by the systematic analysis of protein complexes. Key technologies in this area include the yeast two-hybrid system (a genetic assay for binary interactions) and MS for the analysis of protein complexes. Interactions between proteins and nucleic acids underlie many important processes, including gene regulation, whereas protein interactions with small molecules are also of interest – for example, enzymes interacting with their substrates and receptors with their ligands. These types of interactions are often investigated using biochemical assays and structural analysis methods such as x-ray crystallography. The characterization of protein interactions with small molecules can play an important role in the drug development process.

Functional proteomics

The most straightforward way to establish the function of a protein is to test that function directly. Functional proteomics is a relatively new development in which protein functions are tested directly but on a large scale. An example is the systematic testing of expressed proteins for different enzymatic activities, as described in a landmark publication by Martzen and colleagues.

Protein chip technology

Protein chips are miniature devices on which proteins, or reagents that capture proteins from solution, are applied in an array. There are many different types of protein chips, some of which are used to analyze protein abundance and others to study protein functions. This emerging technology has the potential to considerably improve the throughput of protein analysis, particularly with the advent of a whole-proteome chip for the yeast Saccharomyces cerevisiae. The various types of protein chips that have been described are summarized as follows:

• Antibody chips: These consist of arrayed antibodies and are used to detect and quantify specific proteins in a complex mixture. They can be thought of as miniaturized high-throughput immunoassay devices.

• Antigen chips: The converse of antibody chips; these devices contain arrayed protein antigens and are used to detect and quantify antibodies in a complex mixture.

• Universal protein chips (functional arrays): These devices may contain any kind of protein arrayed on the surface and can be used to detect and characterize specific protein–protein and protein–ligand interactions. Various detection methods may be used, including labeling the proteins in solution or detecting changes in the surface properties of the chip (e.g., by surface plasmon resonance). Included within this category are lectin arrays, which are used to detect and characterize glycoproteins.

• Protein capture chips: These devices do not contain arrayed proteins but, rather, other molecules that interact with proteins as broad or specific capture agents. Examples include oligonucleotide aptamers and chips containing molecularly imprinted polymers as specific capture agents or the proprietary protein chips produced by companies such as BIAcore and Ciphergen Biosystems that employ broad capture agents based on differing surface chemistries to simplify complex protein mixtures.

• Solution arrays: The latest generation of protein chips is being released from the 2D array format to increase flexibility and handling capacity. Such devices may, for example, be based on coded microspheres or bar-coded gold nanoparticles.

Ethical Considerations

As outlined previously, proteomics raises ethical challenges similar to, although more pressing than, those of genomics. The greater urgency results from the greater resolution, diversity, and dynamism of proteomic data compared to DNA sequences and from the wider range of samples that can be used for proteomic analysis compared to DNA. For example, proteomics can be applied to ancient samples in which DNA has degraded beyond use, to fixed tissue specimens, and to body tissues and fluids that lack DNA, such as serum, red blood cells, and spinal fluid. This means that proteomics may provide precise molecular data in cases in which DNA evidence is impossible to secure.

One of the main ethical challenges brought about by advances in proteomics is the regulation of the use of human material for proteomic analysis and of how the resulting data are used. There is a balance between the need for greater control and the need for wider access, and the tipping point rests on the concept of the 'greater good.' For example, interests that favor greater control and restriction of sample use include the desire to avoid discrimination, stigmatization, stereotyping (harmful group-based identification), and familial conflict and the desire to provide donors with choice and control over what is done with their biological samples. On the other hand, interests favoring greater access include the wider benefit to humans in terms of better knowledge, medical progress, lower mortality and morbidity, the commercial interests that come from freer access to samples (particularly in the pharmaceutical industry), freedom of research, better access to data for clinicians, and, finally, the desire to provide choice and control to those who wish to contribute to medical research through the donation of samples and body parts (including after death).

It is important to realize one major difference between DNA and proteomic analysis: DNA resources are potentially infinite, whereas proteomic resources are not. A DNA sample can be amplified to produce many copies of the original starting sequence, and both genomic DNA and cDNA copies of mRNA can be 'immortalized' through the creation of clone libraries, PCR primer collections, and array devices. There is no similar amplification technology available for proteins, so samples used in proteomic analysis are finite, and the detection of scarce proteins or protein variants in such samples relies on careful preparation and the sensitivity of the equipment. Bequeathing a sample for proteomic analysis therefore lacks the permanence of donating a DNA sample.

However, any data derived from the samples can be used repeatedly, either for comparison with a different data set or, where the raw data are stored, for mining to address any number of different questions. It has therefore been recommended that the principle of informed consent regarding the use of samples, or of proteomic data derived from them, should include enough choices to ensure that donors are exactly aware of the commitment they are making, taking into account the difficulty of defining a priori the research projects that might be undertaken in the future. Rather than offer a simple choice between refusing and granting permission to use biological materials for research, the choices should be widened to allow donors to permit a single study related to the disease for which the sample was originally collected or to grant access for a limited number of further studies on this disease or others, with the option to request contact for further permission. Provisions should be made to ensure that the data are effectively anonymized (i.e., the patient cannot be personally identified from the data and, for purposes of record keeping and contact, the data should be coded securely). Although the relevant authorities in different countries have different recommendations, they all support the existence of a guaranteed chain of responsibility covering the collection, storage, and use of biological samples and the data thus generated and, to a greater or lesser extent, require details of the research project that will handle the material, including its aims, study design (why the material is needed), and how the material will be anonymized. The regulatory oversight for the use of human material in proteomic research differs between countries and is evolving, but harmonization is being developed in the form of the Universal Declaration on the Human Genome and Human Rights proposed by the United Nations in 1998, which focuses on protection of human health and safety, human (especially patient) rights, commercial rights, intellectual property rights, and international regulations.

With matters of privacy, informed consent, and the ethical benefits and risks of proteomic analysis taken into account, the remaining ethical problem with proteomics is that of ownership and intellectual property, particularly where a sample is used by a private company to generate a commercial product. Approximately 99% of patients do not distinguish between public and private research when deciding whether to allow samples to be used for research purposes, and this is undoubtedly good news given the increasing role of industry in publicly funded medical research, particularly within the European Union. However, there is still some controversy regarding the issue: The European Union adopted the Convention on Human Rights and Biomedicine in 1997, which clearly states that human body parts shall not be used for financial gain, so the creation of value when a human sample is used in research to generate a marketable product is attributed to the research itself, not the research material. In effect, the value of the human material is its 'throw-away value' – that is, its value if it were discarded. In contrast, the United States has a booming market for human samples, with several companies offering commercial access to sample banks (often with linked clinical and molecular/genetic data). Such companies exist in Europe, but the materials are banked on the strict understanding that they would otherwise have been destroyed (no intrinsic value or rarity) and that third-party use will not identify or otherwise harm the patient. Overall, the concept of offering financial gains to patients for their samples is to be discouraged to avoid creating a market in organs and tissues (as explicitly forbidden in many countries and by the World Health Organization) and the possibility that samples could be taken under duress to profit criminals.

Summary

Human samples and large derivative data sets are necessary for biomedical proteomics research and the development of new drugs using proteomics data. The likelihood that such data will be used in a predictive manner raises ethical concerns, principally the impact on privacy, which means it is necessary to balance the needs of medical research with the needs and rights of the patient. Another important ethical challenge is the issue of property rights (and intellectual property rights) arising from the collection, storage, and dissemination of biological samples and the proteomic data generated from them. A harmonized international regulatory framework would be beneficial in this context, but no consistent framework is in place. The outstanding ethical issues are currently addressed by seeking the informed consent of the donor, but it is imperative that informed consent provides donors and patients with sufficient information and suitable choices.

See also: Human Genome Project; Nutrigenomics; Race and Genomics.

Further Reading

Ahmed FE (2008) Utility of mass spectrometry for proteome analysis: Part I. Conceptual and experimental approaches. Expert Review of Proteomics 5: 841–864.

Ahmed FE (2009) Utility of mass spectrometry for proteome analysis: Part II. Ion-activation methods, statistics, bioinformatics and annotation. Expert Review of Proteomics 6: 171–197.

Anderson NG and Anderson NL (1998) Proteome and proteomics: New technologies, new concepts, new words. Electrophoresis 19: 1853–1861.

Chen G and Pramanik BN (2008) LC-MS for protein characterization: Current capabilities and future trends. Expert Review of Proteomics 5: 435–444.

Kalume DE, Molina H, and Pandey A (2003) Tackling the phosphoproteome: Tools and strategies. Current Opinion in Chemical Biology 7: 64–69.

Mann M and Jensen ON (2003) Proteomic analysis of post-translational modifications. Nature Biotechnology 21: 255–261.

Nestler G, Steinert R, Lippert H, and Reymond MA (2004) Using human samples in proteomics-based drug development: Bioethical aspects. Expert Review of Proteomics 1: 77–86.

Patterson SD and Aebersold RH (2003) Proteomics: The first decade and beyond. Nature Genetics 33(supplement): 311–323.

Pattin KA and Moore JH (2009) Role for protein–protein interaction databases in human genetics. Expert Review of Proteomics 6: 647–659.

Reymond MA, Steinert R, Eder F, and Lippert H (2003) Ethical and regulatory issues arising from proteomic research and technology. Proteomics 3: 1387–1396.

Sechi S and Oda Y (2003) Quantitative proteomics using mass spectrometry. Current Opinion in Chemical Biology 7: 70–77.

Stoevesandt O, Taussig MJ, and He M (2009) Protein microarrays: High-throughput tools for proteomics. Expert Review of Proteomics 6: 145–157.

Twyman RM (2004) Principles of Proteomics. Abingdon, UK: BIOS.

Tyers M and Mann M (2003) From genomics to proteomics. Nature 422: 193–197.

Wong SC, Chan CM, Ma BB, et al. (2009) Advanced proteomic technologies for cancer biomarker discovery. Expert Review of Proteomics 6: 123–134.

Biographical Sketch

Richard Twyman gained a first-class honours degree in Genetics from Newcastle University and a Ph.D. in Molecular Biology from the University of Warwick in the UK. He is a specialist consultant in scientific project management, focusing on post-genomic technology and biotechnology.