THE APPLICATION OF PROTEOMICS TOOLS FOR CHARACTERIZATION OF

BIOPHARMACEUTICAL PROCESSES

A dissertation presented

By

Tyler Carlage

To

The Department of Chemistry and Chemical Biology

In partial fulfillment of requirements for the degree of Doctor of Philosophy

In the field of

Chemistry

Northeastern University

Boston, MA

September, 2011

THE APPLICATION OF PROTEOMICS TOOLS FOR CHARACTERIZATION OF

BIOPHARMACEUTICAL PROCESSES

By

Tyler Carlage

ABSTRACT OF DISSERTATION

Submitted in partial fulfillment of the requirements

For the degree of Doctor of Philosophy in Chemistry

In the Graduate School of

Northeastern University, September 2011

ABSTRACT

The production of biological drugs for the treatment of serious diseases is a complex

process utilizing a mammalian cell expression system combined with a complex set of

purification steps to produce drugs of high quality and at optimized yields. In order to

maximize both quality and yield, the cell culture process must be developed ensuring maximal

cell growth and productivity characteristics, as well as scalability and robustness in a

manufacturing setting. The purification method must also be optimized for high yield of

product, while maintaining or enhancing the product quality profile of the drug and removing

cell culture related impurities to acceptably low levels. Limitations of the biopharmaceutical

process can have great impact on the cost of drug production, as well as on the safety and

efficacy of the drug. Understanding the biology of the cells used to produce biopharmaceuticals

can enable the development of better processes. This thesis describes the characterization of

biopharmaceutical processes using proteomics technology.

In chapter 1, the biopharmaceutical process is described, including the development of

cell culture processes, and methods used to enhance cell culture performance through various

genetic engineering strategies. Proteomics analysis can enable the identification of new cell

culture biomarkers related to cell growth and productivity. Tools used in the proteomics field

are outlined, as well as reported proteomics studies of mammalian cell cultures.

In chapter 2, a high-producing Chinese hamster ovary (CHO) cell culture which had been

transfected with the apoptosis inhibitor Bcl-XL gene was compared to a low-producing control.

Shotgun proteomics was used to compare the high and low-producing fed-batch cell cultures at

different growth timepoints. A total of 392 proteins were identified in this study, and 32 of these

proteins were determined to be differentially expressed, including several proteins related to

protein metabolism such as eukaryotic translation initiation factor 3, and ribosome 40S. In

addition, several intermediate filament proteins such as vimentin and annexin, as well as histone

H1.2 and H2A, were downregulated in the high producer. A growth inhibitor, galectin-1, was

downregulated in the high-producer, which may be related to lower cell growth in the control.

The molecular chaperone BiP was upregulated significantly in the high-producer and may

indicate an unfolded protein response due to ER stress.

In chapter 3, an advanced proteomics method using two-dimensional liquid

chromatography and iTRAQ chemical labeling was used to probe the proteomic changes

occurring in CHO cells during exponential and stationary phases of cell culture. Using this

approach, 59 proteins were identified with significant dynamic trends. These proteins were

analyzed using pathway analysis tools, which identified a network of proteins associated with

cell growth and apoptosis. Molecular chaperones and isomerases, such as GRP78 and PDI, were

upregulated during stationary phase, and are associated with cellular response to endoplasmic

reticulum (ER) stress. Nucleic acid binding proteins including MCM2 and MCM5 were

downregulated during stationary phase, and are known cell growth markers. In addition, two

proteins with growth-regulating properties, transglutaminase-2 and clusterin, were identified.

These proteins are associated with tumor proliferation and apoptosis, and were observed to be

expressed at relatively high levels during stationary phase, which was confirmed by western

blotting.

Gene order in eukaryotes is not random, but rather genes related by function tend to be

clustered together and are regulated in similar patterns. It is thought that the co-regulation of

nearby genes is related to chromatin remodeling and histone activity. In chapter 4, genes of

interest related to cell culture performance are mapped to mouse chromosomes, and analyzed for

evidence of clustering. Several clusters of known oncogenes are identified. Several other

clusters of genes of interest are identified from the list of differentially expressed proteins

described in chapter 3. This work provides some initial evidence of potential clustering of

growth-related genes in CHO, which can be expanded on with availability of the CHO genome

and gene expression data.

In the fifth chapter, the application of proteomics techniques to the analysis of secreted

host-cell proteins in process intermediate samples is described. Proteins present in the cell

culture media are identified, including glycolytic enzymes released from damaged cells and

several growth-regulating proteins secreted by the cells. In addition, the clearance of the host

cell proteins is studied by performing proteomics analysis on process intermediates from various

stages within the downstream purification process. Several relatively abundant proteins that

co-purified throughout the process are identified, and the physicochemical properties of these

co-purified proteins are analyzed.

ACKNOWLEDGEMENTS

Writing this thesis is the culmination of years of trial and error, ups and downs, and

overcoming many obstacles both personal and professional. I could not have done it without the

help, guidance, and support of many people around me.

Firstly, I would like to thank my advisor Prof. William S. Hancock for giving me the

opportunity to be a part of his research group as a co-op student. Having students in this

program requires flexibility, patience, and understanding and you have shown all of these

qualities as my advisor. I have learned so much working for you. Also, thank you to Prof.

Marina Hincapie, your guidance has been greatly appreciated. Thank you to my fellow group

members Sam Tep, Agnes Rafalko, and Majlinda Kulloli for your help and friendship.

I have had the privilege to work with many great scientists at Biogen Idec to whom I owe

a great deal of thanks. Thank you to Damian Houde and Yelena Lyubarskaya for developing my

mass spectrometry skills which have served me well. Thank you to Li Zang, Yelena

Lyubarskaya, Rashmi Kshirsagar, Andy Weiskopf, Helena Madden, and Rohin Mhatre, who

have all graciously allowed me the opportunity to pursue my PhD at Biogen Idec. I would like

to especially thank my manager Li Zang for your support and mentoring, which was critical for

my success. Also, thank you to Jason Wong, Vijay Janakiraman, and Matt Westoby for

providing relevant samples and information needed for this work.

Finally, I would like to thank my sister Calley, and my friends Ben, Peter, Kristian, Kurt,

Wes, Karl, and Adam for always supporting me. Nazira, thank you for being in my life, you

have made this all the more worthwhile. Thank you to my grandfather Fred for being an

inspiring and supportive figure in my life. And finally, thank you to my mother Debra for your

never-ending love and support and for having faith in me, and to my father Dean, who showed me

the value of perseverance and achieving your goals. I love you.

TABLE OF CONTENTS

ABSTRACT 2

ACKNOWLEDGEMENTS 6

TABLE OF CONTENTS 8

LIST OF FIGURES 13

LIST OF TABLES 16

LIST OF ABBREVIATIONS 17

CHAPTER 1 22

OVERVIEW OF THE BIOPHARMACEUTICAL PROCESS AND THE APPLICATION

OF PROTEOMICS TO CELL CULTURE

1.1 The Biopharmaceutical Process 22

1.2 Targeted Engineering to Enhance Mammalian Cell Cultures 27

1.3 Proteomics Tools 31

1.3.1 Mass Spectrometry 32

1.3.1.1 Electrospray Ionization 33

1.3.1.2 Matrix Assisted Laser Desorption Ionization (MALDI) 35

1.3.1.3 Ion Trap Mass Spectrometer 36

1.3.1.4 Quadrupole Mass Spectrometer 39

1.3.1.5 Time-of-Flight Mass Spectrometer 40

1.3.1.6 Orbitrap Mass Spectrometer 42

1.3.1.7 Tandem Mass Analysis 43

1.3.2 Separation Methods 45

1.3.2.1 Two Dimensional Gel Electrophoresis (2DGE) 46

1.3.2.2 Shotgun Proteomics 48

1.3.3 Quantitation 50

1.4 Proteomics Analysis of Mammalian Cell Cultures 53

1.5 Overview of Research 57

1.6 References 59

CHAPTER 2 71

PROTEOMICS COMPARISON OF LOW- AND HIGH-PRODUCING CHO CELL

CULTURES

2.1 Overview 71

2.2 Methods 73

2.2.1 CHO Cell Lines 73

2.2.2 Cell Culture Conditions 73

2.2.3 Cell Lysis 74

2.2.4 Trypsin Digestion 74

2.2.5 LC/MS Analysis 75

2.2.6 Protein Identification 75

2.2.7 Assessment of Relative Abundance of Peptides and Proteins 76

2.3 Results 77

2.3.1 Cell Growth and Specific Productivity 77

2.3.2 Extraction of Proteins from CHO Cells 78

2.3.3 Classification of Identified CHO Proteins 79

2.3.4 Identification of Proteomic Changes 80

2.3.5 Differential Expression between Control and High-Producer 81

2.4 Conclusion 87

2.5 References 88

CHAPTER 3 91

ANALYSIS OF DYNAMIC CHANGES TO THE CHO PROTEOME DURING

EXPONENTIAL AND STATIONARY PHASES OF CELL CULTURE

3.1 Overview 91

3.2 Methods 93

3.2.1 Cell Culture 93

3.2.2 Cell Lysis 93

3.2.3 Protein Digestion and Labeling 93

3.2.4 HPLC Fractionation 94

3.2.5 LC/MS 95

3.2.6 Data Analysis 95

3.2.7 Pathway Analysis 97

3.2.8 Western Blotting 97

3.3 Results 98

3.3.1 Proteomic Analysis of Cell Lysates 99

3.3.2 Analysis of Dynamic Trends in Protein Expression 102

3.3.3 Identification of Growth-Regulating Proteins 108

3.3.4 Potential Implications on CHO Cell Culture 111

3.4 Conclusions 114

3.5 References 116

CHAPTER 4 121

CHROMOSOMAL MAPPING OF CHO GENES RELATED TO CELL GROWTH

4.1 Overview 121

4.2 Methods 127

4.2.1 Chromosome Mapping 127

4.2.2 Pathway Analysis 128

4.3 Results 128

4.3.1 Identification of Cell Growth Gene Networks 128

4.3.2 Mapping of Genes of Interest on Mouse Chromosomes 131

4.3.3 Mapping of Differentially Expressed CHO Proteins on Mouse Chromosomes 133

4.4 Conclusions 137

4.5 References 138

CHAPTER 5 143

PROTEOMICS CHARACTERIZATION OF HOST CELL PROTEINS PRESENT IN

VARIOUS STAGES OF A BIOPHARMACEUTICAL PROCESS

5.1 Overview 143

5.2 Methods 145

5.2.1 Purification of Cell Culture Harvest 145

5.2.2 Protein Digestion 146

5.2.3 HPLC Fractionation 146

5.2.4 LC/MS 147

5.2.5 Data Analysis 147

5.3 Results 148

5.3.1 Identification of Secreted Proteins in Process Intermediate Samples 148

5.3.2 Implications of Secreted CHO Proteins on Cell Culture 150

5.3.3 Implication of Secreted CHO Proteins on Downstream Purification 159

5.4 Conclusion 162

5.5 References 164

CONCLUDING REMARKS 169

LIST OF FIGURES

Figure 1.1: Cell line generation and development for cell culture processes. ............................. 24

Figure 1.2: The Biopharmaceutical Process ................................................................................ 26

Figure 1.3: Effect of Bcl-XL expression on CHO Productivity and Viability ............................ 29

Figure 1.4: Mammalian Proteome Complexity ........................................................................... 32

Figure 1.5: Schematic representation of electrospray ionization ................................................. 34

Figure 1.6: Schematic of the matrix assisted laser desorption ionization technique. .................. 36

Figure 1.7: A simplified schematic of a quadrupole ion trap mass analyzer ............................... 37

Figure 1.8: A simplified schematic of a Thermo Finnigan Ultra TSQ triple quadrupole mass

spectrometer .................................................................................................................................. 40

Figure 1.9: A schematic of an orthogonal acceleration time-of-flight mass spectrometer .......... 42

Figure 1.10: Schematic representation of a hybrid LTQ-Orbitrap mass spectrometer ................ 43

Figure 1.11: Fragment ions generated from tandem mass analysis. ............................................ 44

Figure 1.12: An example of a two-dimensional gel ..................................................................... 48

Figure 1.13: A schematic showing the MUDPIT workflow. ....................................................... 49

Figure 1.14: A typical 4-plex iTRAQ workflow ......................................................................... 51

Figure 1.15: An overview of SILAC ........................................................................................... 53

Figure 1.16: 2D-DIGE gel images of Cy2-labeled pools of Comparison A and B cell lysates . 56

Figure 2.1: Cellular Productivity Profiles .................................................................................... 78

Figure 2.2: Proteins Identified in CHO Samples ........................................................................ 80

Figure 2.3: Analysis of BSA-Spiked CHO Lysates ..................................................................... 81

Figure 2.4: Proteins Upregulated in the High-Producer .............................................................. 84

Figure 2.5: Proteins Downregulated in the High-Producer ......................................................... 85

Figure 3.1: CHO Cell Growth and Viability................................................................................ 99

Figure 3.2: Protein Identification Summary .............................................................................. 101

Figure 3.3: Identification of Dynamic Proteomic Trends .......................................................... 103

Figure 3.4: Differentially expressed proteins were classified by protein class using PANTHER.

..................................................................................................................................................... 106

Figure 3.5: Abundance over Time for Proteins Involved in Relevant Pathways ...................... 107

Figure 3.6: Top Scoring Protein Network from Ingenuity Pathway Analysis........................... 109

Figure 3.7: Confirmation of Dynamic Trends in Transglutaminase-2 and Clusterin Expression

..................................................................................................................................................... 111

Figure 4.1: Mouse karyotype using the Giemsa (G-banding) technique ................................... 123

Figure 4.2: Karyotypic analysis of diploid Chinese hamster fibroblast chromosomes (LA-CHE)

and CHO-K1 chromosomes ........................................................................................................ 125

Figure 4.3: Karyotype of CHO-DG44 cells using Giemsa (G-banding) technique. .................. 126

Figure 4.5: The top network for mouse chromosome 11 ........................................................... 130

Figure 4.6: Summary of Growth Regulating Gene Clusters Identified on Mouse Chromosomes

..................................................................................................................................................... 132

Figure 4.7: Cluster of Genes of Interest on Mouse Chromosome 7 .......................................... 133

Figure 4.8: Gene Expression Activity of CHO Genes Mapped to Mouse Chromosomes......... 134

Figure 4.9: Chromosomal Mapping of Cell Growth Related Genes ......................................... 135

Figure 5.1: Downstream Process Overview .............................................................................. 149

Figure 5.2: Protein Identification Summary .............................................................................. 150

Figure 5.3: Cellular Compartment of Proteins Identified in HCCF .......................................... 151

Figure 5.4: Classification of Secreted Proteins from Cell Culture Harvest ............................... 152

Figure 5.5: Sequence Alignment of Pyruvate Kinase M1 and M2 ............................................ 155

Figure 5.6: MS/MS Spectrum for Pyruvate Kinase M2 Peptide T48 (CCSGAIIVLTK).......... 156

Figure 5.7: Top Scoring Network of Extracellular Proteins in HCCF ...................................... 157

LIST OF TABLES

Table 2.1: Comparison of Cell Lysis Techniques ........................................................................ 79

Table 2.2: Differentially Expressed Proteins ............................................................................... 83

Table 3.1: List of Proteins with Dynamic Trends in CHO Cell Culture .................................... 105

Table 4.1: Top Gene Networks Identified in Each Mouse Chromosome .................................. 128

Table 5.1: Top 5 Most Abundant Proteins in Cell Culture Harvest ........................................... 153

Table 5.2: List of Co-Purified Host Cell Proteins ...................................................................... 160

Table 5.3: Average Physiochemical Values for Co-Purified Proteins ....................................... 161

LIST OF ABBREVIATIONS

2DGE        two dimensional gel electrophoresis
BCA         bicinchoninic acid
BHK         Baby hamster kidney
BiP         Immunoglobulin binding protein
bp          base pair
BSA         bovine serum albumin
CHO         Chinese Hamster Ovary
CID         collision induced dissociation
Clu         clusterin
CYR61       cysteine-rich, angiogenic inducer, 61
DAVID       Database for Annotation, Visualization, and Integrated Discovery
DC          direct current
DHFR        dihydrofolate reductase
DIGE        differential imaging gel electrophoresis
DNA         deoxyribonucleic acid
DTT         dithiothreitol
Eif3        eukaryotic translation initiation factor 3
ELISA       enzyme-linked immunosorbent assay
ER          endoplasmic reticulum
ESI         electrospray ionization
FABP4       Fatty acid binding protein 4
FDA         Food and Drug Administration
GAPDH       glyceraldehyde 3-phosphate dehydrogenase
GHT         glycine, hypoxanthine, and thymidine
GnTIV       N-acetylglucosaminyltransferase IV
GRAVY       Grand Average of Hydropathicity
GRP78       78 kDa glucose-regulated protein
HCCF        Harvested cell culture fluid
HCl         Hydrochloric acid
HEK         Human embryonic kidney
HPLC        High performance liquid chromatography
IEF         isoelectric focusing
IgG         Immunoglobulin G
IPA         Ingenuity Pathway Analysis
IPG         immobilized pH gradient
iTRAQ       isobaric tag for relative and absolute quantitation
kDa         kilodalton
LC/MS       liquid chromatography mass spectrometry
m/z         mass to charge ratio
MALDI       matrix assisted laser desorption ionization
MCM2        minichromosome maintenance complex component 2
mg          milligram
mL          milliliter
mm          millimeter
MOPS        3-(N-morpholino)propanesulfonic acid
M-PER       Mammalian Protein Extraction Reagent
MRM         multiple reaction monitoring
MS          mass spectrometry
Mtorc1      mammalian target of rapamycin complex 1
MUDPIT      multidimensional protein identification technology
NS0         mouse myeloma null cell line
PAb         polyclonal antibody
PANTHER     Protein Analysis Through Evolutionary Relationships
PBS         phosphate buffered saline
PBS-T       phosphate buffered saline with Tween
PDI         protein disulfide isomerase
PEP         phosphoenolpyruvate
pI          isoelectric point
PIBF        progesterone immunomodulatory binding factor 1
PIKK        phosphatidylinositol 3-kinase-related kinase
PK          pyruvate kinase
PQD         pulsed Q dissociation
PSA         Prostate-specific antigen
PTM         post translational modification
r2          correlation coefficient
RACK1       Receptor for activated C kinase
RF          radio frequency
RP          reversed phase
RSD         relative standard deviation
SCX         strong cation exchange
SD          standard deviation
SDS-PAGE    sodium dodecyl sulfate polyacrylamide gel electrophoresis
SILAC       stable isotope labeling by amino acids in cell culture
siRNA       small interfering ribonucleic acid
ST3Gal-IV   Gal beta-1,3/4-GlcNAc alpha-2,3-sialyltransferase
ST6GalI     Gal beta-1,4-GlcNAc alpha-2,6-sialyltransferase I
TCTP        Translationally controlled tumor protein
TFA         trifluoroacetic acid
TG2         Transglutaminase-2
TOF         time-of-flight
TP53        tumor protein 53
Tris        2-Amino-2-hydroxymethyl-propane-1,3-diol
UPR         unfolded protein response
UV          ultraviolet
VCD         viable cell density
VCP         valosin containing protein
µg          microgram
µL          microliter

CHAPTER 1

OVERVIEW OF THE BIOPHARMACEUTICAL PROCESS AND THE APPLICATION OF

PROTEOMICS TO CELL CULTURE

1.1. The Biopharmaceutical Process

The commercialization of biopharmaceuticals developed to treat serious diseases has

undergone rapid growth over the past 30 years. These specialized drugs offer the potential to

target a disease more specifically than a conventional synthesized pharmaceutical, while

resulting in fewer side effects. Over 250 biological drugs have been approved worldwide since

the first approval of insulin in 1982 [1]. Biopharmaceuticals earned over $90 billion in revenue

in 2010. Despite the decreasing rate of FDA approvals over the past several years, market

research indicates that the revenue generated by biopharmaceuticals is expected to increase to

$167 billion by 2015 [2].

The process used to produce biopharmaceuticals (i.e. the biopharmaceutical process or

bioprocess) is very complex. The upstream part of the process uses genetically modified cells to

express a recombinant protein drug product [3]. Mammalian cells such as Chinese hamster

ovary (CHO), baby hamster kidney (BHK), and mouse myeloma (NS0) are the most common

platforms used for biopharmaceutical production, primarily because they produce proteins with

desired post-translational modifications. For example, glycoproteins expressed by CHO cells

exhibit glycosylation similar to human cells [4], which is critical for correct protein function and

pharmacokinetics [5, 6].

The first stage of cell line development involves insertion of a target gene into the host

cell, which is typically accomplished using an engineered vector and utilizing a suitable selection

mechanism. For example, the commonly used CHO-DG44 cell line is a dihydrofolate

reductase (DHFR)-deficient mutant that requires glycine, hypoxanthine, and thymidine

(GHT) for growth [7]. The transfection of CHO-DG44 cells with a DHFR-containing vector

allows selection and gene amplification in GHT-minus media. Including the anti-folate drug

methotrexate in the medium provides further selective pressure on the cells, allowing for

selection of cells expressing DHFR, and in turn the target gene, at relatively high levels [8].

Cells expressing the transfected genes will grow under the selective conditions. These

selected cells are then transferred as single cells to separate vessels, where they are grown to

produce clonal populations. These subclones are studied to determine cell growth and

productivity characteristics, and slowly scaled up to larger culture volumes as additional

parameters are monitored. After a final cell line displaying the best cell growth, productivity,

and product quality attributes is selected, optimization of the process begins. Process and media

optimization occurs in small-scale systems including 96-well plates, small shake flasks, and

benchtop bioreactors (typically up to 5 L in total volume). The conditions used to grow the cells

can have a significant impact on the cell culture phenotype. The media components used to feed

the cells, dissolved oxygen levels, bioreactor temperature and pH all play key roles in supporting

high cell growth and productivity [9]. These parameters are typically optimized for each specific

cell line used, and vary between different cell culture processes.

Figure 1.1: Cell line generation and development for cell culture processes.

Proteins of interest are marked protein o.i. Wavy lines indicate subcloning of individual cell

lines, to obtain optimal cell growth and productivity. Vials indicate frozen banks of cells. [9]

Upon harvesting the cells, the conditioned media containing the drug product is isolated

from the cellular material and moved into the downstream process where a series of filtration and

separation steps are used to purify the drug product from the cell culture matrix [10]. This part

of the process must remove residual host cell materials, including proteins and

DNA, while maintaining a high yield of purified drug product. A simplified diagram of the

entire biopharmaceutical process is shown in Figure 1.2.

The biopharmaceutical process is optimized for both product yield and product quality.

The yield of the process is critical, due to the relatively high cost of biopharmaceutical

production. The factors affecting yield from the upstream process are cell growth, cell viability,

and cell-specific protein production (i.e. specific productivity) [11]. High cell growth and

productivity are selected for during screening of cell lines, and are further driven by careful

optimization of cell culture conditions. In addition to cell growth and productivity, cell viability

is also important as it determines the duration of time the cell culture can maintain productivity.

Cell viability is limited by apoptosis, or programmed cell death, which can be triggered by

numerous factors including the accumulation of metabolic byproducts throughout the course of

the cell culture [12].
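
The relationship between cell growth, culture duration, and specific productivity can be illustrated with a short calculation. The sketch below is not taken from this work: it assumes the widely used convention of dividing the change in product titer by the integral of viable cell density (IVCD, estimated with the trapezoidal rule), and the function names and the example measurements are hypothetical values chosen only for illustration.

```python
from typing import Sequence


def integral_viable_cell_density(times_days: Sequence[float],
                                 vcd_cells_per_ml: Sequence[float]) -> float:
    """Trapezoidal integral of viable cell density, in cell-days per mL."""
    if len(times_days) != len(vcd_cells_per_ml) or len(times_days) < 2:
        raise ValueError("need at least two matched time/VCD points")
    ivcd = 0.0
    for i in range(1, len(times_days)):
        dt = times_days[i] - times_days[i - 1]
        ivcd += 0.5 * (vcd_cells_per_ml[i] + vcd_cells_per_ml[i - 1]) * dt
    return ivcd


def specific_productivity(times_days: Sequence[float],
                          vcd_cells_per_ml: Sequence[float],
                          titer_mg_per_l: Sequence[float]) -> float:
    """Average specific productivity (assumed definition) in pg/cell/day."""
    if len(titer_mg_per_l) != len(times_days):
        raise ValueError("titer and time series must have the same length")
    ivcd = integral_viable_cell_density(times_days, vcd_cells_per_ml)
    if ivcd <= 0:
        raise ValueError("IVCD must be positive")
    # Unit conversion: 1 mg/L equals 1e6 pg/mL.
    delta_titer_pg_per_ml = (titer_mg_per_l[-1] - titer_mg_per_l[0]) * 1e6
    return delta_titer_pg_per_ml / ivcd


if __name__ == "__main__":
    # Hypothetical fed-batch measurements (not data from this thesis).
    days = [0.0, 2.0, 4.0]
    vcd = [1e6, 3e6, 5e6]        # viable cells per mL
    titer = [0.0, 40.0, 120.0]   # mg of product per L
    qp = specific_productivity(days, vcd, titer)
    print(f"{qp:.1f} pg/cell/day")  # prints "10.0 pg/cell/day"
```

Under this assumed definition, final titer is the product of specific productivity and IVCD, so a culture that loses viability early accumulates less IVCD and yields less product even when the specific productivity is unchanged.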

Product quality is also directly tied to the upstream process. Post-translational

modifications, as well as other aspects of product quality including protein aggregation and

enzymatic cleavage can all have an impact on protein function and must be carefully monitored

during cell culture development [13, 14]. Controlling product quality and process consistency

across different development stages and production scales is of high importance within the

industry [15].

The ability to produce high quality biopharmaceuticals using efficient and robust

processes is a critical component of success in biopharmaceutical development. Corporations

with a wide portfolio of clinical and commercial programs must have the ability to rapidly

develop suitable processes for many biopharmaceuticals, while meeting program timelines.

Because of the importance of this aspect of biopharmaceutical development, an active area of

research is the identification and application of methods to enhance mammalian cell cultures

through various engineering efforts.

Figure 1.2: The Biopharmaceutical Process

A simplified diagram of a typical large-scale biopharmaceutical process is shown. The cell

culture process starts with a small volume of cells called the seed, which is thawed and used to

inoculate a small bioreactor. The reactor size can be scaled up to 25,000 L, depending on the

manufacturing scale. Upon harvesting the cells, centrifugation and depth filtration are used to

separate cells from the media supernatant, which is then transferred to the purification process.

A series of chromatographic and filtration steps are used to purify the drug from the cell culture

matrix. Typically a combination of affinity and ion-exchange chromatography is used. The bulk

material generated from this process is then formulated and packaged to create the final drug

product.

[Figure 1.2 diagram labels: Cell Culture (150 L, 750 L, and 5,000 L bioreactors); centrifuge, depth filtration, and collection; 1,500 L harvest collection tank; Purification (filter, chromatography 1 skid and column, eluate hold tank, chromatography 2 skid and column, ultrafiltration/diafiltration, bulk fill).]

1.2. Targeted Engineering to Enhance Mammalian Cell Cultures

One method to enhance cell culture performance is through modification of relevant

biological pathways via cellular engineering. Certain pathways are known to be directly related

to important aspects of cell culture, including cell growth, productivity, and product quality.

Targeted overexpression of genes within these pathways has become a common method for

improvement of cell culture characteristics [16].

One example is the engineering of enhanced metabolism of carbon sources such as

glucose and glutamine, both of which are fed to cells in large quantities to drive cell growth.

Glutamine in particular is typically fed to cells at high concentrations because these cells cannot

produce it naturally due to low endogenous expression of glutamine synthetase. The metabolic

conversion of glutamine and glucose into carbon dioxide and water produces lactate and

ammonia as byproducts. These compounds are known to negatively affect cell growth [17].

Overexpression of pyruvate carboxylase in CHO, BHK, and HEK-293 (human embryonic

kidney) cells was used to reduce glucose consumption and thereby the buildup of toxic

ammonia and lactate in the cells [18-20]. In addition, overexpression of glutamine synthetase in

NS0 cells was used to reduce glutamine consumption and ammonia levels in cell culture [21].

The effect of overexpression of the molecular chaperones immunoglobulin binding

protein (BiP) and protein disulfide isomerase (PDI) on mammalian cell cultures has also been

investigated. Both of these proteins reside in the endoplasmic reticulum (ER) and are involved

in folding of nascent polypeptides entering the ER after translation. It was thought that these

proteins may be part of a protein secretion “bottleneck” where high levels of expressed proteins

cannot be folded and processed efficiently to achieve optimal specific productivity. Therefore,

overexpression of these proteins may offer a way to overcome this bottleneck in productivity.

While BiP overexpression increased protein production in yeast cells [22], the opposite effect

was observed in CHO and NS0 cells [23]. Overexpression of PDI improved production of

monoclonal antibodies in CHO cells [23, 24]. However, it was observed that this approach gave

varied results between different cell lines, and does not offer a universal benefit in mammalian

cell culture performance.

Resistance to apoptosis is another area where targeted engineering has been applied to

improve cell cultures. The anti-apoptotic genes Bcl-2 and Bcl-XL have both been studied

extensively to assess their impact on cell growth, viability, and productivity. Various studies

have shown that overexpression of either Bcl-2 or Bcl-XL results in extended viability of

cells in culture, as well as higher viable cell densities compared to control cell lines [25-28].

However, the impact of these genes on productivity is mixed. In one case, overexpression of

Bcl-XL resulted in up to 90% higher specific productivity of a recombinant antibody (see Figure

1.3) [29]. In another similar study, it was shown that overexpression of Bcl-XL did not improve

productivity in a CHO cell line expressing recombinant erythropoietin [30].

Figure 1.3: Effect of Bcl-XL expression on CHO Productivity and Viability

A: Comparison of titer of a monoclonal antibody expressed in a parent cell line 100AB-37 and

mixed populations co-expressing either bcl-xL (100AB-37/BclxL pool) or null vector (100AB-37/null

vector pool). B: Comparison of percent viability of parent cell line 100AB-37 and mixed

populations expressing either bcl-xL (100AB-37/BclxL pool) or null vector (100AB-37/null

vector pool) [29].

Efforts to modify protein glycosylation via overexpression of various glycosyltransferases have

also been successfully used in mammalian cells. The product quality profile of the

biopharmaceutical is tied to the glycosylation on the protein, which can affect efficacy as well as

circulatory half-life. As an example, overexpression of GnTIV (N-acetylglucosaminyltransferase IV)

modified the glycosylation such that higher numbers of tri- and tetra-antennary

glycans were present on interferon-gamma [31]. This provides additional

potential sialylation sites on the molecule compared to the control material which contained

mostly bi-antennary glycan structures. Additional overexpression of sialyltransferases ST3Gal-IV

and ST6Gal-I boosted overall sialylation to 80% compared to ~60% in the control [32].

The success of these efforts demonstrates the potential benefits of targeted engineering on

cell culture performance. However, it also underlines how the results of these efforts are

dependent on existing knowledge about relevant genes and biological pathways. While some

relevant pathways are well understood, others are not and therefore our knowledge of potential

targets is limited. In fact, our general understanding of the underlying biology of mammalian

cells used for biopharmaceutical production is lacking. Years of adaptation have changed the

biology of these cells significantly relative to the tissues from which they originate; for

example, most CHO cell lines have lost the ability to properly control the cell cycle [33]. These

changes make comparisons between different cell types and even different cell lines difficult.

Many of the genes playing key roles in cellular productivity, cell growth, and certain aspects of

product quality, are in fact not well understood. The study of mammalian cell cultures using

proteomics tools offers the potential of identifying novel proteins involved in cellular growth,

productivity, and product quality which could be used as potential targets for cellular

engineering. By directly comparing cell cultures with different phenotypes, e.g., low and high

producers, quantitative proteomics approaches could be used to identify differentially expressed

proteins which are related to that phenotype. The genetic manipulation of these potential

markers may result in positive enhancements to mammalian cell cultures, such as higher

productivity or better product quality.

1.3. Proteomics Tools

Proteomics analysis involves the large-scale characterization of the protein components

of a given biological sample [34]. Proteomics evolved as a complement to the field of genomics,

and is of high interest because of the close relationship between protein expression and

biological function.

There are many challenges associated with proteomics analysis. Mammalian proteomes

are extremely complex, even more so than genomes [35]. As a result of alternative splicing,

multiple isoforms of a protein may exist. In addition, many proteins contain post-translational

modifications, such as glycosylation, oxidation, deamidation, phosphorylation, etc. These

modifications increase sample complexity. Finally, unlike genomics, which can amplify DNA by the

polymerase chain reaction, proteomics has no direct method for amplifying proteins during

analysis. Therefore, proteomics methods must be extremely sensitive to detect low-level

proteins present in biological samples.

Typically, proteomics analysis involves the use of separation techniques to isolate

proteins or peptides associated with the proteome, and mass spectrometric techniques to identify

the proteins of interest based on fragmentation data. Rapid advances in separations and mass

spectrometry technology over the past 10 years have accelerated the proteomics field and

provided many powerful tools to proteomics researchers.

Figure 1.4: Mammalian Proteome Complexity

The number of genes, mRNA transcripts, and proteins present in a human are shown.

1.3.1. Mass Spectrometry

A mass spectrometer is a mass-sensitive detector which detects charged analytes based on

their mass-to-charge (m/z) ratio. Analytes are first ionized, a process which transfers molecules

from solid or liquid phase into gas phase and imparts a positive or negative charge, allowing

them to be manipulated using electric or electromagnetic fields. The ions are then separated

based on their m/z ratio using a mass filter. Common mass filters include ion traps, quadrupoles,

time-of-flight chambers, and Orbitraps. All of these filters operate on the common principle that

an ion will travel through an electromagnetic field following a path dependent upon its m/z ratio.

Therefore, manipulation of that field enables the filtering of particular ions, such that only ions

of a specific m/z will pass to the detector [36].

1.3.1.1. Electrospray Ionization

The first stage of mass spectrometric analysis is ionization. One of the most popular and

powerful ionization methods for analysis of biomolecules is electrospray ionization. Pioneered

by John Fenn in the 1980s, this process involves the application of high voltage to a liquid

sample flowing through a stainless steel needle [37]. The voltage produces charged molecules in

the sample, which form droplets as they exit the needle in a cone formation. The solvent

forming these droplets is evaporated in the electrospray source by application of high

temperature and heated gas, producing smaller and smaller droplets. Coulombic repulsion of the

ions within the droplet also facilitates desolvation, until single ions are present in the gas phase.

The amount of charge present on the ion depends on the chemical structure. For example, small

peptides may accept one or two positive charges during positive mode electrospray ionization.

On the other hand, a protein of 150 kDa would accept anywhere from 35 to 75 positive charges.

As a result, mass spectra of electrospray ionized compounds typically consist of many peaks

resulting from different charge states of the same analyte, complicating data analysis [38].
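To illustrate why a single intact protein gives rise to many peaks, the following minimal Python sketch computes the m/z of each charge state for a hypothetical 150 kDa protein carrying 35 to 75 protons. The function names and the exact neutral mass of 150,000 Da are assumptions made for illustration; adducts and isotope distributions are ignored.

```python
PROTON_MASS = 1.007276  # Da, mass of one proton


def esi_mz(neutral_mass, charge):
    """m/z of an [M + zH]z+ ion formed in positive-mode electrospray."""
    return (neutral_mass + charge * PROTON_MASS) / charge


def charge_envelope(neutral_mass, z_min, z_max):
    """Map each charge state in [z_min, z_max] to its m/z value."""
    return {z: esi_mz(neutral_mass, z) for z in range(z_min, z_max + 1)}


# Hypothetical 150 kDa protein with 35 to 75 positive charges
envelope = charge_envelope(150_000.0, 35, 75)
for z in (35, 50, 75):
    print(f"z = {z:2d}  m/z = {envelope[z]:.2f}")
```

Each entry in the envelope corresponds to one peak of the same analyte, which is why deconvolution is usually needed to recover the neutral mass.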

One of the advantages of electrospray is that it is easily interfaced with liquid

chromatography. A typical electrospray source works with a range of flow rates from 0.1 – 10

mL/min. Nanospray ionization is a scaled down electrospray format which works with flow

rates in the nL/min range. Peaks generated from nanoscale liquid chromatography typically have

higher analyte concentrations compared to microscale liquid chromatography, resulting in

greater sensitivity when coupled with nanospray-ESI mass spectrometry [39]. Compared to

standard ESI, nanospray typically uses a small diameter coated silica needle to which the high

voltage is applied, and uses little or no desolvation gas due to the smaller droplet sizes generated

during ionization. The major disadvantage of nanospray-ESI is less robustness due to the small

needle size, which can easily become clogged with particulates, destabilizing the spray.

Figure 1.5: Schematic representation of electrospray ionization [39].

1.3.1.2. Matrix Assisted Laser Desorption Ionization (MALDI)

Like electrospray, MALDI is a “soft” ionization technique which is suitable for

protein and peptide analysis because it leaves the molecules relatively intact during MS analysis

[40]. This method uses pulsed light from a laser source to transfer analytes from a solid surface

to the gas phase. Ions generated during this process are transferred from the source into the mass

spectrometer under high vacuum. Samples are first mixed with an acidic matrix solution, and

deposited in small droplets onto a metal surface where the spot dries, forming a crystalline

structure on the surface of the plate [41]. A pulsed nitrogen laser, typically operating at 5 – 10

Hz, is directed at various regions of the sample spots on the plate. The energy from the laser

excites the matrix molecules, causing a transfer of charge to the analyte and desorption from the

plate surface.

MALDI typically generates singly charged ions for proteins and peptides, simplifying

spectra interpretation compared to electrospray. Other advantages include high sensitivity, high

tolerance of salts and impurities, and the ability to analyze an entire plate (up to 384 samples on

one plate) in a high-throughput manner using newer instruments with automated data acquisition

software. Disadvantages include matrix-related noise in the low mass region of the spectrum, as

well as low shot-to-shot reproducibility and short sample life span [42].

Figure 1.6: Schematic of the matrix assisted laser desorption ionization technique.

1.3.1.3. Ion Trap Mass Spectrometer

The ion trap mass spectrometer was originally developed by Wolfgang Paul, who won

the 1989 Nobel Prize in Physics for his invention [43]. The original ion trap design utilized three

electrodes, two endcaps and one ring electrode, to trap ions in a central region where they can be

ejected to a detector. A radio frequency voltage applied to the ring electrode traps the ions

around the ring, while a low pressure gas, typically helium, is used in the trap to collide with the

ions and lower their kinetic energies, thereby stabilizing their motion within the trapping region.

Further application of a DC voltage across the endcaps also provides a trapping effect, such that

the ions are trapped in the center of the three electrodes. Ramping of the RF and DC voltages

destabilizes the motion of the ions and causes them to eject from the trap to a detector in order of

increasing m/z ratio.

Figure 1.7: A simplified schematic of a quadrupole ion trap mass analyzer

Ions enter the trap from the ion source, and are trapped between the ring electrode and endcap

electrodes. Once ejected from the trap, the ions are detected by an electron multiplier detector.

[44]

Trapped ions may be subjected to tandem mass analysis by collision with gas molecules

under higher kinetic energy. In a data dependent analysis, precursor ions are detected in a single

ion trap scan, followed by selection of a particular precursor ion for fragmentation. The trap is

filled with the target precursor ion, and fragmented under high kinetic energy conditions.

Resulting fragment ions are then ejected to the detector. The ion trap mass spectrometer also has

the unique capability of multiple stage tandem mass analysis, or MSⁿ. This allows multiple

stages of fragmentation on fragment ions to generate smaller and smaller fragments to assist in

identification of analytes. Tandem mass analysis is discussed further in section 1.3.1.7.

The ion trap suffers from the limitation of space-charging effects. Because of the defined

space occupied by the ion cloud within the mass spectrometer, the ions may destabilize each

other if they are too close in physical space, causing loss of signal. For this reason, ion trap mass

spectrometers use an automatic gain control to limit the number of ions allowed into the trap for

every scan event. This limits the overall sensitivity of an ion trap mass spectrometer.

The introduction of the linear ion trap reduced these limitations. The linear ion trap

utilizes parallel quadrupole rods to trap ions. Due to the design of the linear ion trap, more ions

can be stored in the trap without space-charging effects, thereby increasing the sensitivity of the

instrument up to ten times compared to the original 3-D ion trap design [45, 46]. The linear ion

trap also has faster scanning rates compared to the 3D ion trap.

In proteomics research, the linear ion trap is an instrument of choice because of its high

sensitivity and fast scan rates, which extends the proteome coverage of an analysis when

compared to other types of mass spectrometers. In addition, the linear ion trap is available in

hybrid configurations with other mass analyzers, such as the LTQ-Orbitrap and LTQ-FT which

combine the capabilities of the linear ion trap with the high mass resolving power and high mass

accuracy of the Orbitrap and Fourier transform mass analyzers.

1.3.1.4. Quadrupole Mass Spectrometer

A quadrupole mass spectrometer uses a quadrupole mass analyzer to filter ions based on

their m/z ratio. The quadrupole consists of four metal rods which are arranged in parallel. A

pair of radio frequency (RF) voltages is applied to two opposing pairs of rods. A DC voltage is

superimposed on the RF voltages. As ions pass between the rods, ions of a specific m/z

ratio can be selected by adjusting the voltages applied to the rods. Other ions will not be able to

pass through the rods.

The quadrupole mass spectrometer is often configured as a triple quadrupole, with the

three quadrupoles arranged in series. In this configuration, the first and third quadrupoles (Q1

and Q3, respectively) act as mass filters, while the second quadrupole (Q2) acts as a collision cell

and is set at a higher pressure by introduction of a collision gas into the quadrupole chamber.

The triple quadrupole mass spectrometer can be used for selected reaction monitoring (SRM)

experiments, where Q1 selects a precursor ion, Q2 fragments the precursor, and Q3 selects a

specific fragment ion. This method is an extremely specific and sensitive method for detecting

analytes, and is commonly used for detecting drug metabolites from biological samples [47-49].

It has also been used for quantitation of peptide biomarkers in complex biological samples [50].

The quadrupole is typically used for quantitative applications because of its high

sensitivity and wide dynamic range, and the high specificity offered by SRM analysis.

However, because the quadrupole scan speed is relatively low, the instrument is slow when

performing MS scans over a wide mass window.

Figure 1.8: A simplified schematic of a Thermo Finnigan Ultra TSQ triple quadrupole mass

spectrometer is shown. Two shorter quadrupoles after the source focus the ion beam prior to

entering Q1. [51]

1.3.1.5. Time-of-Flight Mass Spectrometer

The time-of-flight mass spectrometer (TOF-MS) measures the m/z ratio of an ion based

on the time it takes to traverse a field-free drift chamber, called a time-of-flight chamber. The

amount of time it takes for an ion to move through the chamber to the detector is

proportional to the square root of its mass-to-charge ratio [52]. Heavier ions will take a longer time to reach the

detector, while lighter ions will take less time (assuming the same charge state). The ions are

accelerated into the TOF region by application of an accelerating voltage. In theory, all ions

should accelerate at the same time and energy into the TOF; however in reality the kinetic

energies of accelerated ions encompass a distribution due to imperfections in ion transmission.

To correct for differences in the distribution of kinetic energy of ions of the same m/z, use of a

reflectron has become standard in TOF MS. This device is installed at one end of the flight

chamber, and applies a constant electrostatic field to reflect the ion beam back towards the

detector. Ions with higher kinetic energies will penetrate the electrostatic field deeper, such that

lower kinetic energy ions will catch up to their higher energy counterparts (see Figure 1.9). The

use of reflectron technology in TOF-MS instruments enables high resolution and high mass

accuracy measurements [53].
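The relationship between flight time and m/z follows from the energy balance zeV = ½mv²: an ion accelerated through a potential V crosses a field-free drift length L in a time t = L·sqrt(m / (2zeV)). The sketch below evaluates this for an idealized linear instrument. The drift length of 1 m and accelerating voltage of 20 kV are assumed values chosen only for illustration, and the reflectron and the initial energy spread are ignored.

```python
import math

E_CHARGE = 1.602176634e-19   # C, elementary charge
DALTON = 1.66053906660e-27   # kg, unified atomic mass unit


def flight_time(mz, drift_length_m=1.0, accel_voltage=20_000.0):
    """Seconds needed by an ion of the given m/z (Da per charge) to cross
    a field-free drift region after acceleration through accel_voltage."""
    # z*e*V = 1/2 * m * v^2  ->  v = sqrt(2*e*V / (m/z))
    velocity = math.sqrt(2.0 * E_CHARGE * accel_voltage / (mz * DALTON))
    return drift_length_m / velocity


for mz in (500.0, 1000.0, 4000.0):
    print(f"m/z {mz:7.1f}: {flight_time(mz) * 1e6:.2f} microseconds")
```

Quadrupling m/z doubles the flight time, which is the square-root dependence used to calibrate TOF spectra.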

The time-of-flight mass spectrometer is a pulsed ion mass analyzer. Ions must be

accelerated into the TOF chamber in discrete groups at the same time. For this reason, pulsed

ion sources such as MALDI are typically combined with TOF mass analyzers due to their

inherent compatibility. However, in order to make TOF compatible with continuous ion sources,

orthogonal extraction is used. In this case, ions are accelerated along an axis perpendicular to the

time of flight chamber, and the ion beam is focused and cooled using ion optics and collision

with a low-pressure gas. The focused beam is accelerated perpendicularly into the TOF where it

is further focused by a set of charged grids followed by acceleration from a charged pusher plate.

The hybrid quadrupole-TOF-MS system is offered by several vendors, and combines the

strengths of the triple quadrupole system with the high resolution and mass accuracy of the TOF.

Figure 1.9: A schematic of an orthogonal acceleration time-of-flight mass spectrometer

Orthogonal acceleration time of flight mass spectrometer schematic:[54] 20 – ion source; 21 –

ion transport; 22 – flight tube; 23 – isolation valve; 24 – repeller plate; 25 – grids; 26 –

acceleration region; 27 – reflectron; 28 – detector.

1.3.1.6. Orbitrap Mass Spectrometer

The Orbitrap mass spectrometer is a novel mass analyzer which utilizes two electrodes,

an inner spindle and outer barrel electrode, to capture ions in an oscillating orbit [55]. Ions are

injected into the mass analyzer tangentially, and accumulate in the space between the two

electrodes, where electrostatic attraction toward the inner spindle is balanced by centrifugal force.

The harmonic oscillations of the orbiting ions are converted to m/z ratio using Fourier

transformation.
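In an Orbitrap the axial oscillation frequency of an ion is proportional to the square root of z/m, so once a frequency has been extracted by Fourier transformation it can be converted to m/z relative to a calibrant of known m/z. The sketch below shows that conversion. The function name and the numerical frequencies are hypothetical, and real instruments use vendor-specific calibration functions rather than a single reference ion.

```python
def mz_from_frequency(freq_hz, ref_freq_hz, ref_mz):
    """Convert an axial oscillation frequency to m/z using a reference ion.

    Because frequency is proportional to sqrt(z/m), m/z scales with the
    inverse square of the frequency.
    """
    return ref_mz * (ref_freq_hz / freq_hz) ** 2


# Hypothetical calibrant: m/z 400 oscillating at 500 kHz
for freq in (500e3, 353.553e3, 250e3):
    print(f"{freq / 1e3:8.3f} kHz -> m/z {mz_from_frequency(freq, 500e3, 400.0):.1f}")
```

Halving the frequency corresponds to a fourfold increase in m/z, so small frequency differences at high m/z map to large mass differences.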

The main advantages of the Orbitrap are high resolution (up to 240,000), high mass

accuracy, and a wide dynamic range [56]. The instrument is available from Thermo Scientific as

a hybrid instrument coupled with a linear ion trap (LTQ-Orbitrap) and also as a standalone

benchtop mass analyzer (Exactive). The LTQ-Orbitrap is a popular instrument for bottom-up

proteomics work, and has been successfully used for identification of proteins from complex

mixtures [57-61] as well as in peptide quantitation [62]. The high mass accuracy of the Orbitrap

improves the confidence of peptide identification [63].

Figure 1.10: Schematic representation of a hybrid LTQ-Orbitrap mass spectrometer

(http://www.thermo.com).

1.3.1.7. Tandem Mass Analysis

Mass spectrometry is the technology enabling large-scale proteomic studies, because it is

the only technique capable of sequencing large numbers of proteins quickly and easily [64].

This is accomplished by tandem MS analysis, where proteins or peptides are fragmented into

unique fragment ions. The resulting MS/MS spectra can be searched against a protein sequence

database to determine the identity of the protein from which that peptide was derived.

The most common fragmentation method is collision-induced dissociation (CID). In this

mode, ions are collided with neutral gas atoms or molecules (typically He, Ar, or N2), inducing fragmentation along

labile bonds. Depending on instrument configuration, kinetic energy of the precursor ions may

also be increased by adjustment of the field strength within the collision chamber to facilitate

fragmentation. For a peptide, the common fragmentation is along the peptide backbone, creating

b and y type fragment ions. These fragment ions provide information that can be used to identify

peptides. Database search algorithms such as Sequest and Mascot are commonly used to search

tandem mass data against a protein sequence database for identification [65].
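As a minimal sketch of the values a search algorithm compares against an MS/MS spectrum, the code below computes singly charged b and y ion m/z values for an unmodified peptide from monoisotopic residue masses. The function name and the test sequence PEPTIDE are illustrative assumptions; this is not the scoring method of Sequest or Mascot, and modifications, neutral losses, and higher charge states are ignored.

```python
# Monoisotopic residue masses in Da
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565   # Da
PROTON = 1.007276   # Da


def fragment_ions(peptide):
    """Return (b_ions, y_ions) as lists of singly charged m/z values.

    b_ions[i] is b(i+1), counted from the N-terminus;
    y_ions[i] is y(i+1), counted from the C-terminus.
    """
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_ions.append(sum(masses[:i]) + PROTON)
        y_ions.append(sum(masses[-i:]) + WATER + PROTON)
    return b_ions, y_ions


b, y = fragment_ions("PEPTIDE")
for i, (b_mz, y_mz) in enumerate(zip(b, y), start=1):
    print(f"b{i} = {b_mz:9.4f}   y{i} = {y_mz:9.4f}")
```

A complementary pair such as b2 and y5 sums to the peptide mass plus two protons, which is a convenient consistency check when annotating spectra by hand.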

Figure 1.11: Fragment ions generated from tandem mass analysis.

Different possible fragment ions that can be generated from peptides are shown. Peptides

fragmented by collision-induced dissociation typically generate b and y type ions.

Another fragmentation method, unique to ion trap mass spectrometers, is pulsed Q

dissociation (PQD), which is available on Thermo Scientific instruments. This

method first applies ramped up RF voltages (the transition of RF voltages is referred to as Q

values) within the ion trap to increase the kinetic energy of the trapped ions. The ions are held

for a set amount of time (usually 100 µs) before the Q value is adjusted to the starting point. The

pressure within the trap is then increased using collision gas as in typical CID, which causes

fragmentation of the trapped ions [66].

The use of PQD for tandem mass analysis in ion traps overcomes a limitation of

CID known as the “1/3 cutoff rule”. When performing traditional CID using an ion trap instrument,

the mass range of the tandem mass spectrum is limited to 1/3 of the mass to charge ratio of the

precursor ion and above. Therefore, low m/z fragment ions are not detected. This is due to the

high Q values used within the trap to activate the ions during fragmentation, which causes

destabilization of low-mass ions.
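The practical consequence of the 1/3 cutoff rule can be sketched with a short calculation. The function names and the example m/z values below are hypothetical, and the fixed fraction of one third is the rule of thumb quoted above rather than an exact instrument parameter.

```python
def low_mass_cutoff(precursor_mz, fraction=1.0 / 3.0):
    """Approximate lowest fragment m/z retained during ion trap CID."""
    return precursor_mz * fraction


def is_detectable(fragment_mz, precursor_mz):
    """True if a fragment lies at or above the low-mass cutoff."""
    return fragment_mz >= low_mass_cutoff(precursor_mz)


precursor = 600.0  # hypothetical precursor m/z
for fragment in (150.0, 250.0, 450.0):
    status = "detected" if is_detectable(fragment, precursor) else "lost"
    print(f"fragment m/z {fragment:.0f}: {status}")
```

For this precursor, fragments below roughly m/z 200 fall outside the trapping window under CID, which is the region PQD is intended to recover.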

When using PQD fragmentation, the Q value is lowered prior to fragmentation, which

allows more ions to be ejected to the detector, thus enabling detection of low m/z fragment ions.

However, the fragmentation efficiency of PQD is typically lower compared to CID because of

the lower kinetic energy of the precursor ions during fragmentation.

1.3.2. Separations Methods

Mass spectrometers have inherent limitations in performance. One such limitation

involves ion suppression, where analytes at low abundance or with weak ionization efficiency

will be detected with greatly compromised sensitivity when analyzed simultaneously with

analytes at higher abundance or with better ionization efficiency. Another limitation is

resolution. Depending upon the type of mass analyzer being used, mass spectrometers can have

up to 500,000 resolution, but more typically < 100,000 resolution for proteomics experiments.

In a complex proteomics sample, isobaric or near-isobaric peptides cannot be accurately

resolved, thereby interfering with protein identification. To mitigate these limitations, suitable

separation techniques are required prior to MS analysis to reduce sample complexity and

improve ionization for accurate identification of proteins.
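To make the resolution limitation concrete, resolving power can be approximated as R = m / Δm, where Δm is the smallest mass difference that can be distinguished at mass m. The sketch below estimates the resolving power needed to separate two near-isobaric peptide ions. The function name and m/z values are hypothetical, and definitions of resolution differ between instrument types (for example full width at half maximum versus 10% valley), so the result is only a rough guide.

```python
def required_resolving_power(mz_a, mz_b):
    """Approximate R = m / delta_m needed to distinguish two ions."""
    mean_mz = (mz_a + mz_b) / 2.0
    return mean_mz / abs(mz_a - mz_b)


# Two hypothetical near-isobaric peptide ions
pairs = [(1000.00, 1000.50), (1000.00, 1000.05), (1000.00, 1000.01)]
for a, b in pairs:
    print(f"{a:.2f} vs {b:.2f}: R of about {required_resolving_power(a, b):,.0f}")
```

Ions differing by 0.01 at m/z 1000 call for a resolving power of about 100,000, at the upper end of what is typical for proteomics experiments, which is why chromatographic separation before MS remains necessary.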

1.3.2.1. Two Dimensional Gel Electrophoresis (2DGE)

Two types of electrophoresis are commonly used in proteomics analysis: isoelectric

focusing (IEF) and SDS-PAGE. Isoelectric focusing uses a pH gradient formed in the gel to

separate proteins based on their pI. SDS-PAGE separates proteins based on their molecular

weight as they migrate through a gel matrix of defined density.

Isoelectric focusing and SDS-PAGE can be combined into a two-dimensional gel

electrophoresis (2DGE) method where proteins are first separated by pI using an immobilized

pH gradient (IPG) strip, which is then loaded transversely across the top of an SDS-PAGE gel to

separate the proteins by molecular weight [67, 68]. Proteins are typically visualized by staining

the gel with non-specific UV-absorbing or fluorescent reagents. A popular method is Coomassie

Blue stain, which can detect down to ~ 40 ng of protein in a single band [69]. Other more

sensitive stains such as silver stain can detect below 1 ng of protein per band. Specific spots of

interest from the gel can be excised for identification using in-gel trypsin digestion followed by

mass spectrometric analysis [70].

Advantages of 2DGE include the ability to resolve thousands of proteins in a single gel,

and the method has been frequently applied in proteomics studies to analyze cell lysates. Smales et al. used

2DGE to analyze NS0 cell lysates, identifying over 2000 proteins in a single gel using Sypro

Ruby stain [71]. 2DGE also offers the ability to quantitate differences between samples using

difference gel electrophoresis (DIGE) where different fluorescent dyes are used to differentiate

proteins from different samples run in a single gel [72]. Disadvantages of these methods

include limited throughput and poor gel-to-gel reproducibility.

Figure 1.12: An example of a two-dimensional gel is shown, from the analysis of a CHO whole

cell lysate with identified proteins indicated [73]

1.3.2.2. Shotgun Proteomics

Another common proteomics approach is shotgun proteomics, which utilizes proteolytic

digestion of proteins into peptides, typically using trypsin, followed by LC/MS analysis for

identification of the proteins. Peptides are separated based on differences in hydrophobicity

using reversed-phase chromatography. For increased peak separation efficiency, this can be

coupled with other separation techniques. One common approach is MUDPIT

(multidimensional protein identification technology), which combines ion-exchange

chromatography and reversed-phase chromatography online for a two-dimensional separation of

peptides [74].
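The proteolytic digestion step can be mimicked in silico. The sketch below applies the commonly used simplification that trypsin cleaves on the C-terminal side of lysine (K) and arginine (R) unless the next residue is proline (P). The function name and the example sequence are hypothetical, and missed cleavages are not modeled.

```python
def tryptic_digest(sequence):
    """Split a protein sequence into fully cleaved tryptic peptides."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        # Cleave after K or R, except when followed by P
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


print(tryptic_digest("AAKRPGGRLLK"))
```

Search engines generate the same kind of theoretical peptide list from every protein in the sequence database before matching it against the measured spectra.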

Figure 1.13: A schematic showing the MUDPIT workflow. Digested tryptic peptides are

separated using a special column consisting of strong cation exchange (SCX) and reversed-phase

solid phases. Peptides are eluted stepwise from the SCX to RP phase using increasing

concentrations of salt, followed by reversed-phase gradient to elute the peptides from the second

column into the MS. [75]

A key advantage of the shotgun proteomics method is the ability to sequence hundreds or

even thousands of proteins in a single LC/MS run, making this the method of choice for large

numbers of proteomics samples. The major disadvantage of the method is the increased

complexity of the sample after trypsin digestion, which can make accurate protein identification

difficult due to ionization suppression of low-level peptides, poor MS/MS spectra quality, and

interference of near-isobaric peptides in MS/MS spectra.

1.3.3. Quantitation

Relative quantitation of proteins in shotgun proteomics can be achieved using label-free

methods, chemical labeling methods, or metabolic labeling methods. These methods enable

relative quantitation of proteins between different proteomic samples, allowing the determination

of differential expression.

Spectral counting is a label-free method based on the observation that the number of

MS/MS scans acquired for peptides associated with a particular protein is proportional to the

concentration of that protein in the sample [76]. As a result, the spectral count value for proteins

can be used as an estimate of relative abundance of the proteins, and compared between samples

to infer differences in relative abundance. The method is fairly accurate for more abundant

proteins, but accuracy is rather poor for low-abundance proteins.
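A minimal sketch of spectral counting is given below. Counts are normalized to the total number of spectra in each run before a ratio is taken, and a small pseudocount guards against division by zero for proteins missing from one sample. The normalization scheme, the pseudocount, the protein names, and the counts are all assumptions made for illustration; published methods differ in how they normalize.

```python
def spectral_count_ratios(counts_a, counts_b, pseudocount=0.5):
    """Ratio (sample A / sample B) of normalized spectral counts per protein."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    ratios = {}
    for protein in set(counts_a) | set(counts_b):
        norm_a = (counts_a.get(protein, 0) + pseudocount) / total_a
        norm_b = (counts_b.get(protein, 0) + pseudocount) / total_b
        ratios[protein] = norm_a / norm_b
    return ratios


# Hypothetical spectral counts from two LC/MS runs
high_producer = {"protein_1": 40, "protein_2": 60}
low_producer = {"protein_1": 20, "protein_2": 80}
for name, ratio in sorted(spectral_count_ratios(high_producer, low_producer).items()):
    print(f"{name}: {ratio:.2f}")
```

With only a handful of spectra per protein, a difference of one or two counts changes the ratio substantially, which reflects the poor accuracy for low-abundance proteins noted above.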

The most common chemical labeling approach uses isobaric tags for relative

and absolute quantitation (iTRAQ). This method utilizes a chemical tag which labels the

N-termini and lysine side-chains of peptides. Up to eight different samples can each be labeled with a

different iTRAQ reagent. After labeling, the samples are mixed together, and analyzed by

LC/MS. Because the iTRAQ reagents are isobaric, peptides of the same sequence from the

different samples will be detected as a single ion during MS analysis. However, when the

labeled peptides are fragmented during tandem mass analysis, a unique reporter ion is generated

from each iTRAQ reagent. The relative intensity of these reporter ions can be used to infer

relative quantitation of the peptides [77, 78].
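The reporter ion calculation can be sketched as follows for a 4-plex experiment, whose reporter ions appear at nominal m/z 114, 115, 116, and 117. Each peptide's reporter intensities are divided by a reference channel, and the peptide ratios are summarized per protein by their median. Choosing channel 114 as the reference and using the median are assumptions of this sketch, as are the intensities; isotope impurity correction is omitted.

```python
from statistics import median


def reporter_ratios(intensities, reference="114"):
    """Reporter ion intensities of one peptide relative to a reference channel."""
    ref = intensities[reference]
    return {channel: value / ref for channel, value in intensities.items()}


def protein_ratio(peptide_reporters, channel, reference="114"):
    """Median peptide-level ratio for one channel of a protein."""
    return median(reporter_ratios(p, reference)[channel] for p in peptide_reporters)


# Hypothetical reporter intensities for three peptides of one protein
peptides = [
    {"114": 100.0, "115": 200.0, "116": 100.0, "117": 50.0},
    {"114": 50.0, "115": 150.0, "116": 50.0, "117": 25.0},
    {"114": 10.0, "115": 10.0, "116": 10.0, "117": 5.0},
]
for channel in ("115", "116", "117"):
    print(f"{channel}/114 = {protein_ratio(peptides, channel):.2f}")
```

The median is less sensitive than the mean to a single peptide with a distorted ratio, for example one affected by a co-isolated near-isobaric precursor.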

Figure 1.14: A typical 4-plex iTRAQ workflow

Samples are digested and labeled with different iTRAQ reagents. After labeling, the samples are

mixed together into a single tube. This mixed sample is subsequently analyzed by some form of

LC/MS analysis. Each peptide selected for tandem MS analysis will yield typical

fragment ions as well as reporter ions in the low m/z region of the mass spectrum. The relative

abundance of the peptide in the four samples can be determined from the intensity of these

reporter ions.

The SILAC approach (stable isotope labeling by amino acids in cell culture) is a

metabolic labeling approach used in cases where proteomic samples can be generated from cell

culture. Two different cell cultures are fed with light and heavy media, the latter containing

stable isotopes of an amino acid such as arginine but otherwise identical to the light media. After

growing the cells in these media, the “heavy” amino acids are incorporated in the proteins

expressed by the cells. After preparing and analyzing protein samples from the light and heavy

cell cultures, a pair of ions will be detected for peptides corresponding to light and heavy media.

The ratio of these ions can be used to determine relative quantitation of proteins [79].
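The light and heavy ion pair can be located and quantified with a short calculation. The sketch below assumes labeling with arginine containing six 13C atoms, so each labeled residue adds about 6.02 Da, and the observed m/z spacing is that mass difference divided by the charge state. The function names and the example values are hypothetical.

```python
C13_C12_DIFF = 1.0033548  # Da, mass difference between 13C and 12C


def heavy_mz(light_mz, charge, labeled_residues=1, carbons_per_residue=6):
    """Expected m/z of the heavy partner of a light peptide ion."""
    mass_shift = labeled_residues * carbons_per_residue * C13_C12_DIFF
    return light_mz + mass_shift / charge


def silac_ratio(heavy_intensity, light_intensity):
    """Relative abundance of the heavy-labeled versus light peptide."""
    return heavy_intensity / light_intensity


# Hypothetical doubly charged peptide containing one arginine
light = 500.0
print(f"heavy partner expected at m/z {heavy_mz(light, charge=2):.4f}")
print(f"heavy/light ratio: {silac_ratio(3.0e6, 1.5e6):.2f}")
```

For a doubly charged ion the pair is separated by about 3 m/z units rather than 6, so the charge state must be known before the partner peak is assigned.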

The SILAC method is a very strong quantitative method, because the stable isotopes are

incorporated into the proteins at a very early stage of the experiment. As a result, sample to

sample variability is minimized. One limitation of the method is the cost of stable isotope

labeled media. Because of the relatively high cost, these experiments are typically carried out in

small cell volumes, either in plates or small shake flasks. In addition, the common SILAC

method can multiplex only two samples, making analysis of larger numbers of samples more

time-consuming compared to the iTRAQ method.

Figure 1.15: An overview of SILAC

Cells are grown in “light” and “heavy” media. In this example, the heavy media contains a

stable isotope of arginine containing six ¹³C atoms. After passaging the cells three times, the cells

are harvested and proteins extracted. Subsequent shotgun proteomics analysis identifies two ions

for a given peptide, separated by a mass of 6 Da. The relative abundance of that peptide can be

determined from the intensity of the two ions in the MS spectrum.

1.4. Proteomics Analysis of Mammalian Cell Cultures

Proteomics has been utilized to study mammalian cell cultures, including the effect of

sodium butyrate treatment [80], low culture temperature [81], lactate metabolism [82], specific

productivity [41], and cell growth rates [83].

Addition of sodium butyrate to cell culture media has been shown to improve specific

productivity in mammalian cell cultures [84]. Proteomics and transcriptomics tools were used to

study the effect of butyrate treatment on CHO and hybridoma proteomes [80]. Whole cell

lysates generated from cells grown with and without sodium butyrate were analyzed by 2DGE

and stained with Sypro Orange. Gel spots were compared between gels and differences of 1.5-fold or

greater were considered significant. Over 600 spots were identified, and 43 spots were found to

be differentially expressed after sodium butyrate treatment. The differentially expressed proteins

were found to be related to protein folding, trafficking, redox control, and ER stress response.

The authors concluded that upregulation of proteins involved in folding and vesicle transport

may have a role in the increased productivity observed under butyrate treated conditions.
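A threshold filter of the kind described for the gel spot comparison can be sketched as below. Reading the 1.5 cutoff as a fold change applied symmetrically to increases and decreases is an assumption of this sketch, and the spot names and intensities are hypothetical. The cited study may have applied additional statistical criteria.

```python
def is_differential(treated, control, threshold=1.5):
    """True if the spot intensity changed by at least `threshold`-fold
    in either direction."""
    ratio = treated / control
    return ratio >= threshold or ratio <= 1.0 / threshold


# Hypothetical normalized spot intensities: (butyrate-treated, untreated)
spots = {
    "spot_1": (3000.0, 1000.0),
    "spot_2": (1100.0, 1000.0),
    "spot_3": (400.0, 1000.0),
}
changed = sorted(name for name, (t, c) in spots.items() if is_differential(t, c))
print(changed)
```

Only spots passing the filter would then be excised and identified, which keeps the number of digests and mass spectrometric analyses manageable.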

In another example, CHO cells with different expression levels of green fluorescent

protein (GFP) were compared using proteomics and transcriptomics tools [41]. Cell lysates

prepared at two different cell culture time points were digested with trypsin and labeled with

iTRAQ, followed by two-dimensional LC/MS analysis. A total of 864 proteins were identified.

Upregulated and downregulated proteins were identified at the two time points based on the

iTRAQ reporter ion ratios. Weak correlation between transcriptome and proteome results was

observed. Many differentially expressed proteins had protein biosynthesis functional attributes

including molecular chaperones. Other differentially expressed proteins included cytoskeletal

proteins, DNA binding proteins, and proteins involved in cellular metabolism.

Proteomics analysis using two-dimensional gel electrophoresis was used to investigate

cell cultures where metabolic shift was observed [82]. The metabolic shift was triggered by

reducing the concentration of glucose and glutamine in cell culture, resulting in altered cellular

metabolism and lower lactate production. Cell lysates were analyzed by 2DGE with silver

staining. Eight differentially expressed protein spots were identified. These spots were excised

and subjected to trypsin in-gel digestion followed by MALDI-TOF analysis. All of the spots

were identified, and included actin, GAG polyprotein, NADH-ubiquinone, and phosphoglycerate

mutase.

In another case, CHO cell lines with different growth rates were compared using 2DGE

and cDNA microarray tools [83]. During cloning experiments of four different CHO-K1 cell

lines expressing recombinant monoclonal antibodies, clones displaying different cell growth

properties were selected for further analysis. Cell lysates were analyzed by 2D DIGE. A total of

58 protein spots were identified as differentially expressed between different pairs of slow and

fast growing cells. After MALDI-TOF analysis of digested protein spots, several potential

growth related markers were identified, including cytoskeletal proteins, chaperones, and other

proteins involved in the secretory pathway such as valosin containing protein (VCP). Upon

functional validation by siRNA and overexpression studies, VCP was shown to have a direct

impact on CHO cell growth, boosting viable cell density by 20-30% across three different cell

lines.

Figure 1.16: 2D-DIGE gel images of Cy2-labeled pools of Comparison A and B cell lysates

Differentially expressed proteins that have been successfully identified by MALDI-TOF MS and

LC-MS/MS are represented on the gel using Decyder 6.5 software generated Master Number as

a reference. (i) 2D-DIGE gel image of Cy2-labeled pool of MAb-expressing clone V1-5 (Slow)

and MAb-expressing clone 9B10 (Fast) (Comparison A) cell lysates. (ii) 2D-DIGE gel image of

Cy2-labeled pool of PA DUKX 153.8 (Slow) and PA DUKX 378 (Fast) (Comparison B) cell

lysates. (iii) Zoomed in region of the 2D-DIGE gel image of Cy2-labeled pool of PA DUKX

153.8 (Slow) and PA DUKX 378 (Fast) (Comparison B) cell lysates demonstrating upregulated

expression of protein subsequently identified by MS as VCP. [83]

While these studies offer glimpses into the biological processes at work in different

mammalian cell cultures, the discovery of information that can directly lead to improved control

over process yield and product quality is arguably limited. The sheer complexity of a

mammalian cell lysate extends beyond the limits of what current proteomic analytical

technologies can detect, making identification and quantitation of proteins difficult. Issues of

biological variability and the dynamic aspects of cell culture pose additional obstacles to

comprehensive proteomic studies. The application of new proteomics tools and techniques to

this particular area of research is vital in the effort to improve our understanding of the biology

of mammalian cell cultures for bioprocessing.

1.5. Overview of Research

The work described in this thesis is the application of different analytical techniques and

tools aimed at identifying and quantifying proteomic changes within CHO cell cultures used for

bioprocessing. The goals of this work are 1) the establishment of a proteomic platform for

analysis of proteomic changes in CHO cell cultures and 2) the identification of proteins and

protein pathways which may be related to CHO cell growth and productivity as it relates to the

production of biopharmaceuticals. With many tools available for proteomics research, the

challenge is finding the right tools and the best way to apply them to obtain a comprehensive

analysis of the proteome.

In chapter 2, work is presented on the development of a “shotgun” proteomics method for

analysis of CHO cell culture samples, and the application of this method to comparison of low-

and high-producing CHO cell cultures. This study served as a proof-of-concept to show that

bottom-up proteomics approaches using enzymatic protein digestion and LC/MS analysis can be

applied to the analysis of CHO cell cultures and be used to identify relevant proteins. In addition

to identifying some proteins of interest in this study, it also highlighted some limitations of this

kind of method, in terms of identification and quantitation of proteomic differences in a complex

mammalian cell lysate.

In chapter 3, the development of an improved proteomics method is described, which

combines isotopic chemical labeling with two-dimensional LC/MS analysis to increase the

dynamic range, as well as lower the limit of detection of the method. In addition, a novel data

analysis method was developed to identify dynamic changes in protein abundance over multiple

time-points of a cell culture. This method was applied to the identification of dynamic trends in

protein expression between the exponential and stationary growth phases of a CHO cell culture.

In the fourth chapter, CHO proteomics data is used to study correlations between gene

function and chromosomal location on the mouse chromosomes. Genes of interest are mapped

to the mouse genome, and evidence of co-expression of genes of similar function is shown. The

impact that this data may have on CHO cell biology and cell culture performance is discussed.

In the fifth chapter, the application of proteomics techniques to the analysis of secreted

host-cell proteins in process intermediate samples is described. The identification and

characterization of the secreted host cell proteins is examined, with discussion on the potential

impact of some of these proteins on cell culture performance. In addition, the clearance of the

host cell proteins is studied by performing proteomics analysis on process intermediates from

various stages within the downstream purification process. This study increases our

understanding of protein clearance during purification, can be used to support process

development, and provides additional host cell protein clearance data compared to current

industry standards.

1.6. References

1. Shankar G, Pendley C, Stein KE: A risk-based bioanalytical strategy

for the assessment of antibody immune responses against biological drugs. Nature

biotechnology 2007, 25(5):555-561.

2. Global Biopharmaceutical Market Report (2010-2015)

[http://www.imarcgroup.com/index.php?option=com_content&view=article&id=57&Itemid=77]

3. Andersen DC, Krummen L: Recombinant protein expression for therapeutic

applications. Current opinion in biotechnology 2002, 13(2):117-

4. De Jesus M, Wurm FM: Manufacturing recombinant proteins in kg-ton quantities

using animal cells in bioreactors. European journal of pharmaceutics and

biopharmaceutics : official journal of Arbeitsgemeinschaft fur Pharmazeutische

Verfahrenstechnik eV 2011.

5. Hossler P, Khattak SF, Li ZJ: Optimal and consistent protein glycosylation in

mammalian cell culture. Glycobiology 2009, 19(9):936-949.

6. Jones AJ, Papac DI, Chin EH, Keck R, Baughman SA, Lin YS, Kneer J, Battersby JE:

Selective clearance of glycoforms of a complex glycoprotein pharmaceutical caused

by terminal N-acetylglucosamine is similar in humans and cynomolgus monkeys.

Glycobiology 2007, 17(5):529-540.

7. Urlaub G, Kas E, Carothers AM, Chasin LA: Deletion of the diploid dihydrofolate

reductase locus from cultured mammalian cells. Cell 1983, 33(2):405-412.

8. Pallavicini MG, DeTeresa PS, Rosette C, Gray JW, Wurm FM: Effects of methotrexate

on transfected DNA stability in mammalian cells. Molecular and cellular biology

1990, 10(1):401-404.

9. Wurm FM: Production of recombinant protein therapeutics in cultivated

mammalian cells. Nature biotechnology 2004, 22(11):1393-1398.

10. Roque AC, Lowe CR, Taipa MA: Antibodies and genetically engineered related

molecules: production and purification. Biotechnology progress 2004, 20(3):639-654.

11. Dinnis DM, James DC: Engineering mammalian cell factories for improved

recombinant monoclonal antibody production: lessons from nature? Biotechnology

and bioengineering 2005, 91(2):180-189.

12. al-Rubeai M, Singh RP: Apoptosis in cell culture. Current opinion in biotechnology

1998, 9(2):152-156.

13. Lipscomb ML, Palomares LA, Hernandez V, Ramirez OT, Kompala DS: Effect of

production method and gene amplification on the glycosylation pattern of a secreted

reporter protein in CHO cells. Biotechnology progress 2005, 21(1):40-49.

14. Chee Furng Wong D, Tin Kam Wong K, Tang Goh L, Kiat Heng C, Gek Sim Yap M:

Impact of dynamic online fed-batch strategies on metabolism, productivity and N-

glycosylation quality in CHO cell cultures. Biotechnology and bioengineering 2005,

89(2):164-177.

15. Li F, Vijayasankaran N, Shen AY, Kiss R, Amanullah A: Cell culture processes for

monoclonal antibody production. mAbs 2010, 2(5):466-479.

16. Lim Y, Wong NS, Lee YY, Ku SC, Wong DC, Yap MG: Engineering mammalian cells

in bioprocessing - current achievements and future perspectives. Biotechnology and

applied biochemistry 2010, 55(4):175-189.

17. Hassell T, Gleave S, Butler M: Growth inhibition in animal cell culture. The effect of

lactate and ammonia. Applied biochemistry and biotechnology 1991, 30(1):29-41.

18. Elias CB, Carpentier E, Durocher Y, Bisson L, Wagner R, Kamen A: Improving glucose

and glutamine metabolism of human HEK 293 and Trichoplusia ni insect cells

engineered to express a cytosolic pyruvate carboxylase enzyme. Biotechnol Prog

2003, 19(1):90-97.

19. Irani N, Beccaria AJ, Wagner R: Expression of recombinant cytoplasmic yeast

pyruvate carboxylase for the improvement of the production of human

erythropoietin by recombinant BHK-21 cells. Journal of biotechnology 2002,

93(3):269-282.

20. Fogolin MB, Wagner R, Etcheverrigaray M, Kratje R: Impact of temperature

reduction and expression of yeast pyruvate carboxylase on hGM-CSF-producing

CHO cells. Journal of biotechnology 2004, 109(1-2):179-191.

21. Bell SL, Bebbington C, Scott MF, Wardell JN, Spier RE, Bushell ME, Sanders PG:

Genetic engineering of hybridoma glutamine metabolism. Enzyme and microbial

technology 1995, 17(2):98-106.

22. Powers SL, Robinson AS: PDI improves secretion of redox-inactive beta-glucosidase.

Biotechnology progress 2007, 23(2):364-369.

23. Borth N, Mattanovich D, Kunert R, Katinger H: Effect of increased expression of

protein disulfide isomerase and heavy chain binding protein on antibody secretion

in a recombinant CHO cell line. Biotechnology progress 2005, 21(1):106-111.

24. Mohan C, Park SH, Chung JY, Lee GM: Effect of doxycycline-regulated protein

disulfide isomerase expression on the specific productivity of recombinant CHO

cells: thrombopoietin and antibody. Biotechnology and bioengineering 2007,

98(3):611-615.

25. Fussenegger M, Fassnacht D, Schwartz R, Zanghi JA, Graf M, Bailey JE, Portner R:

Regulated overexpression of the survival factor bcl-2 in CHO cells increases viable

cell density in batch culture and decreases DNA release in extended fixed-bed

cultivation. Cytotechnology 2000, 32(1):45-61.

26. Jung D, Cote S, Drouin M, Simard C, Lemieux R: Inducible expression of Bcl-XL

restricts apoptosis resistance to the antibody secretion phase in hybridoma cultures.

Biotechnology and bioengineering 2002, 79(2):180-187.

27. Mastrangelo AJ, Hardwick JM, Bex F, Betenbaugh MJ: Part I. Bcl-2 and Bcl-x(L) limit

apoptosis upon infection with alphavirus vectors. Biotechnology and bioengineering

2000, 67(5):544-554.

28. Mastrangelo AJ, Hardwick JM, Zou S, Betenbaugh MJ: Part II. Overexpression of bcl-

2 family members enhances survival of mammalian cells in response to various

culture insults. Biotechnology and bioengineering 2000, 67(5):555-564.

29. Chiang GG, Sisk WP: Bcl-x(L) mediates increased production of humanized

monoclonal antibodies in Chinese hamster ovary cells. Biotechnology and

bioengineering 2005, 91(7):779-792.

30. Kim YG, Kim JY, Mohan C, Lee GM: Effect of Bcl-xL overexpression on apoptosis

and autophagy in recombinant Chinese hamster ovary cells under nutrient-deprived

condition. Biotechnology and bioengineering 2009, 103(4):757-766.

31. Fukuta K, Abe R, Yokomatsu T, Kono N, Asanagi M, Omae F, Minowa MT, Takeuchi

M, Makino T: Remodeling of sugar chain structures of human interferon-gamma.

Glycobiology 2000, 10(4):421-430.

32. Fukuta K, Yokomatsu T, Abe R, Asanagi M, Makino T: Genetic engineering of CHO

cells producing human interferon-gamma by transfection of sialyltransferases.

Glycoconjugate journal 2000, 17(12):895-904.

33. Seth G, Hossler P, Yee JC, Hu WS: Engineering cells for cell culture bioprocessing--

physiological fundamentals. Advances in biochemical engineering/biotechnology 2006,

101:119-164.

34. Chalmers MJ, Gaskell SJ: Advances in mass spectrometry for proteome analysis.

Current opinion in biotechnology 2000, 11(4):384-390.

35. Humphery-Smith I, Cordwell SJ, Blackstock WP: Proteome research:

complementarity and limitations with respect to the RNA and DNA worlds.

Electrophoresis 1997, 18(8):1217-1242.

36. McLafferty FW, Turecek F: Interpretation of Mass Spectra, 4th edn: University Science

Books; 1993.

37. Whitehouse CM, Dreyer RN, Yamashita M, Fenn JB: Electrospray interface for liquid

chromatographs and mass spectrometers. Analytical chemistry 1985, 57(3):675-679.

38. Barnett DA, Ells B, Guevremont R, Purves RW: Application of ESI-FAIMS-MS to the

analysis of tryptic peptides. Journal of the American Society for Mass Spectrometry

2002, 13(11):1282-1291.

39. Wang HX, Jin BF, Wang J, He K, Yang SC, Shen BF, Zhang XM: [Nano-ESI-MS/MS

identification on apoptosis associated proteins induced by inhibiting ubiquitin-

proteasome pathway]. Sheng wu hua xue yu sheng wu wu li xue bao Acta biochimica et

biophysica Sinica 2002, 34(5):630-634.

40. Perkel JM: Mass Spectrometry Applications for Proteomics. The Scientist 2001,

15(16):31.

41. Nissom PM, Sanny A, Kok YJ, Hiang YT, Chuah SH, Shing TK, Lee YY, Wong KT, Hu

WS, Sim MY et al: Transcriptome and proteome profiling to understanding the

biology of high productivity CHO cells. Molecular biotechnology 2006, 34(2):125-140.

42. Zheng J, Li N, Ridyard M, Dai H, Robbins SM, Li L: Simple and robust two-layer

matrix/sample preparation method for MALDI MS/MS analysis of peptides. Journal

of proteome research 2005, 4(5):1709-1716.

43. Paul W: Electromagnetic Traps for Charged and Neutral Particles. Angewandte

Chemie International Edition in English 1990, 29(7):739-748.

44. Glish GL, Vachet RW: The basics of mass spectrometry in the twenty-first century.

Nature reviews Drug discovery 2003, 2(2):140-150.

45. Hopfgartner G, Varesio E, Tschappat V, Grivet C, Bourgogne E, Leuthold LA: Triple

quadrupole linear ion trap mass spectrometer for the analysis of small molecules

and macromolecules. Journal of mass spectrometry : JMS 2004, 39(8):845-855.

46. Douglas DJ, Frank AJ, Mao D: Linear ion traps in mass spectrometry. Mass

spectrometry reviews 2005, 24(1):1-29.

47. King RC, Gundersdorf R, Fernandez-Metzler CL: Collection of selected reaction

monitoring and full scan data on a time scale suitable for target compound

quantitative analysis by liquid chromatography/tandem mass spectrometry. Rapid

communications in mass spectrometry : RCM 2003, 17(21):2413-2422.

48. Xia YQ, Miller JD, Bakhtiar R, Franklin RB, Liu DQ: Use of a quadrupole linear ion

trap mass spectrometer in metabolite identification and bioanalysis. Rapid

communications in mass spectrometry : RCM 2003, 17(11):1137-1145.

49. Hopfgartner G, Husser C, Zell M: Rapid screening and characterization of drug

metabolites using a new quadrupole-linear ion trap mass spectrometer. Journal of

mass spectrometry : JMS 2003, 38(2):138-150.

50. Le Blanc JC, Hager JW, Ilisiu AM, Hunter C, Zhong F, Chu I: Unique scanning

capabilities of a new hybrid linear ion trap mass spectrometer (Q TRAP) used for

high sensitivity proteomics applications. Proteomics 2003, 3(6):859-869.

51. Grange AH, Winnik W, Ferguson PL, Sovocool GW: Using a triple-quadrupole mass

spectrometer in accurate mass mode and an ion correlation program to identify

compounds. Rapid communications in mass spectrometry : RCM 2005, 19(18):2699-

2715.

52. Wiley WC, McLaren IH: Time-of-Flight Mass Spectrometer with Improved

Resolution. Review of Scientific Instruments 1955, 26(12):1150-1157.

53. Mamyrin BA, Karatajev JV, Shmikk DV, Zagulin VA: The mass-reflectron, a new

nonmagnetic time-of-flight mass spectrometer with high resolution. Sov Phys JETP

1973, 37:45.

54. Kobayashi T: Orthogonal acceleration time-of-flight mass spectrometer. In. Edited by

USPTO, vol. 7230234. U.S.: JEOL Ltd.; 2001.

55. Makarov A: Electrostatic axially harmonic orbital trapping: a high-performance

technique of mass analysis. Analytical chemistry 2000, 72(6):1156-1162.

56. Makarov A, Denisov E, Lange O, Horning S: Dynamic range of mass accuracy in LTQ

Orbitrap hybrid mass spectrometer. Journal of the American Society for Mass

Spectrometry 2006, 17(7):977-982.

57. Michalski A, Damoc E, Hauschild JP, Lange O, Wieghaus A, Makarov A, Nagaraj N,

Cox J, Mann M, Horning S: Mass spectrometry-based proteomics using Q Exactive, a

high-performance benchtop quadrupole Orbitrap mass spectrometer. Molecular &

cellular proteomics : MCP 2011.

58. Zeng Z, Hincapie M, Pitteri SJ, Hanash S, Schalkwijk J, Hogan JM, Wang H, Hancock

WS: A Proteomics Platform Combining Depletion, Multi-lectin Affinity

Chromatography (M-LAC), and Isoelectric Focusing to Study the Breast Cancer

Proteome. Analytical chemistry 2011, 83(12):4845-4854.

59. de Souza GA, Godoy LM, Mann M: Identification of 491 proteins in the tear fluid

proteome reveals a large number of proteases and protease inhibitors. Genome

biology 2006, 7(8):R72.

60. Li X, Gerber SA, Rudner AD, Beausoleil SA, Haas W, Villen J, Elias JE, Gygi SP:

Large-scale phosphorylation analysis of alpha-factor-arrested Saccharomyces

cerevisiae. Journal of proteome research 2007, 6(3):1190-1197.

61. Shi R, Kumar C, Zougman A, Zhang Y, Podtelejnikov A, Cox J, Wisniewski JR, Mann

M: Analysis of the mouse liver proteome using advanced mass spectrometry. Journal

of proteome research 2007, 6(8):2963-2972.

62. Manes NP, Dong L, Zhou W, Du X, Reghu N, Kool AC, Choi D, Bailey CL, Petricoin

EF, 3rd, Liotta LA et al: Discovery of mouse spleen signaling responses to anthrax

using label-free quantitative phosphoproteomics via mass spectrometry. Molecular

& cellular proteomics : MCP 2011, 10(3):M110.000927.

63. Mann M, Kelleher NL: Precision proteomics: the case for high resolution and high

mass accuracy. Proceedings of the National Academy of Sciences of the United States of

America 2008, 105(47):18132-18138.

64. Yates JR, Ruse CI, Nakorchevsky A: Proteomics by mass spectrometry: approaches,

advances, and applications. Annual review of biomedical engineering 2009, 11:49-79.

65. Craig R, Beavis RC: TANDEM: matching proteins with tandem mass spectra.

Bioinformatics (Oxford, England) 2004, 20(9):1466-1467.

66. Schwartz JC, Syka JP: Improving the Fundamentals of MSn on 2D Ion Traps: New

Ion Activation and Isolation Techniques. In: 53rd ASMS Conference on Mass

Spectrometry: 2005; San Antonio, Texas; 2005.

67. Scheele GA: Two-dimensional gel analysis of soluble proteins. Charaterization of

guinea pig exocrine pancreatic proteins. The Journal of biological chemistry 1975,

250(14):5375-5385.

68. Klose J: Protein mapping by combined isoelectric focusing and electrophoresis of

mouse tissues. A novel approach to testing for induced point mutations in mammals.

Humangenetik 1975, 26(3):231-243.

69. Candiano G, Bruschi M, Musante L, Santucci L, Ghiggeri GM, Carnemolla B, Orecchia

P, Zardi L, Righetti PG: Blue silver: a very sensitive colloidal Coomassie G-250

staining for proteome analysis. Electrophoresis 2004, 25(9):1327-1333.

70. Herbert BR, Harry JL, Packer NH, Gooley AA, Pedersen SK, Williams KL: What place

for polyacrylamide in proteomics? Trends in biotechnology 2001, 19(10 Suppl):S3-9.

71. Smales CM, Dinnis DM, Stansfield SH, Alete D, Sage EA, Birch JR, Racher AJ,

Marshall CT, James DC: Comparative proteomic analysis of GS-NS0 murine

myeloma cell lines with varying recombinant monoclonal antibody production rate.


72. Unlu M, Morgan ME, Minden JS: Difference gel electrophoresis: a single gel method

for detecting changes in protein extracts. Electrophoresis 1997, 18(11):2071-2077.

73. Korke R, Gatti Mde L, Lau AL, Lim JW, Seow TK, Chung MC, Hu WS: Large scale

gene expression profiling of metabolic shift of mammalian cells in culture. Journal of

biotechnology 2004, 107(1):1-17.

74. Washburn MP, Wolters D, Yates JR, 3rd: Large-scale analysis of the yeast proteome

by multidimensional protein identification technology. Nature biotechnology 2001,

19(3):242-247.

75. Netterwald J: Got MudPIT? In: G & P Magazine. vol. 7; 2007: G4-G8.

76. Old WM, Meyer-Arendt K, Aveline-Wolf L, Pierce KG, Mendoza A, Sevinsky JR,

Resing KA, Ahn NG: Comparison of label-free methods for quantifying human

proteins by shotgun proteomics. Molecular & cellular proteomics : MCP 2005,

4(10):1487-1502.

77. Ross PL, Huang YN, Marchese JN, Williamson B, Parker K, Hattan S, Khainovski N,

Pillai S, Dey S, Daniels S et al: Multiplexed protein quantitation in Saccharomyces

cerevisiae using amine-reactive isobaric tagging reagents. Molecular & cellular

proteomics : MCP 2004, 3(12):1154-1169.

78. Chen X, Sun L, Yu Y, Xue Y, Yang P: Amino acid-coded tagging approaches in

quantitative proteomics. Expert review of proteomics 2007, 4(1):25-37.

79. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M:

Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and

accurate approach to expression proteomics. Molecular & cellular proteomics : MCP

2002, 1(5):376-386.

80. Yee JC, de Leon Gatti M, Philp RJ, Yap M, Hu WS: Genomic and proteomic

exploration of CHO and hybridoma cells under sodium butyrate treatment.


81. Baik JY, Lee MS, An SR, Yoon SK, Joo EJ, Kim YH, Park HW, Lee GM: Initial

transcriptome and proteome analyses of low culture temperature-induced

expression in CHO cells producing erythropoietin. Biotechnology and bioengineering

2006, 93(2):361-371.

82. Seow TK, Korke R, Liang RC, Ong SE, Ou K, Wong K, Hu WS, Chung MC: Proteomic

investigation of metabolic shift in mammalian cell culture. Biotechnology progress

2001, 17(6):1137-1144.

83. Doolan P, Meleady P, Barron N, Henry M, Gallagher R, Gammell P, Melville M,

Sinacore M, McCarthy K, Leonard M et al: Microarray and proteomics expression

profiling identifies several candidates, including the valosin-containing protein

(VCP), involved in regulating high cellular growth rate in production CHO cell

lines. Biotechnology and bioengineering 2010, 106(1):42-56.

84. Cherlet M, Marc A: Stimulation of monoclonal antibody production of hybridoma

cells by butyrate: evaluation of a feeding strategy and characterization of cell

behaviour. Cytotechnology 2000, 32(1):17-29.

CHAPTER 2

PROTEOMICS COMPARISON OF LOW- AND HIGH-PRODUCING CHO CELL

CULTURES

This Chapter has been published in Analytical Chemistry:

Carlage T, Hincapie M, Zang L, Lyubarskaya Y, Madden H, Mhatre R, Hancock WS:

Proteomic profiling of a high-producing Chinese hamster ovary cell culture. Analytical

chemistry 2009, 81(17):7357-7362.

2.1 Overview

The aims of this study were to develop a proteomics method suitable for identification

and quantitation of differentially expressed proteins between different CHO cultures, and to

apply this method to compare the protein expression of a high-producing CHO cell culture to a

low-producing control over multiple time-points. This work served as a proof of concept to

show that LC/MS-based proteomics tools could successfully be used to identify relevant

differentially expressed proteins from CHO cell culture samples.

The method development consisted of three sections. First, a protein extraction method

was developed which utilized mass-spec compatible detergents and sonication to reproducibly

extract proteins from CHO cells. Secondly, a shotgun proteomics approach was developed for

the analysis of CHO cell lysates. While several previously reported proteomic studies of CHO

have relied on two-dimensional gel electrophoresis methods [1-3], shotgun proteomics offers the

advantages of higher throughput and the ability to identify a large number of proteins

in a single LC/MS experiment [4]. The final part of the method development consisted of

evaluating a label-free quantitation strategy for the identification of differentially expressed

proteins. The method development data indicate that this method is suitable for identification of

differentially expressed proteins in CHO cell cultures.

This method was applied to the analysis of a low- and high-producing CHO cell culture.

Both cultures used the same CHO cell line expressing a recombinant fusion protein. However,

there were several differences between the two cell cultures which resulted in a marked

difference in overall yield. The high-producing cell culture used an optimized media profile to

enhance cell growth, and was also transfected with the anti-apoptotic gene Bcl-XL. This gene,

along with other Bcl family members such as Bcl-2, inhibits apoptosis by binding to pro-

apoptosis proteins in the mitochondrial membrane or cytoplasm, thereby disrupting the caspase

activation necessary for apoptosis to occur [5, 6]. The application of these inhibitors in CHO and

BHK cell cultures has been studied. In some cases, over-expression of Bcl-2 and Bcl-XL

prolonged cell viability; however, these results varied between different cell lines and conditions

[7]. In another study, transfection of CHO with Bcl-XL was shown to increase overall

productivity by 80% through increased cell growth and specific productivity [8]. The benefits of

expressing apoptosis inhibitors in mammalian cell culture are not well understood; however, the

targeting of such proteins continues to be employed as one strategy for increasing cell growth

and productivity. For this reason, there was value in the analysis of proteomic changes

associated with the upregulation of this growth-regulating gene.

2.2 Methods

2.2.1 CHO Cell Lines

Both cell cultures studied were derived from the same Chinese hamster ovary DG44 host

cell line by stable transfection of a plasmid encoding genes for DHFR and a humanized

recombinant fusion protein. Fusion protein production was further amplified by cell line

selection in increasing concentrations of methotrexate. In addition, the high-producing cell line

was transfected with a plasmid encoding Bcl-XL and a G418 resistance marker. Stable clones were selected in the

presence of neomycin and methotrexate.

2.2.2 Cell Culture Conditions

The fed-batch control culture was grown in a 2-L sparged B. Braun bioreactor (Sartorius,

Goettingen, Germany) for 13 days using a proprietary custom in-house serum-free medium

supplemented with protein hydrolysate. The high-producing culture was grown in a 200-L

custom-made stainless steel stirred tank bioreactor for 16 days using a modified version of the

media used for the control. The nutrient profile and complex media components are different

between the cell cultures, with the high producing culture having media optimized for higher cell

growth. pH was controlled at 7.15 using sodium carbonate for both cultures. Cell number and

viability were measured by trypan blue staining and using a Cedex automated cell counter

(Innovatis, Bielefeld, Germany). A volume equivalent to 3E7 cells was sampled from the

bioreactor at varying timepoints (day 0, 5, 10, 13 for the control culture, and 1, 5, 10, 16 for the

high producing culture). Samples were centrifuged at 500 g for 10 minutes, and the supernatants

were removed. Pellets were reconstituted in 5 mL PBS and centrifuged at 500 g for 10 minutes.

Supernatants were removed, and pellets stored at -70°C until further analysis.
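
As an illustration of the sampling step, the volume withdrawn for a fixed number of cells follows directly from the measured cell density. The short Python sketch below is not part of the original protocol: the function name is hypothetical, the use of viable (rather than total) cell density is an assumption, and the two densities are the peak values reported in Section 2.3.1, used here only as example inputs.

```python
# Illustrative sketch: sample volume needed to collect a fixed number of cells.
# Assumption: the cell count is based on viable cell density (cells/mL).

TARGET_CELLS = 3e7  # cells per sample, as stated in the text


def sample_volume_ml(target_cells, cells_per_ml):
    """Return the culture volume (mL) containing `target_cells` cells."""
    if cells_per_ml <= 0:
        raise ValueError("cell density must be positive")
    return target_cells / cells_per_ml


# Example inputs: peak densities reported for the two cultures.
control_volume = sample_volume_ml(TARGET_CELLS, 5.8e6)         # about 5.2 mL
high_producer_volume = sample_volume_ml(TARGET_CELLS, 15.4e6)  # about 1.9 mL
```

At lower densities, such as early in the culture, the required volume grows proportionally.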

2.2.3 Cell Lysis

Cell pellets were thawed at room temperature and reconstituted in a lysis buffer

consisting of 50 mM Tris, pH 7.5 and 0.1% Rapigest (Waters, Milford, MA). Samples were then

sonicated in a water bath for 3 cycles of 15 seconds each. Following sonication, samples were

centrifuged at 5,000 g for 10 minutes. Supernatants were transferred to clean tubes. The total

protein concentration of each cell lysate was measured by BCA (Pierce, Rockford, IL) according

to the manufacturer’s instructions. Samples were stored at -70°C prior to tryptic digestion.

2.2.4 Trypsin Digestion

Lysates were denatured and reduced by adding 20 µL of lysate (approx. 100 µg total

protein) to 45 µL of 8 M Guanidine, 50 mM Tris pH 7.5 and 1 µL of 500 mM DTT, for a final

concentration of 4 M Guanidine and 8 mM DTT. Samples were incubated at 60°C for 15

minutes. Five microliters of 300 mM iodoacetic acid was added to each sample, for a final

concentration of approx. 20 mM, and samples were incubated at room temperature in the dark

for 1 hour. Five microliters of 500 mM DTT was added to each sample to quench the remaining

iodoacetic acid. Each sample was applied to a Microspin SEC spin-column (BioRad, Hercules,

CA) that was pre-equilibrated in 50 mM ammonium bicarbonate, pH 8.0 for cleanup. After

centrifugation at 1,500 g for 4 minutes, the desalted samples were brought to a final volume of

200 µL in 50 mM ammonium bicarbonate, and 10 µL of trypsin was added. Samples were

incubated at 37°C for 18 hours. Five microliters of 1% TFA was added to each sample after

digestion to stop the reaction.
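
The final reagent concentrations quoted above can be checked with the dilution relation C1V1 = C2V2. The following Python sketch is an illustration only, assuming that volumes are simply additive and that no other liquid is present at each step; the variable and function names are hypothetical. With the volumes as stated, DTT and iodoacetic acid come out close to the quoted 8 mM and 20 mM, while guanidine comes out nearer 5.5 M than the nominal 4 M.

```python
# Illustrative sketch: final concentrations from C1*V1 = C2*V2.
# Assumption: volumes are additive and nothing else is added at each step.


def final_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after diluting `stock_vol_ul` of stock into `total_vol_ul`."""
    return stock_conc * stock_vol_ul / total_vol_ul


lysate_ul, guanidine_ul, dtt_ul, iaa_ul = 20.0, 45.0, 1.0, 5.0

reduction_vol = lysate_ul + guanidine_ul + dtt_ul  # 66 uL during reduction
alkylation_vol = reduction_vol + iaa_ul            # 71 uL during alkylation

guanidine_m = final_conc(8.0, guanidine_ul, reduction_vol)  # mol/L
dtt_mm = final_conc(500.0, dtt_ul, reduction_vol)           # mmol/L
iaa_mm = final_conc(300.0, iaa_ul, alkylation_vol)          # mmol/L

print(f"guanidine {guanidine_m:.2f} M, DTT {dtt_mm:.1f} mM, IAA {iaa_mm:.1f} mM")
```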

2.2.5 LC/MS Analysis

All samples were analyzed in triplicate using a Dionex Ultimate 3000 HPLC interfaced

with an LTQ linear ion trap mass spectrometer. The composition of solvent A was 0.1% (v/v)

formic acid in water, and solvent B was 0.1% (v/v) formic acid in acetonitrile. A volume of 1

µL of peptide digest for each sample was injected onto a CapTrap column (approx. 2 µg on-

column) (Michrom Bioresources, Auburn, CA) using the Dionex autosampler. The trap column

was washed with 100% A at a flow rate of 20 µL/min for 10 minutes to desalt the sample. The

captured peptides were eluted from the CapTrap onto a 0.075 x 150 mm C18AQ column

(Michrom Bioresources, Auburn, CA) using an acetonitrile gradient at 300 nL/min. The gradient

was from 2% B to 35% B over 110 minutes, increased to 90% B in 20 minutes, held at 90% B for

35 minutes, and back to 2% B in 5 minutes. The column was re-equilibrated for 60 minutes

before the next injection. The Dionex HPLC was controlled using Chromeleon v.6.80 and the

LTQ was controlled using Xcalibur 2.0.6 software (Thermo Fisher Scientific, Waltham, MA).

The electrospray conditions were as follows: spray voltage 1.70 kV, capillary voltage 48 V, tube

lens 70 V, capillary temperature 225 °C. Each MS scan was acquired in centroid mode from

300-2000 m/z, followed by 9 MS/MS scans of the 9 most intense peaks in data dependent mode.

Dynamic exclusion was enabled for a duration of 30 seconds and a repeat count of 1.

A normalized collision energy of 35%, an isolation width of 2.0 m/z, and an activation Q of 0.25 were

used for each MS/MS scan.
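
The gradient program described above can be written as a table of time and %B breakpoints. The Python sketch below is offered as an illustration, assuming linear ramps between breakpoints and taking time zero as the start of the analytical gradient, after the 10-minute trap wash; the names GRADIENT and percent_b are hypothetical and do not correspond to any instrument software setting.

```python
# Illustrative sketch: the LC gradient as (time in minutes, %B) breakpoints.
# Assumptions: linear ramps; time zero is the start of the gradient.
from bisect import bisect_right

GRADIENT = [
    (0, 2.0),     # start at 2% B
    (110, 35.0),  # 2% to 35% B over 110 minutes
    (130, 90.0),  # increase to 90% B in 20 minutes
    (165, 90.0),  # hold at 90% B for 35 minutes
    (170, 2.0),   # back to 2% B in 5 minutes
    (230, 2.0),   # re-equilibrate for 60 minutes
]


def percent_b(t_min):
    """Return %B at time `t_min` by linear interpolation between breakpoints."""
    times = [t for t, _ in GRADIENT]
    if t_min < times[0] or t_min > times[-1]:
        raise ValueError("time outside the gradient program")
    i = bisect_right(times, t_min) - 1
    if i == len(GRADIENT) - 1:
        return GRADIENT[-1][1]
    (t0, b0), (t1, b1) = GRADIENT[i], GRADIENT[i + 1]
    return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


total_cycle_min = GRADIENT[-1][0]  # 230 minutes per injection, excluding the trap wash
```

Under these assumptions, each injection occupies 230 minutes of instrument time after loading, which is relevant when planning triplicate analyses of multiple time points.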

2.2.6 Protein Identification

Peptide sequences and proteins were identified by searching all MS2 spectra against

theoretical fragmentation spectra of a mouse protein database (Swiss-Prot, updated in September,

2007, 12,902 entries). For this, the Sequest algorithm incorporated into the Bioworks software,

version 3.1, SR1.4 (Thermo Electron, San Jose, CA) was used. Search parameters included

carbamidomethylation of cysteines, ±1.4 Daltons and ±1.0 Dalton tolerance for precursor and

product ion masses, respectively. Only peptides resulting from tryptic cleavages with up to one

missed cleavage were searched. The Sequest results were filtered by correlation score (Xcorr)

values selected to obtain highly confident peptide and protein identifications: Xcorr ≥ 1.9, 2.2, and 3.75

for singly, doubly, and triply charged peptide ions, respectively, all with dCn ≥ 0.1. Protein

identifications were then validated by using ProteinProphet software, accepting identifications

made with 95% confidence. Only proteins identified with two or more unique peptides were

considered.
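
The score filtering described above can be illustrated with a minimal sketch. It assumes the thresholds are minimum values (Xcorr and dCn at or above the quoted numbers) and that each peptide-spectrum match is available as a small record; the field names and the example records are hypothetical, and the ProteinProphet validation step is not modeled.

```python
from collections import defaultdict

# Minimum Xcorr per precursor charge state and minimum dCn, as quoted in the text.
# Treating them as lower bounds is an assumption of this sketch.
XCORR_MIN = {1: 1.9, 2: 2.2, 3: 3.75}
DCN_MIN = 0.1


def passes_filter(psm):
    """Return True if a peptide-spectrum match meets the score thresholds."""
    threshold = XCORR_MIN.get(psm["charge"])
    if threshold is None:  # charge states outside 1-3 are rejected in this sketch
        return False
    return psm["xcorr"] >= threshold and psm["dcn"] >= DCN_MIN


def confident_proteins(psms, min_unique_peptides=2):
    """Map each protein to its unique passing peptides, keeping proteins with enough of them."""
    peptides = defaultdict(set)
    for psm in psms:
        if passes_filter(psm):
            peptides[psm["protein"]].add(psm["peptide"])
    return {prot: peps for prot, peps in peptides.items() if len(peps) >= min_unique_peptides}


# Hypothetical example records (not data from this study).
example_psms = [
    {"protein": "P1", "peptide": "AAK", "charge": 2, "xcorr": 2.5, "dcn": 0.20},
    {"protein": "P1", "peptide": "CCR", "charge": 3, "xcorr": 3.9, "dcn": 0.15},
    {"protein": "P2", "peptide": "DDK", "charge": 2, "xcorr": 2.0, "dcn": 0.30},
    {"protein": "P3", "peptide": "EEK", "charge": 1, "xcorr": 2.1, "dcn": 0.05},
]
print(sorted(confident_proteins(example_psms)))
```

Only the hypothetical protein P1 survives here, because it is the only one with two distinct peptides that pass both score cut-offs.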

2.2.7 Assessment of Relative Abundance of Peptides and Proteins

The spectral counting method was used for estimation of relative peptide and protein

abundance. This method uses the number of scans generated by the mass spectrometer for every

peptide identified from a specific protein as a semi-quantitative measurement of protein

abundance, and has been shown previously to be useful for comparing abundance between

different samples in LC/MS experiments [9].
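
As a rough illustration of spectral counting, the sketch below tallies identified MS/MS scans per protein. The input layout (one scan number and protein name per identified scan) and the example values are assumptions made for illustration only.

```python
from collections import Counter


def spectral_counts(identified_scans):
    """Count the MS/MS scans assigned to each protein.

    identified_scans: iterable of (scan_number, protein) pairs,
    one pair per identified MS/MS scan.
    """
    return Counter(protein for _scan, protein in identified_scans)


# Hypothetical scan assignments (not data from this study).
scans = [(101, "BiP"), (102, "BiP"), (103, "Vimentin"), (250, "BiP"), (251, "Galectin-1")]
counts = spectral_counts(scans)
print(counts["BiP"], counts["Vimentin"], counts["Galectin-1"])
```

A protein whose peptides trigger more MS/MS events accumulates a higher count, which is the semi-quantitative signal compared between samples.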

2.3 Results

2.3.1 Cell Growth and Specific Productivity

The high-producer and control CHO cells are both DG44 clones, with the high-producer

being transfected with the Bcl-XL gene to inhibit apoptosis and enhance cellular productivity

(14). The media profile for the high-producer was also optimized in order to improve

metabolite availability and limit bottlenecks to cell growth. The viable cell density (VCD) of

both cell cultures was monitored on a daily basis using trypan blue staining and cell counting.

The control culture was harvested after 13 days, and reached a maximum density of 5.8E6

cells/mL on day 10 (see Figure 2.1). The cells maintained densities >5E6 cells/mL until day 13,

when the density decreased to 3.6E6 cells/mL. This change also corresponds to a small decrease in cell

viability (data not shown). The high producer was harvested after 16 days, and it reached a

maximum density of 15.4E6 cells/mL on day 7 and maintained a similar cell density until day

16. The recombinant fusion protein titer was determined by measuring fusion protein

concentration in the secreted media at different days by affinity chromatography. Titer was

slightly higher in the high-producer compared to the control (data not shown).

Figure 2.1: Cellular Productivity Profiles

Viable cell density was measured by trypan blue staining using a CEDEX cell counter on various

days for both control and high-producer cell cultures.

[Figure 2.1 plot: viable cell density (×10^6 cells/mL) versus time (d) for the high-producer and control cultures]

2.3.2 Extraction of Proteins from CHO Cells

A robust and reproducible cell lysis method is imperative for proteomics. Many cell lysis

techniques use strong detergents to solubilize proteins. While having a high efficiency, these

methods are typically not compatible with mass spectrometric analysis. Hence, for solubilization

of proteins from CHO cells, a mass-spectrometry compatible detergent, Rapigest, was used

[10, 11]. Cell pellets were reconstituted using a Tris buffer containing 0.1% Rapigest and subjected to

3 cycles of sonication to disrupt cell membranes. We evaluated the efficiency of this extraction

method and compared it to a commercially available Mammalian Protein Extraction Reagent

(M-PER®) kit (Pierce). Five replicates of the same CHO cell culture sample were analyzed using

either method, and protein concentration was measured by the BCA method. The results are

presented in Table 2.1 and indicate that the two methods yield a similar amount of total protein

from CHO cells, while the sonication method was more reproducible between the 5 replicates.

Based on these results, the sonication method using Rapigest for protein solubilization was used

for proteomics analysis of the CHO lysates.
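
The reproducibility comparison can be checked with a short calculation of relative standard deviation (RSD). The sketch below uses the replicate concentrations printed in Table 2.1 and assumes the sample standard deviation (n - 1 denominator) was used; because the tabulated concentrations are rounded to one decimal place, the recomputed percentages may differ slightly from the reported ones.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation: sample standard deviation divided by the mean, in percent."""
    return 100.0 * stdev(values) / mean(values)


# Concentrations (mg/mL) as printed in Table 2.1, already rounded to one decimal place.
sonication = [4.3, 4.2, 3.9, 4.1, 4.0]
m_per = [4.5, 4.2, 3.9, 3.8, 3.6]

print(f"Sonication RSD: {rsd_percent(sonication):.1f}%")
print(f"M-PER RSD: {rsd_percent(m_per):.1f}%")
```

The lower RSD for the sonication replicates is what supports the statement that this method was the more reproducible of the two.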

Table 2.1: Comparison of Cell Lysis Techniques

Cell pellets were lysed using either Pierce Mammalian Protein Extraction Reagent® or a sonication

method using Rapigest®. The total protein concentration was measured for each lysate using the

Pierce BCA kit. RSD was calculated by dividing standard deviation by average concentration.

                      Concentration of Cell Lysate (mg/mL)
Replicate             Sonication    M-PER
1                     4.3           4.5
2                     4.2           4.2
3                     3.9           3.9
4                     4.1           3.8
5                     4.0           3.6
Average               4.1           4.0
Standard Deviation    0.2           0.4
RSD                   3.9%          9.2%

2.3.3 Classification of Identified CHO Proteins

Cell lysates were treated with trypsin to digest the proteins into peptide fragments. The

resulting fragments were analyzed by LC/MS/MS, and sequence information was generated by

searching against a Swiss-Prot mouse database of protein sequences for identification. In this

study, 392 proteins were identified with conservative criteria for protein assignment as well as

the measurement of at least 2 unique peptides per protein (see the methods section).

To determine the cellular origin of the identified proteins, the DAVID bioinformatics tool was

used to categorize proteins based on cellular compartment from Gene Ontology

(http://david.abcc.ncifcrf.gov/). As shown in Figure 2.2, a similar number of total proteins were

identified in both cell cultures. More nuclear and cytoskeletal proteins were identified in the

control cell culture, while more cytosolic and ribosomal proteins were identified in the high

producer. These results correlate with the differential expression patterns identified by our

proteomic measurements, which are discussed in Section 2.3.5.

Figure 2.2: Proteins Identified in CHO Samples

Identified proteins were categorized according to cellular compartment using DAVID

(http://david.abcc.ncifcrf.gov/). All proteins were identified by at least 2 unique peptides.

[Figure 2.2 chart: number of proteins per cellular compartment; data table below]

Compartment              Control    High Producer
Total                    352        339
Nucleus                  105        97
Cytosol                  67         75
Ribosome                 44         51
Cytoskeleton             47         37
Mitochondrion            34         31
Endoplasmic Reticulum    24         20

2.3.4 Identification of Proteomic Changes

A label-free strategy was used to quantitate differences in protein expression between

CHO cell culture samples. This spectral counting approach uses the number of MS/MS scans

assigned to peptides from a particular protein as an estimate of protein abundance [9]. Peptides

at higher abundance will trigger MS/MS events more often than lower abundance peptides.

To examine whether this method is suitable for complex CHO lysates, BSA was

spiked into a CHO lysate at various concentrations. The spiked lysate was analyzed by LC/MS

and recovery of BSA was measured by spectral counting. Figure 2.3 shows the correlation plot

between BSA spectral counts and actual protein concentration. The correlation was assessed by

linear regression, and gave an R2 value of 0.9990. These data indicate that spectral counting is

suitable for quantitating differences in protein expression between different CHO samples.
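
The linearity assessment can be sketched as an ordinary least-squares fit with its coefficient of determination. The spike amounts and spectral counts below are hypothetical values chosen for illustration; they are not the measurements behind Figure 2.3.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


# Hypothetical spike amounts and spectral counts, for illustration only.
amount_bsa = [25, 50, 100, 200]
bsa_counts = [20, 38, 77, 150]
slope, intercept, r2 = linear_fit(amount_bsa, bsa_counts)
print(f"slope={slope:.3f}, intercept={intercept:.2f}, R^2={r2:.4f}")
```

An R2 close to 1 indicates that spectral counts scale approximately linearly with the amount of spiked protein over the tested range.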

Figure 2.3: Analysis of BSA-Spiked CHO Lysates

A linear plot of spectral counts for BSA against amount of BSA spiked into a CHO cell lysate as

measured by shotgun proteomics.

[Figure 2.3 plot: total peptides (spectral counts) versus amount of BSA (µg); linear fit with R2 = 0.9990]

2.3.5 Differential Expression between Control and High-Producer

Differentially expressed proteins were identified based on the ratio of spectral counts

between control and high-producer for each identified protein. The ratio of spectral counts was

used to calculate fold changes between control and high-producer at day 5, day 10, and the

endpoints (day 13 for control, day 16 for high-producer). In addition, we calculated the relative

standard deviation of the spectral count values measured with three replicates for both the control

and high producer cell lines. Proteins that showed a fold change of greater than 2.0 or less than

-2.0, and that had a relative standard deviation of less than or equal to 0.5 were identified as

differentially expressed. A list of selected proteins with the greatest level of differential

expression is shown in Table 2.2.
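
The selection criteria above can be written as a small sketch. It assumes a signed fold-change convention in which decreases are reported as the negative reciprocal of the ratio (consistent with the negative values in Table 2.2), treats exactly two-fold changes as passing, and requires nonzero mean counts; the function names and replicate counts are hypothetical.

```python
from statistics import mean, stdev


def signed_fold_change(control_counts, producer_counts):
    """Signed fold change of mean spectral counts (high-producer relative to control).

    Assumed convention: ratios below 1 are reported as the negative reciprocal,
    so a two-fold decrease is -2.0.
    """
    ratio = mean(producer_counts) / mean(control_counts)
    return ratio if ratio >= 1.0 else -1.0 / ratio


def rsd(counts):
    """Relative standard deviation of replicate counts, as a fraction."""
    return stdev(counts) / mean(counts)


def is_differential(control_counts, producer_counts, min_fold=2.0, max_rsd=0.5):
    """Apply the fold-change and replicate-variability criteria described in the text."""
    fold = signed_fold_change(control_counts, producer_counts)
    reproducible = rsd(control_counts) <= max_rsd and rsd(producer_counts) <= max_rsd
    return abs(fold) >= min_fold and reproducible


# Hypothetical triplicate spectral counts for one protein at one timepoint.
print(signed_fold_change([40, 42, 38], [20, 19, 21]))
print(is_differential([10, 12, 11], [30, 33, 27]))
```

Requiring a low RSD in both cell lines guards against calling a protein differentially expressed when the apparent change is driven by a single noisy replicate.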

A total of 32 differentially expressed proteins were identified. The major functionalities

of these proteins include protein metabolism, cytoskeletal structure, and cell cycle control. Both

BiP and the recombinant fusion protein were the most consistently upregulated proteins in the high-

producer, with fold changes of 2.0 or greater at all three timepoints. Other proteins such as 40S

ribosomal proteins, eukaryotic translation initiation factor 3, and alanyl-tRNA synthetase were

upregulated to a lesser degree across all 3 timepoints. Several proteins were consistently

downregulated at all 3 timepoints, such as histone H1.2, vimentin, and galectin-1. Other proteins

such as RACK1, alpha enolase, and calcyclin showed upregulation at certain timepoints and

downregulation at others.

Table 2.2: Differentially Expressed Proteins

Differentially expressed proteins were identified by calculating ratio of spectral counts for each

protein at each timepoint. Proteins showing a two-fold change up or down, and which had a

relative standard deviation less than or equal to 0.5, were considered as differentially expressed.

                                                                Fold Change
Protein                                                          Day 5  Day 10     End
Protein Metabolism
  Alanyl tRNA synthetase                                           5.0     2.0     1.3
  T-complex protein 1 subunit delta                                1.9     2.9     1.0
  T-complex protein 1 subunit eta                                  2.1     1.5    -1.1
  Eukaryotic translation initiation factor 3 subunit 5 epsilon     2.4     1.4     2.7
  BiP                                                              2.0     2.6     2.8
  60S ribosomal protein L30                                        1.5     4.8     2.4
  40S ribosomal protein S6                                         2.1     1.6     1.5
  40S ribosomal protein S7                                         1.2     1.2     2.1
Transcription
  Histone H1.2                                                    -2.3    -3.0    -3.2
  Histone H2A type 1-F                                            -1.8    -4.0    -4.6
  Nucleosome assembly protein 1-like 1                             1.2     1.3     2.4
  Heterogeneous nuclear ribonucleoprotein A2/B1                   -1.1    -1.2    -2.1
Cytoskeleton
  Annexin-A2                                                      -1.4    -1.3    -2.2
  Adenylyl cyclase-associated protein 1                            2.1     1.6     1.0
  Filamin-A                                                       -1.1    -1.1    -2.1
  Myosin-9                                                        -1.3    -2.3    -2.2
  Myosin regulatory light chain 2-B                               -1.2    -2.4    -3.4
  Vimentin                                                        -2.4    -6.2    -1.1
Cell Cycle Regulation
  Receptor for activated C kinase                                  2.2     1.3    -1.4
  Calcyclin                                                        2.1     1.2    -2.3
  GTP-binding nuclear protein Ran                                  2.0     1.6    -1.4
Cell Growth
  Galectin-1                                                      -1.3    -2.4    -2.8
Glycolysis
  Alpha-enolase                                                    2.3     1.9    -1.6
  Glyceraldehyde-3-phosphate dehydrogenase                         2.9     1.2    -2.7
Miscellaneous
  Chloride intracellular channel protein 1                        -1.2    -1.1    -2.0
  Dihydrofolate reductase                                          3.1     1.6     2.0
  Recombinant fusion protein                                       3.9     2.6     2.3
  Osteoclast-stimulating factor 1                                 -1.2    -2.3    -1.3
  Phosphoserine aminotransferase                                   2.5     1.5     1.3
  Proteasome activator complex subunit 1                          -2.1    -5.5    -1.2
  Prostaglandin E synthase 3                                      -1.1    -2.0    -2.1
  Thioredoxin                                                      2.5    -1.1    -2.4

Figure 2.4: Proteins Upregulated in the High-Producer

Relative abundance of proteins was determined by spectral counts for (A) recombinant fusion

protein, (B) RACK1, (C) BiP, and (D) alpha-enolase. Error bars correspond to one standard

deviation (n=3).

[Figure 2.4 plots: spectral counts versus time (d) for the high-producer and control cultures; panels: recombinant fusion protein, RACK1, BiP, and alpha-enolase]

Figure 2.5: Proteins Downregulated in the High-Producer

Relative abundance of proteins was determined by spectral counts for (A) annexin-A2, (B)

histone H2A, (C) galectin-1, and (D) vimentin. Error bars correspond to one standard deviation

(n=3).

[Figure 2.5 plots: spectral counts versus time (d) for the high-producer and control cultures; panels: annexin-A2, histone H2A, galectin-1, and vimentin]

The upregulation of proteins such as alanyl-tRNA synthetase, eIF3, and 40S ribosomal proteins

indicates that protein metabolism is increased in the high producer. These proteins play crucial

roles in the translation of proteins. The recombinant fusion protein was detected at approximately

2- to 4-fold higher levels in the high-producer compared to the control (Table 2.2), indicating that

the intracellular concentration of this protein reaches much higher levels. These results support the

increased productivity observed in the high-producing cell culture. Previous studies of Bcl-XL

transfected CHO cells have also shown an increase in specific productivity over non-transfected cells

[8], and the proteomic measurements in our study suggest that the

increased level of product is related to a greater level of protein biosynthesis under the

fermentation conditions used in this study.

The molecular chaperone BiP was significantly upregulated in the high-producer. This

is relevant since BiP is a key chaperone involved in protein folding in the endoplasmic reticulum

(ER). The upregulation of BiP may indicate ER stress in the high producer due to high

intracellular concentrations of unfolded proteins, which can lead to an unfolded protein response

(UPR) in the cell. When unfolded proteins accumulate to a certain level in the ER, chaperones such as

BiP are upregulated to clear the proteins from the ER for degradation [12]. The differential

expression of BiP may indicate that a UPR event is occurring in the high-producer. The expression

profiles for BiP and the recombinant fusion protein are similar, which suggests that higher

intracellular concentrations of the product over time result in a proportional response by the cell

to express BiP.

Galectins are a class of carbohydrate-binding proteins that modulate various activities

within cells, such as differentiation, cell growth, apoptosis, and tumor progression [13].

Galectin-1 has specifically been shown to inhibit cell growth and activate apoptosis in T cells,

and the expression of galectin-1 may be regulated by Bcl-XL [14]. Galectin-1 was

downregulated in the high-producer, indicating that it may be responsible for inhibiting cell

growth in the control and thus it is a candidate as a biomarker for successful cell engineering and

product yield.

Several proteins that are involved in cell cycle regulation were differentially expressed in

this study, including GTPase Ran, which has a role in the regulation of mitosis [15], and

RACK1, a kinase receptor. RACK1 has been shown to inhibit the tyrosine kinase Src, which can

lead to G0/1 cell cycle arrest [16]. Both of these proteins showed a similar expression profile

over time. They were upregulated in the high producer at day 5 and day 10, and were

downregulated at the endpoints. The differential expression of these proteins may indicate that

the cell cycle is controlled differently in the high-producer compared to the control. We also

observed the differential expression of several cytoskeletal proteins, such as vimentin, annexin,

myosin, and filamin. These cytoskeletal proteins have several functions, including

maintaining cell shape, intracellular transport, and formation of mitotic spindles during cell

division. The differential expression of these proteins may also be related to control of the cell

cycle, and related to cellular productivity.

The downregulation of histones in the high-producer could also be evidence of

differential cell cycle control. Several histones, which are responsible for condensation of DNA

into chromatin structures, were downregulated in the high-producer. Lower expression of

histones results in greater accessibility of DNA for transcription. Similar results were observed

by Nissom et al., who observed downregulation of histone H1.2 in a high-producing CHO culture

[17].

2.4 Conclusion

The application of shotgun proteomics to CHO cell lysates has been shown to be a useful

tool for studying protein expression and identifying changes associated with cellular

productivity. In this study, we have successfully identified differentially expressed proteins

while comparing a low and high producing CHO cell culture. Several of these proteins are

related to cell growth and productivity, including the molecular chaperone BiP and the growth-

regulating protein galectin-1. One limitation of this work is the small number of proteins

identified. The method has a limited dynamic range, in part because of the simple one-

dimensional separation used which does not sufficiently separate the complex mixture of

peptides present in the trypsin-digested cell lysate. Additional separation modes would help to

increase the number of proteins identified and improve the results of proteomics studies of CHO

cell lysates.

2.5 References

1. Yee JC, de Leon Gatti M, Philp RJ, Yap M, Hu WS: Genomic and proteomic

exploration of CHO and hybridoma cells under sodium butyrate treatment.


2. Seow TK, Korke R, Liang RC, Ong SE, Ou K, Wong K, Hu WS, Chung MC: Proteomic

investigation of metabolic shift in mammalian cell culture. Biotechnology progress

2001, 17(6):1137-1144.

3. Baik JY, Lee MS, An SR, Yoon SK, Joo EJ, Kim YH, Park HW, Lee GM: Initial

transcriptome and proteome analyses of low culture temperature-induced

expression in CHO cells producing erythropoietin. Biotechnology and bioengineering

2006, 93(2):361-371.

4. Hancock WS, Wu SL, Shieh P: The challenges of developing a sound proteomics

strategy. Proteomics 2002, 2(4):352-359.

5. Hengartner MO: The biochemistry of apoptosis. Nature 2000, 407(6805):770-776.

6. Strasser A, O'Connor L, Dixit VM: Apoptosis signaling. Annu Rev Biochem 2000,

69:217-245.



culture insults. Biotechnol Bioeng 2000, 67(5):555-564.


monoclonal antibodies in Chinese hamster ovary cells. Biotechnology and

bioengineering 2005, 91(7):779-792.

9. Liu H, Sadygov RG, Yates JR, 3rd: A model for random sampling and estimation of

relative protein abundance in shotgun proteomics. Anal Chem 2004, 76(14):4193-

4201.

10. Arnold RJ, Hrncirova P, Annaiah K, Novotny MV: Fast proteolytic digestion coupled

with organelle enrichment for proteomic analysis of rat liver. J Proteome Res 2004,

3(3):653-657.

11. Yu YQ, Gilar M, Lee PJ, Bouvier ES, Gebler JC: Enzyme-friendly, mass

spectrometry-compatible surfactant for in-solution enzymatic digestion of proteins.

Anal Chem 2003, 75(21):6023-6028.

12. Patil C, Walter P: Intracellular signaling from the endoplasmic reticulum to the

nucleus: the unfolded protein response in yeast and mammals. Curr Opin Cell Biol

2001, 13(3):349-355.

13. Yang RY, Liu FT: Galectins in cell growth and apoptosis. Cell Mol Life Sci 2003,

60(2):267-276.

14. Brandt B, Buchse T, Abou-Eladab EF, Tiedge M, Krause E, Jeschke U, Walzel H:

Galectin-1 induced activation of the apoptotic death-receptor pathway in human

Jurkat T lymphocytes. Histochem Cell Biol 2008, 129(5):599-609.

15. Rensen WM, Mangiacasale R, Ciciarello M, Lavia P: The GTPase Ran: regulation of

cell life and potential roles in cell transformation. Front Biosci 2008, 13:4097-4121.

16. Mamidipudi V, Zhang J, Lee KC, Cartwright CA: RACK1 regulates G1/S progression

by suppressing Src kinase activity. Mol Cell Biol 2004, 24(15):6788-6798.

17. Nissom PM, Sanny A, Kok YJ, Hiang YT, Chuah SH, Shing TK, Lee YY, Wong KT, Hu

WS, Sim MY et al: Transcriptome and proteome profiling to understanding the

biology of high productivity CHO cells. Mol Biotechnol 2006, 34(2):125-140.

CHAPTER 3

ANALYSIS OF DYNAMIC CHANGES TO THE CHO PROTEOME DURING

EXPONENTIAL AND STATIONARY PHASES OF CELL CULTURE

3.1 Overview

This study describes the development of an improved proteomics methodology for analysis of

CHO cell lysates and the application of this method to the analysis of proteomic changes during

exponential and stationary phases of a CHO cell culture. Proteomics analysis of mammalian cell

cultures presents distinct analytical challenges due to the complexity and wide range of protein

concentrations in a typical sample, as well as the dynamic nature of cell culture experiments,

which run over the course of many days. During this time, various biological processes occur

which can affect the cell culture phenotype.

Many mammalian cell cultures demonstrate cell growth properties characterized by an

exponential growth phase where cell density increases rapidly, followed by a stationary phase

with little to no cell growth but relatively high specific productivity [1-3]. The factors

controlling this transition are not well understood; possible explanations include limitation of key

metabolites, accumulation of toxic waste products, or cellular response to ER stress [2, 4].

However, the transition from exponential to stationary phase has a significant impact on cell

culture performance since it is directly tied to cell growth, and in some cases can affect specific

productivity. Understanding the biological changes associated with the transition of mammalian

cells through different growth phases could increase our understanding of some of the underlying

mechanisms affecting cell growth and productivity. Also, attempts to identify cell culture

biomarkers should consider the dynamic changes that occur throughout cell culture, as biomarker

abundance could change over time.

In order to consider the dynamic aspects of cell culture during proteomics analysis, a method is

required that can identify quantitative changes in protein expression over multiple timepoints.

The method should also incorporate proper data analysis tools which enable the identification of

significant trends in protein expression over the different timepoints. This approach could

potentially identify proteins with trends in expression that correlate with dynamic changes in

cell culture. It could also be used to detect differential expression of proteins between different

cell culture conditions, by identifying differences in protein trends.

In this study, we applied a quantitative proteomics approach to monitor changes in the CHO

proteome through the course of a fed-batch cell culture expressing a monoclonal antibody. A

combination of multi-dimensional liquid chromatography, isobaric chemical tagging, and mass

spectrometry was used for the analysis of cell culture samples at different timepoints which

encompassed both exponential and stationary phases of the cell culture. This method was

combined with a novel data analysis approach designed to identify dynamic trends in protein

expression using linear regression calculations. Using this method, we identified proteins which

are differentially expressed over the course of cell culture, and may provide biological insight

into the transition of CHO cells from exponential to stationary phase.

3.2 Methods

3.2.1 Cell Culture

A CHO cell line genetically modified to express a recombinant antibody and the anti-apoptotic

gene Bcl-XL was grown under fed-batch conditions in a 3-L sparged bioreactor for 16 days using

a proprietary custom in-house chemically defined medium. Cell number and viability were

measured using a Cedex (Innovatis, Bielefeld, Germany), an automated cell counter that uses

trypan blue staining. A volume equivalent to 1 × 10^7 viable cells was sampled from the bioreactor

on days 6, 9, 12, and 16. The cell samples were centrifuged at 1000 g for 2 minutes to collect the

pelleted cells. The pellets were reconstituted in 5 mL PBS and again centrifuged at 1000 g for 1

minute. The supernatants were removed, and the pellets were flash frozen in liquid nitrogen

followed by storage at -70°C until further analysis.

3.2.2 Cell Lysis

Cell lysates were prepared as described previously [5]. Cell pellets were thawed at room

temperature and reconstituted in a lysis buffer consisting of 50 mM Tris, pH 7.5 and 0.1%

Rapigest (Waters, Milford, MA). Samples were then sonicated in a water bath for 3 cycles of 15

seconds each. Following sonication, samples were centrifuged at 10,000 g for 10 minutes.

Supernatants were transferred to clean tubes. The total protein concentration of each cell lysate

was measured by BCA (Pierce, Rockford, IL) according to the manufacturer’s instructions.

Samples were stored at -70°C prior to tryptic digestion.

3.2.3 Protein Digestion and Labeling

For each sample, a volume equivalent to 50 µg of protein was dried by SpeedVac and

reconstituted in 25 µL of 8 M Urea, 3 µL of 125 mM tris(2-carboxyethyl)phosphine and 10 µL

of iTRAQ dissolution buffer. The samples were incubated at 37 °C for 60 minutes. After the

samples reached room temperature, 3.5 µL of 200 mM iodoacetamide was added to each, and

samples were incubated for 60 min at ambient temperature in the dark. A 5X volume of cold

acetone was added to each sample, and the samples were incubated at -20 °C for 4 hrs. Samples were

then centrifuged at 10,000 g for 10 min. The supernatant was removed, and the pellet

reconstituted in 10 µL of 8 M Urea and 80 µL of iTRAQ dissolution buffer. Ten micrograms of

trypsin was added to each sample, which were incubated for 18 hrs at 37 °C. After digestion,

samples were evaporated by SpeedVac to a volume less than 30 µL. Each tube of 4-plex iTRAQ

reagent was reconstituted in 70 µL of ethanol and added to the digests. The day 6 samples were

labeled with iTRAQ-114, day 9 with iTRAQ-115, etc. The digests were incubated at ambient

temperature for 2 hrs. Following labeling, the samples were mixed together and evaporated to

dryness.

3.2.4 HPLC Fractionation

Labeled tryptic digests were fractionated using reversed-phase chromatography. After

reconstituting the labeled peptide digest in mobile phase A (20 mM Ammonium Formate pH

10.0), the digest was loaded onto a Waters XBridge C18 column (2.1 x 150 mm) heated to 45°C

at a flow rate of 300 µL/min. Mobile phase B was acetonitrile. The gradient conditions were

set to 2% B for 5 min, then 2% - 10% B over 5 min, followed by 10% - 40% B over 25 minutes.

Fractions were collected every 2 min from 13 to 45 min, for a total of 16 fractions per sample.

Each fraction was evaporated to dryness immediately after collection.

3.2.5 LC/MS

Reversed-phase fractions were evaporated to dryness and reconstituted in 100 µL of 0.1% formic

acid in water. Each fraction was analyzed in triplicate by LC/MS using 25 µL injection volumes.

An Agilent 1200 HPLC was connected to a Thermo Scientific Orbitrap Discovery mass

spectrometer. Separation was achieved using a Waters Acquity HSS T3 C18 column (1.0 x 100

mm) heated to 55°C. Mobile phase A was 0.1% formic acid in water, and mobile phase B 0.1%

formic acid in acetonitrile. Peptides were separated with a flow of 70 µL/min at a steady 2% B

for 5 min, then 2% - 35% B over 120 min, followed by a wash step at 90% B and re-equilibration

at 2% B for 20 min. The mass spectrometer

was set up to scan MS followed by MS/MS on the top 4 precursor ions. In MS mode, a mass

range of 400 – 1500 m/z was scanned. For MS/MS scans, pulsed Q dissociation (PQD) was used

with a collision energy of 33 and an isolation width of 3.0. Each MS/MS event consisted

of 2 microscans. Dynamic exclusion was enabled with a repeat count of 2, for a 15 sec window.

3.2.6 Data Analysis

The proteins were identified using Thermo Scientific Proteome Discoverer 1.1. The Sequest

algorithm was used to search MS/MS data against a mouse sequence database downloaded from

Uniprot on 12/07/2009. The dataset from each reactor was processed independently. The

proteins were identified with 10 ppm mass accuracy for precursor ions, and 0.6 Da for product

ions. The MS/MS data were searched with static modifications set to 4-plex iTRAQ at N-termini

and lysines, and Cys carbamidomethylation. Dynamic modifications were set to asparagine and

glutamine deamidation, methionine oxidation, and tyrosine iTRAQ labeling. The database was

searched in reverse to determine false discovery rates. Each peptide was identified with less than

a 5% false discovery rate.
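
A reversed-database search supports a simple false discovery rate estimate, sketched below. The formula (decoy hits divided by target hits above a score threshold) is a common convention and is an assumption here; the text does not state the exact calculation used by the search software, and the scores shown are hypothetical.

```python
def decoy_fdr(psms, score_threshold):
    """Estimate FDR as (# decoy hits) / (# target hits) among matches at or above the threshold.

    psms: iterable of (score, is_decoy) pairs from a search against
    forward (target) and reversed (decoy) sequences.
    """
    targets = sum(1 for score, is_decoy in psms if score >= score_threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= score_threshold and is_decoy)
    if targets == 0:
        return 0.0 if decoys == 0 else float("inf")
    return decoys / targets


def lowest_threshold_for_fdr(psms, max_fdr=0.05):
    """Return the lowest observed score whose estimated FDR does not exceed max_fdr, or None."""
    for threshold in sorted({score for score, _ in psms}):
        if decoy_fdr(psms, threshold) <= max_fdr:
            return threshold
    return None


# Hypothetical scores: True marks a hit to a reversed (decoy) sequence.
example = [(4.0, False), (3.5, False), (3.0, False), (2.5, True), (2.0, False), (1.5, True)]
print(lowest_threshold_for_fdr(example))
```

Peptide identifications scoring at or above the returned threshold would then be accepted at the chosen FDR limit.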

The iTRAQ reporter ions were detected as the most confident centroid within a 0.3 Da window.

Ratios were calculated using iTRAQ-114 as the denominator and -115, -116, and -117 as

numerators. Only iTRAQ spectra with ion counts above 5.0 were used for quantitation, and only

unique peptides were used to calculate protein ratios. The ratios were normalized against the

global protein median for each dataset.
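
Median normalization of the iTRAQ ratios can be sketched as follows. The sketch assumes that each ratio channel (for example 115/114) is divided by the median of all protein ratios in that channel so that the normalized median equals 1; the dictionary layout and the values are hypothetical.

```python
from statistics import median


def normalize_by_median(protein_ratios):
    """Divide every protein ratio in one iTRAQ channel by the channel's median ratio.

    protein_ratios: dict mapping protein name to its ratio (e.g. 115/114);
    None marks a missing value and is passed through unchanged.
    """
    observed = [r for r in protein_ratios.values() if r is not None]
    centre = median(observed)
    return {p: (r / centre if r is not None else None) for p, r in protein_ratios.items()}


# Hypothetical 115/114 ratios, for illustration only.
raw = {"A": 1.2, "B": 0.6, "C": 2.4, "D": None, "E": 1.0}
normalized = normalize_by_median(raw)
print(normalized)
```

Centering each channel on its median corrects for small differences in the total amount of peptide labeled with each reagent, under the assumption that most proteins do not change.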

To identify proteins with significant trends, the slope of the linear regression was calculated

using relative intensity as the y-values and cell culture time as the x-values. Since the protein

intensity at day 6 was used as the denominator for all of the iTRAQ ratios, the relative intensity

at day 6 was set to 1.0 for all proteins, and the iTRAQ ratios were set as the relative intensity at

the three other timepoints. Proteins that were missing two or more iTRAQ ratios were excluded

from the calculations. The average and standard deviation of the distribution of the slopes were

calculated for each dataset, and the threshold for significant protein trends was set as +/- 1 SD

from the average for each distribution. The lists from the two datasets were compared and only

proteins present in both lists were considered significant trending proteins.
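
The trend analysis described above can be illustrated with a minimal sketch: a least-squares slope per protein, a threshold of one standard deviation around the mean slope, and an intersection of the two bioreactor lists. The data layout, function names, and the use of the sample standard deviation are assumptions made for illustration.

```python
from statistics import mean, stdev

DAYS = [6, 9, 12, 16]  # sampling days; day 6 is the iTRAQ-114 reference


def trend_slope(ratios):
    """Least-squares slope of relative intensity versus culture day for one protein.

    ratios: [r115, r116, r117] relative to iTRAQ-114, with None for a missing ratio.
    Returns None when two or more ratios are missing.
    """
    if sum(r is None for r in ratios) >= 2:
        return None
    # Day 6 is fixed at a relative intensity of 1.0.
    points = [(DAYS[0], 1.0)] + [(d, r) for d, r in zip(DAYS[1:], ratios) if r is not None]
    mean_x = mean(x for x, _ in points)
    mean_y = mean(y for _, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    return sxy / sxx


def trending_proteins(dataset):
    """Proteins whose slope lies more than 1 SD from the mean slope of the dataset."""
    slopes = {}
    for protein, ratios in dataset.items():
        slope = trend_slope(ratios)
        if slope is not None:
            slopes[protein] = slope
    centre, spread = mean(slopes.values()), stdev(slopes.values())
    return {p for p, s in slopes.items() if abs(s - centre) > spread}


def consistent_trends(reactor_a, reactor_b):
    """Keep only proteins flagged in both bioreactor datasets."""
    return trending_proteins(reactor_a) & trending_proteins(reactor_b)


# Hypothetical normalized ratios for five proteins in one reactor.
reactor = {"P1": [1, 1, 1], "P2": [1, 1, 1], "P3": [1, 1, 1], "P4": [1, 1, 1], "UP": [2.0, 3.0, 4.0]}
print(consistent_trends(reactor, reactor))
```

Requiring a protein to pass the threshold in both reactors uses the duplicate cultures as a check against trends that arise from run-to-run variation.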

Proteins were classified by biological process and by protein class using PANTHER, an online

tool which uses annotations such as gene ontology to classify genes by category such as

biological process or molecular function [6].

3.2.7 Pathway Analysis

A dataset including a list of gene names and the average calculated slope for all differentially

expressed proteins was uploaded into Ingenuity Pathway Analysis (Ingenuity® Systems,

www.ingenuity.com). A Core analysis was used with both direct and indirect relationships

allowed, with a maximum of 70 proteins per network and 10 networks in total. All data

sources were selected, and all cell and tissue types were selected. No expression value cutoff

was used. All network eligible molecules were overlaid onto a global molecular network

developed from information contained in Ingenuity’s Knowledge Base. Networks of network

eligible molecules were then algorithmically generated based on their connectivity.

3.2.8 Western Blotting

Cell lysate samples (10-20 µg) were separated under reducing conditions using a Novex 4-12%

Bis-Tris SDS-PAGE gel (Invitrogen, Carlsbad, CA) with MOPS running buffer according to the

manufacturer’s recommended protocol. The proteins were then transferred to a nitrocellulose

membrane using an iBlot (Invitrogen) set to 20 V for 7 min. The blot was blocked overnight

with milk blocking buffer (KPL, Gaithersburg, MD), and probed with primary antibody (mouse

monoclonal anti-guinea pig transglutaminase-2 (Abcam, Cambridge, MA), rabbit polyclonal

anti-mouse beta actin (Abcam), or mouse anti-rat clusterin (Abcam)) at a concentration of 1,000 –

2,000 µg/mL. After washing 3X with PBS with 0.1% Tween-20 (PBS-T), the blot was probed

with appropriate secondary antibody (Abcam) at a concentration of 10,000 µg/mL. After a

second wash step with PBS-T, the blot was detected using the LumiGLO chemiluminescent kit

(KPL).

3.3 Results

A CHO cell culture expressing a recombinant antibody was grown under fed-batch conditions in

a 3 L bioreactor for 16 days in duplicate. During the exponential phase, the viable cell density

(VCD) increased from 2.5 × 10^6 viable cells/mL at day 3 to a maximum of approximately 3.2 × 10^7

viable cells/mL at day 12 (Figure 3.1a). The cell density remained nearly constant during the

stationary phase which occurred between day 12 and day 16, when the cells were harvested. The

cell viability stayed above 95% through day 14, and tapered over the last two days to 87% on

day 16 (Figure 3.1b). In order to study changes in protein expression over time, the reactors

were sampled at days 6, 9, 12, and 16 for proteomics analysis.
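As a rough worked example, the cell densities reported above imply an average specific growth rate and doubling time for days 3 to 12. The sketch below assumes purely exponential growth over that whole interval, which is a simplification; the derived values are not reported in this study, and the function name is illustrative.

```python
import math

def specific_growth_rate(x0, x1, t0, t1):
    """Average specific growth rate (1/day), assuming exponential growth from t0 to t1."""
    return math.log(x1 / x0) / (t1 - t0)

# Viable cell densities (cells/mL) at day 3 and day 12, taken from the text above.
mu = specific_growth_rate(2.5e6, 3.2e7, 3, 12)
doubling_time = math.log(2) / mu      # days per population doubling
fold_increase = 3.2e7 / 2.5e6         # overall increase in viable cell density
```

Under this assumption the culture increases roughly 13-fold in viable cell density, with an average doubling time of about 2.4 days.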


Figure 3.1: CHO Cell Growth and Viability

Chinese hamster ovary cells were grown in duplicate 3 L bioreactors for 16 days.

Bioreactor A (Red) and B (Blue) were run under identical conditions. The viable cell density

(A) and percent viability (B) were measured each day using a CEDEX.

3.3.1 Proteomics Analysis of Cell Lysates

The four time-point samples from each reactor were multiplexed using iTRAQ and analyzed by

two-dimensional LC/MS. A total of 2836 unique proteins were identified, which resulted from


over 7000 peptides detected in either bioreactor (Figure 3.2a). Using PANTHER, the

corresponding gene list was classified by biological process terms from Gene Ontology (Figure

3.2b). Major processes represented in the identified proteins included cell metabolism, transport,

cell communication, and cell development.

To account for sample variability and for more accurate measurement of the change in protein

expression, three different normalization techniques were incorporated into the workflow. First,

the reactors were sampled based on the viable cell density, and samples were frozen at a constant

cell density of 2 × 10⁷ viable cells/mL. Secondly, after cell lysis the amount of sample used in

digestion and labeling was normalized based on the total protein concentration such that a

volume equivalent to 50 µg of total protein was used. Thirdly, the iTRAQ ratios were

normalized based on their global medians.
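The third normalization step can be illustrated with a short Python sketch. It assumes that "global median" normalization means dividing every ratio in a channel by the median ratio of that channel across all proteins; the intensities, protein labels, and function names are hypothetical and are not taken from the study.

```python
from statistics import median

# Hypothetical reporter-ion intensities per protein for the four iTRAQ channels
# (days 6, 9, 12, 16); values are illustrative only.
intensities = {
    "protA": [100.0, 120.0, 150.0, 300.0],
    "protB": [200.0, 220.0, 210.0, 190.0],
    "protC": [50.0, 66.0, 45.0, 30.0],
}

def ratios_to_reference(channels, ref=0):
    """Express each channel relative to the reference channel (day 6 by default)."""
    return [value / channels[ref] for value in channels]

def median_normalize(ratio_table):
    """Divide each channel's ratios by that channel's median across all proteins."""
    n_channels = len(next(iter(ratio_table.values())))
    medians = [median(r[i] for r in ratio_table.values()) for i in range(n_channels)]
    return {p: [r[i] / medians[i] for i in range(n_channels)]
            for p, r in ratio_table.items()}

ratios = {p: ratios_to_reference(ch) for p, ch in intensities.items()}
normalized = median_normalize(ratios)
```

After normalization the median ratio in every channel equals 1, so a systematic loading difference between channels no longer appears as an apparent change in expression.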

The iTRAQ chemical tagging method was well-suited for this application because of its ability to

multiplex samples from different bioreactor timepoints, which reduced the variability between

samples as well as the time and cost of the analysis compared to the SILAC approach, which is

another common quantitative proteomics approach. The SILAC method utilizes metabolic

labeling of cell cultures with light and heavy media, the latter containing isotopically labeled Arg, and

allows comparison of two cell cultures [7]. We chose not to use SILAC because our cell culture

experiments were performed at the 3 L reactor scale, which would require significant amounts of

labeled media to perform the experiment. We have also observed that spiking of isotopically

labeled Arg into a cell culture bioreactor after inoculation with standard media did not result in

significant incorporation of the heavy label in CHO proteins (data not shown). Therefore, we


deemed the iTRAQ method more widely applicable for analysis of samples generated using

different production scales.

Figure 3.2: Protein Identification Summary

Samples generated from each bioreactor were analyzed and submitted separately for Sequest

search against the mouse database. A) A summary of the peptides and proteins identified in each

bioreactor. B) The list of genes identified by Sequest search was classified by biological process

using PANTHER.

                     BR#1    BR#2
# Unique Peptides    7399    7479
# Unique Proteins    2130    2026
# Total Proteins     2836 (both bioreactors combined)
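The protein counts in Figure 3.2 also imply how many proteins were identified in both bioreactors. The sketch below applies the inclusion-exclusion principle and assumes that the total of 2836 is the union of the two per-bioreactor protein lists; the overlap itself is not reported in the text.

```python
def shared_count(n_a, n_b, n_union):
    """Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|."""
    return n_a + n_b - n_union

shared = shared_count(2130, 2026, 2836)   # proteins identified in both bioreactors
only_br1 = 2130 - shared                  # identified only in BR#1
only_br2 = 2026 - shared                  # identified only in BR#2
```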


3.3.2 Analysis of Dynamic Trends in Protein Expression

Proteins exhibiting significant changes in protein expression over time were identified by first

performing linear regression analysis on the plot of protein relative abundance (derived from

iTRAQ data) versus time. The slope of the regression line is an indicator of the trend of protein

abundance over time. An example is illustrated in Figure 3.3a for the protein GRP78. This

particular protein showed an increase in relative abundance over time, increasing from 1.0 on

day 6 to 3.2 on day 16. Although this is not a linear increase in abundance, plotting the linear

regression line of this plot provides useful information for determining the trend in abundance of

this protein. In this case, the slope is 0.24 indicating a positive trend. Similarly, a negative slope

would indicate a decreasing trend. The advantage of using such a method for identification of

trends is that the calculation takes into account the relative protein abundance at multiple

timepoints, and therefore is suitable for identifying real trends in protein abundance. By

performing this calculation for all proteins identified in each reactor, we obtained a distribution

of slopes (Figure 3.3b). The distribution indicates that the median slope for all proteins is 0.0,

thus most of the identified proteins are not significantly changing in abundance over time. A

threshold of +/- 1 standard deviation from the mean was used to identify proteins with significant

trends. This threshold was found to be suitable for identifying proteins with significant changes

in abundance, as most of the proteins found outside of these limits had at least 2-fold changes in

abundance from day 6 to day 16.
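The slope calculation and threshold described above can be sketched in Python as follows. This is an illustration rather than the workflow used in the study: the function names are hypothetical, and the use of the sample standard deviation is an assumption. The abundance values are the bioreactor-averaged values listed in Table 3.1, so the GRP78 (HSPA5) slope differs slightly from the single-bioreactor example given above.

```python
import statistics

DAYS = [6, 9, 12, 16]

def trend_slope(days, abundance):
    """Ordinary least-squares slope of relative abundance versus culture day."""
    mean_x = statistics.fmean(days)
    mean_y = statistics.fmean(abundance)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, abundance))
    sxx = sum((x - mean_x) ** 2 for x in days)
    return sxy / sxx

def sd_thresholds(slopes, n_sd=1.0):
    """Lower and upper cutoffs at the mean minus/plus n_sd sample standard deviations."""
    mean = statistics.fmean(slopes)
    sd = statistics.stdev(slopes)
    return mean - n_sd * sd, mean + n_sd * sd

# Averaged relative abundances for three proteins listed in Table 3.1.
grp78 = trend_slope(DAYS, [1.00, 0.95, 1.47, 3.45])   # HSPA5
clu = trend_slope(DAYS, [1.00, 1.27, 2.23, 4.56])
tgm2 = trend_slope(DAYS, [1.00, 1.96, 2.88, 3.20])
```

Applying `sd_thresholds` to the slopes of all proteins from one bioreactor gives the two cutoffs; a protein is flagged when its slope falls outside them.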


Figure 3.3: Identification of Dynamic Proteomic Trends

The slope of the linear regression line was calculated for the plot of relative iTRAQ reporter ion

intensity versus time. A) An example of an increasing trend showed a positive slope. B) The

distribution was calculated from the slopes of all proteins identified in bioreactor A. The

thresholds are indicated in red, -0.055 and 0.085.

The determination of upregulated proteins over time using the slope of the linear regression line

has several advantages over pairwise comparison of iTRAQ intensities at specific timepoints. It

is useful for identification of positive or negative trends which take into account protein


abundance at multiple timepoints, which may be more relevant to cell culture than pairwise

comparisons between two timepoints. It allows visualization of the frequency of these protein

trends for an entire dataset, to help understand the data trends from a global perspective. Finally,

it is easily applied to large datasets and does not require the use of special software since the

calculations are easily applied using Microsoft Excel.

Using this approach, 59 proteins were identified with significant dynamic trends over the course

of the cell culture (Table 3.1). All listed proteins showed trends which were biologically

reproducible between the two reactors sampled. Thirteen of the proteins had negative trends, and

the other 44 had positive trends. The protein with the most positive trend was clusterin, with an

averaged slope of 0.36. Similarly, MCM2 had the sharpest decreasing trend with an averaged

slope of -0.11. To understand what functions were associated with these proteins, the list of

differentially expressed proteins was classified by PANTHER protein class (Figure 3.4). Major

functional groups represented by these proteins include chaperones, nucleic acid binding

proteins, isomerases, proteases, transporters, transferases, and oxidoreductases.
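One way to express the requirement that a trend be reproducible between the two reactors is sketched below. The criterion used here (the slope must fall outside the cutoffs in the same direction in both bioreactors) is an assumption about how "similar trends" could be defined. The slopes are hypothetical; the cutoffs are those shown for bioreactor A in Figure 3.3, and reusing them for bioreactor B is an assumption made only for illustration.

```python
def classify(slope, low, high):
    """Return 'up', 'down', or None for a slope relative to the two cutoffs."""
    if slope > high:
        return "up"
    if slope < low:
        return "down"
    return None

def reproducible_trends(slopes_a, slopes_b, cutoffs_a, cutoffs_b):
    """Keep proteins whose slope is beyond the cutoffs in the same direction in both reactors."""
    result = {}
    for protein in slopes_a.keys() & slopes_b.keys():
        call_a = classify(slopes_a[protein], *cutoffs_a)
        call_b = classify(slopes_b[protein], *cutoffs_b)
        if call_a is not None and call_a == call_b:
            average = (slopes_a[protein] + slopes_b[protein]) / 2
            result[protein] = (call_a, average)
    return result

cutoffs = (-0.055, 0.085)   # thresholds shown in Figure 3.3 (bioreactor A)
slopes_a = {"CLU": 0.38, "MCM2": -0.12, "ACTB": 0.01, "X1": 0.20}   # hypothetical
slopes_b = {"CLU": 0.34, "MCM2": -0.10, "ACTB": -0.02, "X1": 0.03}  # hypothetical
hits = reproducible_trends(slopes_a, slopes_b, cutoffs, cutoffs)
```

In this toy input the protein labeled `X1` is excluded because its slope exceeds the cutoff in only one of the two reactors.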


Table 3.1: List of Proteins with Dynamic Trends in CHO Cell Culture

Differentially expressed proteins were identified with a slope more than 1 standard deviation

above or below the mean. Proteins for which similar trends were observed in both bioreactors

were considered significant. The average relative abundance and slope values from both

bioreactor datasets are shown. The Day columns give the relative abundance on culture days 6, 9, 12, and 16.

Gene            Day 6 Day 9 Day 12 Day 16  Slope  Protein name

Cell Metabolism
ALDH2            1.00  0.81   1.23   2.11   0.12  aldehyde dehydrogenase 2 family (mitochondrial)
GPD2             1.00  0.67   1.71   3.00   0.22  glycerol-3-phosphate dehydrogenase 2 (mitochondrial)
MDH2             1.00  0.78   1.09   2.07   0.11  malate dehydrogenase 2, NAD (mitochondrial)
NDUFA5           1.00  2.18   1.60   4.59   0.32  NADH dehydrogenase (ubiquinone) 1 alpha subcomplex, 5, 13kDa

Chaperones / Protein Folding
CANX             1.00  0.79   1.08   2.07   0.11  calnexin
CALR             1.00  0.96   1.37   2.55   0.16  calreticulin
DNAJB11          1.00  1.08   1.69   2.84   0.19  DnaJ (Hsp40) homolog, subfamily B, member 11
HSPE1            1.00  0.68   1.11   1.97   0.11  heat shock 10kDa protein 1 (chaperonin 10)
HSPA5            1.00  0.95   1.47   3.45   0.25  heat shock 70kDa protein 5 (glucose-regulated protein, 78kDa)
HSPA9            1.00  0.82   1.22   1.92   0.10  heat shock 70kDa protein 9 (mortalin)
HSP90B1          1.00  0.94   1.34   2.63   0.17  heat shock protein 90kDa beta (Grp94), member 1
HYOU1            1.00  0.86   1.35   2.25   0.13  hypoxia up-regulated 1
P4HB             1.00  0.90   1.23   1.95   0.10  prolyl 4-hydroxylase, beta polypeptide
PDIA3            1.00  0.95   1.27   2.28   0.13  protein disulfide isomerase family A, member 3
TXNDC5           1.00  1.10   1.28   1.97   0.10  thioredoxin domain containing 5 (endoplasmic reticulum)

Nucleic Acid Binding
CHMP2B           1.00  0.83   0.43   0.39  -0.08  chromatin modifying protein 2B
EIF2B4           1.00  0.68   0.58   0.26  -0.07  eukaryotic translation initiation factor 2B, subunit 4 delta, 67kDa
HMGB2            1.00  0.55   0.62   0.15  -0.08  high-mobility group box 2
HIST1H1D         1.00  0.50   0.72   0.23  -0.06  histone cluster 1, H1d
MCM2             1.00  0.59   0.23   0.00  -0.11  minichromosome maintenance complex component 2
SF3B1            1.00  0.77   0.41   0.12  -0.08  splicing factor 3b, subunit 1, 155kDa

Kinase
MAST1            1.00  0.80   1.32   2.68   0.17  microtubule associated serine/threonine kinase 1
PRKCSH           1.00  1.05   1.20   2.28   0.13  protein kinase C substrate 80K-H
PRKAG2           1.00  0.94   1.30   2.08   0.11  protein kinase, AMP-activated, gamma 2 non-catalytic subunit

Lipid Metabolism
HMGCS1           1.00  0.76   0.36   0.16  -0.09  3-hydroxy-3-methylglutaryl-CoA synthase 1 (soluble)
ACAA2            1.00  0.73   1.42   2.20   0.13  acetyl-CoA acyltransferase 2
PLIN2            1.00  2.13   2.12   2.57   0.14  adipose differentiation related protein
CYB5R3           1.00  1.45   1.10   2.96   0.18  cytochrome b5 reductase 3
PNPLA8           1.00  2.19   1.57   3.87   0.25  patatin-like phospholipase domain containing 8

Oxidoreductase
MT1F             1.00  0.79   1.39   2.38   0.15  metallothionein 1F
PRDX3            1.00  0.95   1.38   2.41   0.15  peroxiredoxin 3
PRDX5            1.00  1.15   1.48   2.27   0.13  peroxiredoxin 5
SOD2             1.00  0.86   1.35   2.33   0.14  superoxide dismutase 2, mitochondrial

Protease
CTSD             1.00  1.10   1.19   2.12   0.11  cathepsin D
CLPP             1.00  1.07   1.26   2.38   0.14  ClpP caseinolytic peptidase, ATP-dependent, proteolytic subunit homolog
KLK11            1.00  1.56   0.65   2.40   0.11  kallikrein-related peptidase 11
USP10            1.00  1.81   0.87   0.25  -0.08  ubiquitin specific peptidase 10

Transport
ABCB5            1.00  1.56   1.71   2.82   0.17  ATP-binding cassette, sub-family B (MDR/TAP), member 5
CAPRIN1          1.00  0.92   0.85   0.21  -0.08  cytoplasmic activation/proliferation related protein 1
ATP5B            1.00  0.76   0.92   1.93   0.10  ATP synthase, H+ transporting, mitochondrial F1 complex, beta polypeptide
ATP5G1           1.00  0.44   1.07   3.14   0.22  ATP synthase lipid-binding protein

Other
CAV3             1.00  1.07   1.79   2.81   0.19  caveolin 3
SSR1             1.00  0.88   0.61   0.17  -0.07  signal sequence receptor, alpha
MANF             1.00  1.12   1.60   2.23   0.13  mesencephalic astrocyte-derived neurotrophic factor
ADSL             1.00  1.33   2.11   2.70   0.18  adenylosuccinate lyase
ARGLU1           1.00  1.27   0.74   0.41  -0.07  arginine and glutamate rich 1
CLU              1.00  1.27   2.23   4.56   0.36  clusterin
ETHE1            1.00  0.84   1.11   2.01   0.10  ethylmalonic encephalopathy 1
GOT2             1.00  0.78   1.24   1.99   0.11  glutamic-oxaloacetic transaminase 2, mitochondrial
1500003O03Rik    1.00  1.00   1.67   2.31   0.14  novel protein RP23-22A15.1
TGM2             1.00  1.96   2.88   3.20   0.22  transglutaminase 2
WDR65            1.00  1.35   1.74   2.44   0.14  WD repeat domain 65
BCL2L1           1.00  0.67   1.79   3.08   0.21  BCL2-like 1 (Bcl-XL)
N/A              1.00  1.28   1.71   2.32   0.13  monoclonal antibody light chain
N/A              1.00  1.17   1.93   2.34   0.14  monoclonal antibody heavy chain

Figure 3.4: Differentially expressed proteins were classified by protein class using PANTHER.

Different trends were observed within the protein classes. For example, many ER proteins were

upregulated in the stationary phase. These include the molecular chaperones GRP78, calnexin,

GRP94, and hypoxia upregulated protein 1, as well as several protein disulfide isomerases.

Another ER-residing protein, armet, was also present at higher levels in the stationary phase.

Armet does not have reported chaperone activity, but is associated with inhibition of ER-

mediated stress [8]. In contrast, several proteins associated with nucleic acid binding such as


minichromosome maintenance complexes 2, 5, and 6 were downregulated in the stationary

phase. Cytoplasmic activator/proliferation-associated protein-1 (caprin1) was also

downregulated. To further illustrate some of the trends in protein abundance, the relative

abundance plots for GRP78, armet, MCM5, and caprin1 are shown in Figure 3.5.

Figure 3.5: Abundance over Time for Proteins Involved in Relevant Pathways

The relative intracellular concentration of A) GRP78 B) armet C) MCM5 D) caprin1 is shown

for both bioreactor experiments (denoted by red and blue traces). The protein abundance was

determined from the iTRAQ ratios using day 6 as the denominator and days 9, 12, and 16 as

numerators.

[Figure 3.5 plots: four panels titled GRP78, Armet, MCM5, and Caprin1, each showing Relative Abundance versus Time (d).]

3.3.3 Identification of Growth-Regulating Proteins

The resulting list of differentially expressed proteins was analyzed using Ingenuity Pathway

Analysis to identify common pathways and networks among the proteins. The highest scoring

network identified in the pathway analysis had a score of 50, and included 29 of the proteins

from the significant trend group (Figure 3.6). This network is associated with hematological

function and development, hematopoiesis, and cell death.


Figure 3.6: Top Scoring Protein Network from Ingenuity Pathway Analysis

Genes correlating to differentially expressed proteins were analyzed for functional networks

using Ingenuity. The top network shown gave a score of 50, and is associated with

hematological function and development, hematopoiesis, and cell death. Gene names indicated in

green are downregulated in stationary phase, and red indicates upregulated. Solid lines indicate

protein-protein interactions, and dotted lines indicate relationships based on gene expression.


Two proteins included in the top-scoring network, clusterin and transglutaminase-2, were of

particular interest because both have cell growth regulating properties [9, 10]. In order to

confirm the dynamic trends observed for these two proteins, western blotting was performed on

the four timepoints from both bioreactors for both proteins. The results confirmed the changes in

abundance observed using quantitative proteomics (Figure 3.7). Transglutaminase-2 was found

to increase an average of 2.8-fold over 10 days by western blot based on spot volume analysis,

which corresponds to a 3.2-fold change observed in the proteomics analysis. Clusterin was

detected as several bands, which has been reported previously due to the presence of a secreted

form which is heavily glycosylated, as well as a nuclear non-glycosylated form [11, 12]. The

major band observed in this blot migrates at approximately 28 kDa, which corresponds to the

nuclear form of clusterin. This clusterin band showed an average 2.0-fold increase over

10 days, compared to a 4.5-fold increase observed by proteomics.

Figure 3.7: Confirmation of Dynamic Trends in Transglutaminase-2 and Clusterin Expression

A) The relative abundance of transglutaminase-2 and clusterin as measured using iTRAQ reporter

ion ratios. B) Western blotting results for transglutaminase-2 and clusterin (lanes 6A, 9A, 12A, 16A, 6B, 9B, 12B, 16B). Cell lysates

corresponding to days 6, 9, 12, and 16 from both bioreactors were analyzed in parallel.

3.3.4 Potential Implications on CHO Cell Culture

The observation of upregulation of molecular chaperones and isomerases involved in protein

folding is likely due to a cellular response to ER stress. The increased expression of the molecular

chaperones BiP (GRP78), calnexin, and GRP94 is associated with the unfolded protein response (UPR)

[13-15]. Another upregulated protein, armet, is reportedly associated with inhibition of ER-

mediated stress [8]. The unfolded protein response is triggered by high levels of unfolded


proteins in the ER lumen, which can activate the UPR through three different receptors: Ire1,

ATF6, and PERK. It results in the enhanced transcription of UPR genes including molecular

chaperones, isomerases, and proteases [16]. The resulting increase in protein folding and

degradation capability enables the cell to increase its ability to process proteins entering the ER.

Another upregulated group of proteins observed in this study is the oxidoreductases, which

have antioxidant function. The increased expression of antioxidants may be

related to the unfolded protein response, since an oxidizing environment in the ER lumen is

critical for protein folding [17, 18]. The upregulation of UPR-related proteins also correlates

with increased intracellular abundance of the monoclonal antibody during stationary phase.

Similarly, higher extracellular abundance of the IgG was also observed, which is indicated by the

higher specific productivity during stationary phase. Previous proteomic studies of CHO and

NS0 cells have also reported evidence of the unfolded protein response in mammalian cell

cultures [19, 20]. This process may play a role in the observed cell growth characteristics since

the UPR has been reported to trigger cell cycle arrest [4]; however, this has not been confirmed

for our particular cell culture process.

Pathway analysis of the list of differentially expressed proteins revealed a network consisting of

29 proteins which had functional associations with hematological function and development,

hematopoiesis, and cell death. Although the cells studied have no direct relation to blood cells,

these hematological functions were associated with the results because the filtering parameters

used for identification of networks were intentionally relaxed, since

Ingenuity does not contain filters specific for CHO cells. The results from Ingenuity indicate

that the protein network is associated with development and growth of blood cells. These


associations may also translate to cellular growth and development for CHO cells. Several

proteins in the network are directly associated with regulation of cell growth. One such protein

is Bcl-XL, which is associated with inhibition of apoptosis in mammalian cells [21]. The CHO

cells used in these experiments had been engineered to overexpress Bcl-XL as a strategy to

regulate cell growth through inhibition of apoptosis in cell culture [22]. Our results indicate that

Bcl-XL is expressed at higher levels during stationary phase compared to exponential growth

phase in this cell culture.

Several proteins associated with cellular proliferation such as MCM2, MCM5, MCM6, and

caprin1 were downregulated in the stationary phase. High levels of MCM proteins have been

previously identified as cell proliferation markers, and are involved in control of DNA synthesis

[23]. Caprin1 has also been reported to be involved in cell proliferation [24, 25]. The dynamic

trends observed for these proteins correlate with the relatively static cell growth observed

during stationary phase, and indicate that these known growth-related marker proteins are

indicators of cell growth in this particular CHO culture. Downregulation of MCM3 and MCM5

was also reported to be associated with high productivity in sodium butyrate treated CHO cells

[26].

Other differentially expressed proteins associated with regulation of cell growth are

transglutaminase-2 and clusterin. Transglutaminase-2 (TG2) is an 82-kDa membrane protein that

catalyzes crosslinking of lysine and glutamine residues [27, 28]. It has been reported to be

overexpressed in certain tumors and to have anti-apoptotic activity [10, 29]. Clusterin is a

protein present in nuclear and secreted forms in mammalian cells that has both pro- and anti-

apoptotic properties and is also associated with tumor progression [9, 11, 30, 31]. The role of


clusterin is difficult to decipher due to the multiple forms present in mammalian cells. It is

expressed as a secreted protein which is heavily glycosylated, as well as a nuclear non-

glycosylated form [32]. The band that we observe by western blot seems to correspond to a

partially-processed variant of clusterin which lacks glycosylation. A band of 28-30 kDa was

previously observed by O’Sullivan et al. by western blotting of nuclear clusterin under reducing

conditions [33]. The nuclear form of clusterin has been associated with signaling of apoptosis in

cancer cells [11].

Our data indicate that several growth-regulating proteins (Bcl-XL, transglutaminase-2, and

clusterin) are expressed at higher levels in the stationary phase of the cell culture, while other

marker proteins associated with cell growth including MCM2 and caprin-1 are downregulated in

stationary phase. One possible explanation for these observations is that several of these proteins

are involved in control of cellular growth during the cellular response to ER stress. Bcl-XL has

been previously reported as having an active anti-apoptotic role during cellular response to ER

stress by inhibiting the translocation of BIM [34]. The differential expression of clusterin,

transglutaminase-2, minichromosome maintenance complex proteins, and caprin-1 may also be

tied to this cellular response, with these proteins playing roles in cellular adaptation to ER stress and the

concomitant regulation of cellular growth. However, the direct roles that these proteins play

in cell growth and productivity will require additional study to validate.

3.4 Conclusions

The main goal of this study was to establish a suitable approach for identifying trends in protein

expression over the course of a mammalian cell culture. The transition of mammalian cells from


exponential to stationary phase during cell culture is an important attribute of cell culture

performance, as it involves the transition of cells from a high growth phase into a high

productivity phase with little to no cell growth. This study utilized a proteomics strategy

designed to identify dynamic protein trends in a CHO mammalian cell culture over a period of

time encompassing both exponential and stationary phases. Using a quantitative proteomics

approach, we identified differentially expressed proteins with increasing or decreasing trends in

protein expression. The results obtained here provide us a baseline of proteomic changes related

to the transition of cell growth from exponential to stationary phase in one CHO cell culture

process, and these results can be validated in future studies of other CHO processes to determine

how biologically reproducible they are. The differential expression of some of these proteins

may be related to the changes in cell growth observed throughout the CHO cell culture. The

differential expression of transglutaminase-2 and clusterin is of particular interest due to their roles

in the regulation of cell growth. The nature of the relationships between these proteins and cell

culture performance will be the subject of future studies. In addition to identifying differential

protein expression over the course of a cell culture, the proteomics strategy described here can be

applied to identify proteins with different trends in expression between different cell culture

conditions, which may be useful for identifying protein markers associated with productivity and

product quality.


3.5 References

1. Huang YM, Hu W, Rustandi E, Chang K, Yusuf-Makagiansar H, Ryll T: Maximizing

productivity of CHO cell-based fed-batch culture using chemically defined media

conditions and typical manufacturing equipment. Biotechnology progress 2010,

26(5):1400-1410.

2. Wurm FM: Production of recombinant protein therapeutics in cultivated

mammalian cells. Nature biotechnology 2004, 22(11):1393-1398.

3. Schoenherr I, Stapp T, Ryll T: A comparison of different methods to determine the

end of exponential growth in CHO cell cultures for optimization of scale-up.

Biotechnology progress 2000, 16(5):815-821.

4. Brewer JW, Diehl JA: PERK mediates cell-cycle exit during

the mammalian unfolded protein response. Proceedings of the National Academy of

Sciences of the United States of America 2000, 97(23):12625-12630.

5. Carlage T, Hincapie M, Zang L, Lyubarskaya Y, Madden H, Mhatre R, Hancock WS:

Proteomic profiling of a high-producing Chinese hamster ovary cell culture.

Analytical chemistry 2009, 81(17):7357-7362.

6. Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B, Daverman R, Diemer K,

Muruganujan A, Narechania A: PANTHER: a library of protein families and

subfamilies indexed by function. Genome research 2003, 13(9):2129-2141.

7. Ong SE, Blagoev B, Kratchmarova I, Kristensen DB, Steen H, Pandey A, Mann M:

Stable isotope labeling by amino acids in cell culture, SILAC, as a simple and

accurate approach to expression proteomics. Molecular & cellular proteomics : MCP

2002, 1(5):376-386.


8. Apostolou A, Shen Y, Liang Y, Luo J, Fang S: Armet, a UPR-upregulated protein,

inhibits cell proliferation and ER stress-induced cell death. Experimental cell

research 2008, 314(13):2454-2467.

9. Bi J, Guo AL, Lai YR, Li B, Zhong JM, Wu HQ, Xie Z, He YL, Lv ZL, Lau SH et al:

Overexpression of clusterin correlates with tumor progression, metastasis in gastric

cancer: a study on tissue microarrays. Neoplasma 2010, 57(3):191-197.

10. Jang GY, Jeon JH, Cho SY, Shin DM, Kim CW, Jeong EM, Bae HC, Kim TW, Lee SH,

Choi Y et al: Transglutaminase 2 suppresses apoptosis by modulating caspase 3 and

NF-kappaB activity in hypoxic tumor cells. Oncogene 2010, 29(3):356-367.

11. Yang CR, Leskov K, Hosley-Eberlein K, Criswell T, Pink JJ, Kinsella TJ, Boothman

DA: Nuclear clusterin/XIP8, an x-ray-induced Ku70-binding protein that signals

cell death. Proceedings of the National Academy of Sciences of the United States of

America 2000, 97(11):5907-5912.

12. Wong P, Pineault J, Lakins J, Taillefer D, Leger J, Wang C, Tenniswood M: Genomic

organization and expression of the rat TRPM-2 (clusterin) gene, a gene implicated

in apoptosis. The Journal of biological chemistry 1993, 268(7):5021-5031.

13. Schroder M, Kaufman RJ: The mammalian unfolded protein response. Annual review

of biochemistry 2005, 74:739-789.

14. Patil C, Walter P: Intracellular signaling from the endoplasmic reticulum to the

nucleus: the unfolded protein response in yeast and mammals. Curr Opin Cell Biol

2001, 13(3):349-355.

15. Malhotra JD, Kaufman RJ: The endoplasmic reticulum and the unfolded protein

response. Seminars in cell & developmental biology 2007, 18(6):716-731.


16. Ze Z, Chunbin Z, Kezhon Z: Role of Unfolded Protein Response in Lipogenesis. World

Journal of Hepatology 2010, 2(6).

17. Malhotra JD, Miao H, Zhang K, Wolfson A, Pennathur S, Pipe SW, Kaufman RJ:

Antioxidants reduce endoplasmic reticulum stress and improve protein secretion.

Proceedings of the National Academy of Sciences of the United States of America 2008,

105(47):18525-18530.

18. Csala M, Margittai E, Banhegyi G: Redox control of endoplasmic reticulum function.

Antioxidants & redox signaling 2010, 13(1):77-108.

19. Jones J, Nivitchanyong T, Giblin C, Ciccarone V, Judd D, Gorfien S, Krag SS,

Betenbaugh MJ: Optimization of tetracycline-responsive recombinant protein

production and effect on cell growth and ER stress in mammalian cells.


20. Smales CM, Dinnis DM, Stansfield SH, Alete D, Sage EA, Birch JR, Racher AJ,

Marshall CT, James DC: Comparative proteomic analysis of GS-NS0 murine

myeloma cell lines with varying recombinant monoclonal antibody production rate.




culture insults. Biotechnol Bioeng 2000, 67(5):555-564.


monoclonal antibodies in Chinese hamster ovary cells. Biotechnol Bioeng 2005,

91(7):779-792.


23. Tye BK: MCM proteins in DNA replication. Annual review of biochemistry 1999,

68:649-686.

24. Grill B, Wilson GM, Zhang KX, Wang B, Doyonnas R, Quadroni M, Schrader JW:

Activation/division of lymphocytes results in increased levels of cytoplasmic

activation/proliferation-associated protein-1: prototype of a new family of proteins.

Journal of immunology (Baltimore, Md : 1950) 2004, 172(4):2389-2400.

25. Kaddar T, Rouault JP, Chien WW, Chebel A, Gadoux M, Salles G, Ffrench M, Magaud

JP: Two new miR-16 targets: caprin-1 and HMGA1, proteins implicated in cell

proliferation. Biology of the cell / under the auspices of the European Cell Biology

Organization 2009, 101(9):511-524.

26. Kantardjieff A, Jacob NM, Yee JC, Epstein E, Kok YJ, Philp R, Betenbaugh M, Hu WS:

Transcriptome and proteome analysis of Chinese hamster ovary cells under low

temperature and butyrate treatment. Journal of biotechnology 2010, 145(2):143-159.

27. Mehta K, Kumar A, Kim HI: Transglutaminase 2: a multi-tasking protein in the

complex circuitry of inflammation and cancer. Biochemical pharmacology 2010,

80(12):1921-1929.

28. Griffin M, Casadio R, Bergamini CM: Transglutaminases: nature's biological glues.

The Biochemical journal 2002, 368(Pt 2):377-396.

29. Miyoshi N, Ishii H, Mimori K, Tanaka F, Hitora T, Tei M, Sekimoto M, Doki Y, Mori

M: TGM2 is a novel marker for prognosis and therapeutic target in colorectal

cancer. Annals of surgical oncology 2010, 17(4):967-972.


30. Leskov KS, Klokov DY, Li J, Kinsella TJ, Boothman DA: Synthesis and functional

analyses of nuclear clusterin, a cell death protein. The Journal of biological chemistry

2003, 278(13):11590-11600.

31. Flanagan L, Whyte L, Chatterjee N, Tenniswood M: Effects of clusterin over-

expression on metastatic progression and therapy in breast cancer. BMC cancer

2010, 10:107.

32. Kapron JT, Hilliard GM, Lakins JN, Tenniswood MP, West KA, Carr SA, Crabb JW:

Identification and characterization of glycosylation sites in human serum clusterin.

Protein science : a publication of the Protein Society 1997, 6(10):2120-2133.

33. O'Sullivan J, Whyte L, Drake J, Tenniswood M: Alterations in the post-translational

modification and intracellular trafficking of clusterin in MCF-7 cells during

apoptosis. Cell death and differentiation 2003, 10(8):914-927.

34. Morishima N, Nakanishi K, Tsuchiya K, Shibata T, Seiwa E: Translocation of Bim to

the endoplasmic reticulum (ER) mediates ER stress signaling for activation of

caspase-12 during ER stress-induced apoptosis. The Journal of biological chemistry

2004, 279(48):50375-50381.


CHAPTER 4

CHROMOSOMAL MAPPING OF CHO GENES RELATED TO CELL GROWTH

4.1 Overview

Until recently, gene order in eukaryotes was assumed to be random. However, statistical

analysis of the expression patterns of various genomes resulted in the observation that in many

cases genes with similar expression patterns are clustered together [1, 2]. Evidence was first

provided for gene clustering by Cho et al., who showed that 25% of co-expressed genes involved

in cell division in yeast were clustered by chromosomal location [3]. Subsequently, many

studies have also found strong evidence of gene clustering, including gene clustering in A.

thaliana [4, 5], clustering of muscle-specific genes in C. elegans [6], and tissue-specific gene

clustering in humans [7].

Gene clustering has major implications for genomic evolution, aging, and cellular

development. Tandem pairs of co-expressed genes are thought to be attributable to regulation by

a shared promoter [8, 9]. However, this explanation alone does not explain the broader scale of

gene clustering observed in many eukaryotes. Chromatin structure is also thought to play a key

role in regulating gene expression. In support of this, histone deacetylases in yeast were shown

to play a role in regulating expression of gene clusters, including genes involved in

gluconeogenesis and ribosomal genes by modification of chromatin structure [10]. Modification

of histones likely plays a role in regulation of the co-expression of gene clusters.

Evidence of clustering of genes involved in cell cycle arrest has been reported. Zhang et

al. showed clustering of human fibroblast genes associated with replicative senescence [11]. This


form of cell cycle arrest occurs in somatic cells which have undergone many cycles of cell

division. Using cDNA microarray analysis and statistical analysis of gene expression, they

showed that 150 of the 376 genes upregulated during senescence were clustered when using 1

Mbp as the maximum cluster size.

The oncoprotein c-MYC and its target gene MT-MC1 regulate expression of various

genes implicated in tumor proliferation [12]. In order to understand the genes targeted by MT-

MC1 (and in turn c-MYC), transcriptional profiling was used to identify differentially expressed

genes in myeloid cancer cells overexpressing MT-MC1 [13]. Interestingly, 34% of the target

genes were clustered on six chromosomal loci, which indicates some functional organization of

the complex set of tumor-controlling genes targeted by these oncoproteins.

Strong evidence of gene clustering in eukaryotes, including genes implicated in cell

growth, creates an argument for analysis of gene clustering in CHO cells. This is an emerging

area of interest in biomarker discovery, and could increase our understanding of the genetic

pathways involved in cell growth and productivity. However, the lack of a complete publicly

available CHO genome creates a significant obstacle.

In order to analyze gene co-expression, the chromosomal structure of the organism must

be well understood. Most mammalian cells are diploid, which means that they contain two sets

of autosomal chromosomes. Humans have 22 pairs of autosomal chromosomes and two sex

chromosomes, resulting in a total of 46 chromosomes in each cell. Similarly, the common

mouse Mus musculus is also diploid and has a total of 40 chromosomes per cell (see Figure 4.1).

In contrast, the Chinese hamster was an attractive model for early genetics research due to its

relatively low chromosome number of 22. Theodore T. Puck created the first cultured CHO cell


line in 1957 by extracting Chinese hamster ovary cells and growing them in a cultured

monolayer for genetic studies [14]. Researchers also found these cells to be easily adapted to

suspension cultures. Over the next several decades, CHO cells became a popular model for studying

toxicity and genetics, and a widely used gene expression platform owing to their high growth rate and

production yield [15-17].

Figure 4.1: Mouse karyotype using the Giemsa (G-banding) technique. [18]


As mentioned, Chinese hamsters only have 22 total chromosomes, a very low number for

a mammal [19]. A study of the chromatin structure of CHO cells in 1969 showed that they are

aneuploid, indicating that the chromosomes had lost their diploid properties, thus changing their

genetic composition significantly compared to the ovarian tissue cells from which they were

derived. Karyotypic analysis of CHO-K1 cells indicates that this difference is due to

translocation and deletion of certain chromatin structures from the native CHO chromosomes

(see Figure 4.2) [20].


Figure 4.2: Karyotypic analysis of diploid Chinese hamster fibroblast chromosomes (LA-CHE)

and CHO-K1 chromosomes. Altered chromosomes are marked “Z”. [20]

A more recent and thorough genetic characterization of CHO-DG44 chromosomes

identified 20 unique chromosomes, with only 7 being normal [21]. Four Z group chromosomes

originally identified in the CHO-K1 study, 7 derivative chromosomes with known origin, and 2

marker chromosomes of unidentified origin were described (see Figure 4.3). Further analysis of

different CHO-DG44 recombinant cell lines illustrated marked differences in chromosome

composition between the different lines, including aneuploidy, deletions, and complex


rearrangements. Apparently, the CHO genome is not very stable; however, this does not seem to

affect the stability of recombinant protein expression.

Figure 4.3: Karyotype of CHO-DG44 cells using Giemsa (G-banding) technique. Normal

Chinese hamster chromosomes are shown in the top row. [21]

Mouse and CHO share significant gene sequence homology [22]. Despite the

difference in chromosome number and composition, there likely exists some conservation in

gene localization between the two species. In this chapter the activity and function of genes

mapped to the mouse chromosomes are studied. Known genes of interest relevant to cell culture,

including genes identified from the proteomics work in Chapter 3, oncogenes related to tumor


proliferation, and reported target genes for cellular engineering of enhanced mammalian cell

cultures are analyzed for evidence of clustering on mouse chromosomes. Using specific

thresholds, clusters of functionally similar genes are identified. The results described here

establish initial information regarding potential CHO gene clustering, which can be further

expanded upon in the future using gene expression analysis tools to positively identify gene

clustering in CHO.

4.2 Methods

4.2.1 Chromosome Mapping

The complete Mus musculus genome was downloaded from the Global Proteome

Machine (http://www.thegpm.org). A list of genes of interest was generated from some of the

proteins identified in Chapter 3. This list was referenced to the mouse genome using MS Excel

to determine the chromosomal locations of those genes. Potential gene clusters were identified

by filtering the genes by ±150,000 bp from the starting position of each gene of interest.
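The window filter described above can be sketched in Python as follows. This is only an illustration of the logic, not the MS Excel workflow actually used; the `Gene` record, the function name `neighbors_within_window`, and the example coordinates are hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    symbol: str
    chromosome: str
    start: int  # start position in bp


def neighbors_within_window(genes, gene_of_interest, window_bp=150_000):
    """Return genes on the same chromosome whose start position lies
    within +/- window_bp of the start of gene_of_interest."""
    return [
        g for g in genes
        if g.symbol != gene_of_interest.symbol
        and g.chromosome == gene_of_interest.chromosome
        and abs(g.start - gene_of_interest.start) <= window_bp
    ]


# Hypothetical coordinates for illustration only (not real genome positions).
genome = [
    Gene("GeneA", "2", 1_000_000),
    Gene("GeneB", "2", 1_120_000),  # 120 kbp from GeneA: inside the window
    Gene("GeneC", "2", 1_400_000),  # 400 kbp from GeneA: outside the window
    Gene("GeneD", "9", 1_050_000),  # different chromosome
]
hits = neighbors_within_window(genome, genome[0])
print([g.symbol for g in hits])
```

Only `GeneB` passes the filter here, since the other candidates are either too far away or on another chromosome.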

4.2.2 Pathway Analysis

Twenty-one lists of genes corresponding to the 21 unique chromosomes of the mouse

genome were uploaded into Ingenuity Pathway Analysis (Ingenuity® Systems,

www.ingenuity.com). A Core analysis was used with both direct and indirect relationships

allowed, with a maximum of 35 proteins per network and 10 networks in total. All data


sources were selected, and all cell and tissue types were selected. No expression value cutoff

was used. All network eligible molecules were overlaid onto a global molecular network

developed from information contained in Ingenuity’s Knowledge Base. Networks of network

eligible molecules were then algorithmically generated based on their connectivity.

4.3 Results

4.3.1 Identification of Cell Growth Gene Networks

To get an initial assessment of the correlation between genetic function and chromosome

location, the set of genes corresponding to each mouse chromosome was analyzed using

Ingenuity Pathway Analysis (http://www.ingenuity.com) to determine the top networks

associated with each individual chromosome. Chromosomes which had top-scoring networks

associated with cell growth and proliferation, cell death, or protein synthesis are shown in Figure

4.4. Eleven chromosomes were associated with cell growth and proliferation, 11 were associated

with cell death, and only chromosomes 2 and 6 were associated with protein synthesis. An

example of a top-scoring network from chromosome 11 is shown in Figure 4.5.

Table 4.1: Top Gene Networks Identified in Each Mouse Chromosome


The top 5 scoring networks were determined for each mouse chromosome gene set using

Ingenuity Pathway Analysis (http://www.ingenuity.com). The chromosomes with networks

associated with cell growth and proliferation, cell death, or protein synthesis are shown.


Figure 4.5: The top network for mouse chromosome 11 is shown, based on direct and indirect

relationships. The network is associated with cell cycle, cellular growth and proliferation, and

hematological system development and function. Only genes shown in gray are present on

chromosome 11.


4.3.2 Mapping of Genes of Interest on Mouse Chromosomes

To determine the degree of clustering of genes known to be involved in regulation of cell

growth, a list of genes of interest was generated and mapped to the mouse genome. This gene

list comprised two components: a list of known oncogenes generated by the Cancer

Genome project at the Wellcome Trust Sanger Institute, and a list of known gene targets for

cellular engineering. The oncogene list was downloaded directly from the Wellcome Trust

Sanger Institute, and contained 458 genes (http://www.sanger.ac.uk/genetics/CGP/Census/). In

addition, a list of genes used as cell engineering targets was generated based on previously

reported studies [23]. This list represents all genes known to have a direct impact on cell culture

performance upon modification of their expression levels, and includes 36 genes involved in

cellular metabolism, cell cycle control, protein secretion, and apoptosis. The combined gene list

was mapped to the mouse genome, which was downloaded from the Global Proteome Machine

(http://www.thegpm.org).

Gene clusters were identified based on chromosome location. Genes with starting codons

within 400,000 bp of each other were classified as significant clusters. A summary of the gene

clustering analysis is shown in Figure 4.6.
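One way to express this clustering rule in code is sketched below. The sketch assumes that a cluster is formed by chaining genes whose consecutive start positions are no more than 400,000 bp apart; the text does not state whether chaining or an all-pairs distance rule was used, so this is an assumption. The function name `find_clusters` and the example genes are hypothetical.

```python
from collections import defaultdict


def find_clusters(genes, max_gap_bp=400_000):
    """genes: iterable of (symbol, chromosome, start_bp) tuples.
    Returns {chromosome: [cluster, ...]}, where each cluster is a list of
    two or more gene symbols whose consecutive start positions are at most
    max_gap_bp apart."""
    by_chrom = defaultdict(list)
    for symbol, chrom, start in genes:
        by_chrom[chrom].append((start, symbol))

    clusters = defaultdict(list)
    for chrom, entries in by_chrom.items():
        entries.sort()  # order genes by start position
        current = [entries[0]]
        for entry in entries[1:]:
            if entry[0] - current[-1][0] <= max_gap_bp:
                current.append(entry)
            else:
                if len(current) > 1:
                    clusters[chrom].append([s for _, s in current])
                current = [entry]
        if len(current) > 1:
            clusters[chrom].append([s for _, s in current])
    return dict(clusters)


# Hypothetical gene positions for illustration only.
genes_of_interest = [
    ("G1", "7", 100_000), ("G2", "7", 350_000), ("G3", "7", 700_000),
    ("G4", "7", 5_000_000),
    ("G5", "11", 2_000_000), ("G6", "11", 2_300_000), ("G7", "11", 9_000_000),
]
clusters = find_clusters(genes_of_interest)

# Per-chromosome summary: (number of clusters, number of clustered genes).
summary = {c: (len(cl), sum(len(x) for x in cl)) for c, cl in clusters.items()}
print(summary)
```

The `summary` dictionary holds the two quantities described for each chromosome: the number of clusters and the number of genes mapped to clusters.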


Figure 4.6: Summary of Growth Regulating Gene Clusters Identified on Mouse Chromosomes

The number of clusters identified on each chromosome is shown in blue. The number of genes

mapped to clusters is shown in red.

Evidence of gene clustering was identified on nearly every mouse chromosome. Most

clusters consisted of only two genes; however, clusters of three genes were identified on

chromosomes 6, 7, 10, and 11. Two relevant clusters identified on chromosome 7 are shown in

Figure 4.7. The cluster of ERCC2, CBLC, and BCL3 is within 530 kbp of a second cluster of

CD79A and CIC. These are all known oncogenes, and are involved in tumor proliferation. For

example, CBLC is a protein that has been shown to interact with EGFR, a cell surface receptor

which is involved in regulation of cell growth and DNA synthesis [24]. Another gene in this

cluster, Bcl3, is a proto-oncogene candidate which regulates transcription of genes associated

with tumor proliferation by interaction with NF-kappa-B [25].


Figure 4.7: Cluster of Genes of Interest on Mouse Chromosome 7.

Descriptions of top scoring cell compartment are shown in purple ovals, biological process in

blue diamonds, and molecular functions are shown in orange boxes for each gene. Each

description is followed by the total number of entries for each ontology present in GO. The red

box shows the top 5 interactants for each gene, and the red circles show the total number of

interacting proteins listed by Genecards (http://www.genecards.org).

4.3.3 Mapping of Differentially Expressed CHO Proteins on Mouse Chromosomes

The relative activity of each gene set corresponding to a mouse chromosome was

determined by mapping the CHO proteins identified from the study described in chapter 3 to

their respective gene location in the mouse genome. This approach assumes that the most active

genes are most highly represented in the list of identified proteins from the proteomics

experiment. The number of genes mapped for each mouse chromosome is shown in Figure 4.8.


Figure 4.8: Gene Expression Activity of CHO Genes Mapped to Mouse Chromosomes

The proteins identified in the proteomics study described in chapter 3 were mapped to the mouse

genome. The (A) number of genes corresponding to proteins identified for each mouse

chromosome and (B) number of identified genes divided by the total number of genes for each

chromosome are shown.

The results indicate that chromosomes 7, 2, and 11 were the most represented in terms of

the number of gene products identified in the proteomics study. These chromosomes were also

associated with cell growth and proliferation based on the networking data shown in Figure 4.4.

Chromosomes 15, 18, and 9 had the highest percentage of identified genes per chromosome,

with close to 10% of all chromosomal genes identified. The differences observed between the

two charts are due to the wide range of chromosomal gene content, as chromosomes 7, 2, and 11

contain the highest number of genes of all of the mouse chromosomes.
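The two views of the data (raw counts and counts normalized by chromosomal gene content) can be illustrated with a short Python sketch. The function name `chromosome_representation` and the toy gene-to-chromosome map are hypothetical and stand in for the mouse genome table used in this work.

```python
from collections import Counter


def chromosome_representation(identified, genome):
    """identified: iterable of gene symbols found in the proteomics data.
    genome: dict mapping gene symbol -> chromosome.
    Returns {chromosome: (number identified, fraction of chromosome genes)}."""
    totals = Counter(genome.values())
    found = Counter(genome[s] for s in set(identified) if s in genome)
    return {c: (found[c], found[c] / totals[c]) for c in totals}


# Toy genome for illustration: chromosome 7 holds more genes than 15.
toy_genome = {"a": "7", "b": "7", "c": "7", "d": "7", "e": "15", "f": "15"}
identified_genes = ["a", "b", "e", "unmapped"]

result = chromosome_representation(identified_genes, toy_genome)
print(result)
```

In this toy case chromosome 7 has twice as many identified genes as chromosome 15, yet both have the same fraction identified, which mirrors why the two charts can rank chromosomes differently.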

The data was probed for evidence of co-expression by mapping differentially expressed

genes identified in the proteomics characterization of CHO cells in exponential and stationary

growth phases (see Table 3.1) and analyzing the data for evidence of clustering. In this study,

none of the differentially expressed proteins were found to be co-localized on the mouse

chromosomes. However, some evidence of clustering with other genes of interest was identified.

A diagram illustrating some evidence of clustering between these genes is shown in Figure 4.9.

Figure 4.9: Chromosomal Mapping of Cell Growth Related Genes

Proteins of interest identified in Chapter 3, shown in red, were mapped to the mouse genome.

Genes located within 300 kbp are shown within the boxes. Genes found to have similar

functions as the mapped genes are indicated in black italics.

[Figure 4.9 gene labels as extracted, for mouse chromosomes 2, 9, and 14: Tti1, Rprd1b, Tgm2, D630003M21Rik, 1700060C20Rik, Bpi; Vprbp, Manf, Rbm15b; Esco2, Ccdc25, 1110020C17Rik, Scara3, Clu, Gulo, Adam2, Ephx2]

Both Tti1 and Rprd1b were clustered near Tgm2 on chromosome 2. Tti1 has been shown

to interact with mammalian target of rapamycin (mTOR), a member of the

phosphatidylinositol 3-kinase-related kinase (PIKK) family which regulates cell growth [26].

Immunoprecipitation and size-exclusion chromatography were used to demonstrate stabilization

of both mTORC1 and mTORC2 complexes via Tti1 binding [27]. Therefore, this protein seems

to play a key role in regulating mTOR activities in mammalian cells.

The gene Rprd1b, also known as C20orf77, was found by Jung et al. to be highly

expressed in lung cancer cells [28]. Suppression subtractive hybridization was used to analyze

gene expression in lung cancer cell lines, and a set of genes including C20orf77 was consistently

detected at high levels of expression in all lung cancer cell lines. The gene may play a role in

carcinogenesis. Due to the proximity and functional similarities in Tgm2, Tti1, and Rprd1b,

these genes are candidates for expression analysis to determine if any co-expression exists

between the three, which could indicate some shared functional impact of these genes on cell

growth in CHO cells.

Vprbp encodes the gene product DCAF1, a receptor for ubiquitin ligase [29]. Le Rouzic et

al. demonstrated cell cycle arrest upon DCAF1 interaction with Vpr [30]. Vpr hijacks the DDB1

ubiquitin ligase complex through interaction with DCAF1, causing cell cycle arrest in G2 phase.

Vprbp is located near Manf on chromosome 9. Manf is upregulated as a response to ER stress

conditions, and likely plays a role in protein folding [31]. The close proximity of these two

genes on chromosome 9 may indicate co-expression as a component of the unfolded protein

response, since cell cycle arrest has been reported as being a common trait of mammalian cells

under ER stress [32].

Both Scara3 and Clu are located on chromosome 14. Scara3 encodes the protein CSR1,

which is downregulated in prostate cancer cell lines, and is implicated as a tumor suppressor

[33]. While the mechanism of tumor suppression is not well understood, Zhu et al. showed that

CSR1 binds to CPSF3, a protein involved in conversion of heterogeneous nuclear RNA to mRNA [34].

Upon binding to CSR1, CPSF3 translocates from the nucleus to the cytoplasm, inhibiting

polyadenylation of RNA and resulting in cell death. Downregulation of CSR1 inhibits cell death

in prostate cancer cell lines. Both clusterin and CSR1 play a role in regulation of cell death, and

the co-location of these genes on chromosome 14 makes them candidates for expression analysis

in future experiments.

4.4 Conclusions

Analysis of gene co-expression is an emerging area of interest in biomarker discovery.

Identification of related sets of genes that share regulation of expression can help to elucidate

biological pathways. The analysis of the location of genes of interest on mouse chromosomes

described here gives an initial assessment of possible clustering of CHO genes related to cell

growth which may comprise part of the biological machinery driving cell growth in CHO, and

consequently may play a significant role in cell culture process performance. From the limited

data available, three potential gene clusters were identified near the genes coding for clusterin,

Manf, and transglutaminase-2, which were identified as proteins of interest in Chapter 3.

Without the full CHO genome available with chromosomal locations, it is difficult to

make an accurate assessment of CHO gene expression. The apparent genetic instability of CHO

makes the study of gene location an interesting area for future research. This study lays the

groundwork, which can be expanded using gene expression tools and a complete CHO genome

to identify gene expression patterns relevant to CHO cell culture performance and to identify

further potential growth- and productivity-related biomarkers.

4.5 References

1. Cohen BA, Mitra RD, Hughes JD, Church GM: A computational analysis of whole-

genome expression data reveals chromosomal domains of gene expression. Nature

genetics 2000, 26(2):183-186.

2. Hurst LD, Pal C, Lercher MJ: The evolutionary dynamics of eukaryotic gene order.

Nature reviews Genetics 2004, 5(4):299-310.

3. Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg

TG, Gabrielian AE, Landsman D, Lockhart DJ et al: A genome-wide transcriptional

analysis of the mitotic cell cycle. Molecular cell 1998, 2(1):65-73.

4. Birnbaum K, Shasha DE, Wang JY, Jung JW, Lambert GM, Galbraith DW, Benfey PN:

A gene expression map of the Arabidopsis root. Science (New York, NY) 2003,

302(5652):1956-1960.

5. Zhu T: Global analysis of gene expression using GeneChip microarrays. Current

opinion in plant biology 2003, 6(5):418-425.

6. Roy PJ, Stuart JM, Lund J, Kim SK: Chromosomal clustering of muscle-expressed

genes in Caenorhabditis elegans. Nature 2002, 418(6901):975-979.

7. Yang YS, Song HD, Shi WJ, Hu RM, Han ZG, Chen JL: Chromosome localization

analysis of genes strongly expressed in human visceral adipose tissue. Endocrine

2002, 18(1):57-66.


8. Lercher MJ, Blumenthal T, Hurst LD: Coexpression of neighboring genes in

Caenorhabditis elegans is mostly due to operons and duplicate genes. Genome

research 2003, 13(2):238-243.

9. Papp B, Pal C, Hurst LD: Evolution of cis-regulatory elements in duplicated genes of

yeast. Trends in genetics : TIG 2003, 19(8):417-422.

10. Robyr D, Suka Y, Xenarios I, Kurdistani SK, Wang A, Suka N, Grunstein M:

Microarray deacetylation maps determine genome-wide functions for yeast histone

deacetylases. Cell 2002, 109(4):437-446.

11. Zhang H, Pan KH, Cohen SN: Senescence-specific gene expression fingerprints reveal

cell-type-dependent physical clustering of up-regulated chromosomal loci. Proceedings of

the National Academy of Sciences of the United States of America 2003, 100(6):3251-3256.

12. Nesbit CE, Tersak JM, Prochownik EV: MYC oncogenes and human neoplastic

disease. Oncogene 1999, 18(19):3004-3016.

13. Rogulski KR, Cohen DE, Corcoran DL, Benos PV, Prochownik EV: Deregulation of

common genes by c-Myc and its direct target, MT-MC1. Proceedings of the National

Academy of Sciences of the United States of America 2005, 102(52):18968-18973.

14. Tjio JH, Puck TT: Genetics of somatic mammalian cells. II. Chromosomal

constitution of cells in tissue culture. The Journal of experimental medicine 1958,

108(2):259-268.

15. Kao FT, Puck TT: Genetics of somatic mammalian cells. IX. Quantitation of

mutagenesis by physical and chemical agents. Journal of cellular physiology 1969,

74(3):245-258.


16. Walters RA, Petersen DF: Radiosensitivity of mammalian cells. I. Timing and dose-

dependence of radiation-induced division delay. Biophysical journal 1968,

8(12):1475-1486.

17. Jayapal KP, Wlaschin KF, Yap MG, Hu W-S: Recombinant protein therapeutics from

CHO cells - 20 years and counting. Chemical Engineering Progress 2007, 103(7):40-

47.

18. Akeson EC, Davisson MT: Mitotic chromosome preparations from mouse cells for

karyotyping. Current protocols in human genetics / editorial board, Jonathan L Haines

[et al] 2001, Chapter 4:Unit4.10.

19. Yerganian G: Cytogenetic possibilities with the Chinese hamster, Cricetulus barabensis

griseus. Genetics 1952, 37:638.

20. Deaven LL, Petersen DF: The chromosomes of CHO, an aneuploid Chinese hamster

cell line: G-band, C-band, and autoradiographic analyses. Chromosoma 1973,

41(2):129-144.

21. Derouazi M, Martinet D, Besuchet Schmutz N, Flaction R, Wicht M, Bertschinger M,

Hacker DL, Beckmann JS, Wurm FM: Genetic characterization of CHO production

host DG44 and derivative recombinant cell lines. Biochemical and biophysical

research communications 2006, 340(4):1069-1077.

22. Johnson KC, Jacob NM, Nissom PM, Hackl M, Lee LH, Yap M, Hu WS: Conserved

microRNAs in Chinese hamster ovary cell lines. Biotechnology and bioengineering

2011, 108(2):475-480.


23. Lim Y, Wong NS, Lee YY, Ku SC, Wong DC, Yap MG: Engineering mammalian cells

in bioprocessing - current achievements and future perspectives. Biotechnology and

applied biochemistry 2010, 55(4):175-189.

24. Oda K, Matsuoka Y, Funahashi A, Kitano H: A comprehensive

pathway map of epidermal growth factor receptor signaling. Molecular systems

biology 2005, 1:2005.0010.

25. Dyer MJ, Oscier DG: The configuration of the immunoglobulin genes in B cell

chronic lymphocytic leukemia. Leukemia : official journal of the Leukemia Society of

America, Leukemia Research Fund, UK 2002, 16(6):973-984.

26. Laplante M, Sabatini DM: mTOR signaling at a glance. Journal of cell science 2009,

122(Pt 20):3589-3594.

27. Kaizuka T, Hara T, Oshiro N, Kikkawa U, Yonezawa K, Takehana K, Iemura S,

Natsume T, Mizushima N: Tti1 and Tel2 are critical factors in mammalian target of

rapamycin complex assembly. The Journal of biological chemistry 2010,

285(26):20109-20116.

28. Jung HM, Choi SJ, Kim JK: Expression profiles of SV40-immortalization-associated

genes upregulated in various human cancers. Journal of cellular biochemistry 2009,

106(4):703-713.

29. Zhang S, Feng Y, Narayan O, Zhao LJ: Cytoplasmic retention of HIV-1 regulatory

protein Vpr by protein-protein interaction with a novel human cytoplasmic protein

VprBP. Gene 2001, 263(1-2):131-140.

30. Le Rouzic E, Belaidouni N, Estrabaud E, Morel M, Rain JC, Transy C, Margottin-Goguet

F: HIV1 Vpr arrests the cell cycle by recruiting DCAF1/VprBP, a

receptor of the Cul4-DDB1 ubiquitin ligase. Cell cycle (Georgetown, Tex) 2007,

6(2):182-188.

31. Apostolou A, Shen Y, Liang Y, Luo J, Fang S: Armet, a UPR-upregulated protein,

inhibits cell proliferation and ER stress-induced cell death. Experimental cell

research 2008, 314(13):2454-2467.

32. Brewer JW, Hendershot LM, Sherr CJ, Diehl JA: Mammalian unfolded protein

response inhibits cyclin D1 translation and cell-cycle progression. Proceedings of the

National Academy of Sciences of the United States of America 1999, 96(15):8505-8510.

33. Yu G, Tseng GC, Yu YP, Gavel T, Nelson J, Wells A, Michalopoulos G, Kokkinakis D,

Luo JH: CSR1 suppresses tumor growth and metastasis of prostate cancer. The

American journal of pathology 2006, 168(2):597-607.

34. Zhu ZH, Yu YP, Shi YK, Nelson JB, Luo JH: CSR1 induces cell death through

inactivation of CPSF3. Oncogene 2009, 28(1):41-51.


CHAPTER 5

PROTEOMICS CHARACTERIZATION OF HOST CELL PROTEINS PRESENT IN

VARIOUS STAGES OF A BIOPHARMACEUTICAL PROCESS

5.1 Overview

The secreted proteins generated by mammalian host cells during cell culture have

significant implications for the entire biopharmaceutical process. For one, the interaction of

mammalian cells is mediated by soluble elements released by cells into the extracellular

environment. Major classes of secreted proteins include cytokines, proteases, protein hormones,

growth factors, chemokines, and extracellular matrix proteins. In addition to proteins which

are released from the cells by the classical secretory pathway, other extracellular proteins can be

shed from the cell membrane, or released through other pathways such as exosomes [1].

Proteomics has been used to study secreted proteins released from cancer cells, with the

hopes of discovering biomarkers that could be used for early diagnosis or elucidating disease

pathways, and which would only require analysis of biological fluids rather than tissue biopsies

[2]. Kawanishi et al. compared the secretomes of poorly invasive RT112 bladder carcinoma

cells to highly invasive T24 cells using shotgun proteomics and cDNA microarray analysis [3].

The T24 cancer cells were found to overexpress several proteins including the chemokine

CXCL1, which is associated with tumor progression. Urine levels of CXCL1 also correlated

with disease invasiveness.

As another example, shotgun proteomics analysis of conditioned media from three

prostate cancer cell lines identified several potential biomarkers [4]. These included follistatin,


pentraxin 3, and spondin 2. Testing of serum levels of these proteins by ELISA showed that

levels of these proteins were increased in prostate cancer patients compared to healthy controls,

and also correlated with levels of prostate specific antigen (PSA), the currently established tumor

marker for prostate cancer.

The growth-regulating protein clusterin, described previously in Chapter 3, is expressed as

both a nuclear and secreted form, with the latter having anti-apoptotic function [5]. It is likely

that proteins such as secreted clusterin may be expressed by CHO cells and are present in the

conditioned media, playing roles in cell growth and apoptosis. Proteomics profiling of the CHO

secretome, by analysis of the harvested cell culture fluid (HCCF), would enable the identification

of these proteins, and help to understand the impact they may have on cell culture performance.

In addition to the impact of secreted proteins on cell culture, these proteins also play a

role during the downstream purification process. The presence of secreted host cell proteins in

the final drug product may cause immunogenic responses in patients if not cleared to an

acceptably low level. For this reason, it is important to demonstrate clearance of host cell

proteins using a suitable analytical method at each step of the downstream process. The gold

standard method for host cell protein analysis is an immunoassay platform such as an ELISA,

which utilizes a polyclonal antibody (pAb) targeting various secreted host cell proteins released

from the host cells [6]. Known limitations of these methods include potential binding

interference from the sample matrix, as well as the risk of the generic pAb lacking specificity to

certain secreted proteins. The latter becomes more likely as the same antibody is typically used

in different host cell protein assays for different cell culture processes, which may not

necessarily secrete the same proteins. For these reasons, a supplementary method utilizing mass


spectrometry would be beneficial towards identifying and quantifying secreted host cell proteins

from various intermediate steps of a biopharmaceutical process.

This chapter describes the application of proteomics methodology to the analysis of

secreted proteins present in the extracellular matrix of a CHO cell culture, as well as in the

subsequent intermediate steps of downstream purification. The proteomics dataset was analyzed

to determine which proteins may have an impact on cell culture performance. Several growth-

related proteins were identified. In addition, analysis of physicochemical properties and protein

pathways and networks helps to explain the relevance of the proteins present in the extracellular

matrix and how they were cleared during downstream purification. The results described here

increase our understanding of both the upstream cell culture and downstream purification

processes, and further demonstrate proteomics as an important analytical tool for the

characterization of the biopharmaceutical process.

5.2 Methods

5.2.1 Purification of Cell Culture Harvest

A CHO cell culture expressing a recombinant monoclonal antibody was harvested from a 400 L

bioreactor and subjected to centrifugation and filtration to isolate the supernatant from the

cellular material. The resulting harvested cell culture fluid (HCCF) was purified by sequential

separation steps including affinity, ion exchange, and phenyl chromatography. After each step,

the pertinent chromatographic fraction was collected on-line into a vessel and aliquoted to an

appropriate tube. The titer of the HCCF sample was measured by protein A chromatography

with UV detection. The protein concentrations of the other samples were measured by A280.

The samples were stored at -70 °C until analysis.

5.2.2 Protein Digestion

For each sample, a volume equivalent to 200 µg of protein was dried by SpeedVac and

reconstituted in 10 µL of water, 5 µL of 1 M Tris pH 8.0 buffer, 75 µL of 8 M Guanidine HCl,

and 10 µL of 100 mM DTT. The samples were incubated at 60 °C for 30 minutes. After the

samples reached room temperature, 10 µL of 250 mM iodoacetamide was added to each, and

samples were incubated for 60 min at ambient temperature in the dark. The alkylation reaction

was quenched by adding 5 µL of 1 M DTT to each tube and mixing. A 5X volume of cold

acetone was added to each sample, and they were incubated at -20 °C for 4 hrs. Samples were

then centrifuged at 10,000 × g for 10 min. The supernatant was removed, and the pellet was

reconstituted in 20 µL of 8 M Guanidine HCl and 180 µL of 50 mM Tris pH 8.0. Ten

micrograms of trypsin was added to each sample, and the samples were incubated for 18 hrs at 37 °C.
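The working concentrations implied by these volumes can be checked with a short calculation. The sketch below assumes that volumes are additive, that the dried protein contributes negligible volume, and that no protein is lost during acetone precipitation; the helper `final_conc` and the variable names are introduced here for illustration only.

```python
def final_conc(stock_conc, stock_vol_ul, total_vol_ul):
    """Concentration after dilution (same units as stock_conc)."""
    return stock_conc * stock_vol_ul / total_vol_ul


# Denaturation/reduction mix: water + Tris + guanidine HCl + DTT (µL).
denat_total_ul = 10 + 5 + 75 + 10                             # 100 µL
guanidine_M = final_conc(8.0, 75, denat_total_ul)             # 6 M guanidine HCl
tris_mM = final_conc(1000.0, 5, denat_total_ul)               # 50 mM Tris
dtt_mM = final_conc(100.0, 10, denat_total_ul)                # 10 mM DTT

# Alkylation: 10 µL of 250 mM iodoacetamide added to the 100 µL mix.
iam_mM = final_conc(250.0, 10, denat_total_ul + 10)           # about 22.7 mM

# Digestion buffer: 20 µL of 8 M guanidine HCl + 180 µL of 50 mM Tris.
digest_total_ul = 20 + 180                                    # 200 µL
digest_guanidine_M = final_conc(8.0, 20, digest_total_ul)     # 0.8 M

# Enzyme-to-substrate ratio: 10 µg trypsin per 200 µg protein (1:20 w/w).
enzyme_to_substrate = 10 / 200

print(guanidine_M, tris_mM, dtt_mM, round(iam_mM, 1),
      digest_guanidine_M, enzyme_to_substrate)
```

Under these assumptions the iodoacetamide is in roughly twofold molar excess over DTT during alkylation, and the guanidine concentration is reduced from 6 M to 0.8 M before trypsin is added.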

5.2.3 HPLC Fractionation

Tryptic digests were fractionated using reversed-phase chromatography. The digest was loaded

onto a Waters XBridge C18 column (2.1 x 150 mm) heated to 45°C at a flow rate of 300 µL/min.

Mobile phase B was acetonitrile. The gradient conditions were set to 2% B for 5 min, then 2% -

10% B over 5 min, followed by 10% - 40% B over 25 minutes. Fractions were collected every 2

min from 13 to 45 min, for a total of 16 fractions per sample. Each fraction was evaporated to

dryness immediately after collection.
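The gradient program and fraction collection windows can be written out explicitly, as in the following sketch. It covers only the three gradient segments given in the text (0 to 35 min); the program after 35 min is not described, so the function returns `None` there. The names `percent_b` and `fraction_windows` are hypothetical.

```python
def percent_b(t_min):
    """%B at time t_min for the fractionation gradient described above."""
    if t_min < 0:
        raise ValueError("time must be non-negative")
    if t_min <= 5:
        return 2.0                                      # hold at 2% B
    if t_min <= 10:
        return 2.0 + (10.0 - 2.0) * (t_min - 5) / 5     # 2% to 10% B over 5 min
    if t_min <= 35:
        return 10.0 + (40.0 - 10.0) * (t_min - 10) / 25  # 10% to 40% B over 25 min
    return None  # not specified in the text


# One fraction every 2 min from 13 to 45 min.
fraction_windows = [(start, start + 2) for start in range(13, 45, 2)]

print(len(fraction_windows), fraction_windows[0], fraction_windows[-1])
print(percent_b(7.5), percent_b(22.5))
```

The list of windows confirms the count of 16 fractions per sample stated in the text.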


5.2.4 LC/MS

Reversed-phase fractions were evaporated to dryness and reconstituted in 100 µL of 0.1% formic

acid in water. Each fraction was analyzed in duplicate by LC/MS using 40 µL injection

volumes. An Agilent 1200 HPLC was connected to a Thermo Scientific Orbitrap Discovery

mass spectrometer. Separation was achieved using a Waters Acquity HSS T3 C18 column (1.0 x

100 mm) heated to 55°C. Mobile phase A was 0.1% formic acid in water, and mobile phase B

was 0.1% formic acid in acetonitrile. Peptides were separated with a flow of 70 µL/min at a steady

2% B for 5 min, then 2% - 35% B over 120 min, followed by a wash step at 90% B and re-

equilibration at 2% B for 20 min. The column was temperature controlled at 55°C. The mass

spectrometer was set up to scan MS followed by MS/MS on the top 8 precursor ions. In MS

mode, a mass range of 400 – 1500 m/z was scanned. For MS/MS scans, collisionally induced

dissociation (CID) was used with a collision energy of 35 and an isolation width of 3.0. Each

MS/MS event consisted of 2 microscans. Dynamic exclusion was enabled with a repeat

count of 2, for a 15 sec window.

5.2.5 Data Analysis

The proteins were identified using Thermo Scientific Proteome Discoverer 1.1. The Sequest

algorithm was used to search MS/MS data against a mouse sequence database downloaded from

Uniprot on 12/07/2009. The dataset from each reactor was processed independently. The

proteins were identified with 10 ppm mass accuracy for precursor ions, and 0.6 Da for product

ions. The MS/MS data was searched with static modifications set to Cys carbamidomethylation.

Dynamic modifications were set to asparagine and glutamine deamidation, and methionine

oxidation. A reversed-sequence (decoy) version of the database was also searched to estimate false discovery rates. Each

peptide was identified with less than a 5% false discovery rate.
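
The reversed-database estimate of false discovery rate described above can be sketched as follows. This is a minimal illustration, assuming each peptide-spectrum match (PSM) carries a score and a flag saying whether it matched the reversed database. The function names and toy scores are hypothetical, and the exact procedure used by Proteome Discoverer is not stated in the text.

```python
def fdr_at_threshold(psms, threshold):
    """Estimate FDR as decoy hits / target hits among PSMs scoring >= threshold.

    psms is a list of (score, is_decoy) tuples.
    """
    targets = sum(1 for score, is_decoy in psms if score >= threshold and not is_decoy)
    decoys = sum(1 for score, is_decoy in psms if score >= threshold and is_decoy)
    return decoys / targets if targets else 0.0


def lowest_passing_threshold(psms, max_fdr=0.05):
    """Return the lowest observed score cutoff whose estimated FDR is <= max_fdr."""
    best = None
    for cutoff in sorted({score for score, _ in psms}, reverse=True):
        if fdr_at_threshold(psms, cutoff) <= max_fdr:
            best = cutoff
    return best


# Toy data (hypothetical scores): 25 target hits and 4 decoy hits
toy_psms = (
    [(3.0, False)] * 20
    + [(2.5, True)]
    + [(2.0, False)] * 5
    + [(1.5, True)] * 3
)
cutoff = lowest_passing_threshold(toy_psms)
print(cutoff, fdr_at_threshold(toy_psms, cutoff))
```

In this toy set, lowering the cutoff past 2.0 admits three more decoy hits and pushes the estimate above 5%, so 2.0 is the lowest acceptable cutoff.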

Proteins were classified by biological process and by protein class using PANTHER, an online

tool which uses annotations such as gene ontology to classify genes by category such as

biological process or molecular function [7].

5.3 Results

5.3.1 Identification of Secreted Proteins in Process Intermediate Samples

Samples were obtained from various stages of an IgG process, starting with the harvested

cell culture fluid (HCCF), which is conditioned cell culture media subjected to centrifugation to

remove solid cellular material, followed by filtration. After this step, the material is subjected to

a series of chromatography steps which include Protein A affinity chromatography, anion

exchange chromatography, cation exchange chromatography, and phenyl chromatography. The

anion exchange and phenyl columns were operated in flow-through mode, in which the IgG is not

captured by the column; the IgG was therefore recovered in the flow-through fractions rather than

in the eluate fractions.

Figure 5.1: Downstream Process Overview

The process utilized four stages of chromatography followed by ultrafiltration / diafiltration. The

column eluates from ProA and cation exchange chromatography steps were collected, as well as

flow-through fractions of the anion-exchange and phenyl steps as these columns were operated

in flow-through mode for purification (no binding of IgG).

Samples from each intermediate step were collected and analyzed using a shotgun

proteomics approach described in section 5.2. Each sample was subjected to trypsin digestion,

reversed-phase HPLC fractionation, and LC/MS analysis using an Orbitrap Discovery mass

spectrometer. Each fraction was analyzed in duplicate. A total of 2671 peptides were identified

in the HCCF sample, which corresponded to 323 proteins. In the subsequent process

intermediate samples, approximately 80 proteins were identified in each sample, from several

hundred peptide IDs.
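
The identification counts can be summarized with a few ratios. The sketch below uses the per-sample counts reported in Figure 5.2; the function names are hypothetical. Note that these are counts of identified proteins, not measured protein amounts, so the ratios describe identification depth rather than quantitative clearance.

```python
# Sample -> (proteins identified, peptides identified), as reported in Figure 5.2
COUNTS = {
    "HCCF": (323, 2671),
    "ProA": (80, 473),
    "AEX": (82, 345),
    "CIEX": (84, 430),
    "Phenyl": (82, 503),
}


def peptides_per_protein(counts):
    """Average number of peptide identifications supporting each protein."""
    return {sample: peptides / proteins for sample, (proteins, peptides) in counts.items()}


def identified_relative_to(counts, reference="HCCF"):
    """Number of identified proteins in each sample relative to the reference sample."""
    ref_proteins = counts[reference][0]
    return {sample: proteins / ref_proteins for sample, (proteins, _) in counts.items()}


for sample, ratio in peptides_per_protein(COUNTS).items():
    print(f"{sample}: {ratio:.2f} peptides per protein")
```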

Figure 5.2: Protein Identification Summary

In-process samples were analyzed using a two-dimensional LC/MS method. Peptides were

identified by Sequest search against the mouse sequence database (Uniprot Dec 2010). Each

peptide was identified with <5% false discovery rate.

Sample | # Proteins ID'd | # Peptides
HCCF | 323 | 2671
ProA | 80 | 473
AEX | 82 | 345
CIEX | 84 | 430
Phenyl | 82 | 503

5.3.2 Implications of Secreted CHO Proteins on Cell Culture

As mentioned in section 5.1, secreted proteins can play important roles in cellular processes

associated with disease progression. In this study, the analysis of secreted CHO proteins can

potentially identify proteins which may be involved in various important processes including cell

signaling, cell adhesion, and regulation of cell growth. However, the proteins identified in the

extracellular matrix of the harvested cell culture are not all secreted proteins. Some of the

proteins are intracellular proteins, which have leaked into the extracellular matrix. Proteins can

leak out of cell membranes due to the loss of integrity of the cellular membrane. This will

happen as a result of apoptosis, which causes the cells to break up into smaller vesicles. Another

source of cell leakage is during the centrifugation process, when the cells are subjected to

mechanical shearing forces which can disrupt cell membranes. In order to identify the presence

of extracellular and intracellular proteins in the cell culture harvest, PANTHER

(http://www.pantherdb.org) was used to categorize proteins based on cellular compartment as

shown in Figure 5.3. Fifty-nine of the proteins were classified as extracellular, while the rest

were either membrane or intracellular proteins.

Figure 5.3: Cellular Compartment of Proteins Identified in HCCF

The list of proteins identified in the harvested cell culture fluid (HCCF) was uploaded to

PANTHER (http://www.pantherdb.org) and analyzed by cellular compartment term in Gene

Ontology. The proteins classified as extracellular were counted, and any other proteins not

included in that category were counted as intracellular.
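
The counting rule in this caption can be sketched as a simple tally. The example is illustrative only: the protein identifiers and annotation strings below are hypothetical placeholders, not PANTHER output, and the matching on the word "extracellular" is an assumption about how the category would be recognized.

```python
from collections import Counter


def tally_compartments(annotations):
    """Count proteins as extracellular if any annotation term mentions it, else intracellular.

    annotations maps a protein identifier to a list of cellular compartment terms.
    """
    counts = Counter()
    for terms in annotations.values():
        is_extracellular = any("extracellular" in term.lower() for term in terms)
        counts["extracellular" if is_extracellular else "intracellular"] += 1
    return counts


# Hypothetical annotations for illustration
toy_annotations = {
    "protein_A": ["extracellular region"],
    "protein_B": ["extracellular matrix", "plasma membrane"],
    "protein_C": ["cytosol"],
    "protein_D": ["cytosol", "nucleus"],
    "protein_E": ["plasma membrane"],
}
print(tally_compartments(toy_annotations))
```

With the counts reported in the text (59 extracellular proteins out of 323 identified in HCCF), the extracellular share is about 18%.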

This result indicates that many intracellular proteins are leaked into the extracellular matrix

during the cell culture process. These intracellular proteins are likely high-abundance proteins

from the cytosol of the cell, or membrane proteins cleaved from the cell surface. The proteins

identified in the HCCF sample were categorized by protein class using PANTHER

(www.pantherdb.org). The two major classes of proteins were transferases (145 proteins) and

oxidoreductases (148 proteins). In addition, 80 proteases were identified in the extracellular

matrix.

Figure 5.4: Classification of Secreted Proteins from Cell Culture Harvest

The list of proteins identified in HCCF was uploaded to PANTHER (www.pantherdb.org) and

classified by protein class.

Proteases are important from a biopharmaceutical process perspective because residual

protease activity can cause clipping of the IgG product [8]. Similarly, the

presence of oxidoreductases could disrupt the structure of the IgG by reducing the interchain

disulfide bonds holding the light chains and heavy chains together. The data in Figure 5.4

indicate that substantial numbers of these proteins are present in the extracellular matrix.

The top 5 most abundant proteins identified in HCCF based on spectral counts were the

chaperone HSPA8, pyruvate kinase isozyme M1/M2, glyceraldehyde-3-phosphate

dehydrogenase (GAPDH), alpha-enolase, and fatty acid binding protein (FABP4). Information

related to these proteins is shown in Table 5.1.

Table 5.1: Top 5 Most Abundant Proteins in Cell Culture Harvest

The top 5 native CHO proteins identified in HCCF as indicated by number of spectral counts are

shown.

Gene | Description | # AAs | MW [kDa] | calc. pI | ΣCoverage | Σ# Peptides | Function
Hspa8 | Heat shock cognate 71 kDa protein | 646 | 70.8 | 5.52 | 60.84 | 183 | Protein folding and transport
Pkm2 | Pyruvate kinase isozymes M1/M2 | 531 | 57.8 | 7.47 | 32.02 | 167 | Glycolysis, cell death and tumor proliferation
Gapdh | Glyceraldehyde-3-phosphate dehydrogenase | 333 | 35.8 | 8.25 | 38.44 | 148 | Carbohydrate metabolism, early secretory pathway
Eno1 | Alpha-enolase | 434 | 47.1 | 6.80 | 33.64 | 147 | Glycolysis, cell growth control, stimulates IgG production
Fabp4 | Fatty acid-binding protein | 132 | 14.6 | 8.40 | 38.64 | 97 | Fatty acid uptake, transport, and metabolism

Heat shock cognate 71 kDa is a molecular chaperone involved in protein folding [9]. It

also functions to disassemble clathrin-coated vesicles.

Fatty acid binding protein (FABP4) plays a role in lipid transport across cell membranes

[10]. As described in Chapter 3, there is strong evidence of an unfolded protein response in

CHO cells used in mammalian cell culture. One result of the UPR is increased lipid metabolism,

which is necessary to increase the physical size of the ER in order to achieve higher protein

folding capacity within the cells [11]. Fatty acid binding protein may play a role in this process,

as it appears to be expressed at a high level in the CHO cells.

Three glycolytic enzymes were in the “top 5” list. Alpha-enolase is a glycolytic enzyme

that catalyzes the ninth step of glycolysis, the conversion of 2-phosphoglycerate to

phosphoenolpyruvate (PEP). Glyceraldehyde-3-phosphate dehydrogenase (GAPDH) is an

enzyme that catalyzes the sixth step of glycolysis, the conversion of

glyceraldehyde-3-phosphate to 1,3-bisphosphoglycerate. In addition, GAPDH has been

shown to initiate apoptosis and have transcription-regulating activity [12, 13].

Pyruvate kinase (PK) is a glycolytic enzyme that catalyzes the conversion of

phosphoenolpyruvate (PEP) into pyruvate in the tenth step of glycolysis. Pyruvate is then either

converted into lactate, which is excreted, or enters the citric acid cycle. In normal tissues the M1 isoform

of PK is observed, while it has been reported that in rapidly dividing cells and especially in

cancer cells, the M2 isoform of pyruvate kinase is the prevalent form [14]. The upregulation of

pyruvate kinase M2 leads to an increased rate of glycolysis and increased lactate production [15].

This phenomenon, widely observed in cancer cells, is known as the Warburg effect; it was

first observed 75 years ago by Otto Warburg [16]. The upregulation of PKM2 was thought to

explain the onset of the Warburg effect in highly proliferating cancer cells. However, a recent

study by Bluemlein et al. used quantitative mass spectrometry to show that PKM1 and PKM2

levels are tissue specific, and do not significantly change between normal and cancerous tissues

[17]. The link between PKM2 and cancer metabolism remains unclear.

The M2 isoform of PK is generated by alternative splicing, resulting in a different

primary structure compared to PKM1 [18]. The BLAST alignment is shown in Figure 5.5.

Figure 5.5: Sequence Alignment of Pyruvate Kinase M1 and M2

The sequences of both M1 and M2 isoforms of pyruvate kinase (Uniprot accession P52480) were

compared using BLAST sequence alignment.

The difference in primary structure results in the generation of several unique tryptic

peptides which distinguish the two isoforms of PK. Data analysis identified peptides unique to

isoform M2 in the HCCF sample, while peptides unique to isoform M1 were not detected. An

MS/MS spectrum for the peptide T48 which corresponds to residues 423-433 of the M2 isozyme

is shown in Figure 5.6.
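
The idea of isoform-distinguishing tryptic peptides can be illustrated with an in-silico digest. The sketch below is a simplified assumption of trypsin specificity (cleavage after K or R, except before P, with no missed cleavages). The two input sequences are short hypothetical stand-ins, not the real PKM1 and PKM2 sequences; only the peptide CCSGAIIVLTK is taken from the text, and the function names are illustrative.

```python
def tryptic_digest(sequence):
    """Split a protein sequence after K or R, except when the next residue is P."""
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def unique_peptides(seq_a, seq_b):
    """Return (peptides only in seq_a, peptides only in seq_b)."""
    a = set(tryptic_digest(seq_a))
    b = set(tryptic_digest(seq_b))
    return a - b, b - a


# Hypothetical toy isoforms that differ in one internal region
toy_m1 = "AAGKLLSTRCWWAVIPVTKGG"
toy_m2 = "AAGKLLSTRCCSGAIIVLTKGG"

only_m1, only_m2 = unique_peptides(toy_m1, toy_m2)
print(only_m1, only_m2)
```

Peptides shared by both isoforms cannot discriminate between them, so only peptides in the difference sets are useful as evidence for one isoform.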

Figure 5.6: MS/MS Spectrum for Pyruvate Kinase M2 Peptide T48 (CCSGAIIVLTK)

The tandem MS spectrum of pyruvate kinase M2 tryptic peptide T48 is shown. The matching b-

type ions are shown in red, and y-type ions shown in blue.

The presence of high levels of glycolytic enzymes in the HCCF indicates that these

proteins are expressed at high levels to help drive cellular metabolism. In particular, the

presence of PKM2 indicates that these cells may be operating under metabolic conditions similar

to tumor cells. In addition, expression of PKM2 can also produce high levels of lactate, which

can trigger apoptosis in cell culture [19].

The top 5 proteins shown in Table 5.1 are intracellular proteins that have leaked out of

cells with compromised cell membrane integrity. To better understand the function of the

secreted proteins identified in the HCCF, the list of 59 extracellular proteins as identified by

PANTHER was subjected to pathway analysis using IPA (http://www.ingenuity.com). The top

network identified is shown in Figure 5.7.

Figure 5.7: Top Scoring Network of Extracellular Proteins in HCCF

Genes correlating to extracellular proteins identified in the HCCF were analyzed for functional

networks using Ingenuity. Gene names indicated in gray are identified in the HCCF, while white

are genes not identified in the sample.

This network has a score of 24, includes 15 of the extracellular proteins identified in

HCCF, and is associated with cancer, cell death and proliferation, and cellular movement. Most

of the relationships indicated in this network are indirect through other proteins not identified in

this study, including leptin (LEP), and tumor protein p53 (TP53). This network provides strong

evidence that the CHO cells are secreting many proteins that are associated with cell growth and

apoptosis.

Four proteins identified in this network were of particular interest due to their reported

cell-growth regulating properties. Clusterin, as reported in chapter 3, is a protein that is present

in both secreted and nuclear forms. In that study, western blotting analysis of clusterin indicated

that the nuclear form of clusterin is upregulated during the transition of CHO cells into stationary

phase. Here, we see evidence of a secreted anti-apoptotic form of clusterin present in the

extracellular matrix.

Cysteine rich, angiogenic inducer 61 (CYR61) is a secreted, heparin-binding protein.

CYR61 is an immediate-early gene which is activated by growth factor stimulation. It is

expressed transiently throughout the cell cycle through G1, causing the protein to accumulate in

rapidly growing cells despite a short half-life [20]. One of the most important functions of

CYR61 is the promotion of cell adhesion. In cell culture, CYR61 immobilized to a solid surface

causes cell adhesion through integrins and heparan sulfate proteoglycans (HSPGs) [21]. CYR61

also has roles in cell growth. Babic et al. showed that expression of CYR61 in adenocarcinoma

cell lines promoted angiogenesis and tumor proliferation [22]. In addition, studies inhibiting

expression of CYR61 by siRNA in ovarian cancer cells resulted in decreased proliferation and

increased apoptosis [23].

The Fas receptor (Fas) is a death receptor on the surface of cells that triggers caspase-

dependent apoptosis [24]. It is also known as CD95, Apo-1, and tumor necrosis factor receptor

superfamily, member 6 (TNFRSF6). Upon binding of Fas ligand (FasL), the death-inducing

signaling complex is formed, which leads to activation of caspase-8. The protein is observed in a

secreted form, which may be due to cleavage from the cell membrane.

Translationally controlled tumor protein (TCTP) is widely expressed in many tissues and

is present in both intra- and extracellular environments. It has been implicated in important cellular

processes, including cell growth, cell cycle progression, and malignant transformation and in the

protection of cells against various stress conditions and apoptosis [25, 26]. It binds several

growth-regulating proteins. Liu et al. showed that TCTP interacts with the anti-apoptotic protein

Mcl-1, modulating the function of the protein by protecting it from degradation [27].

Overexpression of TCTP in lung carcinoma cells was shown to reverse p53 mediated apoptosis,

by destabilization of p53 upon TCTP binding [28].

Screening of growth and apoptosis related proteins in the extracellular matrix could

provide valuable information about cell culture performance. The relative abundance of these

proteins in the extracellular matrix may correlate with cell growth, or may be early indicators of

apoptosis in cell culture. In addition, these proteins may be valuable markers for targeted

engineering to manipulate cell growth and apoptosis.

5.3.3 Implications of Secreted CHO Proteins on Downstream Purification

Proteomics analysis of CHO host cell proteins from intermediate steps of the purification process

was used to assess host cell protein clearance. As shown in Figure 5.1, four modes of

chromatography were used to purify an IgG from HCCF. The proteins identified in the

intermediate samples are shown in Table 5.2. These co-purified proteins are defined as proteins

that were observed in any of the four intermediate samples. Many of the proteins identified in

later intermediate process samples were contaminants such as keratin. These proteins were

excluded from the list of co-purified proteins shown in Table 5.2. In order to understand the

physicochemical properties of the co-purified proteins, distributions of pI, molecular weight, and

grand average of hydropathicity (GRAVY) were determined based on the spectral counts

measured for each protein.
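
For reference, a GRAVY value is the mean Kyte-Doolittle hydropathy of all residues in a sequence; positive values indicate more hydrophobic sequences. The sketch below is a minimal implementation of that definition, not the code behind the Gravy Calculator used in this work, whose implementation is not described here. The function name is illustrative, and the example sequence is the PKM2 peptide T48 from Figure 5.6.

```python
# Kyte-Doolittle hydropathy index for the 20 standard amino acids
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Return the mean Kyte-Doolittle hydropathy of a one-letter amino acid sequence."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("sequence must not be empty")
    return sum(KYTE_DOOLITTLE[residue] for residue in sequence) / len(sequence)


print(round(gravy("CCSGAIIVLTK"), 2))
```

Most proteins in Table 5.2 have negative GRAVY values, which on this scale corresponds to predominantly hydrophilic sequences.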

Table 5.2: List of Co-Purified Host Cell Proteins

The proteins shown in this list are observed in at least one of the intermediate samples. Spectral

counts for each protein are shown from each of the four intermediate samples. pI and MW

values were obtained from Sequest, while GRAVY values were obtained from the Gravy

Calculator (http://www.gravy-calculator.de/).

Physicochemical properties: pI, MW, GRAVY. Spectral counts: Affinity, Anion, Cation, Phenyl.

Description | pI | MW (kDa) | GRAVY | Affinity | Anion | Cation | Phenyl
Plexin D1 | 6.8 | 193 | -0.19 | 11 | 11 | 4 | 17
Progesterone immunomodulatory binding factor 1 | 9.1 | 25 | -0.89 | 11 | 2 | 10 | 11
Kinesin family member 24 | 7.2 | 135 | -0.68 | 6 | 3 | 12 | 6
Thrombospondin 3 (Fragment) | 8.0 | 11 | 0.22 | 5 | 15 | 7 | 6
Potassium voltage-gated channel, subfamily H member 8 | 4.6 | 10 | -0.22 | 3 | 2 | 2 | 6
GRIP and coiled-coil domain containing 2 | 6.5 | 48 | -0.85 | 7 | 3 | 6 | 5
Myotubularin related protein 11 | 9.4 | 30 | -0.38 | 4 | 3 | 7 | 4
Ras-related protein Rab-33A (Small GTP-binding protein S10) | 7.9 | 27 | -0.22 | 7 | 3 | 6 | 4
IQ motif containing GTPase activating protein 3 | 8.7 | 84 | -0.35 | 2 | 2 | 4 | 4
Fat 1 cadherin (Fragment) | 5.0 | 506 | -0.28 | 21 | 7 | 8 | 3
BAI1-associated protein 2 | 8.8 | 32 | -0.62 | 2 | 3 | 4 | 3
NKR-P1E | 6.3 | 20 | -0.27 | 0 | 2 | 3 | 3
BTB/POZ domain-containing adapter for CUL3-mediated RhoA degradation protein 3 | 6.3 | 36 | -0.52 | 1 | 0 | 0 | 2
Lymphocyte transmembrane adapter 1 | 5.0 | 45 | -0.61 | 0 | 0 | 0 | 2
LIM and calponin homology domains 1 | 5.5 | 109 | -0.92 | 0 | 2 | 8 | 2
ArfGAP with SH3 domain, ankyrin repeat and PH domain 2 | 5.6 | 62 | -0.48 | 2 | 2 | 2 | 2
Cholinergic receptor, nicotinic, alpha polypeptide 7, isoform CRA_b | 8.5 | 17 | -0.14 | 2 | 2 | 2 | 2
Zinc finger RNA binding protein | 9.1 | 114 | -0.59 | 2 | 1 | 2 | 2
KRAB-zinc finger protein 73 (Fragment) | 8.5 | 14 | -0.18 | 1 | 0 | 1 | 2
Phosphatidylinositol-4-phosphate 3-kinase C2 domain-containing subunit alpha | 8.0 | 191 | -0.30 | 0 | 0 | 0 | 1
U3 small nucleolar RNA-associated protein 14 homolog B | 8.9 | 86 | -0.84 | 0 | 0 | 0 | 1
Exocyst complex component 4 | 7.0 | 65 | -0.28 | 0 | 0 | 0 | 1
Uncharacterized protein KIAA1737 | 9.4 | 46 | -0.49 | 0 | 0 | 0 | 1
Opsin 3 | 9.7 | 20 | 0.17 | 0 | 0 | 4 | 1
Novel protein (9030409G11Rik) (Fragment) | 7.1 | 79 | -0.72 | 2 | 0 | 3 | 1
Ran-binding protein 6 (RanBP6) | 5.0 | 125 | -0.08 | 0 | 0 | 3 | 1
MAPK-interacting and spindle-stabilizing protein | 9.5 | 28 | -0.71 | 3 | 1 | 3 | 1
MKIAA1405 protein (Fragment) | 9.3 | 33 | -0.96 | 0 | 1 | 3 | 1
Oral-facial-digital syndrome 1 protein homolog | 5.8 | 117 | -0.89 | 1 | 3 | 2 | 1
Solute carrier family 13 (Sodium-dependent dicarboxylate transporter), member 3 | 8.0 | 61 | 0.47 | 5 | 0 | 1 | 1
Tumor necrosis factor receptor superfamily member 6 | 8.0 | 37 | -0.77 | 4 | 2 | 1 | 1
Aurora kinase A | 9.4 | 45 | -0.66 | 0 | 1 | 1 | 1

The two most abundant proteins in the final phenyl flow-through sample are plexin D1 and progesterone

immunomodulatory binding factor 1 (PIBF). Based on their spectral counts, the other co-purified

proteins are detected at lower levels than these two. Plexin D1 is a secreted

signaling protein associated with axonal growth and development, and has been linked to tumor

invasiveness [29]. PIBF is synthesized after binding

of progesterone to its receptors during pregnancy. It has multiple functions, including

immunosuppression by regulation of cytokine synthesis, NK activity, and arachidonic acid

metabolism, all of which play a role in maintenance of pregnancy [30]. Full-length PIBF is

predominantly present in the nucleus; however, a shorter spliced variant is secreted outside of the

cell [31]. It has been shown that PIBF is also expressed by tumor cells [32]. These

two proteins are the most abundant co-purified host cell proteins in this particular biopharmaceutical process.

Their association with cancer further strengthens the links between CHO cell cultures and cancer

cells described in previous chapters.

To assess the physicochemical properties of the proteins co-purified in each intermediate step, the

weighted average isoelectric point, molecular weight and GRAVY values were calculated for

each sample.

Table 5.3: Average Physicochemical Values for Co-Purified Proteins

The average pI, molecular weight, and GRAVY values for the proteins identified in each

intermediate sample were determined by calculating weighted averages based on the number of

spectral counts for each identified protein.

Property | Affinity | Anion | Cation | Phenyl
pI | 7.18 | 7.27 | 7.52 | 7.41
MW (kDa) | 157.16 | 111.73 | 96.18 | 87.74
GRAVY | -0.40 | -0.32 | -0.48 | -0.42
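
The spectral-count weighting behind Table 5.3 can be written as a weighted mean, sum(value × count) / sum(count). The sketch below applies it to the pI values and the affinity and phenyl spectral counts transcribed from Table 5.2; the function and variable names are illustrative. With these transcribed values the calculation gives 7.18 and 7.41, matching the pI row of Table 5.3. Averages recomputed for molecular weight from the rounded table values may differ slightly from the reported ones.

```python
# pI values and spectral counts transcribed from Table 5.2, in table order (32 proteins)
PI = [6.8, 9.1, 7.2, 8.0, 4.6, 6.5, 9.4, 7.9, 8.7, 5.0, 8.8, 6.3, 6.3, 5.0, 5.5, 5.6,
      8.5, 9.1, 8.5, 8.0, 8.9, 7.0, 9.4, 9.7, 7.1, 5.0, 9.5, 9.3, 5.8, 8.0, 8.0, 9.4]
AFFINITY = [11, 11, 6, 5, 3, 7, 4, 7, 2, 21, 2, 0, 1, 0, 0, 2,
            2, 2, 1, 0, 0, 0, 0, 0, 2, 0, 3, 0, 1, 5, 4, 0]
PHENYL = [17, 11, 6, 6, 6, 5, 4, 4, 4, 3, 3, 3, 2, 2, 2, 2,
          2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]


def weighted_mean(values, weights):
    """Return sum(value * weight) / sum(weight); proteins with zero counts drop out."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("at least one weight must be non-zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


print(round(weighted_mean(PI, AFFINITY), 2))
print(round(weighted_mean(PI, PHENYL), 2))
```

Weighting by spectral counts means that abundant proteins, such as Fat 1 cadherin in the affinity sample, dominate the average.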

These results show how the average pI, molecular weight, and GRAVY values of co-purified

host cell proteins change across the purification stages. The average molecular weight of

co-purified proteins decreases throughout the purification process, from

157 kDa in the affinity chromatography eluate to

88 kDa in the phenyl flow-through. This indicates that higher molecular weight

proteins are cleared more efficiently during purification than lower molecular weight

proteins.

A small increase in the pI of co-purified proteins across the purification steps is observed,

from 7.18 in the affinity chromatography eluate to 7.41 in the phenyl flow-through (Table 5.3). This may

reflect the selectivity of ion exchange chromatography, which will tend to co-purify

proteins with a pI similar to that of the target protein. In this case, the IgG has a theoretical pI of 8.61,

which would explain the enrichment of higher pI proteins during purification.

The GRAVY values of co-purified proteins do not change significantly throughout purification.

This indicates that relative hydrophobicity may not play a significant role in protein clearance in

this particular process.

5.4 Conclusion

The analysis of secreted proteins generated from CHO cells during the cell culture process in the

conditioned media as well as in intermediate processing steps provides information critical to the

performance of the biopharmaceutical process.

These results indicate that many intracellular proteins are released into the

conditioned media during the process. These include several glycolytic enzymes, among them

PKM2, which has been associated with the Warburg effect observed in tumor cells. These

enzymes play a key role in the primary metabolism that drives CHO cell growth.

In addition, several extracellular proteins with growth-regulating properties were identified.

These include Fas, a critical receptor for caspase-dependent apoptosis, CYR61, a secreted

signaling protein associated with tumor proliferation, and TCTP, a tumor-related protein with

anti-apoptotic properties. These proteins are of interest due to their potential use as diagnostic

markers of cell culture, as well as cellular engineering targets to enhance CHO cell growth

properties.

Analysis of co-purified proteins in various intermediate processing steps of purification indicates

changes in the composition of proteins in both pI and molecular weight. This particular

purification process seems to clear high molecular weight proteins and low pI proteins most

effectively. The most abundant co-purified proteins identified in this study were plexin D1 and

progesterone immunomodulatory binding factor 1 (PIBF). These proteins are candidates for

targeted quantitation using either immunoassays or MRM-type mass spectrometry analysis to

monitor the clearance of these proteins quantitatively during different purification steps. This

approach could serve as a supplement to the host cell protein information provided by ELISA-

type assays, and could be useful in cases where it is difficult to quantitate host cell protein levels

using an immunoassay due to matrix interferences or lack of specificity.

5.5 References

1. Pavlou MP, Diamandis EP: The cancer cell secretome: a good source for discovering

biomarkers? Journal of proteomics 2010, 73(10):1896-1906.

2. Makridakis M, Vlahou A: Secretome proteomics for discovery of cancer biomarkers.

Journal of proteomics 2010, 73(12):2291-2305.

3. Kawanishi H, Matsui Y, Ito M, Watanabe J, Takahashi T, Nishizawa K, Nishiyama H,

Kamoto T, Mikami Y, Tanaka Y et al: Secreted CXCL1 is a potential mediator and

marker of the tumor invasion of bladder cancer. Clinical cancer research : an official

journal of the American Association for Cancer Research 2008, 14(9):2579-2587.

4. Sardana G, Jung K, Stephan C, Diamandis EP: Proteomic analysis of conditioned

media from the PC3, LNCaP, and 22Rv1 prostate cancer cell lines: discovery and

validation of candidate prostate cancer biomarkers. Journal of proteome research

2008, 7(8):3329-3338.

5. Leskov KS, Klokov DY, Li J, Kinsella TJ, Boothman DA: Synthesis and functional

analyses of nuclear clusterin, a cell death protein. The Journal of biological chemistry

2003, 278(13):11590-11600.

6. Eaton LC: Host cell contaminant protein assay development for recombinant

biopharmaceuticals. Journal of chromatography A 1995, 705(1):105-114.

7. Thomas PD, Campbell MJ, Kejariwal A, Mi H, Karlak B, Daverman R, Diemer K,

Muruganujan A, Narechania A: PANTHER: a library of protein families and

subfamilies indexed by function. Genome research 2003, 13(9):2129-2141.

8. Gao SX, Zhang Y, Stansberry-Perkins K, Buko A, Bai S, Nguyen V, Brader ML:

Fragmentation of a highly purified monoclonal antibody attributed to residual CHO

cell protease activity. Biotechnology and bioengineering 2011, 108(4):977-982.

9. Takayama S, Xie Z, Reed JC: An evolutionarily conserved family of Hsp70/Hsc70

molecular chaperone regulators. The Journal of biological chemistry 1999, 274(2):781-

786.

10. Baxa CA, Sha RS, Buelt MK, Smith AJ, Matarese V, Chinander LL, Boundy KL,

Bernlohr DA: Human adipocyte lipid-binding protein: purification of the protein

and cloning of its complementary DNA. Biochemistry 1989, 28(22):8683-8690.

11. Lee AH, Scapa EF, Cohen DE, Glimcher LH: Regulation of hepatic

lipogenesis by the transcription factor XBP1. Science (New York, NY) 2008,

320(5882):1492-1496.

12. Hara MR, Agrawal N, Kim SF, Cascio MB, Fujimuro M, Ozeki Y, Takahashi M, Cheah

JH, Tankou SK, Hester LD et al: S-nitrosylated GAPDH initiates apoptotic cell death

by nuclear translocation following Siah1 binding. Nature cell biology 2005, 7(7):665-

674.

13. Zheng L, Roeder RG, Luo Y: S phase activation of the histone H2B

promoter by OCA-S, a coactivator complex that contains GAPDH as a key

component. Cell 2003, 114(2):255-266.

14. Christofk HR, Vander Heiden MG, Harris MH, Ramanathan A, Gerszten RE, Wei R,

Fleming MD, Schreiber SL, Cantley LC: The M2 splice isoform of pyruvate kinase is

important for cancer metabolism and tumour growth. Nature 2008, 452(7184):230-

233.

15. Mazurek S, Boschek CB, Hugo F, Eigenbrodt E: Pyruvate kinase type M2 and its role

in tumor growth and spreading. Seminars in cancer biology 2005, 15(4):300-308.

16. Warburg O: On the origin of cancer cells. Science (New York, NY) 1956,

123(3191):309-314.

17. Bluemlein K, Gruning NM, Feichtinger RG, Lehrach H, Kofler B, Ralser M: No

evidence for a shift in pyruvate kinase PKM1 to PKM2 expression during

tumorigenesis. Oncotarget 2011.

18. Noguchi T, Inoue H, Tanaka T: The M1- and M2-type isozymes of rat pyruvate kinase

are produced from the same gene by alternative RNA splicing. The Journal of

biological chemistry 1986, 261(29):13807-13812.

19. Ozturk SS, Riley MR, Palsson BO: Effects of ammonia and lactate on hybridoma

growth, metabolism, and antibody production. Biotechnology and bioengineering

1992, 39(4):418-431.

20. O'Brien TP, Yang GP, Sanders L, Lau LF: Expression of cyr61, a growth factor-

inducible immediate-early gene. Molecular and cellular biology 1990, 10(7):3569-

3577.

21. Chen CC, Lau LF: Functions and mechanisms of action of CCN matricellular

proteins. The international journal of biochemistry & cell biology 2009, 41(4):771-783.

22. Babic AM, Kireeva ML, Kolesnikova TV, Lau LF: CYR61, a product of a growth

factor-inducible immediate early gene, promotes angiogenesis and tumor growth.


95(11):6355-6360.

23. Gery S, Xie D, Yin D, Gabra H, Miller C, Wang H, Scott D, Yi WS, Popoviciu ML, Said

JW et al: Ovarian carcinomas: CCN genes are aberrantly expressed and CCN1

promotes proliferation of these cells. Clinical cancer research : an official journal of

the American Association for Cancer Research 2005, 11(20):7243-7254.

24. Wajant H: The Fas signaling pathway: more than a paradigm. Science (New York,

NY) 2002, 296(5573):1635-1636.

25. Tuynder M, Susini L, Prieur S, Besse S, Fiucci G, Amson R, Telerman A: Biological

models and genes of tumor reversion: cellular reprogramming through tpt1/TCTP

and SIAH-1. Proceedings of the National Academy of Sciences of the United States of

America 2002, 99(23):14976-14981.

26. Yarm FR: Plk phosphorylation regulates the microtubule-stabilizing protein TCTP.

Molecular and cellular biology 2002, 22(17):6209-6221.

27. Liu H, Peng HW, Cheng YS, Yuan HS, Yang-Yen HF: Stabilization and enhancement

of the antiapoptotic activity of mcl-1 by TCTP. Molecular and cellular biology 2005,

25(8):3117-3126.

28. Rho SB, Lee JH, Park MS, Byun HJ, Kang S, Seo SS, Kim JY, Park SY: Anti-apoptotic

protein TCTP controls the stability of the tumor suppressor p53. FEBS letters 2011,

585(1):29-35.

29. Roodink I, Verrijp K, Raats J, Leenders WP: Plexin D1 is ubiquitously expressed on

tumor vessels and tumor cells in solid malignancies. BMC cancer 2009, 9:297.

30. Szekeres-Bartho J, Polgar B: PIBF: the double edged sword. Pregnancy and tumor.

American journal of reproductive immunology (New York, NY : 1989) 2010, 64(2):77-86.

31. Polgar B, Kispal G, Lachmann M, Paar C, Nagy E, Csere P, Miko E, Szereday L, Varga

P, Szekeres-Bartho J: Molecular cloning and immunologic characterization of a novel

cDNA coding for progesterone-induced blocking factor. Journal of immunology

(Baltimore, Md : 1950) 2003, 171(11):5956-5963.

32. Lachmann M, Gelbmann D, Kalman E, Polgar B, Buschle M, Von Gabain A, Szekeres-

Bartho J, Nagy E: PIBF (progesterone induced blocking factor) is overexpressed in

highly proliferating cells and associated with the centrosome. International journal of

cancer Journal international du cancer 2004, 112(1):51-60.

CONCLUDING REMARKS

The future growth of the biopharmaceutical industry is dependent upon robust processes

generating drugs at high yields and of high quality. In order to push the envelope of what a

biopharmaceutical process can deliver, “omics” techniques will play a critical role in

understanding the cell biology and identifying biomarkers related to productivity and product

quality. As described in chapters 2 and 3, proteomics technology can identify differentially

expressed proteins associated with cell growth and productivity from CHO cell cultures. From

these studies, several general observations can be made about CHO cell cultures. In both cases,

proteins were identified which are associated with cellular response to ER stress, also referred to

as the unfolded protein response. This biological process plays a key role in mammalian cell

cultures, as the cells are engineered to express target genes at high levels and the cells respond to

high levels of intracellular non-processed proteins. Modification of the pathways related to the

UPR is one area of interest from a cellular engineering point of view, as this process can directly

affect cellular productivity.

Another observation from chapter 3 is the differential expression of cancer-related proteins

throughout cell culture. These proteins, such as clusterin and transglutaminase-2, likely play

roles in the regulation of cell growth, and along with other published studies indicate similarity

between tumor cells and mammalian cells used in cell culture. These similarities may be due to

the adaptation and selection of the cells towards high cell growth. The processes driving tumor

proliferation may be another area to target for engineering of faster-growing cell

cultures.

The chromosomal structure of CHO cells is another area where further research may unlock

some clues to the biology driving cell culture performance. As described in chapter 4, the CHO

chromosomes are commonly subject to various genetic mutations, including deletion and

insertion of large segments of DNA, resulting in different chromosomal structures even between

different CHO subclones. Any significant genetic mutations affecting cell growth are likely

selected out early on during process development. However, the effect of these mutations on cell

culture performance appears to be unexplored. Clusters of co-expressed genes involved in the

regulation of cell growth also potentially exist in CHO (chapter 4), which makes the

understanding of chromosomal structure even more important.

Proteomics can also play a role in understanding secreted proteins in cell culture media, and how

those proteins are cleared during the purification process. More stringent requirements by

regulatory agencies to demonstrate protein clearance have resulted in the need for more specific

and sensitive detection methods. At this stage, proteomics can play a supporting role by

identifying the relevant proteins present in intermediate steps of purification. Mass spectrometry

may also play a role in quantitating these proteins in different samples, using sensitive methods

such as MRM. The applicability of such approaches for monitoring of host cell protein clearance

compared to the standard ELISA type methods will need to be evaluated in the future.

The constant advancement and maturation of proteomics technology will continue to enable the

discovery of insights into the biological processes driving the biopharmaceutical process. This

information, combined with information from other “omics” areas including genomics and

metabolomics, will facilitate the engineering of better biopharmaceutical processes in the future.

Recombinant proteins will be produced in larger quantities, and with greater control over the

product quality profile. In the end, this will potentially lead to lower costs, greater efficacy, and

greater safety of the drugs for patients and will represent a major push forward for the

biopharmaceutical industry.
