RESEARCH ARTICLE

The subcellular proteome of undifferentiated human embryonic stem cells

Prasenjit Sarkar¹, Timothy S. Collier², Shan M. Randall², David C. Muddiman² and Balaji M. Rao¹

¹ Department of Chemical and Biomolecular Engineering, North Carolina State University, Raleigh, NC, USA

² W.M. Keck FT-ICR Mass Spectrometry Laboratory, Department of Chemistry, North Carolina State University, Raleigh, NC, USA

Received: September 23, 2011

Revised: October 31, 2011

Accepted: November 14, 2011

We have characterized the subcellular proteome of human embryonic stem cells (hESCs) through MS analysis of the membrane, cytosolic, and nuclear fractions, isolated from the same sample of undifferentiated hESCs. Strikingly, 74% of all proteins identified were detected in a single subcellular fraction; we also carried out immunofluorescence studies to validate the subcellular localization suggested by proteomic analysis, for a subset of proteins. Our approach resulted in deeper proteome coverage: peptides mapping to 893, 2475, and 1185 proteins were identified in the nuclear, cytosolic, and membrane fractions, respectively. Additionally, we used spectral counting to estimate the relative abundance of all cytosolic proteins. A large number of proteins relevant to hESC biology, including growth factor receptors, cell junction proteins, transcription factors, chromatin remodeling proteins, and histone modifying enzymes, were identified. Our analysis shows that components of a large number of interacting signaling pathways are expressed in hESCs. Finally, we show that proteomic analysis of the endoplasmic reticulum (ER) and Golgi compartments is a powerful alternative approach to identify secreted proteins, since these are synthesized in the ER and transit through the Golgi. Taken together, our results show that systematic subcellular proteomic analysis is a valuable tool for studying hESC biology.

Keywords: Cell biology / Extracellular proteins / Human embryonic stem cells / Subcellular fractionation / Subcellular proteomics

1 Introduction

Human embryonic stem cells (hESCs) are pluripotent cells originally derived from the inner cell mass of the preimplantation blastocyst stage embryo [1]. hESCs can be propagated indefinitely in cell culture and can also differentiate into all somatic cell types. Therefore, hESCs have great potential to revolutionize regenerative medicine, provide a renewable source for the generation of functional cells for drug evaluation, and serve as model systems for studying human embryogenesis. While the precise molecular mechanisms controlling pluripotency of hESCs remain unclear, it is becoming increasingly evident that hESCs are maintained in the undifferentiated state by a large network of interacting pathways. The contribution of a specific pathway toward the maintenance of hESC pluripotency depends on several factors, such as the subset of component proteins in that pathway that are actually expressed in hESCs, their subcellular localization, relative levels of protein expression, and the stoichiometric ratios of various interacting proteins.

Colour Online: See the article online to view Figs. 1–4 and Table 1 in colour.

Abbreviations: CM, conditioned medium; EC, embryonal carcinoma; EP300, Histone acetyltransferase p300; GO, Gene Ontology; hESC, human embryonic stem cell; MO4L1, Mortality factor 4-like protein 1; NSAF, Normalized Spectral Abundance Factor; SILAC-CM, conditioned medium containing stable isotopes ¹³C₆, ¹⁵N₂ L-lysine and ¹³C₆ L-arginine

Correspondence: Professor Balaji M. Rao, Department of Chemical and Biomolecular Engineering, North Carolina State University, Campus Box 7905, EB1, Raleigh 27695, NC, USA

E-mail: [email protected]

Fax: +1-919-513-3465

© 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. www.proteomics-journal.com

Proteomics 2012, 12, 421–430. DOI 10.1002/pmic.201100507

Large-scale analysis of protein expression using MS can provide a systems-level perspective on pathways that are potentially active in hESCs. Indeed, over the last few years, several proteomic studies on hESCs have been reported (reviewed in [2, 3]).

Subcellular fractionation prior to proteomic analysis is a powerful approach to reduce the overall sample complexity and obtain deeper sequence coverage, as well as to study protein localization [4]. In the context of hESCs, subcellular fractionation has largely been used as a tool for the reduction of sample complexity. In particular, several studies have focused on comparative analysis of membrane proteins in hESCs and differentiated cells, or in multiple hESC or embryonal carcinoma (EC) lines. For instance, Harkness et al. identified membrane proteins that are expressed in hESCs under two different culture conditions [5]. Dormeyer et al. studied the differences in the membrane proteomes of hESCs and human EC cells [6]. More recently, Gerwe et al. characterized the differences in membrane protein expression between different hESC lines, including an hESC line with karyotypic abnormalities, and their differentiated derivatives [7]. Prokhorova et al. used a SILAC-based approach to assess quantitative differences in membrane protein expression between hESCs and cells undergoing differentiation [8]. Van Hoof et al. also used SILAC to identify cell-surface proteins that were differentially expressed in cardiomyocytes derived from hESCs [9]. In comparison, fewer studies have focused on the analysis of the nuclear hESC proteome. Barthelery et al. evaluated a procedure to isolate nuclear proteins and subsequently deplete the sample of histones to improve nuclear proteome coverage, in the context of hESCs [10]. Pewsey et al. studied the dynamic changes in the nuclear proteome of a human EC cell line (NTERA-2) upon differentiation induced by treatment with retinoic acid [11].

In this study, we present the proteomic analysis of membrane, cytosolic, and nuclear fractions obtained from a single sample of undifferentiated hESCs; we also used spectral counting to quantify the relative abundances of all proteins identified in the cytosolic fraction. To the best of our knowledge, our study represents the first reported comprehensive characterization of the hESC proteome at subcellular resolution, coupled with quantitative estimates of relative protein expression. The simultaneous isolation of multiple compartments allows us to assess the effectiveness of subcellular fractionation, and consequently to interpret proteomic data to gain insight into protein localization. Further, we validated the localization of a subset of proteins identified through MS using immunofluorescence microscopy. Interestingly, we identified several secreted proteins in subcellular fractions containing endoplasmic reticulum (ER) and Golgi; secreted proteins are synthesized in the ER and transit through the Golgi. Our results suggest that proteomic analysis of the ER and Golgi compartments is a powerful alternative approach to interrogate the secretome of hESCs.

2 Materials and methods

2.1 Cell culture

Undifferentiated H9 hESCs were cultured on tissue culture plates coated with Matrigel™ (BD Biosciences, Bedford, MA, USA), in mouse embryonic fibroblast (MEF) conditioned medium (CM), or in CM without L-lysine and L-arginine but containing the stable isotopes ¹³C₆, ¹⁵N₂ L-lysine and ¹³C₆ L-arginine (Pierce, Rockford, IL, USA) (SILAC-CM), as described previously [12]. Stable isotope-labeled arginine and lysine incorporation of 98.5 and 98.0%, respectively, was achieved. Arginine-to-proline conversion was determined to be approximately 5% in our system [13].
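The labeling scheme above implies a fixed mass offset between light and heavy tryptic peptides. The following is a minimal sketch of that arithmetic, assuming complete labeling and ignoring arginine-to-proline conversion; the function names and the example peptide strings are illustrative and do not come from the study.

```python
# Mass differences between heavy and light isotopes (Da), from standard isotope masses.
C13_DELTA = 1.00335484   # 13C minus 12C
N15_DELTA = 0.99703489   # 15N minus 14N

# Per-residue shifts for the labels named in the text.
LYS_SHIFT = 6 * C13_DELTA + 2 * N15_DELTA   # 13C6, 15N2 L-lysine
ARG_SHIFT = 6 * C13_DELTA                   # 13C6 L-arginine


def heavy_mass_shift(peptide):
    """Mass shift (Da) of a fully labeled peptide relative to its light form."""
    return peptide.count("K") * LYS_SHIFT + peptide.count("R") * ARG_SHIFT


def heavy_mz_shift(peptide, charge):
    """Shift in m/z for a given charge state (assumed >= 1)."""
    return heavy_mass_shift(peptide) / charge


# Illustrative sequences only (not peptides reported in the paper).
for seq, z in [("AAK", 1), ("AAR", 2), ("AKAAR", 2)]:
    print(seq, z, round(heavy_mass_shift(seq), 4), round(heavy_mz_shift(seq, z), 4))
```

A peptide ending in lysine is therefore expected roughly 8.014 Da above its light counterpart, and one ending in arginine roughly 6.020 Da above it.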

2.2 Immunofluorescence

Cells were passaged on to glass-bottom culture dishes (Greiner Bio-one, Monroe, NC, USA) coated with Matrigel™ (BD Biosciences) and grown in CM. Cells were fixed with 4% paraformaldehyde (Fisher Scientific, Houston, TX, USA) and permeabilized with 0.5% Triton X-100 (Acros Organics, Geel, Belgium). Subsequently, cells were blocked in 1× PBS with 5% BSA and 0.3% Triton X-100 and stained overnight with the primary antibody in the same buffer. Rabbit-anti-human antibodies for β-CATENIN (Cell Signaling, Danvers, MA, USA), CALPAIN1 (CALPAIN μ-type) (Cell Signaling), E-CADHERIN (Cell Signaling), P300 (Thermo Scientific, Rockford, IL, USA), Mortality factor 4-like protein 1 (MO4L1) (Sigma-Aldrich, St. Louis, MO, USA), ENY2 (Cell Signaling), and THYMOSIN β4 (Millipore, Billerica, MA, USA) were used. Isotype rabbit IgG was purchased from Cell Signaling. Cells were then stained with Alexa 633-conjugated goat-anti-rabbit IgG (Invitrogen, Carlsbad, CA, USA) and DAPI (Invitrogen) and imaged using a Zeiss LSM 710 confocal microscope.

2.3 Subcellular fractionation

Membrane, cytoplasmic, and nuclear fractions were isolated from hESCs as described previously [12]. Detailed protocols for subcellular fractionation are provided in Supporting Information File 1. The ER-enriched fraction was isolated as follows. Cells were scraped and homogenized in Sucrose Buffer 1 (SB1; 250 mM sucrose, 25 mM potassium chloride, 5 mM magnesium chloride, 10 mM triethanolamine, 10 mM acetic acid, 2.5 mM sodium pyrophosphate, 1 mM β-glycerophosphate disodium salt, 1 mM sodium orthovanadate, cOmplete Mini Protease Inhibitor Cocktail tablets (Roche, Indianapolis, IN, USA), phosphatase inhibitor cocktails I and II (Sigma-Aldrich), with pH adjusted to 7.6 using triethanolamine and/or acetic acid), and centrifuged at 1000 × g for 10 min. The supernatant was collected and centrifuged at 3000 × g for 10 min. Again, the supernatant was collected and centrifuged at 15 000 × g for 30 min, and this pellet was retrieved as the ER–Golgi-enriched fraction. The pellet was homogenized in 8 M urea and 50 mM ammonium bicarbonate, in the presence of protease and phosphatase inhibitors, and used for MS analysis.
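Centrifugation steps are given above as relative centrifugal force (× g), whereas many centrifuges are set in rpm. As a rough aid, the sketch below applies the standard conversion RCF = 1.118 × 10⁻⁵ · r · rpm² (r in cm). The rotor radius is a hypothetical value chosen only for illustration; the source does not state which rotor was used.

```python
import math

RCF_CONSTANT = 1.118e-5  # standard factor for radius in cm and speed in rpm


def rcf_from_rpm(rpm, radius_cm):
    """Relative centrifugal force (x g) at a given speed and rotor radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def rpm_for_rcf(rcf, radius_cm):
    """Speed (rpm) needed to reach a target RCF at a given rotor radius."""
    return math.sqrt(rcf / (RCF_CONSTANT * radius_cm))


ROTOR_RADIUS_CM = 8.0  # hypothetical rotor radius, not taken from the paper

# (target RCF, duration in min, what is carried forward), as described in the text.
STEPS = [
    (1000, 10, "supernatant"),
    (3000, 10, "supernatant"),
    (15000, 30, "pellet (ER-Golgi-enriched fraction)"),
]

for rcf, minutes, keep in STEPS:
    rpm = rpm_for_rcf(rcf, ROTOR_RADIUS_CM)
    print(f"{rcf} x g for {minutes} min -> about {rpm:.0f} rpm; keep {keep}")
```

The required speed scales with the square root of the target force, so the final 15 000 × g spin needs a little under four times the speed of the first 1000 × g spin on the same rotor.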

2.4 MS and data analysis

This study was conducted in parallel with a technical analysis comparing relative quantification of protein expression using SILAC and spectral counting [12]. The overall experimental design is shown in Supporting Information Fig. 1. Cells grown in CM were differentiated by adding 25 μM SB431542 (Sigma-Aldrich); the medium was replaced every day with fresh CM containing 25 μM SB431542. Subsequently, protein samples obtained through subcellular fractionation from undifferentiated cells grown in SILAC-CM were combined with those from differentiating cells at different time points, and analyzed using MS. Data from heavy peptide identifications were used for the analysis of undifferentiated hESCs, as presented in this article. Simultaneously, hESCs cultured in SILAC-CM were also used for the analysis of the cytoplasmic fraction by spectral counting. Cells cultured in CM were used for the analysis of subcellular fractions enriched in ER and Golgi.

A total of 25 μg of protein from unlabeled cells differentiated in CM was mixed with 25 μg of protein from undifferentiated hESCs cultured in SILAC-CM for the analysis of SILAC samples. In all, 50 μg of protein sample each was used for the quantification of cytoplasmic proteins by spectral counting and for the analysis of the ER-enriched fraction. MS and subsequent data analysis were carried out as described previously [12]. Briefly, samples were run on a 10–20% Tris-HCl Gel (Bio-Rad, Hercules, CA, USA), reduced with DTT, alkylated with iodoacetamide, and digested with proteomic-grade trypsin. In-gel-digested peptide samples were separated in an Eksigent 1D+ nano-LC system (Eksigent, Dublin, CA, USA) with a vented column configuration and detected using an LTQ-Orbitrap XL. Magic C18AQ (particle size, 5 μm; pore size, 200 Å; Microm BioResources, Auburn, CA, USA) was used as packing material for both the trapping and the analytical columns. An IntegraFrit capillary (New Objective, Woburn, MA, USA) measuring 5 cm in length and having a 75 μm id was employed as a trap for desalting peptides. A PicoFrit capillary (New Objective) with a 75 μm id and measuring 15 cm in length was utilized as the analytical column. Burdick and Jackson (Muskegon, MI, USA) supplied all LC solvents. The composition of mobile phase A was 98% water, 2% acetonitrile, and 0.2% formic acid. The composition of mobile phase B was 98% acetonitrile, 2% water, and 0.2% formic acid. In total, 8 μL of sample was injected at a flow rate of 1.5 μL/min and switched to 350 nL/min via a 10-port valve before eluting peptides onto the analytical column. The gradient increased from 2% B to 50% B over 127 min before ramping up to 95% B. After holding for 5 min at 95% B, re-equilibration was established by flowing at 2% B for 10 min.

Precursor scans in the Orbitrap analyzer were acquired with 60 000 resolving power at m/z 400, and these broadband scans were followed by up to eight data-dependent MS/MS scan events in the ion trap. The minimum MS signal threshold for MS/MS activation was set to 2500. Collision-induced dissociation was employed for fragmentation with an isolation width of m/z 2 and normalized collision energy of 35%. Unassigned and +1 charge states were rejected for MS/MS, and dynamic exclusion was set to 180 s with one repeat count and a repeat duration of 0 s. Automatic gain control settings were 8 × 10³ ions for the ion trap and 1 × 10⁶ ions for the Orbitrap. Ionization times were restricted to 80 and 500 ms for the ion trap and Orbitrap, respectively.

RAW LC-MS/MS files were processed using MASCOT Distiller (Matrix Science, Boston, MA, USA) to generate peak lists in .mgf format for database searching using the MASCOT server; all searches were performed against the UniProt human database containing both target and reverse protein sequences (last modified: 10/2010). Search tolerances were set to ±5 ppm for the precursor ion and to ±0.6 Da for the fragment ions. Cysteine carbamidomethylation was set as a fixed modification, and variable modifications included oxidation of methionine as well as deamidation of glutamine and asparagine. ProteoIQ (BioInquire, Athens, GA, USA) was used to create protein lists at 1% false discovery rate (FDR). Normalized Spectral Abundance Factor (NSAF) values were manually calculated using unnormalized spectral count values obtained from ProteoIQ as follows:

$$(\mathrm{NSAF})_x = \frac{(\mathrm{SpC}/L)_x}{\sum_{i=1}^{N} (\mathrm{SpC}/L)_i}$$

where L is the number of amino acids in protein x, SpC is the total number of MS/MS spectra that identify protein x, and the sum runs over all N identified proteins.

Subsequent Gene Ontology (GO) annotation analysis was carried out using Blast2Go [14] and DAVID [15].
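Searching a database that contains both target and reversed sequences allows the false discovery rate to be estimated from the number of reversed (decoy) hits. The sketch below shows one common, simplified form of that idea, in which FDR is approximated as decoys/targets above a score cutoff. It is an illustration only: the source does not describe how ProteoIQ computes its 1% FDR, and the function name, input format, and scores here are assumptions.

```python
def fdr_score_cutoff(matches, max_fdr=0.01):
    """Return the lowest score at which decoys/targets is still <= max_fdr.

    matches: iterable of (score, is_decoy) pairs; a higher score means a
    better match. Returns None if no cutoff satisfies the threshold.
    """
    ranked = sorted(matches, key=lambda m: m[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Estimated FDR among everything scoring at least `score`.
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff


# Hypothetical scores: 50 target hits (100 down to 51), one decoy, one more target.
example = [(s, False) for s in range(100, 50, -1)] + [(50, True), (49, False)]
print(fdr_score_cutoff(example, max_fdr=0.01))
print(fdr_score_cutoff(example, max_fdr=0.05))
```

With a stricter threshold the cutoff stops just above the first decoy hit; relaxing the threshold admits lower-scoring matches.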

3 Results and discussion

3.1 Elucidation of the subcellular proteome of undifferentiated hESCs

We carried out proteomic analysis of the nuclear, cytosolic, and membrane components derived from subcellular fractionation of a single sample of undifferentiated hESCs. Flow cytometric analysis of the expression of the pluripotency markers OCT4 and SSEA4 in the cells used for proteomic analysis is shown in Supporting Information Fig. 2. Our SILAC analysis identified peptides corresponding to 893 proteins that map to 713 protein groups in the nuclear sample, 1397 proteins mapping to 1155 protein groups in the cytosolic fraction, and 1185 proteins that map to 929 protein groups in the membrane sample. Performing spectral counting analysis on the undifferentiated cytoplasmic sample, we obtained 2439 protein identifications from 2070 protein groups. Taking the SILAC and spectral counting analyses together, we identified 2475 proteins in the cytosolic fraction. Detailed lists of proteins identified in the three subcellular fractions are provided in Supporting Information Tables 1–9. The spectral count analysis on the cytosolic fraction also gave us the relative abundance values of cytosolic proteins. In this method, the total number of peptide identifications in the data that map to each protein is normalized by the protein length to give the relative protein abundance in the sample; this metric is called the NSAF. NSAF values can be compared across different proteins in the same biological sample, giving the relative abundances of these proteins in the same sample [16–18]. A comprehensive list of the NSAF values for all cytosolic proteins is provided in Supporting Information Table 9.
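The NSAF normalization described above can be sketched in a few lines. The protein names, spectral counts, and lengths below are made-up toy values chosen only to make the arithmetic easy to follow; they are not data from this study.

```python
def nsaf(spectral_counts, lengths):
    """Normalized Spectral Abundance Factor for each protein.

    spectral_counts: dict protein -> number of MS/MS spectra (SpC)
    lengths:         dict protein -> number of amino acids (L)
    """
    # Spectral abundance factor: counts normalized by protein length.
    saf = {p: spectral_counts[p] / lengths[p] for p in spectral_counts}
    total = sum(saf.values())
    # Dividing by the sum over all proteins makes the values sum to 1.
    return {p: value / total for p, value in saf.items()}


# Toy input (hypothetical proteins A, B, C).
toy_counts = {"A": 10, "B": 30, "C": 5}
toy_lengths = {"A": 100, "B": 600, "C": 50}

toy_nsaf = nsaf(toy_counts, toy_lengths)
for protein, value in sorted(toy_nsaf.items()):
    print(protein, round(value, 3))
```

In this toy case protein B has the most spectra but the lowest NSAF, because its counts are spread over a much longer sequence; this is the length bias that the normalization is meant to correct.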

An overview of proteins identified in our proteomic analysis is shown in Fig. 1. Of the 1185 proteins identified in the plasma membrane fraction, 540 were found in this fraction only, i.e. they were not identified in the nuclear or cytosolic fractions. Similarly, out of the 2475 proteins identified in the cytosolic fraction, 1675 were found only in the cytosolic fraction, and out of the 893 proteins identified in the nuclear fraction, 376 were unique to this fraction. Thus, in total, out of 3359 proteins identified, 2491 (74%) were unique to a single subcellular fraction and only 26% were shared between fractions, indicating that the fractionation procedure had merit; only 9.7% of the proteins were identified in all three fractions. Nevertheless, the fractionation was not perfect; we identified proteins from intracellular organelles such as mitochondria, ER, and Golgi in all three fractions. Also, the list of proteins found in all three fractions includes highly abundant proteins such as actins and histones that can contaminate other fractions. However, despite these limitations, the fact that 74% of the identified proteins were unique to one of the three subcellular fractions demonstrates distinct subcellular localization of proteins in hESCs and underlines the importance of subcellular fractionation procedures while studying the hESC proteome.
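The bookkeeping behind such an overlap analysis (proteins found in one fraction only, versus two or all three) can be expressed with simple set operations. The following sketch uses small hypothetical protein identifier sets, not the lists from the Supporting Information.

```python
from collections import Counter


def fraction_overlap(fractions):
    """Summarize how identified proteins are distributed across fractions.

    fractions: dict fraction name -> set of protein identifiers
    """
    # For each protein, count the number of fractions it was identified in.
    seen_in = Counter(p for proteins in fractions.values() for p in proteins)
    unique_per_fraction = {
        name: sum(1 for p in proteins if seen_in[p] == 1)
        for name, proteins in fractions.items()
    }
    return {
        "total": len(seen_in),
        "unique_per_fraction": unique_per_fraction,
        "single_fraction": sum(unique_per_fraction.values()),
        "all_fractions": sum(1 for n in seen_in.values() if n == len(fractions)),
    }


# Hypothetical identifiers for illustration only.
toy_fractions = {
    "membrane": {"P1", "P2", "P3", "P6"},
    "cytosol": {"P3", "P4", "P5", "P6"},
    "nucleus": {"P6", "P7"},
}

summary = fraction_overlap(toy_fractions)
print(summary)
print(f"{100 * summary['single_fraction'] / summary['total']:.0f}% in a single fraction")
```

Applied to the totals quoted in the text, the same percentage calculation gives 2491 / 3359 ≈ 74%.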

3.2 Identification of proteins relevant to hESC biology

An important advantage of subcellular fractionation in proteomic analysis is the significantly deeper coverage obtained due to the reduction in sample complexity. Several growth factor receptors, G-proteins, integrins, as well as cell–cell junction proteins such as those of the adherens junctions, tight junctions, gap junctions, and desmosomes were identified from the membrane fraction (Supporting Information Table 10). Similarly, several chromatin-remodeling enzymes, histone acetyltransferases, histone deacetylases, histone methyltransferases, DNA methyltransferases, as well as transcription factors were identified in the nuclear fraction, despite their presence in lower abundances than structural proteins or histones (Supporting Information Table 11). To our knowledge, this is the first attempt at a comprehensive characterization of the epigenetic factors present in the nucleus of undifferentiated hESCs. A few epigenetic factors and transcription factors were identified in the cytoplasmic fraction; these are summarized in Supporting Information Table 12. We also identified several serine/threonine/tyrosine kinases and phosphatases, as well as cell-cycle regulators, in all three subcellular fractions. A list of these proteins along with their experimentally observed subcellular localization is provided in Supporting Information Table 13. Taken together, our results highlight the effectiveness of subcellular proteomics in elucidating the protein expression profile of hESCs.

Figure 1. Overview of protein identifications obtained in membrane, cytosolic, and nuclear fractions of undifferentiated hESCs. (A) Overview of protein identifications obtained across all three fractions combined. The number of proteins identified in a single subcellular fraction (membrane, cytosol, or nucleus only), two subcellular fractions, or all three fractions is indicated. (B) Proteins identified and (C) annotated in a single subcellular fraction versus those identified/annotated in multiple subcellular fractions are indicated.

3.3 Confirmation of localization using fluorescence microscopy

As discussed earlier, 74% of proteins were identified in a single subcellular fraction; this underlines the effectiveness of our subcellular fractionation protocol. We further validated the subcellular localization suggested by proteomic analysis for seven target proteins using fluorescence microscopy: Enhancer of yellow 2 transcription factor homolog (ENY2), Calpain-1 catalytic subunit (CAN1), Histone acetyltransferase p300 (EP300), β-catenin (CTNB1), Thymosin β-4 (TYB4), MO4L1, and Cadherin-1 (E-cadherin or CADH1). Table 1 summarizes the comparison between immunofluorescence data and results obtained using subcellular proteomic analysis, and the corresponding microscopy images are shown in Fig. 2.

Our results show that five out of seven proteins identified in the cytoplasm through subcellular proteomic analysis were also detected in the cytoplasm using microscopy. Note that despite being annotated as nuclear-only, ENY2 and MO4L1, components of the SAGA and NuA4 histone acetyltransferase (HAT) complexes, respectively, are present in the cytoplasm of undifferentiated hESCs, as suggested by proteomic analysis. Interestingly, cytoplasmic localization of ENY2 has been reported in Drosophila S2 cells [19]. EP300 and β-catenin were identified in the cytoplasm using spectral counting but not detected using immunofluorescence. Pertinently, cytoplasmic localization of EP300 has been reported previously [20]. Also, cytoplasmic β-catenin is a key component of the Wnt signaling pathway; activation of Wnt signaling has been reported to maintain hESC pluripotency [21]. EP300, β-catenin, MO4L1, and E-cadherin were not detected in the analysis of the cytoplasmic fraction from the SILAC-CM sample but were identified using spectral counting. It is not surprising that spectral counting consistently identifies more proteins, because it analyzes replicate injections of a relatively less complex sample [12].

Also, five out of seven proteins were identified in the nucleus using immunofluorescence but were not detected in the proteomic analysis of the nuclear fraction. This can be attributed to the fact that the MS approaches utilized in this study were global platforms optimized to identify and quantify as much of the proteome as possible. A large number of proteins may go undetected using global approaches due to the limitations of the instrument related to duty cycle as well as detection limits. This is primarily due to the complexity of the sample, which contains a large number of proteins with a wide range of expression levels. On the other hand, the immunofluorescence assay is a targeted approach that seeks to detect a specific protein of interest. Further, the use of a secondary antibody for detection allows for a potentially large increase in signal intensity, since multiple fluorescent labels can bind to a single primary antibody. Indeed, this is a major advantage of using immunofluorescent detection schemes for low-abundance proteins. Thus, for these proteins, immunofluorescent detection is more sensitive than MS analysis.

Table 1. Comparison between protein identifications in hESCs cultured in SILAC-CM (ID), spectral counting (SpC), and immunofluorescence (IF). GO annotations (GO) of proteins are also listed for reference. ND, not determined (measurement was not conducted); 1, protein detected; 0, protein not detected.

The proteins selected for immunofluorescence analysis have a range of cytoplasmic NSAF values (8.67 × 10⁻⁶ to 7.53 × 10⁻³), suggesting a range of abundances in the cytoplasm. Strictly speaking, fluorescence intensities between different images cannot be compared due to the use of different antibodies and the unknown binding affinities of the antibody–target interactions. Nevertheless, assuming antibody excess during immunofluorescent staining, leading to complete saturation of target sites, there is broad qualitative agreement between the NSAF values and the corresponding cytoplasmic fluorescent intensity; proteins with higher abundance, as suggested by their NSAF value, show greater fluorescent labeling in these cases.

3.4 Identification of key signaling pathways in hESCs

We combined the lists of proteins obtained through MS analysis of all three subcellular fractions and identified signaling pathways that these proteins have been associated with, using the Protein Interaction Database (PID) at the National Cancer Institute (NCI). Our analysis shows that undifferentiated hESCs express proteins that map to numerous signaling pathways, including the Activin/Nodal pathway, BMP pathway, canonical and noncanonical Wnt pathways, FGF pathway, IGF pathway, Akt pathway, and the HDAC Classes I, II, and III pathways. A comprehensive list of pathways identified and their experimentally detected component proteins is provided in Supporting Information Table 14. These pathways exhibit considerable crosstalk, as evidenced by several proteins mapping to multiple pathways. In concert, these pathways likely dictate all aspects of hESC biology, such as pluripotency, self-renewal, suppression of epithelial-to-mesenchymal transition (EMT), regulation of cell cycle, and suppression of apoptosis.
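The observation that several proteins map to more than one pathway can be made concrete by inverting a pathway-to-protein mapping. The sketch below uses placeholder pathway and protein names; it does not query the PID database and does not reproduce the assignments in Supporting Information Table 14.

```python
from collections import defaultdict


def shared_components(pathways):
    """Return proteins that belong to more than one pathway.

    pathways: dict pathway name -> set of detected component proteins
    Result:   dict protein -> sorted list of pathways it maps to
    """
    membership = defaultdict(set)
    for pathway, proteins in pathways.items():
        for protein in proteins:
            membership[protein].add(pathway)
    return {
        protein: sorted(names)
        for protein, names in membership.items()
        if len(names) > 1
    }


# Placeholder data for illustration only.
toy_pathways = {
    "Pathway A": {"P1", "P2"},
    "Pathway B": {"P2", "P3"},
    "Pathway C": {"P2", "P3", "P4"},
}

crosstalk = shared_components(toy_pathways)
for protein in sorted(crosstalk):
    print(protein, "->", ", ".join(crosstalk[protein]))
```

Proteins returned by such a function are candidate points of crosstalk, in the sense used above: components detected in the sample that are annotated to multiple pathways.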

The expression level of pathway components is an

important determinant of the overall influence of a specific

pathway on hESC fate. Since several components of multi-

ple signaling pathways are present in the cytoplasm, we

used spectral counting analysis to assess the relative abun-

dance of cytoplasmic proteins; higher NSAF values corre-

spond to higher protein abundance. A schematic diagram

Figure 2. Immunofluorescence analysis of intracellular localiza-

tion of target proteins in undifferentiated hESCs. Cells were

stained with a nuclear dye (DAPI) and an antibody specific to the

target protein. (A, D, G, J, M, P, and S) DAPI signal; signal due to

immunofluorescent staining with antibody: (B) CALPAIN1

(m type) antibody. (E) MO4L1 antibody. (H) THYMOSIN b4 anti-

body. (K) b-CATENIN antibody. (N) E-CADHERIN antibody.

(Q) ENY2 antibody. (T) P300 antibody; (C, F, I, L, O, R, and U)

Composite images showing fluorescence due to nuclear dye as

well as antibody staining.

426 P. Sarkar et al. Proteomics 2012, 12, 421–430

& 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim www.proteomics-journal.com

Page 7: The subcellular proteome of undifferentiated human embryonic stem cells

Figure 3. Network diagram for the subset of signaling pathways identified in hESCs. Relative expression and localization data have been depicted simultaneously. Proteins identified only in membrane are shown in yellow. Proteins identified only in cytosol are shown in shades of red, corresponding to relative expression, as shown in the legend. Proteins identified in the nucleus only are shown in blue. Proteins identified in multiple fractions are shown with thick borders. Proteins that were not identified in our analysis are shown in grey. (Note: A high-resolution version of this image is available online.)


that simultaneously combines known interactions between

signaling pathways, relative abundance of certain compo-

nent proteins as well as their experimentally determined

subcellular localization is shown in Fig. 3. Our figure is

restricted to a subset of signaling pathways, viz. the Activin/Nodal, BMP, FGF, IGF, Akt, and mTOR pathways and the HDAC Classes I, II, and III pathways. To our knowledge, this is the first effort to characterize the signaling pathways active in hESCs in which data on both relative protein expression and subcellular localization are combined. Interestingly,

several proteins that may regulate the amount of active

SMAD2 and SMAD3, such as PPM1A, NEDD4-L, CDK2/4,

CALM, and ERK1/2 [22–27], are present in the cytoplasm. A

list of these SMAD2/3-regulating proteins with their NSAF

values in the cytosol is provided in Supporting Information

Table 15.
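
The spectral-counting estimate referred to above can be sketched in a few lines. This minimal Python example assumes the commonly used NSAF definition, in which each protein's spectral count is divided by its length and the result is normalized by the sum of these ratios over all proteins in the fraction. The function name `nsaf` and the toy counts and lengths are hypothetical and are not values from this study.

```python
from typing import Dict, Tuple


def nsaf(counts_and_lengths: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Return the NSAF for each protein.

    counts_and_lengths maps a protein label to (spectral count, length in residues).
    NSAF_i = (SpC_i / L_i) / sum_j (SpC_j / L_j)
    """
    saf = {}
    for protein, (spectral_count, length) in counts_and_lengths.items():
        if length <= 0:
            raise ValueError(f"non-positive length for {protein}")
        saf[protein] = spectral_count / length
    total = sum(saf.values())
    if total == 0:
        raise ValueError("no spectra were counted")
    return {protein: value / total for protein, value in saf.items()}


# Toy input (hypothetical labels, counts, and lengths).
toy = {
    "protein_a": (30, 300),
    "protein_b": (10, 500),
    "protein_c": (5, 100),
}
abundance = nsaf(toy)
```

Because the values are normalized within one fraction, they sum to one and are suited to ranking proteins within that fraction rather than comparing absolute amounts across fractions.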

3.5 ER as the door to the hESC secretome

The hESC microenvironment plays a significant role in the

regulation of the hESC signaling network [28]. The complex

microenvironment of hESCs is determined in part by

endogenous factors that are secreted by hESCs. However,

experimental characterization of the hESC secretome is

challenging [29]. In our proteomic analysis, we identified a

small fraction of proteins annotated as being present in the

ER and Golgi in all three subcellular fractions (Fig. 4A),

suggesting contamination by ER and Golgi. Indeed, the ER

membrane is contiguous with the outer nuclear envelope

and this causes the ER to be pulled down with the nucleus

during fractionation [30]. Since secreted proteins are

synthesized in the ER and transit through the Golgi, we

hypothesized that secreted proteins could be identified in

subcellular fractions. To test this hypothesis, we pooled the

list of proteins identified by LC-MS/MS and analyzed

the subset of proteins that are annotated as being present in

the extracellular region (GO: 0005576) and hence are puta-

tively secreted. As anticipated, this subset of proteins

contained several secreted factors including cytokines and

growth factors and extracellular matrix proteins. To further

confirm that these secreted proteins are obtained, at least in

part, from the ER and Golgi, we isolated a subcellular

fraction that was enriched in these compartments. As

shown in Fig. 4B, this enriched fraction contains few

proteins that are not annotated as being present in the

ER, Golgi, or mitochondria; this fraction has significant

Figure 4. GO annotation analysis of proteins identified in

subcellular fractions. (A) GO annotation analysis of proteins

identified in the membrane, nuclear, and cytoplasmic fractions.

Proteins were classified based on their GO annotations as being

present in the ER and Golgi only, mitochondria but not ER and

Golgi, extracellular, and others. (B) GO annotation analysis of

the fraction enriched in ER and Golgi.

Table 2. List of growth factors and cytokines and extracellular matrix proteins identified

UniProt accession   ID      Description

Growth factors
P09038              FGF2    Heparin-binding growth factor 2
P60983              GMFB    Glia maturation factor β
O60234              GMFG    Glia maturation factor γ
P51858              HDGF    Hepatoma-derived growth factor
O75610              LFTY1   Left–right determination factor 1
O00292              LFTY2   Left–right determination factor 2
P55145              MANF    Mesencephalic astrocyte-derived neurotrophic factor
P14174              MIF     Macrophage migration inhibitory factor
P21741              MK      Midkine

Extracellular matrix proteins
Q8NCW5              AIBP    Apolipoprotein A-I-binding protein
P07355              ANXA2   Annexin A2
P02649              APOE    Apolipoprotein E
Q9BUR5              APOO    Apolipoprotein O
A6NMY6              AXA2L   Putative annexin A2-like protein
P27797              CALR    Calreticulin
P10909              CLUS    Clusterin
P12109              CO6A1   Collagen α-1(VI) chain
P39060              COIA1   Collagen α-1(XVIII) chain
P78310              CXAR    Coxsackievirus and adenovirus receptor
Q14118              DAG1    Dystroglycan
P23142              FBLN1   Fibulin-1
P02751              FINC    Fibronectin
P06396              GELS    Gelsolin
O75487              GPC4    Glypican-4
Q9Y625              GPC6    Glypican-6
Q5UCC4              INM02   UPF0510 protein INM02
O00515              LAD1    Ladinin-1
P11047              LAMC1   Laminin subunit γ-1
P09382              LEG1    Galectin-1
P55001              MFAP2   Microfibrillar-associated protein 2
Q08431              MFGM    Lactadherin
Q32P28              P3H1    Prolyl 3-hydroxylase 1
P40967              PME17   Melanocyte protein Pmel 17
Q13162              PRDX4   Peroxiredoxin-4
Q92626              PXDN    Peroxidasin homolog
P02786              TFR1    Transferrin receptor protein 1


mitochondrial contamination. Yet, GO analysis confirmed

that the enriched fraction contained several putative secreted

factors. A complete list of putatively secreted proteins, as

suggested by GO analysis, is provided in Supporting Infor-

mation Table 16; this list combines data from the nuclear,

membrane, and cytoplasmic fractions as well as the ER-

enriched fraction. A subset of these proteins – extracellular

matrix proteins and growth factors – is summarized in

Table 2. Interestingly, there was significant overlap between

the lists of secreted proteins identified using our approach

and a previous study on the hESC secretome [29]. In that study, hESCs grown in feeder-free conditions were incubated with serum-free medium for 24 h and proteomic analysis of the hESC-CM was carried out. In total, 79 out of 160 proteins identified using our approach were also identified in the previously reported study. Differences in protein identifications between the two approaches may arise for two reasons. First, in our study, proteins are flagged as

putatively secreted if they are annotated as extracellular (GO:

0005576). The list of proteins thus obtained also contains some proteins that, in addition to being annotated as extracellular, are annotated as being present in intracellular compartments. One such example is α-tubulin-1 (UniProt

ID: P68366). The second reason relates to an inherent

limitation of an approach involving proteomic analysis of

hESC-CM. While incubation of hESCs in serum-free

medium for 24 h does not affect the expression of plur-

ipotency markers such as Oct-4 and SSEA3 [29], it is

possible that the withdrawal of serum or serum supple-

ments results in changes in the secreted protein profile of

hESCs. In contrast, our approach does not require the use of

serum-free medium. Nevertheless, our results demonstrate

that the analysis of a subcellular fraction enriched in the ER

and Golgi is a viable strategy to interrogate the cellular

secretome. While the list of putatively secreted proteins

presented here is not comprehensive, we hypothesize

that optimization of our protocols for targeted enrichment

and purification of the ER and Golgi followed by MS

analysis can be used for complete characterization of the

secretome.
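
The annotation filter and the overlap comparison described in this section can be sketched as follows. This is a minimal Python illustration under stated assumptions: the function names `putatively_secreted` and `overlap_fraction`, the placeholder accessions (`ACC_1`, etc.), and their annotations are hypothetical. Only the extracellular region term GO:0005576 is taken from the text.

```python
from typing import Dict, Set

EXTRACELLULAR_REGION = "GO:0005576"  # GO term used in the text to flag proteins


def putatively_secreted(annotations: Dict[str, Set[str]]) -> Set[str]:
    """Return accessions whose GO annotations include the extracellular region."""
    return {
        accession
        for accession, terms in annotations.items()
        if EXTRACELLULAR_REGION in terms
    }


def overlap_fraction(ours: Set[str], reference: Set[str]) -> float:
    """Fraction of our accessions that also appear in a reference list."""
    if not ours:
        raise ValueError("the first protein list is empty")
    return len(ours & reference) / len(ours)


# Placeholder accessions and annotations (hypothetical).
toy_annotations = {
    "ACC_1": {"GO:0005576"},
    "ACC_2": {"GO:0005576", "GO:0005737"},  # also annotated as cytoplasm
    "ACC_3": {"GO:0005634"},  # nucleus only, so not flagged
}
secreted = putatively_secreted(toy_annotations)
fraction_shared = overlap_fraction(secreted, {"ACC_2", "ACC_9"})
```

Note that `ACC_2` is flagged even though it also carries an intracellular annotation, which mirrors the first source of discrepancy discussed above. With the counts reported in the text, 79 of 160 proteins corresponds to an overlap of roughly 49%.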

4 Concluding remarks

In this study, we have used subcellular fractionation prior to

proteomic analysis to comprehensively characterize the

membrane, cytoplasmic, and nuclear proteomes of undif-

ferentiated hESCs. Notably, 74% of the proteins identified in

our analysis were found in a single subcellular fraction,

simultaneously underscoring the effectiveness of our frac-

tionation procedure as well as the distinct subcellular loca-

lization of proteins in hESCs. Such an approach not only

enables deeper proteome coverage but, in conjunction with

GO analysis and orthogonal validation using techniques

such as fluorescence microscopy if necessary, can also

provide insight into protein localization. Our data lay the

foundation for quantitative analysis of changes in the

subcellular proteome of hESCs upon initiation of differ-

entiation. Interestingly, our analysis shows that two proteins

that are annotated as nuclear-only are indeed present in the

cytoplasm of undifferentiated hESCs. Our results also show

that proteomic analysis of ER and Golgi can provide a

powerful alternative to conventional approaches for the

characterization of the cellular secretome. Comprehensive

identification of the exogenous signaling factors present in

mouse embryonic fibroblast-CM coupled with the know-

ledge of endogenous signaling factors secreted by hESCs

themselves will enable us to fully characterize the micro-

environment of hESCs.

The authors gratefully acknowledge the funding from the National Science Foundation grant CBET-0966859.

The authors have declared no conflict of interest.

5 References

[1] Thomson, J. A., Itskovitz-Eldor, J., Shapiro, S. S., Waknitz,

M. A. et al., Embryonic stem cell lines derived from human

blastocysts. Science 1998, 282, 1145–1147.

[2] Hughes, C. S., Nuhn, A. A., Postovit, L. M., Lajoie, G. A.,

Proteomics of human embryonic stem cells. Proteomics

2011, 11, 675–690.

[3] Van Hoof, D., Heck, A. J., Krijgsveld, J., Mummery, C. L.,

Proteomics and human embryonic stem cells. Stem Cell

Res. 2008, 1, 169–182.

[4] Lee, Y. H., Tan, H. T., Chung, M. C., Subcellular fractionation

methods and strategies for proteomics. Proteomics 2010,

10, 3935–3956.

[5] Harkness, L., Christiansen, H., Nehlin, J., Barington, T. et al.,

Identification of a membrane proteomic signature for

human embryonic stem cells independent of culture

conditions. Stem Cell Res. 2008, 1, 219–227.

[6] Dormeyer, W., Van Hoof, D., Braam, S. R., Heck, A. J. et al.,

Plasma membrane proteomics of human embryonic stem

cells and human embryonal carcinoma cells. J. Proteome

Res. 2008, 7, 2936–2951.

[7] Gerwe, B. A., Angel, P. M., West, F. D., Hasneen, K. et al.,

Membrane proteomic signatures of karyotypically normal

and abnormal human embryonic stem cell lines and deri-

vatives. Proteomics 2011, 11, 2515–2527.

[8] Prokhorova, T. A., Rigbolt, K. T., Johansen, P. T., Henning-

sen, J. et al., Stable isotope labeling by amino acids in cell

culture (SILAC) and quantitative comparison of the

membrane proteomes of self-renewing and differentiating

human embryonic stem cells. Mol. Cell. Proteomics 2009, 8,

959–970.

[9] Van Hoof, D., Dormeyer, W., Braam, S. R., Passier, R. et al.,

Identification of cell surface proteins for antibody-based

selection of human embryonic stem cell-derived cardio-

myocytes. J. Proteome Res. 2010, 9, 1610–1618.


[10] Barthelery, M., Salli, U., Vrana, K. E., Enhanced nuclear

proteomics. Proteomics 2008, 8, 1832–1838.

[11] Pewsey, E., Bruce, C., Tonge, P., Evans, C. et al., Nuclear

proteome dynamics in differentiating embryonic carcinoma

(NTERA-2) cells. J. Proteome Res. 2010, 9, 3412–3426.

[12] Collier, T. S., Randall, S. M., Sarkar, P., Rao, B. M. et al.,

Comparison of stable-isotope labeling with amino acids in

cell culture and spectral counting for relative quantification

of protein expression. Rapid Commun. Mass Spectrom.

2011, 25, 2524–2532.

[13] Collier, T. S., Sarkar, P., Rao, B., Muddiman, D. C., Quanti-

tative top-down proteomics of SILAC labeled human

embryonic stem cells. J. Am. Soc. Mass Spectrom. 2010, 21,

879–889.

[14] Conesa, A., Gotz, S., Garcia-Gomez, J. M., Terol, J. et al.,

Blast2GO: A universal tool for annotation, visualization and

analysis in functional genomics research. Bioinformatics

(Oxford, England) 2005, 21, 3674–3676.

[15] Huang da, W., Sherman, B. T., Lempicki, R. A., Systematic

and integrative analysis of large gene lists using DAVID

bioinformatics resources. Nat. Protoc. 2009, 4, 44–57.

[16] Liu, H., Sadygov, R. G., Yates, J. R., 3rd, A model for

random sampling and estimation of relative protein abun-

dance in shotgun proteomics. Anal. Chem. 2004, 76,

4193–4201.

[17] Zybailov, B., Mosley, A. L., Sardiu, M. E., Coleman, M. K.

et al., Statistical analysis of membrane proteome expres-

sion changes in Saccharomyces cerevisiae. J. Proteome

Res. 2006, 5, 2339–2347.

[18] Sardiu, M. E., Cai, Y., Jin, J., Swanson, S. K. et al., Prob-

abilistic assembly of human protein interaction networks

from label-free quantitative proteomics. Proc. Natl. Acad.

Sci. USA 2008, 105, 1454–1459.

[19] Kopytova, D. V., Orlova, A. V., Krasnov, A. N., Gurskiy, D. Y.

et al., Multifunctional factor ENY2 is associated with the

THO complex and promotes its recruitment onto nascent

mRNA. Genes Dev. 2010, 24, 86–96.

[20] Shi, D., Pop, M. S., Kulikov, R., Love, I. M. et al., CBP and

p300 are cytoplasmic E4 polyubiquitin ligases for p53. Proc.

Natl. Acad. Sci. USA 2009, 106, 16275–16280.

[21] Sato, N., Meijer, L., Skaltsounis, L., Greengard, P., Brivan-

lou, A. H., Maintenance of pluripotency in human and

mouse embryonic stem cells through activation of Wnt

signaling by a pharmacological GSK-3-specific inhibitor.

Nat. Med. 2004, 10, 55–63.

[22] Funaba, M., Zimmerman, C. M., Mathews, L. S., Modulation

of Smad2-mediated signaling by extracellular signal-regu-

lated kinase. J. Biol. Chem. 2002, 277, 41361–41368.

[23] de Caestecker, M. P., Parks, W. T., Frank, C. J., Castagnino,

P. et al., Smad2 transduces common signals from receptor

serine-threonine and tyrosine kinases. Genes Dev. 1998, 12,

1587–1592.

[24] Kretzschmar, M., Doody, J., Timokhina, I., Massague, J., A

mechanism of repression of TGFbeta/Smad signaling by

oncogenic Ras. Genes Dev. 1999, 13, 804–816.

[25] Lin, X., Duan, X., Liang, Y. Y., Su, Y. et al., PPM1A functions

as a Smad phosphatase to terminate TGFbeta signaling.

Cell 2006, 125, 915–928.

[26] Matsuura, I., Denissova, N. G., Wang, G., He, D. et al.,

Cyclin-dependent kinases regulate the antiproliferative

function of Smads. Nature 2004, 430, 226–231.

[27] Kuratomi, G., Komuro, A., Goto, K., Shinozaki, M. et al.,

NEDD4-2 (neural precursor cell expressed, developmentally

down-regulated 4-2) negatively regulates TGF-beta (trans-

forming growth factor-beta) signalling by inducing ubiqui-

tin-mediated degradation of Smad2 and TGF-beta type I

receptor. Biochem. J. 2005, 386, 461–470.

[28] Peerani, R., Rao, B. M., Bauwens, C., Yin, T. et al., Niche-

mediated control of human embryonic stem cell self-

renewal and differentiation. EMBO J. 2007, 26, 4744–4755.

[29] Bendall, S. C., Hughes, C., Campbell, J. L., Stewart, M. H.

et al., An enhanced mass spectrometry approach reveals

human embryonic stem cell growth factors in culture. Mol.

Cell. Proteomics 2009, 8, 421–432.

[30] Graham, J. M., Rickwood, D., Subcellular Fractionation: A

Practical Approach, IRL Press at Oxford University Press,

Oxford, New York 1997.
