
European Urology 47 (2005) 456–462

Using Tree Analysis Pattern and SELDI-TOF-MS to Discriminate Transitional Cell Carcinoma of the Bladder Cancer from Noncancer Patients

Weiwei Liu a, Ming Guan a, Denglong Wu c, Yuanfang Zhang b, Zhong Wu b, Ming Xu a, Yuan Lu a,*

a Center of Laboratory Medicine, Huashan Hospital, Fudan University, Shanghai 200040, P.R. China
b Department of Urology, Huashan Hospital, Fudan University, Shanghai 200040, P.R. China
c Department of Urology, No. 6 People’s Hospital, Jiaotong University, Shanghai 200042, P.R. China

Accepted 7 October 2004

Available online 28 October 2004

Abstract

Objective: To determine whether SELDI protein profiling of urine coupled with a tree analysis pattern could differentiate TCC from noncancer patients.

Methods: The ProteinChip Arrays were read on a ProteinChip PBS II reader of the ProteinChip Biomarker System. The study was divided into two phases: a preliminary phase in which the tree analysis pattern was constructed, and a testing phase with test urine samples. The tree analysis pattern was generated from a training data set of 104 samples. Its validity was then challenged with a test set of 68 samples.

Results: An average of 187 mass peaks was detected in the urine samples, and five of these peaks were used to construct the tree analysis pattern. The classification pattern correctly predicted 91.67–94.64% of the samples in each of the two groups in the training set, for an overall correct classification of about 93%. The pattern correctly predicted 72.0% (49 of 68) of the test samples: 71.4% (25 of 35) of the TCC samples and 72.7% (24 of 33) of the noncancer samples.

Conclusions: The high sensitivity and specificity obtained by the urine protein profiling approach demonstrate that SELDI-TOF-MS combined with a tree analysis pattern can both facilitate the discrimination of TCC bladder cancer from noncancer and provide an innovative clinical diagnostic platform to improve the detection of TCC bladder cancer patients.

© 2004 Elsevier B.V. All rights reserved.

Keywords: TCC; SELDI; Bladder cancer

1. Introduction

Bladder cancer is the fourth most common cancer in men and the eighth most common in women [1]. More than 90% of cases have transitional cell carcinoma (TCC) histology [2]. The most reliable means of diagnosis and surveillance of TCC are cystoscopic examination and bladder biopsy for histological confirmation [3]. The invasive and labor-intensive nature of cystoscopic examination presents a challenge to develop better, noninvasive diagnostic tools [2]. Urine cytology has been the gold standard of the noninvasive diagnostic approaches. It has high specificity and provides the advantage over biopsy of screening the entire urothelium. However, its high false-negative rate has limited its use as an adjunct to cystoscopy [4,5]. Application of new technologies for detection of bladder cancer could have an important effect on public health. To achieve this goal, specific and sensitive molecular markers are essential.

* Corresponding author. Tel: +86 21 62498118; Fax: +86 21 62498118.
E-mail address: [email protected] (Y. Lu).
0302-2838/$ – see front matter © 2004 Elsevier B.V. All rights reserved
doi:10.1016/j.eururo.2004.10.006


The classical approach for discovering disease-associated proteins is two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) [6]. 2D-PAGE is cumbersome and labor intensive, suffers from reproducibility problems, and is not readily transformed into a clinical assay.

Significant technological advances in protein chemistry have established matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF-MS) as a tool for protein study [7–9]. A limitation of MALDI-TOF-MS is that buffer components, lipids, carbohydrates, and similar substances in the sample prevent efficient ionization of the proteins within it [10].

The development of surface-enhanced laser desorption/ionization time-of-flight mass spectrometry (SELDI-TOF-MS) has largely overcome many of these limitations [11]. By utilizing arrays with differing surface chemistries, a complex mixture of proteins, as from cells or body fluids, can be resolved into subsets of proteins with common properties. After the arrays are washed to remove weakly bound proteins, a solution containing an energy-absorbing molecule is added and allowed to crystallize, embedding the retained proteins within. After crystallization, features within these arrays are read in a ProteinChip Reader. Once a peak of interest has been detected, the analyte can be enriched or purified for further analysis [12].

Taking advantage of the recent developments in SELDI, we were able to simultaneously analyze the protein profiles of urine samples from patients with or without bladder cancer. Initial comparison of the spectral patterns by manual visual inspection proved a laborious undertaking fraught with significant error, and clearly suggested that bioinformatic classifier algorithms would be required to deal efficiently and effectively with the high dimensionality of the SELDI data. In this study, we used Biomarker Patterns Software (Ciphergen Biosystems, Inc.), a powerful tool for dealing with the hundreds of proteins that can be detected in urine samples.

The objective of this study was to determine whether SELDI protein profiling of urine coupled with a tree analysis pattern could effectively differentiate TCC from noncancer patients.

Table 1
Grade/stage of bladder carcinoma

Grade   No. of samples   Stage   No. of samples
I       27               Ta      19
II      35               T1      48
III     21               T2      11
                         T3      2
                         CIS     3

2. Materials and methods

2.1. Study groups and samples

Urine samples were collected from patients seen in the Department of Urology, Huashan Hospital and No. 6 People’s Hospital. Diagnoses were pathologically confirmed, and specimens were obtained before treatment. The samples were voided early-morning urine. They were immediately and briefly centrifuged (1 minute, 10,000 rpm) to remove cellular material. The urine was distributed into 200 µl aliquots and stored at −80 °C. None of the samples had been thawed more than twice.

A total of 172 specimens were included in this study. In the TCC group, the median age was 58 years (range, 31–86 years). The noncancer control group comprised 89 urine samples from healthy individuals (Normal) and from patients with other urogenital diseases (Other). The median age of the healthy group was 50 years (range, 23–86 years), and that of the Other group was 53 years (range, 21–76 years). Healthy controls (n = 53) were volunteers with no evidence of disease. Other urogenital diseases (n = 36) included clinically or pathologically confirmed prostatitis (n = 4), urinary tract infection (n = 6), benign prostatic hyperplasia (n = 7), bladder inflammation (n = 6), prostate cancer (n = 3), carcinoma of the kidney (n = 2), ureteral calculus (n = 4), renal transplantation (n = 1), and hydronephrosis (n = 3). The training set contained 56 noncancer samples: 20 from patients with other urogenital diseases and 36 from normal controls.

The cancer group consisted of 83 urine samples from TCC bladder cancer patients at different clinical grades and stages (Table 1). The training set contained 48 TCC samples and the test set contained 35. The test set contained 33 noncancer samples: 16 from patients with other urogenital diseases and 17 from normal controls.

2.2. ProteinChip SELDI analysis of urine

Weak Cation Exchange (WCX2) ProteinChip Arrays (Ciphergen) were used in our study.

The protein concentration of the urine samples was estimated using the BCA method. Samples were diluted with binding buffer (0.5 mol/l NaCl buffered with 100 mmol/l sodium acetate, pH 4) to an equal protein concentration (2 mg/ml) before use.

The chips were placed in a bioprocessor (Ciphergen Biosystems, Inc.), a device that allows larger volumes of urine to be applied to each chip array. The WCX2 arrays were equilibrated twice with 100 µl of binding buffer. Then 100 µl of the diluted urine mixture was applied to each well, and the bioprocessor was sealed and shaken on a platform shaker at 300 rpm for 1 hour. After the incubation, each well of the bioprocessor was washed with 100 µl of binding buffer. This step was repeated twice, and each time the binding buffer was discarded by inverting the bioprocessor on a paper towel. The chips were removed from the bioprocessor and washed with deionized water. After the arrays were air-dried, 0.5 µl of saturated matrix solution (α-cyano-4-hydroxycinnamic acid in 0.5% acetonitrile and 0.5% trifluoroacetic acid) was applied twice to the array and allowed to air dry.

2.3. Reading chip

The ProteinChip Arrays were read on a ProteinChip PBS II reader of the ProteinChip Biomarker System. The chips were analyzed manually under the following settings: laser intensity 200, detector sensitivity 9, molecular mass range 0–50 kDa, and mass focus 2–10 kDa. Data were collected without filters and were later used for analyses. The operators were blinded to the group identity of the samples.

Mass accuracy was assessed using the All-in-1 peptide molecular mass standard (Ciphergen Biosystems, Inc.).

Fig. 1. Flow diagram showing the processes of proteomic analysis.

2.4. Data analysis

The urine samples were divided into two sets: the training set (104 samples) and the test set (68 samples). Analysis was also divided into two phases: a preliminary phase (phase I), in which the tree analysis pattern was constructed, and a testing phase (phase II) with test urine samples. The validity of the tree analysis pattern was challenged with the test set in phase II (Fig. 1).

2.5. Protein peak selection

All spectra were compiled, and qualified mass peaks (signal-to-noise ratio >5) with mass-to-charge ratios (m/z) between 2 and 50 kDa were autodetected. Peak clusters were completed using second-pass peak selection (signal-to-noise ratio >2), and estimated peaks were added.

The mass range of 2–50 kDa was selected for analysis because this range contained the majority of the resolved proteins/peptides. Molecular masses of 0–2000 Da were eliminated from analysis because this region contains adducts and artifacts of the energy-absorbing molecule (EAM) and possibly other chemical contaminants.

2.6. Construction of tree analysis pattern

The tree analysis pattern was generated with Biomarker Patterns Software version 4.0 (Ciphergen Biosystems, Inc.), using a training data set consisting of 104 samples (48 TCC, 36 normal, and 20 other urogenital disease samples).

The tree analysis pattern divides a data set into two nodes, using one rule at a time in the form of a question. The presence or absence and the intensity level of one peak define the splitting decision. This splitting process continues until terminal nodes (leaves) are produced or further splitting yields no gain. Each terminal node is classified according to the group (‘‘class’’) of samples (i.e., TCC, Normal, or Other) that makes up the majority of samples in that node. The peaks selected by this process to form the splitting rules are those that achieve the maximum reduction of cost in the two descendant nodes [13].
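The search for a single splitting rule can be illustrated with a minimal sketch. It assumes Gini impurity as the node cost, a common choice for classification trees; the text does not state which cost function Biomarker Patterns Software applies. The function names `gini` and `best_split` and the toy intensities are hypothetical and are not taken from the study.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(samples, labels):
    """Find the rule "intensity of peak <= threshold" with the lowest weighted cost.

    samples: list of dicts mapping peak mass (Da) -> normalized intensity.
    labels:  list of class labels, one per sample.
    Returns (peak, threshold, weighted_cost), or None if no split is possible.
    """
    n = len(samples)
    best = None
    for peak in samples[0]:
        values = sorted({s[peak] for s in samples})
        # Candidate thresholds: midpoints between consecutive observed intensities.
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            left = [y for s, y in zip(samples, labels) if s[peak] <= threshold]
            right = [y for s, y in zip(samples, labels) if s[peak] > threshold]
            cost = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or cost < best[2]:
                best = (peak, threshold, cost)
    return best


# Hypothetical toy data: two peaks, four samples (not values from the study).
toy_samples = [
    {16048: 0.02, 33221: 0.90},
    {16048: 0.04, 33221: 1.10},
    {16048: 0.07, 33221: 1.40},
    {16048: 0.09, 33221: 1.60},
]
toy_labels = ["Noncancer", "Noncancer", "TCC", "TCC"]

peak, threshold, cost = best_split(toy_samples, toy_labels)
print(f"split on {peak} Da at intensity <= {threshold:.3f} (cost {cost:.3f})")
```

Growing a full tree would repeat this search on each descendant node until the nodes are pure or no split lowers the cost.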

2.7. Statistical analyses

A Bayesian approach was used to calculate the expected probabilities of each class in each terminal node. Specificity was calculated as the ratio of the number of noncancer samples correctly classified to the total number of noncancer samples. Sensitivity was calculated as the ratio of the number of correctly classified TCC bladder cancer samples to the total number of cancer samples.
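As a worked example of these two definitions, the short sketch below applies them to the test-set counts reported in the Results (25 of 35 TCC samples and 24 of 33 noncancer samples correctly classified). The function names are illustrative only.

```python
def sensitivity(true_pos, false_neg):
    """Correctly classified cancer samples / all cancer samples."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg, false_pos):
    """Correctly classified noncancer samples / all noncancer samples."""
    return true_neg / (true_neg + false_pos)


# Test-set counts from the Results: 25 of 35 TCC and 24 of 33 noncancer correct.
sens = sensitivity(true_pos=25, false_neg=10)
spec = specificity(true_neg=24, false_pos=9)
overall = (25 + 24) / (35 + 33)

print(f"sensitivity {sens:.1%}, specificity {spec:.1%}, overall {overall:.1%}")
```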

2.8. Reproducibility analyses

The reproducibility of SELDI spectra from spot to spot on a single chip (intra-assay) and between chips (inter-assay) was determined using the urine quality control sample. Five proteins in the range of 3–20 kDa observed on spectra randomly selected over the course of the study were used to calculate the mean CV for mass and intensity.
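A mean coefficient of variation of this kind can be computed as in the sketch below. It assumes the CV is the sample standard deviation divided by the mean, expressed as a percentage and then averaged over the selected peaks; the text does not say whether the sample or population standard deviation was used. The replicate values are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation: sample standard deviation / mean, in percent."""
    return stdev(values) / mean(values) * 100


def mean_cv(replicates_by_peak):
    """Average the CV over several peaks.

    replicates_by_peak maps a peak label to its replicate measurements.
    """
    return mean(cv_percent(v) for v in replicates_by_peak.values())


# Hypothetical replicate masses (Da) for two peaks of a quality control sample.
qc_masses = {
    "peak A": [5104.0, 5105.0, 5106.0],
    "peak B": [16040.0, 16048.0, 16056.0],
}
print(f"mean CV for mass: {mean_cv(qc_masses):.3f}%")
```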

3. Results

A total of 187 qualified mass peaks (signal-to-noise ratio >5) were detected in the training set. SELDI was particularly effective in resolving the low molecular weight (<10 kDa) proteins and polypeptides. Peak intensity was normalized to total ion current (2–50 kDa).

These 187 peaks identified in the training set were then used to construct the decision tree classification pattern (Fig. 2). Analysis of urine specimens from patients with TCC, patients with other diseases of the urogenital tract, and normal individuals revealed five prominent protein peaks that were used in the classification pattern; spectral views of these proteins are shown in Fig. 3. No single peak was detected that alone could completely separate the two groups (TCC versus Noncancer). The average SELDI masses of the five proteins are PEAK 1: 5105 Da; PEAK 2: 5565 Da; PEAK 3: 16048 Da; PEAK 4: 28435 Da; PEAK 5: 33221 Da (Fig. 3(a)–(e)). The classification pattern used these five masses to generate 7 terminal nodes. Classification trees split a data set into two bins or nodes, using one rule at a time in the form of a question. The presence or absence and the intensity level of one peak define the splitting decision. For example, the answer to ‘‘Does mass A have an intensity less than or equal to X?’’ splits the data set into two nodes, a left node for yes and a right node for no. This splitting process continues until terminal nodes (leaves) are produced or further splitting yields no gain. Each terminal node is classified according to the group (‘‘class’’) of samples (i.e., TCC or Noncancer) that makes up the majority of samples in that node. For example, if the intensity of an unknown sample at mass 16048 Da is more than 0.053 and at mass 33221 Da is more than 1.241, the sample is placed in terminal node 7 and classified as TCC. If the sample is placed in node 3, it is assigned to the noncancer group. Because of the stochastic nature of reality, misclassification of a new sample cannot be ruled out even for a pure node that contains only one sample type, for example, node 1, which contains only TCC samples. The probability of incorrect assignment increases in nodes where the majority class is only weakly represented or to which only a few samples are assigned, as, for example, in terminal nodes 1, 2 and 6 (Table 2) [15].

Fig. 2. Diagram of decision tree analysis pattern. Classification of the TCC and Noncancer samples in the training set. The root node (top) and descendant nodes are shown as ovals, and the terminal nodes (Node 1–Node 7) are shown as rectangles. The numbers in each node represent the classes [T, number of TCC samples; Non, number of noncancer samples]. The first number under the root and descendant nodes is the mass value, followed by the peak intensity value. For example, the mass value under the root node is 33221 Da, and the intensity is 1.241.


Fig. 3. (a, b, c, d and e): Detection of the five protein peaks in urine that were used in the tree analysis pattern. Mass spectra of the urine samples from two different TCC patients and two noncancer samples. The average molecular mass of the five proteins is shown in the figure.

A summary of the classification results from the 7 terminal nodes is presented for the training and test sets in Table 3. The classification pattern correctly predicted 91.67–94.64% of the samples in each of the two groups in the training set (Table 3(A)), for an overall correct classification of about 93%. The pattern correctly predicted 72.0% (49 of 68) of the test samples: 71.4% (25 of 35) of the TCC samples and 72.7% (24 of 33) of the noncancer samples (Table 3(B)). When comparing TCC versus noncancer (Other/Normal), the sensitivity was 71.4% (25 of 35), and the specificity was 72.7% (24 of 33). The pattern correctly predicted 79.0–95.2% of the TCC samples across the different grades and stages (Table 3(C)).

The reproducibility of the SELDI spectra, i.e., intra-assay and inter-assay variation in mass and intensity, was determined with the pooled normal urine quality control sample. Five proteins in the range of 3–20 kDa observed on spectra randomly selected over the course of the study were used to calculate the mean CV. The intra- and inter-assay mean CVs for mass were 0.3% and 0.8%, respectively, and the intra- and inter-assay mean CVs for the normalized intensity were 8% and 12%, respectively.


Table 2
Expected probability of the classes assigned to 7 terminal nodes

Node   Class   Observation   Probability
1      TCC     7             0.8889
       NC      0             0.1111
2      TCC     1             0.3333
       NC      3             0.6667
3      TCC     1             0.0952
       NC      18            0.9048
4      TCC     13            0.8235
       NC      2             0.1765
5      TCC     1             0.0625
       NC      29            0.9375
6      TCC     1             0.3333
       NC      3             0.6667
7      TCC     24            0.9259
       NC      1             0.0741
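The probabilities in Table 2 are consistent with a uniform-prior (Laplace) estimate for two classes, (class count + 1) / (node total + 2). This is an inference from the tabulated numbers and not a formula stated in the text; the sketch below simply checks that the rule reproduces the table. The function name `expected_probability` is illustrative.

```python
def expected_probability(class_count, node_total, n_classes=2):
    """Posterior mean of the class probability under a uniform prior.

    Assumed form (Laplace's rule of succession); the paper only says
    that "a Bayesian approach" was used.
    """
    return (class_count + 1) / (node_total + n_classes)


# (TCC, NC) observation counts per terminal node, copied from Table 2.
node_counts = {
    1: (7, 0),
    2: (1, 3),
    3: (1, 18),
    4: (13, 2),
    5: (1, 29),
    6: (1, 3),
    7: (24, 1),
}

for node, (tcc, nc) in node_counts.items():
    total = tcc + nc
    p_tcc = expected_probability(tcc, total)
    p_nc = expected_probability(nc, total)
    print(f"node {node}: P(TCC) = {p_tcc:.4f}, P(NC) = {p_nc:.4f}")
```

Under this estimate even a pure node such as node 1 keeps a nonzero probability for the absent class, which matches the remark in the Results that misclassification cannot be ruled out for pure nodes.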

4. Discussion

Because proteins are gene products, it is logical to expect that specific proteomic profiles may also be associated with tumors [14]. Complex urine proteomic patterns might reflect the underlying pathological state of TCC bladder cancer. This hypothesis is supported by the results of our study. The high sensitivity and specificity obtained in this study demonstrate that SELDI can facilitate the discrimination of TCC bladder cancer from noncancer.

SELDI-TOF-MS, using a protein chip that captures proteins based on their ability to bind selectively to a weak cation exchange surface, was capable of resolving several hundred urine proteins/peptides. This is far fewer than the number of proteins that can be separated by 2D-PAGE, but the advantage of SELDI over 2D-PAGE is its ability to effectively resolve polypeptides and peptides smaller than 20 kDa [15].

This innovative technology has numerous other advantages: it is much faster, has a high-throughput capability, requires orders of magnitude less protein sample, has a sensitivity for detecting proteins in the picomole to attomole range, can effectively resolve low-mass proteins (2000 to 20,000 Da), and is directly applicable to clinical assay development [3].

Table 3
Decision tree classification of the training and test set

                                 TCC           Noncancer     Misclassified rate
A. Training set
TCC (n = 48)                     44 (91.67%)   4 (8.33%)     4 (8.33%)
Noncancer (n = 56)               3 (5.36%)     53 (94.64%)   3 (5.36%)
Total no. of samples (n = 104)                               7 (6.73%)
B. Test set
TCC (n = 35)                     25 (71.4%)    10 (28.6%)    10 (28.6%)
Noncancer (n = 33)               9 (27.3%)     24 (72.7%)    9 (27.3%)
Total no. of samples (n = 68)                                19 (27.9%)

C. Sensitivity in the stage and grade of tumor in total bladder cancer samples
              Grade I, II   Grade III   Ta, CIS   T1, T2, T3
Sensitivity   79.0%         95.2%       86.3%     81.9%

Because of the multifactorial nature of cancer, it is very likely that a combination of several markers will be necessary to effectively detect and diagnose TCC. In our study, SELDI technology profiled several hundred proteins simultaneously, and Biomarker Patterns Software was capable of analyzing such a high volume of data to develop an efficient and reproducible classifier.

The identity of the peak masses used in the tree analysis pattern is not necessary for making a diagnosis. These proteins/peptides could be derived from the host organ or the cancer, or could constitute metabolic fragments [16]. The only requirement for this classification system to make an accurate diagnosis is that the biomarkers be reproducibly detected by SELDI and accurately selected by the classifier. Obtaining a name for each of the masses used in the classifier will not make the classification system better or more accurate. However, because knowing their exact identities will be essential for understanding what biological roles these peptides/proteins may have in the cancer, efforts are under way to purify, identify, and characterize these protein/peptide biomarkers [15]. Furthermore, knowing their identities will be essential for producing antibodies for the development of either classical or SELDI immunoassays [17,18].

In fact, many noninvasive molecular diagnostic tests have been developed and several potential biomarkers have been identified. Reviews of the sensitivity/specificity and reproducibility of these biomarkers indicate that some have greater sensitivity/specificity than urine cytology; however, poor reproducibility precludes them from replacing urine cytology or cystoscopy [4,5,10,19]. In this study, SELDI overcame this limitation, and its reproducibility was acceptable.

SELDI-TOF-MS combined with Strong Anion Exchange (SAX2) arrays has been used by Vlahou et al. to detect potential biomarkers in urine [3]. Our results further support the applicability of this technology as a method for protein profiling of urine samples with high diagnostic sensitivity for TCC. In their study, they found five potential novel TCC biomarkers differentially expressed between TCC and noncancer samples. One of the TCC biomarkers (3.3/3.4 kDa) was identified as defensin. The sensitivity and specificity of this marker were 47% and 86%, respectively. They used SAX2 chips, which would be expected to bind different proteins than the WCX2 chips used in our study, and it is interesting to note that their masses are distinctly different from those used in our study. The differing results reflect the different choice of chromatographic surface.

The ultimate clinical application is the early detection of cancer, a topic of obvious public health importance. In our study the sensitivity for grade I, II and grade III was 79.0% and 95.2%, respectively (Table 3(C)). This result suggests that the method cannot detect early-grade TCC bladder cancer as effectively as late-grade disease.

In conclusion, this study suggests that TCC-specific proteomic signatures are present in the urine of patients with TCC and have potential as a clinical tool for the detection or classification of individual cancers with high sensitivity and specificity.

References

[1] Greenlee RT, Hill-Harmon MB, Murray T, Thun M. Cancer statistics, 2001. CA Cancer J Clin 2001;51(1):15–36.
[2] Stein JP, Grossfeld GD, Ginsberg DA, Esrig D, Freeman JA, Figueroa AJ, et al. Prognostic markers in bladder cancer: a contemporary review of the literature. J Urol 1998;160:645–59.
[3] Vlahou A, Schellhammer PF, Mendrinos S, Patel K, Kondylis FI, Gong L, et al. Development of a novel proteomic approach for the detection of transitional cell carcinoma of the bladder in urine. Am J Pathol 2001;158(4):1491–502.
[4] Lokeshwar VB, Soloway MS. Current bladder tumor tests: does their projected utility fulfill clinical necessity? J Urol 2001;165:1067–77.
[5] Burchardt M, Burchardt T, Shabsigh A, De La Taille A, Benson MC, Sawczuk I. Current concepts in biomarker technology for bladder cancers. Clin Chem 2000;46:595–605.
[6] Jemal A, Thomas A, Murray T, Thun M. Cancer statistics. CA Cancer J Clin 2002;52:23–47.
[7] Carr SA, Hemling ME, Bean MF, Roberts GD. Integration of mass spectrometry in analytical biotechnology. Anal Chem 1991;63:2802–24.
[8] Loo JA, Brown J, Critchley G, Mitchell C, Andrews PC, Ogorzalek Loo RR. High sensitivity mass spectrometric methods for obtaining intact molecular weights from gel-separated proteins. Electrophoresis 1999;20:743–8.
[9] Zhang W, Czernik AJ, Yungwirth T, Aebersold R, Chait BT. Matrix-assisted laser desorption mass spectrometric peptide mapping of proteins separated by two-dimensional gel electrophoresis: determination of phosphorylation in synapsin I. Protein Sci 1994;3:677–86.
[10] von Eggeling F, Davies H, Lomas L, Fiedler W, Junker K, Claussen U, et al. Tissue-specific microdissection coupled with ProteinChip Array technologies: applications in cancer research. Biotechniques 2000;29:1066–70.
[11] Li J, Zhang Z, Rosenzweig J, Wang YY, Chan DW. Proteomics and bioinformatics approaches for identification of serum biomarkers to detect breast cancer. Clin Chem 2002;48(8):1296–304.
[12] Weinberger SR, Dalmasso EA, Fung ET. Current achievements using ProteinChip Array technology. Curr Opin Chem Biol 2002;6(1):86–91.
[13] Adam BL, Qu Y, Davis JW, Ward MD, Clements MA, Cazares LH, et al. Serum protein fingerprinting coupled with a pattern-matching algorithm distinguishes prostate cancer from benign prostate hyperplasia and healthy men. Cancer Res 2002;62:3609–14.
[14] Poon TC, Yip TT, Chan AT, Yip C, Yip V, Mok TS, et al. Comprehensive proteomic profiling identifies serum proteomic signatures for detection of hepatocellular carcinoma and its subtypes. Clin Chem 2003;49(5):752–60.
[15] Qu Y, Adam BL, Yasui Y, Ward MD, Cazares LH, Schellhammer PF, et al. Boosted decision tree analysis of surface-enhanced laser desorption/ionization mass spectral serum profiles discriminates prostate cancer from noncancer patients. Clin Chem 2002;48(10):1835–43.
[16] Petricoin EF, Ardekani AM, Hitt BA, Levine PJ, Fusaro VA, Steinberg SM, et al. Use of proteomic patterns in serum to identify ovarian cancer. Lancet 2002;359:572–7.
[17] Wright Jr GL, Cazares LH, Leung SM, Nasim S, Adam BL, Yip TT, et al. Proteinchip surface enhanced laser desorption/ionization (SELDI) mass spectrometry: a novel proteomic technology for detection of prostate cancer biomarkers in complex protein mixtures. Prostate Cancer Prostatic Dis 1999;2:264–76.
[18] Xiao Z, Adam BL, Cazares LH, Clements MA, Davis JW, Schellhammer PF, et al. Quantitation of serum prostate-specific membrane antigen by a novel protein biochip immunoassay discriminates benign from malignant prostate disease. Cancer Res 2001;61:6029–33.
[19] Han M, Schoenberg MP. The use of molecular diagnostics in bladder cancer. Urol Oncol 2000;5:87–92.
