gene-expression based classification of neuroblastoma patients using a customized...
Gene-Expression Based Classification of Neuroblastoma Patients Using a Customized Oligonucleotide Microarray
André Oberthür, German Neuroblastoma Study Group, Cologne Children's Hospital, Germany
University Children's Hospital of Cologne, Department of Pediatric Oncology
J Clin Oncol. 2006 Nov 1;24(31):5070-8
Neuroblastoma
• Tumor of the sympathetic nervous system
• Stage 1-3 (localised), 4 (disseminated), 4S (special)
• 5-year overall survival: 64%
• Mean age at diagnosis: 15 months
• Incidence: ~1.2-1.4 per 100,000 children in Germany each year
• Most frequent solid extracranial tumor in children
• Spontaneous regression in >10% of neuroblastomas
Current stratifying markers
Trial systems: Germany (NB2004), USA (COG), Japan
Markers used by all three systems:
• Tumor stage [INSS]
• Patient's age at diagnosis
• Amplification of MYCN (2p24)
Additional markers, each used by only some of the systems:
• Deletion of 1p
• Histology [INPC]
• DNA ploidy
Resulting risk groups:
• Low risk (observation): overall survival >90%
• Intermediate risk
• High risk: overall survival <40%
Primary Goal of the study
To identify a prognostic gene-expression based classifier that improves risk stratification of neuroblastoma patients.
The study was performed as a combined marker discovery (first set, n=77) and validation (second set, n=174) study.
As patients were not treated uniformly, but depending on their risk group, prognosis was predicted for both treated and untreated patients (heterogeneous treatment).
Clinical outcome measures: EFS, OS, response to therapy, and all currently used markers (stage, age, cytogenetic markers etc.). All data are available for each patient separately.
Experimental Setup
1. Construction of a customized neuroblastoma microarray: comprises 10,263 pre-selected genes with biological or clinical relevance
2. Dual-colour system: 1 µg total RNA of 251 initial pre-treatment neuroblastomas
3. "Near" reference: vs. 1 µg of mixed total RNA from 100 primary neuroblastomas
4. Dye-flipped replicates (Agilent standard protocol)
502 gene-expression profiles from 251 neuroblastoma samples
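A dye-flipped pair is usually collapsed into a single log ratio per gene, so that a dye-specific bias cancels out. The sketch below illustrates that idea only; the slides do not state how the two arrays per sample were combined, and the function names, the assignment of tumor RNA to Cy5 on the forward array, and the toy intensities are all assumptions.

```python
import math

N_SAMPLES = 251
ARRAYS_PER_SAMPLE = 2  # forward array + dye-flipped replicate
N_PROFILES = N_SAMPLES * ARRAYS_PER_SAMPLE  # 502 profiles, as on the slide


def log_ratio(cy5, cy3):
    """log2(Cy5 / Cy3) for one feature."""
    return math.log2(cy5 / cy3)


def combine_dye_swap(forward, flipped):
    """Average a dye-swap pair into one log2(tumor / reference) value.

    forward: (cy5, cy3) with tumor in Cy5 (assumed orientation)
    flipped: (cy5, cy3) with tumor in Cy3
    """
    m_forward = log_ratio(*forward)  # log2(tumor / reference) + dye bias
    m_flipped = log_ratio(*flipped)  # log2(reference / tumor) + dye bias
    return (m_forward - m_flipped) / 2  # the dye bias cancels


# Hypothetical feature: true tumor/reference ratio of 4, Cy5 reads 2x too bright.
forward = (400 * 2, 100)  # tumor in Cy5
flipped = (100 * 2, 400)  # reference in Cy5
combined = combine_dye_swap(forward, flipped)
print(N_PROFILES, combined)
```

In this toy case the forward array alone overstates the ratio and the flipped array alone understates it, while the combined value recovers log2(4).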
Pre-Hybridization QC
1. Tissue QC: All samples were fresh-frozen tumor samples that were assessed by a trained pathologist, and only samples with a tumor content of >60% were considered for the study.
2. Total RNA QC: All extracted total RNA samples were analyzed using the 2100 Bioanalyzer, and only samples with an RNA integrity number (RIN) >7.5 were included.
3. Labelled cRNA QC: Cy-dye incorporation of all labelled cRNA samples was assessed using the NanoDrop ND-1000 spectrophotometer, and only samples with >10 pmol Cy-dye/µg cRNA were used.
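The three pre-hybridization thresholds can be read as one sequential filter. The following is a minimal sketch under that reading; the `Sample` record, its field names, and the example values are hypothetical, and only the three cut-offs come from the slide.

```python
from dataclasses import dataclass

# Cut-offs taken from the slide; all must be exceeded (strictly greater than).
MIN_TUMOR_CONTENT_PCT = 60.0
MIN_RIN = 7.5
MIN_DYE_PMOL_PER_UG = 10.0


@dataclass
class Sample:
    sample_id: str            # hypothetical identifier
    tumor_content_pct: float  # pathologist's estimate of tumor content
    rin: float                # RNA integrity number (2100 Bioanalyzer)
    dye_pmol_per_ug: float    # Cy-dye incorporation (NanoDrop ND-1000)


def passes_prehyb_qc(s: Sample) -> bool:
    """True only if the sample clears tissue, RNA and labelling QC."""
    return (
        s.tumor_content_pct > MIN_TUMOR_CONTENT_PCT
        and s.rin > MIN_RIN
        and s.dye_pmol_per_ug > MIN_DYE_PMOL_PER_UG
    )


# Hypothetical samples for illustration only.
samples = [
    Sample("NB-001", 80.0, 9.1, 14.2),  # passes all three checks
    Sample("NB-002", 55.0, 9.0, 15.0),  # fails tissue QC
    Sample("NB-003", 90.0, 7.5, 12.0),  # fails RNA QC (RIN not above 7.5)
]
accepted = [s.sample_id for s in samples if passes_prehyb_qc(s)]
print(accepted)
```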
Array QC metrics: Non-Uniform Features
[Figure: box plots of the percentage of feature non-uniformity outliers, any color (AnyColorPrcntFeatNonUnifOL), by polarity (-1 and 1; count 251 each).]
• Yellow bars indicate the 25th to 75th percentile, with the median indicated by a blue arrow.
• Whiskers indicate the value closest to, but not exceeding, 1.5 IQR from the median.
• Red dots indicate outside values.
• All values are shown for all metrics.
• No intra-array reproducibility statistics are shown, as these custom arrays did not contain sufficient replicates for that calculation.
Array QC metrics: Green Negative Control Metrics
[Figure: box plots by polarity (-1 and 1; count 251 each) of three metrics: average of green background-subtracted signal, negative control (AvgOfgBGSubSignal_NegControl); standard deviation of green background-subtracted signal, negative control (StDevOfgBGSubSignal_NegControl); green residual background noise (gResidualBGNoise).]
Array QC metrics: Red Negative Control Metrics
[Figure: box plots by polarity (-1 and 1; count 251 each) of three metrics: average of red background-subtracted signal, negative control (AvgOfrBGSubSignal_NegControl); standard deviation of red background-subtracted signal, negative control (StDevOfrBGSubSignal_NegControl); red residual background noise (rResidualBGNoise).]
Array QC metrics: Number of "Well Above Background" Features (non-controls)
[Figure: box plots by polarity (-1 and 1; count 251 each) of the total number of well-above-background features in the green channel (NumWABK_GREEN) and in the red channel (NumWABK_RED).]
Array QC metrics: Median Signals and Average Absolute Log Ratio
[Figure: box plots by polarity (-1 and 1; count 251 each) of three metrics: median green background-subtracted signal (Median_gBGSubSig); median red background-subtracted signal (Median_rBGSubSig); average absolute log ratios (AbsAvgLR).]
Generation and Validation of the classifier:
1. Generation of a gene-expression classifier from a first set of 77 NB samples with maximally divergent courses using the PAM algorithm; evaluation of classification accuracy by a complete, 10 times repeated 10-fold cross-validation (= 100 predictive models)
2. Evaluation of the predictive power of the PAM classifier in an independent second set of 174 NB samples
Training Set: 77 patients with maximally divergent courses of the disease
Low Risk: n = 54 (non-aggressive phenotype), >1000 d event-free survival without CTX
28x stage 1, 13x stage 2, 1x stage 3, 12x stage 4S
High Risk: n = 23 (aggressive phenotype), death of disease despite CTX
1x stage 2, 2x stage 3, 19x stage 4, 1x stage 4S
Training set (n=77)
PAM combined with a complete, 10 times repeated 10-fold cross-validation (= 100 predictive models)
Estimated classification accuracy: 99%
Signature = all genes included in > 65/100 predictive models (n=144)
Classifier = PAM algorithm + gene-expression values of 144 predictive genes
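The signature rule above (keep every gene that appears in more than 65 of the 100 cross-validation models) amounts to a frequency count over the per-model gene lists. A minimal sketch follows; the function name and the toy gene sets are hypothetical, and the model fitting itself (PAM) is not reproduced here.

```python
from collections import Counter

N_REPEATS, N_FOLDS = 10, 10
N_MODELS = N_REPEATS * N_FOLDS  # 10 x 10-fold CV = 100 predictive models


def stable_signature(gene_sets, min_models):
    """Return genes selected in MORE than min_models of the models."""
    counts = Counter(g for genes in gene_sets for g in set(genes))
    return sorted(g for g, k in counts.items() if k > min_models)


# Hypothetical selections: one gene set per cross-validation model.
gene_sets = []
for i in range(N_MODELS):
    genes = {"GENE_A"}          # selected in all 100 models
    if i < 66:
        genes.add("GENE_B")     # selected in 66 models
    if i < 65:
        genes.add("GENE_C")     # selected in exactly 65 models (not > 65)
    if i < 10:
        genes.add("GENE_D")     # selected in 10 models
    gene_sets.append(genes)

signature = stable_signature(gene_sets, min_models=65)
print(signature)
```

Counting each gene at most once per model keeps the threshold interpretable as "number of models", which is how the slide states it.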
Test Set (n = 174): various clinical courses
40x stage 1 (2x MYCN-amplified)
32x stage 2 (1x MYCN-amplified)
36x stage 3 (9x MYCN-amplified)
48x stage 4 (9x MYCN-amplified)
18x stage 4S (2x MYCN-amplified)
Outcome prediction by the PAM classifier compared to risk stratification according to the NB2004 study
High Risk, Medium Risk, Low Risk
n = 100, n = 13, n = 58
EFS of neuroblastoma patients (n = 174) as predicted by NB2004 and by the PAM classifier
3-year EFS, PAM classifier (two groups): 0.87 ± 0.03 vs. 0.51 ± 0.07
3-year EFS, NB2004 (three risk groups): 0.80 ± 0.04 vs. 0.75 ± 0.13 vs. 0.63 ± 0.07
Similar Kaplan-Meier estimates were calculated for OS (data not shown)
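For readers unfamiliar with how a 3-year EFS figure is read off a Kaplan-Meier curve, here is a minimal product-limit sketch. It is illustrative only: the function names and the five toy follow-up records are hypothetical and are not study data.

```python
def kaplan_meier(times, events):
    """Product-limit estimate.

    times:  follow-up time per patient
    events: 1 if an event occurred at that time, 0 if censored
    Returns a list of (time, survival) steps at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            surv *= 1 - n_events / n_at_risk
            curve.append((t, surv))
        n_at_risk -= n_leaving  # events and censored both leave the risk set
    return curve


def survival_at(curve, t):
    """Estimated survival probability at time t (step function)."""
    s = 1.0
    for step_time, step_surv in curve:
        if step_time <= t:
            s = step_surv
    return s


# Hypothetical follow-up in years; 1 = event, 0 = censored.
times = [1, 2, 3, 4, 5]
events = [1, 0, 1, 0, 1]
curve = kaplan_meier(times, events)
efs_3y = survival_at(curve, 3)
print(curve, efs_3y)
```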
Hierarchical cluster analysis and current prognostic markers
[Figure: heat map from hierarchical clustering, genes versus patients, annotated with current prognostic markers. Legend: favorable or low risk; intermediate risk; unfavorable or high risk.]
EFS of NB2004 low-, intermediate- and high-risk neuroblastoma patients as predicted by the PAM classifier
3-year EFS within each NB2004 risk group, by PAM prediction:
Low-risk: 0.87 ± 0.04 vs. 0.23 ± 0.14
Intermediate-risk: 1.00 vs. 0.57 ± 0.19
High-risk: 0.80 ± 0.11 vs. 0.56 ± 0.09
Similar KM estimates were calculated for stratification by the USA (COG) and Japanese NB trial criteria (data not shown)
Multivariate Cox's regression analysis* for the second set of 174 patients

Single Marker                     p-value    Hazard Ratio [95% CI]
Age (continuous)                  0.159
Stage (1-3, 4S vs. 4)             0.775
MYCN (normal vs. amp)             0.299
Status 1p (normal vs. del/imb)    0.658
Shimada (F vs. UF)                0.044      2.385 [0.992-5.732]
PAM Classifier (F vs. UF)         0.007      3.389 [1.362-8.430]

Trial                                  p-value    Hazard Ratio [95% CI]
German NB trial (LR vs. IR vs. HR)     0.646
Japanese trial (LR vs. IR vs. HR)      0.518
USA trial (LR vs. IR vs. HR)           0.350
PAM Classifier (F vs. UF)              <0.0001    4.953 [2.563-9.572]

*Cox's proportional hazards regression model based on EFS was applied, fitted with stepwise backward selection of parameters
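The hazard ratios and intervals in these tables can be sanity-checked with simple arithmetic. The sketch below assumes the intervals are Wald intervals, symmetric on the log scale. Under that assumption the hazard ratio is the geometric mean of its bounds, and an approximate standard error and z statistic follow from the interval width. The slides do not say which test produced the reported p-values, so the recomputed p-values are only rough checks. The function and variable names are hypothetical.

```python
import math

Z_95 = 1.96  # normal quantile for a two-sided 95% interval


def wald_check(hr, lower, upper):
    """Back out log-scale quantities from a reported HR and 95% CI.

    Assumes a Wald interval: log(HR) +/- 1.96 * SE.
    Returns (geometric mean of bounds, SE of log HR, z, two-sided p).
    """
    geo_mean = math.sqrt(lower * upper)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(hr) / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return geo_mean, se, z, p


# Values as reported in the two tables.
reported = {
    "Shimada (F vs. UF)": (2.385, 0.992, 5.732),
    "PAM, single-marker model": (3.389, 1.362, 8.430),
    "PAM, trial model": (4.953, 2.563, 9.572),
}
checks = {name: wald_check(*vals) for name, vals in reported.items()}
for name, (geo_mean, se, z, p) in checks.items():
    print(f"{name}: geometric mean {geo_mean:.3f}, SE {se:.3f}, z {z:.2f}, p {p:.2g}")
```

For all three rows the geometric mean of the bounds reproduces the reported hazard ratio to within rounding, which is consistent with log-symmetric intervals.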
Conclusions
1. Gene-expression based classification allows for a more accurate risk estimation of neuroblastoma patients than current stratification systems.
2. The optimal integration of gene-expression based classification into clinical trials has yet to be determined.
Ongoing research
- Validation of the classifier in an independent international set of neuroblastoma patients ("third" set): ~200-250 samples from international cooperating partners. Start: Jan/Feb 2007
- Evaluation of the gene signature in a set of treated tumors: ~60 samples
- "Prospective" evaluation of the PAM classifier: ~60 initial pretreatment samples/year, start 10/2004, for a period of 3 years
Many Thanks to:
Department of Pediatric Oncology, University of Cologne (1): Matthias Fischer, Frank Berthold, Yvonne Kahlert, Karen Ernestus, Barbara Hero, Rüdiger Spitz, André Oberthür
German Cancer Research Center, Heidelberg (2): Frank Westermann, Benedikt Brors, Manfred Schwab, Patrick Warnat, Rainer König, Stefan Haas, Roland Eils
Supported by: Deutsche Krebshilfe; NGFN2 (BMBF); Fördergesellschaft Kinderkrebs Neuroblastom-Forschung e.V.
Biplots of uncentered PCA of all 251 tumor samples
[Figure: three biplots (PCA 1 vs. PCA 2, PCA 1 vs. PCA 3, PCA 2 vs. PCA 3). Legend: NB2004 high-risk, NB2004 intermediate-risk, NB2004 low-risk.]
Biplots of centered PCA of all 251 tumor samples
[Figure: three biplots (PCA 1 vs. PCA 2, PCA 1 vs. PCA 3, PCA 2 vs. PCA 3). Legend: NB2004 high-risk, NB2004 intermediate-risk, NB2004 low-risk.]
Uncentered Principal Component Analysis: groups coloured according to the German Neuroblastoma Trial NB2004 risk groups
[Figure: three panels (PCA 1 vs. PCA 2, PCA 1 vs. PCA 3, PCA 2 vs. PCA 3).]
[Figure: three panels (PCA 1 vs. PCA 2, PCA 1 vs. PCA 3, PCA 2 vs. PCA 3).]
Uncentered Principal Component Analysis: groups coloured according to the identified PAM 144-gene classifier for NB patients