Non-proportional Hazards
Hajime Uno, Ph.D.
Dept. Biostat. & Comp. Biol., Dana-Farber Cancer Institute
A standard method: three key components in a report
1) Description of survival curves: Kaplan-Meier
2) Test of no difference: Logrank test
3) Estimation of treatment effect: Cox’s PH estimate
Critical issues on the PH estimator
• When the PH assumption is not correct, the PH estimator estimates a quantity that cannot easily be interpreted. Also, the limiting quantity depends on the study-specific underlying censoring distributions. (Any model-based treatment contrast estimator has such issues.)
[Figure, non-PH case: survival functions (probability vs. time, 0 to 40); hazard ratio vs. time; censoring patterns (1), (2), (3) (probability vs. time).]
Non-PH: (1) HR=0.77, (2) HR=0.71, (3) HR=0.82
[Figure, PH case: survival functions (probability vs. time, 0 to 40); hazard ratio vs. time; censoring patterns (1), (2), (3) (probability vs. time).]
PH: (1) HR=0.80, (2) HR=0.80, (3) HR=0.80
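The dependence of the PH estimate on the censoring pattern can be illustrated with a small simulation. This is only a sketch under assumed settings, not the scenario behind the slides: the hazard rates, the change point at time 10, the two administrative follow-up lengths (10 and 40), the sample size, and all function names are hypothetical choices made for illustration. It fits a one-covariate Cox model by Newton-Raphson and compares the estimated HR under short and long follow-up, once with proportional hazards and once with crossing hazards.

```python
import math
import random

def simulate_arm(n, h_early, h_late, change, follow_up, rng):
    """Draw n times from a piecewise-constant hazard (h_early before
    `change`, h_late after), administratively censored at `follow_up`."""
    times, events = [], []
    for _ in range(n):
        e = rng.expovariate(1.0)  # unit-exponential cumulative hazard
        if e < h_early * change:
            t = e / h_early
        else:
            t = change + (e - h_early * change) / h_late
        times.append(min(t, follow_up))
        events.append(1 if t <= follow_up else 0)
    return times, events

def cox_hr(times, events, groups, max_iter=50):
    """Cox partial-likelihood HR (group 1 vs. group 0) for one binary
    covariate, via Newton-Raphson with Breslow-type risk sets."""
    order = sorted(range(len(times)), key=lambda i: times[i])
    at_risk = [groups.count(0), groups.count(1)]
    rows = []  # (group of the subject with the event, n0 at risk, n1 at risk)
    i = 0
    while i < len(order):
        j = i
        while j < len(order) and times[order[j]] == times[order[i]]:
            j += 1
        for k in order[i:j]:
            if events[k]:
                rows.append((groups[k], at_risk[0], at_risk[1]))
        for k in order[i:j]:
            at_risk[groups[k]] -= 1
        i = j
    beta = 0.0
    for _ in range(max_iter):
        eb = math.exp(beta)
        score = info = 0.0
        for z, n0, n1 in rows:
            p = n1 * eb / (n0 + n1 * eb)  # chance the event falls in group 1
            score += z - p
            info += p * (1.0 - p)
        if info == 0.0:
            break
        step = score / info
        beta += step
        if abs(step) < 1e-10:
            break
    return math.exp(beta)

def estimated_hr(h1_early, h1_late, follow_up, n=5000, seed=1):
    """Simulate control (constant hazard 0.05) vs. treatment and fit Cox."""
    rng = random.Random(seed)
    t0, e0 = simulate_arm(n, 0.05, 0.05, 10.0, follow_up, rng)
    t1, e1 = simulate_arm(n, h1_early, h1_late, 10.0, follow_up, rng)
    return cox_hr(t0 + t1, e0 + e1, [0] * n + [1] * n)

# PH holds (true HR = 0.8): the estimate should not depend on follow-up.
ph_short = estimated_hr(0.04, 0.04, follow_up=10.0)
ph_long = estimated_hr(0.04, 0.04, follow_up=40.0)
# PH fails (HR 0.5 before time 10, 1.5 after): the estimate moves with follow-up.
nonph_short = estimated_hr(0.025, 0.075, follow_up=10.0)
nonph_long = estimated_hr(0.025, 0.075, follow_up=40.0)

print("PH:     short follow-up", round(ph_short, 2), "long", round(ph_long, 2))
print("non-PH: short follow-up", round(nonph_short, 2), "long", round(nonph_long, 2))
```

Under the PH setting both follow-up lengths target the same HR, while under the crossing-hazards setting the two fits estimate visibly different quantities even though the underlying survival distributions are unchanged.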
A cancer example
• ECOG E4A03: A phase III randomized trial to compare low- and high-dose dexamethasone for newly diagnosed multiple myeloma
• N=445 (223 on high-dose, 222 on low-dose)
• One of the endpoints was overall survival
Rajkumar et al. (2009)
What did the hazard functions look like?
[Figure: survival functions (probability vs. month, 0 to 40) and hazard functions (hazard rate vs. month, 0 to 40) for the low-dose and high-dose arms.]
HR = 0.87 (0.60 - 1.27)
Alternative measures
Model-free measures
• Median survival time
• t-year survival probability
• Restricted mean survival time
Median Survival Time
???
t-year survival probability
1-year? 2-year? 3-year?
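As a rough sketch of how these two measures are read off a Kaplan-Meier curve, the snippet below computes a survival probability at a chosen time and the median survival time on made-up toy data. The function names and the data are hypothetical. The median is returned as `None` when the estimated curve never drops to 0.5 within follow-up, in which case it cannot be estimated from the data.

```python
def km_steps(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival just after it).
    `events` holds 1 for an observed event and 0 for a censored time."""
    data = sorted(zip(times, events))
    at_risk, surv, steps, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, leaving = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            steps.append((t, surv))
        at_risk -= leaving
    return steps

def survival_at(steps, t):
    """Estimated probability of surviving beyond time t."""
    s = 1.0
    for time, surv in steps:
        if time > t:
            break
        s = surv
    return s

def median_survival(steps):
    """First time the KM curve is at or below 0.5; None if never reached."""
    for time, surv in steps:
        if surv <= 0.5:
            return time
    return None

# Toy data (hypothetical): every subject has an observed event.
complete = km_steps([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])
# Toy data (hypothetical): heavy censoring, so the curve stays above 0.5.
censored = km_steps([1, 2, 3, 4, 5], [1, 0, 0, 0, 0])

print(survival_at(complete, 2.5), median_survival(complete))
print(survival_at(censored, 2.5), median_survival(censored))
```

The t-year probability needs a choice of t, and different choices can give different pictures of the treatment contrast.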
Restricted mean survival time (RMST)
- The area under the survival curve up to t*
- “t*-year life expectancy” (e.g., “your life expectancy with low-dose treatment over the next 40 months is 35.4 months”)
t*=40: 35.4 months (low-dose), 33.3 months (high-dose)
• Difference: 2.1 (0.1 - 4.2)
• Ratio: 35.4/33.3 = 1.06 (1.00 - 1.13)
• Ratio of time lost: 6.7/4.6 = 1.46 (1.02 - 2.13)
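The RMST is the area under the Kaplan-Meier step function from 0 to t*, and the time lost is t* minus the RMST. A minimal sketch of that calculation follows; the function names and the toy data are hypothetical. The last lines only re-derive the point estimates quoted above from the two reported RMST values and do not reproduce the confidence intervals.

```python
def km_steps(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival just after it).
    `events` holds 1 for an observed event and 0 for a censored time."""
    data = sorted(zip(times, events))
    at_risk, surv, steps, i = len(data), 1.0, [], 0
    while i < len(data):
        t, deaths, leaving = data[i][0], 0, 0
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            surv *= (at_risk - deaths) / at_risk
            steps.append((t, surv))
        at_risk -= leaving
    return steps

def rmst(times, events, t_star):
    """Area under the Kaplan-Meier curve from 0 to t_star."""
    area, prev_t, prev_s = 0.0, 0.0, 1.0
    for t, s in km_steps(times, events):
        if t >= t_star:
            break
        area += prev_s * (t - prev_t)  # rectangle under the current step
        prev_t, prev_s = t, s
    return area + prev_s * (t_star - prev_t)

# Toy data (hypothetical): four subjects, all with observed events.
toy_rmst = rmst([1, 2, 3, 4], [1, 1, 1, 1], t_star=4)
print("toy RMST:", toy_rmst)

# Point estimates on the slide, re-derived from the reported RMSTs at t* = 40.
t_star, low_dose, high_dose = 40.0, 35.4, 33.3
difference = low_dose - high_dose
ratio = low_dose / high_dose
time_lost_ratio = (t_star - high_dose) / (t_star - low_dose)
print(round(difference, 1), round(ratio, 2), round(time_lost_ratio, 2))
```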
Question #1
Have you ever experienced any problem with non-proportional hazards in practice?
Question #2
How do you handle the potential violation of the PH assumption in a post hoc setting? (How do you find the violation? What do you do when you find it?)
A standard method: three key components in a report
1) Description of survival curves: Kaplan-Meier
2) Test of no difference: Logrank test
3) Estimation of treatment effect: Cox’s PH estimate
Logrank test
• The most powerful test if HR is constant
• Equivalent to testing HR=1 with Cox regression
Negatives
• May not be so powerful when HR is not constant
• In particular, it will fail when hazard functions cross, even under stochastic ordering alternatives
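For reference, the two-sample logrank statistic sums observed minus expected events in one arm over the distinct event times and standardizes by the hypergeometric variance. The sketch below is a plain illustration on made-up data, with hypothetical function and variable names. Because every event time gets the same weight, early and late differences of opposite sign can cancel, which is the weakness noted above when hazards cross.

```python
import math

def logrank_test(times, events, groups):
    """Two-sample logrank test.
    times: follow-up times; events: 1 = event, 0 = censored; groups: 0 or 1.
    Returns (chi-square statistic with 1 df, two-sided p-value)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    obs_minus_exp = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for u in times if u >= t)  # at risk, both arms
        n1 = sum(1 for u, g in zip(times, groups) if u >= t and g == 1)
        d = sum(1 for u, e in zip(times, events) if u == t and e)
        d1 = sum(1 for u, e, g in zip(times, events, groups)
                 if u == t and e and g == 1)
        obs_minus_exp += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    chi2 = obs_minus_exp ** 2 / variance if variance > 0 else 0.0
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, p_value

# Toy data (hypothetical): identical arms, then clearly separated arms.
same = logrank_test([1, 2, 3, 1, 2, 3], [1] * 6, [0, 0, 0, 1, 1, 1])
apart = logrank_test([1, 2, 3, 4, 5, 6], [1] * 6, [0, 0, 0, 1, 1, 1])
print(same)
print(apart)
```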
The cancer example
[Figure: survival functions (probability vs. month, 0 to 40) and hazard functions (hazard rate vs. month, 0 to 40) for the low-dose and high-dose arms.]
Logrank: p=0.233
A new test: p=0.005
Alternatives to logrank
For stochastic ordering alternatives
• Tests based on difference between the KM curves
  – Weighted KM test (Pepe-Fleming, 1989)
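As a hedged reminder of the general form of such a test (the notation here is assumed, not taken from the slides), a weighted KM statistic integrates the weighted difference between the two Kaplan-Meier curves:

```latex
\mathrm{WKM} \;=\; \sqrt{\frac{n_1 n_2}{n_1 + n_2}}
\int_0^{\tau} \hat{w}(t)\,\bigl\{\hat{S}_1(t) - \hat{S}_2(t)\bigr\}\,dt
```

Here $\hat{S}_1$ and $\hat{S}_2$ are the Kaplan-Meier estimates in the two arms, $n_1$ and $n_2$ are the arm sizes, $\tau$ is the end of the comparison window, and $\hat{w}(t)$ is a weight function that is typically chosen to downweight times where censoring leaves little information. With $\hat{w} \equiv 1$ the integral reduces to the difference in restricted mean survival times up to $\tau$.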
Alternatives to logrank
For various patterns of alternatives
• Combinations of multiple weighted logrank test statistics or weighted KM test statistics
  – Linear combination: Gastwirth (1985), Lee (1996)
  – Maximum: Tarone (1981), Shen & Cai (2001)
• Sup-versions of weighted logrank statistics
  – Fleming, Harrington, O’Sullivan (1987)
• Adaptively weighted logrank (or weighted KM) test in a class of weighted statistics
  – Yang and Prentice (2010)
Question #3
Should we consider employing a robust test as the primary analysis, instead of the logrank test at the design stage?
A versatile test for equality of two survival functions based on weighted differences of Kaplan-Meier curves
Hajime Uno, Lu Tian, Brian Claggett, LJ Wei
[Figure: three panels with horizontal axis 0 to 40: survival curves, difference, and standardized difference; figure label: C.]
The proposed procedure gives “a bona-fide p-value,” automatically adjusting for multiple choices of c.
Questions
1. Have you ever experienced any problem with non-proportional hazards in practice?
2. How do you handle the potential violation of the PH assumption in a post hoc setting? (How do you find the violation? What do you do when you find it?)
3. Should we consider employing a robust test as the primary analysis, instead of the logrank test at the design stage?