In the name of God, the Most Gracious, the Most Merciful. Generally, survival analysis is a collection of...


Upload: cameron-stanley

Post on 19-Dec-2015


In the name of God, the Most Gracious, the Most Merciful

Generally, survival analysis is a collection of statistical procedures for analyzing data in which the outcome variable of interest is the time until an event occurs.


Comparison of Two Treatments

Thirty melanoma patients (stages 2 to 4) were studied to compare the immunotherapies BCG (Bacillus Calmette-Guerin) and Corynebacterium parvum for their abilities to prolong remission duration and survival time. The age, gender, disease stage, treatment received, remission duration, and survival time are given in Table 3.1.

The usual objective with this type of data is to determine the length of remission and survival and to compare the distributions of remission and survival time in each group.

Comparison of Three Diets

A laboratory investigator interested in the relationship between diet and the development of tumors divided 90 rats into three groups and fed them low-fat, saturated-fat, and unsaturated-fat diets, respectively (King et al., 1979). The rats were of the same age and species and were in similar physical condition. An identical number of tumor cells was injected into a foot pad of each rat. The rats were observed for 200 days.

Types of censored data

Right censored

Left censored

Interval censored

The survivor function

The hazard function

This mathematical formula is difficult to explain in practical terms.
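The formulas themselves did not survive in this transcript. As a reminder, the standard textbook definitions are sketched below; the symbol T for the survival time is an assumed notation and may differ from the one used on the original slides.

```latex
% Survivor function: probability of surviving beyond time t
S(t) = P(T > t)

% Hazard function: instantaneous failure rate at t, given survival up to t
h(t) = \lim_{\Delta t \to 0} \frac{P(t \le T < t + \Delta t \mid T \ge t)}{\Delta t}
```

The hazard is a rate rather than a probability: it is non-negative and has no upper bound, which is part of why it is harder to explain in practical terms than S(t).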

h1(t): patients with acute leukemia who do not respond to treatment have an increasing hazard rate

h2(t): indicates the risk of soldiers wounded by bullets who undergo surgery

h3(t): is the risk of healthy persons between 18 and 40 years of age whose main risks of death are accidents

h4(t): describes the process of human life

h5(t): patients with tuberculosis have risks that increase initially, then decrease after treatment

Goals of Survival Analysis

Data Layout

The estimated survivor curves for the treatment and placebo groups.

The possible confounding effect

In this case, we would say that the treatment effect is confounded by the effect of log WBC.

Need to adjust for imbalance in the distribution of log WBC

Interaction

What we mean by interaction is that the effect of the treatment may be different, depending on the level of log WBC.

There is a strong treatment by log WBC interaction, and we would have to qualify the effect of the treatment as depending on the level of log WBC.

1) To stratify on log WBC and compare survival curves for different strata

or

2) To use mathematical modeling procedures such as the proportional hazards or other survival models

How to estimate and graph survival curves?

Use the Kaplan-Meier (KM) method.

Introduction to Kaplan-Meier

Non-parametric estimate of the survival function: no mathematical assumptions, either about the underlying hazard function or about proportional hazards. It is simply the empirical probability of surviving past certain times in the sample, taking censoring into account.

• Non-parametric estimate of the survival function.

• Commonly used to describe survivorship of study population/s.

• Commonly used to compare two study populations.

• Intuitive graphical presentation.

Survival Data (right-censored)

[Figure: follow-up timelines for Subjects A to E, running from the beginning of the study to the end of the study; x-axis: time in months]

1. Subject E dies at 4 months.

2. Subject A drops out after 6 months.

3. Subject C dies at 7 months.

4. Subjects B and D survive for the whole year-long study period.

Corresponding Kaplan-Meier Curve

[Figure: Kaplan-Meier curve starting at 100%; x-axis: time in months]

The probability of surviving to 4 months is 100% = 5/5.

Subject E dies at 4 months: fraction surviving this death = 4/5.

Subject C dies at 7 months: fraction surviving this death = 2/3.

Rule from probability theory:

P(A and B) = P(A) * P(B) if A and B are independent

In survival analysis: intervals are defined by failures (2 intervals leading to failures here).

P(surviving intervals 1 and 2) = P(surviving interval 1) * P(surviving interval 2)

Product limit estimate of survival = P(surviving interval 1 | at risk up to failure 1) * P(surviving interval 2 | at risk up to failure 2) = 4/5 * 2/3 = 0.5333

The product limit estimate

• The probability of surviving in the entire year, taking into account censoring

• = (4/5) (2/3) = 53%

• NOTE: >40% (2/5) because the one drop-out survived at least a portion of the year.

• AND <60% (3/5) because we don’t know if the one drop-out would have survived until the end of the year.
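The five-subject example can be checked with a few lines of Python. This is a minimal sketch, not a library implementation: the function name kaplan_meier and the encoding of the data (True for a death, False for a censored observation, with Subjects B and D censored at 12 months) are assumptions made for illustration.

```python
from fractions import Fraction


def kaplan_meier(times, events):
    """Return [(failure_time, survival_estimate)] using the product limit rule."""
    survival = Fraction(1)
    curve = []
    # Only distinct failure times define the intervals.
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= Fraction(at_risk - deaths, at_risk)
        curve.append((t, survival))
    return curve


# Subjects A to E: A drops out at 6, B and D are followed for 12 months,
# C dies at 7, E dies at 4.
times = [6, 12, 7, 12, 4]
events = [False, False, True, False, True]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t} months: S = {s} = {float(s):.4f}")
```

Subject A is counted in the risk set at 4 months but not at 7 months, which is how the drop-out contributes information for only part of the year.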

n(f): the number of subjects in the risk set at the start of the interval

t(f): failure time

q(f): the number of censored subjects

m(f): the number of failures

KM formula = product limit formula
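The formula itself is missing from this transcript. A common way to write it, using the n(f) and m(f) notation defined above, is sketched here; the hat notation and the index i are assumptions, not taken from the slides.

```latex
% Product limit (Kaplan-Meier) estimate at the f-th ordered failure time
\hat{S}\left(t_{(f)}\right)
  = \prod_{i=1}^{f} \widehat{\Pr}\left(T > t_{(i)} \mid T \ge t_{(i)}\right)
  = \prod_{i=1}^{f} \frac{n_{(i)} - m_{(i)}}{n_{(i)}}
```

For the five-subject example this gives (5 - 1)/5 * (3 - 1)/3 = 4/5 * 2/3.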

When there are censored subjects

How do we evaluate whether or not KM curves for two or more groups are statistically equivalent? The most popular testing method is called the log-rank test.
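For two groups, the log-rank statistic compares the observed number of failures in one group with the number expected if both groups shared the same survival curve. The sketch below is a plain-Python illustration under that standard formulation; the function name and the toy data are hypothetical and are not taken from the slides.

```python
def logrank_statistic(times1, events1, times2, events2):
    """Two-group log-rank chi-square statistic (1 degree of freedom)."""
    group1 = list(zip(times1, events1))
    group2 = list(zip(times2, events2))
    failure_times = sorted({t for t, e in group1 + group2 if e})

    observed_minus_expected = 0.0
    variance = 0.0
    for t in failure_times:
        n1 = sum(1 for u, _ in group1 if u >= t)  # at risk in group 1
        n2 = sum(1 for u, _ in group2 if u >= t)  # at risk in group 2
        m1 = sum(1 for u, e in group1 if u == t and e)  # failures in group 1
        m2 = sum(1 for u, e in group2 if u == t and e)  # failures in group 2
        n, m = n1 + n2, m1 + m2
        observed_minus_expected += m1 - n1 * m / n
        if n > 1:  # the variance term is undefined when one subject remains
            variance += n1 * n2 * m * (n - m) / (n ** 2 * (n - 1))
    return observed_minus_expected ** 2 / variance


# Hypothetical toy data: 1 = failure, 0 = censored.
stat = logrank_statistic([2, 4], [1, 1], [3, 5], [1, 0])
print(f"log-rank chi-square = {stat:.4f}")
```

Under the null hypothesis of equal survival curves, the statistic is compared with a chi-square distribution with 1 degree of freedom.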

The Log-Rank Test for Several Groups

Alternatives to the Log Rank Test

The Wilcoxon test (called the Breslow test in SPSS)

Wilcoxon Test

All the test results are highly significant yielding a similar conclusion to reject the null hypothesis.

Choosing a Test

Confidence intervals for KM curves
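The slide content for this topic is not in the transcript. One widely used approach, assumed here rather than confirmed by the source, is Greenwood's formula for the variance of the KM estimate combined with a normal-approximation interval. The sketch below applies it to the five-subject example; the function name and the clipping of the limits to [0, 1] are illustrative choices.

```python
import math


def km_with_greenwood(times, events, z=1.96):
    """Return [(time, survival, lower, upper)] with Greenwood-based limits."""
    survival = 1.0
    greenwood_sum = 0.0
    rows = []
    for t in sorted({t for t, e in zip(times, events) if e}):
        n = sum(1 for u in times if u >= t)  # at risk
        m = sum(1 for u, e in zip(times, events) if u == t and e)  # failures
        survival *= (n - m) / n
        if m < n:  # the Greenwood term is undefined when everyone at risk fails
            greenwood_sum += m / (n * (n - m))
        se = survival * math.sqrt(greenwood_sum)
        lower = max(0.0, survival - z * se)
        upper = min(1.0, survival + z * se)
        rows.append((t, survival, lower, upper))
    return rows


# Five-subject example: deaths at 4 and 7 months, drop-out at 6 months,
# two subjects followed for the full 12 months.
rows = km_with_greenwood([6, 12, 7, 12, 4], [False, False, True, False, True])
for t, s, lo, hi in rows:
    print(f"t = {t}: S = {s:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

With only five subjects the interval is very wide, which is expected for such a small sample.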

Edited Output From Stata:

Time-independent variable: values for a given individual do not change over time; e.g., SEX and smoking status (SMK).

Why the Cox PH Model Is Popular

1) Semiparametric property

2) Cox PH model is “robust”

Even though the baseline hazard is not specified, reasonably good estimates of regression coefficients, hazard ratios of interest, and adjusted survival curves can be obtained for a wide variety of data situations.

All we need are estimates of the b’s to assess the effect of the explanatory variables of interest.

The measure of effect is called a hazard ratio (HR).

Maximum likelihood (ML) Estimation of the Cox PH Model

Statistical inferences for hazard ratios

1) Test for treatment effect: Wald statistic: P < 0.001 (highly significant)

Conclusion: treatment effect is significant

2) Point estimate: HR = 4.523

Conclusion: the hazard for the placebo group is 4.5 times the hazard for the treatment group

3) 95% confidence interval for the HR: (2.027, 10.094)
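These three results are linked through the Wald approach: the HR is the exponentiated coefficient, and the interval is built on the log scale. The sketch below assumes a normal approximation and back-calculates the coefficient and its standard error from the reported HR and interval, since the regression output itself is not in the transcript; the function name wald_summary is hypothetical.

```python
import math


def wald_summary(beta, se, z_crit=1.96):
    """Return (HR, lower, upper, two-sided p) for a Cox coefficient."""
    hr = math.exp(beta)
    lower = math.exp(beta - z_crit * se)
    upper = math.exp(beta + z_crit * se)
    z = beta / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal p-value
    return hr, lower, upper, p_value


# Assumption: coefficient and standard error recovered from the reported
# HR = 4.523 and 95% CI (2.027, 10.094), not read from the original output.
beta = math.log(4.523)
se = (math.log(10.094) - math.log(2.027)) / (2 * 1.96)

hr, lower, upper, p_value = wald_summary(beta, se)
print(f"beta = {beta:.3f}, SE = {se:.3f}")
print(f"HR = {hr:.3f}, 95% CI = ({lower:.3f}, {upper:.3f}), p = {p_value:.5f}")
```

The interval is symmetric around the coefficient on the log scale, which is why it is not symmetric around the HR itself.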

The potential confounding effect

HR for model 1 (4.523) is higher than HR for model 2 (3.648)

Confounding: crude versus adjusted HR are meaningfully different.

Because there is confounding due to log WBC, we must control for log WBC, i.e., prefer model 2 to model 1.

Interaction in model

The Meaning of the PH Assumption

The PH assumption requires that the HR is constant over time

The hazard for one individual is proportional to the hazard for any other individual, where the proportionality constant is independent of time.
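In symbols, this follows directly from the usual form of the Cox PH model. The notation below is the standard one and is assumed rather than copied from the slides; the coefficients written as beta here are the b's referred to in the text, and X* and X are the covariate values of two individuals.

```latex
% Cox PH model: baseline hazard times an exponential term without t
h(t, X) = h_0(t) \exp\left(\sum_{i=1}^{p} \beta_i X_i\right)

% Hazard ratio for two individuals: the baseline hazard cancels
\frac{h(t, X^{*})}{h(t, X)}
  = \frac{h_0(t) \exp\left(\sum_{i=1}^{p} \beta_i X_i^{*}\right)}
         {h_0(t) \exp\left(\sum_{i=1}^{p} \beta_i X_i\right)}
  = \exp\left(\sum_{i=1}^{p} \beta_i \left(X_i^{*} - X_i\right)\right)
```

The right-hand side contains no t, so the ratio is the same constant at every time point.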

EXAMPLE: PH Not Satisfied

General rule: If the hazards cross, then a Cox PH model is not appropriate.

If the Cox PH model is inappropriate, how should we carry out the analysis?

Evaluating the Proportional Hazards Assumption

Checking the Proportional Hazards Assumption:

There are two types of graphical techniques available.

1) Comparing estimated -ln(-ln) survivor curves

2) Comparing observed with predicted survivor curves
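The first technique rests on a simple identity: if two groups have proportional hazards with ratio theta, their survivor functions satisfy S1(t) = S0(t)^theta, so the -ln(-ln) curves differ by the constant ln(theta) and appear parallel. The sketch below demonstrates this numerically; the survival values and theta = 2 are hypothetical.

```python
import math


def neg_log_neg_log(s):
    """Return -ln(-ln(s)) for a survival probability 0 < s < 1."""
    return -math.log(-math.log(s))


# Hypothetical baseline survival probabilities at four time points.
baseline = [0.9, 0.7, 0.5, 0.3]
theta = 2.0  # assumed constant hazard ratio

# Under proportional hazards the second group's curve is S0(t) ** theta.
exposed = [s ** theta for s in baseline]

gaps = [neg_log_neg_log(s0) - neg_log_neg_log(s1)
        for s0, s1 in zip(baseline, exposed)]
print([round(g, 4) for g in gaps])
print(round(math.log(theta), 4))
```

In practice the curves are computed from KM or adjusted survival estimates, and roughly parallel curves are taken as support for the PH assumption.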

Goodness-Of-Fit (GOF) tests