brd project
Multi-study Analysis Of Survival Data For Bovine Respiratory Disease
Reporter: Chao ‘Charlie’ Huang
Project presentation
OUTLINE
• 1. Introduction
  – Bovine Respiratory Disease
  – Survival analysis
  – Meta-analysis
  – Statistical models for combining multiple studies
  – Arends’ multivariate random-effects model
• 2. Methodology
  – Data manipulation
  – Modeling
• 3. Results and discussion
  – No covariates method
  – Covariate method
• 4. Conclusion
1. INTRODUCTION
• 1.1 Bovine Respiratory Disease (BRD)
  – a severe cattle disease
  – coughing, fever, dehydration and death
  – accounting for “approximately 75 percent of feedlot morbidity and 50 percent to 70 percent of all feedlot deaths” in the United States (Stotts 2010).
[Diagram: BRD occurrence; clinical diagnosis (temperature, haptoglobin, etc.)]
Survival analysis
Generalized linear model
Method              | Type of predictor variable                       | Type of response variable | Censoring?
Linear regression   | Categorical or continuous                        | Normally distributed      | No
Logistic regression | Categorical or continuous                        | Binary                    | No
Survival analysis   | Categorical or continuous (maybe time-dependent) | Binary                    | Allowed
The table is modified from Brian F. Gage, 2004.
1.2 Survival analysis
  – models time-to-event data
  – censoring: incomplete observation due to death, withdrawal, etc.
  – time-dependent covariates
Hazard function: $h(t) = \lim_{\Delta t \to 0} P\{ t \le T < t + \Delta t \mid T \ge t \} / \Delta t$
Survival function: $S(t) = P\{ T > t \}$
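Kaplan-Meier estimator:
$$\hat{S}(t) = \prod_{j:\, t_j \le t} \left[ 1 - \frac{d_j}{n_j} \right]$$
Cox's proportional hazards model:
$$h_i(t) = \lambda_0(t) \exp(\beta_1 x_{i1} + \dots + \beta_k x_{ik})$$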
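The Kaplan-Meier product can be illustrated with a minimal Python sketch. The function name `kaplan_meier` and the toy data are hypothetical and are not taken from the BRD studies; `d_j` is the number of events at time `t_j` and `n_j` the number of animals still at risk just before `t_j`.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t_j, S_hat(t_j))] at each distinct event time.

    times  -- observed time for each animal (event or censoring time)
    events -- 1 if the event (e.g. BRD) was observed, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)  # d_j per time
    leaving = Counter(times)       # animals leaving the risk set at each time
    at_risk = len(times)           # n_j just before the first time
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1.0 - d / at_risk   # multiply by [1 - d_j / n_j]
            curve.append((t, surv))
        at_risk -= leaving[t]           # events and censored both leave
    return curve

# Hypothetical toy data: six animals, three of them censored
times = [2, 3, 3, 5, 7, 7]
events = [1, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
```

Censored animals contribute to the risk set up to their censoring time but never to `d_j`, which is how the estimator uses incomplete observations.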
• BRD data from the OSU Animal Science Department
  – Study I: 137 cattle; 21 days; covariates (reticular temperature, haptoglobin, etc.)
  – Study II: 265 cattle; 42 days; covariates (rectal temperature, haptoglobin, etc.)
  – Study III: 347 cattle; 56 days
• Using Studies I and II, Li (2009) completed a survival analysis with the Kaplan-Meier method and Cox's proportional hazards regression.
  – Overall, “nearly half of the sick animals developed the disease in the first 7 days after arrival” and “when temperature is higher, the hazard of developing BRD is higher for both data sets”.
  – “when the haptoglobin level is higher, the hazard for developing BRD also increases” for Study I, and “the two coefficients, temperature and the interaction between temperature and time, are significant” for Study II.
• Next step
  – Increased sample size → more power
  – How about combining the three studies?
1.3 Meta-analysis
  – a statistical method that combines the results of several studies targeting the same or similar hypotheses
  – controls between-study variation
  – increases statistical power
• An example: Meta-analysis of the effects of psychosocial interventions on survival time in cancer patients
• However, our data
  – have a messy structure
    • missing or invalid variables
    • different durations
  – are observational
    • no randomization
    • no comparison of untreated vs. treated groups
• If we cannot use the traditional meta-analysis, how can we combine these three studies?
1.4 Statistical models for combining multiple studies
• Iterative generalized least-squares
• 1.5 Arends’ multivariate random-effects model
$$\ln(-\ln(\hat{S}_i)) = X_i \beta + Z_i b_i + \varepsilon_i$$
  – $\hat{S}_i$: survival proportion estimated by survival analysis methods
  – $\beta$: parameter vector of fixed effects
  – $b_i$: parameter vector of random effects
Coefficient and covariance are estimated by iterative generalized linear regression
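The left-hand side of the model is the ln(-ln) (complementary log-log) transform of an estimated survival proportion. A minimal sketch of the transform and its inverse follows; the function names `cloglog` and `inv_cloglog` are hypothetical, chosen for illustration only.

```python
import math

def cloglog(s):
    """ln(-ln(s)) for a survival proportion strictly between 0 and 1."""
    if not 0.0 < s < 1.0:
        raise ValueError("survival proportion must lie strictly in (0, 1)")
    return math.log(-math.log(s))

def inv_cloglog(y):
    """Map a value on the ln(-ln) scale back to a survival proportion."""
    return math.exp(-math.exp(y))

# Hypothetical survival proportion, not from the BRD studies
y = cloglog(0.8)
s_back = inv_cloglog(y)
```

The transform maps proportions in (0, 1) onto the whole real line, so a linear mixed model on that scale always back-transforms to a valid proportion.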
2. METHODOLOGY
• Data transformation
  – Study I
    • Reticular temperature (RETT) → rectal temperature (RECT)
    • RECT = 15.88 + 0.587*RETT, by Bewley et al. (2008)
• Data cleaning
  – Study I: 137 animals → 129 animals
  – Study III: 347 animals → 230 animals
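The temperature conversion can be written as a one-line function. This is only a sketch of the equation quoted from Bewley et al. (2008); the function name `rett_to_rect` and the input value are hypothetical, and the input is assumed to be in the same temperature units as the published equation.

```python
def rett_to_rect(rett):
    """Convert reticular temperature (RETT) to rectal temperature (RECT)
    using RECT = 15.88 + 0.587 * RETT."""
    return 15.88 + 0.587 * rett

# Hypothetical reading, for illustration only
rect = rett_to_rect(40.0)
```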
[Diagram: two modeling approaches, the no covariates method and the covariate method]
3. RESULTS AND DISCUSSION
• 3.1 No covariates method
[Figures: plots against time; panels A and B]
$$\ln(-\ln(\hat{S}_{ij})) = \beta_0 + \beta_1 \ln(\mathrm{day}) + b_i + \varepsilon_{ij} \quad (6)$$
$$\ln(-\ln(\hat{S}_{ij})) = \beta_0 + \beta_1 \ln(\mathrm{day}) + \beta_2 [\ln(\mathrm{day})]^2 + b_i + \varepsilon_{ij} \quad (7)$$
Parameter estimates: the straight line model in equation (6) and the quadratic curve model in equation (7)

                       | Equation (6)                     | Equation (7)
                       | Estimate  Std. error  P-value    | Estimate  Std. error  P-value
Regression coefficient
  β0 or intercept      | -1.9249   0.1406      0.0053     | -2.4587   0.1146      0.0022
  β1 for ln(day)       |  0.6421   0.0299      <.0001     |  1.3285   0.05496     <.0001
  β2 for [ln(day)]^2   |                                  | -0.1690   0.01300     <.0001
Covariance parameter
  Studies              |  0.0430                          |  0.02983
  Residual             |  0.0499                          |  0.01381
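To see what the fixed-effect estimates imply, the combined curve can be back-transformed from the ln(-ln) scale with the random effect set to zero. The sketch below uses only the regression coefficients reported above; the names `LINE`, `QUAD`, `combined_survival` and `turning_day` are hypothetical.

```python
import math

# Fixed-effect estimates (b0, b1, b2) as reported for equations (6) and (7)
LINE = (-1.9249, 0.6421, 0.0)
QUAD = (-2.4587, 1.3285, -0.1690)

def combined_survival(day, coef):
    """Back-transform the linear predictor to a survival proportion,
    with the study random effect b_i set to zero."""
    b0, b1, b2 = coef
    x = math.log(day)
    return math.exp(-math.exp(b0 + b1 * x + b2 * x * x))

def turning_day(coef):
    """Day at which a quadratic predictor in ln(day) stops increasing."""
    _, b1, b2 = coef
    return math.exp(-b1 / (2.0 * b2))

s_line_day1 = combined_survival(1, LINE)
s_quad_day1 = combined_survival(1, QUAD)
quad_turn = turning_day(QUAD)
```

Because β2 is negative, the quadratic predictor peaks at ln(day) = -β1 / (2β2), roughly day 51 with these estimates, after which the fitted proportion rises slightly. That is one concrete sense in which the fitted curve is not a real survival curve.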
Fit statistics: the straight line model in equation (6) and the quadratic curve model in equation (7)

                       | Equation (6) | Equation (7)
-2 Res Log Likelihood  |   3.5        | -73.6
AIC                    |   7.5        | -69.6
AICC                   |   7.7        | -69.4
BIC                    |   5.7        | -71.4
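The AIC values are consistent with the usual definition, AIC = -2 log likelihood + 2 × (number of parameters), if the two covariance parameters (Studies and Residual) are the ones counted. That count is an assumption inferred from the reported numbers, not something stated on the slides; the function name `aic` is hypothetical.

```python
def aic(neg2_res_loglik, n_params):
    """AIC from a -2 residual log likelihood and a parameter count."""
    return neg2_res_loglik + 2 * n_params

# Assumption: 2 covariance parameters (Studies, Residual) are counted
aic_line = aic(3.5, 2)     # straight line model, equation (6)
aic_quad = aic(-73.6, 2)   # quadratic curve model, equation (7)
```

Smaller is better on all four criteria, so the quadratic curve model in equation (7) is preferred.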
[Figure: study-specific result and combined result from the model in equation (6)]
[Figure: study-specific result and combined result from the model in equation (7)]
• 3.2 Covariate method
[Figure: axes are time and temperature]
[Figure: panels A to D for Study I and Study II; survival proportion with 95% confidence interval]
The selected fixed effects: temperature, ln(day), [ln(day)]^2
$$\ln(-\ln(\hat{S}_{ij})) = \beta_0 + \beta_1 \ln(\mathrm{day}) + \beta_2\, \mathrm{temperature} + \beta_3 [\ln(\mathrm{day})]^2 + b_i + \varepsilon_{ij} \quad (8)$$
Parameter estimates: the multivariate random-effects model in equation (8)

                       | Estimate  | Standard error | P-value
Regression coefficient
  β0 or intercept      | -32.7670  | 0.4943         | 0.0096
  β1 for ln(day)       |   1.5148  | 0.0114         | <.0001
  β2 for temperature   |   0.7610  | 0.0027         | <.0001
  β3 for [ln(day)]^2   |  -0.2024  | 0.0029         | <.0001
Covariance parameter
  Studies              |   0.4655  |                |
  Residual             |   0.0119  |                |
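One way to read the temperature coefficient: since -ln(S) is the cumulative hazard, ln(-ln(S)) is its logarithm, so at a fixed day and study a one-unit rise in temperature multiplies the fitted cumulative hazard by exp(β2). The sketch below uses the reported estimate; the variable names are hypothetical.

```python
import math

BETA_TEMPERATURE = 0.7610  # reported estimate of beta_2 in equation (8)

# Multiplicative change in fitted cumulative hazard per one-unit
# increase in temperature, holding day and study fixed
cumulative_hazard_ratio = math.exp(BETA_TEMPERATURE)
```

With this estimate the ratio is a little above 2, in line with the earlier finding that higher temperature goes with a higher hazard of developing BRD.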
[Figure: study-specific results, panels A to D for Study I and Study II; survival proportion with 95% confidence interval]
[Figure: combined result; survival proportion with 95% confidence interval]
4. CONCLUSION
• Strength
  – Handles observational data
  – Simple and robust
  – Easy to program in SAS®
• Weakness
  – Not a real survival curve
  – Random effects are assumed to be normally distributed
  – Over-fitting may occur
  – Journal papers?
• Future improvement
  – ln(-ln) transformation
    • Regression splines, fractional polynomials, etc.
    • A simulation test may decide the best transformation
  – Normal distribution assumption
    • A gamma distribution, by Fiocco, Putter and van Houwelingen (2009)