Survival Function Estimation from Current Status Data
SWB
JSM San Diego 2012
Objective: Compare Baseline and Endline U5
• U5 = P[Life ≤ 5 years]
• Added objective: Compare survivor function estimates
  – P[Life > t | Baseline] vs. P[Life > t | Endline], t = 1, 2, …, 60+ months
  – Baseline = 2007; Endline = 2010
Data were from Surveys
• IRC surveyed two Afghan refugee camps in Pakistan in 2007 and one after humanitarian aid in 2010
  – IRC = International Rescue Committee, www.rescue.org
  – Before aid (2007) = “baseline” and after aid (2010) = “endline”
  – 2008 earthquake prevented endline survey of the other refugee camp
Data Details
• Many questions: intrusive, detailed
• Data included survey dates, live birth dates within last 5 years, deaths, and current status (dead or alive)
• PDF files showed the questionnaires
• Excel spreadsheets contained data relevant to objective(s)
Data Problems
• If exact dates were unknown, the month or season was given, including “monsoon”
• Some ages at death were inadmissible; the deaths would have occurred after the survey dates
• Baseline and endline data were in different formats
• Some instructions seemed to have been misunderstood; e.g., baseline survey date
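The admissibility problem above can be sketched as a simple date check. This is a minimal illustration, not the procedure used for the talk: the function names, the mid-month placement of month-only dates, and the survey date are all assumptions (the slides do not give the survey date).

```python
from datetime import date


def months_between(start: date, end: date) -> int:
    """Calendar months from start to end, ignoring the day of the month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def is_admissible(birth: date, age_at_death_months: int, survey: date) -> bool:
    """True if the reported death would fall in or before the survey month."""
    return 0 <= age_at_death_months <= months_between(birth, survey)


# Hypothetical survey date, for illustration only; the slides do not state it.
SURVEY = date(2010, 6, 15)

# A birth date known only to the month is placed at mid-month (assumed convention).
print(is_admissible(date(2009, 1, 15), 12, SURVEY))  # death precedes the survey
print(is_admissible(date(2010, 1, 15), 12, SURVEY))  # death would follow the survey
```

Records that fail the check can still contribute to current status counts (dead or alive at survey), even though the reported age at death is unusable.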
Some Endline Grouped Age-at-Death Data
Month Ships Jan-05 Feb-05 Mar-05 Apr-05 May-05 Jun-05 Jul-05 Aug-05 Sep-05 Oct-05 Nov-05 Dec-05 Jan-06 Feb-06 Mar-06 Apr-06
Jan-05 12
Feb-05 6
Mar-05 14 1
Apr-05 21
May-05 20 1 1
Jun-05 23 3
Jul-05 31 1
Aug-05 22
Sep-05 12 1
Oct-05 43 1 1
Nov-05 28
Dec-05 23
Jan-06 51 2
Feb-06 28
Mar-06 37
Apr-06 26 2
Sufficient Current Status Data
• Not all data are shown. These are endline data.
Month Births Deaths
1/15/2005 12 0
2/15/2005 6 0
3/15/2005 14 1
4/15/2005 21 0
5/15/2005 20 1
6/15/2005 23 3
7/15/2005 31 1
8/15/2005 22 1
9/15/2005 12 2
10/15/2005 43 2
11/15/2005 28 1
12/15/2005 23 1
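As a rough sketch of how these counts are used, the rows above can be turned into the observed fraction dead at survey for each birth cohort. The function name `fraction_dead` and the "YYYY-MM" month labels are illustrative assumptions. Only the twelve rows shown on the slide are included, so the output is not the full endline data set.

```python
# Endline rows shown on the slide: (birth month, live births, deaths by survey date).
ROWS = [
    ("2005-01", 12, 0), ("2005-02", 6, 0), ("2005-03", 14, 1),
    ("2005-04", 21, 0), ("2005-05", 20, 1), ("2005-06", 23, 3),
    ("2005-07", 31, 1), ("2005-08", 22, 1), ("2005-09", 12, 2),
    ("2005-10", 43, 2), ("2005-11", 28, 1), ("2005-12", 23, 1),
]


def fraction_dead(rows):
    """Return (birth month, deaths / births) per cohort, youngest cohort first."""
    ordered = sorted(rows, key=lambda row: row[0], reverse=True)
    return [(month, deaths / births) for month, births, deaths in ordered]


for month, fraction in fraction_dead(ROWS):
    print(f"{month}: {fraction:.3f}")
```

Ordering by youngest cohort first is the same as ordering by increasing age at survey. The raw fractions are not monotone in age, which is why a constrained estimate of the distribution function is needed.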
Nonparametric Max. Likelihood from Current Status Data
• $\ln L = \sum_j \left[ d_j \ln(F_j) + (s_j - d_j) \ln(1 - F_j) \right]$
• $F_j = P[\text{Life} \le j]$; $d_j$ and $s_j$ are the current status deaths and all live births, from both surveys, for $j = 1, 2, \ldots, 60+$
• Use Excel Solver to maximize the log likelihood as a function of $p_k \ge 0$, where $F_j = \sum_{k=1}^{j} p_k$
  – Also used a least squares fit to the current status NPMLE
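The slides maximized this likelihood with Excel Solver. As an alternative sketch, and not the author's method: for this binomial log likelihood with nondecreasing F_j, the maximizer is known to be the weighted isotonic regression of d_j/s_j with weights s_j, which the pool-adjacent-violators algorithm computes directly. The function names are illustrative assumptions. Inputs must be ordered by increasing age at survey with every s_j > 0. The sample counts are the twelve endline rows shown on the earlier slide, youngest cohort first, so the result covers only those cohorts.

```python
import math


def npmle_current_status(deaths, births):
    """Nondecreasing F_j maximizing sum of d ln F + (s - d) ln(1 - F), via PAVA."""
    if len(deaths) != len(births):
        raise ValueError("deaths and births must have the same length")
    if any(s <= 0 for s in births):
        raise ValueError("every age group needs at least one birth")
    blocks = []  # each block holds [pooled deaths, pooled births, number of ages]
    for d, s in zip(deaths, births):
        blocks.append([d, s, 1])
        # Pool while the previous block's death fraction exceeds the last block's.
        while len(blocks) > 1 and blocks[-2][0] * blocks[-1][1] > blocks[-1][0] * blocks[-2][1]:
            d_last, s_last, n_last = blocks.pop()
            blocks[-1][0] += d_last
            blocks[-1][1] += s_last
            blocks[-1][2] += n_last
    fitted = []
    for d, s, n in blocks:
        fitted.extend([d / s] * n)
    return fitted


def log_likelihood(deaths, births, cdf):
    """Current status log likelihood; terms with a zero count are skipped."""
    total = 0.0
    for d, s, f in zip(deaths, births, cdf):
        if d > 0:
            total += d * math.log(f)
        if s - d > 0:
            total += (s - d) * math.log(1.0 - f)
    return total


# Rows shown on the slide, ordered youngest cohort first (Dec-05 back to Jan-05).
DEATHS = [1, 1, 2, 2, 1, 1, 3, 1, 0, 1, 0, 0]
BIRTHS = [23, 28, 43, 12, 22, 31, 23, 20, 21, 14, 6, 12]

F_HAT = npmle_current_status(DEATHS, BIRTHS)
print([round(f, 4) for f in F_HAT])
print(round(log_likelihood(DEATHS, BIRTHS, F_HAT), 3))
```

The Solver parameters p_k correspond to the successive differences of the fitted values, which are nonnegative by construction.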
Kaplan-Meier Estimate from Admissible Ages at Deaths
• Data arranged in form of Nevada table
• Used workbook described at http://home.comcast.net/~pstlarry/KMUsrMan.htm
• Includes Greenwood’s estimate of standard deviation of survival function
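For readers without the workbook, the Kaplan-Meier product-limit estimate and Greenwood's standard error can be sketched in a few lines. This is not the workbook's implementation; the function name and the example counts are hypothetical. It assumes the Nevada table has already been reduced to the number at risk and the number of deaths at each age.

```python
import math


def kaplan_meier_greenwood(at_risk, deaths):
    """Return a list of (survival estimate, Greenwood standard error) by age."""
    survival = 1.0
    variance_sum = 0.0  # running sum of d / (n * (n - d))
    estimates = []
    for n, d in zip(at_risk, deaths):
        if n <= 0 or d < 0 or d > n:
            raise ValueError("need 0 <= deaths <= at_risk and at_risk > 0")
        survival *= (n - d) / n
        if n > d:
            variance_sum += d / (n * (n - d))
        estimates.append((survival, survival * math.sqrt(variance_sum)))
    return estimates


# Hypothetical counts for illustration: 10 at risk with 2 deaths, then 8 with 2.
for surv, std_err in kaplan_meier_greenwood([10, 8], [2, 2]):
    print(f"S = {surv:.3f}, std. err. = {std_err:.4f}")
```

With no censoring, as in the hypothetical counts above, Greenwood's standard error equals the binomial value sqrt(S(1 - S)/n) for the original n at risk.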
CDF Estimates (Current Status)
Age at death, months    Baseline (MLE, LSE)    Endline (MLE, LSE)
1                       4.1%, 2.4%             3.8%, 2.7%
6                       7.8%, 7.9%             3.8%, 3.2%
12                      8.4%, 9.6%             3.8%, 3.2%
24                      10.4%, 10.4%           3.9%, 3.4%
36                      10.4%, 10.4%           3.9%, 3.9%
CDF Estimates from Current Status Data
[Figure: estimated CDF (vertical axis, 0 to 0.12) vs. Age at Death, Months (horizontal axis, 0 to 60). Series: Baseline Prov. 2 MLE; Baseline Prov. 2 LSE; Endline Prov. 2 MLE; Endline Prov. 2 LSE.]
Births, Deaths, U5, and Std. Err.
Year, Province      Births   Deaths   U5      Std. Error Est.
2007, 1             847      ~95      11.2%   1.09%
2007, 2             1076     ~112     10.4%   0.93%
2007, both          1923     ~208     10.8%   0.71%
2010, 2 Cur. St.    2133     85       4.0%    0.42%
2010, 2 Ages        2133     74       4.1%    0.46%
• Std. Error Est. is Greenwood’s standard deviation
• See Banerjee and Wellner for confidence intervals from current status data
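As a rough plausibility check on the table, Greenwood's formula reduces to the binomial standard error sqrt(p(1 - p)/n) when no lifetimes are censored. The sketch below applies that simplification to the tabulated births and U5 values. Whether the reported figures were computed this way is not stated in the slides, and the function name is an illustrative assumption.

```python
import math


def binomial_se(p, n):
    """Standard error of a proportion p estimated from n independent births."""
    return math.sqrt(p * (1.0 - p) / n)


# (row label, births, U5, reported standard error), all copied from the table.
TABLE = [
    ("2007, 1", 847, 0.112, 0.0109),
    ("2007, 2", 1076, 0.104, 0.0093),
    ("2007, both", 1923, 0.108, 0.0071),
    ("2010, 2 Cur. St.", 2133, 0.040, 0.0042),
    ("2010, 2 Ages", 2133, 0.041, 0.0046),
]

for label, births, u5, reported in TABLE:
    approx = binomial_se(u5, births)
    print(f"{label}: binomial {approx:.2%} vs. reported {reported:.2%}")
```

Small differences from the reported values are expected where some lifetimes were censored or the tabulated percentages were rounded.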
Conclusions
• Estimates from current status and admissible ages at deaths agreed tolerably
• U5 = ~10% baseline and ~4% endline
  – Pakistan U5 is 8.7% (Wikipedia)
• Infant mortality (1 year) is almost 4% baseline AND endline
  – Pakistan infant mortality is 6.7% (World Bank, World Development measures)
References
• Banerjee, Moulinath and Jon A. Wellner, “Confidence Intervals for Current Status Data,” Scand. J. of Statist., vol. 32, pp. 405-424, 2005
• George, L. L., “Kaplan-Meier Reliability Estimation Spreadsheet,” ASQ Reliability Review, vol. 25, no. 2, http://home.comcast.net/~pstlarry/KMUsrMan.htm, June 2005
• George, L. L., “What Price Required Data?” http://home.comcast.net/~pstlarry/NwsRev2.doc, pp. 3-4, 2000
• Greenwood, M., “The Natural Duration of Cancer,” Reports on Public Health and Medical Subjects 33, His Majesty’s Stationery Office, pp. 1-26, 1926
• Miller, Rupert, “What Price Kaplan-Meier?” Biometrics, vol. 39, pp. 1077-1081, 1983
• Miller, Rupert, Beyond ANOVA, Basics of Applied Statistics, Chapman and Hall, New York, 1997
• “Survival Analysis,” http://en.wikipedia.org/wiki/Survival_analysis Nov. 2, 2011