Validating maternal mortality estimates

November 2, 2010

Christopher J.L. Murray

Institute Director

Outline

• Predictive validity

• Uncertainty

• Comparisons with new WHO model


Validation of our model approach

• Given the range of options for the modeling strategy, it is essential to evaluate model performance objectively

• We want an empirical answer to questions like:

o Are these the right covariates to include in the first stage?

o Are these the right transformations of the covariates?

o Does the spatial-temporal stage improve model performance? If so, by how much?

Validation

• To validate our model, we need something to compare the model’s output to

• Ideally, we would have the “truth” to compare the model to, but we just have observed data points, not the true underlying risk of maternal death

• Instead, we can “hold back” some of the observed data and then see how well our model, fit to the remaining data, does in predicting the held back data points


What do we care about?

• We care that our model can:

o Predict for gaps in time

─ For country-years that are missing in the middle of the time series

o Predict out of time (i.e. forecast and backcast)

─ For countries where we have only a partial time series

o Predict for countries with no data


Predictive validity

• We can construct four types of predictive validity test to validate our model

• The basic idea:

1. Sample 20% of the data, depending on what type of test you want to conduct

─ Randomly sample 20% of country-years with data

─ Randomly sample 20% of countries with data

─ Hold out the first 20% of years of data for all countries

─ Hold out the last 20% of years of data for all countries

2. Estimate the model in the remaining 80% of data

3. Using the model from step (2), predict into the 20% hold-out sample

4. Calculate metrics of fit to determine how well the model did predicting the observed data in the 20% hold-out sample
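The four hold-out schemes in step 1 can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `split_holdout`, the scheme labels, and the `(country, year)` record layout are all hypothetical, and "first/last 20% of years" is assumed to mean the earliest or latest 20% of each country's own observed years.

```python
import random

def split_holdout(records, scheme, frac=0.2, seed=0):
    """Split (country, year) records into (train, holdout) for one test type.

    scheme is one of 'country_years', 'countries', 'first_years', 'last_years'
    (hypothetical labels for the four tests listed above).
    """
    rng = random.Random(seed)
    records = list(records)
    if scheme == "country_years":
        # Random 20% of country-years: tests prediction for gaps in time
        k = round(frac * len(records))
        held = set(rng.sample(records, k))
    elif scheme == "countries":
        # Random 20% of countries: tests prediction for countries with no data
        countries = sorted({c for c, _ in records})
        k = round(frac * len(countries))
        chosen = set(rng.sample(countries, k))
        held = {r for r in records if r[0] in chosen}
    elif scheme in ("first_years", "last_years"):
        # Earliest or latest 20% of each country's years: backcast / forecast
        held = set()
        for country in {c for c, _ in records}:
            years = sorted(y for c, y in records if c == country)
            k = round(frac * len(years))
            if scheme == "first_years":
                picked = years[:k]
            else:
                picked = years[len(years) - k:]
            held.update((country, y) for y in picked)
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    train = [r for r in records if r not in held]
    holdout = [r for r in records if r in held]
    return train, holdout

# Toy data: 5 countries with 10 years of observations each
records = [(c, y) for c in "ABCDE" for y in range(1990, 2000)]
train, holdout = split_holdout(records, "last_years")
```

The model would then be fit to `train` and used to predict the `holdout` records (steps 2 and 3).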

• We repeat steps 1-3 thirty times for the tests of gaps in time and of countries with no data, to make sure our results are not an artifact of a given random sample

Predictive validity results: comparing the linear and spatio-temporal models

20% of Countries
Regression        Root Mean SE*   Root Median SE   Mean RE**   Median RE
Linear                   214.84            27.00       0.604       0.417
Spatio-Temporal          189.27            25.34       0.521       0.357

First 20% of Country Years
Regression        Root Mean SE    Root Median SE   Mean RE     Median RE
Linear                   208.28            22.04       0.702       0.437
Spatio-Temporal          129.32            11.92       0.392       0.199

Last 20% of Country Years
Regression        Root Mean SE    Root Median SE   Mean RE     Median RE
Linear                   158.86            13.23       0.538       0.421
Spatio-Temporal          104.08             7.46       0.284       0.213

Random 20% of Country Years
Regression        Root Mean SE    Root Median SE   Mean RE     Median RE
Linear                   215.44            24.22       0.619       0.419
Spatio-Temporal          125.34            10.36       0.286       0.165

* SE = Squared Error
** RE = Relative Error
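The four fit metrics in these tables could be computed along the following lines. This is a sketch under stated assumptions: the slides do not define relative error, so it is taken here to be the absolute error divided by the observed value, and the function name `fit_metrics` is hypothetical.

```python
import math
import statistics

def fit_metrics(observed, predicted):
    """Summarize hold-out prediction error with the four metrics in the tables."""
    # Squared error for each held-out data point
    sq = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    # Relative error, ASSUMED to be |predicted - observed| / observed
    rel = [abs(p - o) / o for o, p in zip(observed, predicted)]
    return {
        "root_mean_se": math.sqrt(statistics.mean(sq)),
        "root_median_se": math.sqrt(statistics.median(sq)),
        "mean_re": statistics.mean(rel),
        "median_re": statistics.median(rel),
    }

# Toy hold-out sample of observed and predicted MMR values
metrics = fit_metrics([100, 200, 400], [110, 180, 400])
```

The median-based metrics are much less sensitive to a few badly predicted points than the mean-based ones, which may explain the large gap between the Root Mean SE and Root Median SE columns.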


Uncertainty

• Uncertainty is the “life preserver” for any researcher!

• While uncertainty intervals are sometimes ignored by policy-makers, they are crucial when interpreting results

• Identifying and incorporating all relevant types of uncertainty into uncertainty intervals in an empirical way is crucial

What is the objective of uncertainty measurement?

[Figure: MMR (0 to 200) by year, 1980 to 2010, for a sample country]

• This line is the true, underlying risk of maternal death in a sample country, or the “expected value”

• But we don’t observe that expected value; we observe particular data points

• We want our uncertainty bounds to contain the expected value 95% of the time

What are the sources of uncertainty?

• Sampling uncertainty

• Non-sampling uncertainty

• Parameter uncertainty

o From the linear model

o From the spatial-temporal local regressions

• Remaining systematic variation


Uncertainty: source 1

• Sampling uncertainty

• Any data source will have some degree of associated stochastic sampling error, which must be reflected in any estimates of uncertainty

• We capture this uncertainty by drawing from a binomial distribution with the observed maternal cause fraction as p and the number of trials (n) as the total number of observed deaths

• We simulate 100 datasets by drawing from these distributions, and use these to propagate the sampling uncertainty through the modeling process
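The binomial draws described above can be sketched as follows. This is only an illustration: the function name `simulate_cause_fractions` and the example counts are hypothetical, and each binomial draw is built from individual Bernoulli trials to keep the sketch dependency-free.

```python
import random

def simulate_cause_fractions(maternal_deaths, total_deaths, n_sims=100, seed=0):
    """Simulate maternal cause fractions for one data point.

    p is the observed maternal cause fraction and n is the total number of
    observed deaths, as on the slide.
    """
    rng = random.Random(seed)
    p = maternal_deaths / total_deaths
    sims = []
    for _ in range(n_sims):
        # One binomial(n, p) draw, built as the sum of n Bernoulli(p) trials
        draw = sum(rng.random() < p for _ in range(total_deaths))
        sims.append(draw / total_deaths)
    return sims

# Hypothetical data point: 30 maternal deaths among 1,000 observed deaths
simulated = simulate_cause_fractions(30, 1000)
```

Data points based on few deaths produce widely scattered simulated fractions, so their sampling uncertainty carries through to the final estimates.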

Sampling uncertainty

[Figure: MMR (0 to 200) by year, 1980 to 2010]

Uncertainty: sources 2 and 3

• Parameter uncertainty

• The application of a statistical model yields uncertainty in the parameter estimates of the model

o You don’t just get an estimate of the β: you get a β ± a measure of uncertainty

o Here we have two stages of parameter uncertainty

─ From the linear model

─ From the spatial-temporal local regressions


Simulating for parameter uncertainty

• For each of the 100 datasets generated:

o Estimate the linear model

o Make five draws from the variance-covariance matrix of the regression βs

o Estimate the spatial-temporal model for each of these draws from the linear model

o Make five draws from the variance-covariance matrix of each of the local regressions
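A single stage of this procedure can be sketched for a one-covariate linear model. This is a simplified stand-in, not the authors' model: it assumes "draws from the variance-covariance matrix" means draws from a multivariate normal centered on the estimated βs, and the function names `ols_with_cov` and `draw_betas` and the toy data are hypothetical.

```python
import math
import random

def ols_with_cov(x, y):
    """Fit y = b0 + b1*x by least squares; return (b0, b1) and their 2x2 covariance."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in resid) / (n - 2)  # residual variance
    var_b0 = sigma2 * (1 / n + xbar ** 2 / sxx)
    var_b1 = sigma2 / sxx
    cov_b0b1 = -sigma2 * xbar / sxx
    return (b0, b1), ((var_b0, cov_b0b1), (cov_b0b1, var_b1))

def draw_betas(beta, cov, n_draws=5, seed=0):
    """Draw coefficient pairs from N(beta, cov) using a 2x2 Cholesky factor."""
    rng = random.Random(seed)
    l11 = math.sqrt(cov[0][0])
    l21 = cov[1][0] / l11
    l22 = math.sqrt(cov[1][1] - l21 ** 2)
    draws = []
    for _ in range(n_draws):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        draws.append((beta[0] + l11 * z1, beta[1] + l21 * z1 + l22 * z2))
    return draws

# Toy data: a noisy linear trend over 10 time points
x = list(range(10))
noise = [0.1, -0.1, 0.2, -0.2, 0.05, -0.05, 0.15, -0.15, 0.1, -0.1]
y = [2.0 + 0.5 * xi + e for xi, e in zip(x, noise)]
beta, cov = ols_with_cov(x, y)
beta_draws = draw_betas(beta, cov, n_draws=5)
```

Taken literally, the bullets above imply 100 datasets × 5 linear-model draws × 5 local-regression draws, or 2,500 simulated series per country; that count is an inference from the slide, not a figure it states.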

Parameter uncertainty: a simple example

[Figure: MMR (0 to 200) by year, 1980 to 2010, with two candidate fits labeled “Here’s one potential model” and “Here’s another potential model”]

• Parameter uncertainty takes into account the different models that could potentially fit the data

Uncertainty: source 4

• The fourth source we want to capture is the remaining systematic variation that our model does not explain

o i.e. education, fertility, etc., and spatio-temporal relatedness do not explain all variation in maternal mortality

• However, we cannot estimate the systematic variation directly; the remaining variation consists of three parts

o Systematic variation

o Stochastic variation

o Non-sampling variation

The leftover variation

[Figure: MMR (0 to 200) by year, 1980 to 2010, annotated “Non-sampling error” and “Systematic error, but we don’t observe the true value”]

• This difference could be partially stochastic error, partially systematic error and partially non-sampling error

Uncertainty: source 4

• We can separate out the stochastic variation from the systematic and non-sampling variation using simulation

• But we have no way to separate out the systematic and non-sampling variation, so to be conservative, we include both

o This will dramatically overestimate our uncertainty as non-sampling variation is quite large


Summarizing uncertainty


The recent WHO estimates (2010): input data

• The study divides countries into categories defined by the type of data available in that country

o Group A: Civil registration characterized as complete (63 countries)

o Group B: Other types of data available (85 countries)

o Group C: No national data available (24 countries)

The recent WHO estimates (2010): input data

• Group A: Civil registration characterized as complete (63 countries total; none of the workshop countries)

o Requirements for inclusion:

─ Earliest year of data before 1996, latest year after 2002

─ Data available for more than half the range of years available

─ Estimated completeness at more than 85% for all years

─ Deaths to ICD-10 R codes did not exceed 20%

o Maternal deaths inflated by a factor of 1.5 to correct for misclassification, unless country-specific adjustments were available

─ Based on reports in 15 countries; reported misclassification ranges from 1.08 (Uzbekistan) to 3.2 (El Salvador)

o Maternal and all-cause deaths of unknown age redistributed proportionally over the age range

o Vital registration (VR) data collapsed into 5-year time periods

The recent WHO estimates (2010): input data

• Group B: Other types of data available (85 countries, including all workshop countries)

o Sisterhood data

─ Assumed fraction of pregnancy-related deaths is understated, up-adjusted by a factor of 1.1

o Deaths in the household (including Indian SRS)

─ Adjusted upward by a factor of 1.1

o Other “special studies” (confidential enquiries, “RAMOS”)

─ Adjusted upward by a factor of 1.1

• Group C: No data available (24 countries)


Other WHO adjustments

• AIDS-related mortality

• Pregnancy related vs. maternal deaths


WHO AIDS adjustment

• Wanted the dependent variable in the regression model to reflect non-AIDS-related maternal deaths only

• Used unpublished UNAIDS tables on the proportion of total deaths of women aged 15-49 due to AIDS

• Assume that a fixed fraction of the AIDS deaths occurring during pregnancy should be counted as maternal deaths, separate from the non-AIDS-related maternal deaths; the fraction depends on the data source:

o 0.1 for pregnancy-related data points

o 0.5 for maternal data points

• Use this non-AIDS-related PMDF as the dependent variable in the regression model


WHO Pregnancy-related adjustment

• Distinction between:

o Pregnancy-related mortality (all deaths occurring during pregnancy up to 42 days after – including incidental deaths)

o Maternal mortality (death related to pregnancy, childbirth or puerperium, both direct and indirect causes)

• Adjust the input non-AIDS PMDF from data sources identifying pregnancy-related mortality:

o By a factor of 0.85 for most of the world

o By a factor of 0.9 in Sub-Saharan Africa

o Based on data from 8 countries
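As a worked illustration of this adjustment, the sketch below applies the two factors quoted above to a pregnancy-related PMDF. The function name `maternal_pmdf` and the example value are hypothetical; only the factors 0.85 and 0.9 come from the slide.

```python
def maternal_pmdf(pregnancy_related_pmdf, sub_saharan_africa=False):
    """Convert a pregnancy-related (non-AIDS) PMDF to a maternal PMDF."""
    # 0.9 in Sub-Saharan Africa, 0.85 for the rest of the world
    factor = 0.9 if sub_saharan_africa else 0.85
    return pregnancy_related_pmdf * factor

# A hypothetical pregnancy-related PMDF of 0.20
outside_ssa = maternal_pmdf(0.20)
inside_ssa = maternal_pmdf(0.20, sub_saharan_africa=True)
```

With these factors, a pregnancy-related PMDF of 0.20 becomes 0.17 for most of the world and 0.18 in Sub-Saharan Africa.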

WHO and partners regression-based approach

• Construct a database of 484 observations (680 total, of which 196 are excluded)

• Use a model to predict maternal mortality for the 109 countries in Group B (non-VR data) and Group C (no data)


WHO regression approach

• Dependent variable: ln(non-AIDS PMDF)

• Offset: ln(1-a), where a is the proportion of all deaths among women aged 15-49 that are due to AIDS

• Covariates:

o ln(GDP per capita): most data from the World Bank

o ln(general fertility rate): UNPD

o Coverage of skilled attendant at birth (UNICEF database, filled in using a logit model with time as the only covariate)

• Multi-level regression model with random effects for country and region

• Predicted values for 5-year intervals centered around 1990, 1995, 2000, 2005 and 2008
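Putting the bullets above together, the model can be written roughly as below. This is a reading of the slide, not the published specification: the intercept, the error term, the indexing, and the symbols (i for an observation, j[i] and k[i] for its country and region, SAB for skilled attendant coverage) are assumptions made for illustration.

```latex
\ln\!\left(\mathrm{PMDF}^{\text{non-AIDS}}_{i}\right)
  = \beta_0
  + \beta_1 \ln(\mathrm{GDP}_i)
  + \beta_2 \ln(\mathrm{GFR}_i)
  + \beta_3\, \mathrm{SAB}_i
  + \alpha^{\text{country}}_{j[i]}
  + \alpha^{\text{region}}_{k[i]}
  + \ln(1 - a_i)
  + \varepsilon_i
```

The offset term enters with its coefficient fixed at 1, so it rescales the predicted PMDF by (1 - a) rather than being estimated from the data.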

WHO counts of all-cause deaths for maternal age women

• All-cause counts of deaths very different from IHME estimates

All-cause death counts comparison

• WHO vs. UNPD

o UNPD estimates only available for five year blocks of time (1995, 2000, 2005)


WHO: AIDS-related maternal deaths

• Because the dependent variable was the non-AIDS PMDF, the contribution of AIDS to maternal mortality must be estimated after model estimation and added back in

o Move from non-AIDS PMDF to total PMDF

• Assume that half of the estimated number of AIDS deaths that occur during pregnancy should be counted as maternal deaths

• Assume the relative risk of dying from AIDS for a pregnant versus non-pregnant woman is 0.4


IHME and the recent UN estimates

                                   IHME                      UN (H4)
Data Sources                       2651                      2142
  Vital Statistics                 2186                      2010
  Surveys                          204                       819**
  Census                           46                        19
  Verbal Autopsy                   215                       113
Scope of Study
  Time series                      1980-2008                 1990-2008
  Countries                        181                       172
Correction
  Misclassification                Country specific          Correction factor 1.5 (63 countries)
  Completeness                     Country specific          UN estimates
Number of female deaths (15-49)    Rajaratnam, 2010          WHO lifetables
Estimate based on                  Model for all countries   118 model & 63 correction factor
Model                              Linear + Space-time       Multilevel
Dependent variable                 MM rate (ln) by age group Fraction of MM (log) all ages
Treatment of HIV                   Model-based               Estimated deaths separately
Covariates
  GDP                              yes                       yes
  Education                        yes                       no
  TFR                              yes                       yes
  HIV                              yes                       no
  Health services                  Neonatal mort             SBA
Model Validation                   yes                       no
Uncertainty                        yes                       yes
