Identifying Variation in Mortality and Morbidity For Selected Health Outcomes at the Postal Zip Code Level

In Escambia and Santa Rosa Counties

Center for Health Outcomes Research

University of South Florida, Health Sciences Center

Revised April 2004

Zip Code Level Multivariate Analysis

Health Indicators

Data for the analysis were obtained from the University of South Florida CATCH (Comprehensive Assessment for Tracking Community Health) data warehouse, which combines information on more than 300 indicators of community health. The CATCH database includes data from a variety of sources. The most recent data available were used for this project. Table 1 outlines the sources of data used for this project and the years for which the data were available. Demographic information was taken from a commercial source (ESRI) and reflects year 2000 U.S. Census data. Where multiple years of data were available, the most recent five years of data were included.

Table 1. Sources of Data

DATA SOURCE                        TYPE OF DATA                         YEARS AVAILABLE
ESRI                               Demographics                         2000
Florida Vital Statistics           Mortality & Maternal/Child Health    1996-2000
Florida Hospital Discharge Files   Hospitalizations                     1997-2001

Selection of Health Indicators

The investigators identified a list of health indicators to be included in the study. The health outcome indicators were selected because they were thought to be sensitive to increased exposure to airborne environmental hazards. Both mortality and morbidity indicators were included. The mortality indicators, which might best be characterized as health conditions that would be affected by long-term exposure to environmental hazards, included deaths from 1) all cancers, 2) lung cancers alone, 3) cardiovascular diseases, 4) any respiratory disease, 5) birth defects, and 6) all causes of death to infants. The morbidity indicators, which might best be characterized as being sensitive to short-term exposure to environmental hazards, included hospitalizations for asthma, cardiovascular disease, and respiratory disease. Two additional morbidity indicators were included: the number of live births with very low birth weight and the number of live births with low birth weight. While these indicators have not been directly associated with airborne environmental exposures, they have been widely used as indicators of poor population health.

Escambia/Santa Rosa

There were a total of 22 zip codes in Escambia and Santa Rosa counties during the study period. However, four of the zip codes (32511, 32530, 32561, and 32565) had a total population of fewer than 200 people. Analysis of zip codes with small population sizes might lead to unstable models. Therefore, data from each of the four zip codes with small total populations were combined with data from an adjacent zip code with a similar demographic makeup. Table 2 describes the zip codes that were combined.

Table 2. Merged Zip Codes

LOW POPULATION   MERGED WITH
32511            32505
32530            32583
32561            32566
32565            32535
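The merge in Table 2 amounts to folding each low-population zip code's counts into its partner before any rates are computed. The following is a minimal sketch of that step, not the investigators' code; the function name and the example counts are made up for illustration, and only the zip code pairs come from Table 2.

```python
# Low-population zip codes and the adjacent zip code each was merged with (Table 2).
MERGE_MAP = {"32511": "32505", "32530": "32583", "32561": "32566", "32565": "32535"}

def merge_small_zips(counts_by_zip):
    """Fold counts for low-population zip codes into their merge partner."""
    merged = {}
    for zip_code, count in counts_by_zip.items():
        target = MERGE_MAP.get(zip_code, zip_code)  # unmerged zip codes map to themselves
        merged[target] = merged.get(target, 0) + count
    return merged

# Hypothetical counts, for illustration only (not study data).
example = {"32505": 120, "32511": 3, "32566": 45, "32561": 2, "32570": 80}
print(merge_small_zips(example))
```

The same mapping would need to be applied to both the observed counts and the population denominators so that the two stay aligned.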

Matching

A cross-sectional observational study design was employed to compare selected health outcome measures for zip codes within Escambia and Santa Rosa (E/SR) counties with zip codes having similar demographic characteristics from the remainder of Florida (Matches). Escambia and Santa Rosa zip codes were matched using propensity scores (Smith, 1997). Propensity scores were calculated through a series of logistic regression equations that included the following independent variables: percent female, percent of the population over 65 years of age, percent of the population that is black, percent of the population that is Hispanic, total population, the percent of households earning $15,000 or less, and per capita income. A separate regression model was calculated matching E/SR zip codes to zip codes in each of the 13 State of Florida Department of Health Districts. Useful matches were identified for Districts 1, 2, 3, 4, 7, and 13, which represent counties in the central and northern part of Florida. Table 3 describes the counties and number of zip codes available. Zip codes available for matching in the regressions ranged between 80 and 113 for 5 of the 6 districts. District 1 (the district in which Escambia and Santa Rosa are located) provided only two counties and 20 potential matching zip codes. Appendix A provides a list of the results of the matching process. A series of one-way ANOVAs was conducted to compare mean values of the resultant groups. No statistically significant differences were observed.
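Once propensity scores have been estimated, the matching step itself can be illustrated with a simple nearest-score search. This is a sketch under stated assumptions: the report does not describe the exact matching rule, so the nearest-neighbor selection, the optional caliper, and the labels "A" to "E" below are all hypothetical and stand in for whatever procedure was actually used.

```python
def nearest_matches(target_score, candidate_scores, k=3, caliper=None):
    """Return up to k candidate labels whose propensity scores are closest to target_score.

    candidate_scores: dict mapping a zip code label to its propensity score.
    caliper: optional maximum allowed absolute difference in scores.
    """
    # Sort by distance to the target score; ties are broken by label for determinism.
    ranked = sorted(
        candidate_scores.items(),
        key=lambda item: (abs(item[1] - target_score), item[0]),
    )
    if caliper is not None:
        ranked = [(z, s) for z, s in ranked if abs(s - target_score) <= caliper]
    return [z for z, _ in ranked[:k]]

# Hypothetical scores, for illustration only.
candidates = {"A": 0.10, "B": 0.42, "C": 0.47, "D": 0.80, "E": 0.51}
print(nearest_matches(0.45, candidates, k=2))
```

A caliper keeps poor matches out even when fewer than k candidates remain, which is one way to read the report's note that useful matches were found in only some districts.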

Table 3. Counties Used for Matching

DISTRICT   COUNTIES                                                                                     AVAILABLE ZIP CODES
1          Okaloosa, Walton                                                                             20
2          Bay, Calhoun, Franklin, Gadsden, Gulf, Holmes, Jackson, Jefferson, Leon, Liberty,
           Madison, Taylor, Wakulla, Washington                                                         98
3          Alachua, Bradford, Columbia, Dixie, Hamilton, Lafayette, Levy, Putnam, Suwannee, Union       90
4          Baker, Clay, Duval, Nassau, Saint Johns                                                      80
7          Seminole, Osceola, Orange, Brevard                                                           113
13         Citrus, Hernando, Lake, Marion, Sumter                                                       103

Dependent Variables

The current study was designed to provide an exploratory analysis of the mortality and morbidity data for zip codes in Escambia and Santa Rosa counties. Lawson suggests that such analysis may be “pursued to examine some underlying structure which could present, or generate hypotheses” (Lawson, 2001, page 61). Here we model tract count data within each zip code. A tract count can be regarded as a type of average obtained by accumulating individual cases over a given area. In doing so it is important to adjust the data for the “at risk” population in each tract. To do this we calculated an “expected count” for each zip code by first creating age/race banded statewide averages for the selected health indicators and then multiplying these values by the population in the corresponding demographic strata in each zip code. Next a ratio was calculated by dividing the observed count (n_i) by the expected value (e_i). To avoid mathematical problems with zero count values for zip codes, a constant of 1 was added to both the numerator and the denominator, giving (n_i + 1)/(e_i + 1).

The resultant standardized mortality/morbidity ratios (SMRs) were used as dependent variables for the analysis. A value of 1 for an SMR indicates that the observed count is what might be expected based on the state average. Values greater than 1 indicate observed values greater than expected, while values less than 1 indicate that the observed value is less than expected based on the state average.
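The expected-count and ratio calculation described above can be sketched as follows. This is an illustration rather than the study's code: the stratum labels, rates, and populations are invented, and only the formula (n_i + 1)/(e_i + 1) is taken from the text.

```python
def expected_count(state_rates, zip_population):
    """Sum, over age/race strata, of the statewide rate times the zip code's population in that stratum."""
    return sum(state_rates[stratum] * pop for stratum, pop in zip_population.items())

def smr(observed, expected):
    """Standardized ratio with the +1 adjustment from the text: (n_i + 1) / (e_i + 1)."""
    return (observed + 1) / (expected + 1)

# Hypothetical statewide rates per person and zip-code populations by stratum.
state_rates = {
    ("65+", "black"): 0.02,
    ("65+", "white"): 0.015,
    ("<65", "black"): 0.002,
    ("<65", "white"): 0.001,
}
zip_population = {
    ("65+", "black"): 500,
    ("65+", "white"): 1000,
    ("<65", "black"): 4000,
    ("<65", "white"): 6000,
}

e_i = expected_count(state_rates, zip_population)  # 10 + 15 + 8 + 6 = 39 expected cases
print(e_i, smr(49, e_i))                           # 49 observed cases give a ratio of 50/40
```

With these made-up inputs the ratio is 1.25, which would be read as 25 percent more cases than the state average predicts for a population of this composition.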

Models

Originally, it was planned to directly investigate the impact of environmental exposure on persons living in E/SR counties using the Zip code as the unit of analysis. However, reliable data for environmental exposure were not available for all of the Zip codes included in the analysis across the study period. Therefore, a strategy was adopted to investigate whether specific Zip codes in E/SR had rates of health indicators that were higher (at a statistically significant level) than matched comparisons. Patterns of such differences across multiple indicators and age/race strata would suggest E/SR Zip codes that should be targeted for study as more detailed environmental data become available.

A series of generalized linear (Poisson regression) models was developed to test whether the SMR values for each E/SR Zip code (over five years) were different from those of the matched comparison Zip codes at a statistically significant level. Specifically, least squares means for each health indicator studied, for five years of data, were compared after adjusting for age, gender, percent of the population that is black, percent of the population over the age of 65, and the percent of households earning $15,000 per year or less. The unit of analysis for the models was the zip code nested within group. Five years of data were included in the analysis to measure impact over a reasonable time period and to increase the power of the statistical tests.

Using standard statistical procedures with nested/repeated data will likely lead to biased standard errors and test statistics (Diggle, Liang, and Zeger, 1994). To adjust for potential bias introduced by using nested/repeated data, we applied Generalized Estimating Equations (GEE) to all statistical models developed in this portion of the analysis. GEE provides a convenient, flexible way to solve problems presented by this type of data (Allison, 1998). Model fit was very good for all models developed. The ratio of the deviance statistic to the number of degrees of freedom was less than 1 for all models.
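The fit criterion mentioned above, deviance divided by degrees of freedom, can be illustrated for an ordinary Poisson model. The sketch below assumes the standard Poisson deviance formula and residual degrees of freedom equal to the number of observations minus the number of fitted parameters; it does not reproduce the GEE models in the report, and the example values are made up.

```python
import math

def poisson_deviance(observed, fitted):
    """Poisson deviance: 2 * sum(y * log(y / mu) - (y - mu)), with y * log(y / mu) taken as 0 when y == 0."""
    total = 0.0
    for y, mu in zip(observed, fitted):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2.0 * total

def deviance_per_df(observed, fitted, n_params):
    """Deviance divided by residual degrees of freedom (observations minus fitted parameters)."""
    return poisson_deviance(observed, fitted) / (len(observed) - n_params)

# Hypothetical observed counts and model-fitted means, for illustration only.
observed = [4, 7, 2, 9, 5, 6]
fitted = [4.5, 6.2, 2.8, 8.1, 5.4, 6.0]
print(deviance_per_df(observed, fitted, n_params=2))
```

A ratio near 1 is the usual benchmark for a Poisson model; values well above 1 suggest overdispersion.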
Model Interpretation

For each model developed, 18 tests of statistical significance were interpreted. Interpreting multiple statistical tests at the traditional p < .05 (nominal) level can inflate the type I error rate. With 18 independent tests of significance at that level, the likelihood of making at least one type I error is roughly 60 percent (1 - .95^18 is about .60). A common method to adjust for this situation in the research setting is to interpret only results with p values below a cut point more stringent than .05. There are a number of statistical procedures available to establish p values that ensure control of experiment-wise errors. Here we employ a modified Bonferroni approach proposed by Hochberg (1988). In this method all p values from the models that are run are rank ordered from smallest to largest. The smallest p value is interpreted using a cut point of .05 divided by the number of tests interpreted (.05/18 = .0028). The next smallest p value is interpreted using a cut point of .05 divided by the number of tests interpreted minus one (.05/17 = .0029). One is subtracted from the previous divisor to determine each subsequent adjusted alpha. This strategy controls for experiment-wise error at the p < .05 level but also severely reduces power in the analysis.

Given the exploratory nature of this analysis, we employed a mixed strategy to interpret significance for the tests of differences between E/SR Zip codes and the matching Zip code groups. We first interpret the models using the modified Bonferroni approach and report those results as strong evidence. We also report other tests in the model that are significant at the nominal p < .05 level but not at the adjusted level as weaker evidence.

Summarizing Evidence

In addition to describing the results of our models, we attempt to summarize the results of the models across disease/race/age groups. The object of this summary is to identify the zip codes for which the greatest cumulative evidence of poorer than expected health outcomes in E/SR counties exists. This analysis assumes that the greatest evidence for a general impact of environmental hazards is provided for Zip codes with strong statistical results found consistently across more than one disease group (when there is more than one group available) in more than one race/age group analysis.

In the body of this report we highlight (through the tables) only those results in which E/SR Zip codes have statistically significantly higher rates of disease than do the matches for that Zip code. Differences that represent strong evidence (adjusted to control for experiment-wise error) are colored in red, while those representing weaker evidence (nominal p < .05) are colored in yellow. We focus on zip codes with higher disease rates because we are trying to identify potential problem areas. However, many of the regression models also contained results for which the rates of disease in E/SR zip codes were statistically better than those of the matched controls. Appendix B presents the complete results of the models.

Mortality Indicators

Tables 4-8 provide the results of the models developed for the mortality indicators. Results of models conducted on data from the total population are presented in Table 4. The majority of statistically significant differences were related to birth defect mortality and infant mortality. Only one Zip code (32570) had significantly higher rates across more than one category of disease. Next we describe the results of the models by disease/race/age group. The results for blacks of all ages (Table 5) are provided first because they contain the most examples of outcomes in Escambia and Santa Rosa that were significantly worse than the Matches. They are followed by tables for models using data from blacks over the age of 65 (Table 6), whites of all ages (Table 7), and whites over the age of 65 (Table 8). Below we summarize the results for those Zip codes that exhibit the most consistent patterns of evidence of differences in the mortality indicators.
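The strong/weaker labelling that underlies the highlighted tables can be sketched in a few lines. The example assumes Hochberg's step-up rule (the largest ranked p value that meets its cut point is accepted together with every smaller one) and assumes independent tests for the error-rate calculation; the function names and p values are illustrative only and are not taken from the report's models.

```python
def familywise_error(alpha, n_tests):
    """Chance of at least one false positive across n independent tests at level alpha."""
    return 1 - (1 - alpha) ** n_tests

def label_evidence(p_values, alpha=0.05):
    """Label each p value 'strong', 'weaker', or 'ns' using stepwise cut points alpha / (m - rank + 1)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices, smallest p first
    # Step-up rule: find the largest rank whose p value meets its cut point.
    k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= alpha / (m - rank + 1):
            k = rank
    strong = set(order[:k])
    labels = []
    for i, p in enumerate(p_values):
        if i in strong:
            labels.append("strong")   # significant after adjustment
        elif p < alpha:
            labels.append("weaker")   # significant only at the nominal level
        else:
            labels.append("ns")
    return labels

print(round(familywise_error(0.05, 18), 2))
print(label_evidence([0.001, 0.04, 0.03, 0.2]))
```

With four tests the cut points are .0125, .0167, .025, and .05, so only the first p value in the example is labelled strong; the two between the adjusted and nominal levels are labelled weaker.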

Cumulative Evidence of Poor Health Outcomes

Zip code 32570 (Santa Rosa)

The 32570 Zip code had the greatest cumulative evidence for poor health outcomes compared with the matched controls. For models developed from data for all blacks, three strong and one weaker statistically significant differences were found to exist across all three disease groups. A similar pattern was observed in models developed with data for blacks over the age of 65. Finally, while they did not extend across disease groups, significant differences were found for cardiac and respiratory diseases for whites of all ages and whites over the age of 65.

Zip code 32534 (Escambia)

The 32534 Zip code had the next strongest evidence for poor health outcomes compared with the matched controls. For models developed from data for all blacks, two strong and one weaker statistically significant differences were found to exist across all three disease groups. However, with data for blacks over the age of 65, statistically significant differences were found only for cancers, and none were found in models developed using white data.

Zip code 32501 (Escambia)

The 32501 Zip code also had evidence of poor health outcomes compared with the matched controls. For models developed from data for all blacks, two weaker and one strong statistically significant differences were found to exist across two disease groups. Statistically significant differences were also found across two disease groups for blacks over the age of 65. In models developed for whites of all ages, only one strong difference was found.

Zip codes 32566/32561 (Santa Rosa)

The 32566/32561 zip codes had three strong and one weaker difference across two disease groups for whites over the age of 65 and two weaker differences across two disease groups in data for blacks over the age of 65. For models developed for blacks of all ages there were two weaker differences across disease groups. However, there were no statistically significant differences across more than one disease group for models developed from data for whites of all ages.

Zip codes 32577, 32533, 32503

Each of these zip codes has one strong difference and at least one weaker difference across disease groups for data for all blacks; however, these patterns are not found in any of the other models in the analysis.

Summary Mortality Results

Table 9 summarizes the results of the models for mortality by listing the number of strong (s) or weaker (w) results by age/race group and indicating whether the differences were found across disease groups. These results are presented graphically in Figure 1. The darker color indicates a greater burden of disease.

Table 9: Summary of Results from the Mortality Models

                        BLACKS                                       WHITES
ZIP CODE      COUNTY    All Ages   Over 65   Across Disease Groups   All Ages   Over 65   Across Disease Groups
32570         SR        3s/1w      2s/1w     Yes                     1s/1w      1s/1w     No
32566/32561   SR        2w         2w        Yes                     -          3s/1w     Yes
32534         ES        2s/1w      1s/1w     Yes                     -          -         No
32501         ES        1s/2w      1s/1w     Yes                     1s         -         -
32577         ES        1s/2w      -         Yes                     2s         -         No
32533         ES        1s/1w      -         Yes                     -          -         -
32503         ES        1s/2w      -         Yes                     -          -         -
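For readers who want to total the evidence in a row of Table 9, the cell notation can be parsed mechanically. This is a hypothetical helper, not part of the original analysis; it only adds up the strong and weaker counts and does not attempt to weight or rank them.

```python
import re

def tally(cells):
    """Sum strong (s) and weaker (w) counts from cells such as '3s/1w', '2w', or '-'."""
    strong = weaker = 0
    for cell in cells:
        # Each match is a (count, kind) pair, for example ('3', 's').
        for count, kind in re.findall(r"(\d+)([sw])", cell):
            if kind == "s":
                strong += int(count)
            else:
                weaker += int(count)
    return strong, weaker

# The four count cells for zip code 32570 in Table 9.
print(tally(["3s/1w", "2s/1w", "1s/1w", "1s/1w"]))  # (7, 4)
```

Cells containing only "-" contribute nothing, so the same function works for sparse rows.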

Figure 1. Map of Summary of Mortality models

Morbidity Indicators (Hospitalizations)

Tables 10-14 provide the results of the models developed for the morbidity indicators. Fewer statistically significant differences in which E/SR Zip codes had higher rates of disease than the matched comparisons were found in these models than in the models for mortality indicators. As with the results for the mortality indicators, these tables highlight statistically significant results in which E/SR Zip codes have higher rates of disease than do the matches for that Zip code. Differences that represent strong evidence (adjusted to control for experiment-wise error) are colored in red, while those representing weaker evidence (nominal p < .05) are colored in yellow.

The morbidity health indicators, unlike the mortality indicators, fall primarily into one disease group (cardio-respiratory). Therefore emphasis was placed on the total significant differences found regardless of the category from which they arise. Below we summarize the results for those Zip codes that exhibit the most consistent patterns of evidence of differences for the morbidity indicators. There were relatively few statistically significant results for models based on the total population or on the total white or black population. More consistent patterns were found in the models for those over the age of 65.

Zip code 32570 (Santa Rosa)

As with results for the mortality data, the strongest evidence for difference was found for Zip code 32570. Strong evidence for differences was found for four of the five diseases included in the models using data from blacks over the age of 65. Two strong differences were found in models developed for whites of all ages, and one strong difference was also found for models developed from data for blacks of all ages. Only one difference was found for models developed from data for whites over the age of 65.

Zip codes 32566/32561

The Zip code 32566/32561 was found to have three strong differences for models developed with data from whites over the age of 65 and one strong difference for models developed with data from blacks over the age of 65. No significant differences were found for models developed with data from whites or blacks of all ages combined.

Zip codes 32535/32565

The Zip code 32535/32565 was found to have three strong differences for models developed with data from blacks over the age of 65 and one weak difference for models developed with data from whites over the age of 65. In addition there were three strong differences for models developed with data from blacks of all ages and two strong differences for models developed for whites of all ages.

Zip code 32504

Finally, there were two strong differences found in models from data for blacks over the age of 65 for Zip code 32504. However, no similar differences were found in any of the other models developed.

Summary Morbidity Results

Table 15 summarizes results of the models for morbidity by listing the number of strong or weaker results by age/race groups. These results are graphically described in Figure 2.

Table 15: Summary of Results from the Morbidity Models

                        BLACKS                                       WHITES
ZIP CODE      COUNTY    All Ages   Over 65   Across Disease Groups   All Ages   Over 65   Across Disease Groups
32570         SR        1s         4s        NA                      2s         1s        NA
32535/32565   ES        2s         3s        NA                      3s         1w        NA
32566/32561   SR        -          1s        NA                      -          3s        NA
32583/32530   SR        -          1w        NA                      1s         1w        NA
32571         SR        -          1s        NA                      -          1w        NA
32533         ES        1s         1w        NA                      -          -         NA
32504         ES        -          2s        NA                      -          -         NA
32501         ES        -          1w        NA                      -          -         NA

Figure 2. Map of Summary of Morbidity Models

References

Smith, HL (1997). “Matching with multiple controls to estimate treatment differences in observational studies,” in Sociological Methodology 1997, ed. AE Raftery. Oxford: Basil Blackwell. 325-353.

Lawson, AB (2001). Statistical Methods in Spatial Epidemiology. New York: John Wiley & Sons Ltd.

Diggle, PJ, Liang, KY, Zeger, SL (1994). The Analysis of Longitudinal Data. New York: Oxford University Press.

Allison, PD. Logistic Regression Using the SAS System: Theory and Application. Cary, NC: SAS Institute Inc.

Hochberg, Y (1988). A sharper Bonferroni procedure for multiple tests of significance. Biometrika, 75, 800-803.

Appendix A

Appendix B
