Upload: akhil-raman

Post on 21-Jan-2017


Running Head: Predicting Life Expectancy

Predicting Life Expectancy – A stepwise regression model

A Project Report for STAT 4950

Submitted to:  Dr Shenghua (Kelly) Fan

Submitted by:  Akhil Raman HS4623, Berhane Desta GH6462, Bui Toan


Introduction and Literature Review

Life expectancy is a statistical measure of how long a person or organism may live, based on year of birth, current age, demographic factors, and health variables. Globally, life expectancy varies widely, from an average of 49.42 years in Swaziland to 82.6 years in Japan, with the disparity attributed to public health, medical care, diet, and various socio-economic factors.

In the United States, life expectancy has increased over the generations. Americans today live decades longer than citizens did in the early 1900s. Despite this increase, several variables may reduce an individual's lifespan.

Public health reports suggest that education is connected with life expectancy. Researchers have examined the direct (better stress management, healthier lifestyle) and indirect (better-paying jobs, social privilege) pathways from education to life expectancy. Men and women who pursue higher education have, on average, longer life expectancies than individuals whose education ends at high school or earlier.

Health factors tied to lifestyle habits have also been linked to life expectancy. Obesity, smoking, alcoholism, and diabetes can shorten an individual's lifespan: obesity can reduce it by 5 years, smoking by 12, alcoholism by 20, and diabetes by 10. Along with lifestyle habits, access to medical care is crucial. For example, child vaccination is second only to clean water as a means of fighting infectious disease and can increase a child's life expectancy.

Racial disparities in life expectancy have been a point of debate. African Americans are at the low end of the scale, while Asian Americans live the longest. Researchers link the difference to socio-economic, health, and education factors. This project examines the effects of education, race, health, and poverty on life expectancy using multiple linear regression. According to the results of the analysis, the African American and Asian American population shares, school enrollment, smoking, binge drinking, diabetes, and immunization have an effect on life expectancy.

Methods

Data were collected from the “American Human Development Project.” All states, including Puerto Rico, were analyzed. The life expectancy reported for each state was compared across four domains: poverty, education, race, and health. Gender differences were not accounted for in the model. The independent variables were selected based on prior literature on life expectancy. As a first step, we computed a correlation matrix to examine the relationship between each independent variable and the dependent variable. Smoking, binge drinking, diabetes, obesity, African American, Asian American, the education-level variables, and the poverty variables have a significant relationship with life expectancy at the 0.05 significance level (Tables 1.1-1.4).
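As a rough illustration of this screening step, the sketch below correlates a few predictors with life expectancy using PROC CORR. The data set name `screen` is hypothetical, and only the first ten rows of the Final Model data listing are used (life, smoking, binge, diabetes, immunization), so the coefficients it prints will not match Tables 1.1-1.4.

```sas
/* Hypothetical data set name; ten rows taken from the Final Model listing. */
data screen;
  input life smoking binge diabetes immunization;
  datalines;
75.4 24.3 13.7 11.2 83.1
78.3 22.9 20.8 6.7 79.2
79.6 19.3 17.6 7.8 84.6
76.0 27.0 14.1 9.5 81.4
80.8 13.7 18.6 8.5 86.8
80.0 18.3 20.1 6.0 86.5
80.8 17.1 17.9 6.8 88.2
78.4 21.8 20.3 8.3 84.3
76.5 20.8 25 8.0 84.6
79.4 19.3 17.1 9.5 88.5
;
run;

/* WITH limits the output to life against each VAR variable.
   PROC CORR prints each Pearson r together with its p-value. */
proc corr data=screen pearson;
  var smoking binge diabetes immunization;
  with life;
run;
```

A predictor would pass the screen described above when its p-value is below 0.05.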

Next, we fit four multiple linear regression models, one for each group: health, poverty, race, and education level. Afterwards, we combined the variables remaining from all four models and conducted a final stepwise regression. For the health group, smoking, binge drinking, and diabetes are included in the model (see Table 1.1). For the race group, all the independent variables are included in the model (see Table 1.3). For the poverty group, only children under 6 living in poverty remained (Table 1.2). For the education group, the percentage of the population with bachelor’s degrees, the percentage with graduate degrees, and the percentage enrolled in school remained (Table 1.4). In the final model, African American, Asian American, school enrollment, binge drinking, smoking, diabetes, and immunization remained, based on a 0.15 significance level for entry (Table 6.1). The regression models for health, race, poverty, and education are given in the appendix as Equations 1, 2, 3, and 4, respectively. The final regression model is given in the appendix as Equation 5.
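The entry threshold mentioned above can be written out explicitly in PROC REG. The following is a minimal sketch, not the authors' program: the data set name `subset` is hypothetical, only ten rows of the Final Model listing are used, and the stay threshold of 0.15 is an assumption (the report states only the entry level).

```sas
/* Hypothetical ten-row subset of the Final Model listing. */
data subset;
  input life smoking binge diabetes immunization;
  datalines;
75.4 24.3 13.7 11.2 83.1
78.3 22.9 20.8 6.7 79.2
79.6 19.3 17.6 7.8 84.6
76.0 27.0 14.1 9.5 81.4
80.8 13.7 18.6 8.5 86.8
80.0 18.3 20.1 6.0 86.5
80.8 17.1 17.9 6.8 88.2
78.4 21.8 20.3 8.3 84.3
76.5 20.8 25 8.0 84.6
79.4 19.3 17.1 9.5 88.5
;
run;

/* SLENTRY= is the significance level a variable needs to enter the model.
   SLSTAY= is the level it needs to remain (assumed here to be 0.15 as well). */
proc reg data=subset;
  model life = smoking binge diabetes immunization
        / selection=stepwise slentry=0.15 slstay=0.15;
run;
quit;
```

Stating both thresholds in the MODEL statement makes the selection rule visible in the code rather than leaving it to the procedure's defaults.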

After fitting the four group models and the final model, we checked the normality and equal-variance assumptions for each. For the health group, most of the points lie close to the line and all the p-values of the normality tests are greater than 0.05, so normality holds (Figure 3.1 and Table 1.1). The residual plot shows no pattern, so equal variance holds (Figure 2.1). For the race group, there are outliers and not all the p-values are greater than 0.05, so the normality assumption fails (Figure 3.3 and Table 1.3). The residual plot shows no pattern, so equal variance holds (Figure 2.3). For the poverty group, most of the points lie close to the line and all the p-values of the normality tests are greater than 0.05, so normality holds (Figure 3.2 and Table 1.2). The residual plot shows no pattern, so equal variance holds (Figure 2.2). For the education group, most of the points lie close to the line and all the p-values of the normality tests are greater than 0.05, so normality holds (Figure 3.4 and Table 1.4). The residual plot shows no pattern, so equal variance holds (Figure 2.4).

For the final model, most of the points lie close to the line and all the p-values of the normality tests are greater than 0.05, so the normality assumption holds (Figure 7.1 and Table 7.1). The residual plot shows no pattern, so equal variance holds (Figure 7.1).


Conclusion

Separately, our stepwise models were consistent with the prior literature: unhealthy habits negatively affect life expectancy, higher education is correlated with longer life, poverty negatively affects life expectancy, and all the race variables have an effect on life expectancy. When we combined the four groups, African American, Asian American, school enrollment, smoking, binge drinking, diabetes, and immunization entered the final model at the 0.15 entry level (Equation 5). In the final model, African American, smoking, and diabetes have negative slopes, while Asian American, school enrollment, binge drinking, and immunization have positive slopes. Variables with negative slopes are associated with lower life expectancy, and those with positive slopes with higher life expectancy. To investigate further, we could add quadratic and interaction terms.
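One way the suggested extension could look in SAS is sketched below. The choice of terms (a squared smoking term and a smoking-by-diabetes interaction), the variable names `smoking_sq` and `smoke_x_diab`, and the data set name `expanded` are all hypothetical, and only ten rows of the Final Model listing are used, so the output is illustrative only.

```sas
/* Hypothetical example: build one quadratic and one interaction term. */
data expanded;
  input life smoking binge diabetes;
  smoking_sq   = smoking**2;        /* quadratic term   */
  smoke_x_diab = smoking*diabetes;  /* interaction term */
  datalines;
75.4 24.3 13.7 11.2
78.3 22.9 20.8 6.7
79.6 19.3 17.6 7.8
76.0 27.0 14.1 9.5
80.8 13.7 18.6 8.5
80.0 18.3 20.1 6.0
80.8 17.1 17.9 6.8
78.4 21.8 20.3 8.3
76.5 20.8 25 8.0
79.4 19.3 17.1 9.5
;
run;

/* VIF reports variance inflation factors, which tend to be large when
   squared and product terms are added without centering. */
proc reg data=expanded;
  model life = smoking binge diabetes smoking_sq smoke_x_diab / vif;
run;
quit;
```

Higher-order terms are usually correlated with the variables they are built from, which is why the sketch requests variance inflation factors alongside the estimates.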


Appendix

Tables 1.1-1.4 (Pearson correlation coefficients for all variables)

Table 1.1: Health    Table 1.2: Poverty

Table 1.3: Race    Table 1.4: Education

Smoking = % of adults smoking
Binge = % of adults binge drinking
Diabetes = % of adults with diabetes
Obesity = % of adults with obesity
Immunization = % of children immunized
X7 = % of children under 6 living in poverty
X8 = % of adults over 65 living in poverty
X9 = % of economically disadvantaged K-12 students
X10 = % of children enrolled in preschool
Wht = % of population which is white
Lat = % of population which is Latino
Afam = % of population which is black
Asam = % of population which is Asian
Natam = % of population which is Native American
Less_hs = % of population high school dropouts
Atleast_hs = % of population with a high school diploma
Atleast_ba = % of population with a bachelor’s degree
Grad_degree = % of population with a graduate degree
S_enroll = % of population enrolled in school


Table 2.1-2.4 (Initial Parameter estimates after separate stepwise regression models)

Table 2.1: Health Table 2.2: Race

Table 2.3: Education Table 2.4: Poverty

Table 3.1-3.4 (Normality Checking for each stepwise regression model)

Table 3.1: Health    Table 3.2: Poverty

Figure 1.1: Health    Figure 1.2: Poverty


Table 3.3: Race    Table 3.4: Education

Figure 1.3: Race    Figure 1.4: Education

Table 4.1-4.4 (Equal Variances Checking for each stepwise regression model)

Figure 2.1: Health    Figure 2.2: Poverty


Figure 2.3: Race    Figure 2.4: Education

Table 5.1 (Remaining Variables after separate stepwise regression models)


Table 6.1 (Final Stepwise model parameter estimates)

Figure 7.1 (Assumption Checking for Final Model)


Table 7.1

Equation 1: Health:

LE = 80.98619 - 0.25622*smoking + 0.08958*binge - 0.43044*diabetes

Equation 2: Race:

LE= 31.37453 + 0.48144*white + 0.47116*latino +0.38513*AfricanAmerican +0.90752*AsianAmerican

0.49848*NativeAmerican

Equation 3: Poverty


LE = 34.3130 - 0.23659*under6_poverty

Equation 4: Education

LE = 59.161 + 52.807*Atleast_ba - 71.605*Grad_degree + 15.594*S_enroll

Equation 5: Final Model

LE = 70.84127 - 0.04853*AfAm + 0.04204*AsAm + 0.12519*S_enroll - 0.21581*Smoking + 0.09621*binge - 0.23291*diabetes + 0.03929*immunization
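To show how Equation 5 turns a set of state-level values into a predicted life expectancy, the sketch below plugs in the first row of the Final Model data listing. The variable name `LE_hat` is hypothetical; the coefficients are those of Equation 5, and the inputs are the listed values for that row (observed life expectancy 75.4).

```sas
/* Worked prediction from Equation 5 for the first listed observation. */
data _null_;
  AfAm = 26.0; AsAm = 1.1; S_enroll = 76.1;
  Smoking = 24.3; binge = 13.7; diabetes = 11.2; immunization = 83.1;
  LE_hat = 70.84127 - 0.04853*AfAm + 0.04204*AsAm + 0.12519*S_enroll
           - 0.21581*Smoking + 0.09621*binge - 0.23291*diabetes
           + 0.03929*immunization;
  /* Writes the predicted value to the SAS log with two decimals. */
  put LE_hat= 8.2;
run;
```

The prediction works out to about 75.88 years, roughly half a year above the observed 75.4 for that row.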

SAS Code

Final Model

data MLR;
input life under_6 AfAm caucasian latino NatAm AsianAm bachelors_plus grad_degree school_enrollment smoking binge diabetes immunization;
datalines;
75.4 31.9 26.0 67.0 3.9 0.5 1.1 21.9 8.0 76.1 24.3 13.7 11.2 83.1
78.3 17.8 3.1 64.1 5.5 14.4 5.3 27.9 9.4 71.2 22.9 20.8 6.7 79.2
79.6 27.5 3.7 57.8 29.6 4.0 2.7 25.9 9.2 74.1 19.3 17.6 7.8 84.6
76.0 32.5 15.3 74.5 6.4 0.7 1.2 19.5 6.3 75.7 27.0 14.1 9.5 81.4
80.8 23.6 5.8 40.1 37.6 0.4 12.8 30.1 11.0 78.4 13.7 18.6 8.5 86.8
80.0 20.9 3.8 70.0 20.7 0.6 2.7 36.4 13.0 77.5 18.3 20.1 6.0 86.5
80.8 15.8 9.4 71.2 13.4 0.2 3.8 35.5 15.3 81.9 17.1 17.9 6.8 88.2
78.4 22.4 20.8 65.3 8.2 0.3 3.2 27.8 11.3 78.3 21.8 20.3 8.3 84.3
76.5 27.4 50.0 34.8 9.1 0.2 3.5 50.1 26.9 74.6 20.8 25 8.0 84.6
79.4 26.5 15.2 57.9 22.5 0.3 2.4 25.8 9.2 77.5 19.3 17.1 9.5 88.5
77.2 28.8 30.0 55.9 8.8 0.2 3.2 27.3 9.8 77.2 21.2 16.6 9.9 79.0
81.3 16.0 1.5 22.7 8.9 0.2 37.7 29.5 9.6 74.9 16.8 21.5 8.2 81.5
79.5 23.9 0.6 84.0 11.2 1.1 1.2 24.4 7.7 76.1 17.2 16.6 7.0 77.6
79.0 22.1 14.3 63.7 15.8 0.1 4.5 30.8 11.5 79.7 20.9 23 8.3 82.2
77.6 25.3 9.0 81.5 6.0 0.2 1.6 22.7 8.1 76.9 25.6 17.8 9.6 85.3
79.7 19.4 2.9 88.7 5.0 0.3 1.7 24.9 7.9 79.0 20.4 23.1 7.0 84.2
78.7 22.0 5.7 78.2 10.5 0.8 2.3 29.8 10.5 78.4 22.0 17 8.1 85.7
76.0 31.6 7.7 86.3 3.1 0.2 1.1 20.5 8.1 74.7 29.0 16.1 9.9 86.0
75.7 32.1 31.8 60.3 4.2 0.6 1.5 21.4 7.0 75.2 25.7 16.1 10.7 87.7
79.2 22.0 1.1 94.4 1.3 0.6 1.0 26.8 9.5 77.7 22.8 17.3 8.3 90.3
78.8 15.2 29.0 54.7 8.2 0.2 5.5 36.1 16.4 78.5 19.1 18 8.7 89.1
80.5 17.0 6.0 76.1 9.6 0.2 5.3 39.0 16.7 81.2 18.2 20.6 7.2 87.2
78.2 27.6 14.0 76.6 4.4 0.6 2.4 25.2 9.6 79.2 23.3 19.7 9.1 86.4
81.1 17.4 5.1 83.1 4.7 1.0 4.0 31.8 10.3 79.2 19.1 22.1 5.9 87.3
75.0 37.8 36.9 58.0 2.7 0.5 0.9 19.5 7.1 76.1 26.0 14.2 11.3 82.4
77.5 26.0 11.5 81.0 3.5 0.4 1.6 25.6 9.5 76.6 25.0 19.2 9.1 82.0
78.5 24.2 0.4 87.8 2.9 6.1 0.6 28.8 9.0 75.7 22.1 20.8 6.5 74.4
79.8 22.2 4.4 82.1 9.2 0.8 1.7 28.6 9.0 80.2 20.0 22.7 7.8 84.9
78.1 26.4 7.7 54.1 26.5 0.9 7.1 21.7 7.4 71.8 22.9 18.6 8.6 76.0
80.3 13.8 1.0 92.3 2.8 0.2 2.1 32.8 12.4 79.4 19.4 18.7 7.2 90.0
80.3 16.8 12.8 59.3 17.7 0.1 8.2 35.4 13.3 81.3 16.8 18.2 8.4 80.6
78.4 32.9 1.7 40.5 46.3 8.5 1.3 25.0 10.8 74.7 21.5 16.4 7.9 85.2
80.5 23.7 14.4 58.3 17.6 0.3 7.3 32.5 14.0 79.1 18.1 19.6 8.4 84.4
77.8 28.8 21.2 65.3 8.4 1.1 2.2 26.5 8.7 76.2 21.8 15.2 9.3 84.1
79.5 20.5 1.1 88.9 2.0 5.3 1.0 27.6 7.9 74.1 21.9 23.8 7.6 81.0
77.8 28.2 12.0 81.1 3.1 0.2 1.7 24.6 8.9 77.9 25.1 20.1 9.9 86.1
75.9 28.4 7.3 68.7 8.9 8.2 1.7 22.9 7.5 75.7 26.1 16.5 10.1 78.7
79.5 24.7 1.7 78.5 11.7 1.1 3.6 28.8 10.5 76.0 19.7 16.5 6.9 79.7
78.5 21.8 10.4 79.5 5.7 0.1 2.7 27.1 10.4 78.6 22.4 18.3 8.8 88.1
79.9 22.7 4.9 76.4 12.4 0.4 2.8 30.2 12.2 79.1 20.0 19.7 7.4 88.4
77.0 31.9 27.7 64.1 5.1 0.4 1.3 24.5 8.8 76.4 23.1 15.4 10.1 84.7
79.5 21.4 1.2 84.7 2.7 8.5 0.9 26.3 7.7 76.2 23.0 22.1 6.6 84.3
76.3 29.4 16.5 75.6 4.6 0.3 1.4 23.1 8.5 75.3 23.0 10 10.4 87.7
78.5 28.7 11.5 45.3 37.6 0.3 3.8 25.9 8.6 76.3 19.2 18.9 9.7 83.0
80.2 16.7 0.9 80.4 13.0 1.0 2.0 29.3 9.4 76.9 11.8 12 6.1 83.1
80.5 23.5 0.9 94.3 1.5 0.3 1.3 33.6 13.3 77.6 19.1 18.5 6.4 79.8
79.0 17.5 19.0 64.8 7.9 0.3 5.5 34.2 14.2 76.9 20.9 17.9 7.9 80.3
79.9 21.1 3.4 72.5 11.2 1.3 7.1 31.1 11.1 74.9 17.5 17.8 6.9 82.7
75.4 30.2 3.4 93.2 1.2 0.2 0.7 17.5 6.6 75.5 28.6 10.1 11.9 84.8
80.0 22.8 6.2 83.3 5.9 0.9 2.3 26.3 9.0 78.0 20.9 24.3 7.2 88.2
78.3 18.5 0.8 85.9 8.9 2.1 0.8 24.1 8.4 73.4 23.0 18.9 7.4 73.7
;
run;

proc print data = mlr;
run;

proc corr data = mlr;
run;

* Stepwise selection over all candidate variables, saving residuals for the normality check;
proc reg data = mlr;
model life = under_6 AfAm caucasian latino NatAm AsianAm bachelors_plus grad_degree school_enrollment smoking binge diabetes immunization / selection = stepwise;
output out = graph p = predicted student = st_res r = residual;
run;

PROC UNIVARIATE DATA = graph NORMAL PLOT;
TITLE "NORMALITY CHECKING";
var residual;
RUN;


Race

DATA PROJECT;
INPUT LE WHT LAT AFAM ASAM NATAM;
DATALINES;
75.4 67.0 3.9 26.0 1.1 0.5
78.3 64.1 5.5 3.1 5.3 14.4
79.6 57.8 29.6 3.7 2.7 4.0
76.0 74.5 6.4 15.3 1.2 0.7
80.8 40.1 37.6 5.8 12.8 0.4
80.0 70.0 20.7 3.8 2.7 0.6
80.8 71.2 13.4 9.4 3.8 0.2
78.4 65.3 8.2 20.8 3.2 0.3
76.5 34.8 9.1 50.0 3.5 0.2
79.4 57.9 22.5 15.2 2.4 0.3
77.2 55.9 8.8 30.0 3.2 0.2
81.3 22.7 8.9 1.5 37.7 0.2
79.5 84.0 11.2 0.6 1.2 1.1
79.0 63.7 15.8 14.3 4.5 0.1
77.6 81.5 6.0 9.0 1.6 0.2
79.7 88.7 5.0 2.9 1.7 0.3
78.7 78.2 10.5 5.7 2.3 0.8
76.0 86.3 3.1 7.7 1.1 0.2
75.7 60.3 4.2 31.8 1.5 0.6
79.2 94.4 1.3 1.1 1.0 0.6
78.8 54.7 8.2 29.0 5.5 0.2
80.5 76.1 9.6 6.0 5.3 0.2
78.2 76.6 4.4 14.0 2.4 0.6
81.1 83.1 4.7 5.1 4.0 1.0
75.0 58.0 2.7 36.9 0.9 0.5
77.5 81.0 3.5 11.5 1.6 0.4
78.5 87.8 2.9 0.4 0.6 6.1
79.8 82.1 9.2 4.4 1.7 0.8
78.1 54.1 26.5 7.7 7.1 0.9
80.3 92.3 2.8 1.0 2.1 0.2
80.3 59.3 17.7 12.8 8.2 0.1
78.4 40.5 46.3 1.7 1.3 8.5
80.5 58.3 17.6 14.4 7.3 0.3
77.8 65.3 8.4 21.2 2.2 1.1
79.5 88.9 2.0 1.1 1.0 5.3
77.8 81.1 3.1 12.0 1.7 0.2
75.9 68.7 8.9 7.3 1.7 8.2
79.5 78.5 11.7 1.7 3.6 1.1
78.5 79.5 5.7 10.4 2.7 0.1
79.9 76.4 12.4 4.9 2.8 0.4
77.0 64.1 5.1 27.7 1.3 0.4
79.5 84.7 2.7 1.2 0.9 8.5
76.3 75.6 4.6 16.5 1.4 0.3
78.5 45.3 37.6 11.5 3.8 0.3
80.2 80.4 13.0 0.9 2.0 1.0
80.5 94.3 1.5 0.9 1.3 0.3
79.0 64.8 7.9 19.0 5.5 0.3
79.9 72.5 11.2 3.4 7.1 1.3
75.4 93.2 1.2 3.4 0.7 0.2
80.0 83.3 5.9 6.2 2.3 0.9
78.3 85.9 8.9 0.8 0.8 2.1
;

PROC PRINT DATA=PROJECT;
RUN;

PROC CORR DATA=PROJECT;
TITLE "CORRELATION MATRIX";
VAR LE WHT LAT AFAM ASAM NATAM;
RUN;

PROC REG DATA=PROJECT;
MODEL LE = WHT LAT AFAM ASAM NATAM / SELECTION=STEPWISE;
RUN;

PROC REG DATA=PROJECT;
TITLE "REGRESSION";
MODEL LE = WHT LAT AFAM ASAM NATAM;
OUTPUT OUT=GRAPH P=PREDICTED STUDENT=ST_RES R=RESIDUAL;
RUN;

**GRAPHICAL SUMMARIES OF RESIDUALS TO CHECK ASSUMPTIONS;
PROC UNIVARIATE DATA=GRAPH NORMAL PLOT;
TITLE "NORMALITY CHECKING";
VAR RESIDUAL;
RUN;

SYMBOL VALUE=DOT COLOR=RED I=R;

PROC GPLOT DATA=GRAPH;
TITLE "RESIDUALS VERSUS PREDICTED";
PLOT ST_RES * PREDICTED;
RUN;

Health

* The life column is taken from the Final Model listing, which has the same row order. The listing as posted repeated the smoking column in its place;
DATA PROJECT;
INPUT life smoking binge diabetes obesity immunization;
DATALINES;
75.4 24.3 13.7 11.2 32.0 83.1
78.3 22.9 20.8 6.7 27.4 79.2
79.6 19.3 17.6 7.8 25.1 84.6
76.0 27.0 14.1 9.5 30.9 81.4
80.8 13.7 18.6 8.5 23.8 86.8
80.0 18.3 20.1 6.0 20.7 86.5
80.8 17.1 17.9 6.8 24.5 88.2
78.4 21.8 20.3 8.3 28.8 84.3
76.5 20.8 25 8.0 23.8 84.6
79.4 19.3 17.1 9.5 26.6 88.5
77.2 21.2 16.6 9.9 28.0 79.0
81.3 16.8 21.5 8.2 21.9 81.5
79.5 17.2 16.6 7.0 27.1 77.6
79.0 20.9 23 8.3 27.1 82.2
77.6 25.6 17.8 9.6 30.8 85.3
79.7 20.4 23.1 7.0 29.0 84.2
78.7 22.0 17 8.1 29.6 85.7
76.0 29.0 16.1 9.9 30.4 86.0
75.7 25.7 16.1 10.7 33.4 87.7
79.2 22.8 17.3 8.3 27.8 90.3
78.8 19.1 18 8.7 28.3 89.1
80.5 18.2 20.6 7.2 22.7 87.2
78.2 23.3 19.7 9.1 31.3 86.4
81.1 19.1 22.1 5.9 25.7 87.3
75.0 26.0 14.2 11.3 34.9 82.4
77.5 25.0 19.2 9.1 30.3 82.0
78.5 22.1 20.8 6.5 24.6 74.4
79.8 20.0 22.7 7.8 28.4 84.9
78.1 22.9 18.6 8.6 24.5 76.0
80.3 19.4 18.7 7.2 26.2 90.0
80.3 16.8 18.2 8.4 23.7 80.6
78.4 21.5 16.4 7.9 26.3 85.2
80.5 18.1 19.6 8.4 24.5 84.4
77.8 21.8 15.2 9.3 29.1 84.1
79.5 21.9 23.8 7.6 27.8 81.0
77.8 25.1 20.1 9.9 29.7 86.1
75.9 26.1 16.5 10.1 31.1 78.7
79.5 19.7 16.5 6.9 26.7 79.7
78.5 22.4 18.3 8.8 28.6 88.1
79.9 20.0 19.7 7.4 25.4 88.4
77.0 23.1 15.4 10.1 30.8 84.7
79.5 23.0 22.1 6.6 28.1 84.3
76.3 23.0 10 10.4 29.2 87.7
78.5 19.2 18.9 9.7 30.4 83.0
80.2 11.8 12 6.1 24.4 83.1
80.5 19.1 18.5 6.4 25.4 79.8
79.0 20.9 17.9 7.9 29.2 80.3
79.9 17.5 17.8 6.9 26.5 82.7
75.4 28.6 10.1 11.9 32.4 84.8
80.0 20.9 24.3 7.2 27.7 88.2
78.3 23.0 18.9 7.4 25.0 73.7
;

PROC PRINT DATA=PROJECT;
RUN;

PROC CORR DATA=PROJECT;
TITLE "CORRELATION MATRIX";
VAR life smoking binge diabetes obesity immunization;
RUN;

PROC REG DATA=PROJECT;
MODEL life = smoking binge diabetes obesity immunization / SELECTION=STEPWISE;
RUN;

PROC REG DATA=PROJECT;
TITLE "REGRESSION";
MODEL life = smoking binge diabetes obesity immunization;
OUTPUT OUT=GRAPH P=PREDICTED STUDENT=ST_RES R=RESIDUAL;
RUN;

**GRAPHICAL SUMMARIES OF RESIDUALS TO CHECK ASSUMPTIONS;
PROC UNIVARIATE DATA=GRAPH NORMAL PLOT;
TITLE "NORMALITY CHECKING";
VAR RESIDUAL;
RUN;

SYMBOL VALUE=DOT COLOR=RED I=R;

PROC GPLOT DATA=GRAPH;
TITLE "RESIDUALS VERSUS PREDICTED";
PLOT ST_RES * PREDICTED;
RUN;

Poverty

DATA q1;
INPUT Y X7 X8 X9 X10;
DATALINES;
75.4 31.9 10.7 53.0 46.9
78.3 17.8 5.7 38.0 43.0
79.6 27.5 7.7 45.0 32.4
76.0 32.5 10.2 60.0 49.8
80.8 23.6 9.7 53.0 50.8
80.0 20.9 8.1 40.0 48.9
80.8 15.8 6.6 34.0 64.4
78.4 22.4 7.7 48.0 47.3
76.5 27.4 13.1 72.0 73.6
79.4 26.5 9.9 56.0 51.6
77.2 28.8 10.7 57.0 50.9
81.3 16.0 6.8 47.0 54.6
79.5 23.9 7.9 45.0 36.6
79.0 22.1 8.4 44.0 56.0
77.6 25.3 6.8 47.0 40.2
79.7 19.4 6.7 39.0 50.6
78.7 22.0 7.7 48.0 47.5
76.0 31.6 11.2 57.0 48.5
75.7 32.1 11.5 66.0 55.0
79.2 22.0 9.5 43.0 39.1
78.8 15.2 7.7 40.0 51.7
80.5 17.0 8.7 34.0 62.3
78.2 27.6 8.0 46.0 49.3
81.1 17.4 8.3 37.0 46.3
75.0 37.8 11.9 71.0 54.5
77.5 26.0 9.1 44.0 43.0
78.5 24.2 7.0 41.0 37.2
79.8 22.2 7.5 43.0 45.0
78.1 26.4 7.6 48.0 28.4
80.3 13.8 6.1 25.0 51.5
80.3 16.8 7.2 33.0 67.3
78.4 32.9 12.0 67.0 42.9
80.5 23.7 10.9 48.0 60.1
77.8 28.8 9.9 50.0 49.5
79.5 20.5 12.1 32.0 37.6
77.8 28.2 7.7 43.0 46.3
75.9 28.4 9.3 60.0 46.0
79.5 24.7 7.9 51.0 43.7
78.5 21.8 7.9 39.0 49.7
79.9 22.7 8.2 43.0 56.5
77.0 31.9 9.8 55.0 48.9
79.5 21.4 11.1 37.0 38.1
76.3 29.4 9.7 55.0 44.5
78.5 28.7 10.7 50.0 43.0
80.2 16.7 6.0 38.0 40.5
80.5 23.5 6.8 37.0 55.4
79.0 17.5 7.4 37.0 49.5
79.9 21.1 6.9 40.0 43.2
75.4 30.2 9.9 51.0 32.2
80.0 22.8 7.1 39.0 45.3
78.3 18.5 6.8 37.0 40.7
;
RUN;

PROC PRINT DATA = q1;
TITLE "Question 1";
RUN;

*CORRELATION ANALYSIS;
PROC CORR DATA = q1;
TITLE "q1a CORRELATION MATRIX";
VAR Y X7 X8 X9 X10;
RUN;

PROC REG DATA=q1;
MODEL Y = X7 X8 X9 X10 / SELECTION=STEPWISE;
RUN;

* Full poverty model, using the variables that exist in q1;
PROC REG DATA = q1;
TITLE "REGRESSION";
MODEL Y = X7 X8 X9 X10;
OUTPUT OUT=GRAPH P=PREDICTED STUDENT=ST_RES R=RESIDUAL;
RUN;

**GRAPHICAL SUMMARIES OF RESIDUALS TO CHECK ASSUMPTIONS;
PROC UNIVARIATE DATA=GRAPH NORMAL PLOT;
TITLE "NORMALITY CHECKING";
VAR RESIDUAL;
RUN;

SYMBOL VALUE=DOT COLOR=RED I=R;

PROC GPLOT DATA=GRAPH;
TITLE "RESIDUALS VERSUS PREDICTED";
PLOT ST_RES * PREDICTED;
RUN;

Education

* Columns as listed: y = Less_hs, x1 = Atleast_hs, x2 = Atleast_ba, x3 = Grad_degree, x4 = S_enroll (life expectancy is not part of this listing);
data MLR;
input y x1 x2 x3 x4;
datalines;
17.9 82.1 21.9 8.0 76.1
9.0 91.0 27.9 9.4 71.2
14.4 85.6 25.9 9.2 74.1
17.1 82.9 19.5 6.3 75.7
19.3 80.7 30.1 11.0 78.4
10.3 89.7 36.4 13.0 77.5
11.4 88.6 35.5 15.3 81.9
12.3 87.7 27.8 11.3 78.3
12.6 87.4 50.1 26.9 74.6
14.5 85.5 25.8 9.2 77.5
15.7 84.3 27.3 9.8 77.2
10.1 89.9 29.5 9.6 74.9
11.7 88.3 24.4 7.7 76.1
13.1 86.9 30.8 11.5 79.7
13.0 87.0 22.7 8.1 76.9
9.4 90.6 24.9 7.9 79.0
10.8 89.2 29.8 10.5 78.4
18.1 81.9 20.5 8.1 74.7
18.1 81.9 21.4 7.0 75.2
9.7 90.3 26.8 9.5 77.7
11.9 88.1 36.1 16.4 78.5
10.9 89.1 39.0 16.7 81.2
11.3 88.7 25.2 9.6 79.2
8.2 91.8 31.8 10.3 79.2
19.0 81.0 19.5 7.1 76.1
13.1 86.9 25.6 9.5 76.6
8.3 91.7 28.8 9.0 75.7
9.6 90.4 28.6 9.0 80.2
15.3 84.7 21.7 7.4 71.8
8.5 91.5 32.8 12.4 79.4
12.0 88.0 35.4 13.3 81.3
16.7 83.3 25.0 10.8 74.7
15.1 84.9 32.5 14.0 79.1
15.3 84.7 26.5 8.7 76.2
9.7 90.3 27.6 7.9 74.1
11.9 88.1 24.6 8.9 77.9
13.8 86.2 22.9 7.5 75.7
11.2 88.8 28.8 10.5 76.0
11.6 88.4 27.1 10.4 78.6
16.5 83.5 30.2 12.2 79.1
15.9 84.1 24.5 8.8 76.4
10.4 89.6 26.3 7.7 76.2
16.4 83.6 23.1 8.5 75.3
19.3 80.7 25.9 8.6 76.3
9.4 90.6 29.3 9.4 76.9
9.0 91.0 33.6 13.3 77.6
13.5 86.5 34.2 14.2 76.9
10.2 89.8 31.1 11.1 74.9
16.8 83.2 17.5 6.6 75.5
9.9 90.1 26.3 9.0 78.0
7.7 92.3 24.1 8.4 73.4
;
run;

proc print data = mlr;
run;

proc corr data = mlr;
run;

proc reg data = mlr;
model y = x1 x2 x3 x4 / selection = stepwise;
run;


References

1) Olshansky, S., Antonucci, T., Berkman, L., Binstock, R., Boersch-Supan, A., Cacioppo, J., . . . Rowe, J. (2012). Differences In Life Expectancy Due To Race And Educational Differences Are Widening, And Many May Not Catch Up. Health Affairs, 1803-1813.

2) Olshansky, S., Passaro, D., Hershow, R., Layden, J., Carnes, B., Brody, J., . . . Ludwig, D. (2010). A Potential Decline in Life Expectancy in the United States in the 21st Century. Obstetrical & Gynecological Survey, 450-452.

3) Lowery, A. (2014, March 15). Income Gap, Meet the Longevity Gap. New York Times.