Enlistment rate prediction by state

State Enlistment rates compared by education

Akhil Raman

STAT 4601 Multiple Linear Regression Project

Professor Staffan Fredricsson

California State University, East Bay

Spring 2015

Abstract

Be all that you can be. This slogan paints a picture of dedicated service, selflessness, and loyalty to a higher cause. Per the “American Human Development Project”, the 2010 U.S. population was 308,745,538. The same year, 69,357 Americans enlisted in the United States Army, raising the total number of soldiers to 541,291. With the majority of Americans choosing not to enlist in the Army, the question becomes: who is enlisting? Researchers have sought to answer this question by comparing enlistment rates across different variables (ethnicity, college enrollment, high school GPA, poverty, etc.). This research helps analysts understand the motives behind joining the military, and also gives the military itself a basis for predicting the success of potential recruits (i.e., whether a recruit will complete their first enlistment). In my Simple Linear Regression project, I compared median income with each state’s enlistment rate. That project showed a negative correlation (R² = .335) with strong evidence (p-value < .0001). However, the literature review shows that median income does not tell the entire story, and there is evidence that the majority of the armed forces do not come from lower socioeconomic backgrounds. In addition, it is postulated that recruits join because of a lack of education, and do not have the same opportunities because of it. This paper serves two additional purposes: 1) to review existing literature on education and enlistment rate, and 2) to determine whether the results of this analysis are consistent with the existing literature or contradict it. Data collected from the “American Human Development Project” show a negative relationship with an R² value of .4140. Backward elimination was used, which retained the percentage of state citizens with a bachelor’s degree and the percentage of state citizens enrolled in college as the significant predictor variables. The Analysis of Variance shows significant evidence (p-values < .05) that the enlistment rate by state can be explained by the percentage of people who have a bachelor’s degree and the percentage of people enrolled in college. However, literature comparing education to enlistment rate paints a different picture.

Introduction

On October 7, 2001, the United States invaded Afghanistan in response to the September 11, 2001 terrorist attacks. A little under two years later, on March 20, 2003, former President George W. Bush declared war on Iraq. Since 2001, Congress has authorized funding for the United States Army to enlist 70,000 recruits per year, and funding has increased to around $309 billion [2].

In October 2010, I enlisted in the United States Marine Corps. Since then, I have had the pleasure of meeting and becoming close friends with individuals from different ethnic, cultural, religious, and socioeconomic backgrounds. Based on my personal observations, service members do pursue higher education (before, during, and after enlistment). My hypothesis is that high school graduation rate, school enrollment rate, college enrollment rate, and the percentage of people who have bachelor’s degrees should all be predictors of the state’s enlistment rate. Specifically, my hypothesis is that there is a positive relationship between these variables and the state’s enlistment rate.

Literature Review

According to studies published by the Heritage Foundation, today's recruits are more educated when measured objectively (education level, household income, high school GPA, region/rural origin) [3]. As mentioned in my Simple Linear Regression project, recruits are more educated, come from higher-income families, and come from areas with lower crime rates. Based on this evidence, the notion that recruits are less educated, come from families with lower median income, and come from areas of higher crime may not be substantiated. In 2007, only 11 percent of enlisted recruits came from the poorest fifth (quintile) of neighborhoods [1]. The research displays clear evidence that the majority of enlisted recruits came from middle-class or upper-middle-class backgrounds. However, the data were gathered by census tract, meaning each recruit's household income prior to enlisting was unknown and was estimated from the area where the recruit lived. According to the research, the estimated median income was $54,834/year [2]. This is above the income threshold to be considered “poor.” In addition, American soldiers are more educated than their peers. A little more than 1 percent of enlisted personnel lack a high school degree, compared to 21 percent of men 18-24 years old, and 95 percent of officer accessions have at least a bachelor's degree [1].

These facts do not support the notion that American soldiers serve because they lack other opportunities. Based on this literature, it seems that American soldiers might have more opportunities available than non-service members.

Methods

Minitab 17 was used to analyze data provided by the “American Human Development Project.” First, the percentage of each state's population that enlisted in the Army in 2010 was calculated by dividing the total number of Army recruits by the total population. Then, a multiple linear regression model was used to determine the relationship between the dependent and independent variables. Variables that did not have a significant p-value were dropped using backward elimination to arrive at the final model. Leverage values and Cook’s Distance Measure were analyzed to identify outliers and influential observations, respectively. Finally, residuals were analyzed to ensure the constant variance and normality assumptions were met.
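The rate calculation described in the Methods can be sketched in a few lines of Python. This is only an illustration using three rows copied from the Dataset section; the function name `enlistment_rate` and the dictionary layout are my own choices, not part of the Minitab workflow.

```python
def enlistment_rate(recruits, population):
    """Army recruits as a proportion of the state's total population."""
    return recruits / population

# (ArmyRecruitsTotal, Total Population) for three rows of the dataset
STATE_COUNTS = {
    "Alabama": (1630, 4779736),
    "California": (6087, 37253956),
    "District of Columbia": (40, 601723),
}

rates = {
    state: enlistment_rate(recruits, population)
    for state, (recruits, population) in STATE_COUNTS.items()
}

for state, rate in rates.items():
    # Format the proportion as a percentage with three decimals, e.g. 0.034%
    print(f"{state}: {rate:.3%}")
```

The printed percentages should agree with the ArmyRecruits(%population) column for those rows.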

Results

[Figure: four residual plots (response is ArmyRecruits(%Population)): Residuals Versus CollegeEnrollment%, Residuals Versus HighSchoolGrad%, Residuals Versus BachelorsDegree%, and Residuals Versus GraduateDegree.]

*Residual plots of all independent variables show no clear pattern. The constant variance assumption is met.

[Figure: normal probability plots of HighSchoolGrad%, BachelorsDegree%, CollegeEnrollment%, and SchoolEnrollment%, with the summary statistics below.]

Variable            Mean   StDev  N   AD     P-Value
HighSchoolGrad%     78.33  7.199  51  0.599  0.114
BachelorsDegree%    27.94  5.721  51  0.625  0.098
CollegeEnrollment%  62.02  7.681  51  0.670  0.076
SchoolEnrollment%   76.93  2.253  51  0.214  0.843

*The Anderson-Darling normality test shows that all independent variables have p-values > .05. The normality assumption is met.
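As a rough illustration of what the AD statistic measures, the sketch below computes the Anderson-Darling A² statistic for the HighSchoolGrad% column of the dataset, using only the Python standard library. It assumes the textbook formula with the sample mean and sample standard deviation plugged in; Minitab's exact computation and its p-value calculation are not reproduced here, so the result is only expected to land near the reported AD value, not to match it to every digit.

```python
import math
from statistics import NormalDist, mean, stdev

# HighSchoolGrad% for the 51 observations, in dataset order
HIGH_SCHOOL_GRAD = [
    71.8, 75.5, 74.7, 75.0, 78.2, 79.8, 75.1, 75.5, 59.9, 70.8, 69.9,
    75.4, 84.0, 81.9, 77.2, 87.9, 84.5, 79.9, 68.8, 82.8, 82.2, 82.6,
    75.9, 88.2, 63.8, 83.7, 81.9, 83.8, 57.8, 86.3, 87.2, 67.3, 76.0,
    76.9, 88.4, 81.4, 78.5, 76.3, 84.1, 76.4, 68.2, 81.8, 80.4, 78.9,
    78.6, 91.4, 81.2, 77.2, 78.3, 91.1, 80.3,
]

def anderson_darling(sample):
    """Anderson-Darling A^2 statistic against a fitted normal distribution."""
    xs = sorted(sample)
    n = len(xs)
    fitted = NormalDist(mean(xs), stdev(xs))
    cdf = [fitted.cdf(x) for x in xs]
    # A^2 = -n - (1/n) * sum_{i=1..n} (2i - 1) * [ln F(x_i) + ln(1 - F(x_{n+1-i}))]
    total = sum(
        (2 * i + 1) * (math.log(cdf[i]) + math.log(1.0 - cdf[n - 1 - i]))
        for i in range(n)
    )
    return -n - total / n

a_squared = anderson_darling(HIGH_SCHOOL_GRAD)
print(f"mean = {mean(HIGH_SCHOOL_GRAD):.2f}, A^2 = {a_squared:.3f}")
```

Larger values of A² indicate a worse fit to the normal distribution, which is why small AD values go together with large p-values in the table of summary statistics.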

[Figure: four scatterplots of ArmyRecruits(%Population) vs HighSchoolGrad%, SchoolEnrollment%, BachelorsDegree%, and CollegeEnrollment%.]

*Scatterplots of the Army enlistment rate vs. the independent variables. All of the scatterplots show a negative correlation with the Army enlistment rate.

Correlation Matrix

(Each cell shows the Pearson correlation, with its p-value on the line below.)

                  ArmyRecruits  SchoolEnrollment  CollegeEnrollmen  BachelorsDegree%
SchoolEnrollment  -0.467
                   0.001

CollegeEnrollmen  -0.203         0.436
                   0.153         0.001

BachelorsDegree%  -0.603         0.392            -0.034
                   0.000         0.004             0.814

HighSchoolGrad%   -0.258         0.440             0.024             0.120
                   0.067         0.001             0.868             0.403

Backward Elimination of Terms

Candidate terms: HighSchoolGrad%, BachelorsDegree%, CollegeEnrollment%, SchoolEnrollment%

                    -------Step 1-------  -------Step 2-------  -------Step 3-------
                    Coef        P         Coef        P         Coef        P
Constant             0.000830              0.000640              0.000523
HighSchoolGrad%     -0.000001   0.282     -0.000002   0.102
BachelorsDegree%    -0.000006   0.000     -0.000007   0.000     -0.000007   0.000
CollegeEnrollment%  -0.000001   0.204     -0.000002   0.050     -0.000002   0.049
SchoolEnrollment%   -0.000003   0.438

S                    0.0000491             0.0000489             0.0000498
R-sq                 45.41%                44.69%                41.40%
R-sq(adj)            40.67%                41.16%                38.96%
R-sq(pred)           23.18%                31.48%                35.62%
Mallows’ Cp          5.00                  3.61                  4.38

α to remove = 0.05

A 5% significance level was used to determine the final model.

Analysis of Variance

Source                DF  Seq SS    Contribution  Adj SS    Adj MS    F-Value  P-Value
Regression             2  0.000000   41.40%       0.000000  0.000000  16.96    0.000
  BachelorsDegree%     1  0.000000   36.41%       0.000000  0.000000  30.54    0.000
  CollegeEnrollment%   1  0.000000    5.00%       0.000000  0.000000   4.09    0.049
Error                 48  0.000000   58.60%       0.000000  0.000000
Total                 50  0.000000  100.00%
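The backward elimination steps reported above follow a simple loop: fit the model, find the term with the largest p-value, and drop it if that p-value exceeds α to remove. The sketch below replays that logic in Python. It does not refit any regression; the lookup `REPORTED_P_VALUES` is a hypothetical stand-in that simply returns the P columns from the Minitab output, and the function names are my own.

```python
ALPHA_TO_REMOVE = 0.05

# P columns copied from the three reported steps, keyed by the terms in the model.
# Stand-in for an actual regression fit (assumption made for illustration only).
REPORTED_P_VALUES = {
    frozenset({"HighSchoolGrad%", "BachelorsDegree%",
               "CollegeEnrollment%", "SchoolEnrollment%"}): {
        "HighSchoolGrad%": 0.282, "BachelorsDegree%": 0.000,
        "CollegeEnrollment%": 0.204, "SchoolEnrollment%": 0.438,
    },
    frozenset({"HighSchoolGrad%", "BachelorsDegree%", "CollegeEnrollment%"}): {
        "HighSchoolGrad%": 0.102, "BachelorsDegree%": 0.000,
        "CollegeEnrollment%": 0.050,
    },
    frozenset({"BachelorsDegree%", "CollegeEnrollment%"}): {
        "BachelorsDegree%": 0.000, "CollegeEnrollment%": 0.049,
    },
}

def reported_fit(terms):
    """Return the reported p-values for the model containing `terms`."""
    return REPORTED_P_VALUES[frozenset(terms)]

def backward_elimination(candidates, fit, alpha=ALPHA_TO_REMOVE):
    """Drop the least significant term until every p-value is <= alpha."""
    terms = list(candidates)
    removed = []
    while terms:
        p_values = fit(terms)
        worst = max(terms, key=lambda term: p_values[term])
        if p_values[worst] <= alpha:
            break  # every remaining term is significant
        removed.append(worst)
        terms.remove(worst)
    return terms, removed

final_terms, removed_terms = backward_elimination(
    ["HighSchoolGrad%", "BachelorsDegree%", "CollegeEnrollment%", "SchoolEnrollment%"],
    reported_fit,
)
print("kept:", final_terms)
print("removed, in order:", removed_terms)
```

With a real dataset, `reported_fit` would be replaced by a function that refits the regression on the remaining terms and returns their p-values.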

Model Summary

S          R-sq    R-sq(adj)  PRESS      R-sq(pred)
0.0000498  41.40%  38.96%     0.0000001  35.62%

Coefficients

Term                Coef       SE Coef   95% CI                  T-Value  P-Value  VIF
Constant             0.000523  0.000068  ( 0.000387,  0.000660)   7.71    0.000
BachelorsDegree%    -0.000007  0.000001  (-0.000009, -0.000004)  -5.53    0.000    1.00
CollegeEnrollment%  -0.000002  0.000001  (-0.000004, -0.000000)  -2.02    0.049    1.00

Regression Equation

ArmyRecruits(%Population) = 0.000523 - 0.000007 BachelorsDegree% - 0.000002 CollegeEnrollment%
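To show how the regression equation is used, the sketch below plugs in California's values from the Dataset section (30.1% with at least a bachelor's degree, 61.7% college enrollment). It assumes the response is stored as a proportion rather than a percentage, which the scale of the coefficients suggests, and it uses the coefficients as rounded in the Minitab output, so the prediction is approximate. The function name `predict_rate` is hypothetical.

```python
# Coefficients as reported (rounded) in the Minitab output
INTERCEPT = 0.000523
COEF_BACHELORS = -0.000007
COEF_COLLEGE_ENROLLMENT = -0.000002

def predict_rate(bachelors_pct, college_enrollment_pct):
    """Predicted Army enlistment rate as a proportion of state population."""
    return (INTERCEPT
            + COEF_BACHELORS * bachelors_pct
            + COEF_COLLEGE_ENROLLMENT * college_enrollment_pct)

california_predicted = predict_rate(bachelors_pct=30.1, college_enrollment_pct=61.7)
california_observed = 6087 / 37253956  # ArmyRecruitsTotal / Total Population

print(f"predicted: {california_predicted:.4%}")
print(f"observed:  {california_observed:.4%}")
print(f"residual:  {california_observed - california_predicted:.6f}")
```

Because both coefficients are negative, raising either education percentage lowers the predicted enlistment rate.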

R-squared: 41.40% of the variation in state enlistment rate (Army recruits as a percentage of state population) can be explained by the final model.
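The adjusted R-squared and overall F-value in the Minitab output can be checked directly from R-squared, the number of observations (n = 51), and the number of predictors (k = 2). The sketch below uses the standard formulas; the function names are my own.

```python
def adjusted_r_squared(r_sq, n, k):
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    return 1 - (1 - r_sq) * (n - 1) / (n - k - 1)

def overall_f(r_sq, n, k):
    """Overall F = (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r_sq / k) / ((1 - r_sq) / (n - k - 1))

R_SQ, N_OBS, N_PREDICTORS = 0.4140, 51, 2

adj = adjusted_r_squared(R_SQ, N_OBS, N_PREDICTORS)
f_value = overall_f(R_SQ, N_OBS, N_PREDICTORS)

print(f"R-sq(adj) = {adj:.2%}")    # compare with the Model Summary
print(f"F-Value   = {f_value:.2f}")  # compare with the Analysis of Variance
```

Both values agree with the reported R-sq(adj) of 38.96% and F-Value of 16.96.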

Since both Variance Inflation Factors (VIF) are 1.00, well below 10, there is no evidence of multicollinearity.
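With only two predictors, each VIF depends solely on the correlation between them, so the reported value can be checked from the correlation matrix (r = -0.034 between BachelorsDegree% and CollegeEnrollment%). A minimal sketch, with a function name of my own choosing:

```python
def vif_two_predictors(r):
    """VIF = 1 / (1 - R_j^2); with two predictors, R_j^2 is just r squared."""
    return 1 / (1 - r ** 2)

# Correlation between BachelorsDegree% and CollegeEnrollment% from the correlation matrix
R_BACHELORS_COLLEGE = -0.034

vif = vif_two_predictors(R_BACHELORS_COLLEGE)
print(f"VIF = {vif:.2f}")
```

A VIF this close to 1 means the two predictors carry almost entirely separate information.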

Obs  Leverage     Cook's D
1    0.041869371  0.043654
2    0.174759448  0.056978
3    0.050392472  0.029119
4    0.063259271  0.00254
5    0.028801317  0.004085
6    0.065690593  0.016793
7    0.120700651  0.018645
8    0.029072802  0.008722
9    0.473876868  0.516727
10   0.026175714  0.008767
11   0.020362374  0.030378
12   0.044228829  0.007443
13   0.027756321  0.006515
14   0.050131222  0.000492
15   0.039073018  0.001076
16   0.055737537  0.005555
17   0.028324732  0.008319
18   0.058891711  0.036004
19   0.048170587  0.030149
20   0.024251206  0.012164
21   0.060331029  0.000123
22   0.124528252  0.001881
23   0.058935705  0.000199
24   0.041964459  0.010469
25   0.064192507  0.016358
26   0.022965609  0.002445
27   0.028439604  0.018333
28   0.065270876  0.004942
29   0.122575501  0.000577
30   0.048052692  0.002703
31   0.103162249  0.015056
32   0.038944616  0.000154
33   0.042586085  0.003235
34   0.022226808  0.006717
35   0.055893858  0.216285
36   0.036767429  0.000561
37   0.036198362  0.001077
38   0.024712141  0.000128
39   0.035191935  0.007926
40   0.038461539  0.011641
41   0.026646817  0.013433
42   0.02252811   0.003806
43   0.037061707  0.00228
44   0.022373207  0.024725
45   0.021108112  0.003043
46   0.039319893  0.000104
47   0.048584185  0.0465
48   0.055659683  0.001498
49   0.086117556  0.028157
50   0.029252779  0.005385
51   0.068420654  0.007723

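The average of all leverage values is (k+1)/n = 3/51 = 0.058823529, so the cutoff of twice the average is 2(k+1)/n = 2(3)/51 = 0.117647059. The leverage value for Washington D.C. (0.4739) is far above this cutoff. Four other observations (0.1748, 0.1207, 0.1245, and 0.1226) also exceed it, but by much smaller margins.

Based on Cook’s D, no values exceed 2.92 (from the F-table); the largest is 0.5167. There are therefore no influential observations whose removal would significantly affect the model.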
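The leverage rule can be checked against the 51 leverage values listed above. In a model with k predictors the leverage values sum to k + 1, so their mean is (k+1)/n and twice the mean is 2(k+1)/n. The sketch below confirms this for k = 2 and lists the observations above the cutoff; variable names are my own, and observations are numbered in the order the values were listed.

```python
LEVERAGE = [
    0.041869371, 0.174759448, 0.050392472, 0.063259271, 0.028801317, 0.065690593,
    0.120700651, 0.029072802, 0.473876868, 0.026175714, 0.020362374, 0.044228829,
    0.027756321, 0.050131222, 0.039073018, 0.055737537, 0.028324732, 0.058891711,
    0.048170587, 0.024251206, 0.060331029, 0.124528252, 0.058935705, 0.041964459,
    0.064192507, 0.022965609, 0.028439604, 0.065270876, 0.122575501, 0.048052692,
    0.103162249, 0.038944616, 0.042586085, 0.022226808, 0.055893858, 0.036767429,
    0.036198362, 0.024712141, 0.035191935, 0.038461539, 0.026646817, 0.02252811,
    0.037061707, 0.022373207, 0.021108112, 0.039319893, 0.048584185, 0.055659683,
    0.086117556, 0.029252779, 0.068420654,
]

K_PREDICTORS = 2
n_obs = len(LEVERAGE)

mean_leverage = sum(LEVERAGE) / n_obs            # should equal (k + 1) / n
cutoff = 2 * (K_PREDICTORS + 1) / n_obs          # twice the average leverage

# 1-based observation numbers whose leverage exceeds the cutoff
high_leverage_obs = [i + 1 for i, h in enumerate(LEVERAGE) if h > cutoff]

print(f"sum of leverages = {sum(LEVERAGE):.4f}")
print(f"mean = {mean_leverage:.6f}, cutoff = {cutoff:.6f}")
print("observations above cutoff:", high_leverage_obs)
```

Observation 9 corresponds to Washington D.C., which stands well apart from the other flagged observations.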

Conclusions

The regression analysis via Minitab indicates that 41.40% of the variation in Army enlistment rate by state can be explained by the percentage of state citizens who are enrolled in college and the percentage who have a bachelor’s degree. Both predictors have significant p-values and can be included in the final model. This is not consistent with my original hypothesis that all the education variables would be predictors of a state’s enlistment rate. In addition, neither of the final two predictors has a positive relationship with enlistment rate. The negative coefficients indicate that as college enrollment and the share of bachelor’s degrees increase, the enlistment rate decreases.

However, as explained in the literature review section, the existing research does not support this result. In fact, it provides evidence for the opposite conclusion.

Several differences may explain the discrepancy. First, the existing literature did not use regression analysis; census tract data were used to gather information about recruits. Also, the “American Human Development Project” had data for enlistment only; commissioning statistics were not provided. The difference is beyond the scope of this paper, but a bachelor’s degree is required to be a commissioned officer in the United States military. In addition, the dataset only provided statistics for the United States Army. Further research could examine how the regression analysis changes when the other branches (Marine Corps, Navy, Air Force, Coast Guard, National Guard) are included.

Finally, my research does not account for total population, average income, ethnic demographics, or other variables that may explain a state’s enlistment rate. Further research using all these variables is needed.

Works Cited

1. Military Recruiting Standards | Demographics of Military Personnel. (n.d.). Retrieved from http://www.heritage.org/research/reports/2008/08/who-serves-in-the-us-military-the-demographics-of-enlisted-troops-and-officers

2. Report: Increases in Military Spending and Security Assistance Since 9/11 - World Policy Institute - Research Project | World Policy Institute. (n.d.). Retrieved from http://www.worldpolicy.org/projects/arms/news/SpendingDOD911.html

3. Who Are the Recruits? The Demographic Characteristics of U.S. Military Enlistment, 2003-2005. (n.d.). Retrieved from http://www.heritage.org/Research/Reports/2006/10/Who-Are-the-Recruits-The-Demographic-Characteristics-of-US-Military-Enlistment-2003-2005

Dataset

State                 ArmyRecruits(%population)  HighSchoolGrad%  CollegeEnrollment  At least bachelors degree  Graduate degree  ArmyRecruitsTotal  Total Population
Alabama               0.034%                     71.8             64.3               21.9                       8.0              1,630              4,779,736
Alaska                0.030%                     75.5             46.4               27.9                       9.4              215                710,231
Arizona               0.031%                     74.7             57.9               25.9                       9.2              1,996              6,392,017
Arkansas              0.026%                     75.0             65.4               19.5                       6.3              753                2,915,918
California            0.016%                     78.2             61.7               30.1                       11.0             6,087              37,253,956
Colorado              0.021%                     79.8             61.2               36.4                       13.0             1,041              5,029,196
Connecticut           0.011%                     75.1             78.7               35.5                       15.3             388                3,574,097
Delaware              0.016%                     75.5             47.3               27.8                       11.3             146                897,934
District of Columbia  0.007%                     59.9             50.7               50.1                       26.9             40                 601,723
Florida               0.027%                     70.8             63.0               25.8                       9.2              5,147              18,801,310
Georgia               0.032%                     69.9             67.7               27.3                       9.8              3,119              9,687,653
Hawaii                0.019%                     75.4             63.6               29.5                       9.6              260                1,360,301
Idaho                 0.028%                     84.0             45.1               24.4                       7.7              446                1,567,582
Illinois              0.017%                     81.9             58.7               30.8                       11.5             2,220              12,830,632
Indiana               0.023%                     77.2             65.8               22.7                       8.1              1,518              6,483,802
Iowa                  0.019%                     87.9             66.6               24.9                       7.9              588                3,046,355
Kansas                0.024%                     84.5             64.7               29.8                       10.5             689                2,853,118
Kentucky              0.021%                     79.9             62.9               20.5                       8.1              927                4,339,367
Louisiana             0.020%                     68.8             64.7               21.4                       7.0              922                4,533,372
Maine                 0.028%                     82.8             56.2               26.8                       9.5              369                1,328,361
Maryland              0.016%                     82.2             64.0               36.1                       16.4             952                5,773,552
Massachusetts         0.013%                     82.6             73.2               39.0                       16.7             873                6,547,629
Michigan              0.022%                     75.9             61.9               25.2                       9.6              2,176              9,883,640
Minnesota             0.014%                     88.2             70.9               31.8                       10.3             733                5,303,925
Mississippi           0.023%                     63.8             78.8               19.5                       7.1              684                2,967,297
Missouri              0.026%                     83.7             61.4               25.6                       9.5              1,562              5,988,927
Montana               0.029%                     81.9             60.5               28.8                       9.0              286                989,415
Nebraska              0.021%                     83.8             69.5               28.6                       9.0              386                1,826,341
Nevada                0.029%                     57.8             51.8               21.7                       7.4              774                2,700,551
New Hampshire         0.019%                     86.3             64.3               32.8                       12.4             252                1,316,470
New Jersey            0.011%                     87.2             68.6               35.4                       13.3             999                8,791,894
New Mexico            0.026%                     67.3             72.4               25.0                       10.8             528                2,059,179
New York              0.015%                     76.0             68.9               32.5                       14.0             2,971              19,378,102
North Carolina        0.028%                     76.9             64.0               26.5                       8.7              2,650              9,535,483
North Dakota          0.008%                     88.4             67.4               27.6                       7.9              56                 672,591
Ohio                  0.022%                     81.4             61.5               24.6                       8.9              2,523              11,536,504
Oklahoma              0.027%                     78.5             60.2               22.9                       7.5              1,012              3,751,351
Oregon                0.023%                     76.3             47.8               28.8                       10.5             867                3,831,074
Pennsylvania          0.017%                     84.1             60.9               27.1                       10.4             2,171              12,702,379
Rhode Island          0.014%                     76.4             65.4               30.2                       12.2             151                1,052,567
South Carolina        0.030%                     68.2             68.3               24.5                       8.8              1,387              4,625,364
South Dakota          0.020%                     81.8             71.8               26.3                       7.7              162                814,180
Tennessee             0.024%                     80.4             62.0               23.1                       8.5              1,501              6,346,105
Texas                 0.032%                     78.9             56.2               25.9                       8.6              8,077              25,145,561
Utah                  0.018%                     78.6             53.3               29.3                       9.4              494                2,763,885
Vermont               0.019%                     91.4             53.5               33.6                       13.3             116                625,741
Virginia              0.026%                     81.2             63.9               34.2                       14.2             2,103              8,001,024
Washington            0.020%                     77.2             48.3               31.1                       11.1             1,368              6,724,540
West Virginia         0.024%                     78.3             59.2               17.5                       6.6              448                1,852,994
Wisconsin             0.018%                     91.1             60.1               26.3                       9.0              1,041              5,686,986
Wyoming               0.024%                     80.3             60.4               24.1                       8.4              135                563,626