ESSAYS ON COLLEGE TRANSFER IN THE U.S.
AND CHILDREN’S WELFARE IN CHINA
by
Xiaochen Xu
A dissertation submitted to The Johns Hopkins University in conformity
with the requirements for the degree of Doctor of Philosophy
Baltimore, Maryland
August, 2013
© 2013 Xiaochen Xu
All Rights Reserved
Abstract
This dissertation studies college transfer in the U.S. and children’s welfare in
China. In Chapter 2, I develop and estimate a two-period ability-learning structural
model to analyze the determinants and consequences of college transfer. Students
make college entry and transfer decisions under different financial constraints and
uncertainty about their abilities. In period 1, students choose between community
colleges and universities, and in period 2, they make transfer decisions. I estimate
the structural parameters of the model using data from the Beginning Postsecondary
Students Longitudinal Study (BPS:04/09), with simulated maximum likelihood. I
further examine the extent to which the effectiveness of the transfer function of
community colleges can be improved with three counterfactual experiments:
increasing university tuition costs, eliminating transfer costs, and increasing
academic preparedness. The experiments suggest that transfer costs are the main
barrier to college transfer.
Chapter 3 studies the impact of labor migration on children’s health in
China. Labor migration, which frequently results in family separations, is widely
known as one of the main ways of alleviating poverty in developing countries. In
China, migrant workers helped build the Chinese dream in cities across the country.
But for their children, who are left behind in the countryside, potential problems
in their physical and social development are becoming a national issue.
This study uses data collected as part of the China Health and Nutrition Survey
(CHNS) in 2000, 2004, 2006, and 2009 to identify the impact of parents’ migration
on the health outcomes of children in rural China. The measures of child health
are the weight-for-age Z-score (WAZ), the height-for-age Z-score (HAZ), nutrient
intake (consumption of calories and protein), the number of immunization shots a
child receives in the survey year, and childcare. To identify the effect of parental
migration on child health, we instrument parents’ migration status with county-level
historical average migration rates. We find few significant effects
of parents’ migration on child health outcomes.
Keywords: College transfer, tuition, uncertainty, Bayesian inference,
heterogeneity, children’s health, labor migration, fixed effects
model
JEL Classification: A22, C11, C13, C51, I14, I15
Advisors: Professor Robert Moffitt
Professor Yingyao Hu
Acknowledgements
I am deeply indebted to Robert Moffitt for his guidance and encouragement
on this project. I benefited greatly from the comments of Yingyao Hu. I thank
Przemek Jeziorski, Tiemen Woutersen, and seminar participants at Johns Hopkins
for their helpful comments. The usual disclaimer applies. Comments are welcome.
Contents
Abstract
Acknowledgements
List of Tables
1 Introduction
2 The Determinants and Consequences of College Transfer
2.1 INTRODUCTION
2.2 STYLIZED FACTS
2.3 MODEL SPECIFICATION
2.3.1 Overview
2.3.2 Primitives
2.3.3 Ability learning process
2.3.4 Value function
2.4 ESTIMATION STRATEGY AND IDENTIFICATION
2.4.1 Estimation
2.4.2 Identification
2.5 DATA
2.6 ESTIMATION RESULTS
2.6.1 Parameters estimates
2.6.2 Model fit
2.7 POLICY SIMULATION
2.7.1 Increase tuition fees in universities
2.7.2 Improved academic preparedness
2.7.3 Decrease the transfer cost
2.7.4 No transfer cost
2.8 CONCLUSION
3 The Impact of Labor Migration on Children’s Health: Evidence from Rural China
3.1 INTRODUCTION
3.2 BACKGROUND
3.2.1 Labor Migration and Children Left Behind in Rural China
3.2.2 Health of Children in China
3.3 CONCEPTUAL FRAMEWORK
3.4 DATA
3.5 EMPIRICAL SPECIFICATION
3.6 ESTIMATION RESULTS
3.6.1 Results of Ordinary Least Squares model
3.6.2 Results of Fixed Effects model
3.6.3 Results of Fixed Effects model with instrument variable
3.7 ROBUSTNESS CHECK
3.8 REGRESSION RESULTS ON SUBSAMPLES
3.9 CONCLUSION
4 Conclusion
A Appendix to Chapter 2
A.1 Bayesian Update in Ability Learning Process
A.2 Estimation Details
A.2.1 The Closed Form of P(si1|·)
A.2.2 The Closed Form of P(si2|·)
A.2.3 The Closed Form of f(wit|·)
A.3 The Closed Form of f(κit|·)
Bibliography
Curriculum Vitae
List of Tables
2.8 Possible education paths for a student starting from a four-year university
2.9 Possible education paths for a student starting from a community college
2.1 Percentage enrollment in period 1
2.2 Percentage of transfer in period 2
2.3 Average tuition fee
2.4 Average high school GPA and SAT score (normalized)
2.5 Average college GPA for transfer students
2.6 Average family income
2.7 Average family income for transfer students
2.10 Estimation results
2.11 Enrollment rate in period 1 in model fit
2.12 Enrollment rate in period 2 in model fit
2.13 Transfer rate in period 2 in model fit
2.14 Enrollment rate in period 1 in experiment study 1
2.15 Transfer rate in period 2 in experiment study 1
2.16 Enrollment rate in period 2 in experiment study 1
2.17 Transfer rate in period 2 for current student in experiment study 1
2.18 Enrollment rate in period 1 in experiment study 2
2.19 Transfer rate in period 2 in experiment study 2
2.20 Enrollment rate in period 2 in experiment study 2
2.21 Enrollment rate in period 1 in experiment study 3
2.22 Transfer rate in period 2 in experiment study 3
2.23 Enrollment rate in period 2 in experiment study 3
2.24 Enrollment rate in period 1 in experiment study 4
2.25 Transfer rate in period 2 in experiment study 4
2.26 Enrollment rate in period 2 in experiment study 4
3.1 Parents Migration Rate for Children under age ten (CHNS)
3.2a Descriptive Statistics (CHNS)
3.2b Descriptive Statistics (CHNS)
3.2c Descriptive Statistics (CHNS)
3.3 Descriptive Statistics (CHNS) of Control Variables
3.4a OLS regression results: the effects of the household migration status
3.4b OLS regression results: the effects of the household migration status
3.5a OLS regression results: the effects of the father’s migration
3.5b OLS regression results: the effects of the father’s migration
3.6a OLS regression results: the effects of the mother’s migration
3.6b OLS regression results: the effects of the mother’s migration
3.7a Fixed effects model results of the effects of the household migration status on children’s health outcome and care
3.7b Fixed effects model results of the effects of the household migration status on children’s health outcome and care
3.8a Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care
3.8b Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care
3.9a Fixed effects model results of the effects of the mother’s migration status on children’s health outcome and care
3.9b Fixed effects model results of the effects of the mother’s migration status on children’s health outcome and care
3.10 First Stage fixed effects Regression Results
3.11a Fixed effects model results of the effects of the household migration status on children’s health outcome and care: IV approach
3.11b Fixed effects model results of the effects of the household migration status on children’s health outcome and care: IV approach
3.12a Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care: IV approach
3.12b Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care: IV approach
3.13a Fixed effects model results of the effects of the mother’s migration status on children’s health outcome and care: IV approach
3.13b Fixed effects model results of the effects of the mother’s migration status on children’s health outcome and care: IV approach
3.14a Robustness Check 1: the effects of the household migration status on children’s health outcome and care without household income as a control variable
3.14b Robustness Check 1: the effects of the household migration status on children’s health outcome and care without household income as a control variable
3.15a Robustness Check 1: the effects of the father’s migration status on children’s health outcome and care without household income as a control variable
3.15b Robustness Check 1: the effects of the father’s migration status on children’s health outcome and care without household income as a control variable
3.16a Robustness Check 1: the effects of the mother’s migration status on children’s health outcome and care without household income as a control variable
3.16b Robustness Check 1: the effects of the mother’s migration status on children’s health outcome and care without household income as a control variable
3.17a Robustness Check 2: the effects of the household migration status on children’s health outcome and care without the number of elders as control variables
3.17b Robustness Check 2: the effects of the household migration status on children’s health outcome and care without the number of elders as control variables
3.18a Robustness Check 2: the effects of the father’s migration status on children’s health outcome and care without the number of elders as control variables
3.18b Robustness Check 2: the effects of the father’s migration status on children’s health outcome and care without the number of elders as control variables
3.19a Robustness Check 2: the effects of the mother’s migration status on children’s health outcome and care without the number of elders as control variables
3.19b Robustness Check 2: the effects of the mother’s migration status on children’s health outcome and care without the number of elders as control variables
3.20a Fixed effects model results of the effects of the household migration status on children’s health outcome and care on subsamples: IV approach
3.20b Fixed effects model results of the effects of the household migration status on children’s health outcome and care on subsamples: IV approach
3.21a Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care on subsamples: IV approach
3.21b Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care on subsamples: IV approach
Chapter 1
Introduction
Education and health are important topics in labor economics. A person’s
welfare is closely related to their education and health. This dissertation consists of
two essays on education and health. The first essay examines the determinants and
consequences of college transfer using a structural modeling approach. The second
essay examines the impact of labor migration on the health outcomes and care of
children using a fixed effects model.
The literature on the college market mainly draws on non-structural approaches.
For instance, Hilmer (1998) studied the effects of tuition on college transfer.
He suggested that financial concerns are not the most influential factor for students
making college decisions. Kane and Rouse (1993) examined the consequences of
college transfer on labor market return and found similar returns to two-year and
four-year college credits. The findings of this paper support Kane and Rouse’s
findings. However, there are a few studies which do employ structural models. One is
Fu (2010), who proposed and empirically implemented a market equilibrium model
for college education, focusing on application strategies. Admission, net tuition, and
enrollment were joint outcomes in the model. Belzil and Hansen (2002) also used
a structural model to examine choices of years of schooling while
taking into account heterogeneous abilities. Nonetheless, the results in both papers
may be biased because they ignored a large proportion of transfer students.
Chapter 2 contains the first attempt to estimate a structural model of
college enrollment and student transfer decisions under students’ uncertainty about
their abilities. It examines the determinants and consequences of college entry and transfer
decisions through an ability-learning structural model, in which school qualities and
students’ future wages are considered. Using such a model, I can provide insight into
the determinants of college enrollment and transfer decisions, permitting quantitative
evaluation of the effects of counterfactual changes in the college market. The model
explains the allocation of students to different schools. More importantly, I can give
interpretations of the driving forces of and the barriers to college transfer.
This essay contributes to the current literature by simultaneously modeling
four aspects of the decision to enroll in college that are important for empirical
analysis. The first is the uncertainty of student abilities. A student has an a priori
belief about her abilities, informed by her high school GPA, SAT scores and college
GPA. At the same time, the student has private information on her abilities, which
cannot be observed by econometricians. The second aspect is that college transfer
is costly for students, in both monetary and non-monetary terms. Monetary costs
include application fees and relocation costs, while non-monetary costs include the
time spent in school searches and the loss of non-transferable credits. The third is
the effect of college transfer on future income. This includes the empirical fact that
transfer students earn less than non-transfer students who graduate from the same
university. The fourth aspect is the heterogeneity of students with regard to their
family backgrounds, abilities, and school preferences. Students make different college
enrollment and transfer decisions based on their backgrounds, school preferences, and
expected abilities.
Chapter 3 aims to establish the overall consequences of parental migration
on the health outcomes and care of their left-behind children. Some of the economic
literature focusing on labor migration in China suggests that the remittances
forwarded to families by migrated members benefit the households financially. For
instance, Du et al. (2006) and de Brauw and Giles (2008) found that labor migration
increased family consumption levels. Few papers study the health
outcomes of left-behind children in China. One of them is Mu and de Brauw (2011),
which examined the weight of left-behind children and found that older children
(7-12 years) were more likely to be underweight in migrant households than in
nonmigrant households. Shu Zhang (2012) used survey data from the
2000 wave of the China Health and Nutrition Survey (CHNS) to study the impact of
labor migration on children’s health. She found no significant health outcome effects
for children whose fathers had migrated. Neither paper, however, considers the
potential endogeneity of parental migration with respect to children’s health. Therefore their
results might be biased.
The main methodological obstacle to quantifying the effect of parental migration
is the endogeneity problem, or the potential for reverse causation. Instead
of being affected by their parents’ migration status, a child’s health status could be a
critical factor for their parents when making migration decisions. To overcome the
endogeneity problem, we used instrumental variables (IV) estimation. To be more
specific, we instrumented parents’ migration status with the historical county-level
migration rate. The historical county-level migration rate is a suitable indicator
of the local culture and network of migration, where the network refers to
a person’s exposure to migration information from her migrated friends or family
members.
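The logic of this instrumental variables strategy can be illustrated with a minimal sketch on synthetic data. It is not the fixed effects IV specification estimated in Chapter 3: the data-generating process, the variable names, the true effect of -0.5, and the use of a single continuous instrument are all assumptions made for illustration. With one instrument and one endogenous regressor, two-stage least squares reduces to the ratio cov(z, y) / cov(z, x), which is what the sketch computes and compares with OLS.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / (n - 1)

def simulate(n=20000, beta=-0.5, seed=0):
    """Synthetic data (hypothetical): migration x is endogenous because the
    unobserved factor u affects both x and the child health outcome y."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        zi = rng.gauss(0, 1)                    # instrument, e.g. historical county migration rate
        ui = rng.gauss(0, 1)                    # unobserved confounder
        xi = zi + ui + rng.gauss(0, 1)          # parental migration intensity
        yi = beta * xi + ui + rng.gauss(0, 1)   # child health outcome
        z.append(zi)
        x.append(xi)
        y.append(yi)
    return z, x, y

def ols_slope(x, y):
    """OLS slope of y on x; biased here because x is correlated with u."""
    return cov(x, y) / cov(x, x)

def iv_slope(z, x, y):
    """IV slope using z as the instrument for x (just-identified case)."""
    return cov(z, y) / cov(z, x)

z, x, y = simulate()
beta_ols = ols_slope(x, y)
beta_iv = iv_slope(z, x, y)
print(f"OLS: {beta_ols:.3f}  IV: {beta_iv:.3f}  (true effect in the simulation: -0.5)")
```

In this simulated setting the OLS slope is pulled away from the true effect by the confounder, while the IV slope stays close to it, provided the instrument affects the outcome only through migration.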
Chapter 3 contributes to the literature in a number of ways. Firstly, we
used novel instrumental variables, which predict the migration propensity of parents,
to deal with the endogenous nature of parents’ migration decisions.
Secondly, we studied the effects of fathers’ and mothers’ migration status on
child health outcomes separately, and these effects were significantly different. Thirdly, in addition to
traditional measurements of child health that focus on height and weight, we also
considered nutrient intake (consumption of calories and protein), immunization shots
and childcare. These measures provided a more comprehensive picture of the impact
of labor migration on child health.
The remainder of this dissertation consists of two essays and the conclusion.
The concluding chapter summarizes contributions of the essays and discusses avenues
for future research.
Chapter 2
The Determinants and
Consequences of College
Transfer
2.1 INTRODUCTION
As college tuition continues to rise, the issue of financing a college education
is attracting widespread scholarly interest and generating much public policy
debate. According to the Beginning Postsecondary Students Longitudinal Study
(BPS:04/09), 27% of freshmen in the United States are in community colleges, and
of those, 32% eventually transfer to universities. In spite of the importance of community
colleges, neither the causes of nor the barriers to college transfer have received
much attention.
This paper is the first to estimate a structural model of college enrollment
and transfer decisions under students’ uncertainty about their abilities. Existing papers on
similar topics mainly rely on non-structural approaches. For instance, Hilmer
(1998) studies the effects of tuition increases on college transfer
and shows that the enlarged tuition gap between community colleges and universities
pushes more students to community colleges. The estimation results of my model,
however, suggest that financial concerns are not the most influential factors that affect
students’ college decisions. Kane and Rouse (1993) examine the consequences
of college transfer on labor market return and find similar returns to two-year and
four-year college credits, which coincides with the findings in this paper.
However, there are a few studies which do employ structural models. One
is Fu (2010), who proposes and empirically implements a market equilibrium model
for college education and focuses on application strategies. Admission, net tuition,
and enrollment are the joint outcomes. The estimation results reveal the existence of
substantial heterogeneity in students’ preferences for colleges, which leads students to make
different application and enrollment decisions. Nonetheless, Fu’s results may be
biased for two reasons. First, she ignores a large proportion of transfer
students. Second, Fu’s study is premised on the dubious assumption that preferences
are not linked to future wages. Another is Belzil and Hansen (2002), who study
choices involving years of schooling while taking into account heterogeneous abilities.
Under their model, preferences for schools are linked to future wages. Students choose
the years of schooling subject to uncertainty about their abilities. The researchers
strongly reject the null hypothesis that unobserved market ability is uncorrelated with
realized schooling attainments, which underlies many previous studies that have used
OLS to estimate the return to schooling. However, one flawed assumption
in Belzil and Hansen (2002) is that all schools are identical. They do not
distinguish among schools by quality and tuition, and do not allow for factors other
than years in school to affect returns to school, which could bias the estimation
results.
This paper examines the determinants and consequences of college entry
and transfer decisions through an ability-learning structural model, in which school
qualities and students’ future wages are considered. Such a model provides
insight into the determinants of college enrollment and transfer decisions, and permits
quantitative evaluation of the effects of counterfactual changes in the college market.
The model explains the allocation of students to different schools. More importantly,
it allows me to interpret the driving forces of and the barriers to college transfer.
This paper contributes to the current literature by simultaneously modeling
four aspects of the decision to enroll in college that are important for empirical
analysis. The first is the uncertainty of student abilities. A student has an a priori
belief about her abilities, and updates this belief based on her high school GPA,
SAT scores and college GPA. At the same time, the student has private information
on her abilities, which cannot be observed by econometricians. The second aspect
is that college transfer is costly for students; the costs are both monetary and
non-monetary. Monetary costs include application fees and relocation costs, while
non-monetary costs come from the time spent in school searches and the loss of
non-transferable credits. The third is the effect of college transfer on future income,
which captures the empirical fact that transfer students earn less than non-transfer
students conditional on graduating from the same university. The fourth aspect is
the heterogeneity of students with regard to their family backgrounds, abilities, and
school preferences. Students make different college enrollment and transfer decisions
based on their backgrounds, school preferences, and expected abilities.
The structural model in this paper is a two-period model. In period 1, a student
chooses between community colleges and universities based on her expectation
of her abilities, family background, and school preferences. In period 2, she refines
her expectations about her abilities using her college GPA, and makes transfer or
drop-out decisions. A student enrolled in a community college in period 1
can choose to work with an associate’s degree or transfer to a university to get a
bachelor’s degree; a student enrolled in a university in period 1 can choose to
continue at her original university, transfer to another university, or drop out. Students
who decide to transfer incur the transfer cost.
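The belief-refinement step in this two-period structure can be sketched as a sequence of Bayesian updates. The sketch below assumes a normal prior over ability and normally distributed signal noise with known variances, which is a standard textbook form and not necessarily the exact specification derived in the appendix. The function name and all numerical values are hypothetical.

```python
def update_belief(prior_mean, prior_var, signal, noise_var):
    """Normal-normal Bayesian update of a belief about ability.

    Assumes (for illustration only) that signal = ability + noise,
    with noise ~ N(0, noise_var) and prior ability ~ N(prior_mean, prior_var).
    """
    precision = 1.0 / prior_var + 1.0 / noise_var
    post_var = 1.0 / precision
    post_mean = post_var * (prior_mean / prior_var + signal / noise_var)
    return post_mean, post_var

# Hypothetical normalized signals and noise variances.
signals = [
    ("high school GPA", 0.8, 1.0),   # observed before period 1
    ("SAT score", 0.4, 2.0),         # observed before period 1
    ("college GPA", 1.2, 0.5),       # observed before the period 2 decision
]

belief = (0.0, 1.0)  # prior mean and variance of ability
for name, signal, noise_var in signals:
    belief = update_belief(belief[0], belief[1], signal, noise_var)
    print(f"after {name}: mean={belief[0]:.3f}, variance={belief[1]:.3f}")
```

Each signal pulls the expected ability toward the observed value and shrinks the variance of the belief, so a strong college GPA can raise expected ability enough to make transferring to a university attractive in period 2.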
To estimate the model, I use simulated maximum likelihood estimation.
The data come from the Beginning Postsecondary Students Longitudinal Study
(BPS:04/09) from the National Center for Education Statistics (NCES). The data
provide detailed information on students’ high school GPA, SAT scores, college GPA,
family background, school level, and school name. Tuition information is derived from
the Integrated Postsecondary Education Data System (IPEDS).
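The idea behind simulated maximum likelihood can be sketched in a few lines: when a choice probability depends on an unobserved component, it is approximated by averaging over simulated draws of that component, and the resulting simulated likelihood is maximized over the parameters. The sketch below is a toy illustration under stated assumptions, not the likelihood of the model in this paper: it assumes a binary logit choice between a university and a community college, a single preference parameter, a grid search in place of a numerical optimizer, and six made-up observations.

```python
import math
import random

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def simulated_choice_prob(x, theta, sigma, draws):
    """Approximate P(start at a university | x) by averaging the logit
    probability over simulated draws of the unobserved ability component."""
    return sum(logistic(theta * x + sigma * e) for e in draws) / len(draws)

def simulated_loglik(data, theta, sigma, draws):
    """Simulated log-likelihood of the observed binary choices."""
    ll = 0.0
    for x, chose_university in data:
        p = simulated_choice_prob(x, theta, sigma, draws)
        ll += math.log(p if chose_university else 1.0 - p)
    return ll

rng = random.Random(0)
# The same draws are reused for every parameter value so that the
# simulated likelihood is a smooth function of theta.
draws = [rng.gauss(0, 1) for _ in range(200)]

# Hypothetical data: (normalized test score, 1 if the student started at a university).
data = [(-1.0, 0), (-0.5, 0), (0.0, 1), (0.3, 0), (0.8, 1), (1.5, 1)]

grid = [i / 10 for i in range(0, 51)]  # candidate values of theta
theta_hat = max(grid, key=lambda t: simulated_loglik(data, t, 1.0, draws))
print(f"simulated ML estimate of theta on the grid: {theta_hat}")
```

Holding the simulation draws fixed across parameter values is a common design choice, since redrawing at every evaluation would add noise to the objective and make the maximization unstable.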
The estimated model fits the data well, and suggests new ways to interpret
the data. The model generates a rich set of predictions: it matches not only the
static composition of students attending community colleges, public universities and
private universities, but also the dynamic transitions of students among schools.
Interestingly, the model also captures the actual relation between family background
and educational outcomes. For instance, large educational differences by family
background are still predicted by the model even though family income and parental
education are assumed to be uncorrelated with returns to school. That suggests
much of the intergenerational transmission of education can be captured through
parents’ influence on student financial resources during college, rather than through
differential access to credit or returns to school.
Some of my major findings are as follows: (a) transfer costs are large and
are the main barrier to college transfer; (b) transfer does not have a significant effect
on student incomes, which suggests that the market does not discriminate against
transfer students; (c) private information on student abilities does not significantly
influence college choices or account for the difference between the expected and actual
labor market outcomes; (d) family income and parental education level do have
significant effects on the choice of a college by determining student access to financial
resources during college.
In this paper, I use three counterfactual experiments to examine the extent
to which the effectiveness of college enrollment can be improved. In the first experiment,
I increase the tuition in all universities by 20%, while keeping the tuition in
community colleges the same. In the second experiment, I increase both high school
GPA and SAT scores by 0.5 standard deviations. In the third experiment, I set
transfer costs to zero.
The transfer rates from community colleges to universities increase in all
three experiments, and by the largest amount in experiment 3. Eliminating the transfer
cost does the most to improve the efficiency of the transfer function of community
colleges. The transfer rate from community colleges to public universities increases
from 3.8% to 26.2%, while the transfer rate from community colleges to private universities
increases from 2.4% to 17.5%, which shows that the main barrier to transfer
is the high transfer cost. Moreover, the university completion rate is the highest
when there is no transfer cost. The simulations suggest that decreasing transfer
costs (through cooperative agreements between community colleges and universities)
is the most efficient way to encourage students to attend community colleges and
increase the completion rate at universities.
It is important to improve the effectiveness of the transfer function of community
colleges. Because the average expenditure per full-time student in a
community college is much lower than that in a university, the overall cost of higher education
could be greatly reduced if more students attended community colleges with
transfer programs. Therefore the counterfactual studies have clear policy implications
for improving the efficiency of the college market, and reducing the educational
costs for both individuals and local governments.
The rest of the paper proceeds as follows. Stylized facts on postsecondary
education are presented in Section 2 and the model specification in Section 3. The
estimation strategy is presented in Section 4 with a brief discussion of identification.
The data and summary statistics are described in Section 5. In Section 6,
the estimation results are shown together with a brief discussion of the model fit.
Finally, three counterfactual experiments are described in Section 7. Some model
and estimation details are given in the appendix.
2.2 STYLIZED FACTS
Before developing the model, I will first describe the stylized facts of postsecondary
education that the model should replicate. To do this, I use the Beginning
Postsecondary Students Longitudinal Study (BPS:04/09) from the National Center
for Education Statistics (NCES). In each cycle, the study follows a cohort of students
enrolling in postsecondary education for the first time. Members were initially
surveyed at the end of their first academic year (2003-04) and invited to participate
in follow-up surveys at the end of their third (2005-06) and sixth (2008-09) years
after entering postsecondary education. The final BPS:04/09 dataset contains
information on nearly 16,700 students.
• Around 27% of freshmen start their postsecondary education in community
colleges
In Table 2.1, we can see that over one fourth of students choose to begin their
college education in community colleges. The enrollment rate in private uni-
10
versities is about the same as the enrollment rate in public ones.
• Around 12% of students are transfer students
Table 2.2 shows that 40% of community college students transfer to universities.
The transfer rate from community colleges to public universities is twice as high
as the transfer rate from community colleges to private universities, possibly
because financial constraints are one of the main concerns for transfer students,
who are therefore more likely to transfer to less expensive universities. The
average tuition is given in Table 2.3.
• Students who started in universities have higher average high school GPA and
SAT scores than those who started in community colleges
In Table 2.4, we can see that both average SAT score and high school GPA are
the highest in private universities, while the average test scores in community
colleges are the lowest, which implies that test scores play an important role
when students make school choices for their tertiary education.
• Community college transfer students have a higher average college GPA than
those who do not transfer
Table 2.5 gives the average GPA for transfer students. The difference may
suggest that the driving force behind transfers between universities is
different from the driving force behind transfers from community colleges
to universities. While transfers between universities are possibly driven by
ability mismatch (poor performance in the current university), transfers from
community colleges to universities are mainly due to improved expected ability
in the current school.
• Students who started in universities have higher average family income than
those who started in community colleges
From Table 2.6, we see that the cohorts of students in different types of schools
exhibit heterogeneity in family income. The average family income for transfer
students is given in Table 2.7. It is worthwhile to point out that the students
who transfer from community colleges are on average from poorer families than
their counterparts who transfer from universities.
2.3 MODEL SPECIFICATION
2.3.1 Overview
In this model, students are uncertain about their abilities. They learn about
their abilities only from their academic performance in previous periods. In period
1,¹ a student forms an expectation of her abilities based on her high school GPA and
SAT scores. She chooses between community colleges and different universities based
on her family background, school preferences, and expected abilities. In period 2, the
student updates her belief about her abilities based on her college GPA. She then
decides whether or not to transfer based on this belief. If the student starts in a
community college in period 1, she can choose to work with an associate’s degree, or
she can transfer to a university. If the student starts in a four-year university in
period 1, she can choose to transfer to another university, study in the current
university, or drop out without any degree. There are two potential costs for
transfer students. The first is the direct transfer cost, which includes the application
fee, the time that students spend on school search and application, moving costs, and
the loss of nontransferable credits. The second is a possible negative effect on
future wages if the student transfers upward. Here, an upward transfer refers
to a transfer from a community college to a university, or from a university with low
tuition to a university with high tuition. Labor market discrimination against transfer
students may arise from the difference in the quality of education received by
transfer students and non-transfer students, which is explained further in Section
2.3.2. In this paper, tuition is a proxy for school quality. However, the estimation
results will be the same if we use the average SAT score as a proxy for school quality.
From the summary statistics, we can see that private universities have higher tuition
than public universities. At the same time, the average SAT score in private
universities is also higher. Using either tuition or SAT as a proxy, an upward transfer
between universities is one from a public university to a private university. To see
the options more clearly, I list all the school choices in Tables 2.8 and 2.9.

¹ The time line is formally introduced in Section 2.3.2.

Table 2.8: Possible education paths for a student starting from a four-year university
Period 1: Four-year university
Period 2: Choice 1: stay in the current university for another period
                    and get a bachelor’s degree.
          Choice 2: transfer to another four-year university and
                    study for another period for a bachelor’s degree.
          Choice 3: drop out without any degree.
In this model, if a student plans to earn a bachelor’s degree, there are two
ways to achieve it. One is to attend the destination university from the beginning.
The other is to transfer to the destination university after one period at another
institution. The advantage of transferring is that the student may save some tuition
if the destination university has higher costs. The disadvantages of transferring
include transfer costs and possible negative effects on future earnings.

Table 2.9: Possible education paths for a student starting from a community college
Period 1: Community college
Period 2: Choice 1: work with an associate’s degree.
          Choice 2: transfer to a university and
                    study for another period to get a bachelor’s degree.
More importantly, after a student learns about her abilities from her test
scores in the first period, a school transfer can potentially lead to a better match
between school choice and the student’s individual abilities. Under the model, a
community college student may transfer to a university if she believes that her
abilities qualify her for the university and will earn her a higher future return.
For a similar reason, a student starting at a university has the option
to drop out or transfer to another university if she believes that her abilities do not
match well with the current university and that other options may give her a better
return.
2.3.2 Primitives
In this paper, I focus only on students who receive postsecondary education.
To be more specific, work is not a feasible option in period 1.
Time line: Without loss of generality, I take a unit period to be 2 years.
At t = 1 (period 1), an individual makes a school choice from community colleges and
universities; at t = 2 (period 2), an individual can make a transfer or work decision;
and at t = 3 (period 3) and beyond, an individual has to work. Retirement occurs
at time t = T.
Choice set: There are $J$ four-year universities that are ranked in
ascending order of tuition and indexed by $j = 1, 2, \cdots, J$. To be more specific,
university $j = 1$ is the one with the lowest tuition, and university $j = J$ is the one
with the highest tuition. It is assumed in the paper that all community colleges are
the same and are indexed by $j = 0$. An outside option (work) is indexed by $j = -1$.
Here $\mathcal{J}$ denotes the choice set, where $\mathcal{J} = \{-1, 0, 1, \cdots, J\}$.
Utility of working
Utility of working is a logarithmic function of the student’s wage ($w(\cdot)$). The wage
of individual $i$ depends on her abilities ($\alpha_{ij}$, which are discussed in detail in
Section 2.3.3), the school from which she graduates ($D_i$), and her years of experience
($\mathit{Expr}_{it}$). The utility of working $U^W(\cdot)$ is defined as follows,

$$\ln\big(w(\alpha_{i,D_i}, s_{i1}, D_i, \mathit{Expr}_{it})\big) = g(\alpha_{i,D_i}, D_i) + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2 \mathit{Expr}_{it} + \gamma_3 \mathit{Expr}_{it}^2 + \varepsilon^W_{it}, \quad \text{for } t = 2, \cdots, T. \tag{2.1}$$

Here $w(\cdot)$ is an individual’s wage. $\alpha_{i,D_i}$ is the ability of individual $i$ at school $D_i$.
$s_{it}$ is student $i$’s school choice in period $t$. $D_i$ denotes the school from which
student $i$ gets her degree. $g(\alpha_{i,D_i}, D_i)$ is assumed to take the form of

$$g(\alpha_{i,D_i}, D_i = j) = \rho_{1j} + \rho_{2j} \cdot \alpha_{ij}, \tag{2.2}$$

where the $\rho_{1j}$’s and $\rho_{2j}$’s, for all $0 \le j < J$, are to be estimated. $\mathbf{1}(\cdot)$ is the indicator
function. $\mathit{Expr}_{it}$ is the years of experience of individual $i$ in period $t$. $\varepsilon^W_{it}$ represents a
stochastic wage shock at time $t$, which follows an Extreme Value Type I distribution
with location parameter zero and scale parameter $\tau$.
It is assumed that the outside option yields the student the utility

$$\ln(w_{out,t}) = \mu_{out} + \varepsilon^W_{it}, \tag{2.3}$$

which is the logarithm of the wage that the student receives if she quits a university
and works without a degree. Here $\mu_{out}$ is a constant to be estimated, and $\varepsilon^W_{it}$ is
as defined in Equation (2.1).
The utility defined in Equation (2.1) is the student’s utility of working
conditional on graduating from school $D_i$ and on her ability $\alpha_{i,D_i}$. In Equation (2.1),
the first term captures the joint effect of ability and school choice on earnings, the
second term captures the cost of transferring in terms of future earnings, and the
third and fourth terms represent the effect of experience on earnings.
The joint effect of ability and school choice on earnings is defined in Equation
(2.2). This function includes an intercept term for school $j$, which is $\rho_{1j}$, and
a slope that captures the return to ability at school $j$, which is $\rho_{2j}$. It is expected
that $\rho_{1j}$ is lower for less expensive schools and higher for more expensive schools,
implying that, given the same ability level, a student earns a higher return by
attending a more expensive school (that is, by investing more in education).
The estimated $\rho_{2j}$’s are also expected to increase with tuition, which suggests
that high-ability students gain more from more expensive schools than low-ability
students do. Therefore, high-ability students should be more willing to invest in
education, which is consistent with the data.
The second term in the equation captures the potential cost of transferring
in terms of earnings. $\mathbf{1}(D_i > s_{i1}) = 1$ conveys two pieces of information. The first is
$D_i \neq s_{i1}$, which means that the school the student attends in period 1 is different
from the school from which she graduates, so the student does transfer. The
second is $D_i > s_{i1}$, which means that the transfer is an upward transfer according to
the tuition ranking of schools defined above. Therefore, the second term in Equation
(2.1) captures the cost of an upward transfer on future earnings. If $\gamma_1$ is negative,
an upward transfer has a negative effect on the student’s future earnings, which may
happen because education quality can vary across schools. A student who
transfers upward receives only the last period of training from her
destination university. Therefore, the education she receives may not be as good as if
she had begun her education in the destination school, and it is reasonable to think
that she may receive a lower payoff as a result. If the
student transfers from a university with high tuition to a university with low tuition
(downward transfer), I assume there is no effect on the student’s future earnings. In
the model, a downward transfer would not be observed if students
knew their abilities with certainty, so such transfers can only be explained by uncertainty.
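The wage specification in Equations (2.1) and (2.2) can be illustrated with a short Python sketch. The function name, argument names, and all parameter values below are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
def log_wage(alpha, degree_school, first_school, experience,
             rho1, rho2, gamma1, gamma2, gamma3, shock=0.0):
    """Log wage in the spirit of Equations (2.1)-(2.2).

    alpha: ability at the degree-granting school
    degree_school, first_school: school indices (0 = community college,
        higher index = higher tuition)
    rho1, rho2: dicts mapping school index -> intercept / return to ability
    shock: wage shock, set to zero unless supplied
    """
    # Equation (2.2): school-specific intercept plus return to ability.
    g = rho1[degree_school] + rho2[degree_school] * alpha
    # Indicator for an upward transfer: degree school ranked above first school.
    upward = 1.0 if degree_school > first_school else 0.0
    return (g + gamma1 * upward + gamma2 * experience
            + gamma3 * experience ** 2 + shock)


# Hypothetical parameter values, for illustration only.
rho1 = {0: 2.0, 1: 2.3, 2: 2.5}
rho2 = {0: 0.10, 1: 0.20, 2: 0.30}
params = dict(rho1=rho1, rho2=rho2, gamma1=-0.05, gamma2=0.04, gamma3=-0.001)

direct = log_wage(1.0, degree_school=2, first_school=2, experience=0, **params)
upward = log_wage(1.0, degree_school=2, first_school=0, experience=0, **params)
downward = log_wage(1.0, degree_school=1, first_school=2, experience=0, **params)
```

With a negative `gamma1`, the upward transfer student earns less than the student who started at the same degree school, while the downward transfer carries no penalty, as assumed in the text.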
Utility of staying in school
The utility that a student receives in each period from attending either a four-year
university ($j > 0$) or a community college ($j = 0$) is

$$U^S_{jt}(\cdot) = \ln(\xi(\cdot)) + \nu_{ij} + \varepsilon^S_{ijt}, \tag{2.4}$$

where $\xi(\cdot)$ is the money available to the individual given her family income and
school choice,² defined as

$$\xi(X_i, s_{it} = j) = l(X_i, s_{it}) - T_j. \tag{2.5}$$

As previously mentioned, $s_{it}$ is student $i$’s school choice at time $t$. $T_j$ is
the tuition of school $j$. $X_i$ includes student $i$’s family income, number of siblings,
and the education levels of her parents. The function $l(\cdot)$ denotes the money available to
the student before paying tuition given her family background $X_i$ and school choice
$s_{it}$, and takes the log-linear form

$$\ln[l(X_i, s_{it} = j)] = \theta_j + X_i'\beta. \tag{2.6}$$

Here the $\theta_j$’s and $\beta$ are to be estimated. $\nu_{ij}$ represents the student’s idiosyncratic
taste for school $j$, and it follows

$$\nu_{ij} \sim N(0, \sigma_S^2). \tag{2.7}$$

The preference shock, denoted $\varepsilon^S_{ijt}$, follows an Extreme Value Type I distribution
with location parameter zero and scale parameter $\tau$.
A student’s financial constraint is defined in Equation (2.6), which gives the
monetary resources available to her before paying tuition. It is assumed that the
intercept term of the equation ($\theta_j$) differs across schools $j$, which implies
that the money available to the student before paying tuition differs across schools.
There are two reasons for this assumption. The first is that parental
contributions may differ with the student’s school choice. For instance,
the parents may transfer more money to the student if she decides to attend a more
expensive school. The second is that a student may spend more money aside from tuition in
certain types of schools than in others. For instance, students who attend community
colleges are more likely to choose a school near their home. Hence, they can live with
their parents and save on transportation and living expenses. Students in community
colleges are also more likely to take advantage of part-time job opportunities.
Therefore, they may make extra money apart from family contributions.

² In estimation, $\ln(\xi(\cdot))$ is replaced by $(1 + d^{0.5})\ln(\xi(\cdot))$, where $\ln(\xi(\cdot))$ is defined over the
financial resources available to students for half a period (1 year), and $d^{0.5}$ is the
discount factor for one year.
In Equation (2.4), we can see that the differences in students’ utilities of
attending different schools are captured by family backgrounds ($X_i$), tastes for schools
($\nu_{ij}$), and preference shocks ($\varepsilon^S_{ijt}$). The idiosyncratic preferences are captured by
two terms, $\nu_{ij}$ and $\varepsilon^S_{ijt}$. $\nu_{ij}$ stays the same over time, while $\varepsilon^S_{ijt}$ changes over time.
Taken together, the two terms imply that a student’s preference for the same school
varies over time but is correlated across periods. $\nu_{ij}$ captures the part of the student’s
preference that does not change over time. For instance, a student may persistently
prefer a certain school because her parents went to the same school or because the
location of the school is ideal. On the other hand, $\varepsilon^S_{ijt}$ captures the preference shock that
changes over time. For instance, preferences may change due to unexpected
good or bad experiences in certain schools.
In Equation (2.4), student utility from attending school is measured by the
monetary resources available to the student rather than by psychic costs.
Family income and the education level of her parents influence student utility through
the amount of financial resources available. As a result, the intergenerational
transmission of education is captured through financial means (parents with higher
education tend to contribute more money to a student’s education) rather than through
differential access to education or return to schools.
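Equations (2.4) through (2.6) can be combined into a short Python sketch of the per-period utility of attending school. The function names and the example numbers are hypothetical. Returning negative infinity when resources do not cover tuition is an assumption made here so that the logarithm stays defined; the text does not say how that case is handled.

```python
import math


def resources_before_tuition(theta_j, x, beta):
    """l(X_i, j) from Equation (2.6): exp(theta_j + X_i' beta)."""
    return math.exp(theta_j + sum(xk * bk for xk, bk in zip(x, beta)))


def school_utility(theta_j, x, beta, tuition_j, taste=0.0, shock=0.0):
    """Per-period utility of attending school j, following Equation (2.4)."""
    # Equation (2.5): money left after paying tuition.
    xi = resources_before_tuition(theta_j, x, beta) - tuition_j
    if xi <= 0:
        # Assumption (not from the text): an unaffordable school is never chosen.
        return float("-inf")
    return math.log(xi) + taste + shock


# Hypothetical inputs: one background variable with a zero coefficient,
# so resources equal exp(theta_j) = 10 in arbitrary money units.
theta = math.log(10.0)
affordable = school_utility(theta, x=[1.0], beta=[0.0], tuition_j=4.0)
unaffordable = school_utility(theta, x=[1.0], beta=[0.0], tuition_j=20.0)
```

Because `theta_j` is school specific, the same family background can imply different resources at different schools, which is the role of the intercept discussed above.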
2.3.3 Ability learning process
Altonji (1993), Cunha, Heckman, and Navarro (2005), and Chen (2007)
show the importance of uncertainty in explaining college decisions and potential
wage variation. In this paper, uncertainty enters through students’ abilities. I assume
student abilities are multi-dimensional, which implies that students have different
abilities at different schools. To be more specific, the ability vector of student $i$ is
given by

$$\alpha_i = [\alpha_{i0}, \alpha_{i1}, \alpha_{i2}, \cdots, \alpha_{iJ}]',$$

where $\alpha_{ij}$ is student $i$’s ability at school $j$.
Students update their beliefs about their abilities $\alpha_{ij}$ by Bayesian updating
of the distributions of their abilities at different schools. Before making a college
enrollment decision, a student has a prior belief regarding her abilities. Based on
high school GPA and SAT scores, she updates her belief about her abilities at all
schools ($\alpha_{ij}$ for $j \ge 0$). After the student enters school $j$, she updates the
distribution of her ability at school $j$ ($\alpha_{ij}$) using her GPA at school $j$ ($\kappa_{ijt}$). I assume
the student’s GPA at school $j$ ($\kappa_{ijt}$) is affected only by her ability at school $j$ ($\alpha_{ij}$).
In the following equations, I describe the relationship between the student’s
ability and the signals (high school GPA, SAT scores, and college GPA).
Prior distribution: The prior distribution of student $i$’s ability at school
$j$ ($\alpha_{ij}$) is assumed to follow

$$\alpha_{ij} \sim N(m_j + \chi_{ij}, \sigma_\alpha^2), \tag{2.8}$$

where $m_j$ is to be estimated, and

$$\chi_{ij} \sim N(0, \sigma_\mu^2) \tag{2.9}$$

denotes the unobserved academic aptitude that is known to the student but not to the
econometrician. From Equation (2.8), we can see that a student’s prior beliefs
about her abilities differ from those of other students only through unobserved academic
aptitude ($\chi_{ij}$).
After receiving high school GPA: It is assumed that the high school
GPA of student $i$ is affected by a linear combination of student abilities, the $\alpha_{ij}$’s, which
is given by

$$\mathit{HsGPA}_i = \mu_0 + \sum_{j=0}^{J} \mu_{1j} \cdot \alpha_{ij} + \varepsilon^{Hs}_{ij}, \tag{2.10}$$

where $\mu_0$ and the $\mu_{1j}$’s are to be estimated, $\mathit{HsGPA}_i$ is student $i$’s high school GPA, and
the $\varepsilon^{Hs}_{ij}$’s are i.i.d. $N(0, \sigma^2)$ random noise.
After receiving SAT score: It is assumed that the SAT scores of student $i$
are affected by a linear combination of student abilities, the $\alpha_{ij}$’s, which is given by

$$\mathit{SAT}_i = \mu_0 + \sum_{j=0}^{J} \mu_{1j} \cdot \alpha_{ij} + \varepsilon^{SAT}_{ij}, \tag{2.11}$$

where $\mu_0$ and the $\mu_{1j}$’s are the same as defined in Equation (2.10), $\mathit{SAT}_i$ is student $i$’s
SAT score, and the $\varepsilon^{SAT}_{ij}$’s are i.i.d. $N(0, \sigma^2)$ random noise.
After receiving college GPA: Students have GPA observations from the
schools that they attend in period 1 before making college choices in period 2. To be
more specific, if a student is enrolled in school $j$ in period 1, she receives her GPA
from school $j$ after one period, which is denoted as $\kappa_{ij1}$. It is assumed that the signal
$\kappa_{ijt}$ received in school $j$ is affected only by student ability at school $j$ ($\alpha_{ij}$). The
relation between student $i$’s ability $\alpha_{ij}$ and signal $\kappa_{ijt}$ is

$$\kappa_{ijt} = \alpha_{ij} + \varepsilon^\kappa_{it}, \tag{2.12}$$

where the $\varepsilon^\kappa_{it}$’s are i.i.d. $N(0, \sigma_\kappa^2)$ random noise.
If student $i$ attends school $j$, she updates only the distribution of her ability
at school $j$ ($\alpha_{ij}$) based on the college GPA that she receives at school $j$ ($\kappa_{ijt}$).
Students estimate the posterior mean and variance of their abilities based on
the test scores using Bayesian updating. The details of the ability updating process
are provided in the appendix.
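The single-signal case of this updating rule can be sketched in Python. This is the textbook normal-normal update implied by a normal prior on ability and the college GPA signal in Equation (2.12). The full procedure in the appendix, which also uses high school GPA and SAT scores to update abilities at all schools jointly, may differ. The function name and the numbers are illustrative assumptions.

```python
def update_normal_belief(prior_mean, prior_var, signal, noise_var):
    """Posterior mean and variance of ability after one noisy signal.

    Assumes ability ~ N(prior_mean, prior_var) and
    signal = ability + noise, with noise ~ N(0, noise_var).
    """
    # Weight on the signal: larger when the prior is diffuse or the signal precise.
    weight = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + weight * (signal - prior_mean)
    post_var = prior_var * noise_var / (prior_var + noise_var)
    return post_mean, post_var


# Illustrative numbers: a student with prior mean 0 observes a GPA signal of 2.
post_mean, post_var = update_normal_belief(prior_mean=0.0, prior_var=1.0,
                                           signal=2.0, noise_var=1.0)
```

The posterior mean moves toward the signal and the posterior variance shrinks, which is how a good college GPA can raise a community college student's expected ability.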
2.3.4 Value function
The value function in period 1 can be solved by backward induction. This
function can be computed using the student’s estimated ability and past school
choices. To simplify notation, I define the information set to include all the signals
that the student received in previous periods and her past school choices.
To be specific, the information set of student $i$ at the start of period $t$ is
$I_{it} = (\kappa_{i1}, \cdots, \kappa_{i,t-1}, s_{i1}, \cdots, s_{i,t-1})'$.
Value function in period 2
I now characterize the utility associated with each of the potential choices
available to the agent. Students can make two types of choices in period 2, which I
discuss separately. The first is the choice to continue university studies. The second
is the choice to work.
The present value of attending a four-year university: The present
value of attending university $j$ ($s_{i2} = D_i = j$ for $j > 0$) in period 2 is

$$V_2(X_i, I_{i2}, s_{i2} = j) = d\,[U^S_{jt}(X_i, j)] + E\left[\sum_{t=3}^{T} d^{t-1} U^W_t(\alpha_{i,D_i}, s_{i1}, j, \mathit{Expr}_{it}) \,\Big|\, I_{i2}\right] + d\,\mathbf{1}(s_{i1} \neq j) \cdot TF, \tag{2.13}$$

where $TF$ refers to the direct transfer cost that was introduced in Section 2.1, and $d$
is the time preference parameter over a period (i.e., 2 years).
The value of attending a four-year university $s_{i2}$ ($V_2(\cdot)$) in period 2 is
composed of three parts: the utility of staying in school $s_{i2}$ for one period, $U^S(\cdot)$,
defined in Equation (2.4); the discounted utility from receiving a salary, $U^W(\cdot)$,
defined in Equation (2.1), starting one period later; and the direct transfer cost ($TF$)
if the student decides to transfer ($s_{i1} \neq j$).
We can see that the value function is a function of the student’s school
choice in period 1 (si1), family background (Xi), and the information set at period 2
(Ii2), which contains the school choice made in the previous period and all the test
scores that were obtained so far. It implies that the student makes school choice in
period 2 based on the previous school choice, family background, and individual test
scores.
Note that the utility of working ($U^W(\cdot)$) is a random variable.
The expectation is taken over the error terms in the wage equation ($\{\varepsilon^W_{it}\}_{t \ge 3}$) and
student abilities ($\alpha_{ij}$). As I have mentioned, students know only the distribution of
their abilities and refine that distribution in every period based on test scores.
The value function is derived based on the refined distribution of their abilities. As
a result, there is a discrepancy between expected abilities in period 1 and in period
2. When the shock to abilities is large, a student may need to adjust her planned
education path. For instance, dropping out of a university is usually unplanned
and can be attributed to, among other factors, a negative shock to predicted student
abilities. It is not rare for students who enrolled in universities in period 1 to
drop out after receiving very poor grades. The significant proportion
of dropouts is therefore explained by the difference in expected abilities (uncertainty)
across time.
Transfer behavior affects the value function in two ways. One is
through the effect on future wages, which is captured by $U^W(\cdot)$ (defined in Equation
(2.1)). It is assumed that transfer students may not receive the same payoff as
non-transfer students, given that they graduate from the same university. The other
effect of transfer behavior is captured by the last term in Equation (2.13), which is
the direct cost. The condition in the indicator function, $s_{i1} \neq s_{i2}$, means that the school
the student attends in period 1 ($s_{i1}$) is different from the one that she attends in
period 2 ($s_{i2}$). As I have pointed out, the transfer costs in this term include both
monetary and non-monetary costs.
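The present value of working: If the student decides to work in period
2 ($s_{i2} = -1$), the discounted utility from work in period 2 is

$$V_2(X_i, I_{i2}, s_{i2} = -1) = \mathbf{1}(s_{i1} = 0)\, E\left[\sum_{t=2}^{T} d^{t-1} U^W_t(\alpha_{i,D_i}, s_{i1}, D_i = 0, \mathit{Expr}_{it}) \,\Big|\, I_{i2}\right] + \mathbf{1}(s_{i1} \neq 0)\, E\left[\sum_{t=2}^{T} d^{t-1} \ln(w_{out,t})\right]. \tag{2.14}$$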
Recall that $D_i$ is the school from which student $i$ gets her degree. The
condition $D_i = 0$ indicates that student $i$ receives a degree from a community college.
The value of working differs between students who were enrolled in a community
college and those who were enrolled in a four-year university in period 1. The
expected value inside the first (second) square bracket is the value of working for
student $i$ who was enrolled in a community college (four-year university) in the first
period. The expectation in the first term is taken over the error terms in the wage
equation ($\{\varepsilon^W_{it}\}_{t \ge 2}$) and student abilities $\alpha_{ij}$, while the expectation in the second
term is taken only over the error terms in the wage equation ($\{\varepsilon^W_{it}\}_{t \ge 2}$).
If a student who was enrolled in a community college in the first period
($s_{i1} = 0$) decides to work in period 2, she works with an associate’s degree.
In this case $D_i = 0$, and the utility of working that the student receives is
$U^W_t(\alpha_{i,D_i}, s_{i1}, D_i = 0)$. However, a student enrolled in a four-year university in the
first period ($s_{i1} \neq 0$) who decides to work in period 2 works without any degree.
In this case, the utility of working without a degree is $\ln(w_{out,t})$, which is defined in
Equation (2.3).
A student makes the school choice $s_{i2}$ that maximizes her utility $V_2(X_i, I_{i2}, s_{i2})$.
The student’s school choice problem at the beginning of period 2 is

$$\max_{s_{i2} \in \mathcal{J} \setminus \{0\}} \{V_2(X_i, I_{i2}, s_{i2})\}, \tag{2.15}$$

where $I_{i2}$ includes all past school choices and test scores observed at the beginning
of period 2. The set $\mathcal{J} \setminus \{0\}$ contains the work option and all universities but not
community colleges, because community colleges are not feasible options in period 2.
Let the optimal application strategy be $s_{i2}(I_{i2}, X_i)$.
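The period 2 problem in Equations (2.13) through (2.15) can be sketched in Python as a comparison of discounted values across the feasible options. This is a simplified illustration, not the estimation code: the preference and wage shocks are ignored, and `expected_log_wage` is assumed to already integrate over the student's posterior belief about her ability. All function names and numbers are hypothetical.

```python
def discounted_sum(flow, d, t_start, t_end):
    """Return the sum over t = t_start..t_end of d**(t - 1) * flow(t)."""
    return sum(d ** (t - 1) * flow(t) for t in range(t_start, t_end + 1))


def v2_attend(j, first_school, d, school_utility, expected_log_wage,
              transfer_cost, T):
    """Equation (2.13): attend university j > 0 in period 2, then work."""
    value = d * school_utility(j)
    value += discounted_sum(
        lambda t: expected_log_wage(j, first_school, t), d, 3, T)
    if first_school != j:
        # TF enters with a plus sign, so a cost is a negative number.
        value += d * transfer_cost
    return value


def v2_work(first_school, d, expected_log_wage, log_outside_wage, T):
    """Equation (2.14): work from period 2 onward."""
    if first_school == 0:
        # Community college starters work with an associate's degree.
        return discounted_sum(
            lambda t: expected_log_wage(0, 0, t), d, 2, T)
    # University starters who quit earn the outside (no degree) wage.
    return discounted_sum(lambda t: log_outside_wage, d, 2, T)


def best_period2_choice(first_school, universities, d, school_utility,
                        expected_log_wage, transfer_cost,
                        log_outside_wage, T):
    """Equation (2.15): choose among work (-1) and the universities."""
    values = {-1: v2_work(first_school, d, expected_log_wage,
                          log_outside_wage, T)}
    for j in universities:
        values[j] = v2_attend(j, first_school, d, school_utility,
                              expected_log_wage, transfer_cost, T)
    return max(values, key=values.get), values


# Made-up inputs, for illustration only.
base = {0: 2.0, 1: 2.5, 2: 2.6}
choice, values = best_period2_choice(
    first_school=0, universities=[1, 2], d=0.9,
    school_utility=lambda j: 1.0,
    expected_log_wage=lambda j, s1, t: base[j] - (0.05 if j > s1 else 0.0),
    transfer_cost=-0.3, log_outside_wage=1.8, T=4)
```

In this toy example the horizon is very short, so the wage gain from a bachelor's degree has little time to repay the period spent in school and the transfer cost. Lengthening `T` or shrinking `transfer_cost` toward zero makes transferring more attractive.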
Value function in period 1
In period 1, the value of attending either a four-year university ($s_{i1} > 0$)
or a community college ($s_{i1} = 0$) is composed of the utility of staying in school for
the current period, $U^S(\cdot)$, and the expected maximized utility from the second period
given the student’s school choice in period 1 ($s_{i1}$), which can be expressed in a single
value function as

$$V_1(X_i, I_{i1}, s_{i1}) = U^S_{jt}(X_i, s_{i1}) + E\left[\max_{s_{i2} \in \mathcal{J} \setminus \{0\}} [V_2(X_i, I_{i2}, s_{i2})] \,\Big|\, I_{i1}, s_{i1}\right]. \tag{2.16}$$
As I have mentioned, this is a two-period model. Given the dropout
and transfer options available to students, the enrollment choices predicted by this
model can differ from those predicted by a one-period model (e.g., Fu’s
2010 model). In this model, a student may not necessarily start in her destination
university, the choice that gives her the highest payoff. She may choose to attend a
cheaper school in the first period and transfer to her destination university in the
second period.
A student makes the school choice $s_{i1}$ that maximizes her utility $V_1(X_i, I_{i1}, s_{i1})$.
The student’s school choice problem is

$$\max_{s_{i1} \in \mathcal{J} \setminus \{-1\}} \{V_1(X_i, I_{i1}, s_{i1}) \mid I_{i1}\}. \tag{2.17}$$

Let the optimal application strategy be $s_{i1}(I_{i1}, X_i)$. The set $\mathcal{J} \setminus \{-1\}$ encompasses
all school choices (community college and four-year universities) but not work
opportunities.
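The expected maximum in Equation (2.16) can be illustrated with a Python sketch under a simplifying assumption: if every period 2 alternative carried an additive i.i.d. Extreme Value Type I shock with location zero and scale $\tau$, and nothing else were random, the expected maximum would have the familiar log-sum closed form. The model in this paper also integrates over ability beliefs and uses simulation (see the appendix), so the sketch only conveys the structure of the calculation. Function names and inputs are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of a standard Gumbel variable


def expected_max_ev1(values, tau):
    """E[max_k (v_k + e_k)] for i.i.d. Gumbel(0, tau) shocks e_k."""
    m = max(values)
    # Subtract the maximum before exponentiating for numerical stability.
    log_sum = m / tau + math.log(sum(math.exp((v - m) / tau) for v in values))
    return tau * (EULER_GAMMA + log_sum)


def v1(school_utility_now, period2_values, tau):
    """Period 1 value: current school utility plus expected best continuation."""
    return school_utility_now + expected_max_ev1(period2_values, tau)


# Illustrative continuation values for work and two universities.
value_cc = v1(school_utility_now=1.0, period2_values=[4.9, 4.4, 4.6], tau=0.5)
```

The expected maximum is at least as large as the best deterministic continuation value, which reflects the option value of being able to revise the education path in period 2.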
2.4 ESTIMATION STRATEGY AND IDENTIFICATION
In this section, the estimation and construction of the likelihood function
are discussed first. A brief discussion on identification is also provided.
2.4.1 Estimation
In the data, I have only three periods of observations, in which I observe
either one or two periods of wage data. To be clearer, if a student attends school
for two periods, I have two periods of GPA and school enrollment status observations
and one period of wage observations. For students who complete only one period of
education, I have one period of GPA and school enrollment status observations as well
as two periods of wage observations.
The parameters that I estimate include the $\gamma$’s in Equation (2.1), the $\rho$’s in Equation
(2.2), $\tau$ in Equation (2.4), the $\theta$’s and $\beta$ in Equation (2.6), $\sigma_S^2$ in Equation (2.7), the $m$’s
in Equation (2.8), $\sigma_\alpha^2$ in Equation (2.8), $\sigma_\mu^2$ in Equation (2.9), $\sigma^2$ in Equation (2.10),
and $\sigma_\kappa^2$ in Equation (2.12).
The error terms, namely the student’s unobservable belief about her abilities
($\{\chi_j\}_{j \in \mathcal{J}}$ in Equation (2.8)), idiosyncratic tastes for schools ($\{\nu_j\}_{j \in \mathcal{J}}$ in Equation (2.4)),
and the deviation of a student’s true abilities from posterior means ($\{\varepsilon_j\}_{j \in \mathcal{J}}$ in Equation
(A.10)), are assumed to be independent. The joint distribution function of the error
terms, $G(\{\chi_j\}_{j \in \mathcal{J}}, \{\nu_j\}_{j \in \mathcal{J}}, \{\varepsilon_j\}_{j \in \mathcal{J}})$, is multivariate normal, and
the off-diagonal elements of the variance-covariance matrix are zeros.
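The likelihood function is constructed based on school choice observations
($s_{it}$), wage observations ($w_{it}$), and college GPA observations ($\kappa_{it}$). I implement the
estimation step via simulated maximum likelihood estimation (SMLE). For a student
who attends school in both periods, the individual likelihood is

$$L_i(\cdot) = \int P(s_{i1}, s_{i2} \mid w_{i3}, \kappa_{i1}, \kappa_{i2}, X_i, \{\nu_j\}_{j \in \mathcal{J}}) \times f(w_{i3}, \kappa_{i1}, \kappa_{i2} \mid \{\chi_j\}_{j \in \mathcal{J}}, \{\varepsilon_j\}_{j \in \mathcal{J}}) \, dG(\{\chi_j\}_{j \in \mathcal{J}}, \{\nu_j\}_{j \in \mathcal{J}}, \{\varepsilon_j\}_{j \in \mathcal{J}}), \tag{2.18}$$

where $P(\cdot)$ is the simulated probability and $f(\cdot)$ is the simulated density. The likelihood
function for individuals who work after the first period can be derived in a similar
way.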
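Because $s_{i1}$ and $s_{i2}$ are conditionally independent as defined in the model, I have

$$P(s_{i1}, s_{i2} \mid \cdot) = P(s_{i1} \mid \cdot) \times P(s_{i2} \mid \cdot).$$

$w_{i3}$, $\kappa_{i1}$, and $\kappa_{i2}$ are also conditionally independent. Therefore

$$f(w_{i3}, \kappa_{i1}, \kappa_{i2} \mid \cdot) = f_W(w_{i3} \mid \cdot) \times f_\kappa(\kappa_{i1} \mid \cdot) \times f_\kappa(\kappa_{i2} \mid \cdot).$$

Details regarding the calculation of the simulated probability and simulated density
are discussed in the appendix.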
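The integral in Equation (2.18) has no closed form, which is why simulation is needed. A generic Monte Carlo version can be sketched in Python as follows: draw the unobservables, evaluate the choice probability and the outcome density at each draw, and average the product. The callables `choice_prob`, `outcome_density`, and `draw_unobservables` are hypothetical placeholders for the model-specific pieces described in the appendix; this is not the author's implementation.

```python
import math
import random


def simulated_likelihood(choice_prob, outcome_density, draw_unobservables,
                         n_draws, seed=0):
    """Monte Carlo approximation of an individual likelihood contribution.

    Averages choice_prob(u) * outcome_density(u) over n_draws simulated
    values u of the unobservables.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        u = draw_unobservables(rng)
        total += choice_prob(u) * outcome_density(u)
    return total / n_draws


def simulated_log_likelihood(individual_likelihoods):
    """Sum of log simulated likelihoods across individuals."""
    return sum(math.log(li) for li in individual_likelihoods)


# Toy check with constant components, so the average is known exactly.
toy = simulated_likelihood(
    choice_prob=lambda u: 0.5,
    outcome_density=lambda u: 2.0,
    draw_unobservables=lambda rng: rng.gauss(0.0, 1.0),
    n_draws=100)
```

In practice the same draws are typically held fixed across parameter values so that the simulated objective is a smooth function of the parameters; whether that is done here is not stated in this section.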
2.4.2 Identification
The identification of the full model hinges on exclusion restrictions. The
parameters in the wage equation (Equation (2.1)) are identified primarily through
the wage observations. The parameters in the utility of attending school (Equation
(2.6)) are identified by school choices. Because the utility is a log function of the
monetary value, the variance of student preferences (the $\nu_{ij}$’s) and the scale parameter
of the preference shocks (the $\varepsilon^S_{ijt}$’s) are jointly identified by tuition. Holding all other
variables constant, the estimates of the variance of $\nu_{ij}$ and the scale parameter of $\varepsilon^S_{ijt}$
are needed to rationalize the proportions of students choosing certain types
of schools. For instance, in period 1, if a large proportion of students with similar
characteristics (family background and SAT scores) choose to attend the same school,
the variance of $\nu_{ij} + \varepsilon^S_{ijt}$ should be small; otherwise, the variance should be large.
Student college choices in period 2 help to identify the variance of $\nu_{ij}$ and the scale
parameter of $\varepsilon^S_{ijt}$ separately. For instance, according to the structure of the model,
dropping out of a university is not a planned education path, but is explained
by shocks to student preferences (the $\varepsilon^S_{ijt}$’s) and shocks to expected abilities. If
the dropout rate from universities is about the same for students with
high GPA and students with low GPA, the phenomenon cannot be fully explained
by shocks to expected abilities, so it has to be explained by the $\varepsilon^S_{ijt}$’s. In that case, the
scale parameter $\tau$ of the preference shock $\varepsilon^S_{ijt}$ should be large.
The parameters $m_j$ and $\mu_j$ in the ability learning process (Equation (2.8)
in Section 2.3) are identified jointly by high school GPA, SAT scores, and college
GPA observations. The variance of unobserved academic aptitude ($\sigma_\mu^2$ in Equation
(2.9)) is primarily identified by wage observations, because the differences between
the observed wages and the predicted wages are jointly explained by unobserved
academic aptitude and wage shocks ($\varepsilon^W_{it}$), while the scale parameter $\tau$ of the wage
shocks and preference shocks is already identified by school choices.
2.5 DATA
The dataset used in this paper is the Beginning Postsecondary Students
Longitudinal Study (BPS:04/09) from the National Center for Education Statistics
(NCES). The variables that are relevant to this paper include student enrollment
status, demographic characteristics, income, family income, and college GPA.
Data selection: I drop the observations in which one or more critical
pieces of information are missing: (a) high school GPA, (b) SAT scores, (c) college
GPA, (d) family income, (e) number of people supported by parents, (f) parents’
highest education level, (g) school level (2-year or 4-year) for both periods 1 and 2,
and (h) school name, if the school the student attends in period 1 or period 2 is a
four-year university.
I do not drop the observations in which school names are missing while the
school level is 2-year, because I group all 2-year colleges together (community college).
For universities, school names are needed to identify their selectivity.
A problem may arise if values are not missing at random. I have
compared the average SAT score and high school GPA between the group of observations
with missing values and the group of observations without missing values.
For the observations without missing values, the average SAT score is lower by 0.27
standard deviations, but the average high school GPA is higher by 0.08 standard
deviations. It is hard to conclude whether the values are missing at random, and the
estimates might be biased in an ambiguous direction. However, because I control for
students’ family background and all of their test scores, the bias should not be large. In
future research, one possible extension is to write a new model that accounts for the
missing data. That model could then be incorporated into a more complex model for
estimating missing values. An example is given by Dunning and Freedman (2008).
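The comparison between complete and incomplete observations described above can be sketched in Python. The text does not say which standard deviation is used to scale the gap, so dividing by the standard deviation of the combined sample is an assumption made here. The function name and the toy numbers are hypothetical and are not BPS:04/09 data.

```python
import statistics


def standardized_gap(complete, incomplete):
    """Mean of `complete` minus mean of `incomplete`, in standard deviation units.

    Assumption: the gap is scaled by the population standard deviation of the
    combined sample; the text does not state which scaling it uses.
    """
    overall_sd = statistics.pstdev(list(complete) + list(incomplete))
    gap = statistics.mean(complete) - statistics.mean(incomplete)
    return gap / overall_sd


# Toy scores, for illustration only.
gap = standardized_gap(complete=[1.0, 1.0], incomplete=[3.0, 3.0])
```

A negative value means that the complete observations have a lower average than the incomplete ones, which is the pattern reported for SAT scores.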
For estimation purposes, I drop students who are enrolled in selective
universities, which are defined as the top 30 private and public universities
and the top 20 liberal arts colleges ranked by U.S. News and World Report 2003-2006
(a similar grouping can be found in Fu (2010)).
There are three reasons to drop the students in selective universities. First,
the proportion of students enrolled in selective universities is small: in the data,
it is only 8%. Second, and more importantly, there is almost no transfer or dropout
behavior in selective universities: 0% of students transferred to or out of them.
Because the students in selective universities provide little information on
the transfer behavior that is the main interest of this paper, dropping those students
does not alter the estimates of the parameters of interest. Third, I assume that a student
can choose the school that maximizes her value function. However, the acceptance
rates of selective universities are not high enough to satisfy this assumption.
After these sample restrictions, the final sample size is
6300 (rounded to the nearest ten).
Aggregation of schools: Schools must be aggregated in this
paper, for the reasons discussed in Fu (2010). Without aggregation, there
are two major constraints. The first is computational feasibility.
If schools are not aggregated, students choose among thousands of possible schools,
which poses a major computational challenge. The second is that the number
of students attending any single school is too small to provide accurate estimates of
the parameters. For instance, the enrollment rates in some liberal arts colleges are
exactly zero.
To do this, I try to group schools according to crucial features that may
affect the school decisions of students. The crucial features that I consider are tuition
and school type.
• Group 0: community colleges, which is denoted by j = 0.
• Group 1: public universities, which is denoted by j = 1.
• Group 2: private universities, which is denoted by j = 2.
I treat schools in each group as a single school. The definition of enrollment
is adjusted accordingly. A student is said to attend school j if she attends any school
in group j. I use the average tuition for each group based on tuition information
from the Integrated Postsecondary Education Data System (IPEDS).
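The grouping rule above can be sketched in a few lines. The school records, their names and their tuition figures are made up for illustration, and the simple unweighted average per group is an assumption; the text does not say whether the IPEDS averages are enrollment-weighted.

```python
from collections import defaultdict

GROUP_NAMES = {0: "community colleges", 1: "public universities", 2: "private universities"}


def school_group(level, control):
    """Return j: 0 for any 2-year college, 1 for a public and 2 for a private university."""
    if level == "2-year":
        return 0
    return 1 if control == "public" else 2


# Hypothetical school records (not taken from BPS:04/09 or IPEDS).
schools = [
    {"name": "College A", "level": "2-year", "control": "public", "tuition": 3000},
    {"name": "College B", "level": "2-year", "control": "private", "tuition": 5000},
    {"name": "University C", "level": "4-year", "control": "public", "tuition": 6000},
    {"name": "University D", "level": "4-year", "control": "public", "tuition": 8000},
    {"name": "University E", "level": "4-year", "control": "private", "tuition": 20000},
]


def average_tuition_by_group(schools):
    """Unweighted mean tuition of the schools in each group j."""
    totals = defaultdict(lambda: [0.0, 0])
    for school in schools:
        j = school_group(school["level"], school["control"])
        totals[j][0] += school["tuition"]
        totals[j][1] += 1
    return {j: total / count for j, (total, count) in totals.items()}


avg_tuition = average_tuition_by_group(schools)
for j in sorted(avg_tuition):
    print(j, GROUP_NAMES[j], avg_tuition[j])
```

Note that all 2-year colleges fall into group 0 regardless of control, which is why school names are not needed for them.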
Normalization of the test scores: In this paper, the test scores
(high school GPA, SAT scores, college GPA) observed in different periods are
measured on different scales. For computational purposes, I normalize each test score
by subtracting the (cross-sectional) sample mean and then dividing by the
(cross-sectional) sample standard deviation:
\tilde{\kappa}_{it} = \frac{\kappa_{it} - E(\kappa_t)}{s.d.(\kappa_t)}, \qquad (2.19)
where
E(\kappa_t) = \frac{\sum_{i=1}^{N} \kappa_{it}}{N}, \qquad (2.20)
Var(\kappa_t) = \frac{\sum_{i=1}^{N} [\kappa_{it} - E(\kappa_t)]^2}{N - 1}. \qquad (2.21)
Here \tilde{\kappa}_{it} is the normalized test score, \kappa_{it} is the original test score, and N is the
total number of observations.
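A minimal sketch of this normalization, using the N - 1 denominator of Equation (2.21). The SAT scores below are hypothetical.

```python
import math


def normalize(scores):
    """Subtract the sample mean and divide by the sample standard deviation
    (variance computed with an N - 1 denominator)."""
    n = len(scores)
    sample_mean = sum(scores) / n
    sample_var = sum((x - sample_mean) ** 2 for x in scores) / (n - 1)
    sample_sd = math.sqrt(sample_var)
    return [(x - sample_mean) / sample_sd for x in scores]


sat_scores = [900, 1000, 1100, 1200]  # hypothetical raw scores
normalized_sat = normalize(sat_scores)
print([round(z, 3) for z in normalized_sat])
```

After normalization each score has sample mean zero and sample variance one, so scores measured on different scales become comparable.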
Identification of wages in each period: In this model, one period is
defined as 2 years. In the data, I observe wages for 1 year (half a period). Therefore,
in estimation, I adjust the utility of working (Equation (2.1)) in the following way:
U^{W}(\cdot) = \ln(w_1(\cdot)) + d^{0.5} \ln(w_2(\cdot)), \qquad (2.22)
where w_1 is the wage that an individual receives in the first half period, w_2 is
the wage that an individual receives in the second half period, and d is the time
preference for a period. w_1 and w_2 are defined as in Equation (2.1):
\ln(w_k(\alpha_{iD_i}, s_{i1}, D_i, Expr_{itk})) = g(\alpha_{iD_i}, D_i) + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2 Expr_{itk} + \gamma_3 Expr_{itk}^2 + \varepsilon^{W}_{it}, \qquad (2.23)
for t = 2, \cdots, T,
where Expr_{itk}, k = 1, 2, is the years of experience of individual i in the first or second
half period. All other notation has the same meaning as in Equation (2.1).
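The half-period adjustment in Equation (2.22) can be illustrated numerically. The wage levels and the value of d below are hypothetical; the experience coefficients are the values the text takes from Belzil and Hansen (2002).

```python
import math

GAMMA2 = 0.0884   # linear experience coefficient, as reported in the text
GAMMA3 = -0.0029  # quadratic experience coefficient, as reported in the text


def experience_term(expr_years):
    """Experience part of the log-wage equation: gamma2 * Expr + gamma3 * Expr^2."""
    return GAMMA2 * expr_years + GAMMA3 * expr_years ** 2


def utility_of_working(w1, w2, d):
    """Utility of working over one period split into two halves (Equation (2.22)):
    the second half-period log wage is discounted by d ** 0.5."""
    return math.log(w1) + d ** 0.5 * math.log(w2)


# Hypothetical wages for the two half periods and a hypothetical d.
u_work = utility_of_working(w1=20000.0, w2=22000.0, d=0.9)
print(round(u_work, 4), round(experience_term(1), 4))
```

Using d ** 0.5 rather than d keeps the discounting consistent with a period that is twice as long as the interval over which wages are observed.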
2.6 ESTIMATION RESULTS
I present the estimates of the key structural parameters in subsection 2.6.1,
which is followed by a brief discussion of model fit in subsection 2.6.2.
2.6.1 Parameter estimates
Parameters in wage equation: In Table 2.10, ρ1 represents the intercept
term in the wage equation, which can be understood as the signalling effect of college
degrees. ρ2 represents the return to abilities.3
3The data used in this paper provide only 1 to 2 periods of wage observations, which is not enough to estimate
the curvature of wages (γ2 and γ3 in Equation (2.1)). The values of γ2 and γ3 are taken from the estimates
in Belzil and Hansen (2002). γ2 is taken to be 0.0884. γ3 is taken to be -0.0029.
The return to education differs significantly across types
of schools. From the estimates of the intercept terms for different schools (ρ1’s), we
can see that the return is much higher for universities. As expected, the intercept
term ρ1 is the lowest for community colleges. These estimation results show that the labor
market returns to students holding bachelor’s degrees are significantly higher than those
to students holding associate degrees.
From the estimates of the return to abilities (ρ2’s), it is not surprising to
see that the returns to abilities are the lowest from community colleges, and much
higher from universities, which implies that students with high abilities have higher
returns from universities than low-ability students. The estimation results explain the
situation in which high-ability students tend to attend universities, while low-ability
students tend to attend community colleges.
It is noted that although the return to education is similar for graduates from
public universities and those from private universities, the return (both the intercept
term ρ1 and the return to ability ρ2) is slightly higher for public universities, which
suggests that the enrollment in private universities is driven by factors other than
labor market returns, for instance, better living conditions, better meal plans, smaller
classes, etc.
γ1 captures the effect of upward transfer on future wages. The estimate
shows that this effect is not significantly different from zero, suggesting
that the labor market may not discriminate against transfer students, which
coincides with Kane and Rouse’s (1993) finding.
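The significance statement can be checked directly from Table 2.10. The sketch below assumes that the column labelled "Standard deviation" reports standard errors of the estimates and that a normal approximation with a 5% two-sided critical value of 1.96 is appropriate; neither assumption is stated in the text.

```python
def z_statistic(estimate, std_error):
    """Ratio of an estimate to its standard error."""
    return estimate / std_error


# gamma1 and its reported dispersion, as printed in Table 2.10.
gamma1_z = z_statistic(-0.005, 0.027)
gamma1_significant = abs(gamma1_z) > 1.96  # assumed 5% two-sided threshold
print(round(gamma1_z, 3), gamma1_significant)
```

The ratio is far inside the assumed critical value, which is consistent with the reading that transfer carries no detectable wage penalty.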
Recall that the student’s idiosyncratic taste for a school, ν_ij, is the part of
the preference that does not change over time, while the preference shock ε^S_ijt does
change with time. It is worth noticing that the variance of student idiosyncratic
tastes (σ_S^2 in Equation (2.7)) is small compared to the scale parameter of preference
shocks (scale parameter τ of the distribution of ε^S_ijt in Equation (2.7)). A comparison
of the variances reveals that the time-varying preference shock dominates the static
idiosyncratic taste. The result also highlights the importance of a time-varying
preference shock for a more complete characterization of the dynamics of school choices.
The one-period model of Fu (2010) is not able to capture such dynamics over time.
Parameters in the utility of attending school: θ is the estimate of
the constant intercept of monetary resources for students during college (Equation
(2.6)). The intercept corresponding to community colleges is the highest. There
are two possible explanations. The first is that the course schedules in community
colleges may be more flexible than in universities, and students are more likely to
take part-time jobs. The second is that most community college students choose to
live at home. Therefore, most of them spend a lot less on living expenses than do
university students.
The intercept θ is higher for private universities than for public universities,
which implies that families transfer more money to students if they attend private
universities. The estimates are intuitive because tuition in private universities is
much higher than in public ones. Some students may still have more financial resources
after paying the higher tuition in private universities, possibly because the
facilities are more expensive in private universities and parents who are not
financially constrained are willing to pay for such benefits.
The estimation results of parameter β show that parental income has a
significant effect on school choices, implying that a student will get more financial
support from her family if her family income is high. It is also not surprising to
see that the number of people supported by parents has a significant negative effect
on their monetary resources during college. The estimation results also reveal that
parental education has a positive effect on student financial resources, with the
mother’s education having the larger effect.
Parameters in ability learning process: m captures the mean of students’
prior distribution, and σ_α^2 is the variance of the prior distribution. Because all the
observed test scores (high school GPA, SAT scores, and college GPA) are normalized,
the variance of the prior distribution, σ_α^2 = 34.601, is very large. This indicates
that the prior distribution is not very informative when it comes to
inferring student abilities.
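To see why a large prior variance makes the prior uninformative, consider a textbook normal-normal update. This is a generic illustration and not necessarily the exact learning rule of the model, which is specified in an earlier section; the signal value and the signal noise variance of 1.0 are hypothetical, while the prior mean and variance are the public-school m and σ_α^2 from Table 2.10.

```python
def posterior_normal(prior_mean, prior_var, signal, signal_var):
    """Posterior mean and variance of a normal prior updated with one
    normally distributed signal (precision-weighted average)."""
    prior_precision = 1.0 / prior_var
    signal_precision = 1.0 / signal_var
    post_var = 1.0 / (prior_precision + signal_precision)
    post_mean = post_var * (prior_precision * prior_mean + signal_precision * signal)
    return post_mean, post_var


PRIOR_MEAN = 1.110   # m for public schools, Table 2.10
PRIOR_VAR = 34.601   # sigma_alpha^2, Table 2.10
SIGNAL_VAR = 1.0     # hypothetical noise variance of a normalized test score

post_mean, post_var = posterior_normal(PRIOR_MEAN, PRIOR_VAR, signal=0.5, signal_var=SIGNAL_VAR)
prior_weight = (1.0 / PRIOR_VAR) / (1.0 / PRIOR_VAR + 1.0 / SIGNAL_VAR)
print(round(post_mean, 4), round(post_var, 4), round(prior_weight, 4))
```

Under these assumptions the prior receives a weight of less than 3%, so the posterior belief is driven almost entirely by the observed signal.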
µ1 captures the linear relation between student abilities and their SAT
scores, and the linear relation between student abilities and high school GPAs. It
is as expected that the estimates of µ1 are all positive. The estimates imply that if
students have high SAT scores or high school GPAs, they should infer that they have
higher abilities in all schools.
σ_µ^2 is the variance of a student’s private information about her abilities. The
estimate of σ_µ^2 is a small number with a large standard deviation, suggesting that the
private information is insignificant. Therefore, the preference shocks and the wage
shocks, as opposed to students’ unobserved academic aptitude, provide the primary
explanation of the idiosyncratic school choices and the differences in the labor market
returns.
Parameters in the value function: Transfer cost is a large negative
number compared to the utility of attending school, which is between 8 and 12 for
one period. Recall from section 2.3.4 that transfer cost includes monetary and non-
monetary components. The monetary cost, which includes the application fee and
the moving cost, is too small to explain the transfer cost. Therefore, the estimate
suggests that the transfer cost is chiefly explained by the non-monetary part, includ-
ing unexpected loss of non-transferable credits and the time spent on college searches
and transfer applications.
2.6.2 Model fit
To examine model fit, I simulated 100 sets of error terms for each individual
and compared the predicted outcomes to the actual observed outcomes. I compared
the predicted enrollment rates with the actual rates for different schools in periods
1 and 2 in Tables 2.11 and 2.12, and the predicted transfer rates with the actual
transfer rates in Table 2.13.
In Table 2.11, we can see that the model fits the enrollment rates in period
1 reasonably well. The enrollment rate of community colleges is overestimated by
1%, while the enrollment rate of universities is underestimated by 1.1% for public
universities and overestimated by 0.1% for private universities.
Similarly, Table 2.12 shows the model fit for the enrollment rate in the
second period. The discrepancy between the predicted enrollment rates and the
actual ones is very small. The fraction of students who choose to work after the
first period is overestimated by 1.8%, while the enrollment rate of universities is
underestimated by 0.8% (1.0%) for public (private) universities, respectively.
The model fit in terms of transfer rate is a very good test of the model’s
predictability regarding the relationship between expected abilities and enrollment
decisions for students. For instance, predicted drop-out rates from universities are
driven entirely by changes in expected academic performance (ability). Table 2.13
compares the predicted and the actual transfer rates. Although the transfer rates
are small compared to the enrollment rates, the model still replicates them well. The
largest discrepancy between the predicted transfer rates and the actual ones is within
±3%.
In general, the model not only predicts the enrollment rates for different
schools in each period, but also the transfer rates of various types. These transfers
include “upward” transfers (from a community college to a university), “parallel”
transfers (from a university to a university), and drop-outs from universities. Given
the model’s simplicity and its predictive power, the estimated model fits the actual
data well.
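The fit statistics quoted in this subsection can be reproduced from the tables. The sketch below copies the observed and simulated rates from Tables 2.11, 2.12 and 2.13 and computes the gap (simulated minus observed) in percentage points; the variable and function names are illustrative only.

```python
# (data, simulated) rates in percent, copied from Tables 2.11, 2.12 and 2.13.
enrollment_period1 = {
    "community college": (27.3, 28.3),
    "public university": (38.0, 36.9),
    "private university": (34.7, 34.8),
}
enrollment_period2 = {
    "work": (36.6, 38.4),
    "public university": (35.4, 34.6),
    "private university": (28.0, 27.0),
}
# Keys are (origin, destination); cells marked "\" in Table 2.13 are omitted.
transfers = {
    ("community college", "work"): (18.5, 21.1),
    ("public university", "work"): (10.9, 8.8),
    ("private university", "work"): (7.2, 8.6),
    ("community college", "public university"): (6.4, 3.8),
    ("private university", "public university"): (3.5, 3.9),
    ("community college", "private university"): (2.4, 2.4),
    ("public university", "private university"): (1.5, 2.3),
}


def prediction_gaps(table):
    """Simulated minus observed rate in percentage points (positive = overestimate)."""
    return {key: round(sim - data, 1) for key, (data, sim) in table.items()}


gaps_p1 = prediction_gaps(enrollment_period1)
gaps_p2 = prediction_gaps(enrollment_period2)
gaps_transfer = prediction_gaps(transfers)
largest_transfer_gap = max(abs(gap) for gap in gaps_transfer.values())
print(gaps_p1, gaps_p2, largest_transfer_gap)
```

The largest transfer-rate gap is for community college students, both for those who leave for work and for those who move to public universities.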
2.7 POLICY SIMULATION
Using the estimated model, which fits the data reasonably well, I
conducted three counterfactual experiments to answer the following research questions:
What is the main barrier for students to attend the transfer program in community
colleges? How can we improve the efficiency of the transfer function of community
colleges?
2.7.1 Increase tuition fees in universities
Tuition fees at universities keep increasing. In this study, I examine
the extent to which a change of tuition can affect college choices. I increase the
tuition of both private and public universities by 20%, while keeping the tuition in
community colleges the same. This experiment is divided into two parts. The first
examines the effect of tuition increase on the enrollment and transfer choices for high
school graduates. The other examines the effect on the transfer rates for students
who are already enrolled in college.
The effect on college choices of high school graduates: Generally speaking,
university tuition increases raise the transfer rates from community colleges to
four-year schools but decrease graduation rates from universities. In Table 2.14,
we can see that 2.3%
more students choose to enroll in community colleges in period 1, and 2.8% fewer
students choose to enroll in private universities. As tuition in universities increases,
more students with low family income or low ability tend to start in community
colleges. For low-ability students, because they are not sure whether it is worth
getting a bachelor’s degree, it is relatively cheaper to learn about their abilities in a
community college. For students who have low family incomes, it is more affordable
to start at community colleges in the face of tuition increases.
In Table 2.15, the transfer rates from community colleges to universities are
considerably higher. The transfer rate from community college to a public university
increases from 3.8% to 5.6%, while the transfer rate from community colleges to
private universities increases from 2.4% to 3.4%. As expected, more students tend
to transfer to universities when it is more expensive to start in universities from the
beginning. The drop-out rates from universities are higher because students with
low GPA observations (negative shocks to expected abilities) are more likely to quit
when the cost of finishing at a university is higher.
From Table 2.16, we can see that 1.9% more students enter the labor mar-
ket after the first period, and 2.2% fewer students pursue their degrees in private
universities. Graduation rates from universities are considerably lower because high
tuition pushes more students to start in community colleges and discourages
community college students with poor GPAs from pursuing bachelor’s degrees.
The effect on transfer decisions of current college students: In
contrast to the effect of tuition increase on high school graduates, a tuition increase
has an opposite effect on the transfer rates for current college students. In Table
2.17, there are fewer transfers from community colleges to universities. At the same
time, there are more students who drop out from universities. As expected, the
increase of the dropout rate is higher in private universities, for the absolute amount
of a tuition increase is much higher for private universities. For community college
students, more of them tend to work with an associate degree instead of transferring
when facing higher costs.
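The claim that the absolute tuition increase is much larger for private universities follows from applying the same 20% rate to very different baselines. The sketch below uses the group averages exactly as printed in Table 2.3 (whose units are not stated) and assumes the experiment scales these group averages; the function name is illustrative.

```python
# Average tuition by group, as printed in Table 2.3.
baseline_tuition = {
    "community college": 4587.0,
    "public university": 3912.0,
    "private university": 17201.0,
}


def raise_university_tuition(tuition, rate=0.20):
    """Scale university tuition by (1 + rate); community college tuition is unchanged."""
    return {
        group: fee if group == "community college" else fee * (1.0 + rate)
        for group, fee in tuition.items()
    }


new_tuition = raise_university_tuition(baseline_tuition)
absolute_increase = {
    group: new_tuition[group] - baseline_tuition[group] for group in baseline_tuition
}
print(absolute_increase)
```

With these figures the increase for private universities is more than four times the increase for public universities.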
In general, from this experiment, we can see that there are two opposite
driving forces that affect transfer rates from community colleges to universities. One
pushes students to receive associate degrees instead of transferring because the cost of
universities is so high. The other pushes more students to start in community colleges
rather than starting in universities, which increases the transfer rates. Therefore,
the transfer rates from community colleges to universities move in an ambiguous
direction. For instance, if the cost in universities is higher than the labor market
return for all students (an extreme situation to consider), no student will pursue a
bachelor’s degree, and there will be a zero transfer rate.
The experiment shows the value of community colleges when tuition in
universities increases. As an alternative choice for students, the existence of community
colleges improves student welfare. There is a large literature on the decrease in
student welfare and graduation rates when tuition rises. If we
take community colleges into account, the change in student welfare and graduation
rates should not be as large as stated in that literature (Campbell and Siegel (1967),
Galper and Dunn (1969), and Leslie and Brinkman (1987)).
2.7.2 Improved academic preparedness
Chicago Mayor Rahm Emanuel proposed a plan for a longer school day. Under
his plan, most city high schools will extend their day to 7.5 hours. The goal is to
improve academic preparedness. In this simulation study, I examine how improved
academic preparedness would affect students’ college choices. I increase the students’
high school GPA and their SAT scores by 0.5 standard deviations, while keeping their
college GPA the same.
The simulation results show that transfer rates increase, which suggests that
improved academic preparedness improves the efficiency of the transfer function of
community colleges, while it also increases the graduation rates from universities.
From Table 2.18, there are 3.1% fewer students attending community col-
leges in period 1. With increased expected ability, more students are willing to
start in universities. In fact, one of the main reasons for students to transfer from
community colleges is to avoid the drop-out risk from universities when facing ability
uncertainties. With improved academic preparedness, students face a lower drop-out
risk at universities and are more willing to invest in them.
Indeed, as shown in Table 2.19, the transfer rate from community colleges
to universities increases from 6.2% to 8.3% because community college students
anticipate higher returns from universities as they infer higher expected abilities
from improved high school GPAs and SAT scores. Therefore, more students
choose to transfer.
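The combined rates quoted here are not printed in Table 2.19 itself; they are the sum of the community-college-to-public and community-college-to-private entries. A short check, with illustrative names:

```python
# Transfer rates out of community colleges in percent, copied from Table 2.19.
baseline_upward = {"public university": 3.8, "private university": 2.4}
improved_upward = {"public university": 5.0, "private university": 3.3}


def total_upward_transfer(rates):
    """Sum of transfer rates from community colleges to both university groups."""
    return round(sum(rates.values()), 1)


baseline_total = total_upward_transfer(baseline_upward)
improved_total = total_upward_transfer(improved_upward)
print(baseline_total, improved_total)
```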
In Table 2.20, as expected, 2.3% fewer students choose to work after the
first period, while 3.2% more students choose to study in private universities, mostly
because improved academic preparedness encourages more students to attend
universities from the beginning and encourages more community college students to transfer.
This experiment shows that improving academic preparedness is beneficial
both for individuals and the government. First, the efficiency of the transfer function
of community college is improved, which decreases expenditures for postsecondary
education. Second, the average education level of the whole population rises by
increasing graduation rates from universities.
2.7.3 Decrease the transfer cost
In this study, I examine how students make college choices if the transfer
cost is reduced to half of its original value. The simulation shows that
decreasing transfer costs is a very effective way to improve the efficiency of the college
market.
In Table 2.21, we can see that the enrollment rate of community colleges
increases from 28.3% to 40.8%. At the same time, 8.9% fewer students
choose to enroll in public universities in period 1, and 3.7% fewer students choose to
enroll in private universities. The reason is that if the transfer cost is low, community
colleges are more attractive both to financially constrained students and to low-ability students.
In Table 2.22, the transfer rates from community colleges to public universities
are nearly 4 times the rates in the baseline model (14.15% versus 3.8%). There are
two causes for the high transfer rates. The first is that more students with low ability
tend to enter community college to learn their ability when the transfer cost is low.
The second is that the proportion of planned transfers also increases without high
transfer cost barriers.
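In Table 2.23, the simulation results show that the share of students who work
in period 2 falls from 38.4% to 31.1%. Correspondingly, the drop-out rates from
universities (the sum of the public and private university entries in the “Work” rows
of Table 2.22) decrease to 12.9% from 17.4%. The reason is that it is less costly for
students to transfer to other universities instead of dropping out when they observe bad
matches between their abilities (relatively low GPA) and their current universities.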
This experiment suggests that reducing the transfer cost is a very effective way
to improve the efficiency of the transfer program in community colleges and to increase
the university completion rate. The transfer cost could be reduced in various
ways. For example, community colleges could provide more information sessions to promote the
courses that are widely accepted by universities.
2.7.4 No transfer cost
In this study, I examine how students make college choices if there are
no transfer costs. The simulation shows that eliminating transfer costs is the
most effective way to improve the efficiency of the transfer program. With no transfer
costs, the education path changes completely, and graduation rates from universities
increase substantially.
In Table 2.24, we can see that more than half of the students choose to attend
community colleges in period 1, while the enrollment rate of universities decreases to
45.5% from 71.7%. The reason is that if there is no transfer cost, community colleges
are perfect substitutes for universities in the first period. Students without strong
preferences for universities will attend community colleges for financial reasons.
In Table 2.25, the transfer rates from community colleges to universities are
more than 6 times higher than the rates in the baseline model. The transfer rate
from community colleges to public universities increases from 3.8% to 26.2%, while
the transfer rate from community colleges to private universities increases from 2.4%
to 17.5%. There are two causes for the high transfer rates. The first is that there
are more planned transfers due to the zero transfer cost. The second is that the
proportion of unplanned transfers also increases without high transfer cost barriers.
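The multiples quoted for the transfer-cost experiments can be verified from the tables. The sketch below copies the community-college transfer rates from Tables 2.22 and 2.25 and divides them by the baseline rates; the names are illustrative.

```python
# Transfer rates from community colleges in percent (baseline model and the
# two transfer-cost experiments), copied from Tables 2.22 and 2.25.
baseline = {"public university": 3.8, "private university": 2.4}
half_cost = {"public university": 14.15, "private university": 9.5}
zero_cost = {"public university": 26.2, "private university": 17.5}


def ratio_to_baseline(experiment, baseline):
    """Experiment transfer rate divided by the baseline rate, per destination."""
    return {dest: experiment[dest] / baseline[dest] for dest in baseline}


half_ratios = ratio_to_baseline(half_cost, baseline)
zero_ratios = ratio_to_baseline(zero_cost, baseline)
print({k: round(v, 2) for k, v in half_ratios.items()})
print({k: round(v, 2) for k, v in zero_ratios.items()})
```

With zero transfer cost both ratios exceed 6, while halving the cost gives ratios just under 4.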
In Table 2.26, the simulation results show that the percentage of students
who enter the labor market after the first period drops from 38.4% to 18.9% as a
result of two driving forces. The first is that the drop-out rates from universities
decrease. The underlying reason is that students can choose to transfer to other uni-
versities instead of dropping out when observing bad matches between their abilities
(relatively low GPA) and their current colleges. The second is that a zero transfer
cost encourages more students to transfer to universities from community colleges.
Therefore, the enrollment rate of public universities in period 2 increases to 48.8%
because it is the cheapest way to obtain a bachelor’s degree, while the enrollment
rate of private universities also marks a significant increase.
This experiment suggests that eliminating the transfer cost is the most efficient
way to reduce both individual and government expenditures on postsecondary
education. It can be achieved if community colleges cooperate with universities. Such
cooperation is possible if community colleges provide freshman-level and sophomore-
level courses under the same syllabus as the courses provided by universities, and
universities accept credits from community college students without discrimination.
2.8 CONCLUSION
In this paper, I develop and estimate a two-period ability-learning structural
model to provide a more complete picture of the college market by including commu-
nity colleges as a viable pathway to bachelor’s degrees. In the model, students make
college decisions with different financial constraints and uncertain abilities. They
choose between community colleges and universities in period 1, and make transfer
decisions in period 2. I estimate the model using simulated maximum likelihood
estimation. The estimated model closely replicates most of the patterns in the data.
The results show that the labor market does not discriminate against transfer
students, because the effect of transfer on future income is not statistically different
from zero, which coincides with the finding of Kane and Rouse (1995). This suggests
that the only cost of transfer is the direct transfer cost, which is the main barrier to college
transfer. The estimation results also show that family income has a significant effect
on college choices, which provides evidence that students tend to start in community
colleges when facing financial constraints. Finally, the results support the idea that
the return to abilities is higher in universities than in community colleges.
Experiment 1 shows that the tuition increase in universities pushes more
students to community colleges, and also results in more dropouts from universities.
Experiment 2 shows that improved academic preparedness encourages more students
to start in universities, and also encourages more community college students to
transfer to four-year schools. At the same time, there are fewer dropouts from uni-
versities. Experiment 3 shows that with no transfer costs, the fraction of students
starting in community colleges almost doubles. The education pattern completely
changes. The transfer rate to universities is 6 times higher than in the baseline
model. Although the transfer rates from community colleges to universities increase
in all three experiments, transfer cost seems to be the main barrier to improving the
effectiveness of the transfer program.
Building on Fu (2010), Epple, Romano, and Sieg (2006), and this paper, one
extension is to consider jointly the strategies between colleges and students. Schools
may set different strategies to admit high school graduates and transfer students. A
dynamic general equilibrium model that takes into account both sides of the college
admission market would give a more complete picture of the decision making process
and the underlying driving forces. Consequently, the new model may yield different
outcomes and consequences of the examined policies.
Another extension is to modify the model by allowing for heterogeneous risk
aversion levels. The extension can be achieved by employing the constant relative risk
aversion utility, and allowing the risk aversion coefficient to be different for different
individuals. The extension can help us to understand a diversity of college choices and
different college preferences from another perspective. As a result, heterogeneous
risk-aversion levels may influence the estimation results and the consequences of the
examined policies.
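For reference, a common textbook form of constant relative risk aversion utility is u(c) = (c^(1 - ρ) - 1) / (1 - ρ), with u(c) = ln(c) when ρ = 1. The sketch below implements that form with an individual-specific coefficient; it only illustrates the proposed extension and is not part of the estimated model, and the consumption level and coefficients are hypothetical.

```python
import math


def crra_utility(consumption, rho):
    """CRRA utility (c**(1 - rho) - 1) / (1 - rho), with the log limit at rho = 1."""
    if consumption <= 0:
        raise ValueError("consumption must be positive")
    if math.isclose(rho, 1.0):
        return math.log(consumption)
    return (consumption ** (1.0 - rho) - 1.0) / (1.0 - rho)


# Hypothetical individuals who differ only in their risk aversion coefficient.
risk_aversion = {"student A": 0.5, "student B": 1.0, "student C": 2.0}
utilities = {name: crra_utility(2.0, rho) for name, rho in risk_aversion.items()}
print({name: round(u, 4) for name, u in utilities.items()})
```

A higher coefficient makes utility more concave, so a more risk-averse student values the same uncertain payoff less.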
Table 2.1: Percentage enrollment in period 1
Community college    Public universities    Private universities
27.3%                38.0%                  34.7%
Table 2.2: Percentage of transfer in period 2
To \ From             Community college    Public university    Private university
Work                  18.5%                10.9%                7.2%
Public university     6.4%                 \                    3.5%
Private university    2.4%                 1.5%                 \
Table 2.3: Average tuition fee
Community college    Public universities    Private universities
4587                 3912                   17201
Table 2.4: Average high school GPA and SAT score (normalized)
                   Community college    Public universities    Private universities
High school GPA    −0.587               0.08725                0.2018
SAT score          −0.702               0.01927                0.2150
Table 2.5: Average college GPA for transfer students
To \ From             Community college    Public university    Private university
Work                  −0.367               −0.588               −0.425
Public university     0.169                \                    −0.0692
Private university    0.571                −0.365               \
Table 2.6: Average family income
Community college    Public universities    Private universities
31541                49517                  61656
Table 2.7: Average family income for transfer students
To \ From             Community college    Public university    Private university
Work                  32034                43155                51656
Public university     30066                \                    62766
Private university    33349                50693                \
Table 2.10: Estimation results
Variable                                   Estimate    Standard deviation
ρ1
  Community college                        9.735       0.006
  Public schools                           9.904       0.014
  Private schools                          9.825       0.013
ρ2
  Community college                        -0.046      0.008
  Public schools                           0.066       0.004
  Private schools                          0.015       0.002
γ1                                         -0.005      0.027
µ_out                                      9.850       0.013
τ                                          3.176       0.019
θ
  Community college                        10.421      0.259
  Public schools                           8.834       0.063
  Private schools                          9.744       0.005
β
  Family income / 100,000                  0.693       0.000
  Number of people that parents support    -0.073      0.000
  Father’s education level                 0.049       0.000
  Mother’s education level                 0.124       0.000
σ_S^2                                      0.0005      0.003
m
  Community college                        2.833       0.258
  Public schools                           1.110       0.026
  Private schools                          3.563       0.197
µ1
  Community college                        0.009       0.000
  Public schools                           0.003       0.000
  Private schools                          0.013       0.000
µ0                                         1.872       0.039
σ_µ^2                                      0.003       0.040
σ_α^2                                      34.601      0.848
Transfer cost                              -7.224      0.003
Table 2.11: Enrollment rate in period 1 in model fit
                      Data     Simulated sample
Community college     27.3%    28.3%
Public university     38.0%    36.9%
Private university    34.7%    34.8%
Table 2.12: Enrollment rate in period 2 in model fit
                      Data     Simulated sample
Work                  36.6%    38.4%
Public university     35.4%    34.6%
Private university    28.0%    27.0%
Table 2.13: Transfer rate in period 2 in model fit
To \ From                                 Community college    Public university    Private university
Work                  Data                18.5%                10.9%                7.2%
                      Simulated sample    21.1%                8.8%                 8.6%
Public university     Data                6.4%                 \                    3.5%
                      Simulated sample    3.8%                 \                    3.9%
Private university    Data                2.4%                 1.5%                 \
                      Simulated sample    2.4%                 2.3%                 \
Table 2.14: Enrollment rate in period 1 in experiment study 1
                      Baseline model    New model
Community college     28.3%             30.6%
Public university     36.9%             37.5%
Private university    34.8%             32.0%
Table 2.15: Transfer rate in period 2 in experiment study 1
To \ From                               Community college    Public university    Private university
Work                  Baseline model    21.1%                8.8%                 8.6%
                      New model         21.6%                9.6%                 9.1%
Public university     Baseline model    3.8%                 \                    3.9%
                      New model         5.6%                 \                    3.3%
Private university    Baseline model    2.4%                 2.3%                 \
                      New model         3.4%                 1.7%                 \
Table 2.16: Enrollment rate in period 2 in experiment study 1
                      Baseline model    New model
Work                  38.4%             40.3%
Public university     34.6%             34.9%
Private university    27.0%             24.8%
Table 2.17: Transfer rate in period 2 for current students in experiment study 1
To \ From                               Community college    Public university    Private university
Work                  Baseline model    21.1%                8.8%                 8.6%
                      New model         21.4%                9.1%                 9.3%
Public university     Baseline model    3.8%                 \                    3.9%
                      New model         3.7%                 \                    3.9%
Private university    Baseline model    2.4%                 2.3%                 \
                      New model         2.2%                 2.1%                 \
Table 2.18: Enrollment rate in period 1 in experiment study 2
                      Baseline model    New model
Community college     28.3%             25.2%
Public university     36.9%             35.1%
Private university    34.8%             39.8%
Table 2.19: Transfer rate in period 2 in experiment study 2
To \ From                               Community college    Public university    Private university
Work                  Baseline model    21.1%                8.8%                 8.6%
                      New model         16.8%                8.7%                 10.5%
Public university     Baseline model    3.8%                 \                    3.9%
                      New model         5.0%                 \                    4.1%
Private university    Baseline model    2.4%                 2.3%                 \
                      New model         3.3%                 1.7%                 \
Table 2.20: Enrollment rate in period 2 in experiment study 2
                      Baseline model    New model
Work                  38.4%             36.1%
Public university     34.6%             33.7%
Private university    27.0%             30.2%
Table 2.21: Enrollment rate in period 1 in experiment study 3
                      Baseline model    New model
Community college     28.3%             40.8%
Public university     36.9%             28.0%
Private university    34.8%             31.1%
Table 2.22: Transfer rate in period 2 in experiment study 3
To \ From Community college Public University Private University
Work Baseline model 21.1% 8.8% 8.6%
New model 17.17% 5.4% 7.5%
public Baseline model 3.8% \ 3.9%
university New model 14.15% \ 7.8%
Private Baseline model 2.4% 2.3% \
university New model 9.5% 3.8% \
Table 2.23: Enrollment rate in period 2 in experiment study 3
                    Baseline model  New model
Work                38.4%           31.1%
Public university   34.6%           39.8%
Private university  27.0%           29.1%
Table 2.24: Enrollment rate in period 1 in experiment study 4
                    Baseline model  New model
Community college   28.3%           54.5%
Public university   36.9%           18.3%
Private university  34.8%           27.2%
Table 2.25: Transfer rate in period 2 in experiment study 4
To \ From           Model           Community college  Public university  Private university
Work                Baseline model  21.1%              8.8%               8.6%
                    New model       10.9%              3.4%               4.6%
Public university   Baseline model  3.8%               \                  3.9%
                    New model       26.2%              \                  13.6%
Private university  Baseline model  2.4%               2.3%               \
                    New model       17.5%              5.9%               \
Table 2.26: Enrollment rate in period 2 in experiment study 4
                    Baseline model  New model
Work                38.4%           18.9%
Public university   34.6%           48.8%
Private university  27.0%           32.4%
Chapter 3
The Impact of Labor Migration
on Children’s Health: Evidence
from Rural China
3.1 INTRODUCTION
The changing economic climate in China has caused a dramatic increase
in ‘labor migration’. Labor migration is the migration of Chinese rural residents to
bigger cities where higher-paying, temporary jobs are available. In 2009, the floating
population in China reached 211 million adults, leaving over 58 million children
behind in homes far from their parents.
Parents’ utility depends on household consumption and on their children’s health and
education. The main reason for labor migration is to improve the household’s financial
situation. With higher earnings, migrant parents can increase household consumption
and afford better education and better health insurance for their children. Conventional
wisdom suggests that these left-behind children are at risk of developing health
problems and physical and psycho-social stress¹ as a result of a lack of parental guidance
and relevant health information. These issues raise concerns for social workers
and policy makers. Nevertheless, despite the fact that migrated parents are spending
less time with their children, these parents are able to provide better remittances,
nutrition and health relevant information as a result of their increased income and the
knowledge they obtain through their migration experiences. Little is known about
the extent to which the health of left-behind children is affected in China, particularly
those children who are too young to take care of themselves.
This paper aims to establish the overall consequences of parental migration
on the health outcomes and childcare of their left-behind children. The data used in
the analysis are primarily derived from four waves of the China Health and Nutrition
Survey (CHNS), collected in 2000, 2004, 2006 and 2009. The CHNS was designed to
examine the effects of Chinese health, nutrition, and family planning policies. The
people of nine provinces that vary substantially in geography, economic development,
and access to public resources were surveyed.
Some of the economic literature that focuses on labor migration in China
suggests that the remittances forwarded to families by migrated members benefit the
households financially. For instance, Du et al. (2006) and de Brauw and Giles (2008)
found that labor migration increases families’ consumption levels. Giles (2006) also
found that having migrated family members could improve the family’s risk-coping
ability. On the other hand, there are also papers that focus on the left-behind family
members, particularly school-age children. Chen et al. (2009) found that educational
outcomes of children improved in migrant households. However, de Brauw and Mu
(2011) found that the nutrition of some school-age children from migrant households
was negatively affected.
¹ Currently, schools in rural China do not have adequate systems or a relevant curriculum
in place to address these issues.
A few papers study the health outcomes of left-behind children in China.
Mu and de Brauw (2011) examined the weight of left-behind children and found
that older children (7-12 years) in migrant households were more likely to
be underweight than those in non-migrant households.
Shu Zhang (2012) used survey data from the 2000 wave of the CHNS to study
the impact of labor migration on children’s health. She found no significant health
outcome effects for children whose fathers had migrated. Both papers, however, do
not consider the potential endogeneity of parents’ migration and children’s health.
Therefore their results might be biased.
The main methodological obstacle to quantifying the effect of parents’ migration
is the endogeneity problem. One manifestation of this problem is reverse
causation. Instead of being affected by parents’ migration status, children’s health
status could be a critical factor for parents when making migration decisions. For
example, parents whose children are in poor health may have to stay home to take
care of their children. On the other hand, they may have stronger financial incen-
tive to migrate to earn extra money to finance better medical care for their sick
children. Moreover, parents’ migration decisions could be correlated with children’s
health through unobserved variables, such as genetically inherited health deficiencies:
parents in poor health may be unable to migrate to urban areas for work, and their
children may inherit poor health. Therefore, a significant correlation between parents’
migration and children’s health status may not indicate causality.
To address the endogeneity problem, we use instrumental variables (IV) estimation.
Specifically, we instrument the father’s migration status with the historical
county level male migration rate, the mother’s migration status with the
historical county level female migration rate, and
household migration status with historical county level household migration rate.
The historical county level migration rate is calculated as the average local migration
rate from the previous survey year. The historical county level migration rate by
gender is a suitable indicator to reflect the local culture and network of migration,
where the network refers to a person’s exposure to migration information from her
migrated friends or family members. Intuitively, people living in the areas with a
tradition of migration or with a better migration network are more likely to migrate.
The first stage regressions in this paper show that this set of instruments
has strong predictive power for parents’ migration status. One might be concerned
that these instruments could influence child health directly, since county level migration
rates are also correlated with the local average income level. To address this, we
included the county level average income as an explanatory variable.
In this paper, we exploited the panel structure of the data and employed a
fixed effects model to study the overall health status of left-behind children. The
causal effect of migration is identified by two-stage estimation. The estimation
results are presented with and without the IV correction. Moreover, we conducted
two robustness checks to support our estimation results. In the first robustness check,
we excluded household income as an explanatory variable, as household income might
be correlated with unobserved shocks that could also affect children’s health, which
could bias the estimates. In the second robustness check, we excluded the number
of elders in the household. Because elders can take care of children, their presence
could affect parents’ migration decisions, so including this variable might bias the
estimation results.
Generally speaking, we found there were few significant effects of parents’
migration on child outcomes. A possible explanation for this is that the coefficients
capture the net effects of parents’ migration. Children with migrated parents receive
less physical care, but may receive more financial support, access to better nutrition
products sent from their mothers, and better nutritional information. There are
both positive and negative effects on children’s health. The coefficients imply that
the positive effects of parents’ migration are about the same as the negative effects
on children’s health. Though the regression results on the whole sample were not
significant, the regression results from subsamples provided more insights. They showed
that children aged between 5 and 10 were positively affected by fathers’ migration,
possibly because these children received higher remittances and better access to
nutrition information and products.
Our paper contributes to the literature in a number of ways. Firstly, we used
novel instrumental variables, which are able to predict the migration propensity of
parents, to deal with the endogenous nature of parents’ migration decisions. Secondly,
we studied the causal effects of fathers’ and mothers’ migration status on
children’s health outcomes separately, and found them to be significantly different.
Thirdly, in addition
to traditional measurements of child health that focus on height and weight, we also
considered nutrient intake (consumption of calories and protein), immunization shots
and childcare. These measures provided a more comprehensive picture of the impact
of labor migration on children’s health.
The paper proceeds as follows: Section 2 discusses the history of labor mi-
gration and child nutrition in China; Section 3 describes the conceptual framework;
Section 4 discusses the data; Section 5 describes the empirical specification; Section
6 presents the main results regarding the effect of parent migration on the physi-
cal health of children; Section 7 goes through several robustness checks; Section 8
discusses the results from subsamples; and a conclusion is provided in Section 9.
3.2 BACKGROUND
According to the analysis report of labor migration in China by the National
Bureau of Statistics of China (2012), the total number of migrant workers from rural
areas increased from 225 million in 2008 to 252 million in 2011. The rapid growth of
rural-to-urban migration has been an important demographic trend in China. In this
section, we first introduce the background of labor migration in China and its impact
on rural communities, followed by a discussion of how the health of rural children
has changed over time.
3.2.1 Labor Migration and Children Left Behind in Rural China
Since 1958, under its centrally planned economy, China has used
the household registration system (HuKou system) to control labor migration
from rural to urban areas. Under the HuKou system, households are classified as either
Agriculture HuKou or non-Agriculture HuKou, and rural-urban migration
was strictly restricted. In the 1990s, 83% of households were classified under the
Agriculture HuKou category, according to Mallee (1995).
In 1988, the HuKou reform took place, whereby rural migrants were allowed
to obtain temporary residence. However, according to the World Bank (2009), rural
migrants were not able to access the urban welfare system, including education, health
and the social safety net. Therefore, rural migrants maintained close ties to their
hometown villages, as their benefits were linked to their household registration status.
According to Bao et al. (2009), the large income gap between urban and ru-
ral areas, created by decades of urban-rural segregation and uneven economic growth,
provided strong incentives for rural people to move to urban areas, especially after
rural-urban labor flow was officially permitted. As a result, China has experienced
dramatic changes in its labor market since the 1990s. Liang and Ma (2004) found
that the migrant population grew from 20 million in 1990 to 45 million in 1995 and
to 79 million in 2000, using one percent samples from the 1990 and 2000 waves
of the Population Census and a one percent sample from the 1995 population
survey.
It is important to note the different migration rates by gender, as a mother’s
migration may have a different impact on child health than a father’s migration. According
to Zhao (1999) and Rozelle et al. (1999), there were substantially more migrated
men than women in the mid-1990s. Mu and van de Walle (2011) showed that the
gender gap in migration has increased over time. Our findings using CHNS data
support this.
3.2.2 Health of Children in China
The health of children in China has improved with economic growth. Shen
et al. (1996) showed that the average height of children aged two to five years had
increased by 3.8 cm in 1990 when compared with data from 1975. Chen (2000) found that
the prevalence of underweight children and the rate of stunting (the percentage of
children with height-for-age Z-scores below -2) among Chinese children declined
from 1990 to 1995. Svedberg (2006) found that the stunting rate had decreased
further by 2002. Additionally, Osberg et al. (2009) showed that height-for-age Z-score
in children increased between 1991 and 2000. The changes in children’s health might
be explained by the improvement of the diet quality in China, which is supported by
Du et al. (2004). They showed that the nutritional intake of children shifted from
carbohydrates to high fat and high energy-density foods.
Although the health of children in China has improved on average, malnutrition
is still an issue. According to Mu and de Brauw (2011), the stunting rate in
2002 was still nearly 15%, indicating that a substantial portion of the population remains
malnourished. There are also other challenges in improving nutrition among chil-
dren. Liu et al. (2012) analyzed urban-rural disparities of China’s child health and
nutritional status using CHNS data from 1989 to 2006, and showed that on average,
urban children have 0.29 higher height-for-age z-scores and 0.19 greater weight-for-
age z-scores than rural children.
3.3 CONCEPTUAL FRAMEWORK
There are at least three main channels through which migration might affect
the health status of children: the income effect, the time effect, and the information
effect.
First of all, the primary reason for a member of a household to migrate is
to increase household income. We anticipate the increased family income will have
a positive effect on child health outcomes for various reasons. For example, extra
income could increase diet quality (Du et al., 2004), by switching from high carbo-
hydrate food to high fat and high energy-density foods. Therefore, the calorie intake
may increase when income increases. Moreover diet improvements might improve
height-for-age Z-score and weight-for-age Z-score. Finally, health service utilization
for children may increase as well. For example, migrant parents may be able to afford
to have their children immunized as a result of increased income.
The second channel through which migration may affect the health status
of children is through the time allocated to childcare. Mu and van de Walle (2011)
found that when one family member leaves for urban work, the remaining family
members must take on an increased farm work load. As a result, they may spend
less time cooking and child rearing. Consequently, child health outcomes may be
affected. In cases where both parents have migrated, children might be left in the
care of relatives, usually their elderly grandparents. In such cases, children might
not have a regular diet routine and may eat poorly. As a result, the child’s nutrient
intake, and subsequently, their height and weight, may be affected.
The third channel is through better access to nutritional information from
migrated parents. Migrants typically move to urban areas that have better economic
conditions and health services. Therefore, migrants should have better access to
nutritional information. For example, migrants may learn more about healthy diets,
and encourage their children to eat more nutritious foods. Moreover, they may learn
more about the importance of immunization, and have the incentive to let their
children get immunized.
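As explained above, the income and information effects are expected to improve child
health while the time effect is expected to harm it, so the direction of the net effect
of parent migration on child health outcomes is ambiguous. In the next section we
present the data and empirical framework.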
3.4 DATA
The China Health and Nutrition Survey (CHNS) was designed to examine
the effects of the health, nutrition, and family planning policies and programs imple-
mented by national and local governments and to check how the social and economic
transformation of Chinese society is affecting the health and nutritional status of
the Chinese population. The Survey covered nine provinces that vary substantially
in geography, economic development, and access to public resources. Demographic
characteristics, household assets and other information were also collected as part
of the survey. The first round of the CHNS, including household, community, and
health/family planning facility data, was collected in 1989. Seven additional panels
were collected in 1991, 1993, 1997, 2000, 2004, 2006 and 2009.
From 1997 onwards, families were asked to provide reasons for migrated or
absent family members as part of CHNS. A migrant was defined as any individual
who had left the home at the time of the survey to seek employment. The data used
in the analysis were primarily derived from four waves of the CHNS, collected in
2000, 2004, 2006 and 2009. We did not use data from the 1997 wave
of the survey because we used the historical migration rate from the previous wave
as an instrumental variable, and this information was not available for the 1997 wave.
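The way the 1997 wave drops out can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for the estimates: the record layout `(county, wave, migrated)` and both function names are hypothetical, the records are made up, and the sketch ignores the split by gender.

```python
from collections import defaultdict

# CHNS waves that record migration status (1997 onwards, per the text).
WAVES = [1997, 2000, 2004, 2006, 2009]

def county_migration_rates(records):
    """Share of sampled individuals who are migrants, by (county, wave).

    records: iterable of (county, wave, migrated) with migrated in {0, 1}.
    """
    totals = defaultdict(int)
    migrants = defaultdict(int)
    for county, wave, migrated in records:
        totals[(county, wave)] += 1
        migrants[(county, wave)] += migrated
    return {key: migrants[key] / totals[key] for key in totals}

def historical_rate(rates, county, wave):
    """Instrument for `wave`: the county rate from the previous wave, if any."""
    position = WAVES.index(wave)
    if position == 0:
        return None  # 1997 has no earlier wave with migration data
    return rates.get((county, WAVES[position - 1]))

# Hypothetical records, for illustration only (not CHNS data).
records = [
    ("county_1", 1997, 0), ("county_1", 1997, 1),
    ("county_1", 1997, 0), ("county_1", 1997, 0),
    ("county_1", 2000, 1), ("county_1", 2000, 1),
]
rates = county_migration_rates(records)
instrument_2000 = historical_rate(rates, "county_1", 2000)  # uses the 1997 rate
instrument_1997 = historical_rate(rates, "county_1", 1997)  # unavailable
```

Observations from 2000 onwards receive the previous wave's county rate, while 1997 observations have no instrument and cannot enter the IV estimation.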
In the 1997 wave of the CHNS, 15,917 individuals were surveyed.
Survey response rates and attrition are difficult to determine for two reasons: firstly,
the participants who had migrated in one survey year may have returned home in a
later year; and secondly, new participants were recruited following the 1997 survey to
replenish samples if a community had fewer than 20 households, or if participants had
formed a new household or separated from their family into a new housing unit in the
same community. If the response rate is calculated as the share of participants in
previous survey rounds who remain in the current survey, it is around 88% at the
individual level and 90% at the household level (Popkin et al. 2010). Mu
and de Brauw (2011) showed that the attrition was random and should not generate
panel attrition bias.
For estimation purposes, we dropped observations where one or more of
the following critical pieces of information were missing: (a) child’s height,
(b) child’s weight, (c) child’s calorie intake, (d) child’s protein intake, (e) parents’
education level, and (f) parents’ migration status. To calculate the height-for-age
Z-score (HAZ) and weight-for-age Z-score (WAZ), we used the most recent growth
charts made available by the World Health Organization (WHO). To measure child’s
calorie and protein intake, we used a set of age and gender-specific Recommended Di-
etary Allowances (RDAs) sanctioned by the Chinese Nutrition Society (2000). RDAs
are based on average energy allowances, i.e. calorie intake for each specific age and
gender group.
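As a rough sketch of how these measures are formed, the snippet below standardizes a measurement against a reference distribution and divides an intake by its RDA. It assumes the simple form z = (value - reference median) / reference SD, which may differ from the exact formula behind the WHO charts. All reference numbers are placeholders rather than WHO or Chinese Nutrition Society values, and the function names are hypothetical.

```python
def z_score(value, ref_median, ref_sd):
    """Standardized distance of a measurement from the reference median."""
    return (value - ref_median) / ref_sd

def rda_ratio(intake, rda):
    """Daily intake as a share of the recommended dietary allowance."""
    return intake / rda

# Placeholder reference values for one age and gender group (illustrative only).
haz = z_score(value=98.0, ref_median=110.0, ref_sd=5.0)   # height in cm
is_stunted = haz < -2                                     # conventional stunting cutoff
calorie_ratio = rda_ratio(intake=1200.0, rda=1600.0)      # kcal per day
```

A ratio below 1 means the child consumes less than the recommended amount, and a negative Z-score means the child is below the reference median.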
We randomly selected one child from families with several children to avoid
biases arising from correlated outcomes among siblings and from other unobserved
household-level variables. In this paper, we focus
on children under ten years of age, because they are at greater risk of developing
problems associated with malnutrition and are more likely to respond to nutritional
interventions (WHO, 1995). We excluded households in which the children were older
than ten. After these data cleaning steps, the final sample
is an unbalanced panel containing 1,600 children and 2,201 observations.
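The selection step can be sketched as below. This is an illustration under assumptions, not the procedure actually run: the roster layout `(household_id, child_id, age)` and the function name are hypothetical, the age filter is applied child by child, and treating a child of exactly ten as excluded is an assumption.

```python
import random
from collections import defaultdict

def select_one_child_per_household(children, max_age=10, seed=0):
    """Keep children younger than max_age, then draw one per household.

    children: iterable of (household_id, child_id, age).
    Returns {household_id: child_id}.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    eligible = defaultdict(list)
    for household_id, child_id, age in children:
        if age < max_age:
            eligible[household_id].append(child_id)
    # Iterate in sorted order so the result does not depend on input order
    # of households.
    return {hh: rng.choice(ids) for hh, ids in sorted(eligible.items())}

# Hypothetical roster (not CHNS data).
roster = [
    ("hh1", "c1", 4), ("hh1", "c2", 8), ("hh1", "c3", 12),
    ("hh2", "c4", 6),
    ("hh3", "c5", 11),
]
sample = select_one_child_per_household(roster)
```

Households with no eligible child, such as `hh3` here, simply do not appear in the result.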
There are several reasons why only 40% of the children had more than one
observation in the data. The first is that we only kept an observation when we had
both the child’s data and their parents’ data. For instance, if the mother or father did
not respond to the survey, the child’s response was excluded as it could not be used.
As the individual response rate is 88%, the probability that the child is included in
the next survey year is calculated by multiplying the child’s response rate by their
parents’ response rates, which equates to 0.68 (0.88^3). The second is that the individual
response rate is not 88% in every survey year: it is 83% in 2000 and 80% in
2004 (Popkin et al. 2010). The third is that there are missing values. For
instance, the response rate for the question on migration status is less than 80%.
After the exclusion of children who are older than 10, there is approximately a 40%
probability that a child is included in more than one wave of the survey.
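The multiplication behind the 0.68 figure can be checked directly. The sketch assumes, as the calculation in the text implicitly does, that the child and the two parents respond independently and at the same rate. The function name is hypothetical.

```python
def joint_response_probability(individual_rate, respondents=3):
    """Probability that the child and both parents all respond,
    assuming independent responses at a common individual rate."""
    return individual_rate ** respondents

# Individual response rates quoted in the text (Popkin et al. 2010).
for label, rate in [("overall", 0.88), ("2000", 0.83), ("2004", 0.80)]:
    print(label, round(joint_response_probability(rate), 2))
```

With the lower wave-specific rates, the joint probability falls well below 0.68, which is consistent with the second reason given above.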
Table (3.1) shows that the migration rate kept increasing and reached
a peak in 2006. The table also shows that fathers were more likely than mothers
to migrate. In relatively few families had both parents migrated,
implying that most families had one parent left in the household to take care of the
children. From the data, it is clear that labor migration became quite common in
rural areas after 2000. In 2006, 21% of children had at least one parent
who had migrated, and both parents of 4% of the children sampled had migrated.
However these figures likely underreport the true scale of migration because we did
not account for migration that took place over shorter periods of time (Cai et al.,
2008).
Table (3.2a) compares health outcomes and care between children with and
without migrated parents. Children are defined as left-behind if at least one of
their parents was a migrant. According to the table, the left-behind
children on average consumed less protein than children who lived with both of their
parents. At the same time, left-behind children were shorter and weighed less on
average than children who lived with both of their parents. Table (3.2b) shows
differences in health outcomes and care between children with and without migrated
fathers. Comparing Table (3.2a) and Table (3.2b), there are fewer significant
differences in child health outcomes and care between families with migrant fathers
and families with non-migrant fathers. Children with migrant fathers have significantly
smaller weight-for-age Z-score and protein/RDA. Table (3.2c) shows the
differences between children with and without migrated mothers. Unlike children
with migrant fathers, children with migrant mothers consumed significantly less protein
and calories. Although the rates of migration were smaller for mothers, mothers’
migration seemed to be more strongly associated with child outcomes than father
migration or household migration.
We can also see that for both migrant and non-migrant households, the
average height-for-age Z-score and weight-for-age Z-score were less than 0. The
Z-scores show that children in China are on average shorter and lighter
than the WHO standards. The WHO standards were formulated in the 1970s
by combining growth data from two distinct data sets in the United States. The summary
statistics show that children in China have relatively poor health conditions compared
to children in the United States, while left-behind Chinese children are even more disadvantaged
compared to Chinese children who live with both parents. Moreover the average
Calories/RDA and Protein/RDA ratios are under 1 for both migrant and non-migrant
households, which implies that children in China on average consume less protein and
calories than recommended.
Table (3.3) shows the summary statistics of the control variables. House-
hold income is lower in households with migrants. The difference in income could
be explained by the fact that the migrated household members’ income is not in-
cluded in household income, although the remittances provided by the migrant are
included. The table also shows that migrated parents have lower education levels and
are younger. This pattern could be a result of local economic conditions, as people
who live in areas with better economic conditions are less likely to migrate. They also
tend to receive more education and have children later in life. For similar reasons,
county level average height and weight are lower for migrant households because they
are proxies for local economic development. Moreover, the number of females
over 60 in the household is higher in households with migrants, which suggests
that the number of elders in the household influences families’ migration decisions.
In general, people who migrate are more likely to live in big families, and poor ar-
eas. At the same time, they are more likely to have lower education levels and have
children at younger ages. The historical county level migration rates will be used as
instrumental variables and will be discussed later.
3.5 EMPIRICAL SPECIFICATION
In this paper, we adopt three sets of measures of health status. The first
includes the child’s weight-for-age Z-score (WAZ) and height-for-age Z-score (HAZ). The
second set includes child’s daily calorie intake, child’s daily calorie intake/RDA, child’s
daily protein intake, and child’s daily protein intake/RDA. The third set includes
the number of immunization shots that the child received in the survey year, and
whether the child has been cared for by non-household members.
We aimed to identify the causal effect of parents’ migration status
on children’s health outcomes. In addition to parents’ migration status, child health
is also affected by other demographic factors, such as gender, parents’ education
level, family size, the number of siblings, and household income. These were used as
control variables in the estimation model.
With panel data, two models could be applied: the fixed effects model or the
random effects model. The Hausman test showed that the random effects model is
inconsistent, so the fixed effects model is employed in this paper. The panel is
unbalanced. There are 480 children with more than one observation in this data set,
which is the effective sample. Within the effective sample, there are 112 parents who
changed their migration status. These changes in migration status across survey years
identify the impact of migration on children’s health.
We employed three separate fixed effects models to identify the effects of
household migration, fathers’ migration and mothers’ migration on child health
outcomes and care. The fixed effects model that we employed to identify the effect of
household migration is
\[ H_{it} = \alpha_i + \beta_1 M_{it} + \beta_2 X_{it} + \varepsilon_{it} \tag{3.1} \]
where $H_{it}$ is child $i$’s health outcome at time $t$, $\alpha_i$ is the child fixed effect,
and $M_{it}$ is child $i$’s household migration status at time $t$. This dummy variable
equals 1 if either or both of the child’s parents had migrated out at time $t$, and 0
otherwise. $X_{it}$ is a vector of demographic variables
including gender dummy (female as 1), parents education level, household income,
the number of males aged over 60 in the household, the number of females aged over
60 in the household, the number of boys under age ten in the household, the number
of girls under age ten in the household, the county level average height, the county
level average weight, the county level average daily calorie consumption/RDA, the
county level average protein consumption/RDA, and the county level average income.
Here $\varepsilon_{it}$ is the error term for individual $i$ at time $t$.
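To make the identification in Equation (3.1) concrete, the sketch below computes the within (fixed effects) estimate of the migration coefficient for a toy panel. It is a simplification under stated assumptions: the control vector is omitted so that only the migration dummy remains, the data are made up, and the function name is hypothetical.

```python
from collections import defaultdict

def within_estimator(rows):
    """Fixed effects (within) slope for a single regressor.

    rows: iterable of (child_id, m, h), where m is the migration dummy
    and h is the health outcome. Returns the estimate of beta_1.
    """
    groups = defaultdict(list)
    for child_id, m, h in rows:
        groups[child_id].append((m, h))

    numerator = 0.0
    denominator = 0.0
    for observations in groups.values():
        # Demeaning within each child removes the child fixed effect.
        m_bar = sum(m for m, _ in observations) / len(observations)
        h_bar = sum(h for _, h in observations) / len(observations)
        for m, h in observations:
            numerator += (m - m_bar) * (h - h_bar)
            denominator += (m - m_bar) ** 2
    if denominator == 0:
        raise ValueError("no within-child variation in the migration dummy")
    return numerator / denominator

# Hypothetical panel: (child_id, migration dummy, HAZ). Not CHNS data.
panel = [
    ("a", 0, -1.0), ("a", 1, -0.5),   # parent changes migration status
    ("b", 0, -2.0), ("b", 1, -1.5),   # parent changes migration status
    ("c", 1, 0.3), ("c", 1, 0.1),     # status never changes
    ("d", 0, -0.8),                   # observed only once
]
beta_1 = within_estimator(panel)
```

Children observed once, and children whose migration status never changes, contribute nothing to the estimate after demeaning. This is why the children with repeated observations and the parents who changed status carry the identification.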
The fixed effects models that we used to identify the effects of fathers’ and
mothers’ migration on child health are similar to Equation (3.1). The only difference
is the dummy variable $M_{it}$. To capture the effect of fathers’ migration, $M_{it}$
is redefined to equal 1 if the child’s father has migrated out at time
$t$, and 0 otherwise. To capture the effect of mothers’ migration, $M_{it}$
is redefined to equal 1 if the child’s mother has migrated out at time $t$, and 0
otherwise.
We did not include the number of working age males/females in the household
as explanatory variables for two reasons. The first is that we have already
controlled for household income and parents’ migration status. The second is that
preliminary results show that the number of working age males/females in the household
does not have a significant effect on children’s health outcomes. In the model,
we use the number of boys/girls instead of the number of siblings because many chil-
dren come from large families in rural China and often live with their cousins and
their siblings. Therefore the total number of children in the household could impact
the child’s health.
Household income is used as a control variable instead of individual income.
The reason is that there are too many missing values for individual income, especially
for migrants. The remittances are included in household income but we cannot break
them out, as the survey did not ask about the amount of remittances. We also tried
including additional variables that measure households’ assets as explanatory variables,
but their coefficients were not significant, so we kept only household income in the
regression.
A large literature discusses the biases that may arise from the endogenous
nature of labor migration. In our CHNS sample, endogeneity mainly arose
because a child’s health status also affects parents’ migration decisions. The common
methodology adopted to correct such biases is the instrumental variable
approach, which isolates exogenous variation in parents’ migration status. We adopted an
IV approach and used historical county level migration rates as instruments. The
historical county level migration rate is calculated as the local migration rate from
the previous survey year. The historical migration rate could proxy for the migration
network. The difference between the average male migration rate and female migration
rate could also be a proxy for local culture.
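The logic of the IV correction can be illustrated with a two-stage least squares sketch for one endogenous regressor and one instrument. It is a stylized version under stated assumptions: it leaves out the fixed effects and control variables of the actual specification, the data are invented, and the function names are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)

def covariance(xs, ys):
    x_bar, y_bar = mean(xs), mean(ys)
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / len(xs)

def ols_slope_intercept(x, y):
    """Slope and intercept from a simple regression of y on x."""
    slope = covariance(x, y) / covariance(x, x)
    return slope, mean(y) - slope * mean(x)

def two_stage_least_squares(z, m, h):
    """IV estimate of the effect of m on h using instrument z."""
    # First stage: predict migration status from the historical county rate.
    pi_1, pi_0 = ols_slope_intercept(z, m)
    m_hat = [pi_0 + pi_1 * z_i for z_i in z]
    # Second stage: regress the health outcome on predicted migration.
    beta_1, _ = ols_slope_intercept(m_hat, h)
    return beta_1

# Hypothetical data (not CHNS): lagged county migration rate, migration
# dummy, and a health outcome such as HAZ.
z = [0.1, 0.1, 0.3, 0.3, 0.2, 0.2]
m = [0, 0, 1, 1, 0, 1]
h = [-1.2, -1.0, -0.6, -0.5, -1.1, -0.7]
beta_iv = two_stage_least_squares(z, m, h)
```

With a single instrument, the two-stage estimate coincides with the ratio cov(z, h) / cov(z, m), so only the variation in migration that is predicted by the historical county rate is used.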
3.6 ESTIMATION RESULTS
3.6.1 Results of Ordinary Least Squares model
As a baseline, Table (3.4a) and Table (3.4b) present the effects of
the household migration status on child health outcomes and care from the ordinary
least squares regressions. Here, the household migration status dummy variable
equals one if either or both of the child’s parents have migrated. Table (3.5a) and Table
(3.5b) present the effects of the fathers’ migration status on child health outcomes
and care from the ordinary least squares (OLS) regressions. Table (3.6a) and Table
(3.6b) present the effects of the mothers’ migration status on child health outcomes
and care from the ordinary least squares regressions.
Though the OLS regression analysis may not be able to capture the exact
relationship between labor migration and children’s health, the results give us an
idea of the correlation between children’s health and the explanatory variables. It
shows that parental migration does not necessarily negatively correlate with children’s
health outcomes. Firstly, coefficients are similar for father’s migration and household
migration status because the majority of household migrations are fathers’ migration.
Father’s migration and household migration are positively correlated with children’s
height-for-age Z-score, and negatively correlated with the number of immunization
shots that children received. However, father’s migration and household migration
have no significant correlation with children’s nutrient intake. Secondly, compared
with father’s migration, a mother’s migration has a higher correlation with children’s
health outcomes, although the rate of migration is smaller for mothers. For instance,
mothers’ migration is positively correlated with children’s height-for-age Z-score and
negatively correlated with children’s daily calorie and protein intakes. The fact that
migrated mothers are more likely to access child care knowledge may explain this
correlation, as childcare knowledge is positively correlated with children’s physical
outcomes. However, a mother’s absence from home means she is not able to monitor
her child’s diet, which may lead to lower calorie and protein
consumption in her children.
When the OLS regression results are compared with the summary statistics in
Table (3.2a), Table (3.2b) and Table (3.2c), the coefficients of parents’
migration and household migration cease to be significant for some measures of
child health outcomes and care. This may be because both parental or household
migration and children’s health outcomes are correlated with the explanatory
variables added in the OLS regression. For instance, children’s weight-for-age
Z-score differs significantly between migrant and non-migrant households in
Table (3.2a), but the coefficient of household migration on children’s
weight-for-age Z-score is not significant in Table (3.4a). Table (3.4a) shows
that children’s weight-for-age Z-score is significantly correlated with father’s
education and with county level average weight and height. At the same time,
Table (3.3) shows that father’s education and county level average weight and
height all differ significantly between migrant and non-migrant households.
Therefore, the correlation between those control variables and household
migration status explains the difference between the OLS results and the
summary statistics. Unlike father’s migration and household migration, mother’s
migration remains significant in Table (3.6a) and Table (3.6b) for the
variables that differ significantly between children with migrant mothers and
children with non-migrant mothers in Table (3.2c). That is, the correlations
between mothers’ migration status and some child outcomes remain significant
when control variables are added.
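The comparison with the summary statistics rests on t-statistics for
differences in means. As a rough check on how such a statistic is formed, the
sketch below recomputes one from the rounded means and standard errors printed
in Table (3.2a). It assumes independent samples with unequal variances; the
function name diff_t_stat is illustrative and is not part of the original
analysis.

```python
import math


def diff_t_stat(mean_a, se_a, mean_b, se_b):
    """t-statistic for the difference between two independent sample means.

    se_a and se_b are standard errors (standard deviations of the sample
    means). Independence and unequal variances are assumed here.
    """
    return (mean_a - mean_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# Weight (kg) row of Table (3.2a): migrant vs. non-migrant households.
t_weight = diff_t_stat(21.04, 0.36, 21.17, 0.16)
print(round(t_weight, 2))
```

With these rounded inputs the statistic is about −0.33, close to the reported
−0.32; the small gap presumably reflects rounding in the published means and
standard errors.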
3.6.2 Results of Fixed Effects model
Table (3.7a) and Table (3.7b) show the effects of household migration status on
children’s health outcomes and care using the fixed effects model, without
accounting for the endogeneity of migration. Similarly, Table (3.8a) and Table
(3.8b) show the effects of father’s migration status, and Table (3.9a) and
Table (3.9b) show the effects of mother’s migration status.
With the fixed effects model, we considerably reduce the threat of omitted
variable bias. Compared with the OLS regression results, most of the
coefficients of parents’ migration and household migration become insignificant
in the fixed effects results, especially the coefficients of mothers’ migration.
This suggests that some omitted variables are correlated with parents’ migration
decisions and may have causal effects on children’s outcomes. Even though we
have tried to include most of the variables relevant to children’s outcomes,
some factors may still be left out because of the limitations of the available
data. For instance, we cannot observe whether the child has a chronic health
condition. Chronic health conditions are defined as health problems that persist
for over three months, affect the child’s normal activities, and require
hospitalization and/or home health care and/or extensive medical care.² Children
with chronic health conditions usually require more time and care from their
parents, as well as more financial support. Within Chinese families, the mother
usually spends more time taking care of the child, while the father is the main
income provider. Therefore, in households with a chronically ill child, compared
with households with healthy children, the mother is more likely to stay at home
(less likely to migrate), while the father is more likely to migrate for higher
wages. This may explain why household migration and father’s migration are
negatively correlated with children’s weight-for-age Z-score in the fixed
effects results, whereas they are not significant in the OLS model. It may also
explain why the coefficients of mothers’ migration become insignificant in the
fixed effects results.
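To illustrate why a child-specific factor that does not change over time can
distort pooled OLS but not the fixed effects estimator, the sketch below uses a
small made-up panel. The data, the assumed true coefficient of −0.5 and the
helper names are hypothetical and are not taken from the CHNS sample.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_child(child_ids, values):
    """Subtract each child's own mean (the within transformation)."""
    by_child = defaultdict(list)
    for cid, v in zip(child_ids, values):
        by_child[cid].append(v)
    means = {cid: sum(vs) / len(vs) for cid, vs in by_child.items()}
    return [v - means[cid] for cid, v in zip(child_ids, values)]


# Hypothetical panel: four children observed in two waves each.
child_ids = [1, 1, 2, 2, 3, 3, 4, 4]
migrated = [0, 0, 0, 1, 1, 1, 0, 1]        # parental migration dummy
child_effect = [0, 0, 0, 0, 2, 2, 2, 2]    # unobserved, constant per child
TRUE_EFFECT = -0.5                         # assumed for illustration only

outcome = [TRUE_EFFECT * m + a for m, a in zip(migrated, child_effect)]

pooled = ols_slope(migrated, outcome)
within = ols_slope(demean_by_child(child_ids, migrated),
                   demean_by_child(child_ids, outcome))
print(pooled, within)
```

In this toy panel the pooled slope is positive (0.5) while the within-child
slope recovers the assumed −0.5, because the child-level effect is removed by
demeaning. Only children 2 and 4, whose migration status changes, contribute to
the within estimate.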
3.6.3 Results of Fixed Effects model with instrument variable
Besides omitted variable bias caused by children with chronic health
conditions, endogeneity bias may be partially responsible for the insignificant
fixed effects results. First, the endogeneity could be a result of reverse
causality: parents’ migration decisions may depend on children’s health status.
For instance, mothers are less likely to migrate when children have relatively
poor health. Moreover, both parents’ migration decisions and children’s outcomes
could be correlated with the local environment and level of development. Though
we have tried to control for those local factors by adding county average
variables such as income as independent variables, it is hard to control for all
the local differences using the current data. For example, the available data
provide little information on the availability and condition of local
² Such as asthma (the most common) and sickle cell anemia.
transport. In towns that have railways or paved roads, people are more likely to
migrate and the local market is more prosperous, which could favor children’s
health. In this case, both parental migration and children’s outcomes are
positively correlated with these unobservable factors, which may strengthen the
positive correlation between them.
To address this endogeneity problem and identify the potential causal effects
of migration on children, we adopt the instrumental variable (IV) method. The
three endogenous variables are the household migration status dummy variable,
the father’s migration status dummy variable and the mother’s migration status
dummy variable. The household migration status dummy variable equals one if
either or both of the child’s parents have migrated. The instrumental variables
are, respectively, the historical county level average household migration rate,
the historical county level average male migration rate, and the historical
county level average female migration rate. The instruments for fathers’ and
mothers’ migration are gender specific. In the survey data, there are between 20
and 30 households in each county. The instruments capture the migration network
and local culture. It is conceivable that the migrant network affects migration
decisions. From Table (3.3), we can see that migrant households live in counties
with higher historical migration rates. The local average migration rates may
affect children’s health and care through the income that parents earn from
urban jobs. Once we control for household income directly in the regression, the
local average migration rate is unlikely to affect children’s anthropometric
outcomes. Another threat to the validity of the IV is that both the IV and
children’s health outcomes may be correlated with unobserved variables. For
instance, government policy may affect both the historical migration rate and
children’s health. In China, the reform of the HuKou system is the biggest
change in government policy that affects labor migration. The policy may affect
children’s health through the development of the local economy and labor
migration. As we have already controlled for county level average income in the
regression, the change of the HuKou system is unlikely to affect children’s
health.
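The logic of the instrument can be sketched with the simplest IV estimator, the
ratio of the covariance between instrument and outcome to the covariance between
instrument and endogenous regressor. The example below is a toy illustration
with made-up numbers, a continuous regressor and no fixed effects. It is not the
fixed effects IV specification estimated in this chapter, and all names are
hypothetical.

```python
def cov(x, y):
    """Population covariance of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    return sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n


# Toy data: the instrument is uncorrelated with the unobserved confounder.
instrument = [0, 0, 1, 1]    # stands in for a low/high historical migration rate
confounder = [0, 1, 0, 1]    # unobserved factor affecting both variables
TRUE_EFFECT = 1.0            # assumed for illustration only

migration = [z + u for z, u in zip(instrument, confounder)]
outcome = [TRUE_EFFECT * x + 2 * u for x, u in zip(migration, confounder)]

ols_estimate = cov(migration, outcome) / cov(migration, migration)
iv_estimate = cov(instrument, outcome) / cov(instrument, migration)
print(ols_estimate, iv_estimate)
```

Here OLS returns 2.0 because the confounder is absorbed into the slope, while
the IV ratio returns the assumed effect of 1.0. The result depends on the
instrument being unrelated to the confounder, which is the assumption discussed
above for the historical migration rates.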
Table (3.10) presents the first-stage results from the fixed effects
regression. The historical average migration rate is significantly correlated
with individual and household migration status. We calculated the F-statistics
against the null that the excluded instruments are irrelevant. The F-statistics
are 6.65, 5.09 and 7.19 on 1 and 821 degrees of freedom for the historical
county level male migration rate, the historical county level female migration
rate, and the historical county level household migration rate, respectively. A
common rule of thumb for models with one endogenous regressor is that the
F-statistic against the null that the excluded instruments are irrelevant in the
first-stage regression should be larger than 10 (Stock, Wright, and Yogo 2002).
By this standard our instruments are weak, which may bias the IV estimates
toward the OLS estimates. However, the coefficients on the instrumental
variables are still significant at the five percent level, which shows that the
historical county level migration rate has predictive power. As the average
migration rate is a measure of the local migration network, the regression
results support the hypothesis that the local migration network is a crucial
factor that affects individuals’ migration decisions in the corresponding local
area.
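The distinction between a significant first stage and a strong instrument can be
checked directly from the reported statistics. The sketch below compares each
F-statistic with the rule-of-thumb threshold of 10 and approximates its p-value.
The approximation is an assumption made here for illustration: with one
numerator degree of freedom F equals the squared t-statistic, and with 821
denominator degrees of freedom the t distribution is treated as normal.

```python
import math

RULE_OF_THUMB = 10.0  # first-stage F threshold cited in the text

# First-stage F-statistics reported in the text (1 and 821 degrees of freedom).
first_stage_f = {"male": 6.65, "female": 5.09, "household": 7.19}


def approx_p_value(f_stat):
    """Approximate p-value for an F statistic with 1 numerator df.

    Uses t = sqrt(F) and a two-sided normal tail, which is only an
    approximation to the exact F(1, 821) distribution.
    """
    z = math.sqrt(f_stat)
    return math.erfc(z / math.sqrt(2.0))


for name, f_stat in first_stage_f.items():
    label = "weak" if f_stat < RULE_OF_THUMB else "strong"
    print(name, f_stat, label, round(approx_p_value(f_stat), 3))
```

Under this approximation all three instruments are individually significant at
the five percent level, yet all three fall below the threshold of 10, which is
the situation described in the paragraph above.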
The effects of household migration status on children’s health outcomes and
care from the fixed effects model using the IV approach are presented in Table
(3.11a) and Table (3.11b). The corresponding effects of father’s migration
status are presented in Table (3.12a) and Table (3.12b), and those of mother’s
migration status in Table (3.13a) and Table (3.13b).
After correcting for endogeneity, there are few significant effects of parents’
migration on children’s outcomes. There are three possible reasons. The first
is that the IV approach removes the reverse causality between parents’ migration
and children’s health. The second is that the weak instruments we use may bias
the IV estimates toward the OLS estimates. As a result, household migration and
father’s migration may lead to an even higher increase in children’s weight than
reported in the tables. The third is that IV estimation usually inflates
standard errors and thus reduces significance, so it is not surprising that more
coefficients became insignificant after applying the IV. Overall, correcting for
endogeneity did not change the main conclusions.
We have discussed that parental migration affects a child’s health in three
major ways: the income effect, the allocation of time and the information
effect. As we have controlled for the income effect by including household
income as one of the explanatory variables, the net effect of parents’ migration
here is the combined effect of time allocation (the time a parent spends with
their child) and the information effect. The estimation results show that the
net effect of parents’ migration is not significant for most measures of
children’s health outcomes.
It is surprising to see that the number of elderly people in the household has
only a few significant effects on children’s health. The elderly in the
household are likely to be the children’s grandparents. Intuitively, care from
grandparents could compensate for the absence of the children’s parents.
According to the regression results, children who live with their grandfathers
consume more calories, while children who live with their grandmothers do not
have significantly better health outcomes than those who do not. However, the
analysis in this paper focuses only on measures of children’s physical health.
Grandparents may have positive effects on children’s mental health when the
parents are absent, which could be studied in future research.
The variances of the coefficients in the IV approach are noticeably larger than
those in the fixed effects model that does not correct for endogeneity. This is
a sign that the instruments do not add much variation. The variance is
especially large when childcare is the dependent variable, as the effective
sample size is relatively small due to missing values for the childcare
variable. A total of 1048 observations were used to analyze the childcare
variable. Only 118 individuals have more than one observation in this sample,
and the parents of 33 of these children changed their migration status.
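In a fixed effects regression, the migration coefficient is identified only
from children who are observed more than once and whose parents’ migration
status changes between observations. A minimal sketch of how such counts could
be obtained from long-format records is given below. The records and their
layout are hypothetical and are not the CHNS data.

```python
from collections import defaultdict

# Hypothetical long-format records: (child_id, survey_year, migration_dummy).
records = [
    (1, 1997, 0), (1, 2000, 1),
    (2, 1997, 0), (2, 2000, 0),
    (3, 2004, 1),
    (4, 2000, 1), (4, 2004, 1), (4, 2006, 0),
]

# Collect the sequence of migration statuses observed for each child.
status_by_child = defaultdict(list)
for child_id, _year, migrated in records:
    status_by_child[child_id].append(migrated)

# Children with more than one observation (the effective sample).
repeat_children = [c for c, s in status_by_child.items() if len(s) > 1]

# Among those, children whose migration status changes at least once.
switchers = [c for c in repeat_children if len(set(status_by_child[c])) > 1]

print(len(repeat_children), len(switchers))
```

In this made-up example three children are observed repeatedly and two of them
change status. The analogous counts reported above for the childcare sample are
118 and 33.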
Overall, the IV approach suggests that there are few significant causal effects
of parents’ migration on children’s health outcomes. In contrast to the concern
that left-behind children might suffer health problems without sufficient care
from migrated parents, our empirical results show that the net effect of
parents’ migration on children’s health is not necessarily negative. These
results suggest that the effects of health information provided by migrated
parents are important and cannot be ignored.
3.7 ROBUSTNESS CHECK
Of primary concern is that changes in household characteristics reflected in
our data may be endogenous to children’s health status. For instance, the changes
in household income may be correlated with unobserved shocks that could also lead
to changes in children’s health. Moreover, household income may be correlated with
migration decisions of household members. Such correlation may lead to biased
estimates of the effects of migration. To rule out the possibility that the
above results are driven
by changes in endogenous household income, we estimated the regressions without
including household income as a control variable. The effects of households’ migration
on children’s health outcome and care are reported in Table 3.14a and Table 3.14b.
The effects of fathers’ migration on children’s health outcomes and care are reported
in Table 3.15a and Table 3.15b. The effects of mothers’ migration on children’s health
outcomes and care are reported in Table 3.16a and Table 3.16b.
Compared with the previous estimates with household income as a control
variable, the coefficients on migration are very similar in all regression
analyses, with only small variations. These small variations might arise because
the coefficients of parents’ migration also capture the income effect of labor
migration when household income is excluded as a control variable.
Besides household income, the number of elders in the household might be
endogenous because it may be a factor in parents’ migration decisions. The
estimates of the coefficients of migration might therefore be biased. To address
this issue, we re-estimated the fixed effects model without including the number
of males and females over 60 in the household as control variables. The effects
of households’ migration on child health outcomes and care are reported in Table
3.17a and Table 3.17b, the effects of fathers’ migration in Table 3.18a and
Table 3.18b, and the effects of mothers’ migration in Table 3.19a and Table
3.19b.
The estimation results show that the magnitudes of the coefficients are very
similar. It is worth noting that there are small changes in the standard errors
of some coefficients. One possible explanation is that the number of elders is
positively correlated with migration decisions, so that excluding it reduces
multicollinearity and the variance is smaller in this robustness check. Despite
this change, the results are consistent with the previous findings.
3.8 REGRESSION RESULTS ON SUBSAMPLES
Although the regression results show that there are few significant effects of
parents’ migration on children’s health outcomes and care in general, parents’
migration may have a significant effect on children in particular groups. In
this section, we present the regression results from the fixed effects model and
the IV approach on subsamples.
In total, we studied ten subgroups: a) children who live in low income
households, where low income is defined as household income level less than the
average annual income level; b) children who live in high income households, where
high income is defined as household income level higher than the average annual
income level; c) children whose parents did not finish high school; d) children whose
parents finished high school; e) children younger than age 5; f) children between ages
5 and 10; g) children who live with their grandparents; h) children who live in nuclear
families; i) children who live in north China; j) children who live in south China.
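As an illustration of how two of these splits are defined, the sketch below
assigns children to the income and age subgroups. The records are invented, and
the treatment of boundary cases (a household exactly at the average income, a
child aged exactly 5 or 10) is an assumption, since the text does not specify
it.

```python
# Hypothetical records: (child_id, household_income, child_age).
children = [
    (1, 1.0, 3),
    (2, 2.0, 7),
    (3, 4.0, 9),
    (4, 5.0, 4),
]

# Income split relative to the sample average, as described in a) and b).
average_income = sum(income for _, income, _ in children) / len(children)
low_income = [cid for cid, income, _ in children if income < average_income]
high_income = [cid for cid, income, _ in children if income > average_income]

# Age split, as described in e) and f); the inclusive bounds are assumed.
under_five = [cid for cid, _, age in children if age < 5]
five_to_ten = [cid for cid, _, age in children if 5 <= age <= 10]

print(low_income, high_income, under_five, five_to_ten)
```

Each child appears in exactly one income group and one age group here. The
remaining splits (parental education, co-residence with grandparents, region)
would follow the same pattern with the relevant indicator.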
Due to limitations of the data, some of the coefficients are not identifiable,
particularly the coefficients of mother’s migration, as the effective sample
size is too small for some subsamples. The effective sample contains the
children who have more than one observation in the data. Moreover, because the
IVs are county level average migration rates, they add little variation,
especially when the effective sample size is small. This problem is more serious
for mother’s migration because males migrate more often than females, so there
is less variation in females’ migration status than in males’ migration status.
For this reason, the regression results for mothers’ migration are not reported
here. The regression results for the subsamples of children under age 5 and
children with highly educated parents are not available for the same reason. A
second problem is that, because of missing values, the effective sample size was
too small in some subsamples to conduct fixed effects analyses for some
variables. For instance, childcare data were not available for several
subsamples.
Table (3.20a) and Table (3.20b) show the fixed effects results for household
migration on children’s health and care on subsamples using the IV approach.
Table (3.21a) and Table (3.21b) show the corresponding results for fathers’
migration.
Generally speaking, the IV approach shows that children between ages 5 and 10
are significantly affected by fathers’ migration. The effects on their calorie
and protein intake are positive. As mentioned above, the effects of parents’
migration can be both positive and negative. Positive effects include better
access to nutritional information and products. Negative effects may include
children not being in the care of either their mother or father. The results
show that the positive effects of fathers’ migration outweigh the negative
effects for children aged between 5 and 10 years. No significant positive
effects were found for other subsamples, possibly because children between the
ages of 5 and 10 are in the midst of a crucial period of physical development.
Most of the other coefficients are not significant because of the large standard
errors in the IV approach.
The regressions on subsamples show that parents’ migration has significant
effects on children’s health outcomes and care for children in particular
groups. The results show that the positive effects of parents’ migration can
offset the negative effects. Additionally, the positive effects outweigh the
negative effects on children’s nutrient intake in some subsamples.
3.9 CONCLUSION
In this paper, we studied left-behind children’s health outcomes, including
height-for-age Z-score (HAZ), weight-for-age Z-score (WAZ), daily calorie intake,
daily protein intake, the number of immunization shots received by children and
whether children have been sick during the survey year. The evidence presented
above shows that children with migrated parents do not necessarily have poorer
health outcomes than children who live with both parents. The robustness checks
on the endogeneity assumption support the finding that labor migration has no
significant causal effect on the health of left-behind children. The fact that
the results change so little after excluding household income and the number of
elders in the household suggests that parents’ migration has no significant
impact on children’s health, and that children’s health is independent of
household income and the number of elders in the household.
The regression results on subsamples show that fathers’ migration has
significant positive effects on nutrient intake for children between 5 and 10
years of age. They show that the positive effects of parents’ migration can
outweigh and offset the negative effects. The regression results on subsamples
also provide some insight into why the overall effects of parents’ migration are
insignificant: the negative effects of parents’ migration on children’s health
are possibly compensated by better access to nutrition information and products,
the care from grandparents and the remittances that migrated parents are able to
provide.
We have explored the possible mechanisms that may lead to better access to
nutritional information. Future research should examine whether parental
migration affects the social support that children receive and how children’s
health outcomes vary with the duration of parents’ migration. Nevertheless,
these first steps into the investigation of this important topic cast further
doubt on the view that left-behind children in China always suffer from their
parents’ absence. These findings should encourage policy makers in areas of high
migration to provide alternative sources of support for left-behind children.
Table 3.1: Parents’ Migration Rate for Children under Age Ten (CHNS)
Survey year 1997 2000 2004 2006 2009
Any Parent Migrated 0.06 0.10 0.17 0.21 0.14
Father Migrated Only 0.05 0.09 0.14 0.18 0.12
Mother Migrated Only 0.02 0.03 0.07 0.06 0.05
Both Parents Migrated 0.01 0.02 0.04 0.04 0.03
Number of Observations 927 785 614 585 531
Table 3.2a: Descriptive Statistics (CHNS)
Variables Migrant household Non-Migrant household t-stats of the difference
Weight (kg) 21.04 21.17 −0.32
(0.36)¹ (0.16) (0.74)²
Height (cm) 114.90 113.94 0.98
(0.89) (0.37) (0.33)
Weight-for-age Z-score −0.49 −0.25 −3.41∗∗∗
(0.07) (0.03) (0.00)
Height-for-age Z-score −0.65 −0.50 −1.80·
(0.08) (0.03) (0.07)
Calories (Kcal) 1362.68 1374.04 −0.34
(30.00) (13.87) (0.73)
Protein (g) 40.65 42.75 −1.90·
(1.00) (0.48) (0.06)
Calories/RDA 0.81 0.84 −1.26
(0.02) (0.01) (0.21)
Protein/RDA 0.72 0.78 −3.03∗∗
(0.02) (0.01) (0.00)
Number of immunization shots 6.53 8.97 −1.88·
(1.15) (0.60) (0.06)
Whether the child has been cared for by a non-family member in the past week 0.39 0.47 −1.06
(0.07) (0.03) (0.29)
¹ Standard deviation of the sample mean.
² p-value; ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.2b: Descriptive Statistics (CHNS)
Variables Migrant Father Non-Migrant Father t-stats of the difference
Weight (kg) 21.08 21.16 −0.18
(0.39)¹ (0.15) (0.86)²
Height (cm) 115.04 113.95 1.04
(0.97) (0.37) (0.30)
Weight-for-age Z-score −0.48 −0.25 −2.86∗∗
(0.07) (0.03) (0.00)
Height-for-age Z-score −0.63 −0.51 −1.45
(0.08) (0.03) (0.15)
Calories (Kcal) 1379.47 1371.43 0.21
(33.35) (13.63) (0.83)
Protein (g) 41.22 42.62 −1.07
(1.10) (0.47) (0.29)
Calories/RDA 0.82 0.83 −0.58
(0.02) (0.01) (0.56)
Protein/RDA 0.73 0.77 −1.90·
(0.02) (0.01) (0.06)
Number of immunization shots 6.42 8.92 −1.58
(1.22) (0.59) (0.11)
Whether the child has been cared for by a non-family member in the past week 0.39 0.47 −0.83
(0.08) (0.03) (0.41)
¹ Standard deviation of the sample mean.
² p-value; ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.2c: Descriptive Statistics (CHNS)
Variables Migrant Mother Non-Migrant Mother t-stats of the difference
Weight (kg) 20.63 21.18 −0.87
(0.57)¹ (0.15) (0.38)²
Height (cm) 113.73 114.10 −0.24
(1.38) (0.36) (0.81)
Weight-for-age Z-score −0.49 −0.27 −1.95·
(0.11) (0.03) (0.05)
Height-for-age Z-score −0.69 −0.52 −1.39
(0.12) (0.03) (0.16)
Calories (Kcal) 1265.22 1378.60 −2.03∗
(44.74) (13.10) (0.04)
Protein (g) 37.65 42.73 −2.67∗∗
(1.52) (0.45) (0.01)
Calories/RDA 0.77 0.84 −2.16∗
(0.02) (0.01) (0.03)
Protein/RDA 0.68 0.77 −2.73∗∗
(0.03) (0.01) (0.01)
Number of immunization shots 5.52 8.79 −1.44
(1.59) (0.57) (0.15)
Whether the child has been cared for by a non-family member in the past week 0.38 0.47 −0.68
(0.06) (0.03) (0.50)
¹ Standard deviation of the sample mean.
² p-value; ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.3: Descriptive Statistics (CHNS) of Control Variables
Variables Migrant household Non-Migrant Household t-stats of the difference
Household annual income (10000$) 2.30 2.82 −2.38∗
(0.20) (0.08) (0.02)
Father’s education 2.03 2.37 −6.75∗∗∗
(0.04) (0.02) (0.00)
Mother’s education 1.79 2.15 −6.34∗∗∗
(0.05) (0.03) (0.00)
County level average income 0.71 0.83 −3.45∗∗∗
(0.03) (0.01) (0.00)
County level average weight 56.72 58.85 −7.47∗∗∗
(0.26) (0.12) (0.00)
County level average height 158.48 160.12 −8.51∗∗∗
(0.18) (0.08) (0.00)
Number of males over 60 in the household 0.28 0.26 0.60
(0.03) (0.01) (0.55)
Child’s gender (girls=1) 0.47 0.48 −0.06
(0.03) (0.01) (0.95)
Number of females over 60 in the household 0.32 0.29 1.00
(0.03) (0.01) (0.32)
Number of boys in the household 0.93 0.79 3.39∗∗∗
(0.04) (0.01) (0.00)
Number of girls in the household 0.82 0.72 2.03∗
(0.04) (0.02) (0.04)
County level average calorie intake/RDA 1.02 1.00 1.16
(0.01) (0.00) (0.25)
County level average protein intake/RDA 1.30 1.30 0.04
(0.02) (0.01) (0.97)
Children’s age 6.22 5.94 2.006∗
(0.13) (0.05) (0.05)
Father’s age 32.60 34.16 −3.05∗∗
(0.47) (0.19) (0.00)
Mother’s age 31.25 32.32 −2.10∗
(0.46) (0.22) (0.04)
Historical county level male migration rate 0.26 0.15 9.50∗∗∗
(0.01) (0.00) (0.00)
Historical county level female migration rate 0.16 0.09 8.58∗∗∗
(0.01) (0.00) (0.00)
Historical county level household migration rate 0.32 0.20 10.00∗∗∗
(0.01) (0.00) (0.00)
¹ Standard deviation.
² p-value; ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
³ Historical county level migration rate: the average local migration rate from the previous survey year.
Table 3.4a: OLS regression results: the effects of the household migration status
WAZ HAZ Immunization shots Childcare by non-family member
Household migration status 0.07 0.13· −2.95· −0.03
(0.06) (0.07) (1.51) (0.04)
Household income 0.00 0.01 −0.35· 0.00
(0.01) (0.01) (0.20) (0.00)
Father education 0.07∗∗ 0.10∗∗ −0.52 0.04∗
(0.03) (0.03) (0.68) (0.02)
Mother education 0.02 0.06· −0.92 0.01
(0.03) (0.03) (0.69) (0.02)
County average income 0.10∗ 0.24∗∗∗ 4.06∗ −0.03
(0.05) (0.05) (1.84) (0.03)
County average weight 0.05∗∗∗ 0.03∗∗ 0.11 0.01∗
(0.01) (0.01) (0.22) (0.01)
County average height 0.08∗∗∗ 0.08∗∗∗ −0.42 −0.02·
(0.01) (0.01) (0.32) (0.01)
Male in household with age over 60 −0.06 −0.05 3.53∗ −0.08∗
(0.06) (0.07) (1.42) (0.04)
Female in household with age over 60 0.11∗ 0.10 0.52 −0.02
(0.05) (0.06) (1.32) (0.04)
Gender −0.20∗∗ −0.19∗ −0.14 0.06
(0.07) (0.08) (1.66) (0.05)
Number of boys in household −0.09· −0.10· 0.78 −0.02
(0.05) (0.05) (1.11) (0.03)
Number of girls in household −0.01 0.01 −0.88 −0.05·
(0.04) (0.05) (1.07) (0.03)
County average calorie consumption −0.22 0.00 −8.67 0.30∗
(0.21) (0.24) (5.34) (0.14)
County average protein consumption 0.14 0.24 3.95 −0.12
(0.15) (0.18) (3.89) (0.10)
Child age −0.05∗∗∗ 0.03∗∗ −0.10 0.02·
(0.01) (0.01) (0.23) (0.01)
Intercept −15.46∗∗∗ −16.22∗∗∗ 74.37· 1.68
(1.67) (1.92) (41.52) (1.09)
R² 0.25 0.19 0.02 0.04
Adj. R² 0.25 0.19 0.02 0.04
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.4b: OLS regression results: the effects of the household migration status
Calorie Protein Calorie/RDA Protein/RDA
Household Migration status −30.15 −1.67 −0.02 −0.03
(31.22) (1.06) (0.02) (0.02)
Household income −2.16 0.07 0.00 0.00
(3.35) (0.11) (0.00) (0.00)
Father education 29.90∗ 1.19∗∗ 0.02∗∗ 0.03∗∗
(13.12) (0.45) (0.01) (0.01)
Mother education 28.58∗ 1.30∗∗ 0.02∗ 0.02∗∗
(13.17) (0.45) (0.01) (0.01)
County average income 9.06 0.52 0.00 0.01
(21.97) (0.75) (0.01) (0.01)
County average weight 0.41 0.02 0.00 0.00
(4.28) (0.15) (0.00) (0.00)
County average height 0.99 0.08 0.00 0.00
(6.24) (0.21) (0.00) (0.00)
Male in household with age over 60 12.53 0.32 0.01 0.01
(27.41) (0.93) (0.02) (0.02)
Female in household with age over 60 −0.19 −0.20 0.00 −0.01
(26.26) (0.89) (0.02) (0.02)
Gender −80.79∗ −3.89∗∗∗ −0.01 −0.04·
(33.45) (1.14) (0.02) (0.02)
Number of boys in household 3.52 −0.41 0.00 0.00
(22.30) (0.76) (0.01) (0.01)
Number of girls in household −10.36 −0.23 −0.01 0.00
(21.16) (0.72) (0.01) (0.01)
County average calorie consumption 738.90∗∗∗ −10.95∗∗ 0.42∗∗∗ −0.22∗∗∗
(99.30) (3.38) (0.06) (0.07)
County average protein consumption 195.62∗∗ 35.16∗∗∗ 0.13∗∗ 0.65∗∗∗
(73.58) (2.51) (0.05) (0.05)
Child age 111.98∗∗∗ 3.21∗∗∗ 0.01∗∗∗ 0.01∗
(4.58) (0.16) (0.00) (0.00)
Intercept −566.10 −28.08 −0.29 −0.50
(805.99) (27.45) (0.51) (0.53)
R² 0.29 0.30 0.11 0.18
Adj. R² 0.29 0.30 0.11 0.18
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.5a: OLS regression results: the effects of the father’s migration
WAZ HAZ Immunization shots Childcare by non-family member
Father’s migration status 0.07 0.14· −3.05· −0.03
(0.07) (0.08) (1.60) (0.05)
Household income 0.00 0.01 −0.35· 0.00
(0.01) (0.01) (0.20) (0.00)
Father education 0.07∗∗ 0.10∗∗ −0.53 0.04∗
(0.03) (0.03) (0.68) (0.02)
Mother education 0.02 0.05· −0.91 0.01
(0.03) (0.03) (0.69) (0.02)
County average income 0.10∗ 0.24∗∗∗ 4.09∗ −0.03
(0.05) (0.05) (1.84) (0.03)
County average weight 0.05∗∗∗ 0.03∗∗ 0.11 0.01∗
(0.01) (0.01) (0.22) (0.01)
County average height 0.08∗∗∗ 0.08∗∗∗ −0.41 −0.02·
(0.01) (0.01) (0.32) (0.01)
Male in household with age over 60 −0.06 −0.06 3.58∗ −0.08∗
(0.06) (0.07) (1.42) (0.04)
Female in household with age over 60 0.11∗ 0.10 0.48 −0.02
(0.05) (0.06) (1.32) (0.04)
Gender −0.20∗∗ −0.19∗ −0.14 0.06
(0.07) (0.08) (1.66) (0.05)
Number of boys in household −0.09· −0.10· 0.81 −0.02
(0.05) (0.05) (1.11) (0.03)
Number of girls in household −0.01 0.01 −0.85 −0.05·
(0.04) (0.05) (1.07) (0.03)
County average calorie consumption −0.22 0.00 −8.55 0.30∗
(0.21) (0.24) (5.34) (0.14)
County average protein consumption 0.14 0.24 3.88 −0.12
(0.15) (0.18) (3.89) (0.10)
Child age −0.05∗∗∗ 0.03∗∗ −0.11 0.02·
(0.01) (0.01) (0.23) (0.01)
Intercept −15.44∗∗∗ −16.18∗∗∗ 73.12· 1.68
(1.67) (1.92) (41.47) (1.09)
R² 0.25 0.19 0.02 0.04
Adj. R² 0.25 0.19 0.02 0.04
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.5b: OLS regression results: the effects of the father’s migration
Calorie Protein Calorie/RDA Protein/RDA
Father’s Migration status −12.72 −1.05 −0.01 −0.02
(33.16) (1.13) (0.02) (0.02)
Household income −2.12 0.07 0.00 0.00
(3.35) (0.11) (0.00) (0.00)
Father education 30.27∗ 1.20∗∗ 0.02∗∗ 0.03∗∗
(13.13) (0.45) (0.01) (0.01)
Mother education 28.74∗ 1.31∗∗ 0.02∗ 0.02∗∗
(13.17) (0.45) (0.01) (0.01)
County average income 9.00 0.52 0.00 0.01
(21.98) (0.75) (0.01) (0.01)
County average weight 0.41 0.02 0.00 0.00
(4.29) (0.15) (0.00) (0.00)
County average height 1.30 0.09 0.00 0.00
(6.24) (0.21) (0.00) (0.00)
Male in household with age over 60 12.82 0.34 0.01 0.01
(27.42) (0.93) (0.02) (0.02)
Female in household with age over 60 −0.31 −0.21 0.00 −0.01
(26.26) (0.89) (0.02) (0.02)
Gender −80.83∗ −3.89∗∗∗ −0.01 −0.04·
(33.46) (1.14) (0.02) (0.02)
Number of boys in household 3.06 −0.42 0.00 0.00
(22.32) (0.76) (0.01) (0.01)
Number of girls in household −10.66 −0.24 −0.01 0.00
(21.17) (0.72) (0.01) (0.01)
County average calorie consumption 740.12∗∗∗ −10.90∗∗ 0.42∗∗∗ −0.22∗∗∗
(99.32) (3.38) (0.06) (0.07)
County average protein consumption 194.04∗∗ 35.10∗∗∗ 0.13∗∗ 0.65∗∗∗
(73.58) (2.51) (0.05) (0.05)
Child age 111.88∗∗∗ 3.20∗∗∗ 0.01∗∗∗ 0.01∗
(4.58) (0.16) (0.00) (0.00)
Intercept −617.76 −30.21 −0.31 −0.53
(805.08) (27.42) (0.51) (0.53)
R² 0.29 0.30 0.11 0.18
Adj. R² 0.29 0.30 0.11 0.18
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.6a: OLS regression results: the effects of the mother’s migration
WAZ HAZ Immunization shots Childcare by non-family member
Mother’s migration status 0.18· 0.20· −3.47 0.02
(0.10) (0.11) (2.28) (0.06)
Household income 0.00 0.01 −0.33· 0.00
(0.01) (0.01) (0.20) (0.00)
Father education 0.07∗∗ 0.10∗∗ −0.44 0.04∗
(0.03) (0.03) (0.68) (0.02)
Mother education 0.02 0.06· −0.94 0.01
(0.03) (0.03) (0.69) (0.02)
County average income 0.10∗ 0.24∗∗∗ 4.00∗ −0.03
(0.05) (0.05) (1.84) (0.03)
County average weight 0.05∗∗∗ 0.03∗∗ 0.09 0.01∗
(0.01) (0.01) (0.22) (0.01)
County average height 0.08∗∗∗ 0.08∗∗∗ −0.39 −0.01·
(0.01) (0.01) (0.32) (0.01)
Male in household with age over 60 −0.06 −0.05 3.51∗ −0.08∗
(0.06) (0.07) (1.42) (0.04)
Female in household with age over 60 0.11∗ 0.10 0.59 −0.02
(0.05) (0.06) (1.32) (0.04)
Gender −0.20∗∗ −0.19∗ −0.19 0.06
(0.07) (0.08) (1.66) (0.05)
Number of boys in household −0.08· −0.09· 0.63 −0.02
(0.05) (0.05) (1.11) (0.03)
Number of girls in household 0.00 0.02 −0.98 −0.05·
(0.04) (0.05) (1.07) (0.03)
County average calorie consumption −0.23 −0.02 −8.33 0.30∗
(0.21) (0.24) (5.34) (0.14)
County average protein consumption 0.14 0.25 3.69 −0.13
(0.15) (0.18) (3.88) (0.10)
Child age −0.05∗∗∗ 0.03∗∗ −0.12 0.02·
(0.01) (0.01) (0.23) (0.01)
Intercept −15.56∗∗∗ −16.16∗∗∗ 70.76· 1.58
(1.67) (1.92) (41.44) (1.08)
R2 0.25 0.19 0.02 0.04
Adj. R2 0.25 0.19 0.02 0.04
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.6b: OLS regression results: the effects of the mother’s migration
Calorie Protein Calorie/RDA Protein/RDA
Mother’s Migration status −116.12∗ −3.45∗ −0.06∗ −0.06·
(47.71) (1.63) (0.03) (0.03)
Household income −2.19 0.07 0.00 0.00
(3.34) (0.11) (0.00) (0.00)
Father education 30.43∗ 1.22∗∗ 0.02∗∗ 0.03∗∗
(13.08) (0.45) (0.01) (0.01)
Mother education 27.60∗ 1.28∗∗ 0.02∗ 0.02∗∗
(13.16) (0.45) (0.01) (0.01)
County average income 9.35 0.53 0.00 0.01
(21.95) (0.75) (0.01) (0.01)
County average weight 0.23 0.01 0.00 0.00
(4.28) (0.15) (0.00) (0.00)
County average height 0.50 0.07 0.00 0.00
(6.23) (0.21) (0.00) (0.00)
Male in household with age over 60 11.15 0.28 0.01 0.01
(27.39) (0.93) (0.02) (0.02)
Female in household with age over 60 0.52 −0.18 0.00 −0.01
(26.23) (0.89) (0.02) (0.02)
Gender −81.57∗ −3.91∗∗∗ −0.01 −0.04·
(33.42) (1.14) (0.02) (0.02)
Number of boys in household 1.49 −0.50 0.00 −0.01
(22.25) (0.76) (0.01) (0.01)
Number of girls in household −12.25 −0.31 −0.01 −0.01
(21.13) (0.72) (0.01) (0.01)
County average calorie consumption 744.00∗∗∗ −10.74∗∗ 0.43∗∗∗ −0.22∗∗∗
(99.18) (3.38) (0.06) (0.07)
County average protein consumption 193.20∗∗ 35.02∗∗∗ 0.13∗∗ 0.65∗∗∗
(73.44) (2.50) (0.05) (0.05)
Child age 111.85∗∗∗ 3.20∗∗∗ 0.01∗∗∗ 0.01∗
(4.57) (0.16) (0.00) (0.00)
Intercept −471.22 −27.27 −0.23 −0.49
(804.12) (27.40) (0.51) (0.53)
R2 0.29 0.30 0.11 0.18
Adj. R2 0.29 0.30 0.11 0.18
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
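As a reading aid for the significance markers used in these tables, the following minimal sketch recomputes a marker from a reported coefficient and standard error. It assumes a two-sided test with a standard normal approximation (reasonable with 2,201 observations, but not necessarily the exact procedure behind the tables), and the function names are hypothetical. Because the tables round coefficients and standard errors, borderline cases can differ from the printed markers.

```python
import math


def two_sided_p(coef: float, se: float) -> float:
    """Two-sided p-value for coef/se under a standard normal approximation."""
    z = abs(coef / se)
    # For a standard normal Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2.0))


def stars(coef: float, se: float) -> str:
    """Map a coefficient and its standard error to the tables' markers."""
    p = two_sided_p(coef, se)
    if p < 0.001:
        return "***"
    if p < 0.01:
        return "**"
    if p < 0.05:
        return "*"
    if p < 0.1:
        return "·"
    return ""


# Rows taken from the Calorie column of Table 3.6b.
print(stars(-116.12, 47.71))  # mother's migration status
print(stars(111.85, 4.57))    # child age
print(stars(9.35, 21.95))     # county average income
```

For these three rows the recomputed markers agree with the ones printed in the table.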
Table 3.7a: Fixed effects model results of the effects of the household migration status
on children’s health outcome and care
WAZ HAZ Immunization shots Childcare by non-family member
Household migration status −0.20· −0.13 −5.11 −0.02
(0.10) (0.12) (3.78) (0.13)
Household income −0.03∗ −0.02 −0.28 −0.01
(0.01) (0.01) (0.57) (0.01)
County average income 0.27∗∗ 0.08 3.84 0.05
(0.09) (0.10) (5.60) (0.10)
County average weight 0.03 −0.04 −0.10 0.02
(0.03) (0.03) (1.13) (0.04)
County average height 0.03 0.05 −1.65 −0.01
(0.03) (0.04) (1.21) (0.04)
Male in household with age over 60 −0.07 0.08 0.75 0.06
(0.16) (0.18) (6.52) (0.23)
Female in household with age over 60 0.13 0.30 11.13 0.10
(0.19) (0.21) (7.50) (0.28)
Number of boys in household −0.04 −0.17 9.50∗ 0.07
(0.12) (0.13) (4.12) (0.17)
Number of girls in household −0.08 −0.21 −0.78 −0.03
(0.12) (0.14) (5.21) (0.18)
County average calorie consumption 0.28 0.14 1.27 0.87
(0.35) (0.39) (14.05) (0.54)
County average protein consumption 0.08 0.29 −4.57 −0.36
(0.25) (0.29) (9.00) (0.38)
Child age −0.04∗ 0.09∗∗∗ 1.35∗ 0.00
(0.02) (0.02) (0.62) (0.03)
R2 0.04 0.10 0.07 0.03
Adj. R2 0.01 0.03 0.02 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.7b: Fixed effects model results of the effects of the household migration status
on children’s health outcome and care
Calorie Protein Calorie/RDA Protein/RDA
Household Migration status −25.15 1.20 −0.01 0.03
(69.77) (2.26) (0.05) (0.05)
Household income −3.97 −0.24 0.00 −0.01
(7.34) (0.24) (0.00) (0.00)
County average income −28.54 −0.04 −0.02 −0.01
(60.74) (1.97) (0.04) (0.04)
County average weight 7.94 0.29 0.01 0.01
(20.05) (0.65) (0.01) (0.01)
County average height 12.92 1.15· 0.01 0.02·
(20.71) (0.67) (0.01) (0.01)
Male in household with age over 60 207.50· 3.46 0.13· 0.07
(107.41) (3.48) (0.07) (0.07)
Female in household with age over 60 48.21 −1.87 0.02 −0.08
(125.27) (4.06) (0.09) (0.08)
Number of boys in household 43.85 −0.55 0.02 −0.02
(78.20) (2.54) (0.05) (0.05)
Number of girls in household 27.00 −0.89 0.02 −0.03
(82.23) (2.67) (0.06) (0.06)
County average calorie consumption 1159.18∗∗∗ 4.22 0.65∗∗∗ 0.02
(232.22) (7.53) (0.16) (0.16)
County average protein consumption 47.99 26.66∗∗∗ 0.07 0.54∗∗∗
(170.03) (5.51) (0.12) (0.12)
Child age 97.59∗∗∗ 2.56∗∗∗ 0.00 0.00
(11.05) (0.36) (0.01) (0.01)
R2 0.27 0.23 0.09 0.10
Adj. R2 0.07 0.06 0.02 0.03
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.8a: Fixed effects model results of the effects of the father’s migration status
on children’s health outcome and care
WAZ HAZ Immunization shots Childcare by non-family member
Father migration status −0.19· −0.20 −4.19 −0.06
(0.11) (0.12) (3.80) (0.14)
Household income −0.02∗ −0.02 −0.27 −0.01
(0.01) (0.01) (0.57) (0.01)
County average income 0.27∗∗ 0.07 3.90 0.05
(0.09) (0.10) (5.60) (0.10)
County average weight 0.03 −0.04 −0.07 0.02
(0.03) (0.03) (1.13) (0.04)
County average height 0.03 0.04 −1.60 −0.01
(0.03) (0.03) (1.21) (0.04)
Male in household with age over 60 −0.06 0.09 1.02 0.06
(0.16) (0.18) (6.52) (0.23)
Female in household with age over 60 0.12 0.30 11.00 0.10
(0.19) (0.21) (7.51) (0.28)
Number of boys in household −0.03 −0.16 9.68∗ 0.07
(0.12) (0.13) (4.12) (0.17)
Number of girls in household −0.08 −0.21 −0.84 −0.02
(0.12) (0.14) (5.21) (0.18)
County average calorie consumption 0.27 0.13 0.77 0.87
(0.35) (0.39) (14.07) (0.54)
County average protein consumption 0.08 0.29 −4.36 −0.36
(0.25) (0.29) (9.01) (0.38)
Child age −0.04∗ 0.09∗∗∗ 1.31∗ 0.01
(0.02) (0.02) (0.62) (0.03)
R2 0.04 0.11 0.07 0.03
Adj. R2 0.01 0.03 0.02 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.8b: Fixed effects model results of the effects of the father’s migration status
on children’s health outcome and care
Calorie Protein Calorie/RDA Protein/RDA
Father Migration status 7.54 3.18 0.01 0.07
(72.23) (2.34) (0.05) (0.05)
Household income −3.80 −0.24 0.00 −0.01
(7.33) (0.24) (0.00) (0.00)
County average income −28.19 0.11 −0.02 −0.01
(60.84) (1.97) (0.04) (0.04)
County average weight 8.14 0.31 0.01 0.01
(20.05) (0.65) (0.01) (0.01)
County average height 13.89 1.19· 0.01 0.02·
(20.67) (0.67) (0.01) (0.01)
Male in household with age over 60 208.81· 3.35 0.13· 0.07
(107.36) (3.48) (0.07) (0.07)
Female in household with age over 60 46.34 −1.87 0.02 −0.08
(125.21) (4.05) (0.09) (0.08)
Number of boys in household 44.25 −0.63 0.02 −0.02
(78.21) (2.53) (0.05) (0.05)
Number of girls in household 24.85 −1.03 0.02 −0.03
(82.27) (2.66) (0.06) (0.06)
County average calorie consumption 1161.00∗∗∗ 4.44 0.65∗∗∗ 0.03
(232.30) (7.52) (0.16) (0.16)
County average protein consumption 46.13 26.49∗∗∗ 0.07 0.54∗∗∗
(170.09) (5.51) (0.12) (0.12)
Child age 96.94∗∗∗ 2.52∗∗∗ 0.00 0.00
(11.08) (0.36) (0.01) (0.01)
R2 0.27 0.23 0.09 0.10
Adj. R2 0.07 0.06 0.02 0.03
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.9a: Fixed effects model results of the effects of the mother’s migration status
on children’s health outcome and care
WAZ HAZ Immunization shots Childcare by non-family member
Mother migration status 0.00 −0.10 −4.31 0.23
(0.15) (0.17) (5.85) (0.22)
Household income −0.02∗ −0.02 −0.25 −0.01
(0.01) (0.01) (0.57) (0.01)
County average income 0.27∗∗ 0.08 3.96 0.06
(0.09) (0.10) (5.61) (0.10)
County average weight 0.03 −0.04 0.01 0.02
(0.03) (0.03) (1.12) (0.04)
County average height 0.03 0.05 −1.57 −0.01
(0.03) (0.03) (1.21) (0.04)
Male in household with age over 60 −0.06 0.08 0.85 0.06
(0.16) (0.18) (6.54) (0.23)
Female in household with age over 60 0.12 0.30 10.94 0.13
(0.19) (0.21) (7.52) (0.28)
Number of boys in household −0.03 −0.18 9.03∗ 0.09
(0.12) (0.13) (4.20) (0.18)
Number of girls in household −0.10 −0.22 −0.97 −0.03
(0.12) (0.14) (5.21) (0.18)
County average calorie consumption 0.29 0.14 1.53 0.92·
(0.35) (0.39) (14.09) (0.54)
County average protein consumption 0.07 0.29 −4.57 −0.43
(0.25) (0.29) (9.02) (0.38)
Child age −0.04∗ 0.09∗∗∗ 1.23∗ 0.00
(0.02) (0.02) (0.61) (0.03)
R2 0.04 0.10 0.06 0.04
Adj. R2 0.01 0.03 0.02 0.01
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.9b: Fixed effects model results of the effects of the mother’s migration status
on children’s health outcome and care
Calorie Protein Calorie/RDA Protein/RDA
Mother Migration status −113.18 −1.52 −0.06 0.00
(102.70) (3.33) (0.07) (0.07)
Household income −4.11 −0.25 0.00 −0.01
(7.32) (0.24) (0.00) (0.00)
County average income −27.98 −0.03 −0.02 −0.01
(60.68) (1.97) (0.04) (0.04)
County average weight 8.15 0.28 0.01 0.01
(20.03) (0.65) (0.01) (0.01)
County average height 12.55 1.10 0.01 0.02
(20.60) (0.67) (0.01) (0.01)
Male in household with age over 60 198.61· 3.26 0.12· 0.07
(107.65) (3.49) (0.07) (0.07)
Female in household with age over 60 56.99 −1.65 0.03 −0.08
(125.43) (4.07) (0.09) (0.09)
Number of boys in household 30.20 −0.76 0.01 −0.02
(79.17) (2.57) (0.05) (0.05)
Number of girls in household 22.48 −0.85 0.02 −0.02
(82.07) (2.66) (0.06) (0.06)
County average calorie consumption 1155.05∗∗∗ 4.09 0.65∗∗∗ 0.02
(232.03) (7.53) (0.16) (0.16)
County average protein consumption 55.93 26.84∗∗∗ 0.08 0.54∗∗∗
(170.04) (5.52) (0.12) (0.12)
Child age 97.88∗∗∗ 2.60∗∗∗ 0.00 0.00
(10.98) (0.36) (0.01) (0.01)
R2 0.27 0.23 0.09 0.10
Adj. R2 0.07 0.06 0.02 0.03
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.10: First Stage fixed effects Regression Results
                                        Father’s migration  Mother’s migration  Household migration
                                        status              status              status
County level male migration rate        −0.3185∗
                                        (0.1236)
County level female migration rate                          −0.2548∗
                                                            (0.1131)
County level household migration rate                                           −0.3266∗∗
                                                                                (0.1212)
Father’s age                            −0.1506∗                                −0.1538∗
                                        (0.0667)                                (0.0692)
Mother’s age                                                0.0132              0.0260
                                                            (0.0577)            (0.0842)
Household income                        −0.0031             −0.0028             −0.0048
                                        (0.0041)            (0.0030)            (0.0043)
Male in household with age over 60      0.0214              −0.0854∗            −0.0472
                                        (0.0606)            (0.0431)            (0.0628)
Female in household with age over 60    0.0038              0.0902·             0.0456
                                        (0.0702)            (0.0499)            (0.0727)
Number of children in the family        0.0001              −0.0132             −0.0050
                                        (0.0135)            (0.0096)            (0.0140)
County average income                   −0.0353             0.0134              0.0092
                                        (0.0344)            (0.0248)            (0.0363)
Children’s age                          0.1724∗∗            −0.0067             0.1475
                                        (0.0665)            (0.0581)            (0.1086)
R2                                      0.0381              0.0264              0.0373
Adj. R2                                 0.0103              0.0071              0.0100
Num. obs.                               2201                2201                2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
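When reading a first-stage table such as Table 3.10, a common diagnostic is the first-stage F statistic on the excluded instrument. The sketch below is an illustration rather than part of the original analysis: it assumes a single excluded instrument per regression, in which case the F statistic equals the squared t statistic, and it compares the result with the conventional rule-of-thumb threshold of 10. The coefficients and standard errors are the rounded values reported in the table; the dictionary keys and function name are hypothetical.

```python
def first_stage_f(coef: float, se: float) -> float:
    """F statistic for one excluded instrument: the squared t statistic."""
    t = coef / se
    return t * t


# (coefficient, standard error) of the county-level migration rate
# in each first-stage regression of Table 3.10.
first_stage = {
    "father": (-0.3185, 0.1236),
    "mother": (-0.2548, 0.1131),
    "household": (-0.3266, 0.1212),
}

RULE_OF_THUMB = 10.0  # conventional threshold, an assumption of this sketch

f_stats = {name: first_stage_f(b, se) for name, (b, se) in first_stage.items()}
for name, f in f_stats.items():
    flag = "below" if f < RULE_OF_THUMB else "at or above"
    print(f"{name}: F = {f:.2f} ({flag} the rule-of-thumb threshold)")
```

Under these assumptions all three statistics fall below 10, which may help explain the large standard errors in the IV estimates reported in the tables that follow.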
Table 3.11a: Fixed effects model results of the effects of the household migration
status on children’s health outcome and care: IV approach
WAZ HAZ Immunization shots Childcare by non-family member
Household Migration status 2.07 0.19 26.62 0.08
(1.35) (1.14) (39.20) (18.46)
Household income −0.01 −0.02 −0.08 −0.01
(0.02) (0.01) (0.67) (0.16)
County average income 0.27∗ 0.08 5.88 0.05
(0.12) (0.10) (6.61) (0.61)
County average weight 0.05 −0.03 0.98 0.02
(0.04) (0.03) (1.81) (0.36)
County average height 0.10· 0.06 −0.91 −0.01
(0.06) (0.05) (1.60) (0.30)
Male in household with age over 60 0.05 0.10 3.47 0.05
(0.23) (0.19) (7.88) (0.65)
Female in household with age over 60 −0.02 0.28 7.55 0.11
(0.27) (0.23) (9.31) (0.54)
Number of boys in household 0.01 −0.16 10.28∗ 0.06
(0.16) (0.14) (4.60) (1.24)
Number of girls in household −0.23 −0.23 −2.53 −0.04
(0.19) (0.16) (6.08) (2.15)
County average calorie consumption 0.38 0.16 0.24 0.88
(0.47) (0.40) (15.41) (1.19)
County average protein consumption −0.04 0.27 −6.11 −0.37
(0.35) (0.30) (10.02) (1.53)
Child age −0.08∗ 0.08∗∗ 0.27 0.00
(0.03) (0.03) (1.49) (0.39)
R2 0.00 0.09 0.01 0.03
Adj. R2 0.00 0.03 0.00 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.11b: Fixed effects model results of the effects of the household migration
status on children’s health outcome and care: IV approach
Calorie Protein Calorie/RDA Protein/RDA
Household Migration status 1060.38 20.23 0.49 0.20
(799.13) (23.09) (0.50) (0.46)
Household income 2.16 −0.14 0.00 −0.01
(9.80) (0.28) (0.01) (0.01)
County average income −28.57 −0.04 −0.02 −0.01
(72.15) (2.08) (0.05) (0.04)
County average weight 13.89 0.39 0.01 0.01
(24.21) (0.70) (0.02) (0.01)
County average height 46.75 1.74· 0.03 0.03
(34.91) (1.01) (0.02) (0.02)
Male in household with age over 60 268.37∗ 4.53 0.16· 0.08
(135.15) (3.90) (0.08) (0.08)
Female in household with age over 60 −24.43 −3.14 −0.01 −0.09
(158.03) (4.57) (0.10) (0.09)
Number of boys in household 67.18 −0.14 0.03 −0.01
(94.45) (2.73) (0.06) (0.05)
Number of girls in household −42.95 −2.11 −0.01 −0.04
(110.29) (3.19) (0.07) (0.06)
County average calorie consumption 1209.77∗∗∗ 5.11 0.67∗∗∗ 0.03
(278.32) (8.04) (0.17) (0.16)
County average protein consumption −8.44 25.67∗∗∗ 0.05 0.53∗∗∗
(206.16) (5.96) (0.13) (0.12)
Child age 76.80∗∗∗ 2.20∗∗∗ −0.01 −0.01
(20.10) (0.58) (0.01) (0.01)
R2 0.12 0.17 0.03 0.08
Adj. R2 0.03 0.04 0.01 0.02
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.12a: Fixed effects model results of the effects of the father’s migration status
on children’s health outcome and care: IV approach
WAZ HAZ Immunization shots Childcare by non-family member
Father Migration status 2.12 0.87 37.40 −14.95
(1.38) (1.25) (41.49) (229.20)
Household income −0.02 −0.01 −0.05 −0.16
(0.02) (0.01) (0.69) (2.32)
County average income 0.37∗∗ 0.12 6.59 −0.23
(0.14) (0.12) (7.00) (4.43)
County average weight 0.05 −0.03 1.30 0.19
(0.04) (0.04) (1.88) (2.75)
County average height 0.09 0.07 −0.89 −0.28
(0.05) (0.05) (1.56) (4.12)
Male in household with age over 60 −0.09 0.07 2.72 0.45
(0.21) (0.19) (7.71) (6.34)
Female in household with age over 60 0.07 0.27 6.59 −0.41
(0.25) (0.23) (9.70) (8.33)
Number of boys in household −0.08 −0.18 9.10· 0.98
(0.16) (0.14) (4.79) (14.10)
Number of girls in household −0.24 −0.28 −3.02 1.75
(0.19) (0.17) (6.39) (27.34)
County average calorie consumption 0.47 0.22 4.03 0.02
(0.48) (0.43) (16.55) (14.02)
County average protein consumption −0.09 0.21 −8.85 −0.13
(0.35) (0.32) (11.31) (5.04)
Child age −0.09∗ 0.07∗ −0.04 0.35
(0.04) (0.03) (1.52) (5.37)
R2 0.00 0.04 0.00 0.00
Adj. R2 0.00 0.01 0.00 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.12b: Fixed effects model results of the effects of the father’s migration status
on children’s health outcome and care: IV approach
Calorie Protein Calorie/RDA Protein/RDA
Father Migration status 943.30 14.45 0.40 −0.03
(788.91) (22.98) (0.50) (0.47)
Household income −0.29 −0.20 0.00 −0.01
(8.82) (0.26) (0.01) (0.01)
County average income 14.92 0.63 0.00 −0.01
(77.86) (2.27) (0.05) (0.05)
County average weight 15.98 0.40 0.01 0.01
(23.66) (0.69) (0.01) (0.01)
County average height 36.80 1.47· 0.02 0.02
(30.31) (0.88) (0.02) (0.02)
Male in household with age over 60 195.99 3.20 0.12 0.07
(122.17) (3.56) (0.08) (0.07)
Female in household with age over 60 23.31 −2.14 0.01 −0.08
(143.24) (4.17) (0.09) (0.09)
Number of boys in household 26.73 −0.84 0.01 −0.02
(89.86) (2.62) (0.06) (0.05)
Number of girls in household −40.25 −1.81 −0.01 −0.02
(108.06) (3.15) (0.07) (0.06)
County average calorie consumption 1242.30∗∗∗ 5.42 0.68∗∗∗ 0.02
(272.01) (7.92) (0.17) (0.16)
County average protein consumption −22.23 25.66∗∗∗ 0.04 0.55∗∗∗
(201.15) (5.86) (0.13) (0.12)
Child age 76.80∗∗∗ 2.28∗∗∗ 0.00 0.00
(21.05) (0.61) (0.01) (0.01)
R2 0.15 0.21 0.04 0.10
Adj. R2 0.04 0.06 0.01 0.03
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.13a: Fixed effects model results of the effects of the mother’s migration status
on children’s health outcome and care: IV approach
WAZ HAZ Immunization shots Childcare by non-family member
Mother Migration status 2.21 −0.58 21.18 1.76
(2.11) (2.07) (47.55) (4.04)
Household income −0.02 −0.02 −0.26 −0.01
(0.01) (0.01) (0.58) (0.01)
County average income 0.26∗ 0.09 5.20 0.11
(0.11) (0.10) (6.20) (0.17)
County average weight 0.03 −0.04 0.35 0.03
(0.03) (0.03) (1.31) (0.06)
County average height 0.06 0.04 −1.33 0.00
(0.04) (0.04) (1.32) (0.05)
Male in household with age over 60 0.14 0.03 2.87 0.10
(0.27) (0.26) (7.68) (0.29)
Female in household with age over 60 −0.08 0.34 8.66 0.31
(0.29) (0.29) (8.79) (0.56)
Number of boys in household 0.24 −0.24 12.53 0.27
(0.30) (0.29) (7.78) (0.52)
Number of girls in household −0.04 −0.23 −1.51 −0.04
(0.15) (0.15) (5.44) (0.21)
County average calorie consumption 0.39 0.12 −1.02 1.26
(0.42) (0.41) (15.21) (1.08)
County average protein consumption −0.11 0.33 −6.01 −0.87
(0.34) (0.34) (9.63) (1.24)
Child age −0.06∗ 0.09∗∗∗ 0.93 −0.03
(0.02) (0.02) (0.84) (0.07)
R2 0.00 0.09 0.03 0.02
Adj. R2 0.00 0.03 0.01 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.13b: Fixed effects model results of the effects of the mother’s migration status
on children’s health outcome and care: IV approach
Calorie Protein Calorie/RDA Protein/RDA
Mother Migration status 1742.55 26.08 0.95 0.47
(1514.51) (43.02) (0.96) (0.86)
Household income 0.49 −0.19 0.00 −0.01
(9.87) (0.27) (0.01) (0.01)
County average income −37.09 −0.17 −0.02 −0.01
(76.02) (2.09) (0.05) (0.04)
County average weight 6.85 0.26 0.01 0.01
(24.99) (0.69) (0.02) (0.01)
County average height 31.53 1.38· 0.02 0.03
(29.97) (0.83) (0.02) (0.02)
Male in household with age over 60 367.56· 5.77 0.21· 0.11
(192.07) (5.37) (0.12) (0.11)
Female in household with age over 60 −114.50 −4.20 −0.07 −0.12
(209.53) (5.85) (0.13) (0.12)
Number of boys in household 262.89 2.70 0.14 0.04
(213.42) (6.02) (0.14) (0.12)
Number of girls in household 70.07 −0.14 0.04 −0.01
(109.40) (3.02) (0.07) (0.06)
County average calorie consumption 1241.91∗∗∗ 5.39 0.69∗∗∗ 0.04
(297.79) (8.21) (0.19) (0.17)
County average protein consumption −95.71 24.59∗∗∗ 0.00 0.51∗∗∗
(245.26) (6.80) (0.16) (0.14)
Child age 85.21∗∗∗ 2.41∗∗∗ 0.00 −0.01
(17.13) (0.48) (0.01) (0.01)
R2 0.09 0.16 0.01 0.06
Adj. R2 0.03 0.04 0.00 0.02
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.14a: Robustness Check 1: the effects of the household migration status on
children’s health outcome and care without household income as a control variable
WAZ HAZ Immunization shots Childcare by non-family member
Household Migration status 2.09 0.22 26.66 0.48
(1.34) (1.13) (39.02) (12.48)
R2 0.00 0.09 0.01 0.00
Adj. R2 0.00 0.02 0.00 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.14b: Robustness Check 1: the effects of the household migration status on
children’s health outcome and care without household income as a control variable
Calorie Protein Calorie/RDA Protein/RDA
Household Migration status 1056.39 20.48 0.50 0.21
(789.50) (22.87) (0.50) (0.46)
R2 0.12 0.16 0.03 0.08
Adj. R2 0.03 0.04 0.01 0.02
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.15a: Robustness Check 1: the effects of the father’s migration status on
children’s health outcome and care without household income as a control variable
WAZ HAZ Immunization shots Childcare by non-family member
Father Migration status 2.13 0.89 37.42 −8.80
(1.38) (1.25) (41.37) (83.99)
R2 0.00 0.04 0.00 0.00
Adj. R2 0.00 0.01 0.00 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.15b: Robustness Check 1: the effects of the father’s migration status on
children’s health outcome and care without household income as a control variable
Calorie Protein Calorie/RDA Protein/RDA
Father Migration status 943.60 14.66 0.41 −0.02
(785.27) (22.90) (0.50) (0.47)
R2 0.15 0.21 0.04 0.09
Adj. R2 0.04 0.06 0.01 0.02
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.16a: Robustness Check 1: the effects of the mother’s migration status on
children’s health outcome and care without household income as a control variable
WAZ HAZ Immunization shots Childcare by non-family member
Mother Migration status 2.11 −0.67 20.95 2.22
(2.12) (2.11) (47.46) (4.57)
R2 0.00 0.09 0.03 0.01
Adj. R2 0.00 0.02 0.01 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.16b: Robustness Check 1: the effects of the mother’s migration status on
children’s health outcome and care without household income as a control variable
Calorie Protein Calorie/RDA Protein/RDA
Mother Migration status 1745.04 26.43 0.94 0.44
(1533.75) (42.86) (0.97) (0.86)
R2 0.09 0.16 0.01 0.06
Adj. R2 0.03 0.04 0.00 0.02
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.17a: Robustness Check 2: the effects of the household migration status on
children’s health outcome and care without the number of elders as control variables
WAZ HAZ Immunization shots Childcare by non-family member
Household Migration status 2.07 0.19 28.87 0.13
(1.35) (1.15) (39.34) (20.97)
R2 0.00 0.09 0.00 0.02
Adj. R2 0.00 0.02 0.00 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.17b: Robustness Check 2: the effects of the household migration status on
children’s health outcome and care without the number of elders as control variables
Calorie Protein Calorie/RDA Protein/RDA
Household Migration status 1064.49 20.31 0.50 0.20
(802.52) (23.11) (0.50) (0.46)
R2 0.12 0.16 0.02 0.08
Adj. R2 0.03 0.04 0.01 0.02
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.18a: Robustness Check 2: the effects of the father’s migration status on
children’s health outcome and care without the number of elders as control variables
WAZ HAZ Immunization shots Childcare by non-family member
Father Migration status 2.11 0.85 38.06 −14.87
(1.38) (1.25) (41.67) (225.42)
R2 0.00 0.04 0.00 0.00
Adj. R2 0.00 0.01 0.00 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.18b: Robustness Check 2: the effects of the father’s migration status on
children’s health outcome and care without the number of elders as control variables
Calorie Protein Calorie/RDA Protein/RDA
Father Migration status 940.20 14.63 0.40 −0.02
(790.83) (23.02) (0.50) (0.47)
R2 0.15 0.21 0.04 0.09
Adj. R2 0.04 0.06 0.01 0.03
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.19a: Robustness Check 2: the effects of the mother’s migration status on
children’s health outcome and care without the number of elders as control variables
WAZ HAZ Immunization shots Childcare by non-family member
Mother Migration status 2.20 −0.55 23.11 2.30
(2.10) (2.06) (46.78) (5.39)
R2 0.00 0.09 0.03 0.01
Adj. R2 0.00 0.02 0.01 0.00
Num. obs. 2201 2201 1491 1048
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.19b: Robustness Check 2: the effects of the mother’s migration status on
children’s health outcome and care without the number of elders as control variables
Calorie Protein Calorie/RDA Protein/RDA
Mother Migration status 1741.51 26.72 0.95 0.47
(1513.70) (43.79) (0.96) (0.85)
R2 0.09 0.16 0.01 0.05
Adj. R2 0.02 0.04 0.00 0.01
Num. obs. 2201 2201 2201 2201
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.20a: Fixed effects model results of the effects of the household migration
status on children’s health outcome and care on subsamples: IV approach
WAZ HAZ Immunization shots Childcare by non-family member
Household Migration status 3.39 0.08 75.25 −0.44
(Low income household) (3.59) (2.37) (133.99) (1.91)
Household Migration status 3.25 −4.97 N.A.9 N.A.
(High income household) (8.35) (9.78) N.A. N.A.
Household Migration status 2.45 1.68 65.05 N.A.
(Parents with low education level) (1.57) (1.42) (71.59) N.A.
Household Migration status 1.49 0.48 5.30 N.A.
(Child above age 5) (0.98) (0.92) (21.14) N.A.
Household Migration status 4.08 2.79 64.27 2.65
(Child who lives with grandparents) (4.28) (3.80) (89.56) (10.77)
Household Migration status 1.79 −0.90 19.33 1.49
(Child who lives in nuclear family) (1.99) (1.64) (48.11) (7.00)
Household Migration status 2.52 0.42 13.07 −0.07
(North China) (2.04) (1.62) (64.06) (1.74)
Household Migration status 1.87 0.00 24.85 N.A.
(South China) (1.76) (1.55) (45.25) N.A.
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
9 N.A.: the regression results are not available because of missing values for the
immunization shots and childcare variables. In some subsamples, the effective sample
(the individuals with more than one observation in the data) is too small for these
two variables to produce reliable regression results.
Table 3.20b: Fixed effects model results of the effects of the household migration
status on children’s health outcome and care on subsamples: IV approach
Calorie Protein Calorie/RDA Protein/RDA
Household Migration status 3032.80 71.07 1.66 1.09
(Low income household) (2878.64) (70.86) (1.66) (1.22)
Household Migration status −346.98 −18.25 −0.14 −0.25
(High income household) (3812.96) (126.21) (2.66) (2.66)
Household Migration status 1305.09 27.13 0.66 0.39
(Parents with low education level) (931.02) (25.21) (0.56) (0.47)
Household Migration status 1722.03∗ 48.37∗ 0.94· 0.77·
(Child above age 5) (867.25) (24.18) (0.48) (0.40)
Household Migration status 187.62 31.20 0.03 0.45
(Child who lives with grandparents) (1696.92) (55.35) (1.12) (1.05)
Household Migration status 1957.27 31.99 0.99 0.34
(Child who lives in nuclear family) (1513.14) (37.61) (0.89) (0.72)
Household Migration status 1653.92 29.98 0.78 0.35
(North China) (1402.55) (35.05) (0.88) (0.72)
Household Migration status 767.37 18.74 0.40 0.18
(South China) (917.81) (30.35) (0.56) (0.58)
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Table 3.21a: Fixed effects model results of the effects of the father’s migration status
on children’s health outcome and care on subsamples: IV approach
WAZ HAZ Immunization shots Childcare by non-family member
Father Migration status 2.49 1.43 76.41 −2.24
(Low income household) (2.05) (1.90) (93.55) (6.24)
Father Migration status 2.14 −1.77 N.A. N.A.
(High income household) (4.54) (3.99) N.A.10 N.A.
Father Migration status 2.07 2.28 68.83 N.A.
(Parents with low education level) (1.34) (1.51) (69.92) N.A.
Father Migration status 1.38 0.68 11.26 N.A.
(Child above age 5) (0.91) (0.92) (25.00) N.A.
Father Migration status 4.50 5.22 54.56 5.24
(Child who lives with grandparents) (5.91) (6.99) (70.88) (20.42)
Father Migration status 1.79 0.17 38.97 −0.68
(Child who lives in nuclear family) (1.58) (1.30) (49.99) (3.53)
Father Migration status 2.29 1.57 18.89 −3.07
(North China) (1.92) (1.84) (75.66) (7.08)
Father Migration status 2.42 0.85 37.96 N.A.
(South China) (2.22) (1.88) (41.46) N.A.
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
10 N.A.: the regression results are not available because of missing values for the
immunization shots and childcare variables. In some subsamples, the effective sample
(the individuals with more than one observation in the data) is too small for these
two variables to produce reliable regression results.
Table 3.21b: Fixed effects model results of the effects of the father’s migration status
on children’s health outcome and care on subsamples: IV approach
Calorie Protein Calorie/RDA Protein/RDA
Father Migration status 2029.88 45.86 1.10 0.64
(Low income household) (1476.19) (37.03) (0.88) (0.67)
Father Migration status −1831.69 −89.18 −1.44 −2.19
(High income household) (3122.73) (127.10) (2.30) (2.97)
Father Migration status 1286.81 31.95 0.66 0.42
(Parents with low education level) (859.56) (24.21) (0.52) (0.44)
Father Migration status 1354.02· 36.92· 0.72· 0.54
(Child above age 5) (749.43) (20.53) (0.41) (0.33)
Father Migration status −37.47 58.81 0.06 1.00
(Child who lives with grandparents) (2240.88) (87.28) (1.49) (1.65)
Father Migration status 1322.39 11.15 0.56 −0.18
(Child who lives in nuclear family) (1009.21) (26.99) (0.60) (0.58)
Father Migration status 1318.14 17.74 0.57 0.04
(North China) (1284.92) (32.55) (0.83) (0.70)
Father Migration status 886.57 21.65 0.41 0.02
(South China) (1065.57) (34.84) (0.64) (0.66)
***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.
Chapter 4
Conclusion
In Chapter 2, I develop and estimate a two-period ability-learning structural
model to provide a more complete picture of the college market by including
community colleges as a viable pathway to bachelor’s degrees. The results show
that the market does not discriminate against transfer students: the effect of
transfer on future income is not statistically different from zero, which coincides
with the finding of Kane and Rouse (1995). This suggests that the only cost of
transfer is the direct transfer cost, which is the main barrier to college transfer.
The estimation results also show that family income has a significant effect on
college choices, which provides evidence that students tend to start in community
colleges when facing financial constraints. Finally, the results support the idea
that the return to ability is higher in universities than in community colleges.
An immediate extension is to consider jointly the strategies of colleges
and students. Schools may set different strategies for admitting high school
graduates and transfer students. A dynamic general equilibrium model that takes
into account both sides of the college admission market would give a more complete
picture of the decision-making process and the underlying driving forces. Another
extension is to modify the model by allowing for heterogeneous risk aversion. This
can be achieved by employing constant relative risk aversion (CRRA) utility and
allowing the risk aversion coefficient to differ across individuals. The extension
can help us understand the diversity of college choices and college preferences
from another perspective.
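To make the proposed extension concrete, the following minimal sketch evaluates constant relative risk aversion (CRRA) utility for individuals with different risk aversion coefficients. It is an illustration only and not part of the estimated model: the function names, the payoff lottery, and the coefficient values are all hypothetical.

```python
import math


def crra_utility(c: float, rho: float) -> float:
    """CRRA utility: c**(1 - rho) / (1 - rho), or log(c) when rho equals 1."""
    if c <= 0:
        raise ValueError("consumption must be positive")
    if math.isclose(rho, 1.0):
        return math.log(c)
    return c ** (1.0 - rho) / (1.0 - rho)


def certainty_equivalent(outcomes, probs, rho: float) -> float:
    """Sure payoff that gives the same expected CRRA utility as the lottery."""
    if math.isclose(rho, 1.0):
        return math.exp(sum(p * math.log(c) for c, p in zip(outcomes, probs)))
    moment = sum(p * c ** (1.0 - rho) for c, p in zip(outcomes, probs))
    return moment ** (1.0 / (1.0 - rho))


# Hypothetical lottery over future income (arbitrary units), equal odds.
outcomes, probs = [20.0, 60.0], [0.5, 0.5]

# Heterogeneous risk aversion: each individual has a different coefficient.
for rho in (0.0, 1.0, 2.0):
    ce = certainty_equivalent(outcomes, probs, rho)
    print(f"rho = {rho}: certainty equivalent = {ce:.2f}")
```

In this sketch a more risk-averse individual (higher coefficient) values the same uncertain payoff less, which is one channel through which heterogeneous risk aversion could generate different college choices.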
Chapter 3 studied left-behind children’s health outcomes, including
height-for-age Z-score (HAZ), weight-for-age Z-score (WAZ), daily calorie intake,
daily protein intake, the number of immunization shots received by children, and
whether children have been sick during the survey year. The evidence presented
above showed that children with migrant parents did not necessarily have poorer
health outcomes than children who lived with both parents. The regression results
on subsamples showed that fathers’ migration had significant positive effects on
nutrient intake for children between 5 and 10 years of age. This suggests that the
positive effects of parents’ migration can outweigh the negative effects. The
negative effects of parents’ migration on children’s health are possibly compensated
by better access to nutrition information and products, care from grandparents, and
the remittances that migrant parents are able to provide.
We have explored the possible mechanisms that may lead to better access to
nutritional information. Future research should examine whether parental migration
affects the social support that children receive and how children’s health outcomes
vary with the duration of parents’ migration.
Appendix A
Appendix to Chapter 2
A.1 Bayesian Update in Ability Learning Process
Bayesian update after receiving high school GPA: To update the
distribution of αik for all 0 ≤ k ≤ J , we make use of Equation (2.10),
HsGPAi = µ0 +
J∑
j=0
µ1j · αij + εHsij . (A.1)
which is equivalent to saying
HsGPAi − µ0 −∑
j 6=k µ1j · αij
µ1k= αik +
εHsij
µ1k. (A.2)
As the prior distribution of αijs are defined in Equation (2.8), which can be rewritten
as
αij = mj + χij + εαij , where εij ∼ N(0, σ2α)
We can substitute the preceding equation back to Equation (A.2), so we have
HsGPAi − µ0 −∑
j 6=k µ1j · (mj + χij)
µ1k= αik +
εHsij +
∑
j 6=k µ1j · εαij
µ1k. (A.3)
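Let
$$\tilde{\alpha}^{Hs}_{ik} = \frac{\mathit{HsGPA}_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,(m_j + \chi_{ij})}{\mu_{1k}}, \qquad \tilde{\varepsilon}^{Hs}_{ik} = \frac{\varepsilon^{Hs}_{ij} + \sum_{j \neq k} \mu_{1j}\,\varepsilon^{\alpha}_{ij}}{\mu_{1k}},$$
where
$$\tilde{\varepsilon}^{Hs}_{ik} \sim N\!\left(0,\ \frac{\sigma^2 + \sigma^2_{\alpha} \sum_{j \neq k} \mu^2_{1j}}{\mu^2_{1k}}\right).$$
The tilde marks the signal $\tilde{\alpha}^{Hs}_{ik}$ and its noise, to distinguish the signal from the
posterior mean $\alpha^{Hs}_{ik}$ defined next.
Therefore the posterior distribution of student ability after receiving high school GPA
is
$$\alpha_{ik} \sim N(\alpha^{Hs}_{ik},\ \sigma^2_{Hs,k}), \tag{A.4}$$
where
$$\alpha^{Hs}_{ik} = \frac{(m_k + \chi_{ik}) \cdot \dfrac{\sigma^2 + \sigma^2_{\alpha} \sum_{j \neq k} \mu^2_{1j}}{\mu^2_{1k}} + \tilde{\alpha}^{Hs}_{ik} \cdot \sigma^2_{\alpha}}{\dfrac{\sigma^2 + \sigma^2_{\alpha} \sum_{j \neq k} \mu^2_{1j}}{\mu^2_{1k}} + \sigma^2_{\alpha}},$$
$$\sigma^2_{Hs,k} = \frac{1}{\dfrac{1}{\left(\sigma^2 + \sigma^2_{\alpha} \sum_{j \neq k} \mu^2_{1j}\right) / \mu^2_{1k}} + \dfrac{1}{\sigma^2_{\alpha}}}.$$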
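The update above is a standard normal-normal conjugate step, and a small numerical sketch may help in checking the algebra. The code below is only an illustration under stated assumptions: the function names `hs_signal` and `normal_update` and every number are hypothetical and do not come from the estimation code, and the list `prior_means` stands in for $m_j + \chi_{ij}$.

```python
def normal_update(prior_mean, prior_var, signal, noise_var):
    """Posterior mean and variance of a normal prior after one noisy normal signal."""
    post_mean = (prior_mean * noise_var + signal * prior_var) / (noise_var + prior_var)
    post_var = 1.0 / (1.0 / noise_var + 1.0 / prior_var)
    return post_mean, post_var


def hs_signal(hs_gpa, mu0, mu1, prior_means, k, sigma2, sigma2_alpha):
    """Signal about ability k and its noise variance, in the spirit of (A.3)."""
    others = [j for j in range(len(mu1)) if j != k]
    # Remove the intercept and the expected contribution of the other abilities.
    signal = (hs_gpa - mu0 - sum(mu1[j] * prior_means[j] for j in others)) / mu1[k]
    # Noise: GPA shock plus the prior uncertainty about the other abilities.
    noise_var = (sigma2 + sigma2_alpha * sum(mu1[j] ** 2 for j in others)) / mu1[k] ** 2
    return signal, noise_var


# Hypothetical values, chosen only for illustration.
mu0, mu1 = 1.0, [0.5, 0.8]
prior_means = [2.0, 2.5]          # stands in for m_j + chi_ij, j = 0, 1
sigma2, sigma2_alpha = 0.25, 1.0

signal, noise_var = hs_signal(3.6, mu0, mu1, prior_means, k=1,
                              sigma2=sigma2, sigma2_alpha=sigma2_alpha)
post_mean, post_var = normal_update(prior_means[1], sigma2_alpha, signal, noise_var)
```

In this example the posterior mean lies between the prior mean and the signal, and the posterior variance is smaller than both the prior variance and the noise variance, as the formulas imply.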
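Bayesian update after receiving SAT score: To update the distribution
of $\alpha_{ik}$ for all $0 \le k \le J$, we use Equation (2.11),
$$\mathit{SAT}_i = \mu_0 + \sum_{j=0}^{J} \mu_{1j}\,\alpha_{ij} + \varepsilon^{SAT}_{ij}, \tag{A.5}$$
which is equivalent to
$$\frac{\mathit{SAT}_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,\alpha_{ij}}{\mu_{1k}} = \alpha_{ik} + \frac{\varepsilon^{SAT}_{ij}}{\mu_{1k}}. \tag{A.6}$$
From Equation (A.4), the ability $\alpha_{ij}$ can be rewritten as
$$\alpha_{ij} = \alpha^{Hs}_{ij} + \varepsilon^{Hs}_{ij}, \quad \text{where } \varepsilon^{Hs}_{ij} \sim N(0, \sigma^2_{Hs,j}).$$
Substituting this expression for each $\alpha_{ij}$, $j \neq k$, into Equation (A.6) gives
$$\frac{\mathit{SAT}_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,\alpha^{Hs}_{ij}}{\mu_{1k}} = \alpha_{ik} + \frac{\varepsilon^{SAT}_{ij} + \sum_{j \neq k} \mu_{1j}\,\varepsilon^{Hs}_{ij}}{\mu_{1k}}. \tag{A.7}$$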
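Let
$$\tilde{\alpha}^{SAT}_{ik} = \frac{\mathit{SAT}_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,\alpha^{Hs}_{ij}}{\mu_{1k}}, \qquad \tilde{\varepsilon}^{SAT}_{ik} = \frac{\varepsilon^{SAT}_{ij} + \sum_{j \neq k} \mu_{1j}\,\varepsilon^{Hs}_{ij}}{\mu_{1k}},$$
where
$$\tilde{\varepsilon}^{SAT}_{ik} \sim N\!\left(0,\ \frac{\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,j}\,\mu^2_{1j}}{\mu^2_{1k}}\right).$$
The tilde marks the signal $\tilde{\alpha}^{SAT}_{ik}$ and its noise, to distinguish the signal from the
posterior mean $\alpha^{SAT}_{ik}$ defined next.
Therefore the posterior distribution of student’s ability after receiving SAT score is
$$\alpha_{ik} \sim N(\alpha^{SAT}_{ik},\ \sigma^2_{SAT,k}), \tag{A.8}$$
where
$$\alpha^{SAT}_{ik} = \frac{\alpha^{Hs}_{ik} \cdot \dfrac{\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,j}\,\mu^2_{1j}}{\mu^2_{1k}} + \tilde{\alpha}^{SAT}_{ik} \cdot \sigma^2_{Hs,k}}{\dfrac{\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,j}\,\mu^2_{1j}}{\mu^2_{1k}} + \sigma^2_{Hs,k}},$$
$$\sigma^2_{SAT,k} = \frac{1}{\dfrac{1}{\left(\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,j}\,\mu^2_{1j}\right) / \mu^2_{1k}} + \dfrac{1}{\sigma^2_{Hs,k}}}.$$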
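Bayesian update after receiving college GPA: From Equation (2.12),
we have
$$\kappa_{ijt} = \alpha_{ij} + \varepsilon^{\kappa}_{it}, \quad \text{where } \varepsilon^{\kappa}_{it} \sim N(0, \sigma^2_{\kappa}). \tag{A.9}$$
The posterior distribution of student’s ability after receiving college GPA is
$$\alpha_{ik} \sim N(\alpha^{Col}_{ik},\ \sigma^2_{Col,k}), \tag{A.10}$$
where
$$\alpha^{Col}_{ik} = \frac{\alpha^{SAT}_{ik} \cdot \sigma^2_{\kappa} + \kappa_{ikt} \cdot \sigma^2_{SAT,k}}{\sigma^2_{\kappa} + \sigma^2_{SAT,k}}, \qquad \sigma^2_{Col,k} = \frac{1}{\dfrac{1}{\sigma^2_{\kappa}} + \dfrac{1}{\sigma^2_{SAT,k}}}.$$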
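The college GPA step can be checked numerically in the same way. The sketch below assumes that the college GPA $\kappa$ is the new signal and that the SAT-stage posterior is the prior, which is the standard normal-normal setup. All numbers and the function name `update` are hypothetical. It writes the update in precision form and compares it with a weighted-average form of the kind used in (A.10).

```python
def update(mean, var, signal, noise_var):
    """One normal-normal update written in precision (inverse-variance) form."""
    new_var = 1.0 / (1.0 / var + 1.0 / noise_var)
    new_mean = new_var * (mean / var + signal / noise_var)
    return new_mean, new_var


# Hypothetical SAT-stage posterior and college GPA signal.
alpha_sat, sigma2_sat = 2.2, 0.44
kappa, sigma2_kappa = 3.0, 0.30

alpha_col, sigma2_col = update(alpha_sat, sigma2_sat, kappa, sigma2_kappa)

# The same posterior mean as a variance-weighted average of prior mean and signal.
alpha_col_alt = (alpha_sat * sigma2_kappa + kappa * sigma2_sat) / (sigma2_kappa + sigma2_sat)
```

The two forms agree, and each additional signal adds its precision to the posterior precision, so the posterior variance shrinks at every stage.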
A.2 Estimation Details
A.2.1 The Closed Form of P (si1|·)
Define the probability: To find the closed form of $P(s_{i1}|\cdot)$, I use a
property of the Extreme Value Type I distribution. In the model, the preference
shocks $\varepsilon^{S}_{ij1}$ in Equation (2.4) follow an Extreme Value Type I distribution with
location parameter zero and scale parameter $\tau$. These shocks enter the value
function at period 1 ($V_1(\cdot)$, defined in Equation (2.16)) through the utility of
attending school ($U^S(\cdot)$, defined in Equation (2.4)). By this property of the Extreme
Value Type I distribution, the probability of attending college $s_{i1}$ at period 1 is given by¹
$$P(s_{i1} \mid w_{i3}, \kappa_{i1}, \kappa_{i2}, X_i, \{\nu_j\}_{j \in J}) = \frac{\exp\!\left(\frac{1}{\tau}\bar{V}_1(X_i, I_{i1}, s_{i1})\right)}{\sum_{j \ge 0} \exp\!\left(\frac{1}{\tau}\bar{V}_1(X_i, I_{i1}, j)\right)}. \tag{A.11}$$
• Here $\bar{V}_1(X_i, I_{i1}, j)$ is the value function at period 1 without the preference shock
$\varepsilon^{S}_{ij1}$ in $U^S(\cdot)$. To be more specific, define
$$\bar{U}^S(X_i, s_{i1}) = \ln(\xi(\cdot)) + \nu_{ij},$$
and
$$\bar{V}_1(X_i, I_{i1}, s_{i1}) = \bar{U}^S(X_i, s_{i1}) + E\!\left[\max_{j \in C(s_{i1})} \left[V_2(X_i, I_{i2}, j)\right] \,\middle|\, I_{i1}, s_{i1}\right].$$
Equivalently,
$$U^S(X_i, s_{i1}) = \bar{U}^S(X_i, s_{i1}) + \varepsilon^{S}_{ij1}, \quad \text{and} \quad V_1(X_i, I_{i1}, j) = \bar{V}_1(X_i, I_{i1}, j) + \varepsilon^{S}_{ij1}.$$
In the denominator of Equation (A.11), the sum of $\exp\!\left(\frac{1}{\tau}\bar{V}_1(X_i, I_{i1}, j)\right)$ runs over
$j \ge 0$ because the outside option of working ($j = -1$) is not an option in period 1 (all
students in this data set have received postsecondary education).
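Equation (A.11) is a multinomial logit formula with scale $\tau$. The following sketch shows one numerically stable way to evaluate it. The function name `choice_probabilities` and the example values are hypothetical, and subtracting the maximum before exponentiating is a standard numerical device, not something described in the text.

```python
import math


def choice_probabilities(values, tau):
    """Logit probabilities exp(V_j / tau) / sum_l exp(V_l / tau), computed stably."""
    scaled = [v / tau for v in values]
    m = max(scaled)                          # subtract the max to avoid overflow
    weights = [math.exp(s - m) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical shock-free period-1 values for j = 0 (community college), 1, 2.
v_bar = [10.0, 10.5, 9.0]
probs = choice_probabilities(v_bar, tau=0.5)
```

A smaller $\tau$ makes the choice probabilities more sensitive to differences in the value functions, while a larger $\tau$ pushes them toward equal shares.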
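Details of deriving the closed form of $\bar{V}_1(\cdot)$: To find the closed form
of the value function at period 1, I need the expected maximum of the value
function at period 2, $E\!\left[\max_{j \in C(s_{i1})} \left[V_2(X_i, I_{i2}, j)\right] \mid I_{i1}, s_{i1}\right]$. The expectation is
taken over the distribution of the error terms $\varepsilon^{S}_{ij2}$ and $\varepsilon^{W}_{it}$. Because $\varepsilon^{S}_{ij2}$ and $\varepsilon^{W}_{it}$ all
follow an Extreme Value Type I distribution with location parameter zero and scale
parameter $\tau$, the expectation has a closed form. Writing $\bar{V}_2$ for the period-2 value
function without its preference shock,
$$\begin{aligned}
\bar{V}_1(X_i, I_{i1}, s_{i1}) &= \bar{U}^S(X_i, s_{i1}) + E\!\left[\max_{j \in C(s_{i1})} \left[V_2(X_i, I_{i2}, j)\right] \,\middle|\, I_{i1}, s_{i1}\right] \\
&= \bar{U}^S(X_i, s_{i1}) + \int \left[\tau\iota + \tau \log\Big\{\sum_{j \in C(s_{i1})} \exp\!\left(\tfrac{1}{\tau}\bar{V}_2(X_i, I_{i2}, j)\right)\Big\}\right] dK(\{\alpha_{ij}\}_{j \in C(s_{i1})}) \\
&= \bar{U}^S(X_i, s_{i1}) + \tau\iota + \tau \int \log\Big\{\sum_{j \in C(s_{i1})} \exp\!\left(\tfrac{1}{\tau}\bar{V}_2(X_i, I_{i2}, j)\right)\Big\}\, dK(\{\alpha_{ij}\}_{j \in C(s_{i1})}).
\end{aligned} \tag{A.12}$$
• Here $\iota \approx 0.5772$ is Euler’s constant.
• $dK(\{\alpha_{ij}\}_{j \in C(s_{i1})})$ is the joint distribution of the $\alpha_{ij}$ for $j \in C(s_{i1})$.
¹ Details available in Domencich and McFadden (1975, Chapter 4).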
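In the following, I use a first-order Taylor expansion to approximate the integral. To be
more specific, let
$$f(\alpha_i) = \log\Big\{\sum_{j \in C(s_{i1})} \exp\!\left(\tfrac{1}{\tau}\bar{V}_2(X_i, I_{i2}, j)\right)\Big\},$$
where $\bar{V}_2$ is the period-2 value function without its preference shock, and expand $f(\cdot)$
around $\alpha^{SAT}_i$, the posterior mean of $\alpha_i$ before the college enrollment decision at period 1:
$$f(\alpha_i) \approx f(\alpha^{SAT}_i) + \frac{f^{(1)}(\alpha^{SAT}_i)\,(\alpha_i - \alpha^{SAT}_i)}{1!}.$$
Because $\alpha_i - \alpha^{SAT}_i$ has mean zero under $K$,
$$\int f(\alpha_i)\, dK(\alpha_i) \approx f(\alpha^{SAT}_i) + \frac{f^{(1)}(\alpha^{SAT}_i) \times 0}{1!} = f(\alpha^{SAT}_i).$$
As a result, Equation (A.12) has the form
$$\bar{V}_1(X_i, I_{i1}, s_{i1}) = \bar{U}^S(X_i, s_{i1}) + \tau\iota + \tau f(\alpha^{SAT}_i).$$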
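The closed form used above, that the expected maximum over alternatives with i.i.d. Extreme Value Type I shocks equals $\tau\iota + \tau \log \sum_j \exp(V_j/\tau)$, can be verified by simulation. The sketch below is an illustration only: the function names, the example values, and the Monte Carlo check are assumptions introduced here, not part of the estimation routine.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def expected_max(values, tau):
    """Closed form E[max_j (V_j + eps_j)] for i.i.d. Gumbel(0, tau) shocks."""
    m = max(values)
    log_sum = m / tau + math.log(sum(math.exp((v - m) / tau) for v in values))
    return tau * EULER_GAMMA + tau * log_sum


def simulated_max(values, tau, draws, seed=0):
    """Monte Carlo check: draw Gumbel shocks by inverting the CDF exp(-exp(-x/tau))."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        best = -math.inf
        for v in values:
            u = max(rng.random(), 1e-300)    # guard against log(0)
            shock = -tau * math.log(-math.log(u))
            best = max(best, v + shock)
        total += best
    return total / draws


# Hypothetical shock-free period-2 values for three alternatives.
v2_bar = [1.0, 1.5, 0.2]
closed = expected_max(v2_bar, tau=0.5)
approx = simulated_max(v2_bar, tau=0.5, draws=100000)
```

With a single alternative the formula reduces to $V + \tau\iota$, which is the mean of a Gumbel variable shifted by $V$.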
A.2.2 The Closed Form of P (si2|·)
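For the same reason, because the $\varepsilon^{S}_{ij2}$ follow an Extreme Value Type I distribution
with location parameter zero and scale parameter $\tau$, the probability of attending
college $s_{i2}$ at period 2 is given by
$$L^{S}_{i2} = P(s_{i2} \mid s_{i1}, X_i, \kappa_{i1}, \kappa_{i2}, \{\lambda^{\mu}_j\}_{j \in J}, \{\nu_j\}_{j \in J}) = \frac{\exp\!\left(\frac{1}{\tau}\bar{V}_2(X_i, I_{i2}, s_{i2})\right)}{\sum_{j \in C(s_{i1})} \exp\!\left(\frac{1}{\tau}\bar{V}_2(X_i, I_{i2}, j)\right)}.$$
• $\bar{V}_2(X_i, I_{i2}, s_{i2})$ is defined analogously to $\bar{V}_1(X_i, I_{i1}, s_{i1})$, that is, as the
period-2 value function without the preference shock. It is related to $V_2(\cdot)$ by
$$V_2(X_i, I_{i2}, j) = \bar{V}_2(X_i, I_{i2}, j) + \varepsilon^{S}_{ij2}.$$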
A.2.3 The Closed Form of f(wit|·)
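$f(w_{it}|\cdot)$ is easy to compute, because the random variable $\varepsilon^{W}_{it}$ follows an Extreme
Value Type I distribution with location parameter zero and scale parameter $\tau$:
\begin{align}
f(w_{it}|\cdot) &= f\left\{\varepsilon^{W}_{it} = \ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}\right\} \tag{A.13} \\
&= \exp\!\left[-\frac{\ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}}{\tau}\right] \tag{A.14} \\
&\quad \times \exp\!\left\{-\exp\!\left[-\frac{\ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}}{\tau}\right]\right\}, \tag{A.15}
\end{align}
where
• $w_{it}$ is the observed student wage at time $t$,
• $\widehat{\ln(w_{it})}$ is the predicted logarithm of the student wage at time $t$ using Equation
(2.1),
• $\varepsilon_i$ follows $N(0, 1)$,
• $\alpha^{Col}_{i,D_i}$ and $\sigma^2_{Col,D_i}$ are defined in (A.10).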
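A short sketch of how the wage contribution could be evaluated for a given draw of $\varepsilon_i$ follows. It uses the standard Extreme Value Type I (Gumbel) density with location zero and scale $\tau$, which carries a $1/\tau$ normalizing factor. The function names and all numerical inputs are hypothetical.

```python
import math


def gumbel_pdf(x, tau):
    """Standard Extreme Value Type I density with location 0 and scale tau."""
    z = x / tau
    return math.exp(-z - math.exp(-z)) / tau


def wage_density(log_wage, predicted_log_wage, rho2, eps, sigma2_col, tau):
    """Back out the wage shock as in (A.13) and evaluate its Gumbel density."""
    shock = log_wage - predicted_log_wage - rho2 * eps * sigma2_col
    return gumbel_pdf(shock, tau)


# Hypothetical inputs for one student, one period, and one simulation draw.
density = wage_density(log_wage=2.0, predicted_log_wage=1.8,
                       rho2=0.5, eps=0.4, sigma2_col=0.25, tau=0.5)
```

The density is largest when the implied shock is zero and is skewed to the right, which is the usual shape of the Gumbel distribution.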
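To see Equation (A.13), note that the wage equation (Equation (2.1)) gives
$$\ln\!\left(w_t(\alpha_{i,D_i}, s_{i1}, D_i)\right) = \rho_{1,D_i} + \rho_{2,D_i}\,\alpha_{i,D_i} + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2\,\mathit{Expr}_{it} + \gamma_3\,\mathit{Expr}^2_{it} + \varepsilon^{W}_{it}, \quad t = 2, \cdots, T.$$
Substituting $\alpha_{i,D_i} = \alpha^{Col}_{i,D_i} + \varepsilon_i \cdot \sigma^2_{Col,D_i}$ (Equation (A.10)) into the wage equation, I
have
$$\ln\!\left(w_t(\alpha_{i,D_i}, s_{i1}, D_i)\right) = \rho_{1,D_i} + \rho_{2,D_i}\left(\alpha^{Col}_{i,D_i} + \varepsilon_i \cdot \sigma^2_{Col,D_i}\right) + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2\,\mathit{Expr}_{it} + \gamma_3\,\mathit{Expr}^2_{it} + \varepsilon^{W}_{it}, \quad t = 2, \cdots, T.$$
The predicted logarithm of the wage is
$$\widehat{\ln}\!\left(w_t(\alpha_{i,D_i}, s_{i1}, D_i)\right) = \rho_{1,D_i} + \rho_{2,D_i}\,\alpha^{Col}_{i,D_i} + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2\,\mathit{Expr}_{it} + \gamma_3\,\mathit{Expr}^2_{it}, \quad t = 2, \cdots, T.$$
Subtracting the predicted log wage from the preceding equation and solving for $\varepsilon^{W}_{it}$, I have
$$\varepsilon^{W}_{it} = \ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}.$$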
A.3 The Closed Form of f(κit|·):
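$$f(\kappa_{it}|\cdot) = \phi\!\left(\frac{\kappa_{it} - \alpha^{SAT}_{i,s_{i1}}}{\left(\sigma^2_{\kappa} + \sigma^2_{SAT,s_{i1}}\right)^{0.5}}\right), \tag{A.16}$$
where $s_{i1}$ is the student’s school choice at period 1, and $\alpha^{SAT}_{i,s_{i1}}$ and $\sigma^2_{SAT,s_{i1}}$ are
defined in (A.8).
To see (A.16), for $j \ge 0$ ($j = 0$ indicates community college; $j > 0$ indicates
four-year universities), we have
$$\alpha_{ij} = \alpha^{SAT}_{i,s_{i1}} + \varepsilon_{SAT,s_{i1}} \quad \text{(derived from Equation (A.8))},$$
where $\varepsilon_{SAT,s_{i1}} \sim N(0, \sigma^2_{SAT,s_{i1}})$. It follows that
$$\begin{aligned}
\kappa_{ijt} &= \alpha_{ij} + \varepsilon^{\kappa}_{ij} && \text{(Equation (2.12))} \\
&= \alpha^{SAT}_{i,s_{i1}} + \varepsilon_{SAT,s_{i1}} + \varepsilon^{\kappa}_{ij} && \text{(substituting } \alpha_{ij}\text{)},
\end{aligned}$$
so that, with independent error terms, $\kappa_{ijt}$ is normally distributed with mean $\alpha^{SAT}_{i,s_{i1}}$ and
variance $\sigma^2_{\kappa} + \sigma^2_{SAT,s_{i1}}$.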
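To find the value of the likelihood function $L_i(\cdot)$, for each individual I draw
shocks $\{\{\lambda^{\mu}_{ijr}\}_{j \in J}, \{\nu_{ijr}\}_{j \in J}, \{\varepsilon_j\}_{j \in J}\}_{r=1}^{R}$ from their joint distribution
$G(\{\lambda^{\mu}_j\}_{j \in J}, \{\nu_j\}_{j \in J}, \{\varepsilon_j\}_{j \in J})$. The likelihood function is approximated by
$$\frac{1}{R} \sum_{r=1}^{R} P^{r}(s_{i1}, s_{i2} \mid w_{i3}, \kappa_{i1}, \kappa_{i2}, X_i, \{\nu_j\}_{j \in J}) \times f^{r}(w_{i3}, \kappa_{i1}, \kappa_{i2} \mid \{\lambda^{\mu}_j\}_{j \in J}, \{\varepsilon_j\}_{j \in J}).$$
The likelihood function for students who complete only one period of education
is similar to Equation (2.18), except that the contribution from working starts at
time two and there is only one period’s contribution of $\kappa_{it}$ ($L^{\kappa}_{i1}$).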
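The simulated likelihood above is a plain average of $P^r \times f^r$ over $R$ draws. The sketch below shows that structure for one individual. Everything in it is a hypothetical stand-in: the function names, the scalar shock, and the toy choice probability and density are placeholders for the model’s actual components, which are not reproduced here.

```python
import math
import random


def simulated_likelihood(choice_prob, outcome_density, draw_shocks, n_draws, seed=0):
    """Average P^r * f^r over n_draws simulated shock draws for one individual."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        shocks = draw_shocks(rng)            # one draw from the joint distribution G
        total += choice_prob(shocks) * outcome_density(shocks)
    return total / n_draws


# Toy stand-ins (hypothetical): a single scalar shock per draw.
def draw_shock(rng):
    return rng.gauss(0.0, 1.0)


def toy_choice_prob(shock):
    return 1.0 / (1.0 + math.exp(-(0.5 + shock)))      # binary logit probability


def toy_density(shock):
    resid = 1.2 - 0.3 * shock
    return math.exp(-0.5 * resid ** 2) / math.sqrt(2.0 * math.pi)


L_i = simulated_likelihood(toy_choice_prob, toy_density, draw_shock, n_draws=500)
```

Fixing the seed keeps the draws identical across evaluations of the likelihood, so the simulated objective does not change between optimizer iterations for reasons other than the parameters.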
Bibliography
[1] Altonji, J. (1993): “The Demand for and Return to Education When Education
Outcomes are Uncertain,” Journal of Labor Economics, 11(1), 48-83.
[2] Bao, Shuming and Orn B. Bodvarsson, Jack W. Hou, and Yaohui Zhao (2009).
“Migration in China from 1985-2000-The Effects of Past Migration, Investments,
and Deregulation.” The Chinese Economy, 42(4), 7-28.
[3] Belzil, C. and J. Hansen (2002): “Unobserved Ability and the Return To Schooling,”
Econometrica, 70(5), 2075-2091.
[4] Brauw, Alan D. and Ren Mu. 2011. “Migration and the Overweight and Underweight
Status of Children in Rural China.” Food Policy, 36(1), 88-100.
[5] Cai, F., Albert Park, and Yaohui Zhao. 2008. “The Chinese Labor Market in the
Reform Era.” In: Brandt, L., and Tom Rawski (Eds.), China’s Economic Transition:
Origins, Mechanisms, and Consequences. Cambridge University Press: Cambridge.
[6] Campbell, R. and B. Siegel (1967): “The Demand for Higher Education in the
United States, 1919-1964,” American Economic Review, 57(3), 482-94.
[7] Chen, Chunming. 2000. “Fat Intake and Nutritional Status of Children in China.”
American Journal of Clinical Nutrition, 72(5S), 1368S-1372S.
[8] Chen, Xinxin, Qiuqiong Huang, Scott Rozelle, Yaojiang Shi, and Linxiu Zhang.
2009. “Effect of Migration on Children’s Educational Performance in Rural
China.” Comparative Economic Studies, 51(3), 323-343.
[9] Chen, S. (2008): “Estimating the Variance of Wages in the Presence of Selection
and Unobservable Heterogeneity,” Review of Economics and Statistics, 90(2),
275-289.
[10] Chinese Nutrition Society, 2000. Dietary Reference Intakes, Beijing: Chinese
Light Industry Press, 2000.
[11] Cunha, F., J. J. Heckman, and S. Navarro (2005): “Separating Uncertainty from
Heterogeneity in Life Cycle Earnings,” Oxford Economic Papers, 57(2), 191-261.
[12] Czepiel, S. (2002): “Maximum Likelihood Estimation of Logistic Regression
Models: Theory and Implementation,” [online], Available:
http://czep.net/stat/mlelr.pdf (May 1, 2011).
[13] de Brauw, Alan, and J. Giles. 2008. “Migrant labor markets and the welfare
of rural households in the developing world: evidence from China.” World Bank
Policy Research Working Paper 4585.
[14] de Brauw, Alan, and Ren Mu. 2011. “Migration and the Overweight and Underweight
Status of Children in Rural China.” Food Policy, 36(1), 88-100.
[15] Domencich, T. A. and D. McFadden (1975): “Urban Travel Demand: a Behavioral
Analysis,” North-Holland Publishing Company, Amsterdam.
[16] Doyle, W. (2009): “The Effect of Community College Enrollment on Bachelor’s
Degree Completion,” Economics of Education Review, 28(2), 199-206.
[17] Du, Shufa, Tom A. Mroz, Fengying Zhai, and Barry M. Popkin. 2004. “Rapid
Income Growth Adversely Affects Diet Quality in China - Particularly for the
Poor!” Social Science and Medicine, 59(7), 1505-1515.
[18] Du, Yang, Albert Park, and Sangui Wang. 2005. “Migration and Rural Poverty
in China.” Journal of Comparative Economics, 33(4), 688-709.
[19] Dunning, Thad, Freedman, A. David. 2008. “Modeling selection effects.” S.P.
Handbook of social science methodology. Sage.
[20] Epple, D., R. Romano and H. Sieg (2006): “Admission, Tuition, and Financial
Aid Policies in The Market for Higher Education,” Econometrica, 74(4), 885-928.
[21] Fu, C (2010): “Equilibrium Tuition, Applications, Admissions and Enrollment
in the College Market,” Working Paper, University of Wisconsin-Madison.
[22] Galper, H. and R. M. Dunn (1969): “A Short-Run Demand Function for Higher
Education in the United States,” Journal of Political Economy, 77(5), 765-777.
[23] Giles, John. 2006. “Is Life More Risky in the Open? Household Risk-Coping
and the opening of China’s Labor Markets.” Journal of Development Economics,
81(1), 25-60.
[24] Hilmer, M (1998): “Post-Secondary Fees and the Decision to Attend a University
or a Community College,” Journal of Public Economics, 67, 329-348.
[25] Kane, T and C. Rouse (1993): “Labor-Market Returns to Two- and Four-Year
College,” The American Economic Review, 85(3), 600-614.
[26] Leslie, L.L. and P.T. Brinkman (1987): “Student Price Response in Higher
Education: The Student Demand Studies,” Journal of Higher Education, 55,
181-204.
[27] Liu, Hong, Hai Fang and Zhong Zhao. 2012. “Urban-rural disparities of child
health and nutritional status in China from 1989 to 2006.” Economics and Human
Biology, doi: 10.1016/j.ehb.2012.04.010.
[28] Liang, Zai, and Zhongdong Ma. 2004. “China’s Floating Population: New Evidence
from the 2000 Census.” Population and Development Review, 30(3), 467-488.
[29] Mallee, Hein. 1995. “China’s Household Registration System under Reform.”
Development and Change, 26, 1-29.
[30] Mu, Ren, and Dominique van de Walle. 2011. “Left Behind to Farm? Women’s
Labor Reallocation in Rural China.” Labour Economics, 18(S1), S83-S97.
[31] National Bureau of Statistics of China. 2012. “Year 2011 Report
on the Rural-Urban Labor Migration in China.” stats.gov.cn,
http://www.stats.gov.cn/tjfx/fxbg/t20120427_402801903.htm.
[32] Osberg, Lars, Jiaping Shao and Kuan Xu. 2009. “The Growth of Poor Children in
China 1991-2000: Why Food Subsidies May Matter.” Health Economics, 18(S1),
S89-S108.
[33] Popkin, Barry M, Shufa Du, Fengying Zhai and Bing Zhang. 2010. “Cohort
Profile: The China Health and Nutrition Survey-monitoring and understanding
socio-economic and health change in China, 1989-2011.” International Journal of
Epidemiology, 39(6), 1435-1440.
[34] Rozelle, Scott, Li Guo, Minggao Shen, Amelia Hughart, John Giles. 1999. “Leaving
China’s Farms: Survey Results of New Paths and Remaining Hurdles to Rural
Migration.” China Quarterly, 158, 367-393.
[35] Sandy J., A. Gonzalez, and M. Hilmer (2006): “Alternative Paths to College
Completion: Effect of Attending a 2-year School on the Probability of Completing
a 4-year Degree”, Economics of Education Review, 25(5), 463-471.
[36] Shen, Tiefu, Jean-Pierre Habicht, and Ying Chang. 1996. “Effect of Economic
Reforms on Child Growth in Urban and Rural Areas of China.” The New England
Journal of Medicine, 335(6), 400-406.
[37] Stock, Wright, and Yogo. 2002. “A Survey of Weak Instruments and Weak
Identification in Generalized Method of Moments.” Journal of Business and
Economic Statistics, 20(4), 518-529.
[38] Svedberg, Peter. 2006. “Declining Child Malnutrition: a Reassessment.”
International Journal of Epidemiology, 35(5), 1336-1346.
[39] World Bank. 2009. From Poor Areas to Poor people: China’s Evolving Poverty
Reduction Agenda. Washington, DC.
[40] Zhang, Shu, 2012. Migration and Children’s Health: Evidence From Rural
China. Department of Economics, University of Houston Working paper.
[41] Zhao, Yaohui. 1999. “Leaving the Countryside: Rural-to-Urban Migration in China.”
American Economic Review, 89(2), 281-286.
Curriculum Vitae
Xiaochen Xu was born in Henan, China on March 19, 1984. She obtained a
B.Sc. in Actuarial Science from the University of Calgary, Calgary, Canada in 2006.
She obtained an M.Phil. in Statistics from the University of Hong Kong, Hong Kong in
2008, and entered the Ph.D. program in Economics at the Johns Hopkins University
in 2008.