
ESSAYS ON COLLEGE TRANSFER IN THE U.S.

AND CHILDREN’S WELFARE IN CHINA

by

Xiaochen Xu

A dissertation submitted to The Johns Hopkins University in conformity

with the requirements for the degree of Doctor of Philosophy

Baltimore, Maryland

August, 2013

© 2013 Xiaochen Xu

All Rights Reserved

Abstract

This dissertation studies college transfer in the U.S. and children’s welfare in

China. In Chapter 2, I develop and estimate a two-period ability-learning structural

model to analyze the determinants and consequences of college transfer. Students

make college entry and transfer decisions under different financial constraints and

uncertainty about their abilities. In period 1, students choose between community

colleges and universities, and in period 2, they make transfer decisions. I estimate

the structural parameters of the model using data from the Beginning Postsecondary

Students Longitudinal Study (BPS:04/09), with simulated maximum likelihood. I

further examine the extent to which the effectiveness of the transfer function of

community colleges can be improved with three counterfactual experiments. They

include increasing university tuition costs, eliminating transfer costs, and increasing

academic preparedness. The experiments suggest that transfer costs are the main

barrier to college transfer.

Chapter 3 studies the impact of labor migration on children’s health in

China. Labor migration, which frequently results in family separations, is widely

known as one of the main ways of alleviating poverty in developing countries. In

China, migrant workers helped build the Chinese dream in cities across the country.

But for their children, who are left behind in the countryside, potential problems in

their physical and social development are becoming a national issue.

This study uses data collected as part of the China Health and Nutrition Survey

(CHNS) in 2000, 2004, 2006, and 2009 to identify the impact of parents’ migration

on the health outcomes of children in rural China. The measurements of child health

outcomes are weight-for-age Z-score (WAZ), height-for-age Z-score (HAZ), nutrient

intake (consumption of calories and protein), the number of immunization shots that

children get in the survey year and child-care. To identify the effect of parental

migration on child health, we instrumented parents’ migration status with county

level historical average migration rates. We found there were few significant effects

of parents’ migration on child health outcomes.

Keywords: College transfer, tuition, uncertainty, Bayesian inference,

heterogeneity, children’s health, labor migration, fixed effects

model

JEL Classification: A22, C11, C13, C51, I14, I15

Advisors: Professor Robert Moffitt

Professor Yingyao Hu


Acknowledgements

I am deeply indebted to Robert Moffitt for his guidance and encouragement

on this project. I benefited greatly from the comments of Yingyao Hu. I thank

Przemek Jeziorski, Tiemen Woutersen, and seminar participants at Johns Hopkins

for their helpful comments. The usual disclaimer applies. Comments are welcome.


Contents

Abstract
Acknowledgements
List of Tables
1 Introduction
2 The Determinants and Consequences of College Transfer
    2.1 INTRODUCTION
    2.2 STYLIZED FACTS
    2.3 MODEL SPECIFICATION
        2.3.1 Overview
        2.3.2 Primitives
        2.3.3 Ability learning process
        2.3.4 Value function
    2.4 ESTIMATION STRATEGY AND IDENTIFICATION
        2.4.1 Estimation
        2.4.2 Identification
    2.5 DATA
    2.6 ESTIMATION RESULTS
        2.6.1 Parameters estimates
        2.6.2 Model fit
    2.7 POLICY SIMULATION
        2.7.1 Increase tuition fees in universities
        2.7.2 Improved academic preparedness
        2.7.3 Decrease the transfer cost
        2.7.4 No transfer cost
    2.8 CONCLUSION
3 The Impact of Labor Migration on Children’s Health: Evidence from Rural China
    3.1 INTRODUCTION
    3.2 BACKGROUND
        3.2.1 Labor Migration and Children Left Behind in Rural China
        3.2.2 Health of Children in China
    3.3 CONCEPTUAL FRAMEWORK
    3.4 DATA
    3.5 EMPIRICAL SPECIFICATION
    3.6 ESTIMATION RESULTS
        3.6.1 Results of Ordinary Least Squares model
        3.6.2 Results of Fixed Effects model
        3.6.3 Results of Fixed Effects model with instrument variable
    3.7 ROBUSTNESS CHECK
    3.8 REGRESSION RESULTS ON SUBSAMPLES
    3.9 CONCLUSION
4 Conclusion
A Appendix to Chapter 2
    A.1 Bayesian Update in Ability Learning Process
    A.2 Estimation Details
        A.2.1 The Closed Form of P(si1|·)
        A.2.2 The Closed Form of P(si2|·)
        A.2.3 The Closed Form of f(wit|·)
    A.3 The Closed Form of f(κit|·):
Bibliography
Curriculum Vitae

List of Tables

2.8 Possible education paths for a student starting from a four-year university
2.9 Possible education paths for a student starting from a community college
2.1 Percentage enrollment in period 1
2.2 Percentage of transfer in period 2
2.3 Average tuition fee
2.4 Average high school GPA and SAT score (normalized)
2.5 Average college GPA for transfer students
2.6 Average family income
2.7 Average family income for transfer students
2.10 Estimation results
2.11 Enrollment rate in period 1 in model fit
2.12 Enrollment rate in period 2 in model fit
2.13 Transfer rate in period 2 in model fit
2.14 Enrollment rate in period 1 in experiment study 1
2.15 Transfer rate in period 2 in experiment study 1
2.17 Transfer rate in period 2 for current student in experiment study 1
3.1 Parents Migration Rate for Children under age ten (CHNS)
3.2a Descriptive Statistics (CHNS)
3.2b Descriptive Statistics (CHNS)
3.2c Descriptive Statistics (CHNS)
3.3 Descriptive Statistics (CHNS) of Control Variables
3.4a OLS regression results: the effects of the household migration status
3.4b OLS regression results: the effects of the household migration status
3.5a OLS regression results: the effects of the father’s migration
3.5b OLS regression results: the effects of the father’s migration
3.6a OLS regression results: the effects of the mother’s migration
3.6b OLS regression results: the effects of the mother’s migration
3.7a Fixed effects model results of the effects of the household migration status on children’s health outcome and care
3.7b Fixed effects model results of the effects of the household migration
3.8a Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care
3.8b Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care
3.9a Fixed effects model results of the effects of the mother’s migration
3.9b Fixed effects model results of the effects of the mother’s migration
3.10 First Stage fixed effects Regression Results
3.11a Fixed effects model results of the effects of the household migration status on children’s health outcome and care: IV approach
3.11b Fixed effects model results of the effects of the household migration
3.12a Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care: IV approach
3.12b Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care: IV approach
3.13a Fixed effects model results of the effects of the mother’s migration
3.13b Fixed effects model results of the effects of the mother’s migration
3.14a Robustness Check 1: the effects of the household migration status on children’s health outcome and care without household income as a control variable
3.14b Robustness Check 1: the effects of the household migration status on
3.15a Robustness Check 1: the effects of the father’s migration status on
3.15b Robustness Check 1: the effects of the father’s migration status on
3.16a Robustness Check 1: the effects of the mother’s migration status on
3.16b Robustness Check 1: the effects of the mother’s migration status on
3.17a Robustness Check 2: the effects of the household migration status on children’s health outcome and care without the number of elders as control variables
3.17b Robustness Check 2: the effects of the household migration status on
3.18a Robustness Check 2: the effects of the father’s migration status on
3.18b Robustness Check 2: the effects of the father’s migration status on
3.19a Robustness Check 2: the effects of the mother’s migration status on
3.19b Robustness Check 2: the effects of the mother’s migration status on
3.20a Fixed effects model results of the effects of the household migration status on children’s health outcome and care on subsamples: IV approach
3.20b Fixed effects model results of the effects of the household migration status on children’s health outcome and care on subsamples: IV approach
3.21a Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care on subsamples: IV approach
3.21b Fixed effects model results of the effects of the father’s migration status on children’s health outcome and care on subsamples: IV approach

Chapter 1

Introduction

Education and health are important topics in labor economics. A person’s

welfare is closely related to their education and health. This dissertation consists of

two essays in education and health. The first essay examines the determinants and

consequences of college transfer using a structural modeling approach. The second

essay examines the impact of labor migration on the health outcomes and care of

children using a fixed effects model.

The literature on the college market mainly draws on non-structural ap-

proaches. For instance, Hilmer (1998) studied the effects of tuition on college trans-

fer. He suggested that financial concerns are not the most influential for students

making college decisions. Kane and Rouse (1993) examined the consequences of

college transfer on labor market return and found similar returns to two-year and

four-year college credits. The findings of this paper support Kane and Rouse’s

findings. However, a few studies do employ structural models. One is

Fu (2010), who proposed and empirically implemented a market equilibrium model

for college education, focusing on application strategies. Admission, net tuition, and

enrollment were joint outcomes in the model. Belzil and Hansen (2002) also used

a structural model to examine study choices impacting duration of schooling, while


taking into account heterogeneous abilities. Nonetheless, the results in both papers

may be biased because they ignored a large proportion of transfer students.

Chapter 2 contains the first attempt to estimate a structural model for

college enrollment and student transfer decisions subject to students’ uncertainty of

abilities. It examines the determinants and consequences of college entry and transfer

decisions through an ability-learning structural model, in which school qualities and

students’ future wages are considered. Using such a model, I can provide insight into

the determinants of college enrollment and transfer decisions, permitting quantitative

evaluation of the effects of counterfactual changes in the college market. The model

explains the allocation of students to different schools. More importantly, I can give

interpretations of the driving forces of and the barriers to college transfer.

This essay contributes to the current literature by simultaneously modeling

four aspects of the decision to enroll in college that are important for empirical

analysis. The first is the uncertainty of student abilities. A student has an a priori

belief about her abilities, informed by her high school GPA, SAT scores and college

GPA. At the same time, the student has private information on her abilities, which

cannot be observed by econometricians. The second aspect is that college transfer

is costly for students, in both monetary and non-monetary terms. Monetary costs

include application fees and relocation costs, while non-monetary costs include the

time spent in school searches and the loss of non-transferable credits. The third is

the effect of college transfer on future income. This includes the empirical fact that

transfer students earn less than non-transfer students who graduate from the same

university. The fourth aspect is the heterogeneity of students with regard to their

family backgrounds, abilities, and school preferences. Students make different college

enrollment and transfer decisions based on their backgrounds, school preferences, and

expected abilities.


Chapter 3 aims to establish the overall consequences of parental migration

on the health outcomes and care of their left-behind children. Some of the eco-

nomic literature focusing on labor migration in China suggests that the remittances

forwarded to families by migrated members benefit the households financially. For

instance, Du et al. (2006) and de Brauw and Giles (2008) found that labor migra-

tion increased family consumption level. There are few papers that study the health

outcomes of left-behind children in China. One of them is Mu and de Brauw (2011),

which examined the weight of left-behind children, and found that older children

(7-12 years) were more likely to be underweight in migrant households than those

who lived in nonmigrant households. Shu Zhang (2012) used survey data from the

2000 wave of the China Health and Nutrition Survey (CHNS) to study the impact of

labor migration on children’s health. She found no significant health outcome effects

for children whose fathers had migrated. Neither paper, however, considers the

potential endogeneity of parental migration with respect to children’s health. Therefore, their

results might be biased.

The main methodological obstacle to quantifying the effect of parental migration

is the endogeneity problem, or the potential for reverse causation. Instead

of being affected by their parents’ migration status, a child’s health status could be a

critical factor for their parents when making migration decisions. To overcome the

endogeneity problem, we used instrumental variables (IV) estimation. To be more

specific, we instrumented people’s migration status with the historical county level

migration rate. The historical county level migration rate is a suitable indicator

to reflect the local culture and network of migration, where the network refers to

a person’s exposure to migration information from her migrated friends or family

members.
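The logic of the instrumental variables strategy can be illustrated with a minimal Python sketch on simulated data. Everything in it is an assumption made for illustration: the variable names, the data-generating process, and the numerical values are hypothetical and are not drawn from the CHNS, and the sketch uses a simple cross-sectional ratio estimator rather than the fixed effects IV specification estimated in Chapter 3. It shows only why an instrument that shifts migration but is unrelated to unobserved child frailty can remove the bias that contaminates a naive regression.

```python
import random


def covariance(a, b):
    """Sample covariance of two equal-length lists."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    return sum((ai - mean_a) * (bi - mean_b) for ai, bi in zip(a, b)) / len(a)


def ols_slope(x, y):
    """Slope from a regression of y on x with an intercept."""
    return covariance(x, y) / covariance(x, x)


def iv_slope(z, x, y):
    """IV slope with one instrument z, one endogenous regressor x, outcome y."""
    return covariance(z, y) / covariance(z, x)


def simulate(n, true_effect, seed=0):
    """Hypothetical data: child frailty lowers both health and parental migration."""
    rng = random.Random(seed)
    z, x, y = [], [], []
    for _ in range(n):
        county_rate = rng.uniform(0.0, 0.6)  # hypothetical historical county migration rate
        frailty = rng.gauss(0.0, 1.0)        # unobserved by the econometrician
        # Parents are less likely to migrate when the child is frail (reverse causation).
        latent = 3.0 * county_rate - 0.5 * frailty + rng.gauss(0.0, 1.0)
        migrated = 1.0 if latent > 0.9 else 0.0
        # Health outcome (for example a z-score); frailty enters the error term.
        health = true_effect * migrated - 0.8 * frailty + rng.gauss(0.0, 1.0)
        z.append(county_rate)
        x.append(migrated)
        y.append(health)
    return z, x, y


if __name__ == "__main__":
    z, x, y = simulate(n=20000, true_effect=-0.2)
    print("naive OLS estimate:", round(ols_slope(x, y), 3))
    print("IV estimate:       ", round(iv_slope(z, x, y), 3))
```

In this simulated setting the naive estimate is pushed upward because healthier children are more likely to have migrant parents, while the IV estimate stays close to the assumed true effect.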

Chapter 3 contributes to the literature in a number of ways. Firstly, we


used novel instrumental variables dealing with the endogenous nature of parents’

migration decisions, which are able to predict the migration propensity of parents.

Secondly, we studied different effects of father’s and mother’s migration status on

child health outcomes, which were significantly different. Thirdly, in addition to

traditional measurements of child health that focus on height and weight, we also

considered nutrient intake (consumption of calories and protein), immunization shots

and childcare. These measures provided a more comprehensive picture of the impact

of labor migration on child health.

The remainder of this dissertation consists of two essays and the conclusion.

The concluding chapter summarizes contributions of the essays and discusses avenues

for future research.


Chapter 2

The Determinants and Consequences of College Transfer

2.1 INTRODUCTION

As college tuition continues to rise, the issue of financing a college educa-

tion is attracting widespread scholarly interest and generating much public policy

debate. According to the Beginning Postsecondary Students Longitudinal Study

(BPS: 04/09), 27% of freshmen in the United States are in community colleges, and

of those, 32% eventually transfer to universities. In spite of the importance of community

colleges, neither the causes of nor the barriers to college transfer have received

much attention.

This paper is the first to estimate a structural model for college enrollment

and transfer decisions subject to students’ uncertainty of abilities. Existing papers in-

volving similar topics mainly rely on non-structural approaches. For instance, Hilmer

(1998) studies the effects of tuition increases on the determinants of college transfer


and shows that the enlarged tuition gap between community colleges and universities

pushes more students to community colleges. The estimation results of my model,

however, suggest that financial concerns are not the most influential factors that af-

fect students’ college decisions. Kane and Rouse (1993) examine the consequences

of college transfer on labor market return and find similar returns to two-year and

four-year college credits, which coincides with the findings in this paper.

However, there are a few studies which do employ structural models. One

is Fu (2010), who proposes and empirically implements a market equilibrium model

for college education and focuses on application strategies. Admission, net tuition,

and enrollment are the joint outcomes. The estimation results reveal the existence of

substantial heterogeneity in students’ preferences for colleges. Hence, they make

different application and enrollment decisions. Nonetheless, Fu’s results may be

biased for two reasons. The first is that she ignores a large proportion of transfer

students. Second, Fu’s study is premised on the dubious assumption that preferences

are not linked to future wages. Another structural study is Belzil and Hansen (2002), who study

choices involving years of schooling while taking into account heterogeneous abilities.

Under their model, preferences for schools are linked to future wages. Students choose

the years of schooling subject to the uncertainties of their abilities. The researchers

strongly reject the null hypothesis that unobserved market ability is uncorrelated with

realized schooling attainments, which underlies many previous studies that have used

OLS to estimate the return to schooling. However, one flawed assumption

that Belzil and Hansen (2002) make is that all schools are identical. They do not

distinguish among schools by qualities and tuition, and do not allow for factors other

than years in school to affect returns to school, which could bias the estimation

results.

This paper examines the determinants and consequences of college entry


and transfer decisions through an ability-learning structural model, in which school

qualities and students’ future wages are considered. Using such a model, I can provide

insight into the determinants of college enrollment and transfer decisions and conduct a

quantitative evaluation of the effects of counterfactual changes in the college market.

The model explains the allocation of students to different schools. More importantly,

I can give interpretations of the driving forces of and the barriers to college transfer.

This paper contributes to the current literature by simultaneously modeling

four aspects of the decision to enroll in college that are important for empirical

analysis. The first is the uncertainty of student abilities. A student has an a priori

belief about her abilities, and updates such belief based on her high school GPA,

SAT scores and college GPA. At the same time, the student has private information

on her abilities, which cannot be observed by econometricians. The second aspect

is that college transfer is costly for students; the costs are both monetary and non-

monetary. Monetary costs include application fees and relocation costs, while non-

monetary costs come from the time spent in school searches and the loss of non-

transferable credits. The third is the effect of college transfer on future income,

which captures the empirical fact that transfer students earn less than non-transfer

students conditional on graduating from the same university. The fourth aspect is

the heterogeneity of students with regard to their family backgrounds, abilities, and

school preferences. Students make different college enrollment and transfer decisions

based on their backgrounds, school preferences, and expected abilities.

The structural model in this paper is a two-period model. In period 1, a stu-

dent chooses between community colleges and universities based on her expectation

of her abilities, family background, and school preferences. In period 2, she refines

her expectations about her abilities using her college GPA, and makes transfer or

drop-out decisions. A student enrolled in a community college in period 1

can choose to work with an associate’s degree or transfer to a university to get a

bachelor’s degree; a student enrolled in a university in period 1 can choose to

study in her original university, transfer to another university, or drop out. Students

who decide to transfer incur the transfer cost.
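The belief-refinement step in period 2 can be sketched in Python under one common simplifying assumption, namely that both the prior belief about ability and the noise in each academic signal are normally distributed. This normal-normal form, the function names, and the numbers are assumptions made for illustration only; the specification actually used in this chapter is the one set out in Section 2.3.3 and Appendix A.1.

```python
def update_belief(prior_mean, prior_var, signal, signal_var):
    """One normal-normal Bayesian update of a belief about ability.

    The posterior mean is a precision-weighted average of the prior mean
    and the observed signal; the posterior variance always shrinks.
    """
    weight = prior_var / (prior_var + signal_var)  # weight placed on the new signal
    post_mean = prior_mean + weight * (signal - prior_mean)
    post_var = prior_var * signal_var / (prior_var + signal_var)
    return post_mean, post_var


def sequential_update(prior_mean, prior_var, signals, signal_var):
    """Apply the update once per signal, in the order the signals arrive."""
    mean, var = prior_mean, prior_var
    for signal in signals:
        mean, var = update_belief(mean, var, signal, signal_var)
    return mean, var


if __name__ == "__main__":
    # Hypothetical normalized signals: high school GPA, SAT score, college GPA.
    signals = [0.4, 0.8, 1.2]
    mean, var = sequential_update(0.0, 1.0, signals, signal_var=1.0)
    print("posterior mean:", round(mean, 3), "posterior variance:", round(var, 3))
```

In this hypothetical example a strong college GPA raises the student's expected ability and narrows her uncertainty, which is the channel through which a period 2 transfer decision can differ from the period 1 enrollment decision.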

To estimate the model, I use simulated maximum likelihood estimation.

The data come from the Beginning Postsecondary Students Longitudinal Study

(BPS:04/09) from the National Center for Education Statistics (NCES). The data

provide detailed information on students’ high school GPA, SAT scores, college GPA,

family background, school level, and school name. Tuition information is derived from

the Integrated Postsecondary Education Data System (IPEDS).
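The idea behind simulated maximum likelihood can be illustrated with a toy Python sketch. It is not the estimator used in this chapter: the one-parameter binary choice model, the logistic form, the grid search, and all names and values below are assumptions made for illustration. The sketch shows the general technique, in which a choice probability that depends on an unobserved term (here, private information about ability) is approximated by averaging over simulated draws of that term, and the parameter is chosen to maximize the resulting simulated log-likelihood.

```python
import math
import random


def logistic(v):
    """Numerically stable logistic function."""
    if v >= 0:
        return 1.0 / (1.0 + math.exp(-v))
    e = math.exp(v)
    return e / (1.0 + e)


def simulated_choice_prob(theta, score, draws):
    """Average the conditional choice probability over simulated draws."""
    return sum(logistic(theta * score + eta) for eta in draws) / len(draws)


def simulated_log_likelihood(theta, scores, choices, draws):
    """Simulated log-likelihood of observed university (1) or community college (0) choices."""
    total = 0.0
    for score, choice in zip(scores, choices):
        p = simulated_choice_prob(theta, score, draws)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        total += math.log(p) if choice == 1 else math.log(1.0 - p)
    return total


def simulate_data(n, theta, seed=0):
    """Hypothetical students: observed test score plus unobserved private information."""
    rng = random.Random(seed)
    scores, choices = [], []
    for _ in range(n):
        score = rng.gauss(0.0, 1.0)
        eta = rng.gauss(0.0, 1.0)  # private information, unobserved by the econometrician
        choice = 1 if rng.random() < logistic(theta * score + eta) else 0
        scores.append(score)
        choices.append(choice)
    return scores, choices


def estimate_theta(scores, choices, n_draws=50, seed=1):
    """Grid search for theta, holding the simulation draws fixed across candidate values."""
    rng = random.Random(seed)
    draws = [rng.gauss(0.0, 1.0) for _ in range(n_draws)]
    grid = [i / 10 for i in range(0, 21)]  # candidate values 0.0, 0.1, ..., 2.0
    return max(grid, key=lambda t: simulated_log_likelihood(t, scores, choices, draws))


if __name__ == "__main__":
    scores, choices = simulate_data(n=1000, theta=1.0)
    print("estimated theta:", estimate_theta(scores, choices))
```

Holding the simulation draws fixed while the parameter varies keeps the simulated objective smooth in the parameter, which is why the same draws are reused for every candidate value in the grid.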

The estimated model fits the data well, and suggests new ways to interpret

the data. The model generates a rich set of predictions: it matches not only the

static composition of students attending community colleges, public universities and

private universities, but also the dynamic transitions of students among schools.

Interestingly, the model also captures the actual relation between family background

and the education outcome. For instance, large educational differences by family

background are still predicted by the model even though family income and parent

education are assumed to be uncorrelated with returns to school. That suggests

much of the intergenerational transmission of education can be captured through

parents’ influence on student financial resources during college, rather than through

differential access to credit or returns to school.

Some of my major findings are as follows: (a) transfer costs are large and

are the main barrier to college transfer; (b) transfer does not have a significant effect

on student incomes, which suggests that the market does not discriminate against

transfer students; (c) private information on student abilities does not significantly

influence college choices or account for the difference between the expected and actual

labor market outcomes; (d) family income and parental education level do have

significant effects on the choice of a college by determining student access to financial

resources during college.

In this paper, I use three counterfactual experiments to examine the extent

to which the effectiveness of college enrollment can be improved. In the first exper-

iment, I increase the tuition in all universities by 20%, while keeping the tuition in

community colleges the same. In the second experiment, I increase both high school

GPA and SAT scores by 0.5 standard deviations. In the third experiment, I set

transfer costs to zero.
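The three experiments can be written down as changes to a baseline policy environment, as in the following Python sketch. The `Scenario` class and its field names are hypothetical conveniences introduced here, not part of the estimation code; the sketch records only the changes stated above and leaves out the step of re-simulating student choices from the estimated model under each scenario.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scenario:
    """Hypothetical description of the policy environment faced by students."""
    university_tuition_multiplier: float = 1.0
    community_college_tuition_multiplier: float = 1.0
    test_score_shift_sd: float = 0.0  # shift in high school GPA and SAT, in standard deviations
    transfer_cost_multiplier: float = 1.0


BASELINE = Scenario()


def experiment_1(scenario):
    """Raise tuition in all universities by 20%; community college tuition is unchanged."""
    return replace(scenario, university_tuition_multiplier=1.20)


def experiment_2(scenario):
    """Raise both high school GPA and SAT scores by 0.5 standard deviations."""
    return replace(scenario, test_score_shift_sd=0.5)


def experiment_3(scenario):
    """Set transfer costs to zero."""
    return replace(scenario, transfer_cost_multiplier=0.0)


if __name__ == "__main__":
    for run in (experiment_1, experiment_2, experiment_3):
        print(run.__name__, run(BASELINE))
```

Each experiment changes one element of the environment relative to the baseline, so differences in simulated enrollment and transfer rates can be attributed to that element alone.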

The transfer rates from community colleges to universities increase in all

three experiments, with the largest increase in experiment 3. Eliminating the transfer

cost does the most to improve the efficiency of the transfer function of community

colleges. The transfer rate from community colleges to public universities increases

from 3.8% to 26.2%, while the transfer rate from community colleges to private universities

increases from 2.4% to 17.5%, which shows that the main barrier to transfer

is the high transfer cost. Moreover, the university completion rate is the highest

when there is no transfer cost. Simulation studies suggest that decreasing transfer

costs (through cooperative agreements between community colleges and universities)

is the most efficient way to encourage students to attend community colleges and

increase the completion rate at universities.

It is important to improve the effectiveness of the transfer function of community

colleges. Because the average expenditure per full-time student in a

community college is much less than that in a university, the overall cost of higher education

could be greatly reduced if more students attended community colleges with

transfer programs. Therefore the counterfactual studies have clear policy implica-

tions for improving the efficiency of the college market, and reducing the educational


costs for both individuals and local governments.

The rest of the paper proceeds as follows. Stylized facts on postsecondary

education are presented in Section 2.2 and the model specification in Section 2.3. The

estimation strategy is presented in Section 2.4, with a brief discussion of identification.

The data and summary statistics are described in Section 2.5. Section 2.6

reports the estimation results together with a brief discussion of the model fit.

Finally, three counterfactual experiments are described in Section 2.7. Some model

and estimation details are given in the appendix.

2.2 STYLIZED FACTS

Before developing the model, I will first describe the stylized facts of post-

secondary education that the model should replicate. To do this, I use the Beginning

Postsecondary Students Longitudinal Study (BPS:04/09) from the National Center

for Education Statistics (NCES). In each cycle, the study followed a cohort of students

enrolling in postsecondary education for the first time. Members were initially

surveyed at the end of their first academic year (2003-04) and invited to participate

in follow-up surveys at the end of their third (2005-06) and sixth (2008-09) years

after entering postsecondary education. The final BPS:04/09 dataset contains

information on nearly 16,700 students.

• Around 27% of freshmen start their postsecondary education in community

colleges

In Table 2.1, we can see that over one fourth of students choose to begin their

college education in community colleges. The enrollment rate in private universities

is about the same as the enrollment rate in public ones.

• Around 12% of students are transfer students

Table 2.2 shows that 40% of community college students

transfer to universities. From these statistics, we can see that the transfer rate

from community colleges to public universities is twice as high as the transfer

rate from community colleges to private universities, possibly because financial

constraints are usually one of the main concerns for transfer students.

Therefore, these individuals are more likely to transfer to less expensive uni-

versities. The average tuition is given in Table 2.3.

• Students who started in universities have higher average high school GPA and

SAT scores than those who started in community colleges

In Table 2.4, we can see that both average SAT score and high school GPA are

the highest in private universities, while the average test scores in community

colleges are the lowest, which implies that test scores play an important role

when students make school choices for their tertiary education.

• Community college transfer students have a high average college GPA relative to

those who do not transfer

Table 2.5 gives the average GPA for transfer students. The differential may

suggest that the driving force of the transfer behaviors between universities is

different from the driving force of transfer behavior from community colleges

to universities. While the transfers between universities are possibly driven by

ability mismatch (poor performance in current universities), the transfers from


community colleges to universities are mainly due to improved expected ability

in current schools.

• Students who started in universities have higher average family income than

those who started in community colleges

From Table 2.6, we see that the cohorts of students in different types of schools

exhibit heterogeneity in family income. The average family income for transfer

students is given in Table 2.7. It is worthwhile to point out that the students

who transfer from community colleges are on average from poorer families than

their counterparts who transfer from universities.

2.3 MODEL SPECIFICATION

2.3.1 Overview

In this model, students are uncertain about their abilities. They can only infer

their abilities from their academic performance in previous periods. In period

1,[1] a student forms an expectation of her abilities based on her high school GPA and

SAT scores. She makes her school choice between community colleges and different

universities based on her family background, school preferences, and expected

abilities. In period 2, the student updates her belief about her abilities based on her college

GPA. She then decides whether or not to transfer based on this belief. If the student

starts in a community college in period 1, she can choose to work with an associate’s

degree, or she can transfer to a university. If the student starts in a four-year

university in period 1, she can choose to transfer to another university, stay in the

current university, or drop out without any degree. There are two potential costs for

transfer students. The first is the direct transfer cost, which includes the application

fee, the time that students spend on school search and application, moving costs, and

the loss of some nontransferable credits. The second is the possible negative effect on

future wages if the student transfers upward. Here, an upward transfer refers

to a transfer from a community college to a university, or from a university with low

tuition to a university with high tuition. Labor market discrimination against transfer

students may stem from the difference in the quality of education received by

transfer students and non-transfer students, which is explained further in Section

2.3.2. In this paper, tuition is a proxy for school quality. However, the estimation

results will be the same if we use average SAT score as a proxy for school quality. From

the summary statistics, we can see that private universities have higher tuition than

public universities. At the same time, the average SAT score in private universities

is also higher. Using either tuition or SAT as a proxy, an upward transfer between

universities is one from a public university to a private university. To see

the options more clearly, I list all the school choices in Tables 2.8 and 2.9.

[1] The time line is formally introduced in Section 2.3.2.

Table 2.8: Possible education paths for a student starting from a four-year university

Period 1: Four-year university

Period 2: Choice 1: stay in the current university for another period and get a bachelor’s degree.

          Choice 2: transfer to another four-year university and study for another period for a bachelor’s degree.

          Choice 3: drop out without any degree.

In this model, if a student plans to earn a bachelor’s degree, there are two

ways to achieve it. One is to attend the destination university from the beginning.

Table 2.9: Possible education paths for a student starting from a community college

Period 1: Community college

Period 2: Choice 1: work with an associate’s degree.

          Choice 2: transfer to a university and study for another period to get a bachelor’s degree.

The other is to transfer to the destination university after one period at another

institution. The advantage of transferring is that the student may save some tuition

if the destination university has higher costs. The disadvantages of transferring

include transfer costs and possible negative effects on future earnings.

More importantly, after a student learns about her abilities from her test

scores in the first period, a school transfer can potentially lead to a better match be-

tween school choice and student’s individual abilities. Under the model, it is possible

for a community college student to transfer to a university if she believes that her

abilities qualify her for the university and will earn her a higher return in the

future. For a similar reason, a student starting at a university has the option

to drop out or transfer to another university if she believes that her abilities do not

match well with the current university and that other options may give her a better

return.

2.3.2 Primitives

In this paper, I only focus on students who receive postsecondary education.

To be more specific, work is not a feasible option in period 1.

Time line: Without loss of generality, I take a unit period to be 2 years.

At t = 1 (period 1), an individual makes a school choice from community colleges and

universities; at t = 2 (period 2), an individual can make a transfer or work decision;

and at t = 3 (period 3) and beyond, an individual has to work. Retirement occurs

at time t = T .

Choice set: There are J four-year universities, ranked in ascending order

of tuition and indexed by j = 1, 2, · · · , J . To be more specific,

university j = 1 is the one with the lowest tuition, and university j = J is the one

with the highest tuition. It is assumed in the paper that all community colleges are

the same and are indexed by j = 0. An outside option (work) is indexed by j = −1.

Here $\mathcal{J}$ denotes the choice set, where $\mathcal{J} = \{-1, 0, 1, \cdots, J\}$.

Utility of working

Utility of working is a logarithmic function of student wages (w(·)). The wage

of individual i depends on the student’s abilities (αij , which are discussed in detail in

Section 2.3.3), the school from which she graduates (Di), and her years of experience

(Exprit). The utility of working UW (·) is defined as follows,

$$\ln\big(w(\alpha_{i,D_i}, s_{i1}, D_i, Expr_{it})\big) = g(\alpha_{i,D_i}, D_i) + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2 Expr_{it} + \gamma_3 Expr_{it}^2 + \varepsilon^W_{it}, \quad t = 2, \cdots, T. \tag{2.1}$$

Here w(·) is an individual’s wage. αi,Di is the ability of individual i at school Di. sit

is student i’s school choice in period t. Di denotes the school from which student i

gets her degree. g(αi,Di , Di) is assumed to take the form of

$$g(\alpha_{i,D_i}, D_i = j) = \rho_{1j} + \rho_{2j} \cdot \alpha_{ij}, \tag{2.2}$$

where ρ1j ’s and ρ2j ’s, for all 0 ≤ j < J , are to be estimated. 1(·) is the indicator

function. Exprit is the years of experience of individual i in period t. εWit represents a

stochastic wage shock at time t, which follows an Extreme Value Type I distribution

with location and scale parameters zero and τ .

It is assumed that the outside option yields the student the utility

$$\ln(w_{out,t}) = \mu_{out} + \varepsilon^W_{it}, \tag{2.3}$$

which is the logarithm of the wage that the student receives if she quits a university and

works without a degree. Here µout is a constant to be estimated, and εWit is the wage

shock defined in Equation (2.1).

The utility defined in Equation (2.1) is the student’s utility of working

conditional on graduating from school Di and on her ability αi,Di . In Equation (2.1),

the first term captures the joint effect of ability and school choice on earnings, the

second term captures the effect of an upward transfer on future earnings, and the third

and fourth terms represent the effect of experience on earnings.

The joint effect of ability and school choice on earnings is defined in Equation

(2.2). This function includes an intercept term for school j, which is ρ1j , and

a term that captures the return to ability at school j, which is ρ2j . It is expected

that ρ1j is lower for less expensive schools and higher for more expensive schools,

implying that given the same ability level, a student should have a higher return by

attending a more expensive school (a student should have a higher return by investing

more in education). The estimated ρ2j ’s are also expected to be ranked in

ascending order of tuition, which suggests that students with high ability

have a higher return from more expensive schools than do low-ability students. Therefore,

high-ability students should be more willing to invest more in education, which

is consistent with the statistics in the data.

The second term in the equation captures the potential cost of transferring

on earnings. 1(Di > si1) = 1 delivers two pieces of information. The first is Di ≠ si1,

which means the school that the student attends in period 1 is different from the

school from which she graduates; that is, the student transfers. The

second is Di > si1, which means the transfer is an upward transfer as defined in

Section 2.3.1. Therefore, the second term in Equation (2.1) captures

the cost of an upward transfer on future earnings. If γ1 is negative,

an upward transfer has a negative effect on the student’s future earnings, which may

happen because education quality can vary across schools. A student who

transfers upward has received only the last period of training from her

destination university. Therefore, the education she receives may not be as good as if

she had begun her education in the destination school. It is reasonable to think that the

student may receive a lower payoff if she has received a lower quality education. If the

student transfers from a university with high tuition to a university with low tuition

(downward transfer), I assume there is no effect on the student’s future earnings. In

the model, a downward transfer would not be observed if students

knew their abilities with certainty, so it can be explained only by uncertainty.
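
As an illustration, the wage equation can be evaluated numerically. The following minimal Python sketch computes the log wage in Equations (2.1) and (2.2) for given inputs. The function name log_wage and all parameter values are hypothetical and chosen only for illustration; they are not estimates from this paper.

```python
def log_wage(alpha, degree_school, first_school, experience,
             rho1, rho2, gamma1, gamma2, gamma3, shock=0.0):
    """Log wage from Equations (2.1)-(2.2) for a graduate of `degree_school`."""
    # g(alpha, D = j) = rho1[j] + rho2[j] * alpha
    g = rho1[degree_school] + rho2[degree_school] * alpha
    # Upward-transfer indicator 1(D_i > s_i1)
    upward = 1.0 if degree_school > first_school else 0.0
    return (g + gamma1 * upward
            + gamma2 * experience + gamma3 * experience ** 2 + shock)


# Hypothetical parameter values, for illustration only (not estimates).
RHO1 = {0: 9.8, 1: 10.0, 2: 10.1}
RHO2 = {0: 0.05, 1: 0.10, 2: 0.15}
GAMMAS = (-0.05, 0.04, -0.001)

# Same degree school (j = 2), same ability and experience:
direct = log_wage(0.5, 2, 2, 2, RHO1, RHO2, *GAMMAS)    # started at j = 2
transfer = log_wage(0.5, 2, 0, 2, RHO1, RHO2, *GAMMAS)  # transferred up from j = 0
print(direct - transfer)  # the gap is -gamma1, up to floating-point error
```

Under these illustrative values, the only difference between the two students is the upward-transfer term, so a negative γ1 lowers the transfer student's log wage by |γ1|. A downward transfer leaves the indicator at zero and carries no wage effect.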

Utility of staying in school

The utility that a student receives in one period from attending school j, either a

four-year university (j > 0) or a community college (j = 0), is given by

$$U^S_{jt}(\cdot) = \ln(\xi(\cdot)) + \nu_{ij} + \varepsilon^S_{ijt}, \tag{2.4}$$

where ξ(·) [2] is the money available to the individual given her family income and

school choice, defined as

$$\xi(X_i, s_{it} = j) = l(X_i, s_{it}) - T_j. \tag{2.5}$$

As I have previously mentioned, sit is student i’s school choice at time t. Tj is

the tuition of school j. Xi includes student i’s family income, number of siblings,

and the education levels of her parents. Function l(·) denotes the money available to

the student before paying tuition given her family background Xi and school choice

sit, and takes the log-linear form

$$\ln[l(X_i, s_{it} = j)] = \theta_j + X_i'\beta. \tag{2.6}$$

Here θj ’s and β are to be estimated. νij represents the student’s idiosyncratic taste

for school j, and it follows

$$\nu_{ij} \sim N(0, \sigma_S^2). \tag{2.7}$$

The preference shock, denoted εSijt, follows an Extreme Value Type I distribution

with location zero and scale parameter τ .

A student’s financial constraint is defined in Equation (2.6), which gives the

monetary resources available to her before paying tuition. It is assumed that the

intercept term of the equation (θj) differs across schools j, which implies

that the money available to the student before paying tuition depends on the school

she attends. There are two reasons for this assumption. The first is that parental

contributions may differ across the student’s school choices. For instance,

the parents may transfer more money to the student if she decides to attend a more

expensive school. Second, a student may spend more money aside from tuition in

certain types of schools than in others. For instance, students who attend community

colleges are more likely to choose a school near their home. Hence, they can live with

their parents and save on transportation and living expenses. Students in community

colleges are also more likely to take advantage of part-time job opportunities.

Therefore, they may make extra money apart from family contributions.

[2] In estimation, ln(ξ(·)) is replaced by (1 + d^0.5) ln(ξ(·)), where ln(ξ(·)) is defined as financial resources

available to students for a half period (1 year), and d^0.5 is the discount factor for one year.

In Equation (2.4), we can see that the differences in students’ utilities of

attending different schools are captured by family backgrounds (Xi), taste in schools

(νij), and preference shocks (εSijt). The idiosyncratic preferences are captured by

two terms, νij and εSijt. νij stays the same over time, while εSijt changes over time.

Taking the two terms together shows that a student’s preference for the same school

is different but correlated over time. νij captures the part of student preference that

does not change over time. For instance, a student may prefer a certain school over

time because her parents went to the same school or the location of the school is ideal,

to name a few reasons. On the other hand, εSijt captures the preference shock that

changes over time. For instance, the preferences may change due to some unexpected

good or bad experiences in certain schools.

In Equation (2.4), student utility of attending school is measured by the

monetary resources available to the student rather than by psychic costs.

Family income and the education level of her parents influence student utility through

the amount of financial resources available. As a result, the intergenerational

transmission of education is captured through financial means (parents with higher

education tend to contribute more money to a student’s education) rather than through

differential access to education or return to schools.
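
The deterministic part of the schooling utility can be sketched in a few lines of Python. The sketch below follows Equations (2.4) to (2.6); the function names, the treatment of non-positive resources as an infeasible choice, and all numerical inputs are assumptions made for illustration and are not taken from the estimation code of this paper.

```python
import math


def resources_before_tuition(theta_j, x, beta):
    """l(X_i, s_it = j) from Equation (2.6): exp(theta_j + X_i' beta)."""
    return math.exp(theta_j + sum(xk * bk for xk, bk in zip(x, beta)))


def schooling_utility(theta_j, x, beta, tuition, taste=0.0, shock=0.0):
    """Per-period utility of attending school j, Equations (2.4)-(2.5)."""
    xi = resources_before_tuition(theta_j, x, beta) - tuition
    if xi <= 0:
        # Assumption of this sketch: ln(xi) is undefined, so the school
        # is treated as unaffordable.
        return float("-inf")
    return math.log(xi) + taste + shock


# Hypothetical inputs: two family-background covariates and school-specific
# intercepts and tuition for a community college (0) and a university (1).
x_i = [1.0, 0.5]
beta = [0.2, 0.1]
u_cc = schooling_utility(9.5, x_i, beta, tuition=2000.0)
u_univ = schooling_utility(9.9, x_i, beta, tuition=6000.0)
```

Because θj varies by school, a more expensive school does not necessarily yield lower utility: a larger intercept can offset higher tuition, which is the role the text assigns to differential parental contributions and living costs.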

2.3.3 Ability learning process

Altonji (1993), Cunha, Heckman, and Navarro (2005), and Chen (2007)

show the importance of uncertainty in explaining college decisions and potential

wage variation. In this paper, uncertainty is captured in students’ abilities. I assume

student abilities are multi-dimensional, which implies that students have different

abilities at different schools. To be more specific, the ability vector of student i is

given by

$$\alpha_i = [\alpha_{i0}, \alpha_{i1}, \alpha_{i2}, \cdots, \alpha_{iJ}]',$$

where αij is student i’s ability at school j.

Students update their beliefs about their abilities αij by Bayesian updating

of the distributions of their abilities at different schools. Before making a college

enrollment decision, a student has a prior belief regarding her abilities. Based on

high school GPA and SAT scores, a student updates her belief about her abilities at all

schools (αij for j ≥ 0). After the student gets into school j, she can update the

distribution of her ability at school j (αij) using her GPA at school j (κijt). I assume

the student’s GPA at school j (κijt) is only affected by her ability at school j (αij).

In the following equations, I describe the relationship between the student’s

ability and the signals (high school GPA, SAT scores, and college GPA).

Prior distribution: The prior distribution of student i’s ability at school

j (αij) is assumed to follow

$$\alpha_{ij} \sim N(m_j + \chi_{ij}, \sigma_\alpha^2), \tag{2.8}$$

where mj is to be estimated, and

$$\chi_{ij} \sim N(0, \sigma_\mu^2) \tag{2.9}$$

denotes the unobserved academic aptitude that is known by the student but not by the

econometrician. From Equation (2.8), we can see that a student’s prior beliefs

about her abilities differ from those of other students only by unobserved academic

aptitude (χij).

After receiving high school GPA: It is assumed that the high school

GPA of student i is affected by a linear combination of student abilities, αij ’s, which

is given by

$$HsGPA_i = \mu_0 + \sum_{j=0}^{J} \mu_{1j} \cdot \alpha_{ij} + \varepsilon^{Hs}_{ij}, \tag{2.10}$$

where µ0 and µ1j are to be estimated, HsGPAi is student i’s high school GPA, and

εHsij ’s are i.i.d. $N(0, \sigma^2)$ random noise.

After receiving SAT score: It is assumed that the SAT score of student i

is affected by a linear combination of student abilities, αij ’s, which is given by

$$SAT_i = \mu_0 + \sum_{j=0}^{J} \mu_{1j} \cdot \alpha_{ij} + \varepsilon^{SAT}_{ij}, \tag{2.11}$$

where µ0 and µ1j ’s are the same as defined in Equation (2.10), SATi is student i’s

SAT score, and εSATij ’s are i.i.d. $N(0, \sigma^2)$ random noise.

After receiving college GPA: Students have GPA observations from the

schools that they attend in period 1 before making college choices in period 2. To be

more specific, if a student is enrolled in school j in period 1, after one period she

receives her GPA from school j, which is denoted as κij1. It is assumed that the signal κijt

received in school j is only affected by student ability at school j (αij). The relation

between student i’s ability αij and signal κijt is

$$\kappa_{ijt} = \alpha_{ij} + \varepsilon^{\kappa}_{it}, \tag{2.12}$$

where εκit’s are i.i.d. $N(0, \sigma_\kappa^2)$ random noise.

If student i attends school j, she only updates the distribution of her ability

at school j (αij) based on the college GPA that she receives at school j (κijt).

Students estimate the posterior mean and variance of their abilities based on

the test scores using Bayesian updating. The details of the ability updating process

are provided in the appendix.
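
For the college GPA signal in Equation (2.12), the update takes the textbook normal-normal form. The following Python sketch shows only this one-dimensional step, assuming the student's current belief about αij is normal with a known mean and variance. It is not the full updating procedure of the appendix, which also handles the high school GPA and SAT signals that load on all abilities. The function name and inputs are hypothetical.

```python
def update_ability_belief(prior_mean, prior_var, gpa_signal, noise_var):
    """Normal-normal update for kappa = alpha + noise, noise ~ N(0, noise_var).

    Returns the posterior mean and variance of alpha given one GPA signal.
    """
    # Weight placed on the signal: larger when the prior is diffuse
    # relative to the signal noise.
    weight = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + weight * (gpa_signal - prior_mean)
    post_var = prior_var * noise_var / (prior_var + noise_var)
    return post_mean, post_var


# A student who expected average ability (0 on the normalized scale)
# and receives a college GPA one standard deviation above the mean.
mean_2, var_2 = update_ability_belief(0.0, 1.0, 1.0, 1.0)
```

The posterior mean moves toward the signal and the posterior variance shrinks, which is the mechanism behind the statement that a strong GPA in a community college raises expected ability and can trigger a transfer.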

2.3.4 Value function

The value function in period 1 can be solved by backward induction. This

function can be computed using the student’s estimated ability and past school

choices. To simplify notation, I define the information set to include all the signals

that the student received in previous periods and her past school choices.

To be specific, the information set of student i at the start of period t is

$$I_{it} = (\kappa_{i1}, \cdots, \kappa_{i,t-1}, s_{i1}, \cdots, s_{i,t-1})'.$$

Value function in period 2

I now characterize the utility associated with each of the potential choices

available to the agent. Students can make two types of choices in period 2, which I

discuss separately. The first is the choice to continue university studies. The second

is the choice to work.

The present value of attending a four-year university: The present

value of attending such a university j (sit = Dit = j for j > 0) in period 2 is

$$V_2(X_i, I_{i2}, s_{i2} = j) = d\,[U^S_{jt}(X_i, j)] + E\left[\sum_{t=3}^{T} d^{t-1} U^W_t(\alpha_{i,D_i}, s_{i1}, j, Expr_{it}) \,\Big|\, I_{i2}\right] + d\,\mathbf{1}(s_{i1} \neq j) \cdot TF, \tag{2.13}$$

where TF refers to the direct transfer cost that was introduced in Section 2.1, and d

is the time preference parameter over a period (i.e., 2 years).

The value of attending a four-year university si2 (V2(·)) in period 2 is composed

of three parts: the utility of staying in school si2 for one period, US(·), defined

in Equation (2.4); the discounted utility from receiving salary, UW (·), defined in Equation

(2.1), starting from one period later; and the direct transfer cost (TF ) if the student

decides to transfer (si1 ≠ j).

We can see that the value function is a function of the student’s school

choice in period 1 (si1), family background (Xi), and the information set at period 2

(Ii2), which contains the school choice made in the previous period and all the test

scores obtained so far. This implies that the student makes her school choice in

period 2 based on her previous school choice, family background, and individual test

scores.

Note that the utility of working (UW (·)) is a random variable.

The expectation is taken over the error terms in the wage equation ({εWit }t≥3) and

student abilities (αij). As I have mentioned, students only know the distribution of

their abilities and refine the distribution in every period based on test scores.

The value function is derived based on the refined distribution of their abilities. As

a result, there is a discrepancy between expected abilities in period 1 and in period

2. When the shock to abilities is large, a student may need to adjust her planned

education path. For instance, dropping out from a university is usually unplanned

and can be attributed to, among other factors, a negative shock to predicted student

abilities. It is not rare for students who enrolled in universities in period 1 to

drop out after receiving very poor grades. The significant proportion

of dropouts is therefore explained by the difference in expected abilities (uncertainty)

across time.

Transfer behavior affects the value function in two ways. One is

through the effect on future wages, which is captured by UW (·) (defined in Equation

(2.1)). It is assumed that transfer students may not receive the same payoff as

non-transfer students, given that they graduate from the same university. The other

effect of transfer behavior is captured by the last term in Equation (2.13), which is

the direct cost. The condition in the indicator function, si1 ≠ si2, means that the school that

the student attends in period 1 (si1) is different from the one that she attends in

period 2 (si2). As I have pointed out, the transfer costs in this term include both

monetary and non-monetary costs.

The present value of working: If the student decides to work in period

2 (si2 = −1), the discounted utility from work in period 2 is

$$V_2(X_i, I_{i2}, s_{i2} = -1) = \mathbf{1}(s_{i1} = 0)\, E\left[\sum_{t=2}^{T} d^{t-1} U^W_t(\alpha_{i,D_i}, s_{i1}, D_i = 0, Expr_{it}) \,\Big|\, I_{i2}\right] + \mathbf{1}(s_{i1} \neq 0)\, E\left[\sum_{t=2}^{T} d^{t-1} \ln(w_{out,t})\right]. \tag{2.14}$$

Recall that Di is the school from which student i gets her degree. The

condition that Di = 0 indicates student i receives a degree from a community college.

The value of working is different for students who were enrolled in commu-

nity college and those who were enrolled in four-year universities in period 1. The

expected value inside the first (second) square bracket is the value of working for

student i who was enrolled in a community college (four-year university) in the first

period. The expectation of the first term is taken over the error term in the wage

equation ({εWit }t≥2) and student abilities αij , while the expectation of the second

term is only taken over the error term in the wage equation ({εWit }t≥2).

If a student who was enrolled in a community college for the first period

(si1 = 0) decides to work in period 2, she then works with an associate’s degree.

Therefore, in this case Dit = 0, and the utility of working that the student receives is

UWt (αi,Di , si1, Di = 0). However, a student enrolled in a four-year university in the

first period (si1 ≠ 0) who decides to work in period 2 works without any degree.

In this case, the utility of working without a degree is ln(wout,t), which is defined in

Equation (2.3).

A student makes the school choice si2 that maximizes her utility V2(Xi, Ii2, si2).

The student’s school choice problem at the beginning of period 2 is

$$\max_{s_{i2} \in \mathcal{J} \setminus \{0\}} \{V_2(X_i, I_{i2}, s_{i2})\}, \tag{2.15}$$

where Ii2 includes all past school choices and test scores observed at the beginning

of period 2. The set $\mathcal{J} \setminus \{0\}$ contains the work option and all universities but no

community colleges, because community colleges are not feasible options in period 2.

Let the optimal application strategy be si2(Ii2, Xi).

Value function in period 1

In period 1, the value of attending either a four-year university (si1 > 0)

or a community college (si1 = 0) is composed of the utility of staying in school for

the current period US(·), and the expected maximized utility from the second period

given a student’s school choice in period 1 (si1), which can be expressed in a single

value function as

$$V_1(X_i, I_{i1}, s_{i1}) = U^S_{jt}(X_i, s_{i1}) + E\left[\max_{s_{i2} \in \mathcal{J} \setminus \{0\}} [V_2(X_i, I_{i2}, s_{i2})] \,\Big|\, I_{i1}, s_{i1}\right]. \tag{2.16}$$

As I have mentioned, this is a two-period model. Given the dropout

and transfer options available to students, the enrollment choices predicted by this

model can differ from those predicted by a one-period model (e.g., Fu

(2010)). In this model, a student may not necessarily start in her destination

university, a choice that gives her the highest payoff. She may choose to attend a

cheaper school in the first period and transfer to her destination university in the

second period.

A student makes the school choice si1 that maximizes her utility V1(Xi, Ii1, si1).

The student’s school choice problem is

$$\max_{s_{i1} \in \mathcal{J} \setminus \{-1\}} \{V_1(X_i, I_{i1}, s_{i1}) \mid I_{i1}\}. \tag{2.17}$$

Let the optimal application strategy be si1(Ii1, Xi). The set $\mathcal{J} \setminus \{-1\}$ encompasses

all school choices (community college and four-year universities) but not work

opportunities.
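
The discrete choice step and the expected maximum in Equation (2.16) can be illustrated with the standard closed forms for Extreme Value Type I shocks. The Python sketch below assumes that, conditional on everything else, each period-2 alternative has a deterministic value plus an i.i.d. Type I extreme value shock with location zero and common scale τ. This is a simplifying assumption of the sketch, not a statement of how the paper computes these objects, which relies on simulation. The function names and the values in v2 are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # mean of a standard Type I extreme value draw


def choice_probabilities(values, tau):
    """Logit choice probabilities over alternatives with values `values`."""
    top = max(values.values())  # subtract the maximum for numerical stability
    weights = {k: math.exp((v - top) / tau) for k, v in values.items()}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}


def expected_max(values, tau):
    """E[max_k (v_k + eps_k)] for i.i.d. Type I extreme value eps_k, scale tau."""
    top = max(values.values())
    log_sum = top / tau + math.log(
        sum(math.exp((v - top) / tau) for v in values.values())
    )
    return tau * (EULER_GAMMA + log_sum)


# Hypothetical period-2 values for work (-1), public (1), and private (2)
# universities; community college (0) is excluded from the period-2 choice set.
v2 = {-1: 10.0, 1: 10.4, 2: 10.1}
probs = choice_probabilities(v2, tau=0.5)
emax = expected_max(v2, tau=0.5)
```

In a backward-induction solution, a quantity like emax would enter the period-1 value in place of the expectation term in Equation (2.16), after integrating over the remaining uncertainty about abilities.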

2.4 ESTIMATION STRATEGY AND IDENTIFICATION

In this section, the estimation and construction of the likelihood function

are discussed first. A brief discussion on identification is also provided.

2.4.1 Estimation

In the data, I only have three periods of observations, in which I observe

either one or two periods of wage data. To be clearer, if a student attends school

for two periods, I have two periods of GPA and school enrollment status observations

and one period of wage observations. For students who only complete one period of

education, I have one period of GPA and school enrollment status observations as well

as two periods of wage observations.

The parameters that I estimate include γ’s in Equation (2.1), ρ’s in Equation

(2.2), τ in Equation (2.4), θ’s and β in Equation (2.6), $\sigma_S^2$ in Equation (2.7), m’s

in Equation (2.8), $\sigma_\alpha^2$ in Equation (2.8), $\sigma_\mu^2$ in Equation (2.9), $\sigma^2$ in Equation (2.10),

and $\sigma_\kappa^2$ in Equation (2.12).

The error terms, namely the student’s unobservable belief about her abilities ({χj}j∈J

in Equation (2.8)), idiosyncratic tastes in schools ({νj}j∈J in Equation (2.4)), and

the deviation of a student’s true abilities from posterior means ({εj}j∈J in Equation

(A.10)), are assumed to be independent. The joint distribution function of the error

terms, G({χj}j∈J , {νj}j∈J , {εj}j∈J ), is multivariate normal, and

the off-diagonal elements of the variance-covariance matrix are zeros.

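The likelihood function is constructed based on school choice observations

(sit), wage observations (wit), and college GPA observations (κit). I implement the

estimation via simulated maximum likelihood estimation (SMLE). For a student who

attends school for two periods, the likelihood contribution is

$$L_i(\cdot) = \int P(s_{i1}, s_{i2} \mid w_{i3}, \kappa_{i1}, \kappa_{i2}, X_i, \{\nu_j\}_{j\in\mathcal{J}}) \times f(w_{i3}, \kappa_{i1}, \kappa_{i2} \mid \{\chi_j\}_{j\in\mathcal{J}}, \{\varepsilon_j\}_{j\in\mathcal{J}}) \, dG(\{\chi_j\}_{j\in\mathcal{J}}, \{\nu_j\}_{j\in\mathcal{J}}, \{\varepsilon_j\}_{j\in\mathcal{J}}), \tag{2.18}$$

where P (·) is the simulated probability and f(·) is the simulated density. The likelihood

function for individuals who work after the first period can be derived in a similar

way.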

Because si1 and si2 are conditionally independent as defined in the model, I have

$$P(s_{i1}, s_{i2} \mid \cdot) = P(s_{i1} \mid \cdot) \times P(s_{i2} \mid \cdot).$$

wi3, κi1, and κi2 are also conditionally independent. Therefore,

$$f(w_{i3}, \kappa_{i1}, \kappa_{i2} \mid \cdot) = f_W(w_{i3} \mid \cdot) \times f_\kappa(\kappa_{i1} \mid \cdot) \times f_\kappa(\kappa_{i2} \mid \cdot).$$

Details regarding the calculation of the simulated probability and simulated density

are discussed in the appendix.
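
The integral in Equation (2.18) has no closed form, so it is approximated by averaging over draws of the unobservables. The Python sketch below shows the generic Monte Carlo step for one individual. It is a toy illustration under stated assumptions, not the simulator described in the appendix: the function names, the single taste draw nu, the single aptitude draw chi, the binary logit probability, and the normal wage density are all hypothetical stand-ins for the model's P(·) and f(·).

```python
import math
import random


def simulated_likelihood(prob_fn, density_fn, draw_fn, n_draws=500, seed=0):
    """Monte Carlo approximation of an integral of the form in Equation (2.18)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        u = draw_fn(rng)  # one draw of the unobservables from G
        total += prob_fn(u) * density_fn(u)
    return total / n_draws


def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


# Toy ingredients (hypothetical): a taste draw shifts a binary logit choice
# probability, and an aptitude draw shifts the mean of the observed log wage.
def draw(rng):
    return {"nu": rng.gauss(0.0, 1.0), "chi": rng.gauss(0.0, 0.5)}


def prob(u):
    return 1.0 / (1.0 + math.exp(-(0.3 + u["nu"])))


def density(u):
    return normal_pdf(10.2, 10.0 + u["chi"], 0.4)


lik_i = simulated_likelihood(prob, density, draw)
log_lik_i = math.log(lik_i)
```

The sample log-likelihood is the sum of log_lik_i over individuals. Holding the seed fixed across parameter values keeps the simulated objective smooth in the parameters, which is a common practice when maximizing a simulated likelihood.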

2.4.2 Identification

The identification of the full model hinges on exclusion restrictions. The

parameters in the wage equation (Equation (2.1)) are identified primarily through

the wage observations. The parameters in the utility of attending school (Equation

(2.6)) are identified by school choices. Because the utility is a log function of the

monetary value, the variance of student preference (νij ’s) and the scale parameter

of their preference shocks (εSijt’s) are jointly identified by tuition. Holding all other

variables constant, the estimates of the variance of νij and the scale parameter of εSijt

are needed to rationalize the proportions of student college choices for certain types

of schools. For instance, in period 1, if a large proportion of students with similar

characteristics (family background and SAT scores) choose to attend the same school,

the variance of νij + εSijt should be small; otherwise, the variance should be large.

Student college choices in period 2 help to identify the variance of νij and the scale

parameter of εSijt separately. For instance, according to the structure of the model,

dropping out from a university is not a planned education path, but is explained

by shocks to student preferences (εSijt’s) and shocks to their expected abilities. If

the dropout rate from universities is about the same for students with

high GPA and students with low GPA, the phenomenon cannot be fully explained

by shocks to expected abilities, so it has to be explained by εSijt’s. Therefore, the scale

parameter τ of the preference shock εSijt should be large.

The parameters mj ’s and µj ’s in the ability learning process (Equation (2.8)

in Section 2.3) are identified jointly by high school GPA, SAT scores, and college

GPA observations. The variance of unobserved academic aptitude (σ2µ in Equation

(2.9)) is primarily identified by wage observations, because the differences between

the observed wages and the predicted wages are jointly explained by unobserved

academic aptitude and wage shocks (εWit ), while the scale parameter τ of the wage

shocks and preference shocks is already identified by school choices.

2.5 DATA

The dataset used in this paper is the Beginning Postsecondary Students

Longitudinal Study (BPS:04/09) from the National Center for Education Statistics

(NCES). The variables relevant to this paper include student enrollment

status, demographic characteristics, income, family income, and college GPA.

Data selection: I drop the observations where one or more critical

pieces of information are missing: (a) high school GPA, (b) SAT scores, (c) college

GPA, (d) family income, (e) number of people supported by parents, (f) parents’

highest education level, (g) school level (2-year or 4-year) for both periods 1 and 2,

and (h) school name, if the student enters a four-year university, for periods

1 and 2.

I do not drop the observations in which school names are missing while the

school level is 2-year, because I group all 2-year colleges together (community college).

For universities, school names are needed to identify their selectivity.

A problem may exist if the values are not missing at random. I have compared

the average SAT score and high school GPA between the group of observations

with missing values and the group of observations without missing values.

For the observations without missing values, the average SAT score is lower by 0.27

standard deviations, but the average high school GPA is higher by 0.08 standard

deviations. It is hard to conclude whether the values are missing at random. The

estimates might be biased in an ambiguous direction. However, because I control for

students’ family background and all their test scores, the bias should not be large. In

future research, one possible extension is to write a new model that accounts for the

missing data. That model could then be incorporated into a more complex model for

estimating missing values. An example is given by Dunning and Freedman (2008).

For estimation purposes, I drop the students who are enrolled in selective

universities, which are defined as the top 30 private and public universities

and top 20 liberal arts colleges ranked by U.S. News and World Report 2003-2006

(a similar grouping can be found in Fu (2010)).

There are three reasons to drop the students in selective universities. First,

the proportion of students enrolled in selective universities is small. Second, and more

importantly, there is almost no transfer and dropout behavior in selective universities.

In the data, the total percentage of enrollment in selective universities is only

8%, while 0% of students transferred to or out of selective universities. Because

the students in selective universities do not provide insightful information on

the transfer behavior that is the main interest of this paper, dropping those students

does not alter the estimates of the parameters of interest. Third, I assume that a student

can choose the school that maximizes her value function. However, the acceptance

rates of selective universities are not high enough to satisfy this assumption.

After the aforementioned data selection procedures, the final sample size is

6300 (rounded to the nearest ten).

Aggregation of schools: The aggregation of schools is necessary in this

paper. The reason for aggregation is discussed in Fu (2010). There

are two major constraints without aggregation. The first is computational feasibility.

If schools are not aggregated, students can choose among thousands of possible schools,

which poses a major computational challenge. The other is that the number

of students attending any single school is too small to provide accurate estimates of

the parameters. For instance, the enrollment rates in some liberal arts colleges are

exactly zero.

To do this, I group schools according to crucial features that may

affect the school decisions of students. The crucial features that I consider are tuition

and school type.

• Group 0: community colleges, denoted by j = 0.

• Group 1: public universities, denoted by j = 1.

• Group 2: private universities, denoted by j = 2.

I treat schools in each group as a single school. The definition of enrollment

is adjusted accordingly. A student is said to attend school j if she attends any school

in group j. I use the average tuition for each group based on tuition information

from the Integrated Postsecondary Education Data System (IPEDS).
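The grouping step can be illustrated with a minimal sketch. The school records below are hypothetical and are not IPEDS data, and the unweighted average is an assumption made for illustration; the text does not state whether the group average is weighted by enrollment.

```python
from collections import defaultdict

# Group codes follow the text: j = 0 community, j = 1 public, j = 2 private.
GROUP_OF_TYPE = {"community": 0, "public": 1, "private": 2}


def aggregate_tuition(schools):
    """Return {group j: unweighted average tuition} from (name, type, tuition) records."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for _name, school_type, tuition in schools:
        j = GROUP_OF_TYPE[school_type]
        totals[j] += tuition
        counts[j] += 1
    return {j: totals[j] / counts[j] for j in totals}


# Hypothetical records, for illustration only.
schools = [
    ("School A", "community", 3000.0),
    ("School B", "community", 5000.0),
    ("School C", "public", 4000.0),
    ("School D", "private", 18000.0),
    ("School E", "private", 16000.0),
]
group_tuition = aggregate_tuition(schools)
```

A student attending any school in group j is then treated as attending the single aggregated school j, with `group_tuition[j]` as its tuition.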

Normalization of the test scores: The test scores observed in different
periods (high school GPA, SAT scores, college GPA) are measured on different
scales. For computational purposes, I normalize each test score by subtracting
the cross-sectional sample mean and dividing by the cross-sectional sample
standard deviation:

\[ \tilde{\kappa}_{it} = \frac{\kappa_{it} - E(\kappa_t)}{\mathrm{s.d.}(\kappa_t)}, \tag{2.19} \]

where

\[ E(\kappa_t) = \frac{\sum_{i=1}^{N} \kappa_{it}}{N}, \tag{2.20} \]

\[ Var(\kappa_t) = \frac{\sum_{i=1}^{N} [\kappa_{it} - E(\kappa_t)]^2}{N - 1}. \tag{2.21} \]

Here $\tilde{\kappa}_{it}$ is the normalized test score, $\kappa_{it}$ is the
original test score, $\mathrm{s.d.}(\kappa_t)$ is the square root of
$Var(\kappa_t)$, and N is the total number of observations.
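The normalization in Equations (2.19) to (2.21) can be sketched in a few lines. The function name and the sample scores are illustrative assumptions; the only substantive detail taken from the text is the N − 1 denominator in the variance.

```python
import math


def normalize(scores):
    """Subtract the sample mean and divide by the sample standard deviation.

    The variance uses the N - 1 denominator, as in Equation (2.21).
    """
    n = len(scores)
    mean = sum(scores) / n
    variance = sum((x - mean) ** 2 for x in scores) / (n - 1)
    sd = math.sqrt(variance)
    return [(x - mean) / sd for x in scores]


# Hypothetical raw scores for one period, for illustration only.
raw_scores = [2.0, 3.0, 4.0]
normalized_scores = normalize(raw_scores)
```

After this step, scores from different periods share a mean of zero and a unit sample variance, so they can enter the model on a common scale.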


Identification of wages in each period: In this model, one period is

defined as 2 years. In the data, I observe wages for 1 year (half period). Therefore,

in estimation, I adjust the utility of working (Equation (2.1)) in the following way.

\[ U^W(\cdot) = \ln(w_1(\cdot)) + d^{0.5} \ln(w_2(\cdot)), \tag{2.22} \]

where w1 are the wages that an individual receives in the first half period, w2 are

the wages that an individual receives in the second half period, and d is the time

preference for a period. w1 and w2 are similarly defined as in Equation (2.1).

\[ \ln(w_k(\alpha_i, D_i, s_{i1}, D_i, Expr_{itk})) = g(\alpha_i, D_i, D_i) + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2 Expr_{itk} + \gamma_3 Expr_{itk}^2 + \varepsilon^W_{it}, \tag{2.23} \]

for t = 2, ..., T,

where Expritk, k=1,2, is the years of experience of individual i at the first or second

half period. All other notations have the same meaning as in Equation (2.1).
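The half-period adjustment can be sketched as follows. This is a simplified illustration, not the estimation code: `g_value` stands in for the term g(·), whose form is given in Equation (2.1) and not reproduced here, the wage shock is set to zero by default, and the one-year gap in experience between the two half periods is an assumption that follows from a period lasting two years. The values of γ2 and γ3 are those the chapter takes from Belzil and Hansen (2002), and γ1 is the estimate reported in Table 2.10.

```python
GAMMA1 = -0.005   # effect of upward transfer on log wages (Table 2.10)
GAMMA2 = 0.0884   # linear experience term, from Belzil and Hansen (2002)
GAMMA3 = -0.0029  # quadratic experience term, from Belzil and Hansen (2002)


def log_wage(g_value, transferred, experience, shock=0.0):
    """Log wage for one half period, following the shape of Equation (2.23).

    g_value is a placeholder for g(.); transferred is the upward-transfer indicator.
    """
    return (g_value + GAMMA1 * float(transferred)
            + GAMMA2 * experience + GAMMA3 * experience ** 2 + shock)


def utility_of_working(g_value, transferred, experience, d):
    """Utility of working for one period, as in Equation (2.22).

    d is the per-period discount factor, so d ** 0.5 discounts the second half.
    """
    ln_w1 = log_wage(g_value, transferred, experience)
    ln_w2 = log_wage(g_value, transferred, experience + 1)
    return ln_w1 + d ** 0.5 * ln_w2


# Hypothetical inputs, for illustration only.
example_utility = utility_of_working(g_value=10.0, transferred=False, experience=0, d=1.0)
```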

2.6 ESTIMATION RESULTS

I present the estimates of the key structural parameters in subsection 2.6.1,
which is followed by a brief discussion of model fit in subsection 2.6.2.

2.6.1 Parameter estimates

Parameters in wage equation: In Table 2.10, ρ1 represents the intercept
term in the wage equation, which can be understood as the signalling effect of
college degrees. ρ2 represents the return to abilities.³

³ The data used in this paper provide only one or two periods of wage
observations, which is not enough to estimate the curvature of wages (γ2 and γ3
in Equation (2.1)). The values of γ2 and γ3 are therefore taken from the
estimates in Belzil and Hansen (2002): γ2 = 0.0884 and γ3 = −0.0029.


The return to education differs significantly across school types. From the
estimates of the intercept terms for different schools (the ρ1's), we can see
that the return is much higher for universities. As expected, the intercept
term ρ1 is the lowest for community colleges (9.735, compared with 9.904 for
public and 9.825 for private universities). These estimates show that the labor
market returns to students holding bachelor's degrees are significantly higher
than for those with associate degrees.

From the estimates of the return to abilities (ρ2’s), it is not surprising to

see that the returns to abilities are the lowest from community colleges, and much

higher from universities, which implies that students with high abilities have higher

returns from universities than low-ability students. The estimation results explain the

situation in which high-ability students tend to attend universities, while low-ability

students tend to attend community colleges.

It is noted that although the return to education is similar for graduates from

public universities and those from private universities, the return (both the intercept

term ρ1 and the return to ability ρ2) is slightly higher for public universities, which

suggests that the enrollment in private universities is driven by factors other than

labor market returns, for instance, better living conditions, better meal plans, smaller

classes, etc.

γ1 captures the effect of upward transfer on future wages. The estimate
(−0.005, with a standard deviation of 0.027) shows that the effect of transfer
on income is not significantly different from zero, suggesting that the labor
market may not discriminate against transfer students, which coincides with
Kane and Rouse’s (1993) finding.

Recall that the student’s idiosyncratic taste for a school, $\nu_{ij}$, is the
part of the preference that does not change over time, while the preference
shock $\varepsilon^S_{ijt}$ does change with time. It is worth noticing that the
variance of student idiosyncratic tastes ($\sigma^2_S$ in Equation (2.7),
estimated at 0.0005) is small compared to the scale parameter of preference
shocks (the scale parameter $\tau$ of the distribution of $\varepsilon^S_{ijt}$
in Equation (2.7), estimated at 3.176). The comparison reveals that the
time-varying preference shock dominates the static idiosyncratic taste. The
result also highlights the importance of a time-varying preference shock for a
more complete characterization of the dynamics of school choices. The
one-period model of Fu (2010) is not able to capture such dynamics over time.

Parameters in the utility of attending school: θ is the estimate of

the constant intercept of monetary resources for students during college (Equation

(2.6)). The intercept corresponding to community colleges is the highest. There

are two possible explanations. The first is that course schedules in community
colleges may be more flexible than in universities, so students are more likely
to take part-time jobs. The second is that most community college students
choose to live at home and therefore spend much less on living expenses than
university students do.

The intercept θ is higher for private universities than for public universities,
which implies that families transfer more money to students if they attend
private universities. The estimates are intuitive because tuition in private
universities is much higher than in public ones. Some students may even have
more financial resources left after paying the higher private tuition, possibly
because facilities are more expensive in private universities and parents who
are not financially constrained are willing to pay for such benefits.

The estimation results of parameter β show that parental income has a

significant effect on school choices, implying that a student will get more financial

support from her family if her family income is high. It is also not surprising to

see that the number of people supported by parents has a significant negative effect

on their monetary resources during college. The estimation results also reveal
that parental education has a positive effect on student financial resources,
and that the mother's education has a larger effect (0.124) than the father's
(0.049).

Parameters in ability learning process: m captures the mean of students’
prior distribution, and $\sigma^2_\alpha$ is the variance of the prior
distribution. Because all the observed test scores (high school GPA, SAT scores,
and college GPA) are normalized to have unit variance, the estimated prior
variance $\sigma^2_\alpha = 34.601$ is very large. This indicates that the prior
distribution is not very informative when it comes to inferring student
abilities.
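To see why a large prior variance makes the prior uninformative, consider a standard normal-normal updating rule. This is a textbook sketch under stated assumptions, not the chapter's learning equations: it assumes a single signal equal to ability plus normal noise, and the noise variance of 1 and the signal value are hypothetical. Only the prior variance 34.601 comes from Table 2.10.

```python
def posterior(prior_mean, prior_var, signal, noise_var):
    """Normal-normal update: return (posterior mean, posterior variance, weight on signal)."""
    weight = prior_var / (prior_var + noise_var)
    mean = prior_mean + weight * (signal - prior_mean)
    var = (1.0 - weight) * prior_var
    return mean, var, weight


PRIOR_VAR = 34.601  # estimate of the prior variance reported in Table 2.10

# Hypothetical prior mean, signal, and noise variance, for illustration only.
post_mean, post_var, signal_weight = posterior(
    prior_mean=0.0, prior_var=PRIOR_VAR, signal=1.0, noise_var=1.0
)
```

Under these assumptions the weight on the signal is about 0.97, so the posterior mean is driven almost entirely by the observed score rather than by the prior.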

µ1 captures the linear relation between student abilities and their SAT

scores, and the linear relation between student abilities and high school GPAs. It

is as expected that the estimates of µ1 are all positive. The estimates imply that if

students have high SAT scores or high school GPAs, they should infer that they have

higher abilities in all schools.

σ2µ is the variance of student’s private information about her abilities. The

estimate of σ2µ is a small number with a large standard deviation, suggesting that the

private information is insignificant. Therefore, the preference shocks and the wage

shocks, as opposed to students’ unobserved academic aptitude, provide the primary

explanation of the idiosyncratic school choices and the differences in the labor market

returns.

Parameters in the value function: The transfer cost (−7.224) is a large
negative number compared to the utility of attending school, which is between 8
and 12 for one period. Recall from section 2.3.4 that the transfer cost includes
monetary and non-monetary components. The monetary cost, which includes the
application fee and the moving cost, is too small to explain the estimate.
Therefore, the estimate suggests that the transfer cost is chiefly explained by
the non-monetary part, including the unexpected loss of non-transferable credits
and the time spent on college searches and transfer applications.

2.6.2 Model fit

To examine model fit, I simulated 100 sets of error terms for each individual

and compared the predicted outcomes to the actual observed outcomes. I compared

the predicted enrollment rates with the actual rates for different schools in periods

1 and 2 in Tables 2.11 and 2.12, and the predicted transfer rates with the actual

transfer rates in Table 2.13.

In Table 2.11, we can see that the model fits the enrollment rates in period
1 reasonably well. The enrollment rate of community colleges is overestimated
by 1.0 percentage point, while the enrollment rate is underestimated by 1.1
percentage points for public universities and overestimated by 0.1 percentage
points for private universities.

Similarly, Table 2.12 shows the model fit for the enrollment rates in the
second period. The discrepancy between the predicted enrollment rates and the
actual ones is very small. The fraction of students who choose to work after
the first period is overestimated by 1.8 percentage points, while the enrollment
rate is underestimated by 0.8 (1.0) percentage points for public (private)
universities.

The model fit in terms of transfer rates is a very good test of the model’s
ability to predict the relationship between expected abilities and enrollment
decisions. For instance, predicted drop-out rates from universities are driven
entirely by changes in expected academic performance (ability). Table 2.13
compares the predicted and the actual transfer rates. Although the transfer
rates are small compared to the enrollment rates, the model still replicates
them reasonably well. The largest discrepancy between the predicted transfer
rates and the actual ones is within 3 percentage points.

In general, the model not only predicts the enrollment rates for different

schools in each period, but also the transfer rates of various types. These transfers

include “upward” transfers (from a community college to a university), “parallel”

transfers (from a university to a university), and drop-outs from universities. Given

the model’s simplicity and its predictive power, the estimated model fits the actual

data well.
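The fit statistics quoted in this subsection can be recomputed directly from the reported tables. The sketch below only restates the figures from Tables 2.11 and 2.13 and takes differences in percentage points; the variable names are illustrative.

```python
# (data, simulated) enrollment rates in period 1, in percent, from Table 2.11.
table_2_11 = {
    "Community college": (27.3, 28.3),
    "Public university": (38.0, 36.9),
    "Private university": (34.7, 34.8),
}

# Simulated minus actual, in percentage points (positive = overestimated).
enrollment_gaps = {
    school: round(sim - data, 1) for school, (data, sim) in table_2_11.items()
}

# (data, simulated) transfer rates in period 2, in percent, from Table 2.13.
table_2_13 = [
    (18.5, 21.1), (10.9, 8.8), (7.2, 8.6),  # to work
    (6.4, 3.8), (3.5, 3.9),                 # to public university
    (2.4, 2.4), (1.5, 2.3),                 # to private university
]
largest_transfer_gap = round(max(abs(sim - data) for data, sim in table_2_13), 1)
```

The largest absolute gap in transfer rates is 2.6 percentage points, which is consistent with the bound of 3 percentage points stated in the text.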

2.7 POLICY SIMULATION

Using the estimated model, which fits the data reasonably well, I conduct
three counterfactual experiments to answer the following research questions:
What is the main barrier that keeps students from using the transfer program in
community colleges? How can we improve the efficiency of the transfer function
of community colleges?

2.7.1 Increase tuition fees in universities

University tuition has been rising steadily. In this experiment, I examine
the extent to which a change in tuition can affect college choices. I increase the

tuition of both private and public universities by 20%, while keeping the tuition in

community colleges the same. This experiment is divided into two parts. The first

examines the effect of tuition increase on the enrollment and transfer choices for high

school graduates. The other examines the effect on the transfer rates for students

who are already enrolled in college.

The effect on college choices of high school graduates: Generally speaking,
an increase in university tuition raises the transfer rates from community
colleges to four-year schools but decreases graduation rates from universities.
In Table 2.14, we can see that 2.3%

more students choose to enroll in community colleges in period 1, and 2.8% fewer

students choose to enroll in private universities. As tuition in universities increases,

more students with low family income or low ability tend to start in community

colleges. For low-ability students, because they are not sure whether it is worth

getting a bachelor’s degree, it is relatively cheaper to learn about their abilities in a

community college. For students who have low family incomes, it is more affordable

to start at community colleges in the face of tuition increases.

In Table 2.15, the transfer rates from community colleges to universities are

considerably higher. The transfer rate from community college to a public university

increases from 3.8% to 5.6%, while the transfer rate from community colleges to

private universities increases from 2.4% to 3.4%. As expected, more students tend

to transfer to universities when it is more expensive to start in universities from the

beginning. The drop-out rates from universities are also higher because students
with low GPA observations (negative shocks to expected abilities) are more
likely to quit when the cost of finishing at a university is higher.

From Table 2.16, we can see that 1.9% more students enter the labor mar-

ket after the first period, and 2.2% fewer students pursue their degrees in private

universities. Graduation rates from universities are considerably lower because
high tuition pushes more students to start in community colleges and discourages
community college students with poor GPAs from pursuing bachelor’s degrees.

The effect on transfer decisions of current college students: In

contrast to the effect of tuition increase on high school graduates, a tuition increase

has the opposite effect on the transfer rates for current college students. In
Table 2.17, there are fewer transfers from community colleges to universities.
At the same

time, there are more students who drop out from universities. As expected, the

increase of the dropout rate is higher in private universities, for the absolute amount

of a tuition increase is much higher for private universities. For community college

students, more of them tend to work with an associate degree instead of transferring

when facing higher costs.

In general, from this experiment, we can see that there are two opposite

driving forces that affect transfer rates from community colleges to universities. One

pushes students to receive associate degrees instead of transferring because the cost of

universities is so high. The other pushes more students to start in community colleges

rather than starting in universities, which increases the transfer rates. Therefore,

the transfer rates from community colleges to universities move in an ambiguous

direction. For instance, if the cost in universities is higher than the labor market

return for all students (an extreme situation to consider), no student will pursue a

bachelor’s degree, and there will be a zero transfer rate.

The experiment shows the value of community colleges when tuition in
universities increases. As an alternative choice for students, the existence of
community colleges improves student welfare. There is a vast literature on the
decrease in student welfare and graduation rates under higher tuition (Campbell
and Siegel (1967), Galper and Dunn (1969), and Leslie and Brinkman (1987)). If
we take community colleges into account, the change in student welfare and
graduation rates should not be as large as stated in that literature.

2.7.2 Improved academic preparedness

Chicago Mayor Rahm Emanuel proposed a longer school day plan. Under
his plan, most city high schools will extend their day to 7.5 hours. The goal is
to improve academic preparedness. In this simulation study, I examine how improved

academic preparedness would affect students’ college choices. I increase the students’

high school GPA and their SAT scores by 0.5 standard deviations, while keeping their

college GPA the same.

The simulation results show that transfer rates increase, which suggests that

improved academic preparedness improves the efficiency of the transfer function of

community colleges, while it also increases the graduation rates from universities.

From Table 2.18, there are 3.1% fewer students attending community colleges
in period 1. With increased expected ability, more students are willing to
start in universities. In fact, one of the main reasons for students to start
in community colleges and transfer later is to avoid the drop-out risk from
universities when facing ability uncertainties. With improved academic
preparedness, students face a lower drop-out risk from universities and are more
willing to invest in them.

Indeed, as shown in Table 2.19, the transfer rate from community colleges
to universities increases from 6.2% to 8.3% because community college students
anticipate higher returns from universities as they infer higher expected
abilities from improved high school GPA and SAT scores. Therefore, more
students choose to transfer.

In Table 2.20, as expected, 2.3% fewer students choose to work after the

first period, while 3.2% more students choose to study in private universities, mostly

because the improved academic preparedness encourages more students to attend

universities from the beginning and more transfers from community colleges.

This experiment shows that improving academic preparedness is beneficial

both for individuals and the government. First, the efficiency of the transfer function

of community college is improved, which decreases expenditures for postsecondary

education. Second, the average education level of the whole population rises by


increasing graduation rates from universities.

2.7.3 Decrease the transfer cost

In this experiment, I examine how students make college choices if the transfer
cost is decreased to half of its original value. The simulation shows that
decreasing the transfer cost is a very effective way to improve the efficiency
of the college market.

In Table 2.21, we can see that the enrollment rate of community colleges
increases from 28.3% to 40.8%. At the same time, 8.9% fewer students choose to
enroll in public universities in period 1, and 3.7% fewer students choose to
enroll in private universities. The reason is that if the transfer cost is low,
community colleges are more attractive to both financially constrained students
and low-ability students.

In Table 2.22, the transfer rates from community colleges to public univer-

sities are more than 4 times higher than the rates in the baseline model. There are

two causes for the high transfer rates. The first is that more students with low ability

tend to enter community college to learn their ability when the transfer cost is low.

The second is that the proportion of planned transfers also increases without high

transfer cost barriers.

In Table 2.23, the simulation results show that the drop-out rates from

universities decrease to 12.9% from 17.4%. The reason is that it is less costly to

transfer to other universities instead of dropping out when observing bad matches

between their abilities (relatively low GPA) and their current universities.

This experiment suggests that reducing the transfer cost is a very effective
way to improve the efficiency of the transfer program in community colleges and
to increase the university completion rate. The transfer cost could be reduced
in various ways. For example, community colleges could provide more information
sessions to promote the courses that are widely accepted by universities.

2.7.4 No transfer cost

In this experiment, I examine how students make college choices if there are
no transfer costs. The simulation shows that eliminating transfer costs is the
most effective way to improve the efficiency of the transfer program. With no
transfer costs, the education path changes completely, and graduation rates
from universities increase substantially.

In Table 2.24, we can see that more than half of the students choose to attend
community colleges in period 1, while the enrollment rate of universities
decreases from 71.7% to 45.5%. The reason is that if there is no transfer cost,
community colleges are perfect substitutes for universities in the first
period. Students without strong preferences for universities will attend
community colleges for financial reasons.

In Table 2.25, the transfer rates from community colleges to universities are

more than 6 times higher than the rates in the baseline model. The transfer rate

from community colleges to public universities increases from 3.8% to 26.2%, while

the transfer rate from community colleges to private universities increases from 2.4%

to 17.5%. There are two causes for the high transfer rates. The first is that there

are more planned transfers due to the zero transfer cost. The second is that the

proportion of unplanned transfers also increases without high transfer cost barriers.
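The size of these increases can be checked with simple arithmetic on the rates quoted above. The sketch uses only the baseline and zero-transfer-cost rates stated in the text; the variable names are illustrative.

```python
# Transfer rates from community colleges, in percent, as quoted in the text.
baseline = {"public": 3.8, "private": 2.4}
no_transfer_cost = {"public": 26.2, "private": 17.5}

# Ratio of the counterfactual rate to the baseline rate, by destination.
fold_increase = {
    dest: no_transfer_cost[dest] / baseline[dest] for dest in baseline
}

# Combined upward transfer rate (public plus private), before and after.
combined_ratio = sum(no_transfer_cost.values()) / sum(baseline.values())
```

The ratios are roughly 6.9 for public universities, 7.3 for private universities, and 7.0 for the combined rate, all consistent with the statement that transfer rates are more than 6 times higher than in the baseline model.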

In Table 2.26, the simulation results show that the percentage of students

who enter the labor market after the first period drops from 38.4% to 18.9% as a

result of two driving forces. The first is that the drop-out rates from universities

decrease. The underlying reason is that students can choose to transfer to other uni-

versities instead of dropping out when observing bad matches between their abilities

(relatively low GPA) and their current colleges. The second is that a zero transfer


cost encourages more students to transfer to universities from community colleges.

Therefore, the enrollment rate of public universities in period 2 increases to 48.8%

because it is the cheapest way to obtain a bachelor’s degree, while the enrollment

rate of private universities also marks a significant increase.

This experiment suggests that eliminating the transfer cost is the most efficient
way to reduce both individual and government expenditures on postsecondary
education. It can be achieved if community colleges cooperate with universities. Such

cooperation is possible if community colleges provide freshman-level and sophomore-

level courses under the same syllabus as the courses provided by universities, and

universities accept credits from community college students without discrimination.

2.8 CONCLUSION

In this paper, I develop and estimate a two-period ability-learning structural

model to provide a more complete picture of the college market by including commu-

nity colleges as a viable pathway to bachelor’s degrees. In the model, students make

college decisions with different financial constraints and uncertain abilities. They

choose between community colleges and universities in period 1, and make transfer

decisions in period 2. I estimate the model using simulated maximum likelihood

estimation. The estimated model closely replicates most of the patterns in the data.

The results show that the labor market does not discriminate against transfer
students: the effect of transfer on future income is not statistically different
from zero, which coincides with the finding of Kane and Rouse (1995). This
suggests that direct transfer costs, rather than any wage penalty, are the main
barrier to college transfer. The estimation results also show that family income
has a significant effect

on college choices, which provides evidence that students tend to start in community

colleges when facing financial constraints. Finally, the results support the idea that


the return to abilities is higher in universities than in community colleges.

Experiment 1 shows that the tuition increase in universities pushes more

students to community colleges, and also results in more dropouts from universities.

Experiment 2 shows that improved academic preparedness encourages more students

to start in universities, and also encourages more community college students to

transfer to four-year schools. At the same time, there are fewer dropouts from uni-

versities. Experiment 3 shows that with no transfer costs, the fraction of students

starting in community colleges almost doubles. The education pattern completely

changes. The transfer rate to universities is 6 times higher than in the baseline

model. Although the transfer rates from community colleges to universities increase

in all three experiments, transfer cost seems to be the main barrier to improving the

effectiveness of the transfer program.

Building on Fu (2010), Epple, Romano, and Sieg (2006), and this paper, one
extension is to consider jointly the strategies of colleges and students. Schools

may set different strategies to admit high school graduates and transfer students. A

dynamic general equilibrium model that takes into account both sides of the college

admission market would give a more complete picture of the decision making process

and the underlying driving forces. Consequently, the new model may yield different

outcomes and consequences of the examined policies.

Another extension is to modify the model by allowing for heterogeneous risk

aversion levels. The extension can be achieved by employing the constant relative risk

aversion utility, and allowing the risk aversion coefficient to be different for different

individuals. The extension can help us to understand the diversity of college
choices and college preferences from another perspective. As a result,
heterogeneous risk aversion levels may influence the estimation results and the
consequences of the examined policies.


Table 2.1: Percentage enrollment in period 1

Community college Public universities Private universities

27.3% 38.0% 34.7%

Table 2.2: Percentage of transfer in period 2

To \ From            Community college   Public university   Private university
Work                 18.5%               10.9%               7.2%
Public university    6.4%                \                   3.5%
Private university   2.4%                1.5%                \

Table 2.3: Average tuition fee


4587 3912 17201

Table 2.4: Average high school GPA and SAT score (normalized)


High School GPA −0.587 0.08725 0.2018

SAT score −0.702 0.01927 0.2150


Table 2.5: Average college GPA for transfer students


Work −0.367 −0.588 −0.425

Public University 0.169 \ −0.0692

Private University 0.571 −0.365 \

Table 2.6: Average family income


31541 49517 61656

Table 2.7: Average family income for transfer students


Work 32034 43155 51656

Public University 30066 \ 62766

Private University 33349 50693 \


Table 2.10: Estimation results

Variable Estimates Standard deviation

ρ1

Community college 9.735 0.006

Public schools 9.904 0.014

Private schools 9.825 0.013

ρ2

Community college -0.046 0.008



γ1 -0.005 0.027

µout 9.850 0.013

τ 3.176 0.019

θ




β

Family income /100,000 0.693 0.000

Number of people that parents support -0.073 0.000

Father’s education level 0.049 0.000

Mother’s education level 0.124 0.000

σ2S 0.0005 0.003

m




µ1




µ0 1.872 0.039

σ2µ 0.003 0.040

σ2α 34.601 0.848

Transfer cost -7.224 0.003


Table 2.11: Enrollment rate in period 1 in model fit

Data Simulated Sample

Community college 27.3% 28.3%

public university 38.0% 36.9%

Private university 34.7% 34.8%

Table 2.12: Enrollment rate in period 2 in model fit

Data Simulated Sample

Work 36.6% 38.4%




Table 2.13: Transfer rate in period 2 in model fit

To \ From                               Community college   Public university   Private university
Work                 Data               18.5%               10.9%               7.2%
                     Simulated sample   21.1%               8.8%                8.6%
Public university    Data               6.4%                \                   3.5%
                     Simulated sample   3.8%                \                   3.9%
Private university   Data               2.4%                1.5%                \
                     Simulated sample   2.4%                2.3%                \

Table 2.14: Enrollment rate in period 1 in experiment study 1

Baseline model New model



Private university 34.8% 32.0 %

Table 2.15: Transfer rate in period 2 in experiment study 1

To \ From                             Community college   Public university   Private university
Work                 Baseline model   21.1%               8.8%                8.6%
                     New model        21.6%               9.6%                9.1%
Public university    Baseline model   3.8%                \                   3.9%
                     New model        5.6%                \                   3.3%
Private university   Baseline model   2.4%                2.3%                \
                     New model        3.4%                1.7%                \




Work 38.4% 40.3%



Table 2.17: Transfer rate in period 2 for current student in experiment study 1



New model 21.4% 9.1% 9.3%














New model 16.8% 8.7% 10.5%







Work 38.4% 36.1%












New model 17.17% 5.4% 7.5%







Work 38.4% 31.1%












New model 10.9% 3.4% 4.6%







Work 38.4% 18.9%




Chapter 3

The Impact of Labor Migration on Children’s Health: Evidence from Rural China

3.1 INTRODUCTION

The changing economic climate in China has caused a dramatic increase

in ‘labor migration’. Labor migration is the migration of Chinese rural residents to

bigger cities where higher-paying, temporary jobs are available. In 2009, the floating

population in China reached 211 million adults, leaving over 58 million children

behind in homes far from their parents.

Parents’ utility function is composed of household consumption and children’s
health and education. The main reason for labor migration is to improve the
household’s financial situation, so that parents can increase household
consumption and afford better education and health insurance for their children.
Conventional wisdom suggests that these left-behind children are at risk of
developing health problems and physical and psycho-social stress¹ as a result of
a lack of parental guidance and relevant health information. These issues raise
concerns for social workers

and policy makers. Nevertheless, despite the fact that migrated parents are spending

less time with their children, these parents are able to provide better remittances,

nutrition and health relevant information as a result of their increased income and the

knowledge they obtain through their migration experiences. Little is known about

the extent to which the health of left-behind children is affected in China, particularly

those children who are too young to take care of themselves.

This paper aims to establish the overall consequences of parental migration

on the health outcomes and childcare of their left-behind children. The data used in

the analysis are primarily derived from four waves of the China Health and Nutrition

Survey (CHNS), collected in 2000, 2004, 2006 and 2009. The CHNS was designed to

examine the effects of Chinese health, nutrition, and family planning policies. The

people of nine provinces that vary substantially in geography, economic development,

and access to public resources were surveyed.

Some of the economic literature that focuses on labor migration in China

suggests that the remittances forwarded to families by migrated members benefit the

households financially. For instance, Du et al. (2006) and de Brauw and Giles (2008)

found that labor migration increases family consumption level. Giles (2006) also

found that having migrated family members could improve the family’s risk-coping

ability. On the other hand, there are also papers that focus on the left-behind family

members, particularly school-age children. Chen et al. (2009) found that educational

outcomes of children improved in migrant households. However, de Brauw and Mu
(2011) found that the nutrition of some school-age children from migrant
households was negatively affected.

¹ Currently, the schools in rural China do not have adequate systems or a
relevant curriculum in place to address these issues.

There are a few papers that study the health outcomes of left-behind children
in China. One of them is Mu and de Brauw (2011), which examined the weight of
left-behind children and found that older children (7-12 years) were more likely
to be underweight in migrant households than those who lived in non-migrant
households. Shu Zhang (2012) used survey data from the 2000 wave of the CHNS to
study the impact of labor migration on children’s health. She found no
significant health outcome effects for children whose fathers had migrated.
Neither paper, however, considers the potential endogeneity of parents’
migration and children’s health. Therefore, their results might be biased.

The main methodological obstacle to quantifying the effect of parents’ migration

is the endogeneity problem. This may be manifested as a problem of reverse

causation. Instead of being affected by parents’ migration status, children’s health

status could be a critical factor for parents when making migration decisions. For

example, parents whose children are in poor health may have to stay home to take

care of their children. On the other hand, they may have stronger financial incen-

tive to migrate to earn extra money to finance better medical care for their sick

children. Moreover, parents’ migration decisions could be correlated with children’s

health through unobserved variables, such as genetically inherited health deficiencies:

parents in poor health may be too sick to leave their children, who may share the same

conditions, and migrate to urban areas for work. Therefore, a significant correlation

between parents’ migration and children’s health status may not indicate causality.

To solve the endogeneity problem, we use instrumental variables (IV) estimation.

To be more specific, we instrument the father’s migration status with the historical

county level male migration rate, the mother’s migration status with the historical

county level female migration rate, and household migration status with the historical

county level household migration rate.

The historical county level migration rate is calculated as the average local migration

rate from the previous survey year. The historical county level migration rate by

gender is a suitable indicator to reflect the local culture and network of migration,

where the network refers to a person’s exposure to migration information from her

migrated friends or family members. Intuitively, people living in the areas with a

tradition of migration or with a better migration network are more likely to migrate.

In the first stage regression of this paper, it can be seen that this set of instruments

has strong predictive power for parents’ migration status. One might be concerned

that these instruments could influence child health directly, since county level migration

rates are also correlated with the local average income level. To address this, we

included the county level average income as an explanatory variable.
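
The construction of the historical migration rate described above can be sketched as follows. This is only an illustrative sketch: the record layout `(county, wave, sex, migrated)`, the function names, and the choice of surveyed individuals as the denominator are our assumptions, not the code used for the estimates.

```python
from collections import defaultdict

# CHNS waves that record migration status (1997 onwards).
WAVES = [1997, 2000, 2004, 2006, 2009]

def county_migration_rates(records):
    """Share of surveyed individuals who migrated, keyed by (county, wave, sex).

    `records` is a hypothetical list of (county, wave, sex, migrated) tuples.
    """
    counts = defaultdict(lambda: [0, 0])  # [migrants, total]
    for county, wave, sex, migrated in records:
        counts[(county, wave, sex)][0] += int(migrated)
        counts[(county, wave, sex)][1] += 1
    return {key: m / n for key, (m, n) in counts.items()}

def historical_rate(rates, county, wave, sex):
    """Instrument: the county rate from the previous survey wave, or None."""
    idx = WAVES.index(wave)
    if idx == 0:
        return None  # no earlier wave with migration data
    return rates.get((county, WAVES[idx - 1], sex))

# Made-up records for one county.
records = [
    ("A", 1997, "male", True), ("A", 1997, "male", False),
    ("A", 1997, "female", False), ("A", 1997, "female", False),
    ("A", 2000, "male", True), ("A", 2000, "male", True),
]
rates = county_migration_rates(records)
instrument_2000 = historical_rate(rates, "A", 2000, "male")
```

In this toy example a father observed in 2000 is assigned the 1997 male migration rate of his county, so the instrument predates the migration decision it is meant to predict.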

In this paper, we adopted the panel structure of the data and employed a

fixed effects model to study the overall health status of left-behind children. The

causal effect of migration is identified by two-stage estimation. The estimation

results are presented with and without the IV correction. Moreover, we conducted

two robustness checks to support our estimation results. In the first robustness check,

we excluded household income as an explanatory variable, as household income might

be correlated with unobserved shocks that could also affect children’s health. This

correlation could lead to biases in estimation. In the second robustness check, we

excluded the number of elders in the household, as family size could affect people’s

migration decisions because children could be taken care of by other family members.

As a result, the estimation results might be biased.

Generally speaking, we found there were few significant effects of parents’

migration on child outcomes. A possible explanation for this is that the coefficients

capture the net effects of parents’ migration. Children with migrated parents receive

less physical care, but may receive more financial support, access to better nutrition

products sent from their mothers, and better nutritional information. There are

both positive and negative effects on children’s health. The coefficients imply that

the positive effects of parents’ migration are about the same as the negative effects

on children’s health. Though the regression results on the whole sample were not

significant, the regression results from subsamples provided more insights. They showed

that children aged between 5 and 10 were positively affected by fathers’ migration,

possibly because these children received higher remittances and had better access to

nutrition information and products.

Our paper contributes to the literature in a number of ways. Firstly, we used

novel instrumental variables to deal with the endogenous nature of parents’ migration

decisions; these instruments are able to predict the migration propensity of parents. Secondly,

we studied the causal effects of father’s and mother’s migration status on

children’s health outcomes separately, and these were significantly different. Thirdly, in addition

to traditional measurements of child health that focus on height and weight, we also

considered nutrient intake (consumption of calories and protein), immunization shots

and childcare. These measures provided a more comprehensive picture of the impact

of labor migration on children’s health.

The paper proceeds as follows: Section 2 discusses the history of labor mi-

gration and child nutrition in China; Section 3 describes the conceptual framework;

Section 4 discusses the data; Section 5 describes the empirical specification; Section

6 presents the main results regarding the effect of parent migration on the physi-

cal health of children; Section 7 goes through several robustness checks; Section 8

discusses the results from subsamples; and a conclusion is provided in Section 9.

3.2 BACKGROUND

According to the analysis report of labor migration in China by the National

Bureau of Statistics of China (2012), the total number of migrated labor from rural

areas increased from 225 million in 2008 to 252 million in 2011. The rapid growth of

rural-to-urban migration has been an important demographic trend in China. In this

section, we first introduce the background of labor migration in China and its impact

on rural communities, followed by a discussion on how the health of rural children

has changed over time.

3.2.1 Labor Migration and Children Left Behind in Rural China

Since 1958, under its centrally planned economy, China has used

the household registration system (HuKou system) to control labor migration

from rural to urban areas. Under the HuKou system, households are divided into

Agriculture HuKou and non-Agriculture HuKou, and rural-urban migration

was strictly restricted. In the 1990s, 83% of households were classified under the

Agriculture HuKou category, according to Mallee (1995).

In 1988, the HuKou reform took place, whereby rural migrants were allowed

to obtain temporary residence. However, according to the World Bank (2009), rural

migrants were not able to access the urban welfare system, including education, health

and the social safety net. Therefore, rural migrants maintained close ties to their

hometown village, as their benefits were linked to their household registration status.

According to Bao et al. (2009), the large income gap between urban and ru-

ral areas, created by decades of urban-rural segregation and uneven economic growth,

provided strong incentives for rural people to move to urban areas, especially after

rural-urban labor flow was officially permitted. As a result, China has experienced

dramatic changes in its labor market since the 1990s. Using the one percent samples

from the 1990 and 2000 waves of the Population Census and the one percent sample

from the 1995 wave of the population survey, Liang and Ma (2004) found that the migrant

population grew from 20 million in 1990 to 45 million in 1995 and to 79 million in 2000.

It is important to note the different migration rates by gender, as a mother’s

migration may have a different impact on child health than a father’s migration. According

to Zhao (1999) and Rozelle et al. (1999), there were substantially more migrated

men than women in the mid-1990s. Mu and Van de Walle (2011) showed that the

gender gap in migration has increased over time. Our findings using CHNS data

support this.

3.2.2 Health of Children in China

The health of children in China has improved with economic growth. Shen

et al. (1996) showed that the average height of children aged two to five years had

increased by 3.8 cm in 1990 when compared with data from 1975. Chen (2000) found

the prevalence of underweight children and the rate of stunting (the percentage of

children with height-for-age Z-scores below minus two) among Chinese children declined

from 1990 to 1995. Svedberg (2006) found that the stunting rate had decreased

further by 2002. Additionally, Osberg et al. (2009) showed that height-for-age Z-score

in children increased between 1991 and 2000. The changes in children’s health might

be explained by the improvement of the diet quality in China, which is supported by

Du et al. (2004). They showed that the nutritional intake of children shifted from

carbohydrates to high fat and high energy-density foods.
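
As a small illustration of the stunting measure used in the studies cited above, the rate is simply the share of children whose height-for-age Z-score falls below the conventional cutoff of minus two. The function name and the example values below are hypothetical and are not taken from any of the cited data sets.

```python
def stunting_rate(haz_scores, cutoff=-2.0):
    """Fraction of children whose height-for-age Z-score is below the cutoff."""
    if not haz_scores:
        raise ValueError("need at least one Z-score")
    return sum(z < cutoff for z in haz_scores) / len(haz_scores)

# Made-up Z-scores for five children; two of them fall below -2.
example_haz = [-2.5, -1.0, 0.3, -2.1, -0.4]
rate = stunting_rate(example_haz)
```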

Although the health of children in China has improved on average, malnutrition

is still an issue. According to Mu and de Brauw (2011), the stunting rate in

2002 was still nearly 15%, indicating that a substantial portion of the population remains

malnourished. There are also other challenges in improving nutrition among children.

Liu et al. (2012) analyzed urban-rural disparities of China’s child health and

nutritional status using CHNS data from 1989 to 2006, and showed that on average,

urban children have 0.29 higher height-for-age z-scores and 0.19 greater weight-for-

age z-scores than rural children.

3.3 CONCEPTUAL FRAMEWORK

There are at least three main channels through which migration might affect

the health status of children: the income effect, the time effect, and the information

effect.

First of all, the primary reason for a member of a household to migrate is

to increase household income. We anticipate the increased family income will have

a positive effect on child health outcomes for various reasons. For example, extra

income could increase diet quality (Du et al., 2004), by switching from high carbo-

hydrate food to high fat and high energy-density foods. Therefore, the calorie intake

may increase when income increases. Moreover diet improvements might improve

height-for-age Z-score and weight-for-age Z-score. Finally, health service utilization

for children may increase as well. For example, migrant parents may be able to afford

to have their children immunized as a result of increased income.

The second channel through which migration may affect the health status

of children is through the time allocated to childcare. Mu and van de Walle (2011)

found that when one family member leaves for urban work, the remaining family

members must take on an increased farm work load. As a result, they may spend

less time cooking and child rearing. Consequently, child health outcomes may be

affected. In cases where both parents have migrated, children might be left in the

care of relatives, usually their elderly grandparents. In such cases, children might

not have a regular diet routine and may eat poorly. As a result, the child’s nutrient

intake, and subsequently, their height and weight, may be affected.

The third channel is through better access to nutritional information from

migrated parents. People typically migrate to urban areas that have better economic

conditions and health services. Therefore, migrants should have better access to

nutritional information. For example, migrants may learn more about healthy diets,

and encourage their children to eat more nutritious foods. Moreover, they may learn

more about the importance of immunization, and have the incentive to let their

children get immunized.

As explained above, the direction of the effect of parent migration on child

health outcomes is ambiguous. In the next section we present the data and empirical

framework.

3.4 DATA

The China Health and Nutrition Survey (CHNS) was designed to examine

the effects of the health, nutrition, and family planning policies and programs imple-

mented by national and local governments and to check how the social and economic

transformation of Chinese society is affecting the health and nutritional status of

the Chinese population. The Survey covered nine provinces that vary substantially

in geography, economic development, and access to public resources. Demographic

characteristics, household assets and other information were also collected as part

of the survey. The first round of the CHNS, including household, community, and

health/family planning facility data, was collected in 1989. Seven additional panels

were collected in 1991, 1993, 1997, 2000, 2004, 2006 and 2009.

From 1997 onwards, families were asked to provide reasons for migrated or

absent family members as part of CHNS. A migrant was defined as any individual

who had left the home at the time of the survey to seek employment. The data used

in the analysis were primarily derived from four waves of the CHNS, collected in

2000, 2004, 2006 and 2009. We did not use data from the 1997 wave

of the survey because we used the historical migration rate from the previous wave

as an instrumental variable, and this information was not available for the 1997 wave.

In the first wave (1997) of the CHNS, 15,917 individuals were surveyed.

Survey response rates and attrition are difficult to determine for two reasons: firstly,

the participants who had migrated in one survey year may have returned home in a

later year; and secondly, new participants were recruited following the 1997 survey, to

replenish samples if a community had fewer than 20 households, or if participants had

formed a new household or separated from their family into a new housing unit in the

same community. If we calculated response rate based on those who participated in

previous survey rounds remaining in the current survey, our response rates would be

around 88% at individual level and 90% at household level (Popkin et al. 2010). Mu

and de Brauw (2011) showed that the attrition was random and should not generate

panel attrition bias.

For estimation purposes, we dropped observations where one or more of

the following critical pieces of information were missing: (a) child’s height,

(b) child’s weight, (c) child’s calorie intake, (d) child’s protein intake, (e) parents’

education level, and (f) parents’ migration status. To calculate the height-for-age

Z-score (HAZ) and weight-for-age Z-score (WAZ), we used the most recent growth

charts made available by the World Health Organization (WHO). To measure child’s

calorie and protein intake, we used a set of age and gender-specific Recommended Di-

etary Allowances (RDAs) sanctioned by the Chinese Nutrition Society (2000). RDAs

are based on average energy allowances, i.e. calorie intake for each specific age and

gender group.
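
The two kinds of normalized measures described above can be sketched as follows. This is a simplified illustration: we assume a plain reference median and standard deviation for the Z-score, whereas the actual WHO charts involve further details, and all reference numbers below are invented rather than taken from the WHO or the Chinese Nutrition Society.

```python
def z_score(value, ref_median, ref_sd):
    """Distance of a measurement from the reference median, in reference SDs."""
    if ref_sd <= 0:
        raise ValueError("reference SD must be positive")
    return (value - ref_median) / ref_sd

def intake_ratio(intake, rda):
    """Daily intake relative to the age- and gender-specific RDA."""
    if rda <= 0:
        raise ValueError("RDA must be positive")
    return intake / rda

# Hypothetical child: 105 cm tall against a reference of 110 cm (SD 4 cm),
# eating 1,200 kcal per day against a recommended 1,500 kcal.
haz = z_score(value=105.0, ref_median=110.0, ref_sd=4.0)
calorie_ratio = intake_ratio(intake=1200.0, rda=1500.0)
```

A negative Z-score means the child is below the reference median, and a ratio under 1 means intake is below the recommended allowance.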

We randomly selected one child from families with several children to avoid

any biases of related children and other unobserved variables. In this paper, we focus

on children under ten years of age, because they are at greater risk of developing

problems associated with malnutrition and are more likely to respond to nutritional

interventions (WHO, 1995). We excluded households in which the children were older

than ten. After the aforementioned data filtering procedures, the final sample

is an unbalanced panel containing 1,600 children and 2,201 observations.
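
The selection rule described above (keep children under ten, then draw one child at random per household) could be implemented along the following lines. This is a cross-sectional sketch with hypothetical field names (`household`, `id`, `age`) and an arbitrary seed; it does not reproduce the actual draw used to build the sample.

```python
import random

def select_children(children, max_age=10, seed=0):
    """Keep children under max_age, then draw one at random per household."""
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_household = {}
    for child in children:
        if child["age"] < max_age:
            by_household.setdefault(child["household"], []).append(child)
    return {hh: rng.choice(kids) for hh, kids in sorted(by_household.items())}

# Made-up roster: household 1 has two eligible children, household 2 has none.
children = [
    {"household": 1, "id": "a", "age": 4},
    {"household": 1, "id": "b", "age": 8},
    {"household": 2, "id": "c", "age": 12},
    {"household": 3, "id": "d", "age": 6},
]
selected = select_children(children)
```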

There are several reasons that only 40% of the children had more than one

observation in the data. The first is that we only kept the observation when we had

both the child data and their parent’s data. For instance, if the mother or father did

not respond to the survey, the child’s response was excluded as it could not be used.

As the individual response rate is 88%, the probability that the child is included in

the next survey year is calculated by multiplying the child’s response rate by their

parents’ response rates, which equates to 0.68 (0.88^3). The second is that the individual

response rate is not 88% for each survey year: it is 83% in year 2000, and 80% in

year 2004 (Popkin et al. 2010). The third is that there are missing variables. For

instance, the response rate of the question for migration status is less than 80%.

After the exclusion of children who are older than ten, there is approximately a 40%

probability that a child is included in more than one wave of the survey.
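
The arithmetic behind the retention figure can be checked directly. The sketch below assumes, as the calculation in the text implicitly does, that the child, the mother and the father respond independently; the helper name is ours, and applying the same formula to the 83% and 80% wave-specific rates is our own illustration.

```python
import math

def joint_response_probability(rates):
    """Probability that all listed persons respond, assuming independence."""
    return math.prod(rates)

# Child, mother and father each respond with probability 0.88.
p_overall = joint_response_probability([0.88, 0.88, 0.88])

# The same calculation with the lower wave-specific rates quoted in the text.
p_2000 = joint_response_probability([0.83] * 3)
p_2004 = joint_response_probability([0.80] * 3)
```

With the lower wave-specific rates the joint probability drops well below 0.68, which is consistent with the modest share of children observed more than once.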

From Table (3.1), we can see the migration rate kept increasing and reached

a peak in 2006. The table shows that fathers were more likely than mothers

to migrate from households. Both parents had migrated from relatively few families,

implying that most families had one parent left in the household to take care of the

children. From the data, it is clear that labor migration became quite common in

rural areas after year 2000. In year 2006, 21% of children had at least one parent

who had migrated, and both parents of 4% of the children sampled had migrated.

However these figures likely underreport the true scale of migration because we did

not account for migration that took place over shorter periods of time (Cai et al.,

2008).

Table (3.2a) compares differences in health outcomes and care of children

between children with and without migrated parents. Children are defined as left-behind

if at least one of their parents was a migrant. According to the table, the left-behind

children on average consumed less protein than children who lived with both of their

parents. At the same time, left-behind children were shorter and weighed less on

average than children who lived with both of their parents. Table (3.2b) shows dif-

ferences between children with and without migrated fathers in health outcomes and

care. By comparing the data from Table (3.2a) and (3.2b) it is evident that there

were fewer significant differences in child health outcomes and care between families with

migrant fathers and families with non-migrant fathers. Children with migrant fathers have

significantly smaller weight-for-age Z-scores and protein/RDA ratios. Table (3.2c) shows the

differences between children with and without migrated mothers. Unlike children

with migrant fathers, children with migrant mothers consumed significantly less protein

and calories. Although the rates of migration were smaller for mothers, mothers’ migration

seemed to have a more significant effect on child outcomes than father migration

or household migration.

We can also see that for both migrant and non-migrant households, the

average height-for-age Z-score and weight-for-age Z-score were less than 0. The Z-scores

show that children in China are on average shorter and lighter in weight

compared to the WHO standards. The WHO standards were formulated in the 1970s

by combining growth data from two distinct data sets in the USA. The summary statistics

show that children in China have relatively poor health conditions compared to

children in the USA, while left-behind Chinese children are even more disadvantaged

compared to Chinese children who live with both parents. Moreover the average

Calories/RDA and Protein/RDA ratios are under 1 for both migrant and non-migrant

households, which implies that children in China on average consume less protein and

calories than recommended.

Table (3.3) shows the summary statistics of the control variables. House-

hold income is lower in households with migrants. The difference in income could

be explained by the fact that the migrated household members’ income is not in-

cluded in household income, although the remittances provided by the migrant are

included. The table also shows that migrated parents have lower education levels and

are younger. This trend could be a result of the local economic conditions, as people

who live in areas with better economic conditions are less likely to migrate. They also

tend to receive more education and have children later in life. For similar reasons,

county level average height and weight are lower for migrant households because they

are proxies for features of local economy development. Moreover the number of fe-

males over 60 in the household is higher in households with migrants, which suggests

that the number of elders in the household influences families’ migration decisions.

In general, people who migrate are more likely to live in big families, and poor ar-

eas. At the same time, they are more likely to have lower education levels and have

children at younger ages. The historical county level migration rates will be used as

instrumental variables and will be discussed later.

3.5 EMPIRICAL SPECIFICATION

In this paper, we adopt three sets of measures of health status. The first

includes child’s weight-for-age Z-score (WAZ) and height-for-age Z-score (HAZ). The second

set includes child’s daily calorie intake, child’s daily calorie intake/RDA, child’s

daily protein intake, and child’s daily protein intake/RDA. The third set includes

the number of immunization shots that the child received in the survey year, and

whether the child has been cared for by non-household members.

We aimed to identify cause-effect relationships of parents’ migration status

on children’s health outcomes. In addition to parents’ migration status, child health

is also affected by other demographic factors, such as gender, parents’ education

level, family size, the number of siblings, and household income. These were used as

control variables in the estimation model.

With panel data, two models could be applied: the fixed effects model or

random effects model. The Hausman test showed that the random effects model is

inconsistent, so the fixed effects model is employed in this paper. The panel data is

unbalanced. There are 480 children with more than one observation in this data set,

which is the effective sample. Among the effective sample, there are 112 parents who

changed their migration status. The number of parents who changed their migration

status in different survey years helped us to identify the impact of migration on

children’s health.
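
For readers unfamiliar with the Hausman comparison mentioned above, a scalar version for a single coefficient can be sketched as follows. The coefficient estimates and variances are invented for illustration and are not our estimates; the constant 3.841 is the 5% critical value of the chi-squared distribution with one degree of freedom.

```python
def hausman_statistic(b_fe, b_re, var_fe, var_re):
    """Scalar Hausman statistic comparing fixed and random effects estimates."""
    diff_var = var_fe - var_re
    if diff_var <= 0:
        raise ValueError("variance difference must be positive")
    return (b_fe - b_re) ** 2 / diff_var

CHI2_1DF_5PCT = 3.841  # 5% critical value, chi-squared with 1 degree of freedom

# Hypothetical estimates of one coefficient under the two models.
stat = hausman_statistic(b_fe=0.30, b_re=0.10, var_fe=0.010, var_re=0.004)
reject_random_effects = stat > CHI2_1DF_5PCT
```

A large statistic indicates that the two estimators disagree by more than sampling noise would allow, which is the case in which the random effects model is treated as inconsistent.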

We employed three separate fixed effects models to identify the effects of

household migration, fathers’ migration and mothers’ migration on child health out-

comes and care. The fixed effects model that we employed to identify the effect of

household migration is

$$H_{it} = \alpha_i + \beta_1 M_{it} + \beta_2 X_{it} + \varepsilon_{it} \qquad (3.1)$$

where $H_{it}$ is child $i$’s health outcome at time $t$, $\alpha_i$ is the child fixed effect, and $M_{it}$ is

child $i$’s household migration status at time $t$. The dummy variable equals 1 if either or

both of the child’s parents had migrated out at time $t$, and 0 otherwise. $X_{it}$ is a vector of

demographic variables including a gender dummy (female as 1), parents’ education level,

household income, the number of males aged over 60 in the household, the number of

females aged over 60 in the household, the number of boys under age ten in the household,

the number of girls under age ten in the household, the county level average height, the

county level average weight, the county level average daily calorie consumption/RDA, the

county level average protein consumption/RDA, and the county level average income.

Here $\varepsilon_{it}$ is an error term for individual $i$ at time $t$.
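
To make the role of the child fixed effect concrete, the sketch below computes the within estimator for the migration coefficient in a stripped-down version of Equation (3.1). It is a simplification under stated assumptions: the control vector is omitted, the data are invented, and the function name is ours.

```python
from collections import defaultdict

def within_estimator(rows):
    """Slope of H on M after subtracting each child's own means.

    `rows` is a list of (child_id, H, M) tuples; controls are omitted here.
    """
    groups = defaultdict(list)
    for child, h, m in rows:
        groups[child].append((h, m))
    num = den = 0.0
    for obs in groups.values():
        h_bar = sum(h for h, _ in obs) / len(obs)
        m_bar = sum(m for _, m in obs) / len(obs)
        for h, m in obs:
            num += (m - m_bar) * (h - h_bar)
            den += (m - m_bar) ** 2
    if den == 0:
        raise ValueError("no within-child variation in M")
    return num / den

# Invented panel: each child has its own level, and migration lowers H by 0.5.
rows = [
    (1, 1.0, 0), (1, 0.5, 1),    # status changes: contributes to the estimate
    (2, -2.0, 0), (2, -2.5, 1),  # status changes: contributes to the estimate
    (3, 0.3, 0), (3, 0.3, 0),    # status never changes: contributes nothing
]
beta_1 = within_estimator(rows)
```

The third child illustrates why identification rests on the parents who changed migration status between waves: a child whose status is constant drops out of both the numerator and the denominator.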

The fixed effects models that we used to identify the effect of father’s and

mother’s migration on child health are similar to Equation (3.1). The only difference

is the dummy variable $M_{it}$. To capture the effect of fathers’ migration, the dummy

variable $M_{it}$ is redefined to equal 1 if the child’s father has migrated out at time

$t$, and 0 otherwise. To capture the effect of mothers’ migration, the dummy variable

$M_{it}$ is redefined to equal 1 if the child’s mother has migrated out at time $t$, and 0

otherwise.
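
The three alternative definitions of the migration dummy can be summarized in a few lines. The function and key names below are hypothetical and serve only to illustrate the definitions given above.

```python
def migration_dummies(father_migrated, mother_migrated):
    """Return the three alternative migration dummies for one child-wave."""
    return {
        "household": int(father_migrated or mother_migrated),
        "father": int(father_migrated),
        "mother": int(mother_migrated),
    }

# A child whose father has migrated while the mother stays at home.
only_father = migration_dummies(father_migrated=True, mother_migrated=False)
```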

We did not include the number of working age males/females in the house-

hold as explanatory variables for two reasons. The first is that we have already

controlled for household income and parents’ migration status. The second is that

preliminary results show that the number of working age males/females in the house-

hold does not have a significant effect on children’s health outcomes. In the model,

we use the number of boys/girls instead of the number of siblings because many chil-

dren come from large families in rural China and often live with their cousins and

their siblings. Therefore the total number of children in the household could impact

the child’s health.

Household income is used as a control variable instead of individual income.

The reason is that there are too many missing values for individual income, especially

for migrants. The remittances are included in household income but we cannot break

them out, as the survey did not ask about the amount of remittances. We also tried including

more variables that measure the households’ assets as explanatory variables, but

the coefficients were not significant. In the end, we kept only household income in the

regression.

A large literature discusses the biases that may arise from the endogenous

nature of labor migration. In our CHNS sample, endogeneity mainly arose

because a child’s health status also affects parents’ migration decisions. The common

methodology adopted to correct such biases is the instrumental variable

approach, which isolates exogenous variation in parents’ migration status. We adopted an

IV approach and used historical county level migration rates as instruments. The

historical county level migration rate is calculated as the local migration rate from

the previous survey year. The historical migration rate could proxy the migration network.

The difference between the average male migration rate and female migration

rate could also be a proxy for local culture.
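
The logic of combining child fixed effects with an instrument can be illustrated with a deliberately small example. The sketch assumes one endogenous dummy, one instrument and no controls, so the IV estimate reduces to a ratio of within-child covariances; the data are constructed by hand so that the true effect is -0.5, and none of the names or numbers come from our estimation.

```python
from collections import defaultdict

def demean_by_child(rows):
    """rows: (child_id, H, M, Z). Subtract each child's own means."""
    groups = defaultdict(list)
    for child, h, m, z in rows:
        groups[child].append((h, m, z))
    out = []
    for obs in groups.values():
        n = len(obs)
        means = [sum(o[k] for o in obs) / n for k in range(3)]
        out.extend(tuple(o[k] - means[k] for k in range(3)) for o in obs)
    return out

def fe_ols_and_iv(rows):
    """Return (fixed effects OLS slope, fixed effects IV slope) of H on M."""
    d = demean_by_child(rows)
    s_mh = sum(m * h for h, m, z in d)
    s_mm = sum(m * m for h, m, z in d)
    s_zh = sum(z * h for h, m, z in d)
    s_zm = sum(z * m for h, m, z in d)
    return s_mh / s_mm, s_zh / s_zm

# Invented two-wave panel: H = alpha_i - 0.5 * M + u, where the shock u is
# correlated with M within child but uncorrelated with the instrument Z.
rows = [
    (1, 1.0, 0, 0.1), (1, 0.7, 1, 0.4),
    (2, 0.0, 0, 0.2), (2, -0.1, 1, 0.3),
    (3, -1.0, 0, 0.3), (3, -0.5, 0, 0.2),
    (4, 2.0, 0, 0.2), (4, 1.5, 0, 0.3),
]
beta_fe, beta_iv = fe_ols_and_iv(rows)
```

In this constructed example the fixed effects slope is pulled toward zero by the correlated shock, while the instrumented slope recovers the value used to build the data.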

3.6 ESTIMATION RESULTS

3.6.1 Results of Ordinary Least Squares model

Table (3.4a) and Table (3.4b) present the baseline effects of

the household migration status on child health outcomes and care from the ordinary

least squares regressions. Here, the child household migration status dummy variable

equals one if either or both the child’s parents have migrated. Table (3.5a) and Table

(3.5b) present the effects of the fathers’ migration status on child health outcomes

and care from the ordinary least squares (OLS) regressions. Table (3.6a) and Table

(3.6b) present the effects of the mothers’ migration status on child health outcomes

and care from the ordinary least squares regressions.

Though the OLS regression analysis may not be able to capture the exact

relationship between labor migration and children’s health, the results give us an

idea of the correlation between children’s health and the explanatory variables. They

show that parental migration does not necessarily negatively correlate with children’s

health outcomes. Firstly, coefficients are similar for father’s migration and household

migration status because the majority of household migrations are fathers’ migration.

Father’s migration and household migration are positively correlated with children’s

height-for-age Z-score, and negatively correlated with the number of immunization

shots that children received. However, father’s migration and household migration

have no significant correlation with children’s nutrient intake. Secondly, compared

with father’s migration, a mother’s migration has a stronger correlation with children’s

health outcomes, although the rate of migration is smaller for mothers. For instance,

mothers’ migration is positively correlated with children’s height-for-age Z-score and

negatively correlated with children’s daily calorie and protein intakes. The fact that

migrated mothers are more likely to access childcare knowledge may explain the positive

correlation, as childcare knowledge is positively correlated with children’s physical

outcomes. However, a mother’s absence from home means she is not able to pay

attention to or take care of her child’s diet, which leads to lower calorie and protein

consumption in her children.

When the OLS regression results are compared to Table (3.2a), Table (3.2b)

and Table (3.2c), the coefficients of parents’ migration and household migration cease

to be significant for some measures of child health outcomes and care in the OLS results.

This may be because both parents’/household migration and children’s health

outcomes are correlated with the added explanatory variables in the OLS regression.

For instance, children’s weight-for-age Z-score is significantly different for migrant

household and non-migrant household in Table (3.2a), but the coefficient of house-

hold migration on children’s weight-for-age Z-score is not significant in Table (3.4a).

It can be seen that in Table (3.4a) children’s weight-for-age Z-score is significantly

correlated with fathers’ education, county level average weight and height. At the

same time, we can see from Table (3.3) that fathers’ education, county level av-

erage weight and height are all significantly different for migrant household and

non-migrant household. Therefore the correlation between those control variables

and household migration status explains the difference in the OLS results and the

summary statistics. Unlike father’s migration and household migration, mother’s migration

remains significant in Table (3.6a) and Table (3.6b) for the variables that are

significantly different for children with migrant mothers and non-migrant mothers

in Table (3.2c). That is, the correlations between mothers’ migration status and some child

outcomes remain significant when control variables are added.
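
The gap between the raw comparisons and the OLS coefficients discussed above can be read through the standard omitted variable bias formula. The notation below (a single control $X_i$, the auxiliary slope $\delta$, and the short regression slope $b_M$) is introduced here for illustration only and does not correspond to any column of the tables.

```latex
\begin{align*}
\text{long regression:}\quad      & H_i = \alpha + \beta M_i + \gamma X_i + \varepsilon_i \\
\text{auxiliary regression:}\quad & X_i = \pi + \delta M_i + v_i \\
\text{short regression slope:}\quad & b_M = \beta + \gamma\,\delta
\end{align*}
```

For example, if a control such as fathers’ education is positively related to the weight-for-age Z-score ($\gamma > 0$) and is lower in migrant households ($\delta < 0$), the raw difference $b_M$ is more negative than $\beta$, so a significant raw gap can coexist with an insignificant OLS coefficient.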

3.6.2 Results of Fixed Effects model

Table (3.7a) and Table (3.7b) show the effects of household migration status

on the health outcomes and care using the fixed effects model approach without

considering the endogeneity of migration. Similarly, Table (3.8a) and Table (3.8b)

show the effects of father’s migration status on children’s health outcomes and care.

Table (3.9a) and Table (3.9b) show the effects of mother’s migration status on chil-

dren’s health outcomes and care.

With the aid of the fixed effects model, we considerably reduced the threat

of omitted variable bias. Compared with the OLS regression results, most of

the coefficients of parents’ migration and household migration become insignificant

in the fixed effects model results, especially the coefficients of mothers’ migration.

The results imply that there must be some omitted variables that are correlated with

parents’ migration decisions and may have causal effects on children’s outcomes. Even

though we have tried to include most of the relevant variables for children’s outcome,

due to the limitations of the data available, some factors may still be left out. For

instance, we cannot observe whether the child has a chronic health condition. A chronic

health condition is defined as a health problem that persists for over three months,

affects the child’s normal activities, and requires hospitalization and/or home health

care and/or extensive medical care (see footnote 2). Children with chronic health conditions usually

require more time and care from their parents, as well as increased financial support.

Within Chinese families, the mother usually spends more time taking care of the child

while the father is the main income provider. Therefore, in households with a chronically

ill child, compared to households with healthy children, the mother is more likely

to stay at home (less likely to migrate), while the father is more likely to migrate for

higher wages. This may explain why the results from the fixed effects model show

that household migration and father’s migration are now negatively correlated with

children’s weight-for-age Z-score, whereas they are not significant in the OLS model. For

the same reason, the coefficients of mothers’ migration become insignificant in the

fixed effects model results.
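
The role of the fixed effects transformation can be illustrated with a small simulation. The sketch below is hypothetical and does not use the CHNS data: it assumes an unobserved, time-invariant chronic condition that lowers a child's health score and raises the probability that a parent migrates. All parameter values, signs, and names are invented for illustration and need not match the estimates in the tables. Under these assumptions pooled OLS confounds the two channels, while the within estimator, which subtracts each child's own mean, recovers the assumed effect.

```python
import random

def slope(x, y):
    """Slope of a one-regressor least-squares fit of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx

def simulate(n_children=2000, n_waves=3, beta=0.2, seed=0):
    """Return (pooled OLS slope, within slope) for simulated panel data."""
    rng = random.Random(seed)
    x_all, y_all, x_within, y_within = [], [], [], []
    for _ in range(n_children):
        chronic = 1 if rng.random() < 0.3 else 0   # unobserved, fixed over time
        p_migrate = 0.2 + 0.4 * chronic            # assumed selection into migration
        xs = [1 if rng.random() < p_migrate else 0 for _ in range(n_waves)]
        ys = [beta * x - 1.0 * chronic + rng.gauss(0, 0.5) for x in xs]
        x_bar, y_bar = sum(xs) / n_waves, sum(ys) / n_waves
        x_all += xs
        y_all += ys
        # Within transformation: subtract each child's own mean.
        x_within += [x - x_bar for x in xs]
        y_within += [y - y_bar for y in ys]
    return slope(x_all, y_all), slope(x_within, y_within)

ols_slope, fe_slope = simulate()
print(f"pooled OLS: {ols_slope:.2f}, fixed effects: {fe_slope:.2f}, assumed effect: 0.20")
```

In this toy setting the pooled estimate is pulled away from the assumed effect by the omitted condition, whereas the within estimate is not, because the condition is constant for each child and drops out after demeaning.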

3.6.3 Results of Fixed Effects Model with Instrumental Variables

Besides omitted variable bias caused by children with chronic health condi-

tions, endogeneity bias may be partially responsible for the insignificant fixed effects

results. First of all, the endogeneity could be a result of reverse causality. Parents’

migration decisions may depend on children’s health status. For instance, mothers

are less likely to migrate when children have relatively poor health status. Moreover,

both parents’ migration decisions and children’s outcomes could be correlated with

the local environment and development level. Though we have tried to control for those local

factors by adding county average variables such as income as independent variables,

it is hard to control for all the local differences using current data. For example, the

available data provide little information on the availability and condition of local

Footnote 2: such as asthma (the most common) and sickle cell anemia.

transport. In towns that have railways or paved roads, people are more likely to mi-

grate, and the local market is more prosperous, factors which could favor children’s

health. In this case, both parent migration and children’s outcome are positively

correlated with these unobservable factors, which may strengthen the positive corre-

lation between them.

To solve this endogeneity problem and identify the potential causal effects

of migration on children, we adopted the instrumental variable method. The three endogenous

variables are the household migration status dummy variable, the father’s

migration status dummy variable and the mother’s migration status dummy variable.

The child’s household migration status dummy variable equals one if either

or both of the child’s parents have migrated. The instrumental variables are the historical

county level average household migration rate, the historical county level average

male migration rate, and the historical county level average female migration rate

respectively. The instruments are gender specific. In the survey data, there are between

20 and 30 households in each county. The instruments capture the migration

network and local culture. It is conceivable that the migrant network affects migration

decisions. From Table (3.3), we can see that households in counties with higher historical

migration rates are more likely to have migrant household members. The local average

migration rates may affect children’s health and care as a result of the income

the parent earned from the urban job. Once we control for the household income

directly in the regression, the local average migration rate is unlikely to affect children’s

anthropometric outcomes. Another threat to the validity of the IV is that

both the IV and children’s health outcomes may be correlated with unobserved variables.

For instance, the government policy may affect both the historical migration rate and

children’s health. In China, the change of the HuKou system is the biggest change

in government policy that affects labor migration. The policy may affect children’s

health through the development of the local economy and labor migration. As we

have already controlled for the county level average income in the regression, the change

of the HuKou system is unlikely to affect children’s health.
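
As a rough sketch of how such an instrument could be assembled, the code below computes, for each county and survey wave, the share of households with a migrant in the previous observed wave. The record layout, the field names, and the toy values are hypothetical; the text only states that the instrument is the county level average migration rate from the previous survey year, and the same function could be applied separately to male and female migration indicators.

```python
from collections import defaultdict

# Hypothetical records: (county, wave, household_id, has_migrant)
records = [
    ("A", 2000, 1, 0), ("A", 2000, 2, 1), ("A", 2000, 3, 0), ("A", 2000, 4, 1),
    ("A", 2004, 1, 1), ("A", 2004, 2, 1), ("A", 2004, 3, 0), ("A", 2004, 4, 0),
    ("B", 2000, 5, 0), ("B", 2000, 6, 0),
    ("B", 2004, 5, 1), ("B", 2004, 6, 0),
]

def county_rates(rows):
    """Average migration rate per (county, wave)."""
    totals = defaultdict(lambda: [0, 0])  # (county, wave) -> [migrants, households]
    for county, wave, _hh, migrant in rows:
        totals[(county, wave)][0] += migrant
        totals[(county, wave)][1] += 1
    return {key: m / n for key, (m, n) in totals.items()}

def lagged_instrument(rows):
    """Map (county, wave) to the county rate from the previous observed wave."""
    rates = county_rates(rows)
    waves_by_county = defaultdict(list)
    for county, wave in rates:
        waves_by_county[county].append(wave)
    instrument = {}
    for county, waves in waves_by_county.items():
        waves.sort()
        for prev, cur in zip(waves, waves[1:]):
            instrument[(county, cur)] = rates[(county, prev)]
    return instrument

iv = lagged_instrument(records)
print(iv)
```

The first wave observed for a county has no lagged value and is therefore absent from the result, which mirrors the loss of the earliest survey year when a lagged instrument is used.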

Table (3.10) presents the first-stage results from the fixed effects regression.

The historical average migration rate is significantly correlated with individual and

household migration status. We have calculated the F-statistics against the null that

the excluded instruments are irrelevant. The F-statistics are 6.65, 5.09 and 7.19 on 1

and 821 degrees of freedom for historical county level male migration rate, historical

county level female migration rate, and historical county level household migration

rate respectively. A common rule of thumb for models with one endogenous regressor

is: the F-statistic against the null that the excluded instruments are irrelevant in the

first-stage regression should be larger than 10 (Stock, Wright, and Yogo 2002). The

instruments we use are therefore weak by this criterion, which may bias the IV estimates toward the

OLS estimates. However, the coefficients estimated on the instrumental variables are

still significant at the five percent level, which shows the predictive power of the his-

torical county level migration rate. As the average migration rate is a measure of the

local migration network, the regression results support the hypothesis that the local

migration network is a crucial factor that affects individuals’ migration decisions in

the corresponding local area.
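
With a single excluded instrument, the first-stage F-statistic equals the square of the instrument's t-ratio. The short check below applies this identity to the rounded coefficients and standard errors printed in Table 3.10. Because of rounding, and because the text does not describe any adjustment made to the standard errors, the values only approximately reproduce the reported 6.65, 5.09 and 7.19.

```python
# Rounded first-stage estimates copied from Table 3.10: (coefficient, standard error)
first_stage = {
    "male migration rate": (-0.3185, 0.1236),
    "female migration rate": (-0.2548, 0.1131),
    "household migration rate": (-0.3266, 0.1212),
}

RULE_OF_THUMB = 10.0  # threshold cited in the text (Stock, Wright, and Yogo 2002)

def f_from_t(coef, se):
    """F-statistic for one excluded instrument: the squared t-ratio."""
    return (coef / se) ** 2

f_stats = {name: f_from_t(c, s) for name, (c, s) in first_stage.items()}
for name, f in f_stats.items():
    verdict = "below" if f < RULE_OF_THUMB else "above"
    print(f"{name}: F = {f:.2f} ({verdict} the rule-of-thumb threshold)")
```

All three values fall below the threshold of 10, which is consistent with the statement that the instruments are not strong.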

The effects of household migration status on children’s health outcome and

care from the fixed effects model using the IV approach are presented in Table (3.11a)

and Table (3.11b). The effects of father’s migration status on children’s health out-

come and care from the fixed effects model using the IV approach are presented in

Table (3.12a) and Table (3.12b). The effects of mother’s migration status on chil-

dren’s health outcome and care are presented in Table (3.13a) and Table (3.13b).

After the correction of the endogeneity, there are few significant effects of

parents’ migration on children’s outcomes. There are three possible reasons. The

first is that the IV approach removes the reverse causality between parents’ migration

and children’s health. The second is that the weak instruments we use may bias the estimates

toward the OLS estimates. As a result, household migration and father’s migration

may lead to an even higher increase in children’s weight than reported in the tables.

The third is that the IV approach usually inflates standard errors and thereby reduces significance. It is not surprising that after

applying the IV, more coefficients became insignificant. Overall, correcting for endogeneity

did not change the main results.

We have discussed that parental migration affects a child’s health in three

major ways: the income effect, the allocation of time and the information effect. As

we have controlled for the income effect by including household income as one of the

explanatory variables, the net effect of the parents’ migration here is the combined

effect of the time allocation (the time a parent spends with their child) and the information

effect. The estimation results show that the net effect of parents’ migration is

not significant for most measures of children’s health outcome.

It is surprising to see that the number of elderly in the household has only

a few significant effects on children’s health. The elderly in the household are

likely to be children’s grandparents. Intuitively, the care from grandparents could

compensate for the absence of children’s parents. From the regression results, children who

live with grandfathers consume more calories. Children who live with their grandmothers

do not have significantly better health outcomes than those who do not live with

their grandmothers. However, the analysis in this paper focuses only on measures

of children’s physical health; grandparents may have positive effects on children’s

mental health when children’s parents are absent, which could be studied in future

research.

The variances of the coefficients in the IV approach are obviously larger than

the ones in the fixed effects model without correcting for the endogeneity. This is a

sign that the instruments are not adding much variation. The variance is especially

large for the dependent variable of childcare, as the effective sample size is relatively

small due to missing values for the childcare variable. A total of 1048 observations

were used to analyze the childcare variable. There are 118 individuals that have more

than one observation in the sample, among which 33 children’s parents have changed

their migration status.
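
The link between instrument strength and the spread of IV estimates can be illustrated with a small Monte Carlo exercise. The sketch is purely hypothetical: it uses a single continuous regressor and instrument, invented parameter values, and the simple ratio form of the just-identified IV estimator, not the fixed effects specification estimated in this chapter. The interquartile range is used as the measure of spread because the estimator can have very heavy tails when the instrument is weak.

```python
import random

def cov(a, b):
    """Sample covariance of two equal-length lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

def iv_estimate(rng, n, first_stage_coef, beta=1.0):
    """One draw of the just-identified IV estimator cov(z, y) / cov(z, x)."""
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]  # unobserved confounder
    x = [first_stage_coef * zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y = [beta * xi + ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    return cov(z, y) / cov(z, x)

def interquartile_range(values):
    s = sorted(values)
    return s[(3 * len(s)) // 4] - s[len(s) // 4]

def spread(first_stage_coef, reps=200, n=500, seed=1):
    """Interquartile range of the IV estimates across simulated samples."""
    rng = random.Random(seed)
    draws = [iv_estimate(rng, n, first_stage_coef) for _ in range(reps)]
    return interquartile_range(draws)

iqr_strong = spread(first_stage_coef=1.0)
iqr_weak = spread(first_stage_coef=0.1)
print(f"IQR of IV estimates, strong instrument: {iqr_strong:.3f}")
print(f"IQR of IV estimates, weak instrument:   {iqr_weak:.3f}")
```

When the assumed first-stage coefficient is small, the instrument explains little of the variation in the regressor and the estimates are far more dispersed, which matches the pattern of larger variances described above.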

Overall, the IV approach suggests that there were few significant causal

effects of parents’ migration on children’s health outcomes. In contrast to the concern

that left-behind children might suffer health problems without sufficient care from

migrated parents, our empirical results show that the net effect of parents’ migration

on children’s health is not necessarily negative. These results suggest that the effects

of health information provided by migrated parents are important, and cannot be

ignored.

3.7 ROBUSTNESS CHECK

Of primary concern is that changes in household characteristics reflected in

our data may be endogenous to children’s health status. For instance, the changes

in household income may be correlated with unobserved shocks that could also lead

to changes in children’s health. Moreover, household income may be correlated with

migration decisions of household members. Such correlation may lead to biased

estimates of the effects of migration. To rule out the possibility that the above results are driven

by changes in endogenous household income, we estimated the regressions without

including household income as a control variable. The effects of households’ migration

on children’s health outcome and care are reported in Table 3.14a and Table 3.14b.

The effects of fathers’ migration on children’s health outcomes and care are reported

in Table 3.15a and Table 3.15b. The effects of mothers’ migration on children’s health

outcomes and care are reported in Table 3.16a and Table 3.16b.

Compared to the previous estimates with household income as a control

variable, the coefficients on migration are very similar in all regression analyses, with

only small variations in the coefficients. These small variations might be

due to the coefficients of parents’ migration also capturing the income effect from

labor migration when we exclude the household income as a control variable.

Besides the household income, the number of elders in the household might

be endogenous because this may be a factor in parents’ migration decision-making.

Therefore the estimates of the coefficients of migration might be biased. In order to

address this issue, we estimated the effects of households’ migration on child health

outcomes and care from the fixed effects model without including the number of

males/females over 60 in the household as control variables; the results are reported in Table 3.17a and Table

3.17b. The effects of fathers’ migration on children’s health outcomes and care are reported in

Table 3.18a and Table 3.18b, and the effects of mothers’ migration on children’s

health outcomes and care are reported in Table 3.19a and Table 3.19b.

The estimation results showed that the magnitudes of the coefficients were

very similar. It is worth noting that there were small changes in the standard errors

of some coefficients. One possible explanation is that the number of

elders is positively correlated with migration decisions. Because excluding a correlated

regressor reduces multicollinearity, the variance is smaller in this robustness check. Despite this change, the

results are consistent with the previous findings.
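
The variance argument can be made concrete with the variance inflation factor. If R² denotes the R-squared from regressing one explanatory variable on the remaining explanatory variables, the sampling variance of its coefficient is scaled by 1/(1 − R²) relative to the case of no correlation, holding the residual variance fixed. The correlation values below are hypothetical and are not estimated from the CHNS data.

```python
def variance_inflation_factor(r_squared):
    """VIF = 1 / (1 - R^2), where R^2 comes from regressing one
    explanatory variable on the remaining explanatory variables."""
    if not 0.0 <= r_squared < 1.0:
        raise ValueError("r_squared must lie in [0, 1)")
    return 1.0 / (1.0 - r_squared)

# Hypothetical correlations between migration status and the number of elders.
for corr in (0.0, 0.2, 0.5):
    vif = variance_inflation_factor(corr ** 2)
    print(f"correlation {corr:.1f}: variance scaled by {vif:.3f}")
```

This is only one side of the comparison: dropping a control can also change the residual variance, so the net change in the standard errors depends on both effects.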

3.8 REGRESSION RESULTS ON SUBSAMPLES

Although the regression results show that there are few significant effects

of parents’ migration on children’s health outcomes and care in general, parents’

migration may have a significant effect on children in particular groups. In this

section, we present the regression results from the fixed effects model and the IV approach

on subsamples.

In total, we studied ten subgroups: a) children who live in low income

households, where low income is defined as household income level less than the

average annual income level; b) children who live in high income households, where

high income is defined as household income level higher than the average annual

income level; c) children whose parents did not finish high school; d) children whose

parents finished high school; e) children younger than age 5; f) children between ages

5 and 10; g) children who live with their grandparents; h) children who live in nuclear

families; i) children who live in north China; j) children who live in south China.

Due to limitations of the data, some of the coefficients are not identifiable,

particularly the coefficients of mother’s migration, as the effective sample size is too

small for some subsamples. The effective sample contains the children who have more

than one observation in the data. Moreover, because the IVs are county level average migration

rates, they add little variation, especially when the effective

sample size is small. This problem is more serious for mother’s migration because

males migrate more often than females, and there is less variation in females’ migration

status than in males’ migration status. For the above reason, the regression results

of mothers’ migration are not reported here. The regression results for the subsample

of children under age 5 and children with highly educated parents are not available

for the same reason. A second problem is that, because of missing values, the

effective sample size was too small in some subsamples for some variables to conduct

fixed effects model analyses. For instance, childcare data for several subsamples were

not available.

Table (3.20a) and Table (3.20b) show the results of the fixed effects model of

household migration on children’s health and care on subsamples using the IV approach.

Table (3.21a) and Table (3.21b) show the results of the fixed effects model of

fathers’ migration.

Generally speaking, the IV approach shows that children between ages 5

and 10 are significantly affected by fathers’ migration. The effects are positive on

children’s calorie and protein intake for children aged between 5 and 10 years. As

mentioned above, the effects of parents’ migration can be both positive and negative.

Positive effects include better access to nutritional information and products.

Negative effects may include children not being in the care of either their mother or

father. The results show that there are more positive effects of fathers’ migration

than negative effects for children aged between 5 and 10 years. No significant positive

effects were found for other subsamples, possibly because children between the ages

of 5 and 10 were in the midst of a crucial period of physical development. Most of

the other coefficients were not significant due to the large standard errors in the

IV approach.

The regressions on subsamples show that parents’ migration had significant

effects on children’s health outcome and care for children in particular groups. The

results showed that the positive effects of parents’ migration could offset the nega-

tive effects of parents’ migration. Additionally, the positive effects outnumber the

negative effects for children’s nutrient intake in some subsamples.

3.9 CONCLUSION

In this paper, we studied left-behind children’s health outcomes including

height-for-age Z-score (HAZ), weight-for-age Z-score (WAZ), daily calorie intake,

daily protein intake, the number of immunization shots received by children and

whether children have been sick during the survey year. The evidence presented

above showed that children with migrated parents did not necessarily have poorer

health outcomes than children who lived with both parents. The robustness checks

on the endogeneity assumption supported the findings that labor migration had no

causal effect on the health of left-behind children. The fact that the results changed

so little after excluding household income and the number of elders in the household

suggests that the finding of no significant impact of parents’ migration on children’s health

does not depend on controlling for household income and the number of elders in the

household.

The regression results on subsamples showed that fathers’ migration had

significant positive effects on children’s nutrient intake for children between 5 and 10

years of age. This suggests that the positive effects of parents’ migration outnumber

and could offset the negative effects of parents’ migration. The regression results

on subsamples provide some insight into the insignificance of the effects of parents’

migration. The negative effects on children’s health of parents’ migration are possibly

compensated by better access to nutrition information and products, the care from

grandparents and the remittances that migrated parents are able to provide.

We have explored the possible mechanisms that may lead to better access to

nutritional information. Future research should examine whether parental migration

affects the social support that children receive and how children’s health outcomes

vary based on the duration of parents’ migration. Nevertheless, these first steps into

the investigation of this important topic cast further doubt on the view that those left-

behind children in China always suffer from their parents’ absence. These findings

should encourage policy makers in areas of high migration to provide alternative

sources of support for left-behind children.

Table 3.1: Parents’ Migration Rate for Children under Age Ten (CHNS)

                          1997   2000   2004   2006   2009
Any Parent Migrated       0.06   0.10   0.17   0.21   0.14
Father Migrated Only      0.05   0.09   0.14   0.18   0.12
Mother Migrated Only      0.02   0.03   0.07   0.06   0.05
Both Parents Migrated     0.01   0.02   0.04   0.04   0.03
Number of Observations    927    785    614    585    531

Table 3.2a: Descriptive Statistics (CHNS)

Variables                                 Migrant       Non-Migrant   t-stat of the
                                          Household     Household     difference
Weight (kg)                               21.04         21.17         −0.32
                                          (0.36)[1]     (0.16)        (0.74)[2]
Height (cm)                               114.90        113.94        0.98
                                          (0.89)        (0.37)        (0.33)
Weight-for-age Z-score                    −0.49         −0.25         −3.41∗∗∗
                                          (0.07)        (0.03)        (0.00)
Height-for-age Z-score                    −0.65         −0.50         −1.80·
                                          (0.08)        (0.03)        (0.07)
Calories (Kcal)                           1362.68       1374.04       −0.34
                                          (30.00)       (13.87)       (0.73)
Protein (g)                               40.65         42.75         −1.90·
                                          (1.00)        (0.48)        (0.06)
Calories/RDA                              0.81          0.84          −1.26
                                          (0.02)        (0.01)        (0.21)
Protein/RDA                               0.72          0.78          −3.03∗∗
                                          (0.02)        (0.01)        (0.00)
Number of immunization shots              6.53          8.97          −1.88·
                                          (1.15)        (0.60)        (0.06)
Whether the child has been cared for by   0.39          0.47          −1.06
non-family member for the past week       (0.07)        (0.03)        (0.29)

[1] standard deviation of the sample mean;
[2] p-value, ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.2b: Descriptive Statistics (CHNS)

Variables                                 Migrant       Non-Migrant   t-stat of the
                                          Father        Father        difference
Weight (kg)                               21.08         21.16         −0.18
                                          (0.39)[1]     (0.15)        (0.86)[2]
Height (cm)                               115.04        113.95        1.04
                                          (0.97)        (0.37)        (0.30)
Weight-for-age Z-score                    −0.48         −0.25         −2.86∗∗
                                          (0.07)        (0.03)        (0.00)
Height-for-age Z-score                    −0.63         −0.51         −1.45
                                          (0.08)        (0.03)        (0.15)
Calories (Kcal)                           1379.47       1371.43       0.21
                                          (33.35)       (13.63)       (0.83)
Protein (g)                               41.22         42.62         −1.07
                                          (1.10)        (0.47)        (0.29)
Calories/RDA                              0.82          0.83          −0.58
                                          (0.02)        (0.01)        (0.56)
Protein/RDA                               0.73          0.77          −1.90·
                                          (0.02)        (0.01)        (0.06)
Number of immunization shots              6.42          8.92          −1.58
                                          (1.22)        (0.59)        (0.11)

[1] standard deviation of the sample mean;
[2] p-value, ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.2c: Descriptive Statistics (CHNS)

Variables                                 Migrant       Non-Migrant   t-stat of the
                                          Mother        Mother        difference
Weight (kg)                               20.63         21.18         −0.87
                                          (0.57)[1]     (0.15)        (0.38)[2]
Height (cm)                               113.73        114.10        −0.24
                                          (1.38)        (0.36)        (0.81)
Weight-for-age Z-score                    −0.49         −0.27         −1.95·
                                          (0.11)        (0.03)        (0.05)
Height-for-age Z-score                    −0.69         −0.52         −1.39
                                          (0.12)        (0.03)        (0.16)
Calories (Kcal)                           1265.22       1378.60       −2.03∗
                                          (44.74)       (13.10)       (0.04)
Protein (g)                               37.65         42.73         −2.67∗∗
                                          (1.52)        (0.45)        (0.01)
Calories/RDA                              0.77          0.84          −2.16∗
                                          (0.02)        (0.01)        (0.03)
Protein/RDA                               0.68          0.77          −2.73∗∗
                                          (0.03)        (0.01)        (0.01)
Number of immunization shots              5.52          8.79          −1.44
                                          (1.59)        (0.57)        (0.15)

[1] standard deviation of the sample mean;
[2] p-value, ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.3: Descriptive Statistics (CHNS) of Control Variables

Variables Migrant household Non-Migrant Household t-stats of the difference

Household annual income (10000$) 2.30 2.82 −2.38∗

(0.20) (0.08) (0.02)

Father’s education 2.03 2.37 −6.75∗∗∗

(0.04) (0.02) (0.00)

Mother’s education 1.79 2.15 −6.34∗∗∗

(0.05) (0.03) (0.00)

County level average 0.71 0.83 −3.45∗∗∗

income (0.03) (0.01) (0.00)


weight (0.26) (0.12) (0.00)


height (0.18) (0.08) (0.00)

Number of male over 0.28 0.26 0.60

60 in the household (0.03) (0.01) (0.55)

Child’s gender 0.47 0.48 −0.06

(girls=1) (0.03) (0.01) (0.95)

Number of female over 0.32 0.29 1.00.

60 in the household (0.03) (0.01) (0.32)

Number of boys in 0.93 0.79 3.39∗∗∗

the household (0.04) (0.01) (0.00)

Number of girls in 0.82 0.72 2.03∗

the household (0.04) (0.02) (0.04)

County level average 1.02 1.00 1.16

calorie intake/RDA (0.01) (0.00) (0.25)

County level average 1.30 1.30 0.04

protein intake/RDA (0.02) (0.01) (0.97)

Children’s age 6.22 5.94 2.006∗

(0.13) (0.05) (0.05)

Father’s age 32.60 34.16 −3.05∗∗

(0.47) (0.19) (0.00)

Mother’s age 31.25 32.32 −2.10∗

(0.46) (0.22) (0.04)

Historical county level 0.26 0.15 9.50∗∗∗

male migration rate (0.01) (0.00) (0.00)


female migration rate (0.01) (0.00) (0.00)


household migration rate (0.01) (0.00) (0.00)

1standard deviation;

2p-value ***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1,

3historical county level migration rate: the average local migration rate from previous survey year.

Table 3.4a: OLS regression results: the effects of the household migration status

                                      WAZ         HAZ         Immunization  Childcare by
                                                              shots         non-family member
Household migration status            0.07        0.13·       −2.95·        −0.03
                                      (0.06)      (0.07)      (1.51)        (0.04)
Household income                      0.00        0.01        −0.35·        0.00
                                      (0.01)      (0.01)      (0.20)        (0.00)
Father education                      0.07∗∗      0.10∗∗      −0.52         0.04∗
                                      (0.03)      (0.03)      (0.68)        (0.02)
Mother education                      0.02        0.06·       −0.92         0.01
                                      (0.03)      (0.03)      (0.69)        (0.02)
County average income                 0.10∗       0.24∗∗∗     4.06∗         −0.03
                                      (0.05)      (0.05)      (1.84)        (0.03)
County average weight                 0.05∗∗∗     0.03∗∗      0.11          0.01∗
                                      (0.01)      (0.01)      (0.22)        (0.01)
County average height                 0.08∗∗∗     0.08∗∗∗     −0.42         −0.02·
                                      (0.01)      (0.01)      (0.32)        (0.01)
Male in household with age over 60    −0.06       −0.05       3.53∗         −0.08∗
                                      (0.06)      (0.07)      (1.42)        (0.04)
Female in household with age over 60  0.11∗       0.10        0.52          −0.02
                                      (0.05)      (0.06)      (1.32)        (0.04)
Gender                                −0.20∗∗     −0.19∗      −0.14         0.06
                                      (0.07)      (0.08)      (1.66)        (0.05)
Number of boys in household           −0.09·      −0.10·      0.78          −0.02
                                      (0.05)      (0.05)      (1.11)        (0.03)
Number of girls in household          −0.01       0.01        −0.88         −0.05·
                                      (0.04)      (0.05)      (1.07)        (0.03)
County average calorie consumption    −0.22       0.00        −8.67         0.30∗
                                      (0.21)      (0.24)      (5.34)        (0.14)
County average protein consumption    0.14        0.24        3.95          −0.12
                                      (0.15)      (0.18)      (3.89)        (0.10)
Child age                             −0.05∗∗∗    0.03∗∗      −0.10         0.02·
                                      (0.01)      (0.01)      (0.23)        (0.01)
Intercept                             −15.46∗∗∗   −16.22∗∗∗   74.37·        1.68
                                      (1.67)      (1.92)      (41.52)       (1.09)
R²                                    0.25        0.19        0.02          0.04
Adj. R²                               0.25        0.19        0.02          0.04
Num. obs.                             2201        2201        1491          1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.4b: OLS regression results: the effects of the household migration status

                                      Calorie     Protein     Calorie/RDA   Protein/RDA
Household migration status            −30.15      −1.67       −0.02         −0.03
                                      (31.22)     (1.06)      (0.02)        (0.02)
Household income                      −2.16       0.07        0.00          0.00
                                      (3.35)      (0.11)      (0.00)        (0.00)
Father education                      29.90∗      1.19∗∗      0.02∗∗        0.03∗∗
                                      (13.12)     (0.45)      (0.01)        (0.01)
Mother education                      28.58∗      1.30∗∗      0.02∗         0.02∗∗
                                      (13.17)     (0.45)      (0.01)        (0.01)
County average income                 9.06        0.52        0.00          0.01
                                      (21.97)     (0.75)      (0.01)        (0.01)
County average weight                 0.41        0.02        0.00          0.00
                                      (4.28)      (0.15)      (0.00)        (0.00)
County average height                 0.99        0.08        0.00          0.00
                                      (6.24)      (0.21)      (0.00)        (0.00)
Male in household with age over 60    12.53       0.32        0.01          0.01
                                      (27.41)     (0.93)      (0.02)        (0.02)
Female in household with age over 60  −0.19       −0.20       0.00          −0.01
                                      (26.26)     (0.89)      (0.02)        (0.02)
Gender                                −80.79∗     −3.89∗∗∗    −0.01         −0.04·
                                      (33.45)     (1.14)      (0.02)        (0.02)
Number of boys in household           3.52        −0.41       0.00          0.00
                                      (22.30)     (0.76)      (0.01)        (0.01)
Number of girls in household          −10.36      −0.23       −0.01         0.00
                                      (21.16)     (0.72)      (0.01)        (0.01)
County average calorie consumption    738.90∗∗∗   −10.95∗∗    0.42∗∗∗       −0.22∗∗∗
                                      (99.30)     (3.38)      (0.06)        (0.07)
County average protein consumption    195.62∗∗    35.16∗∗∗    0.13∗∗        0.65∗∗∗
                                      (73.58)     (2.51)      (0.05)        (0.05)
Child age                             111.98∗∗∗   3.21∗∗∗     0.01∗∗∗       0.01∗
                                      (4.58)      (0.16)      (0.00)        (0.00)
Intercept                             −566.10     −28.08      −0.29         −0.50
                                      (805.99)    (27.45)     (0.51)        (0.53)
R²                                    0.29        0.30        0.11          0.18
Adj. R²                               0.29        0.30        0.11          0.18
Num. obs.                             2201        2201        2201          2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.5a: OLS regression results: the effects of the father’s migration


Father’s migration status 0.07 0.14· −3.05· −0.03

(0.07) (0.08) (1.60) (0.05)


(0.01) (0.01) (0.20) (0.00)


(0.03) (0.03) (0.68) (0.02)


(0.03) (0.03) (0.69) (0.02)


(0.05) (0.05) (1.84) (0.03)


(0.01) (0.01) (0.22) (0.01)


(0.01) (0.01) (0.32) (0.01)


(0.06) (0.07) (1.42) (0.04)


(0.05) (0.06) (1.32) (0.04)

Gender −0.20∗∗ −0.19∗ −0.14 0.06

(0.07) (0.08) (1.66) (0.05)


(0.05) (0.05) (1.11) (0.03)

Number of girls in household −0.01 0.01 −0.85 −0.05·

(0.04) (0.05) (1.07) (0.03)

County average calorie consumption −0.22 0.00 −8.55 0.30∗

(0.21) (0.24) (5.34) (0.14)


(0.15) (0.18) (3.89) (0.10)

Child age −0.05∗∗∗ 0.03∗∗ −0.11 0.02·

(0.01) (0.01) (0.23) (0.01)

Intercept −15.44∗∗∗ −16.18∗∗∗ 73.12· 1.68

(1.67) (1.92) (41.47) (1.09)

R2 0.25 0.19 0.02 0.04

Adj. R2 0.25 0.19 0.02 0.04

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.5b: OLS regression results: the effects of the father’s migration


Father’s Migration status −12.72 −1.05 −0.01 −0.02

(33.16) (1.13) (0.02) (0.02)


(3.35) (0.11) (0.00) (0.00)


(13.13) (0.45) (0.01) (0.01)


(13.17) (0.45) (0.01) (0.01)


(21.98) (0.75) (0.01) (0.01)


(4.29) (0.15) (0.00) (0.00)


(6.24) (0.21) (0.00) (0.00)


(27.42) (0.93) (0.02) (0.02)

Female in household with age over 60 −0.31 −0.21 0.00 −0.01

(26.26) (0.89) (0.02) (0.02)

Gender −80.83∗ −3.89∗∗∗ −0.01 −0.04·

(33.46) (1.14) (0.02) (0.02)


(22.32) (0.76) (0.01) (0.01)


(21.17) (0.72) (0.01) (0.01)


(99.32) (3.38) (0.06) (0.07)


(73.58) (2.51) (0.05) (0.05)

Child age 111.88∗∗∗ 3.20∗∗∗ 0.01∗∗∗ 0.01∗

(4.58) (0.16) (0.00) (0.00)

Intercept −617.76 −30.21 −0.31 −0.53

(805.08) (27.42) (0.51) (0.53)

R2 0.29 0.30 0.11 0.18

Adj. R2 0.29 0.30 0.11 0.18

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.6a: OLS regression results: the effects of the mother’s migration


Mother’s migration status 0.18· 0.20· −3.47 0.02

(0.10) (0.11) (2.28) (0.06)


(0.01) (0.01) (0.20) (0.00)


(0.03) (0.03) (0.68) (0.02)


(0.03) (0.03) (0.69) (0.02)


(0.05) (0.05) (1.84) (0.03)


(0.01) (0.01) (0.22) (0.01)


(0.01) (0.01) (0.32) (0.01)


(0.06) (0.07) (1.42) (0.04)


(0.05) (0.06) (1.32) (0.04)

Gender −0.20∗∗ −0.19∗ −0.19 0.06

(0.07) (0.08) (1.66) (0.05)


(0.05) (0.05) (1.11) (0.03)

Number of girls in household 0.00 0.02 −0.98 −0.05·

(0.04) (0.05) (1.07) (0.03)

County average calorie consumption −0.23 −0.02 −8.33 0.30∗

(0.21) (0.24) (5.34) (0.14)


(0.15) (0.18) (3.88) (0.10)

Child age −0.05∗∗∗ 0.03∗∗ −0.12 0.02·

(0.01) (0.01) (0.23) (0.01)

Intercept −15.56∗∗∗ −16.16∗∗∗ 70.76· 1.58

(1.67) (1.92) (41.44) (1.08)

R2 0.25 0.19 0.02 0.04

Adj. R2 0.25 0.19 0.02 0.04

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.6b: OLS regression results: the effects of the mother’s migration


Mother’s Migration status −116.12∗ −3.45∗ −0.06∗ −0.06·

(47.71) (1.63) (0.03) (0.03)


(3.34) (0.11) (0.00) (0.00)


(13.08) (0.45) (0.01) (0.01)


(13.16) (0.45) (0.01) (0.01)


(21.95) (0.75) (0.01) (0.01)


(4.28) (0.15) (0.00) (0.00)


(6.23) (0.21) (0.00) (0.00)


(27.39) (0.93) (0.02) (0.02)

Female in household with age over 60 0.52 −0.18 0.00 −0.01

(26.23) (0.89) (0.02) (0.02)

Gender −81.57∗ −3.91∗∗∗ −0.01 −0.04·

(33.42) (1.14) (0.02) (0.02)

Number of boys in household 1.49 −0.50 0.00 −0.01

(22.25) (0.76) (0.01) (0.01)

Number of girls in household −12.25 −0.31 −0.01 −0.01

(21.13) (0.72) (0.01) (0.01)


(99.18) (3.38) (0.06) (0.07)


(73.44) (2.50) (0.05) (0.05)

Child age 111.85∗∗∗ 3.20∗∗∗ 0.01∗∗∗ 0.01∗

(4.57) (0.16) (0.00) (0.00)

Intercept −471.22 −27.27 −0.23 −0.49

(804.12) (27.40) (0.51) (0.53)

R2 0.29 0.30 0.11 0.18

Adj. R2 0.29 0.30 0.11 0.18

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.7a: Fixed effects model results of the effects of the household migration status

on children’s health outcome and care

WAZ HAZ Immunization Childcare by

shots non-family member

Household migration status −0.20· −0.13 −5.11 −0.02

(0.10) (0.12) (3.78) (0.13)

Household income −0.03∗ −0.02 −0.28 −0.01

(0.01) (0.01) (0.57) (0.01)

County average income 0.27∗∗ 0.08 3.84 0.05

(0.09) (0.10) (5.60) (0.10)

County average weight 0.03 −0.04 −0.10 0.02

(0.03) (0.03) (1.13) (0.04)

County average height 0.03 0.05 −1.65 −0.01

(0.03) (0.04) (1.21) (0.04)

Male in household with age over 60 −0.07 0.08 0.75 0.06

(0.16) (0.18) (6.52) (0.23)

Female in household with age over 60 0.13 0.30 11.13 0.10

(0.19) (0.21) (7.50) (0.28)

Number of boys in household −0.04 −0.17 9.50∗ 0.07

(0.12) (0.13) (4.12) (0.17)


(0.12) (0.14) (5.21) (0.18)

County average calorie consumption 0.28 0.14 1.27 0.87

(0.35) (0.39) (14.05) (0.54)

County average protein consumption 0.08 0.29 −4.57 −0.36

(0.25) (0.29) (9.00) (0.38)

Child age −0.04∗ 0.09∗∗∗ 1.35∗ 0.00

(0.02) (0.02) (0.62) (0.03)

R2 0.04 0.10 0.07 0.03

Adj. R2 0.01 0.03 0.02 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.7b: Fixed effects model results of the effects of the household migration status



Household Migration status −25.15 1.20 −0.01 0.03

(69.77) (2.26) (0.05) (0.05)

Household income −3.97 −0.24 0.00 −0.01

(7.34) (0.24) (0.00) (0.00)

County average income −28.54 −0.04 −0.02 −0.01

(60.74) (1.97) (0.04) (0.04)


(20.05) (0.65) (0.01) (0.01)

County average height 12.92 1.15· 0.01 0.02·

(20.71) (0.67) (0.01) (0.01)

Male in household with age over 60 207.50· 3.46 0.13· 0.07

(107.41) (3.48) (0.07) (0.07)


(125.27) (4.06) (0.09) (0.08)


(78.20) (2.54) (0.05) (0.05)

Number of girls in household 27.00 −0.89 0.02 −0.03

(82.23) (2.67) (0.06) (0.06)

County average calorie consumption 1159.18∗∗∗ 4.22 0.65∗∗∗ 0.02

(232.22) (7.53) (0.16) (0.16)

County average protein consumption 47.99 26.66∗∗∗ 0.07 0.54∗∗∗

(170.03) (5.51) (0.12) (0.12)

Child age 97.59∗∗∗ 2.56∗∗∗ 0.00 0.00

(11.05) (0.36) (0.01) (0.01)

R2 0.27 0.23 0.09 0.10

Adj. R2 0.07 0.06 0.02 0.03

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.8a: Fixed effects model results of the effects of the father’s migration status




Father migration status −0.19· −0.20 −4.19 −0.06

(0.11) (0.12) (3.80) (0.14)


(0.01) (0.01) (0.57) (0.01)


(0.09) (0.10) (5.60) (0.10)

County average weight 0.03 −0.04 −0.07 0.02

(0.03) (0.03) (1.13) (0.04)


(0.03) (0.03) (1.21) (0.04)


(0.16) (0.18) (6.52) (0.23)


(0.19) (0.21) (7.51) (0.28)


(0.12) (0.13) (4.12) (0.17)


(0.12) (0.14) (5.21) (0.18)


(0.35) (0.39) (14.07) (0.54)


(0.25) (0.29) (9.01) (0.38)

Child age −0.04∗ 0.09∗∗∗ 1.31∗ 0.01

(0.02) (0.02) (0.62) (0.03)

R2 0.04 0.11 0.07 0.03

Adj. R2 0.01 0.03 0.02 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.8b: Fixed effects model results of the effects of the father’s migration status



Father Migration status 7.54 3.18 0.01 0.07

(72.23) (2.34) (0.05) (0.05)


(7.33) (0.24) (0.00) (0.00)

County average income −28.19 0.11 −0.02 −0.01

(60.84) (1.97) (0.04) (0.04)


(20.05) (0.65) (0.01) (0.01)

County average height 13.89 1.19· 0.01 0.02·

(20.67) (0.67) (0.01) (0.01)


(107.36) (3.48) (0.07) (0.07)


(125.21) (4.05) (0.09) (0.08)


(78.21) (2.53) (0.05) (0.05)


(82.27) (2.66) (0.06) (0.06)


(232.30) (7.52) (0.16) (0.16)


(170.09) (5.51) (0.12) (0.12)

Child age 96.94∗∗∗ 2.52∗∗∗ 0.00 0.00

(11.08) (0.36) (0.01) (0.01)

R2 0.27 0.23 0.09 0.10

Adj. R2 0.07 0.06 0.02 0.03

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.9a: Fixed effects model results of the effects of the mother’s migration status




Mother migration status 0.00 −0.10 −4.31 0.23

(0.15) (0.17) (5.85) (0.22)


(0.01) (0.01) (0.57) (0.01)


(0.09) (0.10) (5.61) (0.10)

County average weight 0.03 −0.04 0.01 0.02

(0.03) (0.03) (1.12) (0.04)


(0.03) (0.03) (1.21) (0.04)


(0.16) (0.18) (6.54) (0.23)


(0.19) (0.21) (7.52) (0.28)


(0.12) (0.13) (4.20) (0.18)


(0.12) (0.14) (5.21) (0.18)

County average calorie consumption 0.29 0.14 1.53 0.92·

(0.35) (0.39) (14.09) (0.54)


(0.25) (0.29) (9.02) (0.38)

Child age −0.04∗ 0.09∗∗∗ 1.23∗ 0.00

(0.02) (0.02) (0.61) (0.03)

R2 0.04 0.10 0.06 0.04

Adj. R2 0.01 0.03 0.02 0.01

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.9b: Fixed effects model results of the effects of the mother’s migration status



Mother Migration status −113.18 −1.52 −0.06 0.00

(102.70) (3.33) (0.07) (0.07)


(7.32) (0.24) (0.00) (0.00)


(60.68) (1.97) (0.04) (0.04)


(20.03) (0.65) (0.01) (0.01)


(20.60) (0.67) (0.01) (0.01)


(107.65) (3.49) (0.07) (0.07)


(125.43) (4.07) (0.09) (0.09)


(79.17) (2.57) (0.05) (0.05)


(82.07) (2.66) (0.06) (0.06)


(232.03) (7.53) (0.16) (0.16)


(170.04) (5.52) (0.12) (0.12)

Child age 97.88∗∗∗ 2.60∗∗∗ 0.00 0.00

(10.98) (0.36) (0.01) (0.01)

R2 0.27 0.23 0.09 0.10

Adj. R2 0.07 0.06 0.02 0.03

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.10: First-stage fixed effects regression results

                                        Father’s        Mother’s        Household
                                        migration       migration       migration
                                        status          status          status
County level male migration rate        −0.3185∗
                                        (0.1236)
County level female migration rate                      −0.2548∗
                                                        (0.1131)
County level household migration rate                                   −0.3266∗∗
                                                                        (0.1212)
Father’s age                            −0.1506∗                        −0.1538∗
                                        (0.0667)                        (0.0692)
Mother’s age                                            0.0132          0.0260
                                                        (0.0577)        (0.0842)
Household income                        −0.0031         −0.0028         −0.0048
                                        (0.0041)        (0.0030)        (0.0043)
Male in household with age over 60      0.0214          −0.0854∗        −0.0472
                                        (0.0606)        (0.0431)        (0.0628)
Female in household with age over 60    0.0038          0.0902·         0.0456
                                        (0.0702)        (0.0499)        (0.0727)
Number of children in the family        0.0001          −0.0132         −0.0050
                                        (0.0135)        (0.0096)        (0.0140)
County average income                   −0.0353         0.0134          0.0092
                                        (0.0344)        (0.0248)        (0.0363)
Children’s age                          0.1724∗∗        −0.0067         0.1475
                                        (0.0665)        (0.0581)        (0.1086)
R2                                      0.0381          0.0264          0.0373
Adj. R2                                 0.0103          0.0071          0.0100
Num. obs.                               2201            2201            2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.11a: Fixed effects model results of the effects of the household migration

status on children’s health outcome and care: IV approach



Household Migration status 2.07 0.19 26.62 0.08

(1.35) (1.14) (39.20) (18.46)

Household income −0.01 −0.02 −0.08 −0.01

(0.02) (0.01) (0.67) (0.16)

County average income 0.27∗ 0.08 5.88 0.05

(0.12) (0.10) (6.61) (0.61)


(0.04) (0.03) (1.81) (0.36)

County average height 0.10· 0.06 −0.91 −0.01

(0.06) (0.05) (1.60) (0.30)


(0.23) (0.19) (7.88) (0.65)

Female in household with age over 60 −0.02 0.28 7.55 0.11

(0.27) (0.23) (9.31) (0.54)

Number of boys in household 0.01 −0.16 10.28∗ 0.06

(0.16) (0.14) (4.60) (1.24)


(0.19) (0.16) (6.08) (2.15)


(0.47) (0.40) (15.41) (1.19)

County average protein consumption −0.04 0.27 −6.11 −0.37

(0.35) (0.30) (10.02) (1.53)

Child age −0.08∗ 0.08∗∗ 0.27 0.00

(0.03) (0.03) (1.49) (0.39)

R2 0.00 0.09 0.01 0.03

Adj. R2 0.00 0.03 0.00 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.11b: Fixed effects model results of the effects of the household migration

status on children’s health outcome and care: IV approach



(799.13) (23.09) (0.50) (0.46)

Household income 2.16 −0.14 0.00 −0.01

(9.80) (0.28) (0.01) (0.01)


(72.15) (2.08) (0.05) (0.04)


(24.21) (0.70) (0.02) (0.01)

County average height 46.75 1.74· 0.03 0.03

(34.91) (1.01) (0.02) (0.02)

Male in household with age over 60 268.37∗ 4.53 0.16· 0.08

(135.15) (3.90) (0.08) (0.08)

Female in household with age over 60 −24.43 −3.14 −0.01 −0.09

(158.03) (4.57) (0.10) (0.09)


(94.45) (2.73) (0.06) (0.05)


(110.29) (3.19) (0.07) (0.06)


(278.32) (8.04) (0.17) (0.16)

County average protein consumption −8.44 25.67∗∗∗ 0.05 0.53∗∗∗

(206.16) (5.96) (0.13) (0.12)

Child age 76.80∗∗∗ 2.20∗∗∗ −0.01 −0.01

(20.10) (0.58) (0.01) (0.01)

R2 0.12 0.17 0.03 0.08

Adj. R2 0.03 0.04 0.01 0.02

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.



Table 3.12a: Fixed effects model results of the effects of the father’s migration status

on children’s health outcome and care: IV approach



Father Migration status 2.12 0.87 37.40 −14.95

(1.38) (1.25) (41.49) (229.20)


(0.02) (0.01) (0.69) (2.32)

County average income 0.37∗∗ 0.12 6.59 −0.23

(0.14) (0.12) (7.00) (4.43)


(0.04) (0.04) (1.88) (2.75)


(0.05) (0.05) (1.56) (4.12)


(0.21) (0.19) (7.71) (6.34)

Female in household with age over 60 0.07 0.27 6.59 −0.41

(0.25) (0.23) (9.70) (8.33)

Number of boys in household −0.08 −0.18 9.10· 0.98

(0.16) (0.14) (4.79) (14.10)


(0.19) (0.17) (6.39) (27.34)


(0.48) (0.43) (16.55) (14.02)


(0.35) (0.32) (11.31) (5.04)

Child age −0.09∗ 0.07∗ −0.04 0.35

(0.04) (0.03) (1.52) (5.37)

R2 0.00 0.04 0.00 0.00

Adj. R2 0.00 0.01 0.00 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.






(788.91) (22.98) (0.50) (0.47)


(8.82) (0.26) (0.01) (0.01)

County average income 14.92 0.63 0.00 −0.01

(77.86) (2.27) (0.05) (0.05)


(23.66) (0.69) (0.01) (0.01)


(30.31) (0.88) (0.02) (0.02)


(122.17) (3.56) (0.08) (0.07)


(143.24) (4.17) (0.09) (0.09)


(89.86) (2.62) (0.06) (0.05)


(108.06) (3.15) (0.07) (0.06)


(272.01) (7.92) (0.17) (0.16)


(201.15) (5.86) (0.13) (0.12)

Child age 76.80∗∗∗ 2.28∗∗∗ 0.00 0.00

(21.05) (0.61) (0.01) (0.01)

R2 0.15 0.21 0.04 0.10

Adj. R2 0.04 0.06 0.01 0.03

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.13a: Fixed effects model results of the effects of the mother’s migration status




Mother Migration status 2.21 −0.58 21.18 1.76

(2.11) (2.07) (47.55) (4.04)


(0.01) (0.01) (0.58) (0.01)

County average income 0.26∗ 0.09 5.20 0.11

(0.11) (0.10) (6.20) (0.17)


(0.03) (0.03) (1.31) (0.06)

County average height 0.06 0.04 −1.33 0.00

(0.04) (0.04) (1.32) (0.05)


(0.27) (0.26) (7.68) (0.29)

Female in household with age over 60 −0.08 0.34 8.66 0.31

(0.29) (0.29) (8.79) (0.56)


(0.30) (0.29) (7.78) (0.52)


(0.15) (0.15) (5.44) (0.21)

County average calorie consumption 0.39 0.12 −1.02 1.26

(0.42) (0.41) (15.21) (1.08)


(0.34) (0.34) (9.63) (1.24)

Child age −0.06∗ 0.09∗∗∗ 0.93 −0.03

(0.02) (0.02) (0.84) (0.07)

R2 0.00 0.09 0.03 0.02

Adj. R2 0.00 0.03 0.01 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.13b: Fixed effects model results of the effects of the mother’s migration status



Mother Migration status 1742.55 26.08 0.95 0.47

(1514.51) (43.02) (0.96) (0.86)

Household income 0.49 −0.19 0.00 −0.01

(9.87) (0.27) (0.01) (0.01)


(76.02) (2.09) (0.05) (0.04)


(24.99) (0.69) (0.02) (0.01)


(29.97) (0.83) (0.02) (0.02)


(192.07) (5.37) (0.12) (0.11)

Female in household with age over 60 −114.50 −4.20 −0.07 −0.12

(209.53) (5.85) (0.13) (0.12)

Number of boys in household 262.89 2.70 0.14 0.04

(213.42) (6.02) (0.14) (0.12)


(109.40) (3.02) (0.07) (0.06)


(297.79) (8.21) (0.19) (0.17)


(245.26) (6.80) (0.16) (0.14)

Child age 85.21∗∗∗ 2.41∗∗∗ 0.00 −0.01

(17.13) (0.48) (0.01) (0.01)

R2 0.09 0.16 0.01 0.06

Adj. R2 0.03 0.04 0.00 0.02

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.14a: Robustness Check 1: the effects of the household migration status on

children’s health outcome and care without household income as a control variable


Household Migration 2.09 0.22 26.66 0.48

status (1.34) (1.13) (39.02) (12.48)

R2 0.00 0.09 0.01 0.00

Adj. R2 0.00 0.02 0.00 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.14b: Robustness Check 1: the effects of the household migration status on




(789.50) (22.87) (0.50) (0.46)

R2 0.12 0.16 0.03 0.08

Adj. R2 0.03 0.04 0.01 0.02

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.15a: Robustness Check 1: the effects of the father’s migration status on




(1.38) (1.25) (41.37) (83.99)

R2 0.00 0.04 0.00 0.00

Adj. R2 0.00 0.01 0.00 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.15b: Robustness Check 1: the effects of the father’s migration status on




(785.27) (22.90) (0.50) (0.47)

R2 0.15 0.21 0.04 0.09

Adj. R2 0.04 0.06 0.01 0.02

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.16a: Robustness Check 1: the effects of the mother’s migration status on




(2.12) (2.11) (47.46) (4.57)

R2 0.00 0.09 0.03 0.01

Adj. R2 0.00 0.02 0.01 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.16b: Robustness Check 1: the effects of the mother’s migration status on




(1533.75) (42.86) (0.97) (0.86)

R2 0.09 0.16 0.01 0.06

Adj. R2 0.03 0.04 0.00 0.02

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.17a: Robustness Check 2: the effects of the household migration status on

children’s health outcome and care without the number of elders as control variables


Household Migration 2.07 0.19 28.87 0.13

status (1.35) (1.15) (39.34) (20.97)

R2 0.00 0.09 0.00 0.02

Adj. R2 0.00 0.02 0.00 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.17b: Robustness Check 2: the effects of the household migration status on




(802.52) (23.11) (0.50) (0.46)

R2 0.12 0.16 0.02 0.08

Adj. R2 0.03 0.04 0.01 0.02

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.18a: Robustness Check 2: the effects of the father’s migration status on




(1.38) (1.25) (41.67) (225.42)

R2 0.00 0.04 0.00 0.00

Adj. R2 0.00 0.01 0.00 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.18b: Robustness Check 2: the effects of the father’s migration status on




(790.83) (23.02) (0.50) (0.47)

R2 0.15 0.21 0.04 0.09

Adj. R2 0.04 0.06 0.01 0.03

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.19a: Robustness Check 2: the effects of the mother’s migration status on




(2.10) (2.06) (46.78) (5.39)

R2 0.00 0.09 0.03 0.01

Adj. R2 0.00 0.02 0.01 0.00

Num. obs. 2201 2201 1491 1048

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.

Table 3.19b: Robustness Check 2: the effects of the mother’s migration status on




(1513.70) (43.79) (0.96) (0.85)

R2 0.09 0.16 0.01 0.05

Adj. R2 0.02 0.04 0.00 0.01

Num. obs. 2201 2201 2201 2201

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Table 3.20a: Fixed effects model results of the effects of the household migration

status on children’s health outcome and care on subsamples: IV approach



Household Migration status 3.39 0.08 75.25 −0.44

(Low income household) (3.59) (2.37) (133.99) (1.91)

Household Migration status 3.25 −4.97 N.A.9 N.A.

(High income household) (8.35) (9.78) N.A. N.A.

Household Migration status 2.45 1.68 65.05 N.A.

(Parents with low education level) (1.57) (1.42) (71.59) N.A.


(Child above age 5) (0.98) (0.92) (21.14) N.A.


(Child who lives with grandparents) (4.28) (3.80) (89.56) (10.77)

Household Migration status 1.79 −0.90 19.33 1.49

(Child who lives in nuclear family) (1.99) (1.64) (48.11) (7.00)

Household Migration status 2.52 0.42 13.07 −0.07

(North China) (2.04) (1.62) (64.06) (1.74)


(South China) (1.76) (1.55) (45.25) N.A.

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1, 9 the regression results are not available due to missing

values for the variables immunization shots and childcare. In some subsamples, the effective sample

sizes for those two variables are too small to produce reliable regression results, where the effective

sample contains the individuals that have more than one observation in the data.


Table 3.20b: Fixed effects model results of the effects of the household migration

status on children’s health outcome and care on subsamples: IV approach




Household Migration status −346.98 −18.25 −0.14 −0.25

(High income household) (3812.96) (126.21) (2.66) (2.66)


(Parents with low education level) (931.02) (25.21) (0.56) (0.47)

Household Migration status 1722.03∗ 48.37∗ 0.94· 0.77·

(Child above age 5) (867.25) (24.18) (0.48) (0.40)






(North China) (1402.55) (35.05) (0.88) (0.72)


(South China) (917.81) (30.35) (0.56) (0.58)

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.



on children’s health outcome and care on subsamples: IV approach





Father Migration status 2.14 −1.77 N.A. N.A.

(High income household) (4.54) (3.99) N.A.10 N.A.

Father Migration status 2.07 2.28 68.83 N.A.

(Parents with low education level) (1.34) (1.51) (69.92) N.A.


(Child above age 5) (0.91) (0.92) (25.00) N.A.






(North China) (1.92) (1.84) (75.66) (7.08)


(South China) (2.22) (1.88) (41.46) N.A.

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1, 10 the regression results are not available due to missing

values for the variables immunization shots and childcare. In some subsamples, the effective sample

sizes for those two variables are too small to produce reliable regression results, where the effective

sample contains the individuals that have more than one observation in the data.



on children’s health outcome and care on subsamples: IV approach




Father Migration status −1831.69 −89.18 −1.44 −2.19

(High income household) (3122.73) (127.10) (2.30) (2.97)


(Parents with low education level) (859.56) (24.21) (0.52) (0.44)

Father Migration status 1354.02· 36.92· 0.72· 0.54

(Child above age 5) (749.43) (20.53) (0.41) (0.33)

Father Migration status −37.47 58.81 0.06 1.00





(North China) (1284.92) (32.55) (0.83) (0.70)


(South China) (1065.57) (34.84) (0.64) (0.66)

***p < 0.001, **p < 0.01, *p < 0.05, ·p < 0.1.


Chapter 4

Conclusion

In Chapter 2, I develop and estimate a two-period ability-learning structural model that gives a more complete picture of the college market by including community colleges as a viable pathway to a bachelor’s degree. The results indicate that the labor market does not discriminate against transfer students: the effect of transfer on future income is not statistically different from zero, which is consistent with the findings of Kane and Rouse (1995). This suggests that the only cost of transferring is the direct transfer cost, which is the main barrier to college transfer. The estimation results also show that family income has a significant effect on college choices, which is evidence that students tend to start in community colleges when they face financial constraints. Finally, the results support the idea that the return to ability is higher in universities than in community colleges.

An immediate extension is to model the strategies of colleges and students jointly. Schools may set different strategies for admitting high school graduates and transfer students. A dynamic general equilibrium model that takes into account both sides of the college admission market would give a more complete picture of the decision-making process and its underlying driving forces. Another extension is to allow for heterogeneous levels of risk aversion. This can be achieved by employing a constant relative risk aversion (CRRA) utility function and allowing the risk aversion coefficient to differ across individuals. Such an extension would help explain the diversity of college choices and college preferences from another perspective.
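For reference, the constant relative risk aversion utility mentioned above is commonly written as follows. The symbols $c$ (a consumption or income argument) and $\gamma_i$ (the risk aversion coefficient of individual $i$) are notation assumed here for illustration only; they are not taken from the model in Chapter 2.

```latex
u_i(c) =
\begin{cases}
  \dfrac{c^{\,1-\gamma_i}}{1-\gamma_i}, & \gamma_i \neq 1, \\[1ex]
  \ln c, & \gamma_i = 1,
\end{cases}
\qquad c > 0 .
```

Under this form, heterogeneity in risk attitudes would enter through the distribution of $\gamma_i$ across students, and logarithmic utility is the special case $\gamma_i = 1$.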

Chapter 3 studied the health outcomes of left-behind children, including height-for-age Z-score (HAZ), weight-for-age Z-score (WAZ), daily calorie intake, daily protein intake, the number of immunization shots received, and whether the child was sick during the survey year. The evidence presented above showed that children of migrant parents did not necessarily have poorer health outcomes than children who lived with both parents. The regression results on subsamples showed that fathers’ migration had significant positive effects on nutrient intake for children between 5 and 10 years of age. This suggests that the positive effects of parents’ migration can outweigh the negative effects. The negative effects of parents’ migration on children’s health are possibly compensated for by better access to nutrition information and products, care from grandparents, and the remittances that migrant parents are able to provide.

We have explored the possible mechanisms that may lead to better access to nutritional information. Future research should examine whether parental migration affects the social support that children receive and how children’s health outcomes vary with the duration of parents’ migration.


Appendix A

Appendix to Chapter 2

A.1 Bayesian Update in Ability Learning Process

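Bayesian update after receiving high school GPA: To update the distribution of $\alpha_{ik}$ for all $0 \le k \le J$, we make use of Equation (2.10),

\[
HsGPA_i = \mu_0 + \sum_{j=0}^{J} \mu_{1j}\,\alpha_{ij} + \varepsilon^{Hs}_{ij}, \tag{A.1}
\]

which is equivalent to

\[
\frac{HsGPA_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,\alpha_{ij}}{\mu_{1k}} = \alpha_{ik} + \frac{\varepsilon^{Hs}_{ij}}{\mu_{1k}}. \tag{A.2}
\]

The prior distribution of the $\alpha_{ij}$ is defined in Equation (2.8), which can be rewritten as

\[
\alpha_{ij} = m_j + \chi_{ij} + \varepsilon^{\alpha}_{ij}, \quad \text{where } \varepsilon^{\alpha}_{ij} \sim N(0, \sigma_\alpha^2).
\]

Substituting this expression into Equation (A.2) gives

\[
\frac{HsGPA_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,(m_j + \chi_{ij})}{\mu_{1k}} = \alpha_{ik} + \frac{\varepsilon^{Hs}_{ij} + \sum_{j \neq k} \mu_{1j}\,\varepsilon^{\alpha}_{ij}}{\mu_{1k}}. \tag{A.3}
\]

Let

\[
\tilde{\alpha}^{Hs}_{ik} = \frac{HsGPA_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,(m_j + \chi_{ij})}{\mu_{1k}},
\qquad
\tilde{\varepsilon}^{Hs}_{ik} = \frac{\varepsilon^{Hs}_{ij} + \sum_{j \neq k} \mu_{1j}\,\varepsilon^{\alpha}_{ij}}{\mu_{1k}},
\]

where

\[
\tilde{\varepsilon}^{Hs}_{ik} \sim N\!\left(0,\; \frac{\sigma^2 + \sigma_\alpha^2 \sum_{j \neq k} \mu_{1j}^2}{\mu_{1k}^2}\right).
\]

Therefore the posterior distribution of student ability after receiving high school GPA is

\[
\alpha_{ik} \sim N(\alpha^{Hs}_{ik},\, \sigma^2_{Hs,k}), \tag{A.4}
\]

where

\[
\alpha^{Hs}_{ik} =
\frac{(m_k + \chi_{ik}) \cdot \dfrac{\sigma^2 + \sigma_\alpha^2 \sum_{j \neq k} \mu_{1j}^2}{\mu_{1k}^2} + \tilde{\alpha}^{Hs}_{ik} \cdot \sigma_\alpha^2}
     {\dfrac{\sigma^2 + \sigma_\alpha^2 \sum_{j \neq k} \mu_{1j}^2}{\mu_{1k}^2} + \sigma_\alpha^2},
\qquad
\sigma^2_{Hs,k} =
\frac{1}{\dfrac{1}{\;\dfrac{\sigma^2 + \sigma_\alpha^2 \sum_{j \neq k} \mu_{1j}^2}{\mu_{1k}^2}\;} + \dfrac{1}{\sigma_\alpha^2}}.
\]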

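Bayesian update after receiving SAT score: To update the distribution of $\alpha_{ik}$ for all $0 \le k \le J$, we use Equation (2.11),

\[
SAT_i = \mu_0 + \sum_{j=0}^{J} \mu_{1j}\,\alpha_{ij} + \varepsilon^{SAT}_{ij}, \tag{A.5}
\]

which is equivalent to

\[
\frac{SAT_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,\alpha_{ij}}{\mu_{1k}} = \alpha_{ik} + \frac{\varepsilon^{SAT}_{ij}}{\mu_{1k}}. \tag{A.6}
\]

From Equation (A.4), the ability $\alpha_{ij}$ can be rewritten as

\[
\alpha_{ij} = \alpha^{Hs}_{ij} + \varepsilon^{Hs}_{ij}, \quad \text{where } \varepsilon^{Hs}_{ij} \sim N(0, \sigma^2_{Hs,j}).
\]

Substituting this expression into Equation (A.6) gives

\[
\frac{SAT_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,\alpha^{Hs}_{ij}}{\mu_{1k}} = \alpha_{ik} + \frac{\varepsilon^{SAT}_{ij} + \sum_{j \neq k} \mu_{1j}\,\varepsilon^{Hs}_{ij}}{\mu_{1k}}. \tag{A.7}
\]

Let

\[
\tilde{\alpha}^{SAT}_{ik} = \frac{SAT_i - \mu_0 - \sum_{j \neq k} \mu_{1j}\,\alpha^{Hs}_{ij}}{\mu_{1k}},
\qquad
\tilde{\varepsilon}^{SAT}_{ik} = \frac{\varepsilon^{SAT}_{ij} + \sum_{j \neq k} \mu_{1j}\,\varepsilon^{Hs}_{ij}}{\mu_{1k}},
\]

where

\[
\tilde{\varepsilon}^{SAT}_{ik} \sim N\!\left(0,\; \frac{\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,k}\,\mu_{1j}^2}{\mu_{1k}^2}\right).
\]

Therefore the posterior distribution of the student’s ability after receiving the SAT score is

\[
\alpha_{ik} \sim N(\alpha^{SAT}_{ik},\, \sigma^2_{SAT,k}), \tag{A.8}
\]

where

\[
\alpha^{SAT}_{ik} =
\frac{\alpha^{Hs}_{ik} \cdot \dfrac{\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,k}\,\mu_{1j}^2}{\mu_{1k}^2} + \tilde{\alpha}^{SAT}_{ik} \cdot \sigma^2_{Hs,k}}
     {\dfrac{\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,k}\,\mu_{1j}^2}{\mu_{1k}^2} + \sigma^2_{Hs,k}},
\qquad
\sigma^2_{SAT,k} =
\frac{1}{\dfrac{1}{\;\dfrac{\sigma^2 + \sum_{j \neq k} \sigma^2_{Hs,k}\,\mu_{1j}^2}{\mu_{1k}^2}\;} + \dfrac{1}{\sigma^2_{Hs,k}}}.
\]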

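Bayesian update after receiving college GPA: From Equation (2.12), we have

\[
\kappa_{ijt} = \alpha_{ij} + \varepsilon^{\kappa}_{it}, \quad \text{where } \varepsilon^{\kappa}_{it} \sim N(0, \sigma_\kappa^2). \tag{A.9}
\]

The posterior distribution of the student’s ability after receiving college GPA is

\[
\alpha_{ik} \sim N(\alpha^{Col}_{ik},\, \sigma^2_{Col,k}), \tag{A.10}
\]

where the posterior mean weights the prior mean $\alpha^{SAT}_{ik}$ and the observed signal $\kappa_{ikt}$ by each other’s variance,

\[
\alpha^{Col}_{ik} = \frac{\alpha^{SAT}_{ik} \cdot \sigma_\kappa^2 + \kappa_{ikt} \cdot \sigma^2_{SAT,k}}{\sigma_\kappa^2 + \sigma^2_{SAT,k}},
\qquad
\sigma^2_{Col,k} = \frac{1}{\dfrac{1}{\sigma_\kappa^2} + \dfrac{1}{\sigma^2_{SAT,k}}}.
\]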
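The three updating steps in this section share one structure: a normal belief about ability is combined with a noisy normal signal, and the posterior mean is a variance-weighted average of the two. A minimal Python sketch of that single step follows. The names `Belief` and `update` and all numerical values are hypothetical and chosen only for illustration; `signal` stands for the rescaled quantity on the left-hand side of (A.3) or (A.7), or for the college GPA in (A.9), and `noise_var` stands for the variance of the corresponding error term.

```python
from dataclasses import dataclass


@dataclass
class Belief:
    """Normal belief about ability, summarized by its mean and variance."""
    mean: float
    var: float


def update(prior: Belief, signal: float, noise_var: float) -> Belief:
    """Combine a normal prior with one noisy signal (signal = ability + normal noise)."""
    # Posterior mean: each source is weighted by the variance of the other.
    post_mean = (prior.mean * noise_var + signal * prior.var) / (noise_var + prior.var)
    # Posterior precision is the sum of the signal precision and the prior precision.
    post_var = 1.0 / (1.0 / noise_var + 1.0 / prior.var)
    return Belief(post_mean, post_var)


# Hypothetical numbers: a prior, then high school GPA, SAT, and college GPA signals.
prior = Belief(mean=0.0, var=1.0)
after_hs = update(prior, signal=0.8, noise_var=0.5)
after_sat = update(after_hs, signal=0.4, noise_var=0.5)
after_col = update(after_sat, signal=0.6, noise_var=0.25)
```

Each call shrinks the variance of the belief, which is the sense in which students learn about their ability as signals arrive.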

A.2 Estimation Details

A.2.1 The Closed Form of $P(s_{i1} \mid \cdot)$

Define the probability: To find the closed form of $P(s_{i1} \mid \cdot)$, I make use of a property of the Extreme Value Type I distribution. In the model, the preference shocks $\varepsilon^{S}_{ij1}$ in Equation (2.4) follow an Extreme Value Type I distribution with location parameter zero and scale parameter $\tau$. The shock $\varepsilon^{S}_{ij1}$ enters the value function at period 1 ($V_1(\cdot)$, defined in Equation (2.16)) through the utility of attending school ($U^{S}(\cdot)$, defined in Equation (2.4)). By this property, the probability of attending college $s_{i1}$ at period 1 is given by¹

\[
P(s_{i1} \mid w_{i3}, \kappa_{i1}, \kappa_{i2}, X_i, \{\nu_j\}_{j \in J})
= \frac{\exp\!\left(\frac{1}{\tau}\,\bar{V}_1(X_i, I_{i1}, s_{i1})\right)}
       {\sum_{j \ge 0} \exp\!\left(\frac{1}{\tau}\,\bar{V}_1(X_i, I_{i1}, j)\right)}. \tag{A.11}
\]

• Here $\bar{V}_1(X_i, I_{i1}, j)$ is the value function at period 1 without the preference shock $\varepsilon^{S}_{ij1}$ in $U^{S}(\cdot)$. To be more specific, let us define

\[
\bar{U}^{S}(X_i, s_{i1}) = \ln(\xi(\cdot)) + \nu_{ij},
\]

and

\[
\bar{V}_1(X_i, I_{i1}, s_{i1}) = \bar{U}^{S}(X_i, s_{i1}) + E\!\left[\max_{j \in C(s_{i1})} \left[V_2(X_i, I_{i2}, j)\right] \,\middle|\, I_{i1}, s_{i1}\right].
\]

This is equivalent to

\[
U^{S}(X_i, s_{i1}) = \bar{U}^{S}(X_i, s_{i1}) + \varepsilon^{S}_{ij1}, \quad \text{and} \quad V_1(X_i, I_{i1}, j) = \bar{V}_1(X_i, I_{i1}, j) + \varepsilon^{S}_{ij1}.
\]

The denominator of Equation (A.11) sums $\exp\!\left(\frac{1}{\tau}\bar{V}_1(X_i, I_{i1}, j)\right)$ over $j \ge 0$ because working outside ($j = -1$) is not an option in period 1 (all students in this data set have received postsecondary education).
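Equation (A.11) has the form of a multinomial logit probability with scale τ. As an illustration only, the sketch below evaluates such probabilities for a hypothetical list of deterministic period-1 values. The function name `choice_probabilities` and the numbers are assumptions; subtracting the maximum before exponentiating is a standard numerical safeguard and not part of the model.

```python
import math


def choice_probabilities(values, tau):
    """Logit choice probabilities exp(v_j / tau) / sum_k exp(v_k / tau)."""
    scaled = [v / tau for v in values]
    shift = max(scaled)  # subtracting the maximum avoids overflow in exp
    weights = [math.exp(s - shift) for s in scaled]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical deterministic values for three schooling options, with tau = 1.
probs = choice_probabilities([1.0, 2.0, 3.0], tau=1.0)
```

A larger τ flattens the probabilities toward equal shares, while a smaller τ concentrates them on the option with the highest value.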

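Details of deriving the closed form of $\bar{V}_1(\cdot)$: To find the closed form of the value function at period 1, I need the expected maximum of the value function at period 2, $E\!\left[\max_{j \in C(s_{i1})} [V_2(X_i, I_{i2}, j)] \mid I_{i1}, s_{i1}\right]$. The expectation is taken over the distribution of the error terms $\varepsilon^{S}_{ij2}$ and $\varepsilon^{W}_{it}$. As $\varepsilon^{S}_{ij2}$ and $\varepsilon^{W}_{it}$ all follow an Extreme Value Type I distribution with location parameter zero and scale parameter $\tau$, the expectation has a closed form.

¹ Details available in Domencich and McFadden (1975, Chapter 4).

\begin{align}
\bar{V}_1(X_i, I_{i1}, j)
&= \bar{U}^{S}(X_i, s_{i1}) + E\!\left[\max_{j \in C(s_{i1})} \left[V_2(X_i, I_{i2}, j)\right] \,\middle|\, I_{i1}, s_{i1}\right] \notag \\
&= \bar{U}^{S}(X_i, s_{i1}) + \int \left[\tau\iota + \tau \log\Big\{\sum_{j \in C(s_{i1})} \exp\!\left(\tfrac{1}{\tau}\,\bar{V}_2(X_i, I_{i2}, j)\right)\Big\}\right] dK(\{\alpha_{ij}\}_{j \in C(s_{i1})}) \notag \\
&= \bar{U}^{S}(X_i, s_{i1}) + \tau\iota + \tau \int \log\Big\{\sum_{j \in C(s_{i1})} \exp\!\left(\tfrac{1}{\tau}\,\bar{V}_2(X_i, I_{i2}, j)\right)\Big\}\, dK(\{\alpha_{ij}\}_{j \in C(s_{i1})}) \tag{A.12}
\end{align}

• Here $\iota \approx 0.57$ is Euler’s constant.

• $dK(\{\alpha_{ij}\}_{j \in C(s_{i1})})$ is the joint distribution of the $\alpha_{ij}$ for $j \in C(s_{i1})$.

In the following, I use a first-order Taylor expansion to approximate the integral. To be more specific, let

\[
f(\alpha_i) = \log\Big\{\sum_{j \in C(s_{i1})} \exp\!\left(\tfrac{1}{\tau}\,\bar{V}_2(X_i, I_{i2}, j)\right)\Big\},
\]

and expand $f(\cdot)$ at $\alpha^{SAT}_i$, the posterior mean of $\alpha_i$ before the college enrollment decision at period 1:

\[
f(\alpha_i) \approx f(\alpha^{SAT}_i) + \frac{f^{(1)}(\alpha^{SAT}_i)\,(\alpha_i - \alpha^{SAT}_i)}{1!}.
\]

Because the mean of $\alpha_i - \alpha^{SAT}_i$ under $K$ is zero,

\[
\int f(\alpha_i)\, dK(\alpha_i) \approx f(\alpha^{SAT}_i) + \frac{f^{(1)}(\alpha^{SAT}_i) \times 0}{1!} = f(\alpha^{SAT}_i).
\]

As a result, Equation (A.12) has the form

\[
\bar{V}_1(X_i, I_{i1}, j) = \bar{U}^{S}(X_i, s_{i1}) + \tau\iota + \tau f(\alpha^{SAT}_i).
\]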
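The closed form used in (A.12) rests on a standard result: with independent Extreme Value Type I shocks of location zero and scale τ, the expected maximum over alternatives equals τ times Euler's constant plus τ times the log of the sum of exponentiated scaled values. The sketch below is an illustrative check of that result against a simple Monte Carlo average, not the estimation code. The function names, the three values, and the number of draws are hypothetical.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329  # Euler's constant (about 0.57)


def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x)))."""
    shift = max(xs)
    return shift + math.log(sum(math.exp(x - shift) for x in xs))


def expected_max(values, tau):
    """Closed form of E[max_j (v_j + eps_j)] with eps_j iid Gumbel(0, tau)."""
    return tau * EULER_GAMMA + tau * log_sum_exp([v / tau for v in values])


def simulated_expected_max(values, tau, draws=100_000, seed=0):
    """Monte Carlo estimate of the same expectation."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(draws):
        # If E ~ Exponential(1), then -tau * log(E) is Gumbel with location 0 and scale tau.
        total += max(v - tau * math.log(rng.expovariate(1.0)) for v in values)
    return total / draws


values = [1.0, 0.5, -0.2]  # hypothetical deterministic period-2 values
closed_form = expected_max(values, tau=0.8)
monte_carlo = simulated_expected_max(values, tau=0.8)
```

The two numbers should agree up to simulation error, which is why the integral over the period-2 shocks never has to be simulated.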

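A.2.2 The Closed Form of $P(s_{i2} \mid \cdot)$

For the same reason, as the $\varepsilon^{S}_{ij2}$ follow an Extreme Value Type I distribution with location parameter zero and scale parameter $\tau$, the probability of attending college $s_{i2}$ at period 2 is given by

\[
L^{S}_{i2} = P(s_{i2} \mid s_{i1}, X_i, \kappa_{i1}, \kappa_{i2}, \{\lambda^{\mu}_j\}_{j \in J}, \{\nu_j\}_{j \in J})
= \frac{\exp\!\left(\frac{1}{\tau}\,\bar{V}_2(X_i, I_{i1}, s_{i1})\right)}
       {\sum_{j \in C(s_{i1})} \exp\!\left(\frac{1}{\tau}\,\bar{V}_2(X_i, I_{i1}, j)\right)},
\]

• $\bar{V}_2(X_i, I_{i1}, s_{i1})$ is defined similarly to $\bar{V}_1(X_i, I_{i1}, s_{i1})$. It is related to $V_2(\cdot)$ by

\[
V_2(X_i, I_{i1}, j) = \bar{V}_2(X_i, I_{i1}, j) + \varepsilon^{S}_{ij1}.
\]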

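A.2.3 The Closed Form of $f(w_{it} \mid \cdot)$

$f(w_{it} \mid \cdot)$ is easy to compute, as the random variable $\varepsilon^{W}_{it}$ follows an Extreme Value Type I distribution with location parameter zero and scale parameter $\tau$:

\begin{align}
f(w_{it} \mid \cdot)
&= f\{\varepsilon^{W}_{it} = \ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}\} \tag{A.13} \\
&= \frac{1}{\tau}\exp\!\left[-\frac{\ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}}{\tau}\right] \tag{A.14} \\
&\quad \times \exp\!\left\{-\exp\!\left[-\frac{\ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}}{\tau}\right]\right\}, \tag{A.15}
\end{align}

where

• $w_{it}$ is the observed student wage at time $t$,

• $\widehat{\ln(w_{it})}$ is the predicted logarithm of the student wage at time $t$ using Equation (2.1),

• $\varepsilon_i$ follows $N(0, 1)$,

• $\alpha^{Col}_{i,D_i}$ and $\sigma^2_{Col,D_i}$ are defined in (A.10).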

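To see Equation (A.13), from the wage equation (Equation (2.1)), I have

\[
\ln\!\left(w_t(\alpha_{i,D_i}, s_{i1}, D_i)\right) = \rho_{1,D_i} + \rho_{2,D_i}\,\alpha_{i,D_i} + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2 Expr_{it} + \gamma_3 Expr^2_{it} + \varepsilon^{W}_{it},
\quad \text{for } t = 2, \cdots, T.
\]

Substituting $\alpha_{i,D_i} = \alpha^{Col}_{i,D_i} + \varepsilon_i \cdot \sigma^2_{Col,D_i}$ (Equation (A.10)) into the wage equation, I have

\[
\ln\!\left(w_t(\alpha_{i,D_i}, s_{i1}, D_i)\right) = \rho_{1,D_i} + \rho_{2,D_i}\left(\alpha^{Col}_{i,D_i} + \varepsilon_i \cdot \sigma^2_{Col,D_i}\right) + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2 Expr_{it} + \gamma_3 Expr^2_{it} + \varepsilon^{W}_{it},
\quad \text{for } t = 2, \cdots, T.
\]

The predicted logarithm of the wage is

\[
\widehat{\ln(w_{it})} = \rho_{1,D_i} + \rho_{2,D_i}\,\alpha^{Col}_{i,D_i} + \gamma_1 \mathbf{1}(D_i > s_{i1}) + \gamma_2 Expr_{it} + \gamma_3 Expr^2_{it},
\quad \text{for } t = 2, \cdots, T.
\]

Subtracting the predicted logarithm of the wage from the wage equation and rearranging, I have

\[
\varepsilon^{W}_{it} = \ln(w_{it}) - \widehat{\ln(w_{it})} - \rho_{2j} \cdot \varepsilon_i \cdot \sigma^2_{Col,D_i}.
\]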
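For reference, the textbook density of an Extreme Value Type I (Gumbel) variable with location zero and scale τ is (1/τ)·exp(−z)·exp(−exp(−z)) with z = ε/τ. The sketch below evaluates that density at a wage residual built as in the last display. The function names and every numerical input are hypothetical, and `scale` simply stands in for the term that multiplies ε_i in the residual.

```python
import math


def gumbel_pdf(eps, tau):
    """Density of a Gumbel (Extreme Value Type I) variable with location 0 and scale tau."""
    z = eps / tau
    return math.exp(-z - math.exp(-z)) / tau


def wage_residual(wage, predicted_log_wage, rho2, eps_i, scale):
    """Residual: observed log wage minus predicted log wage minus the ability-shock term."""
    return math.log(wage) - predicted_log_wage - rho2 * eps_i * scale


# Hypothetical inputs for one student-period and one simulated draw eps_i.
eps_w = wage_residual(wage=30000.0, predicted_log_wage=10.2, rho2=0.4, eps_i=0.3, scale=0.5)
density = gumbel_pdf(eps_w, tau=0.7)
```

In a simulated likelihood, this density would be evaluated once per draw of ε_i and then averaged across draws.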

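A.3 The Closed Form of $f(\kappa_{it} \mid \cdot)$

\[
f(\kappa_{it} \mid \cdot) = \phi\!\left(\frac{\kappa_{it} - \alpha^{SAT}_{i,s_{i1}}}{\left(\sigma_\kappa^2 + \sigma^2_{SAT,s_{i1}}\right)^{0.5}}\right), \tag{A.16}
\]

where $s_{i1}$ is the student’s school choice at period 1, as mentioned before, and $\alpha^{SAT}_{i,s_{i1}}$ and $\sigma^2_{SAT,s_{i1}}$ are defined in (A.8).

To see (A.16), for $j \ge 0$ ($j = 0$ indicates community college; $j > 0$ indicates 4-year universities), we have

\[
\alpha_{ij} = \alpha^{SAT}_{i,s_{i1}} + \varepsilon^{2}_{SAT,s_{i1}} \quad \text{(derived from Equation (A.8))},
\]

where $\varepsilon^{2}_{SAT,s_{i1}} \sim N(0, \sigma^2_{SAT,s_{i1}})$. It can be shown that

\begin{align*}
\kappa_{ijt} &= \alpha_{ij} + \varepsilon^{\kappa}_{ij} && \text{(Equation (2.12))} \\
&= \alpha^{SAT}_{i,s_{i1}} + \varepsilon^{2}_{SAT,s_{i1}} + \varepsilon^{\kappa}_{ij} && \text{(substitute } \alpha_{ij}\text{)}
\end{align*}

To find the value of the likelihood function $L_i(\cdot)$, for each individual I draw shocks $\{\{\lambda^{\mu}_{ijr}\}_{j \in J}, \{\nu_{ijr}\}_{j \in J}, \{\varepsilon_j\}_{j \in J}\}_{r=1}^{R}$ from their joint distribution $G(\{\lambda^{\mu}_j\}_{j \in J}, \{\nu_j\}_{j \in J}, \{\varepsilon_j\}_{j \in J})$. The likelihood function is approximated by

\[
\frac{1}{R} \sum_{r=1}^{R} P^{r}(s_{i1}, s_{i2} \mid w_{i3}, \kappa_{i1}, \kappa_{i2}, X_i, \{\nu_j\}_{j \in J}) \times f^{r}(w_{i3}, \kappa_{i1}, \kappa_{i2} \mid \{\lambda^{\mu}_j\}_{j \in J}, \{\varepsilon_j\}_{j \in J}).
\]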

The likelihood function for students who complete only one period of education is similar to Equation (2.18), except that the contribution from working starts from time two and there is only one period’s contribution of $\kappa_{it}$ ($L^{\kappa}_{i1}$).
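The simulated likelihood described in this section is an average, over R draws of the unobserved shocks, of a choice probability multiplied by an outcome density. A minimal sketch of that averaging step follows. It is not the dissertation's estimation code: `simulated_likelihood`, the toy probability and density functions, and all numbers are hypothetical stand-ins for the model objects.

```python
import math
import random


def simulated_likelihood(choice_prob, outcome_density, draw_shocks, R, seed=0):
    """Average choice_prob(shocks) * outcome_density(shocks) over R simulated draws."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(R):
        shocks = draw_shocks(rng)  # one draw r from the joint shock distribution
        total += choice_prob(shocks) * outcome_density(shocks)
    return total / R


# Toy stand-ins with a single standard normal shock nu (purely illustrative).
def toy_choice_prob(nu):
    return 1.0 / (1.0 + math.exp(-(0.5 + nu)))  # binary logit probability


def toy_outcome_density(nu, y=0.2):
    return math.exp(-0.5 * (y - nu) ** 2) / math.sqrt(2.0 * math.pi)  # N(nu, 1) density at y


toy_value = simulated_likelihood(
    toy_choice_prob, toy_outcome_density, lambda rng: rng.gauss(0.0, 1.0), R=1000
)
```

In an estimation routine the logarithm of this average would be summed over individuals and maximized over the structural parameters; holding the seed fixed across parameter values keeps the simulated objective smooth in the parameters.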


Bibliography

[1] Altonji, J. (1993): “The Demand for and Return to Education When Education

Outcomes are Uncertain,” Journal of Labor Economics, 11(1), 48-83.

[2] Bao, Shuming and Orn B. Bodvarsson, Jack W. Hou, and Yaohui Zhao (2009).

“Migration in China from 1985-2000-The Effects of Past Migration, Investments,

and Deregulation.” The Chinese Economy, 42(4), 7-28.

[3] Belzil, C. and J. Hansen (2002): “Unobserved Ability and the Return To School-

ing,” Econometrica, 70(5), 2075-2091.

[4] Brauw, Alan D. and Ren Mu. 2011. “Migration and the Overweight and Under-

weight Status of Children in Rural China.” Food Policy, 36(1), 88-100.

[5] Cai, F., Albert Park, and Yaohui Zhao. 2008. “The Chinese Labor Market in the
Reform Era.” In: Brandt, L., and Tom Rawski (Eds.), China’s Economic Transition:
Origins, Mechanisms, and Consequences. Cambridge University Press: Cambridge.

[6] Campbell, R. and B. Siegel (1967): “The Demand for Higher Education in the

United States, 1919-1964,” American Economic Review, 57(3), 482-94.

[7] Chen, Chunming. 2000. “Fat Intake and Nutritional Status of Children in China.”
American Journal of Clinical Nutrition, 72(5S), 1368S-1372S.


[8] Chen, Xinxin, Qiuqiong Huang, Scott Rozelle, Yaojiang Shi, and Linxiu Zhang.

2009. “Effect of Migration on Children’s Educational Performance in Rural

China.” Comparative Economic Studies, 51(3); 323-343.

[9] Chen, S. (2008): “Estimating the Variance of Wages in the Presence of Selection
and Unobservable Heterogeneity,” Review of Economics and Statistics, 90(2), 275-289.

[10] Chinese Nutrition Society, 2000. Dietary Reference Intakes, Beijing: Chinese

Light Industry Press, 2000.

[11] Cunha, F., J. J. Heckman, and S. Navarro (2005): “Separating Uncertainty from

Heterogeneity in Life Cycle Earnings,” Oxford Economic Papers, 57(2), 191-261.

[12] Czepiel, S. (2002): “Maximum Likelihood Estimation of Logistic Regression
Models: Theory and Implementation,” [online], Available:
http://czep.net/stat/mlelr.pdf (May 1, 2011).

[13] de Brauw, Alan, and J. Giles. 2008. “Migrant labor markets and the welfare

of rural households in the developing world: evidence from China.” World Bank

Policy Research Working Paper 4585.

[14] de Brauw, Alan, and Ren Mu. 2011. “Migration and the Overweight and Un-

derweight Status of Children in Rural China.” Food Policy, 36(1), 88-100.

[15] Domencich, T. A. and D. McFadden (1975): “Urban Travel Demand: a Behav-

ioral Analysis,” North-Holland Publishing Company, Amsterdam.

[16] Doyle, W. (2009): “The Effect of Community College Enrollment on Bachelor’s

Degree Completion,” Economics of Education Review, 28(2), 199-206.

[17] Du, Shufa, Tom A. Mroz, Fengying Zhai, and Barry M. Popkin. 2004. “Rapid
Income Growth Adversely Affects Diet Quality in China - Particularly for the
Poor!” Social Science and Medicine, 59(7), 1505-1515.

[18] Du, Yang, Albert Park, and Sangui Wang. 2005. “Migration and Rural Poverty

in China.” Journal of Comparative Economics, 33(4), 688-709.

[19] Dunning, Thad, Freedman, A. David. 2008. “Modeling selection effects.” S.P.

Handbook of social science methodology. Sage.

[20] Epple, D., R. Romano and H. Sieg (2006): “Admission, Tuition, and Financial

Aid Policies in The Market for Higher Education,” Econometrica, 74(4), 885-928.

[21] Fu, C (2010): “Equilibrium Tuition, Applications, Admissions and Enrollment

in the College Market,” Working Paper, University of Wisconsin-Madison.

[22] Galper, H. and R. M. Dunn (1969): “A Short-Run Demand Function for Higher

Education in the United States,” Journal of Political Economy, 77(5), 765-777.

[23] Giles, John. 2006. “Is Life More Risky in the Open? Household Risk-Coping

and the opening of China’s Labor Markets.” Journal of Development Economics,

81(1), 25-60.

[24] Hilmer, M (1998): “Post-Secondary Fees and the Decision to Attend a University

or a Community College,” Journal of Public Economics, 67, 329-348.

[25] Kane, T. and C. Rouse (1995): “Labor-Market Returns to Two- and Four-Year
College,” The American Economic Review, 85(3), 600-614.

[26] Leslie, L.L. and P.T. Brinkman (1987): “Student Price Response in Higher

Education: The Student Demand Studies,” Journal of Higher Education, 55,

181-204.


[27] Liu, Hong, Hai Fang and Zhong Zhao. 2012. “Urban-rural disparities of child

health and nutritional status in China from 1989 to 2006.” Economics and Human

Biology, doi: 10.1016/j.ehb.2012.04.010.

[28] Liang, Zai, and Zhongdong Ma. 2004. “China’s Floating Population: New Evi-

dence from the 2000 Census.” Population and Development Review, 30(3), 467-

488.

[29] Mallee, Hein. 1995. “China’s Household Registration System under Reform.”

Development and Change, 26, 1-29.

[30] Mu, Ren, and Dominique van de Walle. 2011. “Left Behind to Farm? Women’s
Labor Reallocation in Rural China.” Labour Economics, 18(S1), S83-S97.

[31] National Bureau of Statistics of China. 2012. “Year 2011 Report
on the Rural-Urban Labor Migration in China.” stats.gov.cn,
http://www.stats.gov.cn/tjfx/fxbg/t20120427_402801903.htm.

[32] Osberg, Lars, Jiaping Shao and Kuan Xu. 2009. “The Growth of Poor Children in

China 1991-2000: Why Food Subsidies May Matter.” Health Economics, 18(S1),

S89-S108.

[33] Popkin, Barry M, Shufa Du, Fengying Zhai and Bing Zhang. 2010. “Cohort

Profile: The China Health and Nutrition Survey-monitoring and understanding

socio-economic and health change in China, 1989-2011.” International Journal of

Epidemiology, 39(6), 1435-1440.

[34] Rozelle, Scott, Li Guo, Minggao Shen, Amelia Hughart, John Giles. 1999. “Leav-

ing China’s Farms: Survey Results of New Paths and Remaining Hurdles to Rural

Migration.” China Quarterly, 158, 367-393.


[35] Sandy J., A. Gonzalez, and M. Hilmer (2006): “Alternative Paths to College

Completion: Effect of Attending a 2-year School on the Probability of Completing

a 4-year Degree”, Economics of Education Review, 25(5), 463-471.

[36] Shen, Tiefu, Jean-Pierre Habicht, and Ying Chang. 1996. “Effect of Economic

Reforms on Child Growth in Urban and Rural Areas of China.” The New England

Journal of Medicine, 335(6), 400-406.

[37] Stock, Wright, and Yogo. 2002. “A Survey of Weak Instruments and Weak
Identification in Generalized Method of Moments.” Journal of Business and
Economic Statistics, 20(4), 518-529.

[38] Svedberg, Peter. 2006. “Declining Child Malnutrition: a Reassessment.” Inter-

national Journal of Epidemiology, 35(5), 1336-1346.

[39] World Bank. 2009. From Poor Areas to Poor people: China’s Evolving Poverty

Reduction Agenda. Washington, DC.

[40] Zhang, Shu, 2012. Migration and Children’s Health: Evidence From Rural

China. Department of Economics, University of Houston Working paper.

[41] Zhao, Yaohui. 1999. “Leaving the Countryside: Rural-to-Urban Migration in
China.” American Economic Review, 89(2), 281-286.


Curriculum Vitae

Xiaochen Xu was born in Henan, China on March 19, 1984. She obtained a B.Sc. in Actuarial Science from the University of Calgary, Calgary, Canada in 2006. She obtained an M.Phil. in Statistics from the University of Hong Kong, Hong Kong in 2008, and entered the Ph.D. program in Economics at the Johns Hopkins University in 2008.

