Working Paper Series∗
Department of Economics
Alfred Lerner College of Business & Economics
University of Delaware
Working Paper No. 2003-08
The Socio-Economic Effects of Teen Childbearing
Re-considered: A Re-Analysis of the Teen
Miscarriage Experiment†
Saul D. Hoffman
September 2003
∗ http://www.be.udel.edu/economics/workingpaper.htm
† © 2003 by author(s). All rights reserved.
The Socio-Economic Effects of Teen Childbearing Re-considered: A Re-Analysis of the Teen Miscarriage Experiment
Saul D. Hoffman
Department of Economics
University of Delaware
Newark, DE 19716
September 16, 2003
This research was supported by grants from the W. T. Grant Foundation and the Charles Stewart Mott Foundation; this financial support is gratefully acknowledged. Helpful comments were received from Robert Plotnick, Kristen Moore, and Michael Foster.
Abstract
In an important contribution to the literature on the socio-economic impacts of teen childbearing, Hotz, McElroy, and Sanders used a natural experiment based on the random occurrence of miscarriages. They concluded that the negative impacts of teen childbearing had been substantially exaggerated. In a replication of their work, I identify a number of important errors that undermine their results. Correction and re-estimation with their data show substantially smaller impacts on income variables. Re-estimation with a new data set yields impacts that are smaller yet. The re-estimation generally does not alter the sign of the estimated effects, but does lead to a much more modest conclusion.
I. Introduction
In 1997, Hotz, McElroy, and Sanders contributed a chapter to Kids Having Kids, an
important and heavily publicized research volume about teen childbearing funded by the Robin
Hood Foundation and published by the Urban Institute. It was, according to the foreword, “the
first comprehensive effort to identify the extent to which the undesirable outcomes of teen
pregnancy are attributable to adolescent pregnancy itself rather than to the wider environment in
which most of these pregnancies and the subsequent child rearing take place” (Maynard, 1997,
p. IX). The book included substantive chapters on the impact of teen childbearing on mothers,
fathers, and children.
Their chapter, which focused on the costs of childbearing to the young women,
concluded that teen childbearing, properly analyzed, was actually beneficial, not only to the teen
mothers, but also to the Federal government in terms of net taxes and transfers. They found
that the teen mothers –in this study, women who had a first pregnancy at age 17 or earlier that
resulted in a birth– worked more, earned more, married men who earned higher incomes, and
received less support from welfare through their mid-20s to early 30s than if they had delayed
their childbearing. Using these estimates, they estimated that Federal expenditures on the
women who became teen mothers would actually increase by $4.0 billion, net of income taxes
paid, if they had delayed their childbearing. They concluded, referring to their findings and the
findings of two other earlier studies, that “the failure to account for selection bias vastly
overstates (emphasis in original) the negative consequences of teenage childbearing and [the
findings] certainly provide no support for the view that there are large negative consequences of
teenage childbearing per se for the socioeconomic attainment of teen mothers” (p. 81). They
also wrote that “these findings call into question the view that teenage childbearing is one of the
nation’s most serious social problems, at least when one measures its severity in terms of costs
to taxpayers.” Their findings were prominently featured in the foreword to the book which noted
that “the basic message of the book is that early parenting itself has little effect on the mothers’
education or earnings” (Maynard, 1997, p. IX). This was a controversial chapter, to say the
least.
In this paper, I examine this important and controversial contribution to the literature on
the socio-economic impact of teen childbearing. I first outline the main arguments of the article,
including its methodological contribution and its substantive findings. I then turn to a direct re-
estimation of their data, guided by the data and programs that the authors shared with me. I
examine their data and sample construction carefully, undoubtedly far more carefully than the
average referee –or even a particularly conscientious one– examines a typical journal article.
I can summarize my findings quite easily. There is much to admire in their
methodological contribution to the teen childbearing literature. They joined a growing literature,
originating with Geronimus and Korenman (1992) and including Hoffman, Foster, and
Furstenberg (1993), and Grogger and Bronars (1993), which argues that traditional research
approaches overestimate the causal impact of teen childbearing by incorrectly attributing to teen
childbearing the impact of unobserved individual, family, and community factors that are
correlated with both teen childbearing and the particular outcomes under examination. Their
particular methodological twist was to identify a natural comparison group consisting of teens
whose first pregnancies ended in a miscarriage. Since miscarriage is usually a random event
that results in a delay in the age of a first birth, a comparison of outcomes for teens whose first
pregnancy ended in a birth with outcomes for teens whose first pregnancy ended in a
miscarriage ought to provide an unbiased estimate of the impact of a teen birth. Such a
comparison is, in their words, “a natural experiment,” and, as such, it shares many of the
strengths of a random-assignment experiment which, in the context of teen childbearing, could,
of course, not otherwise be implemented.
The application of this methodology is, however, marred by a series of data processing
errors. I identify and discuss five such kinds of errors: 1) the incorrect re-scaling of 1978-1992
incomes into a common year that makes some years’ incomes more than twice the correct
real income; 2) inclusion of data points outside the data window available in the NLSY; 3) an
improbable coding of AFDC, Food Stamp, and total welfare income; 4) questionable coding of
teen fertility for a substantial number of cases; and 5) errors in using sample weights that result
in non-response cases being included in their analyses.
HMS have written two related papers that use the miscarriage methodology to estimate
1 There is also a primarily methodological piece, Hotz, Mullin, and Sanders (1997). I do not discuss that paper here.
2 The KHK analyses include teens who became pregnant at ages 18 and 19, although they are not the focus of any direct analysis. The analyses in HMS 2002 exclude these women. HMS 2002 includes additional analyses, both in terms of the dependent variables considered and the specifications of age effects that are estimated.
3 In a personal correspondence in answer to my query about whether the data sets in the two analyses were the same, one of the authors indicated that “to the best of my knowledge, the answer to your question is ‘yes’ although there may be a few changes.”
4 These two figures show average earnings (Fig. 3.3) and AFDC+Food Stamp Benefits (Fig. 3.4) for teen mothers and women who did not have a teen birth. I have had to re-scale the dollar values to replicate the figures.
the causal impact of teen childbearing.1 One is the chapter in Kids Having Kids, referred to
hereafter as KHK. The other is a revised version, referred to here as HMS 2002. I also use
HMS without a year to refer to the authors themselves. The authors note that the results in the
two papers are substantively similar, and that some minor errors in KHK were corrected in HMS
2002. There are some differences in model specification and sample.2
At the request of the authors, I focus here on HMS 2002 rather than KHK. Where
appropriate, I indicate if there are substantial differences between the two papers. It appears
that the same data was used in both analyses, except perhaps for the addition of one more year
of data for the 2002 analyses.3 Using the data for HMS 2002, I can replicate exactly Figures 3.3
and 3.4 in their chapter in Kids Having Kids.4 This data correspondence between the two
papers is important, because it means that my findings here also apply to the findings of the
chapter in Kids Having Kids.
In the next section, I review the new literature on the effects of teen childbearing and
then discuss the methodology and findings in KHK and HMS 2002. In the following section, I
discuss the various errors summarized above and evaluate their likely impact. I think most
reasonable, seasoned empirical researchers will appreciate the issues and agree that the errors
are genuine and go beyond the kind of innocuous errors and subjective decisions that are part
of almost any social science research project. An interesting question that I examine there is
whether these errors could have been identified in the refereeing process, short of the direct re-
examination of the data that I undertook with the assistance of the authors. In the fourth
section, I outline my own findings from a correction and re-analysis of their data and an original
re-analysis of the underlying NLSY data.
As a researcher who has examined some of the same issues that HMS analyze, I view
their articles as major contributions to the literature. My goal here is to determine exactly what
can be learned from implementing their proposed research methodology with the care and
attention to detail that it deserves.
II. The New Approach to The Consequences of Teen Childbearing – Methods and Findings
The effect of teen childbearing on subsequent socio-economic outcomes has been one
of the most researched topics in family demography. Campbell’s (1968) conclusion that “when a
16 year old girl has a child... 90 percent of her life’s script is written for her” is a well-known
pessimistic assessment, but it is no longer widely accepted. It has been well understood for
more than a decade that there is a wide variation in outcomes (Furstenberg, Brooks-Gunn and
Morgan, 1987; Duncan and Hoffman, 1990) and that many teen mothers may do well, especially
with the passage of time. Still, as of the late 1980s, the consensus research view was that teen
childbearing was a serious problem and that it exacerbated the disadvantages of the young
women. The summary in Risking the Future, the 1987 report of the National Research Council,
is well known: “Women who become parents as teenagers are at greater risk of social and
economic disadvantage throughout their lives than those who delay childbearing” (Hayes,
1987).
The traditional literature evaluated the consequences of teen childbearing primarily
using a multivariate regression of the form:
(1) $Y_i = X_i B + \alpha T_i + \varepsilon_i$
where Y is some outcome of interest, X are other control variables that affect outcome Y, T is a
measure of a teen birth (either dichotomous or age at first birth) and $\alpha$ is the effect of a teen
birth on outcome Y. The model was estimated either by OLS or logit/probit, depending on the
nature of the dependent variable. Outcomes were typically measured some years subsequent
to the teen birth, often when the women were in their mid-20s, although this varies across
studies. Studies like these, which appeared from the mid-1970s through the 1980s, were the
5 Technically, $\lambda_{Z,T}$ is the coefficient on T from a regression of Z on X and T.
6 As a practical matter, there is no accepted statistical test to establish that all relevant variables have been included. Any and all multivariate specifications are potentially at risk for criticisms that they are plagued by specification bias due to omitted variables.
basis for the negative prognoses of the consequences of teen childbearing summarized in
Risking the Future.
The specification in (1) was, however, fair game for criticism that the estimates were not
convincingly causal and that they could well reflect omitted individual, family, and neighborhood
variables that affected both the outcome (education, income, etc.) and the probability of a teen
birth. In a regression context, if there were a single omitted variable Z, the bias would be the
product of two terms:
(2) $\text{Bias} = \beta_Z \times \lambda_{Z,T}$.
In (2), the first term is the effect of the omitted variable on outcome Y and the second
term has the sign of the partial correlation, conditional on the Xs in equation (1), between Z and
T.5 Let Z be measured so that more of it is a good thing and T be a dummy variable coded “1”
for a teen birth. With Z omitted, T then captures the correlated effect of Z on Y, which is
precisely the bias term in (2). For negative outcomes such as poverty and welfare receipt, both
terms in (2) would likely be negative, which means that standard estimates overstate the
negative consequences of teen childbearing. For positive outcomes, such as income or
educational attainment, $\beta_Z > 0$, while $\lambda_{Z,T}$ is still negative. That would make standard estimates
too negative. Thus, this critique suggests that standard (multivariate) estimates would be too
large in absolute value compared to true causal effects.
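
The bias expression in (2) can be checked numerically. The following is a minimal simulation sketch and is not part of the original analysis: the data-generating process, the parameter values, and the function name are all assumptions chosen for illustration, and the controls X are left out, so the second term in (2) is simply the slope from a regression of Z on T.

```python
import numpy as np

def simulate_omitted_variable_bias(n=200_000, alpha=-1.0, beta_z=2.0, seed=0):
    """Compare the estimate of alpha with Z omitted to alpha + beta_z * lam."""
    rng = np.random.default_rng(seed)
    z = rng.normal(size=n)                       # omitted "good" factor Z (hypothetical)
    # A teen birth is less likely when Z is high, so Z and T are negatively related.
    t = (rng.normal(size=n) - 0.8 * z > 0.5).astype(float)
    # Positive outcome (e.g. income): true effect of T is alpha, effect of Z is beta_z.
    y = alpha * t + beta_z * z + rng.normal(size=n)

    # Short regression of Y on T only: slope = cov(Y, T) / var(T).
    alpha_short = np.cov(y, t)[0, 1] / np.var(t, ddof=1)
    # Second term in (2): slope from a regression of Z on T.
    lam = np.cov(z, t)[0, 1] / np.var(t, ddof=1)
    return alpha_short, alpha + beta_z * lam, lam

alpha_short, predicted, lam = simulate_omitted_variable_bias()
print(f"slope of Z on T            = {lam:.3f}")
print(f"estimate with Z omitted    = {alpha_short:.3f}")
print(f"true effect plus bias term = {predicted:.3f}")
```

In this sketch the estimate with Z omitted is more negative than the true effect, by approximately the product of the two terms in (2).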
The solution, within the traditional framework, is to include better and/or additional
measures of explanatory variables, but this is almost inevitably an unconvincing exercise to a
skeptical audience.6 Instead, the post-Risking the Future literature proceeds in one of two
ways, following the lead of equation (2).
Fixed-effect sibling models were introduced to this literature by Geronimus and
Korenman (1990, 1992), who compared outcomes for pairs of sisters, one of whom had a teen
birth, while the other did not. Expanding (1) to allow for common family effects, the model is:
7 In the NLSYW, differences in the sisters’ own family incomes were quite small and were not statistically significantly different from zero, as were differences in the probability of high school graduation. In the PSID and NLSY, the average difference in economic well-being (family income divided by the poverty line) between sisters was about one-third. There were also big differences between the sisters in the probability of being poor, receiving welfare, and in educational attainment. The findings were unusually consistent across the two data sets and the two sets of authors.
8 A random assignment experiment is impossible for obvious reasons.
(3) $Y_{ij} = X_i B + \alpha T_i + \gamma F_j + \varepsilon_i$
where “j” now indicates the family in which individual “i” grew up and $\gamma F_j$ is the impact on
outcome Y of growing up in a particular family. In this specification, F is exactly the Z from
above. As above, let F be measured such that more is better and assume that T and F are
negatively correlated – being raised in a “better” family reduces the probability of a teen birth.
Then, if F is unobserved and hence omitted from equations like (1), we have exactly the over-
estimated effect due to bias.
Geronimus and Korenman’s important contribution was to use data on sisters to
eliminate the unobserved family effect. If the family effect on outcome Y is constant for sisters
within a family, it will not affect differences between them. The first estimates of this model used
data from the NLSYW (Geronimus and Korenman, 1990); subsequent implementation of the
model by Geronimus and Korenman (1992) and Hoffman, Foster, and Furstenberg (1993) used
more current and more representative data from the NLSY and PSID. The NLSYW estimates
show not only that the causal effects of a teen birth were smaller than in the conventional
studies, but also that they were very small and often essentially zero. The PSID and NLSY
estimates are also smaller than in the conventional studies, but they are substantially larger
than the NLSYW estimates and arguably large enough to be important for policy purposes.7
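
To see why the family effect drops out of the sisters comparison, difference equation (3) for two sisters from the same family j. This is an illustrative restatement in the notation of (3); the second-sister index i' is introduced here only for exposition.

```latex
\begin{align*}
Y_{ij} - Y_{i'j}
  &= (X_i - X_{i'})B + \alpha\,(T_i - T_{i'}) + \gamma\,(F_j - F_j)
     + (\varepsilon_i - \varepsilon_{i'}) \\
  &= (X_i - X_{i'})B + \alpha\,(T_i - T_{i'}) + (\varepsilon_i - \varepsilon_{i'})
\end{align*}
```

The family term cancels only if its effect is the same for both sisters, which is the assumption stated above.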
The alternative approach is to identify and implement a natural experiment for teen
childbearing.8 An ideal natural experiment eliminates the bias problem, since by construction
$\lambda_{Z,T} = 0$ for all Z when the assignment to “control” or “experimental” group is random by natural
means. Two natural experiment approaches have been presented, one based on twins and the
other based on miscarriages.
The twins experiment, due to Grogger and Bronars (1993), uses the difference in
outcomes between teen mothers of twins and teen mothers with single births to assess the
difference in outcomes between teen mothers and women who do not have a birth, i.e., to
measure the consequences of teen childbearing. Since twinning is random, $\lambda_{Z,T} = 0$ in equation
(2), and the resulting estimate of the effect of teen childbearing ought to be unbiased. While the
estimated effects are not zero in this study, they are relatively small and certainly smaller than in
the multivariate literature. For example, the probability of graduating from high school is about
four percentage points lower for the teen mothers of twins and their family income is about
$1100 (10 percent) less. The mothers of twins were also slightly more likely to be in poverty
and on welfare.
Grogger and Bronars note that the use of the twins experiment to estimate the costs of a
teen birth is appropriate only if the impact of having twins is a linear function of the number of
children. In that case, the observed difference in outcomes between mothers of twins and
mothers of singletons is equivalent to the unobserved difference between mothers with single
births and women with no birth, which is the impact of interest. If, however, the marginal impact
of a second, simultaneous child (i.e., a twin) is essentially zero, the twins natural experiment
would seriously under-estimate the true impact of a teen birth. In fact, as the authors show,
there are substantial increasing returns to scale in the time cost of children (which implies
declining marginal cost) in the sense that mothers of twins do not spend anywhere near twice
the time in childcare as mothers of one child. This means that the twins experiment understates
the impact of a teen birth, perhaps dramatically so. The authors quite clearly note that “the
twins approach provides a conservative estimate of the effect of an unplanned teenage
singleton first birth” (p. 161).
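
The linearity condition can be written compactly. As an illustrative sketch in notation that does not appear in the original papers, let m(k) denote the expected outcome for a teen whose first birth produces k children (k = 0, 1, 2).

```latex
\begin{align*}
\text{Twins estimate:}\quad & m(2) - m(1) \\
\text{Impact of interest:}\quad & m(1) - m(0) \\
\text{Linearity:}\quad & m(k) = a + bk \;\Rightarrow\; m(2) - m(1) = m(1) - m(0) = b \\
\text{Declining marginal cost:}\quad & |m(2) - m(1)| < |m(1) - m(0)|
\end{align*}
```

In the last case the twins comparison understates the impact of a teen birth, which is the sense in which the estimate is conservative.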
The most recent and most well-known natural experiment study is Hotz, McElroy, and
Sanders’s chapter in Kids Having Kids and their 2002 paper, which are the focus of this review
paper. Their analyses are based on a comparison of young teen mothers –in their study,
adolescents who became pregnant at age 17 or younger– with girls who had become pregnant
by the same age but suffered a miscarriage instead. Since most miscarriages are random, the
same argument as above carries over, and the “experiment” should yield an unbiased estimate
9 Because the two groups are more similar in terms of their early sexual behavior, they may also be more similar in terms of unmeasured variables. But that is irrelevant if miscarriages are random.
10 Also included are dummy variables reflecting an early pregnancy and having a birth at ages 18 or 19.
of the consequences of teen childbearing.9 The subsequent differences in outcomes between
the two groups thus ought to provide an estimate of the causal effects of a teen birth.
In practice, there are some complications to this methodological design. The ideal
comparison group for the natural experiment is women who would have chosen to have a birth,
but were randomly prevented from doing so by virtue of a miscarriage. As HMS note, some
women who have a miscarriage would actually have chosen to have an abortion had the
miscarriage not intervened. These women cannot be identified in the data. Thus, the authors
do not quite proceed as if this were a natural experiment, i.e. by using the difference-in-
difference estimator commonly used in natural experiments. Instead, they estimate an
instrumental variables model, using a teen miscarriage as an instrument for a teen birth. To be
a valid instrument for a teen birth, a teen miscarriage must satisfy two basic statistical
requirements: it must be correlated with a teen birth and it must be uncorrelated with the
outcomes of interest, conditional on the other independent variables. The former condition is
clearly satisfied; the latter is plausible, but might not be valid, if a miscarriage had an
independent effect. There is, however, no simple way to test for this.
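
The logic of using a miscarriage as an instrument can be sketched with the simplest instrumental variables estimator, the Wald ratio. The example below is a hedged illustration on simulated data, not the specification HMS estimate: it has no control variables, the data-generating process and all parameter values are assumptions, and `wald_iv_estimate` is a hypothetical helper name.

```python
import numpy as np

def wald_iv_estimate(y, t, m):
    """Wald/IV estimate of the effect of t on y using a binary instrument m."""
    y, t, m = (np.asarray(a, dtype=float) for a in (y, t, m))
    reduced_form = y[m == 1].mean() - y[m == 0].mean()   # effect of m on y
    first_stage = t[m == 1].mean() - t[m == 0].mean()    # effect of m on t
    return reduced_form / first_stage

# Hypothetical sample of teens with a first pregnancy.
rng = np.random.default_rng(1)
n = 200_000
u = rng.normal(size=n)                                    # unobserved background factor
miscarriage = rng.random(n) < 0.15                        # random, independent of u
would_choose_birth = rng.normal(size=n) - 0.7 * u > -0.3  # birth vs. abortion depends on u
teen_birth = (~miscarriage) & would_choose_birth
true_effect = -1.0
y = true_effect * teen_birth + 2.0 * u + rng.normal(size=n)

naive = y[teen_birth].mean() - y[~teen_birth].mean()
iv = wald_iv_estimate(y, teen_birth, miscarriage)
print(f"naive difference in means: {naive:.2f}")
print(f"IV (Wald) estimate:        {iv:.2f}")
```

Because the simulated miscarriage is unrelated to the unobserved factor and affects the outcome only through the birth, the IV estimate is close to the assumed true effect, while the naive comparison is not. If a miscarriage had an independent effect on outcomes, this would no longer hold.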
In KHK, HMS examine a wide set of outcomes, including educational attainment, labor
supply and earnings, marriage and fertility, spouse earnings, and welfare receipt. They examine
outcomes from ages 18 to 35, which allows them to distinguish between short run and longer
run impacts, and which is an important contribution of their work. Their findings are summarized
in column (1) of Table 1. The results shown are taken from the text and figures, and are based
on a model with a quadratic or cubic function in age and interactions between the age terms
and a dummy variable for a teen birth. Control variables include race and ethnicity,
parents’ education, measures of family structure and family income at age 14, and an ability
measure (AFQT score).10 The key finding of the study is that by their late-20s, teen mothers do
better over a quite wide range of outcomes than their counterparts who had a miscarriage. The
teen mothers were less likely to have graduated from high school, but they were more likely to
have received a GED by an essentially offsetting amount. HMS find that the teen mothers
worked more and earned more than their counterparts, and their spouses had higher incomes.
The two earnings effects are quite large, and amount to approximately a 50% increment for
spouse earnings and 35% for own earnings. Differences in income from welfare between the
two groups were very small. The teen mothers were worse off in only two categories – they
had more births by age 30 and they spent more time as a single mother than did the teens with
miscarriages.
The corresponding estimates from HMS 2002 are shown in columns (2) - (4). Column
(4) comes from a model quite similar to KHK, although the sample differs as discussed in the
introduction. In columns (2) and (3), the estimates are for a non-parametric specification of age
plus age x teen birth interactions that provides separate estimates of impacts at each age. The
impacts are reasonably similar to those in KHK, although there are some differences between
the parametric and non-parametric specifications, especially for income and earnings variables.
New estimates of the impact of a teen birth on poverty and on receipt of Food Stamps and AFDC
show that a teen birth has a negative, though not statistically significant, impact. This holds for
many of the estimates: t-statistics, especially for the non-parametric estimates, are often quite
small. Certainly there is little evidence here that a teen birth has negative consequences for
socio-economic outcomes, as long as a GED is equivalent to a high school diploma.
Although not shown in the table, there is considerable year-to-year variation in the non-
parametric estimates for the income variables. For example, at age 28, it is estimated that the
teen mothers earned more than $9,000 more than the women who had a miscarriage, but at
age 31, they earn just $800 more with a standard error five times as large. At age 28, their
husbands earned more than $1,250 more, but with a standard error of $5,000. One year later,
the husbands are earning more than $14,000 more. At age 28, the women themselves earn
nearly $4.50 more per hour, but three years later, at age 31, they earn $8.00 less per hour (t-
stat=1.2). This may reflect the relatively small samples of teens with miscarriages at specific
ages and the changing sample composition, especially at older ages.
III. Data and Data-Processing Issues
In this section I consider and discuss a set of data and data-processing issues in HMS
2002 (and by extension in KHK). I do this in substantial detail, since, as is often the case in
empirical work and is certainly the case here, “the devil is in the details.”
The data-processing task of adapting the NLSY data into the form required for analysis
of the medium-term impacts of teen childbearing is formidable. There are multiple margins for
error. Data coding, especially of fertility events, in the NLSY is highly complex; even under the
very best of circumstances, errors or subjective and largely undocumented coding decisions
creep in. In this case, the task is further complicated because a long time series of data is
used, covering the years 1978-1992, which is, in turn, transformed into individual life-cycle ages
from 18 to 35. Of course, since only 15 years of data are observed, no single individual is
observed at all ages from 18-35.
A. Scaling Problems
HMS 2002 analyzes the impact of teen childbearing on incomes (own, spouse, and
transfer) that are observed between 1978 and 1992. Accordingly, HMS rescale the observed
nominal incomes into real incomes, using 1994 for this purpose. Unfortunately, they do this re-
scaling incorrectly. Rather than multiplying each year’s income by P94/Pt (the ratio of the price
level in 1994 to the price level in the observation year), they multiply by P94/P78, irrespective of
the year in which the income is observed. Thus, all incomes are multiplied by 2.27, which is the
ratio of the CPI in 1994 to the CPI in 1978 (148.2/65.2). There is one exception to this –1992
incomes are scaled properly by multiplying by P94/P92. As a result, all income variables except
those in 1978 and 1992 are too large.
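
In symbols, with $Y_t$ the nominal income observed in year t and $P_t$ the price level in that year, the two procedures differ as follows for the years 1978 through 1991. This is only a restatement of the paragraph above; the superscript labels are introduced here for exposition.

```latex
\begin{align*}
\text{Correct:}\quad & Y_t^{\text{real}} = Y_t \times \frac{P_{94}}{P_t} \\
\text{As applied:}\quad & Y_t^{\text{applied}} = Y_t \times \frac{P_{94}}{P_{78}} \approx 2.27\,Y_t \\
\text{Ratio:}\quad & \frac{Y_t^{\text{applied}}}{Y_t^{\text{real}}} = \frac{P_t}{P_{78}} \ge 1
\end{align*}
```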
This problem affects all income variables –own earnings and spouse earnings, plus a
woman’s wage rate and her family income, both of which are derived directly from these two
variables. It also probably affects welfare benefits received. I say “probably,” because as I
show below, there are other problems with that variable that make it difficult to assess exactly
how it was constructed and whether this particular error applies.
Table 2 documents the scaling problem for a few selected, but representative cases. In
order to be precise, I identify and list the underlying variables in the NLSY so that there will be
no misunderstanding. Information is shown for own earnings and spouse earnings from the
HMS data file and for the corresponding data drawn directly from the underlying NLSY variable.
The notation “P-C YR” in the NLSY variable names means that the variable is for the “Previous
Calendar Year,” so “P-C YR 79” is a question asked in the 1979 interview year that applies to
1978 income. The problem is evident: the values in the HMS datafile are always exactly 2.27
times the corresponding value from the NLSY file, rather than varying with the year of
observation, as they should. Note also that the 1992 values are scaled correctly, multiplied by
1.05. The scaling error is not limited to these cases, but is consistently, if incorrectly, applied to
all cases that I examined.
Because of this error, real incomes in all years except 1978 and 1992 are
incorrect. The scaling error increases annually: the error in each successive year is simply the
cumulative increase in the CPI from 1978 to the year in question. For example, 1980 incomes are
26% too high, 1985 incomes are 65% too high, and 1990 incomes are twice as high as the
correct value. 1992 incomes are correct.
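
The year-by-year error can be reproduced with a few lines of code. This is a sketch under stated assumptions: only the 1978 and 1994 CPI values are given in the text; the values for the other years are CPI-U annual averages filled in from published tables and should be verified against the official series. The function names are hypothetical.

```python
# CPI-U annual averages (1982-84 = 100). Only 1978 and 1994 come from the text;
# the remaining values are assumptions to be checked against official tables.
CPI = {
    1978: 65.2, 1979: 72.6, 1980: 82.4, 1981: 90.9, 1982: 96.5,
    1983: 99.6, 1984: 103.9, 1985: 107.6, 1986: 109.6, 1987: 113.6,
    1988: 118.3, 1989: 124.0, 1990: 130.7, 1991: 136.2, 1992: 140.3,
    1994: 148.2,
}
BASE_YEAR = 1994

def correct_factor(year):
    """Factor that converts nominal income in `year` to 1994 dollars."""
    return CPI[BASE_YEAR] / CPI[year]

def applied_factor(year):
    """Factor as described in the text: P94/P78 for every year except 1992."""
    if year == 1992:
        return correct_factor(1992)
    return CPI[BASE_YEAR] / CPI[1978]

def scaling_error(year):
    """Proportional overstatement of real income in a given year."""
    return applied_factor(year) / correct_factor(year) - 1.0

for year in range(1978, 1993):
    print(year, f"{scaling_error(year):6.1%}")
```

With these CPI values, the computed errors for 1980, 1985, and 1990 round to the 26%, 65%, and roughly 100% figures cited above.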
Taken at face value, the HMS data suggest that the teen mothers are faring quite well in
both the labor market and especially the marriage market. Their own mean earnings at age 30,
computed among those with positive earnings, equals a very respectable $26,485, a figure
roughly equal to the median earnings of all year-round full-time working women in 1992
($26,235). This figure is all the more impressive since only half of the women are high school
graduates, with another 25% earning a GED, and they are all relatively early in their careers.
Their working husbands are apparently doing remarkably well. When the women were age 30,
their husbands had mean earnings of $52,815, a whopping 60 percent more than the roughly
$32,000 (in 1994 dollars) earned by all year-round full-time working men in the mid-1980s and early
1990s. Actual earnings, corrected for the scaling problem, are a bit more reasonable and
realistic. Average own earnings and average spouse earnings at age 30, again computed
among those with positive earnings, are $13,813 and $27,489, respectively, in 1994 dollars.
Because the scaling error rises over time, it distorts age-earnings profiles. This is
particularly important, because one of the key contributions of the HMS analysis is to extend the
time period and thus allow for rebound or recovery effects. The distortion of age-earnings
profiles is shown in Figure 1 for own earnings and Figure 2 for spouse earnings; in both figures,
the sample includes only persons with positive earnings. The reported profiles are unusually
steep –own real earnings almost triple from age 20 to age 30 and spouse real earnings more
than double. The precipitous fall in reported real earnings at age 35 reflects the correct scaling
of 1992 income. The corrected profiles, also shown in the figures, are considerably less steep,
that is, show far less growth in earnings with age, and are far more consistent with labor market
data from other sources.
What is the likely impact of this scaling error on the regression estimates HMS report?
Standard analysis of units of measurement shows that if the dependent variable is re-
scaled by some constant $\lambda$ and the independent variable is not re-scaled, then the
corresponding estimated coefficient is $\lambda$ times the original estimate. If the analyses were
organized by calendar year and estimated separately, this would be the expected result: the
bias would increase with calendar year.
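
The units-of-measurement result is easy to confirm on artificial data. The sketch below uses made-up numbers and a hypothetical helper, `ols_slope`; it only illustrates that multiplying the dependent variable by a constant multiplies the OLS slope by the same constant.

```python
import numpy as np

def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

rng = np.random.default_rng(2)
x = rng.integers(0, 2, size=1_000).astype(float)      # hypothetical 0/1 regressor
y = 10_000 + 3_000 * x + rng.normal(0, 2_000, size=1_000)

scale = 2.27                                           # constant re-scaling factor
b_original = ols_slope(x, y)
b_rescaled = ols_slope(x, scale * y)
print(b_rescaled / b_original)                         # the scale factor, up to rounding
```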
However, the problem here is different, because the analyses are organized by age, not
year. Since the same age occurs in different years for different cohorts, different observations
at the same age are mis-scaled differently; the incomes at a given age are multiplied by $\lambda_c$,
where the subscript denotes the birth cohort. Figure 3 shows the average scaling error by age;
the numbers shown are simple averages, without adjustment for the small difference in sample
size across the birth cohorts. As shown in the figure, the scaling error increases with age
through age 27, since older ages are, on average, observed in later years where the scaling
error is larger (except for 1992). The average scaling error is 34% at age 20, 67% at age 25,
and 76% at age 30. At age 28, the age at which many results are emphasized in HMS, the
average scaling error is 74%, with the range of scaling error running from 65% for the 1957 birth
cohort to 109% for the 1963 birth cohort and 0% for the 1964 birth cohort. After age 27, the
error is relatively constant at 75% and then dips a bit as the correct 1992 observations become
11 Variables such as marital status are reported contemporaneously, and thus run from 1979 to 1993.
a larger portion of the observed sample. At age 35, there is no scaling error, because all
observations are from 1992. The scaling error also increases with birth year, since the later
birth cohorts experience any given age in a later year.
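
The averaging across birth cohorts described above can be sketched as follows. As before, the CPI values other than 1978 are assumptions taken from published CPI-U annual averages, age is taken to be calendar year minus birth year, and the function names are hypothetical. The average is unweighted, matching the simple averages described in the text.

```python
# CPI-U annual averages; only the 1978 value appears in the text, the rest are
# assumptions to be verified against official tables.
CPI = {
    1978: 65.2, 1979: 72.6, 1980: 82.4, 1981: 90.9, 1982: 96.5,
    1983: 99.6, 1984: 103.9, 1985: 107.6, 1986: 109.6, 1987: 113.6,
    1988: 118.3, 1989: 124.0, 1990: 130.7, 1991: 136.2, 1992: 140.3,
}
FIRST_YEAR, LAST_YEAR = 1978, 1992
COHORTS = range(1957, 1965)          # birth years 1957-1964

def scaling_error(year):
    """Overstatement of real income; 1992 incomes are scaled correctly."""
    return 0.0 if year == 1992 else CPI[year] / CPI[1978] - 1.0

def average_error_by_age(age):
    """Unweighted average error across the cohorts observed at this age."""
    years = [b + age for b in COHORTS if FIRST_YEAR <= b + age <= LAST_YEAR]
    return sum(scaling_error(y) for y in years) / len(years)

for age in range(18, 36):
    print(age, f"{average_error_by_age(age):.0%}")
```

Under these assumptions the averages at ages 20, 25, 28, and 30 round to the 34%, 67%, 74%, and 76% figures reported above, and the error at age 35 is zero.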
In general, since average incomes by age are consistently too large and the error
increases with age, it is likely that the resulting life-cycle pattern of teen birth effects will be
biased, with spuriously larger effects observed at older ages. I present evidence on this in the
results section.
B. Sample Construction
For their analyses, HMS 2002 uses data from the 1979 through 1993 NLSY interviews,
typically covering incomes for the preceding calendar year (1978 to 1992) to construct outcome
measures from ages 18 to 35 (see footnote 11). KHK apparently uses one fewer year of data, only through the
1992 interview. In 1979, the NLSY sample was between ages 14 and 21 (i.e., the women were
born between 1957 and 1964). Table 3 below shows the sample size per birth cohort and the
range of ages (from 18 to 35) observed for each one. Younger ages are unobserved for the
earlier birth cohorts, older ages for the later cohorts. For example, the 1957 birth cohort is
never observed at an age younger than 21, which would have occurred in 1978 and been
reported on in the 1979 interview. The 1964 birth cohort is never observed at any age older
than 28, which would have occurred in 1992 and been reported on in 1993. The number of
years observed ranges from 11 (1964 cohort) to 15 (1957-1960 cohorts). The last column
shows the maximum number of data points each birth cohort could contribute if there were no
missing data, given the cohort sample size and the number of years of observations. The grand
total with no missing data is 13,468. I return to that figure below.
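The observation windows described above follow from simple arithmetic on birth year and calendar year. The sketch below is a minimal illustration, assuming (as the examples in the text imply) that age is computed as calendar year minus birth year. The function and constant names are illustrative and do not come from the HMS or NLSY files.

```python
# Income years covered by the 1979-1993 interviews, and the age range analyzed.
FIRST_YEAR, LAST_YEAR = 1978, 1992
MIN_AGE, MAX_AGE = 18, 35


def observed_ages(birth_year):
    """Ages in [MIN_AGE, MAX_AGE] that fall inside the income-year window.

    Assumes age = calendar year - birth year (an illustrative convention).
    """
    return [
        age
        for age in range(MIN_AGE, MAX_AGE + 1)
        if FIRST_YEAR <= birth_year + age <= LAST_YEAR
    ]


for cohort in range(1957, 1965):
    ages = observed_ages(cohort)
    print(cohort, f"ages {ages[0]}-{ages[-1]}", f"{len(ages)} years")
```

Under this convention, the 1957 cohort is first observed at age 21 and the 1964 cohort is last observed at age 28, matching the examples in the text.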
HMS properly exclude in their analyses those observations of the appropriate age that
fall in calendar years subsequent to the end of the data observation window, i.e., ages through
35 occurring after 1992 for which there was no data at the time of their analyses. But for many
variables, observations for ages falling in calendar years prior to the beginning of the
observation window (i.e., prior to 1978) are included, even though there is no valid data to
12 To make matters worse, about one-third of the 1977 cases coded as having received either AFDC or Food Stamps have missing data for the underlying dollar value of benefits received variables.
assign to these observations either.
Table 4 shows the extent of this problem for the major dependent variables. The
inappropriate data observations correspond to ages 18-20 for the 1957 birth cohort (“observed”
in 1975-77), ages 18-19 for the 1958 birth cohort (“observed” in 1976-77), and age 18 for the
1959 birth cohort (“observed” in 1977). In the table, the sample Ns for poverty, AFDC use, and
Food Stamp use (121, 222, and 352 in 1975-1977, respectively) are precisely the cumulative
sample sizes for the teen pregnancy sample from the 1957 birth cohort, the 1958 birth cohort, and the
1959 birth cohort; see Table 3 for the cohort sample sizes. The 1975 observations are solely
from the 1957 birth cohort, the 1976 observations are from the 1957 and 1958 birth cohorts, and
the 1977 observations are from the 1957-59 birth cohorts.
As seen in Table 4, out-of-sample observations are included in the analyses for seven of
the dependent variables used in the analysis. No out-of-sample observations are included for
own earnings, own wage rate, and hours worked. For poverty status and receipt of AFDC and
Food Stamps, out-of-sample observations are included for all persons in the 1957-59 birth
cohorts. In all three years, none of the observations are poor, which is very unlikely to be
anything other than some default assignment. Similarly, in 1975 and 1976, all of the
observations are coded as receiving AFDC and Food Stamps. In 1977, about one-third are
apparently receiving AFDC and Food Stamps.12 Spouse earnings and total welfare benefits for
1977 are included, but not for the two earlier years. Curiously, all spouses had earnings equal
to $0; this is another sure indication that something is wrong. It turns out that $0 is the default
spouse earnings assigned to unmarried women: the 178 women with included spouse earnings
in 1977 are all of the women coded as unmarried at those ages. Total welfare benefits in 1977
(in 1994 dollars) equal $3,137. For the variables with actual values rather than apparent
assignments (Food Stamps, AFDC receipt, and total welfare benefits, all in 1977), it is unclear
what the basis for the assignment of values is or could be. There is absolutely no data
whatsoever available for ages that fall in these years.
13 There is a further problem with the missing data. As I explain below, HMS do not correctly identify missing data cases and inadvertently include some of them in their analyses. The figures discussed in the text refer to their own identification of missing data.
Receipt of a HS diploma is reported for all persons in the out-of-sample years (1975-77).
This could be a reasonable assignment from knowledge that an individual is a high school
graduate and presumably graduated at age 18 or thereabouts. GED receipt status is also
reported for all cases, but this is far more problematic. There is no data available in the NLSY
to assign data values for the age at which a GED is earned and, unlike a high school diploma,
there is no obvious age at which a GED is earned. The .8% value corresponds to one
observation coded as having received a GED at this age.
I have confirmed that HMS actually used all of these out-of-sample observations in their
analyses. Using all observations, including those from these years, I replicate their results for
the affected variables –spouse earnings, welfare benefits received, AFDC receipt, whether poor,
and whether an individual has a high school degree or GED degree only.
There is other evidence that these pre-sample years are used. As already noted (see
the last column of Table 3), the maximum available sample size for any analysis is 13,468,
which is the total N observed between ages 18 and 35 in calendar years 1978-1992. Adjusting
further for observations in their data set that are entirely missing in a sample year brings the
maximum sample size down to 13,229; variable-specific missing data on the dependent
variables would undoubtedly reduce it further.13 And yet, reported sample sizes in HMS 2002
are consistently larger than this – 13,924 for analyses of educational attainment, fertility
measures, poverty status, AFDC receipt, and Food Stamp receipt. The 695 sample size
difference between the HMS 2002 sample size and the actual available sample size is exactly
equal to the out-of-sample cases contributed by the 1957-59 birth cohorts, as shown in Table 5.
There seems little doubt that HMS have used observations for which there is no data in the
NLSY.
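The 695-case discrepancy can be reproduced from figures quoted in the text. The sketch below is a minimal check, assuming the cumulative sample sizes of 121, 222, and 352 reported for the 1975-1977 pseudo-observations and the convention that age equals calendar year minus birth year. All variable names are illustrative.

```python
# Cumulative teen-pregnancy sample sizes quoted in the text for the
# 1957, 1957-58, and 1957-59 birth cohorts.
cumulative_n = {1957: 121, 1958: 222, 1959: 352}

# Recover each cohort's own sample size from the cumulative totals.
cohort_n = {}
previous = 0
for cohort in sorted(cumulative_n):
    cohort_n[cohort] = cumulative_n[cohort] - previous
    previous = cumulative_n[cohort]

FIRST_YEAR = 1978  # earliest income year covered by the NLSY
MIN_AGE = 18

# Person-years at ages 18 and older that fall before the observation window,
# assuming age = calendar year - birth year.
out_of_sample = sum(
    n * max(0, FIRST_YEAR - (cohort + MIN_AGE))
    for cohort, n in cohort_n.items()
)

reported_n = 13_924   # sample size reported in HMS 2002
available_n = 13_229  # maximum available after fully missing person-years

print(out_of_sample, reported_n - available_n)
```

The two printed quantities coincide, which is the basis for the inference that the pre-1978 pseudo-observations account for the excess.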
The same problem appears in KHK. Those analyses are based on a larger sample
(N=1744) that includes women ever pregnant by age 19, but that runs only through the 1992
NLSY interview. Thus, here the absolute maximum number of observations is 1744 x 14 years
14 I have spent considerable effort trying to identify where the out-of-sample data observations do, in fact, come from since I assume that they are not literally invented. I have found no simple explanation. There is no evidence, for example, that the out-of-year information is misplaced 1978 data, i.e., data transposed by several years. I suspect that the data may be the result of an incorrect assignment algorithm, and could even come from another sample member.
15 Actually, the value on the data file is $256.062286376953, which suggests that it is the result of some kind of numerical transformation.
= 24,416. Adjusting for the number of years each birth cohort is actually observed between
ages 18 and 35 between 1978 and 1991, just as in Table 3, decreases the maximum sample
size to 22,236. Adjusting further for cases lost during a particular interview year decreases the
available sample size to 21,670. But again, reported sample sizes consistently exceed that
number. For high school graduation and GED attainment, the reported sample sizes are
25,432. Reported sample sizes also exceed 21,670 for annual work hours, AFDC/Food Stamp
benefits, and a woman’s annual labor market earnings. Thus, it is likely that the same
inappropriate use of out-of-sample observations exists in the KHK analyses, too.
The obvious question for all these variables is: where do the sample values come from,
since they are not contained in the NLSY data?14 While it is clear that these pre-sample years
do not belong in the analysis, it is not clear a priori what the impact on estimated coefficients will
be. I provide evidence on this in the next section.
C. Welfare Benefits
In KHK, two welfare benefit variables are analyzed: the annual monetary value of AFDC
and Food Stamps, and a second value that includes Medicaid benefits. In HMS 2002, only the
first of these is analyzed, but there are also analyses of receipt of Food Stamps and receipt of
AFDC.
I have identified several problems with the welfare and food stamp variables. First,
nearly 60% of the observations in the teen pregnancy sample have welfare benefits equal to
exactly $256.06 and no cases have a value of $0 (see footnote 15). This is highly suspicious, to say the least.
On closer inspection, it turns out that $256.06 is the value assigned to all cases that have
AFDCPAY and FOODPAY (actual AFDC and Food Stamp dollars, respectively, from which total
welfare benefits appear to be constructed) equal to $0. This strongly suggests that the $256.06
value is, simply, a coding error. The correct value is $0. Note that this does not affect the
16 For two of the cases shown, the number of months of AFDC receipt is computed as either fractional or greater than 12. I assume that in these cases, there is some additional source of welfare benefits.
underlying AFDC and Food Stamp receipt variables.
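The correction implied here is a simple recode. The sketch below assumes that the stored value matches the figure given in footnote 15 up to floating-point noise. The function name, the tolerance, and the two sample records are illustrative assumptions, not part of the HMS file.

```python
import math

# Value found on the data file for cases with no AFDC or Food Stamp dollars
# (reported in the text as $256.06; footnote 15 gives the stored value).
DEFAULT_CODE = 256.062286376953


def recode_welfare(welfben, afdcpay, foodpay, tol=1e-4):
    """Return 0.0 when welfben is the apparent default code and both
    underlying dollar amounts are zero; otherwise return welfben unchanged."""
    is_default = math.isclose(welfben, DEFAULT_CODE, abs_tol=tol)
    if is_default and afdcpay == 0 and foodpay == 0:
        return 0.0
    return welfben


# Hypothetical records: (welfben, afdcpay, foodpay)
records = [(256.062286376953, 0, 0), (5400.0, 1570, 416)]
print([recode_welfare(*r) for r in records])
```

Conditioning on both underlying amounts being zero keeps the recode from touching any case that merely happens to have benefits near $256.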
Second, I can find no correspondence whatsoever between the Food Stamp and AFDC
variables on the NLSY file and those on the HMS file. Table 5 shows this for a few
representative cases. In columns 3-5, I show the HMS values for Food Stamps and AFDC
benefits, and also for Total Welfare Benefits, which are described as the sum of these two
variables. According to the documentation, Food Stamps and AFDC benefits are in current
dollars, that is, they are not scaled to 1994 dollars, as Total Welfare Benefits are. In columns 6-
10, I show corresponding information from the NLSY. Columns 6-8 are directly from the data;
column 9 is the usually obvious number of months of welfare receipt, computed from the three
preceding columns,16 and column 10 is the resulting likely total value of AFDC benefits.
Comparing values shows that the HMS figures bear no clear or consistent relationship to the
NLSY figures. The two food stamp values (Columns 3 and 6) and the two AFDC Benefit
measures (Columns 4 and 10) ought to be identical or at least comparable. They are not.
Consider case 1 for ID=4. The NLSY figures show $416 in Food Stamp benefits, an average
monthly AFDC benefit of $314, and total welfare benefits of $1986, which almost certainly
implies five months of AFDC receipt (5 x $314 +$416 = $1986), and thus total AFDC benefits of
$1570. The corresponding HMS 2002 values are 87% higher than this for food stamp benefits,
59% higher than this for AFDC benefits, and more than four and a half times greater for total
welfare benefits. In case 2 for ID=4, the HMS 2002 values for Food Stamps and AFDC benefits
are all lower than the NLSY values. The string of cases for ID=86 with NLSY Food Stamp
benefits equal to $648 (interrupted by one at $828) show the problem in another form. Some of
the HMS values drop in a pattern that looks a bit like a CPI adjustment, but the figures do not
correspond to any CPI series, and the sharp fall in 1982 and the rise in 1987 show that this is
not, in fact, a consistent application of any CPI-based transformation.
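The inference of five months of AFDC receipt for case 1 of ID=4 is a one-line calculation. The sketch below assumes, as the text does, that total welfare benefits equal months of receipt times the average monthly AFDC benefit plus Food Stamp benefits; the function name is illustrative.

```python
def implied_afdc_months(total_welfare, food_stamps, monthly_afdc):
    """Months of AFDC receipt implied by total = months * monthly + food."""
    return (total_welfare - food_stamps) / monthly_afdc


# NLSY figures quoted in the text for case 1, ID=4.
months = implied_afdc_months(total_welfare=1986, food_stamps=416, monthly_afdc=314)
total_afdc = months * 314
print(months, total_afdc)
```

A fractional result, or one greater than 12, would signal an additional source of welfare income, as footnote 16 notes for two of the cases shown.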
Third, there is no consistent relationship whatsoever between the value of welfare
benefits and the sum of AFDCPAY and FOODPAY, the two underlying variables. I computed
17 AFDCPAY and FOODPAY are not adjusted to 1994 dollars, but the average adjustment factor to 1994 dollars is only 1.46 and thus does not account for the discrepancy.
the ratio of total welfare benefits to the sum of AFDCPAY and FOODPAY. If total welfare
benefits were the 1994 value of the sum of these variables, the ratio would range from 1.06
(benefits received in 1992) to 2.27 (1978 welfare benefits) or, in light of the error in inflating the
other income variables, perhaps 2.27 for all cases except 1992. Instead the minimum value of
this ratio is 2.7. For half of the cases, the ratio lies in a narrow margin, between 2.7 and 2.89
and two-thirds have a ratio between 2.7 and 3.29. (For the cases shown in Table 5, the ratio
ranges from 2.79 to 4.83). 10% have a ratio greater than 5.6 and 5% have a ratio greater than
8.5. Four cases have a ratio greater than 100!
Fourth, maximum and average welfare benefits are implausibly high. The maximum
value of welfare benefits on the file is $150,604.91. 151 cases in the HMS teen pregnancy
sample have benefits greater than $20,000 and 64 have benefits greater than $30,000.
Average welfare benefits (for all cases with benefits greater than $256.06) are $7,236.85. The
average sum of AFDCPAY + FOODPAY (the underlying variables used to compute welfare
benefits) for the same sample is $2271.66, just over 30% of the average value of WELFBEN.17
By comparison, the maximum values of “Welfare Income Received” on the NLSY file for the
teen pregnancy sample range from $7,800 in 1978 to $28,332 in 1989. In only two years is
there a case with welfare benefits greater than $20,000.
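The gap between the averages can be checked against the inflation factors quoted in the text. The sketch below is a rough check only: it compares the ratio of the two averages, which is not the same as the case-level ratios discussed above, and the variable names are illustrative.

```python
avg_welfben = 7236.85      # average WELFBEN (cases above $256.06), 1994 dollars
avg_components = 2271.66   # average AFDCPAY + FOODPAY, current dollars

# Inflation factors to 1994 dollars quoted in the text.
MIN_FACTOR, MAX_FACTOR = 1.06, 2.27   # 1992 and 1978 benefits
AVG_FACTOR = 1.46                     # average adjustment (footnote 17)

ratio = avg_welfben / avg_components
share = avg_components / avg_welfben

print(f"ratio of averages: {ratio:.2f}")
print(f"components as share of WELFBEN: {share:.1%}")
print("within plausible inflation range:", MIN_FACTOR <= ratio <= MAX_FACTOR)
```

The ratio of averages exceeds even the largest inflation factor, so price adjustment alone cannot reconcile the two series.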
There are also several smaller issues:
• 359 cases have welfare income > $256.06 (i.e., a value which I interpret as positive
AFDC and/or Food Stamp benefits) even though AFDCPAY and FOODPAY both equal
0. If these cases were coded as other cases appear to be, they ought to have a value of
$256.06. Instead, average welfare benefits for this group are actually $3582.71. It is not
clear which values are incorrect.
• As previously noted, there are values of welfare benefits for ages that would have been
observed in 1977, a year for which no observations are available in the NLSY. Mean
1977 benefits are greater than $7,000 in 1994 dollars (after eliminating the cases with
benefits equal to $256.06).
• The food stamp benefit variable appears to have been coded inconsistently and possibly
incorrectly. I cross-checked the food stamp dollar measures on the HMS file against the
NLSY, looking for a pattern. For some cases, it appears that the NLSY food stamp
measure is not in current dollars as indicated but has been re-scaled to 1978 dollars.
But this is not the case generally. Appendix Table 1 provides some evidence on this.
For the first case shown (ID=10259), the NLSY values, rescaled to 1978 dollars, are so
similar to the values on the HMS file that it is highly unlikely to be a coincidence. But as
the other two cases illustrate, this correspondence does not hold. There is no
discernible pattern at all in those two cases. I believe that the cases shown are
representative, although I have not done an exhaustive search for cases that fit either
the “rescaled to 1978 pattern” or the “no pattern” profile.
• In light of all of the inconsistencies noted, I cannot determine whether welfare benefits
suffer from the scaling problem that affects all the other variables. It seems likely that
the scaling error was consistently applied to all income variables, but because I cannot
link the HMS 2002 values to the underlying NLSY values, I cannot establish that this is
the case.
I have no a priori hypothesis of how these multiple data errors are likely to affect the
estimates of the effect of teen childbearing on AFDC and Food Stamp receipt. It is an empirical
matter.
D. Teen Fertility Coding
Information on the age at the beginning of a first pregnancy and the outcome of that
pregnancy is directly available in the 1984-86, 1988, 1990, and 1992 interviews. In each of
these years, there is a pair of variables, indicating the age at the beginning of a first pregnancy
and the outcome of that pregnancy. This corresponds precisely to the description that HMS
employ throughout, namely, the outcome of a teen pregnancy that began at age 17 or earlier.
There is no other distinct variable on age at first pregnancy, as there is for age at first birth.
These variables are designated as “created” variables on the NLSY file, meaning they are
18 For example, relying only on 1992 information on pregnancy outcomes omits 44 cases who report valid information in previous years but are missing in 1992.
created by the NLSY staff from information elsewhere in the survey, and thus reflect some effort
at establishing consistency. From 1984 to 1990, the outcome variable is coded into four
categories: birth, abortion, miscarriage, or stillbirth. In 1992, the last two categories are
combined. In 1982 and 1983, there is information on the outcome of a first pregnancy (for
pregnancies that did not end in a live birth) and the year and month in which that pregnancy
ended. This information can be combined with information on own birth date to construct an
age at first pregnancy.
HMS 2002 reports that their sample includes 727 births, 185 abortions, and 68
miscarriages, all resulting from first pregnancies that began at age 17 or earlier. (In KHK, only
the total sample size and the weighted proportions in each teen fertility category are shown in
Table 3.2. Because the weighted proportions in that table are exactly the proportions that I
compute from their data, I assume that the same coding applies in KHK). I have no direct
information on how HMS coded teen fertility. Instead, I have compared the coding on their file
with a coding based on the 1984-1992 variables and, where necessary, the 1982-83 variables.
I use all years of data in this coding, since some cases appear in the file only in some years or
provide information about, for example, age at first pregnancy only in a single year.18
Because the fertility coding is so obviously central to this analysis, I examined the HMS
coding very carefully, cross-checking it against the classification that comes from the
straightforward use of the 1984-1992 information, augmented by the 1982-83 information in
cases where the 1984-1992 information is missing or otherwise not conclusive. It is clear to me
that the coding of teen fertility from the NLSY is inherently and inevitably subjective, especially,
for example, when the information is conflicting, as it is in some cases. Thus, there is no single
definitive classification. Reasonable people could probably disagree about the details.
My findings are summarized in Table 6. The details of specific case assignments are
included in the appendix. The first row shows the HMS 2002 teen fertility classification. I find
three distinct problems. First, a substantial number of cases included in the HMS samples
19 NCHS, Series 20, No. 31, “Medical and Life-Style Risk Factors Affecting Fetal Mortality, 1989-90” has data on fetal deaths for 29 states (61% of all fetal deaths). Fetal deaths are defined as stillbirths or miscarriages after 20 weeks of duration. Induced abortions are not included. Table B shows a rate of 7.3 deaths per 1000 fetal deaths plus births for women < age 30. I assume that this rate holds for younger teens and for the other 21 states. During the 1980s, there were about 285,000 births annually to women who were pregnant at age 17 or earlier. This figure includes all births to adolescents age 17 and younger and 3/4 of those at age 18. Applying the 7.3/1000 fetal death rate to the 285,000 young teen births figure yields an estimate of about 2,100 fetal deaths in pregnancies at age 17 or earlier. About half of teen pregnancies end in birth and about 13% in a miscarriage according to the Alan Guttmacher Institute estimates, which means there were approximately 74,000 miscarriages annually (285,000 x 2 x .13 = 74,100). This means that there were about 35 times as many miscarriages as fetal deaths. In my calculations, I am assuming that no miscarriages are included as fetal deaths.
appear to be misclassified, i.e., there is no consistent supporting evidence in the 1982-1992
information. For the teen birth sub-sample, there are 19 such cases –one true miscarriage, one
abortion, three whose first pregnancy began at age 18 or older, and 14 for whom there is no
information whatsoever about how the pregnancy ended or the age at which it began. For the
abortion sample, there are 15 dubious cases – three are teen miscarriages, three are teen births,
and nine are either not teen pregnancies or have insufficient information about pregnancy age or
outcome. Four cases are questionable for the miscarriage sample – one abortion and three
whose first pregnancy began at age 18 or older. The proportion of miscoded included cases
ranges from 2.6% for teen births to 8% for teen abortions.
Second, the HMS and KHK sample classification excludes a large number of cases that
are readily classified by the 1984-92 information. 65 births, 17 abortions, and nine miscarriages,
all from first pregnancies at age 17 or earlier and with complete information about the age at first
pregnancy and the outcome of the pregnancy, are omitted. The omitted proportions range from
9% to 13% of the HMS samples. The appendix table provides further details on these cases.
Third, the miscarriage category is actually miscarriage/stillbirth. It includes a strikingly
high proportion of stillbirths: eight out of 64 valid miscarriages in the HMS sample (12.5%) and
11 out of 73 in the augmented sample (15.1%). This raises two distinct concerns. First, the
impact of a stillbirth might not be innocuous, or at a minimum might be substantially greater than
a miscarriage that terminates a very short duration pregnancy. Second, and probably more
importantly, the high proportion of stillbirths suggests that miscarriages are substantially under-
reported in the NLSY and in the HMS sample. Using reported national data on fetal deaths and
miscarriages, I estimate that miscarriages ought to outnumber stillbirths by a ratio of nearly
35:1 (see footnote 19). In the NLSY sample, the ratio is closer to 7:1. Put differently, if my calculations are even
20 Attrition in the NLSY has been remarkably low. Unlike some surveys, non-response cases in one year can and do return to the NLSY sample in subsequent years.
close and if there are 11 stillbirths, there ought to be more than 350 miscarriages, rather than 62.
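The benchmark ratio comes from the back-of-the-envelope calculation in footnote 19, which can be written out step by step. The sketch below uses only the figures quoted in that footnote and in the text; variable names are illustrative, and the unrounded ratio comes out slightly above 35 because the footnote rounds fetal deaths up to 2,100.

```python
births = 285_000            # annual births from pregnancies at age 17 or earlier
fetal_death_rate = 7.3 / 1000
birth_share = 0.5           # about half of teen pregnancies end in birth
miscarriage_share = 0.13    # about 13% end in miscarriage

fetal_deaths = births * fetal_death_rate
pregnancies = births / birth_share
miscarriages = pregnancies * miscarriage_share
expected_ratio = miscarriages / fetal_deaths

# Observed counts quoted in the text (miscarriages excluding stillbirths).
hms_ratio = (64 - 8) / 8
augmented_ratio = (73 - 11) / 11

print(round(fetal_deaths), round(miscarriages), round(expected_ratio, 1))
print(hms_ratio, round(augmented_ratio, 1))
print("expected miscarriages for 11 stillbirths:", round(11 * expected_ratio))
```

The observed ratios are a small fraction of the benchmark, which is what motivates the concern about under-reported miscarriages.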
There is, of course, no way to determine ex ante the likely impact of a re-classification of
a substantial number of fertility cases. It is an empirical matter.
E. Improper Use of Sample Weights
All of the regression analyses in KHK and HMS 2002 are weighted, using the sample
weights provided in the NLSY data. The NLSY consists of three separate sub-samples: a cross-
sectional sample; a supplementary sample of Hispanic, black, and economically disadvantaged
white youth; and a military sample. Most members of the military sample were dropped after the
1984 survey and all the economically disadvantaged white respondents were dropped after the
1991 survey. Sample weights are constructed for each survey year to adjust for differences in
initial selection probabilities and subsequent attrition, including the two sample drops. With
weighting, the NLSY is nationally representative.
The use of sample weights is obviously essential to estimate population means. There is
a lively debate about whether regression analyses ought to be weighted (see Deaton, 1997 for a
useful discussion) even with data that are unrepresentative, but I do not focus on that issue here.
The problem is that HMS have used the wrong weight and used it improperly.
In their analyses, HMS use a single weight taken from the 1979 interview. This is the
wrong weight for a number of reasons. First, the 1979 weight accounts only for the initial
selection probabilities, and thus obviously does not account for subsequent attrition.20 Attrition
modestly changes (increases) the weights for the remaining observations that are similar to the
non-response cases, but more importantly for our purposes it results in a zero weight for the non-
response cases. As a result, for ages that occur subsequent to 1979, HMS use a slightly wrong
weight for all continuing cases and a dramatically wrong weight for non-response observations.
Second, the 1979 weight is not the appropriate weight in the first place, even for analyses
of 1979 data, given the sample they analyze. This is true for two reasons. Because HMS
analyze outcomes by age rather than calendar year, there is no analysis of 1979
21 In 1979, there were 198 poor whites in the cross-section and 901 poor whites in the supplementary sample. The cross-section poor whites had a mean sample weight of 1207, representing a total population of about 239,000 persons. The supplementary poor whites had a mean weight of 915 and represented a total population of about 825,000 persons. Thus the total represented population was 1.06 million persons. In 1992, following the drop of the supplementary poor white sample, the mean weight for the 198 remaining poor whites increased to 5360, thus representing the same 1.06 million persons.
outcomes for which that weight would be appropriate; the 1979 outcomes are scattered across
the various ages. At a minimum, the appropriate weight to use is the annual, updated weight,
translated from calendar year to age. But even this is incorrect. HMS restrict their sample to
cases from the cross-section (white, black, and Hispanic) and the supplementary black and
Hispanic samples, i.e., they exclude the military sample and the poor white supplementary
sample. All of the sample weights provided in the NLSY, including the 1979 weight, are,
however, appropriate only for the whole sample and not for portions of it.
A particularly clear and relevant example of this involves the weights for the group of poor
whites who are part of the cross-sectional sample. The initial sample weights for this group
reflect both their representation in the cross-sectional sample and the inclusion of the poor
whites in the supplementary sample. Thus, when the latter group was dropped from the NLSY in
1991, the sample weights for the remaining cross-sectional poor whites were increased by a
factor of more than four, precisely because the selection probability of poor whites decreased.21
Since HMS do not use the poor whites from the supplementary sample even in the years prior to
their drop from the sample, the 1979 weights for the cross-sectional poor whites are incorrect, in
the sense that they do not yield appropriate population estimates. The weights for other groups
are not affected.
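The figures in footnote 21 show how large the required adjustment is. The sketch below is illustrative only: it assumes that, once the supplementary poor whites are excluded, the cross-sectional poor whites should be reweighted to represent the combined population. This rescaling rule and the variable names are assumptions made for the example, not a description of the NLSY weighting procedure.

```python
# Figures from footnote 21 (1979 weights for poor whites).
cross_n, cross_mean_wt = 198, 1207
supp_n, supp_mean_wt = 901, 915

cross_pop = cross_n * cross_mean_wt   # population represented by the cross-section
supp_pop = supp_n * supp_mean_wt      # population represented by the supplement
total_pop = cross_pop + supp_pop

# If the supplementary cases are excluded, the cross-section cases must
# represent the whole group; this rescaling is an illustrative assumption.
adjustment = total_pop / cross_pop
adjusted_mean_wt = cross_mean_wt * adjustment

print(total_pop, round(adjustment, 2), round(adjusted_mean_wt))
```

The implied mean weight is close to the 1992 mean weight of 5360 reported in footnote 21, consistent with an increase by a factor of more than four.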
Because they use the 1979 weight and because of the nature of some of their variable
assignments, HMS 2002 actually includes observations that are non-response in their analyses.
Table 7 documents this. There are a maximum of 341 such non-response observations included
in their teen analyses –about 3% of all cases. All 341 of these cases have assigned values for
all education measures – presumably carried forward from some previous year– as well as for
current poverty status and Food Stamp use – for which no data exist. Only two of them are poor,
which is a suspiciously low figure. Nearly three-fifths apparently used Food Stamps. Almost half
have reported spouse earnings –all of them are zero, almost certainly due to a default
assignment in data processing. Sixty percent have reported welfare benefits; two-thirds of the
values are $256.06, which is the value incorrectly assigned to cases with no welfare benefits.
But another 64 of these non-response cases somehow have values for welfare benefits; their
average welfare benefits are $10,110 in 1994 dollars.
It is important to appreciate that none of these cases belong in a weighted analysis, even
when it is possible to assign a value for a variable such as educational attainment that is not
time-varying. The sample weight (if properly applied) adjusts appropriately and fully for the non-
responding observations.
F. Summary
In sum, HMS have made a number of data-processing errors that could undermine their
analyses in KHK and HMS 2002. They mis-scaled almost all incomes, included out-of-sample
observations, made a series of probably erroneous AFDC, Food Stamp, and welfare benefit
assignments, arguably miscoded fertility outcomes for some cases and omitted a substantial
number of other available cases, and applied an inappropriate weight variable in their analyses.
Some of these problems –the mis-scaling, the incorrect default $256.06 welfare benefits
assignments, and the inclusion of out-of-sample cases– can be corrected with their own data, but
most of the others require a fresh analysis. With the exception of the mis-scaling, which
undoubtedly increases the size of estimated coefficients, there is, ex ante, no clear prediction
about how correcting the other errors will affect the estimates, except that the resulting estimates
can then be viewed with substantially greater confidence.
G. Could Anyone Have Known?
Could a normally diligent referee, without access to the data, have known about these
problems? Possibly, I think, but only with some sleuthing, cleverness, and luck.
The most obvious error is the mis-scaling, an error I discovered by literally comparing
values. I did, however, suspect there was an error the first time I read the paper, although I had
no idea what its source was. HMS reported in KHK that at age 30, annual earnings for teen
22 This latter group is not the teen miscarriage group, but rather a sample of all women who did not have a teen birth. HMS use this comparison primarily to show the “apparent” consequences of teen childbearing that they later debunk.
23 As noted earlier, the 24,416 number is actually too high. Adjusting for the number of years each birth cohort is actually observed between ages 18 and 35 between 1978 and 1991 decreases the maximum sample size to 22,236 and further adjustment for cases lost during a particular interview year decreases the available sample size to 21,670. A referee would have no way of knowing those figures without access to the data.
24 An interesting example of an error that was identified and caught was Weitzman’s claim that the economic status of women fell 73% in the year following divorce (Weitzman, 1985). Indirect evidence of the error was reported by Hoffman and Duncan (1988) and direct evidence was reported by Peterson (1996).
mothers was $18,544 compared to $32,935 for women who did not have a teen birth.22 In fact,
these means almost certainly included the zero earnings of non-participants and thus were
consistent with the unrealistically high mean earnings that I discussed above. The inclusion of
the non-working women was, however, not explicitly stated and a referee might easily not
recognize that. In the next sentence, HMS note that teen mothers worked an average of 966
hours annually, compared to 1280 hours for the other group. Simple arithmetic then shows that
this implies hourly wages of $19.20 and $25.73, respectively. Those average hourly earnings
are clearly not very likely for any group of young women, especially not for the teen mothers,
given their age and education. So that was one possible clue.
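The "simple arithmetic" is a ratio of the reported means. The sketch below reproduces it with the figures quoted in the text; the function name is illustrative, and the calculation is the rough check a referee could make, not an estimate of actual wage rates.

```python
def implied_hourly_wage(annual_earnings, annual_hours):
    """Ratio of mean annual earnings to mean annual hours."""
    return annual_earnings / annual_hours


teen_mothers = implied_hourly_wage(18_544, 966)
others = implied_hourly_wage(32_935, 1_280)
print(f"${teen_mothers:.2f}  ${others:.2f}")
```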
Another possible clue was the discordant sample sizes. As discussed above, the actual
sample sizes reported in the appendix to KHK – as high as 25,432– often exceed the maximum
potential number of observations for a sample of 1744 women observed for 14 years (24,416).23
But most perfectly diligent referees are unlikely to notice something in the fine print like that.
Because neither of these errors is conspicuous, I conclude that referees are not to
blame for not catching the errors. I seriously doubt that this is the first publication that had
substantial data errors, and, probably more often than not, the errors are not detected.24 This
certainly speaks to the need to make data sets available as a condition for publication, a practice
now followed by many journals. Researchers are well advised to examine their data carefully,
inspecting means, minimums, and maximums and comparing them to benchmarks where
available, rather than proceeding directly to multivariate analyses.
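A basic screen of this kind takes only a few lines. The sketch below is one possible version, using the Python standard library: it reports the mean, minimum, and maximum of a variable and flags any single value held by an unusually large share of cases, which may indicate a default code. The function name, the 25% threshold, and the sample values are hypothetical.

```python
from collections import Counter
from statistics import mean


def screen(values, spike_share=0.25):
    """Basic checks: mean, min, max, and any single value held by an
    unusually large share of cases (a possible default code)."""
    counts = Counter(values)
    value, n = counts.most_common(1)[0]
    share = n / len(values)
    return {
        "mean": mean(values),
        "min": min(values),
        "max": max(values),
        "spike": value if share >= spike_share else None,
        "spike_share": share,
    }


# Hypothetical benefit values, for illustration only.
sample = [256.06, 256.06, 256.06, 1986.0, 4210.5]
report = screen(sample)
print(report)
```

A minimum that is positive where zeros are expected, or a single value covering most of the sample, is the kind of pattern that revealed the $256.06 problem described above.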
III. Data and Estimates
I drew a sample from the NLSY following the procedures used in HMS 2002. Like HMS, I
exclude cases from the military sample and the poor white supplementary sample even in years
where those cases are available. Like HMS, I define a teen pregnancy as one that began at age
17 or earlier. The full sample includes 1033 women who had a teen pregnancy and had
sufficient information in at least one year to determine the age at the beginning of the pregnancy
and the outcome of the pregnancy. Of these, 773 women had a teen birth, 187 had a teen
abortion, and 73 had a teen miscarriage. The full person-year sample (ages 18-35 observed
between 1978 and 1992) includes a potential maximum of 14,222 cases, but sample attrition
reduces the actual total to 13,651. The number of available cases for particular dependent
variables is typically smaller than this.
Table 8 shows the population estimates (weighted sample means) for the independent
variables and the outcomes that I examine. I have adjusted the weight for the poor whites in the
cross-section for the exclusion of the poor whites in the supplementary sample. The outcomes
are shown at age 28, except for high school graduation which is at age 21. Age 28 is the latest
age at which all birth cohorts are observed and it is the age for which HMS present most of their
results. Also shown in the table are the corresponding means from HMS 2002. The background
means shown in the top panel are taken from their Table 2; the outcome measures are my
estimates using their data. Note that the weight variable used to compute the HMS means is the
incorrect 1979 weight variable as described above.
As shown in the first column of the table, the NLSY teen pregnancy sample is about one-
quarter black and almost two-thirds white. These proportions differ from the HMS proportions
(30% black, 60% white) primarily because of the use of the adjusted weight for the poor whites,
which increases the white percentage by about three percentage points in my data. Many of the
other family background means are quite close in the two samples, for example, family structure,
parents’ education, and the AFQT score. Income variables are an exception. Family income in
HMS 2002 is about 25% higher than what I find. More than half of the cases in HMS 2002 are
missing family income, so the means may well represent different populations.25 For the
outcome variables, means for educational attainment (especially high school graduation), work
hours, and food stamp use are quite similar in the two data sets, but incomes differ considerably
because of the scaling problem in HMS. Mean own earnings are 77% larger, mean spouse
earnings 91% larger, and mean income from welfare more than 60% larger in HMS 2002 than
in my tabulations. There are also fairly large differences in the proportions married, poor, and, to
a lesser extent, receiving AFDC income. I do not know what accounts for these latter
differences.
25 In my sample, 22% of the cases (unweighted) are missing 1978 family income and 38% are missing information on 1978 family AFDC receipt. I have confirmed that many of the cases reported by HMS to be missing family income do, in fact, have legitimate data for this variable. The proportion of HMS cases missing income data is strongly related to birth cohort, ranging from 84% for the 1957 birth cohort to just 12% for the 1964 birth cohort. It is very possible that HMS are doing something systematic that is related to the age of the respondent.
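The percentage gaps quoted in this paragraph follow directly from the income means in Table 8. A short sketch of that calculation, with the means copied from the table and an illustrative helper name:

```python
def pct_larger(nlsy_mean, hms_mean):
    """Percent by which the HMS 2002 mean exceeds the NLSY mean."""
    return 100 * (hms_mean / nlsy_mean - 1)


# (NLSY mean, HMS 2002 mean) in dollars, copied from Table 8.
income_means = {
    "family income (1978)": (13_565, 16_981),
    "own earnings": (9_110, 16_147),
    "spouse earnings (all)": (12_507, 23_891),
    "income from welfare": (1_447, 2_351),
}

gaps = {name: pct_larger(nlsy, hms) for name, (nlsy, hms) in income_means.items()}
for name, gap in gaps.items():
    print(f"{name}: HMS mean is {gap:.0f}% larger")
```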
Table 9 presents my re-estimation of the HMS model using their data and correcting for
the scaling problem and the inclusion of out-of-sample data points for 1975-1977. As described
earlier, HMS estimate a non-parametric model in which the effect of a teen birth at each age
is estimated via a separate interaction term (“is age t” x “whether had a teen birth”). This model
is estimated by HMS in three versions, first with no additional explanatory variables and then in
two versions with additional explanatory variables. Explanatory variables include dummy
variables for birth cohort, having a first pregnancy at age 16 or 17, race/ethnicity (two dummy
variables), family structure at age 14 (two dummy variables), correlates of miscarriage (two
dummy variables for whether used alcohol or tobacco during pregnancy), parents’ education,
and two measures of family income (actual income and whether received income from welfare,
both measured for 1978). AFQT scores are also included for outcomes other than education.
Most of the strong results (i.e., positive or no negative impact of a teen birth) appear in the model
without explanatory variables; with the exception of spouse earnings and income from welfare,
the teen birth estimates are quite robust across the specifications. HMS also estimate a
parametric age effect model that uses either a quadratic or cubic function in age plus
corresponding interaction terms and a dummy variable for a teen birth. This model is estimated
only in the form with all additional explanatory variables. Estimation is by two-stage least
squares, with having a miscarriage used as an instrument for having a teen birth. I estimate the
models using Limdep. To keep the table simpler, I report results only for the non-parametric
model without additional explanatory variables and the age-parametric model with all
covariates.26 Columns (1) and (2) are the original HMS estimates and columns (3) and (4) are
the corrected estimates. The estimates shown are for age 28, except for high school graduation,
which is shown for age 21. For the non-parametric model, the reported effect comes directly
from the estimated coefficient; for the parametric models, the effect is computed at the indicated
age from the underlying coefficients on the teen birth dummy and the age interactions. For some
outcomes, there are no corrections, either because there is no problem with the underlying
variable (hours worked, whether married) or because the underlying problem does not affect the
particular estimate shown in the table (non-parametric estimates for educational attainment,
poverty status, and receipt of food stamps and AFDC, where the data problem is confined to
ages 18-20).
26 I was able to replicate their results for all outcomes except income from welfare, where my coefficient estimates for most variables differed from theirs by approximately five to ten percent. This may reflect differences due to the statistical software used.
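To make the instrumental-variables logic concrete, the sketch below computes the simplest possible case: one binary instrument (miscarriage), one binary treatment (teen birth), and no covariates or age interactions. In that special case two-stage least squares reduces to the Wald ratio of two differences in means. This is an illustration only; it is not the HMS specification, which includes age interactions and covariates, and the eight observations are invented so that the true effect is exactly 2.

```python
def mean(xs):
    return sum(xs) / len(xs)


def wald_iv(y, d, z):
    """IV estimate of the effect of d on y using a binary instrument z.

    Reduced form: difference in mean outcome by instrument status.
    First stage:  difference in mean treatment by instrument status.
    """
    y1 = [yi for yi, zi in zip(y, z) if zi == 1]
    y0 = [yi for yi, zi in zip(y, z) if zi == 0]
    d1 = [di for di, zi in zip(d, z) if zi == 1]
    d0 = [di for di, zi in zip(d, z) if zi == 0]
    reduced_form = mean(y1) - mean(y0)
    first_stage = mean(d1) - mean(d0)
    return reduced_form / first_stage


# Hypothetical data: z = miscarriage, d = teen birth, y = outcome (y = 10 + 2*d).
z = [1, 1, 1, 1, 0, 0, 0, 0]
d = [0, 0, 0, 1, 1, 1, 1, 0]
y = [10, 10, 10, 12, 12, 12, 12, 10]

effect = wald_iv(y, d, z)
print(effect)
```

The first stage is negative here because a miscarriage lowers the probability of a teen birth; the ratio still recovers the effect of a birth on the outcome.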
The impact of the income scaling problem is quite evident. Coefficient estimates for all
incomes fall substantially. Of the eight pairs of estimated income coefficients, in six cases, the
corrected estimates are about 60% of the original. This includes all of the parametric estimates.
The other two cases have much larger changes. The effect of a teen birth on total income from
welfare, estimated in the non-parametric specification, falls more than 75% from -$372 to -$85.
For the non-parametric estimate of spouse earnings, the impact of a teen birth at age 28 falls
from about $1270 to $28, a drop of 98%. This particular impact is an outlier; at surrounding
ages, the drop in the coefficient is more similar to the changes in the other income coefficients.
At other ages, the impacts generally follow the magnitude of the scaling error discussed earlier,
increasing with age; these results are not shown in the table. On average, from age 18 to 32 the
corrected coefficient estimate is 55% of the original estimate for a woman’s wage rate, 64% for
earnings, 22% for spouse earnings (61% omitting the age 28 effect) and 62.5% for welfare
benefits. The signs of the coefficients are not changed by the correction, although statistical
significance occasionally is.
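The size of each correction can be read off Table 9 by dividing the corrected estimate by the original one. A minimal sketch, with the age-28 income estimates copied from that table (the dictionary layout and function name are illustrative):

```python
# (original, corrected) estimates copied from Table 9.
TABLE_9 = {
    "wage rate": {"non-parametric": (4.34, 2.15), "parametric": (1.64, 1.08)},
    "own earnings": {"non-parametric": (9269.65, 5467.41), "parametric": (6660.43, 3915.85)},
    "spouse earnings": {"non-parametric": (1269.85, 28.13), "parametric": (7511.77, 4436.58)},
    "income from welfare": {"non-parametric": (-372.27, -84.99), "parametric": (-1017.79, -617.73)},
}


def corrected_share(original, corrected):
    """Corrected estimate as a fraction of the original estimate."""
    return corrected / original


shares = {
    (outcome, model): corrected_share(*pair)
    for outcome, models in TABLE_9.items()
    for model, pair in models.items()
}
for (outcome, model), share in shares.items():
    print(f"{outcome:20s} {model:15s} corrected = {share:.0%} of original")
```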
The inclusion of out-of-sample cases for the non-monetary outcomes does not have
much of an impact. Education estimates are essentially unchanged, the impact of teen
childbearing on the probability of being in poverty decreases by about .015 in absolute value,
while the impact on the probability of receiving food stamps or AFDC increases by about .018 to
.020.
In Table 10, I present my estimates of the impact of teen pregnancy on socio-economic
outcomes using the sample of teens pregnant at age 17 or earlier that I drew from the NLSY79.
Again, I estimate two models – a non-parametric age specification with no additional explanatory
variables and a parametric model very similar to, though not identical to, the one estimated by
HMS. The differences are slight and highly unlikely to affect results; if they did, the results would
be far from robust. I have used the AFQT score as a linear variable rather than as a set of
dummy variables, and I have not included variables for use of alcohol and tobacco during
pregnancy, because I was uncertain how those variables were constructed. All other variables
are either identical to those used by HMS or as close as feasible without my having access to
their coding and data processing procedures. Both specifications allow the teen birth effect to
vary over time through the appropriate interaction terms.
The top portion of the table shows the non-parametric estimates for the outcome
variables I use and the bottom portion shows the parametric estimates. I show the actual non-
parametric coefficient estimates for the age-teen birth interaction terms at two year intervals from
age 18 to 30. For the parametric estimates, I show the key teen birth coefficients and the
impacts computed from these estimates at the same two-year intervals. To keep the table
manageable, I do not show estimates of the impact on the wage rate, because in the NLSY79
this is simply the ratio of earnings to hours worked; cumulative hours worked because the
cumulative sum can be derived from the analysis of annual hours; and whether the individual has
a GED, because this is implicit in the HS Graduate or GED analysis. I have also excluded
analyses of the number of children and years as a single mother, since these seem less central.
I have added an additional variable measuring educational attainment – whether the individual
completed at least two years of college, because this is an increasingly important outcome. I do
not show standard errors, but simply indicate which variables are statistically significant. I also
do not show coefficient estimates for any of the other independent variables. Full estimates are
provided in a data appendix available from the author.
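For the parametric specification, the impact at a given age is obtained by combining the teen birth dummy coefficient with the age-interaction coefficients. The sketch below shows that calculation for a quadratic in age. The coefficient values are hypothetical and chosen only to make the arithmetic easy to follow, and entering age in raw years (rather than, say, years since age 18) is an assumption of the example.

```python
def effect_at_age(age, coefs):
    """Teen birth effect at a given age from polynomial interaction coefficients.

    coefs[0] is the coefficient on the teen birth dummy, coefs[1] on
    (teen birth x age), coefs[2] on (teen birth x age squared), and so on.
    """
    return sum(c * age ** power for power, c in enumerate(coefs))


# Hypothetical coefficients, not estimates from this paper or from HMS.
coefs = [-100.0, 10.0, -0.25]

effects = {age: effect_at_age(age, coefs) for age in range(18, 31, 2)}
for age, effect in effects.items():
    print(age, effect)
```

A cubic specification works the same way with one more coefficient in the list.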
The most conspicuous feature of the non-parametric estimates is the general lack of
statistical significance. Of the 77 coefficients shown, only 9 are statistically significant at the 10%
level or better and a great many have t-statistics of .5 or lower (not shown in the table). This is
also characteristic of the estimates in HMS 2002. More than half of the statistically significant
estimates are at age 30. Own and spouse earnings are consistently positive, but not significant
until age 30. Welfare income bounces around in a positive and negative direction and again is
not statistically significant until age 30. There are no significant differences in either high school
graduation or GED attainment; the age pattern of HS graduation estimates is actually a bit odd,
because the distribution is constant from age 20 on. There are some indications that teen
mothers are less likely to go on to college. Those impacts are consistently negative from age 22
to 28 and statistically significant at age 24, but they appear to disappear at age 30. Impacts on
marriage are positive at first, presumably because some of the women marry the father of their
child; this probably also accounts for the initial premium in spouse earnings. From age 24 on,
the marriage impacts are erratic, but they are occasionally large (-.112 at age 28), although none
of the impacts are statistically significant. The most consistent effect is for food stamps: from
age 24 on, the teen mothers consistently and usually statistically significantly use food stamps
less than other women. Their use of AFDC is negative in all years shown but one, but only the
effect at age 30 is statistically significant. Impacts on the probability of being in poverty are
erratic and none are significant. Teen mothers do appear to work more than other women, and
significantly so at age 28.
Estimates of the parametric effects are somewhat more likely to be statistically significant,
but even here only about a quarter (eight of 33) are statistically significant. The teen birth
dummy and the age-interaction terms are reliably estimated for spouse earnings and for some
college attendance. In many cases, the data would probably reject the
hypothesis of a teen birth effect that varies over this age range. Given the nature of the
parametric model, the estimates vary less wildly from year to year. While there are exceptions,
the estimates are usually smaller in absolute value than in the non-parametric specification.
With all the same caveats about lack of statistical significance, it does appear that the teen
mothers earn less through age 22 and then catch up. They have spouses who initially earn more,
probably because of marriage per se, then lose most of that advantage, before regaining it plus
more at ages 28 and 30. The impacts on welfare benefits are substantially positive at first, but
do finally turn negative at age 28. The patterns from the non-parametric estimates for college
attendance and food stamps both appear here: the teen mothers are again less likely to attend
college or receive food stamps.
Table 11 is a summary of the results from HMS 2002 and from my re-estimation of the
NLSY data. The HMS 2002 results are the original ones that they present, that is, they are not
corrected for scaling and sample problems as was done in Table 9. The estimates shown are at
age 28, except for high school graduation, which is reported at age 21. Also shown in brackets
beneath the estimates for the income variables and hours worked are the sums of the estimates
from age 18 to 30. These sums are not discounted and make no allowance for whether the
underlying estimates are statistically significant or not.
The largest differences in estimated effects are, not surprisingly, with the income variables
where the scaling problem caused HMS to substantially overstate the impacts. My age 28
estimates for own earnings are about 17-22% of theirs; the cumulative total is 23% as large for
the non-parametric estimates and less than 10% as large for the parametric estimates. For
spouse earnings, my age 28 non-parametric estimate is twice as large as theirs, but the
parametric estimate is less than half as large. Cumulative effects on spouse earnings are two-
thirds and one-half as large for the two specifications. All specifications show the teen mothers
receiving less welfare income at age 28; my parametric estimate is about one-quarter of theirs.
Interestingly, their estimates show that teen mothers receive substantially more welfare benefits
from age 18 to 30, while my non-parametric estimates show just the opposite. (Recall that there
seemed to be a number of problems with their welfare benefit variable.) My parametric estimate
shows nearly $2900 additional welfare benefits for the teen mothers. The general conclusion for
the income variables is of effects that are far more modest and less beneficial than were reported
in HMS 2002, both at age 28 (though with some exceptions) and over the years from 18 to 30.
There are no sign reversals, and no evidence of significant adverse impacts of teen childbearing
except perhaps with welfare benefits.
The same pattern holds for hours worked, a variable for which I identified no special data
problems. Nonetheless, there are differences in our estimates. Like HMS, I find that teen
mothers tend to work more hours, but I find a much smaller advantage. My cumulative non-
parametric estimates are 1200 hours less than theirs, and my parametric estimates are more
than 1500 hours less. I do not know how to account for this difference.
For the education variables, there is reasonable consensus. This is not surprising, since I
found no problems with the education variable per se, except for the out-of-sample cases that
were erroneously included. All estimates show a negative impact on receiving a high school
diploma. HMS 2002 find that the positive impact on receiving a GED more than compensates for
this negative impact, so that the teen mothers are actually slightly more likely to have either a
high school degree or a GED. I do not find this latter effect. In my estimates, there is essentially
no difference in the proportion with either a high school diploma or a GED. As noted above, I
find some evidence of a negative impact on the probability of attending at least two years of
college.
Finally, all of the estimates indicate that teen mothers are less likely to be receiving food
stamps; this impact is probably the largest and most consistent of all the outcomes. All of the
estimates indicate a quantitatively small, but negative impact on AFDC receipt, with teen mothers
less likely to be receiving assistance. HMS find that the teen mothers are less likely to be poor. I
find essentially no impact. Finally, all estimates show that teen mothers are less likely to be
married at age 28. The parametric estimates are very small, while the non-parametric estimates
are quite a bit larger.
HMS use their parametric estimates to compute budgetary costs of teen childbearing,
including AFDC benefits received and taxes paid. They conclude that teen childbearing actually
reduces net government expenditures by $3.9 billion, primarily because it increases earnings
and thus taxes paid, relative to what would happen were these women to delay their
childbearing. Their procedures are complex and I have not tried to recompute this at this time
using my revised estimates. I do suspect that their overstatement of the earnings gain
attributable to teen childbearing would change these estimates by reducing the taxes paid.
Given the magnitude of the scaling problem and its impact on coefficient estimates, the change
in these budgetary cost estimates might well be substantial.
IV. Conclusion and Future Directions
I firmly believe that the new approaches to the impact of teen childbearing are productive
and informative. The two HMS contributions (KHK and HMS 2002) are particularly clever and
important contributions to this literature. Just as firmly, I believe that approaches need not only
be conceptually sound but also soundly and carefully implemented. Here, the two HMS
contributions fell short. Some of their errors may have been innocuous, affecting too few cases
to affect the estimates, although that does not constitute a defense. Other errors, especially
the mis-scaling of incomes, had a very substantial impact on estimated coefficients.
I have focused here on a relatively straightforward replication of HMS 2002. There are a
number of extensions that could be implemented and that might well be productive. I note some
of them here.
First, there is no need to limit the analysis to pregnancies that occur at age 17 or earlier.
This definition was chosen for the research reported in Kids Having Kids and HMS
understandably continue to follow it. Other researchers, however, might want to expand the age
limit. One advantage of extending the approach to include pregnancies at age 18 is that it will
likely increase the sample of teens with a miscarriage. It will also thereby more closely mimic the
teen birth definition used in most research.
Second, researchers might well want to include the portions of the NLSY sample that
HMS dropped. The military sample is available from 1979 until 1984 and the sample of
economically disadvantaged white respondents is available through the 1991 survey. Available
sample weights allow for their inclusion in the years in which they are present in the sample. In
fact, excluding these cases without adjusting the weight variable is incorrect. Including these
sub-samples would further increase sample sizes of teens with a miscarriage.
Third, especially with a larger sample of teen miscarriages, it might be possible to
examine some issues further. One issue is the apparent high representation of stillbirths in the
miscarriage sample. The current specification treats a full-term stillbirth as equivalent to a
miscarriage that occurs in a pregnancy of much shorter duration; with a larger sample, one could
test for equivalent impacts. A second issue is that a very substantial portion of the teens with miscarriages
had a follow-up pregnancy at age 17 or earlier that led to a birth. Hoffman (1998), citing data
reported in KHK, notes that nearly 30% of the teens with a miscarriage had such a birth.
Researchers might want to consider whether the benefits of delay of a first birth were a function
of the length of time that the teens with a miscarriage actually did delay. Researchers might also
be able to consider whether the teen birth effect varies by race and/or ethnicity.
Fourth, in this paper, I have not examined issues of the sensitivity of the estimates to the
extreme incomes that are occasionally reported in the NLSY. In my data, the maximum value for
a woman’s own earnings is over $137,000 and the maximum for spouse earnings exceeds
$550,000. Maximum welfare income is greater than $27,000, which, while high, is substantially
lower than the maximum value of over $150,000 in HMS 2002. Maximum annual hours worked
is 5980, an average of over 16 hours per day for 365 days. Such values may, of course, be
genuine.
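The hours figure can be put in perspective with a quick calculation. The sketch below uses only the maximum quoted above; the comparison to the total number of hours in a year is added for illustration.

```python
MAX_ANNUAL_HOURS = 5_980   # maximum annual hours worked reported in the text
DAYS_IN_YEAR = 365
HOURS_IN_YEAR = 24 * DAYS_IN_YEAR

hours_per_day = MAX_ANNUAL_HOURS / DAYS_IN_YEAR
share_of_all_hours = MAX_ANNUAL_HOURS / HOURS_IN_YEAR

print(f"{hours_per_day:.1f} hours per day, every day of the year")
print(f"{share_of_all_hours:.0%} of all hours in the year")
```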
Finally, the NLSY teens, who were 14-21 in 1979 and 28-35 in 1993 when these analyses
ended, have moved further into their own life-cycles. It would be valuable to examine whether
the trends that HMS identified in the mid-to-late 20s continue as all the women in the sample
move into their 30s.
References
Campbell A. 1968. “The role of family planning in the reduction of poverty.” Journal of Marriage and the Family, Vol. 30(2): 236-245.
Deaton, Angus S. 1997. The Analysis of Household Surveys. Baltimore: Johns Hopkins University Press.
Duncan, Greg J. and Saul D. Hoffman. 1990. “Economic Opportunities, Welfare Benefits, and Out-of-wedlock Births among Black Teenage Girls.” Demography, 27, 519-535.
Furstenberg, Frank F. Jr., J. Brooks-Gunn, and S. Philip Morgan. 1987. Adolescent Mothers in Later Life. Cambridge: Cambridge University Press.
Geronimus A. T. and S. Korenman. 1990. “The Socioeconomic Consequences of Teen Childbearing Reconsidered,” mimeo, University of Michigan.
___________. 1992. “The Socioeconomic Consequences of Teen Childbearing Reconsidered.” Quarterly Journal of Economics, Vol. 107: 1187-1214.
Grogger, Jeff and Stephen G. Bronars. 1993. “The Socioeconomic Consequences of Teenage Childbearing: Findings From a Natural Experiment.” Family Planning Perspectives, Vol. 25 (4).
Hayes, Cheryl (ed.). 1987. Risking the Future. Vol. 1. Washington, DC: National Academy Press.
Hoffman, Saul D. 1998. “Teen Childbearing Isn’t So Bad After All ... or Is It? A Review of the New Literature on the Consequences of Teen Childbearing.” Family Planning Perspectives, Vol. 30, No. 5, pp. 236-239.
Hoffman, Saul D. and Greg J. Duncan. 1988. “What Are the Economic Consequences of Divorce?” Demography, November.
Hoffman, Saul D., E. Michael Foster, and Frank F. Furstenberg, Jr. 1993. “Re-evaluating The Costs of Teenage Childbearing.” Demography, Vol. 30 (1), 1-13.
Hotz, V. Joseph, Susan McElroy, and Seth G. Sanders. 1997. “The Impacts of Teenage Childbearing on the Mothers and the Consequences of those Impacts for Government,” in Kids Having Kids, Rebecca Maynard (ed.). Washington, DC: The Urban Institute Press.
___________. 2002. “Teenage Childbearing and Its Life Cycle Consequences: Exploiting a Natural Experiment,” mimeo.
Hotz, V. Joseph, Charles H. Mullin, and Seth G. Sanders. 1997. “Bounding Causal Effects Using Data from a Contaminated Natural Experiment: Analyzing the Effects of Teenage Childbearing,” Review of Economic Studies, Vol. 64, 575-603.
Maynard, Rebecca. 1997. Kids Having Kids. Washington, DC: The Urban Institute Press.
National Center for Health Statistics. 1996. “Medical and Life-style Risk Factors Affecting Fetal Mortality, 1989-90.” Vital and Health Statistics, Series 20, No. 31.
Peterson, Richard R. 1996. “A Re-Evaluation of the Economic Consequences of Divorce.” American Sociological Review, Vol. 61, June.
Weitzman, Lenore. 1985. The Divorce Revolution. New York: The Free Press.
Table 1. Effects of Teen Childbearing on Selected Socioeconomic Outcomes, Teen Mothers vs Teens with Miscarriage
Columns: (1) KHK, all covariates, polynomial age effects; (2) HMS 2002, no covariates, unconstrained age effects; (3) HMS 2002, all covariates, unconstrained age effects; (4) HMS 2002, all covariates, polynomial age effects.
Outcome                             (1)         (2)         (3)         (4)
High School Diploma                 -.20        -.11        -.16*       -.15*
High School Diploma or GED          .02         .08         .03         .05
Annual Hours Worked                 130 - 500   369*        331*        304*
Spouse’s Earnings                   $8485       $1269       $2505       $7512*
Annual Own Earnings                 $4508       $9270**     $8489**     $6660**
Own Wage Rate                       --          $4.34**     $4.22**     $1.63
Dollars from AFDC/Food Stamps       No effect   -$372       -$516       -$1018
Number of Births                    .30         .30         .27         .35
Years as Single Mother              1.6         --          --          --
Proportion Married                  --          -.08        -.07        -.03
Proportion in Poverty               --          -.11        -.12        -.14
Proportion Receiving Food Stamps    --          -.10        -.09        -.15
Proportion Receiving AFDC           --          -.04        -.06        -.04
Source: Kids Having Kids, Chapter 3, text and figures, and HMS 2002, Table 4.
Table Notes: ** = statistically significant at 95% level; * = statistically significant at 90% level. Most KHK estimates are at age 30, except annual hours worked, where the range of estimates is for mid-20s to early 30s. HMS 2002 estimates are at age 28, except for high school diploma, which is evaluated at age 21.
Table 2 – Own and Spouse Earnings, NLSY and HMS, Selected Cases
Variable and Case ID #                              Data Value, NLSY   Data Value, HMS   Ratio, HMS / NLSY
Spouse Earnings (years with positive earnings only)
ID = 244
R0155500 TOT INC SP WGS & SAL P-C YR 79             $5,044             missing           –
R0312710 TOT INC SP WGS & SAL P-C YR 80             $5,800             $13,166           2.27
R0482910 TOT INC SP WGS & SAL P-C YR 81             $9,152             $20,775           2.27
R0784300 TOT INC SP WGS & SAL P-C YR 82             $12,473            $28,314           2.27
R1026200 TOT INC SP WGS & SAL P-C YR 83             $14,700            $33,369           2.27
R1412900 TOT INC SP WGS & SAL P-C YR 84             $17,547            $39,831           2.27
R1780700 TOT INC SP WGS & SAL P-C YR 85             $22,000            $49,940           2.27
R2143800 TOT INC SP WGS & SAL P-C YR 86             $31,275            $70,994           2.27
R2352500 TOT INC SP WAGES & SALARY PAST YR 87       $32,840            $74,546           2.27
R3561200 AMT SP REC'D 1990 FROM WAGES 91            $50,000            $113,500          2.27
R3899300 AMT SP REC'D 1991 FROM WAGES 92            $70,000            $158,900          2.27
R4314400 AMT SP REC'D 1992 FROM WAGES 93            $60,000            $63,295           1.05
ID = 175
R1026200 TOT INC SP WGS & SAL P-C YR 83             500                $10,215           2.27
R1412900 TOT INC SP WGS & SAL P-C YR 84             $22,000            $49,980           2.27
R3561200 AMT SP REC'D 1990 FROM WAGES 91            $30,000            $68,100           2.27
R3899300 AMT SP REC'D 1991 FROM WAGES 92            $34,560            $78,451           2.27
R4314400 AMT SP REC'D 1992 FROM WAGES 93            $37,000            $39,032           1.05
Woman's Annual Earnings (years with positive earnings only)
ID = 3
R0312300 TOT INC WGS & SAL P-C YR 80 1979           $7,000             $15,890           2.27
R0782100 TOT INC WGS & SAL P-C YR 82 1981           $7,000             $15,890           2.27
R3897100 R'S WAGES/SALARY/TIPS (PCY) 92 1991        $4,000             $9,080            2.27
R4295100 AMT OF R'S WAGES/SALARY/TIPS (PCY) 93      $6,000             $6,329            1.05
ID = 19
R0155400 TOT INC WGS & SAL P-C YR 79 1978           $4,000             $9,080            2.27
R0312300 TOT INC WGS & SAL P-C YR 80 1979           $8,500             $19,295           2.27
R0482600 TOT INC WGS & SAL P-C YR 81 1980           $10,000            $22,700           2.27
R0782100 TOT INC WGS & SAL P-C YR 82 1981           $9,000             $20,430           2.27
ID = 62
R3897100 AMT OF R'S WAGES/SALARY/TIPS (PCY) 92      $25,000            $56,750           2.27
R4295100 AMT OF R'S WAGES/SALARY/TIPS (PCY) 93      $26,000            $27,428           1.05
Note: CPI (1994) = 148.2; CPI (1978) = 65.2.
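The two CPI values in the note can be compared with the ratios in the table. The sketch below computes the 1978-to-1994 inflation factor and the HMS / NLSY ratio for a handful of rows copied from Table 2; the selection of rows and the variable names are illustrative.

```python
CPI_1994 = 148.2
CPI_1978 = 65.2
inflation_factor = CPI_1994 / CPI_1978

# (NLSY value, HMS value) pairs copied from Table 2.
rows = [
    (5_800, 13_166),   # ID 244, 1980 survey
    (9_152, 20_775),   # ID 244, 1981 survey
    (7_000, 15_890),   # ID 3, 1980 survey
    (60_000, 63_295),  # ID 244, 1993 survey
    (6_000, 6_329),    # ID 3, 1993 survey
]
ratios = [round(hms / nlsy, 2) for nlsy, hms in rows]

print(round(inflation_factor, 2))
print(ratios)
```

The factor implied by the two CPI values matches the ratio shown for every survey year in the table except 1993, regardless of the year in which the income was actually received.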
Table 3 – Observed Data Ages and Sample Sizes by Birth Cohort
Birth Cohort   Number of Observations   Observed Age Range (Ages 18-35, 1978-92)   Maximum Total Potential Observations Contributed
1957           121                      21-35                                      1815
1958           101                      20-34                                      1515
1959           130                      19-33                                      1950
1960           125                      18-32                                      1875
1961           128                      18-31                                      1792
1962           133                      18-30                                      1729
1963           141                      18-29                                      1692
1964           100                      18-28                                      1100
ALL            979                      18-35                                      13,468
Table 4. Included Out-of-Sample Observations by Dependent Variable, HMS Teen Pregnancy Sample
Variable                  Out-of-Sample Year(s) Included   Number of Included Cases   Mean      Notes
Spouse Earnings           1977                             178                        $0        $0 assigned for all unmarried women; missing for married women
Welfare Benefits          1977                             316                        $3,137    Not clear what is source of data or why some cases are missing
In Poverty                1975                             121                        0.0%      All cases included; none are poor.
                          1976                             222                        0.0%
                          1977                             352                        0.0%
Received Food Stamps      1975                             121                        100.0%    All cases included; all reported receiving Food Stamps, 1975-1976.
                          1976                             222                        100.0%
                          1977                             352                        32.0%
Received AFDC Benefits    1975                             121                        100.0%    All cases included; all reported receiving AFDC, 1975-1976
                          1976                             222                        100.0%
                          1977                             352                        31.0%
HS Diploma                1975                             121                        39.0%     All cases included; possible assignment by natural age at graduation
                          1976                             222                        44.0%
                          1977                             352                        42.0%
GED                       1975                             121                        0.8%      All cases included; not clear what is source of data.
                          1976                             222                        4.5%
                          1977                             352                        7.7%
Hours Worked              None                             –                          –
Annual Earnings           None                             –                          –
Wage Rate                 None                             –                          –
Table 5 – Comparison of Food Stamps and AFDC Benefits, NLSY and HMS, Selected Representative Cases
Columns (1)-(5) are HMS variables: (1) ID; (2) Year; (3) Food Stamp Benefits; (4) AFDC Benefits; (5) Welfare Benefits.
Columns (6)-(10) are NLSY variables: (6) Annual Food Stamp Dollars; (7) AFDC - Avg Monthly Benefit; (8) Total Welfare Dollars; (9) AFDC Months (computed); (10) AFDC Benefits (computed).
(1)   (2)   (3)       (4)       (5)       (6)       (7)       (8)       (9)       (10)
4     82    $776      $2,499    $9,126    $416      $314      $1,986    5.0       $1,570
4     83    $749      $2,407    $8,802    $1,296    $314      $5,064    12.0      $3,768
4     84    $555      $1,979    $7,118    $1,080    $314      $4,848    12.0      $3,768
19    88    $667      $2,825    $9,713    $348      $446      $2,132    4.0       $1,784
19    89    $808      $1,569    $6,694    $1,284    $446      $6,636    12.0      $5,352
19    90    $130      $0        $607      $1,644    $446      $4,320    6.0       $2,676
19    91    $726      $0        $2,222    $256      $0        $256      0.0       $0
86    78    $614      $2,498    $9,357    $0        $0        $1,188    0.0       $0
86    79    $603      $2,990    $9,988    $912      $309      $4,620    12.0      $3,708
86    80    $481      $2,953    $9,556    $672      $303      $4,308    12.0      $3,636
86    81    $536      $2,836    $9,501    $648      $346      $4,870    12.2      –
86    82    $458      $2,806    $9,096    $828      $344      $4,956    12.0      $4,128
86    83    $411      $2,849    $9,084    $648      $358      $4,944    12.0      $4,296
86    84    $396      $2,837    $9,010    $648      $380      $5,208    12.0      $4,560
86    85    $387      $2,891    $9,134    $648      $389      $5,316    12.0      $4,668
86    86    $375      $2,923    $9,189    $648      $408      $5,544    12.0      $4,896
86    87    $405      $2,889    $9,178    $648      $425      $5,748    12.0      $5,100
86    88    $407      $2,882    $9,162    $756      $435      $5,976    12.0      $5,220
86    89    $324      $2,467    $7,815    $768      $457      $6,252    12.0      $5,484
86    90    $121      $0        $583      $448      $500      $3,948    7.0       $3,500
86    91    $375      $0        $1,271    $336                $336      0.0       $0
204   78    $610      $3,330    $12,311   $300      $333      $4,296    12.0      $3,996
204   82    $921      $2,406    $9,266    $1,420    $424      $5,236    9.0       $3,816
204   87    $861      $2,560    $12,672   $1,800    $446      $10,056   18.5      –
204   88    $601      $1,611    $6,247    $805      $432      $2,965    5         $2,160
204   89    $789      $2,103    $8,087    $1,800    $400      $6,600    12        $4,800
Table 6. Fertility Coding Issues in HMS Teen Fertility Data
                                        Birth       Abortion    Miscarriage/Stillbirth
Reported Cases                          727         185         68
Disposition Based on NLSY Variables, 1982-92:
  Birth: 708 – teen birth; 1 – miscarriage; 1 – abortion; 3 – 1st pregnancy at age > 17; 14 – no reported information about how pregnancy ended or age at 1st pregnancy
  Abortion: 170 – abortion; 3 – miscarriage; 3 – 1st pregnancy at age > 17; 9 – no reported information about how pregnancy ended or age at 1st pregnancy
  Miscarriage/Stillbirth: 56 – miscarriage; 1 – abortion; 3 – 1st pregnancy at age > 17; 8 – stillbirth
Number of excluded appropriate cases    65          17          9 (includes 3 stillbirths)
Total Available Analysis Sample         773         185         64 (miscarriage); 73 (including stillbirths)
Percent of Included Cases Correct       708/727     170/185     64/68
Percent of Available Cases Included     708/773     170/187     64/73
Table 7. Sample Means for Included Non-Response Cases, HMS Teen Pregnancy Sample
Variable                                   N      Minimum   Maximum       Mean
Has Received HS Diploma or GED by Age t    341    .00       1.00          .581
Has Received GED by Age t                  341    .00       1.00          .191
Has Received HS Diploma by Age t           341    .00       1.00          .405
On Food Stamps at Age t                    341    .00       1.00          .58
In poverty in year t                       341    0.00      1.00          .006
Annual Welfare Benefits, in 1994$          197    256.06    106,407.55    3457.36
Spouse's Annual Earnings, in 1994$         155    .00       .00           .00
Woman's Annual Earnings, in 1994$          2      .00       .00           .00
Ann. Family Income, in 1994$               2      .00       .00           .00
Hourly Wage Rate, in 1994$                 0      --        --            --
Table 8. Population Estimates for Background and Outcome Variables, Teens Pregnant at Age 17 or Earlier, NLSY79
                                  Population Mean
Variable                          NLSY (N=1033)   HMS 2002 (N=980)
Background Variables
Black                             25.9%           30.0%
White                             65.5%           60.6%
Hispanic                          8.6%            9.5%
From female-headed family         18.3%           17.4%
From two-parent family            72.3%           73.2%
Mother’s education                10.5            10.5 (a)
Father’s education                10.6            10.5 (a)
AFQT score                        31.7            31.8
Family Income (1978)              $13,565         $16,981 (b)
Family on AFDC (1978)             18.7%           15.0%
Outcome Measures (at age 28, except as noted)
High School Grad (age 21)         47.6%           48.2%
HS or GED                         64.9%           68.5%
Some College                      9.8%            --
Hours Worked                      1166            1092
Cumulative Hours Worked           9102            9189
Own Earnings (c)                  $9110           $16,147
Spouse Earnings (c) (if married)  $25,468         --
Spouse Earnings (c) (all)         $12,507         $23,891
Income from Welfare (c)           $1447           $2351
Married                           55.0%           64.6%
Poor                              31.0%           42.0%
Received Food Stamps              26.0%           27.0%
Received AFDC Income              18.2%           22.0%
Table Notes:
(a) adjusted from HMS 2002 to exclude missing data
(b) adjusted from HMS 2002 to exclude missing data and rescale to 1978 dollars
(c) in 1994 dollars
Table 9. Original and Corrected HMS Estimates of Effect of Teen Childbearing on Selected Socioeconomic Outcomes: Teen Mothers vs Teens with Miscarriage

                            Original                              Corrected (a)
Outcome                     Non-Parametric,   Parametric,         Non-Parametric,   Parametric,
                            no covariates     with covariates     no covariates     with covariates
High School Graduate        -0.107            -0.146              --                -0.145
HS or GED                   0.080             0.046               --                0.049
Hours Worked                368.90*           303.64*             --                --
Cumulative Hours Worked     2605.41**         1732.10             --                --
Wage Rate                   4.34**            1.64                2.15              1.08
Own Earnings                9269.65***        6660.43**           5467.41**         3915.85
Spouse Earnings             1269.85           7511.77*            28.13             4436.58
Income from Welfare         -372.27           -1017.79            -84.99            -617.73
Married                     -0.082            -0.033              --                --
Poor                        -0.106            -0.139**            --                -0.124
Received Food Stamps        -0.097            -0.152**            --                -0.174
Received AFDC Income        -0.043            -0.045              --                -0.063

a Corrected for scaling and sampling problems.
*, **, and *** = statistically significant at 10%, 5%, and 1% level
Table 10. IV Estimates of Effect of Teen Childbearing (Pregnant by Age 17) on Socio-Economic Outcomes, NLSY79

Columns, left to right: Earnings; Spouse Earnings; Welfare Income; HS Graduate; HS Grad or GED; Some College; Hours Worked; Poor; Married; Food Stamp Use; AFDC Use

Non-Parametric Estimates, No Covariates
Effect at Age:
18 -215.5 2469.9 473.2 0.099 0.174 0.002 -143.3 0.025 0.189 -0.038 -0.008
20 1086.4 2375.1 177.2 0.002 0.014 -0.010 219.0 -0.040 0.065 0.066 -0.056
22 1727.6 2680.9 -244.2 -0.048 0.013 -0.053 200.8 -0.094 0.056 0.044 -0.052
24 1167.6 4015.9 -302.7 -0.053 0.021 -0.084* 133.9 0.005 -0.055 -0.199** -0.074
26 909.1 2904.6 171.9 -0.114 -0.022 -0.074 188.0 0.066 0.032 -0.094 0.016
28 2086.7 2803.8 -470.5 -0.080 0.004 -0.066 308.6** -0.040 -0.112 -0.166** -0.085
30 3846.9** 6471.7** -1361.2** 0.006 0.053 0.024 154.5 0.045 0.009 -0.189** -0.122*
Sample Size 13651 13651 13620 13651 13634 13583 13460 11963 12902 13640 13483

Parametric Estimates With All Covariates
Teen Birth 54.03 49721.17* -3773.40 2.662 1.009 0.952** -34.68 0.076 1.865 0.836 0.168
Age -193.17 -4405.81** 457.02 -0.282 -0.083 -0.082** -16.82 0.000 -0.142 -0.058 -0.001
Age Squared 8.29 97.99** -11.82* 0.009 0.002* 0.002** 0.84 0.000 0.003 0.001 0.000
Age Cubed -0.0001
Effect at Age:
18 -738.4 2164.9 623.8 0.017 0.060 -- -63.7 0.053 0.172 0.072 0.068
20 -495.0 800.5 639.7 -0.051 0.020 -- -33.2 0.049 0.090 0.022 0.047
22 -185.3 219.9 561.0 -0.091 -0.006 -0.060 4.2 0.044 0.030 -0.020 0.025
24 190.6 423.3 387.7 -0.107 -0.019 -0.074 48.3 0.039 -0.008 -0.056 0.000
26 632.9 1410.5 119.9 -0.103 -0.019 -0.075 99.1 0.034 -0.025 -0.085 -0.026
28 1141.4 3181.7 -242.5 -0.083 -0.005 -0.064 156.7 0.028 -0.021 -0.107 -0.054
30 1716.2 5736.8 -699.4 -0.053 0.021 -0.039 221.0 0.022 0.004 -0.121 -0.084
Sample Size 13076 13076 13049 13606 13606 13555 12892 11473 12396 13066 12919

** = statistically significant at 5% level; * = statistically significant at 10% level
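The parametric "Effect at Age" rows of Table 10 appear to follow from the coefficient rows as a quadratic in age. The Python sketch below works under that assumption: it treats the "Teen Birth", "Age", and "Age Squared" entries as the intercept, linear, and quadratic terms of the teen-birth effect. This reading is inferred from the printed numbers, not stated in the table, and the function names are illustrative. Because the published coefficients are rounded, the results match the table only approximately.

```python
# Coefficients copied from the parametric panel of Table 10:
# (Teen Birth, Age, Age Squared) for three dollar-valued outcomes.
# ASSUMPTION: effect(age) = teen_birth + age_coef * age + age_sq_coef * age**2.
COEFFS = {
    "earnings": (54.03, -193.17, 8.29),
    "spouse_earnings": (49721.17, -4405.81, 97.99),
    "welfare_income": (-3773.40, 457.02, -11.82),
}


def effect_at_age(outcome, age):
    """Estimated effect of a teen birth on `outcome` at a given age."""
    b0, b1, b2 = COEFFS[outcome]
    return b0 + b1 * age + b2 * age ** 2


def cumulative_effect(outcome, first_age=18, last_age=30):
    """Sum of the age-specific effects over every age in the range."""
    return sum(effect_at_age(outcome, a) for a in range(first_age, last_age + 1))


for age in (18, 28, 30):
    print(age, round(effect_at_age("earnings", age), 1))
print("sum 18-30:", round(cumulative_effect("earnings"), 1))
```

Summing the fitted effect over each age from 18 to 30 comes close to the bracketed cumulative impacts reported for the NLSY79 parametric column of Table 11, which suggests the brackets are sums over single years of age.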
Table 11. Summary of Estimated Effects of Teen Childbearing on Socio-Economic Outcomes, HMS 2002 and NLSY

Estimate at Age 28, Except as Noted; Sum of Impacts Age 18-30 in Brackets

                        HMS 2002                          NLSY79
Outcome                 Non-Parametric    Parametric      Non-Parametric    Parametric
High School (age 21)    -0.107            -0.146          -0.080            -0.080
HS or GED               0.080             0.046           0.004             -0.010
Some College            --                --              -0.066            -0.060
Hours Worked            368.9             303.6           308.6             156.7
                        [2964.0]          [2313.6]        [1751.5]          [780.9]
Own Earnings            9269.7            6660.4          2086.7            1141.4
                        [66868.6]         [46784.3]       [15898.1]         [3986.4]
Spouse Earnings         1269.9            7511.8          2803.8            3181.7
                        [61281.0]         [45444.0]       [40209.9]         [23336.2]
Welfare Income          -372.3            -1017.8         -470.5            -242.5
                        [2784.4]          [1619.2]        [-1780.5]         [2889.1]
Married                 -0.082            -0.033          -0.112            -0.02
Poor                    -0.106            -0.139          -0.040            0.03
Food Stamps             -0.097            -0.152          -0.166            -0.11
AFDC                    -0.043            -0.045          -0.085            -0.05
Figure 1. Average Own Earnings by Age, 1994 Dollars, Teen Pregnancy Sample
[Plot not reproduced. Horizontal axis: Age (15 to 40). Vertical axis: Earnings ($0 to $30,000). Series: Reported Earnings; Corrected Earnings.]
Reported earnings are from HMS 2002; corrected earnings are adjusted properly to 1994 dollars. Sample is persons with earnings.
Figure 2. Average Spouse Earnings by Age, 1994 Dollars, Teen Pregnancy Sample
[Plot not reproduced. Horizontal axis: Age (15 to 40). Vertical axis: Earnings ($10,000 to $70,000). Series: Reported Spouse Earnings; Corrected Spouse Earnings.]
Reported earnings are from HMS 2002; corrected earnings are adjusted properly to 1994 dollars. Sample is persons with earnings.
Figure 3. Average Scaling Error by Age
[Plot not reproduced. Horizontal axis: age, with ticks from 18 to 34. Vertical axis: 0% to 90%.]
Appendix Table 1 – Comparison of Food Stamp Dollars, NLSY and HMS Data, Selected Cases

Case ID Number and Year    NLSY Value    HMS Value    NLSY in 1978 $

ID=10259
1978                       $65.00        $65.00       $65.00
1979                       $657.00       $589.92      $590.03
1980                       $1,068.00     $845.06      $845.07
1981                       $1,656.00     $1,187.63    $1,187.80
1982                       $1,920.00     $1,297.13    $1,297.24
1983                       $1,980.00     $1,296.00    $1,296.14
1984                       $1,608.00     $1,008.94    $1,009.06
1985                       $1,656.00     $1,003.31    $1,003.45
1986                       $264.00       $157.03      $157.05
1987                       $1,242.00     $712.83      $712.84
1988                       $0.00         $0.00        $0.00
1989                       $1,920.00     $1,009.50    $1,009.55
1990                       $3,000.00     $1,496.44    $1,496.56
1991                       $1,080.00     $516.94      $517.00

ID=6234
1978                       $1,800.00     $1,123.27    $1,800.00
1979                       $924.00       $720.14      $829.82
1980                       $864.00       $487.17      $683.65
1981                       $588.00       $451.99      $421.76
1982                       $684.00       $115.52      $462.14
1983                       $0.00         $321.89      $0.00
1984                       $684.00       $418.08      $429.23
1985                       $684.00       $103.59      $414.47
1986                       $0.00         $366.75      $0.00
1987                       $852.00       $424.80      $489.00
1988                       $732.00       $498.33      $403.44
1989                       $1,008.00     $612.87      $530.01

ID=10314
1978                       $744.00       $1,781.72    $744.00
1979                       $2,004.00     $1,852.78    $1,799.74
1980                       $2,364.00     $1,823.25    $1,870.54
1981                       $2,520.00     $1,868.53    $1,807.52
1982                       $2,796.00     $1,650.47    $1,889.11
1983                       $2,475.00     $1,633.50    $1,620.18
1984                       $2,532.00     $1,673.25    $1,588.90
1985                       $2,808.00     $1,701.38    $1,701.50
1986                       $0.00         $0.00        $0.00
1987                       $4,080.00     $2,248.50    $2,341.69
1988                       $4,080.00     $2,010.00    $2,248.66
1989                       $3,672.00     $2,139.19    $1,930.76
Note: Bolded entries indicate that HMS values are in 1978 dollars.
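The "NLSY in 1978 $" column of Appendix Table 1 can be reproduced by deflating the nominal NLSY value with a price index. The Python sketch below assumes the index is the CPI-U annual average (1982-84 = 100). That choice is an assumption here, adopted because it reproduces the printed column for the years shown. The function names and the 50-cent tolerance are likewise illustrative.

```python
# CPI-U annual averages, 1982-84 = 100.
# ASSUMPTION: this is the deflator behind the "NLSY in 1978 $" column.
CPI_U = {
    1978: 65.2, 1979: 72.6, 1980: 82.4, 1981: 90.9,
    1982: 96.5, 1983: 99.6, 1984: 103.9,
}


def to_1978_dollars(nominal, year):
    """Deflate a nominal dollar amount from `year` to 1978 dollars."""
    return round(nominal * CPI_U[1978] / CPI_U[year], 2)


def hms_looks_like_1978_dollars(hms_value, nominal, year, tolerance=0.50):
    """True if the HMS value is within `tolerance` of the 1978-dollar value."""
    return abs(hms_value - to_1978_dollars(nominal, year)) <= tolerance


# ID=10259, 1979: nominal $657.00, HMS value $589.92.
print(to_1978_dollars(657.00, 1979))
print(hms_looks_like_1978_dollars(589.92, 657.00, 1979))
# ID=6234, 1979: nominal $924.00, HMS value $720.14.
print(hms_looks_like_1978_dollars(720.14, 924.00, 1979))
```

A check of this kind separates rows where the HMS value tracks the 1978-dollar figure from rows where it does not.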
27 There are 17 cases that do not report a teen birth on the basis of the 1984-92 information, but in their fertility history report a birth at age 18.75 or earlier, thus indicating a pregnancy that began at age 17 or earlier. There is, however, no way to know whether this was a first pregnancy. I have chosen not to include these cases.
Data Appendix – Coding of Teen Fertility Events in NLSY
Coding of the outcome of an early teen pregnancy is based primarily on information available in the 1984-86, 1988, 1990, and 1992 interviews. In each of these years, there is a pair of variables indicating the age at the beginning of a first pregnancy and the outcome of that pregnancy. I use all years of data in this coding, since some cases appear in the file only in some years or provide information about, for example, age at first pregnancy only in a single year. From 1984 to 1990, the outcome variable is coded into four categories: birth, abortion, miscarriage, or stillbirth. In 1992, the last two categories are combined. It appears that HMS included stillbirths in their miscarriage category.
In 1982 and 1983, there is information on the outcome of a pregnancy that occurred prior to a first birth and the year and month in which that first pregnancy ended. This can be combined with information on own date of birth to construct an approximate age at the beginning of a first pregnancy that ended in other than a live birth. In most cases, the 1982-83 information is translated into and is consistent with the 1984-1992 information. There is some occasional inconsistency across years.
I code teen fertility in two steps. First, I use the 1984-92 variables to identify cases that ever report a first pregnancy at age 17 or less along with one of the designated fertility outcomes. The relevant pairs of variables are 1984: r1522057 and r1522056; 1985: r1892759 and r1892758; 1986: r2259859 and r2259858; 1988: r2879800 and r2879700; 1990: r3409900 and r3409800; and 1992: r4009469 and r4009468. I then check the 1982-83 pregnancy information and the information on the date of first birth to identify additional cases not otherwise classified that report a pregnancy that began at age 17 or earlier. I further check these cases against what I treat as the more reliable information in the 1984-92 data. Where there are conflicts that cannot be resolved, I rely on the 1984-92 data. Like HMS, I exclude the military subsample and the poor whites in the supplementary sample, even in those years where they are available.
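The first coding step can be sketched in Python as follows. This is a minimal illustration, not the code used for the paper: the record layout (one tuple of survey year, age at first pregnancy, and outcome per interview) and the labels "unclassified" and "conflict" are hypothetical, and the sketch does not say which member of each NLSY variable pair holds the age and which holds the outcome.

```python
TEEN_AGE_CUTOFF = 17
OUTCOMES = {"birth", "abortion", "miscarriage", "stillbirth"}


def teen_outcomes(reports):
    """Collect every outcome reported with a first pregnancy at age 17 or less.

    `reports` is an iterable of (survey_year, age_at_first_pregnancy, outcome)
    tuples, one per interview; None marks a missing value (hypothetical layout).
    """
    found = set()
    for _year, age, outcome in reports:
        if age is None or outcome is None:
            continue  # this interview provides no usable pair
        if age <= TEEN_AGE_CUTOFF and outcome in OUTCOMES:
            found.add(outcome)
    return found


def classify(reports):
    """Step 1: classify a case from its 1984-92 reports alone."""
    found = teen_outcomes(reports)
    if not found:
        return "unclassified"  # candidate for the 1982-83 check in step 2
    if len(found) == 1:
        return next(iter(found))
    return "conflict"  # needs case-by-case review


case = [(1984, None, None), (1986, 16, "miscarriage"), (1988, 16, "miscarriage")]
print(classify(case))
```

Cases returned as "unclassified" are the ones that would move on to the second step, where the 1982-83 pregnancy information and date of first birth are checked.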
Using just the 1984-92 data, there are 773 teen births, 181 teen abortions, and 67 teen miscarriages (including 11 stillbirths). An additional 10 cases have 1982-83 information that indicates, either by itself or in conjunction with the 1984-92 data, there was a miscarriage that could have begun at age 17 or earlier. After checking the 1982-83 information against the 1984-92 information, I conclude that six of these cases (IDs= 1302, 3952, 5774, 6715, 8194, and 8289) are probably teen miscarriages, one is a miscarriage in a pregnancy that began at age 18 (1000), and three are teen abortions (626, 4645, and 11876). This leaves a grand total of 73 miscarriages, including 11 stillbirths.
There are also another eight cases that appear to be teen abortions based on the 1982-83 information. Of these, two are inconsistent with the 1984-92 information for the age at which the pregnancy began, but the other six appear to be genuine teen abortions (IDs 3001, 5659, 5927, 5930, 6772, and 8103). This yields a total of 187 teen abortions.27
28 The actual abortion count in HMS is 184. One additional case is coded as an abortion but is excluded from the sample by a filter that erroneously codes it as missing in each sample year.
HMS 2002 report 727 teen births, 185 abortions, and 68 miscarriages (including stillbirths). Of the 68 HMS teen miscarriages, eight are stillbirths (Case ID= 1985, 2214, 5589, 6807, 7074, 7439, 7774, 11802); one is coded abortion in all years (ID=4645); and three occurred at age 18 or older (ID = 5171, 8614, 10000). This leaves 56 legitimate teen miscarriage cases, or 64 including the stillbirths.
In addition, there are nine teen miscarriages that are not included in the HMS miscarriage sample and that are readily identified by the 1984-1992 fertility information. Six are genuine teen miscarriages (ID= 590, 845, 4252, 4295, 4787, 5896). Three are stillbirths (ID = 96, 3191, 8260), but I include them because HMS included stillbirths as miscarriages, too. Including the stillbirths, there is a maximum of 73 teen miscarriages, consisting of 11 stillbirths and 62 miscarriages.
Of the 185 HMS teen abortion cases,28 three are teen births (IDs= 1071, 1548, 3051), three are teen miscarriages (IDs= 3191, 4252, 5896), and nine are not teen pregnancies or have insufficient information to evaluate as teen abortions (IDs= 343, 1250, 1558, 5044, 9371, 9521, 9863, 9927, 11968). There are also 17 legitimate teen abortion cases not included as teen abortions in the HMS file. All are coded as a teen abortion in most years (the code may be missing in some years); the case IDs are 438, 692, 1377, 1940, 2217, 2221, 2402, 2552, 2952, 3185, 4474, 4645, 4665, 5195, 5681, 5682, and 9865. The net abortion count is 185 (HMS) - 15 (invalid) + 17 (not on HMS file) = 187.
The HMS file includes 727 teen births. Of these, 19 do not have any reported information of a teen birth: one is a teen miscarriage (ID=845), one is a teen abortion (9865), three are the result of pregnancies at age 18 (842, 4802, 8319), and 14 have no reported information about how the pregnancy ended or age at pregnancy for 1982-1992. In addition, there are 65 cases that report a teen birth at age 17 or younger that are excluded from the HMS file. A complete listing is available on request. The net teen birth count is 773.
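The net counts in this appendix all follow the same arithmetic: start from the HMS reported count, subtract the cases that do not belong, and add the valid cases HMS omitted. The short Python sketch below restates that reconciliation with the numbers given in the text; the function name `net_count` is illustrative.

```python
def net_count(hms_reported, invalid, omitted):
    """HMS count, less misclassified cases, plus valid cases HMS left out."""
    return hms_reported - invalid + omitted


# Births: 19 invalid (1 miscarriage, 1 abortion, 3 pregnancies at age 18,
# 14 with no reported information); 65 valid teen births excluded by HMS.
births = net_count(727, 1 + 1 + 3 + 14, 65)

# Abortions: 15 invalid (3 births, 3 miscarriages, 9 not teen pregnancies
# or with insufficient information); 17 valid teen abortions not on the file.
abortions = net_count(185, 3 + 3 + 9, 17)

# Miscarriages, counting stillbirths as HMS did: 4 invalid (1 abortion,
# 3 at age 18 or older); 9 valid cases not in the HMS miscarriage sample.
miscarriages = net_count(68, 1 + 3, 9)

print(births, abortions, miscarriages)
```

The three results agree with the totals stated in the text: 773 teen births, 187 teen abortions, and 73 teen miscarriages including stillbirths.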