Working Paper Series∗

Department of Economics

Alfred Lerner College of Business & Economics

University of Delaware

Working Paper No. 2003-08

The Socio-Economic Effects of Teen Childbearing

Re-considered: A Re-Analysis of the Teen

Miscarriage Experiment†

Saul D. Hoffman

September 2003

∗ http://www.be.udel.edu/economics/workingpaper.htm

† © 2003 by author(s). All rights reserved.

The Socio-Economic Effects of Teen Childbearing Re-considered: A Re-Analysis of the Teen Miscarriage Experiment

Saul D. Hoffman

Department of Economics

University of Delaware

Newark, DE 19716

[email protected]

September 16, 2003

This research was supported by grants from the W. T. Grant Foundation and the Charles Stewart Mott Foundation; this financial support is gratefully acknowledged. Helpful comments were received from Robert Plotnick, Kristen Moore, and Michael Foster.

Abstract

In an important contribution to the literature on the socio-economic impacts of teen childbearing, Hotz, McElroy, and Sanders used a natural experiment based on the random occurrence of miscarriages. They concluded that the negative impacts of teen childbearing had been substantially exaggerated. In a replication of their work, I identify a number of important errors that undermine their results. Correction and re-estimation with their data show substantially smaller impacts on income variables. Re-estimation with a new data set yields impacts that are smaller yet. The re-estimation generally does not alter the sign of the estimated effects, but does lead to a much more modest conclusion.

I. Introduction

In 1997, Hotz, McElroy, and Sanders contributed a chapter to Kids Having Kids, an

important and heavily publicized research volume about teen childbearing funded by the Robin

Hood Foundation and published by the Urban Institute. It was, according to the foreword, “the

first comprehensive effort to identify the extent to which the undesirable outcomes of teen

pregnancy are attributable to adolescent pregnancy itself rather than to the wider environment in

which most of these pregnancies and the subsequent child rearing take place” (Maynard, 1997,

p. IX). The book included substantive chapters on the impact of teen childbearing on mothers,

fathers, and children.

Their chapter, which focused on the costs of childbearing to the young women,

concluded that teen childbearing, properly analyzed, was actually beneficial, not only to the teen

mothers, but also to the Federal government in terms of net taxes and transfers. They found

that the teen mothers –in this study, women who had a first pregnancy at age 17 or earlier that

resulted in a birth– worked more, earned more, married men who earned higher incomes, and

received less support from welfare through their mid-20s to early 30s than if they had delayed

their childbearing. Using these estimates, they estimated that Federal expenditures on the

women who became teen mothers would actually increase by $4.0 billion, net of income taxes

paid, if they had delayed their childbearing. They concluded, referring to their findings and the

findings of two other earlier studies, that “the failure to account for selection bias vastly

overstates (emphasis in original) the negative consequences of teenage childbearing and [the

findings] certainly provide no support for the view that there are large negative consequences of

teenage childbearing per se for the socioeconomic attainment of teen mothers” (p. 81). They

also wrote that “these findings call into question the view that teenage childbearing is one of the

nation’s most serious social problems, at least when one measures its severity in terms of costs

to taxpayers.” Their findings were prominently featured in the foreword to the book which noted

that “the basic message of the book is that early parenting itself has little effect on the mothers’

education or earnings” (Maynard, 1997, p. IX). This was a controversial chapter, to say the

least.

In this paper, I examine this important and controversial contribution to the literature on

the socio-economic impact of teen childbearing. I first outline the main arguments of the article,

including its methodological contribution and its substantive findings. I then turn to a direct re-

estimation of their data, guided by the data and programs that the authors shared with me. I

examine their data and sample construction carefully, undoubtedly far more carefully than the

average referee –or even a particularly conscientious one– examines a typical journal article.

I can summarize my findings quite easily. There is much to admire in their

methodological contribution to the teen childbearing literature. They joined a growing literature,

originating with Geronimus and Korenman (1992) and including Hoffman, Foster, and

Furstenberg (1993), and Grogger and Bronars (1993), which argues that traditional research

approaches overestimate the causal impact of teen childbearing by incorrectly attributing to teen

childbearing the impact of unobserved individual, family, and community factors that are

correlated with both teen childbearing and the particular outcomes under examination. Their

particular methodological twist was to identify a natural comparison group consisting of teens

whose first pregnancies ended in a miscarriage. Since miscarriage is usually a random event

that results in a delay in the age of a first birth, a comparison of outcomes for teens whose first

pregnancy ended in a birth with outcomes for teens whose first pregnancy ended in a

miscarriage ought to provide an unbiased estimate of the impact of a teen birth. Such a

comparison is, in their words, “a natural experiment,” and, as such, it shares many of the

strengths of a random-assignment experiment which, in the context of teen childbearing, could,

of course, not otherwise be implemented.

The application of this methodology is, however, marred by a series of data processing

errors. I identify and discuss five such kinds of errors: 1) the incorrect re-scaling of 1978-1992

incomes into a common year, which makes some years’ incomes more than twice the correct

real income; 2) inclusion of data points outside the data window available in the NLSY; 3) an

improbable coding of AFDC, Food Stamp, and total welfare income; 4) questionable coding of

teen fertility for a substantial number of cases; and 5) errors in using sample weights that result

in non-response cases being included in their analyses.

HMS have written two related papers that use the miscarriage methodology to estimate

1 There is also a primarily methodological piece, Hotz, Mullin, and Sanders (1997). I do not discuss that paper here.

2 The KHK analyses include teens who became pregnant at ages 18 and 19, although they are not the focus of any direct analysis. The analyses in HMS 2002 exclude these women. HMS 2002 includes additional analyses, both in terms of the dependent variables considered and the specifications of age effects that are estimated.

3 In a personal correspondence in answer to my query about whether the data sets in the two analyses were the same, one of the authors indicated that “to the best of my knowledge, the answer to your question is ‘yes’ although there may be a few changes.”

4 These two figures show average earnings (Fig. 3.3) and AFDC+Food Stamp Benefits (Fig. 3.4) for teen mothers and women who did not have a teen birth. I have had to re-scale the dollar values to replicate the figures.

the causal impact of teen childbearing.1 One is the chapter in Kids Having Kids, referred to

hereafter as KHK. The other is a revised version, referred to here as HMS 2002. I also use

HMS without a year to refer to the authors themselves. The authors note that the results in the

two papers are substantively similar, and that some minor errors in KHK were corrected in HMS

2002. There are some differences in model specification and sample.2

At the request of the authors, I focus here on HMS 2002 rather than KHK. Where

appropriate, I indicate if there are substantial differences between the two papers. It appears

that the same data was used in both analyses, except perhaps for the addition of one more year

of data for the 2002 analyses.3 Using the data for HMS 2002, I can replicate exactly Figures 3.3

and 3.4 in their chapter in Kids Having Kids.4 This data correspondence between the two

papers is important, because it means that my findings here also apply to the findings of the

chapter in Kids Having Kids.

In the next section, I review the new literature on the effects of teen childbearing and

then discuss the methodology and findings in KHK and HMS 2002. In the following section, I

discuss the various errors summarized above and evaluate their likely impact. I think most

reasonable, seasoned empirical researchers will appreciate the issues and agree that the errors

are genuine and go beyond the kind of innocuous errors and subjective decisions that are part

of almost any social science research project. An interesting question that I examine there is

whether these errors could have been identified in the refereeing process, short of the direct re-

examination of the data that I undertook with the assistance of the authors. In the fourth

section, I outline my own findings from a correction and re-analysis of their data and an original

re-analysis of the underlying NLSY data.

As a researcher who has examined some of the same issues that HMS analyze, I view

their articles as major contributions to the literature. My goal here is to determine exactly what

can be learned from implementing their proposed research methodology with the care and

attention to detail that it deserves.

II. The New Approach to The Consequences of Teen Childbearing – Methods and Findings

The effect of teen childbearing on subsequent socio-economic outcomes has been one

of the most researched topics in family demography. Campbell’s (1968) conclusion that “when a

16 year old girl has a child... 90 percent of her life’s script is written for her” is a well-known

pessimistic assessment, but it is no longer widely accepted. It has been well understood for

more than a decade that there is a wide variation in outcomes (Furstenberg, Brooks-Gunn and

Morgan, 1987; Duncan and Hoffman, 1990) and that many teen mothers may do well, especially

with the passage of time. Still, as of the late 1980s, the consensus research view was that teen

childbearing was a serious problem and that it exacerbated the disadvantages of the young

women. The summary in Risking the Future, the 1987 report of the National Research Council,

is well known: “Women who become parents as teenagers are at greater risk of social and

economic disadvantage throughout their lives than those who delay childbearing” (Hayes,

1987).

The traditional literature evaluated the consequences of teen childbearing primarily

using a multivariate regression of the form:

(1) $Y_i = X_i B + \alpha T_i + \varepsilon_i$

where Y is some outcome of interest, X are other control variables that affect outcome Y, T is a

measure of a teen birth (either dichotomous or age at first birth) and $\alpha$ is the effect of a teen

birth on outcome Y. The model was estimated either by OLS or logit/probit, depending on the

nature of the dependent variable. Outcomes were typically measured some years subsequent

to the teen birth, often when the women were in their mid-20s, although this varies across

studies. Studies like these, which appeared from the mid-1970s through the 1980s, were the

5 Technically, $\lambda_{Z,T}$ is the coefficient on T from a regression of Z on X and T.

6 As a practical matter, there is no accepted statistical test to establish that all relevant variables have been included. Any and all multivariate specifications are potentially at risk for criticisms that they are plagued by specification bias due to omitted variables.

basis for the negative prognoses of the consequences of teen childbearing summarized in

Risking the Future.

The specification in (1) was, however, fair game for criticism that the estimates were not

convincingly causal and that they could well reflect omitted individual, family, and neighborhood

variables that affected both the outcome (education, income, etc) and the probability of a teen

birth. In a regression context, if there were a single omitted variable Z, the bias would be the

product of two terms:

(2) Bias = $\beta_Z \times \lambda_{Z,T}$.

In (2), the first term is the effect of the omitted variable on outcome Y and the second

term has the sign of the partial correlation, conditional on the Xs in equation (1), between Z and

T.5 Let Z be measured so that more of it is a good thing and T be a dummy variable coded “1”

for a teen birth. With Z omitted, T then captures the correlated effect of Z on Y, which is

precisely the bias term in (2). For negative outcomes such as poverty and welfare receipt, both

terms in (2) would likely be negative, which means that standard estimates overstate the

negative consequences of teen childbearing. For positive outcomes, such as income or

educational attainment, $\beta_Z > 0$, while $\lambda_{Z,T}$ is still negative. That would make standard estimates

too negative. Thus, this critique suggests that standard (multivariate) estimates would be too

large in absolute value compared to true causal effects.
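
The bias expression in (2) can be checked numerically. The following is a minimal sketch on simulated data, not on the NLSY; the sample size, the coefficient values, and all variable names are assumptions chosen only for illustration. It verifies that the coefficient on T from a regression that omits Z equals the coefficient from the full regression plus the product of the effect of Z on Y and the coefficient on T from a regression of Z on X and T.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Simulated data; every parameter value here is an illustrative assumption.
x = rng.normal(size=n)                    # an observed control X
z = 0.5 * x + rng.normal(size=n)          # omitted factor Z (more is better)
# A teen birth (T = 1) is less likely when Z is high.
t = (rng.normal(size=n) - 0.8 * z > 0.5).astype(float)
alpha, beta_z = -1.0, 2.0                 # assumed true effects of T and Z
y = 1.0 + 0.3 * x + alpha * t + beta_z * z + rng.normal(size=n)


def ols(columns, target):
    """Least-squares coefficients of target on a constant plus columns."""
    design = np.column_stack([np.ones(len(target))] + list(columns))
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef


full = ols([x, t, z], y)    # Z included: [const, X, T, Z]
short = ols([x, t], y)      # Z omitted:  [const, X, T]
aux = ols([x, t], z)        # Z on X and T; the T coefficient is lambda

alpha_full, beta_hat = full[2], full[3]
alpha_short = short[2]
lam = aux[2]
bias = beta_hat * lam       # the product in equation (2)

print(f"alpha with Z included: {alpha_full:.3f}")
print(f"alpha with Z omitted:  {alpha_short:.3f}")
print(f"beta_Z x lambda:       {bias:.3f}")
```

With Z a "good" factor and T negatively related to it, the product is negative, so the estimate that omits Z is more negative than the one that includes it.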

The solution, within the traditional framework, is to include better and/or additional

measures of explanatory variables, but this is almost inevitably an unconvincing exercise to a

skeptical audience.6 Instead, the post-Risking the Future literature proceeds in one of two

ways, following the lead of equation (2).

Fixed-effect sibling models were introduced to this literature by Geronimus and

Korenman (1990,1992), who compared outcomes for pairs of sisters, one of whom had a teen

birth, while the other did not. Expanding (1) to allow for common family effects, the model is:

7 In the NLSYW, differences in the sisters’ own family incomes were quite small and were not statistically significantly different from zero, as were differences in the probability of high school graduation. In the PSID and NLSY, the average difference in economic well-being (family income divided by the poverty line) between sisters was about one-third. There were also big differences between the sisters in the probability of being poor, receiving welfare, and in educational attainment. The findings were unusually consistent across the two data sets and the two sets of authors.

8 A random assignment experiment is impossible for obvious reasons.

(3) $Y_{ij} = X_i B + \alpha T_i + \gamma F_j + \varepsilon_i$

where “j” now indicates the family in which individual “i” grew up and $\gamma F_j$ is the impact on

outcome Y of growing up in a particular family. In this specification, F is exactly the Z from

above. As above, let F be measured such that more is better and assume that T and F are

negatively correlated – being raised in a “better” family reduces the probability of a teen birth.

Then, if F is unobserved and hence omitted from equations like (1), we have exactly the over-

estimated effect due to bias.

Geronimus and Korenman’s important contribution was to use data on sisters to

eliminate the unobserved family effect. If the family effect on outcome Y is constant for sisters

within a family, it will not affect differences between them. The first estimates of this model used

data from the NLSYW (Geronimus and Korenman, 1990); subsequent implementation of the

model by Geronimus and Korenman (1992) and Hoffman, Foster, and Furstenberg (1993) used

more current and more representative data from the NLSY and PSID. The NLSYW estimates

show not only that the causal effects of a teen birth were smaller than in the conventional

studies, but also that they were very small and often essentially zero. The PSID and NLSY

estimates are also smaller than in the conventional studies, but they are substantially larger

than the NLSYW estimates and arguably large enough to be important for policy purposes.7
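
The sister-difference argument can be written out explicitly. As a sketch in the notation of equation (3), with the subscripts 1 and 2 denoting two sisters from family j (labels introduced here only for illustration), subtracting one sister's equation from the other's gives:

```latex
\begin{align*}
Y_{1j} &= X_1 B + \alpha T_1 + \gamma F_j + \varepsilon_1 \\
Y_{2j} &= X_2 B + \alpha T_2 + \gamma F_j + \varepsilon_2 \\
Y_{1j} - Y_{2j} &= (X_1 - X_2) B + \alpha (T_1 - T_2) + (\varepsilon_1 - \varepsilon_2)
\end{align*}
```

The common term $\gamma F_j$ drops out, so $\alpha$ is estimated from pairs in which $T_1 \neq T_2$, provided the family effect really is the same for both sisters.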

The alternative approach is to identify and implement a natural experiment for teen

childbearing.8 An ideal natural experiment eliminates the bias problem, since by construction

$\lambda_{Z,T} = 0$ for all Z when the assignment to “control” or “experimental” group is random by natural

means. Two natural experiment approaches have been presented, one based on twins and the

other based on miscarriages.

The twins experiment, due to Grogger and Bronars (1993), uses the difference in

outcomes between teen mothers of twins and teen mothers with single births to assess the

difference in outcomes between teen mothers and women who do not have a birth, i.e., to

measure the consequences of teen childbearing. Since twinning is random, $\lambda_{Z,T} = 0$ in equation

(2), and the resulting estimate of the effect of teen childbearing ought to be unbiased. While the

estimated effects are not zero in this study, they are relatively small and certainly smaller than in

the multivariate literature. For example, the probability of graduating from high school is about

four percentage points lower for the teen mothers of twins and their family income is about

$1100 (10 percent) less. The mothers of twins were also slightly more likely to be in poverty

and on welfare.

Grogger and Bronars note that the use of the twins experiment to estimate the costs of a

teen birth is appropriate only if the impact of having twins is a linear function of the number of

children. In that case, the observed difference in outcomes between mothers of twins and

mothers of singletons is equivalent to the unobserved difference between mothers with single

births and women with no birth, which is the impact of interest. If, however, the marginal impact

of a second, simultaneous child (i.e., a twin) is essentially zero, the twins natural experiment

would seriously under-estimate the true impact of a teen birth. In fact, as the authors show,

there are substantial increasing returns to scale in the time cost of children (which implies

declining marginal cost) in the sense that mothers of twins do not spend anywhere near twice

the time in childcare as mothers of one child. This means that the twins experiment understates

the impact of a teen birth, perhaps dramatically so. The authors quite clearly note that “the

twins approach provides a conservative estimate of the effect of an unplanned teenage

singleton first birth” (p.161).

The most recent and most well-known natural experiment study is Hotz, McElroy, and

Sanders’s chapter in Kids Having Kids and their 2002 paper, which are the focus of this review

paper. Their analyses are based on a comparison of young teen mothers –in their study,

adolescents who became pregnant at age 17 or younger– with girls who had become pregnant

by the same age but suffered a miscarriage instead. Since most miscarriages are random, the

same argument as above carries over, and the “experiment” should yield an unbiased estimate

9 Because the two groups are more similar in terms of their early sexual behavior, they may also be more similar in terms of unmeasured variables. But that is irrelevant if miscarriages are random.

10 Also included are dummy variables reflecting an early pregnancy and having a birth at ages 18 or 19.

of the consequences of teen childbearing.9 The subsequent differences in outcomes between

the two groups thus ought to provide an estimate of the causal effects of a teen birth.

In practice, there are some complications to this methodological design. The ideal

comparison group for the natural experiment is women who would have chosen to have a birth,

but were randomly prevented from doing so by virtue of a miscarriage. As HMS note, some

women who have a miscarriage would actually have chosen to have an abortion had the

miscarriage not intervened. These women cannot be identified in the data. Thus, the authors

do not quite proceed as if this were a natural experiment, i.e. by using the difference-in-

difference estimator commonly used in natural experiments. Instead, they estimate an

instrumental variables model, using a teen miscarriage as an instrument for a teen birth. To be

a valid instrument for a teen birth, a teen miscarriage must satisfy two basic statistical

requirements: it must be correlated with a teen birth and it must be uncorrelated with the

outcomes of interest, conditional on the other independent variables. The former condition is

clearly satisfied; the latter is plausible, but might not be valid, if a miscarriage had an

independent effect. There is, however, no simple way to test for this.
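
The logic of using a miscarriage as an instrument can be illustrated with simulated data. This is a minimal sketch under stated assumptions, not the HMS specification: it has no control variables or age terms, it uses the simple Wald form of the instrumental variables estimator for a binary instrument, and every parameter value and variable name is hypothetical. Miscarriages are drawn independently of the unobserved factor Z, and some women who do not miscarry are assumed to end the pregnancy in abortion.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000
alpha, beta_z = -1.0, 2.0              # assumed true effects (illustrative)

z = rng.normal(size=n)                 # unobserved factor Z
miscarriage = rng.random(n) < 0.15     # random, independent of Z
# Absent a miscarriage, a birth is more likely when Z is low; the
# remaining pregnancies are assumed to end in abortion.
would_give_birth = z + rng.normal(size=n) < 0.5
birth = (~miscarriage & would_give_birth).astype(float)
y = alpha * birth + beta_z * z + rng.normal(size=n)

# Naive comparison of mean outcomes by birth status (contaminated by Z).
naive = y[birth == 1].mean() - y[birth == 0].mean()

# Wald / IV estimate: reduced-form difference over first-stage difference.
reduced_form = y[miscarriage].mean() - y[~miscarriage].mean()
first_stage = birth[miscarriage].mean() - birth[~miscarriage].mean()
iv = reduced_form / first_stage

print(f"true effect:    {alpha:.2f}")
print(f"naive estimate: {naive:.2f}")
print(f"IV estimate:    {iv:.2f}")
```

In this setup the naive comparison overstates the negative effect because births are concentrated among women with low Z, while the instrument recovers the assumed effect because it is unrelated to Z. If a miscarriage had its own effect on the outcome, the second requirement would fail and the IV estimate would no longer be reliable.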

In KHK, HMS examine a wide set of outcomes, including educational attainment, labor

supply and earnings, marriage and fertility, spouse earnings, and welfare receipt. They examine

outcomes from ages 18 to 35, which allows them to distinguish between short run and longer

run impacts, and which is an important contribution of their work. Their findings are summarized

in column (1) of Table 1. The results shown are taken from the text and figures, and are based

on a model with a quadratic or cubic function in age and interactions between the age terms

and a dummy variable for a teen birth. Control variables include race and ethnicity,

parents’ education, measures of family structure and family income at age 14, and an ability

measure (AFQT score).10 The key finding of the study is that by their late-20s, teen mothers do

better over a quite wide range of outcomes than their counterparts who had a miscarriage. The

teen mothers were less likely to have graduated from high school, but they were more likely to

have received a GED by an essentially offsetting amount. HMS find that the teen mothers

worked more and earned more than their counterparts, and their spouses had higher incomes.

The two earnings effects are quite large, and amount to approximately a 50% increment for

spouse earnings and 35% for own earnings. Differences in income from welfare between the

two groups were very small. The teen mothers were worse off in only two categories – they

had more births by age 30 and they spent more time as a single mother than did the teens with

miscarriages.

The corresponding estimates from HMS 2002 are shown in columns (2) - (4). Column

(4) comes from a model quite similar to KHK, although the sample differs as discussed in the

introduction. In columns (2) and (3), the estimates are for a non-parametric specification of age

plus age x teen birth interactions that provides separate estimates of impacts at each age. The

impacts are reasonably similar to those in KHK, although there are some differences between

the parametric and non-parametric specifications, especially for income and earnings variables.

New estimates of the impact of a teen birth on poverty and on receipt of Food Stamps and AFDC

show that a teen birth has a negative, though not statistically significant, impact. This holds for

many of the estimates: t-statistics, especially for the non-parametric estimates, are often quite

small. Certainly there is little evidence here that a teen birth has negative consequences for

socio-economic outcomes, as long as a GED is equivalent to a high school diploma.

Although not shown in the table, there is considerable year-to-year variation in the non-

parametric estimates for the income variables. For example, at age 28, it is estimated that the

teen mothers earned more than $9,000 more than the women who had a miscarriage, but at

age 31, they earn just $800 more with a standard error five times as large. At age 28, their

husbands earned more than $1,250 more, but with a standard error of $5,000. One year later,

the husbands are earning more than $14,000 more. At age 28, the women themselves earn

nearly $4.50 more per hour, but three years later, at age 31, they earn $8.00 less per hour (t-

stat=1.2). This may reflect the relatively small samples of teens with miscarriages at specific

ages and the changing sample composition, especially at older ages.

III. Data and Data-Processing Issues

In this section I consider and discuss a set of data and data-processing issues in HMS

2002 (and by extension in KHK). I do this in substantial detail, since, as is often the case in

empirical work and is certainly the case here, “the devil is in the details.”

The data-processing task of adapting the NLSY data into the form required for analysis

of the medium-term impacts of teen childbearing is formidable. There are multiple margins for

error. Data coding, especially of fertility events, in the NLSY is highly complex; even under the

very best of circumstances, errors or subjective and largely undocumented coding decisions

creep in. In this case, the task is further complicated because a long time series of data is

used, covering the years 1978-1992, which is, in turn, transformed into individual life-cycle ages

from 18 to 35. Of course, since only 15 years of data are observed, no single individual is

observed at all ages from 18-35.

A. Scaling Problems

HMS 2002 analyzes the impact of teen childbearing on incomes (own, spouse, and

transfer) that are observed between 1978 and 1992. Accordingly, HMS rescale the observed

nominal incomes into real incomes, using 1994 for this purpose. Unfortunately, they do this re-

scaling incorrectly. Rather than multiplying each year’s income by P94/Pt (the ratio of the price

level in 1994 to the price level in the observation year), they multiply by P94/P78, irrespective of

the year in which the income is observed. Thus, all incomes are multiplied by 2.27, which is the

ratio of the CPI in 1994 to the CPI in 1978 (148.2/65.2). There is one exception to this –1992

incomes are scaled properly by multiplying by P94/P92. As a result, all income variables except

those in 1978 and 1992 are too large.

This problem affects all income variables –own earnings and spouse earnings, plus a

woman’s wage rate and her family income, both of which are derived directly from these two

variables. It also probably affects welfare benefits received. I say “probably,” because as I

show below, there are other problems with that variable that make it difficult to assess exactly

how it was constructed and whether this particular error applies.

Table 2 documents the scaling problem for a few selected, but representative cases. In

order to be precise, I identify and list the underlying variables in the NLSY so that there will be

no misunderstanding. Information is shown for own earnings and spouse earnings from the

HMS data file and for the corresponding data drawn directly from the underlying NLSY variable.

The notation “P-C YR” in the NLSY variable names means that the variable is for the “Previous

Calendar Year,” so “P-C YR 79” is a question asked in the 1979 interview year that applies to

1978 income. The problem is evident: the values in the HMS datafile are always exactly 2.27

times the corresponding value from the NLSY file, rather than varying with the year of

observation, as they should. Note also that the 1992 values are scaled correctly, multiplied by

1.05. The scaling error is not limited to these cases, but is consistently, if incorrectly, applied to

all cases that I examined.

Because of this error, real incomes in all years except 1978 and 1992 are

incorrect. The scaling error increases annually: the error in each successive year is simply the

cumulative increase in the CPI from 1978 to the year in question. For example, 1980 incomes are

26% too high, 1985 incomes are 65% too high, and 1990 incomes are twice as high as the

correct value. 1992 incomes are correct.
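
The arithmetic behind these percentages can be sketched as follows. Only the 1978 and 1994 index values (65.2 and 148.2) come from the text above; the other values are CPI-U annual averages (1982-84 = 100) that I assume match the published BLS series, and the function names are illustrative.

```python
# CPI-U annual averages (1982-84 = 100). The 1978 and 1994 values appear in
# the text; the others are assumed from the published BLS series.
CPI = {
    1978: 65.2, 1979: 72.6, 1980: 82.4, 1981: 90.9, 1982: 96.5,
    1983: 99.6, 1984: 103.9, 1985: 107.6, 1986: 109.6, 1987: 113.6,
    1988: 118.3, 1989: 124.0, 1990: 130.7, 1991: 136.2, 1992: 140.3,
    1994: 148.2,
}
BASE_YEAR = 1994


def correct_deflator(year):
    """P94 / Pt: the multiplier that converts year-t dollars to 1994 dollars."""
    return CPI[BASE_YEAR] / CPI[year]


def applied_deflator(year):
    """Multiplier described in the text: P94/P78 for every year except 1992."""
    if year == 1992:
        return CPI[BASE_YEAR] / CPI[1992]
    return CPI[BASE_YEAR] / CPI[1978]


def scaling_error(year):
    """Proportional overstatement of real income for a given income year."""
    return applied_deflator(year) / correct_deflator(year) - 1.0


for year in (1978, 1980, 1985, 1990, 1992):
    print(year, f"{applied_deflator(year):.2f}", f"{scaling_error(year):.0%}")
```

The overstatement in any year other than 1992 reduces to CPI[year] / CPI[1978] - 1, which is why it grows with the calendar year.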

Taken at face value, the HMS data suggest that the teen mothers are faring quite well in

both the labor market and especially the marriage market. Their own mean earnings at age 30,

computed among those with positive earnings, equals a very respectable $26,485, a figure

roughly equal to the median earnings of all year-round full-time working women in 1992

($26,235). This figure is all the more impressive since only half of the women are high school

graduates, with another 25% earning a GED, and they are all relatively early in their careers.

Their working husbands are apparently doing remarkably well. When the women were age 30,

their husbands had mean earnings of $52,815, a whopping 60 percent more than the roughly

$32,000 (in $1994) earned by all year-round full-time working men in the mid-1980s and early

1990s. Actual earnings, corrected for the scaling problem, are a bit more reasonable and

realistic. Average own earnings and average spouse earnings at age 30, again computed

among those with positive earnings, are $13,813 and $27,489, respectively, in 1994 dollars.

Because the scaling error rises over time, it distorts age-earnings profiles. This is

particularly important, because one of the key contributions of the HMS analysis is to extend the

time period and thus allow for rebound or recovery effects. The distortion of age-earnings

profiles is shown in Figure 1 for own earnings and Figure 2 for spouse earnings; in both figures,

the sample includes only persons with positive earnings. The reported profiles are unusually

steep –own real earnings almost triple from age 20 to age 30 and spouse real earnings more

than double. The precipitous fall in reported real earnings at age 35 reflects the correct scaling

of 1992 income. The corrected profiles, also shown in the figures, are considerably less steep,

that is, show far less growth in earnings with age, and are far more consistent with labor market

data from other sources.

What is the likely impact of this scaling error on the regression estimates HMS report?

A standard units-of-measurement result shows that if the dependent variable is re-

scaled by some constant $\lambda$ and the independent variable is not re-scaled, then the

corresponding estimated coefficient is $\lambda$ times the original estimate. If the analyses were

organized by calendar year and estimated separately, this would be the expected result: the

bias would increase with calendar year.
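
The units-of-measurement result is easy to confirm on simulated data. In this minimal sketch the income values, the dummy variable, and the sample size are hypothetical; only the factor 2.27 is taken from the text.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 1_000

# Hypothetical data: a teen-birth dummy and an income outcome.
t = rng.integers(0, 2, size=n).astype(float)
income = 15_000 + 2_000 * t + rng.normal(scale=5_000, size=n)

lam = 2.27  # constant rescaling factor, as with P94/P78
design = np.column_stack([np.ones(n), t])

coef, *_ = np.linalg.lstsq(design, income, rcond=None)
coef_scaled, *_ = np.linalg.lstsq(design, lam * income, rcond=None)

print("original coefficients:", coef)
print("rescaled coefficients:", coef_scaled)
```

Both the intercept and the coefficient on the dummy are multiplied by the same factor, so a constant rescaling inflates the estimated effect without changing its sign or its t-statistic.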

However, the problem here is different, because the analyses are organized by age, not

year. Since the same age occurs in different years for different cohorts, different observations

at the same age are mis-scaled differently; the incomes at a given age are multiplied by $\lambda_c$

where the subscript denotes the birth cohort. Figure 3 shows the average scaling error by age;

the numbers shown are simple averages, without adjustment for the small difference in sample

size across the birth cohorts. As shown in the figure, the scaling error increases with age

through age 27, since older ages are, on average, observed in later years where the scaling

error is larger (except for 1992). The average scaling error is 34% at age 20, 67% at age 25,

and 76% at age 30. At age 28, the age at which many results are emphasized in HMS, the

average scaling error is 74%, with the range of scaling error running from 65% for the 1957 birth

cohort to 109% for the 1963 birth cohort and 0% for the 1964 birth cohort. After age 27, the

error is relatively constant at 75% and then dips a bit as the correct 1992 observations become

11 Variables such as marital status are reported contemporaneously, and thus run from 1979 to 1993.

a larger portion of the observed sample. At age 35, there is no scaling error, because all

observations are from 1992. The scaling error also increases with birth year, since the later

birth cohorts experience any given age in a later year.
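
The age pattern of the average scaling error can be reproduced with a short calculation. This is a sketch under stated assumptions: the index values other than 1978 are CPI-U annual averages that I assume match the published BLS series, each birth cohort from 1957 to 1964 is given equal weight as in the simple averages described above, and the function names are illustrative.

```python
# CPI-U annual averages (1982-84 = 100); values other than 1978 are assumed
# from the published BLS series rather than taken from the text.
CPI = {
    1978: 65.2, 1979: 72.6, 1980: 82.4, 1981: 90.9, 1982: 96.5,
    1983: 99.6, 1984: 103.9, 1985: 107.6, 1986: 109.6, 1987: 113.6,
    1988: 118.3, 1989: 124.0, 1990: 130.7, 1991: 136.2, 1992: 140.3,
}
COHORTS = range(1957, 1965)          # birth years 1957 through 1964
FIRST_YEAR, LAST_YEAR = 1978, 1992   # income years in the data window


def scaling_error(year):
    """Proportional overstatement; 1992 incomes were scaled correctly."""
    if year == LAST_YEAR:
        return 0.0
    return CPI[year] / CPI[FIRST_YEAR] - 1.0


def average_error_at_age(age):
    """Simple average across the cohorts observed at this age."""
    years = [b + age for b in COHORTS if FIRST_YEAR <= b + age <= LAST_YEAR]
    return sum(scaling_error(y) for y in years) / len(years)


for age in (20, 25, 28, 30, 35):
    print(age, f"{average_error_at_age(age):.0%}")
```

Weighting by cohort sample size, which this sketch omits, would shift the averages slightly.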

In general, since average incomes by age are consistently too large and the error

increases with age, it is likely that the resulting life-cycle pattern of teen birth effects will be

biased, with spuriously larger effects observed at older ages. I present evidence on this in the

results section.

B. Sample Construction

For their analyses, HMS 2002 uses data from the 1979 through 1993 NLSY interviews,

typically covering incomes for the preceding calendar year (1978 to 1992) to construct outcome

measures from ages 18 to 35.[11] KHK apparently uses one fewer year of data, only through the

1992 interview. In 1979, the NLSY sample was between ages 14 and 21 (i.e., the women were

born between 1957 and 1964). Table 3 below shows the sample size per birth cohort and the

range of ages (from 18 to 35) observed for each one. Younger ages are unobserved for the

earlier birth cohorts, older ages for the later cohorts. For example, the 1957 birth cohort is

never observed at an age younger than 21, which would have occurred in 1978 and been

reported on in the 1979 interview. The 1964 birth cohort is never observed at any age older

than 28, which would have occurred in 1992 and been reported on in 1993. The number of

years observed ranges from 11 (1964 cohort) to 15 (1957-1960 cohorts). The last column

shows the maximum number of data points each birth cohort could contribute if there were no

missing data, given the cohort sample size and the number of years of observations. The grand

total with no missing data is 13,468. I return to that figure below.
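
The age ranges described above follow mechanically from the birth years and the 1978-1992 income window. A minimal sketch, assuming age is measured as calendar year minus birth year (the function and variable names are mine, not from the NLSY or HMS files):

```python
FIRST_YEAR, LAST_YEAR = 1978, 1992  # income years covered by the 1979-1993 interviews
MIN_AGE, MAX_AGE = 18, 35           # age range analyzed

def observed_ages(birth_year):
    """Ages in [MIN_AGE, MAX_AGE] that fall inside the observation window.

    Assumes age = calendar year - birth year.
    """
    return [year - birth_year
            for year in range(FIRST_YEAR, LAST_YEAR + 1)
            if MIN_AGE <= year - birth_year <= MAX_AGE]

windows = {cohort: observed_ages(cohort) for cohort in range(1957, 1965)}

for cohort, ages in windows.items():
    # birth cohort, youngest observed age, oldest observed age, years observed
    print(cohort, ages[0], ages[-1], len(ages))

# Oldest age at which every birth cohort is still observed.
latest_common_age = min(ages[-1] for ages in windows.values())
```

Multiplying each cohort's number of observed years by its sample size (from Table 3, not reproduced here) gives the maximum person-year count discussed in the text.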

HMS properly exclude in their analyses those observations of the appropriate age that

fall in calendar years subsequent to the end of the data observation window, i.e. ages through

35 occurring after 1992 for which there was no data at the time of their analyses. But for many

variables, observations for ages falling in calendar years prior to the beginning of the

observation window (i.e., prior to 1978) are included, even though there is no valid data to

12 To make matters worse, about one-third of the 1977 cases coded as having received either AFDC or Food Stamps have missing data for the underlying variables on the dollar value of benefits received.


assign to these observations either.

Table 4 shows the extent of this problem for the major dependent variables. The

inappropriate data observations correspond to ages 18-20 for the 1957 birth cohort (“observed”

in 1975-77), ages 18-19 for the 1958 birth cohort (“observed” in 1976-77), and age 18 for the

1959 birth cohort (“observed” in 1977). In the table, the sample Ns for poverty, AFDC use, and

Food Stamp use (121, 222, and 352 in 1975-1977, respectively) are precisely the cumulative

sample sizes for the teen pregnancy sample from the 1957 birth cohort, the 1958 birth cohort, and the

1959 birth cohort; see Table 3 for the cohort sample sizes. The 1975 observations are solely

from the 1957 birth cohort, the 1976 observations are from the 1957 and 1958 birth cohorts, and

the 1977 observations are from the 1957-59 birth cohorts.

As seen in Table 4, out-of-sample observations are included in the analyses for seven of

the dependent variables used in the analysis. No out-of-sample observations are included for

own earnings, own wage rate, and hours worked. For poverty status and receipt of AFDC and

Food Stamps, out-of-sample observations are included for all persons in the 1957-59 birth

cohorts. In all three years, none of the observations are poor, which is very unlikely to be

anything other than some default assignment. Similarly, in 1975 and 1976, all of the

observations are coded as receiving AFDC and Food Stamps. In 1977, about one-third are

apparently receiving AFDC and Food Stamps.12 Spouse earnings and total welfare benefits for

1977 are included, but not for the two earlier years. Curiously, all spouses had earnings equal

to $0; this is another sure indication that something is wrong. It turns out that $0 is the default

spouse earnings assigned to unmarried women: the 178 women with included spouse earnings

in 1977 are all of the women coded as unmarried at those ages. Total welfare benefits in 1977

(in 1994 dollars) equal $3,137. For the variables with actual values rather than apparent

assignments (Food Stamps, AFDC receipt, and total welfare benefits, all in 1977), it is unclear

what the basis for the assignment of values is or could be. There is absolutely no data

whatsoever available for ages that fall in these years.

13 There is a further problem with the missing data. As I explain below, HMS do not correctly identify missing data cases and inadvertently include some of them in their analyses. The figures discussed in the text refer to their own identification of missing data.


Receipt of a HS diploma is reported for all persons in the out-of-sample years (1975-77).

This could be a reasonable assignment from knowledge that an individual is a high school

graduate and presumably graduated at age 18 or thereabouts. GED receipt status is also

reported for all cases, but this is far more problematic. There is no data available in the NLSY

to assign data values for the age at which a GED is earned and, unlike a high school diploma,

there is no obvious age at which a GED is earned. The .8% value corresponds to one

observation coded as having received a GED at this age.

I have confirmed that HMS actually used all of these out-of-sample observations in their

analyses. Using all observations, including those from these years, I replicate their results for

the affected variables –spouse earnings, welfare benefits received, AFDC receipt, whether poor,

and whether an individual has a high school degree or GED degree only.

There is other evidence that these pre-sample years are used. As already noted (see

the last column of Table 3), the maximum available sample size for any analysis is 13,468,

which is the total N observed between ages 18 and 35 in calendar years 1978-1992. Adjusting

further for observations in their data set that are entirely missing in a sample year brings the

maximum sample size down to 13,229; variable-specific missing data on the dependent

variables would undoubtedly reduce it further.13 And yet, reported sample sizes in HMS 2002

are consistently larger than this – 13,924 for analyses of educational attainment, fertility

measures, poverty status, AFDC receipt, and Food Stamp receipt. The 695 sample size

difference between the HMS 2002 sample size and the actual available sample size is exactly

equal to the out-of-sample cases contributed by the 1957-59 birth cohorts, as shown in Table 5.

There seems little doubt that HMS have used observations for which there is no data in the

NLSY.
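
The sample-size discrepancy can be reproduced from the figures quoted in the text. In the sketch below, the individual cohort sizes are my inference from differencing the cumulative Ns (121, 222, 352); they are not quoted from Table 3, and the variable names are illustrative.

```python
# Cumulative sample Ns reported for the out-of-sample years 1975-1977.
cumulative_n = {1975: 121, 1976: 222, 1977: 352}

# Implied cohort sizes, obtained by differencing the cumulative Ns (an inference).
cohort_n = {1957: 121, 1958: 222 - 121, 1959: 352 - 222}

FIRST_VALID_YEAR = 1978  # first income year actually covered by the NLSY

out_of_sample = 0
for birth_year, n in cohort_n.items():
    # Ages 18-35 that would have occurred before the observation window opens.
    pre_window_ages = [age for age in range(18, 36)
                       if birth_year + age < FIRST_VALID_YEAR]
    out_of_sample += n * len(pre_window_ages)

reported_n = 13_924   # sample size reported in HMS 2002
max_valid_n = 13_229  # maximum available after dropping fully missing person-years
excess = reported_n - max_valid_n

print(out_of_sample, excess)
```

Both routes give the same count: three pre-window ages for the 1957 cohort, two for 1958, and one for 1959 account for the full gap between the reported and the available sample size.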

The same problem appears in KHK. Those analyses are based on a larger sample

(N=1744) that includes women ever pregnant by age 19, but that runs only through the 1992

NLSY interview. Thus, here the absolute maximum number of observations is 1744 x 14 years

14 I have spent considerable effort trying to identify where the out-of-sample data observations do, in fact, come from, since I assume that they are not literally invented. I have found no simple explanation. There is no evidence, for example, that the out-of-year information is misplaced 1978 data, i.e., data transposed by several years. I suspect that the data may be the result of an incorrect assignment algorithm, and could even come from another sample member.

15 Actually, the value on the data file is $256.062286376953, which suggests that it is the result of some kind of numerical transformation.


= 24,416. Adjusting for the number of years each birth cohort is actually observed between

ages 18 and 35 between 1978 and 1991, just as in Table 3, decreases the maximum sample

size to 22,236. Adjusting further for cases lost during a particular interview year decreases the

available sample size to 21,670. But again, reported sample sizes consistently exceed that

number. For high school graduation and GED attainment, the reported sample sizes are

25,432. Reported sample sizes also exceed 21,670 for annual work hours, AFDC/Food Stamp

benefits, and a woman’s annual labor market earnings. Thus, it is likely that the same

inappropriate use of out-of-sample observations exists in the KHK analyses, too.
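
The same bound check can be written out for KHK. This is a small sketch using only the figures quoted above; the dictionary labels are my own shorthand.

```python
n_women = 1744
n_years = 14  # interview years available through the 1992 interview

upper_bounds = {
    "all women in all years": n_women * n_years,
    "adjusted for the age/year window": 22_236,
    "adjusted for interview non-response": 21_670,
}

reported_n = 25_432  # reported N for high school graduation and GED attainment

# Every bound that the reported sample size exceeds.
exceeded = [label for label, bound in upper_bounds.items() if reported_n > bound]
print(upper_bounds["all women in all years"], exceeded)
```

The reported N exceeds even the loosest bound, which a reader could compute without access to the data.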

The obvious question for all these variables is: where do the sample values come from,

since they are not contained in the NLSY data?14 While it is clear that these pre-sample years

do not belong in the analysis, it is not clear a priori what the impact on estimated coefficients will

be. I provide evidence on this in the next section.

C. Welfare Benefits

In KHK, two welfare benefit variables are analyzed: the annual monetary value of AFDC

and Food Stamps, and a second value that also includes Medicaid benefits. In HMS 2002, only the

first of these is analyzed, but there are also analyses of receipt of Food Stamps and receipt of

AFDC.

I have identified several problems with the welfare and food stamp variables. First,

nearly 60% of the observations in the teen pregnancy sample have welfare benefits equal to

exactly $256.06 and no cases have a value of $0.[15] This is highly suspicious, to say the least.

On closer inspection, it turns out that $256.06 is the value assigned to all cases that have

AFDCPAY and FOODPAY (actual AFDC and Food Stamp dollars, respectively, from which total

welfare benefits appear to be constructed) equal to $0. This strongly suggests that the $256.06

value is, simply, a coding error. The correct value is $0. Note that this does not affect the

16 For two of the cases shown, the number of months of AFDC receipt is computed as either fractional or greater than 12. I assume that in these cases, there is some additional source of welfare benefits.


underlying AFDC and Food Stamp receipt variables.
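
One way to apply this correction is sketched below. It is an assumption-laden illustration, not the procedure used in the paper: the function name, the tolerance, and the sample rows are hypothetical, and only the default value itself (from the data file, per the footnote) comes from the text.

```python
import numpy as np

DEFAULT_VALUE = 256.062286376953  # stored value that appears to stand in for $0

def recode_default_welfare(welfben, afdcpay, foodpay):
    """Set total welfare benefits to 0 where the default value was assigned
    and both underlying components are zero. Other cases are left untouched."""
    welfben = np.asarray(welfben, dtype=float)
    afdcpay = np.asarray(afdcpay, dtype=float)
    foodpay = np.asarray(foodpay, dtype=float)
    is_default = np.isclose(welfben, DEFAULT_VALUE, atol=0.01)
    no_benefits = (afdcpay == 0) & (foodpay == 0)
    return np.where(is_default & no_benefits, 0.0, welfben)

# Hypothetical rows: two default cases, one ordinary case, and one case with
# positive total benefits despite zero components (left unchanged here).
fixed = recode_default_welfare(
    welfben=[256.06, 5000.0, 256.062286376953, 3582.71],
    afdcpay=[0, 1200, 0, 0],
    foodpay=[0, 500, 0, 0],
)
print(fixed.tolist())
```

The tolerance is needed because the stored value has many decimal places, so an exact comparison with 256.06 would miss it.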

Second, I can find no correspondence whatsoever between the Food Stamp and AFDC

variables on the NLSY file and those on the HMS file. Table 5 shows this for a few

representative cases. In columns 3-5, I show the HMS values for Food Stamps and AFDC

benefits, and also for Total Welfare Benefits, which are described as the sum of these two

variables. According to the documentation, Food Stamps and AFDC benefits are in current

dollars, that is, they are not scaled to 1994 dollars, as Total Welfare Benefits are. In columns

6-10, I show corresponding information from the NLSY. Columns 6-8 are directly from the data;

column 9 is the usually obvious number of months of welfare receipt, computed from the three

preceding columns,16 and column 10 is the resulting likely total value of AFDC benefits.

Comparing values shows that the HMS figures bear no clear or consistent pattern to the

NLSY figures. The two food stamp values (Columns 3 and 6) and the two AFDC Benefit

measures (Columns 4 and 10) ought to be identical or at least comparable. They are not.

Consider case 1 for ID=4. The NLSY figures show $416 in Food Stamp benefits, an average

monthly AFDC benefit of $314, and total welfare benefits of $1986, which almost certainly

implies five months of AFDC receipt (5 x $314 +$416 = $1986), and thus total AFDC benefits of

$1570. The corresponding HMS 2002 values are 87% higher than this for food stamp benefits,

59% higher than this for AFDC benefits, and more than four and a half times greater for total

welfare benefits. In case 2 for ID=4, the HMS 2002 values for Food Stamps and AFDC benefits

are all lower than the NLSY values. The string of cases for ID=86 with NLSY Food Stamp

benefits equal to $648 (interrupted by one at $828) show the problem in another form. Some of

the HMS values drop in a pattern that looks a bit like a CPI adjustment, but the figures do not

correspond to any CPI series, and the sharp fall in 1982 and the rise in 1987 show that this is

not, in fact, a consistent application of any CPI-based transformation.
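
The months-of-receipt inference used for case 1 of ID=4 can be written out explicitly. This is a sketch of the arithmetic only; the function names and the plausibility rule are my own.

```python
def implied_afdc_months(total_welfare, food_stamps, monthly_afdc):
    """Months of AFDC receipt implied by
    total welfare = months * monthly AFDC benefit + Food Stamps."""
    return (total_welfare - food_stamps) / monthly_afdc

def is_plausible(months):
    """A whole number of months between 0 and 12 is the 'usually obvious' case."""
    return 0 <= months <= 12 and float(months).is_integer()

# NLSY figures for case 1 of ID=4, as quoted in the text.
months = implied_afdc_months(total_welfare=1986, food_stamps=416, monthly_afdc=314)
afdc_total = months * 314

print(months, afdc_total, is_plausible(months))
```

Cases where the implied number of months is fractional or exceeds 12 fail the plausibility check and would need a separate explanation, such as an additional source of welfare benefits.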

Third, there is no consistent relationship whatsoever between the value of welfare

benefits and the sum of AFDCPAY and FOODPAY, the two underlying variables. I computed

17 AFDCPAY and FOODPAY are not adjusted to 1994 dollars, but the average adjustment factor to 1994 dollars is only 1.46 and thus does not account for the discrepancy.


the ratio of total welfare benefits to the sum of AFDCPAY and FOODPAY. If total welfare

benefits were the 1994 value of the sum of these variables, the ratio would range from 1.06

(benefits received in 1992) to 2.27 (1978 welfare benefits) or, in light of the error in inflating the

other income variables, perhaps 2.27 for all cases except 1992. Instead the minimum value of

this ratio is 2.7. For half of the cases, the ratio lies in a narrow margin, between 2.7 and 2.89

and two-thirds have a ratio between 2.7 and 3.29. (For the cases shown in Table 5, the ratio

ranges from 2.79 to 4.83.) Ten percent have a ratio greater than 5.6 and 5% have a ratio greater than

8.5. Four cases have a ratio greater than 100!

Fourth, maximum and average welfare benefits are implausibly high. The maximum

value of welfare benefits on the file is $150,604.91. A total of 151 cases in the HMS teen pregnancy

sample have benefits greater than $20,000 and 64 have benefits greater than $30,000.

Average welfare benefits (for all cases with benefits greater than $256.06) are $7,236.85. The

average sum of AFDCPAY + FOODPAY (the underlying variables used to compute welfare

benefits) for the same sample is $2271.66, just over 30% of the average value of WELFBEN.17

By comparison, the maximum values of “Welfare Income Received” on the NLSY file for the

teen pregnancy sample range from $7,800 in 1978 to $28,332 in 1989. In only two years is

there a case with welfare benefits greater than $20,000.
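
The averages quoted above can be compared directly with the inflation factors that a correct conversion to 1994 dollars would imply. The sketch uses only numbers given in the text; the variable names are illustrative.

```python
avg_welfben = 7236.85        # average total welfare benefits (cases above $256.06)
avg_components = 2271.66     # average AFDCPAY + FOODPAY for the same cases
avg_inflation_factor = 1.46  # average adjustment factor to 1994 dollars
max_inflation_factor = 2.27  # adjustment factor for 1978 benefits, the largest one

observed_ratio = avg_welfben / avg_components  # ratio actually found on the file
share = avg_components / avg_welfben           # components as a share of the total

print(round(observed_ratio, 2), round(share, 3))
print(observed_ratio > max_inflation_factor)
```

The observed ratio of the averages is larger than even the largest legitimate inflation factor, so price adjustment alone cannot reconcile the two series.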

There are also several smaller issues:

• 359 cases have welfare income > $256.06 (i.e., a value which I interpret as positive

AFDC and/or Food Stamp benefits) even though AFDCPAY and FOODPAY both equal

0. If these cases were coded as other cases appear to be, they ought to have a value of

$256.06. Instead, average welfare benefits for this group are actually $3582.71. It is not

clear which values are incorrect.

• As previously noted, there are values of welfare benefits for ages that would have been

observed in 1977, a year for which no observations are available in the NLSY. Mean

1977 benefits are greater than $7,000 in 1994 dollars (after eliminating the cases with


benefits equal to $256.06).

• The food stamp benefit variable appears to have been coded inconsistently and possibly

incorrectly. I cross-checked the food stamp dollar measures on the HMS file against the

NLSY, looking for a pattern. For some cases, it appears that the NLSY food stamp

measure is not in current dollars as indicated but has been re-scaled to 1978 dollars.

But this is not the case generally. Appendix Table 1 provides some evidence on this.

For the first case shown (ID=10259), the NLSY values, rescaled to 1978 dollars, are so

similar to the values on the HMS file that it is highly unlikely to be a coincidence. But as

the other two cases illustrate, this correspondence does not hold. There is no

discernible pattern at all in those two cases. I believe that the cases shown are

representative, although I have not done an exhaustive search for cases that fit either

the “rescaled to 1978 pattern” or the “no pattern” profile.

• In light of all of the inconsistencies noted, I cannot determine whether welfare benefits

suffer from the scaling problem that affects all the other variables. It seems likely that

the scaling error was consistently applied to all income variables, but because I cannot

link the HMS 2002 values to the underlying NLSY values, I cannot establish that this is

the case.

I have no a priori hypothesis of how these multiple data errors are likely to affect the

estimates of the effect of teen childbearing on AFDC and Food Stamp receipt. It is an empirical

matter.

D. Teen Fertility Coding

Information on the age at the beginning of a first pregnancy and the outcome of that

pregnancy is directly available in the 1984-86, 1988, 1990, and 1992 interviews. In each of

these years, there is a pair of variables indicating the age at the beginning of a first pregnancy

and the outcome of that pregnancy. This corresponds precisely to the description that HMS

employ throughout, namely, the outcome of a teen pregnancy that began at age 17 or earlier.

There is no other distinct variable on age at first pregnancy, as there is for age at first birth.

These variables are designated as “created” variables on the NLSY file, meaning they are

18 For example, relying only on 1992 information on pregnancy outcomes omits 44 cases who report valid information in previous years but are missing in 1992.


created by the NLSY staff from information elsewhere in the survey, and thus reflect some effort

at establishing consistency. From 1984 to 1990, the outcome variable is coded into four

categories: birth, abortion, miscarriage, or stillbirth. In 1992, the last two categories are

combined. In 1982 and 1983, there is information on the outcome of a first pregnancy (for

pregnancies that did not end in a live birth) and the year and month in which that pregnancy

ended. This information can be combined with information on own birth date to construct an

age at first pregnancy.

HMS 2002 reports that their sample includes 727 births, 185 abortions, and 68

miscarriages, all resulting from first pregnancies that began at age 17 or earlier. (In KHK, only

the total sample size and the weighted proportions in each teen fertility category are shown in

Table 3.2. Because the weighted proportions in that table are exactly the proportions that I

compute from their data, I assume that the same coding applies in KHK). I have no direct

information on how HMS coded teen fertility. Instead, I have compared the coding on their file

with a coding based on the 1984-1992 variables and, where necessary, the 1982-83 variables.

I use all years of data in this coding, since some cases appear in the file only in some years or

provide information about, for example, age at first pregnancy only in a single year.18

Because the fertility coding is so obviously central to this analysis, I examined the HMS

coding very carefully, cross-checking it against the classification that comes from the

straightforward use of the 1984-1992 information, augmented by the 1982-83 information in

cases where the 1984-1992 information is missing or otherwise not conclusive. It is clear to me

that the coding of teen fertility from the NLSY is inherently and inevitably subjective, especially,

for example, when the information is conflicting, as it is in some cases. Thus, there is no single

definitive classification. Reasonable people could probably disagree about the details.

My findings are summarized in Table 6. The details of specific case assignments are

included in the appendix. The first row shows the HMS 2002 teen fertility classification. I find

three distinct problems. First, a substantial number of cases included in the HMS samples

19 NCHS, Series 20, No. 31, “Medical and Life-Style Risk Factors Affecting Fetal Mortality, 1989-90” has data on fetal deaths for 29 states (61% of all fetal deaths). Fetal deaths are defined as stillbirths or miscarriages after 20 weeks of duration. Induced abortions are not included. Table B shows a rate of 7.3 deaths per 1000 fetal deaths plus births for women < age 30. I assume that this rate holds for younger teens and for the other 21 states. During the 1980s, there were about 285,000 births annually to women who were pregnant at age 17 or earlier. This figure includes all births to adolescents age 17 and younger and 3/4 of those at age 18. Applying the 7.3/1000 fetal death rate to the 285,000 young teen births figure yields an estimate of about 2,100 fetal deaths in pregnancies at age 17 or earlier. About half of teen pregnancies end in birth and about 13% in a miscarriage according to the Alan Guttmacher Institute estimates, which means there were approximately 74,000 miscarriages annually (285,000 x 2 x .13 = 74,100). This means that there were about 35 times as many miscarriages as fetal deaths. In my calculations, I am assuming that no miscarriages are included as fetal deaths.


appear to be misclassified, i.e., there is no consistent supporting evidence in the 1982-1992

information. For the teen birth sub-sample, there are 19 such cases –one true miscarriage, one

abortion, three whose first pregnancy began at age 18 or older, and 14 for whom there is no

information whatsoever about how the pregnancy ended or the age at which it began. For the

abortion sample, there are 15 dubious cases – three are teen miscarriages, three are teen births,

and nine are either not teen pregnancies or have insufficient information about pregnancy age or

outcome. Four cases are questionable for the miscarriage sample – one abortion and three

whose first pregnancy began at age 18 or older. The proportion of miscoded included cases

ranges from 2.6% for teen births to 8% for teen abortions.

Second, the HMS and KHK sample classification excludes a large number of cases that

are readily classified by the 1984-92 information. 65 births, 17 abortions, and nine miscarriages,

all from first pregnancies at age 17 or earlier and with complete information about the age at first

pregnancy and the outcome of the pregnancy, are omitted. The omitted proportions range from

9% to 13% of the HMS samples. The appendix table provides further details on these cases.

Third, the miscarriage category is actually miscarriage/stillbirth. It includes a strikingly

high proportion of stillbirths– eight out of 64 valid miscarriages in the HMS sample (12.5%) and

11 out of 73 in the augmented sample (12.3%). This raises two distinct concerns. First, the

impact of a stillbirth might not be innocuous, or at a minimum might be substantially greater than

a miscarriage that terminates a very short duration pregnancy. Second, and probably more

importantly, the high proportion of stillbirths suggests that miscarriages are substantially under-

reported in the NLSY and in the HMS sample. Using reported national data on fetal deaths and

miscarriages, I estimate that miscarriages ought to outnumber stillbirths by a ratio of nearly

35:1.[19] In the NLSY sample, the ratio is closer to 7:1. Put differently, if my calculations are even

20 Attrition in the NLSY has been remarkably low. Unlike some surveys, non-response cases in one year can and do return to the NLSY sample in subsequent years.


close and if there are 11 stillbirths, there ought to be more than 350 miscarriages, rather than 62.
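
The back-of-the-envelope calculation behind the expected ratio can be laid out step by step. The sketch follows the assumptions stated in the footnote on fetal deaths (the same fetal death rate for young teens and for all states, half of teen pregnancies ending in birth, 13% ending in miscarriage); the variable names are mine, and the sample ratios are derived from the counts quoted in the text.

```python
fetal_death_rate = 7.3 / 1000  # fetal deaths per 1000 fetal deaths plus births
teen_births = 285_000          # annual births from pregnancies begun at age 17 or earlier
birth_share = 0.5              # share of teen pregnancies ending in a birth
miscarriage_share = 0.13       # share of teen pregnancies ending in a miscarriage

fetal_deaths = fetal_death_rate * teen_births  # roughly 2,100 per year
pregnancies = teen_births / birth_share
miscarriages = pregnancies * miscarriage_share
expected_ratio = miscarriages / fetal_deaths   # miscarriages per fetal death

# Ratios of non-stillbirth miscarriages to stillbirths in the two samples.
hms_ratio = (64 - 8) / 8
augmented_ratio = (73 - 11) / 11

# Miscarriages one would expect alongside 11 stillbirths.
implied_miscarriages = 11 * expected_ratio

print(round(expected_ratio, 1), hms_ratio, round(augmented_ratio, 1),
      round(implied_miscarriages))
```

Using the unrounded fetal death count gives a ratio slightly above 35; rounding fetal deaths to 2,100, as in the footnote, gives a ratio of about 35.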

There is, of course, no way to determine ex ante the likely impact of a re-classification of

a substantial number of fertility cases. It is an empirical matter.

E. Improper use of sample weights

All of the regression analyses in KHK and HMS 2002 are weighted, using the sample

weights provided in the NLSY data. The NLSY consists of three separate sub-samples: a cross-

sectional sample; a supplementary sample of Hispanic, black, and economically disadvantaged

white youth; and a military sample. Most members of the military sample were dropped after the

1984 survey and all the economically disadvantaged white respondents were dropped after the

1991 survey. Sample weights are constructed for each survey year to adjust for differences in

initial selection probabilities and subsequent attrition, including the two sample drops. With

weighting, the NLSY is nationally representative.

The use of sample weights is obviously essential to estimate population means. There is

a lively debate about whether regression analyses ought to be weighted (see Deaton, 1997 for a

useful discussion) even with data that are unrepresentative, but I do not focus on that issue here.

The problem is that HMS have used the wrong weight and used it improperly.

In their analyses, HMS use a single weight taken from the 1979 interview. This is the

wrong weight for a number of reasons. First, the 1979 weight accounts only for the initial

selection probabilities, and thus obviously does not account for subsequent attrition.20 Attrition

modestly changes (increases) the weights for the remaining observations that are similar to the

non-response cases, but more importantly for our purposes it results in a zero weight for the non-

response cases. As a result, for ages that occur subsequent to 1979, HMS use a slightly wrong

weight for all continuing cases and a dramatically wrong weight for non-response observations.

Second, the 1979 weight is not the appropriate weight in the first place, even for analyses

of 1979 data, given the sample they analyze. This is true for two reasons. Because HMS

analyze outcomes by age rather than calendar year, there is no analysis of 1979

21 In 1979, there were 198 poor whites in the cross-section and 901 poor whites in the supplementary sample. The cross-section poor whites had a mean sample weight of 1207, representing a total population of about 239,000 persons. The supplementary poor whites had a mean weight of 915 and represented a total population of about 825,000 persons. Thus the total represented population was 1.06 million persons. In 1992, following the drop of the supplementary poor white sample, the mean weight for the 198 remaining poor whites increased to 5360, thus representing the same 1.06 million persons.


outcomes for which that weight would be appropriate; the 1979 outcomes are scattered across

the various ages. At a minimum, the appropriate weight to use is the annual, updated weight,

translated from calendar year to age. But even this is incorrect. HMS restrict their sample to

cases from the cross-section (white, black, and Hispanic) and the supplementary black and

Hispanic samples, i.e., they exclude the military sample and the poor white supplementary

sample. All of the sample weights provided in the NLSY, including the 1979 weight, are,

however, appropriate only for the whole sample and not for portions of it.

A particularly clear and relevant example of this involves the weights for the group of poor

whites who are part of the cross-sectional sample. The initial sample weights for this group

reflect both their representation in the cross-sectional sample and the inclusion of the poor

whites in the supplementary sample. Thus, when the latter group was dropped from the NLSY in

1991, the sample weights for the remaining cross-sectional poor whites were increased by a

factor of more than four, precisely because the selection probability of poor whites decreased.21

Since HMS do not use the poor whites from the supplementary sample even in the years prior to

their drop from the sample, the 1979 weights for the cross-sectional poor whites are incorrect, in

the sense that they do not yield appropriate population estimates. The weights for other groups

are not affected.
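
The weight arithmetic for the poor white group can be checked with the figures from the accompanying footnote. The sketch below treats the mean weights as if they applied to every case in the group, which is a simplification; the variable names are illustrative.

```python
n_cross, mean_w_cross_1979 = 198, 1207  # cross-section poor whites, 1979
n_supp, mean_w_supp_1979 = 901, 915     # supplementary poor whites, 1979
mean_w_cross_1992 = 5360                # cross-section poor whites after the 1991 drop

pop_cross_1979 = n_cross * mean_w_cross_1979
pop_supp_1979 = n_supp * mean_w_supp_1979
pop_total_1979 = pop_cross_1979 + pop_supp_1979
pop_total_1992 = n_cross * mean_w_cross_1992

weight_factor = mean_w_cross_1992 / mean_w_cross_1979

# Share of the poor white population represented when only the cross-section
# cases are used with their unadjusted 1979 weights.
represented_share = pop_cross_1979 / pop_total_1979

print(pop_cross_1979, pop_supp_1979, pop_total_1979, pop_total_1992)
print(round(weight_factor, 2), round(represented_share, 3))
```

Under these figures, the cross-section poor whites with their 1979 weights stand in for less than a quarter of the poor white population they are meant to represent once the supplementary cases are excluded.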

Because they use the 1979 weight and because of the nature of some of their variable

assignments, HMS 2002 actually includes observations that are non-response in their analyses.

Table 7 documents this. There are a maximum of 341 such non-response observations included

in their teen analyses –about 3% of all cases. All 341 of these cases have assigned values for

all education measures – presumably carried forward from some previous year– as well as for

current poverty status and Food Stamp use – for which no data exist. Only two of them are poor,

which is a suspiciously low figure. Nearly three-fifths apparently used Food Stamps. Almost half

have reported spouse earnings –all of them are zero, almost certainly due to a default


assignment in data processing. Sixty percent have reported welfare benefits; two-thirds of the

values are $256.06, which is the value incorrectly assigned to cases with no welfare benefits.

But another 64 of these non-response cases somehow have values for welfare benefits; their

average welfare benefits are $10,110 in 1994 dollars.

It is important to appreciate that none of these cases belong in a weighted analysis, even

when it is possible to assign a value for a variable such as educational attainment that is not

time-varying. The sample weight (if properly applied) adjusts appropriately and fully for the non-

responding observations.

F. Summary

In sum, HMS have made a number of data-processing errors that could undermine their

analyses in KHK and HMS 2002. They mis-scaled almost all incomes, included out-of-sample

observations, made a series of probably erroneous AFDC, Food Stamp, and welfare benefit

assignments, arguably miscoded fertility outcomes for some cases and omitted a substantial

number of other available cases, and applied an inappropriate weight variable in their analyses.

Some of these problems –the mis-scaling, the incorrect default $256.06 welfare benefits

assignments, and the inclusion of out-of-sample cases– can be corrected with their own data, but

most of the others require a fresh analysis. With the exception of the mis-scaling, which

undoubtedly increases the size of estimated coefficients, there is, ex ante, no clear prediction

about how correcting the other errors will affect the estimates, except that the resulting estimates

can then be viewed with substantially greater confidence.

G. Could Anyone Have Known?

Could a normally diligent referee, without access to the data, have known about these

problems? Possibly, I think, but only with some sleuthing, cleverness, and luck.

The most obvious error is the mis-scaling, an error I discovered by literally comparing

values. I did, however, suspect there was an error the first time I read the paper, although I had

no idea what its source was. HMS reported in KHK that at age 30, annual earnings for teen

22 This latter group is not the teen miscarriage group, but rather a sample of all women who did not have a teen birth. HMS use this comparison primarily to show the “apparent” consequences of teen childbearing that they later debunk.

23 As noted earlier, the 24,416 number is actually too high. Adjusting for the number of years each birth cohort is actually observed between ages 18 and 35 between 1978 and 1991 decreases the maximum sample size to 22,236 and further adjustment for cases lost during a particular interview year decreases the available sample size to 21,670. A referee would have no way of knowing those figures without access to the data.

24 An interesting example of an error that was identified and caught was Weitzman’s claim that the economic status of women fell 73% in the year following divorce (Weitzman, 1985). Indirect evidence of the error was reported by Hoffman and Duncan (1988) and direct evidence was reported by Peterson (1996).


mothers were $18,544 compared to $32,935 for women who did not have a teen birth.22 In fact,

these means almost certainly included the zero earnings of non-participants and thus were

consistent with the unrealistically high mean earnings that I discussed above. The inclusion of

the non-working women was, however, not explicitly stated and a referee might easily not

recognize that. In the next sentence, HMS note that teen mothers worked an average of 966

hours annually, compared to 1280 hours for the other group. Simple arithmetic then shows that

this implies hourly wages of $19.20 and $25.73, respectively. Those average hourly earnings

are clearly not very likely for any group of young women, especially not for the teen mothers,

given their age and education. So that was one possible clue.
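
The implied hourly wages follow from dividing the reported mean earnings by the reported mean hours. A minimal sketch using only the figures quoted above:

```python
def implied_hourly_wage(mean_annual_earnings, mean_annual_hours):
    """Average hourly earnings implied by group means of earnings and hours."""
    return mean_annual_earnings / mean_annual_hours

wage_teen_mothers = implied_hourly_wage(18_544, 966)
wage_comparison = implied_hourly_wage(32_935, 1_280)

print(round(wage_teen_mothers, 2), round(wage_comparison, 2))
```

A referee could run this check from the published means alone, since both the earnings and the hours figures include non-workers and the zeros cancel in the ratio of means.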

Another possible clue was the discordant sample sizes. As discussed above, the actual

sample sizes reported in the appendix to KHK – as high as 25,432– often exceed the maximum

potential number of observations for a sample of 1744 women observed for 14 years (24,416).23

But most perfectly diligent referees are unlikely to notice something in the fine print like that.

Because neither of these errors is conspicuous, I conclude that referees are not to

blame for not catching the errors. I seriously doubt that this is the first publication that had

substantial data errors, and, probably more often than not, the errors are not detected.24 This

certainly speaks to the need to make data sets available as a condition for publication, a practice

now followed by many journals. Researchers are well advised to examine their data carefully,

inspecting means, minimums, and maximums and comparing them to benchmarks where

available, rather than proceeding directly to multivariate analyses.
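The kind of inspection recommended here can be quite simple. The sketch below is illustrative only: the wage values and the benchmark range are hypothetical, and the function names are not taken from any package.

```python
def summarize(values):
    """Return n, mean, min, and max for the non-missing entries of a list."""
    present = [v for v in values if v is not None]
    if not present:
        return {"n": 0, "mean": None, "min": None, "max": None}
    return {
        "n": len(present),
        "mean": sum(present) / len(present),
        "min": min(present),
        "max": max(present),
    }

def flag_out_of_range(summary, low, high):
    """True when the observed minimum or maximum falls outside a benchmark range."""
    if summary["n"] == 0:
        return False
    return summary["min"] < low or summary["max"] > high

# Hypothetical hourly wages; None marks a missing value.
wages = [6.50, 7.25, None, 8.10, 57.40]
s = summarize(wages)
print(s)
print(flag_out_of_range(s, low=0.0, high=40.0))  # hypothetical plausibility bounds
```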

III. Data and Estimates


I drew a sample from the NLSY following the procedures used in HMS 2002. Like HMS, I

exclude cases from the military sample and the poor white supplementary sample even in years

where those cases are available. Like HMS, I define a teen pregnancy as one that began at age

17 or earlier. The full sample includes 1033 women who had a teen pregnancy and had

sufficient information in at least one year to determine the age at the beginning of the pregnancy

and the outcome of the pregnancy. Of these, 773 women had a teen birth, 187 had a teen

abortion, and 73 had a teen miscarriage. The full person-year sample (ages 18-35 observed

between 1978 and 1992) includes a potential maximum of 14,222 cases, but sample attrition

reduces the actual total to 13,651. The number of available cases for particular dependent

variables is typically smaller than this.

Table 8 shows the population estimates (weighted sample means) for the independent

variables and the outcomes that I examine. I have adjusted the weight for the poor whites in the

cross-section for the exclusion of the poor whites in the supplementary sample. The outcomes

are shown at age 28, except for high school graduation which is at age 21. Age 28 is the latest

age at which all birth cohorts are observed and it is the age for which HMS present most of their

results. Also shown in the table are the corresponding means from HMS 2002. The background

means shown in the top panel are taken from their Table 2; the outcome measures are my

estimates using their data. Note that the weight variable used to compute the HMS means is the

incorrect 1979 weight variable as described above.

As shown in the first column of the table, the NLSY teen pregnancy sample is about one-

quarter black and almost two-thirds white. These proportions differ from the HMS proportions

(30% black, 60% white) primarily because of the use of the adjusted weight for the poor whites,

which increases the white percentage by about three percentage points in my data. Many of the

other family background means are quite close in the two samples, for example, family structure,

parents’ education, and the AFQT score. Income variables are an exception. Family income in

HMS 2002 is about 25% higher than what I find. More than half of the cases in HMS 2002 are

25 In my sample, 22% of the cases (unweighted) are missing 1978 family income and 38% are missing information on 1978 family AFDC receipt. I have confirmed that many of the cases reported by HMS to be missing family income do, in fact, have legitimate data for this variable. The proportion of HMS cases missing income data is strongly related to birth cohort, ranging from 84% for the 1957 birth cohort to just 12% for the 1964 birth cohort. It is very possible that HMS are doing something systematic that is related to the age of the respondent.


missing family income, so the means may well represent different populations.25 For the

outcome variables, means for educational attainment (especially high school graduation), work

hours, and food stamp use are quite similar in the two data sets, but incomes differ considerably

because of the scaling problem in HMS. Mean own earnings are 77% larger, mean spouse

earnings 91% larger and mean income from welfare is more than 60% larger in HMS 2002 than

in my tabulations. There are also fairly large differences in the proportions married, poor, and to

a lesser extent, receiving AFDC income. I do not know what accounts for these latter

differences.
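The percentage differences quoted in this paragraph follow directly from the Table 8 means. A short check, with the values copied from that table:

```python
# (NLSY sample mean, HMS 2002 mean), taken from Table 8.
table8 = {
    "family income (1978)": (13565, 16981),
    "own earnings": (9110, 16147),
    "spouse earnings (all)": (12507, 23891),
    "income from welfare": (1447, 2351),
}

# Percent by which the HMS 2002 mean exceeds the NLSY mean.
pct_larger = {name: 100 * (hms / nlsy - 1) for name, (nlsy, hms) in table8.items()}

for name, pct in pct_larger.items():
    print(f"{name}: HMS 2002 is {pct:.0f}% larger")
```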

Table 9 presents my re-estimation of the HMS model using their data and correcting for

the scaling problem and the inclusion of out-of-sample data points for 1975-1977. As described

earlier, HMS estimate a non-parametric model in which the effect of a teen birth at each age

is estimated via a separate interaction term (“is age t” x “whether had a teen birth”). This model

is estimated by HMS in three versions, first with no additional explanatory variables and then in

two versions with additional explanatory variables. Explanatory variables include dummy

variables for birth cohort, having a first pregnancy at age 16 or 17, race/ethnicity (two dummy

variables), family structure at age 14 (two dummy variables), correlates of miscarriage (two

dummy variables for whether used alcohol or tobacco during pregnancy), parents’ education,

and two measures of family income (actual income and whether received income from welfare,

both measured for 1978). AFQT scores are also included for outcomes other than education.

Most of the strong results (i.e., positive or no negative impact of a teen birth) appear in the model

without explanatory variables; with the exception of spouse earnings and income from welfare,

the teen birth estimates are quite robust across the specifications. HMS also estimate a

parametric age effect model that uses either a quadratic or cubic function in age plus

corresponding interaction terms and a dummy variable for a teen birth. This model is estimated

only in the form with all additional explanatory variables. Estimation is by two-stage least

26 I was able to replicate their results for all outcomes except income from welfare where my coefficient estimates for most variables differed from theirs by approximately five to ten percent. This may reflect differences due to the statistical software used.


squares, with having a miscarriage used as an instrument for having a teen birth. I estimate the

models using Limdep. To keep the table simpler, I report results only for the non-

parametric model without additional explanatory variables and the age-parametric model with all

covariates.26 Columns (1) and (2) are the original HMS estimates and columns (3) and (4) are

the corrected estimates. The estimates shown are for age 28, except for high school graduation

which is shown for age 21. For the non-parametric model, the reported effect comes directly

from the estimated coefficient; for the parametric models, the effect is computed at the indicated

age from the underlying coefficients on the teen birth dummy and the age interactions. For some

outcomes, there are no corrections, either because there is no problem with the underlying

variable (hours worked, whether married) or because the underlying problem does not affect the

particular estimate shown in the table (non-parametric estimates for educational attainment,

poverty status, and receipt of food stamps and AFDC, where the data problem is confined to

ages 18-20).
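The logic of using a miscarriage as an instrument can be illustrated with simulated data. The sketch below is not the estimation reported in the tables, which was done in Limdep with covariates and age interactions. It assumes a single binary instrument and no covariates, in which case two-stage least squares reduces to a ratio of covariances. All parameter values, including the 30% miscarriage rate and the true effect of 2.0, are hypothetical.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

rng = random.Random(0)
TRUE_EFFECT = 2.0  # hypothetical causal effect of a teen birth on the outcome

z, x, y = [], [], []
for _ in range(20000):
    u = rng.gauss(0, 1)                           # unobserved trait
    miscarriage = 1 if rng.random() < 0.3 else 0  # random, unrelated to u
    # A birth occurs only without a miscarriage, and is more likely when u is high.
    birth = 1 if (miscarriage == 0 and u + rng.gauss(0, 1) > -0.5) else 0
    outcome = TRUE_EFFECT * birth + 3.0 * u + rng.gauss(0, 1)
    z.append(miscarriage)
    x.append(birth)
    y.append(outcome)

ols = cov(x, y) / cov(x, x)  # biased: birth is correlated with u
iv = cov(z, y) / cov(z, x)   # instrumental-variables estimate

print(f"OLS estimate: {ols:.2f}")
print(f"IV estimate:  {iv:.2f}")
```

In this simulation the simple regression overstates the effect because the unobserved trait raises both the probability of a birth and the outcome, while the instrumented estimate stays close to the assumed true value.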

The impact of the income scaling problem is quite evident. Coefficient estimates for all

incomes fall substantially. Of the eight pairs of estimated income coefficients, in six cases, the

corrected estimates are about 60% of the original. This includes all of the parametric estimates.

The other two cases have much larger changes. The effect of a teen birth on total income from

welfare, estimated in the non-parametric specification, falls more than 75% from -$372 to -$85.

For the non-parametric estimate of spouse earnings, the impact of a teen birth at age 28 falls

from about $1270 to $28, a drop of 98%. This particular impact is an outlier– at surrounding

ages, the drop in the coefficient is more similar to the changes in the other income coefficients.

At other ages, the impacts generally follow the magnitude of the scaling error discussed earlier,

increasing with age; these results are not shown in the table. On average, from age 18 to 32 the

corrected coefficient estimate is 55% of the original estimate for a woman’s wage rate, 64% for

earnings, 22% for spouse earnings (61% omitting the age 28 effect) and 62.5% for welfare

benefits. The signs of the coefficients are not changed by the correction, although statistical


significance occasionally is.
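The two large changes cited above can be verified from the quoted coefficients. A minimal check; the function name is illustrative:

```python
def pct_drop(original, corrected):
    """Percentage fall in magnitude from the original to the corrected estimate."""
    return 100 * (1 - corrected / original)

welfare_drop = pct_drop(-372, -85)  # non-parametric welfare income estimate
spouse_drop = pct_drop(1270, 28)    # non-parametric spouse earnings estimate, age 28

print(f"welfare income: {welfare_drop:.0f}% smaller")
print(f"spouse earnings: {spouse_drop:.0f}% smaller")
```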

The inclusion of out-of-sample cases for the non-monetary outcomes does not have

much of an impact. Education estimates are essentially unchanged, the impact of teen

childbearing on the probability of being in poverty decreases by about .015 in absolute value,

while the impact on the probability of receiving food stamps or AFDC increases by about .018 to

.020.

In Table 10, I present my estimates of the impact of teen pregnancy on socio-economic

outcomes using the sample of teens pregnant at age 17 or earlier that I drew from the NLSY79.

Again, I estimate two models – a non-parametric age specification with no additional explanatory

variables and a parametric model very similar to, though not identical to, the one estimated by

HMS. The differences are slight and highly unlikely to affect results; if they did, the results would

be far from robust. I have used the AFQT score as a linear variable rather than as a set of

dummy variables, and I have not included variables for use of alcohol and tobacco during

pregnancy, because I was uncertain how those variables were constructed. All other variables

are either identical to those used by HMS or as close as feasible without my having access to

their coding and data processing procedures. Both specifications allow the teen birth effect to

vary over time through the appropriate interaction terms.
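In the parametric specification, the impact at a given age is built up from the teen birth dummy and its interactions with powers of age. The sketch below shows the calculation for a quadratic. The coefficients are hypothetical and are not estimates from this paper, and it assumes age enters uncentered, which may differ from the coding HMS used.

```python
def teen_birth_effect(age, coefs):
    """Effect at a given age.

    coefs = (b0, b1, b2, ...): coefficient on the teen birth dummy, then on its
    interactions with age, age**2, and so on.
    """
    return sum(b * age ** k for k, b in enumerate(coefs))

hypothetical_coefs = (-0.5, 0.04, -0.001)  # illustrative values only

# Evaluate at two-year intervals from age 18 to 30, as in Table 10.
effects = {age: teen_birth_effect(age, hypothetical_coefs) for age in range(18, 31, 2)}
for age, effect in effects.items():
    print(age, round(effect, 3))
```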

The top portion of the table shows the non-parametric estimates for the outcome

variables I use and the bottom portion shows the parametric estimates. I show the actual non-

parametric coefficient estimates for the age-teen birth interaction terms at two year intervals from

age 18 to 30. For the parametric estimates, I show the key teen birth coefficients and the

impacts computed from these estimates at the same two-year intervals. To keep the table

manageable, I do not show estimates of the impact on the wage rate, because in the NLSY79

this is simply the ratio of earnings to hours worked; cumulative hours worked because the

cumulative sum can be derived from the analysis of annual hours; and whether the individual has

a GED, because this is implicit in the HS Graduate or GED analysis. I have also excluded

analyses of the number of children and years as a single mother, since these seem less central.

I have added an additional variable measuring educational attainment – whether the individual


completed at least two years of college– because this is an increasingly important outcome. I do

not show standard errors, but simply indicate which variables are statistically significant. I also

do not show coefficient estimates for any of the other independent variables. Full estimates are

available in a data appendix available from the author.

The most conspicuous feature of the non-parametric estimates is the general lack of

statistical significance. Of the 77 coefficients shown, only 9 are statistically significant at the 10%

level or higher and a great many have t-statistics of .5 or lower (not shown in the table). This is

also characteristic of the estimates in HMS 2002. More than half of the statistically significant

estimates are at age 30. Own and spouse earnings are consistently positive, but not significant

until age 30. Welfare income bounces around in a positive and negative direction and again is

not statistically significant until age 30. There are no significant differences in either high school

graduation or GED attainment; the age pattern of HS graduation estimates is actually a bit odd,

because the distribution is constant from age 20 on. There are some indications that teen

mothers are less likely to go on to college. Those impacts are consistently negative from age 22

to 28 and statistically significant at age 24, but they appear to disappear at age 30. Impacts on

marriage are positive at first, presumably because some of the women marry the father of their

child; this probably also accounts for the initial premium in spouse earnings. From age 24 on,

the marriage impacts are erratic, but they are occasionally large (-.112 at age 28), although none

of the impacts are statistically significant. The most consistent effect is for food stamps: from

age 24 on, the teen mothers consistently and usually statistically significantly use food stamps

less than other women. Their use of AFDC is negative in all years shown but one, but only the

effect at age 30 is statistically significant. Impacts on the probability of being in poverty are

erratic and none are significant. Teen mothers do appear to work more than other women, and

significantly so at age 28.

Estimates of the parametric effects are somewhat more likely to be statistically significant,

but even here only 25% (eight of 33) are statistically significant. Estimates of the teen birth

dummy and the age-interaction terms are reliably estimated for spouse earnings and for some

college attendance. In many cases, the data would probably reject the


hypothesis of a teen birth effect that varies over this age range. Given the nature of the

parametric model, the estimates vary less wildly from year to year. While there are exceptions,

the estimates are usually smaller in absolute value than in the non-parametric specification.

With all the same caveats about lack of statistical significance, it does appear that the teen

mothers earn less through age 22 and then catch up. They have spouses who initially earn more,

probably because of marriage per se, then lose most of that advantage, before regaining it plus

more at ages 28 and 30. The impacts on welfare benefits are substantially positive at first, but

do finally turn negative at age 28. The patterns from the non-parametric estimates for college

attendance and food stamps both appear here: the teen mothers are again less likely to attend

college or receive food stamps.

Table 11 is a summary of the results from HMS 2002 and from my re-estimation of the

NLSY data. The HMS 2002 results are the original ones that they present, that is, they are not

corrected for scaling and sample problems as was done in Table 9. The estimates shown are at

age 28, except for high school graduation, which is reported at age 21. Also shown in brackets

beneath the estimates for the income variables and hours worked are the sum of the estimates

from age 18 to 30. These sums are not discounted and make no allowance for whether the

underlying estimates are statistically significant or not.
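Because the bracketed sums are undiscounted, they weight effects at age 30 the same as effects at age 18. The sketch below uses hypothetical annual effects (not values from Table 11) to show how much this choice can matter; the 3% rate is likewise only an assumption for illustration.

```python
def cumulative_effect(effects_by_age, rate=0.0, base_age=18):
    """Sum annual effects; rate=0.0 gives the undiscounted sum described in the text."""
    return sum(v / (1 + rate) ** (age - base_age) for age, v in effects_by_age.items())

# Hypothetical annual dollar effects: negative early, positive later.
hypothetical = {age: -600 + 100 * (age - 18) for age in range(18, 31)}

undiscounted = cumulative_effect(hypothetical)
discounted = cumulative_effect(hypothetical, rate=0.03)

print(undiscounted)
print(round(discounted, 2))
```

With early losses offset by later gains, the undiscounted sum is zero, while any positive discount rate makes the cumulative effect negative.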

The largest differences in estimated effects are, not surprisingly, with the income variables

where the scaling problem caused HMS to substantially overstate the impacts. My age 28

estimates for own earnings are about 17-22% of theirs; the cumulative total is 23% as large for

the non-parametric estimates and less than 10% as large for the parametric estimates. For

spouse earnings, my age 28 non-parametric estimate is twice as large as theirs, but the

parametric estimate is less than half as large. Cumulative effects on spouse earnings are two-

thirds and one-half as large for the two specifications. All specifications show the teen mothers

receiving less welfare income at age 28; my parametric estimate is about one-quarter of theirs.

Interestingly, their estimates show that teen mothers receive substantially more welfare benefits

from age 18 to 30, while my non-parametric estimates show just the opposite. (Recall that there

seemed to be a number of problems with their welfare benefit variable). My parametric estimate


shows nearly $2900 additional welfare benefits for the teen mothers. The general conclusion for

the income variables is of effects that are far more modest and less beneficial than were reported

in HMS 2002, both at age 28 (though with some exceptions) and over the years from 18 to 30.

There are no sign reversals, and no evidence of significant adverse impacts of teen childbearing

except perhaps with welfare benefits.

The same pattern holds for hours worked, a variable for which I identified no special data

problems. Nonetheless, there are differences in our estimates. Like HMS, I find that teen

mothers tend to work more hours, but I find a much smaller advantage. My cumulative non-

parametric estimates are 1200 hours less than theirs, and my parametric estimates are more

than 1500 hours less. I do not know how to account for this difference.

For the education variables, there is reasonable consensus. This is not surprising, since I

found no problems with the education variable per se, except for the out-of-sample cases that

were erroneously included. All estimates show a negative impact on receiving a high school

diploma. HMS 2002 find that the positive impact on receiving a GED more than compensates for

this negative impact, so that the teen mothers are actually slightly more likely to have either a

high school degree or a GED. I do not find this latter effect. In my estimates, there is essentially

no difference in the proportion with either a high school diploma or a GED. As noted above, I

find some evidence of a negative impact on the probability of attending at least two years of

college.

Finally, all of the estimates indicate that teen mothers are less likely to be receiving food

stamps; this impact is probably the largest and most consistent of all the outcomes. All of the

estimates indicate a quantitatively small, but negative impact on AFDC receipt, with teen mothers

less likely to be receiving assistance. HMS find that the teen mothers are less likely to be poor. I

find essentially no impact. Finally, all estimates show that teen mothers are less likely to be

married at age 28. The parametric estimates are very small, while the non-parametric estimates

are quite a bit larger.

HMS use their parametric estimates to compute budgetary costs of teen childbearing,

including AFDC benefits received and taxes paid. They conclude that teen childbearing actually


reduces net government expenditures by $3.9 billion, primarily because it increases earnings

and thus taxes paid, relative to what would happen were these women to delay their

childbearing. Their procedures are complex and I have not tried to recompute this at this time

using my revised estimates. I do suspect that their overstatement of the earnings gain

attributable to teen childbearing would change these estimates by reducing the taxes paid.

Given the magnitude of the scaling problem and its impact on coefficient estimates, the change

in these budgetary cost estimates might well be substantial.

IV. Conclusion and Future Directions

I firmly believe that the new approaches to the impact of teen childbearing are productive

and informative. The two HMS contributions (KHK and HMS 2002) are particularly clever and

important contributions to this literature. Just as firmly, I believe that approaches need not only

be conceptually sound but also soundly and carefully implemented. Here, the two HMS

contributions fell short. Some of their errors may have been innocuous, affecting too few cases

to affect the estimates, although that does not constitute a defense. Other errors, especially

including the mis-scaling of incomes, had a very substantial impact on estimated coefficients.

I have focused here on a relatively straightforward replication of HMS 2002. There are a

number of extensions that could be implemented and that might well be productive. I note some

of them here.

First, there is no need to limit the analysis to pregnancies that occur at age 17 or earlier.

This definition was chosen for the research reported in Kids Having Kids and HMS

understandably continue to follow it. Other researchers, however, might want to expand the age

limit. One advantage of extending the approach to include pregnancies at age 18 is that it will

likely increase the sample of teens with a miscarriage. It will also thereby more closely mimic the

teen birth definition used in most research.

Second, researchers might well want to include the portions of the NLSY sample that

HMS dropped. The military sample is available from 1979 until 1984 and the sample of

economically disadvantaged white respondents is available through the 1991 survey. Available


sample weights allow for their inclusion in the years in which they are present in the sample. In

fact, excluding these cases without adjusting the weight variable is incorrect. Including these

sub-samples would further increase sample sizes of teens with a miscarriage.

Third, especially with a larger sample of teen miscarriages, it might be possible to

examine some issues further. One issue is the apparent high representation of stillbirths in the

miscarriage sample. The current specification treats a full-term stillbirth as equivalent to a

miscarriage that occurs in a pregnancy of much shorter duration; with a larger sample, one could

test for equivalent impacts. A second issue is that a very substantial portion of the teens with miscarriages

had a follow-up pregnancy at age 17 or earlier that led to a birth. Hoffman (1998), citing data

reported in KHK, notes that nearly 30% of the teens with a miscarriage had such a birth.

Researchers might want to consider whether the benefits of delay of a first birth were a function

of the length of time that the teens with a miscarriage actually did delay. Researchers might also

be able to consider whether the teen birth effect varies by race and/or ethnicity.

Fourth, in this paper, I have not examined issues of the sensitivity of the estimates to the

extreme incomes that are occasionally reported in the NLSY. In my data, the maximum value for

a woman’s own earnings is over $137,000 and the maximum for spouse earnings exceeds

$550,000. Maximum welfare income is greater than $27,000, which, while high, is substantially

lower than the maximum value of over $150,000 in HMS 2002. Maximum annual hours worked

is 5980, an average of over 16 hours per day for 365 days. Such values may, of course, be

genuine.
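The hours figure can be put in daily and weekly terms with a quick calculation, using the maximum quoted above:

```python
max_annual_hours = 5980  # maximum annual hours worked in my data

hours_per_day = max_annual_hours / 365   # every day of the year
hours_per_week = max_annual_hours / 52   # every week of the year

print(round(hours_per_day, 1))
print(round(hours_per_week, 1))
```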

Finally, the NLSY teens who were 14-21 in 1979 and 28-35 in 1993 when these analyses

ended, have moved further into their own life-cycles. It would be valuable to examine whether

the trends that HMS identified in the mid-to-late 20s continue as all the women in the sample

move into their 30s.


References

Campbell, A. 1968. “The Role of Family Planning in the Reduction of Poverty.” Journal of Marriage and the Family, Vol. 30(2): 236-245.

Deaton, Angus S. 1997. The Analysis of Household Surveys. Baltimore: Johns Hopkins University Press.

Duncan, Greg J. and Saul D. Hoffman. 1990. “Economic Opportunities, Welfare Benefits, and Out-of-wedlock Births among Black Teenage Girls.” Demography, 27, 519-535.

Furstenberg, Frank F. Jr., J. Brooks-Gunn, and S. Philip Morgan. 1987. Adolescent Mothers in Later Life. Cambridge: Cambridge University Press.

Geronimus A. T. and S. Korenman. 1990. “The Socioeconomic Consequences of Teen Childbearing Reconsidered,” mimeo, University of Michigan.

___________. 1992. “The Socioeconomic Consequences of Teen Childbearing Reconsidered.” Quarterly Journal of Economics, Vol 107: 1187-1214.

Grogger, Jeff and Stephen G. Bronars. 1993. “The Socioeconomic Consequences of Teenage Childbearing: Findings From a Natural Experiment.” Family Planning Perspectives, Vol. 25 (4).

Hayes, Cheryl (ed.). 1987. Risking the Future. Vol. 1. Washington, DC: National Academy Press.

Hoffman, Saul D. 1998. “Teen Childbearing Isn’t So Bad After All ... or Is It? — A Review of the New Literature on the Consequences of Teen Childbearing.” Family Planning Perspectives, Vol. 30, No. 5, pp. 236-239.

Hoffman, Saul D. and Greg J. Duncan. 1988. “What Are the Economic Consequences of Divorce?” Demography, November.

Hoffman, Saul D., E. Michael Foster, and Frank F. Furstenberg, Jr. 1993. “Re-evaluating The Costs of Teenage Childbearing.” Demography, Vol 30 (1), 1-13.

Hotz, V. Joseph, Susan McElroy, and Seth G. Sanders. 1997. “The Impacts of Teenage Childbearing on the Mothers and the Consequences of those Impacts for Government” in Kids Having Kids, Rebecca Maynard (ed.). Washington, DC: The Urban Institute Press.

___________. 2002. “Teenage Childbearing and Its Life Cycle Consequences: Exploiting a Natural Experiment,” mimeo.

Hotz, V. Joseph, Charles H. Mullin, and Seth G. Sanders. 1997. “Bounding Causal Effects Using Data from a Contaminated Natural Experiment: Analyzing the Effects of Teenage Childbearing,” Review of Economic Studies, Vol. 64, 575-603.

Maynard, Rebecca. 1997. Kids Having Kids. Washington, D.C.: The Urban Institute Press.

National Center for Health Statistics. 1996. “Medical and Life-style Risk Factors Affecting Fetal Mortality, 1989-90.” Vital and Health Statistics, Series 20, No. 31.

Peterson, Richard R. 1996. “A Re-Evaluation of the Economic Consequences of Divorce.” American Sociological Review, Vol. 61, June.

Weitzman, Lenore. 1985. The Divorce Revolution. New York: The Free Press.


Table 1. Effects of Teen Childbearing on Selected Socioeconomic Outcomes, Teen Mothers vs Teens with Miscarriage

| Outcome | KHK: All covariates, polynomial age effects (1) | HMS 2002: No covariates, unconstrained age effects (2) | HMS 2002: All covariates, unconstrained age effects (3) | HMS 2002: All covariates, polynomial age effects (4) |
|---|---|---|---|---|
| High School Diploma | -.20 | -.11 | -.16* | -.15* |
| High School Diploma or GED | .02 | .08 | .03 | .05 |
| Annual Hours Worked | 130 - 500 | 369* | 331* | 304* |
| Spouse’s Earnings | $8485 | $1269 | $2505 | $7512* |
| Annual Own Earnings | $4508 | $9270** | $8489** | $6660** |
| Own Wage Rate | – | $4.34** | $4.22** | $1.63 |
| Dollars from AFDC/Food Stamps | No effect | -$372 | -$516 | -$1018 |
| Number of Births | .30 | .30 | .27 | .35 |
| Years as Single Mother | 1.6 | – | – | – |
| Proportion Married | – | -.08 | -.07 | -.03 |
| Proportion in Poverty | – | -.11 | -.12 | -.14 |
| Proportion Receiving Food Stamps | – | -.10 | -.09 | -.15 |
| Proportion Receiving AFDC | – | -.04 | -.06 | -.04 |

Source: Kids Having Kids, Chapter 3, text and figures, and HMS 2002, Table 4.

Table Notes: ** = statistically significant at 95% level; * = statistically significant at 90% level. Most KHK estimates are at age 30, except annual hours worked where the range of estimates is for mid-20s to early 30s. HMS 2002 estimates are at age 28 except for high school diploma which is evaluated at age 21.


Table 2 – Own and Spouse Earnings, NLSY and HMS, Selected Cases

Spouse Earnings (years with positive earnings only)

| Case ID | Variable | Data Value, NLSY | Data Value, HMS | Ratio, HMS / NLSY |
|---|---|---|---|---|
| 244 | R0155500 TOT INC SP WGS & SAL P-C YR 79 | $5,044 | missing | – |
| 244 | R0312710 TOT INC SP WGS & SAL P-C YR 80 | $5,800 | $13,166 | 2.27 |
| 244 | R0482910 TOT INC SP WGS & SAL P-C YR 81 | $9,152 | $20,775 | 2.27 |
| 244 | R0784300 TOT INC SP WGS & SAL P-C YR 82 | $12,473 | $28,314 | 2.27 |
| 244 | R1026200 TOT INC SP WGS & SAL P-C YR 83 | $14,700 | $33,369 | 2.27 |
| 244 | R1412900 TOT INC SP WGS & SAL P-C YR 84 | $17,547 | $39,831 | 2.27 |
| 244 | R1780700 TOT INC SP WGS & SAL P-C YR 85 | $22,000 | $49,940 | 2.27 |
| 244 | R2143800 TOT INC SP WGS & SAL P-C YR 86 | $31,275 | $70,994 | 2.27 |
| 244 | R2352500 TOT INC SP WAGES & SALARY PAST YR 87 | $32,840 | $74,546 | 2.27 |
| 244 | R3561200 AMT SP REC'D 1990 FROM WAGES 91 | $50,000 | $113,500 | 2.27 |
| 244 | R3899300 AMT SP REC'D 1991 FROM WAGES 92 | $70,000 | $158,900 | 2.27 |
| 244 | R4314400 AMT SP REC'D 1992 FROM WAGES 93 | $60,000 | $63,295 | 1.05 |
| 175 | R1026200 TOT INC SP WGS & SAL P-C YR 83 | 500 | $10,215 | 2.27 |
| 175 | R1412900 TOT INC SP WGS & SAL P-C YR 84 | $22,000 | $49,980 | 2.27 |
| 175 | R3561200 AMT SP REC'D 1990 FROM WAGES 91 | $30,000 | $68,100 | 2.27 |
| 175 | R3899300 AMT SP REC'D 1991 FROM WAGES 92 | $34,560 | $78,451 | 2.27 |
| 175 | R4314400 AMT SP REC'D 1992 FROM WAGES 93 | $37,000 | $39,032 | 1.05 |

Woman's Annual Earnings (years with positive earnings only)

| Case ID | Variable | Data Value, NLSY | Data Value, HMS | Ratio, HMS / NLSY |
|---|---|---|---|---|
| 3 | R0312300 TOT INC WGS & SAL P-C YR 80 1979 | $7,000 | $15,890 | 2.27 |
| 3 | R0782100 TOT INC WGS & SAL P-C YR 82 1981 | $7,000 | $15,890 | 2.27 |
| 3 | R3897100 R'S WAGES/SALARY/TIPS (PCY) 92 1991 | $4,000 | $9,080 | 2.27 |
| 3 | R4295100 AMT OF R'S WAGES/SALARY/TIPS (PCY) 93 | $6,000 | $6,329 | 1.05 |
| 19 | R0155400 TOT INC WGS & SAL P-C YR 79 1978 | $4,000 | $9,080 | 2.27 |
| 19 | R0312300 TOT INC WGS & SAL P-C YR 80 1979 | $8,500 | $19,295 | 2.27 |
| 19 | R0482600 TOT INC WGS & SAL P-C YR 81 1980 | $10,000 | $22,700 | 2.27 |
| 19 | R0782100 TOT INC WGS & SAL P-C YR 82 1981 | $9,000 | $20,430 | 2.27 |
| 62 | R3897100 AMT OF R'S WAGES/SALARY/TIPS (PCY) 92 | $25,000 | $56,750 | 2.27 |
| 62 | R4295100 AMT OF R'S WAGES/SALARY/TIPS (PCY) 93 | $26,000 | $27,428 | 1.05 |

Note: CPI (1994) = 148.2; CPI (1978) = 65.2.
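The 2.27 ratio in the table matches the ratio of the two CPI values in the note. The check below applies that rounded ratio to three of the NLSY spouse earnings values for case 244 and compares the result with the HMS values shown in the table.

```python
cpi_1994 = 148.2
cpi_1978 = 65.2

ratio = cpi_1994 / cpi_1978
print(round(ratio, 3))  # ratio of the two price index values

# (NLSY value, HMS value) for case 244, interview years 80, 81, and 82.
pairs = [(5800, 13166), (9152, 20775), (12473, 28314)]
rescaled = [round(nlsy * round(ratio, 2)) for nlsy, _ in pairs]
print(rescaled)
```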


Table 3 – Observed Data Ages and Sample Sizes by Birth Cohort

| Birth Cohort | Number of Observations | Observed Age Range (Ages 18-35, 1978-92) | Maximum Total Potential Observations Contributed |
|---|---|---|---|
| 1957 | 121 | 21-35 | 1815 |
| 1958 | 101 | 20-34 | 1515 |
| 1959 | 130 | 19-33 | 1950 |
| 1960 | 125 | 18-32 | 1875 |
| 1961 | 128 | 18-31 | 1792 |
| 1962 | 133 | 18-30 | 1729 |
| 1963 | 141 | 18-29 | 1692 |
| 1964 | 100 | 18-28 | 1100 |
| ALL | 979 | 18-35 | 13,468 |
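The last column of Table 3 can be rebuilt from the cohort sizes and the observation window. A minimal sketch, assuming age is measured as calendar year minus birth year; the function name is illustrative.

```python
# Number of women in each birth cohort, from Table 3.
cohort_sizes = {1957: 121, 1958: 101, 1959: 130, 1960: 125,
                1961: 128, 1962: 133, 1963: 141, 1964: 100}

def years_observed(birth_year, first_year=1978, last_year=1992, min_age=18, max_age=35):
    """Number of years in the window during which the cohort is aged 18 to 35."""
    youngest = max(min_age, first_year - birth_year)
    oldest = min(max_age, last_year - birth_year)
    return max(0, oldest - youngest + 1)

potential = {by: n * years_observed(by) for by, n in cohort_sizes.items()}
total_potential = sum(potential.values())

print(potential)
print(sum(cohort_sizes.values()), total_potential)
```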


Table 4. Included Out-of-Sample Observations by Dependent Variable, HMS Teen Pregnancy Sample

| Variable | Out-of-Sample Year(s) Included | Number of Included Cases | Mean | Notes |
|---|---|---|---|---|
| Spouse Earnings | 1977 | 178 | $0 | $0 assigned for all unmarried women; missing for married women |
| Welfare Benefits | 1977 | 316 | $3,137 | Not clear what is source of data or why some cases are missing |
| In Poverty | 1975 / 1976 / 1977 | 121 / 222 / 352 | 0.0% / 0.0% / 0.0% | All cases included; none are poor. |
| Received Food Stamps | 1975 / 1976 / 1977 | 121 / 222 / 352 | 100.0% / 100.0% / 32.0% | All cases included; all reported receiving Food Stamps, 1975-1976. |
| Received AFDC Benefits | 1975 / 1976 / 1977 | 121 / 222 / 352 | 100.0% / 100.0% / 31.0% | All cases included; all reported receiving AFDC, 1975-1976 |
| HS Diploma | 1975 / 1976 / 1977 | 121 / 222 / 352 | 39.0% / 44.0% / 42.0% | All cases included; possible assignment by natural age at graduation |
| GED | 1975 / 1976 / 1977 | 121 / 222 / 352 | 0.8% / 4.5% / 7.7% | All cases included; not clear what is source of data. |
| Hours Worked | None | – | – | |
| Annual Earnings | None | – | – | |
| Wage Rate | None | – | – | |


Table 5– Comparison of Food Stamps and AFDC Benefits, NLSY and HMS, Selected Representative Cases

Columns (1)-(5) are HMS variables; columns (6)-(10) are NLSY variables.

| ID (1) | Year (2) | Food Stamp Benefits (3) | AFDC Benefits (4) | Welfare Benefits (5) | Annual Food Stamp Dollars (6) | AFDC - Avg Monthly Benefit (7) | Total Welfare Dollars (8) | AFDC Months (computed) (9) | AFDC Benefits (computed) (10) |
|---|---|---|---|---|---|---|---|---|---|
| 4 | 82 | $776 | $2,499 | $9,126 | $416 | $314 | $1,986 | 5.0 | $1,570 |
| 4 | 83 | $749 | $2,407 | $8,802 | $1,296 | $314 | $5,064 | 12.0 | $3,768 |
| 4 | 84 | $555 | $1,979 | $7,118 | $1,080 | $314 | $4,848 | 12.0 | $3,768 |
| 19 | 88 | $667 | $2,825 | $9,713 | $348 | $446 | $2,132 | 4.0 | $1,784 |
| 19 | 89 | $808 | $1,569 | $6,694 | $1,284 | $446 | $6,636 | 12.0 | $5,352 |
| 19 | 90 | $130 | $0 | $607 | $1,644 | $446 | $4,320 | 6.0 | $2,676 |
| 19 | 91 | $726 | $0 | $2,222 | $256 | $0 | $256 | 0.0 | $0 |
| 86 | 78 | $614 | $2,498 | $9,357 | $0 | $0 | $1,188 | 0.0 | $0 |
| 86 | 79 | $603 | $2,990 | $9,988 | $912 | $309 | $4,620 | 12.0 | $3,708 |
| 86 | 80 | $481 | $2,953 | $9,556 | $672 | $303 | $4,308 | 12.0 | $3,636 |
| 86 | 81 | $536 | $2,836 | $9,501 | $648 | $346 | $4,870 | 12.2 | – |
| 86 | 82 | $458 | $2,806 | $9,096 | $828 | $344 | $4,956 | 12.0 | $4,128 |
| 86 | 83 | $411 | $2,849 | $9,084 | $648 | $358 | $4,944 | 12.0 | $4,296 |
| 86 | 84 | $396 | $2,837 | $9,010 | $648 | $380 | $5,208 | 12.0 | $4,560 |
| 86 | 85 | $387 | $2,891 | $9,134 | $648 | $389 | $5,316 | 12.0 | $4,668 |
| 86 | 86 | $375 | $2,923 | $9,189 | $648 | $408 | $5,544 | 12.0 | $4,896 |
| 86 | 87 | $405 | $2,889 | $9,178 | $648 | $425 | $5,748 | 12.0 | $5,100 |
| 86 | 88 | $407 | $2,882 | $9,162 | $756 | $435 | $5,976 | 12.0 | $5,220 |
| 86 | 89 | $324 | $2,467 | $7,815 | $768 | $457 | $6,252 | 12.0 | $5,484 |
| 86 | 90 | $121 | $0 | $583 | $448 | $500 | $3,948 | 7.0 | $3,500 |
| 86 | 91 | $375 | $0 | $1,271 | $336 | | $336 | 0.0 | $0 |
| 204 | 78 | $610 | $3,330 | $12,311 | $300 | $333 | $4,296 | 12.0 | $3,996 |
| 204 | 82 | $921 | $2,406 | $9,266 | $1,420 | $424 | $5,236 | 9.0 | $3,816 |
| 204 | 87 | $861 | $2,560 | $12,672 | $1,800 | $446 | $10,056 | 18.5 | – |
| 204 | 88 | $601 | $1,611 | $6,247 | $805 | $432 | $2,965 | 5 | $2,160 |
| 204 | 89 | $789 | $2,103 | $8,087 | $1,800 | $400 | $6,600 | 12 | $4,800 |


Table 6. Fertility Coding Issues in HMS Teen Fertility Data

| | Birth | Abortion | Miscarriage/Stillbirth |
|---|---|---|---|
| Reported Cases | 727 | 185 | 68 |
| Disposition Based on NLSY Variables, 1982-92 | 708 – teen birth; 1 – miscarriage; 1 – abortion; 3 – 1st pregnancy at age > 17; 14 – no reported information about how pregnancy ended or age at 1st pregnancy | 170 – abortion; 3 – miscarriage; 3 – 1st pregnancy at age > 17; 9 – no reported information about how pregnancy ended or age at 1st pregnancy | 56 – miscarriage; 1 – abortion; 3 – 1st pregnancy at age > 17; 8 – stillbirth |
| Number of excluded appropriate cases | 65 | 17 | 9 (includes 3 stillbirths) |
| Total Available Analysis Sample | 773 | 185 | 64 (miscarriage); 73 (including stillbirths) |
| Percent of Included Cases Correct | 708/727 | 170/185 | 64/68 |
| Percent of Available Cases Included | 708/773 | 170/187 | 64/73 |

Table 7. Sample Means for Included Non-Response Cases, HMS Teen Pregnancy Sample

| Variable | N | Minimum | Maximum | Mean |
|---|---|---|---|---|
| Has Received HS Diploma or GED by Age t | 341 | .00 | 1.00 | .581 |
| Has Received GED by Age t | 341 | .00 | 1.00 | .191 |
| Has Received HS Diploma by Age t | 341 | .00 | 1.00 | .405 |
| On Food Stamps at Age t | 341 | .00 | 1.00 | .58 |
| In poverty in year t | 341 | 0.00 | 1.00 | .006 |
| Annual Welfare Benefits, in 1994$ | 197 | 256.06 | 106,407.55 | 3457.36 |
| Spouse's Annual Earnings, in 1994$ | 155 | .00 | .00 | .00 |
| Woman's Annual Earnings, in 1994$ | 2 | .00 | .00 | .00 |
| Ann. Family Income, in 1994$ | 2 | .00 | .00 | .00 |
| Hourly Wage Rate, in 1994$ | 0 | -- | -- | -- |


Table 8. Population Estimates for Background and Outcome Variables, Teens Pregnant at Age 17 or Earlier, NLSY79

| Variable | Population Mean: NLSY (N=1033) | Population Mean: HMS 2002 (N=980) |
|---|---|---|
| Background Variables | | |
| Black | 25.9% | 30.0% |
| White | 65.5% | 60.6% |
| Hispanic | 8.6% | 9.5% |
| From female-headed family | 18.3% | 17.4% |
| From two-parent family | 72.3% | 73.2% |
| Mother’s education | 10.5 | 10.5 (a) |
| Father’s education | 10.6 | 10.5 (a) |
| AFQT score | 31.7 | 31.8 |
| Family Income (1978) | $13,565 | $16,981 (b) |
| Family on AFDC (1978) | 18.7% | 15.0% |
| Outcome Measures (at age 28, except as noted) | | |
| High School Grad (age 21) | 47.6% | 48.2% |
| HS or GED | 64.9% | 68.5% |
| Some College | 9.8% | -- |
| Hours Worked | 1166 | 1092 |
| Cumulative Hours Worked | 9102 | 9189 |
| Own Earnings (c) | $9110 | $16,147 |
| Spouse Earnings (c) (if married) | $25,468 | -- |
| Spouse Earnings (c) (all) | $12,507 | $23,891 |
| Income from Welfare (c) | $1447 | $2351 |
| Married | 55.0% | 64.6% |
| Poor | 31.0% | 42.0% |
| Received Food Stamps | 26.0% | 27.0% |
| Received AFDC Income | 18.2% | 22.0% |

Table Notes:
(a) adjusted from HMS 2002 to exclude missing data
(b) adjusted from HMS 2002 to exclude missing data and rescale to 1978 dollars
(c) in 1994 dollars


Table 9. Original and Corrected HMS Estimates of Effect of Teen Childbearing on Selected Socioeconomic Outcomes – Teen Mothers vs Teens with Miscarriage

| Outcome | Original: Non-Parametric, no covariates | Original: Parametric, with covariates | Corrected (a): Non-Parametric, no covariates | Corrected (a): Parametric, with covariates |
|---|---|---|---|---|
| High School Graduate | -0.107 | -0.146 | -- | -0.145 |
| HS or GED | 0.080 | 0.046 | -- | 0.049 |
| Hours Worked | 368.90* | 303.64* | -- | -- |
| Cumulative Hours Worked | 2605.41** | 1732.10 | -- | -- |
| Wage Rate | 4.34** | 1.64 | 2.15 | 1.08 |
| Own Earnings | 9269.65*** | 6660.43** | 5467.41** | 3915.85 |
| Spouse Earnings | 1269.85 | 7511.77* | 28.13 | 4436.58 |
| Income from Welfare | -372.27 | -1017.79 | -84.99 | -617.73 |
| Married | -0.082 | -0.033 | -- | -- |
| Poor | -0.106 | -.139** | -- | -0.124 |
| Received Food Stamps | -0.097 | -.152** | -- | -0.174 |
| Received AFDC Income | -0.043 | -0.045 | -- | -0.063 |

(a) Corrected for scaling and sampling problems.
*, **, and *** = statistically significant at 10%, 5%, and 1% level


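Table 10. IV Estimates of Effect of Teen Childbearing (Pregnant by Age 17) on Socio-Economic Outcomes, NLSY79

Non-Parametric Estimates, No Covariates

| | Earnings | Spouse Earnings | Welfare Income | HS Graduate | HS Grad or GED | Some College | Hours Worked | Poor | Married | Food Stamp Use | AFDC Use |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Effect at Age 18 | -215.5 | 2469.9 | 473.2 | 0.099 | 0.174 | 0.002 | -143.3 | 0.025 | 0.189 | -0.038 | -0.008 |
| Effect at Age 20 | 1086.4 | 2375.1 | 177.2 | 0.002 | 0.014 | -0.010 | 219.0 | -0.040 | 0.065 | 0.066 | -0.056 |
| Effect at Age 22 | 1727.6 | 2680.9 | -244.2 | -0.048 | 0.013 | -0.053 | 200.8 | -0.094 | 0.056 | 0.044 | -0.052 |
| Effect at Age 24 | 1167.6 | 4015.9 | -302.7 | -0.053 | 0.021 | -0.084* | 133.9 | 0.005 | -0.055 | -0.199** | -0.074 |
| Effect at Age 26 | 909.1 | 2904.6 | 171.9 | -0.114 | -0.022 | -0.074 | 188.0 | 0.066 | 0.032 | -0.094 | 0.016 |
| Effect at Age 28 | 2086.7 | 2803.8 | -470.5 | -0.080 | 0.004 | -0.066 | 308.6** | -0.040 | -0.112 | -0.166** | -0.085 |
| Effect at Age 30 | 3846.9** | 6471.7** | -1361.2** | 0.006 | 0.053 | 0.024 | 154.5 | 0.045 | 0.009 | -0.189** | -0.122* |
| Sample Size | 13651 | 13651 | 13620 | 13651 | 13634 | 13583 | 13460 | 11963 | 12902 | 13640 | 13483 |

Parametric Estimates With All Covariates

| | Earnings | Spouse Earnings | Welfare Income | HS Graduate | HS Grad or GED | Some College | Hours Worked | Poor | Married | Food Stamp Use | AFDC Use |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Teen Birth | 54.03 | 49721.17* | -3773.40 | 2.662 | 1.009 | 0.952** | -34.68 | 0.076 | 1.865 | 0.836 | 0.168 |
| Age | -193.17 | -4405.81** | 457.02 | -0.282 | -0.083 | -0.082** | -16.82 | 0.000 | -0.142 | -0.058 | -0.001 |
| Age Squared | 8.29 | 97.99** | -11.82* | 0.009 | 0.002* | 0.002** | 0.84 | 0.000 | 0.003 | 0.001 | 0.000 |
| Effect at Age 18 | -738.4 | 2164.9 | 623.8 | 0.017 | 0.060 | -- | -63.7 | 0.053 | 0.172 | 0.072 | 0.068 |
| Effect at Age 20 | -495.0 | 800.5 | 639.7 | -0.051 | 0.020 | -- | -33.2 | 0.049 | 0.090 | 0.022 | 0.047 |
| Effect at Age 22 | -185.3 | 219.9 | 561.0 | -0.091 | -0.006 | -0.060 | 4.2 | 0.044 | 0.030 | -0.020 | 0.025 |
| Effect at Age 24 | 190.6 | 423.3 | 387.7 | -0.107 | -0.019 | -0.074 | 48.3 | 0.039 | -0.008 | -0.056 | 0.000 |
| Effect at Age 26 | 632.9 | 1410.5 | 119.9 | -0.103 | -0.019 | -0.075 | 99.1 | 0.034 | -0.025 | -0.085 | -0.026 |
| Effect at Age 28 | 1141.4 | 3181.7 | -242.5 | -0.083 | -0.005 | -0.064 | 156.7 | 0.028 | -0.021 | -0.107 | -0.054 |
| Effect at Age 30 | 1716.2 | 5736.8 | -699.4 | -0.053 | 0.021 | -0.039 | 221.0 | 0.022 | 0.004 | -0.121 | -0.084 |
| Sample Size | 13076 | 13076 | 13049 | 13606 | 13606 | 13555 | 12892 | 11473 | 12396 | 13066 | 12919 |

Age Cubed: -0.0001 (reported for a single outcome; its column is not preserved in the source layout)

** = statistically significant at 5% level; * = statistically significant at 10% level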
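The parametric "Effect at Age" rows of Table 10 can be approximately recovered from the three coefficient rows. The sketch below assumes that the Age and Age Squared rows are interactions with the teen-birth indicator, so that the effect at a given age is a quadratic in age; the table does not state this explicitly, and the function and variable names are hypothetical.

```python
# Parametric coefficients for Own Earnings, copied from Table 10.
EARNINGS = {"teen_birth": 54.03, "age": -193.17, "age_sq": 8.29}


def effect_at_age(coef, age):
    """Effect of a teen birth at a given age under the assumed quadratic-in-age form."""
    return coef["teen_birth"] + coef["age"] * age + coef["age_sq"] * age ** 2


# Evaluate at the ages reported in the table (18, 20, ..., 30).
for age in range(18, 31, 2):
    print(age, round(effect_at_age(EARNINGS, age), 1))
```

With the rounded coefficients as printed, this gives about -737 at age 18 and about 1,145 at age 28, against tabled values of -738.4 and 1141.4; the small gaps are consistent with rounding of the reported coefficients.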


Table 11. Summary of Estimated Effects of Teen Childbearing on Socio-Economic Outcomes, HMS 2002 and NLSY

Estimate at Age 28, Except as Noted; Sum of Impacts Age 18-30 in Brackets

| Outcome | HMS 2002: Non-Parametric | HMS 2002: Parametric | NLSY79: Non-Parametric | NLSY79: Parametric |
|---|---|---|---|---|
| High School (age 21) | -0.107 | -0.146 | -0.080 | -0.080 |
| HS or GED | 0.080 | 0.046 | 0.004 | -0.010 |
| Some College | -- | -- | -0.066 | -0.060 |
| Hours Worked | 368.9 [2964.0] | 303.6 [2313.6] | 308.6 [1751.5] | 156.7 [780.9] |
| Own Earnings | 9269.7 [66868.6] | 6660.4 [46784.3] | 2086.7 [15898.1] | 1141.4 [3986.4] |
| Spouse Earnings | 1269.9 [61281.0] | 7511.8 [45444.0] | 2803.8 [40209.9] | 3181.7 [23336.2] |
| Welfare Income | -372.3 [2784.4] | -1017.8 [1619.2] | -470.5 [-1780.5] | -242.5 [2889.1] |
| Married | -0.082 | -0.033 | -0.112 | -0.02 |
| Poor | -0.106 | -.139 | -0.040 | 0.03 |
| Food Stamps | -0.097 | -0.152 | -0.166 | -0.11 |
| AFDC | -0.043 | -0.045 | -0.085 | -0.05 |
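The bracketed "Sum of Impacts Age 18-30" figures in Table 11 can be illustrated for the NLSY79 parametric earnings estimate. The sketch below assumes that the sum is taken over each integer age from 18 to 30 and that the age-specific effect is the quadratic implied by the Teen Birth, Age, and Age Squared coefficients in Table 10; both are assumptions rather than statements from the paper, and the names are hypothetical.

```python
def cumulative_effect(b_teen_birth, b_age, b_age_sq, first_age=18, last_age=30):
    """Sum the assumed quadratic age-specific effect over every integer age in the range."""
    return sum(
        b_teen_birth + b_age * a + b_age_sq * a * a
        for a in range(first_age, last_age + 1)
    )


# NLSY79 parametric Own Earnings coefficients from Table 10.
total = cumulative_effect(54.03, -193.17, 8.29)
print(round(total, 1))
```

With the rounded coefficients this comes to roughly 4,018, within about one percent of the 3986.4 reported in the table.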


Figure 1. Average Own Earnings by Age, 1994 Dollars, Teen Pregnancy Sample

[Line chart. Horizontal axis: Age, 15 to 40. Vertical axis: Earnings, $0 to $30,000. Series: Reported Earnings; Corrected Earnings.]

Reported earnings are from HMS 2002; corrected earnings are adjusted properly to 1994 dollars. Sample is persons with earnings.

Figure 2. Average Spouse Earnings by Age, 1994 Dollars, Teen Pregnancy Sample

[Line chart. Horizontal axis: Age, 15 to 40. Vertical axis: Earnings, $10,000 to $70,000. Series: Reported Spouse Earnings; Corrected Spouse Earnings.]

Reported earnings are from HMS 2002; corrected earnings are adjusted properly to 1994 dollars. Sample is persons with earnings.


Figure 3. Average Scaling Error by Age

[Chart. Horizontal axis: age, 18 to 34. Vertical axis: scaling error, 0% to 90%.]


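Appendix Table 1 – Comparison of Food Stamp Dollars, NLSY and HMS Data, Selected Cases

| Case ID | Year | NLSY Value | HMS Value | NLSY in 1978 $ |
|---|---|---|---|---|
| 10259 | 1978 | $65.00 | $65.00 | $65.00 |
| 10259 | 1979 | $657.00 | $589.92 | $590.03 |
| 10259 | 1980 | $1,068.00 | $845.06 | $845.07 |
| 10259 | 1981 | $1,656.00 | $1,187.63 | $1,187.80 |
| 10259 | 1982 | $1,920.00 | $1,297.13 | $1,297.24 |
| 10259 | 1983 | $1,980.00 | $1,296.00 | $1,296.14 |
| 10259 | 1984 | $1,608.00 | $1,008.94 | $1,009.06 |
| 10259 | 1985 | $1,656.00 | $1,003.31 | $1,003.45 |
| 10259 | 1986 | $264.00 | $157.03 | $157.05 |
| 10259 | 1987 | $1,242.00 | $712.83 | $712.84 |
| 10259 | 1988 | $0.00 | $0.00 | $0.00 |
| 10259 | 1989 | $1,920.00 | $1,009.50 | $1,009.55 |
| 10259 | 1990 | $3,000.00 | $1,496.44 | $1,496.56 |
| 10259 | 1991 | $1,080.00 | $516.94 | $517.00 |
| 6234 | 1978 | $1,800.00 | $1,123.27 | $1,800.00 |
| 6234 | 1979 | $924.00 | $720.14 | $829.82 |
| 6234 | 1980 | $864.00 | $487.17 | $683.65 |
| 6234 | 1981 | $588.00 | $451.99 | $421.76 |
| 6234 | 1982 | $684.00 | $115.52 | $462.14 |
| 6234 | 1983 | $0.00 | $321.89 | $0.00 |
| 6234 | 1984 | $684.00 | $418.08 | $429.23 |
| 6234 | 1985 | $684.00 | $103.59 | $414.47 |
| 6234 | 1986 | $0.00 | $366.75 | $0.00 |
| 6234 | 1987 | $852.00 | $424.80 | $489.00 |
| 6234 | 1988 | $732.00 | $498.33 | $403.44 |
| 6234 | 1989 | $1,008.00 | $612.87 | $530.01 |
| 10314 | 1978 | $744.00 | $1,781.72 | $744.00 |
| 10314 | 1979 | $2,004.00 | $1,852.78 | $1,799.74 |
| 10314 | 1980 | $2,364.00 | $1,823.25 | $1,870.54 |
| 10314 | 1981 | $2,520.00 | $1,868.53 | $1,807.52 |
| 10314 | 1982 | $2,796.00 | $1,650.47 | $1,889.11 |
| 10314 | 1983 | $2,475.00 | $1,633.50 | $1,620.18 |
| 10314 | 1984 | $2,532.00 | $1,673.25 | $1,588.90 |
| 10314 | 1985 | $2,808.00 | $1,701.38 | $1,701.50 |
| 10314 | 1986 | $0.00 | $0.00 | $0.00 |
| 10314 | 1987 | $4,080.00 | $2,248.50 | $2,341.69 |
| 10314 | 1988 | $4,080.00 | $2,010.00 | $2,248.66 |
| 10314 | 1989 | $3,672.00 | $2,139.19 | $1,930.76 |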

Note: Bolded entries indicate that HMS values are in 1978 dollars.
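The "NLSY in 1978 $" column of Appendix Table 1 can be reproduced by deflating each nominal NLSY value with a price index. The sketch below assumes the index is the CPI-U annual average (1982-84 = 100); the appendix does not name the deflator, so the index values and the function name are supplied here as assumptions, chosen because they match the tabled figures for case 10259.

```python
# CPI-U annual averages, 1982-84 = 100 (assumed deflator; not stated in the appendix).
CPI_U = {1978: 65.2, 1979: 72.6, 1980: 82.4, 1981: 90.9}


def to_base_dollars(nominal, year, base_year=1978, cpi=CPI_U):
    """Convert a nominal dollar amount in `year` to `base_year` dollars."""
    return nominal * cpi[base_year] / cpi[year]


# Case 10259: nominal NLSY food stamp dollars for 1979-1981.
for year, nominal in [(1979, 657.00), (1980, 1068.00), (1981, 1656.00)]:
    print(year, round(to_base_dollars(nominal, year), 2))
```

For this case the deflated values agree with the "NLSY in 1978 $" column to the cent and sit within a few cents of the HMS values, which is the pattern the note attributes to HMS values left in 1978 dollars.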

27 There are 17 cases that do not report a teen birth on the basis of the 1984-92 information, but in their fertility history report a birth at age 18.75 or earlier, thus indicating a pregnancy that began at age 17 or earlier. There is, however, no way to know whether this was a first pregnancy. I have chosen not to include these cases.


Data Appendix – Coding of Teen Fertility Events in NLSY

Coding of the outcome of an early teen pregnancy is based primarily on information available in the 1984-86, 1988, 1990, and 1992 interviews. In each of these years, there is a pair of variables indicating the age at the beginning of a first pregnancy and the outcome of that pregnancy. I use all years of data in this coding, since some cases appear in the file only in some years or provide information about, for example, age at first pregnancy, only in a single year. From 1984 to 1990, the outcome variable is coded into four categories: birth, abortion, miscarriage, or stillbirth. In 1992, the last two categories are combined. It appears that HMS included stillbirths in their miscarriage category.

In 1982 and 1983, there is information on the outcome of a pregnancy that occurred prior to a first birth and the year and month in which that first pregnancy ended. This can be combined with information on own date of birth to construct an approximate age at the beginning of a first pregnancy that ended in other than a live birth. In most cases, the 1982-83 information is translated into and is consistent with the 1984-1992 information. There is some occasional inconsistency across years.

I code teen fertility in two steps. First, I use the 1984-92 variables to identify cases that ever report a first pregnancy at age 17 or less along with one of the designated fertility outcomes. The relevant pairs of variables are 1984: r1522057 and r1522056; 1985: r1892759 and r1892758; 1986: r2259859 and r2259858; 1988: r2879800 and r2879700; 1990: r3409900 and r3409800; and 1992: r4009469 and r4009468. I then check the 1982-83 pregnancy information and the information on the date of first birth to identify additional cases not otherwise classified that report a pregnancy that began at age 17 or earlier. I further check these cases against what I treat as the more reliable information in the 1984-92 data. Where there are conflicts that cannot be resolved, I rely on the 1984-92 data. Like HMS, I exclude the military subsample and the poor whites in the supplementary sample, even in those years where they are available.
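The first coding step can be sketched as a scan over a respondent's wave-by-wave reports. This is a minimal illustration under stated assumptions, not the code used for the paper: the record layout, the outcome labels, and the function name are all hypothetical, the actual NLSY numeric codes are not given in the text, and the second step (checking against the 1982-83 information and resolving conflicts) is not modeled.

```python
from typing import Iterable, Optional, Tuple

# Placeholder outcome labels; the actual NLSY numeric codes are not given in the text.
DESIGNATED_OUTCOMES = {"birth", "abortion", "miscarriage", "stillbirth"}

# One report per interview wave: (age at start of first pregnancy, outcome).
Report = Tuple[Optional[float], Optional[str]]


def first_step_outcome(reports: Iterable[Report], max_age: float = 17) -> Optional[str]:
    """Return the outcome from the first wave that reports a first pregnancy
    at or below max_age together with a designated outcome, else None."""
    for age, outcome in reports:
        if age is None or outcome is None:
            continue  # wave missing or incomplete for this respondent
        if age <= max_age and outcome in DESIGNATED_OUTCOMES:
            return outcome
    return None


# Hypothetical respondent: missing in the first wave, then consistent reports.
waves = [(None, None), (16, "miscarriage"), (16, "miscarriage")]
print(first_step_outcome(waves))  # miscarriage
```

Taking the first qualifying wave is one way to read "ever report"; a fuller implementation would also need a rule for respondents whose waves disagree.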

Using just the 1984-92 data, there are 773 teen births, 181 teen abortions, and 67 teen miscarriages (including 11 stillbirths). An additional 10 cases have 1982-83 information that indicates, either by itself or in conjunction with the 1984-92 data, there was a miscarriage that could have begun at age 17 or earlier. After checking the 1982-83 information against the 1984-92 information, I conclude that six of these cases (IDs= 1302, 3952, 5774, 6715, 8194, and 8289) are probably teen miscarriages, one is a miscarriage in a pregnancy that began at age 18 (1000), and three are teen abortions (626, 4645, and 11876). This leaves a grand total of 73 miscarriages, including 11 stillbirths.

There are also another eight cases that appear to be teen abortions based on the 1982-83 information. Of these, two are inconsistent with the 1984-92 information for the age at which the pregnancy began, but the other six appear to be genuine teen abortions (IDs 3001, 5659, 5927, 5930, 6772, and 8103). This yields a total of 187 teen abortions.27

28 The actual abortion count in HMS is 184. One additional case is coded as an abortion but is excluded from the sample by a filter that erroneously codes it as missing in each sample year.


HMS 2002 report 727 teen births, 185 abortions, and 68 miscarriages (including stillbirths). Of the 68 HMS teen miscarriages, eight are stillbirths (Case ID= 1985, 2214, 5589, 6807, 7074, 7439, 7774, 11802); one is coded abortion in all years (ID=4645); and three occurred at age 18 or older (ID = 5171, 8614, 10000). This leaves 56 legitimate teen miscarriage cases or 64, including the stillbirths.

In addition, there are nine teen miscarriages that are not included in the HMS miscarriage sample and that are readily identified by the 1984-1992 fertility information. Six are genuine teen miscarriages (ID= 590, 845, 4252, 4295, 4787, 5896). Three are stillbirths (ID = 96, 3191, 8260), but I include them because HMS included stillbirths as miscarriages, too. Including the stillbirths, there are a maximum of 73 teen miscarriages, consisting of 11 stillbirths and 62 miscarriages.

Of the 185 HMS teen abortion cases,28 three are teen births (IDs= 1071, 1548, 3051), three are teen miscarriages (IDs= 3191, 4252, 5896), and nine are not teen pregnancies or have insufficient information to evaluate as teen abortions (IDs= 343, 1250, 1558, 5044, 9371, 9521, 9863, 9927, 11968). There are also 17 legitimate teen abortion cases not included as teen abortions in the HMS file. All are coded as a teen abortion in most years, though the code may be missing in some years (case IDs= 438, 692, 1377, 1940, 2217, 2221, 2402, 2552, 2952, 3185, 4474, 4645, 4665, 5195, 5681, 5682, 9865). The net abortion count is 185 (HMS) - 15 (invalid) + 17 (not on HMS file) = 187.

The HMS file includes 727 teen births. Of these, 19 do not have any reported information of a teen birth: one is a teen miscarriage (ID=845), one is a teen abortion (9865), three are the result of pregnancies at age 18 (842, 4802, 8319), and 14 have no reported information about how pregnancy ended or age at pregnancy for 1982-1992. In addition, there are 65 cases that report a teen birth at age 17 or younger that are excluded. A complete listing is available on request. The net teen birth count is 773.
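The net counts reported in this appendix follow one pattern: start from the HMS count, subtract the cases judged invalid, and add the valid cases HMS omitted. The short check below restates that arithmetic using the figures given above; the function name is hypothetical, and treating the miscarriage count the same way (68, less one abortion and three pregnancies at age 18 or older, plus nine omitted cases) is my reading of the text rather than a formula stated in it.

```python
def net_count(hms_reported, invalid, omitted):
    """HMS count, less invalid cases, plus valid cases missing from the HMS sample."""
    return hms_reported - invalid + omitted


births = net_count(727, 1 + 1 + 3 + 14, 65)  # 19 cases lack evidence of a teen birth
abortions = net_count(185, 3 + 3 + 9, 17)    # 15 invalid abortion cases
miscarriages = net_count(68, 1 + 3, 9)       # stillbirths retained, as in HMS

print(births, abortions, miscarriages)  # 773 187 73
```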
