THE WORLD BANK

Discussion Paper

EDUCATION AND TRAINING SERIES

Report No. EDT100

The Specification of Earnings Functions: Tests and Implications

C. R. S. Dougherty and E. Jimenez

June 1987

Education and Training Department, Operations Policy Staff

The views presented here are those of the author(s), and they should not be interpreted as reflecting those of the World Bank.



Discussion Paper

Education and Training Series

Report No. EDT100

The Specification of Earnings Functions:

Tests and Implications

C.R.S. Dougherty and E. Jimenez

June 1987

Research Division

Education and Training Department

The World Bank

The World Bank does not accept responsibility for the views expressed herein, which are those of the author(s) and should not be attributed to the World Bank or its affiliated organizations. The findings, interpretations, and conclusions are the results of research or analysis supported by the Bank; they do not necessarily represent official policy of the Bank.

Copyright © 1987 The International Bank for Reconstruction and Development/The World Bank


ABSTRACT

Many studies of the returns to education have relied on the Mincerian specification for the earnings function. This study uses data from a random sample of adult male workers of the 1980 Brazilian census to test the empirical validity of the assumptions embodied in this specification, with the following findings: the evidence supports the assumption that the appropriate regressand is the logarithm of earnings, but it does not support the implicit assumption that there is no interaction between the effects of education and work experience, or the assumption that a single function is appropriate for modelling both early and mature earnings. We find that the Mincerian specification leads to upwardly biased estimates of the returns to education, particularly at the primary level.


I. Introduction

Earnings functions have been widely used to estimate the returns to education and training -- estimates which have had a significant effect on the policy debate concerning educational investment. Most studies have adopted a Mincerian specification in which the core regressors are years of schooling or schooling dummies, work experience and work experience squared. This specification has been popular because the coefficient of the schooling variable can conveniently be interpreted as a crude estimate of the rate of return to schooling, but it embodies three strong assumptions:

(i) The appropriate definition of the dependent variable is the logarithm of earnings, as opposed to earnings as such or any other functional form.

(ii) There is no interaction between the contributions of the schooling and work experience variables to earnings.

(iii) A single function can be used to model lifetime earnings, making no distinction between early and mature labour market experience.

So powerful is the hold of the Mincerian model that these assumptions are seldom even mentioned. And yet it is commonly accepted that the earnings functions of those with relatively little education are much flatter than those of workers with more education, and that many entry-level jobs are effectively training slots whose compensation is determined differently from that of mainstream occupations 1.

1 For discussions of the nature of entry-level jobs and the earnings of those acquiring human capital through on-the-job training, see Thurow (1980) and Becker (1965).

The first objective of this paper is to subject these assumptions

to overdue tests. The second is to evaluate their practical implications.

Convenience is a legitimate consideration in model specification, for only

an academic purist would argue against the use of a simplified model if it

gave results similar to those derived from more elaborate ones with much

less labor. In the present context, the obvious criterion is the impact of

the assumptions on estimates of rates of return to different levels of

schooling.

The rest of the paper proceeds as follows. In the second section, we briefly describe the data base, which is a random sample of urban males in Brazil from the 1980 census. Then, in the third section, we investigate whether the semi-logarithmic specification of the earnings equation yields the best fit by testing the explanatory power of alternative transformations of the dependent variable, the earnings term. We also test whether the specification conforms with the basic assumptions of the classical regression model: homoscedasticity, and, for the validity of conventional tests, normality of the error term. In the fourth section, we investigate the specification of the right-hand side of the regression model. Our main concern is the bias caused by neglecting interactive effects between years of work experience and level of schooling. We also evaluate alternative measures of the experience term at low levels of schooling and the impact of certification on the measured returns to schooling and training. Finally, in the last section of this paper, we examine the sensitivity of estimated rates of return to the alternative specifications considered in the paper.

II. Data

The 3 percent national sample for the Brazilian 1980 census covers

3.5 million individuals in 0.81 million households (IBGE 1985). Out of

this base, and for the purpose of making the statistical analyses more

manageable, a random subsample (stratified by state) of 200 thousand

individuals (in 40 thousand households) was drawn. This subsample was

further refined to include only males aged 15 to 65, living in urban areas

and who reported positive earnings in their main occupation. This resulted

in a total sample size of 22,875 individuals.

The means and standard deviations of the key variables used in the

subsequent analysis are described in Table 1. They are divided by

employment status of the individual. We focus our analysis on the private

sector subsample, since labor earnings for this group are more likely to be

immune from potential biases due to the non-competitive nature of the

public sector and the difficulty in measuring self-employed earnings.

However, where warranted, we discuss differences in the results of the

private versus the other sectors.


Table 1: Earnings, experience and schooling levels in Brazil by level of economic activity, 1980

                                         Employed in            Self-
Variable                                 Private     Public     Employed

Monthly earnings in cruzeiros (Y)        12865       18470      15123
(means, std deviations)                  (16976)     (22917)    (24918)

Experience in years                      16.9        22.2       23.9
(X = min{Age-schooling-6, Age-15})       (11.8)      (12.3)     (13.3)
(means, std deviations)

Level of School Certificate
(proportions, numbers in sample)

None (NIL)                               0.37        0.23       0.49
                                         (5,790)     (490)      (2,565)

Primary lower (PL)                       0.40        0.31       0.36
                                         (6,180)     (666)      (1,889)

Primary upper (PU)                       0.11        0.16       0.07
                                         (1,733)     (341)      (368)

Secondary general (SECG)                 0.05        0.10       0.03
                                         (760)       (215)      (150)

Secondary technical (SECT)               0.03        0.05       0.02
                                         (480)       (97)       (83)

Higher scientific (HISCI)                0.01        0.05       0.01
                                         (223)       (109)      (73)

Higher mgt./agric. (HIMGT)               0.01        0.02       0.00
                                         (194)       (52)       (23)

Higher soc. sci. (HISOC)                 0.01        0.06       0.01
                                         (146)       (137)      (68)

Years of Education
(means, std deviations)

Primary lower (YRSPL)                    3.07        3.46       2.72
                                         (1.46)      (1.20)     (1.61)

Primary upper (YRSPU)                    1.21        2.08       0.80
                                         (1.67)      (1.87)     (1.46)

Secondary (YRSSEC)                       0.44        1.13       0.30
                                         (1.13)      (1.65)     (0.97)

Higher (YRSHI)                           0.18        0.72       0.16
                                         (0.88)      (1.69)     (0.85)

Sample Size                              15,523      2,127      5,225


In Table 1 we have divided the first eight years of Brazilian primary education into four years each of lower and upper. This is consistent with the Brazilian educational reform of the early 1970s in which grades 5-8 were redesignated from high school to primary.

Another important issue is the definition of work experience. Although work experience is almost invariably an important variable in determining earnings - usually the only major one apart from schooling - lack of data usually leads to its being estimated by an expression of the type (age - years of schooling - 6). This procedure can be inappropriate in developing countries where much of the labour force has had little or no schooling, for it implies that "work experience" gained during childhood should be treated on the same level as adult work experience. In the present study, work experience has been estimated as the smaller of the above expression and (age - 15), years out of school before the age of 15 not being counted.
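The revised experience measure can be sketched in a few lines. The function name and the example ages below are illustrative assumptions, not part of the original study; the sketch applies no floor at zero, since the sample is restricted to males aged 15 to 65.

```python
def work_experience(age: int, years_of_schooling: int) -> int:
    """Potential work experience, ignoring years out of school before age 15."""
    traditional = age - years_of_schooling - 6  # conventional proxy
    adult_only = age - 15                       # experience counted from age 15 onward
    return min(traditional, adult_only)


# A 30-year-old with 2 years of schooling: the traditional proxy gives 22,
# the revised definition gives 15.
print(work_experience(30, 2))
# A 30-year-old with 12 years of schooling is unaffected by the cap.
print(work_experience(30, 12))
```

For anyone with nine or more years of schooling the two expressions coincide or the traditional one is smaller, which is why the revision only affects the lowest schooling levels.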

Graphically, the effect of the revised definition is to shift the experience-earnings profiles for those with the lowest levels of education to the left, the shift being greatest for the lowest levels. Those with no certificate, and hence on average two or three years of schooling, would have six fewer years of work experience under the revised definition than under the traditional one. For lower primary the average shift would be about two years, and higher levels of education would not be affected.


III. Specification of the dependent variable

The most popular specification of the earnings (y) function is parabolic, containing schooling (s), experience (x) and experience squared as explanatory variables. For the ith individual, this can be represented as:

(1) $\ln Y_i = a + b s_i + c x_i + d x_i^2 + u_i$

Its popularity stems from Mincer's pioneering work, which showed that this

specification is a good linear approximation of the earnings function

derived from a human capital model, under several simplifying assumptions

about the complex dependence of earnings on schooling and postschool

investments. In this specification, the coefficient of the variable

measuring years of schooling can be interpreted as the private rate of

return to schooling.

A variation on equation (1) has also been widely used since it allows the estimated rate of return to vary by level of schooling:

(2) $\ln Y_i = a + \sum_k b_k D_{ik} + c x_i + d x_i^2 + u_i$

where $D_{ik}$ is a dummy variable for individual i at education level k, and k stands for the level of education (i.e., k = lower and upper primary, general and technical secondary, and various higher levels). In this specification, the rate of return to the kth level of education ($r_k$) has been estimated by comparing the coefficient of $D_k$ with that of $D_{k-1}$ and dividing by the number of years of schooling at the kth level ($n_k$) (Psacharopoulos, 1981):

(3) $r_k = (b_k - b_{k-1})/n_k$
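As a rough illustration of equation (3), the sketch below applies it to the two primary-level dummy coefficients reported in the interactive-less column of Table 3, taking four years for each primary cycle as described in Section II. The function name and data layout are assumptions made for the example, and the reference category (no certificate) is given a coefficient of zero.

```python
def rates_of_return(levels):
    """Apply r_k = (b_k - b_{k-1}) / n_k to an ordered list of levels.

    `levels` holds (name, dummy coefficient, years of schooling at that level),
    ordered from lowest to highest; the omitted base category has b = 0.
    """
    rates = {}
    previous_b = 0.0
    for name, b, years in levels:
        rates[name] = (b - previous_b) / years
        previous_b = b
    return rates


# Coefficients from the specification without interactives (Table 3).
primary = [("PL", 0.36720, 4), ("PU", 0.69072, 4)]
for name, r in rates_of_return(primary).items():
    print(f"{name}: {100 * r:.1f}% per year of schooling")
```

The calculation assumes a single predecessor level for each certificate, which holds for the two primary cycles but would need adjusting where general and technical secondary run in parallel.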


In order to simplify the interpretation of the coefficients and to focus

attention on the methodological comparisons, most of the analysis is done

on variants of equation (2).

The theoretical foundation for the semi-logarithmic specification is so widely accepted that it has seldom been subjected to empirical tests. However, the link between theory and the estimating equation rests on a set of ingenious but empirically debatable assumptions. As summarized by Blinder (1976), among the most important are: (i) in the absence of post-school investments, an individual's age-earnings profile would be flat and the present discounted value of lifetime earnings would be the same for all individuals, regardless of how long they stayed in school; (ii) the number of years spent at work is independent of the number of years spent in school; (iii) the return to all post-school investment in human capital is a constant; and (iv) during schooling, no time is spent in the labor force, whereas after schooling, everyone works full-time.

Alternative assumptions would result in altered regression equations, as considered by Mincer (pp. 83-92). For example, the assumption of a linear decline in post-school investment in human capital over the life-cycle could lead to an estimating equation that has earnings (instead of their log) on the left hand side. In this section, we consider the empirical validity of using the logarithm of earnings as the dependent variable.

A. Empirical validity of the semi-log dependent variable:

A general transformation, widely used in the applied economics literature, is applied to the Brazilian data base to test for alternative functional forms. The Box-Cox transformation takes the following general form:

(4) $Y_i^{(\lambda)} = a + \sum_k b_k D_{ik} + c x_i + d x_i^2 + u_i$

where earnings, Y, is transformed such that:

$Y_i^{(\lambda)} = (Y_i^{\lambda} - 1)/\lambda$ for $\lambda \neq 0$,

$Y_i^{(\lambda)} = \ln Y_i$ for $\lambda = 0$.
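The transformation itself is simple to write down. The following is a minimal sketch (the function name is an assumption) showing that the logarithmic case is the limit of the power case as $\lambda$ approaches zero.

```python
import math


def box_cox(y: float, lam: float) -> float:
    """Box-Cox transform of a positive value y for parameter lam."""
    if y <= 0:
        raise ValueError("the Box-Cox transform requires positive values")
    if lam == 0:
        return math.log(y)
    return (y ** lam - 1.0) / lam


earnings = 12865.0  # mean private-sector monthly earnings from Table 1
print(box_cox(earnings, 1.0))    # linear case: earnings minus one
print(box_cox(earnings, 0.0))    # semi-logarithmic case: ln(earnings)
print(box_cox(earnings, 1e-6))   # close to the logarithmic case
```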

The attractive characteristic of (4) is that the functional form is dictated by the parameter $\lambda$, which is itself estimated as the value that maximizes the log-likelihood function. Note that, if the estimated $\lambda$ = 1, the earnings function is linear in the dependent variable; if $\lambda$ = 0, the appropriate functional form would be semi-logarithmic, as postulated by Mincer's basic human capital model. Further, we can construct a confidence interval around the estimated value of $\lambda$ to see if alternative functional forms (transformations) are also consistent with the data. In our case, we are particularly interested in testing the appropriateness of the simpler and oft-used functional forms, such as the linear and the semi-logarithmic.

The estimation of (4) requires the maximization of a nonlinear likelihood function. It has been shown (Spitzer, 1982) that there are alternative ways of consistently estimating the parameters through simpler and available computer algorithms, such as nonlinear least squares or iterative OLS. In principle, these techniques involve the repeated OLS estimation of (4) for various values of $\lambda$. Spitzer and others have shown that the maximum likelihood estimate of $\lambda$ is equivalent to the value for which the estimated variance of the disturbances (equivalently, the error sum of squares) is minimized 2.

2 To ensure the comparability of the sum of squared errors for different values of $\lambda$, the equation can be rendered scale invariant through the use of a scaling trick originally attributed to Zarembka (1968). The trick is to multiply through (1) by y', where y' is the geometric mean of y. An ordinary least squares computer program can then simply be applied to the transformed version of (1) and modified to repeatedly estimate a* and b* (where $a^* = (a - y'^{(\lambda)})/y'$, the vector $b^* = b/y'$, and y' = the geometric mean of y) for different values of $\lambda$. The error sum of squares is computed in each case. We iterate for different values until the error sum of squares is minimized.

We utilize these techniques, estimate (4) and compare the results with

estimated parameters of the linear and semi-logarithmic specifications.
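The iterative procedure can be sketched as a grid search. The example below is a simplified illustration and not the authors' program: it uses synthetic data generated from a semi-logarithmic model, a single regressor instead of the full set of dummies, and one common form of the scaling (dividing earnings by their geometric mean before transforming), which is an assumption about the exact variant used. All function names are hypothetical.

```python
import math
import random


def sse_simple_ols(x, z):
    """Error sum of squares from regressing z on a constant and x."""
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxz = sum((xi - mean_x) * (zi - mean_z) for xi, zi in zip(x, z))
    szz = sum((zi - mean_z) ** 2 for zi in z)
    return szz - sxz ** 2 / sxx


def scaled_box_cox(y, lam):
    """Box-Cox transform of y after division by its geometric mean."""
    gm = math.exp(sum(math.log(v) for v in y) / len(y))
    if abs(lam) < 1e-12:
        return [math.log(v / gm) for v in y]
    return [((v / gm) ** lam - 1.0) / lam for v in y]


def grid_search_lambda(x, y, grid):
    """Return the grid value of lambda with the smallest error sum of squares."""
    return min(grid, key=lambda lam: sse_simple_ols(x, scaled_box_cox(y, lam)))


# Synthetic earnings that are log-linear in experience by construction.
rng = random.Random(0)
x = [rng.uniform(0, 40) for _ in range(2000)]
y = [math.exp(8.0 + 0.05 * xi + rng.gauss(0, 0.5)) for xi in x]

grid = [i / 100 for i in range(-100, 151, 5)]  # lambda from -1.00 to 1.50
lam_hat = grid_search_lambda(x, y, grid)
print(lam_hat)
```

Because the scaling removes the Jacobian term from the log-likelihood, the error sums of squares are directly comparable across values of $\lambda$, and the minimizing value should lie near zero for data generated this way.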

The value of $\lambda$ for which the error sum of squares is minimized is -0.13. A 95% confidence interval can be constructed by noting that the maximized log-likelihood function is:

(5) $\max \ln L(\lambda) = -(N/2) \ln s^2$

where $s^2$ is the maximum likelihood estimate of the variance of disturbances of the regression and N is the number of observations (Spitzer, 1986). This formula is used to plot the maximized log likelihood over the whole parameter space and to locate the maximizing $\lambda$. Large sample theory can be used to test hypotheses about the parameters. Twice the difference in the logarithmic likelihood between a null and alternative hypothesis is distributed $\chi^2$ with the degrees of freedom depending on the number of parameters specified in the null hypothesis (Zarembka 1968; Heckman and Polachek 1974).
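A likelihood ratio interval of this kind can be read off a log-likelihood profile as follows. The profile used here is a hypothetical quadratic chosen only to make the example self-contained; it is not the profile estimated in the paper. The critical value 3.841 is the 5% point of the $\chi^2$ distribution with one degree of freedom.

```python
CHI2_1DF_5PCT = 3.841  # 5% critical value, chi-square with 1 degree of freedom


def likelihood_ratio_interval(profile, critical=CHI2_1DF_5PCT):
    """Return the lowest and highest lambda not rejected by the LR test.

    `profile` maps each candidate lambda to its maximized log likelihood.
    """
    l_max = max(profile.values())
    accepted = [lam for lam, ll in profile.items()
                if 2.0 * (l_max - ll) <= critical]
    return min(accepted), max(accepted)


# Hypothetical profile peaking at lambda = -0.16 (illustrative curvature only).
profile = {i / 100: -50.0 * (i / 100 + 0.16) ** 2 for i in range(-50, 21)}
low, high = likelihood_ratio_interval(profile)
print(low, high)
```

The steeper the profile around its peak, the narrower the interval, which is why a sample of several thousand observations can reject values of $\lambda$ quite close to the estimate.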


The plot of the log-likelihood values for different values of $\lambda$ is shown in Figure 1. According to this figure, the maximum likelihood value of $\lambda$ for the private sector is -0.16. The 95% confidence region around this estimate runs from -0.20 to -0.12. Thus, the earnings function specification is significantly different from both the linear and the semi-logarithmic forms. However, when comparing the two simple specifications, the semi-logarithmic form dominates the linear version. The large size of the sample causes asymptotic likelihood ratio tests to reject both the null hypotheses that $\lambda$ equals one and that it equals zero, but the negative value for $\lambda$ implies that the $\chi^2$ test rejects the linear model at a much higher significance level.

Figure 1: Log-likelihood values by lambda

This is not the first time that Box-Cox transformations have been used to test alternative specifications in the human capital literature. Heckman and Polachek (1974) performed similar experiments with the 1960 and 1970 public use samples of the U.S. census. Their conclusion is similar to ours -- that among simple transformations, the natural logarithm of earnings is the correct dependent variable in earnings functions.

B. Homoscedasticity on the schooling dimension

In the basic specification of the model it is assumed that the

disturbance term is homoscedastic with respect to schooling. This can be

checked by disaggregating the sample by level of schooling, running

regressions for earnings with respect to experience and its square for each

category separately, and calculating the mean square residual.


Table 2: Mean Square Error of Residuals by Certificate and by Experience Category

                              Dependent Variable
Certificate                   Y (x 10^6)     Ln Y

None                          36             0.29
Primary lower                 75             0.32
Primary upper                 163            0.38
Secondary general             457            0.48
Secondary technical           368            0.45
Higher science                1540           0.39
Higher social sci             1079           0.32
Higher mgt & agric            1234           0.64

Experience (years)

≤ 10                          65             0.24
≥ 19                          215            0.41

Table 2 presents the mean square residual for each category, for

both the semi-logarithmic and linear specifications of the dependent

variable. The linear specification is clearly subject to very severe

heteroscedasticity. Almost inevitably in view of the large size of the

sample, formal F-tests indicate that the heteroscedasticity is still

significant in the semi-logarithmic specification, but it is relatively

mild.

The near-homoscedasticity in the semi-logarithmic version was by no means a foregone conclusion since there is no theoretical apparatus predicting it. Indeed it would not have been a surprise to have found heteroscedasticity so severe that it would have led to the abandonment of the use of a single, combined earnings function for all levels of education.

C. Homoscedasticity on the experience dimension

For the purpose of evaluating heteroscedasticity in the experience

dimension, earnings functions were fitted using the subsamples containing

those with the least experience and those with the most experience, the

cut-off points following the guidelines of Goldfeld and Quandt (1965) 3.

3 Those with least experience had 10 or fewer years of experience (5704 cases, 37% of the total); those with most experience had 19 or more years of experience (5695 cases).

Table 2 presents the mean square error for each subsample for

basic regressions using ln y and y as dependent variable. There is

evidence of significant heteroscedasticity in both cases but it is much

less severe for the regression using ln y.
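One informal way to see how much milder the heteroscedasticity is in the semi-logarithmic form is to compare the largest and smallest mean square residuals in Table 2. The sketch below does only that arithmetic; it is not the formal F-test referred to in the text, and the dictionary layout and function name are assumptions.

```python
# Mean square residuals transcribed from Table 2:
# (linear specification in units of 10^6, semi-logarithmic specification).
by_certificate = {
    "None": (36, 0.29),
    "Primary lower": (75, 0.32),
    "Primary upper": (163, 0.38),
    "Secondary general": (457, 0.48),
    "Secondary technical": (368, 0.45),
    "Higher science": (1540, 0.39),
    "Higher social sci": (1079, 0.32),
    "Higher mgt & agric": (1234, 0.64),
}
by_experience = {
    "10 years or fewer": (65, 0.24),
    "19 years or more": (215, 0.41),
}


def variance_spread(groups, column):
    """Ratio of the largest to the smallest mean square residual."""
    values = [row[column] for row in groups.values()]
    return max(values) / min(values)


for label, groups in (("schooling", by_certificate), ("experience", by_experience)):
    linear = variance_spread(groups, 0)
    semilog = variance_spread(groups, 1)
    print(f"{label}: linear {linear:.1f}, semi-log {semilog:.1f}")
```

On the schooling dimension the ratio is about 43 for the linear form against roughly 2 for the semi-logarithmic form; on the experience dimension the corresponding figures are about 3.3 and 1.7.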

D. Normality of the distribution of the residuals

Finally the distribution of the residuals was tested for normality. The unbiasedness and efficiency of OLS do not depend upon any assumption concerning the distribution of the disturbance term. Nevertheless, in view of the fact that earnings functions seldom account for more than 50% of the variance in earnings and that the residual variance is popularly attributed to a multitude of factors, it is reasonable to expect the Central Limit Theorem to apply and the disturbance term, and hence residuals, to approximate a normal distribution. Moreover, the validity of the t-tests and F-tests depends upon such an approximation.

Figure 2: Distributions of residuals, semi-logarithmic and linear earnings functions, standardized by division by the standard error of the regression (panels: semi-logarithmic; linear)

Figure 2 presents histograms for the distribution of the residuals using the semi-logarithmic and linear specifications, both standardized by division by the standard error of the regression. Both distributions are significantly different from normal, but that for the semi-logarithmic regression conforms much more closely than that for the linear regression, which is far more peaked and long-tailed 4.

4 Adopting the 0.33 standard deviation intervals used in the histograms, and amalgamating into single categories the tails beyond two standard deviations, the $\chi^2$ statistics were 79.0 and 8,425 for the semi-logarithmic and linear specifications, respectively. With 12 degrees of freedom, the critical level of $\chi^2$ is 26.2 at the 1% significance level. We are indebted to J.J. Thomas for proposing this test.
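A goodness-of-fit statistic of the kind described in footnote 4 might be computed along the following lines. This is a sketch under stated assumptions: the binning (twelve interior intervals of one third of a standard deviation plus two amalgamated tails) is an interpretation of the footnote, the samples are synthetic rather than the paper's residuals, and the function names are hypothetical.

```python
import math
import random


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def chi_square_normality(z_values, width=1.0 / 3.0, limit=2.0):
    """Chi-square statistic comparing standardized residuals with N(0, 1)."""
    n_inner = round(2 * limit / width)            # interior bins between -2 and +2
    edges = [-limit + i * width for i in range(n_inner + 1)]
    observed = [0] * (n_inner + 2)                # plus one category for each tail
    for z in z_values:
        if z < -limit:
            observed[0] += 1
        elif z >= limit:
            observed[-1] += 1
        else:
            idx = min(int((z + limit) / width), n_inner - 1)
            observed[idx + 1] += 1
    cumulative = [0.0] + [normal_cdf(e) for e in edges] + [1.0]
    n = len(z_values)
    statistic = 0.0
    for k, count in enumerate(observed):
        expected = n * (cumulative[k + 1] - cumulative[k])
        statistic += (count - expected) ** 2 / expected
    return statistic


rng = random.Random(1)
normal_sample = [rng.gauss(0.0, 1.0) for _ in range(5000)]
# A peaked, long-tailed alternative with unit variance (Laplace distribution).
laplace_sample = [rng.choice((-1, 1)) * rng.expovariate(math.sqrt(2.0))
                  for _ in range(5000)]

print(chi_square_normality(normal_sample))
print(chi_square_normality(laplace_sample))
```

The peaked, long-tailed sample should produce a statistic far above the 1% critical level of 26.2 quoted in the footnote, while the normal sample should not.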

IV. Specification of the explanatory variables

In this section, we examine the empirical nature of the assumptions regarding the right hand side of earnings functions.

A. Interaction of the effects of schooling and experience on earnings

It is commonly accepted that the age-earnings profiles of those

with the lowest levels of schooling tend to rise relatively slowly after

the first few years of work experience. In the case of unskilled manual

workers, they are likely to reach an absolute plateau and in middle age

begin to fall as physical powers decline. By contrast, the earnings of

those with extended schooling continue to grow throughout their working

lives and the rate of growth is positively correlated with the level of

schooling.


These stylized facts are faithfully reproduced in diagrams depicting typical earnings profiles by level of education. It is therefore surprising that they are not similarly reflected in the specification of regression models: typically the work experience variable and its square appear in the regression equation unaccompanied by schooling interactives and their coefficients are therefore interpreted as applying independently of schooling level. The equation with interactive terms (which we call the "basic specification") is:

(6) $\ln Y_i = a + \sum_k b_k D_{ik} + c x_i + d x_i^2 + \sum_k c_k D_{ik} x_i + \sum_k g_k D_{ik} x_i^2 + u_i$

where $c_k$ and $g_k$ denote the coefficients of the interactive terms.


Table 3: Private Sector Semi-Logarithmic Earnings Functions in Brazil, 1980

                With interactive terms        Without interactive terms
Variables       Coefficient    Std. error     Coefficient    Std. error

Constant         8.29016       0.02090         8.01946       0.01434
X                0.04678       0.00199         0.07086       0.00129
X2              -0.00088       0.00004        -0.00126       0.00003
PL               0.05306       0.02771         0.36720       0.01088
PU               0.09512       0.04139         0.69072       0.01619
SECG             0.57692       0.06463         1.18726       0.02266
SECT             0.45868       0.07727         1.16037       0.02781
HISCI            1.91258       0.08984         2.32203       0.03997
HIMGTAG          1.56187       0.13120         2.09362       0.04268
HISOC            1.48525       0.16825         1.72776       0.04894
X*PL             0.02832       0.00286
X*PU             0.05747       0.00469
X*SECG           0.05519       0.00780
X*SECT           0.06266       0.00828
X*HISCI          0.04608       0.01076
X*HIMGT          0.05733       0.01650
X*HISOC          0.00997       0.01828
X2*PL           -0.00044       0.00006
X2*PU           -0.00085       0.00011
X2*SECG         -0.00072       0.00019
X2*SECT         -0.00082       0.00018
X2*HISCI        -0.00087       0.00022
X2*HIMGT        -0.00113       0.00044
X2*HISOC         0.00021       0.00043

R-Squared        0.46664                       0.500
N                15,523                        15,523

Table 3 presents the regression results including and excluding the interactive terms, respectively. The x-interactives all have the expected positive sign and those for the two levels of primary education and the two types of secondary education are all significant at the 1% level. The $x^2$-interactives are likewise significantly negative for the lower levels of schooling.

The main consequence of omitting the interactives is to overestimate the initial upward shift of the earnings profile associated with progressively greater amounts of education. In the case of lower primary education, for example, the full specification suggests that the initial shift is a modest 5.3%, but this of course increases over time since the earnings of those with lower primary education grow faster than those of workers with no certificate: by the twentieth year of work experience, mid-way through the individual's working life, the differential would be 55%. The specification without interactives, constrained to yield an average figure, suggests that lower primary education results in a once-and-for-all relative shift of 44%, effective immediately 5.

5 Throughout the text we calculate income differentials by comparing the absolute earnings predicted by the logarithmic functions. By way of illustration, the coefficient of lower primary in the interactiveless specification, 0.3672, implies that the earnings of lower primary graduates are higher than the earnings of those with no certificate by a factor exp(0.3672), that is, 1.44, implying a differential of 44%.
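The lower primary example can be retraced from the Table 3 coefficients. The sketch below is illustrative only: the constant names are assumptions, and the differential relative to workers with no certificate is computed as the exponential of the gap in predicted log earnings, following the method described in footnote 5.

```python
import math

# Lower primary coefficients from the specification with interactives (Table 3).
B_PL = 0.05306    # dummy
C_PL = 0.02832    # interaction with experience
G_PL = -0.00044   # interaction with experience squared
# Lower primary dummy from the specification without interactives.
B_PL_NO_INTERACTIVES = 0.36720


def differential_with_interactives(x):
    """Proportional earnings gap over no-certificate workers at experience x."""
    log_gap = B_PL + C_PL * x + G_PL * x ** 2
    return math.exp(log_gap) - 1.0


print(f"x = 0:  {100 * differential_with_interactives(0):.1f}%")
print(f"x = 20: {100 * differential_with_interactives(20):.1f}%")
print(f"no interactives: {100 * (math.exp(B_PL_NO_INTERACTIVES) - 1.0):.1f}%")
```

The gap starts at roughly 5%, widens to about 55% by the twentieth year of experience, and compares with a constant 44% when the interactives are omitted.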

Similar remarks apply to the estimates of the impact of other

levels of education. The implications of this distortion for

rate-of-return analysis are obvious. By exaggerating the initial impact of

education, the interactive-less specification will systematically tend to

lead to overestimates of the rate of return to it. This point is explored

further in Section V.

B. Modelling early labour market experience

The literature on occupational training suggests that early labour market experience differs from later experience in two respects: (i) the first few years of labour market experience are a time for experiment and for testing the job market, leading to relatively frequent job change (Grasso and Shea 1979); (ii) a major characteristic of many entry-level jobs is their training function (Thurow, 1980). As a consequence it is commonly accepted that the earnings of many individuals rise relatively rapidly, in proportional terms at least, in their first few years in the labour force, and then settle down to a more sedate rate of growth.

Table 4: Private Sector Semi-Logarithmic Earnings Functions in Brazil, 1980
Variations on Basic Specification

             For X = Min{Age-school-6, Age-15}                For X = Age-school-6

             For X ≤ 10              For X > 10

Variables    Coeff.     Std.err.     Coeff.     Std.err.      Coeff.     Std.err.

Constant      8.07407   0.03992       8.60137   0.05782        7.92317   0.03768
X             0.10871   0.01665       0.02409   0.00429        0.05620   0.00254
X2           -0.00457   0.00151      -0.00052   0.00007       -0.00081   0.00004
PL            0.12301   0.05195       0.09487   0.08248        0.06962   0.04705
PU            0.09761   0.08487       0.16167   0.13092        0.32979   0.05635
SECG          0.93556   0.15774       0.44652   0.20149        0.94391   0.07218
SECT          0.53746   0.24680       0.07684   0.20474        0.82567   0.08383
HISCI         1.80528   0.22592       1.78286   0.32872        2.27956   0.09568
HIMGTAG       1.41521   0.26443       1.60298   0.40962        1.92886   0.13571
HISOC         1.67729   0.51904       2.52011   0.45173        1.85224   0.17223
X*PL          0.00240   0.02147       0.02240   0.00644        0.02639   0.00353
X*PU          0.05355   0.03228       0.04847   0.01079        0.04939   0.00519
X*SECG       -0.08072   0.05417       0.06185   0.01773        0.04576   0.00800
X*SECT        0.01682   0.08607       0.08881   0.01668        0.05323   0.00848
X*HISCI       0.08434   0.08383       0.05219   0.02607        0.03665   0.01094
X*HIMGT       0.09340   0.09474       0.04342   0.03916        0.04791   0.01668
X*HISOC      -0.04945   0.17580      -0.07702   0.03979        0.00054   0.01847
X2*PL         0.00158   0.00194      -0.00030   0.00011       -0.00041   0.00006
X2*PU        -0.00053   0.00278      -0.00065   0.00020       -0.00084   0.00010
X2*SECG       0.00992   0.00434      -0.00077   0.00035       -0.00079   0.00019
X2*SECT       0.00395   0.00687      -0.00122   0.00030       -0.00090   0.00018
X2*HISCI     -0.00349   0.00701      -0.00096   0.00042       -0.00095   0.00022
X2*HIMGT     -0.00274   0.00770      -0.00070   0.00084       -0.00120   0.00044
X2*HISOC      0.00145   0.01342       0.00182   0.00078        0.00014   0.00043

R-Squared     0.51671                 0.40949                  0.45956

This stylized fact is also neglected in the econometric litera-

ture. The left-hand and middle double columns of Table 4 show the results

of splitting the Brazilian sample into those who had no more than 10 years

of work experience, and more than 10, respectively 6. The F-statistic for

the Chow test for the split is 6.72, significant at the 0.1% level.

6 Splits were evaluated at 6, 8, 10 and 12 years of work experience, the results suggesting that the discontinuity is most distinct at 10. Similar splits were evaluated for each level of education separately, with the finding that there is a positive correlation between the level of education and the length of the first phase of labour market experience.
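For readers who wish to reproduce a test of this kind, the textbook Chow statistic can be sketched as follows. The function name, argument names and the numbers in the usage line are hypothetical; the paper reports only the resulting F-statistic, not the residual sums of squares or the subsample sizes.

```python
def chow_f(rss_pooled, rss_first, rss_second, n_first, n_second, k):
    """Textbook Chow F-statistic for a two-way sample split.

    rss_pooled  : residual sum of squares from the regression on the full sample
    rss_first   : residual sum of squares from the first subsample regression
    rss_second  : residual sum of squares from the second subsample regression
    n_first     : observations in the first subsample
    n_second    : observations in the second subsample
    k           : number of estimated parameters in each regression
    """
    rss_split = rss_first + rss_second
    df_denominator = n_first + n_second - 2 * k
    if df_denominator <= 0:
        raise ValueError("not enough observations for the split regressions")
    # Gain in fit from splitting, per restriction relaxed ...
    numerator = (rss_pooled - rss_split) / k
    # ... relative to the residual variance of the split regressions.
    denominator = rss_split / df_denominator
    return numerator / denominator


# Hypothetical inputs, for illustration only.
example_f = chow_f(120.0, 50.0, 60.0, n_first=100, n_second=120, k=4)
```

The statistic is compared with the F distribution with k and (n_first + n_second - 2k) degrees of freedom.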

Table 5: Rate of Growth of Private Sector Earnings by Schooling Level
and by Experience Level, Brazil, 1980
(percent per year)

Schooling     Basic Specification     Split Specification
Level         X = 0      X = 20       X = 0      X = 20

None            4.7        2.9         10.9        1.4
PL              7.5        4.9         11.1        3.0
PU             10.4        7.0         16.2        4.9
SECG           10.2        7.0          2.7        6.0
SECT           10.9        7.5         12.6        7.8
HISCI           9.3        5.8         19.3        4.7
HIMGT          10.4        6.4         20.2        4.3
HISOC           5.7        4.3         27.7       -3.1

Table 5 summarizes the rates of growth of earnings by educational

level for X equal to zero and 20 predicted by the basic specification and

the split function. With the exception of secondary general education, it

can be seen that the rates of growth of earnings are indeed initially

greater, and later smaller, than suggested by the basic specification in

Table 3. Again, there are obvious implications for the estimation of rates

of return and they are discussed in Section V.


C. Estimation of work experience

The traditional definition for estimating years of work experi-

ence, (age - years of schooling - 6), makes no distinction between "work

experience" acquired as a child and adult work experience. We have

discounted the former by using instead the expression min{(age - years of

schooling - 6), (age - 15)}. The effect of using the traditional

expression is to shift the experience-earnings functions for the affected

categories, those with no certificate and lower primary graduates, to the

right. Since these profiles are parabolas, the effect of shifting them to

the right is to lower their intercepts and thus to increase the difference

between the initial predicted earnings of these categories and the first

relatively unaffected category, upper primary graduates. The result is to

overestimate the rate of return to upper primary education.
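The two definitions of work experience can be compared directly. The sketch below is illustrative only: the function and parameter names are hypothetical, and the default values 6 and 15 are simply the constants appearing in the two expressions above.

```python
def work_experience(age, years_of_schooling, school_start_age=6, min_working_age=15):
    """Return (traditional, modified) estimates of years of work experience.

    traditional : age - years of schooling - 6
    modified    : min{(age - years of schooling - 6), (age - 15)},
                  which discounts experience acquired before age 15
    """
    traditional = age - years_of_schooling - school_start_age
    modified = min(traditional, age - min_working_age)
    return traditional, modified


# A 30-year-old with no schooling: 24 years under the traditional
# definition, 15 under the modified one.
no_schooling = work_experience(30, 0)

# A 30-year-old with 12 years of schooling: both definitions give 12.
twelve_years = work_experience(30, 12)
```

With these constants the two measures coincide whenever schooling is nine years or more, so only the lowest educational categories are affected.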

This point is illustrated by the last two columns of Table 4 which

present the results of using the traditional expression in our basic model

specification. The intercepts for those with no certificate, lower primary

and upper primary education are 7.92, 7.99 and 8.25; the corresponding

estimates using the modified expression for work experience (Table 3, first

two columns) are 8.29, 8.34 and 8.39, respectively. The remaining

intercepts are virtually unaffected.
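The intercepts quoted for the traditional expression can be recovered from the right-hand double column of Table 4 by adding each schooling dummy to the constant. The short sketch below does only that arithmetic; the variable names are chosen here for illustration.

```python
# Coefficients from Table 4, right-hand double column (X = Age-school-6).
CONSTANT = 7.92317
SCHOOLING_DUMMIES = {
    "None": 0.0,      # reference category, no certificate
    "PL": 0.06962,    # lower primary
    "PU": 0.32979,    # upper primary
}

# Intercept of each log-earnings profile = constant + schooling dummy.
intercepts = {
    level: round(CONSTANT + dummy, 2)
    for level, dummy in SCHOOLING_DUMMIES.items()
}
```

This gives 7.92, 7.99 and 8.25, the three values cited in the text.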

If early labour market experience is modelled separately from

mature experience, the distortions caused by using the traditional

definition are even more pronounced: by causing nearly all workers to fall

into the mature subset, it almost precludes any serious attempt to model

early labour market experience for the lowest levels of education even when

the split is made as late as ten years of experience.


D. Certification effects

The basic specification of the regression model makes no distinc-

tion between the effect of years of schooling by level on earnings and the

effect of obtaining the corresponding certificate. The second double

column of Table 6 presents the results obtained when these effects are

separated, and for purposes of comparison the first double column presents

the results obtained when certificates are omitted. The first double

column can be regarded as the counterpart of the first double column of

Table 3 when schooling is treated as a splined continuous variable instead

of as a set of dummy variables 7.

7 This specification is essentially a variation of the original Mincerian model in which years of schooling is treated as a single, continuous variable. The Mincerian specification embodies the assumption that all years of schooling make the same proportional contribution to earnings. The splined version allows the contribution of each year of schooling to vary according to educational level. A second difference in the version presented here is that the years of schooling variables are accompanied by interactive terms with experience and its square.
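One way in which splined years-of-schooling variables of this kind might be constructed is sketched below. It is an assumption, not the authors' documented procedure: the level durations (4, 4, 3 and 4 years) are hypothetical placeholders, and only the variable names YRSPL, YRSPU, YRSSEC and YRSHI are taken from Table 6.

```python
# Hypothetical duration in years of each educational level (an assumption).
LEVEL_YEARS = [("YRSPL", 4), ("YRSPU", 4), ("YRSSEC", 3), ("YRSHI", 4)]


def spline_schooling(total_years):
    """Allocate total years of schooling to successive educational levels.

    Each variable counts the years completed within its own level, so the
    regression can attach a different coefficient to a year of schooling
    at each level. Years beyond the last level's duration are ignored.
    """
    remaining = max(total_years, 0)
    allocation = {}
    for name, duration in LEVEL_YEARS:
        allocation[name] = min(remaining, duration)
        remaining -= allocation[name]
    return allocation


# Ten years of schooling: both primary levels complete, two years of secondary.
ten_years = spline_schooling(10)
```

The single Mincerian schooling variable corresponds to the sum of the four components.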


Table 6: Private Sector Earnings Functions, Brazil, 1980
Certification Effects

              Certificates omitted     Certificates included
Variables     Coeff.    Std. err.      Coeff.    Std. err.

Constant      8.26093   0.03024        8.28086   0.03040
X             0.03313   0.00275        0.03310   0.00274
X2           -0.00061   0.00005       -0.00062   0.00005
YRSPL         0.01633   0.00968       -0.00668   0.01071
YRSPU         0.00593   0.01012       -0.01400   0.01167
YRSSEC        0.11386   0.01817        0.06380   0.02392
YRSHI         0.28433   0.01861        0.25418   0.02683
XYRSPL        0.00909   0.00093        0.00915   0.00093
XYRSPU        0.00791   0.00115        0.00775   0.00115
XYRSSEC       0.00316   0.00213        0.00309   0.00212
XYRSHI       -0.00815   0.00214       -0.00484   0.00218
X2YRSPL      -0.00015   0.00002       -0.00015   0.00002
X2YRSPU      -0.00011   0.00003       -0.00011   0.00003
X2YRSEC      -0.00002   0.00005       -0.00002   0.00005
X2YRSHI       0.00007   0.00005        0.00001   0.00005
PL                                     0.09713   0.01781
PU                                     0.17896   0.03623
SECG                                   0.34213   0.06591
SECT                                   0.31440   0.06774
HISCI                                  0.56095   0.11436
HIMGT                                  0.42733   0.10700
HISOC                                  0.01865   0.11157

R-Squared     0.48034                  0.48452

The coefficients of the certificate dummies all have the expected

positive sign and, in spite of the problem of multicollinearity, the

majority are significantly different from zero at the 1% level.
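The significance claim can be checked from the certificate coefficients and standard errors in Table 6. The sketch below assumes, as an approximation, that the sample is large enough for the two-sided 1% critical value of the standard normal distribution (2.576) to apply; the sample size is not restated in this section.

```python
# Certificate dummies from Table 6: (coefficient, standard error).
CERTIFICATES = {
    "PL": (0.09713, 0.01781),
    "PU": (0.17896, 0.03623),
    "SECG": (0.34213, 0.06591),
    "SECT": (0.31440, 0.06774),
    "HISCI": (0.56095, 0.11436),
    "HIMGT": (0.42733, 0.10700),
    "HISOC": (0.01865, 0.11157),
}

# Two-sided 1% critical value of the standard normal (large-sample assumption).
CRITICAL_1PCT = 2.576

# t-ratio = coefficient divided by its standard error.
t_ratios = {name: coef / se for name, (coef, se) in CERTIFICATES.items()}

significant = [name for name, t in t_ratios.items() if abs(t) > CRITICAL_1PCT]
```

Under this approximation six of the seven certificate coefficients pass the test, the exception being HISOC.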

Although these results suggest that employers are affected by

credentialism in their wage-setting, this is not the only possible

interpretation. An alternative possibility is that those who complete each

level of education are intrinsically more able than those who do not. A

further, related, explanation is that those who complete each level extract

more from it than those who drop out and presumably were struggling. The


certificate coefficients therefore may equally be regarded as evidence of

credentialism, of screening for ability, or of a true educational effect.

These results are similar to those found recently for US data by Hungerford

and Solon (1987).

V. The Sensitivity of Rates of Return to Specification

In this section we compare several methods of computing the rates

of return to various levels of education. When interactive experience-

schooling effects are introduced into the regression model, the earnings

functions by educational level cease to be isomorphic and it is no longer

possible to read off a crude estimate of the rate of return in the

Mincerian manner. One is forced to return to the more laborious but

theoretically more satisfactory procedure of calculating earnings streams

explicitly and using an iterative procedure to calculate an internal rate

of return. This complicates the comparison of the rates of return using

the standard Mincerian model and more sophisticated ones. In part the

discrepancies are attributable to the use of the short-cut technique in the

Mincerian estimates. In part they arise because the Mincerian and more

elaborate methods employ different specifications of the earnings

function.
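The explicit procedure can be sketched as follows: build the annual net gain from studying (foregone earnings and direct costs during the years of study, the earnings differential afterwards) and find by bisection the discount rate at which its present value is zero. This is a minimal illustration, not the authors' program; all function names, the profile coefficients, the direct cost and the 45-year horizon are hypothetical.

```python
import math


def npv(rate, flows):
    """Net present value of annual flows; flows[0] is undiscounted."""
    return sum(cf / (1.0 + rate) ** t for t, cf in enumerate(flows))


def irr(flows, lo=0.0, hi=1.0, tol=1e-10):
    """Internal rate of return by bisection on [lo, hi]."""
    f_lo = npv(lo, flows)
    if f_lo * npv(hi, flows) > 0:
        raise ValueError("NPV does not change sign on [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        f_mid = npv(mid, flows)
        if f_lo * f_mid <= 0:
            hi = mid            # root lies in the lower half
        else:
            lo, f_lo = mid, f_mid  # root lies in the upper half
    return 0.5 * (lo + hi)


def net_flows(earn_low, earn_high, study_years, direct_cost, horizon):
    """Annual gain from taking the higher level instead of working at once."""
    flows = []
    for t in range(horizon):
        if t < study_years:
            # While studying: foregone earnings plus direct cost.
            flows.append(-earn_low(t) - direct_cost)
        else:
            # Afterwards: earnings differential at the respective experience.
            flows.append(earn_high(t - study_years) - earn_low(t))
    return flows


# Hypothetical semi-logarithmic profiles with different experience terms.
def earn_low(x):
    return math.exp(8.0 + 0.05 * x - 0.0008 * x * x)


def earn_high(x):
    return math.exp(8.5 + 0.07 * x - 0.0010 * x * x)


rate = irr(net_flows(earn_low, earn_high, study_years=4,
                     direct_cost=1000.0, horizon=45))
```

Because the two profiles need not share a common experience function, nothing in this calculation relies on the earnings functions being isomorphic.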

A. Short-cut and IRR versions of the Mincerian model

In the short-cut version of the Mincerian model the rate of

return to education is estimated directly from the regression results.

When schooling is treated as a continuous, cardinal variable, its

coefficient is the estimate of the rate of return. In the dummy variable


approach used here, the rate of return to each level is calculated from the

coefficients of the schooling dummies using equation (3). This procedure

involves three assumptions:

1. Direct costs are negligible, or are offset by a student's part-time and summer earnings.

2. The opportunity cost of foregone earnings is equal to the earnings of the next lower level predicted by the model.

3. The earnings profiles are isomorphic, that is, they are of the form y_0 f(x), where y_0 are the initial earnings of the educational category in question and f(x) is a multiplicative experience function common to all educational levels 8.

8 A proof of the validity of the short-cut method, subject to these assumptions, is provided in the appendix.
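As the appendix describes, the short-cut estimate is the difference between the coefficients of two adjacent schooling dummies divided by the number of years the higher level takes. A minimal sketch follows; the function name and the numbers in the usage line are hypothetical, since the Mincerian coefficients and level durations are not reproduced in this section.

```python
def shortcut_rate(coef_level, coef_lower, years):
    """Short-cut (coefficient-difference) rate of return.

    coef_level : coefficient of the schooling dummy for the level in question
    coef_lower : coefficient of the dummy for the next lower level
                 (zero for the reference category)
    years      : number of years needed to complete the level
    """
    if years <= 0:
        raise ValueError("years must be positive")
    return (coef_level - coef_lower) / years


# Hypothetical coefficients: a 0.40 gap in log earnings over four years
# of study corresponds to a rate of return of about 10%.
example_rate = shortcut_rate(0.90, 0.50, 4)
```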

In the more satisfactory, but more laborious, internal rate of

return (IRR) version, account is taken of direct costs9 and the foregone

earnings of children are treated more appropriately. The third assumption

is maintained.

9 A recent monograph by Winkler (1986) yields the following costs (average of federal, state and municipal levels) of education:

                          1980 Cruzeiros
Primary overall                   811
    Lower                         759
    Upper                         949
Secondary overall               1,316
    General                       995
    Technical                   1,742
Higher overall                 13,842
    Sciences                   15,820
    Management                 11,865
    Social Sciences            11,865

Winkler's data allow us to compute overall unit costs for each educational level. To assign these costs to the subheadings within levels, we assumed that unit costs at upper primary exceed those at lower primary by 25%, that unit costs of secondary technical exceed those of secondary general by 75%, and that unit costs for science exceed those for management and social sciences by 25%.

Table 7: Rates of Return under Alternative Specifications
Brazil 1980 Private Sector by Schooling Level
(percentages)

                                      Primary       Secondary          Higher
Specification                       Lower  Upper    Gen   Tech    Sci  Mgt/Ag   Soc

Mincerian
1. Coefficient-difference method       9      8      12     12     28     23     14
2. IRR method                         38     35      14     12     21     18     10

With experience interactions
3. Basic                              24     25      12     10     20     17      9
4. With experience spline             25     24      12     10     20     16      8
5. With conventional                  22     31      14     12     20     17      9
   experience measure

We are concerned only with the effect of the third assumption

after the first two have been relaxed. However since the majority of

Mincerian studies use the short-cut model, we begin by comparing the

results obtained using it with those obtained using the IRR version and the

Brazilian data. The rates of return for the different levels of education

using the short-cut and IRR methods are presented in the first two lines of

Table 7. The inclusion of direct costs causes the estimates of the rate of

return to secondary and higher education to be lower in the IRR version

than in the short-cut version. However in the case of lower and upper primary

education the assumption that foregone earnings are in fact negligible

causes the IRR estimates to be substantially higher.

B. Mincerian versus specification with interactives

Next we compare the rates of return using the IRR version of the

Mincerian model with those obtained using its counterpart including

interactive experience-education terms (line 3 of Table 7). The greatest

impact is on the estimates of the rates of return to lower and

upper primary education, which are significantly lower in the interactive

version, the reason being that the Mincerian specification causes the

initial impact of these levels of education to be overestimated.

C. Variations on the specification with interactives

Finally we evaluate the effect of using the earnings profiles

splined by early and later work experience discussed in Section IV.B, and

the effect of the revised measure of experience discussed in Section IV.C.

Despite the significant Chow tests, the introduction of splines appears to

have a negligible effect on the estimates of rates of return (Table 7, line

4) and clearly in this case was a refinement of secondary importance. The

measurement of the work experience variable is however a more significant

issue: the conventional method gives rise to a substantial upward bias in

the estimate of the rate of return to upper primary education, and, to a

lesser extent, in the estimates for lower primary and secondary education

(Table 7, line 5).


D. Private and social rates of return

The short-cut version of the Mincerian model is sometimes

described as yielding estimates of the private rate of return, while the

IRR version (and the interactive models discussed here) are described as

yielding social estimates. We are agnostic on these interpretations in

this paper because we are chiefly concerned about the magnitude of the

impact of alternative methodologies in computing rates of return.

Moreover, we have no information on the private direct costs of education

(uniforms, transport, charges for text-books, exercise books, pencils and

other materials) and the effects of direct taxation.

VI. Public Sector Employment and Self-Employment

In addition to the data on private sector employees, the sample

contained smaller data sets on public sector employees and the self-

employed (see Table 1).

A regression using the basic specification with interactive

variables (equation 6) and the public sector data yielded a better fit (R2

equal to 0.57 for the public sector, 0.47 for the private sector) and

similar coefficients 10 (Table 8).

10 A Chow test indicated that the fits were nevertheless significantly different. Half of the discrepancy between the residual sum of squares for the pooled and separate regressions could be accounted for by a simple sector dummy, but the difference remained significant after its inclusion.

A similar regression for the self-employed yielded, as antici-

pated, a much inferior fit (R2 equal to 0.25). The intercept dummies were,


with the exception of higher education, science, considerably larger than

those for the private sector sample and the experience-education interac-

tives were, with the same exception, not significantly different from zero

(Table 8). The F-statistic for the explanatory power of the interactive

variables as a group is 1.68, just significant at the 5% level. It follows

that for this subsample the traditional Mincerian specification would have

been approximately appropriate.

Table 8: Public and Own Account Sectors
Semi-Logarithmic Earnings Functions in Brazil, 1980
Variations on Basic Specification

              Public Sector          Own Account
Variables     Coeff.    Std.err.     Coeff.    Std.err.

Constant      7.9779    0.1222       8.2825    0.0549
X             0.0540    0.0089       0.0510    0.0042
X2           -0.0008    0.0002      -0.0009    0.0001
PL            0.1220    0.1470       0.3658    0.0787
PU            0.1047    0.1613       0.7106    0.1361
SECG          0.7154    0.1856       0.7594    0.2110
SECT          0.4859    0.2377       1.2039    0.3172
HISCI         1.7857    0.1986       1.7278    0.2404
HIMGTAG       1.2418    0.3725       1.7404    0.4911
HISOC         1.6660    0.2572       1.7772    0.3857
X*PL          0.0235    0.0116       0.0098    0.0064
X*PU          0.0483    0.0140      -0.0042    0.0126
X*SECG        0.0270    0.0166       0.0238    0.0225
X*SECT        0.0342    0.0221      -0.0275    0.0293
X*HISCI       0.0311    0.0201       0.0561    0.0260
X*HIMGT       0.0471    0.0407      -0.0234    0.0604
X*HISOC       0.0219    0.0224       0.0017    0.0375
X2*PL        -0.0003    0.0002      -0.0002    0.0001
X2*PU        -0.0006    0.0003       0.0002    0.0002
X2*SECG      -0.0002    0.0003      -0.0003    0.0005
X2*SECT      -0.0005    0.0004       0.0006    0.0006
X2*HISCI     -0.0005    0.0004      -0.0016    0.0006
X2*HIMGT     -0.0005    0.0010       0.0003    0.0015
X2*HISOC     -0.0003    0.0004      -0.0004    0.0008

R-Squared     0.5685                 0.2549


The earnings function specification described in Section IV.D was

used to detect certification effects. As anticipated, they were

significant at all levels for the public sector and stronger than in the

case of the private sector. Again, as anticipated, they were largely

absent in the self-employed subsample. Only that for lower primary was

significantly different from zero. As noted in Section IV.D, there are

several possible explanations of certification effects. These findings

support the traditional sheep-skin explanation for all but lower primary.

For this, the explanation that completers are inherently different from

non-completers may be more appropriate.


VII. Conclusions

The empirical results provide striking confirmation of the

superiority of the semi-logarithmic earnings function over its linear

counterpart. The semi-logarithmic version is supported by the Box-Cox

transformation, by relative homoscedasticity in both the schooling and work

experience dimensions, and by the relatively normal distribution of the

residuals.

However, the results also indicate that the standard Mincerian model errs in

neglecting interactive effects between work experience and schooling and by

not making a distinction between the modelling of initial and later

earnings.

The Brazilian results suggest that the biggest differences in the

contribution of work experience to the growth of earnings occur at the

lowest levels of education, and hence that the standard specification is

likely to overestimate the rate of return at these levels, a bias which is

likely to be aggravated by the traditional method of estimating work

experience. This conclusion must however be tempered by the finding that

the interactive effects appear to be confined primarily to mature earnings

and hence will have greatest impact when the rate of return is low.


References

Becker, G. (1972) Human Capital, New York: NBER.

Blinder, A. (1976) "On Dogmatism in Human Capital Theory," Journal of Human Resources 21, 8-22.

Goldfeld, S.M., and R.E. Quandt (1965) "Some Tests for Homoscedasticity," Journal of the American Statistical Association 60, 539-547.

Grasso, J., and J. Shea (1979) Vocational Education and Training: Impact on Youth, Berkeley: Carnegie Council on Policy Studies in Higher Education.

Heckman, J. and S. Polachek (1974) "Empirical Evidence on the Functional Form of the Earnings-Schooling Relationship," Journal of the American Statistical Association, 69, 350-54.

Hungerford, T. and G. Solon (1987) "Sheepskin Effects and the Returns to Education," Review of Economics and Statistics, 175-177.

Mincer, J. (1974) Schooling, Experience and Earnings, New York: Columbia University Press.

Psacharopoulos, G. (1980) "Returns to Education: An Updated International Comparison," in T. King (ed.) Education and Income, World Bank Staff Working Paper No. 402.

Spitzer, J. (1982) "A Primer on Box-Cox Estimation," Review of Economics and Statistics, May, 307-313.

Thurow, L.C. (1980) The Zero-Sum Society, New York: Basic Books.

Winkler, D. (1986) Primary Education in Brazil, Washington, D.C., World Bank.

Zarembka, P. (1968) "Functional Form in the Demand for Money," Journal of the American Statistical Association, 63, 502-511.


Appendix

The Short-Cut (Coefficient-Difference) Mincerian Method

The short-cut Mincerian method has been adopted in many empirical

studies and its use is explained in Psacharopoulos (1981), but we have not

been able to locate a formal justification. The proof which we provide is

subject to the assumptions listed in Section V, all of which are controver-

sial.

We assume that the education in question takes T years, that the

initial earnings of the uneducated and educated are y_0 and y_1, that

earnings with x years of work experience are y_0 f(x) and y_1 f(x), and that

the lengths of their working lives are N_0 and N_1 years, respectively. For

an individual who has the choice of being educated or entering the labour

force directly, the present discounted values of the alternative earnings

streams are

\[ \int_0^{N_0} y_0 \, f(x) \, e^{-rx} \, dx \qquad \text{(direct entry)} \]

\[ \int_0^{N_1} y_1 \, f(x) \, e^{-r(x+T)} \, dx \qquad \text{(entry after education)} \]

where r is the rate of discount. The rate of return to education is thus

given by the solution in r to

\[ y_0 \int_0^{N_0} f(x) \, e^{-rx} \, dx = y_1 \, e^{-rT} \int_0^{N_1} f(x) \, e^{-rx} \, dx \]


Hence, provided that the difference between the integrals is

negligible, the rate of return is given by (log y_1 - log y_0)/T. Since the

regression is run in logarithmic form, this amounts to dividing the

coefficient of the education dummy by T. When there are several levels of

education, the dividend is the difference between the coefficients of the

relevant schooling dummies.
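The step from the equality of the two present values to the short-cut formula can be made explicit by taking logarithms. In the sketch below, I_0(r) and I_1(r) are shorthand introduced here for the two integrals; they are not notation used in the original text.

```latex
\begin{align*}
I_j(r) &= \int_0^{N_j} f(x)\, e^{-rx}\, dx, \qquad j = 0, 1, \\
y_0\, I_0(r) &= y_1\, e^{-rT}\, I_1(r), \\
\log y_0 + \log I_0(r) &= \log y_1 - rT + \log I_1(r), \\
r &= \frac{\log y_1 - \log y_0}{T} + \frac{1}{T}\,\log\frac{I_1(r)}{I_0(r)} .
\end{align*}
```

When the two integrals are approximately equal the last term is close to zero, leaving the short-cut expression.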

The difference between the integrals may not be negligible if work

experience has a stronger impact on earnings than education, for example,

if f(x) = e^{ax}, where a is greater than (log y_1 - log y_0)/T. Note that the

proof does not depend upon the earnings functions being parallel in any

simple sense: it is sufficient that they be isomorphic.
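A small numerical illustration may help to show when the difference between the integrals matters. Taking logarithms of the equality of present values, the short-cut rate differs from the true rate by (1/T) log(I_1/I_0), where I_0 and I_1 are the two integrals. The sketch below evaluates this term for f(x) = e^{ax}, using the closed form of the integral. The working lives of 40 and 36 years, the four-year course and the trial discount rate of 10% are assumptions made here for illustration, not values from the paper.

```python
import math


def discounted_experience_integral(a, r, n):
    """Closed form of the integral of exp(a*x) * exp(-r*x) over [0, n]."""
    k = a - r
    if abs(k) < 1e-12:
        return float(n)  # integrand is identically 1 when a equals r
    return (math.exp(k * n) - 1.0) / k


def shortcut_correction(a, r, n0, n1, t):
    """Term (1/T) * log(I_1 / I_0) omitted by the short-cut method."""
    i0 = discounted_experience_integral(a, r, n0)
    i1 = discounted_experience_integral(a, r, n1)
    return math.log(i1 / i0) / t


# Assumed setting: working lives of 40 and 36 years, a four-year course,
# and a trial discount rate of 10%.
weak_experience = shortcut_correction(a=0.02, r=0.10, n0=40, n1=36, t=4)
strong_experience = shortcut_correction(a=0.15, r=0.10, n0=40, n1=36, t=4)
```

Under these assumptions the omitted term is well under one percentage point when experience has a weak effect on earnings (a = 0.02) but several percentage points when the effect is strong (a = 0.15). When the two working lives are equal the term is exactly zero.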
