national bureau of economic research human ......we are grateful to nicolas ciarcia, nicholas...

74
NBER WORKING PAPER SERIES HUMAN CAPITAL AND REGIONAL DEVELOPMENT Nicola Gennaioli Rafael La Porta Florencio Lopez-de-Silanes Andrei Shleifer Working Paper 17158 http://www.nber.org/papers/w17158 NATIONAL BUREAU OF ECONOMIC RESEARCH 1050 Massachusetts Avenue Cambridge, MA 02138 June 2011 We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated research assistance over the past 5 years. We thank Gary Becker, Nicholas Bloom, Vasco Carvalho, Edward Glaeser, Gita Gopinath, Josh Gottlieb, Elhanan Helpman, Chang-Tai Hsieh, Mathew Kahn, Pete Klenow, Robert Lucas, Casey Mulligan, Elias Papaioannou, Jacopo Ponticelli, Giacomo Ponzetto, Jesse Shapiro, Chad Syverson, David Weil, and seminar participants at the UCLA Anderson School, Harvard University, University of Chicago, and NBER for extremely helpful comments. Gennaioli thanks the Barcelona Graduate School of Economics and the European Research Council for financial support. Shleifer thanks the Kauffman Foundation for support. The views expressed herein are those of the authors and do not necessarily reflect the views of the National Bureau of Economic Research. NBER working papers are circulated for discussion and comment purposes. They have not been peer- reviewed or been subject to the review by the NBER Board of Directors that accompanies official NBER publications. © 2011 by Nicola Gennaioli, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer. All rights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicit permission provided that full credit, including © notice, is given to the source.

Upload: others

Post on 09-Mar-2021

4 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

NBER WORKING PAPER SERIES

HUMAN CAPITAL AND REGIONAL DEVELOPMENT

Nicola GennaioliRafael La Porta

Florencio Lopez-de-SilanesAndrei Shleifer

Working Paper 17158http://www.nber.org/papers/w17158

NATIONAL BUREAU OF ECONOMIC RESEARCH1050 Massachusetts Avenue

Cambridge, MA 02138June 2011

We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, FranciscoQueiro, and Nicolas Santoni for dedicated research assistance over the past 5 years. We thank GaryBecker, Nicholas Bloom, Vasco Carvalho, Edward Glaeser, Gita Gopinath, Josh Gottlieb, ElhananHelpman, Chang-Tai Hsieh, Mathew Kahn, Pete Klenow, Robert Lucas, Casey Mulligan, Elias Papaioannou,Jacopo Ponticelli, Giacomo Ponzetto, Jesse Shapiro, Chad Syverson, David Weil, and seminar participantsat the UCLA Anderson School, Harvard University, University of Chicago, and NBER for extremelyhelpful comments. Gennaioli thanks the Barcelona Graduate School of Economics and the EuropeanResearch Council for financial support. Shleifer thanks the Kauffman Foundation for support. Theviews expressed herein are those of the authors and do not necessarily reflect the views of the NationalBureau of Economic Research.

NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies officialNBER publications.

© 2011 by Nicola Gennaioli, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer. Allrights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicitpermission provided that full credit, including © notice, is given to the source.

Page 2: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Human Capital and Regional DevelopmentNicola Gennaioli, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei ShleiferNBER Working Paper No. 17158June 2011, Revised September 2011JEL No. L26,O11,O43,O47,R11

ABSTRACT

We investigate the determinants of regional development using a newly constructed database of 1569sub-national regions from 110 countries covering 74 percent of the world’s surface and 96 percentof its GDP. We combine the cross-regional analysis of geographic, institutional, cultural, and humancapital determinants of regional development with an examination of productivity in several thousandestablishments located in these regions. To organize the discussion, we present a new model of regionaldevelopment that introduces into a standard migration framework elements of both the Lucas (1978)model of the allocation of talent between entrepreneurship and work, and the Lucas (1988) modelof human capital externalities. The evidence points to the paramount importance of human capitalin accounting for regional differences in development, but also suggests from model estimation andcalibration that entrepreneurial inputs and human capital externalities are essential for understandingthe data.

Nicola GennaioliCREIUniversitat Pompeu FabraRamon Trias Fargas 25-2708005 Barcelona (Spain)[email protected]

Rafael La PortaDartmouth CollegeTuck School210 Tuck HallHanover, NH 03755and [email protected]

Florencio Lopez-de-SilanesEDHEC Business School393, Promenade des Anglais BP 311606202 Nice Cedex 3FRANCEand [email protected]

Andrei ShleiferDepartment of EconomicsHarvard UniversityLittauer Center M-9Cambridge, MA 02138and [email protected]

An online appendix is available at:http://www.nber.org/data-appendix/w17158

Page 3: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

2

I. Introduction.

We investigate the determinants of regional development using a newly constructed database

of 1569 sub-national regions from 110 countries covering 74 percent of the world’s surface and 96

percent of its GDP. We consider a variety of fundamental determinants of economic development,

such as geography, natural resource endowments, institutions, human capital, and culture, by looking

within countries. We combine this analysis with an examination of labor productivity and wages in

several thousand establishments covered by the World Bank Enterprise Survey, for which we have both

establishment-specific and regional data. Throughout the analysis, human capital measured using

education emerges as the most consistently important determinant of both regional income and

productivity of regional establishments. The combination of regional and establishment-level data

enables us to investigate some of the key channels through which human capital operates, including

education of workers, education of entrepreneurs/managers, and externalities.

To organize this discussion, we present a new model describing the channels through which

human capital influences productivity, which combines three features. First, human capital of workers

enters as an input into the neoclassical production function, but human capital of the

entrepreneur/manager influences firm-level productivity independently. The distinction between

entrepreneurs/managers and workers has been shown empirically to be critical in accounting for

productivity and size of firms in developing countries (Bloom and Van Reenen 2007, 2010; La Porta and

Shleifer 2008; Syverson 2011). In the models of allocation of talent between work and entrepreneurship

such as Lucas (1978), Baumol (1990), and Murphy, Shleifer, and Vishny (1991), returns to

entrepreneurial schooling may appear as profits rather than wages. By modeling this allocation, we

trace these two separate contributions of human capital to productivity.

Second, our approach allows for human capital externalities, emphasized in the regional context

by Jacobs (1969), and in the growth context by Lucas (1988, 2008) and Romer (1990). These

Page 4: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

3

externalities result from people in a given location spontaneously interacting with and learning from

each other, so knowledge is transmitted across people without being paid for. Because our framework

incorporates both the allocation of talent between entrepreneurship and work as in Lucas (1978), and

human capital externalities as in Lucas (1988), we call it the Lucas-Lucas model2. Human capital

externalities have been found in a variety of development and regional contexts (Rauch 1993, Glaeser,

Scheinkman and Shleifer 1995, Angrist and Acemoglu 2000, Glaeser and Mare 2001, Moretti 2004,

Iranzo and Peri 2009), although Ciccone and Peri (2006) and Caselli (2005) find them to be unimportant.

By decomposing human capital effects into those of worker education, entrepreneurial/managerial

education, and externalities using a unified framework, we try to disentangle different mechanisms.

Third, because we are looking at the regions, we need to consider the mobility of firms, workers,

and entrepreneurs across regions, which is presumably less expensive than that across countries. To

this end, our model follows the standard urban economics approach (e.g., Roback 1982, Glaeser and

Gottlieb 2009) of labor mobility across regions with scarce resources, such as land and housing, limiting

universal migration into the most productive regions. This aspect of the model allows us to consider

jointly the education coefficients in regional and establishment level regressions. A key benefit of our

model of the three channels of influence of education is to reconcile regional and firm-level evidence.

To begin, we use regional data to examine the determinants of regional income in a

specification with country fixed effects. Our approach follows development accounting, as in Hall and

Jones (1999), Caselli (2005), and Hsieh and Klenow (2010). Among the determinants of regional

productivity, we consider geography, as measured by temperature (Dell, Jones, and Olken 2009),

distance to the ocean (Bloom and Sachs 1998), and natural resources endowments. We also consider

2 We do not consider the role of human capital in shaping the adoption of new technologies. Starting with Nelson

and Phelps (1966), economists have argued that human capital accelerates the adoption of new technologies. For recent models of these effects, see Benhabib and Spiegel (1994), Klenow and Rodriguez-Clare (2005), and Caselli and Coleman (2006). For evidence on such externalities and R&D spillovers, see Coe and Helpman (1995), Ciccone and Papaioannou (2009), and Wolff (2011).

Page 5: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

4

institutions, which have been found by King and Levine (1993), De Long and Shleifer (1993), Hall and

Jones (1999), and Acemoglu et al. (2001) to be significant determinants of development. We also look

at culture, measured by trust (Knack and Keefer 1997) and at ethnic heterogeneity (Easterly and Levine

1997, Alesina et al. 2003). Last, we consider the effect of average education in a region on its level of

development. A substantial cross-country literature points to a large role of education. Barro (1991)

and Mankiw, Romer, and Weil (1992) are two early empirical studies; de La Fuente and Domenech

(2006) and Cohen and Soto (2007) are recent confirmations. Across countries, the effects of education

and institutions are difficult to disentangle empirically because both are endogenous and because the

potential instruments for each are correlated with both (Glaeser et al 2004). By using country fixed

effects, we can avoid identification problems caused by unobserved country-specific factors.

We find that favorable geography, such as lower average temperature and proximity to the

ocean, as well as higher natural resource endowments, are associated with higher per capita income in

regions within countries. We do not find that culture, as measured by ethnic heterogeneity or trust,

explains regional differences. Nor do we find that institutions as measured by survey assessments of

the business environment in the Enterprise Surveys help account for cross-regional differences within a

country. Some institutions or culture may matter only at the national level, but then large income

differences within countries call for explanations other than culture and institutions. In contrast,

differences in educational attainment account for a large share of the regional income differences

within a country. The within country R2 in the univariate regression of the log of per capita income on

the log of education is about 25 percent; this R2 is not higher than 8 percent for any other variable.

Acemoglu and Dell (2010) examine sub-national data from North and South America to

disentangle the roles of education and institutions in accounting for development. The authors find that

about half of the within-country variation in levels of income is accounted for by education. This is

similar to the Mankiw et al. (1992) estimate that half of the differences in per capita incomes across

Page 6: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

5

countries is attributable to education. We confirm a large role of education, but try to go further in

identifying the channels. Acemoglu and Dell also conjecture that institutions shape the remainder of

the local income differences. We have regional data on several aspects of institutional quality, but find

that their ability to explain cross-regional differences is minimal3.

Regional regressions have the problem that human capital in a region may be endogenous

because of migration. To make progress on this front, we next examine the determinants of firm-level

productivity. To this end, we merge our data with World Bank Enterprise Surveys, which provide

establishment-level information on sales, labor force, educational level of management and employees,

as well as energy and capital use for several thousand establishments in the regions for which we have

data. The micro data show that establishments with employees and managers with more education are

more productive, holding capital and energy inputs constant and including regionXindustry fixed effects.

These data point to a large role of managerial/entrepreneurial human capital in raising firm productivity.

We then re-estimate the same equations including regional education, but replacing regionXindustry

fixed effects with country and industry fixed effects. In these regressions, regional education has a large

and positive coefficient, pointing to sizeable human capital externalities that survive controls for several

potential determinants of regional productivity. However, because regional education may be

correlated with unobserved region-specific productivity parameters, we do not have perfect

identification in this analysis of externalities.

To assess the extent to which firm-level results can account for the role of human capital across

regions, we combine estimation with calibration. To circumvent the endogeneity problems entailed in

cross country regressions, scholars have turned to calibration techniques, using micro studies to pin

3 A recent literature looks at colonial history within countries, and argues that regions that were treated

particularly badly by colonizers have poor institutions and lower income many years later (Banerjee and Iyer 2005, Dell 2010). It is surely possible, even likely, that severe institutional shocks have long run consequences because they influence human capital accumulation and institutions in the long run. But to the extent that we have adequate measures of institutional quality, the consequences of such shocks for modern institutions do not appear to add a great deal of explanatory power to understanding cross-regional evidence today.

Page 7: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

6

down the parameters of the production function (e.g., Caselli 2005). We also rely on previous research

regarding factor shares (e.g., Gollin 2002, Caselli and Feyer 2007, Valentinyi and Herrendorf 2008), but

then combine it with coefficient estimates from regional and firm-level regressions. This allows us to

back out the returns to schooling and the role of externalities. We find substantial consistency between

regional and firm-level results, as well as plausible estimates of the parameters of the production

function. Our calibrations show that worker education, entrepreneurial education, and externalities all

substantially contribute to productivity. We find the role of workers’ human capital to be in line with

standard wage regressions, which are the benchmark adopted by conventional calibration studies (e.g.

Caselli 2005). Crucially, however, our results indicate that focusing on worker education alone

substantially underestimates both private and social returns to education. Private returns are very high

but to a substantial extent are earned by entrepreneurs, and hence might appear as profits rather than

wages, consistent with Lucas (1978). Although we have less confidence in the findings for externalities,

our best estimates suggest that those are also sizeable and fall in the ballpark of existing estimates (e.g.

Moretti 2004). In sum, the evidence points to a large influence of entrepreneurial human capital, and

perhaps of human capital externalities, on productivity.

In section II, we present a model of regional development that organizes the evidence. In

section III, we describe our data. Section IV examines the determinants of both national and regional

development. Section V presents firm-level evidence to disentangle the channels of through which

human capital influences productivity. We combine the regional and the micro estimates to compute

the model’s parameters and to assess its ability to explain income differences. Section VI concludes.

II. A Lucas-Lucas spatial model of regional and national income

A country consists of a measure 1 of regions, a share p of which has productivity ̃ and a share

1– p of which has productivity ̃ ̃ . We refer to the former regions as “productive”, to the latter

Page 8: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

7

regions as “unproductive”, and denote them by i = G, B. A measure 2 of agents is uniformly distributed

across regions. An agent j enjoys consumption and housing according to the utility function:

jj acacu

1),( , (1)

where c and a denote consumption and housing, respectively. Half the agents are “rentiers,” the

remaining half are “labourers’’. Each rentier owns 1 unit of housing, T units of land, K units of physical

capital (and no human capital). Each labourer is endowed with hR++ units of human capital. In region i

= G, B the distribution of human capital is Pareto in [h,+∞), with mean μih/(μi–1), where μi, h>1. We

denote by Hi = μih/(μ–1) the initial human capital endowment of region i = G, B. Differences in Hi

capture exogenous variation of human capital across regions.

A labourer can become either an entrepreneur or a worker. By operating in region i, an

entrepreneur with human capital h who hires physical capital Ki,h, land Ti,h, and workers with total

human capital Hi,h produces an amount of the consumption good equal to:

hihihiihi TKHhAy ,,,1

, , 1 . (2)

As in Lucas (1978), a firm’s output increases, at a diminishing rate, in the entrepreneur’s human capital

has well as in Hi,h, Ki,h and Ti,h. We model human capital externalities (Lucas 1988) by assuming that

regional total factor productivity is given by:

iiii LhEAA )(~

, γ> 0, ψ ≥ 1. (3)

According to (3), holding region-specific factors ̃ constant (e.g., geography, institutions), productivity

increases in the region’s human capital. In expression (3), Ei(h) is the average level of human capital in

region i and Li is the measure of labour in that region. Parameter ψ captures the importance of the

quality of human capital: when ψ = 1 only the total quantity of human capital Hi = Ei(h)Li matters for

externalities; as ψ rises the quality of human capital becomes relatively more important than quantity.

Parameter γ captures the overall importance of externalities. To ease notation, we suppress the

Page 9: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

8

dependence of productivity on regional human capital until we examine the spatial equilibrium, in which

productivity Ai is endogenously determined with regional sorting of labourers.

Rentiers rent land and physical capital to firms, and housing to entrepreneurs and workers. In

region i, each rentier earns λiT and ηi by renting land and housing, where λi and ηi are rental rates, and

ρiK by renting physical capital. A region’s land and housing endowments T and 1 are immobile; physical

capital is fully mobile. Labourers use their human capital in work or in entrepreneurship. By operating

in region i, a labourer with human capital h earns either profits πi(h) as an entrepreneur or wage income

wi∙h as a worker, where wi is the wage rate. All labourers, whether they become entrepreneurs or

workers, are partially mobile: a labourer moving to region i loses φwi units of income, where φ<h.4

At t = 0, a labourer with human capital h selects the location and occupation that maximize his

income. The housing market clears, so houses are allocated to each region’s labour. At t = 1,

entrepreneurs hire land, human, and physical capital. Production is carried out and distributed in wages,

land rental, capital rental, housing rental and profits. Consumption takes place.

A spatial equilibrium is a regional allocation iiWi

Ei KHH ,, of entrepreneurial human capital

EiH , workers’ human capital W

iH , and physical capital Ki such that: a) entrepreneurs hire workers,

physical capital, and land to maximize profits, b) labourers optimally choose location, occupation and

the fraction of income devoted to consumption and housing, and c) capital, labour, land and housing

markets clear. Because physical capital is fully mobile, there is a unique rental rate ρ. Since land and

housing are immobile, their rental rates λi and ηi vary across regions depending on productivity and

population. To determine the sorting of labourers across regions and their choice between work and

entrepreneurship within a region, we must compute regional wages wi and profits πi(hj). To do so, we

first determine regional output and factor returns at a given allocation iiWi

Ei KHH ,, . Second, we solve

4 For simplicity, we assume that moving costs are a redistribution from migrants to locals (the latter may be viewed

as providing moving services) and are non-rival with the time spent working. This ensures that the human capital employed in a region, as well as the aggregate income of laborers, do not depend on moving costs.

Page 10: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

9

for the equilibrium allocation. Throughout the analysis, the price of consumption is normalized to one.

Production and occupational choice

An entrepreneur with human capital h operating in region i maximizes his profit by solving:

hiihihiihihihiiKTHTKHwTKHhA

hihihi,,,,,,

1

,, ,,,

max , (4)

implying that in each region firms employ factors in the same proportion. Since at iiWi

Ei KHH ,, firm j

employs a share of entrepreneurial capital hj/EiH , it hires the others factors according to:

.,, ,,, THh

TKHh

KHHh

H Ei

jjiiE

i

jji

WiE

i

jji (5)

As in Lucas (1978), more skilled entrepreneurs run larger firms.

Equation (5) implies that the aggregate regional output is given by:

TKHHAY iWi

Eiii

1. (6)

Equation (6) allows us to determine wages, profits, and capital rental rates as a function of regional

factor supplies via the usual (private) marginal product pricing. That is:

.///

,///)1(

,///

1

1

iiWii

Eii

i

i

Ei

Eii

Ei

WiiE

i

ii

Wi

Wii

Wi

EiiW

i

ii

KTKHKHAKY

HTHKHHAHY

HTHKHHAHY

w

(7)

Thus, profit πi(h) is equal to πi (the marginal product of the entrepreneur’s human capital in region i),

times the entrepreneur’s human capital h, namely πi(h) = πi∙h.

Using Equation (7) we can solve for a labourer’s occupational choice. A labourer j with human

capital hj chooses to be an entrepreneur if πi∙hj>wi∙hj and a worker if πi∙hj<wi∙hj. In equilibrium,

labourers must be indifferent between the two occupations (i.e., πi=wi), which implies:

Page 11: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

10

iWii

Ei HHHH

1,

11

, (8)

where Wi

Eii HHH is total human capital in region i. E

iH increases with the share of the total private

return to human capital earned by entrepreneurs [i.e. with (1–α–β–δ)/(1–β–δ)]. Equation (8) describes

the allocation of labour within in a region from the total quantities of human and physical capital (Hi,Ki).

The spatial equilibrium: consumption, housing and mobility

We consider symmetric spatial equilibria in which all productive regions share the same factor

allocation (HG,KG), the same wage wG and rental rates λG and ηG, and unproductive regions share the

same allocation (HB,KB), wage wB, and rentals λB and ηB. The uniformity of the capital rental ρ across

regions pins down the allocation of physical capital as a function of the regional allocation of human

capital. By exploiting equations (7) and (8) one finds that ρ is constant across regions provided:

11

11

B

G

B

G

B

G

HH

AA

KK , (9)

Intuitively, the physical capital stock allocated to productive regions increases in their productivity

advantage AG/AB and in their relative human capital stock HG/HB. Remember that the productivity

differential in (10) is endogenously determined through externalities, as modelled in (3).

To find the allocation of human capital, we must characterize labour mobility by computing the

utility that labourers obtain from operating in different regions. Labourers maximize their utility in (2)

by devoting a share θ of their income to housing and the remaining share (1 – θ) to consumption. Since

the aggregate income of labourers in region i is equal to wiHi, the demand for housing in the regions is

θ∙wiHi/ηi. With the unitary supply of housing, the housing rental rate is equal to:

ηi= θ∙wi∙Hi. (10)

As a consequence, the utility (gross of moving costs) of a labourer in region i is equal to:

Page 12: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

11

i

i

i

iiw H

hwhwacu 1

, ),( , (11)

which rises with the wage and falls with regional human capital Hi due to higher rents.

To describe the spatial equilibrium, we need to find the ratio between the wages paid in

productive and unproductive regions as a function of their human capital. Using the wage equation (7),

the interest rate equalization condition (8), and the expression for externalities (3), we find that:

111

)()(

~~

G

B

BB

GG

B

G

B

G

HH

LhELhE

AA

ww

(12)

Ceteris paribus, the wage is higher in productive regions. A higher human capital stock has a negative

effect on the wage because of diminishing returns but once externalities are taken into account the net

effect is ambiguous. In the remainder we assume:

A.1 1~~

B

G

B

G

HH

AA ,

which implies that the autarky wage and interest rates are higher in productive regions, so that both

capital and labour tend to move there. We can then prove the following (in Appendix 1):

Proposition 1 Under the parametric restriction:

(β – ψγ)(1 – θ) + θ(1 – δ)> 0, (13)

there is a stable equilibrium allocation HG and HB. In this allocation:

a) There is a cutoff hm such that agent j migrates from an unproductive to a productive region if

and only if hj ≥ hm. The cutoff hm increases in the mobility cost φ.

b) Denote by BG HpHpH )1( the aggregate human capital. Then, when φ = 0, the

equilibrium level of human capital in region i is independent of the region’s initial human capital

endowment. In particular, for ψ = 1 the full mobility allocation satisfies:

Page 13: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

12

H

AE

AHH Gfree

GG

)1()1)((1

)1()1)((1

~

. (14)

When φ > 0 and ψ ≥ 1, we have that HG< freeGH~ and HG increases in HG holding H constant.

Since wages (and profits) are higher in the productive than in the unproductive regions, labour

migrates to the former from the latter. The cutoff rule in a) is intuitive: more skilled people have a

greater incentive to pay the migration cost because the wage (or profit) gain they experience from doing

so is higher. Even if mobility costs are zero, migration to the more productive regions is not universal.

This is due to the limited supply of land T, which causes decreasing returns in production, and to the

limited supply of housing, which implies that migration causes housing costs to rise until the incentive to

migrate disappears. Regional externalities moderate the adverse effect of fixed supplies of land and

housing on mobility. In fact, for migration to be interior, condition (13) must be met, which requires

external effects ψγ to be sufficiently small relative to: i) the diminishing returns β due to land and ii) the

sensitivity θ of house prices to regional human capital.

In equilibrium, wages are higher in the more productive regions, wG>wB, but the housing rental

rate is also higher there, ηG>ηB. While (14) shows that under costless migration the human capital

employed in a region only depends on its productivity, when mobility is imperfect (i.e., φ > 0) a region

exogenously endowed with more human capital will employ more human capital in equilibrium. This

property will be important for the empirical analysis. When ψ = 1 and φ = 0, national output is equal to:

TKHHHAY WE

1, (15)

where A

is a function ),,,~,~,,,( pAAA BG

of exogenous parameters. More generally, under

condition (13) the Lucas-Lucas model yields the following equation for firm level output:

jijijijiiiji TKHhLhEAy ,,,

1, )(~ , (16)

Page 14: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

13

and the following equation for regional output:

TKHHLhEAY iWi

Eiiiii )()()(~ 1 .

(17)

Empirical Predictions of the Model

To obtain predictions on the role of schooling, we need to specify a link between human capital

(which we do not observe) and schooling (which we do observe). We follow the Mincerian approach in

which for an individual j the link between human capital and schooling is:

jjj Sh exp , (18)

where Sj ≥ 0 and μj ≥ 0 are two random variables (distributed to ensure that the distribution of hj is

Pareto). The return to schooling μi varies across individuals, potentially due to talent. This allows us to

estimate different returns to schooling for workers and entrepreneurs. Card (1999) offers some

evidence of heterogeneity in the returns to schooling. Human capital in region i is then equal to

0,0

),()(

S i

S

h i dSdSgehhdF , where )(hdFi is the density of region i labourers’ with human

capital h and ),( Sgi is the density of region i labourers having human capital Seh , so that

iS i LdSdSg 0,0),(

. In line with macro studies, in our regressions we express average human

capital in the region as a first order expansion around the mean Mincerian return and years of schooling:

ii Si ehE

)( , (19)

where iS is average schooling while i is the average Mincerian return, both computed in region i.

Regional Income Differences

To test Equation (17) we need to specify a regression in terms of observables, which entails

regressing regional per capita income on human capital and population (we do not have regional data

Page 15: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

14

on physical capital). From the Equation for ρ in (7) we obtain the condition Ki=B

11

11

ii HA where B >

0 is a constant. Substituting this condition as well as Equation (19) into Equation (17) we find that:

ln(Yi/Li) = C + [1/(1 – δ)]ln ̃ + [1+ γψ –β/(1 – δ)] i iS + [γ – β/(1 – δ)]lnLi, (20)

where C is a constant absorbed by the country fixed effect. If mobility is costly (φ>0), our model predicts

that regional human capital should vary with the initial exogenous endowment Hi and that regional

income differences are also triggered by fixed housing supply. Estimating (20) using OLS implies that the

coefficient on average regional schooling should be interpreted as the product of the “technological”

parameter (1+ γψ – β) and the nation-wide average of the regional Mincerian returns i .

The coefficient [γ – β/(1 – δ)] on population Li captures the benefit γ of increasing regional

workforce in terms of externalities minus the cost β of crowding the fixed land supply. A similar

interpretation holds with respect to the schooling coefficient (1+ γψ – β). The estimation of Equation

(20) raises a serious concern: since in our model human capital migrates to more productive regions, any

mismeasurement of regional productivity Ai may contaminate the coefficient of regional human capital.

We deal with this issue in two steps. First, we control in regression (20) for proxies of Ai. Second, we

compare these results to the coefficients obtained from the firm level regressions and to the calibration

exercises performed by the development accounting literature. These comparisons allow us to assess

the severity of the endogeneity problem in the estimation of (20).

Firm-Level Productivity

In (16), the output of a firm j operating in region i depends on the human capital hE,j of the

entrepreneur, as determined by his schooling SE,j and return to schooling ijE , , and on the average

human capital E(hW,j) of workers, which again we approximate by jWjW Se ,, (where jW , and jWS , are

Page 16: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

15

average values in the firm’s workforce). Ceteris paribus, in our model entrepreneurs have a higher

return to schooling than workers because in region i an entrepreneur with schooling S is someone

whose return satisfies iES he ,

, where iEh , is the human capital threshold for becoming an

entrepreneur in region i. At a schooling level S, the entrepreneurial class includes talented labourers

whose return satisfies ShS iEiE /ln)( ,, while labourers with )(, SiE become workers.

By writing Equation (16) in terms of firm-level output per worker yi,j/li,j and by exploiting the

expressions for entrepreneurs’ and workers human capital, we obtain the prediction:

ln(yi,j/li,j) = ln ̃ + (1–α–β–δ) jE , SE,ij + α jW , jWS , +

(1–α–β–δ)ln(E

jil , /li,j) + αln( Wjil , /li,j)+δlnki,j +βlnti,j + γlnLi + γψ i iS , (21)

Where xi,j = Xi,j/li,j denote per-worker values, E

jil , /li,j and W

jil , /li,j capture the share of a firm’s total

employment on managerial and non-managerial jobs, respectively. Taken literally, in our model output

per-worker is equalized across firms within a region. More realistically, though, output per-worker is

equalized across firms ex-ante, but its ex-post value varies as a result of stochastic ex-post changes in

the values of firm level TFP and inputs. This is the variation we appeal to when estimating (21).5

When estimating (21) using OLS, the coefficient on the entrepreneur’s schooling should be

interpreted as the product of entrepreneurs’ rents (1–α–β–δ) and a nation-wide average Mincerian

entrepreneurs’ return to education E . The coefficient on workers’ average schooling should be

interpreted as the labour share α times a nation-wide average Mincerian returns to workers W . The

5 Formally, suppose that if ex-ante a firm hires Xi,j units of a productive factor, this results in Xi,j = εX∙ Xi,j units of

the same factor being employed in production ex-post, where εX is a random shock to the value of inputs (e.g. an unpredictable change in the value of equipment, in the size of the workforce and so on). Given the Cobb-Douglas production function, the firm’s ex-ante optimization problem (occurring with respect to the ex-ante inputs Xi,j) does not change with respect to Equations (4) and (5). The only change is that a firm’s productivity also includes expectations of the random factors εX. Crucially, this formulation implies that ex-ante returns are equalized, ex-post returns are not, which allows us to estimate (21) insofar as our input measures captures the ex-pot values Xi,j.

Page 17: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

16

coefficient on regional schooling should be interpreted as the product of the externality parameter γψ

and the population-wide average Mincerian return .6

The estimation of (21) allows us to separate the role of the “low human capital” of workers from

the “high human capital” of entrepreneurs in shaping firm productivity. Since the selection of talented

entrepreneurs into more productive regions/industries may contaminate our results, we first estimate

(21) by controlling for the full set of regionXindustry dummies. In addition, we estimate (21) by directly

controlling for region specific variables to assess the importance of externalities from the coefficient γ

on population and the coefficient γψ on regional schooling. In this case, however, migration of human

capital to more productive regions may give the false impression of positive externalities, creating the

identification issue present in estimating (21). We deal with this problem by controlling for proxies for

Ai and by comparing our estimation results to standard calibrations of externalities.

III. Data.

Our analysis is based on measures of income, geography, institutions, infrastructure, and culture

in up to 110 (out of 193 recognized sovereign) countries for which we found regional data on either

income or education. Almost all countries in the world have administrative divisions.7 In turn,

administrative divisions may have different levels. For instance a country may be divided into states or

provinces, which are further subdivided into counties or municipalities. For each variable, we collect

data at the highest administrative division available (i.e., states and provinces rather than counties or

municipalities) or, when such data does not exist, at the statistical division (e.g. the Eurostat NUTS in

6Both the regional level Equation (20) and the firm level Equation (21) imply that the average return to schooling

should vary across regions. One way to empirically account for such possibility is to run random coefficient regressions. We have performed this analysis and the results change very little (the results on human capital become slightly stronger). We do not report the results to save space. 7 The exceptions are Cook Islands, Hong Kong, Isle of Man, Macau, Malta, Monaco, Niue, Puerto Rico, Vatican City,

Singapore, and Tuvalu.

Page 18: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

17

Europe) that is closest to it. Because we focus on regions, and typically run regressions with country

fixed effects, we do not include countries with no administrative divisions in the sample.

The reporting level for data on income, geography, institutions, infrastructure, and culture

differs across variables. GDP and education are typically available at the first-level administrative

division (i.e., states and provinces). In contrast, GIS geo-spatial data on geography, climate, and

infrastructure is typically available for areas as small as 10 km2. Finally, survey data on institutions and

culture are typically available at the municipal level. In our empirical analysis, we aggregate all variables

for each country to a region from the most disaggregated level of reporting available.8 To illustrate, we

have GDP data for 27 first-level administrative regions in Brazil, corresponding to its 26 states plus the

Federal District, but survey data on institutions for 248 municipalities. For our empirical analysis, we

aggregate the data on institutions by taking the simple average of all observations for establishments

located in the same first-level administrative division. Similarly, we aggregate the GIS geo-spatial data

on geography, climate, and infrastructure at the first-administrative level using the Collins-Bartholomew

World Digital Map.

The final data set has 1,569 regions in 110 countries: (1) 79 countries have regions at the first-

level administrative division; and (2) 31 countries have regions at a more aggregated level than the first-

administrative level because one or several variables (often education) are unavailable at the first-

administrative level. For example, Ireland has 34 first-level divisions (i.e., 29 counties and 5 cities), but

publishes GDP per capita data for 8 regions and education for 2 regions. Thus, we aggregate all the Irish

data to match the 2 regions for which education statistics are available. The data Appendix identifies

8 We used a variety of aggregation procedures. Specifically, we computed population-weighted averages for GDP

per capita and years of schooling. We computed regional averages for temperature, precipitation, distance to coast, travel time, and soil characteristics by first summing the (average) values of the relevant variable for all grid cells lying within a region and then dividing by the number of cells lying within a region. We computed regional averages for the density (e.g., power lines) and natural resources variables (oil and gas) by first summing the relevant variable for all grid cells within a region and then dividing by the region’s population. We averaged the responses within a region for all the variables from the Enterprise and World Value Surveys. We sum up the number of unique ethnic groups and computed the probability that people within a region speak the same language based on the total number of people in each “language” area.

Page 19: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

18

the reporting level for the regions in our dataset. As noted earlier, all countries have administrative

divisions (although 31 countries in our sample report statistics for statistical regions). The principal

constraint on the sample is the availability of human capital data. Of course, all countries have periodic

censuses and thus have sub-national data on human capital, but these data are hard to find.

Figure 1 portrays the 1,569 regions in our sample. It shows that coverage is extensive outside of

North and sub-Saharan Africa. Sample coverage is strongly related to a country’s surface area,

presumably because very small countries do not report regional data. For example, the smallest country

in our dataset is Lebanon (10,400 km2), leaving out of our sample some very prosperous countries such

as Luxembourg (2,590 km2) and Singapore (699 km2). Among countries ranked by their surface area, we

only have data for 14% of the first (smallest) 50 countries, 44% of the first 100 countries, and 53% of the

first 150 countries. Similarly, sample coverage rises with the absolute level of GDP but not with GDP per

capita. For example, we have data for 18% the first 50 countries ranked in terms of GDP in 2005, 38% of

the first 100 countries, and 49% of the first 150 countries. The comparable figures for countries ranked

on the basis of GDP per capita are 52%, 57% and 57%, respectively. Since sample coverage rises with

GDP, it turns out that the countries in our sample account for 97% of world GDP in 2005.

Our final dataset has regional income data for 107 countries in 2005, drawn from sources

including National Statistics Offices and other government agencies (42 countries), Human Development

Reports (36 countries), OECDStats (26 countries), the World Bank Living Standards Measurement Survey

(Ghana and Kazakhstan), and IPUMS (Israel). When regional income data for 2005 is missing, we use

log-linear interpolation based on as much data as it is available for the period 1990-2008 or, when

interpolation is not possible, the closest available year.9 Our measure of regional income per capita is

typically based on value added but we use data on income (6 countries), expenditure (8 countries),

wages (3 countries), gross value added (2 countries), and consumption, investment and government

9 We are missing regional GDP per capita for Bangladesh and Costa Rica and national GDP per capita in PPP terms

for Cuba.

Page 20: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

19

expenditure (1 country) to fill-in missing values. We measure regional GDP in current purchasing-

power-parity dollars as we lack data on regional price indexes. To ensure consistency with the national

GDP figures reported by World Development Indicators, we adjust regional income values so that --

when weighted by population-- they total GDP at the country level. Not surprisingly, adjustments

exceeding 20% were necessary in 19 out of the 20 countries for which use GDP proxies rather than

actual GDP. Adjustments exceeding 20% were also necessary in 13 countries (Democratic Republic of

Congo, Lebanon, Lesotho, Madagascar, Malaysia, Nepal, Niger, Philippines, Senegal, Swaziland, Syria,

Uganda, and Venezuela) where our GDP data are in real terms.

We compute regional income per capita using population data from Thomas Brinkhoff: City

Population, which collects official census data as well as population estimates for regions where official

census data are unavailable.10 We adjust these regional population values so that their sum matches

the country’s population in the World Development Indicators database. This adjustment exceeds 10%

for 6 countries: Bangladesh (+13%), Benin (+11%), Democratic Republic of Congo (-10.5%), Gabon (-

25%), Swaziland (+16%) and Uzbekistan (-22%).

In addition, we examine productivity and its determinants using establishment-level data from

the Enterprise Survey for as many as 53,957 establishments in 82 countries and 539 of the regions in our

sample.11 We collect operating data on sales, cost of raw materials, cost of labor, cost of electricity, and

the cost of communications. We also collect data on the book value of property, plant, and equipment.

Critically, the Enterprise Survey keeps track of the highest educational attainment of the establishment’s

top manager as well as of its workers. Finally, we collect the two-digit ISIC code (e.g., food, textiles,

chemicals, etc.) of the establishments in our sample. Like the economic census and business registry

10

We also used data from OECDStats (for Denmark, Greece, Ireland, Italy, and the UK) and the National Statistics Office of Macedonia. 11

The Enterprise Survey data was collected between 2002 and 2009. When data from the Enterprise Survey for one of the countries in our sample are available for multiple years, we use the most recent one.

Page 21: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

20

data, the Enterprise Survey data only covers registered establishments. A limitation of the Enterprise

Survey data is that it largely excludes OECD countries (Ireland and Mexico are the exceptions).

We relate regional economic development to five sets of potential determinants: (1)

geography, (2) education, (3) institutions, (4) infrastructure, and (5) culture. To narrow down the list of

candidate variables, we restrict attention to variables that are available at the regional level for at least

40 countries and 200 regions.

We use three measures of geography and natural resources obtained from the WorldClim

database, which are available for all regions of the world. They include the average temperature during

the period 1950-2000, the (inverse) average distance between the cells in a region and the nearest

coastline, and the estimated volume of oil production and reserves in the year 2000.12

We gather data on the educational attainment of the population 15 years and older for 106

countries and 1491 regions from EPDC Data Center (55 countries), Eurostat (17 countries), National

Statistics Offices (27 countries) and IPUMS (7 countries); see the data appendix for sources. We collect

data on school attainment during the period 1990-2006 and use data for the most recently available

period. We compute years of schooling following Barro and Lee (1993). Specifically, we use UNESCO

data on the duration of primary and secondary school in each country and assume: (a) zero years of

school for the pre-primary level, (b) 4 additional years of school for tertiary education, and (c) zero

additional years of school for post-graduate degrees. We do not use data on incomplete levels because

it is only available for about half of the countries in the sample. For example, we assume zero years of

additional school for the lower secondary level. For each region, we compute average years of schooling

as the weighted sum of the years of school required to achieve each educational level, where the

weights are the fraction of the population aged 15 an older that has completed each level of education.

12

The results in the paper are robust to controlling for the standard deviation of temperature, the average annual precipitation during the period 1950-2000, the average output for multiple cropping of rain-fed and irrigated cereals during the period 1960-1996, the estimated volume of natural gas production and reserves in year 2000, and dummies for the presence of various minerals in the year 2005.

Page 22: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

21

To illustrate these calculations consider the Mexican state of Chihuahua. The EPDC data on the

highest educational attainment of the population 15 years and older in Chihuahua in 2005 shows that

4.99% of the that population had no schooling, 13.76% had incomplete primary school, 22.12% had

complete primary school, 5.10% had incomplete lower secondary school, 23.04% had complete lower

secondary school, 17.94% had complete upper secondary school, and 13.05% had complete tertiary

school. Next, based on UNESCO’s mapping of the national educational system of Mexico, we assign six

years of schooling to people who have completed primary school and 12 years of schooling to those that

have completed secondary school. Finally, we calculate the average years of schooling in 2005 in

Chihuahua as the sum of: (1) six years times the fraction of people whose highest educational

attainment level is complete primary school (22.12%), incomplete lower secondary (5.1%), or complete

lower secondary school (23.04%); (2) 12 years times the fraction of people whose highest attainment

level is complete upper secondary school (17.94%); and (3) 16 years times the fraction of people whose

highest attainment level is complete tertiary school. Accordingly, we estimate that the average years of

schooling of the population 15 and older in Chihuahua in 2005 is 7.26 years

(=6*0.5026+12*0.1794+16*0.1305).

We compute years of schooling at the country-level by weighting the average years of schooling

for each region by the fraction of the country’s population 15 and older in that region. The correlation

between this measure and the number of years of schooling for the population 15 years and older in

Barro and Lee (2010) is 0.9. For the average (median) country in our sample, the number of years of

schooling in Barro and Lee (2010) is 8.18 vs. 6.88 in ours (8.56 vs.6.92 years). Two factors could

potentially explain why the Barro-Lee dataset yields a higher level of educational attainment than ours:

(1) Barro-Lee captures incomplete degrees while we do not; and (2) education levels have increased

rapidly over time but some of our educational attainment data is stale (e.g. for 14 countries our

educational attainment data is for the year 2000 or earlier). To make the Barro and Lee (2010) measure

Page 23: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

22

of educational attainment more comparable to ours, we make two adjustments to their data. First, we

apply our methodology to the Barro-Lee dataset and compute the level of educational attainment in

2005. After this first adjustment, the level of educational attainment computed with the Barro-Lee

dataset for the average (median) country in our sample drops to 7.07 (7.23). Second, we apply our

methodology to the Barro-Lee dataset but –rather than use data for 2005 -- use figures for the year that

best matches the year in our dataset. After this second adjustment, the level of educational attainment

using the Barro-Lee dataset for the average (median) country in our dataset drop further to 6.95

(7.22).13 Since most of our results are run with country-fixed effects, country-level biases in our

measure of human capital do not affect our results.14

We gather data on seven measures of the quality of institutions from the Enterprise Survey and

the Sub-national Doing Business Reports. The Enterprise Survey covers as many as 80 of the countries

and 410 of the regions in our sample.15 The Enterprise Survey asked business managers to quantify: (1)

informal payments in the past year (percent of sales spent in informal payments by a typical firm in the

respondent’s industry), (2) the number of days spent in meeting with tax authorities in the past year, (3)

the number of days without electricity in the previous year, and (4) security costs (cost of security

equipment, personnel, or professional security service as a percentage of sales). The Enterprise Survey

also asks managers to rate a variety of obstacles to doing business, including: (1) access to land, and (2)

13

After the second adjustment, there are 5 countries (i.e., Great Britain, Poland, Switzerland, Syria, and Uruguay) for which our educational attainment numbers remain 25% or more above the adjusted Barro-Lee numbers, and 12 countries (i.e., Armenia, Bangladesh, Benin, Bolivia, Cambodia, Honduras, Laos, Morocco, Niger, Pakistan, Senegal, and Sri Lanka) for which our numbers remain 25% or more below the adjusted Barro-Lee numbers. In all but two of these 17 cases (Great Britain and Poland are the two exceptions), data sources differ (our data for these two countries comes from household or individual surveys and theirs from national censuses). For Great Britain we have 12.14 years of schooling, as does the OECD, while Barro-Lee has 9.21. For Poland, we have 11.15 years of schooling while Barro-Lee has 9.65 and the OECD has 10.55. 14

Results for our cross-country regressions are qualitatively similar if we use educational attainment from Barro-Lee (2010) rather than the population-weighted average of regional values. 15

The main reason why we have fewer regions with measures of institutions than regions with productivity data is because we imposed a filter of a minimum of 10 establishments answering the particular institutions question. The rest of the discrepancy in the number of regions is because some questions about institutions were not included in the survey for some countries.

Page 24: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

23

access to finance.16 For each of these obstacles to doing business, we keep track of the percentage of

the respondents that rate the item as a moderate, a major, or a very severe obstacle to business. The

final Enterprise Survey variable that we examine is the perception of government predictability

(measured as the percentage of respondents who tend to agree, agree in most cases, or fully agree that

government officials’ interpretations of regulations are consistent and predictable).

To make sure that our results on the importance of institutions are not driven by measurement

error, we also gather objective measures of the quality of institutions from the Sub-national Doing

Business Reports, which are available for 19 countries and 180 regions in our sample. We focus on the

number of procedures and their cost in four areas: starting a new business, enforcing contracts,

registering property, and dealing with licenses. Interestingly, variation in the cost of regulation swamps

the variation in the number of procedures. For example, there is no variation in the number of steps

required to enforce a contract in the 30 Chinese cities tracked by the Sub-National Doing Business

Report. However, the estimated time to enforce a contract ranges from 112 days in the city of Nanjing

(Jiangsu) to 540 days in the city of Changchun (Jilin). As it turns out, results using objective measures of

institutions are qualitatively similar to the results using subjective measures that we have described.

We use two measures of infrastructure. The first is the density of power lines in 1997 from the

US Geological Survey Global GIS database.17 The second measure is the average estimated travel time

between cells in a region and the nearest city of 50,000 people or more in the year 2000 from the Global

Environment Monitoring Unit. Both measures of infrastructure are available for all regions of the world.

Cultural variables are the last set of potential determinants of regional income that we examine.

We gather two proxies for cultural values and attitudes from the World Value Survey for as many as 75

16

From the Enterprise Survey, we also assembled data on the number of days in the past year with telephone outages, the percentage of sales reported to the tax authorities, and the confidence that the judicial system would enforce contracts and property rights in business. Results for these variables are available upon request. 17

Results using other density measures of infrastructure (e.g. air fields, highways, and roads) also available on the US Geological Survey Global GIS database are qualitatively similar.

Page 25: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

24

of our sample countries and 745 of our regions.18,19 The first survey measure is the percentage of

respondents in each region that answer that “most people can be trusted” when asked whether

"Generally speaking, would you say that most people can be trusted, or that you can't be too careful in

dealing with people?" The second measure is a proxy for civic values based on whether each of the

following behaviors "can always be justified, never be justified or something in between.": (a) "claiming

government benefits which you are not entitled to"; (b) "avoiding a fare on public transport"; (c)

"cheating on taxes if you have the chance"; and (d) "someone taking a bribe" (Knack and Keefer, 1997).20

In addition, we gather two measures of fractionalization for up to 1,568 regions and 110 of our sample

countries. The first one is simply the number of ethnic groups that inhabited each region in 1964. The

second one is the probability that a randomly chosen person in a region shares the same mother

language with a randomly chosen people from the rest of the country in 2004.

Finally, in addition to running regressions using regional data, we examine GDP per capita at the

country level, which come from World Development Indicators. All the other country-level variables in

the paper are computed based on our regional data rather than drawn from primary sources.

Specifically, the country-level analogs of our regional measures of education, geography, institutions,

public goods, and culture are the area- and population-weighted averages of the relevant regional

variables, as appropriate.

18

We set to missing World Value Survey data for five countries (France, Japan, Philippines, Russia, and the United States) because they are only available at a very coarse level. 19

The World Value Survey was collected between 1981 and 2005. When data from the World Value Survey for one of the countries in our sample are available for multiple years, we use the most recent one. 20

We also examined proxies for confidence in various institutions (government, parliament, armed forces, education, civil service, police, and justice), for what is important in people’s lives (family, friends, leisure, politics, work, and religion) as well as for characteristics valued in children (determination, faith, hard work, imagination, independence, obedience, responsibility, thrift, and unselfishness). Moreover, we also examined proxies for broad cultural attitudes with regards to authority (percent who think that one must always love and respect one’s parents regardless of their qualities and faults), tolerance for other people (percent who select tolerance and respect for other people as an important quality for children to learn), and family (percent who think that parents have a duty to do their best for their children even at the expense of their own well-being). Finally, we examined the percentage of respondents that participate in professional and civic associations. The results for these variables are qualitatively similar to the results for the WVS variables that we discuss in the text.

Page 26: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

25

Table 1 describes our variables and Table 2 summarizes our data. For each variable we examine

in the regional regressions, it shows the number of regions for which we have the information, the

number of countries these regions are in, the median and the average number of regions per country,

and the median range and standard deviation within a country. The data show substantial income

inequality among regions within a country. On average, the ratio of the income in the richest region to

that in the poorest region is 4.41. This ratio is 3.74 in both Africa and Europe, 4.60 in North America,

5.61 in South America, and 5.63 in Asia. The country with the highest ratio of incomes in the richest to

that in the poorest region is Russia (43.3); the country with the lowest ratio is Pakistan (1.32).

Interestingly, this ratio is 5.16 for the United States, 2.59 for Germany, 1.93 for France, and 2.03 for

Italy. Italy has attracted enormous attention because of differences in income between its North and

its South, usually attributed to culture. As it turns out, Italian regional inequality is not unusual. We also

note that regional inequality of incomes within a country, as measured by the standard deviation of the

logarithm of per capita incomes, declines with income, perhaps because richer countries have more

equalizing policies (Figure 2).

There is likewise substantial inequality in education among regions within a country. On

average, the ratio of educational attainment in the richest region to that in the poorest region is 1.80.

This ratio is 2.71 for Africa, 1.68 for Asia, 1.16 for Europe, 1.33 for North America, and 1.81 for South

America. The highest ratio is in Burkina Faso (14.66), where education is 0.22 in the Sahel region and

3.20 in the Centre region. The lowest ratio is 1.05 in Ireland. One striking fact in this data is the much

great regional equality in the distribution of education in the richer than in the poorer countries. Figure

3 presents the evidence for the relationship between standard deviation of education levels in a country

and its per capita income. Such tendency to equality might follow from the more uniform educational

policies in richer countries, and may account for greater regional income equality in the richer countries.

Page 27: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

26

The patterns of inequality among regions within countries is interesting for some of the other

variables as well. Table 2 shows that differences in endowments, such as temperature and distance to

coast, are small, which suggests that these variables will have difficulty in explaining regional differences

in per capita income. Density of power lines and travel time to the next big city varies a great deal

across regions, suggesting that urban theories of development might be helpful in explaining regional

inequality. There is also considerable variation across regions in the estimates of the quality of

institutions, which suggests that, at least in principle, there is a regional aspect to institutional quality

that could relate to differences in economic development.

IV. Accounting for National and Regional Productivity.

In this section, we present cross-country and cross-region evidence on the determinants of

productivity. We present national regressions only for comparison. These regressions are difficult to

interpret in our model because it is not possible to express national output in closed form. More

importantly, the problem of endogeneity of education is particularly severe in the national context.

With respect to regional income, our benchmark is Equation (20). As already mentioned, we have

measures of average education at the regional level, but we do not have either national or regional data

on physical capital (except for public infrastructure) or other inputs, so these variables only appear in

the firm-level regressions in Section V.

Table 3 presents our basic regional results in perhaps the most transparent way. It reports the

results of univariate regressions of regional GDP per capita on its possible determinants, all with country

fixed effects. Such specifications are loaded in favor of each variable seeming important since it does

not need to compete with any other variable. We report both the within country and between

countries R2 of these regressions. The first row shows that education explains 58% of between country

variation of per capita income, and 38% of within country variation of per capita income. Figure 4

Page 28: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

27

shows, for Brazil, Colombia, India, and Russia, the striking raw correlation between regional schooling

and per capita income. Although several other variables explain a significant share of between country

variation, none comes close to education in explaining within country variation in income per capita.

Starting with geographical variables, temperature and inverse distance to coast – taken

individually – explain 27 and 13 percent of between country income variation, but 1 and 4 percent

respectively of within country variation. Oil explains a trivial amount of variation at either level.

Turning to institutions, some of the variables, such as access to finance or the number of days it takes to

file a tax return, explain a considerable share of cross-country variation, consistent with the empirical

findings at the cross-country level such as King and Levine (1993) or Acemoglu et al. (2001), but none

explains more than 2 percent of within country variation of per capita incomes. Indicators of

infrastructure or other public good provision do slightly better: on their own many explain a large share

of between country variation, while density of power lines and travel time account for up to 7% of

within country variation. These variables are obviously highly endogenous, and still do much worse than

education. Some of the cultural variables account for a substantial share of between country variation,

none account for much of within country variation. Of course, culture might operate at the national

rather than the sub-national level, although we note that much of the research on trust focuses on

regional rather than national differences (e.g., Putnam 1993). After presenting the regression results,

we try to explain why some of these variables do so poorly.

Tables 4 through 6 show the multivariate regression results at the national and regional level.

Table 4 presents the baseline regressions of national (Panel A) and regional (Panel B) per capita income

on geography and education, controlling in some instances for population or employment, as suggested

by our model. At the country level, temperature, inverse distance to coast, and oil endowment are all

highly statistically significant in explaining cross-country variation in incomes, and explain an impressive

Page 29: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

28

50% of the variance. Education is also statistically significant, with a coefficient of .25, raising the R2 to

63%. Note that oil comes in positive and highly statistically significant.

As Panel B shows, the coefficients on geography and education continue to be significant at the

regional level. However, the within country R2 is much higher for education than for the geographic

variables. The coefficient on the regional labour force is now positive and statistically significant, and

ranges from .01 for population to .07 for employment. The coefficient on education is around .27.

Table 5 presents country-level regressions with measures of institutions (Panel A) and of

infrastructure and culture (Panel B) added to the specification in Table 4. Education remains highly

statistically significant in each specification, and its coefficient does not fall much. At the country level,

only the logarithm of tax days is statistically significant. The last two rows of Table 5 show the adjusted

R2 of each regression if we omit the institutional (or infrastructure or cultural) variable, as well as the

adjusted R2 if we omit education. Dropping education sharply reduces explanatory power, while the

only institutional variable that adds explanatory power is the logarithm of tax days.

Table 6 presents the corresponding results at the regional level. The education coefficient is

slightly higher than in Table 4, and is highly significant, as illustrated in Figure 5. Institutional variables

are almost never significant, and their incremental explanatory power is tiny. We find a small adverse

effect of ethnic heterogeneity on income at the regional level, although the incremental explanatory

power of all the institutional and cultural variables is small21.

We have conducted several robustness checks of these findings, and here summarize but do not

present the results. First, we eliminated regions that include national capitals from the regressions; the

results are not materially affected. Second, we included measures of regional population density in the

specifications; density is typically insignificant and other results are not importantly affected. Third, we

21

We have tested the robustness of these results using data on regional luminosity instead of per capita income (see Henderson, Storeygard, and Weil 2009). The results are highly consistent with the evidence we have described, both with respect to the importance of human capital, and the evidence of relative unimportance of other factors, in accounting for cross-regional differences.

Page 30: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

29

have rerun all the regressions breaking down our aggregated educational attainment measure into

measures, for each level of educational attainment between 1 and 13 years, of the presence of

individuals at that level of education in that region. Educational variables continue to be highly

correlated with per capita income, and the coefficients increase monotonically with the level of

attainment. That is, having more people at a higher level of education is associated with higher income.

What are some of the possible explanations of the low explanatory power of institutions,

keeping in mind that endogeneity of institutions should if anything raise the coefficients? It is possible

that we have inappropriate measures of institutions, although the measures we have are commonly

considered to be relevant for economic outcomes. It is also possible that the measures from Enterprise

Surveys are particularly noisy, although one should remember that these are surveys of managers who

should be particularly focused on institutional constraints. In general, such subjective assessments

correlate much better with measures of development than objective measures of institutions (Glaeser

et al. 2004). It is also likely that at least some institutions only matter at the national level, if, for

example, the critical-to-development business activity is concentrated in the capital.

To shed further light on these issues, Table 7 presents at the national level (we have no regional

data) regressions in the same format as Table 5, but using standard measures of institutions, including

autocracy, constraints on the executive, expropriation risk, proportional representation, and corruption.

Except for proportional representation, all of these variables are highly statistically significant in these

specifications. However, with the exception of expropriation risk and corruption, both of which are

highly endogenous, the incremental explanatory power of institutional variables is minimal, and in most

cases much smaller than the incremental explanatory power of education. Perhaps the more important

point is that Enterprise Surveys do not cover rich countries. If we run the regressions in Table 7 for the

72 countries with data on informal payments, we find that proportional representation is insignificant,

autocracy and executive constraints are significant at only 10% level, expropriation risk is significant at

Page 31: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

30

the 5% level, and corruption is significant at the 1% level. Critically, the value of estimated coefficients

falls, rather than standard errors rising. Our bottom line is that the weakness of institutional variables

results in part from different (and possibly but not definitely inferior) data, and in part from the focus on

poorer countries, for which institutional variables indeed matter less.

Due to potential migration of better educated workers to more productive regions, we cannot

interpret the large education coefficients - which appear to come through with a similar magnitude

across a range of specifications – as the causal impact of human capital on regional income. To address

this problem, we next estimate the role of human capital in the production function by looking at firm

level evidence based on Enterprise Surveys. By combining estimation and calibration, we then assess

the extent to which the role of human capital at the firm level can account for its role across regions.

V. Firm-Level Evidence.

In Tables 8-10, we turn to the micro evidence and estimate essentially Equation (21). We use

the Enterprise Survey data described in Section III. In establishment-level regressions, we try to

disentangle managerial/entrepreneurial human capital, worker human capital, and human capital

externalities. To address the concern that the sorting of entrepreneurs by unobservable skills into more

productive regions may contaminate our estimates of entrepreneurial returns to schooling as well as of

human capital externalities, in Table 8 we use an extremely flexible specification that controls for

region-industry interactions by including region X industry fixed effects and no regional education. This

enables us to focus on the effect of managerial/entrepreneurial education on productivity without

worrying about regional and sectoral productivity, which are subsumed in the fixed effects. In Tables 9

and 10 we then turn to an examination of human capital externalities, first with regional education only,

and then adding geographic variables. All standard errors are clustered at the regional level. We have

experimented with some indicators of regional institutions and infrastructure as independent variables,

Page 32: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

31

but consistent with the findings for regional data, they are usually insignificant, and hence we do not

focus on these results.

We use three dependent variables to proxy for productivity. First, we look at the log of the

establishment sales per worker, yi,j/li,j. Second, we look at a rough measure of value added, namely the

logarithm of sales net out raw material inputs, per worker. Third, we run regressions with the log of

average wages paid by the establishment (which in our Cobb-Douglas production function correspond to

a constant fraction of output) as a dependent variable. We measure capital (which includes both land ti,j

and physical capital ki,j) by the log of property, plant and equipment per employee. As an alternative, we

proxy for physical capital by the log of expenditure on energy per employee. We also use the log of the

number of employees, which is a proxy of li,j, to control for the share of the entrepreneur’s labour E

jil , /li,j

and of the workers’ labour W

jil , /li,j. Indeed, assuming that each firm has only one entrepreneur we have

Ejil , /li,j = 1/li,j and

Wjil , /li,j = (li,j-1)/li,j. Unfortunately, the regression coefficient of the log of employees is

not susceptible of a clean interpretation in terms of technological parameters.

Most important, to trace out the effects of human capital, we include the years of schooling of

the manager SE and the years of schooling of workers SW in Table 8, and subsequently the average years

of schooling in the region Si in Tables 9 and 10. As we explained in Section II, the Mincer model of the

relationship between education and human capital implies that schooling should enter the specification

in levels, rather than in logs. Accordingly, the regression coefficients of the schooling variables should

respectively capture parameters (1–α–β–δ) iE , , α iW , and γψ i in Equation (21). To capture scale

effects in regional externalities, in Tables 9 and 10 we control for the log of the region’s population Li.

The regression coefficient on this variable should capture γ in Equation (21). After presenting the basic

estimation results, we compute the values of these coefficients by combining estimation and calibration.

Page 33: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

32

Estimation Results

Begin with the results in Table 8, which includes region-industry fixed effects (for 16 industries).

In the (log) sales per employee specification, the coefficient on energy per employee is .34, while that on

capital per employee is .30. These coefficients, however, are closer to .3 in the value added

specification, and to .2 in the average wage specifications. When both variables are included in the

specification, their sum is higher. Based on these estimates as well as on conventional calibration

parameters (see Caselli 2005), we use .35 as capital share when we calibrate the model and assess its

ability to account for variation in productivity across space.

The coefficient on worker schooling averages to about .03 in the sales per employee

specifications, roughly the same in the value added specifications, but is closer to .015 in the wage

specifications. The coefficient on management schooling is also about .03 in the sales per employee

specification, slightly lower in the valued added specification, but falls to about .02 in the wage

specification. The coefficient on the log employment (firm size) is about .1 across specifications. The

result that larger firms are more productive is at first glance inconsistent with the diminishing returns

specification, but it may reflect omitted organization-specific capital or other sources of higher

productivity in larger firms. Firm size might also capture some of the effects of managerial education,

since in our data larger firms are run by better educated managers (see also earlier results of Bloom and

Van Reenen 2007 and La Porta and Shleifer 2008 on the importance of managerial education).

Controlling for size may also help account for market power.

In Table 9, we include country and industry fixed effects, but add regional years of education to

the regressions. There are some changes in parameter estimates, but the coefficients on worker

education remain around .03, those on manager education likewise remain around .03, and capital

shares stay around .35, like in the specifications of Table 8. In addition, we find consistent evidence of

large effects on productivity from regional factors. The coefficient on regional schooling is amazingly

Page 34: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

33

consistent and statistically significant across specifications, and varies between .05 and .1. The

coefficient on regional population varies across specifications, but we will take it to be around .09 based

on the results for sales per employee and value added per employee.

In our analysis of determinants of regional productivity, geographic variables, but not measures

of culture or institutions, have been consistently statistically significant. Accordingly, in Table 10 we

examine the robustness of the results in Table 9 by controlling for the geographic factors. Such controls

might also go some way toward enabling us to attribute the coefficient on regional human capital to

externalities rather than omitted regional productivity factors. The coefficients on geography variables

are quite unstable in these specifications, with inverse distance to coast exerting a large positive

influence on productivity in four specifications (but not the other eleven), and oil endowment exerting

now a large negative effect in some specifications. The most obvious proxies for omitted regional

productivity thus do not appear to be important. Critically, the coefficients on years of education of

managers and years of education of workers do not fall much relative to the specifications in Table 8,

indicating that returns to education of entrepreneurs remain high even with regional controls. The

coefficient on years of education in the region falls a bit in some specifications relative to its value in

Table 9. We will use the average estimate of .065 in our calculations.

We added additional controls to these regressions, and obtained similar results. This evidence

needs to be explored further, but most of the specifications confirm both the general findings, and

parameter estimates, computed from Tables 8 and 9. There does not appear to be much evidence of

significant omitted regional effects, although since we do not have a complete set of determinants of

regional productivity, our assessment of external effects might be exaggerated.

Combining Estimation with Calibration

Page 35: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

34

Can the effects estimated from firm level regressions account for the large role of schooling in

the regional regressions? How do these effects compare with the work in development accounting?

We address these questions by starting with the roles of managers’ and workers’ schooling. The

coefficient on workers’ average schooling in the firm level regressions is about 0.03, which in our model

implies α W = 0.03. The standard calibration for the U.S. labour share is about α = 0.6. If however we

calibrate α = 0.55 to capture the fact that in developing countries the labour share tends to be lower

than in the U.S. (also because part of labour income goes to self employment, Gollin 2002), we obtain

W = 0.55, which is in the ballpark of micro evidence on worker Mincerian returns (Psacharopoulos

1994). The fact that Mincerian returns to education implied by our empirical evidence are consistent

with existing research suggests that our firm level productivity regressions help reduce identification

problems, at least as far as firm-level variables are concerned.

The regressions also point to an overall capital share (considering energy or equipment) roughly

equal to 0.35. In our model, this captures the income share δ + β going to K and T which leaves – under

constant returns – a share of 0.1 going to entrepreneurial rents. That is, (1–α–β–δ) = (1– 0.55– 0.35) =

0.1. Since the estimated coefficient on managerial education roughly implies (1–α–β–δ) E = 0.03, our

results are consistent with a Mincerian return E = .30 for entrepreneurs. The 30% estimate is higher

than those found by Goldin and Katz (2008) for returns to college education for workers, but lines up

with the few existing analyses of entrepreneurial education, which document substantially larger

returns of education for managers than for workers (Parker and van Praag 2005, van Praag et al. 2009).22

22

Using U.S. and Dutch individual-level data, these studies find that one extra year of schooling increases entrepreneurial income by 18% and 14%, respectively. This is much higher than the 3% found in our firm-level data (in our model entrepreneurial income is a constant share of a firm’s output), implying gigantic Mincerian returns under an entrepreneurial share of α = 0.1. Note, however, that these studies rely on small start-ups (in the Dutch data) or on self employed individuals (in the U.S. data). In both cases, the entrepreneurial share is likely to be higher than 0.1, moving Mincerian returns closer to our benchmark of 30%.

Page 36: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

35

This preliminary assessment suggests that a neglected but critical channel through which

schooling and human capital affect productivity is via entrepreneurial inputs. Individuals selected into

entrepreneurship appear to be vastly more talented than workers, driving up productivity. Of course,

entrepreneurial talent may be more important than entrepreneurial schooling in explaining this finding.

Our analysis cannot adequately address this issue (which would require better data and an endogenous

determination of the connection between schooling and talent). Our analysis is, nevertheless, sufficient

to identify a critical role of management and entrepreneurship in determining productivity.

The large returns to entrepreneurial education, compared to the relatively small returns to

worker education, might explain the problem that the previous literature on development accounting

has experienced with the Mincer regressions (Caselli 2005, Hsieh and Klenow 2010). As we show below,

spatial differences in the stocks of human capital implied only by returns to worker education are

considerably lower than those implied by blended returns of workers and entrepreneurs. Including

entrepreneurial returns in the calculation thus substantially enhances the ability of human capital to

account for differences in productivity across space. Entrepreneurial returns might be ignored in

surveys focusing on wages as returns to education.

We can use the estimates of Tables 9 and 10 to assess the magnitude of human capital

externalities. The coefficient on population in Table 9, roughly equal to 0.09, suggests that γ is also

about 0.09. Interestingly, this assessment is consistent with the regional regressions in Table 4. In fact,

the coefficient on population is positive and roughly equal to 0.01. This implies that γ – β/(1 – δ) = 0.01

in Equation (21). Given that γ = 0.09 and β + δ = 0.35, this condition yields β to be roughly 0.06, which

happens to be in the ballpark of the land share estimated from income accounts (Valentinyi and

Herrendorf 2008). In sum, β = 0.06, δ=0.29 and γ = 0.09 are roughly consistent with both firm level and

regional regressions as well as with standard calibration exercises.

Page 37: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

36

The coefficient on regional schooling in the firm-level regressions of Table 9 is about .065. This

implies (given γ = 0.09) that ψ is about .72. To separate the effect of the population-wide Mincerian

return from the strength of the externalities ψ, we exploit the regional regressions. According to

Equation (20) describing regional output, in these regressions the coefficient on schooling is equal to [1+

γψ –β/(1 – δ)] . In Table 4, this coefficient is about 0.26. Since we have already accepted γ = 0.09 and

β/(1 – δ) = 0.08 as reasonable estimates, we are left with two equations with two unknowns, namely (1+

0.09ψ – 0.08) = 0.26 and ψ = 0.72. These equations imply an average population-wide Mincerian

return of about 0.21 (which is in between our estimates of workers’ and entrepreneurs’ values) and

that the average education externality parameter ψ is about 3.45. That is, a given increase in regional

human capital generates 3.45 times more externalities if it is due to an increase in the average amount

of human capital rather than to a higher number of people with average education. These estimates

point to a large effect of schooling for productivity via social interactions or R&D spillovers, consistent

with Lucas (1985, 2008) as well as with the literature in urban economics cited in the introduction.

Finally, note that at the above parameter values and at a reasonable share of housing consumption of θ=

0.4, the spatial equilibrium is stable, since (β – ψγ)(1 – θ) + θ(1 – δ) = – (0.25)(0.6) + (0.4)(0.74)>0.

We can now use our results to assess the magnitude of the effect of schooling. Begin with the

role of workers’ and entrepreneurs’ inputs. In regional regressions, the population-wide Mincerian

return of 0.21 is needed to make sense of the data, while the firm level regressions suggest that the

Mincerian return is 6% for workers and 30% for entrepreneurs. Although we lack direct data on the

number of entrepreneurs in the economy, we can make a back-of-the-envelope calculation to assess

whether our firm level evidence is consistent with a 21% Mincerian return population-wide. If: (1) an

average entrepreneur is as educated as the entrepreneurs in the enterprise survey on average, i.e. has

14 years of schooling; and (2) an average worker in the economy is as educated as the average person in

Page 38: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

37

the sample, i.e. has roughly 7 years of schooling, then to obtain an average population-wise Mincerian

return of 21% entrepreneurs need to account for only 4.8% of the workforce.23 Our estimates thus

suggest that private returns to schooling may be much larger than what previously thought due to the

neglected role of entrepreneurial inputs.

Consider next the role of externalities. Using our estimated parameters, raising the educational

level from the sample mean of 6.58 years by one year can be calculated to increase regional TFP by

about 6.7%. Rauch (1993) estimates a comparable magnitude of 3-5%. Acemoglu and Angrist (2000)

estimate that a one year increase in average schooling is associated with a 7% increase in average

wages. Moretti (2004) examines the impact of spillovers associated with the share of college graduates

living in a city. If we run the regressions in Table 9 using the fraction of the population with college

degrees instead of our measure of years of schooling, our estimates imply that a one percentage point

increase in the share of region’s population with a college degree increases output per capita by 7.9%.

Iranzo and Peri (2009) estimate that one extra year of college per worker increase the state’s TFP by a

very significant 6-9%, whereas the effect of an extra year of high school is closer to 0-1%. The

agreement among the various estimates is quite striking. Even if we use the coefficients obtained in

Table 10 when controlling for factors potentially affecting regional productivity, the change is very

modest, increasing the required population-wide Mincerian return to .23 and reducing the externality

parameter ψ to 2.4. Notwithstanding the difficulties involved in the identification of externalities, their

quantitative role seems to be quite robust.

In sum, the firm level and regional productivity differences are mutually consistent because: i)

entrepreneurial education is much more important than workers’ education in accounting for

productivity across firms and regions, and ii) human capital externalities are also sizeable. The standard

23

The population-wise average Mincerian return is computed as the return that solves the equation exp( [

7)1(14 ff ]) = exp(0.3 14) (1 )exp(0.06 7)f f where f is the fraction of entrepreneurs.

Page 39: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

38

development accounting literature has neglected both channels. Since our estimation of

entrepreneurial returns is less subject to endogeneity problems than the estimation of externalities, we

now assess the relative importance of these two channels in explaining cross-country differences in

income per capita by using as a reference some well known results of macro development accounting

exercises. To do so, define a factor-based model of national income as KHLhEY )( 1^

, which

is national income level predicted by our model when: i) all regions in a country are identical and all

countries are equally productive, and ii) where – in line with standard development accounting - we

consider only physical and human capital, thereby attributing land rents to physical capital (deducting

these rents would not change much). This simplified model with no regional mobility provides a

benchmark to assess the role of physical and human capital when productivity differences are absent.

Following Caselli (2005), one measure of the success of the model in explaining cross-country

income differences is

))var(log())var(log(

^

YYsuccess

,

where Y is observed GDP. Using Caselli’s dataset, the observed variance of (log) GDP per worker is 1.32.

Ignoring human capital externalities (i.e., assuming ψ=γ=0) and using the standard 8% average Mincerian

return on human capital for both workers and entrepreneurs (i.e., setting =8%), the variance of log( ̂)

equals 0.76, i.e. physical and human capital explain 57% (0.76/1.32) of the observed variation in income

per worker. This calculation reproduces the standard finding that, under standard Mincerian returns, a

big chunk of the cross country income variation is accounted for by the productivity residual.

To isolate the role of entrepreneurial capital, we compute ̂ by assuming no human capital

externalities (i.e., ψ=γ=0) while still keeping a population-wide Mincerian return of 21%, which is

consistent with our firm-level estimates. It is not surprising that average Mincerian returns of about

Page 40: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

39

20% greatly improve the explanatory power of human capital. Indeed, under this assumption success

rises to 83%. This improvement is solely due to accounting for managerial schooling. We note that this

result is quite sensitive to our assumption of labour share of 55%. If labour share were lower, the

residual income share that we allocate to entrepreneurial rents would be correspondingly higher, which

would then reduce our estimate of the returns to entrepreneurial education, and therefore of average

Mincerian returns. This reservation should be kept in mind in interpreting our results.

Finally, to assess the incremental explanatory power of human capital externalities, we

compute ̂ assuming our estimated values (i.e., ψ=3.45 and γ=0.09), while retaining the assumption that

the average Mincerian return equals 21%. Under these new assumptions, success rises to 99%. Of

course, these results need to be interpreted cautiously since there is considerable uncertainty regarding

the true values of the underlying coefficients. In particular, a sharply lower value of returns to

entrepreneurial schooling would reduce this measure of success (but would also be difficult to reconcile

with our estimates). Nevertheless, these calculations illustrate that entrepreneurial inputs play a much

more important role than externalities in increasing the explanatory power of human capital.

The comparison between Mozambique and the US illustrates the importance of entrepreneurial

inputs to understand cross-country income differences. Income per worker is roughly 33 times higher in

the US than in Mozambique ($57,259 vs. $1,752), while the stock of physical capital is 185 times higher

in the US than in Mozambique ($125,227 vs. $676). The average number of years of schooling for the

population 15 years and older is 1.01years Mozambique and 12.69 years in the United States. These

large differences in schooling imply that the (per capita) stock of human capital is 11.6 higher

(HUS/HMOZ=e.21*(12.69-1.01)) in the US than in Mozambique if the average Mincerian return is 21%. In

contrast, the (per capita) stock of human capital is only 2.5 times higher (HUS/HDRC=e.08*(12.69-1.01)) in the US

than in Mozambique if the average Mincerian return is 8%. Using weights of 1/3 and 2/3 for physical

and human capital, these differences in physical and human capital imply that income per worker should

Page 41: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

40

be 29 times higher in the US than in Mozambique (29=11.62/3x1851/3), which is much closer to the actual

value of 33 times than the 10.6 multiple implied by 8% Mincerian return (10.6=2.52/3x1851/3).

In sum, our firm level and regional regressions suggest that: i) in line with the development

accounting literature, workers’ human capital is an important but not a large contributor to productivity

differences, ii) entrepreneurial inputs area fundamental and relatively neglected channel for

understanding the role of schooling in shaping productivity differences, and iii) human capital

externalities might also be an important determinant of regional productivity differences. Our

parameter estimates point to very large returns to entrepreneurial schooling (perhaps due to

entrepreneurs’ general talent) and to large social returns at the regional level arising from education.

VI. Conclusion.

We have presented evidence from more than 1500 sub-national regions of the world on the

determinants of regional income and labor productivity. The evidence suggests that regional education

is the critical determinant of regional development, and the only such determinant that explains a

substantial share of regional variation. Using data on several thousand firms located in these regions,

we have also found that regional education influences regional development through education of

workers, education of entrepreneurs, and regional externalities. The latter come primarily from the

level of education (the quality of human capital) in a region, and not from its total quantity (the number

of people with some education).

A simple Cobb-Douglas production function specification used in development accounting would

have difficulty accounting for all this evidence. Instead, we presented what we called a Lucas-Lucas

model of an economy, which combines the allocation of talent between work and entrepreneurship,

human capital externalities, and migration of labour across regions within a country. Although many

issues remain to be resolved, the empirical findings we presented are both consistent with the general

Page 42: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

41

predictions of this model, and provide plausible values for the model’s parameters. In addition, we

follow Caselli (2005) in assessing the ability of the model to account for variation of output per worker

across countries. When we use our Lucas-Lucas model, we can roughly double the ability of the model

to account for cross-country income differences relative to the traditional specification. Our

parameterization can explain 99% of income differences across countries.

The central message of the estimation/calibration exercise is that, while private returns to

worker education are modest and close to previous estimates, but private returns to entrepreneurial

education (in the form of profits) and possibly also social returns to education through external

spillovers, are large. This evidence suggests that earlier estimates of return to education have perhaps

underestimated one of its important benefits – the externalities, and largely missed the other –

entrepreneurship. This final observation has significant implications for economic development.

Our data points most directly to the role of the supply of educated entrepreneurs for the

creation and productivity of firms. From the point of view of development accounting, having such

entrepreneurs seems more important than having educated workers. Consistent with earlier

observations of Banerjee and Duflo (2005) and La Porta and Shleifer (2008), economic development

occurs in educated regions that concentrate entrepreneurs, who run productive firms. These

entrepreneurs, as well, appear to contribute to the exchange of ideas, leading so significant regional

externalities. The observed large benefits of education through the creation of a supply of

entrepreneurs and through externalities offer an optimistic assessment of the possibilities of economic

development through raising educational attainment.

Page 43: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

42

Bibliography

Acemoglu, Daron, and Joshua Angrist (2000). “How Large are Human-Capital Externalities? Evidence

from Compulsory Schooling Laws.” NBER Macroeconomics Annual, 15:9-59.

Acemoglu, Daron, Simon Johnson, and James Robinson (2001). “The Colonial Origins of Comparative

Development: An Empirical Investigation.” American Economic Review, 91(5): 1369-1401.

Acemoglu, Daron, and Melissa Dell (2010). “Productivity Differences Between and Within Countries.”

American Economic Journal: Macroeconomics, 2(1):169-188.

Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat and Romain Wacziarg (2003).

“Fractionalization.” Journal of Economic Growth, 8:155-94.

Alvarez, Michael, Jose Cheibub, Fernando Limongi, and Adam Przeworski (2000). Democracy and

Development: Political Institutions and Material Well-Being in the World 1950-1990. Cambridge,

UK: Cambridge University Press.

Banerjee, Abhijit, and Esther Duflo (2005). “Growth Theory through the Lens of Development

Economics.” In Philippe Aghion and Steven Durlauf, eds. Handbook of Economic Growth v. 1a,

Amsterdam: Elsevier.

Banerjee, Abhijit, and LakshmiIyer (2005). “History, Institutions, and Economic Performance: The Legacy

of Colonial Land Tenure System in India.” American Economic Review 95: 1190-1213.

Barro, Robert J (1991). "Economic Growth in a Cross-Section of Countries." Quarterly Journal of

Economics 106(2): 407-443.

Baumol, William (1990), “Entrepreneurship: Productive, Unproductive, and Destructive.” Journal of

Political Economy 98 (5): 893-921.

Beck, Thorsten, Gerge Clarke, Alberto Groff, Philip Keefer, and Patrick Walsh (2001). “New Tools in

Comparative Political Economy: The Database of Political Institutions.” World Bank Economic

Review 15 (1): 165-176.

Page 44: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

43

Benhabib, Jess, and Spiegel, Mark (1994) "The Role of Human Capital in Economic Development."

Journal of Monetary Economics 34(2): 143-174.

Bils, Mark, and Peter Klenow (2000). “Does Schooling Cause Growth?” American Economics Review

90(5):1160-1183.

Bloom, David and Jeffrey Sachs (1998). “Geography, Demography, and Economic Growth in Africa.”

Brookings Papers on Economic Activity, 29(2): 207-296.

Bloom, Nicholas, and John Van Reenen (2007). “Measuring and Explaining Management Practices across

Firms and Countries.” Quarterly Journal of Economics 122(4): 1351-1408.

Bloom, Nicholas, and John Van Reenen (2010). “Why Do Management Practices Differ across Firms and

Countries?” Journal of Economic Perspectives, 24(1): 203-224.

Card, David (1999). “The Causal Effect of Education on Earnings.” In Orley Ashenfelter and David Card,

eds., Handbook of Labor Economics, vol 3A, Amsterdam: North Holland.

Caselli, Francesco (2005). “Accounting for Cross-Country Income Differences.” In Philippe Aghion and

Steven Durlauf (eds.), Handbook of Economic Growth, vol.1, ch. 9: 679-741 Amsterdam: Elsevier.

Caselli, Francesco, and Coleman, John Wilbur II (2005). “The World Technology Frontier.” American

Economic Review, 96(3):499-522.

Caselli, Francesco and James Feyrer (2007). “The Marginal Product of Capital.” Quarterly Journal of

Economics 122 (2): 535-568.

Caselli, Francesco, and Nicola Gennaioli (2009). “Dynastic Management.” Mimeo: LSE and CREI.

Ciccone, Antonio, and Robert Hall (1996). “Productivity and the Density of Economic Activity.” American

Economic Review, 86(1):54-70.

Ciccone, Antonio, and Giovanni Peri (2006). “Identifying Human-Capital Externalities: Theory with

Applications.” Review of Economic Studies, 73(2):381-412.

Page 45: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

44

Ciccone, Antonio, and Elias Papaioannou (2009). “Human Capital, the Structure of Production, and

Growth.” Review of Economics and Statistics, 91(1): 66-82.

Coe, David, and Elhanan Helpman (1995). “International R&D Spillovers.” European Economic Review

39(5): 859-887.

Cohen, Daniel, and Marcelo Soto (2007). “Growth and Human Capital: Good Data, Good Results.”

Journal of Economics Growth, 12 (1):51-76.

de La Fuente, Angel and Rafael Domenech, (2006). Human Capital in Growth Regressions: How Much

Difference Does Data Quality Make?” Journal of the European Economics Association, 4(1):1-36.

Dell, Melissa (2010). “The Persistent Effects of Peru’s Mining Mita.” Econometrica, forthcoming.

Dell, Melissa, Benjamin Jones, and Benjamin Olken (2009). “Temperature and Income: Reconciling New

Cross-Sectional and Panel Estimates.” NBER Working Paper No. 14680.

De Long, Bradford, and Andrei Shleifer (1993). "Princes or Merchants? City Growth before the Industrial

Revolution" Journal of Law and Economics, 36:671-702.

Easterly, William, and Levine, Ross (1997). “Africa's Growth Tragedy: Policies and Ethnic Divisions.”

Quarterly Journal of Economics 112(4):1203-50.

Glaeser, Edward, Hedi Kallal, Jose Scheinkman, and Andrei Shleifer (1992). “Growth in Cities.” Journal of

Political Economy 100(6):1126-1152.

Glaeser, Edward, and Joshua Gottlieb (2009). “The Wealth of Cities: Agglomeration Economies and

Spatial Equilibrium in the United States.”Journal of Economic Literature 47(4):983-1028.

Glaeser, Edward L., Jose Scheinkman, and Andrei Shleifer (1995). “Economic Growth in a Cross-Section

of Cities.” Journal of Monetary Economics 36:117-143.

Glaeser, Edward, and David Mare (2001). “Cities and Skills.” Journal of Labor Economics 19(2):316-342.

Glaeser, Edward, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer (2004). “Do Institutions

Cause Growth?” Journal of Economic Growth, 9 (2):271-303.

Page 46: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

45

Goldin, Claudia, and Lawrence Katz (2008). The Race Between Education and Technology. Cambridge,

MA: Harvard University Press.

Gollin, Douglas (2002). “Getting Income Shares Right.” Journal of Political Economy 110(2): 458-474.

Hall, Robert, and Charles Jones (1999). “Why Do Some Countries Produce So Much More Output per

Worker than Others?” Quarterly Journal of Economics 114(1): 83-116.

Henderson, Vernon, Adam Storeygard, and David Weil (2009). “Measuring Economic Growth From

Outer Space.” American Economic Review, forthcoming.

Hsieh, Chang-Tai, and Peter Klenow (2009). “Misallocation and Manufacturing TFP in China and India.”

Quarterly Journal of Economics 124(4): 1403-1448.

Hsieh, Chang-Tai, and Peter Klenow (2010). “Development Accounting.” American Economic Journal:

Macroeconomics 2(1): 207–223.

Iranzo, Susana, and Giovanni Peri (2009). “Schooling Externalities, Technology, and Productivity: Theory

and Evidence from U.S. States.”Review of Economics and Statistics 91(2):420-431.

Jaggers, Keith, and Monty Marshall (2000). Polity IV Project. University of Maryland.

Jacobs, Jane (1969). The Economy of Cities. New York: Vintage.

King, Robert, and Levine, Ross, 1993. “Finance and Growth : Schumpeter Might be Right.” Quarterly

Journal of Economics 108(3): 717-737.

Klenow, Peter, and Andres Rodriguez-Clare (2005). “Externalities and Growth.” In Philippe Aghion and

Steven Durlauf (eds.), The Handbook of Economic Growth, Volume 1A, pp. 818-861, Elsevier.

Knack, Stephen, and Philip Keefer (1997). “Does Social Capital Have An Economic Payoff? A Cross-

Country Investigation.” Quarterly Journal of Economics 112(4):1251-1288.

Krueger, Alan, and Mikael Lindahl (2001). “Education for Growth: Why and For Whom?” Journal of

Economic Literature 39(4):1101-1136.

Page 47: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

46

La Porta, Rafael, Florencio Lopez-de-Silanes, and Andrei Shleifer. 2008. "The Economic Consequences of

Legal Origins." Journal of Economic Literature, 46(2): 285–332.

La Porta, Rafael and Andrei Shleifer (2008). “The Unofficial Economy and Economic Development.”

Brookings Papers on Economic Activity, 2008: 275-352.

Lucas, Robert (1978). “On the Size Distribution of Business Firms.” Bell Journal of Economics 9(2): 508–

23.

Lucas, Robert (1988). “On the Mechanics of Economic Development.” Journal of Monetary Economics,

22(1):3-42.

Lucas, Robert (2009). “Ideas and Growth.” Economica, 76(301): 1–19.

Mankiw, Gregory, David Romer, and David Weil (1992). “A Contribution to the Empirics of Economic

Growth.”Quarterly Journal of Economics, 107(2): 407-437.

Moretti, Enrico (2004). “Estimating the Social Return to Higher Education: Evidence from Longitudinal

and Repeated Cross-Sectional Data.” Journal of Econometrics, 121: 175-212.

Moretti, Enrico (2004). “Workers’ Education, Spillovers, and Productivity: Evidence from Plant-Level

Production Functions.” American Economic Review, 94(3):656-690.

Murphy Kevin, Andrei Shleifer, and Robert Vishny (1991). “The Allocation of Talent: Implications for

Growth.” Quarterly Journal of Economics, 106(2):503-530.

Nelson, Richard, and Edmund Phelps (1966). “Investment in Humans, Technological Diffusion and

Economic Growth.” American Economic Review, 56(2):69-75.

Parker, Simon and Mirjam van Praag, (2005). “Schooling, Capital Constraint and Entrepreneurial

Performance.” Tinbergen Institute Discussion Paper.

Psacharopoulos, George (1994). “Returns to Investment in Education: A Global Update.” World

Development 22(9): 1325-1343.

Page 48: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

47

Putnam, Robert (1993). Making Democracy Work: Civic Traditions in Modern Italy. Princeton, NJ:

Princeton University Press.

Rauch, James (1993).“Productivity Gains from Geographic Concentration of Human Capital: Evidence

from the Cities.” Journal of Urban Economics, 34:380-400.

Roback, Jennifer (1982). “Wages, Rents, and the Quality of Life.” Journal of Political Economy 90(6):

1257-1278.

Romer, Paul (1990). “Endogenous Technical Change.” Journal of Political Economy, 98(5): S71-S102.

Syverson, Chad (2011). “What Determines Productivity?” Journal of Economic Literature, forthcoming.

Valentinyi, Akos and Berthold Herrendorf (2008). “Measuring Factor Income Shares at the Sector Level.”

Review of Economic Dynamics 11: 820-838.

van Praag, Mirjam, Arjen van Witteloostuijn, and Justin van der Sluis (2009). “Returns for Entrepreneurs

vs. Employees: The Effect of Education and Personal Control on the Relative Performance of

Entrepreneurs vs. Wage Employees.” IZA Discussion Paper no. 4628.

Woff, Edward (2011). “Spillovers, Linkages, and Productivity Growth in the US Economy, 1958-2007.”

Working Paper No. 16864. Cambridge, MA: National Bureau of Economic Research.

Page 49: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

48

Appendix 1.

Proof of Proposition 1 Due to A.1, labour moves from unproductive to productive regions. Formally,

Equation (11) implies that an agent with human capital hj migrates if BjBGjG HhwHhw //)( 11 ,

where φ captures migration costs. This identifies a human capital threshold hm such that agent j migrates

if and only if hj ≥ hm. By exploiting the wage equation in (6) and the equilibrium condition (9), threshold

hm can be implicitly expressed as:

B

G

G

Bm H

Hwwh

1

1 . (Ap.1)

To pin down the equilibrium, note that the aggregate resource constraint is given by:

p∙HG + (1 – p)∙HB = H. (Ap.2)

After accounting for externalities, the equilibrium condition (Ap.1) can be written as:

1)1()1)((

11)1(

11

1B

G

B

G

G

Bm H

HLL

AAh . (Ap.3)

The previous migration-threshold implies that the human capital stock in each productive region is:

11 1)(1

B

m

BB

mBGh BGG h

hp

pHHdhhhhp

pHH

. (Ap.4)

Using Equation (Ap.4) and (Ap.3), it is immediate to express hm as a function of HG and thus recover:

1

1

11

111

B

B

B

B

pp

HHH

pp

HHH

pp

LL

B

GG

B

GG

B

G

. (Ap.5)

Under full mobility (φ = 0), using (Ap.3) one finds that the equilibrium is determined by the condition:

Page 50: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

49

1)1()1)((

11

)1(

1

1

11

)1(

111

11

G

G

B

GG

B

GG

B

G

pHHHp

pp

HHH

pp

pp

HHH

AA

B

B

B

B

. (Ap.6)

The left hand side is decreasing in HG. If (β - ψγ)(1- θ) + θ(1- δ)> 0, the right hand side - which captures

the cost of migrating to productive regions, increases in HG. As a result, when (β - ψγ)(1- θ) + θ(1- δ)> 0

even under full mobility in the stable equilibrium there is no universal migration to productive regions.

Indeed, if all human capital moves to productive regions, then HG = H/p and the right hand side of

(Ap.10) becomes infinite. Full migration is not an equilibrium. No migration is not an equilibrium either,

as in this case A.1 implies that (Ap.10) cannot hold. When ψ = 1 (and φ = 0) the equilibrium has:

.)1()1)((

1

)1()1)((1

H

AE

AH ii

(Ap.7)

With imperfect mobility φ = 0, the equilibrium fulfils the condition:

.)1(

11

111

11

1)1()1)((

11

)1(

1

1

11

11

11

G

G

B

GG

B

GG

G

B

B

GG

pHHHp

pp

HHH

pp

HHH

pp

AA

HHH

pp

hB

B

B

B

When (β - ψγ)(1- θ) + θ(1- δ)> 0, an increase in HG (holding H constant) shifts down the left hand side

and shifts up the right hand side above. As a result, the equilibrium is unique.

Page 51: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Table 1 – Definitions and sources for the variables used in the paper 

This table provides the names, definitions and sources of all the variables used in the tables of the paper. 

Variable  Description Sources and linksI. GDP per capita, population, employment and human capital

 

Ln(GDP per capita)  The logarithm of Gross Domestic Product per capita in PPP constant 2005 international dollars in the region in 2005.  Data on regional GDP is available for all countries except 20. For those 20 countries, we  approximate  GDP  using  data  on  income  (6  countries),  expenditure  (8  countries),  wages  (3 countries),  gross  value  added  (2  countries),  and  consumption,  investment  and  government expenditure  (1  country).    For  each  country, we  scale  regional GDP per  capita  values  so  that  their population‐weighted sum equals  the World Development  Indicators  (WDI) value of Gross Domestic Product in PPP constant 2005 international dollars. Similarly, for each country, we adjust the regional population values so that their sum equals the country‐level analog  in WDI.   For years with missing regional GDP per capita data, we interpolate using all available data for the period 1990‐2008.  When interpolating GDP  values  is not possible, we use  the  regional distribution of  the  closest  year with regional GDP data. Population data  for years without census data  is  interpolated and extrapolated from  the  available  census  data  for  the  period  1990‐2008.  At  the  country  level, we  calculate  this variable as the population‐weighted average of regional GDP. 

Regional GDP:  See online appendix "Appendix GDP Sources". Regional population:  Thomas Brinkhoff: City Population, http://www.citypopulation.de/ Country‐level GDP per capita and PPP exchange rates: World Bank, (2010). Data retrieved on March 2, 2010, from World Development Indicators Online (WDI) database,  http://go.worldbank.org/6HAYAHG8H0 

     

Ln(Population)  The logarithm of the number of inhabitants in the region in 2005. Population data for years without census data  is  interpolated  and  extrapolated  from  the  available  census data  for  the period  1990‐2008. For each country, we adjust the regional populations so that the sum of regional populations equals the country‐level analog in the World Development Indicators (WDI).  At the country level, we calculate this variable following the same methodology but using country boundaries. 

 Regional population: Thomas Brinkhoff: City Population, http://www.citypopulation.de/ Regional spherical: Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5. 

Ln(Employment)  The number of manufacturing and  service employees working  in  the establishments  in  the  region. The data is for the year 2005 or the closest available. At the country level, we calculate this variable as  the product of  the  total population and  the employment  ratio  for  the population 15  years and older.    

See online appendix "Appendix on Economic Census Sources".   Development Indicators Online (WDI) database,  http://go.worldbank.org/6HAYAHG8H0 

     

Years of education  The average  years of  schooling  from primary  school onwards  for  the population aged 15  years or older. Data  for China and Georgia  is  for  the population 6 years and older. We use  the most  recent information  available  for  the  period  1990‐2006.  To  make  levels  of  educational  attainment comparable  across  countries,  we  translate  educational  statistics  into  the  International  Standard Classification of Education (ISCED) standard and use UNESCO data on the duration of school levels in each country for the year for which we have educational attainment data.  Eurostat aggregates data for  ISCED  levels  0‐2  and we  assign  such  observations  an  ISCED  level  1.    Following  Barro  and  Lee (1993):  (1) we assign zero years of schooling  to  ISCED  level 0  (i.e., pre‐primary);  (2) we assign zero years of additional  schooling  to  (a)  ISCED  level 4  (i.e., vocational), and  (b)  ISCED  level 6  (i.e. post‐graduate); and  (3) we assign 4  years of additional  schooling  to  ISCED  level 5  (i.e.  graduate).  Since regional data  is not available  for all countries, unlike Barro and Lee  (1993), we assign zero years of additional schooling: (a) to all incomplete levels; and (b) to ISCED level 2 (i.e. lower secondary).  Thus, the average years of schooling  in a region  is calculated as: (1) the product of the fraction of people whose highest attainment level is ISCED 1 or 2 and the duration of ISCED 1; plus (2) the product of the fraction of people whose highest attainment  level  is  ISCED 3 or 4 and  the  cumulative duration of ISCED 3; plus (3) the product of the fraction of people whose highest attainment level is ISCED 5 or 6 and the sum of the cumulative duration of ISCED 3 plus 4 years. At the country level, we calculate this variable as the population‐weighted average of the regional values. 

See online appendix "Appendix on Education Sources".    Links to online data:   http://epdc.org/ http://epp.eurostat.ec.europa.eu/portal/page/portal/region_cities/introduction https://international.ipums.org/international/index.html http://stats.uis.unesco.org/unesco/TableViewer/document.aspx?ReportId=143&IF_Language=eng. 

II. Climate, geography and natural resources 

Temperature  Average  temperature during  the period 1950‐2000  in degrees Celsius. To produce  the  regional and national numbers, we create equal area projections using the Collins‐Bartholomew World Digital Map and the temperature raster in ArcGIS.  For each region, we sum the temperatures of all cells in that region  and  divide  by  the  number  of  cells  in  that  region.    At  the  country  level, we  calculate  this variable following the same methodology but using country boundaries. 

Climate: Hijmans, R. et al. (2005) , http://www.worldclim.org/ Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5  

Page 52: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Variable  Description Sources and linksInverse distance to coast 

The ratio of one over one plus the region’s average distance to the nearest coastline in thousands of kilometers.  To calculate each region’s average distance to the nearest coastline we create an equal distance projection of the Collins‐Bartholomew World Digital Map and a map of the coastlines.  Using these two maps we create a raster with the distance to the nearest coastline of each cell in a given region. Finally, to get the average distance to the nearest coastline, we sum up the distance to the nearest  coastline of all  cells within each  region and divide  that  sum by  the number of  cells  in  the region. At  the  country  level, we  calculate  this  variable  following  the  same methodology but using country boundaries. 

Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5   

     

Ln(Oil)  Logarithm of one plus the estimated per capita volume of cumulative oil production and reserves by region,  in millions of barrels of oil. To produce  the  regional measure, we  load  the oil map of  the World Petroleum Assessment and the Collins‐Bartholomew World Digital map onto ArcGIS. On‐shore estimated  oil  in  each  assessment  unit  was  allocated  to  the  regions  based  on  the  fraction  of assessment  unit  area  covered  by  each  region.   Off‐shore  assessment  units  are  not  included.  The World Petroleum Assessment map includes all oil fields in the world except those in the United States of  America.  Data  for  the  United  States  is  calculated  using  the  national‐level  information  on cumulative production and estimated reserves, available from the World Petroleum Assessment 2000 (USGS), and the United States' regional production and estimated reserves for the year 2000 from the U.S. Energy Information Administration (USEIA).  The national level data for this variable is calculated following the same methodology outlined but using the data on national boundaries.   The national level numbers for the U.S. are those available from the World Petroleum Assessment. 

http://energy.cr.usgs.gov/oilgas/wep/products/dds60/export.htm. http://tonto.eia.doe.gov/dnav/pet/pet_crd_crpdn_adc_mbbl_a.htm.   http://www.bartholomewmaps.com/data.asp?pid=5   

III. Institutions 

Informal payments  The average percentage of sales spent on  informal payments made to public officials to “get things done”  with  regard  to  customs,  taxes,  licenses,  regulations,  services,  etc,  as  reported  by  the respondents in the region.   The country‐level analog of this variable is the arithmetic average of the regions in the country.  Data is from the most recent year available, ranging from 2002 through 2009.   

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Ln(Tax days)  The logarithm of one plus the average number of days spent in mandatory meetings and inspections with tax authority officials  in the past year as reported by respondents  in the region.   The country‐level analog of this variable  is the arithmetic average of the regions  in the country.   Data  is for the most recent year available, ranging from 2002 through 2009.     

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Ln(Days without electricity) 

The logarithm of one plus the average number of days without electricity in the past year as reported by the respondents in the region.  The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent year available, ranging from 2002 through 2009.   

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Security costs  The  average  costs  of  security  (i.e.,  equipment,  personnel,  or  professional  security  services)  as  a percentage of sales as reported by  the respondents  in the  region.   The country‐level analog of this variable  is  the  arithmetic  average of  the  regions  in  the  country. Data  is  for  the most  recent  year available, ranging from 2002 through 2009.   

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Access to land  The percentage of respondents in the region who think that access to land is a moderate, major, or very severe obstacle to business.  The country‐level analog of this variable is the arithmetic average of the regions in the country.  Data is for the most recent year available, ranging from 2002 through 2009.   

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Access to finance  The percentage of respondents in the region who think that access to financing is a moderate, major, or  very  severe  obstacle  to  business.    The  country‐level  analog  of  this  variable  is  the  arithmetic average of the regions in each respective country.  Data is for the most recent year available, ranging from 2002 through 2009.   

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Government predictability 

The percentage of respondents  in the region who tend to agree, agree  in most cases, or fully agree that  their  government  officials’  interpretation  of  regulations  are  consistent  and  predictable.    The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent year available, ranging from 2002 through 2009.  

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Doing Business Percentile Rank 

The average of  the percentile  ranks  in each of  the following  five areas:  (1)  starting a business;  (2) dealing with  construction permits;  (3)  registering property;  (4)  enforcing  contracts;  and  (5) paying taxes.    Higher  values  indicate  more  burdensome  regulation.    Data  is  for  the  most  recent  year available, ranging from 2007 through 2010.    

Word Bank’s Doing Business Subnational Reports.  http://doingbusiness.org/Reports/Subnational‐Reports/ 

Autocracy  This  variable  classifies  regimes  based  on  their  degree  of  autocracy. Democracies  are  coded  as  0, bureaucracies (dictatorships with a legislature) are coded as 1 and autocracies (dictatorship without a legislature) are coded as 2. Transition years are coded as the regime that emerges afterwards. This variable  ranges  from  zero  to  two  where  higher  values  equal  a  higher  degree  of  autocracy.  This 

Alvarez et al. (2000).

Page 53: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Variable  Description Sources and linksvariable is measured as the average from 1960 through 1990.  

     

Executive Constraints  A measure  of  the  extent  of  institutionalized  constraints  on  the  decision making  powers  of  chief executives. The variable  takes  seven different values:  (1) Unlimited authority  (there are no  regular limitations  on  the  executive's  actions,  as  distinct  from  irregular  limitations  such  as  the  threat  or actuality of coups and assassinations); (2) Intermediate category; (3) Slight to moderate limitation on executive authority  (there are  some  real but  limited  restraints on  the executive);  (4)  Intermediate category;  (5)  Substantial  limitations  on  executive  authority  (the  executive  has  more  effective authority  than  any  accountability  group  but  is  subject  to  substantial  constraints  by  them);  (6) Intermediate  category;  (7)  Executive  parity  or  subordination  (accountability  groups  have  effective authority equal to or greater than the executive in most areas of activity). This variable ranges from one to seven where higher values equal a greater extent of institutionalized constraints on the power of chief executives. This variable is calculated as the average from 1960 through 2000.   

Jaggers and Marshall (2000).

     

Expropriation Risk  Risk of “outright confiscation and forced nationalization" of property. This variable ranges from zero to ten where higher values are equals a lower probability of expropriation. This variable is calculated as the average from 1982 through 1997.  

International Country Risk Guide at http://www.countrydata.com/datasets/. 

     

Proportional Representation 

This  variable  is  equal  to one  for  each  year  in which  candidates were  elected using a proportional representation  system;  equals  zero  otherwise.  Proportional  representation means  that  candidates are elected based on the percentage of votes received by their party. This variable is measured as the average from 1975 through 2000. 

Beck et al. (2001).

     

Corruption  The  average  score  of  the  Transparency  International  index  of  corruption  perception  in  2005.  The index provides a measure of  the extent  to which corruption  is perceived  to exist  in  the public and political sectors. The  index focuses on corruption  in the public sector and defines corruption as the abuse of public office for private gain. It is based on assessments by experts and opinion surveys. The index ranges between 0 (highly corrupt) and 10 (highly clean).  

www.transparency.org

     

IV. Infrastructure 

Ln(Power line density)  The  logarithm of one plus  the  length  in kilometers of power  lines per 10km2 in  the year 1997.   To produce the regional numbers, we  load the power  line map from the US Geological Survey and the Collins‐Bartholomew World Digital Map onto ArcGIS. We take the ratio of total  length of the power lines in the region to the spherical area of that region.  At the country level, we calculate this variable following the same methodology but using country boundaries. 

US Geological Survey Global GIS database, accessed through Harvard University's Geospatial Library.  Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5   

Ln(Travel time)  The  logarithm  of  the  average  estimated  travel  time  in minutes  from  each  cell  in  a  region  to  the nearest  city  of  50,000  or  more  people  in  the  year  2000.  We  use  the  raster  from  the  Global Environmental Monitoring Unit and the Collins‐Bartholomew World Digital Map.  For each region, we sum the travel time from all its cells and divide by the number of cells in that region.  At the country level, we calculate this variable following the same methodology but using country boundaries. 

Global Environment Monitoring Unit, http://bioval.jrc.ec.europa.eu/products/gam/index.htm Collins‐Bartholomew World Digital Map,  http://www.bartholomewmaps.com/data.asp?pid=5   

V. Culture 

Trust in others  The percentage of respondents in the region who believe that most people can generally be trusted.  The country‐level analog of this variable is the arithmetic average of the regions in the country.  Data is for the most recent available year, ranging from 1980 through 2005. 

World Values Survey, http://www.worldvaluessurvey.org/ 

Civic values  The average of the value of the answers of respondents in the region about the degree of justifiability of the following four behaviors: (1) Claiming government benefits to which you are not entitled; (2) Avoiding a  fare on public  transport;  (3) Cheating on  taxes  if you have a  chance; and  (4) Someone accepting a bribe  in  the  course of  their duties.   For each question, possible answers  range  from 1 (never justifiable) to 10 (always justifiable). We only include observations with non‐missing data for at least two of the four questions.  The country‐level analog of this variable is the arithmetic average of the  regions  in  the country.   Data  is  for  the most  recent available year,  ranging  from 1980  through 2005. 

World Values Survey, http://www.worldvaluessurvey.org/ 

Ln(Number of ethnic groups) 

The  logarithm  of  the  number  of  ethnic  groups  that  inhabited  the  region  in  the  year  1964.    The country‐level analog of this variable is constructed using country boundaries. 

Weidmann et al., 2010, http://www.icr.ethz.ch/research/greg 

Page 54: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Variable  Description Sources and linksProbability of same language 

The probability that two randomly chosen people, one from the corresponding region and one from the rest of the country, share the same mother tongue  in the year 2004.   Where  language areas do not overlap with our regions, we compute the number of people speaking a language in a region by weighing  the  total  number  of  people  in  a  language  area  by  the  fraction  of  the  region’s  surface covered by  that  language area. We  compute  the probability of  same  language  separately  for each language in a region and then calculate the surface‐weighted average of the different languages in a region.  The country‐level analog of this variable is calculated as the population‐weighted average of the regional values.     

World Language Mapping System, http://www.gmi.org/wlms/  

VI.  Enterprise Survey Data

Ln(Sales / Employee)  The  logarithm  of  the quotient of  total  annual  revenue (in  current USD) over  the  total  number of employees  in  each  establishment.  Data  is  for  the most  recent  available  year,  ranging  from  2002 through 2009.      

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/  

Ln(Sales –Raw Materials / Employee) 

The  logarithm  of  the  quotient  of  total  annual  revenue minus  expenditure  on  raw materials  l(in current USD) over the total number of employees in each establishment. Data is for the most recent available year, ranging from 2002 through 2009.      

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/  

 Ln(Wages / Employee) 

The logarithm of the quotient of total cost of labor (in current USD) the total number of employees in each establishment. Data is for the most recent available year, ranging from 2002 through 2009.    

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

     

Ln(Expenditure on energy / Employee) 

The logarithm of the quotient of total energy and fuel costs over (in current USD) the total number of employees  in  each  establishment.  Data  is  for  the most  recent  available  year,  ranging  from  2002 through 2009.     

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Years of Education of workers 

The number of  years of  schooling  from primary  school onwards  of  the  average non‐management employee in each establishment. To compute this variable, we use the same assumptions and follow the same procedure as used for the previously described years of schooling variable at the regional level. Data is for the most recent available year, ranging from 2002 through 2009.     

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Years of Education of  manager 

The number of years of schooling from primary school onwards of the manager of the establishment. To compute this variable, we use the same assumptions and follow the same procedure as used for the previously described years of schooling variable at the regional level. Data is for the most recent available year, ranging from 2002 through 2009.     

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

Ln( Property, plant, and equipment / Employee) 

The  logarithm of the quotient of the book value of property, plant and equipment (in current USD) over the total number of employees in the establishment. Data is for the most recent available year, ranging from 2002 through 2009.    

World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/ 

     

 

Page 55: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Number of Regions

Number of Countries

Regions per country

Mean Minimum MaximumWithin‐country 

RangeWithin‐country std deviation

Coefficient of Variation for Variable in Levels

Ln(GDP per capita) 1,537 107 11 8.69 8.07 9.54 1.03 0.30 0.33Years of Education 1,489 106 12 6.58 5.34 8.70 2.34 0.73 0.92Temperature 1,568 110 12 16.84 10.23 21.13 4.47 1.45 0.09Inverse Distance to Coast 1,569 110 12 0.90 0.80 0.99 0.13 0.05 0.05Ln(Oil) 1,569 110 12 0.00 0.00 0.00 0.00 0.00 0.00Informal Payments 361 76 4 1.02 0.40 1.60 0.94 0.45 0.59ln(Tax Days) 270 58 5 1.29 1.06 1.51 0.36 0.19 0.18Ln(Days without electricity) 222 75 2 3.03 2.73 3.37 0.54 0.36 0.32Security costs 373 79 4 0.91 0.39 1.34 0.72 0.34 0.42Access to land 519 81 5 0.15 0.04 0.27 0.21 0.09 0.40Access to finance 536 82 5 0.28 0.14 0.47 0.29 0.12 0.24Government Predictability 386 75 4 0.46 0.34 0.61 0.24 0.10 0.20Doing Business Percentile Rank 180 19 6 0.40 0.21 0.49 0.22 0.11 0.31Ln(Power line density) 1,569 110 12 1.34 0.00 2.53 1.87 0.61 0.61Ln(Travel time) 1,569 110 12 5.28 4.21 6.00 1.82 0.54 0.46Trust in others 745 69 9 0.23 0.12 0.38 0.22 0.07 0.35Civic Values 683 75 8 2.23 1.71 3.12 1.08 0.48 0.19Ln(Number of ethnic groups) 1,568 110 12 0.98 0.00 1.79 1.39 0.50 0.46Probability of same language 1,545 109 12 0.67 0.28 0.79 0.26 0.09 0.21

Medians for:

The table reports descriptive statistics for the variables in the paper.  We report the total number of observations, the number of countries and medians for: (1) the number of regions with non‐missing data, (2) the country average, (3) the within‐country range, (4) the within‐country standard deviation, and (5) the coefficient of variation for the variable in levels (ratherthan in logs).   We also report the adjusted R squared from an univariate regression of each variable in the table on country dummies.  All variables are described in Table 1.

Table 2:  Descriptive Statistics

Panel A:  Regional GDP, Education, Geography, Institutions, Infrastructure, and Culture

Page 56: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Number of Regions

Number of Countries

Regions per country

Mean Minimum MaximumWithin‐country 

RangeWithin‐country std 

deviationCoefficient of Variation for Variable in Levels

Ln(Establishments / Population) 984 65 12 ‐4.89 ‐5.45 ‐4.06 1.17 0.37 0.37Ln(Employees / Establishments) 1,068 69 12 2.07 1.69 2.39 0.80 0.20 0.19Ln(Employees / Population) 1,056 69 12 ‐2.66 ‐3.38 ‐1.80 1.58 0.43 0.41Ln(Employees Big Firms / Employees) 540 31 13 ‐1.45 ‐2.17 ‐0.78 1.13 0.33 0.27Ln(Sales / Employee) 549 82 5 10.21 9.79 10.59 0.79 0.35 1.22Ln([Sales ‐ Raw Materials]/Employee) 359 70 4 9.53 9.24 9.87 0.69 0.37 1.21Ln(Wages / Employee) 515 77 5 8.28 8.00 8.66 0.62 0.25 1.79Ln(Employees) 549 82 5 3.25 2.72 3.71 0.82 0.35 1.46Ln(Expenditure on energy / Employee) 326 66 4 6.10 5.51 6.36 0.60 0.30 1.22Ln(Property, plant and equipment / Employee) 205 41 4 8.72 8.37 9.37 0.99 0.47 1.26Years of Education of Workers 507 74 5 9.97 8.66 10.80 2.25 0.93 3.06Years of Education of Managers 195 38 4 14.90 14.24 15.36 1.34 0.62 0.89

Medians for:

Table 2:  Descriptive Statistics (continued)

Panel B:  Enterprise Survey and Census Data

Page 57: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Observations Countries R2 Within R2 Between

Independent Variables:Years of Education 1,470 104 38% 58%Temperature 1,536 107 1% 27%Inverse Distance to Coast 1,537 107 4% 13%Ln(Oil) 1,537 107 2% 4%Informal Payments 350 74 0% 21%ln(Tax Days) 263 56 0% 20%Ln(Days without electricity) 219 73 2% 6%Security costs 362 77 0% 7%Access to land 507 79 0% 15%Access to finance 524 80 1% 8%Government Predictability 380 73 1% 0%Doing Business Percentile Rank 176 18 2% 13%Ln(Power line density) 1,537 107 5% 36%Ln(Travel time) 1,537 107 7% 15%Trust in others 739 68 0% 18%Ln(Number of ethnic groups) 1,536 107 5% 17%Probability of same language 1,518 106 1% 26%

Table 3:  Univariate Fixed Effects Regressions

Fixed effects regressions of the log of GDP per capita at the regional level in the year 2005.   The independent variables are proxies for: (1) geography, (2)  Institutions, and (3) Infrastructure and Culture.   The table reports 

the number of observations, the number of countries, the R2 within, the R2 between, and the fraction of the variance due to countries.  All variables are described in Table 1.

Page 58: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3)

Temperature ‐0.0914a ‐0.0189c ‐0.0190c

(0.0100) (0.0106) (0.0106)

Inverse Distance to Coast 4.4768a 2.9647a 2.9499a

(0.5266) (0.5736) (0.5782)

Ln(Oil) 1.2192a 0.9503a 0.9473a

(0.1985) (0.1371) (0.1375)

Years of Education 0.2566a 0.2574a

(0.0308) (0.0311)

Ln(Population) 0.0684c

(0.0408)

Ln(Employment) 0.0576(0.0398)

Constant 6.3251a 3.5761a 3.7959a

(0.4598) (0.9372) (0.8977)

Observations 107 104 103

Adjusted R2 50% 63% 63%

Table 4:  GDP per capita and Geography

Ordinary least squares and fixed effects regressions of the log of GDP per capita. The dependent variable is the logarithm of the 2005 level of GDP per capita at the country level in Panel A and at the  logarithm of regional GDP per capita in Panel B.  The independent variables are  (1) temperature, (2) inverse distance to coast, (3) the logarithm of per capita oil production and reserves, (4) the average years of education, (5) the logarithm of population, and (6) the logorithm of the number of employees.   Robust standard errors are shown in parentheses.   All variables are described in Table 1.

Panel A:  Dependent Variable is Logarithm National GDP per capita

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Page 59: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3)

Temperature ‐0.0156c ‐0.0140c ‐0.0206c

(0.0082) (0.0084) (0.0105)

Inverse Distance to Coast 1.0318a 0.4979a 0.5096a

(0.2078) (0.1438) (0.1745)

Ln(Oil) 0.1651a 0.1752a 0.1941a

(0.0477) (0.0578) (0.0440)

Years of Education 0.2755a 0.2751a

(0.0171) (0.0271)

Ln(Population) 0.0125(0.0168)

Ln(Employment) 0.0661a

(0.0244)

Constant 8.0947a 6.3886a 5.9154a

(0.2282) (0.1944) (0.2516)

Observations 1,545 1,478 833Number of countries 107 104 49

R2  Within 8% 42% 50%

R2  Between 47% 60% 70%

R2  Overall 34% 62% 70%Fixed Effects Yes Yes Yes

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Panel B:  Dependent Variable is Logarithm Regional GDP per capita

Table 4:  GDP per capita and Geography (continued)

Page 60: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3) (4) (5) (6) (7) (8) (9)

Years of Education 0.2566a 0.2310a 0.1890a 0.2339a 0.2291a 0.2301a 0.2264a 0.2355a 0.1749b(0.0308) (0.0344) (0.0310) (0.0316) (0.0336) (0.0350) (0.0344) (0.0332) (0.0703)

Ln(Population) 0.0684c ‐0.0022 0.0887 0.0428 0.0320 0.0067 0.0299 0.0611 ‐0.0782(0.0408) (0.0494) (0.0582) (0.0488) (0.0481) (0.0519) (0.0473) (0.0457) (0.1074)

Temperature ‐0.0189c ‐0.0105 ‐0.0276b ‐0.0083 ‐0.0094 ‐0.0066 ‐0.0082 ‐0.0129 ‐0.0147(0.0106) (0.0128) (0.0128) (0.0119) (0.0114) (0.0112) (0.0110) (0.0117) (0.0306)

Inverse Distance to Coast 2.9647a 2.3086a 2.1692a 2.5170a 2.2652a 2.2826a 2.1892a 2.3979a 0.2385(0.5736) (0.6321) (0.7006) (0.5698) (0.5856) (0.5406) (0.5562) (0.5616) (2.1131)

Ln(Oil) 0.9503a 1.6367a 0.5257 1.1319a 1.1739a 1.1916a 1.1165a 1.2054b 0.5201(0.1371) (0.5966) (0.5050) (0.3309) (0.3219) (0.3302) (0.2950) (0.4982) (0.4921)

Informal Payments ‐0.0121(0.0499)

ln(Tax Days) ‐0.5497a

(0.1446)

Ln(Days without electricity) ‐0.1375(0.0847)

Security costs ‐0.0332(0.0250)

Access to land ‐0.7493(0.5783)

Access to finance ‐0.5164(0.4202)

Government Predictability 0.3835(0.4431)

Doing Business Percentile Rank 0.6704(1.6413)

Constant 3.5761a 5.1927a 5.1619a 4.6815a 4.7382a 5.1545a 4.9498a 3.9328a 8.6509b

(0.9372) (1.1015) (1.2918) (0.9542) (1.0046) (0.9971) (1.0246) (0.9724) (3.1636)

Observations 104 73 55 75 76 80 81 72 17

Adjusted R2 63% 73% 76% 69% 69% 70% 70% 71% 34%

Adj. R2  without institution 63% 73% 69% 69% 69% 69% 69% 71% 39%

Adj. R2 without education 50% 53% 60% 49% 50% 52% 52% 50% 26%

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Table 5:  National GDP per capita, Institutions, Infrastructure, and Culture

Ordinary least square regressions of the log of GDP per capita at the country level.  All regressions include the years of education, logarithm of population, temperature, inverse distance to coast, and the logarithm of per capita oil production and reserves.  In addition, regressions include measures of: (1) institutions (Panel A) and (2) infrastructure and culture (Panel B).  Robust standard errors are shown in parenthesis.  

For comparison, the bottom panel shows the adjusted R2 of two alternative specifications: (1) a regression with all regressors except the measure of institutions or culture; and (2) a regression with all regressors except education.  All variables are described in Table 1.

Panel A: Institutions

Page 61: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3) (4) (5) (6) (7)

Years of Education 0.2566a 0.2379a 0.2642a 0.1935a 0.1818a 0.2534a 0.2394a

(0.0308) (0.0338) (0.0325) (0.0498) (0.0538) (0.0347) (0.0377)

Ln(Population) 0.0684c 0.0688c 0.0653 0.1238 0.2169b 0.0999 0.0807c

(0.0408) (0.0414) (0.0407) (0.0788) (0.1017) (0.0640) (0.0450)

Temperature ‐0.0189c ‐0.0145 ‐0.0191c ‐0.0283b ‐0.0434a ‐0.0188c ‐0.0163(0.0106) (0.0109) (0.0108) (0.0135) (0.0148) (0.0107) (0.0108)

Inverse Distance to Coast 2.9647a 2.7218a 3.0968a 3.6522a 4.3386a 2.7758a 2.7448a

(0.5736) (0.6025) (0.6268) (0.7902) (1.0486) (0.6473) (0.5853)

Ln(Oil) 0.9503a 1.0157a 0.8737a 0.9902a 0.9751a 0.9538a 0.8792a

(0.1371) (0.1438) (0.1467) (0.3207) (0.2895) (0.1443) (0.1657)

Ln(Power line density) 0.1480(0.1099)

Ln(Travel time) 0.0825(0.0934)

Trust in others 1.2472(0.8796)

Civic values 0.4180(0.3105)

Ln(Number of ethnic groups) ‐0.0996(0.1550)

Probability of same language 0.4195(0.3391)

Constant 3.5761a 3.6383a 3.0050b 2.3962 ‐0.1572 3.4625a 3.3864a

(0.9372) (0.9251) (1.2448) (2.0122) (3.2084) (0.9289) (0.9548)

Observations 104 104 104 67 57 104 103

Adjusted R2 63% 63% 63% 49% 47% 63% 62%

Adj. R2  without infrastructure or culture 63% 63% 63% 48% 45% 63% 62%

Adj. R2 without education 50% 54% 50% 44% 42% 51% 52%

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Table 5:  National GDP per capita, Institutions, Infrastructure, and Culture (cont)

Panel B:  Infrastructure and Culture

Page 62: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3) (4) (5) (6) (7) (8) (9)

Years of Education in the Region 0.2758a 0.3056a 0.3620a 0.3439a 0.3343a 0.3267a 0.3273a 0.3166a 0.4141a

(0.0172) (0.0298) (0.0288) (0.0481) (0.0310) (0.0218) (0.0215) (0.0207) (0.0229)

Ln(Population in the Region) 0.0126 ‐0.0185 ‐0.0175 ‐0.0442 ‐0.0191 ‐0.0087 ‐0.0098 ‐0.0113 ‐0.0026(0.0168) (0.0495) (0.0536) (0.0613) (0.0432) (0.0316) (0.0312) (0.0305) (0.0229)

Temperature ‐0.0140c ‐0.0101 ‐0.0086 ‐0.0015 ‐0.0064 ‐0.0093 ‐0.0106 ‐0.0131 0.0016(0.0084) (0.0096) (0.0078) (0.0122) (0.0093) (0.0086) (0.0086) (0.0081) (0.0059)

Inverse Distance to Coast 0.4971a 0.4647 0.8290c 0.1810 0.2703 0.4054 0.5133c 0.4420 0.0913(0.1441) (0.3293) (0.4273) (0.4312) (0.3041) (0.2636) (0.2822) (0.2788) (0.3460)

Ln(Oil) 0.1752a ‐0.0578 0.1555 ‐0.0584 ‐0.0473 ‐0.0224 ‐0.0040 ‐0.0170 0.1834(0.0578) (0.1283) (0.1319) (0.2503) (0.0862) (0.1081) (0.1113) (0.0735) (0.1160)

Informal Payments ‐0.0089(0.0353)

ln(Tax Days) ‐0.0479(0.0630)

Ln(Days without electricity) 0.0001(0.0764)

Security costs ‐0.0004(0.0060)

Access to land ‐0.1900(0.1457)

Access to finance ‐0.0935(0.1536)

Government Predictability ‐0.1251(0.1426)

Doing Business Percentile Rank ‐0.6199c

(0.3437)

Constant 6.3853a 6.5073a 5.7640a 6.8622a 6.4507a 6.3453a 6.2816a 6.4790a 6.3186a

(0.1947) (0.7043) (0.8220) (0.7867) (0.5993) (0.4664) (0.4827) (0.4629) (0.4428)

Observations 1,469 338 255 216 352 387 381 368 172Number of countries 104 73 55 72 76 77 76 72 17

R2  Within 42% 58% 66% 59% 60% 62% 62% 63% 69%

R2  Between 60% 64% 64% 53% 58% 60% 60% 63% 39%

R2  Overall 62% 59% 60% 49% 53% 55% 55% 56% 51%

Within R2  without institution 42% 57% 66% 59% 60% 62% 62% 62% 67%

Within R2  without education 9% 11% 14% 10% 9% 6% 5% 7% 9%

Between R2  without institution 60% 64% 63% 53% 58% 60% 60% 63% 41%

Between R2  without education 42% 25% 20% 21% 26% 35% 39% 45% 50%

Fixed Effects Yes Yes Yes Yes Yes Yes Yes Yes Yes

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Table 6: Regional GDP per capita, Institutions, Infrastructure, and Culture

Ordinary least square regressions of the log of regional GDP per capita.  All regressions include years of education, logarithm of population, temperature, inverse distance to coast, and the logarithm of per capita oil production and reserves.  In addition, regressions include measures of: (1) institutions (Panel A) and (2) infrastructure and culture (Panel B).  Robust standard errors are shown in parenthesis.  For comparison, the 

bottom panel shows the adjusted R2 of two alternative specifications: (1) a regression with all regressors except the measure of institutions or culture; and (2) a regression with all regressors except education.  All variables are described in Table 1.

Panel A:  Institutions

Page 63: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3) (4) (5) (6) (7)

Years of Education in the Region 0.2758a 0.2713a 0.2627a 0.3021a 0.2986a 0.2644a 0.2719a

(0.0172) (0.0187) (0.0197) (0.0286) (0.0305) (0.0181) (0.0175)

Ln(Population in the Region) 0.0126 0.0101 0.0023 0.0091 0.0138 0.0170 0.0115(0.0168) (0.0168) (0.0184) (0.0187) (0.0193) (0.0173) (0.0157)

Temperature ‐0.0140c ‐0.0142c ‐0.0166c ‐0.0015 ‐0.0038 ‐0.0154c ‐0.0140c

(0.0084) (0.0085) (0.0085) (0.0060) (0.0056) (0.0090) (0.0080)

Inverse Distance to Coast 0.4971a 0.4872a 0.4626a 0.4750c 0.4093 0.4351a 0.5162a

(0.1441) (0.1427) (0.1438) (0.2590) (0.2713) (0.1358) (0.1450)

Ln(Oil) 0.1752a 0.1793a 0.1864a 0.0534 0.0354 0.1922a 0.1772a

(0.0578) (0.0584) (0.0582) (0.0669) (0.0572) (0.0613) (0.0591)

Ln(Power line density) 0.0199(0.0198)

Ln(Travel time) ‐0.0456c

(0.0231)

Trust in others ‐0.0611(0.0868)

Civic values ‐0.0040(0.0231)

Ln(Number of ethnic groups) ‐0.0504b

(0.0249)

Probability of same language 0.1723(0.2067)

Constant 6.3853a 6.4350a 6.9287a 6.0940a 6.0196a 6.5272a 6.2956a

(0.1947) (0.1928) (0.3351) (0.2863) (0.3245) (0.1679) (0.2337)

Observations 1,469 1,469 1,469 699 635 1,468 1,445Number of countries 104 104 104 65 70 104 103

R2  Within 42% 42% 43% 49% 48% 42% 42%

R2  Between 60% 60% 60% 50% 50% 60% 60%

R2  Overall 62% 62% 61% 50% 47% 62% 62%

Within R2  without institution 42% 42% 42% 49% 48% 42% 42%

Within R2  without education 9% 13% 17% 10% 10% 14% 11%

Between R2  without institution 60% 60% 60% 51% 50% 60% 59%

Between R2  without education 42% 51% 47% 7% 17% 47% 50%Fixed Effects Yes Yes Yes Yes Yes Yes Yes

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Table 6: Regional GDP per capita, Institutions, Infrastructure, and Culture (Cont)

Panel B: Infrastructure and Culture

Page 64: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3) (4) (5) (6)

Years of Education 0.2567a 0.2200a 0.2069a 0.1626a 0.2448a 0.1850a

(0.0308) (0.0433) (0.0438) (0.0480) (0.0363) (0.0351)

Ln(Population) 0.0683c 0.0354 0.0559 ‐0.0356 0.0732 0.0504(0.0410) (0.0487) (0.0470) (0.0482) (0.0533) (0.0370)

Temperature ‐0.0189c ‐0.0179 ‐0.0135 0.0024 ‐0.0181 ‐0.0100(0.0106) (0.0118) (0.0109) (0.0106) (0.0126) (0.0104)

Inverse Distance to Coast 2.9646a 2.3421a 2.3853a 2.3974a 2.9603a 1.9906a

(0.5742) (0.7800) (0.6050) (0.5941) (0.6208) (0.5463)

Ln(Oil) 0.9503a 0.7877c 1.0708a 0.8965a 1.0720b 0.9928a

(0.1373) (0.4564) (0.1729) (0.1100) (0.4094) (0.2013)

Autocracy ‐0.5994a

(0.2184)

Executive Constraints 0.1633b

(0.0696)

Expropriation Risk 0.3952a

(0.0986)

Proportional Representation 0.3972c

(0.2328)

Corruption 0.2130a

(0.0479)

Constant 3.5771a 5.3781a 3.7896a 3.1830b 3.2958a 4.1183a

(0.9416) (1.3861) (1.0059) (1.3630) (1.0503) (0.8118)

Observations 103 80 101 81 97 103

Adjusted R2 63% 67% 65% 70% 63% 69%

Adj. R2  without institution 63% 64% 63% 63% 62% 63%

Adj. R2 without education 50% 60% 59% 67% 52% 63%

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Table 7:  National GDP per capita and commonly used measures of institutions

Ordinary least square regressions of the log of GDP per capita at the country level.  All regressions include the years of education, logarithm of population, temperature, inverse distance to coast, and the logarithm of per capita oil production and reserves.  In addition, regressions include the following variables: (1) Autocracy; (2) Executive constraints; (3) Expropriation riks; (4) Proportional representation; and (5) Corruption.  Robust standard errors are shown in parenthesis.  For comparison, the bottom 

panel shows the adjusted R2 of two alternative specifications: (1) a regression with all regressors except the measure of institutions or culture; and (2) a regression with all regressors except education. 

Page 65: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15)

Years of Education of manager 0.0470a 0.0322a 0.0227a 0.0242a 0.0148a 0.0445a 0.0311a 0.0267a 0.0241a 0.0150a 0.0262a 0.0179a 0.0082a 0.0112a 0.0049(0.0033) (0.0034) (0.0033) (0.0033) (0.0047) (0.0033) (0.0033) (0.0032) (0.0033) (0.0041) (0.0027) (0.0027) (0.0028) (0.0028) (0.0038)

Ln(Employees) . 0.1472a . ‐0.0134b 0.1324a . 0.1388a . 0.0230a 0.1099a . 0.0822a . ‐0.0257a 0.0613a

. (0.0097) . (0.0062) (0.0127) . (0.0101) . (0.0066) (0.0121) . (0.0081) . (0.0055) (0.0107)

Years of Education of workers 0.0318a 0.0243a 0.0434a 0.0443a 0.0071 0.0288a 0.0217a 0.0514a 0.0500a 0.0082c 0.0164a 0.0120a 0.0151a 0.0167a 0.0056(0.0033) (0.0033) (0.0027) (0.0028) (0.0044) (0.0037) (0.0036) (0.0030) (0.0030) (0.0045) (0.0028) (0.0028) (0.0025) (0.0026) (0.0038)

Ln(Expenditure on energy / employee) 0.3408a 0.3401a . . 0.2843a 0.3035a 0.3007a . . 0.2389a 0.2180a 0.2176a . . 0.1738a

(0.0095) (0.0094) . . (0.0118) (0.0102) (0.0101) . . (0.0117) (0.0086) (0.0086) . . (0.0102)

Ln(Property, Plant, Equipment / employees) . . 0.3038a 0.3046a 0.1851a . . 0.2964a 0.2949a 0.1745a . . 0.1668a 0.1684a 0.1126a

. . (0.0069) (0.0070) (0.0101) . . (0.0072) (0.0072) (0.0100) . . (0.0058) (0.0058) (0.0089)

Constant 6.9256a 6.6765a 4.2481a 4.2722a 5.9533a 9.3425a 9.1129a 5.1127a 5.0704a 8.9826a 6.2151a 6.0776a 2.7971a 2.8432a 5.5371a

(0.0673) (0.0660) (0.0556) (0.0564) (0.0889) (0.0682) (0.0674) (0.0564) (0.0574) (0.0863) (0.0595) (0.0604) (0.0487) (0.0500) (0.0784)

Observations 13,248 13,248 19,305 19,305 7,733 10,651 10,651 17,893 17,893 6,655 12,782 12,782 19,209 19,209 7,706Number of Regions‐Industries 855 855 1,037 1,037 487 754 754 1,005 1,005 458 807 807 1,033 1,033 486

Within R2  21% 23% 21% 21% 30% 20% 22% 20% 20% 29% 13% 14% 8% 8% 17%

Between R2  90% 89% 67% 67% 88% 28% 24% 55% 54% 51% 89% 88% 66% 67% 86%

Overall R2  79% 78% 58% 58% 81% 37% 33% 59% 58% 53% 76% 74% 57% 59% 78%Regions‐Industry fixed effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Table 8:  Firm level productivity

Dependent Variable:Logarithm of Sales per employee Ln[(Sales ‐ Raw Materials)/Employee] Logarithm of Wages per employee

The table reports fixed effect regressions for for the following two dependent variables: (1) logarithm of sales per employee, (2) logarithm of sales net of raw materials per employee, and (3) logarithm of wages per employee.   All regressions include region‐industry fixed effects.  Errors are clustered at the regional level.  The independent variables include:  (1)  Years of Education of manager, (2) Ln(Employees), (3) Years of Education of workers, (4) Ln(Expenditure on energy / employee), and (5) Ln(Property, Plant, Equipment / employees). All variables are described in Table 1.  

Page 66: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Panel A: Basic Specification

(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15)

Years of Education in the Region 0.0655a 0.0639a 0.0954a 0.0950a 0.0478b 0.0748a 0.0735a 0.0945a 0.0928a 0.0565a 0.0580a 0.0577a 0.0840a 0.0843a 0.0462a

(0.0202) (0.0185) (0.0280) (0.0279) (0.0185) (0.0197) (0.0181) (0.0275) (0.0270) (0.0177) (0.0162) (0.0159) (0.0233) (0.0234) (0.0151)

Ln(Population in the Region) 0.0920a 0.0803a 0.1437a 0.1409a 0.0917a 0.1332a 0.1178a 0.1046b 0.0938c 0.1188a 0.0682 0.0622 0.0135 0.0159 0.0787c

(0.0321) (0.0297) (0.0501) (0.0504) (0.0328) (0.0343) (0.0327) (0.0511) (0.0501) (0.0410) (0.0425) (0.0418) (0.0352) (0.0354) (0.0413)

Years of Education of manager 0.0534a 0.0352a 0.0257a 0.0243a 0.0169b 0.0498a 0.0335a 0.0287a 0.0236a 0.0165a 0.0315a 0.0215a 0.0118a 0.0131a 0.0070(0.0047) (0.0048) (0.0062) (0.0057) (0.0077) (0.0047) (0.0042) (0.0044) (0.0043) (0.0058) (0.0038) (0.0036) (0.0044) (0.0042) (0.0052)

Ln(Employees) . 0.1497a . 0.0113 0.1468a . 0.1392a . 0.0401a 0.1177a . 0.0827a . ‐0.0095 0.0717a

. (0.0154) . (0.0176) (0.0193) . (0.0176) . (0.0141) (0.0193) . (0.0150) . (0.0108) (0.0187)

Years of Education of workers 0.0349a 0.0279a 0.0384a 0.0378a 0.0066 0.0332a 0.0264a 0.0490a 0.0468a 0.0112c 0.0195a 0.0151a 0.0146a 0.0152a 0.0049(0.0053) (0.0054) (0.0056) (0.0058) (0.0068) (0.0059) (0.0057) (0.0059) (0.0061) (0.0062) (0.0036) (0.0036) (0.0033) (0.0033) (0.0051)

Ln(Expenditure on energy / employee) 0.3577a 0.3554a . . 0.2902a 0.3183a 0.3133a . . 0.2440a 0.2248a 0.2232a . . 0.1721a

(0.0185) (0.0177) . . (0.0220) (0.0182) (0.0175) . . (0.0202) (0.0173) (0.0172) . . (0.0168)

Ln(Property, Plant, Equipment / employees) . . 0.3258a 0.3250a 0.1946a . . 0.3150a 0.3118a 0.1818a . . 0.1787a 0.1794a 0.1230a

. . (0.0132) (0.0136) (0.0162) . . (0.0139) (0.0141) (0.0161) . . (0.0086) (0.0089) (0.0135)

Constant 5.1202a 5.0055a 4.8529a 4.8850a 4.6033a 4.5073a 4.4682a 3.9685a 4.1094a 3.8699a 5.1007a 5.0322a 6.6732a 6.6461a 5.0701a

(0.3706) (0.3373) (1.1885) (1.1887) (0.4521) (0.4183) (0.4008) (0.8173) (0.8009) (0.6479) (0.5225) (0.5199) (0.7223) (0.7248) (0.7073)

Observations 13,248 13,248 19,305 19,305 7,733 10,651 10,651 17,893 17,893 6,655 12,782 12,782 19,209 19,209 7,706Number of Countries 29 29 22 22 21 25 25 21 21 20 27 27 22 22 21

Within R2  30% 32% 31% 31% 37% 28% 30% 27% 28% 36% 20% 21% 13% 13% 22%

Between R2  90% 90% 59% 59% 92% 87% 86% 70% 71% 89% 88% 87% 57% 57% 89%

Overall R2  74% 74% 54% 54% 80% 74% 72% 73% 73% 81% 69% 68% 44% 44% 76%Country fixed effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes YesIndustry fixed effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes

Note:  a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level. 

Table 9:  Firm level productivity and Regional Human Capital

The table reports fixed effect regressions for for the following two dependent variables: (1) logarithm of sales per employee, (2) logarithm of sales net of raw materials per employee, and (3) logarithm of wages per employee. All regressions include country and industry fixed effects. Errors are clustered at the regional level. The independent variables include: (1) Years of Education in the Region, (2) Ln(Population in the Region), (3) Years of Education of manager, (4) Ln(Employees), (5) Years of Education of workers, (6) Ln(Expenditure on energy / employee), and (7) Ln(Property, Plant, Equipment / employees). All variables are described in Table 1.

Ln[(Sales ‐ Raw Materials)/Employee] Logarithm of Wages per employeeLn(Sales/Employee)Dependent Variable:

Page 67: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15)

Temperature 0.0084 0.0105 0.0081 0.0082 0.0190 ‐0.0217 ‐0.0200 ‐0.0023 ‐0.0021 0.0278 ‐0.0141 ‐0.0128 0.0113b 0.0112b ‐0.0046(0.0134) (0.0124) (0.0079) (0.0079) (0.0150) (0.0171) (0.0155) (0.0079) (0.0079) (0.0232) (0.0116) (0.0112) (0.0048) (0.0048) (0.0100)

Inverse Distance to Coast 0.3464 0.2275 1.1491a 1.1458a 0.6196 ‐0.2990 ‐0.4531 1.0464a 1.0125a ‐0.1930 ‐0.0127 ‐0.0841 0.2830 0.2947 0.1021(0.3708) (0.3652) (0.2065) (0.2079) (0.4041) (0.4233) (0.3851) (0.2286) (0.2228) (0.5291) (0.3349) (0.3369) (0.1824) (0.1824) (0.3564)

Ln(Oil) ‐0.8681a ‐0.6813b 0.1682 0.1728 ‐1.2953c ‐0.7747a ‐0.5679c ‐0.0945 ‐0.0492 ‐0.6443 ‐0.6755c ‐0.5721 0.1596 0.1433 ‐1.2264b

(0.2856) (0.3124) (0.5944) (0.5919) (0.7382) (0.2878) (0.3084) (0.4935) (0.4942) (0.8883) (0.3430) (0.3619) (0.3833) (0.3804) (0.4702)

Years of Education in the Region 0.0532a 0.0558a 0.0236 0.0236 0.0287c 0.0800a 0.0839a 0.0307 0.0310 0.0680a 0.0560a 0.0583a 0.0639b 0.0638b 0.0467b

(0.0198) (0.0184) (0.0287) (0.0287) (0.0162) (0.0205) (0.0188) (0.0299) (0.0295) (0.0180) (0.0182) (0.0181) (0.0261) (0.0261) (0.0179)

Ln(Population in the Region) 0.0963a 0.0817a 0.0713c 0.0705c 0.1029b 0.1235a 0.1062a 0.0493 0.0418 0.1195a 0.0706c 0.0629 ‐0.0257 ‐0.0230 0.0852c

(0.0321) (0.0295) (0.0382) (0.0386) (0.0416) (0.0319) (0.0300) (0.0401) (0.0395) (0.0443) (0.0417) (0.0411) (0.0291) (0.0290) (0.0439)

Years of Education of manager 0.0533a 0.0353a 0.0271a 0.0266a 0.0169b 0.0495a 0.0333a 0.0300a 0.0256a 0.0168a 0.0314a 0.0216a 0.0126a 0.0142a 0.0069(0.0047) (0.0047) (0.0060) (0.0053) (0.0077) (0.0047) (0.0042) (0.0044) (0.0041) (0.0059) (0.0037) (0.0036) (0.0043) (0.0040) (0.0052)

Ln(Employees) . 0.1486a . 0.0035 0.1453a . 0.1393a . 0.0341b 0.1178a . 0.0821a . ‐0.0125 0.0719a

. (0.0153) . (0.0175) (0.0190) . (0.0174) . (0.0137) (0.0191) . (0.0150) . (0.0110) (0.0187)

Years of Education of workers 0.0344a 0.0275a 0.0406a 0.0404a 0.0058 0.0333a 0.0266a 0.0514a 0.0494a 0.0112c 0.0194a 0.0152a 0.0154a 0.0162a 0.0047(0.0053) (0.0054) (0.0059) (0.0061) (0.0067) (0.0059) (0.0057) (0.0061) (0.0063) (0.0062) (0.0037) (0.0037) (0.0034) (0.0034) (0.0052)

Ln(Expenditure on energy / employee) 0.3571a 0.3549a . . 0.2890a 0.3178a 0.3128a . . 0.2426a 0.2245a 0.2230a . . 0.1717a

(0.0184) (0.0176) . . (0.0220) (0.0183) (0.0176) . . (0.0204) (0.0174) (0.0172) . . (0.0168)

Ln(Property, Plant, Equipment / employees) . . 0.3191a 0.3189a 0.1929a . . 0.3108a 0.3082a 0.1825a . . 0.1765a 0.1773a 0.1228a

. . (0.0122) (0.0127) (0.0160) . . (0.0132) (0.0134) (0.0163) . . (0.0085) (0.0088) (0.0135)

Constant 5.0425a 5.1078a 5.1076a 5.1172a 3.5975a 11.2711a 11.3696a 4.9529a 5.0614a 3.8682a 6.0203a 5.9598a 6.8223a 6.7884a 4.9977a

(0.4348) (0.3958) (0.9995) (1.0016) (0.5013) (0.8349) (0.7626) (0.6071) (0.6003) (0.9306) (0.7249) (0.7132) (0.6127) (0.6112) (0.7689)

Observations 13,248 13,248 19,305 19,305 7,733 10,651 10,651 17,893 17,893 6,655 12,782 12,782 19,209 19,209 7,706(0.2856) (0.3124) (0.5944) (0.5919) (0.7382) (0.2878) (0.3084) (0.4935) (0.4942) (0.8883) (0.3430) (0.3619) (0.3833) (0.3804) (0.4702)

Years of Education in the Region 0.0532a 0.0558a 0.0236 0.0236 0.0287c 0.0800a 0.0839a 0.0307 0.0310 0.0680a 0.0560a 0.0583a 0.0639b 0.0638b 0.0467b

(0.0198) (0.0184) (0.0287) (0.0287) (0.0162) (0.0205) (0.0188) (0.0299) (0.0295) (0.0180) (0.0182) (0.0181) (0.0261) (0.0261) (0.0179)

Ln(Population in the Region) 0.0963a 0.0817a 0.0713c 0.0705c 0.1029b 0.1235a 0.1062a 0.0493 0.0418 0.1195a 0.0706c 0.0629 ‐0.0257 ‐0.0230 0.0852c

(0.0321) (0.0295) (0.0382) (0.0386) (0.0416) (0.0319) (0.0300) (0.0401) (0.0395) (0.0443) (0.0417) (0.0411) (0.0291) (0.0290) (0.0439)

b

Table10:  Firm level productivity, Regional Human Capital, and Geography

Dependent Variable:Logarithm of Sales per employee Logarithm of Wages per employeeLn[(Sales ‐ Raw Materials)/Employee]

The table reports fixed effect regressions for for the following two dependent variables: (1) logarithm of sales per employee, (2) logarithm of sales net of raw materials per employee, and (3) logarithm of wages per employee. All regressions include country and industry fixed effects. Errors are clustered at the regional level. The independent variables include: (1) Temperature, (2) Inverse Distance to Coast, (3) Ln(Oil), (4) Years of Education in the Region, (5) Ln(Population in the Region), (6) Years of Education of manager, (7) Ln(Employees), (8) Years of Education of workers, (9) Ln(Expenditure on energy / employee), and (10) Ln(Property, Plant, Equipment / employees). All variables are described in Table 1.

Page 68: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Figure 1:  Regions in the database 

 

 

   

Page 69: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Figure 2: Within‐country standard deviation of GDP per capita and development

 

 

Figure 3: Within‐country standard deviation of years of education and development 

 

   

ZWE

ZAR

MWI

MOZ

NER

MDG

UGA

NPL

GHA

TZA

BFA

ZMBLSO

BEN

KHM

SEN

KEN

LAO

NGA

KGZ

MNG

CMR

UZB

NIC

VNM

PAK

MDA

HND

PHL

LKA

IND

IDN

GEOMAR

GTM

ARM

PRY

JORSYR

BOL

SLVSWZ

CHN

AZE

DOM

NAM

UKRALB

EGY

PER

BLZ

COL

SRB

ECUMKD

PAN

BRA

URY

BIH

LVA

ZAF

VEN

THA

BGR

LBN

IRN

KAZROM

TUR

RUS

ARG

ESTHRV

LTU

MYSMEXCHL

POL

HUN

SVK

GAB

CZE

KORSVNGRCPRTNZL

ISR

ESP

ITA

FRAJPNFIN

DEU

GBR

BEL

SWE

DNK

ARE

AUSNLDAUT

IRLCHECAN

NORUSA

-.2

0.2

.4.6

Sta

nda

rd d

evi

atio

n o

f Ln

Reg

iona

l GD

P p

er c

apita

)

-6 -4 -2 0 2Ln GDP per capita

coef = -.02489678, (robust) se = .01172037, t = -2.12

ZWE

ZAR

MOZMWI

NER

MDG

UGA

NPL

GHA

TZABFA

KEN

ZMB

BEN

LSOKHMSEN

NGA

KGZ

LAO

MNG

UZB

CMR

VNM

NICMDA

PAK

HNDIDNIND

LKA

PHL

GEO

GTM

MAR

CHN

PRY

BOL

ARM

JOR

NAM

SYR

SLV

SWZDOM

AZE

UKR

PER

EGY

COL

PAN

BLZ

ECUSRB

MKD

BRA

THA

LVAURY

BIH

ZAF

VEN

RUS

ARG

LBN

BGR

KAZ

TURROM

EST

HRV

MEX

MYS

CHL

LTU

POL

HUN

SVK

GAB

CZESVNGRCPRT

NZL

ISR

ITA

ESPJPNFRABEL

ARE

FIN

DEU

GBR

DNK

SWENLD

AUS

AUT

CHE

IRLCAN

USA

NOR

-.5

0.5

11

.5S

td D

ev

Ye

ars

of E

duc

atio

n

-6 -4 -2 0 2Ln(GDP per capita)

coef = -.12647737, (robust) se = .02655853, t = -4.76

Page 70: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Figure 4: GDP per capita and years of schooling in Brazil, Colombia, India, and Russia 

 

 

Figure 5: Partial Correlation Graph of (Log) GDP per capita and Years of Education 

 

 

PiauíAlagoasMaranhãoParaíbaCeará

BahiaAcreSergipeParáTocantins

Rondônia

Rio Grande do NortePernambuco

AmazonasRoraima

Mato Grosso

GoiásMinas GeraisMato Grosso do SulAmapá

Espírito SantoParanáSanta CatarinaRio Grande do Sul

São PauloRio de Janeiro

Distrito Federal

-1-.

50

.51

1.5

e( ln

Reg

GD

P |

X )

-1 0 1 2 3e( HKij | X )

coef = .53632514, (robust) se = .03022503, t = 17.74

Brazil

Vichada

PutumayoGuainiaGuaviareChoco

La Guajira

CaquetaNarinoVaupesCauca

Casanare

Amazonas

Cordoba

Sucre

HuilaArauca

CesarBoyaca

Norte de SantanderMagdalena

TolimaCundinamarcaMeta

BolivarCaldasRisaralda

SantanderAntioquia

Quindio

Valle del CaucaAtlantico

San Andres y ProvidenciaBogota

-1-.

50

.51

1.5

e( ln

Reg

GD

P |

X )

-2 -1 0 1 2 3e( HKij | X )

coef = .23096736, (robust) se = .04695696, t = 4.92

Colombia

Bihar

OrissaJharkhandChattisgarhRajasthanArunachal PradeshAndhra Pradesh

Madhya Pradesh

West Bengal

Uttar Pradesh

MeghalayaAssamTripuraSikkimNagalandKarnataka

Gujarat

Jammu & Kashmir

Andaman & Nicobar IslandsHaryanaTamil NaduPunjabMaharashtra

MizoramUttaranchalHimachal Pradesh

Manipur

Goa

KeralaPondicherryDelhi

Chandigarh

-10

12

e( ln

Reg

GD

P |

X )

-2 0 2 4 6e( HKij | X )

coef = .31301955, (robust) se = .0470328, t = 6.66

India

Kurgan RegionJewish Autonomous RegionKirov Region

Chechen Republic

Perm RegionRepublic of BashkortostanArkhangelsk Region

Altai KraiKursk RegionChita RegionTambov RegionAltai RepublicPskov regionBryansk Region

Kemerovo RegionVologda Region

The Republic of Mari El

Lipetsk Region

Kostroma regionOrenburgskayaRepublic of KareliaNovgorod RegionChuvash RepublicOrel RegionIvanovo RegionRyazan RegionTver RegionPenza Region

Republic of Ingushetia

Sverdlovsk RegionAmur Region

Komi Republic

Smolensk RegionVladimir RegionUlyanovsk RegionRepublic of Mordovia

Yaroslavl RegionChelyabinsk region

Tuva Republic

Republic of KhakassiaTula RegionUdmurt RepublicOmsk RegionNizhny Novgorod RegionLeningrad Region

Krasnodar

Adygeya Republic

Belgorod regionKrasnoyarsk KraiThe Republic of Tatarstan

Stavropol Territory

Irkutsk RegionRepublic of BuryatiaVoronezh RegionNovosibirsk RegionAstrakhan Region

Dagestan Republic

Sakhalin Oblast

Saratov RegionVolgograd Region

Republic of Kalmykia

Rostov RegionKaluga RegionPrimorsky Krai

Karachay-Cherkessia Republic

Tomsk RegionChukotka Autonomous DistrictMurmansk regionSamara RegionKhabarovsk Krai

Tyumen Region

Kabardino-Balkar Republic

The Republic of Sakha (Yakutia)

Kaliningrad Oblast

Magadan Region

Republic of North Ossetia - Alania

Kamchatka RegionMoscow RegionSt. Petersburg

Moscow

-2-1

01

2e(

lnR

egG

DP

| X

)

-1 0 1 2e( HKij | X )

coef = .54581172, (robust) se = .12771885, t = 4.27

Russian Federation

Ln GDP per capita and Years of Schooling

KEN

CHNCMRPAN

CMR

NGAPERGHAIDN

GHAMEXHND

COL

GHA

NIC

NGA

PER

MEXPERPHL

PERCOLPER

BOL

IND

NAM

LAOARG

SYR

MYSTUR

PER

PRY

MYS

NAM

IDN

PER

VEN

ARG

ZWE

ZMBHNDLBNPHL

BRA

DOMPAN

ZAR

NIC

THAZMBJOR

COL

COLGTMMKD

COL

PRY

ECU

BRABRA

GAB

CMR

MEX

BENTZA

MEX

SWZ

IND

NAM

ECU

IND

ECU

IND

ARG

IDN

NIC

IDN

IDN

ZMB

INDEGY

THA

VNMSRBLAO

BLZMEX

ESPIDN

ZAR

LAO

COL

CHN

CHN

IND

VNM

VENSYRISRZARLVA

CHL

IND

PRYHNDHND

ARG

PER

ESPMEXLVA

LAO

TUR

AUS

SYR

COLLSO

BRA

HNDCHLSRB

IND

PHLECU

IDN

MEX

JOR

COL

TZAURY

MKD

COLHND

INDNIC

NICVENZMBKHMDEU

LVA

BOL

BRA

TZA

MNGLVACHL

BELBRA

LAO

SRB

IDNSVN

IDNDOMCHNBRA

ARG

NAM

PHLKHMNZL

LVA

BENJPN

NAM

LSO

ZAR

CHN

IND

HRVZWEBRAJPNBENMARSLV

UKR

MDG

ZAF

COLPER

IDN

VENVENKHMSYR

VENLTU

GTMSVKFRAAREDNKARE

RUS

DEU

NAM

RUSNZL

HRV

LSO

MDAMARSRBLAOSYR

JPN

CHN

LBNURY

ZAF

CMR

MEXSENINDCAN

GTM

HNDCHL

URYZAR

COL

JPNLAO

NICGRCJPNHRVRUS

LAOJPNMYS

MEXCHNHND

SENPRY

UKRHRVITAKENCOL

ZWENICDEUZAF

IND

IND

LVA

NAMKAZ

ITABFA

ECU

CHL

PAN

LBNLVA

MNG

USABRA

BRAMEXSENZAF

SRB

UKR

ZWEEST

PER

SLVDNK

RUS

MNG

AUT

LKACZENOR

RUS

ITA

LAO

LBN

DEUMDG

HRV

RUS

NORDNK

NZLMARGEOMAR

USAGEOFRA

SLV

NER

TZA

USA

MYSESPHRV

LTU

TUR

BEN

MNGMDG

RUS

RUSSVK

PRY

PRY

UKR

UGASENVENPRY

DNKBFA

NER

HRVSRBMDAMNGNORJPNLVAUZBMNG

MNG

MAR

SRB

LVA

MYSBGRIND

USA

JPNCOLLSOFRARUSAZEMEXPAKROM

TZA

MNG

GBR

MNGMNG

GRCJPNJPNMDAKHMESTDEUSWE

NERLVA

ZMBNLD

USASWZ

MOZ

ROMHRV

LVAUSA

MOZECU

TZALVA

KHMRUSPOL

KEN

PRTKEN

NZL

IDN

ARMCANPHL

ROMECU

BLZ

RUS

GRC

ITANZL

UKRRUS

JORURY

ESTINDRUSBFA

BEN

ARMHUN

DNK

MYS

MWI

COL

DOM

RUS

BFAGEOLAOSLVBRAJPN

ITA

NORBFA

ARG

DNKLVABGR

MYS

SVNNERDNKVENESP

BIH

SRBTZANPL

TZA

AZEESTBRAUSAISRUSASLV

COL

HRVNICHUN

NER

NLDSYR

URY

BEL

DEU

PRY

HRVIDN

KEN

NERVNMECU

RUS

ECUJORPRY

TZA

JPNSLVBRA

COL

PAK

ARG

FRASEN

RUS

AUTBENURYUKR

BLZ

MNGMAR

JPNGBR

UKR

GBR

RUS

URYNZLMNG

ARE

NORKHM

ARM

RUS

LVA

CHNPRY

USAARMESTCHECHEZWEIRLLTUARE

BFAECUSRB

RUSKHM

USALVAINDNLDUSA

RUS

GEOLVA

RUS

HUNZWE

ECU

CHECHN

GTM

TZAMOZ

SVN

MNG

MWI

CAN

FRA

CANAUTRUSLTUJPN

CHEARM

DEU

PRYBEL

KGZ

PRT

RUSMKDUGA

CHE

FINKAZAZEVEN

ROMCANUKRPOLMOZCHNBLZ

CHN

JPN

IND

LKASLVLVAGRCEST

KGZ

CHN

DNKAZEZAF

RUS

ARG

HRVTUR

URYKGZPHLGBRMOZUSASENSVK

MKD

NLD

RUS

JPNRUSDOMUKRFRARUS

RUS

UKR

BFAPOLBELESP

MNG

GEO

ZAR

CZENOR

GRC

ROM

ARG

SWEGBR

CHECHN

SVN

ECU

JPNBFA

SVK

POLARGKENFRAJPNCHLJPNPHLVEN

PRT

AUSJORUKR

IDN

BEN

ESPCHE

FIN

BGRCOL

RUS

SVKLKABGR

UZBJPN

USAFRAPHLLTU

RUS

ZWE

USA

ESP

CHNMDGUSAGBR

KGZEST

GRCARMUGACHEITAMOZHUNGEOESP

GAB

FRANLD

BENIDN

CHEPHLITAINDHRVSVK

RUSNZLMEX

MEXCHN

DEU

CHEPOL

SLVHNDNORVEN

CHL

NPL

RUS

POLLKA

AUSBFATZA

JPNRUS

JPN

BOL

POLUSAGBRDNK

USAUSAUKRAUT

RUS

FRA

RUSBELUKR

LVA

RUS

SRBPRTTZAGEOFRA

MEX

MAR

RUSRUS

RUS

FRALVA

SRB

ITAAUSPHL

GRCESPAZE

UKR

LTUAZERUSTURPERNORUKRCZEPRTCHESWE

IDNDEUKHMRUS

IDN

SVN

BLZRUSPAN

NORTHA

RUSAUS

EGYLSOLVA

MOZLVAESTSVNUKRBGRJPN

BIHGBRCHENOR

MEX

MDGURYPERKGZHRV

ZAR

BEL

EST

JPNJPNCZEHUN

CZEGAB

CHNGRCUSA

DEU

IDN

TZANLDCOLSWEKAZ

TZA

USA

CAN

JPNLTUCZESYRZAFDOMUKRSVNBRA

URY

JPNUSA

CANVNM

MNGLVA

IND

VENFIN

PAN

ITA

MYS

ESPFRAUKRRUS

NZL

CHEKAZGTMESPMYSJORNICPOLURY

SVN

HND

MARHND

LSOJPN

GRCARG

ECU

UKR

SEN

DNK

ITAUSACHE

BELRUS

BEL

NGANLDNZLUSAESTCHE

KGZ

KGZ

SLV

KHMPOLCHNROMGTMRUSGHA

ARE

NZLSWEAUS

NAM

POLAUT

IND

ZAF

GEO

ARG

BFA

SWEURY

RUS

NORLSONPLFRA

ARG

ARMLAO

ITA

CHN

LVA

ZWEURYSLV

COL

CHE

RUSNLD

NLD

BFADOMCZEHNDUSAMYSJPNARM

RUS

VEN

AUT

UKR

RUS

RUSUZBPRY

ARG

RUS

USAARG

MAR

COL

ARMUSA

NORSYRECUCHE

EST

RUSUSALKA

BENESTNIC

USA

TURUSALKAPRY

THA

VNMMAR

USA

NORUSAJPNUSAPRY

SRB

EGY

RUS

ZMB

FINPOL

SRB

RUSHUNUKR

RUS

MAR

JPNMNGUSAUZBJORSRB

USA

USAHRVITA

ARGUSA

DOM

LVA

BOL

GRCCHN

RUS

CHE

VEN

PRT

MOZ

MOZ

ARMCHEZAR

CHN

PAN

RUS

NPL

RUS

JPNRUS

RUS

USA

NAMPER

NOR

RUS

ISRECUCANUSA

CHL

BRAARGPOL

USA

MEX

JPNNORKHMTUR

PHLAZEVENITA

JPNCMRSWE

NLD

MNGPHL

BRAITA

IDN

INDMEXHRV

TURECUCHN

SVKFRA

URY

USA

ESP

IND

USAUSASENROMPAK

AUTSVNPOLLAOESPCHN

TZAESP

UKR

ITA

RUSSLVJPNCOL

MEX

JPNBRA

IDN

TZA

KAZ

ESPPRYITA

GEOJPN

AUSJOR

POL

BOL

CAN

CHE

CHE

JORPOLFRANORGHAGRCMAR

IRLTUR

AUTURYSYR

ARGNORUSABRA

GEOPERITANZLCHNISR

CHE

SVNZMB

URYMEXNPLUSABRAIND

MNGMEXUSA

LTUNICUSA

MEX

CHNCHNRUS

RUS

GBR

RUSCHN

MEX

ISRDEU

CHE

VNMURY

RUS

PRYBENUSA

URY

SVNITATZANAMSYRUSAUSA

VEN

GBR

BEL

FINDOM

RUS

UKR

AZE

SRB

LAOITACHN

PERRUSRUSMYS

BOL

ZAR

COLARGCANIDNHND

DEU

CHE

IDNGRCBRABIH

ARGUSA

LSO

GHA

DNKGBRCAN

RUS

JPNBELSLVKHM

ECU

RUS

LAOMKDCOLNERISR

IND

UZB

ARG

CHE

MKD

DEU

POL

PANNIC

PER

BRATUR

NLD

CHLPAK

IDN

RUS

ARG

INDRUSJPNNOR

CHN

DNKMEXJPN

CHLMEX

COLJPN

JORFRAESTVENMKDSWZ

RUS

NZLJPN

LVA

KHMFRA

LBNPHL

BFA

CMR

SWEDNK

USA

JORGTMCANCOLLTUCOL

MEXMWI

LAO

RUS

HRV

MEX

LVA

KAZ

JPN

MNGLSOAUTBRA

DEU

MYSESPVENRUSGBR

IDN

IDN

NLDZARSEN

TZA

COL

FRA

BOL

PERDNKGHA

IND

SRBVEN

BRABEN

IDN

MEX

SRBVNM

SRBEST

PHLNIC

JPN

DEUJPN

MEX

HRVSYRTZAVENPHL

LVA

RUS

LVA

IND

ITACHLGHACOL

PER

VEN

COLARE

ECUHRV

TUR

BOL

HNDUGA

MKD

MEXSLVLVABRA

LKA

CHL

SWZAZE

SVN

TURCMRBELNICMEXFRAPRT

KGZ

LVA

VENCZE

NZLNICLVAESPGHA

CHLESTIND

SYR

AREESPTZAINDUKR

LTUEGY

PER

LAO

HND

HUN

JPN

ECUPHLSYR

HND

SYR

DEU

NICJOR

VENNOR

ZAF

AUS

IND

VNM

NGA

HRVKENNZL

ARG

NGABGRMARDNKJPN

LVA

MEXBRA

COLNGA

GAB

MNGMAR

GRC

UKR

MNG

BRA

NAMSLV

ROM

PAN

PER

CHN

ARMCMR

ZAFMDAGEO

UKR

PER

MOZ

RUS

CMR

NAMHNDLSOHRV

SVK

CMRECUSEN

RUS

ZWE

MYSCOLPRY

ZWE

NERMDGPER

LVA

PHL

MYSMNG

CHN

PER

BLZDOM

IDN

GTM

URY

ZMB

ZMB

BRA

MEX

LAO

TZAPER

ECU

BFA

IND

LBN

PAN

THA

NICGHAKHM

COL

SRBLAO

IDN

HND

CHN

NAM

COL

BEN

ARG

IDN

ZAR

KEN

PRY

IND

-2-1

01

2L

n(R

egio

nal G

DP

per

cap

ita

-4 -2 0 2 4Years of Education

coef = .28573483, (robust) se = .01343762, t = 21.26

Page 71: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Code Country Type of Data Source Available link

ALB Albania GDP Data from HDR 2002ARE United Arab Emirates GDP Data from HDR 1997 in arabicARG Argentina GDP 1990‐2001 Data from Ministry of interior http://www.ec.gba.gov.ar/Estadistica/FTP/pbg/pbg3.html

ARM Armenia Expenditure National Statistics Office http://www.armstat.am/file/article/marz_07_e_22.pdf

AUS Australia GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

AUT Austria GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

AZE Azerbaijan Income National Statistics Office http://74.125.47.132/search?q=cache:http://www.azstat.org/statinfo/budget_households/en/003.shtml

BEL Belgium GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

BEN Benin GDP Data from HDR 2007/2008 and 2003BFA Burkina Faso GDP Data from HDR for GDP per capita.BGD Bangladesh NABGR Bulgaria GDP Data from HDR 2003, 2002 and 2001BIH Bosnia and Herzegovina GDP National Statistics Offices http://www.bhas.ba/Arhiva/2007/brcko/PODACI%201‐08.pdf

BLZ Belize Expenditure Data from LSMS 2002 http://www.statisticsbelize.org.bz/dms20uc/dm_filedetails.asp?action=d&did=13

BOL Bolivia GDP National Statistics Office http://www.ine.gov.bo/indice/visualizador.aspx?ah=PC0104010201.HTM

BRA Brazil GDP National Statistics Office http://www.ibge.gov.br/home/estatistica/economia/contasregionais/2002_2005/contasregionais2002 2005.pdf

CAN Canada GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

CHE Switzerland GDP National Statistics Office http://www.bfs.admin.ch/bfs/portal/fr/index/infothek/lexikon/bienvenue___login/blank/zugang lexikon.Document.20896.xls

CHL Chile GDP National Statistics Office http://www.bcentral.cl/publicaciones/estadisticas/actividad‐economica‐gasto/aeg07a.htm

CHN China GDP Data from National Statistics Yearbooks 2006, 2002, 1998 and 1996 http://www.stats.gov.cn/english/statisticaldata/yearlydata/YB1998e/C3‐8E.htm

CMR Cameroon Expenditure National Statistics Officehttp://www.statistics‐cameroon.org/archive/ECAM/ECAM2001/survey0/data/ECAM2001/Documentation/ECAM%20II%20‐%20Rapport%20principal.pdf

COL Colombia GDP National Statistics Office http://www.dane.gov.co/index.php?option=com_content&task=category&sectionid=33&id=148&Itemid=705

CRI Costa Rica NACUB Cuba Wages Monthly wages from HDR 1996CZE Czech Republic GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

DEU Germany GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

DNK Denmark GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

DOM Dominican Republic GDP National Statistics OfficeECU Ecuador GDP National Statistics OfficeEGY Egypt GDP Data from HDRs 2008, 2005, 2004, 2003, 2001; data from 2006 excludedESP Spain GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

EST Estonia GDP Data from National Statistics Office

http://pub.stat.ee/px‐web.2001/I_Databas/Economy_regional/23National_accounts/01Gross_Domestic_product_(GDP)/14Regional_gross_domestic_product/14Regional_gross_domestic product asp

FIN Finland GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

FRA France GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

GAB Gabon Expenditure Data from HDR 2005GBR United Kingdom GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

GEO Georgia GDP Data from HDR 2002. http://www.undp.org.ge/nhdr2001‐02/chpt1.htm

GHA Ghana Income Data from Living Standards Measurement Survey Reports for 1998/9 and 1991/2 http://siteresources.worldbank.org/INTLSMS/Resources/3358986‐1181743055198/3877319‐1190221709991/G3report.pdf

GRC Greece GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

GTM Guatemala GDP Data from HDR 2007/2008 annex http://cms.fideck.com/userfiles/desarrollohumano.org/File/8012264236003654.pdf

HND Honduras GDP Data from HDR 2006HRV Croatia GDP Data from National Statistics Office http://www.dzs.hr/default_e.htm

HUN Hungary GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

IDN Indonesia GDP Data from National Statistics Office http://www.bps.go.id/sector/nra/grdp/table1.shtml

IND India GDP National Statistics Office http://mospi.nic.in/6_gsdp_cur_9394ser.htm

IRL Ireland GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

IRN Iran GDP Data from National Statistics Office http://www.sci.org.ir/content/userfiles/_sci_en/sci_en/sel/year85/f21/CS_21_4.HTM

ISR Israel GDP National Statistics OfficeITA Italy GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

JOR Jordan GDP Data from HDR 2004JPN Japan GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

KAZ Kazakhstan Income LSMS 1996, World Bank http://siteresources.worldbank.org/INTLSMS/Resources/3358986‐1181743055198/3877319‐1181930718899/finrep1.pdf

KEN Kenya GDP Data from HDRs for 2006, 2005, 2003, 2001 and 1999KGZ Kyrgyz Republic GDP Data from HDR 2005, 2001

APPENDIX OF DATA SOURCES:  REGIONAL GDP

Page 72: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Code Country Type of Data Source Available link

KHM Cambodia Expenditure Data from Poverty profile of Cambodia 2004; Daily consumption  http://www.mop.gov.kh/Situationandpolicyanalysis/PovertyProfile/tabid/191/Default.aspx

KOR Korea, Rep. GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

LAO Lao PDR C+I+G Data from HDR 2006; Consumption, Investment and Government ExpenditureLBN Lebanon GDP Data from HDR 2001

LKA Sri Lanka GDP Data from HDR 1998 and National Statistics Office http://www.cbsl.gov.lk/pics_n_docs/08_statistics/_docs/xls_real_sector/table1.17.xls

LSO Lesotho GDP Data from HDR 2006

LTU Lithuania GDP Data from National Statistics Office http://db1.stat.gov.lt/statbank/SelectVarVal/Define.asp?MainTable=M2010210&PLanguage=1&PXSId=0&ShowNews=OFF

LVA Latvia GDP National Statistics Office

http://data.csb.gov.lv/Dialog/varval.asp?ma=02‐02a&ti=2‐2.+GROSS+DOMESTIC+PRODUCT+BY+STATISTICAL+REGION,+CITY+AND+DISTRICT&path=../DATABASEEN/ekfin/Annual%20statistical%20data/02.%20Gross%20domestic%20product/&lang=1

MAR Morocco GDP + Expenditure Data from HDR 1999, 2003 and Enquete Nationale sur la Consommation et les Depenses des Menages 2000/2001

MDA Moldova Wages Data from 2007 Statistical Yearbook; monthly salary http://www.statistica.md/public/files/Yearbook/Venit_pop_1999_2006_en.doc

MDG Madagascar GDP Data from HDR 2003, 2000MEX Mexico GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

MKD Macedonia, FYR GDP Data from National Statistics Office http://www.stat.gov.mk/english/statistiki_eng.asp?ss=09.01&rbs=2

MNG Mongolia GDP National Statistics OfficeMOZ Mozambique GDP Data from HDR 2007, 2001MWI Malawi Expenditure Data from Malawi INTEGRATED HOUSEHOLD SURVEY 2004‐2005 and 1998MYS Malaysia GDP Data from Chapter 5 of EIGHTH MALAYSIA PLAN 2001 ‐ 2005 http://www.epu.jpm.my/new%20folder/development%20plan/RM8.htm

NAM Namibia Expenditure Data from Namibia Household Income & Expenditure Survey 2003/2004; data is expendituhttp://www.npc.gov.na/publications/prenhies03_04.pdfNER Niger GDP Data from HDR 2004NGA Nigeria Income 2006 Annual Abstract of Statistics.  http://nigerianstat.gov.ng/annual_report.htm

NIC Nicaragua Expenditure Data from HDR 2002NLD Netherlands GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

NOR Norway GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

NPL Nepal GDP Data from HDR 2004, 2001 and 1998NZL New Zealand GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

PAK Pakistan GDP Data from HDR 2003PAN Panama GDP Data from National Statistics Office http://www.contraloria.gob.pa/dec/

PER Peru GDP Cuentas Nacionales del Peru, Producto Bruto Interno por Departmentos 2001‐2006 http://www1.inei.gob.pe/biblioineipub/bancopub/est/lib0763/cuadros/c037.xls

PHL Philippines GDP National Statistics OfficePOL Poland GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

PRT Portugal GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

PRY Paraguay GDP Data from Atlas de Desarrollo Humano Paraguay 2007 http://www.undp.org.py/dh/?page=atlas

ROM Romania GDP Data from National Statistics Office http://www.insse.ro/cms/files/pdf/en/cp11.pdf

RUS Russia GDP National Statistics Officehttp://translate.google.com/translate?ie=UTF8&oe=UTF&u=http://www.gks.ru/bgd/regl/b07_14p/IssWWW.exe/Stg/d02/10‐02.htm&hl=en&ie=UTF8&sl=ru&tl=en

SEN Senegal GDP Data from HDR 2001SLV El Salvador GDP Data from HDR 2007/2008, 2005, 2003, 2001; 1996 values were in 1994 pricesSRB Serbia Income Data from National Statistics Municipal Database http://www.statserb.sr.gov.yu/Pod/epok.asp

SVK Slovak Republic GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

SVN Slovenia GDP Data from National Statistics Office http://www.stat.si/eng/novica_prikazi.aspx?id=1318

SWE Sweden GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

SWZ Swaziland GDP Data from HDR 2008SYR Syrian Arab Republic GDP Data from HDR 2005THA Thailand GDP Data from Statistical Year Book Thailand 2002 http://web.nso.go.th/eng/en/pub/pub0.htm

TUR Turkey GDP National Statistics Office http://www.turkstat.gov.tr/VeriBilgi.do?tb_id=56&ust_id=16

TZA Tanzania GDP National Statistics OfficeUGA Uganda GDP Data from HDR 2007UKR Ukraine GDP Data from National Statistics Office http://www.ukrstat.gov.ua/operativ/operativ2008/vvp/vrp/vrp2008_e.htm

URY Uruguay GDP Data from HDR 2005USA United States GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx

UZB Uzbekistan GDP Data from HDR 2007/8, 2000 and 1998VEN Venezuela GDP Data from HDR 2000VNM Vietnam Wages National Statistics Office http://www.gso.gov.vn/Modules/Doc_Download.aspx?DocID=2097

ZAF South Africa GDP National Statistics Office  (table 16) http://www.statssa.gov.za/publications/statsdownload.asp?PPN=P0441&SCH=4048

ZAR Congo, Dem. Rep. GDP Data from HDR 2008ZMB Zambia GDP Data from HDR 2007 and 2003ZWE Zimbabwe GDP Data fom HDR 2003

Page 73: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Code Country Source Available Link

ALB Albania NAARE United Arab Emirates Ministry of Economy, 2005 Census http://www.economy.ae/English/economicandstatisticreports/statisticreports/pages/census2005.aspx

ARG Argentina Education Policy and Data Center (EPDC) http://epdc.org/

ARM Armenia Education Policy and Data Center (EPDC) http://epdc.org/

AUS Australia National Statistics Office http://www.abs.gov.au/ 

AUT Austria Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

AZE Azerbaijan Education Policy and Data Center (EPDC) http://epdc.org/

BEL Belgium Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

BEN Benin Education Policy and Data Center (EPDC) http://epdc.org/

BFA Burkina Faso Education Policy and Data Center (EPDC) http://epdc.org/

BGD Bangladesh Education Policy and Data Center (EPDC) http://epdc.org/

BGR Bulgaria Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

BIH Bosnia and Herzegovina Education Policy and Data Center (EPDC) http://epdc.org/

BLZ Belize Education Policy and Data Center (EPDC) http://epdc.org/

BOL Bolivia Education Policy and Data Center (EPDC) http://epdc.org/

BRA Brazil Integrated Public Use Microdata Series International  (IPUMS) https://international.ipums.org/international/ 

CAN Canada National Statistics Office http://www40.statcan.gc.ca/l01/cst01/educ43a‐eng.htm

CHE Switzerland Swiss Labor Force Survey (SLFS) SFSO http://www.bfs.admin.ch/bfs/portal/de/index/themen/15/04/ind4.informations.40101.401.html

CHL Chile National Statistics Office http://espino.ine.cl/CuadrosCensales/apli_excel.asp

CHN China National Statistics Office http://www.stats.gov.cn/ndsj/information/nj97/C091A.END

CMR Cameroon Education Policy and Data Center (EPDC) http://epdc.org/

COL Colombia National Statistics Office http://190.25.231.246:8080/Dane/tree.jsf

CRI Costa Rica Education Policy and Data Center (EPDC) http://epdc.org/

CUB Cuba NACZE Czech Republic Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

DEU Germany Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

DNK Denmark National Statistics Office http://www.statbank.dk/statbank5a/SelectVarVal/Define.asp?Maintable=RASU1&PLanguage=1

DOM Dominican Republic Education Policy and Data Center (EPDC) http://epdc.org/

ECU Ecuador National Statistics Office http://190.95.171.13/cgibin/RpWebEngine.exe/PortalAction?&MODE=MAIN&BASE=ECUADOR21&MAIN=WebServerMain.inl

EGY Egypt Education Policy and Data Center (EPDC) http://epdc.org/

ESP Spain Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

EST Estonia National Statistics Officehttp://pub.stat.ee/px‐web.2001/Dialog/varval.asp?ma=PC414&ti=ECONOMICALLY+ACTIVE+POPULATION+BY+AGE,+EDUCATIONAL+ATTAINMENT+AND+ETHNIC+NATIONALITY*&path=../I Databas/Population census/06Economically active population/ =1

FIN Finland Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

FRA France Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

GAB Gabon Education Policy and Data Center (EPDC) http://epdc.org/

GBR United Kingdom Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

GEO Georgia National Statistics Office (special request of data)GHA Ghana Education Policy and Data Center (EPDC) http://epdc.org/

GRC Greece Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

GTM Guatemala Education Policy and Data Center (EPDC) http://epdc.org/

HND Honduras Education Policy and Data Center (EPDC) http://epdc.org/

HRV Croatia National Statistics Office http://www.dzs.hr/Eng/censuses/Census2001/Popis/E01_01_07/E01_01_07.html

HUN Hungary Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

IDN Indonesia Education Policy and Data Center (EPDC) http://epdc.org/

IND India Education Policy and Data Center (EPDC) http://epdc.org/

IRL Ireland Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

IRN Iran NAISR Israel Integrated Public Use Microdata Series International  (IPUMS) https://international.ipums.org/international/ 

ITA Italy Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

JOR Jordan Education Policy and Data Center (EPDC) http://epdc.org/

JPN Japan National Statistics Office http://www.e‐stat.go.jp/SG1/chiiki/ToukeiDataSelectDispatchAction.do

KAZ Kazakhstan Education Policy and Data Center (EPDC) http://epdc.org/

KEN Kenya Education Policy and Data Center (EPDC) http://epdc.org/

KGZ Kyrgyz Republic Education Policy and Data Center (EPDC) http://epdc.org/

KHM Cambodia Education Policy and Data Center (EPDC) http://epdc.org/

KOR Korea, Rep. NALAO Lao PDR Education Policy and Data Center (EPDC) http://epdc.org/

LBN Lebanon Ministry of Social Affairs http://www.cas.gov.lb/images/PDFs/Educational%20status‐2004.pdf

LKA Sri Lanka Education Policy and Data Center (EPDC) http://epdc.org/

APPENDIX OF DATA SOURCES:  REGIONAL YEARS OF EDUCATION

Page 74: NATIONAL BUREAU OF ECONOMIC RESEARCH HUMAN ......We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, Francisco Queiro, and Nicolas Santoni for dedicated

Code Country Source Available Link

LSO Lesotho Education Policy and Data Center (EPDC) http://epdc.org/

LTU Lithuania National Statistics Office http://db.stat.gov.lt/sips/Dialog/varval.asp?ma=gs_dem17en&ti=Population+by+educational+attainment+and+age+group%A0%28aged+10+years+and+over%29&path=../Database/cen en/p71en/demography/ =2

LVA Latvia National Statistics Office http://data.csb.gov.lv/Dialog/varval.asp?ma=tsk03a&ti=EDUCATIONAL+ATTAINMENT+OF+POPULATION&path=../DATABASEEN/tautassk/Results%20of%20Population%20Census%202000%20in%20brief/=1

MAR Morocco Education Policy and Data Center (EPDC) http://epdc.org/

MDA Moldova Education Policy and Data Center (EPDC) http://epdc.org/

MDG Madagascar Education Policy and Data Center (EPDC) http://epdc.org/

MEX Mexico Education Policy and Data Center (EPDC) http://epdc.org/

MKD Macedonia, FYR Education Policy and Data Center (EPDC) http://epdc.org/

MNG Mongolia Integrated Public Use Microdata Series International  (IPUMS) https://international.ipums.org/international/ 

MOZ Mozambique Education Policy and Data Center (EPDC) http://epdc.org/

MWI Malawi Education Policy and Data Center (EPDC) http://epdc.org/

MYS Malaysia Integrated Public Use Microdata Series International  (IPUMS) https://international.ipums.org/international/ 

NAM Namibia Education Policy and Data Center (EPDC) http://epdc.org/

NER Niger Education Policy and Data Center (EPDC) http://epdc.org/

NGA Nigeria Education Policy and Data Center (EPDC) http://epdc.org/

NIC Nicaragua Education Policy and Data Center (EPDC) http://epdc.org/

NLD Netherlands Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

NOR Norway National Statistics Office http://statbank.ssb.no/statistikkbanken/Default_FR.asp?PXSid=0&nvl=true&PLanguage=1&tilside=selecttable/hovedtabellHjem.asp&KortnavnWeb=utniv

NPL Nepal Education Policy and Data Center (EPDC) http://epdc.org/

NZL New Zealand National Statistics Office http://wdmzpub01.stats.govt.nz/wds/ReportFolders/reportFolders.aspx

PAK Pakistan Education Policy and Data Center (EPDC) http://epdc.org/

PAN Panama Education Policy and Data Center (EPDC) http://epdc.org/

PER Peru Education Policy and Data Center (EPDC) http://epdc.org/

PHL Philippines Education Policy and Data Center (EPDC) http://epdc.org/

POL Poland Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

PRT Portugal Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

PRY Paraguay National Statistics Office http://celade.cepal.org/cgibin/RpWebEngine.exe/EasyCross?&BASE=CPVPRY2002&ITEM=INDICADO&MAIN=WebServerMain.inl

ROM Romania Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing 

RUS Russian Federation National Statistics Officehttp://74.125.65.132/translate_c?hl=en&ie=UTF‐8&sl=ru&tl=en&u=http://www.perepis2002.ru/index.html%3Fid%3D15&prev=_t&usg=ALkJrhiZr6thPp3doxH9mXdDZgf‐DA1fyw

SEN Senegal Education Policy and Data Center (EPDC) http://epdc.org/

SLV El Salvador VI Censo de la Poblacion y V de Vivienda 2007 http://www.digestyc.gob.sv/cgibin/RpWebEngine.exe/Crosstabs 

SRB Serbia National Statistics Office http://webrzs.statserb.sr.gov.yu/axd/en/Zip/CensusBook4.zip

SVK Slovak Republic National Statistics Office http://px‐web.statistics.sk/PXWebSlovak/DATABASE/En/02EmploMarket/01EconPopActiv/EA_total.px

SVN Slovenia National Statistics Office http://www.stat.si/pxweb/Database/Census2002/Administrative%20units/Population/Education/Education.asp

SWE Sweden National Statistics Office

http://www.ssd.scb.se/databaser/makro/SubTable.asp?yp=tansss&xu=C9233001&omradekod=UF&huvudtabell=Utbildning&omradetext=Education+and+research&tabelltext=Population+16‐74+years+of+age+by+highest+level+of+education,+age+and+sex.+Year&preskat=O&prodid=UF0506&starttid=1985&stopptid=2007&Fromwhere=M =2&langdb=2

SWZ Swaziland Education Policy and Data Center (EPDC) http://epdc.org/

SYR Syrian Arab Republic Education Policy and Data Center (EPDC) http://epdc.org/

THA Thailand Education Policy and Data Center (EPDC) http://epdc.org/

TUR Turkey National Statistics Office http://www.tuik.gov.tr/isgucueng/Kurumsal.do

TZA Tanzania Education Policy and Data Center (EPDC) http://epdc.org/

UGA Uganda Education Policy and Data Center (EPDC) http://epdc.org/

UKR Ukraine National Statistics Office http://stat6.stat.lviv.ua/PXWEB2007/Database/POPULATION/1/06/06.asp

URY Uruguay National Statistics Office http://www.ine.gub.uy/microdatos/engih2006/persona.zip,USA United States National Statistics Office http://factfinder.census.gov/servlet/DatasetMainPageServlet?_program=ACS&_submenuId=&_lang=en&_ts=

UZB Uzbekistan Education Policy and Data Center (EPDC) http://epdc.org/

VEN Venezuela Integrated Public Use Microdata Series International  (IPUMS) https://international.ipums.org/international/ 

VNM Vietnam Education Policy and Data Center (EPDC) http://epdc.org/

ZAF South Africa National Statistics Office

http://www.statssa.gov.za/timeseriesdata/pxweb2006/Dialog/varval.asp?ma=Highest%20level%20of%20education%20grouped%20by%20province&ti=Table:+Census+2001+by+province,+highest+level+of+education+grouped,++population+group+and+gender.+&path=../Database/South%20Africa/Population%20Census/Census%202001%20‐%20NEW%20Demarcation%20boundaries%20as%20at%209%20December%202005/Provincial%20level%20‐%20Persons/=1

ZAR Congo, Dem. Rep. Education Policy and Data Center (EPDC) http://epdc.org/

ZMB Zambia Education Policy and Data Center (EPDC) http://epdc.org/

ZWE Zimbabwe Education Policy and Data Center (EPDC) http://epdc.org/