national bureau of economic research human ......we are grateful to nicolas ciarcia, nicholas...
TRANSCRIPT
NBER WORKING PAPER SERIES
HUMAN CAPITAL AND REGIONAL DEVELOPMENT
Nicola GennaioliRafael La Porta
Florencio Lopez-de-SilanesAndrei Shleifer
Working Paper 17158http://www.nber.org/papers/w17158
NATIONAL BUREAU OF ECONOMIC RESEARCH1050 Massachusetts Avenue
Cambridge, MA 02138June 2011
We are grateful to Nicolas Ciarcia, Nicholas Coleman, Sonia Jaffe, Konstantin Kosenko, FranciscoQueiro, and Nicolas Santoni for dedicated research assistance over the past 5 years. We thank GaryBecker, Nicholas Bloom, Vasco Carvalho, Edward Glaeser, Gita Gopinath, Josh Gottlieb, ElhananHelpman, Chang-Tai Hsieh, Mathew Kahn, Pete Klenow, Robert Lucas, Casey Mulligan, Elias Papaioannou,Jacopo Ponticelli, Giacomo Ponzetto, Jesse Shapiro, Chad Syverson, David Weil, and seminar participantsat the UCLA Anderson School, Harvard University, University of Chicago, and NBER for extremelyhelpful comments. Gennaioli thanks the Barcelona Graduate School of Economics and the EuropeanResearch Council for financial support. Shleifer thanks the Kauffman Foundation for support. Theviews expressed herein are those of the authors and do not necessarily reflect the views of the NationalBureau of Economic Research.
NBER working papers are circulated for discussion and comment purposes. They have not been peer-reviewed or been subject to the review by the NBER Board of Directors that accompanies officialNBER publications.
© 2011 by Nicola Gennaioli, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer. Allrights reserved. Short sections of text, not to exceed two paragraphs, may be quoted without explicitpermission provided that full credit, including © notice, is given to the source.
Human Capital and Regional DevelopmentNicola Gennaioli, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei ShleiferNBER Working Paper No. 17158June 2011, Revised September 2011JEL No. L26,O11,O43,O47,R11
ABSTRACT
We investigate the determinants of regional development using a newly constructed database of 1569sub-national regions from 110 countries covering 74 percent of the world’s surface and 96 percentof its GDP. We combine the cross-regional analysis of geographic, institutional, cultural, and humancapital determinants of regional development with an examination of productivity in several thousandestablishments located in these regions. To organize the discussion, we present a new model of regionaldevelopment that introduces into a standard migration framework elements of both the Lucas (1978)model of the allocation of talent between entrepreneurship and work, and the Lucas (1988) modelof human capital externalities. The evidence points to the paramount importance of human capitalin accounting for regional differences in development, but also suggests from model estimation andcalibration that entrepreneurial inputs and human capital externalities are essential for understandingthe data.
Nicola GennaioliCREIUniversitat Pompeu FabraRamon Trias Fargas 25-2708005 Barcelona (Spain)[email protected]
Rafael La PortaDartmouth CollegeTuck School210 Tuck HallHanover, NH 03755and [email protected]
Florencio Lopez-de-SilanesEDHEC Business School393, Promenade des Anglais BP 311606202 Nice Cedex 3FRANCEand [email protected]
Andrei ShleiferDepartment of EconomicsHarvard UniversityLittauer Center M-9Cambridge, MA 02138and [email protected]
An online appendix is available at:http://www.nber.org/data-appendix/w17158
2
I. Introduction.
We investigate the determinants of regional development using a newly constructed database
of 1569 sub-national regions from 110 countries covering 74 percent of the world’s surface and 96
percent of its GDP. We consider a variety of fundamental determinants of economic development,
such as geography, natural resource endowments, institutions, human capital, and culture, by looking
within countries. We combine this analysis with an examination of labor productivity and wages in
several thousand establishments covered by the World Bank Enterprise Survey, for which we have both
establishment-specific and regional data. Throughout the analysis, human capital measured using
education emerges as the most consistently important determinant of both regional income and
productivity of regional establishments. The combination of regional and establishment-level data
enables us to investigate some of the key channels through which human capital operates, including
education of workers, education of entrepreneurs/managers, and externalities.
To organize this discussion, we present a new model describing the channels through which
human capital influences productivity, which combines three features. First, human capital of workers
enters as an input into the neoclassical production function, but human capital of the
entrepreneur/manager influences firm-level productivity independently. The distinction between
entrepreneurs/managers and workers has been shown empirically to be critical in accounting for
productivity and size of firms in developing countries (Bloom and Van Reenen 2007, 2010; La Porta and
Shleifer 2008; Syverson 2011). In the models of allocation of talent between work and entrepreneurship
such as Lucas (1978), Baumol (1990), and Murphy, Shleifer, and Vishny (1991), returns to
entrepreneurial schooling may appear as profits rather than wages. By modeling this allocation, we
trace these two separate contributions of human capital to productivity.
Second, our approach allows for human capital externalities, emphasized in the regional context
by Jacobs (1969), and in the growth context by Lucas (1988, 2008) and Romer (1990). These
3
externalities result from people in a given location spontaneously interacting with and learning from
each other, so knowledge is transmitted across people without being paid for. Because our framework
incorporates both the allocation of talent between entrepreneurship and work as in Lucas (1978), and
human capital externalities as in Lucas (1988), we call it the Lucas-Lucas model2. Human capital
externalities have been found in a variety of development and regional contexts (Rauch 1993, Glaeser,
Scheinkman and Shleifer 1995, Angrist and Acemoglu 2000, Glaeser and Mare 2001, Moretti 2004,
Iranzo and Peri 2009), although Ciccone and Peri (2006) and Caselli (2005) find them to be unimportant.
By decomposing human capital effects into those of worker education, entrepreneurial/managerial
education, and externalities using a unified framework, we try to disentangle different mechanisms.
Third, because we are looking at the regions, we need to consider the mobility of firms, workers,
and entrepreneurs across regions, which is presumably less expensive than that across countries. To
this end, our model follows the standard urban economics approach (e.g., Roback 1982, Glaeser and
Gottlieb 2009) of labor mobility across regions with scarce resources, such as land and housing, limiting
universal migration into the most productive regions. This aspect of the model allows us to consider
jointly the education coefficients in regional and establishment level regressions. A key benefit of our
model of the three channels of influence of education is to reconcile regional and firm-level evidence.
To begin, we use regional data to examine the determinants of regional income in a
specification with country fixed effects. Our approach follows development accounting, as in Hall and
Jones (1999), Caselli (2005), and Hsieh and Klenow (2010). Among the determinants of regional
productivity, we consider geography, as measured by temperature (Dell, Jones, and Olken 2009),
distance to the ocean (Bloom and Sachs 1998), and natural resources endowments. We also consider
2 We do not consider the role of human capital in shaping the adoption of new technologies. Starting with Nelson
and Phelps (1966), economists have argued that human capital accelerates the adoption of new technologies. For recent models of these effects, see Benhabib and Spiegel (1994), Klenow and Rodriguez-Clare (2005), and Caselli and Coleman (2006). For evidence on such externalities and R&D spillovers, see Coe and Helpman (1995), Ciccone and Papaioannou (2009), and Wolff (2011).
4
institutions, which have been found by King and Levine (1993), De Long and Shleifer (1993), Hall and
Jones (1999), and Acemoglu et al. (2001) to be significant determinants of development. We also look
at culture, measured by trust (Knack and Keefer 1997) and at ethnic heterogeneity (Easterly and Levine
1997, Alesina et al. 2003). Last, we consider the effect of average education in a region on its level of
development. A substantial cross-country literature points to a large role of education. Barro (1991)
and Mankiw, Romer, and Weil (1992) are two early empirical studies; de La Fuente and Domenech
(2006) and Cohen and Soto (2007) are recent confirmations. Across countries, the effects of education
and institutions are difficult to disentangle empirically because both are endogenous and because the
potential instruments for each are correlated with both (Glaeser et al 2004). By using country fixed
effects, we can avoid identification problems caused by unobserved country-specific factors.
We find that favorable geography, such as lower average temperature and proximity to the
ocean, as well as higher natural resource endowments, are associated with higher per capita income in
regions within countries. We do not find that culture, as measured by ethnic heterogeneity or trust,
explains regional differences. Nor do we find that institutions as measured by survey assessments of
the business environment in the Enterprise Surveys help account for cross-regional differences within a
country. Some institutions or culture may matter only at the national level, but then large income
differences within countries call for explanations other than culture and institutions. In contrast,
differences in educational attainment account for a large share of the regional income differences
within a country. The within country R2 in the univariate regression of the log of per capita income on
the log of education is about 25 percent; this R2 is not higher than 8 percent for any other variable.
Acemoglu and Dell (2010) examine sub-national data from North and South America to
disentangle the roles of education and institutions in accounting for development. The authors find that
about half of the within-country variation in levels of income is accounted for by education. This is
similar to the Mankiw et al. (1992) estimate that half of the differences in per capita incomes across
5
countries is attributable to education. We confirm a large role of education, but try to go further in
identifying the channels. Acemoglu and Dell also conjecture that institutions shape the remainder of
the local income differences. We have regional data on several aspects of institutional quality, but find
that their ability to explain cross-regional differences is minimal3.
Regional regressions have the problem that human capital in a region may be endogenous
because of migration. To make progress on this front, we next examine the determinants of firm-level
productivity. To this end, we merge our data with World Bank Enterprise Surveys, which provide
establishment-level information on sales, labor force, educational level of management and employees,
as well as energy and capital use for several thousand establishments in the regions for which we have
data. The micro data show that establishments with employees and managers with more education are
more productive, holding capital and energy inputs constant and including regionXindustry fixed effects.
These data point to a large role of managerial/entrepreneurial human capital in raising firm productivity.
We then re-estimate the same equations including regional education, but replacing regionXindustry
fixed effects with country and industry fixed effects. In these regressions, regional education has a large
and positive coefficient, pointing to sizeable human capital externalities that survive controls for several
potential determinants of regional productivity. However, because regional education may be
correlated with unobserved region-specific productivity parameters, we do not have perfect
identification in this analysis of externalities.
To assess the extent to which firm-level results can account for the role of human capital across
regions, we combine estimation with calibration. To circumvent the endogeneity problems entailed in
cross country regressions, scholars have turned to calibration techniques, using micro studies to pin
3 A recent literature looks at colonial history within countries, and argues that regions that were treated
particularly badly by colonizers have poor institutions and lower income many years later (Banerjee and Iyer 2005, Dell 2010). It is surely possible, even likely, that severe institutional shocks have long run consequences because they influence human capital accumulation and institutions in the long run. But to the extent that we have adequate measures of institutional quality, the consequences of such shocks for modern institutions do not appear to add a great deal of explanatory power to understanding cross-regional evidence today.
6
down the parameters of the production function (e.g., Caselli 2005). We also rely on previous research
regarding factor shares (e.g., Gollin 2002, Caselli and Feyer 2007, Valentinyi and Herrendorf 2008), but
then combine it with coefficient estimates from regional and firm-level regressions. This allows us to
back out the returns to schooling and the role of externalities. We find substantial consistency between
regional and firm-level results, as well as plausible estimates of the parameters of the production
function. Our calibrations show that worker education, entrepreneurial education, and externalities all
substantially contribute to productivity. We find the role of workers’ human capital to be in line with
standard wage regressions, which are the benchmark adopted by conventional calibration studies (e.g.
Caselli 2005). Crucially, however, our results indicate that focusing on worker education alone
substantially underestimates both private and social returns to education. Private returns are very high
but to a substantial extent are earned by entrepreneurs, and hence might appear as profits rather than
wages, consistent with Lucas (1978). Although we have less confidence in the findings for externalities,
our best estimates suggest that those are also sizeable and fall in the ballpark of existing estimates (e.g.
Moretti 2004). In sum, the evidence points to a large influence of entrepreneurial human capital, and
perhaps of human capital externalities, on productivity.
In section II, we present a model of regional development that organizes the evidence. In
section III, we describe our data. Section IV examines the determinants of both national and regional
development. Section V presents firm-level evidence to disentangle the channels of through which
human capital influences productivity. We combine the regional and the micro estimates to compute
the model’s parameters and to assess its ability to explain income differences. Section VI concludes.
II. A Lucas-Lucas spatial model of regional and national income
A country consists of a measure 1 of regions, a share p of which has productivity ̃ and a share
1– p of which has productivity ̃ ̃ . We refer to the former regions as “productive”, to the latter
7
regions as “unproductive”, and denote them by i = G, B. A measure 2 of agents is uniformly distributed
across regions. An agent j enjoys consumption and housing according to the utility function:
jj acacu
1),( , (1)
where c and a denote consumption and housing, respectively. Half the agents are “rentiers,” the
remaining half are “labourers’’. Each rentier owns 1 unit of housing, T units of land, K units of physical
capital (and no human capital). Each labourer is endowed with hR++ units of human capital. In region i
= G, B the distribution of human capital is Pareto in [h,+∞), with mean μih/(μi–1), where μi, h>1. We
denote by Hi = μih/(μ–1) the initial human capital endowment of region i = G, B. Differences in Hi
capture exogenous variation of human capital across regions.
A labourer can become either an entrepreneur or a worker. By operating in region i, an
entrepreneur with human capital h who hires physical capital Ki,h, land Ti,h, and workers with total
human capital Hi,h produces an amount of the consumption good equal to:
hihihiihi TKHhAy ,,,1
, , 1 . (2)
As in Lucas (1978), a firm’s output increases, at a diminishing rate, in the entrepreneur’s human capital
has well as in Hi,h, Ki,h and Ti,h. We model human capital externalities (Lucas 1988) by assuming that
regional total factor productivity is given by:
iiii LhEAA )(~
, γ> 0, ψ ≥ 1. (3)
According to (3), holding region-specific factors ̃ constant (e.g., geography, institutions), productivity
increases in the region’s human capital. In expression (3), Ei(h) is the average level of human capital in
region i and Li is the measure of labour in that region. Parameter ψ captures the importance of the
quality of human capital: when ψ = 1 only the total quantity of human capital Hi = Ei(h)Li matters for
externalities; as ψ rises the quality of human capital becomes relatively more important than quantity.
Parameter γ captures the overall importance of externalities. To ease notation, we suppress the
8
dependence of productivity on regional human capital until we examine the spatial equilibrium, in which
productivity Ai is endogenously determined with regional sorting of labourers.
Rentiers rent land and physical capital to firms, and housing to entrepreneurs and workers. In
region i, each rentier earns λiT and ηi by renting land and housing, where λi and ηi are rental rates, and
ρiK by renting physical capital. A region’s land and housing endowments T and 1 are immobile; physical
capital is fully mobile. Labourers use their human capital in work or in entrepreneurship. By operating
in region i, a labourer with human capital h earns either profits πi(h) as an entrepreneur or wage income
wi∙h as a worker, where wi is the wage rate. All labourers, whether they become entrepreneurs or
workers, are partially mobile: a labourer moving to region i loses φwi units of income, where φ<h.4
At t = 0, a labourer with human capital h selects the location and occupation that maximize his
income. The housing market clears, so houses are allocated to each region’s labour. At t = 1,
entrepreneurs hire land, human, and physical capital. Production is carried out and distributed in wages,
land rental, capital rental, housing rental and profits. Consumption takes place.
A spatial equilibrium is a regional allocation iiWi
Ei KHH ,, of entrepreneurial human capital
EiH , workers’ human capital W
iH , and physical capital Ki such that: a) entrepreneurs hire workers,
physical capital, and land to maximize profits, b) labourers optimally choose location, occupation and
the fraction of income devoted to consumption and housing, and c) capital, labour, land and housing
markets clear. Because physical capital is fully mobile, there is a unique rental rate ρ. Since land and
housing are immobile, their rental rates λi and ηi vary across regions depending on productivity and
population. To determine the sorting of labourers across regions and their choice between work and
entrepreneurship within a region, we must compute regional wages wi and profits πi(hj). To do so, we
first determine regional output and factor returns at a given allocation iiWi
Ei KHH ,, . Second, we solve
4 For simplicity, we assume that moving costs are a redistribution from migrants to locals (the latter may be viewed
as providing moving services) and are non-rival with the time spent working. This ensures that the human capital employed in a region, as well as the aggregate income of laborers, do not depend on moving costs.
9
for the equilibrium allocation. Throughout the analysis, the price of consumption is normalized to one.
Production and occupational choice
An entrepreneur with human capital h operating in region i maximizes his profit by solving:
hiihihiihihihiiKTHTKHwTKHhA
hihihi,,,,,,
1
,, ,,,
max , (4)
implying that in each region firms employ factors in the same proportion. Since at iiWi
Ei KHH ,, firm j
employs a share of entrepreneurial capital hj/EiH , it hires the others factors according to:
.,, ,,, THh
TKHh
KHHh
H Ei
jjiiE
i
jji
WiE
i
jji (5)
As in Lucas (1978), more skilled entrepreneurs run larger firms.
Equation (5) implies that the aggregate regional output is given by:
TKHHAY iWi
Eiii
1. (6)
Equation (6) allows us to determine wages, profits, and capital rental rates as a function of regional
factor supplies via the usual (private) marginal product pricing. That is:
.///
,///)1(
,///
1
1
iiWii
Eii
i
i
Ei
Eii
Ei
WiiE
i
ii
Wi
Wii
Wi
EiiW
i
ii
KTKHKHAKY
HTHKHHAHY
HTHKHHAHY
w
(7)
Thus, profit πi(h) is equal to πi (the marginal product of the entrepreneur’s human capital in region i),
times the entrepreneur’s human capital h, namely πi(h) = πi∙h.
Using Equation (7) we can solve for a labourer’s occupational choice. A labourer j with human
capital hj chooses to be an entrepreneur if πi∙hj>wi∙hj and a worker if πi∙hj<wi∙hj. In equilibrium,
labourers must be indifferent between the two occupations (i.e., πi=wi), which implies:
10
iWii
Ei HHHH
1,
11
, (8)
where Wi
Eii HHH is total human capital in region i. E
iH increases with the share of the total private
return to human capital earned by entrepreneurs [i.e. with (1–α–β–δ)/(1–β–δ)]. Equation (8) describes
the allocation of labour within in a region from the total quantities of human and physical capital (Hi,Ki).
The spatial equilibrium: consumption, housing and mobility
We consider symmetric spatial equilibria in which all productive regions share the same factor
allocation (HG,KG), the same wage wG and rental rates λG and ηG, and unproductive regions share the
same allocation (HB,KB), wage wB, and rentals λB and ηB. The uniformity of the capital rental ρ across
regions pins down the allocation of physical capital as a function of the regional allocation of human
capital. By exploiting equations (7) and (8) one finds that ρ is constant across regions provided:
11
11
B
G
B
G
B
G
HH
AA
KK , (9)
Intuitively, the physical capital stock allocated to productive regions increases in their productivity
advantage AG/AB and in their relative human capital stock HG/HB. Remember that the productivity
differential in (10) is endogenously determined through externalities, as modelled in (3).
To find the allocation of human capital, we must characterize labour mobility by computing the
utility that labourers obtain from operating in different regions. Labourers maximize their utility in (2)
by devoting a share θ of their income to housing and the remaining share (1 – θ) to consumption. Since
the aggregate income of labourers in region i is equal to wiHi, the demand for housing in the regions is
θ∙wiHi/ηi. With the unitary supply of housing, the housing rental rate is equal to:
ηi= θ∙wi∙Hi. (10)
As a consequence, the utility (gross of moving costs) of a labourer in region i is equal to:
11
i
i
i
iiw H
hwhwacu 1
, ),( , (11)
which rises with the wage and falls with regional human capital Hi due to higher rents.
To describe the spatial equilibrium, we need to find the ratio between the wages paid in
productive and unproductive regions as a function of their human capital. Using the wage equation (7),
the interest rate equalization condition (8), and the expression for externalities (3), we find that:
111
)()(
~~
G
B
BB
GG
B
G
B
G
HH
LhELhE
AA
ww
(12)
Ceteris paribus, the wage is higher in productive regions. A higher human capital stock has a negative
effect on the wage because of diminishing returns but once externalities are taken into account the net
effect is ambiguous. In the remainder we assume:
A.1 1~~
B
G
B
G
HH
AA ,
which implies that the autarky wage and interest rates are higher in productive regions, so that both
capital and labour tend to move there. We can then prove the following (in Appendix 1):
Proposition 1 Under the parametric restriction:
(β – ψγ)(1 – θ) + θ(1 – δ)> 0, (13)
there is a stable equilibrium allocation HG and HB. In this allocation:
a) There is a cutoff hm such that agent j migrates from an unproductive to a productive region if
and only if hj ≥ hm. The cutoff hm increases in the mobility cost φ.
b) Denote by BG HpHpH )1( the aggregate human capital. Then, when φ = 0, the
equilibrium level of human capital in region i is independent of the region’s initial human capital
endowment. In particular, for ψ = 1 the full mobility allocation satisfies:
12
H
AE
AHH Gfree
GG
)1()1)((1
)1()1)((1
~
. (14)
When φ > 0 and ψ ≥ 1, we have that HG< freeGH~ and HG increases in HG holding H constant.
Since wages (and profits) are higher in the productive than in the unproductive regions, labour
migrates to the former from the latter. The cutoff rule in a) is intuitive: more skilled people have a
greater incentive to pay the migration cost because the wage (or profit) gain they experience from doing
so is higher. Even if mobility costs are zero, migration to the more productive regions is not universal.
This is due to the limited supply of land T, which causes decreasing returns in production, and to the
limited supply of housing, which implies that migration causes housing costs to rise until the incentive to
migrate disappears. Regional externalities moderate the adverse effect of fixed supplies of land and
housing on mobility. In fact, for migration to be interior, condition (13) must be met, which requires
external effects ψγ to be sufficiently small relative to: i) the diminishing returns β due to land and ii) the
sensitivity θ of house prices to regional human capital.
In equilibrium, wages are higher in the more productive regions, wG>wB, but the housing rental
rate is also higher there, ηG>ηB. While (14) shows that under costless migration the human capital
employed in a region only depends on its productivity, when mobility is imperfect (i.e., φ > 0) a region
exogenously endowed with more human capital will employ more human capital in equilibrium. This
property will be important for the empirical analysis. When ψ = 1 and φ = 0, national output is equal to:
TKHHHAY WE
1, (15)
where A
is a function ),,,~,~,,,( pAAA BG
of exogenous parameters. More generally, under
condition (13) the Lucas-Lucas model yields the following equation for firm level output:
jijijijiiiji TKHhLhEAy ,,,
1, )(~ , (16)
13
and the following equation for regional output:
TKHHLhEAY iWi
Eiiiii )()()(~ 1 .
(17)
Empirical Predictions of the Model
To obtain predictions on the role of schooling, we need to specify a link between human capital
(which we do not observe) and schooling (which we do observe). We follow the Mincerian approach in
which for an individual j the link between human capital and schooling is:
jjj Sh exp , (18)
where Sj ≥ 0 and μj ≥ 0 are two random variables (distributed to ensure that the distribution of hj is
Pareto). The return to schooling μi varies across individuals, potentially due to talent. This allows us to
estimate different returns to schooling for workers and entrepreneurs. Card (1999) offers some
evidence of heterogeneity in the returns to schooling. Human capital in region i is then equal to
0,0
),()(
S i
S
h i dSdSgehhdF , where )(hdFi is the density of region i labourers’ with human
capital h and ),( Sgi is the density of region i labourers having human capital Seh , so that
iS i LdSdSg 0,0),(
. In line with macro studies, in our regressions we express average human
capital in the region as a first order expansion around the mean Mincerian return and years of schooling:
ii Si ehE
)( , (19)
where iS is average schooling while i is the average Mincerian return, both computed in region i.
Regional Income Differences
To test Equation (17) we need to specify a regression in terms of observables, which entails
regressing regional per capita income on human capital and population (we do not have regional data
14
on physical capital). From the Equation for ρ in (7) we obtain the condition Ki=B
11
11
ii HA where B >
0 is a constant. Substituting this condition as well as Equation (19) into Equation (17) we find that:
ln(Yi/Li) = C + [1/(1 – δ)]ln ̃ + [1+ γψ –β/(1 – δ)] i iS + [γ – β/(1 – δ)]lnLi, (20)
where C is a constant absorbed by the country fixed effect. If mobility is costly (φ>0), our model predicts
that regional human capital should vary with the initial exogenous endowment Hi and that regional
income differences are also triggered by fixed housing supply. Estimating (20) using OLS implies that the
coefficient on average regional schooling should be interpreted as the product of the “technological”
parameter (1+ γψ – β) and the nation-wide average of the regional Mincerian returns i .
The coefficient [γ – β/(1 – δ)] on population Li captures the benefit γ of increasing regional
workforce in terms of externalities minus the cost β of crowding the fixed land supply. A similar
interpretation holds with respect to the schooling coefficient (1+ γψ – β). The estimation of Equation
(20) raises a serious concern: since in our model human capital migrates to more productive regions, any
mismeasurement of regional productivity Ai may contaminate the coefficient of regional human capital.
We deal with this issue in two steps. First, we control in regression (20) for proxies of Ai. Second, we
compare these results to the coefficients obtained from the firm level regressions and to the calibration
exercises performed by the development accounting literature. These comparisons allow us to assess
the severity of the endogeneity problem in the estimation of (20).
Firm-Level Productivity
In (16), the output of a firm j operating in region i depends on the human capital hE,j of the
entrepreneur, as determined by his schooling SE,j and return to schooling ijE , , and on the average
human capital E(hW,j) of workers, which again we approximate by jWjW Se ,, (where jW , and jWS , are
15
average values in the firm’s workforce). Ceteris paribus, in our model entrepreneurs have a higher
return to schooling than workers because in region i an entrepreneur with schooling S is someone
whose return satisfies iES he ,
, where iEh , is the human capital threshold for becoming an
entrepreneur in region i. At a schooling level S, the entrepreneurial class includes talented labourers
whose return satisfies ShS iEiE /ln)( ,, while labourers with )(, SiE become workers.
By writing Equation (16) in terms of firm-level output per worker yi,j/li,j and by exploiting the
expressions for entrepreneurs’ and workers human capital, we obtain the prediction:
ln(yi,j/li,j) = ln ̃ + (1–α–β–δ) jE , SE,ij + α jW , jWS , +
(1–α–β–δ)ln(E
jil , /li,j) + αln( Wjil , /li,j)+δlnki,j +βlnti,j + γlnLi + γψ i iS , (21)
Where xi,j = Xi,j/li,j denote per-worker values, E
jil , /li,j and W
jil , /li,j capture the share of a firm’s total
employment on managerial and non-managerial jobs, respectively. Taken literally, in our model output
per-worker is equalized across firms within a region. More realistically, though, output per-worker is
equalized across firms ex-ante, but its ex-post value varies as a result of stochastic ex-post changes in
the values of firm level TFP and inputs. This is the variation we appeal to when estimating (21).5
When estimating (21) using OLS, the coefficient on the entrepreneur’s schooling should be
interpreted as the product of entrepreneurs’ rents (1–α–β–δ) and a nation-wide average Mincerian
entrepreneurs’ return to education E . The coefficient on workers’ average schooling should be
interpreted as the labour share α times a nation-wide average Mincerian returns to workers W . The
5 Formally, suppose that if ex-ante a firm hires Xi,j units of a productive factor, this results in Xi,j = εX∙ Xi,j units of
the same factor being employed in production ex-post, where εX is a random shock to the value of inputs (e.g. an unpredictable change in the value of equipment, in the size of the workforce and so on). Given the Cobb-Douglas production function, the firm’s ex-ante optimization problem (occurring with respect to the ex-ante inputs Xi,j) does not change with respect to Equations (4) and (5). The only change is that a firm’s productivity also includes expectations of the random factors εX. Crucially, this formulation implies that ex-ante returns are equalized, ex-post returns are not, which allows us to estimate (21) insofar as our input measures captures the ex-pot values Xi,j.
16
coefficient on regional schooling should be interpreted as the product of the externality parameter γψ
and the population-wide average Mincerian return .6
The estimation of (21) allows us to separate the role of the “low human capital” of workers from
the “high human capital” of entrepreneurs in shaping firm productivity. Since the selection of talented
entrepreneurs into more productive regions/industries may contaminate our results, we first estimate
(21) by controlling for the full set of regionXindustry dummies. In addition, we estimate (21) by directly
controlling for region specific variables to assess the importance of externalities from the coefficient γ
on population and the coefficient γψ on regional schooling. In this case, however, migration of human
capital to more productive regions may give the false impression of positive externalities, creating the
identification issue present in estimating (21). We deal with this problem by controlling for proxies for
Ai and by comparing our estimation results to standard calibrations of externalities.
III. Data.
Our analysis is based on measures of income, geography, institutions, infrastructure, and culture
in up to 110 (out of 193 recognized sovereign) countries for which we found regional data on either
income or education. Almost all countries in the world have administrative divisions.7 In turn,
administrative divisions may have different levels. For instance a country may be divided into states or
provinces, which are further subdivided into counties or municipalities. For each variable, we collect
data at the highest administrative division available (i.e., states and provinces rather than counties or
municipalities) or, when such data does not exist, at the statistical division (e.g. the Eurostat NUTS in
6Both the regional level Equation (20) and the firm level Equation (21) imply that the average return to schooling
should vary across regions. One way to empirically account for such possibility is to run random coefficient regressions. We have performed this analysis and the results change very little (the results on human capital become slightly stronger). We do not report the results to save space. 7 The exceptions are Cook Islands, Hong Kong, Isle of Man, Macau, Malta, Monaco, Niue, Puerto Rico, Vatican City,
Singapore, and Tuvalu.
17
Europe) that is closest to it. Because we focus on regions, and typically run regressions with country
fixed effects, we do not include countries with no administrative divisions in the sample.
The reporting level for data on income, geography, institutions, infrastructure, and culture
differs across variables. GDP and education are typically available at the first-level administrative
division (i.e., states and provinces). In contrast, GIS geo-spatial data on geography, climate, and
infrastructure is typically available for areas as small as 10 km2. Finally, survey data on institutions and
culture are typically available at the municipal level. In our empirical analysis, we aggregate all variables
for each country to a region from the most disaggregated level of reporting available.8 To illustrate, we
have GDP data for 27 first-level administrative regions in Brazil, corresponding to its 26 states plus the
Federal District, but survey data on institutions for 248 municipalities. For our empirical analysis, we
aggregate the data on institutions by taking the simple average of all observations for establishments
located in the same first-level administrative division. Similarly, we aggregate the GIS geo-spatial data
on geography, climate, and infrastructure at the first-administrative level using the Collins-Bartholomew
World Digital Map.
The final data set has 1,569 regions in 110 countries: (1) 79 countries have regions at the first-
level administrative division; and (2) 31 countries have regions at a more aggregated level than the first-
administrative level because one or several variables (often education) are unavailable at the first-
administrative level. For example, Ireland has 34 first-level divisions (i.e., 29 counties and 5 cities), but
publishes GDP per capita data for 8 regions and education for 2 regions. Thus, we aggregate all the Irish
data to match the 2 regions for which education statistics are available. The data Appendix identifies
8 We used a variety of aggregation procedures. Specifically, we computed population-weighted averages for GDP
per capita and years of schooling. We computed regional averages for temperature, precipitation, distance to coast, travel time, and soil characteristics by first summing the (average) values of the relevant variable for all grid cells lying within a region and then dividing by the number of cells lying within a region. We computed regional averages for the density (e.g., power lines) and natural resources variables (oil and gas) by first summing the relevant variable for all grid cells within a region and then dividing by the region’s population. We averaged the responses within a region for all the variables from the Enterprise and World Value Surveys. We sum up the number of unique ethnic groups and computed the probability that people within a region speak the same language based on the total number of people in each “language” area.
18
the reporting level for the regions in our dataset. As noted earlier, all countries have administrative
divisions (although 31 countries in our sample report statistics for statistical regions). The principal
constraint on the sample is the availability of human capital data. Of course, all countries have periodic
censuses and thus have sub-national data on human capital, but these data are hard to find.
Figure 1 portrays the 1,569 regions in our sample. It shows that coverage is extensive outside of
North and sub-Saharan Africa. Sample coverage is strongly related to a country’s surface area,
presumably because very small countries do not report regional data. For example, the smallest country
in our dataset is Lebanon (10,400 km2), leaving out of our sample some very prosperous countries such
as Luxembourg (2,590 km2) and Singapore (699 km2). Among countries ranked by their surface area, we
only have data for 14% of the first (smallest) 50 countries, 44% of the first 100 countries, and 53% of the
first 150 countries. Similarly, sample coverage rises with the absolute level of GDP but not with GDP per
capita. For example, we have data for 18% the first 50 countries ranked in terms of GDP in 2005, 38% of
the first 100 countries, and 49% of the first 150 countries. The comparable figures for countries ranked
on the basis of GDP per capita are 52%, 57% and 57%, respectively. Since sample coverage rises with
GDP, it turns out that the countries in our sample account for 97% of world GDP in 2005.
Our final dataset has regional income data for 107 countries in 2005, drawn from sources
including National Statistics Offices and other government agencies (42 countries), Human Development
Reports (36 countries), OECDStats (26 countries), the World Bank Living Standards Measurement Survey
(Ghana and Kazakhstan), and IPUMS (Israel). When regional income data for 2005 is missing, we use
log-linear interpolation based on as much data as it is available for the period 1990-2008 or, when
interpolation is not possible, the closest available year.9 Our measure of regional income per capita is
typically based on value added but we use data on income (6 countries), expenditure (8 countries),
wages (3 countries), gross value added (2 countries), and consumption, investment and government
9 We are missing regional GDP per capita for Bangladesh and Costa Rica and national GDP per capita in PPP terms
for Cuba.
19
expenditure (1 country) to fill-in missing values. We measure regional GDP in current purchasing-
power-parity dollars as we lack data on regional price indexes. To ensure consistency with the national
GDP figures reported by World Development Indicators, we adjust regional income values so that --
when weighted by population-- they total GDP at the country level. Not surprisingly, adjustments
exceeding 20% were necessary in 19 out of the 20 countries for which use GDP proxies rather than
actual GDP. Adjustments exceeding 20% were also necessary in 13 countries (Democratic Republic of
Congo, Lebanon, Lesotho, Madagascar, Malaysia, Nepal, Niger, Philippines, Senegal, Swaziland, Syria,
Uganda, and Venezuela) where our GDP data are in real terms.
We compute regional income per capita using population data from Thomas Brinkhoff: City
Population, which collects official census data as well as population estimates for regions where official
census data are unavailable.10 We adjust these regional population values so that their sum matches
the country’s population in the World Development Indicators database. This adjustment exceeds 10%
for 6 countries: Bangladesh (+13%), Benin (+11%), Democratic Republic of Congo (-10.5%), Gabon (-
25%), Swaziland (+16%) and Uzbekistan (-22%).
In addition, we examine productivity and its determinants using establishment-level data from
the Enterprise Survey for as many as 53,957 establishments in 82 countries and 539 of the regions in our
sample.11 We collect operating data on sales, cost of raw materials, cost of labor, cost of electricity, and
the cost of communications. We also collect data on the book value of property, plant, and equipment.
Critically, the Enterprise Survey keeps track of the highest educational attainment of the establishment’s
top manager as well as of its workers. Finally, we collect the two-digit ISIC code (e.g., food, textiles,
chemicals, etc.) of the establishments in our sample. Like the economic census and business registry
10
We also used data from OECDStats (for Denmark, Greece, Ireland, Italy, and the UK) and the National Statistics Office of Macedonia. 11
The Enterprise Survey data was collected between 2002 and 2009. When data from the Enterprise Survey for one of the countries in our sample are available for multiple years, we use the most recent one.
20
data, the Enterprise Survey data only covers registered establishments. A limitation of the Enterprise
Survey data is that it largely excludes OECD countries (Ireland and Mexico are the exceptions).
We relate regional economic development to five sets of potential determinants: (1)
geography, (2) education, (3) institutions, (4) infrastructure, and (5) culture. To narrow down the list of
candidate variables, we restrict attention to variables that are available at the regional level for at least
40 countries and 200 regions.
We use three measures of geography and natural resources obtained from the WorldClim
database, which are available for all regions of the world. They include the average temperature during
the period 1950-2000, the (inverse) average distance between the cells in a region and the nearest
coastline, and the estimated volume of oil production and reserves in the year 2000.12
We gather data on the educational attainment of the population 15 years and older for 106
countries and 1491 regions from EPDC Data Center (55 countries), Eurostat (17 countries), National
Statistics Offices (27 countries) and IPUMS (7 countries); see the data appendix for sources. We collect
data on school attainment during the period 1990-2006 and use data for the most recently available
period. We compute years of schooling following Barro and Lee (1993). Specifically, we use UNESCO
data on the duration of primary and secondary school in each country and assume: (a) zero years of
school for the pre-primary level, (b) 4 additional years of school for tertiary education, and (c) zero
additional years of school for post-graduate degrees. We do not use data on incomplete levels because
it is only available for about half of the countries in the sample. For example, we assume zero years of
additional school for the lower secondary level. For each region, we compute average years of schooling
as the weighted sum of the years of school required to achieve each educational level, where the
weights are the fraction of the population aged 15 an older that has completed each level of education.
12
The results in the paper are robust to controlling for the standard deviation of temperature, the average annual precipitation during the period 1950-2000, the average output for multiple cropping of rain-fed and irrigated cereals during the period 1960-1996, the estimated volume of natural gas production and reserves in year 2000, and dummies for the presence of various minerals in the year 2005.
21
To illustrate these calculations consider the Mexican state of Chihuahua. The EPDC data on the
highest educational attainment of the population 15 years and older in Chihuahua in 2005 shows that
4.99% of the that population had no schooling, 13.76% had incomplete primary school, 22.12% had
complete primary school, 5.10% had incomplete lower secondary school, 23.04% had complete lower
secondary school, 17.94% had complete upper secondary school, and 13.05% had complete tertiary
school. Next, based on UNESCO’s mapping of the national educational system of Mexico, we assign six
years of schooling to people who have completed primary school and 12 years of schooling to those that
have completed secondary school. Finally, we calculate the average years of schooling in 2005 in
Chihuahua as the sum of: (1) six years times the fraction of people whose highest educational
attainment level is complete primary school (22.12%), incomplete lower secondary (5.1%), or complete
lower secondary school (23.04%); (2) 12 years times the fraction of people whose highest attainment
level is complete upper secondary school (17.94%); and (3) 16 years times the fraction of people whose
highest attainment level is complete tertiary school. Accordingly, we estimate that the average years of
schooling of the population 15 and older in Chihuahua in 2005 is 7.26 years
(=6*0.5026+12*0.1794+16*0.1305).
We compute years of schooling at the country-level by weighting the average years of schooling
for each region by the fraction of the country’s population 15 and older in that region. The correlation
between this measure and the number of years of schooling for the population 15 years and older in
Barro and Lee (2010) is 0.9. For the average (median) country in our sample, the number of years of
schooling in Barro and Lee (2010) is 8.18 vs. 6.88 in ours (8.56 vs.6.92 years). Two factors could
potentially explain why the Barro-Lee dataset yields a higher level of educational attainment than ours:
(1) Barro-Lee captures incomplete degrees while we do not; and (2) education levels have increased
rapidly over time but some of our educational attainment data is stale (e.g. for 14 countries our
educational attainment data is for the year 2000 or earlier). To make the Barro and Lee (2010) measure
22
of educational attainment more comparable to ours, we make two adjustments to their data. First, we
apply our methodology to the Barro-Lee dataset and compute the level of educational attainment in
2005. After this first adjustment, the level of educational attainment computed with the Barro-Lee
dataset for the average (median) country in our sample drops to 7.07 (7.23). Second, we apply our
methodology to the Barro-Lee dataset but –rather than use data for 2005 -- use figures for the year that
best matches the year in our dataset. After this second adjustment, the level of educational attainment
using the Barro-Lee dataset for the average (median) country in our dataset drop further to 6.95
(7.22).13 Since most of our results are run with country-fixed effects, country-level biases in our
measure of human capital do not affect our results.14
We gather data on seven measures of the quality of institutions from the Enterprise Survey and
the Sub-national Doing Business Reports. The Enterprise Survey covers as many as 80 of the countries
and 410 of the regions in our sample.15 The Enterprise Survey asked business managers to quantify: (1)
informal payments in the past year (percent of sales spent in informal payments by a typical firm in the
respondent’s industry), (2) the number of days spent in meeting with tax authorities in the past year, (3)
the number of days without electricity in the previous year, and (4) security costs (cost of security
equipment, personnel, or professional security service as a percentage of sales). The Enterprise Survey
also asks managers to rate a variety of obstacles to doing business, including: (1) access to land, and (2)
13
After the second adjustment, there are 5 countries (i.e., Great Britain, Poland, Switzerland, Syria, and Uruguay) for which our educational attainment numbers remain 25% or more above the adjusted Barro-Lee numbers, and 12 countries (i.e., Armenia, Bangladesh, Benin, Bolivia, Cambodia, Honduras, Laos, Morocco, Niger, Pakistan, Senegal, and Sri Lanka) for which our numbers remain 25% or more below the adjusted Barro-Lee numbers. In all but two of these 17 cases (Great Britain and Poland are the two exceptions), data sources differ (our data for these two countries comes from household or individual surveys and theirs from national censuses). For Great Britain we have 12.14 years of schooling, as does the OECD, while Barro-Lee has 9.21. For Poland, we have 11.15 years of schooling while Barro-Lee has 9.65 and the OECD has 10.55. 14
Results for our cross-country regressions are qualitatively similar if we use educational attainment from Barro-Lee (2010) rather than the population-weighted average of regional values. 15
The main reason why we have fewer regions with measures of institutions than regions with productivity data is because we imposed a filter of a minimum of 10 establishments answering the particular institutions question. The rest of the discrepancy in the number of regions is because some questions about institutions were not included in the survey for some countries.
23
access to finance.16 For each of these obstacles to doing business, we keep track of the percentage of
the respondents that rate the item as a moderate, a major, or a very severe obstacle to business. The
final Enterprise Survey variable that we examine is the perception of government predictability
(measured as the percentage of respondents who tend to agree, agree in most cases, or fully agree that
government officials’ interpretations of regulations are consistent and predictable).
To make sure that our results on the importance of institutions are not driven by measurement
error, we also gather objective measures of the quality of institutions from the Sub-national Doing
Business Reports, which are available for 19 countries and 180 regions in our sample. We focus on the
number of procedures and their cost in four areas: starting a new business, enforcing contracts,
registering property, and dealing with licenses. Interestingly, variation in the cost of regulation swamps
the variation in the number of procedures. For example, there is no variation in the number of steps
required to enforce a contract in the 30 Chinese cities tracked by the Sub-National Doing Business
Report. However, the estimated time to enforce a contract ranges from 112 days in the city of Nanjing
(Jiangsu) to 540 days in the city of Changchun (Jilin). As it turns out, results using objective measures of
institutions are qualitatively similar to the results using subjective measures that we have described.
We use two measures of infrastructure. The first is the density of power lines in 1997 from the
US Geological Survey Global GIS database.17 The second measure is the average estimated travel time
between cells in a region and the nearest city of 50,000 people or more in the year 2000 from the Global
Environment Monitoring Unit. Both measures of infrastructure are available for all regions of the world.
Cultural variables are the last set of potential determinants of regional income that we examine.
We gather two proxies for cultural values and attitudes from the World Value Survey for as many as 75
16
From the Enterprise Survey, we also assembled data on the number of days in the past year with telephone outages, the percentage of sales reported to the tax authorities, and the confidence that the judicial system would enforce contracts and property rights in business. Results for these variables are available upon request. 17
Results using other density measures of infrastructure (e.g. air fields, highways, and roads) also available on the US Geological Survey Global GIS database are qualitatively similar.
24
of our sample countries and 745 of our regions.18,19 The first survey measure is the percentage of
respondents in each region that answer that “most people can be trusted” when asked whether
"Generally speaking, would you say that most people can be trusted, or that you can't be too careful in
dealing with people?" The second measure is a proxy for civic values based on whether each of the
following behaviors "can always be justified, never be justified or something in between.": (a) "claiming
government benefits which you are not entitled to"; (b) "avoiding a fare on public transport"; (c)
"cheating on taxes if you have the chance"; and (d) "someone taking a bribe" (Knack and Keefer, 1997).20
In addition, we gather two measures of fractionalization for up to 1,568 regions and 110 of our sample
countries. The first one is simply the number of ethnic groups that inhabited each region in 1964. The
second one is the probability that a randomly chosen person in a region shares the same mother
language with a randomly chosen people from the rest of the country in 2004.
Finally, in addition to running regressions using regional data, we examine GDP per capita at the
country level, which come from World Development Indicators. All the other country-level variables in
the paper are computed based on our regional data rather than drawn from primary sources.
Specifically, the country-level analogs of our regional measures of education, geography, institutions,
public goods, and culture are the area- and population-weighted averages of the relevant regional
variables, as appropriate.
18
We set to missing World Value Survey data for five countries (France, Japan, Philippines, Russia, and the United States) because they are only available at a very coarse level. 19
The World Value Survey was collected between 1981 and 2005. When data from the World Value Survey for one of the countries in our sample are available for multiple years, we use the most recent one. 20
We also examined proxies for confidence in various institutions (government, parliament, armed forces, education, civil service, police, and justice), for what is important in people’s lives (family, friends, leisure, politics, work, and religion) as well as for characteristics valued in children (determination, faith, hard work, imagination, independence, obedience, responsibility, thrift, and unselfishness). Moreover, we also examined proxies for broad cultural attitudes with regards to authority (percent who think that one must always love and respect one’s parents regardless of their qualities and faults), tolerance for other people (percent who select tolerance and respect for other people as an important quality for children to learn), and family (percent who think that parents have a duty to do their best for their children even at the expense of their own well-being). Finally, we examined the percentage of respondents that participate in professional and civic associations. The results for these variables are qualitatively similar to the results for the WVS variables that we discuss in the text.
25
Table 1 describes our variables and Table 2 summarizes our data. For each variable we examine
in the regional regressions, it shows the number of regions for which we have the information, the
number of countries these regions are in, the median and the average number of regions per country,
and the median range and standard deviation within a country. The data show substantial income
inequality among regions within a country. On average, the ratio of the income in the richest region to
that in the poorest region is 4.41. This ratio is 3.74 in both Africa and Europe, 4.60 in North America,
5.61 in South America, and 5.63 in Asia. The country with the highest ratio of incomes in the richest to
that in the poorest region is Russia (43.3); the country with the lowest ratio is Pakistan (1.32).
Interestingly, this ratio is 5.16 for the United States, 2.59 for Germany, 1.93 for France, and 2.03 for
Italy. Italy has attracted enormous attention because of differences in income between its North and
its South, usually attributed to culture. As it turns out, Italian regional inequality is not unusual. We also
note that regional inequality of incomes within a country, as measured by the standard deviation of the
logarithm of per capita incomes, declines with income, perhaps because richer countries have more
equalizing policies (Figure 2).
There is likewise substantial inequality in education among regions within a country. On
average, the ratio of educational attainment in the richest region to that in the poorest region is 1.80.
This ratio is 2.71 for Africa, 1.68 for Asia, 1.16 for Europe, 1.33 for North America, and 1.81 for South
America. The highest ratio is in Burkina Faso (14.66), where education is 0.22 in the Sahel region and
3.20 in the Centre region. The lowest ratio is 1.05 in Ireland. One striking fact in this data is the much
great regional equality in the distribution of education in the richer than in the poorer countries. Figure
3 presents the evidence for the relationship between standard deviation of education levels in a country
and its per capita income. Such tendency to equality might follow from the more uniform educational
policies in richer countries, and may account for greater regional income equality in the richer countries.
26
The patterns of inequality among regions within countries is interesting for some of the other
variables as well. Table 2 shows that differences in endowments, such as temperature and distance to
coast, are small, which suggests that these variables will have difficulty in explaining regional differences
in per capita income. Density of power lines and travel time to the next big city varies a great deal
across regions, suggesting that urban theories of development might be helpful in explaining regional
inequality. There is also considerable variation across regions in the estimates of the quality of
institutions, which suggests that, at least in principle, there is a regional aspect to institutional quality
that could relate to differences in economic development.
IV. Accounting for National and Regional Productivity.
In this section, we present cross-country and cross-region evidence on the determinants of
productivity. We present national regressions only for comparison. These regressions are difficult to
interpret in our model because it is not possible to express national output in closed form. More
importantly, the problem of endogeneity of education is particularly severe in the national context.
With respect to regional income, our benchmark is Equation (20). As already mentioned, we have
measures of average education at the regional level, but we do not have either national or regional data
on physical capital (except for public infrastructure) or other inputs, so these variables only appear in
the firm-level regressions in Section V.
Table 3 presents our basic regional results in perhaps the most transparent way. It reports the
results of univariate regressions of regional GDP per capita on its possible determinants, all with country
fixed effects. Such specifications are loaded in favor of each variable seeming important since it does
not need to compete with any other variable. We report both the within country and between
countries R2 of these regressions. The first row shows that education explains 58% of between country
variation of per capita income, and 38% of within country variation of per capita income. Figure 4
27
shows, for Brazil, Colombia, India, and Russia, the striking raw correlation between regional schooling
and per capita income. Although several other variables explain a significant share of between country
variation, none comes close to education in explaining within country variation in income per capita.
Starting with geographical variables, temperature and inverse distance to coast – taken
individually – explain 27 and 13 percent of between country income variation, but 1 and 4 percent
respectively of within country variation. Oil explains a trivial amount of variation at either level.
Turning to institutions, some of the variables, such as access to finance or the number of days it takes to
file a tax return, explain a considerable share of cross-country variation, consistent with the empirical
findings at the cross-country level such as King and Levine (1993) or Acemoglu et al. (2001), but none
explains more than 2 percent of within country variation of per capita incomes. Indicators of
infrastructure or other public good provision do slightly better: on their own many explain a large share
of between country variation, while density of power lines and travel time account for up to 7% of
within country variation. These variables are obviously highly endogenous, and still do much worse than
education. Some of the cultural variables account for a substantial share of between country variation,
none account for much of within country variation. Of course, culture might operate at the national
rather than the sub-national level, although we note that much of the research on trust focuses on
regional rather than national differences (e.g., Putnam 1993). After presenting the regression results,
we try to explain why some of these variables do so poorly.
Tables 4 through 6 show the multivariate regression results at the national and regional level.
Table 4 presents the baseline regressions of national (Panel A) and regional (Panel B) per capita income
on geography and education, controlling in some instances for population or employment, as suggested
by our model. At the country level, temperature, inverse distance to coast, and oil endowment are all
highly statistically significant in explaining cross-country variation in incomes, and explain an impressive
28
50% of the variance. Education is also statistically significant, with a coefficient of .25, raising the R2 to
63%. Note that oil comes in positive and highly statistically significant.
As Panel B shows, the coefficients on geography and education continue to be significant at the
regional level. However, the within country R2 is much higher for education than for the geographic
variables. The coefficient on the regional labour force is now positive and statistically significant, and
ranges from .01 for population to .07 for employment. The coefficient on education is around .27.
Table 5 presents country-level regressions with measures of institutions (Panel A) and of
infrastructure and culture (Panel B) added to the specification in Table 4. Education remains highly
statistically significant in each specification, and its coefficient does not fall much. At the country level,
only the logarithm of tax days is statistically significant. The last two rows of Table 5 show the adjusted
R2 of each regression if we omit the institutional (or infrastructure or cultural) variable, as well as the
adjusted R2 if we omit education. Dropping education sharply reduces explanatory power, while the
only institutional variable that adds explanatory power is the logarithm of tax days.
Table 6 presents the corresponding results at the regional level. The education coefficient is
slightly higher than in Table 4, and is highly significant, as illustrated in Figure 5. Institutional variables
are almost never significant, and their incremental explanatory power is tiny. We find a small adverse
effect of ethnic heterogeneity on income at the regional level, although the incremental explanatory
power of all the institutional and cultural variables is small21.
We have conducted several robustness checks of these findings, and here summarize but do not
present the results. First, we eliminated regions that include national capitals from the regressions; the
results are not materially affected. Second, we included measures of regional population density in the
specifications; density is typically insignificant and other results are not importantly affected. Third, we
21
We have tested the robustness of these results using data on regional luminosity instead of per capita income (see Henderson, Storeygard, and Weil 2009). The results are highly consistent with the evidence we have described, both with respect to the importance of human capital, and the evidence of relative unimportance of other factors, in accounting for cross-regional differences.
29
have rerun all the regressions breaking down our aggregated educational attainment measure into
measures, for each level of educational attainment between 1 and 13 years, of the presence of
individuals at that level of education in that region. Educational variables continue to be highly
correlated with per capita income, and the coefficients increase monotonically with the level of
attainment. That is, having more people at a higher level of education is associated with higher income.
What are some of the possible explanations of the low explanatory power of institutions,
keeping in mind that endogeneity of institutions should if anything raise the coefficients? It is possible
that we have inappropriate measures of institutions, although the measures we have are commonly
considered to be relevant for economic outcomes. It is also possible that the measures from Enterprise
Surveys are particularly noisy, although one should remember that these are surveys of managers who
should be particularly focused on institutional constraints. In general, such subjective assessments
correlate much better with measures of development than objective measures of institutions (Glaeser
et al. 2004). It is also likely that at least some institutions only matter at the national level, if, for
example, the critical-to-development business activity is concentrated in the capital.
To shed further light on these issues, Table 7 presents at the national level (we have no regional
data) regressions in the same format as Table 5, but using standard measures of institutions, including
autocracy, constraints on the executive, expropriation risk, proportional representation, and corruption.
Except for proportional representation, all of these variables are highly statistically significant in these
specifications. However, with the exception of expropriation risk and corruption, both of which are
highly endogenous, the incremental explanatory power of institutional variables is minimal, and in most
cases much smaller than the incremental explanatory power of education. Perhaps the more important
point is that Enterprise Surveys do not cover rich countries. If we run the regressions in Table 7 for the
72 countries with data on informal payments, we find that proportional representation is insignificant,
autocracy and executive constraints are significant at only 10% level, expropriation risk is significant at
30
the 5% level, and corruption is significant at the 1% level. Critically, the value of estimated coefficients
falls, rather than standard errors rising. Our bottom line is that the weakness of institutional variables
results in part from different (and possibly but not definitely inferior) data, and in part from the focus on
poorer countries, for which institutional variables indeed matter less.
Due to potential migration of better educated workers to more productive regions, we cannot
interpret the large education coefficients - which appear to come through with a similar magnitude
across a range of specifications – as the causal impact of human capital on regional income. To address
this problem, we next estimate the role of human capital in the production function by looking at firm
level evidence based on Enterprise Surveys. By combining estimation and calibration, we then assess
the extent to which the role of human capital at the firm level can account for its role across regions.
V. Firm-Level Evidence.
In Tables 8-10, we turn to the micro evidence and estimate essentially Equation (21). We use
the Enterprise Survey data described in Section III. In establishment-level regressions, we try to
disentangle managerial/entrepreneurial human capital, worker human capital, and human capital
externalities. To address the concern that the sorting of entrepreneurs by unobservable skills into more
productive regions may contaminate our estimates of entrepreneurial returns to schooling as well as of
human capital externalities, in Table 8 we use an extremely flexible specification that controls for
region-industry interactions by including region X industry fixed effects and no regional education. This
enables us to focus on the effect of managerial/entrepreneurial education on productivity without
worrying about regional and sectoral productivity, which are subsumed in the fixed effects. In Tables 9
and 10 we then turn to an examination of human capital externalities, first with regional education only,
and then adding geographic variables. All standard errors are clustered at the regional level. We have
experimented with some indicators of regional institutions and infrastructure as independent variables,
31
but consistent with the findings for regional data, they are usually insignificant, and hence we do not
focus on these results.
We use three dependent variables to proxy for productivity. First, we look at the log of the
establishment sales per worker, yi,j/li,j. Second, we look at a rough measure of value added, namely the
logarithm of sales net out raw material inputs, per worker. Third, we run regressions with the log of
average wages paid by the establishment (which in our Cobb-Douglas production function correspond to
a constant fraction of output) as a dependent variable. We measure capital (which includes both land ti,j
and physical capital ki,j) by the log of property, plant and equipment per employee. As an alternative, we
proxy for physical capital by the log of expenditure on energy per employee. We also use the log of the
number of employees, which is a proxy of li,j, to control for the share of the entrepreneur’s labour E
jil , /li,j
and of the workers’ labour W
jil , /li,j. Indeed, assuming that each firm has only one entrepreneur we have
Ejil , /li,j = 1/li,j and
Wjil , /li,j = (li,j-1)/li,j. Unfortunately, the regression coefficient of the log of employees is
not susceptible of a clean interpretation in terms of technological parameters.
Most important, to trace out the effects of human capital, we include the years of schooling of
the manager SE and the years of schooling of workers SW in Table 8, and subsequently the average years
of schooling in the region Si in Tables 9 and 10. As we explained in Section II, the Mincer model of the
relationship between education and human capital implies that schooling should enter the specification
in levels, rather than in logs. Accordingly, the regression coefficients of the schooling variables should
respectively capture parameters (1–α–β–δ) iE , , α iW , and γψ i in Equation (21). To capture scale
effects in regional externalities, in Tables 9 and 10 we control for the log of the region’s population Li.
The regression coefficient on this variable should capture γ in Equation (21). After presenting the basic
estimation results, we compute the values of these coefficients by combining estimation and calibration.
32
Estimation Results
Begin with the results in Table 8, which includes region-industry fixed effects (for 16 industries).
In the (log) sales per employee specification, the coefficient on energy per employee is .34, while that on
capital per employee is .30. These coefficients, however, are closer to .3 in the value added
specification, and to .2 in the average wage specifications. When both variables are included in the
specification, their sum is higher. Based on these estimates as well as on conventional calibration
parameters (see Caselli 2005), we use .35 as capital share when we calibrate the model and assess its
ability to account for variation in productivity across space.
The coefficient on worker schooling averages to about .03 in the sales per employee
specifications, roughly the same in the value added specifications, but is closer to .015 in the wage
specifications. The coefficient on management schooling is also about .03 in the sales per employee
specification, slightly lower in the valued added specification, but falls to about .02 in the wage
specification. The coefficient on the log employment (firm size) is about .1 across specifications. The
result that larger firms are more productive is at first glance inconsistent with the diminishing returns
specification, but it may reflect omitted organization-specific capital or other sources of higher
productivity in larger firms. Firm size might also capture some of the effects of managerial education,
since in our data larger firms are run by better educated managers (see also earlier results of Bloom and
Van Reenen 2007 and La Porta and Shleifer 2008 on the importance of managerial education).
Controlling for size may also help account for market power.
In Table 9, we include country and industry fixed effects, but add regional years of education to
the regressions. There are some changes in parameter estimates, but the coefficients on worker
education remain around .03, those on manager education likewise remain around .03, and capital
shares stay around .35, like in the specifications of Table 8. In addition, we find consistent evidence of
large effects on productivity from regional factors. The coefficient on regional schooling is amazingly
33
consistent and statistically significant across specifications, and varies between .05 and .1. The
coefficient on regional population varies across specifications, but we will take it to be around .09 based
on the results for sales per employee and value added per employee.
In our analysis of determinants of regional productivity, geographic variables, but not measures
of culture or institutions, have been consistently statistically significant. Accordingly, in Table 10 we
examine the robustness of the results in Table 9 by controlling for the geographic factors. Such controls
might also go some way toward enabling us to attribute the coefficient on regional human capital to
externalities rather than omitted regional productivity factors. The coefficients on geography variables
are quite unstable in these specifications, with inverse distance to coast exerting a large positive
influence on productivity in four specifications (but not the other eleven), and oil endowment exerting
now a large negative effect in some specifications. The most obvious proxies for omitted regional
productivity thus do not appear to be important. Critically, the coefficients on years of education of
managers and years of education of workers do not fall much relative to the specifications in Table 8,
indicating that returns to education of entrepreneurs remain high even with regional controls. The
coefficient on years of education in the region falls a bit in some specifications relative to its value in
Table 9. We will use the average estimate of .065 in our calculations.
We added additional controls to these regressions, and obtained similar results. This evidence
needs to be explored further, but most of the specifications confirm both the general findings, and
parameter estimates, computed from Tables 8 and 9. There does not appear to be much evidence of
significant omitted regional effects, although since we do not have a complete set of determinants of
regional productivity, our assessment of external effects might be exaggerated.
Combining Estimation with Calibration
34
Can the effects estimated from firm level regressions account for the large role of schooling in
the regional regressions? How do these effects compare with the work in development accounting?
We address these questions by starting with the roles of managers’ and workers’ schooling. The
coefficient on workers’ average schooling in the firm level regressions is about 0.03, which in our model
implies α W = 0.03. The standard calibration for the U.S. labour share is about α = 0.6. If however we
calibrate α = 0.55 to capture the fact that in developing countries the labour share tends to be lower
than in the U.S. (also because part of labour income goes to self employment, Gollin 2002), we obtain
W = 0.55, which is in the ballpark of micro evidence on worker Mincerian returns (Psacharopoulos
1994). The fact that Mincerian returns to education implied by our empirical evidence are consistent
with existing research suggests that our firm level productivity regressions help reduce identification
problems, at least as far as firm-level variables are concerned.
The regressions also point to an overall capital share (considering energy or equipment) roughly
equal to 0.35. In our model, this captures the income share δ + β going to K and T which leaves – under
constant returns – a share of 0.1 going to entrepreneurial rents. That is, (1–α–β–δ) = (1– 0.55– 0.35) =
0.1. Since the estimated coefficient on managerial education roughly implies (1–α–β–δ) E = 0.03, our
results are consistent with a Mincerian return E = .30 for entrepreneurs. The 30% estimate is higher
than those found by Goldin and Katz (2008) for returns to college education for workers, but lines up
with the few existing analyses of entrepreneurial education, which document substantially larger
returns of education for managers than for workers (Parker and van Praag 2005, van Praag et al. 2009).22
22
Using U.S. and Dutch individual-level data, these studies find that one extra year of schooling increases entrepreneurial income by 18% and 14%, respectively. This is much higher than the 3% found in our firm-level data (in our model entrepreneurial income is a constant share of a firm’s output), implying gigantic Mincerian returns under an entrepreneurial share of α = 0.1. Note, however, that these studies rely on small start-ups (in the Dutch data) or on self employed individuals (in the U.S. data). In both cases, the entrepreneurial share is likely to be higher than 0.1, moving Mincerian returns closer to our benchmark of 30%.
35
This preliminary assessment suggests that a neglected but critical channel through which
schooling and human capital affect productivity is via entrepreneurial inputs. Individuals selected into
entrepreneurship appear to be vastly more talented than workers, driving up productivity. Of course,
entrepreneurial talent may be more important than entrepreneurial schooling in explaining this finding.
Our analysis cannot adequately address this issue (which would require better data and an endogenous
determination of the connection between schooling and talent). Our analysis is, nevertheless, sufficient
to identify a critical role of management and entrepreneurship in determining productivity.
The large returns to entrepreneurial education, compared to the relatively small returns to
worker education, might explain the problem that the previous literature on development accounting
has experienced with the Mincer regressions (Caselli 2005, Hsieh and Klenow 2010). As we show below,
spatial differences in the stocks of human capital implied only by returns to worker education are
considerably lower than those implied by blended returns of workers and entrepreneurs. Including
entrepreneurial returns in the calculation thus substantially enhances the ability of human capital to
account for differences in productivity across space. Entrepreneurial returns might be ignored in
surveys focusing on wages as returns to education.
We can use the estimates of Tables 9 and 10 to assess the magnitude of human capital
externalities. The coefficient on population in Table 9, roughly equal to 0.09, suggests that γ is also
about 0.09. Interestingly, this assessment is consistent with the regional regressions in Table 4. In fact,
the coefficient on population is positive and roughly equal to 0.01. This implies that γ – β/(1 – δ) = 0.01
in Equation (21). Given that γ = 0.09 and β + δ = 0.35, this condition yields β to be roughly 0.06, which
happens to be in the ballpark of the land share estimated from income accounts (Valentinyi and
Herrendorf 2008). In sum, β = 0.06, δ=0.29 and γ = 0.09 are roughly consistent with both firm level and
regional regressions as well as with standard calibration exercises.
36
The coefficient on regional schooling in the firm-level regressions of Table 9 is about .065. This
implies (given γ = 0.09) that ψ is about .72. To separate the effect of the population-wide Mincerian
return from the strength of the externalities ψ, we exploit the regional regressions. According to
Equation (20) describing regional output, in these regressions the coefficient on schooling is equal to [1+
γψ –β/(1 – δ)] . In Table 4, this coefficient is about 0.26. Since we have already accepted γ = 0.09 and
β/(1 – δ) = 0.08 as reasonable estimates, we are left with two equations with two unknowns, namely (1+
0.09ψ – 0.08) = 0.26 and ψ = 0.72. These equations imply an average population-wide Mincerian
return of about 0.21 (which is in between our estimates of workers’ and entrepreneurs’ values) and
that the average education externality parameter ψ is about 3.45. That is, a given increase in regional
human capital generates 3.45 times more externalities if it is due to an increase in the average amount
of human capital rather than to a higher number of people with average education. These estimates
point to a large effect of schooling for productivity via social interactions or R&D spillovers, consistent
with Lucas (1985, 2008) as well as with the literature in urban economics cited in the introduction.
Finally, note that at the above parameter values and at a reasonable share of housing consumption of θ=
0.4, the spatial equilibrium is stable, since (β – ψγ)(1 – θ) + θ(1 – δ) = – (0.25)(0.6) + (0.4)(0.74)>0.
We can now use our results to assess the magnitude of the effect of schooling. Begin with the
role of workers’ and entrepreneurs’ inputs. In regional regressions, the population-wide Mincerian
return of 0.21 is needed to make sense of the data, while the firm level regressions suggest that the
Mincerian return is 6% for workers and 30% for entrepreneurs. Although we lack direct data on the
number of entrepreneurs in the economy, we can make a back-of-the-envelope calculation to assess
whether our firm level evidence is consistent with a 21% Mincerian return population-wide. If: (1) an
average entrepreneur is as educated as the entrepreneurs in the enterprise survey on average, i.e. has
14 years of schooling; and (2) an average worker in the economy is as educated as the average person in
37
the sample, i.e. has roughly 7 years of schooling, then to obtain an average population-wise Mincerian
return of 21% entrepreneurs need to account for only 4.8% of the workforce.23 Our estimates thus
suggest that private returns to schooling may be much larger than what previously thought due to the
neglected role of entrepreneurial inputs.
Consider next the role of externalities. Using our estimated parameters, raising the educational
level from the sample mean of 6.58 years by one year can be calculated to increase regional TFP by
about 6.7%. Rauch (1993) estimates a comparable magnitude of 3-5%. Acemoglu and Angrist (2000)
estimate that a one year increase in average schooling is associated with a 7% increase in average
wages. Moretti (2004) examines the impact of spillovers associated with the share of college graduates
living in a city. If we run the regressions in Table 9 using the fraction of the population with college
degrees instead of our measure of years of schooling, our estimates imply that a one percentage point
increase in the share of region’s population with a college degree increases output per capita by 7.9%.
Iranzo and Peri (2009) estimate that one extra year of college per worker increase the state’s TFP by a
very significant 6-9%, whereas the effect of an extra year of high school is closer to 0-1%. The
agreement among the various estimates is quite striking. Even if we use the coefficients obtained in
Table 10 when controlling for factors potentially affecting regional productivity, the change is very
modest, increasing the required population-wide Mincerian return to .23 and reducing the externality
parameter ψ to 2.4. Notwithstanding the difficulties involved in the identification of externalities, their
quantitative role seems to be quite robust.
In sum, the firm level and regional productivity differences are mutually consistent because: i)
entrepreneurial education is much more important than workers’ education in accounting for
productivity across firms and regions, and ii) human capital externalities are also sizeable. The standard
23
The population-wise average Mincerian return is computed as the return that solves the equation exp( [
7)1(14 ff ]) = exp(0.3 14) (1 )exp(0.06 7)f f where f is the fraction of entrepreneurs.
38
development accounting literature has neglected both channels. Since our estimation of
entrepreneurial returns is less subject to endogeneity problems than the estimation of externalities, we
now assess the relative importance of these two channels in explaining cross-country differences in
income per capita by using as a reference some well known results of macro development accounting
exercises. To do so, define a factor-based model of national income as KHLhEY )( 1^
, which
is national income level predicted by our model when: i) all regions in a country are identical and all
countries are equally productive, and ii) where – in line with standard development accounting - we
consider only physical and human capital, thereby attributing land rents to physical capital (deducting
these rents would not change much). This simplified model with no regional mobility provides a
benchmark to assess the role of physical and human capital when productivity differences are absent.
Following Caselli (2005), one measure of the success of the model in explaining cross-country
income differences is
))var(log())var(log(
^
YYsuccess
,
where Y is observed GDP. Using Caselli’s dataset, the observed variance of (log) GDP per worker is 1.32.
Ignoring human capital externalities (i.e., assuming ψ=γ=0) and using the standard 8% average Mincerian
return on human capital for both workers and entrepreneurs (i.e., setting =8%), the variance of log( ̂)
equals 0.76, i.e. physical and human capital explain 57% (0.76/1.32) of the observed variation in income
per worker. This calculation reproduces the standard finding that, under standard Mincerian returns, a
big chunk of the cross country income variation is accounted for by the productivity residual.
To isolate the role of entrepreneurial capital, we compute ̂ by assuming no human capital
externalities (i.e., ψ=γ=0) while still keeping a population-wide Mincerian return of 21%, which is
consistent with our firm-level estimates. It is not surprising that average Mincerian returns of about
39
20% greatly improve the explanatory power of human capital. Indeed, under this assumption success
rises to 83%. This improvement is solely due to accounting for managerial schooling. We note that this
result is quite sensitive to our assumption of labour share of 55%. If labour share were lower, the
residual income share that we allocate to entrepreneurial rents would be correspondingly higher, which
would then reduce our estimate of the returns to entrepreneurial education, and therefore of average
Mincerian returns. This reservation should be kept in mind in interpreting our results.
Finally, to assess the incremental explanatory power of human capital externalities, we
compute ̂ assuming our estimated values (i.e., ψ=3.45 and γ=0.09), while retaining the assumption that
the average Mincerian return equals 21%. Under these new assumptions, success rises to 99%. Of
course, these results need to be interpreted cautiously since there is considerable uncertainty regarding
the true values of the underlying coefficients. In particular, a sharply lower value of returns to
entrepreneurial schooling would reduce this measure of success (but would also be difficult to reconcile
with our estimates). Nevertheless, these calculations illustrate that entrepreneurial inputs play a much
more important role than externalities in increasing the explanatory power of human capital.
The comparison between Mozambique and the US illustrates the importance of entrepreneurial
inputs to understand cross-country income differences. Income per worker is roughly 33 times higher in
the US than in Mozambique ($57,259 vs. $1,752), while the stock of physical capital is 185 times higher
in the US than in Mozambique ($125,227 vs. $676). The average number of years of schooling for the
population 15 years and older is 1.01years Mozambique and 12.69 years in the United States. These
large differences in schooling imply that the (per capita) stock of human capital is 11.6 higher
(HUS/HMOZ=e.21*(12.69-1.01)) in the US than in Mozambique if the average Mincerian return is 21%. In
contrast, the (per capita) stock of human capital is only 2.5 times higher (HUS/HDRC=e.08*(12.69-1.01)) in the US
than in Mozambique if the average Mincerian return is 8%. Using weights of 1/3 and 2/3 for physical
and human capital, these differences in physical and human capital imply that income per worker should
40
be 29 times higher in the US than in Mozambique (29=11.62/3x1851/3), which is much closer to the actual
value of 33 times than the 10.6 multiple implied by 8% Mincerian return (10.6=2.52/3x1851/3).
In sum, our firm level and regional regressions suggest that: i) in line with the development
accounting literature, workers’ human capital is an important but not a large contributor to productivity
differences, ii) entrepreneurial inputs area fundamental and relatively neglected channel for
understanding the role of schooling in shaping productivity differences, and iii) human capital
externalities might also be an important determinant of regional productivity differences. Our
parameter estimates point to very large returns to entrepreneurial schooling (perhaps due to
entrepreneurs’ general talent) and to large social returns at the regional level arising from education.
VI. Conclusion.
We have presented evidence from more than 1500 sub-national regions of the world on the
determinants of regional income and labor productivity. The evidence suggests that regional education
is the critical determinant of regional development, and the only such determinant that explains a
substantial share of regional variation. Using data on several thousand firms located in these regions,
we have also found that regional education influences regional development through education of
workers, education of entrepreneurs, and regional externalities. The latter come primarily from the
level of education (the quality of human capital) in a region, and not from its total quantity (the number
of people with some education).
A simple Cobb-Douglas production function specification used in development accounting would
have difficulty accounting for all this evidence. Instead, we presented what we called a Lucas-Lucas
model of an economy, which combines the allocation of talent between work and entrepreneurship,
human capital externalities, and migration of labour across regions within a country. Although many
issues remain to be resolved, the empirical findings we presented are both consistent with the general
41
predictions of this model, and provide plausible values for the model’s parameters. In addition, we
follow Caselli (2005) in assessing the ability of the model to account for variation of output per worker
across countries. When we use our Lucas-Lucas model, we can roughly double the ability of the model
to account for cross-country income differences relative to the traditional specification. Our
parameterization can explain 99% of income differences across countries.
The central message of the estimation/calibration exercise is that, while private returns to
worker education are modest and close to previous estimates, but private returns to entrepreneurial
education (in the form of profits) and possibly also social returns to education through external
spillovers, are large. This evidence suggests that earlier estimates of return to education have perhaps
underestimated one of its important benefits – the externalities, and largely missed the other –
entrepreneurship. This final observation has significant implications for economic development.
Our data points most directly to the role of the supply of educated entrepreneurs for the
creation and productivity of firms. From the point of view of development accounting, having such
entrepreneurs seems more important than having educated workers. Consistent with earlier
observations of Banerjee and Duflo (2005) and La Porta and Shleifer (2008), economic development
occurs in educated regions that concentrate entrepreneurs, who run productive firms. These
entrepreneurs, as well, appear to contribute to the exchange of ideas, leading so significant regional
externalities. The observed large benefits of education through the creation of a supply of
entrepreneurs and through externalities offer an optimistic assessment of the possibilities of economic
development through raising educational attainment.
42
Bibliography
Acemoglu, Daron, and Joshua Angrist (2000). “How Large are Human-Capital Externalities? Evidence
from Compulsory Schooling Laws.” NBER Macroeconomics Annual, 15:9-59.
Acemoglu, Daron, Simon Johnson, and James Robinson (2001). “The Colonial Origins of Comparative
Development: An Empirical Investigation.” American Economic Review, 91(5): 1369-1401.
Acemoglu, Daron, and Melissa Dell (2010). “Productivity Differences Between and Within Countries.”
American Economic Journal: Macroeconomics, 2(1):169-188.
Alesina, Alberto, Arnaud Devleeschauwer, William Easterly, Sergio Kurlat and Romain Wacziarg (2003).
“Fractionalization.” Journal of Economic Growth, 8:155-94.
Alvarez, Michael, Jose Cheibub, Fernando Limongi, and Adam Przeworski (2000). Democracy and
Development: Political Institutions and Material Well-Being in the World 1950-1990. Cambridge,
UK: Cambridge University Press.
Banerjee, Abhijit, and Esther Duflo (2005). “Growth Theory through the Lens of Development
Economics.” In Philippe Aghion and Steven Durlauf, eds. Handbook of Economic Growth v. 1a,
Amsterdam: Elsevier.
Banerjee, Abhijit, and LakshmiIyer (2005). “History, Institutions, and Economic Performance: The Legacy
of Colonial Land Tenure System in India.” American Economic Review 95: 1190-1213.
Barro, Robert J (1991). "Economic Growth in a Cross-Section of Countries." Quarterly Journal of
Economics 106(2): 407-443.
Baumol, William (1990), “Entrepreneurship: Productive, Unproductive, and Destructive.” Journal of
Political Economy 98 (5): 893-921.
Beck, Thorsten, Gerge Clarke, Alberto Groff, Philip Keefer, and Patrick Walsh (2001). “New Tools in
Comparative Political Economy: The Database of Political Institutions.” World Bank Economic
Review 15 (1): 165-176.
43
Benhabib, Jess, and Spiegel, Mark (1994) "The Role of Human Capital in Economic Development."
Journal of Monetary Economics 34(2): 143-174.
Bils, Mark, and Peter Klenow (2000). “Does Schooling Cause Growth?” American Economics Review
90(5):1160-1183.
Bloom, David and Jeffrey Sachs (1998). “Geography, Demography, and Economic Growth in Africa.”
Brookings Papers on Economic Activity, 29(2): 207-296.
Bloom, Nicholas, and John Van Reenen (2007). “Measuring and Explaining Management Practices across
Firms and Countries.” Quarterly Journal of Economics 122(4): 1351-1408.
Bloom, Nicholas, and John Van Reenen (2010). “Why Do Management Practices Differ across Firms and
Countries?” Journal of Economic Perspectives, 24(1): 203-224.
Card, David (1999). “The Causal Effect of Education on Earnings.” In Orley Ashenfelter and David Card,
eds., Handbook of Labor Economics, vol 3A, Amsterdam: North Holland.
Caselli, Francesco (2005). “Accounting for Cross-Country Income Differences.” In Philippe Aghion and
Steven Durlauf (eds.), Handbook of Economic Growth, vol.1, ch. 9: 679-741 Amsterdam: Elsevier.
Caselli, Francesco, and Coleman, John Wilbur II (2005). “The World Technology Frontier.” American
Economic Review, 96(3):499-522.
Caselli, Francesco and James Feyrer (2007). “The Marginal Product of Capital.” Quarterly Journal of
Economics 122 (2): 535-568.
Caselli, Francesco, and Nicola Gennaioli (2009). “Dynastic Management.” Mimeo: LSE and CREI.
Ciccone, Antonio, and Robert Hall (1996). “Productivity and the Density of Economic Activity.” American
Economic Review, 86(1):54-70.
Ciccone, Antonio, and Giovanni Peri (2006). “Identifying Human-Capital Externalities: Theory with
Applications.” Review of Economic Studies, 73(2):381-412.
44
Ciccone, Antonio, and Elias Papaioannou (2009). “Human Capital, the Structure of Production, and
Growth.” Review of Economics and Statistics, 91(1): 66-82.
Coe, David, and Elhanan Helpman (1995). “International R&D Spillovers.” European Economic Review
39(5): 859-887.
Cohen, Daniel, and Marcelo Soto (2007). “Growth and Human Capital: Good Data, Good Results.”
Journal of Economics Growth, 12 (1):51-76.
de La Fuente, Angel and Rafael Domenech, (2006). Human Capital in Growth Regressions: How Much
Difference Does Data Quality Make?” Journal of the European Economics Association, 4(1):1-36.
Dell, Melissa (2010). “The Persistent Effects of Peru’s Mining Mita.” Econometrica, forthcoming.
Dell, Melissa, Benjamin Jones, and Benjamin Olken (2009). “Temperature and Income: Reconciling New
Cross-Sectional and Panel Estimates.” NBER Working Paper No. 14680.
De Long, Bradford, and Andrei Shleifer (1993). "Princes or Merchants? City Growth before the Industrial
Revolution" Journal of Law and Economics, 36:671-702.
Easterly, William, and Levine, Ross (1997). “Africa's Growth Tragedy: Policies and Ethnic Divisions.”
Quarterly Journal of Economics 112(4):1203-50.
Glaeser, Edward, Hedi Kallal, Jose Scheinkman, and Andrei Shleifer (1992). “Growth in Cities.” Journal of
Political Economy 100(6):1126-1152.
Glaeser, Edward, and Joshua Gottlieb (2009). “The Wealth of Cities: Agglomeration Economies and
Spatial Equilibrium in the United States.”Journal of Economic Literature 47(4):983-1028.
Glaeser, Edward L., Jose Scheinkman, and Andrei Shleifer (1995). “Economic Growth in a Cross-Section
of Cities.” Journal of Monetary Economics 36:117-143.
Glaeser, Edward, and David Mare (2001). “Cities and Skills.” Journal of Labor Economics 19(2):316-342.
Glaeser, Edward, Rafael La Porta, Florencio Lopez-de-Silanes, and Andrei Shleifer (2004). “Do Institutions
Cause Growth?” Journal of Economic Growth, 9 (2):271-303.
45
Goldin, Claudia, and Lawrence Katz (2008). The Race Between Education and Technology. Cambridge,
MA: Harvard University Press.
Gollin, Douglas (2002). “Getting Income Shares Right.” Journal of Political Economy 110(2): 458-474.
Hall, Robert, and Charles Jones (1999). “Why Do Some Countries Produce So Much More Output per
Worker than Others?” Quarterly Journal of Economics 114(1): 83-116.
Henderson, Vernon, Adam Storeygard, and David Weil (2009). “Measuring Economic Growth From
Outer Space.” American Economic Review, forthcoming.
Hsieh, Chang-Tai, and Peter Klenow (2009). “Misallocation and Manufacturing TFP in China and India.”
Quarterly Journal of Economics 124(4): 1403-1448.
Hsieh, Chang-Tai, and Peter Klenow (2010). “Development Accounting.” American Economic Journal:
Macroeconomics 2(1): 207–223.
Iranzo, Susana, and Giovanni Peri (2009). “Schooling Externalities, Technology, and Productivity: Theory
and Evidence from U.S. States.”Review of Economics and Statistics 91(2):420-431.
Jaggers, Keith, and Monty Marshall (2000). Polity IV Project. University of Maryland.
Jacobs, Jane (1969). The Economy of Cities. New York: Vintage.
King, Robert, and Levine, Ross, 1993. “Finance and Growth : Schumpeter Might be Right.” Quarterly
Journal of Economics 108(3): 717-737.
Klenow, Peter, and Andres Rodriguez-Clare (2005). “Externalities and Growth.” In Philippe Aghion and
Steven Durlauf (eds.), The Handbook of Economic Growth, Volume 1A, pp. 818-861, Elsevier.
Knack, Stephen, and Philip Keefer (1997). “Does Social Capital Have An Economic Payoff? A Cross-
Country Investigation.” Quarterly Journal of Economics 112(4):1251-1288.
Krueger, Alan, and Mikael Lindahl (2001). “Education for Growth: Why and For Whom?” Journal of
Economic Literature 39(4):1101-1136.
46
La Porta, Rafael, Florencio Lopez-de-Silanes, and Andrei Shleifer. 2008. "The Economic Consequences of
Legal Origins." Journal of Economic Literature, 46(2): 285–332.
La Porta, Rafael and Andrei Shleifer (2008). “The Unofficial Economy and Economic Development.”
Brookings Papers on Economic Activity, 2008: 275-352.
Lucas, Robert (1978). “On the Size Distribution of Business Firms.” Bell Journal of Economics 9(2): 508–
23.
Lucas, Robert (1988). “On the Mechanics of Economic Development.” Journal of Monetary Economics,
22(1):3-42.
Lucas, Robert (2009). “Ideas and Growth.” Economica, 76(301): 1–19.
Mankiw, Gregory, David Romer, and David Weil (1992). “A Contribution to the Empirics of Economic
Growth.”Quarterly Journal of Economics, 107(2): 407-437.
Moretti, Enrico (2004). “Estimating the Social Return to Higher Education: Evidence from Longitudinal
and Repeated Cross-Sectional Data.” Journal of Econometrics, 121: 175-212.
Moretti, Enrico (2004). “Workers’ Education, Spillovers, and Productivity: Evidence from Plant-Level
Production Functions.” American Economic Review, 94(3):656-690.
Murphy Kevin, Andrei Shleifer, and Robert Vishny (1991). “The Allocation of Talent: Implications for
Growth.” Quarterly Journal of Economics, 106(2):503-530.
Nelson, Richard, and Edmund Phelps (1966). “Investment in Humans, Technological Diffusion and
Economic Growth.” American Economic Review, 56(2):69-75.
Parker, Simon and Mirjam van Praag, (2005). “Schooling, Capital Constraint and Entrepreneurial
Performance.” Tinbergen Institute Discussion Paper.
Psacharopoulos, George (1994). “Returns to Investment in Education: A Global Update.” World
Development 22(9): 1325-1343.
47
Putnam, Robert (1993). Making Democracy Work: Civic Traditions in Modern Italy. Princeton, NJ:
Princeton University Press.
Rauch, James (1993).“Productivity Gains from Geographic Concentration of Human Capital: Evidence
from the Cities.” Journal of Urban Economics, 34:380-400.
Roback, Jennifer (1982). “Wages, Rents, and the Quality of Life.” Journal of Political Economy 90(6):
1257-1278.
Romer, Paul (1990). “Endogenous Technical Change.” Journal of Political Economy, 98(5): S71-S102.
Syverson, Chad (2011). “What Determines Productivity?” Journal of Economic Literature, forthcoming.
Valentinyi, Akos and Berthold Herrendorf (2008). “Measuring Factor Income Shares at the Sector Level.”
Review of Economic Dynamics 11: 820-838.
van Praag, Mirjam, Arjen van Witteloostuijn, and Justin van der Sluis (2009). “Returns for Entrepreneurs
vs. Employees: The Effect of Education and Personal Control on the Relative Performance of
Entrepreneurs vs. Wage Employees.” IZA Discussion Paper no. 4628.
Woff, Edward (2011). “Spillovers, Linkages, and Productivity Growth in the US Economy, 1958-2007.”
Working Paper No. 16864. Cambridge, MA: National Bureau of Economic Research.
48
Appendix 1.
Proof of Proposition 1 Due to A.1, labour moves from unproductive to productive regions. Formally,
Equation (11) implies that an agent with human capital hj migrates if BjBGjG HhwHhw //)( 11 ,
where φ captures migration costs. This identifies a human capital threshold hm such that agent j migrates
if and only if hj ≥ hm. By exploiting the wage equation in (6) and the equilibrium condition (9), threshold
hm can be implicitly expressed as:
B
G
G
Bm H
Hwwh
1
1 . (Ap.1)
To pin down the equilibrium, note that the aggregate resource constraint is given by:
p∙HG + (1 – p)∙HB = H. (Ap.2)
After accounting for externalities, the equilibrium condition (Ap.1) can be written as:
1)1()1)((
11)1(
11
1B
G
B
G
G
Bm H
HLL
AAh . (Ap.3)
The previous migration-threshold implies that the human capital stock in each productive region is:
11 1)(1
B
m
BB
mBGh BGG h
hp
pHHdhhhhp
pHH
. (Ap.4)
Using Equation (Ap.4) and (Ap.3), it is immediate to express hm as a function of HG and thus recover:
1
1
11
111
B
B
B
B
pp
HHH
pp
HHH
pp
LL
B
GG
B
GG
B
G
. (Ap.5)
Under full mobility (φ = 0), using (Ap.3) one finds that the equilibrium is determined by the condition:
49
1)1()1)((
11
)1(
1
1
11
)1(
111
11
G
G
B
GG
B
GG
B
G
pHHHp
pp
HHH
pp
pp
HHH
AA
B
B
B
B
. (Ap.6)
The left hand side is decreasing in HG. If (β - ψγ)(1- θ) + θ(1- δ)> 0, the right hand side - which captures
the cost of migrating to productive regions, increases in HG. As a result, when (β - ψγ)(1- θ) + θ(1- δ)> 0
even under full mobility in the stable equilibrium there is no universal migration to productive regions.
Indeed, if all human capital moves to productive regions, then HG = H/p and the right hand side of
(Ap.10) becomes infinite. Full migration is not an equilibrium. No migration is not an equilibrium either,
as in this case A.1 implies that (Ap.10) cannot hold. When ψ = 1 (and φ = 0) the equilibrium has:
.)1()1)((
1
)1()1)((1
H
AE
AH ii
(Ap.7)
With imperfect mobility φ = 0, the equilibrium fulfils the condition:
.)1(
11
111
11
1)1()1)((
11
)1(
1
1
11
11
11
G
G
B
GG
B
GG
G
B
B
GG
pHHHp
pp
HHH
pp
HHH
pp
AA
HHH
pp
hB
B
B
B
When (β - ψγ)(1- θ) + θ(1- δ)> 0, an increase in HG (holding H constant) shifts down the left hand side
and shifts up the right hand side above. As a result, the equilibrium is unique.
Table 1 – Definitions and sources for the variables used in the paper
This table provides the names, definitions and sources of all the variables used in the tables of the paper.
Variable Description Sources and linksI. GDP per capita, population, employment and human capital
Ln(GDP per capita) The logarithm of Gross Domestic Product per capita in PPP constant 2005 international dollars in the region in 2005. Data on regional GDP is available for all countries except 20. For those 20 countries, we approximate GDP using data on income (6 countries), expenditure (8 countries), wages (3 countries), gross value added (2 countries), and consumption, investment and government expenditure (1 country). For each country, we scale regional GDP per capita values so that their population‐weighted sum equals the World Development Indicators (WDI) value of Gross Domestic Product in PPP constant 2005 international dollars. Similarly, for each country, we adjust the regional population values so that their sum equals the country‐level analog in WDI. For years with missing regional GDP per capita data, we interpolate using all available data for the period 1990‐2008. When interpolating GDP values is not possible, we use the regional distribution of the closest year with regional GDP data. Population data for years without census data is interpolated and extrapolated from the available census data for the period 1990‐2008. At the country level, we calculate this variable as the population‐weighted average of regional GDP.
Regional GDP: See online appendix "Appendix GDP Sources". Regional population: Thomas Brinkhoff: City Population, http://www.citypopulation.de/ Country‐level GDP per capita and PPP exchange rates: World Bank, (2010). Data retrieved on March 2, 2010, from World Development Indicators Online (WDI) database, http://go.worldbank.org/6HAYAHG8H0
Ln(Population) The logarithm of the number of inhabitants in the region in 2005. Population data for years without census data is interpolated and extrapolated from the available census data for the period 1990‐2008. For each country, we adjust the regional populations so that the sum of regional populations equals the country‐level analog in the World Development Indicators (WDI). At the country level, we calculate this variable following the same methodology but using country boundaries.
Regional population: Thomas Brinkhoff: City Population, http://www.citypopulation.de/ Regional spherical: Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5.
Ln(Employment) The number of manufacturing and service employees working in the establishments in the region. The data is for the year 2005 or the closest available. At the country level, we calculate this variable as the product of the total population and the employment ratio for the population 15 years and older.
See online appendix "Appendix on Economic Census Sources". Development Indicators Online (WDI) database, http://go.worldbank.org/6HAYAHG8H0
Years of education The average years of schooling from primary school onwards for the population aged 15 years or older. Data for China and Georgia is for the population 6 years and older. We use the most recent information available for the period 1990‐2006. To make levels of educational attainment comparable across countries, we translate educational statistics into the International Standard Classification of Education (ISCED) standard and use UNESCO data on the duration of school levels in each country for the year for which we have educational attainment data. Eurostat aggregates data for ISCED levels 0‐2 and we assign such observations an ISCED level 1. Following Barro and Lee (1993): (1) we assign zero years of schooling to ISCED level 0 (i.e., pre‐primary); (2) we assign zero years of additional schooling to (a) ISCED level 4 (i.e., vocational), and (b) ISCED level 6 (i.e. post‐graduate); and (3) we assign 4 years of additional schooling to ISCED level 5 (i.e. graduate). Since regional data is not available for all countries, unlike Barro and Lee (1993), we assign zero years of additional schooling: (a) to all incomplete levels; and (b) to ISCED level 2 (i.e. lower secondary). Thus, the average years of schooling in a region is calculated as: (1) the product of the fraction of people whose highest attainment level is ISCED 1 or 2 and the duration of ISCED 1; plus (2) the product of the fraction of people whose highest attainment level is ISCED 3 or 4 and the cumulative duration of ISCED 3; plus (3) the product of the fraction of people whose highest attainment level is ISCED 5 or 6 and the sum of the cumulative duration of ISCED 3 plus 4 years. At the country level, we calculate this variable as the population‐weighted average of the regional values.
See online appendix "Appendix on Education Sources". Links to online data: http://epdc.org/ http://epp.eurostat.ec.europa.eu/portal/page/portal/region_cities/introduction https://international.ipums.org/international/index.html http://stats.uis.unesco.org/unesco/TableViewer/document.aspx?ReportId=143&IF_Language=eng.
II. Climate, geography and natural resources
Temperature Average temperature during the period 1950‐2000 in degrees Celsius. To produce the regional and national numbers, we create equal area projections using the Collins‐Bartholomew World Digital Map and the temperature raster in ArcGIS. For each region, we sum the temperatures of all cells in that region and divide by the number of cells in that region. At the country level, we calculate this variable following the same methodology but using country boundaries.
Climate: Hijmans, R. et al. (2005) , http://www.worldclim.org/ Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5
Variable Description Sources and linksInverse distance to coast
The ratio of one over one plus the region’s average distance to the nearest coastline in thousands of kilometers. To calculate each region’s average distance to the nearest coastline we create an equal distance projection of the Collins‐Bartholomew World Digital Map and a map of the coastlines. Using these two maps we create a raster with the distance to the nearest coastline of each cell in a given region. Finally, to get the average distance to the nearest coastline, we sum up the distance to the nearest coastline of all cells within each region and divide that sum by the number of cells in the region. At the country level, we calculate this variable following the same methodology but using country boundaries.
Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5
Ln(Oil) Logarithm of one plus the estimated per capita volume of cumulative oil production and reserves by region, in millions of barrels of oil. To produce the regional measure, we load the oil map of the World Petroleum Assessment and the Collins‐Bartholomew World Digital map onto ArcGIS. On‐shore estimated oil in each assessment unit was allocated to the regions based on the fraction of assessment unit area covered by each region. Off‐shore assessment units are not included. The World Petroleum Assessment map includes all oil fields in the world except those in the United States of America. Data for the United States is calculated using the national‐level information on cumulative production and estimated reserves, available from the World Petroleum Assessment 2000 (USGS), and the United States' regional production and estimated reserves for the year 2000 from the U.S. Energy Information Administration (USEIA). The national level data for this variable is calculated following the same methodology outlined but using the data on national boundaries. The national level numbers for the U.S. are those available from the World Petroleum Assessment.
http://energy.cr.usgs.gov/oilgas/wep/products/dds60/export.htm. http://tonto.eia.doe.gov/dnav/pet/pet_crd_crpdn_adc_mbbl_a.htm. http://www.bartholomewmaps.com/data.asp?pid=5
III. Institutions
Informal payments The average percentage of sales spent on informal payments made to public officials to “get things done” with regard to customs, taxes, licenses, regulations, services, etc, as reported by the respondents in the region. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is from the most recent year available, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Ln(Tax days) The logarithm of one plus the average number of days spent in mandatory meetings and inspections with tax authority officials in the past year as reported by respondents in the region. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent year available, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Ln(Days without electricity)
The logarithm of one plus the average number of days without electricity in the past year as reported by the respondents in the region. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent year available, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Security costs The average costs of security (i.e., equipment, personnel, or professional security services) as a percentage of sales as reported by the respondents in the region. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent year available, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Access to land The percentage of respondents in the region who think that access to land is a moderate, major, or very severe obstacle to business. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent year available, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Access to finance The percentage of respondents in the region who think that access to financing is a moderate, major, or very severe obstacle to business. The country‐level analog of this variable is the arithmetic average of the regions in each respective country. Data is for the most recent year available, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Government predictability
The percentage of respondents in the region who tend to agree, agree in most cases, or fully agree that their government officials’ interpretation of regulations are consistent and predictable. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent year available, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Doing Business Percentile Rank
The average of the percentile ranks in each of the following five areas: (1) starting a business; (2) dealing with construction permits; (3) registering property; (4) enforcing contracts; and (5) paying taxes. Higher values indicate more burdensome regulation. Data is for the most recent year available, ranging from 2007 through 2010.
Word Bank’s Doing Business Subnational Reports. http://doingbusiness.org/Reports/Subnational‐Reports/
Autocracy This variable classifies regimes based on their degree of autocracy. Democracies are coded as 0, bureaucracies (dictatorships with a legislature) are coded as 1 and autocracies (dictatorship without a legislature) are coded as 2. Transition years are coded as the regime that emerges afterwards. This variable ranges from zero to two where higher values equal a higher degree of autocracy. This
Alvarez et al. (2000).
Variable Description Sources and linksvariable is measured as the average from 1960 through 1990.
Executive Constraints A measure of the extent of institutionalized constraints on the decision making powers of chief executives. The variable takes seven different values: (1) Unlimited authority (there are no regular limitations on the executive's actions, as distinct from irregular limitations such as the threat or actuality of coups and assassinations); (2) Intermediate category; (3) Slight to moderate limitation on executive authority (there are some real but limited restraints on the executive); (4) Intermediate category; (5) Substantial limitations on executive authority (the executive has more effective authority than any accountability group but is subject to substantial constraints by them); (6) Intermediate category; (7) Executive parity or subordination (accountability groups have effective authority equal to or greater than the executive in most areas of activity). This variable ranges from one to seven where higher values equal a greater extent of institutionalized constraints on the power of chief executives. This variable is calculated as the average from 1960 through 2000.
Jaggers and Marshall (2000).
Expropriation Risk Risk of “outright confiscation and forced nationalization" of property. This variable ranges from zero to ten where higher values are equals a lower probability of expropriation. This variable is calculated as the average from 1982 through 1997.
International Country Risk Guide at http://www.countrydata.com/datasets/.
Proportional Representation
This variable is equal to one for each year in which candidates were elected using a proportional representation system; equals zero otherwise. Proportional representation means that candidates are elected based on the percentage of votes received by their party. This variable is measured as the average from 1975 through 2000.
Beck et al. (2001).
Corruption The average score of the Transparency International index of corruption perception in 2005. The index provides a measure of the extent to which corruption is perceived to exist in the public and political sectors. The index focuses on corruption in the public sector and defines corruption as the abuse of public office for private gain. It is based on assessments by experts and opinion surveys. The index ranges between 0 (highly corrupt) and 10 (highly clean).
www.transparency.org
IV. Infrastructure
Ln(Power line density) The logarithm of one plus the length in kilometers of power lines per 10km2 in the year 1997. To produce the regional numbers, we load the power line map from the US Geological Survey and the Collins‐Bartholomew World Digital Map onto ArcGIS. We take the ratio of total length of the power lines in the region to the spherical area of that region. At the country level, we calculate this variable following the same methodology but using country boundaries.
US Geological Survey Global GIS database, accessed through Harvard University's Geospatial Library. Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5
Ln(Travel time) The logarithm of the average estimated travel time in minutes from each cell in a region to the nearest city of 50,000 or more people in the year 2000. We use the raster from the Global Environmental Monitoring Unit and the Collins‐Bartholomew World Digital Map. For each region, we sum the travel time from all its cells and divide by the number of cells in that region. At the country level, we calculate this variable following the same methodology but using country boundaries.
Global Environment Monitoring Unit, http://bioval.jrc.ec.europa.eu/products/gam/index.htm Collins‐Bartholomew World Digital Map, http://www.bartholomewmaps.com/data.asp?pid=5
V. Culture
Trust in others The percentage of respondents in the region who believe that most people can generally be trusted. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent available year, ranging from 1980 through 2005.
World Values Survey, http://www.worldvaluessurvey.org/
Civic values The average of the value of the answers of respondents in the region about the degree of justifiability of the following four behaviors: (1) Claiming government benefits to which you are not entitled; (2) Avoiding a fare on public transport; (3) Cheating on taxes if you have a chance; and (4) Someone accepting a bribe in the course of their duties. For each question, possible answers range from 1 (never justifiable) to 10 (always justifiable). We only include observations with non‐missing data for at least two of the four questions. The country‐level analog of this variable is the arithmetic average of the regions in the country. Data is for the most recent available year, ranging from 1980 through 2005.
World Values Survey, http://www.worldvaluessurvey.org/
Ln(Number of ethnic groups)
The logarithm of the number of ethnic groups that inhabited the region in the year 1964. The country‐level analog of this variable is constructed using country boundaries.
Weidmann et al., 2010, http://www.icr.ethz.ch/research/greg
Variable Description Sources and linksProbability of same language
The probability that two randomly chosen people, one from the corresponding region and one from the rest of the country, share the same mother tongue in the year 2004. Where language areas do not overlap with our regions, we compute the number of people speaking a language in a region by weighing the total number of people in a language area by the fraction of the region’s surface covered by that language area. We compute the probability of same language separately for each language in a region and then calculate the surface‐weighted average of the different languages in a region. The country‐level analog of this variable is calculated as the population‐weighted average of the regional values.
World Language Mapping System, http://www.gmi.org/wlms/
VI. Enterprise Survey Data
Ln(Sales / Employee) The logarithm of the quotient of total annual revenue (in current USD) over the total number of employees in each establishment. Data is for the most recent available year, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Ln(Sales –Raw Materials / Employee)
The logarithm of the quotient of total annual revenue minus expenditure on raw materials l(in current USD) over the total number of employees in each establishment. Data is for the most recent available year, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Ln(Wages / Employee)
The logarithm of the quotient of total cost of labor (in current USD) the total number of employees in each establishment. Data is for the most recent available year, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Ln(Expenditure on energy / Employee)
The logarithm of the quotient of total energy and fuel costs over (in current USD) the total number of employees in each establishment. Data is for the most recent available year, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Years of Education of workers
The number of years of schooling from primary school onwards of the average non‐management employee in each establishment. To compute this variable, we use the same assumptions and follow the same procedure as used for the previously described years of schooling variable at the regional level. Data is for the most recent available year, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Years of Education of manager
The number of years of schooling from primary school onwards of the manager of the establishment. To compute this variable, we use the same assumptions and follow the same procedure as used for the previously described years of schooling variable at the regional level. Data is for the most recent available year, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Ln( Property, plant, and equipment / Employee)
The logarithm of the quotient of the book value of property, plant and equipment (in current USD) over the total number of employees in the establishment. Data is for the most recent available year, ranging from 2002 through 2009.
World Bank's Enterprise Surveys. https://www.enterprisesurveys.org/
Number of Regions
Number of Countries
Regions per country
Mean Minimum MaximumWithin‐country
RangeWithin‐country std deviation
Coefficient of Variation for Variable in Levels
Ln(GDP per capita) 1,537 107 11 8.69 8.07 9.54 1.03 0.30 0.33Years of Education 1,489 106 12 6.58 5.34 8.70 2.34 0.73 0.92Temperature 1,568 110 12 16.84 10.23 21.13 4.47 1.45 0.09Inverse Distance to Coast 1,569 110 12 0.90 0.80 0.99 0.13 0.05 0.05Ln(Oil) 1,569 110 12 0.00 0.00 0.00 0.00 0.00 0.00Informal Payments 361 76 4 1.02 0.40 1.60 0.94 0.45 0.59ln(Tax Days) 270 58 5 1.29 1.06 1.51 0.36 0.19 0.18Ln(Days without electricity) 222 75 2 3.03 2.73 3.37 0.54 0.36 0.32Security costs 373 79 4 0.91 0.39 1.34 0.72 0.34 0.42Access to land 519 81 5 0.15 0.04 0.27 0.21 0.09 0.40Access to finance 536 82 5 0.28 0.14 0.47 0.29 0.12 0.24Government Predictability 386 75 4 0.46 0.34 0.61 0.24 0.10 0.20Doing Business Percentile Rank 180 19 6 0.40 0.21 0.49 0.22 0.11 0.31Ln(Power line density) 1,569 110 12 1.34 0.00 2.53 1.87 0.61 0.61Ln(Travel time) 1,569 110 12 5.28 4.21 6.00 1.82 0.54 0.46Trust in others 745 69 9 0.23 0.12 0.38 0.22 0.07 0.35Civic Values 683 75 8 2.23 1.71 3.12 1.08 0.48 0.19Ln(Number of ethnic groups) 1,568 110 12 0.98 0.00 1.79 1.39 0.50 0.46Probability of same language 1,545 109 12 0.67 0.28 0.79 0.26 0.09 0.21
Medians for:
The table reports descriptive statistics for the variables in the paper. We report the total number of observations, the number of countries and medians for: (1) the number of regions with non‐missing data, (2) the country average, (3) the within‐country range, (4) the within‐country standard deviation, and (5) the coefficient of variation for the variable in levels (ratherthan in logs). We also report the adjusted R squared from an univariate regression of each variable in the table on country dummies. All variables are described in Table 1.
Table 2: Descriptive Statistics
Panel A: Regional GDP, Education, Geography, Institutions, Infrastructure, and Culture
Number of Regions
Number of Countries
Regions per country
Mean Minimum MaximumWithin‐country
RangeWithin‐country std
deviationCoefficient of Variation for Variable in Levels
Ln(Establishments / Population) 984 65 12 ‐4.89 ‐5.45 ‐4.06 1.17 0.37 0.37Ln(Employees / Establishments) 1,068 69 12 2.07 1.69 2.39 0.80 0.20 0.19Ln(Employees / Population) 1,056 69 12 ‐2.66 ‐3.38 ‐1.80 1.58 0.43 0.41Ln(Employees Big Firms / Employees) 540 31 13 ‐1.45 ‐2.17 ‐0.78 1.13 0.33 0.27Ln(Sales / Employee) 549 82 5 10.21 9.79 10.59 0.79 0.35 1.22Ln([Sales ‐ Raw Materials]/Employee) 359 70 4 9.53 9.24 9.87 0.69 0.37 1.21Ln(Wages / Employee) 515 77 5 8.28 8.00 8.66 0.62 0.25 1.79Ln(Employees) 549 82 5 3.25 2.72 3.71 0.82 0.35 1.46Ln(Expenditure on energy / Employee) 326 66 4 6.10 5.51 6.36 0.60 0.30 1.22Ln(Property, plant and equipment / Employee) 205 41 4 8.72 8.37 9.37 0.99 0.47 1.26Years of Education of Workers 507 74 5 9.97 8.66 10.80 2.25 0.93 3.06Years of Education of Managers 195 38 4 14.90 14.24 15.36 1.34 0.62 0.89
Medians for:
Table 2: Descriptive Statistics (continued)
Panel B: Enterprise Survey and Census Data
Observations Countries R2 Within R2 Between
Independent Variables:Years of Education 1,470 104 38% 58%Temperature 1,536 107 1% 27%Inverse Distance to Coast 1,537 107 4% 13%Ln(Oil) 1,537 107 2% 4%Informal Payments 350 74 0% 21%ln(Tax Days) 263 56 0% 20%Ln(Days without electricity) 219 73 2% 6%Security costs 362 77 0% 7%Access to land 507 79 0% 15%Access to finance 524 80 1% 8%Government Predictability 380 73 1% 0%Doing Business Percentile Rank 176 18 2% 13%Ln(Power line density) 1,537 107 5% 36%Ln(Travel time) 1,537 107 7% 15%Trust in others 739 68 0% 18%Ln(Number of ethnic groups) 1,536 107 5% 17%Probability of same language 1,518 106 1% 26%
Table 3: Univariate Fixed Effects Regressions
Fixed effects regressions of the log of GDP per capita at the regional level in the year 2005. The independent variables are proxies for: (1) geography, (2) Institutions, and (3) Infrastructure and Culture. The table reports
the number of observations, the number of countries, the R2 within, the R2 between, and the fraction of the variance due to countries. All variables are described in Table 1.
(1) (2) (3)
Temperature ‐0.0914a ‐0.0189c ‐0.0190c
(0.0100) (0.0106) (0.0106)
Inverse Distance to Coast 4.4768a 2.9647a 2.9499a
(0.5266) (0.5736) (0.5782)
Ln(Oil) 1.2192a 0.9503a 0.9473a
(0.1985) (0.1371) (0.1375)
Years of Education 0.2566a 0.2574a
(0.0308) (0.0311)
Ln(Population) 0.0684c
(0.0408)
Ln(Employment) 0.0576(0.0398)
Constant 6.3251a 3.5761a 3.7959a
(0.4598) (0.9372) (0.8977)
Observations 107 104 103
Adjusted R2 50% 63% 63%
Table 4: GDP per capita and Geography
Ordinary least squares and fixed effects regressions of the log of GDP per capita. The dependent variable is the logarithm of the 2005 level of GDP per capita at the country level in Panel A and at the logarithm of regional GDP per capita in Panel B. The independent variables are (1) temperature, (2) inverse distance to coast, (3) the logarithm of per capita oil production and reserves, (4) the average years of education, (5) the logarithm of population, and (6) the logorithm of the number of employees. Robust standard errors are shown in parentheses. All variables are described in Table 1.
Panel A: Dependent Variable is Logarithm National GDP per capita
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
(1) (2) (3)
Temperature ‐0.0156c ‐0.0140c ‐0.0206c
(0.0082) (0.0084) (0.0105)
Inverse Distance to Coast 1.0318a 0.4979a 0.5096a
(0.2078) (0.1438) (0.1745)
Ln(Oil) 0.1651a 0.1752a 0.1941a
(0.0477) (0.0578) (0.0440)
Years of Education 0.2755a 0.2751a
(0.0171) (0.0271)
Ln(Population) 0.0125(0.0168)
Ln(Employment) 0.0661a
(0.0244)
Constant 8.0947a 6.3886a 5.9154a
(0.2282) (0.1944) (0.2516)
Observations 1,545 1,478 833Number of countries 107 104 49
R2 Within 8% 42% 50%
R2 Between 47% 60% 70%
R2 Overall 34% 62% 70%Fixed Effects Yes Yes Yes
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Panel B: Dependent Variable is Logarithm Regional GDP per capita
Table 4: GDP per capita and Geography (continued)
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Years of Education 0.2566a 0.2310a 0.1890a 0.2339a 0.2291a 0.2301a 0.2264a 0.2355a 0.1749b(0.0308) (0.0344) (0.0310) (0.0316) (0.0336) (0.0350) (0.0344) (0.0332) (0.0703)
Ln(Population) 0.0684c ‐0.0022 0.0887 0.0428 0.0320 0.0067 0.0299 0.0611 ‐0.0782(0.0408) (0.0494) (0.0582) (0.0488) (0.0481) (0.0519) (0.0473) (0.0457) (0.1074)
Temperature ‐0.0189c ‐0.0105 ‐0.0276b ‐0.0083 ‐0.0094 ‐0.0066 ‐0.0082 ‐0.0129 ‐0.0147(0.0106) (0.0128) (0.0128) (0.0119) (0.0114) (0.0112) (0.0110) (0.0117) (0.0306)
Inverse Distance to Coast 2.9647a 2.3086a 2.1692a 2.5170a 2.2652a 2.2826a 2.1892a 2.3979a 0.2385(0.5736) (0.6321) (0.7006) (0.5698) (0.5856) (0.5406) (0.5562) (0.5616) (2.1131)
Ln(Oil) 0.9503a 1.6367a 0.5257 1.1319a 1.1739a 1.1916a 1.1165a 1.2054b 0.5201(0.1371) (0.5966) (0.5050) (0.3309) (0.3219) (0.3302) (0.2950) (0.4982) (0.4921)
Informal Payments ‐0.0121(0.0499)
ln(Tax Days) ‐0.5497a
(0.1446)
Ln(Days without electricity) ‐0.1375(0.0847)
Security costs ‐0.0332(0.0250)
Access to land ‐0.7493(0.5783)
Access to finance ‐0.5164(0.4202)
Government Predictability 0.3835(0.4431)
Doing Business Percentile Rank 0.6704(1.6413)
Constant 3.5761a 5.1927a 5.1619a 4.6815a 4.7382a 5.1545a 4.9498a 3.9328a 8.6509b
(0.9372) (1.1015) (1.2918) (0.9542) (1.0046) (0.9971) (1.0246) (0.9724) (3.1636)
Observations 104 73 55 75 76 80 81 72 17
Adjusted R2 63% 73% 76% 69% 69% 70% 70% 71% 34%
Adj. R2 without institution 63% 73% 69% 69% 69% 69% 69% 71% 39%
Adj. R2 without education 50% 53% 60% 49% 50% 52% 52% 50% 26%
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Table 5: National GDP per capita, Institutions, Infrastructure, and Culture
Ordinary least square regressions of the log of GDP per capita at the country level. All regressions include the years of education, logarithm of population, temperature, inverse distance to coast, and the logarithm of per capita oil production and reserves. In addition, regressions include measures of: (1) institutions (Panel A) and (2) infrastructure and culture (Panel B). Robust standard errors are shown in parenthesis.
For comparison, the bottom panel shows the adjusted R2 of two alternative specifications: (1) a regression with all regressors except the measure of institutions or culture; and (2) a regression with all regressors except education. All variables are described in Table 1.
Panel A: Institutions
(1) (2) (3) (4) (5) (6) (7)
Years of Education 0.2566a 0.2379a 0.2642a 0.1935a 0.1818a 0.2534a 0.2394a
(0.0308) (0.0338) (0.0325) (0.0498) (0.0538) (0.0347) (0.0377)
Ln(Population) 0.0684c 0.0688c 0.0653 0.1238 0.2169b 0.0999 0.0807c
(0.0408) (0.0414) (0.0407) (0.0788) (0.1017) (0.0640) (0.0450)
Temperature ‐0.0189c ‐0.0145 ‐0.0191c ‐0.0283b ‐0.0434a ‐0.0188c ‐0.0163(0.0106) (0.0109) (0.0108) (0.0135) (0.0148) (0.0107) (0.0108)
Inverse Distance to Coast 2.9647a 2.7218a 3.0968a 3.6522a 4.3386a 2.7758a 2.7448a
(0.5736) (0.6025) (0.6268) (0.7902) (1.0486) (0.6473) (0.5853)
Ln(Oil) 0.9503a 1.0157a 0.8737a 0.9902a 0.9751a 0.9538a 0.8792a
(0.1371) (0.1438) (0.1467) (0.3207) (0.2895) (0.1443) (0.1657)
Ln(Power line density) 0.1480(0.1099)
Ln(Travel time) 0.0825(0.0934)
Trust in others 1.2472(0.8796)
Civic values 0.4180(0.3105)
Ln(Number of ethnic groups) ‐0.0996(0.1550)
Probability of same language 0.4195(0.3391)
Constant 3.5761a 3.6383a 3.0050b 2.3962 ‐0.1572 3.4625a 3.3864a
(0.9372) (0.9251) (1.2448) (2.0122) (3.2084) (0.9289) (0.9548)
Observations 104 104 104 67 57 104 103
Adjusted R2 63% 63% 63% 49% 47% 63% 62%
Adj. R2 without infrastructure or culture 63% 63% 63% 48% 45% 63% 62%
Adj. R2 without education 50% 54% 50% 44% 42% 51% 52%
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Table 5: National GDP per capita, Institutions, Infrastructure, and Culture (cont)
Panel B: Infrastructure and Culture
(1) (2) (3) (4) (5) (6) (7) (8) (9)
Years of Education in the Region 0.2758a 0.3056a 0.3620a 0.3439a 0.3343a 0.3267a 0.3273a 0.3166a 0.4141a
(0.0172) (0.0298) (0.0288) (0.0481) (0.0310) (0.0218) (0.0215) (0.0207) (0.0229)
Ln(Population in the Region) 0.0126 ‐0.0185 ‐0.0175 ‐0.0442 ‐0.0191 ‐0.0087 ‐0.0098 ‐0.0113 ‐0.0026(0.0168) (0.0495) (0.0536) (0.0613) (0.0432) (0.0316) (0.0312) (0.0305) (0.0229)
Temperature ‐0.0140c ‐0.0101 ‐0.0086 ‐0.0015 ‐0.0064 ‐0.0093 ‐0.0106 ‐0.0131 0.0016(0.0084) (0.0096) (0.0078) (0.0122) (0.0093) (0.0086) (0.0086) (0.0081) (0.0059)
Inverse Distance to Coast 0.4971a 0.4647 0.8290c 0.1810 0.2703 0.4054 0.5133c 0.4420 0.0913(0.1441) (0.3293) (0.4273) (0.4312) (0.3041) (0.2636) (0.2822) (0.2788) (0.3460)
Ln(Oil) 0.1752a ‐0.0578 0.1555 ‐0.0584 ‐0.0473 ‐0.0224 ‐0.0040 ‐0.0170 0.1834(0.0578) (0.1283) (0.1319) (0.2503) (0.0862) (0.1081) (0.1113) (0.0735) (0.1160)
Informal Payments ‐0.0089(0.0353)
ln(Tax Days) ‐0.0479(0.0630)
Ln(Days without electricity) 0.0001(0.0764)
Security costs ‐0.0004(0.0060)
Access to land ‐0.1900(0.1457)
Access to finance ‐0.0935(0.1536)
Government Predictability ‐0.1251(0.1426)
Doing Business Percentile Rank ‐0.6199c
(0.3437)
Constant 6.3853a 6.5073a 5.7640a 6.8622a 6.4507a 6.3453a 6.2816a 6.4790a 6.3186a
(0.1947) (0.7043) (0.8220) (0.7867) (0.5993) (0.4664) (0.4827) (0.4629) (0.4428)
Observations 1,469 338 255 216 352 387 381 368 172Number of countries 104 73 55 72 76 77 76 72 17
R2 Within 42% 58% 66% 59% 60% 62% 62% 63% 69%
R2 Between 60% 64% 64% 53% 58% 60% 60% 63% 39%
R2 Overall 62% 59% 60% 49% 53% 55% 55% 56% 51%
Within R2 without institution 42% 57% 66% 59% 60% 62% 62% 62% 67%
Within R2 without education 9% 11% 14% 10% 9% 6% 5% 7% 9%
Between R2 without institution 60% 64% 63% 53% 58% 60% 60% 63% 41%
Between R2 without education 42% 25% 20% 21% 26% 35% 39% 45% 50%
Fixed Effects Yes Yes Yes Yes Yes Yes Yes Yes Yes
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Table 6: Regional GDP per capita, Institutions, Infrastructure, and Culture
Ordinary least square regressions of the log of regional GDP per capita. All regressions include years of education, logarithm of population, temperature, inverse distance to coast, and the logarithm of per capita oil production and reserves. In addition, regressions include measures of: (1) institutions (Panel A) and (2) infrastructure and culture (Panel B). Robust standard errors are shown in parenthesis. For comparison, the
bottom panel shows the adjusted R2 of two alternative specifications: (1) a regression with all regressors except the measure of institutions or culture; and (2) a regression with all regressors except education. All variables are described in Table 1.
Panel A: Institutions
(1) (2) (3) (4) (5) (6) (7)
Years of Education in the Region 0.2758a 0.2713a 0.2627a 0.3021a 0.2986a 0.2644a 0.2719a
(0.0172) (0.0187) (0.0197) (0.0286) (0.0305) (0.0181) (0.0175)
Ln(Population in the Region) 0.0126 0.0101 0.0023 0.0091 0.0138 0.0170 0.0115(0.0168) (0.0168) (0.0184) (0.0187) (0.0193) (0.0173) (0.0157)
Temperature ‐0.0140c ‐0.0142c ‐0.0166c ‐0.0015 ‐0.0038 ‐0.0154c ‐0.0140c
(0.0084) (0.0085) (0.0085) (0.0060) (0.0056) (0.0090) (0.0080)
Inverse Distance to Coast 0.4971a 0.4872a 0.4626a 0.4750c 0.4093 0.4351a 0.5162a
(0.1441) (0.1427) (0.1438) (0.2590) (0.2713) (0.1358) (0.1450)
Ln(Oil) 0.1752a 0.1793a 0.1864a 0.0534 0.0354 0.1922a 0.1772a
(0.0578) (0.0584) (0.0582) (0.0669) (0.0572) (0.0613) (0.0591)
Ln(Power line density) 0.0199(0.0198)
Ln(Travel time) ‐0.0456c
(0.0231)
Trust in others ‐0.0611(0.0868)
Civic values ‐0.0040(0.0231)
Ln(Number of ethnic groups) ‐0.0504b
(0.0249)
Probability of same language 0.1723(0.2067)
Constant 6.3853a 6.4350a 6.9287a 6.0940a 6.0196a 6.5272a 6.2956a
(0.1947) (0.1928) (0.3351) (0.2863) (0.3245) (0.1679) (0.2337)
Observations 1,469 1,469 1,469 699 635 1,468 1,445Number of countries 104 104 104 65 70 104 103
R2 Within 42% 42% 43% 49% 48% 42% 42%
R2 Between 60% 60% 60% 50% 50% 60% 60%
R2 Overall 62% 62% 61% 50% 47% 62% 62%
Within R2 without institution 42% 42% 42% 49% 48% 42% 42%
Within R2 without education 9% 13% 17% 10% 10% 14% 11%
Between R2 without institution 60% 60% 60% 51% 50% 60% 59%
Between R2 without education 42% 51% 47% 7% 17% 47% 50%Fixed Effects Yes Yes Yes Yes Yes Yes Yes
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Table 6: Regional GDP per capita, Institutions, Infrastructure, and Culture (Cont)
Panel B: Infrastructure and Culture
(1) (2) (3) (4) (5) (6)
Years of Education 0.2567a 0.2200a 0.2069a 0.1626a 0.2448a 0.1850a
(0.0308) (0.0433) (0.0438) (0.0480) (0.0363) (0.0351)
Ln(Population) 0.0683c 0.0354 0.0559 ‐0.0356 0.0732 0.0504(0.0410) (0.0487) (0.0470) (0.0482) (0.0533) (0.0370)
Temperature ‐0.0189c ‐0.0179 ‐0.0135 0.0024 ‐0.0181 ‐0.0100(0.0106) (0.0118) (0.0109) (0.0106) (0.0126) (0.0104)
Inverse Distance to Coast 2.9646a 2.3421a 2.3853a 2.3974a 2.9603a 1.9906a
(0.5742) (0.7800) (0.6050) (0.5941) (0.6208) (0.5463)
Ln(Oil) 0.9503a 0.7877c 1.0708a 0.8965a 1.0720b 0.9928a
(0.1373) (0.4564) (0.1729) (0.1100) (0.4094) (0.2013)
Autocracy ‐0.5994a
(0.2184)
Executive Constraints 0.1633b
(0.0696)
Expropriation Risk 0.3952a
(0.0986)
Proportional Representation 0.3972c
(0.2328)
Corruption 0.2130a
(0.0479)
Constant 3.5771a 5.3781a 3.7896a 3.1830b 3.2958a 4.1183a
(0.9416) (1.3861) (1.0059) (1.3630) (1.0503) (0.8118)
Observations 103 80 101 81 97 103
Adjusted R2 63% 67% 65% 70% 63% 69%
Adj. R2 without institution 63% 64% 63% 63% 62% 63%
Adj. R2 without education 50% 60% 59% 67% 52% 63%
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Table 7: National GDP per capita and commonly used measures of institutions
Ordinary least square regressions of the log of GDP per capita at the country level. All regressions include the years of education, logarithm of population, temperature, inverse distance to coast, and the logarithm of per capita oil production and reserves. In addition, regressions include the following variables: (1) Autocracy; (2) Executive constraints; (3) Expropriation riks; (4) Proportional representation; and (5) Corruption. Robust standard errors are shown in parenthesis. For comparison, the bottom
panel shows the adjusted R2 of two alternative specifications: (1) a regression with all regressors except the measure of institutions or culture; and (2) a regression with all regressors except education.
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15)
Years of Education of manager 0.0470a 0.0322a 0.0227a 0.0242a 0.0148a 0.0445a 0.0311a 0.0267a 0.0241a 0.0150a 0.0262a 0.0179a 0.0082a 0.0112a 0.0049(0.0033) (0.0034) (0.0033) (0.0033) (0.0047) (0.0033) (0.0033) (0.0032) (0.0033) (0.0041) (0.0027) (0.0027) (0.0028) (0.0028) (0.0038)
Ln(Employees) . 0.1472a . ‐0.0134b 0.1324a . 0.1388a . 0.0230a 0.1099a . 0.0822a . ‐0.0257a 0.0613a
. (0.0097) . (0.0062) (0.0127) . (0.0101) . (0.0066) (0.0121) . (0.0081) . (0.0055) (0.0107)
Years of Education of workers 0.0318a 0.0243a 0.0434a 0.0443a 0.0071 0.0288a 0.0217a 0.0514a 0.0500a 0.0082c 0.0164a 0.0120a 0.0151a 0.0167a 0.0056(0.0033) (0.0033) (0.0027) (0.0028) (0.0044) (0.0037) (0.0036) (0.0030) (0.0030) (0.0045) (0.0028) (0.0028) (0.0025) (0.0026) (0.0038)
Ln(Expenditure on energy / employee) 0.3408a 0.3401a . . 0.2843a 0.3035a 0.3007a . . 0.2389a 0.2180a 0.2176a . . 0.1738a
(0.0095) (0.0094) . . (0.0118) (0.0102) (0.0101) . . (0.0117) (0.0086) (0.0086) . . (0.0102)
Ln(Property, Plant, Equipment / employees) . . 0.3038a 0.3046a 0.1851a . . 0.2964a 0.2949a 0.1745a . . 0.1668a 0.1684a 0.1126a
. . (0.0069) (0.0070) (0.0101) . . (0.0072) (0.0072) (0.0100) . . (0.0058) (0.0058) (0.0089)
Constant 6.9256a 6.6765a 4.2481a 4.2722a 5.9533a 9.3425a 9.1129a 5.1127a 5.0704a 8.9826a 6.2151a 6.0776a 2.7971a 2.8432a 5.5371a
(0.0673) (0.0660) (0.0556) (0.0564) (0.0889) (0.0682) (0.0674) (0.0564) (0.0574) (0.0863) (0.0595) (0.0604) (0.0487) (0.0500) (0.0784)
Observations 13,248 13,248 19,305 19,305 7,733 10,651 10,651 17,893 17,893 6,655 12,782 12,782 19,209 19,209 7,706Number of Regions‐Industries 855 855 1,037 1,037 487 754 754 1,005 1,005 458 807 807 1,033 1,033 486
Within R2 21% 23% 21% 21% 30% 20% 22% 20% 20% 29% 13% 14% 8% 8% 17%
Between R2 90% 89% 67% 67% 88% 28% 24% 55% 54% 51% 89% 88% 66% 67% 86%
Overall R2 79% 78% 58% 58% 81% 37% 33% 59% 58% 53% 76% 74% 57% 59% 78%Regions‐Industry fixed effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Table 8: Firm level productivity
Dependent Variable:Logarithm of Sales per employee Ln[(Sales ‐ Raw Materials)/Employee] Logarithm of Wages per employee
The table reports fixed effect regressions for for the following two dependent variables: (1) logarithm of sales per employee, (2) logarithm of sales net of raw materials per employee, and (3) logarithm of wages per employee. All regressions include region‐industry fixed effects. Errors are clustered at the regional level. The independent variables include: (1) Years of Education of manager, (2) Ln(Employees), (3) Years of Education of workers, (4) Ln(Expenditure on energy / employee), and (5) Ln(Property, Plant, Equipment / employees). All variables are described in Table 1.
Panel A: Basic Specification
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15)
Years of Education in the Region 0.0655a 0.0639a 0.0954a 0.0950a 0.0478b 0.0748a 0.0735a 0.0945a 0.0928a 0.0565a 0.0580a 0.0577a 0.0840a 0.0843a 0.0462a
(0.0202) (0.0185) (0.0280) (0.0279) (0.0185) (0.0197) (0.0181) (0.0275) (0.0270) (0.0177) (0.0162) (0.0159) (0.0233) (0.0234) (0.0151)
Ln(Population in the Region) 0.0920a 0.0803a 0.1437a 0.1409a 0.0917a 0.1332a 0.1178a 0.1046b 0.0938c 0.1188a 0.0682 0.0622 0.0135 0.0159 0.0787c
(0.0321) (0.0297) (0.0501) (0.0504) (0.0328) (0.0343) (0.0327) (0.0511) (0.0501) (0.0410) (0.0425) (0.0418) (0.0352) (0.0354) (0.0413)
Years of Education of manager 0.0534a 0.0352a 0.0257a 0.0243a 0.0169b 0.0498a 0.0335a 0.0287a 0.0236a 0.0165a 0.0315a 0.0215a 0.0118a 0.0131a 0.0070(0.0047) (0.0048) (0.0062) (0.0057) (0.0077) (0.0047) (0.0042) (0.0044) (0.0043) (0.0058) (0.0038) (0.0036) (0.0044) (0.0042) (0.0052)
Ln(Employees) . 0.1497a . 0.0113 0.1468a . 0.1392a . 0.0401a 0.1177a . 0.0827a . ‐0.0095 0.0717a
. (0.0154) . (0.0176) (0.0193) . (0.0176) . (0.0141) (0.0193) . (0.0150) . (0.0108) (0.0187)
Years of Education of workers 0.0349a 0.0279a 0.0384a 0.0378a 0.0066 0.0332a 0.0264a 0.0490a 0.0468a 0.0112c 0.0195a 0.0151a 0.0146a 0.0152a 0.0049(0.0053) (0.0054) (0.0056) (0.0058) (0.0068) (0.0059) (0.0057) (0.0059) (0.0061) (0.0062) (0.0036) (0.0036) (0.0033) (0.0033) (0.0051)
Ln(Expenditure on energy / employee) 0.3577a 0.3554a . . 0.2902a 0.3183a 0.3133a . . 0.2440a 0.2248a 0.2232a . . 0.1721a
(0.0185) (0.0177) . . (0.0220) (0.0182) (0.0175) . . (0.0202) (0.0173) (0.0172) . . (0.0168)
Ln(Property, Plant, Equipment / employees) . . 0.3258a 0.3250a 0.1946a . . 0.3150a 0.3118a 0.1818a . . 0.1787a 0.1794a 0.1230a
. . (0.0132) (0.0136) (0.0162) . . (0.0139) (0.0141) (0.0161) . . (0.0086) (0.0089) (0.0135)
Constant 5.1202a 5.0055a 4.8529a 4.8850a 4.6033a 4.5073a 4.4682a 3.9685a 4.1094a 3.8699a 5.1007a 5.0322a 6.6732a 6.6461a 5.0701a
(0.3706) (0.3373) (1.1885) (1.1887) (0.4521) (0.4183) (0.4008) (0.8173) (0.8009) (0.6479) (0.5225) (0.5199) (0.7223) (0.7248) (0.7073)
Observations 13,248 13,248 19,305 19,305 7,733 10,651 10,651 17,893 17,893 6,655 12,782 12,782 19,209 19,209 7,706Number of Countries 29 29 22 22 21 25 25 21 21 20 27 27 22 22 21
Within R2 30% 32% 31% 31% 37% 28% 30% 27% 28% 36% 20% 21% 13% 13% 22%
Between R2 90% 90% 59% 59% 92% 87% 86% 70% 71% 89% 88% 87% 57% 57% 89%
Overall R2 74% 74% 54% 54% 80% 74% 72% 73% 73% 81% 69% 68% 44% 44% 76%Country fixed effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes YesIndustry fixed effects Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes Yes
Note: a = significant at the 1% level, b = significant at the 5% level, and c = significant at the 10% level.
Table 9: Firm level productivity and Regional Human Capital
The table reports fixed effect regressions for for the following two dependent variables: (1) logarithm of sales per employee, (2) logarithm of sales net of raw materials per employee, and (3) logarithm of wages per employee. All regressions include country and industry fixed effects. Errors are clustered at the regional level. The independent variables include: (1) Years of Education in the Region, (2) Ln(Population in the Region), (3) Years of Education of manager, (4) Ln(Employees), (5) Years of Education of workers, (6) Ln(Expenditure on energy / employee), and (7) Ln(Property, Plant, Equipment / employees). All variables are described in Table 1.
Ln[(Sales ‐ Raw Materials)/Employee] Logarithm of Wages per employeeLn(Sales/Employee)Dependent Variable:
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12) (13) (14) (15)
Temperature 0.0084 0.0105 0.0081 0.0082 0.0190 ‐0.0217 ‐0.0200 ‐0.0023 ‐0.0021 0.0278 ‐0.0141 ‐0.0128 0.0113b 0.0112b ‐0.0046(0.0134) (0.0124) (0.0079) (0.0079) (0.0150) (0.0171) (0.0155) (0.0079) (0.0079) (0.0232) (0.0116) (0.0112) (0.0048) (0.0048) (0.0100)
Inverse Distance to Coast 0.3464 0.2275 1.1491a 1.1458a 0.6196 ‐0.2990 ‐0.4531 1.0464a 1.0125a ‐0.1930 ‐0.0127 ‐0.0841 0.2830 0.2947 0.1021(0.3708) (0.3652) (0.2065) (0.2079) (0.4041) (0.4233) (0.3851) (0.2286) (0.2228) (0.5291) (0.3349) (0.3369) (0.1824) (0.1824) (0.3564)
Ln(Oil) ‐0.8681a ‐0.6813b 0.1682 0.1728 ‐1.2953c ‐0.7747a ‐0.5679c ‐0.0945 ‐0.0492 ‐0.6443 ‐0.6755c ‐0.5721 0.1596 0.1433 ‐1.2264b
(0.2856) (0.3124) (0.5944) (0.5919) (0.7382) (0.2878) (0.3084) (0.4935) (0.4942) (0.8883) (0.3430) (0.3619) (0.3833) (0.3804) (0.4702)
Years of Education in the Region 0.0532a 0.0558a 0.0236 0.0236 0.0287c 0.0800a 0.0839a 0.0307 0.0310 0.0680a 0.0560a 0.0583a 0.0639b 0.0638b 0.0467b
(0.0198) (0.0184) (0.0287) (0.0287) (0.0162) (0.0205) (0.0188) (0.0299) (0.0295) (0.0180) (0.0182) (0.0181) (0.0261) (0.0261) (0.0179)
Ln(Population in the Region) 0.0963a 0.0817a 0.0713c 0.0705c 0.1029b 0.1235a 0.1062a 0.0493 0.0418 0.1195a 0.0706c 0.0629 ‐0.0257 ‐0.0230 0.0852c
(0.0321) (0.0295) (0.0382) (0.0386) (0.0416) (0.0319) (0.0300) (0.0401) (0.0395) (0.0443) (0.0417) (0.0411) (0.0291) (0.0290) (0.0439)
Years of Education of manager 0.0533a 0.0353a 0.0271a 0.0266a 0.0169b 0.0495a 0.0333a 0.0300a 0.0256a 0.0168a 0.0314a 0.0216a 0.0126a 0.0142a 0.0069(0.0047) (0.0047) (0.0060) (0.0053) (0.0077) (0.0047) (0.0042) (0.0044) (0.0041) (0.0059) (0.0037) (0.0036) (0.0043) (0.0040) (0.0052)
Ln(Employees) . 0.1486a . 0.0035 0.1453a . 0.1393a . 0.0341b 0.1178a . 0.0821a . ‐0.0125 0.0719a
. (0.0153) . (0.0175) (0.0190) . (0.0174) . (0.0137) (0.0191) . (0.0150) . (0.0110) (0.0187)
Years of Education of workers 0.0344a 0.0275a 0.0406a 0.0404a 0.0058 0.0333a 0.0266a 0.0514a 0.0494a 0.0112c 0.0194a 0.0152a 0.0154a 0.0162a 0.0047(0.0053) (0.0054) (0.0059) (0.0061) (0.0067) (0.0059) (0.0057) (0.0061) (0.0063) (0.0062) (0.0037) (0.0037) (0.0034) (0.0034) (0.0052)
Ln(Expenditure on energy / employee) 0.3571a 0.3549a . . 0.2890a 0.3178a 0.3128a . . 0.2426a 0.2245a 0.2230a . . 0.1717a
(0.0184) (0.0176) . . (0.0220) (0.0183) (0.0176) . . (0.0204) (0.0174) (0.0172) . . (0.0168)
Ln(Property, Plant, Equipment / employees) . . 0.3191a 0.3189a 0.1929a . . 0.3108a 0.3082a 0.1825a . . 0.1765a 0.1773a 0.1228a
. . (0.0122) (0.0127) (0.0160) . . (0.0132) (0.0134) (0.0163) . . (0.0085) (0.0088) (0.0135)
Constant 5.0425a 5.1078a 5.1076a 5.1172a 3.5975a 11.2711a 11.3696a 4.9529a 5.0614a 3.8682a 6.0203a 5.9598a 6.8223a 6.7884a 4.9977a
(0.4348) (0.3958) (0.9995) (1.0016) (0.5013) (0.8349) (0.7626) (0.6071) (0.6003) (0.9306) (0.7249) (0.7132) (0.6127) (0.6112) (0.7689)
Observations 13,248 13,248 19,305 19,305 7,733 10,651 10,651 17,893 17,893 6,655 12,782 12,782 19,209 19,209 7,706(0.2856) (0.3124) (0.5944) (0.5919) (0.7382) (0.2878) (0.3084) (0.4935) (0.4942) (0.8883) (0.3430) (0.3619) (0.3833) (0.3804) (0.4702)
Years of Education in the Region 0.0532a 0.0558a 0.0236 0.0236 0.0287c 0.0800a 0.0839a 0.0307 0.0310 0.0680a 0.0560a 0.0583a 0.0639b 0.0638b 0.0467b
(0.0198) (0.0184) (0.0287) (0.0287) (0.0162) (0.0205) (0.0188) (0.0299) (0.0295) (0.0180) (0.0182) (0.0181) (0.0261) (0.0261) (0.0179)
Ln(Population in the Region) 0.0963a 0.0817a 0.0713c 0.0705c 0.1029b 0.1235a 0.1062a 0.0493 0.0418 0.1195a 0.0706c 0.0629 ‐0.0257 ‐0.0230 0.0852c
(0.0321) (0.0295) (0.0382) (0.0386) (0.0416) (0.0319) (0.0300) (0.0401) (0.0395) (0.0443) (0.0417) (0.0411) (0.0291) (0.0290) (0.0439)
b
Table10: Firm level productivity, Regional Human Capital, and Geography
Dependent Variable:Logarithm of Sales per employee Logarithm of Wages per employeeLn[(Sales ‐ Raw Materials)/Employee]
The table reports fixed effect regressions for for the following two dependent variables: (1) logarithm of sales per employee, (2) logarithm of sales net of raw materials per employee, and (3) logarithm of wages per employee. All regressions include country and industry fixed effects. Errors are clustered at the regional level. The independent variables include: (1) Temperature, (2) Inverse Distance to Coast, (3) Ln(Oil), (4) Years of Education in the Region, (5) Ln(Population in the Region), (6) Years of Education of manager, (7) Ln(Employees), (8) Years of Education of workers, (9) Ln(Expenditure on energy / employee), and (10) Ln(Property, Plant, Equipment / employees). All variables are described in Table 1.
Figure 1: Regions in the database
Figure 2: Within‐country standard deviation of GDP per capita and development
Figure 3: Within‐country standard deviation of years of education and development
ZWE
ZAR
MWI
MOZ
NER
MDG
UGA
NPL
GHA
TZA
BFA
ZMBLSO
BEN
KHM
SEN
KEN
LAO
NGA
KGZ
MNG
CMR
UZB
NIC
VNM
PAK
MDA
HND
PHL
LKA
IND
IDN
GEOMAR
GTM
ARM
PRY
JORSYR
BOL
SLVSWZ
CHN
AZE
DOM
NAM
UKRALB
EGY
PER
BLZ
COL
SRB
ECUMKD
PAN
BRA
URY
BIH
LVA
ZAF
VEN
THA
BGR
LBN
IRN
KAZROM
TUR
RUS
ARG
ESTHRV
LTU
MYSMEXCHL
POL
HUN
SVK
GAB
CZE
KORSVNGRCPRTNZL
ISR
ESP
ITA
FRAJPNFIN
DEU
GBR
BEL
SWE
DNK
ARE
AUSNLDAUT
IRLCHECAN
NORUSA
-.2
0.2
.4.6
Sta
nda
rd d
evi
atio
n o
f Ln
Reg
iona
l GD
P p
er c
apita
)
-6 -4 -2 0 2Ln GDP per capita
coef = -.02489678, (robust) se = .01172037, t = -2.12
ZWE
ZAR
MOZMWI
NER
MDG
UGA
NPL
GHA
TZABFA
KEN
ZMB
BEN
LSOKHMSEN
NGA
KGZ
LAO
MNG
UZB
CMR
VNM
NICMDA
PAK
HNDIDNIND
LKA
PHL
GEO
GTM
MAR
CHN
PRY
BOL
ARM
JOR
NAM
SYR
SLV
SWZDOM
AZE
UKR
PER
EGY
COL
PAN
BLZ
ECUSRB
MKD
BRA
THA
LVAURY
BIH
ZAF
VEN
RUS
ARG
LBN
BGR
KAZ
TURROM
EST
HRV
MEX
MYS
CHL
LTU
POL
HUN
SVK
GAB
CZESVNGRCPRT
NZL
ISR
ITA
ESPJPNFRABEL
ARE
FIN
DEU
GBR
DNK
SWENLD
AUS
AUT
CHE
IRLCAN
USA
NOR
-.5
0.5
11
.5S
td D
ev
Ye
ars
of E
duc
atio
n
-6 -4 -2 0 2Ln(GDP per capita)
coef = -.12647737, (robust) se = .02655853, t = -4.76
Figure 4: GDP per capita and years of schooling in Brazil, Colombia, India, and Russia
Figure 5: Partial Correlation Graph of (Log) GDP per capita and Years of Education
PiauíAlagoasMaranhãoParaíbaCeará
BahiaAcreSergipeParáTocantins
Rondônia
Rio Grande do NortePernambuco
AmazonasRoraima
Mato Grosso
GoiásMinas GeraisMato Grosso do SulAmapá
Espírito SantoParanáSanta CatarinaRio Grande do Sul
São PauloRio de Janeiro
Distrito Federal
-1-.
50
.51
1.5
e( ln
Reg
GD
P |
X )
-1 0 1 2 3e( HKij | X )
coef = .53632514, (robust) se = .03022503, t = 17.74
Brazil
Vichada
PutumayoGuainiaGuaviareChoco
La Guajira
CaquetaNarinoVaupesCauca
Casanare
Amazonas
Cordoba
Sucre
HuilaArauca
CesarBoyaca
Norte de SantanderMagdalena
TolimaCundinamarcaMeta
BolivarCaldasRisaralda
SantanderAntioquia
Quindio
Valle del CaucaAtlantico
San Andres y ProvidenciaBogota
-1-.
50
.51
1.5
e( ln
Reg
GD
P |
X )
-2 -1 0 1 2 3e( HKij | X )
coef = .23096736, (robust) se = .04695696, t = 4.92
Colombia
Bihar
OrissaJharkhandChattisgarhRajasthanArunachal PradeshAndhra Pradesh
Madhya Pradesh
West Bengal
Uttar Pradesh
MeghalayaAssamTripuraSikkimNagalandKarnataka
Gujarat
Jammu & Kashmir
Andaman & Nicobar IslandsHaryanaTamil NaduPunjabMaharashtra
MizoramUttaranchalHimachal Pradesh
Manipur
Goa
KeralaPondicherryDelhi
Chandigarh
-10
12
e( ln
Reg
GD
P |
X )
-2 0 2 4 6e( HKij | X )
coef = .31301955, (robust) se = .0470328, t = 6.66
India
Kurgan RegionJewish Autonomous RegionKirov Region
Chechen Republic
Perm RegionRepublic of BashkortostanArkhangelsk Region
Altai KraiKursk RegionChita RegionTambov RegionAltai RepublicPskov regionBryansk Region
Kemerovo RegionVologda Region
The Republic of Mari El
Lipetsk Region
Kostroma regionOrenburgskayaRepublic of KareliaNovgorod RegionChuvash RepublicOrel RegionIvanovo RegionRyazan RegionTver RegionPenza Region
Republic of Ingushetia
Sverdlovsk RegionAmur Region
Komi Republic
Smolensk RegionVladimir RegionUlyanovsk RegionRepublic of Mordovia
Yaroslavl RegionChelyabinsk region
Tuva Republic
Republic of KhakassiaTula RegionUdmurt RepublicOmsk RegionNizhny Novgorod RegionLeningrad Region
Krasnodar
Adygeya Republic
Belgorod regionKrasnoyarsk KraiThe Republic of Tatarstan
Stavropol Territory
Irkutsk RegionRepublic of BuryatiaVoronezh RegionNovosibirsk RegionAstrakhan Region
Dagestan Republic
Sakhalin Oblast
Saratov RegionVolgograd Region
Republic of Kalmykia
Rostov RegionKaluga RegionPrimorsky Krai
Karachay-Cherkessia Republic
Tomsk RegionChukotka Autonomous DistrictMurmansk regionSamara RegionKhabarovsk Krai
Tyumen Region
Kabardino-Balkar Republic
The Republic of Sakha (Yakutia)
Kaliningrad Oblast
Magadan Region
Republic of North Ossetia - Alania
Kamchatka RegionMoscow RegionSt. Petersburg
Moscow
-2-1
01
2e(
lnR
egG
DP
| X
)
-1 0 1 2e( HKij | X )
coef = .54581172, (robust) se = .12771885, t = 4.27
Russian Federation
Ln GDP per capita and Years of Schooling
KEN
CHNCMRPAN
CMR
NGAPERGHAIDN
GHAMEXHND
COL
GHA
NIC
NGA
PER
MEXPERPHL
PERCOLPER
BOL
IND
NAM
LAOARG
SYR
MYSTUR
PER
PRY
MYS
NAM
IDN
PER
VEN
ARG
ZWE
ZMBHNDLBNPHL
BRA
DOMPAN
ZAR
NIC
THAZMBJOR
COL
COLGTMMKD
COL
PRY
ECU
BRABRA
GAB
CMR
MEX
BENTZA
MEX
SWZ
IND
NAM
ECU
IND
ECU
IND
ARG
IDN
NIC
IDN
IDN
ZMB
INDEGY
THA
VNMSRBLAO
BLZMEX
ESPIDN
ZAR
LAO
COL
CHN
CHN
IND
VNM
VENSYRISRZARLVA
CHL
IND
PRYHNDHND
ARG
PER
ESPMEXLVA
LAO
TUR
AUS
SYR
COLLSO
BRA
HNDCHLSRB
IND
PHLECU
IDN
MEX
JOR
COL
TZAURY
MKD
COLHND
INDNIC
NICVENZMBKHMDEU
LVA
BOL
BRA
TZA
MNGLVACHL
BELBRA
LAO
SRB
IDNSVN
IDNDOMCHNBRA
ARG
NAM
PHLKHMNZL
LVA
BENJPN
NAM
LSO
ZAR
CHN
IND
HRVZWEBRAJPNBENMARSLV
UKR
MDG
ZAF
COLPER
IDN
VENVENKHMSYR
VENLTU
GTMSVKFRAAREDNKARE
RUS
DEU
NAM
RUSNZL
HRV
LSO
MDAMARSRBLAOSYR
JPN
CHN
LBNURY
ZAF
CMR
MEXSENINDCAN
GTM
HNDCHL
URYZAR
COL
JPNLAO
NICGRCJPNHRVRUS
LAOJPNMYS
MEXCHNHND
SENPRY
UKRHRVITAKENCOL
ZWENICDEUZAF
IND
IND
LVA
NAMKAZ
ITABFA
ECU
CHL
PAN
LBNLVA
MNG
USABRA
BRAMEXSENZAF
SRB
UKR
ZWEEST
PER
SLVDNK
RUS
MNG
AUT
LKACZENOR
RUS
ITA
LAO
LBN
DEUMDG
HRV
RUS
NORDNK
NZLMARGEOMAR
USAGEOFRA
SLV
NER
TZA
USA
MYSESPHRV
LTU
TUR
BEN
MNGMDG
RUS
RUSSVK
PRY
PRY
UKR
UGASENVENPRY
DNKBFA
NER
HRVSRBMDAMNGNORJPNLVAUZBMNG
MNG
MAR
SRB
LVA
MYSBGRIND
USA
JPNCOLLSOFRARUSAZEMEXPAKROM
TZA
MNG
GBR
MNGMNG
GRCJPNJPNMDAKHMESTDEUSWE
NERLVA
ZMBNLD
USASWZ
MOZ
ROMHRV
LVAUSA
MOZECU
TZALVA
KHMRUSPOL
KEN
PRTKEN
NZL
IDN
ARMCANPHL
ROMECU
BLZ
RUS
GRC
ITANZL
UKRRUS
JORURY
ESTINDRUSBFA
BEN
ARMHUN
DNK
MYS
MWI
COL
DOM
RUS
BFAGEOLAOSLVBRAJPN
ITA
NORBFA
ARG
DNKLVABGR
MYS
SVNNERDNKVENESP
BIH
SRBTZANPL
TZA
AZEESTBRAUSAISRUSASLV
COL
HRVNICHUN
NER
NLDSYR
URY
BEL
DEU
PRY
HRVIDN
KEN
NERVNMECU
RUS
ECUJORPRY
TZA
JPNSLVBRA
COL
PAK
ARG
FRASEN
RUS
AUTBENURYUKR
BLZ
MNGMAR
JPNGBR
UKR
GBR
RUS
URYNZLMNG
ARE
NORKHM
ARM
RUS
LVA
CHNPRY
USAARMESTCHECHEZWEIRLLTUARE
BFAECUSRB
RUSKHM
USALVAINDNLDUSA
RUS
GEOLVA
RUS
HUNZWE
ECU
CHECHN
GTM
TZAMOZ
SVN
MNG
MWI
CAN
FRA
CANAUTRUSLTUJPN
CHEARM
DEU
PRYBEL
KGZ
PRT
RUSMKDUGA
CHE
FINKAZAZEVEN
ROMCANUKRPOLMOZCHNBLZ
CHN
JPN
IND
LKASLVLVAGRCEST
KGZ
CHN
DNKAZEZAF
RUS
ARG
HRVTUR
URYKGZPHLGBRMOZUSASENSVK
MKD
NLD
RUS
JPNRUSDOMUKRFRARUS
RUS
UKR
BFAPOLBELESP
MNG
GEO
ZAR
CZENOR
GRC
ROM
ARG
SWEGBR
CHECHN
SVN
ECU
JPNBFA
SVK
POLARGKENFRAJPNCHLJPNPHLVEN
PRT
AUSJORUKR
IDN
BEN
ESPCHE
FIN
BGRCOL
RUS
SVKLKABGR
UZBJPN
USAFRAPHLLTU
RUS
ZWE
USA
ESP
CHNMDGUSAGBR
KGZEST
GRCARMUGACHEITAMOZHUNGEOESP
GAB
FRANLD
BENIDN
CHEPHLITAINDHRVSVK
RUSNZLMEX
MEXCHN
DEU
CHEPOL
SLVHNDNORVEN
CHL
NPL
RUS
POLLKA
AUSBFATZA
JPNRUS
JPN
BOL
POLUSAGBRDNK
USAUSAUKRAUT
RUS
FRA
RUSBELUKR
LVA
RUS
SRBPRTTZAGEOFRA
MEX
MAR
RUSRUS
RUS
FRALVA
SRB
ITAAUSPHL
GRCESPAZE
UKR
LTUAZERUSTURPERNORUKRCZEPRTCHESWE
IDNDEUKHMRUS
IDN
SVN
BLZRUSPAN
NORTHA
RUSAUS
EGYLSOLVA
MOZLVAESTSVNUKRBGRJPN
BIHGBRCHENOR
MEX
MDGURYPERKGZHRV
ZAR
BEL
EST
JPNJPNCZEHUN
CZEGAB
CHNGRCUSA
DEU
IDN
TZANLDCOLSWEKAZ
TZA
USA
CAN
JPNLTUCZESYRZAFDOMUKRSVNBRA
URY
JPNUSA
CANVNM
MNGLVA
IND
VENFIN
PAN
ITA
MYS
ESPFRAUKRRUS
NZL
CHEKAZGTMESPMYSJORNICPOLURY
SVN
HND
MARHND
LSOJPN
GRCARG
ECU
UKR
SEN
DNK
ITAUSACHE
BELRUS
BEL
NGANLDNZLUSAESTCHE
KGZ
KGZ
SLV
KHMPOLCHNROMGTMRUSGHA
ARE
NZLSWEAUS
NAM
POLAUT
IND
ZAF
GEO
ARG
BFA
SWEURY
RUS
NORLSONPLFRA
ARG
ARMLAO
ITA
CHN
LVA
ZWEURYSLV
COL
CHE
RUSNLD
NLD
BFADOMCZEHNDUSAMYSJPNARM
RUS
VEN
AUT
UKR
RUS
RUSUZBPRY
ARG
RUS
USAARG
MAR
COL
ARMUSA
NORSYRECUCHE
EST
RUSUSALKA
BENESTNIC
USA
TURUSALKAPRY
THA
VNMMAR
USA
NORUSAJPNUSAPRY
SRB
EGY
RUS
ZMB
FINPOL
SRB
RUSHUNUKR
RUS
MAR
JPNMNGUSAUZBJORSRB
USA
USAHRVITA
ARGUSA
DOM
LVA
BOL
GRCCHN
RUS
CHE
VEN
PRT
MOZ
MOZ
ARMCHEZAR
CHN
PAN
RUS
NPL
RUS
JPNRUS
RUS
USA
NAMPER
NOR
RUS
ISRECUCANUSA
CHL
BRAARGPOL
USA
MEX
JPNNORKHMTUR
PHLAZEVENITA
JPNCMRSWE
NLD
MNGPHL
BRAITA
IDN
INDMEXHRV
TURECUCHN
SVKFRA
URY
USA
ESP
IND
USAUSASENROMPAK
AUTSVNPOLLAOESPCHN
TZAESP
UKR
ITA
RUSSLVJPNCOL
MEX
JPNBRA
IDN
TZA
KAZ
ESPPRYITA
GEOJPN
AUSJOR
POL
BOL
CAN
CHE
CHE
JORPOLFRANORGHAGRCMAR
IRLTUR
AUTURYSYR
ARGNORUSABRA
GEOPERITANZLCHNISR
CHE
SVNZMB
URYMEXNPLUSABRAIND
MNGMEXUSA
LTUNICUSA
MEX
CHNCHNRUS
RUS
GBR
RUSCHN
MEX
ISRDEU
CHE
VNMURY
RUS
PRYBENUSA
URY
SVNITATZANAMSYRUSAUSA
VEN
GBR
BEL
FINDOM
RUS
UKR
AZE
SRB
LAOITACHN
PERRUSRUSMYS
BOL
ZAR
COLARGCANIDNHND
DEU
CHE
IDNGRCBRABIH
ARGUSA
LSO
GHA
DNKGBRCAN
RUS
JPNBELSLVKHM
ECU
RUS
LAOMKDCOLNERISR
IND
UZB
ARG
CHE
MKD
DEU
POL
PANNIC
PER
BRATUR
NLD
CHLPAK
IDN
RUS
ARG
INDRUSJPNNOR
CHN
DNKMEXJPN
CHLMEX
COLJPN
JORFRAESTVENMKDSWZ
RUS
NZLJPN
LVA
KHMFRA
LBNPHL
BFA
CMR
SWEDNK
USA
JORGTMCANCOLLTUCOL
MEXMWI
LAO
RUS
HRV
MEX
LVA
KAZ
JPN
MNGLSOAUTBRA
DEU
MYSESPVENRUSGBR
IDN
IDN
NLDZARSEN
TZA
COL
FRA
BOL
PERDNKGHA
IND
SRBVEN
BRABEN
IDN
MEX
SRBVNM
SRBEST
PHLNIC
JPN
DEUJPN
MEX
HRVSYRTZAVENPHL
LVA
RUS
LVA
IND
ITACHLGHACOL
PER
VEN
COLARE
ECUHRV
TUR
BOL
HNDUGA
MKD
MEXSLVLVABRA
LKA
CHL
SWZAZE
SVN
TURCMRBELNICMEXFRAPRT
KGZ
LVA
VENCZE
NZLNICLVAESPGHA
CHLESTIND
SYR
AREESPTZAINDUKR
LTUEGY
PER
LAO
HND
HUN
JPN
ECUPHLSYR
HND
SYR
DEU
NICJOR
VENNOR
ZAF
AUS
IND
VNM
NGA
HRVKENNZL
ARG
NGABGRMARDNKJPN
LVA
MEXBRA
COLNGA
GAB
MNGMAR
GRC
UKR
MNG
BRA
NAMSLV
ROM
PAN
PER
CHN
ARMCMR
ZAFMDAGEO
UKR
PER
MOZ
RUS
CMR
NAMHNDLSOHRV
SVK
CMRECUSEN
RUS
ZWE
MYSCOLPRY
ZWE
NERMDGPER
LVA
PHL
MYSMNG
CHN
PER
BLZDOM
IDN
GTM
URY
ZMB
ZMB
BRA
MEX
LAO
TZAPER
ECU
BFA
IND
LBN
PAN
THA
NICGHAKHM
COL
SRBLAO
IDN
HND
CHN
NAM
COL
BEN
ARG
IDN
ZAR
KEN
PRY
IND
-2-1
01
2L
n(R
egio
nal G
DP
per
cap
ita
-4 -2 0 2 4Years of Education
coef = .28573483, (robust) se = .01343762, t = 21.26
Code Country Type of Data Source Available link
ALB Albania GDP Data from HDR 2002ARE United Arab Emirates GDP Data from HDR 1997 in arabicARG Argentina GDP 1990‐2001 Data from Ministry of interior http://www.ec.gba.gov.ar/Estadistica/FTP/pbg/pbg3.html
ARM Armenia Expenditure National Statistics Office http://www.armstat.am/file/article/marz_07_e_22.pdf
AUS Australia GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
AUT Austria GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
AZE Azerbaijan Income National Statistics Office http://74.125.47.132/search?q=cache:http://www.azstat.org/statinfo/budget_households/en/003.shtml
BEL Belgium GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
BEN Benin GDP Data from HDR 2007/2008 and 2003BFA Burkina Faso GDP Data from HDR for GDP per capita.BGD Bangladesh NABGR Bulgaria GDP Data from HDR 2003, 2002 and 2001BIH Bosnia and Herzegovina GDP National Statistics Offices http://www.bhas.ba/Arhiva/2007/brcko/PODACI%201‐08.pdf
BLZ Belize Expenditure Data from LSMS 2002 http://www.statisticsbelize.org.bz/dms20uc/dm_filedetails.asp?action=d&did=13
BOL Bolivia GDP National Statistics Office http://www.ine.gov.bo/indice/visualizador.aspx?ah=PC0104010201.HTM
BRA Brazil GDP National Statistics Office http://www.ibge.gov.br/home/estatistica/economia/contasregionais/2002_2005/contasregionais2002 2005.pdf
CAN Canada GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
CHE Switzerland GDP National Statistics Office http://www.bfs.admin.ch/bfs/portal/fr/index/infothek/lexikon/bienvenue___login/blank/zugang lexikon.Document.20896.xls
CHL Chile GDP National Statistics Office http://www.bcentral.cl/publicaciones/estadisticas/actividad‐economica‐gasto/aeg07a.htm
CHN China GDP Data from National Statistics Yearbooks 2006, 2002, 1998 and 1996 http://www.stats.gov.cn/english/statisticaldata/yearlydata/YB1998e/C3‐8E.htm
CMR Cameroon Expenditure National Statistics Officehttp://www.statistics‐cameroon.org/archive/ECAM/ECAM2001/survey0/data/ECAM2001/Documentation/ECAM%20II%20‐%20Rapport%20principal.pdf
COL Colombia GDP National Statistics Office http://www.dane.gov.co/index.php?option=com_content&task=category§ionid=33&id=148&Itemid=705
CRI Costa Rica NACUB Cuba Wages Monthly wages from HDR 1996CZE Czech Republic GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
DEU Germany GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
DNK Denmark GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
DOM Dominican Republic GDP National Statistics OfficeECU Ecuador GDP National Statistics OfficeEGY Egypt GDP Data from HDRs 2008, 2005, 2004, 2003, 2001; data from 2006 excludedESP Spain GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
EST Estonia GDP Data from National Statistics Office
http://pub.stat.ee/px‐web.2001/I_Databas/Economy_regional/23National_accounts/01Gross_Domestic_product_(GDP)/14Regional_gross_domestic_product/14Regional_gross_domestic product asp
FIN Finland GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
FRA France GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
GAB Gabon Expenditure Data from HDR 2005GBR United Kingdom GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
GEO Georgia GDP Data from HDR 2002. http://www.undp.org.ge/nhdr2001‐02/chpt1.htm
GHA Ghana Income Data from Living Standards Measurement Survey Reports for 1998/9 and 1991/2 http://siteresources.worldbank.org/INTLSMS/Resources/3358986‐1181743055198/3877319‐1190221709991/G3report.pdf
GRC Greece GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
GTM Guatemala GDP Data from HDR 2007/2008 annex http://cms.fideck.com/userfiles/desarrollohumano.org/File/8012264236003654.pdf
HND Honduras GDP Data from HDR 2006HRV Croatia GDP Data from National Statistics Office http://www.dzs.hr/default_e.htm
HUN Hungary GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
IDN Indonesia GDP Data from National Statistics Office http://www.bps.go.id/sector/nra/grdp/table1.shtml
IND India GDP National Statistics Office http://mospi.nic.in/6_gsdp_cur_9394ser.htm
IRL Ireland GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
IRN Iran GDP Data from National Statistics Office http://www.sci.org.ir/content/userfiles/_sci_en/sci_en/sel/year85/f21/CS_21_4.HTM
ISR Israel GDP National Statistics OfficeITA Italy GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
JOR Jordan GDP Data from HDR 2004JPN Japan GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
KAZ Kazakhstan Income LSMS 1996, World Bank http://siteresources.worldbank.org/INTLSMS/Resources/3358986‐1181743055198/3877319‐1181930718899/finrep1.pdf
KEN Kenya GDP Data from HDRs for 2006, 2005, 2003, 2001 and 1999KGZ Kyrgyz Republic GDP Data from HDR 2005, 2001
APPENDIX OF DATA SOURCES: REGIONAL GDP
Code Country Type of Data Source Available link
KHM Cambodia Expenditure Data from Poverty profile of Cambodia 2004; Daily consumption http://www.mop.gov.kh/Situationandpolicyanalysis/PovertyProfile/tabid/191/Default.aspx
KOR Korea, Rep. GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
LAO Lao PDR C+I+G Data from HDR 2006; Consumption, Investment and Government ExpenditureLBN Lebanon GDP Data from HDR 2001
LKA Sri Lanka GDP Data from HDR 1998 and National Statistics Office http://www.cbsl.gov.lk/pics_n_docs/08_statistics/_docs/xls_real_sector/table1.17.xls
LSO Lesotho GDP Data from HDR 2006
LTU Lithuania GDP Data from National Statistics Office http://db1.stat.gov.lt/statbank/SelectVarVal/Define.asp?MainTable=M2010210&PLanguage=1&PXSId=0&ShowNews=OFF
LVA Latvia GDP National Statistics Office
http://data.csb.gov.lv/Dialog/varval.asp?ma=02‐02a&ti=2‐2.+GROSS+DOMESTIC+PRODUCT+BY+STATISTICAL+REGION,+CITY+AND+DISTRICT&path=../DATABASEEN/ekfin/Annual%20statistical%20data/02.%20Gross%20domestic%20product/&lang=1
MAR Morocco GDP + Expenditure Data from HDR 1999, 2003 and Enquete Nationale sur la Consommation et les Depenses des Menages 2000/2001
MDA Moldova Wages Data from 2007 Statistical Yearbook; monthly salary http://www.statistica.md/public/files/Yearbook/Venit_pop_1999_2006_en.doc
MDG Madagascar GDP Data from HDR 2003, 2000MEX Mexico GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
MKD Macedonia, FYR GDP Data from National Statistics Office http://www.stat.gov.mk/english/statistiki_eng.asp?ss=09.01&rbs=2
MNG Mongolia GDP National Statistics OfficeMOZ Mozambique GDP Data from HDR 2007, 2001MWI Malawi Expenditure Data from Malawi INTEGRATED HOUSEHOLD SURVEY 2004‐2005 and 1998MYS Malaysia GDP Data from Chapter 5 of EIGHTH MALAYSIA PLAN 2001 ‐ 2005 http://www.epu.jpm.my/new%20folder/development%20plan/RM8.htm
NAM Namibia Expenditure Data from Namibia Household Income & Expenditure Survey 2003/2004; data is expendituhttp://www.npc.gov.na/publications/prenhies03_04.pdfNER Niger GDP Data from HDR 2004NGA Nigeria Income 2006 Annual Abstract of Statistics. http://nigerianstat.gov.ng/annual_report.htm
NIC Nicaragua Expenditure Data from HDR 2002NLD Netherlands GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
NOR Norway GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
NPL Nepal GDP Data from HDR 2004, 2001 and 1998NZL New Zealand GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
PAK Pakistan GDP Data from HDR 2003PAN Panama GDP Data from National Statistics Office http://www.contraloria.gob.pa/dec/
PER Peru GDP Cuentas Nacionales del Peru, Producto Bruto Interno por Departmentos 2001‐2006 http://www1.inei.gob.pe/biblioineipub/bancopub/est/lib0763/cuadros/c037.xls
PHL Philippines GDP National Statistics OfficePOL Poland GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
PRT Portugal GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
PRY Paraguay GDP Data from Atlas de Desarrollo Humano Paraguay 2007 http://www.undp.org.py/dh/?page=atlas
ROM Romania GDP Data from National Statistics Office http://www.insse.ro/cms/files/pdf/en/cp11.pdf
RUS Russia GDP National Statistics Officehttp://translate.google.com/translate?ie=UTF8&oe=UTF&u=http://www.gks.ru/bgd/regl/b07_14p/IssWWW.exe/Stg/d02/10‐02.htm&hl=en&ie=UTF8&sl=ru&tl=en
SEN Senegal GDP Data from HDR 2001SLV El Salvador GDP Data from HDR 2007/2008, 2005, 2003, 2001; 1996 values were in 1994 pricesSRB Serbia Income Data from National Statistics Municipal Database http://www.statserb.sr.gov.yu/Pod/epok.asp
SVK Slovak Republic GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
SVN Slovenia GDP Data from National Statistics Office http://www.stat.si/eng/novica_prikazi.aspx?id=1318
SWE Sweden GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
SWZ Swaziland GDP Data from HDR 2008SYR Syrian Arab Republic GDP Data from HDR 2005THA Thailand GDP Data from Statistical Year Book Thailand 2002 http://web.nso.go.th/eng/en/pub/pub0.htm
TUR Turkey GDP National Statistics Office http://www.turkstat.gov.tr/VeriBilgi.do?tb_id=56&ust_id=16
TZA Tanzania GDP National Statistics OfficeUGA Uganda GDP Data from HDR 2007UKR Ukraine GDP Data from National Statistics Office http://www.ukrstat.gov.ua/operativ/operativ2008/vvp/vrp/vrp2008_e.htm
URY Uruguay GDP Data from HDR 2005USA United States GDP Data from OECDStats http://stats.oecd.org/WBOS/index.aspx
UZB Uzbekistan GDP Data from HDR 2007/8, 2000 and 1998VEN Venezuela GDP Data from HDR 2000VNM Vietnam Wages National Statistics Office http://www.gso.gov.vn/Modules/Doc_Download.aspx?DocID=2097
ZAF South Africa GDP National Statistics Office (table 16) http://www.statssa.gov.za/publications/statsdownload.asp?PPN=P0441&SCH=4048
ZAR Congo, Dem. Rep. GDP Data from HDR 2008ZMB Zambia GDP Data from HDR 2007 and 2003ZWE Zimbabwe GDP Data fom HDR 2003
Code Country Source Available Link
ALB Albania NAARE United Arab Emirates Ministry of Economy, 2005 Census http://www.economy.ae/English/economicandstatisticreports/statisticreports/pages/census2005.aspx
ARG Argentina Education Policy and Data Center (EPDC) http://epdc.org/
ARM Armenia Education Policy and Data Center (EPDC) http://epdc.org/
AUS Australia National Statistics Office http://www.abs.gov.au/
AUT Austria Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
AZE Azerbaijan Education Policy and Data Center (EPDC) http://epdc.org/
BEL Belgium Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
BEN Benin Education Policy and Data Center (EPDC) http://epdc.org/
BFA Burkina Faso Education Policy and Data Center (EPDC) http://epdc.org/
BGD Bangladesh Education Policy and Data Center (EPDC) http://epdc.org/
BGR Bulgaria Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
BIH Bosnia and Herzegovina Education Policy and Data Center (EPDC) http://epdc.org/
BLZ Belize Education Policy and Data Center (EPDC) http://epdc.org/
BOL Bolivia Education Policy and Data Center (EPDC) http://epdc.org/
BRA Brazil Integrated Public Use Microdata Series International (IPUMS) https://international.ipums.org/international/
CAN Canada National Statistics Office http://www40.statcan.gc.ca/l01/cst01/educ43a‐eng.htm
CHE Switzerland Swiss Labor Force Survey (SLFS) SFSO http://www.bfs.admin.ch/bfs/portal/de/index/themen/15/04/ind4.informations.40101.401.html
CHL Chile National Statistics Office http://espino.ine.cl/CuadrosCensales/apli_excel.asp
CHN China National Statistics Office http://www.stats.gov.cn/ndsj/information/nj97/C091A.END
CMR Cameroon Education Policy and Data Center (EPDC) http://epdc.org/
COL Colombia National Statistics Office http://190.25.231.246:8080/Dane/tree.jsf
CRI Costa Rica Education Policy and Data Center (EPDC) http://epdc.org/
CUB Cuba NACZE Czech Republic Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
DEU Germany Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
DNK Denmark National Statistics Office http://www.statbank.dk/statbank5a/SelectVarVal/Define.asp?Maintable=RASU1&PLanguage=1
DOM Dominican Republic Education Policy and Data Center (EPDC) http://epdc.org/
ECU Ecuador National Statistics Office http://190.95.171.13/cgibin/RpWebEngine.exe/PortalAction?&MODE=MAIN&BASE=ECUADOR21&MAIN=WebServerMain.inl
EGY Egypt Education Policy and Data Center (EPDC) http://epdc.org/
ESP Spain Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
EST Estonia National Statistics Officehttp://pub.stat.ee/px‐web.2001/Dialog/varval.asp?ma=PC414&ti=ECONOMICALLY+ACTIVE+POPULATION+BY+AGE,+EDUCATIONAL+ATTAINMENT+AND+ETHNIC+NATIONALITY*&path=../I Databas/Population census/06Economically active population/ =1
FIN Finland Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
FRA France Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
GAB Gabon Education Policy and Data Center (EPDC) http://epdc.org/
GBR United Kingdom Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
GEO Georgia National Statistics Office (special request of data)GHA Ghana Education Policy and Data Center (EPDC) http://epdc.org/
GRC Greece Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
GTM Guatemala Education Policy and Data Center (EPDC) http://epdc.org/
HND Honduras Education Policy and Data Center (EPDC) http://epdc.org/
HRV Croatia National Statistics Office http://www.dzs.hr/Eng/censuses/Census2001/Popis/E01_01_07/E01_01_07.html
HUN Hungary Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
IDN Indonesia Education Policy and Data Center (EPDC) http://epdc.org/
IND India Education Policy and Data Center (EPDC) http://epdc.org/
IRL Ireland Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
IRN Iran NAISR Israel Integrated Public Use Microdata Series International (IPUMS) https://international.ipums.org/international/
ITA Italy Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
JOR Jordan Education Policy and Data Center (EPDC) http://epdc.org/
JPN Japan National Statistics Office http://www.e‐stat.go.jp/SG1/chiiki/ToukeiDataSelectDispatchAction.do
KAZ Kazakhstan Education Policy and Data Center (EPDC) http://epdc.org/
KEN Kenya Education Policy and Data Center (EPDC) http://epdc.org/
KGZ Kyrgyz Republic Education Policy and Data Center (EPDC) http://epdc.org/
KHM Cambodia Education Policy and Data Center (EPDC) http://epdc.org/
KOR Korea, Rep. NALAO Lao PDR Education Policy and Data Center (EPDC) http://epdc.org/
LBN Lebanon Ministry of Social Affairs http://www.cas.gov.lb/images/PDFs/Educational%20status‐2004.pdf
LKA Sri Lanka Education Policy and Data Center (EPDC) http://epdc.org/
APPENDIX OF DATA SOURCES: REGIONAL YEARS OF EDUCATION
Code Country Source Available Link
LSO Lesotho Education Policy and Data Center (EPDC) http://epdc.org/
LTU Lithuania National Statistics Office http://db.stat.gov.lt/sips/Dialog/varval.asp?ma=gs_dem17en&ti=Population+by+educational+attainment+and+age+group%A0%28aged+10+years+and+over%29&path=../Database/cen en/p71en/demography/ =2
LVA Latvia National Statistics Office http://data.csb.gov.lv/Dialog/varval.asp?ma=tsk03a&ti=EDUCATIONAL+ATTAINMENT+OF+POPULATION&path=../DATABASEEN/tautassk/Results%20of%20Population%20Census%202000%20in%20brief/=1
MAR Morocco Education Policy and Data Center (EPDC) http://epdc.org/
MDA Moldova Education Policy and Data Center (EPDC) http://epdc.org/
MDG Madagascar Education Policy and Data Center (EPDC) http://epdc.org/
MEX Mexico Education Policy and Data Center (EPDC) http://epdc.org/
MKD Macedonia, FYR Education Policy and Data Center (EPDC) http://epdc.org/
MNG Mongolia Integrated Public Use Microdata Series International (IPUMS) https://international.ipums.org/international/
MOZ Mozambique Education Policy and Data Center (EPDC) http://epdc.org/
MWI Malawi Education Policy and Data Center (EPDC) http://epdc.org/
MYS Malaysia Integrated Public Use Microdata Series International (IPUMS) https://international.ipums.org/international/
NAM Namibia Education Policy and Data Center (EPDC) http://epdc.org/
NER Niger Education Policy and Data Center (EPDC) http://epdc.org/
NGA Nigeria Education Policy and Data Center (EPDC) http://epdc.org/
NIC Nicaragua Education Policy and Data Center (EPDC) http://epdc.org/
NLD Netherlands Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
NOR Norway National Statistics Office http://statbank.ssb.no/statistikkbanken/Default_FR.asp?PXSid=0&nvl=true&PLanguage=1&tilside=selecttable/hovedtabellHjem.asp&KortnavnWeb=utniv
NPL Nepal Education Policy and Data Center (EPDC) http://epdc.org/
NZL New Zealand National Statistics Office http://wdmzpub01.stats.govt.nz/wds/ReportFolders/reportFolders.aspx
PAK Pakistan Education Policy and Data Center (EPDC) http://epdc.org/
PAN Panama Education Policy and Data Center (EPDC) http://epdc.org/
PER Peru Education Policy and Data Center (EPDC) http://epdc.org/
PHL Philippines Education Policy and Data Center (EPDC) http://epdc.org/
POL Poland Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
PRT Portugal Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
PRY Paraguay National Statistics Office http://celade.cepal.org/cgibin/RpWebEngine.exe/EasyCross?&BASE=CPVPRY2002&ITEM=INDICADO&MAIN=WebServerMain.inl
ROM Romania Eurostat http://epp.eurostat.ec.europa.eu/NavTree_prod/everybody/BulkDownloadListing
RUS Russian Federation National Statistics Officehttp://74.125.65.132/translate_c?hl=en&ie=UTF‐8&sl=ru&tl=en&u=http://www.perepis2002.ru/index.html%3Fid%3D15&prev=_t&usg=ALkJrhiZr6thPp3doxH9mXdDZgf‐DA1fyw
SEN Senegal Education Policy and Data Center (EPDC) http://epdc.org/
SLV El Salvador VI Censo de la Poblacion y V de Vivienda 2007 http://www.digestyc.gob.sv/cgibin/RpWebEngine.exe/Crosstabs
SRB Serbia National Statistics Office http://webrzs.statserb.sr.gov.yu/axd/en/Zip/CensusBook4.zip
SVK Slovak Republic National Statistics Office http://px‐web.statistics.sk/PXWebSlovak/DATABASE/En/02EmploMarket/01EconPopActiv/EA_total.px
SVN Slovenia National Statistics Office http://www.stat.si/pxweb/Database/Census2002/Administrative%20units/Population/Education/Education.asp
SWE Sweden National Statistics Office
http://www.ssd.scb.se/databaser/makro/SubTable.asp?yp=tansss&xu=C9233001&omradekod=UF&huvudtabell=Utbildning&omradetext=Education+and+research&tabelltext=Population+16‐74+years+of+age+by+highest+level+of+education,+age+and+sex.+Year&preskat=O&prodid=UF0506&starttid=1985&stopptid=2007&Fromwhere=M =2&langdb=2
SWZ Swaziland Education Policy and Data Center (EPDC) http://epdc.org/
SYR Syrian Arab Republic Education Policy and Data Center (EPDC) http://epdc.org/
THA Thailand Education Policy and Data Center (EPDC) http://epdc.org/
TUR Turkey National Statistics Office http://www.tuik.gov.tr/isgucueng/Kurumsal.do
TZA Tanzania Education Policy and Data Center (EPDC) http://epdc.org/
UGA Uganda Education Policy and Data Center (EPDC) http://epdc.org/
UKR Ukraine National Statistics Office http://stat6.stat.lviv.ua/PXWEB2007/Database/POPULATION/1/06/06.asp
URY Uruguay National Statistics Office http://www.ine.gub.uy/microdatos/engih2006/persona.zip,USA United States National Statistics Office http://factfinder.census.gov/servlet/DatasetMainPageServlet?_program=ACS&_submenuId=&_lang=en&_ts=
UZB Uzbekistan Education Policy and Data Center (EPDC) http://epdc.org/
VEN Venezuela Integrated Public Use Microdata Series International (IPUMS) https://international.ipums.org/international/
VNM Vietnam Education Policy and Data Center (EPDC) http://epdc.org/
ZAF South Africa National Statistics Office
http://www.statssa.gov.za/timeseriesdata/pxweb2006/Dialog/varval.asp?ma=Highest%20level%20of%20education%20grouped%20by%20province&ti=Table:+Census+2001+by+province,+highest+level+of+education+grouped,++population+group+and+gender.+&path=../Database/South%20Africa/Population%20Census/Census%202001%20‐%20NEW%20Demarcation%20boundaries%20as%20at%209%20December%202005/Provincial%20level%20‐%20Persons/=1
ZAR Congo, Dem. Rep. Education Policy and Data Center (EPDC) http://epdc.org/
ZMB Zambia Education Policy and Data Center (EPDC) http://epdc.org/
ZWE Zimbabwe Education Policy and Data Center (EPDC) http://epdc.org/