An Estimate of the Determinants of Migration Costs
Using Mexican-US Microdata
Tamara Bogatzki and Steffen Sirries∗
July 27, 2016
Abstract
In this paper we propose a new estimate of the determinants of migration costs using microdata on spatial wage gaps of Mexican migrants to the U.S. Leaning on a strand of the trade literature that estimates trade costs using price differentials across locations, we proxy migration costs with the wage differential of identical workers between locations. To avoid the problem of selection on wage-influencing unobservables in comparisons of observably identical workers, we contrast earnings before and during migration of the very same individual in a panel-like structure. We find that distance remains a deterrent to migration, even in times of cheaper communication and travel, while an English-speaking Mexican experiences significantly lower costs of relocation than her non-fluent compatriot.
JEL-Codes: F22, F66, J61, J31.
Keywords: Migration Costs, International Labor Mobility, Wage Differentials.
∗Bogatzki: University of Vienna and University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany. E-mail: [email protected]. Sirries: University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany. E-mail: steff[email protected]. We thank participants at the Göttinger Workshop "Internationale Wirtschaftsbeziehungen" 2016 for helpful comments.
1 Introduction
The aim of this paper is to provide a measure for the cost of migration as the
difference in wages representing violations of the Law of One Price (LOOP).
LOOP states that prices for the very same good should be equal across space.
Given a no-arbitrage condition, spatial price gaps for homogeneous goods are
therefore supposed to reflect transaction costs only. Yet the theory's
assumptions pose severe challenges, which the literature on international
trade, having adopted the concept, has met by collecting better data for more
precise estimates. The main drawbacks have been a lack of knowledge of the exact
origin and destination points of sale of a given product, uncertainty about
whether the goods whose prices are observed in different places are truly
identical, and potential mark-up pricing (see Atkin and Donaldson, 2015). While
there has been extensive research on the convergence of average wages across
countries, the former two conditions of the law of one price in particular have
so far been difficult to transfer to wage data, which might explain why the term
"Law of One Wage" (LOW) has not yet become a household term. Following the work of Engel
and Rogers (1996) and Atkin and Donaldson (2015) who gauge trade costs,
we use microdata on migrants from 150 Mexican communities for which wage
data for both Mexico and the United States (U.S.) are available to estimate
the cost of migration from origin to destination as a function of potential cost
shifters such as distance, language skills and owning property in Mexico. We
hope that our work will not only yield insights on the relative size of the im-
pact of each of the regressors, but provide a measure of migration costs that
incorporates both physical and psychological factors. In the remainder of the
paper we will proceed as follows. In Section 2 we provide an overview of the
related literature. Section 3 introduces the theoretical model, while we discuss
the estimation strategy in Section 4. Section 5 contains a description of the
data, and in Section 6 we present the results. Section 7 concludes.
2 Related Literature
To our knowledge, this paper constitutes the first attempt to employ microdata
to identify wage differentials across countries as an estimate of migration
costs while at the same time solving the problem of selection on
unobservables by examining identical laborers. Our approach is heavily
influenced by the trade literature that employs LOOP to estimate trade costs
from prices, of which Allen and Arkolakis (2014) provide a comprehensive overview.
In the construction of our own model we will predominantly lean on the work
by Atkin and Donaldson (2015) and Engel and Rogers (1996), who estimate
the determinants of trade costs by looking at price differentials of homogeneous
goods.
Yet, spatial wage gaps have been explored with regard to migration before:
Hanson (2003) and Gandolfi, Halliday, and Robertson (2015) both examine
aggregate wage convergence due to both migration and trade liberalization
between Mexico and the United States. While Hanson (2003) finds little
evidence for wage convergence since the North American Free Trade Agreement
(NAFTA) in 1994 despite increased returns to skill in Mexico, Gandolfi,
Halliday, and Robertson (2015) detect some convergence across the two countries
but agree that a stable wage premium remains. These findings on aggregate
wage convergence are relevant to our own approach in two ways: First, they
support our assumption that spatial wage gaps persist due to institutional,
physical and psychological barriers to migration. Second, Gandolfi, Halliday,
and Robertson (2015) indicate structural breaks in the size of the wage gaps
over time between Mexico and the U.S. that we need to revisit, since our
survey data cover a wide range of years for which earnings
are recorded. Alongside investigations directed at wage convergence at the
macro level, wage differentials have been used for modeling optimal location
choice and the propensity to migrate (see for instance Kennan and Walker,
2011; Aguayo-Téllez and Martínez-Navarro, 2013; Ortega and Peri, 2013).
All of these models have in common that higher wages at destinations
are understood as the primary incentive to migrate and as compensation for both
physical and psychological expenses of migration. Kennan and Walker (2011)
find that interstate migration is substantially influenced by income prospects
driven by geographic differences in mean wages. Although our identification
strategy is based on the trade literature introduced above, it also follows a
widespread tradition in migration theory. We are unable to observe a person
working in two locations at the same time. The usual strategy to solve this
problem is to match foreign-born workers in the destination country with either
observably identical workers who stayed (Clemens, Montenegro, and Pritchett,
2008) or observably identical workers from the destination country (Gandolfi,
Halliday, and Robertson, 2015). In both cases one would compare migrants
with non-migrants. It is likely, however, that migrants differ significantly from
non-migrants in factors that are unobserved by the econometrician. In this
case, we face a selection bias which results from unobservables that are
confounded with other observed factors that also influence income. Note that the
aim of this work is not to identify the factors driving selection into migration,
but to develop an estimation strategy for migration costs that avoids selection
bias. It is apparent that migrants differ on many levels from non-migrants
and that migrants are distinct across locations, too. For example,
Aguayo-Téllez and Martínez-Navarro (2013) find that single adult male Mexicans with
low schooling levels tend to migrate to the U.S. whilst married Mexicans of
both genders with higher education prefer internal migration. The relevant
difficulty here is that migrants may differ from non-migrants in
ways that are unobservable to the researcher and at the same time influence
wage determination, which makes it impossible to base inference on the
comparison of wages of different individuals in general. Executing these
comparisons on groups of workers that differ in obvious ways, such as one having
had the courage to migrate while the other stayed, appears precarious. The
literature has recognized this kind of selection and finds that not controlling
for it leads to serious bias. Researchers have come up with various econometric
strategies to confront this problem. For instance, Clemens, Montenegro, and
Pritchett (2008) propose a selection model to estimate how wage gains depend
on a worker's position in the distribution of unobserved wage determinants.
Ortega and Peri (2013) allow for unobserved heterogeneity between migrants
and non-migrants at destination. These strategies have been invoked mainly
because of data shortages, and they are unable to reliably control for the bias
due to unobservables. The problem of identifying the very same good
across places is also well-known in the trade literature we refer to. Atkin and
Donaldson (2015) contribute substantial data work by turning to products at
the barcode-equivalent level. In the same way, our approach of observing
identical workers avoids the selection bias entirely. Akee (2010) uses a very similar
strategy to isolate the wage difference between migrants and non-migrants
that is actually due to unobservables rather than due to the migration
decision. He compares domestic wages of individuals from the Federated States of
Micronesia who he knows will migrate to the U.S. in the following period with
domestic wages of observably identical individuals who will not migrate and
finds a positive and significant difference, which stresses the importance of our
data work.
In the next section we will outline the theoretical framework that links our
model to the trade literature.
3 Theoretical Framework
In this section we introduce the theoretical framework. In subsection 3.1
we construct the model for estimating migration costs using wages, based on
Jevons' Law for factor prices and on examples from the trade literature that use
price differentials between locations to identify transportation costs. We
address the process underlying individual wage determination in subsection 3.2
and turn to the resulting endogeneity problem that is deeply intertwined with
self-selection bias in subsection 4.2. In subsection 3.3 we take up theoretical
considerations on the determinants of migration costs.
3.1 "The Law of One Wage" and the Place Premium
LOOP contends that for homogeneous goods prices should be equal in different
locations, as otherwise arbitrageurs have an incentive to buy a good for the cheaper
price in one place and sell it for riskless profit in another place where prices
are higher. Arbitrage in turn will increase the supply of the good in the place
with the initially higher price and thereby result in a convergence towards
the cheaper price (Jevons, 1871). From an international trade economist's
perspective, what keeps prices from equalizing due to arbitrage are the
transaction costs of trading the good. Transferring the trade model to migration, we
presuppose that wages equal the marginal product of labor (MPL) and adopt
the assumption of iceberg migration costs¹ following for example the work by
Aguayo-Téllez and Martínez-Navarro (2013) and Ortega and Peri (2014). We
arrive at the following no-arbitrage condition:
$$ W_i^{jk} \cdot \delta_i^{jl} = W_i^{jl} \cdot \delta_i^{jk}, \tag{1} $$
where $W_i^{jk}$ is the wage of a worker $i$ from location $j$ in location $k$, $W_i^{jl}$ are the
earnings of this very same worker in location $l$, with $j, k, l \in S$, the set of
locations, and $i \in N$, the set of individuals. $\delta_i^{jk}$ depicts worker $i$'s costs
of migrating from $j$ to $k$, and $\delta_i^{jl}$ correspondingly those of migrating from $j$ to $l$.
¹ Iceberg trade costs were first introduced by Samuelson (1954). Transportation costs are modeled as an additional portion of the imported good and therefore have to be 1 in case the good is sold in the same place as it is produced because no portion of the good is prone to be lost on the way. In Appendix A we further consider additive migration costs.
The no-arbitrage condition holds only under well-defined presumptions,
as discussed by Jevons (1871) concerning goods in general. We will directly
refer to the factor labor instead, where violations of LOOP are most persistent
and difficult to identify (Persson, 2008). First, workers have to be completely
homogeneous, which is why for instance Atkin and Donaldson (2015) in their
estimation of trade costs turn to products at the barcode-equivalent level. As we
observe exactly the same workers in both the country of origin and the
destination country over a short time span, we come very close to this first condition.
Second, the Law of One Wage does not apply inter-temporally. This poses a
problem since we only observe the worker in both places at different times. We
try to mitigate this problem by adjusting wages for inflation on the one hand,
and on the other hand, by executing a robustness check that excludes time
spans exceeding one year between occupations. Furthermore, we assume that
income-affecting unobservables are stable over time and that marginal effects
of an additional year of observables on wages are negligibly small. Third, we
assume perfect information concerning wages and job conditions in the United
States. As the United States is very close to Mexico and constitutes the main
destination for Mexican migrants, and as many Mexicans take several
trips, it is very plausible that Mexican migrants are at least reasonably
well informed and average expectations are not systematically off. Fourth, we
presuppose rational agents aiming to maximize their utility, proxied by income
given migration costs, which complements the mainly economic incentives for
migration. Fifth, there is free entry to the market, which implies that Mexicans
are unrestricted in entering the U.S. workforce. This assumption is
questionable, but its violation can be considered part of the migration costs
in the form of an institutional barrier. Sixth and last, we assume that there are no
variable mark-ups, so that wages reflect a worker's marginal product.
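The second condition above combines an inflation adjustment with a restriction on the time between the two jobs. A minimal sketch of both steps follows. It assumes a single hypothetical price index and wages already expressed in a common currency; the class `WagePair`, the function names and all numbers are illustrative and are not taken from the paper or its data.

```python
import math
from dataclasses import dataclass


@dataclass
class WagePair:
    """One worker's two wage observations (hypothetical record layout)."""
    wage_mx: float  # nominal wage in the last job in Mexico
    year_mx: int
    wage_us: float  # nominal wage during the last U.S. migration
    year_us: int


def log_real_wage_gap(pair, price_index, base_year):
    """ln(real U.S. wage) - ln(real Mexican wage), both in base-year prices.

    Assumes one price index applies to both wages, which is a simplification.
    """
    real_mx = pair.wage_mx * price_index[base_year] / price_index[pair.year_mx]
    real_us = pair.wage_us * price_index[base_year] / price_index[pair.year_us]
    return math.log(real_us) - math.log(real_mx)


def within_one_year(pair):
    """True if the two jobs are at most one year apart (robustness sample)."""
    return abs(pair.year_us - pair.year_mx) <= 1


# Illustrative values only
price_index = {2000: 100.0, 2001: 110.0}
pair = WagePair(wage_mx=2.0, year_mx=2000, wage_us=10.0, year_us=2001)
gap = log_real_wage_gap(pair, price_index, base_year=2001)
keep_in_robustness_sample = within_one_year(pair)
```

The log gap computed this way is the quantity that the model developed below interprets as the log of migration costs.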
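Returning to our model, rearranging yields
$$ \frac{W_i^{jk}}{W_i^{jl}} = \frac{\delta_i^{jk}}{\delta_i^{jl}}. \tag{2} $$
We replace $l$ by $j$ for workers earning wages in their home country, which yields
$$ \frac{W_i^{jk}}{W_i^{jj}} = \frac{\delta_i^{jk}}{\delta_i^{jj}}. \tag{3} $$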
Laborers are paid their marginal products which are proportional to migration
costs δ. We model migration costs as a multiplicative factor of the wage
received in the home country, so that for a worker who stays, and who does
not incur any costs of moving to another location, migration costs are 1. In
contrast, it is plausible to expect costs greater than one for migrants, and an
accordingly higher wage at destination. Hence, equation (3) becomes
$$ \frac{W_i^{jk}}{W_i^{jj}} = \delta_i^{jk}. \tag{4} $$
Rearranging and replacing $j$ with $MX$ for Mexico, the country of origin observed
in the data, and $k$ with $US$ for the United States of America as the only
destination to be considered leads to
$$ W_i^{MX,US} = \delta_i^{MX,US} \cdot W_i^{MX,MX} \tag{5} $$
and
$$ \frac{W_i^{MX,US}}{W_i^{MX,MX}} = \delta_i^{MX,US}. \tag{6} $$
The MPL at destination is the product of the MPL at origin and the costs of
migrating, represented by $\delta_i^{MX,US}$. The migration costs depend on a vector of
potential cost-shifters $\mathbf{x}_i^{MX,US}$. As migration costs are assumed to be positive,
they are specified as an exponential function:
$$ \delta_i^{MX,US} = \exp\left(\beta_0 + \mathbf{x}_i^{\prime\, MX,US} \beta_1 + \varepsilon_i^{MX,US}\right). \tag{7} $$
Besides the aforementioned $\mathbf{x}_i^{MX,US}$, $\beta_0$ denotes a constant and $\varepsilon_i^{MX,US}$ an error
term with expectation 0 that is uncorrelated with the regressors. The vector
of parameters of interest corresponding to the variables included in $\mathbf{x}_i^{MX,US}$ is
$\beta_1$. Taking logs, we arrive at the final model
$$ \ln W_i^{MX,US} - \ln W_i^{MX,MX} = \beta_0 + \mathbf{x}_i^{\prime\, MX,US} \beta_1 + \varepsilon_i^{MX,US}. \tag{8} $$
We will refer to the place premium $\delta_i^{MX,US}(\mathbf{x}_i^{MX,US})$ as migration costs for the
remainder of this work. Our aim is to estimate the extent to which migration
costs can be explained by some particular $x_i^{MX,US}$ included in $\mathbf{x}_i^{MX,US}$, in
other words to recover the marginal effects $\partial \delta_i^{MX,US} / \partial x_i^{MX,US}$ of the variables.
We discuss determinants of migration costs in detail in subsection 3.3; the
regressors are described in section 4.
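As a reading aid for these marginal effects, the following lines work them out from the exponential specification in equation (7). This is a sketch rather than part of the original derivation; the index $m$ for a single element of the cost-shifter vector and the symbol $d$ for a dummy regressor are notation introduced here for illustration only.

```latex
% m indexes a single cost shifter; d denotes a dummy regressor (notation assumed here)
\[
\frac{\partial \delta_i^{MX,US}}{\partial x_{i,m}^{MX,US}} = \beta_{1,m}\,\delta_i^{MX,US},
\qquad
\frac{\partial \ln \delta_i^{MX,US}}{\partial x_{i,m}^{MX,US}} = \beta_{1,m}
\]
% Switching a dummy d from 0 to 1, all else equal, scales migration costs by
\[
\frac{\delta_i^{MX,US}\,\big|_{d=1}}{\delta_i^{MX,US}\,\big|_{d=0}} = \exp(\beta_{1,d})
\]
```

Under this reading, a hypothetical dummy coefficient of -0.1 would correspond to migration costs that are lower by a factor of exp(-0.1) ≈ 0.905, and the coefficient on a regressor entered in logs can be read as the elasticity of migration costs with respect to that regressor.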
3.2 Wage Determination
Mexicans migrate to the United States for predominantly economic reasons.
Despite working for comparably low wages in the United States, migrant
worker wages still exceed possible earnings in Mexico which suggests that po-
tential wage gaps proxy the decisive compensation for the expenses of migra-
tion. Following Mincer (1958), the wage of an individual worker i in place k
is determined by
$$ \ln(W_{ik}) = \alpha_0 + \mathbf{z}_i' \alpha_1 + \eta_i + \nu_{ik} \tag{9} $$
where $\mathbf{z}_i$ constitutes a vector of observable individual characteristics like
education, sex and working experience, the coefficients of which are given by the
parameter vector $\alpha_1$. The term $\eta_i$ denotes unobservable traits such as
inherent ability and quality of schooling, some of which are correlated with
$\mathbf{z}_i$. Last, $\nu_{ik}$ is an idiosyncratic error term for location match with $E(\nu_{ik}) = 0$
and $\mathrm{cov}(\mathbf{z}_i, \nu_{ik}) = 0$; $\alpha_0$ is a constant. Each worker knows about all of her
individual characteristics and the expected value of $\nu_{ik}$.²
² Note that we assume that an individual's wage is determined by her personal plus location characteristics and not by the specific occupation. This has the advantage that we need not restrict attention to migrants who worked in the same occupation before and after migration. This simplifying assumption of the irrelevance of the specific occupation is particularly plausible given that the whole sample covers low-skilled workers. The portion of workers within the sample who maintain the same occupation within the U.S. is 9.18 percent. Also, there is no one-directional change of occupation classes.
3.3 Migration Costs
In the previous paragraphs we discussed how spatial wage gaps can proxy
migration costs and how the incorporated wages are determined for each
individual. Now we turn to the factors whose influence on the size of migration
costs we would like to investigate.
Despite the links that can be drawn to the trade literature, there are
important differences between trade costs and migration costs. Adam Smith
observed as early as 1776 that regional differences in commodity prices in Great
Britain were much smaller than differences in wages for homogeneous workers
between locations. Strangely though, despite greater opportunities for
arbitrage on the labor market, there was far more movement of goods, i.e. trade,
than movement of people, i.e. migration. It was evident to him that the costs
of migration needed to exceed the gains, which led him to the conclusion that
"a man is of all sorts of luggage the most difficult to be transported" (Smith,
1776). Indeed, migration faces barriers that are unknown to trade:
Even though there is no paradigm on the composition of migration costs,
the migration literature, if at all, commonly discriminates between what Clemens,
Montenegro, and Pritchett (2008) refer to as natural barriers and what we will
call institutional barriers. While the latter summarize restrictions on
international movement enacted by governments, natural barriers are more diverse:
They encompass both direct (or monetary) costs of relocation and indirect
(psychological) costs. Direct costs can be relocation expenses, the sacrifice
of pension rights in the home country (Bodvarsson, Simpson, and Sparber,
2015), and the costs of searching for a new job. The latter will be especially
relevant if the move aims at an entirely new geographic area. Indirect costs
of worker mobility cover the sorrow of leaving loved ones behind and the
cutting of community ties. They also include the burden of having to find
one's way around in new surroundings, which may explain part of the finding
that the average migrant is rather young (Ehrenberg and Smith, 2003). The
presence of immigrant networks at the destination, however, may decrease the
costs of migration, especially for low-skilled migrants (Beine, Docquier, and
Özden, 2011). Sharing the destination country's language or being proficient
in it may also contribute to lowering psychological costs (Beine, Bertoli, and
Fernández-Huertas Moraga, 2014). An obviously significant role is played by
the distance that has to be overcome. Distance can be both an indirect and
a direct natural barrier to migration. It determines the time and money that have
to be invested in transportation, not only for the move but also for visits
home. At the same time, distance enlarges the burden of being away from
family and friends and of acquiring information about the circumstances at
the destination.
All in all, it seems plausible to assume that voluntary migration only
occurs if its expected benefits are relatively large and cover the costs of
migration (Ehrenberg and Smith, 2003), which potentially exceed conventional trade
costs.
4 Estimation Strategy
Having put forward the theoretical framework, we now turn to our empirical
estimation strategy. In subsection 4.1 we discuss our regression model, which
we estimate using Ordinary Least Squares (OLS). We conclude by confronting
possible violations of the unbiasedness and consistency of our estimates in
subsection 4.2.
4.1 The Regression Model
We estimate variants of the following model:
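$$
\begin{aligned}
\ln W_i^{MX,US} - \ln W_i^{MX,MX} = {} & \beta_0 + \beta_1 \mathit{lndist}_i^{MX,US} + \beta_2 \mathit{englishspeaker}_i^{US} + \beta_3 \mathit{married}_i^{MX} \\
& + \beta_4 \mathit{famusexp}_i^{MX} + \beta_5 \mathit{property}_i^{MX} + \varepsilon_i^{MX,US}
\end{aligned}
\tag{10}
$$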
The dependent variable is migration costs, measured by the log of the wage
received during the last U.S. migration minus the log of the wage for the
last formal job in Mexico, for each individual $i \in n$, the set of all individuals
within the sample. $\beta_1$ denotes the coefficient of the log of the distance in
kilometers between origin and destination of the worker. We expect the sign
of $\beta_1$ to be positive, so that migration costs increase with distance. Following
the common gravity model and the related work by trade economists Engel and
Rogers (1996), we consider the log as we assume the relationship between
distance and migration costs to be concave. $\mathit{englishspeaker}_i^{US}$ is a dummy
for English proficiency that equals one if the head of household at least
understands some English and is also able to speak it, and zero if she is only able
to understand some English or even less and is unable to speak it. A certain level
of English speaking proficiency should lower migration costs. $\mathit{married}_i^{MX}$,
$\mathit{famusexp}_i^{MX}$, and $\mathit{property}_i^{MX}$ also denote dummy variables. They indicate
whether the person was married, whether her family had U.S. migration experience, and
whether she had any property before her last U.S. migration, respectively. Being
married in Mexico might pose an obstacle to migration as it indicates a strong
social network that might have to be left behind. Alternatively, taking a
spouse along imposes additional costs for housing, job search and the like.
Yet, taking a spouse along could also decrease migration costs because
it might be paired with an additional income and lower psychological costs.
Unfortunately, from the sample we cannot discriminate between migrating as
a couple and migrating alone. Though we expect a positive sign for $\beta_3$, there
appears to be a plausible explanation for a negative sign, too. Turning to the
remaining dummies, we anticipate that by having a family member with U.S.
migration experience a future migrant can profit from existing knowledge and
already built social networks abroad. Household heads for whom $\mathit{famusexp}_i^{MX}$
is one should therefore have lower migration costs than those who do not
have family with migration experience, that is, for whom the dummy is zero.
Finally, the direction of the effect of having property in Mexico on migration
costs is in our opinion as ambiguous as that of marital unions. Whilst a
migrant may experience high costs of leaving a property behind because of
psychological bonds or a certain financial standing that comes with it, this
very financial standing might reduce migration costs because it enables access
to more comfortable modes of transportation or job search. Also, having a
property to leave to relatives who stay behind, along with a return option,
equally points to a negative sign for slope parameter $\beta_5$.
The set of determinants of migration costs we investigate is limited to what
we referred to as natural barriers in subsection 3.3. One might be inclined to
incorporate policy indicators, but despite their appeal, regressors measuring
institutional barriers to migration may not be exogenous but depend on migration
flows and the economic conditions in either country (Ortega and Peri, 2014).
The issue of endogeneity, however, leads directly to the following subsection,
where we discuss the assumptions underlying our estimations.
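To make the estimation step concrete, the following sketch fits equation (10) by OLS on synthetic data. Everything in it is an assumption made for illustration: the sample size, the distributions of the regressors and the coefficient values in `beta_true` are invented, and the variable names merely mirror the regressors named above.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500  # synthetic sample size, not the paper's

# Hypothetical regressors mirroring equation (10)
lndist = np.log(rng.uniform(500.0, 4000.0, size=n))  # log distance in km
englishspeaker = rng.integers(0, 2, size=n)
married = rng.integers(0, 2, size=n)
famusexp = rng.integers(0, 2, size=n)
has_property = rng.integers(0, 2, size=n)

# Invented "true" coefficients (beta_0 ... beta_5), for illustration only
beta_true = np.array([0.5, 0.10, -0.20, 0.05, -0.10, -0.05])

# Design matrix with a constant in the first column
X = np.column_stack(
    [np.ones(n), lndist, englishspeaker, married, famusexp, has_property]
)

# Synthetic dependent variable: ln W^{MX,US} - ln W^{MX,MX}
log_wage_gap = X @ beta_true + rng.normal(0.0, 0.3, size=n)

# OLS estimate via least squares
beta_hat, *_ = np.linalg.lstsq(X, log_wage_gap, rcond=None)
residuals = log_wage_gap - X @ beta_hat
```

With real data, `X` and `log_wage_gap` would be built from the survey records instead of being simulated; the estimation line itself would stay the same.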
4.2 Ordinary Least Squares Assumptions
We estimate the regression model stated in subsection 4.1 using Ordinary
Least Squares. Under the assumptions of, first, linearity in parameters, second,
random sampling, third, no perfect collinearity and, fourth, zero conditional
mean, $E(\hat{\beta}_j) = \beta_j$, $j = 0, 1, \dots, k$, for any values of the population parameter
$\beta_j$, that is, the estimated parameters are unbiased and consistent (Wooldridge,
2008).
Regarding the first assumption, theory does not give us any clear guidance
on the functional form of the migration costs specification. We therefore lean
on the trade literature, where the standard specification is of the form $\exp(\mathbf{x}'\beta)$,
which results in a model that is linear in parameters after taking logs. The
third assumption can easily be tested by checking whether there is variation in the
variables. Standard statistical software will drop variables that are linear
combinations of other variables because otherwise there is no unique solution for the
parameter vector. Assumptions two and especially four are of particular
relevance for our work, as we will debate in the following passages.
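The no-perfect-collinearity condition can be checked directly on a design matrix. The sketch below uses a rank test; the helper name `has_perfect_collinearity` and the toy matrices are assumptions for illustration and do not come from the paper.

```python
import numpy as np


def has_perfect_collinearity(X):
    """True if some column of X is a linear combination of the others."""
    X = np.asarray(X, dtype=float)
    return np.linalg.matrix_rank(X) < X.shape[1]


# Toy example: a constant plus three mutually exclusive group dummies
const = np.ones(6)
d1 = np.array([1, 1, 0, 0, 0, 0])
d2 = np.array([0, 0, 1, 1, 0, 0])
d3 = np.array([0, 0, 0, 0, 1, 1])

X_trap = np.column_stack([const, d1, d2, d3])  # d1 + d2 + d3 equals the constant
X_ok = np.column_stack([const, d2, d3])        # one dummy omitted as reference

trap_is_collinear = has_perfect_collinearity(X_trap)
ok_is_collinear = has_perfect_collinearity(X_ok)
```

Omitting one group as the reference category, as in `X_ok`, is the usual way to restore a unique solution for the parameter vector.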
Divergences from Simple Random Sampling
Even though we will provide a detailed description of the data in section 5, we
now briefly present their underlying sampling process: With an initial focus
on Western Mexico, the Mexican Migration Project (MMP) today samples
communities from all over the country. In a simple random sample every
subject in the underlying population has equal probability of being sampled. In
practice, however, surveys diverge from simple random sampling in order to save
costs and render estimates for the subgroup of interest more precise (Cameron
and Trivedi, 2005). Accordingly, communities are not selected at random by
the MMP but on the basis of anthropological methods paying special attention
to migration flows, i.e. the data are not intended to be representative of the
entire Mexican population. Yet, the aim is not to select a community where
migration flows are particularly dense, but to select communities with any
occurrence of migration in areas of different levels of urbanization. Once a
city, town or village has been selected, the explicit survey place, which is what
the MMP calls the community and can be any kind of geographically distinct
part, is determined by the fieldwork supervisors. From these communities of at
least 1,200 listed dwellings, 200 households are randomly selected. The survey
workers intend to interview at least some migrants from the community who
settled in the U.S.
Since households within a community are sampled at random, the MMP
does not oversample specific population groups on purpose. Still, the survey
provides weights for communities and U.S. samples.³ Applying weights results
in data representative of the area constituted by all sampling frames together,
which is, as we mentioned earlier, not the whole of Mexico (MMP, 2015).
³ Sample weights are calculated as the inverse of the sampling fraction, which in the case of the Mexican communities is the number of interviewed households divided by the estimate of the eligible households in the sampling frame, i.e. all dwellings within a survey site. A detailed reconstruction of the calculation of the sample weights is beyond the scope of this work but can be retraced in the Appendices to the MMP data.
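The weight definition in the footnote amounts to a one-line calculation. The sketch below is only an illustration: the function name `sample_weight` is hypothetical, and the counts of 200 interviewed households and 1,200 dwellings are borrowed from the text as an example rather than taken from any specific community.

```python
def sample_weight(interviewed_households, eligible_households):
    """Inverse of the sampling fraction (interviewed / eligible)."""
    if interviewed_households <= 0 or eligible_households <= 0:
        raise ValueError("household counts must be positive")
    # 1 / (interviewed / eligible) simplifies to eligible / interviewed
    return eligible_households / interviewed_households


# Example: 200 interviewed households out of 1,200 eligible dwellings
weight = sample_weight(200, 1200)
```

In this example each interviewed household stands for six households in the sampling frame.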
Is there a need to employ weights then? If our aim is to describe and
make predictions of the underlying population behavior, yes. Our goal,
however, is the estimation of a causal model. In that case, divergences from simple
random sampling are only problematic if stratification is on the dependent
variable (Cameron and Trivedi, 2005). The MMP focuses on migrants and
diverges from random sampling on the community level. It does not, however,
intentionally select households by place premiums. A contrary example would
be if the aim was to model causal effects on income and one oversampled
people with small incomes. Divergence from random sampling therefore does
not bias our estimates. Yet, it is likely that, as interviewed
households are clustered in small geographical areas, i.e. communities, errors are
no longer independently distributed between these observations and standard
errors may be underestimated. As long as within-cluster unobservables are
uncorrelated with the regressors, only the variances of the regression parameters
need to be adjusted. We thus cluster standard errors by community. We now
turn to potential problems with endogeneity.
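Clustering standard errors by community can be sketched with the standard sandwich formula. The code below is a minimal illustration, not the authors' implementation: it applies no small-sample correction, the function name `ols_cluster_se` is hypothetical, and the data are simulated with an invented community-level shock.

```python
import numpy as np


def ols_cluster_se(X, y, clusters):
    """OLS coefficients with cluster-robust (sandwich) standard errors.

    No small-sample correction is applied; this is an illustrative sketch.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    bread = np.linalg.inv(X.T @ X)
    beta = bread @ X.T @ y
    resid = y - X @ beta
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        mask = clusters == g
        score = X[mask].T @ resid[mask]  # sum of x_i * u_i within cluster g
        meat += np.outer(score, score)
    cov = bread @ meat @ bread
    return beta, np.sqrt(np.diag(cov))


# Simulated data: 40 hypothetical communities with 25 households each
rng = np.random.default_rng(1)
n_comm, per_comm = 40, 25
community = np.repeat(np.arange(n_comm), per_comm)
lndist = rng.normal(7.5, 0.4, n_comm)[community]   # varies by community only
shock = rng.normal(0.0, 0.3, n_comm)[community]    # common community shock
y = 0.5 + 0.1 * lndist + shock + rng.normal(0.0, 0.3, n_comm * per_comm)
X = np.column_stack([np.ones_like(lndist), lndist])

beta_hat, se_cluster = ols_cluster_se(X, y, community)
```

When every observation forms its own cluster, the same function reduces to the heteroskedasticity-robust (White) standard errors, which is a convenient sanity check.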
Tackling Endogeneity: Selection-Bias Through Unobservables and
Time Trends
As mentioned earlier, a major difficulty in the empirical implementation of the
theoretical framework presented in section 3 lies in the identification of
the dependent variable. We take migration costs to be identified by the
spatial wage gap, given that the no-arbitrage condition posed by LOW holds.
Remember, though, that it is impossible to observe the very same worker in several
locations at a time. We believe that there are two approaches to meet this
empirical caveat:
First, one may want to compare observably identical workers in the two
locations, for instance from different cross-sections of the same year. Recall from
section 2 that employing various matching algorithms has been a widely used
practice among researchers focusing on place premiums, although it is accompanied by
an acknowledged drawback. Recall the wage determination equation (9) from
subsection 3.2. We assume that workers are paid their marginal products and
variations in earnings are governed by actual differences between individuals. These
differences can be both observed by the statistician and unobserved. Examples
of unobserved characteristics could be personality, motivation, ability or the
quality of schooling. Firms recognize these differences in productivity among
individuals and adjust wages accordingly, even though they remain in the dark
for the observer (Johnson, 1977). A matching based on observables only will
therefore be faulty, and wage differentials will not only reflect the desired place
premium under investigation, but also differences in productivity through
unobservables between individuals. For equation (9) this means that $\eta_i$, i.e.
income-influencing factors that cannot be observed, and the stochastic error
term $\nu_{ik}$ are mingled together even though they need to be kept conceptually
apart. While the latter is unknown to the person, she will have a comprehensive
idea of $\eta_i$. Again, variance in earnings of people with the same observables
$\mathbf{z}_i$ may originate from either $\eta_i$ or $\nu_{ik}$.
The second course of action, the one that we pursue, is to contrast wages
of the same workers in the two locations recorded for the years before and
after migration. In consequence, there will be no selection bias that may arise
from migrants systematically diverging in unobservable wage determinants
from non-migrants, as would be the case for matching on observables. Yet,
we will need to bite the bullet and assume that the portion of the wage gap
due to an additional life year is negligibly small. Note that it may in fact well
be the case that the time difference is even smaller, as the length of the
time period is preset by the survey and labor force information is ordered in years
(see 5.1 for more information). Also, even single cross-sections are in reality
not administered at one point in time, so that we expect potential variation
from the panel-like structure we deal with to be at a similar and insignificant
level. Unfortunately, for some individuals the difference in years available from
the data exceeds one year. We hence check the robustness of our results
15
by restricting the sample to observations with only one year in between the
two observations.
A more serious caveat is the fact that year-pairs for individuals extend
over a long time horizon, namely from 1959 to 2012. Over more than 50 years
it is likely that, through labor market integration and changes in migration
policy, there is a time trend conflated with the size of the wage gap across
individuals. In addition, the size of the marginal effects of the regressors in
equation (10) may vary over time. Plotting wage differentials against year of
last U.S. migration indicates an increase of migration costs over time until the
1990s. From 1990 onwards the place premium remains at a comparably stable
level, which may plausibly be explained by the introduction of NAFTA in 1994,
the most recent change in migration policy and trade liberalization. The time
trend is illustrated in figures 1 and 2. We address the endogeneity problem
in various ways: First, we restrict the sample to years after 1994. Second, to
avoid the loss of observations and get hold of the time effect, we add further
controls. As the sample is small, we refrain from including dummies for each
year or even year-pair and resort to dummies for the following periods: 1959 to
1989, 1990 to 1999, and 2000 to 2012. Third, we interact these period dummies
with the regressor for distance to capture time-variant effects.⁴
Estimates are reported in section 6. In the following section we turn to the
underlying microdata.
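The period controls described above can be sketched as follows. This is only an illustration under stated assumptions: the variable names (`year_last_usmig`, `lndist`) and the dummy labels are hypothetical and not taken from the MMP150 files; only the three period boundaries come from the text.

```python
# Period boundaries as stated in the text; the labels are hypothetical.
PERIODS = [
    ("p1959_1989", 1959, 1989),
    ("p1990_1999", 1990, 1999),
    ("p2000_2012", 2000, 2012),
]


def period_dummies(year):
    """Return 0/1 indicators marking the period that contains `year`."""
    return {name: int(lo <= year <= hi) for name, lo, hi in PERIODS}


def add_period_terms(row):
    """Add period dummies and their interactions with log distance.

    `row` is a dict with the (hypothetical) keys 'year_last_usmig'
    and 'lndist'. The input dict is not modified.
    """
    out = dict(row)
    for name, flag in period_dummies(row["year_last_usmig"]).items():
        out[name] = flag
        # Interaction term: lets the distance effect differ by period.
        out[name + "_x_lndist"] = flag * row["lndist"]
    return out


example = add_period_terms({"year_last_usmig": 1996, "lndist": 7.9})
```

In an actual regression one of the three dummies (and its interaction) would be omitted as the reference period to avoid perfect collinearity with the constant.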
5 Data
The subsequent paragraphs deal with the construction of the sample that we
will refer to as MIG_final and that constitutes the basis for estimation of
the empirical model presented in section 4. Subsection 5.1 introduces the
underlying micro-database and pays special attention to the identification of
the last domestic wage received in Mexico, which proves essential for calculating
the dependent variable. Furthermore, in subsection 5.2 we explain the role of
supplementary data for making wages comparable across time and countries
and for generating explanatory variables. We complete the section with a
descriptive summary of MIG_final in subsection 5.3.

⁴ Consequently, one would proceed by interacting the remaining regressors as well.
For the scope of this paper, we focus on the distance effect.

Figure 1: Migration Costs Across Time, Entire Sample
Figure 2: Migration Costs Across Time After NAFTA
5.1 The MMP150
The theoretical model introduced in section 3 demands wage comparisons for
workers as homogeneous as possible. The ideal micro-data would therefore
cover wage rates of the very same migrant in Mexico directly before her last
trip to the U.S. and after her arrival at the destination. In other words, we intend
to construct a panel including wage, labor force and migratory information for
the period directly before and during the last U.S. migration for each worker.
The sample we access is selected from a pooled cross-section of migrants based
on the MMP150,⁵ survey data collected in 150 Mexican communities, released
in April 2015 and freely available at mmp.opr.princeton.edu.⁶ The Mexican
Migration Project is a collaborative research project based at Princeton
University and the University of Guadalajara. The database is constructed by
randomly sampling households in communities located throughout Mexico. As
the project focuses on gathering migration information, interviews take place
during the winter months when seasonal migrants tend to return home. The
interviewers collect social, demographic, and economic information on each
household and its members, including general information on each person's
first and last trip to the U.S. In addition, a year-by-year labor history including
migration information is compiled for household heads and spouses. If the
household head is a migrant, further questions concerning the last migration
experience in the U.S. are asked, focusing on employment, earnings, and use
of U.S. social services. Following completion of the Mexican surveys, interviewers
travel to destination areas in the United States to administer identical
questionnaires to migrants from the same communities sampled in Mexico who
have settled north of the border and no longer return home. These surveys
are combined with those conducted in Mexico to generate a representative
binational sample (MMP, 2015). Despite their overall impressive sizes, the way
the MMP databases are constructed significantly reduces the extent of the sample
eligible for our purposes. In the following subsections we will give a detailed
description of the two data files employed for the construction of our
dependent variable, the individual-level wage differential between Mexico and
the U.S., and the construction process itself.

⁵ MMP (2015).
⁶ The corresponding survey questionnaire can be downloaded from the same address.
5.1.1 MIG150
The MIG150 person-level file lists detailed information about the 8,052 household
heads with migration experience to the U.S. among all persons surveyed. Note,
though, that the investigated population is not intended to be representative
of the whole of Mexico and is therefore also not suitable for detecting the
nature of self-selection of migrants by comparison with non-migrants. For
investigating the causal impact of distinct factors on the extent of migration
costs, that is, the required compensation, we focus on migrants who actually
incurred these expenses. MIG150 contributes a number of variables of interest
for our regression analysis. Besides documenting background information on
the household heads at the time of the survey, on which we mainly rely for unique
identification, MIG150 contains information on the last migration to the U.S.
Many migrants within the survey look back on several migration trips, which
calls for more sophisticated, dynamic models. Given the data at hand, however,
it is only possible to identify Mexican wages for the last occupation before
the most recent U.S. migration. We tackle the identification process in more
detail in subsection 5.1.3. For now, note that even though MIG150 also captures
information on the first trip to the U.S. and many individuals are return
migrants, we focus on the last U.S. migration and on household heads who have not
returned from it at the time of the survey, which shrinks the sample tremendously.
Regarding the last U.S. migration, we make use of the reported wage
and employment characteristics of the migrant household head for both the
last formal job in Mexico before the trip and the job during the trip. Furthermore,
we are interested in the level of English proficiency, participation in sports
or social activities at the destination as indicators for having built new social
networks, additional financial characteristics that may constitute benefits
only possible through migration, like average monthly remittances and savings,
and whether the migrant received any social welfare payments. Note that the
MIG150 file is not informative about individual circumstances before the last
U.S. migration that potentially influence the migration decision. Fortunately,
the information in question can be recovered from LIFE.
5.1.2 LIFE
LIFE is an event-history file for each household head from the year of birth
until the survey year, which we reduce to those with migration experience to
the U.S. From these we keep only the last person-year available before the
year of the last U.S. migration. Alongside identifier information, LIFE is built
using various time-specific variables. For the isolated year before the last U.S.
migration, we therefore take from LIFE information on participation in the
labor force (there are no wage indicators, though), marital unions, family
composition, U.S. migration experience among family of origin, and property and
business holdings in Mexico. Even though LIFE itself does not contain
time-specific wage information, the file is essential for integrating the last
domestic wage values from MIG150, as we will explain in the following subsection.
5.1.3 Identifying the last domestic wage
The MIG150 data captures detailed information on the migratory experience
of all household heads in the survey who ever migrated to the U.S. While the
file includes the last domestic wage and the corresponding unit, two problems
arise. First, many individuals in the sample are return migrants. In this case "last
domestic wage" may either equal the wage currently received in Mexico or
reflect the last wage received in Mexico before retirement or layoff and is, most
importantly, not a wage received before migrating to the U.S. We therefore
kept only household heads who are reported to still reside in the U.S. Second,
while the year of the last U.S. migration is known, from MIG150 one can
infer neither the year of the last domestic wage nor the occupation worked in
or the job place. We thus resorted to the LIFE file. Recall that LIFE compiles
a detailed labor history of a total of 380,302 person-years that correspond to
persons with U.S. migration experience. Note, however, that LIFE does not
encompass time-dependent wage data. To identify the last domestic wage
from the labor history for the remaining household heads in MIG150 who were
residing in the U.S. at the time of the survey, we isolated the latest person-year
available before the year of the last U.S. migration for each individual.
Then we merged the information connected to the remaining person-years with
the household heads left in MIG150 using unique identifiers we constructed
from time-constant background information available in both files. The newly
constructed MIG150 sample now comprises household heads who migrated to
the U.S. and are still there and, for each of them, their last domestic wage, wage
units, wage on last U.S. trip, years, places and codes for both occupations, and
additional information on personal life and migratory experience.
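The selection and merge steps just described can be sketched in a few lines. This is a minimal illustration, not the authors' code: all field names (`person_id`, `in_us_at_survey`, `year_last_usmig`, `occupation`, `state`) are hypothetical stand-ins for the MIG150 and LIFE variables, and the toy records are invented.

```python
def latest_person_year_before(life_rows, cutoff_year_by_person):
    """For each person, keep the latest LIFE person-year strictly before
    the year of the last U.S. migration."""
    latest = {}
    for row in life_rows:
        pid = row["person_id"]
        cutoff = cutoff_year_by_person.get(pid)
        if cutoff is None or row["year"] >= cutoff:
            continue  # person not retained, or year not before migration
        if pid not in latest or row["year"] > latest[pid]["year"]:
            latest[pid] = row
    return latest


def build_sample(mig_rows, life_rows):
    """Keep heads still in the U.S. and attach their last pre-migration year."""
    stayers = [m for m in mig_rows if m["in_us_at_survey"]]
    cutoffs = {m["person_id"]: m["year_last_usmig"] for m in stayers}
    latest = latest_person_year_before(life_rows, cutoffs)
    merged = []
    for m in stayers:
        life = latest.get(m["person_id"])
        if life is None:
            continue  # no usable labor-history year for this person
        merged.append({**m,
                       "year_last_dom_job": life["year"],
                       "occupation_mx": life["occupation"],
                       "state_mx": life["state"]})
    return merged


# Invented toy records, for illustration only.
mig = [
    {"person_id": 1, "in_us_at_survey": True, "year_last_usmig": 1998},
    {"person_id": 2, "in_us_at_survey": False, "year_last_usmig": 1990},
]
life = [
    {"person_id": 1, "year": 1996, "occupation": "agriculture", "state": "Jalisco"},
    {"person_id": 1, "year": 1997, "occupation": "construction", "state": "Jalisco"},
    {"person_id": 1, "year": 1998, "occupation": "services", "state": "California"},
    {"person_id": 2, "year": 1989, "occupation": "agriculture", "state": "Zacatecas"},
]
sample = build_sample(mig, life)
```

Person 2 is dropped as a return migrant, and for person 1 the 1997 person-year is attached because 1998 is already the migration year.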
5.2 Supplementary Data
Driving Distance and Traveltime
Once job places for each individual in both countries were available, we manually
added two measures for the distance between locations using maps.google.de.⁷
We measure the distance in kilometers for the shortest driving connection and,
to check the robustness of results, the travel time in hours without traffic. Still,
the precision of the available job data differs across individuals: for a substantial
number of individuals, only the states but not the precise cities of occupation
are available. The lack of distinct location data is especially severe for job
places in Mexico, where about 80 percent only reported states and not cities,
while this is true for only a fifth of U.S. locations. We considered different
options to proxy the unknown job places: first, the geographical middle of a
state (which does not work well for coastal states); second, the city with the highest
population (which may not be stable across person-years of interest); and third,
the state capital. As the first two options do not exhibit clear advantages in
comparison to the capital, and the second option might even often coincide with it,
we take the capital of the job state as a proxy for the missing job place.

⁷ Google Maps (2015).
Making Wages Comparable Across Time and Countries
As differences in wages are likely to be driven by varying price levels not only
between Mexico and the United States, but also across years before and after
migration for each individual and the cross-section as a whole, we adjust wages
for inflation and purchasing power. To make wages comparable across time and
countries we transformed nominal wages to real wages in 2010 U.S. dollars via
two separate procedures following Gandolfi, Halliday, and Robertson (2015).
For the first approach we converted Mexican pesos to U.S. dollars using the
nominal exchange rate of the year of the last domestic wage. We retrieved
historical nominal exchange rates from different sources: We merged the indicator
from a longitudinal supplementary file of the MMP150 called NATLYEAR for
the years 1965 to 2012.⁸ NATLYEAR itself is built using external sources such
as the Mexican census and statistical yearbooks from the U.S. Department of
Homeland Security. For the years 1959 and 1961, which are not covered by
NATLYEAR, we added the missing exchange rate manually. This is possible
because from April 19, 1954, to August 31, 1975, Mexico had a fixed peso-dollar
exchange rate of 0.0125 (in NATLYEAR rounded to 0.013) pesos per
dollar (Banco de Mexico). Subsequently, we deflated all nominal dollar values,
including wages during last U.S. migration, to 2010 dollars using the national
Consumer Price Index (CPI) for urban wage earners and clerical workers from
the U.S. Bureau of Labor Statistics.⁹ However, the mentioned U.S. CPI values
are stated for the base year 1967 and needed to be rebased to 2010. Moreover,
as CPIs are reported on a monthly basis, we calculated annual averages for each
year before deflating.

⁸ In the Codebook for NATLYR it is reported that the exchange rate is given in dollars
per peso. Comparing the exchange rates with the tables available via the Banco de Mexico,
which report the exchange rate in pesos per dollar, and looking at the resulting values for
wages in U.S. dollars, there appears to be a mistake in the Codebook. In fact, the exchange
rates in NATLYR are given in pesos per dollar.
The second method yields very similar values and proceeds as follows: we
first deflated nominal pesos using historical CPI data for Mexico available
from the Instituto Nacional de Estadística y Geografía (INEGI)¹⁰ (we again
calculated annual averages) and nominal dollars using the annual average U.S.
CPIs with base year 2010 calculated before. We then converted the 2010
pesos to 2010 dollars by applying the 2010 nominal exchange rate. While
we prefer the second procedure because it better accounts for differing consumer
baskets depending on the country of residence, we take the first procedure as
our default option. Procedure number one earns its priority status because
it minimizes data loss, as the INEGI data only comprise CPIs starting from
the year 1969, while our sample contains years beginning as early as 1965. In
addition, real wages calculated from the two procedures are similar.
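The two conversion procedures can be written out as a short sketch. The function names and all numbers below are made up for illustration (they are not BLS, INEGI or Banco de Mexico values, and they are not tuned to reproduce the similarity of the two procedures reported in the text); only the order of operations follows the description above.

```python
def annual_average(monthly_values):
    """Average monthly CPI readings into one annual value."""
    return sum(monthly_values) / len(monthly_values)


def rebase(cpi_by_year, base_year):
    """Rescale a CPI series so that the base year equals 100."""
    base = cpi_by_year[base_year]
    return {year: 100.0 * value / base for year, value in cpi_by_year.items()}


def real_usd_procedure_a(wage_pesos, year, pesos_per_usd, us_cpi_2010):
    """(a) Convert at the nominal rate of the wage year, then deflate with U.S. CPI."""
    nominal_usd = wage_pesos / pesos_per_usd[year]
    return nominal_usd * 100.0 / us_cpi_2010[year]


def real_usd_procedure_b(wage_pesos, year, mx_cpi_2010, pesos_per_usd_2010):
    """(b) Deflate with Mexican CPI to 2010 pesos, then convert at the 2010 rate."""
    pesos_2010 = wage_pesos * 100.0 / mx_cpi_2010[year]
    return pesos_2010 / pesos_per_usd_2010


# Invented index values and exchange rates, for illustration only.
us_cpi = rebase({2000: 500.0, 2010: 625.0}, base_year=2010)   # 2000 -> 80.0
mx_cpi = rebase({2000: 40.0, 2010: 80.0}, base_year=2010)     # 2000 -> 50.0
wage_a = real_usd_procedure_a(1000.0, 2000, {2000: 10.0}, us_cpi)
wage_b = real_usd_procedure_b(1000.0, 2000, mx_cpi, pesos_per_usd_2010=12.5)
```

Procedure (a) needs an exchange rate for every wage year but only the U.S. CPI, whereas (b) needs a Mexican CPI for every wage year, which is why the coverage of the Mexican series matters for the choice of default.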
Unifying Wage Units and Calculating the Dependent Variable
To calculate the dependent variable $\ln(W_i^{MX,US}/W_i^{MX,MX})$, wage units across
countries are required to be consistent for each individual within the sample.
While MIG150 comprises not only hourly wages for last U.S. migration, but
also usual hours worked per week and months worked per year, last domestic
wages are reported in a variety of units across individuals without additional
information on working hours. Therefore, we generated a new variable for U.S.
wages with U.S. units adjusted to those reported for wages in Mexico. Where
the Mexican wage unit was hourly wage, we retained the U.S. hourly wage at
hand. For all other wage units we employed the available information on usual
time worked per period, aided by intuitive assumptions like five working days
per week and 13/3 weeks per month. Yet, for 30 observations hourly wages in
the U.S. were missing. For those where U.S. wages were available in other units,
in a variable much like the one for the last domestic wage, we used these to
fill in the missing observations, following the same adjustment procedure for
both wages. Finally, we converted all calculated wages to 2010 U.S. dollars as
described above and calculated ratios.¹¹ Still, wage ratios were unrealistically
high for some observations, indicating mismeasured data. Consequently, we
decided to apply a correction and replace wages with ratios greater than
50 in either direction with missing values before calculating the final
wage ratio by dividing U.S. wages by last domestic wages (both in 2010 U.S.
dollars) and taking the log.

⁹ BLS (2015).
¹⁰ INEGI (2015b).
¹¹ For the model incorporating additive migration costs described in Appendix A, wage
units need to be consistent not only for each individual but also across all individuals within
the sample, as the dependent variable in this case is the log of the wage difference (and
not the log of the fraction). Therefore, we additionally unified wage units to hourly wages
for the entire sample. To achieve unification, we had to assume that historical
average weekly working hours are an appropriate proxy for actual weekly working hours
of Mexican workers in Mexico because, as mentioned before, this kind of information was
not available for last domestic wages. We incorporated average weekly working hours from
different sources: The "Encuesta Anual de Trabajo y Salarios Industriale" published by
INEGI (2015a) provides data for the years 1940 to 1985. The years 1995 to 2011 are available
from the OECD (2015). Unfortunately, there is no data for the years 1986 to 1994. As
variation appears to be limited in general, though, we filled the gaps with the calculated
mean of the adjacent years 1985 and 1995. As described above, there are missing U.S.
hourly wages, too. Furthermore, when hourly wages were missing, so was data on working
hours. If U.S. wages in other units were accessible, we therefore recalculated these to fill in
the gaps using historical average weekly working hours for the U.S. available from Gallup
(2012). We equally applied the introduced correction to exclude unrealistically high
wage ratios greater than 50. To conclude, we also estimated the model described in section
3.1 employing the ratio of the newly approximated hourly wages as the dependent variable.
See Appendix A for the results.
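The unit harmonisation and the trimming rule for the main dependent variable can be sketched as follows. This is a hedged illustration: the unit labels ("hour", "day", "week", "month", "year") and function names are hypothetical, and the conversions rely only on the assumptions named in the text (five working days per week, 13/3 weeks per month) plus the reported weekly hours and months worked.

```python
import math

DAYS_PER_WEEK = 5          # assumption stated in the text
WEEKS_PER_MONTH = 13 / 3   # assumption stated in the text


def us_wage_in_unit(hourly_usd, hours_per_week, months_per_year, unit):
    """Express a U.S. hourly wage in the unit of the reported Mexican wage."""
    weekly = hourly_usd * hours_per_week
    if unit == "hour":
        return hourly_usd
    if unit == "day":
        return weekly / DAYS_PER_WEEK
    if unit == "week":
        return weekly
    if unit == "month":
        return weekly * WEEKS_PER_MONTH
    if unit == "year":
        return weekly * WEEKS_PER_MONTH * months_per_year
    raise ValueError("unknown wage unit: " + unit)


def log_wage_ratio(us_wage, mx_wage, cutoff=50.0):
    """Return ln(U.S. wage / Mexican wage), or None if the observation is unusable.

    Ratios above `cutoff` in either direction are treated as mismeasured
    and set to missing, mirroring the correction described in the text.
    """
    if us_wage is None or mx_wage is None or us_wage <= 0 or mx_wage <= 0:
        return None
    ratio = us_wage / mx_wage
    if ratio > cutoff or 1.0 / ratio > cutoff:
        return None
    return math.log(ratio)


weekly_us = us_wage_in_unit(10.0, hours_per_week=40, months_per_year=12, unit="week")
cost = log_wage_ratio(weekly_us, 100.0)
```

Because both wages are expressed in the same unit for a given individual, the ratio is unit-free, which is why consistency within individuals suffices for this specification.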
The Restricted Sample
In order to test whether our results are driven by violations of the assumptions
required for the no-arbitrage condition (see section 3.1), by structural
breaks in general wage convergence as investigated for instance by Gandolfi,
Halliday, and Robertson (2015) and Hanson (2003), or by unrealistically high and
therefore mismeasured wages, we imposed strict restrictions on the sample
discussed so far. First, we drop all persons for whom more than one
year lies between reported wages. Second, we exclude those with hourly wages
greater than 30 U.S. dollars (in 2010 prices), and last, we only consider U.S.
migrations from 1994 (the year NAFTA entered into force) onwards.
The preceding paragraphs are meant to enable the reader to follow
the construction of the sample that we will subsequently refer to as MIG_final
and that constitutes the basis for estimation of the empirical model presented
in section 4. Table 1 summarizes the process.
Table 1: Sample Selection
                                                     Respondents
MIG: Household heads that migrated to the U.S.             8,052
Currently not on U.S. migration                           -6,324
Last domestic wage not reported                           -1,352
Wage on last U.S. migration not reported                    -117
Occupation on last U.S. migration unknown                     -1
State of last U.S. migration unknown                         -29
State of occupation before last U.S. migration unknown        -4
Unrealistically high wage ratios*                            -18
MIG_final                                                    207
More than one year between observations per person           -35
Mexican hourly wage > 30 U.S. dollars (2010 prices)          -17
Year of last U.S. migration before 1994                      -30
Restricted Sample                                            125

Notes: *Unrealistically high wage ratios were defined as wage ratios >50 in both
directions. We replaced the corresponding wages with missing values. Source: MMP150.
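As a quick consistency check, the attrition steps of Table 1 can be added up. The counts below are copied from the table; the helper name `remaining_after` is purely illustrative.

```python
def remaining_after(start, drops):
    """Subtract each exclusion step from the starting count."""
    n = start
    for _reason, dropped in drops:
        n -= dropped
    return n


to_mig_final = [
    ("Currently not on U.S. migration", 6324),
    ("Last domestic wage not reported", 1352),
    ("Wage on last U.S. migration not reported", 117),
    ("Occupation on last U.S. migration unknown", 1),
    ("State of last U.S. migration unknown", 29),
    ("State of occupation before last U.S. migration unknown", 4),
    ("Unrealistically high wage ratios", 18),
]
to_restricted = [
    ("More than one year between observations per person", 35),
    ("Mexican hourly wage > 30 U.S. dollars (2010 prices)", 17),
    ("Year of last U.S. migration before 1994", 30),
]

mig_final = remaining_after(8052, to_mig_final)            # 207
restricted = remaining_after(mig_final, to_restricted)     # 125
```

Both totals match the table, so the reported exclusion steps account for every dropped observation.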
5.3 Descriptive Statistics
In this subsection we present descriptive statistics of the sample we
constructed.
Table 2: Representativeness MIG_final I
Mexican Communities: Level of Urbanization

                        PERS                   MIG                    MIG_final
Level of Urbanization   Frequency   Share      Frequency   Share      Frequency   Share
Ranchos                 34,993      0.22       2,205       0.27       47          0.23
Pueblos                 49,797      0.32       2,645       0.33       57          0.28
Mid-sized cities        40,502      0.26       2,154       0.27       62          0.30
Metropolitan            32,587      0.21       1,048       0.13       41          0.20
Total                   157,879     1.00       8,052       1.00       207         1.00

Notes: Levels of urbanization have been adopted from the MMP150 community selection
process as described under http://mmp.opr.princeton.edu/research/selectingcommunities-en.aspx
(accessed 25th of August 2015). Ranchos: < 2,500 inhabitants. Pueblos: 2,500 ≤
inhabitants < 10,000. Mid-sized cities: 10,000 ≤ inhabitants < 100,000. Metropolitan:
≥ 100,000 inhabitants. Where population growth crossed levels over time, we assigned
levels according to the number of inhabitants during the year closest to the survey year.
Sources: MMP Codebooks (1 - 150), Appendix A - Sample Information (MMP150); MMP150 database.
First, however, have a look at tables 2 and 3, where we investigate the
representativeness of MIG_final with regard to the initial samples available
from the MMP150. Table 2 shows the level of urbanization of the communities
represented in the sample. PERS encompasses all persons surveyed by the
MMP150, migrants and non-migrants. One can see that about 80 percent of
the persons within the survey come from places with fewer than 100,000
inhabitants. Turning to MIG, which reduces the PERS file to household heads with
migration experience in the U.S., the portion becomes as high as 87 percent,
indicating that there is a selection of migrants with regard to rural communities.
MIG_final, in contrast, oversamples migrants from metropolitan regions
relative to MIG by 7 percentage points. Unfortunately, it is difficult to extract
summary statistics for all variables included in the regressions with regard to
the underlying supersamples because many variables have been created from
the merger with LIFE and are not included in MIG or PERS.

Table 3: Representativeness MIG_final II

Variable      Sample                Obs    Mean    SD      Min   Max
age_usmigl    MIG                   8047   32.63   11.73   1     85
              MIG_final             207    31.64   9.90    15    68
              Restricted Sample     125    33.71   10.21   16    68
educ_before   MIG (time constant)   8038   5.42    3.99    0     28
              MIG_final             207    7.05    3.41    0     17
              Restricted Sample     125    7.2     3.09    0     17

Notes: MIG is a cross-sectional file for each head of household that migrated to the U.S.
MIG_final is a cross-sectional file including a panel dimension for each head of household
that has not returned from her last U.S. migration and for whom wage and occupation data
for the last job before U.S. migration and the job during the last U.S. migration are available.

Figure 3: Job Places Before and During Last U.S. Migration in MIG_final,
Source: maps.google.de

Table 3 hence offers only limited insights into the demographics of the individuals
in the samples. One can state, though, that the mean age at last U.S. migration is
in the early thirties across all samples. Concerning the level of schooling, we
compare years of education before the last trip to the U.S. for MIG_final and
the restricted sample, while the number of years reported in MIG is the years
of education at the time of the survey. The average person in MIG has experienced
about 5.4 years of education, with a slightly higher standard deviation
than in the other two samples, which exhibit higher levels of education of about
7 years on average. Note that both numbers indicate a low skill level.¹² Also,
all household heads in MIG have already migrated to the U.S. at least once,
but do not have to be wage earners yet, so that it might be the case that MIG
includes more persons who have not finished their education yet. Nonetheless,
our results on the determinants of migration costs appear to refer to
migrants with a higher level of schooling than the average migrant household
head in the MMP150 survey.

¹² According to Borjas (1990), migrants are negatively selected with regard to education
because the gains from relocating to the U.S. from countries with less equal income
distributions like Mexico are large if human capital investments in the origin are impossible.

This leads us to the summary of our regression
variables for MIG_final displayed in table 4. The first four rows deal with
the dependent variables, i.e. migration costs. Migration costs are measured
as the log of the ratio of the wages at the two locations, once adjusted to the
unit given by information on Mexican wages per individual and once adjusted
to hourly wages across the entire sample. The average migration costs in
all cases are greater than one, indicating a wage gain for the average worker
within the sample from migrating to the U.S. However, migration costs show
large variations, with standard deviations greater than one as well, and even
become negative, which we explain with regard to the stochastic component in
the wage determination equation (9) given in subsection 3.2. The subsequent
rows summarize variants and transformations of measures of distance between
the job place in Mexico and the location of employment in the U.S. Most
informative for the reader are the driving distance measured in kilometers and the
traveltime denoted in driving hours. Distances range from only 200 kilometres
to 5726 kilometres with an average value of a little more than 3000 kilometers.
Hours of traveltime lie between 2.05 and 58 hours, the mean time being 27.6
hours. This rather high variation in distance may come as a surprise if one
expected an accumulation of migrants close to the border or in the metropolitan
cities on the North American West Coast. The map given in figure 3 gives
a more vivid overview of the distribution of locations, where green
dots represent origins and blue dots illustrate destinations. Returning to table
4, the next variable is a dummy for English proficiency which is one
if the person is at least able to speak and understand English and zero if the
level of comprehension lies below speaking ability. One can see that almost
half of the migrant household heads have achieved this level of English speaking
proficiency. Concerning the remaining dummy variables, 62 percent of the
sample were in a marital union before their last U.S. migration, 51 percent
already had family with U.S. migration experience at the time, and also about
half of the persons owned property in Mexico. The following variables are not
included in the baseline specification of the regression model: Mean monthly
remittances are 306.91 U.S. dollars with a standard deviation of 260.17 dollars.
The average worker was almost 32 years old in the last available time
period before the last U.S. trip. 66 in every 100 workers had at least one
child before migrating. Only a third participated in social or sport activities
in the U.S., and 11 percent received social welfare. We do not include gender in
our regressions, but be aware that the vast majority of household heads with
migration experience is male and there are only 12 women in the sample.
All in all, the average migrant within MIG_final is a low-skilled man in
his early thirties with a high likelihood of having family or friends who equally
migrated, but with rather strong social relationships in Mexico via children
and partners. In the following section we present the regression results.
Table 4: Descriptive Statistics - MIG_final

Variable                                                        Obs   Mean         SD          Min       Max
Migration costs in logs, Mexican wage rate(a)                   207   1.21         1.64        -3.85     3.57
Migration costs in logs, Mexican wage rate(b)                   203   1.18         1.62        -4.02     3.47
Migration costs in logs, ratio of hourly wages(a)               148   1.23         1.32        -3.85     3.27
Migration costs in logs, ratio of hourly wages(b)               144   1.12         1.38        -4.02     3.27
Log of driving distance in kilometres                           207   7.90         0.55        5.30      8.65
Log of traveltime in driving hours                              207   3.21         0.56        0.72      4.06
Driving distance in kilometres                                  207   3012.26      1108.18     200.00    5726.00
Squared driving distance in kilometres                          207   10300000.00  6512663.00  40000.00  32800000.00
Traveltime in driving hours                                     207   27.60        10.41       2.05      58.00
Squared traveltime in driving hours                             207   869.71       568.69      4.20      3364.00
At least able to speak and understand some English              207   0.49         0.50        0.00      1.00
Married in year before last U.S. migration                      207   0.62         0.49        0.00      1.00
Family with migration experience before last U.S. migration     207   0.51         0.50        0.00      1.00
Any property in Mexico in year before last U.S. migration       207   0.55         0.50        0.00      1.00
Average monthly remittances                                     195   306.91       260.17      0.00      1000.00
Age in the year before last U.S. migration                      207   31.64        9.90        15.00     68.00
Any child in the year before last U.S. migration                207   0.66         0.48        0.00      1.00
Pursued any social or sport activity during last U.S. trip      200   0.34         0.47        0.00      1.00
Any social welfare payment received during last U.S. migration  201   0.11         0.32        0.00      1.00
Male                                                            207   0.94         0.23        0.00      1.00

Notes: (a) Comparable wage rates were calculated by first converting Mexican pesos to U.S. dollars
using the nominal exchange rate for the corresponding year and second deflating the nominal dollar
values to 2010 dollars using the U.S. CPI. (b) Comparable wage rates were calculated by deflating
the peso values to 2010 pesos using the Mexican CPI and converting these to U.S. dollars via the
2010 nominal exchange rate. Source: MMP150.
6 Results
We now turn to the estimated coe�cients for the baseline speci�cation given by
(10). Further, we consider a number of robustness checks. Reported standard
errors are clustered at the community level because some rural communities
in the MMP150 are very small and sampled entirely so that individual back-
grounds and migration experiences are unlikely to be independent. Table 5,
column (1), shows estimates for the regression on migration costs, i.e. the
log of the U.S. wage divided by the last Mexican wage, calculated based on
de�ation using the U.S. CPI. Distance, as expected has a positive e�ect on mi-
gration costs. A one percent increase in distance between locations can ceteris
paribus be associated with a 0.764 percent elevation in the cost of migration.
The e�ect is signi�cant at the one percent level. Moreover, a Mexican mi-
grant who is able to speak and not only understand English has statistically
signi�cant [exp(−0.572)− 1] · 100 = 43.56 percent lower costs to incur than a
comparable Mexican with a lower English pro�ciency. Being married before
migration in contrast, can be associated with signi�cantly higher migration
costs. The coe�cients for having family with U.S. migration experience and
property in Mexico are both positive, but not statistically signi�cant. That
migration experience of the own social network may raise the cost of migration
may come as a surprise as one could expect a respective decrease of the costs
for gaining information about the migration or job search process. Besides,
family that stayed in the U.S. should be helpful in getting to know one's way
around. Still, experience reports about troublesome migrations might also in-
crease compensation demands as migration costs are expected to be high. The
second column states estimates for the same coe�cients but with regard to
the dependent variable being constructed using the Mexican CPI for de�a-
tion of wages. The magnitudes of the coe�cients are slightly smaller for all
variables, while signs and statistical signi�cance are consistent across the two
estimations. The amount of variation in the dependent variable explained by
31
Table 5: MIG_�nal - Baseline Speci�cation
(1) (2)VARIABLES Migration costs
Mexican wage rate(a) Mexican wage rate(b)
lndist 0.764*** 0.708***(0.212) (0.193)
englishspeaker -0.572** -0.522***(0.220) (0.190)
married_before 0.533* 0.400*(0.266) (0.214)
famusmig_before 0.149 0.107(0.237) (0.251)
anyproperty_before 0.239 0.243(0.178) (0.181)
Constant -5.084*** -4.616***(1.643) (1.494)
Observations 207 203R-squared 0.170 0.127
Robust standard errors in parentheses*** p<0.01, ** p<0.05, * p<0.1
Notes: (a) Comparable wage rates were calculated by �rst transferring Mexican pesos toU.S. dollars using the nominal exchange rate for the corresponding year and secondde�ating the nominal dollar values to 2010 dollars accessing the U.S. CPI. (b) Comparablewage rates were calculated by de�ating the peso values to 2010 pesos using the MexicanCPI and transferring these to U.S. dollars via the 2010 nominal exchange rate. Source:MMP 150.
the regressors varies, however: it is 17 percent for the first dependent
variable and 12.7 percent for the second. As the results are remarkably
similar, we proceed in the following robustness checks with migration costs
deflated using the U.S. CPI, in order to keep as many observations, and thereby
as much variation within the sample, as possible.
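The two deflation routes described in the notes to table 5 can be sketched as
follows. This is only an illustration of the order of operations; the function
names are our own and all numerical inputs are hypothetical placeholders, not
values taken from the MMP150, BLS or INEGI data.

```python
# Sketch of the two deflation routes from the notes to table 5.
# All numbers below are hypothetical placeholders, NOT actual CPI or
# exchange-rate values.

def real_usd_route_a(wage_pesos, pesos_per_usd_year, us_cpi_year, us_cpi_2010):
    """Route (a): convert pesos to dollars at the nominal exchange rate of the
    wage year, then deflate the dollar value to 2010 dollars with the U.S. CPI."""
    nominal_usd = wage_pesos / pesos_per_usd_year
    return nominal_usd * us_cpi_2010 / us_cpi_year


def real_usd_route_b(wage_pesos, mx_cpi_year, mx_cpi_2010, pesos_per_usd_2010):
    """Route (b): deflate pesos to 2010 pesos with the Mexican CPI, then
    convert to dollars at the 2010 nominal exchange rate."""
    pesos_2010 = wage_pesos * mx_cpi_2010 / mx_cpi_year
    return pesos_2010 / pesos_per_usd_2010


# Hypothetical wage of 100 pesos observed in some year before 2010.
wage_a = real_usd_route_a(100.0, pesos_per_usd_year=10.0,
                          us_cpi_year=80.0, us_cpi_2010=100.0)
wage_b = real_usd_route_b(100.0, mx_cpi_year=50.0,
                          mx_cpi_2010=100.0, pesos_per_usd_2010=12.5)
print(wage_a, wage_b)
```

The two routes generally give different dollar values for the same peso wage,
because inflation differentials and exchange rate movements need not offset
each other, which is why both variants of the dependent variable are reported.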
To begin with, we check the robustness of the marginal effect of geographical
distance on migration costs. Results are reported in table 6. As indicated,
the measurement of the dependent variable stays the same across
specifications. The first column is the baseline specification. In the second
column we replace the log of the distance measured in kilometers by the log of
travel time in driving hours. A one percent increase in travel time goes along
with a 0.753 percent increase in migration costs. The coefficient is
significant at the one percent level and close to the marginal effect of the
log of distance in kilometers. All other coefficients are nearly unchanged. In
columns (3) and (4) we account for the possibility that the relationship
between distance and migration costs might be misspecified by including both
distance proxies, that is, driving distance in kilometers and travel time in
hours (no logs this time), in turn, together with their squares. We presume
that the marginal effect of distance on migration costs is positive but
decreasing with distance. The corresponding coefficients have the expected
signs. Yet, only the coefficient for distance in kilometers remains
statistically significant, at the ten percent level. The coefficients for
travel time and travel time squared are not statistically significant.
Table 6: Robustness of Distance
                        Migration costs, Mexican wage rate (a)
VARIABLES               (1)          (2)          (3)          (4)
lndist                  0.764***
                        (0.212)
englishspeaker          -0.572**     -0.568**     -0.574**     -0.572**
                        (0.220)      (0.223)      (0.226)      (0.228)
married_before          0.533*       0.538**      0.563**      0.565**
                        (0.266)      (0.267)      (0.263)      (0.265)
famusmig_before         0.149        0.143        0.119        0.118
                        (0.237)      (0.235)      (0.247)      (0.247)
anyproperty_before      0.239        0.245        0.222        0.227
                        (0.178)      (0.177)      (0.173)      (0.172)
lntrvltime                           0.753***
                                     (0.213)
distance                                          0.000739*
                                                  (0.000427)
distsq                                            -7.11e-08
                                                  (6.71e-08)
traveltime                                                     0.0738
                                                               (0.0443)
trvltimesq                                                     -0.000729
                                                               (0.000741)
Constant                -5.084***    -1.462**     -0.533       -0.445
                        (1.643)      (0.707)      (0.606)      (0.589)
Observations            207          207          207          207
R-squared               0.170        0.169        0.161        0.159
Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Notes: (a) Comparable wage rates were calculated by first converting Mexican
pesos to U.S. dollars using the nominal exchange rate for the corresponding
year and second deflating the nominal dollar values to 2010 dollars using the
U.S. CPI. Source: MMP 150.
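To illustrate what the quadratic specification in column (3) of table 6
implies, the following sketch evaluates the marginal effect of distance at a
few distances and the distance at which it would reach zero. The coefficients
are copied from table 6; the evaluation distances are arbitrary, and we assume
that the dependent variable is in logs as in the baseline, so that the marginal
effect is a semi-elasticity per kilometer. Given the imprecise estimate of the
squared term, the numbers are indicative only.

```python
# Point estimates from table 6, column (3).
B_DISTANCE = 0.000739   # coefficient on distance (km)
B_DISTSQ = -7.11e-08    # coefficient on distance squared


def marginal_effect(distance_km):
    """Derivative of the fitted quadratic with respect to distance,
    i.e. b1 + 2 * b2 * distance."""
    return B_DISTANCE + 2.0 * B_DISTSQ * distance_km


def turning_point_km():
    """Distance at which the fitted marginal effect reaches zero."""
    return -B_DISTANCE / (2.0 * B_DISTSQ)


# Arbitrary illustration distances, chosen only for the example.
for d in (500, 1500, 3000):
    print(d, marginal_effect(d))
print(turning_point_km())
```

The marginal effect is positive and declining over the illustrated distances,
and the implied turning point lies at roughly 5,200 kilometers.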
We now turn to how we deal with the potential time trend inherent in the data,
which we described in section 4.2 and which may bias the estimation. Table 7
offers an overview. Column (1) is again the baseline specification. In column
(2) we deal with the problem that for 35 individuals the difference in years
between observed wages is greater than one, so that the no-arbitrage
assumption may not hold. We therefore estimate the baseline specification on a
restricted sample of 172 observations. This leads to an increase in the
magnitude of the marginal effect of distance on migration costs. A one percent
increase in distance leads ceteris paribus to an increase in migration costs of
0.943 percent, given the model is correctly specified. The magnitudes of the
remaining coefficients change only little in comparison to the unrestricted
sample. Signs and statistical significance remain unchanged. We maintain the
year-difference constraint across subsequent columns. In the third column we
only consider persons who migrated after 1994, which yields the same results as
for the restricted sample (see column (6)), indicating that hourly wages
reported for the following years did not exceed 30 U.S. dollars at 2010 prices.
Regarding the estimates, the restriction reduces the size of the distance
coefficient to 0.606 and the size of the coefficient for English proficiency to
-0.236. The marginal effect of marital unions loses its statistical
significance. The effect of owning property in Mexico changes sign. For someone
who migrated after NAFTA, owning property leads to about 22.2 percent lower
migration costs in comparison to someone without property. The effect is
significant at the 10 percent level. It might be the case that the financial
role of property has gained further importance or that return migration has
become more common over the years, so that maintaining property reduces fears
about the prospects at destination. In columns (4) and (5) we incorporate time
period dummies into the regression. The period dummies are all statistically
significant, so that a time trend is apparent. The marginal effect of property
described for column (3) increases in magnitude and becomes significant at the
5 percent level. In addition, the distance effect remains positive and highly
significant. The log of the distance is omitted in column (5) due to
collinearity because we interacted the variable with the period dummies to
capture time-variant effects. For the period 1959 to 1989 the marginal effect
is greater than in the previous estimates. It decreases for the period 1990 to
1999 and increases again for the period from 2000 onwards, which appears to be
counter-intuitive given cheaper transportation and communication. All
interaction terms are significant at the 5 percent level. Including time
controls raises the share of variance in migration costs explained by the
regressors, and the R² reaches values around 60 percent. In column (7) we
include further potential determinants of migration costs, such as having had
children before migration, participating in social or sports activities at
destination, and receiving social welfare payments in the U.S. None of the
additional coefficients is significant.
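The collinearity that forces the log of distance out of column (5) of table 7
can be made explicit with a small sketch. The observations below are
hypothetical and serve only to show the mechanics; the period boundaries follow
the labels used in table 7.

```python
import math

# Hypothetical migrants: (distance in km, year of migration). NOT MMP150 data.
sample = [(800.0, 1975), (1200.0, 1988), (950.0, 1993),
          (2100.0, 1998), (1500.0, 2004), (700.0, 2010)]


def period_dummies(year):
    """Indicators for the periods 1959-1989, 1990-1999 and 2000-2012."""
    return (int(year <= 1989), int(1990 <= year <= 1999), int(year >= 2000))


rows = []
for distance_km, year in sample:
    lndist = math.log(distance_km)
    # Interact the log of distance with each period dummy.
    interactions = [lndist * d for d in period_dummies(year)]
    rows.append((lndist, interactions))

# Because every observation falls into exactly one period, the three
# interaction columns sum to lndist. lndist is therefore an exact linear
# combination of the interactions and cannot be estimated alongside them.
for lndist, interactions in rows:
    assert abs(sum(interactions) - lndist) < 1e-12
print("lndist equals the sum of its period interactions in every row")
```

Since the three interactions jointly span the log of distance, each interaction
coefficient can be read directly as the period-specific distance elasticity.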
We repeat the described robustness checks using only the restricted sample as
described in section 5.2 and table 1. Regression results are reported in table
8, where column (1) shows the baseline specification using MIG_final for
comparison. Columns (2) and (3) employ the baseline specification for both
measures of deflation, paralleling table 5. Columns (4) and (5) correspond to
the robustness checks of the distance coefficient as illustrated in table 6.
The final column adds additional controls and period dummies.
For the restricted sample, the R² values are substantially lower, between
0.142 and 0.208. The distance effect, also if measured by the log of travel
time as in column (4), remains positive and statistically significant, except
for the specification check using the square of the distance in column (5),
where the coefficients nevertheless have the expected signs. The marginal
effect of English proficiency on migration costs stays negative and
statistically significant across columns. Compared to the final sample, the
magnitude of the coefficients is only about half as large, though. Yet, the
marginal effect is still impressive: for column (2), an English speaker incurs
ceteris paribus 21.02 percent lower migration costs than a Mexican who does not
speak the language.
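The percentage effects quoted for dummy variables follow from the usual
transformation of a coefficient in a regression with a logged dependent
variable, 100 * (exp(b) - 1). A minimal sketch, with a function name of our own
choosing and coefficients taken from tables 7 and 8:

```python
import math


def dummy_percent_effect(coefficient):
    """Percentage change in the level of the dependent variable when a dummy
    switches from 0 to 1, given a log dependent variable."""
    return (math.exp(coefficient) - 1.0) * 100.0


# englishspeaker, restricted sample (table 8, column (2))
print(round(dummy_percent_effect(-0.236), 2))
# anyproperty_before, migration after NAFTA (table 7, column (3))
print(round(dummy_percent_effect(-0.251), 1))
```

For small coefficients the transformation is close to 100 times the coefficient
itself, but for the magnitudes reported here the difference is noticeable.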
To sum up, the effect of distance on migration costs appears to be generally
robust across different specifications and samples. The size of the
coefficients using MIG_final ranges from 0.675 to an upper bound of 0.943. For
the restricted sample the coefficients lie between 0.492 and 0.659. All
estimates are significant at the 5 percent level or better. The language dummy
is also stable and significant in many cases. Moreover, the included period
dummies are all significant and point to a time trend that potentially biases
the estimates if it is not controlled for. Different deflation measures yield
very similar results.
Table 7: Controlling for Time Trends
                              Migration costs, Mexican wage rate (a)
VARIABLES                     (1)        (2)        (3)        (4)        (5)        (6)        (7)
lndist                        0.764***   0.943***   0.606***   0.799***              0.606***   0.675***
                              (0.212)    (0.224)    (0.165)    (0.136)               (0.165)    (0.184)
d_1990_1999                                                    2.687***   6.855***              2.755***
                                                               (0.363)    (2.533)               (0.387)
d_2000_2012                                                    2.709***   3.930*                2.815***
                                                               (0.343)    (2.143)               (0.364)
englishspeaker                -0.572**   -0.500**   -0.236**   -0.197     -0.166     -0.236**   -0.171
                              (0.220)    (0.195)    (0.105)    (0.149)    (0.149)    (0.105)    (0.179)
married_before                0.533*     0.546**    0.0167     0.0289     0.0416     0.0167     0.417
                              (0.266)    (0.223)    (0.144)    (0.169)    (0.174)    (0.144)    (0.220)
famusmig_before               0.149      0.296      -0.0294    0.355**    0.352**    -0.0294    0.329*
                              (0.237)    (0.264)    (0.113)    (0.167)    (0.169)    (0.113)    (0.190)
anyproperty_before            0.239      0.175      -0.251*    -0.303**   -0.312**   -0.251*    -0.320*
                              (0.178)    (0.201)    (0.127)    (0.147)    (0.145)    (0.127)    (0.189)
remit                                                                                           -0.000127
                                                                                                (0.000348)
age_usyrl                                                                                       0.015
                                                                                                (0.00971)
anychild_before                                                                                 -0.288
                                                                                                (0.223)
activity_usmigl                                                                                 0.08
                                                                                                (0.167)
welfare                                                                                         -0.0181
                                                                                                (0.317)
lndist_1959_1989                                                          1.066***
                                                                          (0.266)
lndist_1990_1999                                                          0.529**
                                                                          (0.218)
lndist_2000_2012                                                          0.905***
                                                                          (0.137)
difference between years ≤ 1  No         Yes        Yes        Yes        Yes        Yes        Yes
migration after NAFTA         No         No         Yes        No         No         Yes        No
period dummies                No         No         No         Yes        Yes        No         Yes
lndist time-variant           No         No         No         No         Yes        No         No
restricted sample             No         No         No         No         No         Yes        No
Constant                      -5.084***  -6.514***  -2.585*    -7.078***  -9.171***  -2.585*    -5.080***
                              (1.643)    (1.765)    (1.337)    (1.113)    (1.976)    (1.337)    (1.489)
Observations                  207        172        125        172        172        125        189
R-squared                     0.170      0.227      0.206      0.624      0.630      0.206      0.580
Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Notes: (a) Comparable wage rates were calculated by first converting Mexican
pesos to U.S. dollars using the nominal exchange rate for the corresponding
year and second deflating the nominal dollar values to 2010 dollars using the
U.S. CPI. Source: MMP 150.
Table 8: Restricted Sample
                        Migration costs
VARIABLES               (1)         (2)         (3)         (4)         (5)         (6)
Mexican wage rate       (a)         (a)         (b)         (a)         (a)         (a)
lndist                  0.764***    0.606***    0.659***                            0.492**
                        (0.212)     (0.165)     (0.154)                             (0.188)
d_1990_1999                                                                         0.0528
                                                                                    (0.131)
englishspeaker          -0.572**    -0.236**    -0.222**    -0.231**    -0.275**    -0.236*
                        (0.220)     (0.105)     (0.106)     (0.105)     (0.109)     (0.128)
married_before          0.533*      0.0167      -0.0120     0.0184      0.0342      0.0567
                        (0.266)     (0.144)     (0.152)     (0.141)     (0.157)     (0.163)
famusmig_before         0.149       -0.0294     -0.0197     -0.0221     -0.0173     -0.0393
                        (0.237)     (0.113)     (0.116)     (0.112)     (0.116)     (0.127)
anyproperty_before      0.239       -0.251*     -0.172      -0.244*     -0.296**    -0.0944
                        (0.178)     (0.127)     (0.133)     (0.125)     (0.132)     (0.151)
age_usyrl                                                                           0.00770
                                                                                    (0.00785)
anychild_before                                                                     -0.477**
                                                                                    (0.208)
activity_usmigl                                                                     -0.103
                                                                                    (0.198)
welfare                                                                             -0.0914
                                                                                    (0.177)
remit                                                                               -9.49e-05
                                                                                    (0.000249)
lntrvltime                                                  0.589***
                                                            (0.167)
distance                                                                0.000326
                                                                        (0.000351)
distsq                                                                  -1.37e-08
                                                                        (5.10e-08)
Constant                -5.084***   -2.585*     -3.089**    0.300       1.392**     -1.622
                        (1.643)     (1.337)     (1.231)     (0.583)     (0.631)     (1.623)
Observations            207         125         125         125         125         110
R-squared               0.170       0.206       0.208       0.202       0.177       0.142
Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Notes: (a) Comparable wage rates were calculated by first converting Mexican
pesos to U.S. dollars using the nominal exchange rate for the corresponding
year and second deflating the nominal dollar values to 2010 dollars using the
U.S. CPI. (b) Comparable wage rates were calculated by deflating the peso
values to 2010 pesos using the Mexican CPI and converting these to U.S. dollars
via the 2010 nominal exchange rate. Source: MMP 150.
7 Conclusion and Outlook
With this work we put forward a new estimate of the determinants of migration
costs using microdata on spatial wage gaps of 207 Mexicans who migrated to
the U.S. We find that distance remains a deterrent to migration, even in times
of cheaper communication and travel. Even for a sample that restricts migration
to the years after NAFTA entered into force in 1994, with minimal temporal
difference between reported wages and controlling for potential time trends,
the marginal effect of distance is statistically significant at the five
percent level and as high as 0.492. This means that for a location 500
kilometers away associated with migration costs of 100 U.S. dollars, an
additional 50 kilometers (a ten percent increase in distance) is associated
with an increase in the cost of migration to roughly 105 U.S. dollars, all
other things equal. While most other variables considered were not found to be
meaningful influences on migration costs, English proficiency, besides
distance, appears to be robust and significant across different variants of the
regression model, so that the average English-speaking Mexican migrant is
confronted with lower costs than her non-fluent compatriot.
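As a rough numerical check of what a distance elasticity of about 0.49 implies
in levels, the following sketch applies the constant-elasticity relation
between distance and migration costs, holding all other determinants fixed. The
function name is our own; the elasticity is the lndist coefficient from table
8, column (6), and the starting point of 500 kilometers and 100 U.S. dollars is
purely illustrative.

```python
def cost_after_move(base_cost, base_km, new_km, elasticity):
    """Constant-elasticity relation: costs scale with
    (new_km / base_km) ** elasticity, all else equal."""
    return base_cost * (new_km / base_km) ** elasticity


ELASTICITY = 0.492  # lndist coefficient, table 8, column (6)

# One percent more distance (500 km -> 505 km).
print(cost_after_move(100.0, 500.0, 505.0, ELASTICITY))
# Ten percent more distance (500 km -> 550 km).
print(cost_after_move(100.0, 500.0, 550.0, ELASTICITY))
```

A one percent increase in distance raises costs by roughly half a percent,
whereas it takes an increase of about ten percent in distance to raise costs by
close to five percent.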
The empirical investigation we propose faces limitations, many of which are
due to a lack of data. First of all, the constructed panel-like sample is
small with regard to individuals but covers a long time horizon, which makes
wage comparisons problematic despite identical individuals and short time
spans between occupations. The limited size of the sample also kept us from
including controls for differences in regional amenities. However, it has been
argued before that these are already captured by deflating monetary values
with regional CPIs (Bodvarsson, Simpson, and Sparber, 2015). Regrettably, we
were only able to obtain national CPIs for the work presented here. Also, the
results are tied to a relatively homogeneous group of migrants and a single
country pair, although the Mexican-U.S. border is of particular interest with
regard to migration flows. Consequently, it would be an interesting research
project to discriminate between internal and international migration costs by
incorporating a border dummy. Unfortunately, and despite containing
information on domestic migration, the MMP150 data do not include further
location-specific wage data that could be accessed. This fact also creates an
obstacle to dynamically modeling return migration with the data at hand, even
though this would be advisable given that we observe that many Mexicans take
several trips to the U.S. An analysis of return migration might also allow us
to disentangle direct and indirect natural barriers to migration, as return
migrants are likely to experience lower psychological costs relative to
monetary expenses. To conclude, however, we believe that our transfer from the
trade literature estimating trade costs using prices to an estimate of the
determinants of migration costs provides an interesting programme for further
research. In particular, through our resort to microdata on the same
individuals we offer a contribution to the solution of the problem of
selection bias through wage-influencing unobservables pertaining to earnings
comparisons across locations.
References
Aguayo-Téllez, E., and J. Martínez-Navarro (2013): "Internal and
international migration in Mexico: 1995-2000," Applied Economics, 45(13),
1647–1661.
Akee, R. (2010): "Who Leaves? Deciphering Immigrant Self-Selection from a
Developing Country," Economic Development and Cultural Change, 58(2),
323–344.
Allen, T., and C. Arkolakis (2014): "Lecture 12: Estimating Trade
Costs using Prices [Lecture Notes]," Retrieved September 7, 2015, from
https://sites.google.com/site/treballen/teaching/econ-460-2014.
Anderson, J. E. (2011): "The Gravity Model," Annual Review of Economics,
3(1), 133–160.
Atkin, D., and D. Donaldson (2015): "Who's getting Globalized? The Size
and Nature of Intranational Trade Costs," NBER Working Paper 21439.
Beine, M., S. Bertoli, and J. Fernández-Huertas Moraga (2014): "A
practitioners' guide to gravity models of international migration," CREA
Discussion Paper 2014-24, pp. 0–24.
Beine, M., F. Docquier, and Ç. Özden (2011): "Diasporas," Journal of
Development Economics, 95(1), 30–41.
BLS (2015): "Monthly US CPI-W, 1967=100," [Data sheet], Available from
http://data.bls.gov/cgi-bin/surveymost, Accessed August 8, 2015.
Bodvarsson, Ö. B., N. B. Simpson, and C. Sparber (2015): "Migration
Theory," in Handbook of the Economics of International Migration, ed. by
B. R. Chiswick, and P. W. Miller, vol. 1, chap. 1, pp. 3–51. Elsevier B.V.
Borjas, G. J. (1990): Friends or Strangers. Basic Books, New York.
Cameron, A. C., and P. K. Trivedi (2005): Microeconometrics: Methods
and Applications, Cambridge Books. Cambridge University Press.
Clemens, M. A., C. E. Montenegro, and L. Pritchett (2008): "The
Place Premium: Wage Differences for Identical Workers Across the US
Border," CGDEV Policy Research Working Paper 4671.
Ehrenberg, R. G., and R. S. Smith (2003): "Worker Mobility: Migration,
Immigration, and Turnover," in Modern Labor Economics: Theory and Public
Policy, chap. 10, pp. 310–343. Addison-Wesley Higher Education Group,
8 edn.
Engel, C., and J. H. Rogers (1996): "How Wide Is the Border?," American
Economic Review, pp. 1–12.
Gallup (2012): "Gallup Poll Social Series: Work and Education," [Poll
Results]. Retrieved from
http://cdn.cnsnews.com/documents/GALLUP-SCHOOL%20POLL.pdf, Accessed
August 11, 2015.
Gandolfi, D., T. Halliday, and R. Robertson (2015): "Trade, FDI,
Migration, and the Place Premium: Mexico and the United States," IZA
Discussion Paper 9215.
Google Maps (2015): "Distance and Traveltime between locations," [Map].
Available from maps.google.de, Accessed July 30, 2015.
Hanson, G. H. (2003): "What Has Happened to Wages in Mexico since
NAFTA?," NBER Working Paper 9563.
INEGI (2015a): "Encuesta Anual de Trabajo y Salarios Industriale,"
[Data sheet]. Retrieved from
http://www.inegi.org.mx/prod_serv/contenidos/espanol/bvinegi/productos/nueva_estruc/HyM2014/5.%20Trabajo.pdf,
Cuadro 5.19, 1.a parte, Accessed August 10, 2015.
INEGI (2015b): "Monthly Mexican CPI, December 2010=100," [Data sheet],
Available from
http://www.inegi.org.mx/sistemas/indiceprecios/Estructura.aspx?idEstructura=112000200010&T=%C3%8Dndices%20de%20Precios%20al-%20Consumidor&ST=Principales%20%C3%Adndices,
Accessed August 7, 2015.
Jevons, W. S. (1871): The Theory of Political Economy. Macmillan.
Johnson, W. R. (1977): "Uncertainty and the Distribution of Earnings," in
Distribution of Economic Well-Being, ed. by T. F. Juster, vol. I, pp. 379–396.
NBER.
Kennan, J., and J. R. Walker (2011): "The Effect of Expected Income on
Individual Migration Decisions," Econometrica, 79(1), 211–251.
Mincer, J. (1958): "Investment in Human Capital and Personal Income
Distribution," Journal of Political Economy, 66(4), 281–302.
MMP (2015): "MMP150," [Data files and Codebooks]. Available from
mmp.opr.princeton.edu, Accessed July 20, 2015.
OECD (2015): "Average Weekly Working Hours Mexico (1995-2011)," [Data
sheet]. Available from
https://stats.oecd.org/Index.aspx?DataSetCode=AVE_HRS#, Accessed
August 10, 2015.
Ortega, F., and G. Peri (2013): "The Effect of Income and Immigration
Policies on International Migration," Migration Studies, 1(1), 1–35.
Ortega, F., and G. Peri (2014): "Openness and income: The roles of trade
and migration," Journal of International Economics, 92(2), 231–251.
Persson, K. (2008): "Law of One Price," EH.Net Encyclopedia, Retrieved
July 30, 2015, from http://eh.net/encyclopedia/the-law-of-one-price/.
Samuelson, P. A. (1954): "The Transfer Problem and Transport Costs, II:
Analysis of Effects of Trade Impediments," The Economic Journal, 64(254),
264–289.
Smith, A. (1776): An Inquiry into the Nature and Causes of the Wealth of
Nations. Modern Library, New York, 1937 edn.
Wooldridge, J. M. (2008): Introductory Econometrics: A Modern Approach.
Cengage Learning Emea.
Appendix: Additive Migration Costs
We model migration costs as a portion of the wage earned in the country of
origin, leaning on the common application of these so-called "iceberg costs"
in established gravity models (Anderson, 2011), on examples from trade eco-
nomics estimating trade costs using prices (Engel and Rogers, 1996), and on
work from the migration literature itself (Ortega and Peri, 2014; Aguayo-Téllez
and Martínez-Navarro, 2013).
In this Appendix we further consider additive migration costs. Whereas
the general theoretical background including wage determination remains
unchanged, the no-arbitrage condition can be derived as follows. Consider the
subsequent equation:
\[ W_i^{j,k} - \delta_i^{j,k} = W_i^{j,l} - \delta_i^{j,l} \tag{11} \]
where $W_i^{j,k}$ is the wage of a worker $i$ from location $j$ in location
$k$, and $W_i^{j,l}$ are the earnings of this very same worker in location $l$,
with $j, k, l \in S$, the set of locations, and $i \in N$, the set of
individuals. $\delta_i^{j,k}$ depicts worker $i$'s costs of migrating from $j$
to $k$, or to $l$ correspondingly for $\delta_i^{j,l}$. Wages in different
locations thus have to be equal if there are no migration costs. This implies
that the compensating wages have to be higher, the greater the cost of
migration. Replacing $l$ with $j$ for workers earning wages in their home
country results in
\[ W_i^{j,k} + \delta_i^{j,j} = W_i^{j,j} + \delta_i^{j,k} \tag{12} \]
Repeating from section 3.1, a worker who stays does not incur relocation
expenses of any kind, so that in the additive case $\delta_i^{j,j}$ is zero
and we are left with
\[ W_i^{j,k} = W_i^{j,j} + \delta_i^{j,k} \tag{13} \]
As before, we adjust the model to the data at hand, which means replacing $j$
with $MX$ for Mexico, the place of origin observed in the data, and $k$ with
$US$ for the U.S. as the destination country:
\[ W_i^{MX,US} = W_i^{MX,MX} + \delta_i^{MX,US} \tag{14} \]
The determination of migration costs remains the same as in equation (7)
\[ \delta_i^{MX,US} = \exp\left(\beta_0 + {x_i^{MX,US}}'\beta_1 + \varepsilon_i^{MX,US}\right), \tag{15} \]
with $x_i^{MX,US}$ being a vector of diverse determinants of migration costs,
$\beta_0$ denoting a constant and $\varepsilon_i^{MX,US}$ an error term with
mean zero that is uncorrelated with the regressors. The vector of parameters
of interest is still $\beta_1$. Taking logs, the final model for additive costs
becomes
\[ \ln\left(W_i^{MX,US} - W_i^{MX,MX}\right) = \beta_0 + {x_i^{MX,US}}'\beta_1 + \varepsilon_i^{MX,US} \tag{16} \]
Note that the dependent variable in this case cannot be written as the
difference of the logs of the wages in the two locations. The dependent
variable is the log of the wage differential, which is why the model based on
iceberg costs and the model working with additive migration costs cannot be
translated into each other: they implicitly make different functional form
assumptions when using the same migration cost specification. Therefore,
estimates obtained from the preceding equation cannot be expected to be
directly comparable to estimates from equation (10). Keeping this in mind, the
regression model for additive migration costs can be stated as follows:
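\[
\begin{aligned}
\ln\left(W_i^{MX,US} - W_i^{MX,MX}\right) = {} & \beta_0 + \beta_1\,\mathit{lndist}_i^{MX,US}
  + \beta_2\,\mathit{englishspeaker}_i^{US} + \beta_3\,\mathit{married}_i^{MX} \\
& + \beta_4\,\mathit{famusexp}_i^{MX} + \beta_5\,\mathit{property}_i^{MX} + \varepsilon_i^{MX,US}
\end{aligned}
\tag{17}
\]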
There are several empirical difficulties to be recognized. From the microdata
at hand we face imprecision regarding the wages measured, resulting from
missing data on working hours in the last domestic occupation. While wage
units are available for both domestic and U.S. wages, only the U.S. wage data
additionally allow for an adjustment to the unit dictated by the information
on domestic wages. Thus, the data do not pose a problem for our preferred
model that relies on iceberg migration costs. As the dependent variable in the
case of iceberg migration costs is the log of a ratio, units need only be
consistent per individual. In contrast, the additive model requires equal wage
units for every individual within the sample. To compute these we need to make
further assumptions on the average hours worked based on additional sources,
which might bias the estimate for the cost of migration either downwards, if
the average working hours in Mexico are set too low (so that hourly wages in
Mexico are assumed to be too high), or upwards in the reverse case. Moreover,
the additive model cannot deal with negative wage differentials at the
estimation stage, which results in a further reduction of the sample size to
merely 90 observations in the restricted sample and a loss of actual variation
in the data. Table 9 reports results for the same set of observations, namely
the restricted sample minus the observations exhibiting negative wage
differentials. We control for time trends using period dummies.
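The two practical differences between the dependent variables, sensitivity to
wage units and the treatment of negative wage differentials, can be
illustrated with a short sketch. The function names and all wage figures are
hypothetical; in particular, the 40 hours per week used to rescale wages is an
assumption made for this example only.

```python
import math


def iceberg_dep_var(wage_us, wage_mx):
    """Iceberg case: log of the wage ratio, ln(W_US / W_MX)."""
    return math.log(wage_us / wage_mx)


def additive_dep_var(wage_us, wage_mx):
    """Additive case: log of the wage differential, ln(W_US - W_MX).
    Returns None when the differential is not positive, since the log is
    undefined and the observation drops out of the estimation."""
    gap = wage_us - wage_mx
    return math.log(gap) if gap > 0 else None


HOURS_PER_WEEK = 40.0  # hypothetical assumption, for illustration only

hourly = (12.0, 3.0)  # hypothetical (U.S. wage, Mexican wage) per hour
weekly = (hourly[0] * HOURS_PER_WEEK, hourly[1] * HOURS_PER_WEEK)

# The ratio is unaffected by the wage unit, as long as both wages of one
# individual are measured in the same unit.
print(iceberg_dep_var(*hourly), iceberg_dep_var(*weekly))
# The differential depends on the unit, so units must match across individuals.
print(additive_dep_var(*hourly), additive_dep_var(*weekly))
# A worker earning less in the U.S. than in Mexico cannot be used.
print(additive_dep_var(2.0, 3.0))
```

The ratio-based measure is identical for hourly and weekly wages, while the
differential-based measure changes with the unit and is undefined whenever the
U.S. wage does not exceed the Mexican wage.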
A one percent increase in distance between locations leads to a ceteris
paribus increase in migration costs of 0.615 percent in the multiplicative
model and a 0.622 percent increase in the additive migration costs, given that
the respective model has been correctly specified (columns 1 and 3). As the
additive model is based on hourly wages, we additionally regress on migration
costs as the ratio of hourly wages, which yields a smaller effect on migration
costs of a 0.371 percent increase. The distance effect is statistically
significant at the one percent level for all models. For the remaining
variables, the additive model mostly exhibits opposite signs of the
coefficients in comparison to the regression on iceberg migration costs. For
instance, though the effect is statistically insignificant, an English speaker
is associated with 10.33 percent lower migration costs than someone who only
understands English or does not have a grasp of
Table 9: Iceberg Migration Costs and Additive Migration Costs
                        Iceberg Migration Costs                                 Additive Migration Costs (a)
VARIABLES               Mexican wage rate (a)       Ratio of hourly wages (a)
                        (1)                         (2)                         (3)
lndist                  0.615***                    0.371***                    0.622***
                        (0.155)                     (0.102)                     (0.192)
d_1990_1999             1.034***                    0.789***                    -0.408
                        (0.200)                     (0.199)                     (0.247)
d_2000_2012             0.770***                    0.662***                    -0.660***
                        (0.224)                     (0.196)                     (0.201)
englishspeaker          -0.109                      -0.0644                     0.0723
                        (0.126)                     (0.104)                     (0.119)
married_before          -0.00632                    0.00781                     -0.0319
                        (0.165)                     (0.132)                     (0.137)
famusmig_before         0.108                       0.0736                      0.150
                        (0.140)                     (0.108)                     (0.0952)
anyproperty_before      -0.329**                    -0.242*                     -0.0377
                        (0.142)                     (0.134)                     (0.119)
Constant                -3.812***                   -1.672*                     -2.276
                        (1.247)                     (0.851)                     (1.586)
Observations            90                          90                          90
R-squared               0.282                       0.210                       0.371
Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
Notes: (a) Comparable wage rates were calculated by first converting Mexican
pesos to U.S. dollars using the nominal exchange rate for the corresponding
year and second deflating the nominal dollar values to 2010 dollars using the
U.S. CPI. Source: MMP 150.
the language at all in the multiplicative specification (taking hourly wages,
the magnitude of the effect is only 6.24 percent), while they are 7.5 percent
higher in the additive model.