
An Estimate of the Determinants of Migration Costs

Using Mexican-US Microdata

Tamara Bogatzki and Steffen Sirries∗

July 27, 2016

Abstract

In this paper we propose a new estimate of the determinants of migration costs using microdata on spatial wage gaps of Mexican migrants to the U.S. Leaning on a strand of the trade literature that estimates trade costs using price differentials across locations, we proxy migration costs with the wage differential of identical workers between locations. To avoid the problem of selection on wage-influencing unobservables that arises when comparing observably identical workers, we contrast the earnings of the very same individual before and during migration in a panel-like structure. We find that distance remains a deterrent to migration, even in times of cheaper communication and travel, while an English-speaking Mexican experiences a significantly lower cost of relocation than her non-fluent compatriot.

JEL-Codes: F22, F66, J61, J31.

Keywords: Migration Costs, International Labor Mobility, Wage Differentials.

∗Bogatzki: University of Vienna and University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany. Email: [email protected]. Sirries: University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany. E-mail: steff[email protected]. We thank participants at the Göttinger Workshop "Internationale Wirtschaftsbeziehungen" 2016 for helpful comments.

1 Introduction

The aim of this paper is to provide a measure of the cost of migration as the difference in wages representing violations of the Law of One Price (LOOP). LOOP states that prices for the very same good should be equal across space. Given a no-arbitrage condition, spatial price gaps for homogeneous goods are therefore supposed to reflect transaction costs only. Yet the theory's assumptions pose severe challenges, which the literature on international trade that has taken up the concept has met by collecting better data for more precise estimates. The main drawbacks have been a lack of knowledge of the exact location of the origin and the destination point of sale of a certain product, the identity of the goods for which prices in different places are being observed, and potential mark-up pricing (see Atkin and Donaldson, 2015). While there has been extensive research on the convergence of average wages across countries, the former two conditions of the law of one price in particular have so far been difficult to transfer to wage data, which might explain why the term "Law of One Wage" (LOW) has not yet become a household term. Following the work of Engel and Rogers (1996) and Atkin and Donaldson (2015), who gauge trade costs, we use microdata on migrants from 150 Mexican communities for which wage data for both Mexico and the United States (U.S.) are available to estimate the cost of migration from origin to destination as a function of potential cost shifters such as distance, language skills and owning property in Mexico. We hope that our work will not only yield insights into the relative size of the impact of each of the regressors, but also provide a measure of migration costs that incorporates both physical and psychological factors. The remainder of the paper proceeds as follows. Section 2 provides an overview of the related literature. Section 3 introduces the theoretical model, and Section 4 discusses the estimation strategy. Section 5 describes the data, Section 6 presents the results, and Section 7 concludes.


2 Related Literature

To our knowledge, this paper constitutes the first attempt to employ microdata to identify wage differentials across countries as an estimate of migration costs while at the same time solving the problem of selection on unobservables by examining identical laborers. Our approach is heavily influenced by the trade literature that employs LOOP to estimate trade costs from prices, of which Allen and Arkolakis (2014) provide a comprehensive overview. In constructing our own model we lean predominantly on the work of Atkin and Donaldson (2015) and Engel and Rogers (1996), who estimate the determinants of trade costs by looking at price differentials of homogeneous goods.

Yet spatial wage gaps have been explored with regard to migration before: Hanson (2003) and Gandolfi, Halliday, and Robertson (2015) both examine aggregate wage convergence due to both migration and trade liberalization between Mexico and the United States. While Hanson (2003) finds little evidence of wage convergence since the North American Free Trade Agreement (NAFTA) in 1994 despite increased returns to skill in Mexico, Gandolfi, Halliday, and Robertson (2015) detect some convergence across the two countries but agree that a stable wage premium remains. These findings on aggregate wage convergence are relevant to our own approach in two ways: First, they support our assumption that spatial wage gaps persist due to institutional, physical and psychological barriers to migration. Second, Gandolfi, Halliday, and Robertson (2015) indicate structural breaks in the size of the wage gaps between Mexico and the U.S. over time, which we need to revisit because our survey data record earnings over a wide range of years. Alongside investigations of wage convergence at the macro level, wage differentials have been used for modeling optimal location choice and the propensity to migrate (see for instance Kennan and Walker, 2011; Aguayo-Téllez and Martínez-Navarro, 2013; Ortega and Peri, 2013).

All of these models have in common that higher wages at the destination are understood as the primary incentive to migrate and as compensation for both the physical and the psychological expenses of migration. Kennan and Walker (2011) find that interstate migration is substantially influenced by income prospects driven by geographic differences in mean wages. Although our identification strategy is based on the trade literature introduced above, it also follows a widespread tradition in migration theory. We are unable to observe a person working in two locations at the same time. The usual strategy to solve this problem is to match foreign-born workers in the destination country with either observably identical workers who stayed (Clemens, Montenegro, and Pritchett, 2008) or observably identical workers from the destination country (Gandolfi, Halliday, and Robertson, 2015). In both cases one compares migrants with non-migrants. It is likely, however, that migrants differ significantly from non-migrants in factors that are unobserved by the econometrician. In this case we face a selection bias that results from unobservables that are confounded with other observed factors that also influence income.

Note that the aim of this work is not to identify the factors driving selection into migration, but to develop an estimation strategy for migration costs that avoids selection bias. It is apparent that migrants differ on many levels from non-migrants and that migrants are distinct across locations, too. For example, Aguayo-Téllez and Martínez-Navarro (2013) find that single adult male Mexicans with low schooling levels tend to migrate to the U.S., whilst married Mexicans of both genders with higher education prefer internal migration. The relevant difficulty here is that migrants may differ from non-migrants in ways that are unobservable to the researcher and that at the same time influence wage determination, which makes it impossible to base inference on the comparison of wages of different individuals in general. Executing these comparisons on groups of workers that differ in obvious ways, such as one having had the courage to migrate while the other stayed, appears precarious.

The literature has recognized this kind of selection and finds that not controlling for it leads to serious bias. Researchers have come up with various econometric strategies to confront this problem. For instance, Clemens, Montenegro, and Pritchett (2008) propose a selection model to estimate how wage gains depend on a worker's position in the distribution of unobserved wage determinants. Ortega and Peri (2013) allow for unobserved heterogeneity between migrants and non-migrants at destination. These strategies have been invoked mainly because of data shortages, and they are unable to control reliably for the bias due to unobservables. The problem of identifying the very same good across places is also well known in the trade literature we refer to. Atkin and Donaldson (2015) contribute substantial data work by turning to products at the barcode-equivalent level. In the same way, our approach of observing identical workers avoids the selection bias entirely. Akee (2010) uses a very similar strategy to isolate the part of the wage difference between migrants and non-migrants that is actually due to unobservables rather than to the migration decision. He compares domestic wages of individuals from the Federated States of Micronesia who he knows will migrate to the U.S. in the following period with domestic wages of observably identical individuals who will not migrate and finds a positive and significant difference, which stresses the importance of our data work.

In the next section we will outline the theoretical framework that links our

model to the trade literature.

3 Theoretical Framework

In this section we introduce the theoretical framework. In subsection 3.1 we construct the model for estimating migration costs from wages, based on Jevons's Law for factor prices and on examples from the trade literature that use price differentials between locations to identify transportation costs. We describe the process underlying individual wage determination in subsection 3.2 and turn to the resulting endogeneity problem, which is deeply intertwined with self-selection bias, in subsection 4.2. In subsection 3.3 we take up theoretical considerations on the determinants of migration costs.

3.1 "The Law of One Wage" and the Place Premium

LOOP contends that prices for homogeneous goods should be equal in different locations, as otherwise arbitrageurs are incited to buy a good at the cheaper price in one place and sell it for a riskless profit in another place where prices are higher. Arbitrage in turn will increase the supply of the good in the place with the initially higher price and thereby result in a convergence towards the cheaper price (Jevons, 1871). From an international trade economist's perspective, what keeps prices from equalizing through arbitrage are the transaction costs of trading the good. Transferring the trade model to migration, we presuppose that wages equal the marginal product of labor (MPL) and adopt the assumption of iceberg migration costs¹, following for example the work of Aguayo-Téllez and Martínez-Navarro (2013) and Ortega and Peri (2014). We arrive at the following no-arbitrage condition:

$$W_i^{jk} \cdot \delta_i^{jl} = W_i^{jl} \cdot \delta_i^{jk}, \tag{1}$$

where $W_i^{jk}$ is the wage of a worker $i$ from location $j$ in location $k$, and $W_i^{jl}$ are the earnings of this very same worker in location $l$, with $j, k, l \in S$, the set of locations, and $i \in N$, the set of individuals. $\delta_i^{jk}$ depicts worker $i$'s costs of migrating from $j$ to $k$, and $\delta_i^{jl}$ correspondingly the costs of migrating from $j$ to $l$.

The no-arbitrage condition holds only under well-defined assumptions, as discussed by Jevons (1871) for goods in general. We refer directly to the factor labor instead, where violations of LOOP are most persistent and difficult to identify (Persson, 2008). First, workers have to be completely homogeneous, which is why, for instance, Atkin and Donaldson (2015) turn to products at the barcode-equivalent level in their estimation of trade costs. As we observe exactly the same workers in both the country of origin and the destination country over a short time span, we come very close to this first condition. Second, the Law of One Wage does not apply inter-temporally. This poses a problem, since we observe the worker in the two places only at different times. We try to mitigate this problem by adjusting wages for inflation and by running a robustness check that excludes time spans exceeding one year between occupations. Furthermore, we assume that income-affecting unobservables are stable over time and that the marginal effects of an additional year of observables on wages are negligibly small. Third, we assume perfect information concerning wages and job conditions in the United States. As the United States is very close to Mexico and constitutes the main destination for Mexican migrants, and as many Mexicans take several trips, it is very plausible that Mexican migrants are at least reasonably well informed and that average expectations are not systematically off. Fourth, we presuppose rational agents who aim to maximize their utility, proxied by income, given migration costs, which is in line with the mainly economic incentives for migration. Fifth, there is free entry to the market, which implies that Mexicans are unrestricted in entering the U.S. workforce. This assumption is questionable, but its violation can be considered part of the migration costs in the form of an institutional barrier. Sixth and last, we assume that there are no variable mark-ups, so that wages reflect a worker's marginal product.

¹ Iceberg trade costs were first introduced by Samuelson (1954). Transportation costs are modeled as an additional portion of the imported good and therefore have to be 1 in case the good is sold in the same place as it is produced, because no portion of the good is lost on the way. In Appendix A we further consider additive migration costs.

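Returning to our model and rearranging yields

$$\frac{W_i^{jk}}{W_i^{jl}} = \frac{\delta_i^{jk}}{\delta_i^{jl}}. \tag{2}$$

We replace $l$ by $j$ for workers earning wages in their home country, which yields

$$\frac{W_i^{jk}}{W_i^{jj}} = \frac{\delta_i^{jk}}{\delta_i^{jj}}. \tag{3}$$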

Laborers are paid their marginal products which are proportional to migration

costs δ. We model migration costs as a multiplicative factor of the wage

received in the home country, so that for a worker who stays, and who does

not incur any costs of moving to another location, migration costs are 1. In

contrast, it is plausible to expect costs greater than one for migrants, and an

accordingly higher wage at destination. Hence, equation (3) becomes

$$\frac{W_i^{jk}}{W_i^{jj}} = \delta_i^{jk}. \tag{4}$$

Rearranging and replacing j with MX for Mexico, the country of origin ob-

served in the data, and k with US for the United States of America as only

destination to be considered leads to

$$W_i^{MX,US} = \delta_i^{MX,US} \cdot W_i^{MX,MX} \tag{5}$$

and

$$\frac{W_i^{MX,US}}{W_i^{MX,MX}} = \delta_i^{MX,US}. \tag{6}$$

The MPL at destination is the product of the MPL at origin and the costs of migrating, represented by $\delta_i^{MX,US}$. The migration costs depend on a vector of potential cost shifters $\mathbf{x}_i^{MX,US}$. As migration costs are assumed to be positive, they are specified as an exponential function:

$$\delta_i^{MX,US} = \exp\left(\beta_0 + {\mathbf{x}_i^{MX,US}}' \beta_1 + \varepsilon_i^{MX,US}\right). \tag{7}$$

Besides the aforementioned $\mathbf{x}_i^{MX,US}$, $\beta_0$ denotes a constant and $\varepsilon_i^{MX,US}$ an error term with expectation 0 that is uncorrelated with the regressors. The vector of parameters of interest corresponding to the variables included in $\mathbf{x}_i^{MX,US}$ is $\beta_1$. Taking logs, we arrive at the final model

$$\ln W_i^{MX,US} - \ln W_i^{MX,MX} = \beta_0 + {\mathbf{x}_i^{MX,US}}' \beta_1 + \varepsilon_i^{MX,US}. \tag{8}$$

We will refer to the place premium $\delta_i^{MX,US}(\mathbf{x}_i^{MX,US})$ as migration costs for the remainder of this work. Our aim is to estimate the extent to which migration costs can be explained by some particular $x_i^{MX,US}$ included in the vector $\mathbf{x}_i^{MX,US}$, in other words to recover the marginal effects $\partial \delta_i^{MX,US} / \partial x_i^{MX,US}$ of the variables. We discuss determinants of migration costs in detail in subsection 3.3; the regressors are described in section 4.
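As a worked step, the marginal effects mentioned above follow from the exponential specification in equation (7). The following sketch treats one continuous regressor $x_m$ and one dummy regressor $d$; the index $m$ and the symbols $\beta_{1m}$ and $\beta_{1d}$ for the corresponding elements of $\beta_1$ are notation introduced here for illustration only.

```latex
\begin{align*}
\frac{\partial \delta_i^{MX,US}}{\partial x_{m,i}^{MX,US}}
  &= \beta_{1m} \, \delta_i^{MX,US}, \\[4pt]
\frac{\delta_i^{MX,US}\big|_{d=1}}{\delta_i^{MX,US}\big|_{d=0}}
  &= \exp(\beta_{1d}).
\end{align*}
```

Read this way, $\beta_{1m}$ is the proportional change in migration costs per unit of $x_m$, and it is an elasticity when $x_m$ is itself a logarithm, as with log distance. For a dummy, a purely hypothetical coefficient of $-0.1$ would correspond to migration costs that are lower by $1 - \exp(-0.1) \approx 9.5$ percent.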

3.2 Wage determination

Mexicans migrate to the United States for predominantly economic reasons. Even though migrant workers earn comparatively low wages in the United States, these wages still exceed possible earnings in Mexico, which suggests that potential wage gaps proxy the decisive compensation for the expenses of migration. Following Mincer (1958), the wage of an individual worker $i$ in place $k$ is determined by

$$\ln(W_{ik}) = \alpha_0 + z_i' \alpha_1 + \eta_i + \nu_{ik}, \tag{9}$$

where $z_i$ constitutes a vector of observable individual characteristics like education, sex and working experience, the coefficients of which are given by the parameter vector $\alpha_1$. The term $\eta_i$ denotes unobservable traits such as inherent ability and quality of schooling, some of which are correlated with $z_i$. Last, $\nu_{ik}$ is an idiosyncratic error term for the location match with $E(\nu_{ik}) = 0$ and $\mathrm{cov}(z_i, \nu_{ik}) = 0$; $\alpha_0$ is a constant. Each worker knows all of her individual characteristics and the expected value of $\nu_{ik}$.²

² Note that we assume that an individual's wage is determined by her personal characteristics and by location characteristics, and not by the specific occupation. This has the advantage that we are not restricted to migrants who worked in the same occupation before and after migration. This simplifying assumption of the irrelevance of the specific occupation is particularly plausible given that the whole sample covers low-skilled workers. The portion of workers within the sample who maintain the same occupation within the U.S. is 9.18 percent. Also, there is no one-directional change of occupation classes.

3.3 Migration Costs

In the previous subsections we discussed how spatial wage gaps can proxy migration costs and how the wages involved are determined for each individual. We now turn to the factors whose influence on the size of migration costs we would like to investigate.

Despite the links that can be drawn to the trade literature, there are important differences between trade costs and migration costs. Adam Smith observed as early as 1776 that regional differences in commodity prices in Great Britain were much smaller than differences in wages for homogeneous workers between locations. Strangely, though, despite greater opportunities for arbitrage on the labor market, there was far more movement of goods, i.e. trade, than movement of people, i.e. migration. It was evident to him that the costs of migration needed to exceed the gains, which led him to the conclusion that "a man is of all sorts of luggage the most difficult to be transported" (Smith, 1776). Indeed, migration faces barriers that are unknown to trade:

Even though there is no paradigm on the composition of migration costs, the migration literature, where it draws a distinction at all, commonly discriminates between what Clemens, Montenegro, and Pritchett (2008) refer to as natural barriers and what we will call institutional barriers. While the latter summarize restrictions on international movement enacted by governments, natural barriers are more diverse: They encompass both direct (or monetary) costs of relocation and indirect (psychological) costs. Direct costs can be relocation expenses, the sacrifice of pension rights in the home country (Bodvarsson, Simpson, and Sparber, 2015), and the costs of searching for a new job. The latter will be especially relevant if the move aims at an entirely new geographic area. Indirect costs of worker mobility cover the sorrow of leaving loved ones behind and the cutting of community ties. They also include the burden of having to find one's way around in new surroundings, which may explain part of the finding that the average migrant is rather young (Ehrenberg and Smith, 2003). The presence of immigrant networks at the destination, however, may decrease the costs of migration, especially for low-skilled migrants (Beine, Docquier, and Özden, 2011). Sharing the destination country's language or being proficient in it may also contribute to lowering psychological costs (Beine, Bertoli, and Fernández-Huertas Moraga, 2014). An obviously significant role is played by the distance that has to be overcome. Distance can be both an indirect and a direct natural barrier to migration. It determines the time and money that have to be invested in transportation, not only for the move but also for visits home. At the same time, distance increases the burden of being away from family and friends and of acquiring information about the circumstances at the destination.

All in all, it seems plausible to assume that voluntary migration only occurs if its expected benefits are relatively large and cover the costs of migration (Ehrenberg and Smith, 2003), which potentially exceed conventional trade costs.

4 Estimation Strategy

Having put forward the theoretical framework, we now turn to our empirical estimation strategy. In subsection 4.1 we discuss our regression model, which we estimate using Ordinary Least Squares (OLS). We conclude by addressing possible violations of the unbiasedness and consistency of our estimates in subsection 4.2.

4.1 The Regression Model

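We estimate variants of the following model:

$$\begin{aligned} \ln W_i^{MX,US} - \ln W_i^{MX,MX} = {} & \beta_0 + \beta_1 \, \mathit{lndist}_i^{MX,US} + \beta_2 \, \mathit{englishspeaker}_i^{US} + \beta_3 \, \mathit{married}_i^{MX} \\ & + \beta_4 \, \mathit{famusexp}_i^{MX} + \beta_5 \, \mathit{property}_i^{MX} + \varepsilon_i^{MX,US} \end{aligned} \tag{10}$$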

The dependent variable is the migration cost, measured by the log of the wage received during the last U.S. migration minus the log of the wage for the last formal job in Mexico for each individual $i \in n$, the set of all individuals within the sample. $\beta_1$ denotes the coefficient of the log of the distance in kilometers between the origin and the destination of the worker. We expect the sign of $\beta_1$ to be positive, so that migration costs increase with distance. Following the common gravity model and the related work by the trade economists Engel and Rogers (1996), we use the log because we assume the relationship between distance and migration costs to be concave. $\mathit{englishspeaker}_i^{US}$ is a dummy for English proficiency that equals one if the head of household understands at least some English and is also able to speak it, and zero if she is only able to understand some English or even less and is unable to speak it. A certain level of English speaking proficiency should lower migration costs. $\mathit{married}_i^{MX}$, $\mathit{famusexp}_i^{MX}$, and $\mathit{property}_i^{MX}$ also denote dummy variables. They indicate whether the person was married, whether her family had U.S. migration experience, and whether she had any property before her last U.S. migration, respectively. Being married in Mexico might pose an obstacle to migration, as it indicates a strong social network that might have to be left behind. Alternatively, taking a spouse along imposes additional costs for housing, job search and the like. Yet taking a spouse along could also decrease migration costs, because it might come with an additional income and lower psychological costs. Unfortunately, the sample does not allow us to discriminate between migrating as a couple and migrating alone. Though we expect a positive sign for $\beta_3$, there appears to be a plausible explanation for a negative sign, too. Turning to the remaining dummies, we anticipate that a future migrant who has a family member with U.S. migration experience can profit from existing knowledge and already established social networks abroad. Household heads for whom $\mathit{famusexp}_i^{MX}$ is one should therefore have lower migration costs than those who do not have family with migration experience, that is, for whom the dummy is zero. Finally, the direction of the effect of having property in Mexico on migration costs is, in our opinion, as ambiguous as that of marital unions. Whilst a migrant may experience high costs of leaving a property behind because of psychological bonds or a certain financial standing that comes with it, this very financial standing might reduce migration costs because it enables access to more comfortable modes of transportation or job search. Also, having a property to leave to relatives who stay behind, along with a return option, equally points to a negative sign for the slope parameter $\beta_5$.

The set of determinants of migration costs we investigate is limited to what we referred to as natural barriers in subsection 3.3. One might be inclined to incorporate policy indicators, but, however attractive, regressors measuring institutional barriers to migration may not be exogenous but depend on migration flows and the economic conditions in either country (Ortega and Peri, 2014). The issue of endogeneity leads directly to the following subsection, where we discuss the assumptions underlying our estimations.
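To make the mechanics of equation (10) concrete, the following is a minimal sketch that simulates a log wage gap from the five regressors and recovers the coefficients by OLS. Everything numerical in it is an assumption made for illustration: the coefficient values in `TRUE_BETA`, the distance range, the dummy shares and the noise level are hypothetical and are not estimates or data from the MMP. The helper names `ols` and `simulate` are likewise introduced here only for this example.

```python
import math
import random

NAMES = ["const", "lndist", "englishspeaker", "married", "famusexp", "property"]
# Hypothetical coefficients, chosen only for illustration (not results from the paper).
TRUE_BETA = [0.5, 0.15, -0.20, 0.05, -0.10, -0.05]


def ols(X, y):
    """Return OLS coefficients by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [[sum(r[a] * r[b] for r in X) for b in range(k)]
         + [sum(r[a] * yi for r, yi in zip(X, y))] for a in range(k)]
    for col in range(k):
        # Partial pivoting; a zero pivot signals perfect collinearity.
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[piv][col]) < 1e-12:
            raise ValueError("X'X is singular: perfect collinearity")
        A[col], A[piv] = A[piv], A[col]
        A[col] = [v / A[col][col] for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [vr - f * vc for vr, vc in zip(A[r], A[col])]
    return [A[r][k] for r in range(k)]


def simulate(n, beta, sigma, seed=0):
    """Draw synthetic regressors and a log wage gap ln W(US) - ln W(MX)."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        row = [1.0,
               math.log(rng.uniform(500, 4000)),  # log distance in km (assumed range)
               float(rng.random() < 0.3),          # englishspeaker dummy
               float(rng.random() < 0.6),          # married dummy
               float(rng.random() < 0.5),          # famusexp dummy
               float(rng.random() < 0.4)]          # property dummy
        gap = sum(b * x for b, x in zip(beta, row)) + rng.gauss(0, sigma)
        X.append(row)
        y.append(gap)
    return X, y


X, y = simulate(n=2000, beta=TRUE_BETA, sigma=0.3)
beta_hat = ols(X, y)
for name, b in zip(NAMES, beta_hat):
    print(f"{name:>15s}: {b: .3f}")
```

Solving the normal equations by hand keeps the sketch free of dependencies; in applied work one would normally rely on an established regression routine instead. The `ValueError` raised for a singular `X'X` corresponds to the no-perfect-collinearity requirement of OLS.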

4.2 Ordinary Least Squares Assumptions

We estimate the regression model stated in subsection 4.1 using Ordinary Least Squares. Under the assumptions of, first, linearity in parameters, second, random sampling, third, no perfect collinearity, and fourth, zero conditional mean, $E(\hat{\beta}_j) = \beta_j$, $j = 0, 1, \dots, k$, for any values of the population parameter $\beta_j$; that is, the estimated parameters are unbiased and consistent (Wooldridge, 2008).

Regarding the first assumption, theory does not give us any clear guidance on the functional form of the migration cost specification. We therefore lean on the trade literature, where the standard specification is of the form $\exp(x'\beta)$, which results in a model that is linear in parameters after taking logs. The third assumption can easily be tested by checking whether there is variation in the variables. Standard statistical software will drop variables that are linear combinations of other variables, because otherwise there is no unique solution for the parameter vector. Assumptions two and especially four are of particular relevance for our work, as we discuss in the following passages.

Divergences from Simple Random Sampling

Even though we provide a detailed description of the data in section 5, we briefly present the underlying sampling process here. With an initial focus on Western Mexico, the Mexican Migration Project (MMP) today samples communities from all over the country. In a simple random sample every subject in the underlying population has an equal probability of being sampled. In practice, however, surveys diverge from simple random sampling in order to save costs and to render estimates for the subgroup of interest more precise (Cameron and Trivedi, 2005). Accordingly, the MMP does not select communities at random but on the basis of anthropological methods, paying special attention to migration flows; that is, the data are not intended to be representative of the entire Mexican population. Yet the aim is not to select communities where migration flows are particularly dense, but to select communities with any occurrence of migration in areas of different levels of urbanization. Once a city, town or village has been selected, the exact survey site, which is what the MMP calls the community and which can be any kind of geographically distinct part, is determined by the fieldwork supervisors. From these communities of at least 1,200 listed dwellings, 200 households are randomly selected. The survey workers also intend to interview at least some migrants from the community who settled in the U.S.

Since households within a community are sampled at random, the MMP does not oversample specific population groups on purpose. Still, the survey provides weights for communities and U.S. samples.³ Applying weights results in data representative of the area constituted by all sampling frames together, which, as mentioned earlier, is not the whole of Mexico (MMP, 2015).

³ Sample weights are calculated as the inverse of the sampling fraction, which in the case of the Mexican communities is the number of interviewed households divided by the estimated number of eligible households in the sampling frame, i.e. all dwellings within a survey site. A detailed reconstruction of the calculation of the sample weights is beyond the scope of this work but can be retraced in the Appendices to the MMP data.
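The weight definition in the footnote can be illustrated with a short sketch. The community labels, household counts and mean wage gaps below are hypothetical and serve only to show the arithmetic; the function names `sample_weight` and `weighted_mean` are introduced here and do not refer to any MMP tooling.

```python
def sample_weight(interviewed_households, eligible_households):
    """Inverse of the sampling fraction (interviewed / eligible)."""
    if interviewed_households <= 0 or eligible_households <= 0:
        raise ValueError("household counts must be positive")
    return eligible_households / interviewed_households


def weighted_mean(values, weights):
    """Weighted average of values with the given weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Hypothetical communities: (interviewed households, eligible households in the frame).
communities = {"A": (200, 1200), "B": (200, 3000)}
weights = {name: sample_weight(*counts) for name, counts in communities.items()}
print(weights)  # {'A': 6.0, 'B': 15.0}

# Hypothetical community-level mean log wage gaps, combined using the weights.
mean_gap = weighted_mean([1.0, 2.0], [weights["A"], weights["B"]])
print(round(mean_gap, 3))  # 1.714
```

Each interviewed household in community B stands for more eligible households than one in community A, so B receives the larger weight in any population-level average.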


Is there a need to employ weights, then? If our aim were to describe and predict the behavior of the underlying population, yes. Our goal, however, is the estimation of a causal model. In that case, divergences from simple random sampling are problematic only if stratification is on the dependent variable (Cameron and Trivedi, 2005). The MMP focuses on migrants and diverges from random sampling at the community level. It does not, however, intentionally select households by place premiums. A contrary example would be a study that aims to model causal effects on income and oversamples people with small incomes. Divergence from random sampling therefore does not bias our estimates. Yet, as interviewed households are clustered in small geographical areas, i.e. communities, it is likely that errors are no longer independently distributed across these observations and that standard errors may be underestimated. As long as within-cluster unobservables are uncorrelated with the regressors, only the variances of the regression parameters need to be adjusted. We thus cluster standard errors by community. We now turn to potential problems with endogeneity.
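The effect of clustering on standard errors can be sketched for the simplest case of one regressor plus a constant. The example below computes the slope, a heteroskedasticity-robust standard error and a cluster-robust standard error from the usual sandwich formula, without any small-sample correction. The data-generating process is an assumption made for illustration: a regressor that varies only across communities and a common community shock, with hypothetical parameter values throughout.

```python
import math
import random
from collections import defaultdict


def slope_with_se(x, y, cluster):
    """Slope of y on x (with intercept), robust SE and cluster-robust SE."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    intercept = y_bar - slope * x_bar
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    # Robust variance: every observation is treated as its own cluster.
    var_robust = sum(((xi - x_bar) * ei) ** 2 for xi, ei in zip(x, resid)) / sxx ** 2
    # Cluster-robust variance: sum the scores within a cluster before squaring.
    scores = defaultdict(float)
    for xi, ei, g in zip(x, resid, cluster):
        scores[g] += (xi - x_bar) * ei
    var_cluster = sum(s ** 2 for s in scores.values()) / sxx ** 2
    return slope, math.sqrt(var_robust), math.sqrt(var_cluster)


def simulate(n_clusters=50, per_cluster=20, seed=0):
    """Hypothetical data with a community-level regressor and a community shock."""
    rng = random.Random(seed)
    x, y, g = [], [], []
    for c in range(n_clusters):
        x_c = rng.gauss(0, 1)  # regressor varies only across communities
        u_c = rng.gauss(0, 1)  # shock shared by all households in the community
        for _ in range(per_cluster):
            x.append(x_c)
            y.append(1.0 + 0.5 * x_c + u_c + rng.gauss(0, 1))
            g.append(c)
    return x, y, g


x, y, g = simulate()
slope, se_robust, se_cluster = slope_with_se(x, y, g)
print(f"slope={slope:.3f}  robust SE={se_robust:.3f}  clustered SE={se_cluster:.3f}")
```

With a shared community shock, the clustered standard error comes out larger than the robust one, which is the underestimation the text warns about. Statistical packages typically add a finite-sample adjustment that this sketch omits.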

Tackling Endogeneity: Selection Bias Through Unobservables and Time Trends

As mentioned earlier, a major difficulty in the empirical implementation of the theoretical framework presented in section 3 lies in the identification of the dependent variable. We take migration costs to be identified by the spatial wage gap, given that the no-arbitrage condition posed by LOW holds. Remember, though, that it is impossible to observe the very same worker in several locations at the same time. We believe that there are two approaches to address this empirical caveat:

First, one may want to compare observably identical workers in the two locations, for instance from different cross-sections of the same year. Recall from section 2 that employing various matching algorithms has been a widely used practice among researchers focusing on place premiums, although it is accompanied by an acknowledged drawback. Recall the wage determination equation (9) from subsection 3.2. We assume that workers are paid their marginal products and that variations in earnings are governed by actual differences between individuals. These differences can be either observed or unobserved by the statistician. Examples of unobserved characteristics are personality, motivation, ability or the quality of schooling. Firms recognize these differences in productivity among individuals and adjust wages accordingly, even though they remain in the dark for the observer (Johnson, 1977). A matching based on observables only will therefore be faulty, and wage differentials will reflect not only the place premium under investigation but also differences in productivity between individuals due to unobservables. For equation (9) this means that $\eta_i$, i.e. income-influencing factors that cannot be observed, and the stochastic error term $\nu_{ik}$ are mingled together even though they need to be kept conceptually apart. While the latter is unknown to the person, she will have a comprehensive idea of $\eta_i$. Again, variance in the earnings of people with the same observables $z_i$ may originate from either $\eta_i$ or $\nu_{ik}$.

The second course of action, the one that we pursue, is to contrast wages

of the same workers in the two locations, recorded for the years before and

after migration. As a consequence, there is no selection bias of the kind that arises

when migrants systematically diverge from non-migrants in unobservable

wage determinants, as would be the case for matching on observables. Yet,

we need to bite the bullet and assume that the portion of the wage gap

due to an additional year of life is negligibly small. Note that the time

difference may in fact be even smaller than a year, as the length of the time

period is preset by the survey and labor force information is ordered in years

(see 5.1 for more information). Also, even single cross-sections are in reality

not administered at one point in time, so that we expect potential variation

from the panel-like structure we deal with to be at a similar and insignificant

level. Unfortunately, for some individuals the difference in years available from

the data exceeds one year. We hence check the robustness of our results

by restricting the sample to observations with only one year in between the

two observations.

A more serious caveat is the fact that year-pairs for individuals extend

over a long time horizon, namely from 1959 to 2012. Over more than 50 years

it is likely that, through labor market integration and changes in migration

policy, there is a time trend conflated with the size of the wage gap across

individuals. In addition, the size of the marginal effects of the regressors in

equation (10) may vary over time. Plotting wage differentials against year of

last U.S. migration indicates an increase of migration costs over time until the

1990s. From 1990 onwards the place premium remains at a comparatively stable

level, which may plausibly be explained by the introduction of NAFTA in 1994,

the most recent change in migration policy and trade liberalization. The time

trend is illustrated in figures 1 and 2. We address the endogeneity problem

in various ways: First, we restrict the sample to migrations from 1994 onwards. Second, to

avoid the loss of observations and capture the time effect, we add further

controls. As the sample is small, we refrain from including dummies for each

year or even year-pair and resort to dummies for the following periods: 1959 to

1989, 1990 to 1999, and 2000 to 2012. Third, we interact these period dummies

with the regressor for distance to capture time-varying effects.[4]
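The period controls and the distance interaction described above can be sketched as follows. This is a minimal illustration in Python, not the code used for the estimates: the function and variable names are hypothetical, and treating 1959 to 1989 as the omitted reference period is an assumption made only for the example.

```python
# Illustrative sketch only: names and the reference period are assumptions.

PERIODS = {
    "p1959_1989": (1959, 1989),
    "p1990_1999": (1990, 1999),
    "p2000_2012": (2000, 2012),
}
REFERENCE = "p1959_1989"  # assumed omitted category


def period_dummies(year):
    """Return 0/1 indicators for the periods that are not the reference."""
    if not 1959 <= year <= 2012:
        raise ValueError(f"year {year} lies outside the sample horizon")
    return {
        name: int(lo <= year <= hi)
        for name, (lo, hi) in PERIODS.items()
        if name != REFERENCE
    }


def design_row(year, lndist):
    """Build log distance, period dummies and their interactions."""
    row = {"lndist": lndist}
    for name, value in period_dummies(year).items():
        row[name] = value
        row[f"lndist_x_{name}"] = value * lndist
    return row
```

With three coarse periods instead of one dummy per year, only four additional regressors enter the model, which is the motivation given above for a small sample.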

Estimates are reported in section 6. In the following section we turn to the

underlying microdata.

5 Data

The subsequent paragraphs deal with the construction of the sample that we

will refer to as MIG_final and that constitutes the basis for estimation of

the empirical model presented in section 4. Subsection 5.1 introduces the

underlying micro-database and pays special attention to the identification of

the last domestic wage received in Mexico, which proves essential for the calculation

of the dependent variable. Furthermore, in subsection 5.2 we explain the role of

supplementary data for making wages comparable across time and countries

and for generating explanatory variables. We complete the section with a

descriptive summary of MIG_final in subsection 5.3.

[4] To be consistent, one would proceed by interacting the remaining regressors as well. Within the scope of this paper, we focus on the distance effect.

Figure 1: Migration Costs Across Time, Entire Sample

Figure 2: Migration Costs Across Time After NAFTA

5.1 The MMP150

The theoretical model introduced in section 3 demands wage comparisons for

workers as homogeneous as possible. The ideal micro-data would therefore

cover wage rates of the very same Mexican migrant directly before her last

trip to the U.S. and after her arrival at the destination. In other words, we intend

to construct a panel including wage, labor force and migratory information for

the period directly before and during the last U.S. migration for each worker.

The sample we access is selected from a pooled cross-section of migrants based

on the MMP150,[5] survey data collected in 150 Mexican communities, released

in April 2015 and freely available at mmp.opr.princeton.edu.[6] The Mexican

Migration Project is a collaborative research project based at Princeton

University and the University of Guadalajara. The database is constructed by

randomly sampling households in communities located throughout Mexico. As

the project focuses on gathering migration information, interviews take place

during the winter months when seasonal migrants tend to return home. The

interviewers collect social, demographic, and economic information on each

household and its members, including general information on each person's

first and last trip to the U.S. In addition, a year-by-year labor history including

migration information is compiled for household heads and spouses. If the

household head is a migrant, further questions concerning the last migration

experience in the U.S. are asked, focusing on employment, earnings, and use

of U.S. social services.

[5] MMP (2015)

[6] The corresponding survey questionnaire can be downloaded from the same address.

Following completion of the Mexican surveys, interviewers travel to destination areas in the United States to administer identical

questionnaires to migrants from the same communities sampled in Mexico who

have settled north of the border and no longer return home. These surveys

are combined with those conducted in Mexico to generate a representative binational

sample (MMP, 2015). Despite their overall impressive sizes, the way

the MMP databases are constructed significantly reduces the extent of the sample eligible

for our purposes. In the following subsections we will give a detailed

description of the two data files employed for the construction of our

dependent variable, the individual-level wage differential between Mexico and

the U.S., and the construction process itself.

5.1.1 MIG150

The MIG150 person-level file lists detailed information about the 8,052 household

heads with migration experience to the U.S. among all persons surveyed. Note,

though, that the investigated population is not intended to be representative

of the whole of Mexico and is therefore not suitable for detecting the

nature of self-selection of migrants by comparison with non-migrants. To

investigate the causal impact of distinct factors on the extent of migration

costs, that is, the required compensation, we focus on migrants who actually

incurred these expenses. MIG150 contributes a number of variables of interest

for our regression analysis. Besides documenting background information on

the household heads at the time of the survey, which we mainly rely on for unique

identification, MIG150 contains information on the last migration to the U.S.

Many migrants within the survey look back on several migration trips, which

calls for more sophisticated, dynamic models. Given the data at hand, however,

it is only possible to identify Mexican wages for the last occupation before

the most recent U.S. migration. We tackle the identification process in subsection

5.1.3 in more detail. For now, note that even though MIG150 also captures

information on the first trip to the U.S. and many individuals are return migrants,

we focus on the last U.S. migration and household heads who have not

returned from it at the time of the survey, which shrinks the sample tremendously.

Regarding the last U.S. migration, we make use of the reported wage

and employment characteristics of the migrant household head for both the

last formal job in Mexico before the trip and the job held during the trip. Furthermore,

we are interested in the level of English proficiency, participation in sports

or social activities at the destination as indicators for having built new social

networks, additional financial characteristics that may constitute benefits

only possible through migration, such as average monthly remittances and savings,

and whether the migrant received any social welfare payments. Note that the

MIG150 file is not informative about individual circumstances before the last

U.S. migration that potentially influence the migration decision. Fortunately,

the information in question can be recovered from LIFE.

5.1.2 LIFE

LIFE is an event-history file for each household head from the year of birth

until the survey year, which we reduce to those with migration experience to

the U.S. From these we keep only the last person-year available before the

year of the last U.S. migration. Alongside identifier information, LIFE is built

using various time-specific variables. For the isolated year before the last U.S.

migration, we therefore take from LIFE information on participation in the

labor force (there are no wage indicators though), marital unions, family composition,

U.S. migration experience among family of origin, and property and

business holdings in Mexico. Even though LIFE itself does not contain time-specific

wage information, the file is essential for integrating the last domestic

wage values from MIG150, as we will explain in the following subsection.

5.1.3 Identifying the last domestic wage

The MIG150 data captures detailed information on the migratory experience

of all household heads in the survey who ever migrated to the U.S. While the

file includes the last domestic wage and the corresponding unit, two problems

arise. First, many individuals in the sample are returnees. In this case "last

domestic wage" may either equal the wage currently received in Mexico or

reflect the last wage received in Mexico before retirement or layoff and is, most

importantly, not a wage received before migrating to the U.S. We therefore

kept only household heads who are reported to still reside in the U.S. Second,

while the year of the last U.S. migration is known, from MIG150 one can

infer neither the year of the last domestic wage nor the occupation worked in

or the job place. We thus resorted to the LIFE file. Recall that LIFE compiles

a detailed labor history of altogether 380,302 person-years that correspond to

persons with U.S. migration experience. Note, however, that LIFE does not

encompass time-dependent wage data. To identify the last domestic wage

from the labor history for the remaining household heads in MIG150 who were

residing in the U.S. at the time of the survey, we isolated the latest person-year

available before the year of the last U.S. migration for each individual.

Then we merged the information connected to the remaining person-years with

the household heads left in MIG150 using unique identifiers we constructed

from time-constant background information available in both files. The newly

constructed MIG150 sample now comprises household heads who migrated to

the U.S. and are still there and, for each of them, their last domestic wage, wage

units, wage on last U.S. trip, years, places and codes for both occupations, and

additional information on personal life and migratory experience.
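The selection and merge steps described in this subsection can be sketched as follows. The snippet is a hedged illustration in plain Python, not the procedure actually run on the MMP150 files: all field names (`person_id`, `year`, `last_mig_year`, `in_us_at_survey`, `occupation`, `state`) are hypothetical and do not correspond to actual MMP150 variable names, and the unique identifier is assumed to exist already.

```python
# Illustrative sketch only: field names are hypothetical, not MMP150 variables.

def last_person_year_before(life_rows, last_mig_year_by_id):
    """For each person keep the latest LIFE person-year strictly before
    the year of the last U.S. migration."""
    latest = {}
    for row in life_rows:
        pid = row["person_id"]
        mig_year = last_mig_year_by_id.get(pid)
        if mig_year is None or row["year"] >= mig_year:
            continue
        if pid not in latest or row["year"] > latest[pid]["year"]:
            latest[pid] = row
    return latest


def merge_mig_with_life(mig_rows, life_rows):
    """Attach the selected LIFE person-year to each MIG household head who
    still resides in the U.S.; heads without a match are dropped."""
    stayers = [m for m in mig_rows if m["in_us_at_survey"]]
    mig_year = {m["person_id"]: m["last_mig_year"] for m in stayers}
    latest = last_person_year_before(life_rows, mig_year)
    merged = []
    for m in stayers:
        life = latest.get(m["person_id"])
        if life is not None:
            merged.append({
                **m,
                "last_domestic_year": life["year"],
                "occupation_mx": life["occupation"],
                "state_mx": life["state"],
            })
    return merged
```

The merged record then carries the year, occupation and place of the last domestic job next to the wage information stored in the migrant file.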

5.2 Supplementary Data

Driving Distance and Traveltime

Once job places for each individual in both countries were available, we manually

added two measures for the distance between locations employing maps.google.de.[7]

We measure the distance in kilometers for the shortest driving connection and,

to check the robustness of results, the travel time in hours without traffic. Still,

the precision of the available job data differs across individuals: For a substantial

number of individuals, only the states but not the precise cities of occupation

are available. The lack of distinct location data is especially severe for job

places in Mexico, where about 80 percent only reported states and not cities,

while this is true for only a fifth of U.S. locations. We considered different

options to proxy the unknown job places: first, the geographical center of a

state (which does not work well for coastal states), second, the city with the highest

population (which may not be stable across person-years of interest) and third,

the state capital. As the first two options do not exhibit clear advantages in comparison

to the capital and the second option might often even coincide with it, we take the

capital of the job state as a proxy for the missing job place.

[7] Google Maps (2015)

Making Wages Comparable Across Time and Countries

As differences in wages are likely to be driven by varying price levels not only

between Mexico and the United States, but also across years before and after

migration for each individual and the cross-section as a whole, we adjust wages

for inflation and purchasing power. To make wages comparable across time and

countries we transformed nominal wages to real wages in 2010 U.S. dollars via

two separate procedures following Gandolfi, Halliday, and Robertson (2015).

For the first approach we converted Mexican pesos to U.S. dollars using the

nominal exchange rate of the year of the last domestic wage. We retrieved historical

nominal exchange rates from different sources: We merged the indicator

from a longitudinal supplementary file of the MMP150 called NATLYEAR for

the years 1965 to 2012.[8] NATLYEAR itself is built using external sources such

as the Mexican census and statistical yearbooks from the U.S. Department of

Homeland Security. For the years 1959 and 1961, which are not covered by

NATLYEAR, we added the missing exchange rate manually. This is possible

[8] In the Codebook for NATLYR it is reported that the exchange rate is given in dollars per peso. Comparing the exchange rates with the tables available via the Banco de Mexico, which report the exchange rate in pesos per dollar, and looking at the resulting values for wages in U.S. dollars, there appears to be a mistake in the Codebook. In fact, the exchange rates in NATLYR are given in pesos per dollar.

because from April 19, 1954, to August 31, 1975, Mexico had a fixed peso-dollar

exchange rate of 0.0125 (in NATLYEAR rounded to 0.013) pesos per

dollar (Banco de Mexico). Subsequently, we deflated all nominal dollar values,

including wages during the last U.S. migration, to 2010 dollars using the national

Consumer Price Index (CPI) for urban wage earners and clerical workers from

the U.S. Bureau of Labor Statistics.[9] However, the mentioned U.S. CPI values

are stated for the base year 1967 and needed to be rebased to 2010. Moreover,

as CPIs are reported on a monthly basis, we calculated annual averages for each

year before deflating.
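The averaging and rebasing steps mentioned above can be illustrated with a short sketch. It is not the authors' code: the input layout (a mapping from `(year, month)` to an index value) is an assumption, and any numbers fed to it in examples are made up rather than actual BLS figures.

```python
# Illustrative sketch only: input layout and function names are assumptions.

def annual_averages(monthly_cpi):
    """Average monthly index values within each calendar year."""
    sums, counts = {}, {}
    for (year, _month), value in monthly_cpi.items():
        sums[year] = sums.get(year, 0.0) + value
        counts[year] = counts.get(year, 0) + 1
    return {year: sums[year] / counts[year] for year in sums}


def rebase(annual_cpi, base_year=2010):
    """Rescale an annual index so that it equals 100 in base_year."""
    base = annual_cpi[base_year]
    return {year: 100.0 * value / base for year, value in annual_cpi.items()}


def deflate(nominal, year, cpi_rebased):
    """Express a nominal amount from `year` in base-year prices."""
    return nominal * 100.0 / cpi_rebased[year]
```

Rebasing only rescales the index, so deflated values are unaffected by whether the original series was stated relative to 1967 or any other year.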

The second method yields very similar values and proceeds as follows: we

first deflated nominal pesos using historical CPI data for Mexico available

from the Instituto Nacional de Estadística y Geografía (INEGI)[10] (we again

calculated annual averages) and nominal dollars using the annual average U.S.

CPIs with base year 2010 calculated before. We then converted the 2010

pesos to 2010 dollars by applying the 2010 nominal exchange rate. While

the second procedure is conceptually preferable because it better accounts for differing consumer

baskets depending on the country of residence, we take the first procedure as

our default option. Procedure number one earns its priority status because

it minimizes data loss, as the INEGI data only comprise CPIs starting from

the year 1969, while our sample contains years beginning as early as 1965. In

addition, real wages calculated from the two procedures are similar.
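The two conversion procedures can be written down compactly. The following is a minimal sketch under stated assumptions, not the original implementation: function names and inputs are hypothetical, exchange rates are taken to be in pesos per dollar, and both CPI series are assumed to be rebased to 100 in 2010.

```python
# Illustrative sketch only: all inputs are hypothetical. Exchange rates are
# in pesos per dollar; both CPI series are assumed to equal 100 in 2010.

def real_usd_procedure_one(wage_pesos, year, fx_pesos_per_usd, us_cpi):
    """Convert at the exchange rate of the wage year, then deflate with
    the U.S. CPI."""
    nominal_usd = wage_pesos / fx_pesos_per_usd[year]
    return nominal_usd * 100.0 / us_cpi[year]


def real_usd_procedure_two(wage_pesos, year, fx_pesos_per_usd, mx_cpi):
    """Deflate with the Mexican CPI, then convert at the 2010 exchange
    rate."""
    pesos_2010 = wage_pesos * 100.0 / mx_cpi[year]
    return pesos_2010 / fx_pesos_per_usd[2010]
```

The two functions return the same value whenever the change in the exchange rate between the wage year and 2010 equals the ratio of Mexican to U.S. inflation over that span; deviations from that condition are what make the two dependent variables differ.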

Unifying Wage Units and Calculating the Dependent Variable

To calculate the dependent variable $\ln(W_i^{MX,US}/W_i^{MX,MX})$, wage units across

countries are required to be consistent for each individual within the sample.

While MIG150 comprises not only hourly wages for last U.S. migration, but

also usual hours worked per week and months worked per year, last domestic

wages are reported in a variety of units across individuals without additional

[9] BLS (2015)

[10] INEGI (2015b)

information on working hours. Therefore, we generated a new variable for U.S.

wages with U.S. units adjusted to those reported for wages in Mexico. Where

the Mexican wage unit was the hourly wage, we retained the U.S. hourly wage at

hand. For all other wage units we employed the available information on usual

time worked per period, aided by intuitive assumptions like five working days

per week and 13/3 weeks per month. Yet, for 30 observations hourly wages in

the U.S. were missing. Where U.S. wages were instead available in other units,

in a variable much like the one for the last domestic wage, we used these to

fill in the missing observations, following the same adjustment procedure for

both wages. Finally, we converted all calculated wages to 2010 U.S. dollars as

described above and calculated ratios.[11] Still, wage ratios were unrealistically

high for some observations, indicating mismeasured data. Consequently, we

decided to apply a correction and replace wages with ratios greater than

50 in either direction with missing values before calculating the final

wage ratio by dividing U.S. wages by last domestic wages (both in 2010 U.S.

dollars) and taking the log.
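The unit adjustment and the construction of the dependent variable can be sketched as follows. This is an illustration under assumptions, not the original code: the unit labels and the exact conversion rules are hypothetical and are built only from the conventions stated above (five working days per week, 13/3 weeks per month, ratios above 50 in either direction set to missing).

```python
import math

# Illustrative sketch only: unit labels and conversion rules are assumptions.

DAYS_PER_WEEK = 5
WEEKS_PER_MONTH = 13 / 3


def us_wage_in_mexican_unit(us_hourly, hours_per_week, unit):
    """Express the U.S. hourly wage in the unit of the Mexican wage."""
    weekly = us_hourly * hours_per_week
    if unit == "hour":
        return us_hourly
    if unit == "day":
        return weekly / DAYS_PER_WEEK
    if unit == "week":
        return weekly
    if unit == "month":
        return weekly * WEEKS_PER_MONTH
    raise ValueError(f"unknown wage unit: {unit}")


def log_wage_ratio(us_wage, mx_wage, max_ratio=50.0):
    """Return ln(us_wage / mx_wage), or None if the ratio exceeds
    max_ratio in either direction (treated as mismeasured)."""
    ratio = us_wage / mx_wage
    if ratio > max_ratio or ratio < 1.0 / max_ratio:
        return None
    return math.log(ratio)
```

Because both wages are expressed in the same unit for a given individual, the unit cancels in the ratio, which is why consistency is only required within and not across individuals for this dependent variable.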

[11] For the model incorporating additive migration costs described in Appendix A, wage units need to be consistent not only for each individual but also across all individuals within the sample, as the dependent variable in this case is the log of the wage difference (and not the log of the fraction). Therefore, we additionally unified wage ratios to hourly wages for the entire sample. To achieve unification, it was inevitable to assume that historical average weekly working hours created an appropriate proxy for actual weekly working hours of Mexican workers in Mexico because, as mentioned before, this kind of information was not available for last domestic wages. We incorporated average weekly working hours from different sources: The "Encuesta Anual de Trabajo y Salarios Industriale" published by INEGI (2015a) provides data for the years 1940 to 1985. The years 1995 to 2011 are available from the OECD (2015). Unfortunately, there is no data for the years 1986 to 1994. As variation appears to be limited in general though, we filled the gaps with the calculated mean of the adjacent years 1985 and 1995. As described above, there are missing U.S. hourly wages, too. Furthermore, when hourly wages were missing, so was data on working hours. If U.S. wages in other units were accessible, we therefore recalculated these to fill in the gaps using historical average weekly working hours for the U.S. available from Gallup (2012). We equally applied the introduced correction factor to exclude unrealistically high wage ratios greater than 50. To conclude, we also estimated the model described in section 3.1 employing the ratio of the newly approximated hourly wages as the dependent variable. See Appendix A for the results.

The Restricted Sample

In order to test whether our results are driven by violations of the assumptions

required for the no-arbitrage condition (see section 3.1), by structural

breaks in general wage convergence as investigated for instance by Gandolfi,

Halliday, and Robertson (2015) and Hanson (2003), or by unrealistically high and

therefore mismeasured wages, we imposed strict restrictions on the sample discussed

so far. First, we drop all persons for whom more than one

year lies between reported wages. Second, we exclude those with Mexican hourly wages

greater than 30 U.S. dollars (in 2010 dollars). Last, we only consider U.S. migrations

from 1994 (the year NAFTA entered into force) onwards.

The preceding paragraphs describe the construction of the sample that we

subsequently refer to as MIG_final

and that constitutes the basis for estimation of the empirical model presented

in section 4. Table 1 depicts a summary of the process.

Table 1: Sample Selection

                                                          Respondents
MIG: Household heads that migrated to the U.S.                  8,052
Currently not on U.S. migration                                -6,324
Last domestic wage not reported                                -1,352
Wage on last U.S. migration not reported                         -117
Occupation on last U.S. migration unknown                          -1
State of last U.S. migration unknown                              -29
State of occupation before last U.S. migration unknown             -4
Unrealistically high wage ratios*                                 -18
MIG_final                                                         207
More than one year between observations per person                -35
Mexican hourly wage >30 2010 U.S. dollar                          -17
Year of last U.S. migration before 1994                           -30
Restricted Sample                                                 125

Notes: *Unrealistically high wage ratios were defined as wage ratios >50 in both

directions. We replaced the corresponding wages with missing values. Source: MMP150.
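The bookkeeping in Table 1 can be verified with a few lines of arithmetic. The counts below are copied from the table; the function name and the abbreviated step labels are introduced only for this illustration.

```python
# Counts are taken from Table 1; labels are abbreviated for illustration.

STEPS_TO_MIG_FINAL = [
    ("Currently not on U.S. migration", 6324),
    ("Last domestic wage not reported", 1352),
    ("Wage on last U.S. migration not reported", 117),
    ("Occupation on last U.S. migration unknown", 1),
    ("State of last U.S. migration unknown", 29),
    ("State of occupation before last U.S. migration unknown", 4),
    ("Unrealistically high wage ratios", 18),
]

STEPS_TO_RESTRICTED = [
    ("More than one year between observations", 35),
    ("Mexican hourly wage above 30 (2010 U.S. dollars)", 17),
    ("Year of last U.S. migration before 1994", 30),
]


def apply_exclusions(start, exclusions):
    """Subtract each exclusion count in turn and return what remains."""
    remaining = start
    for _label, dropped in exclusions:
        remaining -= dropped
    return remaining


mig_final = apply_exclusions(8052, STEPS_TO_MIG_FINAL)
restricted = apply_exclusions(mig_final, STEPS_TO_RESTRICTED)
```

Starting from the 8,052 migrant household heads, the exclusions reproduce the 207 observations in MIG_final and the 125 observations in the restricted sample.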

5.3 Descriptive Statistics

In this subsection we present descriptive statistics of the sample we constructed.

Table 2: Representativeness MIG_final I

Mexican Communities: Level of Urbanization

                                PERS                   MIG                MIG_final
Level of Urbanization    Frequency   Share     Frequency   Share     Frequency   Share
Ranchos                     34,993    0.22         2,205    0.27            47    0.23
Pueblos                     49,797    0.32         2,645    0.33            57    0.28
Mid-sized cities            40,502    0.26         2,154    0.27            62    0.30
Metropolitan                32,587    0.21         1,048    0.13            41    0.20
Total                      157,879    1.00         8,052    1.00           207    1.00

Notes: Levels of urbanization have been adopted from the MMP150 community selection process as described under http://mmp.opr.princeton.edu/research/selectingcommunities-en.aspx (Accessed 25th of August 2015). Ranchos: < 2,500 inhabitants. Pueblos: 2,500 ≤ inhabitants < 10,000. Mid-sized cities: 10,000 ≤ inhabitants < 100,000. Metropolitan: ≥ 100,000 inhabitants. Where population growth crossed levels over time, we assigned levels according to number of inhabitants during the year closest to the survey year. Sources: MMP Codebooks (1 - 150), Appendix A - Sample Information (MMP150); MMP150 database

First, however, consider tables 2 and 3, where we investigate the

representativeness of MIG_final with regard to the initial samples available

from the MMP150. Table 2 shows the level of urbanization of the communities

represented in the sample. PERS encompasses all persons surveyed by the

MMP150, migrants and non-migrants. One can see that about 80 percent of

the persons within the survey come from places with fewer than 100,000 inhabitants.

Turning to MIG, which reduces the PERS file to household heads with

migration experience in the U.S., the portion becomes as high as 87 percent,

indicating that there is a selection of migrants with regard to rural communities.

MIG_final, in turn, overrepresents migrants from metropolitan regions

relative to MIG by 7 percentage points. Unfortunately, it is difficult to extract

summary statistics for all variables included in the regressions with regard to

the underlying supersamples because many variables have been created from

the merger with LIFE and are not included in MIG or PERS. Table 3 hence

Table 3: Representativeness MIG_final II

Variable      Sample                  Obs    Mean      SD   Min   Max
age_usmigl    MIG                    8047   32.63   11.73     1    85
              MIG_final               207   31.64    9.90    15    68
              Restricted Sample       125   33.71   10.21    16    68
educ_before   MIG (time constant)    8038    5.42    3.99     0    28
              MIG_final               207    7.05    3.41     0    17
              Restricted Sample       125     7.2    3.09     0    17

Notes: MIG is a cross-sectional file for each head of household that migrated to the U.S. MIG_final is a cross-sectional file including a panel dimension for each head of household that has not returned from her last U.S. migration and for whom wage and occupation data for last job before U.S. migration and job during last U.S. migration are available.

Figure 3: Job Places Before and During Last U.S. Migration in MIG_final. Source: maps.google.de

offers only limited insights into the demographics of the individuals in the

samples. One can state, though, that the mean age at last U.S. migration is in

the early thirties across all samples. Concerning the level of schooling, we

compare years of education before the last trip to the U.S. for MIG_final and

the restricted sample, while the number of years reported in MIG is the years

of education at the time of the survey. The average person in MIG has completed

about 5.4 years of education, with a slightly higher standard deviation

than in the other two samples, which exhibit higher levels of education of about

7 years on average. Note that both numbers indicate a low skill level.[12] Also,

all household heads in MIG have already migrated to the U.S. at least once,

but do not have to be wage earners yet, so that it might be the case that MIG

includes more persons who have not finished their education yet. Nonetheless,

our results on the determinants of migration costs appear to refer to

migrants with a higher level of schooling than the average migrant household

head in the MMP150 survey. This leads us to the summary of our regression

variables for MIG_final displayed in table 4. The first four rows deal with

the dependent variables, i.e. migration costs. Migration costs are measured

as the log of the ratio of the wages at the two locations, once adjusted to the

unit given by information on Mexican wages per individual and once adjusted

to hourly wages across the entire sample. The average migration costs in

all cases are greater than one, indicating a wage gain for the average worker

within the sample from migrating to the U.S. However, migration costs show

large variations, with standard deviations greater than one as well, and even

become negative, which we explain with regard to the stochastic component in

the wage determination equation (9) given in subsection 3.2. The subsequent

rows summarize variants and transformations of measures of distance between

the job place in Mexico and the location of employment in the U.S. Most informative

for the reader are the driving distance measured in kilometers and the

[12] According to Borjas (1990), migrants are negatively selected with regard to education because the gains from relocating to the U.S. from countries with less equal income distributions like Mexico are large if human capital investments in the origin are impossible.

traveltime denoted in driving hours. Distances range from only 200 kilometers

to 5726 kilometers with an average value of a little more than 3000 kilometers.

Hours of traveltime lie between 2.05 and 58 hours, the mean time being 27.6

hours. This rather high variation in distance may come as a surprise if one expected

an accumulation of migrants close to the border or in the metropolitan

cities on the North American West Coast. The map given in figure 3 provides

a more vivid overview of the distribution of locations, where green

dots represent origins and blue dots represent destinations. Returning to table

4, the next variable is a dummy for English proficiency, which is one

if the person is at least able to speak and understand English and zero if the

level of comprehension lies below speaking ability. One can see that almost

half of the migrant household heads have achieved this level of English-speaking

proficiency. Concerning the remaining dummy variables, 62 percent of the

sample were in a marital union before their last U.S. migration, 51 percent

already had family with U.S. migration experience at the time and also about

half of the persons owned property in Mexico. The following variables are not

included in the baseline specification of the regression model: Mean monthly

remittances are 306.91 U.S. dollars with a standard deviation of 260.17 dollars.

The average worker was almost 32 years old in the last available time

period before the most recent U.S. trip. 66 in every 100 workers had at least one

child before migrating. Only a third participated in social or sport activities

in the U.S., and 11 percent received social welfare. We do not include gender in

our regressions, but be aware that the vast majority of household heads with

migration experience is male and there are only 12 women in the sample.

All in all, the average migrant within MIG_final is a low-skilled man in

his early thirties with a high likelihood of having family or friends who also

migrated, but with rather strong social relationships in Mexico via children

and partners. In the following section we present the regression results.

Table 4: Descriptive Statistics - MIG_final

Variable                                                         Obs         Mean          SD       Min          Max
Migration costs in logs, Mexican wage rate(a)                    207         1.21        1.64     -3.85         3.57
Migration costs in logs, Mexican wage rate(b)                    203         1.18        1.62     -4.02         3.47
Migration costs in logs, ratio of hourly wages(a)                148         1.23        1.32     -3.85         3.27
Migration costs in logs, ratio of hourly wages(b)                144         1.12        1.38     -4.02         3.27
Log of driving distance in kilometres                            207         7.90        0.55      5.30         8.65
Log of traveltime in driving hours                               207         3.21        0.56      0.72         4.06
Driving distance in kilometres                                   207      3012.26     1108.18    200.00      5726.00
Squared driving distance in kilometres                           207  10300000.00  6512663.00  40000.00  32800000.00
Traveltime in driving hours                                      207        27.60       10.41      2.05        58.00
Squared traveltime in driving hours                              207       869.71      568.69      4.20      3364.00
At least able to speak and understand some English               207         0.49        0.50      0.00         1.00
Married in year before last U.S. migration                       207         0.62        0.49      0.00         1.00
Family with migration experience before last U.S. migration      207         0.51        0.50      0.00         1.00
Any property in Mexico in year before last U.S. migration        207         0.55        0.50      0.00         1.00
Average monthly remittances                                      195       306.91      260.17      0.00      1000.00
Age in the year before last U.S. migration                       207        31.64        9.90     15.00        68.00
Any child in the year before last U.S. migration                 207         0.66        0.48      0.00         1.00
Pursued any social or sport activity during last U.S. trip       200         0.34        0.47      0.00         1.00
Any social welfare payment received during last U.S. migration   201         0.11        0.32      0.00         1.00
Male                                                             207         0.94        0.23      0.00         1.00

Notes: (a) Comparable wage rates were calculated by first transferring Mexican pesos to U.S. dollars using the nominal exchange rate for the corresponding year and second deflating the nominal dollar values to 2010 dollars accessing the U.S. CPI. (b) Comparable wage rates were calculated by deflating the peso values to 2010 pesos using the Mexican CPI and transferring these to U.S. dollars via the 2010 nominal exchange rate. Source: MMP 150.

6 Results

We now turn to the estimated coefficients for the baseline specification given by

(10). Further, we consider a number of robustness checks. Reported standard

errors are clustered at the community level because some rural communities

in the MMP150 are very small and sampled entirely, so that individual backgrounds

and migration experiences are unlikely to be independent. Table 5,

column (1), shows estimates for the regression on migration costs, i.e. the

log of the U.S. wage divided by the last Mexican wage, calculated based on

deflation using the U.S. CPI. Distance, as expected, has a positive effect on migration

costs. A one percent increase in distance between locations is, ceteris

paribus, associated with a 0.764 percent increase in the cost of migration.

The effect is significant at the one percent level. Moreover, for a Mexican migrant

who is able to speak and not only understand English, the estimate implies a statistically

significant change of $[\exp(-0.572) - 1] \cdot 100 = -43.56$ percent, i.e. 43.56 percent lower costs than for a

comparable Mexican with lower English proficiency. Being married before

migration, in contrast, is associated with significantly higher migration

costs. The coefficients for having family with U.S. migration experience and

property in Mexico are both positive, but not statistically significant. That

migration experience within one's own social network may raise the cost of migration

may come as a surprise, as one could expect a corresponding decrease in the costs

of gaining information about the migration or job search process. Besides,

family that stayed in the U.S. should be helpful in getting to know one's way

around. Still, reports about troublesome migrations might also increase

compensation demands as migration costs are expected to be high. The

second column states estimates for the same coefficients but with

the dependent variable constructed using the Mexican CPI for the deflation

of wages. The magnitudes of the coefficients are slightly smaller for most

variables, while signs and statistical significance are broadly consistent across the two

estimations. The amount of variation in the dependent variable explained by

Table 5: MIG_final - Baseline Specification

                               (1)                      (2)
VARIABLES                             Migration costs
                      Mexican wage rate(a)     Mexican wage rate(b)

lndist                      0.764***                 0.708***
                           (0.212)                  (0.193)
englishspeaker             -0.572**                 -0.522***
                           (0.220)                  (0.190)
married_before              0.533*                   0.400*
                           (0.266)                  (0.214)
famusmig_before             0.149                    0.107
                           (0.237)                  (0.251)
anyproperty_before          0.239                    0.243
                           (0.178)                  (0.181)
Constant                   -5.084***                -4.616***
                           (1.643)                  (1.494)

Observations                  207                      203
R-squared                   0.170                    0.127

Robust standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1

Notes: (a) Comparable wage rates were calculated by first transferring Mexican pesos to U.S. dollars using the nominal exchange rate for the corresponding year and second deflating the nominal dollar values to 2010 dollars accessing the U.S. CPI. (b) Comparable wage rates were calculated by deflating the peso values to 2010 pesos using the Mexican CPI and transferring these to U.S. dollars via the 2010 nominal exchange rate. Source: MMP 150.

the regressors varies however; it is 17 percent for the �rst dependent variable

and 12.7 percent for the second. As the results are remarkably similar, for

the following robustness checks we proceed with regressing on migration costs

de�ated using the U.S. CPI on grounds of keeping as many observations and

thereby variation within the sample as possible.
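The two deflation procedures described in notes (a) and (b) to table 5 can be sketched as follows. This is a minimal illustration, not the code behind the estimates: the function names and all numerical inputs (wage, exchange rates, CPI levels) are hypothetical placeholders chosen for easy arithmetic, not values from MMP150, BLS or INEGI.

```python
# Hypothetical illustration of the two deflation procedures in the notes to table 5.
# Every number below is a made-up placeholder, not actual CPI or exchange-rate data.

def real_usd_via_us_cpi(wage_pesos, fx_pesos_per_usd_year, us_cpi_year, us_cpi_2010):
    """Procedure (a): convert at the year's nominal exchange rate, then deflate with the U.S. CPI."""
    nominal_usd = wage_pesos / fx_pesos_per_usd_year
    return nominal_usd * us_cpi_2010 / us_cpi_year


def real_usd_via_mx_cpi(wage_pesos, mx_cpi_year, mx_cpi_2010, fx_pesos_per_usd_2010):
    """Procedure (b): deflate with the Mexican CPI, then convert at the 2010 exchange rate."""
    pesos_2010 = wage_pesos * mx_cpi_2010 / mx_cpi_year
    return pesos_2010 / fx_pesos_per_usd_2010


wage = 100.0  # nominal peso wage in some earlier year (placeholder)
usd_a = real_usd_via_us_cpi(wage, fx_pesos_per_usd_year=10.0,
                            us_cpi_year=80.0, us_cpi_2010=100.0)
usd_b = real_usd_via_mx_cpi(wage, mx_cpi_year=50.0, mx_cpi_2010=100.0,
                            fx_pesos_per_usd_2010=12.5)
print(usd_a, usd_b)
```

The two measures coincide only if the change in the nominal exchange rate between the wage year and 2010 exactly offsets the inflation differential between the two countries, which is one reason why the two columns of table 5 need not agree.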

To begin with, we check the robustness of the marginal effect of geographical distance on migration costs. Results are reported in table 6. As indicated, the measurement of the dependent variable stays the same across specifications. The first column is the baseline specification. In the second column we replace the log of the distance measured in kilometers by the log of travel time in driving hours. A one percent increase in travel time goes along with a 0.753 percent increase in migration costs. The coefficient is significant at the one percent level and close to the marginal effect of the log of distance in kilometers. All other coefficients are nearly unchanged. In columns (3) and (4) we account for the possibility that the relationship between distance and migration costs might be misspecified by including each distance proxy in turn, that is driving distance in kilometers and travel time in hours (no logs this time), together with its square. We presume that the marginal effect of distance on migration costs is positive but decreasing with distance. The corresponding coefficients have the expected signs. Yet, only the coefficient for distance in kilometers remains statistically significant, at the ten percent level. The coefficients for travel time and travel time squared are not statistically significant.

Table 6: Robustness of Distance

Dependent variable: Migration costs, Mexican wage rate(a)

| VARIABLES | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| lndist | 0.764*** (0.212) | | | |
| englishspeaker | -0.572** (0.220) | -0.568** (0.223) | -0.574** (0.226) | -0.572** (0.228) |
| married_before | 0.533* (0.266) | 0.538** (0.267) | 0.563** (0.263) | 0.565** (0.265) |
| famusmig_before | 0.149 (0.237) | 0.143 (0.235) | 0.119 (0.247) | 0.118 (0.247) |
| anyproperty_before | 0.239 (0.178) | 0.245 (0.177) | 0.222 (0.173) | 0.227 (0.172) |
| lntrvltime | | 0.753*** (0.213) | | |
| distance | | | 0.000739* (0.000427) | |
| distsq | | | -7.11e-08 (6.71e-08) | |
| traveltime | | | | 0.0738 (0.0443) |
| trvltimesq | | | | -0.000729 (0.000741) |
| Constant | -5.084*** (1.643) | -1.462** (0.707) | -0.533 (0.606) | -0.445 (0.589) |
| Observations | 207 | 207 | 207 | 207 |
| R-squared | 0.170 | 0.169 | 0.161 | 0.159 |

Notes: (a) Comparable wage rates were calculated by first transferring Mexican pesos to U.S. dollars using the nominal exchange rate for the corresponding year and second deflating the nominal dollar values to 2010 dollars accessing the U.S. CPI. Source: MMP 150.
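To make the quadratic specification in column (3) of table 6 concrete, the following sketch evaluates the implied marginal effect of distance on log migration costs, b1 + 2 · b2 · d, and the distance at which this effect would reach zero. The function names are our own for illustration; the coefficients are the point estimates from column (3). Since the squared term is not statistically significant, the turning point should be read as purely illustrative.

```python
# Point estimates from table 6, column (3); distance is measured in kilometers.
B_DIST = 0.000739
B_DISTSQ = -7.11e-08


def marginal_effect(b_lin, b_sq, distance_km):
    """Derivative of b_lin*d + b_sq*d**2 with respect to d, evaluated at distance_km."""
    return b_lin + 2.0 * b_sq * distance_km


def turning_point(b_lin, b_sq):
    """Distance at which the marginal effect changes sign (requires b_sq != 0)."""
    return -b_lin / (2.0 * b_sq)


effect_at_1000 = marginal_effect(B_DIST, B_DISTSQ, 1000.0)
peak_km = turning_point(B_DIST, B_DISTSQ)
print(f"marginal effect at 1000 km: {effect_at_1000:.7f}")
print(f"turning point: {peak_km:.1f} km")
```

A positive linear and a negative squared coefficient imply a marginal effect that is positive but shrinking in distance, up to a turning point of roughly 5,200 kilometers under these point estimates.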

We now turn to how we deal with the potential time trend inherent in the data, which we described in section 4.2 and which may bias the estimation. Table 7 offers an overview. Column (1) is again the baseline specification. In column (2) we deal with the problem that for 35 individuals the difference in years between observed wages is greater than one, so that the no-arbitrage assumption may not hold. We therefore estimate the baseline specification on a restricted sample of 172 observations. This leads to an increase in the magnitude of the marginal effect of distance on migration costs. A one percent increase in distance leads ceteris paribus to an increase in migration costs of 0.943 percent, given the model is correctly specified. The magnitudes of the remaining coefficients change only little in comparison to the unrestricted sample. Signs and statistical significance remain unchanged. We maintain the year-difference constraint across subsequent columns. In the third column we only consider persons who migrated after 1994, which yields the same results as for the restricted sample (see column (6)), indicating that hourly wages reported for the following years did not exceed 30 U.S. dollars at 2010 prices. Regarding the estimates, the restriction reduces the size of the distance coefficient to 0.606 and the size of the coefficient for English proficiency to -0.236. The marginal effect of marital unions loses its statistical significance. The effect of owning property in Mexico changes sign. For someone who migrated after NAFTA, owning property leads to about 22.2 percent lower migration costs in comparison to someone without property. The effect is significant at the 10 percent level. It might be the case that the financial role of property has gained further importance or that return migration has become more common over the years, so that maintaining property reduces fears about the prospects at destination. In columns (4) and (5) we incorporate time period dummies into the regression. The period dummies are all statistically significant, so that a time trend is apparent. The marginal effect of property described for column (3) increases in magnitude and becomes significant at the 5 percent level. In addition, the distance effect remains positive and highly significant. The log of the distance is omitted in column (5) due to collinearity because we interacted the variable with the period dummies to capture time-variant effects. For the period 1959 to 1989 the marginal effect is greater than in the previous estimates. It decreases for the period 1990 to 1999 and increases again for the period from 2000 onwards, which appears to be counter-intuitive given cheaper transportation and communication. All interaction terms are significant at the 5 percent level. Including time controls raises the share of variance in migration costs explained by the regressors, with the R² rising to around 60 percent. In column (7) we include further potential determinants of migration costs, such as having had children before migration, participating in social or sports activities at destination, and receiving social welfare payments in the U.S. None of the additional coefficients is significant.
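The collinearity that leads to the omission of lndist in column (5) can be illustrated with a short sketch. It assumes that the three interaction terms are constructed as lndist multiplied by mutually exclusive and exhaustive period indicators, which is our reading of the text rather than a documented construction. The helper name, distances and years below are invented for illustration.

```python
import math

# Periods as named in table 7: 1959-1989, 1990-1999, 2000-2012.
PERIODS = [(1959, 1989), (1990, 1999), (2000, 2012)]


def period_interactions(lndist, year):
    """Return lndist interacted with one indicator per period (zero outside the period)."""
    return [lndist if lo <= year <= hi else 0.0 for lo, hi in PERIODS]


# Hypothetical (distance in km, migration year) pairs, for illustration only.
sample = [(800.0, 1975), (1500.0, 1993), (2400.0, 2005)]

rows = []
for dist_km, year in sample:
    lndist = math.log(dist_km)
    rows.append((lndist, period_interactions(lndist, year)))

# In every row lndist equals the sum of its three interactions, so adding lndist
# itself to the regressors would make the design matrix rank deficient.
exactly_collinear = all(abs(lndist - sum(inter)) < 1e-12 for lndist, inter in rows)
print(exactly_collinear)
```

Under this construction the three interaction coefficients are period-specific distance elasticities, and no separate coefficient on lndist is identified.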

We also perform regressions applying the described robustness checks using only the restricted sample as described in section 5.2 and table 1. Regression results are reported in table 8, where column (1) shows the baseline specification using MIG_final for comparison. Columns (2) and (3) employ the baseline specification for both measures of deflation, paralleling table 5. Columns (4) and (5) correspond to the robustness checks of the distance coefficient as illustrated in table 6. The final column adds additional controls and period dummies.

For the restricted sample, the R² are substantially lower, between 0.142 and 0.208. The distance effect, also if measured by the log of travel time as in column (4), remains positive and statistically significant, except for the specification check using the square of the distance in column (5), where the coefficients nevertheless have the expected signs. The marginal effect of English proficiency on migration costs stays negative and statistically significant across columns. Compared to the final sample, the magnitude of the coefficients is only about half as big, though. Yet, the marginal effect is still impressive: in column (2), an English speaker incurs ceteris paribus 21.02 percent lower migration costs than a Mexican who does not speak the language.
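The percentage effects quoted for the English-speaker dummy follow from the usual transformation of a dummy coefficient in a log-linear model, [exp(β) − 1] · 100. A minimal sketch, with a function name of our own choosing, reproduces the quoted figures from the reported coefficients:

```python
import math


def percent_effect(beta):
    """Percentage change in the outcome when a dummy switches from 0 to 1 in a log-linear model."""
    return (math.exp(beta) - 1.0) * 100.0


# Coefficients on englishspeaker as reported in the tables.
effects = {
    "table 5, column (1)": percent_effect(-0.572),  # about -43.56
    "table 8, column (2)": percent_effect(-0.236),  # about -21.02
}
for label, effect in effects.items():
    print(f"{label}: {effect:.2f} percent")
```

For small coefficients the transformation is close to 100 · β, but at magnitudes such as -0.572 the difference is substantial, which is why the exact formula is used.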

To sum up, the effect of distance on migration costs appears to be generally robust across different specifications and samples. The size of the coefficients using MIG_final ranges from 0.675 to an upper bound of 0.943. For the restricted sample the coefficients lie between 0.492 and 0.659. All estimates are significant at the 5 percent level or lower. The language dummy is also stable and significant in many cases. Moreover, the period dummies included in table 7 are all significant and point at a time trend that potentially biases the estimates if it is not controlled for. Different deflation measures exhibit very similar results.

Table 7: Controlling for Time Trends

Dependent variable: Migration costs, Mexican wage rate(a)

| VARIABLES | (1) | (2) | (3) | (4) | (5) | (6) | (7) |
|---|---|---|---|---|---|---|---|
| lndist | 0.764*** (0.212) | 0.943*** (0.224) | 0.606*** (0.165) | 0.799*** (0.136) | | 0.606*** (0.165) | 0.675*** (0.184) |
| d_1990_1999 | | | | 2.687*** (0.363) | 6.855*** (2.533) | | 2.755*** (0.387) |
| d_2000_2012 | | | | 2.709*** (0.343) | 3.930* (2.143) | | 2.815*** (0.364) |
| englishspeaker | -0.572** (0.220) | -0.500** (0.195) | -0.236** (0.105) | -0.197 (0.149) | -0.166 (0.149) | -0.236** (0.105) | -0.171 (0.179) |
| married_before | 0.533* (0.266) | 0.546** (0.223) | 0.0167 (0.144) | 0.0289 (0.169) | 0.0416 (0.174) | 0.0167 (0.144) | 0.417 (0.220) |
| famusmig_before | 0.149 (0.237) | 0.296 (0.264) | -0.0294 (0.113) | 0.355** (0.167) | 0.352** (0.169) | -0.0294 (0.113) | 0.329* (0.190) |
| anyproperty_before | 0.239 (0.178) | 0.175 (0.201) | -0.251* (0.127) | -0.303** (0.147) | -0.312** (0.145) | -0.251* (0.127) | -0.320* (0.189) |
| remit | | | | | | | -0.000127 (0.000348) |
| age_usyrl | | | | | | | 0.015 (0.00971) |
| anychild_before | | | | | | | -0.288 (0.223) |
| activity_usmigl | | | | | | | 0.08 (0.167) |
| welfare | | | | | | | -0.0181 (0.317) |
| lndist_1959_1989 | | | | | 1.066*** (0.266) | | |
| lndist_1990_1999 | | | | | 0.529** (0.218) | | |
| lndist_2000_2012 | | | | | 0.905*** (0.137) | | |
| difference between years ≤ 1 | No | Yes | Yes | Yes | Yes | Yes | Yes |
| migration after NAFTA | No | No | Yes | No | No | Yes | No |
| period dummies | No | No | No | Yes | Yes | No | Yes |
| lndist time-variant | No | No | No | No | Yes | No | No |
| restricted sample | No | No | No | No | No | Yes | No |
| Constant | -5.084*** (1.643) | -6.514*** (1.765) | -2.585* (1.337) | -7.078*** (1.113) | -9.171*** (1.976) | -2.585* (1.337) | -5.080*** (1.489) |
| Observations | 207 | 172 | 125 | 172 | 172 | 125 | 189 |
| R-squared | 0.170 | 0.227 | 0.206 | 0.624 | 0.630 | 0.206 | 0.580 |

Table 8: Restricted Sample

Dependent variable: Migration costs

| VARIABLES | (1) Mexican wage rate(a) | (2) Mexican wage rate(a) | (3) Mexican wage rate(b) | (4) Mexican wage rate(a) | (5) Mexican wage rate(a) | (6) Mexican wage rate(a) |
|---|---|---|---|---|---|---|
| lndist | 0.764*** (0.212) | 0.606*** (0.165) | 0.659*** (0.154) | | | 0.492** (0.188) |
| d_1990_1999 | | | | | | 0.0528 (0.131) |
| englishspeaker | -0.572** (0.220) | -0.236** (0.105) | -0.222** (0.106) | -0.231** (0.105) | -0.275** (0.109) | -0.236* (0.128) |
| married_before | 0.533* (0.266) | 0.0167 (0.144) | -0.0120 (0.152) | 0.0184 (0.141) | 0.0342 (0.157) | 0.0567 (0.163) |
| famusmig_before | 0.149 (0.237) | -0.0294 (0.113) | -0.0197 (0.116) | -0.0221 (0.112) | -0.0173 (0.116) | -0.0393 (0.127) |
| anyproperty_before | 0.239 (0.178) | -0.251* (0.127) | -0.172 (0.133) | -0.244* (0.125) | -0.296** (0.132) | -0.0944 (0.151) |
| age_usyrl | | | | | | 0.00770 (0.00785) |
| anychild_before | | | | | | -0.477** (0.208) |
| activity_usmigl | | | | | | -0.103 (0.198) |
| welfare | | | | | | -0.0914 (0.177) |
| remit | | | | | | -9.49e-05 (0.000249) |
| lntrvltime | | | | 0.589*** (0.167) | | |
| distance | | | | | 0.000326 (0.000351) | |
| distsq | | | | | -1.37e-08 (5.10e-08) | |
| Constant | -5.084*** (1.643) | -2.585* (1.337) | -3.089** (1.231) | 0.300 (0.583) | 1.392** (0.631) | -1.622 (1.623) |
| Observations | 207 | 125 | 125 | 125 | 125 | 110 |
| R-squared | 0.170 | 0.206 | 0.208 | 0.202 | 0.177 | 0.142 |

Notes: (a) Comparable wage rates were calculated by first transferring Mexican pesos to U.S. dollars using the nominal exchange rate for the corresponding year and second deflating the nominal dollar values to 2010 dollars accessing the U.S. CPI. (b) Comparable wage rates were calculated by deflating the peso values to 2010 pesos using the Mexican CPI and transferring these to U.S. dollars via the 2010 nominal exchange rate. Source: MMP 150.

7 Conclusion and Outlook

With this work we put forward a new estimate of the determinants of migration costs using microdata on spatial wage gaps of 207 Mexicans who migrated to the U.S. We find that distance remains a deterrent to migration, even in times of cheaper communication and travel. Even for a sample restricting migration to the years after NAFTA entered into force in 1994, with minimal temporal difference between reported wages and controlling for potential time trends, the marginal effect of distance is statistically significant at the five percent level and as high as 0.492. This means that for a location 500 kilometers away associated with migration costs of 100 U.S. dollars, an additional 50 kilometers can be associated with an increase of the cost of migration to about 105 U.S. dollars, all other things equal. While most other variables considered were not found to be meaningful influences on migration costs, English proficiency, besides distance, appears to be robust and significant across different variants of the regression model, so that the average English-speaking Mexican migrant is confronted with lower costs than her non-fluent compatriot.
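Under the log-log specification, migration costs scale with distance as cost ∝ distance^β, so the effect of a given change in distance can be computed directly. The sketch below is an illustration with a hypothetical helper function; it uses the coefficient of 0.492 from column (6) of table 8 together with the 500 kilometers and 100 U.S. dollars of the example above.

```python
def scaled_cost(cost0, dist0, dist1, elasticity):
    """Cost at distance dist1 implied by a constant distance elasticity, given cost0 at dist0."""
    return cost0 * (dist1 / dist0) ** elasticity


BETA_DIST = 0.492  # table 8, column (6)

cost_plus_1pct = scaled_cost(100.0, 500.0, 505.0, BETA_DIST)   # 5 km further
cost_plus_10pct = scaled_cost(100.0, 500.0, 550.0, BETA_DIST)  # 50 km further
print(f"{cost_plus_1pct:.2f} {cost_plus_10pct:.2f}")
```

With these inputs, a one percent increase in distance (5 kilometers) raises the cost by about half a dollar, while a ten percent increase (50 kilometers) raises it by just under 5 dollars.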

The empirical investigation we propose faces limitations, many of which are due to a lack of data. First of all, the constructed panel-like sample is small with regard to individuals, but spans a long time horizon, which makes wage comparisons problematic despite identical individuals and short time spans between occupations. The limited size of the sample also kept us from including controls for differences in regional amenities. However, it has been argued before that these are already captured by deflating monetary values with regional CPIs (Bodvarsson, Simpson, and Sparber, 2015). Regrettably, we were only able to obtain national CPIs for the work presented here. Also, the results are tied to a relatively homogeneous group of migrants and a single country pair, although the Mexican-U.S. border is of prevalent interest with regard to migration flows. Consequently, it would be an interesting research project to discriminate between internal and international migration costs by incorporating a border dummy. Unfortunately, and despite containing information on domestic migration, the MMP150 data do not include further location-specific wage data that could be accessed. This fact also creates an obstacle to dynamically modeling return migration with the data at hand, even though this would be advisable given that we observe that many Mexicans take several trips to the U.S. An analysis of return migration might also allow us to disentangle direct and indirect natural barriers to migration, as return migrants are likely to experience lower psychological costs relative to monetary expenses. To conclude, however, we believe that our transfer from the trade literature, which estimates trade costs using prices, to an estimate of the determinants of migration costs provides an interesting programme for further research. In particular, through our resort to microdata on the same individuals, we offer a contribution to the solution of the problem of selection bias through wage-influencing unobservables pertaining to earnings comparisons across locations.

References

Aguayo-Téllez, E., and J. Martínez-Navarro (2013): "Internal and international migration in Mexico: 1995-2000," Applied Economics, 45(13), 1647–1661.

Akee, R. (2010): "Who Leaves? Deciphering Immigrant Self-Selection from a Developing Country," Economic Development and Cultural Change, 58(2), 323–344.

Allen, T., and C. Arkolakis (2014): "Lecture 12: Estimating Trade Costs using Prices [Lecture Notes]," Retrieved September 7, 2015, from https://sites.google.com/site/treballen/teaching/econ-460-2014.

Anderson, J. E. (2011): "The Gravity Model," Annual Review of Economics, 3(1), 133–160.

Atkin, D., and D. Donaldson (2015): "Who's getting Globalized? The Size and Nature of Intranational Trade Costs," NBER Working Paper 21439.

Beine, M., S. Bertoli, and J. Fernández-Huertas Moraga (2014): "A practitioners' guide to gravity models of international migration," CREA Discussion Paper 2014-24, pp. 0–24.

Beine, M., F. Docquier, and Ç. Özden (2011): "Diasporas," Journal of Development Economics, 95(1), 30–41.

BLS (2015): "Monthly US CPI-W, 1967=100," [Data sheet], Available from http://data.bls.gov/cgi-bin/surveymost, Accessed August 8, 2015.

Bodvarsson, Ö. B., N. B. Simpson, and C. Sparber (2015): "Migration Theory," in Handbook of the Economics of International Migration, ed. by B. R. Chiswick, and P. W. Miller, vol. 1, chap. 1, pp. 3–51. Elsevier B.V.

Borjas, G. J. (1990): Friends or Strangers. Basic Books, New York.

Cameron, A. C., and P. K. Trivedi (2005): Microeconometrics: Methods and Applications, Cambridge Books. Cambridge University Press.

Clemens, M. A., C. E. Montenegro, and L. Pritchett (2008): "The Place Premium: Wage Differences for Identical Workers Across the US Border," CGDEV Policy Research Working Paper 4671.

Ehrenberg, R. G., and R. S. Smith (2003): "Worker Mobility: Migration, Immigration, and Turnover," in Modern Labor Economics: Theory and Public Policy, chap. 10, pp. 310–343. Addison-Wesley Higher Education Group, 8th edn.

Engel, C., and J. H. Rogers (1996): "How Wide Is the Border?," American Economic Review, pp. 1–12.

Gallup (2012): "Gallup Poll Social Series: Work and Education," [Poll Results]. Retrieved from http://cdn.cnsnews.com/documents/GALLUP-SCHOOL%20POLL.pdf, Accessed August 11, 2015.

Gandolfi, D., T. Halliday, and R. Robertson (2015): "Trade, FDI, Migration, and the Place Premium: Mexico and the United States," IZA Discussion Paper 9215.

Google Maps (2015): "Distance and Traveltime between locations," [Map]. Available from maps.google.de, Accessed July 30, 2015.

Hanson, G. H. (2003): "What Has Happened to Wages in Mexico since NAFTA?," NBER Working Paper 9563.

INEGI (2015a): "Encuesta Anual de Trabajo y Salarios Industriale," [Data sheet]. Retrieved from http://www.inegi.org.mx/prod_serv/contenidos/espanol/bvinegi/productos/nueva_estruc/HyM2014/5.%20Trabajo.pdf, Cuadro 5.19, 1.a parte, Accessed August 10, 2015.

INEGI (2015b): "Monthly Mexican CPI, December 2010=100," [Data sheet], Available from http://www.inegi.org.mx/sistemas/indiceprecios/Estructura.aspx?idEstructura=112000200010&T=%C3%8Dndices%20de%20Precios%20al%20Consumidor&ST=Principales%20%C3%Adndices, Accessed August 7, 2015.

Jevons, W. S. (1871): The Theory of Political Economy. Macmillan.

Johnson, W. R. (1977): "Uncertainty and the Distribution of Earnings," in Distribution of Economic Well-Being, ed. by T. F. Juster, vol. I, pp. 379–396. NBER.

Kennan, J., and J. R. Walker (2011): "The Effect of Expected Income on Individual Migration Decisions," Econometrica, 79(1), 211–251.

Mincer, J. (1958): "Investment in Human Capital and Personal Income Distribution," Journal of Political Economy, 66(4), 281–302.

MMP (2015): "MMP150," [Data files and Codebooks]. Available from mmp.opr.princeton.edu, Accessed July 20, 2015.

OECD (2015): "Average Weekly Working Hours Mexico (1995-2011)," [Data sheet]. Available from https://stats.oecd.org/Index.aspx?DataSetCode=AVE_HRS#, Accessed August 10, 2015.

Ortega, F., and G. Peri (2013): "The Effect of Income and Immigration Policies on International Migration," Migration Studies, 1(1), 1–35.

Ortega, F., and G. Peri (2014): "Openness and income: The roles of trade and migration," Journal of International Economics, 92(2), 231–251.

Persson, K. (2008): "Law of One Price," EH.Net Encyclopedia, Retrieved July 30, 2015, from http://eh.net/encyclopedia/the-law-of-one-price/.

Samuelson, P. A. (1954): "The Transfer Problem and Transport Costs, II: Analysis of Effects of Trade Impediments," The Economic Journal, 64(254), 264–289.

Smith, A. (1776): An Inquiry into the Nature and Causes of the Wealth of Nations. Modern Library, New York, 1937 edn.

Wooldridge, J. M. (2008): Introductory Econometrics: A Modern Approach. Cengage Learning EMEA.

Appendix: Additive Migration Costs

We model migration costs as a portion of the wage earned in the country of

origin, leaning on the common application of these so called "iceberg costs"

in established gravity models (Anderson, 2011), on examples from trade eco-

nomics estimating trade costs using prices (Engel and Rogers, 1996), and on

work from the migration literature itself (Ortega and Peri, 2014; Aguayo-Téllez

and Martínez-Navarro, 2013).

In this Appendix we further consider additive migration costs. Whereas the general theoretical background including wage determination remains, the no-arbitrage condition can be derived as follows. Consider the subsequent equation:

$$W_i^{j,k} - \delta_i^{j,k} = W_i^{j,l} - \delta_i^{j,l} \tag{11}$$

where $W_i^{j,k}$ is the wage of a worker $i$ from location $j$ in location $k$, $W_i^{j,l}$ are the earnings of this very same worker in location $l$, with $j, k, l \in S$, the set of locations, and $i \in N$, the set of individuals. $\delta_i^{j,k}$ depicts worker $i$'s costs of migrating from $j$ to $k$, and $\delta_i^{j,l}$ correspondingly those of migrating from $j$ to $l$. Wages between locations thus have to be equal if there are no migration costs. This implies that the compensating wages have to be higher, the greater the cost of migration. Replacing $l$ with $j$ for workers earning wages in their home country results in

$$W_i^{j,k} + \delta_i^{j,j} = W_i^{j,j} + \delta_i^{j,k} \tag{12}$$

Repeating from section 3.1, a worker who stays does not incur relocation expenses of any kind, so that in the additive case $\delta_i^{j,j}$ will be zero and we are left with

$$W_i^{j,k} = W_i^{j,j} + \delta_i^{j,k} \tag{13}$$

As before, we adjust the model to the data at hand, which means replacing $j$ with MX for Mexico, the place of origin observed in the data, and $k$ with US for the U.S. as the destination country:

$$W_i^{MX,US} = W_i^{MX,MX} + \delta_i^{MX,US} \tag{14}$$

The determination of migration costs remains the same as in equation (7),

$$\delta_i^{MX,US} = \exp\left(\beta_0 + {x_i^{MX,US}}' \beta_1 + \varepsilon_i^{MX,US}\right), \tag{15}$$

with $x_i^{MX,US}$ being a vector of diverse determinants of migration costs, $\beta_0$ denoting a constant and $\varepsilon_i^{MX,US}$ an error term with mean zero that is uncorrelated with the regressors. The vector of parameters of interest still is $\beta_1$. Taking logs, the final model for additive costs now reads

$$\ln\left(W_i^{MX,US} - W_i^{MX,MX}\right) = \beta_0 + {x_i^{MX,US}}' \beta_1 + \varepsilon_i^{MX,US} \tag{16}$$

Note that the dependent variable in this case cannot be written as the difference of the logs of the wages in the two locations. The dependent variable is the log of the wage differential, which is why the model based on iceberg costs and the model working with additive migration costs cannot be translated into each other: they implicitly make different functional form assumptions when using the same migration cost specification in both models. Therefore, estimates obtained from the following equation cannot be expected to be directly comparable to estimates from equation (10). Keeping this in mind, the regression model for additive migration costs can be stated as follows:

$$\begin{aligned} \ln\left(W_i^{MX,US} - W_i^{MX,MX}\right) = {} & \beta_0 + \beta_1 \mathit{lndist}_i^{MX,US} + \beta_2 \mathit{englishspeaker}_i^{US} + \beta_3 \mathit{married}_i^{MX} \\ & + \beta_4 \mathit{famusexp}_i^{MX} + \beta_5 \mathit{property}_i^{MX} + \varepsilon_i^{MX,US} \end{aligned} \tag{17}$$

There are several empirical difficulties to be recognized. The microdata at hand measure wages imprecisely because data on working hours in the last domestic occupation are missing. While wage units are available for both domestic and U.S. wages, only the U.S. wage data additionally allow for an adjustment to the unit dictated by the information on domestic wages. Thus, the data do not pose a problem for our preferred model that relies on iceberg migration costs. As the dependent variable in the case of iceberg migration costs is the log of a ratio, units need only be consistent per individual. In contrast, the additive model requires equal wage units for every individual within the sample. To compute these we need to make further assumptions on the average hours worked based on additional sources, which might bias the estimate for the cost of migration either downwards, if the average working hours in Mexico are set too low (so that hourly wages in Mexico are assumed to be too high), or upwards in the reverse case. Moreover, the additive model cannot deal with negative wage differentials at the estimation stage, which results in a further reduction of the sample size to merely 90 observations in the restricted sample and a loss of actual variation in the data. Table 9 depicts results for the same observations across models, namely the restricted sample minus the observations exhibiting negative wage differentials. We control for time trends using period dummies.
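The difference between the two dependent variables can be made explicit with a small sketch. It assumes, in line with the description above, that the iceberg specification uses the log of the wage ratio and the additive specification the log of the wage differential; the function names and wage values are hypothetical.

```python
import math


def iceberg_dep(w_us, w_mx):
    """Log of the wage ratio: invariant to a common rescaling of both wages."""
    return math.log(w_us / w_mx)


def additive_dep(w_us, w_mx):
    """Log of the wage differential: undefined (None) unless the U.S. wage exceeds the Mexican wage."""
    diff = w_us - w_mx
    return math.log(diff) if diff > 0 else None


# Hypothetical hourly wages, and the same wages expressed per eight-hour day.
hourly = (12.0, 3.0)
daily = (96.0, 24.0)

same_ratio = abs(iceberg_dep(*hourly) - iceberg_dep(*daily)) < 1e-12
shift = additive_dep(*daily) - additive_dep(*hourly)  # equals log(8)
dropped = additive_dep(2.0, 3.0)  # negative differential: observation is lost
print(same_ratio, shift, dropped)
```

Rescaling both wages of an individual by a common factor leaves the log ratio unchanged but shifts the log differential by the log of that factor, and observations with a non-positive differential drop out of the additive model.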

A one percent increase in distance between locations leads to a ceteris paribus increase in migration costs of 0.615 percent in the multiplicative model and of 0.622 percent in the additive model, given that the respective model has been correctly specified (columns (1) and (3)). As the additive model is based on hourly wages, we additionally regress on migration costs as the ratio of hourly wages, which yields a smaller effect on migration costs of 0.371 percent. The distance effect is statistically significant at the one percent level for all models. For the remaining variables, the additive model in several cases exhibits opposite signs of the coefficients in comparison to the regression on iceberg migration costs. For instance, though the effect is statistically insignificant, an English speaker is associated with 10.33 percent lower migration costs than someone who only understands English or does not have a grasp of the language at all in the multiplicative specification (taking hourly wages, the magnitude of the effect is only 6.24 percent), while they are 7.5 percent higher in the additive model.

Table 9: Iceberg Migration Costs and Additive Migration Costs

| VARIABLES | (1) Iceberg migration costs, Mexican wage rate(a) | (2) Iceberg migration costs, ratio of hourly wages(a) | (3) Additive migration costs(a) |
|---|---|---|---|
| lndist | 0.615*** (0.155) | 0.371*** (0.102) | 0.622*** (0.192) |
| d_1990_1999 | 1.034*** (0.200) | 0.789*** (0.199) | -0.408 (0.247) |
| d_2000_2012 | 0.770*** (0.224) | 0.662*** (0.196) | -0.660*** (0.201) |
| englishspeaker | -0.109 (0.126) | -0.0644 (0.104) | 0.0723 (0.119) |
| married_before | -0.00632 (0.165) | 0.00781 (0.132) | -0.0319 (0.137) |
| famusmig_before | 0.108 (0.140) | 0.0736 (0.108) | 0.150 (0.0952) |
| anyproperty_before | -0.329** (0.142) | -0.242* (0.134) | -0.0377 (0.119) |
| Constant | -3.812*** (1.247) | -1.672* (0.851) | -2.276 (1.586) |
| Observations | 90 | 90 | 90 |
| R-squared | 0.282 | 0.210 | 0.371 |
