integrated public use microdata series
Post on 31-Dec-2015
Integrated Public Use Microdata Series
IPUMS
www.ipums.org
Matt Sobek
Minnesota Population Center
sobek@pop.umn.edu
IPUMS Overview
1. What is the IPUMS
2. Harmonization
3. Additional Data Enhancements
4. Access
5. Strengths and Limitations
6. Research examples
1. What is the IPUMS
Brief History
IPUMS-USA 1991 -- Steve Ruggles
All existing samples of US census
Data extraction system 1998
IPUMS-International 2001
2004 IPUMS-Latin America
2005 IPUMS-Europe
2005 NSF Expansion
World’s largest collection of census data
30 samples per year for the next 3 years
Bob McCaa
Datasets in IPUMS
Belarus 1999
Brazil 2000 1991 1980 1970 1960
Cambodia 1998
Chile 2002 1992 1982 1970 1960
China 1982
Colombia 1993 1985 1973 1964
Costa Rica 2000 1984 1973 1960
Ecuador 2001 1990 1982 1974 1962
France 1990 1982 1975 1968 1962
Greece 2001 1991 1981 1971
Kenya 1999 1989
Mexico 2000 1990 1970 1960
Philippines 2000 1995 1990
Romania 2002 1992
South Africa 2001 1996
Spain 2001 1991 1981
Uganda 2002 1991
United States 2000 1990 1980 1970 1960
Venezuela 1990 1981 1970
Vietnam 1999 1989
IPUMS Census Sample Holdings and Release Dates
June 2007 December 2007
1970 Argentina 1971 Austria 2001 Armenia 1983 Guinea 1971 Nicaragua
1980 Argentina 1981 Austria 1976 Bolivia 1996 Guinea 1973 Pakistan
1991 Argentina 1991 Austria 1992 Bolivia 1961 Honduras 1981 Pakistan
2001 Argentina 2001 Austria 2001 Bolivia 1974 Honduras 1998 Pakistan
1970 Hungary 1971 Canada 2005 Colombia 1988 Honduras 1962 Paraguay
1980 Hungary 1981 Canada 1991 Czech Republic 1971 Indonesia 1972 Paraguay
1990 Hungary 1991 Canada 2001 Czech Republic 1976 Indonesia 1982 Paraguay
2001 Hungary 1970 Malaysia 1960 Dominican Rep 1980 Indonesia 1992 Paraguay
1972 Israel 1980 Malaysia 1970 Dominican Rep 1990 Indonesia 2002 Paraguay
1983 Israel 1991 Malaysia 1981 Dominican Rep 1995 Indonesia 1993 Peru
1995 Israel 2000 Malaysia 1986 Egypt 1997 Iraq 1970 Puerto Rico
1997 Palestine 1960 Panama 1996 Egypt 1961 Israel 1980 Puerto Rico
1981 Portugal 1970 Panama 1992 El Salvador 1991 Italy 1990 Puerto Rico
1991 Portugal 1980 Panama 1966 Fiji 1993 Madagascar 2000 Puerto Rico
2001 Portugal 1990 Panama 1986 Fiji 1987 Malawi 1983 Sudan
1991 Rwanda 2000 Panama 1996 Fiji 1998 Malawi 1993 Sudan
2001 Rwanda 2005 United States 1999 France 1987 Mali 1995 Turkmenistan
2001 Venezuela 1964 Guatemala 1998 Mali 1991 United Kingdom
1973 Guatemala 2002 Mongolia 1963 Uruguay
1981 Guatemala 1960 Netherlands 1975 Uruguay
1994 Guatemala 1970 Netherlands 1985 Uruguay
2002 Guatemala 2001 Netherlands 1996 Uruguay
June 2008 to June 2009
IPUMS Global Coverage
Dark green = disseminating
Medium green = data held by IPUMS
Light green = negotiating
Yellow = not negotiating
Selected Variable Availability -- PERSON
Variable Name               BR CL CN CO CR EC FR KE MX ZA UG US VE VN
Relationship to hh head     X  X  X  X  X  X  X  X  X  X  X  X  X  X
Age                         X  X  X  X  X  X  X  X  X  X  X  X  X  X
Sex                         X  X  X  X  X  X  X  X  X  X  X  X  X  X
Marital status              X  X  X  X  X  X  X  X  X  X  X  X  X  X
Children ever born          X  X  X  x  x  x  .  X  X  X  X  x  X  X
Children surviving          X  x  X  x  x  x  .  X  .  X  X  .  x  X
Date of last birth          x  .  .  .  .  .  .  X  .  .  X  .  .  X
Country of birth            X  x  .  X  X  x  .  X  X  X  X  X  X  .
Nativity                    X  X  .  X  X  X  X  X  X  X  X  X  X  .
Religion                    X  X  .  .  .  .  .  .  X  X  X  .  .  x
School attendance           X  X  .  x  x  x  x  X  x  X  X  X  X  X
Educational attainment      X  X  X  X  X  X  X  X  X  X  X  X  X  X
Years of schooling          X  X  .  X  X  X  .  X  X  X  X  x  X  X
Literacy                    X  X  X  X  X  X  .  x  X  .  X  .  X  X
Employment status           X  X  X  X  X  X  X  X  x  X  X  X  X  X
Class of worker             X  X  .  X  x  X  X  x  X  X  X  X  X  .
Occupation                  X  X  X  x  x  X  X  x  X  X  X  X  .  X
Industry                    X  X  X  X  X  x  X  .  X  X  x  X  X  X
Income                      X  .  .  x  X  .  .  .  X  X  .  X  X  .
Migration, previous country X  x  .  X  X  x  .  X  X  x  X  x  .  .
Migration, internal         X  X  .  X  X  x  X  X  X  X  X  X  X  X
Year of migration           X  X  .  X  X  x  .  .  x  X  X  .  x  .
Disability                  x  x  .  .  x  .  x  .  .  X  X  x  X  .
X = available in all samples for that country
x = available in only some samples for that country
. = not available for that country
BR=Brazil; CL=Chile; CN=China; CO=Colombia; CR=Costa Rica; EC=Ecuador; FR=France; KE=Kenya
MX=Mexico; ZA=South Africa; UG=Uganda; US=United States; VE=Venezuela; VN=Vietnam
Selected Variable Availability -- HOUSEHOLD
Variable Name               BR CL CN CO CR EC FR KE MX ZA UG US VE VN
Region                      X  X  X  X  .  .  X  X  X  .  .  X  .  X
State/province              X  X  X  X  X  X  .  X  X  X  X  X  X  X
District/county/municip     .  X  X  X  X  X  .  X  X  X  X  X  X  X
Metropolitan area           X  .  .  X  .  .  .  .  .  .  .  X  .  .
Urban-rural status          X  X  .  X  X  x  X  X  X  X  X  x  x  X
Electricity                 X  X  .  x  X  X  x  X  x  X  X  .  X  X
Water                       X  X  .  x  X  X  .  X  X  X  X  X  X  X
Sewage                      X  X  .  x  X  x  x  X  X  x  .  X  X  X
Toilet                      .  X  .  x  X  X  .  X  X  X  .  x  X  X
Home ownership              X  X  .  x  X  X  x  X  X  X  X  X  X  x
X = available in all samples for that country
x = available in only some samples for that country
. = not available for that country
BR=Brazil; CL=Chile; CN=China; CO=Colombia; CR=Costa Rica; EC=Ecuador; FR=France; KE=Kenya
MX=Mexico; ZA=South Africa; UG=Uganda; US=United States; VE=Venezuela; VN=Vietnam
What Are Microdata?
Individual-level data
• every record represents a separate person
• all of their individual characteristics are recorded
• users must manipulate the data themselves
Different from aggregate/summary/tabular data
• a count of persons by municipality
• an employment status table by sex from a published census volume
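As a minimal sketch of the difference, the snippet below starts from a few person records and derives the two kinds of summary table mentioned above. The records, field names, and municipality labels are made up for illustration and are not IPUMS data.

```python
from collections import Counter

# Hypothetical microdata: one dict per person (not real census records).
persons = [
    {"municipality": "A", "sex": "female", "employed": True},
    {"municipality": "A", "sex": "male", "employed": False},
    {"municipality": "A", "sex": "male", "employed": True},
    {"municipality": "B", "sex": "female", "employed": True},
    {"municipality": "B", "sex": "female", "employed": False},
]

# Aggregate table 1: a count of persons by municipality.
count_by_municipality = Counter(p["municipality"] for p in persons)

# Aggregate table 2: employment status by sex.
employment_by_sex = Counter((p["sex"], p["employed"]) for p in persons)

print(dict(count_by_municipality))
print(dict(employment_by_sex))
```

With microdata the user chooses the tabulation; with a published table only the second step's output is available.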
Kenya 1999 Census Questionnaire
H9101000000030982025200090000001324101001000071000000008800000000
P9101000000030102520252120000000002109730111020010103212001182000
P9101000000030202520252120000000001109730111020020103622001181080
P9101000000030302520252120201010100009000199996030101122006990000
P9101000000030402520252120201010100009000199996030100912006990000
P9101000000030502520252120201010100009000199996030100712006990000
P9101000000030602520252120201010100009000199996030100612006990000
P9101000000030702520252120201010100009000199996030100422006990000
P9101000000030802520252120201010100009000199996030100322006990000
P9101000000030902520252120201010100009000199996030100222006990000
H9101000000040360025200030000001324101001000071000000008800000000
P9101000000040102520252120000000002103110101010010103011001021000
P9101000000040202520252120000000001103110101010020102121001021020
P9101000000040302520252120201010100003000199990030100111006990000
H9101000000050338025200030000001324101001000071000000008800000000
P9101000000050102520251200000000021031001070700101045120010520000
P9101000000050202520252120000000001103100107070020102522001051020
P9101000000050302520252120201010100003000199990030100722006990000
H9101000000060416025200040000001324101001000071000000008800000000
P9101000000060102520252120000000002104200119150010104912001192000
P9101000000060202520252120000000001104200119150020104922001192040
P9101000000060302520252120201010100004000199991030101922006990000
P9101000000060402520252120201010100004000199991030101522006990000
Raw Census Microdata from IPUMS
H910000240000000088001001000220100
P910000020101032120010010010011504
P910000010201036220010010010011999
P910201000301011220060010010011999
P910201000301009120060010010011999
P910201000301007120060010010011999
P910201000301006120060010010011999
P910201000301004220060010010011999
P910201000301003220060010010011999
P910201000301002220060010010011999
H910000240000000088001001000110100
P910000020101030110010290510511310
P910000010201021210010290290171999
P910201000301001110060010290291999
H910000240000000088001001000220100
P910000020101045120010010010011100
P910000010201025220010010010011820
P910201000301007220060010010011999
H910000240000000088001001000220100
P910000020101049120010010010011100
P910000010201049220010010010011820
P910201000301019220060010010011820
P910201000301015220060010010012820
IPUMS Data Structure
Household record (shaded) followed by a person record for each member of the household
Relationship
Age
Sex
Race
Birthplace
Mother’s birthplace
Occupation
For each type of record, columns correspond to specific variables
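The hierarchical layout described here can be read with a few lines of code. The sketch below is illustrative only: it takes the first two households from the sample records shown earlier, splits the fused string on the leading H or P, and attaches each person record to the household record that precedes it. The slides do not give the column positions of individual variables, so the records are kept as opaque strings rather than decoded; the function names are hypothetical.

```python
import re

# First two households from the sample shown earlier, fused as in the slide.
fused = (
    "H910000240000000088001001000220100"
    "P910000020101032120010010010011504"
    "P910000010201036220010010010011999"
    "P910201000301011220060010010011999"
    "P910201000301009120060010010011999"
    "P910201000301007120060010010011999"
    "P910201000301006120060010010011999"
    "P910201000301004220060010010011999"
    "P910201000301003220060010010011999"
    "P910201000301002220060010010011999"
    "H910000240000000088001001000110100"
    "P910000020101030110010290510511310"
    "P910000010201021210010290290171999"
    "P910201000301001110060010290291999"
)


def split_records(text):
    """Split a fused string into records; each starts with H or P, then digits."""
    return re.findall(r"[HP]\d+", text)


def group_households(records):
    """Attach each person (P) record to the household (H) record before it."""
    households = []
    for rec in records:
        if rec[0] == "H":
            households.append({"household": rec, "persons": []})
        else:
            if not households:
                raise ValueError("person record found before any household record")
            households[-1]["persons"].append(rec)
    return households


households = group_households(split_records(fused))
sizes = [len(h["persons"]) for h in households]
print(sizes)  # prints [9, 3]
```

Decoding individual variables would then be a matter of slicing each record at the column positions listed in the extract's codebook.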
The Advantages of Microdata
Combination of all of a person’s characteristics
Characteristics of everyone with whom a person lived
Freedom to make any table you need
Freedom to make models examining multivariate relationships
Basically, you are only limited by the questions asked in the particular census
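One of these advantages, access to the characteristics of everyone a person lives with, is easy to show in code. The sketch below flags each person who shares a household with someone aged 65 or older. The records and the 65+ threshold argument are illustrative assumptions; nothing here is real census data.

```python
from collections import defaultdict

# Hypothetical person records: (household id, person number, age).
persons = [
    (1, 1, 46), (1, 2, 44), (1, 3, 77), (1, 4, 15),
    (2, 1, 30), (2, 2, 28),
]

# Collect the ages of everyone in each household.
ages_by_household = defaultdict(list)
for hh, pernum, age in persons:
    ages_by_household[hh].append(age)


def lives_with_elderly(hh, age, threshold=65):
    """True if some OTHER member of the household is at or above the threshold."""
    others = list(ages_by_household[hh])
    others.remove(age)  # drop one occurrence: this person's own age
    return any(a >= threshold for a in others)


flags = {(hh, pernum): lives_with_elderly(hh, age) for hh, pernum, age in persons}
for key, flag in flags.items():
    print(key, flag)
```

A published table cannot answer this kind of question unless the census office happened to tabulate it.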
IPUMS Overview
1. What is the IPUMS
2. Harmonization
3. Additional Data Enhancements
4. Access
5. Strengths and Limitations
6. Research examples
Translation Table – Marital Status
MARST Marital Status
code label CN82A403 CO73A411 KN89A413 MX70A402 US90A425
100 SINGLE/NEVER MARRIED 1=never married 4=single 1=single 9=single 6=never married
200 MARRIED/IN UNION
210 Married (not specified) 2=married 2=married 3=monogamous 1=married
211 Civil 3=only civil
212 Religious 4=only religious
213 Civil and religious 2=civil and religious
214 Polygamous 3=polygamous
220 Consensual union 1=free union 5=free union
300 SEPARATED/DIVORCED 3=sep. or divorced
310 Separated 6=separated 8=separated 3=separated
321 Legally separated
322 De facto separated
330 Divorced 4=divorced 5=divorced 7=divorced 4=divorced
400 WIDOWED 3=widowed 5=widowed 4=widowed 6=widowed 5=widowed
999 UNKNOWN/MISSING 0=missing 6=unknown B=blank 1=unknown
China 1982
Colombia 1973
Kenya 1989
Mexico 1970
U.S.A. 1990
(IPUMS-International)
Translation Table – Marital Status
MARST Marital Status
gen code label CN82A403 CO73A411 KN89A413 MX70A402 US90A425
1 100 SINGLE/NEVER MARRIED 1=never married 4=single 1=single 9=single 6=never married
2 200 MARRIED/IN UNION
210 Married (not specified) 2=married 2=married 3=monogamous 1=married
211 Civil 3=only civil
212 Religious 4=only religious
213 Civil and religious 2=civil and religious
214 Polygamous 3=polygamous
220 Consensual union 1=free union 5=free union
3 300 SEPARATED/DIVORCED 3=sep. or divorced
310 Separated 6=separated 8=separated 3=separated
321 Legally separated
322 De facto separated
330 Divorced 4=divorced 5=divorced 7=divorced 4=divorced
4 400 WIDOWED 3=widowed 5=widowed 4=widowed 6=widowed 5=widowed
9 999 UNKNOWN/MISSING 0=missing 6=unknown B=blank 1=unknown
General Codes
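A translation table of this kind can be applied mechanically. The sketch below is a hedged illustration rather than the IPUMS implementation: it encodes only the Mexico 1970 and U.S. 1990 columns, whose placement in the flattened table is inferred from the labels, and it derives the general code as the first digit of the detailed code, as the "gen" column suggests. Sending unlisted source codes to 999 is an assumption made for the example.

```python
# Source code -> harmonized detailed MARST code, as read from the table above.
# Column placement is inferred from the flattened table, so treat as illustrative.
TRANSLATION = {
    "MX70A402": {
        "9": 100,  # single
        "3": 211,  # only civil
        "4": 212,  # only religious
        "2": 213,  # civil and religious
        "5": 220,  # free union
        "8": 310,  # separated
        "7": 330,  # divorced
        "6": 400,  # widowed
        "1": 999,  # unknown
    },
    "US90A425": {
        "6": 100,  # never married
        "1": 210,  # married
        "3": 310,  # separated
        "4": 330,  # divorced
        "5": 400,  # widowed
    },
}

GENERAL_LABELS = {
    1: "SINGLE/NEVER MARRIED",
    2: "MARRIED/IN UNION",
    3: "SEPARATED/DIVORCED",
    4: "WIDOWED",
    9: "UNKNOWN/MISSING",
}


def harmonize(sample, source_code):
    """Return (detailed code, general code) for one source value."""
    # Assumption for this sketch: codes not in the table become 999.
    detailed = TRANSLATION[sample].get(source_code, 999)
    general = detailed // 100  # first digit of the three-digit code
    return detailed, general


for sample, code in [("MX70A402", "3"), ("US90A425", "1"), ("MX70A402", "8")]:
    detailed, general = harmonize(sample, code)
    print(sample, code, detailed, general, GENERAL_LABELS[general])
```

The general code keeps samples comparable even where only some of them distinguish, for example, civil from religious marriage.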
Variable Description: Literacy (International)
IPUMS Overview
1. What is the IPUMS
2. Harmonization
3. Additional Data Enhancements
4. Access
5. Strengths and Limitations
6. Research examples
IPUMS “Pointer” Variables (Simple household)
Pernum  Relate  Age  Sex     Marst    Chborn  Spouse’s Location  Mother’s Location  Father’s Location
1       head    46   male    married  n/a     2                  0                  0
2       spouse  44   female  married  3       1                  0                  0
3       aunt    77   female  widow    7       0                  0                  0
4       child   15   female  single   0       0                  2                  1
5       child   13   female  single   n/a     0                  2                  1
6       child   11   male    single   n/a     0                  2                  1
IPUMS “Pointer” Variables (Complex household)
Pernum  Relationship  Age  Sex     Marst      Chborn  Spouse’s Location  Mother’s Location  Father’s Location
1       head          53   female  separated  6       0                  0                  0
2       child         28   male    single     n/a     0                  1                  0
3       child         22   male    single     n/a     0                  1                  0
4       child         21   male    single     n/a     0                  1                  0
5       child         25   female  married    2       6                  1                  0
6       child-in-law  28   male    married    n/a     5                  0                  0
7       grandchild    3    male    single     n/a     0                  5                  6
8       grandchild    1    male    single     n/a     0                  5                  6
9       non-relative  32   female  separated  2       0                  0                  0
10      non-relative  10   male    single     n/a     0                  9                  0
11      non-relative  5    female  single     n/a     0                  9                  0
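The point of the pointer variables is that a relative's characteristics can be attached to a person with a simple lookup. The sketch below uses the complex household from this slide, with pointer values as reconstructed from the slide's location columns, and looks up each person's mother's age. The dictionary layout and the function name are hypothetical; a location of 0 is taken to mean that the relative is not present in the household.

```python
# pernum -> (relationship, age, spouse's location, mother's location, father's location)
# Values follow the complex-household slide; 0 means "no such person in the household".
HOUSEHOLD = {
    1: ("head", 53, 0, 0, 0),
    2: ("child", 28, 0, 1, 0),
    3: ("child", 22, 0, 1, 0),
    4: ("child", 21, 0, 1, 0),
    5: ("child", 25, 6, 1, 0),
    6: ("child-in-law", 28, 5, 0, 0),
    7: ("grandchild", 3, 0, 5, 6),
    8: ("grandchild", 1, 0, 5, 6),
    9: ("non-relative", 32, 0, 0, 0),
    10: ("non-relative", 10, 0, 9, 0),
    11: ("non-relative", 5, 0, 9, 0),
}


def mother_age(pernum):
    """Age of the person's co-resident mother, or None if no mother is present."""
    mother_loc = HOUSEHOLD[pernum][3]
    if mother_loc == 0:
        return None
    return HOUSEHOLD[mother_loc][1]


for pernum, (relationship, age, *_rest) in HOUSEHOLD.items():
    print(pernum, relationship, age, mother_age(pernum))
```

Note that the relationship-to-head variable alone would not link persons 10 and 11 to person 9; the mother's location pointer does.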
IPUMS Overview
1. What is the IPUMS
2. Harmonization
3. Additional Data Enhancements
4. Access
5. Strengths and Limitations
6. Dissemination
IPUMS Access
• Restricted access
• Scholarly and educational purposes
• Conditions of use: key is not to redistribute
• Serious vetting
IPUMS Overview
1. What is the IPUMS
2. Harmonization
3. Additional Data Enhancements
4. Access
5. Strengths and Limitations
6. Research examples
4 Key Strengths of the Census Microdata Samples
• National in scope
Results not subject to local peculiarities
Provide context for local studies
• Large
More cases than any comparable datasets
Enable study of relatively small populations
• Temporal depth
Provide historical perspective
• Microdata
Can make your own tabulations
Apply multivariate techniques
Limitations of the Microdata Samples
Confidentiality
• Geography
20,000 population or larger
• Sensitive variables, swapping, etc
• Samples
Too small to answer some questions
Other Issues and Limitations
• Not annual
Any historical analysis will have gaps
• Cross-sectional data
Not longitudinal
• Need knowledge of a statistical package
• User burden
Information overload; culturally specific knowledge
• Very large extracts
IPUMS Overview
1. What is the IPUMS
2. Harmonization
3. Additional Data Enhancements
4. Users and Access
5. Strengths and Limitations
6. Research examples
IPUMS-International Research Topics
• Child labor outside the household in Mexico and Colombia
• Effect of NAFTA on educational attainment and school enrollment by region within Mexico
• Concentration of mortality within families in Kenya
• Life course patterns of co-residence among Mexicans in Mexico, Mexicans in the U.S., and Mexican Americans
• Brain drain from developing countries
• How language diversity is affected by migration and economic factors
Married Female Labor Force Participation in Latin America (age 18 to 65)
[Chart: percent in labor force (0 to 50) by year (1960 to 2005). Series: Mexico, Costa Rica, Ecuador, Chile, Venezuela, Colombia, Brazil.]
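A series like this one could be tabulated from person records roughly as follows. This is an unweighted sketch under stated assumptions: the records and field layout are invented, the resulting percentages are not the chart's values, and a real analysis would also apply the sample's person weights.

```python
from collections import defaultdict

# Hypothetical records: (country, year, sex, age, marital status, in labor force).
records = [
    ("Mexico", 1970, "female", 30, "married", False),
    ("Mexico", 1970, "female", 45, "married", True),
    ("Mexico", 1970, "female", 70, "married", False),  # excluded: over 65
    ("Mexico", 1970, "male", 40, "married", True),     # excluded: male
    ("Mexico", 2000, "female", 28, "married", True),
    ("Mexico", 2000, "female", 50, "married", True),
    ("Mexico", 2000, "female", 35, "single", True),    # excluded: not married
    ("Mexico", 2000, "female", 60, "married", False),
]


def participation_rates(rows, min_age=18, max_age=65):
    """Percent of married women aged min_age..max_age in the labor force, per sample."""
    counts = defaultdict(lambda: [0, 0])  # (country, year) -> [in labor force, total]
    for country, year, sex, age, marst, in_lf in rows:
        if sex == "female" and marst == "married" and min_age <= age <= max_age:
            counts[(country, year)][0] += int(in_lf)
            counts[(country, year)][1] += 1
    return {key: 100.0 * lf / total for key, (lf, total) in counts.items()}


rates = participation_rates(records)
for (country, year), rate in sorted(rates.items()):
    print(country, year, round(rate, 1))
```

Because the marital status, age, and labor force variables are harmonized, the same function can be run unchanged over every country and census year.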
Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65)
[Chart: percent in labor force (0 to 70) by year (1920 to 2010). Series: Latin America, United States.]
Married Female Labor Force Participation: Latin America and U.S. (age 18 to 65)
[Chart: percent in labor force (0 to 70) by year (1920 to 2010). Series: United States, Mexico, Costa Rica, Ecuador, Chile, Venezuela, Colombia, Brazil.]
Compare Latin America to U.S. 40 years ago
Married Female Labor Force Participation: Mexican-born Women, 1970-2000
[Chart: percent in labor force (0 to 70) by year (1970 to 2000). Series: Mexican-born women in United States, Women in Mexico.]
Working-Age Population in the Labor Force, by Sex
[Chart: percent of working-age population (0 to 100) in the labor force, shown separately for males and females. Persons age 16 to 65. Samples shown: Brazil 1960, 1970, 1980, 1991, 2000; Chile 1960, 1970, 1982, 1992, 2002; Colombia 1964, 1973, 1985, 1993; Costa Rica 1963, 1973, 1984, 2000; Ecuador 1962, 1974, 1982, 1990, 2001; Mexico 1970, 1990, 2000; Venezuela 1971, 1981, 1990; China 1982; Vietnam 1989, 1999; Kenya 1989, 1999; South Africa 1996, 2001; France 1962, 1968, 1975, 1982, 1990; United States 1960, 1970, 1980, 1990, 2000.]
Persons with Completed Secondary Education: National Populations Versus Migrants to the United States
[Chart: percent (0 to 100) by country. Series: In home country, ca. 2000; Migrants to U.S. 1995-2000. Countries: Brazil, Chile, Costa Rica, Ecuador, Mexico, Vietnam, Kenya, South Africa.]
Population Residing with an Elderly Person
[Chart: percent of total population (0 to 30). Series: Elderly persons (age 65+); Non-elderly residing with an elderly person. Samples shown: Brazil 1960, 1970, 1980, 1991, 2000; Colombia 1973, 1985, 1993; Mexico 1970, 1990, 2000; Kenya 1989, 1999; S Africa 1996, 2001; China 1982; Vietnam 1989, 1999; France 1962, 1968, 1975, 1982, 1990; United States 1960, 1970, 1980, 1990, 2000.]
End
sobek@pop.umn.edu