MICS4 Data Processing Workshop Multiple Indicator Cluster Surveys Data Processing Workshop Adding Sample Weights, Wealth Index, and GPS Data

Upload: noah-rogers

Post on 27-Mar-2015

Page 1

MICS4 Data Processing Workshop

Multiple Indicator Cluster Surveys Data Processing Workshop

Adding Sample Weights, Wealth Index, and GPS Data

Page 2

Secondary Data Processing Flow

Export Data from CSPro

Import Data into SPSS

Recode Variables

Add Sample Weights, Wealth Index, and GPS Data

Run Tables

Page 3

Adding Sample Weights

Page 4

Sampling

• In most MICS surveys, if not all, samples are not self-weighting

• Household samples are selected with different probabilities of selection from each domain of interest

– Examples: regions, area (urban-rural), a combination of these (typical in MICS), or other domains

Page 5

Sampling: Example Popstan

• Example: the probability of selecting a household for MICS interviews was not equal across all of Popstan

• The country has two regions, North and West, which are of equal size (10,000 households each)

• In North region

– 500 households were selected and interviewed per 10,000

• In West region

– 250 households were selected and interviewed per 10,000

• Which means that overall

– 750 households were selected and interviewed from 20,000

Page 6

Sample Weights

• Sample weights are used to adjust the sample to produce accurate estimates for the whole country

• Sample weights are the inverse of the probabilities of selection

• For example, the weights for North and West region

– North region: 10,000/500 = 20

– West region: 10,000/250 = 40

• In North region, each household selected represents 20 households in that region; in West, the figure is 40
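
The Popstan arithmetic can be checked with a few lines of Python. This is only an illustrative sketch: the dictionary and function names are made up for this example and are not part of the MICS templates, which do this calculation in weights.xls.

```python
# Illustrative sketch: a design weight is the inverse of the probability of selection.
# Region sizes and sample sizes are the Popstan figures from the slides.
households_in_region = {"North": 10_000, "West": 10_000}
households_selected = {"North": 500, "West": 250}


def design_weight(total_households: int, selected_households: int) -> float:
    """Return how many households each selected household represents.

    Dividing total by selected is the same as 1 / (selected / total),
    i.e. the inverse of the probability of selection.
    """
    return total_households / selected_households


design_weights = {
    region: design_weight(households_in_region[region], households_selected[region])
    for region in households_in_region
}

for region, weight in design_weights.items():
    print(f"{region}: each selected household represents {weight:.0f} households")
```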

Page 7

Sample Weights

• Overall, every household selected in Popstan represents 26.6667 households (20000/750)

• In other words, relative to a proportional selection (375 households from each region), more households have been selected from North and fewer from West

Page 8

Sample Weights

• This has to be “compensated” by using sample weights during analysis to re-calibrate the sample to the national level

Page 9

Sample weights

• Weights should always be used when tabulating

• Sample weights will have two components

– The initial probability of selection

– Non-response: we have to take into account what proportion of households (women, under-5s) we have successfully interviewed

• In Popstan North region, if the sample was initially selected with a probability of 500 households per 10,000, but we then were able to successfully interview 450, the final sample weight should be calculated based on 450, not on 500
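
A small Python sketch of this non-response adjustment, using the North region figures above. The variable names are illustrative only, and the response-rate formulation is one common way to express the adjustment; the actual calculation is done in the weights spreadsheet.

```python
# Illustrative sketch: base the final weight on completed interviews,
# not on the number of households originally selected.
households_in_north = 10_000
households_selected = 500
households_interviewed = 450

# Weight from the initial probability of selection.
design_weight = households_in_north / households_selected

# Share of selected households that were successfully interviewed.
response_rate = households_interviewed / households_selected

# Dividing by the response rate is the same as computing 10,000 / 450 directly.
final_weight = design_weight / response_rate

print(f"design weight: {design_weight:.2f}")
print(f"final weight after non-response adjustment: {final_weight:.2f}")
```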

Page 10

Why sample weights

• 25 percent of households in North use improved water sources

• 75 percent of households in West use improved water sources

• If the sample was selected proportionally (375 households from each region), then our survey estimate would be

– ((375 * 0.25) + (375 * 0.75)) / 750 = 0.50

Page 11

Why sample weights

• If we do not weight, then our national estimate will be

– ((500 * 0.25) + (250 * 0.75)) / 750 = 0.417

– This is because we have over-sampled a region where use of improved water sources is lower

• We need to calculate sample weights to “correct” this situation

Page 12

Why sample weights

• If we assigned a weight of 20 to each household in North, and 40 to each household in West, this would do the trick

– ((500 * 20 * 0.25) + (250 * 40 * 0.75)) / ((500 * 20) + (250 * 40)) = 0.50
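
The unweighted and weighted national estimates can be compared side by side. The sketch below is illustrative only (the data structure and function name are invented for this example) and uses the Popstan sample sizes, weights, and water-source figures from the slides.

```python
# Sample size, weight and share using improved water sources, per region (Popstan figures).
regions = {
    "North": {"sample": 500, "weight": 20, "improved_water": 0.25},
    "West": {"sample": 250, "weight": 40, "improved_water": 0.75},
}


def national_estimate(regions: dict, use_weights: bool) -> float:
    """Return the national share using improved water, with or without sample weights."""
    numerator = 0.0
    denominator = 0.0
    for region in regions.values():
        weight = region["weight"] if use_weights else 1
        numerator += region["sample"] * weight * region["improved_water"]
        denominator += region["sample"] * weight
    return numerator / denominator


unweighted = national_estimate(regions, use_weights=False)
weighted = national_estimate(regions, use_weights=True)

print(f"unweighted: {unweighted:.3f}")
print(f"weighted:   {weighted:.3f}")
```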

Page 13

Why sample weights

• This is fine, but SPSS tables would show 20000 households as the denominator

• We do not want this

• So, we normalize the weights

• We calibrate (normalize) them so that the average of the weights in the data set is equal to 1

Page 14

Why sample weights

• The normalized weight for the North region is calculated as (10000/500)/(20000/750) = 0.75

• And for the West region, (10000/250)/(20000/750)= 1.5

• When we calculate the national use of improved water sources by using normalized weights, we get the same estimate:

– ((500 * 0.75 * 0.25) + (250 * 1.5 * 0.75)) / ((500 * 0.75) + (250 * 1.5)) = 375 / 750 = 0.50
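
Normalization can be sketched in Python as dividing each weight by the average weight in the sample. The names below are illustrative assumptions, not taken from the MICS templates.

```python
# Popstan sample sizes and (un-normalized) weights per region.
sample = {"North": 500, "West": 250}
design_weight = {"North": 20.0, "West": 40.0}

# Average weight across all sampled households: 20000 / 750.
total_weight = sum(sample[region] * design_weight[region] for region in sample)
total_sample = sum(sample.values())
mean_weight = total_weight / total_sample

# Normalized weights have an average of 1 across the data set.
normalized = {region: design_weight[region] / mean_weight for region in sample}

# With normalized weights, the weighted number of households equals the sample size,
# so tables show 750 rather than 20000 as the denominator.
weighted_n = sum(sample[region] * normalized[region] for region in sample)

for region, weight in normalized.items():
    print(f"{region}: normalized weight {weight:.2f}")
print(f"weighted number of households: {weighted_n:.0f}")
```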

Page 15

Sample weights

• Based on the design of the sample, there are two (common) approaches to calculating weights:

– Each cluster has a unique sample weight (weights.xls)

– Each stratum has a unique sample weight (weights_alt.xls)

• We have templates for both. You will need to work with your sampling expert to see which one you will use

Page 16

Sample Weights Objects

• weights.xls

– spreadsheet that calculates weights

• weights_table.sps

– SPSS program that provides input data for spreadsheet

• weights.sps

– SPSS program that defines structure of spreadsheet’s output

• weights_merge.sps

– SPSS program that merges weights onto the MICS data files

Page 17

Steps in Adding Weights

1. Update weights.xls to have one row per stratum or cluster, depending on sample design

2. Add sampling information to weights.xls

3. Adapt strata definitions in weights_table.sps

4. Execute weights_table.sps program

5. Copy resulting table’s contents into “Calculations” sheet of weights.xls

6. Save “Output” sheet of weights.xls as weights.xls in directory c:\mics4\weights

7. Execute weights_merge.sps program

Page 18

Step 1: Updating weights.xls

• Spreadsheet has one row per cluster

• Adjust the number of rows in “Calculations” to reflect the number of clusters in your survey

– do so by copying and pasting internal rows

• Check that the totals cells have the correct ranges

• Adjust the number of rows in “Output”

• Check that data in “Output” is correct

Page 19

Step 2: Adding Sampling Info

• Open weights.xls

• Complete columns C and D – probabilities of selection of households in a cluster, and of clusters in a stratum

• or

• Complete the “stratum sampling fraction” column

Page 20

Step 3: Defining Strata

• Your survey has sampling strata. Examples:

– all combinations of area (HH6) and region (HH7)

– region

• Lines 3-10 of weights_table.sps define the standard survey’s strata

• Update these statements to reflect the definition of strata in your country
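
To make the idea of a stratum as "area by region" concrete, here is a hypothetical Python sketch. The coding scheme (region code times 10 plus area code) and the sample records are assumptions made for illustration; the actual definition lives in weights_table.sps and must follow your own sample design.

```python
from collections import Counter

# Hypothetical household records: HH1 = cluster, HH6 = area (1 urban, 2 rural), HH7 = region.
households = [
    {"HH1": 1, "HH6": 1, "HH7": 1},
    {"HH1": 1, "HH6": 1, "HH7": 1},
    {"HH1": 2, "HH6": 2, "HH7": 1},
    {"HH1": 3, "HH6": 1, "HH7": 2},
]


def stratum_code(hh6_area: int, hh7_region: int) -> int:
    """Combine region and area into one stratum code, e.g. region 1, rural (2) -> 12.

    Assumed scheme for illustration; it only works when area codes are single digits.
    """
    return hh7_region * 10 + hh6_area


# Number of households falling in each stratum.
households_per_stratum = Counter(
    stratum_code(h["HH6"], h["HH7"]) for h in households
)

for stratum, count in sorted(households_per_stratum.items()):
    print(f"stratum {stratum}: {count} households")
```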

Page 21

Step 4: Executing weights_table.sps

• Open weights_table.sps in SPSS

• Select Run ---> all

• Check output for error messages

• Examine output table

Page 22

Step 5: Copying Output

• Double-click inside the table to open it

• Select and copy the household results

• Paste them in the “Calculations” sheet of weights.xls

• Repeat for the women and children results

• Save weights.xls

Page 23

Step 6: Saving the Output Sheet

• Click on the “Output” tab in the weights.xls spreadsheet

• Select File ---> Save As

• Navigate to the directory c:\mics4\spss

– Save under name weights.xls

• Click the save button

Page 24

Step 7: Merging Weights into SPSS

• Open weights_merge.sps in SPSS

• Select Run ---> all

• Check output for error messages

• Open each data file (HH, HL, TN, WM, BH, and CH) and check that weights were correctly added

Page 25

weights_merge.sps

Source Files: c:\mics4\spss\weights.sav

Destination Files: HH.sav, HL.sav, TN.sav, WM.sav, BH.sav, FG.sav, CH.sav, MN.sav

Match By: HH1

Variables Added: xxWeight, where xx is the file prefix (HH, WM, CH, MN, TN, BH, FG)
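
The merge itself is a table lookup on the cluster number. The Python sketch below only illustrates that idea; it is not what weights_merge.sps contains. The records, the weight values, and the function name are hypothetical; only the key HH1 and the xxWeight naming pattern come from the slide.

```python
# Hypothetical cluster-level weights, keyed by cluster number (HH1).
cluster_weights = {
    1: {"HHWeight": 0.75},
    2: {"HHWeight": 1.5},
}

# Hypothetical household records (HH1 = cluster number, HH2 = household number).
household_records = [
    {"HH1": 1, "HH2": 1},
    {"HH1": 1, "HH2": 2},
    {"HH1": 2, "HH2": 1},
    {"HH1": 3, "HH2": 1},  # no weights row exists for cluster 3
]


def merge_weights(records: list, weights_by_cluster: dict):
    """Add the cluster's weight variables to every record, matching on HH1.

    Returns the merged records and the sorted list of clusters with no weights,
    which is the kind of problem to look for when checking the merged files.
    """
    merged = []
    unmatched_clusters = set()
    for record in records:
        weights = weights_by_cluster.get(record["HH1"])
        if weights is None:
            unmatched_clusters.add(record["HH1"])
            merged.append(dict(record))
        else:
            merged.append({**record, **weights})
    return merged, sorted(unmatched_clusters)


merged_records, unmatched = merge_weights(household_records, cluster_weights)

for record in merged_records:
    print(record)
print("clusters without weights:", unmatched)
```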

Page 26

Wealth Index

Page 27

The Wealth Index

• The MICS wealth index is an attempt to measure the socio-economic status of households

• The analysis section of this process will be done at the 3rd workshop

• The goal today is to discuss the programs and how they work

Page 28

The Wealth Index

• But briefly

– The wealth index is a method to divide households into 5 groups of equal size (quintiles) in terms of “wealth”, from poorest to richest

– “Wealth” is constructed by using information on household characteristics (crowding), amenities (water and sanitation), and household assets (durable goods)

– Useful in the absence of information on income and expenditures

Page 29

Wealth Index Programs

The program related to the wealth index is:

wealth.sps: this program calculates the wealth index and merges the wealth index values onto the SPSS data files

Page 30

wealth.sps

• Calculates a wealth index using factor analysis

• Inputs:

– dichotomous variables related to household/individual assets

• Outputs:

– wscore - a wealth index score for each household

– windex5 - a wealth quintile for each household

Page 31

A Recoding Example

• Code below creates variable with value 1 if household owns a car, value 0 otherwise

Recode hc9f (1=1) (9=9) (else=0) into car.

variable label car 'Household member owns: car/truck'.

value label car 0 'No' 1 'Yes'.

Missing values car (9).
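
For readers less familiar with SPSS syntax, the same recoding logic can be written in Python. This is an illustrative sketch only: the function names and the sample values are invented, and the real recoding is done by the SPSS statements above.

```python
MISSING = 9  # code declared as missing by "Missing values car (9)."


def recode_car(hc9f):
    """Mirror of: Recode hc9f (1=1) (9=9) (else=0) into car."""
    if hc9f == 1:
        return 1        # household owns a car/truck
    if hc9f == 9:
        return MISSING  # keep the missing code
    return 0            # every other value becomes "No"


def share_owning_car(hc9f_values):
    """Share of households owning a car, leaving out missing values."""
    recoded = [recode_car(value) for value in hc9f_values]
    valid = [value for value in recoded if value != MISSING]
    return sum(valid) / len(valid)


# Hypothetical responses to hc9f for five households.
responses = [1, 2, 9, 1, 2]
car = [recode_car(value) for value in responses]
print(car)
print(share_owning_car(responses))
```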

Page 32

The Rest of the Program

The factor statement

– creates wealth index score

The compute statement

– generates household member weights

The rank statement

– creates wealth quintiles

The save outfile statement

– saves wealth variables in wealth.sav file
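
The compute and rank steps can be sketched in Python. Two things here are assumptions rather than facts from the slides: that the household member weight is the household weight multiplied by the number of household members, and that quintiles are cut on the cumulative share of that weight. The field names `hhweight` and `members` are hypothetical, and the sketch does not reproduce how the SPSS rank statement handles ties.

```python
import math

# Hypothetical households with a wealth score from the factor step.
households = [
    {"HH1": 1, "HH2": 1, "wscore": 0.1, "hhweight": 1.0, "members": 4},
    {"HH1": 1, "HH2": 2, "wscore": -1.2, "hhweight": 1.0, "members": 4},
    {"HH1": 2, "HH2": 1, "wscore": 1.9, "hhweight": 1.0, "members": 4},
    {"HH1": 2, "HH2": 2, "wscore": -0.4, "hhweight": 1.0, "members": 4},
    {"HH1": 3, "HH2": 1, "wscore": 0.8, "hhweight": 1.0, "members": 4},
]


def assign_quintiles(households: list) -> list:
    """Add windex5 (1 = poorest, 5 = richest) to each household in place."""
    # Compute step (assumed form): weight each household by its number of members.
    for household in households:
        household["member_weight"] = household["hhweight"] * household["members"]
    total = sum(household["member_weight"] for household in households)

    # Rank step: walk from poorest to richest and cut at each 20% of weighted population.
    cumulative = 0.0
    for household in sorted(households, key=lambda h: h["wscore"]):
        cumulative += household["member_weight"]
        # The small epsilon keeps a share of exactly 20%, 40%, ... in the lower group.
        quintile = math.ceil(5 * cumulative / total - 1e-9)
        household["windex5"] = max(1, min(5, quintile))
    return households


assign_quintiles(households)
for household in households:
    print(household["HH1"], household["HH2"], household["wscore"], household["windex5"])
```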

Page 33

The rest of the program

• Opens each file (hh.sav, hl.sav, wm.sav, ch.sav, tn.sav, bh.sav, fg.sav, mn.sav) one at a time and, matching on HH1 and HH2, adds the wealth index variables (windex5 and wscore).

• Saves data files with wealth variables.

Page 34

GPS

Page 35

GPS Readings

• Some countries will take GPS readings during their MICS survey

• These readings allow researchers to merge diverse data sets using a cluster’s location

• Data sets that can be linked to the MICS data

– Climate data

– Agricultural data

Page 36

The GPS Form

GEOGRAPHIC POSITIONING SYSTEM FORM GP

GP1. Cluster number: ___ ___ ___

GP2. Area:

Urban ........................................... 1

Rural ............................................ 2

GP3. Region:

Region 1 ...................................... 1

Region 2 ...................................... 2

Region 3 ...................................... 3

Region 4 ...................................... 4

GP4. Operator name and number:

Name ___ ___

GP5. Day/Month/Year of measurement: ___ ___ / ___ ___ / ___ ___ ___ ___

GP6. Waypoint name:___ ___ ___ ___ ___ ___

N/S/E/W Degrees Decimal degrees

GP7. Latitude: N S ___ ___ . ___ ___ ___ ___ ___

GP8. Longitude: E W ___ ___ ___ . ___ ___ ___ ___ ___
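
Many mapping tools expect signed decimal degrees rather than a hemisphere letter plus a positive number, so readings recorded as on this form may need converting before they are linked to other data sets. The helper below is a hypothetical Python sketch of that conversion, not part of the MICS GPS programs; the function name and the example coordinates are invented.

```python
def signed_decimal_degrees(hemisphere: str, degrees: float) -> float:
    """Convert a hemisphere letter (N/S/E/W) and positive decimal degrees to a signed value.

    South latitudes and West longitudes become negative.
    """
    hemisphere = hemisphere.strip().upper()
    if hemisphere not in {"N", "S", "E", "W"}:
        raise ValueError(f"unknown hemisphere: {hemisphere!r}")

    # Latitude runs from 0 to 90 degrees, longitude from 0 to 180 degrees.
    limit = 90 if hemisphere in {"N", "S"} else 180
    if not 0 <= degrees <= limit:
        raise ValueError(f"{degrees} is out of range for hemisphere {hemisphere}")

    return -degrees if hemisphere in {"S", "W"} else degrees


# Hypothetical reading for one cluster (GP7 latitude, GP8 longitude).
latitude = signed_decimal_degrees("S", 1.28333)
longitude = signed_decimal_degrees("E", 36.81667)
print(latitude, longitude)
```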

Page 37

GPS Programs

• GPS.dcf

– CSPro dictionary

• GPSEntry.ent

– CSPro data entry application

• GPS.sps

– SPSS version of GPS.dcf

• GPS_merge.SPS

– reads in GPS data and merges it onto SPSS data files

Page 38

gps_merge.sps

Source Files: c:\mics4\spss\gps.dat

Destination Files: HH.sav, HL.sav, TN.sav, WM.sav, BH.sav, CH.sav, MN.sav

Match By: HH1

Variables Added: all variables on GPS form
