chile seminar 03.01.2012 - ministerio de desarrollo...

1

Survey Estimation

March 1, 2012

Richard Valliant, University of Michigan & University of Maryland USA

2

Survey data are collected in a specific way that can affect

how analyses should be done

Features of complex sample designs

- Stratification

- Clustering

- Weights

Sample designs can be “informative” or “noninformative”

- If the sample is not a miniature of the population, the

effects of the design need to be accounted for

- Weights are used to make the sample “look like” the

population

- Stratification & clustering affect standard errors

3

Difference between strata & clusters

CASEN is stratified and clustered

Strata = comunas + urban/rural

Clusters (primary sampling units, PSUs)

secciones in rural

manzanas in urban

4

Effects of design features

Stratification

Assures that sample is distributed across all strata

Eliminates some badly distributed samples

Can reduce variances of full population estimates; allows

control over SEs for stratum estimates

Strata can be combinations of variables

CASEN uses regions for sample size determination,

comunas/(urban-rural) for sample selection

5

Clustering

In household surveys, clusters are geographic (secciones

or manzanas in CASEN)

Aims to reduce travel costs if data collection is done in

person

Useful if a complete, up-to-date list of households not

available

Chilean master frame is a list of PSUs not addresses

or housing units

INE field workers made list in May-July 2011 used for

sampling in each sample PSU

6

Usually increases variances of descriptive statistics

(totals, means, proportions)

Persons who live near each other are alike in some

ways (income level, environmental exposures, social

status)

$$\operatorname{var}(\bar{y}) = \frac{\sigma^2}{m\bar{n}}\left[1 + (\bar{n} - 1)\rho\right]$$

m = number of clusters

$\bar{n}$ = average number of sample units per cluster (e.g., families or persons)

$\rho$ = intraclass correlation; measure of how alike units are within clusters
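A minimal sketch of the clustering formula on this slide, read here as var(ybar) = sigma^2 / (m * nbar) * [1 + (nbar - 1) * rho]. The function name and the input values are hypothetical and chosen only for illustration.

```python
def clustered_variance_of_mean(sigma2, m, nbar, rho):
    """Approximate variance of a sample mean from m clusters.

    sigma2 : unit-level variance of y
    m      : number of sample clusters
    nbar   : average number of sample units per cluster
    rho    : intraclass correlation within clusters
    """
    srs_variance = sigma2 / (m * nbar)          # variance if units were uncorrelated
    return srs_variance * (1 + (nbar - 1) * rho)


# Hypothetical inputs: 50 clusters of 16 units, unit variance 100.
v_uncorrelated = clustered_variance_of_mean(sigma2=100.0, m=50, nbar=16, rho=0.0)
v_clustered = clustered_variance_of_mean(sigma2=100.0, m=50, nbar=16, rho=0.05)
print(v_uncorrelated, v_clustered, v_clustered / v_uncorrelated)
```

With rho = 0 the formula reduces to the usual simple random sampling variance, so the ratio of the two results is the inflation due to clustering alone.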

7

Effect of correlation within a cluster

The number of HU’s per PSU in CASEN is about 16 in urban

areas and 22 in rural.

$\rho$     $\bar{n}$     $\sqrt{\text{Design Effect}} = \sqrt{1 + (\bar{n} - 1)\rho}$
0.01    16    1.07
0.01    22    1.10
0.05    16    1.32
0.05    22    1.43
0.10    16    1.58
0.10    22    1.76

If $\bar{n}$ = 22 and $\rho$ = 0.10, an SE of a mean would be about 76%

higher than if persons were uncorrelated.
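The tabulated factors can be checked with a few lines of Python. This sketch assumes the factors are the square root of 1 + (nbar - 1) * rho, which is the reading consistent with the "76% higher" statement; the function name is hypothetical.

```python
import math


def se_inflation(rho, nbar):
    """Factor by which clustering multiplies the SE of a mean."""
    return math.sqrt(1 + (nbar - 1) * rho)


# Same rho and cluster-size combinations as on the slide.
table = {
    (rho, nbar): round(se_inflation(rho, nbar), 2)
    for rho in (0.01, 0.05, 0.10)
    for nbar in (16, 22)
}

for (rho, nbar), factor in table.items():
    print(f"rho={rho:.2f}  nbar={nbar}  SE inflation={factor:.2f}")
```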

8

- In 2009 CASEN, Deff’s for proportion in poverty ranged

from 1.4 to 8.7 across the 15 regions.

- Deff = 8.7 implies $\rho \approx 0.38$. With Deff = 8.7, SEs are 2.95 times as large as with a simple random sample.

- High Deffs caused by sampling clusters of families

who live in same neighborhoods

This effect of clustering must be accounted for in

standard error estimation.
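A rough sketch of the arithmetic behind these statements, assuming Deff = 1 + (nbar - 1) * rho. The slide does not state the cluster size behind the 0.38 figure, so nbar = 21.3 below is a hypothetical value picked only because it roughly reproduces it; the sample size of 1000 is likewise made up.

```python
import math


def se_ratio(deff):
    """SE under the complex design divided by SE under simple random sampling."""
    return math.sqrt(deff)


def implied_rho(deff, nbar):
    """Intraclass correlation implied by a design effect and average cluster size."""
    return (deff - 1) / (nbar - 1)


def effective_sample_size(n, deff):
    """Size of a simple random sample that would give the same variance."""
    return n / deff


ratio = se_ratio(8.7)
rho = implied_rho(8.7, nbar=21.3)          # nbar is an assumption, not from the slides
n_eff = effective_sample_size(1000, 8.7)   # hypothetical sample of 1000 persons
print(round(ratio, 2), round(rho, 2), round(n_eff))
```

The effective sample size makes the cost of clustering concrete: with Deff = 8.7, a clustered sample carries only about one ninth of the information of a simple random sample of the same size.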

9

Weights

Used to project sample to the entire finite population

Typical properties

Sum of weights for any subgroup is an estimate of

number of units in the population for the subgroup.

For example

sum of weights for persons in Santiago estimates

population size of Santiago

An estimate made using weights is an estimate of

the “census” value. For example,

- weighted sum of income estimates total income in

population

10

Another example of a census value …

- Regression model with survey weights estimates

the model that would be fitted if whole population

were in the sample

Not using survey weights will (usually) lead to biased

point estimates of descriptive statistics. Model parameter

estimates can also be biased.
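The properties of weights listed above can be illustrated with a tiny made-up sample. All regions, incomes, and weights below are hypothetical; they only show how weighted and unweighted estimates can differ when groups are sampled at different rates.

```python
# Hypothetical mini-sample: (region, income, survey weight).
sample = [
    ("Santiago", 900.0, 50.0),
    ("Santiago", 1500.0, 50.0),
    ("Other", 400.0, 200.0),
    ("Other", 600.0, 200.0),
]


def estimated_count(rows, region=None):
    """Sum of weights: estimated number of population units (optionally in one region)."""
    return sum(w for r, _, w in rows if region is None or r == region)


def estimated_total(rows):
    """Weighted sum of income: estimated population total income."""
    return sum(w * y for _, y, w in rows)


n_hat = estimated_count(sample)
n_hat_santiago = estimated_count(sample, "Santiago")
total_income = estimated_total(sample)
weighted_mean = total_income / n_hat
unweighted_mean = sum(y for _, y, _ in sample) / len(sample)
print(n_hat, n_hat_santiago, total_income, weighted_mean, unweighted_mean)
```

Here the higher-income group has smaller weights (it was sampled at a higher rate), so the unweighted mean overstates average income relative to the weighted estimate.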

11

Components of weights

- Base weights: inverses of selection probabilities

- Nonresponse adjustments: compensate for the fact that not all persons respond

  Important to do if different groups respond at different rates (urban vs. rural, regional differences)

- Post-stratification or other adjustments using administrative data

  Important if some population groups are not well covered by the survey

  Can adjust for inaccurate base weights

  Can reduce variances

(Final wt) = (Base wt) * (NR adj) * (poststrat adj)
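As a small sketch, the final-weight product can be written as a function of the selection probability and the two adjustment factors. The function name and the numeric inputs are hypothetical.

```python
def final_weight(selection_prob, nr_adj=1.0, poststrat_adj=1.0):
    """Final weight = base weight * nonresponse adjustment * poststratification adjustment."""
    base_weight = 1.0 / selection_prob   # inverse of the selection probability
    return base_weight * nr_adj * poststrat_adj


# Hypothetical unit: selected with probability 0.002, in a nonresponse cell
# with adjustment 1.25 and a poststratum with adjustment 0.96.
w_final = final_weight(selection_prob=0.002, nr_adj=1.25, poststrat_adj=0.96)
print(w_final)
```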

12

Nonresponse Adjustments

Related to response rates

- Response rates accounted for when calculating initial

sample size

- One performance measure used for surveys

Higher response rate → less burden on the nonresponse adjustment

In CASEN, response rates vary by region, and poverty rates vary by region and by urban/rural locale. Accounting for regional and urban/rural differences is therefore important when adjusting for nonresponse.
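One common way to make such an adjustment is a weighting-class adjustment: within each cell (for example region by urban/rural), respondents' weights are inflated by the ratio of the weighted sample count to the weighted respondent count. The sketch below is a generic illustration with hypothetical cells and weights, not CASEN's actual procedure.

```python
from collections import defaultdict


def nonresponse_adjust(units):
    """Weighting-class adjustment.

    units: list of dicts with keys 'cell', 'base_wt', 'responded'.
    Returns the respondents, each with an added 'nr_wt' key.
    """
    sampled = defaultdict(float)
    responded = defaultdict(float)
    for u in units:
        sampled[u["cell"]] += u["base_wt"]
        if u["responded"]:
            responded[u["cell"]] += u["base_wt"]

    adjusted = []
    for u in units:
        if u["responded"]:
            factor = sampled[u["cell"]] / responded[u["cell"]]
            adjusted.append({**u, "nr_wt": u["base_wt"] * factor})
    return adjusted


# Hypothetical sample: urban units respond at 50%, rural units at 75%.
units = (
    [{"cell": "urban", "base_wt": 100.0, "responded": i < 2} for i in range(4)]
    + [{"cell": "rural", "base_wt": 60.0, "responded": i < 3} for i in range(4)]
)
respondents = nonresponse_adjust(units)
total_base = sum(u["base_wt"] for u in units)
total_adjusted = sum(r["nr_wt"] for r in respondents)
print(total_base, total_adjusted)
```

The adjusted respondent weights sum to the same total as the base weights of the full sample, so respondents in low-response cells stand in for the nonrespondents in their own cell.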

13

Post-stratification

Begin with base weights adjusted for nonresponse

Further adjust weights so that estimated totals of persons

agree with administrative record counts

CASEN will poststratify by comuna

- Estimates of comuna population counts will agree with

INE population projections for November 2011.
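A minimal sketch of poststratification: within each poststratum, weights are multiplied by the ratio of the external control total to the current weighted estimate. The comuna names, weights, and control totals are hypothetical.

```python
def poststratify(respondents, control_totals):
    """Scale weights so weighted counts match control totals in each poststratum.

    respondents    : list of (poststratum, nonresponse-adjusted weight)
    control_totals : dict mapping poststratum -> external population count
    """
    estimated = {}
    for ps, w in respondents:
        estimated[ps] = estimated.get(ps, 0.0) + w
    return [(ps, w * control_totals[ps] / estimated[ps]) for ps, w in respondents]


# Hypothetical nonresponse-adjusted weights and population projections.
resp = [("comuna_A", 200.0), ("comuna_A", 200.0), ("comuna_A", 100.0),
        ("comuna_B", 300.0), ("comuna_B", 300.0)]
controls = {"comuna_A": 550.0, "comuna_B": 540.0}

final = poststratify(resp, controls)
total_a = sum(w for ps, w in final if ps == "comuna_A")
total_b = sum(w for ps, w in final if ps == "comuna_B")
print(total_a, total_b)
```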

14

Example of an “Informative” Design

Population of hospitals

Two sample designs:

Simple random sampling without replacement

Probability proportional to number of beds in each

hospital

Similar to way PSUs are selected in CASEN

15

Select 100 samples of each type with n = 30

What do the histograms of beds for the (unweighted)

samples look like?

16

[Figure: two histograms on a common Beds axis (0 to 1000). Top: population histogram of no. of beds in a hospital population. Bottom: histogram of beds in 100 simple random samples of n = 30 (no weights needed).]

17

[Figure: two histograms on a common Beds axis (0 to 1000). Top: population histogram of no. of beds in a hospital population. Bottom: histogram of beds in 100 samples of n = 30 selected pp(beds), i.e., PPS samples (no weights).]

18

SRSWOR gives a miniature of the population

PPS sampling gives a sample distribution that is much

different from the population histogram.

Unweighted sample mean and quantiles will be

larger, on average, than the population quantities.

Sample weights are needed to make the sample look

like the population.
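The hospital comparison can be imitated with a small simulation. This is only a sketch under stated assumptions: the bed counts are a synthetic lognormal population (not the data behind the histograms), and PPS sampling is done with replacement via `random.choices`, a simplification of the design described.

```python
import random
import statistics

random.seed(2012)

# Synthetic, right-skewed population of bed counts (hypothetical data).
beds = [max(10, int(random.lognormvariate(5.0, 0.8))) for _ in range(400)]
pop_mean = statistics.mean(beds)
total_beds = sum(beds)
n = 30

srs_means, pps_unweighted, pps_weighted = [], [], []
for _ in range(100):
    # Simple random sample without replacement.
    srs = random.sample(beds, n)
    srs_means.append(statistics.mean(srs))

    # PPS sample (with replacement): selection probability proportional to beds.
    pps = random.choices(beds, weights=beds, k=n)
    pps_unweighted.append(statistics.mean(pps))

    # Weight = 1 / (n * p_i) with p_i = beds_i / total_beds.
    w = [total_beds / (n * b) for b in pps]
    pps_weighted.append(sum(wi * b for wi, b in zip(w, pps)) / sum(w))

avg_srs = statistics.mean(srs_means)
avg_pps_unweighted = statistics.mean(pps_unweighted)
avg_pps_weighted = statistics.mean(pps_weighted)
print(pop_mean, avg_srs, avg_pps_unweighted, avg_pps_weighted)
```

Under these assumptions the unweighted PPS mean should sit well above the population mean, because large hospitals are over-represented, while the SRS mean and the weighted PPS mean should both land close to it.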

19

Example from US National Health & Nutrition Examination Survey (Kreuter & Valliant 2007)

- NHANES is a geographically stratified, multistage,

clustered sample of households with age, sex, and

race-ethnic groups sampled at different rates.

- Estimate % of persons with hypertension

                 Ignoring all            Accounting for stratification and clustering in SEs
                 design information      Unweighted      Weighted
Hypertension     5.4%                    5.4%            3.9%
SE               0.25                    0.34            0.43

20

Example from Programme for International Student Assessment (PISA) in 2000

- In each country: sample of schools, and students within

those schools

- Compare average reading scores in Denmark and US

21

Solid lines— weighted means and confidence intervals ignoring stratification and clustering.

Dashed lines—confidence intervals after computing the correct standard errors.

22

Estimating Model Parameters

Analyze survey data

Find correlates of poverty

Decide whether a social program is effective

Fit a model to a sample from a finite population

Obtain estimates of model parameters that are model-

unbiased and design-unbiased.

Model-unbiased means “average with respect to a

model”

Design-unbiased means “average over samples

selected in the same way as the one you analyze”

23

Regression Example using Academic Performance

Index (API) data from California

- Stratified school sample clustered by school district

- Regress API for school on these predictors:

ell Percent of English Language Learners in school

meals Percentage of students eligible for subsidized meals in school

mobility percentage of students for whom this is the first year at the school

24

OLS regression

            Est       SE      t-stat    Pr(>|t|)
Intcpt      794.98    11.74   67.71     <2e-16 ***
ell          -0.64     0.42   -1.52     0.13
meals        -2.87     0.29   -9.74     <2e-16 ***
mobility      0.02     0.47    0.03     0.98

Survey-weighted regression

            Est       SE      t-stat    Pr(>|t|)
Intcpt      811.49    30.88   26.28     <2e-16 ***
ell          -2.06     1.41   -1.46     0.15
meals        -1.78     1.11   -1.61     0.12
mobility      0.33     0.53    0.61     0.54

25

Interpretation of Estimates

If the model is “correctly specified”, meaning that it is a

good description of the structure in the population, then

OLS and WLS each give model-unbiased estimates of

the underlying model parameters.

Weighted estimates are also estimates of parameters if

we fitted the model to the whole finite population (i.e.,

the census model). True even if model is misspecified.
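The "census model" interpretation can be illustrated in an idealized case. In the sketch below, each sample unit with integer weight w is assumed to stand for exactly w identical population units, so the weighted sample fit equals the unweighted fit on the expanded population. Real samples only approximate this; the data and function name are hypothetical.

```python
def weighted_simple_regression(x, y, w):
    """Return (intercept, slope) minimizing sum of w_i * (y_i - a - b * x_i)^2."""
    sw = sum(w)
    xbar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    ybar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - xbar) * (yi - ybar) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - xbar) ** 2 for wi, xi in zip(w, x))
    slope = sxy / sxx
    return ybar - slope * xbar, slope


# Hypothetical sample with integer survey weights.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 3.0, 7.0, 8.0]
w = [1, 3, 1, 2]
sample_fit = weighted_simple_regression(x, y, w)

# Idealized "population": each sample unit repeated w times, fitted unweighted.
x_pop = [xi for xi, wi in zip(x, w) for _ in range(wi)]
y_pop = [yi for yi, wi in zip(y, w) for _ in range(wi)]
census_fit = weighted_simple_regression(x_pop, y_pop, [1] * len(x_pop))

print(sample_fit, census_fit)
```

The two fits agree, which is the sense in which a survey-weighted regression targets the model that would be fitted to the whole finite population.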

26

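- Suppose we fit $y_i = \alpha + \beta x_i + \varepsilon_i$ but the right model is $y_i = \alpha + \beta x_i + \gamma x_i^2 + \varepsilon_i$.

- The WLS sample fit estimates $y_i = \alpha + \beta x_i + \varepsilon_i$ for the full population, which is probably not a good model.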

27

Effort needs to be made to specify the model correctly

by using diagnostics, e.g.,

- Weighted bubble plots of y vs. each (quantitative

or ordered categorical) x being considered.

- Plot residuals vs. individual x’s

- Compare cell means of y for groups based on

categorical x’s

28

Standard Error Estimation

Regardless of whether OLS or WLS is used, stratification

and clustering need to be considered when estimating

standard errors.

Clustering will usually increase standard errors

compared to SRS. OLS SE’s or even WLS SE’s from

standard (non-survey) regression procedures that

ignore clustering will be incorrect—typically too small.

Using SE’s that are too small will lead to some

explanatory variables being judged as important

that are not.
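To show what accounting for clustering means in practice, here is a sketch of one standard approach: the linearization (ultimate-cluster) variance estimator for a weighted mean, treating PSUs as sampled with replacement within strata. It is a generic textbook estimator, not a description of any particular package's output, and the data are hypothetical.

```python
import math
from collections import defaultdict


def weighted_mean_with_design_se(records):
    """records: iterable of (stratum, psu, weight, y).

    Returns (weighted mean, linearization SE) using PSU-level totals
    of the linearized variable within each stratum.
    """
    records = list(records)
    w_total = sum(w for _, _, w, _ in records)
    mean = sum(w * y for _, _, w, y in records) / w_total

    # PSU totals of z_k = w_k * (y_k - mean) / w_total.
    psu_totals = defaultdict(lambda: defaultdict(float))
    for h, psu, w, y in records:
        psu_totals[h][psu] += w * (y - mean) / w_total

    variance = 0.0
    for h, totals in psu_totals.items():
        n_h = len(totals)
        if n_h < 2:
            raise ValueError(f"stratum {h!r} needs at least 2 PSUs")
        zbar = sum(totals.values()) / n_h
        variance += n_h / (n_h - 1) * sum((z - zbar) ** 2 for z in totals.values())
    return mean, math.sqrt(variance)


# Hypothetical data: 4 PSUs of 2 persons each; persons in a PSU are alike.
values = {"a": [1.0, 1.0], "b": [2.0, 2.0], "c": [5.0, 5.0], "d": [6.0, 6.0]}
clustered = [("h1", psu, 1.0, y) for psu, ys in values.items() for y in ys]
# Same persons, wrongly treated as 8 independent units (each its own PSU).
ignoring = [("h1", i, 1.0, y) for i, (_, _, _, y) in enumerate(clustered)]

mean_c, se_clustered = weighted_mean_with_design_se(clustered)
mean_i, se_ignoring = weighted_mean_with_design_se(ignoring)
print(mean_c, se_clustered, se_ignoring)
```

The point estimate is the same either way, but the SE that ignores clustering is noticeably smaller, which is the mechanism by which unimportant explanatory variables can be judged important.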

29

Software for Survey Estimation

30

Software for Estimating Sampling Errors

Commercial      Free
STATA           R (survey package add-on)
SPSS            IVEware (U. of Michigan)
SAS             WesVar
SUDAAN
Mplus

31

Variance Estimation Methods (packages compared: STATA, R, SPSS, SAS, SUDAAN)

- Linearization

- Replication: BRR, BRR-Fay, Jackknife, Bootstrap

32

Descriptive Statistics (packages compared: STATA, R, SPSS, SAS, SUDAAN)

Estimates and standard errors for:

- Means
- Totals
- Ratios
- Proportions
- Geometric means
- Quantiles

33

Modeling

Analysis Features (packages compared: Stata, R, SPSS, SAS, SUDAAN)

- Linear Regression
- Logistic Regression
  - Dichotomous
  - Polychotomous: Multinomial, Cum logit
- Poisson Regression
- Probit Models
- Loglinear Models
- Tests of Independence in Tables
- Linear Contrasts, Differences
- Survival Analysis

34

Procedures only available in Stata:

svyivreg Instrumental variables regression

svyintreg Interval and censored regression

svyprobit Probit models for survey data

svyoprobit Ordered probit models

svynbreg Negative binomial regression

svygnbreg Generalized negative binomial regression

svyheckman Heckman selection model

svyheckprob Probit estimation with selection

35

Programming Capabilities

STATA, R, SPSS, SAS

- Extensive programming features: recoding, functions, macros, matrix manipulations, data management

- Graphics

SUDAAN

- Limited recoding in standalone version

- SAS-callable version gives access to all of SAS

36

Web Pages

IVEware http://www.isr.umich.edu/src/smp/ive/

Mplus http://www.statmodel.com/mplus_index.shtml

R http://www.r-project.org/

SAS http://www.sas.com

SPSS http://www.spss.com/complex_samples/

STATA http://www.stata.com

SUDAAN http://www.rti.org/sudaan

37

References

Handbook of Statistics No. 29, Sample Surveys: Methods and Inference (2010). Amsterdam: Elsevier.

Heeringa, S., West, B., Berglund, P. (2010). Applied Survey Data Analysis. Boca Raton: Chapman Hall/CRC.

Kreuter, F. and Valliant, R. (2007). A Survey on Survey Statistics: What is done and can be done in Stata. Stata Journal, 7, 1-21.

Rabe-Hesketh, S., and Skrondal, A. (2006). Multilevel modelling of complex survey data, Journal of the Royal Statistical Society A, 169, Part 4, 805–827.

Valliant, R., Dever, J., and Kreuter, F. (2012). Practical Tools for Designing and Weighting Survey Samples. New York: Springer.

38

Journals

- Journal of Official Statistics, published by Statistics Sweden

- Public Opinion Quarterly, published by the American Association for Public Opinion Research

- Survey Methodology, published by Statistics Canada

- Survey Research Methods, published by the European Survey Research Association

All are free on-line (with a delay in some cases)

39

Conclusion

Features of complex sample designs that affect

inferences

- Stratification

- Clustering

- Weights

Software is available that easily allows the complexities

to be accounted for
