Performance of Resampling Variance Estimation Techniques with Imputed Survey Data

Upload: jonas-baker

Post on 02-Jan-2016

TRANSCRIPT

• The jackknife variance estimator based on adjusted imputed values, proposed by Rao and Shao (1992).

• The bootstrap procedure proposed by Shao and Sitter (1996).
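
As a rough illustration of the first technique, the sketch below computes a delete-one jackknife variance for the mean under mean imputation. It assumes simple random sampling with a negligible sampling fraction, and the function names are illustrative rather than taken from the slides. Under mean imputation, adjusting the imputed values when a respondent is deleted amounts to re-imputing from the remaining respondents, which is what the loop does.

```python
def imputed_mean(values, responded):
    """Mean of the completed data after mean imputation.

    values[i] is used only when responded[i] is True.
    """
    resp = [y for y, r in zip(values, responded) if r]
    resp_mean = sum(resp) / len(resp)
    completed = [y if r else resp_mean for y, r in zip(values, responded)]
    return sum(completed) / len(completed)


def adjusted_jackknife_variance(values, responded):
    """Delete-one jackknife variance of the imputed mean.

    Assumption: simple random sampling, finite population correction ignored.
    Requires at least two respondents.
    """
    n = len(values)
    full_estimate = imputed_mean(values, responded)
    sum_sq = 0.0
    for j in range(n):
        vals_j = values[:j] + values[j + 1:]
        resp_j = responded[:j] + responded[j + 1:]
        # Re-imputing without unit j shifts the imputed values by the change
        # in the respondent mean (the adjustment under mean imputation).
        theta_j = imputed_mean(vals_j, resp_j)
        sum_sq += (theta_j - full_estimate) ** 2
    return (n - 1) / n * sum_sq


# Toy data: four respondents and two non-respondents.
values = [1.0, 2.0, 3.0, 4.0, None, None]
responded = [True, True, True, True, False, False]
v_jack = adjusted_jackknife_variance(values, responded)
```

Deleting a non-respondent leaves the replicate estimate unchanged here, so only the respondents contribute to the sum.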

Performance

We carry out a Monte Carlo study.

Over the replications, we compute for each variance estimator:

• Relative bias

• Relative mean square error

• Coverage rate of the 95% confidence interval based on the normal distribution
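
The slides do not give formulas for these measures, so the sketch below uses common definitions as an assumption: relative bias and relative mean square error of the variance estimates with respect to a reference ("true") variance, and the share of replications whose normal-based 95% interval contains the true value. The function and argument names are illustrative.

```python
import math


def performance_measures(var_estimates, point_estimates,
                         true_variance, true_value, z=1.96):
    """Summarise a variance estimator over Monte Carlo replications.

    Assumed definitions (not stated in the slides):
      relative bias = (mean(v) - V) / V
      relative MSE  = mean((v - V)^2) / V^2
      coverage      = share of intervals est +/- z*sqrt(v) containing the truth
    """
    k = len(var_estimates)
    mean_v = sum(var_estimates) / k
    rel_bias = (mean_v - true_variance) / true_variance
    mse = sum((v - true_variance) ** 2 for v in var_estimates) / k
    rel_mse = mse / true_variance ** 2
    covered = sum(
        1
        for est, v in zip(point_estimates, var_estimates)
        if abs(est - true_value) <= z * math.sqrt(v)
    )
    return rel_bias, rel_mse, covered / k


# Two toy replications.
rel_bias, rel_mse, coverage = performance_measures(
    var_estimates=[1.0, 3.0],
    point_estimates=[10.0, 20.0],
    true_variance=2.0,
    true_value=11.0,
)
```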

Imputation methods

• Ratio and mean imputation

• For each method we consider several fractions of missing data, with and without covariates
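
The two imputation methods can be sketched as follows, assuming missing values are marked with None and the auxiliary variable is observed for every unit. The function names are illustrative.

```python
def mean_impute(y):
    """Replace each missing value with the respondent mean."""
    resp = [v for v in y if v is not None]
    resp_mean = sum(resp) / len(resp)
    return [resp_mean if v is None else v for v in y]


def ratio_impute(y, x):
    """Replace each missing y with R * x, where R = sum(y) / sum(x)
    is computed over respondents only."""
    pairs = [(v, u) for v, u in zip(y, x) if v is not None]
    ratio = sum(v for v, _ in pairs) / sum(u for _, u in pairs)
    return [ratio * u if v is None else v for v, u in zip(y, x)]


# Toy data: y plays the role of turnover, x of the auxiliary variable.
y = [10.0, None, 30.0, None]
x = [1.0, 4.0, 3.0, 2.0]
y_mean = mean_impute(y)
y_ratio = ratio_impute(y, x)
```

Ratio imputation uses the unit's own auxiliary value, so imputed values differ across non-respondents, whereas mean imputation assigns them all the same value.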

Case 1: Structural Business Survey

• Population: the Annual Industrial Business Survey (a complete enumeration of enterprises with 20 or more employees), of size N=16,438

• The variable to impute: turnover

• Auxiliary variable: total expenses

Case 2: Retail Trade Index Survey

• Population: the sample of businesses from the Retail Trade Index Survey, of size N=9,414

• The variable to impute: turnover

• Auxiliary variable: turnover in the same month of the previous year

Case 1: Monte Carlo study

• Simple random samples without replacement of sizes n=100, 500, 1000 and 5000

• Non-response in the turnover variable is randomly generated (uniform response mechanism)

• A non-response rate of about 30% is simulated
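
One replication of this design might look like the sketch below. The population is a synthetic, skewed stand-in generated for illustration (the survey data are not reproduced here), and the function name and seed are assumptions.

```python
import random


def draw_srswor_with_nonresponse(population, n, missing_rate=0.30, rng=None):
    """Draw a simple random sample without replacement, then delete the
    study variable independently with the same probability for every unit
    (uniform response mechanism)."""
    rng = rng or random.Random()
    sample = rng.sample(population, n)  # sampling without replacement
    return [(None if rng.random() < missing_rate else y, x) for y, x in sample]


# Synthetic population of (turnover, auxiliary) pairs, for illustration only.
rng = random.Random(2016)
population = []
for _ in range(2000):
    aux = rng.lognormvariate(0.0, 1.0)
    turnover = 1.2 * aux * rng.lognormvariate(0.0, 0.2)
    population.append((turnover, aux))

sample = draw_srswor_with_nonresponse(population, n=100, rng=rng)
n_missing = sum(1 for y, _ in sample if y is None)
```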

Case 2: Monte Carlo study

• Stratified random samples without replacement of sizes n=800, 1500, 2200 and 3000

• Non-response in the turnover variable is randomly generated (response mechanism uniform)

• Missing data are generated following a distribution similar to the true missing value pattern observed in the survey.
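
A stratified draw with stratum-specific non-response can be sketched as below. The slides give neither the strata, the allocation, nor the observed missing-data pattern, so the stratum labels, sample sizes and missing rates here are purely hypothetical inputs.

```python
import random


def stratified_sample_with_nonresponse(strata, allocation, missing_rates,
                                       rng=None):
    """Within each stratum, draw a simple random sample without replacement
    and delete values with that stratum's missing rate.

    strata:        dict stratum -> list of values
    allocation:    dict stratum -> sample size n_h (hypothetical)
    missing_rates: dict stratum -> probability of non-response (hypothetical)
    """
    rng = rng or random.Random()
    result = {}
    for h, units in strata.items():
        drawn = rng.sample(units, allocation[h])
        result[h] = [
            None if rng.random() < missing_rates[h] else y for y in drawn
        ]
    return result


# Extreme rates make the behaviour easy to see on toy data.
strata = {"small": list(range(100)), "large": list(range(1000, 1050))}
allocation = {"small": 10, "large": 5}
missing_rates = {"small": 0.0, "large": 1.0}
drawn = stratified_sample_with_nonresponse(
    strata, allocation, missing_rates, rng=random.Random(1)
)
```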

Monte Carlo study

The number of replications is 200,000 for each auxiliary variable, imputation method and sample size.

Results (I)

• The performance of the jackknife variance estimator is better for larger sample sizes and for ratio imputation.

• The jackknife variance estimator performs poorly. This shows that strong skewness and kurtosis of the imputed variable can considerably influence the results.

Results (II)

• The relative bias is large for small sample sizes, then decreases, and increases again when the sampling fraction becomes non-negligible.

• The coverage rate is not close to the nominal one even for large samples, because of the skewed and heavy-tailed distributions of the variables.

Conclusions (I)

• Ratio imputation should be used instead of mean imputation whenever auxiliary variables are available.

• In these examples, stratification of the sample does not improve the quality of the jackknife variance estimator.

Conclusions (II)

• The percentile bootstrap performs better than the jackknife in terms of the coverage rate of the confidence intervals; the jackknife performs better in terms of the mean square error and bias of the variance estimator.
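
For reference, a percentile bootstrap interval with re-imputation inside each bootstrap sample can be sketched as follows. This is a simplified version that resamples n units with replacement and uses mean imputation; it ignores the finite population correction and is not claimed to reproduce the exact procedure used in the study. Names and the seed are illustrative.

```python
import math
import random


def bootstrap_imputed_means(y, n_boot=1000, rng=None):
    """Resample units (respondents and non-respondents together), then
    re-impute each bootstrap sample from its own respondents."""
    rng = rng or random.Random()
    n = len(y)
    estimates = []
    while len(estimates) < n_boot:
        resample = [y[rng.randrange(n)] for _ in range(n)]
        resp = [v for v in resample if v is not None]
        if not resp:
            continue  # no respondents drawn, so nothing to impute from
        resp_mean = sum(resp) / len(resp)
        completed = [resp_mean if v is None else v for v in resample]
        estimates.append(sum(completed) / n)
    return estimates


def percentile_interval(estimates, level=0.95):
    """Empirical quantiles of the bootstrap estimates (simple index rule)."""
    s = sorted(estimates)
    k = len(s)
    alpha = (1.0 - level) / 2.0
    lower = s[math.floor(alpha * (k - 1))]
    upper = s[math.ceil((1.0 - alpha) * (k - 1))]
    return lower, upper


# Toy sample with two missing values.
y = [1.0, 2.0, None, 4.0, None, 3.0]
boot = bootstrap_imputed_means(y, n_boot=200, rng=random.Random(0))
lower, upper = percentile_interval(boot)
```

The interval is read directly from the bootstrap distribution, so it does not rely on the normal approximation, which may be part of why it copes better with skewed variables.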