Performance of Resampling Variance Estimation Techniques with Imputed Survey Data
• The jackknife variance estimator based on adjusted imputed values, proposed by Rao and Shao (1992)
• The bootstrap procedure proposed by Shao and Sitter (1996)
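As a rough illustration of the adjusted jackknife idea, the sketch below computes a delete-one jackknife variance for the mean of ratio-imputed data. It assumes simple random sampling with a negligible sampling fraction and deterministic ratio imputation, in which case adjusting the imputed values amounts to re-imputing within each replicate. The function names `imputed_mean` and `adjusted_jackknife_variance` are hypothetical and not taken from the study.

```python
import numpy as np

def imputed_mean(y, x, resp):
    """Sample mean after ratio imputation (hypothetical helper).

    Non-respondents receive x_i * (sum of respondent y / sum of respondent x).
    """
    ratio = y[resp].sum() / x[resp].sum()
    return np.where(resp, y, ratio * x).mean()

def adjusted_jackknife_variance(y, x, resp):
    """Delete-one jackknife variance of the imputed mean.

    Assumption: simple random sampling, no finite population correction.
    The imputed values are recomputed in every replicate, so deleting a
    respondent changes the values imputed to the non-respondents.
    """
    n = len(y)
    theta_full = imputed_mean(y, x, resp)
    theta_del = np.empty(n)
    for j in range(n):
        keep = np.arange(n) != j  # drop unit j
        theta_del[j] = imputed_mean(y[keep], x[keep], resp[keep])
    return (n - 1) / n * np.sum((theta_del - theta_full) ** 2)
```

Setting `x` to a vector of ones reduces the ratio imputation to mean imputation, so the same function covers both methods in this simplified setting.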
Performance
We carry out a Monte Carlo study.
For each replication, we compute:
• Relative bias
• Relative mean square error
• The 95% confidence interval based on the normal distribution
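The three measures above could be computed from the replication results along the following lines. The slides do not state the exact formulas, so the definitions here are assumptions: relative bias and relative mean square error of the variance estimator are taken relative to a reference variance `true_variance` (for example, the Monte Carlo variance of the point estimates), and the coverage rate counts how often the normal-based interval contains `true_value`.

```python
import numpy as np

def performance_measures(var_estimates, point_estimates,
                         true_value, true_variance, z=1.96):
    """Summarise a variance estimator over Monte Carlo replications.

    Assumed definitions (not given in the source):
      relative bias = (mean(v) - V) / V
      relative MSE  = mean((v - V)^2) / V^2
      coverage      = share of intervals t +/- z*sqrt(v) containing the truth
    """
    v = np.asarray(var_estimates, dtype=float)
    t = np.asarray(point_estimates, dtype=float)
    rel_bias = (v.mean() - true_variance) / true_variance
    rel_mse = np.mean((v - true_variance) ** 2) / true_variance ** 2
    half_width = z * np.sqrt(v)
    covered = (t - half_width <= true_value) & (true_value <= t + half_width)
    return rel_bias, rel_mse, covered.mean()
```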
Imputation methods
• Ratio and mean imputation
• For each method we consider several fractions of missing data, with and without covariates
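A minimal sketch of the two imputation methods, assuming deterministic imputation within a single imputation class. The names `mean_impute` and `ratio_impute` and the toy values are illustrative only, not data from the surveys.

```python
import numpy as np

def mean_impute(y, resp):
    """Replace missing y by the respondent mean."""
    return np.where(resp, y, y[resp].mean())

def ratio_impute(y, x, resp):
    """Replace missing y by x_i times the respondent ratio sum(y) / sum(x)."""
    ratio = y[resp].sum() / x[resp].sum()
    return np.where(resp, y, ratio * x)

# Toy example: two respondents, two non-respondents.
y = np.array([10.0, 20.0, np.nan, np.nan])
x = np.array([5.0, 10.0, 4.0, 8.0])
resp = np.array([True, True, False, False])

y_mean = mean_impute(y, resp)       # missing values become 15
y_ratio = ratio_impute(y, x, resp)  # ratio is 2, missing values become 8 and 16
```

Ratio imputation uses the auxiliary variable, so units with different `x` receive different imputed values, whereas mean imputation assigns every non-respondent the same value.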
Case 1: Structural Business Survey
• Population: Annual Industrial Business Survey (a complete enumeration of enterprises with 20 or more employees), of size N=16,438
• The variable to impute: Turnover
• Auxiliary variable: total expense
Case 2: Retail Trade Index Survey
• Population: sample of businesses from the Retail Trade Index Survey, of size N=9,414
• The variable to impute: Turnover
• Auxiliary variable: turnover in the same month of the previous year
Case 1: Monte Carlo study
• Simple random samples without replacement of sizes n=100, 500, 1000 and 5000
• Non-response in the turnover variable is randomly generated (uniform response mechanism)
• A non-response rate of about 30% is simulated
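One replication of this design might look like the sketch below: a simple random sample without replacement is drawn from the population, and each sampled unit independently becomes a non-respondent with probability 0.30. The function name `draw_replicate` and its arguments are hypothetical; `pop_y` and `pop_x` stand for the population turnover and auxiliary values, which are not reproduced here.

```python
import numpy as np

def draw_replicate(pop_y, pop_x, n, missing_rate=0.30, rng=None):
    """Draw one SRSWOR sample and apply uniform non-response to y.

    Assumption: every sampled unit responds independently with the same
    probability 1 - missing_rate (uniform response mechanism).
    Returns y with NaN for non-respondents, x, and the response flags.
    """
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(pop_y), size=n, replace=False)  # without replacement
    resp = rng.random(n) >= missing_rate                 # True = respondent
    y = np.where(resp, pop_y[idx], np.nan)
    return y, pop_x[idx], resp
```

Passing a seeded generator, for example `np.random.default_rng(123)`, makes the replications reproducible.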
Case 2: Monte Carlo study
• Stratified random samples without replacement of sizes n=800, 1500, 2200 and 3000
• Non-response in the turnover variable is randomly generated (uniform response mechanism)
• Missing data are generated following a distribution similar to the true missing value pattern observed in the survey.
Monte Carlo study
The number of replications is 200,000 for each auxiliary variable, imputation method and sample size
Results (I)
• The performance of the jackknife variance estimator is better for larger sample sizes and for ratio imputation.
• The jackknife variance estimator performs poorly. This shows that strong skewness and kurtosis of the imputed variable can considerably influence the results.
Results (II)
• The relative bias is large for small sample sizes, then decreases, and increases again when the sampling fraction becomes non-negligible
• The coverage rate is not close to the nominal one even for large samples, owing to the skewed and heavy-tailed distributions of the variables
Conclusions (I)
• Ratio imputation should be used instead of mean imputation whenever auxiliary variables are available.
• In these examples, stratifying the sample does not improve the quality of the jackknife variance estimator
Conclusions (II)
• The percentile bootstrap performs better than the jackknife in terms of the coverage rate of the confidence intervals, while the jackknife performs better in terms of the mean square error and bias of the variance estimator.
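For reference, a percentile bootstrap with re-imputation can be sketched as follows. This is a simplified illustration and not the exact Shao and Sitter (1996) procedure: it resamples n units with replacement from a simple random sample, ignores the finite population correction and stratification, and uses ratio imputation. The function name `bootstrap_with_reimputation` is hypothetical.

```python
import numpy as np

def bootstrap_with_reimputation(y, x, resp, n_boot=1000, alpha=0.05, rng=None):
    """Bootstrap variance and percentile interval for the imputed mean.

    Assumptions: with-replacement resampling of size n, ratio imputation,
    and at least one respondent in every bootstrap sample.
    Units are resampled together with their response flags, and the
    non-respondents are re-imputed inside each bootstrap sample.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = len(y)
    estimates = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        yb, xb, rb = y[idx], x[idx], resp[idx]
        ratio = yb[rb].sum() / xb[rb].sum()      # re-estimated per sample
        estimates[b] = np.where(rb, yb, ratio * xb).mean()
    variance = estimates.var(ddof=1)
    lower, upper = np.quantile(estimates, [alpha / 2, 1 - alpha / 2])
    return variance, (lower, upper)
```

The percentile interval is read directly from the bootstrap distribution, so it does not rely on the normal approximation, which may help explain its better coverage for skewed variables.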