STANDARD ERRORS PRESENTATION AND DISSEMINATION AT THE STATISTICAL OFFICE OF THE REPUBLIC OF SLOVENIA

Rudi Seljak, Statistical Office of the Republic of Slovenia

Post on 31-Dec-2015

Page 1: Rudi Seljak Statistical Office of the Republic of Slovenia

STANDARD ERRORS PRESENTATION AND DISSEMINATION AT THE STATISTICAL OFFICE OF THE REPUBLIC OF SLOVENIA

Rudi Seljak, Statistical Office of the Republic of Slovenia

Page 2: Rudi Seljak Statistical Office of the Republic of Slovenia

The summary of the presentation

• Introduction

• The “old” system for standard error presentation

• Reasons for the revision

• The revised system

• Conclusions

Page 3: Rudi Seljak Statistical Office of the Republic of Slovenia

Introduction

• Sample surveys are still the most commonly used means of collecting data in official statistics.

• The precision of statistical results is still a very important criterion of their quality.

• It is an important obligation of national statistical institutes to estimate sampling errors and then to disseminate and present these errors to users in a transparent and clearly readable form.

Page 4: Rudi Seljak Statistical Office of the Republic of Slovenia

The “old” system at SORS

• Standard errors of the estimated statistical results were very rarely published explicitly.

• Estimated coefficients of variation (CV) were used to determine the degree of precision.

• Four degrees of precision were defined:

– Estimates with a sufficient degree of precision (CV < 10%). Value disseminated with no denotation.

– Less precise estimates (10% ≤ CV < 15%). Value disseminated in single brackets.

– Imprecise estimates (15% ≤ CV ≤ 30%). Value disseminated in double brackets.

– Extremely imprecise estimates (CV > 30%). Value not published but replaced with the sign “.”.

Page 5: Rudi Seljak Statistical Office of the Republic of Slovenia

The “old” system – example (LFS)

Page 6: Rudi Seljak Statistical Office of the Republic of Slovenia

Reasons for the revision

• Brackets are not well suited as denotation signs for dissemination through electronic media.

• Denoting only the degree of precision is not sufficient for the more demanding users.

• Defining the degrees of precision solely by the coefficient of variation is not appropriate for some (non-dimensional) statistics, e.g. proportions.

Page 7: Rudi Seljak Statistical Office of the Republic of Slovenia

“CV criterion” problem

• Applying the CV criterion directly to non-dimensional statistics can cause serious inconsistencies.

• Suppose we want to estimate the proportion of units with a certain characteristic in a large population.

$\hat{p} = 0.01$, $cv(\hat{p}) = 10.5\%$

• Disseminating q = 1 − p instead:

$\hat{q} = 0.99$, $cv(\hat{q}) = 0.1\%$

• The two estimates carry the same information and have the same standard error, yet only $\hat{p}$ exceeds the 10% CV limit.
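
The asymmetry can be checked numerically. The sketch below assumes simple random sampling from a large population, so that se(p̂) = sqrt(p̂(1 − p̂)/n). The function name `se_and_cv` and the sample size n = 9000 are illustrative assumptions, not values taken from the slide; the sample size was chosen only because it gives CVs close to the ones quoted.

```python
import math


def se_and_cv(p, n):
    """Return (se, cv) of an estimated proportion p from a sample of size n.

    Assumes simple random sampling from a large population
    (no finite population correction).
    """
    se = math.sqrt(p * (1.0 - p) / n)
    return se, se / p


n = 9000  # hypothetical sample size, not taken from the slide
se_p, cv_p = se_and_cv(0.01, n)
se_q, cv_q = se_and_cv(0.99, n)

print(f"p = 0.01: se = {se_p:.5f}, cv = {cv_p:.1%}")
print(f"q = 0.99: se = {se_q:.5f}, cv = {cv_q:.1%}")
```

Under these assumptions the standard error is the same for p and q, while the CV of p is 99 times the CV of q, simply because the same standard error is divided by 0.01 instead of 0.99.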

Page 8: Rudi Seljak Statistical Office of the Republic of Slovenia

CV- p:q differences

Page 9: Rudi Seljak Statistical Office of the Republic of Slovenia

The revised system

• Standard errors can be disseminated explicitly or by using denotations for the different degrees of precision.

– Denotations are used for releases with a small number of results (e.g. First Release) and for general releases (e.g. Statistical Yearbook).

– Explicit dissemination is used for more exhaustive and domain-oriented releases.

• Three degrees of precision:

– Estimate of acceptable precision → published without limitations

– Less precise estimate → flagged for caution with the letter M

– Estimate too imprecise to be published → suppressed and replaced with the letter N

Page 10: Rudi Seljak Statistical Office of the Republic of Slovenia

The revised system cont’d

• For totals or means of positive, continuous variables, the limits are determined directly from the coefficient of variation.

– CV < 10% → estimate of acceptable precision → published without denotation

– CV between 10% and 30% → less precise estimate → flagged for caution with the letter M

– CV > 30% → estimate too imprecise to be published → suppressed and replaced with the letter N
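
The CV rule for totals and means can be written as a small function. This is a minimal sketch, not the SORS implementation: the name `flag_by_cv` is hypothetical, and assigning a CV of exactly 10% or exactly 30% to the letter M is an assumption, since the slide does not say which side the boundaries belong to.

```python
def flag_by_cv(cv_percent):
    """Return the denotation for a total or mean, given its CV in percent.

    ""  = acceptable precision, published without denotation
    "M" = less precise, flagged for caution
    "N" = too imprecise, value suppressed
    Boundary handling (exactly 10% or 30% gives "M") is an assumption.
    """
    if cv_percent < 0:
        raise ValueError("CV must be non-negative")
    if cv_percent < 10:
        return ""
    if cv_percent <= 30:
        return "M"
    return "N"


for cv in (4.2, 18.5, 42.0):
    print(cv, repr(flag_by_cv(cv)))
```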

Page 11: Rudi Seljak Statistical Office of the Republic of Slovenia

Limits for the proportions

• The limit standard errors are calculated for the case p = 0.5 by using the CV criteria. These limits are then used for every value of the estimated proportion.

[Figure: p on a scale from 0 to 1, with 0.5 marked]
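
As a worked step, the limits follow from the definition $cv(\hat{p}) = se(\hat{p})/\hat{p}$ evaluated at $\hat{p} = 0.5$, using the CV limits of 10% and 30% that apply to totals and means. The subscripted symbols $se_{M}$ and $se_{N}$ are notation introduced here for the two limits, not symbols from the slides.

```latex
\begin{align*}
se(\hat{p}) &= cv(\hat{p}) \cdot \hat{p} \\
se_{M} &= 0.10 \cdot 0.5 = 0.05 \\
se_{N} &= 0.30 \cdot 0.5 = 0.15
\end{align*}
```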

Page 12: Rudi Seljak Statistical Office of the Republic of Slovenia

Limits for the proportions cont’d

• se < 0.05 → estimate of acceptable precision → published without denotation

• se between 0.05 and 0.15 → less precise estimate → flagged for caution with the letter M

• se > 0.15 → estimate too imprecise to be published → suppressed and replaced with the letter N

• For the number of units with a certain characteristic, the limits are derived from the above limits by using the formula se(N·p) = N·se(p)
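
The rule for proportions and its extension to counts can be sketched as follows. The function names `flag_proportion` and `flag_count` are hypothetical, the population size N is treated as known, and assigning a standard error of exactly 0.05 or 0.15 to the letter M is an assumption. The count rule uses se(N·p) = N·se(p), so dividing the standard error of the count by N gives the standard error of the proportion.

```python
def flag_proportion(se):
    """Return the denotation for an estimated proportion, given its standard error.

    Boundary handling (exactly 0.05 or 0.15 gives "M") is an assumption.
    """
    if se < 0:
        raise ValueError("standard error must be non-negative")
    if se < 0.05:
        return ""
    if se <= 0.15:
        return "M"
    return "N"


def flag_count(se_count, population_size):
    """Return the denotation for an estimated number of units.

    Since se(N*p) = N*se(p), comparing se_count with N*0.05 and N*0.15
    is the same as comparing se_count / N with 0.05 and 0.15.
    """
    if population_size <= 0:
        raise ValueError("population size must be positive")
    return flag_proportion(se_count / population_size)


print(repr(flag_proportion(0.08)))     # standard error of a proportion
print(repr(flag_count(10000, 50000)))  # standard error of a count, N = 50000
```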

Page 13: Rudi Seljak Statistical Office of the Republic of Slovenia

The revised system – example (Tourism Travels of Domestic Population)

Page 14: Rudi Seljak Statistical Office of the Republic of Slovenia

Application for standard error calculation

• To enable standardized and transparent calculation of standard errors, a special SAS application was built.

• The application performs aggregation and standard error calculation and, if needed, denotes the results with the special signs.

• The application is designed as a metadata-driven system.

• So far the application supports standard error calculation for five types of statistics.
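
The slides do not describe the application's metadata, so the following is only an illustration of what "metadata driven" can mean, written in Python rather than SAS. The `RULES` table, its keys and the function `denote` are all hypothetical. The table covers only the statistic types whose limits appear in the slides, and a value exactly on a limit is assigned to M by assumption.

```python
# Hypothetical metadata: one rule per statistic type. The precision measure
# and its limits are data, so adding a statistic type needs no new code.
RULES = {
    "total": {"measure": "cv", "caution": 0.10, "suppress": 0.30},
    "mean": {"measure": "cv", "caution": 0.10, "suppress": 0.30},
    "proportion": {"measure": "se", "caution": 0.05, "suppress": 0.15},
}


def denote(stat_type, estimate, se):
    """Return "", "M" or "N" for an estimate, using the rule for its type."""
    rule = RULES[stat_type]
    if rule["measure"] == "cv":
        if estimate == 0:
            raise ValueError("CV is undefined for a zero estimate")
        value = se / abs(estimate)
    else:
        value = se
    if value < rule["caution"]:
        return ""
    if value <= rule["suppress"]:
        return "M"
    return "N"


print(repr(denote("mean", 20.0, 4.0)))         # judged by CV
print(repr(denote("proportion", 0.01, 0.02)))  # judged by standard error
```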

Page 15: Rudi Seljak Statistical Office of the Republic of Slovenia

Conclusions

• A system for denoting less precise estimates, based on the coefficient of variation, was used at SORS for many years.

• A critical review pointed out the need for a theoretical and technical revision of the system.

• The main new features of the revised system:

– Denotation with signs as well as explicit dissemination of standard errors can be used.

– The “bracket signs” are replaced with “letter signs”.

– Different criteria are used for different types of statistics.

– A metadata-driven application was built, which enables a standardized process of standard error calculation.