To be presented 8-10-2014 at the UN workshop before the CIRET conference in Hangzhou. The views expressed in this presentation are those of the authors and do not necessarily reflect the policies of Statistics Netherlands.

Frank van de Pol, Jan van den Brakel, Pim Ouwehand, Floris van Ruth, Piet Verbiest

Handbook Composite Estimators, Data Related Issues (chapter 4)



2

Business cycle clock summarises and shows economic trend as clock movement

3 Ideal situation: all data in time, comparable,…

– Eurostat gives series for EU countries, USA, …
– http://epp.eurostat.ec.europa.eu/cache/BCC2/group1/xdis_en.html

– OECD gives series for total of member countries
– http://stats.oecd.org/mei/bcc/default.html

– Netherlands, Germany, Denmark, …

– What if we have “data problems”?

Data Related Issues with Composite Estimators

A. Discontinuities in time series: back casting

B. Some indicators may be missing or too late

C. Mixed frequency data, delayed results

D. Incomplete data and indices, changing weights

E. Seasonal Adjustment interpretation: outlier or start of crisis?

Problematic data?

4 Overview

Interruption of time series caused by a change of method

• Importance of comparable figures in time series
• Answers are influenced by context
• Selective response affects outcomes

• Survey design changes
• “New” modes: telephone, web
• Combination of similar surveys for more detail

SN with other governmental bodies
• Police monitor (BiZa), safety monitor (SN) & other
• Health survey (SN) & GGD surveys (municipal health service)

5 A. Interrupted time series


Jan van den Brakel, Paul Smith, Simon Compton, Survey Methodology 2008


Sometimes a discontinuity is found

9 A. Interrupted time series

Interrupted time series due to method change: several situations

• Changed measurement target variables
• Parallel data collection with both designs
• Step parameter interrupted time series

• Change in classification, editing or imputation
• Recalculation with both methods

• New editing strategy (partially automatic)
• More efficient estimation (small area)
• Classification change: profession, education, industry
• UN classifications with thousands of categories
• http://unstats.un.org/unsd/class/family/default.asp
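The "step parameter" approach listed above can be written as a small model. This is only a sketch in assumed notation (the symbols are not taken from the slides): y_t is the published estimate, L_t the underlying level or trend, T_c the period in which the new method is introduced, and β the size of the discontinuity.

```latex
y_t = L_t + \beta\,\delta_t + \varepsilon_t,
\qquad
\delta_t =
\begin{cases}
0 & \text{if } t < T_c \quad \text{(old method)}\\
1 & \text{if } t \ge T_c \quad \text{(new method)}
\end{cases}
```

Under this reading, an estimate of β that differs significantly from zero indicates a discontinuity caused by the method change rather than by a real development.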

10 A. Interrupted time series

Discontinuities: how to cope?

• Suppose a correction factor has been calculated

• Which part of the series is better, old or new?
• Adapt history: “back casting” of the series
• Adapt future data: cannot continue very long

• Correction factor gets less valid as extrapolation is extended further, both w.r.t. past and future

• It would be convenient if we could consider the new figures as better figures
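A minimal sketch of back casting, assuming the correction factor is multiplicative and is derived from a parallel run in which both methods measured the same periods. The function names and all figures are hypothetical.

```python
def correction_factor(old_overlap, new_overlap):
    """Ratio of the new-method level to the old-method level over the parallel run."""
    return sum(new_overlap) / sum(old_overlap)


def back_cast(old_series, factor):
    """Rescale historical (old-method) figures to the level of the new method."""
    return [value * factor for value in old_series]


# Hypothetical figures: both methods were run in parallel for the last two periods.
old_history = [100.0, 102.0, 104.0, 105.0]
old_overlap = [104.0, 105.0]
new_overlap = [109.2, 110.25]

factor = correction_factor(old_overlap, new_overlap)
adjusted_history = back_cast(old_history, factor)
print(round(factor, 4), [round(v, 2) for v in adjusted_history])
```

A single constant factor is the simplest choice. As noted above, it becomes less valid the further it is extrapolated away from the parallel-run period.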

11 A. Interrupted time series

Data Related Issues with Composite Estimators

Some indicators may be missing or too late

The right data are not available

‐ “Holes” in micro data
‐ No register, but there is a survey
‐ No survey, but related registers
‐ Register information is too late, but proxy available
‐ Some regions provide information, others do not or are late
‐ Only social media information available

12 B. Indicators missing or too late

Data Related Issues with Composite Estimators

Some indicators may be missing or too late

‐ “Holes” in micro data
‐ No register
‐ No survey, but related registers
‐ Register information is too late, but proxy available
‐ Some regions provide information, others do not or are late
‐ Only social media information available

13 B. Indicators missing or too late

‐ Imputation
‐ Use a survey
‐ Use predictors
‐ Combine old register data with change information
‐ Use small area estimate or synthetic estimator
‐ Create indicator from Twitter, Facebook, Weibo
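One way to read "combine old register data with change information" is to move the last available register level forward with the growth rate of a timely proxy. The sketch below assumes the proxy and the register grow at the same rate; the function name and figures are hypothetical.

```python
def extrapolate_register(register_level, proxy_then, proxy_now):
    """Move an outdated register level forward with the growth of a timely proxy."""
    growth = proxy_now / proxy_then
    return register_level * growth


# Hypothetical: the register total dates from last year, the proxy is current.
estimate = extrapolate_register(register_level=2000.0, proxy_then=50.0, proxy_now=51.0)
print(round(estimate, 1))
```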

Mixed frequency data and delayed results

Composing several indicators
‐ Publication delay differs: ragged edges
‐ Publication frequency differs: ragged edges

Use high frequency (monthly) indicator to forecast the lagging quarterly or yearly indicators (Foroni and Marcellino, 2013)

Methods:
– Interpolation: linear, Chow-Lin, Denton (EViews)
– ARIMA regression & imputation (US Leading Economic Index)
– State space models (STAMP, Ox, EViews)
– Mixed Frequency Vector Autoregression (VAR)
– Bridge models
– Mixed Data Sampling (MIDAS)
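As an illustration of the simplest of these methods, a bridge model, the sketch below averages a monthly indicator to quarters, regresses the quarterly target on it, and nowcasts the quarter that has not yet been published. It assumes a single regressor and ordinary least squares. All names and figures are hypothetical.

```python
from statistics import mean


def quarterly_average(monthly):
    """Average complete blocks of three months into quarterly values."""
    return [mean(monthly[i:i + 3]) for i in range(0, len(monthly) - 2, 3)]


def fit_bridge(x, y):
    """Ordinary least squares for y = a + b * x with one regressor."""
    x_bar, y_bar = mean(x), mean(y)
    b = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
         / sum((xi - x_bar) ** 2 for xi in x))
    a = y_bar - b * x_bar
    return a, b


# Hypothetical data: 15 months of the indicator, 4 published quarters of the target.
monthly_indicator = [98, 100, 102, 101, 103, 105, 104, 106, 108,
                     107, 109, 111, 110, 112, 114]
quarterly_target = [50.0, 51.5, 53.0, 54.5]  # fifth quarter not yet published

x_quarterly = quarterly_average(monthly_indicator)
a, b = fit_bridge(x_quarterly[:4], quarterly_target)
nowcast = a + b * x_quarterly[4]
print(round(nowcast, 2))
```

With a ragged edge inside the current quarter, the missing months of the indicator would first have to be forecast, for example with an ARIMA model, before averaging.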

14 C. Mixed frequency data, delayed results

Incomplete data and indices, changing weights

– Data collection skipping 2nd half of the period
‐ i.e. observation period does not match reporting period

– Large revisions from preliminary to final figures are harmful and should be avoided

– Observation period one week? Fluctuations between weeks will inflate fluctuation in monthly series

– Observation period first two weeks of a month? More stable series will result.
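The effect of the observation period can be made explicit under a simplifying assumption that is not in the slides: each weekly observation x_w equals the monthly level plus independent noise with variance σ². Averaging k weeks then gives

```latex
\operatorname{Var}\!\left(\frac{1}{k}\sum_{w=1}^{k} x_w\right) = \frac{\sigma^2}{k}
\quad\Longrightarrow\quad
k = 1:\ \sigma^2, \qquad k = 2:\ \tfrac{1}{2}\,\sigma^2
```

so observing two weeks instead of one halves the noise variance of the monthly figure. If the weekly noise is positively correlated, the gain is smaller.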

15 D. Incomplete data and indices, weights

Incomplete data and indices, changing weights

– Index: starting point = 100, multiplied by a series of growth rates

– For this simplification to be true, the population should not change, but it does
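The index construction described above can be sketched as follows, assuming the growth rates are period-on-period rates measured on a comparable population. The function name and rates are hypothetical.

```python
def chain_index(growth_rates, base=100.0):
    """Build an index from period-on-period growth rates (0.02 means +2%)."""
    index = [base]
    for rate in growth_rates:
        index.append(index[-1] * (1.0 + rate))
    return index


# Hypothetical growth rates: +10% followed by -10% does not return to 100.
index = chain_index([0.10, -0.10])
print([round(v, 2) for v in index])
```

If the population changes between periods, a growth rate computed on the units present in both periods no longer describes the whole population, which is the caveat raised above.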

16 D. Incomplete data and indices, weights

Incomplete data and indices, changing weights

– Price index: weighted set of basic products

– Weights should reflect consumption pattern

– Weights can be fixed until a revision year or updated every year, linking years with a chain of growth rates
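The two weighting schemes can be compared in a small sketch. It assumes the weights are expenditure shares summing to 1, uses a Laspeyres-type formula for the fixed-weight index, and links each year with the weights of the preceding year for the chained index. Products, prices and weights are hypothetical.

```python
def weighted_relative(prices_now, prices_ref, weights):
    """Weighted average of price relatives; weights are shares summing to 1."""
    return sum(w * p1 / p0 for w, p1, p0 in zip(weights, prices_now, prices_ref))


def fixed_weight_index(prices_by_year, base_weights):
    """Index with weights fixed at the base year; base year = 100."""
    base = prices_by_year[0]
    return [100.0 * weighted_relative(p, base, base_weights) for p in prices_by_year]


def chained_index(prices_by_year, weights_by_year):
    """Index with yearly updated weights; the yearly links are multiplied together."""
    index = [100.0]
    for t in range(1, len(prices_by_year)):
        link = weighted_relative(prices_by_year[t], prices_by_year[t - 1],
                                 weights_by_year[t - 1])
        index.append(index[-1] * link)
    return index


# Hypothetical: two products, three years; consumption shifts towards product 1.
prices = [[1.00, 2.00], [1.10, 2.00], [1.21, 2.00]]
weights = [[0.5, 0.5], [0.6, 0.4]]  # shares for year 0 and year 1

fixed = fixed_weight_index(prices, weights[0])
chained = chained_index(prices, weights)
print([round(v, 2) for v in fixed], [round(v, 2) for v in chained])
```

In this example the chained index ends higher than the fixed-weight index because the product whose price rises has gained weight.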

17 D. Incomplete data and indices, weights

Seasonal Adjustment: outliers and seasonal or calendar effects

– When should an outlier be viewed as indicative of a discontinuity, a crisis or a boom?

– Treat it as an outlier until the contrary is proven
‐ Statistical proof (standard error)
‐ Context proof (news item, change in related series)

– Should an outlier affect seasonal pattern or not?
– Correct treatment of calendar effects (holidays)
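A rough sketch of the "statistical proof" idea: compare the latest period-on-period change with the historical changes and flag it when it lies more than a chosen number of standard deviations from their mean. This is a deliberately simple stand-in, not the test used by seasonal adjustment software. The threshold, function name and data are assumptions.

```python
from statistics import mean, stdev


def is_outlier(history, new_value, threshold=3.0):
    """Flag new_value if its change from the last observation lies more than
    `threshold` sample standard deviations from the mean historical change."""
    changes = [b - a for a, b in zip(history, history[1:])]
    z = (new_value - history[-1] - mean(changes)) / stdev(changes)
    return abs(z) > threshold, z


# Hypothetical seasonally adjusted series and two candidate new observations.
history = [100, 101, 100, 102, 101, 102, 101]
flag_drop, z_drop = is_outlier(history, 94)
flag_normal, z_normal = is_outlier(history, 102)
print(flag_drop, round(z_drop, 2), flag_normal, round(z_normal, 2))
```

A flagged value still needs the context check mentioned above: if the following months stay at the new level, or related series move the same way, the "outlier" is more likely the start of a crisis or boom.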

18 E. Seasonal adjustment and outliers

Seasonal Adjustment: outliers and seasonal or calendar effects

19 E. Seasonal adjustment and outliers

Figure: Consumer confidence index, monthly, 2005 to 2010. Series shown: original and seasonally adjusted.

Seasonal Adjustment: outliers and seasonal or calendar effects

20 E. Seasonal adjustment and outliers

Figure: Industrial Production Index, 2000 to 2009. Series shown: original, seasonally adjusted (no outliers set), and seasonally adjusted (with outliers set manually).