Some Implementation Issues of Scanner Data Muhanad Sammar, Anders Norberg & Can Tongur

Post on 17-Dec-2015


Page 1

Some Implementation Issues of Scanner Data

Muhanad Sammar, Anders Norberg & Can Tongur

Page 2

Some Background

• 3 major outlet chains in Sweden
• Statistics Sweden has received scanner data since 2009
• First principal issue: deciding how to use scanner data (S.D.)
• The Swedish CPI Board approved the use of scanner data in 2011
• Second principal issue: how to aggregate the data

Page 3

The First Principal Issue – How to Use Scanner Data

A. Replace the manually collected price data with scanner data for the sample of outlets and products
B. Use scanner data as auxiliary information
C. Compute index based on scanner data for all products (and outlets)
D. Use scanner data for auditing and quality control

Page 5

The First Principal Issue – How to Use Scanner Data

A. Replace the manually collected price data with scanner data for the sample of outlets and products

Sample of 32 supermarkets and local shops and 4 hypermarkets
3 negatively coordinated samples of 500 products, with products identified by EAN

Option A is the Swedish approach.

Page 7

The First Principal Issue – How to Use Scanner Data

A. Replace the manually collected price data with scanner data for the sample of outlets and products
B. Use scanner data as auxiliary information

$$\text{Index} = \left[ \frac{\text{Index(M.C.P.)}}{\text{Index(S.D.)}} \right]_{\text{small sample}} \times \; \text{Index(S.D.)}_{\text{big sample}}$$
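The formula for option B can be read as a ratio adjustment: the scanner-data index from the big sample is scaled by the ratio of the manually collected index to the scanner-data index in the small sample. A minimal Python sketch of that reading follows. The function name, parameter names and index values are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
def ratio_adjusted_index(index_mcp_small: float,
                         index_sd_small: float,
                         index_sd_big: float) -> float:
    """Scale the big-sample scanner-data index by the M.C.P./S.D. ratio
    observed in the small sample (illustrative reading of option B)."""
    if index_sd_small <= 0:
        raise ValueError("small-sample scanner-data index must be positive")
    return index_mcp_small / index_sd_small * index_sd_big


# Hypothetical index values (base = 100), not taken from the presentation.
example = ratio_adjusted_index(index_mcp_small=101.0,
                               index_sd_small=100.0,
                               index_sd_big=102.0)
print(round(example, 2))
```

When the two small-sample indices agree, the ratio is 1 and the result is simply the big-sample scanner-data index.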

Page 10

The First Principal Issue – How to Use Scanner Data

A. Replace the manually collected price data with scanner data for the sample of outlets and products
B. Use scanner data as auxiliary information
C. Compute index based on scanner data for all products (and outlets)

Problems:
- COICOP classification of all products
- Products with deposits must be identified
- New products might hide price changes

Page 12

The First Principal Issue – How to Use Scanner Data

A. Replace the manually collected price data with scanner data for the sample of outlets and products
B. Use scanner data as auxiliary information
C. Compute index based on scanner data for all products (and outlets)
D. Use scanner data for auditing and quality control

We have seen that the quality of delivery varies between price collectors.

Page 13

The Second Principal Issue – Data Aggregation

• Scanner data are weekly aggregates of data for each product and outlet in the sample
• Each week has ca. 8 500 price observations
• Weekly data require aggregation to monthly values

Natural choices of aggregation:
i. Unweighted geometric mean value, or
ii. Quantity-weighted arithmetic mean value

Motives:
i. In line with the rest of the CPI for daily necessities
ii. In line with the data

Page 14

The Two Mean Values

• The geometric mean value:

$$P^{G}_{t} = \left( \prod_{w=1}^{W} p_{j,k,w} \right)^{1/W}$$

• The weighted arithmetic mean value:

$$P^{A}_{t} = \frac{\sum_{w=1}^{W} p_{j,k,w} \, q_{j,k,w}}{\sum_{w=1}^{W} q_{j,k,w}}$$

• We compared the two methods irrespective of their inherent differences
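The two monthly mean values can be sketched in a few lines of Python. The sketch assumes that the sums and products run over the W weeks of a month for a single product-offer. Reading j and k as product and outlet indices is an assumption, since the slide does not define them. The weekly prices and quantities are hypothetical.

```python
import math


def geometric_mean_price(prices):
    """Unweighted geometric mean of the weekly prices of one product-offer."""
    if not prices or any(p <= 0 for p in prices):
        raise ValueError("prices must be a non-empty sequence of positive numbers")
    return math.exp(sum(math.log(p) for p in prices) / len(prices))


def weighted_arithmetic_mean_price(prices, quantities):
    """Quantity-weighted arithmetic mean of the weekly prices."""
    if len(prices) != len(quantities):
        raise ValueError("prices and quantities must have the same length")
    total_quantity = sum(quantities)
    if total_quantity <= 0:
        raise ValueError("total quantity must be positive")
    return sum(p * q for p, q in zip(prices, quantities)) / total_quantity


# Hypothetical month: four weeks, with a price cut in week 3 that attracts extra sales.
weekly_prices = [20.0, 20.0, 15.0, 20.0]
weekly_quantities = [10, 10, 40, 10]

p_g = geometric_mean_price(weekly_prices)
p_a = weighted_arithmetic_mean_price(weekly_prices, weekly_quantities)
print(p_g, p_a)
```

In this made-up month the quantity-weighted mean falls below the geometric mean, because most units are sold in the week with the reduced price. When the price is constant over the weeks, the two means coincide.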

Page 15

Some Statistics

• 2%: Geometric > Arithmetic in the base period while Geometric = Arithmetic in Jan, Feb, Mar

• 3%: Geometric = Arithmetic in the base period while Geometric > Arithmetic in Jan, Feb, Mar

• > 98% of observations (weekly prices) without variations between days

• Ca. 9% of monthly prices had variations between weeks

Page 16

Figure 5.1 in the paper: Logarithmic ratios of mean prices in current month relative to base period. Unweighted geometric mean on vertical axis and quantity-weighted arithmetic mean on horizontal axis. Eight sectors are numbered for analysis purposes.

[Figure annotations: expressions in the base-period mean prices $P^{G}_{Base}$ and $P^{A}_{Base}$, the current-month mean prices $P^{G}_{t}$ and $P^{A}_{t}$, and the weekly prices and quantities $P_{Base,w}$, $Q_{Base,w}$, $P_{t,w}$, $Q_{t,w}$.]

Page 17

[Scatter plot with axes Index_G and Index_A, each ranging from 80 to 120.]

Figure 5.2 in the paper. Monthly price indices for product groups in supermarkets and hypermarkets, based on geometric and arithmetic mean prices per month.

Page 18

Indices by Different Methods

Period     Unweighted geometric   Weighted arithmetic   Weighted geometric   Unweighted arithmetic
January    100                    99.815                99.785               100.038
February   100                    99.998                99.996               100.000
March      100                    100.000               100.000              100.003
April      100                    99.969                99.963               100.008

Quantity weighting seems to have a small impact…
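A comparison like the one in the table can be sketched for a single product-offer as follows. The data are hypothetical, and the helper names `mean_price` and `price_index` are illustrative. The last step, expressing each index relative to the unweighted geometric index, is only a guess at how the constant 100 column in the table arises.

```python
import math


def mean_price(prices, quantities, kind):
    """Monthly mean price from weekly data.

    kind is one of 'unw_geom', 'w_arith', 'w_geom', 'unw_arith'.
    """
    n = len(prices)
    total_q = sum(quantities)
    if kind == "unw_geom":
        return math.exp(sum(math.log(p) for p in prices) / n)
    if kind == "w_geom":
        return math.exp(sum(q * math.log(p) for p, q in zip(prices, quantities)) / total_q)
    if kind == "unw_arith":
        return sum(prices) / n
    if kind == "w_arith":
        return sum(p * q for p, q in zip(prices, quantities)) / total_q
    raise ValueError(f"unknown kind: {kind}")


def price_index(base, current, kind):
    """100 * current mean / base mean; base and current are (prices, quantities)."""
    return 100.0 * mean_price(*current, kind) / mean_price(*base, kind)


# Hypothetical weekly data for the base month and the current month.
base = ([20.0, 20.0, 20.0, 20.0], [10, 10, 10, 10])
current = ([20.0, 20.0, 15.0, 20.0], [10, 10, 40, 10])

kinds = ("unw_geom", "w_arith", "w_geom", "unw_arith")
indices = {k: price_index(base, current, k) for k in kinds}

# Assumption: express every index relative to the unweighted geometric one.
relative = {k: 100.0 * v / indices["unw_geom"] for k, v in indices.items()}
print(relative)
```

With real data the differences are far smaller than in this exaggerated example, since most weekly prices do not vary within a month.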

Page 19

[Histogram: price ratio on the horizontal axis (from -0.80 to 0.85), frequency on the vertical axis (0 to 3000).]

Figure 5.3 in the paper. Distribution of price changes during January – April 2012 with base in December 2011. Unweighted geometric mean.

Page 20

Data Quality

Variation between outlets for scanner data (left) and manually collected data (right). Individual prices on vertical axis and monthly average prices per product on horizontal axis. The year 2010.

Page 21

Data Quality (2)

Number of comparable product-offers is 36 102 and 38 786 in 2009 and 2010, respectively.

Matching category             2009 (%)   2010 (%)
In neither M.C.P. nor S.D.    1.5        0.6
In M.C.P. but not in S.D.     4.5        5.3
In S.D. but not in M.C.P.     1.5        0.9
M.C.P. = S.D.                 83.4       86.2
M.C.P. > S.D.                 4.3        3.7
M.C.P. < S.D.                 4.8        3.3

Scanner Data (S.D.) and Manually Collected Prices (M.C.P.) in comparison. Product-offers, outlets and weeks. January – December, 2009 and 2010.
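The matching categories in the table can be reproduced with a simple classification of each product-offer, outlet and week. The sketch below assumes that a missing price is represented by `None` and that prices are compared exactly. The function names and the sample pairs are hypothetical.

```python
from collections import Counter


def matching_category(mcp_price, sd_price):
    """Classify one observation by price availability and comparison.

    None means the price is missing from that source.
    """
    if mcp_price is None and sd_price is None:
        return "In neither M.C.P. nor S.D."
    if sd_price is None:
        return "In M.C.P. but not in S.D."
    if mcp_price is None:
        return "In S.D. but not in M.C.P."
    if mcp_price == sd_price:
        return "M.C.P. = S.D."
    return "M.C.P. > S.D." if mcp_price > sd_price else "M.C.P. < S.D."


def category_shares(pairs):
    """Percentage of observations in each matching category."""
    counts = Counter(matching_category(m, s) for m, s in pairs)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no observations to classify")
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical (M.C.P., S.D.) price pairs, not taken from the presentation.
pairs = [(19.90, 19.90), (19.90, 19.90), (25.00, 24.50), (None, 12.00), (9.90, None)]
shares = category_shares(pairs)
print(shares)
```

In practice, prices stored as floating-point numbers may need rounding or conversion to minor currency units before they are compared for equality.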

Page 22

EAN code maintenance

• S.D. = Vast Amounts of Data ≠ Large Samples
• Data extraction = EAN code probing
• Yearly EAN survival rate (base-to-base) 70-80%
• Some 500 products identified and maintained
• Until now, 35 of 538 products changed EAN code during 2012 (= 6.5%)
• Fixed basket implication - Always up to date with S.D.!
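The two rates on this slide can be computed as below. The sketch uses the figures quoted above for the EAN change rate. The base-to-base survival rate is illustrated with made-up code lists, since the actual EAN sets are not given, and both function names are hypothetical.

```python
def ean_change_rate(changed: int, total: int) -> float:
    """Percentage of maintained products whose EAN code changed."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * changed / total


def survival_rate(base_eans, next_base_eans) -> float:
    """Percentage of EAN codes in one base period that still exist in the next."""
    base = set(base_eans)
    if not base:
        raise ValueError("base period has no EAN codes")
    return 100.0 * len(base & set(next_base_eans)) / len(base)


# Figures quoted on the slide: 35 of 538 products changed EAN code during 2012.
rate_2012 = ean_change_rate(35, 538)
print(round(rate_2012, 1))

# Hypothetical placeholder codes, only to illustrate the survival calculation.
old_base = ["ean_001", "ean_002", "ean_003", "ean_004"]
new_base = ["ean_001", "ean_002", "ean_004", "ean_005"]
print(survival_rate(old_base, new_base))
```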