Towards a new methodology for processing
scanner data in the Dutch CPI1
Antonio G. Chessa,2 Stefan Boumans2 and Jan Walschots2
1 The authors want to thank various colleagues, both at Statistics Netherlands and at other statistical offices, for their continuous support and discussions. The authors also thank Professor Bert Balk for his comments on a previous, different version of the paper. The views expressed in this paper are those of the authors and do not necessarily reflect the policies of Statistics Netherlands.
2 Statistics Netherlands, Team CPI; P.O. Box 24500, 2490 HA The Hague, The Netherlands. Correspondence can be sent to: [email protected]
Abstract
This paper presents a new methodology for processing electronic transaction data and for calculating price indices, with the aim of reducing the methodological differences across retailers and consumer goods in the Dutch CPI. Articles (GTINs or EANs) are combined into “homogeneous products”, each of which is defined by a set of article characteristics. Articles that share the same characteristics form a product, which should capture price increases associated with “relaunches” (EAN changes). Two methods for selecting article characteristics are described, which have given the same outcomes. The new index method calculates price indices as the ratio of a turnover index and a weighted quantity index. Weights of homogeneous products are calculated from prices and quantities of multiple periods. Product weights are updated each month, so that new products enter the index calculations without delay. Results show that the contribution of new products to a price index may be significant. The method does not lead to chain drift and does not require price imputations. The new methodology is intended to replace the current sample-based methods in the CPI for a department store and for mobile phones in January 2016. Results show clear improvements over the current methods and the previously used survey-based methods.
Keywords: Scanner data, CPI, GTIN/EAN, relaunch, product homogeneity, index theory.
1. Introduction
Scanner data have clear advantages over traditional survey data collection, notably
because such data sets offer a better coverage of articles sold, sales data offer complete
transaction information (prices and quantities), and the data collection process is
automatised. In spite of their potential, scanner data are still used by a small number of
statistical agencies in their CPI, but the number is likely to increase during the coming
years.3
By scanner data we mean transaction data that specify turnover and numbers of
articles sold by EAN (or GTIN, barcode). At the time of introduction in the Dutch CPI in
2002, scanner data involved two supermarket chains. In January 2010, the data were
extended to six supermarket chains, as part of a re-design of the CPI (de Haan, 2006; van
der Grient and de Haan, 2010; de Haan and van der Grient, 2011). At present, scanner data
of 10 supermarket chains are used, and surveys have not been carried out for
supermarkets since January 2013. Besides supermarkets, scanner data from other retailers
have been used since January 2014. Other forms of electronic data containing both price and
quantity information are obtained from travel agencies, for fuel prices and for mobile
phones. More than 20% of the Dutch CPI is now based on electronic transaction data (in
terms of Coicop weights).
The shift from traditional price collection to electronic transaction data has
introduced new possibilities for developing index methods. Ideally, we would like to
develop a method that makes use of both prices and quantities, and that processes the
transactions of all EANs instead of taking a sample.4 With thousands of EANs per retailer
the question is how to find efficient solutions. This has turned out to be a complex process
over the years, as the current methods in the Dutch CPI differ across retailers and
consumer goods. The current method for supermarket scanner data intends to process all
EANs, but exhibits several index-related issues. The methods for other retailers make
use of samples of articles.
As the search for new electronic data sources will continue, the question has been put
forward whether a generic index method could be developed that is applicable to different
types of consumer goods and sources (retailers), and that is capable of handling issues
that are not resolved in a fully satisfactory way so far in certain methods (amongst which
the “relaunch” problem and, related to this, the definition of homogeneous products). Such
a method could then also be gradually applied to data sets that are currently in
production, so that the differences in methods used for different retailers can be reduced.
Section 2 gives an overview of the index methods that have been developed for
different electronic transaction data and retailers over the past years in the Dutch CPI. An
outline of a new methodology for processing electronic transaction data is presented in
Section 3. The intention of this section is to show how the new methodology fits within the
CPI system. The aim of the methodology is twofold: (1) to process all EANs, thus
abandoning the traditional approach of selecting a basket of goods, and (2) to have an
index method that deals with the dynamics of an assortment over time, in which new
goods are included in a timely manner, and that efficiently handles relaunches.
3 In Europe, six countries will be using scanner data in 2016. The scanner data workshops in Vienna (2014) and Rome (2015) showed that several countries are expecting their first data, while other countries have taken concrete steps towards acquiring their first scanner data.
4 In this paper, the term “article” and EAN/GTIN are used interchangeably.
Sections 4 and 5 elaborate on two essential aspects of the new methodology. The
relaunch problem implies that EANs are not always appropriate as unique identifiers of
homogeneous products. Product homogeneity should then be achieved at a broader level,
at which EANs are combined into groups. Homogeneous products or EAN groups could be
defined by combining EANs that share the same set of characteristics. These have to be
selected in some way. Section 4 proposes and compares two selection methods.
Homogeneous products constitute one of two intermediate article group levels
between the EAN level and the lowest publication level (L-Coicop).5 At the product level,
turnover and quantities of articles sold are summed and used to calculate unit values per
product. These are used to calculate price indices for so-called “consumption segments”,
which combine different homogeneous products (e.g., a segment T-shirts with underlying
products that are described by one or more characteristics).
The index method that has been developed for this purpose is described in Section 5.
A price index for a consumption segment is calculated as the ratio of a turnover index and
a weighted quantity index. The essential element of the index method is the definition and
calculation of the ‘weights’ of the homogeneous products, such that new products can be
directly included. Price indices at Coicop levels are calculated according to conventional
Laspeyres type methods.
Some results obtained with the new methodology are presented in Section 6. Section
7 summarises and concludes with short-term plans with the methodology.
2. Historical overview of methods for processing scanner data
The search for efficient processing methods and index methods for scanner data has
proven to be a challenge throughout the years. Different directions have been investigated
for supermarket scanner data, which may be useful to share with statistical agencies that
are thinking about using scanner data or are about to use scanner data in their CPI. Over
the past 16-17 years, three methods were developed at Statistics Netherlands for
supermarket scanner data, which are referred to here as versions 0, 1 and 2. The choices
and decisions made for these methods are summarised in Table 1.
The first method was proposed at the end of the 1990s. EANs were considered to be a
natural choice for homogeneous products. The availability of weekly turnover data led to
suggesting the Fisher index as an ideal index. Because of frequent assortment changes a
monthly chained Fisher index was considered. However, the price indices showed a strong
downward drift. As a consequence, this method was never implemented.
Based on these experiences, a switch was made to a method that used a (large) basket
and a Laspeyres index with yearly fixed weights. This method was implemented, but
article replacements turned out to be very time consuming as the dynamics of assortments
increased. After seven years it was decided to develop a less labour intensive method.6
The third method (version 2) is the method currently used for supermarket scanner
data. The idea of using a basket of goods was put aside, and a return was made to
processing all transactions. This decision led to adopting a monthly chained index again,
like in version 0. But in the light of the experiences with a strong drifting behaviour, an
index method was chosen that assigns equal weights to EANs. A drawback of this choice is
that EANs with negligible turnover would receive a relatively high weight. A turnover
threshold was therefore introduced in order to exclude such EANs from the index
calculations (see de Haan and van der Grient (2011) for more details).
5 In the current Coicop classification, L-Coicops are specified at the fifth digit level at most (depending on division).
6 In classical surveys, consumer specialists define ‘representative’ products within Coicops. Overall, the Dutch CPI contains between 1000 and 1500 of such products. In method version 1, a basket of about 10,000 EANs per retailer was used (Table 1), which makes manual replacement of articles much more difficult.
Table 1. Methods developed at Statistics Netherlands for supermarket scanner data.
Choices and decisions   Version 0          Version 1               Version 2
Developed/used in       Late 1990s         2002-2009               2010-present
Sample                  All data           Basket (± 10,000 EANs   All EANs that satisfy
                                           per retailer)           certain filters
Homogeneous products    EANs               EANs                    EANs
Replacements            No                 Yes, manually (EANs     No
                                           with large turnover
                                           share)
Index method            Monthly chained    Laspeyres, with yearly  Monthly chained Jevons,
                        Fisher index       fixed weights           with equal weights for
                                                                   'accepted' EANs;
                                                                   Laspeyres for (L-)Coicops
Implemented?            No                 Yes                     Yes
Monthly chained Jevons indices are calculated for consumption segments.7 The price
indices for consumption segments are aggregated to Coicop levels by applying a Laspeyres
index with fixed weights for each consumption segment, which are revised each year. The
impact of equal weights is thus confined to consumption segments.
7 In this context, consumption segments are the “internal scanner data aggregates” in Dutch CPI jargon (abbr.: ISBAs). These ISBAs are our own harmonised reclassification of the retailers’ own classifications of EANs (called ESBAs), which may differ across supermarket chains.
An important issue with the current method is that relaunches are not handled in a
satisfactory way. Price increases after relaunches are not captured. A so-called “dump
price filter” is applied in order to protect price indices against serious downward bias:
strong price decreases for EANs that are about to be replaced, in combination with low
quantities sold, are filtered out from the calculations. Other practical issues are that the
filter settings need to be tested for each new data set, and that the tests may have to be
repeated in the future in order to verify the validity of the settings.
Besides supermarkets, scanner data and other electronic transaction data sets are used
for other retailers. Table 2 shows a complete overview of the retailers and types of
consumer goods. More than 20 per cent of the entire CPI is based on electronic transaction
data (in terms of Coicop weights).
The index methods for retailers other than supermarkets make use of a sample of
EANs. As in traditional surveys, a limited number of products is defined, for instance, by
picking a small number of brands based on turnover share. EANs are combined into
products based on brand and one or two additional characteristics, such as package
content. Laspeyres indices with yearly fixed product weights are used for these retailers.
The primary focus in these methods was to capture price increases after relaunches.
Table 2. Use of scanner data and other types of electronic transaction data compared to survey data, by retailer/consumer goods in the Dutch CPI, as percentages of the sum of the Coicop weights in 2015.
Retailers                Transaction data   Survey data
Supermarkets*            12.9
Do it yourself stores*   0.4                0.5
Department stores*       0.7                0.6
Drug stores*             0.8                0.3
Travel agencies          1.7
Fuel                     3.6
Mobile phones            0.5
Other                                       78.0
Total                    20.6               79.4
* Scanner data, i.e. transaction data specified by EAN/GTIN
A drawback of the methods for non-supermarkets is the small size of the samples. In
addition, products might not be homogeneous enough. We come back to these issues in
Section 6. Given the diversity of methods across retailers and consumer goods, and the
different potential problems identified in this section, a study was set up last year with the
aim of finding a more generic method that is ideally applicable to all transaction data sets,
does not draw samples but processes all transactions, and that resolves the issues
mentioned above.
3. Outline of a new processing framework
The introduction of different methods for different retailers in the CPI has made the
system increasingly complex over time. New choices were made each time a new data set
was added to the production system. The current index method for supermarkets makes
use of different types of price and turnover filters, which have to be regularly checked and
tested for each new chain. The methods for non-supermarkets make use of samples, which
need continuous monitoring as their turnover shares might become less significant due to
new developments in the assortments.
For these reasons, the possibilities of developing a generic method have been studied
in order to reduce the methodological differences between retailers and types of
consumer goods. The new methodology focuses on two main problems:
• How could a relaunched article and its predecessor be combined, that is, be
considered as equivalent instances of the same set of articles (“homogeneous
product”)? How could homogeneous products be defined, and how could articles be
matched in an efficient way, without time consuming manual interventions?
• What kind of index method could be developed, such that all transactions can be
processed and new articles can be directly included?
The problem of product homogeneity and the index method developed for the above
purposes are treated in sections 4 and 5, respectively. In the present section we sketch
how these components are intended to be integrated into the full CPI production system,
based on our experiences so far with the tests performed with the new methodology.
The processing of electronic data in the Dutch CPI can roughly be subdivided into four
stages:
1. Reading and checking data;
2. Linking articles/EANs to (L-)Coicops;
3. Calculating prices and price indices at “lower aggregate levels”;
4. Calculating price indices for (L-)Coicops and for the CPI as a whole.
The first stage consists of reading data files and performing basic checks on the data, such
as the correctness and completeness of records and record variables and controlling for
quantities sold with value zero (which are isolated before article prices are calculated).
The subsequent three steps are worked out in more detail in the chart of Figure 1. The
“lower aggregate levels” mentioned in step 3 consist of three levels, which are explained
below.
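The basic checks of the first stage can be illustrated with a minimal Python sketch. The record layout (keys `ean`, `turnover`, `quantity`) and the function name `split_records` are assumptions made for this illustration only; they are not taken from the production system.

```python
def split_records(records):
    """Separate usable records from records that have to be isolated.

    A record is isolated when a required field is missing or when the
    quantity sold is zero, because no transaction price
    (turnover / quantity) can be calculated in that case.
    """
    required = ("ean", "turnover", "quantity")  # hypothetical field names
    usable, isolated = [], []
    for rec in records:
        complete = all(rec.get(key) is not None for key in required)
        if complete and rec["quantity"] != 0:
            usable.append(rec)
        else:
            isolated.append(rec)
    return usable, isolated


# Hypothetical example records
records = [
    {"ean": "A1", "turnover": 30.0, "quantity": 10},
    {"ean": "B2", "turnover": 5.0, "quantity": 0},    # zero quantity sold
    {"ean": "C3", "turnover": None, "quantity": 4},   # incomplete record
]
usable, isolated = split_records(records)
```

Only the first record would be passed on to the price calculations in this example; the other two are set aside for inspection.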
Figure 1. Nested group levels of individual articles in the CPI, and price definitions and price index calculations at these levels in the new methodology.
[Figure 1 is a chart with two columns. “Article groups”, from top to bottom: (L-)Coicops, consumption segments, homogeneous products, individual articles/EANs, with the retailer's ESBAs shown alongside. “Index calculation”: Laspeyres type indices, QU-indices, product prices (unit values), transaction prices.]
Article group levels
Consumer goods and services are subdivided in the CPI into Coicops. The most detailed
level of publication within Coicop divisions is referred to as “L-Coicop” in the Dutch CPI.
Scanner data contain transaction data at EAN level. For reasons explained previously, a
further subdivision is made between L-Coicop and EAN level. Individual EANs may have to
be combined into groups, which we refer to as “homogeneous products”. These products
and their underlying articles need to be linked to L-Coicops, which has to be done in an
efficient way if we aim at processing all EANs. For this purpose, it is important to ask
retailers for their own classification of EANs (called ESBAs in our system).
Usually, we take the most detailed ESBA level for establishing the EAN-Coicop links.
However, the most detailed ESBAs may still cover more than one L-Coicop, so that we
need to define an intermediate level between L-Coicops and homogeneous products. This
intermediate level, which we call “consumption segments”, may be derived from more
detailed EAN characteristics (more details are given in Section 4).8
In our first tests with the new methodology, we have chosen consumption segments
to be “types of article”. Examples of segments are men’s T-shirts, ladies’ cardigans or
chocolate. Each of these segments contains a set of homogeneous products. For T-shirts, a
product may contain EANs that have the same number of items per package, the same
sleeve length, fabric and colour. In this way we obtain a nested partition of individual
articles/EANs at different levels, as is shown in Figure 1.
The question is how products can be defined, which article characteristics to select
and what methods could be considered for this purpose. This will be treated in Section 4.
Calculation of price indices
At each level in Figure 1 we either need to define prices or establish the method for
calculating price indices. The price of an individual article is its “transaction price”, that is,
turnover divided by number of articles sold (in fact, this is a unit value at EAN level). For
homogeneous products we use the same definition. Both turnover and quantities sold are
summed over articles that share the same set of selected characteristics. The ratio of these
two sums for groups of articles is usually referred to as a “unit value”.
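The unit value calculation for homogeneous products can be sketched in a few lines of Python. The attribute names, the record layout and the function name `product_unit_values` are hypothetical, and the sketch assumes that every product has a positive total quantity.

```python
from collections import defaultdict


def product_unit_values(articles, attributes):
    """Sum turnover and quantities over articles that share the same values
    for the selected attributes, and return the unit value per product."""
    turnover = defaultdict(float)
    quantity = defaultdict(float)
    for art in articles:
        # A product is identified by the tuple of shared characteristics.
        product = tuple(art[a] for a in attributes)
        turnover[product] += art["turnover"]
        quantity[product] += art["quantity"]
    return {prod: turnover[prod] / quantity[prod] for prod in turnover}


# Hypothetical articles: the first two EANs form one product.
articles = [
    {"ean": "111", "colour": "white", "sleeve": "short", "turnover": 100.0, "quantity": 10},
    {"ean": "222", "colour": "white", "sleeve": "short", "turnover": 60.0, "quantity": 5},
    {"ean": "333", "colour": "black", "sleeve": "long", "turnover": 90.0, "quantity": 6},
]
unit_values = product_unit_values(articles, ["colour", "sleeve"])
```

If the second EAN were a relaunch of the first at a higher price, the price change would show up in the unit value of the combined product rather than being lost at the EAN level.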
Unit values and quantities sold for homogeneous products are subsequently used to
calculate a price index for each consumption segment. A new index method is developed
for this purpose, referred to as the “QU-method”, which is described in Section 5. Price
indices for consumption segments are then aggregated to L-Coicops according to
traditional Laspeyres type indices, with weights based on turnover of the preceding year.9
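The aggregation step can be sketched as a weighted arithmetic mean of segment indices. This is a simplified illustration under stated assumptions: the segment names and numbers are hypothetical, the indices are taken relative to the weight reference period, and the chaining of annual links is left out.

```python
def laspeyres_type_index(segment_indices, prev_year_turnover):
    """Weighted arithmetic mean of consumption segment indices, with
    weights equal to the turnover shares of the preceding year."""
    total = sum(prev_year_turnover[s] for s in segment_indices)
    return sum(
        prev_year_turnover[s] / total * segment_indices[s]
        for s in segment_indices
    )


segment_indices = {"T-shirts": 104.0, "cardigans": 98.0}      # hypothetical
prev_year_turnover = {"T-shirts": 300.0, "cardigans": 100.0}  # hypothetical
l_coicop_index = laspeyres_type_index(segment_indices, prev_year_turnover)
```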
4. Consumption segments and product homogeneity
In order to make choices about consumption segments and homogeneous products,
statistical agencies should ask retailers for information about article characteristics and
article classifications used by retailers for their own purpose (ESBAs). Information about
article characteristics may be contained in article descriptions and also in detailed ESBAs.
Our experiences with electronic data sets are that this information may be supplied in
varying formats by different retailers. For instance, the record variables in drugstore
scanner data are all contained in separate columns (Chessa, 2013). Information about
article characteristics may also be exclusively contained in text strings with EAN
descriptions.
8 Consumption segments play the same linking role as the ISBA classification in our current system.
9 Aggregation to Coicop levels could also be carried out by applying the aggregation according to the QU-method, by summing turnover and weighted quantities in the numerator and the denominator of index formula (1) over consumption segments (see Section 5.1). This is a more consistent aggregation method. Preliminary research has shown that the differences between the two aggregation methods are very small at L-Coicop level (Chessa, 2014). As a consequence, we have decided to stick to the traditional way of aggregating to Coicop levels.
The first example is clearly the preferred data format, as consumption segments and
products can be derived immediately, and EANs can be automatically assigned to both
article group levels and linked to Coicop. In the second case, some form of text mining will
have to be applied in order to retrieve and place information about article characteristics
in separate columns. Text mining falls outside the scope of this paper and will therefore
not be treated further.
Consumption segments are defined as sets of homogeneous products. We have taken
consumption segments to be equivalent with ‘types of article’. Article types can be defined
at different levels of detail (e.g., socks as a whole, or sports, thermal and walking socks as
separate types). In our first tests with department store scanner data, we have defined
article types at the most detailed level of the two mentioned in the example (i.e., sports,
thermal and walking socks). We could say that we differentiate article types, and hence
consumption segments, according to purpose.
Defining tighter consumption segments increases the chance of index imputations.
We come back to this point in Section 6, where a test case is presented. An advantage of
defining tighter consumption segments is that price indices are ‘purer’ in some sense. The
new index method computes weights per unit of product sold (Section 5). In our socks
example, product weights for walking socks are merely based on prices for that article
type and are not mixed with price information and the price index for other types of socks.
In order to avoid potential problems with relaunches, we combine articles into
homogeneous products. This could be done by selecting a set of ‘relevant’ article
characteristics and combining those articles into groups (products) that share the same
characteristics. A walking sock product could contain articles with two pairs of socks per
package, with colour brown and a specific type of fabric. The question is which article
characteristics to select and what methods could be used for this purpose.
Before proceeding, we introduce the following terminology. By “characteristic” of an
article we refer to an instance, a specific value that an article can take. Such a value
belongs to a broader set, which we refer to as “attribute”. For example, ‘white’ is a
characteristic of a T-shirt that belongs to the article attribute ‘colour’.
The selection of article attributes is traditionally part of the consumer specialist’s
domain. Consumer specialists define products by selecting specific characteristics, for
instance, a specific brand name, package content and a consumer target group. We could
build on this idea when using scanner data and supplement it with a sensitivity analysis. A
more technical method could be suggested as an alternative, which belongs to the field of
statistical model selection (Claeskens and Hjort, 2008). We briefly describe both
approaches, which will subsequently be compared with an example.
Sensitivity analysis
The selection of article attributes covers three stages in this approach:
1. For a given consumption segment, the consumer specialist selects a number of article
attributes that (s)he finds to be relevant. This gives rise to an initial set of products;
2. A price index is calculated for the consumption segment, according to the method of
Section 5;
3. A sensitivity analysis is performed: an attribute that was not selected in step 1 is now
added and the price index is re-calculated. If the price index changes ‘significantly’,
then the attribute is accepted. This step can be repeated with other attributes.
Attributes may also be omitted when their impact on the price index is negligible.
It has to be decided what counts as a ‘significant’ impact on a price index after adding
or omitting attributes in step 3, that is, which tolerance is found to be acceptable.
This may be harder to establish for consumption segments than at Coicop level.
Experience has to be gained with this aspect when the methodology is taken into
production.
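The three stages above can be sketched as a simple loop. Everything in this sketch is an assumption made for illustration: the function name `select_attributes`, the choice to measure the impact as the largest absolute difference in index points over all months, the tolerance value, and the stand-in function `toy_index`, which takes the place of the index method of Section 5.

```python
def select_attributes(initial, candidates, index_for, tolerance):
    """Add a candidate attribute when it shifts the price index by more
    than `tolerance` index points in at least one month."""
    selected = list(initial)
    baseline = index_for(selected)
    for attr in candidates:
        trial = index_for(selected + [attr])
        shift = max(abs(a - b) for a, b in zip(trial, baseline))
        if shift > tolerance:
            selected.append(attr)
            baseline = trial  # later candidates are compared with the new index
    return selected


def toy_index(attrs):
    # Hypothetical stand-in for the index calculation: returns a short
    # index series that only reacts to the attribute "fabric".
    return [100.0, 103.0] if "fabric" in attrs else [100.0, 101.0]


chosen = select_attributes(
    ["sleeve length", "colour"], ["fabric", "brand"], toy_index, tolerance=0.5
)
```

In this toy setting "fabric" is accepted and "brand" is not. In practice the outcome depends entirely on the tolerance chosen, which is the open question raised above.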
Statistical model selection
The point of departure in this approach is a stochastic model for the price of an article.
Following the terminology adopted for the QU-method in Section 5, the price of an article
is decomposed into a product-specific component (the product to which an
article/EAN belongs), a time-specific component (price index with respect to a base
month) and some random (residual) term.
Different sets of attributes may give rise to different numbers of products, and thus
different numbers of product-specific parameters. The problem is to compare different
model versions, with different numbers of free parameters. This can be done by
calculating “information criteria” for each model version, which represent a class of
statistical fit measures that consist of two components:
• Under suitable assumptions, the first main term simplifies to a sum of weighted least
squares. The squared differences compare the article transaction prices to the model
prices (or a logarithmic transformation of both prices, depending on the type of
model);
• A term that involves the number of free parameters in a model.
The second term acts as a penalty term in assessing model fits. Adding parameters to
a model may decrease the weighted least squares term, but also increases the risk of
overfitting. The aim is to select a model version that balances model fit against model
complexity (number of parameters). The model version with the optimum value of the
information criterion is finally selected.
An advantage of the first method over the second is that it is less technical and, as a
consequence, is expected to be easier for CPI users to understand. An advantage of the
statistical method is that it can be automatised, although this should also be possible with
the first method. Unlike the first method, the statistical method does not require a
‘tolerance level’ to be specified. It simply selects the model version, with its corresponding
set of homogeneous products, that optimises the information criterion.
However, there are different types of information criteria, which differ by the size of
the penalty term. This raises the obvious question of which one to select. Based on
experiences with scanner data, Chessa (2015) suggests using the Bayesian Information
Criterion.
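How a criterion of this kind trades fit against complexity can be shown with a small Python sketch. It uses the common Gaussian form of the Bayesian Information Criterion, n ln(RSS/n) + k ln(n), with an unweighted residual sum of squares; whether this matches the exact form used in Chessa (2015) is an assumption, and the model versions and numbers are hypothetical.

```python
import math


def bic(rss, n_obs, n_params):
    """Bayesian Information Criterion for a Gaussian model, up to an
    additive constant: a fit term plus a penalty on the parameter count."""
    return n_obs * math.log(rss / n_obs) + n_params * math.log(n_obs)


# Hypothetical model versions: (residual sum of squares, free parameters)
versions = {
    "colour + sleeve": (40.0, 12),
    "colour + sleeve + fabric": (25.0, 20),
    "EANs as products": (24.0, 150),
}
n_obs = 1000  # hypothetical number of price observations
scores = {name: bic(rss, n_obs, k) for name, (rss, k) in versions.items()}
best = min(scores, key=scores.get)  # lower values are better
```

In this example the most detailed version fits slightly better than the version with fabric, but its many extra parameters are penalised so heavily that it ends up with the worst score.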
A consideration that is not mentioned above is that article attributes may be treated
in different ways. For instance, each package content may be treated as a different value,
thus giving rise to a different product, or different contents give rise to the same product
with quantity-adjustments applied to each.10 It is beyond the scope of this paper to treat
this aspect in detail, but it is important to be aware of the fact that different treatments can
be given to article attributes. It is advisable to compare their impact on a price index.
10 This is what we decided in our current method for drugstore scanner data. However, we defined content thresholds in order to keep single item articles separate from multipacks, which are treated as different products.
Example
The new methodology has been tested on scanner data of a department store. Historical
data from the period February 2009 until March 2013 have been used to define
homogeneous products for consumption segments in different Coicops and to test the
index method of Section 5.
We take menswear and ladies’ wear as examples to illustrate the above two methods
for defining products. In the traditional survey for the department store, the consumption
segment T-shirts was characterised by the consumer specialist in terms of the following
attributes, for both ladies and men:
• Number of items in a package;
• Sleeve length;
• Colour.
Fabric was not mentioned explicitly, possibly because the consumer specialist had decided
to use an article description with one specific fabric in mind (e.g., standard cotton).
Suppose we exclude fabric initially. Adding fabric as a fourth attribute in a sensitivity
analysis has a very small impact on the price index for men’s T-shirts, but a large impact is
found for ladies’ T-shirts. Fabric should therefore be added to the list of attributes for T-
shirts (which could also be done for both men and ladies in order to limit the number of
different lists of attributes across consumption segments).
Figure 2 shows the price indices for men’s and ladies’ wear for the department store
before and after adding fabric for T-shirts. There is hardly any difference at the L-Coicop
level for menswear, while a small difference can be noted for ladies’ wear (due to the
smaller weight of T-shirts compared to other consumption segments).
The sensitivity analysis gives the same results as the statistical method. For the
consumption segments other than T-shirts, the statistical method even resulted in the
same selection of attributes as was specified by the consumer specialist in the traditional
survey. Based on these results, it may be advisable to start experimenting with the simpler
sensitivity analysis when the new methodology goes into production.
Figure 2. Price indices for menswear and ladies’ wear for a Dutch department store, before and after adding fabric as an extra attribute for T-shirts (Feb. 2009 = 100).
[Figure 2 has two panels, “Menswear” and “Ladies' wear”, each showing the series “Fabric added for T-shirts” and “Survey-based choices”. Vertical axis: price index, scale 60 to 140. Horizontal axis: months from 200902 to 201302.]
Our primary objective to characterise homogeneous products in terms of a set of
attributes does not necessarily imply that we exclude EANs as product choices in any case.
Article characteristics are contained in the EAN descriptions (text strings) for the
department store scanner data. A considerable part of the EAN descriptions only consists
of one to three terms. Defining homogeneous products with such a small number of
attributes may introduce a risk of missing important attributes.
In such cases, one should first contact and visit the retailer and discuss the problem.
Additional information could also be collected through web scraping. But under certain
circumstances it is not necessary to collect more information about article characteristics,
which is illustrated by the following examples.
Figure 3 shows the price indices at two levels of product differentiation for the
restaurant part of the department store and for kitchen textiles. Both article assortments
are stable over time, so that calculating price indices with EANs as products does not lead
to problems caused by relaunches. Product differentiation according to a limited set of
attributes (article type, size and taste for the restaurant, and only article type and colour
for kitchen textiles) gives comparable results. For the restaurant, no difference was
found at all. In cases with short EAN descriptions and stable assortments, the choice of
EANs as homogeneous products can be defended. Moreover, it does not require the
extraction of article characteristics from EAN descriptions, which has been rather time
consuming for the department store data.
Figure 3. Price indices for the restaurant and for kitchen textiles of the department store, for two levels of product differentiation, as calculated with the index method of Section 5 (Feb. 2009 = 100).
5. An index method for consumption segments
5.1 Price index formula
Once consumption segments and the homogeneous products in each segment have been defined, the question is which method should be used to compute price indices. The following aspects were considered in our choice of index method:
• The index method should be able to incorporate new products in the month of introduction into the assortment;
• Based on our first experience at Statistics Netherlands with scanner data (Version 0, see Table 1 in Section 2), the method should not suffer from chain drift;
• A price index should simplify to a unit value index when all products are homogeneous.
[Figure 3 panels: Restaurant; Kitchen textiles. Series in each panel: Products based on attributes; EANs as products. Vertical axis: price index, 60 to 140; horizontal axis: months 200902 to 201302.]
Before giving the formulas behind the index method, we introduce some notation. Let $G_0$ and $G_t$ denote sets of homogeneous products in some consumption segment $G$ in periods 0 and $t$. The sets of homogeneous products in 0 and $t$ may be different. Let $p_{i,t}$ and $q_{i,t}$ denote the prices and quantities sold for product $i \in G_t$, respectively, in period $t$.¹¹
We denote the price index in period $t$ with respect to, say, a base period 0 by $P_t$. The following formula is proposed for calculating price indices (Chessa, 2015):
$$P_t = \frac{\sum_{i \in G_t} p_{i,t} q_{i,t} \Big/ \sum_{i \in G_0} p_{i,0} q_{i,0}}{\sum_{i \in G_t} v_i q_{i,t} \Big/ \sum_{i \in G_0} v_i q_{i,0}}. \qquad (1)$$
The numerator is a turnover index, while the denominator is a weighted quantity (“volume”) index. The product specific parameters $v_i$ are the only unknown factors in formula (1). Choices concerning the calculation of the $v_i$ are described in Section 5.2.
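As an illustration of formula (1), the following minimal Python sketch computes the index for given weights. The function name `qu_index`, the dictionary layout and all numbers are hypothetical; the weights are simply taken as given here, whereas the paper derives them in Section 5.2.

```python
from typing import Dict


def qu_index(p0: Dict[str, float], q0: Dict[str, float],
             pt: Dict[str, float], qt: Dict[str, float],
             v: Dict[str, float]) -> float:
    """Formula (1): turnover index divided by weighted quantity index.

    p0, q0 hold prices and quantities of the products sold in period 0,
    pt, qt those of the products sold in period t. The two product sets
    may differ; v must contain a weight for every product in either set.
    """
    turnover_index = (sum(pt[i] * qt[i] for i in qt)
                      / sum(p0[i] * q0[i] for i in q0))
    volume_index = (sum(v[i] * qt[i] for i in qt)
                    / sum(v[i] * q0[i] for i in q0))
    return turnover_index / volume_index


# Hypothetical data: product "A" is only sold in period 0,
# product "C" is new in period t.
p0 = {"A": 2.0, "B": 4.0}
q0 = {"A": 10.0, "B": 5.0}
pt = {"B": 5.0, "C": 3.0}
qt = {"B": 4.0, "C": 8.0}
v = {"A": 2.0, "B": 4.0, "C": 3.0}  # assumed weights, not estimated here

index = qu_index(p0, q0, pt, qt, v)
```

A product that is not sold in a period is simply absent from that period's sums, so no price needs to be imputed for it in this sketch.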
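Price index formula (1) can be written in the following compact form:
$$P_t = \frac{\bar{p}_t / \bar{p}_0}{\bar{v}_t / \bar{v}_0}, \qquad (2)$$
where $\bar{p}_t$ and $\bar{v}_t$ denote weighted arithmetic averages of the prices and the $v_i$, respectively, over the set of products in period $t$, that is,
$$\bar{p}_t = \frac{\sum_{i \in G_t} p_{i,t} q_{i,t}}{\sum_{i \in G_t} q_{i,t}}, \qquad (3)$$
$$\bar{v}_t = \frac{\sum_{i \in G_t} v_i q_{i,t}}{\sum_{i \in G_t} q_{i,t}}. \qquad (4)$$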
Notice that the numerator of (2) is equal to the unit value index, where unit values are
defined as the ratio of the sum of turnover and the sum of quantities sold over a set of
products in a consumption segment, as given by (3).
If the products in a consumption segment are homogeneous, then the $v_i$ of all products have the same value. In this special case, price index (1) simplifies to a unit value index, a property that we imposed on the index method.¹² In the more general case, where a set of products is not homogeneous, the unit value index must be adjusted. Price index formula (1) gives a precise expression for the adjustment term, which is the denominator of (2). This term captures shifts in consumption patterns between different periods. A shift towards products with higher weights (‘quality’) results in an upward effect on the volume index and, consequently, in a complementary downward effect on the price index.
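The special case and the adjustment term described above can be checked numerically. The sketch below uses the compact form (2) with hypothetical numbers: prices are unchanged between the two periods, while sales shift towards the more expensive product. The function names and the list-based layout are assumptions made for this illustration.

```python
def weighted_mean(values, quantities):
    """Quantity-weighted arithmetic average, as in formulas (3) and (4)."""
    return sum(x * q for x, q in zip(values, quantities)) / sum(quantities)


def qu_index_compact(p0, q0, pt, qt, v0, vt):
    """Formula (2): unit value index divided by the ratio of average weights.

    v0 and vt list the (time-independent) weights of the products sold in
    period 0 and in period t, in the same order as the prices.
    """
    unit_value_index = weighted_mean(pt, qt) / weighted_mean(p0, q0)
    adjustment = weighted_mean(vt, qt) / weighted_mean(v0, q0)
    return unit_value_index / adjustment


# Two products sold in both periods; prices do not change,
# but sales shift from the first to the second product.
p = [2.0, 4.0]
q0 = [10.0, 10.0]
qt = [5.0, 15.0]

unit_value_index = weighted_mean(p, qt) / weighted_mean(p, q0)

# Equal weights (homogeneous products): the index is the unit value index.
same_v = qu_index_compact(p, q0, p, qt, [1.0, 1.0], [1.0, 1.0])

# Different weights: the shift towards the higher-weight product is removed.
diff_v = qu_index_compact(p, q0, p, qt, [2.0, 4.0], [2.0, 4.0])
```

With equal weights the unit value index rises because of the shift in sales alone; with weights that distinguish the two products, the index stays at 1, reflecting that no price has changed.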
As the method adjusts for shifts between products with different quality, we call index (1) a “quality adjusted unit value index” (“QU-index” for short). The fundamental question is how values for the $v_i$ could be obtained. This will be discussed in the next subsection.
11 A different notation is used in this paper from the commonly accepted notation of time as a superscript in prices, quantities and indices. In this paper, preference is given to the notation of both product and time indices as subscripts. This was done in order to reserve the superscript for other purposes (see Chessa (2015), Section 2.3).
12 Time-product dummy and hedonic methods (e.g., see de Haan and Krsinich (2014)) do not satisfy this property.
5.2 Choices concerning the $v_i$
In order to find a method for calculating the product specific weights $v_i$, we focused on the following questions:
• How could new products be incorporated into the index calculations in a timely way?
• The $v_i$ appear in index formula (1) as factors that depend on product, but not on time. Are the $v_i$ constant in time, or can their values be allowed to vary over time? In the latter case, how long should the periods be on which the $v_i$ are constant?
• How could chain drift be avoided?
Inclusion of new products
Formula (1) can be considered as a family of price indices. Special cases can be derived for specific choices of the $v_i$, which are worth mentioning and are helpful to fix ideas. If we set the $v_i$ equal to the product prices from the publication period $t$, then it is easily verified that price index (1) simplifies to a Laspeyres index. If the $v_i$ are set equal to the prices of the products sold in the base period 0, then (1) turns into a Paasche price index. The use of price and quantity information from both periods leads to a Lowe type of index in the QU-method:
$$P_t = \frac{\sum_{i \in G_0 \cap G_t} p_{i,t}\, h(q_{i,0}, q_{i,t})}{\sum_{i \in G_0 \cap G_t} p_{i,0}\, h(q_{i,0}, q_{i,t})}, \qquad (5)$$
where $h(q_{i,0}, q_{i,t})$ is the harmonic mean of the quantities sold in the two periods.
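The first two special cases mentioned above can be verified by direct substitution into formula (1). The derivation below is a sketch under the simplifying assumption that the same set of products $G$ is sold in both periods ($G_0 = G_t = G$); this assumption is made here for illustration only.

```latex
% Assumption for this sketch: G_0 = G_t = G.
% Case 1: v_i = p_{i,t} (prices of the publication period)
P_t = \frac{\sum_{i \in G} p_{i,t} q_{i,t} \Big/ \sum_{i \in G} p_{i,0} q_{i,0}}
           {\sum_{i \in G} p_{i,t} q_{i,t} \Big/ \sum_{i \in G} p_{i,t} q_{i,0}}
    = \frac{\sum_{i \in G} p_{i,t} q_{i,0}}{\sum_{i \in G} p_{i,0} q_{i,0}}
    \quad \text{(Laspeyres)}

% Case 2: v_i = p_{i,0} (prices of the base period)
P_t = \frac{\sum_{i \in G} p_{i,t} q_{i,t} \Big/ \sum_{i \in G} p_{i,0} q_{i,0}}
           {\sum_{i \in G} p_{i,0} q_{i,t} \Big/ \sum_{i \in G} p_{i,0} q_{i,0}}
    = \frac{\sum_{i \in G} p_{i,t} q_{i,t}}{\sum_{i \in G} p_{i,0} q_{i,t}}
    \quad \text{(Paasche)}
```

In the first case the period-$t$ turnover cancels, leaving base period quantities in both sums; in the second case the base period turnover cancels, leaving period-$t$ quantities.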
The three special cases are not able to take into account new products in the year of
publication, unless some form of price imputation is carried out. Monthly chaining would
be an alternative, but given the problems experienced at Statistics Netherlands with
scanner data in the first years, this option is not considered here.
Price imputations are not needed if the $v_i$ are based on price and quantity information from multiple periods. This is an essential property of the method, which can be exploited to obtain transitive price indices. Considering product prices and quantities from some period $T$, we define $v_i$ for product $i$ as follows:
$$v_i = \sum_{z \in T} \varphi_{i,z} \frac{p_{i,z}}{P_z}, \qquad (6)$$
where
$$\varphi_{i,z} = \frac{q_{i,z}}{\sum_{s \in T} q_{i,s}} \qquad (7)$$
denotes the share of period $z$ in the total amount of quantities sold for product $i$ over period $T$. Two remarks are worth making:
• First, it is clear that a choice for the length of the period $T$ must be made, which will be dealt with later in this subsection;
• Second, the $v_i$ are defined as a weighted average of deflated prices observed in $T$. The effect of price change is thus removed in order to yield product specific $v_i$ in the volume index of (1). The price index to be calculated also appears in the $v_i$, which in turn are needed to calculate the price index. In Section 5.3, a computational method is presented that deals with this characteristic of the method.
The index method (“QU-method”) is completely described by formulas (1), (6) and (7). This system of expressions has a counterpart in the field of PPPs, in a method that is known as the Geary-Khamis (GK) method (Geary (1958), Khamis (1972), Balk (1996, 2001, 2012)). The GK-method has been the subject of some debate, essentially because of the linear form proposed for the $v_i$ (e.g., see Balk (1996), p. 214). Alternative forms for (6)
are considered in Chessa (2015). These departures from linearity (“perfect substitution”)
were shown to have negligible effects on the price indices analysed, so that it was decided
to stick to the simpler expression (6).
Apart from its simplicity, expression (6) also has another appealing feature. A straightforward rewriting of (6) gives:
$$v_i = \sum_{z \in T} q_{i,z} \frac{p_{i,z}}{P_z} \Big/ \sum_{z \in T} q_{i,z}. \qquad (8)$$
Expression (8) says that $v_i$ is equal to turnover “in constant prices” of product $i$ over period $T$, divided by the total number of products $i$ sold in the same period. The numerator in (8) coincides with the notion of volume as used in national accounts. In this sense, $v_i$ can be defined as volume per unit of product sold.
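Expressions (6) to (8) can be sketched for a single product as follows. The price indices are taken as known inputs here, which is an assumption of this illustration (in the method itself they are unknown and are solved jointly with the weights, see Section 5.3). The function name and the numbers are hypothetical.

```python
def product_weight(prices, quantities, price_indices):
    """Weight v_i of formula (6), with the quantity shares of formula (7).

    The three lists run over the periods z of the window T in which the
    product was sold.
    """
    total_quantity = sum(quantities)
    shares = [q / total_quantity for q in quantities]           # formula (7)
    deflated = [p / P for p, P in zip(prices, price_indices)]   # p_{i,z} / P_z
    return sum(s * d for s, d in zip(shares, deflated))         # formula (6)


# Hypothetical product whose price follows the general price level exactly
prices = [10.0, 11.0, 12.0]
quantities = [5.0, 3.0, 2.0]
price_indices = [1.0, 1.1, 1.2]

v_i = product_weight(prices, quantities, price_indices)

# Formula (8): deflated turnover divided by the total quantity sold
v_i_alt = (sum(q * p / P for p, q, P in zip(prices, quantities, price_indices))
           / sum(quantities))
```

Because the deflated price is the same in every period in this example, the weight equals that constant deflated price, whatever the distribution of sales over the periods.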
Returning to our proposal of using prices and quantities from multiple periods for calculating the $v_i$, we conclude by comparing some QU-indices with Lowe type index (5), which is calculated both as a direct index and as a monthly chained index. Figure 4 compares the three price indices for four types of menswear.¹³ The results show large differences, which have different causes. While the direct method and the QU-method give comparable results for socks and underwear, the direct method fails for T-shirts.
Figure 4. QU-indices compared with direct and monthly chained indices (MoM) for four types of menswear, based on scanner data of a department store (Feb. 2009 = 100).
[Figure 4 panels: Socks; Underwear; T-shirts; Pullovers and Cardigans. Series in each panel: QU-index; Direct index; MoM index. Horizontal axis: months 200902 to 201302.]
13 In the results shown, T-shirts represent a consumption segment, while the results are aggregated over various consumption segments for the other three article groups.
The direct method does not capture the contribution of new products to price change
in the year of introduction to the assortment. New types of T-shirts, made of organic
cotton, were introduced in 2010 at high initial prices, which already started to decrease in
2010. The contribution of the new T-shirts is captured by the QU-index, and also by the
monthly chained index, but not by the direct index. The latter only evidences the price
behaviour of the existing part of the assortment, which, in contrast to the new articles,
shows a price increase in 2010.
The monthly chained indices do not come close to the QU-indices in any of the four cases. This can be partly explained by seasonal effects (e.g., articles returning into the assortment with price increases, which are missed).
The examples in Figure 4 show that it is important to have an index method in which not only existing articles enter the calculations, but also new articles. Leaving out new articles in the year of introduction may have a huge impact on a price index. This implies that $v_i$ should be calculated for new products as soon as these appear in an assortment.¹⁴
Length of the time window T
Scanner data of a large department store and of drugstores, and also electronic data of
mobile phones, have been used to compare time windows that vary between 1 and 4 years
in length. Different window lengths were compared by calculating information criteria
(Section 4).
A unique choice is not easy to make, as different results have been obtained for
different types of goods. One-year windows turned out to give slightly better fits for the
department store scanner data. Longer windows tend to show better fits for drugstore
scanner data, but the differences among the price indices for different window lengths are
negligible in most cases. The same holds for mobile phones. A 1-year window fits well with
current practice in the Dutch CPI and is advantageous with regard to system maintenance
compared to longer windows, as only items sold within one year have to be followed.
The problem of chain drift
We intend to use one-year windows $T$ and the corresponding $v_i$ to calculate price indices according to expression (1) per publication year. We will do this by using December of the preceding year as a fixed base month. The choices on the $v_i$ and the length of the time window $T$ give rise to index numbers that are transitive within publication years, and are therefore free of chain drift. It is obvious that we need to switch from one set of $v_i$ to another in the next publication year, a choice that has been motivated by statistical research on historical data, as mentioned previously.
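To make the notion of chain drift concrete, the sketch below chains monthly Laspeyres indices over a hypothetical price promotion. The Laspeyres formula is used here only because it is short; it is not the index formula of this paper, and the data are invented for the illustration.

```python
def laspeyres(p_base, q_base, p_cur):
    """Laspeyres price index of the current period relative to the base period."""
    return (sum(pc * qb for pc, qb in zip(p_cur, q_base))
            / sum(pb * qb for pb, qb in zip(p_base, q_base)))


# Two products, three months. Product 1 is on sale in month 1 and returns
# to its original price in month 2; quantities react strongly to the sale.
prices = [[1.0, 1.0], [0.5, 1.0], [1.0, 1.0]]
quantities = [[10.0, 10.0], [30.0, 5.0], [5.0, 10.0]]

# Monthly chained index: product of month-on-month indices
chained = 1.0
for m in range(1, len(prices)):
    chained *= laspeyres(prices[m - 1], quantities[m - 1], prices[m])

# Direct index of month 2 with respect to month 0
direct = laspeyres(prices[0], quantities[0], prices[2])
```

All prices in month 2 are equal to those in month 0, so the direct index is 1, while the chained index ends well above 1. A transitive index method does not show this kind of discrepancy.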
The method described in Krsinich (2014) offers an alternative to a method with a fixed base month. She also suggests one-year windows, but proposes to shift the window along with each publication month (quarter in New Zealand). Price indices for publication periods are calculated by chaining year-on-year indices at each shift of the window. The use of her rolling window approach may lead to large differences compared to our method.
Different questions and issues have emerged from the comparison with the rolling window approach. It is not clear how the price indices for different publication years are related to each other. One would expect the rolling window approach to produce results that are comparable with our method, but with the choice of base month averaged out. Apparently, this is not the case (Chessa, 2015).
14 However, note that new products contribute to a price index from the second month in which they are sold.
As a final note, it may be interesting to mention that we found a price model with
December as a fixed base month to give structurally better fits than a model based on the
rolling window approach (i.e., the former choice resulted in smaller weighted least
squares values). This finding coincides with current practice in the CPI/HICP, as December
is the month in which yearly weight revisions are carried out.
5.3 Computation of price indices in practice
In order to incorporate new products in a timely way in the index calculations, we calculate the $v_i$ from prices and quantities of the publication year. At this point we encounter a problem, since we decided upon a model with yearly constant $v_i$. We refer to the corresponding index as the theoretical “benchmark index”. The complete set of annual prices and quantities becomes available only in the final month of a year. This raises the question of how to deal with this problem.
We propose a method for calculating a “real time” version of the benchmark index, such that:
• The $v_i$ are updated each publication month with product prices and quantities which become available in that month;
• Price indices are calculated with respect to the base month, by making use of the updated $v_i$. That is, a direct index is used instead of a monthly chained index.
These choices ensure that the benchmark and real time indices are equal at the end of
each year, so that real time indices are free of chain drift as well. This is an important
property of the computational method. The question is how the two price indices compare
in previous months. This will be illustrated in Section 6 with several examples.
Price indices cannot be calculated directly, since the $v_i$ depend on the price indices. We propose a simple method, which follows an iterative scheme:
1. Suppose that a price index for publication month $t$ has to be calculated. As a first step, choose initial values for the price indices from base month 0 up to month $t$;
2. Calculate the $v_i$ for each product sold between the base month and month $t$ by making use of product prices and quantities up to month $t$:
$$v_i = \sum_{z=0}^{t} \varphi_{i,z} \frac{p_{i,z}}{P_z}, \qquad (9)$$
where
$$\varphi_{i,z} = \frac{q_{i,z}}{\sum_{s=0}^{t} q_{i,s}}. \qquad (10)$$
3. Substitute the $v_i$ obtained in step 2 into expression (1) and calculate updated price indices up to month $t$;
4. Repeat steps 2 and 3 until the differences between the price indices obtained in the last two iterations are ‘small’, according to some pre-defined distance measure.
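The four steps above can be sketched in Python as follows. This is a minimal illustration and not the T-SQL implementation used at Statistics Netherlands; the function name `qu_indices`, the data layout, the tolerance and the example data are assumptions. The sketch further assumes that at least one product is sold in every month and that all recorded quantities are positive.

```python
def qu_indices(data, t, tol=1e-6, max_iter=1000):
    """Iterate steps 1 to 4 and return the price indices [P_0, ..., P_t].

    data maps each product to {month: (price, quantity)} for the months
    in which it was sold; month 0 is the base month.
    """
    months = range(t + 1)
    # Use only the information that is available in publication month t
    obs = {i: {z: pq for z, pq in sales.items() if z <= t}
           for i, sales in data.items()}
    obs = {i: sales for i, sales in obs.items() if sales}
    turnover = [sum(s[m][0] * s[m][1] for s in obs.values() if m in s)
                for m in months]

    P = [1.0] * (t + 1)                      # step 1: initial price indices
    for _ in range(max_iter):
        # Step 2: weights v_i from formulas (9) and (10)
        v = {i: sum(q * p / P[z] for z, (p, q) in s.items())
                / sum(q for _, q in s.values())
             for i, s in obs.items()}
        # Step 3: updated price indices from formula (1)
        volume = [sum(v[i] * s[m][1] for i, s in obs.items() if m in s)
                  for m in months]
        new_P = [(turnover[m] / turnover[0]) / (volume[m] / volume[0])
                 for m in months]
        # Step 4: stop when the largest absolute change is small
        if max(abs(a - b) for a, b in zip(new_P, P)) < tol:
            return new_P
        P = new_P
    raise RuntimeError("iteration did not converge")


# Hypothetical data: both prices rise by 10 per cent, quantities shift.
data = {
    "A": {0: (2.0, 10.0), 1: (2.2, 5.0)},
    "B": {0: (4.0, 5.0), 1: (4.4, 10.0)},
}
P = qu_indices(data, t=1)
```

In this example all prices rise by the same percentage, so the iteration should settle at an index of about 1.1 for month 1, irrespective of the shift in quantities.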
A number of comments need to be made:
• The initial values for the price indices in step 1 can be chosen arbitrarily, for instance $P_1 = \dots = P_t = 1$, as the algorithm can be shown to converge to a unique solution (when such a solution exists, of course);
• Computation times can be reduced by constructing suitable initial price indices. A method with this aim is described in Chessa (2015), which has shown that the initial indices already give very good approximations to the final indices;
• The $v_i$ in step 2 are calculated by making use of product prices and quantities from the base month up to the publication month $t$. This means that a shorter period is used at the beginning of each year. As an alternative we could use a moving one-year window and include data from the preceding year. The results obtained with the above choices have been satisfactory, as will be shown in Section 6. For now, we therefore keep to the method presented above, which is simpler to implement;
• Price indices are calculated for each month between the base month and the publication month. However, the price indices up to month $t - 1$ will not be revised, as this is not common practice in the CPI (apart from exceptional cases). This means that only the price index for the publication month will be retained from the calculations, which hence will not be modified in successive months.
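The relation between the real time index and the benchmark index can be sketched as follows. The block repeats a compact version of the iterative scheme so that it is self-contained; the function name `solve_qu`, the data layout and the numbers are hypothetical, and the same assumptions apply as before (sales in every month, positive quantities).

```python
def solve_qu(data, t, tol=1e-9, max_iter=1000):
    """Price indices P_0..P_t from the iterative scheme, using data up to month t."""
    obs = {i: {z: pq for z, pq in s.items() if z <= t} for i, s in data.items()}
    obs = {i: s for i, s in obs.items() if s}   # products sold up to month t
    months = range(t + 1)
    turnover = [sum(s[m][0] * s[m][1] for s in obs.values() if m in s)
                for m in months]
    P = [1.0] * (t + 1)
    for _ in range(max_iter):
        v = {i: sum(q * p / P[z] for z, (p, q) in s.items())
                / sum(q for _, q in s.values())
             for i, s in obs.items()}
        volume = [sum(v[i] * s[m][1] for i, s in obs.items() if m in s)
                  for m in months]
        new_P = [(turnover[m] / turnover[0]) / (volume[m] / volume[0])
                 for m in months]
        if max(abs(a - b) for a, b in zip(new_P, P)) < tol:
            return new_P
        P = new_P
    raise RuntimeError("iteration did not converge")


# Hypothetical data for a base month (0) and three publication months.
# Product "C" enters the assortment in month 1.
data = {
    "A": {0: (2.0, 10.0), 1: (2.2, 5.0), 2: (2.1, 8.0), 3: (2.3, 9.0)},
    "B": {0: (4.0, 5.0), 1: (4.4, 10.0), 2: (4.6, 6.0), 3: (4.2, 7.0)},
    "C": {1: (6.0, 3.0), 2: (5.0, 7.0), 3: (4.5, 9.0)},
}
last_month = 3

# Benchmark: weights based on all months of the year
benchmark = solve_qu(data, last_month)

# Real time: in each publication month t, solve with data up to t and
# retain only the index of month t; earlier months are not revised.
real_time = [1.0] + [solve_qu(data, t)[t] for t in range(1, last_month + 1)]
```

By construction, the last real time index equals the last benchmark index. In this example the index for month 1 is about 1.1, the common price change of products A and B: product C is first sold in month 1 and does not yet affect the index, which agrees with the remark that new products contribute from the second month in which they are sold.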
6. First tests with the new methodology
The methodology presented in this paper has been studied and applied to scanner
data of a department store and drugstore chains, and also to transaction data of mobile
phones. It has been implemented for the department store and for mobile phones in a test
environment. The methodology has been programmed in T-SQL for the department store.
It was tested in Excel for mobile phones, as the data set is small (70 devices make up
almost 80% of total turnover). However, the method will be programmed in T-SQL as well
for mobile phones in the near future.
In this section, we report some first findings from tests of the methodology for the department store. The following facts show that we are dealing with a big retailer. The
data contain over 133,000 EANs for 2014, which are subdivided into 221 ESBAs according
to the retailer’s article classification. An ESBA gives rise to one or more consumption
segments. The article assortment covers eight Coicop divisions. The department store has
a weight of 0.7 per cent in the Dutch CPI (see also Table 2, Section 2).
One of the key properties of the data is that information about article characteristics
is bundled into text strings. This means that we had to find a way to restructure the data
into a format, such that the price index related problems could be efficiently dealt with. A
method for identifying article characteristics in text strings had to be developed, after
which the characteristics could be placed in separate columns.
A basic form of text mining was used to identify article characteristics. Lists of key
words were set up for the characteristics, based on the coding used by the retailer. The
initial stage of this process consisted of a visual inspection of text strings in order to obtain
a first impression of the coding system. The key word lists were gradually expanded by
isolating the EANs that still did not match the current key words, which could indicate a
different coding for the same characteristic or no coding at all.
This data processing step proved to be time consuming, especially at the beginning. A
retailer may use different ways of coding the same characteristic over time. For instance,
we encountered both Dutch and English terms for the same characteristics, and terms
were both spelt out and abbreviated, the latter even in different ways.
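The key word approach described above can be sketched as follows. The key word lists, the characteristic names and the example descriptions are entirely hypothetical, since the retailer's coding is not given in this paper; the sketch only shows the general mechanism of matching Dutch and English terms and abbreviations, and of isolating descriptions that do not match any key word yet.

```python
import re

# Hypothetical key word lists: for each characteristic, each value is
# linked to the terms (spelt out or abbreviated) that may encode it.
KEYWORDS = {
    "colour": {
        "white": ["white", "wit", "wht"],
        "black": ["black", "zwart", "blk"],
    },
    "sleeve": {
        "short": ["short sleeve", "korte mouw", "km"],
        "long": ["long sleeve", "lange mouw", "lm"],
    },
}


def extract_characteristics(description):
    """Return the characteristics found in an EAN description (text string)."""
    text = description.lower()
    found = {}
    for characteristic, values in KEYWORDS.items():
        for value, terms in values.items():
            # Match whole terms only, to avoid hits inside longer words
            if any(re.search(r"\b" + re.escape(term) + r"\b", text)
                   for term in terms):
                found[characteristic] = value
                break
    return found


def unmatched(descriptions, characteristic):
    """Descriptions in which no key word for the characteristic was found."""
    return [d for d in descriptions
            if characteristic not in extract_characteristics(d)]
```

For example, `extract_characteristics("T-SHIRT KM WIT")` yields a short sleeve and the colour white, while `unmatched(["SOCKS GREY"], "colour")` returns the description, which signals that the key word list may need to be extended or that the characteristic is not coded at all.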
Missing information about characteristics can be interpreted in different ways. A characteristic may not be mentioned when it concerns a ‘default value’. For instance, the EAN descriptions of single-item packages never mention the content of a package, whereas the number of items is specified for multipacks. For data sets like the scanner data for the department store, one therefore has to try to imagine the logic behind the retailer’s coding rules. Some degree of interpretation of the EAN descriptions can thus not be excluded when applying text mining.
The results obtained with text mining were subsequently used to address the
problems related to the calculation of price indices. Historical data over a four-year period
were used for this purpose. Article attributes were selected and homogeneous products
were defined by applying the statistical approach described in Section 4. Choices with
regard to the QU-method were made as well, which are motivated in sections 5.2 and 5.3.
These choices were taken into the implementation and test phase. Price indices have been
calculated according to the scheme in Figure 1 (Section 3), which is worked out in more
detail in Figure 5 below.
Figure 5. Process steps in the implemented methodology for the department store.
The linking of article types and characteristics to EANs, as mentioned in the chart, is
achieved by searching the text strings for corresponding key words.
The chart is in fact generally applicable to any transaction data set. What makes the
chart specifically tailored to the department store data is the possibility to take EANs as
homogeneous products. As was stated in Section 4, this choice was made for consumption
segments with short EAN descriptions and assortments that are stable over time (no or
hardly any relaunch). This choice can be easily made for Coicop divisions 01, 02 and 11, at
least, in case of the department store. For other Coicops, notably for clothing articles,
products are defined in terms of a limited set of attributes (four at most).
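The choice between the two product definitions can be sketched as follows. The function name, the data structure and the example EANs are hypothetical; the sketch only illustrates that a product is either a single EAN or a group of EANs sharing the same values for the selected attributes.

```python
from collections import defaultdict


def define_products(eans, high_relaunch_risk, attributes):
    """Group EANs into homogeneous products.

    eans maps each EAN to a dict of its article characteristics.
    With a high risk of relaunches, EANs that share the same values for
    the selected attributes form one product; otherwise each EAN is a product.
    """
    products = defaultdict(list)
    for ean, characteristics in eans.items():
        if high_relaunch_risk:
            key = tuple(characteristics.get(a) for a in attributes)
        else:
            key = (ean,)
        products[key].append(ean)
    return dict(products)


# Hypothetical T-shirt segment; EAN-2 could be a relaunch of EAN-1.
eans = {
    "EAN-1": {"type": "T-shirt", "fabric": "cotton", "sleeve": "short"},
    "EAN-2": {"type": "T-shirt", "fabric": "cotton", "sleeve": "short"},
    "EAN-3": {"type": "T-shirt", "fabric": "organic cotton", "sleeve": "short"},
}

by_attributes = define_products(eans, True, ["type", "fabric", "sleeve"])
by_ean = define_products(eans, False, [])
```

With attributes as the product definition, EAN-1 and EAN-2 end up in the same product, so a price change that coincides with the EAN change can be captured; with EANs as products, the three articles remain separate.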
[Figure 5 flowchart labels: Link article types to EANs in ESBAs; Define consumption segments (consumption segment = type of article); ‘High’ risk of relaunches? (YES/NO); Link article characteristics to EANs; Define homogeneous products (product = combination of characteristics = group of EANs, or product = EAN); Calculate price indices (real time QU-indices for consumption segments; Laspeyres indices for (L-)Coicops).]
A question that comes to mind when looking at the chart in Figure 5 is what to do with EANs with short descriptions and frequent relaunches. We have not come across such cases in the department store data. But if such cases were to emerge in other data sets, then some decision must be made. Such cases could be left out when their turnover share justifies doing so. Otherwise, some solution has to be found in order to include such EANs. This is an open problem.
The possibility of choosing EANs as products could be extended to other retailers and
consumer goods, but we prefer to avoid this. The applications of the methodology to
mobile phones and drugstore articles make a consistent choice: products are conceived of
as combinations of article characteristics. The drugstore scanner data show frequent
relaunches, which take place in most of its assortment.
Price indices for consumption segments are calculated according to the algorithm in Section 5.3, which computes real time indices. The resulting indices are free of chain drift, as they are equal to the theoretical benchmark indices, with yearly fixed product specific parameters $v_i$, at the end of each year. But these parameters are updated each month when calculating real time indices for each publication month, so the question was raised to what extent the real time and benchmark indices differ.
Both indices were compared during the validation of the first test results. Some results are shown in Figures 6-8, for consumption segments in three Coicop divisions. The real time indices are the indices that would be published in the CPI. The results show hardly any difference between the real time and benchmark indices. This has been observed so far throughout almost the entire assortment. An exceptional case is ladies’ T-shirts in Figure 7. Differences may typically occur in the first months of a year, as price and quantity data from shorter time windows are used to calculate the $v_i$.
Figure 6. Real time and benchmark indices for cake and crisps, based on scanner data for the department store from recent years (Dec. 2012 = 100).
Figure 7. Real time and benchmark indices for men’s and ladies’ T-shirts (Dec. 2012 = 100).
As was suggested in Section 5.3, a moving one-year window could be used, adding
price and quantity data from the preceding year. This is a point worth investigating in
subsequent tests of the index method. The real time and benchmark indices for all other
ladies’ clothing articles show negligible differences or no difference at all. At aggregate
levels, the impact of the difference for ladies’ T-shirts will therefore be small, if not
negligible.
The test results have shown that the method converges rapidly. In most cases, fewer than 10 iteration cycles were sufficient, with a stop criterion set at 0.001 for the maximum absolute difference between the price indices in the last two iteration cycles. As was stated in Section 5.3, even fewer iteration cycles are needed when the initial index described in Chessa (2015) is used to start the algorithm, which has proved to give a very good approximation to the final indices.
Figure 8. Real time and benchmark indices for toilet and kitchen towels (Dec. 2012 = 100).
During the implementation phase, it was decided to define consumption segments by
article types, differentiated by purpose (e.g., we take different types of socks as different
consumption segments instead of combining the different types into one segment ‘socks’).
Defining tighter consumption segments increases the chance of index imputations, but this
proved to be hardly an issue in the first tests. We could consider less detailed article types
as consumption segments (e.g., ‘socks’) in subsequent tests. This obviously decreases the
number of segments and, as such, is an interesting option for investigating the impact on
total computation time.
To conclude this section, we compare the results for the QU-method with the current
method, which is based on samples of the scanner data, and with the method that was
based on the classical survey. Figure 9 compares the three price indices for four L-Coicops.
The results clearly show notable differences. In preliminary research, we tried to mimic the behaviour of the current and survey-based methods by implementing the respective choices in the QU-method (Chessa, 2014). The degree of representativeness of the samples chosen, combined with the more broadly defined products in the current and survey-based methods, seems to give a plausible explanation of the differences with the QU-method.
Figure 9 clearly illustrates the improvement that can be achieved with a method that aims
at processing all transactions and that emphasises the importance of defining
homogeneous products.
7. Main findings and future plans
The methodological differences across electronic data sets in the Dutch CPI, in
conjunction with the increased use of such data, motivated a search towards a more
generic index method. An index method has been developed, which has been studied and
applied to data sets of different retailers and consumer goods. The QU-method has clear
advantages compared to the current methods:
Figure 9. Price indices for the QU-method, the current method and the previously used survey-based method for four L-Coicops (Feb. 2009 = 100).
• The classical approach of following prices for a basket of goods can be abandoned and replaced by integral processing of transaction data;
• This can be achieved without introducing and testing data filters, although it remains wise to apply certain filters (e.g., for outlier detection);
• New products can be incorporated into the index calculations in a timely way. The availability of an index method that is capable of doing this has clearly proven its usefulness and superiority over methods that postpone the inclusion of new products until the next year (see Figure 4, Section 5.2);
• There is no need to impute prices of products within consumption segments. A product that is not sold in some period simply does not contribute to turnover and volume (weighted quantity) in that period. If the same product is sold in a different period, then it contributes to turnover and volume for that period. So, the weighted quantity measure handles products that are not sold in certain periods without any problem and without the need for imputation.
The aim of timely including new products in an index method adds notable
complexity to the quest for such a method. Product weights need to be based on price and
sales information from the current publication year. The computational method described
in Section 5.3 makes use of monthly updated weights, which consequently will vary over
time. One of the key features of the method is that it is benchmarked to a method with
yearly fixed weights, which is transitive, and therefore enables us to obtain price indices
that are free of chain drift within a publication year. The product weights are allowed to
vary over years, a choice that was motivated by statistical analyses, in which time
windows of different length were compared.
[Figure 9 panels: Menswear; Ladies' wear; Bed linen; Table linen and bathroom linen. Series in each panel: QU-index; Current method; Survey. Vertical axis: price index, 60 to 140; horizontal axis: months 200902 to 201302.]
The methodology has been extensively applied to different electronic transaction data sets. It has been implemented and tested for scanner data of a department store and for transaction data of mobile phones. Some of the findings that have emerged from the analysis of the test results can be summarised as follows:
• Most of the differences between the benchmark and real time indices are negligible, or the indices do not differ at all;
• Larger differences have been noted in some exceptional cases for the department store data, which typically arise in the first months of a year due to the use of a shorter time window for calculating the product weights;
• These differences could be reduced by extending the time window with months of the preceding year. Although this is an interesting option for further research, it does not seem to be a big issue so far;
• The results have been validated and are in agreement with theoretical expectations about the behaviour of the price indices.
The department store data required quite some time in extracting article
characteristics from the EAN descriptions. This investment of time eventually paid off, as
the linking of article characteristics to EANs operates through a short list of key words that
has remained stable over time (for almost 7 years of data now).
Based on this finding, we expect monthly maintenance work to be limited. Most of this
work will consist of checking whether new EANs contain new types of articles and
attributes. With the aid of the current list of search terms, new types of articles should be
rather easy to isolate. New consumption segments should therefore be easy to identify.
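The key word linking and the monthly check of new EANs described above can be sketched as follows. This is a minimal illustration only: the key words, the characteristic names and the functions `extract_characteristics` and `needs_review` are hypothetical and are not taken from the list actually used for the department store data.

```python
# Hypothetical key word list (an assumption for illustration; the real list
# used for the department store data is not given in this paper).
KEYWORDS = {
    "article_type": ["towel", "duvet cover", "fitted sheet"],
    "material": ["cotton", "satin", "flannel"],
}


def extract_characteristics(description):
    """Return, per characteristic, the first key word found in the
    EAN description, or None if no key word matches."""
    text = description.lower()
    return {
        name: next((kw for kw in keywords if kw in text), None)
        for name, keywords in KEYWORDS.items()
    }


def needs_review(description):
    """Flag a new EAN whose description matches no known article type,
    so that it can be inspected manually as a possible new type."""
    return extract_characteristics(description)["article_type"] is None


# Example with made-up EAN descriptions.
known = extract_characteristics("DUVET COVER satin 140x200")
flagged = needs_review("Bath robe cotton L")
```

In this sketch an EAN is set aside for manual inspection only when no article type is recognised, which mirrors the idea that new article types should be easy to isolate with the existing search terms.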
If the new article types possess attributes that have not been identified so far, then
the current list of key words should be extended with new characteristics. Attributes
should then be selected in order to define homogeneous products for the new segments,
which can be handled by applying one of the two approaches described in Section 4 (we
will start experimenting with the simpler one). This would require most of the
maintenance work. The exact implications in terms of time will become clear next year,
after taking the methodology into production.
Statistics Netherlands intends to take the methodology into production in January
2016, both for the department store and for mobile phones. The methodology will then
replace the current sample-based methods. The methodology has also been studied and
applied to scanner data for drugstores and do-it-yourself stores. Additional data are
needed for both data sets, as questions on discounts and product homogeneity have been
raised. In order to resolve these issues, the first step should be to contact retailers. We
have received a test data set for the do-it-yourself stores with additional information. In
addition, the test data are better structured than the original data, which should
reduce the amount of text mining needed.
Statistics Netherlands has defined a research program for the coming years, which
aims at studying possibilities for further improvement of the methodology, ranging from
text mining and data analysis/exploration to price index methods. This research will be
extended to other scanner data sets, including those of supermarkets.
To conclude, Statistics Netherlands is putting a lot of effort into collecting internet
prices through web scraping. However, considerable care is needed when using such data
to compile price indices, as numbers of articles sold are not available. Methods that assign
equal weights to articles generally give poor statistical fits to price data and the resulting
price indices may differ considerably from price indices in which articles are weighted
according to turnover shares (Chessa, 2014, 2016). Information from additional sources is
therefore needed in order to use internet prices in a meaningful way. If it is possible to
obtain weights of the turnover share type, then this would open up possibilities to apply
the QU-method to internet data as well.
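The effect of the weighting choice can be illustrated with a small numerical sketch. The code below is not the QU-method; it simply compares an unweighted geometric mean of price relatives (equal article weights) with a geometric mean weighted by base-period turnover shares. The function names and the two-article data set are hypothetical and serve only as an illustration.

```python
import math


def jevons_index(p0, p1):
    """Unweighted geometric mean of price relatives (equal article weights)."""
    logs = [math.log(b / a) for a, b in zip(p0, p1)]
    return math.exp(sum(logs) / len(logs))


def share_weighted_index(p0, p1, turnover0):
    """Geometric mean of price relatives, weighted by base-period turnover shares."""
    total = sum(turnover0)
    logs = [(t / total) * math.log(b / a) for a, b, t in zip(p0, p1, turnover0)]
    return math.exp(sum(logs))


# Made-up data: article 1 sells little and becomes more expensive,
# article 2 dominates turnover and becomes cheaper.
prices_base = [10.0, 10.0]
prices_current = [12.0, 9.0]
turnover_base = [100.0, 900.0]

equal_weight = jevons_index(prices_base, prices_current)
share_weight = share_weighted_index(prices_base, prices_current, turnover_base)
```

With these made-up figures the equal-weight index lies above 1, whereas the share-weighted index lies below 1, because the article that dominates turnover is the one that falls in price. Web-scraped prices alone cannot reveal this, since the turnover shares are not observed.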
References
Balk, B.M. 1996. A comparison of ten methods for multilateral international price and
volume comparison. Journal of Official Statistics, 12: 199-222.
Balk, B.M. 2001. Aggregation methods in international comparisons: What have we
learned? Paper originally prepared for the Joint World Bank - OECD Seminar on
Purchasing Power Parities, 30 January - 2 February 2001, Washington DC.
Balk, B.M. 2012. Price and Quantity Index Numbers: Models for Measuring Aggregate Change and Difference. Cambridge, UK: Cambridge University Press.
Chessa, A.G. 2013. Comparing scanner data and survey data for measuring price change of
drugstore articles. Paper presented at the Workshop on Scanner Data for HICP, 26-27
September 2013, Lisbon.
Chessa, A.G. 2014. An index method for a Dutch department store. The Hague: Statistics
Netherlands. (In Dutch)
Chessa, A.G. 2015. Towards a generic price index method for scanner data in the Dutch
CPI. Ottawa Group Meeting, 20-22 May 2015, Urayasu City, Japan.
Chessa, A.G. 2016. Product homogeneity and weighting when using scanner data for price
index calculation. Invited paper at the 2016 International Methodology Symposium “Growth in Statistical Information: Challenges and Benefits”, 22-24 March 2016, Gatineau,
Quebec, Canada. (In preparation)
Claeskens, G. and N.L. Hjort. 2008. Model Selection and Model Averaging. Cambridge, UK:
Cambridge University Press.
Geary, R.C. 1958. A note on the comparison of exchange rates and purchasing power
between countries. Journal of the Royal Statistical Society A, 121: 97-99.
van der Grient, H.A. and J. de Haan. 2010. The use of supermarket scanner data in the
Dutch CPI. Paper presented at the Joint ECE/ILO Workshop on Scanner Data, 10 May 2010,
Geneva.
de Haan, J. 2006. The re-design of the Dutch CPI. Statistical Journal of the United Nations Economic Commission for Europe, 23: 101-118.
de Haan, J. and H.A. van der Grient. 2011. Eliminating chain drift in price indexes based on
scanner data. Journal of Econometrics, 161: 36-46.
de Haan, J. and F. Krsinich. 2014. Time dummy hedonic and quality-adjusted unit value
indexes: Do they really differ? Paper presented at the 1st Conference of the Society for
Economic Measurement, 18-20 August 2014, University of Chicago.
Khamis, S.H. 1972. A new system of index numbers for national and international
purposes. Journal of the Royal Statistical Society A, 135: 96-121.
Krsinich, F. 2014. The FEWS Index: Fixed Effects with a Window Splice – Non-revisable
quality-adjusted price indexes with no characteristic information. Paper presented at the
Meeting of the group of experts on consumer price indices, 26-28 May 2014, Geneva,
Switzerland.