Asia-Pacific Economic Statistics Week Seminar Component
Bangkok, 2 – 4 May 2016
Name of author: Abhiman Das (Professor, IIM Ahmedabad); Pulak Ghosh (Professor, IIM Bangalore); Anirban Sanyal (Research Officer, Reserve Bank of India)
Organization: Indian Institute of Management (IIM) – Ahmedabad & Bangalore; Reserve Bank of India
Contact address: Department of Statistics and Information Management, Modelling and Forecasting Division, C-8, 6th Floor, Bandra Kurla Complex, Bandra (E), Mumbai- 400051. Phone: 022-26578700 Ext-7503. Mobile: 08879239691
Contact phone: (+91) 9820414476, (+91) 8879239691, (+91) 9742065806
Email: [email protected] [email protected] [email protected]
Title of Paper
Monitoring consumer price trend using daily price data of online grocery stores in India
Abstract
Central banks and other policymakers monitor trends in consumer prices using different price indices. These indices are compiled by official agencies and are usually published at monthly frequency. However, the time lag in publishing official data often creates a bottleneck for policymaking, as information on the price situation remains unavailable at the time policy decisions are framed. In order to bridge this information gap, this paper proposes a methodology for tracking price momentum almost in real time using price data from online grocery stores. Online grocery stores have operated successfully in different advanced and emerging economies, including India, for a considerable period of time and have a large consumer base. Given the wide reach of such online groceries and their product coverage, we explore the information content of their daily pricing data in the Indian context. We find concordance in price momentum between the price quotations of e-retailers and the officially published data. We argue that such high frequency data can be used effectively to develop composite price indices that track price momentum in real time and provide valuable inputs to policymakers.
I. Contents
I. Contents
I. Introduction
II. Literature Review
III. Approach and data used
1. Step 1: Data cleansing and transformation
2. Step 2: Mapping online grocery commodities to CPI item basket
3. Step 3: Construction of Index
4. Step 4: Comparative analysis and deep-dive
IV. Empirical Findings
1. Comparative Analysis
2. Deep-dive analysis
V. Concluding remarks
VI. References
I. Introduction
Price stability has been one of the major mandate of central banks across countries. As price
pressure builds up, inflation starts picking up resulting in discomfort for the policymakers.
Thus monitoring price condition continues to remain one of the major concerns for any
central bank. Generally different price indices (retail and wholesale) are released by official
agencies which are helpful for price monitoring across economy. However such official data
often come with a lag which creates huddle in effective policy making. Central Banks often
track different high frequency indicators for monitoring the price momentum and detect any
build-up of inflationary environment at the earliest. In this context, the usefulness of online
grocery data has been analyzed at length across various literature where the information
content of such real time price movement has been acknowledged as important source of
information on price condition in terms of timeliness and accuracy (Cavallo and Ribogon
(2011), Cavallo (2012 & 2013) etc.). Cavallo and Ribogon initiated the project titled ‘MIT
Billion Prices Project’ (BPP) with the objective of using the real time price movements across
large set of commodities for developing comparable price indices. In this process, BPP
covered over five million commodities sold across 300 online retail chains in more than 70
different countries1. The price indices estimated using BPP project across different countries
differed from the coverage of official estimates as online marketed commodities was found to
be concentrated in some specific categories among other limitations. However in spite of
these limitations, the estimated price indices were found to be tracking official price
momentum to significant extent which opened up a new gamut of opportunities for real time
price monitoring particularly in cases where the high frequency price data remains
unavailable.
¹ Source: Wikipedia (https://en.wikipedia.org/wiki/MIT_Billion_Prices_project)
The phenomenal growth of online grocery stores across countries points towards a shift in
consumer preference towards online ordering. As per a recent survey report of Nielsen
(April, 2015), more than 80% of 30000 respondents across countries find it convenient to
order from online grocery stores. Technological advancement, together with the ease with
which customers can access a wide variety of products, has been one of the major drivers of
this behavioral change among consumers. The Asia-Pacific region has been a leading region
in adopting such online initiatives in the grocery market. As per the report, the expansion
of mobile data usage and broadband connectivity drives consumer preference in this region.
The growth of online groceries has been phenomenal in the food and beverage segments, as
customers across Asia-Pacific countries prefer ordering groceries online to visiting retail
stores. The size of the online grocery market in India started picking up in 2011, but the
market remains small compared to others². In India, the retail sector has expanded
significantly in recent times. India’s retail sector was ranked fifth in 2012 as per the
Global Retail Development Index (GRDI) of A.T. Kearney. Online purchases have also become
part of this success story, with sales of food items by online retailers growing by 80%
during January to March of 2013³. The popularity of online grocery stores in India started
picking up in 2011, when two of the major online stores, namely Bigbasket and Zopnow,
started their operations. In recent times, 6 major online players dominate the online
grocery market in India. According to Mr. Hari Menon, Co-founder of Bigbasket, changes in
consumer behaviour are helping these online retailers to expand their business operations.
Though these online grocery stores have expanded their business operations significantly,
their spatial coverage is largely confined to major metro cities. Also, the commodities
offered through these online grocery portals are limited to food and beverage items and so
cover only part of the usual CPI item basket.
India adopted inflation targeting monetary policy in 2015, and consumer inflation was
selected as the official target of inflation. Consumer price inflation is derived from the
CPI Combined index released at monthly frequency by the Central Statistical Office with a
lag of 11-12 days. Food and fuel items, having a weight of around 58%, remain among the
major drivers of consumer inflation. The easing impact of fuel inflation has been offset by
food inflation in recent times. Though fuel inflation moves in tandem with the global crude
price, food inflation is driven primarily by supply shocks. In such a situation, real time
tracking of food prices acts as an early warning system for detecting any unfavorable
movement in one of the major drivers of consumer inflation. With this background, this
paper extends the real time price monitoring approach, primarily for food items, to the
Indian context using online grocery data obtained from one of the largest online grocery
stores of India. This paper contributes to the growing literature on Big Data analysis by
incorporating real time tracking of food prices using online grocery data in the Indian
context. Such exercises, though carried out across countries, have not been undertaken to a
great extent in emerging market economies, and this is the first of its kind in the Indian
context. The paper also addresses one of the major policy imperatives, namely real time
tracking of food inflation, which may be used to detect any inflationary pressure on
consumer price inflation contributed by food items. The rest of the paper is structured as
follows: Section 2 covers the literature review, followed by the approach and data used in
Section 3. The empirical findings are presented in Section 4 and the conclusion follows in
Section 5.
² “The retailer” – EY (http://www.ey.com/Publication/vwLUAssets/EY-the-retailer-july-september-2015/$FILE/EY-the-retailer-july-september-2015.pdf)
³ Source: https://www.kpmg.com/IN/en/IssuesAndInsights/ArticlesPublications/Documents/BBG-Retail.pdf
II. Literature Review
The utility of internet search data for tracking economic activity has been examined across
countries in recent times. Data obtained by monitoring internet activity have been used in
several countries, such as America, the UK and Germany, to produce high frequency series
that can capture macroeconomic activity and give accurate estimates of economic parameters
in real time. Google being the largest search engine, the majority of such studies have
used Google search data, which are officially released on the Google Trends platform at
weekly frequency. Varian and Choi (2009) were the first to use Google Trends to estimate
several macroeconomic indicators for the U.S. economy. Google Trends was used to estimate
automobile sales and initial claims for unemployment in the US, along with tourism in Hong
Kong and consumer confidence in Australia. Theoretically, this paper is similar to the
doctoral thesis of Brian D. Humphrey (2010), written under James W. Roberts, Faculty
Advisor. He uses simple OLS techniques to show that indicators developed using Google
search engine data can successfully predict local and national household sales in the
United States. McLaren and Shanbhogue (2011) surveyed indicators for the UK housing and
labour markets using data obtained from Google Insights by entering certain keywords. They
used simple autoregressive models with lags of one and two months to establish a
relationship between internet data and published data. The model exhibited a good fit, and
data on the keywords “estate agents”, “RICS” and “HBF” were able to predict movements in
housing prices. Applications of Google Trends are not restricted to advanced economies:
Yan Carrière-Swallow and Felipe Labbé (2010) of the Central Bank of Chile published
research on the relevance of internet data in emerging countries. Their paper nowcasts
Chilean automotive sales using keyword-based search data from Google Trends. Using internet
data to estimate economic activity rests on a strong underlying assumption of high internet
penetration. In recent times, the World Bank has identified Big Data as a potential source
of information for areas like early warning systems, nowcasting of macroeconomic aspects,
forecasting of weather patterns etc.
The usefulness of daily price data has been widely acknowledged for timely information on
price conditions at micro level. In his article titled “A Way, Day by Day, of Gauging
Prices”, Justin Lahart (2010) observed that daily price data contain information on the
development of price conditions in the retail market and thereby applauded the initiative
of collecting daily price data from online grocery markets. Cavallo and Rigobon (2011)
initiated the Billion Prices Project (BPP) at MIT, in which scraped daily price information
was collected for around 5 million commodities from 300 major online grocery markets across
70 countries. Though BPP started as an academic exercise, its objective was to use high
frequency price information across major commodities to develop a supplementary price
monitoring mechanism for policymakers. Michael Bordo, an economist at Rutgers University
who reviewed the methodology of BPP, described the initiative as a “brilliant way of
measuring the deep fundamentals of inflation”. Later, Cavallo (2012) argued that a price
index constructed from the daily data is likely to provide an alternative measure of
inflation to official data, particularly for countries like Argentina, where official
estimates have been subject to criticism and hyper-inflationary episodes have been observed
frequently in the recent past. On a similar footing, Varian (2010) constructed a price
index (also called the Google Price Index (GPI)) using Google price data and observed that
the constructed index (GPI) tracks the development of the pricing situation even better
than the official index of consumer prices.
While the benefits of using daily price quotes are acknowledged, one of the major
criticisms of this approach has been its lack of coverage in terms of spatial distribution
and products covered. This point holds particularly for emerging market economies (EME), as
the coverage of such online groceries has been found to be limited in these economies.
However, the recent sales growth of these online grocery companies points towards an
increasing shift in preference towards online ordering of food items, and such evolution in
consumer preferences calls for a fresh look at the feasibility of using daily price data in
EME.
III. Approach and data used
The paper uses daily price data from one of the major online groceries of India,
Bigbasket.com, where the daily prices of around 2200 commodities have been monitored over a
5-month horizon. The item-level daily price quotations have the following data structure
(Table 1).
Table 1: Data structure – daily price data
Field               Description
Item Code           Unique identifier of each commodity, maintained by the online grocery
Item Description    Description of the item
City                City name
Sale Price          Sale price quotation of the commodity
MRP                 Maximum retail price
The business operation of the aforesaid online grocery store has been found to be
restricted to only 5 major cities, namely Hyderabad, Bengaluru, Chennai, Mumbai and Pune.
The products offered through this portal fall mostly under the food and beverage categories
of the existing CPI item basket (base 2012=100). The approach adopted in this paper broadly
follows the steps used for constructing the official estimate of the consumer price index
by India’s official agency, the Central Statistical Office (CSO). The CSO has recently
started publishing the new CPI with base 2012=100. Apart from revamping the consumption
basket and the corresponding weights based on the 68th round of the household consumer
expenditure survey of NSSO, the recent base revision also adopted a new methodology for
computing the index: the new index value is computed as the geometric average of commodity
prices, following Jevons’s approach. The empirical analysis carried out in this paper
adopts Jevons’s approach to estimate the representative price across commodities and
subsequently rolls up the price across cities and commodity groups using a weighted
average. The approach is described step by step below.
1. Step 1: Data cleansing and transformation
The daily price data have been cleansed prior to processing. Items which do not have any
reported price have been excluded from the analysis. The sale price generally includes the
discount provided by the company for marketing purposes, and hence MRP has been used as the
representative price quotation across commodities. However, as information on MRP is not
available for Bengaluru, the sale price has been used as a proxy for MRP for Bengaluru
only. The entire exercise was also carried out excluding Bengaluru to check robustness.
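The cleansing rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' code: the record layout (keys `city`, `sale_price`, `mrp`) is a hypothetical rendering of Table 1, and dropping non-Bengaluru records that lack an MRP is an assumption, since the paper does not say how such cases were handled.

```python
from typing import Optional


def representative_price(record: dict) -> Optional[float]:
    """Pick the price quotation used for indexing, or None if the record is dropped.

    Keys 'city', 'sale_price' and 'mrp' are hypothetical names for the Table 1 fields.
    """
    mrp = record.get("mrp")
    if mrp is not None:
        return mrp
    # MRP is not available for Bengaluru, so the sale price is the proxy there only.
    if record.get("city") == "Bengaluru" and record.get("sale_price") is not None:
        return record["sale_price"]
    # Assumption: any other record without an MRP is treated as having no usable price.
    return None


def cleanse(records, exclude_bengaluru=False):
    """Drop records without a usable price; optionally drop Bengaluru (robustness run)."""
    cleaned = []
    for record in records:
        if exclude_bengaluru and record.get("city") == "Bengaluru":
            continue
        price = representative_price(record)
        if price is not None:
            cleaned.append({**record, "price": price})
    return cleaned
```

Calling `cleanse(records)` gives the main sample, and `cleanse(records, exclude_bengaluru=True)` gives the sample for the robustness check.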
2. Step 2: Mapping online grocery commodities to CPI item basket
A comparative study of the unique commodity lists across cities indicates that the maximum
number of commodities is offered in Bengaluru, whereas the minimum number of commodities is
offered in Pune (Chart 1).
Chart 1: Unique commodities offered in different cities
[Bar chart: number of unique commodities offered, by city (Bangalore, Hyderabad, Mumbai, Pune, Chennai); data labels 876, 702, 765, 525, 614]
Next, the unique list of commodities available from the daily price data has been mapped
against the CPI consumption basket in order to perform the aggregation according to the CPI
item basket. The mapping also helps to assess the extent of coverage of the item basket in
comparison with the CPI basket and thereby helps to explain any diverging momentum observed
in the estimated price index. Here, the CPI items that make up the broad item groups and
for which index data are available are mapped against the Bigbasket items for comparative
analysis. The mapping reveals that the majority of items covered by BigBasket fall under
the Food and Beverage item group of the CPI consumption basket. Also, the large proportion
of items mapped against commodities like ‘Other fresh Fruit’, ‘Other vegetables’, ‘Palak
and other leafy vegetables’ etc. points towards a lack of close mapping between these two
item baskets (Table 2).
Table 2: Summary of Mapping with CPI item basket
CPI Items                                     No. of items mapped
Palak and other leafy vegetables              260
other fresh fruits                            257
other vegetables                              254
gourd, pumpkin                                161
mango                                         138
beans, barbati                                94
apple                                         92
banana (no.)                                  72
onion                                         72
processed food                                70
brinjal                                       64
tomato                                        55
potato                                        51
flower (fresh): all purposes                  46
grapes                                        41
green chillies                                40
carrot                                        37
lemon (no.)                                   32
papaya                                        32
cabbage                                       29
orange, mausami (no.)                         27
ginger (gm)                                   26
fruit juice and shake (litre)                 24
garlic (gm)                                   24
lady's finger                                 24
pears/nashpati                                21
watermelon                                    21
coconut (no.)                                 20
dhania (gm)                                   17
peas                                          15
guava                                         14
others: birds, crab, oyster, tortoise, etc.   14
groundnut                                     11
cauliflower                                   9
leechi                                        9
berries                                       8
pan: leaf (no.)                               8
jackfruit                                     7
dates                                         6
parwal/patal, kundru                          6
raisin, kishmish, monacca, etc.               5
tamarind (gm)                                 5
turmeric (gm)                                 5
other nuts                                    4
black pepper (gm)                             2
sugar - other sources                         2
Total                                         2231
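To make the coverage concern concrete, the share of mapped items that fall into the three catch-all CPI items named in the text can be computed directly from the counts in Table 2. The short Python sketch below only repeats that arithmetic; the function name `catch_all_share` is introduced here for illustration.

```python
# Counts taken from Table 2 for the three catch-all CPI items named in the text.
catch_all_counts = {
    "Palak and other leafy vegetables": 260,
    "other fresh fruits": 257,
    "other vegetables": 254,
}
total_mapped = 2231  # 'Total' row of Table 2


def catch_all_share(counts: dict, total: int) -> float:
    """Percentage of mapped items that sit in the given CPI items."""
    return 100.0 * sum(counts.values()) / total


share = catch_all_share(catch_all_counts, total_mapped)
print(f"{share:.1f}% of mapped items fall under catch-all CPI items")
```

On these counts, 771 of the 2231 mapped items, or roughly a third, sit in catch-all categories.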
3. Step 3: Construction of Index
Since the benchmark for the analysis is the CPI item-level data, which are available at
monthly frequency, the daily prices have been converted to monthly frequency by taking a
simple average over days. The next step is to derive the retail price index at commodity
level. For that, the retail prices (MRP) of the granular commodities are aggregated using
the geometric average to derive the representative price relative, in line with the
practice followed by official agencies in India. The derived price relative represents the
average change in price of the item (belonging to the CPI item basket), and the use of the
geometric mean smooths price fluctuations across commodities within a CPI item to a certain
extent.
$$ CPI_{it}^{city} = \frac{P_{ijt}^{city}}{P_{ij0}^{city}} \times 100 $$
where $CPI_{it}^{city}$ is the price relative of the ith CPI commodity at time t for a
particular city, and $P_{ijt}$ and $P_{ij0}$ are the prices of the jth commodity falling
under the ith CPI item at time t and in the base period respectively, with the ratio
aggregated over the commodities j by geometric mean as described above. Here the price
relative is calculated with respect to April 2015, and hence the price relative for Apr-15
is 100. The price relative in subsequent months thus represents the relative change in
prices of the ith commodity of the CPI basket compared to its Apr-15 price.
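The two operations in this step, averaging daily prices to monthly frequency and forming a geometric-mean (Jevons) price relative against the base month, can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: the function names and input shapes are hypothetical, and restricting the geometric mean to commodities observed in both periods is an assumption the paper does not spell out.

```python
import math
from collections import defaultdict


def monthly_average(daily_prices):
    """Simple average over days: iterable of (month, price) -> {month: mean price}."""
    totals = defaultdict(lambda: [0.0, 0])
    for month, price in daily_prices:
        totals[month][0] += price
        totals[month][1] += 1
    return {month: total / count for month, (total, count) in totals.items()}


def jevons_relative(prices_t, prices_0):
    """Geometric mean of price ratios P_ijt / P_ij0 over commodities j, times 100.

    prices_t and prices_0 map a commodity identifier to its monthly price in the
    current and base period. Only commodities present in both periods are used
    (an assumption made for this sketch).
    """
    common = [j for j in prices_t if j in prices_0]
    if not common:
        raise ValueError("no commodity is priced in both periods")
    mean_log_ratio = sum(math.log(prices_t[j] / prices_0[j]) for j in common) / len(common)
    return 100.0 * math.exp(mean_log_ratio)
```

For example, with two commodities priced at 100 and 50 in the base month and at 121 and 50 in the current month, the ratios are 1.21 and 1.00, their geometric mean is 1.10, and the price relative is 110.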
The representative price of each CPI item is then aggregated across cities using the
relative weight of the urban consumption share of the respective states to which the cities
belong. The consumption share of each state was estimated by CSO at the time of preparing
the new CPI index. Using the consumption shares, the aggregate price relative of each CPI
commodity has been derived as follows:
$$ CPI_{it}^{aggr} = \frac{1}{\sum_{c=1}^{C} w_{ic}} \times \sum_{c=1}^{C} \left( w_{ic} \times CPI_{it}^{c} \right) $$
where $w_{ic}$ is the consumption share of the i-th commodity in city ‘c’, c = 1, ..., C.
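The cross-city aggregation is a normalised weighted average of the city-level price relatives, which the sketch below reproduces. The function name and the dictionary inputs are hypothetical, and skipping a city that has no price relative for the item (with the remaining weights renormalised by the denominator) is an assumption of this sketch.

```python
def aggregate_across_cities(city_relatives, weights):
    """Weighted average of city-level price relatives for one CPI item.

    city_relatives: {city: CPI_it^c}; weights: {city: w_ic}, the urban consumption
    share of the state to which the city belongs. Cities without a price relative
    are skipped, and the remaining weights are renormalised (sketch assumption).
    """
    cities = [c for c in city_relatives if c in weights]
    total_weight = sum(weights[c] for c in cities)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[c] * city_relatives[c] for c in cities) / total_weight
```

With purely illustrative weights of 0.5, 0.3 and 0.2 and city relatives of 110, 100 and 105, the aggregate is (55 + 30 + 21) / 1.0 = 106.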
4. Step 4: Comparative analysis and deep-dive
The index thus obtained provides information on the month-on-month variation in commodity
prices. Part of this variation may also be attributable to seasonal factors. With this
background, the CPI item-level indices are rebased to Apr-15=100 for comparative purposes.
Assuming that the seasonal pattern of price fluctuation is the same across the two series
of indices, the comparative analysis has been carried out by visual inspection. If the
derived series is found to track the official price data, the daily price data are likely
to provide insight into the official data at higher frequency. On the other hand, any
divergent momentum between the two sets of indices is analyzed further to identify the
reason for the divergence, which may be attributed to lack of mapping, coverage and price
inelasticity in the retail and online markets.
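The rebasing step can be sketched as below, together with one possible numeric complement to visual inspection: the share of month-on-month moves in which the derived and official series change in the same direction. The direction-agreement measure is not part of the paper's method and is offered only as an assumption-laden illustration; the function names and the input format (dictionaries keyed by period label) are hypothetical.

```python
def rebase(series, base_period):
    """Rescale an index so that its value in base_period equals 100."""
    base_value = series[base_period]
    return {period: 100.0 * value / base_value for period, value in series.items()}


def direction_agreement(series_a, series_b, periods):
    """Share of consecutive-period moves in which both series move the same way.

    periods is the ordered list of period labels. This is an illustrative
    supplement to visual inspection, not the comparison used in the paper.
    """
    if len(periods) < 2:
        raise ValueError("need at least two periods")

    def sign(x):
        return (x > 0) - (x < 0)

    moves = list(zip(periods, periods[1:]))
    same = sum(
        1
        for prev, cur in moves
        if sign(series_a[cur] - series_a[prev]) == sign(series_b[cur] - series_b[prev])
    )
    return same / len(moves)
```

An official item index of 120, 126 and 123 over Apr-15 to Jun-15 would be rebased to 100, 105 and 102.5 before being set against a derived series that already has Apr-15=100.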
IV. Empirical Findings
1. Comparative Analysis
The price indices derived at commodity level present a mixed picture. Within vegetables,
price momentum has been found to move in line with official data for items like ‘Potato’,
‘Onion’, ‘Tomato’ and ‘Palak’ (Chart 2-2(a)), whereas divergent momentum has been observed
in the case of ‘Cabbage’ and ‘Brinjal’ (Chart 2(b)).
Chart 2: Price momentum at item level (vegetables)
Chart 2(a): Price momentum at item level (vegetables)
Chart 2(b): Price momentum at item level (vegetables)
For fruits, on the other hand, the price momentum observed in the derived index is found to
differ from the official estimates for most items (Chart 3-3a).
Chart 3: Price momentum – Fruits
Chart 3a: Price momentum – Fruits
2. Deep-dive analysis
The divergence in price momentum has been analyzed further in terms of spatial disparity in
price conditions and mapping issues. For instance, the city-wise price momentum of ‘Apple’
indicates wide divergence among the 4 cities under consideration (Chart 4).
Chart 4: City-wise price divergence
[Chart panel: Pappaya; legend: CPI (Derived) Composite; horizontal axis: Apr-15 to Aug-15; two vertical axes]
Chart 4: City-wise price divergence (Contd.)
Such wide divergence in the price patterns of items across cities contributes to the
divergence of the derived index from official data, as the official agencies use state
level price quotations to develop the price index. Apart from spatial coverage, the mapping
of items also influences the divergence in momentum. The varieties of apple obtained from
the daily data are generally of premium quality (primarily imported), and their price
behaviour is therefore likely to diverge from the official data (Table 3).
Table 3: Mapping of commodities from daily price with CPI basket
CPI      BigBasket
Apple    Fresho Apple - Fuji
         Fresho Organically Grown - Apple
         Fresho Apple - Royal Gala
         Fresho Apple - Queen
         Fresho Apple - Washington
         Fresho Apple - Shimla
         Fresho Apple - Green
         Fresho Apple - Golden Delicious
         Fresho Apple - Kinnaur
         Fresho Apple - Granny Smith
         Fresho Apple Fuji Premium
         Fresho Apple - Indian
V. Concluding remarks
The rapid expansion of online grocery stores and the shift in consumer preferences in
recent times point towards using high frequency price data to develop comparable price
indices alongside official statistics. Cavallo and Rigobon (2011) started the initiative of
analysing the price momentum exhibited by a large set of commodities from major online
groceries across 70 countries. However, similar exercises are hardly found for emerging
market economies. This paper assesses the information content of daily price data relative
to the official estimates in the Indian context, using online data from one of the largest
online grocery stores in India. The paper maps the item level price quotations against the
CPI consumption basket and derives retail price indices at item level following the same
methodology as the official data. The empirical findings suggest that the daily price data
track price momentum in line with the official data for a number of commodities. However,
divergence in price momentum is also observed for certain commodities. Further detailed
analysis reveals that the divergence can be attributed to regional coverage and to the
items covered by these online services. Such divergence between online data and official
estimates was also observed by Cavallo (2012) when he compared the official price indices
of Argentina with daily online prices. As these online groceries operate only in metro
cities, the prices they quote are often representative of premium quality items. As the
price experience differs among the different regions of the country, the aggregate
behaviour of macro-level price momentum can often be misleading. The spatial dimension of
the daily price data also provides a detailed view of the price conditions prevailing in
different parts of the country and thereby enables policymakers to take appropriate policy
actions in real time.
The paper contributes to the literature on Big Data analytics by suggesting an alternative
data source which is available in a timely manner and provides valuable insights into
price directions. Such analysis, which is an extension of the Billion Prices Project (BPP),
is the first of its kind in the context of emerging market economies and is expected to
provide important insight into using alternative sources of information for deriving price
indices. The paper uses 5 months of daily price data from one of the major online groceries
and compares the price momentum with official data. Further scope includes extending the
timeline of the analysis and increasing the coverage of the item basket by incorporating a
larger information set from other online groceries.
VI. References
[1] Brian D. Humphrey, "Forecasting Existing Home Sales using Google Search Engine
Queries", Ph.D. Thesis, Duke University, 2010
[2] Cavallo, A. and Roberto Rigobon, "The Distribution of the Size of Price Changes", MIT
Press, 2011
[3] Cavallo, A., "Scraped Data and Sticky Prices", MIT and NBER, 2012
[4] Cavallo, A., "Online and official price indexes: Measuring Argentina’s inflation",
Journal of Monetary Economics (2012), http://dx.doi.org/10.1016/j.jmoneco.2012.10.002
[5] Cavallo, A., Guillermo Cruces and Ricardo Perez-Truglia, "Inflation Expectations,
Learning and Supermarket Prices", MIT Press, May 2015
[6] Community Whitepaper, "Challenges and Opportunities with Big Data", a community
white paper developed by leading researchers across the United States, 2011
[7] Choi, Hyunyoung and Hal Varian, "Predicting the present with Google trends" (December
2011)
[8] Hal R. Varian, "Big data: new tricks for econometrics", 2011
[9] Y. Fondeur and F. Karame, "Can Google data help predict French youth unemployment?"
(2013)
[10] Nyman et al., "Big data and economic forecasting: a top-down approach using directed
algorithmic text analysis", Centre for the Study of Decision-Making Uncertainty, Faculty of
Brain Sciences, University College, London
[11] Nick McLaren (of the Bank’s Conjunctural Assessment and Projections Division) and
Rachana Shanbhogue (of the Bank’s Structural Economic Analysis Division), "Using internet
data as economic indicators"
[12] Nikolaos Askitas and Klaus F. Zimmermann, "Google econometrics and unemployment
forecasting", 2013
[13] Yan Carriere-Swallow and Felipe Labbe, "Nowcasting With Google Trends In An Emerging
Market", 2012