a demand estimation procedure for retail assortment ... · the assortment planning process varies...

A Demand Estimation Procedure for Retail AssortmentOptimization with Results from Implementations

Marshall FisherOPIM Department, The Wharton School, 3730 Walnut Street, Philadelphia, PA 19104, USA

[email protected]

Ramnath VaidyanathanDesautels Faculty of Management, 1001 Sherbrooke Street West, Montreal, QC H3A 1G5, Canada

[email protected]

June 2009, Revised September 2012, May 2013

AbstractWe consider the problem of choosing, from a set of N potential SKUs in a retail category, K SKUs tobe carried at each store so as to maximize revenue or profit. Assortments can vary by store, subject toa maximum number of different assortments. We introduce a model of substitution behavior, in case acustomer’s first choice is unavailable and consider the impact of substitution in choosing assortmentsfor the retail chain. We view a SKU as a set of attribute levels, apply maximum likelihood estimationto sales history of the SKUs currently carried by the retailer to estimate the demand for attributelevels and substitution probabilities, and from this, the demand for any potential SKU, including thosenot currently carried by the retailer. We specify several alternative heuristics for choosing SKUs tobe carried in an assortment. We apply this approach to optimize assortments for three real examples:snack cakes, tires and automotive appearance chemicals. A portion of our recommendations fortires and appearance chemicals were implemented and produced sales increases of 5.8% and 3.6%respectively, which are significant improvements relative to typical retailer annual comparable storerevenue increases. We also forecast sales shares of 1, 11 and 25 new SKUs, for the snack cakes, tiresand automotive appearance chemicals applications, respectively, with MAPEs of 16.2%, 19.1% and28.7%, which compares favorably to the 30.7% MAPE for chain sales of two new SKUs reported byFader and Hardie (1996).

1 Introduction

A retailer’s assortment is the set of products they carry in each store at each point in time. Retailersperiodically revise the assortment for each category, deleting some products and adding others to takeaccount of changes in customer demand over time as well as new products introduced by suppliers. Thisperiodic assortment reset seeks to choose a set of SKUs to carry in the new assortment to maximizerevenue or profit over a future planning horizon, subject to a shelf space constraint, which can often beexpressed as an upper bound on the number of SKUs carried.

1

If a customer’s most preferred product is not in a retailer’s assortment, they may elect to buy nothing orto purchase another product sufficiently similar to their most preferred product that they are willing tobuy it. This possibility of substitution must be taken into account in both assortment optimization andin estimation. In estimation, substitution probabilities need to be estimated and the sales of a SKU tocustomers who most preferred that SKU must be distinguished from sales to customers who preferred adifferent SKU but substituted when they did not find their preferred SKU in the assortment.Most retailers use the same assortment in all stores, except for eliminating some SKUs in smaller stores.Recently however, localizing assortments by store or store cluster has become a high priority for manyretailers. For example, Zimmerman (September 7, 2006), O’Connell (April 21, 2008), McGregor (May15, 2008) and Zimmerman (October 7, 2008) describe recent efforts by Wal-Mart, Macy’s, Best Buy andHome Depot to vary the assortment they carry at each store to account for local tastes. Despite theflurry of interest in localization reported in the business press and which we have encountered in ourinteraction with retailers, there have been no studies to document the level of benefits from localizationor to provide tools to help a retailer determine the right degree of localization.The assortment planning process varies across retail products, which are commonly segmented intoapparel, grocery, and everything else, usually called hard goods.1 An analytic approach to apparelassortments is challenging because rapidly changing tastes make sales history of limited value. Groceryassortment planning (called category management) is the most analytic, due in part to companies likeNielsen and IRI who enlist households to record over time their grocery purchases in all stores. Muchacademic research on grocery consumer behavior has relied heavily on this data. Among other things,the data allows one to model substitution behavior by observing what a customer buys, if anything,when a product they purchase every week is unavailable due to a stockout. Many hard goods retailersconduct an annual review of their various categories, reallocating store space by category and decidingwhich SKUs to delete and add to each assortment.SKU deletion decisions are relatively easy, since sales data is available to indicate the popularity ofexisting SKUs, but retailers lack (with the partial exception of grocery), analytic tools to forecast thesales of SKUs that might be added to the assortment, to estimate the rate of customer substitution, andto intelligently localize assortments by store. The goal of this paper is to provide these tools.Our approach follows the marketing literature in viewing a SKU as defined by a set of attribute levelsand assuming that a given customer has a preferred set of attribute levels. We use prior sales to estimatethe market share in each store of each attribute level and forecast the demand share for a SKU as theproduct of the demand shares for the attribute levels of that SKU. We assume that if a customer does notfind their ideal product in the assortment, they buy the product in the assortment closest to their idealwith some probability and we also estimate these substitution probabilities. We apply various heuristicsto these demand and substitution estimates to determine optimized assortments.In the extreme, a retailer might carry a unique assortment in each store, but most retailers claim thatthis is administratively too complicated. In particular, retailers develop a diagram called a planogramshowing how all products should be displayed in a store, a process that is labor intensive. A planogramwould need to be developed for each assortment, which means the administrative cost of each assortment

1This discussion is based on Fisher and Raman (2010) and conversations with several retail executives including PaulBeswick, Partner and Head, Oliver Wyman North American Retail Practice, Robert DiRomualdo, former CEO, BordersGroup, Kevin Freeland, COO, Advance Auto, Matthew Hamory, Principal, Oliver Wyman North American Retail Practice,Herbert Kleinberger, Principal, ARC Consulting, Chris Morrison, Senior VP of Sales, Americas, Tradestone, Robert Price,Chief Marketing Officer, CVS, and Cheryl Sullivan, Vice President of Product Management, Revionics, Inc.

2

is high. Our process can control the degree of localization by limiting the number of different assortmentsto be any level between a single assortment for the chain to a unique assortment for each store.

We applied this approach to the snack cakes category at a regional convenience store chain, the tireassortment at a national tire retailer and the appearance chemicals category of a major auto aftermarketparts retailer. The tire and auto parts retailers implemented portions of our recommended assortmentchanges and obtained revenue increases of 5.8% and 3.6% respectively, significant improvements giventraditional comparable store annual increases in these segments.

Our approach best fits situations where demand is changing over time, but gradually enough that historyis useful, where the individual consumer data common in grocery is not available, and where someproducts are better substitutes for a given product than others. Also, we do not consider inventorydecisions, so our approach fits when inventory decisions are unrelated to assortment decisions, becausethe retailer carries a fixed, often small amount of inventory for each SKU. This was the case for all ofour applications, as well as other categories like jewelry, books and CDs.

Despite the enormous economic importance of assortment planning, we are aware of only two papers,Chong et al. (2001) and Kök and Fisher (2007), that formulate a decision support model for assortmentplanning, describe a methodology for estimating parameters and optimizing assortments and test theprocess on real data. While these papers are an excellent start, there is obviously a need for muchmore research on the vast topic of assortment planning. This paper extends these papers, and hence theexisiting literature, in four ways.

1. We provide a thorough treatment of the important emerging topic of assortment localization. Weallow a constraint on the number of different assortments so as to bound the administrative costof localization and measure how the amount of localization impacts revenue. We compare andanalyze the causes of differing levels of localization benefits across our three applications. Chonget al. (2001) do not deal with localization. Kök and Fisher (2007) allow a unique assortment foreach store but do not provide a constraint on the number of assortments or assess how localizationimpacts revenue.

2. We forecast the demand for new SKUs that have not been carried before in any store, basedon past sales of products currently carried. Chong et al. (2001) do not explicitly forecast newSKUs, although it would appear from their process that they could derive forecasts for new SKUs.However, they need consumer-level transaction data over multiple shopping trips and this detaileddata is not available in most non-grocery applications. Kök and Fisher (2007) forecast how a SKUcarried in some stores will sell in other stores in which it is not carried, but do not provide a wayto forecast demand for a completely new SKU carried in no stores.

3. We introduce a new demand model not previously considered in assortment decision support re-search. Both Chong et al. (2001) and Kök and Fisher (2007) use the multinomial logit demandmodel, which implicitly assumes that substitution demand is divided over available products inproportion to those products’ market share. This assumption fits a situation in which products aresimilar to each other, but vary in a taste parameter, such as different flavors of yogurt or colorsof apparel. By contrast, our demand model fits a situation in which some products are bettersubstitutes for a given product than others. This was a very real feature of our applications; forexample, the natural substitutes for a 14 inch tire are other 14 inch tires, not 15 inch tires. As an-

3

other example, in our snack cakes application we found the probability of substituting from Brand1 to Brand 2 was 89%, but only 22% of substituting from Brand 2 to Brand 1. Hence Brand 2customers were much more loyal. The multinomial logit approach better fits some real assortmentproblems and our approach fits others better, so both are needed.

4. Two of the retailers in our study implemented a portion of our recommended assortment changesand we estimate that the revenue increase in revenue from these changes is larger than the an-nual same store revenue increases typically seen in these industries. We believe this is the firstimplementation validation of an analytic approach to assortment planning.

2 Literature Review

We review here the prior research on consumer choice models, demand estimation and assortment opti-mization that is most related to this paper. See Kök et al. (2008) for a more extensive review.

Consumer choice models constitute the fundamental platform for assortment planning, and may beclassified as (1) utility based models, and (2) exogenous demand models. Utility based models assumethat every customer associates a utility Ui with each product i ∈ N . In addition, there is a no-purchaseoption denoted i = 0, with associated utility U0. When offered an assortment A, every customer choosesthe option giving him the highest utility in A∪ {0}. The market share for each SKU i ∈ A can then beevaluated once we know the distribution of utilities across the consumer population.

The Multinomial Logit (MNL) model (Guadagni and Little 1983) assumes that the utilities Ui canbe decomposed into a deterministic component ui that represents the average utility derived by thepopulation of customers, and a random component ξi that represents idiosyncrasies across customers.The ξi are assumed to be identical and independent Gumbel random variables with mean zero andscale parameter µ. Under these assumptions, the market share for each SKU i ∈ A can be written as

exp(uiµ

)∑i∈A exp

(uiµ

) .

In the Locational Choice model (Hotelling 1929, Lancaster 1966, 1975), every SKU i ∈ N is representedas a bundles of attribute levels zi = (z1

i , z2i , ..., zTi ) ∈ RT . Each consumer has an ideal point y ∈

RT that defines his most preferred attribute levels. The utility this consumer associates to SKU i isUyi = c − τ ‖y− zi‖, where c is the utility derived from his most preferred product y, and τ is thedisutility associated with each unit of deviation from y. A consumer not finding his ideal product y inthe assortment A substitutes the variant j ∈ A that is located closest to his ideal point in the attributespace if Uyj > 0, or declines to purchase if Uyj ≤ 0 .

In the exogenous demand model, every consumer is assumed to have a favorite product i, and fi is theshare of consumers whose favorite product is i. A consumer whose favorite product is i buys it if i ∈ A;if i /∈ A they substitute to SKU j ∈ A with probability αij . Under these assumptions, the market shareof each SKU i ∈ A is given by qi(A) = fi +

∑j/∈A fjαji.

All three demand models assume that customers have a favorite product and they buy that productif it is in the assortment. They also all assume that if a customer’s favorite product is not in theassortment, they may substitute a different product. Where the models differ is in their assumptionsabout substitution behavior. The exogenous demand model is the most flexible model, allowing for anysubstitution structure, but it has many parameters and is hence difficult to estimate in practice. The

4

MNL model assumes that the demand for a missing product which is not lost transfers to other productsin the assortment in proportion to their popularity. By contrast, the Locational Choice model assumesthat a given product may be more like some products than others and that substitution demand transfersto the product in the assortment that is most similar to a customer’s preferred product.Demand estimation has been studied by Fader and Hardie (1996), who use maximum likelihood toestimate the parameters of an MNL model from sales transaction data under the assumption that productutility is the sum of utilities of the product attributes. Anupindi et al. (1998), Chong et al. (2001) Bellet al. (2005), Kök and Fisher (2007) and Vulcano et al. (2009) estimate demand as part of an assortmentoptimization process, and are discussed below.Assortment optimization research has been based on both stylized models intended to provide in-sight into structural properties of optimal assortments and decision support models intended to guide amanager planning retail assortments.Stylized model research began with van Ryzin and Mahajan (1999), who show that an optimal assort-ment under a MNL consumer choice model consists of a certain number of the highest utility products.Mahajan and van Ryzin (2001), Cachon et al. (2005), Caro and Gallien (2007) and Maddah and Bish(2007) extend van Ryzin and Mahajan (1999) in various ways.Gaur and Honhon (2006) show that for a locational choice model, the products in the optimal assortmentare located far from each other in the attribute space, indicating maximum differentiation, with nosubstitution between products in the assortment. This implies that the most popular product may notbe carried in the optimal assortment, contrasting the results of van Ryzin and Mahajan (1999).Decision support research began with Green and Krieger (1985), who formulate the problem of whichout of a set of potential products, a firm should select so as to maximize (1) consumer welfare or (2)firm profits, and propose solution heuristics.Green and Krieger (1987a,b, 1992), McBride and Zufryden(1988), Dobson and Kalish (1988) and Kohli and Sukumar (1990) extend this line of research. Belloniet al. (2008) compare the performance of different heuristics for product line design and find that thegreedy and the greedy-interchange heuristics perform extremely well.Smith and Agrawal (2000) use an exogenous demand model and an integer programming formulation ofassortment planning. They solve a number of small problems by complete enumeration to demonstratehow assortment and stocking decisions depend on the nature of assumed substitution behavior, and alsopropose a heuristic to solve larger problems.Chong et al. (2001) develop an assortment modeling framework based on the Guadagni and Little (1983)brand-share model. They use consumer-level transaction data over multiple grocery shopping trips toestimate the parameters of the model and use a local improvement heuristic to suggest an alternativeassortment with higher revenue. Although their model can implicitly predict SKU level demand, theirexplicit focus is on brand-shares, and hence they do not create or measure the accuracy of SKU leveldemand forecasts. They report an average 50% mean squared error of predicting brand choice at thecustomer level across different product categories.Kök and Fisher (2007) use an exogenous demand model to study a joint assortment selection and in-ventory planning problem in the presence of shelf-space constraints. They note that the constrainedshelf space implies that average inventory per SKU is inversely related to the breadth of the assortmentand assess the tradeoff between assortment breadth vs in-stock level for the SKUs carried. They con-sider substitution and assume that substitution demand accrues to available products in proportion to

5

their original market share, as in the MNL model. They provide a process for estimating demand andsubstitution rates and apply their method to data from a large Dutch grocery retailer.

3 Problem Formulation and Demand Model

We seek optimal assortments for a retail category over a specified future planning horizon. We assumethat the goal is to maximize revenue, since this was the concern in our three applications. It is straight-forward to adapt our approach to maximizing other functions, such as unit sales or gross profit. Wedefine the following parameters.N = {1, 2, ...,n} Index set of all possible SKUs a retailer could carry in this categoryM = {1, 2, ...,m} Index set of all storesK Maximum number of SKUs per assortmentL Maximum number of different assortmentsDsi Number of customers at store s for whom SKU i is their most preferred product

pi Price of SKU i ∈ NThe parameters K and L would be specified by the retailer. For expositional simplicity, K does not varyby store, but it would be easy to modify our process to enable K varying by store, and in fact we do thisin our computational work. L would be chosen to lie between 1 and m to tradeoff the greater revenuethat comes with larger L against the administrative simplicity that comes with smaller L. As will beseen, our solution approach makes it easy to solve this problem for all possible values of K and L, thusproviding the retailer with rich sensitivity analysis to guide their choice of these parameters.

We define an assortment to be a set S ⊆ N with |S| ≤ K and let ls denote the index of the assortmentassigned to store s. A solution to the assortment optimization problem is defined by a portfolio ofassortments Sl, l = 1, 2, ...,L and ls ∈ {1, 2, . . . ,L}, for all s ∈M .

Our demand model assumes that a consumer shopping in this category in store s has a most preferredSKU i ∈ N , but might be willing to substitute to other SKUs if i /∈ Sls . We view a SKU as a collectionof attribute levels, use historical sales data to estimate the demand share of each attribute level, andfinally estimate the demand share of any SKU as the product of the demand shares of its attribute levels.

We introduce some notation to formalize this approach.A Number of attributesa Attribute type indexNa Number of levels of attribute a,a = 1, 2, . . . ,Afsau Fraction of customers at store s who prefer level u of the

attribute a, u = 1, 2, . . . ,Na, a = 1, 2, . . . ,Aπsauv Probability that a customer at store s whose first choice on

attribute a is u is willing to substitute to v, defined for all a,u,and v.

By definition, πsavv = 1. Moreover, πsauv can be 0 if attribute level v is not a feasible substitute for u.For example, size is an attribute of a tire and a 14 inch tire is not a feasible substitute for a customerwith a 15 inch wheel.

The fraction of customers who most prefer SKU i with attribute levels i1, i2, . . . , iA is defined to befsi = Πa=A

a=1 fsaia

. If a customer’s most preferred SKU i with attributes i1, i2, . . . iA is not in the assortment,

6

they are willing to substitute to SKU j with attributes j1, j2, . . . jA with probability πsij = Πa=Aa=1 π

saiaja

.If a customer with most preferred SKU i finds i ∈ Sa(s) when they shop the store, then we assume theybuy it. Otherwise, they buy the best substitute for i in S, defined to be j(i,S) = arg maxj∈S ΠA

a=1πsaiaja

with probability πsij(i,S).

In using store sales data to estimate the parameters of our model, we will estimate Ds, the total categoryunit demand in store s, if all possible SKUs were offered, and then set Ds

i = fsi Ds.

The revenue earned by store s using assortment Sls can then be written as

Rs(Sls) =

∑i∈Sls

piDsi +

∑i/∈Sls

Dsi π

sij(i,Sls )

pj(i,Sls )

The first term in this expression is the revenue from customers whose most preferred SKU is containedin the assortment and the second term is the expected substitution revenue from customers whose mostpreferred SKU was not in the assortment.

The assortment optimization problem is to choose Sl, | Sl |≤ K, l = 1, 2, . . . ,L and ls for all s ∈ M tomaximize

∑s∈M Rs(Sls).

We allow substitution probabilities to vary by store because we found in our applications that they didin fact vary by store. For example, in the tire application, the willingness of a consumer to substitute toa higher priced product varies by store and is correlated with median income in the zip code in whichthe store is located.

Our demand model implies that a consumer’s preferences for the various attributes are independent,which may not be true. For example a college student shopping for twin size bed-sheets might have adifferent color preference than a suburban homemaker shopping for queen size sheets, so the color andsize attributes for sheets would interact.

Our defense of this assumption is three-fold: (1) all prior publications we are aware of that use attributesin demand estimation make a similar assumption. Fader and Hardie (1996) is typical of the approachfollowed in the literature. They assume that the utility for a SKU is a linear function of its attributes andthen use this utility in a Multinomial Logit model to determine SKU demand shares. They state that “Inboth the marketing and economics literature, it is common to assume an additive utility function” andnote that this implies no interaction between attributes. (2) in our applications, we check the accuracyof this approximation by comparing demand estimates with actual sales for the SKUs currently carriedand find that forecasts based on this model are accurate compared to previously published research. (3)if there is significant interaction between attributes, we demonstrate ways to modify our demand modelto take this into account. In the snack cakes application, the attributes are flavor, package size (singleserve or family size) and brand. Package size and brand interact since one brand is stronger in singleserve and another in family size. We deal with this by combining brand and size into a new attributebrand-size. In the tire application, the attributes are size, brand (4 brands) and mileage warranty (low,medium, high). Brand and mileage warranty interact because a given brand does not offer all warrantylevels, and so we combine brand and warranty level to create the attribute brand-warranty.

In the tire application, there is also an interaction between size and brand-warranty. A tire with a givensize attribute level fits a defined set of car models of a certain age and value. The six brand-warrantylevels, corresponding to different price points and quality, might interact with the age and value of the

7

cars a size tire fits. We show how to deal with this by partitioning the sizes into a finite number ofhomogenous segments (which are latent) and allowing the brand-warranty shares to be conditional onsegment membership (Fader and Hardie 1996, Kamakura and Russell 1989)

4 Analysis

We describe our methods for estimating model parameters and choosing assortments.

4.1 Estimating demand and substitution probabilities

We use Maximum Likelihood Estimation to estimate demand and substitution probabilities. Our primaryinput for estimation is store-SKU sales of products currently carried by the retailer during a prior historyperiod. Parameters are estimated at the store level, but for expositional simplicity we will drop the storesuperscript in the discussion that follows. We describe here a generic, broadly applicable approach; inSection 5, we will exploit special structure of the applications to refine this approach. Let S denote theassortment carried in a particular store and xi the sales of SKU i ∈ S, during a history period.

We can write the probability Fj(S) that a customer purchases j ∈ S, as Fj(S) = fj +∑i/∈S,j=j(i,S) fiπij .

Let F (S) =∑j∈S Fj(S) denote the probability that a customer shopping in this category makes any

purchase from assortment S. Then, assuming that each consumer purchase is an independent random

draw, the likelihood of observing sales data x = {xi}i∈S is given by LH(f ,π) = C∏j∈S

[Fj(S)

F (S)

]xjwhere

the proportionality constant is C =(∑xj∈S)!∏j∈S xj !

. The maximum likelihood estimates (MLE) for the

parameters (f ,π) can be obtained by maximizing the log-likelihood function

LLH(f ,π) =∑j∈S

xj logFj(S)−

∑j∈S

xj

logF (S) (1)

subject to the constraints

Na∑u=1

fau = 1, a = 1, 2, . . . ,A (2)

fi = ΠAa=1faia i = 1, 2, . . . ,n (3)

πij = ΠAa=1πaiaja i = 1, 2, . . . ,n and j = 1, 2, . . . ,n (4)

fau,πauv ∈ [0, 1] ∀a,u, and v (5)

Given the complex nature of the log-likelihood function, it is not possible to derive analytical results.Hence we resort to numerical optimization methods based on gradients after transforming the probleminto an unconstrained optimization problem by reparametrizing fau and πauv in terms of f̂au and π̂auv,

8

where

fau =exp(f̂au)∑Nau=1 exp(f̂au)

, u = 1, 2, . . . ,Na − 1 and f̂aNa = 1 (6)

πauv =exp(π̂auv)

1 + exp(π̂auv), ∀a,u, and v (7)

We found examples showing that the log-likelihood function may not be concave, which implies thatnumerical optimization methods may not converge to a global maximum. We handle this issue byrunning the optimization algorithm from several randomly generated starting points. This does notguarantee global convergence, but lowers the chances of the algorithm getting stuck at a local maximum.Mahajan and van Ryzin (2001) use a similar approach to compute the optimal inventory levels using thesample path gradient algorithm.

Once we obtain MLE estimates for demand shares and substitution probabilities, as noted in the previoussection, we estimate total demand for the product category as D =

∑i∈S xiF (S)

, and Di as fiD.

4.2 Estimating Prices for New SKUs

We endeavored to set prices on SKUs not currently carried by the retailer in a way that would be consis-tent with their current pricing policy. We assume that prices on existing SKUs were set in relationshipto the value of the SKU to a consumer and that consumer value is related to attribute levels. Hence, weregressed the log of price on attribute levels to obtain the pricing equation:

log(pi) = α0 +A∑a=1

Na−1∑u=1

βauziau, i = 1, 2, . . . ,n (8)

where ziau is a dummy variable taking the value one if SKU i has level u of attribute a, and zerootherwise.2

This is a hedonic pricing equation, and has been extensively used in economics (Rosen 1974, Goodman1998, Pakes 2003).

4.3 Heuristics for Choosing Assortments

Absent substitution, the assortment problem could be optimally solved by a greedy algorithm thatchooses SKU’s in decreasing order of their revenue contribution. But substitution makes the objectivefunction nonlinear, because the contribution of a SKU depends in part on its substitution demand, whichdepends on which other SKUs are in the assortment. As a result, the assortment problem is complexto solve optimally. Hence, we define greedy and interchange heuristics for choosing the assortmentsSl, l = 1, 2, . . . ,L and the specification ls, s ∈M of the assortment assigned to store s.

For assortment planning, Kök and Fisher (2007) use a greedy heuristic and Chong et al. (2001) aninterchange heuristic. For the product line design problem, Green and Krieger (1985) use greedy andinterchange. Belloni et al. (2008) find that for the product line design problem, greedy and interchange

2The summation is only over Na − 1 levels of attribute a, since the reference levels for all attributes are accounted forby the intercept term α0.

9

together find 98.5% of optimal profits on average for randomly generated problems, and 99.9% for realproblems.

The greedy heuristic for finding a single assortment for a store or set of stores adds SKUs in decreasingorder of revenue contribution until K SKUs have been added. The interchange heuristic starts with agiven assortment and tests whether interchanging a SKU which is not in the assortment with a SKUin the assortment would increase revenue. Any revenue increasing interchanges are made as they arediscovered. The process continues until a full pass over all possible interchanges discovers no revenueincreasing interchanges. We apply the interchange heuristic both starting with the greedy assortmentand starting with random assortments.

To find a portfolio of L assortments and assignments of stores to assortments, we have two alternativeheuristics, a forward and reverse greedy. In the forward greedy heuristic, we first apply the greedyand interchange heuristics to generate a single assortment for the chain. If L > 1, we apply greedyand interchange to generate an assortment for each store and add as a second assortment, the storeassortment that maximizes the revenue increase with stores optimally assigned to each assortment. Thisprocess continues until L assortments have been chosen.

In the reverse greedy heuristic, we first apply greedy and interchange to generate an assortment for eachstore. If L < m, we identify the single store assortment to delete, that would minimize the revenueloss. We calculate the revenue loss from deleting an assortment by reassigning stores to their revenuemaximizing assortment in the reduced set of assortments and calculating the loss in revenue due to thereassignment. At any point in the algorithm, we have a portfolio of l assortments and l store clustersdefined by the assignment of stores to assortments. As long as l > L, we delete one assortment from thisportfolio that leads to the least loss in revenue and reassign stores to the reduced set of assortments.

To evaluate the performance of the heuristics, we proved that if we use estimated, rather than actualprices for existing SKUs, the optimal solution in our applications has a special structure that allows usto find optimal assortments. We used this result to create optimal assortments for the 140 stores inthe snack cake application and the 574 stores in the tires application, then applied greedy and foundit generated solutions that were 97.2% and 98.5% of the optimal, respectively. We also tested theinterchange heuristic, starting both with the greedy assortment and random assortments, but in no casefound solutions that improved on the greedy assortment.

5 Applications

We worked with three retailers to apply our methodology, analyzing one product category for each ofthem: snack cakes for a regional convenience chain, tires for a national tire retailer and appearancechemicals for a major auto aftermarket parts retailer. Results for the three applications are summarizedin Table 1. The entries in this table are described below.

5.1 Description of Applications and SKU Attributes

The regional convenience chain offered snack cakes in 60 flavors, two brands {B1,B2}, and severaldifferent package sizes in 140 stores. We restricted our analysis to the top 23 flavors that accounted for95% of revenue. Although there were several different package sizes, what mattered from a consumer’s

10

perspective was whether the size was single-serve or family size, and hence we grouped sizes into SingleServe (S) and Family Size (F ). Further, because the retailer advised us that brand shares and willingnessto substitute varied by size, we combined brand and size to obtain a single attribute called Brand-Size,indexed 1 to 4 for SB1,SB2,FB1 and FB2 in order.

Thus there were 23 Flavor attribute levels, 4 Brand-Size attribute levels and 92 possible SKUs, of which52 were being offered by the retailer in at least one store. The number of SKUs offered across stores variedbetween 24 and 52, and averaged 40.3. An internal market research study on the industry commissionedby the retailer showed that Flavor was the most important attribute for a consumer purchasing fromthis category. Hence, we assumed that the probability of substituting across flavors was negligible andcould be set to 0. The retailer also believed that there is negligible substitution between sizes S and F ,so this substitution was assumed to be 0. The substitute for a particular brand-size is the other brandin the same size. We define j(i) to be the brand-size that a customer might substitute to if i is not inthe assortment and set j(1) = 2, j(2) = 1, j(3) = 4 and j(4) = 3. We need to estimate the 23 flavorshares f1v, 4 brand-size shares f2b, and 4 substitution probability parameters π12,π21,π34 and π43.

Attributes for the tire study were brand, mileage warranty, and size. The retailer offered several nationallyadvertised brands that they believed were equivalent to the consumer, which we denote National (N)and treat as one brand. They also offered three house brands of decreasing quality, which we denote asHouse 1 (H1), House 2 (H2) and House 3 (H3), where H1 is the highest quality and most expensive housebrand. There were a large number of distinct mileage warranties offered, but some of these varied onlyslightly and hence were believed by the retailer to be equivalent to consumers. Therefore, we aggregatedthe mileage warranties into three levels of Low (15,000−40,000 miles), Medium (40,001−60,000 miles)and High (>60,000 miles), denoted L, M and H, respectively. We combined brand and warranty into asingle attribute to account for interaction between these attributes (for example, national brands werealways offered only with high or medium warranty, while H3 tires were offered only with low warranty)to identify the following six brand-warranty combinations: NH, NM, H1H, H2H, H2M, H3L. Sixty fourdistinct tire sizes were offered, resulting in 64× 6 = 384 distinct possible tire SKUs that could be offered.This retailer carried 122 of these 384 possible SKUs in at least one of their stores. The number of SKUsoffered across stores varied between 93 and 117, and averaged 105.2. The assortment offered also variedslightly across the stores, indicating some localization.

We were advised by the retailer that customers do not substitute across sizes. Table 2 is based onestimates provided by the Vice President of the tire category for the retailer and depicts the qualitativelikelihood of substitution across brand-warranty levels. We let {πS ,πL,πM} denote the substitutionprobabilities somewhat likely, likely and most likely.

The auto aftermarket parts retailer examines performance of each of their product categories once ayear on a rotating schedule and considers changes in the assortment. We were asked in early May,2009 to apply our methodology for the annual assortment reset for the appearance chemicals category, acategory comprised of liquids and pastes for washing, waxing, polishing, protecting, etc. all surfaces ofan auto, including the body, tires, wheels, windshield and other glass, and various interior surfaces. Theretailer used a market research firm, NPD, that had assigned to each SKU in the appearance categorythe attributes (1) segment (defined by the surface of the car treated and what is done to that surface),(2) 9 brands and (3) 3 quality levels, denoted good, better or best, where ’good’ is the lowest quality and’best’ is the highest. We appended package size, denoted small (S) or large (L), to the segment attribute

11

to create 45 segment/size attribute levels. We combined brand and quality to create a second attribute,brand/quality. Because some brands did not offer all quality levels, there were 17, not 27 brand/qualitycombinations. In some cases there were two package sizes that differed slightly and were classified asS or L, so that two SKUs occupied the same cell of the attribute matrix. Consequently, the 160 SKUscurrently offered corresponded to 130 cells of the 45 x 17 matrix of possible attribute levels. Of the 45 x17 – 130 = 635 attribute combinations not carried by the retailer, only 24 were available in the market.No substitution parameters were used in the model for the following reason. If a brand-quality level werenot offered for a product, it seemed likely that some of the demand for that brand-quality would transferto several other brands. To estimate this effect we would have needed instances of different stores withvarying numbers of brand-quality levels offered for the same product, and this data was not available.

For each of the applications, assortments varied somewhat across stores, although the retailers told usthis was not the result of a systematic effort to localize assortments to store demand patterns, but ratherdue to varying store sizes. Typically, a small store would carry a subset of a large store assortment.

Given this, we developed a measure of assortment commonality that would indicate the extent of local-ization. For any pair of stores A and B, the overlap in assortment was computed as the ratio of thenumber of common SKUs to the minimum number of SKUs across the two stores. Note that this metricequals 1 if the assortment of one store is identical to or a subset of the other, indicating no localization.The average for all store pairs was 97.1% for Cakes, 94% for Tires and 96.4% for Appearance Chemicals,indicating that the extent of localization was minimal.

5.2 Estimation

Sales data for up to three time periods were used in the applications: fit, validation and implementationsamples. For snack cakes, the fit sample was store-SKU sales from July - Dec 2005; the validation periodwas July - Dec 2007. For tires the fit and validation periods were July – Dec 2004 and Jan – June 2005respectively. Results were implemented for tires and the impact on sales tracked for July – Dec 2005,the implementation sample. For appearance chemicals the fit period was May 2008 – April 2009. Theresults obtained in July 2009 were so compelling that the retailer decided to implement them in earlyJanuary 2010 without validation. To evaluate the impact of implementation, we used store-SKU salesfor the 27 week period 23 January – 8 July 2010.

We applied the estimation procedure described in 4.1 to Store-SKU sales data for the fit sample toestimate model parameters at each store for the three applications. This process worked except for 255stores of the tire retailer where there was insufficient data to determine all 6 brand-warranty shares be-cause there was no size in which brand-warranties H2M and H3L were both offered, making it impossibleto identify the split of demand between H2M and H3L. Of these 255 stores, there were 52 stores wherewe could also not identify the share of H2M.We therefore regressed the H3L and H2M shares againstmedian income for the 319 stores at which we could estimate all the parameters and used the regressionestimate for these shares as needed. We then used MLE to estimate the remaining parameters.

To estimate prices of new SKUs not carried for snack cakes and tires, we used a hedonic regressionas described in Section 4.2 to assign prices, with the R2 values shown in Table 1. We multiplied theestimated prices by a scale factor of 1.02 and 1.05 respectively, so as to equalize the estimated revenue atthe chain level to the actual revenue, which will facilitate comparison of new optimized assortments with

12

current revenue. The retailer of appearance chemicals sold branded products, so the prices of potentialnew SKUs were known.

5.3 Validation

We measured the overall estimation error across all stores, by computing the Sales-Weighted Mean-Absolute-Deviation (MAD) of estimated Store-SKU sales shares from actual Store-SKU sales shares, as

given by

∑s∈M

∑j∈Ss

∣∣∣∣ xsj∑jxsj

−F sj (Ss)

∣∣∣∣xsj∑s∈M

∑j∈Ss

xsj∑jxsj

, where Ss is the set of SKUs in the assortment for store s. We

measure MAD in terms of sales shares because unit sales are significantly influenced by overall growth orshrinkage in the category, whereas sales shares are not. Moreover, our assortment choices are determinedsolely by sales shares, so these are the parameters important to our analysis.

We validated our results for the snack cake and tires applications using store-SKU sales for the validationperiods defined above3. We used the fit estimated parameters to compute the share of sales for eachSKU for the validation period, at the store and chain levels and compared it with the actual sales sharesto obtain the MADs given in Table 1.

As discussed previously, there can be interaction between attributes, including size and brand-warrantyin the tire application. One way to account for this interaction is to follow a latent class approach. Weassume that there are several homogenous segments of sizes, and the brand-warranty shares vary acrosssegments, but are the same within each segment. For example, a size segment could include tires thatfit old cars, and the brand-warranty shares reflect this fact. To explore the feasibility of this approach,we estimated such a model for a subset of 20 stores, allowing for two size segments. We find that thesales-weighted MAD of sales share estimates was 15.3% as compared to 19.6% for the case of one sizesegment.

5.4 Optimization

We next applied the greedy heuristic described in Section 4.3 to compute optimized assortments at thechain and store levels, varying the number of SKUs in the assortment from 1 to n. Figure 1 showsthe percentage of the maximum possible revenue if all SKUs were offered captured by the chain levelassortment, as a function of K as a percentage of n. Note that maximizing revenue for a given value ofK is equivalent to maximizing this percentage of maximum revenue captured.

For cakes and tires, the potential SKU set was all combinations of attributes. The appearance chemicalsapplication sold branded products available in the market place, and hence there was a list of availableproducts that we used in the optimization. As noted previously, this comprised SKUs currently offeredand 24 potential new SKUs. Interestingly, while we were waiting to receive this list from the categorymanager, we did an analysis using all combinations of attributes which recommended adding a waxproduct for a leading brand. The category manager pointed out that the product did not exist in themarket place, but then was shocked when the brand sales representative met with her several days laterand told her they were planning to introduce exactly this product. So for retailers of branded products,it makes sense to analyze two cases: with SKUs that currently exist in the market place for assortment

3For the snack cakes validation we had data for only 54 of the 140 stores in the chain.

13

update, and with a SKU set corresponding to all attribute combinations, so as to identify new productopportunities for suppliers and private label opportunities for the retailer.

All computations were carried out on the same computer with an Intel Core 2 Duo 2 GHz processor.The computation times for estimation and greedy for each application are summarized in Table 1.

To quantify the potential improvement in revenue, we compared the assortments we generated for L = 1and L = m to the current assortment. SKU count varied somewhat by store, so we compared the currentassortment for store s with SKU count Ks to the greedy assortment for K = Ks.

5.5 Findings

Tables 3 - 7 give results of parameter estimation for the three applications. Table 2 for single serve, showsbrand shares of 61% and 27%, a more than 2 to 1 difference, while for family size, the two brands haveequal shares. This shows the importance of our having combined brand and size into a single attribute,thus allowing for differences such as this. Similarly, as shown in Table 4, there is 89% substitution frombrand 1 to 2 in family size, but only 18% in single serve.4 The retailer found this the most interestingfinding of the study, as they believed substitution rates varied, but had previously no way to measurethe exact rates.

The most interesting finding for the tire retailer was that, as shown in Table 5, the demand share estimatefor H3L is much higher than the sales share, while for H2M it is much lower. The reason for this appearsto be that the retailer offered H3L in many fewer sizes than H2M; H3L is offered in only 15 of the 64sizes, versus 52 sizes in which H2M is offered. But for those sizes where H3L and H2M are both offered,H3L outsells H2M by 40:1 on average, indicating that it is strongly preferred over H2M. The retaileroffered H3L in fewer sizes because they preferred to sell the higher priced H2M and believed that theirsales staff could convince customers to trade up to this tire. The substitution estimate of 45% showsthat many customers did in fact trade up, and this explains the high sales share for H2M relative toits demand share. However, the 55% of the 61% of customers preferring H3L who did not substituterepresents more than 34% of demand that was being lost due to the meager offering of H3L in thecurrent assortment, suggesting that there was substantial opportunity to increase sales by re-assorting.We can see that offering H3L in only a few sizes hurts revenue. The average price of H3L and H2M inthe sizes where both were offered was $28 and $36 respectively. Suppose that there were 100 consumersshopping the store and consider the two alternatives of offering H3L alone or H2M alone. OfferingH2M would capture (5%*100+45%*61%*100)*$36=$1168 in revenues while offering H3L would capture(61%*100)*$28=$1708 implying 46% additional revenue.

Figure 2 shows that the estimated share of H3L at each store is negatively correlated and the share ofH2H and H2M are positively correlated with median income level in the zip code in which the store islocated, which is intuitive and provides confirming demographic evidence to support the reasonablenessof our parameter estimates.

In the tire application it was feasible for the retailer to offer store-specific assortments, since the tiresare not actually displayed at the store. But the snack cakes and appearance chemicals retailers thoughtit would be administratively too complex to have more than, respectively, 6 and 5 different assortments.

4Beswick and Isotta (2010), page 2, reports a very similar finding for an orange juice study. For the leading brand, only21% are willing to substitute to another brands, but for the second brand, 85% are willing to substitute

14

For snack cakes, we applied our assortment heuristics for L = 1, 2, ..., 6 ∪m, keeping K = 40 for allstores5. Table 8 shows revenue as a function of L. We note that complete localization increases revenueto $8.11 million compared to the revenue of $7.38 million for L = 1. However, 76.7% of this increase canbe achieved with just 6 different assortments, suggesting that a small amount of localization can have abig impact.

For appearance chemicals, we worked first on a 2,183 store ‘warm up’ case and then the full 3,236store case, generating up to five different assortments. Table 9 shows estimated revenue increase forvarious cases. Based on results of the 2,183 store case, the retailer concluded that at most three storeclusters would be used, so these were the cases run for the 3,236 store case. The retailer liked the twocluster solution and regarded cluster 1 as a base case that was representative of the chain as a wholeand cluster two as a subset of stores differentiated, as shown in Table 7, by a higher level of tire relatedpurchases, a preference for brand two, and a higher percentage of customers in a demographic the retailercalled “urban/bilingual”. This store segmentation was compelling for the retailer and agreed with morequalitative market research inputs they had received. Again we see that a small amount of localizationproduces most of the benefit.

In order to determine total market potential, we also ran an alternate scenario for Appearance Chemicals,expanding the potential SKU set to consist of all combinations of product attributes. This resulted ina chain optimal assortment lift of 15.2% and store optimal assortment lift of 19.7%, compared to thevalues of 11.8% and 14.2% for the comparable cases using only SKUs available in the market.

5.6 Implementation

The tire and appearance chemicals retailers implemented a portion of our recommendations, whichallowed us to assess how our methodology performed under a stringent implementation test.

Most of the products offered by the tire retailer were private label products. The retailer had relationswith a number of suppliers from whom they could request production of a particular tire, but it was notalways possible to find a supplier that would fulfill a given order. Usually the reason for declination wasthat the order was too small. Given this, the retailer asked us to optimize the assortment based on allattribute combinations, so they could take the resultant “shopping list” to the market to secure as manyas possible of our recommended new tires. Our optimized assortment deleted 47 SKUs and replacedthem with 47 new SKUs. Of these 47 new SKUs, the retailer could find suppliers for eleven, and thesewere added to the assortment in all stores, ten H3L SKUs and one H1H.

Estimating the change in revenue from the current to the implemented assortment was complicatedbecause, in addition to adding eleven new SKUs to the assortment, the retailer deleted more than elevenSKUs in each store. The number of SKUs deleted varied somewhat by store but averaged 24 SKUsdeleted. To achieve a fair comparison, in a store where Ns SKUs had been deleted and 11 added, weused the greedy heuristic to choose Ns−11 SKUs that were in the current assortment but not in theimplemented assortment and added them to create a modified implementation assortment that had thesame SKU count as the pre implementation assortment at each store and that was used in evaluation. Wethen used parameters estimated for the calibration, validation and implementation periods to estimaterevenue for the current assortment, the modified implementation assortment and the two optimized

5We only report results obtained using the forward greedy heuristic as the results based on the reverse greedy heuristicwere not significantly different.

15

assortments. Table 10 gives revenue estimates and percentage improvement over the current baselineassortment for these periods, and shows a 5.8% revenue increase for the new assortment.

A 5.8% improvement is significant. For example Canadian Tire reports achieving a 3−4% annual revenueincrease in existing stores during 2005−2009 and is targeting the same increase through 2012 as a resultof all efforts, including assortment modification, to increase sales in existing stores (Canadian TireCorporation Limited 2007).

The appearance chemicals retailer also adopted our recommendations, using the two cluster solution. Inour recommended assortment, 20 SKUs in cluster 1 with the lowest estimated revenue were replaced by20 new attribute combinations. In cluster 2, 19 existing SKUs were replaced. The retailer used exactlythe assignment of stores to clusters in our recommendations. They also adopted our recommendationson which attribute combinations to add to the assortment, although in some instances more than oneSKU in the market corresponded to the same attribute combination, with the result that the numberof SKUs added exceeded the number of attribute combinations added. Twenty two SKUs were addedto cluster one and twenty five to cluster two. The choice of which SKUs to delete differed from ourrecommendations in the number of SKUs deleted and in which SKUs were deleted. Seventeen SKUswere deleted from cluster one and twenty three from cluster two. The choice of which SKUs to deletewas guided by factors other than year to date revenue; for example, one car wash SKU was deleted dueto a history of quality issues. In addition to assortment changes, their localization efforts included givingmore prominent display and signage for tire products and brand 2 in the cluster 2 stores.

To evaluate the impact of these changes, we had available sales by cluster of the new assortment for the27 week period January 1, 2010 – July 8, 2010, and of the previous assortment for a comparable periodin 2009. In the discussion below, we refer to these as 2010 and 2009 sales, while recognizing they were foronly a portion of these years. SKUs can be segmented into three groups: kept SKUs that were in boththe 2009 and 2010 assortments, deleted SKUs that were in the 2009 assortment but not the 2010 andadded SKUs that were in the 2010 assortment but not the 2009 assortment. To evaluate the impact ofthe assortment changes we compared 2010 kept plus added revenue to 2010 kept revenue plus an estimateof what 2010 revenue would have been for the 2009 SKUs deleted. Revenue is affected by a variety offactors other than assortment, including weather, the economy and competitive activity. We measuredthe impact of these other factors by the ratio of the 2010 to 2009 revenue for kept SKUs and estimatedthe 2010 revenue of the deleted SKUs as their 2009 revenue times this factor. We are thus comparingthe new assortment revenue to an estimate of what the old assortment would have sold in 2010.

We needed to deal with the fact that more SKUs were added than deleted. Twenty two SKUs were addedto cluster 1 vs. seventeen deleted and twenty five SKUs were added to cluster 2 vs. twenty three deleted.The retailer had a fixed amount of shelf space allocated to this category and accommodated the increasein SKU count by reducing the shelf space assigned to some of the existing SKUs. They therefore did notview the increase in SKU count as a cause for concern. Still, reducing the space for some existing SKUsmight have caused greater stock outs, reducing revenue in a way we could not capture. Thus, to makea more rigorous evaluation of benefits, we used the twenty two and twenty five SKUs for clusters oneand two, respectively, with lowest revenue in the 2009 evaluation period in estimating deletion revenue,thereby equalizing the add and delete counts.

The newly added SKUs were introduced some time after January 1, 2010 and hence were not on sale forthe entire January 1 – July 8, 2010 period and moreover took some time to build to a steady state level

16

of sales. Examining the weekly sales data of the added SKUs, we observed that it took them 7 weeks toachieve a steady steady sales rate. Hence, we used added SKU revenue for weeks 8 – 27 scaled by 27/20as our estimate of added SKU revenue for the period January 1 – July 8, 2010.The result of these calculations showed a 3.6% revenue increase due to the revised assortment. Inaddition, there may have been some improvement due to the localized product display and signage incluster two stores that we were not able to measure. The retailer’s appearance chemicals team agreedwith our assessment of benefits and believed that the re-assortment exercise had been a success. We notethat a 3.6% revenue increase is large relative to the usual impact of all things retailers do to increasesame store sales. For example, in their annual reports, four leading auto parts retailers (Advance AutoParts, Auto Zone, O’Reilly, and Pep Boys) reported comparable store sales increases over 2006-2010 thataveraged 1.9%.

5.7 Forecasting New SKUs

One new SKU was added to the snack cakes assortment in the July-December 2007 validation period,Butterscotch in Brand 2 Single Serve. Eleven SKUs were added to the tire assortment and 25 to theappearance chemicals assortment during the implementation periods. Table 1 shows the MAPE of allnew SKU forecasts. We note that these values compare favorably to the 30.7% MAPE for chain sales oftwo new SKUs reported by Fader and Hardie 1996.Sources of forecast error affecting the fit, validation and implementation samples include random fluctu-ation in sales and the approximation of representing SKU shares as the product of attribute shares. Anadditional source of error for the validation and implementation samples are demand trends over time.For example, the demand for a particular tire initially increases over time as cars that use the tire ageand need replacement tires, but eventually declines as those cars become old enough that they beginto exit the population. Price changes can also impact demand patterns in any application. Table 11shows how changes in relative prices across tire brand-warranties relates to systematic changes in theirdemand shares over time. In particular, it is striking that as the price difference between H2M and H3Lnarrowed from 43.7% to 41.9% to 22.9% over the three periods, the demand split between these twobrand-warranties shifted from 7%/70% to 29%/27%.

6 Factors Influencing Localization Revenue Lift

The Localization Lift, defined as the revenue increase from using store specific assortments vs. a singleassortment for the chain was 12.2%, 5.8% and 2.4% for the snack cakes, tires and appearance chemicalsexamples described in the previous sections. As we sought to understand what features of the problemdata cause these differences in Localization Lift, our first thought was that Localization Lift must bedriven by demand variation across stores. We thus calculated a coefficient of demand variation (COV)defined as

√∑i∈N σ

2i /∑i∈N µi, where µi and σi denote the mean and standard deviation of revenue

shares of SKU i across all stores. COV for snack cakes, tires and appearance chemicals was 17%, 11%and 10% respectively. These values shed some light on variation in Localization Lift in that the highestCOV matches the highest lift, for snack cakes, but leave open the drivers of variation in Localization Liftbetween tires and appearance chemicals, where the Localization Lift varies by a factor of three while theCOV’s are nearly equal.

17

To better understand this issue, we examined the data more closely and made two observations. First,SKUs can be segmented into three groups: (1) those with such high demand that they were carriedin every store optimal assortment, (2) those with such low demand that they were in no store optimalassortment and (3) the remainder. While there may be substantial variation in demand across stores forthe first groups, none of this variation impacts Localization Lift.For example, in the case of appearance chemicals, we see in Table 7 that the best selling segment-sizefor the chain is Tire Dressings/Shines TRIGGER L. Table 12 shows substantial difference in the salesrate between two stores for Tire Dressings/Shines TRIGGER L. The best selling single SKU in thissegment-size, as defined by brand and quality level, accounted for 5.5% of revenue in Store A and 16.6%in Store B, a 3.3 to 1 difference. Yet even though the SKU sold much worse in Store A, with a revenueshare of 5.5%, it clearly made sense to have this SKU in a revenue maximizing assortment for Store A,and hence this difference in sales rate had no impact on the Localization Lift.Secondly, we noticed substantial variation in the breadth of assortment carried, from 40 out of 92 possibleSKUs for snack cakes to 130 out of 154 possible SKUs for appearance chemicals. A broader assortmentmeans that a single chain optimal assortment captures a greater fraction of potential demand, leavingless room for improvement from assortment localization.The example in Table 13 is designed to illustrate how these patterns can occur. The example comparesCOV and Localization Lift for two demand cases, and within each case, for K equal to 3 or 4. The priceof all SKUs is $1, so unit demand and revenue are the same. Moreover, total demand is equal for thetwo stores, so there is no distinction between demand units and demand share. We also assume there isno substitution. The variance column gives the variance in demand across the two stores for each SKU.COV, as defined above is the square root of total variance divided by total demand.In all cases the chain optimal assortment is SKUs 1 through K. For K = 3, the optimal assortment forstore 1 is SKUs 1, 2 and 3 and for store 2, SKUs 1, 2 and 4. For K = 4, SKUs 1 to 4 are an optimalassortment for both stores.For K = 3, note that Case 1 has the highest COV but the lowest Localization Lift. This happensbecause almost all of the inter store demand variation occurs for SKUs that are in either both storeoptimal assortments or neither and hence none of this variation impacts Localization Lift. By contrast,in Case 2, all of the inter store demand variation occurs for SKUs 3 and 4, the very SKUs that differin the store optimal assortments. For K=4, the lift in both cases drops to 0, demonstrating the impactthat breath of assortment can have on lift.Motivated by what we saw in the data for the three applications and by the features demonstrated bythe example in Table 13, we defined two additional metrics which we hypothesize would be correlatedwith Localization Lift. COV Select (K) is the coefficient of variation for SKUs which are in some but notall store optimal assortments. This metric is a function of K because store optimal assortments dependon K. We re-label our original coefficient of variation metric as COV – All to emphasize its differencewith COV – Select (K).We also define Chain Optimal Share (K) to be the share of the total potential revenue

∑s∈M

∑i∈N piD

si

achieved by a single, chain optimal assortment with K SKUs. So far we have not discussed the impactof substitution. We simply note that an increase in willingness to substitute increases Chain OptimalShare (K) and thus decreases Localization Lift.Table 14 gives Localization Lift and the three metrics for the three applications. We observe that the

18

highest Localization Lift for cakes can be explained by the low value of Chain Optimal (K) and highvalues of COV – All and COV Select (K). We also note that the difference in Localization Lift betweentires and appearance chemicals can be explained by a high value of Chain Optimal Share (K) and arelatively low value of COV - Select (K) for appearance chemicals.

We also used the solutions to the three applications for K varying from 1 to n to create Figure 3 showingLocalization Lift versus Percent of Maximum Total SKUs in the Assortment and Chain Optimal Share(K). We note that Localization Lift varies with Chain Optimal Share (K) as we have hypothesized. Thedip in the lift index initially in Figure 3 is because the top SKUs at the chain level have different rankordering across stores.

To further investigate the effect of demand variation and share captured by the chain optimal assortmenton localization lift, we used the existing data to create 100 additional problem instances for each of thethree applications by randomizing (a) the number of stores in the chain, (b) the actual stores sampledand (c) K, the maximum number of SKUs allowed in the assortment. Table 15 gives the range usedfor each application in randomly generating store count and K. We then calculated Localization Lift,COV - All, COV - Select and Chain Optimal Share (K) and regressed Localization Lift against thethree dependent variables; the results are summarized in Table 15. The regression results confirm ourhypothesis that Localization Lift depends mainly on Chain Optimal Share (K) and COV - Select, whichare highly significant, and not on COV - All, which is insignificant.

7 Conclusions

We have formulated a process for finding optimal assortments, comprised of a demand model, andestimation approach and heuristics for choosing assortments. We have applied this process to real datafrom three applications and shown that the approach produces accurate forecasts for new SKUs. Ourrecommendations were implemented in two of the cases. We measured the impact based on actualsales and found the assortment revisions had produced revenue increases of 5.8% and 3.6%, which aresignificant relative to typical comparable store increases in these product segments.

We note the following observations from this research.

1. Forecast accuracy for new SKUs compared favorably to prior research in Fader and Hardie (1996)and was adequate to achieve significant benefits in implementation. Nonetheless, the errors weregreat enough to reduce the revenue increase by about half from the fit to validation samples, soimproving forecast accuracy would be a useful focus of future research.

2. Sales is not true demand, but demand distorted by the assortment offered. We do not see demandfor SKUs not offered, and the sales of SKUs offered may be increased above true demand due tosubstitution. The impact of these effects can be significant. In the tire application, the lowest pricebrand-warranty level had a demand share of 61% but a sales share of only 5%, because the retaileroffered the lowest price brand-warranty in few sizes. Adding more of this brand warranty to theassortment was a big source of the revenue increase attained.

3. Substitution can be measured, can vary significantly and have a major impact on the optimalassortment. In the snack cakes example, in the family size, the probability of substituting from

19

Brand 1 to Brand 2 was 89%, versus only a 22% probability of substituting from Brand 2 to Brand1. This resulted in a complete replacement of Brand 1 by Brand 2 in family size of the optimizedassortment.

4. We were able to use demographic data to confirm our parameter estimates in the tire and ap-pearance chemicals examples. In particular, the share of the lowest price tire and unwillingness tosubstitute up to a higher price tire were correlated with median income in the store area. In someinstances, we also used demographic data to assist in estimating parameters.

5. The benefit from localizing assortments by store varied considerably, from 2% to 12%. We showedthat this difference is not only driven by the coefficient of demand variation across stores, but alsoby the percent of maximum revenue captured by a chain optimal assortment and by the coefficientof variation in demand for those SKUs that are in some, but not all store optimal assortments.

6. A limited amount of localization can capture most of the benefits of maximum localization. In thesnack cakes example, going from 1 assortment to 6 provided 77% of the benefit as going from 1assortment to 140 assortments.

7. There may be interaction between attribute levels not captured by our demand model. In the caseof tires, the demand for the least expensive brand-warranty level will be higher for a size tire thatgoes on an older, inexpensive car than for a tire that goes on a new, luxury car. We showed thatthis could be incorporated into our approach through latent class analysis.

Acknowledgments

We thank Professors Abba Krieger and David Bell from the Wharton School for numerous valuable com-ments and helpful suggestions with our research and Paul Beswick, Partner and Head, Oliver WymanNorth American Retail Practice, Robert DiRomualdo, former CEO, Borders Group, Kevin Freeland,COO, Advance Auto, Matthew Hamory, Principal, Oliver Wyman North American Retail Practice, Her-bert Kleinberger, Principal, ARC Consulting, Chris Morrison, Senior VP of Sales, Americas, Tradestone,Robert Price, Chief Marketing Officer, CVS, and Cheryl Sullivan, vice president of product management,Revionics, Inc. for helpful discussions on assortment planning practice. We also thank Carol Jensen,Tamara H. Jermyn, and Amanda O’Brien from Wawa, Inc. for their support. The financial supportof the Wharton School Fishman Davidson Center for Service and Operations Management is gratefullyacknowledged.

20

8 Tables

Table 1: Summary of key values for the three applications

ApplicationSnack Cakes Tires Appearance Chemicals

Category Parametersn - possible SKU count 92 384 154

K - average number of SKUs in an assortment 40 105 130m - store count 140 574 3,236

Demand ForecastsFit MAD, Chain Level 6.2% 4.5% 8.5%Fit MAD, Store Level 16.4% 13.6% 19.3%

Validation MAD, Chain Level 25.8% 21.1%Validation MAD, Store Level 40.1% 38.2%

Price Regression R2 85.5% 96.3%Estimated Revenue Increase (Based on Fit Sample)Chain Optimal Assortment 29.2% 30.1% 11.8%Store optimal assortment 41.4% 35.9% 14.2%

Implementation 5.8% 3.6%Number of New SKUs Forecasted 1 11 25

New SKU Forecast MAPE 16.2% 19.1% 28.7%Computation Time (Average per Store)

Estimation 7.2s 13.3s 5sGreedy 2.2s 6.5s 1.4s

Table 2: Tires: Management’s estimate of the most likelysubstitution probabilities

ToFrom NH NM H1H H2H H2M H3LNH 1 S S S 0 0NM L 1 S S 0 0H1H 0 0 1 L S 0H2H 0 0 S 1 S 0H2M 0 0 S L 1 0H3L 0 0 0 0 M 1S =Somewhat Likely, L =Likely, M = Most Likely

21

Table 3: Snack Cakes: Demand Share Esti-mates (Averaged across Stores)

Brand Size Sales Share DemandShare

B1S 67% 61%B2S 24% 27%B1F 4% 6%B2F 5% 6%

Table 4: Snack Cakes: Substitution ProbabilityEstimates (Averaged across Stores)

B1S B2S B1F B2F

B1S 1 18% 0 0B2S 26% 1 0 0B1F 0 0 1 89%B2F 0 0 22% 1

Table 5: Tires: Demand Share Estimates(Averaged across Stores)

BrandWarranty

SalesShare

DemandShare

NH 1% 4%NM 1% 3%H1H 3% 4%H2H 26% 24%H2M 45% 5%H3L 24% 61%

Table 6: Tires: Substitution Probability Esti-mates (Averaged across Stores)

Substitution Likelihood ProbabilitySomewhat Likely 2%Likely 6%Most Likely 45%

Table 7: Appearance Chemicals: Cluster Statistics for Top 25 SKUs for Appearance Chemicals

22

Table 8: Snack Cakes: Impact of Amount ofLocalization on Revenue

L Revenues ($ million)1 7.382 7.623 7.754 7.865 7.926 7.94... ....

m = 140 8.11

Table 9: Appearance Chemicals: Esti-mated Revenue Increases (%) vs. Num-ber of Store Clusters

No. of 2183 Stores 3236 StoresClusters (warm up) (implementation)

1 14.1 11.82 14.4 11.93 14.6 12.05 14.9

Table 10: Tires: Revenue Estimates for Current, Implemented and Optimized Assortments (percentageimprovement over current revenues given in parenthesis)

Revenues ($$ million) Parameter Estimates UsedAssortment Jul 04 - Dec 04 Jan 05 - Jun 05 Jul 05 - Dec 05

Current Assortment (Jul 04 - Dec 04) 80.2 74.9 72.3Modified Implementation (Jul 05 - Dec 05) 90.7 (13.1) 84.7 (13.1) 76.5 (5.8)Recommended Optimal Assortment, Chain 104.1 (29.8) 94.3 (25.9) 81.4 (12.6)Recommended Optimal Assortment, Store 108.2 (34.9) 99.2 (32.4) 83.6 (15.6)

Table 11: Tires: Price Changes and Impact on De-mand Shares for a Representative Store

Share of Demand (Price in $$)Brand

WarrantyJul - Dec

2004Jan - Jun

2005Jul - Dec

2005NH 2 (73.9) 5 (69.7) 8 (77.6)NM 2 (58.9) 3 (51.4) 6 (50.9)H1H 3 (59.8) 10 (61.1) 8 (58.6)H2H 16 (49.6) 21 (53.5) 22 (56.5)H2M 7 (43.3) 10 (45.7) 29 (46.5)H3L 70 (30.1) 51 (32.2) 27 (37.9)

H2M-H3L %Price Difference 43.7 41.9 22.9

23

Table 12: Statistics for Stores with Maximum and Minimum Percent Suburban

Table 13: Example to Illustrate Drivers of Localization

Case 1 Case 2Demand Demand

SKU Store 1 Store 2 Variance Store 1 Store 2 Variance1 50 100 1250 75 75 02 100 50 1250 75 75 03 40 50 50 50 0 12504 50 40 50 0 50 12505 40 0 800 25 25 06 0 40 800 25 25 07 40 0 800 25 25 08 0 40 800 25 25 09 40 0 800 25 25 010 0 40 800 25 25 011 40 0 800 25 25 012 0 40 800 25 25 0

Totals 400 400 9000 400 400 2500COV 11.9% 6.4%

K ChainOptimal

LocalizedOptimal

LocalizationLift

ChainOptimal

LocalizedOptimal

LocalizationLift

3 390 400 2.6% 350 400 14.3%4 480 480 0 400 400

24

Table 14: Explaining Localization LiftCategory Lift COV -

AllCOV -Select(K)

ChainOptimalShare

Cakes 0.122 0.17 0.23 0.71Tires 0.058 0.11 0.12 0.80AppearanceChemicals

0.024 0.10 0.10 0.93

Table 15: Regression of Localization LiftCoefficient Cakes Tires Appearance

Chemicals(Intercept) 0.147*** 0.113*** 0.186***Chain Optimal Share -0.095*** -0.166*** -0.157***COV Select 0.045* 0.475*** 0.339***COV All -0.024 0.232 -0.000Simulation DetailsMinimum K 10 10 5Maximum K 40 100 40Minimum Stores 10 20 80Maximum Stores 30 60 240# of Instances Simulated 100 100 100

9 Figures

Figure 1: Revenue vs. Percent of MaximumPossible SKUs in the Assortment

% of Total SKUs in Assortment

Sha

re o

f Max

imum

Rev

enue

Cap

ture

d

0.2

0.4

0.6

0.8

1.0

0.2 0.4 0.6 0.8 1.0

Category

Tires Cakes AppChem

Figure 2: Tires: Share of H3L (H2H,H2M ) isNegatively (Positively) Correlated with Income

Median Household Income ($$ Thousand)

Est

imat

ed S

hare

0.1

0.2

0.3

0.4

0.5

0.6

0.7

●

●

●

●

●●

●●

●

●●

● ● ●●

●

●

●

●

●

●●

●

●

H2H

H2M

H3L

20−30 30−40 40−50 50−60 60−70 70−80 80−90 90+

25

Figure 3: Localization Lift versus Maximum Total SKUs in the Assortment and Chain Optimal Share

% of Total SKUs in Assortment

Lift

0.0

0.1

0.2

0.3

0.2 0.4 0.6 0.8 1.0

Category

Tires

Cakes

AppChem

Chain Optimal ShareLi

ft

0.0

0.1

0.2

0.3

0.0 0.2 0.4 0.6 0.8 1.0

ReferencesAnupindi, R., M. Dada, S. Gupta. 1998. Estimation of consumer demand with stock-out based substitution: An

application to vending machine products. Marketing Science 17(4) 406–423.

Bell, D.R., A. Bonfrer, P.K. Chintagunta. 2005. Recovering SKU-level preferences and response sensitivities frommarket share models estimated on item aggregates. Journal of Marketing Research 42(2) 169–182.

Belloni, A., R. Freund, M. Selove, D. Simester. 2008. Optimizing Product Line Designs: Efficient Methods andComparisons. Management Science 54(9) 1544.

Beswick, P., M. Isotta. 2010. Making the right choices: SKU rationalization in retail. Oliver Wyman WorkingPaper.

Cachon, G., C. Terwiesch, Y. Xu. 2005. Retail assortment planning in the presence of consumer search. Manu-facturing & Service Operations Management 7(4) 330.

Canadian Tire Corporation Limited. 2007. Annual report.

Caro, F., J. Gallien. 2007. Dynamic assortment with demand learning for seasonal consumer goods. ManagementScience 53(2) 276–276.

Chong, J. K., T. H. Ho, C. S. Tang. 2001. A modeling framework for category assortment planning. Manufacturing& Service Operations Management 3(3) 191–210.

Dobson, G., S. Kalish. 1988. Positioning and pricing a product line. Marketing Science 7(2) 107–125.

Fader, Peter S., Bruce G. S. Hardie. 1996. Modeling consumer choice among skus. Journal of Marketing Research33(4) 442–452.

Fisher, M., A. Raman. 2010. The New Science of Retailing: How Analytics are Transforming the Supply Chainand Improving Performance. Harvard Business School Press.

Gaur, V., D. Honhon. 2006. Assortment planning and inventory decisions under a locational choice model.Management Science 52(10) 1528–1543.

Goodman, A. C. 1998. Andrew court and the invention of hedonic price analysis. Journal of Urban Economics44(2) 291–298.

26

Green, P.E., A.M. Krieger. 1985. Models and heuristics for product line selection. Marketing Science 4(1) 1–19.

Green, P.E., A.M. Krieger. 1987a. A consumer-based approach to designing product line extensions. Journal ofProduct Innovation Management 4(1) 21–32.

Green, P.E., A.M. Krieger. 1987b. A simple heuristic for selecting ’good’ products in conjoint analysis. Applica-tions of Management Science 5 131–153.

Green, P.E., A.M. Krieger. 1992. An application of a product positioning model to pharmaceutical products.Marketing Science 11(2) 117–132.

Guadagni, P.M., J.D.C. Little. 1983. A logit model of brand choice calibrated on scanner data. Marketing Science2(3) 203–238.

Hotelling, H. 1929. Stability in competitition. Economic Journal 39 41–57.

Kamakura, W.A., G.J. Russell. 1989. A probabilistic choice model for market segmentation and elasticity struc-ture. Journal of Marketing Research 26(4) 379–390.

Kohli, R., R. Sukumar. 1990. Heuristics for Product-Line Design Using Conjoint Analysis. Management Science36(12) 1464–1478.

Kök, G., M. L. Fisher. 2007. Demand Estimation and Assortment Optimization under Substitution: Methodologyand Application. Operations Research 55(6) 1001–1021.

Kök, G., M. L. Fisher, R. Vaidyanathan. 2008. Assortment planning: Review of literature and industrial practice.Invited chapter in Retail Supply Chain Management, Eds. N. Agrawal, S.A. Smith. Kluwer Publishers.

Lancaster, K. 1975. Socially Optimal Product Differentiation. The American Economic Review 65(4) 567–585.

Lancaster, K. J. 1966. A new approach to consumer behavior theory. Journal of Political Economy 74 132–157.

Maddah, B., EK Bish. 2007. Joint pricing, assortment, and inventory decisions for a retailer’s product line. NavalResearch Logistics 54(3) 315.

Mahajan, S., G. van Ryzin. 2001. Stocking retail assortments under dynamic consumer substitution. OperationsResearch 49(3) 334–351.

McBride, R.D., F.S. Zufryden. 1988. An integer programming approach to the optimal product line selectionproblem. Marketing Science 7 126–140.

McGregor, Jena. May 15, 2008. At Best Buy, marketing goes micro. BusinessWeek .

O’Connell, Vanessa. April 21, 2008. Reversing field, Macy’s goes local. Wall Street Journal .

Pakes, Ariel. 2003. A reconsideration of hedonic price indexes with an application to PC’s. The AmericanEconomic Review 93(5) 1578–1596.

Rosen, S. 1974. Hedonic prices and implicit markets: Product differentiation in pure competition. The Journalof Political Economy 82(1) 34–55.

Smith, S. A., N. Agrawal. 2000. Management of multi-item retail inventory systems with demand substitution.Operations Research 48(1) 50–64.

van Ryzin, G., S. Mahajan. 1999. On the relationship between inventory costs and variety benefits in retailassortments. Management Science 45(11) 1496–1509.

Vulcano, G., G. van Ryzin, R. Ratliff. 2009. Estimating primary demand for substitutable products from salestransaction data. Working Paper, NYU Stern.

Zimmerman, A. September 7, 2006. To boost sales, Wal-Mart drops one-size-fits-all approach. Wall Sreet JournalOnline A1.

Zimmerman, Ann. October 7, 2008. Home Depot learns to go local. Wall Street Journal .

27

a demand estimation procedure for retail assortment ... · the assortment planning process varies...

Documents