Costly Sequential Product Search and the Influence of Rankings

Jack-William Barotta

December 17, 2019

Introduction

In the following paper, I will discuss the effects of product rankings on consumer behavior.

Namely, I will be discussing both the theoretical prediction and observed behavior of a

consumer in an effort to quantify the effect of ranking relevant products on overall click

through and buy-rates. I will first begin with motivation behind the idea of costly search.

I will then discuss the theoretical model proposed by Weitzman applied to ordered lists

on e-commerce sites such as Expedia. I will then show the main empirical results that

have been found in relation to consumer spending patterns and searching. Finally, I will

conclude by demonstrating how the theoretical predictions of the Weitzman model motivate

the results that are observed empirically in a natural field experiment conducted by the

company Expedia on their online platform. In doing so, we will potentially be able to pinpoint

the effect of search costs on the consumer’s utility whilst also discussing the limitations of the

model and potential improvements. I will now begin introducing the theoretical framework

of the problem, which closely follows the derivations of Weitzman 1979 [7] and Ursu 2017 [6].

The Mathematical Model

In the following section of this paper, I will set up the mathematical model that will be

utilized for a sequential search as we see in many online platforms that list their products.

Common examples of websites that do this are Amazon and eBay [2]. After the mathematical

model is set up, however, we will compare the formalism of the model with the field experiment

conducted at Expedia in an effort to connect the theory with the data. Note that the model

that I will be following here is largely based on the model presented in Ursu 2017 [6].

We will consider the single consumer problem. Suppose the consumer visits some website,

S, that organizes its search results as an ordered list that contains L products.

With this organizational structure, each product, l ∈ {1, ..., L}, has some information displayed in

the list and an additional amount of information available to the consumer if the consumer

decides to click on the product. This is exactly how Amazon search results are

presented. On Amazon, consumers are generally offered a picture of the product, pricing,

and small amounts of information on the initial search result page that allows the consumer

to make some judgement on the exact valuation of the product. In addition, the consumer

can click on the item to learn more information, browse customer reviews, or view more

images to gain an even sharper opinion on the product. We will formalize this approach to

the online shopping model by denoting the prior level of utility that the object appears to

offer via the initial search results as vl [6]. In addition, we will denote the level of utility

that the object offers to the consumer after viewing more information as the posterior utility

level, ul. As such, we have the following relationship:

$$u_l = v_l + \varepsilon_l, \quad \text{where } \varepsilon_l \sim N(0, \sigma_l^2) \tag{1}$$

We capture the change in valuation from the prior to posterior utility for the lth good

by εl, which is normally distributed with mean zero and standard deviation σl. This is

a modeling assumption here [6]. The motivation behind this is that after learning more

information on product l, the utility gained from purchasing item l can go up or down due

to either positive/negative reviews or helpful/unhelpful additional information. As such,

we expect the utility change to be zero in expectation, but with appropriate

fluctuations about the mean [1]. Thus, by deciding to open the additional information

tab, i.e. to click on the product, the consumer gains εl utility.

The final component of our modeling is that of the search cost. We have that opening

each of the additional pages makes the consumer incur some search cost, cl, for the lth

good. In addition, in the spirit of the paper by Chen and Yao [1], I will model the search cost

more generally as a function of the position of the item, pl. Therefore, we have that in order

for the consumer to gain the additional information and more importantly, the additional

utility of the lth good, εl, the consumer incurs cost cl(pl) [1]. The search cost will satisfy:

$$c_l(p_l) > 0 \tag{2}$$

$$c_l'(p_l) > 0 \tag{3}$$

The first condition is imposed so that the solution is not degenerate.

Namely, the consumer actually incurs some positive search cost from

opening an additional information tab, which gives her an incentive to weigh

the pros and cons of continuing the search process for the item. With a zero or even negative

search cost, the consumer could search all day in order to find the item with the highest

posterior utility, leading to a degenerate solution.

In addition, we have the second condition yielding that the search costs are increasing as

a function of position. This can be justified due to the fact that the goods with positions at

the start of search results are easily accessible and require little to no scrolling or additional

time to get to compared to items in positions further down the page. The search cost turns

out to only be a function of position [6]. The authors ruled out the effects of other parameters

in our problem such as the utility gained from accessing the additional information, εl, or

even the prior utilities, vl. Finally, in order to stay in line with Weitzman’s paper, we will

assume there is an outside option of simply selecting no product, which means incurring

no search cost of looking for this outside option [7]. We will give this outside option some

constant utility, say v0. I will now move on to quantify the optimal search that this model

leads to.

Quantifying the Optimal Search

We will consider the option the consumer faces after already having made some non-

negative amount of searches [6]. Suppose that the consumer has searched some amount of

products on the page, and she is now deciding whether or not she should choose to select one

of the searched options or instead continue searching. Thus, the consumer faces the trade-off

between stopping the search and choosing an already opened (searched) item, or incurring the search

cost and looking at another [1]. The guiding quantity in this scenario actually turns out to

be exactly what was derived in Weitzman’s 1979 paper, namely we turn to the reservation

utility.

Consider good l with reservation utility, zl. The reservation utility is defined analogously

to a reservation price: it is the level of utility already in hand at which

the consumer is indifferent between searching for additional information on good l and not

searching it at all [7]. As such, we can represent this as:

$$c_l(p_l) = \int_{z_l}^{\infty} (u_l - z_l)\, f(u_l)\, du_l \tag{4}$$

where f(ul) is the probability density function of ul. As a note, we actually know this

density given our modeling assumption. We have that ul = vl + εl, which leads to E[ul] = vl

and var(ul) = σl². As such, we can transform our utility, ul, to a standard

normal distribution by defining:

$$\zeta = \frac{u_l - v_l}{\sigma_l} \tag{5}$$

This quantity will come up again as we solve the integral utilizing both the PDFs and CDFs of

the normal distribution. We can proceed in solving the integral through the steps

demonstrated in Kim et al. 2010 [4]. Throughout the next sequence of mathematics,

I will utilize standard notation for random variables, with F (·) denoting the CDF of ul,

f(·) denoting the PDF of ul, with corresponding Φ(·) and φ(·) as the CDF and PDF of the

standard normal distribution, ζ ∼ N(0, 1). We start from the expression for the search cost, cl (see footnote 1):

$$c_l = \int_{z_l}^{\infty} (u_l - z_l)\, f(u_l)\, du_l \tag{6}$$

I can now multiply and divide by the constant 1 − F(zl), which captures the probability

that the posterior utility is greater than or equal to the reservation utility level, zl.

$$c_l = (1 - F(z_l)) \int_{z_l}^{\infty} (u_l - z_l)\, \frac{f(u_l)}{1 - F(z_l)}\, du_l \tag{7}$$

The remaining integral is the conditional expectation of ul − zl given ul ≥ zl. Using the mean of a

truncated normal distribution and substituting in the standardized forms, we have that:

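$$c_l = \left(1 - \Phi\!\left(\frac{z_l - v_l}{\sigma_l}\right)\right) \left[ v_l - z_l + \sigma_l\, \frac{\phi\!\left(\frac{z_l - v_l}{\sigma_l}\right)}{1 - \Phi\!\left(\frac{z_l - v_l}{\sigma_l}\right)} \right] \tag{8}$$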

We can now deploy a change of variables in order to better interpret this result.

We can define $\chi_l = \frac{z_l - v_l}{\sigma_l}$, $x_l = \frac{c_l}{\sigma_l}$, and $\lambda(\chi_l) = \frac{\phi(\chi_l)}{1 - \Phi(\chi_l)}$. Doing so, we can re-express equation (8) as:

$$x_l = (1 - \Phi(\chi_l))\,(\lambda(\chi_l) - \chi_l) \tag{9}$$
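
Equations (8) and (9) can be sanity-checked numerically. The following is a minimal Python sketch, not part of Ursu [6] or Kim et al. [4], that compares the closed form with a direct midpoint-rule evaluation of the integral in equation (4). The function names, the integration settings, and the sample values of zl, vl, and σl are illustrative assumptions.

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)


def phi(x):
    """PDF of the standard normal distribution."""
    return math.exp(-0.5 * x * x) / SQRT2PI


def upper_tail(x):
    """1 - Phi(x), computed with erfc for accuracy in the tails."""
    return 0.5 * math.erfc(x / SQRT2)


def search_cost_closed_form(z, v, sigma):
    """Equation (8), rewritten as sigma * (phi(chi) - chi * (1 - Phi(chi)))."""
    chi = (z - v) / sigma
    return sigma * (phi(chi) - chi * upper_tail(chi))


def search_cost_numeric(z, v, sigma, width=12.0, steps=100_000):
    """Midpoint-rule approximation of the integral in equation (4)."""
    # Truncation point (an assumption): the density is negligible beyond it.
    upper = max(z, v) + width * sigma
    h = (upper - z) / steps
    total = 0.0
    for i in range(steps):
        u = z + (i + 0.5) * h
        density = phi((u - v) / sigma) / sigma  # f(u_l) for u_l ~ N(v_l, sigma_l^2)
        total += (u - z) * density
    return total * h


if __name__ == "__main__":
    # Hypothetical values of z_l, v_l and sigma_l, chosen only for illustration.
    for z, v, sigma in [(0.5, 0.0, 1.0), (1.0, 2.0, 0.5), (3.0, 1.0, 2.0)]:
        print(z, v, sigma,
              search_cost_closed_form(z, v, sigma),
              search_cost_numeric(z, v, sigma))
```

The closed form is evaluated as σl(φ(χl) − χl(1 − Φ(χl))), which is algebraically the same as equation (8) but avoids dividing by 1 − Φ(χl) when that probability is very small.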

We can now utilize this expression to characterize the solution. The term λ appearing in

the expression for the effective search cost, xl, is the hazard rate of the standard normal distribution. In general, a hazard rate

measures the rate of death, or stoppage, of a process, here the search for additional

information, as a function of our nondimensionalized number χl [6]. As such, we now have

an equation that gives xl directly in terms of χl. Conversely, given xl, the equation can be solved for χl,

numerically if necessary, because xl = g(χl) where g is strictly decreasing

(its derivative is −(1 − Φ(χl)) < 0) and therefore invertible. This is incredibly helpful. Since

Footnote 1: The argument of cl is dropped for ease of exposition; it is taken to be the position, as mentioned before.

we have such an invertible equation, we can actually solve for our reservation utility, zl by

solving for χl and then noting that:

$$z_l = v_l + \sigma_l \chi_l \tag{10}$$

As such, we have shown an analytical way of deriving a reservation utility for the lth good.

We can now take our result for the reservation utility as solvable and apply the optimal search

theorems that are formulated in Weitzman’s 1979 work.
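
One way to carry out this inversion in practice is bisection, which relies only on g being strictly decreasing (its derivative is −(1 − Φ(χl))). The sketch below is an illustration under that observation; the function names, the bracketing interval, and the tolerance are hypothetical choices and are not taken from Ursu [6].

```python
import math

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)


def g(chi):
    """Equation (9): effective search cost x_l as a function of chi_l."""
    phi = math.exp(-0.5 * chi * chi) / SQRT2PI
    upper_tail = 0.5 * math.erfc(chi / SQRT2)  # 1 - Phi(chi)
    return phi - chi * upper_tail


def solve_chi(x, lo=-50.0, hi=10.0, tol=1e-12):
    """Invert g by bisection; g is strictly decreasing, so the root is unique."""
    if not (g(hi) < x < g(lo)):
        raise ValueError("effective search cost outside the supported range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > x:
            lo = mid  # g(mid) is too large, so the root lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)


def reservation_utility(v, sigma, cost):
    """Equation (10): z_l = v_l + sigma_l * chi_l, with chi_l solving g(chi_l) = c_l / sigma_l."""
    return v + sigma * solve_chi(cost / sigma)


if __name__ == "__main__":
    # Hypothetical inputs: prior utility 1.0, sigma 2.0, two different search costs.
    print(reservation_utility(1.0, 2.0, 0.1))
    print(reservation_utility(1.0, 2.0, 0.5))
```

Because g is decreasing, a higher search cost maps to a lower χl and therefore to a lower reservation utility, which is the channel through which position enters the model.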

Now having a way of deriving a reservation utility for the ordered goods available to the

consumer, we can state two more rules that come as a direct consequence. First, we have that

a given consumer will stop searching through remaining objects when the maximum utility

observed already, namely the largest of the posterior utilities available, exceeds the reservation utility of

every non-searched option [7]. We have this in place as a means to stop an infinite search.

Since all remaining goods have reservation utilities below the best

posterior utility already realized, the consumer can be no better off by searching the remaining

goods. As such the consumer decides to stop the search. Second, we have a rule for once

the search has been stopped. Given the same consumer, who has now stopped searching

through goods and has a set of posterior utilities of goods available to her, she will choose

the option with the highest utility among those where the posterior utility has been realized

[7]. This is intuitive and similar to the aforementioned rule. Formally, let S denote

the set of searched goods, whose posterior utilities are known to the consumer. The consumer will then choose the product

$$s^* = \arg\max_{s \in S} u_s \tag{11}$$

where s* is the product that is bought/chosen from the set of searched goods.

Outside the mathematical formalism, we simply have developed the notion that the

consumer will 1. have some reservation utilities known to her, 2. search through the

goods available until the stopping rule is met, and 3. choose the best product

from those searched. We have now developed the mathematical framework necessary to

theoretically characterize the behavior of the individual when searching products on an

online website such as Amazon or Expedia. We will now enter an empirical discussion on

the field experiment conducted by Expedia before coming full circle to relate the empirical

regularities and the theoretical model.
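
The three steps above can be sketched for a single consumer as follows. This is a minimal illustration rather than the estimation procedure of Ursu [6]: the function names and the example inputs are hypothetical, and the search order (descending reservation utility) is Weitzman's selection rule [7], which is used here although it is not spelled out in the text above.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)


def g(chi):
    """Equation (9): effective search cost x_l as a function of chi_l."""
    phi = math.exp(-0.5 * chi * chi) / SQRT2PI
    upper_tail = 0.5 * math.erfc(chi / SQRT2)  # 1 - Phi(chi)
    return phi - chi * upper_tail


def solve_chi(x, lo=-50.0, hi=10.0, tol=1e-12):
    """Invert the strictly decreasing function g by bisection."""
    if not (g(hi) < x < g(lo)):
        raise ValueError("effective search cost outside the supported range")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def weitzman_search(priors, sigmas, costs, outside_utility, rng):
    """Simulate one consumer; return (clicked indices in order, chosen index or None)."""
    # Step 1: reservation utilities, z_l = v_l + sigma_l * chi_l (equation (10)).
    z = [v + s * solve_chi(c / s) for v, s, c in zip(priors, sigmas, costs)]
    # Selection rule (Weitzman): inspect goods in descending order of z_l.
    order = sorted(range(len(z)), key=lambda idx: z[idx], reverse=True)
    best_utility, chosen, clicked = outside_utility, None, []
    for idx in order:
        # Step 2 (stopping rule): stop once the best utility in hand is at least
        # the largest remaining reservation utility.
        if best_utility >= z[idx]:
            break
        clicked.append(idx)
        u = priors[idx] + rng.gauss(0.0, sigmas[idx])  # posterior utility, equation (1)
        if u > best_utility:
            best_utility, chosen = u, idx
    # Step 3 (choice rule): `chosen` has the highest realized utility, or is None
    # when the outside option is best.
    return clicked, chosen


if __name__ == "__main__":
    # Hypothetical example: three products whose search cost rises with position.
    rng = random.Random(0)
    clicked, chosen = weitzman_search(
        priors=[1.0, 1.2, 0.8], sigmas=[1.0, 1.0, 1.0],
        costs=[0.1, 0.2, 0.3], outside_utility=0.0, rng=rng)
    print("clicked:", clicked, "chosen:", chosen)
```

Because the goods are visited in descending order of reservation utility, checking the stopping rule against the next good in line is enough to cover all remaining goods.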

Application to Expedia

In a field experiment, Expedia was able to test the effect of position on the click rate and

buy-rate of its hotels. As motivation, suppose that, as an online retailer, you are interested

in exactly how to rank products on your website. There are many potential criteria, such as

popularity, rating, price, or relevance. It happens that most websites utilize a

relevance ranking where they place the most relevant products at the top of search results

with those of less relevance following sequentially. However, in order to test the relevance

of the products being displayed, a natural question arises: do products appear relevant

because of their position on the website, or because of intrinsic details of the product itself? The

online retailer, Expedia, sought to find the causal reasoning for click-through rates of objects

in the top positions by utilizing a field experiment [3].

Expedia is a travel company whose main purpose is to aggregate travel-related

purchasing options and display them on one consolidated platform [5]. Expedia features hotels, airlines,

and other forms of transportation tickets, and even allows customers on the website to bundle

multiple aspects of their travel expenses. The website features a design that allows an

individual to type in locations, types of travel deals, and dates of travel in order to generate

the best possible deals for the customer. Doing so, the individual is shown results as seen

below:

Figure 1: A screenshot of a typical search result. Source: Expedia.com

The experimental setup followed by Expedia was quite simple. Each customer that

visited the site over an 8-month period would be randomly assigned to a control group or a

treatment group [6]. Customers come to Expedia to check out hotels, and as such, customers

in each of the two groups would receive different search results for the same exact query. The

control group received the regular Expedia ranking, whereas the treatment group received

a random ranking. The treatment group was utilized to see the effect that the position had

on the click-through rate and the buy-rate of the hotels.

Throughout the 8 months of the field experiment that lasted between 2012 and 2013,

166,036 queries were recorded into the Expedia dataset. It should be noted that while

completing this expository project, I did not have access to the dataset, and as such, I will

be displaying the results as found in Ursu [6]. In addition, individuals were

put into the treatment group 1/3 of the time. This share

seems quite large given the fact that there could potentially have been a

large revenue loss. However, due to the large number of samples that were taken for the

treatment group, we have a more complete data set, allowing us to be more confident

in the results presented later. The results obtained at the conclusion

of the field experiment are presented in the following section.

Results

The following two graphs demonstrate both the click through rate and the purchase rate

conditioned on a click versus the position of the product.

Click-Rate versus Position. Source: Ursu 2017.

There are two large patterns in the graphs displayed. I will first discuss the click through

rate as a function of position. What we see is a decreasing click through rate as a function

of the position of the product. Namely, the products in the first few slots

receive a larger number of clicks, while the heavy tail of remaining products receives a smaller

number of clicks. Even the first position receives

almost double the number of clicks of the second. If there were more time to add

extensions to this project, it would be nice to utilize some inverse problem methods found

in applied mathematics to try and back out the type of distribution of clicks that leads

to such a graph. The shape resembles a Poisson or

exponential distribution, with the majority of the data clustering around the leftmost

interval of the domain. As such, we see that the hotels that are in the top positions are

clicked more regardless of the products being offered.

This is a strong effect observed in the data. Namely, we are seeing that rankings are

playing a critical role in the click through rate, and as such, are having a profound effect

on the customer’s search. However, to gain a greater

insight into the decisions of the consumer, let us now turn to the second graph, which displays

the buy-rate conditional on a click versus the position of the good.

Buy-Rate versus Position conditional on Click. Source: Ursu 2017.

What we see in the second graph is that there is an essentially flat conversion rate between

clicking through a product to gain additional information and actually purchasing the

product. That is, the buy-rate conditional on clicking through to the product is roughly

uniform across positions. As such, we can conclude that the fraction of clicks that lead to a purchase is not

a function of position. I will now link the motivation for these results with the theoretical

results provided earlier.

The key notions here are that the click-through rate decreases as a function of position

yet the buy-rate, conditional on the click through, is constant. We note that the Expedia

example is no different than what we would have expected to have seen in the theoretical

model. First, the consumer decides to go ahead and search through some of the products.

In our specific case, the searching the consumer does is analogous to clicking through on

the products that are being seen. Since the cost incurred for clicking through the products

at the top of the page, with high-placed positions, is lower than that for products in lower

positions down the page, one would expect individuals to open more of the products’ pages

towards the top of the page. In the empirical results, we see that irrespective of the product’s

prior utility level, products towards the top of the page are clicked more. As such, the

theoretical predictions provided by Weitzman and advanced by the work of Kim et al.

[4] and Ursu [6] are in close concord with the empirical results.

In addition to the first graph being described by theory, we can also motivate

the second graph. The secondary rules of the Weitzman search, namely that the

individual will stop searching once the best posterior utility realized so far exceeds

the reservation utilities of all remaining products, and that the product with the maximum posterior utility will

be chosen, support the notion of the products’ buy-rates being independent of position (see footnote 2).

Again, we gain insight into the empirical evidence through the theory. Weitzman’s

theory of price search predicts just this: once the objects are searched, i.e.

clicked through, one would not expect the position to play any further role [7]. This problem

is parallel to Weitzman’s Pandora’s box problem. After searching all of the boxes that should

be searched, given the criteria, the rule in place to select the item depends only on the

posterior utility. Since, for a random ranking, the posterior utilities should be independent of

ranking, one would expect a uniform buy-rate for each position. As such, we see a synthesis of

both the theoretical model proposed over 30 years ago with the empirical evidence gathered

recently.
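
To see how the model generates position effects under a random ranking, one can simulate many consumers who each follow the search rules described earlier. The following Python sketch is only an illustration: it does not use the Expedia data, and the linear cost function c(p) = 0.05 + 0.05p, the standard normal priors, σ = 1, and the outside utility v0 = 0 are all assumed values chosen for the example.

```python
import math
import random

SQRT2 = math.sqrt(2.0)
SQRT2PI = math.sqrt(2.0 * math.pi)


def g(chi):
    """Equation (9): effective search cost x_l as a function of chi_l."""
    phi = math.exp(-0.5 * chi * chi) / SQRT2PI
    return phi - chi * 0.5 * math.erfc(chi / SQRT2)


def solve_chi(x, lo=-50.0, hi=10.0, tol=1e-12):
    """Invert the strictly decreasing function g by bisection."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if g(mid) > x:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def simulate_random_ranking(n_consumers=5000, n_positions=10, sigma=1.0, seed=0):
    """Return (click rate, buy-rate given a click) per position, top position first."""
    rng = random.Random(seed)
    # Assumed search cost c(p) = 0.05 + 0.05 * p, which satisfies (2) and (3).
    chi = [solve_chi((0.05 + 0.05 * p) / sigma) for p in range(1, n_positions + 1)]
    clicks = [0] * n_positions
    buys = [0] * n_positions
    for _ in range(n_consumers):
        # Random ranking: prior utilities are drawn independently of position.
        v = [rng.gauss(0.0, 1.0) for _ in range(n_positions)]
        z = [v[p] + sigma * chi[p] for p in range(n_positions)]
        best, chosen = 0.0, None  # outside option with assumed utility v0 = 0
        for p in sorted(range(n_positions), key=lambda q: z[q], reverse=True):
            if best >= z[p]:  # stopping rule
                break
            clicks[p] += 1
            u = v[p] + rng.gauss(0.0, sigma)  # posterior utility, equation (1)
            if u > best:
                best, chosen = u, p
        if chosen is not None:
            buys[chosen] += 1
    click_rate = [c / n_consumers for c in clicks]
    buy_given_click = [b / c if c else float("nan") for b, c in zip(buys, clicks)]
    return click_rate, buy_given_click


if __name__ == "__main__":
    click_rate, buy_given_click = simulate_random_ranking()
    for position, (cr, br) in enumerate(zip(click_rate, buy_given_click), start=1):
        print(position, round(cr, 3), round(br, 3))
```

The function returns the click rate and the buy-rate conditional on a click for each position, which can then be compared qualitatively with the two graphs; the numbers themselves depend entirely on the assumed parameters.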

Footnote 2: Conditional on the individual clicking through.

Conclusion

Throughout the paper, we have introduced the notions of a costly sequential search, the

application of price search to online retail, and the causal effects of position on the click-through and

buy rates for ranked products on an online platform. In doing so, a theoretical

model, based mainly on the work of Ursu [6], was utilized to capture the

essential elements of costly price search as derived in Weitzman’s landmark paper in 1979.

To complement the theoretical argument, we also discussed the field experiment completed by

Expedia as a means to support the theoretical predictions with empirical evidence.

We see that the click-through rate is indeed a decreasing function of position:

items positioned at the top of the page are clicked more. However, we also

note that conditional on a click-through to the product description and other additional

information, the buy-rate is independent of the position. This shows how the simple model

provides powerful insight into what drives revenue for online retailers that feature ranked results

for queries. I thought that this project offered a nice blend of theoretical models, like

those discussed in the beginning of class, applied to a contemporary problem. I was

able to utilize skills learnt in maths courses whilst also motivating variables, concepts, and

equations through economic arguments, which allowed me to synthesize what I had fun

learning throughout the course. An interesting direction for future work would be to pose a

choice modeling problem that looks at whether there are ways to better maximize revenue for places

like Expedia and Amazon by changing the ordering of their products on their respective

websites. It would be interesting to see whether there is potential to move less relevant

products into higher positions and move best sellers to tertiary positions, for example, and

note the effects on revenue.

References

[1] Y. Chen and S. Yao. Sequential search with refinement: Model and application with click-stream data. Management Science, 63(12):4345–4365, 2016.

[2] X. Jiang, Y. Xiao, and S. Li. Personalized Expedia hotel searches. 2013.

[3] N. Khantal, V. Kroshilina, and D. Maini. Rank hotels on Expedia.com to maximize purchases. Technical report, 2013.

[4] J. B. Kim, P. Albuquerque, and B. J. Bronnenberg. Online demand under limited consumer search. Marketing Science, 29(6):1001–1023, 2010.

[5] R. Law and F. Chen. Internet in travel and tourism-part II: Expedia. Journal of Travel & Tourism Marketing, 9(4):83–87, 2000.

[6] R. M. Ursu. The power of rankings: Quantifying the effect of rankings on online consumer search and purchase decisions. Marketing Science, 37(4):530–552, 2018.

[7] M. L. Weitzman. Optimal search for the best alternative. Econometrica: Journal of the Econometric Society, pages 641–654, 1979.
