Post on 17-Dec-2015

Show me the Money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews

Nikolay Archak, Anindya Ghose, Panagiotis Ipeirotis

New York University, Stern School of Business

Information Systems Group, IOMS department

Word of “Mouse”

Consumer reviews:

Are derived from user experience

Describe different product features

Provide subjective evaluations of product features

I love virtually everything about this camera....except the lousy picture quality. The camera looks great, feels nice, is easy to use, starts up quickly, and is of course waterproof. It fits easily in a pocket and the battery lasts for a reasonably long period of time.


Existing work

Identifying product features: Hu, Liu (AAAI, 2004); Ghani, Probst, Liu, Krema, Fano (KDD, 2006); Scaffidi (2006)

Sentiment classification: Das, Chen (2001); Turney, Littman (ACL, 2003); Dave, Lawrence, Pennock (WWW, 2003); Hu, Liu (KDD, 2004); Popescu, Etzioni (EMNLP, 2005)

Opinion analysis: Hu, Liu, Cheng (WWW, 2005)

Research Questions

How important is each product feature to customers?

What is the pragmatic polarity and strength of customers’ opinions?

Sales data provides valuable clues

Examine changes in demand and estimate weights of features and strength of evaluations.

Overview of our Approach

Change in demand associated with each opinion phrase:

“excellent lenses”   +3%
“poor lenses”        -1%
“excellent photos”   +6%
“poor photos”        -2%

Feature “photos” is twice as important as “lenses”

“Excellent” is positive, “poor” is negative

“Excellent” is three times stronger than “poor”
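The three conclusions on this slide can be checked with simple ratios. The sketch below is only an illustration: it pairs the percentages with the phrases so that “excellent” carries the positive changes (+3%, +6%) and “poor” the negative ones (-1%, -2%), as the slide's conclusions require, and the function names are hypothetical.

```python
# Percentage demand changes from the toy example (illustrative numbers only).
effects = {
    ("lenses", "excellent"): 3.0,
    ("lenses", "poor"): -1.0,
    ("photos", "excellent"): 6.0,
    ("photos", "poor"): -2.0,
}

def feature_ratio(feature_a, feature_b, evaluation):
    """How much more demand moves for feature_a than for feature_b, same evaluation."""
    return effects[(feature_a, evaluation)] / effects[(feature_b, evaluation)]

def evaluation_ratio(eval_a, eval_b, feature):
    """How much more demand moves for eval_a than for eval_b, same feature."""
    return effects[(feature, eval_a)] / effects[(feature, eval_b)]

photos_vs_lenses = {e: feature_ratio("photos", "lenses", e) for e in ("excellent", "poor")}
excellent_vs_poor = {f: evaluation_ratio("excellent", "poor", f) for f in ("lenses", "photos")}

print(photos_vs_lenses)    # the same ratio under both evaluations
print(excellent_vs_poor)   # the same ratio for both features; the sign flips
```

Because both ratios are constant across rows and columns, the four effects factor into one weight per feature times one score per evaluation, which is the structure the model later exploits.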

Economic background – Hedonic goods and hedonic regressions

We are not the first to measure the weights of product features. Economists have been doing this for years.

Hedonic goods [Rosen, 1974]:

Each good is characterized by the set of its objectively measured features

Preferences of consumers are solely determined by the features of available goods

Are all goods hedonic?

Hedonic regressions:

log(CameraPrice) = const + b1*NumMegapixels + b2*Zoom + b3*StorageSize + …
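To make the log-linear form concrete, here is a minimal sketch of how such a regression is read once its coefficients are known. All coefficient values and the example camera are hypothetical and chosen only for illustration; they are not estimates from the talk.

```python
import math

# Hypothetical coefficients, for illustration only.
coef = {"const": 4.0, "NumMegapixels": 0.08, "Zoom": 0.05, "StorageSize": 0.01}

def predicted_log_price(features):
    """log(CameraPrice) = const + sum of coefficient * feature value."""
    return coef["const"] + sum(coef[name] * value for name, value in features.items())

def percent_price_change(feature_name, delta=1.0):
    """Percent change in price implied by raising one feature by delta, others fixed."""
    return (math.exp(coef[feature_name] * delta) - 1.0) * 100.0

camera = {"NumMegapixels": 8, "Zoom": 3, "StorageSize": 16}
log_price = predicted_log_price(camera)
price = math.exp(log_price)

# Same camera with one more unit of zoom.
upgraded = dict(camera, Zoom=camera["Zoom"] + 1)
upgraded_price = math.exp(predicted_log_price(upgraded))

print(round(price, 2), round(upgraded_price, 2))
print(round(percent_price_change("Zoom"), 3))
```

Because the dependent variable is a logarithm, each coefficient acts multiplicatively: one extra unit of a feature scales the price by exp(b), whatever the other features are.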

Hedonic regressions with subjectively measured features

Problem: traditional hedonic regressions include only objectively measured features

Our solution: introduce review evaluations into the hedonic framework. Each opinion assigns an implicit subjective score to a feature [we don’t know the scores].

For example:

review1 says “excellent lenses” [implicit opinion score: 0.7] and “nice lenses” [implicit opinion score: 0.3]

review2 says “decent lenses” [implicit opinion score: -0.1]

Average score of the “lenses” feature is: [0.7 + 0.3 - 0.1] / 3 = 0.3

Representing consumer review(s)

N_x – opinion phrase frequency

W_x – opinion phrase weight

s – smoothing factor

$$W_{ij} = \frac{N_{ij} + s}{\sum_{k,l} (N_{kl} + s)}$$

                 Evaluations
Features         excellent   poor   good   …   e_j
lenses           0.2         0      0.1    …
photos           0           0      0.3    …
ease of use      0           0.4    0      …
…
f_i              …           …      …      W_ij

The matrix [tensor] representation lets us estimate feature weights and evaluation scores naturally.
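A small sketch of how such a weight matrix could be built from phrase counts. It assumes additive smoothing, that is W = (N + s) divided by the sum of (N + s) over all feature-evaluation pairs, which is one reading of the slide's formula; the counts and the function name are hypothetical, chosen so that the unsmoothed weights match the example matrix.

```python
def opinion_weights(counts, features, evaluations, s=0.0):
    """Return W[(feature, evaluation)] = (N + s) / sum over all pairs of (N + s)."""
    pairs = [(f, e) for f in features for e in evaluations]
    total = sum(counts.get(pair, 0) + s for pair in pairs)
    if total == 0:
        raise ValueError("no opinion phrases and no smoothing: weights are undefined")
    return {pair: (counts.get(pair, 0) + s) / total for pair in pairs}

features = ["lenses", "photos", "ease of use"]
evaluations = ["excellent", "poor", "good"]

# Hypothetical phrase counts extracted from a product's reviews.
counts = {
    ("lenses", "excellent"): 2,
    ("lenses", "good"): 1,
    ("photos", "good"): 3,
    ("ease of use", "poor"): 4,
}

W = opinion_weights(counts, features, evaluations)                  # no smoothing
W_smoothed = opinion_weights(counts, features, evaluations, s=1.0)  # every pair gets mass

for f in features:
    print(f, [round(W[(f, e)], 2) for e in evaluations])
```

With s = 0 unseen phrases get weight zero; a positive s spreads a little weight over every phrase, which matters for products with very few reviews.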

Our Model

"text" model:

$$\log(D_{kt}) = a_k + \beta_1 \log(P_{kt}) + \beta_2 R_{kt} + \Psi(W_{kt})$$

k - product index, t - time index

$W_{kt}$ - consumer review information

$\Psi(W_{kt})$ - impact of consumer reviews on product demand:

$$\Psi(W_{kt}) = \sum_{i=1}^{n} \sum_{j=1}^{m} \Psi(f_i, e_j)\, W_{ktij} = \sum_{i=1}^{n} \sum_{j=1}^{m} \Psi_{ij}\, W_{ktij}$$

Too many parameters (m*n): $\Psi_{ij} = \Psi(f_i, e_j)$

log(Demand) = a + b*log(Price) + b1*Megapixels + b2*Zoom + …

+ Ψ11*W[“excellent lenses”] + Ψ12*W[“great lenses”] + ... + Ψ1M*W[“terrible lenses”]

+ Ψ21*W[“excellent photos”] + Ψ22*W[“great photos”] + ... + Ψ2M*W[“terrible photos”]

+ …

+ ΨN1*W[“excellent size”] + ΨN2*W[“great size”] + ... + ΨNM*W[“terrible size”]
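The review term of this regression is a plain double sum over features and evaluations. The sketch below evaluates it for one product at one time; the weight matrix reuses the example values from the earlier slide, while the Ψ coefficients are hypothetical and for illustration only.

```python
def review_impact(psi, W):
    """Sum of psi[i][j] * W[i][j] over all features i and evaluations j."""
    return sum(
        psi[i][j] * W[i][j]
        for i in range(len(psi))
        for j in range(len(psi[i]))
    )

# Rows: lenses, photos, ease of use. Columns: excellent, poor, good.
W = [
    [0.2, 0.0, 0.1],
    [0.0, 0.0, 0.3],
    [0.0, 0.4, 0.0],
]

# Hypothetical coefficients: one free parameter per (feature, evaluation) pair.
psi = [
    [0.5, -0.4, 0.2],
    [1.0, -0.8, 0.4],
    [0.3, -0.2, 0.1],
]

impact = review_impact(psi, W)
n_parameters = len(psi) * len(psi[0])
print(round(impact, 4), n_parameters)
```

Even with three features and three evaluations there are already nine coefficients to estimate, which is the parameter-count problem the next slide addresses.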

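Technical Challenge – Reduce the Number of Parameters

Solution: place a rank constraint

$$\operatorname{rank}(\Psi_{ij}) \le p$$

Special case (p = 1): independent feature weights and evaluation scores

$$\Psi_{ij} = \gamma_i \delta_j, \quad i = 1..n, \; j = 1..m$$

where $\gamma_i$ is the weight of feature $f_i$ and $\delta_j$ is the score of evaluation $e_j$, so that

$$\Psi(W_{kt}) = \sum_{i=1}^{n} \sum_{j=1}^{m} \Psi_{ij}\, W_{ktij} = \sum_{i=1}^{n} \sum_{j=1}^{m} \gamma_i \delta_j\, W_{ktij}$$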
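A minimal sketch of the p = 1 case, writing each coefficient as a product of one feature weight and one evaluation score. The symbols gamma and delta are this sketch's notation, and all numbers are hypothetical. It shows that the factored form gives the same review impact as the full coefficient matrix while using n + m numbers instead of n * m.

```python
# Rows: lenses, photos, ease of use. Columns: excellent, poor, good.
W = [
    [0.2, 0.0, 0.1],
    [0.0, 0.0, 0.3],
    [0.0, 0.4, 0.0],
]

gamma = [1.0, 2.0, 0.5]    # hypothetical feature weights
delta = [0.6, -0.2, 0.3]   # hypothetical evaluation scores

# Full coefficient matrix implied by the rank-1 structure: psi[i][j] = gamma[i] * delta[j].
psi = [[g * d for d in delta] for g in gamma]

# Review impact computed from the full matrix.
full = sum(psi[i][j] * W[i][j] for i in range(len(gamma)) for j in range(len(delta)))

# Same quantity computed directly from the factors.
factored = sum(
    gamma[i] * sum(delta[j] * W[i][j] for j in range(len(delta)))
    for i in range(len(gamma))
)

full_params = len(gamma) * len(delta)
rank1_params = len(gamma) + len(delta)
print(round(full, 4), round(factored, 4), full_params, rank1_params)
```

The factors are only identified up to scale (doubling every gamma and halving every delta leaves psi unchanged), so an estimation routine would need a normalization such as fixing one weight.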

Amazon.com Dataset

Product category                     “Audio & Video”   “Camera & Photo”
Number of products                   127               115
Number of sales rank observations    35,143            31,233
Number of reviews                    2,580             1,955

Period: April 2005 – May 2006

Results - Feature Weights for “Camera & Photo”

[Figure: feature weights for “Camera & Photo”; axis scale 0 to 1.2]

Results - Evaluation Coefficients for “Camera & Photo”

[Figure: evaluation coefficients for “Camera & Photo”; axis scale -0.5 to 2]

Partial effects for “Camera & Photo”

Opinion phrase    Partial effect    Opinion phrase    Partial effect
great camera       0.4235           decent battery    -0.0139
good camera        0.1128           decent quality    -0.0822
great quality      0.0931           poor quality      -0.1067
good quality       0.0385           bad camera        -0.6547
great battery      0.0138           fine camera       -0.677

Partial effect of an opinion phrase: score of the “average review” where all evaluations of the feature f are replaced by the evaluation e minus score of the “average review”.
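One way to compute this definition is sketched below. It assumes the rank-1 form, where a review's score is the sum of feature weight times evaluation score times phrase weight, and it interprets "replaced by the evaluation e" as moving all of feature f's phrase weight onto evaluation e. Both the interpretation and every number here are assumptions for illustration.

```python
def score(W, gamma, delta):
    """Review score: sum over features i and evaluations j of gamma[i] * delta[j] * W[i][j]."""
    return sum(
        gamma[i] * delta[j] * W[i][j]
        for i in range(len(gamma))
        for j in range(len(delta))
    )

def partial_effect(W_avg, gamma, delta, f, e):
    """Score change when all of feature f's weight in the average review moves to evaluation e."""
    replaced = [row[:] for row in W_avg]      # copy, so the average review is untouched
    mass = sum(W_avg[f])                      # total weight of feature f
    replaced[f] = [0.0] * len(W_avg[f])
    replaced[f][e] = mass
    return score(replaced, gamma, delta) - score(W_avg, gamma, delta)

# Rows: lenses, photos, ease of use. Columns: excellent, poor, good.
W_avg = [
    [0.2, 0.0, 0.1],
    [0.0, 0.0, 0.3],
    [0.0, 0.4, 0.0],
]
gamma = [1.0, 2.0, 0.5]    # hypothetical feature weights
delta = [0.6, -0.2, 0.3]   # hypothetical evaluation scores

excellent_photos = partial_effect(W_avg, gamma, delta, f=1, e=0)
poor_photos = partial_effect(W_avg, gamma, delta, f=1, e=1)
print(round(excellent_photos, 4), round(poor_photos, 4))
```

Measured this way, an evaluation can have a negative partial effect even if its score is positive, because it is compared with the mix of evaluations the average review already contains.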

Predictive power of product reviews

Goal: predict future sales using review text

Model test: 10-fold cross validation (product holdout)

Compared with a model that ignores text but keeps numeric variables, including average review rating

Average RMSE improvement 5%, Avg. Err improvement 3%
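Product holdout means that all observations of a given product fall in the same fold, so the model is always tested on products it has never seen. The sketch below shows one simple way to build such folds and to compute RMSE; the round-robin fold assignment, the function names, and the toy product identifiers are assumptions, not details from the talk.

```python
import math

def product_holdout_folds(product_ids, n_folds=10):
    """Yield (train_idx, test_idx) pairs in which no product appears on both sides."""
    products = sorted(set(product_ids))
    fold_of = {p: idx % n_folds for idx, p in enumerate(products)}  # round-robin assignment
    for fold in range(n_folds):
        test_idx = [i for i, p in enumerate(product_ids) if fold_of[p] == fold]
        train_idx = [i for i, p in enumerate(product_ids) if fold_of[p] != fold]
        yield train_idx, test_idx

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

# Toy data: 20 hypothetical products with 3 sales-rank observations each.
obs_products = [f"product_{p:02d}" for p in range(20) for _ in range(3)]
folds = list(product_holdout_folds(obs_products, n_folds=10))

for train_idx, test_idx in folds[:2]:
    print(len(train_idx), len(test_idx), sorted({obs_products[i] for i in test_idx}))
```

Splitting by observation instead would let the same product appear in both training and test data, which tends to overstate predictive accuracy.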

Conclusions

We provided a technique for:

Measuring the importance of product features for consumers

Identifying the polarity and strength of user evaluations

Alleviating the problem of data sparseness

Thank you!

Comments? Questions?

Related Work

Chevalier, Mayzlin (2006)
Chevalier, Goolsbee (2003)
Ghani, Probst, Liu, Krema, Fano (2006)
Hu, Liu (2004)
Hu, Liu, Cheng (2005)
Turney (2002)
Pang, Lee (2005)
Popescu, Etzioni (2005)
