Show me the Money! Deriving the Pricing Power of Product Features by Mining Consumer Reviews
Nikolay Archak, Anindya Ghose, Panagiotis Ipeirotis
New York University, Stern School of Business
Information Systems Group, IOMS department
Word of “Mouse”
Consumer reviews
  Derived from user experience
  Describe different product features
  Provide subjective evaluations of product features
I love virtually everything about this camera....except the lousy picture quality. The camera looks great, feels nice, is easy to use, starts up quickly, and is of course waterproof. It fits easily in a pocket and the battery lasts for a reasonably long period of time.
Existing work
Identifying product features: Hu, Liu (AAAI, 2004); Ghani, Probst, Liu, Krema, Fano (KDD, 2006); Scaffidi (2006)
Sentiment classification: Das, Chen (2001); Turney, Littman (ACL, 2003); Dave, Lawrence, Pennock (WWW, 2003); Hu, Liu (KDD, 2004); Popescu, Etzioni (EMNLP, 2005)
Opinion analysis: Hu, Liu, Cheng (WWW, 2005)
Research Questions
How important is each product feature to customers?
What is the pragmatic polarity and strength of customers’ opinions?
Sales data provides valuable clues
Examine changes in demand and estimate weights of features and strength of evaluations.
Overview of our Approach
  “excellent lenses”   +3%
  “poor lenses”        -1%
  “excellent photos”   +6%
  “poor photos”        -2%
Feature “photos” is twice as important as “lenses”
“Excellent” is positive, “poor” is negative
“Excellent” is three times stronger than “poor”
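The three conclusions on this slide can be checked with a few lines of arithmetic. The sketch below is illustrative only: it assumes the pairing "excellent lenses" +3%, "poor lenses" -1%, "excellent photos" +6%, "poor photos" -2% (the reading consistent with the stated conclusions), and the helper names are hypothetical.

```python
# Toy effects from the slide, read as "opinion phrase -> % change in demand".
# ASSUMPTION: this pairing is the one consistent with the slide's conclusions.
effects = {
    ("lenses", "excellent"): 3.0,
    ("lenses", "poor"): -1.0,
    ("photos", "excellent"): 6.0,
    ("photos", "poor"): -2.0,
}


def feature_ratio(feature_a, feature_b, evaluation):
    """How much larger the effect of feature_a is than feature_b, same evaluation."""
    return effects[(feature_a, evaluation)] / effects[(feature_b, evaluation)]


def evaluation_ratio(eval_a, eval_b, feature):
    """How much stronger (in absolute value) eval_a is than eval_b, same feature."""
    return abs(effects[(feature, eval_a)]) / abs(effects[(feature, eval_b)])


photos_vs_lenses = [feature_ratio("photos", "lenses", e) for e in ("excellent", "poor")]
excellent_vs_poor = [evaluation_ratio("excellent", "poor", f) for f in ("lenses", "photos")]

# For a 2x2 table, a zero determinant means every effect can be written as
# (feature weight) * (evaluation score), i.e. the table has rank at most 1.
determinant = (
    effects[("lenses", "excellent")] * effects[("photos", "poor")]
    - effects[("lenses", "poor")] * effects[("photos", "excellent")]
)

print(photos_vs_lenses, excellent_vs_poor, determinant)
```

Both feature ratios agree and both evaluation ratios agree, which is what lets a single weight per feature and a single score per evaluation describe all four numbers.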
Economic background – Hedonic goods and hedonic regressions
We are not the first to measure the weights of product features. Economists have been doing this for years.
Hedonic goods [Rosen, 1974]:
  Each good is characterized by the set of its objectively measured features
  Preferences of consumers are solely determined by the features of available goods
Are all goods hedonic?
Hedonic regressions:
log(CameraPrice) = const + b1*NumMegapixels + b2*Zoom
+ b3*StorageSize +…
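To make the hedonic regression above concrete, here is a minimal sketch that fits it by ordinary least squares. Everything in it is an assumption for illustration: the camera data are synthetic, the "true" coefficients are invented, and the helper functions `solve_linear_system` and `ols` are hypothetical names, not part of the talk.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def ols(rows, y):
    """Least squares via the normal equations (X'X) b = X'y, intercept first."""
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    xtx = [[sum(xi[p] * xi[q] for xi in x) for q in range(k)] for p in range(k)]
    xty = [sum(xi[p] * yi for xi, yi in zip(x, y)) for p in range(k)]
    return solve_linear_system(xtx, xty)


# SYNTHETIC cameras: (megapixels, optical zoom, storage size). Not real data.
cameras = [(5, 3, 1), (6, 3, 2), (7, 4, 1), (8, 5, 4), (10, 3, 2), (12, 10, 8)]
# INVENTED coefficients (const, b1, b2, b3) used to generate noise-free log prices.
true_coef = (4.0, 0.10, 0.05, 0.02)
log_price = [
    true_coef[0] + true_coef[1] * mp + true_coef[2] * zoom + true_coef[3] * storage
    for mp, zoom, storage in cameras
]

coef = ols(cameras, log_price)
print([round(c, 4) for c in coef])
```

Because the dependent variable is log(price), each coefficient reads as an approximate proportional price change per unit of the feature; the noise-free data here only demonstrates that the fit recovers the coefficients it was generated from.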
Hedonic regressions with subjectively measured features
Problem: traditional hedonic regressions include only objectively measured features
Our solution: introduce review evaluations into the hedonic framework. Each opinion assigns an implicit subjective score to a feature [we don't know the scores].
For example:
  review1 says “excellent lenses” [implicit opinion score: 0.7] and “nice lenses” [implicit opinion score: 0.3]
  review2 says “decent lenses” [implicit opinion score: -0.1]
Average score of the “lenses” feature is: [0.7 + 0.3 - 0.1] / 3 = 0.3
Representing consumer review(s)
$N_{ij}$ – opinion phrase frequency; $W_{ij}$ – opinion phrase weight; $s$ – smoothing factor

$$W_{ij} = \frac{N_{ij} + s}{\sum_{k,l} (N_{kl} + s)}$$

                 Evaluations
Features         excellent   poor   good   … e_j
lenses           0.2         0      0.1    …
photos           0           0      0.3    …
ease of use      0           0.4    0      …
… f_i            …           …      …      W_ij
The matrix [tensor] representation lets us naturally estimate feature weights and evaluation scores.
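A minimal sketch of how such a weight matrix might be built from phrase counts. It assumes the weight of a (feature, evaluation) phrase is its smoothed relative frequency, (N + s) divided by the sum of (N + s) over all cells; the counts below are hypothetical, chosen so that the unsmoothed weights reproduce the example table, and `phrase_weights` is an illustrative name.

```python
def phrase_weights(counts, features, evaluations, s=0.0):
    """Return W[(feature, evaluation)] = (N + s) / sum over all cells of (N + s)."""
    cells = [(f, e) for f in features for e in evaluations]
    total = sum(counts.get(cell, 0) + s for cell in cells)
    return {cell: (counts.get(cell, 0) + s) / total for cell in cells}


features = ["lenses", "photos", "ease of use"]
evaluations = ["excellent", "poor", "good"]

# HYPOTHETICAL phrase counts for one product's reviews (10 phrases in total).
counts = {
    ("lenses", "excellent"): 2,
    ("lenses", "good"): 1,
    ("photos", "good"): 3,
    ("ease of use", "poor"): 4,
}

unsmoothed = phrase_weights(counts, features, evaluations)
smoothed = phrase_weights(counts, features, evaluations, s=0.5)

for f in features:
    print(f, [round(unsmoothed[(f, e)], 2) for e in evaluations])
```

With s > 0 every cell receives a small positive weight, so phrases that never occur in a product's reviews do not produce exact zeros; the weights still sum to one.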
Our Model
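"text" model:
$$\log(D_{kt}) = a_k + \beta_1 \log(P_{kt}) + \beta_2 R_{kt} + \Psi(W_{kt}) + \varepsilon_{kt}$$
k - product index, t - time index
$W_{kt}$ - consumer review information
$\Psi$ - impact of consumer reviews on product demand:
$$\Psi(W_{kt}) = \sum_{i=1}^{n} \sum_{j=1}^{m} \Psi(f_i, e_j)\, W_{ktij} = \sum_{i=1}^{n} \sum_{j=1}^{m} \Psi_{ij}\, W_{ktij}$$
Too many parameters (m*n): $\Psi_{ij} = \Psi(f_i, e_j)$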
log(Demand) = a + b*log(Price) + b1*Megapixels + b2*Zoom + … +
Ψ11*W[“excellent lenses”] + Ψ12*W[“great lenses”] + ... + Ψ1M*W[“terrible lenses”] +
Ψ21*W[“excellent photos”] + Ψ22*W[“great photos”] + ... + Ψ2M*W[“terrible photos”] +
… +
ΨN1*W[“excellent size”] + ΨN2*W[“great size”] + ... + ΨNM*W[“terrible size”]
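The review part of this equation is a double sum over features and evaluations, with one free coefficient per opinion phrase. The sketch below evaluates that sum for the example weight table shown earlier; the coefficient values are hypothetical placeholders, not estimates from the talk, and `text_term` is an illustrative name.

```python
def text_term(psi, w):
    """Sum of psi[i][j] * w[i][j] over all features i and evaluations j."""
    return sum(
        p * x
        for psi_row, w_row in zip(psi, w)
        for p, x in zip(psi_row, w_row)
    )


# Phrase weights from the example table.
# Rows: lenses, photos, ease of use. Columns: excellent, poor, good.
w = [
    [0.2, 0.0, 0.1],
    [0.0, 0.0, 0.3],
    [0.0, 0.4, 0.0],
]

# HYPOTHETICAL coefficients, one per (feature, evaluation) pair.
psi = [
    [0.5, -0.25, 0.25],
    [1.0, -0.5, 0.5],
    [0.25, -0.125, 0.125],
]

value = text_term(psi, w)
n_params = len(psi) * len(psi[0])  # one parameter per opinion phrase
print(value, n_params)
```

Even this 3 by 3 toy needs 9 coefficients; with N features and M evaluations the count grows as N*M, while many phrases appear only rarely in the reviews.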
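Technical Challenge – Reduce the Number of Parameters
Solution: place a rank constraint
$$\operatorname{rank}\left[(\Psi_{ij})_{i=1..n,\ j=1..m}\right] \le p$$
Special case (p = 1): independent feature weights ($\gamma_i$) and evaluation scores ($\delta_j$)
$$\Psi_{ij} = \gamma_i \delta_j, \qquad \Psi(W_{kt}) = \sum_{i=1}^{n} \sum_{j=1}^{m} \Psi_{ij}\, W_{ktij} = \sum_{i=1}^{n} \sum_{j=1}^{m} \gamma_i \delta_j\, W_{ktij}$$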
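A short sketch of the p = 1 special case, assuming each phrase coefficient factors into a feature weight times an evaluation score. The names `gamma` and `delta` and all numeric values are assumptions made for illustration; the weight matrix is the example table from the earlier slide.

```python
def outer(gamma, delta):
    """Build the full coefficient matrix psi[i][j] = gamma[i] * delta[j]."""
    return [[g * d for d in delta] for g in gamma]


def text_term_full(psi, w):
    """Unconstrained form: sum over i, j of psi[i][j] * w[i][j]."""
    return sum(p * x for pr, wr in zip(psi, w) for p, x in zip(pr, wr))


def text_term_rank1(gamma, delta, w):
    """Rank-1 form: sum over i of gamma[i] * (sum over j of delta[j] * w[i][j])."""
    return sum(g * sum(d * x for d, x in zip(delta, row)) for g, row in zip(gamma, w))


gamma = [1.0, 2.0, 0.5]      # HYPOTHETICAL feature weights
delta = [0.5, -0.25, 0.25]   # HYPOTHETICAL evaluation scores

# Rows: lenses, photos, ease of use. Columns: excellent, poor, good.
w = [
    [0.2, 0.0, 0.1],
    [0.0, 0.0, 0.3],
    [0.0, 0.4, 0.0],
]

full = text_term_full(outer(gamma, delta), w)
factored = text_term_rank1(gamma, delta, w)

params_full = len(gamma) * len(delta)   # n * m
params_rank1 = len(gamma) + len(delta)  # n + m
print(full, factored, params_full, params_rank1)
```

The two forms give the same value, but the factored one needs n + m numbers instead of n * m. Multiplying every feature weight by a constant and dividing every evaluation score by the same constant leaves the product unchanged, so one normalization has to be fixed when estimating.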
Amazon.com Dataset
Product category                     “Audio & Video”   “Camera & Photo”
Number of products                   127               115
Number of sales rank observations    35,143            31,233
Number of reviews                    2,580             1,955
Period: April 2005 – May 2006
Results - Feature Weights for “Camera & Photo”
[Figure: feature weights; axis range 0 to 1.2]
Results - Evaluation Coefficients for “Camera & Photo”
[Figure: evaluation coefficients; axis range -0.5 to 2]
Partial effects for “Camera & Photo”
great camera     0.4235     decent battery   -0.0139
good camera      0.1128     decent quality   -0.0822
great quality    0.0931     poor quality     -0.1067
good quality     0.0385     bad camera       -0.6547
great battery    0.0138     fine camera      -0.677
Partial effect of an opinion phrase (f, e): the score of the “average review” in which all evaluations of feature f are replaced by evaluation e, minus the score of the unmodified “average review”.
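One way to read this definition in code, under stated assumptions: the review score is the rank-1 form (feature weight times evaluation score times phrase weight, summed), and "replacing all evaluations of feature f by e" means moving that feature's entire weight mass onto evaluation e. The values of `gamma`, `delta`, and `w_avg` are hypothetical; the table above was not produced by this code.

```python
def score(gamma, delta, w):
    """Rank-1 review score: sum over i, j of gamma[i] * delta[j] * w[i][j]."""
    return sum(g * sum(d * x for d, x in zip(delta, row)) for g, row in zip(gamma, w))


def partial_effect(gamma, delta, w_avg, feature, evaluation):
    """Score change when every mention of `feature` uses `evaluation` instead."""
    replaced = [row[:] for row in w_avg]
    mass = sum(replaced[feature])           # total weight of this feature
    replaced[feature] = [0.0] * len(delta)  # drop its original evaluations
    replaced[feature][evaluation] = mass    # put all of it on one evaluation
    return score(gamma, delta, replaced) - score(gamma, delta, w_avg)


gamma = [1.0, 2.0, 0.5]      # HYPOTHETICAL feature weights
delta = [0.5, -0.25, 0.25]   # HYPOTHETICAL scores for: excellent, poor, good

# HYPOTHETICAL "average review". Rows: lenses, photos, ease of use.
w_avg = [
    [0.2, 0.0, 0.1],
    [0.0, 0.0, 0.3],
    [0.0, 0.4, 0.0],
]

effect_excellent_lenses = partial_effect(gamma, delta, w_avg, feature=0, evaluation=0)
effect_poor_lenses = partial_effect(gamma, delta, w_avg, feature=0, evaluation=1)
print(effect_excellent_lenses, effect_poor_lenses)
```

A partial effect is relative to how the feature is already described in the average review, so a mildly positive word can come out negative when the feature is usually praised more strongly.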
Predictive power of product reviews
Goal: predict future sales using review text
Model test: 10-fold cross validation (product holdout)
Compared with a model that ignores text but keeps numeric variables, including average review rating
Average RMSE improvement 5%, Avg. Err improvement 3%
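A minimal sketch of what a product-holdout split could look like: folds are formed over products rather than over individual observations, so no product contributes rows to both the training and the test side. The observation list, the fold assignment, and the function names are assumptions for illustration; the actual evaluation setup may have differed.

```python
import math
import random


def product_holdout_folds(product_ids, k=10, seed=0):
    """Partition the distinct products into k folds (sets of product ids)."""
    products = sorted(set(product_ids))
    random.Random(seed).shuffle(products)
    return [set(products[i::k]) for i in range(k)]


def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# HYPOTHETICAL panel: 25 products observed at 4 time points each.
observations = [(f"product-{p}", t) for p in range(25) for t in range(4)]
folds = product_holdout_folds([p for p, _ in observations], k=10)

leaked = 0
for held_out in folds:
    train = [obs for obs in observations if obs[0] not in held_out]
    test = [obs for obs in observations if obs[0] in held_out]
    # A product must never appear on both sides of a split.
    leaked += len({p for p, _ in train} & {p for p, _ in test})
    # Here one would fit the text model and the no-text baseline on `train`
    # and compare their rmse(...) on `test`.

print(len(folds), leaked)
```

Holding out whole products tests whether the review-based terms generalize to products the model has never seen, which is a stricter check than holding out random observations of products already in the training data.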
Conclusions
We provided a technique for:
  Measuring the importance of product features for consumers
  Identifying the polarity and strength of user evaluations
  Alleviating the problem of data sparseness
Thank you!
Comments? Questions?
Related Work
Chevalier, Mayzlin (2006); Chevalier, Goolsbee (2003); Ghani, Probst, Liu, Krema, Fano (2006); Hu, Liu (2004); Hu, Liu, Cheng (2005); Turney (2002); Pang, Lee (2005); Popescu, Etzioni (2005)