why your sentiment is wrong by networked insights
TRANSCRIPT
T.R. Fitz-Gibbon, Chief Scientist
Text Analytics Summit, May 2011
Why your Sentiment is Wrong
A Statistical Analysis of the Sentiment Task
2 © 2011 Networked Insights
“There is no true interpretation of anything; interpretation is a vehicle in the service of human comprehension. The value of interpretation is in enabling others to fruitfully think about an idea.”
- Andreas Buja
Subjectivity
“I would never get 7 hours just browsing with Wi-Fi. Not even 6 hours. I think my record has been 5 hours 30 minutes something.”
“Anyway 5 hours is enough for me. It was enough with 3GS. Maybe with next iPhone I will have better luck with the battery.”
Subject: iPhone 4 battery life, not so good?
Sentiment Analysis vs. Semantic Analysis
Sentiment Analysis
- Ignores most Data
- Results Determined by Chance
- High Subjective Error
Semantic Analysis
- Analyzes all Data
- Results Determined by Data
- Communicates Subjectivity
The Sentiment Task
Positive
Negative
Neutral
Sentiment Analysis
Natural Language Processing
Machine Learning
Manual Analysis by Experts
Manual Analysis by Crowdsourcing
The Sentiment Task
Happy
Sad
Indifferent
Sentiment Analysis
Natural Language Processing
Machine Learning
Manual Analysis by Experts
Manual Analysis by Crowdsourcing
The Sentiment Task
Intends to Purchase
Intends to Renew
Intends to Cancel
Sentiment Analysis
Natural Language Processing
Machine Learning
Manual Analysis by Experts
Manual Analysis by Crowdsourcing
Why Sentiment Fails
Causes
1. Narrow view of meaning
2. Low statistical confidence
Effects
- Ignores data
- Results left up to chance
1. Why Sentiment Fails - Narrow View of Meaning
[Figure: Percentage of posts that contain sentiment]
Data is based on a 500-post sentiment study we conducted around smartphones. The posts were classified by 20 people each.
Posts were assigned to a sentiment category based on a majority vote.
Only about 10% of posts were found to contain sentiment.
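The majority-vote assignment used in the study can be sketched roughly as follows. This is a minimal illustration, not the study's actual procedure: it assumes "majority" means more than half of the readers, that readers could choose a "none" label for posts without sentiment, and the label names and sample votes are hypothetical.

```python
from collections import Counter

def majority_label(votes):
    """Return the label chosen by more than half of the readers, else None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count > len(votes) / 2 else None

def share_with_sentiment(posts, no_sentiment="none"):
    """Fraction of posts whose majority label is an actual sentiment category."""
    labels = [majority_label(votes) for votes in posts]
    with_sentiment = [l for l in labels if l is not None and l != no_sentiment]
    return len(with_sentiment) / len(posts)

# Hypothetical votes: 5 readers on 4 posts (the study used 20 readers on 500 posts).
posts = [
    ["none", "none", "positive", "none", "none"],
    ["negative", "negative", "negative", "none", "positive"],
    ["none", "positive", "negative", "none", "none"],
    ["positive", "none", "negative", "none", "negative"],  # no label exceeds half
]

print([majority_label(votes) for votes in posts])
print(share_with_sentiment(posts))
```

In this toy data only the second post reaches a majority on a sentiment category, so one post in four counts as containing sentiment.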
2. Why Sentiment Fails - Statistical Confidence
[Figure: Confidence intervals for a sample of four readers]
When three out of four readers agree on the sentiment of a post, we can only be 35% confident that a majority of all readers would agree.
Normally, statistical significance at the 95% level is desired (for research and opinion polls). This example is only statistically significant at the 35% level.
Thus, in this case we have not yet sampled enough readers to conclude that we know the sentiment of this post.
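The slide does not state how the 35% figure was derived. One plausible calculation, offered here only as an assumption, is an exact two-sided binomial test of the observed split against a 50/50 split among all readers. It gives 37.5% for three of four readers, close to but not identical with the slide's number. The function name is hypothetical.

```python
from math import comb

def two_sided_confidence(agree, readers):
    """1 minus the exact two-sided binomial p-value against a 50/50 split."""
    k = max(agree, readers - agree)
    # Chance of a split at least this lopsided in one direction if readers were 50/50.
    tail = sum(comb(readers, i) for i in range(k, readers + 1)) / 2 ** readers
    # Double the tail for the two-sided test, capped at 1.
    p_value = min(1.0, 2 * tail)
    return 1 - p_value

print(two_sided_confidence(3, 4))    # 0.375
print(two_sided_confidence(15, 20))  # same 75% agreement, more readers
```

With the same 75% agreement but 20 readers (15 of 20), this calculation gives roughly 96%, which illustrates why a larger reader sample is needed before the sentiment of a post can be called known.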
Inter-Reader Agreement    Sample Size for 95% Confidence
90%                       < 10
75%                       ~ 20
65%                       ~ 50
55%                       ~ 500
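The relationship between agreement and required sample size can be approximated with a standard margin-of-error calculation. This is a sketch under assumptions the slide does not confirm: a normal approximation, the conservative variance of 0.25, and a requirement that the 95% margin of error be no larger than the gap between the agreement level and 50%. It reproduces the order of magnitude of the slide's figures, not the exact values.

```python
from math import ceil

Z_95 = 1.96  # two-sided 95% critical value of the standard normal

def readers_needed(agreement, z=Z_95):
    """Smallest n whose margin of error z * sqrt(0.25 / n) is at most agreement - 0.5."""
    margin = agreement - 0.5
    if margin <= 0:
        raise ValueError("agreement must exceed 0.5 for a majority to exist")
    return ceil((z / margin) ** 2 * 0.25)

for agreement in (0.90, 0.75, 0.65, 0.55):
    print(f"{agreement:.0%}: {readers_needed(agreement)} readers")
```

Under these assumptions the sketch gives 7, 16, 43 and 385 readers, in line with the slide's "< 10", "~ 20", "~ 50" and "~ 500".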
Why Sentiment Fails - Putting it all Together
[Figure: (Positive) Sentiment as a Percent, with the Correct Value marked against the Expected Outcome of Sentiment Analysis]
What is the Alternative?
Semantic Analysis
Topic Discovery
Topic Trending
Topic Discovery - Apple Topic Tree
Topic Trending - Apple
[Figure: Volume by topic. Series: Buy iPhone; Problem iPhone; Support, Apple Products; Mac OS, Mac Pro]
- The tail end of antenna-gate, signal issues, glass cracking, daylight savings bug
- Apple's big "Back to the Mac" event
- "iPhone Problems" and "Support" are both high before "Back to the Mac" and quite low after
- Conversation becomes all about buying iPhones and Mac
Topic Trending - Moms and CPG
Moms discussing fabric care
- “baby, baby clothes” experiences a lone spike around Christmas
- “saving money” drives this acceleration around “cloth diapers”
Sentiment Analysis vs. Semantic Analysis
Sentiment Analysis
- Ignores most Data
- Results Determined by Chance
- High Subjective Error
Semantic Analysis
- Analyzes all Data
- Results Determined by Data
- Communicates Subjectivity
Semantic Value
- Understand the Entire Conversation
- Understand the Actual Conversation
- Actually Understand the Conversation
What if I need Sentiment Analysis?
3 Questions to Ask your Provider:
• What is the inter-reader agreement of your manually scored sentiment data?
• When you manually score/label posts with sentiment, to how many readers do you give each post?
• For what type of posts was your solution designed and tested?
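For the first two questions, a provider's answer could be checked against the raw labels along these lines. The sketch assumes that "inter-reader agreement" means the share of readers who chose a post's most common label (so three of four readers agreeing is 75%), averaged over posts. A provider may report a different statistic, so the definition itself is worth asking about. Function names and sample labels are hypothetical.

```python
from collections import Counter

def post_agreement(votes):
    """Share of readers who chose the most common label for one post."""
    if not votes:
        raise ValueError("need at least one reader")
    top_count = Counter(votes).most_common(1)[0][1]
    return top_count / len(votes)

def inter_reader_agreement(posts):
    """Average the per-post agreement over all scored posts."""
    return sum(post_agreement(votes) for votes in posts) / len(posts)

# Hypothetical manually scored data: one list of reader labels per post.
posts = [
    ["positive", "positive", "positive", "positive"],  # all four readers agree
    ["positive", "positive", "positive", "negative"],  # three of four agree
]

print(inter_reader_agreement(posts))            # 0.875
print(min(len(votes) for votes in posts))       # fewest readers given any post
```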
We fuel insights, helping brands and their agencies make better marketing decisions.
Founded in 2006 by industry leaders and seasoned entrepreneurs in the fields of social media and customer intelligence.
Headquartered in Madison, WI, with offices in New York and Chicago.
T.R. Fitz-Gibbon, Chief Scientist
[email protected]
networkedinsights.com