beyond sentiment hype: conversation context for accurate discovery
TRANSCRIPT
Beyond Sentiment Hype: Conversation Context for Accurate Discovery
Hadley Reynolds NextEra Research
Agenda
• Where we are now – market drivers & technology dynamics
• The Sentiment Bubble considered
• Differentiating levels of analysis
• Practical dimensions of analysis and examples
• Discussion
Market Drivers for Sentiment Analysis
Additional Web 2.0 Content:
• Blogs
• Discussion Forums
• Amazon (Yelp, Trip Advisor etc.) Reviews
• User Generated Ratings Data
• “Like”
• Google+
And more, much more…
Sentiment Technology Providers
[Chart: sentiment technology providers by year, 2003 to 2011; vertical axis 0 to 45. Chart label: Corpora Software]
Where Does Sentiment Belong?
Early Social Monitoring
Naïve Sentiment
Credibility: comes from accuracy & insight
Ambiguity: is the enemy of accuracy
Challenges for Sentiment Analysis
• Level of analysis
• Timeframes for analysis
• Relative sophistication of analysis
Level of Analysis
• Corpus (Do the bloggers like us?)
• Document (Does this author like us?)
Document Sentiment Math
[Figure: sample document containing the terms good, o.k., great, disappointed]

Term          Value  Score
good            2      2
great           3      3
o.k.            1      1
disappointed   -4     -4
Total: +2

Positive document = 4 points or above
Negative document = -2 points or below
Neutral document = -2 through +3

Neutral Document
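The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the term values and thresholds come from the slide, while the names `LEXICON`, `score_document`, and `classify` are made up here. The slide lists -2 under both "negative" and "neutral"; the sketch assumes the negative rule takes precedence.

```python
# Term values copied from the slide's table; all names are illustrative.
LEXICON = {"good": 2, "great": 3, "o.k.": 1, "disappointed": -4}


def score_document(terms):
    """Sum the lexicon value of each sentiment term found in the document."""
    return sum(LEXICON.get(term, 0) for term in terms)


def classify(total):
    """Apply the slide's thresholds (assumption: -2 counts as negative)."""
    if total >= 4:
        return "positive"
    if total <= -2:
        return "negative"
    return "neutral"


terms = ["good", "o.k.", "great", "disappointed"]
total = score_document(terms)
print(total, classify(total))  # 2 neutral
```

One strongly negative term cancels most of three positive ones, so the document as a whole lands in the neutral band.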
Document Sentiment Math
[Figure: sample document discussing Product A and Product B; terms shown include good, ok, o.k., great, disappointed]

Product    Term          Value  Score
Product A  good            2      2
Product A  great           3      3
Product A  o.k.            1      1
Product A  disappointed   -4     -4
Product B  good            1      1
Product B  ok              1      2
Product B  disappointed   -4     -8
Total: -3

Positive document = 4 points or above
Negative document = -2 points or below
Neutral document = -2 through +3

Negative Document
Level of Analysis
• Corpus (Do the bloggers like us?)
• Document (Does this author like us?)
• Sentence (What is this person’s comment?)
• Entity/Attribute (What is it about us that she likes or doesn’t like?)
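To see what the entity level adds, the two-product table from the preceding Document Sentiment Math slide can be regrouped by product. This is a rough sketch: the scores are copied from that slide, and the names `ROWS` and `score_by_entity` are invented for illustration.

```python
from collections import defaultdict

# (product, term, score) rows copied from the two-product table.
ROWS = [
    ("Product A", "good", 2),
    ("Product A", "great", 3),
    ("Product A", "o.k.", 1),
    ("Product A", "disappointed", -4),
    ("Product B", "good", 1),
    ("Product B", "ok", 2),
    ("Product B", "disappointed", -8),
]


def score_by_entity(rows):
    """Total the scores separately for each product mentioned."""
    totals = defaultdict(int)
    for entity, _term, score in rows:
        totals[entity] += score
    return dict(totals)


per_entity = score_by_entity(ROWS)
print(per_entity)                # {'Product A': 2, 'Product B': -5}
print(sum(per_entity.values()))  # -3
```

The single document score of -3 hides the split: Product A comes out at +2 while Product B comes out at -5.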
Entity-level Analysis
[Diagram: Opinion (Emotion); Target Entity (Feature, Feature, Feature); Person (Profile, Social Network); Sources]
Timeframes of Analysis
• Retrospective analytics/business intelligence
• Predictive analytics – quality issues, future performance
• Trend emergence
• Real-time – customer interactions, social interactions/engagements
Sophistication of Analysis
• Keyword-based sentiment techniques
– Sentiment terms: elusive, ambiguous, in flux
– Sentiment lexicons: incomplete, non-specific, inflexible
– Unable to understand context surrounding an expression or the people contributing
– Unable to understand connections among related entities and attributes and people
– Unable to gauge quality of source materials
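One concrete case of missing context is negation. The sketch below is illustrative only: the term values are reused from the Document Sentiment Math slide, and the example sentences and the function name `keyword_score` are made up.

```python
import re

# Term values reused from the earlier slide; the sentences are invented.
LEXICON = {"good": 2, "great": 3, "disappointed": -4}


def keyword_score(text):
    """Score text by summing lexicon values of matching words, ignoring context."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return sum(LEXICON.get(token, 0) for token in tokens)


print(keyword_score("The service was great"))      # 3
print(keyword_score("The service was not great"))  # 3 (the negation is ignored)
```

Both sentences receive the same score because the matcher sees only the word "great", not the "not" in front of it.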
Sophistication of Analysis
• Semantic-based sentiment techniques
– Sentiment terms >> incorporate related expressions, fuzzy logic - NLP
– Sentiment lexicons >> domain ontologies (available or buildable) provide analytical context
– Able to understand context surrounding an expression or the people contributing - machine learning & other techniques
– Able to understand connections among related entities and attributes and people - triples, event extraction
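As a rough illustration of how triples can capture connections among people, entities, and attributes, the sketch below stores subject-predicate-object triples and indexes them by subject. Everything in it is hypothetical: the triples, the predicate names, and the helper `index_by_subject` are invented, and it does not describe any particular vendor's extraction method.

```python
from collections import defaultdict
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Hypothetical triples, such as an extraction step might produce.
TRIPLES = [
    Triple("commenter_1", "dislikes", "Product B"),
    Triple("Product B", "has_feature", "battery"),
    Triple("commenter_1", "mentions", "battery"),
    Triple("commenter_2", "likes", "Product A"),
]


def index_by_subject(triples):
    """Group (predicate, object) pairs under each subject."""
    index = defaultdict(list)
    for triple in triples:
        index[triple.subject].append((triple.predicate, triple.obj))
    return dict(index)


graph = index_by_subject(TRIPLES)
print(graph["commenter_1"])  # [('dislikes', 'Product B'), ('mentions', 'battery')]
```

Following the links from a person to a product to one of its features is the kind of connection a flat keyword count cannot represent.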
Dimensions of Analysis
• Ontologies around opinion objects
• Identification and qualification of entities & attributes & relationships
• Emotional content of expression(s)
• Quality gauge of sources
• Profiles of individual commenters
• Roles/interactions/sociology of commenters and their affiliations
• Timeframe for expressions and responses
Beyond +/-: Ontology-based analytics
Source: BuzzStory
Same Ontology breakdown
Same Scale: Expressed Opinions
Higher values for cardiovascular diseases with Avastin
Opinion::Emotion
Source: BuzzStory
Quality of Content Sources
"I know of one method that would be really scary and graphic that would work towards getting people to stop polluting my sea breeze environment. What I wonna know is they keep putting down smokers and blaming us for evrything."
• Quality: 4.48 (topix.com)
"As shown above, a total of 362 patients who hadn't progressed after first line chemo/Avastin were randomized to either of the two maintenance therapy arms, and the combination arm showed a significantly longer progression-free survival (PFS) counting from the beginning of all treatment, at 10."
• Quality: 16.78 (cancergrace.org)
Affiliation Network – Map of Affiliations of People & Topics
[Figure: network map; topic labels: Co-Morbidities; Tobacco Addiction; Biomarkers Targeted Therapies; Supplements; Prostate Cancer; Breast Cancer; H&N Cancer; Thyroid Disease; Lung Cancer Chemotherapy]
Source: BuzzStory
Sociology of Affiliations & Topic Groupings
[Figure: topic groupings; labels: Tobacco Addiction Co-Morbidities; Misc. Side-Effects; Biomarkers; Supplements; Other Types of Cancer; Misc. Side-Effects]
Source: BuzzStory
Where Does Sentiment Belong?
Keyword technology
Contextual Analytics
Challenges Remain
“The service at Reynards is, in general, friendly and loose. Though they couldn’t find a reservation for four one Friday night, they compensated with so much warmth and comped wine that all was forgiven. In some ways, Reynards offers what one wishes a dining experience in Manhattan would be: kindness instead of attitude, inoffensive prices, glorious food, and aesthetic variety—the clientele is split roughly in half between the stylish and the schlumpy.” The New Yorker, September 24, 2012
Resources
• Bing Liu, Sentiment Analysis and Opinion Mining, Morgan & Claypool, 2012
• Bo Pang and Lillian Lee, Opinion Mining and Sentiment Analysis (Foundations and Trends in Information Retrieval), Now Publishers, 2008
• Sentiment Analysis Symposium, San Francisco, CA, October 30, 2012
Questions?