Expanding Domain Sentiment Lexicon through Double Propagation
xwu/wie/courseslides/sentiment.pdf ·...
1
Expanding Domain Sentiment Lexicon through Double Propagation
Presenter: He Jiang
Report based on the following materials:
1) Bing Liu. Tutorial: Opinion Mining and Summarization - Sentiment Analysis, WWW-2008.
2) Guang Qiu, Bing Liu, Jiajun Bu, and Chun Chen. Expanding Domain Sentiment Lexicon through Double Propagation. IJCAI’09.
2
Outline
  Part 1: Opinion Mining and Sentiment Analysis
  Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
3
Part 1: Opinion Mining and Sentiment Analysis
  Opinion on the Web
  Applications
  Opinion Search
  Related Problems
4
Part 1: Opinion Mining and Sentiment Analysis
Opinion on the Web
  Two main types of information on the web
    facts
    opinions
  Where opinions are found
    review sites
    forums
    discussion groups
    blogs
5
Part 1: Opinion Mining and Sentiment Analysis
Opinion on the Web
  Two types of opinions
    Direct opinions: sentiment expressions on some objects, e.g. products, events, topics, persons
      e.g. "the picture quality of this camera is great"
    Comparisons: relations expressing similarities or differences of more than one object
      e.g. "car x is cheaper than car y"
6
Part 1: Opinion Mining and Sentiment Analysis
Applications
  Businesses and organizations: product and service benchmarking, market intelligence.
    Businesses spend a huge amount of money to find consumer sentiments and opinions.
    Consultants, surveys and focus groups, etc.
  Individuals: interested in others' opinions when
    purchasing a product or using a service,
    finding opinions on political topics.
  Ad placement: placing ads in user-generated content
    Place an ad when one praises a product.
    Place an ad from a competitor if one criticizes a product.
  Opinion retrieval/search: providing general search for opinions.
7
Part 1: Opinion Mining and Sentiment Analysis
Opinion Search
  Find the opinion of a person or organization (opinion holder) on a particular object or a feature of the object.
    e.g., what is Bill Clinton’s opinion on abortion?
  Find positive and/or negative opinions on a particular object (or some features of the object), e.g.,
    customer opinions on a digital camera.
    public opinions on a political topic.
  Find how opinions on an object change over time.
  How does object A compare with object B?
    Gmail vs. Hotmail
8
Part 1: Opinion Mining and Sentiment Analysis
Related Problems
  Sentiment classification
    classify the sentiment of words as expressed by authors (positive, negative, neutral)
  Feature-based opinion extraction and summarization
    A product consists of a few features (components); what is the opinion on each feature?
  Comparative sentence and relation extraction
  Visual Summarization & Comparison
9
Part 1: Opinion Mining and Sentiment Analysis
Related Problems
  Visual Summarization & Comparison
10
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
  Motivation
  Related Work
  Basic Idea
  Sentiment Word Extraction
  Polarity Assignment
  Experiments and Discussions
11
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Motivation
  Sentiment words: words that convey positive or negative sentiment polarities (attitude)
  A comprehensive sentiment lexicon (dictionary) is essential
    opinion expressions vary significantly among different domains, i.e., they are domain dependent
    no universal sentiment lexicon covers all domains
  How to extract domain-dependent sentiment words based on a set of seed sentiment words?
12
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Related Work
  Sentiment analysis can be conducted at the level of
    word (like this paper)
    expression
    sentence
  Sentiment analysis at the word level
    Corpus-based approaches (this paper falls in this category)
      Kanayama and Nasukawa 2006
        » Extract domain-specific sentiment words in Japanese text
        » Idea: 1) They exploit sentiment coherency within and among sentences to extract sentiment candidates; 2) then use a statistical method to determine whether a candidate is correct or not.
        » Key difference with double propagation: Double propagation exploits the relationships between sentiment words and product features in the extraction process.
    Dictionary-based approaches
13
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Basic Idea
  input: a set of sentiment words
  output: a set of sentiment words & a set of features
  [Diagram: sentiment words and features, each used to extract the other]
  sentiment words -> new sentiment words
  sentiment words -> new features
  features -> new sentiment words
  features -> new features
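The four propagation directions above can be read as a loop that runs until nothing new is found. The following is a minimal sketch of that reading, not the paper's implementation: the link lists (`sf_links`, `ss_links`, `ff_links`) are hypothetical stand-ins for relations that a dependency parser would supply, and the toy words are invented for illustration.

```python
def double_propagation(seeds, sf_links, ss_links, ff_links):
    """Grow sentiment-word and feature sets until a fixed point is reached.

    sf_links: (sentiment_word, feature) pairs that are related in the corpus
    ss_links: pairs of sentiment words related to each other (e.g. by "and")
    ff_links: pairs of features related to each other
    All three link lists are assumed inputs for this sketch.
    """
    sentiments, features = set(seeds), set()
    changed = True
    while changed:
        changed = False
        for s, f in sf_links:
            # sentiment words -> new features
            if s in sentiments and f not in features:
                features.add(f)
                changed = True
            # features -> new sentiment words
            if f in features and s not in sentiments:
                sentiments.add(s)
                changed = True
        for a, b in ss_links:
            # sentiment words -> new sentiment words
            if (a in sentiments) != (b in sentiments):
                sentiments.update((a, b))
                changed = True
        for a, b in ff_links:
            # features -> new features
            if (a in features) != (b in features):
                features.update((a, b))
                changed = True
    return sentiments, features


sentiments, features = double_propagation(
    seeds={"great"},
    sf_links=[("great", "picture"), ("amazing", "picture"), ("amazing", "lens")],
    ss_links=[("amazing", "sharp")],
    ff_links=[("lens", "battery")],
)
print(sorted(sentiments), sorted(features))
```

Each pass either adds a word or stops, so the loop terminates on any finite set of links.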
14
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Sentiment Word Extraction
  4 tasks:
    sentiment words -> new sentiment words
    sentiment words -> new features
    features -> new sentiment words
    features -> new features
  Model: extraction rules based on relations, using the dependency parser Minipar
  [Diagram labels: sentiment words; features; sentiment words]
15
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Sentiment Word Extraction
  Relations of sentiment words and features
    e.g. "I love iPod"
    Both "I" and "iPod" depend on the verb "love", with the relations subj and obj respectively. Here, subj means that "I" is the subject of "love", while obj means that "iPod" is the object of "love".
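As a small illustration of the example above, the two relations can be stored as (dependent, relation, head) triples. The `Dep` type and `dependents_of` helper are hypothetical names chosen for this sketch; they are not Minipar's output format.

```python
from typing import NamedTuple


class Dep(NamedTuple):
    dependent: str  # the word that depends on the head
    relation: str   # e.g. "subj" or "obj"
    head: str       # the word being depended on


# "I love iPod": both "I" and "iPod" depend on the verb "love".
parse = [Dep("I", "subj", "love"), Dep("iPod", "obj", "love")]


def dependents_of(head, deps):
    """Map each relation name to the word attached to `head` by it."""
    return {d.relation: d.dependent for d in deps if d.head == head}


print(dependents_of("love", parse))  # {'subj': 'I', 'obj': 'iPod'}
```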
16
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Sentiment Word Extraction
  Extraction Rules based on Relations
    POS: part-of-speech
    s, f: word/feature to be extracted
    {S}, {F}: known sentiment words/features
    {JJ}: adjectives and their variants, e.g. smart, smarter, smartest
    {NN}: nouns, e.g. picture, pictures
    {CONJ}: conjunction, e.g. and, or
    {MR}: dependency relations between sentiment words and features, such as mod, subj, obj, pnmod
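The rule table itself did not survive in this transcript, so the following is only a sketch of what a rule built from the legend might look like. It assumes one rule shape: a known sentiment word attached to a noun by a relation in {MR} yields a new feature, and an adjective attached to a known feature yields a new sentiment word. The tag sets and the (dependent, relation, head) triple format are assumptions made for illustration.

```python
MR = {"mod", "subj", "obj", "pnmod"}   # relations named in the legend
NN_TAGS = {"NN", "NNS"}                # assumed noun tags
JJ_TAGS = {"JJ", "JJR", "JJS"}         # assumed adjective tags


def extract_features(deps, pos, known_sentiments):
    """Known sentiment word {S} --{MR}--> noun: extract the noun as feature f."""
    return {
        head
        for dependent, relation, head in deps
        if relation in MR
        and dependent in known_sentiments
        and pos.get(head) in NN_TAGS
    }


def extract_sentiments(deps, pos, known_features):
    """Adjective --{MR}--> known feature {F}: extract the adjective as s."""
    return {
        dependent
        for dependent, relation, head in deps
        if relation in MR
        and head in known_features
        and pos.get(dependent) in JJ_TAGS
    }


# Toy input for "great picture" and "blurry picture" (invented for this sketch).
deps = [("great", "mod", "picture"), ("blurry", "mod", "picture")]
pos = {"great": "JJ", "blurry": "JJ", "picture": "NN"}

features = extract_features(deps, pos, {"great"})
print(features)
print(extract_sentiments(deps, pos, features))
```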
17
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Sentiment Word Extraction
  a detailed model
  [Diagram labels: corpus; Stanford POS tagger; tokens; dependency parser Minipar; rules; sentiment words; features; sentiment words]
18
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Polarity Assignment
  Observation 1
    Same polarity for same feature in a review
  Observation 2
    Same polarity for same sentiment word in a domain corpus
19
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Polarity Assignment
  Rules:
    Rule 1: Heterogeneous rule
      For sentiment words extracted by known features, and features extracted by known sentiment words, assign them the same polarities as the known ones.
    Rule 2: Homogeneous rule
      For sentiment words extracted by known sentiment words, and features extracted by known features, also assign them the same polarities as the known ones.
    Rule 3: Intra-review rule
      For new sentiment words extracted by features which were extracted in other reviews, the polarity cannot be determined by Rule 1. If those sentiment words only appear in this review, Rule 2 cannot be applied either. Those sentiment words are set to the same polarity as the review. The polarity of a review is determined by the sum of the polarities of the words in this review (+1 for positive, -1 for negative, 0 for neutral).
        » If sum > 0, positive for the review
        » If sum < 0, negative
    Conflict resolution:
        » For sentiment words assigned distinct polarities by calculation, use the sum of those polarities to determine the final polarity.
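The review-polarity sum and the conflict-resolution step both reduce to taking the sign of a sum. A minimal sketch follows, with hypothetical function names. The slide does not say what happens when the sum is exactly 0, so this sketch returns 0 (neutral) in that case as an assumption.

```python
def sign(x):
    """Return +1, -1, or 0 according to the sign of x."""
    return (x > 0) - (x < 0)


def review_polarity(word_polarities):
    """Polarity of a review: sign of the summed word polarities (+1, -1, 0)."""
    return sign(sum(word_polarities))


def resolve_conflicts(assignments):
    """For each word, collapse all polarities assigned to it into one."""
    return {word: sign(sum(pols)) for word, pols in assignments.items()}


# A review with two positive words, one negative word, and one neutral word.
print(review_polarity([+1, +1, -1, 0]))            # 1
# "small" was assigned +1 once and -1 twice during propagation.
print(resolve_conflicts({"small": [+1, -1, -1]}))  # {'small': -1}
```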
20
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Experiments and Discussions
  Experiment Set Up
    Data set: 5 review data sets (2 for digital cameras, 1 for DVD player, 1 for MP3, 1 for cell phone)
    On average, each data set consists of 789 sentences and 63 reviews.
    4 algorithms:
      CRF (conditional random fields): a supervised algorithm (2001)
      KN06
      Prop-dep
      noProp-dep (non-propagation version of the new approach)
    Initial positive and negative sentiment lists (654 and 1098 words, respectively)
    One data set is used as the training set for CRF
    Words appearing in both 10% (20%, 50%, 80%) of the initial lists and the chosen training data set are fed into KN06, Prop-dep, and noProp-dep
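The seed selection described above can be sketched as: take a fraction of the initial list, then keep only the words that also occur in the training data set. The slide does not say how the 10% (20%, 50%, 80%) subset is chosen; the sketch assumes a random sample with a fixed seed, and `pick_seeds` and the toy words are hypothetical.

```python
import random


def pick_seeds(lexicon, training_vocab, fraction, rng_seed=0):
    """Sample `fraction` of the lexicon, then keep words seen in training data."""
    ordered = sorted(lexicon)              # fixed order so sampling is repeatable
    k = int(len(ordered) * fraction)       # e.g. 10% of the list
    sampled = random.Random(rng_seed).sample(ordered, k)
    return set(sampled) & set(training_vocab)


lexicon = {"good", "bad", "great", "poor", "nice",
           "awful", "fine", "ugly", "sharp", "dull"}
training_vocab = {"good", "sharp", "heavy"}

for fraction in (0.1, 0.2, 0.5, 0.8):
    print(fraction, sorted(pick_seeds(lexicon, training_vocab, fraction)))
```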
21
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Experiments and Discussions
  Results of sentiment word extraction
22
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Experiments and Discussions
  Results of sentiment word extraction
23
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Experiments and Discussions
  Results of sentiment word extraction
24
Part 2: Expanding Domain Sentiment Lexicon through Double Propagation
Experiments and Discussions
  Results of polarity assignment
25
Conclusions
  Opinion mining is a promising topic in data mining; there are a lot of new problems in this area!
  Sentiment word detection could be a starting point for opinion mining.