an overview of text mining and sentiment analysis for decision support system
![Page 1: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/1.jpg)
An Overview of Text Mining and Sentiment Analysis for Decision Support System
Gan Keng Hoon
School of Computer Sciences
Universiti Sains Malaysia
12 May 2015
![Page 2: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/2.jpg)
Outline
1. Decision Support Systems
2. Overview of Text Mining & Sentiment Analysis
- Techniques in Text Mining
- Techniques in Sentiment Analysis
3. Applications and Challenges Ahead
![Page 3: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/3.jpg)
Decision Support System
As an end user, we need to make decisions every day ..
What to eat for lunch?
What subject to choose?
Which hotel to stay at?
![Page 4: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/4.jpg)
Decision Support System
As a business provider, every hour/minute/second, we need to make crucial decisions ..
Source: http://attunelive.com/blog/how-a-screening-prompted-by-clinical-decision-support-system-helped-save-a-patients-life/
![Page 5: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/5.jpg)
Decision Support System
Source: http://www.informationbuilders.com/decision-support-systems-dss
A decision maker in a company checks the sales figures before deciding which product to promote ..
![Page 6: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/6.jpg)
Decision Support System
A hotelier wants to know why ..
If the location is good, how can I take advantage of it ..
![Page 7: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/7.jpg)
Why are they/we using Decision Support Systems?
Business provider
- Improve customer experience
- Improve products and services
- More returns …
End user
- Better purchasing choice
- Better value
- Happier ..
![Page 8: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/8.jpg)
Sample Decision Support System
Looks good, 155 people say Very Good…
Not bad, customers rated 4* and above for location, cleanliness ..
http://www.tripadvisor.com.my
![Page 9: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/9.jpg)
The Truth ?
http://www.tripadvisor.com.my
![Page 10: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/10.jpg)
Many Questions …
Mr X: How is the condition of Wifi?
Miss Y: Is the toilet really dirty?
Family Z: Any convenience store nearby?
Manager of Hotel: I want to know all the complaints about toilet!
![Page 11: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/11.jpg)
Harnessing Web and Social Texts
Very influential.
Latest and most updated.
The truth (but sometimes not).
Free (most of the time).
Source: Hotel Review Sites: What’s the ‘Truth’ About Fairness? http://www.hospitalitynet.org/news/4056065.html
![Page 12: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/12.jpg)
However, Without Automated Methods
It is impossible to scan through each of them. Important details could be missed.
It is hard to visualize or summarize all the texts via manual effort.
It is impossible to digest new reviews generated each day.
*There are 344 reviews (as of 10/5/2015) for the mentioned hotel.
![Page 13: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/13.jpg)
Overview of Text Mining & Sentiment Analysis
Is the toilet really dirty?
Text Mining: Let’s mine some texts to answer the question.
1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
2. dirty sink, and very very dirty shower glass wall.
3. the shower, it's clean...
Sentiment Analysis: Let’s find some sentiments about these texts.
![Page 14: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/14.jpg)
Techniques in Text Mining
What is text mining?
To exploit information contained in textual documents in various ways.
Natural Language Processing
Information Retrieval
![Page 15: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/15.jpg)
Information Retrieval: Find relevant sentences.
Document Collection Processing
1. Text Preprocessing
- Sentence Tokenizer
- Stop Word Removal
2. Feature Selection
- Bag-of-Words Approach
- Term Frequency-Inverse Document Frequency (TF-IDF)
3. Inverted Index Creation
- Term – Doc Posting
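The three document processing steps above can be sketched in a few lines of Python. This is only a minimal illustration, not the speaker's implementation: the stop word list, the regex-based sentence splitter, and the function names (`split_sentences`, `tokenize`, `build_index`) are assumptions made for the example, and the TF-IDF variant used here (raw term frequency times log(N/df)) is one common choice among several. The sample review is adapted from the hotel texts quoted in these slides.

```python
import math
import re
from collections import Counter, defaultdict

# Assumption: a tiny hand-picked stop word list, for illustration only.
STOP_WORDS = {"the", "a", "an", "and", "in", "is", "it", "s", "were", "very"}

def split_sentences(text):
    """Naive sentence tokenizer: split on whitespace that follows ., ! or ?"""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower()) if t not in STOP_WORDS]

def build_index(sentences):
    """Return (inverted index, per-sentence TF-IDF weights)."""
    bags = [Counter(tokenize(s)) for s in sentences]  # bag of words per sentence
    postings = defaultdict(set)                       # term -> ids of sentences containing it
    for sid, bag in enumerate(bags):
        for term in bag:
            postings[term].add(sid)
    n = len(sentences)
    weights = [
        {term: tf * math.log(n / len(postings[term])) for term, tf in bag.items()}
        for bag in bags
    ]
    return dict(postings), weights

review = (
    "In the bathroom, used toiletries were left in the shower area. "
    "Dirty sink, and very very dirty shower glass wall. "
    "The shower, it's clean."
)
sentences = split_sentences(review)
postings, weights = build_index(sentences)
print(sorted(postings["dirty"]))   # ids of sentences that mention "dirty"
print(weights[1])                  # TF-IDF weights for the second sentence
```

Note that "shower" occurs in every sentence of this sample, so its weight is zero under this TF-IDF variant, while "dirty" is both frequent in one sentence and rare overall, so it is weighted highest.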
![Page 16: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/16.jpg)
Information Retrieval: Find relevant sentences.
Query Processing
1. Intention as Query
2. Query Preprocessing
- Tokenization
- Expansion using Synonyms
3. Query-Doc Matching
- Ranking
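The query side can be sketched in the same spirit. The synonym table, the stop word list, and the scoring rule (count of matching query-term occurrences) below are assumptions for illustration; a real system would more likely use a thesaurus and a weighted ranking function.

```python
import re
from collections import Counter

# Assumption: a hand-written synonym table and stop word list, for illustration only.
SYNONYMS = {"toilet": {"bathroom", "washroom"}, "dirty": {"filthy", "unclean"}}
STOP_WORDS = {"is", "the", "really", "a", "an", "in", "it", "s", "and"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def expand(terms):
    """Add known synonyms to the query terms."""
    expanded = set(terms)
    for term in terms:
        expanded |= SYNONYMS.get(term, set())
    return expanded

def rank(query, sentences):
    """Return matching sentences, highest number of query-term hits first."""
    query_terms = expand(tokenize(query))
    scored = []
    for sentence in sentences:
        bag = Counter(tokenize(sentence))
        score = sum(bag[t] for t in query_terms)
        if score > 0:
            scored.append((score, sentence))
    return [s for _, s in sorted(scored, key=lambda pair: -pair[0])]

texts = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]
for hit in rank("Is the toilet really dirty?", texts):
    print(hit)
```

Without the synonym expansion, the first text would not be retrieved at all, because it says "bathroom" rather than "toilet".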
![Page 17: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/17.jpg)
Information Retrieval: Find relevant sentences.
Simple and fast
- Quickly retrieves all relevant sentences or documents given some keywords.
But loses detail such as sentence structure and word order.
- Context is not captured.
- E.g. the term “cold” may refer to the air cond being cold or the receptionist being cold.
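A short sketch of why word order is lost: under a bag-of-words representation, two sentences with opposite meanings can look identical. The two sentences below are made up for illustration and build on the “cold” example.

```python
import re
from collections import Counter

def bag_of_words(text):
    """Count word occurrences, ignoring order and punctuation."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

# Hypothetical sentences: same words, different meaning.
a = "the air cond is cold but the receptionist is friendly"
b = "the receptionist is cold but the air cond is friendly"

print(bag_of_words(a) == bag_of_words(b))  # the two bags are equal
```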
![Page 18: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/18.jpg)
Natural Language Processing
Source: Cheng Xiang Zhai, Text Retrieval and Search Engine, Coursera Slide.
![Page 19: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/19.jpg)
Natural Language Processing
Difficult because speakers assume the hearer has some background knowledge.
Surface analysis of the text alone is not enough.
Common sense analysis is needed. E.g. “I can write words on that dusty table top.”
![Page 20: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/20.jpg)
Techniques in Sentiment Analysis
[Figure: Part of the Summarev Framework for Entity’s Text Processing and Sentiment Analysis. Stages: Pre-processing, Entity Detection, Post-processing. Components shown: Sentence Extractor, Tokenization, Boundary Detection, Sentence Selector, Entity Dictionary, Sentence Categorization, Sentiment Dictionary, Sentiment Extraction, Entity Extraction, Prediction Rating, MySQL Database, Browser.]
http://ir.cs.usm.my/siir/project_summarev.php
![Page 21: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/21.jpg)
Entity Detection (or Aspect Selection)
Texts
1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
2. dirty sink, and very very dirty shower glass wall.
3. the shower, it's clean...
…
Aspect
1. Bathroom
2. Toiletries
3. Shower area
4. Sink
5. Shower
6. Hair dryer
7. Wifi
8. Bed
...
- POS Tagging
- Noun Phrase Selection
- Term Weighting
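The three techniques listed above can be sketched as follows. This assumes a toy part-of-speech lexicon that covers only the sample words (a real system would use a trained tagger), treats each maximal run of consecutive nouns as a candidate aspect, and uses raw frequency as the term weight. All names here are hypothetical.

```python
import re
from collections import Counter

# Assumption: a toy POS lexicon for the sample texts; unknown words are tagged OTHER.
POS = {
    "bathroom": "NOUN", "toiletries": "NOUN", "shampoo": "NOUN", "soap": "NOUN",
    "shower": "NOUN", "area": "NOUN", "sink": "NOUN", "glass": "NOUN", "wall": "NOUN",
    "dirty": "ADJ", "clean": "ADJ", "used": "ADJ",
}

def pos_tag(text):
    """Tag words via the lexicon; punctuation is kept as separate OTHER tokens."""
    tokens = re.findall(r"[a-z]+|[^\sa-z]", text.lower())
    return [(t, POS.get(t, "OTHER")) for t in tokens]

def noun_phrases(tagged):
    """Collect maximal runs of consecutive nouns as candidate aspects."""
    phrases, current = [], []
    for token, tag in tagged:
        if tag == "NOUN":
            current.append(token)
        elif current:
            phrases.append(" ".join(current))
            current = []
    if current:
        phrases.append(" ".join(current))
    return phrases

def aspect_weights(texts):
    """Weight each candidate aspect by how often it occurs across the texts."""
    counts = Counter()
    for text in texts:
        counts.update(noun_phrases(pos_tag(text)))
    return counts

texts = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]
for aspect, weight in aspect_weights(texts).most_common():
    print(aspect, weight)
```

With only three texts every candidate occurs once; over a full review collection, the weighting step is what would separate frequent aspects from incidental nouns.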
![Page 22: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/22.jpg)
Sentiment Extraction
Texts
1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
2. dirty sink, and very very dirty shower glass wall.
3. the shower, it's clean...
…
Aspect - Sentiment
1. Sink – dirty
2. Shower – clean
3. Shower glass wall – dirty
- POS Tagging
- Adjective Phrase Selection
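One simple way to pair aspects with sentiment words, sketched under stated assumptions: the noun and adjective lexicons below are toy stand-ins for a POS tagger, and the pairing rule (an adjective attaches to the noun phrase right after it, otherwise to the most recent noun phrase before it) is a heuristic chosen for this example, not necessarily the rule used in the framework shown earlier.

```python
import re

# Assumption: toy lexicons covering only the sample texts.
NOUNS = {"sink", "shower", "glass", "wall", "bathroom", "toiletries"}
ADJECTIVES = {"dirty", "clean"}

def extract_pairs(text):
    """Return (aspect, sentiment word) pairs found in one text."""
    tokens = re.findall(r"[a-z]+", text.lower())
    pairs, last_phrase, i = [], None, 0
    while i < len(tokens):
        token = tokens[i]
        if token in NOUNS:
            j = i
            while j < len(tokens) and tokens[j] in NOUNS:
                j += 1
            last_phrase = " ".join(tokens[i:j])   # remember the latest noun phrase
            i = j
        elif token in ADJECTIVES:
            j = i + 1
            while j < len(tokens) and tokens[j] in NOUNS:
                j += 1
            if j > i + 1:                         # attributive: "dirty sink"
                last_phrase = " ".join(tokens[i + 1:j])
                pairs.append((last_phrase, token))
                i = j
            else:                                 # predicative: "the shower, it's clean"
                if last_phrase is not None:
                    pairs.append((last_phrase, token))
                i += 1
        else:
            i += 1
    return pairs

texts = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]
for text in texts:
    print(extract_pairs(text))
```

The first text yields no pair because it contains no sentiment adjective, even though a human reader would treat it as a complaint; this kind of implicit sentiment is not caught by adjective selection alone.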
![Page 23: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/23.jpg)
Sentiment Scoring
Texts
1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area
2. dirty sink, and very very dirty shower glass wall.
3. the shower, it's clean...
…
Aspect - Sentiment
1. Sink – dirty (N:0.75)
2. Shower – clean (P:0.5)
3. Shower glass wall – dirty (N:0.75)
Source: sentiwordnet.isti.cnr.it
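Scoring can then be a dictionary lookup. In the sketch below, the two scores are copied from the example above (dirty N:0.75, clean P:0.5); the dictionary layout is hypothetical and is not SentiWordNet's actual file format, and the net score is just one possible way to aggregate, added here as an assumption.

```python
# Assumption: scores taken from the slide's example; the layout is hypothetical.
SENTIMENT = {"dirty": ("N", 0.75), "clean": ("P", 0.5)}

def score_pairs(pairs):
    """Attach polarity and strength to each (aspect, sentiment word) pair."""
    scored = []
    for aspect, word in pairs:
        polarity, strength = SENTIMENT.get(word, ("?", 0.0))  # unknown words get no score
        scored.append((aspect, word, polarity, strength))
    return scored

def net_score(scored):
    """Positive strengths minus negative strengths (one possible aggregate)."""
    return sum(s if p == "P" else -s if p == "N" else 0.0 for _, _, p, s in scored)

pairs = [("sink", "dirty"), ("shower", "clean"), ("shower glass wall", "dirty")]
scored = score_pairs(pairs)
for aspect, word, polarity, strength in scored:
    print(f"{aspect} - {word} ({polarity}:{strength})")
print(net_score(scored))
```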
![Page 24: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/24.jpg)
Applications
Source: http://www.twtbase.com/twitrratr/
![Page 25: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/25.jpg)
Challenges Ahead
How to detect more in-depth sentiment.
How to differentiate spam from credible reviews.
Language problems
- Usage of mixed languages.
- Usage of non-standard languages.
![Page 26: An overview of text mining and sentiment analysis for Decision Support System](https://reader030.vdocuments.site/reader030/viewer/2022032611/55cad457bb61ebb3438b4583/html5/thumbnails/26.jpg)
Challenges Ahead
Last but not least, the challenge is to put the research and solutions into real use.