an overview of text mining and sentiment analysis for decision support system


Upload: gan-keng-hoon

Post on 12-Aug-2015


Page 1: An overview of text mining and sentiment analysis for Decision Support System

An Overview of Text Mining and Sentiment Analysis

- for Decision Support System

Gan Keng Hoon

School of Computer Sciences

Universiti Sains Malaysia

12 May 2015

Page 2: An overview of text mining and sentiment analysis for Decision Support System

Outline

1. Decision Support Systems

2. Overview of Text Mining & Sentiment Analysis

- Techniques in Text Mining

- Techniques in Sentiment Analysis

3. Applications and Challenges Ahead

Page 3: An overview of text mining and sentiment analysis for Decision Support System

Decision Support System

As an end user, every day, we need to make decisions ..

What to eat for lunch?

What subject to choose?

Which hotel to stay at?

Page 4: An overview of text mining and sentiment analysis for Decision Support System

Decision Support System

As a business provider, every hour/minute/second, we need to make crucial decisions ..

Source: http://attunelive.com/blog/how-a-screening-prompted-by-clinical-decision-support-system-helped-save-a-patients-life/

Page 5: An overview of text mining and sentiment analysis for Decision Support System

Decision Support System

Source: http://www.informationbuilders.com/decision-support-systems-dss

A decision maker in a company checks the sales before deciding which product to promote ..

Page 6: An overview of text mining and sentiment analysis for Decision Support System

Decision Support System

A hotelier wants to know why ..

If location is good, how can I take advantage ..

Page 7: An overview of text mining and sentiment analysis for Decision Support System

Why are they/we using Decision Support Systems?

Business provider

- Improve customer experience

- Improve products and services

- More returns …

End user

- Better purchasing choice

- Better value

- Happier ..

Page 8: An overview of text mining and sentiment analysis for Decision Support System

Sample Decision Support System

Looks good, 155 people say Very Good…

Not bad, customers rated 4* and above for location, cleanliness ..

http://www.tripadvisor.com.my

Page 9: An overview of text mining and sentiment analysis for Decision Support System

The Truth ?

http://www.tripadvisor.com.my

Page 10: An overview of text mining and sentiment analysis for Decision Support System

Many Questions …

Mr X: How is the condition of Wifi?

Miss Y: Is the toilet really dirty?

Family Z: Any convenience store nearby?

Manager of Hotel: I want to know all the complaints about toilet!

Page 11: An overview of text mining and sentiment analysis for Decision Support System

Harnessing Web and Social Texts

Very influential.

Latest and most updated.

The truth (but sometimes not).

Free (most of the time).

Source: Hotel Review Sites: What’s the ‘Truth’ About Fairness? http://www.hospitalitynet.org/news/4056065.html

Page 12: An overview of text mining and sentiment analysis for Decision Support System

However, Without Automated Methods

It is impossible to scan through each of them. Important details could be missed.

It is hard to visualize or summarize all the texts via manual effort.

It is impossible to digest new reviews generated each day.

*There are 344 reviews (as of 10/5/2015) for the mentioned hotel.

Page 13: An overview of text mining and sentiment analysis for Decision Support System

Overview of Text Mining & Sentiment Analysis

Is the toilet really dirty?

Text Mining - Let’s mine some texts to answer the question.

1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area

2. dirty sink, and very very dirty shower glass wall.

3. the shower, it's clean...

Sentiment Analysis - Let’s find some sentiments about these texts.

Page 14: An overview of text mining and sentiment analysis for Decision Support System

Techniques in Text Mining

What is text mining?

To exploit information contained in textual documents in various ways.

Natural Language Processing

Information Retrieval

Page 15: An overview of text mining and sentiment analysis for Decision Support System

Information Retrieval - Find relevant sentences.

Document Collection Processing

1. Text Preprocessing

- Sentence Tokenizer

- Stop Word Removal

2. Feature Selection

- Bag of Words Approach

- Term Frequency - Inverse Document Frequency

3. Inverted Index Creation

- Term – Doc Posting
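The three steps above can be sketched in a few lines of Python. This is only an illustration, not the speaker's implementation: the stop-word list, the function names and the naive regex tokenizers are assumptions, and each review sentence is treated as one "document".

```python
import math
import re
from collections import Counter, defaultdict

# Hypothetical, deliberately tiny stop-word list (an assumption for this sketch).
STOP_WORDS = {"the", "a", "an", "and", "in", "it", "s", "were", "is", "very"}

def split_sentences(text):
    """Naive sentence tokenizer: split after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", sentence.lower()) if t not in STOP_WORDS]

def build_index(sentences):
    """Return (inverted index, per-sentence TF-IDF weights)."""
    bags = [Counter(tokenize(s)) for s in sentences]   # bag of words per sentence
    index = defaultdict(set)                           # term -> posting set of sentence ids
    for doc_id, bag in enumerate(bags):
        for term in bag:
            index[term].add(doc_id)
    n = len(sentences)
    # TF-IDF = raw term frequency * log(N / document frequency)
    weights = [{t: tf * math.log(n / len(index[t])) for t, tf in bag.items()}
               for bag in bags]
    return index, weights

sentences = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]
index, weights = build_index(sentences)
print(sorted(index["dirty"]))   # posting list for the term "dirty"
print(weights[1]["dirty"])      # TF-IDF weight of "dirty" in the second sentence
```

A term such as "shower" that occurs in every sentence gets a TF-IDF weight of zero here, which is how the weighting pushes down terms that do not discriminate between sentences.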

Page 16: An overview of text mining and sentiment analysis for Decision Support System

Information Retrieval - Find relevant sentences.

Query Processing

1. Intention as Query

2. Query Preprocessing

- Tokenization

- Expansion using Synonym

3. Query-Doc Matching

- Ranking
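A minimal sketch of the query side, under stated assumptions: the synonym table and stop-word list are made up for this example, and ranking is a plain count of matching terms rather than any particular retrieval model.

```python
import re
from collections import Counter

# Hypothetical synonym table used for query expansion (an assumption).
SYNONYMS = {"toilet": ["bathroom", "shower", "sink"], "dirty": ["filthy", "unclean"]}
# Hypothetical stop-word list (an assumption).
STOP_WORDS = {"is", "the", "really", "a", "an", "and", "in", "it", "s"}

def tokenize(text):
    """Lowercase, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]

def expand(terms):
    """Append the listed synonyms of every query term."""
    expanded = list(terms)
    for term in terms:
        expanded.extend(SYNONYMS.get(term, []))
    return expanded

def rank(query, sentences):
    """Score each sentence by how often it contains an (expanded) query term."""
    query_terms = set(expand(tokenize(query)))
    scored = []
    for doc_id, sentence in enumerate(sentences):
        bag = Counter(tokenize(sentence))
        score = sum(bag[t] for t in query_terms)
        if score > 0:
            scored.append((doc_id, score))
    # Highest score first; ties broken by sentence order.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

sentences = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]
print(rank("Is the toilet really dirty?", sentences))
```

Without the expansion step none of the three sentences would match "toilet" at all, since the reviewers wrote "bathroom", "sink" and "shower" instead.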

Page 17: An overview of text mining and sentiment analysis for Decision Support System

Information Retrieval - Find relevant sentences.

Simple and fast

- Quickly retrieve all relevant sentences or documents given some keywords.

But loses detail like sentence structure and word order.

- Context is not captured.

- E.g. the term “cold” may refer to the air conditioning being cold or to the receptionist being cold.

Page 18: An overview of text mining and sentiment analysis for Decision Support System

Natural Language Processing

Source: ChengXiang Zhai, Text Retrieval and Search Engines, Coursera slides.

Page 19: An overview of text mining and sentiment analysis for Decision Support System

Natural Language Processing

Difficult because we assume the hearer has some background knowledge.

Surface analysis of the text alone is not enough.

Need common sense analysis.

E.g. I can write words on that dusty table top.

Page 20: An overview of text mining and sentiment analysis for Decision Support System

Techniques in Sentiment Analysis

[Figure: framework diagram. Stages: Pre-processing, Entity Detection, Post-processing. Labelled components: Sentence Extractor, Tokenization, Boundary Detection, Sentence Selector, Entity Dictionary, Sentence Categorization, Sentiment Dictionary, Sentiment Extraction, Entity Extraction, Prediction Rating, MySQL Database, Browser.]

Part of Summarev Framework for Entity’s Text Processing and Sentiment Analysis

http://ir.cs.usm.my/siir/project_summarev.php

Page 21: An overview of text mining and sentiment analysis for Decision Support System

Entity Detection (or Aspect Selection)

Texts

1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area

2. dirty sink, and very very dirty shower glass wall.

3. the shower, it's clean...

Aspect

1. Bathroom

2. Toiletries

3. Shower area

4. Sink

5. Shower

6. Hair dryer

7. Wifi

8. Bed

...

- POS Tagging

- Noun Phrase Selection

- Term Weighting
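The slide names POS tagging, noun phrase selection and term weighting as the techniques. A POS tagger needs a trained model, so the sketch below swaps in a simpler technique, a lookup against an entity dictionary, to show the input and output of this step. The dictionary entries are the aspects listed on the slide; the function name is an assumption.

```python
import re

# Aspect dictionary taken from the aspect list on this slide.
ASPECTS = ["bathroom", "toiletries", "shower area", "sink",
           "shower", "hair dryer", "wifi", "bed"]

def detect_aspects(sentence, aspects=ASPECTS):
    """Return dictionary aspects mentioned in the sentence, in order of appearance."""
    # Try longer aspects first so "shower area" wins over "shower".
    ordered = sorted(aspects, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(a) for a in ordered) + r")\b")
    return [m.group(1) for m in pattern.finditer(sentence.lower())]

texts = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]
for text in texts:
    print(detect_aspects(text))
```

A dictionary lookup only finds aspects it already knows, which is why it misses "shower glass wall" in the second text; selecting noun phrases from POS tags is what would let new aspects be discovered.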

Page 22: An overview of text mining and sentiment analysis for Decision Support System

Sentiment Extraction

Texts

1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area

2. dirty sink, and very very dirty shower glass wall.

3. the shower, it's clean...

Aspect - Sentiment

1. Sink – dirty

2. Shower – clean

3. Shower glass wall - dirty

- POS Tagging

- Adjective Phrase Selection
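As a rough illustration of how aspect-sentiment pairs like these could be produced, the sketch below attaches to each aspect the nearest sentiment word in the same sentence. It replaces POS tagging and adjective phrase selection with two small hand-made word lists, which are assumptions for this example, as are the function names.

```python
import re

# Hypothetical aspect and sentiment word lists (assumptions for this sketch).
ASPECTS = ["shower glass wall", "shower area", "bathroom", "toiletries", "sink", "shower"]
SENTIMENT_WORDS = ["dirty", "clean"]

def find_terms(terms, text):
    """Return (term, start, end) for every listed term found in text, longest terms first."""
    ordered = sorted(terms, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b")
    return [(m.group(1), m.start(), m.end()) for m in pattern.finditer(text)]

def extract_pairs(sentence):
    """Pair each aspect with the sentiment word closest to it (by character gap)."""
    text = sentence.lower()
    opinions = find_terms(SENTIMENT_WORDS, text)
    if not opinions:
        return []
    pairs = []
    for aspect, a_start, a_end in find_terms(ASPECTS, text):
        # Gap between the two spans, whichever comes first in the sentence.
        word, _, _ = min(opinions, key=lambda o: max(o[1] - a_end, a_start - o[2]))
        pairs.append((aspect, word))
    return pairs

texts = [
    "in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area",
    "dirty sink, and very very dirty shower glass wall.",
    "the shower, it's clean...",
]
for text in texts:
    print(extract_pairs(text))
```

The nearest-word rule is crude: it ignores negation ("not clean") and will pair an aspect with a distant sentiment word if that is the only one in the sentence. The first text yields no pairs at all, because its complaint is expressed without any sentiment word from the list.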

Page 23: An overview of text mining and sentiment analysis for Decision Support System

Sentiment Scoring

Texts

1. in the bathroom, used toiletries (shampoo & soap) were not thrown and were left in the shower area

2. dirty sink, and very very dirty shower glass wall.

3. the shower, it's clean...

Aspect - Sentiment

1. Sink – dirty (N:0.75)

2. Shower – clean (P:0.5)

3. Shower glass wall – dirty (N:0.75)

Source: sentiwordnet.isti.cnr.it
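A small sketch of the scoring step, assuming the aspect-sentiment pairs are already available. The two scores are the ones shown on this slide; SentiWordNet itself assigns scores per word sense, so one fixed value per word is a simplification, and the net score is just one hypothetical way to summarise the result.

```python
# Polarity (N = negative, P = positive) and strength, copied from the slide.
SCORES = {"dirty": ("N", 0.75), "clean": ("P", 0.5)}

def score_pairs(pairs, scores=SCORES):
    """Attach (polarity, strength) to each (aspect, sentiment word) pair found in the lexicon."""
    return [(aspect, word) + scores[word] for aspect, word in pairs if word in scores]

def net_score(scored):
    """Sum strengths, counting negative polarity as a negative number."""
    return sum(strength if polarity == "P" else -strength
               for _, _, polarity, strength in scored)

pairs = [("sink", "dirty"), ("shower", "clean"), ("shower glass wall", "dirty")]
scored = score_pairs(pairs)
for aspect, word, polarity, strength in scored:
    print(f"{aspect} - {word} ({polarity}:{strength})")
print(net_score(scored))
```

For the three example pairs the net score is negative, which matches the impression a reader gets from the reviews: two "dirty" mentions outweigh one "clean".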

Page 24: An overview of text mining and sentiment analysis for Decision Support System

Applications

Source: http://www.twtbase.com/twitrratr/

Page 25: An overview of text mining and sentiment analysis for Decision Support System

Challenges Ahead

How to detect more in-depth sentiment.

How to differentiate spam from credible content.

Language problems

- Usage of mixed languages.

- Usage of non-standard languages.

Page 26: An overview of text mining and sentiment analysis for Decision Support System

Challenges Ahead

Last but not least, the challenge is to put the research and solutions into real use.